X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperlre.pod;h=a076d3ad66a8747e1a83a5c3be0d334ce0ba8411;hb=afa74d4282044c64ab152392003f47bb0674abd2;hp=0c38ac7cba692057b43f475d15f99a3292783cf7;hpb=49cb94c67d828cadfe8cac24ae5955cf752eb2df;p=p5sagit%2Fp5-mst-13.2.git
diff --git a/pod/perlre.pod b/pod/perlre.pod
index 0c38ac7..a076d3a 100644
--- a/pod/perlre.pod
+++ b/pod/perlre.pod
@@ -1,52 +1,79 @@
=head1 NAME
+X
Backslashed metacharacters in Perl are alphanumeric, such as C<\b>,
C<\w>, C<\n>. Unlike some other regular expression languages, there
@@ -427,39 +709,48 @@ expressions, and 2) whenever you see one, you should stop and
=over 10
=item C<(?#text)>
+X<(?#)>
A comment. The text is ignored. If the C</x> modifier enables
whitespace formatting, a simple C<#> will suffice. Note that Perl closes
the comment as soon as it sees a C<)>, so there is no way to put a literal
C<)> in the comment.
-=item C<(?imsx-imsx)>
+=item C<(?pimsx-imsx)>
+X<(?)>
-One or more embedded pattern-match modifiers. This is particularly
-useful for dynamic patterns, such as those read in from a configuration
-file, read in as an argument, are specified in a table somewhere,
-etc. Consider the case that some of which want to be case sensitive
-and some do not. The case insensitive ones need to include merely
-C<(?i)> at the front of the pattern. For example:
+One or more embedded pattern-match modifiers, to be turned on (or
+turned off, if preceded by C<->) for the remainder of the pattern or
+the remainder of the enclosing pattern group (if any). This is
+particularly useful for dynamic patterns, such as those read in from a
+configuration file, taken from an argument, or specified in a table
+somewhere. Consider the case where some patterns want to be case
+sensitive and some do not: The case insensitive ones merely need to
+include C<(?i)> at the front of the pattern. For example:
$pattern = "foobar";
- if ( /$pattern/i ) { }
+ if ( /$pattern/i ) { }
# more flexible:
$pattern = "(?i)foobar";
- if ( /$pattern/ ) { }
+ if ( /$pattern/ ) { }
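As a minimal sketch of the point above, assume a hypothetical table of patterns (the array C<@patterns> and the input string are illustrative, not part of the patch). Each pattern carries its own case sensitivity, so the matching code needs no per-pattern flags:

```perl
use strict;
use warnings;

# Hypothetical pattern table, e.g. as read from a configuration file.
my @patterns = ('foobar', '(?i)foobar');
my $input    = 'FooBar';

# The matching code is identical for every pattern; only the pattern
# text decides whether the match is case insensitive.
my @matched = map { $input =~ /$_/ ? 1 : 0 } @patterns;

for my $i (0 .. $#patterns) {
    print "$patterns[$i]: ", ($matched[$i] ? "match" : "no match"), "\n";
}
```

Only the second pattern matches C<FooBar>, because C<(?i)> is embedded in the pattern itself.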
-Letters after a C<-> turn those modifiers off. These modifiers are
-localized inside an enclosing group (if any). For example,
+These modifiers are restored at the end of the enclosing group. For example,
( (?i) blah ) \s+ \1
-will match a repeated (I modifier is special in that it can only be enabled,
+not disabled, and that its presence anywhere in a pattern has a global
+effect. Thus C<(?-p)> and C<(?-p:...)> are meaningless and will warn
+when executed under C<use warnings>. [...] is not interpolated. Currently,
the rules to determine where the C<code> ends are somewhat convoluted.
+This feature can be used together with the special variable C<$^N> to
+capture the results of submatches in variables without having to keep
+track of the number of nested parentheses. For example:
+
+ $_ = "The brown fox jumps over the lazy dog";
+ /the (\S+)(?{ $color = $^N }) (\S+)(?{ $animal = $^N })/i;
+ print "color = $color, animal = $animal\n";
+
+Inside the C<(?{...})> block, C<$_> refers to the string the regular
+expression is matching against. You can also use C
is properly scoped in the following sense: If the assertion
is backtracked (compare L<"Backtracking">), all changes introduced after
C
),
+or indirectly with functions such as C is evaluated
at run time, at the moment this subexpression may match. The result
of evaluation is considered as a regular expression and matched as
-if it were inserted instead of this construct.
+if it were inserted instead of this construct. Note that this means
+that the contents of capture buffers defined inside an eval'ed pattern
+are not available outside of the pattern, and vice versa, there is no
+way for the inner pattern to refer to a capture buffer defined outside.
+Thus,
+
+ ('a' x 100)=~/(??{'(.)' x 100})/
+
+B
is not interpolated. As before, the rules to determine
where the C<code> ends are currently somewhat convoluted.
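The note about capture buffers inside C<(??{ ... })> can be checked with a short sketch. The variable names are illustrative only. The inner pattern contains 100 capture groups, but none of them is visible to the outer pattern:

```perl
use strict;
use warnings;

my $string = 'a' x 100;

# The postponed subexpression expands to 100 copies of '(.)'.
my $matched = ($string =~ /(??{ '(.)' x 100 })/) ? 1 : 0;

# The outer pattern has no capture groups of its own, so $1 stays unset.
my $outer_has_capture = defined $1 ? 1 : 0;

print "matched=$matched, \$1 defined=$outer_has_capture\n";
```

The match succeeds, yet C<$1> remains undefined afterwards, since the groups belong to the inner pattern only.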
@@ -609,10 +1052,203 @@ The following pattern matches a parenthesized group:
\)
}x;
-=item C<< (?>pattern) >>
+See also C<(?PARNO)> for a different, more efficient way to accomplish
+the same task.
+
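A minimal sketch of the C<(?PARNO)> approach, assuming Perl 5.10 or later. The pattern below is an illustration written for this note, not the one from the patch. C<(?1)> recurses into capture group 1 without the run-time evaluation that C<(??{ ... })> needs:

```perl
use strict;
use warnings;

my $balanced = qr/
    (                 # group 1: one balanced parenthesized group
        \(
        (?:
            [^()]++   # a possessive run of non-parenthesis characters
          |
            (?1)      # or recurse into group 1 for a nested group
        )*
        \)
    )
/x;

# In list context the match returns the contents of group 1.
my ($group) = 'foo(bar(baz)+qux)' =~ $balanced;
print "$group\n";
```

Here C<$group> holds the outermost balanced group, including the nested C<(baz)>.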
+Because perl's regex engine is not currently re-entrant, delayed
+code may not invoke the regex engine either directly with C<m//> or C<s///>,
+or indirectly with functions such as C<split>.
PerlThink, the righthand side of an C<s///> is a double-quoted string. C<\1> in
the usual double-quoted string means a control-A. The customary Unix
meaning of C<\1> is kludged in for C<s///>. However, if you get into the habit
of doing that, you get yourself into trouble if you then add an C</e>
@@ -1033,7 +1920,7 @@ C<${1}000>. The operation of interpolation should not be confused
with the operation of matching a backreference. Certainly they mean two
different things on the I<left> side of the C<s///>.
-=head2 Repeated patterns matching zero-length substring
+=head2 Repeated Patterns Matching a Zero-length Substring
B and C, C and C are substrings
-which can be matched by C than C, C can match is important.
-=item C<(??{ EXPR })>
+=item C<(??{ EXPR })>, C<(?PARNO)>
The ordering is the same as for the regular expression which is
-the result of EXPR.
+the result of EXPR, or the pattern contained by capture buffer PARNO.
=item C<(?(condition)yes-pattern|no-pattern)>
@@ -1210,13 +2097,13 @@ One more rule is needed to understand how a match is determined for the
whole regular expression: a match at an earlier position is always better
than a match at a later position.
-=head2 Creating custom RE engines
+=head2 Creating Custom RE Engines
Overloaded constants (see L