X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperlre.pod;h=75717293c2c828db5988931ab50dcd77f23b3187;hb=2095dafae09cfface71d4202b3188926ea0ccc1c;hp=15e58c1cf9dd44e0c93ec640a8e0d5a63db85ffc;hpb=f1cbbd6ecad59f8e4d9d331daa6642ebfd972ae1;p=p5sagit%2Fp5-mst-13.2.git
diff --git a/pod/perlre.pod b/pod/perlre.pod
index 15e58c1..7571729 100644
--- a/pod/perlre.pod
+++ b/pod/perlre.pod
@@ -1,52 +1,70 @@
=head1 NAME
+X
Backslashed metacharacters in Perl are alphanumeric, such as C<\b>,
C<\w>, C<\n>. Unlike some other regular expression languages, there
@@ -416,39 +663,48 @@ expressions, and 2) whenever you see one, you should stop and
=over 10
=item C<(?#text)>
+X<(?#)>
A comment. The text is ignored. If the C modifier enables
whitespace formatting, a simple C<#> will suffice. Note that Perl closes
the comment as soon as it sees a C<)>, so there is no way to put a literal
C<)> in the comment.
-=item C<(?imsx-imsx)>
+=item C<(?kimsx-imsx)>
+X<(?)>
-One or more embedded pattern-match modifiers. This is particularly
-useful for dynamic patterns, such as those read in from a configuration
-file, read in as an argument, are specified in a table somewhere,
-etc. Consider the case that some of which want to be case sensitive
-and some do not. The case insensitive ones need to include merely
-C<(?i)> at the front of the pattern. For example:
+One or more embedded pattern-match modifiers, to be turned on (or
+turned off, if preceded by C<->) for the remainder of the pattern or
+the remainder of the enclosing pattern group (if any). This is
+particularly useful for dynamic patterns, such as those read in from a
+configuration file, taken from an argument, or specified in a table
+somewhere. Consider the case where some patterns want to be case
+sensitive and some do not: The case insensitive ones merely need to
+include C<(?i)> at the front of the pattern. For example:
$pattern = "foobar";
- if ( /$pattern/i ) { }
+ if ( /$pattern/i ) { }
# more flexible:
$pattern = "(?i)foobar";
- if ( /$pattern/ ) { }
+ if ( /$pattern/ ) { }
-Letters after a C<-> turn those modifiers off. These modifiers are
-localized inside an enclosing group (if any). For example,
+These modifiers are restored at the end of the enclosing group. For example,
( (?i) blah ) \s+ \1
-will match a repeated (I is not interpolated. Currently,
the rules to determine where the C
ends are somewhat convoluted.
+This feature can be used together with the special variable C<$^N> to
+capture the results of submatches in variables without having to keep
+track of the number of nested parentheses. For example:
+
+ $_ = "The brown fox jumps over the lazy dog";
+ /the (\S+)(?{ $color = $^N }) (\S+)(?{ $animal = $^N })/i;
+ print "color = $color, animal = $animal\n";
+
+Inside the C<(?{...})> block, C<$_> refers to the string the regular
+expression is matching against. You can also use C
is properly scoped in the following sense: If the assertion
is backtracked (compare L<"Backtracking">), all changes introduced after
C
),
+or indirectly with functions such as C is evaluated
at run time, at the moment this subexpression may match. The result
of evaluation is considered as a regular expression and matched as
-if it were inserted instead of this construct.
+if it were inserted instead of this construct. Note that this means
+that the contents of capture buffers defined inside an eval'ed pattern
+are not available outside of the pattern, and vice versa, there is no
+way for the inner pattern to refer to a capture buffer defined outside.
+Thus,
+
+ ('a' x 100)=~/(??{'(.)' x 100})/
+
+B
is not interpolated. As before, the rules to determine
where the C
ends are currently somewhat convoluted.
@@ -598,10 +1004,203 @@ The following pattern matches a parenthesized group:
\)
}x;
-=item C<< (?>pattern) >>
+See also C<(?PARNO)> for a different, more efficient way to accomplish
+the same task.
+
+Because perl's regex engine is not currently re-entrant, delayed
+code may not invoke the regex engine either directly with C
),
+or indirectly with functions such as C is a double-quoted string. C<\1> in
+PerlThink, the righthand side of an C is a double-quoted string. C<\1> in
the usual double-quoted string means a control-A. The customary Unix
meaning of C<\1> is kludged in for C. However, if you get into the habit
of doing that, you get yourself into trouble if you then add an C
@@ -1018,7 +1876,7 @@ C<${1}000>. The operation of interpolation should not be confused
with the operation of matching a backreference. Certainly they mean two
different things on the I.
-=head2 Repeated patterns matching zero-length substring
+=head2 Repeated Patterns Matching a Zero-length Substring
B and C, C and C are substrings
-which can be matched by C than C, C can match is important.
-=item C<(??{ EXPR })>
+=item C<(??{ EXPR })>, C<(?PARNO)>
The ordering is the same as for the regular expression which is
-the result of EXPR.
+the result of EXPR, or the pattern contained by capture buffer PARNO.
=item C<(?(condition)yes-pattern|no-pattern)>
@@ -1195,13 +2053,13 @@ One more rule is needed to understand how a match is determined for the
whole regular expression: a match at an earlier position is always better
than a match at a later position.
-=head2 Creating custom RE engines
+=head2 Creating Custom RE Engines
Overloaded constants (see L