X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperlre.pod;h=6060e181b752a1f30863d0ef931bfcc5b0f1cb5e;hb=6ef249b908f1fd6caec1b0140c6be9c66f4eb1f2;hp=95d473439ed78fed9c171a315f13c1f0b63374fc;hpb=7f7611693ae4cbdb1286d9a1854b4f5a34e9670e;p=p5sagit%2Fp5-mst-13.2.git
diff --git a/pod/perlre.pod b/pod/perlre.pod
index 95d4734..6060e18 100644
--- a/pod/perlre.pod
+++ b/pod/perlre.pod
@@ -1,57 +1,76 @@
=head1 NAME
+X
+
+Backslashed metacharacters in Perl are alphanumeric, such as C<\b>,
+C<\w>, C<\n>. Unlike some other regular expression languages, there
+are no backslashed symbols that aren't alphanumeric. So anything
+that looks like \\, \(, \), \<, \>, \{, or \} is always
+interpreted as a literal character, not a metacharacter. This was
+once used in a common idiom to disable or quote the special meanings
+of regular expression metacharacters in a string that you want to
+use for a pattern. Simply quote all non-"word" characters:
     $pattern =~ s/(\W)/\\$1/g;
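As a hedged illustration of the quoting idiom above, the following sketch compares the hand-written substitution with the built-in quotemeta() function and the C<\Q> escape. The sample string and the variable names are assumptions made for this example only.

```perl
use strict;
use warnings;

# Hypothetical sample text containing several regex metacharacters.
my $literal = 'a.b*c(d)';

# The older idiom: backslash every non-"word" character by hand.
(my $by_hand = $literal) =~ s/(\W)/\\$1/g;

# The built-in function gives the same result for this ASCII string.
my $quoted = quotemeta($literal);

# \Q...\E applies the same quoting inline while the variable is interpolated.
my $matches_literally = 'xa.b*c(d)y' =~ /\Q$literal\E/ ? 1 : 0;
my $matches_as_regex  = 'aXbbbc(d)'  =~ /\Q$literal\E/ ? 1 : 0;

print "$by_hand\n";
print "$quoted\n";
print "$matches_literally $matches_as_regex\n";
```

The second match fails because, once quoted, C<.> and C<*> only match themselves.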
-Now it is much more common to see either the quotemeta() function or
-the C<\Q> escape sequence used to disable all metacharacters' special
+Note that the C<p> modifier is special in that it can only be enabled,
+not disabled, and that its presence anywhere in a pattern has a global
+effect. Thus C<(?-p)> and C<(?-p:...)> are meaningless and will warn
+when executed under C<use warnings>.
-C<code> is not interpolated. Currently the rules to
-determine where the C<code> ends are somewhat convoluted.
+B
-The C<code> is properly scoped in the following sense: if the assertion
-is backtracked (compare L<"Backtracking">), all the changes introduced after
-C
+C<code> is not interpolated. Currently,
+the rules to determine where the C<code> ends are somewhat convoluted.
+
+This feature can be used together with the special variable C<$^N> to
+capture the results of submatches in variables without having to keep
+track of the number of nested parentheses. For example:
+
+ $_ = "The brown fox jumps over the lazy dog";
+ /the (\S+)(?{ $color = $^N }) (\S+)(?{ $animal = $^N })/i;
+ print "color = $color, animal = $animal\n";
+
+Inside the C<(?{...})> block, C<$_> refers to the string the regular
+expression is matching against. You can also use C<pos()> to know what is
+the current position of matching within this string.
+
+The C<code> is properly scoped in the following sense: If the assertion
+is backtracked (compare L<"Backtracking">), all changes introduced after
+C
-is put into variable $^R. This happens immediately, so $^R can be used from
-other C<(?{ code })> assertions inside the same regular expression.
+This assertion may be used as a C<(?(condition)yes-pattern|no-pattern)>
+switch. If I<not> used in this way, the result of evaluation of C<code>
+is put into the special variable C<$^R>. This happens
+immediately, so C<$^R> can be used from other C<(?{ code })> assertions
+inside the same regular expression.
+
+The assignment to C<$^R> above is properly localized, so the old
+value of C<$^R> is restored if the assertion is backtracked; compare
+L<"Backtracking">.
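The localization of C<$^R> described above can be sketched as follows. This is a minimal example under stated assumptions: the input string, the pattern, and the package variable C<$count> are all hypothetical. A package variable is used so that the example does not depend on how lexicals behave inside code blocks.

```perl
use strict;
use warnings;

our $count;   # hypothetical package variable that receives the final $^R

# Each (?{ ... }) result is stored in $^R, so the second block can read the
# value left by the previous one. The greedy group first consumes all five
# "a" characters, then has to give two back so that the trailing "aa" can
# match. Every iteration that is backtracked also restores the old $^R.
my $ok = 'aaaaa' =~ /^(?{ 0 })(?:a(?{ $^R + 1 }))*aa(?{ $count = $^R })/ ? 1 : 0;

print "matched=$ok count=$count\n";
```

Only three iterations of the group survive backtracking, so the final block sees the value left by the third iteration and not the highest value reached during the match.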
-The above assignment to $^R is properly localized, thus the old value of $^R
-is restored if the assertion is backtracked (compare L<"Backtracking">).
+Due to an unfortunate implementation issue, the Perl code contained in these
+blocks is treated as a compile-time closure, which can have seemingly bizarre
+consequences when used with lexically scoped variables inside subroutines
+or loops. There are various workarounds for this, including simply using
+global variables instead. If you are using this construct and strange results
+occur, check for the use of lexically scoped variables.
-Due to security concerns, this construction is not allowed if the regular
-expression involves run-time interpolation of variables, unless
-C
),
+or indirectly with functions such as C is evaluated
-at runtime, at the moment this subexpression may match. The result of
-evaluation is considered as a regular expression, and matched as if it
-were inserted instead of this construct.
+B
-C<code> is not interpolated. Currently the rules to
-determine where the C<code> ends are somewhat convoluted.
+This is a "postponed" regular subexpression. The C<code> is evaluated
+at run time, at the moment this subexpression may match. The result
+of evaluation is considered as a regular expression and matched as
+if it were inserted instead of this construct. Note that this means
+that the contents of capture buffers defined inside an eval'ed pattern
+are not available outside of the pattern, and vice versa, there is no
+way for the inner pattern to refer to a capture buffer defined outside.
+Thus,
-The following regular expression matches matching parenthesized group:
+ ('a' x 100)=~/(??{'(.)' x 100})/
+
+B<will> match, it will B<not> set $1.
+
+The C<code> is not interpolated. As before, the rules to determine
+where the C<code> ends are currently somewhat convoluted.
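The capture-buffer isolation of a postponed subexpression can be checked with a short script. This is only a sketch: the variable names are assumptions, and the pattern is the one quoted in the text.

```perl
use strict;
use warnings;

my $text = 'a' x 100;

# The postponed block returns a pattern with 100 capture groups, but those
# groups belong to the inner pattern only; the outer pattern has none.
my $matched = $text =~ /(??{ '(.)' x 100 })/ ? 1 : 0;

# After a successful match of a pattern without its own groups, $1 is undefined.
my $outer_capture = defined $1 ? 'set' : 'unset';

print "matched=$matched \$1=$outer_capture\n";
```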
+
+The following pattern matches a parenthesized group:
     $re = qr{
                \(
                (?:
                   (?> [^()]+ )    # Non-parens without backtracking
                 |
-                  (?p{ $re })     # Group with matching parens
+                  (??{ $re })     # Group with matching parens
                )*
                \)
             }x;
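A hedged usage sketch for the pattern above follows. The sample text and the result variables are hypothetical. C<$re> is declared with C<our> here, which is an assumption of this example, so that the code block refers to a package variable.

```perl
use strict;
use warnings;

our $re;   # package variable, read again each time (??{ $re }) is reached
$re = qr{
    \(
    (?:
        (?> [^()]+ )    # Non-parens without backtracking
      |
        (??{ $re })     # Group with matching parens
    )*
    \)
}x;

# Hypothetical input: one balanced group followed by a stray open paren.
my $text = 'call(foo(bar, baz(1)), qux) + tail(';
my ($group) = $text =~ /($re)/;

# An unbalanced string must not match when anchored at both ends.
my $unbalanced_ok = '((a)' =~ /^$re$/ ? 1 : 0;

print "$group\n";
print "unbalanced matches: $unbalanced_ok\n";
```

The leftmost match starts at the first opening parenthesis and extends to its matching close; the stray parenthesis at the end of the text is never part of a match.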
-=item C<(?E
),
+or indirectly with functions such as C