X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperlre.pod;h=5e99fd3af51f52a8979f8f453e2a58ced741919f;hb=a5c16299e4688e58a2a7b276af191a614da68f07;hp=fef8ce3b6f976cb896d4fe6ee1a6aa2128938564;hpb=54c18d0455d4f9550786bea467f5a04c96e86890;p=p5sagit%2Fp5-mst-13.2.git

diff --git a/pod/perlre.pod b/pod/perlre.pod
index fef8ce3..5e99fd3 100644
--- a/pod/perlre.pod
+++ b/pod/perlre.pod
@@ -121,7 +121,8 @@ The following standard quantifiers are recognized:
     {n,m} Match at least n but not more than m times
 
 (If a curly bracket occurs in any other context, it is treated
-as a regular character.) The "*" modifier is equivalent to C<{0,}>, the "+"
+as a regular character. In particular, the lower bound
+is not optional.) The "*" modifier is equivalent to C<{0,}>, the "+"
 modifier to C<{1,}>, and the "?" modifier to C<{0,1}>. n and m are limited
 to integral values less than a preset limit defined when perl is built.
 This is usually 32766 on the most common platforms. The actual limit can
@@ -198,8 +199,8 @@ C<\d>, and C<\D> within character classes, but if you try to use them
 as endpoints of a range, that's not a range, the "-" is understood literally.
 If Unicode is in effect, C<\s> matches also "\x{85}", "\x{2028}, and "\x{2029}",
 see L for more details about
-C<\pP>, C<\PP>, and C<\X>, and L about Unicode in
-general.
+C<\pP>, C<\PP>, and C<\X>, and L about Unicode in general.
+You can define your own C<\p> and C<\P> propreties, see L.
 
 The POSIX character class syntax
 
@@ -349,8 +350,11 @@ It is also useful when writing C-like scanners, when you have several
 patterns that you want to match against consequent substrings of your
 string, see the previous reference. The actual location where C<\G>
 will match can also be influenced by using C as
-an lvalue. Currently C<\G> only works when used at the
-beginning of the pattern. See L.
+an lvalue: see L. Currently C<\G> is only fully
+supported when anchored to the start of the pattern; while it
+is permitted to use it elsewhere, as in C, some
+such uses (C, for example) currently cause problems, and
+it is recommended that you avoid such usage for now.
 
 The bracketing construct C<( ... )> creates capture buffers. To refer
 to the digit'th buffer use \ within the
@@ -389,11 +393,14 @@ Several special variables also refer back to portions of the previous
 match. C<$+> returns whatever the last bracket match matched.
 C<$&> returns the entire matched string. (At one point C<$0> did
 also, but now it returns the name of the program.) C<$`> returns
-everything before the matched string. And C<$'> returns everything
-after the matched string.
+everything before the matched string. C<$'> returns everything
+after the matched string. And C<$^N> contains whatever was matched by
+the most-recently closed group (submatch). C<$^N> can be used in
+extended patterns (see below), for example to assign a submatch to a
+variable.
 
 The numbered variables ($1, $2, $3, etc.) and the related punctuation
-set (C<$+>, C<$&>, C<$`>, and C<$'>) are all dynamically scoped
+set (C<$+>, C<$&>, C<$`>, C<$'>, and C<$^N>) are all dynamically scoped
 until the end of the enclosing block or until the next successful
 match, whichever comes first. (See L.)
 
@@ -560,6 +567,14 @@ This zero-width assertion evaluate any embedded Perl code. It
 always succeeds, and its C is not interpolated. Currently,
 the rules to determine where the C ends are somewhat convoluted.
 
+This feature can be used together with the special variable C<$^N> to
+capture the results of submatches in variables without having to keep
+track of the number of nested parentheses. For example:
+
+    $_ = "The brown fox jumps over the lazy dog";
+    /the (\S+)(?{ $color = $^N }) (\S+)(?{ $animal = $^N })/i;
+    print "color = $color, animal = $animal\n";
+
 The C is properly scoped in the following sense: If the assertion
 is backtracked (compare L<"Backtracking">), all changes introduced after
 Cization are undone, so that
@@ -870,7 +885,7 @@ multiple ways it might succeed, you need to understand backtracking to
 know which variety of success you will achieve.
 
 When using look-ahead assertions and negations, this can all get even
-tricker. Imagine you'd like to find a sequence of non-digits not
+trickier. Imagine you'd like to find a sequence of non-digits not
 followed by "123". You might try to write that as
 
     $_ = "ABC123";