X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperlretut.pod;h=f0b5d1d3891ee463b2454dfb8d7feabefb3a7958;hb=0111df86b68202837d8ca044a27bbc00d7895fb1;hp=8f7c8cdd7260505260009a457a92a77afb878112;hpb=076988851d1cdb6cc455615593d7b9380b21955a;p=p5sagit%2Fp5-mst-13.2.git

diff --git a/pod/perlretut.pod b/pod/perlretut.pod
index 8f7c8cd..f0b5d1d 100644
--- a/pod/perlretut.pod
+++ b/pod/perlretut.pod
@@ -1403,6 +1403,8 @@ off. C<\G> allows us to easily do context-sensitive matching:
 
 The combination of C and C<\G> allows us to process the string a
 bit at a time and use arbitrary Perl logic to decide what to do next.
+Currently, the C<\G> anchor is only fully supported when used to anchor
+to the start of the pattern.
 
 C<\G> is also invaluable in processing fixed length records with
 regexps. Suppose we have a snippet of coding region DNA, encoded as
@@ -1738,7 +1740,7 @@ traditional Unicode classes:
     IsPrint     /^([LMNPS]|Co|Zs)/
     IsPunct     /^P/
     IsSpace     /^Z/ || ($code =~ /^(0009|000A|000B|000C|000D)$/
-    IsSpacePerl /^Z/ || ($code =~ /^(0009|000A|000C|000D)$/
+    IsSpacePerl /^Z/ || ($code =~ /^(0009|000A|000C|000D|0085|2028|2029)$/
     IsUpper     /^L[ut]/
     IsWord      /^[LMN]/ || $code eq "005F"
     IsXDigit    $code =~ /^00(3[0-9]|[46][1-6])$/
@@ -1752,7 +1754,7 @@ For the full list see L.
 
 The Unicode has also been separated into various sets of charaters
 which you can test with C<\p{In...}> (in) and C<\P{In...}> (not in),
-for example C<\p{InLatin}>, C<\p{InGreek}>, or C<\P{InKatakana}>.
+for example C<\p{Latin}>, C<\p{Greek}>, or C<\P{Katakana}>.
 For the full list see L.
 
 C<\X> is an abbreviation for a character class sequence that includes
@@ -1782,10 +1784,11 @@ C<[:space:]> correspond to the familiar C<\d>, C<\w>, and C<\s>
 character classes. To negate a POSIX class, put a C<^> in front of
 the name, so that, e.g., C<[:^digit:]> corresponds to C<\D> and under
 C, C<\P{IsDigit}>. The Unicode and POSIX character classes can
-be used just like C<\d>, both inside and outside of character classes:
+be used just like C<\d>, with the exception that POSIX character
+classes can only be used inside of a character class:
 
     /\s+[abc[:digit:]xyz]\s*/;  # match a,b,c,x,y,z, or a digit
-    /^=item\s[:digit:]/;        # match '=item',
+    /^=item\s[[:digit:]]/;      # match '=item',
                                 # followed by a space and a digit
     use charnames ":full";
     /\s+[abc\p{IsDigit}xyz]\s+/;  # match a,b,c,x,y,z, or a digit
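
Note on the last hunk: the corrected example can be checked with a short script. The following is a minimal sketch, assuming a Perl 5 interpreter; the sample strings and variable names are hypothetical and are not part of the patch. It compares the bracketed form C<[[:digit:]]> with the bare form C<[:digit:]>, which Perl should parse as an ordinary character class containing the characters ':', 'd', 'i', 'g' and 't'.

```perl
use strict;
use warnings;

# Hypothetical sample strings, chosen only for illustration.
my $with_digit = "=item 7";
my $with_colon = "=item :";

# POSIX class nested inside a bracketed character class: matches a digit.
my $nested_digit = ($with_digit =~ /^=item\s[[:digit:]]/) ? 1 : 0;
my $nested_colon = ($with_colon =~ /^=item\s[[:digit:]]/) ? 1 : 0;

my ($bare_digit, $bare_colon);
{
    # Perl normally warns about the bare form; the warning is silenced
    # here only so that the demonstration runs quietly.
    no warnings 'regexp';

    # Without the outer brackets this is a plain character class made of
    # the characters ':', 'd', 'i', 'g' and 't', not the digit class.
    $bare_digit = ($with_digit =~ /^=item\s[:digit:]/) ? 1 : 0;
    $bare_colon = ($with_colon =~ /^=item\s[:digit:]/) ? 1 : 0;
}

print "nested: digit=$nested_digit colon=$nested_colon\n";
print "bare:   digit=$bare_digit colon=$bare_colon\n";
```

If the bare form behaves as assumed, only the nested form matches the line ending in a digit, while the bare form matches the line ending in a colon. That is the mistake the old example invited.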