[:^space:] \S \P{IsSpace}
[:^word:] \W \P{IsWord}
-The POSIX character classes [.cc.] and [=cc=] are recognized but
-B<not> supported and trying to use them will cause an error.
+Perl follows the POSIX standard in that POSIX character classes are
+only supported within a bracketed character class. The POSIX character
+classes [.cc.] and [=cc=] are recognized but B<not> supported, and
+trying to use them will cause an error.
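The restriction described in this hunk can be illustrated with a minimal sketch. It is not part of the patch; the sample strings and variable names are hypothetical, and the behaviour of the unbracketed form (treated as an ordinary class of literal characters, with a warning on recent perls) is stated as an assumption about the perl in use.

```perl
use strict;
use warnings;

# Inside a bracketed character class, [:digit:] names the POSIX class.
my $inside = "item 7" =~ /^item\s[[:digit:]]$/ ? 1 : 0;

# The negated form [:^space:] likewise needs the enclosing brackets.
my $negated = "a b" =~ /^[[:^space:]]\s[[:^space:]]$/ ? 1 : 0;

# Without the outer brackets, [:digit:] is just a character class holding
# the literal characters ':', 'd', 'i', 'g', 't'. A digit therefore does
# not match, while a literal ':' does.
my ($digit_hit, $colon_hit);
{
    no warnings;    # assumption: this perl warns about POSIX syntax outside [...]
    $digit_hit = "item 7" =~ /^item\s[:digit:]$/ ? 1 : 0;
    $colon_hit = "item :" =~ /^item\s[:digit:]$/ ? 1 : 0;
}

print "inside=$inside negated=$negated digit=$digit_hit colon=$colon_hit\n";
```

The unbracketed pattern compiles but silently matches the wrong thing, which is why the documentation change spells out the requirement.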
Perl defines the following zero-width assertions:
several patterns that you want to match against consequent substrings
of your string, see the previous reference. The actual location
where C<\G> will match can also be influenced by using C<pos()> as
-an lvalue. See L<perlfunc/pos>.
+an lvalue. Currently C<\G> only works when used at the
+beginning of the pattern. See L<perlfunc/pos>.
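As a rough illustration of the paragraph above (not part of the patch), the sketch below puts C<\G> at the very beginning of the pattern, as the new sentence requires, and then moves the match position by assigning to C<pos()>. The string C<$dna> and the variable names are made up for the example.

```perl
use strict;
use warnings;

my $dna = "ATCGTTGAAT";    # hypothetical sample data
my @codons;

# \G anchors each attempt at pos($dna), i.e. where the previous
# //g match left off, so the string is consumed three letters at a time.
while ($dna =~ /\G(\w{3})/g) {
    push @codons, $1;
}

# The failed final attempt reset pos($dna). Using pos() as an lvalue
# changes where \G will match on the next attempt.
pos($dna) = 1;
my $shifted;
if ($dna =~ /\G(\w{3})/g) {
    $shifted = $1;
}

print "@codons | $shifted\n";
```

Starting from offset 1 yields a different reading frame, which is the kind of control the C<pos()> assignment is meant to give.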
The bracketing construct C<( ... )> creates capture buffers. To
refer to the digit'th buffer use \<digit> within the
The combination of C<//g> and C<\G> allows us to process the string a
bit at a time and use arbitrary Perl logic to decide what to do next.
+Currently, the C<\G> anchor only works at the beginning of a pattern.
C<\G> is also invaluable in processing fixed length records with
regexps. Suppose we have a snippet of coding region DNA, encoded as
character classes. To negate a POSIX class, put a C<^> in front of
the name, so that, e.g., C<[:^digit:]> corresponds to C<\D> and under
C<utf8>, C<\P{IsDigit}>. The Unicode and POSIX character classes can
-be used just like C<\d>, both inside and outside of character classes:
+be used just like C<\d>, except that POSIX character classes can
+only be used inside a bracketed character class:
/\s+[abc[:digit:]xyz]\s*/; # match a,b,c,x,y,z, or a digit
- /^=item\s[:digit:]/; # match '=item',
+ /^=item\s[[:digit:]]/; # match '=item',
# followed by a space and a digit
use charnames ":full";
/\s+[abc\p{IsDigit}xyz]\s+/; # match a,b,c,x,y,z, or a digit