From: Mark Kvale <kvale@phy.ucsf.edu>
Date: Wed, 27 Mar 2002 16:45:37 +0000 (-0800)
Subject: [DOC PATCH] Regex \G and POSIX restrictions
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=54c18d0455d4f9550786bea467f5a04c96e86890;p=p5sagit%2Fp5-mst-13.2.git

[DOC PATCH] Regex \G and POSIX restrictions
Message-Id: <02032716453705.38063@ivy.ucsf.edu>

p4raw-id: //depot/perl@15562
---
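
A minimal sketch of the two restrictions this patch documents, not part
of the patch itself.  The sample strings and variable names are
hypothetical and chosen only for illustration.  It assumes a C<lex>-like
scanner in which every alternative starts with C<\G>, and it uses POSIX
classes only inside an enclosing bracketed character class.

```perl
use strict;
use warnings;

# Hypothetical sample input, chosen only for illustration.
my $line = "=item 7";

# [:digit:] acts as a POSIX class only inside a bracketed character
# class, hence the doubled brackets.
my $posix_ok = ($line =~ /^=item\s[[:digit:]]/) ? 1 : 0;

# A lex-like scanner: \G sits at the very beginning of each pattern,
# and /gc keeps pos() unchanged when a match attempt fails.
my $input = "abc123def";
my @tokens;
while (1) {
    if    ($input =~ /\G([[:alpha:]]+)/gc) { push @tokens, "WORD:$1" }
    elsif ($input =~ /\G([[:digit:]]+)/gc) { push @tokens, "NUM:$1" }
    else                                   { last }
}

print "$posix_ok @tokens\n";
```

Each pass resumes exactly where the previous match ended, which is why
the scanner depends on C<\G> being the first element of the pattern.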

diff --git a/pod/perlre.pod b/pod/perlre.pod
index 58cd645..fef8ce3 100644
--- a/pod/perlre.pod
+++ b/pod/perlre.pod
@@ -316,8 +316,10 @@ with a '^'. This is a Perl extension.  For example:
     [:^space:]	    \S	    \P{IsSpace}
     [:^word:]	    \W	    \P{IsWord}
 
-The POSIX character classes [.cc.] and [=cc=] are recognized but
-B<not> supported and trying to use them will cause an error.
+Perl respects the POSIX standard in that POSIX character classes are
+only supported within a bracketed character class.  The POSIX classes
+[.cc.] and [=cc=] are recognized but B<not> supported, and trying to
+use them will cause an error.
 
 Perl defines the following zero-width assertions:
 
@@ -347,7 +349,8 @@ It is also useful when writing C<lex>-like scanners, when you have
 several patterns that you want to match against consequent substrings
 of your string, see the previous reference.  The actual location
 where C<\G> will match can also be influenced by using C<pos()> as
-an lvalue.  See L<perlfunc/pos>.
+an lvalue; see L<perlfunc/pos>.  Currently, C<\G> works only when
+it appears at the beginning of the pattern.
 
 The bracketing construct C<( ... )> creates capture buffers.  To
 refer to the digit'th buffer use \<digit> within the
diff --git a/pod/perlretut.pod b/pod/perlretut.pod
index e90e03d..8c12a5c 100644
--- a/pod/perlretut.pod
+++ b/pod/perlretut.pod
@@ -1403,6 +1403,7 @@ off.  C<\G> allows us to easily do context-sensitive matching:
 
 The combination of C<//g> and C<\G> allows us to process the string a
 bit at a time and use arbitrary Perl logic to decide what to do next.
+Currently, the C<\G> anchor only works at the beginning of a pattern.
 
 C<\G> is also invaluable in processing fixed length records with
 regexps.  Suppose we have a snippet of coding region DNA, encoded as
@@ -1782,10 +1783,11 @@ C<[:space:]> correspond to the familiar C<\d>, C<\w>, and C<\s>
 character classes.  To negate a POSIX class, put a C<^> in front of
 the name, so that, e.g., C<[:^digit:]> corresponds to C<\D> and under
 C<utf8>, C<\P{IsDigit}>.  The Unicode and POSIX character classes can
-be used just like C<\d>, both inside and outside of character classes:
+be used just like C<\d>, except that POSIX character classes can
+only be used inside a bracketed character class:
 
     /\s+[abc[:digit:]xyz]\s*/;  # match a,b,c,x,y,z, or a digit
-    /^=item\s[:digit:]/;        # match '=item',
+    /^=item\s[[:digit:]]/;      # match '=item',
                                 # followed by a space and a digit
     use charnames ":full";
     /\s+[abc\p{IsDigit}xyz]\s+/;  # match a,b,c,x,y,z, or a digit