=head2 What good is C<\G> in a regular expression?
-The notation C<\G> is used in a match or substitution in conjunction the
-C</g> modifier (and ignored if there's no C</g>) to anchor the regular
-expression to the point just past where the last match occurred, i.e. the
-pos() point. A failed match resets the position of C<\G> unless the
-C</c> modifier is in effect.
+The notation C<\G> is used in a match or substitution in conjunction with
+the C</g> modifier to anchor the regular expression to the point just past
+where the last match occurred, i.e. the pos() point. A failed match resets
+the position of C<\G> unless the C</c> modifier is in effect. C<\G> can be
+used in a match without the C</g> modifier; it acts the same (i.e. still
+anchors at the pos() point) but of course only matches once and does not
+update pos(), as non-C</g> expressions never do. When the target string
+has never been matched against a C</g> expression, or has had its pos()
+reset, C<\G> is functionally equivalent to C<\A>, which matches at the
+beginning of the string.
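
The behaviour described above can be checked with a minimal sketch like the following. The strings and variable names are illustrative assumptions, not part of the patch, and the sketch assumes a Perl of 5.6.0 or later.

```perl
use strict;
use warnings;

my $str = "aaabbbccc";

# Scalar-context /g match: leaves pos($str) just past "aaa", i.e. at 3.
my $primed = $str =~ /a+/g;

# Without /g, \G still anchors at pos(): "b" is at offset 3, "c" is not.
my $hit  = ($str =~ /\Gb+/) ? 1 : 0;
my $miss = ($str =~ /\Gc+/) ? 1 : 0;

# Neither non-/g match touched pos().
my $pos_after = pos($str);

# A string with no prior /g match has an undefined pos(), so \G acts like \A.
my $fresh      = "xyz";
my $like_start = ($fresh =~ /\Gx/) ? 1 : 0;
my $not_mid    = ($fresh =~ /\Gy/) ? 1 : 0;

print "hit=$hit miss=$miss pos=$pos_after\n";
print "like_start=$like_start not_mid=$not_mid\n";
```

The C</g> match is assigned to a scalar only to force scalar context, which is what makes it record a position in pos().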
For example, suppose you had a line of text quoted in standard mail
and Usenet notation, (that is, with leading C<< > >> characters), and
You can intermix C<m//g> matches with C<m/\G.../g>, where C<\G> is a
zero-width assertion that matches the exact position where the previous
-C<m//g>, if any, left off. The C<\G> assertion is not supported without
-the C</g> modifier. (Currently, without C</g>, C<\G> behaves just like
-C<\A>, but that's accidental and may change in the future.)
+C<m//g>, if any, left off. Without the C</g> modifier, the C<\G> assertion
+still anchors at pos(), but the match is of course only attempted once.
+Using C<\G> without C</g> on a target string that has not previously had a
+C</g> match applied to it is the same as using the C<\A> assertion to match
+the beginning of the string.
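
As a rough illustration of intermixing C<m//g> with C<m/\G.../g>, the sketch below locates a separator with a plain C</g> match and then continues from that exact point. The input string and variable names are hypothetical.

```perl
use strict;
use warnings;

my $text = "width=42;";

# A plain m//g match in scalar context finds "=" and leaves pos() just past it.
my $saw_eq = $text =~ /=/g;
my $pos_eq = pos($text);

# m/\G.../g resumes exactly there; it cannot skip ahead to look for digits.
my $value;
$value = $1 if $text =~ /\G(\d+)/g;
my $pos_value = pos($text);

# A further \G match checks that the terminator follows immediately.
my $terminated = ($text =~ /\G;/g) ? 1 : 0;

print "value=$value pos_eq=$pos_eq pos_value=$pos_value terminated=$terminated\n";
```

Each step uses scalar context so that every successful match advances pos() for the next one.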
Examples:
($one,$five,$fifteen) = (`uptime` =~ /(\d+\.\d+)/g);
# scalar context
- $/ = ""; $* = 1; # $* deprecated in modern perls
+ $/ = "";
while (defined($paragraph = <>)) {
while ($paragraph =~ /[a-z]['")]*[.!?]+['")]*\s/g) {
$sentences++;
print "3: '";
print $1 while /(p)/gc; print "', pos=", pos, "\n";
}
+ print "Final: '$1', pos=",pos,"\n" if /\G(.)/;
The last example should print:
1: '', pos=7
2: 'q', pos=8
3: '', pos=8
+ Final: 'q', pos=8
+
+Notice that the final match matched C<q> instead of C<p>, which a match
+without the C<\G> anchor would have done. Also note that the final match
+did not update C<pos> -- C<pos> is only updated on a C</g> match. If the
+final match did indeed match C<p>, it's a good bet that you're running an
+older (pre-5.6.0) Perl.
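
The contrast between the anchored and unanchored final match can be reproduced in isolation. This is a sketch only: it assumes the string C<ppooqppqq> and sets pos() to 8 by assignment rather than by running the full loop.

```perl
use strict;
use warnings;

$_ = "ppooqppqq";
pos($_) = 8;    # assumption: stand-in for where the earlier /gc matches stopped

# Non-/g match in list context returns the captures.
my ($anchored)   = /\G(.)/;    # anchored at offset 8
my $pos_between  = pos($_);
my ($unanchored) = /(.)/;      # free to match at offset 0
my $pos_end      = pos($_);

print "anchored='$anchored' unanchored='$unanchored' pos=$pos_end\n";
```

Neither match carries C</g>, so pos() is expected to stay where it was set.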
A useful idiom for C<lex>-like scanners is C</\G.../gc>. You can
combine several regexps like this to process a string part-by-part,