Document use of - in a regex char class.

diff --git a/pod/perlre.pod b/pod/perlre.pod
index 68964a0..37434a6 100644
--- a/pod/perlre.pod
+++ b/pod/perlre.pod
@@ -80,13 +80,13 @@ beginning of the string, the "$" character at only the end (or before the
 newline at the end) and Perl does certain optimizations with the
 assumption that the string contains only one line.  Embedded newlines
 will not be matched by "^" or "$".  You may, however, wish to treat a
-string as a multiline buffer, such that the "^" will match after any
+string as a multi-line buffer, such that the "^" will match after any
 newline within the string, and "$" will match before any newline.  At the
 cost of a little more overhead, you can do this by using the /m modifier
 on the pattern match operator.  (Older programs did this by setting C<$*>,
 but this practice is now deprecated.)
 
-To facilitate multiline substitutions, the "." character never matches a
+To facilitate multi-line substitutions, the "." character never matches a
 newline unless you use the C</s> modifier, which in effect tells Perl to pretend
 the string is a single line--even if it isn't.  The C</s> modifier also
 overrides the setting of C<$*>, in case you have some (badly behaved) older
@@ -163,7 +163,7 @@ Perl defines the following zero-width assertions:
     \B Match a non-(word boundary)
     \A Match at only beginning of string
     \Z Match at only end of string (or before newline at the end)
-    \G Match only where previous m//g left off
+    \G Match only where previous m//g left off (works only with /g)
 
 A word boundary (C<\b>) is defined as a spot between two characters that
 has a C<\w> on one side of it and a C<\W> on the other side of it (in
@@ -173,9 +173,10 @@ represents backspace rather than a word boundary.)  The C<\A> and C<\Z> are
 just like "^" and "$" except that they won't match multiple times when the
 C</m> modifier is used, while "^" and "$" will match at every internal line
 boundary.  To match the actual end of the string, not ignoring newline,
-you can use C<\Z(?!\n)>.  The C<\G> assertion can be used to mix global
-matches (using C<m//g>) and non-global ones, as described in
+you can use C<\Z(?!\n)>.  The C<\G> assertion can be used to chain global
+matches (using C<m//g>), as described in
 L<perlop/"Regexp Quote-Like Operators">.
+
 It is also useful when writing C<lex>-like scanners, when you have several
 regexps which you want to match against consequent substrings of your
 string, see the previous reference.
@@ -514,7 +515,11 @@ in C<[]>, which will match any one of the characters in the list.  If the
 first character after the "[" is "^", the class matches any character not
 in the list.  Within a list, the "-" character is used to specify a
 range, so that C<a-z> represents all the characters between "a" and "z",
-inclusive.
+inclusive.  If you want "-" itself to be a member of a class, put it
+at the start or end of the list, or escape it with a backslash.  (The
+following all specify the same class of three characters: C<[-az]>,
+C<[az-]>, and C<[a\-z]>.  All are different from C<[a-z]>, which
+specifies a class containing twenty-six characters.)
 
 Characters may be specified using a metacharacter syntax much like that
 used in C: "\n" matches a newline, "\t" a tab, "\r" a carriage return,
@@ -572,3 +577,7 @@ You can't disambiguate that by saying C<\{1}000>, whereas you can fix it with
 C<${1}000>.  Basically, the operation of interpolation should not be confused
 with the operation of matching a backreference.  Certainly they mean two
 different things on the I<left> side of the C<s///>.
+
+=head2 SEE ALSO
+
+"Mastering Regular Expressions" (see L<perlbook>) by Jeffrey Friedl.
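
The character-class hunk above says that C<[-az]>, C<[az-]>, and C<[a\-z]> all name the same three characters, while C<[a-z]> names twenty-six. A minimal sketch for checking that claim follows; it is not part of the patch, and C<class_members> is a hypothetical helper introduced only for this illustration. It lists which printable ASCII characters each class matches.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the patch): return, in ASCII order,
# every printable ASCII character that the given class matches.
sub class_members {
    my ($class) = @_;
    return join '', grep { $_ =~ $class } map { chr($_) } 0x20 .. 0x7e;
}

# A hyphen first, last, or backslash-escaped is a literal member.
for my $class (qr/[-az]/, qr/[az-]/, qr/[a\-z]/) {
    print class_members($class), "\n";          # prints "-az" each time
}

# A hyphen between two characters denotes a range.
print length(class_members(qr/[a-z]/)), "\n";   # prints 26
```

Restricting the scan to printable ASCII keeps the output readable; the same comparison could be run over a wider range of code points.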