X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperlretut.pod;h=c0a78a43e49db76d011a254091c442ef2a881650;hb=2810d90162dc487ea10933114344f32d14b3d619;hp=4ea9ecc497a48d3b38c3c826050f09a061bc5431;hpb=48e3bbddf569369fe6921f305df6ab7290c91152;p=p5sagit%2Fp5-mst-13.2.git
diff --git a/pod/perlretut.pod b/pod/perlretut.pod
index 4ea9ecc..c0a78a4 100644
--- a/pod/perlretut.pod
+++ b/pod/perlretut.pod
@@ -158,13 +158,14 @@ that a metacharacter can be matched by putting a backslash before it:
"2+2=4" =~ /2\+2/; # matches, \+ is treated like an ordinary +
"The interval is [0,1)." =~ /[0,1)./ # is a syntax error!
"The interval is [0,1)." =~ /\[0,1\)\./ # matches
- "/usr/bin/perl" =~ /\/usr\/local\/bin\/perl/; # matches
+ "/usr/bin/perl" =~ /\/usr\/bin\/perl/; # matches
In the last regexp, the forward slash C<'/'> is also backslashed,
because it is used to delimit the regexp. This can lead to LTS
(leaning toothpick syndrome), however, and it is often more readable
to change delimiters.
+ "/usr/bin/perl" =~ m!/usr/bin/perl!; # easier to read
The backslash character C<'\'> is a metacharacter itself and needs to
be backslashed:
@@ -689,10 +690,11 @@ inside goes into the special variables C<$1>, C<$2>, etc. They can be
used just as ordinary variables:
# extract hours, minutes, seconds
- $time =~ /(\d\d):(\d\d):(\d\d)/; # match hh:mm:ss format
- $hours = $1;
- $minutes = $2;
- $seconds = $3;
+ if ($time =~ /(\d\d):(\d\d):(\d\d)/) { # match hh:mm:ss format
+ $hours = $1;
+ $minutes = $2;
+ $seconds = $3;
+ }
Now, we know that in scalar context,
S > returns a true or false
@@ -1323,9 +1325,9 @@ If you change C<$pattern> after the first substitution happens, perl
will ignore it. If you don't want any substitutions at all, use the
special delimiter C:
- $pattern = 'Seuss';
+ @pattern = ('Seuss');
while (<>) {
- print if m'$pattern'; # matches '$pattern', not 'Seuss'
+ print if m'@pattern'; # matches literal '@pattern', not 'Seuss'
}
C acts like single quotes on a regexp; all other C delimiters
@@ -1403,7 +1405,8 @@ off. C<\G> allows us to easily do context-sensitive matching:
The combination of C/g> and C<\G> allows us to process the string a
bit at a time and use arbitrary Perl logic to decide what to do next.
-Currently, the C<\G> anchor only works at the beginning of a pattern.
+Currently, the C<\G> anchor is only fully supported when used to anchor
+to the start of the pattern.
C<\G> is also invaluable in processing fixed length records with
regexps. Suppose we have a snippet of coding region DNA, encoded as
@@ -1706,7 +1709,7 @@ it matches I byte 0-255. So
The last regexp matches, but is dangerous because the string
I position is no longer synchronized to the string I
position. This generates the warning 'Malformed UTF-8
-character'. C<\C> is best used for matching the binary data in strings
+character'. The C<\C> is best used for matching the binary data in strings
with binary data intermixed with Unicode characters.
Let us now discuss the rest of the character classes. Just as with
@@ -1751,7 +1754,7 @@ letter, the braces can be dropped. For instance, C<\pM> is the
character class of Unicode 'marks', for example accent marks.
For the full list see L.
-The Unicode has also been separated into various sets of charaters
+The Unicode has also been separated into various sets of characters
which you can test with C<\p{In...}> (in) and C<\P{In...}> (not in),
for example C<\p{Latin}>, C<\p{Greek}>, or C<\P{Katakana}>.
For the full list see L.
@@ -2003,6 +2006,10 @@ They evaluate true if the regexps do I match:
$x =~ /foo(?!baz)/; # matches, 'baz' doesn't follow 'foo'
$x =~ /(? is unsupported in lookbehind, because the already
+treacherous definition of C<\C> would become even more so
+when going backwards.
+
=head2 Using independent subexpressions to prevent backtracking
The last few extended patterns in this tutorial are experimental as of
@@ -2264,7 +2271,7 @@ may surprise you:
$pat = qr/(?{ $foo = 1 })/; # precompile code regexp
/foo${pat}bar/; # compiles ok
-If a regexp has (1) code expressions and interpolating variables,or
+If a regexp has (1) code expressions and interpolating variables, or
(2) a variable that interpolates a code expression, perl treats the
regexp as an error. If the code expression is precompiled into a
variable, however, interpolating is ok. The question is, why is this