X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperlretut.pod;h=c0a78a43e49db76d011a254091c442ef2a881650;hb=a537debe17982e491ffa12d12441cf74a452acb2;hp=f0b5d1d3891ee463b2454dfb8d7feabefb3a7958;hpb=25cf8c221694f204b8dcdb15a864ea224c360401;p=p5sagit%2Fp5-mst-13.2.git
diff --git a/pod/perlretut.pod b/pod/perlretut.pod
index f0b5d1d..c0a78a4 100644
--- a/pod/perlretut.pod
+++ b/pod/perlretut.pod
@@ -158,13 +158,14 @@ that a metacharacter can be matched by putting a backslash before it:
"2+2=4" =~ /2\+2/; # matches, \+ is treated like an ordinary +
"The interval is [0,1)." =~ /[0,1)./ # is a syntax error!
"The interval is [0,1)." =~ /\[0,1\)\./ # matches
- "/usr/bin/perl" =~ /\/usr\/local\/bin\/perl/; # matches
+ "/usr/bin/perl" =~ /\/usr\/bin\/perl/; # matches
In the last regexp, the forward slash C<'/'> is also backslashed,
because it is used to delimit the regexp. This can lead to LTS
(leaning toothpick syndrome), however, and it is often more readable
to change delimiters.
+ "/usr/bin/perl" =~ m!/usr/bin/perl!; # easier to read
The backslash character C<'\'> is a metacharacter itself and needs to
be backslashed:
@@ -689,10 +690,11 @@ inside goes into the special variables C<$1>, C<$2>, etc. They can be
used just as ordinary variables:
# extract hours, minutes, seconds
- $time =~ /(\d\d):(\d\d):(\d\d)/; # match hh:mm:ss format
- $hours = $1;
- $minutes = $2;
- $seconds = $3;
+ if ($time =~ /(\d\d):(\d\d):(\d\d)/) { # match hh:mm:ss format
+ $hours = $1;
+ $minutes = $2;
+ $seconds = $3;
+ }
Now, we know that in scalar context,
S > returns a true or false
@@ -1323,9 +1325,9 @@ If you change C<$pattern> after the first substitution happens, perl
will ignore it. If you don't want any substitutions at all, use the
special delimiter C:
- $pattern = 'Seuss';
+ @pattern = ('Seuss');
while (<>) {
- print if m'$pattern'; # matches '$pattern', not 'Seuss'
+ print if m'@pattern'; # matches literal '@pattern', not 'Seuss'
}
C acts like single quotes on a regexp; all other C delimiters
@@ -1707,7 +1709,7 @@ it matches I byte 0-255. So
The last regexp matches, but is dangerous because the string
I position is no longer synchronized to the string I
position. This generates the warning 'Malformed UTF-8
-character'. C<\C> is best used for matching the binary data in strings
+character'. The C<\C> is best used for matching the binary data in strings
with binary data intermixed with Unicode characters.
Let us now discuss the rest of the character classes. Just as with
@@ -1752,7 +1754,7 @@ letter, the braces can be dropped. For instance, C<\pM> is the
character class of Unicode 'marks', for example accent marks.
For the full list see L.
-The Unicode has also been separated into various sets of charaters
+The Unicode has also been separated into various sets of characters
which you can test with C<\p{In...}> (in) and C<\P{In...}> (not in),
for example C<\p{Latin}>, C<\p{Greek}>, or C<\P{Katakana}>.
For the full list see L.
@@ -2004,6 +2006,10 @@ They evaluate true if the regexps do I match:
$x =~ /foo(?!baz)/; # matches, 'baz' doesn't follow 'foo'
$x =~ /(? is unsupported in lookbehind, because the already
+treacherous definition of C<\C> would become even more so
+when going backwards.
+
=head2 Using independent subexpressions to prevent backtracking
The last few extended patterns in this tutorial are experimental as of
@@ -2265,7 +2271,7 @@ may surprise you:
$pat = qr/(?{ $foo = 1 })/; # precompile code regexp
/foo${pat}bar/; # compiles ok
-If a regexp has (1) code expressions and interpolating variables,or
+If a regexp has (1) code expressions and interpolating variables, or
(2) a variable that interpolates a code expression, perl treats the
regexp as an error. If the code expression is precompiled into a
variable, however, interpolating is ok. The question is, why is this