X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperlrequick.pod;h=7abd895e8a80441e9004c44a1d8c673920ce281c;hb=4ad40acfc62db410aa4eb7654e17246f1fc97689;hp=a14229c303ddd4e30bb74a798bb9d63422f6a1df;hpb=4b19af017623bfa3bb72bb164598a517f586e0d3;p=p5sagit%2Fp5-mst-13.2.git

diff --git a/pod/perlrequick.pod b/pod/perlrequick.pod
index a14229c..7abd895 100644
--- a/pod/perlrequick.pod
+++ b/pod/perlrequick.pod
@@ -74,7 +74,7 @@ A metacharacter can be matched by putting a backslash before it:
     "2+2=4" =~ /2+2/; # doesn't match, + is a metacharacter
     "2+2=4" =~ /2\+2/; # matches, \+ is treated like an ordinary +
     'C:\WIN32' =~ /C:\\WIN/; # matches
-    "/usr/bin/perl" =~ /\/usr\/local\/bin\/perl/; # matches
+    "/usr/bin/perl" =~ /\/usr\/bin\/perl/; # matches
 
 In the last regex, the forward slash C<'/'> is also backslashed,
 because it is used to delimit the regex.
@@ -166,24 +166,43 @@ Perl has several abbreviations for common character classes:
 =over 4
 
 =item *
-\d is a digit and represents [0-9]
+
+\d is a digit and represents
+
+    [0-9]
 
 =item *
-\s is a whitespace character and represents [\ \t\r\n\f]
+
+\s is a whitespace character and represents
+
+    [\ \t\r\n\f]
 
 =item *
-\w is a word character (alphanumeric or _) and represents [0-9a-zA-Z_]
+
+\w is a word character (alphanumeric or _) and represents
+
+    [0-9a-zA-Z_]
 
 =item *
-\D is a negated \d; it represents any character but a digit [^0-9]
+
+\D is a negated \d; it represents any character but a digit
+
+    [^0-9]
 
 =item *
-\S is a negated \s; it represents any non-whitespace character [^\s]
+
+\S is a negated \s; it represents any non-whitespace character
+
+    [^\s]
 
 =item *
-\W is a negated \w; it represents any non-word character [^\w]
+
+\W is a negated \w; it represents any non-word character
+
+    [^\w]
 
 =item *
+
 The period '.' matches any character but "\n"
 
 =back
@@ -212,11 +231,11 @@ boundary.
 
 =head2 Matching this or that
 
-We can match match different character strings with the B
+We can match different character strings with the B
 metacharacter C<'|'>. To match C or C, we form the regex
 C. As before, perl will try to match the regex at the
 earliest possible point in the string. At each character position,
-perl will first try to match the the first alternative, C. If
+perl will first try to match the first alternative, C. If
 C doesn't match, perl will then try the next alternative, C.
 If C doesn't match either, then the match fails and perl moves to
 the next position in the string. Some examples:
@@ -231,8 +250,8 @@ C is able to match earlier in the string.
     "cats" =~ /cats|cat|ca|c/; # matches "cats"
 
 At a given character position, the first alternative that allows the
-regex match to succeed wil be the one that matches. Here, all the
-alternatives match at the first string position, so th first matches.
+regex match to succeed will be the one that matches. Here, all the
+alternatives match at the first string position, so the first matches.
 
 =head2 Grouping things and hierarchical matching
 
@@ -297,18 +316,30 @@ have the following meanings:
 
 =over 4
 
-=item * C = match 'a' 1 or 0 times
+=item *
+
+C = match 'a' 1 or 0 times
 
-=item * C = match 'a' 0 or more times, i.e., any number of times
+=item *
 
-=item * C = match 'a' 1 or more times, i.e., at least once
+C = match 'a' 0 or more times, i.e., any number of times
 
-=item * C = match at least C times, but not more than C
+=item *
+
+C = match 'a' 1 or more times, i.e., at least once
+
+=item *
+
+C = match at least C times, but not more than C
 times.
 
-=item * C = match at least C or more times
+=item *
+
+C = match at least C or more times
 
-=item * C = match exactly C times
+=item *
+
+C = match exactly C times
 
 =back
 
@@ -349,8 +380,9 @@ C<$pattern> won't be changing, use the C modifier, to only perform
 variable substitutions once.
 If you don't want any substitutions at all, use the special delimiter C:
 
-    $pattern = 'Seuss';
-    m'$pattern'; # matches '$pattern', not 'Seuss'
+    @pattern = ('Seuss');
+    m/@pattern/; # matches 'Seuss'
+    m'@pattern'; # matches the literal string '@pattern'
 
 The global modifier C allows the matching operator to match
 within a string as many times as possible. In scalar context,
@@ -445,7 +477,7 @@ To extract a comma-delimited list of numbers, use
     # $const[2] = '3.142'
 
 If the empty regex C is used, the string is split into individual
-characters. If the regex has groupings, then list produced contains
+characters. If the regex has groupings, then the list produced contains
 the matched substrings from the groupings as well:
 
     $x = "/usr/bin";