"2+2=4" =~ /2+2/; # doesn't match, + is a metacharacter
"2+2=4" =~ /2\+2/; # matches, \+ is treated like an ordinary +
'C:\WIN32' =~ /C:\\WIN/; # matches
- "/usr/bin/perl" =~ /\/usr\/local\/bin\/perl/; # matches
+ "/usr/bin/perl" =~ /\/usr\/bin\/perl/; # matches
In the last regex, the forward slash C<'/'> is also backslashed,
because it is used to delimit the regex.
escape sequences, e.g., C<\033>, or hexadecimal escape sequences,
e.g., C<\x1B>:
- "1000\t2000" =~ m(0\t2) # matches
- "cat" =~ /\143\x61\x74/ # matches, but a weird way to spell cat
+ "1000\t2000" =~ m(0\t2) # matches
+ "cat" =~ /\143\x61\x74/ # matches in ASCII, but a weird way to spell cat
Regexes are treated mostly as double quoted strings, so variable
substitution works:
=item *
-\d is a digit and represents [0-9]
+\d is a digit and represents
+
+ [0-9]
=item *
-\s is a whitespace character and represents [\ \t\r\n\f]
+\s is a whitespace character and represents
+
+ [\ \t\r\n\f]
=item *
-\w is a word character (alphanumeric or _) and represents [0-9a-zA-Z_]
+\w is a word character (alphanumeric or _) and represents
+
+ [0-9a-zA-Z_]
=item *
-\D is a negated \d; it represents any character but a digit [^0-9]
+\D is a negated \d; it represents any character but a digit
+
+ [^0-9]
=item *
-\S is a negated \s; it represents any non-whitespace character [^\s]
+\S is a negated \s; it represents any non-whitespace character
+
+ [^\s]
=item *
-\W is a negated \w; it represents any non-word character [^\w]
+\W is a negated \w; it represents any non-word character
+
+ [^\w]
=item *
=head2 Matching this or that
-We can match match different character strings with the B<alternation>
+We can match different character strings with the B<alternation>
metacharacter C<'|'>. To match C<dog> or C<cat>, we form the regex
C<dog|cat>. As before, perl will try to match the regex at the
earliest possible point in the string. At each character position,
-perl will first try to match the the first alternative, C<dog>. If
+perl will first try to match the first alternative, C<dog>. If
C<dog> doesn't match, perl will then try the next alternative, C<cat>.
If C<cat> doesn't match either, then the match fails and perl moves to
the next position in the string. Some examples:
"cats" =~ /cats|cat|ca|c/; # matches "cats"
At a given character position, the first alternative that allows the
-regex match to succeed wil be the one that matches. Here, all the
-alternatives match at the first string position, so th first matches.
+regex match to succeed will be the one that matches. Here, all the
+alternatives match at the first string position, so the first matches.
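
The ordering rule can be checked with a short script. This is a minimal sketch: the sample strings and the C<@results> array are made up for illustration, and C<$&> is used to read back the matched substring.

```perl
use strict;
use warnings;

# At a given position, the leftmost alternative that lets the match
# succeed wins, even if a later alternative would match more text.
my @results;
for my $regex (qr/c|ca|cat|cats/, qr/cats|cat|ca|c/) {
    if ("cats" =~ $regex) {
        push @results, $&;    # $& holds the substring that matched
    }
}
print "@results\n";           # prints "c cats"

# An earlier string position beats an earlier alternative: "cat" is
# found at position 0 before "dog" could be found further along.
if ("cats and dogs" =~ /dog|cat|bird/) {
    push @results, $&;
    print "$&\n";             # prints "cat"
}
```

Reordering the alternatives changes what is matched, not whether the match succeeds, so put the longest alternative first when the longest match is wanted.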
=head2 Grouping things and hierarchical matching
=over 4
-=item * C<a?> = match 'a' 1 or 0 times
+=item *
+
+C<a?> = match 'a' 1 or 0 times
+
+=item *
+
+C<a*> = match 'a' 0 or more times, i.e., any number of times
-=item * C<a*> = match 'a' 0 or more times, i.e., any number of times
+=item *
-=item * C<a+> = match 'a' 1 or more times, i.e., at least once
+C<a+> = match 'a' 1 or more times, i.e., at least once
-=item * C<a{n,m}> = match at least C<n> times, but not more than C<m>
+=item *
+
+C<a{n,m}> = match at least C<n> times, but not more than C<m>
times.
-=item * C<a{n,}> = match at least C<n> or more times
+=item *
+
+C<a{n,}> = match at least C<n> times
+
+=item *
-=item * C<a{n}> = match exactly C<n> times
+C<a{n}> = match exactly C<n> times
=back
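
The quantifiers listed above can be exercised one at a time. This is a minimal sketch with made-up sample strings. The anchors C<^> and C<$> are an addition here, used so that each pattern must account for the whole string and the upper and lower bounds are really tested.

```perl
use strict;
use warnings;

my @report;
push @report, ("color"  =~ /^colou?r$/  ? "yes" : "no");  # u? allows zero 'u'
push @report, ("colour" =~ /^colou?r$/  ? "yes" : "no");  # u? allows one 'u'
push @report, ("ct"     =~ /^ca*t$/     ? "yes" : "no");  # a* accepts no 'a' at all
push @report, ("ct"     =~ /^ca+t$/     ? "yes" : "no");  # a+ needs at least one 'a'
push @report, ("caaat"  =~ /^ca{2,3}t$/ ? "yes" : "no");  # 3 is within 2..3
push @report, ("caaaat" =~ /^ca{2,3}t$/ ? "yes" : "no");  # 4 exceeds the maximum
push @report, ("caaaat" =~ /^ca{2,}t$/  ? "yes" : "no");  # no upper bound
push @report, ("caat"   =~ /^ca{3}t$/   ? "yes" : "no");  # exactly 3 required
print "@report\n";    # prints "yes yes yes no yes no yes no"
```

Without the anchors, C<"caaaat" =~ /a{2,3}/> would still succeed, because the regex is free to match just three of the four C<a> characters.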
perform variable substitutions once. If you don't want any
substitutions at all, use the special delimiter C<m''>:
- $pattern = 'Seuss';
- m'$pattern'; # matches '$pattern', not 'Seuss'
+ @pattern = ('Seuss');
+ m/@pattern/; # matches 'Seuss'
+ m'@pattern'; # matches the literal string '@pattern'
The global modifier C<//g> allows the matching operator to match
within a string as many times as possible. In scalar context,
$x = "I batted 4 for 4";
$x =~ s/4/four/g; # $x contains "I batted four for four"
+The non-destructive modifier C<s///r> causes the result of the substitution
+to be returned instead of modifying C<$_> (or whatever variable the
+substitution was bound to with C<=~>):
+
+ $x = "I like dogs.";
+ $y = $x =~ s/dogs/cats/r;
+ print "$x $y\n"; # prints "I like dogs. I like cats."
+
+ $x = "Cats are great.";
+ print $x =~ s/Cats/Dogs/r =~ s/Dogs/Frogs/r =~ s/Frogs/Hedgehogs/r, "\n";
+ # prints "Hedgehogs are great."
+
+ @foo = map { s/[a-z]/X/r } qw(a b c 1 2 3);
+ # @foo is now qw(X X X 1 2 3)
+
The evaluation modifier C<s///e> wraps an C<eval{...}> around the
replacement string and the evaluated result is substituted for the
matched substring. Some examples:
# $const[2] = '3.142'
If the empty regex C<//> is used, the string is split into individual
-characters. If the regex has groupings, then list produced contains
+characters. If the regex has groupings, then the list produced contains
the matched substrings from the groupings as well:
$x = "/usr/bin";