This is a quick reference to Perl's regular expressions.
For full information see L<perlre> and L<perlop>, as well
-as the L<references|/"SEE ALSO"> section in this document.
+as the L</"SEE ALSO"> section in this document.
=head1 OPERATORS
$var =~ /foo/;
+ !~ determines to which variable the regex is applied,
+ and negates the result of the match; it returns
+ false if the match succeeds, and true if it fails.
+
+ $var !~ /foo/;
+
m/pattern/igmsoxc searches a string for a pattern match,
applying the given options.
s match as a Single line - . matches \n
o compile pattern Once
x eXtended legibility - free whitespace and comments
- c don't reset pos on fails when using /g
+ c don't reset pos on failed matches when using /g
- If 'pattern' is an empty string, the last I<successfully> match
+ If 'pattern' is an empty string, the last I<successfully> matched
regex is used. Delimiters other than '/' may be used for both this
operator and the following ones.
as a double quoted string unless a single-quote (') is the delimiter.
?pattern? is like m/pattern/ but matches only once. No alternate
- delimiters can be used. Must be reset with 'reset'.
+ delimiters can be used. Must be reset with L<reset|perlfunc/reset>.
=head1 SYNTAX
- \ Escapes the character(s) immediately following it
+ \ Escapes the character immediately following it
. Matches any single character except a newline (unless /s is used)
^ Matches at the beginning of the string (or line, if /m is used)
$ Matches at the end of the string (or line, if /m is used)
[...] Matches any one of the characters contained within the brackets
(...) Groups subexpressions for capturing to $1, $2...
(?:...) Groups subexpressions without capturing (cluster)
- | Matches either the expression preceding or following it
+ | Matches either the subexpression preceding or following it
\1, \2 ... The text from the Nth group
=head2 ESCAPE SEQUENCES
\cx Control-x
\N{name} A named character
- \l Lowercase until next character
- \u Uppercase until next character
+ \l Lowercase next character
+ \u Uppercase next character
\L Lowercase until \E
\U Uppercase until \E
\Q Disable pattern metacharacters until \E
[amy] Match 'a', 'm' or 'y'
[f-j] Dash specifies "range"
[f-j-] Dash escaped or at start or end means 'dash'
- [^f-j] Caret indicates "match char any _except_ these"
+ [^f-j] Caret indicates "match any character _except_ these"
The following work within or without a character class:
\d A digit, same as [0-9]
\D A nondigit, same as [^0-9]
- \w A word character (alphanumeric), same as [a-zA-Z_0-9]
- \W A non-word character, [^a-zA-Z_0-9]
+ \w A word character (alphanumeric or underscore), same as [a-zA-Z0-9_]
+ \W A non-word character, [^a-zA-Z0-9_]
\s A whitespace character, same as [ \t\n\r\f]
\S A non-whitespace character, [^ \t\n\r\f]
- \C Match a byte (with Unicode. '.' matches char)
+ \C Match a byte (with Unicode, '.' matches char)
\pP Match P-named (Unicode) property
\p{...} Match Unicode property with long name
\PP Match non-P
^ Match string start (or line, if /m is used)
$ Match string end (or line, if /m is used) or before newline
\b Match word boundary (between \w and \W)
- \B Match except at word boundary
+ \B Match except at word boundary (between \w and \w or \W and \W)
\A Match string start (regardless of /m)
- \Z Match string end (preceding optional newline)
+ \Z Match string end (before optional newline)
\z Match absolute string end
\G Match where previous m//g left off
- \c Suppresses resetting of search position when used with /g.
- Without \c, search pattern is reset to the beginning of the string
=head2 QUANTIFIERS
-Quantifiers are greedy by default --- match the B<longest> leftmost.
+Quantifiers are greedy by default -- match the B<longest> leftmost.
    Maximal Minimal Allowed range
    ------- ------- -------------
    {n,m}   {n,m}?  Must occur at least n times but no more than m times
    {n,}    {n,}?   Must occur at least n times
-   {n}     {n}?    Must match exactly n times
+   {n}     {n}?    Must occur exactly n times
    *       *?      0 or more times (same as {0,})
    +       +?      1 or more times (same as {1,})
    ?       ??      0 or 1 time (same as {0,1})
+There is no quantifier {,n} -- that gets understood as a literal string.
+
=head2 EXTENDED CONSTRUCTS
(?#text) A comment
- (?imxs-imsx:...) Enable/disable option (as per m//)
+ (?imxs-imsx:...) Enable/disable option (as per m// modifiers)
(?=...) Zero-width positive lookahead assertion
(?!...) Zero-width negative lookahead assertion
- (?<...) Zero-width positive lookbehind assertion
+ (?<=...) Zero-width positive lookbehind assertion
(?<!...) Zero-width negative lookbehind assertion
(?>...) Grab what we can, prohibit backtracking
(?{ code }) Embedded code, return value becomes $^R
$+ Last parenthesized pattern match
$^N Holds the most recently closed capture
$^R Holds the result of the last (?{...}) expr
- @- Offsets of starts of groups. [0] holds start of whole match
- @+ Offsets of ends of groups. [0] holds end of whole match
+ @- Offsets of starts of groups. $-[0] holds start of whole match
+ @+ Offsets of ends of groups. $+[0] holds end of whole match
-Capture groups are numbered according to their I<opening> paren.
+Captured groups are numbered according to their I<opening> paren.
=head1 FUNCTIONS
lc Lowercase a string
lcfirst Lowercase first char of a string
uc Uppercase a string
- ucfirst Titlecase first char of a string
+ ucfirst Uppercase first char of a string
pos Return or set current match position
quotemeta Quote metacharacters
reset Reset ?pattern? status
and
Jeffrey Goff
for useful advice.
+
+=cut