=head1 NAME

perlreref - Perl Regular Expressions Reference

=head1 DESCRIPTION

This is a quick reference to Perl's regular expressions.
For full information see L<perlre> and L<perlop>, as well
as the L<references|/"SEE ALSO"> section in this document.

=head2 OPERATORS

  =~ determines to which variable the regex is applied.
     In its absence, $_ is used.

  m/pattern/igmsoxc searches a string for a pattern match,
     applying the given options.

        i  case-Insensitive pattern matching
        g  Global - all occurrences
        m  Multiline mode - ^ and $ match internal lines
        s  match as a Single line - . matches \n
        o  compile pattern Once
        x  eXtended legibility - free whitespace and comments
        c  don't reset pos on failed matches when using /g

     If 'pattern' is an empty string, the last I<successfully> matched
     regex is used. Delimiters other than '/' may be used for both this
     operator and the following ones.

  qr/pattern/imsox lets you store a regex in a variable,
     or pass one around. Modifiers are as for m// and are stored
     within the regex.
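
As an illustration of the m// modifiers and qr// described above, here is a minimal sketch. The sample text and variable names are made up for the example.

```perl
use strict;
use warnings;

my $text = "Alpha beta\nGAMMA delta";

# /i ignores case; in list context /g returns every match.
my @words = $text =~ /\b[a-z]+\b/gi;

# /m lets ^ match after the embedded newline as well.
my @line_starts = $text =~ /^(\w+)/gm;

# qr// stores a compiled regex together with its modifiers.
my $greek = qr/ (?: alpha | gamma ) /xi;
my $count = () = $text =~ /$greek/g;

# Without =~, the match is applied to $_.
my @hits = grep { /$greek/ } qw(alpha beta Gamma);
```

Here @words should hold all four words, @line_starts the first word of each line, and $count the number of times the stored regex matches.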

  s/pattern/replacement/igmsoxe substitutes matches of
     'pattern' with 'replacement'. Modifiers are as for m//,
     with one addition:

        e  Evaluate replacement as an expression

     'e' may be specified multiple times. 'replacement' is interpreted
     as a double quoted string unless a single-quote (') is the delimiter.

  ?pattern? is like m/pattern/ but matches only once. No alternate
     delimiters can be used. Must be reset with 'reset'.
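
The substitution operator can be sketched as follows. The strings are invented for the example; the copy-then-substitute idiom C<(my $copy = $orig) =~ s/.../.../> is used so each result lands in its own variable.

```perl
use strict;
use warnings;

my $price = "cost: 4 apples, 10 pears";

# /g replaces every match; /e evaluates the replacement as Perl code.
(my $doubled = $price) =~ s/(\d+)/$1 * 2/ge;

# Alternate delimiters avoid escaping slashes in paths.
(my $path = "/usr/local/bin") =~ s{/local}{};

# With single-quote delimiters the replacement is not interpolated,
# so the result is the two literal characters '$' and '1'.
(my $literal = "x") =~ s'x'$1';

# s/// returns the number of substitutions made.
my $spaces = "a b  c";
my $n = ($spaces =~ s/\s+/ /g);
```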

=head2 SYNTAX

   \       Escapes the character(s) immediately following it
   .       Matches any single character except a newline (unless /s is used)
   ^       Matches at the beginning of the string (or line, if /m is used)
   $       Matches at the end of the string (or line, if /m is used)
   *       Matches the preceding element 0 or more times
   +       Matches the preceding element 1 or more times
   ?       Matches the preceding element 0 or 1 times
   {...}   Specifies a range of occurrences for the element preceding it
   [...]   Matches any one of the characters contained within the brackets
   (...)   Groups subexpressions for capturing to $1, $2...
   (?:...) Groups subexpressions without capturing (cluster)
   |       Matches either the expression preceding or following it
   \1, \2 ...  The text from the Nth group
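
A short sketch of how capturing groups, clusters, alternation and backreferences combine. The pattern and test strings are hypothetical.

```perl
use strict;
use warnings;

# (...) captures, (?:...) only groups, | alternates,
# and \1 refers back to whatever group 1 matched.
my $pair = qr/^(\w+)-(?:x|y)-\1$/;

my $ok    = "abc-x-abc" =~ $pair ? 1 : 0;
my $first = $1;                              # text captured by group 1
my $bad   = "abc-y-abd" =~ $pair ? 1 : 0;    # fails: \1 must repeat 'abc'

# {n,m} bounds the repetition of the preceding element.
my $bounded = "aaaa" =~ /^a{2,3}$/ ? 1 : 0;  # fails: four a's exceed the range

# . does not match a newline unless /s is given.
my $dot   = "a\nb" =~ /a.b/  ? 1 : 0;
my $dot_s = "a\nb" =~ /a.b/s ? 1 : 0;
```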

=head2 ESCAPE SEQUENCES

These work as in normal strings.

   \037     Any octal ASCII value
   \x7f     Any hexadecimal ASCII value
   \x{263a} A wide hexadecimal value
   \N{name} A named character

   \l  Lowercase next character
   \u  Uppercase next character
   \Q  Disable pattern metacharacters until \E
   \E  End modification

This one works differently from normal strings:

   \b  An assertion, not backspace, except in a character class
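
The following sketch shows why \Q...\E matters when interpolating text into a pattern, and how \b changes meaning inside a character class. The variable $input stands in for hypothetical user-supplied text.

```perl
use strict;
use warnings;

my $input = "1+1=2?";    # hypothetical user-supplied text

# Unquoted, + and ? act as quantifiers, so the pattern matches "11=".
my $loose  = "so 11=3"   =~ /$input/     ? 1 : 0;
# \Q...\E quotes the metacharacters, so only the literal text matches.
my $strict = "so 11=3"   =~ /\Q$input\E/ ? 1 : 0;
my $exact  = "is 1+1=2?" =~ /\Q$input\E/ ? 1 : 0;

# \x41 (hexadecimal) and \101 (octal) both denote "A".
my $same = "\x41" eq "\101" ? 1 : 0;

# \u affects only the next character.
my $title = "\uperl";

# \b is a word boundary in a pattern but a backspace inside [...].
my $boundary  = "cat nap" =~ /cat\b/ ? 1 : 0;
my $backspace = "a\x08b"  =~ /[\b]/  ? 1 : 0;
```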

=head2 CHARACTER CLASSES

   [amy]    Match 'a', 'm' or 'y'
   [f-j]    Dash specifies "range"
   [f-j-]   Dash escaped or at start or end means 'dash'
   [^f-j]   Caret indicates "match any character _except_ these"

The following work inside or outside a character class:

   \d      A digit, same as [0-9]
   \D      A nondigit, same as [^0-9]
   \w      A word character (alphanumeric), same as [a-zA-Z_0-9]
   \W      A non-word character, [^a-zA-Z_0-9]
   \s      A whitespace character, same as [ \t\n\r\f]
   \S      A non-whitespace character, [^ \t\n\r\f]
   \C      Match a byte (with Unicode, '.' matches a character)
   \pP     Match P-named (Unicode) property
   \p{...} Match Unicode property with long name
   \P{...} Match lack of Unicode property with long name
   \X      Match extended Unicode sequence
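
A minimal sketch of bracketed classes, the shorthand classes and a Unicode property. The sample strings are invented; the Greek letter is written as a code point so the example does not depend on the source file's encoding.

```perl
use strict;
use warnings;

my $s = "id=42; tag=a-b";

my @numbers = $s =~ /\d+/g;              # runs of digits
my @keys    = $s =~ /(\w+)=/g;           # word characters before each '='
my ($tag)   = $s =~ /tag=([a-z-]+)/;     # trailing dash in the class is literal
my @letters = "a1b2" =~ /[^0-9]/g;       # negated class
my @fields  = split /\s+/, "x \t y\nz";  # any run of whitespace

# U+03B1 GREEK SMALL LETTER ALPHA has the Lower property.
my $alpha     = "\x{3b1}";
my $is_lower  = $alpha =~ /\p{IsLower}/ ? 1 : 0;
my $not_lower = $alpha =~ /\P{IsLower}/ ? 1 : 0;
```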

POSIX character classes and their Unicode and Perl equivalents:

   POSIX   Unicode      Perl        Description
   alnum   IsAlnum                  Alphanumeric
   alpha   IsAlpha                  Alphabetic
   ascii   IsASCII                  Any ASCII char
   blank   IsSpace      [ \t]       Horizontal whitespace (GNU)
   cntrl   IsCntrl                  Control characters
   digit   IsDigit      \d          Digits
   graph   IsGraph                  Alphanumeric and punctuation
   lower   IsLower                  Lowercase chars (locale aware)
   print   IsPrint                  Alphanumeric, punct, and space
   punct   IsPunct                  Punctuation
   space   IsSpace      [\s\ck]     Whitespace
           IsSpacePerl  \s          Perl's whitespace definition
   upper   IsUpper                  Uppercase chars (locale aware)
   word    IsWord       \w          Alphanumeric plus _ (Perl)
   xdigit  IsXDigit     [\dA-Fa-f]  Hexadecimal digit

Within a character class:

    POSIX       traditional   Unicode
    [:digit:]       \d        \p{IsDigit}
    [:^digit:]      \D        \P{IsDigit}
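
POSIX classes are written inside a bracketed class, which gives the doubled brackets seen in this sketch. The input strings are made up for illustration.

```perl
use strict;
use warnings;

my $line = "Total:\t3F units";

# [[:xdigit:]] is the POSIX class [:xdigit:] inside a bracketed class.
my ($hex) = $line =~ /([[:xdigit:]]+)\s+units/;

# [:blank:] covers horizontal whitespace (space and tab).
my @parts = split /[[:blank:]]+/, "a \t b";

# Strip punctuation.
(my $clean = "a!b?c") =~ s/[[:punct:]]//g;

# [:^digit:] negates the class, like \D.
my $nondigits = () = "a1b22" =~ /[[:^digit:]]/g;
```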

=head2 ANCHORS

All are zero-width assertions.

   ^  Match string start (or line, if /m is used)
   $  Match string end (or line, if /m is used) or before newline
   \b Match word boundary (between \w and \W)
   \B Match except at word boundary
   \A Match string start (regardless of /m)
   \Z Match string end (preceding optional newline)
   \z Match absolute string end
   \G Match where previous m//g left off

The /c modifier (a modifier, not an escape sequence) suppresses
resetting of the search position when a /g match fails. Without /c,
a failed match resets the search position to the beginning of the string.
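
The differences between the end-of-string anchors, and the use of \G with /gc, can be sketched as below. The tiny tokenizer is only an illustration; its token names (NUM, ID, OP) are invented for the example.

```perl
use strict;
use warnings;

my $s = "line\n";
my $dollar  = $s =~ /line$/  ? 1 : 0;    # $ allows a trailing newline
my $big_z   = $s =~ /line\Z/ ? 1 : 0;    # so does \Z
my $small_z = $s =~ /line\z/ ? 1 : 0;    # \z does not

my $two = "one\ntwo";
my $caret   = $two =~ /^two/   ? 1 : 0;  # ^ is string start only
my $caret_m = $two =~ /^two/m  ? 1 : 0;  # /m also allows line starts
my $big_a   = $two =~ /\Atwo/m ? 1 : 0;  # \A ignores /m

# \G anchors each attempt where the last /g match ended;
# /c keeps that position when an alternative fails.
my $src = "12+x";
my @tokens;
for ($src) {
    while (1) {
        if    (/\G(\d+)/gc)    { push @tokens, "NUM:$1" }
        elsif (/\G([a-z]+)/gc) { push @tokens, "ID:$1"  }
        elsif (/\G(.)/gcs)     { push @tokens, "OP:$1"  }
        else                   { last }
    }
}
```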

=head2 QUANTIFIERS

Quantifiers are greedy by default and match the B<longest> leftmost.

   Maximal Minimal Allowed range
   ------- ------- -------------
   {n,m}   {n,m}?  Must occur at least n times but no more than m times
   {n,}    {n,}?   Must occur at least n times
   {n}     {n}?    Must match exactly n times
   *       *?      0 or more times (same as {0,})
   +       +?      1 or more times (same as {1,})
   ?       ??      0 or 1 time (same as {0,1})
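
A small sketch contrasting maximal and minimal quantifiers on the same input. The markup string is hypothetical.

```perl
use strict;
use warnings;

my $html = "<b>one</b> and <b>two</b>";

# Greedy .* runs to the last closing tag.
my ($greedy) = $html =~ /<b>(.*)<\/b>/;

# Minimal .*? stops at the first closing tag.
my ($lazy) = $html =~ /<b>(.*?)<\/b>/;

# The same holds for counted ranges.
my ($max) = "aaaa" =~ /(a{2,3})/;
my ($min) = "aaaa" =~ /(a{2,3}?)/;
```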

=head2 EXTENDED CONSTRUCTS

   (?imxs-imsx:...)  Enable/disable option (as per m//)
   (?=...)           Zero-width positive lookahead assertion
   (?!...)           Zero-width negative lookahead assertion
   (?<=...)          Zero-width positive lookbehind assertion
   (?<!...)          Zero-width negative lookbehind assertion
   (?>...)           Grab what we can, prohibit backtracking
   (?{ code })       Embedded code, return value becomes $^R
   (??{ code })      Dynamic regex, return value used as regex
   (?(cond)yes|no)   cond being integer corresponding to capturing parens
   (?(cond)yes)         or a lookaround/eval zero-width assertion
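
Several of these constructs are easier to follow in use. The sketch below covers lookaround, the atomic group, an inline modifier and a conditional; all strings and names are invented for the example.

```perl
use strict;
use warnings;

# Lookbehind and lookahead together: insert commas between digit triples.
(my $n = "1234567") =~ s/(?<=\d)(?=(?:\d{3})+$)/,/g;

# Negative lookahead: keep names that do not start with "tmp_".
my @kept = grep { /^(?!tmp_)\w+$/ } qw(tmp_a data tmp_b log);

# (?>...) will not give back an 'a', so the following 'ab' cannot match.
my $atomic = "aaab" =~ /(?>a+)ab/ ? 1 : 0;
my $normal = "aaab" =~ /a+ab/     ? 1 : 0;

# (?i:...) applies /i to part of the pattern only.
my $partial  = "Hello WORLD" =~ /(?i:hello) WORLD/ ? 1 : 0;
my $partial2 = "Hello world" =~ /(?i:hello) WORLD/ ? 1 : 0;

# (?(1)...) requires a closing paren only if group 1 matched an opening one.
my $balanced = qr/^(\()?\d+(?(1)\))$/;
my @results  = map { $_ =~ $balanced ? 1 : 0 } ("(42)", "42", "(42", "42)");
```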

=head2 VARIABLES

   $_    Default variable for operators to use
   $*    Enable multiline matching (deprecated; not in 5.9.0 or later)

   $&    Entire matched string
   $`    Everything prior to matched string
   $'    Everything after matched string

The use of those last three will slow down B<all> regex use
within your program. Consult L<perlvar> for C<@LAST_MATCH_START>
to see equivalent expressions that won't cause a slowdown.
See also L<Devel::SawAmpersand>.

   $1, $2 ...  hold the Xth captured expr
   $+    Last parenthesized pattern match
   $^N   Holds the most recently closed capture
   $^R   Holds the result of the last (?{...}) expr
   @-    Offsets of starts of groups. [0] holds start of whole match
   @+    Offsets of ends of groups. [0] holds end of whole match

Capture groups are numbered according to their I<opening> paren.
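
The numbering rule and the offset arrays can be sketched as follows, including substr expressions that recover the same text as $`, $& and $' without using those variables. The date string is invented for the example.

```perl
use strict;
use warnings;

my $date = "on 2024-05-17.";
my (@groups, $last, $before, $matched, $after, @group2_span);

if ($date =~ /((\d+)-(\d+))-(\d+)/) {
    # Numbered by opening paren: the outer group is $1, the nested ones $2 and $3.
    @groups = ($1, $2, $3, $4);
    $last   = $+;                            # last parenthesized match

    # Equivalents of $`, $& and $' built from @- and @+.
    $before  = substr($date, 0, $-[0]);
    $matched = substr($date, $-[0], $+[0] - $-[0]);
    $after   = substr($date, $+[0]);

    @group2_span = ($-[2], $+[2]);           # start and end offsets of group 2
}
```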

=head2 FUNCTIONS

   lc          Lowercase a string
   lcfirst     Lowercase first char of a string
   uc          Uppercase a string
   ucfirst     Titlecase first char of a string
   pos         Return or set current match position
   quotemeta   Quote metacharacters
   reset       Reset ?pattern? status
   study       Analyze string for optimizing matching

   split       Use regex to split a string into parts
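
A brief sketch of a few of these functions working alongside a regex. The input values are hypothetical.

```perl
use strict;
use warnings;

# quotemeta backslashes regex metacharacters, like \Q in a pattern.
my $quoted = quotemeta("a.b*c");

# pos reports where the last scalar-context /g match ended, and can be set.
my $s = "aXbXc";
my $found = $s =~ /X/g;
my $first_pos = pos($s);
pos($s) = 3;                       # resume the search from offset 3
$found = $s =~ /X/g;
my $second_pos = pos($s);

# split takes a regex as its separator.
my @fields = split /\s*,\s*/, "a , b,c";

# Case functions compose.
my $name = ucfirst lc "pERL";
```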

This document may be distributed under the same terms as Perl itself.

=head1 SEE ALSO

=over 4

=item *

L<perlretut> for a tutorial on regular expressions.

=item *

L<perlrequick> for a rapid tutorial.

=item *

L<perlre> for more details.

=item *

L<perlvar> for details on the variables.

=item *

L<perlop> for details on the operators.

=item *

L<perlfunc> for details on the functions.

=item *

L<perlfaq6> for FAQs on regular expressions.

=item *

The L<re> module to alter behaviour and aid
debugging.

=item *

L<perldebug/"Debugging regular expressions">

=item *

L<perluniintro>, L<perlunicode>, L<charnames> and L<locale>
for details on regexes and internationalisation.

=item *

I<Mastering Regular Expressions> by Jeffrey Friedl
(F<http://regex.info/>) for a thorough grounding and
reference on the topic.

=back