[:class:]
-is also available. The available classes and their backslash
-equivalents (if available) are as follows:
+is also available. Note that the C<[> and C<]> brackets are I<literal>;
+they must always be used within a character class expression.
+
+ # this is correct:
+ $string =~ /[[:alpha:]]/;
+
+ # this is not, and will generate a warning:
+ $string =~ /[:alpha:]/;
+
+The available classes and their backslash equivalents (if available) are
+as follows:
X<character class>
X<alpha> X<alnum> X<ascii> X<blank> X<cntrl> X<digit> X<graph>
X<lower> X<print> X<punct> X<space> X<upper> X<word> X<xdigit>
backslash character classes (if available), will hold:
X<character class> X<\p> X<\p{}>
- [:...:] \p{...} backslash
+ [[:...:]] \p{...} backslash
alpha IsAlpha
alnum IsAlnum
word IsWord
xdigit IsXDigit
-For example C<[:lower:]> and C<\p{IsLower}> are equivalent.
+For example C<[[:lower:]]> and C<\p{IsLower}> are equivalent.
If the C<utf8> pragma is not used but the C<locale> pragma is, the
classes correlate with the usual isalpha(3) interface (except for
with a '^'. This is a Perl extension. For example:
X<character class, negation>
- POSIX traditional Unicode
+ POSIX traditional Unicode
- [:^digit:] \D \P{IsDigit}
- [:^space:] \S \P{IsSpace}
- [:^word:] \W \P{IsWord}
+ [[:^digit:]] \D \P{IsDigit}
+ [[:^space:]] \S \P{IsSpace}
+ [[:^word:]] \W \P{IsWord}
Perl respects the POSIX standard in that POSIX character classes are
only supported within a character class. The POSIX character classes
claims that there is no 123 in the string. Here's a clearer picture of
why that pattern matches, contrary to popular expectations:
- $x = 'ABC123' ;
- $y = 'ABC445' ;
+ $x = 'ABC123';
+ $y = 'ABC445';
- print "1: got $1\n" if $x =~ /^(ABC)(?!123)/ ;
- print "2: got $1\n" if $y =~ /^(ABC)(?!123)/ ;
+ print "1: got $1\n" if $x =~ /^(ABC)(?!123)/;
+ print "2: got $1\n" if $y =~ /^(ABC)(?!123)/;
- print "3: got $1\n" if $x =~ /^(\D*)(?!123)/ ;
- print "4: got $1\n" if $y =~ /^(\D*)(?!123)/ ;
+ print "3: got $1\n" if $x =~ /^(\D*)(?!123)/;
+ print "4: got $1\n" if $y =~ /^(\D*)(?!123)/;
This prints
of the string in their match. So rewriting this way produces what
you'd expect; that is, case 5 will fail, but case 6 succeeds:
- print "5: got $1\n" if $x =~ /^(\D*)(?=\d)(?!123)/ ;
- print "6: got $1\n" if $y =~ /^(\D*)(?=\d)(?!123)/ ;
+ print "5: got $1\n" if $x =~ /^(\D*)(?=\d)(?!123)/;
+ print "6: got $1\n" if $y =~ /^(\D*)(?=\d)(?!123)/;
6: got ABC