Unicode properties database. C<\w> can be used to match a Japanese
ideograph, for instance.
+(However, and as a limitation of the current implementation, using
+C<\w> or C<\W> I<inside> a C<[...]> character class will still match
+with byte semantics.)
+
=item *
Named Unicode properties, scripts, and block ranges may be used like
Most operators that deal with positions or lengths in a string will
automatically switch to using character positions, including
-C<chop()>, C<substr()>, C<pos()>, C<index()>, C<rindex()>,
+C<chop()>, C<chomp()>, C<substr()>, C<pos()>, C<index()>, C<rindex()>,
C<sprintf()>, C<write()>, and C<length()>. Operators that
specifically do not switch include C<vec()>, C<pack()>, and
-C<unpack()>. Operators that really don't care include C<chomp()>,
+C<unpack()>. Operators that really don't care include
operators that treat strings as a bucket of bits, such as C<sort()>,
and operators dealing with filenames.