With the advent of 5.6.0, Perl regexps can handle more than just the
standard ASCII character set. Perl now supports I<Unicode>, a standard
for representing the alphabets from virtually all of the world's written
-languages, and a host of symbols. Perl's text strings are unicode strings, so
+languages, and a host of symbols. Perl's text strings are Unicode strings, so
they can contain characters with a value (code point or character number) higher
than 255
lib/perl5/X.X.X/unicore directory (where X.X.X is the perl
version number as it is installed on your system).
-The answer to requirement 2), as of 5.6.0, is that a regexp uses unicode
+The answer to requirement 2), as of 5.6.0, is that a regexp uses Unicode
characters. Internally, this is encoded to bytes using either UTF-8 or a
native 8-bit encoding, depending on the history of the string, but
conceptually it is a sequence of characters, not bytes. See