X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperlunicode.pod;h=190247aea796369a877a6603ef29db9832a4cbf3;hb=209071589ddd827372bd46e1358d1d13f6b4dbcb;hp=5f9ee29ece8077ae480c154f808986023f30bff9;hpb=14bb0a9a08468b34d6bd39a990c2bd5d097cab1f;p=p5sagit%2Fp5-mst-13.2.git

diff --git a/pod/perlunicode.pod b/pod/perlunicode.pod
index 5f9ee29..190247a 100644
--- a/pod/perlunicode.pod
+++ b/pod/perlunicode.pod
@@ -904,7 +904,7 @@ Like UTF-8 but EBCDIC-safe, in the way that UTF-8 is ASCII-safe.
 
 =item *
 
-UTF-16, UTF-16BE, UTF16-LE, Surrogates, and BOMs (Byte Order Marks)
+UTF-16, UTF-16BE, UTF-16LE, Surrogates, and BOMs (Byte Order Marks)
 
 The followings items are mostly for reference and general Unicode
 knowledge, Perl doesn't use these constructs internally.
@@ -956,7 +956,7 @@ format".
 
 =item *
 
-UTF-32, UTF-32BE, UTF32-LE
+UTF-32, UTF-32BE, UTF-32LE
 
 The UTF-32 family is pretty much like the UTF-16 family, expect that
 the units are 32-bit, and therefore the surrogate scheme is not
@@ -1149,7 +1149,7 @@ Unicode model is not to use UTF-8 until it is absolutely necessary.
 
 =item *
 
-C) writes a Unicode character code point into
+C writes a Unicode character code point into
 a buffer encoding the code point as UTF-8, and returns a pointer
 pointing after the UTF-8 bytes.
 
@@ -1314,8 +1314,12 @@ byte-encoded.
 
 In Perl 5.8.0 the slowness was often quite spectacular; in Perl 5.8.1
 a caching scheme was introduced which will hopefully make the slowness
-somewhat less spectacular. Operations with UTF-8 encoded strings are
-still slower, though.
+somewhat less spectacular, at least for some operations. In general,
+operations with UTF-8 encoded strings are still slower. As an example,
+the Unicode properties (character classes) like C<\p{Nd}> are known to
+be quite a bit slower (5-20 times) than their simpler counterparts
+like C<\d> (then again, there 268 Unicode characters matching C
+compared with the 10 ASCII characters matching C).
 
 =head2 Porting code from perl-5.6.X