X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperlunicode.pod;h=30a4482260de5b44f3d8ffe67ea0501b2076283c;hb=5ad8ef521b3ffc4e6bbbb9941bc4940d442b56b2;hp=c6866617a29d3187cb2b52d1bc6c852f028bd005;hpb=ee8c7f5465f003860e2347a2946abacac39bd9b9;p=p5sagit%2Fp5-mst-13.2.git

diff --git a/pod/perlunicode.pod b/pod/perlunicode.pod
index c686661..30a4482 100644
--- a/pod/perlunicode.pod
+++ b/pod/perlunicode.pod
@@ -10,7 +10,7 @@ WARNING: The implementation of Unicode support in Perl is incomplete.
 
 The following areas need further work.
 
-=over
+=over 4
 
 =item Input and Output Disciplines
 
@@ -157,20 +157,10 @@ C<(?:\PM\pM*)>.
 
 =item *
 
-The C<tr///> operator translates characters instead of bytes. It can also
-be forced to translate between 8-bit codes and UTF-8. For instance, if you
-know your input in Latin-1, you can say:
-
-    while (<>) {
-        tr/\0-\xff//CU; # latin1 char to utf8
-        ...
-    }
-
-Similarly you could translate your output with
-
-    tr/\0-\x{ff}//UC; # utf8 to latin1 char
-
-No, C<tr///> doesn't take /U or /C (yet?).
+The C<tr///> operator translates characters instead of bytes. Note
+that the C<tr///CU> functionality has been removed, as the interface
+was a mistake. For similar functionality see pack('U0', ...) and
+pack('C0', ...).
 
 =item *
 
@@ -208,6 +198,18 @@ byte-oriented C<chr()> and C<ord()> under utf8.
 
 =item *
 
+The bit string operators C<& | ^ ~> can operate on character data.
+However, for backward compatibility reasons (bit string operations
+when the characters all are less than 256 in ordinal value) one cannot
+mix C<~> (the bit complement) and characters both less than 256 and
+equal or greater than 256. Most importantly, the DeMorgan's laws
+(C<~($x|$y) eq ~$x&~$y>, C<~($x&$y) eq ~$x|~$y>) won't hold.
+Another way to look at this is that the complement cannot return
+B<both> the 8-bit (byte) wide bit complement, and the full character
+wide bit complement.
+
+=item *
+
 And finally, C<scalar reverse()> reverses by character rather than by byte.
 
 =back
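
Note on the documented behaviour: the patch text says that tr/// and scalar reverse() work on characters rather than bytes, and that De Morgan's laws hold for the bit string operators only while every character is below 256. The Perl sketch below is a minimal illustration of those three statements, not part of the patch. The variable names and sample strings are made up for the example, and it assumes a Perl with character (not byte) semantics and without the "bitwise" feature enabled, so that C<~>, C<&> and C<|> act on strings. It deliberately leaves out the mixed case (characters both below and above 255), since that is exactly the case the patch warns about. The comparison is parenthesised explicitly because C<eq> binds more tightly than C<&> in Perl.

```perl
use strict;
use warnings;

# A three-character string: one character above 0xFF plus two ASCII letters.
my $s = "\x{263A}ab";

# reverse in scalar context works on characters, so the wide character
# moves to the end of the string as a single unit.
my $reversed = scalar reverse $s;

# tr/// maps characters, so code points above 0xFF can appear on either side.
(my $swapped = $s) =~ tr/\x{263A}/\x{263B}/;

# For strings whose characters are all below 256, De Morgan's law holds
# for the bit string operators.
my ($x, $y) = ("ab", "cd");
my $demorgan_ok = (~($x | $y) eq (~$x & ~$y)) ? 1 : 0;

# Print only numbers to avoid "Wide character" warnings on output.
printf "%d %d %d\n", length($s), length($reversed), $demorgan_ok;
```

Both lengths are counted in characters, so the reversed string has the same length as the original.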