From: Jarkko Hietaniemi
Date: Fri, 16 Nov 2001 15:26:41 +0000 (+0000)
Subject: Update perluniintro on the UTF-8 output matters
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=1d7919c50afce3283e44737a6095660e99d8c972;p=p5sagit%2Fp5-mst-13.2.git

Update perluniintro on the UTF-8 output matters
(that -w will warn unless the stream is explicitly UTF-8-ified).

p4raw-id: //depot/perl@13051
---

diff --git a/pod/perluniintro.pod b/pod/perluniintro.pod
index cdd0b40..cd978d0 100644
--- a/pod/perluniintro.pod
+++ b/pod/perluniintro.pod
@@ -236,11 +236,19 @@ for doing conversions between those encodings:
 
 Normally writing out Unicode data
 
-    print chr(0x100), "\n";
+    print FH chr(0x100), "\n";
 
-will print out the raw UTF-8 bytes.
+will print out the raw UTF-8 bytes, but you will get a warning
+out of that if you use C<-w> or C<use warnings>. To avoid the
+warning open the stream explicitly in UTF-8:
 
-But reading in correctly formed UTF-8 data will not magically turn
+    open FH, ">:utf8", "file";
+
+and on already open streams use C<binmode()>:
+
+    binmode(STDOUT, ":utf8");
+
+Reading in correctly formed UTF-8 data will not magically turn
 the data into Unicode in Perl's eyes.
 
 You can use either the C<':utf8'> I/O discipline when opening files
@@ -251,11 +259,11 @@ You can use either the C<':utf8'> I/O discipline when opening files
 The I/O disciplines can also be specified more flexibly with
 the C<open> pragma; see L<open>:
 
-    use open ':utf8'; # input and output will be UTF-8
-    open X, ">utf8";
-    print X chr(0x100), "\n"; # this would have been UTF-8 without the pragma
+    use open ':utf8'; # input and output default discipline will be UTF-8
+    open X, ">file";
+    print X chr(0x100), "\n";
     close X;
-    open Y, "<utf8";
+    open Y, "<file";
     printf "%#x\n", ord(<Y>); # this should print 0x100
     close Y;
 
@@ -329,7 +337,8 @@ by repeatedly encoding it in UTF-8:
     close F;
 
 If you run this code twice, the contents of the F<file> will be twice
-UTF-8 encoded. A C<use open ':utf8'> would have avoided the bug.
+UTF-8 encoded. A C<use open ':utf8'> would have avoided the bug, or
+explicitly opening also the F<file> for input as UTF-8.
 
 =head2 Special Cases
 