From: Jarkko Hietaniemi
Date: Tue, 16 Apr 2002 12:34:36 +0000 (+0000)
Subject: Let's not promise too much: use utf8 only works on identifier
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=c20e2abd87bda89a16a4bc4b0eaecf500fbede5c;p=p5sagit%2Fp5-mst-13.2.git

Let's not promise too much: use utf8 only works on identifier
names, not package or subroutine names (admittedly limited [1],
but that's what the Camel says, and that's what we are going
to stick to for 5.8.0). Also document that use vars does not
do utf8.

[1] The obvious problem in both is that package and subroutine
    names need to be mappable to the filesystem.

p4raw-id: //depot/perl@15947
---

diff --git a/lib/utf8.pm b/lib/utf8.pm
index c748a49..e0c4ac1 100644
--- a/lib/utf8.pm
+++ b/lib/utf8.pm
@@ -56,9 +56,9 @@ Enabling the C<utf8> pragma has the following effect:
 
 Bytes in the source text that have their high-bit set will be treated
 as being part of a literal UTF-8 character. This includes most
-literals such as identifiers, string constants, constant regular
-expression patterns and package names. On EBCDIC platforms characters
-in the Latin 1 character set are treated as being part of a literal
+literals such as identifier names, string constants, and constant
+regular expression patterns. On EBCDIC platforms characters in
+the Latin 1 character set are treated as being part of a literal
 UTF-EBCDIC character.
 
 =back
diff --git a/lib/vars.pm b/lib/vars.pm
index 233979d..020568e 100644
--- a/lib/vars.pm
+++ b/lib/vars.pm
@@ -79,6 +79,8 @@ outside of the package), it can act as an acceptable substitute by
 pre-declaring global symbols, ensuring their availability to the
 later-loaded routines.
 
+The C<use vars> does not work for UTF-8 variable names.
+
 See L.
 
 =cut
diff --git a/pod/perluniintro.pod b/pod/perluniintro.pod
index dd3064f..9235495 100644
--- a/pod/perluniintro.pod
+++ b/pod/perluniintro.pod
@@ -128,10 +128,9 @@ This model was found to be wrong, or at least clumsy: the Unicodeness
 is now carried with the data, not attached to the operations. (There
 is one remaining case where an explicit C<use utf8> is needed: if your
 Perl script itself is encoded in UTF-8, you can use UTF-8 in your
-variable and subroutine names, and in your string and regular
-expression literals, by saying C<use utf8>. This is not the default
-because that would break existing scripts having legacy 8-bit data in
-them.)
+identifier names, and in your string and regular expression literals,
+by saying C<use utf8>. This is not the default because that would
+break existing scripts having legacy 8-bit data in them.)
 
 =head2 Perl's Unicode Model
 