From: Jarkko Hietaniemi
Date: Tue, 26 Mar 2002 01:19:57 +0000 (+0000)
Subject: Mention the effect of Unicode keys on hashes.
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=3990cdf50f04b3556c0bb3f25d178926ef5d1117;p=p5sagit%2Fp5-mst-13.2.git

Mention the effect of Unicode keys on hashes.

p4raw-id: //depot/perl@15507
---

diff --git a/pod/perlunicode.pod b/pod/perlunicode.pod
index 9ba32ee..dd2a896 100644
--- a/pod/perlunicode.pod
+++ b/pod/perlunicode.pod
@@ -137,6 +137,19 @@ This works for all characters that have names.
 
 =item *
 
+If Unicode is used in hash keys, there is a subtle effect on the hashes.
+The hash becomes "Unicode-sticky" so that keys retrieved from the hash
+(either by %hash, each %hash, or keys %hash) will be in Unicode, not
+in bytes, even when the keys were bytes when they "went in". This
+"stickiness" persists unless the hash is completely emptied, either by
+using delete() or clearing it with undef() or assigning an empty list
+to the hash. Most of the time this difference is negligible, but
+there are a few places where it matters: for example the regular
+expression character classes like C<\w> behave differently for
+bytes and characters.
+
+=item *
+
 If an appropriate L<encoding> is specified, identifiers within the Perl
 script may contain Unicode alphanumeric characters, including ideographs.
 (You are currently on your own when it comes to using the