From: Jarkko Hietaniemi
Date: Fri, 6 Sep 2002 06:01:57 +0000 (+0300)
Subject: (mostly (Unicode)) pod nits
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=63de3cb284beb0325229608ff63562933eba8f50;p=p5sagit%2Fp5-mst-13.2.git

(mostly (Unicode)) pod nits

Message-Id: <20020906030157.GA28252@lyta.hut.fi>
p4raw-id: //depot/perl@17850
---

diff --git a/pod/perl.pod b/pod/perl.pod
index 75331e1..66a0821 100644
--- a/pod/perl.pod
+++ b/pod/perl.pod
@@ -263,7 +263,8 @@ L, and L.
 
 =item *
 
-roll-your-own magic variables (including multiple simultaneous DBM implementations)
+roll-your-own magic variables (including multiple simultaneous DBM
+implementations)
 
 Described in L and L.
 
@@ -288,21 +289,15 @@ and L.
 
 =item *
 
-compilability into C code or Perl bytecode
-
-Described in L and L.
-
-=item *
-
 support for light-weight processes (threads)
 
-Described in L and L.
+Described in L and L.
 
 =item *
 
-support for internationalization, localization, and Unicode
+support for Unicode, internationalization, and localization
 
-Described in L and L.
+Described in L, L and L.
 
 =item *
 
diff --git a/pod/perlunicode.pod b/pod/perlunicode.pod
index 8489702..49f7432 100644
--- a/pod/perlunicode.pod
+++ b/pod/perlunicode.pod
@@ -598,17 +598,8 @@ than one Unicode character.
 
 =back
 
-The following cases do not yet work:
-
-=over 8
-
-=item *
-
-the "final sigma" (Greek), and
-
-=item *
-
-anything to with locales (Lithuanian, Turkish, Azeri).
+Things to do with locales (Lithuanian, Turkish, Azeri) do B<not> work
+since Perl does not understand the concept of Unicode locales.
 
 =back
 
@@ -771,17 +762,19 @@ which will match assigned characters known to be part of the Greek script.
 
         Level 2 - Extended Unicode Support
 
-        3.1     Surrogates                          - MISSING
-        3.2     Canonical Equivalents               - MISSING   [11][12]
-        3.3     Locale-Independent Graphemes        - MISSING   [13]
-        3.4     Locale-Independent Words            - MISSING   [14]
-        3.5     Locale-Independent Loose Matches    - MISSING   [15]
-
-        [11] see UTR#15 Unicode Normalization
-        [12] have Unicode::Normalize but not integrated to regexes
-        [13] have \X but at this level . should equal that
-        [14] need three classes, not just \w and \W
-        [15] see UTR#21 Case Mappings
+        3.1     Surrogates                          - MISSING   [11]
+        3.2     Canonical Equivalents               - MISSING   [12][13]
+        3.3     Locale-Independent Graphemes        - MISSING   [14]
+        3.4     Locale-Independent Words            - MISSING   [15]
+        3.5     Locale-Independent Loose Matches    - MISSING   [16]
+
+        [11] Surrogates are solely a UTF-16 concept and Perl's internal
+             representation is UTF-8.  The Encode module does UTF-16, though.
+        [12] see UTR#15 Unicode Normalization
+        [13] have Unicode::Normalize but not integrated to regexes
+        [14] have \X but at this level . should equal that
+        [15] need three classes, not just \w and \W
+        [16] see UTR#21 Case Mappings
 
 =item *
 
diff --git a/pod/perluniintro.pod b/pod/perluniintro.pod
index 870926e..223dbae 100644
--- a/pod/perluniintro.pod
+++ b/pod/perluniintro.pod
@@ -862,7 +862,7 @@ If you have the GNU recode installed, you can also use the
 Perl front-end C for character conversions.
 
 The following are fast conversions from ISO 8859-1 (Latin-1) bytes
-to UTF-8 bytes, the code works even with older Perl 5 versions.
+to UTF-8 bytes and back, the code works even with older Perl 5 versions.
 
     # ISO 8859-1 to UTF-8
     s/([\x80-\xFF])/chr(0xC0|ord($1)>>6).chr(0x80|ord($1)&0x3F)/eg;
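
Note on the last hunk: the ISO 8859-1 to UTF-8 substitution quoted there can be exercised on its own. What follows is a minimal sketch and not part of the commit. The helper name latin1_to_utf8_bytes and the sample string are hypothetical, chosen only for illustration, and the sketch assumes the input is a plain byte string with no characters above 0xFF.

```perl
use strict;
use warnings;

# Hypothetical wrapper (not in the patch) around the substitution
# shown in the perluniintro hunk.
sub latin1_to_utf8_bytes {
    my ($bytes) = @_;
    # Every byte in 0x80..0xFF becomes a two-byte UTF-8 sequence:
    #   lead byte  = 0xC0 | (top two bits of the input byte)
    #   trail byte = 0x80 | (low six bits of the input byte)
    # In Perl, >> and & bind tighter than |, so no extra parentheses
    # are needed inside the chr() calls.
    $bytes =~ s/([\x80-\xFF])/chr(0xC0|ord($1)>>6).chr(0x80|ord($1)&0x3F)/eg;
    return $bytes;
}

# "cafe" with e-acute (0xE9) as ISO 8859-1 bytes; sample input is an assumption.
my $latin1 = "caf\xE9";
my $utf8   = latin1_to_utf8_bytes($latin1);

# Show the resulting bytes in hex.
print join(' ', map { sprintf('%02X', $_) } unpack('C*', $utf8)), "\n";
# prints: 63 61 66 C3 A9
```

Bytes below 0x80 are left alone because ASCII is encoded identically in ISO 8859-1 and UTF-8, which is why the character class starts at \x80.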