From: Jarkko Hietaniemi
Date: Fri, 6 Jul 2001 00:14:57 +0000 (+0000)
Subject: Unterminated C< (noticed by Richard Hatch), and few other
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=5e42d7b483c9b59a3bad60858cd4bfe9fb4caf8e;p=p5sagit%2Fp5-mst-13.2.git

Unterminated C< (noticed by Richard Hatch), and few other
small Unicode doc tweaks.

p4raw-id: //depot/perl@11164
---

diff --git a/pod/perlretut.pod b/pod/perlretut.pod
index 7f8e8f5..869a422 100644
--- a/pod/perlretut.pod
+++ b/pod/perlretut.pod
@@ -1751,12 +1751,10 @@ letter, the braces can be dropped. For instance, C<\pM> is the
 character class of Unicode 'marks', for example accent marks. For
 the full list see L.
 
-The Unicode has also been separated into blocks of charaters which you
-can test with C<\p{In...}> (in) and C<\P{In...}> (not in), for example
-C<\p{InLatin}, C<\p{InGreek}>, or C<\P{InKatakana}>. For the full list see
-L.
-
-For the the full and latest information see the latest Unicode standard.
+The Unicode has also been separated into various sets of charaters
+which you can test with C<\p{In...}> (in) and C<\P{In...}> (not in),
+for example C<\p{InLatin}>, C<\p{InGreek}>, or C<\P{InKatakana}>.
+For the full list see L.
 
 C<\X> is an abbreviation for a character class sequence that includes
 the Unicode 'combining character sequences'. A 'combining character
@@ -1768,6 +1766,9 @@ S >, which translates in Danish to
 A with the circle atop it, as in the word Angstrom. C<\X> is
 equivalent to C<\PM\pM*}>, i.e., a non-mark followed by one or more marks.
 
+For the the full and latest information about Unicode see the latest
+Unicode standard, or the Unicode Consortium's website http://www.unicode.org/
+
 As if all those classes weren't enough, Perl also defines POSIX style
 character classes. These have the form C<[:name:]>, with C the
 name of the POSIX class. The POSIX classes are C, C,