From: Jarkko Hietaniemi
Date: Fri, 11 Jan 2002 02:11:21 +0000 (+0000)
Subject: Small doc tweaks.
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=3ff56b75692f824d27e6db94beeb20f442f3a9b1;p=p5sagit%2Fp5-mst-13.2.git

Small doc tweaks.

p4raw-id: //depot/perl@14176
---

diff --git a/pod/perluniintro.pod b/pod/perluniintro.pod
index 68f8a01..14b0cd3 100644
--- a/pod/perluniintro.pod
+++ b/pod/perluniintro.pod
@@ -518,19 +518,20 @@ http://www.unicode.org/unicode/reports/tr10/
 
 =item *
 
-Character Ranges
+Character Ranges and Classes
 
 Character ranges in regular expression character classes (C) and
 in the C (also known as C) operator are not magically
 Unicode-aware.  What this means that C<[A-Za-z]> will not magically start
 to mean "all alphabetic letters" (not that it does mean that even for
-8-bit characters, you should be using C for that).
+8-bit characters, you should be using C for that).
 
 For specifying things like that in regular expressions, you can use the
 various Unicode properties, C<\pL> or perhaps C<\p{Alphabetic}>, in this
 particular case.  You can use Unicode code points as the end points of
 character ranges, but that means that particular code point
-range, nothing more.  For further information, see L.
+range, nothing more.  For further information (there are dozens
+of Unicode character classes), see L.
 
 =item *
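
The distinction the patched paragraph draws can be checked with a short script. The following is only a minimal sketch, not part of the commit: the helper name classify and the sample characters are hypothetical, chosen for illustration. It compares the ASCII range C<[A-Za-z]> with the Unicode properties C<\pL> and C<\p{Alphabetic}>, and shows that a range built from code points covers exactly those code points and nothing more.

```perl
use strict;
use warnings;

# classify() is a hypothetical helper for this illustration only.
# It reports how a single character fares against three patterns.
sub classify {
    my ($ch) = @_;
    return {
        range      => ($ch =~ /\A[A-Za-z]\z/       ? 1 : 0),  # plain ASCII range
        letter     => ($ch =~ /\A\pL\z/            ? 1 : 0),  # Unicode "Letter"
        alphabetic => ($ch =~ /\A\p{Alphabetic}\z/ ? 1 : 0),  # Unicode "Alphabetic"
    };
}

# Sample characters: ASCII "a", U+00E9 (e with acute),
# U+03B1 (Greek small alpha), and the digit "7".
for my $ch ("a", "\x{E9}", "\x{3B1}", "7") {
    my $r = classify($ch);
    printf "U+%04X range=%d pL=%d Alphabetic=%d\n",
        ord($ch), $r->{range}, $r->{letter}, $r->{alphabetic};
}

# A range with code point end points means just that span of code points:
# U+03B1..U+03C9 contains Greek small beta but not Greek capital alpha.
my $greek_small = qr/\A[\x{3B1}-\x{3C9}]\z/;
printf "U+03B2 in range: %d\n", ("\x{3B2}" =~ $greek_small ? 1 : 0);
printf "U+0391 in range: %d\n", ("\x{391}" =~ $greek_small ? 1 : 0);
```

Under these assumptions, only "a" should match C<[A-Za-z]>, while all three letters should match C<\pL> and C<\p{Alphabetic}>, and the digit should match none of them.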