From: Jarkko Hietaniemi
Date: Sun, 21 Apr 2002 21:24:07 +0000 (+0000)
Subject: One more way to do character class subtraction.
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=237bad5b7686b69e853033a4269a1410b74d1ed4;p=p5sagit%2Fp5-mst-13.2.git

One more way to do character class subtraction.

p4raw-id: //depot/perl@16052
---

diff --git a/pod/perlunicode.pod b/pod/perlunicode.pod
index f635013..033c9ac 100644
--- a/pod/perlunicode.pod
+++ b/pod/perlunicode.pod
@@ -615,7 +615,7 @@ And finally, C reverses by character rather than by byte.
 
 =back
 
-=head2 Defining your own character properties
+=head2 User-defined Character Properties
 
 You can define your own character properties by defining subroutines
 that have names beginning with "In" or "Is".  The subroutines must be
@@ -724,7 +724,8 @@ Level 1 - Basic Unicode Support
         [ 3] . \p{...} \P{...}
         [ 4] now scripts (see UTR#24 Script Names) in addition to blocks
         [ 5] have negation
-        [ 6] can use look-ahead to emulate subtraction (*)
+        [ 6] can use regular expression look-ahead [a]
+             or user-defined character properties [b] to emulate subtraction
         [ 7] include Letters in word characters
         [ 8] note that perl does Full casefolding in matching, not Simple:
              for example U+1F88 is equivalent with U+1F000 U+03B9,
@@ -737,7 +738,7 @@ Level 1 - Basic Unicode Support
              (should also affect <>, $., and script line numbers)
              (the \x{85}, \x{2028} and \x{2029} do match \s)
 
-(*) You can mimic class subtraction using lookahead.
+[a] You can mimic class subtraction using lookahead.
 For example, what TR18 might write as
 
     [{Greek}-[{UNASSIGNED}]]
@@ -753,6 +754,8 @@ But in this particular example, you probably really want
 
 which will match assigned characters known to be part of the Greek script.
 
+[b] See L.
+
 =item *
 
 Level 2 - Extended Unicode Support
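
Illustration (not part of the commit): the two subtraction techniques that notes [a] and [b] refer to can be compared side by side in a short Perl sketch. It assumes a Perl that accepts the block properties \p{InHiragana} and \p{InKatakana}. It uses the kana blocks rather than the Greek example from the patch, and the property name InKanaAssigned is hypothetical, chosen only for this example.

```perl
use strict;
use warnings;

# [b] User-defined property (hypothetical name): the Hiragana and
# Katakana blocks, minus unassigned code points (general category Cn).
sub InKanaAssigned {
    return <<'END';
+utf8::InHiragana
+utf8::InKatakana
-utf8::IsCn
END
}

# [a] The same subtraction written as a negative look-ahead:
# refuse any unassigned code point, then match the kana blocks.
my $lookahead = qr/(?!\p{Cn})[\p{InHiragana}\p{InKatakana}]/;
my $userprop  = qr/\p{InKanaAssigned}/;

# Sample code points: two assigned kana letters, the reserved
# code point U+3040 inside the Hiragana block, and a Latin letter.
for my $cp (0x3042, 0x30A2, 0x3040, 0x0041) {
    my $ch = chr $cp;
    printf "U+%04X look-ahead=%d user-defined=%d\n",
        $cp,
        ($ch =~ $lookahead ? 1 : 0),
        ($ch =~ $userprop  ? 1 : 0);
}
```

The two patterns are intended to agree on every input. The look-ahead form keeps the subtraction visible at the point of use, while the user-defined property gives it a reusable name.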