From: Jarkko Hietaniemi
Date: Fri, 16 Nov 2001 14:14:38 +0000 (+0000)
Subject: Document the negated lookahead trick to emulate
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=29bdacb8f1686adf0d6e73a4e2fd7fb9becf6eab;p=p5sagit%2Fp5-mst-13.2.git

Document the negated lookahead trick to emulate
character class subtraction.

p4raw-id: //depot/perl@13046
---

diff --git a/pod/perlunicode.pod b/pod/perlunicode.pod
index e1dcf4b..2fca714 100644
--- a/pod/perlunicode.pod
+++ b/pod/perlunicode.pod
@@ -632,14 +632,23 @@ Level 1 - Basic Unicode Support
         [ 1] \x{...}
         [ 2] \N{...}
         [ 3] . \p{Is...} \P{Is...}
-        [ 4] now scripts (see UTR#24 Script Names) in addition to blocks
+        [ 4] now scripts (see UTR#24 Script Names) in addition to blocks
         [ 5] have negation
-        [ 6] can use look-ahead to emulate subtracion
+        [ 6] can use look-ahead to emulate subtraction (*)
         [ 7] include Letters in word characters
         [ 8] see UTR#21 Case Mappings: Perl implements 1:1 mappings
         [ 9] see UTR#13 Unicode Newline Guidelines
         [10] should do ^ and $ also on \x{2028} and \x{2029}
 
+(*) Instead of [\u0370-\u03FF-[{UNASSIGNED}]] as suggested by the TR
+18 you can use negated lookahead: to match currently assigned modern
+Greek characters use for example
+
+    /(?!\p{Cn})[\x{0370}-\x{03ff}]/
+
+In other words: the matched character must not be a non-assigned
+character, but it must be in the block of modern Greek characters.
+
 =item *
 
 Level 2 - Extended Unicode Support
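
For illustration only (not part of the commit): a minimal Perl sketch of how the documented pattern behaves. The helper name is_assigned_greek and the three sample code points are assumptions chosen for this example. U+03A2 is taken to be an unassigned position inside the Greek block, which holds for the Unicode versions I am aware of, though the result depends on the Unicode tables shipped with the perl in use.

```perl
use strict;
use warnings;

# Emulate "Greek block minus unassigned code points" with a negated
# lookahead, using the pattern documented in the patch.
my $assigned_greek = qr/(?!\p{Cn})[\x{0370}-\x{03ff}]/;

# Hypothetical helper (not from the patch): returns 1 if the single
# character with code point $cp matches the pattern, 0 otherwise.
sub is_assigned_greek {
    my ($cp) = @_;
    return chr($cp) =~ /\A$assigned_greek\z/ ? 1 : 0;
}

# U+03B1 GREEK SMALL LETTER ALPHA: assigned and inside the block.
# U+03A2: inside the block, assumed unassigned (general category Cn).
# U+0041 LATIN CAPITAL LETTER A: assigned but outside the block.
printf "U+%04X: %s\n", $_, is_assigned_greek($_) ? "match" : "no match"
    for 0x03B1, 0x03A2, 0x0041;
```

The lookahead (?!\p{Cn}) consumes no input. It only vetoes the position if the next character is unassigned, after which the bracketed class consumes that same character. Putting the lookahead first lets the veto run before the range test, but the order does not affect which characters match.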