From: Jarkko Hietaniemi
Date: Mon, 26 Nov 2007 04:55:03 +0000 (+0200)
Subject: pod/perlrebackslash.pod: small Unicode additions
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=10fdd3268bfedb0d10912f2f0ba6be13995de3fe;p=p5sagit%2Fp5-mst-13.2.git

pod/perlrebackslash.pod: small Unicode additions

Message-Id: <200711260255.lAQ2t37n188664@kosh.hut.fi>
p4raw-id: //depot/perl@32493
---

diff --git a/pod/perlrebackslash.pod b/pod/perlrebackslash.pod
index d8cfb6a..ac95ace 100644
--- a/pod/perlrebackslash.pod
+++ b/pod/perlrebackslash.pod
@@ -500,7 +500,9 @@ C<< (?>\x0D\x0A)|\v) >>. Since C<\R> can match a more than one character,
 it cannot be put inside a bracketed character class; C is an error.
 C<\R> is introduced in perl 5.10.
 
-Mnemonic: none really. C<\R> was picked because PCRE already uses C<\R>.
+Mnemonic: none really. C<\R> was picked because PCRE already uses C<\R>,
+and more importantly because Unicode recommends such a regular expression
+metacharacter, and suggests C<\R> as the notation.
 
 =item \X
 
@@ -512,6 +514,11 @@
 mark character followed by zero or more mark characters. Mark characters
 include (but are not restricted to) I and I.
 
+C<\X> matches quite well what normal (non-Unicode-programmer) usage
+would consider a single character: for example a base character
+(the C<\PM> above), for example a letter, followed by zero or more
+diacritics, which are I (the C<\pM*> above).
+
 Mnemonic: eI<X>tended Unicode character.
 
 =back