[p5sagit/p5-mst-13.2.git] / pod / perlrebackslash.pod
diff --git a/pod/perlrebackslash.pod b/pod/perlrebackslash.pod

index ddd7abe..b7a6bdc 100644
--- a/pod/perlrebackslash.pod
+++ b/pod/perlrebackslash.pod
@@ -83,9 +83,10 @@ quoted constructs>.
  \l                Lowercase next character.
  \L                Lowercase till \E.
  \n                (Logical) newline character.
+ \N                Any character but newline.
  \N{}              Named (Unicode) character.
- \p{}, \pP         Character with a Unicode property.
- \P{}, \PP         Character without a Unicode property.
+ \p{}, \pP         Character with the given Unicode property.
+ \P{}, \PP         Character without the given Unicode property.
  \Q                Quotemeta till \E.
  \r                Return character.
  \R                Generic new line.
@@ -99,7 +100,7 @@ quoted constructs>.
  \w                Character class for word characters.
  \W                Character class for non-word characters.
  \x{}, \x00        Hexadecimal escape sequence.
- \X                Extended Unicode "combining character sequence".
+ \X                Unicode "extended grapheme cluster".
  \z                End of string.
  \Z                End of string.
 
@@ -506,18 +507,14 @@ metacharacter, and suggests C<\R> as the notation.
 
 =item \X
 
-This matches an extended Unicode I<combining character sequence>, and
-is equivalent to C<< (?>\PM\pM*) >>. C<\PM> matches any character that is
-not considered a Unicode mark character, while C<\pM> matches any character
-that is considered a Unicode mark character; so C<\X> matches any non
-mark character followed by zero or more mark characters. Mark characters
-include (but are not restricted to) I<combining characters> and
-I<vowel signs>.
+This matches a Unicode I<extended grapheme cluster>.
 
 C<\X> matches quite well what normal (non-Unicode-programmer) usage
-would consider a single character: for example a base character
-(the C<\PM> above), for example a letter, followed by zero or more
-diacritics, which are I<combining characters> (the C<\pM*> above).
+would consider a single character.  As an example, consider a G with some sort
+of diacritic mark, such as an arrow.  There is no such single character in
+Unicode, but one can be composed using a G followed by a Unicode "COMBINING
+UPWARDS ARROW BELOW", and would be displayed by Unicode-aware software as if it
+were a single character.
 
 Mnemonic: eI<X>tended Unicode character.