From: Karl Williamson
Date: Wed, 24 Feb 2010 00:31:48 +0000 (-0700)
Subject: Update charnames documentations for \N changes, bugs
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=55bc7d3ca65c8a79bfdaa4be97e25fdf2395a858;p=p5sagit%2Fp5-mst-13.2.git

Update charnames documentations for \N changes, bugs

\N has a possible new meaning, and mention bug reports filed against
charnames
---

diff --git a/lib/charnames.pm b/lib/charnames.pm
index 0acae61..d1fd7c8 100644
--- a/lib/charnames.pm
+++ b/lib/charnames.pm
@@ -437,7 +437,14 @@ will also give a warning about being deprecated.
 =head1 CUSTOM ALIASES
 
 This version of charnames supports three mechanisms of adding local
-or customized aliases to standard Unicode naming conventions (:full)
+or customized aliases to standard Unicode naming conventions (:full).
+
+Note that an alias should not be something that is a legal curly
+brace-enclosed quantifier (see L). For example
+C<\N{123}> means to match 123 non-newline characters, and is not treated as an
+alias. Aliases are discouraged from beginning with anything other than an
+alphabetic character and from containing anything other than alphanumerics,
+spaces, dashes, colons, parentheses, and underscores.
 
 =head2 Anonymous hashes
 
@@ -530,23 +537,36 @@ state of C-flag as in:
     }
   }
 
+See L above for restrictions on C.
+
 =head1 ILLEGAL CHARACTERS
 
-If you ask by name for a character that does not exist, a warning is
-given and the Unicode I "\x{FFFD}" is returned.
+If you ask by name for a character that does not exist, a warning is given and
+the Unicode I "\x{FFFD}" is returned.
 
-If you ask by code for a character that does not exist, no warning is
+If you ask by code for a character that is unassigned, no warning is
 given and C is returned. (Though if you ask for a code point
-past U+10FFFF you do get a warning.)
+past U+10FFFF you do get a warning.) See L below.
 
 =head1 BUGS
 
+viacode should return an empty string for unassigned in-range Unicode code
+points, as that is their correct current name.
+
+viacode(0) doesn't return C, but C
+
+vianame returns a chr if the input name is of the form C, and an ord
+otherwise. It is planned to change this to always return an ord.
+
+None of the functions work on almost all the Hangul syllable and CJK Unicode
+characters that have their code points as part of their names.
+
 Unicode standard named sequences are not recognized, such as
 C (which should mean C with an additional
 C).
 
-Since evaluation of the translation function happens in a middle of
+Since evaluation of the translation function happens in the middle of
 compilation (of a string literal), the translation function should not
 do any Cs or Cs. This restriction should be lifted in
 a future version of Perl.