=head1 NAME
-charnames - define character names for C<\C{named}> string literal escape.
+charnames - define character names for C<\N{named}> string literal escape.
=head1 SYNOPSIS
use charnames ':full';
- print "\C{GREEK SMALL LETTER SIGMA} is called sigma.\n";
+ print "\N{GREEK SMALL LETTER SIGMA} is called sigma.\n";
use charnames ':short';
- print "\C{greek:Sigma} is an upper-case sigma.\n";
+ print "\N{greek:Sigma} is an upper-case sigma.\n";
use charnames qw(cyrillic greek);
- print "\C{sigma} is Greek sigma, and \C{be} is Cyrillic b.\n";
+ print "\N{sigma} is Greek sigma, and \N{be} is Cyrillic b.\n";
=head1 DESCRIPTION
Pragma C<use charnames> supports arguments C<:full>, C<:short> and
script names. If C<:full> is present, for expansion of
-C<\C{CHARNAME}}> string C<CHARNAME> is first looked in the list of
+C<\N{CHARNAME}> string C<CHARNAME> is first looked up in the list of
standard Unicode names of chars. If C<:short> is present, and
C<CHARNAME> has the form C<SCRIPT:CNAME>, then C<CNAME> is looked up
as a letter in script C<SCRIPT>. If pragma C<use charnames> is used
-with script name arguments, then for C<\C{CHARNAME}}> the name
+with script name arguments, then for C<\N{CHARNAME}> the name
C<CHARNAME> is looked up as a letter in the given scripts (in the
specified order).
=head1 CUSTOM TRANSLATORS
-The mechanism of translation is C<\C{...}> escapes is general and not
+The mechanism of translation of C<\N{...}> escapes is general and not
hardwired into F<charnames.pm>. A module can install custom
translations (inside the scope which C<use>s the module) by the
following magic incantation:
Here translator() is a subroutine which takes C<CHARNAME> as an
argument, and returns text to insert into the string instead of the
-C<\C{CHARNAME}> escape. Since the text to insert should be different
+C<\N{CHARNAME}> escape. Since the text to insert should be different
in C<utf8> mode and out of it, the function should check the current
state of C<utf8>-flag as in
=item *
Regular expressions match characters instead of bytes. For instance,
-"." matches a character instead of a byte. (However, the C<\O> pattern
-is provided to force a match a single byte ("octet", hence C<\O>).)
+"." matches a character instead of a byte. (However, the C<\C> pattern
+is provided to force a match of a single byte ("C<char>" in C, hence
+C<\C>).)
=item *
mentioned with the $ in Perl, unlike in the shells, where it can vary from
one line to the next.
-=item Missing %sbrace%s on \C{}
+=item Missing %sbrace%s on \N{}
-(F) Wrong syntax of character name literal C<\C{charname}> within
+(F) Wrong syntax of character name literal C<\N{charname}> within
double-quotish context.
=item Missing comma after first argument to %s function
\x1b hex char (ESC)
\x{263a} wide hex char (SMILEY)
\c[ control char (ESC)
- \C{name} named char
+ \N{name} named char
\l lowercase next char
\u uppercase next char
If C<use locale> is in effect, the case map used by C<\l>, C<\L>, C<\u>
and C<\U> is taken from the current locale. See L<perllocale>. For
-documentation of C<\C{name}>, see L<charnames>.
+documentation of C<\N{name}>, see L<charnames>.
All systems use the virtual C<"\n"> to represent a line terminator,
called a "newline". There is no such thing as an unvarying, physical
\x1B hex char
\x{263a} wide hex char (Unicode SMILEY)
\c[ control char
- \C{name} named char
+ \N{name} named char
\l lowercase next char (think vi)
\u uppercase next char (think vi)
\L lowercase till \E (think vi)
If C<use locale> is in effect, the case map used by C<\l>, C<\L>, C<\u>
and C<\U> is taken from the current locale. See L<perllocale>. For
-documentation of C<\C{name}>, see L<charnames>.
+documentation of C<\N{name}>, see L<charnames>.
You cannot include a literal C<$> or C<@> within a C<\Q> sequence.
An unescaped C<$> or C<@> interpolates the corresponding variable,
\PP Match non-P
\X Match eXtended Unicode "combining character sequence",
equivalent to C<(?:\PM\pM*)>
- \O Match a single C char (octet) even under utf8.
+ \C Match a single C char (octet) even under utf8.
A C<\w> matches a single alphanumeric character, not a whole word.
Use C<\w+> to match a string of Perl-identifier characters (which isn't
PL_seen_zerolen++; /* Do not optimize RE away */
nextchar();
break;
- case 'O':
+ case 'C':
ret = reg_node(SANY);
*flagp |= HASWIDTH|SIMPLE;
nextchar();
use charnames ':full';
-print "not " unless "Here\C{EXCLAMATION MARK}?" eq 'Here!?';
+print "not " unless "Here\N{EXCLAMATION MARK}?" eq 'Here!?';
print "ok 1\n";
print "# \$res=$res \$\@='$@'\nnot "
if $res = eval <<'EOE'
use charnames ":full";
-"Here: \C{CYRILLIC SMALL LETTER BE}!";
+"Here: \N{CYRILLIC SMALL LETTER BE}!";
1
EOE
or $@ !~ /above 0xFF/;
print "# \$res=$res \$\@='$@'\nnot "
if $res = eval <<'EOE'
use charnames 'cyrillic';
-"Here: \C{Be}!";
+"Here: \N{Be}!";
1
EOE
or $@ !~ /CYRILLIC CAPITAL LETTER BE.*above 0xFF/;
use charnames ':full';
use utf8;
- print "not " unless "\C{CYRILLIC SMALL LETTER BE}" eq $encoded_be;
+ print "not " unless "\N{CYRILLIC SMALL LETTER BE}" eq $encoded_be;
print "ok 4\n";
use charnames qw(cyrillic greek :short);
- print "not " unless "\C{be},\C{alpha},\C{hebrew:bet}"
+ print "not " unless "\N{be},\N{alpha},\N{hebrew:bet}"
eq "$encoded_be,$encoded_alpha,$encoded_bet";
print "ok 5\n";
}
: UTF;
char *leaveit = /* set of acceptably-backslashed characters */
PL_lex_inpat
- ? "\\.^$@AGZdDwWsSbBpPXO+*?|()-nrtfeaxcz0123456789[{]} \t\n\r\f\v#"
+ ? "\\.^$@AGZdDwWsSbBpPXC+*?|()-nrtfeaxcz0123456789[{]} \t\n\r\f\v#"
: "";
while (s < send || dorange) {
}
continue;
- /* \C{latin small letter a} is a named character */
- case 'C':
+ /* \N{latin small letter a} is a named character */
+ case 'N':
++s;
if (*s == '{') {
char* e = strchr(s, '}');