use UnicodeCD 'charscript';
my $charscript = charscript($codepoint);
+ use UnicodeCD 'charblocks';
+ my $charblocks = charblocks();
+
+ use UnicodeCD 'charscripts';
+ my %charscripts = charscripts();
+
+ use UnicodeCD qw(charscript charinrange);
+ my $range = charscript($script);
+ print "looks like $script\n" if charinrange($range, $codepoint);
+
+ use UnicodeCD 'compexcl';
+ my $compexcl = compexcl($codepoint);
+
+ my $unicode_version = UnicodeCD::UnicodeVersion();
+
=head1 DESCRIPTION
-The Unicode module offers a simple interface to the Unicode Character
+The UnicodeCD module offers a simple interface to the Unicode Character
Database.
=cut
In addition to using the C<\p{In...}> and C<\P{In...}> constructs, you
can also test whether a code point is in the I<range> as returned by
L</charblock> and L</charscript> or as the values of the hash returned
-by L</charblocks> and </charscripts> by using charinrange():
+by L</charblocks> and L</charscripts> by using charinrange():
use UnicodeCD qw(charscript charinrange);
$range = charscript('Hiragana');
- print "looks like hiragana\n" if charinrange($range, $code);
+ print "looks like hiragana\n" if charinrange($range, $codepoint);
=cut
my $compexcl = compexcl("09dc");
The compexcl() returns the composition exclusion (that is, if the
-character cannot be decomposed) of the character specified by a B<code
-point argument>.
+character should not be produced during a precomposition) of the
+character specified by a B<code point argument>.
If there is a composition exclusion for the character, true is
returned. Otherwise, false is returned.
Conditions preceded by "NON_" represent the negation of the condition
A I<locale> is defined as a 2-letter ISO 3166 country code, possibly
-followed by a "_" and a 2-letter ISO language code (, possibly followed
-by a "_" and a variant code). You can find the list of those codes
-in L<Locale::Country> and L<Locale::Language>.
+followed by a "_" and a 2-letter ISO language code (possibly followed
+by a "_" and a variant code). For lists of those codes,
+see L<Locale::Country> and L<Locale::Language>.
A I<context> is one of the following choices:
FINAL The letter is not followed by a letter of
general category L (e.g. Ll, Lt, Lu, Lm, or Lo)
MODERN The mapping is only used for modern text
- AFTER_i The last base character was "i" 0069
+ AFTER_i The last base character was "i" (U+0069)
For more information about case mappings see
http://www.unicode.org/unicode/reports/tr21/