is a bit too much: introduce utf8::is_utf8().
p4raw-id: //depot/perl@19777
=head2 Utility functions
-The following functions are defined in the C<utf8::> package by the perl core.
+The following functions are defined in the C<utf8::> package by the
+Perl core. You do not need to say C<use utf8> to use these and in fact
+you should not unless you really want to have UTF-8 source code.
=over 4
should not be used to convert Unicode back to a legacy byte encoding:
use Encode for that.
+=item * $flag = utf8::is_utf8(STRING)
+
+Test whether STRING is marked internally as UTF-8, that is, whether
+its UTF-8 flag is on.
+
=item * $flag = utf8::valid(STRING)
-[INTERNAL] Test whether STRING is in a consistent state. Will return
-true if string is held as bytes, or is well-formed UTF-8 and has the
-UTF-8 flag on. Main reason for this routine is to allow Perl's
-testsuite to check that operations have left strings in a consistent
-state.
+[INTERNAL] Test whether STRING is in a consistent state regarding
+UTF-8. Returns true if the string is well-formed UTF-8 and has the
+UTF-8 flag on B<or> if the string is held as bytes (both of these
+states are 'consistent'). The main reason for this routine is to allow
+Perl's test suite to check that operations have left strings in a
+consistent state. You most probably want to use utf8::is_utf8() instead.
=back
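The difference between the two documented functions can be illustrated with a small script. This is only a sketch, not part of the patch: the variable names and sample strings are made up for illustration, and it assumes a perl new enough to provide utf8::is_utf8(). Note that it does not say C<use utf8>, in line with the documentation above.

```perl
use strict;
use warnings;

# Hypothetical sample data; none of these names come from the patch.
my $bytes = "caf\xE9";               # all code points below 256, held as bytes
my $wide  = "caf\x{E9}\x{263A}";     # has a code point above 255, so held as UTF-8

# Upgrading changes the internal representation only, not the characters.
my $upgraded = $bytes;
utf8::upgrade($upgraded);

# Normalise the results to 0/1 so they print predictably.
my $bytes_flag    = utf8::is_utf8($bytes)    ? 1 : 0;
my $wide_flag     = utf8::is_utf8($wide)     ? 1 : 0;
my $upgraded_flag = utf8::is_utf8($upgraded) ? 1 : 0;
my $same_chars    = ($upgraded eq $bytes)    ? 1 : 0;

print "bytes:    is_utf8=$bytes_flag\n";     # 0
print "wide:     is_utf8=$wide_flag\n";      # 1
print "upgraded: is_utf8=$upgraded_flag\n";  # 1
print "upgraded eq bytes: $same_chars\n";    # 1

# utf8::valid() only reports internal consistency, so it is true for
# the byte string as well as for the flagged strings.
my $all_valid = (utf8::valid($bytes) && utf8::valid($wide)
                 && utf8::valid($upgraded)) ? 1 : 0;
print "all valid: $all_valid\n";             # 1
```

The upgraded string compares equal to the byte string even though its flag is on, which shows that the flag describes the internal storage and does not imply that the string contains any character above 0xFF.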
That shows the UTF8 flag in FLAGS and both the UTF-8 bytes
and Unicode characters in C<PV>. See also later in this document
-the discussion about the C<is_utf8> function of the C<Encode> module.
+the discussion about the C<utf8::is_utf8()> function.
=back
Okay, if you insist:
- use Encode 'is_utf8';
- print is_utf8($string) ? 1 : 0, "\n";
+ print utf8::is_utf8($string) ? 1 : 0, "\n";
But note that this doesn't mean that any of the characters in the
string are necessarily UTF-8 encoded, or that any of the characters have
XS(XS_version_vcmp);
XS(XS_version_boolean);
XS(XS_version_noop);
+XS(XS_utf8_is_utf8);
XS(XS_utf8_valid);
XS(XS_utf8_encode);
XS(XS_utf8_decode);
newXS("version::(nomethod", XS_version_noop, file);
newXS("version::noop", XS_version_noop, file);
}
+ newXS("utf8::is_utf8", XS_utf8_is_utf8, file);
newXS("utf8::valid", XS_utf8_valid, file);
newXS("utf8::encode", XS_utf8_encode, file);
newXS("utf8::decode", XS_utf8_decode, file);
XSRETURN_EMPTY;
}
+XS(XS_utf8_is_utf8)
+{
+ dXSARGS;
+ if (items != 1)
+ Perl_croak(aTHX_ "Usage: utf8::is_utf8(sv)");
+ {
+ SV * sv = ST(0);
+ {
+ STRLEN len;
+ if (SvUTF8(sv))
+ XSRETURN_YES;
+ else
+ XSRETURN_NO;
+ }
+ }
+ XSRETURN_EMPTY;
+}
+
XS(XS_utf8_valid)
{
dXSARGS;