From: Danny Rathjens
Date: Thu, 21 Jun 2007 17:35:26 +0000 (-0700)
Subject: Apply doc suggestion from:
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=a365f2ce4defc0d7fecd4e9484f8f958454c9192;p=p5sagit%2Fp5-mst-13.2.git

Apply doc suggestion from:
Subject: [perl #43287] perluniintro inaccurate answer to testing encoding validity
From: Danny Rathjens (via RT)
Message-ID:

p4raw-id: //depot/perl@31462
---

diff --git a/pod/perluniintro.pod b/pod/perluniintro.pod
index 9337e5f..dcfb11b 100644
--- a/pod/perluniintro.pod
+++ b/pod/perluniintro.pod
@@ -656,10 +656,11 @@ Use the C<Encode> package to try converting it.
 For example,
 
     use Encode 'decode_utf8';
-    if (decode_utf8($string_of_bytes_that_I_think_is_utf8)) {
-        # valid
+    eval { decode_utf8($string, Encode::FB_CROAK) };
+    if ($@) {
+        # $string is valid utf8
     } else {
-        # invalid
+        # $string is not valid utf8
     }
 
 Or use C<unpack> to try decoding it:
@@ -667,9 +668,8 @@ Or use C<unpack> to try decoding it:
     use warnings;
     @chars = unpack("C0U*", $string_of_bytes_that_I_think_is_utf8);
 
-If invalid, a C<Malformed UTF-8 character (byte 0x##) in unpack>
-warning is produced. The "C0" means
-"process the string character per character". Without that the
+If invalid, a C<Malformed UTF-8 character> warning is produced. The "C0" means
+"process the string character per character". Without that, the
 C<unpack("U*", ...)> would work in C<U0> mode (the default if the format
 string starts with C<U>) and it would return the bytes making up the UTF-8
 encoding of the target string, something that will always work.
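
Review note on the added lines: C<$@> is set when decode_utf8 croaks, that is, when decoding fails, so the two branch comments in the new example appear to be the wrong way round. Below is a minimal sketch of the same check with the branches matched to the outcome. The helper name is_valid_utf8 is hypothetical (it is neither in the patch nor in Encode), and testing the return value of eval instead of C<$@> is a stylistic choice for this sketch, not something the patch prescribes.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper, not part of the patch: returns 1 if $bytes decodes
# cleanly as UTF-8 and 0 otherwise.
sub is_valid_utf8 {
    my ($bytes) = @_;

    # Work on a copy, since Encode may modify its source argument in place
    # when a CHECK value is supplied.
    my $copy = $bytes;

    # FB_CROAK makes decode_utf8 die on malformed input. The trailing 1
    # gives eval a true value on success, so we do not have to inspect $@.
    return eval { Encode::decode_utf8($copy, Encode::FB_CROAK); 1 } ? 1 : 0;
}

# "caf" followed by the two-byte UTF-8 encoding of e-acute
print is_valid_utf8("caf\xC3\xA9") ? "valid\n" : "invalid\n";

# Latin-1 bytes: 0xE9 is a UTF-8 lead byte with no continuation byte after it
print is_valid_utf8("\xE9t\xE9") ? "valid\n" : "invalid\n";
```

With the inputs above, the first call should take the "valid" branch and the second the "invalid" branch. If the C<$@> form from the patch is kept, the equivalent fix is simply to swap the two comments.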