From: Jarkko Hietaniemi
Date: Thu, 27 Dec 2001 23:56:20 +0000 (+0000)
Subject: Fast Latin1<->UTF-8 conversion for older Perls.
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=aaef10c5550c567d82a2f114831f7a5c9e62a4e7;p=p5sagit%2Fp5-mst-13.2.git

Fast Latin1<->UTF-8 conversion for older Perls.

p4raw-id: //depot/perl@13912
---

diff --git a/pod/perluniintro.pod b/pod/perluniintro.pod
index 9b447ca..68f8a01 100644
--- a/pod/perluniintro.pod
+++ b/pod/perluniintro.pod
@@ -790,6 +790,15 @@
 C, and C, available from CPAN. If you have the GNU recode installed,
 you can also use the Perl frontend C for character conversions.
 
+The following are fast conversions from ISO 8859-1 (Latin-1) bytes
+to UTF-8 bytes, the code works even with older Perl 5 versions.
+
+    # ISO 8859-1 to UTF-8
+    s/([\x80-\xFF])/chr(0xC0|ord($1)>>6).chr(0x80|ord($1)&0x3F)/eg;
+
+    # UTF-8 to ISO 8859-1
+    s/([\xC2\xC3])([\x80-\xBF])/chr(ord($1)<<6&0xC0|ord($2)&0x3F)/eg;
+
 =head1 SEE ALSO
 
 L, L, L, L, L, L,
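
The two substitutions added by this patch can be exercised with a round trip. The following is a minimal sketch, not part of the commit: the helper names `latin1_to_utf8` and `utf8_to_latin1` are hypothetical wrappers introduced here for illustration, and the input is assumed to be a plain byte string (no characters above 0xFF).

```perl
use strict;
use warnings;

# Hypothetical wrapper (not in the patch): ISO 8859-1 bytes to UTF-8 bytes.
sub latin1_to_utf8 {
    my ($bytes) = @_;
    # Each byte 0x80-0xFF becomes two bytes: 110000xx 10xxxxxx.
    $bytes =~ s/([\x80-\xFF])/chr(0xC0|ord($1)>>6).chr(0x80|ord($1)&0x3F)/eg;
    return $bytes;
}

# Hypothetical wrapper (not in the patch): UTF-8 bytes back to ISO 8859-1.
sub utf8_to_latin1 {
    my ($bytes) = @_;
    # Only lead bytes 0xC2 and 0xC3 encode code points U+0080..U+00FF;
    # any other multi-byte sequence is left untouched.
    $bytes =~ s/([\xC2\xC3])([\x80-\xBF])/chr(ord($1)<<6&0xC0|ord($2)&0x3F)/eg;
    return $bytes;
}

my $latin1 = "caf\xE9";                  # "cafe" with e-acute, as Latin-1 bytes
my $utf8   = latin1_to_utf8($latin1);

print unpack("H*", $utf8), "\n";         # prints 636166c3a9
print utf8_to_latin1($utf8) eq $latin1 ? "round trip ok\n" : "mismatch\n";
```

The expressions rely on Perl's operator precedence: the shifts bind tighter than `&`, which binds tighter than `|`, so no extra parentheses are needed. Note that the reverse substitution silently passes through UTF-8 sequences for code points above U+00FF, so it is only a faithful inverse for text that started out as Latin-1.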