#!./perl

BEGIN {
    chdir 't' if -d 't';
    @INC = '../lib';
}

print "1..78\n";

my $test = 1;

# This table is based on Markus Kuhn's UTF-8 Decode Stress Tester,
# http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt,
# version dated 2000-09-02.

# Note the \0 instead of a raw zero byte in 2.1.1: for example
# GNU patch v2.1 has "issues" with raw zero bytes.

my @MK = split(/\n/, <<__EOMK__);
1 Correct UTF-8
1.1.1 y "κόσμε" - 11 ce:ba:e1:bd:b9:cf:83:ce:bc:ce:b5 5
2 Boundary conditions
2.1 First possible sequence of certain length
2.1.1 y "\0" 0 1 00 1
2.1.2 y "\80" 80 2 c2:80 1
2.1.3 y "ࠀ" 800 3 e0:a0:80 1
2.1.4 y "𐀀" 10000 4 f0:90:80:80 1
2.1.5 y "" 200000 5 f8:88:80:80:80 1
2.1.6 y "" 4000000 6 fc:84:80:80:80:80 1
2.2 Last possible sequence of certain length
2.2.1 y "\7f" 7f 1 7f 1
2.2.2 y "߿" 7ff 2 df:bf 1
# The ffff is illegal unless UTF8_ALLOW_FFFF