From: Rafael Garcia-Suarez
Date: Tue, 18 Dec 2007 09:51:39 +0000 (+0000)
Subject: Notes on 5.12 Unicode revamping planned.
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=a3d15f9a2bf22c599dfee4c8fb750856644c6d1f;p=p5sagit%2Fp5-mst-13.2.git

Notes on 5.12 Unicode revamping planned.
Complete the "reporting bug" section of perldelta.

p4raw-id: //depot/perl@32636
---

diff --git a/pod/perl5100delta.pod b/pod/perl5100delta.pod
index bd0c1b3..8485c54 100644
--- a/pod/perl5100delta.pod
+++ b/pod/perl5100delta.pod
@@ -1550,15 +1550,33 @@ Stacked filetest operators won't work when the C pragma is in
 effect, because they rely on the stat() buffer C<_> being populated, and
 filetest bypasses stat().
 
+=head2 UTF-8 problems
+
+The handling of Unicode still is unclean in several places, where it's
+dependent on whether a string is internally flagged as UTF-8. This will
+be made more consistent in perl 5.12, but that won't be possible without
+a certain amount of backwards incompatibility.
+
+=head1 Platform Specific Problems
+
 When compiled with g++ and thread support on Linux, it's reported that the
 C<$!> stops working correctly. This is related to the fact that the glibc
 provides two strerror_r(3) implementation, and perl selects the wrong
 one.
 
-=head1 Platform Specific Problems
-
 =head1 Reporting Bugs
 
+If you find what you think is a bug, you might check the articles
+recently posted to the comp.lang.perl.misc newsgroup and the perl
+bug database at http://rt.perl.org/rt3/ . There may also be
+information at http://www.perl.org/ , the Perl Home Page.
+
+If you believe you have an unreported bug, please run the B
+program included with your release. Be sure to trim your bug down
+to a tiny but sufficient test case. Your bug report, along with the
+output of C, will be sent off to perlbug@perl.org to be
+analysed by the Perl porting team.
+
 =head1 SEE ALSO
 
 The F file and the perl590delta to perl595delta man pages for
diff --git a/pod/perltodo.pod b/pod/perltodo.pod
index 0c85ceb..d869b67 100644
--- a/pod/perltodo.pod
+++ b/pod/perltodo.pod
@@ -667,6 +667,22 @@ also the warning messages (see L, C).
 These tasks would need C knowledge, and knowledge of how the interpreter works,
 or a willingness to learn.
 
+=head2 UTF-8 revamp
+
+The handling of Unicode is unclean in many places. For example, the regexp
+engine matches in Unicode semantics whenever the string or the pattern is
+flagged as UTF-8, but that should not be dependent on an internal storage
+detail of the string. Likewise, case folding behaviour is dependent on the
+UTF8 internal flag being on or off.
+
+=head2 Properly Unicode safe tokeniser and pads.
+
+The tokeniser isn't actually very UTF-8 clean. C is a hack -
+variable names are stored in stashes as raw bytes, without the utf-8 flag
+set. The pad API only takes a C pointer, so that's all bytes too. The
+tokeniser ignores the UTF-8-ness of C, or any SVs returned from
+source filters. All this could be fixed.
+
 =head2 state variable initialization in list context
 
 Currently this is illegal:
@@ -776,14 +792,6 @@ reinstated.
 
 The old perltodo notes "Look at the "reification" code in C".
 
-=head2 Properly Unicode safe tokeniser and pads.
-
-The tokeniser isn't actually very UTF-8 clean. C is a hack -
-variable names are stored in stashes as raw bytes, without the utf-8 flag
-set. The pad API only takes a C pointer, so that's all bytes too. The
-tokeniser ignores the UTF-8-ness of C, or any SVs returned from
-source filters. All this could be fixed.
-
 =head2 The yada yada yada operators
 
 Perl 6's Synopsis 3 says: