=head1 NAME

perldelta - what is new for perl v5.9.2

=head1 DESCRIPTION

This document describes differences between the 5.9.1 and the 5.9.2
development releases. See L<perl590delta> and L<perl591delta> for the
differences between 5.8.0 and 5.9.1.

=head1 Incompatible Changes

=head2 Packing and UTF-8 strings

The semantics of pack() and unpack() regarding UTF-8-encoded data have been
changed. Processing is now by default character per character instead of
byte per byte on the underlying encoding. Notably, code that used things
like C<pack("a*", $string)> to see through the encoding of a string will now
simply get back the original $string. Packed strings can also get upgraded
during processing when you store upgraded characters. You can get the old
behaviour by using C<use bytes>.

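The change can be illustrated with a short sketch. It assumes a perl with
the new semantics (5.9.2 or later); the variable names are made up for the
example, and the C<use bytes> block only shows how the old byte-wise view
is expected to be recovered.

```perl
use strict;
use warnings;

# U+263A (WHITE SMILING FACE): one character, three bytes in UTF-8.
my $string = "\x{263A}";

# Character mode is the default, so "a*" hands back the same string.
my $packed = pack("a*", $string);

# Under "use bytes" the same template works on the underlying bytes.
my $raw;
{
    use bytes;
    $raw = pack("a*", $string);
}

printf "character mode: length %d, identical: %s\n",
    length($packed), ($packed eq $string ? "yes" : "no");
printf "use bytes:      %s\n",
    join(" ", map { sprintf "%02X", ord } split //, $raw);
```

Code that relied on the old behaviour to inspect the encoding is the code
most likely to need a C<use bytes> or an explicit encoding step.
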
To be consistent with pack(), the C<C0> in unpack() templates indicates
that the data is to be processed in character mode, i.e. character by
character; in contrast, C<U0> in unpack() indicates UTF-8 mode, where
the packed string is processed in its UTF-8-encoded Unicode form on a byte
by byte basis. This is the reverse of the behaviour in perl 5.8.X.

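As a hedged illustration of the two unpack() modes, the sketch below
unpacks the same two-character string once with C<C0> and once with
C<U0>. It assumes a perl with the new semantics; the sample string and
variable names are chosen only for the example.

```perl
use strict;
use warnings;

# GREEK SMALL LETTER ALPHA and OMEGA, both above U+00FF.
my $string = "\x{3B1}\x{3C9}";

# C0: character mode, each code point is one unit.
my $chars = sprintf "%vX", unpack("C0A*", $string);

# U0: UTF-8 mode, the UTF-8-encoded form is read byte by byte.
my $bytes = sprintf "%vX", unpack("U0A*", $string);

print "C0: $chars\n";    # C0: 3B1.3C9
print "U0: $bytes\n";    # U0: CE.B1.CF.89
```
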
Moreover, C<C0> and C<U0> can also be used in pack() templates to specify
character and byte modes, respectively.

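The same switches can be sketched for pack(). This example assumes a perl
with the new semantics; the variable names are illustrative only.

```perl
use strict;
use warnings;

# C0 selects character mode, so the UTF-8 bytes produced by U are
# stored as three separate characters.
my $utf8_bytes = pack("C0U", 0x263A);

# U0 selects UTF-8 (byte) mode, so the three bytes given to a* are
# taken as UTF-8 and end up as a single character.
my $char = pack("U0a*", "\xE2\x98\xBA");

printf "C0U:  %s\n", join(" ", map { sprintf "%02X", ord } split //, $utf8_bytes);
printf "U0a*: U+%04X\n", ord($char);
```
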
C<C0> and C<U0> in the middle of a pack or unpack format now switch to the
specified encoding mode, honoring parens grouping. Previously, parens were
ignored.

Also, there is a new pack() character format, C<W>, which is intended to
replace the old C<C>. C<C> is kept for unsigned chars coded as bytes in
the string's internal representation. C<W> represents unsigned (logical)
character values, which can be greater than 255. It is therefore more
robust when dealing with potentially UTF-8-encoded data (as C<C> will wrap
values outside the range 0..255, and not respect the string encoding).

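The difference between C<W> and C<C> can be seen in a small sketch. It
assumes a perl that supports C<W>; the values are arbitrary examples, and
the C<pack> warning category is silenced only to keep the output quiet.

```perl
use strict;
use warnings;

# W stores the logical character value, even above 255.
my $w = pack("W", 0x263A);

# C is limited to one byte, so 300 wraps to 300 & 0xFF.
my $c;
{
    no warnings 'pack';    # C warns when a value wraps
    $c = pack("C", 300);
}

printf "W: U+%04X\n", ord($w);    # W: U+263A
printf "C: %d\n",     ord($c);    # C: 44
```
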
In practice, that means that pack formats are now encoding-neutral, except
C<C>.

For consistency, C<A> in unpack() formats now trims all Unicode whitespace
from the end of the string. Before perl 5.9.2, it used to strip only the
classical ASCII space characters.

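A minimal sketch of the new trimming behaviour, assuming a perl with the
new semantics. U+2003 (EM SPACE) stands in for "Unicode whitespace" here,
and C<a*> is shown alongside because it does no trimming.

```perl
use strict;
use warnings;

# "abc" followed by two EM SPACE characters (U+2003).
my $padded = "abc\x{2003}\x{2003}";

# A* strips trailing whitespace, including non-ASCII whitespace.
my ($trimmed) = unpack("A*", $padded);

# a* returns the data unchanged.
my ($kept) = unpack("a*", $padded);

printf "A*: %d characters\n", length($trimmed);
printf "a*: %d characters\n", length($kept);
```
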
=head1 Core Enhancements

=head2 Regexp debug flags

A new variable, C<${^RE_DEBUG_FLAGS}>, controls what debug flags are in
effect for the regular expression engine when running under C<use re
"debug">. See L<re> for details.

=head1 Modules and Pragmata

=head1 Utility Changes

=head1 Performance Enhancements

=head2 Trie optimization for regexp engine

The regexp engine is now able to factorize common prefixes and suffixes in
regular expressions. A new special variable, C<${^RE_TRIE_MAXBUFF}>, has been
added to fine-tune this optimization.

=head1 Installation and Configuration Improvements

=head1 Selected Bug Fixes

=head1 New or Changed Diagnostics

=head1 Changed Internals

=head2 Platform Specific Problems

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://bugs.perl.org/ . There may also be
information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release. Be sure to trim your bug down
to a tiny but sufficient test case. Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.