pod/perl592delta.pod

=head1 NAME

perldelta - what is new for perl v5.9.2

=head1 DESCRIPTION

This document describes differences between the 5.9.1 and the 5.9.2
development releases. See L<perl590delta> and L<perl591delta> for the
differences between 5.8.0 and 5.9.1.

=head1 Incompatible Changes

=head2 Packing and UTF-8 strings

The semantics of pack() and unpack() regarding UTF-8-encoded data have been
changed. Processing is now by default character per character instead of
byte per byte on the underlying encoding. Notably, code that used things
like C<pack("a*", $string)> to see through the encoding of a string will now
simply get back the original $string. Packed strings can also get upgraded
during processing when you store upgraded characters. You can get the old
behaviour by using C<use bytes>.
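
As a minimal sketch of the new default, assuming a perl with the character
semantics described above (5.9.2 or later). The sample string and variable
names are purely illustrative:

    use strict;
    use warnings;

    # Illustrative string; U+263A forces the upgraded (UTF-8) representation.
    my $string = "caf\x{e9} \x{263A}";

    # In character mode, "a*" copies characters, not the encoded bytes.
    my $packed = pack("a*", $string);

    print "round trip ok\n" if $packed eq $string;
    print length($packed), " characters\n";

Code that relied on C<pack("a*", ...)> to expose the encoded bytes would
need an explicit byte-oriented step instead.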

To be consistent with pack(), the C<C0> in unpack() templates indicates
that the data is to be processed in character mode, i.e. character by
character; conversely, C<U0> in unpack() indicates UTF-8 mode, where
the packed string is processed in its UTF-8-encoded Unicode form on a
byte-by-byte basis. This is the reverse of the behaviour in perl 5.8.X.
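
The difference between the two modes can be sketched as follows, assuming
the semantics described above. The input string is an arbitrary example:

    use strict;
    use warnings;

    # Greek alpha and omega, both above U+00FF.
    my $string = "\x{3B1}\x{3C9}";

    # C0: character mode, the string is seen as two characters.
    my ($chars) = unpack("C0A*", $string);

    # U0: UTF-8 mode, the string is seen as its four UTF-8 bytes.
    my ($bytes) = unpack("U0A*", $string);

    printf "%vX\n", $chars;   # 3B1.3C9
    printf "%vX\n", $bytes;   # CE.B1.CF.89

The C<%vX> format prints the ordinal of each element in hexadecimal, which
shows whether unpack() saw characters or bytes.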

Moreover, C<C0> and C<U0> can also be used in pack() templates to specify
character mode and byte mode respectively.
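
A small sketch of C<U0> in a pack() template, assuming the byte mode
described above. The byte values are simply the UTF-8 encoding of U+263A,
chosen for illustration:

    use strict;
    use warnings;

    # U0 switches pack() to UTF-8 byte mode: the values that follow are
    # written as raw UTF-8 bytes. E2 98 BA is the encoding of U+263A.
    my $smiley = pack("U0C3", 0xE2, 0x98, 0xBA);

    # Report how many characters the result holds and their ordinals.
    printf "%d character(s): %vX\n", length($smiley), $smiley;

If the bytes are joined into characters as described, the result should
compare equal to C<"\x{263A}">.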

C<C0> and C<U0> in the middle of a pack or unpack format now switch to the
specified encoding mode, honoring parenthesized groups. Previously,
parentheses were ignored.

Also, there is a new pack() character format, C<W>, which is intended to
replace the old C<C>. C<C> is kept for unsigned chars coded as bytes in
the string's internal representation. C<W> represents unsigned (logical)
character values, which can be greater than 255. It is therefore more
robust when dealing with potentially UTF-8-encoded data (as C<C> will wrap
values outside the range 0..255, and not respect the string encoding).

In practice, that means that pack formats are now encoding-neutral, except
C<C>.
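
The contrast between C<W> and C<C> can be sketched like this. The values
are arbitrary examples, and the C<pack> warning category is silenced only
to keep the output quiet:

    use strict;
    use warnings;

    # W stores the logical character value, even above 255.
    my $w = pack("W", 0x263A);
    printf "W: %vX\n", $w;              # W: 263A

    # W also round-trips through unpack().
    my @values = unpack("W*", "A\x{263A}");
    printf "%X %X\n", @values;          # 41 263A

    # C is limited to 0..255; larger values wrap (and warn under
    # "use warnings", hence the local "no warnings").
    my $c = do { no warnings 'pack'; pack("C", 0x263A) };
    printf "C: %vX\n", $c;              # C: 3A

New code that handles text should therefore generally prefer C<W>, keeping
C<C> for data that is known to be bytes.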

For consistency, C<A> in unpack() formats now trims all Unicode whitespace
from the end of the string. Before perl 5.9.2, it stripped only the
classical ASCII space characters.
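
For illustration, a short sketch assuming the trimming rule described
above. U+2003 (EM SPACE) stands in for any non-ASCII Unicode whitespace
character:

    use strict;
    use warnings;

    # "abc" followed by an ASCII space and an EM SPACE (U+2003).
    my $padded = "abc \x{2003}";

    # "A*" now strips both kinds of trailing whitespace.
    my ($trimmed) = unpack("A*", $padded);

    print length($trimmed), "\n";   # 3

Code that depended on non-ASCII whitespace surviving C<A> would have to
use C<a> and trim explicitly.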

=head1 Core Enhancements

=head2 Regexp debug flags

A new variable, ${^RE_DEBUG_FLAGS}, controls which debug flags are in
effect for the regular expression engine when running under C<use re
"debug">. See L<re> for details.

=head1 Modules and Pragmata

=head1 Utility Changes

=head1 Documentation

=head1 Performance Enhancements

=head2 Trie optimization for regexp engine

The regexp engine is now able to factorize common prefixes and suffixes in
regular expressions. A new special variable, ${^RE_TRIE_MAXBUFF}, has been
added to fine-tune this optimization.
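
A hedged sketch of where the variable comes into play. The word list is
made up, and the meaning of a negative value is an assumption taken from
the L<perlvar> documentation of later releases, not from this document:

    use strict;
    use warnings;

    # Alternatives sharing the prefix "foo": a candidate for the
    # trie optimization.
    my @words = qw(foo foobar foobaz food fool);
    my $alternation = join '|', @words;

    # Default settings: the engine may compile the alternation into a trie.
    my $with_trie = qr/^(?:$alternation)$/;

    # Assumption (from later perlvar documentation): a negative value
    # disables the optimization for patterns compiled in this scope.
    my $without_trie = do {
        local ${^RE_TRIE_MAXBUFF} = -1;
        qr/^(?:$alternation)$/;
    };

    # Both patterns must match the same strings; only compilation differs.
    for my $candidate (qw(food fool foot)) {
        printf "%-5s %d %d\n", $candidate,
            ($candidate =~ $with_trie    ? 1 : 0),
            ($candidate =~ $without_trie ? 1 : 0);
    }

The setting affects only how the pattern is compiled, so it is a tuning
knob for memory and speed, never for what a pattern matches.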

=head1 Installation and Configuration Improvements

=head1 Selected Bug Fixes

=head1 New or Changed Diagnostics

=head1 Changed Internals

=head1 Known Problems

=head2 Platform Specific Problems

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://bugs.perl.org/ .  There may also be
information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut