Commit | Line | Data |
e0eb806d |
1 | =head1 NAME |
2 | |
70693193 |
3 | perl592delta - what is new for perl v5.9.2 |
e0eb806d |
4 | |
5 | =head1 DESCRIPTION |
6 | |
7 | This document describes differences between the 5.9.1 and the 5.9.2 |
fa11829f |
8 | development releases. See L<perl590delta> and L<perl591delta> for the |
e0eb806d |
9 | differences between 5.8.0 and 5.9.1. |
10 | |
11 | =head1 Incompatible Changes |
12 | |
a8cf0b1d |
13 | =head2 Packing and UTF-8 strings |
14 | |
15 | The semantics of pack() and unpack() regarding UTF-8-encoded data have been |
f1aa04aa |
16 | changed. Processing is now by default character per character instead of |
17 | byte per byte on the underlying encoding. Notably, code that used things |
18 | like C<pack("a*", $string)> to see through the encoding of a string will now |
19 | simply get back the original $string. Packed strings can also get upgraded |
20 | during processing when you store upgraded characters. You can get the old |
21 | behaviour by using C<use bytes>. |
a8cf0b1d |
22 | |
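As an illustration of the character-level behaviour described above, here is a minimal sketch. It assumes a perl that already has the new semantics, and the sample string and variable names are chosen only for the example.

```perl
use strict;
use warnings;

# Sample string holding one character above 255 (WHITE SMILING FACE),
# so perl stores it internally as UTF-8.
my $string = "\x{263A}";

# pack() now works on characters, so "a*" hands the string back unchanged
# instead of exposing the bytes of its UTF-8 encoding.
my $packed = pack("a*", $string);

# To look at the encoded bytes, encode a copy explicitly.
my $bytes = $string;
utf8::encode($bytes);

printf "packed: %d character(s), encoded: %d byte(s)\n",
    length($packed), length($bytes);
```

Encoding a copy with C<utf8::encode> is one explicit way to reach the bytes; it is shown here as an alternative to relying on C<use bytes>.
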
23 | To be consistent with pack(), the C<C0> in unpack() templates indicates |
24 | that the data is to be processed in character mode, i.e. character by |
118f78c9 |
25 | character; conversely, C<U0> in unpack() indicates UTF-8 mode, where |
a8cf0b1d |
26 | the packed string is processed in its UTF-8-encoded Unicode form on a byte |
27 | by byte basis. This is reversed with regard to perl 5.8.X. |
28 | |
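The difference between the two modes can be sketched as follows, assuming a perl with the new semantics. The two-letter Greek sample string is arbitrary, and C<%vX> is used only to display the ordinals that each template sees.

```perl
use strict;
use warnings;

# Two Greek letters, each above 255, so the string is UTF-8 encoded internally.
my $string = "\x{3B1}\x{3C9}";

# C0: character mode, the template sees two characters.
my $chars = sprintf "%vX", unpack("C0A*", $string);

# U0: UTF-8 mode, the template sees the bytes of the encoded form.
my $utf8_bytes = sprintf "%vX", unpack("U0A*", $string);

print "C0: $chars\nU0: $utf8_bytes\n";
```
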
29 | Moreover, C<C0> and C<U0> can also be used in pack() templates to specify |
30 | character and byte modes, respectively. |
31 | |
f1aa04aa |
32 | C<C0> and C<U0> in the middle of a pack or unpack format now switch to the |
33 | specified encoding mode, honoring parens grouping. Previously, parens were |
34 | ignored. |
a8cf0b1d |
35 | |
36 | Also, there is a new pack() character format, C<W>, which is intended to |
f1aa04aa |
37 | replace the old C<C>. C<C> is kept for unsigned chars coded as bytes in |
38 | the string's internal representation. C<W> represents unsigned (logical) |
39 | character values, which can be greater than 255. It is therefore more |
40 | robust when dealing with potentially UTF-8-encoded data (as C<C> will wrap |
41 | values outside the range 0..255, and not respect the string encoding). |
a8cf0b1d |
42 | |
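A small sketch of the contrast between C<W> and C<C>, assuming a perl with the new formats. The values 0x263A and 300 are arbitrary examples, and the C<pack> warning category is silenced only to keep the demonstration quiet.

```perl
use strict;
use warnings;

# W stores the logical character value, even above 255.
my $wide = pack("W", 0x263A);

# C is limited to 0..255; larger values wrap around (perl warns about
# this, so the warning is disabled inside the do block).
my $wrapped = do { no warnings 'pack'; pack("C", 300) };

printf "W: U+%04X, C: %d\n", ord($wide), ord($wrapped);
```
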
43 | In practice, that means that pack formats are now encoding-neutral, except |
44 | C<C>. |
45 | |
1cdd6bcc |
46 | For consistency, C<A> in unpack() format now trims all Unicode whitespace |
47 | from the end of the string. Before perl 5.9.2, it used to strip only the |
48 | classical ASCII space characters. |
49 | |
1af60bcb |
50 | =head2 Miscellaneous |
3911fe8f |
51 | |
1af60bcb |
52 | The internal dump output has been improved, so that non-printable characters |
53 | such as newline and backspace are output in C<\x> notation, rather than |
54 | octal. |
55 | |
56 | The B<-C> option can no longer be used on the C<#!> line. It wasn't |
118f78c9 |
57 | working there anyway. |
3911fe8f |
58 | |
e0eb806d |
59 | =head1 Core Enhancements |
60 | |
1af60bcb |
61 | =head2 Malloc wrapping |
1cdd6bcc |
62 | |
1af60bcb |
63 | Perl can now be built to detect attempts to allocate pathologically large chunks |
64 | of memory. Previously such allocations would suffer from integer wrap-around |
65 | during size calculations causing a misallocation, which would crash perl, and |
66 | could theoretically be used for "stack smashing" attacks. The wrapping |
67 | defaults to enabled on platforms where we know it works (most AIX |
e5c81e17 |
68 | configurations, BSDi, Darwin, DEC OSF/1, FreeBSD, HP-UX, GNU/Linux, OpenBSD, |
1af60bcb |
69 | Solaris, VMS and most Win32 compilers) and defaults to disabled on other |
70 | platforms. |
71 | |
72 | =head2 Unicode Character Database 4.0.1 |
73 | |
74 | The copy of the Unicode Character Database included in Perl 5.9 has |
75 | been updated to 4.0.1 from 4.0.0. |
76 | |
77 | =head2 suidperl less insecure |
78 | |
79 | Paul Szabo has analysed and patched C<suidperl> to remove existing known |
80 | insecurities. Currently there are no known holes in C<suidperl>, but previous |
81 | experience shows that we cannot be confident that these were the last. You may |
82 | no longer invoke the set uid perl directly, so to preserve backwards |
83 | compatibility with scripts that invoke #!/usr/bin/suidperl the only set uid |
84 | binary is now C<sperl5.9.>I<n> (C<sperl5.9.2> for this release). C<suidperl> |
85 | is installed as a hard link to C<perl>; both C<suidperl> and C<perl> will |
86 | automatically invoke the set uid binary C<sperl5.9.2>, so this change should |
87 | be completely transparent. |
88 | |
89 | For new projects the core perl team would strongly recommend that you use |
90 | dedicated, single purpose security tools such as C<sudo> in preference to |
91 | C<suidperl>. |
92 | |
118f78c9 |
93 | =head2 PERLIO_DEBUG |
94 | |
95 | The C<PERLIO_DEBUG> environment variable no longer has any effect for |
96 | setuid scripts and for scripts run with B<-T>. |
97 | |
98 | Moreover, with a thread-enabled perl, using C<PERLIO_DEBUG> could lead to |
99 | an internal buffer overflow. This has been fixed. |
100 | |
1af60bcb |
101 | =head2 Formats |
102 | |
103 | In addition to bug fixes, C<format>'s features have been enhanced. See |
104 | L<perlform>. |
105 | |
106 | =head2 Unicode Character Classes |
107 | |
108 | Perl's regular expression engine now contains support for matching on the |
109 | intersection of two Unicode character classes. You can also now refer to |
110 | user-defined character classes from within other user-defined character |
111 | classes. |
1cdd6bcc |
112 | |
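The following sketch shows one way these two features might be combined. The property names C<InHexDigit>, C<InLowercase> and C<InLowerHexLetter> are hypothetical, invented for this example, and the syntax follows the user-defined property rules described in L<perlunicode>.

```perl
use strict;
use warnings;

# User-defined properties are subs whose names start with "In" or "Is" and
# that return one range per line (hexadecimal code points, tab separated).
sub InHexDigit  { return "30\t39\n41\t46\n61\t66\n" }   # 0-9, A-F, a-f
sub InLowercase { return "61\t7A\n" }                   # a-z

# "+" pulls in another class and "&" intersects with one,
# so this class covers only a-f.
sub InLowerHexLetter { return "+main::InHexDigit\n&main::InLowercase\n" }

my @matched = grep { /^\p{InLowerHexLetter}$/ } qw(c F 7 z);
print "matched: @matched\n";
```
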
2bbb3949 |
113 | =head2 Byte-order modifiers for pack() and unpack() |
114 | |
115 | There are two new byte-order modifiers, C<E<gt>> (big-endian) and C<E<lt>> |
116 | (little-endian), that can be appended to most pack() and unpack() template |
117 | characters and groups to force a certain byte-order for that type or group. |
118 | See L<perlfunc/pack> and L<perlpacktut> for details. |
119 | |
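A short sketch of the modifiers in use, assuming a perl that supports them. The packed values are arbitrary.

```perl
use strict;
use warnings;

# The same 32-bit value packed with an explicit byte order.
my $big    = pack("L>", 0x01020304);   # most significant byte first
my $little = pack("L<", 0x01020304);   # least significant byte first

# A modifier on a group applies to every item inside the group.
my ($x, $y) = unpack("(SS)>", "\x00\x01\x00\x02");

printf "big: %s, little: %s, group: %d %d\n",
    unpack("H*", $big), unpack("H*", $little), $x, $y;
```

Because the byte order is stated in the template, the result does not depend on the endianness of the machine running the code.
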
118f78c9 |
120 | =head2 Byte count feature in pack() |
121 | |
122 | A new pack() template character, C<".">, returns the number of characters |
123 | read so far. |
124 | |
1af60bcb |
125 | =head2 New variables |
126 | |
127 | A new variable, ${^RE_DEBUG_FLAGS}, controls what debug flags are in |
128 | effect for the regular expression engine when running under C<use re |
129 | "debug">. See L<re> for details. |
130 | |
a69635b7 |
131 | A new variable ${^UTF8LOCALE} indicates whether a UTF-8 locale was detected |
1af60bcb |
132 | by perl at startup. |
133 | |
e0eb806d |
134 | =head1 Modules and Pragmata |
135 | |
3911fe8f |
136 | =head2 New modules |
137 | |
138 | =over 4 |
139 | |
140 | =item * |
141 | |
353c6505 |
142 | C<encoding::warnings>, by Audrey Tang, is a module to emit warnings |
118f78c9 |
143 | whenever an ASCII character string containing high-bit bytes is implicitly |
144 | converted into UTF-8. |
145 | |
146 | =item * |
147 | |
3911fe8f |
148 | C<Module::CoreList>, by Richard Clamp, is a small handy module that tells |
149 | you what versions of core modules ship with any version of Perl 5. It |
150 | comes with a command-line frontend, C<corelist>. |
151 | |
152 | =back |
153 | |
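As a rough sketch of programmatic use, the snippet below asks C<Module::CoreList> for the first perl release that bundled a module. It assumes the installed C<Module::CoreList> provides the C<first_release> method, and C<Data::Dumper> is used only as a well-known example.

```perl
use strict;
use warnings;
use Module::CoreList;

# Look up the first perl release that bundled a given module.
my $first = Module::CoreList->first_release('Data::Dumper');

print "Data::Dumper first shipped with perl $first\n";
```
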
154 | =head2 Updated And Improved Modules and Pragmata |
155 | |
1af60bcb |
156 | Dual-lived modules have been synchronized with their latest CPAN |
157 | releases. |
158 | |
159 | The dual-lived modules which contain an C<_> in their version number are |
160 | actually I<ahead> of the corresponding CPAN release. |
161 | |
162 | =over 4 |
163 | |
1af60bcb |
164 | =item B::Concise |
165 | |
166 | C<B::Concise> was significantly improved. |
167 | |
168 | =item Socket |
169 | |
170 | There is experimental support for Linux abstract Unix domain sockets. |
171 | |
172 | =item Sys::Syslog |
173 | |
174 | C<syslog()> can now use numeric constants for facility names and priorities, |
175 | in addition to strings. |
176 | |
177 | =item threads |
178 | |
179 | Detached threads are now also supported on Windows. |
180 | |
181 | =back |
182 | |
e0eb806d |
183 | =head1 Utility Changes |
184 | |
3911fe8f |
185 | =over 4 |
186 | |
187 | =item * |
188 | |
118f78c9 |
189 | The C<corelist> utility is now installed with perl (see L</"New modules"> |
3911fe8f |
190 | above). |
191 | |
192 | =item * |
193 | |
194 | C<h2ph> and C<h2xs> have been made a bit more robust with regard to |
195 | "modern" C code. |
196 | |
1af60bcb |
197 | =item * |
198 | |
199 | Several bugs have been fixed in C<find2perl>, regarding C<-exec> and |
200 | C<-eval>. Also the options C<-path>, C<-ipath> and C<-iname> have been |
201 | added. |
202 | |
203 | =item * |
204 | |
205 | The Perl debugger can now save all debugger commands for sourcing later; |
206 | notably, it can now emulate stepping backwards, by restarting and |
207 | rerunning all bar the last command from a saved command history. |
208 | |
209 | It can also display the parent inheritance tree of a given class. |
210 | |
211 | Perl has a new B<-dt> command-line flag, which enables threads support in the |
212 | debugger. |
213 | |
3911fe8f |
214 | =back |
215 | |
e0eb806d |
216 | =head1 Performance Enhancements |
217 | |
1af60bcb |
218 | =over 4 |
219 | |
220 | =item * |
221 | |
222 | Unicode case mappings (C</i>, C<lc>, C<uc>, etc.) are faster. |
223 | |
224 | =item * |
225 | |
226 | C<@a = sort @a> was optimized to sort in place. Likewise, C<reverse |
227 | sort ...> is now optimized to sort in reverse, avoiding the generation of |
228 | a temporary intermediate list. |
229 | |
230 | =item * |
231 | |
118f78c9 |
232 | Unnecessary assignments are optimised away in |
1af60bcb |
233 | |
234 |     my $s = undef; |
235 |     my @a = (); |
236 |     my %h = (); |
237 | |
238 | =item * |
239 | |
118f78c9 |
240 | C<map> in scalar context is now optimized. |
1cdd6bcc |
241 | |
1af60bcb |
242 | =item * |
243 | |
244 | The regexp engine now implements the trie optimization: it is able to |
245 | factorize common prefixes and suffixes in regular expressions. A new |
118f78c9 |
246 | special variable, ${^RE_TRIE_MAXBUF}, has been added to fine-tune this |
1af60bcb |
247 | optimization. |
248 | |
249 | =back |
1cdd6bcc |
250 | |
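The idioms affected by these optimizations look like the following sketch. The data is arbitrary, and the code only shows the forms that are optimized; it does not measure any speed-up.

```perl
use strict;
use warnings;

my @a = qw(pear apple fig);

# Assigning the sorted list back to the same array is sorted in place.
@a = sort @a;

# "reverse sort" is performed as a single descending sort.
my @desc = reverse sort @a;

# map in scalar context yields the number of elements it generated.
my $count = map { ($_, length) } @a;

# Alternations of literal strings are the kind of pattern that the
# trie optimization is aimed at.
my @hits = grep { /^(?:apple|apricot|fig)$/ } @a;

print "@a | @desc | $count | @hits\n";
```
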
e0eb806d |
251 | =head1 Installation and Configuration Improvements |
252 | |
118f78c9 |
253 | Run-time customization of @INC can be enabled by passing the |
254 | C<-Dusesitecustomize> flag to configure. When enabled, this will make perl |
255 | run F<$sitelibexp/sitecustomize.pl> before anything else. This script can |
256 | then be set up to add additional entries to @INC. |
257 | |
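A F<sitecustomize.pl> could be as small as the following sketch. The directory F</opt/site/perl5/lib> is a hypothetical path chosen for illustration, not a location perl knows about.

```perl
# Hypothetical sitecustomize.pl: the directory name below is an
# assumption made for this example.
use strict;
use warnings;

my $site_dir = '/opt/site/perl5/lib';

# Put the extra directory at the front of @INC unless it is already listed.
unshift @INC, $site_dir unless grep { $_ eq $site_dir } @INC;
```
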
1af60bcb |
258 | There is alpha support for relocatable @INC entries. |
259 | |
260 | Perl should build on Interix and on GNU/kFreeBSD. |
261 | |
e0eb806d |
262 | =head1 Selected Bug Fixes |
263 | |
118f78c9 |
264 | Most of the bugs fixed in this release were reported in the perl 5.8.x maintenance track. |
1af60bcb |
265 | Notably, quite a few utf8 bugs were fixed, and several memory leaks were |
118f78c9 |
266 | plugged. The perl58Xdelta manpages have more details on them. |
1af60bcb |
267 | |
268 | Development-only bug fixes include: |
269 | |
270 | C<$Foo::_> was wrongly forced as C<$main::_>. |
271 | |
e0eb806d |
272 | =head1 New or Changed Diagnostics |
273 | |
1af60bcb |
274 | A new warning, C<!=~ should be !~>, is emitted to prevent this misspelling |
275 | of the non-matching operator. |
276 | |
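The misspelling that the new C<!=~> warning targets can be illustrated with this sketch. The variable names are arbitrary, and the incorrect form appears only in the comments.

```perl
use strict;
use warnings;

my $line = "no digits here";

# Intended test: "does $line fail to match?"  The correct operator is !~ .
# Writing !=~ instead parses as "!=" followed by "~", which is a numeric
# comparison against the bitwise complement of the match result.
my $has_no_digits = $line !~ /\d/;

print "no digits\n" if $has_no_digits;
```
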
3911fe8f |
277 | The warning I<Newline in left-justified string> has been removed. |
278 | |
279 | The error I<Too late for "-T" option> has been reformulated to be more |
280 | descriptive. |
281 | |
1af60bcb |
282 | There is a new compilation error, I<Illegal declaration of subroutine>, |
283 | for an obscure case of syntax errors. |
284 | |
285 | The diagnostic output of Carp has been changed slightly, to add a space after |
286 | the comma between arguments. This makes it much easier for tools such as |
287 | web browsers to wrap it, but might confuse any automatic tools which perform |
288 | detailed parsing of Carp output. |
289 | |
118f78c9 |
290 | C<perl -V> has several improvements, making it more useable from shell |
291 | scripts to get the value of configuration variables. See L<perlrun> for |
292 | details. |
3911fe8f |
293 | |
e0eb806d |
294 | =head1 Changed Internals |
295 | |
118f78c9 |
296 | The perl core has been refactored and reorganised in several places. |
297 | In short, this release will not be binary compatible with any previous |
298 | perl release. |
299 | |
e0eb806d |
300 | =head1 Known Problems |
301 | |
118f78c9 |
302 | For threaded builds, F<ext/threads/shared/t/wait.t> has been reported to |
303 | fail some tests on HP-UX 10.20. |
304 | |
c3ed3de7 |
305 | Net::Ping might fail some tests on HP-UX 11.00 with the latest OS |
306 | upgrades. |
307 | |
921db004 |
308 | F<t/io/dup.t>, F<t/io/open.t> and F<lib/ExtUtils/t/Constant.t> fail some |
309 | tests on some BSD flavours. |
118f78c9 |
310 | |
311 | =head1 Plans for the next release |
312 | |
313 | The current plan for perl 5.9.3 is to add CPANPLUS as a core module. |
314 | More regular expression optimizations are also in the works. |
315 | |
316 | It is planned to release a development version of perl more frequently, |
317 | i.e. each time something major changes. |
e0eb806d |
318 | |
319 | =head1 Reporting Bugs |
320 | |
321 | If you find what you think is a bug, you might check the articles |
322 | recently posted to the comp.lang.perl.misc newsgroup and the perl |
323 | bug database at http://bugs.perl.org/ . There may also be |
324 | information at http://www.perl.org/ , the Perl Home Page. |
325 | |
326 | If you believe you have an unreported bug, please run the B<perlbug> |
327 | program included with your release. Be sure to trim your bug down |
328 | to a tiny but sufficient test case. Your bug report, along with the |
329 | output of C<perl -V>, will be sent off to perlbug@perl.org to be |
330 | analysed by the Perl porting team. |
331 | |
332 | =head1 SEE ALSO |
333 | |
334 | The F<Changes> file for exhaustive details on what changed. |
335 | |
336 | The F<INSTALL> file for how to build Perl. |
337 | |
338 | The F<README> file for general stuff. |
339 | |
340 | The F<Artistic> and F<Copying> files for copyright information. |
341 | |
342 | =cut |