Commit | Line | Data |
81663c3a |
1 | =encoding utf8 |
2 | |
cf6c151c |
3 | =head1 NAME |
4 | |
5 | perldelta - what is new for perl 5.10.0 |
6 | |
7 | =head1 DESCRIPTION |
8 | |
9 | This document describes the differences between the 5.8.8 release and |
10 | the 5.10.0 release. |
11 | |
12 | Many of the bug fixes in 5.10.0 were already seen in the 5.8.X maintenance |
13 | releases; they are not duplicated here and are documented in the set of |
14 | man pages named perl58[1-8]?delta. |
15 | |
cf6c151c |
16 | =head1 Core Enhancements |
17 | |
18 | =head2 The C<feature> pragma |
19 | |
20 | The C<feature> pragma is used to enable new syntax that would break Perl's |
21 | backwards-compatibility with older releases of the language. It's a lexical |
22 | pragma, like C<strict> or C<warnings>. |
23 | |
24 | Currently the following new features are available: C<switch> (adds a |
25 | switch statement), C<say> (adds a C<say> built-in function), and C<state> |
292c2b28 |
26 | (adds a C<state> keyword for declaring "static" variables). Those |
cf6c151c |
27 | features are described in their own sections of this document. |
28 | |
29 | The C<feature> pragma is also implicitly loaded when you require a minimal |
30 | perl version (with the C<use VERSION> construct) greater than, or equal |
31 | to, 5.9.5. See L<feature> for details. |
32 | |
33 | =head2 New B<-E> command-line switch |
34 | |
35 | B<-E> is equivalent to B<-e>, but it implicitly enables all |
36 | optional features (like C<use feature ":5.10">). |
37 | |
38 | =head2 Defined-or operator |
39 | |
40 | A new operator C<//> (defined-or) has been implemented. |
dbef3c66 |
41 | The following expression: |
cf6c151c |
42 | |
43 | $a // $b |
44 | |
45 | is equivalent to |
46 | |
47 | defined $a ? $a : $b |
48 | |
dbef3c66 |
49 | and the statement |
cf6c151c |
50 | |
51 | $c //= $d; |
52 | |
53 | can now be used instead of |
54 | |
55 | $c = $d unless defined $c; |
56 | |
57 | The C<//> operator has the same precedence and associativity as C<||>. |
58 | Special care has been taken to ensure that this operator Does What You Mean |
59 | while not breaking old code, but some edge cases involving the empty |
60 | regular expression may now parse differently. See L<perlop> for |
61 | details. |
62 | |
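As a minimal sketch of how C<//> differs from C<||>, the following uses a hypothetical C<%opt> settings hash (not part of the original text) in which 0 is a legitimate value:

```perl
use strict;
use warnings;

# Hypothetical settings hash; 0 is a legitimate, defined value here.
my %opt = (retries => 0, timeout => undef);

my $retries_or  = $opt{retries} || 3;    # || treats the defined value 0 as false
my $retries_dor = $opt{retries} // 3;    # // only falls back when the value is undef
my $timeout     = $opt{timeout} // 30;   # undef triggers the default

$opt{verbose} //= 1;                     # assign only if not yet defined

print "$retries_or $retries_dor $timeout $opt{verbose}\n";   # prints "3 0 30 1"
```
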
63 | =head2 Switch and Smart Match operator |
64 | |
65 | Perl 5 now has a switch statement. It's available when C<use feature |
66 | 'switch'> is in effect. This feature introduces three new keywords, |
67 | C<given>, C<when>, and C<default>: |
68 | |
69 | given ($foo) { |
70 | when (/^abc/) { $abc = 1; } |
71 | when (/^def/) { $def = 1; } |
72 | when (/^xyz/) { $xyz = 1; } |
73 | default { $nothing = 1; } |
74 | } |
75 | |
76 | A more complete description of how Perl matches the switch variable |
77 | against the C<when> conditions is given in L<perlsyn/"Switch statements">. |
78 | |
79 | This kind of match is called I<smart match>, and it's also possible to use |
80 | it outside of switch statements, via the new C<~~> operator. See |
81 | L<perlsyn/"Smart matching in detail">. |
82 | |
83 | This feature was contributed by Robin Houston. |
84 | |
85 | =head2 Regular expressions |
86 | |
87 | =over 4 |
88 | |
89 | =item Recursive Patterns |
90 | |
91 | It is now possible to write recursive patterns without using the C<(??{})> |
92 | construct. This new way is more efficient, and in many cases easier to |
93 | read. |
94 | |
95 | Each capturing parenthesis can now be treated as an independent pattern |
96 | that can be entered by using the C<(?PARNO)> syntax (C<PARNO> standing for |
97 | "parenthesis number"). For example, the following pattern will match |
98 | nested balanced angle brackets: |
99 | |
100 | / |
101 | ^ # start of line |
102 | ( # start capture buffer 1 |
103 | < # match an opening angle bracket |
104 | (?: # match one of: |
105 | (?> # don't backtrack over the inside of this group |
106 | [^<>]+ # one or more non angle brackets |
107 | ) # end non backtracking group |
108 | | # ... or ... |
109 | (?1) # recurse to bracket 1 and try it again |
110 | )* # 0 or more times. |
111 | > # match a closing angle bracket |
112 | ) # end capture buffer one |
113 | $ # end of line |
114 | /x |
115 | |
e15dad31 |
116 | PCRE users should note that Perl's recursive regex feature allows |
117 | backtracking into a recursed pattern, whereas in PCRE the recursion is |
118 | atomic or "possessive" in nature. As in the example above, you can |
119 | add (?>) to control this selectively. (Yves Orton) |
cf6c151c |
120 | |
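The angle-bracket pattern above can be exercised as follows. This is only a sketch; the variable names and sample strings are illustrative assumptions.

```perl
use strict;
use warnings;

my $balanced = qr/
    ^
    (                       # capture group 1
        <
        (?:
            (?> [^<>]+ )    # a run of non-bracket characters, no backtracking
          |
            (?1)            # recurse into group 1
        )*
        >
    )
    $
/x;

# Sample inputs chosen for illustration only.
my @results = map { /$balanced/ ? 1 : 0 } ('<a<b>c>', '<a<b>', '<>', 'a<b>');
print "@results\n";         # prints "1 0 1 0"
```
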
121 | =item Named Capture Buffers |
122 | |
123 | It is now possible to name capturing parentheses in a pattern and refer to |
124 | the captured contents by name. The naming syntax is C<< (?<NAME>....) >>. |
125 | It's possible to backreference to a named buffer with the C<< \k<NAME> >> |
126 | syntax. In code, the new magical hashes C<%+> and C<%-> can be used to |
127 | access the contents of the capture buffers. |
128 | |
e15dad31 |
129 | Thus, to replace all doubled chars with a single copy, one could write |
cf6c151c |
130 | |
131 | s/(?<letter>.)\k<letter>/$+{letter}/g |
132 | |
133 | Only buffers with defined contents will be "visible" in the C<%+> hash, so |
134 | it's possible to do something like |
135 | |
136 | foreach my $name (keys %+) { |
137 | print "content of buffer '$name' is $+{$name}\n"; |
138 | } |
139 | |
140 | The C<%-> hash is a bit more complete, since it will contain array refs |
141 | holding values from all capture buffers similarly named, if there should |
142 | be many of them. |
143 | |
144 | C<%+> and C<%-> are implemented as tied hashes through the new module |
145 | C<Tie::Hash::NamedCapture>. |
146 | |
147 | Users exposed to the .NET regex engine will find that the perl |
148 | implementation differs in that the numerical ordering of the buffers |
149 | is sequential, and not "unnamed first, then named". Thus in the pattern |
150 | |
151 | /(A)(?<B>B)(C)(?<D>D)/ |
152 | |
153 | $1 will be 'A', $2 will be 'B', $3 will be 'C' and $4 will be 'D' and not |
154 | $1 is 'A', $2 is 'C' and $3 is 'B' and $4 is 'D' that a .NET programmer |
155 | would expect. This is considered a feature. :-) (Yves Orton) |
156 | |
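A short sketch of named captures in use. The sample strings and variable names are assumptions made for illustration:

```perl
use strict;
use warnings;

# %+ holds the leftmost defined capture for each name.
my %parts;
if ('2007-12-18' =~ /(?<year>\d{4})-(?<month>\d\d)-(?<day>\d\d)/) {
    %parts = %+;                       # copy before the next match resets it
}

# The substitution from the text, applied to a sample word.
(my $squeezed = 'bookkeeper') =~ s/(?<letter>.)\k<letter>/$+{letter}/g;

# %- holds an array ref with every capture sharing the same name.
my ($first, @all);
if ('1-2' =~ /(?<d>\d)-(?<d>\d)/) {
    $first = $+{d};
    @all   = @{ $-{d} };
}

# Named and unnamed groups are numbered together, left to right.
my $second = 'ABCD' =~ /(A)(?<B>B)(C)(?<D>D)/ ? $2 : undef;

print "$parts{year} $squeezed $first @all $second\n";   # prints "2007 bokeper 1 1 2 B"
```
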
157 | =item Possessive Quantifiers |
158 | |
159 | Perl now supports the "possessive quantifier" syntax of the "atomic match" |
160 | pattern. Basically a possessive quantifier matches as much as it can and never |
161 | gives any back. Thus it can be used to control backtracking. The syntax is |
162 | similar to non-greedy matching, except instead of using a '?' as the modifier |
163 | the '+' is used. Thus C<?+>, C<*+>, C<++>, C<{min,max}+> are now legal |
164 | quantifiers. (Yves Orton) |
165 | |
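To illustrate the "never gives any back" behaviour, here is a minimal comparison of a greedy and a possessive quantifier on the same (arbitrarily chosen) string:

```perl
use strict;
use warnings;

my $str = 'aaa';

# Greedy a+ first takes all three characters, then gives one back
# so that the final "a" can match.
my $greedy = $str =~ /^a+a$/ ? 1 : 0;

# Possessive a++ also takes all three, but never gives any back,
# so the final "a" has nothing left to match.
my $possessive = $str =~ /^a++a$/ ? 1 : 0;

print "$greedy $possessive\n";   # prints "1 0"
```
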
166 | =item Backtracking control verbs |
167 | |
168 | The regex engine now supports a number of special-purpose backtrack |
169 | control verbs: (*THEN), (*PRUNE), (*MARK), (*SKIP), (*COMMIT), (*FAIL) |
170 | and (*ACCEPT). See L<perlre> for their descriptions. (Yves Orton) |
171 | |
172 | =item Relative backreferences |
173 | |
174 | A new syntax C<\g{N}> or C<\gN> where "N" is a decimal integer allows a |
175 | safer form of back-reference notation as well as allowing relative |
176 | backreferences. This should make it easier to generate and embed patterns |
177 | that contain backreferences. See L<perlre/"Capture buffers">. (Yves Orton) |
178 | |
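The following sketch shows why a relative backreference is easier to embed than an absolute one. The fragment names C<$relative> and C<$absolute> are hypothetical.

```perl
use strict;
use warnings;

# Two fragments meaning "a word character followed by itself".
my $relative = qr/(\w)\g{-1}/;   # refers to the closest preceding group
my $absolute = qr/(\w)\1/;       # refers to group 1 of the whole pattern

# Once embedded after another group, the fragment's own group becomes
# number 2. \g{-1} still points at it; \1 now points at "(x)".
my $rel_ok = 'xaa' =~ /(x)$relative/ ? 1 : 0;
my $abs_ok = 'xaa' =~ /(x)$absolute/ ? 1 : 0;

print "$rel_ok $abs_ok\n";       # prints "1 0"
```
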
179 | =item C<\K> escape |
180 | |
181 | The functionality of Jeff Pinyan's module Regexp::Keep has been added to |
254a8700 |
182 | the core. In regular expressions you can now use the special escape C<\K> |
cf6c151c |
183 | as a way to do something like floating length positive lookbehind. It is |
184 | also useful in substitutions like: |
185 | |
186 | s/(foo)bar/$1/g |
187 | |
188 | that can now be converted to |
189 | |
190 | s/foo\Kbar//g |
191 | |
192 | which is much more efficient. (Yves Orton) |
193 | |
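A small sketch of C<\K> in substitutions. The sample strings are made up for illustration:

```perl
use strict;
use warnings;

# Everything matched before \K is kept; only the part after it is replaced.
(my $text = 'foobar foobaz') =~ s/foo\Kbar//g;

# \K after a variable-length prefix, which a lookbehind could not express:
# keep the directory part, drop the file name.
(my $dir = 'lib/Foo/Bar.pm') =~ s{.*/\K.*}{};

print "$text|$dir\n";   # prints "foo foobaz|lib/Foo/"
```
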
194 | =item Vertical and horizontal whitespace, and linebreak |
195 | |
292c2b28 |
196 | Regular expressions now recognize the C<\v> and C<\h> escapes that match |
cf6c151c |
197 | vertical and horizontal whitespace, respectively. C<\V> and C<\H> |
198 | logically match their complements. |
199 | |
200 | C<\R> matches a generic linebreak, that is, vertical whitespace, plus |
201 | the multi-character sequence C<"\x0D\x0A">. |
202 | |
203 | =back |
204 | |
205 | =head2 C<say()> |
206 | |
207 | say() is a new built-in, only available when C<use feature 'say'> is in |
208 | effect, that is similar to print(), but that implicitly appends a newline |
209 | to the printed string. See L<perlfunc/say>. (Robin Houston) |
210 | |
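As a minimal sketch, the following writes to an in-memory filehandle (an assumption made so the result can be inspected) to show the newline that say() appends:

```perl
use strict;
use warnings;
use feature 'say';

my $out = '';
open my $fh, '>', \$out or die "cannot open in-memory handle: $!";

say {$fh} 'hello';        # same as: print {$fh} "hello\n"
say {$fh} 'a', 'b';       # one newline after the whole list, not after each item

close $fh;
print $out;               # prints "hello" and "ab" on separate lines
```
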
211 | =head2 Lexical C<$_> |
212 | |
213 | The default variable C<$_> can now be lexicalized, by declaring it like |
214 | any other lexical variable, with a simple |
215 | |
216 | my $_; |
217 | |
218 | The operations that default on C<$_> will use the lexically-scoped |
219 | version of C<$_> when it exists, instead of the global C<$_>. |
220 | |
221 | In a C<map> or a C<grep> block, if C<$_> was previously my'ed, then the |
222 | C<$_> inside the block is lexical as well (and scoped to the block). |
223 | |
224 | In a scope where C<$_> has been lexicalized, you can still have access to |
225 | the global version of C<$_> by using C<$::_>, or, more simply, by |
597bb945 |
226 | overriding the lexical declaration with C<our $_>. (Rafael Garcia-Suarez) |
cf6c151c |
227 | |
228 | =head2 The C<_> prototype |
229 | |
254a8700 |
230 | A new prototype character has been added. C<_> is equivalent to C<$> but |
231 | defaults to C<$_> if the corresponding argument isn't supplied (both C<$> |
3d9f6fa1 |
232 | and C<_> denote a scalar). Due to the optional nature of the argument, you |
254a8700 |
233 | can only use it at the end of a prototype, or before a semicolon. |
cf6c151c |
234 | |
235 | This has a small incompatible consequence: the prototype() function has |
236 | been adjusted to return C<_> for some built-ins in appropriate cases (for |
237 | example, C<prototype('CORE::rmdir')>). (Rafael Garcia-Suarez) |
238 | |
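A sketch of the C<_> prototype, using a hypothetical helper named C<trimmed> that is not part of the Perl core:

```perl
use strict;
use warnings;

# Hypothetical helper: returns its argument with surrounding whitespace
# removed, or works on $_ when called without an argument.
sub trimmed(_) {
    my ($text) = @_;
    $text =~ s/^\s+|\s+$//g;
    return $text;
}

my $explicit = trimmed('  one  ');
my @implicit = map { trimmed() } ('  two ', ' three');   # $_ is passed implicitly

print join('|', $explicit, @implicit), "\n";             # prints "one|two|three"
```
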
239 | =head2 UNITCHECK blocks |
240 | |
241 | C<UNITCHECK>, a new special code block, has been introduced, in addition to |
242 | C<BEGIN>, C<CHECK>, C<INIT> and C<END>. |
243 | |
244 | C<CHECK> and C<INIT> blocks, while useful for some specialized purposes, |
245 | are always executed at the transition between the compilation and the |
246 | execution of the main program, and thus are useless whenever code is |
247 | loaded at runtime. On the other hand, C<UNITCHECK> blocks are executed |
248 | just after the unit which defined them has been compiled. See L<perlmod> |
249 | for more information. (Alex Gough) |
250 | |
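The timing difference can be sketched as follows. The C<@order> array is an assumption, used only to record the sequence of events:

```perl
use strict;
use warnings;

our @order;

BEGIN     { push @order, 'BEGIN (main)' }
UNITCHECK { push @order, 'UNITCHECK (main)' }

# Code compiled at run time: its UNITCHECK block runs as soon as this
# unit has been compiled, before the unit's own body executes.
eval q{
    UNITCHECK { push @order, 'UNITCHECK (eval)' }
    push @order, 'eval body';
    1;
} or die $@;

push @order, 'main body done';
print "$_\n" for @order;
```
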
251 | =head2 New Pragma, C<mro> |
252 | |
253 | A new pragma, C<mro> (for Method Resolution Order), has been added. It |
254 | lets you switch, on a per-class basis, the algorithm that perl uses to |
dbef3c66 |
255 | find inherited methods in case of a multiple inheritance hierarchy. The |
cf6c151c |
256 | default MRO hasn't changed (DFS, for Depth First Search). Another MRO is |
257 | available: the C3 algorithm. See L<mro> for more information. |
258 | (Brandon Black) |
259 | |
dbef3c66 |
260 | Note that, due to changes in the implementation of class hierarchy search, |
cf6c151c |
261 | code that used to undef the C<*ISA> glob will most probably break. Anyway, |
262 | undef'ing C<*ISA> had the side-effect of removing the magic on the @ISA |
263 | array and should not have been done in the first place. |
264 | |
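The difference between the two orders shows up in a diamond-shaped hierarchy. In this sketch all class names are hypothetical:

```perl
use strict;
use warnings;

# Diamond: both children inherit from Left and Right, which share Base.
{ package Base;     sub hello { 'Base' } }
{ package Left;     our @ISA = ('Base'); }
{ package Right;    our @ISA = ('Base'); sub hello { 'Right' } }
{ package DfsChild; our @ISA = ('Left', 'Right'); }                 # default DFS
{ package C3Child;  use mro 'c3'; our @ISA = ('Left', 'Right'); }   # C3 for this class only

my $dfs = DfsChild->hello;    # DFS reaches Base through Left before trying Right
my $c3  = C3Child->hello;     # C3 tries Right before the shared Base
my @c3_order = @{ mro::get_linear_isa('C3Child') };

print "$dfs $c3 (@c3_order)\n";   # prints "Base Right (C3Child Left Right Base)"
```
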
3de67921 |
265 | =head2 readdir() may return a "short filename" on Windows |
266 | |
267 | The readdir() function may return a "short filename" when the long |
268 | filename contains characters outside the ANSI codepage. Similarly |
269 | Cwd::cwd() may return a short directory name, and glob() may return short |
270 | names as well. On the NTFS file system these short names can always be |
271 | represented in the ANSI codepage. This will not be true for all other file |
272 | system drivers; e.g. the FAT filesystem stores short filenames in the OEM |
273 | codepage, so some files on FAT volumes remain inaccessible through the |
274 | ANSI APIs. |
275 | |
276 | Similarly, $^X, @INC, and $ENV{PATH} are preprocessed at startup to make |
277 | sure all paths are valid in the ANSI codepage (if possible). |
278 | |
279 | The Win32::GetLongPathName() function now returns the UTF-8 encoded |
280 | correct long file name instead of using replacement characters to force |
281 | the name into the ANSI codepage. The new Win32::GetANSIPathName() |
282 | function can be used to turn a long pathname into a short one only if the |
283 | long one cannot be represented in the ANSI codepage. |
284 | |
285 | Many other functions in the C<Win32> module have been improved to accept |
286 | UTF-8 encoded arguments. Please see L<Win32> for details. |
287 | |
cf6c151c |
288 | =head2 readpipe() is now overridable |
289 | |
290 | The built-in function readpipe() is now overridable. Overriding it also |
291 | overrides its operator counterpart, C<qx//> (a.k.a. C<``>). |
292 | Moreover, it now defaults to C<$_> if no argument is provided. (Rafael |
293 | Garcia-Suarez) |
294 | |
597bb945 |
295 | =head2 Default argument for readline() |
cf6c151c |
296 | |
297 | readline() now defaults to C<*ARGV> if no argument is provided. (Rafael |
298 | Garcia-Suarez) |
299 | |
300 | =head2 state() variables |
301 | |
302 | A new class of variables has been introduced. State variables are similar |
303 | to C<my> variables, but are declared with the C<state> keyword in place of |
304 | C<my>. They're visible only in their lexical scope, but their value is |
305 | persistent: unlike C<my> variables, they're not undefined at scope entry, |
306 | but retain their previous value. (Rafael Garcia-Suarez, Nicholas Clark) |
307 | |
308 | To use state variables, one needs to enable them by using |
309 | |
254a8700 |
310 | use feature 'state'; |
cf6c151c |
311 | |
312 | or by using the C<-E> command-line switch in one-liners. |
313 | See L<perlsub/"Persistent variables via state()">. |
314 | |
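A minimal sketch of a state variable, using a hypothetical counter function:

```perl
use strict;
use warnings;
use feature 'state';

# Hypothetical ID generator: $counter is private to the sub,
# yet keeps its value between calls.
sub next_id {
    state $counter = 0;    # initialised once, on the first call only
    return ++$counter;
}

my @ids = map { next_id() } 1 .. 3;
print "@ids\n";            # prints "1 2 3"
```
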
315 | =head2 Stacked filetest operators |
316 | |
317 | As a new form of syntactic sugar, it's now possible to stack up filetest |
318 | operators. You can now write C<-f -w -x $file> in a row to mean |
319 | C<-x $file && -w _ && -f _>. See L<perlfunc/-X>. |
320 | |
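As an illustration, the stacked and longhand forms can be compared on a temporary file. Using C<File::Temp> here is an assumption made to keep the sketch self-contained:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $file) = tempfile(UNLINK => 1);
close $fh;

# Stacked form: the rightmost test runs first.
my $stacked  = (-r -f -e $file) ? 1 : 0;

# Longhand form, reusing the stat buffer through the special "_" handle.
my $longhand = (-e $file && -f _ && -r _) ? 1 : 0;

# A path that does not exist fails at the first (rightmost) test.
my $missing  = (-f -e "$file.does-not-exist") ? 1 : 0;

print "$stacked $longhand $missing\n";   # prints "1 1 0"
```
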
321 | =head2 UNIVERSAL::DOES() |
322 | |
323 | The C<UNIVERSAL> class has a new method, C<DOES()>. It has been added to |
324 | solve semantic problems with the C<isa()> method. C<isa()> checks for |
325 | inheritance, while C<DOES()> has been designed to be overridden when |
326 | module authors use other types of relations between classes (in addition |
327 | to inheritance). (chromatic) |
328 | |
329 | See L<< UNIVERSAL/"$obj->DOES( ROLE )" >>. |
330 | |
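A sketch of overriding C<DOES()>. The class C<Duck> and the role name C<Quacker> are hypothetical:

```perl
use strict;
use warnings;

{
    package Duck;
    sub new { return bless {}, shift }

    # Claim a relationship that is not expressed through @ISA,
    # and defer everything else to the default implementation.
    sub DOES {
        my ($self, $role) = @_;
        return 1 if $role eq 'Quacker';
        return $self->SUPER::DOES($role);
    }
}

my $duck = Duck->new;
my $does_role  = $duck->DOES('Quacker') ? 1 : 0;   # answered by the override
my $isa_role   = $duck->isa('Quacker')  ? 1 : 0;   # isa() only checks inheritance
my $does_class = $duck->DOES('Duck')    ? 1 : 0;   # default DOES() falls back to isa()

print "$does_role $isa_role $does_class\n";        # prints "1 0 1"
```
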
cf6c151c |
331 | =head2 Formats |
332 | |
333 | Formats were improved in several ways. A new field, C<^*>, can be used for |
334 | variable-width, one-line-at-a-time text. Null characters are now handled |
335 | correctly in picture lines. Using C<@#> and C<~~> together will now |
336 | produce a compile-time error, as those format fields are incompatible. |
337 | L<perlform> has been improved, and miscellaneous bugs fixed. |
338 | |
339 | =head2 Byte-order modifiers for pack() and unpack() |
340 | |
341 | There are two new byte-order modifiers, C<E<gt>> (big-endian) and C<E<lt>> |
342 | (little-endian), that can be appended to most pack() and unpack() template |
343 | characters and groups to force a certain byte-order for that type or group. |
344 | See L<perlfunc/pack> and L<perlpacktut> for details. |
345 | |
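A brief sketch of the byte-order modifiers, with values chosen only for illustration:

```perl
use strict;
use warnings;

# The same 32-bit value packed with an explicit byte order.
my $big    = unpack 'H*', pack('L>', 1);       # big-endian
my $little = unpack 'H*', pack('L<', 1);       # little-endian

# A modifier on a group applies to every item inside it.
my $group  = unpack 'H*', pack('(SL)>', 1, 2);

# Reading a big-endian 16-bit value regardless of the host's byte order.
my ($n) = unpack 'S>', "\x01\x02";

print "$big $little $group $n\n";   # prints "00000001 01000000 000100000002 258"
```
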
cf6c151c |
346 | =head2 C<no VERSION> |
347 | |
348 | You can now use C<no> followed by a version number to specify that you |
349 | want to use a version of perl older than the specified one. |
350 | |
351 | =head2 C<chdir>, C<chmod> and C<chown> on filehandles |
352 | |
353 | C<chdir>, C<chmod> and C<chown> can now work on filehandles as well as |
354 | filenames, if the system supports respectively C<fchdir>, C<fchmod> and |
355 | C<fchown>, thanks to a patch provided by Gisle Aas. |
356 | |
357 | =head2 OS groups |
358 | |
359 | C<$(> and C<$)> now return groups in the order in which the OS returns them, |
360 | thanks to Gisle Aas. This wasn't previously the case. |
361 | |
362 | =head2 Recursive sort subs |
363 | |
364 | You can now use recursive subroutines with sort(), thanks to Robin Houston. |
365 | |
366 | =head2 Exceptions in constant folding |
367 | |
368 | The constant folding routine is now wrapped in an exception handler, and |
369 | if folding throws an exception (such as attempting to evaluate 0/0), perl |
370 | now retains the current optree, rather than aborting the whole program. |
254a8700 |
371 | Without this change, programs would not compile if they had expressions that |
372 | happened to generate exceptions, even though those expressions were in code |
373 | that could never be reached at runtime. (Nicholas Clark, Dave Mitchell) |
cf6c151c |
374 | |
375 | =head2 Source filters in @INC |
376 | |
377 | It's possible to enhance the mechanism of subroutine hooks in @INC by |
378 | adding a source filter on top of the filehandle opened and returned by the |
379 | hook. This feature was planned a long time ago, but wasn't quite working |
380 | until now. See L<perlfunc/require> for details. (Nicholas Clark) |
381 | |
382 | =head2 New internal variables |
383 | |
384 | =over 4 |
385 | |
386 | =item C<${^RE_DEBUG_FLAGS}> |
387 | |
388 | This variable controls what debug flags are in effect for the regular |
389 | expression engine when running under C<use re "debug">. See L<re> for |
390 | details. |
391 | |
392 | =item C<${^CHILD_ERROR_NATIVE}> |
393 | |
394 | This variable gives the native status returned by the last pipe close, |
395 | backtick command, successful call to wait() or waitpid(), or from the |
396 | system() operator. See L<perlrun> for details. (Contributed by Gisle Aas.) |
397 | |
597bb945 |
398 | =item C<${^RE_TRIE_MAXBUF}> |
399 | |
400 | See L</"Trie optimisation of literal string alternations">. |
401 | |
402 | =item C<${^WIN32_SLOPPY_STAT}> |
403 | |
404 | See L</"Sloppy stat on Windows">. |
405 | |
cf6c151c |
406 | =back |
407 | |
408 | =head2 Miscellaneous |
409 | |
410 | C<unpack()> now defaults to unpacking the C<$_> variable. |
411 | |
412 | C<mkdir()> without arguments now defaults to C<$_>. |
413 | |
414 | The internal dump output has been improved, so that non-printable characters |
415 | such as newline and backspace are output in C<\x> notation, rather than |
416 | octal. |
417 | |
418 | The B<-C> option can no longer be used on the C<#!> line. It wasn't |
cba8bf60 |
419 | working there anyway, since the standard streams are already set up |
420 | at this point in the execution of the perl interpreter. You can use |
421 | binmode() instead to get the desired behaviour. |
cf6c151c |
422 | |
423 | =head2 UCD 5.0.0 |
424 | |
425 | The copy of the Unicode Character Database included in Perl 5 has |
426 | been updated to version 5.0.0. |
427 | |
cf6c151c |
428 | =head2 MAD |
429 | |
254a8700 |
430 | MAD, which stands for I<Miscellaneous Attribute Decoration>, is a |
cf6c151c |
431 | still-in-development work leading to a Perl 5 to Perl 6 converter. To |
432 | enable it, it's necessary to pass the argument C<-Dmad> to Configure. The |
254a8700 |
433 | obtained perl isn't binary compatible with a regular perl 5.10, and has |
cf6c151c |
434 | space and speed penalties; moreover not all regression tests still pass |
435 | with it. (Larry Wall, Nicholas Clark) |
436 | |
c7d332a5 |
437 | =head2 kill() on Windows |
438 | |
439 | On Windows platforms, C<kill(-9, $pid)> now kills a process tree. |
440 | (On UNIX, this delivers the signal to all processes in the same process |
441 | group.) |
442 | |
597bb945 |
443 | =head1 Incompatible Changes |
444 | |
445 | =head2 Packing and UTF-8 strings |
446 | |
447 | =for XXX update this |
448 | |
449 | The semantics of pack() and unpack() regarding UTF-8-encoded data have been |
450 | changed. Processing is now by default character per character instead of |
451 | byte per byte on the underlying encoding. Notably, code that used things |
452 | like C<pack("a*", $string)> to see through the encoding of a string will now |
453 | simply get back the original $string. Packed strings can also get upgraded |
454 | during processing when you store upgraded characters. You can get the old |
455 | behaviour by using C<use bytes>. |
456 | |
457 | To be consistent with pack(), the C<C0> in unpack() templates indicates |
458 | that the data is to be processed in character mode, i.e. character by |
459 | character; on the contrary, C<U0> in unpack() indicates UTF-8 mode, where |
460 | the packed string is processed in its UTF-8-encoded Unicode form on a byte |
254a8700 |
461 | by byte basis. This is reversed with regard to perl 5.8.X, but now consistent |
462 | between pack() and unpack(). |
597bb945 |
463 | |
464 | Moreover, C<C0> and C<U0> can also be used in pack() templates to specify |
465 | respectively character and byte modes. |
466 | |
467 | C<C0> and C<U0> in the middle of a pack or unpack format now switch to the |
468 | specified encoding mode, honoring parens grouping. Previously, parens were |
469 | ignored. |
470 | |
471 | Also, there is a new pack() character format, C<W>, which is intended to |
472 | replace the old C<C>. C<C> is kept for unsigned chars coded as bytes in |
473 | the string's internal representation. C<W> represents unsigned (logical) |
474 | character values, which can be greater than 255. It is therefore more |
475 | robust when dealing with potentially UTF-8-encoded data (as C<C> will wrap |
476 | values outside the range 0..255, and not respect the string encoding). |
477 | |
478 | In practice, that means that pack formats are now encoding-neutral, except |
479 | C<C>. |
480 | |
481 | For consistency, C<A> in unpack() format now trims all Unicode whitespace |
482 | from the end of the string. Before perl 5.9.2, it used to strip only the |
483 | classical ASCII space characters. |
484 | |
485 | =head2 Byte/character count feature in unpack() |
486 | |
487 | A new unpack() template character, C<".">, returns the number of bytes or |
488 | characters (depending on the selected encoding mode, see above) read so far. |
489 | |
490 | =head2 The C<$*> and C<$#> variables have been removed |
491 | |
492 | C<$*>, which was deprecated in favor of the C</s> and C</m> regexp |
493 | modifiers, has been removed. |
494 | |
495 | The deprecated C<$#> variable (output format for numbers) has been |
496 | removed. |
497 | |
f00638a2 |
498 | Two new severe warnings, C<$#/$* is no longer supported>, have been added. |
597bb945 |
499 | |
500 | =head2 substr() lvalues are no longer fixed-length |
501 | |
502 | The lvalues returned by the three argument form of substr() used to be a |
503 | "fixed length window" on the original string. In some cases this could |
504 | cause surprising action at a distance or other undefined behaviour. Now the |
505 | length of the window adjusts itself to the length of the string assigned to |
506 | it. |
507 | |
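The adjusting window can be observed by aliasing the lvalue in a C<for> loop. This is a sketch; the sample string and the C<@seen> array are illustrative assumptions.

```perl
use strict;
use warnings;

my $s = 'abcdef';
my @seen;

# $_ is aliased to the substr() lvalue covering "bc".
for (substr($s, 1, 2)) {
    $_ = 'XYZ';            # the window grows to cover the three new characters
    push @seen, $s;
    $_ = '-';              # so this replaces "XYZ", not just its first two characters
    push @seen, $s;
}

print "@seen\n";           # prints "aXYZdef a-def"
```
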
508 | =head2 Parsing of C<-f _> |
509 | |
510 | The identifier C<_> is now forced to be a bareword after a filetest |
511 | operator. This solves a number of misparsing issues when a global C<_> |
512 | subroutine is defined. |
513 | |
514 | =head2 C<:unique> |
515 | |
516 | The C<:unique> attribute has been made a no-op, since its current |
517 | implementation was fundamentally flawed and not threadsafe. |
518 | |
597bb945 |
519 | =head2 Effect of pragmas in eval |
520 | |
521 | The compile-time value of the C<%^H> hint variable can now propagate into |
522 | eval("")uated code. This makes it more useful to implement lexical |
523 | pragmas. |
524 | |
525 | As a side-effect of this, the overloaded-ness of constants now propagates |
526 | into eval(""). |
527 | |
528 | =head2 chdir FOO |
529 | |
530 | A bareword argument to chdir() is now recognized as a file handle. |
531 | Earlier releases interpreted the bareword as a directory name. |
532 | (Gisle Aas) |
533 | |
534 | =head2 Handling of .pmc files |
535 | |
536 | An old feature of perl was that before C<require> or C<use> look for a |
537 | file with a F<.pm> extension, they will first look for a similar filename |
538 | with a F<.pmc> extension. If this file is found, it will be loaded in |
539 | place of any potentially existing file ending in a F<.pm> extension. |
540 | |
541 | Previously, F<.pmc> files were loaded only if more recent than the |
542 | matching F<.pm> file. Starting with 5.9.4, they'll always be loaded if |
543 | they exist. |
544 | |
545 | =head2 @- and @+ in patterns |
546 | |
547 | The special arrays C<@-> and C<@+> are no longer interpolated in regular |
548 | expressions. (Sadahiro Tomoyuki) |
549 | |
550 | =head2 $AUTOLOAD can now be tainted |
551 | |
552 | If you call a subroutine by a tainted name, and if it defers to an |
553 | AUTOLOAD function, then $AUTOLOAD will be (correctly) tainted. |
554 | (Rick Delaney) |
555 | |
556 | =head2 Tainting and printf |
557 | |
558 | When perl is run under taint mode, C<printf()> and C<sprintf()> will now |
559 | reject any tainted format argument. (Rafael Garcia-Suarez) |
560 | |
561 | =head2 undef and signal handlers |
562 | |
563 | Undefining or deleting a signal handler via C<undef $SIG{FOO}> is now |
564 | equivalent to setting it to C<'DEFAULT'>. (Rafael Garcia-Suarez) |
565 | |
566 | =head2 strictures and dereferencing in defined() |
567 | |
254a8700 |
568 | C<use strict 'refs'> was ignoring taking a hard reference in an argument |
597bb945 |
569 | to defined(), as in: |
570 | |
254a8700 |
571 | use strict 'refs'; |
572 | my $x = 'foo'; |
597bb945 |
573 | if (defined $$x) {...} |
574 | |
575 | This now correctly produces the run-time error C<Can't use string as a |
576 | SCALAR ref while "strict refs" in use>. |
577 | |
578 | C<defined @$foo> and C<defined %$bar> are now also subject to C<strict |
579 | 'refs'> (that is, C<$foo> and C<$bar> shall be proper references there.) |
580 | (C<defined(@foo)> and C<defined(%bar)> are discouraged constructs anyway.) |
581 | (Nicholas Clark) |
582 | |
583 | =head2 C<(?p{})> has been removed |
584 | |
585 | The regular expression construct C<(?p{})>, which was deprecated in perl |
586 | 5.8, has been removed. Use C<(??{})> instead. (Rafael Garcia-Suarez) |
587 | |
588 | =head2 Pseudo-hashes have been removed |
589 | |
590 | Support for pseudo-hashes has been removed from Perl 5.9. (The C<fields> |
591 | pragma remains here, but uses an alternate implementation.) |
592 | |
593 | =head2 Removal of the bytecode compiler and of perlcc |
594 | |
595 | C<perlcc>, the byteloader and the supporting modules (B::C, B::CC, |
596 | B::Bytecode, etc.) are no longer distributed with the perl sources. Those |
597 | experimental tools have never worked reliably, and, due to the lack of |
598 | volunteers to keep them in line with the perl interpreter developments, it |
599 | was decided to remove them instead of shipping a broken version of those. |
600 | The last version of those modules can be found with perl 5.9.4. |
601 | |
602 | However, the B compiler framework stays supported in the perl core, along with |
603 | the more useful modules it has made possible (among others, B::Deparse and |
604 | B::Concise). |
605 | |
606 | =head2 Removal of the JPL |
607 | |
ed8ea1b6 |
608 | The JPL (Java-Perl Lingo) has been removed from the perl sources tarball. |
597bb945 |
609 | |
610 | =head2 Recursive inheritance detected earlier |
611 | |
612 | Perl will now immediately throw an exception if you modify any package's |
613 | C<@ISA> in such a way that it would cause recursive inheritance. |
614 | |
615 | Previously, the exception would not occur until Perl attempted to make |
616 | use of the recursive inheritance while resolving a method or doing a |
617 | C<$foo-E<gt>isa($bar)> lookup. |
618 | |
cf6c151c |
619 | =head1 Modules and Pragmata |
c0c97549 |
620 | |
f0e260b8 |
621 | =head2 Pragmata Changes |
622 | |
623 | =over 4 |
624 | |
625 | =item C<feature> |
626 | |
627 | The new pragma C<feature> is used to enable new features that might break |
628 | old code. See L</"The C<feature> pragma"> above. |
629 | |
630 | =item C<mro> |
631 | |
632 | This new pragma lets you change the algorithm used to resolve inherited |
633 | methods. See L</"New Pragma, C<mro>"> above. |
634 | |
635 | =item Scoping of the C<sort> pragma |
636 | |
637 | The C<sort> pragma is now lexically scoped. Its effect used to be global. |
638 | |
639 | =item Scoping of C<bignum>, C<bigint>, C<bigrat> |
640 | |
641 | The three numeric pragmas C<bignum>, C<bigint> and C<bigrat> are now |
642 | lexically scoped. (Tels) |
643 | |
644 | =item C<base> |
645 | |
646 | The C<base> pragma now warns if a class tries to inherit from itself. |
647 | (Curtis "Ovid" Poe) |
648 | |
649 | =item C<strict> and C<warnings> |
650 | |
651 | C<strict> and C<warnings> will now complain loudly if they are loaded via |
652 | incorrect casing (as in C<use Strict;>). (Johan Vromans) |
653 | |
6601a838 |
654 | =item C<version> |
655 | |
656 | The C<version> module provides support for version objects. |
657 | |
f0e260b8 |
658 | =item C<warnings> |
659 | |
660 | The C<warnings> pragma doesn't load C<Carp> anymore. That means that code |
661 | that used C<Carp> routines without having loaded it at compile time might |
662 | need to be adjusted; typically, the following (faulty) code won't work |
663 | anymore, and will require parentheses to be added after the function name: |
664 | |
665 | use warnings; |
666 | require Carp; |
254a8700 |
667 | Carp::confess 'argh'; |
f0e260b8 |
668 | |
669 | =item C<less> |
670 | |
671 | C<less> now does something useful (or at least it tries to). In fact, it |
672 | has been turned into a lexical pragma. So, in your modules, you can now |
673 | test whether your users have requested to use less CPU, or less memory, |
674 | less magic, or maybe even less fat. See L<less> for more. (Joshua ben |
675 | Jore) |
676 | |
677 | =back |
678 | |
0eece9c0 |
679 | =head2 New modules |
680 | |
681 | =over 4 |
682 | |
683 | =item * |
684 | |
685 | C<encoding::warnings>, by Audrey Tang, is a module to emit warnings |
686 | whenever an ASCII character string containing high-bit bytes is implicitly |
597bb945 |
687 | converted into UTF-8. It's a lexical pragma since Perl 5.9.4; on older |
688 | perls, its effect is global. |
0eece9c0 |
689 | |
690 | =item * |
691 | |
692 | C<Module::CoreList>, by Richard Clamp, is a small handy module that tells |
693 | you which versions of core modules ship with any version of Perl 5. It |
694 | comes with a command-line frontend, C<corelist>. |
695 | |
bd3831ee |
696 | =item * |
697 | |
698 | C<Math::BigInt::FastCalc> is an XS-enabled, and thus faster, version of |
699 | C<Math::BigInt::Calc>. |
700 | |
701 | =item * |
702 | |
703 | C<Compress::Zlib> is an interface to the zlib compression library. It |
704 | comes with a bundled version of zlib, so having a working zlib is not a |
705 | prerequisite to install it. It's used by C<Archive::Tar> (see below). |
706 | |
707 | =item * |
708 | |
709 | C<IO::Zlib> is an C<IO::>-style interface to C<Compress::Zlib>. |
710 | |
711 | =item * |
712 | |
713 | C<Archive::Tar> is a module to manipulate C<tar> archives. |
714 | |
715 | =item * |
716 | |
717 | C<Digest::SHA>, a module used to calculate many types of SHA digests, |
718 | has been included for SHA support in the CPAN module. |
719 | |
720 | =item * |
721 | |
722 | C<ExtUtils::CBuilder> and C<ExtUtils::ParseXS> have been added. |
723 | |
597bb945 |
724 | =item * |
725 | |
726 | C<Hash::Util::FieldHash>, by Anno Siegel, has been added. This module |
727 | provides support for I<field hashes>: hashes that maintain an association |
728 | of a reference with a value, in a thread-safe garbage-collected way. |
729 | Such hashes are useful to implement inside-out objects. |
730 | |
731 | =item * |
732 | |
733 | C<Module::Build>, by Ken Williams, has been added. It's an alternative to |
734 | C<ExtUtils::MakeMaker> to build and install perl modules. |
735 | |
736 | =item * |
737 | |
738 | C<Module::Load>, by Jos Boumans, has been added. It provides a single |
739 | interface to load Perl modules and F<.pl> files. |
740 | |
741 | =item * |
742 | |
743 | C<Module::Loaded>, by Jos Boumans, has been added. It's used to mark |
744 | modules as loaded or unloaded. |
745 | |
746 | =item * |
747 | |
748 | C<Package::Constants>, by Jos Boumans, has been added. It's a simple |
749 | helper to list all constants declared in a given package. |
750 | |
751 | =item * |
752 | |
753 | C<Win32API::File>, by Tye McQueen, has been added (for Windows builds). |
754 | This module provides low-level access to Win32 system API calls for |
755 | files/dirs. |
756 | |
f0e260b8 |
757 | =item * |
758 | |
759 | C<Locale::Maketext::Simple>, needed by CPANPLUS, is a simple wrapper around |
760 | C<Locale::Maketext::Lexicon>. Note that C<Locale::Maketext::Lexicon> isn't |
761 | included in the perl core; the behaviour of C<Locale::Maketext::Simple> |
762 | gracefully degrades when the latter isn't present. |
763 | |
764 | =item * |
765 | |
766 | C<Params::Check> implements a generic input parsing/checking mechanism. It |
767 | is used by CPANPLUS. |
768 | |
769 | =item * |
770 | |
771 | C<Term::UI> simplifies the task of asking questions at a terminal prompt. |
772 | |
773 | =item * |
774 | |
775 | C<Object::Accessor> provides an interface to create per-object accessors. |
776 | |
777 | =item * |
778 | |
779 | C<Module::Pluggable> is a simple framework to create modules that accept |
780 | pluggable sub-modules. |
781 | |
782 | =item * |
783 | |
784 | C<Module::Load::Conditional> provides simple ways to query and possibly |
785 | load installed modules. |
786 | |
787 | =item * |
788 | |
789 | C<Time::Piece> provides an object oriented interface to time functions, |
790 | overriding the built-ins localtime() and gmtime(). |
791 | |
792 | =item * |
793 | |
794 | C<IPC::Cmd> helps to find and run external commands, possibly |
795 | interactively. |
796 | |
797 | =item * |
798 | |
799 | C<File::Fetch> provides a simple generic file fetching mechanism. |
800 | |
801 | =item * |
802 | |
803 | C<Log::Message> and C<Log::Message::Simple> are used by the log facility |
804 | of C<CPANPLUS>. |
805 | |
806 | =item * |
807 | |
808 | C<Archive::Extract> is a generic archive extraction mechanism |
809 | for F<.tar> (plain, gzipped or bzipped) or F<.zip> files. |
810 | |
811 | =item * |
812 | |
813 | C<CPANPLUS> provides an API and a command-line tool to access the CPAN |
814 | mirrors. |
815 | |
e6746346 |
816 | =item * |
817 | |
818 | C<Pod::Escapes> provides utilities that are useful in decoding Pod |
819 | EE<lt>...E<gt> sequences. |
820 | |
821 | =item * |
822 | |
823 | C<Pod::Simple> is now the backend for several of the Pod-related modules |
824 | included with Perl. |
825 | |
f0e260b8 |
826 | =back |
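
As an illustration of the field hashes provided by C<Hash::Util::FieldHash> (listed above), here is a minimal sketch of an inside-out object. The C<Counter> class and its method names are hypothetical and exist only for this example; the sketch assumes nothing beyond the C<fieldhash> function exported by the module.

```perl
use strict;
use warnings;

package Counter;    # hypothetical class, for illustration only

use Hash::Util::FieldHash qw(fieldhash);

# One field hash per attribute, keyed by the object reference itself.
fieldhash my %count;

sub new {
    my ($class) = @_;
    my $self = bless \do { my $anon }, $class;
    $count{$self} = 0;
    return $self;
}

sub increment { my ($self) = @_; return ++$count{$self} }
sub value     { my ($self) = @_; return $count{$self} }

package main;

my $counter = Counter->new;
$counter->increment for 1 .. 3;
my $value = $counter->value;

# Dropping the last reference lets the field hash discard its entry.
undef $counter;
my $remaining = keys %count;
```

The object itself is an empty blessed scalar; all of its state lives in the field hash, which is why no explicit C<DESTROY> method is needed to clean up.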
827 | |
828 | =head2 Selected Changes to Core Modules |
829 | |
830 | =over 4 |
831 | |
832 | =item C<Attribute::Handlers> |
833 | |
834 | C<Attribute::Handlers> can now report the caller's file and line number. |
835 | (David Feldman) |
836 | |
6cdf4617 |
837 | All interpreted attributes are now passed as array references. (Damian |
838 | Conway) |
839 | |
f0e260b8 |
840 | =item C<B::Lint> |
841 | |
842 | C<B::Lint> is now based on C<Module::Pluggable>, and so can be extended |
843 | with plugins. (Joshua ben Jore) |
844 | |
845 | =item C<B> |
846 | |
847 | It's now possible to access the lexical pragma hints (C<%^H>) by using the |
848 | method B::COP::hints_hash(). It returns a C<B::RHE> object, which in turn |
849 | can be used to get a hash reference via the method B::RHE::HASH(). (Joshua |
850 | ben Jore) |
851 | |
852 | =item C<Thread> |
853 | |
854 | As the old 5005thread threading model has been removed, in favor of the |
855 | ithreads scheme, the C<Thread> module is now a compatibility wrapper, to |
856 | be used in old code only. It has been removed from the default list of |
857 | dynamic extensions. |
858 | |
0eece9c0 |
859 | =back |
860 | |
cf6c151c |
861 | =head1 Utility Changes |
c0c97549 |
862 | |
863 | =over 4 |
864 | |
bd3831ee |
865 | =item perl -d |
c0c97549 |
866 | |
867 | The Perl debugger can now save all debugger commands for sourcing later; |
868 | notably, it can now emulate stepping backwards, by restarting and |
869 | rerunning all bar the last command from a saved command history. |
870 | |
871 | It can also display the parent inheritance tree of a given class, with the |
872 | C<i> command. |
873 | |
bd3831ee |
874 | =item ptar |
875 | |
292c2b28 |
876 | C<ptar> is a pure perl implementation of C<tar> that comes with |
bd3831ee |
877 | C<Archive::Tar>. |
878 | |
879 | =item ptardiff |
880 | |
254a8700 |
881 | C<ptardiff> is a small utility used to generate a diff between the contents |
bd3831ee |
882 | of a tar archive and a directory tree. Like C<ptar>, it comes with |
883 | C<Archive::Tar>. |
884 | |
885 | =item shasum |
886 | |
887 | C<shasum> is a command-line utility, used to print or to check SHA |
888 | digests. It comes with the new C<Digest::SHA> module. |
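
The same digests that C<shasum> prints can be computed from Perl code with C<Digest::SHA>. The following is a small sketch using the module's functional interface; the input string is an arbitrary example.

```perl
use strict;
use warnings;

use Digest::SHA qw(sha1_hex sha256_hex);

my $data = 'abc';    # arbitrary example input

# Hexadecimal digests of the same data with two SHA variants.
my $sha1   = sha1_hex($data);
my $sha256 = sha256_hex($data);

print "SHA-1:   $sha1\n";
print "SHA-256: $sha256\n";
```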
889 | |
890 | =item corelist |
0eece9c0 |
891 | |
892 | The C<corelist> utility is now installed with perl (see L</"New modules"> |
893 | above). |
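
The information that C<corelist> reports is also available programmatically. As a rough sketch, assuming the C<first_release> method of C<Module::CoreList>, one can ask when a module first shipped with perl:

```perl
use strict;
use warnings;

use Module::CoreList;

# The perl version (as a number) that first bundled the given module.
my $module = 'Module::CoreList';
my $first  = Module::CoreList->first_release($module);

print "$module first shipped with perl $first\n";
```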
894 | |
bd3831ee |
895 | =item h2ph and h2xs |
0eece9c0 |
896 | |
254a8700 |
897 | C<h2ph> and C<h2xs> have been made more robust with regard to |
0eece9c0 |
898 | "modern" C code. |
899 | |
bd3831ee |
900 | C<h2xs> implements a new option C<--use-xsloader> to force use of |
901 | C<XSLoader> even in backwards compatible modules. |
902 | |
903 | The handling of authors' names that had apostrophes has been fixed. |
904 | |
905 | Any enums with negative values are now skipped. |
906 | |
907 | =item perlivp |
908 | |
909 | C<perlivp> no longer checks for F<*.ph> files by default. Use the new C<-a> |
910 | option to run I<all> tests. |
911 | |
912 | =item find2perl |
0eece9c0 |
913 | |
914 | C<find2perl> now assumes C<-print> as a default action. Previously, it |
915 | needed to be specified explicitly. |
916 | |
917 | Several bugs have been fixed in C<find2perl>, regarding C<-exec> and |
918 | C<-eval>. Also the options C<-path>, C<-ipath> and C<-iname> have been |
919 | added. |
920 | |
597bb945 |
921 | =item config_data |
922 | |
923 | C<config_data> is a new utility that comes with C<Module::Build>. It |
924 | provides a command-line interface to the configuration of Perl modules |
925 | that use Module::Build's framework of configurability (that is, |
926 | C<*::ConfigData> modules that contain local configuration information for |
927 | their parent modules). |
928 | |
f00638a2 |
929 | =item cpanp |
f0e260b8 |
930 | |
254a8700 |
931 | C<cpanp>, the CPANPLUS shell, has been added. (C<cpanp-run-perl>, a |
f0e260b8 |
932 | helper for CPANPLUS operation, has been added too, but isn't intended for |
933 | direct use). |
934 | |
f00638a2 |
935 | =item cpan2dist |
f0e260b8 |
936 | |
292c2b28 |
937 | C<cpan2dist> is a new utility that comes with CPANPLUS. It's a tool to |
f0e260b8 |
938 | create distributions (or packages) from CPAN modules. |
939 | |
f00638a2 |
940 | =item pod2html |
f0e260b8 |
941 | |
942 | The output of C<pod2html> has been enhanced to be more customizable via |
943 | CSS. Some formatting problems were also corrected. (Jari Aalto) |
944 | |
c0c97549 |
945 | =back |
946 | |
cf6c151c |
947 | =head1 New Documentation |
c0c97549 |
948 | |
597bb945 |
949 | The L<perlpragma> manpage documents how to write one's own lexical |
950 | pragmas in pure Perl (something that is possible starting with 5.9.4). |
951 | |
bd3831ee |
952 | The new L<perlglossary> manpage is a glossary of terms used in the Perl |
953 | documentation, technical and otherwise, kindly provided by O'Reilly Media, |
954 | Inc. |
955 | |
597bb945 |
956 | The L<perlreguts> manpage, courtesy of Yves Orton, describes internals of the |
957 | Perl regular expression engine. |
958 | |
62c26f88 |
959 | The L<perlreapi> manpage describes the interface to the perl interpreter |
960 | used to write pluggable regular expression engines (by Ævar Arnfjörð |
961 | Bjarmason). |
962 | |
597bb945 |
963 | The L<perlunitut> manpage is a tutorial for programming with Unicode and |
964 | string encodings in Perl, courtesy of Juerd Waalboer. |
965 | |
f0e260b8 |
966 | A new manual page, L<perlunifaq> (the Perl Unicode FAQ), has been added |
967 | (Juerd Waalboer). |
968 | |
dbef3c66 |
969 | The L<perlcommunity> manpage gives a description of the Perl community |
970 | on the Internet and in real life. (Edgar "Trizor" Bering) |
971 | |
f00638a2 |
972 | The L<CORE> manual page documents the C<CORE::> namespace. (Tels) |
973 | |
c0c97549 |
974 | The long-existing feature of C</(?{...})/> regexps setting C<$_> and pos() |
975 | is now documented. |
976 | |
cf6c151c |
977 | =head1 Performance Enhancements |
c0c97549 |
978 | |
597bb945 |
979 | =head2 In-place sorting |
0eece9c0 |
980 | |
c0c97549 |
981 | Sorting arrays in place (C<@a = sort @a>) is now optimized to avoid |
982 | making a temporary copy of the array. |
983 | |
0eece9c0 |
984 | Likewise, C<reverse sort ...> is now optimized to sort in reverse, |
985 | avoiding the generation of a temporary intermediate list. |
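
A short sketch of the two forms mentioned above. The optimisations are internal and not directly observable from Perl code, so this only shows which statements they apply to; the array contents are arbitrary.

```perl
use strict;
use warnings;

my @names = qw(pear apple fig);

# Assigning the sorted list back to the same array is the form
# that the in-place optimisation targets.
@names = sort @names;

# Descending order without a custom comparison block.
my @descending = reverse sort @names;

print "@names\n";
print "@descending\n";
```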
986 | |
597bb945 |
987 | =head2 Lexical array access |
0eece9c0 |
988 | |
c0c97549 |
989 | Access to elements of lexical arrays via a numeric constant between 0 and |
990 | 255 is now faster. (This used to be only the case for global arrays.) |
991 | |
597bb945 |
992 | =head2 XS-assisted SWASHGET |
bd3831ee |
993 | |
994 | Some pure-perl code that perl was using to retrieve Unicode properties and |
995 | transliteration mappings has been reimplemented in XS. |
996 | |
597bb945 |
997 | =head2 Constant subroutines |
bd3831ee |
998 | |
999 | The interpreter internals now support a far more memory efficient form of |
1000 | inlineable constants. Storing a reference to a constant value in a symbol |
1001 | table is equivalent to a full typeglob referencing a constant subroutine, |
1002 | but using about 400 bytes less memory. This proxy constant subroutine is |
1003 | automatically upgraded to a real typeglob with subroutine if necessary. |
1004 | The approach taken is analogous to the existing space optimisation for |
1005 | subroutine stub declarations, which are stored as plain scalars in place |
1006 | of the full typeglob. |
1007 | |
1008 | Several of the core modules have been converted to use this feature for |
1009 | their system dependent constants - as a result C<use POSIX;> now takes about |
1010 | 200K less memory. |
1011 | |
597bb945 |
1012 | =head2 C<PERL_DONT_CREATE_GVSV> |
bd3831ee |
1013 | |
1014 | The new compilation flag C<PERL_DONT_CREATE_GVSV>, introduced as an option |
1015 | in perl 5.8.8, has been turned on by default since perl 5.9.3. It prevents perl |
1016 | from creating an empty scalar with every new typeglob. See L<perl588delta> |
1017 | for details. |
1018 | |
597bb945 |
1019 | =head2 Weak references are cheaper |
bd3831ee |
1020 | |
1021 | Weak reference creation is now I<O(1)> rather than I<O(n)>, courtesy of |
1022 | Nicholas Clark. Weak reference deletion remains I<O(n)>, but if deletion only |
1023 | happens at program exit, it may be skipped completely. |
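
For readers unfamiliar with weak references, here is a minimal sketch of the kind of code that benefits, using C<weaken> and C<isweak> from C<Scalar::Util>. The parent/child data structure is made up for the example.

```perl
use strict;
use warnings;

use Scalar::Util qw(weaken isweak);

# A parent that refers to its child, and a child that refers back.
my $parent = { name => 'root', children => [] };
my $child  = { name => 'leaf', parent   => $parent };
push @{ $parent->{children} }, $child;

# Weakening the back-reference breaks the cycle, so both hashes
# can be freed once nothing else refers to them.
weaken( $child->{parent} );

my $is_weak = isweak( $child->{parent} );
```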
1024 | |
597bb945 |
1025 | =head2 sort() enhancements |
bd3831ee |
1026 | |
1027 | Salvador Fandiño provided improvements to reduce the memory usage of C<sort> |
1028 | and to speed up some cases. |
1029 | |
597bb945 |
1030 | =head2 Memory optimisations |
1031 | |
1032 | Several internal data structures (typeglobs, GVs, CVs, formats) have been |
1033 | restructured to use less memory. (Nicholas Clark) |
1034 | |
1035 | =head2 UTF-8 cache optimisation |
1036 | |
1037 | The UTF-8 caching code is now more efficient, and used more often. |
1038 | (Nicholas Clark) |
1039 | |
1040 | =head2 Sloppy stat on Windows |
1041 | |
1042 | On Windows, perl's stat() function normally opens the file to determine |
1043 | the link count and update attributes that may have been changed through |
1044 | hard links. Setting ${^WIN32_SLOPPY_STAT} to a true value speeds up |
1045 | stat() by not performing this operation. (Jan Dubois) |
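
A minimal sketch of how the variable might be used, limiting the setting to one block with C<local>. The variable only matters on Windows; on other platforms assigning to it is harmless. The current directory is used as an arbitrary target.

```perl
use strict;
use warnings;

my @fields;
{
    # Only meaningful on Windows; elsewhere this is an ordinary variable.
    local ${^WIN32_SLOPPY_STAT} = 1;
    @fields = stat '.';    # current directory, an arbitrary example target
}

# Fields such as the modification time are still available.
my $mtime = $fields[9];
```

With the sloppy setting, the link count and attributes that depend on opening the file are the ones that may be inaccurate.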
1046 | |
597bb945 |
1047 | =head2 Regular expression optimisations |
1048 | |
1049 | =over 4 |
1050 | |
1051 | =item Engine de-recursivised |
1052 | |
1053 | The regular expression engine is no longer recursive, meaning that |
1054 | patterns that used to overflow the stack will now either die with a useful |
1055 | explanation or run to completion. Since such patterns were able to blow |
1056 | the stack before, completion will likely take a very long time. If you were |
1057 | experiencing the occasional stack overflow (or segfault) and upgrade to |
1058 | discover that now perl apparently hangs instead, look for a degenerate |
1059 | regex. (Dave Mitchell) |
1060 | |
1061 | =item Single char char-classes treated as literals |
1062 | |
1063 | Classes of a single character are now treated the same as if the character |
1064 | had been used as a literal, meaning that code that uses char-classes as an |
1065 | escaping mechanism will see a speedup. (Yves Orton) |
1066 | |
1067 | =item Trie optimisation of literal string alternations |
1068 | |
1069 | Alternations, where possible, are optimised into more efficient matching |
1070 | structures. String literal alternations are merged into a trie and are |
1071 | matched simultaneously. This means that instead of O(N) time for matching |
1072 | N alternations at a given point, the new code performs in O(1) time. |
1073 | A new special variable, ${^RE_TRIE_MAXBUF}, has been added to fine-tune |
1074 | this optimization. (Yves Orton) |
1075 | |
1076 | B<Note:> Much code exists that works around perl's historic poor |
1077 | performance on alternations. Often the tricks used to do so will disable |
1078 | the new optimisations. Hopefully the utility modules used for this purpose |
99d59c4d |
1079 | will be educated about these new optimisations. |
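
As a sketch of the kind of pattern this optimisation is aimed at, the following builds a plain alternation of literal strings. Whether a trie is actually used is an internal decision of the regex engine; the word list is arbitrary.

```perl
use strict;
use warnings;

my @keywords = qw(foo foobar bar baz);

# A plain alternation of literal strings: the kind of pattern the
# trie optimisation is meant for.
my $alternation = join '|', map { quotemeta($_) } @keywords;
my $regex       = qr/\b(?:$alternation)\b/;

my @hits = grep { $_ =~ $regex } qw(foo qux baz foobar food);

print "@hits\n";
```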
597bb945 |
1080 | |
1081 | =item Aho-Corasick start-point optimisation |
1082 | |
1083 | When a pattern starts with a trie-able alternation and there aren't |
e15dad31 |
1084 | better optimisations available, the regex engine will use Aho-Corasick |
597bb945 |
1085 | matching to find the start point. (Yves Orton) |
1086 | |
0eece9c0 |
1087 | =back |
1088 | |
cf6c151c |
1089 | =head1 Installation and Configuration Improvements |
c0c97549 |
1090 | |
597bb945 |
1091 | =head2 Configuration improvements |
1092 | |
1093 | =over 4 |
1094 | |
1095 | =item C<-Dusesitecustomize> |
bd3831ee |
1096 | |
0eece9c0 |
1097 | Run-time customization of @INC can be enabled by passing the |
597bb945 |
1098 | C<-Dusesitecustomize> flag to Configure. When enabled, this will make perl |
0eece9c0 |
1099 | run F<$sitelibexp/sitecustomize.pl> before anything else. This script can |
1100 | then be set up to add additional entries to @INC. |
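
A F<sitecustomize.pl> can be as small as the following sketch. The directory name is hypothetical; substitute whatever path is appropriate for the site.

```perl
# Hypothetical contents of sitecustomize.pl
use strict;
use warnings;

my $site_dir = '/opt/site/perl5/lib';    # hypothetical path

# Put the directory at the front of @INC, but only once.
unshift @INC, $site_dir unless grep { $_ eq $site_dir } @INC;
```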
1101 | |
597bb945 |
1102 | =item Relocatable installations |
1103 | |
1104 | There is now Configure support for creating a relocatable perl tree. If |
1105 | you Configure with C<-Duserelocatableinc>, then the paths in @INC (and |
1106 | everything else in %Config) can be optionally located via the path of the |
1107 | perl executable. |
1108 | |
1109 | That means that, if the string C<".../"> is found at the start of any |
1110 | path, it's substituted with the directory of $^X. So, the relocation can |
1111 | be configured on a per-directory basis, although the default with |
1112 | C<-Duserelocatableinc> is that everything is relocated. The initial |
1113 | install is done to the original configured prefix. |
1114 | |
1115 | =item strlcat() and strlcpy() |
1116 | |
1117 | The configuration process now detects whether strlcat() and strlcpy() are |
1118 | available. When they are not available, perl's own version is used (from |
1119 | Russ Allbery's public domain implementation). Various places in the perl |
1120 | interpreter now use them. (Steve Peters) |
1121 | |
f0e260b8 |
1122 | =item C<d_pseudofork> and C<d_printf_format_null> |
1123 | |
1124 | A new configuration variable, available as C<$Config{d_pseudofork}> in |
1125 | the L<Config> module, has been added, to distinguish real fork() support |
1126 | from fake pseudofork used on Windows platforms. |
1127 | |
1128 | A new configuration variable, C<d_printf_format_null>, has been added, |
1129 | to see if printf-like formats are allowed to be NULL. |
1130 | |
1131 | =item Configure help |
1132 | |
1133 | C<Configure -h> has been extended with the most commonly used options. |
1134 | |
597bb945 |
1135 | =back |
1136 | |
1137 | =head2 Compilation improvements |
1138 | |
1139 | =over 4 |
1140 | |
1141 | =item Parallel build |
0eece9c0 |
1142 | |
bd3831ee |
1143 | Parallel makes should work properly now, although there may still be problems |
1144 | if C<make test> is instructed to run in parallel. |
1145 | |
597bb945 |
1146 | =item Borland's compilers support |
1147 | |
bd3831ee |
1148 | Building with Borland's compilers on Win32 should work more smoothly. In |
1149 | particular Steve Hay has worked to sidestep many warnings emitted by their |
1150 | compilers and at least one C compiler internal error. |
1151 | |
597bb945 |
1152 | =item Static build on Windows |
1153 | |
f0e260b8 |
1154 | Perl extensions on Windows now can be statically built into the Perl DLL. |
1155 | |
1156 | Also, it's now possible to build a C<perl-static.exe> that doesn't depend |
1157 | on the Perl DLL on Win32. See the Win32 makefiles for details. |
1158 | (Vadim Konovalov) |
bd3831ee |
1159 | |
69d2c521 |
1160 | =item ppport.h files |
597bb945 |
1161 | |
1162 | All F<ppport.h> files in the XS modules bundled with perl are now |
1163 | autogenerated at build time. (Marcus Holland-Moritz) |
1164 | |
f0e260b8 |
1165 | =item C++ compatibility |
1166 | |
1167 | Efforts have been made to make perl and the core XS modules compilable |
1168 | with various C++ compilers (although the situation is not perfect with |
1169 | some of the compilers on some of the platforms tested). |
1170 | |
597bb945 |
1171 | =item Support for Microsoft 64-bit compiler |
1172 | |
1173 | Support for building perl with Microsoft's 64-bit compiler has been |
1174 | improved. (ActiveState) |
1175 | |
f0e260b8 |
1176 | =item Visual C++ |
1177 | |
c01f0d41 |
1178 | Perl can now be compiled with Microsoft Visual C++ 2005 (and 2008 Beta 2). |
f0e260b8 |
1179 | |
1180 | =item Win32 builds |
1181 | |
1182 | All win32 builds (MS-Win, WinCE) have been merged and cleaned up. |
1183 | |
597bb945 |
1184 | =back |
1185 | |
1186 | =head2 Installation improvements |
1187 | |
1188 | =over 4 |
1189 | |
1190 | =item Module auxiliary files |
1191 | |
1192 | README files and changelogs for CPAN modules bundled with perl are no |
1193 | longer installed. |
1194 | |
1195 | =back |
1196 | |
bd3831ee |
1197 | =head2 New Or Improved Platforms |
1198 | |
597bb945 |
1199 | Perl has been reported to work on Symbian OS. See L<perlsymbian> for more |
bd3831ee |
1200 | information. |
1201 | |
597bb945 |
1202 | Many improvements have been made towards making Perl work correctly on |
1203 | z/OS. |
1204 | |
f0e260b8 |
1205 | Perl has been reported to work on DragonFlyBSD and MidnightBSD. |
597bb945 |
1206 | |
bd3831ee |
1207 | The VMS port has been improved. See L<perlvms>. |
1208 | |
d43695a1 |
1209 | Support for Cray XT4 Catamount/Qk has been added. See |
1210 | F<hints/catamount.sh> in the source code distribution for more |
1211 | information. |
bd3831ee |
1212 | |
f0e260b8 |
1213 | Vendor patches have been merged for RedHat and Gentoo. |
1214 | |
1215 | DynaLoader::dl_unload_file() now works on Windows. |
bd3831ee |
1216 | |
cf6c151c |
1217 | =head1 Selected Bug Fixes |
c0c97549 |
1218 | |
bd3831ee |
1219 | =over 4 |
1220 | |
1221 | =item strictures in regexp-eval blocks |
1222 | |
c0c97549 |
1223 | C<strict> wasn't in effect in regexp-eval blocks (C</(?{...})/>). This is now fixed. |
1224 | |
bd3831ee |
1225 | =item Calling CORE::require() |
1226 | |
1227 | CORE::require() and CORE::do() were always parsed as require() and do() |
1228 | when they were overridden. This is now fixed. |
1229 | |
1230 | =item Subscripts of slices |
1231 | |
1232 | You can now use a non-arrowed form for chained subscripts after a list |
1233 | slice, like in: |
1234 | |
1235 | ({foo => "bar"})[0]{foo} |
1236 | |
1237 | This used to be a syntax error; a C<< -> >> was required. |
1238 | |
1239 | =item C<no warnings 'category'> works correctly with -w |
1240 | |
1241 | Previously, when running with warnings enabled globally via C<-w>, selectively |
1242 | disabling a specific warning category would actually turn off all warnings. |
1243 | This is now fixed: C<no warnings 'io';> will only turn off warnings in the |
1244 | C<io> class. |
1245 | |
597bb945 |
1246 | =item threads improvements |
bd3831ee |
1247 | |
1248 | Several memory leaks in ithreads were closed. Also, ithreads were made |
1249 | less memory-intensive. |
1250 | |
597bb945 |
1251 | C<threads> is now a dual-life module, also available on CPAN. It has been |
1252 | expanded in many ways. A kill() method is available for thread signalling. |
1253 | One can get thread status, or the list of running or joinable threads. |
1254 | |
1255 | A new C<< threads->exit() >> method is used to exit from the application |
1256 | (this is the default for the main thread) or from the current thread only |
1257 | (this is the default for all other threads). On the other hand, the exit() |
1258 | built-in now always causes the whole application to terminate. (Jerry |
1259 | D. Hedden) |
1260 | |
bd3831ee |
1261 | =item chr() and negative values |
1262 | |
1263 | chr() on a negative value now gives C<\x{FFFD}>, the Unicode replacement |
1264 | character, except when the C<bytes> pragma is in effect, where the low |
1265 | eight bits of the value are used. |
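
The two behaviours can be compared with a small sketch like this one. The C<utf8> warning category is disabled because chr() on a negative value also emits a warning.

```perl
use strict;
use warnings;
no warnings 'utf8';    # chr() on a negative value also warns

# Without the bytes pragma: the Unicode replacement character.
my $replacement = ord( chr(-1) );

# With the bytes pragma: only the low eight bits of the value are kept.
my $low_bits;
{
    use bytes;
    $low_bits = ord( chr(-1) );
}

printf "U+%04X and 0x%02X\n", $replacement, $low_bits;
```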
1266 | |
597bb945 |
1267 | =item PERL5SHELL and tainting |
1268 | |
1269 | On Windows, the PERL5SHELL environment variable is now checked for |
1270 | taintedness. (Rafael Garcia-Suarez) |
1271 | |
1272 | =item Using *FILE{IO} |
1273 | |
1274 | C<stat()> and C<-X> filetests now treat *FILE{IO} filehandles like *FILE |
1275 | filehandles. (Steve Peters) |
1276 | |
1277 | =item Overloading and reblessing |
1278 | |
1279 | Overloading now works when references are reblessed into another class. |
1280 | Internally, this has been implemented by moving the flag for "overloading" |
1281 | from the reference to the referent, which logically is where it should |
1282 | always have been. (Nicholas Clark) |
1283 | |
1284 | =item Overloading and UTF-8 |
1285 | |
1286 | A few bugs related to UTF-8 handling with objects that have |
1287 | stringification overloaded have been fixed. (Nicholas Clark) |
1288 | |
1289 | =item eval memory leaks fixed |
1290 | |
1291 | Traditionally, C<eval 'syntax error'> has leaked badly. Many (but not all) |
1292 | of these leaks have now been eliminated or reduced. (Dave Mitchell) |
1293 | |
1294 | =item Random device on Windows |
1295 | |
1296 | In previous versions, perl would read the file F</dev/urandom> if it |
1297 | existed when seeding its random number generator. That file is unlikely |
1298 | to exist on Windows, and if it did would probably not contain appropriate |
1299 | data, so perl no longer tries to read it on Windows. (Alex Davies) |
1300 | |
1301 | =item PERLIO_DEBUG |
1302 | |
254a8700 |
1303 | The C<PERLIO_DEBUG> environment variable no longer has any effect for |
597bb945 |
1304 | setuid scripts and for scripts run with B<-T>. |
1305 | |
1306 | Moreover, with a thread-enabled perl, using C<PERLIO_DEBUG> could lead to |
1307 | an internal buffer overflow. This has been fixed. |
1308 | |
f0e260b8 |
1309 | =item PerlIO::scalar and read-only scalars |
1310 | |
1311 | PerlIO::scalar will now prevent writing to read-only scalars. Moreover, |
1312 | seek() is now supported with PerlIO::scalar-based filehandles, the |
1313 | underlying string being zero-filled as needed. (Rafael, Jarkko Hietaniemi) |
1314 | |
1315 | =item study() and UTF-8 |
1316 | |
1317 | study() never worked for UTF-8 strings, but could lead to false results. |
1318 | It's now a no-op on UTF-8 data. (Yves Orton) |
1319 | |
1320 | =item Critical signals |
1321 | |
1322 | The signals SIGILL, SIGBUS and SIGSEGV are now always delivered in an |
1323 | "unsafe" manner (contrary to other signals, that are deferred until the |
1324 | perl interpreter reaches a reasonably stable state; see |
1325 | L<perlipc/"Deferred Signals (Safe Signals)">). (Rafael) |
1326 | |
1327 | =item @INC-hook fix |
1328 | |
1329 | When a module or a file is loaded through an @INC-hook, and when this hook |
1330 | has set a filename entry in %INC, __FILE__ is now set for this module |
1331 | according to the contents of that %INC entry. (Rafael) |
1332 | |
1333 | =item C<-t> switch fix |
1334 | |
1335 | The C<-w> and C<-t> switches can now be used together without messing |
254a8700 |
1336 | up which categories of warnings are activated. (Rafael) |
f0e260b8 |
1337 | |
1338 | =item Duping UTF-8 filehandles |
1339 | |
1340 | Duping a filehandle which has the C<:utf8> PerlIO layer set will now |
1341 | properly carry that layer on the duped filehandle. (Rafael) |
1342 | |
1343 | =item Localisation of hash elements |
1344 | |
292c2b28 |
1345 | Localizing a hash element whose key was given as a variable didn't work |
f0e260b8 |
1346 | correctly if the variable was changed while the local() was in effect (as |
1347 | in C<local $h{$x}; ++$x>). (Bo Lindbergh) |
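
A sketch of the situation described above, with the behaviour expected after the fix. The hash contents are arbitrary.

```perl
use strict;
use warnings;

my %h = ( 1 => 'one', 2 => 'two' );
my $x = 1;

{
    local $h{$x} = 'temporary';    # localizes $h{1}
    ++$x;                          # the key variable changes inside the scope
}

# On leaving the block, $h{1} gets its old value back and $h{2}
# is left untouched, even though $x is now 2.
my @after = ( $h{1}, $h{2} );
```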
1348 | |
bd3831ee |
1349 | =back |
0eece9c0 |
1350 | |
cf6c151c |
1351 | =head1 New or Changed Diagnostics |
c0c97549 |
1352 | |
bd3831ee |
1353 | =over 4 |
1354 | |
d43695a1 |
1355 | =item Use of uninitialized value |
1356 | |
1357 | Perl will now try to tell you the name of the variable (if any) that was |
1358 | undefined. |
1359 | |
bd3831ee |
1360 | =item Deprecated use of my() in false conditional |
1361 | |
c0c97549 |
1362 | A new deprecation warning, I<Deprecated use of my() in false conditional>, |
1363 | has been added, to warn against the use of the dubious and deprecated |
1364 | construct |
1365 | |
1366 | my $x if 0; |
1367 | |
1368 | See L<perldiag>. Use C<state> variables instead. |
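
A minimal sketch of the C<state> replacement for that construct. The subroutine name is made up for the example, and the C<state> feature has to be enabled explicitly.

```perl
use strict;
use warnings;
use feature 'state';

# Hypothetical helper that hands out increasing numbers.
sub next_id {
    state $n = 0;    # initialised once, keeps its value between calls
    return ++$n;
}

my @ids = map { next_id() } 1 .. 3;

print "@ids\n";
```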
1369 | |
bd3831ee |
1370 | =item !=~ should be !~ |
1371 | |
0eece9c0 |
1372 | A new warning, C<!=~ should be !~>, is emitted to prevent this misspelling |
1373 | of the non-matching operator. |
1374 | |
bd3831ee |
1375 | =item Newline in left-justified string |
1376 | |
0eece9c0 |
1377 | The warning I<Newline in left-justified string> has been removed. |
1378 | |
bd3831ee |
1379 | =item Too late for "-T" option |
1380 | |
0eece9c0 |
1381 | The error I<Too late for "-T" option> has been reformulated to be more |
1382 | descriptive. |
1383 | |
bd3831ee |
1384 | =item "%s" variable %s masks earlier declaration |
1385 | |
1386 | This warning is now emitted more consistently; in short, when one |
1387 | of the declarations involved is a C<my> variable: |
1388 | |
1389 | my $x; my $x; # warns |
1390 | my $x; our $x; # warns |
1391 | our $x; my $x; # warns |
1392 | |
1393 | On the other hand, the following: |
1394 | |
1395 | our $x; our $x; |
1396 | |
1397 | now gives a C<"our" variable %s redeclared> warning. |
1398 | |
1399 | =item readdir()/closedir()/etc. attempted on invalid dirhandle |
1400 | |
1401 | These new warnings are now emitted when a dirhandle is used but is |
1402 | either closed or not really a dirhandle. |
1403 | |
f0e260b8 |
1404 | =item Opening dirhandle/filehandle %s also as a file/directory |
1405 | |
1406 | Two deprecation warnings have been added: (Rafael) |
1407 | |
1408 | Opening dirhandle %s also as a file |
1409 | Opening filehandle %s also as a directory |
1410 | |
f00638a2 |
1411 | =item Use of -P is deprecated |
1412 | |
1413 | Perl's command-line switch C<-P> is now deprecated. |
1414 | |
6601a838 |
1415 | =item v-string in use/require is non-portable |
1416 | |
1417 | Perl will warn you against potential backwards compatibility problems with |
1418 | the C<use VERSION> syntax. |
1419 | |
bd3831ee |
1420 | =item perl -V |
1421 | |
0eece9c0 |
1422 | C<perl -V> has several improvements, making it more useable from shell |
1423 | scripts to get the value of configuration variables. See L<perlrun> for |
1424 | details. |
1425 | |
bd3831ee |
1426 | =back |
1427 | |
cf6c151c |
1428 | =head1 Changed Internals |
c0c97549 |
1429 | |
16993b2e |
1430 | In general, the source code of perl has been refactored, tidied up, |
1431 | and optimized in many places. Also, memory management and allocation |
1432 | have been improved in several places. |
1433 | |
1434 | When compiling the perl core with gcc, as many gcc warning flags are |
1435 | turned on as is possible on the platform. (This quest for cleanliness |
1436 | doesn't extend to XS code because we cannot guarantee the tidiness of |
1437 | code we didn't write.) Similar strictness flags have been added or |
1438 | tightened for various other C compilers. |
bd3831ee |
1439 | |
c0c97549 |
1440 | =head2 Reordering of SVt_* constants |
1441 | |
1442 | The relative ordering of constants that define the various types of C<SV> |
1443 | has changed; in particular, C<SVt_PVGV> has been moved before C<SVt_PVLV>, |
1444 | C<SVt_PVAV>, C<SVt_PVHV> and C<SVt_PVCV>. This is unlikely to make any |
1445 | difference unless you have code that explicitly makes assumptions about that |
1446 | ordering. (The inheritance hierarchy of C<B::*> objects has been changed |
1447 | to reflect this.) |
1448 | |
254a8700 |
1449 | =head2 Elimination of SVt_PVBM |
1450 | |
1451 | Related to this, the internal type C<SVt_PVBM> has been removed. This |
1452 | dedicated type of C<SV> was used by the C<index> operator and parts of the |
1453 | regexp engine to facilitate fast Boyer-Moore matches. Its use internally has |
1454 | been replaced by C<SV>s of type C<SVt_PVGV>. |
1455 | |
1456 | =head2 New type SVt_BIND |
1457 | |
1458 | A new type C<SVt_BIND> has been added, in readiness for the project to |
1459 | implement Perl 6 on 5. There deliberately is no implementation yet, and |
1460 | SVs of this type cannot yet be created or destroyed. |
1461 | |
c0c97549 |
1462 | =head2 Removal of CPP symbols |
1463 | |
1464 | The C preprocessor symbols C<PERL_PM_APIVERSION> and |
1465 | C<PERL_XS_APIVERSION>, which were supposed to give the version number of |
1466 | the oldest perl binary-compatible (resp. source-compatible) with the |
1467 | present one, were not used, and sometimes had misleading values. They have |
1468 | been removed. |
1469 | |
1470 | =head2 Less space is used by ops |
1471 | |
1472 | The C<BASEOP> structure now uses less space. The C<op_seq> field has been |
254a8700 |
1473 | removed and replaced by a single-bit bit-field C<op_opt>. C<op_type> is now 9 |
c0c97549 |
1474 | bits long. (Consequently, the C<B::OP> class doesn't provide a C<seq> |
1475 | method anymore.) |
1476 | |
1477 | =head2 New parser |
1478 | |
1479 | perl's parser is now generated by bison (it used to be generated by |
1480 | byacc). As a result, it seems to be a bit more robust. |
1481 | |
bd3831ee |
1482 | Also, Dave Mitchell improved the lexer debugging output under C<-DT>. |
1483 | |
1484 | =head2 Use of C<const> |
1485 | |
1486 | Andy Lester supplied many improvements to determine which function |
1487 | parameters and local variables could actually be declared C<const> to the C |
1488 | compiler. Steve Peters provided new C<*_set> macros and reworked the core to |
1489 | use these rather than assigning to macros in LVALUE context. |
1490 | |
1491 | =head2 Mathoms |
1492 | |
1493 | A new file, F<mathoms.c>, has been added. It contains functions that are |
1494 | no longer used in the perl core, but that remain available for binary or |
1495 | source compatibility reasons. However, those functions will not be |
1496 | compiled in if you add C<-DNO_MATHOMS> to the compiler flags. |
1497 | |
1498 | =head2 C<AvFLAGS> has been removed |
1499 | |
1500 | The C<AvFLAGS> macro has been removed. |
1501 | |
1502 | =head2 C<av_*> changes |
1503 | |
1504 | The C<av_*()> functions, used to manipulate arrays, no longer accept null |
1505 | C<AV*> parameters. |
1506 | |
597bb945 |
1507 | =head2 C<$^H> and C<%^H> |
1508 | |
1509 | The implementation of the special variables $^H and %^H has changed, to |
254a8700 |
1510 | allow implementing lexical pragmas in pure Perl. |
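As a rough illustration of what this makes possible, the sketch below defines a
lexical pragma entirely in Perl by storing a flag in C<%^H> at compile time and
reading it back at run time through the hints hash returned by C<caller>. The
package name C<My::Pragma> and the key C<My::Pragma/on> are hypothetical, chosen
only for this example, and the C<BEGIN> blocks stand in for the C<use> and C<no>
statements that a pragma living in its own file would normally be loaded with.

```perl
use strict;
use warnings;

package My::Pragma;    # hypothetical pragma name, for illustration only

# Called at compile time; the entry in %^H is scoped to the enclosing block.
sub import   { $^H{'My::Pragma/on'} = 1 }
sub unimport { $^H{'My::Pragma/on'} = 0 }

# Element 10 of caller() is the hints hash that was in effect where
# in_effect() was called; it is undef when %^H was empty there.
sub in_effect {
    my $hints = (caller(0))[10];
    return $hints && $hints->{'My::Pragma/on'} ? 1 : 0;
}

package main;

my @seen;
{
    BEGIN { My::Pragma->import }      # stands in for "use My::Pragma"
    push @seen, My::Pragma::in_effect();
    {
        BEGIN { My::Pragma->unimport }    # stands in for "no My::Pragma"
        push @seen, My::Pragma::in_effect();
    }
    push @seen, My::Pragma::in_effect();
}
push @seen, My::Pragma::in_effect();

print "@seen\n";    # prints "1 0 1 0"
```

The flag follows the lexical scope in which it was set: it is switched off in
the inner block, restored when that block ends, and absent outside the outer
block.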
597bb945 |
1511 | |
bd3831ee |
1512 | =head2 B:: modules inheritance changed |
1513 | |
1514 | The inheritance hierarchy of C<B::> modules has changed; C<B::NV> now |
1515 | inherits from C<B::SV> (it used to inherit from C<B::IV>). |
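One way to inspect the hierarchy on a given perl is to load C<B> and query the
classes directly. This is only a sketch: the exact contents of C<@B::NV::ISA>
may differ between releases, so the code prints what it finds rather than
assuming a particular answer for C<B::IV>.

```perl
use strict;
use warnings;
use B ();

# Direct parents of B::NV as declared by the B module on this perl.
my @parents = do { no strict 'refs'; @{'B::NV::ISA'} };

# isa() follows the whole inheritance chain, not just the direct parents.
my $nv_isa_sv = B::NV->isa('B::SV') ? 1 : 0;
my $nv_isa_iv = B::NV->isa('B::IV') ? 1 : 0;

print "B::NV direct parents: @parents\n";
print "B::NV isa B::SV: $nv_isa_sv\n";
print "B::NV isa B::IV: $nv_isa_iv\n";
```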
1516 | |
f0e260b8 |
1517 | =head2 Anonymous hash and array constructors |
1518 | |
1519 | The anonymous hash and array constructors now take 1 op in the optree |
1520 | instead of 3, now that pp_anonhash and pp_anonlist return a reference to |
1521 | a hash/array when the op is flagged with OPf_SPECIAL (Nicholas Clark). |
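To look at the ops generated for an anonymous array constructor, the optree of
a small subroutine can be dumped with C<B::Concise>. The sketch below captures
the dump in a scalar and keeps only the lines that mention C<anonlist> or
C<anonhash>. The exact text of the dump varies between perl versions, so no
particular output is shown here.

```perl
use strict;
use warnings;
use B::Concise ();

# Send the dump to an in-memory buffer instead of STDOUT.
my $dump = '';
B::Concise::walk_output(\$dump);

# Render the ops of a sub that builds an anonymous array, in execution order.
my $walker = B::Concise::compile('-exec', sub { my $r = [ 1, 2 ]; $r });
$walker->();

# Keep only the constructor ops.
my @ops = grep { /anon(?:list|hash)/ } split /\n/, $dump;
print "$_\n" for @ops;
```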
1522 | |
cf6c151c |
1523 | =head1 Known Problems |
c0c97549 |
1524 | |
1525 | A problem remains in the implementation of the lexical |
1526 | C<$_>: it doesn't work inside C</(?{...})/> blocks. (See the TODO test in |
1527 | F<t/op/mydef.t>.) |
1528 | |
cf6c151c |
1529 | =head1 Platform Specific Problems |
c0c97549 |
1530 | |
cf6c151c |
1531 | =head1 Reporting Bugs |
1532 | |
1533 | =head1 SEE ALSO |
1534 | |
1535 | The F<Changes> file and the perl590delta to perl595delta man pages for |
1536 | exhaustive details on what changed. |
1537 | |
1538 | The F<INSTALL> file for how to build Perl. |
1539 | |
1540 | The F<README> file for general stuff. |
1541 | |
1542 | The F<Artistic> and F<Copying> files for copyright information. |
1543 | |
1544 | =cut |