perldelta - what is new for perl 5.10.0

This document describes the differences between the 5.8.8 release and
the 5.10.0 release.

Many of the bug fixes in 5.10.0 were already seen in the 5.8.X maintenance
releases; they are not duplicated here and are documented in the set of
man pages named perl58[1-8]?delta.
14 =head1 Core Enhancements
16 =head2 The C<feature> pragma
18 The C<feature> pragma is used to enable new syntax that would break Perl's
19 backwards-compatibility with older releases of the language. It's a lexical
20 pragma, like C<strict> or C<warnings>.
Currently the following new features are available: C<switch> (adds a
switch statement), C<say> (adds a C<say> built-in function), and C<state>
(adds a C<state> keyword for declaring "static" variables). Those
features are described in their own sections of this document.
27 The C<feature> pragma is also implicitly loaded when you require a minimal
28 perl version (with the C<use VERSION> construct) greater than, or equal
29 to, 5.9.5. See L<feature> for details.
31 =head2 New B<-E> command-line switch
33 B<-E> is equivalent to B<-e>, but it implicitly enables all
34 optional features (like C<use feature ":5.10">).
36 =head2 Defined-or operator
A new operator C<//> (defined-or) has been implemented.
The following expression:

    $a // $b

is merely equivalent to

    defined $a ? $a : $b

and the statement

    $c //= $d;

can now be used instead of

    $c = $d unless defined $c;

The C<//> operator has the same precedence and associativity as C<||>.
Special care has been taken to ensure that this operator Does What You Mean
while not breaking old code, but some edge cases involving the empty
regular expression may now parse differently. See L<perlop> for
details.
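As a minimal sketch of the difference between C<||> and C<//>, the following uses a hypothetical options hash (the names C<%opt>, C<retries> and C<timeout> are illustrative only) in which C<0> is a meaningful, defined value:

```perl
use strict;
use warnings;

# Hypothetical settings: 0 is a valid, defined value; timeout is unset.
my %opt = (retries => 0, timeout => undef);

# || tests truth, so a defined 0 falls through to the default.
my $retries_or  = $opt{retries} || 3;

# // tests definedness, so the 0 is kept.
my $retries_dor = $opt{retries} // 3;

# //= assigns only when the variable is currently undefined.
my $timeout = $opt{timeout};
$timeout //= 30;

print "$retries_or $retries_dor $timeout\n";   # prints "3 0 30"
```

The C<//> form is usually what is wanted for defaults whenever false-but-defined values such as C<0> or the empty string are legitimate.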
61 =head2 Switch and Smart Match operator
63 Perl 5 now has a switch statement. It's available when C<use feature
64 'switch'> is in effect. This feature introduces three new keywords,
65 C<given>, C<when>, and C<default>:

    given ($foo) {
        when (/^abc/) { $abc = 1; }
        when (/^def/) { $def = 1; }
        when (/^xyz/) { $xyz = 1; }
        default { $nothing = 1; }
    }

74 A more complete description of how Perl matches the switch variable
75 against the C<when> conditions is given in L<perlsyn/"Switch statements">.
77 This kind of match is called I<smart match>, and it's also possible to use
78 it outside of switch statements, via the new C<~~> operator. See
79 L<perlsyn/"Smart matching in detail">.
81 This feature was contributed by Robin Houston.
83 =head2 Regular expressions
87 =item Recursive Patterns
It is now possible to write recursive patterns without using the C<(??{})>
construct. This new way is more efficient, and in many cases easier to
read.
93 Each capturing parenthesis can now be treated as an independent pattern
94 that can be entered by using the C<(?PARNO)> syntax (C<PARNO> standing for
95 "parenthesis number"). For example, the following pattern will match
96 nested balanced angle brackets:

    m/
        (                   # start capture buffer 1
            <               # match an opening angle bracket
            (?:
                (?>         # don't backtrack over the inside of this group
                    [^<>]+  # one or more non angle brackets
                )           # end non backtracking group
            |
                (?1)        # recurse to bracket 1 and try it again
            )*              # 0 or more times.
            >               # match a closing angle bracket
        )                   # end capture buffer one
    /x

114 Note, users experienced with PCRE will find that the Perl implementation
115 of this feature differs from the PCRE one in that it is possible to
116 backtrack into a recursed pattern, whereas in PCRE the recursion is
117 atomic or "possessive" in nature. (Yves Orton)
119 =item Named Capture Buffers
It is now possible to name capturing parentheses in a pattern and refer to
the captured contents by name. The naming syntax is C<< (?<NAME>....) >>.
A named buffer can be backreferenced with the C<< \k<NAME> >>
syntax. In code, the new magical hashes C<%+> and C<%-> can be used to
access the contents of the capture buffers.
127 Thus, to replace all doubled chars, one could write
129 s/(?<letter>.)\k<letter>/$+{letter}/g
131 Only buffers with defined contents will be "visible" in the C<%+> hash, so
132 it's possible to do something like

    foreach my $name (keys %+) {
        print "content of buffer '$name' is $+{$name}\n";
    }

The C<%-> hash is a bit more complete, since it will contain array refs
holding values from all capture buffers similarly named, if there should
be many of them.
142 C<%+> and C<%-> are implemented as tied hashes through the new module
143 C<Tie::Hash::NamedCapture>.
Users exposed to the .NET regex engine will find that the perl
implementation differs in that the numerical ordering of the buffers
is sequential, and not "unnamed first, then named". Thus in the pattern

    /(A)(?<B>B)(C)(?<D>D)/

$1 will be 'A', $2 will be 'B', $3 will be 'C' and $4 will be 'D', rather
than the $1 is 'A', $2 is 'C', $3 is 'B' and $4 is 'D' that a .NET
programmer would expect. This is considered a feature. :-) (Yves Orton)
155 =item Possessive Quantifiers
157 Perl now supports the "possessive quantifier" syntax of the "atomic match"
158 pattern. Basically a possessive quantifier matches as much as it can and never
159 gives any back. Thus it can be used to control backtracking. The syntax is
160 similar to non-greedy matching, except instead of using a '?' as the modifier
161 the '+' is used. Thus C<?+>, C<*+>, C<++>, C<{min,max}+> are now legal
162 quantifiers. (Yves Orton)
164 =item Backtracking control verbs
166 The regex engine now supports a number of special-purpose backtrack
167 control verbs: (*THEN), (*PRUNE), (*MARK), (*SKIP), (*COMMIT), (*FAIL)
168 and (*ACCEPT). See L<perlre> for their descriptions. (Yves Orton)
170 =item Relative backreferences
172 A new syntax C<\g{N}> or C<\gN> where "N" is a decimal integer allows a
173 safer form of back-reference notation as well as allowing relative
174 backreferences. This should make it easier to generate and embed patterns
175 that contain backreferences. See L<perlre/"Capture buffers">. (Yves Orton)
=item C<\K> escape

The functionality of Jeff Pinyan's module Regexp::Keep has been added to
the core. In regular expressions you can now use the special escape C<\K>
as a way to do something like floating length positive lookbehind. It is
also useful in substitutions like:

    s/(foo)bar/$1/g

that can now be converted to

    s/foo\Kbar//g

which is much more efficient. (Yves Orton)
192 =item Vertical and horizontal whitespace, and linebreak
194 Regular expressions now recognize the C<\v> and C<\h> escapes, that match
195 vertical and horizontal whitespace, respectively. C<\V> and C<\H>
196 logically match their complements.
198 C<\R> matches a generic linebreak, that is, vertical whitespace, plus
199 the multi-character sequence C<"\x0D\x0A">.
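The regular expression additions described in this section can be sketched together as follows. This is only an illustration under assumed inputs: the sample strings and variable names are invented, and the recursive pattern is a compact form of the balanced-bracket pattern shown earlier.

```perl
use strict;
use warnings;

# Named capture with \k<NAME>: collapse doubled characters.
my $dedup = "bookkeeper";
$dedup =~ s/(?<letter>.)\k<letter>/$+{letter}/g;       # "bokeper"

# Recursion with (?1): match one balanced <...> group.
my $balanced = qr/ ( < (?: (?> [^<>]+ ) | (?1) )* > ) /x;
my ($group) = "see <a<b>c<d<e>>> here" =~ $balanced;   # "<a<b>c<d<e>>>"

# Possessive quantifier: a++ gives nothing back, so the final "a"
# in the pattern has nothing left to match.
my $greedy     = ("aaa" =~ /a+a/)  ? 1 : 0;            # 1
my $possessive = ("aaa" =~ /a++a/) ? 1 : 0;            # 0

# Relative backreference: \g{-1} is the nearest preceding group.
my $relative = ("abab" =~ /(ab)\g{-1}/) ? 1 : 0;       # 1

# \K keeps the text matched so far out of the replaced portion.
my $kept = "foobar foobaz";
$kept =~ s/foo\Kbar//g;                                # "foo foobaz"

# \R treats "\r\n" as a single line break.
my @lines = split /\R/, "one\r\ntwo\nthree";
```

Relative backreferences are mainly useful when a sub-pattern is interpolated into a larger one, because the absolute group numbers are then not known in advance.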
=head2 say()

say() is a new built-in, only available when C<use feature 'say'> is in
effect, that is similar to print(), but that implicitly appends a newline
to the printed string. See L<perlfunc/say>. (Robin Houston)
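A small sketch of say() follows. It writes to an in-memory filehandle (an assumption made purely so that the result can be inspected; the variable names are illustrative) rather than to STDOUT.

```perl
use strict;
use warnings;
use feature 'say';

# Capture the output in a scalar so the appended newlines are visible.
my $buffer = '';
open(my $out, '>', \$buffer) or die "cannot open in-memory handle: $!";

say {$out} "hello";        # same as: print {$out} "hello\n"
say {$out} "a", "b";       # the whole list is printed, then one newline

close $out;
# $buffer now holds "hello\nab\n"
```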
=head2 Lexical C<$_>

The default variable C<$_> can now be lexicalized, by declaring it like
any other lexical variable, with a simple

    my $_;

The operations that default on C<$_> will use the lexically-scoped
version of C<$_> when it exists, instead of the global C<$_>.
219 In a C<map> or a C<grep> block, if C<$_> was previously my'ed, then the
220 C<$_> inside the block is lexical as well (and scoped to the block).
222 In a scope where C<$_> has been lexicalized, you can still have access to
223 the global version of C<$_> by using C<$::_>, or, more simply, by
224 overriding the lexical declaration with C<our $_>. (Rafael Garcia-Suarez)
226 =head2 The C<_> prototype
228 A new prototype character has been added. C<_> is equivalent to C<$> (it
229 denotes a scalar), but defaults to C<$_> if the corresponding argument
230 isn't supplied. Due to the optional nature of the argument, you can only
231 use it at the end of a prototype, or before a semicolon.
233 This has a small incompatible consequence: the prototype() function has
234 been adjusted to return C<_> for some built-ins in appropriate cases (for
235 example, C<prototype('CORE::rmdir')>). (Rafael Garcia-Suarez)
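The following sketch shows the C<_> prototype on a hypothetical helper named C<shout> (not part of perl), together with the prototype() result mentioned above:

```perl
use strict;
use warnings;

# Hypothetical helper: thanks to the _ prototype, a missing argument
# is filled in with $_.
sub shout(_) { return uc $_[0] }

my @loud     = map { shout() } qw(a b);   # each call receives $_
my $explicit = shout("c");                # an explicit argument still works

# Built-ins that default to $_ report the same prototype character.
my $proto = prototype('CORE::rmdir');
```

The sub has to be declared before the calls are compiled, as with any other prototype.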
237 =head2 UNITCHECK blocks
C<UNITCHECK>, a new special code block, has been introduced in addition to
C<BEGIN>, C<CHECK>, C<INIT> and C<END>.
242 C<CHECK> and C<INIT> blocks, while useful for some specialized purposes,
243 are always executed at the transition between the compilation and the
244 execution of the main program, and thus are useless whenever code is
245 loaded at runtime. On the other hand, C<UNITCHECK> blocks are executed
246 just after the unit which defined them has been compiled. See L<perlmod>
247 for more information. (Alex Gough)
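To illustrate the timing, here is a minimal sketch that compiles a unit at run time with a string eval. The array C<@order> is an illustrative name used only to record the sequence of events.

```perl
use strict;
use warnings;

our @order;

# A string eval is its own compilation unit: its UNITCHECK block runs as
# soon as that unit has been compiled, before the unit's body executes.
eval q{
    push @order, 'body';
    UNITCHECK { push @order, 'unitcheck' }
    1;
} or die $@;

# @order is now ('unitcheck', 'body')
```

A C<CHECK> or C<INIT> block in the same position would not run, since the main program's compilation phase is already over by then.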
249 =head2 New Pragma, C<mro>
A new pragma, C<mro> (for Method Resolution Order), has been added. It
lets you switch, on a per-class basis, the algorithm that perl uses to
find inherited methods in case of a multiple inheritance hierarchy. The
default MRO hasn't changed (DFS, for Depth First Search). Another MRO is
available: the C3 algorithm. See L<mro> for more information.

Note that, due to changes in the implementation of class hierarchy search,
code that used to undef the C<*ISA> glob will most probably break. Anyway,
undef'ing C<*ISA> had the side-effect of removing the magic on the @ISA
array and should not have been done in the first place.
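The difference between the two resolution orders can be sketched with a hypothetical diamond hierarchy (the class names C<Top>, C<Left>, C<Right> and C<Bottom> are invented for this example):

```perl
use strict;
use warnings;
use mro;

# Bottom inherits from Left and Right, which both inherit from Top.
sub Top::hello   { 'Top' }
sub Right::hello { 'Right' }
@Left::ISA   = ('Top');
@Right::ISA  = ('Top');
@Bottom::ISA = ('Left', 'Right');

my @dfs = @{ mro::get_linear_isa('Bottom', 'dfs') };  # Bottom Left Top Right
my @c3  = @{ mro::get_linear_isa('Bottom', 'c3') };   # Bottom Left Right Top

my $before = 'Bottom'->hello;    # default DFS reaches Top::hello first
mro::set_mro('Bottom', 'c3');
my $after  = 'Bottom'->hello;    # C3 reaches Right::hello first
```

In ordinary code the same switch is normally made with C<use mro 'c3';> inside the class itself; C<mro::set_mro> is used here only to show both behaviours in one script.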
263 =head2 readpipe() is now overridable
The built-in function readpipe() is now overridable. Overriding it also
overrides its operator counterpart, C<qx//> (a.k.a. C<``>).
Moreover, it now defaults to C<$_> if no argument is provided. (Rafael
Garcia-Suarez)
270 =head2 Default argument for readline()
readline() now defaults to C<*ARGV> if no argument is provided. (Rafael
Garcia-Suarez)
275 =head2 state() variables
277 A new class of variables has been introduced. State variables are similar
278 to C<my> variables, but are declared with the C<state> keyword in place of
279 C<my>. They're visible only in their lexical scope, but their value is
280 persistent: unlike C<my> variables, they're not undefined at scope entry,
281 but retain their previous value. (Rafael Garcia-Suarez, Nicholas Clark)
To use state variables, one needs to enable them by using

    use feature 'state';

or by using the C<-E> command-line switch in one-liners.
See L<perlsub/"Persistent variables via state()">.
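A minimal sketch of a state variable, using a hypothetical counter function C<next_id>:

```perl
use strict;
use warnings;
use feature 'state';

sub next_id {
    state $id = 0;      # initialised once, on the first call only
    return ++$id;       # the value persists between calls
}

my @ids = (next_id(), next_id(), next_id());   # (1, 2, 3)
```

Before this feature, the usual idiom was a C<my> variable in an enclosing block that the subroutine closed over.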
290 =head2 Stacked filetest operators
292 As a new form of syntactic sugar, it's now possible to stack up filetest
293 operators. You can now write C<-f -w -x $file> in a row to mean
294 C<-x $file && -w _ && -f _>. See L<perlfunc/-X>.
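As a sketch, the stacked form can be compared with its longhand equivalent on a temporary file. The file is created with File::Temp purely for the demonstration, and the chosen tests (C<-f>, C<-r>, C<-s>) are an arbitrary example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small, non-empty regular file to test against.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "data\n";
close $fh;

# Stacked: the tests are applied from right to left.
my $stacked  = (-f -r -s $path)          ? 1 : 0;

# Longhand: reuse the stat buffer via the special _ filehandle.
my $longhand = (-s $path && -r _ && -f _) ? 1 : 0;
```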
296 =head2 UNIVERSAL::DOES()
298 The C<UNIVERSAL> class has a new method, C<DOES()>. It has been added to
299 solve semantic problems with the C<isa()> method. C<isa()> checks for
300 inheritance, while C<DOES()> has been designed to be overridden when
301 module authors use other types of relations between classes (in addition
302 to inheritance). (chromatic)
304 See L<< UNIVERSAL/"$obj->DOES( ROLE )" >>.
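The distinction can be sketched with a hypothetical class C<My::Service> that claims to perform the role of C<My::Logger> by delegation rather than inheritance (both names are invented for this example):

```perl
use strict;
use warnings;

package My::Service;

sub new { return bless {}, shift }

# Claim the My::Logger role without having it in @ISA.
sub DOES {
    my ($self, $role) = @_;
    return 1 if $role eq 'My::Logger';
    return $self->SUPER::DOES($role);   # falls back to UNIVERSAL::DOES
}

package main;

my $svc       = My::Service->new;
my $isa       = $svc->isa('My::Logger')   ? 1 : 0;   # 0: no inheritance
my $does      = $svc->DOES('My::Logger')  ? 1 : 0;   # 1: role is claimed
my $does_self = $svc->DOES('My::Service') ? 1 : 0;   # 1: default behaviour
```

Without an override, C<DOES()> gives the same answer as C<isa()>.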
306 =head2 C<CLONE_SKIP()>
Perl now has support for the C<CLONE_SKIP> special subroutine. Like
309 C<CLONE>, C<CLONE_SKIP> is called once per package; however, it is called
310 just before cloning starts, and in the context of the parent thread. If it
311 returns a true value, then no objects of that class will be cloned. See
312 L<perlmod> for details. (Contributed by Dave Mitchell.)
=head2 Formats

Formats were improved in several ways. A new field, C<^*>, can be used for
317 variable-width, one-line-at-a-time text. Null characters are now handled
318 correctly in picture lines. Using C<@#> and C<~~> together will now
319 produce a compile-time error, as those format fields are incompatible.
320 L<perlform> has been improved, and miscellaneous bugs fixed.
322 =head2 Byte-order modifiers for pack() and unpack()
324 There are two new byte-order modifiers, C<E<gt>> (big-endian) and C<E<lt>>
325 (little-endian), that can be appended to most pack() and unpack() template
326 characters and groups to force a certain byte-order for that type or group.
327 See L<perlfunc/pack> and L<perlpacktut> for details.
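A short sketch of the modifiers on fixed-width integer formats (the values are arbitrary):

```perl
use strict;
use warnings;

# The same 32-bit value packed with each byte order.
my $big    = unpack('H*', pack('L>', 1));   # "00000001"
my $little = unpack('H*', pack('L<', 1));   # "01000000"

# Reading a big-endian 16-bit integer, regardless of the host's byte order.
my ($n) = unpack('S>', "\x01\x02");         # 0x0102 == 258
```

Unlike the older C<N>, C<n>, C<V> and C<v> formats, the modifiers also apply to signed and larger integer types and to whole groups.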
=head2 C<no VERSION>

You can now use C<no> followed by a version number to specify that you
want to use a version of perl older than the specified one.
334 =head2 C<chdir>, C<chmod> and C<chown> on filehandles
336 C<chdir>, C<chmod> and C<chown> can now work on filehandles as well as
337 filenames, if the system supports respectively C<fchdir>, C<fchmod> and
338 C<fchown>, thanks to a patch provided by Gisle Aas.
=head2 OS groups

C<$(> and C<$)> now return groups in the order in which the OS returns them,
thanks to Gisle Aas. This wasn't previously the case.
345 =head2 Recursive sort subs
347 You can now use recursive subroutines with sort(), thanks to Robin Houston.
349 =head2 Exceptions in constant folding
351 The constant folding routine is now wrapped in an exception handler, and
352 if folding throws an exception (such as attempting to evaluate 0/0), perl
353 now retains the current optree, rather than aborting the whole program.
354 (Nicholas Clark, Dave Mitchell)
356 =head2 Source filters in @INC
358 It's possible to enhance the mechanism of subroutine hooks in @INC by
359 adding a source filter on top of the filehandle opened and returned by the
360 hook. This feature was planned a long time ago, but wasn't quite working
361 until now. See L<perlfunc/require> for details. (Nicholas Clark)
363 =head2 New internal variables
367 =item C<${^RE_DEBUG_FLAGS}>
This variable controls what debug flags are in effect for the regular
expression engine when running under C<use re "debug">. See L<re> for
details.
373 =item C<${^CHILD_ERROR_NATIVE}>
375 This variable gives the native status returned by the last pipe close,
376 backtick command, successful call to wait() or waitpid(), or from the
377 system() operator. See L<perlrun> for details. (Contributed by Gisle Aas.)
379 =item C<${^RE_TRIE_MAXBUF}>
381 See L</"Trie optimisation of literal string alternations">.
383 =item C<${^WIN32_SLOPPY_STAT}>
385 See L</"Sloppy stat on Windows">.
391 C<unpack()> now defaults to unpacking the C<$_> variable.
393 C<mkdir()> without arguments now defaults to C<$_>.
395 The internal dump output has been improved, so that non-printable characters
396 such as newline and backspace are output in C<\x> notation, rather than
399 The B<-C> option can no longer be used on the C<#!> line. It wasn't
400 working there anyway.
404 The copy of the Unicode Character Database included in Perl 5 has
405 been updated to version 5.0.0.
409 MAD, which stands for I<Misc Attribute Decoration>, is a
410 still-in-development work leading to a Perl 5 to Perl 6 converter. To
411 enable it, it's necessary to pass the argument C<-Dmad> to Configure. The
412 obtained perl isn't binary compatible with a regular perl 5.9.4, and has
413 space and speed penalties; moreover not all regression tests still pass
414 with it. (Larry Wall, Nicholas Clark)
416 =head1 Incompatible Changes
418 =head2 Packing and UTF-8 strings
The semantics of pack() and unpack() regarding UTF-8-encoded data have been
changed. Processing is now by default character per character instead of
byte per byte on the underlying encoding. Notably, code that used things
like C<pack("a*", $string)> to see through the encoding of the string will now
simply get back the original $string. Packed strings can also get upgraded
during processing when you store upgraded characters. You can get the old
behaviour by using C<use bytes>.
430 To be consistent with pack(), the C<C0> in unpack() templates indicates
431 that the data is to be processed in character mode, i.e. character by
432 character; on the contrary, C<U0> in unpack() indicates UTF-8 mode, where
433 the packed string is processed in its UTF-8-encoded Unicode form on a byte
434 by byte basis. This is reversed with regard to perl 5.8.X.
436 Moreover, C<C0> and C<U0> can also be used in pack() templates to specify
437 respectively character and byte modes.
C<C0> and C<U0> in the middle of a pack or unpack format now switch to the
specified encoding mode, honoring parens grouping. Previously, parens were
ignored.
Also, there is a new pack() character format, C<W>, which is intended to
replace the old C<C>. C<C> is kept for unsigned chars coded as bytes in
the string's internal representation. C<W> represents unsigned (logical)
character values, which can be greater than 255. It is therefore more
robust when dealing with potentially UTF-8-encoded data (as C<C> will wrap
values outside the range 0..255, and not respect the string encoding).
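The difference between C<W> and C<C> can be sketched with a value above 255 (the value 300 is arbitrary):

```perl
use strict;
use warnings;

# W stores the logical character value as-is.
my $w_ord = ord pack('W', 300);             # 300

# C is limited to one byte, so the value wraps modulo 256.
my $c_ord;
{
    no warnings 'pack';                     # silence the "wrapped" warning
    $c_ord = ord pack('C', 300);            # 300 % 256 == 44
}
```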
450 In practice, that means that pack formats are now encoding-neutral, except
453 For consistency, C<A> in unpack() format now trims all Unicode whitespace
454 from the end of the string. Before perl 5.9.2, it used to strip only the
455 classical ASCII space characters.
457 =head2 Byte/character count feature in unpack()
459 A new unpack() template character, C<".">, returns the number of bytes or
460 characters (depending on the selected encoding mode, see above) read so far.
462 =head2 The C<$*> and C<$#> variables have been removed
C<$*>, which was deprecated in favor of the C</s> and C</m> regexp
modifiers, has been removed.

The deprecated C<$#> variable (output format for numbers) has been
removed.

Two new warnings, C<$# is no longer supported> and C<$* is no longer
supported>, have been added.
472 =head2 substr() lvalues are no longer fixed-length
The lvalues returned by the three argument form of substr() used to be a
"fixed length window" on the original string. In some cases this could
cause surprising action at a distance or other undefined behaviour. Now the
length of the window adjusts itself to the length of the string assigned to
it.
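The adjusting window can be sketched by aliasing a substr() lvalue in a C<foreach> loop and assigning to it twice (the strings are arbitrary, and C<@seen> is an illustrative name used to record the intermediate results):

```perl
use strict;
use warnings;

my $x = '1234';
my @seen;

# $_ is aliased to the lvalue covering "23".
for (substr($x, 1, 2)) {
    $_ = 'a';                  # window shrinks to one character
    push @seen, $x;            # "1a4"
    $_ = 'xyz';                # replaces only the "a"; window grows to three
    push @seen, $x;            # "1xyz4"
}
```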
480 =head2 Parsing of C<-f _>
482 The identifier C<_> is now forced to be a bareword after a filetest
483 operator. This solves a number of misparsing issues when a global C<_>
484 subroutine is defined.
488 The C<:unique> attribute has been made a no-op, since its current
489 implementation was fundamentally flawed and not threadsafe.
491 =head2 Scoping of the C<sort> pragma
493 The C<sort> pragma is now lexically scoped. Its effect used to be global.
495 =head2 Scoping of C<bignum>, C<bigint>, C<bigrat>
497 The three numeric pragmas C<bignum>, C<bigint> and C<bigrat> are now
498 lexically scoped. (Tels)
500 =head2 Effect of pragmas in eval
The compile-time value of the C<%^H> hint variable can now propagate into
eval("")uated code. This makes it more useful to implement lexical
pragmas.

As a side-effect of this, the overloaded-ness of constants now propagates
into eval("").
511 A bareword argument to chdir() is now recognized as a file handle.
512 Earlier releases interpreted the bareword as a directory name.
515 =head2 Handling of .pmc files
An old feature of perl is that before C<require> or C<use> look for a
file with a F<.pm> extension, they first look for a similar filename
with a F<.pmc> extension. If this file is found, it is loaded in
place of any potentially existing file ending in a F<.pm> extension.

Previously, F<.pmc> files were loaded only if more recent than the
matching F<.pm> file. Starting with 5.9.4, they'll always be loaded if
they exist.
526 =head2 @- and @+ in patterns
528 The special arrays C<@-> and C<@+> are no longer interpolated in regular
529 expressions. (Sadahiro Tomoyuki)
531 =head2 $AUTOLOAD can now be tainted
533 If you call a subroutine by a tainted name, and if it defers to an
534 AUTOLOAD function, then $AUTOLOAD will be (correctly) tainted.
537 =head2 Tainting and printf
539 When perl is run under taint mode, C<printf()> and C<sprintf()> will now
540 reject any tainted format argument. (Rafael Garcia-Suarez)
542 =head2 undef and signal handlers
544 Undefining or deleting a signal handler via C<undef $SIG{FOO}> is now
545 equivalent to setting it to C<'DEFAULT'>. (Rafael Garcia-Suarez)
547 =head2 strictures and dereferencing in defined()
C<use strict "refs"> was not enforced when dereferencing a value in an
argument to defined(), as in:

    use strict 'refs';
    my $x = 'foo';
    if (defined $$x) {...}

This now correctly produces the run-time error C<Can't use string as a
SCALAR ref while "strict refs" in use>.
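A minimal sketch that traps the error with C<eval>, so that the message can be inspected (the variable names are illustrative):

```perl
use strict;
use warnings;

my $x = "foo";                  # a plain string, not a reference

# Under strict refs, dereferencing the string inside defined() now dies.
my $ok = eval {
    my $is_defined = defined $$x;
    1;
};
my $error = $@;
# $ok is undef and $error mentions "strict refs"
```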
559 C<defined @$foo> and C<defined %$bar> are now also subject to C<strict
560 'refs'> (that is, C<$foo> and C<$bar> shall be proper references there.)
561 (C<defined(@foo)> and C<defined(%bar)> are discouraged constructs anyway.)
564 =head2 C<(?p{})> has been removed
566 The regular expression construct C<(?p{})>, which was deprecated in perl
567 5.8, has been removed. Use C<(??{})> instead. (Rafael Garcia-Suarez)
569 =head2 Pseudo-hashes have been removed
571 Support for pseudo-hashes has been removed from Perl 5.9. (The C<fields>
572 pragma remains here, but uses an alternate implementation.)
574 =head2 Removal of the bytecode compiler and of perlcc
576 C<perlcc>, the byteloader and the supporting modules (B::C, B::CC,
577 B::Bytecode, etc.) are no longer distributed with the perl sources. Those
578 experimental tools have never worked reliably, and, due to the lack of
579 volunteers to keep them in line with the perl interpreter developments, it
580 was decided to remove them instead of shipping a broken version of those.
581 The last version of those modules can be found with perl 5.9.4.
583 However the B compiler framework stays supported in the perl core, as with
584 the more useful modules it has permitted (among others, B::Deparse and
587 =head2 Removal of the JPL
The JPL (Java-Perl Lingo) has been removed from the perl sources tarball.
591 =head2 Recursive inheritance detected earlier
593 Perl will now immediately throw an exception if you modify any package's
594 C<@ISA> in such a way that it would cause recursive inheritance.
596 Previously, the exception would not occur until Perl attempted to make
597 use of the recursive inheritance while resolving a method or doing a
598 C<$foo-E<gt>isa($bar)> lookup.
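The earlier detection can be sketched with two hypothetical packages, C<Loop::A> and C<Loop::B>, that are made to inherit from each other. The exact wording of the error message is an assumption based on current perls.

```perl
use strict;
use warnings;

# First link: Loop::A inherits from Loop::B. Nothing is wrong yet.
@Loop::A::ISA = ('Loop::B');

# Second link closes the cycle; the assignment itself now throws.
my $ok = eval {
    @Loop::B::ISA = ('Loop::A');
    1;
};
my $error = $@;
# $ok is undef; $error reports the recursive inheritance
```

No method call or C<isa()> lookup is needed to trigger the exception.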
600 =head1 Modules and Pragmata
608 C<encoding::warnings>, by Audrey Tang, is a module to emit warnings
609 whenever an ASCII character string containing high-bit bytes is implicitly
610 converted into UTF-8. It's a lexical pragma since Perl 5.9.4; on older
611 perls, its effect is global.
615 C<Module::CoreList>, by Richard Clamp, is a small handy module that tells
616 you what versions of core modules ship with any versions of Perl 5. It
617 comes with a command-line frontend, C<corelist>.
621 C<Math::BigInt::FastCalc> is an XS-enabled, and thus faster, version of
622 C<Math::BigInt::Calc>.
626 C<Compress::Zlib> is an interface to the zlib compression library. It
627 comes with a bundled version of zlib, so having a working zlib is not a
628 prerequisite to install it. It's used by C<Archive::Tar> (see below).
632 C<IO::Zlib> is an C<IO::>-style interface to C<Compress::Zlib>.
636 C<Archive::Tar> is a module to manipulate C<tar> archives.
C<Digest::SHA>, a module used to calculate many types of SHA digests,
has been included for SHA support in the CPAN module.
645 C<ExtUtils::CBuilder> and C<ExtUtils::ParseXS> have been added.
649 C<Hash::Util::FieldHash>, by Anno Siegel, has been added. This module
650 provides support for I<field hashes>: hashes that maintain an association
651 of a reference with a value, in a thread-safe garbage-collected way.
652 Such hashes are useful to implement inside-out objects.
656 C<Module::Build>, by Ken Williams, has been added. It's an alternative to
657 C<ExtUtils::MakeMaker> to build and install perl modules.
661 C<Module::Load>, by Jos Boumans, has been added. It provides a single
662 interface to load Perl modules and F<.pl> files.
666 C<Module::Loaded>, by Jos Boumans, has been added. It's used to mark
667 modules as loaded or unloaded.
671 C<Package::Constants>, by Jos Boumans, has been added. It's a simple
672 helper to list all constants declared in a given package.
676 C<Win32API::File>, by Tye McQueen, has been added (for Windows builds).
677 This module provides low-level access to Win32 system API calls for
682 =head1 Utility Changes
688 The Perl debugger can now save all debugger commands for sourcing later;
689 notably, it can now emulate stepping backwards, by restarting and
690 rerunning all bar the last command from a saved command history.
692 It can also display the parent inheritance tree of a given class, with the
695 Perl has a new -dt command-line flag, which enables threads support in the
700 C<ptar> is a pure perl implementation of C<tar>, that comes with
705 C<ptardiff> is a small script used to generate a diff between the contents
706 of a tar archive and a directory tree. Like C<ptar>, it comes with
711 C<shasum> is a command-line utility, used to print or to check SHA
712 digests. It comes with the new C<Digest::SHA> module.
716 The C<corelist> utility is now installed with perl (see L</"New modules">
721 C<h2ph> and C<h2xs> have been made a bit more robust with regard to
724 C<h2xs> implements a new option C<--use-xsloader> to force use of
725 C<XSLoader> even in backwards compatible modules.
727 The handling of authors' names that had apostrophes has been fixed.
729 Any enums with negative values are now skipped.
733 C<perlivp> no longer checks for F<*.ph> files by default. Use the new C<-a>
734 option to run I<all> tests.
738 C<find2perl> now assumes C<-print> as a default action. Previously, it
739 needed to be specified explicitly.
741 Several bugs have been fixed in C<find2perl>, regarding C<-exec> and
742 C<-eval>. Also the options C<-path>, C<-ipath> and C<-iname> have been
747 C<config_data> is a new utility that comes with C<Module::Build>. It
748 provides a command-line interface to the configuration of Perl modules
749 that use Module::Build's framework of configurability (that is,
750 C<*::ConfigData> modules that contain local configuration information for
751 their parent modules.)
755 =head1 New Documentation
757 The L<perlpragma> manpage documents how to write one's own lexical
758 pragmas in pure Perl (something that is possible starting with 5.9.4).
760 The new L<perlglossary> manpage is a glossary of terms used in the Perl
761 documentation, technical and otherwise, kindly provided by O'Reilly Media,
764 The L<perlreguts> manpage, courtesy of Yves Orton, describes internals of the
765 Perl regular expression engine.
The L<perlunitut> manpage is a tutorial for programming with Unicode and
string encodings in Perl, courtesy of Juerd Waalboer.
770 The long-existing feature of C</(?{...})/> regexps setting C<$_> and pos()
773 =head1 Performance Enhancements
775 =head2 In-place sorting
777 Sorting arrays in place (C<@a = sort @a>) is now optimized to avoid
778 making a temporary copy of the array.
780 Likewise, C<reverse sort ...> is now optimized to sort in reverse,
781 avoiding the generation of a temporary intermediate list.
783 =head2 Lexical array access
785 Access to elements of lexical arrays via a numeric constant between 0 and
786 255 is now faster. (This used to be only the case for global arrays.)
788 =head2 XS-assisted SWASHGET
790 Some pure-perl code that perl was using to retrieve Unicode properties and
791 transliteration mappings has been reimplemented in XS.
793 =head2 Constant subroutines
795 The interpreter internals now support a far more memory efficient form of
796 inlineable constants. Storing a reference to a constant value in a symbol
797 table is equivalent to a full typeglob referencing a constant subroutine,
798 but using about 400 bytes less memory. This proxy constant subroutine is
799 automatically upgraded to a real typeglob with subroutine if necessary.
800 The approach taken is analogous to the existing space optimisation for
801 subroutine stub declarations, which are stored as plain scalars in place
802 of the full typeglob.
804 Several of the core modules have been converted to use this feature for
805 their system dependent constants - as a result C<use POSIX;> now takes about
808 =head2 C<PERL_DONT_CREATE_GVSV>
810 The new compilation flag C<PERL_DONT_CREATE_GVSV>, introduced as an option
811 in perl 5.8.8, is turned on by default in perl 5.9.3. It prevents perl
812 from creating an empty scalar with every new typeglob. See L<perl588delta>
815 =head2 Weak references are cheaper
817 Weak reference creation is now I<O(1)> rather than I<O(n)>, courtesy of
818 Nicholas Clark. Weak reference deletion remains I<O(n)>, but if deletion only
819 happens at program exit, it may be skipped completely.
821 =head2 sort() enhancements
Salvador Fandiño provided improvements to reduce the memory usage of C<sort>
and to speed up some cases.
826 =head2 Memory optimisations
828 Several internal data structures (typeglobs, GVs, CVs, formats) have been
829 restructured to use less memory. (Nicholas Clark)
831 =head2 UTF-8 cache optimisation
833 The UTF-8 caching code is now more efficient, and used more often.
836 =head2 Sloppy stat on Windows
838 On Windows, perl's stat() function normally opens the file to determine
839 the link count and update attributes that may have been changed through
840 hard links. Setting ${^WIN32_SLOPPY_STAT} to a true value speeds up
841 stat() by not performing this operation. (Jan Dubois)
845 =head2 Regular expressions optimisations
849 =item Engine de-recursivised
The regular expression engine is no longer recursive, meaning that
patterns that used to overflow the stack will either die with useful
explanations, or run to completion. Since such patterns were able to blow
the stack before, running to completion will likely take a very long time.
If you were experiencing the occasional stack overflow (or segfault) and
upgrade to discover that now perl apparently hangs instead, look for a
degenerate regex. (Dave Mitchell)
859 =item Single char char-classes treated as literals
861 Classes of a single character are now treated the same as if the character
862 had been used as a literal, meaning that code that uses char-classes as an
863 escaping mechanism will see a speedup. (Yves Orton)
865 =item Trie optimisation of literal string alternations
867 Alternations, where possible, are optimised into more efficient matching
868 structures. String literal alternations are merged into a trie and are
869 matched simultaneously. This means that instead of O(N) time for matching
870 N alternations at a given point, the new code performs in O(1) time.
871 A new special variable, ${^RE_TRIE_MAXBUF}, has been added to fine-tune
872 this optimization. (Yves Orton)
B<Note:> Much code exists that works around perl's historic poor
performance on alternations. Often the tricks used to do so will disable
the new optimisations. Hopefully the utility modules used for this purpose
will be educated about these new optimisations by the time 5.10 is
released.
880 =item Aho-Corasick start-point optimisation
882 When a pattern starts with a trie-able alternation and there aren't
883 better optimisations available the regex engine will use Aho-Corasick
884 matching to find the start point. (Yves Orton)
888 =head1 Installation and Configuration Improvements
890 =head2 Configuration improvements
894 =item C<-Dusesitecustomize>
896 Run-time customization of @INC can be enabled by passing the
897 C<-Dusesitecustomize> flag to Configure. When enabled, this will make perl
898 run F<$sitelibexp/sitecustomize.pl> before anything else. This script can
899 then be set up to add additional entries to @INC.
901 =item Relocatable installations
There is now Configure support for creating a relocatable perl tree. If
you Configure with C<-Duserelocatableinc>, then the paths in @INC (and
everything else in %Config) can be optionally located via the path of the
perl executable.
908 That means that, if the string C<".../"> is found at the start of any
909 path, it's substituted with the directory of $^X. So, the relocation can
910 be configured on a per-directory basis, although the default with
911 C<-Duserelocatableinc> is that everything is relocated. The initial
912 install is done to the original configured prefix.
914 =item strlcat() and strlcpy()
916 The configuration process now detects whether strlcat() and strlcpy() are
917 available. When they are not available, perl's own version is used (from
918 Russ Allbery's public domain implementation). Various places in the perl
919 interpreter now use them. (Steve Peters)
923 =head2 Compilation improvements
929 Parallel makes should work properly now, although there may still be problems
930 if C<make test> is instructed to run in parallel.
932 =item Borland's compilers support
934 Building with Borland's compilers on Win32 should work more smoothly. In
935 particular, Steve Hay has worked to sidestep many warnings emitted by their
936 compilers and at least one C compiler internal error.
938 =item Static build on Windows
940 Perl extensions on Windows can now be statically built into the Perl DLL,
941 thanks to work by Vadim Konovalov.
945 All F<ppport.h> files in the XS modules bundled with perl are now
946 autogenerated at build time. (Marcus Holland-Moritz)
948 =item Building XS extensions on Windows
950 Support for building XS extension modules with the free MinGW compiler has
951 been improved in the case where perl itself was built with the Microsoft
952 VC++ compiler. (ActiveState)
954 =item Support for Microsoft 64-bit compiler
956 Support for building perl with Microsoft's 64-bit compiler has been
957 improved. (ActiveState)
961 =head2 Installation improvements
965 =item Module auxiliary files
967 README files and changelogs for CPAN modules bundled with perl are no
972 =head2 New Or Improved Platforms
974 Perl has been reported to work on Symbian OS. See L<perlsymbian> for more
977 Many improvements have been made towards making Perl work correctly on
980 Perl has been reported to work on DragonFlyBSD.
982 The VMS port has been improved. See L<perlvms>.
984 DynaLoader::dl_unload_file() now works on Windows.
986 Portability of Perl on various recent compilers on Windows has been
987 improved (Borland C++, Visual C++ 7.0).
989 =head1 Selected Bug Fixes
993 =item strictures in regexp-eval blocks
995 C<strict> wasn't in effect in regexp-eval blocks (C</(?{...})/>).
997 =item Calling CORE::require()
999 CORE::require() and CORE::do() were always parsed as require() and do()
1000 when they were overridden. This is now fixed.
1002 =item Subscripts of slices
1004 You can now use a non-arrowed form for chained subscripts after a list
1007 ({foo => "bar"})[0]{foo}
1009 This used to be a syntax error; a C<< -> >> was required.
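A short sketch comparing the two spellings; the data is made up for the example:

```perl
use strict;
use warnings;

# Arrow-less chained subscript directly after a list slice.
my $arrowless = ({ foo => "bar" })[0]{foo};

# The older spelling, with an explicit arrow, still works.
my $arrowed = ({ foo => "bar" })[0]->{foo};

# The same applies to array subscripts chained after a slice.
my @rows = ([1, 2], [3, 4]);
my $cell = (@rows)[1][0];

print "$arrowless $arrowed $cell\n";
```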
1011 =item C<no warnings 'category'> works correctly with -w
1013 Previously, when running with warnings enabled globally via C<-w>, selectively
1014 disabling a specific warning category would actually turn off all warnings.
1015 This is now fixed; C<no warnings 'io';> now turns off only the warnings in the
1016 C<io> class.
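The fixed behaviour can be sketched as follows. To keep the example self-contained, C<$^W> is set in a C<BEGIN> block as a stand-in for the C<-w> switch, and the warnings are collected through a C<__WARN__> handler; both are choices made for this illustration:

```perl
use strict;

# Setting $^W at compile time stands in for the -w switch here.
BEGIN { $^W = 1 }

my @warnings;
$SIG{__WARN__} = sub { push @warnings, $_[0] };

{
    no warnings 'io';

    # Printing to a handle that is not open for output would normally
    # raise a warning from the "io" class; it is silenced here.
    print STDIN "ignored";

    # A warning from another class ("uninitialized") is still reported.
    my $undefined;
    my $text = "value: " . $undefined;
}

print scalar(@warnings), " warning(s) caught\n";
print "first: $warnings[0]" if @warnings;
```

Only the uninitialized-value warning should be collected; before the fix, the block would have been silent.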
1018 =item threads improvements
1020 Several memory leaks in ithreads were closed. Also, ithreads were made
1021 less memory-intensive.
1023 C<threads> is now a dual-life module, also available on CPAN. It has been
1024 expanded in many ways. A kill() method is available for thread signalling.
1025 One can get thread status, or the list of running or joinable threads.
1027 A new C<< threads->exit() >> method is used to exit from the application
1028 (this is the default for the main thread) or from the current thread only
1029 (this is the default for all other threads). On the other hand, the exit()
1030 built-in now always causes the whole application to terminate. (Jerry
1033 =item chr() and negative values
1035 chr() on a negative value now gives C<\x{FFFD}>, the Unicode replacement
1036 character, except when the C<bytes> pragma is in effect, in which case the
1037 low eight bits of the value are used.
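A small sketch of both cases. The C<no warnings 'utf8'> line is only there because chr() on a negative value also emits a warning in that category:

```perl
use strict;
use warnings;

my ($default, $under_bytes);

{
    # Silence the "utf8" category warning that chr(-1) would raise.
    no warnings 'utf8';
    $default = chr(-1);
}

{
    # Under the bytes pragma, only the low eight bits are kept.
    use bytes;
    $under_bytes = chr(-1);
}

printf "default:     U+%04X\n", ord $default;
printf "under bytes: 0x%02X\n", ord $under_bytes;
```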
1039 =item PERL5SHELL and tainting
1041 On Windows, the PERL5SHELL environment variable is now checked for
1042 taintedness. (Rafael Garcia-Suarez)
1044 =item Using *FILE{IO}
1046 C<stat()> and C<-X> filetests now treat *FILE{IO} filehandles like *FILE
1047 filehandles. (Steve Peters)
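A sketch of what this allows, assuming a scratch file created with L<File::Temp>; the handle name C<FILE> simply mirrors the text above:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file so the example does not depend on any fixed path.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
close $tmp_fh;

open(FILE, '<', $tmp_name) or die "Cannot open $tmp_name: $!";

my @via_glob = stat(*FILE);       # the traditional glob form
my @via_io   = stat(*FILE{IO});   # the IO slot, now treated the same way
my $is_file  = -f *FILE{IO};      # filetests accept it as well

close(FILE);

print "same inode\n" if $via_glob[1] == $via_io[1];
print "plain file\n" if $is_file;
```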
1049 =item Overloading and reblessing
1051 Overloading now works when references are reblessed into another class.
1052 Internally, this has been implemented by moving the flag for "overloading"
1053 from the reference to the referent, which logically is where it should
1054 always have been. (Nicholas Clark)
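The effect can be sketched with two hypothetical classes, C<Plain> (no overloading) and C<Pretty> (overloaded stringification). A second reference taken before the rebless sees the overloading too, because the reference and its copy share one referent:

```perl
use strict;
use warnings;

package Plain;    # hypothetical class without overloading
sub new { return bless { name => $_[1] }, $_[0] }

package Pretty;   # hypothetical class that overloads stringification
use overload '""' => sub { "Pretty($_[0]{name})" };

package main;

my $object    = Plain->new('widget');
my $other_ref = $object;    # second reference, taken before reblessing

bless $object, 'Pretty';

print "$object\n";
print "$other_ref\n";
```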
1056 =item Overloading and UTF-8
1058 A few bugs related to UTF-8 handling with objects that have
1059 stringification overloaded have been fixed. (Nicholas Clark)
1061 =item eval memory leaks fixed
1063 Traditionally, C<eval 'syntax error'> has leaked badly. Many (but not all)
1064 of these leaks have now been eliminated or reduced. (Dave Mitchell)
1066 =item Random device on Windows
1068 In previous versions, perl would read the file F</dev/urandom> if it
1069 existed when seeding its random number generator. That file is unlikely
1070 to exist on Windows, and if it did, it would probably not contain appropriate
1071 data, so perl no longer tries to read it on Windows. (Alex Davies)
1075 The C<PERLIO_DEBUG> environment variable no longer has any effect for
1076 setuid scripts and for scripts run with B<-T>.
1078 Moreover, with a thread-enabled perl, using C<PERLIO_DEBUG> could lead to
1079 an internal buffer overflow. This has been fixed.
1083 =head1 New or Changed Diagnostics
1087 =item Deprecated use of my() in false conditional
1089 A new deprecation warning, I<Deprecated use of my() in false conditional>,
1090 has been added, to warn against the use of the dubious and deprecated
1095 See L<perldiag>. Use C<state> variables instead.
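For instance, a counter that keeps its value between calls can be written with C<state> as in this sketch; the subroutine name is made up for the example:

```perl
use strict;
use warnings;
use feature 'state';

# Hypothetical helper: hands out increasing ticket numbers.
sub next_ticket {
    # Initialised once, then preserved across calls.
    state $counter = 0;
    return ++$counter;
}

my @tickets = map { next_ticket() } 1 .. 3;
print "@tickets\n";
```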
1097 =item !=~ should be !~
1099 A new warning, C<!=~ should be !~>, is emitted to prevent this misspelling
1100 of the non-matching operator.
1102 =item Newline in left-justified string
1104 The warning I<Newline in left-justified string> has been removed.
1106 =item Too late for "-T" option
1108 The error I<Too late for "-T" option> has been reformulated to be more
1111 =item "%s" variable %s masks earlier declaration
1113 This warning is now emitted more consistently; in short, whenever one
1114 of the declarations involved is a C<my> variable:
1116 my $x; my $x; # warns
1117 my $x; our $x; # warns
1118 our $x; my $x; # warns
1120 On the other hand, the following:
1124 now gives a C<"our" variable %s redeclared> warning.
1126 =item readdir()/closedir()/etc. attempted on invalid dirhandle
1128 These new warnings are now emitted when a dirhandle is used but is
1129 either closed or not really a dirhandle.
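A minimal way to see one of these warnings is to read from a dirhandle after closing it. The sketch below collects the warning through a C<__WARN__> handler and uses the current directory only because it is certain to exist:

```perl
use strict;
use warnings;
use File::Spec;

my @warnings;
$SIG{__WARN__} = sub { push @warnings, $_[0] };

opendir(my $dh, File::Spec->curdir)
    or die "Cannot open current directory: $!";
closedir($dh);

# The handle is closed, so this read warns and returns nothing useful.
my $entry = readdir($dh);

my $message = @warnings ? $warnings[0] : "no warning was raised\n";
print $message;
```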
1133 C<perl -V> has several improvements, making it more useable from shell
1134 scripts to get the value of configuration variables. See L<perlrun> for
1139 =head1 Changed Internals
1141 In general, the source code of perl has been refactored, tidied up, and
1142 optimized in many places. Also, memory management and allocation have been
1143 improved in a couple of places.
1145 =head2 Reordering of SVt_* constants
1147 The relative ordering of the constants that define the various types of C<SV>
1148 has changed; in particular, C<SVt_PVGV> has been moved before C<SVt_PVLV>,
1149 C<SVt_PVAV>, C<SVt_PVHV> and C<SVt_PVCV>. This is unlikely to make any
1150 difference unless you have code that explicitly makes assumptions about that
1151 ordering. (The inheritance hierarchy of C<B::*> objects has been changed
1154 =head2 Removal of CPP symbols
1156 The C preprocessor symbols C<PERL_PM_APIVERSION> and
1157 C<PERL_XS_APIVERSION>, which were supposed to give the version number of
1158 the oldest perl binary-compatible (resp. source-compatible) with the
1159 present one, were not used, and sometimes had misleading values. They have
1162 =head2 Less space is used by ops
1164 The C<BASEOP> structure now uses less space. The C<op_seq> field has been
1165 removed and replaced by the one-bit field C<op_opt>. C<op_type> is now 9
1166 bits long. (Consequently, the C<B::OP> class doesn't provide a C<seq>
1171 perl's parser is now generated by bison (it used to be generated by
1172 byacc). As a result, it seems to be a bit more robust.
1174 Also, Dave Mitchell improved the lexer debugging output under C<-DT>.
1176 =head2 Use of C<const>
1178 Andy Lester supplied many improvements to determine which function
1179 parameters and local variables could actually be declared C<const> to the C
1180 compiler. Steve Peters provided new C<*_set> macros and reworked the core to
1181 use these rather than assigning to macros in LVALUE context.
1185 A new file, F<mathoms.c>, has been added. It contains functions that are
1186 no longer used in the perl core, but that remain available for binary or
1187 source compatibility reasons. However, those functions will not be
1188 compiled in if you add C<-DNO_MATHOMS> in the compiler flags.
1190 =head2 C<AvFLAGS> has been removed
1192 The C<AvFLAGS> macro has been removed.
1194 =head2 C<av_*> changes
1196 The C<av_*()> functions, used to manipulate arrays, no longer accept null
1201 The implementation of the special variables $^H and %^H has changed, to
1202 allow implementing lexical pragmas in pure perl.
1204 =head2 B:: modules inheritance changed
1206 The inheritance hierarchy of C<B::> modules has changed; C<B::NV> now
1207 inherits from C<B::SV> (it used to inherit from C<B::IV>).
1211 =head1 Known Problems
1213 There is still a problem in the implementation of the lexical
1214 C<$_>: it doesn't work inside C</(?{...})/> blocks. (See the TODO test in
1217 =head1 Platform Specific Problems
1219 =head1 Reporting Bugs
1223 The F<Changes> file and the perl590delta to perl595delta man pages for
1224 exhaustive details on what changed.
1226 The F<INSTALL> file for how to build Perl.
1228 The F<README> file for general stuff.
1230 The F<Artistic> and F<Copying> files for copyright information.