X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperl5100delta.pod;h=fe9f02ee43b885d1f6ae4cab5b35328127dd676c;hb=fd99c0b988756e2bf38f3d45b3d594b10c776fcb;hp=64b67453685a1959a0fe11f7d37931c484e0eb47;hpb=dbef3c66e2761d118774f973c961faa9b1e467d9;p=p5sagit%2Fp5-mst-13.2.git
diff --git a/pod/perl5100delta.pod b/pod/perl5100delta.pod
index 64b6745..fe9f02e 100644
--- a/pod/perl5100delta.pod
+++ b/pod/perl5100delta.pod
@@ -1,6 +1,8 @@ +=encoding utf8 + =head1 NAME -perldelta - what is new for perl 5.10.0 +perl5100delta - what is new for perl 5.10.0 =head1 DESCRIPTION
@@ -21,7 +23,7 @@ pragma, like C or C. Currently the following new features are available: C (adds a switch statement), C (adds a C built-in function), and C -(adds an C keyword for declaring "static" variables). Those +(adds a C keyword for declaring "static" variables). Those features are described in their own sections of this document. The C pragma is also implicitly loaded when you require a minimal
@@ -101,7 +103,7 @@ nested balanced angle brackets: < # match an opening angle bracket (?: # match one of: (?> # don't backtrack over the inside of this group - [^<>]+ # one or more non angle brackets + [^<>]+ # one or more non angle brackets ) # end non backtracking group | # ... or ... (?1) # recurse to bracket 1 and try it again
@@ -111,10 +113,10 @@ nested balanced angle brackets: $ # end of line /x -Note, users experienced with PCRE will find that the Perl implementation -of this feature differs from the PCRE one in that it is possible to -backtrack into a recursed pattern, whereas in PCRE the recursion is -atomic or "possessive" in nature. (Yves Orton) +PCRE users should note that Perl's recursive regex feature allows +backtracking into a recursed pattern, whereas in PCRE the recursion is +atomic or "possessive" in nature. As in the example above, you can +add (?>) to control this selectively.
(Yves Orton) =item Named Capture Buffers
@@ -124,7 +126,7 @@ It's possible to backreference to a named buffer with the C<< \k >> syntax. In code, the new magical hashes C<%+> and C<%-> can be used to access the contents of the capture buffers. -Thus, to replace all doubled chars, one could write +Thus, to replace all doubled chars with a single copy, one could write s/(?.)\k/$+{letter}/g
@@ -177,7 +179,7 @@ that contain backreferences. See L. (Yves Orton) =item C<\K> escape The functionality of Jeff Pinyan's module Regexp::Keep has been added to -the core. You can now use in regular expressions the special escape C<\K> +the core. In regular expressions you can now use the special escape C<\K> as a way to do something like floating length positive lookbehind. It is also useful in substitutions like:
@@ -191,7 +193,7 @@ which is much more efficient. (Yves Orton) =item Vertical and horizontal whitespace, and linebreak -Regular expressions now recognize the C<\v> and C<\h> escapes, that match +Regular expressions now recognize the C<\v> and C<\h> escapes that match vertical and horizontal whitespace, respectively. C<\V> and C<\H> logically match their complements.
@@ -225,10 +227,10 @@ overriding the lexical declaration with C. (Rafael Garcia-Suarez) =head2 The C<_> prototype -A new prototype character has been added. C<_> is equivalent to C<$> (it -denotes a scalar), but defaults to C<$_> if the corresponding argument -isn't supplied. Due to the optional nature of the argument, you can only -use it at the end of a prototype, or before a semicolon. +A new prototype character has been added. C<_> is equivalent to C<$> but +defaults to C<$_> if the corresponding argument isn't supplied (both C<$> +and C<_> denote a scalar). Due to the optional nature of the argument, +you can only use it at the end of a prototype, or before a semicolon.
This has a small incompatible consequence: the prototype() function has been adjusted to return C<_> for some built-ins in appropriate cases (for
@@ -258,7 +260,33 @@ available: the C3 algorithm. See L for more information. Note that, due to changes in the implementation of class hierarchy search, code that used to undef the C<*ISA> glob will most probably break. Anyway, undef'ing C<*ISA> had the side-effect of removing the magic on the @ISA -array and should not have been done in the first place. +array and should not have been done in the first place. Also, the +cache C<*::ISA::CACHE::> no longer exists; to force reset the @ISA cache, +you now need to use the C API, or more simply to assign to @ISA +(e.g. with C<@ISA = @ISA>). + +=head2 readdir() may return a "short filename" on Windows + +The readdir() function may return a "short filename" when the long +filename contains characters outside the ANSI codepage. Similarly +Cwd::cwd() may return a short directory name, and glob() may return short +names as well. On the NTFS file system these short names can always be +represented in the ANSI codepage. This will not be true for all other file +system drivers; e.g. the FAT filesystem stores short filenames in the OEM +codepage, so some files on FAT volumes remain unaccessible through the +ANSI APIs. + +Similarly, $^X, @INC, and $ENV{PATH} are preprocessed at startup to make +sure all paths are valid in the ANSI codepage (if possible). + +The Win32::GetLongPathName() function now returns the UTF-8 encoded +correct long file name instead of using replacement characters to force +the name into the ANSI codepage. The new Win32::GetANSIPathName() +function can be used to turn a long pathname into a short one only if the +long one cannot be represented in the ANSI codepage. + +Many other functions in the C module have been improved to accept +UTF-8 encoded arguments. Please see L for details.
=head2 readpipe() is now overridable
@@ -282,10 +310,10 @@ but retain their previous value. (Rafael Garcia-Suarez, Nicholas Clark) To use state variables, one needs to enable them by using - use feature "state"; + use feature 'state'; or by using the C<-E> command-line switch in one-liners. -See L. +See L. =head2 Stacked filetest operators
@@ -303,14 +331,6 @@ to inheritance). (chromatic) See L<< UNIVERSAL/"$obj->DOES( ROLE )" >>. -=head2 C - -Perl has now support for the C special subroutine. Like -C, C is called once per package; however, it is called -just before cloning starts, and in the context of the parent thread. If it -returns a true value, then no objects of that class will be cloned. See -L for details. (Contributed by Dave Mitchell.) - =head2 Formats Formats were improved in several ways. A new field, C<^*>, can be used for
@@ -351,7 +371,9 @@ You can now use recursive subroutines with sort(), thanks to Robin Houston. The constant folding routine is now wrapped in an exception handler, and if folding throws an exception (such as attempting to evaluate 0/0), perl now retains the current optree, rather than aborting the whole program. -(Nicholas Clark, Dave Mitchell) +Without this change, programs would not compile if they had expressions that +happened to generate exceptions, even though those expressions were in code +that could never be reached at runtime. (Nicholas Clark, Dave Mitchell) =head2 Source filters in @INC
@@ -374,7 +396,7 @@ details. This variable gives the native status returned by the last pipe close, backtick command, successful call to wait() or waitpid(), or from the -system() operator. See L for details. (Contributed by Gisle Aas.) +system() operator. See L for details. (Contributed by Gisle Aas.) =item C<${^RE_TRIE_MAXBUF}>
@@ -397,7 +419,9 @@ such as newline and backspace are output in C<\x> notation, rather than octal. The B<-C> option can no longer be used on the C<#!> line. It wasn't -working there anyway.
+working there anyway, since the standard streams are already set up +at this point in the execution of the perl interpreter. You can use +binmode() instead to get the desired behaviour. =head2 UCD 5.0.0
@@ -406,19 +430,23 @@ been updated to version 5.0.0. =head2 MAD -MAD, which stands for I, is a +MAD, which stands for I, is a still-in-development work leading to a Perl 5 to Perl 6 converter. To enable it, it's necessary to pass the argument C<-Dmad> to Configure. The -obtained perl isn't binary compatible with a regular perl 5.9.4, and has +obtained perl isn't binary compatible with a regular perl 5.10, and has space and speed penalties; moreover not all regression tests still pass with it. (Larry Wall, Nicholas Clark) +=head2 kill() on Windows + +On Windows platforms, C now kills a process tree. +(On UNIX, this delivers the signal to all processes in the same process +group.) + =head1 Incompatible Changes =head2 Packing and UTF-8 strings -=for XXX update this - The semantics of pack() and unpack() regarding UTF-8-encoded data has been changed. Processing is now by default character per character instead of byte per byte on the underlying encoding. Notably, code that used things
@@ -431,7 +459,8 @@ To be consistent with pack(), the C in unpack() templates indicates that the data is to be processed in character mode, i.e. character by character; on the contrary, C in unpack() indicates UTF-8 mode, where the packed string is processed in its UTF-8-encoded Unicode form on a byte -by byte basis. This is reversed with regard to perl 5.8.X. +by byte basis. This is reversed with regard to perl 5.8.X, but now consistent +between pack() and unpack(). Moreover, C and C can also be used in pack() templates to specify respectively character and byte modes.
@@ -514,6 +543,14 @@ Previously, F<.pmc> files were loaded only if more recent than the matching F<.pm> file. Starting with 5.9.4, they'll be always loaded if they exist.
+=head2 $^V is now a C object instead of a v-string + +$^V can still be used with the C<%vd> format in printf, but any +character-level operations will now access the string representation +of the C object and not the ordinals of a v-string. +Expressions like C<< substr($^V, 0, 2) >> or C<< split //, $^V >> +no longer work and must be rewritten. + =head2 @- and @+ in patterns The special arrays C<@-> and C<@+> are no longer interpolated in regular
@@ -537,11 +574,11 @@ equivalent to setting it to C<'DEFAULT'>. (Rafael Garcia-Suarez) =head2 strictures and dereferencing in defined() -C was ignoring taking a hard reference in an argument +C was ignoring taking a hard reference in an argument to defined(), as in : - use strict "refs"; - my $x = "foo"; + use strict 'refs'; + my $x = 'foo'; if (defined $$x) {...} This now correctly produces the run-time error Cisa($bar)> lookup. =head1 Modules and Pragmata +=head2 Upgrading individual core modules + +Even more core modules are now also available separately through the +CPAN. If you wish to update one of these modules, you don't need to +wait for a new perl release. From within the cpan shell, running the +'r' command will report on modules with upgrades available. See +C for more information. + =head2 Pragmata Changes =over 4
@@ -623,6 +668,10 @@ The C pragma now warns if a class tries to inherit from itself. C and C will now complain loudly if they are loaded via incorrect casing (as in C). (Johan Vromans) +=item C + +The C module provides support for version objects. + =item C The C pragma doesn't load C anymore. That means that code
@@ -632,7 +681,7 @@ anymore, and will require parentheses to be added after the function name: use warnings; require Carp; - Carp::confess "argh"; + Carp::confess 'argh'; =item C
@@ -781,6 +830,16 @@ for F<.tar> (plain, gziped or bzipped) or F<.zip> files. C provides an API and a command-line tool to access the CPAN mirrors.
+=item * + +C provides utilities that are useful in decoding Pod +EE...E sequences. + +=item * + +C is now the backend for several of the Pod-related modules +included with Perl. + =back =head2 Selected Changes to Core Modules
@@ -792,6 +851,9 @@ mirrors. C can now report the caller's file and line number. (David Feldman) +All interpreted attributes are now passed as array references. (Damian +Conway) + =item C C is now based on C, and so can be extended
@@ -826,17 +888,14 @@ rerunning all bar the last command from a saved command history. It can also display the parent inheritance tree of a given class, with the C command. -Perl has a new -dt command-line flag, which enables threads support in the -debugger. - =item ptar -C is a pure perl implementation of C, that comes with +C is a pure perl implementation of C that comes with C. =item ptardiff -C is a small script used to generate a diff between the contents +C is a small utility used to generate a diff between the contents of a tar archive and a directory tree. Like C, it comes with C.
@@ -852,7 +911,7 @@ above). =item h2ph and h2xs -C and C have been made a bit more robust with regard to +C and C have been made more robust with regard to "modern" C code. C implements a new option C<--use-xsloader> to force use of
@@ -886,13 +945,13 @@ their parent modules.) =item cpanp -C, the CPANPLUS shell, has been added. (C, an +C, the CPANPLUS shell, has been added. (C, a helper for CPANPLUS operation, has been added too, but isn't intended for direct use). =item cpan2dist -C is a new utility, that comes with CPANPLUS. It's a tool to +C is a new utility that comes with CPANPLUS. It's a tool to create distributions (or packages) from CPAN modules. =item pod2html
@@ -914,6 +973,10 @@ Inc. The L manpage, courtesy of Yves Orton, describes internals of the Perl regular expression engine.
+The L manpage describes the interface to the perl interpreter +used to write pluggable regular expression engines (by Ævar Arnfjörð +Bjarmason). + The L manpage is an tutorial for programming with Unicode and string encodings in Perl, courtesy of Juerd Waalboer.
@@ -967,7 +1030,7 @@ their system dependent constants - as a result C now takes about The new compilation flag C, introduced as an option in perl 5.8.8, is turned on by default in perl 5.9.3. It prevents perl -from creating an empty scalar with every new typeglob. See L +from creating an empty scalar with every new typeglob. See L for details. =head2 Weak references are cheaper
@@ -1030,13 +1093,12 @@ this optimization. (Yves Orton) B Much code exists that works around perl's historic poor performance on alternations. Often the tricks used to do so will disable the new optimisations. Hopefully the utility modules used for this purpose -will be educated about these new optimisations by the time 5.10 is -released. +will be educated about these new optimisations. =item Aho-Corasick start-point optimisation When a pattern starts with a trie-able alternation and there aren't -better optimisations available the regex engine will use Aho-Corasick +better optimisations available, the regex engine will use Aho-Corasick matching to find the start point. (Yves Orton) =back
@@ -1123,12 +1185,6 @@ Efforts have been made to make perl and the core XS modules compilable with various C++ compilers (although the situation is not perfect with some of the compilers on some of the platforms tested.) -=item Building XS extensions on Windows - -Support for building XS extension modules with the free MinGW compiler has -been improved in the case where perl itself was built with the Microsoft -VC++ compiler. (ActiveState) - =item Support for Microsoft 64-bit compiler Support for building perl with Microsoft's 64-bit compiler has been
@@ -1136,7 +1192,7 @@ improved.
(ActiveState) =item Visual C++ -Perl now can be compiled with Microsoft Visual C++. +Perl can now be compiled with Microsoft Visual C++ 2005 (and 2008 Beta 2). =item Win32 builds
@@ -1165,9 +1221,14 @@ z/OS. Perl has been reported to work on DragonFlyBSD and MidnightBSD. +Perl has also been reported to work on NexentaOS +( http://www.gnusolaris.org/ ). + The VMS port has been improved. See L. -Support for Cray XT4 Catamount/Qk has been added. +Support for Cray XT4 Catamount/Qk has been added. See +F in the source code distribution for more +information. Vendor patches have been merged for RedHat and Gentoo.
@@ -1221,7 +1282,7 @@ D. Hedden) chr() on a negative value now gives C<\x{FFFD}>, the Unicode replacement character, unless when the C pragma is in effect, where the low -eight bytes of the value are used. +eight bits of the value are used. =item PERL5SHELL and tainting
@@ -1259,7 +1320,7 @@ data, so perl no longer tries to read it on Windows. (Alex Davies) =item PERLIO_DEBUG -The C environment variable has no longer any effect for +The C environment variable no longer has any effect for setuid scripts and for scripts run with B<-T>. Moreover, with a thread-enabled perl, using C could lead to
@@ -1292,7 +1353,7 @@ accordingly to the contents of that %INC entry. (Rafael) =item C<-t> switch fix The C<-w> and C<-t> switches can now be used together without messing -up what categories of warnings are activated or not. (Rafael) +up which categories of warnings are activated. (Rafael) =item Duping UTF-8 filehandles
@@ -1301,7 +1362,7 @@ properly carry that layer on the duped filehandle. (Rafael) =item Localisation of hash elements -Localizing an hash element whose key was given as a variable didn't work +Localizing a hash element whose key was given as a variable didn't work correctly if the variable was changed while the local() was in effect (as in C). (Bo Lindbergh)
@@ -1311,6 +1372,11 @@ in C).
(Bo Lindbergh) =over 4 +=item Use of uninitialized value + +Perl will now try to tell you the name of the variable (if any) that was +undefined. + =item Deprecated use of my() in false conditional A new deprecation warning, I,
@@ -1366,6 +1432,11 @@ Two deprecation warnings have been added: (Rafael) Perl's command-line switch C<-P> is now deprecated. +=item v-string in use/require is non-portable + +Perl will warn you against potential backwards compatibility problems with +the C syntax. + =item perl -V C has several improvements, making it more useable from shell
@@ -1376,9 +1447,15 @@ details. =head1 Changed Internals -In general, the source code of perl has been refactored, tied up, and -optimized in many places. Also, memory management and allocation has been -improved in a couple of points. +In general, the source code of perl has been refactored, tidied up, +and optimized in many places. Also, memory management and allocation +has been improved in several points. + +When compiling the perl core with gcc, as many gcc warning flags are +turned on as is possible on the platform. (This quest for cleanliness +doesn't extend to XS code because we cannot guarantee the tidiness of +code we didn't write.) Similar strictness flags have been added or +tightened for various other C compilers. =head2 Reordering of SVt_* constants
@@ -1389,6 +1466,19 @@ difference unless you have code that explicitly makes assumptions about that ordering. (The inheritance hierarchy of C objects has been changed to reflect this.) +=head2 Elimination of SVt_PVBM + +Related to this, the internal type C has been been removed. This +dedicated type of C was used by the C operator and parts of the +regexp engine to facilitate fast Boyer-Moore matches. Its use internally has +been replaced by Cs of type C. + +=head2 New type SVt_BIND + +A new type C has been added, in readiness for the project to +implement Perl 6 on 5.
There deliberately is no implementation yet, and +they cannot yet be created or destroyed. + =head2 Removal of CPP symbols The C preprocessor symbols C and
@@ -1400,7 +1490,7 @@ been removed. =head2 Less space is used by ops The C structure now uses less space. The C field has been -removed and replaced by the one-bit fields C. C is now 9 +removed and replaced by a single bit bit-field C. C is now 9 bits long. (Consequently, the C class doesn't provide an C method anymore.)
@@ -1437,7 +1527,7 @@ C parameters. =head2 $^H and %^H The implementation of the special variables $^H and %^H has changed, to -allow implementing lexical pragmas in pure perl. +allow implementing lexical pragmas in pure Perl. =head2 B:: modules inheritance changed
@@ -1448,11 +1538,7 @@ inherits from C (it used to inherit from C). The anonymous hash and array constructors now take 1 op in the optree instead of 3, now that pp_anonhash and pp_anonlist return a reference to -an hash/array when the op is flagged with OPf_SPECIAL (Nicholas Clark). - -=for p5p XXX have we some docs on how to create regexp engine plugins, since that's now possible ? (perlreguts) - -=for p5p XXX new BIND SV type, #29544, #29642 +an hash/array when the op is flagged with OPf_SPECIAL. (Nicholas Clark) =head1 Known Problems
@@ -1460,10 +1546,37 @@ There's still a remaining problem in the implementation of the lexical C<$_>: it doesn't work inside C blocks. (See the TODO test in F.) +Stacked filetest operators won't work when the C pragma is in +effect, because they rely on the stat() buffer C<_> being populated, and +filetest bypasses stat(). + +=head2 UTF-8 problems + +The handling of Unicode still is unclean in several places, where it's +dependent on whether a string is internally flagged as UTF-8. This will +be made more consistent in perl 5.12, but that won't be possible without +a certain amount of backwards incompatibility.
+ =head1 Platform Specific Problems +When compiled with g++ and thread support on Linux, it's reported that the +C<$!> stops working correctly. This is related to the fact that the glibc +provides two strerror_r(3) implementation, and perl selects the wrong +one. + =head1 Reporting Bugs +If you find what you think is a bug, you might check the articles +recently posted to the comp.lang.perl.misc newsgroup and the perl +bug database at http://rt.perl.org/rt3/ . There may also be +information at http://www.perl.org/ , the Perl Home Page. + +If you believe you have an unreported bug, please run the B +program included with your release. Be sure to trim your bug down +to a tiny but sufficient test case. Your bug report, along with the +output of C, will be sent off to perlbug@perl.org to be +analysed by the Perl porting team. + =head1 SEE ALSO The F file and the perl590delta to perl595delta man pages for