Commit | Line | Data |
f6eae373 |
1 | =head1 NAME |
2 | |
3 | perldelta - what is new for perl v5.9.5 |
4 | |
5 | =head1 DESCRIPTION |
6 | |
7 | This document describes differences between the 5.9.4 and the 5.9.5 |
8 | development releases. See L<perl590delta>, L<perl591delta>, |
9 | L<perl592delta>, L<perl593delta> and L<perl594delta> for the differences |
10 | between 5.8.0 and 5.9.4. |
11 | |
12 | =head1 Incompatible Changes |
13 | |
20ee07fb |
14 | =head2 Tainting and printf |
15 | |
16 | When perl is run under taint mode, C<printf()> and C<sprintf()> will now |
5a093634 |
17 | reject any tainted format argument. (Rafael Garcia-Suarez) |
20ee07fb |
18 | |
54a37cc6 |
19 | =head2 undef and signal handlers |
20 | |
21 | Undefining or deleting a signal handler via C<undef $SIG{FOO}> is now |
22 | equivalent to setting it to C<'DEFAULT'>. |
23 | |
73966613 |
24 | =head2 Removal of the bytecode compiler and of perlcc |
25 | |
26 | C<perlcc>, the byteloader and the supporting modules (B::C, B::CC, |
27 | B::Bytecode, etc.) are no longer distributed with the perl sources. Those |
28 | experimental tools have never worked reliably, and, due to the lack of |
29 | volunteers to keep them in line with the perl interpreter developments, it |
30 | was decided to remove them instead of shipping a broken version of those. |
31 | The last version of those modules can be found with perl 5.9.4. |
32 | |
33 | However, the B compiler framework stays supported in the perl core, as do |
34 | the more useful modules it has made possible (among others, B::Deparse and |
35 | B::Concise). |
36 | |
37 | =head2 Removal of the JPL |
38 | |
39 | The JPL (Java-Perl Lingo) has been removed from the perl sources tarball. |
40 | |
f6eae373 |
41 | =head1 Core Enhancements |
42 | |
072f65b4 |
43 | =head2 Regular expressions |
44 | |
45 | =over 4 |
46 | |
47 | =item Recursive Patterns |
48 | |
49 | It is now possible to write recursive patterns without using the C<(??{})> |
50 | construct. This new way is more efficient, and in many cases easier to |
51 | read. |
52 | |
53 | Each capturing parenthesis can now be treated as an independent pattern |
54 | that can be entered by using the C<(?PARNO)> syntax (C<PARNO> standing for |
55 | "parenthesis number"). For example, the following pattern will match |
56 | nested balanced angle brackets: |
57 | |
58 | / |
59 | ^ # start of line |
60 | ( # start capture buffer 1 |
61 | < # match an opening angle bracket |
62 | (?: # match one of: |
63 | (?> # don't backtrack over the inside of this group |
64 | [^<>]+ # one or more non angle brackets |
65 | ) # end non backtracking group |
66 | | # ... or ... |
67 | (?1) # recurse to bracket 1 and try it again |
68 | )* # 0 or more times. |
69 | > # match a closing angle bracket |
70 | ) # end capture buffer one |
71 | $ # end of line |
72 | /x |
73 | |
74 | Note that users experienced with PCRE will find that the Perl implementation |
75 | of this feature differs from the PCRE one in that it is possible to |
76 | backtrack into a recursed pattern, whereas in PCRE the recursion is |
73966613 |
77 | atomic or "possessive" in nature. (Yves Orton) |
072f65b4 |
78 | |
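As a minimal sketch of how the pattern above behaves, the following script stores it in a C<qr//> object and tries it on a few strings. The variable names and the sample strings are illustrative assumptions, not part of the original notes.

```perl
use strict;
use warnings;

# The balanced-angle-bracket pattern from the text, kept in a qr// object.
my $balanced = qr/
    ^
    (                      # capture buffer 1
        <
        (?:
            (?> [^<>]+ )   # run of non-bracket characters, no backtracking
          |
            (?1)           # recurse into buffer 1
        )*
        >
    )
    $
/x;

# Sample inputs chosen for illustration only.
for my $candidate ('<a<b>c>', '<>', '<a<b>', 'a<b>') {
    my $verdict = $candidate =~ $balanced ? 'balanced' : 'not balanced';
    print "$candidate: $verdict\n";
}
```

Note that C<(?1)> refers to buffer 1 of the whole pattern, so embedding C<$balanced> in a larger pattern that has its own capture groups would change what it recurses into.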
79 | =item Named Capture Buffers |
80 | |
81 | It is now possible to name capturing parentheses in a pattern and refer to |
82 | the captured contents by name. The naming syntax is C<< (?<NAME>....) >>. |
83 | It's possible to backreference to a named buffer with the C<< \k<NAME> >> |
84 | syntax. In code, the new magical hash C<%+> can be used to access the |
85 | contents of the buffers. |
86 | |
87 | Thus, to replace all doubled chars, one could write |
88 | |
89 | s/(?<letter>.)\k<letter>/$+{letter}/g |
90 | |
91 | Only buffers with defined contents will be "visible" in the hash, so |
92 | it's possible to do something like |
93 | |
94 | foreach my $name (keys %+) { |
95 | print "content of buffer '$name' is $+{$name}\n"; |
96 | } |
97 | |
98 | Users exposed to the .NET regex engine will find that the perl |
99 | implementation differs in that the numerical ordering of the buffers |
100 | is sequential, and not "unnamed first, then named". Thus in the pattern |
101 | |
102 | /(A)(?<B>B)(C)(?<D>D)/ |
103 | |
104 | $1 will be 'A', $2 will be 'B', $3 will be 'C' and $4 will be 'D', and not |
105 | $1 'A', $2 'C', $3 'B' and $4 'D' as a .NET programmer |
73966613 |
106 | would expect. This is considered a feature. :-) (Yves Orton) |
072f65b4 |
107 | |
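The pieces described in this item can be tried together in a short script. This is only a sketch: the sample strings and variable names are assumptions made for illustration.

```perl
use strict;
use warnings;

# Collapse doubled characters, as in the substitution shown above.
my $word = 'bookkeeper';
(my $squeezed = $word) =~ s/(?<letter>.)\k<letter>/$+{letter}/g;
print "$squeezed\n";

# Copy the named buffers out of %+ after a successful match.
my %date;
if ('2007-07-07' =~ /(?<year>\d{4})-(?<month>\d\d)-(?<day>\d\d)/) {
    %date = %+;
}
print "$date{year} $date{month} $date{day}\n";

# Named and unnamed buffers share one sequential numbering.
my @numbered = 'ABCD' =~ /(A)(?<B>B)(C)(?<D>D)/;
print "@numbered\n";
```

Copying C<%+> into an ordinary hash right after the match is a defensive choice here, since the contents of C<%+> change with the next successful match.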
b9b4dddf |
108 | =item Possessive Quantifiers |
109 | |
ee9b8eae |
110 | Perl now supports the "possessive quantifier" syntax of the "atomic match" |
b9b4dddf |
111 | pattern. Basically a possessive quantifier matches as much as it can and never |
ee9b8eae |
112 | gives any back. Thus it can be used to control backtracking. The syntax is |
b9b4dddf |
113 | similar to non-greedy matching, except instead of using a '?' as the modifier |
114 | the '+' is used. Thus C<?+>, C<*+>, C<++>, C<{min,max}+> are now legal |
73966613 |
115 | quantifiers. (Yves Orton) |
b9b4dddf |
116 | |
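A small sketch of the difference between a greedy and a possessive quantifier follows; the test string and variable names are chosen for illustration only.

```perl
use strict;
use warnings;

# Greedy a+ first takes all three characters, then gives one back
# so that the trailing "a" can match.
my $greedy = 'aaa' =~ /^a+a$/ ? 1 : 0;

# Possessive a++ takes all three characters and never gives any back,
# so the trailing "a" has nothing left to match.
my $possessive = 'aaa' =~ /^a++a$/ ? 1 : 0;

# The same behaviour written with the older atomic group syntax.
my $atomic = 'aaa' =~ /^(?>a+)a$/ ? 1 : 0;

print "greedy=$greedy possessive=$possessive atomic=$atomic\n";
```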
24b23f37 |
117 | =item Backtracking control verbs |
118 | |
119 | The regex engine now supports a number of special purpose backtrack |
5d458dd8 |
120 | control verbs: (*THEN), (*PRUNE), (*MARK), (*SKIP), (*COMMIT), (*FAIL) |
c74340f9 |
121 | and (*ACCEPT). See L<perlre> for their descriptions. (Yves Orton) |
122 | |
123 | =item Relative backreferences |
124 | |
2bf803e2 |
125 | A new syntax C<\g{N}> or C<\gN> where "N" is a decimal integer allows a |
126 | safer form of back-reference notation as well as allowing relative |
127 | backreferences. This should make it easier to generate and embed patterns |
c74340f9 |
128 | that contain backreferences. (Yves Orton) |
24b23f37 |
129 | |
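To illustrate why relative backreferences help when embedding patterns, the sketch below builds the same "doubled character" fragment twice, once with C<\g{-1}> and once with C<\1>, and embeds each in a pattern that already has a capture group. The fragment names and the sample string are assumptions.

```perl
use strict;
use warnings;

# A fragment that matches any doubled word character.
my $doubled_rel = qr/(\w)\g{-1}/;   # refers to the group just before it
my $doubled_abs = qr/(\w)\1/;       # refers to group 1 of the final pattern

# Embedded after another group, the relative form still points at its own
# (\w) group, while \1 now points at the (x) group of the outer pattern.
my $rel_ok = 'xoo' =~ /(x)$doubled_rel/ ? 1 : 0;
my $abs_ok = 'xoo' =~ /(x)$doubled_abs/ ? 1 : 0;

print "relative=$rel_ok absolute=$abs_ok\n";
```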
072f65b4 |
130 | =back |
131 | |
ee9b8eae |
132 | =head3 Regexp::Keep internalized |
133 | |
134 | The functionality of Jeff Pinyan's module Regexp::Keep has been added to |
135 | the core. You can now use in regular expressions the special escape C<\K> |
136 | as a way to do something like floating length positive lookbehind. It is |
137 | also useful in substitutions like: |
138 | |
139 | s/(foo)bar/$1/g |
140 | |
141 | that can now be converted to |
142 | |
143 | s/foo\Kbar//g |
144 | |
145 | which is much more efficient. |
146 | |
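The two substitutions above can be compared directly. The following is a minimal sketch with an assumed sample string; it also shows C<\K> used outside a substitution to keep only the tail of a match.

```perl
use strict;
use warnings;

my $text = 'foobar foobaz barfoo';

# Old style: capture "foo" and put it back.
(my $with_capture = $text) =~ s/(foo)bar/$1/g;

# New style: \K keeps "foo" out of the part that gets replaced.
(my $with_keep = $text) =~ s/foo\Kbar//g;

print "$with_capture\n$with_keep\n";

# Outside s///, everything before \K is left out of the matched text.
my $matched = 'price:   42' =~ /price:\s*\K\d+/ ? $& : '';
print "$matched\n";
```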
d5494b07 |
147 | =head2 The C<_> prototype |
148 | |
149 | A new prototype character has been added. C<_> is equivalent to C<$> (it |
150 | denotes a scalar), but defaults to C<$_> if the corresponding argument |
151 | isn't supplied. Due to the optional nature of the argument, you can only |
152 | use it at the end of a prototype, or before a semicolon. |
153 | |
73966613 |
154 | This has a small incompatible consequence: the prototype() function has |
155 | been adjusted to return C<_> for some built-ins in appropriate cases (for |
156 | example, C<prototype('CORE::rmdir')>). (Rafael Garcia-Suarez) |
157 | |
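A short sketch of the new prototype character in use follows. The subroutine C<shout> is a hypothetical helper invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: upper-cases its argument, or $_ when none is given.
sub shout(_) {
    return uc $_[0];
}

my @out;
for ('hello') {
    push @out, shout();      # no argument supplied, so $_ is used
}
push @out, shout('bye');     # explicit argument
print "@out\n";

# Built-ins that default to $_ now report the new prototype character.
print prototype('CORE::rmdir'), "\n";
```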
49f595a6 |
158 | =head2 UNITCHECK blocks |
159 | |
160 | C<UNITCHECK>, a new special code block, has been introduced, in addition to |
161 | C<BEGIN>, C<CHECK>, C<INIT> and C<END>. |
162 | |
163 | C<CHECK> and C<INIT> blocks, while useful for some specialized purposes, |
164 | are always executed at the transition between the compilation and the |
165 | execution of the main program, and thus are useless whenever code is |
166 | loaded at runtime. On the other hand, C<UNITCHECK> blocks are executed |
167 | just after the unit which defined them has been compiled. See L<perlmod> |
168 | for more information. (Alex Gough) |
169 | |
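The timing described above can be observed with code compiled at runtime. The sketch below, which assumes a package array named C<@events> purely for illustration, compiles a unit with a string C<eval> and records the order in which things happen.

```perl
use strict;
use warnings;

our @events;    # illustrative log of what ran, in order

push @events, 'before eval';

# The string is compiled at runtime; its UNITCHECK block runs as soon as
# that compilation finishes, before the rest of the string is executed.
eval q{
    UNITCHECK { push @events, 'UNITCHECK' }
    push @events, 'eval body';
    1;
} or die $@;

print join(', ', @events), "\n";
```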
5a093634 |
170 | =head2 readpipe() is now overridable |
171 | |
172 | The built-in function readpipe() is now overridable. Overriding it also |
173 | overrides its operator counterpart, C<qx//> (a.k.a. C<``>). (Rafael |
174 | Garcia-Suarez) |
175 | |
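One way to try this is a global override installed through C<CORE::GLOBAL>. The sketch below is an illustration only: the replacement never runs a command and simply returns a marker string, which is an assumption made to keep the example free of side effects.

```perl
use strict;
use warnings;

BEGIN {
    no warnings 'once';
    # Illustrative override: report the command instead of running it.
    *CORE::GLOBAL::readpipe = sub {
        my ($command) = @_;
        return "intercepted: $command\n";
    };
}

my $via_backticks = `echo hello`;            # operator form
my $via_function  = readpipe('echo hello');  # function form

print $via_backticks, $via_function;
```

The override has to be in place at compile time, hence the C<BEGIN> block.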
73966613 |
176 | =head2 UCD 5.0.0 |
177 | |
178 | The copy of the Unicode Character Database included in Perl 5.9 has |
179 | been updated to version 5.0.0. |
180 | |
f6eae373 |
181 | =head1 Modules and Pragmas |
182 | |
183 | =head2 New Core Modules |
184 | |
73966613 |
185 | =over 4 |
186 | |
187 | =item * |
188 | |
189 | C<Locale::Maketext::Simple>, needed by CPANPLUS, is a simple wrapper around |
190 | C<Locale::Maketext::Lexicon>. Note that C<Locale::Maketext::Lexicon> isn't |
191 | included in the perl core; the behaviour of C<Locale::Maketext::Simple> |
192 | gracefully degrades when the latter isn't present. |
193 | |
194 | =item * |
195 | |
196 | C<Params::Check> implements a generic input parsing/checking mechanism. It |
197 | is used by CPANPLUS. |
198 | |
5a093634 |
199 | =item * |
200 | |
201 | C<Term::UI> simplifies the task of asking questions at a terminal prompt. |
202 | |
203 | =item * |
204 | |
205 | C<Object::Accessor> provides an interface to create per-object accessors. |
206 | |
73966613 |
207 | =back |
208 | |
d5494b07 |
209 | =head2 Module changes |
210 | |
211 | =over 4 |
212 | |
213 | =item C<base> |
214 | |
215 | The C<base> pragma now warns if a class tries to inherit from itself. |
216 | |
18857c0b |
217 | =item C<warnings> |
218 | |
219 | The C<warnings> pragma doesn't load C<Carp> anymore. That means that code |
220 | that used C<Carp> routines without having loaded it at compile time might |
221 | need to be adjusted; typically, the following (faulty) code won't work |
222 | anymore, and will require parentheses to be added after the function name: |
223 | |
224 | use warnings; |
225 | require Carp; |
226 | Carp::confess "argh"; |
227 | |
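A sketch of the adjusted form follows. With parentheses the call parses even though C<Carp> is only loaded at runtime; the C<eval> wrapper is an addition for this example so that the script can report the error instead of dying.

```perl
use strict;
use warnings;

require Carp;

# Parentheses make this an ordinary function call that is resolved at
# runtime, after Carp has been loaded by the require above.
my $survived = eval { Carp::confess("argh"); 1 };
my $message  = $@;

print "confess died as expected\n" if !$survived && $message =~ /^argh/;
```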
d5494b07 |
228 | =back |
229 | |
f6eae373 |
230 | =head1 Utility Changes |
231 | |
232 | =head1 Documentation |
233 | |
234 | =head1 Performance Enhancements |
235 | |
236 | =head1 Installation and Configuration Improvements |
237 | |
73966613 |
238 | =head2 C++ compatibility |
239 | |
240 | Efforts have been made to make perl and the core XS modules compilable |
241 | with various C++ compilers (although the situation is not perfect with |
242 | some of the compilers on some of the platforms tested). |
243 | |
244 | =head2 Ports |
245 | |
246 | Perl has been reported to work on MidnightBSD. |
247 | |
f6eae373 |
248 | =head1 Selected Bug Fixes |
249 | |
49f595a6 |
250 | PerlIO::scalar will now prevent writing to read-only scalars. Moreover, |
251 | seek() is now supported with PerlIO::scalar-based filehandles, the |
252 | underlying string being zero-filled as needed. |
73966613 |
253 | |
254 | study() never worked for UTF-8 strings, but could lead to false results. |
255 | It's now a no-op on UTF-8 data. (Yves Orton) |
256 | |
49f595a6 |
257 | The signals SIGILL, SIGBUS and SIGSEGV are now always delivered in an |
258 | "unsafe" manner (unlike other signals, which are deferred until the |
259 | perl interpreter reaches a reasonably stable state; see |
260 | L<perlipc/"Deferred Signals (Safe Signals)">). |
261 | |
5a093634 |
262 | When a module or a file is loaded through an @INC-hook, and when this hook |
263 | has set a filename entry in %INC, __FILE__ is now set for this module |
264 | according to the contents of that %INC entry. |
265 | |
f6eae373 |
266 | =head1 New or Changed Diagnostics |
267 | |
268 | =head1 Changed Internals |
269 | |
73966613 |
270 | The anonymous hash and array constructors now take 1 op in the optree |
271 | instead of 3, now that pp_anonhash and pp_anonlist return a reference to |
272 | a hash/array when the op is flagged with OPf_SPECIAL (Nicholas Clark). |
273 | |
f6eae373 |
274 | =head1 Known Problems |
275 | |
276 | =head2 Platform Specific Problems |
277 | |
278 | =head1 Reporting Bugs |
279 | |
280 | If you find what you think is a bug, you might check the articles |
281 | recently posted to the comp.lang.perl.misc newsgroup and the perl |
282 | bug database at http://rt.perl.org/rt3/ . There may also be |
283 | information at http://www.perl.org/ , the Perl Home Page. |
284 | |
285 | If you believe you have an unreported bug, please run the B<perlbug> |
286 | program included with your release. Be sure to trim your bug down |
287 | to a tiny but sufficient test case. Your bug report, along with the |
288 | output of C<perl -V>, will be sent off to perlbug@perl.org to be |
289 | analysed by the Perl porting team. |
290 | |
291 | =head1 SEE ALSO |
292 | |
293 | The F<Changes> file for exhaustive details on what changed. |
294 | |
295 | The F<INSTALL> file for how to build Perl. |
296 | |
297 | The F<README> file for general stuff. |
298 | |
299 | The F<Artistic> and F<Copying> files for copyright information. |
300 | |
301 | =cut |