=head1 NAME

perltrap - Perl traps for the unwary

=head1 DESCRIPTION

The biggest trap of all is forgetting to use the B<-w> switch; see
L<perlrun>. The second biggest trap is not making your entire program
runnable under C<use strict>. The third biggest trap is not reading
the list of changes in this version of Perl; see L<perldelta>.

=head2 Awk Traps

Accustomed B<awk> users should take special note of the following:

=over 4

=item *

The English module, loaded via

    use English;

allows you to refer to special variables (like C<$/>) with names (like
$RS), as though they were in B<awk>; see L<perlvar> for details.

=item *

Semicolons are required after all simple statements in Perl (except
at the end of a block). Newline is not a statement delimiter.

=item *

Curly brackets are required on C<if>s and C<while>s.

=item *

Variables begin with "$", "@" or "%" in Perl.

=item *

Arrays index from 0. Likewise string positions in substr() and
index().

=item *

You have to decide whether your array has numeric or string indices.

=item *

Hash values do not spring into existence upon mere reference.

=item *

You have to decide whether you want to use string or numeric
comparisons.

=item *

Reading an input line does not split it for you. You get to split it
to an array yourself. And the split() operator has different
arguments than B<awk>'s.

=item *

The current input line is normally in $_, not $0. It generally does
not have the newline stripped. ($0 is the name of the program
executed.) See L<perlvar>.

=item *

$<I<digit>> does not refer to fields--it refers to substrings matched
by the last match pattern.

=item *

The print() statement does not add field and record separators unless
you set C<$,> and C<$\>. You can set $OFS and $ORS if you're using
the English module.

=item *

You must open your files before you print to them.

=item *

The range operator is "..", not comma. The comma operator works as in
C.

=item *

The match operator is "=~", not "~". ("~" is the one's complement
operator, as in C.)

=item *

The exponentiation operator is "**", not "^". "^" is the XOR
operator, as in C. (You know, one could get the feeling that B<awk> is
basically incompatible with C.)

=item *

The concatenation operator is ".", not the null string. (Using the
null string would render C</pat/ /pat/> unparsable, because the third slash
would be interpreted as a division operator--the tokenizer is in fact
slightly context sensitive for operators like "/", "?", and ">".
And in fact, "." itself can be the beginning of a number.)

=item *

The C<next>, C<exit>, and C<continue> keywords work differently.

=item *

The following variables work differently:

    Awk         Perl
    ARGC        $#ARGV or scalar @ARGV
    ARGV[0]     $0
    FILENAME    $ARGV
    FNR         $. - something
    FS          (whatever you like)
    NF          $#Fld, or some such
    NR          $.
    OFMT        $#
    OFS         $,
    ORS         $\
    RLENGTH     length($&)
    RS          $/
    RSTART      length($`)
    SUBSEP      $;

=item *

You cannot set $RS to a pattern, only a string.

=item *

When in doubt, run the B<awk> construct through B<a2p> and see what it
gives you.

=back

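
Several of the points above can be seen together in a short sketch.
It is only an illustration, not part of any B<awk> conversion recipe:
the sample record and the variable names (C<$line>, C<@fld>, C<$nf>
and so on) are assumptions made up for this example.

```perl
use strict;
use warnings;

# One input record, as a readline would return it (newline still attached).
my $line = "alpha beta gamma\n";

chomp $line;                     # Perl does not strip the newline for you
my @fld  = split ' ', $line;     # nor does it split the record into fields
my $nf   = scalar @fld;          # roughly awk's NF
my $one  = $fld[0];              # awk's $1; Perl arrays index from 0
my $last = $fld[-1];             # awk's $NF

my $cube = 2 ** 3;               # exponentiation is "**" ...
my $xor  = 2 ^ 3;                # ... while "^" is XOR, as in C
my $cat  = $one . '-' . $last;   # concatenation needs an explicit "."
```

Nothing is printed here; each trap is captured in a variable so the
values can be inspected one at a time.
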
=head2 C Traps

Cerebral C programmers should take note of the following:

=over 4

=item *

Curly brackets are required on C<if>s and C<while>s.

=item *

You must use C<elsif> rather than C<else if>.

=item *

The C<break> and C<continue> keywords from C become C<last> and
C<next>, respectively, in Perl.
Unlike in C, these do I<not> work within a C<do { } while> construct.

=item *

There's no switch statement. (But it's easy to build one on the fly.)

=item *

Variables begin with "$", "@" or "%" in Perl.

=item *

C<printf()> does not implement the "*" format for interpolating
field widths, but it's trivial to use interpolation of double-quoted
strings to achieve the same effect.

=item *

Comments begin with "#", not "/*".

=item *

You can't take the address of anything, although a similar operator
in Perl is the backslash, which creates a reference.

=item *

C<ARGV> must be capitalized. C<$ARGV[0]> is C's C<argv[1]>, and C<argv[0]>
ends up in C<$0>.

=item *

System calls such as link(), unlink(), rename(), etc. return nonzero for
success, not 0.

=item *

Signal handlers deal with signal names, not numbers. Use C<kill -l>
to find their names on your system.

=back

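
A minimal sketch of a few of these C traps follows. The loop bounds
and the names C<@evens>, C<$size>, C<$width> and C<$padded> are
assumptions chosen for the example; the field-width line shows the
string-interpolation workaround mentioned above.

```perl
use strict;
use warnings;

my @evens;
for my $n (1 .. 10) {
    next if $n % 2;      # C's "continue": skip odd numbers
    last if $n > 6;      # C's "break": stop once past 6
    push @evens, $n;
}

my $size;
if    (@evens == 0) { $size = 'empty' }
elsif (@evens <  3) { $size = 'small' }   # "elsif", never "else if"
else                { $size = 'large' }

my $width  = 6;
my $padded = sprintf("%${width}d", 42);   # width interpolated into the format

my $ref   = \@evens;     # backslash creates a reference, not an address
my $first = $ref->[0];   # dereference to reach the first element
```
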
=head2 Sed Traps

Seasoned B<sed> programmers should take note of the following:

=over 4

=item *

Backreferences in substitutions use "$" rather than "\".

=item *

The pattern matching metacharacters "(", ")", and "|" do not have backslashes
in front.

=item *

The range operator is C<...>, rather than comma.

=back

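
The first two points can be sketched as follows. The sample strings
and the names C<$name>, C<$swapped> and C<$is_pet> are assumptions
for illustration only.

```perl
use strict;
use warnings;

my $name = "John Smith";

# The replacement refers to the captured groups as $1 and $2, not \1 and \2.
(my $swapped = $name) =~ s/(\w+) (\w+)/$2, $1/;

# Grouping and alternation are written without backslashes.
my $is_pet = ("dog" =~ /^(cat|dog)$/) ? 1 : 0;
```
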
=head2 Shell Traps

Sharp shell programmers should take note of the following:

=over 4

=item *

The backtick operator does variable interpolation without regard to
the presence of single quotes in the command.

=item *

The backtick operator does no translation of the return value, unlike B<csh>.

=item *

Shells (especially B<csh>) do several levels of substitution on each
command line. Perl does substitution in only certain constructs
such as double quotes, backticks, angle brackets, and search patterns.

=item *

Shells interpret scripts a little bit at a time. Perl compiles the
entire program before executing it (except for C<BEGIN> blocks, which
execute at compile time).

=item *

The arguments are available via @ARGV, not $1, $2, etc.

=item *

The environment is not automatically made available as separate scalar
variables.

=back

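
As a rough sketch of the interpolation, argument and environment
points: the argument values and the environment variable name
C<PERLTRAP_DEMO> below are hypothetical stand-ins, set inside the
example so that it does not depend on how the script is invoked.

```perl
use strict;
use warnings;

my $name   = 'world';
my $double = "hello $name";    # double quotes interpolate
my $single = 'hello $name';    # single quotes do not

# Stand-in for real command-line arguments (hypothetical values).
local @ARGV = ('first', 'second');
my ($arg1, $arg2) = @ARGV;     # what a shell script would call $1 and $2

# The environment lives in %ENV; no scalar $PERLTRAP_DEMO appears by itself.
$ENV{PERLTRAP_DEMO} = 'from the environment';
my $from_env = $ENV{PERLTRAP_DEMO};
```
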
=head2 Perl Traps

Practicing Perl Programmers should take note of the following:

=over 4

=item *

Remember that many operations behave differently in a list
context than they do in a scalar one. See L<perldata> for details.

=item *

Avoid barewords if you can, especially all lowercase ones.
You can't tell by just looking at it whether a bareword is
a function or a string. By using quotes on strings and
parentheses on function calls, you won't ever get them confused.

=item *

You cannot discern from mere inspection which builtins
are unary operators (like chop() and chdir())
and which are list operators (like print() and unlink()).
(User-defined subroutines can be B<only> list operators, never
unary ones.) See L<perlop>.

=item *

People have a hard time remembering that some functions
default to $_, or @ARGV, or whatever, but that others which
you might expect to do not.

=item *

The <FH> construct is not the name of the filehandle, it is a readline
operation on that handle. The data read is assigned to $_ only if the
file read is the sole condition in a while loop:

    while (<FH>)      { }
    while (defined($_ = <FH>)) { }
    <FH>;  # data discarded!

=item *

Remember not to use C<=> when you need C<=~>;
these two constructs are quite different:

    $x =  /foo/;
    $x =~ /foo/;

=item *

The C<do {}> construct isn't a real loop that you can use
loop control on.

=item *

Use C<my()> for local variables whenever you can get away with
it (but see L<perlform> for where you can't).
Using C<local()> actually gives a local value to a global
variable, which leaves you open to unforeseen side-effects
of dynamic scoping.

=item *

If you localize an exported variable in a module, its exported value will
not change. The local name becomes an alias to a new value but the
external name is still an alias for the original.

=back

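
The context and scoping traps above can be sketched like this. The
subroutine and variable names (C<current_level>, C<with_local>,
C<with_my>, C<$level>) are invented for the example.

```perl
use strict;
use warnings;

my @items   = qw(a b c);
my $count   = @items;     # scalar context: the number of elements
my ($first) = @items;     # list context: the first element

our $level = 'global';
sub current_level { return $level }

# local() gives the global a temporary value that called subs can see.
sub with_local { local $level = 'localized'; return current_level() }

# my() creates a separate lexical that called subs cannot see.
sub with_my    { my $level = 'lexical'; return current_level() }

my $seen_via_local = with_local();
my $seen_via_my    = with_my();
my $afterwards     = current_level();   # the global value is restored
```

The difference between C<$seen_via_local> and C<$seen_via_my> is the
dynamic-scoping side effect the text warns about.
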
=head2 Perl4 to Perl5 Traps

Practicing Perl4 Programmers should take note of the following
Perl4-to-Perl5 specific traps.

They're crudely ordered according to the following list:

=over 4

=item Discontinuance, Deprecation, and BugFix traps

Anything that's been fixed as a perl4 bug, removed as a perl4 feature
or deprecated as a perl4 feature with the intent to encourage usage of
some other perl5 feature.

=item Parsing Traps

Traps that appear to stem from the new parser.

=item Numerical Traps

Traps having to do with numerical or mathematical operators.

=item General data type traps

Traps involving perl standard data types.

=item Context Traps - scalar, list contexts

Traps related to context within lists, scalar statements/declarations.

=item Precedence Traps

Traps related to the precedence of parsing, evaluation, and execution of
code.

=item General Regular Expression Traps using s///, etc.

Traps related to the use of pattern matching.

=item Subroutine, Signal, Sorting Traps

Traps related to the use of signals and signal handlers, general subroutines,
and sorting, along with sorting subroutines.

=item OS Traps

OS-specific traps.

=item DBM Traps

Traps specific to the use of C<dbmopen()>, and specific dbm implementations.

=item Unclassified Traps

Everything else.

=back

If you find an example of a conversion trap that is not listed here,
please submit it to <F<perlbug@perl.org>> for inclusion.
Also note that at least some of these can be caught with the
C<use warnings> pragma or the B<-w> switch.

=head2 Discontinuance, Deprecation, and BugFix traps

Anything that has been discontinued, deprecated, or fixed as
a bug from perl4.

=over 4

=item * Discontinuance

Symbols starting with "_" are no longer forced into package main, except
for C<$_> itself (and C<@_>, etc.).

    package test;
    $_legacy = 1;

    package main;
    print "\$_legacy is ",$_legacy,"\n";

    # perl4 prints: $_legacy is 1
    # perl5 prints: $_legacy is

=item * Deprecation

Double-colon is now a valid package separator in a variable name. Thus these
behave differently in perl4 vs. perl5, because the packages don't exist.

    $a=1;$b=2;$c=3;$var=4;
    print "$a::$b::$c ";
    print "$var::abc::xyz\n";

    # perl4 prints: 1::2::3 4::abc::xyz
    # perl5 prints: 3

Given that C<::> is now the preferred package delimiter, it is debatable
whether this should be classed as a bug or not.
(The older package delimiter, ', is used here.)

    $x = 10 ;
    print "x=${'x}\n" ;

    # perl4 prints: x=10
    # perl5 prints: Can't find string terminator "'" anywhere before EOF

You can avoid this problem, and remain compatible with perl4, if you
always explicitly include the package name:

    $x = 10 ;
    print "x=${main'x}\n" ;

Also see precedence traps, for parsing C<$:>.

=item * BugFix

The second and third arguments of C<splice()> are now evaluated in scalar
context (as the Camel says) rather than list context.

    sub sub1{return(0,2) }          # return a 2-element list
    sub sub2{ return(1,2,3)}        # return a 3-element list
    @a1 = ("a","b","c","d","e");
    @a2 = splice(@a1,&sub1,&sub2);
    print join(' ',@a2),"\n";

    # perl4 prints: a b
    # perl5 prints: c d e

=item * Discontinuance

You can't do a C<goto> into a block that is optimized away. Darn.

    goto marker1;

    for(1){
    marker1:
        print "Here I is!\n";
    }

    # perl4 prints: Here I is!
    # perl5 dumps core (SEGV)

=item * Discontinuance

It is no longer syntactically legal to use whitespace as the name
of a variable, or as a delimiter for any kind of quote construct.
Double darn.

    $a = ("foo bar");
    $b = q baz ;
    print "a is $a, b is $b\n";

    # perl4 prints: a is foo bar, b is baz
    # perl5 errors: Bareword found where operator expected

=item * Discontinuance

The archaic while/if BLOCK BLOCK syntax is no longer supported.

    if { 1 } {
        print "True!";
    }
    else {
        print "False!";
    }

    # perl4 prints: True!
    # perl5 errors: syntax error at test.pl line 1, near "if {"

=item * BugFix

The C<**> operator now binds more tightly than unary minus.
It was documented to work this way before, but didn't.

    print -4**2,"\n";

    # perl4 prints: 16
    # perl5 prints: -16

=item * Discontinuance

The meaning of C<foreach{}> has changed slightly when it is iterating over a
list which is not an array. This used to assign the list to a
temporary array, but no longer does so (for efficiency). This means
that you'll now be iterating over the actual values, not over copies of
the values. Modifications to the loop variable can change the original
values.

    @list = ('ab','abc','bcd','def');
    foreach $var (grep(/ab/,@list)){
        $var = 1;
    }
    print (join(':',@list));

    # perl4 prints: ab:abc:bcd:def
    # perl5 prints: 1:1:bcd:def

To retain Perl4 semantics you need to assign your list
explicitly to a temporary array and then iterate over that. For
example, you might need to change

    foreach $var (grep(/ab/,@list)){

to

    foreach $var (@tmp = grep(/ab/,@list)){

Otherwise changing $var will clobber the values of @list. (This most often
happens when you use C<$_> for the loop variable, and call subroutines in
the loop that don't properly localize C<$_>.)

=item * Discontinuance

C<split> with no arguments now behaves like C<split ' '> (which doesn't
return an initial null field if $_ starts with whitespace); it used to
behave like C<split /\s+/> (which does).

    $_ = ' hi mom';
    print join(':', split);

    # perl4 prints: :hi:mom
    # perl5 prints: hi:mom

=item * BugFix

Perl 4 would ignore any text which was attached to an B<-e> switch,
always taking the code snippet from the following arg. Additionally, it
would silently accept an B<-e> switch without a following arg. Both of
these behaviors have been fixed.

    perl -e'print "attached to -e"' 'print "separate arg"'

    # perl4 prints: separate arg
    # perl5 prints: attached to -e

    perl -e

    # perl4 prints:
    # perl5 dies: No code specified for -e.

=item * Discontinuance

In Perl 4 the return value of C<push> was undocumented, but it was
actually the last value being pushed onto the target list. In Perl 5
the return value of C<push> is documented, but has changed: it is now
the number of elements in the resulting list.

    @x = ('existing');
    print push(@x, 'first new', 'second new');

    # perl4 prints: second new
    # perl5 prints: 3

=item * Deprecation

Some error messages will be different.

=item * Discontinuance

Some bugs may have been inadvertently removed. :-)

=back

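
The Perl 5 side of several items in this section can be checked with
a short strict-clean sketch. It is an illustration rather than a
migration tool, and the variable names (C<@safe>, C<$aliased>,
C<$copied>, C<$pushed> and so on) are assumptions made up for it.

```perl
use strict;
use warnings;

# foreach over a grep result iterates over aliases to the original elements.
my @list = qw(ab abc bcd def);
for my $var (grep { /ab/ } @list) { $var = 1 }
my $aliased = join ':', @list;

# Assigning to a temporary array first makes copies, so @safe is untouched.
my @safe = qw(ab abc bcd def);
for my $var (my @tmp = grep { /ab/ } @safe) { $var = 1 }
my $copied = join ':', @safe;

# push returns the new number of elements, not the last value pushed.
my @x = ('existing');
my $pushed = push @x, 'first new', 'second new';

# ** binds more tightly than unary minus.
my $power = -4 ** 2;

# split ' ' drops the leading empty field; split /\s+/ keeps it.
my $awk_split   = join ':', split ' ',   ' hi mom';
my $regex_split = join ':', split /\s+/, ' hi mom';
```
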
=head2 Parsing Traps

Perl4-to-Perl5 traps having to do with parsing.

=over 4

=item * Parsing

Note the space between . and =

    $string . = "more string";
    print $string;

    # perl4 prints: more string
    # perl5 prints: syntax error at - line 1, near ". ="

=item * Parsing

Better parsing in perl 5

    sub foo {}
    &foo
    print("hello, world\n");

    # perl4 prints: hello, world
    # perl5 prints: syntax error

=item * Parsing

The "if it looks like a function, it is a function" rule.

    print
        ($foo == 1) ? "is one\n" : "is zero\n";

    # perl4 prints: is zero
    # perl5 warns: "Useless use of a constant in void context" if using -w

=item * Parsing

String interpolation of the C<$#array> construct differs when braces
are used around the name.

    @a = (1..3);
    print "${#a}";

    # perl4 prints: 2
    # perl5 fails with syntax error

    @a = (1..3);
    print "$#{a}";

    # perl4 prints: {a}
    # perl5 prints: 2

=back

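
The "looks like a function" rule applies to any call followed by
parentheses, not just C<print>. A small sketch, using a hypothetical
C<square()> subroutine invented for this example:

```perl
use strict;
use warnings;

sub square { my ($n) = @_; return $n * $n }

# The parentheses right after the name are taken as the whole argument list,
# so this is square(1 + 2) * 10.
my $surprise = square (1 + 2) * 10;

# Wrapping the full expression passes all of it to the function.
my $intended = square((1 + 2) * 10);
```

The whitespace before the opening parenthesis makes no difference to
how the first call is parsed.
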
=head2 Numerical Traps

Perl4-to-Perl5 traps having to do with numerical operators,
operands, or output from same.

=over 5

=item * Numerical

Formatted output and significant digits

    print 7.373504 - 0, "\n";
    printf "%20.18f\n", 7.373504 - 0;

    # Perl4 prints:
    7.375039999999996141
    7.37503999999999614

    # Perl5 prints:
    7.373504
    7.37503999999999614

=item * Numerical

This specific item has been deleted. It demonstrated how the auto-increment
operator would not catch when a number went over the signed int limit. Fixed
in version 5.003_04. But always be wary when using large integers.
If in doubt:

    use Math::BigInt;

=item * Numerical

Assignment of return values from numeric equality tests
does not work in perl5 when the test evaluates to false (0).
Logical tests now return a null string instead of 0.

    $p = ($test == 1);
    print $p,"\n";

    # perl4 prints: 0
    # perl5 prints:

Also see L<"General Regular Expression Traps using s///, etc.">
for another example of this new feature...

=item * Bitwise string ops

When bitwise operators which can operate upon either numbers or
strings (C<& | ^ ~>) are given only strings as arguments, perl4 would
treat the operands as bitstrings so long as the program contained a call
to the C<vec()> function. perl5 treats the string operands as bitstrings.
(See L<perlop/Bitwise String Operators> for more details.)

    $fred = "10";
    $barney = "12";
    $betty = $fred & $barney;
    print "$betty\n";
    # Uncomment the next line to change perl4's behavior
    # ($dummy) = vec("dummy", 0, 0);

    # Perl4 prints:
    8

    # Perl5 prints:
    10

    # If vec() is used anywhere in the program, both print:
    10

=back

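
A brief sketch of the Perl 5 behavior described in this section. The
variable names are assumptions for illustration, and the large
integer is just an arbitrary value near the 64-bit signed limit; the
bitwise line assumes the classic behavior, without the C<bitwise>
feature enabled.

```perl
use strict;
use warnings;
use Math::BigInt;

# A false comparison yields a value that is "" as a string and 0 as a number.
my $false     = (1 == 2);
my $as_string = "[$false]";
my $as_number = $false + 0;

# Two string operands are ANDed byte by byte; two numbers are ANDed as integers.
my $string_and  = "10" & "12";
my $numeric_and = 10 & 12;

# Math::BigInt avoids overflow when an integer outgrows the native size.
my $big      = Math::BigInt->new("9223372036854775807") + 1;
my $big_text = "$big";
```
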
=head2 General data type traps

Perl4-to-Perl5 traps involving most data-types, and their usage
within certain expressions and/or context.

=over 5

=item * (Arrays)

Negative array subscripts now count from the end of the array.

    @a = (1, 2, 3, 4, 5);
    print "The fourth element of the array is $a[3] also expressed as $a[-2] \n";

    # perl4 prints: The fourth element of the array is 4 also expressed as
    # perl5 prints: The fourth element of the array is 4 also expressed as 4

=item * (Arrays)

Setting C<$#array> lower now discards array elements, and makes them
impossible to recover.

    @a = (a,b,c,d,e);
    print "Before: ",join('',@a);
    $#a =1;
    print ", After: ",join('',@a);
    $#a =3;
    print ", Recovered: ",join('',@a),"\n";

    # perl4 prints: Before: abcde, After: ab, Recovered: abcd
    # perl5 prints: Before: abcde, After: ab, Recovered: ab

=item * (Hashes)

Hashes get defined before use

    local($s,@a,%h);
    die "scalar \$s defined" if defined($s);
    die "array \@a defined" if defined(@a);
    die "hash \%h defined" if defined(%h);

    # perl4 prints:
    # perl5 dies: hash %h defined

Perl will now generate a warning when it sees defined(@a) and
defined(%h).

=item * (Globs)

Glob assignment from variable to variable will fail if the assigned
variable is localized subsequent to the assignment.

    @a = ("This is Perl 4");
    *b = *a;
    local(@a);
    print @b,"\n";

    # perl4 prints: This is Perl 4
    # perl5 prints:

=item * (Globs)

Assigning C<undef> to a glob has no effect in Perl 5. In Perl 4
it undefines the associated scalar (but may have other side effects
including SEGVs).

=item * (Scalar String)

Changes in unary negation (of strings).
This change affects both the return value and what it
does to auto(magic)increment.

    $x = "aaa";
    print ++$x," : ";
    print -$x," : ";
    print ++$x,"\n";

    # perl4 prints: aab : -0 : 1
    # perl5 prints: aab : -aab : aac

=item * (Constants)

perl 4 lets you modify constants:

    $foo = "x";
    &mod($foo);
    for ($x = 0; $x < 3; $x++) {
        &mod("a");
    }
    sub mod {
        print "before: $_[0]";
        $_[0] = "m";
        print " after: $_[0]\n";
    }

    # perl4:
    # before: x after: m
    # before: a after: m
    # before: m after: m
    # before: m after: m

    # Perl5:
    # before: x after: m
    # Modification of a read-only value attempted at foo.pl line 12.
    # before: a

=item * (Scalars)

The behavior is slightly different for:

    print "$x", defined $x

    # perl 4: 1
    # perl 5: <no output, $x is not called into existence>

=item * (Variable Suicide)

Variable suicide behavior is more consistent under Perl 5.
Perl5 exhibits the same behavior for hashes and scalars,
that perl4 exhibits for only scalars.

    $aGlobal{ "aKey" } = "global value";
    print "MAIN:", $aGlobal{"aKey"}, "\n";
    $GlobalLevel = 0;
    &test( *aGlobal );

    sub test {
        local( *theArgument ) = @_;
        local( %aNewLocal ); # perl 4 != 5.001l,m
        $aNewLocal{"aKey"} = "this should never appear";
        print "SUB: ", $theArgument{"aKey"}, "\n";
        $aNewLocal{"aKey"} = "level $GlobalLevel"; # what should print
        $GlobalLevel++;
        if( $GlobalLevel<4 ) {
            &test( *aNewLocal );
        }
    }

    # Perl4:
    # MAIN:global value
    # SUB: global value
    # SUB: level 0
    # SUB: level 1
    # SUB: level 2

    # Perl5:
    # MAIN:global value
    # SUB: global value
    # SUB: this should never appear
    # SUB: this should never appear
    # SUB: this should never appear

=back

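
The array and string items above can be sketched under C<use strict>
as follows. The names C<@b>, C<$shape>, C<$inc>, C<$neg> and
C<$again> are assumptions for the example, and undefined slots are
shown as "-" purely for display.

```perl
use strict;
use warnings;

my @a = (1 .. 5);
my $from_end = $a[-2];    # counts from the end: same element as $a[3]

my @b = qw(a b c d e);
$#b = 1;                  # truncates to two elements; the rest are gone
$#b = 3;                  # grows again, but the new slots are undef
my $shape = join '', map { defined $_ ? $_ : '-' } @b;

my $x     = "aaa";
my $inc   = ++$x;         # magical string increment
my $neg   = -$x;          # string negation prefixes a minus sign
my $again = ++$x;         # $x itself is still a string, so magic continues
```
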
=head2 Context Traps - scalar, list contexts

=over 5

=item * (list context)

The elements of argument lists for formats are now evaluated in list
context. This means you can interpolate list values now.

    @fmt = ("foo","bar","baz");
    format STDOUT=
    @<<<<< @||||| @>>>>>
    @fmt;
    .
    write;

    # perl4 errors: Please use commas to separate fields in file
    # perl5 prints: foo bar baz

=item * (scalar context)

The C<caller()> function now returns a false value in a scalar context
if there is no caller. This lets library files determine if they're
being required.

    caller() ? (print "You rang?\n") : (print "Got a 0\n");

    # perl4 errors: There is no caller
    # perl5 prints: Got a 0

=item * (scalar context)

The comma operator in a scalar context is now guaranteed to give a
scalar context to its arguments.

    @y= ('a','b','c');
    $x = (1, 2, @y);
    print "x = $x\n";

    # Perl4 prints: x = c # Thinks list context interpolates list
    # Perl5 prints: x = 3 # Knows scalar uses length of list

=item * (list, builtin)

C<sprintf()> funkiness (array argument converted to scalar array count).
This test could be added to t/op/sprintf.t

    @z = ('%s%s', 'foo', 'bar');
    $x = sprintf(@z);
    if ($x eq 'foobar') {print "ok 2\n";} else {print "not ok 2 '$x'\n";}

    # perl4 prints: ok 2
    # perl5 prints: not ok 2

C<printf()> works fine, though:

    printf STDOUT (@z);
    print "\n";

    # perl4 prints: foobar
    # perl5 prints: foobar

Probably a bug.

=back

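
One way around the C<sprintf()> trap is to separate the format from
its arguments before the call. This is a minimal sketch; the names
C<$counted>, C<$format>, C<@args> and C<$formatted> are assumptions
made up for it.

```perl
use strict;
use warnings;

my @z = ('%s%s', 'foo', 'bar');

# The first argument of sprintf gets scalar context, so @z becomes its count.
my $counted = sprintf(@z);

# Pulling the format out first gives sprintf the format it was meant to have.
my ($format, @args) = @z;
my $formatted = sprintf($format, @args);

# An array in scalar context yields its length.
my @y = qw(a b c);
my $length = @y;
```
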
=head2 Precedence Traps

Perl4-to-Perl5 traps involving precedence order.

Perl 4 has almost the same precedence rules as Perl 5 for the operators
that they both have. Perl 4, however, seems to have had some
inconsistencies that made the behavior differ from what was documented.

=over 5

=item * Precedence

LHS vs. RHS of any assignment operator. LHS is evaluated first
in perl4, second in perl5; this can affect the relationship
between side-effects in sub-expressions.

    @arr = ( 'left', 'right' );
    $a{shift @arr} = shift @arr;
    print join( ' ', keys %a );

    # perl4 prints: left
    # perl5 prints: right

=item * Precedence

These are now semantic errors because of precedence:

    @list = (1,2,3,4,5);
    %map = ("a",1,"b",2,"c",3,"d",4);
    $n = shift @list + 2;   # first item in list plus 2
    print "n is $n, ";
    $m = keys %map + 2;     # number of items in hash plus 2
    print "m is $m\n";

    # perl4 prints: n is 3, m is 6
    # perl5 errors and fails to compile

=item * Precedence

The precedence of assignment operators is now the same as the precedence
of assignment. Perl 4 mistakenly gave them the precedence of the associated
operator. So you now must parenthesize them in expressions like

    /foo/ ? ($a += 2) : ($a -= 2);

Otherwise

    /foo/ ? $a += 2 : $a -= 2

would be erroneously parsed as

    (/foo/ ? $a += 2 : $a) -= 2;

On the other hand,

    $a += /foo/ ? 1 : 2;

now works as a C programmer would expect.

=item * Precedence

    open FOO || die;

is now incorrect. You need parentheses around the filehandle.
Otherwise, perl5 leaves the statement as its default precedence:

    open(FOO || die);

    # perl4 opens or dies
    # perl5 errors: Precedence problem: open FOO should be open(FOO)

=item * Precedence

perl4 gives the special variable C<$:> precedence, whereas perl5
treats C<$::> as the C<main> package.

    $a = "x"; print "$::a";

    # perl 4 prints: -:a
    # perl 5 prints: x

=item * Precedence

perl4 had buggy precedence for the file test operators vis-a-vis
the assignment operators. Thus, although the precedence table
for perl4 leads one to believe C<-e $foo .= "q"> should parse as
C<((-e $foo) .= "q")>, it actually parses as C<(-e ($foo .= "q"))>.
In perl5, the precedence is as documented.

    -e $foo .= "q"

    # perl4 prints: no output
    # perl5 prints: Can't modify -e in concatenation

=item * Precedence

In perl4, keys(), each() and values() were special high-precedence operators
that operated on a single hash, but in perl5, they are regular named unary
operators. As documented, named unary operators have lower precedence
than the arithmetic and concatenation operators C<+ - .>, but the perl4
variants of these operators actually bind tighter than C<+ - .>.
Thus, for:

    %foo = 1..10;
    print keys %foo - 1

    # perl4 prints: 4
    # perl5 prints: Type of arg 1 to keys must be hash (not subtraction)

The perl4 behavior was probably more useful, if less consistent.

=back

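
In Perl 5 most of these traps disappear once the intended grouping is
spelled out with parentheses. The following is a sketch of that
habit, not a complete list of fixes; the data and the names C<$n>,
C<$m> and C<$total> are assumptions for illustration.

```perl
use strict;
use warnings;

my @list = (1 .. 5);
my %map  = (a => 1, b => 2, c => 3, d => 4);

# Parentheses keep "+ 2" out of the operand of shift and keys.
my $n = shift(@list) + 2;    # first item in the list plus 2
my $m = keys(%map) + 2;      # number of keys in the hash plus 2

# Assignment operators inside ?: need their own parentheses.
my $total = 1;
('food' =~ /foo/) ? ($total += 2) : ($total -= 2);

# With ?: on the right-hand side, no extra parentheses are required.
$total += ('food' =~ /foo/) ? 1 : 2;
```
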
1060=head2 General Regular Expression Traps using s///, etc.
1061
1062All types of RE traps.
1063
1064=over 5
1065
1066=item * Regular Expression
1067
1068C<s'$lhs'$rhs'> now does no interpolation on either side. It used to
19799a22 1069interpolate $lhs but not $rhs. (And still does not match a literal
6dbacca0 1070'$' in string)
1071
1072 $a=1;$b=2;
1073 $string = '1 2 $a $b';
1074 $string =~ s'$a'$b';
1075 print $string,"\n";
54310121 1076
6dbacca0 1077 # perl4 prints: $b 2 $a $b
1078 # perl5 prints: 1 2 $a $b
1079
1080=item * Regular Expression
a0d0e21e 1081
1082C<m//g> now attaches its state to the searched string rather than the
6dbacca0 1083regular expression. (Once the scope of a block is left for the sub, the
1084state of the searched string is lost)
1085
1086 $_ = "ababab";
1087 while(m/ab/g){
1088 &doit("blah");
1089 }
1090 sub doit{local($_) = shift; print "Got $_ "}
54310121 1091
6dbacca0 1092 # perl4 prints: blah blah blah
1093 # perl5 prints: infinite loop blah...
1094
1095=item * Regular Expression
1096
68dc0745 1097Currently, if you use the C<m//o> qualifier on a regular expression
1098within an anonymous sub, I<all> closures generated from that anonymous
1099sub will use the regular expression as it was compiled when it was used
1100the very first time in any such closure. For instance, if you say
1101
1102 sub build_match {
1103 my($left,$right) = @_;
1104 return sub { $_[0] =~ /$left stuff $right/o; };
1105 }
1106
1107build_match() will always return a sub which matches the contents of
19799a22 1108$left and $right as they were the I<first> time that build_match()
68dc0745 1109was called, not as they are in the current call.
1110
1111This is probably a bug, and may change in future versions of Perl.
1112
1113=item * Regular Expression
1114
6dbacca0 1115If no parentheses are used in a match, Perl4 sets C<$+> to
1116the whole match, just like C<$&>. Perl5 does not.
1117
1118 "abcdef" =~ /b.*e/;
1119 print "\$+ = $+\n";
54310121 1120
6dbacca0 1121 # perl4 prints: bcde
1122 # perl5 prints:
1123
1124=item * Regular Expression
1125
1126substitution now returns the null string if it fails
1127
1128 $string = "test";
1129 $value = ($string =~ s/foo//);
1130 print $value, "\n";
54310121 1131
6dbacca0 1132 # perl4 prints: 0
1133 # perl5 prints:
1134
1135Also see L<Numerical Traps> for another example of this new feature.
1136
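Code that compared the result against C<0> can usually test it for truth
instead. The following is a minimal sketch; the sub names are hypothetical.

```perl
use strict;
use warnings;

# Returns the raw result of a failed or successful s///.
sub raw_result {
    my ($string) = @_;
    return scalar($string =~ s/foo//);
}

# Returns 1 if a substitution happened, 0 otherwise, which is safe
# to print or compare numerically.
sub did_substitute {
    my ($string) = @_;
    return ($string =~ s/foo//) ? 1 : 0;
}

print "changed: ", did_substitute("test"), "\n";
print "changed: ", did_substitute("foobar"), "\n";
```
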
1137=item * Regular Expression
1138
54310121 1139C<s`lhs`rhs`> (using backticks) is now a normal substitution, with no
1140backtick expansion.
6dbacca0 1141
1142 $string = "";
1143 $string =~ s`^`hostname`;
1144 print $string, "\n";
54310121 1145
6dbacca0 1146 # perl4 prints: <the local hostname>
1147 # perl5 prints: hostname
1148
1149=item * Regular Expression
1150
1151Stricter parsing of variables used in regular expressions.
1152
1153 s/^([^$grpc]*$grpc[$opt$plus$rep]?)//o;
54310121 1154
6dbacca0 1155 # perl4: compiles w/o error
1156 # perl5: fails with Scalar found where operator expected ..., near "$opt$plus"
1157
1158An added component of this example, apparently from the same script, is
1159the actual value of the string after the substitution.
1160C<[$opt]> is a character class in perl4 and an array subscript in perl5.
1161
54310121 1162 $grpc = 'a';
6dbacca0 1163 $opt = 'r';
1164 $_ = 'bar';
1165 s/^([^$grpc]*$grpc[$opt]?)/foo/;
1166 print ;
54310121 1167
6dbacca0 1168 # perl4 prints: foo
1169 # perl5 prints: foobar
1170
1171=item * Regular Expression
1172
1173Under perl5, C<m?x?> matches only once, like C<?x?>. Under perl4, it matched
1174repeatedly, like C</x/> or C<m!x!>.
1175
1176 $test = "once";
1177 sub match { $test =~ m?once?; }
1178 &match();
1179 if( &match() ) {
1180 # m?x? matches more than once
1181 print "perl4\n";
54310121 1182 } else {
6dbacca0 1183 # m?x? matches only once
54310121 1184 print "perl5\n";
6dbacca0 1185 }
54310121 1186
6dbacca0 1187 # perl4 prints: perl4
1188 # perl5 prints: perl5
a0d0e21e 1189
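If a one-time match needs to fire again under perl5, C<reset> with no
argument re-arms the C<m?x?> searches in the current package. A small sketch
of this, with hypothetical sub names:

```perl
use strict;
use warnings;

# Returns 1 the first time it is called after a reset, 0 afterwards.
sub match_once {
    my ($test) = @_;
    return ($test =~ m?once?) ? 1 : 0;
}

# Re-arms the one-time match, tries it twice, re-arms it, tries again.
sub demo {
    reset;
    my @results = (match_once("once"), match_once("once"));
    reset;
    push @results, match_once("once");
    return @results;
}

print join(",", demo()), "\n";   # 1,0,1
```
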
a0d0e21e 1190
6dbacca0 1191=back
1192
1193=head2 Subroutine, Signal, Sorting Traps
a0d0e21e 1194
6dbacca0 1195The general group of Perl4-to-Perl5 traps having to do with
1196Signals, Sorting, and their related subroutines, as well as
1197general subroutine traps. Includes some OS-Specific traps.
a0d0e21e 1198
6dbacca0 1199=over 5
a0d0e21e 1200
6dbacca0 1201=item * (Signals)
a0d0e21e 1202
6dbacca0 1203Barewords that used to look like strings to Perl will now look like subroutine
1204calls if a subroutine by that name is defined before the compiler sees them.
a0d0e21e 1205
6dbacca0 1206 sub SeeYa { warn"Hasta la vista, baby!" }
1207 $SIG{'TERM'} = SeeYa;
1208 print "SIGTERM is now $SIG{'TERM'}\n";
54310121 1209
6dbacca0 1210 # perl4 prints: SIGTERM is now main'SeeYa
1211 # perl5 prints: SIGTERM is now main::1
a0d0e21e 1212
6dbacca0 1213Use B<-w> to catch this one.
a0d0e21e 1214
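To avoid the ambiguity, the handler can be given as a code reference (or as
a quoted name). The sketch below assumes a handler sub called C<SeeYa>, as in
the example above; C<install_handler> is a hypothetical wrapper.

```perl
use strict;
use warnings;

sub SeeYa { warn "Hasta la vista, baby!\n" }

# Installs the handler by reference, so the sub is not called at
# assignment time, and returns what %SIG now holds for TERM.
sub install_handler {
    $SIG{'TERM'} = \&SeeYa;
    return $SIG{'TERM'};
}

install_handler();
print "SIGTERM is now $SIG{'TERM'}\n";
```
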
6dbacca0 1215=item * (Sort Subroutine)
a0d0e21e 1216
6dbacca0 1217C<reverse> is no longer allowed as the name of a sort subroutine.
a0d0e21e 1218
6dbacca0 1219 sub reverse{ print "yup "; $a <=> $b }
54310121 1220 print sort reverse a,b,c;
1221
6dbacca0 1222 # perl4 prints: yup yup yup yup abc
54310121 1223 # perl5 prints: abc
a0d0e21e 1224
b996531f 1225=item * warn() won't let you specify a filehandle.
1226
1227Although it I<always> printed to STDERR, warn() would let you specify a
1228filehandle in perl4. With perl5 it does not.
5e378fdf 1229
1230 warn STDERR "Foo!";
1231
1232 # perl4 prints: Foo!
54310121 1233 # perl5 prints: String found where operator expected
5e378fdf 1234
6dbacca0 1235=back
a0d0e21e 1236
6dbacca0 1237=head2 OS Traps
1238
1239=over 5
1240
1241=item * (SysV)
1242
54310121 1243Under HPUX, and some other SysV OSes, one had to reset any signal handler,
1244within the signal handler function, each time a signal was handled with
1245perl4. With perl5, the reset is now done correctly. Any code relying
6dbacca0 1246on the handler I<not> being reset will have to be reworked.
1247
a6006777 1248Since version 5.002, Perl uses sigaction() under SysV.
6dbacca0 1249
1250 sub gotit {
54310121 1251 print "Got @_... ";
1252 }
6dbacca0 1253 $SIG{'INT'} = 'gotit';
54310121 1254
6dbacca0 1255 $| = 1;
1256 $pid = fork;
1257 if ($pid) {
1258 kill('INT', $pid);
1259 sleep(1);
1260 kill('INT', $pid);
54310121 1261 } else {
6dbacca0 1262 while (1) {sleep(10);}
54310121 1263 }
1264
6dbacca0 1265 # perl4 (HPUX) prints: Got INT...
1266 # perl5 (HPUX) prints: Got INT... Got INT...
1267
1268=item * (SysV)
1269
c47ff5f1 1270Under SysV OSes, C<seek()> on a file opened to append C<<< >> >>> now does
54310121 1271the right thing w.r.t. the fopen() manpage. That is, when a file is opened
6dbacca0 1272for append, it is impossible to overwrite information already in
1273the file.
1274
1275 open(TEST,">>seek.test");
54310121 1276 $start = tell TEST ;
6dbacca0 1277 foreach(1 .. 9){
1278 print TEST "$_ ";
1279 }
1280 $end = tell TEST ;
1281 seek(TEST,$start,0);
1282 print TEST "18 characters here";
54310121 1283
6dbacca0 1284 # perl4 (solaris) seek.test has: 18 characters here
1285 # perl5 (solaris) seek.test has: 1 2 3 4 5 6 7 8 9 18 characters here
a0d0e21e 1286
a0d0e21e 1287
a0d0e21e 1288
6dbacca0 1289=back
a0d0e21e 1290
6dbacca0 1291=head2 Interpolation Traps
a0d0e21e 1292
8b0a4b75 1293Perl4-to-Perl5 traps having to do with how things get interpolated
1294within certain expressions, statements, contexts, or whatever.
1295
6dbacca0 1296=over 5
a0d0e21e 1297
6dbacca0 1298=item * Interpolation
a0d0e21e 1299
6dbacca0 1300@ now always interpolates an array in double-quotish strings.
1301
54310121 1302 print "To: someone@somewhere.com\n";
1303
6dbacca0 1304 # perl4 prints: To: someone@somewhere.com
9607fc9c 1305 # perl5 errors: In string, @somewhere now must be written as \@somewhere
6dbacca0 1306
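Either escaping the C<@> or using a single-quoted string keeps the address
literal. A minimal sketch, with hypothetical sub names:

```perl
use strict;
use warnings;

# Escapes the @ so that no array is interpolated.
sub header_escaped {
    return "To: someone\@somewhere.com\n";
}

# Single quotes do not interpolate, so the @ needs no escape.
sub header_single_quoted {
    return 'To: someone@somewhere.com' . "\n";
}

print header_escaped();
print header_single_quoted();
```
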
1307=item * Interpolation
1308
6dbacca0 1309Double-quoted strings may no longer end with an unescaped $ or @.
1310
1311 $foo = "foo$";
1312 $bar = "bar@";
1313 print "foo is $foo, bar is $bar\n";
54310121 1314
6dbacca0 1315 # perl4 prints: foo is foo$, bar is bar@
1316 # perl5 errors: Final $ should be \$ or $name
1317
1318Note: perl5 DOES NOT error on the terminating @ in $bar.
1319
1320=item * Interpolation
a0d0e21e 1321
8b0a4b75 1322Perl now sometimes evaluates arbitrary expressions inside braces that occur
1323within double quotes (usually when the opening brace is preceded by C<$>
1324or C<@>).
1325
1326 @www = "buz";
1327 $foo = "foo";
1328 $bar = "bar";
1329 sub foo { return "bar" };
1330 print "|@{w.w.w}|${main'foo}|";
1331
1332 # perl4 prints: |@{w.w.w}|foo|
1333 # perl5 prints: |buz|bar|
1334
1335Note that you can C<use strict;> to ward off such trappiness under perl5.
1336
1337=item * Interpolation
1338
748a9306 1339The construct "this is $$x" used to interpolate the pid at that
19799a22 1340point, but now apparently tries to dereference $x. C<$$> by itself still
748a9306 1341works fine, however.
1342
6dbacca0 1343 print "this is $$x\n";
748a9306 1344
6dbacca0 1345 # perl4 prints: this is XXXx (XXX is the current pid)
1346 # perl5 prints: this is
1347
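If the intent is to print the pid followed by literal text, one option is to
keep C<$$> out of the string so it cannot be read as a dereference. This is
a sketch only; C<pid_label> is a hypothetical name.

```perl
use strict;
use warnings;

# Builds "this is <pid><suffix>" by concatenation, so $$ is never
# followed directly by an identifier inside a double-quoted string.
sub pid_label {
    my ($suffix) = @_;
    return "this is " . $$ . $suffix;
}

print pid_label("x"), "\n";
```
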
1348=item * Interpolation
1349
54310121 1350Creation of hashes on the fly with C<eval "EXPR"> now requires either both
1351C<$>'s to be protected in the specification of the hash name, or both curlies
6dbacca0 1352to be protected. If both curlies are protected, the result will be compatible
1353with perl4 and perl5. This is a very common practice, and should be changed
1354to use the block form of C<eval{}> if possible.
c07a80fd 1355
6dbacca0 1356 $hashname = "foobar";
1357 $key = "baz";
1358 $value = 1234;
1359 eval "\$$hashname{'$key'} = q|$value|";
1360 (defined($foobar{'baz'})) ? (print "Yup") : (print "Nope");
1361
1362 # perl4 prints: Yup
1363 # perl5 prints: Nope
1364
1365Changing
1366
1367 eval "\$$hashname{'$key'} = q|$value|";
c07a80fd 1368
1369to
1370
6dbacca0 1371 eval "\$\$hashname{'$key'} = q|$value|";
c07a80fd 1372
6dbacca0 1373causes the following result:
c07a80fd 1374
6dbacca0 1375 # perl4 prints: Nope
1376 # perl5 prints: Yup
c07a80fd 1377
6dbacca0 1378or, changing to
a0d0e21e 1379
6dbacca0 1380 eval "\$$hashname\{'$key'\} = q|$value|";
1381
1382causes the following result:
1383
1384 # perl4 prints: Yup
1385 # perl5 prints: Yup
1386 # and is compatible for both versions
1387
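As a sketch of the block form of C<eval{}> suggested above, the hash can be
reached through a symbolic reference rather than by building source text.
This assumes the hash is a package variable; C<set_by_name> is a hypothetical
helper, and C<no strict 'refs'> is needed if the program runs under
C<use strict>.

```perl
use strict;
use warnings;

our %foobar;

# Stores $value under $key in the package hash named by $hashname.
sub set_by_name {
    my ($hashname, $key, $value) = @_;
    no strict 'refs';    # symbolic reference to a package hash
    my $ok = eval { ${$hashname}{$key} = $value; 1 };
    warn "could not set \$$hashname\{$key\}: $@" unless $ok;
    return $ok;
}

set_by_name("foobar", "baz", 1234);
print defined($foobar{'baz'}) ? "Yup\n" : "Nope\n";
```

No quoting of C<$> or curlies is involved, because nothing is interpolated
into code to be compiled.
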
1388
1389=item * Interpolation
1390
1391Some perl4 programs unconsciously rely on bugs in earlier perl versions.
1392
1393 perl -e '$bar=q/not/; print "This is $foo{$bar} perl5"'
54310121 1394
6dbacca0 1395 # perl4 prints: This is not perl5
1396 # perl5 prints: This is perl5
1397
1398=item * Interpolation
1399
54310121 1400You also have to be careful about array and hash subscripts.
6dbacca0 1401
1402 print "$foo{"
1403
1404 # perl4 prints: {
1405 # perl5 prints: syntax error
1406
1407=item * Interpolation
1408
1409Similarly, watch out for:
1410
1411 $foo = "array";
1412 print "\$$foo{bar}\n";
54310121 1413
6dbacca0 1414 # perl4 prints: $array{bar}
1415 # perl5 prints: $
1416
1417Perl 5 is looking for C<$array{bar}> which doesn't exist, but perl 4 is
1418happy just to expand $foo to "array" by itself. Watch out for this
1419especially in C<eval>'s.
1420
1421=item * Interpolation
1422
1423C<qq()> string passed to C<eval>
1424
1425 eval qq(
1426 foreach \$y (keys %\$x\) {
1427 \$count++;
1428 }
1429 );
54310121 1430
6dbacca0 1431 # perl4 runs this ok
54310121 1432 # perl5 prints: Can't find string terminator ")"
a0d0e21e 1433
6dbacca0 1434=back
1435
1436=head2 DBM Traps
1437
1438General DBM traps.
1439
1440=over 5
1441
1442=item * DBM
1443
1444Existing dbm databases created under perl4 (or any other dbm/ndbm tool)
1445may cause the same script, run under perl5, to fail. The build of perl5
1446must have been linked with the same dbm/ndbm as the default for C<dbmopen()>
1447to function properly without C<tie>'ing to an extension dbm implementation.
1448
1449 dbmopen (%dbm, "file", undef);
1450 print "ok\n";
1451
1452 # perl4 prints: ok
1453 # perl5 prints: ok (IFF linked with -ldbm or -lndbm)
1454
1455
1456=item * DBM
1457
1458Existing dbm databases created under perl4 (or any other dbm/ndbm tool)
1459may cause the same script, run under perl5, to fail. The error generated
1460when exceeding the limit on the key/value size will cause perl5 to exit
1461immediately.
1462
1463 dbmopen(%DB, "testdb",0600) || die "couldn't open db! $!";
1464 $DB{'trap'} = "x" x 1024; # value too large for most dbm/ndbm
1465 print "YUP\n";
1466
1467 # perl4 prints:
1468 dbm store returned -1, errno 28, key "trap" at - line 3.
1469 YUP
1470
1471 # perl5 prints:
1472 dbm store returned -1, errno 28, key "trap" at - line 3.
a0d0e21e 1473
1474=back
6dbacca0 1475
1476=head2 Unclassified Traps
1477
1478Everything else.
1479
84dc3c4d 1480=over 5
1481
5db417f7 1482=item * C<require>/C<do> trap using returned value
6dbacca0 1483
1484If the file doit.pl has:
1485
1486 sub foo {
1487 $rc = do "./do.pl";
1488 return 8;
54310121 1489 }
6dbacca0 1490 print &foo, "\n";
1491
1492And the do.pl file has the following single line:
1493
1494 return 3;
1495
1496Running doit.pl gives the following:
1497
1498 # perl 4 prints: 3 (aborts the subroutine early)
54310121 1499 # perl 5 prints: 8
6dbacca0 1500
1501Same behavior if you replace C<do> with C<require>.
1502
5db417f7 1503=item * C<split> on empty string with LIMIT specified
1504
1505 $string = '';
1506 @list = split(/foo/, $string, 2);
1507
1508Perl4 returns a one-element list containing the empty string, but Perl5
1509returns an empty list.
1510
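Code that depends on getting at least one element back can check the element
count explicitly. A minimal sketch, with a hypothetical sub name:

```perl
use strict;
use warnings;

# Returns how many fields split() produced for the given string.
sub split_count {
    my ($string) = @_;
    my @list = split(/foo/, $string, 2);
    return scalar @list;
}

print split_count(''), "\n";         # 0
print split_count('afoob'), "\n";    # 2
```
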
6dbacca0 1511=back
1512
54310121 1513As always, if any of these are ever officially declared as bugs,
6dbacca0 1514they'll be fixed and removed.
1515