=head1 NAME

perltrap - Perl traps for the unwary

=head1 DESCRIPTION

The biggest trap of all is forgetting to use the B<-w> switch; see
L<perlrun>. The second biggest trap is not making your entire program
runnable under C<use strict>.
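As a minimal sketch of the two habits above, the following script turns on warnings through the B<-w> switch on the interpreter line and runs under C<use strict>. The interpreter path and the variable name C<$total> are illustrative assumptions, not part of the original text.

```perl
#!/usr/bin/perl -w
use strict;

# Every variable is declared with my(), so "use strict" is satisfied.
my $total = 0;
$total += $_ for (1, 2, 3);

# Under "use strict", a misspelling such as $totl would be reported at
# compile time instead of silently creating a new global variable.
print "total = $total\n";
```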

=head2 Awk Traps

Accustomed B<awk> users should take special note of the following:

=over 4

=item *

The English module, loaded via

    use English;

allows you to refer to special variables (like $RS) as
though they were in B<awk>; see L<perlvar> for details.

=item *

Semicolons are required after all simple statements in Perl (except
at the end of a block). Newline is not a statement delimiter.

=item *

Curly brackets are required on C<if>s and C<while>s.

=item *

Variables begin with "$" or "@" in Perl.

=item *

Arrays index from 0. Likewise string positions in substr() and
index().

=item *

You have to decide whether your array has numeric or string indices.

=item *

Associative array values do not spring into existence upon mere
reference.

=item *

You have to decide whether you want to use string or numeric
comparisons.

=item *

Reading an input line does not split it for you. You get to split it
yourself into an array. And the split() operator has different
arguments.

=item *

The current input line is normally in $_, not $0. It generally does
not have the newline stripped. ($0 is the name of the program
executed.) See L<perlvar>.

=item *

$E<lt>I<digit>E<gt> does not refer to fields--it refers to substrings matched by
the last match pattern.

=item *

The print() statement does not add field and record separators unless
you set C<$,> and C<$\>. You can set $OFS and $ORS if you're using
the English module.

=item *

You must open your files before you print to them.

=item *

The range operator is "..", not comma. The comma operator works as in
C.

=item *

The match operator is "=~", not "~". ("~" is the one's complement
operator, as in C.)

=item *

The exponentiation operator is "**", not "^". "^" is the XOR
operator, as in C. (You know, one could get the feeling that B<awk> is
basically incompatible with C.)

=item *

The concatenation operator is ".", not the null string. (Using the
null string would render C</pat/ /pat/> unparsable, since the third slash
would be interpreted as a division operator--the tokener is in fact
slightly context sensitive for operators like "/", "?", and ">".
And in fact, "." itself can be the beginning of a number.)

=item *

The C<next>, C<exit>, and C<continue> keywords work differently.

=item *

The following variables work differently:

    Awk         Perl
    ARGC        $#ARGV or scalar @ARGV
    ARGV[0]     $0
    FILENAME    $ARGV
    FNR         $. - something
    FS          (whatever you like)
    NF          $#Fld, or some such
    NR          $.
    OFMT        $#
    OFS         $,
    ORS         $\
    RLENGTH     length($&)
    RS          $/
    RSTART      length($`)
    SUBSEP      $;

=item *

You cannot set $RS to a pattern, only a string.

=item *

When in doubt, run the B<awk> construct through B<a2p> and see what it
gives you.

=back
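Several of the awk traps above can be seen in one short script. This is only a sketch: the sample record and the variable names (C<$line>, C<@fld>, and so on) are made up for illustration.

```perl
use strict;
use warnings;

my $line = "alpha beta gamma\n";    # a sample input record (illustrative)
chomp $line;                        # the newline is not stripped for you
my @fld = split ' ', $line;         # you split the record yourself

my $first = $fld[0];                # awk's $1; Perl arrays index from 0
my $nf    = scalar @fld;            # roughly awk's NF
my $pos   = index($line, "beta");   # string positions also count from 0

# You choose the kind of comparison: == is numeric, eq is string.
my $same_num = ("10" == "10.0") ? 1 : 0;
my $same_str = ("10" eq "10.0") ? 1 : 0;

# Concatenation needs an explicit "." operator.
my $joined = $fld[0] . "-" . $fld[2];

print "$first $nf $pos $same_num $same_str $joined\n";
# prints "alpha 3 6 1 0 alpha-gamma"
```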

=head2 C Traps

Cerebral C programmers should take note of the following:

=over 4

=item *

Curly brackets are required on C<if>s and C<while>s.

=item *

You must use C<elsif> rather than C<else if>.

=item *

The C<break> and C<continue> keywords from C become
C<last> and C<next>, respectively, in Perl.
Unlike in C, these do I<NOT> work within a C<do { } while> construct.

=item *

There's no switch statement. (But it's easy to build one on the fly.)

=item *

Variables begin with "$" or "@" in Perl.

=item *

printf() does not implement the "*" format for interpolating
field widths, but it's trivial to use interpolation of double-quoted
strings to achieve the same effect.

=item *

Comments begin with "#", not "/*".

=item *

You can't take the address of anything, although the backslash
operator in Perl 5 is similar: it creates a reference.

=item *

C<ARGV> must be capitalized. C<$ARGV[0]> is C's C<argv[1]>, and C<argv[0]>
ends up in C<$0>.

=item *

System calls such as link(), unlink(), rename(), etc. return nonzero for
success, not 0.

=item *

Signal handlers deal with signal names, not numbers. Use C<kill -l>
to find their names on your system.

=back
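A few of the C traps above, sketched in one place: loop control with C<next> and C<last>, C<elsif>, a reference made with the backslash, and a field width interpolated into a format string. All names here (C<@kept>, C<$size>, C<$width>, and so on) are hypothetical.

```perl
use strict;
use warnings;

my @kept;
for my $n (1 .. 10) {
    next if $n % 2;     # C's "continue": skip odd numbers
    last if $n > 6;     # C's "break": stop after 6
    push @kept, $n;
}

my $size;
if    (@kept > 5) { $size = "large"  }
elsif (@kept > 2) { $size = "medium" }   # "else if" would not parse
else              { $size = "small"  }

my $ref  = \@kept;       # a reference, not a machine address
my $last = $$ref[-1];    # last element, reached through the reference

# The field width is interpolated into the double-quoted format string.
my $width  = 6;
my $padded = sprintf("%${width}d", $last);

print "@kept | $size | [$padded]\n";
# prints "2 4 6 | medium | [     6]"
```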

=head2 Sed Traps

Seasoned B<sed> programmers should take note of the following:

=over 4

=item *

Backreferences in substitutions use "$" rather than "\".

=item *

The pattern matching metacharacters "(", ")", and "|" do not have backslashes
in front.

=item *

The range operator is C<...>, rather than comma.

=back
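The first two sed traps can be sketched with a single substitution. The sample text and the variable names are illustrative assumptions.

```perl
use strict;
use warnings;

my $text = "john smith";    # sample input (illustrative)

# Grouping and alternation are written without backslashes, and the
# replacement refers to the captured groups as $1 and $2.
(my $swapped = $text) =~ s/(john|jane) (.*)/$2, $1/;

print "$swapped\n";    # prints "smith, john"
```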

=head2 Shell Traps

Sharp shell programmers should take note of the following:

=over 4

=item *

The backtick operator does variable interpolation without regard to
the presence of single quotes in the command.

=item *

The backtick operator does no translation of the return value, unlike B<csh>.

=item *

Shells (especially B<csh>) do several levels of substitution on each
command line. Perl does substitution only in certain constructs
such as double quotes, backticks, angle brackets, and search patterns.

=item *

Shells interpret scripts a little bit at a time. Perl compiles the
entire program before executing it (except for C<BEGIN> blocks, which
execute at compile time).

=item *

The arguments are available via @ARGV, not $1, $2, etc.

=item *

The environment is not automatically made available as separate scalar
variables.

=back
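The sketch below illustrates three of the shell traps above. It assumes a Unix-like system where the shell provides an C<echo> command; the variable names are hypothetical.

```perl
use strict;
use warnings;

my $name = "world";

# Perl interpolates $name before the shell sees the command, so the
# single quotes do not protect it as they would in a shell script.
my $out = `echo 'hello $name'`;
chomp $out;    # the trailing newline is returned untranslated

# Command-line arguments live in @ARGV, and the environment in %ENV.
my $argc = scalar @ARGV;
my $home = exists $ENV{HOME} ? $ENV{HOME} : "(unset)";

print "$out\n";    # prints "hello world"
print "arguments: $argc, HOME: $home\n";
```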

=head2 Perl Traps

Practicing Perl Programmers should take note of the following:

=over 4

=item *

Remember that many operations behave differently in a list
context than they do in a scalar one. See L<perldata> for details.

=item *

Avoid barewords if you can, especially all lower-case ones.
You can't tell just by looking at it whether a bareword is
a function or a string. By using quotes on strings and
parens on function calls, you won't ever get them confused.

=item *

You cannot discern from mere inspection which built-ins
are unary operators (like chop() and chdir())
and which are list operators (like print() and unlink()).
(User-defined subroutines can B<only> be list operators, never
unary ones.) See L<perlop>.

=item *

People have a hard time remembering that some functions
default to $_, or @ARGV, or whatever, but that others which
you might expect to do so do not.

=item *

The <FH> construct is not the name of the filehandle; it is a readline
operation on that handle. The data read is only assigned to $_ if the
file read is the sole condition in a while loop:

    while (<FH>)      { }
    while ($_ = <FH>) { }
    <FH>;  # data discarded!

=item *

Remember not to use "C<=>" when you need "C<=~>";
these two constructs are quite different:

    $x =  /foo/;
    $x =~ /foo/;

=item *

The C<do {}> construct isn't a real loop that you can use
loop control on.

=item *

Use my() for local variables whenever you can get away with
it (but see L<perlform> for where you can't).
Using local() actually gives a local value to a global
variable, which leaves you open to unforeseen side-effects
of dynamic scoping.

=back
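Three of the traps above are sketched here: list versus scalar context, "C<=>" versus "C<=~>", and my() versus local(). The variable and subroutine names are invented for the example.

```perl
use strict;
use warnings;

my @items = ("a", "b", "c");
my $count   = @items;    # scalar context: the number of elements
my ($first) = @items;    # list context: the first element

$_ = "no match here";
my $str = "foo bar";
my $x = /foo/;             # matches against $_ and assigns the result
my $y = ($str =~ /foo/);   # matches against $str

our $level = "global";
sub show       { return $level }
sub with_local { local $level = "local"; return show() }
sub with_my    { my $level = "my";       return show() }

# local() changes the global that show() reads; my() does not.
print "$count $first ", ($x ? 1 : 0), " ", ($y ? 1 : 0), "\n";
print with_local(), " ", with_my(), "\n";
# prints "3 a 0 1" and then "local global"
```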

=head2 Perl4 Traps

Penitent Perl 4 Programmers should take note of the following
incompatible changes that occurred between release 4 and release 5:

=over 4

=item *

C<@> now always interpolates an array in double-quotish strings. Some programs
may now need to use backslash to protect any C<@> that shouldn't interpolate.

=item *

Barewords that used to look like strings to Perl will now look like subroutine
calls if a subroutine by that name is defined before the compiler sees them.
For example:

    sub SeeYa { die "Hasta la vista, baby!" }
    $SIG{'QUIT'} = SeeYa;

In Perl 4, that set the signal handler; in Perl 5, it actually calls the
function! You may use the B<-w> switch to find such places.

=item *

Symbols starting with C<_> are no longer forced into package C<main>, except
for $_ itself (and @_, etc.).

=item *

Double-colon is now a valid package separator in an identifier. Thus these
behave differently in perl4 vs. perl5:

    print "$a::$b::$c\n";
    print "$var::abc::xyz\n";

=item *

C<s'$lhs'$rhs'> now does no interpolation on either side. It used to
interpolate C<$lhs> but not C<$rhs>.

=item *

The second and third arguments of splice() are now evaluated in scalar
context (as the book says) rather than list context.

=item *

These are now semantic errors because of precedence:

    shift @list + 20;
    $n = keys %map + 20;

Because if that were to work, then this couldn't:

    sleep $dormancy + 20;

=item *

The precedence of assignment operators is now the same as the precedence
of assignment. Perl 4 mistakenly gave them the precedence of the associated
operator. So you now must parenthesize them in expressions like

    /foo/ ? ($a += 2) : ($a -= 2);

Otherwise

    /foo/ ? $a += 2 : $a -= 2;

would be erroneously parsed as

    (/foo/ ? $a += 2 : $a) -= 2;

On the other hand,

    $a += /foo/ ? 1 : 2;

now works as a C programmer would expect.

=item *

C<open FOO || die> is now incorrect. You need parens around the filehandle.
While temporarily supported, using such a construct will
generate a non-fatal (but non-suppressible) warning.

=item *

The elements of argument lists for formats are now evaluated in list
context. This means you can interpolate list values now.

=item *

You can't do a C<goto> into a block that is optimized away. Darn.

=item *

It is no longer syntactically legal to use whitespace as the name
of a variable, or as a delimiter for any kind of quote construct.
Double darn.

=item *

The caller() function now returns a false value in a scalar context if there
is no caller. This lets library files determine if they're being required.

=item *

C<m//g> now attaches its state to the searched string rather than the
regular expression.

=item *

C<reverse> is no longer allowed as the name of a sort subroutine.

=item *

B<taintperl> is no longer a separate executable. There is now a B<-T>
switch to turn on tainting when it isn't turned on automatically.

=item *

Double-quoted strings may no longer end with an unescaped C<$> or C<@>.

=item *

The archaic C<while/if> BLOCK BLOCK syntax is no longer supported.

=item *

Negative array subscripts now count from the end of the array.

=item *

The comma operator in a scalar context is now guaranteed to give a
scalar context to its arguments.

=item *

The C<**> operator now binds more tightly than unary minus.
It was documented to work this way before, but didn't.

=item *

Setting C<$#array> lower now discards array elements.

=item *

delete() is not guaranteed to return the old value for tie()d arrays,
since this capability may be onerous for some modules to implement.

=item *

The construct "this is $$x" used to interpolate the pid at that
point, but now tries to dereference $x. C<$$> by itself still
works fine, however.

=item *

Some error messages will be different.

=item *

Some bugs may have been inadvertently removed.

=back
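Some of the Perl 5 behaviors listed above can be checked with a short script. This is a sketch only: the array contents, the address string, and the variable names are invented for the example.

```perl
use strict;
use warnings;

my @list = (10, 20, 30, 40);

my $addr  = "user\@example.com";   # backslash keeps "@" from interpolating
my $last  = $list[-1];             # negative subscripts count from the end
my $power = -2 ** 2;               # "**" binds more tightly than unary minus

$#list = 1;                        # shrinking $#list discards elements
my $remaining = scalar @list;

# Assignment operators inside ?: need their own parentheses.
my $a_val = 0;
$_ = "foo";
/foo/ ? ($a_val += 2) : ($a_val -= 2);

print "$addr $last $power $remaining $a_val\n";
# prints "user@example.com 40 -4 2 2"
```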