3 perltrap - Perl traps for the unwary
7 The biggest trap of all is forgetting to use the B<-w> switch; see
8 L<perlrun>. The second biggest trap is not making your entire program
9 runnable under C<use strict>.
13 Accustomed B<awk> users should take special note of the following:
19 The English module, loaded via C<use English;>,
23 allows you to refer to special variables (like $RS) as
24 though they were in B<awk>; see L<perlvar> for details.
28 Semicolons are required after all simple statements in Perl (except
29 at the end of a block). Newline is not a statement delimiter.
33 Curly brackets are required on C<if>s and C<while>s.
37 Variables begin with "$", "@", or "%" in Perl.
41 Arrays index from 0. Likewise string positions in substr() and
46 You have to decide whether your array has numeric or string indices.
50 Associative array values do not spring into existence upon mere
55 You have to decide whether you want to use string or numeric
60 Reading an input line does not split it for you. You get to split it
61 yourself into an array. And the split() operator has different
66 The current input line is normally in $_, not $0. It generally does
67 not have the newline stripped. ($0 is the name of the program
68 executed.) See L<perlvar>.
72 $<I<digit>> does not refer to fields--it refers to substrings matched by
73 the last match pattern.
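As a minimal sketch of the points above about input lines, splitting, and C<$1>, the following uses a hypothetical record held in C<$line> (in a real read loop it would arrive in C<$_>). The variable names are illustrative assumptions, not part of the original text.

```perl
use strict;
use warnings;

# Hypothetical input record; awk would have split this into fields for you.
my $line = "alice 42 admin\n";

# The newline is still attached when a record is read, so strip it explicitly.
chomp $line;

# Split explicitly. The resulting list is zero-based, so awk's $1 is $fields[0].
my @fields = split ' ', $line;

# $1 is set by a successful pattern match, not by reading a line.
my $digits;
if ($line =~ /(\d+)/) {
    $digits = $1;
}

print "first field: $fields[0], field count: ", scalar(@fields), ", digits: $digits\n";
```

Here C<split ' '> splits on runs of whitespace, which is the closest match to awk's default field splitting.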
77 The print() statement does not add field and record separators unless
78 you set C<$,> and C<$\>. You can set $OFS and $ORS if you're using
83 You must open your files before you print to them.
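The two print() points can be sketched together. This example assumes an in-memory filehandle (opened on the hypothetical scalar C<$buffer>) so that it runs without touching the filesystem. C<$,> is the output field separator and C<$\> is the output record separator.

```perl
use strict;
use warnings;

my $buffer = '';

# A handle must be opened before printing to it; this one writes into a scalar.
open(my $out, '>', \$buffer) or die "cannot open in-memory handle: $!";

{
    # Localize the separators so the rest of the program is unaffected.
    local $, = "\t";    # output field separator (awk's OFS)
    local $\ = "\n";    # output record separator (awk's ORS)
    print {$out} 'alice', 42, 'admin';
}

close($out) or die "cannot close in-memory handle: $!";
print $buffer;
```

Without the two C<local> assignments, the same print() would write C<alice42admin> with no trailing newline.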
87 The range operator is "..", not comma. The comma operator works as in
92 The match operator is "=~", not "~". ("~" is the one's complement
97 The exponentiation operator is "**", not "^". "^" is the XOR
98 operator, as in C. (You know, one could get the feeling that B<awk> is
99 basically incompatible with C.)
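A short sketch of the operator differences listed above. The variable names are hypothetical, and the values in the comments follow from ordinary integer arithmetic.

```perl
use strict;
use warnings;

my $power = 2 ** 10;                        # exponentiation: 1024
my $xor   = 6 ^ 3;                          # bitwise XOR: 110 ^ 011 = 101, i.e. 5
my $match = ('foobar' =~ /oba/) ? 1 : 0;    # =~ binds a string to a pattern
my $comp  = ~0 & 0xFF;                      # ~ is one's complement, masked to 255

print "power=$power xor=$xor match=$match comp=$comp\n";
```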
103 The concatenation operator is ".", not the null string. (Using the
104 null string would render C</pat/ /pat/> unparsable, since the third slash
105 would be interpreted as a division operator--the tokener is in fact
106 slightly context sensitive for operators like "/", "?", and ">".
107 And in fact, "." itself can be the beginning of a number.)
111 The C<next>, C<exit>, and C<continue> keywords work differently.
116 The following variables work differently:
119 ARGC $#ARGV or scalar @ARGV
123 FS (whatever you like)
124 NF $#Fld, or some such
136 You cannot set $RS to a pattern, only a string.
140 When in doubt, run the B<awk> construct through B<a2p> and see what it
147 Cerebral C programmers should take note of the following:
153 Curly brackets are required on C<if>s and C<while>s.
157 You must use C<elsif> rather than C<else if>.
161 The C<break> and C<continue> keywords from C become C<last> and C<next>,
162 respectively, in Perl.
163 Unlike in C, these do I<NOT> work within a C<do { } while> construct.
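The following sketch shows C<next> and C<last> in an ordinary loop, then one possible workaround for C<do { } while>: wrapping it in a labeled bare block. The label C<DO_LOOP> and the variable names are assumptions made for illustration.

```perl
use strict;
use warnings;

my @seen;
for my $n (1 .. 10) {
    next if $n % 2;     # like C's continue: skip odd numbers
    last if $n > 6;     # like C's break: leave the loop
    push @seen, $n;
}
print "@seen\n";

# do { } while is a do-block with a statement modifier, not a loop, so
# last and next do not apply to it directly. A labeled bare block counts
# as a loop that runs once, so "last LABEL" can leave it.
my $i = 0;
DO_LOOP: {
    do {
        $i++;
        last DO_LOOP if $i == 3;    # exits the enclosing bare block
    } while ($i < 10);
}
print "stopped at $i\n";
```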
167 There's no switch statement. (But it's easy to build one on the fly.)
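One common way to build such a construct is a labeled bare block with C<last>. This is a sketch of that idiom only, not the only approach; the helper C<classify> and the label C<SWITCH> are hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a string using a bare block as a "switch".
sub classify {
    my ($input) = @_;
    my $kind;
    SWITCH: {
        if ($input =~ /^\d+$/)     { $kind = 'number'; last SWITCH; }
        if ($input =~ /^[a-z]+$/i) { $kind = 'word';   last SWITCH; }
        $kind = 'other';    # default case, reached when nothing matched
    }
    return $kind;
}

print join(', ', map { classify($_) } '42', 'perl', '3.14'), "\n";
```

Each case ends with C<last SWITCH>, which plays the role of C's C<break>.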
171 Variables begin with "$" or "@" in Perl.
175 printf() does not implement the "*" format for interpolating
176 field widths, but it's trivial to use interpolation of double-quoted
177 strings to achieve the same effect.
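A minimal sketch of the interpolation approach, using the hypothetical variables C<$width> and C<$name>. The width is substituted into the format string before sprintf() sees it.

```perl
use strict;
use warnings;

my $width = 8;
my $name  = 'perl';

# "%${width}s|" becomes "%8s|" after interpolation, so the value is
# right-justified in a field of eight characters.
my $padded = sprintf "%${width}s|", $name;

print "$padded\n";
```

The braces in C<${width}> keep the following C<s> from being read as part of the variable name.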
181 Comments begin with "#", not "/*".
185 You can't take the address of anything, although a similar operator
186 in Perl 5 is the backslash, which creates a reference.
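As an illustration of the backslash operator, the sketch below takes references to a scalar and an array and modifies the originals through them. All variable names are hypothetical.

```perl
use strict;
use warnings;

my $count = 1;
my $ref   = \$count;    # backslash yields a reference, not a raw address

$$ref += 41;            # dereference to modify the original scalar

my @list = (1, 2, 3);
my $aref = \@list;
push @$aref, 4;         # the same idea works for arrays

print "count=$count list=@list\n";
```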
190 C<ARGV> must be capitalized. C<$ARGV[0]> is C's C<argv[1]>, and C<argv[0]>
195 System calls such as link(), unlink(), rename(), etc. return nonzero for
200 Signal handlers deal with signal names, not numbers. Use C<kill -l>
201 to find their names on your system.
207 Seasoned B<sed> programmers should take note of the following:
213 Backreferences in substitutions use "$" rather than "\".
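For comparison, here is a sketch of a sed-style swap written in Perl. The sample string and variable names are assumptions; the sed command in the comment is the equivalent basic-regex form.

```perl
use strict;
use warnings;

my $text = 'hello world';

# sed:  s/\(hello\) \(world\)/\2 \1/
# Perl: $1 and $2 refer to the captured groups in the replacement.
(my $swapped = $text) =~ s/(hello) (world)/$2 $1/;

print "$swapped\n";
```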
217 The pattern matching metacharacters "(", ")", and "|" do not have backslashes
222 The range operator is C<...>, rather than comma.
228 Sharp shell programmers should take note of the following:
234 The backtick operator does variable interpretation without regard to
235 the presence of single quotes in the command.
239 The backtick operator does no translation of the return value, unlike B<csh>.
243 Shells (especially B<csh>) do several levels of substitution on each
244 command line. Perl does substitution only in certain constructs
245 such as double quotes, backticks, angle brackets, and search patterns.
249 Shells interpret scripts a little bit at a time. Perl compiles the
250 entire program before executing it (except for C<BEGIN> blocks, which
251 execute at compile time).
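The compile-time behavior of C<BEGIN> can be observed with a small sketch. The package variable C<@order> is a hypothetical name. Note that the C<BEGIN> block appears later in the file than the first push, yet runs first.

```perl
use strict;
use warnings;

our @order;

push @order, 'run time';                # executes when the program runs

BEGIN { push @order, 'compile time' }   # executes as soon as it is parsed

print join(' -> ', @order), "\n";
```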
255 The arguments are available via @ARGV, not $1, $2, etc.
259 The environment is not automatically made available as separate scalar
266 Practicing Perl Programmers should take note of the following:
272 Remember that many operations behave differently in a list
273 context than they do in a scalar one. See L<perldata> for details.
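A brief sketch of how the same expressions behave in the two contexts. The variable names are hypothetical.

```perl
use strict;
use warnings;

my @langs = ('awk', 'sed', 'sh');

my $count   = @langs;                 # scalar context: number of elements
my ($first) = @langs;                 # list context: first element
my @copy    = reverse @langs;         # list context: reversed list
my $joined  = reverse 'abc', 'def';   # scalar context: reversed string

print "$count $first @copy $joined\n";
```

The parentheses around C<$first> are what turn the assignment into a list assignment.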
277 Avoid barewords if you can, especially all lower-case ones.
278 You can't tell just by looking at it whether a bareword is
279 a function or a string. By using quotes on strings and
280 parens on function calls, you won't ever get them confused.
284 You cannot discern from mere inspection which built-ins
285 are unary operators (like chop() and chdir())
286 and which are list operators (like print() and unlink()).
287 (User-defined subroutines can B<only> be list operators, never
288 unary ones.) See L<perlop>.
292 People have a hard time remembering that some functions
293 default to $_, or @ARGV, or whatever, but that others which
294 you might expect to do not.
298 The <FH> construct is not the name of the filehandle; it is a readline
299 operation on that handle. The data read is only assigned to $_ if the
300 file read is the sole condition in a while loop:
303 while ($_ = <FH>) { }
304 <FH>; # data discarded!
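The difference can be seen in a runnable sketch. This one assumes a lexical filehandle opened on the in-memory string C<$data>, which the original text does not use, so that no external file is needed.

```perl
use strict;
use warnings;

my $data = "one\ntwo\nthree\n";
open(my $fh, '<', \$data) or die "cannot open in-memory handle: $!";

<$fh>;              # reads "one\n" and discards it; $_ is not assigned

my @rest;
while (<$fh>) {     # sole condition of the while: each line lands in $_
    chomp;
    push @rest, $_;
}
close($fh);

print "@rest\n";
```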
308 Remember not to use "C<=>" when you need "C<=~>";
309 these two constructs are quite different:
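A minimal sketch of the difference, using a hypothetical variable C<$x> and setting C<$_> by hand so the result is predictable:

```perl
use strict;
use warnings;

my $x = 'seafood';

# Binding operator: $x is tested against the pattern and left alone.
my $bound = ($x =~ /foo/) ? 'matched' : 'no match';

# Plain assignment: the pattern is matched against $_, and the
# true or false result overwrites $x.
local $_ = 'bar';
$x = /foo/;
my $clobbered = $x ? 'true' : 'false';

print "bound: $bound, after assignment: $clobbered\n";
```

After the assignment, C<$x> no longer holds C<seafood>; it holds the result of matching C<$_>.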
316 The C<do {}> construct isn't a real loop that you can use
321 Use my() for local variables whenever you can get away with
322 it (but see L<perlform> for where you can't).
323 Using local() actually gives a local value to a global
324 variable, which leaves you open to unforeseen side-effects
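The distinction between my() and local() is easiest to see when another subroutine reads the variable. In this sketch, C<$level> and the three subroutine names are hypothetical.

```perl
use strict;
use warnings;

our $level = 'global';

sub show_level { return $level }    # reads the package variable

sub with_local {
    local $level = 'localized';     # temporary value, visible to callees
    return show_level();
}

sub with_my {
    my $level = 'lexical';          # private to this block; callees unaffected
    return show_level();
}

print with_local(), ' ', with_my(), ' ', show_level(), "\n";
```

Only C<with_local> changes what C<show_level> sees, and the global value is restored when C<with_local> returns.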
331 Penitent Perl 4 Programmers should take note of the following
332 incompatible changes that occurred between release 4 and release 5:
338 C<@> now always interpolates an array in double-quotish strings. Some programs
339 may now need to use backslash to protect any C<@> that shouldn't interpolate.
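A short sketch of the three cases, with a hypothetical array C<@hosts> and a made-up address:

```perl
use strict;
use warnings;

my @hosts = ('alpha', 'beta');

my $listing = "hosts: @hosts";          # array interpolated, joined by spaces
my $address = "admin\@example.com";     # backslash keeps a literal @
my $literal = 'admin@example.com';      # single quotes never interpolate

print "$listing\n$address\n$literal\n";
```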
343 Barewords that used to look like strings to Perl will now look like subroutine
344 calls if a subroutine by that name is defined before the compiler sees them.
347 sub SeeYa { die "Hasta la vista, baby!" }
348 $SIG{'QUIT'} = SeeYa;
350 In Perl 4, that set the signal handler; in Perl 5, it actually calls the
351 function! You may use the B<-w> switch to find such places.
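One way to make the intent unambiguous is to assign a code reference. The sketch below reuses the C<SeeYa> subroutine from the example above and assumes a platform where the QUIT signal exists.

```perl
use strict;
use warnings;

sub SeeYa { die "Hasta la vista, baby!\n" }

# \& takes a reference to the subroutine, so it is stored, not called.
$SIG{QUIT} = \&SeeYa;

print ref($SIG{QUIT}), "\n";
```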
355 Symbols starting with C<_> are no longer forced into package C<main>, except
356 for $_ itself (and @_, etc.).
360 Double-colon is now a valid package separator in an identifier. Thus these
361 behave differently in perl4 vs. perl5:
363 print "$a::$b::$c\n";
364 print "$var::abc::xyz\n";
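If the old behavior is wanted, braces can delimit each variable name. This is a sketch with the hypothetical variables C<$pkg> and C<$name>.

```perl
use strict;
use warnings;

my $pkg  = 'Some';
my $name = 'thing';

# Braces end each variable name explicitly, so "::" stays literal text
# instead of being parsed as part of a package-qualified name.
my $qualified = "${pkg}::${name}";

print "$qualified\n";
```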
368 C<s'$lhs'$rhs'> now does no interpolation on either side. It used to
369 interpolate C<$lhs> but not C<$rhs>.
373 The second and third arguments of splice() are now evaluated in scalar
374 context (as the book says) rather than list context.
378 These are now semantic errors because of precedence:
383 Because if that were to work, then this couldn't:
385 sleep $dormancy + 20;
389 The precedence of assignment operators is now the same as the precedence
390 of assignment. Perl 4 mistakenly gave them the precedence of the associated
391 operator. So you now must parenthesize them in expressions like
393 /foo/ ? ($a += 2) : ($a -= 2);
397 /foo/ ? $a += 2 : $a -= 2;
399 would be erroneously parsed as
401 (/foo/ ? $a += 2 : $a) -= 2;
407 now works as a C programmer would expect.
411 C<open FOO || die> is now incorrect. You need parens around the filehandle.
412 While temporarily supported, using such a construct will
413 generate a non-fatal (but non-suppressible) warning.
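Two ways to write the check so that it applies to the result of open() are sketched below. The example assumes the three-argument form of open() with lexical handles and an in-memory string, none of which appear in the original text.

```perl
use strict;
use warnings;

my $data = "first line\n";

# Parentheses make the argument list explicit, so || tests open's result.
open(my $in, '<', \$data) || die "cannot open: $!";

# The low-precedence "or" also works without parentheses.
open my $again, '<', \$data or die "cannot open: $!";

my $line = <$in>;
print $line;
```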
417 The elements of argument lists for formats are now evaluated in list
418 context. This means you can interpolate list values now.
422 You can't do a C<goto> into a block that is optimized away. Darn.
426 It is no longer syntactically legal to use whitespace as the name
427 of a variable, or as a delimiter for any kind of quote construct.
432 The caller() function now returns a false value in a scalar context if there
433 is no caller. This lets library files determine if they're being required.
437 C<m//g> now attaches its state to the searched string rather than the
442 C<reverse> is no longer allowed as the name of a sort subroutine.
446 B<taintperl> is no longer a separate executable. There is now a B<-T>
447 switch to turn on tainting when it isn't turned on automatically.
451 Double-quoted strings may no longer end with an unescaped C<$> or C<@>.
455 The archaic C<while/if> BLOCK BLOCK syntax is no longer supported.
460 Negative array subscripts now count from the end of the array.
464 The comma operator in a scalar context is now guaranteed to give a
465 scalar context to its arguments.
469 The C<**> operator now binds more tightly than unary minus.
470 It was documented to work this way before, but didn't.
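A two-line sketch of the precedence, with hypothetical variable names:

```perl
use strict;
use warnings;

my $default = -2 ** 2;      # parsed as -(2 ** 2)
my $grouped = (-2) ** 2;    # parentheses force the negation first

print "$default $grouped\n";
```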
474 Setting C<$#array> lower now discards array elements.
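The sketch below shows elements being discarded when C<$#array> is lowered, and also uses a negative subscript as described a few items earlier. The array C<@queue> is a hypothetical example.

```perl
use strict;
use warnings;

my @queue = (10, 20, 30, 40, 50);
my $last  = $queue[-1];         # negative subscript: last element

$#queue = 1;                    # keep only indices 0 and 1
my $after_shrink = "@queue";

$#queue = 3;                    # grow back to four slots
my $recovered = defined $queue[2] ? 'yes' : 'no';

print "last=$last shrunk=($after_shrink) recovered=$recovered size=",
      scalar(@queue), "\n";
```

Growing the array again does not bring the old values back; the new slots are undefined.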
478 delete() is not guaranteed to return the old value for tie()d arrays,
479 since this capability may be onerous for some modules to implement.
483 The construct "this is $$x" used to interpolate the pid at that
484 point, but now tries to dereference $x. C<$$> by itself still
489 Some error messages will be different.
493 Some bugs may have been inadvertently removed.