perltrap - Perl traps for the unwary
The biggest trap of all is forgetting to use the B<-w> switch;
see L<perlrun>. Making your entire program runnable under C<use strict;>
can help make your program more bullet-proof, but sometimes
it's too annoying for quick throw-away programs.
Accustomed B<awk> users should take special note of the following:
The English module, loaded via C<use English;>,
allows you to refer to special variables (like $RS) as
though they were in B<awk>; see L<perlvar> for details.
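As a minimal sketch of what the English module buys you, the following sets the output separators through their awk-like names. The in-memory filehandle and the variable C<$out> are assumptions made only so the result can be inspected; they are not part of the module.

```perl
use strict;
use warnings;
use English;    # long, awk-like names for the punctuation variables

my $out = '';
{
    # $OFS is an alias for $, and $ORS is an alias for $\ ;
    # local() keeps the change from leaking out of this block.
    local $OFS = '-';
    local $ORS = "!\n";
    open(my $fh, '>', \$out) or die "cannot open in-memory file: $!";
    print $fh 'a', 'b', 'c';    # fields joined by $OFS, terminated by $ORS
    close($fh);
}
print $out;
```

Without the two local() assignments, the same print() would emit the three strings run together with no trailing newline.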
Semicolons are required after all simple statements in Perl (except
at the end of a block). Newline is not a statement delimiter.
Curly brackets are required on C<if>s and C<while>s.
Variables begin with "$", "@" or "%" in Perl.
Arrays index from 0. Likewise string positions in substr() and
index().
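A short illustration of zero-based counting; the variable names are made up for the example.

```perl
use strict;
use warnings;

my @fields = ('alpha', 'beta', 'gamma');
my $first  = $fields[0];                # the first element is index 0, not 1

my $text  = 'hello world';
my $where = index($text, 'world');      # offset of 'world', counted from 0
my $word  = substr($text, $where, 5);   # five characters starting at that offset
```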
You have to decide whether your array has numeric or string indices.
Associative array values do not spring into existence upon mere
reference.
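The point about mere reference can be sketched for a single-level lookup as follows (the hash and its keys are hypothetical):

```perl
use strict;
use warnings;

my %age = (alice => 30);

my $unknown = $age{bob};          # reading a missing key yields undef ...
my $created = exists $age{bob};   # ... and does not create the key
my $count   = keys %age;          # still one entry
```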
You have to decide whether you want to use string or numeric
comparisons.
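The choice matters because the same operands can compare differently. A small sketch, with illustrative variable names:

```perl
use strict;
use warnings;

my $num_equal = ('10' == 10.0)   ? 1 : 0;   # numeric: 10 equals 10.0
my $str_equal = ('10' eq '10.0') ? 1 : 0;   # string: the characters differ

my @as_strings = sort(9, 10, 100);                 # default sort compares strings
my @as_numbers = sort { $a <=> $b } (9, 10, 100);  # explicit numeric comparison
```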
Reading an input line does not split it for you. You get to split it
yourself to an array. And the split() operator has different
arguments.
The current input line is normally in $_, not $0. It generally does
not have the newline stripped. ($0 is the name of the program
executed.) See L<perlvar>.
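The following sketch shows the usual idiom for the two points above: each line arrives in $_ with its newline attached, and splitting is explicit. The sample data and the in-memory filehandle are assumptions for the sake of a self-contained example.

```perl
use strict;
use warnings;

my $data = "alice 30 london\nbob 25 paris\n";
open(my $in, '<', \$data) or die "cannot open in-memory file: $!";

my @names;
my $last_nf;
while (<$in>) {             # each line lands in $_
    chomp;                  # strip the trailing newline from $_
    my @f = split;          # no arguments: split $_ on whitespace, as awk would
    push @names, $f[0];     # awk's $1 is $f[0] here
    $last_nf = @f;          # number of fields, awk's NF
}
close($in);
```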
$<I<digit>> does not refer to fields--it refers to substrings matched by
the last match pattern.
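For instance, assuming a made-up input line:

```perl
use strict;
use warnings;

my $line = 'alice 30 london';
my ($name, $age);
if ($line =~ /^(\w+)\s+(\d+)/) {
    ($name, $age) = ($1, $2);   # $1 and $2 are the parenthesized captures
}
```

Here $1 and $2 happen to coincide with the first two fields only because the pattern was written that way.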
The print() statement does not add field and record separators unless
you set C<$,> and C<$\>. You can set $OFS and $ORS if you're using
the English module.
You must open your files before you print to them.
The range operator is "..", not comma. The comma operator works as in
C.
The match operator is "=~", not "~". ("~" is the one's complement
operator, as in C.)
The exponentiation operator is "**", not "^". "^" is the XOR
operator, as in C. (You know, one could get the feeling that B<awk> is
basically incompatible with C.)
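The difference between the two operators is easy to see with small numbers:

```perl
use strict;
use warnings;

my $power = 2 ** 10;   # exponentiation
my $xor   = 2 ^ 10;    # bitwise XOR: 0b0010 ^ 0b1010 is 0b1000
```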
The concatenation operator is ".", not the null string. (Using the
null string would render C</pat/ /pat/> unparsable, since the third slash
would be interpreted as a division operator--the tokener is in fact
slightly context sensitive for operators like "/", "?", and ">".
And in fact, "." itself can be the beginning of a number.)
The C<next>, C<exit>, and C<continue> keywords work differently.
The following variables work differently:
    Awk     Perl
    ARGC    $#ARGV or scalar @ARGV
    FS      (whatever you like)
    NF      $#Fld, or some such
You cannot set $RS to a pattern, only a string.
When in doubt, run the B<awk> construct through B<a2p> and see what it
gives you.
Cerebral C programmers should take note of the following:
Curly brackets are required on C<if>s and C<while>s.
You must use C<elsif> rather than C<else if>.
The C<break> and C<continue> keywords from C become
C<last> and C<next> in Perl, respectively.
Unlike in C, these do I<NOT> work within a C<do { } while> construct.
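A small loop showing both keywords at work in an ordinary C<for> loop (the numbers are arbitrary):

```perl
use strict;
use warnings;

my @seen;
for my $n (1 .. 10) {
    next if $n % 2;     # like C's continue: skip odd numbers
    last if $n > 6;     # like C's break: leave the loop entirely
    push @seen, $n;
}
```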
There's no switch statement. (But it's easy to build one on the fly.)
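One common way to build such a construct, sketched here with a hypothetical classify() subroutine, relies on the fact that a bare block is a loop that runs once, so C<last> can leave it:

```perl
use strict;
use warnings;

sub classify {
    my ($n) = @_;
    my $kind;
    SWITCH: {
        if ($n < 0)  { $kind = 'negative'; last SWITCH; }
        if ($n == 0) { $kind = 'zero';     last SWITCH; }
        $kind = 'positive';     # default case: nothing above matched
    }
    return $kind;
}
```

This is only one of several possible idioms; a chain of C<if>/C<elsif> tests does the same job.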
Variables begin with "$", "@" or "%" in Perl.
printf() does not implement the "*" format for interpolating
field widths, but it's trivial to use interpolation of double-quoted
strings to achieve the same effect.
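The interpolation trick can be sketched like this, with C<$width> as an illustrative variable:

```perl
use strict;
use warnings;

my $width  = 8;
# "%${width}s|" interpolates to the format "%8s|" before sprintf() sees it
my $padded = sprintf("%${width}s|", 'perl');
```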
Comments begin with "#", not "/*".
You can't take the address of anything, although a similar operator
in Perl 5 is the backslash, which creates a reference.
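A minimal sketch of the backslash operator:

```perl
use strict;
use warnings;

my $count = 1;
my $ref   = \$count;    # backslash creates a reference, not a raw address
$$ref++;                # dereference to modify the original variable
my $kind  = ref($ref);  # reports what the reference points to
```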
C<ARGV> must be capitalized.
System calls such as link(), unlink(), rename(), etc. return nonzero for
success, not 0.
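The convention can be observed with unlink(). The temporary file below comes from the core File::Temp module and exists only to give the example something to delete.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile();   # create a scratch file
close($fh);

my $removed = unlink($path);    # count of files deleted: true on success
my $again   = unlink($path);    # the file is already gone: 0, which is false
```

This is why the usual idiom reads C<unlink($path) or die ...> rather than testing for a zero return as in C.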
Signal handlers deal with signal names, not numbers. Use C<kill -l>
to find their names on your system.
Seasoned B<sed> programmers should take note of the following:
Backreferences in substitutions use "$" rather than "\".
The pattern matching metacharacters "(", ")", and "|" do not have backslashes
in front.
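Both sed points show up in a single substitution. The strings are made up for the example.

```perl
use strict;
use warnings;

my $s = 'john smith';
# sed would write this as s/\(.*\) \(.*\)/\2, \1/
(my $swapped = $s) =~ s/(\w+) (\w+)/$2, $1/;

# alternation and grouping need no backslashes either
my $pet = ('dog' =~ /^(cat|dog)$/) ? $1 : '';
```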
The range operator is C<...>, rather than comma.
Sharp shell programmers should take note of the following:
The backtick operator does variable interpretation without regard to
the presence of single quotes in the command.
The backtick operator does no translation of the return value, unlike B<csh>.
Shells (especially B<csh>) do several levels of substitution on each
command line. Perl does substitution only in certain constructs
such as double quotes, backticks, angle brackets, and search patterns.
Shells interpret scripts a little bit at a time. Perl compiles the
entire program before executing it (except for C<BEGIN> blocks, which
execute at compile time).
The arguments are available via @ARGV, not $1, $2, etc.
The environment is not automatically made available as separate scalar
variables.
Practicing Perl Programmers should take note of the following:
Remember that many operations behave differently in a list
context than they do in a scalar one. See L<perldata> for details.
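Two familiar cases, sketched with illustrative names: an array and the reverse() function each give different results depending on context.

```perl
use strict;
use warnings;

my @items = ('a', 'b', 'c');
my $count   = @items;               # scalar context: the number of elements
my ($first) = @items;               # list context: the first element

my @rev_list = reverse('ab', 'cd'); # list context: reverses the order of the list
my $rev_str  = reverse('ab', 'cd'); # scalar context: reverses the joined string
```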
Avoid barewords if you can, especially all lower-case ones.
You can't tell just by looking at it whether a bareword is
a function or a string. By using quotes on strings and
parens on function calls, you won't ever get them confused.
You cannot discern from mere inspection which built-ins
are unary operators (like chop() and chdir())
and which are list operators (like print() and unlink()).
(User-defined subroutines can B<only> be list operators, never
unary ones.) See L<perlop>.
People have a hard time remembering that some functions
default to $_, or @ARGV, or whatever, but that others, which
you might expect to, do not.
Remember not to use "C<=>" when you need "C<=~>";
these two constructs are quite different: C<$x = /foo/;> and C<$x =~ /foo/;>.
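A sketch of what each form actually does; the contents of $_ are chosen so that the difference is visible.

```perl
use strict;
use warnings;

$_ = 'bar';

my $x = 'foo';
$x = /foo/;               # matches $_ against /foo/ and overwrites $x with the result

my $y = 'foo';
my $hit = $y =~ /foo/;    # matches $y against /foo/; $y itself is untouched
```

After this runs, $x holds a false value because $_ did not match, while $y still holds 'foo'.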
The C<do {}> construct isn't a real loop that you can use
loop control on.
Use my() for local variables whenever you can get away with
it (but see L<perlform> for where you can't).
Using local() actually gives a local value to a global
variable, which leaves you open to unforeseen side-effects
of dynamic scoping.
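The side effect in question can be sketched as follows. The subroutine names are hypothetical, and C<our> is used only so that the example runs under C<use strict>.

```perl
use strict;
use warnings;

our $level = 'global';
sub show { return $level }    # reads the global, whatever it holds when called

sub with_local {
    local $level = 'localized';   # temporarily changes the global itself
    return show();                # so show() sees the new value
}

sub with_my {
    my $level = 'lexical';        # a private variable, invisible to show()
    return show();
}
```

Calling with_local() changes what show() returns for the duration of the call, which is the dynamic scoping the text warns about; with_my() leaves show() unaffected.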
Penitent Perl 4 Programmers should take note of the following
incompatible changes that occurred between release 4 and release 5:
C<@> now always interpolates an array in double-quotish strings. Some programs
may now need to use backslash to protect any C<@> that shouldn't interpolate.
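For example, with a made-up address and array:

```perl
use strict;
use warnings;

my @example = ('x', 'y');

my $escaped = "user\@example.com";   # the backslash keeps the @ literal
my $single  = 'user@example.com';    # single quotes never interpolate
my $interp  = "items: @example";     # the array interpolates, joined by spaces
```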
Barewords that used to look like strings to Perl will now look like subroutine
calls if a subroutine by that name is defined before the compiler sees them.
    sub SeeYa { die "Hasta la vista, baby!" }
In Perl 4, assigning the bareword C<SeeYa> to a C<%SIG> entry set the signal
handler; in Perl 5, it actually calls the
function! You may use the B<-w> switch to find such places.
Symbols starting with C<_> are no longer forced into package C<main>, except
for $_ itself (and @_, etc.).
C<s'$lhs'$rhs'> now does no interpolation on either side. It used to
interpolate C<$lhs> but not C<$rhs>.
The second and third arguments of splice() are now evaluated in scalar
context (as the book says) rather than list context.
These are now semantic errors because of precedence: C<shift @list + 20;> and C<$n = keys %map + 20;>.
Because if that were to work, then this couldn't:
    sleep $dormancy + 20;
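A hedged sketch of how to stay clear of the problem: parenthesize the argument of the operator so that the addition cannot be absorbed into it. The last line uses length(), another named unary operator, to show the binding that C<sleep $dormancy + 20> depends on without actually sleeping.

```perl
use strict;
use warnings;

my @list = (1, 2, 3);
my %map  = (a => 1, b => 2);

my $n = shift(@list) + 20;      # first element plus 20
my $m = keys(%map) + 20;        # number of keys plus 20

my $len = length 'abc' . 'de';  # parsed as length('abc' . 'de')
```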
C<open FOO || die> is now incorrect. You need parens around the filehandle.
While temporarily supported, using such a construct will
generate a non-fatal (but non-suppressible) warning.
The elements of argument lists for formats are now evaluated in list
context. This means you can interpolate list values now.
You can't do a C<goto> into a block that is optimized away. Darn.
It is no longer syntactically legal to use whitespace as the name
of a variable, or as a delimiter for any kind of quote construct.
The caller() function now returns a false value in a scalar context if there
is no caller. This lets library files determine if they're being required.
C<m//g> now attaches its state to the searched string rather than the
regular expression.
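One way to observe this is through pos(), which reports where the next C<m//g> search on a given string will start. In this sketch a second, different pattern picks up where the first one stopped, because the position belongs to C<$text>.

```perl
use strict;
use warnings;

my $text = 'aXbXc';

my $ok1 = $text =~ /X/g;    # scalar context: finds the first X
my $p1  = pos($text);       # offset just past that match

my $ok2 = $text =~ /\w/g;   # a different pattern resumes from the same position
my $p2  = pos($text);       # it matched the 'b' that follows the first X
```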
C<reverse> is no longer allowed as the name of a sort subroutine.
B<taintperl> is no longer a separate executable. There is now a B<-T>
switch to turn on tainting when it isn't turned on automatically.
Double-quoted strings may no longer end with an unescaped C<$> or C<@>.
The archaic C<while/if> BLOCK BLOCK syntax is no longer supported.
Negative array subscripts now count from the end of the array.
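For example:

```perl
use strict;
use warnings;

my @a = (10, 20, 30);
my $last        = $a[-1];   # last element
my $second_last = $a[-2];   # one before the last
```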
The comma operator in a scalar context is now guaranteed to give a
scalar context to its arguments.
The C<**> operator now binds more tightly than unary minus.
It was documented to work this way before, but didn't.
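The binding can be checked with a small case:

```perl
use strict;
use warnings;

my $neg = -2 ** 2;      # parsed as -(2 ** 2)
my $pos = (-2) ** 2;    # parentheses force the minus to apply first
```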
Setting C<$#array> lower now discards array elements.
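A brief sketch, using a hypothetical array:

```perl
use strict;
use warnings;

my @queue = (1 .. 5);
$#queue = 1;                # highest index is now 1; later elements are discarded
my $remaining = @queue;     # number of elements left
```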
delete() is not guaranteed to return the old value for tie()d arrays,
since this capability may be onerous for some modules to implement.
Some error messages will be different.
Some bugs may have been inadvertently removed.