=head1 NAME

perlop - Perl operators and precedence

=head1 SYNOPSIS

Perl operators have the following associativity and precedence,
listed from highest precedence to lowest. Note that all operators
borrowed from C keep the same precedence relationship with each other,
even where C's precedence is slightly screwy. (This makes learning
Perl easier for C folks.) With very few exceptions, these all
operate on scalar values only, not array values.

    left        terms and list operators (leftward)
    left        ->
    nonassoc    ++ --
    right       **
    right       ! ~ \ and unary + and -
    left        =~ !~
    left        * / % x
    left        + - .
    left        << >>
    nonassoc    named unary operators
    nonassoc    < > <= >= lt gt le ge
    nonassoc    == != <=> eq ne cmp
    left        &
    left        | ^
    left        &&
    left        ||
    nonassoc    ..
    right       ?:
    right       = += -= *= etc.
    left        , =>
    nonassoc    list operators (rightward)
    right       not
    left        and
    left        or xor

In the following sections, these operators are covered in precedence order.

=head1 DESCRIPTION

=head2 Terms and List Operators (Leftward)

A TERM has the highest precedence in Perl. These include variables,
quote and quotelike operators, any expression in parentheses,
and any function whose arguments are parenthesized. Actually, there
aren't really functions in this sense, just list operators and unary
operators behaving as functions because you put parentheses around
the arguments. These are all documented in L<perlfunc>.

If any list operator (print(), etc.) or any unary operator (chdir(), etc.)
is followed by a left parenthesis as the next token, the operator and
arguments within parentheses are taken to be of highest precedence,
just like a normal function call.

In the absence of parentheses, the precedence of list operators such as
C<print>, C<sort>, or C<chmod> is either very high or very low depending on
whether you are looking at the left side or the right side of the operator.
For example, in

    @ary = (1, 3, sort 4, 2);
    print @ary;         # prints 1324

the commas on the right of the sort are evaluated before the sort, but
the commas on the left are evaluated after. In other words, list
operators tend to gobble up all the arguments that follow them, and
then act like a simple TERM with regard to the preceding expression.
Note that you have to be careful with parens:

    # These evaluate exit before doing the print:
    print($foo, exit);  # Obviously not what you want.
    print $foo, exit;   # Nor is this.

    # These do the print before evaluating exit:
    (print $foo), exit; # This is what you want.
    print($foo), exit;  # Or this.
    print ($foo), exit; # Or even this.

Also note that

    print ($foo & 255) + 1, "\n";

probably doesn't do what you expect at first glance. See
L<Named Unary Operators> for more discussion of this.
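
As a minimal sketch of the leftward and rightward behaviour described above,
the following compares an unparenthesized C<sort> with a parenthesized one.
The variable names are illustrative only and do not come from the text.

```perl
use strict;
use warnings;

# sort gobbles everything to its right, so it sorts (4, 2).
my @gobbled = (1, 3, sort 4, 2);
my $gobbled = join '', @gobbled;

# Parentheses right after sort end its argument list early,
# so only (4) is sorted and the 2 stays where it was written.
my @limited = (1, 3, sort(4), 2);
my $limited = join '', @limited;

print "$gobbled $limited\n";
```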

Also parsed as terms are the C<do {}> and C<eval {}> constructs, as
well as subroutine and method calls, and the anonymous
constructors C<[]> and C<{}>.

See also L<Quote and Quotelike Operators> toward the end of this section,
as well as L<"I/O Operators">.

=head2 The Arrow Operator

Just as in C and C++, "C<-E<gt>>" is an infix dereference operator. If the
right side is either a C<[...]> or C<{...}> subscript, then the left side
must be either a hard or symbolic reference to an array or hash (or
a location capable of holding a hard reference, if it's an lvalue (assignable)).
See L<perlref>.

Otherwise, the right side is a method name or a simple scalar variable
containing the method name, and the left side must either be an object
(a blessed reference) or a class name (that is, a package name).
See L<perlobj>.
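
The two uses of the arrow might be sketched as follows. The C<Counter>
class and every variable name here are hypothetical, invented only for
this illustration.

```perl
use strict;
use warnings;

# Hypothetical class used only for this illustration.
package Counter;
sub new  { my ($class) = @_; return bless { count => 0 }, $class }
sub bump { my ($self) = @_; return ++$self->{count} }

package main;

my $aref = [10, 20, 30];
my $href = { colour => 'red' };

my $second = $aref->[1];        # subscript on an array reference
my $colour = $href->{colour};   # subscript on a hash reference

my $obj    = Counter->new;      # left side is a class name
my $first  = $obj->bump;        # left side is an object
my $method = 'bump';
my $again  = $obj->$method();   # method name held in a scalar

print "$second $colour $first $again\n";
```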

=head2 Autoincrement and Autodecrement

"++" and "--" work as in C. That is, if placed before a variable, they
increment or decrement the variable before returning the value, and if
placed after, increment or decrement the variable after returning the value.

The autoincrement operator has a little extra built-in magic to it. If
you increment a variable that is numeric, or that has ever been used in
a numeric context, you get a normal increment. If, however, the
variable has only been used in string contexts since it was set, and
has a value that is not null and matches the pattern
C</^[a-zA-Z]*[0-9]*$/>, the increment is done as a string, preserving each
character within its range, with carry:

    print ++($foo = '99');      # prints '100'
    print ++($foo = 'a0');      # prints 'a1'
    print ++($foo = 'Az');      # prints 'Ba'
    print ++($foo = 'zz');      # prints 'aaa'

The autodecrement operator is not magical.
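
A short sketch of the carry rule, and of the contrast with autodecrement,
using made-up variable names:

```perl
use strict;
use warnings;

my $id = 'a9';
$id++;                       # digit wraps and carries into the letter

my $label = 'Zz';
$label++;                    # carry runs off the front, adding a letter

my $down = 'aa';
{
    no warnings 'numeric';   # 'aa' is not a number, so -- would warn
    $down--;                 # no magic: 'aa' counts as 0
}

print "$id $label $down\n";
```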

=head2 Exponentiation

Binary "**" is the exponentiation operator. Note that it binds even more
tightly than unary minus, so -2**4 is -(2**4), not (-2)**4. (This is
implemented using C's pow(3) function, which actually works on doubles
internally.)
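
The binding rule above, together with the right associativity listed in
the precedence table, can be checked with a few illustrative expressions:

```perl
use strict;
use warnings;

my $negated = -2 ** 4;       # parsed as -(2 ** 4)
my $grouped = (-2) ** 4;     # parentheses force the other reading
my $chained = 2 ** 3 ** 2;   # right associative: 2 ** (3 ** 2)

print "$negated $grouped $chained\n";
```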

=head2 Symbolic Unary Operators

Unary "!" performs logical negation, i.e. "not". See also C<not> for a lower
precedence version of this.

Unary "-" performs arithmetic negation if the operand is numeric. If
the operand is an identifier, a string consisting of a minus sign
concatenated with the identifier is returned. Otherwise, if the string
starts with a plus or minus, a string starting with the opposite sign
is returned. One effect of these rules is that C<-bareword> is equivalent
to C<"-bareword">.

Unary "~" performs bitwise negation, i.e. 1's complement.

Unary "+" has no effect whatsoever, even on strings. It is useful
syntactically for separating a function name from a parenthesized expression
that would otherwise be interpreted as the complete list of function
arguments. (See examples above under L<List Operators>.)

Unary "\" creates a reference to whatever follows it. See L<perlref>.
Do not confuse this behavior with the behavior of backslash within a
string, although both forms do convey the notion of protecting the next
thing from interpretation.

=head2 Binding Operators

Binary "=~" binds a scalar expression to a pattern match. Certain operations
search or modify the string $_ by default. This operator makes that kind
of operation work on some other string. The right argument is a search
pattern, substitution, or translation. The left argument is what is
supposed to be searched, substituted, or translated instead of the default
$_. The return value indicates the success of the operation. (If the
right argument is an expression rather than a search pattern,
substitution, or translation, it is interpreted as a search pattern at run
time. This is less efficient than an explicit search, since the pattern
must be compiled every time the expression is evaluated--unless you've
used C</o>.)

Binary "!~" is just like "=~" except the return value is negated in
the logical sense.
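
A minimal sketch of both binding operators follows. The strings and
variable names are assumptions chosen for the example.

```perl
use strict;
use warnings;

my $line = 'hello world';

my $found   = ($line =~ /world/) ? 1 : 0;   # searches $line, not $_
my $missing = ($line !~ /mars/)  ? 1 : 0;   # same test, negated

# A substitution bound to a copy leaves $line untouched.
(my $copy = $line) =~ s/world/perl/;

# A right-hand expression is treated as a pattern at run time.
my $pattern = 'w.rld';
my $dynamic = ($line =~ $pattern) ? 1 : 0;

print "$found $missing $copy $dynamic\n";
```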

=head2 Multiplicative Operators

Binary "*" multiplies two numbers.

Binary "/" divides two numbers.

Binary "%" computes the modulus of the two numbers.

Binary "x" is the repetition operator. In a scalar context, it
returns a string consisting of the left operand repeated the number of
times specified by the right operand. In a list context, if the left
operand is a list in parens, it repeats the list.

    print '-' x 80;             # print row of dashes

    print "\t" x ($tab/8), ' ' x ($tab%8);      # tab over

    @ones = (1) x 80;           # a list of 80 1's
    @ones = (5) x @ones;        # set all elements to 5

=head2 Additive Operators

Binary "+" returns the sum of two numbers.

Binary "-" returns the difference of two numbers.

Binary "." concatenates two strings.

=head2 Shift Operators

Binary "<<" returns the value of its left argument shifted left by the
number of bits specified by the right argument. Arguments should be
integers.

Binary ">>" returns the value of its left argument shifted right by the
number of bits specified by the right argument. Arguments should be
integers.

=head2 Named Unary Operators

The various named unary operators are treated as functions with one
argument, with optional parentheses. These include the filetest
operators, like C<-f>, C<-M>, etc. See L<perlfunc>.

If any list operator (print(), etc.) or any unary operator (chdir(), etc.)
is followed by a left parenthesis as the next token, the operator and
arguments within parentheses are taken to be of highest precedence,
just like a normal function call. Examples:

    chdir $foo    || die;       # (chdir $foo) || die
    chdir($foo)   || die;       # (chdir $foo) || die
    chdir ($foo)  || die;       # (chdir $foo) || die
    chdir +($foo) || die;       # (chdir $foo) || die

but, because * is higher precedence than ||:

    chdir $foo * 20;    # chdir ($foo * 20)
    chdir($foo) * 20;   # (chdir $foo) * 20
    chdir ($foo) * 20;  # (chdir $foo) * 20
    chdir +($foo) * 20; # chdir ($foo * 20)

    rand 10 * 20;       # rand (10 * 20)
    rand(10) * 20;      # (rand 10) * 20
    rand (10) * 20;     # (rand 10) * 20
    rand +(10) * 20;    # rand (10 * 20)

See also L<"List Operators">.
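
The same rules can be observed with a named unary operator whose result
is easy to inspect. This sketch uses C<lc>, which is not one of the
examples above, and the variable names are illustrative.

```perl
use strict;
use warnings;

my $empty = '';

# "." binds tighter than a named unary operator, so lc sees 'ABCD'.
my $whole = lc 'AB' . 'CD';

# A parenthesis right after the name delimits the argument.
my $part = lc('AB') . 'CD';

# "||" binds more loosely, so this is (lc $empty) || 'DEFAULT'.
my $fallback = lc $empty || 'DEFAULT';

print "$whole $part $fallback\n";
```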

=head2 Relational Operators

Binary "<" returns true if the left argument is numerically less than
the right argument.

Binary ">" returns true if the left argument is numerically greater
than the right argument.

Binary "<=" returns true if the left argument is numerically less than
or equal to the right argument.

Binary ">=" returns true if the left argument is numerically greater
than or equal to the right argument.

Binary "lt" returns true if the left argument is stringwise less than
the right argument.

Binary "gt" returns true if the left argument is stringwise greater
than the right argument.

Binary "le" returns true if the left argument is stringwise less than
or equal to the right argument.

Binary "ge" returns true if the left argument is stringwise greater
than or equal to the right argument.

=head2 Equality Operators

Binary "==" returns true if the left argument is numerically equal to
the right argument.

Binary "!=" returns true if the left argument is numerically not equal
to the right argument.

Binary "<=>" returns -1, 0, or 1 depending on whether the left argument
is numerically less than, equal to, or greater than the right argument.

Binary "eq" returns true if the left argument is stringwise equal to
the right argument.

Binary "ne" returns true if the left argument is stringwise not equal
to the right argument.

Binary "cmp" returns -1, 0, or 1 depending on whether the left argument
is stringwise less than, equal to, or greater than the right argument.
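
The difference between the numeric and stringwise families shows up as
soon as the operands look like numbers. A small sketch, with sample
values chosen for illustration:

```perl
use strict;
use warnings;

my $numeric  = 10 <=> 9;        # 10 is numerically greater
my $string   = '10' cmp '9';    # but "1" sorts before "9"

my $same_num = ('1.0' == '1') ? 1 : 0;   # equal as numbers
my $same_str = ('1.0' eq '1') ? 1 : 0;   # different as strings

my @by_number = sort { $a <=> $b } (10, 9, 100);
my @by_string = sort { $a cmp $b } (10, 9, 100);

print "@by_number | @by_string\n";
```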

=head2 Bitwise And

Binary "&" returns its operands ANDed together bit by bit.

=head2 Bitwise Or and Exclusive Or

Binary "|" returns its operands ORed together bit by bit.

Binary "^" returns its operands XORed together bit by bit.

=head2 C-style Logical And

Binary "&&" performs a short-circuit logical AND operation. That is,
if the left operand is false, the right operand is not even evaluated.
Scalar or list context propagates down to the right operand if it
is evaluated.

=head2 C-style Logical Or

Binary "||" performs a short-circuit logical OR operation. That is,
if the left operand is true, the right operand is not even evaluated.
Scalar or list context propagates down to the right operand if it
is evaluated.

The C<||> and C<&&> operators differ from C's in that, rather than returning
0 or 1, they return the last value evaluated. Thus, a reasonably portable
way to find out the home directory (assuming it's not "0") might be:

    $home = $ENV{'HOME'} || $ENV{'LOGDIR'} ||
        (getpwuid($<))[7] || die "You're homeless!\n";

As more readable alternatives to C<&&> and C<||>, Perl provides "and" and
"or" operators (see below). The short-circuit behavior is identical. The
precedence of "and" and "or" is much lower, however, so that you can
safely use them after a list operator without the need for
parentheses:

    unlink "alpha", "beta", "gamma"
            or gripe(), next LINE;

With the C-style operators that would have been written like this:

    unlink("alpha", "beta", "gamma")
            || (gripe(), next LINE);
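
Short-circuiting and the "last value evaluated" rule might be observed
like this. The C<touch> helper and the variable names are hypothetical.

```perl
use strict;
use warnings;

my ($set, $unset, $blank) = ('first', 0, '');
my $calls = 0;

# Hypothetical helper that records whether it was ever called.
sub touch { $calls++; return 'touched' }

my $or_value  = $set   || touch();   # left is true: touch() never runs
my $and_value = $unset && touch();   # left is false: touch() never runs
my $default   = $blank || $unset || 'fallback';

print "$or_value $and_value $default $calls\n";
```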

=head2 Range Operator

Binary ".." is the range operator, which is really two different
operators depending on the context. In a list context, it returns an
array of values counting (by ones) from the left value to the right
value. This is useful for writing C<for (1..10)> loops and for doing
slice operations on arrays. Be aware that under the current implementation,
a temporary array is created, so you'll burn a lot of memory if you
write something like this:

    for (1 .. 1_000_000) {
        # code
    }

In a scalar context, ".." returns a boolean value. The operator is
bistable, like a flip-flop, and emulates the line-range (comma) operator
of B<sed>, B<awk>, and various editors. Each ".." operator maintains its
own boolean state. It is false as long as its left operand is false.
Once the left operand is true, the range operator stays true until the
right operand is true, I<AFTER> which the range operator becomes false
again. (It doesn't become false till the next time the range operator is
evaluated. It can test the right operand and become false on the same
evaluation it became true (as in B<awk>), but it still returns true once.
If you don't want it to test the right operand till the next evaluation
(as in B<sed>), use three dots ("...") instead of two.) The right
operand is not evaluated while the operator is in the "false" state, and
the left operand is not evaluated while the operator is in the "true"
state. The precedence is a little lower than || and &&. The value
returned is either the null string for false, or a sequence number
(beginning with 1) for true. The sequence number is reset for each range
encountered. The final sequence number in a range has the string "E0"
appended to it, which doesn't affect its numeric value, but gives you
something to search for if you want to exclude the endpoint. You can
exclude the beginning point by waiting for the sequence number to be
greater than 1. If either operand of scalar ".." is a numeric literal,
that operand is implicitly compared to the C<$.> variable, the current
line number. Examples:

As a scalar operator:

    if (101 .. 200) { print; }  # print 2nd hundred lines
    next line if (1 .. /^$/);   # skip header lines
    s/^/> / if (/^$/ .. eof()); # quote body

As a list operator:

    for (101 .. 200) { print; } # print $_ 100 times
    @foo = @foo[$[ .. $#foo];   # an expensive no-op
    @foo = @foo[$#foo-4 .. $#foo];      # slice last 5 items

The range operator (in a list context) makes use of the magical
autoincrement algorithm if the operands are strings. You
can say

    @alphabet = ('A' .. 'Z');

to get all the letters of the alphabet, or

    $hexdigit = (0 .. 9, 'a' .. 'f')[$num & 15];

to get a hexadecimal digit, or

    @z2 = ('01' .. '31');  print $z2[$mday];

to get dates with leading zeros. If the final value specified is not
in the sequence that the magical increment would produce, the sequence
goes until the next value would be longer than the final value
specified.
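
Both personalities of ".." can be seen in one short sketch. The input
words and variable names are made up for the example, and the operands
of the scalar form are ordinary expressions rather than numeric literals,
so C<$.> is not involved.

```perl
use strict;
use warnings;

# List context: magical string increment.
my @days  = ('01' .. '03');
my @codes = ('aa' .. 'ad');

# Scalar context: a flip-flop between a start and an end marker.
my @states;
for my $item (qw(a BEGIN b c END d)) {
    my $state = ($item eq 'BEGIN') .. ($item eq 'END');
    push @states, $state;    # '' while off, a sequence number while on
}
my $trace = join ',', @states;

print "@days | @codes | $trace\n";
```

The last true value carries the "E0" suffix, which is what makes the
endpoint easy to recognize.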

=head2 Conditional Operator

Ternary "?:" is the conditional operator, just as in C. It works much
like an if-then-else. If the argument before the ? is true, the
argument before the : is returned, otherwise the argument after the :
is returned. For example:

    printf "I have %d dog%s.\n", $n,
            ($n == 1) ? '' : "s";

Scalar or list context propagates downward into the 2nd
or 3rd argument, whichever is selected.

    $a = $ok ? $b : $c;  # get a scalar
    @a = $ok ? @b : @c;  # get an array
    $a = $ok ? @b : @c;  # oops, that's just a count!

The operator may be assigned to if both the 2nd and 3rd arguments are
legal lvalues (meaning that you can assign to them):

    ($a_or_b ? $a : $b) = $c;

This is not necessarily guaranteed to contribute to the readability of your program.
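
A small sketch of context propagation and of assigning through the
operator, with illustrative names:

```perl
use strict;
use warnings;

my ($left, $right) = (1, 2);
my $pick_left = 1;

# Both branches are lvalues, so the selected one is assigned to.
($pick_left ? $left : $right) = 99;

my @yes = (1, 2, 3);
my @no  = (4);
my @list  = $pick_left ? @yes : @no;   # list context: the elements
my $count = $pick_left ? @yes : @no;   # scalar context: just a count

print "$left $right | @list | $count\n";
```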

=head2 Assignment Operators

"=" is the ordinary assignment operator.

Assignment operators work as in C. That is,

    $a += 2;

is equivalent to

    $a = $a + 2;

although without duplicating any side effects that dereferencing the lvalue
might trigger, such as from tie(). Other assignment operators work similarly.
The following are recognized:

    **=    +=    *=    &=    <<=    &&=
           -=    /=    |=    >>=    ||=
           .=    %=    ^=
                 x=

Note that while these are grouped by family, they all have the precedence
of assignment.

Unlike in C, the assignment operator produces a valid lvalue. Modifying
an assignment is equivalent to doing the assignment and then modifying
the variable that was assigned to. This is useful for modifying
a copy of something, like this:

    ($tmp = $global) =~ tr [A-Z] [a-z];

Likewise,

    ($a += 2) *= 3;

is equivalent to

    $a += 2;
    $a *= 3;
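
The lvalue behaviour, plus two of the less familiar operators from the
table, might be exercised like this (all names are illustrative):

```perl
use strict;
use warnings;

my $global = 'MiXeD';
(my $tmp = $global) =~ tr/A-Z/a-z/;   # modifies the copy only

my $n = 1;
($n += 2) *= 3;                       # same as $n += 2; $n *= 3;

my $row = 'ab';
$row x= 3;                            # repetition assignment

my $name;
$name ||= 'default';                  # assign only if currently false

print "$tmp $global $n $row $name\n";
```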

=head2 Comma Operator

Binary "," is the comma operator. In a scalar context it evaluates
its left argument, throws that value away, then evaluates its right
argument and returns that value. This is just like C's comma operator.

In a list context, it's just the list argument separator, and inserts
both its arguments into the list.

The => digraph is mostly just a synonym for the comma operator. It's useful for
documenting arguments that come in pairs. As of release 5.001, it also forces
any word to the left of it to be interpreted as a string.
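
As a rough sketch of the three behaviours, assuming two throwaway helper
subs that are not part of the text:

```perl
use strict;
use warnings;

# Words to the left of => are taken as strings, even under strict.
my %colour = (sky => 'blue', grass => 'green');
my @pair   = (answer => 42);

# Hypothetical helpers for the scalar-context case.
sub first_step  { return 'ignored' }
sub second_step { return 'kept' }

# Scalar context: the left value is thrown away, the right is returned.
my $result = (first_step(), second_step());

print "$colour{sky} $pair[0] $result\n";
```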

=head2 List Operators (Rightward)

On the right side of a list operator, it has very low precedence,
such that it controls all comma-separated expressions found there.
The only operators with lower precedence are the logical operators
"and", "or", and "not", which may be used to evaluate calls to list
operators without the need for extra parentheses:

    open HANDLE, "filename"
        or die "Can't open: $!\n";

See also discussion of list operators in L<List Operators (Leftward)>.

=head2 Logical Not

Unary "not" returns the logical negation of the expression to its right.
It's the equivalent of "!" except for the very low precedence.

=head2 Logical And

Binary "and" returns the logical conjunction of the two surrounding
expressions. It's equivalent to && except for the very low
precedence. This means that it short-circuits: i.e. the right
expression is evaluated only if the left expression is true.

=head2 Logical or and Exclusive Or

Binary "or" returns the logical disjunction of the two surrounding
expressions. It's equivalent to || except for the very low
precedence. This means that it short-circuits: i.e. the right
expression is evaluated only if the left expression is false.

Binary "xor" returns the exclusive-OR of the two surrounding expressions.
It cannot short circuit, of course.
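
The practical effect of the low precedence is easiest to see next to an
assignment. A minimal sketch with invented variable names:

```perl
use strict;
use warnings;

my ($true, $false) = (1, 0);
my @log;

# "||" binds tighter than "=", so the fallback is what gets assigned.
my $high = $false || 'fallback';

# "or" binds looser than "=", so the assignment happens first and the
# right side runs only because the assigned value is false.
my $low;
$low = $false or push @log, 'right side ran';

my $negated   = (not $false)      ? 1 : 0;
my $one_sided = ($true xor $false) ? 1 : 0;
my $two_sided = ($true xor $true)  ? 1 : 0;

print "$high $low @log\n";
```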

=head2 C Operators Missing From Perl

Here is what C has that Perl doesn't:

=over 8

=item unary &

Address-of operator. (But see the "\" operator for taking a reference.)

=item unary *

Dereference-address operator. (Perl's prefix dereferencing
operators are typed: $, @, %, and &.)

=item (TYPE)

Type casting operator.

=back

=head2 Quote and Quotelike Operators

While we usually think of quotes as literal values, in Perl they
function as operators, providing various kinds of interpolating and
pattern matching capabilities. Perl provides customary quote characters
for these behaviors, but also provides a way for you to choose your
quote character for any of them. In the following table, a C<{}> represents
any pair of delimiters you choose. Non-bracketing delimiters use
the same character fore and aft, but the 4 sorts of brackets
(round, angle, square, curly) will all nest.

    Customary  Generic     Meaning    Interpolates
        ''       q{}       Literal         no
        ""      qq{}       Literal         yes
        ``      qx{}       Command         yes
                qw{}      Word list        no
        //       m{}    Pattern match      yes
                 s{}{}   Substitution      yes
                tr{}{}   Translation       no

For constructs that do interpolation, variables beginning with "C<$>" or "C<@>"
are interpolated, as are the following sequences:

    \t          tab
    \n          newline
    \r          return
    \f          form feed
    \b          backspace
    \a          alarm (bell)
    \e          escape
    \033        octal char
    \x1b        hex char
    \c[         control char
    \l          lowercase next char
    \u          uppercase next char
    \L          lowercase till \E
    \U          uppercase till \E
    \E          end case modification
    \Q          quote regexp metacharacters till \E

Patterns are subject to an additional level of interpretation as a
regular expression. This is done as a second pass, after variables are
interpolated, so that regular expressions may be incorporated into the
pattern from the variables. If this is not what you want, use C<\Q> to
interpolate a variable literally.

Apart from the above, there are no multiple levels of interpolation. In
particular, contrary to the expectations of shell programmers, backquotes
do I<NOT> interpolate within double quotes, nor do single quotes impede
evaluation of variables when used within double quotes.
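
A few of these rules in action, as a sketch with made-up values:

```perl
use strict;
use warnings;

my $name = 'perl';

my $cased  = "\u$name and \U$name\E";   # case escapes apply to the value
my $quoted = "'$name'";                 # inner single quotes do not stop interpolation
my $raw    = '$name';                   # single-quoted: nothing is interpolated

# In a pattern, the interpolated text is then read as a regular
# expression unless \Q protects it.
my $needle  = 'a.c';
my $loose   = ('abc' =~ /^$needle$/)     ? 1 : 0;   # "." matches any char
my $literal = ('abc' =~ /^\Q$needle\E$/) ? 1 : 0;   # "." must be a dot

print "$cased | $quoted | $raw | $loose $literal\n";
```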

=head2 Regexp Quotelike Operators

Here are the quotelike operators that apply to pattern
matching and related activities.

=over 8

=item ?PATTERN?

This is just like the C</pattern/> search, except that it matches only
once between calls to the reset() operator. This is a useful
optimization when you only want to see the first occurrence of
something in each file of a set of files, for instance. Only C<??>
patterns local to the current package are reset.

This usage is vaguely deprecated, and may be removed in some future
version of Perl.

=item m/PATTERN/gimosx

=item /PATTERN/gimosx

Searches a string for a pattern match, and in a scalar context returns
true (1) or false (''). If no string is specified via the C<=~> or
C<!~> operator, the $_ string is searched. (The string specified with
C<=~> need not be an lvalue--it may be the result of an expression
evaluation, but remember the C<=~> binds rather tightly.) See also
L<perlre>.

Options are:

    g   Match globally, i.e. find all occurrences.
    i   Do case-insensitive pattern matching.
    m   Treat string as multiple lines.
    o   Only compile pattern once.
    s   Treat string as single line.
    x   Use extended regular expressions.

If "/" is the delimiter then the initial C<m> is optional. With the C<m>
you can use any pair of non-alphanumeric, non-whitespace characters as
delimiters. This is particularly useful for matching Unix path names
that contain "/", to avoid LTS (leaning toothpick syndrome).

PATTERN may contain variables, which will be interpolated (and the
pattern recompiled) every time the pattern search is evaluated. (Note
that C<$)> and C<$|> might not be interpolated because they look like
end-of-string tests.) If you want such a pattern to be compiled only
once, add a C</o> after the trailing delimiter. This avoids expensive
run-time recompilations, and is useful when the value you are
interpolating won't change over the life of the script. However, mentioning
C</o> constitutes a promise that you won't change the variables in the pattern.
If you change them, Perl won't even notice.

If the PATTERN evaluates to a null string, the last
successfully executed regular expression is used instead.

If used in a context that requires a list value, a pattern match returns a
list consisting of the subexpressions matched by the parentheses in the
pattern, i.e. ($1, $2, $3...). (Note that here $1 etc. are also set, and
that this differs from Perl 4's behavior.) If the match fails, a null
array is returned. If the match succeeds, but there were no parentheses,
a list value of (1) is returned.

Examples:

    open(TTY, '/dev/tty');
    <TTY> =~ /^y/i && foo();    # do foo if desired

    if (/Version: *([0-9.]*)/) { $version = $1; }

    next if m#^/usr/spool/uucp#;

    # poor man's grep
    $arg = shift;
    while (<>) {
        print if /$arg/o;       # compile only once
    }

    if (($F1, $F2, $Etc) = ($foo =~ /^(\S+)\s+(\S+)\s*(.*)/))

This last example splits $foo into the first two words and the
remainder of the line, and assigns those three fields to $F1, $F2 and
$Etc. The conditional is true if any variables were assigned, i.e. if
the pattern matched.
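
The three list-context outcomes described above can be sketched as
follows, using an arbitrary sample string:

```perl
use strict;
use warnings;

my $text = 'alpha beta gamma delta';

# Parentheses present: the captured substrings are returned.
my ($first, $second, $rest) = $text =~ /^(\S+)\s+(\S+)\s*(.*)/;

# Successful match without parentheses: the list (1).
my @plain = $text =~ /beta/;

# Failed match: an empty list.
my @failed = $text =~ /(omega)/;

print "$first | $second | $rest | @plain | ", scalar(@failed), "\n";
```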

The C</g> modifier specifies global pattern matching--that is, matching
as many times as possible within the string. How it behaves depends on
the context. In a list context, it returns a list of all the
substrings matched by all the parentheses in the regular expression.
If there are no parentheses, it returns a list of all the matched
strings, as if there were parentheses around the whole pattern.

In a scalar context, C<m//g> iterates through the string, returning TRUE
each time it matches, and FALSE when it eventually runs out of
matches. (In other words, it remembers where it left off last time and
restarts the search at that point. You can actually find the current
match position of a string using the pos() function--see L<perlfunc>.)
If you modify the string in any way, the match position is reset to the
beginning. Examples:

    # list context
    ($one,$five,$fifteen) = (`uptime` =~ /(\d+\.\d+)/g);

    # scalar context
    $/ = ""; $* = 1;  # $* deprecated in Perl 5
    while ($paragraph = <>) {
        while ($paragraph =~ /[a-z]['")]*[.!?]+['")]*\s/g) {
            $sentences++;
        }
    }
    print "$sentences\n";
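
For a version that does not depend on an external command or on input
files, here is a sketch with fixed strings. The sample data and names
are assumptions made for the example.

```perl
use strict;
use warnings;

# List context: every capture from every match.
my $uptime = 'load average: 0.10, 0.25, 0.40';
my @loads  = $uptime =~ /(\d+\.\d+)/g;

# Scalar context: one match per evaluation, resuming at pos().
my $string = 'aXbXc';
my @positions;
while ($string =~ /X/g) {
    push @positions, pos($string);   # offset just past each match
}

print "@loads | @positions\n";
```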

=item q/STRING/

=item C<'STRING'>

A single-quoted, literal string. Backslashes are ignored, unless
followed by the delimiter or another backslash, in which case the
delimiter or backslash is interpolated.

    $foo = q!I said, "You said, 'She said it.'"!;
    $bar = q('This is it.');

=item qq/STRING/

=item "STRING"

A double-quoted, interpolated string.

    $_ .= qq
     (*** The previous line contains the naughty word "$1".\n)
                if /(tcl|rexx|python)/;      # :-)

=item qx/STRING/

=item `STRING`

A string which is interpolated and then executed as a system command.
The collected standard output of the command is returned. In scalar
context, it comes back as a single (potentially multi-line) string.
In list context, returns a list of lines (however you've defined lines
with $/ or $INPUT_RECORD_SEPARATOR).

    $today = qx{ date };

See L<I/O Operators> for more discussion.

=item qw/STRING/

Returns a list of the words extracted out of STRING, using embedded
whitespace as the word delimiters. It is exactly equivalent to

    split(' ', q/STRING/);

Some frequently seen examples:

    use POSIX qw( setlocale localeconv );
    @EXPORT = qw( foo bar baz );
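
The stated equivalence between C<qw> and C<split>, along with the plain
C<q> and C<qq> forms, can be checked with a short sketch (names and
strings are illustrative):

```perl
use strict;
use warnings;

my @short = qw( foo bar baz );
my @long  = split(' ', q/ foo bar baz /);
my $same  = ("@short" eq "@long") ? 1 : 0;

my $name    = 'world';
my $literal = q(It's "$name");       # no interpolation, no escaping needed
my $interp  = qq{It's "$name"};      # interpolates $name

print "$same | $literal | $interp\n";
```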

=item s/PATTERN/REPLACEMENT/egimosx

Searches a string for a pattern, and if found, replaces that pattern
with the replacement text and returns the number of substitutions
made. Otherwise it returns false (0).

If no string is specified via the C<=~> or C<!~> operator, the C<$_>
variable is searched and modified. (The string specified with C<=~> must
be a scalar variable, an array element, a hash element, or an assignment
to one of those, i.e. an lvalue.)

If the delimiter chosen is single quote, no variable interpolation is
done on either the PATTERN or the REPLACEMENT. Otherwise, if the
PATTERN contains a $ that looks like a variable rather than an
end-of-string test, the variable will be interpolated into the pattern
at run-time. If you only want the pattern compiled once the first time
the variable is interpolated, use the C</o> option. If the pattern
evaluates to a null string, the last successfully executed regular
expression is used instead. See L<perlre> for further explanation on these.

Options are:

    e   Evaluate the right side as an expression.
    g   Replace globally, i.e. all occurrences.
    i   Do case-insensitive pattern matching.
    m   Treat string as multiple lines.
    o   Only compile pattern once.
    s   Treat string as single line.
    x   Use extended regular expressions.

Any non-alphanumeric, non-whitespace delimiter may replace the
slashes. If single quotes are used, no interpretation is done on the
replacement string (the C</e> modifier overrides this, however). If
backquotes are used, the replacement string is a command to execute
whose output will be used as the actual replacement text. If the
PATTERN is delimited by bracketing quotes, the REPLACEMENT has its own
pair of quotes, which may or may not be bracketing quotes, e.g.
C<s(foo)(bar)> or C<sE<lt>fooE<gt>/bar/>. A C</e> will cause the
replacement portion to be interpreted as a full-fledged Perl expression
and eval()ed right then and there. It is, however, syntax checked at
compile-time.

Examples:

    s/\bgreen\b/mauve/g;                # don't change wintergreen

    $path =~ s|/usr/bin|/usr/local/bin|;

    s/Login: $foo/Login: $bar/; # run-time pattern

    ($foo = $bar) =~ s/this/that/;

    $count = ($paragraph =~ s/Mister\b/Mr./g);

    $_ = 'abc123xyz';
    s/\d+/$&*2/e;               # yields 'abc246xyz'
    s/\d+/sprintf("%5d",$&)/e;  # yields 'abc  246xyz'
    s/\w/$& x 2/eg;             # yields 'aabbcc  224466xxyyzz'

    s/%(.)/$percent{$1}/g;      # change percent escapes; no /e
    s/%(.)/$percent{$1} || $&/ge;       # expr now, so /e
    s/^=(\w+)/&pod($1)/ge;      # use function call

    # /e's can even nest; this will expand
    # simple embedded variables in $_
    s/(\$\w+)/$1/eeg;

    # Delete C comments.
    $program =~ s {
        /\*     # Match the opening delimiter.
        .*?     # Match a minimal number of characters.
        \*/     # Match the closing delimiter.
    } []gsx;

    s/^\s*(.*?)\s*$/$1/;        # trim white space

    s/([^ ]*) *([^ ]*)/$2 $1/;  # reverse 1st two fields

Note the use of $ instead of \ in the last example. Unlike
B<sed>, we only use the \<I<digit>> form in the left hand side.
Anywhere else it's $<I<digit>>.

Occasionally, you can't just use a C</g> to get all the changes
to occur. Here are two common cases:

    # put commas in the right places in an integer
    1 while s/(.*\d)(\d\d\d)/$1,$2/g;      # perl4
    1 while s/(\d)(\d\d\d)(?!\d)/$1,$2/g;  # perl5

    # expand tabs to 8-column spacing
    1 while s/\t+/' ' x (length($&)*8 - length($`)%8)/e;
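
The return value, the C</e> modifier and the repeated-substitution idiom
can be tried on fixed strings. This is a sketch with invented sample
data, binding each operation to a named variable instead of $_.

```perl
use strict;
use warnings;

my $text  = 'Mister Smith met Mister Jones';
my $count = ($text =~ s/Mister\b/Mr./g);    # number of substitutions
my $none  = ($text =~ s/Doctor/Dr./g);      # false: nothing matched

# /e evaluates the replacement as an expression.
(my $doubled = 'abc123xyz') =~ s/(\d+)/$1 * 2/e;

# One /g pass is not enough here, so repeat until nothing changes.
my $number = '1234567';
1 while $number =~ s/(\d)(\d\d\d)(?!\d)/$1,$2/g;

print "$count | $text | $doubled | $number\n";
```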

=item tr/SEARCHLIST/REPLACEMENTLIST/cds

=item y/SEARCHLIST/REPLACEMENTLIST/cds

Translates all occurrences of the characters found in the search list
with the corresponding character in the replacement list. It returns
the number of characters replaced or deleted. If no string is
specified via the =~ or !~ operator, the $_ string is translated. (The
string specified with =~ must be a scalar variable, an array element,
or an assignment to one of those, i.e. an lvalue.) For B<sed> devotees,
C<y> is provided as a synonym for C<tr>. If the SEARCHLIST is
delimited by bracketing quotes, the REPLACEMENTLIST has its own pair of
quotes, which may or may not be bracketing quotes, e.g. C<tr[A-Z][a-z]>
or C<tr(+-*/)/ABCD/>.

Options:

    c   Complement the SEARCHLIST.
    d   Delete found but unreplaced characters.
    s   Squash duplicate replaced characters.

If the C</c> modifier is specified, the SEARCHLIST character set is
complemented. If the C</d> modifier is specified, any characters specified
by SEARCHLIST not found in REPLACEMENTLIST are deleted. (Note
that this is slightly more flexible than the behavior of some B<tr>
programs, which delete anything they find in the SEARCHLIST, period.)
If the C</s> modifier is specified, sequences of characters that were
translated to the same character are squashed down to a single instance of the
character.

If the C</d> modifier is used, the REPLACEMENTLIST is always interpreted
exactly as specified. Otherwise, if the REPLACEMENTLIST is shorter
than the SEARCHLIST, the final character is replicated till it is long
enough. If the REPLACEMENTLIST is null, the SEARCHLIST is replicated.
This latter is useful for counting characters in a class or for
squashing character sequences in a class.
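The rules above for C</c>, C</d>, C</s> and short or null replacement
lists can be checked with a small sketch. The variable names are
illustrative only, and the C<(my $copy = $string) =~ tr/.../.../> form is
used so that the original strings are left alone.

```perl
use strict;
use warnings;

my $text = 'hello, world';

# Null REPLACEMENTLIST: SEARCHLIST is replicated, so nothing changes
# and the return value is simply a count.
my $letters = ($text =~ tr/a-z//);

# /c and /d together: delete every character that is NOT in a-z.
(my $only_letters = $text) =~ tr/a-z//cd;

# Short REPLACEMENTLIST without /d: the final 'y' is replicated.
(my $padded = 'abcde') =~ tr/a-e/xy/;

# The same lists with /d: c, d and e have no replacement and are deleted.
(my $deleted = 'abcde') =~ tr/a-e/xy/d;

# /s squashes runs that were translated to the same character.
(my $squashed = 'aabbccdd') =~ tr/a-d/xxyy/s;

# prints: 10 helloworld xyyyy xy xy
print "$letters $only_letters $padded $deleted $squashed\n";
```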

Examples:

    $ARGV[1] =~ tr/A-Z/a-z/;    # canonicalize to lower case

    $cnt = tr/*/*/;             # count the stars in $_

    $cnt = $sky =~ tr/*/*/;     # count the stars in $sky

    $cnt = tr/0-9//;            # count the digits in $_

    tr/a-zA-Z//s;               # bookkeeper -> bokeper

    ($HOST = $host) =~ tr/a-z/A-Z/;

    tr/a-zA-Z/ /cs;             # change non-alphas to single space

    tr [\200-\377]
       [\000-\177];             # delete 8th bit

If multiple translations are given for a character, only the first one is used:

    tr/AAA/XYZ/

will translate any A to X.

Note that because the translation table is built at compile time, neither
the SEARCHLIST nor the REPLACEMENTLIST is subjected to double quote
interpolation. That means that if you want to use variables, you must use
an eval():

    eval "tr/$oldlist/$newlist/";
    die $@ if $@;

    eval "tr/$oldlist/$newlist/, 1" or die $@;
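If the same run-time lists are applied many times, one option is to pay
for the eval() only once by compiling a subroutine around the C<tr>. This
is only a sketch: C<make_translator> is a hypothetical name, it assumes
Perl 5 anonymous subroutines, and it assumes the lists are trusted strings
that contain no C</> characters.

```perl
use strict;
use warnings;

# Hypothetical helper: compile a tr/// for run-time lists once and
# return it as a subroutine. $from and $to must be trusted strings
# that contain no "/" characters.
sub make_translator {
    my ($from, $to) = @_;
    my $code = eval "sub { (my \$s = shift) =~ tr/$from/$to/; \$s }";
    die $@ if $@;
    return $code;
}

my $rot13 = make_translator('a-zA-Z', 'n-za-mN-ZA-M');
print $rot13->('Hello'), "\n";   # Uryyb
```

The returned subroutine works on a copy of its argument, so the caller's
string is not modified.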

=back

=head2 I/O Operators

There are several I/O operators you should know about.
A string enclosed by backticks (grave accents) first undergoes
variable substitution just like a double quoted string. It is then
interpreted as a command, and the output of that command is the value
of the pseudo-literal, like in a shell. In a scalar context, a single
string consisting of all the output is returned. In a list context,
a list of values is returned, one for each line of output. (You can
set C<$/> to use a different line terminator.) The command is executed
each time the pseudo-literal is evaluated. The status value of the
command is returned in C<$?> (see L<perlvar> for the interpretation
of C<$?>). Unlike in B<csh>, no translation is done on the return
data--newlines remain newlines. Unlike in any of the shells, single
quotes do not hide variable names in the command from interpretation.
To pass a $ through to the shell you need to hide it with a backslash.
The generalized form of backticks is C<qx//>. (Because backticks
always undergo shell expansion as well, see L<perlsec> for
security concerns.)

Evaluating a filehandle in angle brackets yields the next line from
that file (newline included, so it's never false until end of file, at
which time an undefined value is returned). Ordinarily you must assign
that value to a variable, but there is one situation where an automatic
assignment happens. I<If and ONLY if> the input symbol is the only
thing inside the conditional of a C<while> loop, the value is
automatically assigned to the variable C<$_>. The assigned value is
then tested to see if it is defined. (This may seem like an odd thing
to you, but you'll use the construct in almost every Perl script you
write.) Anyway, the following lines are equivalent to each other:

    while (defined($_ = <STDIN>)) { print; }
    while (<STDIN>) { print; }
    for (;<STDIN>;) { print; }
    print while defined($_ = <STDIN>);
    print while <STDIN>;

The filehandles STDIN, STDOUT and STDERR are predefined. (The
filehandles C<stdin>, C<stdout> and C<stderr> will also work except in
packages, where they would be interpreted as local identifiers rather
than global.) Additional filehandles may be created with the open()
function. See L<perlfunc/open()> for details on this.

If a <FILEHANDLE> is used in a context that is looking for a list, a
list consisting of all the input lines is returned, one line per list
element. It's easy to make a I<LARGE> data space this way, so use with
care.
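The difference between the two contexts can be seen in a short sketch. To
keep it self-contained it reads from an in-memory file, which is an
assumption on our part: opening a reference to a scalar needs Perl 5.8 or
later, and a real file would behave the same way.

```perl
use strict;
use warnings;

my $text = "first\nsecond\nthird\n";

# Assumption: opening a reference to a scalar as an in-memory file
# needs Perl 5.8 or later; a real file would behave the same way.
open(my $in, '<', \$text) or die "cannot open in-memory file: $!";

my $one  = <$in>;    # scalar context: just the next line
my @rest = <$in>;    # list context: every remaining line
close($in);

print "first line: $one";
print "lines left: ", scalar(@rest), "\n";   # lines left: 2
```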

The null filehandle E<lt>E<gt> is special and can be used to emulate the
behavior of B<sed> and B<awk>. Input from E<lt>E<gt> comes either from
standard input, or from each file listed on the command line. Here's
how it works: the first time E<lt>E<gt> is evaluated, the @ARGV array is
checked, and if it is null, C<$ARGV[0]> is set to "-", which when opened
gives you standard input. The @ARGV array is then processed as a list
of filenames. The loop

    while (<>) {
        ...                     # code for each line
    }

is equivalent to the following Perl-like pseudo code:

    unshift(@ARGV, '-') if $#ARGV < $[;
    while ($ARGV = shift) {
        open(ARGV, $ARGV);
        while (<ARGV>) {
            ...         # code for each line
        }
    }

except that it isn't so cumbersome to say, and will actually work. It
really does shift array @ARGV and put the current filename into variable
$ARGV. It also uses filehandle I<ARGV> internally--E<lt>E<gt> is just a synonym
for <ARGV>, which is magical. (The pseudo code above doesn't work
because it treats <ARGV> as non-magical.)

You can modify @ARGV before the first E<lt>E<gt> as long as the array ends up
containing the list of filenames you really want. Line numbers (C<$.>)
continue as if the input were one big happy file. (But see example
under eof() for how to reset line numbers on each file.)
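As a rough sketch of both points, the following sets @ARGV by hand before
the first E<lt>E<gt> and resets C<$.> at the end of each file by closing
ARGV. The scratch files come from File::Temp purely so the sketch can run
anywhere; that module and the variable names are our own assumptions, not
part of the mechanism being described.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create two small scratch files to stand in for command line arguments.
my @names;
for my $body ("a\nb\n", "c\n") {
    my ($fh, $name) = tempfile(UNLINK => 1);
    print $fh $body;
    close($fh) or die "cannot close $name: $!";
    push @names, $name;
}

my @seen;
{
    local @ARGV = @names;        # <> reads whatever @ARGV holds
    while (<>) {
        chomp;
        push @seen, "$.:$_";
        close(ARGV) if eof;      # end of this file: reset $.
    }
}

print "@seen\n";                 # 1:a 2:b 1:c
```

Without the C<close(ARGV)>, the last entry would be numbered 3.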

If you want to set @ARGV to your own list of files, go right ahead. If
you want to pass switches into your script, you can use one of the
Getopts modules or put a loop on the front like this:

    while ($_ = $ARGV[0], /^-/) {
        shift;
        last if /^--$/;
        if (/^-D(.*)/) { $debug = $1 }
        if (/^-v/)     { $verbose++  }
        ...             # other switches
    }
    while (<>) {
        ...             # code for each line
    }

The E<lt>E<gt> symbol will return FALSE only once. If you call it again after
this it will assume you are processing another @ARGV list, and if you
haven't set @ARGV, will input from STDIN.

If the string inside the angle brackets is a simple scalar
variable (e.g. <$foo>), then that variable contains the name of the
filehandle to input from, or a reference to the same. For example:

    $fh = \*STDIN;
    $line = <$fh>;

If the string inside angle brackets is not a filehandle or a scalar
variable containing a filehandle name or reference, then it is interpreted
as a filename pattern to be globbed, and either a list of filenames or the
next filename in the list is returned, depending on context. One level of
$ interpretation is done first, but you can't say C<E<lt>$fooE<gt>>
because that's an indirect filehandle as explained in the previous
paragraph. (In older versions of Perl, programmers would insert curly
brackets to force interpretation as a filename glob: C<E<lt>${foo}E<gt>>.
These days, it's considered cleaner to call the internal function directly
as C<glob($foo)>, which is probably the right way to have done it in the
first place.) Example:

    while (<*.c>) {
        chmod 0644, $_;
    }

is equivalent to

    open(FOO, "echo *.c | tr -s ' \t\r\f' '\\012\\012\\012\\012'|");
    while (<FOO>) {
        chop;
        chmod 0644, $_;
    }

In fact, it's currently implemented that way. (Which means it will not
work on filenames with spaces in them unless you have csh(1) on your
machine.) Of course, the shortest way to do the above is:

    chmod 0644, <*.c>;

Because globbing invokes a shell, it's often faster to call readdir() yourself
and just do your own grep() on the filenames. Furthermore, due to its current
implementation of using a shell, the glob() routine may get "Arg list too
long" errors (unless you've installed tcsh(1L) as F</bin/csh>).
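The readdir() and grep() alternative might look something like the sketch
below. The C<c_sources> subroutine is a hypothetical helper, lexical
directory handles assume a reasonably modern Perl 5, and File::Temp is
used only to build a scratch directory so the sketch runs anywhere.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: list the *.c and *.h files in $dir without a shell.
sub c_sources {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "cannot open directory $dir: $!";
    my @names = sort grep { /\.[ch]$/ } readdir($dh);
    closedir($dh);
    return map { "$dir/$_" } @names;
}

# Build a scratch directory so the sketch can run anywhere.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(main.c util.h notes.txt)) {
    open(my $out, '>', "$dir/$name") or die "cannot create $name: $!";
    close($out);
}

my @found = c_sources($dir);
print scalar(@found), " source files\n";   # 2 source files
```

Note that readdir() returns bare names, so the directory has to be
prepended before the results are used as paths.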

A glob only evaluates its (embedded) argument when it is starting a new
list. All values must be read before it will start over. In a list
context this isn't important, because you automatically get them all
anyway. In a scalar context, however, the operator returns the next value
each time it is called, or a FALSE value if you've just run out. Again,
FALSE is returned only once. So if you're expecting a single value from
a glob, it is much better to say

    ($file) = <blurch*>;

than

    $file = <blurch*>;

because the latter will alternate between returning a filename and
returning FALSE.

If you're trying to do variable interpolation, it's definitely better
to use the glob() function, because the older notation can cause people
to become confused with the indirect filehandle notation.

    @files = glob("$dir/*.[ch]");
    @files = glob($files[$i]);

=head2 Constant Folding

Like C, Perl does a certain amount of expression evaluation at
compile time, whenever it determines that all of the arguments to an
operator are static and have no side effects. In particular, string
concatenation happens at compile time between literals that don't do
variable substitution. Backslash interpretation also happens at
compile time. You can say

    'Now is the time for all' . "\n" .
        'good men to come to.'

and this all reduces to one string internally. Likewise, if
you say

    foreach $file (@filenames) {
        if (-s $file > 5 + 100 * 2**16) { ... }
    }

the compiler will pre-compute the number that
expression represents so that the interpreter
won't have to.

=head2 Integer arithmetic

By default Perl assumes that it must do most of its arithmetic in
floating point. But by saying

    use integer;

you may tell the compiler that it's okay to use integer operations
from here to the end of the enclosing BLOCK. An inner BLOCK may
countermand this by saying

    no integer;

which lasts until the end of that BLOCK.

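A small sketch of the scoping follows; the variable names are illustrative
only.

```perl
use strict;
use warnings;

my $outer = 10 / 3;        # floating point by default
my ($int, $inner);
{
    use integer;
    $int = 10 / 3;         # integer division inside this BLOCK
    {
        no integer;
        $inner = 7 / 2;    # countermanded for the inner BLOCK only
    }
}

print "$int $inner\n";     # 3 3.5
```

Because the pragma is lexically scoped, C<$outer> is still computed in
floating point.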