2 ''' $Header: perl.man.1,v 3.0 89/10/18 15:21:29 lwall Locked $
4 ''' $Log: perl.man.1,v $
5 ''' Revision 3.0 89/10/18 15:21:29 lwall
27 ''' Set up \*(-- to give an unbreakable dash;
28 ''' string Tr holds user defined translation string.
29 ''' Bell System Logo is used as a dummy character.
34 .if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
35 .if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
52 perl \- Practical Extraction and Report Language
55 [options] filename args
58 is an interpreted language optimized for scanning arbitrary text files,
59 extracting information from those text files, and printing reports based
61 It's also a good language for many system management tasks.
62 The language is intended to be practical (easy to use, efficient, complete)
63 rather than beautiful (tiny, elegant, minimal).
64 It combines (in the author's opinion, anyway) some of the best features of C,
65 \fIsed\fR, \fIawk\fR, and \fIsh\fR,
66 so people familiar with those languages should have little difficulty with it.
67 (Language historians will also note some vestiges of \fIcsh\fR, Pascal, and
69 Expression syntax corresponds quite closely to C expression syntax.
70 Unlike most Unix utilities,
72 does not arbitrarily limit the size of your data\*(--if you've got
75 can slurp in your whole file as a single string.
76 Recursion is of unlimited depth.
77 And the hash tables used by associative arrays grow as necessary to prevent
80 uses sophisticated pattern matching techniques to scan large amounts of
82 Although optimized for scanning text,
84 can also deal with binary data, and can make dbm files look like associative
85 arrays (where dbm is available).
88 scripts are safer than C programs
89 through a dataflow tracing mechanism which prevents many stupid security holes.
90 If you have a problem that would ordinarily use \fIsed\fR
91 or \fIawk\fR or \fIsh\fR, but it
92 exceeds their capabilities or must run a little faster,
93 and you don't want to write the silly thing in C, then
96 There are also translators to turn your
107 looks for your script in one of the following places:
109 Specified line by line via
111 switches on the command line.
113 Contained in the file specified by the first filename on the command line.
114 (Note that systems supporting the #! notation invoke interpreters this way.)
116 Passed in implicitly via standard input.
117 This only works if there are no filename arguments\*(--to pass
120 script you must explicitly specify a \- for the script name.
122 After locating your script,
124 compiles it to an internal form.
125 If the script is syntactically correct, it is executed.
127 Note: on first reading this section may not make much sense to you. It's here
128 at the front for easy reference.
130 A single-character option may be combined with the following option, if any.
131 This is particularly useful when invoking a script using the #! construct which
132 only allows one argument. Example:
136 #!/usr/bin/perl \-spi.bak # same as \-s \-p \-i.bak
143 turns on autosplit mode when used with a
147 An implicit split command to the @F array
148 is done as the first thing inside the implicit while loop produced by
155 perl \-ane \'print pop(@F), "\en";\'
161 print pop(@F), "\en";
167 runs the script under the perl debugger.
168 See the section on Debugging.
171 sets debugging flags.
172 To watch how it executes your script, use
174 (This only works if debugging is compiled into your
176 Another nice value is \-D1024, which lists your compiled syntax tree.
177 And \-D512 displays compiled regular expressions.
179 .BI \-e " commandline"
180 may be used to enter one line of script.
183 commands may be given to build up a multi-line script.
188 will not look for a script filename in the argument list.
191 specifies that files processed by the <> construct are to be edited
193 It does this by renaming the input file, opening the output file by the
194 same name, and selecting that output file as the default for print statements.
195 The extension, if supplied, is added to the name of the
196 old file to make a backup copy.
197 If no extension is supplied, no backup is made.
198 Saying \*(L"perl \-p \-i.bak \-e "s/foo/bar/;" .\|.\|. \*(R" is the same as using
203 #!/usr/bin/perl \-pi.bak
206 which is equivalent to
211 if ($ARGV ne $oldargv) {
212 rename($ARGV, $ARGV . \'.bak\');
213 open(ARGVOUT, ">$ARGV");
220 print; # this prints to original filename
227 form doesn't need to compare $ARGV to $oldargv to know when
228 the filename has changed.
229 It does, however, use ARGVOUT for the selected filehandle.
232 is restored as the default output filehandle after the loop.
234 You can use eof to locate the end of each input file, in case you want
235 to append to each file, or reset line numbering (see example under eof).
238 may be used in conjunction with
240 to tell the C preprocessor where to look for include files.
241 By default /usr/include and /usr/lib/perl are searched.
246 to assume the following loop around your script, which makes it iterate
247 over filename arguments somewhat like \*(L"sed \-n\*(R" or \fIawk\fR:
252 .\|.\|. # your script goes here
256 Note that the lines are not printed by default.
259 to have lines printed.
260 Here is an efficient way to delete all files older than a week:
263 find . \-mtime +7 \-print | perl \-ne \'chop;unlink;\'
266 This is faster than using the \-exec switch of find because you don't have to
267 start a process on every filename found.
272 to assume the following loop around your script, which makes it iterate
273 over filename arguments somewhat like \fIsed\fR:
278 .\|.\|. # your script goes here
284 Note that the lines are printed automatically.
285 To suppress printing use the
295 causes your script to be run through the C preprocessor before
298 (Since both comments and cpp directives begin with the # character,
299 you should avoid starting comments with any words recognized
300 by the C preprocessor such as \*(L"if\*(R", \*(L"else\*(R" or \*(L"define\*(R".)
303 enables some rudimentary switch parsing for switches on the command line
304 after the script name but before any filename arguments (or before a \-\|\-).
305 Any switch found there is removed from @ARGV and sets the corresponding variable in the
308 The following script prints \*(L"true\*(R" if and only if the script is
309 invoked with a \-xyz switch.
314 if ($xyz) { print "true\en"; }
321 use the PATH environment variable to search for the script
322 (unless the name of the script starts with a slash).
323 Typically this is used to emulate #! startup on machines that don't
324 support #!, in the following manner:
328 eval "exec /usr/bin/perl \-S $0 $*"
329 if $running_under_some_shell;
332 The system ignores the first line and feeds the script to /bin/sh,
333 which proceeds to try to execute the
335 script as a shell script.
336 The shell executes the second line as a normal shell command, and thus
340 On some systems $0 doesn't always contain the full pathname,
345 to search for the script if necessary.
348 locates the script, it parses the lines and ignores them because
349 the variable $running_under_some_shell is never true.
354 to dump core after compiling your script.
355 You can then take this core dump and turn it into an executable file
356 by using the undump program (not supplied).
357 This speeds startup at the expense of some disk space (which you can
358 minimize by stripping the executable).
359 (Still, a "hello world" executable comes out to about 200K on my machine.)
360 If you are going to run your executable as a set-id program then you
361 should probably compile it using taintperl rather than normal perl.
362 If you want to execute a portion of your script before dumping, use the
363 dump operator instead.
368 to do unsafe operations.
369 Currently the only \*(L"unsafe\*(R" operation is the unlinking of directories while
370 running as superuser.
373 prints the version and patchlevel of your
378 prints warnings about identifiers that are mentioned only once, and scalar
379 variables that are used before being set.
380 Also warns about redefined subroutines, and references to undefined
381 filehandles or filehandles opened readonly that you are attempting to
383 Also warns you if you use == on values that don't look like numbers, and if
384 your subroutines recurse more than 100 deep.
385 .Sh "Data Types and Objects"
388 has three data types: scalars, arrays of scalars, and
389 associative arrays of scalars.
390 Normal arrays are indexed by number, and associative arrays by string.
392 The interpretation of operations and values in perl sometimes
393 depends on the requirements
394 of the context around the operation or value.
395 There are three major contexts: string, numeric and array.
396 Certain operations return array values
397 in contexts wanting an array, and scalar values otherwise.
398 (If this is true of an operation it will be mentioned in the documentation
400 Operations which return scalars don't care whether the context is looking
401 for a string or a number, but
402 scalar variables and values are interpreted as strings or numbers
403 as appropriate to the context.
404 A scalar is interpreted as TRUE in the boolean sense if it is not the null
406 Booleans returned by operators are 1 for true and \'0\' or \'\' (the null
409 There are actually two varieties of null string: defined and undefined.
410 Undefined null strings are returned when there is no real value for something,
411 such as when there was an error, or at end of file, or when you refer
412 to an uninitialized variable or element of an array.
413 An undefined null string may become defined the first time you access it, but
414 prior to that you can use the defined() operator to determine whether the
415 value is defined or not.
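The difference between a defined and an undefined value can be checked directly with defined(). What follows is only a minimal sketch, written for a current perl 5 interpreter (the "my" declarations and the "use strict" line are assumptions that go beyond this manual), and the names %age and $unset are made up for illustration.

```perl
use strict;
use warnings;

my %age = ('tom', 0);   # hypothetical data: 'tom' holds the defined value 0
my $unset;              # never assigned, so its value is undefined

my @report;
push(@report, defined($unset)      ? 'defined' : 'undefined');
push(@report, defined($age{'tom'}) ? 'defined' : 'undefined');
push(@report, defined($age{'ann'}) ? 'defined' : 'undefined');
push(@report, $age{'tom'}          ? 'true'    : 'false');   # 0 is defined, yet false

print join(' ', @report), "\n";
```

The third test shows that an element of an associative array that was never stored is undefined, while the last one shows that a defined value can still be false in the boolean sense.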
417 References to scalar variables always begin with \*(L'$\*(R', even when referring
418 to a scalar that is part of an array.
423 $days \h'|2i'# a simple scalar variable
424 $days[28] \h'|2i'# 29th element of array @days
425 $days{\'Feb\'}\h'|2i'# one value from an associative array
426 $#days \h'|2i'# last index of array @days
428 but entire arrays or array slices are denoted by \*(L'@\*(R':
430 @days \h'|2i'# ($days[0], $days[1],\|.\|.\|. $days[n])
431 @days[3,4,5]\h'|2i'# same as @days[3.\|.5]
432 @days{'a','c'}\h'|2i'# same as ($days{'a'},$days{'c'})
434 and entire associative arrays are denoted by \*(L'%\*(R':
436 %days \h'|2i'# (key1, val1, key2, val2 .\|.\|.)
439 Any of these eight constructs may serve as an lvalue,
440 that is, may be assigned to.
441 (It also turns out that an assignment is itself an lvalue in
442 certain contexts\*(--see examples under s, tr and chop.)
443 Assignment to a scalar evaluates the righthand side in a scalar context,
444 while assignment to an array or array slice evaluates the righthand side
447 You may find the length of array @days by evaluating
448 \*(L"$#days\*(R", as in
450 (Actually, it's not the length of the array, it's the subscript of the last element, since there is (ordinarily) a 0th element.)
451 Assigning to $#days changes the length of the array.
452 Shortening an array by this method does not actually destroy any values.
453 Lengthening an array that was previously shortened recovers the values that
454 were in those elements.
455 You can also gain some measure of efficiency by preextending an array that
457 (You can also extend an array by assigning to an element that is off the
459 This differs from assigning to $#whatever in that intervening values
460 are set to null rather than recovered.)
461 You can truncate an array down to nothing by assigning the null list () to
463 The following are exactly equivalent
467 $#whatever = $[ \- 1;
471 Multi-dimensional arrays are not directly supported, but see the discussion
472 of the $; variable later for a means of emulating multiple subscripts with
473 an associative array.
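As a rough illustration of that emulation, the sketch below stores one value under two subscripts. It assumes a current perl 5 interpreter, where a comma-separated subscript list is joined with the $; variable to form a single key; the names %grid, $sep, $row and $col are hypothetical.

```perl
use strict;
use warnings;

my %grid;                          # hypothetical two-dimensional table
$grid{1,2} = 'x';                  # the subscripts are joined with $; into one key

my $sep = $;;                      # a copy of the subscript separator
my $key = join($sep, 1, 2);        # build the same key by hand
my $same = ($grid{$key} eq 'x') ? 1 : 0;

# Recover the original subscripts from the stored key.
my ($row, $col) = split(/\Q$sep\E/, (keys %grid)[0]);
print "row=$row col=$col same=$same\n";
```

Since the key is an ordinary string, this only works as long as the separator character never occurs in the subscripts themselves.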
475 Every data type has its own namespace.
476 You can, without fear of conflict, use the same name for a scalar variable,
477 an array, an associative array, a filehandle, a subroutine name, and/or
479 Since variable and array references always start with \*(L'$\*(R', \*(L'@\*(R',
480 or \*(L'%\*(R', the \*(L"reserved\*(R" words aren't in fact reserved
481 with respect to variable names.
482 (They ARE reserved with respect to labels and filehandles, however, which
483 don't have an initial special character.
484 Hint: you could say open(LOG,\'logfile\') rather than open(log,\'logfile\').
485 Using uppercase filehandles also improves readability and protects you
486 from conflict with future reserved words.)
487 Case IS significant\*(--\*(L"FOO\*(R", \*(L"Foo\*(R" and \*(L"foo\*(R" are all
489 Names which start with a letter may also contain digits and underscores.
490 Names which do not start with a letter are limited to one character,
491 e.g. \*(L"$%\*(R" or \*(L"$$\*(R".
492 (Most of the one character names have a predefined significance to
496 Numeric literals are specified in any of the usual floating point or
508 String literals are delimited by either single or double quotes.
509 They work much like shell quotes:
510 double-quoted string literals are subject to backslash and variable
511 substitution; single-quoted strings are not (except for \e\' and \e\e).
512 The usual backslash rules apply for making characters such as newline, tab, etc.
513 You can also embed newlines directly in your strings, i.e. they can end on
514 a different line than they begin.
515 This is nice, but if you forget your trailing quote, the error will not be
518 finds another line containing the quote character, which
519 may be much further on in the script.
520 Variable substitution inside strings is limited to scalar variables, normal
521 array values, and array slices.
522 (In other words, identifiers beginning with $ or @, followed by an optional
523 bracketed expression as a subscript.)
524 The following code segment prints out \*(L"The price is $100.\*(R"
528 $Price = \'$100\';\h'|3.5i'# not interpreted
529 print "The price is $Price.\e\|n";\h'|3.5i'# interpreted
532 Note that you can put curly brackets around the identifier to delimit it
533 from following alphanumerics.
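A short sketch of the curly bracket form, assuming a current perl 5 interpreter; the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $fruit  = 'apple';                    # hypothetical variable
my $plural = "I ate two ${fruit}s.";     # the brackets end the name before the "s"
print "$plural\n";
```

Without the brackets, "$fruits" would be taken as the name of a different variable.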
535 Array values are interpolated into double-quoted strings by joining all the
536 elements of the array with the delimiter specified in the $" variable,
538 (Since in versions of perl prior to 3.0 the @ character was not a metacharacter
539 in double-quoted strings, the interpolation of @array, $array[EXPR],
540 @array[LIST], $array{EXPR}, or @array{LIST} only happens if array is
541 referenced elsewhere in the program or is predefined.)
542 The following are equivalent:
546 $temp = join($",@ARGV);
552 Within search patterns (which also undergo double-quoteish substitution)
553 there is a bad ambiguity: Is /$foo[bar]/ to be
554 interpreted as /${foo}[bar]/ (where [bar] is a character class for the
555 regular expression) or as /${foo[bar]}/ (where [bar] is the subscript to
557 If @foo doesn't otherwise exist, then it's obviously a character class.
558 If @foo exists, perl takes a good guess about [bar], and is almost always right.
559 If it does guess wrong, or if you're just plain paranoid,
560 you can force the correct interpretation with curly brackets as above.
562 A line-oriented form of quoting is based on the shell here-is syntax.
563 Following a << you specify a string to terminate the quoted material, and all lines
564 following the current line down to the terminating string are the value
566 The terminating string may be either an identifier (a word), or some
568 If quoted, the type of quotes you use determines the treatment of the text,
569 just as in regular quoting.
570 An unquoted identifier works like double quotes.
571 There must be no space between the << and the identifier.
572 (If you put a space it will be treated as a null identifier, which is
573 valid, and matches the first blank line\*(--see Merry Christmas example below.)
574 The terminating string must appear by itself (unquoted and with no surrounding
575 whitespace) on the terminating line.
578 print <<EOF; # same as above
582 print <<"EOF"; # same as above
586 print << x 10; # null identifier is delimiter
589 print <<`EOC`; # execute commands
594 print <<foo, <<bar; # you can stack them
601 Array literals are denoted by separating individual values by commas, and
602 enclosing the list in parentheses.
603 In a context not requiring an array value, the value of the array literal
604 is the value of the final element, as in the C comma operator.
609 @foo = (\'cc\', \'\-E\', $bar);
611 assigns the entire array value to array foo, but
613 $foo = (\'cc\', \'\-E\', $bar);
616 assigns the value of variable bar to variable foo.
617 Array lists may be assigned to if and only if each element of the list
621 ($a, $b, $c) = (1, 2, 3);
623 ($map{\'red\'}, $map{\'blue\'}, $map{\'green\'}) = (0x00f, 0x0f0, 0xf00);
625 The final element may be an array or an associative array:
627 ($a, $b, @rest) = split;
628 local($a, $b, %rest) = @_;
631 You can actually put an array anywhere in the list, but the first array
632 in the list will soak up all the values, and anything after it will get
634 This may be useful in a local().
636 An associative array literal contains pairs of values to be interpreted
637 as a key and a value:
641 # same as map assignment above
642 %map = ('red',0x00f,'blue',0x0f0,'green',0xf00);
645 Array assignment in a scalar context returns the number of elements
646 produced by the expression on the right side of the assignment:
649 $x = (($foo,$bar) = (3,2,1)); # set $x to 3, not 2
653 There are several other pseudo-literals that you should know about.
654 If a string is enclosed by backticks (grave accents), it first undergoes
655 variable substitution just like a double quoted string.
656 It is then interpreted as a command, and the output of that command
657 is the value of the pseudo-literal, as in a shell.
658 The command is executed each time the pseudo-literal is evaluated.
659 The status value of the command is returned in $? (see Predefined Names
660 for the interpretation of $?).
661 Unlike in \fIcsh\fR, no translation is done on the return
662 data\*(--newlines remain newlines.
663 Unlike in any of the shells, single quotes do not hide variable names
664 in the command from interpretation.
665 To pass a $ through to the shell you need to hide it with a backslash.
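The points above can be tried out with a harmless command. This is only a sketch: it assumes a Unix-like system with an echo command on the PATH and a current perl 5 interpreter, and the variable names are invented for the example.

```perl
use strict;
use warnings;

my $word   = 'hello';
my $out    = `echo $word`;     # $word is substituted before the shell sees the command
my $status = $?;               # status of the command just run

my $raw = `echo '$word'`;      # single quotes do not hide $word from perl

print "out=$out";              # the newline produced by echo is still there
```

Both commands hand the shell the word hello, because perl performs its own substitution first, whatever quoting the shell command contains.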
667 Evaluating a filehandle in angle brackets yields the next line
668 from that file (newline included, so it's never false until EOF, at
669 which time an undefined value is returned).
670 Ordinarily you must assign that value to a variable,
671 but there is one situation in which an automatic assignment happens.
672 If (and only if) the input symbol is the only thing inside the conditional of a
675 automatically assigned to the variable \*(L"$_\*(R".
676 (This may seem like an odd thing to you, but you'll use the construct
680 Anyway, the following lines are equivalent to each other:
684 while ($_ = <STDIN>) { print; }
685 while (<STDIN>) { print; }
686 for (\|;\|<STDIN>;\|) { print; }
687 print while $_ = <STDIN>;
702 will also work except in packages, where they would be interpreted as
703 local identifiers rather than global.)
704 Additional filehandles may be created with the
708 If a <FILEHANDLE> is used in a context that is looking for an array, an array
709 consisting of all the input lines is returned, one line per array element.
710 It's easy to make a LARGE data space this way, so use with care.
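The effect of context on an input symbol can be seen in a small sketch. To stay self-contained it reads from an in-memory file, which is a facility of current perl 5 interpreters and an assumption that goes beyond this manual; the names $text, $fh, $first and @rest are hypothetical.

```perl
use strict;
use warnings;

my $text = "one\ntwo\nthree\n";    # stands in for the contents of a real file
open(my $fh, '<', \$text) || die "Can't open in-memory file: $!";

my $first = <$fh>;    # scalar context: just the next line
my @rest  = <$fh>;    # array context: all remaining lines, one per element
close($fh);

print "first: $first";
print "remaining lines: ", scalar(@rest), "\n";
```

Each element keeps its trailing newline, just as a line read in a scalar context does.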
712 The null filehandle <> is special and can be used to emulate the behavior of
713 \fIsed\fR and \fIawk\fR.
714 Input from <> comes either from standard input, or from each file listed on
716 Here's how it works: the first time <> is evaluated, the ARGV array is checked,
717 and if it is null, $ARGV[0] is set to \'-\', which when opened gives you standard
719 The ARGV array is then processed as a list of filenames.
725 .\|.\|. # code for each line
731 unshift(@ARGV, \'\-\') \|if \|$#ARGV < $[;
732 while ($ARGV = shift) {
735 .\|.\|. # code for each line
740 except that it isn't as cumbersome to say.
741 It really does shift array ARGV and put the current filename into
743 It also uses filehandle ARGV internally.
744 You can modify @ARGV before the first <> as long as you leave the first
745 filename at the beginning of the array.
746 Line numbers ($.) continue as if the input was one big happy file.
747 (But see example under eof for how to reset line numbers on each file.)
750 If you want to set @ARGV to your own list of files, go right ahead.
751 If you want to pass switches into your script, you can
752 put a loop on the front like this:
756 while ($_ = $ARGV[0], /\|^\-/\|) {
758 last if /\|^\-\|\-$\|/\|;
759 /\|^\-D\|(.*\|)/ \|&& \|($debug = $1);
760 /\|^\-v\|/ \|&& \|$verbose++;
761 .\|.\|. # other switches
764 .\|.\|. # code for each line
768 The <> symbol will return FALSE only once.
769 If you call it again after this it will assume you are processing another
770 @ARGV list, and if you haven't set @ARGV, will input from
773 If the string inside the angle brackets is a reference to a scalar variable
775 then that variable contains the name of the filehandle to input from.
777 If the string inside angle brackets is not a filehandle, it is interpreted
778 as a filename pattern to be globbed, and either an array of filenames or the
779 next filename in the list is returned, depending on context.
780 One level of $ interpretation is done first, but you can't say <$foo>
781 because that's an indirect filehandle as explained in the previous
783 You could insert curly brackets to force interpretation as a
784 filename glob: <${foo}>.
796 open(foo, "echo *.c | tr \-s \' \et\er\ef\' \'\e\e012\e\e012\e\e012\e\e012\'|");
803 In fact, it's currently implemented that way.
804 (Which means it will not work on filenames with spaces in them unless
805 you have /bin/csh on your machine.)
806 Of course, the shortest way to do the above is:
816 script consists of a sequence of declarations and commands.
817 The only things that need to be declared in
819 are report formats and subroutines.
820 See the sections below for more information on those declarations.
821 All uninitialized user-created objects are assumed to
822 start with a null or 0 value until they
823 are defined by some explicit operation such as assignment.
824 The sequence of commands is executed just once, unlike in
828 scripts, where the sequence of commands is executed for each input line.
829 While this means that you must explicitly loop over the lines of your input file
830 (or files), it also means you have much more control over which files and which
832 (Actually, I'm lying\*(--it is possible to do an implicit loop with either the
838 A declaration can be put anywhere a command can, but has no effect on the
839 execution of the primary sequence of commands\*(--declarations all take effect
841 Typically all the declarations are put at the beginning or the end of the script.
844 is, for the most part, a free-form language.
845 (The only exception to this is format declarations, for fairly obvious reasons.)
846 Comments are indicated by the # character, and extend to the end of the line.
847 If you attempt to use /* */ C comments, it will be interpreted either as
848 division or pattern matching, depending on the context.
850 .Sh "Compound statements"
853 a sequence of commands may be treated as one command by enclosing it
855 We will call this a BLOCK.
857 The following compound commands may be used to control flow:
862 if (EXPR) BLOCK else BLOCK
863 if (EXPR) BLOCK elsif (EXPR) BLOCK .\|.\|. else BLOCK
864 LABEL while (EXPR) BLOCK
865 LABEL while (EXPR) BLOCK continue BLOCK
866 LABEL for (EXPR; EXPR; EXPR) BLOCK
867 LABEL foreach VAR (ARRAY) BLOCK
868 LABEL BLOCK continue BLOCK
871 Note that, unlike C and Pascal, these are defined in terms of BLOCKs, not
873 This means that the curly brackets are \fIrequired\fR\*(--no dangling statements allowed.
874 If you want to write conditionals without curly brackets there are several
876 The following all do the same thing:
880 if (!open(foo)) { die "Can't open $foo: $!"; }
881 die "Can't open $foo: $!" unless open(foo);
882 open(foo) || die "Can't open $foo: $!"; # foo or bust!
883 open(foo) ? \'hi mom\' : die "Can't open $foo: $!";
884 # a bit exotic, that last one
890 statement is straightforward.
891 Since BLOCKs are always bounded by curly brackets, there is never any
892 ambiguity about which
901 the sense of the test is reversed.
905 statement executes the block as long as the expression is true
906 (does not evaluate to the null string or 0).
907 The LABEL is optional, and if present, consists of an identifier followed by
909 The LABEL identifies the loop for the loop control statements
917 BLOCK, it is always executed just before
918 the conditional is about to be evaluated again, similarly to the third part
922 Thus it can be used to increment a loop variable, even when the loop has
923 been continued via the
925 statement (similar to the C \*(L"continue\*(R" statement).
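A minimal sketch of a continue BLOCK, assuming a current perl 5 interpreter; the names $i and @seen are made up. The increment lives in the continue BLOCK, so it still runs when the body is cut short by next.

```perl
use strict;
use warnings;

my $i = 0;
my @seen;
while ($i < 5) {
    next if $i % 2;        # skip odd values; the continue BLOCK still runs
    push(@seen, $i);
} continue {
    $i++;                  # executed before the conditional is tested again
}
print "@seen\n";
```

Had the increment been placed at the bottom of the loop body instead, the first odd value would have made the loop spin forever.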
929 is replaced by the word
931 the sense of the test is reversed, but the conditional is still tested before
938 statement, you may replace \*(L"(EXPR)\*(R" with a BLOCK, and the conditional
939 is true if the value of the last command in that block is true.
943 loop works exactly like the corresponding
949 for ($i = 1; $i < 10; $i++) {
963 The foreach loop iterates over a normal array value and sets the variable
964 VAR to be each element of the array in turn.
965 The \*(L"foreach\*(R" keyword is actually identical to the \*(L"for\*(R" keyword,
966 so you can use \*(L"foreach\*(R" for readability or \*(L"for\*(R" for brevity.
967 If VAR is omitted, $_ is set to each value.
968 If ARRAY is an actual array (as opposed to an expression returning an array
969 value), you can modify each element of the array
970 by modifying VAR inside the loop.
975 for (@ary) { s/foo/bar/; }
977 foreach $elem (@elements) {
982 for ((10,9,8,7,6,5,4,3,2,1,\'BOOM\')) {
983 print $_, "\en"; sleep(1);
986 for (1..15) { print "Merry Christmas\en"; }
989 foreach $item (split(/:[\e\e\en:]*/, $ENV{\'TERMCAP\'})) {
990 print "Item: $item\en";
995 The BLOCK by itself (labeled or not) is equivalent to a loop that executes
997 Thus you can use any of the loop control statements in it to leave or
1002 This construct is particularly nice for doing case structures.
1007 if (/^abc/) { $abc = 1; last foo; }
1008 if (/^def/) { $def = 1; last foo; }
1009 if (/^xyz/) { $xyz = 1; last foo; }
1014 There is no official switch statement in perl, because there
1015 are already several ways to write the equivalent.
1016 In addition to the above, you could write
1021 $abc = 1, last foo if /^abc/;
1022 $def = 1, last foo if /^def/;
1023 $xyz = 1, last foo if /^xyz/;
1031 /^abc/ && do { $abc = 1; last foo; }
1032 /^def/ && do { $def = 1; last foo; }
1033 /^xyz/ && do { $xyz = 1; last foo; }
1041 /^abc/ && ($abc = 1, last foo);
1042 /^def/ && ($def = 1, last foo);
1043 /^xyz/ && ($xyz = 1, last foo);
1051 { $abc = 1; last foo; }
1053 { $def = 1; last foo; }
1055 { $xyz = 1; last foo; }
1060 As it happens, these are all optimized internally to a switch structure,
1061 so perl jumps directly to the desired statement, and you needn't worry
1062 about perl executing a lot of unnecessary statements when you have a string
1063 of 50 elsifs, as long as you are testing the same simple scalar variable
1064 using ==, eq, or pattern matching as above.
1065 (If you're curious as to whether the optimizer has done this for a particular
1066 case statement, you can use the \-D1024 switch to list the syntax tree
1068 .Sh "Simple statements"
1069 The only kind of simple statement is an expression evaluated for its side
1071 Every expression (simple statement) must be terminated with a semicolon.
1072 Note that this is like C, but unlike Pascal (and
1075 Any simple statement may optionally be followed by a
1076 single modifier, just before the terminating semicolon.
1077 The possible modifiers are:
1091 modifiers have the expected semantics.
1096 modifiers also have the expected semantics (conditional evaluated first),
1097 except when applied to a do-BLOCK command,
1098 in which case the block executes once before the conditional is evaluated.
1099 This is so that you can write loops like:
1106 } until $_ \|eq \|".\|\e\|n";
1111 operator below. Note also that the loop control commands described later will
1112 NOT work in this construct, since modifiers don't take loop labels.
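The difference that a do-BLOCK makes to a modifier can be sketched as follows, assuming a current perl 5 interpreter. The list @input stands in for lines read from a file, and all the names are hypothetical.

```perl
use strict;
use warnings;

my @input = ("alpha\n", "beta\n", ".\n", "gamma\n");   # stand-in for input lines
my @read;
my $line;
do {
    $line = shift(@input);
    push(@read, $line);
} until $line eq ".\n";    # tested only after the block has run

my $count = 0;
do { $count++ } until 1;   # the conditional is already true, yet the block runs once

print scalar(@read), " lines read, count is $count\n";
```

The first loop stops after consuming the line holding a single period, so one element is left in @input.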
1117 expressions work almost exactly like C expressions, only the differences
1118 will be mentioned here.
1124 The exponentiation operator.
1126 The exponentiation assignment operator.
1128 The null list, used to initialize an array to null.
1130 Concatenation of two strings.
1132 The concatenation assignment operator.
1134 String equality (== is numeric equality).
1135 For a mnemonic just think of \*(L"eq\*(R" as a string.
1136 (If you are used to the
1138 behavior of using == for either string or numeric equality
1139 based on the current form of the comparands, beware!
1140 You must be explicit here.)
1142 String inequality (!= is numeric inequality).
1146 String greater than.
1148 String less than or equal.
1150 String greater than or equal.
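The string and numeric comparison operators can give different answers for the same operands. A small sketch, assuming a current perl 5 interpreter, with names invented for the example:

```perl
use strict;
use warnings;

my ($x, $y) = ('1.0', '1');
my $num_equal = ($x == $y)    ? 1 : 0;   # 1: numerically both are one
my $str_equal = ($x eq $y)    ? 1 : 0;   # 0: as strings they differ
my $str_less  = ('10' lt '9') ? 1 : 0;   # 1: the character "1" sorts before "9"
my $num_less  = (10 < 9)      ? 1 : 0;   # 0: ten is not less than nine

print "$num_equal $str_equal $str_less $num_less\n";
```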
1152 Certain operations search or modify the string \*(L"$_\*(R" by default.
1153 This operator makes that kind of operation work on some other string.
1154 The right argument is a search pattern, substitution, or translation.
1155 The left argument is what is supposed to be searched, substituted, or
1156 translated instead of the default \*(L"$_\*(R".
1157 The return value indicates the success of the operation.
1158 (If the right argument is an expression other than a search pattern,
1159 substitution, or translation, it is interpreted as a search pattern
1161 This is less efficient than an explicit search, since the pattern must
1162 be compiled every time the expression is evaluated.)
1163 The precedence of this operator is lower than unary minus and autoincrement/decrement, but higher than everything else.
1165 Just like =~ except the return value is negated.
1167 The repetition operator.
1168 Returns a string consisting of the left operand repeated the
1169 number of times specified by the right operand.
1172 print \'\-\' x 80; # print row of dashes
1173 print \'\-\' x80; # illegal, x80 is identifier
1175 print "\et" x ($tab/8), \' \' x ($tab%8); # tab over
1179 The repetition assignment operator.
1181 The range operator, which is really two different operators depending
1183 In an array context, returns an array of values counting (by ones)
1184 from the left value to the right value.
1185 This is useful for writing \*(L"for (1..10)\*(R" loops and for doing
1186 slice operations on arrays.
1188 In a scalar context, .\|. returns a boolean value.
1189 The operator is bistable, like a flip-flop.
1190 Each .\|. operator maintains its own boolean state.
1191 It is false as long as its left operand is false.
1192 Once the left operand is true, the range operator stays true
1193 until the right operand is true,
1194 AFTER which the range operator becomes false again.
1195 (It doesn't become false till the next time the range operator is evaluated.
1196 It can become false on the same evaluation it became true, but it still returns
1198 The right operand is not evaluated while the operator is in the \*(L"false\*(R" state,
1199 and the left operand is not evaluated while the operator is in the \*(L"true\*(R" state.
1200 The scalar .\|. operator is primarily intended for doing line number ranges
1202 the fashion of \fIsed\fR or \fIawk\fR.
1203 The precedence is a little lower than || and &&.
1204 The value returned is either the null string for false, or a sequence number
1205 (beginning with 1) for true.
1206 The sequence number is reset for each range encountered.
1207 The final sequence number in a range has the string \'E0\' appended to it, which
1208 doesn't affect its numeric value, but gives you something to search for if you
1209 want to exclude the endpoint.
1210 You can exclude the beginning point by waiting for the sequence number to be
1212 If either operand of scalar .\|. is static, that operand is implicitly compared
1213 to the $. variable, the current line number.
1218 As a scalar operator:
1219 if (101 .\|. 200) { print; } # print 2nd hundred lines
1221 next line if (1 .\|. /^$/); # skip header lines
1223 s/^/> / if (/^$/ .\|. eof()); # quote body
1226 As an array operator:
1227 for (101 .\|. 200) { print; } # print $_ 100 times
1229 @foo = @foo[$[ .\|. $#foo]; # an expensive no-op
1230 @foo = @foo[$#foo-4 .\|. $#foo]; # slice last 5 items
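The sequence numbers returned by the scalar form can be seen in a small sketch. It assumes a list of made-up lines in place of real input, and the names @lines, @seq and @body are hypothetical:

```perl
# Hypothetical input lines standing in for a file read with <>.
@lines = ('header', 'BEGIN', 'one', 'two', 'END', 'trailer');
@seq  = ();
@body = ();
for (@lines) {
    $r = (/^BEGIN$/ .. /^END$/);     # scalar context: flip-flop
    push(@seq, $r);
    # Skip the start point (sequence number 1) and the end point (E0 suffix).
    push(@body, $_) if $r > 1 && $r !~ /E0$/;
}
print join(',', @seq), "\n";         # prints ,1,2,3,4E0,
print join(',', @body), "\n";        # prints one,two
```

The empty first and last fields in the first line of output are the null strings returned while the operator is in its false state.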
1235 This unary operator takes one argument, either a filename or a filehandle,
1236 and tests the associated file to see if something is true about it.
1237 If the argument is omitted, tests $_, except for \-t, which tests
1239 It returns 1 for true and \'\' for false, or the undefined value if the
1241 Precedence is higher than logical and relational operators, but lower than
1242 arithmetic operators.
1243 The operator may be any of:
1245 \-r File is readable by effective uid.
1246 \-w File is writable by effective uid.
1247 \-x File is executable by effective uid.
1248 \-o File is owned by effective uid.
1249 \-R File is readable by real uid.
1250 \-W File is writable by real uid.
1251 \-X File is executable by real uid.
1252 \-O File is owned by real uid.
1254 \-z File has zero size.
1255 \-s File has non-zero size.
1256 \-f File is a plain file.
1257 \-d File is a directory.
1258 \-l File is a symbolic link.
1259 \-p File is a named pipe (FIFO).
1260 \-S File is a socket.
1261 \-b File is a block special file.
1262 \-c File is a character special file.
1263 \-u File has setuid bit set.
1264 \-g File has setgid bit set.
1265 \-k File has sticky bit set.
1266 \-t Filehandle is opened to a tty.
1267 \-T File is a text file.
1268 \-B File is a binary file (opposite of \-T).
1271 The interpretation of the file permission operators \-r, \-R, \-w, \-W, \-x and \-X
1272 is based solely on the mode of the file and the uids and gids of the user.
1273 There may be other reasons you can't actually read, write or execute the file.
1274 Also note that, for the superuser, \-r, \-R, \-w and \-W always return 1, and
1275 \-x and \-X return 1 if any execute bit is set in the mode.
1276 Scripts run by the superuser may thus need to do a stat() in order to determine
1277 the actual mode of the file, or temporarily set the uid to something else.
1285 next unless \-f $_; # ignore specials
1290 Note that \-s/a/b/ does not do a negated substitution.
1291 Saying \-exp($foo) still works as expected, however\*(--only single letters
1292 following a minus are interpreted as file tests.
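A few of the file tests can be tried on a scratch file. The sketch below is illustrative only: the file name filetest.$$ is hypothetical, and it assumes the current directory is writable.

```perl
# Hypothetical scratch file in the current directory (assumed writable).
$file = "filetest.$$";

open(OUT, ">$file") || die "cannot create $file: $!";
close(OUT);
$was_empty = (-z $file) ? 1 : 0;      # just created, so zero size

open(OUT, ">$file") || die "cannot reopen $file: $!";
print OUT "some data\n";
close(OUT);
$has_data = (-s $file) ? 1 : 0;       # now has non-zero size
$is_plain = (-f $file) ? 1 : 0;       # a plain file, not a directory
$is_dir   = (-d '.')   ? 1 : 0;       # the current directory

unlink($file);
$known = defined(-f $file) ? 1 : 0;   # file is gone: result is undefined

print "$was_empty $has_data $is_plain $is_dir $known\n";   # prints 1 1 1 1 0
```

Each result is reduced to 1 or 0 so that the undefined value returned for a missing file can be told apart from an ordinary false.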
1294 The \-T and \-B switches work as follows.
1295 The first block or so of the file is examined for odd characters such as
1296 strange control codes or metacharacters.
1297 If too many odd characters (>10%) are found, it's a \-B file; otherwise it's a \-T file.
1298 Also, any file containing a null in the first block is considered a binary file.
1299 If \-T or \-B is used on a filehandle, the current stdio buffer is examined
1300 rather than the first block.
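The heuristic just described can be approximated in a few lines. This is a rough sketch, not the interpreter's actual code: the helper name looks_binary is hypothetical, and the definition of an odd character used here (anything other than printable ASCII or the control codes from backspace through carriage return) is an assumption made for illustration.

```perl
# Hypothetical helper: returns 1 if a block of data looks binary, 0 otherwise.
# The definition of "odd" below is an assumption made for illustration.
sub looks_binary {
    local($block) = @_;
    local($len, $odd);
    $len = length($block);
    return 0 if $len == 0;              # nothing to judge
    return 1 if $block =~ /\0/;         # any null byte means binary
    # Count bytes that are neither printable ASCII nor common whitespace
    # (backspace through carriage return).
    $odd = ($block =~ tr/\x20-\x7e\x08-\x0d//c);
    return ($odd / $len > 0.10) ? 1 : 0;
}

print &looks_binary("plain text\n"), "\n";            # prints 0
print &looks_binary("a\0b"), "\n";                    # prints 1
print &looks_binary(("a" x 99) . "\x01"), "\n";       # prints 0 (1% odd)
print &looks_binary(("\x01" x 5) . "abc"), "\n";      # prints 1 (over 10% odd)
```

The sketch treats an empty block as text purely to avoid dividing by zero.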
1301 Both \-T and \-B return TRUE on a null file, or a file at EOF when testing
1304 If any of the file tests (or either stat operator) is given the special
1305 filehandle consisting of a solitary underline, then the stat structure
1306 of the previous file test (or stat operator) is used, saving a system
1308 (This doesn't work with \-t, and you need to remember that lstat and \-l
1309 will leave values in the stat structure for the symbolic link, not the
1314 print "Can do.\en" if -r $a || -w _ || -x _;
1318 print "Readable\en" if -r _;
1319 print "Writable\en" if -w _;
1320 print "Executable\en" if -x _;
1321 print "Setuid\en" if -u _;
1322 print "Setgid\en" if -g _;
1323 print "Sticky\en" if -k _;
1324 print "Text\en" if -T _;
1325 print "Binary\en" if -B _;
1329 Here is what C has that
1333 Address-of operator.
1335 Dereference-address operator.
1337 Type casting operator.
1341 does a certain amount of expression evaluation at compile time, whenever
1342 it determines that all of the arguments to an operator are static and have
1344 In particular, string concatenation happens at compile time between literals that don't do variable substitution.
1345 Backslash interpretation also happens at compile time.
1350 \'Now is the time for all\' . "\|\e\|n" .
1351 \'good men to come to.\'
1354 and this all reduces to one string internally.
1356 The autoincrement operator has a little extra built-in magic to it.
1357 If you increment a variable that is numeric, or that has ever been used in
1358 a numeric context, you get a normal increment.
1359 If, however, the variable has only been used in string contexts since it
1360 was set, and has a value that is not null and matches the
1361 pattern /^[a\-zA\-Z]*[0\-9]*$/, the increment is done
1362 as a string, preserving each character within its range, with carry:
1365 print ++($foo = \'99\'); # prints \*(L'100\*(R'
1366 print ++($foo = \'a0\'); # prints \*(L'a1\*(R'
1367 print ++($foo = \'Az\'); # prints \*(L'Ba\*(R'
1368 print ++($foo = \'zz\'); # prints \*(L'aaa\*(R'
1371 The autodecrement is not magical.
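A few more cases may make the carry rule easier to see. This is a small sketch with illustrative variable names, showing the results a current perl interpreter is expected to give:

```perl
# Illustrative starting values; each is used only as a string before ++.
@in  = ('99', 'a0', 'a9', 'Az', 'Zz', 'zz');
@out = ();
for $s (@in) {
    $t = $s;            # copy, so the original list is left alone
    $t++;               # magical string increment where the pattern matches
    push(@out, $t);
}
print join(' ', @out), "\n";    # prints 100 a1 b0 Ba AAa aaa

$d = 'ab';
$d--;                           # no magic: 'ab' is treated as 0
print "$d\n";                   # prints -1
```

Note how a9 carries from the digit into the letter, and how Zz grows by one character while keeping the case of each position.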