2 ''' $Header: perl.man.1,v 3.0.1.3 90/02/28 17:54:32 lwall Locked $
4 ''' $Log: perl.man.1,v $
5 ''' Revision 3.0.1.3 90/02/28 17:54:32 lwall
6 ''' patch9: @array in scalar context now returns length of array
7 ''' patch9: in manual, example of open and ?: was backwards
9 ''' Revision 3.0.1.2 89/11/17 15:30:03 lwall
10 ''' patch5: fixed some manual typos and indent problems
12 ''' Revision 3.0.1.1 89/11/11 04:41:22 lwall
13 ''' patch2: explained about sh and ${1+"$@"}
14 ''' patch2: documented that space must separate word and '' string
16 ''' Revision 3.0 89/10/18 15:21:29 lwall
38 ''' Set up \*(-- to give an unbreakable dash;
39 ''' string Tr holds user defined translation string.
40 ''' Bell System Logo is used as a dummy character.
45 .if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
46 .if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
.SH NAME
perl \- Practical Extraction and Report Language
.SH SYNOPSIS
.B perl
[options] filename args
.SH DESCRIPTION
\fIPerl\fR
is an interpreted language optimized for scanning arbitrary text files,
extracting information from those text files, and printing reports based
on that information.
It's also a good language for many system management tasks.
73 The language is intended to be practical (easy to use, efficient, complete)
74 rather than beautiful (tiny, elegant, minimal).
75 It combines (in the author's opinion, anyway) some of the best features of C,
76 \fIsed\fR, \fIawk\fR, and \fIsh\fR,
77 so people familiar with those languages should have little difficulty with it.
(Language historians will also note some vestiges of \fIcsh\fR, Pascal, and
other languages.)
Expression syntax corresponds quite closely to C expression syntax.
Unlike most Unix utilities,
\fIperl\fR
does not arbitrarily limit the size of your data\*(--if you've got
the memory,
\fIperl\fR
can slurp in your whole file as a single string.
Recursion is of unlimited depth.
And the hash tables used by associative arrays grow as necessary to prevent
degraded performance.
\fIPerl\fR
uses sophisticated pattern matching techniques to scan large amounts of
data very quickly.
Although optimized for scanning text,
\fIperl\fR
can also deal with binary data, and can make dbm files look like associative
arrays (where dbm is available).
Setuid \fIperl\fR
scripts are safer than C programs
through a dataflow tracing mechanism which prevents many stupid security holes.
If you have a problem that would ordinarily use \fIsed\fR
or \fIawk\fR or \fIsh\fR, but it
exceeds their capabilities or must run a little faster,
and you don't want to write the silly thing in C, then
\fIperl\fR may be for you.
There are also translators to turn your
\fIsed\fR and \fIawk\fR scripts into \fIperl\fR scripts.
Upon startup, \fIperl\fR
looks for your script in one of the following places:
.IP "1." 4
Specified line by line via
.B \-e
switches on the command line.
.IP "2." 4
Contained in the file specified by the first filename on the command line.
(Note that systems supporting the #! notation invoke interpreters this way.)
.IP "3." 4
Passed in implicitly via standard input.
This only works if there are no filename arguments\*(--to pass
arguments to a standard input
script you must explicitly specify a \- for the script name.
.PP
After locating your script,
\fIperl\fR
compiles it to an internal form.
If the script is syntactically correct, it is executed.
138 Note: on first reading this section may not make much sense to you. It's here
139 at the front for easy reference.
141 A single-character option may be combined with the following option, if any.
142 This is particularly useful when invoking a script using the #! construct which
143 only allows one argument. Example:
147 #!/usr/bin/perl \-spi.bak # same as \-s \-p \-i.bak
.B \-a
turns on autosplit mode when used with a
\-n or \-p.
An implicit split command to the @F array
is done as the first thing inside the implicit while loop produced by
the \-n or \-p.
.nf
	perl \-ane \'print pop(@F), "\en";\'
.fi
is equivalent to
.nf
	while (<>) {
		@F = split(\' \');
		print pop(@F), "\en";
	}
.fi
178 runs the script under the perl debugger.
179 See the section on Debugging.
182 sets debugging flags.
183 To watch how it executes your script, use
185 (This only works if debugging is compiled into your
187 Another nice value is \-D1024, which lists your compiled syntax tree.
188 And \-D512 displays compiled regular expressions.
190 .BI \-e " commandline"
191 may be used to enter one line of script.
194 commands may be given to build up a multi-line script.
199 will not look for a script filename in the argument list.
202 specifies that files processed by the <> construct are to be edited
204 It does this by renaming the input file, opening the output file by the
205 same name, and selecting that output file as the default for print statements.
206 The extension, if supplied, is added to the name of the
207 old file to make a backup copy.
208 If no extension is supplied, no backup is made.
209 Saying \*(L"perl \-p \-i.bak \-e "s/foo/bar/;" .\|.\|. \*(R" is the same as using
214 #!/usr/bin/perl \-pi.bak
217 which is equivalent to
222 if ($ARGV ne $oldargv) {
223 rename($ARGV, $ARGV . \'.bak\');
224 open(ARGVOUT, ">$ARGV");
231 print; # this prints to original filename
238 form doesn't need to compare $ARGV to $oldargv to know when
239 the filename has changed.
240 It does, however, use ARGVOUT for the selected filehandle.
243 is restored as the default output filehandle after the loop.
245 You can use eof to locate the end of each input file, in case you want
246 to append to each file, or reset line numbering (see example under eof).
249 may be used in conjunction with
251 to tell the C preprocessor where to look for include files.
252 By default /usr/include and /usr/lib/perl are searched.
257 to assume the following loop around your script, which makes it iterate
258 over filename arguments somewhat like \*(L"sed \-n\*(R" or \fIawk\fR:
263 .\|.\|. # your script goes here
267 Note that the lines are not printed by default.
270 to have lines printed.
271 Here is an efficient way to delete all files older than a week:
274 find . \-mtime +7 \-print | perl \-ne \'chop;unlink;\'
277 This is faster than using the \-exec switch of find because you don't have to
278 start a process on every filename found.
283 to assume the following loop around your script, which makes it iterate
284 over filename arguments somewhat like \fIsed\fR:
289 .\|.\|. # your script goes here
295 Note that the lines are printed automatically.
296 To suppress printing use the
306 causes your script to be run through the C preprocessor before
309 (Since both comments and cpp directives begin with the # character,
310 you should avoid starting comments with any words recognized
311 by the C preprocessor such as \*(L"if\*(R", \*(L"else\*(R" or \*(L"define\*(R".)
314 enables some rudimentary switch parsing for switches on the command line
315 after the script name but before any filename arguments (or before a \-\|\-).
316 Any switch found there is removed from @ARGV and sets the corresponding variable in the
319 The following script prints \*(L"true\*(R" if and only if the script is
320 invoked with a \-xyz switch.
325 if ($xyz) { print "true\en"; }
332 use the PATH environment variable to search for the script
333 (unless the name of the script starts with a slash).
334 Typically this is used to emulate #! startup on machines that don't
335 support #!, in the following manner:
339 eval "exec /usr/bin/perl \-S $0 $*"
340 if $running_under_some_shell;
343 The system ignores the first line and feeds the script to /bin/sh,
344 which proceeds to try to execute the
346 script as a shell script.
347 The shell executes the second line as a normal shell command, and thus
351 On some systems $0 doesn't always contain the full pathname,
356 to search for the script if necessary.
359 locates the script, it parses the lines and ignores them because
360 the variable $running_under_some_shell is never true.
361 A better construct than $* would be ${1+"$@"}, which handles embedded spaces
362 and such in the filenames, but doesn't work if the script is being interpreted
364 In order to start up sh rather than csh, some systems may have to replace the
365 #! line with a line containing just
366 a colon, which will be politely ignored by perl.
371 to dump core after compiling your script.
372 You can then take this core dump and turn it into an executable file
373 by using the undump program (not supplied).
374 This speeds startup at the expense of some disk space (which you can
375 minimize by stripping the executable).
376 (Still, a "hello world" executable comes out to about 200K on my machine.)
377 If you are going to run your executable as a set-id program then you
378 should probably compile it using taintperl rather than normal perl.
379 If you want to execute a portion of your script before dumping, use the
380 dump operator instead.
385 to do unsafe operations.
386 Currently the only \*(L"unsafe\*(R" operation is the unlinking of directories while
387 running as superuser.
390 prints the version and patchlevel of your
395 prints warnings about identifiers that are mentioned only once, and scalar
396 variables that are used before being set.
397 Also warns about redefined subroutines, and references to undefined
398 filehandles or filehandles opened readonly that you are attempting to
400 Also warns you if you use == on values that don't look like numbers, and if
401 your subroutines recurse more than 100 deep.
402 .Sh "Data Types and Objects"
\fIPerl\fR has three data types: scalars, arrays of scalars, and
406 associative arrays of scalars.
407 Normal arrays are indexed by number, and associative arrays by string.
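A minimal sketch of the three data types follows.
The names $count, @days and %len are made up for illustration and do not come from this manual.

```perl
# Hypothetical names, chosen only for this sketch.
$count = 3;                        # a scalar
@days  = ('Mon', 'Tue', 'Wed');    # a normal array, indexed by number
%len   = ('Mon', 3, 'Tue', 3);     # an associative array, indexed by string

$first = $days[0];                 # element 0 of @days
$size  = $len{'Tue'};              # the value stored under key 'Tue'
print "$count $first $size\n";
```

Note that a single element is written with a leading $, even though it lives in an array.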
409 The interpretation of operations and values in perl sometimes
410 depends on the requirements
411 of the context around the operation or value.
412 There are three major contexts: string, numeric and array.
413 Certain operations return array values
414 in contexts wanting an array, and scalar values otherwise.
415 (If this is true of an operation it will be mentioned in the documentation
417 Operations which return scalars don't care whether the context is looking
418 for a string or a number, but
419 scalar variables and values are interpreted as strings or numbers
420 as appropriate to the context.
A scalar is interpreted as TRUE in the boolean sense if it is not the null
string or 0.
Booleans returned by operators are 1 for true and 0 or \'\' (the null
string) for false.
426 There are actually two varieties of null string: defined and undefined.
427 Undefined null strings are returned when there is no real value for something,
428 such as when there was an error, or at end of file, or when you refer
429 to an uninitialized variable or element of an array.
430 An undefined null string may become defined the first time you access it, but
431 prior to that you can use the defined() operator to determine whether the
432 value is defined or not.
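The difference between the two kinds of null string can be sketched as follows.
The variable $unset is hypothetical and is deliberately never assigned.

```perl
# $unset is a hypothetical variable that is never assigned anywhere.
$empty = '';

$empty_defined = defined($empty) ? 1 : 0;     # a null string, but a defined one
$unset_defined = defined($unset) ? 1 : 0;     # no real value yet
$both_false    = (!$empty && !$unset) ? 1 : 0;  # both are false as booleans

print "$empty_defined $unset_defined $both_false\n";
```

Both values are false in a boolean test, so defined() is the way to tell them apart.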
434 References to scalar variables always begin with \*(L'$\*(R', even when referring
435 to a scalar that is part of an array.
440 $days \h'|2i'# a simple scalar variable
441 $days[28] \h'|2i'# 29th element of array @days
442 $days{\'Feb\'}\h'|2i'# one value from an associative array
443 $#days \h'|2i'# last index of array @days
445 but entire arrays or array slices are denoted by \*(L'@\*(R':
447 @days \h'|2i'# ($days[0], $days[1],\|.\|.\|. $days[n])
448 @days[3,4,5]\h'|2i'# same as @days[3.\|.5]
449 @days{'a','c'}\h'|2i'# same as ($days{'a'},$days{'c'})
451 and entire associative arrays are denoted by \*(L'%\*(R':
453 %days \h'|2i'# (key1, val1, key2, val2 .\|.\|.)
456 Any of these eight constructs may serve as an lvalue,
457 that is, may be assigned to.
458 (It also turns out that an assignment is itself an lvalue in
459 certain contexts\*(--see examples under s, tr and chop.)
460 Assignment to a scalar evaluates the righthand side in a scalar context,
while assignment to an array or array slice evaluates the righthand side
in an array context.
464 You may find the length of array @days by evaluating
465 \*(L"$#days\*(R", as in
467 (Actually, it's not the length of the array, it's the subscript of the last element, since there is (ordinarily) a 0th element.)
468 Assigning to $#days changes the length of the array.
469 Shortening an array by this method does not actually destroy any values.
470 Lengthening an array that was previously shortened recovers the values that
471 were in those elements.
You can also gain some measure of efficiency by preextending an array that
is going to get big.
(You can also extend an array by assigning to an element that is off the
end of the array.
This differs from assigning to $#whatever in that intervening values
are set to null rather than recovered.)
You can truncate an array down to nothing by assigning the null list () to
it.
The following are exactly equivalent:
.nf
	@whatever = ();
	$#whatever = $[ \- 1;
.fi
If you evaluate an array in a scalar context, it returns the length of
the array.
490 The following is always true:
493 @whatever == $#whatever \- $[ + 1;
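As a small sketch of the relation between the length and the last subscript, assuming the default array base of 0 (the array contents are invented for the example):

```perl
# Hypothetical array; the array base is assumed to be left at its default of 0.
@whatever = ('a', 'b', 'c');

$count = @whatever;     # array in a scalar context: the number of elements
$last  = $#whatever;    # subscript of the last element

print "count=$count last=$last\n";
```

With three elements the length is 3 and the last subscript is 2.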
497 Multi-dimensional arrays are not directly supported, but see the discussion
498 of the $; variable later for a means of emulating multiple subscripts with
499 an associative array.
500 You could also write a subroutine to turn multiple subscripts into a single
503 Every data type has its own namespace.
504 You can, without fear of conflict, use the same name for a scalar variable,
505 an array, an associative array, a filehandle, a subroutine name, and/or
507 Since variable and array references always start with \*(L'$\*(R', \*(L'@\*(R',
508 or \*(L'%\*(R', the \*(L"reserved\*(R" words aren't in fact reserved
509 with respect to variable names.
510 (They ARE reserved with respect to labels and filehandles, however, which
511 don't have an initial special character.
512 Hint: you could say open(LOG,\'logfile\') rather than open(log,\'logfile\').
513 Using uppercase filehandles also improves readability and protects you
514 from conflict with future reserved words.)
Case IS significant\*(--\*(L"FOO\*(R", \*(L"Foo\*(R" and \*(L"foo\*(R" are all
different names.
517 Names which start with a letter may also contain digits and underscores.
518 Names which do not start with a letter are limited to one character,
519 e.g. \*(L"$%\*(R" or \*(L"$$\*(R".
520 (Most of the one character names have a predefined significance to
524 Numeric literals are specified in any of the usual floating point or
536 String literals are delimited by either single or double quotes.
537 They work much like shell quotes:
538 double-quoted string literals are subject to backslash and variable
539 substitution; single-quoted strings are not (except for \e\' and \e\e).
540 The usual backslash rules apply for making characters such as newline, tab, etc.
541 You can also embed newlines directly in your strings, i.e. they can end on
542 a different line than they begin.
This is nice, but if you forget your trailing quote, the error will not be
reported until \fIperl\fR
finds another line containing the quote character, which
may be much further on in the script.
548 Variable substitution inside strings is limited to scalar variables, normal
549 array values, and array slices.
550 (In other words, identifiers beginning with $ or @, followed by an optional
551 bracketed expression as a subscript.)
552 The following code segment prints out \*(L"The price is $100.\*(R"
556 $Price = \'$100\';\h'|3.5i'# not interpreted
557 print "The price is $Price.\e\|n";\h'|3.5i'# interpreted
560 Note that you can put curly brackets around the identifier to delimit it
561 from following alphanumerics.
Also note that a single quoted string must be separated from a preceding
word by a space, since single quote is a valid character in an identifier.
.PP
Array values are interpolated into double-quoted strings by joining all the
elements of the array with the delimiter specified in the $" variable,
which is a space by default.
569 (Since in versions of perl prior to 3.0 the @ character was not a metacharacter
570 in double-quoted strings, the interpolation of @array, $array[EXPR],
571 @array[LIST], $array{EXPR}, or @array{LIST} only happens if array is
572 referenced elsewhere in the program or is predefined.)
573 The following are equivalent:
577 $temp = join($",@ARGV);
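The same equivalence can be sketched with an ordinary array instead of @ARGV.
The array @parts is a hypothetical stand-in, and $" is assumed to hold its default value of a single space.

```perl
# @parts is a made-up array; $" is assumed to be at its default (one space).
@parts = ('usr', 'local', 'bin');

$joined = join($", @parts);    # explicit join with the $" delimiter
$interp = "@parts";            # interpolation performs the same join

print "$interp\n" if $joined eq $interp;
```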
583 Within search patterns (which also undergo double-quotish substitution)
584 there is a bad ambiguity: Is /$foo[bar]/ to be
585 interpreted as /${foo}[bar]/ (where [bar] is a character class for the
regular expression) or as /${foo[bar]}/ (where [bar] is the subscript to
array @foo)?
588 If @foo doesn't otherwise exist, then it's obviously a character class.
589 If @foo exists, perl takes a good guess about [bar], and is almost always right.
590 If it does guess wrong, or if you're just plain paranoid,
591 you can force the correct interpretation with curly brackets as above.
593 A line-oriented form of quoting is based on the shell here-is syntax.
594 Following a << you specify a string to terminate the quoted material, and all lines
595 following the current line down to the terminating string are the value
The terminating string may be either an identifier (a word), or some
quoted text.
599 If quoted, the type of quotes you use determines the treatment of the text,
600 just as in regular quoting.
601 An unquoted identifier works like double quotes.
602 There must be no space between the << and the identifier.
603 (If you put a space it will be treated as a null identifier, which is
604 valid, and matches the first blank line\*(--see Merry Christmas example below.)
605 The terminating string must appear by itself (unquoted and with no surrounding
606 whitespace) on the terminating line.
609 print <<EOF; # same as above
613 print <<"EOF"; # same as above
617 print << x 10; # null identifier is delimiter
620 print <<`EOC`; # execute commands
625 print <<foo, <<bar; # you can stack them
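To illustrate how the quoting of the terminating string changes the treatment of the text, here is a short sketch that reuses the $Price variable from the earlier example.
The variable names $interp and $literal are invented for the illustration.

```perl
$Price = '$100';

# Double-quoted terminator: the text undergoes variable substitution.
$interp = <<"EOF";
The price is $Price.
EOF

# Single-quoted terminator: the text is taken literally.
$literal = <<'EOF';
The price is $Price.
EOF

print $interp, $literal;
```

The first string contains the substituted price, while the second still contains the characters $Price.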
632 Array literals are denoted by separating individual values by commas, and
633 enclosing the list in parentheses.
634 In a context not requiring an array value, the value of the array literal
635 is the value of the final element, as in the C comma operator.
640 @foo = (\'cc\', \'\-E\', $bar);
642 assigns the entire array value to array foo, but
644 $foo = (\'cc\', \'\-E\', $bar);
647 assigns the value of variable bar to variable foo.
Array lists may be assigned to if and only if each element of the list
is an lvalue:
652 ($a, $b, $c) = (1, 2, 3);
654 ($map{\'red\'}, $map{\'blue\'}, $map{\'green\'}) = (0x00f, 0x0f0, 0xf00);
656 The final element may be an array or an associative array:
658 ($a, $b, @rest) = split;
659 local($a, $b, %rest) = @_;
662 You can actually put an array anywhere in the list, but the first array
in the list will soak up all the values, and anything after it will get
a null value.
665 This may be useful in a local().
667 An associative array literal contains pairs of values to be interpreted
668 as a key and a value:
672 # same as map assignment above
673 %map = ('red',0x00f,'blue',0x0f0,'green',0xf00);
676 Array assignment in a scalar context returns the number of elements
677 produced by the expression on the right side of the assignment:
680 $x = (($foo,$bar) = (3,2,1)); # set $x to 3, not 2
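The context rules for array literals and array assignment can be put side by side in one sketch.
All names and values here are hypothetical.

```perl
# Hypothetical value for the sketch.
$bar = 'prog.c';

@foo = ('cc', '-E', $bar);            # array context: all three values
$foo = ('cc', '-E', $bar);            # scalar context: only the final element
$n   = @foo;                          # length of @foo
$count = (($x, $y) = (3, 2, 1));      # number of elements on the right side

print "$n $foo $count $x $y\n";
```

Here $foo receives the value of $bar, and $count is 3 even though only two variables were assigned.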
684 There are several other pseudo-literals that you should know about.
685 If a string is enclosed by backticks (grave accents), it first undergoes
686 variable substitution just like a double quoted string.
687 It is then interpreted as a command, and the output of that command
688 is the value of the pseudo-literal, like in a shell.
689 The command is executed each time the pseudo-literal is evaluated.
690 The status value of the command is returned in $? (see Predefined Names
691 for the interpretation of $?).
692 Unlike in \f2csh\f1, no translation is done on the return
693 data\*(--newlines remain newlines.
694 Unlike in any of the shells, single quotes do not hide variable names
695 in the command from interpretation.
696 To pass a $ through to the shell you need to hide it with a backslash.
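A minimal sketch of the backtick pseudo-literal, assuming a Unix-like system where the shell provides an echo command (the variable names are made up):

```perl
# Assumes a Unix-like shell with an "echo" command; names are hypothetical.
$word = 'hello';

$out    = `echo $word`;    # $word is substituted before the shell runs the command
$status = $?;              # status value of the command
chop($out);                # newlines remain newlines, so strip the trailing one

print "got '$out', status $status\n";
```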
698 Evaluating a filehandle in angle brackets yields the next line
699 from that file (newline included, so it's never false until EOF, at
700 which time an undefined value is returned).
701 Ordinarily you must assign that value to a variable,
702 but there is one situation where an automatic assignment happens.
If (and only if) the input symbol is the only thing inside the conditional of a
\fIwhile\fR loop, the value is
automatically assigned to the variable \*(L"$_\*(R".
(This may seem like an odd thing to you, but you'll use the construct
frequently.)
711 Anyway, the following lines are equivalent to each other:
715 while ($_ = <STDIN>) { print; }
716 while (<STDIN>) { print; }
717 for (\|;\|<STDIN>;\|) { print; }
718 print while $_ = <STDIN>;
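The filehandles STDIN, STDOUT and STDERR are predefined.
(The filehandles stdin, stdout and stderr
will also work except in packages, where they would be interpreted as
local identifiers rather than global.)
Additional filehandles may be created with the
\fIopen\fR function.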
739 If a <FILEHANDLE> is used in a context that is looking for an array, an array
740 consisting of all the input lines is returned, one line per array element.
741 It's easy to make a LARGE data space this way, so use with care.
743 The null filehandle <> is special and can be used to emulate the behavior of
744 \fIsed\fR and \fIawk\fR.
Input from <> comes either from standard input, or from each file listed on
the command line.
Here's how it works: the first time <> is evaluated, the ARGV array is checked,
and if it is null, $ARGV[0] is set to \'\-\', which when opened gives you standard
input.
The ARGV array is then processed as a list of filenames.
The loop
.nf
	while (<>) {
		.\|.\|.			# code for each line
	}
.fi
is equivalent to
.nf
	unshift(@ARGV, \'\-\') \|if \|$#ARGV < $[;
	while ($ARGV = shift) {
		open(ARGV, $ARGV);
		while (<ARGV>) {
			.\|.\|.		# code for each line
		}
	}
.fi
except that it isn't as cumbersome to say.
It really does shift array ARGV and put the current filename into
variable ARGV.
774 It also uses filehandle ARGV internally.
775 You can modify @ARGV before the first <> as long as you leave the first
776 filename at the beginning of the array.
777 Line numbers ($.) continue as if the input was one big happy file.
778 (But see example under eof for how to reset line numbers on each file.)
781 If you want to set @ARGV to your own list of files, go right ahead.
782 If you want to pass switches into your script, you can
783 put a loop on the front like this:
787 while ($_ = $ARGV[0], /\|^\-/\|) {
789 last if /\|^\-\|\-$\|/\|;
790 /\|^\-D\|(.*\|)/ \|&& \|($debug = $1);
791 /\|^\-v\|/ \|&& \|$verbose++;
792 .\|.\|. # other switches
795 .\|.\|. # code for each line
799 The <> symbol will return FALSE only once.
800 If you call it again after this it will assume you are processing another
@ARGV list, and if you haven't set @ARGV, will input from
standard input.
.PP
If the string inside the angle brackets is a reference to a scalar variable
(e.g. <$foo>),
then that variable contains the name of the filehandle to input from.
808 If the string inside angle brackets is not a filehandle, it is interpreted
809 as a filename pattern to be globbed, and either an array of filenames or the
810 next filename in the list is returned, depending on context.
811 One level of $ interpretation is done first, but you can't say <$foo>
because that's an indirect filehandle as explained in the previous
paragraph.
814 You could insert curly brackets to force interpretation as a
815 filename glob: <${foo}>.
827 open(foo, "echo *.c | tr \-s \' \et\er\ef\' \'\e\e012\e\e012\e\e012\e\e012\'|");
834 In fact, it's currently implemented that way.
835 (Which means it will not work on filenames with spaces in them unless
836 you have /bin/csh on your machine.)
837 Of course, the shortest way to do the above is:
A \fIperl\fR
script consists of a sequence of declarations and commands.
The only things that need to be declared in
\fIperl\fR
are report formats and subroutines.
See the sections below for more information on those declarations.
All uninitialized user-created objects are assumed to
start with a null or 0 value until they
are defined by some explicit operation such as assignment.
The sequence of commands is executed just once, unlike in
\fIsed\fR and \fIawk\fR
scripts, where the sequence of commands is executed for each input line.
While this means that you must explicitly loop over the lines of your input file
(or files), it also means you have much more control over which files and which
lines you look at.
(Actually, I'm lying\*(--it is possible to do an implicit loop with either the
\-n or \-p switch.)
.PP
A declaration can be put anywhere a command can, but has no effect on the
execution of the primary sequence of commands\*(--declarations all take effect
at compile time.
Typically all the declarations are put at the beginning or the end of the script.
.PP
\fIPerl\fR
is, for the most part, a free-form language.
876 (The only exception to this is format declarations, for fairly obvious reasons.)
877 Comments are indicated by the # character, and extend to the end of the line.
878 If you attempt to use /* */ C comments, it will be interpreted either as
879 division or pattern matching, depending on the context.
881 .Sh "Compound statements"
In \fIperl\fR,
a sequence of commands may be treated as one command by enclosing it
in curly brackets.
886 We will call this a BLOCK.
888 The following compound commands may be used to control flow:
893 if (EXPR) BLOCK else BLOCK
894 if (EXPR) BLOCK elsif (EXPR) BLOCK .\|.\|. else BLOCK
895 LABEL while (EXPR) BLOCK
896 LABEL while (EXPR) BLOCK continue BLOCK
897 LABEL for (EXPR; EXPR; EXPR) BLOCK
898 LABEL foreach VAR (ARRAY) BLOCK
899 LABEL BLOCK continue BLOCK
Note that, unlike C and Pascal, these are defined in terms of BLOCKs, not
statements.
This means that the curly brackets are \fIrequired\fR\*(--no dangling statements allowed.
If you want to write conditionals without curly brackets there are several
other ways to do it.
The following all do the same thing:
911 if (!open(foo)) { die "Can't open $foo: $!"; }
912 die "Can't open $foo: $!" unless open(foo);
913 open(foo) || die "Can't open $foo: $!"; # foo or bust!
914 open(foo) ? \'hi mom\' : die "Can't open $foo: $!";
915 # a bit exotic, that last one
The \fIif\fR
statement is straightforward.
Since BLOCKs are always bounded by curly brackets, there is never any
ambiguity about which
\fIif\fR an \fIelse\fR goes with.
If you use \fIunless\fR in place of \fIif\fR,
the sense of the test is reversed.
.PP
The \fIwhile\fR
statement executes the block as long as the expression is true
(does not evaluate to the null string or 0).
The LABEL is optional, and if present, consists of an identifier followed by
a colon.
The LABEL identifies the loop for the loop control statements
\fInext\fR, \fIlast\fR, and \fIredo\fR.
If there is a \fIcontinue\fR
BLOCK, it is always executed just before
the conditional is about to be evaluated again, similarly to the third part
of a \fIfor\fR loop in C.
Thus it can be used to increment a loop variable, even when the loop has
been continued via the
\fInext\fR
statement (similar to the C \*(L"continue\*(R" statement).
.PP
If the word \fIwhile\fR
is replaced by the word
\fIuntil\fR,
the sense of the test is reversed, but the conditional is still tested before
the first iteration.
.PP
In either the \fIif\fR or the \fIwhile\fR
statement, you may replace \*(L"(EXPR)\*(R" with a BLOCK, and the conditional
is true if the value of the last command in that block is true.
.PP
The \fIfor\fR
loop works exactly like the corresponding
\fIwhile\fR loop.
980 for ($i = 1; $i < 10; $i++) {
994 The foreach loop iterates over a normal array value and sets the variable
995 VAR to be each element of the array in turn.
996 The \*(L"foreach\*(R" keyword is actually identical to the \*(L"for\*(R" keyword,
997 so you can use \*(L"foreach\*(R" for readability or \*(L"for\*(R" for brevity.
998 If VAR is omitted, $_ is set to each value.
999 If ARRAY is an actual array (as opposed to an expression returning an array
1000 value), you can modify each element of the array
1001 by modifying VAR inside the loop.
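The in-place modification described above can be sketched as follows.
The array @prices and the loop variable $p are hypothetical names.

```perl
# @prices and $p are made-up names for this sketch.
@prices = (1, 2, 3);

foreach $p (@prices) {
    $p *= 2;               # modifies the array element itself, not a copy
}

$result = join(',', @prices);
print "$result\n";
```

After the loop every element of @prices has been doubled.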
1006 for (@ary) { s/foo/bar/; }
1008 foreach $elem (@elements) {
1013 for ((10,9,8,7,6,5,4,3,2,1,\'BOOM\')) {
1014 print $_, "\en"; sleep(1);
1017 for (1..15) { print "Merry Christmas\en"; }
foreach $item (split(/:[\e\e\en:]*/, $ENV{\'TERMCAP\'})) {
	print "Item: $item\en";
}
The BLOCK by itself (labeled or not) is equivalent to a loop that executes
once.
Thus you can use any of the loop control statements in it to leave or
restart the block.
1033 This construct is particularly nice for doing case structures.
1038 if (/^abc/) { $abc = 1; last foo; }
1039 if (/^def/) { $def = 1; last foo; }
1040 if (/^xyz/) { $xyz = 1; last foo; }
1045 There is no official switch statement in perl, because there
1046 are already several ways to write the equivalent.
1047 In addition to the above, you could write
1052 $abc = 1, last foo if /^abc/;
1053 $def = 1, last foo if /^def/;
1054 $xyz = 1, last foo if /^xyz/;
/^abc/ && do { $abc = 1; last foo; };
/^def/ && do { $def = 1; last foo; };
/^xyz/ && do { $xyz = 1; last foo; };
1072 /^abc/ && ($abc = 1, last foo);
1073 /^def/ && ($def = 1, last foo);
1074 /^xyz/ && ($xyz = 1, last foo);
1082 { $abc = 1; last foo; }
1084 { $def = 1; last foo; }
1086 { $xyz = 1; last foo; }
1091 As it happens, these are all optimized internally to a switch structure,
1092 so perl jumps directly to the desired statement, and you needn't worry
1093 about perl executing a lot of unnecessary statements when you have a string
1094 of 50 elsifs, as long as you are testing the same simple scalar variable
1095 using ==, eq, or pattern matching as above.
(If you're curious as to whether the optimizer has done this for a particular
case statement, you can use the \-D1024 switch to list the syntax tree
before execution.)
1099 .Sh "Simple statements"
The only kind of simple statement is an expression evaluated for its side
effects.
Every expression (simple statement) must be terminated with a semicolon.
Note that this is like C, but unlike Pascal (and \fIawk\fR).
1106 Any simple statement may optionally be followed by a
1107 single modifier, just before the terminating semicolon.
The possible modifiers are:
.nf
	if EXPR
	unless EXPR
	while EXPR
	until EXPR
.fi
The \fIif\fR and \fIunless\fR
modifiers have the expected semantics.
The \fIwhile\fR and \fIuntil\fR
modifiers also have the expected semantics (conditional evaluated first),
1128 except when applied to a do-BLOCK command,
1129 in which case the block executes once before the conditional is evaluated.
1130 This is so that you can write loops like:
1137 } until $_ \|eq \|".\|\e\|n";
(See the \fIdo\fR
operator below. Note also that the loop control commands described later will
NOT work in this construct, since modifiers don't take loop labels.)
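The difference between a modifier on a do-BLOCK and a modifier on an ordinary statement can be seen in this small sketch.
The counters $runs and $skipped are invented for the illustration.

```perl
# Hypothetical counters; the condition is already true before either statement.
$i = 10;
$runs = 0;
$skipped = 0;

do {
    $runs++;               # the block executes once before the test
} until $i > 5;

$skipped++ until $i > 5;   # ordinary statement: the test comes first

print "$runs $skipped\n";
```

The do-BLOCK body runs once even though the condition was true from the start, while the plain statement never runs.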
Since \fIperl\fR
expressions work almost exactly like C expressions, only the differences
1149 will be mentioned here.
.TP 8
.B **
The exponentiation operator.
.TP
.B **=
The exponentiation assignment operator.
.TP
.B ()
The null list, used to initialize an array to null.
.TP
.B .
Concatenation of two strings.
.TP
.B .=
The concatenation assignment operator.
.TP
.B eq
String equality (== is numeric equality).
For a mnemonic just think of \*(L"eq\*(R" as a string.
(If you are used to the
\fIawk\fR
behavior of using == for either string or numeric equality
based on the current form of the comparands, beware!
You must be explicit here.)
.TP
.B ne
String inequality (!= is numeric inequality).
.TP
.B lt
String less than.
.TP
.B gt
String greater than.
.TP
.B le
String less than or equal.
.TP
.B ge
String greater than or equal.
.TP
.B =~
Certain operations search or modify the string \*(L"$_\*(R" by default.
1184 This operator makes that kind of operation work on some other string.
1185 The right argument is a search pattern, substitution, or translation.
1186 The left argument is what is supposed to be searched, substituted, or
1187 translated instead of the default \*(L"$_\*(R".
1188 The return value indicates the success of the operation.
1189 (If the right argument is an expression other than a search pattern,
substitution, or translation, it is interpreted as a search pattern
at run time.
1192 This is less efficient than an explicit search, since the pattern must
1193 be compiled every time the expression is evaluated.)
1194 The precedence of this operator is lower than unary minus and autoincrement/decrement, but higher than everything else.
.TP
.B !~
Just like =~ except the return value is negated.
.TP
.B x
The repetition operator.
1199 Returns a string consisting of the left operand repeated the
1200 number of times specified by the right operand.
1203 print \'\-\' x 80; # print row of dashes
1204 print \'\-\' x80; # illegal, x80 is identifier
1206 print "\et" x ($tab/8), \' \' x ($tab%8); # tab over
.TP
.B x=
The repetition assignment operator.
.TP
.B .\|.
The range operator, which is really two different operators depending
on the context.
1214 In an array context, returns an array of values counting (by ones)
1215 from the left value to the right value.
1216 This is useful for writing \*(L"for (1..10)\*(R" loops and for doing
1217 slice operations on arrays.
1219 In a scalar context, .\|. returns a boolean value.
1220 The operator is bistable, like a flip-flop.
1221 Each .\|. operator maintains its own boolean state.
1222 It is false as long as its left operand is false.
1223 Once the left operand is true, the range operator stays true
1224 until the right operand is true,
1225 AFTER which the range operator becomes false again.
1226 (It doesn't become false till the next time the range operator is evaluated.
1227 It can become false on the same evaluation it became true, but it still returns true once.)
1229 The right operand is not evaluated while the operator is in the \*(L"false\*(R" state,
1230 and the left operand is not evaluated while the operator is in the \*(L"true\*(R" state.
1231 The scalar .\|. operator is primarily intended for doing line number ranges after
1233 the fashion of \fIsed\fR or \fIawk\fR.
1234 The precedence is a little lower than || and &&.
1235 The value returned is either the null string for false, or a sequence number
1236 (beginning with 1) for true.
1237 The sequence number is reset for each range encountered.
1238 The final sequence number in a range has the string \'E0\' appended to it, which
1239 doesn't affect its numeric value, but gives you something to search for if you
1240 want to exclude the endpoint.
1241 You can exclude the beginning point by waiting for the sequence number to be greater than 1.
1243 If either operand of scalar .\|. is static, that operand is implicitly compared
1244 to the $. variable, the current line number.
1249 As a scalar operator:
1250 if (101 .\|. 200) { print; } # print 2nd hundred lines
1252 next line if (1 .\|. /^$/); # skip header lines
1254 s/^/> / if (/^$/ .\|. eof()); # quote body
1257 As an array operator:
1258 for (101 .\|. 200) { print; } # print $_ 100 times
1260 @foo = @foo[$[ .\|. $#foo]; # an expensive no-op
1261 @foo = @foo[$#foo-4 .\|. $#foo]; # slice last 5 items
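The sequence numbers and the E0 marker described above can be observed without an input file. The following is a sketch only, assuming a current perl interpreter; the list of lines and the BEGIN/END markers are invented for the demonstration:

```perl
@lines = ('header', 'BEGIN', 'one', 'two', 'END', 'trailer');
@all = ();
@inner = ();

foreach $line (@lines) {
    # Scalar context: '' outside the range, a sequence number inside it.
    $seq = ($line =~ /^BEGIN$/) .. ($line =~ /^END$/);
    next unless $seq;
    push(@all, "$line=$seq");
    # Skip the first line (sequence 1) and the last one (marked with E0).
    push(@inner, $line) if $seq > 1 && $seq !~ /E0$/;
}

print join(' ', @all), "\n";      # prints: BEGIN=1 one=2 two=3 END=4E0
print join(' ', @inner), "\n";    # prints: one two

# Array context: the same operator builds a list of values.
@digits = (1 .. 5);
print join(',', @digits), "\n";   # prints: 1,2,3,4,5
```

Because both operands here are pattern matches rather than static values, neither one is compared against $. implicitly.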
1266 This unary operator takes one argument, either a filename or a filehandle,
1267 and tests the associated file to see if something is true about it.
1268 If the argument is omitted, tests $_, except for \-t, which tests STDIN.
1270 It returns 1 for true and \'\' for false, or the undefined value if the file doesn't exist.
1272 Precedence is higher than logical and relational operators, but lower than
1273 arithmetic operators.
1274 The operator may be any of:
1276 \-r File is readable by effective uid.
1277 \-w File is writable by effective uid.
1278 \-x File is executable by effective uid.
1279 \-o File is owned by effective uid.
1280 \-R File is readable by real uid.
1281 \-W File is writable by real uid.
1282 \-X File is executable by real uid.
1283 \-O File is owned by real uid.
1285 \-z File has zero size.
1286 \-s File has non-zero size.
1287 \-f File is a plain file.
1288 \-d File is a directory.
1289 \-l File is a symbolic link.
1290 \-p File is a named pipe (FIFO).
1291 \-S File is a socket.
1292 \-b File is a block special file.
1293 \-c File is a character special file.
1294 \-u File has setuid bit set.
1295 \-g File has setgid bit set.
1296 \-k File has sticky bit set.
1297 \-t Filehandle is opened to a tty.
1298 \-T File is a text file.
1299 \-B File is a binary file (opposite of \-T).
1302 The interpretation of the file permission operators \-r, \-R, \-w, \-W, \-x and \-X
1303 is based solely on the mode of the file and the uids and gids of the user.
1304 There may be other reasons you can't actually read, write or execute the file.
1305 Also note that, for the superuser, \-r, \-R, \-w and \-W always return 1, and
1306 \-x and \-X return 1 if any execute bit is set in the mode.
1307 Scripts run by the superuser may thus need to do a stat() in order to determine
1308 the actual mode of the file, or temporarily set the uid to something else.
1316 next unless \-f $_; # ignore specials
1321 Note that \-s/a/b/ does not do a negated substitution.
1322 Saying \-exp($foo) still works as expected, however\*(--only single letters
1323 following a minus are interpreted as file tests.
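A few of the tests can be tried on a scratch file. This is a sketch under assumptions: a current perl interpreter, a Unix-like system with a writable /tmp, and a hypothetical file name built from the process id:

```perl
$file = "/tmp/filetest_demo.$$";          # hypothetical scratch path

open(OUT, ">$file") || die "can't create $file: $!";
close(OUT);
$plain = (-f $file) ? 1 : 0;              # a plain file
$empty = (-z $file) ? 1 : 0;              # nothing written yet
$isdir = (-d $file) ? 1 : 0;              # not a directory

open(OUT, ">>$file") || die "can't append to $file: $!";
print OUT "hello\n";
close(OUT);
$nonempty = (-s $file) ? 1 : 0;           # now has non-zero size

unlink($file);
$result = -f $file;                       # file no longer exists
$gone = defined($result) ? 1 : 0;         # so the result is undefined

print "$plain $empty $isdir $nonempty $gone\n";   # prints: 1 1 0 1 0
```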
1325 The \-T and \-B switches work as follows.
1326 The first block or so of the file is examined for odd characters such as
1327 strange control codes or metacharacters.
1328 If too many odd characters (>10%) are found, it's a \-B file, otherwise it's a \-T file.
1329 Also, any file containing null in the first block is considered a binary file.
1330 If \-T or \-B is used on a filehandle, the current stdio buffer is examined
1331 rather than the first block.
1332 Both \-T and \-B return TRUE on a null file, or a file at EOF when testing a filehandle.
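The three cases mentioned (ordinary text, a file containing a null, and a null file) can be checked with scratch files. This is only a sketch, assuming a current perl interpreter and a writable /tmp; the file names are hypothetical, and the exact percentage threshold used by a given perl version is not exercised here:

```perl
$base = "/tmp/tb_demo.$$";                # hypothetical scratch prefix

open(F, ">$base.txt") || die "can't create $base.txt: $!";
print F "plain old text\n" x 10;
close(F);

open(F, ">$base.bin") || die "can't create $base.bin: $!";
print F "abc\0def\n";                     # a null byte in the first block
close(F);

open(F, ">$base.nul") || die "can't create $base.nul: $!";
close(F);                                 # left empty on purpose

$text_T  = (-T "$base.txt") ? 1 : 0;      # text file
$bin_B   = (-B "$base.bin") ? 1 : 0;      # binary because of the null
$bin_T   = (-T "$base.bin") ? 1 : 0;
$empty_T = (-T "$base.nul") ? 1 : 0;      # a null file passes both tests
$empty_B = (-B "$base.nul") ? 1 : 0;

unlink("$base.txt", "$base.bin", "$base.nul");

print "$text_T $bin_B $bin_T $empty_T $empty_B\n";   # prints: 1 1 0 1 1
```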
1335 If any of the file tests (or either stat operator) are given the special
1336 filehandle consisting of a solitary underline, then the stat structure
1337 of the previous file test (or stat operator) is used, saving a system call.
1339 (This doesn't work with \-t, and you need to remember that lstat and \-l
1340 will leave values in the stat structure for the symbolic link, not the real file.)
1345 print "Can do.\en" if -r $a || -w _ || -x _;
1349 print "Readable\en" if -r _;
1350 print "Writable\en" if -w _;
1351 print "Executable\en" if -x _;
1352 print "Setuid\en" if -u _;
1353 print "Setgid\en" if -g _;
1354 print "Sticky\en" if -k _;
1355 print "Text\en" if -T _;
1356 print "Binary\en" if -B _;
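As a further sketch, the saved stat structure can be reused when classifying a single path. This assumes a current perl interpreter; the path '.' and the array name are chosen only for illustration, and the outcome of the readability test depends on the directory it is run in:

```perl
$path = '.';                              # the current directory, for illustration
@facts = ();

if (-d $path) {                           # this test performs the stat call
    push(@facts, 'directory');
    push(@facts, 'plain file') if -f _;   # reuses the saved stat structure
    push(@facts, 'readable')   if -r _;   # so does this one
}

print join(', ', @facts), "\n";
```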
1360 Here is what C has that \fIperl\fR doesn't:
1364 Address-of operator.
1366 Dereference-address operator.
1368 Type casting operator.
1372 Like C, \fIperl\fR does a certain amount of expression evaluation at compile time, whenever
1373 it determines that all of the arguments to an operator are static and have no side effects.
1375 In particular, string concatenation happens at compile time between literals that don't do variable substitution.
1376 Backslash interpretation also happens at compile time.
1381 \'Now is the time for all\' . "\|\e\|n" .
1382 \'good men to come to.\'
1385 and this all reduces to one string internally.
1387 The autoincrement operator has a little extra built-in magic to it.
1388 If you increment a variable that is numeric, or that has ever been used in
1389 a numeric context, you get a normal increment.
1390 If, however, the variable has only been used in string contexts since it
1391 was set, and has a value that is not null and matches the
1392 pattern /^[a\-zA\-Z]*[0\-9]*$/, the increment is done
1393 as a string, preserving each character within its range, with carry:
1396 print ++($foo = \'99\'); # prints \*(L'100\*(R'
1397 print ++($foo = \'a0\'); # prints \*(L'a1\*(R'
1398 print ++($foo = \'Az\'); # prints \*(L'Ba\*(R'
1399 print ++($foo = \'zz\'); # prints \*(L'aaa\*(R'
1402 The autodecrement is not magical.
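The string increment is handy for generating successive labels. A minimal sketch, assuming a current perl interpreter, with variable names invented for the example:

```perl
# Generate successive labels by letting ++ carry through the alphabet.
$label = 'x';
@labels = ();
for (1 .. 4) {
    push(@labels, $label);
    $label++;
}
print join(' ', @labels), "\n";   # prints: x y z aa

$id = 'aa9';
$id++;                            # the digit carries into the letters

$count = 'ab';
$count--;                         # no magic: 'ab' is numerically 0

print "$id $count\n";             # prints: ab0 -1
```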