2 ''' $Header: perl.man.1,v 2.0.1.1 88/06/28 16:28:09 root Exp $
4 ''' $Log: perl.man.1,v $
5 ''' Revision 2.0.1.1 88/06/28 16:28:09 root
6 ''' patch1: fixed some quotes
7 ''' patch1: clarified syntax of LIST
9 ''' Revision 2.0 88/06/05 00:09:23 root
10 ''' Baseline version 2.0.
31 ''' Set up \*(-- to give an unbreakable dash;
32 ''' string Tr holds user defined translation string.
33 ''' Bell System Logo is used as a dummy character.
38 .if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
39 .if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
55 perl - Practical Extraction and Report Language
57 .B perl [options] filename args
is an interpreted language optimized for scanning arbitrary text files,
61 extracting information from those text files, and printing reports based
63 It's also a good language for many system management tasks.
64 The language is intended to be practical (easy to use, efficient, complete)
65 rather than beautiful (tiny, elegant, minimal).
66 It combines (in the author's opinion, anyway) some of the best features of C,
67 \fIsed\fR, \fIawk\fR, and \fIsh\fR,
68 so people familiar with those languages should have little difficulty with it.
69 (Language historians will also note some vestiges of \fIcsh\fR, Pascal, and
71 Expression syntax corresponds quite closely to C expression syntax.
72 If you have a problem that would ordinarily use \fIsed\fR
73 or \fIawk\fR or \fIsh\fR, but it
74 exceeds their capabilities or must run a little faster,
75 and you don't want to write the silly thing in C, then
78 There are also translators to turn your sed and awk scripts into perl scripts.
83 looks for your script in one of the following places:
85 Specified line by line via
87 switches on the command line.
89 Contained in the file specified by the first filename on the command line.
90 (Note that systems supporting the #! notation invoke interpreters this way.)
Passed in implicitly via standard input.
93 This only works if there are no filename arguments\*(--to pass
94 arguments to a stdin script you must explicitly specify a - for the script name.
96 After locating your script,
98 compiles it to an internal form.
99 If the script is syntactically correct, it is executed.
101 Note: on first reading this section may not make much sense to you. It's here
102 at the front for easy reference.
104 A single-character option may be combined with the following option, if any.
This is particularly useful when invoking a script using the #! construct, which
only allows one argument. Example:
110 #!/usr/bin/perl -spi.bak # same as -s -p -i.bak
117 turns on autosplit mode when used with a \-n or \-p.
118 An implicit split command to the @F array
119 is done as the first thing inside the implicit while loop produced by
123 perl -ane 'print pop(@F),"\en";'
135 sets debugging flags.
136 To watch how it executes your script, use
138 (This only works if debugging is compiled into your
142 may be used to enter one line of script.
145 commands may be given to build up a multi-line script.
150 will not look for a script filename in the argument list.
153 specifies that files processed by the <> construct are to be edited
155 It does this by renaming the input file, opening the output file by the
156 same name, and selecting that output file as the default for print statements.
157 The extension, if supplied, is added to the name of the
158 old file to make a backup copy.
159 If no extension is supplied, no backup is made.
160 Saying \*(L"perl -p -i.bak -e "s/foo/bar/;" .\|.\|. \*(R" is the same as using
165 #!/usr/bin/perl -pi.bak
168 which is equivalent to
173 if ($ARGV ne $oldargv) {
174 rename($ARGV,$ARGV . '.bak');
175 open(ARGVOUT,">$ARGV");
182 print; # this prints to original filename
187 except that the \-i form doesn't need to compare $ARGV to $oldargv to know when
188 the filename has changed.
189 It does, however, use ARGVOUT for the selected filehandle.
190 Note that stdout is restored as the default output filehandle after the loop.
192 You can use eof to locate the end of each input file, in case you want
193 to append to each file, or reset line numbering (see example under eof).
196 may be used in conjunction with
198 to tell the C preprocessor where to look for include files.
199 By default /usr/include and /usr/lib/perl are searched.
204 to assume the following loop around your script, which makes it iterate
205 over filename arguments somewhat like \*(L"sed -n\*(R" or \fIawk\fR:
210 .\|.\|. # your script goes here
214 Note that the lines are not printed by default.
217 to have lines printed.
218 Here is an efficient way to delete all files older than a week:
221 find . -mtime +7 -print | perl -ne 'chop;unlink;'
This is faster than using the -exec switch of find because you don't have to
225 start a process on every filename found.
230 to assume the following loop around your script, which makes it iterate
231 over filename arguments somewhat like \fIsed\fR:
236 .\|.\|. # your script goes here
242 Note that the lines are printed automatically.
243 To suppress printing use the
253 causes your script to be run through the C preprocessor before
256 (Since both comments and cpp directives begin with the # character,
257 you should avoid starting comments with any words recognized
258 by the C preprocessor such as \*(L"if\*(R", \*(L"else\*(R" or \*(L"define\*(R".)
261 enables some rudimentary switch parsing for switches on the command line
262 after the script name but before any filename arguments (or before a --).
263 Any switch found there is removed from @ARGV and sets the corresponding variable in the
266 The following script prints \*(L"true\*(R" if and only if the script is
267 invoked with a -xyz switch.
272 if ($xyz) { print "true\en"; }
277 makes perl use the PATH environment variable to search for the script
278 (unless the name of the script starts with a slash).
279 Typically this is used to emulate #! startup on machines that don't
280 support #!, in the following manner:
284 eval "exec /usr/bin/perl -S $0 $*"
285 if $running_under_some_shell;
288 The system ignores the first line and feeds the script to /bin/sh,
289 which proceeds to try to execute the perl script as a shell script.
290 The shell executes the second line as a normal shell command, and thus
291 starts up the perl interpreter.
292 On some systems $0 doesn't always contain the full pathname,
293 so the -S tells perl to search for the script if necessary.
294 After perl locates the script, it parses the lines and ignores them because
295 the variable $running_under_some_shell is never true.
298 allows perl to do unsafe operations.
299 Currently the only \*(L"unsafe\*(R" operation is the unlinking of directories while
300 running as superuser.
303 prints the version and patchlevel of your perl executable.
306 prints warnings about identifiers that are mentioned only once, and scalar
307 variables that are used before being set.
308 Also warns about redefined subroutines, and references to undefined
309 subroutines and filehandles.
310 .Sh "Data Types and Objects"
312 Perl has about two and a half data types: scalars, arrays of scalars, and
314 Scalars and arrays of scalars are first class objects, for the most part,
315 in the sense that they can be used as a whole as values in an expression.
Associative arrays can only be accessed on an association-by-association basis;
317 they don't have a value as a whole (at least not yet).
319 Scalars are interpreted as strings or numbers as appropriate.
320 A scalar is interpreted as TRUE in the boolean sense if it is not the null
322 Booleans returned by operators are 1 for true and '0' or '' (the null
325 References to scalar variables always begin with \*(L'$\*(R', even when referring
326 to a scalar that is part of an array.
331 $days \h'|2i'# a simple scalar variable
332 $days[28] \h'|2i'# 29th element of array @days
333 $days{'Feb'}\h'|2i'# one value from an associative array
334 $#days \h'|2i'# last index of array @days
336 but entire arrays are denoted by \*(L'@\*(R':
338 @days \h'|2i'# ($days[0], $days[1],\|.\|.\|. $days[n])
Any of these five constructs may serve as an lvalue,
343 that is, may be assigned to.
344 (You may also use an assignment to one of these lvalues as an lvalue in
345 certain contexts\*(--see s, tr and chop.)
346 You may find the length of array @days by evaluating
347 \*(L"$#days\*(R", as in
349 (Actually, it's not the length of the array, it's the subscript of the last element, since there is (ordinarily) a 0th element.)
350 Assigning to $#days changes the length of the array.
351 Shortening an array by this method does not actually destroy any values.
352 Lengthening an array that was previously shortened recovers the values that
353 were in those elements.
354 You can also gain some measure of efficiency by preextending an array that
356 (You can also extend an array by assigning to an element that is off the
358 This differs from assigning to $#whatever in that intervening values
359 are set to null rather than recovered.)
360 You can truncate an array down to nothing by assigning the null list () to
362 The following are exactly equivalent
366 $#whatever = $[ \- 1;
370 Every data type has its own namespace.
371 You can, without fear of conflict, use the same name for a scalar variable,
372 an array, an associative array, a filehandle, a subroutine name, and/or
374 Since variable and array references always start with \*(L'$\*(R'
375 or \*(L'@\*(R', the \*(L"reserved\*(R" words aren't in fact reserved
376 with respect to variable names.
377 (They ARE reserved with respect to labels and filehandles, however, which
378 don't have an initial special character.
379 Hint: you could say open(LOG,'logfile') rather than open(log,'logfile').)
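As a small illustration of the separate namespaces, the sketch below uses one name for a scalar, an array and an associative array at the same time.
The name "days" and the values are made up for the example, and the code was written to be acceptable to later perl releases as well, so treat it as a sketch rather than a statement about this particular version.

```perl
# One name, three unrelated objects.
$days = 7;                          # scalar variable
@days = ('Mon', 'Tue', 'Wed');      # array of scalars
$days{'Feb'} = 28;                  # element of associative array

# Each reference picks its object by the leading character and the
# kind of subscript, so none of them collide.
$summary = join(', ', $days, $days[1], $days{'Feb'}, $#days);
print $summary, "\n";               # prints "7, Tue, 28, 2"
```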
380 Case IS significant\*(--\*(L"FOO\*(R", \*(L"Foo\*(R" and \*(L"foo\*(R" are all
382 Names which start with a letter may also contain digits and underscores.
383 Names which do not start with a letter are limited to one character,
384 e.g. \*(L"$%\*(R" or \*(L"$$\*(R".
385 (Many one character names have a predefined significance to
389 String literals are delimited by either single or double quotes.
390 They work much like shell quotes:
391 double-quoted string literals are subject to backslash and variable
392 substitution; single-quoted strings are not.
393 The usual backslash rules apply for making characters such as newline, tab, etc.
394 You can also embed newlines directly in your strings, i.e. they can end on
395 a different line than they begin.
396 This is nice, but if you forget your trailing quote, the error will not be
397 reported until perl finds another line containing the quote character, which
398 may be much further on in the script.
399 Variable substitution inside strings is limited (currently) to simple scalar variables.
400 The following code segment prints out \*(L"The price is $100.\*(R"
404 $Price = '$100';\h'|3.5i'# not interpreted
405 print "The price is $Price.\e\|n";\h'|3.5i'# interpreted
408 Note that you can put curly brackets around the identifier to delimit it
409 from following alphanumerics.
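A minimal sketch of the curly-bracket form follows.
The variable names $unit and $braced are invented for the example; without the brackets, perl would look for a variable called $unitgram instead.

```perl
$Price = '$100';                        # single quotes: no substitution
$unit  = 'kilo';

$plain  = "The price is $Price.";       # $Price is substituted
$braced = "$Price per ${unit}gram";     # brackets end the name at "unit"

print $plain, "\n";                     # prints "The price is $100."
print $braced, "\n";                    # prints "$100 per kilogram"
```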
411 Array literals are denoted by separating individual values by commas, and
412 enclosing the list in parentheses.
413 In a context not requiring an array value, the value of the array literal
414 is the value of the final element, as in the C comma operator.
419 @foo = ('cc', '\-E', $bar);
421 assigns the entire array value to array foo, but
423 $foo = ('cc', '\-E', $bar);
426 assigns the value of variable bar to variable foo.
427 Array lists may be assigned to if and only if each element of the list
431 ($a, $b, $c) = (1, 2, 3);
433 ($map{'red'}, $map{'blue'}, $map{'green'}) = (0x00f, 0x0f0, 0xf00);
436 Array assignment returns the number of elements assigned.
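The three rules above (array context, scalar context, and the value of an array assignment) can be seen side by side in the following sketch.
The variable names are arbitrary, and the behavior shown is what later perl releases do; it is assumed rather than verified that this version agrees in every detail.

```perl
$bar = 'file.c';

@foo = ('cc', '-E', $bar);          # array context: all three values kept
$foo = ('cc', '-E', $bar);          # scalar context: only the final value

($x, $y, $z) = (1, 2, 3);           # element-by-element assignment

# An array assignment used as a value yields the number of elements.
$count = (($p, $q) = (10, 20));

print $#foo + 1, " $foo $count\n";  # prints "3 file.c 2"
```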
438 Numeric literals are specified in any of the usual floating point or
441 There are several other pseudo-literals that you should know about.
442 If a string is enclosed by backticks (grave accents), it first undergoes
443 variable substitution just like a double quoted string.
444 It is then interpreted as a command, and the output of that command
is the value of the pseudo-literal, as in a shell.
446 The command is executed each time the pseudo-literal is evaluated.
447 The status value of the command is returned in $? (see Predefined Names
448 for the interpretation of $?).
449 Unlike in \f2csh\f1, no translation is done on the return
450 data\*(--newlines remain newlines.
451 Unlike in any of the shells, single quotes do not hide variable names
452 in the command from interpretation.
453 To pass a $ through to the shell you need to hide it with a backslash.
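Here is a hedged sketch of the backtick rules just described.
It assumes a Unix-like system with an echo command on the search path; the variable names are made up for the example.

```perl
$name = 'world';

# $name is substituted by perl before the shell ever sees the command.
$greeting = `echo hello $name`;
$status   = $?;                     # exit status of the command

# Single quotes do not hide $5 from perl, so a backslash is needed
# to keep perl from treating it as a variable.
$literal = `echo 'cost: \$5'`;

print $greeting;                    # prints "hello world" and a newline
```

Note that the newline produced by echo is still on the end of $greeting; use chop to remove it if it is not wanted.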
455 Evaluating a filehandle in angle brackets yields the next line
456 from that file (newline included, so it's never false until EOF).
457 Ordinarily you must assign that value to a variable,
but there is one situation in which an automatic assignment happens.
459 If (and only if) the input symbol is the only thing inside the conditional of a
462 automatically assigned to the variable \*(L"$_\*(R".
463 (This may seem like an odd thing to you, but you'll use the construct
467 Anyway, the following lines are equivalent to each other:
471 while ($_ = <stdin>) {
473 for (\|;\|<stdin>;\|) {
482 Additional filehandles may be created with the
486 If a <FILEHANDLE> is used in a context that is looking for an array, an array
487 consisting of all the input lines is returned, one line per array element.
488 It's easy to make a LARGE data space this way, so use with care.
490 The null filehandle <> is special and can be used to emulate the behavior of
491 \fIsed\fR and \fIawk\fR.
492 Input from <> comes either from standard input, or from each file listed on
494 Here's how it works: the first time <> is evaluated, the ARGV array is checked,
495 and if it is null, $ARGV[0] is set to '-', which when opened gives you standard
497 The ARGV array is then processed as a list of filenames.
503 .\|.\|. # code for each line
509 unshift(@ARGV, '\-') \|if \|$#ARGV < $[;
510 while ($ARGV = shift) {
513 .\|.\|. # code for each line
518 except that it isn't as cumbersome to say.
519 It really does shift array ARGV and put the current filename into
521 It also uses filehandle ARGV internally.
522 You can modify @ARGV before the first <> as long as you leave the first
523 filename at the beginning of the array.
524 Line numbers ($.) continue as if the input was one big happy file.
525 (But see example under eof for how to reset line numbers on each file.)
528 If you want to set @ARGV to your own list of files, go right ahead.
529 If you want to pass switches into your script, you can
530 put a loop on the front like this:
534 while ($_ = $ARGV[0], /\|^\-/\|) {
536 last if /\|^\-\|\-$\|/\|;
537 /\|^\-D\|(.*\|)/ \|&& \|($debug = $1);
538 /\|^\-v\|/ \|&& \|$verbose++;
539 .\|.\|. # other switches
542 .\|.\|. # code for each line
546 The <> symbol will return FALSE only once.
547 If you call it again after this it will assume you are processing another
548 @ARGV list, and if you haven't set @ARGV, will input from stdin.
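To make the switch-parsing loop shown earlier concrete, the sketch below runs it over a hand-built @ARGV instead of a real command line.
The argument values are invented for the example; the loop body is the one from the text.

```perl
# Pretend the script was invoked as:  script -v -Dtrace -- file1
@ARGV = ('-v', '-Dtrace', '--', 'file1');

while ($_ = $ARGV[0], /^-/) {
    shift;                          # drop the switch from @ARGV
    last if /^--$/;                 # "--" ends switch processing
    /^-D(.*)/ && ($debug = $1);
    /^-v/ && $verbose++;
}

# Only the filename arguments are left for <> to process.
print "verbose=$verbose debug=$debug\n";    # prints "verbose=1 debug=trace"
print join(' ', @ARGV), "\n";               # prints "file1"
```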
550 If the string inside the angle brackets is a reference to a scalar variable
552 then that variable contains the name of the filehandle to input from.
554 If the string inside angle brackets is not a filehandle, it is interpreted
555 as a filename pattern to be globbed, and either an array of filenames or the
556 next filename in the list is returned, depending on context.
557 One level of $ interpretation is done first, but you can't say <$foo>
558 because that's an indirect filehandle as explained in the previous
560 You could insert curly brackets to force interpretation as a
561 filename glob: <${foo}>.
573 open(foo,"echo *.c | tr -s ' \et\er\ef' '\e\e012\e\e012\e\e012\e\e012'|");
580 In fact, it's currently implemented that way.
581 (Which means it will not work on filenames with spaces in them.)
582 Of course, the shortest way to do the above is:
592 script consists of a sequence of declarations and commands.
593 The only things that need to be declared in
595 are report formats and subroutines.
596 See the sections below for more information on those declarations.
597 All objects are assumed to start with a null or 0 value.
598 The sequence of commands is executed just once, unlike in
602 scripts, where the sequence of commands is executed for each input line.
603 While this means that you must explicitly loop over the lines of your input file
604 (or files), it also means you have much more control over which files and which
606 (Actually, I'm lying\*(--it is possible to do an implicit loop with either the
612 A declaration can be put anywhere a command can, but has no effect on the
613 execution of the primary sequence of commands.
614 Typically all the declarations are put at the beginning or the end of the script.
617 is, for the most part, a free-form language.
618 (The only exception to this is format declarations, for fairly obvious reasons.)
619 Comments are indicated by the # character, and extend to the end of the line.
If you attempt to use /* */ C comments, they will be interpreted either as
621 division or pattern matching, depending on the context.
623 .Sh "Compound statements"
626 a sequence of commands may be treated as one command by enclosing it
628 We will call this a BLOCK.
630 The following compound commands may be used to control flow:
635 if (EXPR) BLOCK else BLOCK
636 if (EXPR) BLOCK elsif (EXPR) BLOCK .\|.\|. else BLOCK
637 LABEL while (EXPR) BLOCK
638 LABEL while (EXPR) BLOCK continue BLOCK
639 LABEL for (EXPR; EXPR; EXPR) BLOCK
640 LABEL foreach VAR (ARRAY) BLOCK
641 LABEL BLOCK continue BLOCK
644 Note that, unlike C and Pascal, these are defined in terms of BLOCKs, not
646 This means that the curly brackets are \fIrequired\fR\*(--no dangling statements allowed.
647 If you want to write conditionals without curly brackets there are several
649 The following all do the same thing:
653 if (!open(foo)) { die "Can't open $foo"; }
654 die "Can't open $foo" unless open(foo);
655 open(foo) || die "Can't open $foo"; # foo or bust!
open(foo) ? 'hi mom' : die "Can't open $foo";
657 # a bit exotic, that last one
663 statement is straightforward.
664 Since BLOCKs are always bounded by curly brackets, there is never any
665 ambiguity about which
674 the sense of the test is reversed.
678 statement executes the block as long as the expression is true
679 (does not evaluate to the null string or 0).
680 The LABEL is optional, and if present, consists of an identifier followed by
682 The LABEL identifies the loop for the loop control statements
690 BLOCK, it is always executed just before
691 the conditional is about to be evaluated again, similarly to the third part
695 Thus it can be used to increment a loop variable, even when the loop has
696 been continued via the
698 statement (similar to the C \*(L"continue\*(R" statement).
702 is replaced by the word
704 the sense of the test is reversed, but the conditional is still tested before
711 statement, you may replace \*(L"(EXPR)\*(R" with a BLOCK, and the conditional
712 is true if the value of the last command in that block is true.
716 loop works exactly like the corresponding
722 for ($i = 1; $i < 10; $i++) {
736 The foreach loop iterates over a normal array value and sets the variable
737 VAR to be each element of the array in turn.
738 The \*(L"foreach\*(R" keyword is actually identical to the \*(L"for\*(R" keyword,
739 so you can use \*(L"foreach\*(R" for readability or \*(L"for\*(R" for brevity.
740 If VAR is omitted, $_ is set to each value.
741 If ARRAY is an actual array (as opposed to an expression returning an array
742 value), you can modify each element of the array
743 by modifying VAR inside the loop.
748 for (@ary) { s/foo/bar/; }
750 foreach $elem (@elements) {
754 for ((10,9,8,7,6,5,4,3,2,1,'BOOM')) {
755 print $_,"\en"; sleep(1);
foreach $item (split(/:[\e\e\en:]*/,$ENV{'TERMCAP'})) {
760 print "Item: $item\en";
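The point about modifying array elements through the loop variable may be easier to see in a short sketch.
The array names and values are invented for the example.

```perl
@prices = (1, 2, 3);
foreach $p (@prices) {
    $p *= 2;                        # changes the element itself, not a copy
}

@words = ('a', 'b', 'c');
for (@words) {                      # VAR omitted, so $_ is each element
    $_ .= '!';
}

print join(',', @prices), "\n";     # prints "2,4,6"
print join(',', @words), "\n";      # prints "a!,b!,c!"
```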
764 The BLOCK by itself (labeled or not) is equivalent to a loop that executes
766 Thus you can use any of the loop control statements in it to leave or
771 This construct is particularly nice for doing case structures.
776 if (/abc/) { $abc = 1; last foo; }
777 if (/def/) { $def = 1; last foo; }
778 if (/xyz/) { $xyz = 1; last foo; }
783 It's also nice for exiting subroutines early.
784 Note the double curly brackets:
798 .Sh "Simple statements"
799 The only kind of simple statement is an expression evaluated for its side
801 Every expression (simple statement) must be terminated with a semicolon.
802 Note that this is like C, but unlike Pascal (and
805 Any simple statement may optionally be followed by a
806 single modifier, just before the terminating semicolon.
807 The possible modifiers are:
821 modifiers have the expected semantics.
826 modifiers also have the expected semantics (conditional evaluated first),
827 except when applied to a do-BLOCK command,
828 in which case the block executes once before the conditional is evaluated.
829 This is so that you can write loops like:
836 } until $_ \|eq \|".\|\e\|n";
841 operator below. Note also that the loop control commands described later will
842 NOT work in this construct, since modifiers don't take loop labels.
847 expressions work almost exactly like C expressions, only the differences
848 will be mentioned here.
854 The null list, used to initialize an array to null.
856 Concatenation of two strings.
858 The corresponding assignment operator.
860 String equality (== is numeric equality).
861 For a mnemonic just think of \*(L"eq\*(R" as a string.
862 (If you are used to the
864 behavior of using == for either string or numeric equality
865 based on the current form of the comparands, beware!
866 You must be explicit here.)
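A short sketch of why the choice of operator matters follows.
The variable names are made up; the values were chosen so that the two comparisons disagree.

```perl
$x = '1.0';
$y = '1';

$num_equal = ($x == $y);            # compared as numbers: 1 (true)
$str_equal = ($x eq $y);            # compared as strings: '' (false)

# Strings that are not numbers are treated as 0 in a numeric
# comparison, so == calls these two equal.
$surprise = ('abc' == 'xyz');       # 1 (true)
```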
868 String inequality (!= is numeric inequality).
874 String less than or equal.
876 String greater than or equal.
878 Certain operations search or modify the string \*(L"$_\*(R" by default.
879 This operator makes that kind of operation work on some other string.
880 The right argument is a search pattern, substitution, or translation.
881 The left argument is what is supposed to be searched, substituted, or
882 translated instead of the default \*(L"$_\*(R".
883 The return value indicates the success of the operation.
884 (If the right argument is an expression other than a search pattern,
885 substitution, or translation, it is interpreted as a search pattern
887 This is less efficient than an explicit search, since the pattern must
888 be compiled every time the expression is evaluated.)
889 The precedence of this operator is lower than unary minus and autoincrement/decrement, but higher than everything else.
891 Just like =~ except the return value is negated.
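The following sketch applies a search and a substitution to strings other than $_.
The host name is invented for the example, and the last line uses an assignment as the left argument, which changes the copy and leaves the original alone.

```perl
$host = 'alpha.example.com';

$is_com  = ($host =~ /\.com$/);     # 1: the pattern matches $host
$not_org = ($host !~ /\.org$/);     # true: negated sense of =~

# Copy $host into $copy, then substitute in the copy only.
($copy = $host) =~ s/^alpha/beta/;

print "$host $copy\n";              # prints "alpha.example.com beta.example.com"
```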
893 The repetition operator.
894 Returns a string consisting of the left operand repeated the
895 number of times specified by the right operand.
898 print '-' x 80; # print row of dashes
899 print '-' x80; # illegal, x80 is identifier
901 print "\et" x ($tab/8), ' ' x ($tab%8); # tab over
905 The corresponding assignment operator.
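A minimal sketch of the repetition operator and its assignment form, using made-up variable names:

```perl
$rule = '-' x 10;                   # ten dashes
$pad  = 'ab';
$pad x= 3;                          # same as $pad = $pad x 3

print "$rule\n";                    # prints "----------"
print "$pad\n";                     # prints "ababab"
```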
907 The range operator, which is bistable.
908 Each .. operator maintains its own boolean state.
909 It is false as long as its left operand is false.
910 Once the left operand is true, the range operator stays true
911 until the right operand is true,
912 AFTER which the range operator becomes false again.
(It doesn't become false till the next time the range operator is evaluated.
914 It can become false on the same evaluation it became true, but it still returns
916 The right operand is not evaluated while the operator is in the \*(L"false\*(R" state,
917 and the left operand is not evaluated while the operator is in the \*(L"true\*(R" state.
918 The .. operator is primarily intended for doing line number ranges after
919 the fashion of \fIsed\fR or \fIawk\fR.
920 The precedence is a little lower than || and &&.
921 The value returned is either the null string for false, or a sequence number
922 (beginning with 1) for true.
923 The sequence number is reset for each range encountered.
924 The final sequence number in a range has the string 'E0' appended to it, which
925 doesn't affect its numeric value, but gives you something to search for if you
926 want to exclude the endpoint.
927 You can exclude the beginning point by waiting for the sequence number to be
929 If either operand of .. is static, that operand is implicitly compared to
930 the $. variable, the current line number.
935 if (101 .. 200) { print; } # print 2nd hundred lines
937 next line if (1 .. /^$/); # skip header lines
939 s/^/> / if (/^$/ .. eof()); # quote body
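The sequence numbers returned by the range operator can be watched directly with a sketch like the one below.
It walks an in-memory list of lines instead of a file, so both operands are pattern matches rather than line numbers; the data and the names @picked and @seqs are invented for the example.

```perl
@lines = ("a\n", "BEGIN\n", "b\n", "END\n", "c\n");

foreach $line (@lines) {
    # False until /^BEGIN/ matches, then true through the line
    # on which /^END/ matches.
    $seq = ($line =~ /^BEGIN/ .. $line =~ /^END/);
    if ($seq) {
        push(@picked, $line);
        push(@seqs, $seq);
    }
}

print join(',', @seqs), "\n";       # prints "1,2,3E0"
print @picked;                      # prints the BEGIN, b and END lines
```

The 'E0' on the final sequence number is what a script would search for in order to leave out the endpoint.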
944 This unary operator takes one argument, either a filename or a filehandle,
945 and tests the associated file to see if something is true about it.
946 If the argument is omitted, tests $_, except for \-t, which tests stdin.
947 It returns 1 for true and '' for false.
948 Precedence is higher than logical and relational operators, but lower than
949 arithmetic operators.
950 The operator may be any of:
952 \-r File is readable by effective uid.
953 \-w File is writeable by effective uid.
954 \-x File is executable by effective uid.
955 \-o File is owned by effective uid.
956 \-R File is readable by real uid.
957 \-W File is writeable by real uid.
958 \-X File is executable by real uid.
959 \-O File is owned by real uid.
961 \-z File has zero size.
962 \-s File has non-zero size.
963 \-f File is a plain file.
964 \-d File is a directory.
965 \-l File is a symbolic link.
966 \-p File is a named pipe (FIFO).
967 \-S File is a socket.
968 \-b File is a block special file.
969 \-c File is a character special file.
970 \-u File has setuid bit set.
971 \-g File has setgid bit set.
972 \-k File has sticky bit set.
973 \-t Filehandle is opened to a tty.
974 \-T File is a text file.
975 \-B File is a binary file (opposite of \-T).
978 The interpretation of the file permission operators \-r, \-R, \-w, \-W, \-x and \-X
979 is based solely on the mode of the file and the uids and gids of the user.
980 There may be other reasons you can't actually read, write or execute the file.
981 Also note that, for the superuser, \-r, \-R, \-w and \-W always return 1, and
982 \-x and \-X return 1 if any execute bit is set in the mode.
983 Scripts run by the superuser may thus need to do a stat() in order to determine
984 the actual mode of the file, or temporarily set the uid to something else.
992 next unless \-f $_; # ignore specials
997 Note that -s/a/b/ does not do a negated substitution.
998 Saying -exp($foo) still works as expected, however\*(--only single letters
999 following a minus are interpreted as file tests.
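As a quick sketch of the return values, the following tests the root directory, which is assumed to exist and to be a directory, as it is on any Unix system.
The variable names are made up for the example.

```perl
$root_is_dir  = (-d '/');           # 1: "/" is a directory
$root_is_file = (-f '/');           # '': "/" is not a plain file

$_ = '/';
$default_is_dir = -d;               # argument omitted, so $_ is tested
```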
1001 The \-T and \-B switches work as follows.
1002 The first block or so of the file is examined for odd characters such as
1003 strange control codes or metacharacters.
1004 If too many odd characters (>10%) are found, it's a \-B file, otherwise it's a \-T file.
1005 Also, any file containing null in the first block is considered a binary file.
1006 If \-T or \-B is used on a filehandle, the current stdio buffer is examined
1007 rather than the first block.
1008 Since input doesn't work well on binary files you should probably test a
1009 filehandle before doing any input if you're unsure of the nature of the
1010 filehandle you've been handed (usually via stdin).
1011 Both \-T and \-B return TRUE on a null file, or a file at EOF when testing
1014 Here is what C has that
1018 Address-of operator.
1020 Dereference-address operator.
1022 Type casting operator.
1026 does a certain amount of expression evaluation at compile time, whenever
1027 it determines that all of the arguments to an operator are static and have
1029 In particular, string concatenation happens at compile time between literals that don't do variable substitution.
1030 Backslash interpretation also happens at compile time.
1035 'Now is the time for all' . "\|\e\|n" .
1036 'good men to come to.'
1039 and this all reduces to one string internally.
1041 The autoincrement operator has a little extra built-in magic to it.
1042 If you increment a variable that is numeric, or that has ever been used in
1043 a numeric context, you get a normal increment.
1044 If, however, the variable has only been used in string contexts since it
1045 was set, and has a value that is not null and matches the
1046 pattern /^[a-zA-Z]*[0-9]*$/, the increment is done
1047 as a string, preserving each character within its range, with carry:
1050 print ++($foo = '99'); # prints '100'
1051 print ++($foo = 'a0'); # prints 'a1'
1052 print ++($foo = 'Az'); # prints 'Ba'
1053 print ++($foo = 'zz'); # prints 'aaa'
1056 The autodecrement is not magical.
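One use for the magical increment is generating a run of sequential names.
The sketch below is only an illustration; the starting value 'aa8' and the array name @names are invented, and the carry from the digit into the letters is the point of interest.

```perl
$name = 'aa8';                      # only ever used as a string
for ($i = 0; $i < 3; $i++) {
    push(@names, $name++);          # save the old value, then increment
}

print join(',', @names), "\n";      # prints "aa8,aa9,ab0"
```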
1058 Along with the literals and variables mentioned earlier,
1059 the following operations can serve as terms in an expression.
1060 Some of these operations take a LIST as an argument.
1061 Such a list can consist of any combination of scalar arguments or arrays;
1062 the arrays will be included in the list as if each individual element were
1063 interpolated at that point in the list.
1064 Elements of the LIST should be separated by commas.
1065 .Ip "/PATTERN/i" 8 4
1066 Searches a string for a pattern, and returns true (1) or false ('').
1067 If no string is specified via the =~ or !~ operator,
1068 the $_ string is searched.
1069 (The string specified with =~ need not be an lvalue\*(--it may be the result of an expression evaluation, but remember the =~ binds rather tightly.)
1070 See also the section on regular expressions.
1072 If you prepend an `m' you can use any pair of characters as delimiters.
1073 This is particularly useful for matching Unix path names that contain `/'.
1074 If the final delimiter is followed by the optional letter `i', the matching is
1075 done in a case-insensitive manner.
1077 If used in a context that requires an array value, a pattern match returns an
1078 array consisting of the subexpressions matched by the parens in pattern,
1079 i.e. ($1, $2, $3.\|.\|.).
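A sketch combining the three features just described (alternate delimiters, the `i' modifier, and the array value of a match) might look like this.
The path name and variable names are invented for the example.

```perl
$path = '/usr/spool/uucp/LOGFILE';

# With m and "#" as the delimiter, the slashes need no escaping.
# In an array context the match returns the parenthesized pieces.
($dir, $file) = ($path =~ m#^(.*)/([^/]+)$#);

# The trailing i makes the match case-insensitive.
$said_yes = ('Yes, please' =~ /^y/i);

print "$dir $file\n";               # prints "/usr/spool/uucp LOGFILE"
```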
1085 open(tty, '/dev/tty');
1086 <tty> \|=~ \|/\|^y\|/i \|&& \|do foo(\|); # do foo if desired
1088 if (/Version: \|*\|([0-9.]*\|)\|/\|) { $version = $1; }
1090 next if m#^/usr/spool/uucp#;
1092 if (($F1,$F2,$Etc) = ($foo =~ /^(\eS+)\es+(\eS+)\es*(.*)/))
1095 This last example splits $foo into the first two words and the remainder
1096 of the line, and assigns those three fields to $F1, $F2 and $Etc.
1097 The conditional is true if any variables were assigned, i.e. if the pattern
1100 This is just like the /pattern/ search, except that it matches only once between
This is a useful optimization when you only want to see the first occurrence of
1105 something in each file of a set of files, for instance.
1106 .Ip "chdir EXPR" 8 2
1107 Changes the working directory to EXPR, if possible.
1108 Returns 1 upon success, 0 otherwise.
1109 See example under die().
1110 .Ip "chmod LIST" 8 2
1111 Changes the permissions of a list of files.
1112 The first element of the list must be the numerical mode.
1113 Returns the number of files successfully changed.
1117 $cnt = chmod 0755,'foo','bar';
1118 chmod 0755,@executables;
1121 .Ip "chop(VARIABLE)" 8 5
1123 Chops off the last character of a string and returns it.
1124 It's used primarily to remove the newline from the end of an input record,
1125 but is much more efficient than s/\en// because it neither scans nor copies
1127 If VARIABLE is omitted, chops $_.
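A minimal sketch of both effects of chop, the shortened variable and the returned character, using a made-up input record:

```perl
$line = "root:x:0:0\n";
$last = chop($line);                # removes and returns the final character

# $line no longer ends in a newline, so the last field is clean.
@fields = split(/:/, $line);
print $#fields + 1, "\n";           # prints "4"
```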
1133 chop; # avoid \en on last field
1134 @array = split(/:/);
1139 You can actually chop anything that's an lvalue, including an assignment:
1145 .Ip "chown LIST" 8 2
1146 Changes the owner (and group) of a list of files.
1147 The first two elements of the list must be the NUMERICAL uid and gid,
1149 Returns the number of files successfully changed.
1153 $cnt = chown $uid,$gid,'foo','bar';
1154 chown $uid,$gid,@filenames;
1158 Here's an example of looking up non-numeric uids:
1167 open(pass,'/etc/passwd') || die "Can't open passwd";
1169 ($login,$pass,$uid,$gid) = split(/:/);
1170 $uid{$login} = $uid;
1171 $gid{$login} = $gid;
1173 @ary = <$pattern>; # get filenames
1174 if ($uid{$user} eq '') {
1175 die "$user not in passwd file";
1178 unshift(@ary,$uid{$user},$gid{$user});
1183 .Ip "close(FILEHANDLE)" 8 5
1184 .Ip "close FILEHANDLE" 8
1185 Closes the file or pipe associated with the file handle.
1186 You don't have to close FILEHANDLE if you are immediately going to
1187 do another open on it, since open will close it for you.
1190 However, an explicit close on an input file resets the line counter ($.), while
1191 the implicit close done by
1194 Also, closing a pipe will wait for the process executing on the pipe to complete,
1195 in case you want to look at the output of the pipe afterwards.
1200 open(output,'|sort >foo'); # pipe to sort
1201 .\|.\|. # print stuff to output
1202 close(output); # wait for sort to finish
1203 open(input,'foo'); # get sort's results
1206 FILEHANDLE may be an expression whose value gives the real filehandle name.
1207 .Ip "crypt(PLAINTEXT,SALT)" 8 6
1208 Encrypts a string exactly like the crypt() function in the C library.
1209 Useful for checking the password file for lousy passwords.
1210 Only the guys wearing white hats should do this.
1211 .Ip "delete $ASSOC{KEY}" 8 6
1212 Deletes the specified value from the specified associative array.
1213 Returns the deleted value.
1214 The following deletes all the values of an associative array:
1218 foreach $key (keys(ARRAY)) {
1219 delete $ARRAY{$key};
1223 (But it would be faster to use the reset command.)
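Because delete returns the value it removed, you can take an entry out of an associative array and keep its value in one step.
The following is only a minimal sketch; the array name "age" and its keys are invented for illustration.

```perl
# Hypothetical data: the array name "age" and its keys are made up for this sketch.
$age{'fred'} = 42;
$age{'barney'} = 39;

$old = delete $age{'fred'};     # $old receives the value that was stored
print "removed fred, who was $old\n";
print "fred is gone\n" if $age{'fred'} eq '';
```

The other entries of the array are left alone.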
1225 Prints the value of EXPR to stderr and exits with the current value of $!
1227 If $! is 0, exits with the value of ($? >> 8) (`command` status).
1228 If ($? >> 8) is 0, exits with 255.
1229 Equivalent examples:
1233 die "Can't cd to spool.\en" unless chdir '/usr/spool/news';
1235 chdir '/usr/spool/news' || die "Can't cd to spool.\en"
1239 If the value of EXPR does not end in a newline, the current script line
1240 number and input line number (if any) are also printed, and a newline is
1242 Hint: sometimes appending \*(L", stopped\*(R" to your message will cause it to make
1243 better sense when the string \*(L"at foo line 123\*(R" is appended.
1244 Suppose you are running script \*(L"canasta\*(R".
1248 die "/etc/games is no good";
1249 die "/etc/games is no good, stopped";
1251 produce, respectively
1253 /etc/games is no good at canasta line 123.
1254 /etc/games is no good, stopped at canasta line 123.
1260 Returns the value of the last command in the sequence of commands indicated
1262 When modified by a loop modifier, executes the BLOCK once before testing the
1264 (On other statements the loop modifiers test the conditional first.)
1265 .Ip "do SUBROUTINE (LIST)" 8 3
1266 Executes a SUBROUTINE declared by a
1268 declaration, and returns the value
1269 of the last expression evaluated in SUBROUTINE.
1270 If you pass arrays as part of LIST you may wish to pass the length
1271 of the array in front of each array.
1272 (See the section on subroutines later on.)
1273 SUBROUTINE may be a scalar variable, in which case the variable contains
1274 the name of the subroutine to execute.
1275 The parentheses are required to avoid confusion with the next form of \*(L"do\*(R".
1277 Uses the value of EXPR as a filename and executes the contents of the file
1279 Its primary use is to include subroutines from a perl subroutine library.
1288 except that it's more efficient, more concise, keeps track of the current
1289 filename for error messages, and searches all the -I libraries if the file
1290 isn't in the current directory (see also the @INC array in Predefined Names).
1291 It's the same, however, in that it does reparse the file every time you
1292 call it, so if you are going to use the file inside a loop you might prefer
1293 to use #include, at the expense of a little more startup time.
1294 (The main problem with #include is that cpp doesn't grok # comments--a
1295 workaround is to use \*(L";#\*(R" for standalone comments.)
1296 Note that the following are NOT equivalent:
1300 do $foo; # eval a file
1301 do $foo(); # call a subroutine
1304 .Ip "each(ASSOC_ARRAY)" 8 6
1305 Returns a 2-element array consisting of the key and value for the next
1306 value of an associative array, so that you can iterate over it.
1307 Entries are returned in an apparently random order.
1308 When the array is entirely read, a null array is returned (which when
1309 assigned produces a FALSE (0) value).
1310 The next call to each() after that will start iterating again.
1311 The iterator can be reset only by reading all the elements from the array.
1312 You must not modify the array while iterating over it.
1313 There is a single iterator for each associative array, shared by all
1314 each(), keys() and values() function calls in the program.
1315 The following prints out your environment like the printenv program, only
1316 in a different order:
1320 while (($key,$value) = each(ENV)) {
1321 print "$key=$value\en";
1325 See also keys() and values().
1326 .Ip "eof(FILEHANDLE)" 8 8
1328 Returns 1 if the next read on FILEHANDLE will return end of file, or if
1329 FILEHANDLE is not open.
1330 FILEHANDLE may be an expression whose value gives the real filehandle name.
1331 An eof without an argument returns the eof status for the last file read.
1332 Empty parentheses () may be used to indicate the pseudo file formed of the
1333 files listed on the command line, i.e. eof() is reasonable to use inside
1334 a while (<>) loop to detect the end of only the last file.
1335 Use eof(ARGV) or eof without the parens to test EACH file in a while (<>) loop.
1340 # insert dashes just before last line of last file
1343 print "--------------\en";
1349 # reset line numbering on each input file
1352 if (eof) { # Not eof().
1359 EXPR is parsed and executed as if it were a little perl program.
1360 It is executed in the context of the current perl program, so that
1361 any variable settings, subroutine or format definitions remain afterwards.
1362 The value returned is the value of the last expression evaluated, just
1363 as with subroutines.
1364 If there is a syntax error or runtime error, a null string is returned by
1365 eval, and $@ is set to the error message.
1366 If there was no error, $@ is null.
1367 If EXPR is omitted, evaluates $_.
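One way to use the $@ convention described above is to evaluate a string and test $@ right afterwards.
This is a minimal sketch, not a recommended idiom; the variable names ($good, $bad, $sum) and the strings being evaluated are invented for illustration.

```perl
# Hypothetical strings to evaluate; the names are made up for this sketch.
$good = '$sum = 2 + 3;';
eval $good;                     # runs in the current program, so $sum stays set
print "sum is $sum\n" if $@ eq '';

$bad = '$sum = 2 +;';           # deliberate syntax error
eval $bad;
print "eval failed: $@" if $@ ne '';
```

Since the second string never gets past the parser, $sum keeps the value assigned by the first eval.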
1369 If there is more than one argument in LIST,
1370 calls execvp() with the arguments in LIST.
1371 If there is only one argument, the argument is checked for shell metacharacters.
1372 If there are any, the entire argument is passed to /bin/sh -c for parsing.
1373 If there are none, the argument is split into words and passed directly to
1374 execvp(), which is more efficient.
1375 Note: exec (and system) do not flush your output buffer, so you may need to
1376 set $| to avoid lost output.
1380 exec '/bin/echo', 'Your arguments are: ', @ARGV;
1381 exec "sort $outfile | uniq";
1385 Evaluates EXPR and exits immediately with that value.
1391 exit 0 \|if \|$ans \|=~ \|/\|^[Xx]\|/\|;
1397 Returns e to the power of EXPR.
1400 Returns the child pid to the parent process and 0 to the child process.
1401 Note: unflushed buffers remain unflushed in both processes, which means
1402 you may need to set $| to avoid duplicate output.
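The two return values and the buffering note above suggest a pattern along the following lines.
It is only a sketch: it sets $| before forking so that nothing is left in the buffer to be printed twice, it assumes the fork succeeds, and the messages are invented for illustration.

```perl
$| = 1;                         # unbuffer the current output before forking
print "about to fork\n";

# This sketch assumes fork succeeds; a failed fork is not handled here.
if ($pid = fork) {
    # parent: fork returned the child's pid
    wait;                       # wait for the child to finish
    print "parent: child $pid is done\n";
}
else {
    # child: fork returned 0
    print "child: running\n";
    exit 0;
}
```

Without the assignment to $|, output sent to a file or pipe could show the first message twice, once from each process.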
1403 .Ip "gmtime(EXPR)" 8 4
1404 Converts a time as returned by the time function to a 9-element array with
1405 the time analyzed for the Greenwich timezone.
1406 Typically used as follows:
1410 ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst)
1414 All array elements are numeric, and come straight out of a struct tm.
1415 In particular this means that $mon has the range 0..11 and $wday has the