2 ''' $RCSfile: perl.man,v $$Revision: 4.0.1.6 $$Date: 92/06/08 15:07:29 $
5 ''' Revision 4.0.1.6 92/06/08 15:07:29 lwall
6 ''' patch20: documented that numbers may contain underline
7 ''' patch20: clarified that DATA may only be read from main script
8 ''' patch20: relaxed requirement for semicolon at the end of a block
9 ''' patch20: added ... as variant on ..
10 ''' patch20: documented need for 1; at the end of a required file
11 ''' patch20: extended bracket-style quotes to two-arg operators: s()() and tr()()
12 ''' patch20: paragraph mode now skips extra newlines automatically
13 ''' patch20: documented PERLLIB and PERLDB
14 ''' patch20: documented limit on size of regexp
16 ''' Revision 4.0.1.5 91/11/11 16:42:00 lwall
17 ''' patch19: added little-endian pack/unpack options
19 ''' Revision 4.0.1.4 91/11/05 18:11:05 lwall
20 ''' patch11: added sort {} LIST
21 ''' patch11: added eval {}
22 ''' patch11: documented meaning of scalar(%foo)
23 ''' patch11: sprintf() now supports any length of s field
25 ''' Revision 4.0.1.3 91/06/10 01:26:02 lwall
26 ''' patch10: documented some newer features in addenda
28 ''' Revision 4.0.1.2 91/06/07 11:41:23 lwall
29 ''' patch4: added global modifier for pattern matches
30 ''' patch4: default top-of-form format is now FILEHANDLE_TOP
31 ''' patch4: added $^P variable to control calling of perldb routines
32 ''' patch4: added $^F variable to specify maximum system fd, default 2
33 ''' patch4: changed old $^P to $^X
35 ''' Revision 4.0.1.1 91/04/11 17:50:44 lwall
36 ''' patch1: fixed some typos
38 ''' Revision 4.0 91/03/20 01:38:08 lwall
55 .ie \\n(.$>=3 .ne \\$3
60 ''' Set up \*(-- to give an unbreakable dash;
61 ''' string Tr holds user defined translation string.
62 ''' Bell System Logo is used as a dummy character.
67 .if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
68 .if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
85 perl \- Practical Extraction and Report Language
\fBperl\fR [options] filename args
\fIPerl\fR is an interpreted language optimized for scanning arbitrary text files,
extracting information from those text files, and printing reports based
on that information.
It's also a good language for many system management tasks.
The language is intended to be practical (easy to use, efficient, complete)
rather than beautiful (tiny, elegant, minimal).
It combines (in the author's opinion, anyway) some of the best features of C,
\fIsed\fR, \fIawk\fR, and \fIsh\fR,
so people familiar with those languages should have little difficulty with it.
(Language historians will also note some vestiges of \fIcsh\fR, Pascal, and
even BASIC-PLUS.)
Expression syntax corresponds quite closely to C expression syntax.
Unlike most Unix utilities,
\fIperl\fR does not arbitrarily limit the size of your data\*(--if you've got
the memory, \fIperl\fR can slurp in your whole file as a single string.
Recursion is of unlimited depth.
And the hash tables used by associative arrays grow as necessary to prevent
degraded performance.
\fIPerl\fR uses sophisticated pattern matching techniques to scan large amounts of
data very quickly.
Although optimized for scanning text,
\fIperl\fR can also deal with binary data, and can make dbm files look like associative
arrays (where dbm is available).
Setuid \fIperl\fR scripts are safer than C programs
through a dataflow tracing mechanism which prevents many stupid security holes.
If you have a problem that would ordinarily use \fIsed\fR
or \fIawk\fR or \fIsh\fR, but it
exceeds their capabilities or must run a little faster,
and you don't want to write the silly thing in C, then
\fIperl\fR may be for you.
There are also translators to turn your
\fIsed\fR and \fIawk\fR scripts into \fIperl\fR scripts.
Upon startup, \fIperl\fR looks for your script in one of the following places:
.br
1. Specified line by line via \fB\-e\fR switches on the command line.
.br
2. Contained in the file specified by the first filename on the command line.
(Note that systems supporting the #! notation invoke interpreters this way.)
.br
3. Passed in implicitly via standard input.
This only works if there are no filename arguments\*(--to pass
arguments to a \fIstdin\fR script you must explicitly specify a \- for the script name.
.br
After locating your script, \fIperl\fR compiles it to an internal form.
If the script is syntactically correct, it is executed.
160 Note: on first reading this section may not make much sense to you. It's here
161 at the front for easy reference.
163 A single-character option may be combined with the following option, if any.
164 This is particularly useful when invoking a script using the #! construct which
165 only allows one argument. Example:
169 #!/usr/bin/perl \-spi.bak # same as \-s \-p \-i.bak
.TP 5
.BI \-0 digits
specifies the record separator ($/) as an octal number.
If there are no digits, the null character is the separator.
Other switches may precede or follow the digits.
For example, if you have a version of \fIfind\fR
which can print filenames terminated by the null character, you can say this:
184 find . \-name '*.bak' \-print0 | perl \-n0e unlink
187 The special value 00 will cause Perl to slurp files in paragraph mode.
188 The value 0777 will cause Perl to slurp files whole since there is no
189 legal character with that value.
.TP 5
.B \-a
turns on autosplit mode when used with a \fB\-n\fR or \fB\-p\fR.
An implicit split command to the @F array
is done as the first thing inside the implicit while loop produced by
the \fB\-n\fR or \fB\-p\fR.
204 perl \-ane \'print pop(@F), "\en";\'
210 print pop(@F), "\en";
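The implicit split can be pictured with the following sketch, which applies the same step to an in-memory list of lines rather than to the <> loop. The subroutine name last_fields and the sample lines are invented for illustration.

```perl
# Hypothetical helper: return the last whitespace-separated field of each
# line, the way "perl -ane 'print pop(@F)'" would see it.
sub last_fields {
    local(@lines) = @_;
    local(@F, @last);
    foreach $line (@lines) {
        @F = split(' ', $line);     # the implicit split done by -a
        push(@last, pop(@F));
    }
    @last;
}

@out = &last_fields("alpha beta gamma\n", "  one two\n");
print join(",", @out), "\n";        # gamma,two
```

Splitting on a single space skips leading whitespace and treats the trailing newline as a separator, so no chop is needed before the split.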
.TP 5
.B \-c
causes \fIperl\fR to check the syntax of the script and then exit without executing it.
.TP 5
.B \-d
runs the script under the perl debugger.
222 See the section on Debugging.
.TP 5
.BI \-D number
sets debugging flags.
To watch how it executes your script, use \fB\-D14\fR.
(This only works if debugging is compiled into your \fIperl\fR.)
230 Another nice value is \-D1024, which lists your compiled syntax tree.
231 And \-D512 displays compiled regular expressions.
233 .BI \-e " commandline"
234 may be used to enter one line of script.
Multiple \fB\-e\fR commands may be given to build up a multi-line script.
If \fB\-e\fR is given, \fIperl\fR will not look for a script filename in the argument list.
.TP 5
.BI \-i extension
specifies that files processed by the <> construct are to be edited
in-place.
247 It does this by renaming the input file, opening the output file by the
248 same name, and selecting that output file as the default for print statements.
249 The extension, if supplied, is added to the name of the
250 old file to make a backup copy.
251 If no extension is supplied, no backup is made.
252 Saying \*(L"perl \-p \-i.bak \-e "s/foo/bar/;" .\|.\|. \*(R" is the same as using
257 #!/usr/bin/perl \-pi.bak
260 which is equivalent to
265 if ($ARGV ne $oldargv) {
266 rename($ARGV, $ARGV . \'.bak\');
267 open(ARGVOUT, ">$ARGV");
274 print; # this prints to original filename
except that the \fB\-i\fR form doesn't need to compare $ARGV to $oldargv to know when
the filename has changed.
It does, however, use ARGVOUT for the selected filehandle.
Note that \fISTDOUT\fR is restored as the default output filehandle after the loop.
288 You can use eof to locate the end of each input file, in case you want
289 to append to each file, or reset line numbering (see example under eof).
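As a rough sketch of the same in-place edit done from inside a script: the code below relies on the $^I variable (the in-place extension, not described in this section) and on a scratch file under /tmp. The subroutine name edit_in_place and the file name are made up for the example.

```perl
# Sketch: do from inside a script what
#   perl -p -i.bak -e 's/foo/bar/;' file
# does from the command line. edit_in_place is a hypothetical name.
sub edit_in_place {
    local($path) = @_;
    local($^I) = '.bak';        # same role as -i.bak
    local(@ARGV) = ($path);     # the files for <> to process
    while (<>) {
        s/foo/bar/;
        print;                  # goes to the rewritten file, not the terminal
    }
}

$demo = "/tmp/inplace_demo.$$";
open(OUT, ">$demo") || die "Can't create $demo: $!";
print OUT "foo one\nfoo two\n";
close(OUT);

&edit_in_place($demo);

open(IN, $demo) || die "Can't read $demo: $!";
$result = join('', <IN>);
close(IN);
$backup_made = (-e "$demo.bak") ? 1 : 0;
unlink($demo, "$demo.bak");
print $result;                  # the two lines now start with "bar"
```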
.TP 5
.BI \-I directory
may be used in conjunction with \fB\-P\fR
to tell the C preprocessor where to look for include files.
295 By default /usr/include and /usr/lib/perl are searched.
.TP 5
.BI \-l octnum
enables automatic line-ending processing. It has two effects:
first, it automatically chops the line terminator when used with
\fB\-n\fR or \fB\-p\fR,
and second, it assigns $\e to have the value of \fIoctnum\fR
so that any print statements will have that line terminator added back on. If
\fIoctnum\fR is omitted, sets $\e to the current value of $/.
308 For instance, to trim lines to 80 columns:
311 perl -lpe \'substr($_, 80) = ""\'
Note that the assignment $\e = $/ is done when the switch is processed,
so the input record separator can be different than the output record
separator if the \fB\-l\fR switch is followed by a \fB\-0\fR switch:
323 gnufind / -print0 | perl -ln0e 'print "found $_" if -p'
326 This sets $\e to newline and then sets $/ to the null character.
.TP 5
.B \-n
causes \fIperl\fR to assume the following loop around your script, which makes it iterate
over filename arguments somewhat like \*(L"sed \-n\*(R" or \fIawk\fR:
.nf
	while (<>) {
		.\|.\|.		# your script goes here
	}
.fi
Note that the lines are not printed by default.
See \fB\-p\fR to have lines printed.
345 Here is an efficient way to delete all files older than a week:
348 find . \-mtime +7 \-print | perl \-nle \'unlink;\'
351 This is faster than using the \-exec switch of find because you don't have to
352 start a process on every filename found.
.TP 5
.B \-p
causes \fIperl\fR to assume the following loop around your script, which makes it iterate
over filename arguments somewhat like \fIsed\fR:
.nf
	while (<>) {
		.\|.\|.		# your script goes here
	} continue {
		print;
	}
.fi
Note that the lines are printed automatically.
To suppress printing use the \fB\-n\fR switch.
.TP 5
.B \-P
causes your script to be run through the C preprocessor before
compilation by \fIperl\fR.
383 (Since both comments and cpp directives begin with the # character,
384 you should avoid starting comments with any words recognized
385 by the C preprocessor such as \*(L"if\*(R", \*(L"else\*(R" or \*(L"define\*(R".)
.TP 5
.B \-s
enables some rudimentary switch parsing for switches on the command line
after the script name but before any filename arguments (or before a \-\|\-).
Any switch found there is removed from @ARGV and sets the corresponding variable in the
\fIperl\fR script.
393 The following script prints \*(L"true\*(R" if and only if the script is
394 invoked with a \-xyz switch.
399 if ($xyz) { print "true\en"; }
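To make the rule concrete, here is a minimal sketch of that kind of switch parsing done by hand on an ordinary list. It is not how the interpreter implements the switch: the subroutine parse_switches and the %Switch and @Rest variables are hypothetical, and a hash stands in for the per-switch variables that the real mechanism sets.

```perl
# Hypothetical helper: leading switches are removed from the argument list
# and recorded in %Switch; whatever follows is left in @Rest.
sub parse_switches {
    local(@args) = @_;
    %Switch = ();
    while (@args && $args[0] =~ /^-(.+)/) {
        $name = $1;
        shift(@args);
        last if $name eq '-';       # a "--" ends switch processing
        $Switch{$name} = 1;         # -xyz makes the xyz switch true
    }
    @Rest = @args;
}

&parse_switches('-xyz', '-v', '--', '-notaswitch', 'file.txt');
print "true\n" if $Switch{'xyz'};   # prints "true"
print join(' ', @Rest), "\n";       # -notaswitch file.txt
```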
.TP 5
.B \-S
makes \fIperl\fR use the PATH environment variable to search for the script
407 (unless the name of the script starts with a slash).
408 Typically this is used to emulate #! startup on machines that don't
409 support #!, in the following manner:
413 eval "exec /usr/bin/perl \-S $0 $*"
414 if $running_under_some_shell;
The system ignores the first line and feeds the script to /bin/sh,
which proceeds to try to execute the \fIperl\fR script as a shell script.
The shell executes the second line as a normal shell command, and thus
starts up the \fIperl\fR interpreter.
On some systems $0 doesn't always contain the full pathname,
so the \fB\-S\fR tells \fIperl\fR to search for the script if necessary.
After \fIperl\fR locates the script, it parses the lines and ignores them because
434 the variable $running_under_some_shell is never true.
435 A better construct than $* would be ${1+"$@"}, which handles embedded spaces
and such in the filenames, but doesn't work if the script is being interpreted
by \fIcsh\fR.
438 In order to start up sh rather than csh, some systems may have to replace the
439 #! line with a line containing just
440 a colon, which will be politely ignored by perl.
441 Other systems can't control that, and need a totally devious construct that
442 will work under any of csh, sh or perl, such as the following:
446 eval '(exit $?0)' && eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
447 & eval 'exec /usr/bin/perl -S $0 $argv:q'
.TP 5
.B \-u
causes \fIperl\fR to dump core after compiling your script.
456 You can then take this core dump and turn it into an executable file
457 by using the undump program (not supplied).
458 This speeds startup at the expense of some disk space (which you can
459 minimize by stripping the executable).
460 (Still, a "hello world" executable comes out to about 200K on my machine.)
461 If you are going to run your executable as a set-id program then you
462 should probably compile it using taintperl rather than normal perl.
463 If you want to execute a portion of your script before dumping, use the
464 dump operator instead.
465 Note: availability of undump is platform specific and may not be available
466 for a specific port of perl.
.TP 5
.B \-U
allows \fIperl\fR to do unsafe operations.
472 Currently the only \*(L"unsafe\*(R" operations are the unlinking of directories while
473 running as superuser, and running setuid programs with fatal taint checks
474 turned into warnings.
.TP 5
.B \-v
prints the version and patchlevel of your \fIperl\fR executable.
.TP 5
.B \-w
prints warnings about identifiers that are mentioned only once, and scalar
variables that are used before being set.
Also warns about redefined subroutines, and references to undefined
filehandles or filehandles opened readonly that you are attempting to
write on.
487 Also warns you if you use == on values that don't look like numbers, and if
488 your subroutines recurse more than 100 deep.
.TP 5
.BI \-x directory
tells \fIperl\fR that the script is embedded in a message.
494 Leading garbage will be discarded until the first line that starts
495 with #! and contains the string "perl".
496 Any meaningful switches on that line will be applied (but only one
497 group of switches, as with normal #! processing).
498 If a directory name is specified, Perl will switch to that directory
499 before running the script.
The \fB\-x\fR switch only controls the disposal of leading garbage.
503 The script must be terminated with _\|_END_\|_ if there is trailing garbage
504 to be ignored (the script can process any or all of the trailing garbage
505 via the DATA filehandle if desired).
506 .Sh "Data Types and Objects"
\fIPerl\fR has three data types: scalars, arrays of scalars, and
510 associative arrays of scalars.
511 Normal arrays are indexed by number, and associative arrays by string.
513 The interpretation of operations and values in perl sometimes
514 depends on the requirements
515 of the context around the operation or value.
516 There are three major contexts: string, numeric and array.
517 Certain operations return array values
518 in contexts wanting an array, and scalar values otherwise.
(If this is true of an operation it will be mentioned in the documentation
for that operation.)
521 Operations which return scalars don't care whether the context is looking
522 for a string or a number, but
523 scalar variables and values are interpreted as strings or numbers
524 as appropriate to the context.
A scalar is interpreted as TRUE in the boolean sense if it is not the null
string or 0.
Booleans returned by operators are 1 for true and 0 or \'\' (the null
string) for false.
530 There are actually two varieties of null string: defined and undefined.
531 Undefined null strings are returned when there is no real value for something,
532 such as when there was an error, or at end of file, or when you refer
533 to an uninitialized variable or element of an array.
534 An undefined null string may become defined the first time you access it, but
535 prior to that you can use the defined() operator to determine whether the
536 value is defined or not.
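A short sketch of the distinction, using made-up variable names: a defined null string and an undefined value compare equal as strings, and only defined() tells them apart.

```perl
$empty = '';                 # a defined null string
undef($never);               # make sure $never holds no value at all

print "empty is defined\n"   if defined($empty);
print "never is undefined\n" unless defined($never);

# In a string comparison both look like the null string.
$same = ($empty eq '' && $never eq '') ? 1 : 0;
print "both compare equal to ''\n" if $same;

%age = ('tom', 30);
$missing = $age{'dick'};     # no such key, so no real value
$status = defined($missing) ? 'defined' : 'undefined';
print "age of dick is $status\n";
```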
538 References to scalar variables always begin with \*(L'$\*(R', even when referring
539 to a scalar that is part of an array.
544 $days \h'|2i'# a simple scalar variable
545 $days[28] \h'|2i'# 29th element of array @days
546 $days{\'Feb\'}\h'|2i'# one value from an associative array
547 $#days \h'|2i'# last index of array @days
549 but entire arrays or array slices are denoted by \*(L'@\*(R':
551 @days \h'|2i'# ($days[0], $days[1],\|.\|.\|. $days[n])
552 @days[3,4,5]\h'|2i'# same as @days[3.\|.5]
553 @days{'a','c'}\h'|2i'# same as ($days{'a'},$days{'c'})
555 and entire associative arrays are denoted by \*(L'%\*(R':
557 %days \h'|2i'# (key1, val1, key2, val2 .\|.\|.)
560 Any of these eight constructs may serve as an lvalue,
561 that is, may be assigned to.
562 (It also turns out that an assignment is itself an lvalue in
563 certain contexts\*(--see examples under s, tr and chop.)
564 Assignment to a scalar evaluates the righthand side in a scalar context,
while assignment to an array or array slice evaluates the righthand side
in an array context.
You may find the length of array @days by evaluating
\*(L"$#days\*(R", as in \fIcsh\fR.
571 (Actually, it's not the length of the array, it's the subscript of the last element, since there is (ordinarily) a 0th element.)
572 Assigning to $#days changes the length of the array.
573 Shortening an array by this method does not actually destroy any values.
574 Lengthening an array that was previously shortened recovers the values that
575 were in those elements.
You can also gain some measure of efficiency by preextending an array that
is going to get big.
(You can also extend an array by assigning to an element that is off the
end of the array.
This differs from assigning to $#whatever in that intervening values
are set to null rather than recovered.)
You can truncate an array down to nothing by assigning the null list () to
it.
The following are exactly equivalent:
.nf
	@whatever = ();
	$#whatever = $[ \- 1;
.fi
If you evaluate an array in a scalar context, it returns the length of
the array.
594 The following is always true:
597 scalar(@whatever) == $#whatever \- $[ + 1;
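The relationship between the last subscript and the length can be checked with a few lines. This sketch assumes the default array base, that is, $[ is 0; the array names are arbitrary.

```perl
# Assumes the default array base ($[ is 0).
@whatever = ('a', 'b', 'c');
$last  = $#whatever;            # 2: subscript of the last element
$count = @whatever;             # 3: the array evaluated in a scalar context
print "consistent\n" if scalar(@whatever) == $#whatever + 1;

$#big = 99;                     # preextend @big to 100 elements
$big_count = @big;              # 100

$#whatever = -1;                # same effect as @whatever = ();
$after = @whatever;             # 0
```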
600 If you evaluate an associative array in a scalar context, it returns
601 a value which is true if and only if the array contains any elements.
602 (If there are any elements, the value returned is a string consisting
of the number of used buckets and the number of allocated buckets, separated
by a slash.)
606 Multi-dimensional arrays are not directly supported, but see the discussion
607 of the $; variable later for a means of emulating multiple subscripts with
608 an associative array.
You could also write a subroutine to turn multiple subscripts into a single
subscript.
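As an illustration of the emulation mentioned above, the following sketch stores values under two subscripts and then recovers the subscripts from the keys. The %grid array and its contents are invented for the example.

```perl
%grid = ();
$grid{1,2} = 'x';                       # really $grid{join($;, 1, 2)}
$grid{3,4} = 'y';

$same = $grid{join($;, 1, 2)};          # 'x'

# Recover the two subscripts from each key by splitting on $;.
foreach $key (sort(keys(%grid))) {
    ($row, $col) = split($;, $key);
    print "($row,$col) = $grid{$key}\n";
}
```

Because the subscripts are simply joined into one string, a subscript that itself contains the $; character would make the keys ambiguous.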
612 Every data type has its own namespace.
613 You can, without fear of conflict, use the same name for a scalar variable,
an array, an associative array, a filehandle, a subroutine name, and/or
a label.
616 Since variable and array references always start with \*(L'$\*(R', \*(L'@\*(R',
617 or \*(L'%\*(R', the \*(L"reserved\*(R" words aren't in fact reserved
618 with respect to variable names.
619 (They ARE reserved with respect to labels and filehandles, however, which
620 don't have an initial special character.
621 Hint: you could say open(LOG,\'logfile\') rather than open(log,\'logfile\').
622 Using uppercase filehandles also improves readability and protects you
623 from conflict with future reserved words.)
Case IS significant\*(--\*(L"FOO\*(R", \*(L"Foo\*(R" and \*(L"foo\*(R" are all
different names.
626 Names which start with a letter may also contain digits and underscores.
627 Names which do not start with a letter are limited to one character,
628 e.g. \*(L"$%\*(R" or \*(L"$$\*(R".
(Most of the one character names have a predefined significance to
\fIperl\fR.)
Numeric literals are specified in any of the usual floating point or
integer formats.
646 String literals are delimited by either single or double quotes.
647 They work much like shell quotes:
648 double-quoted string literals are subject to backslash and variable
649 substitution; single-quoted strings are not (except for \e\' and \e\e).
650 The usual backslash rules apply for making characters such as newline, tab,
651 etc., as well as some more exotic forms:
664 \el lowercase next char
665 \eu uppercase next char
666 \eL lowercase till \eE
667 \eU uppercase till \eE
668 \eE end case modification
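The case-modifying escapes can be tried out with a short sketch; the sample string is arbitrary.

```perl
$name = 'lARRY wALL';

$lower = "\L$name\E";           # larry wall
$upper = "\U$name\E!";          # LARRY WALL!  (the ! is outside the \U span)
$first = "\u$lower";            # Larry wall
$uncap = "\lPERL";              # pERL

print "$lower\n$upper\n$first\n$uncap\n";
```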
671 You can also embed newlines directly in your strings, i.e. they can end on
672 a different line than they begin.
This is nice, but if you forget your trailing quote, the error will not be
reported until \fIperl\fR finds another line containing the quote character, which
677 may be much further on in the script.
678 Variable substitution inside strings is limited to scalar variables, normal
679 array values, and array slices.
680 (In other words, identifiers beginning with $ or @, followed by an optional
681 bracketed expression as a subscript.)
682 The following code segment prints out \*(L"The price is $100.\*(R"
686 $Price = \'$100\';\h'|3.5i'# not interpreted
687 print "The price is $Price.\e\|n";\h'|3.5i'# interpreted
690 Note that you can put curly brackets around the identifier to delimit it
691 from following alphanumerics.
692 Also note that a single quoted string must be separated from a preceding
693 word by a space, since single quote is a valid character in an identifier
696 Two special literals are _\|_LINE_\|_ and _\|_FILE_\|_, which represent the current
697 line number and filename at that point in your program.
They may only be used as separate tokens; they will not be interpolated
into strings.
700 In addition, the token _\|_END_\|_ may be used to indicate the logical end of the
701 script before the actual end of file.
702 Any following text is ignored, but may be read via the DATA filehandle.
703 (The DATA filehandle may read data only from the main script, but not from
704 any required file or evaluated string.)
705 The two control characters ^D and ^Z are synonyms for _\|_END_\|_.
707 A word that doesn't have any other interpretation in the grammar will be
708 treated as if it had single quotes around it.
709 For this purpose, a word consists only of alphanumeric characters and underline,
710 and must start with an alphabetic character.
711 As with filehandles and labels, a bare word that consists entirely of
lowercase letters risks conflict with future reserved words, and if you
use the \fB\-w\fR switch, Perl will warn you about any such words.
717 Array values are interpolated into double-quoted strings by joining all the
elements of the array with the delimiter specified in the $" variable,
which is a space by default.
720 (Since in versions of perl prior to 3.0 the @ character was not a metacharacter
721 in double-quoted strings, the interpolation of @array, $array[EXPR],
722 @array[LIST], $array{EXPR}, or @array{LIST} only happens if array is
723 referenced elsewhere in the program or is predefined.)
724 The following are equivalent:
728 $temp = join($",@ARGV);
734 Within search patterns (which also undergo double-quotish substitution)
735 there is a bad ambiguity: Is /$foo[bar]/ to be
736 interpreted as /${foo}[bar]/ (where [bar] is a character class for the
regular expression) or as /${foo[bar]}/ (where [bar] is the subscript to
array @foo)?
739 If @foo doesn't otherwise exist, then it's obviously a character class.
740 If @foo exists, perl takes a good guess about [bar], and is almost always right.
741 If it does guess wrong, or if you're just plain paranoid,
742 you can force the correct interpretation with curly brackets as above.
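A small sketch of forcing each interpretation with curly brackets. The variable contents are invented, and a numeric subscript is used so that the array reading is easy to follow.

```perl
$foo = 'x';
@foo = ('first', 'second');

$class = ('xa'     =~ /${foo}[bar]/) ? 1 : 0;   # $foo, then one of b, a, r
$miss  = ('xz'     =~ /${foo}[bar]/) ? 1 : 0;   # z is not in the class
$elem  = ('second' =~ /${foo[1]}/)   ? 1 : 0;   # element 1 of @foo

print "$class $miss $elem\n";                   # 1 0 1
```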
744 A line-oriented form of quoting is based on the shell here-is syntax.
745 Following a << you specify a string to terminate the quoted material, and all lines
746 following the current line down to the terminating string are the value
748 The terminating string may be either an identifier (a word), or some
750 If quoted, the type of quotes you use determines the treatment of the text,
751 just as in regular quoting.
752 An unquoted identifier works like double quotes.
753 There must be no space between the << and the identifier.
754 (If you put a space it will be treated as a null identifier, which is
755 valid, and matches the first blank line\*(--see Merry Christmas example below.)
756 The terminating string must appear by itself (unquoted and with no surrounding
757 whitespace) on the terminating line.
760 print <<EOF; # same as above
764 print <<"EOF"; # same as above
768 print << x 10; # null identifier is delimiter
771 print <<`EOC`; # execute commands
776 print <<foo, <<bar; # you can stack them
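For reference, here is a self-contained sketch of three of these forms with their bodies and terminators written out. The results are captured in variables rather than printed so they can be inspected; the variable names are made up.

```perl
$Price = '$100';

$interpolated = <<EOF;          # unquoted identifier: works like double quotes
The price is $Price.
EOF

$literal = <<'EOF';             # single quotes: no substitution
The price is $Price.
EOF

$stacked = join('', <<foo, <<bar);      # two stacked here-is items
I said foo.
foo
I said bar.
bar

print $interpolated, $literal, $stacked;
```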
783 Array literals are denoted by separating individual values by commas, and
784 enclosing the list in parentheses:
790 In a context not requiring an array value, the value of the array literal
791 is the value of the final element, as in the C comma operator.
796 @foo = (\'cc\', \'\-E\', $bar);
798 assigns the entire array value to array foo, but
800 $foo = (\'cc\', \'\-E\', $bar);
803 assigns the value of variable bar to variable foo.
804 Note that the value of an actual array in a scalar context is the length
805 of the array; the following assigns to $foo the value 3:
809 @foo = (\'cc\', \'\-E\', $bar);
810 $foo = @foo; # $foo gets 3
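The three cases can be put side by side in one short sketch; the value given to $bar is arbitrary.

```perl
$bar = 'prog.c';

@foo = ('cc', '-E', $bar);      # array assignment: all three values
$foo = ('cc', '-E', $bar);      # scalar assignment: the final element only
$len = @foo;                    # an actual array in a scalar context: its length

print "$foo $len\n";            # prog.c 3
```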
813 You may have an optional comma before the closing parenthesis of an
814 array literal, so that you can say:
824 When a LIST is evaluated, each element of the list is evaluated in
825 an array context, and the resulting array value is interpolated into LIST
826 just as if each individual element were a member of LIST. Thus arrays
827 lose their identity in a LIST\*(--the list
831 contains all the elements of @foo followed by all the elements of @bar,
832 followed by all the elements returned by the subroutine named SomeSub.
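A minimal sketch of this flattening, with a stand-in SomeSub that returns two values:

```perl
sub SomeSub { ('x', 'y'); }     # returns a two-element array value

@foo = (1, 2);
@bar = (3);
@all = (@foo, @bar, &SomeSub);  # one flat list of five scalars

print scalar(@all), ": @all\n"; # 5: 1 2 3 x y
```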
834 A list value may also be subscripted like a normal array.
838 $time = (stat($file))[8]; # stat returns array value
839 $digit = ('a','b','c','d','e','f')[$digit-10];
840 return (pop(@foo),pop(@foo))[0];
844 Array lists may be assigned to if and only if each element of the list
848 ($a, $b, $c) = (1, 2, 3);
850 ($map{\'red\'}, $map{\'blue\'}, $map{\'green\'}) = (0x00f, 0x0f0, 0xf00);
852 The final element may be an array or an associative array:
854 ($a, $b, @rest) = split;
855 local($a, $b, %rest) = @_;
858 You can actually put an array anywhere in the list, but the first array
859 in the list will soak up all the values, and anything after it will get
861 This may be useful in a local().
863 An associative array literal contains pairs of values to be interpreted
864 as a key and a value:
868 # same as map assignment above
869 %map = ('red',0x00f,'blue',0x0f0,'green',0xf00);
872 Array assignment in a scalar context returns the number of elements
873 produced by the expression on the right side of the assignment:
876 $x = (($foo,$bar) = (3,2,1)); # set $x to 3, not 2
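The assignment rules above can be seen together in a short sketch; all variable names are arbitrary.

```perl
($first, $second, @rest) = (1, 2, 3, 4, 5);  # @rest gets (3, 4, 5)

($head, @soak, $tail) = (1, 2, 3);            # @soak takes (2, 3)
$tail_state = defined($tail) ? 'set' : 'empty';

$x = (($foo, $bar) = (3, 2, 1));              # $x is 3; the 1 is discarded

print "@rest | @soak | $tail_state | $x\n";   # 3 4 5 | 2 3 | empty | 3
```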
880 There are several other pseudo-literals that you should know about.
881 If a string is enclosed by backticks (grave accents), it first undergoes
882 variable substitution just like a double quoted string.
883 It is then interpreted as a command, and the output of that command
884 is the value of the pseudo-literal, like in a shell.
In a scalar context, a single string consisting of all the output is
returned.
In an array context, an array of values is returned, one for each line
of output.
889 (You can set $/ to use a different line terminator.)
890 The command is executed each time the pseudo-literal is evaluated.
891 The status value of the command is returned in $? (see Predefined Names
892 for the interpretation of $?).
893 Unlike in \f2csh\f1, no translation is done on the return
894 data\*(--newlines remain newlines.
895 Unlike in any of the shells, single quotes do not hide variable names
896 in the command from interpretation.
897 To pass a $ through to the shell you need to hide it with a backslash.
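A brief sketch of both contexts, assuming a Unix-like system where the shell provides an echo command:

```perl
# Assumes a Unix-like system with echo available.
$who    = 'world';
$one    = `echo hello $who`;            # scalar context: "hello world\n"
@lines  = `echo first; echo second`;    # array context: one element per line
$status = $?;                           # 0 when the last command succeeded

print $one;
print scalar(@lines), " lines, status $status\n";
```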
899 Evaluating a filehandle in angle brackets yields the next line
900 from that file (newline included, so it's never false until EOF, at
901 which time an undefined value is returned).
902 Ordinarily you must assign that value to a variable,
903 but there is one situation where an automatic assignment happens.
If (and only if) the input symbol is the only thing inside the conditional of a
\fIwhile\fR loop, the value is
automatically assigned to the variable \*(L"$_\*(R".
(This may seem like an odd thing to you, but you'll use the construct
in almost every \fIperl\fR script you write.)
912 Anyway, the following lines are equivalent to each other:
916 while ($_ = <STDIN>) { print; }
917 while (<STDIN>) { print; }
918 for (\|;\|<STDIN>;\|) { print; }
919 print while $_ = <STDIN>;
934 will also work except in packages, where they would be interpreted as
935 local identifiers rather than global.)
Additional filehandles may be created with the
\fIopen\fR function.
940 If a <FILEHANDLE> is used in a context that is looking for an array, an array
941 consisting of all the input lines is returned, one line per array element.
942 It's easy to make a LARGE data space this way, so use with care.
944 The null filehandle <> is special and can be used to emulate the behavior of
945 \fIsed\fR and \fIawk\fR.
Input from <> comes either from standard input, or from each file listed on
the command line.
Here's how it works: the first time <> is evaluated, the ARGV array is checked,
and if it is null, $ARGV[0] is set to \'-\', which when opened gives you standard
input.
951 The ARGV array is then processed as a list of filenames.
957 .\|.\|. # code for each line
961 is equivalent to the following Perl-like pseudo code:
963 unshift(@ARGV, \'\-\') \|if \|$#ARGV < $[;
964 while ($ARGV = shift) {
967 .\|.\|. # code for each line
972 except that it isn't as cumbersome to say, and will actually work.
It really does shift array ARGV and put the current filename into
variable ARGV.
975 It also uses filehandle ARGV internally\*(--<> is just a synonym for
976 <ARGV>, which is magical.
977 (The pseudo code above doesn't work because it treats <ARGV> as non-magical.)
979 You can modify @ARGV before the first <> as long as the array ends up
980 containing the list of filenames you really want.
981 Line numbers ($.) continue as if the input was one big happy file.
982 (But see example under eof for how to reset line numbers on each file.)
985 If you want to set @ARGV to your own list of files, go right ahead.
986 If you want to pass switches into your script, you can
987 put a loop on the front like this:
991 while ($_ = $ARGV[0], /\|^\-/\|) {
993 last if /\|^\-\|\-$\|/\|;
994 /\|^\-D\|(.*\|)/ \|&& \|($debug = $1);
995 /\|^\-v\|/ \|&& \|$verbose++;
996 .\|.\|. # other switches
999 .\|.\|. # code for each line
1003 The <> symbol will return FALSE only once.
1004 If you call it again after this it will assume you are processing another
@ARGV list, and if you haven't set @ARGV, will input from
STDIN.
If the string inside the angle brackets is a reference to a scalar variable
(e.g. <$foo>),
1010 then that variable contains the name of the filehandle to input from.
1012 If the string inside angle brackets is not a filehandle, it is interpreted
1013 as a filename pattern to be globbed, and either an array of filenames or the
1014 next filename in the list is returned, depending on context.
1015 One level of $ interpretation is done first, but you can't say <$foo>
because that's an indirect filehandle as explained in the previous
paragraph.
1018 You could insert curly brackets to force interpretation as a
1019 filename glob: <${foo}>.
1031 open(foo, "echo *.c | tr \-s \' \et\er\ef\' \'\e\e012\e\e012\e\e012\e\e012\'|");
1038 In fact, it's currently implemented that way.
1039 (Which means it will not work on filenames with spaces in them unless
1040 you have /bin/csh on your machine.)
1041 Of course, the shortest way to do the above is:
A \fIperl\fR script consists of a sequence of declarations and commands.
The only things that need to be declared in \fIperl\fR
are report formats and subroutines.
1055 See the sections below for more information on those declarations.
1056 All uninitialized user-created objects are assumed to
1057 start with a null or 0 value until they
1058 are defined by some explicit operation such as assignment.
The sequence of commands is executed just once, unlike in
\fIsed\fR and \fIawk\fR
scripts, where the sequence of commands is executed for each input line.
While this means that you must explicitly loop over the lines of your input file
(or files), it also means you have much more control over which files and which
lines you look at.
(Actually, I'm lying\*(--it is possible to do an implicit loop with either the
\fB\-n\fR or \fB\-p\fR switch.)
A declaration can be put anywhere a command can, but has no effect on the
execution of the primary sequence of commands\*(--declarations all take effect
at compile time.
Typically all the declarations are put at the beginning or the end of the script.
\fIPerl\fR is, for the most part, a free-form language.
1080 (The only exception to this is format declarations, for fairly obvious reasons.)
1081 Comments are indicated by the # character, and extend to the end of the line.
1082 If you attempt to use /* */ C comments, it will be interpreted either as
1083 division or pattern matching, depending on the context.
1085 .Sh "Compound statements"
In \fIperl\fR, a sequence of commands may be treated as one command by enclosing it
in curly brackets.
We will call this a BLOCK.
1092 The following compound commands may be used to control flow:
if (EXPR) BLOCK
if (EXPR) BLOCK else BLOCK
1098 if (EXPR) BLOCK elsif (EXPR) BLOCK .\|.\|. else BLOCK
1099 LABEL while (EXPR) BLOCK
1100 LABEL while (EXPR) BLOCK continue BLOCK
1101 LABEL for (EXPR; EXPR; EXPR) BLOCK
1102 LABEL foreach VAR (ARRAY) BLOCK
1103 LABEL BLOCK continue BLOCK
Note that, unlike C and Pascal, these are defined in terms of BLOCKs, not
statements.
1108 This means that the curly brackets are \fIrequired\fR\*(--no dangling statements allowed.
1109 If you want to write conditionals without curly brackets there are several
1110 other ways to do it.
1111 The following all do the same thing:
1115 if (!open(foo)) { die "Can't open $foo: $!"; }
1116 die "Can't open $foo: $!" unless open(foo);
1117 open(foo) || die "Can't open $foo: $!"; # foo or bust!
1118 open(foo) ? \'hi mom\' : die "Can't open $foo: $!";
1119 # a bit exotic, that last one
1125 statement is straightforward.
1126 Since BLOCKs are always bounded by curly brackets, there is never any
1127 ambiguity about which
1136 the sense of the test is reversed.
1140 statement executes the block as long as the expression is true
1141 (does not evaluate to the null string or 0).
1142 The LABEL is optional, and if present, consists of an identifier followed by
1144 The LABEL identifies the loop for the loop control statements
1152 BLOCK, it is always executed just before
1153 the conditional is about to be evaluated again, similarly to the third part
1157 Thus it can be used to increment a loop variable, even when the loop has
1158 been continued via the
1160 statement (similar to the C \*(L"continue\*(R" statement).
1164 is replaced by the word
1166 the sense of the test is reversed, but the conditional is still tested before
1167 the first iteration.
1173 statement, you may replace \*(L"(EXPR)\*(R" with a BLOCK, and the conditional
1174 is true if the value of the last command in that block is true.
1178 loop works exactly like the corresponding
1184 for ($i = 1; $i < 10; $i++) {
1198 The foreach loop iterates over a normal array value and sets the variable
1199 VAR to be each element of the array in turn.
1200 The variable is implicitly local to the loop, and regains its former value
1201 upon exiting the loop.
1202 The \*(L"foreach\*(R" keyword is actually identical to the \*(L"for\*(R" keyword,
1203 so you can use \*(L"foreach\*(R" for readability or \*(L"for\*(R" for brevity.
1204 If VAR is omitted, $_ is set to each value.
1205 If ARRAY is an actual array (as opposed to an expression returning an array
1206 value), you can modify each element of the array
1207 by modifying VAR inside the loop.
1212 for (@ary) { s/foo/bar/; }
1214 foreach $elem (@elements) {
1219 for ((10,9,8,7,6,5,4,3,2,1,\'BOOM\')) {
1220 print $_, "\en"; sleep(1);
1223 for (1..15) { print "Merry Christmas\en"; }
1226 foreach $item (split(/:[\e\e\en:]*/, $ENV{\'TERMCAP\'})) {
1227 print "Item: $item\en";
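As a minimal sketch of the aliasing and scoping rules just described (the names @prices and $p are hypothetical and not taken from any script in this manual), the loop below modifies the array through the loop variable, and the variable regains its former value afterwards:

```perl
# Hypothetical data; @prices and $p are illustrative names only.
@prices = (10, 20, 30);
$p = 'outer value';

# $p stands for each element in turn, so assigning to it changes @prices.
foreach $p (@prices) {
    $p *= 2;
}

# @prices now holds the doubled values, and $p has its old value back.
$summary = join(',', @prices);
```

Had the loop iterated over an expression returning an array value rather than over @prices itself, the original array would have been left untouched.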
1232 The BLOCK by itself (labeled or not) is equivalent to a loop that executes
1234 Thus you can use any of the loop control statements in it to leave or
1239 This construct is particularly nice for doing case structures.
1244 if (/^abc/) { $abc = 1; last foo; }
1245 if (/^def/) { $def = 1; last foo; }
1246 if (/^xyz/) { $xyz = 1; last foo; }
1251 There is no official switch statement in perl, because there
1252 are already several ways to write the equivalent.
1253 In addition to the above, you could write
1258 $abc = 1, last foo if /^abc/;
1259 $def = 1, last foo if /^def/;
1260 $xyz = 1, last foo if /^xyz/;
1268 /^abc/ && do { $abc = 1; last foo; };
1269 /^def/ && do { $def = 1; last foo; };
1270 /^xyz/ && do { $xyz = 1; last foo; };
1278 /^abc/ && ($abc = 1, last foo);
1279 /^def/ && ($def = 1, last foo);
1280 /^xyz/ && ($xyz = 1, last foo);
1297 As it happens, these are all optimized internally to a switch structure,
1298 so perl jumps directly to the desired statement, and you needn't worry
1299 about perl executing a lot of unnecessary statements when you have a string
1300 of 50 elsifs, as long as you are testing the same simple scalar variable
1301 using ==, eq, or pattern matching as above.
1302 (If you're curious as to whether the optimizer has done this for a particular
1303 case statement, you can use the \-D1024 switch to list the syntax tree
1305 .Sh "Simple statements"
1306 The only kind of simple statement is an expression evaluated for its side
1308 Every simple statement must be terminated with a semicolon, unless it is the
1309 final statement in a block, in which case the semicolon is optional.
1310 (A semicolon is still encouraged there if the block takes up more than one line.)
1312 Any simple statement may optionally be followed by a
1313 single modifier, just before the terminating semicolon.
1314 The possible modifiers are:
1328 modifiers have the expected semantics.
1333 modifiers also have the expected semantics (conditional evaluated first),
1334 except when applied to a do-BLOCK or a do-SUBROUTINE command,
1335 in which case the block executes once before the conditional is evaluated.
1336 This is so that you can write loops like:
1343 } until $_ \|eq \|".\|\e\|n";
1348 operator below. Note also that the loop control commands described later will
1349 NOT work in this construct, since modifiers don't take loop labels.
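The difference in evaluation order can be seen in a small sketch. The variable names are hypothetical, and the condition is deliberately true from the start so that only the order of evaluation decides whether the body runs:

```perl
# do-BLOCK with a modifier: the block runs once before the test.
$ran = 0;
do {
    $ran++;
} until 1;

# Ordinary simple statement with a modifier: the test comes first,
# so the statement is never executed here.
$skipped = 0;
$skipped++ until 1;
```

After this runs, $ran is 1 and $skipped is still 0.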
1354 expressions work almost exactly like C expressions, so only the differences
1355 will be mentioned here.
1361 The exponentiation operator.
1363 The exponentiation assignment operator.
1365 The null list, used to initialize an array to null.
1367 Concatenation of two strings.
1369 The concatenation assignment operator.
1371 String equality (== is numeric equality).
1372 For a mnemonic just think of \*(L"eq\*(R" as a string.
1373 (If you are used to the
1375 behavior of using == for either string or numeric equality
1376 based on the current form of the comparands, beware!
1377 You must be explicit here.)
1379 String inequality (!= is numeric inequality).
1383 String greater than.
1385 String less than or equal.
1387 String greater than or equal.
1389 String comparison, returning -1, 0, or 1.
1391 Numeric comparison, returning -1, 0, or 1.
1393 Certain operations search or modify the string \*(L"$_\*(R" by default.
1394 This operator makes that kind of operation work on some other string.
1395 The right argument is a search pattern, substitution, or translation.
1396 The left argument is what is supposed to be searched, substituted, or
1397 translated instead of the default \*(L"$_\*(R".
1398 The return value indicates the success of the operation.
1399 (If the right argument is an expression other than a search pattern,
1400 substitution, or translation, it is interpreted as a search pattern
1402 This is less efficient than an explicit search, since the pattern must
1403 be compiled every time the expression is evaluated.)
1404 The precedence of this operator is lower than unary minus and autoincrement/decrement, but higher than everything else.
1406 Just like =~ except the return value is negated.
1408 The repetition operator.
1409 Returns a string consisting of the left operand repeated the
1410 number of times specified by the right operand.
1411 In an array context, if the left operand is a list in parens, it repeats
1415 print \'\-\' x 80; # print row of dashes
1416 print \'\-\' x80; # illegal, x80 is identifier
1418 print "\et" x ($tab/8), \' \' x ($tab%8); # tab over
1420 @ones = (1) x 80; # an array of 80 1's
1421 @ones = (5) x @ones; # set all elements to 5
1425 The repetition assignment operator.
1426 Only works on scalars.
1428 The range operator, which is really two different operators depending
1430 In an array context, returns an array of values counting (by ones)
1431 from the left value to the right value.
1432 This is useful for writing \*(L"for (1..10)\*(R" loops and for doing
1433 slice operations on arrays.
1435 In a scalar context, .\|. returns a boolean value.
1436 The operator is bistable, like a flip-flop, and
1437 emulates the line-range (comma) operator of sed, awk, and various editors.
1438 Each .\|. operator maintains its own boolean state.
1439 It is false as long as its left operand is false.
1440 Once the left operand is true, the range operator stays true
1441 until the right operand is true,
1442 AFTER which the range operator becomes false again.
1443 (It doesn't become false till the next time the range operator is evaluated.
1444 It can test the right operand and become false on the
1445 same evaluation it became true (as in awk), but it still returns true once.
1446 If you don't want it to test the right operand till the next
1447 evaluation (as in sed), use three dots (.\|.\|.) instead of two.)
1448 The right operand is not evaluated while the operator is in the \*(L"false\*(R" state,
1449 and the left operand is not evaluated while the operator is in the \*(L"true\*(R" state.
1450 The precedence is a little lower than || and &&.
1451 The value returned is either the null string for false, or a sequence number
1452 (beginning with 1) for true.
1453 The sequence number is reset for each range encountered.
1454 The final sequence number in a range has the string \'E0\' appended to it, which
1455 doesn't affect its numeric value, but gives you something to search for if you
1456 want to exclude the endpoint.
1457 You can exclude the beginning point by waiting for the sequence number to be
1459 If either operand of scalar .\|. is static, that operand is implicitly compared
1460 to the $. variable, the current line number.
1465 As a scalar operator:
1466 if (101 .\|. 200) { print; } # print 2nd hundred lines
1468 next line if (1 .\|. /^$/); # skip header lines
1470 s/^/> / if (/^$/ .\|. eof()); # quote body
1473 As an array operator:
1474 for (101 .\|. 200) { print; } # print $_ 100 times
1476 @foo = @foo[$[ .\|. $#foo]; # an expensive no-op
1477 @foo = @foo[$#foo-4 .\|. $#foo]; # slice last 5 items
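The sequence numbers and the \'E0\' marker are easier to see in a worked case. This is only a sketch: the list of lines and the BEGIN and END markers are invented for illustration, and explicit pattern matches are used as operands so that nothing depends on the $. variable:

```perl
# Hypothetical input; BEGIN and END are made-up delimiters.
@lines = ('head', 'BEGIN', 'a', 'b', 'END', 'tail');
@marked = ();
@inner = ();

foreach $line (@lines) {
    # Scalar context, so this is the flip-flop form of the operator.
    $seq = ($line =~ /^BEGIN/) .. ($line =~ /^END/);
    next unless $seq;                     # null string outside the range
    push(@marked, "$seq=$line");
    # Skip the first line (sequence number 1) and the last (ends in E0).
    push(@inner, $line) if $seq > 1 && $seq !~ /E0$/;
}

$marked = join(' ', @marked);
$inner = join(' ', @inner);
```

Here $marked ends up as "1=BEGIN 2=a 3=b 4E0=END" and $inner as "a b".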
1482 This unary operator takes one argument, either a filename or a filehandle,
1483 and tests the associated file to see if something is true about it.
1484 If the argument is omitted, tests $_, except for \-t, which tests
1486 It returns 1 for true and \'\' for false, or the undefined value if the
1488 Precedence is higher than logical and relational operators, but lower than
1489 arithmetic operators.
1490 The operator may be any of:
1492 \-r File is readable by effective uid/gid.
1493 \-w File is writable by effective uid/gid.
1494 \-x File is executable by effective uid/gid.
1495 \-o File is owned by effective uid.
1496 \-R File is readable by real uid/gid.
1497 \-W File is writable by real uid/gid.
1498 \-X File is executable by real uid/gid.
1499 \-O File is owned by real uid.
1501 \-z File has zero size.
1502 \-s File has non-zero size (returns size).
1503 \-f File is a plain file.
1504 \-d File is a directory.
1505 \-l File is a symbolic link.
1506 \-p File is a named pipe (FIFO).
1507 \-S File is a socket.
1508 \-b File is a block special file.
1509 \-c File is a character special file.
1510 \-u File has setuid bit set.
1511 \-g File has setgid bit set.
1512 \-k File has sticky bit set.
1513 \-t Filehandle is opened to a tty.
1514 \-T File is a text file.
1515 \-B File is a binary file (opposite of \-T).
1516 \-M Age of file in days when script started.
1517 \-A Same for access time.
1518 \-C Same for inode change time.
1521 The interpretation of the file permission operators \-r, \-R, \-w, \-W, \-x and \-X
1522 is based solely on the mode of the file and the uids and gids of the user.
1523 There may be other reasons you can't actually read, write or execute the file.
1524 Also note that, for the superuser, \-r, \-R, \-w and \-W always return 1, and
1525 \-x and \-X return 1 if any execute bit is set in the mode.
1526 Scripts run by the superuser may thus need to do a stat() in order to determine
1527 the actual mode of the file, or temporarily set the uid to something else.
1535 next unless \-f $_; # ignore specials
1540 Note that \-s/a/b/ does not do a negated substitution.
1541 Saying \-exp($foo) still works as expected, however\*(--only single letters
1542 following a minus are interpreted as file tests.
1544 The \-T and \-B switches work as follows.
1545 The first block or so of the file is examined for odd characters such as
1546 strange control codes or metacharacters.
1547 If too many odd characters (>10%) are found, it's a \-B file, otherwise it's a \-T file.
1548 Also, any file containing null in the first block is considered a binary file.
1549 If \-T or \-B is used on a filehandle, the current stdio buffer is examined
1550 rather than the first block.
1551 Both \-T and \-B return TRUE on a null file, or a file at EOF when testing
1554 If any of the file tests (or either stat operator) are given the special
1555 filehandle consisting of a solitary underline, then the stat structure
1556 of the previous file test (or stat operator) is used, saving a system
1558 (This doesn't work with \-t, and you need to remember that lstat and \-l
1559 will leave values in the stat structure for the symbolic link, not the
1564 print "Can do.\en" if -r $a || -w _ || -x _;
1568 print "Readable\en" if -r _;
1569 print "Writable\en" if -w _;
1570 print "Executable\en" if -x _;
1571 print "Setuid\en" if -u _;
1572 print "Setgid\en" if -g _;
1573 print "Sticky\en" if -k _;
1574 print "Text\en" if -T _;
1575 print "Binary\en" if -B _;
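The same idea can be wrapped in a subroutine. The following is a sketch only; the name file_traits and the labels it returns are hypothetical. Only the first test names the file, and the later tests reuse the saved stat structure through the underline filehandle:

```perl
# Returns a comma-separated description of a file (hypothetical helper).
sub file_traits {
    local($file) = @_;
    local(@traits) = ();
    push(@traits, 'readable')  if -r $file;   # the only test that names the file
    push(@traits, 'writable')  if -w _;       # reuses the saved stat structure
    push(@traits, 'directory') if -d _;
    push(@traits, 'plain')     if -f _;
    join(',', @traits);
}

$here = &file_traits('.');
```

For a file that does not exist, every test fails and the subroutine returns the null string.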
1579 Here is what C has that
1583 Address-of operator.
1585 Dereference-address operator.
1587 Type casting operator.
1591 does a certain amount of expression evaluation at compile time, whenever
1592 it determines that all of the arguments to an operator are static and have
1594 In particular, string concatenation happens at compile time between literals that don't do variable substitution.
1595 Backslash interpretation also happens at compile time.
1600 \'Now is the time for all\' . "\|\e\|n" .
1601 \'good men to come to.\'
1604 and this all reduces to one string internally.
1606 The autoincrement operator has a little extra built-in magic to it.
1607 If you increment a variable that is numeric, or that has ever been used in
1608 a numeric context, you get a normal increment.
1609 If, however, the variable has only been used in string contexts since it
1610 was set, and has a value that is not null and matches the
1611 pattern /^[a\-zA\-Z]*[0\-9]*$/, the increment is done
1612 as a string, preserving each character within its range, with carry:
1615 print ++($foo = \'99\'); # prints \*(L'100\*(R'
1616 print ++($foo = \'a0\'); # prints \*(L'a1\*(R'
1617 print ++($foo = \'Az\'); # prints \*(L'Ba\*(R'
1618 print ++($foo = \'zz\'); # prints \*(L'aaa\*(R'
1621 The autodecrement is not magical.
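One place the magical increment is handy is in generating successive labels. A small sketch, in which the subroutine name next_label is hypothetical:

```perl
# Returns the label that follows the one given (hypothetical helper).
# The argument is only ever used as a string, so ++ increments it
# as a string, with carry.
sub next_label {
    local($label) = @_;
    $label++;
    $label;
}

$after_a9 = &next_label('a9');
$after_Zz = &next_label('Zz');
$after_zz = &next_label('zz');
```

These give \'b0\', \'AAa\' and \'aaa\' respectively: each character stays within its own range, and a carry off the left end adds a new character.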
1623 The range operator (in an array context) makes use of the magical
1624 autoincrement algorithm if the minimum and maximum are strings.
1627 @alphabet = (\'A\' .. \'Z\');
1629 to get all the letters of the alphabet, or
1631 $hexdigit = (0 .. 9, \'a\' .. \'f\')[$num & 15];
1633 to get a hexadecimal digit, or
1635 @z2 = (\'01\' .. \'31\'); print @z2[$mday];
1637 to get dates with leading zeros.
1638 (If the final value specified is not in the sequence that the magical increment
1639 would produce, the sequence goes until the next value would be longer than
1640 the final value specified.)
1642 The || and && operators differ from C's in that, rather than returning 0 or 1,
1643 they return the last value evaluated.
1644 Thus, a portable way to find out the home directory might be:
1647 $home = $ENV{'HOME'} || $ENV{'LOGDIR'} ||
1648 (getpwuid($<))[7] || die "You're homeless!\en";
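The same property gives a compact way to supply defaults. This is a sketch with made-up data; the %config array and its keys are hypothetical:

```perl
# Hypothetical settings: 'name' is present but null, 'port' is absent.
%config = ('name', '', 'alias', 'lwall');

# || yields the first true operand, or the last operand if none is true.
$who  = $config{'name'} || $config{'alias'} || 'nobody';
$port = $config{'port'} || 80;

# && yields the first false operand, or the last operand if all are true.
$both = $config{'alias'} && 'has alias';
$none = $config{'name'}  && 'has name';
```

Note that a legitimate value of 0 or the null string is also replaced by the default, since both count as false.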
1652 Along with the literals and variables mentioned earlier,
1653 the operations in the following section can serve as terms in an expression.
1654 Some of these operations take a LIST as an argument.
1655 Such a list can consist of any combination of scalar arguments or array values;
1656 the array values will be included in the list as if each individual element were
1657 interpolated at that point in the list, forming a longer single-dimensional
1659 Elements of the LIST should be separated by commas.
1660 If an operation is listed both with and without parentheses around its
1661 arguments, it means you can either use it as a unary operator or
1663 To use it as a function call, the next token on the same line must
1664 be a left parenthesis.
1665 (There may be intervening white space.)
1666 Such a function then has highest precedence, as you would expect from
1668 If any token other than a left parenthesis follows, then it is a
1669 unary operator, with a precedence depending only on whether it is a LIST
1671 LIST operators have lowest precedence.
1672 All other unary operators have a precedence greater than relational operators
1673 but less than arithmetic operators.
1674 See the section on Precedence.
1676 For operators that can be used in either a scalar or array context,
1677 failure is generally indicated in a scalar context by returning
1678 the undefined value, and in an array context by returning the null list.
1679 Remember though that
1680 THERE IS NO GENERAL RULE FOR CONVERTING A LIST INTO A SCALAR.
1681 Each operator decides which sort of scalar it would be most
1682 appropriate to return.
1683 Some operators return the length of the list
1684 that would have been returned in an array context.
1685 Some operators return the first value in the list.
1686 Some operators return the last value in the list.
1687 Some operators return a count of successful operations.
1688 In general, they do what you want, unless you want consistency.
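The LIST interpolation and context rules above can be illustrated with a few assignments. This is a sketch using hypothetical variable names, and it shows only two of the possible scalar conversions:

```perl
@low  = (1, 2);
@high = (8, 9);

# Arrays in a LIST are interpolated into one flat list of six elements.
@all = (0, @low, @high, 10);

# An array evaluated in a scalar context yields its length.
$count = @all;

# A list assignment takes the leading elements instead.
($first) = @all;

# join() takes a LIST, so the arrays are flattened here too.
$joined = join('-', @low, 'x', @high);
```

After this, $count is 6, $first is 0, and $joined is "1-2-x-8-9".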
1692 This is just like the /pattern/ search, except that it matches only once between
1696 This is a useful optimization when you only want to see the first occurrence of
1697 something in each file of a set of files, for instance.
1698 Only ?? patterns local to the current package are reset.
1699 .Ip "accept(NEWSOCKET,GENERICSOCKET)" 8 2
1700 Does the same thing that the accept system call does.
1701 Returns true if it succeeded, false otherwise.
1702 See example in section on Interprocess Communication.
1703 .Ip "alarm(SECONDS)" 8 4
1704 .Ip "alarm SECONDS" 8
1705 Arranges to have a SIGALRM delivered to this process after the specified number
1706 of seconds (minus 1, actually) have elapsed. Thus, alarm(15) will cause
1707 a SIGALRM at some point more than 14 seconds in the future.
1708 Only one timer may be counting at once. Each call disables the previous
1709 timer, and an argument of 0 may be supplied to cancel the previous timer
1710 without starting a new one.
1711 The returned value is the amount of time remaining on the previous timer.
1712 .Ip "atan2(Y,X)" 8 2
1713 Returns the arctangent of Y/X in the range
1714 .if t \-\(*p to \(*p.
1716 .Ip "bind(SOCKET,NAME)" 8 2
1717 Does the same thing that the bind system call does.
1718 Returns true if it succeeded, false otherwise.
1719 NAME should be a packed address of the proper type for the socket.
1720 See example in section on Interprocess Communication.
1721 .Ip "binmode(FILEHANDLE)" 8 4
1722 .Ip "binmode FILEHANDLE" 8 4
1723 Arranges for the file to be read in \*(L"binary\*(R" mode in operating systems
1724 that distinguish between binary and text files.
1725 Files that are not read in binary mode have CR LF sequences translated
1726 to LF on input and LF translated to CR LF on output.
1727 Binmode has no effect under Unix.
1728 If FILEHANDLE is an expression, the value is taken as the name of
1732 Returns the context of the current subroutine call:
1735 ($package,$filename,$line) = caller;
1738 With EXPR, returns some extra information that the debugger uses to print
1739 a stack trace. The value of EXPR indicates how many call frames to go
1740 back before the current one.
1741 .Ip "chdir(EXPR)" 8 2
1742 .Ip "chdir EXPR" 8 2
1743 Changes the working directory to EXPR, if possible.
1744 If EXPR is omitted, changes to home directory.
1745 Returns 1 upon success, 0 otherwise.
1748 .Ip "chmod(LIST)" 8 2
1749 .Ip "chmod LIST" 8 2
1750 Changes the permissions of a list of files.
1751 The first element of the list must be the numerical mode.
1752 Returns the number of files successfully changed.
1756 $cnt = chmod 0755, \'foo\', \'bar\';
1757 chmod 0755, @executables;
1760 .Ip "chop(LIST)" 8 7
1761 .Ip "chop(VARIABLE)" 8
1762 .Ip "chop VARIABLE" 8
1764 Chops off the last character of a string and returns the character chopped.
1765 It's used primarily to remove the newline from the end of an input record,
1766 but is much more efficient than s/\en// because it neither scans nor copies
1768 If VARIABLE is omitted, chops $_.
1774 chop; # avoid \en on last field
1775 @array = split(/:/);
1780 You can actually chop anything that's an lvalue, including an assignment:
1783 chop($cwd = \`pwd\`);
1784 chop($answer = <STDIN>);
1787 If you chop a list, each element is chopped.
1788 Only the value of the last chop is returned.
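A short sketch of these behaviors follows. The sample record and the variable names are hypothetical:

```perl
# A made-up input record with its trailing newline.
$line = "lwall:Larry Wall\n";
$removed = chop($line);              # returns the character removed
($login, $name) = split(/:/, $line);

# chop removes the last character whatever it is, not just a newline.
$word = 'perl';
chop($word);

# Chopping a list chops each element.
@recs = ("a\n", "b\n");
chop(@recs);
```

Because chop does not check what it removes, calling it on a string with no trailing newline shortens the data itself, as $word shows.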
1789 .Ip "chown(LIST)" 8 2
1790 .Ip "chown LIST" 8 2
1791 Changes the owner (and group) of a list of files.
1792 The first two elements of the list must be the NUMERICAL uid and gid,
1794 Returns the number of files successfully changed.
1798 $cnt = chown $uid, $gid, \'foo\', \'bar\';
1799 chown $uid, $gid, @filenames;
1803 Here's an example that looks up non-numeric uids in the passwd file:
1813 open(pass, \'/etc/passwd\') || die "Can't open passwd: $!\en";
1816 open(pass, \'/etc/passwd\')
1817 || die "Can't open passwd: $!\en";
1820 ($login,$pass,$uid,$gid) = split(/:/);
1821 $uid{$login} = $uid;
1822 $gid{$login} = $gid;
1824 @ary = <${pattern}>; # get filenames
1825 if ($uid{$user} eq \'\') {
1826 die "$user not in passwd file";
1829 chown $uid{$user}, $gid{$user}, @ary;
1833 .Ip "chroot(FILENAME)" 8 5
1834 .Ip "chroot FILENAME" 8
1835 Does the same as the system call of that name.
1836 If you don't know what it does, don't worry about it.
1837 If FILENAME is omitted, does chroot to $_.
1838 .Ip "close(FILEHANDLE)" 8 5
1839 .Ip "close FILEHANDLE" 8
1840 Closes the file or pipe associated with the file handle.
1841 You don't have to close FILEHANDLE if you are immediately going to
1842 do another open on it, since open will close it for you.
1845 However, an explicit close on an input file resets the line counter ($.), while
1846 the implicit close done by
1849 Also, closing a pipe will wait for the process executing on the pipe to complete,
1850 in case you want to look at the output of the pipe afterwards.
1851 Closing a pipe explicitly also puts the status value of the command into $?.
1856 open(OUTPUT, \'|sort >foo\'); # pipe to sort
1857 .\|.\|. # print stuff to output
1858 close OUTPUT; # wait for sort to finish
1859 open(INPUT, \'foo\'); # get sort's results
1862 FILEHANDLE may be an expression whose value gives the real filehandle name.
1863 .Ip "closedir(DIRHANDLE)" 8 5
1864 .Ip "closedir DIRHANDLE" 8
1865 Closes a directory opened by opendir().
1866 .Ip "connect(SOCKET,NAME)" 8 2
1867 Does the same thing that the connect system call does.
1868 Returns true if it succeeded, false otherwise.
1869 NAME should be a packed address of the proper type for the socket.
1870 See example in section on Interprocess Communication.
1873 Returns the cosine of EXPR (expressed in radians).
1874 If EXPR is omitted takes cosine of $_.
1875 .Ip "crypt(PLAINTEXT,SALT)" 8 6
1876 Encrypts a string exactly like the crypt() function in the C library.
1877 Useful for checking the password file for lousy passwords.
1878 Only the guys wearing white hats should do this.
1879 .Ip "dbmclose(ASSOC_ARRAY)" 8 6
1880 .Ip "dbmclose ASSOC_ARRAY" 8
1881 Breaks the binding between a dbm file and an associative array.
1882 The values remaining in the associative array are meaningless unless
1883 you happen to want to know what was in the cache for the dbm file.
1884 This function is only useful if you have ndbm.
1885 .Ip "dbmopen(ASSOC,DBNAME,MODE)" 8 6
1886 This binds a dbm or ndbm file to an associative array.
1887 ASSOC is the name of the associative array.
1888 (Unlike normal open, the first argument is NOT a filehandle, even though
1890 DBNAME is the name of the database (without the .dir or .pag extension).
1891 If the database does not exist, it is created with protection specified
1892 by MODE (as modified by the umask).
1893 If your system only supports the older dbm functions, you may perform only one
1894 dbmopen in your program.
1895 If your system has neither dbm nor ndbm, calling dbmopen produces a fatal
1898 Values assigned to the associative array prior to the dbmopen are lost.
1899 A certain number of values from the dbm file are cached in memory.
1900 By default this number is 64, but you can increase it by preallocating
1901 that number of garbage entries in the associative array before the dbmopen.
1902 You can flush the cache if necessary with the reset command.
1904 If you don't have write access to the dbm file, you can only read
1905 associative array variables, not set them.
1906 If you want to test whether you can write, either use file tests or
1907 try setting a dummy array entry inside an eval, which will trap the error.
1909 Note that functions such as keys() and values() may return huge array values
1910 when used on large dbm files.
1911 You may prefer to use the each() function to iterate over large dbm files.
1916 # print out history file offsets
1917 dbmopen(HIST,'/usr/lib/news/history',0666);
1918 while (($key,$val) = each %HIST) {
1919 print $key, ' = ', unpack('L',$val), "\en";
1924 .Ip "defined(EXPR)" 8 6
1925 .Ip "defined EXPR" 8
1926 Returns a boolean value saying whether the lvalue EXPR has a real value
1928 Many operations return the undefined value under exceptional conditions,
1929 such as end of file, uninitialized variable, system error and such.
1930 This function allows you to distinguish between an undefined null string
1931 and a defined null string with operations that might return a real null
1932 string, in particular referencing elements of an array.
1933 You may also check to see if arrays or subroutines exist.
1934 Use on predefined variables is not guaranteed to produce intuitive results.
1939 print if defined $switch{'D'};
1940 print "$val\en" while defined($val = pop(@ary));
1941 die "Can't readlink $sym: $!"
1942 unless defined($value = readlink $sym);
1943 eval '@foo = ()' if defined(@foo);
1944 die "No XYZ package defined" unless defined %_XYZ;
1945 sub foo { defined &$bar ? &$bar(@_) : die "No bar"; }
1949 .Ip "delete $ASSOC{KEY}" 8 6
1950 Deletes the specified value from the specified associative array.
1951 Returns the deleted value, or the undefined value if nothing was deleted.
1952 Deleting from $ENV{} modifies the environment.
1953 Deleting from an array bound to a dbm file deletes the entry from the dbm
1956 The following deletes all the values of an associative array:
1960 foreach $key (keys %ARRAY) {
1961 delete $ARRAY{$key};
1965 (But it would be faster to use the
1968 Saying undef %ARRAY is faster yet.)
1971 Outside of an eval, prints the value of LIST to
1973 and exits with the current value of $!
1975 If $! is 0, exits with the value of ($? >> 8) (\`command\` status).
1976 If ($? >> 8) is 0, exits with 255.
1977 Inside an eval, the error message is stuffed into $@ and the eval is terminated
1978 with the undefined value.
1980 Equivalent examples:
1985 die "Can't cd to spool: $!\en" unless chdir \'/usr/spool/news\';
1988 die "Can't cd to spool: $!\en"
1989 unless chdir \'/usr/spool/news\';
1992 chdir \'/usr/spool/news\' || die "Can't cd to spool: $!\en"
1996 If the value of EXPR does not end in a newline, the current script line
1997 number and input line number (if any) are also printed, and a newline is
1999 Hint: sometimes appending \*(L", stopped\*(R" to your message will cause it to make
2000 better sense when the string \*(L"at foo line 123\*(R" is appended.
2001 Suppose you are running script \*(L"canasta\*(R".
2005 die "/etc/games is no good";
2006 die "/etc/games is no good, stopped";
2008 produce, respectively
2010 /etc/games is no good at canasta line 123.
2011 /etc/games is no good, stopped at canasta line 123.
2017 Returns the value of the last command in the sequence of commands indicated
2019 When modified by a loop modifier, executes the BLOCK once before testing the
2021 (On other statements the loop modifiers test the conditional first.)
2022 .Ip "do SUBROUTINE (LIST)" 8 3
2023 Executes a SUBROUTINE declared by a
2025 declaration, and returns the value
2026 of the last expression evaluated in SUBROUTINE.
2027 If there is no subroutine by that name, produces a fatal error.
2028 (You may use the \*(L"defined\*(R" operator to determine if a subroutine
2030 If you pass arrays as part of LIST you may wish to pass the length
2031 of the array in front of each array.
2032 (See the section on subroutines later on.)
2033 The parentheses are required to avoid confusion with the \*(L"do EXPR\*(R"
2036 SUBROUTINE may also be a single scalar variable, in which case
2037 the name of the subroutine to execute is taken from the variable.
2039 As an alternate (and preferred) form,
2040 you may call a subroutine by prefixing the name with
2041 an ampersand: &foo(@args).
2042 If you aren't passing any arguments, you don't have to use parentheses.
2043 If you omit the parentheses, no @_ array is passed to the subroutine.
2044 The & form is also used to specify subroutines to the defined and undef
2048 if (defined &$var) { &$var($parm); undef &$var; }
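The ampersand forms can be combined as in the following sketch. The subroutine double and the variable names are hypothetical:

```perl
# A trivial subroutine to call (hypothetical).
sub double {
    local($n) = @_;
    $n * 2;
}

$name = 'double';

$direct   = &double(21);        # call by name
$indirect = &$name(4);          # name of the subroutine taken from $name

$known   = defined &$name;      # true: the subroutine exists
$unknown = defined &no_such_sub;   # false: nothing by that name
```

Testing with defined before an indirect call avoids the fatal error that calling a nonexistent subroutine would produce.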
2052 Uses the value of EXPR as a filename and executes the contents of the file
2056 Its primary use is to include subroutines from a
2065 eval \`cat stat.pl\`;
2068 except that it's more efficient, more concise, keeps track of the current
2069 filename for error messages, and searches all the
2071 libraries if the file
2072 isn't in the current directory (see also the @INC array in Predefined Names).
2073 It's the same, however, in that it does reparse the file every time you
2074 call it, so if you are going to use the file inside a loop you might prefer
2075 to use \-P and #include, at the expense of a little more startup time.
2076 (The main problem with #include is that cpp doesn't grok # comments\*(--a
2077 workaround is to use \*(L";#\*(R" for standalone comments.)
2078 Note that the following are NOT equivalent:
2082 do $foo; # eval a file
2083 do $foo(); # call a subroutine
2086 Note that inclusion of library routines is better done with
2087 the \*(L"require\*(R" operator.
2088 .Ip "dump LABEL" 8 6
2089 This causes an immediate core dump.
2090 Primarily this is so that you can use the undump program to turn your
2091 core dump into an executable binary after having initialized all your
2092 variables at the beginning of the program.
2093 When the new binary is executed it will begin by executing a \*(L"goto LABEL\*(R"
2094 (with all the restrictions that goto suffers).
2095 Think of it as a goto with an intervening core dump and reincarnation.
2096 If LABEL is omitted, restarts the program from the top.
2097 WARNING: any files opened at the time of the dump will NOT be open any more
2098 when the program is reincarnated, with possible resulting confusion on the part
2107 require 'getopt.pl';
2118 dump QUICKSTART if $ARGV[0] eq '-d';
2124 .Ip "each(ASSOC_ARRAY)" 8 6
2125 .Ip "each ASSOC_ARRAY" 8
2126 Returns a 2-element array consisting of the key and value for the next
2127 value of an associative array, so that you can iterate over it.
2128 Entries are returned in an apparently random order.
2129 When the array is entirely read, a null array is returned (which when
2130 assigned produces a FALSE (0) value).
2131 The next call to each() after that will start iterating again.
2132 The iterator can be reset only by reading all the elements from the array.
2133 You must not modify the array while iterating over it.
2134 There is a single iterator for each associative array, shared by all
2135 each(), keys() and values() function calls in the program.
2136 The following prints out your environment like the printenv program, only
2137 in a different order:
2141 while (($key,$value) = each %ENV) {
2142 print "$key=$value\en";
2146 See also keys() and values().
2147 .Ip "eof(FILEHANDLE)" 8 8
2150 Returns 1 if the next read on FILEHANDLE will return end of file, or if
2151 FILEHANDLE is not open.
2152 FILEHANDLE may be an expression whose value gives the real filehandle name.
2153 (Note that this function actually reads a character and then ungetc's it,
2154 so it is not very useful in an interactive context.)
2155 An eof without an argument returns the eof status for the last file read.
2156 Empty parentheses () may be used to indicate the pseudo file formed of the
2157 files listed on the command line, i.e. eof() is reasonable to use inside
2158 a while (<>) loop to detect the end of only the last file.
2159 Use eof(ARGV) or eof without the parentheses to test EACH file in a while (<>) loop.
2164 # insert dashes just before last line of last file
2167 print "\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\en";
2173 # reset line numbering on each input file
2176 if (eof) { # Not eof().
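The difference between eof and eof() can be sketched as follows. This is only an illustration under stated assumptions: the subroutine name number_lines is hypothetical, and it expects at least one readable file name (with an empty list, <> would read standard input).

```perl
# Hypothetical helper: number the lines of each named file separately.
sub number_lines {
    local(@ARGV) = @_;           # <> reads the files named in @ARGV
    local($_, @out);
    while (<>) {
        push(@out, join(':', $., $_));
        close(ARGV) if eof;      # eof, not eof(): true at the end of EACH file,
                                 # and the close resets $. for the next one
    }
    return @out;
}

# Only runs when file names are given on the command line.
print &number_lines(@ARGV) if @ARGV;
```

Replacing the test with eof() would make it true only once, at the end of the last file, so the line numbers would keep counting across files.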
2182 .Ip "eval(EXPR)" 8 6
2184 .Ip "eval BLOCK" 8 6
EXPR is parsed and executed as if it were a little \fIperl\fR program.
It is executed in the context of the current \fIperl\fR program, so that
any variable settings, subroutine or format definitions remain afterwards.
2192 The value returned is the value of the last expression evaluated, just
2193 as with subroutines.
2194 If there is a syntax error or runtime error, or a die statement is
2195 executed, an undefined value is returned by
2196 eval, and $@ is set to the error message.
2197 If there was no error, $@ is guaranteed to be a null string.
2198 If EXPR is omitted, evaluates $_.
2199 The final semicolon, if any, may be omitted from the expression.
2201 Note that, since eval traps otherwise-fatal errors, it is useful for
2202 determining whether a particular feature
2203 (such as dbmopen or symlink) is implemented.
2204 It is also Perl's exception trapping mechanism, where the die operator is
2205 used to raise exceptions.
2207 If the code to be executed doesn't vary, you may use
2208 the eval-BLOCK form to trap run-time errors without incurring
2209 the penalty of recompiling each time.
2210 The error, if any, is still returned in $@.
2211 Evaluating a single-quoted string (as EXPR) has the same effect, except that
2212 the eval-EXPR form reports syntax errors at run time via $@, whereas the
2213 eval-BLOCK form reports syntax errors at compile time. The eval-EXPR form
2214 is optimized to eval-BLOCK the first time it succeeds. (Since the replacement
2215 side of a substitution is considered a single-quoted string when you
2216 use the e modifier, the same optimization occurs there.) Examples:
2220 # make divide-by-zero non-fatal
2221 eval { $answer = $a / $b; }; warn $@ if $@;
2223 # optimized to same thing after first use
2224 eval '$answer = $a / $b'; warn $@ if $@;
	# a compile-time error
	eval { $answer = };
	# a run-time error
	eval '$answer =';	# sets $@
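The exception-trapping use of eval described above might be wrapped up as in the following sketch. The subroutine name safe_divide is hypothetical; the caller is assumed to inspect $@ after the call.

```perl
# Hypothetical helper: divide, returning the undefined value on error.
# After the call, $@ is a null string on success or holds the message.
sub safe_divide {
    local($num, $den) = @_;
    local($answer);
    $answer = eval { $num / $den; };   # traps the otherwise fatal error
    return $answer;
}

$result = &safe_divide(1, 0);
print "trapped: $@" if $@;
```

The eval-BLOCK form is used here because the code does not vary, so nothing needs recompiling on each call.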
2233 .Ip "exec(LIST)" 8 8
2235 If there is more than one argument in LIST, or if LIST is an array with
2236 more than one value,
2237 calls execvp() with the arguments in LIST.
2238 If there is only one scalar argument, the argument is checked for shell metacharacters.
2239 If there are any, the entire argument is passed to \*(L"/bin/sh \-c\*(R" for parsing.
2240 If there are none, the argument is split into words and passed directly to
2241 execvp(), which is more efficient.
2242 Note: exec (and system) do not flush your output buffer, so you may need to
2243 set $| to avoid lost output.
2247 exec \'/bin/echo\', \'Your arguments are: \', @ARGV;
2248 exec "sort $outfile | uniq";
2252 If you don't really want to execute the first argument, but want to lie
2253 to the program you are executing about its own name, you can specify
2254 the program you actually want to run by assigning that to a variable and
2255 putting the name of the variable in front of the LIST without a comma.
2256 (This always forces interpretation of the LIST as a multi-valued list, even
2257 if there is only a single scalar in the list.)
2262 $shell = '/bin/csh';
2263 exec $shell '-sh'; # pretend it's a login shell
2266 .Ip "exit(EXPR)" 8 6
2268 Evaluates EXPR and exits immediately with that value.
2274 exit 0 \|if \|$ans \|=~ \|/\|^[Xx]\|/\|;
2279 If EXPR is omitted, exits with 0 status.
.Ip "exp(EXPR)" 8 3
Returns \fIe\fR to the power of EXPR.
2285 If EXPR is omitted, gives exp($_).
2286 .Ip "fcntl(FILEHANDLE,FUNCTION,SCALAR)" 8 4
2287 Implements the fcntl(2) function.
2288 You'll probably have to say
2291 require "fcntl.ph"; # probably /usr/local/lib/perl/fcntl.ph
2294 first to get the correct function definitions.
If fcntl.ph doesn't exist or doesn't have the correct definitions you'll have to roll
your own, based on your C header files such as <sys/fcntl.h>.
2298 (There is a perl script called h2ph that comes with the perl kit
2299 which may help you in this.)
Argument processing and value return work just like ioctl below.
Note that fcntl will produce a fatal error if used on a machine that doesn't implement
fcntl(2).
2303 .Ip "fileno(FILEHANDLE)" 8 4
2304 .Ip "fileno FILEHANDLE" 8 4
2305 Returns the file descriptor for a filehandle.
2306 Useful for constructing bitmaps for select().
If FILEHANDLE is an expression, the value is taken as the name of
the filehandle.
2309 .Ip "flock(FILEHANDLE,OPERATION)" 8 4
2310 Calls flock(2) on FILEHANDLE.
2311 See manual page for flock(2) for definition of OPERATION.
2312 Returns true for success, false on failure.
Will produce a fatal error if used on a machine that doesn't implement
flock(2).
2315 Here's a mailbox appender for BSD systems.
	$LOCK_EX = 2;	# operation values from BSD <sys/file.h>
	$LOCK_UN = 8;
	sub lock {
	    flock(MBOX,$LOCK_EX);
	    # and, in case someone appended
	    # while we were waiting...
	    seek(MBOX, 0, 2);
	}
	sub unlock { flock(MBOX,$LOCK_UN); }
	open(MBOX, ">>/usr/spool/mail/$ENV{'USER'}")
		|| die "Can't open mailbox: $!";
	do lock();
	print MBOX $msg,"\en\en";
	do unlock();
.Ip "fork" 8 4
Does a fork() call.
Returns the child pid to the parent process and 0 to the child process.
2346 Note: unflushed buffers remain unflushed in both processes, which means
2347 you may need to set $| to avoid duplicate output.
2348 .Ip "getc(FILEHANDLE)" 8 4
2349 .Ip "getc FILEHANDLE" 8
2351 Returns the next character from the input file attached to FILEHANDLE, or
2352 a null string at EOF.
2353 If FILEHANDLE is omitted, reads from STDIN.
.Ip "getlogin" 8 3
Returns the current login from /etc/utmp, if any.
2356 If null, use getpwuid.
2358 $login = getlogin || (getpwuid($<))[0] || "Somebody";
2360 .Ip "getpeername(SOCKET)" 8 3
2361 Returns the packed sockaddr address of other end of the SOCKET connection.
2365 # An internet sockaddr
2366 $sockaddr = 'S n a4 x8';
2367 $hersockaddr = getpeername(S);
2369 ($family, $port, $heraddr) = unpack($sockaddr,$hersockaddr);
2377 .Ip "getpgrp(PID)" 8 4
Returns the current process group for the specified PID, 0 for the current
process.
Will produce a fatal error if used on a machine that doesn't implement
getpgrp(2).
If PID is omitted, returns process group of current process.
.Ip "getppid" 8 4
Returns the process id of the parent process.
2386 .Ip "getpriority(WHICH,WHO)" 8 4
2387 Returns the current priority for a process, a process group, or a user.
2388 (See getpriority(2).)
Will produce a fatal error if used on a machine that doesn't implement
getpriority(2).
2391 .Ip "getpwnam(NAME)" 8
2392 .Ip "getgrnam(NAME)" 8
2393 .Ip "gethostbyname(NAME)" 8
2394 .Ip "getnetbyname(NAME)" 8
2395 .Ip "getprotobyname(NAME)" 8
2396 .Ip "getpwuid(UID)" 8
2397 .Ip "getgrgid(GID)" 8
2398 .Ip "getservbyname(NAME,PROTO)" 8
2399 .Ip "gethostbyaddr(ADDR,ADDRTYPE)" 8
2400 .Ip "getnetbyaddr(ADDR,ADDRTYPE)" 8
2401 .Ip "getprotobynumber(NUMBER)" 8
2402 .Ip "getservbyport(PORT,PROTO)" 8
2411 .Ip "sethostent(STAYOPEN)" 8
2412 .Ip "setnetent(STAYOPEN)" 8
2413 .Ip "setprotoent(STAYOPEN)" 8
2414 .Ip "setservent(STAYOPEN)" 8
These routines perform the same functions as their counterparts in the
system library.
2423 Within an array context,
2424 the return values from the various get routines are as follows:
2427 ($name,$passwd,$uid,$gid,
2428 $quota,$comment,$gcos,$dir,$shell) = getpw.\|.\|.
2429 ($name,$passwd,$gid,$members) = getgr.\|.\|.
2430 ($name,$aliases,$addrtype,$length,@addrs) = gethost.\|.\|.
2431 ($name,$aliases,$addrtype,$net) = getnet.\|.\|.
2432 ($name,$aliases,$proto) = getproto.\|.\|.
2433 ($name,$aliases,$port,$proto) = getserv.\|.\|.
2436 (If the entry doesn't exist you get a null list.)
2438 Within a scalar context, you get the name, unless the function was a
2439 lookup by name, in which case you get the other thing, whatever it is.
2440 (If the entry doesn't exist you get the undefined value.)
2453 The $members value returned by getgr.\|.\|. is a space separated list
2454 of the login names of the members of the group.
2456 For the gethost.\|.\|. functions, if the h_errno variable is supported in C,
2457 it will be returned to you via $? if the function call fails.
2458 The @addrs value returned by a successful call is a list of the
2459 raw addresses returned by the corresponding system library call.
2460 In the Internet domain, each address is four bytes long and you can unpack
2461 it by saying something like:
2464 ($a,$b,$c,$d) = unpack('C4',$addr[0]);
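Building on the unpack shown above, a raw four-byte address can be rendered in the usual dotted form. This is a minimal sketch; the name dotted_quad is hypothetical, and a literal packed address stands in for an element of @addrs so that no network lookup is needed.

```perl
# Hypothetical helper: render a packed four-byte address as a dotted quad.
sub dotted_quad {
    local($addr) = @_;
    return join('.', unpack('C4', $addr));
}

# In real use the argument would be one of the @addrs values returned
# by a gethost... call; here it is built by hand.
print &dotted_quad(pack('C4', 127, 0, 0, 1)), "\n";
```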
2467 .Ip "getsockname(SOCKET)" 8 3
2468 Returns the packed sockaddr address of this end of the SOCKET connection.
2472 # An internet sockaddr
2473 $sockaddr = 'S n a4 x8';
2474 $mysockaddr = getsockname(S);
2476 ($family, $port, $myaddr) = unpack($sockaddr,$mysockaddr);
2484 .Ip "getsockopt(SOCKET,LEVEL,OPTNAME)" 8 3
2485 Returns the socket option requested, or undefined if there is an error.
2486 .Ip "gmtime(EXPR)" 8 4
2488 Converts a time as returned by the time function to a 9-element array with
2489 the time analyzed for the Greenwich timezone.
2490 Typically used as follows:
2495 ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime(time);
2503 All array elements are numeric, and come straight out of a struct tm.
In particular this means that $mon has the range 0.\|.11 and $wday has the
range 0.\|.6.
2506 If EXPR is omitted, does gmtime(time).
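Since the elements come straight out of a struct tm, some adjustment is needed before printing them. The sketch below assumes the usual struct tm conventions ($year counts from 1900 and $mon from 0); the name gm_stamp is hypothetical.

```perl
# Hypothetical helper: format a time value as "YYYY-MM-DD HH:MM:SS" in GMT.
sub gm_stamp {
    local($when) = @_;
    local($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime($when);
    return sprintf("%04d-%02d-%02d %02d:%02d:%02d",
        $year + 1900,            # struct tm years count from 1900
        $mon + 1,                # struct tm months run 0..11
        $mday, $hour, $min, $sec);
}

print &gm_stamp(time), " GMT\n";
```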
2507 .Ip "goto LABEL" 8 6
2508 Finds the statement labeled with LABEL and resumes execution there.
2509 Currently you may only go to statements in the main body of the program
2510 that are not nested inside a do {} construct.
This statement is not implemented very efficiently, and is here only to make
the sed-to-perl translator easier.
I may change its semantics at any time, consistent with support for translated
sed scripts.
2518 Use it at your own risk.
2519 Better yet, don't use it at all.
2520 .Ip "grep(EXPR,LIST)" 8 4
2521 Evaluates EXPR for each element of LIST (locally setting $_ to each element)
2522 and returns the array value consisting of those elements for which the
2523 expression evaluated to true.
2524 In a scalar context, returns the number of times the expression was true.
2527 @foo = grep(!/^#/, @bar); # weed out comments
2530 Note that, since $_ is a reference into the array value, it can be
2531 used to modify the elements of the array.
2532 While this is useful and supported, it can cause bizarre results if
2533 the LIST is not a named array.
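The three behaviors described above (selection in an array context, counting in a scalar context, and modification through $_) can be seen side by side in this small sketch. The variable names and sample data are illustrative only.

```perl
@lines = ("# header", "alpha", "# note", "beta");

@code  = grep(!/^#/, @lines);    # array context: the matching elements
$count = grep(/^#/, @lines);     # scalar context: how many matched

# $_ refers to each element in turn, so a named array can be
# edited in place; the return value counts the substitutions.
@words = ("alpha", "beta");
$changed = grep(s/^/> /, @words);

print "@code\n$count comment lines\n@words\n";
```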
.Ip "hex(EXPR)" 8 4
Returns the decimal value of EXPR interpreted as a hex string.
2537 (To interpret strings that might start with 0 or 0x see oct().)
2538 If EXPR is omitted, uses $_.
2539 .Ip "index(STR,SUBSTR,POSITION)" 8 4
2540 .Ip "index(STR,SUBSTR)" 8 4
Returns the position of the first occurrence of SUBSTR in STR at or after
POSITION.
2543 If POSITION is omitted, starts searching from the beginning of the string.
2544 The return value is based at 0, or whatever you've
2545 set the $[ variable to.
2546 If the substring is not found, returns one less than the base, ordinarily \-1.
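The POSITION argument makes it easy to walk through every occurrence of a substring. The following is a sketch under the assumption that $[ has been left at its default of 0, so that "not found" is \-1; the name positions is hypothetical.

```perl
# Hypothetical helper: return every position at which SUBSTR occurs in STR.
# Assumes $[ is 0, so a failed search returns -1.
sub positions {
    local($str, $substr) = @_;
    local($pos, @found);
    return () if $substr eq '';      # an empty substring would never advance
    $pos = index($str, $substr);
    while ($pos >= 0) {
        push(@found, $pos);
        $pos = index($str, $substr, $pos + 1);   # resume just past the hit
    }
    return @found;
}

print join(' ', &positions("banana", "an")), "\n";
```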
.Ip "int(EXPR)" 8 4
Returns the integer portion of EXPR.
2550 If EXPR is omitted, uses $_.
2551 .Ip "ioctl(FILEHANDLE,FUNCTION,SCALAR)" 8 4
2552 Implements the ioctl(2) function.
2553 You'll probably have to say
2556 require "ioctl.ph"; # probably /usr/local/lib/perl/ioctl.ph
2559 first to get the correct function definitions.
If ioctl.ph doesn't exist or doesn't have the correct definitions you'll have to roll
your own, based on your C header files such as <sys/ioctl.h>.
2563 (There is a perl script called h2ph that comes with the perl kit
2564 which may help you in this.)
2565 SCALAR will be read and/or written depending on the FUNCTION\*(--a pointer
2566 to the string value of SCALAR will be passed as the third argument of
2567 the actual ioctl call.
2568 (If SCALAR has no string value but does have a numeric value, that value
2569 will be passed rather than a pointer to the string value.
2570 To guarantee this to be true, add a 0 to the scalar before using it.)
2571 The pack() and unpack() functions are useful for manipulating the values
2572 of structures used by ioctl().
2573 The following example sets the erase character to DEL.
	require 'ioctl.ph';
	$sgttyb_t = "ccccs";		# 4 chars and a short
	if (ioctl(STDIN,$TIOCGETP,$sgttyb)) {
		@ary = unpack($sgttyb_t,$sgttyb);
		$ary[2] = 127;		# erase character becomes DEL
		$sgttyb = pack($sgttyb_t,@ary);
		ioctl(STDIN,$TIOCSETP,$sgttyb)
			|| die "Can't ioctl: $!";
	}
2588 The return value of ioctl (and fcntl) is as follows:
2592 if OS returns:\h'|3i'perl returns:
2593 -1\h'|3i' undefined value
2594 0\h'|3i' string "0 but true"
2595 anything else\h'|3i' that number
2598 Thus perl returns true on success and false on failure, yet you can still
2599 easily determine the actual value returned by the operating system:
2602 ($retval = ioctl(...)) || ($retval = -1);
2603 printf "System returned %d\en", $retval;
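The mapping in the table above can also be undone by a small subroutine. This is a sketch only: os_value is a hypothetical name, and the values passed to it here are stand-ins for what ioctl or fcntl would return, so no real device is touched.

```perl
# Hypothetical helper: map perl's ioctl/fcntl return value back to the
# number the operating system returned.
sub os_value {
    local($retval) = @_;
    return -1 unless defined($retval);   # the OS returned -1
    return $retval + 0;                  # "0 but true" becomes 0
}

$retval = "0 but true";                  # what perl hands back for an OS 0
print "call succeeded\n" if $retval;     # the string is true...
printf "System returned %d\n", &os_value($retval);   # ...but numerically 0
```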
2605 .Ip "join(EXPR,LIST)" 8 8
2606 .Ip "join(EXPR,ARRAY)" 8
2607 Joins the separate strings of LIST or ARRAY into a single string with fields
2608 separated by the value of EXPR, and returns the string.
2613 $_ = join(\|\':\', $login,$passwd,$uid,$gid,$gcos,$home,$shell);
2623 .Ip "keys(ASSOC_ARRAY)" 8 6
2624 .Ip "keys ASSOC_ARRAY" 8
Returns a normal array consisting of all the keys of the named associative
array.
2627 The keys are returned in an apparently random order, but it is the same order
2628 as either the values() or each() function produces (given that the associative array
2629 has not been modified).
2630 Here is yet another way to print your environment:
	@keys = keys %ENV;
	@values = values %ENV;
	while ($#keys >= 0) {
		print pop(@keys), \'=\', pop(@values), "\en";
	}
2640 or how about sorted by key:
	foreach $key (sort(keys %ENV)) {
		print $key, \'=\', $ENV{$key}, "\en";
	}
2648 .Ip "kill(LIST)" 8 8
2650 Sends a signal to a list of processes.
2651 The first element of the list must be the signal to send.
2652 Returns the number of processes successfully signaled.
2655 $cnt = kill 1, $child1, $child2;
2659 If the signal is negative, kills process groups instead of processes.
2660 (On System V, a negative \fIprocess\fR number will also kill process groups,
2661 but that's not portable.)
2662 You may use a signal name in quotes.
2663 .Ip "last LABEL" 8 8
The \fIlast\fR command is like the \fIbreak\fR
statement in C (as used in loops); it immediately exits the loop in question.
If the LABEL is omitted, the command refers to the innermost enclosing loop.
The \fIcontinue\fR
block, if any, is not executed:
	line: while (<STDIN>) {
		last line if /\|^$/;	# exit when done with header
		.\|.\|.
	}
2683 .Ip "length(EXPR)" 8 4
2685 Returns the length in characters of the value of EXPR.
2686 If EXPR is omitted, returns length of $_.
2687 .Ip "link(OLDFILE,NEWFILE)" 8 2
2688 Creates a new filename linked to the old filename.
2689 Returns 1 for success, 0 otherwise.
2690 .Ip "listen(SOCKET,QUEUESIZE)" 8 2
2691 Does the same thing that the listen system call does.
2692 Returns true if it succeeded, false otherwise.
2693 See example in section on Interprocess Communication.
2694 .Ip "local(LIST)" 8 4
2695 Declares the listed variables to be local to the enclosing block,
2696 subroutine, eval or \*(L"do\*(R".
2697 All the listed elements must be legal lvalues.
2698 This operator works by saving the current values of those variables in LIST
2699 on a hidden stack and restoring them upon exiting the block, subroutine or eval.
2700 This means that called subroutines can also reference the local variable,
2701 but not the global one.
2702 The LIST may be assigned to if desired, which allows you to initialize
2703 your local variables.
2704 (If no initializer is given for a particular variable, it is created with
2705 an undefined value.)
2706 Commonly this is used to name the parameters to a subroutine.
2712 local($min, $max, $thunk) = @_;
2713 local($result) = \'\';
2716 # Presumably $thunk makes reference to $i
2718 for ($i = $min; $i < $max; $i++) {
2719 $result .= eval $thunk;
2726 if ($sw eq \'-v\') {
2727 # init local array with global array
2728 local(@ARGV) = @ARGV;
2729 unshift(@ARGV,\'echo\');
2735 # temporarily add to digits associative array
2737 # (NOTE: not claiming this is efficient!)
2738 local(%digits) = (%digits,'t',10,'e',11);
2743 Note that local() is a run-time command, and so gets executed every time
2744 through a loop, using up more stack storage each time until it's all
2745 released at once when the loop is exited.
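The statement that called subroutines see the local value, not the global one, can be checked with a short sketch. All names here ($level, show_level, with_local) are hypothetical.

```perl
$level = 'global';

sub show_level { return $level; }    # sees whichever value is current

sub with_local {
    local($level) = 'local';         # global value saved on the hidden stack
    return &show_level();            # the callee sees the local value
}                                    # global value restored on exit

print &with_local(), ' ', &show_level(), "\n";
```

This is the run-time (dynamic) scoping that the save-and-restore description implies: the variable is the same one throughout, and only its value is swapped for the duration of the block.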
2746 .Ip "localtime(EXPR)" 8 4
2747 .Ip "localtime EXPR" 8
2748 Converts a time as returned by the time function to a 9-element array with
2749 the time analyzed for the local timezone.
2750 Typically used as follows:
2755 ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
2763 All array elements are numeric, and come straight out of a struct tm.
In particular this means that $mon has the range 0.\|.11 and $wday has the
range 0.\|.6.
2766 If EXPR is omitted, does localtime(time).
.Ip "log(EXPR)" 8 4
Returns logarithm (base \fIe\fR) of EXPR.
2772 If EXPR is omitted, returns log of $_.
2773 .Ip "lstat(FILEHANDLE)" 8 6
2774 .Ip "lstat FILEHANDLE" 8
2776 .Ip "lstat SCALARVARIABLE" 8
2777 Does the same thing as the stat() function, but stats a symbolic link
2778 instead of the file the symbolic link points to.
2779 If symbolic links are unimplemented on your system, a normal stat is done.
2780 .Ip "m/PATTERN/gio" 8 4
2781 .Ip "/PATTERN/gio" 8
2782 Searches a string for a pattern match, and returns true (1) or false (\'\').
2783 If no string is specified via the =~ or !~ operator,
2784 the $_ string is searched.
2785 (The string specified with =~ need not be an lvalue\*(--it may be the result of an expression evaluation, but remember the =~ binds rather tightly.)
2786 See also the section on regular expressions.
2788 If / is the delimiter then the initial \*(L'm\*(R' is optional.
With the \*(L'm\*(R' you can use any pair of non-alphanumeric characters
as delimiters.
2791 This is particularly useful for matching Unix path names that contain \*(L'/\*(R'.
2792 If the final delimiter is followed by the optional letter \*(L'i\*(R', the matching is
2793 done in a case-insensitive manner.
2794 PATTERN may contain references to scalar variables, which will be interpolated
2795 (and the pattern recompiled) every time the pattern search is evaluated.
2796 (Note that $) and $| may not be interpolated because they look like end-of-string tests.)
2797 If you want such a pattern to be compiled only once, add an \*(L"o\*(R" after
2798 the trailing delimiter.
2799 This avoids expensive run-time recompilations, and
is useful when the value you are interpolating won't change over the
life of the script.
2802 If the PATTERN evaluates to a null string, the most recent successful
2803 regular expression is used instead.
2805 If used in a context that requires an array value, a pattern match returns an
array consisting of the subexpressions matched by the parentheses in the
pattern,
i.e. ($1, $2, $3.\|.\|.).
It does NOT actually set $1, $2, etc. in this case, nor does it set $+, $`, $&
or $\'.
If the match fails, a null array is returned.
If the match succeeds, but there were no parentheses, an array value of (1)
is returned.
2819 open(tty, \'/dev/tty\');
2820 <tty> \|=~ \|/\|^y\|/i \|&& \|do foo(\|); # do foo if desired
2822 if (/Version: \|*\|([0\-9.]*\|)\|/\|) { $version = $1; }
2824 next if m#^/usr/spool/uucp#;
2830 print if /$arg/o; # compile only once
2833 if (($F1, $F2, $Etc) = ($foo =~ /^(\eS+)\es+(\eS+)\es*(.*)/))
2836 This last example splits $foo into the first two words and the remainder
2837 of the line, and assigns those three fields to $F1, $F2 and $Etc.
The conditional is true if any variables were assigned, i.e. if the pattern
matched.
2841 The \*(L"g\*(R" modifier specifies global pattern matching\*(--that is,
2842 matching as many times as possible within the string. How it behaves
2843 depends on the context. In an array context, it returns a list of
2844 all the substrings matched by all the parentheses in the regular expression.
2845 If there are no parentheses, it returns a list of all the matched strings,
2846 as if there were parentheses around the whole pattern. In a scalar context,
2847 it iterates through the string, returning TRUE each time it matches, and
2848 FALSE when it eventually runs out of matches. (In other words, it remembers
2849 where it left off last time and restarts the search at that point.) It
2850 presumes that you have not modified the string since the last match.
2851 Modifying the string between matches may result in undefined behavior.
2852 (You can actually get away with in-place modifications via substr()
2853 that do not change the length of the entire string. In general, however,
2854 you should be using s///g for such modifications.) Examples:
2858 ($one,$five,$fifteen) = (\`uptime\` =~ /(\ed+\e.\ed+)/g);
2862 while ($paragraph = <>) {
2863 while ($paragraph =~ /[a-z][\'")]*[.!?]+[\'")]*\es/g) {
2867 print "$sentences\en";
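The two behaviors of the g modifier can be compared on a fixed string. The sketch below uses a literal in place of the output of uptime so that it does not depend on any external command; the variable names are illustrative.

```perl
$text = "load average: 0.12, 0.34, 0.56";

# Array context: all the parenthesized matches are returned at once.
($one, $five, $fifteen) = ($text =~ /(\d+\.\d+)/g);

# Scalar context: one match per evaluation, each search resuming
# where the previous one left off, until the pattern fails.
$count = 0;
while ($text =~ /\d+\.\d+/g) {
    $count++;
}

print "$one $five $fifteen ($count matches)\n";
```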
2870 .Ip "mkdir(FILENAME,MODE)" 8 3
2871 Creates the directory specified by FILENAME, with permissions specified by
2872 MODE (as modified by umask).
2873 If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno).
2874 .Ip "msgctl(ID,CMD,ARG)" 8 4
2875 Calls the System V IPC function msgctl. If CMD is &IPC_STAT, then ARG
2876 must be a variable which will hold the returned msqid_ds structure.
2877 Returns like ioctl: the undefined value for error, "0 but true" for
2878 zero, or the actual return value otherwise.
2879 .Ip "msgget(KEY,FLAGS)" 8 4
2880 Calls the System V IPC function msgget. Returns the message queue id,
2881 or the undefined value if there is an error.
2882 .Ip "msgsnd(ID,MSG,FLAGS)" 8 4
2883 Calls the System V IPC function msgsnd to send the message MSG to the
2884 message queue ID. MSG must begin with the long integer message type,
2885 which may be created with pack("L", $type). Returns true if
2886 successful, or false if there is an error.
2887 .Ip "msgrcv(ID,VAR,SIZE,TYPE,FLAGS)" 8 4
2888 Calls the System V IPC function msgrcv to receive a message from
2889 message queue ID into variable VAR with a maximum message size of
2890 SIZE. Note that if a message is received, the message type will be
2891 the first thing in VAR, and the maximum length of VAR is SIZE plus the
size of the message type. Returns true if successful, or false if
there is an error.
2894 .Ip "next LABEL" 8 8
The \fInext\fR command is like the \fIcontinue\fR
statement in C; it starts the next iteration of the loop:
	line: while (<STDIN>) {
		next line if /\|^#/;	# discard comments
		.\|.\|.
	}
Note that if there were a \fIcontinue\fR
block on the above, it would get executed even on discarded lines.
2913 If the LABEL is omitted, the command refers to the innermost enclosing loop.
.Ip "oct(EXPR)" 8 4
Returns the decimal value of EXPR interpreted as an octal string.
2917 (If EXPR happens to start off with 0x, interprets it as a hex string instead.)
2918 The following will handle decimal, octal and hex in the standard notation:
2921 $val = oct($val) if $val =~ /^0/;
2924 If EXPR is omitted, uses $_.
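The one-line idiom above can be packaged as a subroutine for converting user-supplied numbers. This is a minimal sketch; to_number is a hypothetical name, and the input is assumed to be a well-formed decimal, octal or hex literal.

```perl
# Hypothetical helper: accept decimal, octal (leading 0) or hex (leading 0x).
sub to_number {
    local($val) = @_;
    $val = oct($val) if $val =~ /^0/;    # oct() also handles the 0x form
    return $val + 0;
}

print &to_number("0x1f"), ' ', &to_number("017"), ' ', &to_number("42"), "\n";
```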
2925 .Ip "open(FILEHANDLE,EXPR)" 8 8
2926 .Ip "open(FILEHANDLE)" 8
2927 .Ip "open FILEHANDLE" 8
Opens the file whose filename is given by EXPR, and associates it with
FILEHANDLE.
2930 If FILEHANDLE is an expression, its value is used as the name of the
2931 real filehandle wanted.
2932 If EXPR is omitted, the scalar variable of the same name as the FILEHANDLE
2933 contains the filename.
If the filename begins with \*(L"<\*(R" or nothing, the file is opened for
input.
2936 If the filename begins with \*(L">\*(R", the file is opened for output.
2937 If the filename begins with \*(L">>\*(R", the file is opened for appending.
2938 (You can put a \'+\' in front of the \'>\' or \'<\' to indicate that you
2939 want both read and write access to the file.)
2940 If the filename begins with \*(L"|\*(R", the filename is interpreted
2941 as a command to which output is to be piped, and if the filename ends
with a \*(L"|\*(R", the filename is interpreted as a command which pipes
input to us.
2944 (You may not have a command that pipes both in and out.)
Opening \'\-\' opens \fISTDIN\fR
and opening \'>\-\' opens \fISTDOUT\fR.
Open returns non-zero upon success, the undefined value otherwise.
If the open involved a pipe, the return value happens to be the pid
of the subprocess.
2957 open article || die "Can't find article $article: $!\en";
2958 while (<article>) {\|.\|.\|.
2961 open(LOG, \'>>/usr/spool/news/twitlog\'\|); # (log is reserved)
2969 open(article, "caesar <$article |"\|); # decrypt article
2977 open(extract, "|sort >/tmp/Tmp$$"\|); # $$ is our process#
2985 # process argument list of files along with any includes
2987 foreach $file (@ARGV) {
2988 do process($file, \'fh00\'); # no pun intended
2992 local($filename, $input) = @_;
2993 $input++; # this is a string increment
2994 unless (open($input, $filename)) {
2995 print STDERR "Can't open $filename: $!\en";
2999 while (<$input>) { # note the use of indirection
3004 if (/^#include "(.*)"/) {
3005 do process($1, $input);
3013 You may also, in the Bourne shell tradition, specify an EXPR beginning
3014 with \*(L">&\*(R", in which case the rest of the string
3015 is interpreted as the name of a filehandle
3016 (or file descriptor, if numeric) which is to be duped and opened.
3017 You may use & after >, >>, <, +>, +>> and +<.
3018 The mode you specify should match the mode of the original filehandle.
Here is a script that saves, redirects, and restores
\fISTDOUT\fR and \fISTDERR\fR:
3027 open(SAVEOUT, ">&STDOUT");
3028 open(SAVEERR, ">&STDERR");
3030 open(STDOUT, ">foo.out") || die "Can't redirect stdout";
3031 open(STDERR, ">&STDOUT") || die "Can't dup stdout";
3033 select(STDERR); $| = 1; # make unbuffered
3034 select(STDOUT); $| = 1; # make unbuffered
3036 print STDOUT "stdout 1\en"; # this works for
3037 print STDERR "stderr 1\en"; # subprocesses too
3042 open(STDOUT, ">&SAVEOUT");
3043 open(STDERR, ">&SAVEERR");
3045 print STDOUT "stdout 2\en";
3046 print STDERR "stderr 2\en";
3049 If you open a pipe on the command \*(L"\-\*(R", i.e. either \*(L"|\-\*(R" or \*(L"\-|\*(R",
3050 then there is an implicit fork done, and the return value of open
is the pid of the child within the parent process, and 0 within the child
process.
3053 (Use defined($pid) to determine if the open was successful.)
3054 The filehandle behaves normally for the parent, but i/o to that
filehandle is piped from/to the
\fISTDOUT\fR/\fISTDIN\fR
of the child process.
In the child process the filehandle isn't opened\*(--i/o happens from/to
the new \fISTDOUT\fR or \fISTDIN\fR.
3063 Typically this is used like the normal piped open when you want to exercise
3064 more control over just how the pipe command gets executed, such as when
you are running setuid, and don't want to have to scan shell commands
for metacharacters.
3067 The following pairs are more or less equivalent:
3071 open(FOO, "|tr \'[a\-z]\' \'[A\-Z]\'");
3072 open(FOO, "|\-") || exec \'tr\', \'[a\-z]\', \'[A\-Z]\';
3074 open(FOO, "cat \-n '$file'|");
3075 open(FOO, "\-|") || exec \'cat\', \'\-n\', $file;
3078 Explicitly closing any piped filehandle causes the parent process to wait for the
3079 child to finish, and returns the status value in $?.
3080 Note: on any operation which may do a fork,
3081 unflushed buffers remain unflushed in both
3082 processes, which means you may need to set $| to
3083 avoid duplicate output.
The filename that is passed to open will have leading and trailing
whitespace deleted.
3087 In order to open a file with arbitrary weird characters in it, it's necessary
3088 to protect any leading and trailing whitespace thusly:
3092 $file =~ s#^(\es)#./$1#;
3093 open(FOO, "< $file\e0");
3096 .Ip "opendir(DIRHANDLE,EXPR)" 8 3
3097 Opens a directory named EXPR for processing by readdir(), telldir(), seekdir(),
3098 rewinddir() and closedir().
3099 Returns true if successful.
3100 DIRHANDLEs have their own namespace separate from FILEHANDLEs.
.Ip "ord(EXPR)" 8 4
Returns the numeric ascii value of the first character of EXPR.
3104 If EXPR is omitted, uses $_.
3105 ''' Comments on f & d by gnb@melba.bby.oz.au 22/11/89
3106 .Ip "pack(TEMPLATE,LIST)" 8 4
3107 Takes an array or list of values and packs it into a binary structure,
3108 returning the string containing the structure.
3109 The TEMPLATE is a sequence of characters that give the order and type
3110 of values, as follows:
3113 A An ascii string, will be space padded.
3114 a An ascii string, will be null padded.
3115 c A signed char value.
3116 C An unsigned char value.
3117 s A signed short value.
3118 S An unsigned short value.
3119 i A signed integer value.
3120 I An unsigned integer value.
3121 l A signed long value.
3122 L An unsigned long value.
3123 n A short in \*(L"network\*(R" order.
3124 N A long in \*(L"network\*(R" order.
3125 f A single-precision float in the native format.
3126 d A double-precision float in the native format.
3127 p A pointer to a string.
3128 v A short in \*(L"VAX\*(R" (little-endian) order.
3129 V A long in \*(L"VAX\*(R" (little-endian) order.
	x	A null byte.
	X	Back up a byte.
3132 @ Null fill to absolute position.
3133 u A uuencoded string.
3134 b A bit string (ascending bit order, like vec()).
3135 B A bit string (descending bit order).
3136 h A hex string (low nybble first).
3137 H A hex string (high nybble first).
Each letter may optionally be followed by a number which gives a repeat
count.
With all types except "a", "A", "b", "B", "h" and "H",
the pack function will gobble up that many values
from the LIST.
A * for the repeat count means to use however many items are left.
The "a" and "A" types gobble just one value, but pack it as a string of length
count,
padding with nulls or spaces as necessary.
3149 (When unpacking, "A" strips trailing spaces and nulls, but "a" does not.)
3150 Likewise, the "b" and "B" fields pack a string that many bits long.
3151 The "h" and "H" fields pack a string that many nybbles long.
3152 Real numbers (floats and doubles) are in the native machine format
3153 only; due to the multiplicity of floating formats around, and the lack
3154 of a standard \*(L"network\*(R" representation, no facility for
3155 interchange has been made.
3156 This means that packed floating point data
3157 written on one machine may not be readable on another - even if both
3158 use IEEE floating point arithmetic (as the endian-ness of the memory
3159 representation is not part of the IEEE spec).
Note that perl uses
doubles internally for all numeric calculation, and converting from
3162 double -> float -> double will lose precision (i.e. unpack("f",
3163 pack("f", $foo)) will not in general equal $foo).
3168 $foo = pack("cccc",65,66,67,68);
3170 $foo = pack("c4",65,66,67,68);
3173 $foo = pack("ccxxcc",65,66,67,68);
3174 # foo eq "AB\e0\e0CD"
3176 $foo = pack("s2",1,2);
3177 # "\e1\e0\e2\e0" on little-endian
3178 # "\e0\e1\e0\e2" on big-endian
3180 $foo = pack("a4","abcd","x","y","z");
3183 $foo = pack("aaaa","abcd","x","y","z");
3186 $foo = pack("a14","abcdefg");
3187 # "abcdefg\e0\e0\e0\e0\e0\e0\e0"
3189 $foo = pack("i9pl", gmtime);
3190 # a real struct tm (on my system anyway)
3193 unpack("N", pack("B32", substr("0" x 32 . shift, -32)));
3196 The same template may generally also be used in the unpack function.
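As a hedged illustration of using one template in both directions, the sketch below packs a small record and unpacks it again, then wraps the bit-string conversion shown in the examples as a subroutine. The record layout and the name bin2dec are chosen for illustration only.

```perl
# A made-up record: 3-byte space-padded name, network-order short, one byte.
$template = "A3 n C";
$record = pack($template, "hi", 258, 65);
($name, $port, $flag) = unpack($template, $record);   # "A" strips the padding

# Convert a string of binary digits (at most 32) to a number by
# left-padding it with zeros to 32 bits and reading it back as a
# network-order long.
sub bin2dec {
    local($bits) = @_;
    return unpack("N", pack("B32", substr("0" x 32 . $bits, -32)));
}

print "$name $port $flag ", &bin2dec("1010"), "\n";
```

Only the fixed-order types ("n", "N", "v", "V") and the byte and string types give the same bytes on every machine; "s", "i" and "l" follow the native byte order.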
3197 .Ip "pipe(READHANDLE,WRITEHANDLE)" 8 3
3198 Opens a pair of connected pipes like the corresponding system call.
3199 Note that if you set up a loop of piped processes, deadlock can occur
3200 unless you are very careful.
3201 In addition, note that perl's pipes use stdio buffering, so you may need
to set $| to flush your WRITEHANDLE after each command, depending on
the application.
3204 [Requires version 3.0 patchlevel 9.]
.Ip "pop(ARRAY)" 8
Pops and returns the last value of the array, shortening the array by 1.
3208 Has the same effect as
3211 $tmp = $ARRAY[$#ARRAY\-\|\-];
3214 If there are no elements in the array, returns the undefined value.
3215 .Ip "print(FILEHANDLE LIST)" 8 10
3217 .Ip "print FILEHANDLE LIST" 8
3220 Prints a string or a comma-separated list of strings.
3221 Returns non-zero if successful.
3222 FILEHANDLE may be a scalar variable name, in which case the variable contains
3223 the name of the filehandle, thus introducing one level of indirection.
3224 (NOTE: If FILEHANDLE is a variable and the next token is a term, it may be
3225 misinterpreted as an operator unless you interpose a + or put parens around all the arguments.)
3227 If FILEHANDLE is omitted, prints by default to standard output (or to the
3228 last selected output channel\*(--see select()).
3229 If LIST is also omitted, prints $_ to STDOUT.
3231 To set the default output channel to something other than STDOUT,
3233 use the select operation.
3234 Note that, because print takes a LIST, anything in the LIST is evaluated
3235 in an array context, and any subroutine that you call will have one or more
3236 of its expressions evaluated in an array context.
3237 Also be careful not to follow the print keyword with a left parenthesis
3238 unless you want the corresponding right parenthesis to terminate the
3239 arguments to the print\*(--interpose a + or put parens around all the arguments.
3240 .Ip "printf(FILEHANDLE LIST)" 8 10
3241 .Ip "printf(LIST)" 8
3242 .Ip "printf FILEHANDLE LIST" 8
3244 Equivalent to a \*(L"print FILEHANDLE sprintf(LIST)\*(R".
3245 .Ip "push(ARRAY,LIST)" 8 7
3246 Treats ARRAY (@ is optional) as a stack, and pushes the values of LIST
3247 onto the end of ARRAY.
3248 The length of ARRAY increases by the length of LIST.
3249 Has the same effect as
3252 for $value (LIST) {
3253     $ARRAY[++$#ARRAY] = $value;
3254 }
3257 but is more efficient.
3258 .Ip "q/STRING/" 8 5
3259 .Ip "qq/STRING/" 8
3260 .Ip "qx/STRING/" 8
3261 These are not really functions, but simply syntactic sugar to let you
3262 avoid putting too many backslashes into quoted strings.
3263 The q operator is a generalized single quote, and the qq operator a
3264 generalized double quote.
3265 The qx operator is a generalized backquote.
3266 Any non-alphanumeric delimiter can be used in place of /, including newline.
3267 If the delimiter is an opening bracket or parenthesis, the final delimiter
3268 will be the corresponding closing bracket or parenthesis.
3269 (Embedded occurrences of the closing bracket need to be backslashed as usual.)
3274 $foo = q!I said, "You said, \'She said it.\'"!;
3275 $bar = q(\'This is it.\');
3276 $today = qx{ date };
3277 $_ .= qq
3278 *** The previous line contains the naughty word "$&".\en
3279 if /(ibm|apple|awk)/; # :-)
3282 .Ip "rand(EXPR)" 8 8
3285 Returns a random fractional number between 0 and the value of EXPR.
3286 (EXPR should be positive.)
3287 If EXPR is omitted, returns a value between 0 and 1.
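A common use is picking a random integer.
The sketch below assumes rand returns a fraction that is at least 0 and strictly less than EXPR; the fixed seed is there only to make runs repeatable.

```perl
srand(42);                     # fixed seed, for repeatable runs only
$roll = int(rand(6)) + 1;      # an integer from 1 through 6
$frac = rand;                  # a fraction from 0 up to (not including) 1
print "$roll $frac\n";
```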
3289 .Ip "read(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
3290 .Ip "read(FILEHANDLE,SCALAR,LENGTH)" 8 5
3291 Attempts to read LENGTH bytes of data into variable SCALAR from the specified FILEHANDLE.
3293 Returns the number of bytes actually read, or undef if there was an error.
3294 SCALAR will be grown or shrunk to the length actually read.
3295 An OFFSET may be specified to place the read data at some other place
3296 than the beginning of the string.
3297 This call is actually implemented in terms of stdio's fread call. To get
3298 a true read system call, see sysread.
3299 .Ip "readdir(DIRHANDLE)" 8 3
3300 .Ip "readdir DIRHANDLE" 8
3301 Returns the next directory entry for a directory opened by opendir().
3302 If used in an array context, returns all the rest of the entries in the directory.
3304 If there are no more entries, returns an undefined value in a scalar context
3305 or a null list in an array context.
3306 .Ip "readlink(EXPR)" 8 6
3307 .Ip "readlink EXPR" 8
3308 Returns the value of a symbolic link, if symbolic links are implemented.
3309 If not, gives a fatal error.
3310 If there is some system error, returns the undefined value and sets $! (errno).
3311 If EXPR is omitted, uses $_.
3312 .Ip "recv(SOCKET,SCALAR,LEN,FLAGS)" 8 4
3313 Receives a message on a socket.
3314 Attempts to receive LEN bytes of data into variable SCALAR from the specified SOCKET filehandle.
3316 Returns the address of the sender, or the undefined value if there's an error.
3317 SCALAR will be grown or shrunk to the length actually read.
3318 Takes the same flags as the system call of the same name.
3319 .Ip "redo LABEL" 8 8
3323 The redo command restarts the loop block without evaluating the conditional again.
3326 The continue block, if any, is not executed.
3327 If the LABEL is omitted, the command refers to the innermost enclosing loop.
3328 This command is normally used by programs that want to lie to themselves
3329 about what was just input:
3333 # a simpleminded Pascal comment stripper
3334 # (warning: assumes no { or } in strings)
3335 line: while (<STDIN>) {
3336 while (s|\|({.*}.*\|){.*}|$1 \||) {}
3341 if (\|/\|}/\|) { # end of comment?
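A smaller, self-contained sketch of the same idea follows.
It joins continuation lines (those ending in a backslash) and uses redo to re-examine the joined line.
The array @lines stands in for an input file, and all names are illustrative.

```perl
@lines = ("first \\\n", "second\n", "third\n");
@out = ();
line: while (defined($_ = shift(@lines))) {
    if (s/\\\n$//) {             # line ends in a backslash?
        $_ .= shift(@lines);     # glue on the next line
        redo line;               # recheck $_ without fetching a new one
    }
    push(@out, $_);
}
print @out;
```

Because redo skips the loop conditional, the modified $_ is not overwritten by the next shift.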
3351 .Ip "rename(OLDNAME,NEWNAME)" 8 2
3352 Changes the name of a file.
3353 Returns 1 for success, 0 otherwise.
3354 Will not work across filesystem boundaries.
3355 .Ip "require(EXPR)" 8 6
3356 .Ip "require EXPR" 8
3358 Includes the library file specified by EXPR, or by $_ if EXPR is not supplied.
3359 Has semantics similar to the following subroutine:
3362 sub require {
3363     local($filename) = @_;
3364     return 1 if $INC{$filename};
3365     local($realfilename,$result);
3366     ITER: {
3367         foreach $prefix (@INC) {
3368             $realfilename = "$prefix/$filename";
3369             if (-f $realfilename) {
3370                 $result = do $realfilename;
3371                 last ITER;
3372             }
3373         }
3374         die "Can't find $filename in \e@INC";
3375     }
3376     die $@ if $@;
3377     die "$filename did not return true value" unless $result;
3378     $INC{$filename} = $realfilename;
3379     $result;
3380 }
3383 Note that the file will not be included twice under the same specified name.
3384 The file must return true as the last statement to indicate successful
3385 execution of any initialization code, so it's customary to end
3386 such a file with \*(L"1;\*(R" unless you're sure it'll return true otherwise.
3387 .Ip "reset(EXPR)" 8 6
3392 Generally used in a continue block at the end of a loop to clear variables and reset ?? searches
3393 so that they work again.
3394 The expression is interpreted as a list of single characters (hyphens allowed for ranges).
3396 All variables and arrays beginning with one of those letters are reset to
3397 their pristine state.
3398 If the expression is omitted, one-match searches (?pattern?) are reset to match again.
3400 Only resets variables or searches in the current package.
3406 reset \'X\'; \h'|2i'# reset all X variables
3407 reset \'a\-z\';\h'|2i'# reset lower case variables
3408 reset; \h'|2i'# just reset ?? searches
3411 Note: resetting \*(L"A\-Z\*(R" is not recommended since you'll wipe out your ARGV and ENV arrays.
3414 The use of reset on dbm associative arrays does not change the dbm file.
3415 (It does, however, flush any entries cached by perl, which may be useful if
3416 you are sharing the dbm file.
3417 Then again, maybe not.)
3418 .Ip "return LIST" 8 3
3419 Returns from a subroutine with the value specified.
3420 (Note that a subroutine can automatically return
3421 the value of the last expression evaluated.
3422 That's the preferred method\*(--use of an explicit return is a bit slower.)
3425 .Ip "reverse(LIST)" 8 4
3426 .Ip "reverse LIST" 8
3427 In an array context, returns an array value consisting of the elements
3428 of LIST in the opposite order.
3429 In a scalar context, returns a string value consisting of the bytes of
3430 the first element of LIST in the opposite order.
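The two contexts can be compared side by side.
This is a minimal sketch with illustrative variable names; the scalar case is given a single argument.

```perl
@list = reverse(1, 2, 3);      # array context: (3, 2, 1)
$str  = reverse("hello");      # scalar context: "olleh"
print "@list $str\n";
```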
3431 .Ip "rewinddir(DIRHANDLE)" 8 5
3432 .Ip "rewinddir DIRHANDLE" 8
3433 Sets the current position to the beginning of the directory for the readdir() routine on DIRHANDLE.
3434 .Ip "rindex(STR,SUBSTR,POSITION)" 8 6
3435 .Ip "rindex(STR,SUBSTR)" 8 4
3436 Works just like index except that it
3437 returns the position of the LAST occurrence of SUBSTR in STR.
3438 If POSITION is specified, returns the last occurrence at or before that position.
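For instance (a sketch assuming $[ is left at its default of 0, with illustrative names):

```perl
$str = "hello world";
$last   = rindex($str, "o");       # 7: the "o" in "world"
$before = rindex($str, "o", 6);    # 4: last "o" at or before offset 6
$none   = rindex($str, "z");       # -1: no occurrence at all
print "$last $before $none\n";
```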
3440 .Ip "rmdir(FILENAME)" 8 4
3441 .Ip "rmdir FILENAME" 8
3442 Deletes the directory specified by FILENAME if it is empty.
3443 If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno).
3444 If FILENAME is omitted, uses $_.
3445 .Ip "s/PATTERN/REPLACEMENT/gieo" 8 3
3446 Searches a string for a pattern, and if found, replaces that pattern with the
3447 replacement text and returns the number of substitutions made.
3448 Otherwise it returns false (0).
3449 The \*(L"g\*(R" is optional, and if present, indicates that all occurrences
3450 of the pattern are to be replaced.
3451 The \*(L"i\*(R" is also optional, and if present, indicates that matching
3452 is to be done in a case-insensitive manner.
3453 The \*(L"e\*(R" is likewise optional, and if present, indicates that
3454 the replacement string is to be evaluated as an expression rather than just
3455 as a double-quoted string.
3456 Any non-alphanumeric delimiter may replace the slashes;
3457 if single quotes are used, no
3458 interpretation is done on the replacement string (the e modifier overrides
3459 this, however); if backquotes are used, the replacement string is a command
3460 to execute whose output will be used as the actual replacement text.
3461 If the PATTERN is delimited by bracketing quotes, the REPLACEMENT
3462 has its own pair of quotes, which may or may not be bracketing quotes, e.g.
3463 s(foo)(bar) or s<foo>/bar/.
3464 If no string is specified via the =~ or !~ operator,
3465 the $_ string is searched and modified.
3466 (The string specified with =~ must be a scalar variable, an array element,
3467 or an assignment to one of those, i.e. an lvalue.)
3468 If the pattern contains a $ that looks like a variable rather than an
3469 end-of-string test, the variable will be interpolated into the pattern at run-time.
3471 If you only want the pattern compiled once the first time the variable is
3472 interpolated, add an \*(L"o\*(R" at the end.
3473 If the PATTERN evaluates to a null string, the most recent successful
3474 regular expression is used instead.
3475 See also the section on regular expressions.
3479 s/\|\e\|bgreen\e\|b/mauve/g; # don't change wintergreen
3481 $path \|=~ \|s|\|/usr/bin|\|/usr/local/bin|;
3483 s/Login: $foo/Login: $bar/; # run-time pattern
3485 ($foo = $bar) =~ s/bar/foo/;
3487 $_ = \'abc123xyz\';
3488 s/\ed+/$&*2/e; # yields \*(L'abc246xyz\*(R'
3489 s/\ed+/sprintf("%5d",$&)/e; # yields \*(L'abc 246xyz\*(R'
3490 s/\ew/$& x 2/eg; # yields \*(L'aabbcc 224466xxyyzz\*(R'
3492 s/\|([^ \|]*\|) *\|([^ \|]*\|)\|/\|$2 $1/; # reverse 1st two fields
3495 (Note the use of $ instead of \|\e\| in the last example. See section
3496 on regular expressions.)
3497 .Ip "scalar(EXPR)" 8 3
3498 Forces EXPR to be interpreted in a scalar context and returns the value of EXPR.
3500 .Ip "seek(FILEHANDLE,POSITION,WHENCE)" 8 3
3501 Randomly positions the file pointer for FILEHANDLE, just like the fseek() call of stdio.
3503 FILEHANDLE may be an expression whose value gives the name of the filehandle.
3504 Returns 1 upon success, 0 otherwise.
3505 .Ip "seekdir(DIRHANDLE,POS)" 8 3
3506 Sets the current position for the readdir() routine on DIRHANDLE.
3507 POS must be a value returned by telldir().
3508 Has the same caveats about possible directory compaction as the corresponding
3509 system library routine.
3510 .Ip "select(FILEHANDLE)" 8 3
3512 Returns the currently selected filehandle.
3513 Sets the current default filehandle for output, if FILEHANDLE is supplied.
3514 This has two effects: first, a write or a print
3518 without a filehandle will default to this FILEHANDLE.
3519 Second, references to variables related to output will refer to this output channel.
3521 For example, if you have to set the top of form format for more than
3522 one output channel, you might do the following:
3526 select(REPORT1);
3527 $^ = \'report1_top\';
3528 select(REPORT2);
3529 $^ = \'report2_top\';
3532 FILEHANDLE may be an expression whose value gives the name of the actual filehandle.
3536 $oldfh = select(STDERR); $| = 1; select($oldfh);
3539 .Ip "select(RBITS,WBITS,EBITS,TIMEOUT)" 8 3
3540 This calls the select system call with the bitmasks specified, which can
3541 be constructed using fileno() and vec(), along these lines:
3544 $rin = $win = $ein = '';
3545 vec($rin,fileno(STDIN),1) = 1;
3546 vec($win,fileno(STDOUT),1) = 1;
3550 If you want to select on many filehandles you might wish to write a subroutine:
3553 sub fhbits {
3554     local(@fhlist) = split(' ',$_[0]); local($bits);
3557     for (@fhlist) { vec($bits,fileno($_),1) = 1; }
3559     $bits;
3560 }
3561 $rin = &fhbits('STDIN TTY SOCK');
3567 ($nfound,$timeleft) =
3568 select($rout=$rin, $wout=$win, $eout=$ein, $timeout);
3570 or to block until something becomes ready:
3573 $nfound = select($rout=$rin, $wout=$win, $eout=$ein, undef);
3581 Any of the bitmasks can also be undef.
3582 The timeout, if specified, is in seconds, which may be fractional.
3583 NOTE: not all implementations are capable of returning the $timeleft.
3584 If not, they always return $timeleft equal to the supplied $timeout.
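Two small uses can be sketched as follows: building a read mask for one filehandle, and calling select with all three masks undefined so that it merely waits for the timeout.
The 0.25 second value is arbitrary, and whether fractional timeouts are honored depends on the system.

```perl
# Read mask with only the bit for STDIN turned on.
$rin = '';
vec($rin, fileno(STDIN), 1) = 1;

# No handles to watch: this acts as a sub-second sleep.
$nfound = select(undef, undef, undef, 0.25);
print "STDIN bit set\n" if vec($rin, fileno(STDIN), 1);
```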
3585 .Ip "semctl(ID,SEMNUM,CMD,ARG)" 8 4
3586 Calls the System V IPC function semctl. If CMD is &IPC_STAT or
3587 &GETALL, then ARG must be a variable which will hold the returned
3588 semid_ds structure or semaphore value array. Returns like ioctl: the
3589 undefined value for error, "0 but true" for zero, or the actual return value otherwise.
3591 .Ip "semget(KEY,NSEMS,SIZE,FLAGS)" 8 4
3592 Calls the System V IPC function semget. Returns the semaphore id, or
3593 the undefined value if there is an error.
3594 .Ip "semop(KEY,OPSTRING)" 8 4
3595 Calls the System V IPC function semop to perform semaphore operations
3596 such as signaling and waiting. OPSTRING must be a packed array of
3597 semop structures. Each semop structure can be generated with
3598 \&'pack("sss", $semnum, $semop, $semflag)'. The number of semaphore
3599 operations is implied by the length of OPSTRING. Returns true if
3600 successful, or false if there is an error. As an example, the
3601 following code waits on semaphore $semnum of semaphore id $semid:
3604 $semop = pack("sss", $semnum, -1, 0);
3605 die "Semaphore trouble: $!\en" unless semop($semid, $semop);
3608 To signal the semaphore, replace "-1" with "1".
3609 .Ip "send(SOCKET,MSG,FLAGS,TO)" 8 4
3610 .Ip "send(SOCKET,MSG,FLAGS)" 8
3611 Sends a message on a socket.
3612 Takes the same flags as the system call of the same name.
3613 On unconnected sockets you must specify a destination to send TO.
3614 Returns the number of characters sent, or the undefined value if there is an error.
3616 .Ip "setpgrp(PID,PGRP)" 8 4
3617 Sets the current process group for the specified PID, 0 for the current process.
3619 Will produce a fatal error if used on a machine that doesn't implement setpgrp(2).
3621 .Ip "setpriority(WHICH,WHO,PRIORITY)" 8 4
3622 Sets the current priority for a process, a process group, or a user.
3623 (See setpriority(2).)
3624 Will produce a fatal error if used on a machine that doesn't implement setpriority(2).
3626 .Ip "setsockopt(SOCKET,LEVEL,OPTNAME,OPTVAL)" 8 3
3627 Sets the socket option requested.
3628 Returns undefined if there is an error.
3629 OPTVAL may be specified as undef if you don't want to pass an argument.
3630 .Ip "shift(ARRAY)" 8 6
3633 Shifts the first value of the array off and returns it,
3634 shortening the array by 1 and moving everything down.
3635 If there are no elements in the array, returns the undefined value.
3636 If ARRAY is omitted, shifts the @ARGV array in the main program, and the @_
3637 array in subroutines.
3638 (This is determined lexically.)
3639 See also unshift(), push() and pop().
3640 Shift() and unshift() do the same thing to the left end of an array that push()
3641 and pop() do to the right end.
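The four operations can be seen together in a short sketch (the array contents are illustrative):

```perl
@queue = ('a', 'b', 'c');
$first = shift(@queue);        # 'a'; @queue is now ('b', 'c')
unshift(@queue, 'z');          # @queue is now ('z', 'b', 'c')
push(@queue, 'd');             # @queue is now ('z', 'b', 'c', 'd')
$last = pop(@queue);           # 'd'; @queue is now ('z', 'b', 'c')
print "$first $last @queue\n";
```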
3642 .Ip "shmctl(ID,CMD,ARG)" 8 4
3643 Calls the System V IPC function shmctl. If CMD is &IPC_STAT, then ARG
3644 must be a variable which will hold the returned shmid_ds structure.
3645 Returns like ioctl: the undefined value for error, "0 but true" for
3646 zero, or the actual return value otherwise.
3647 .Ip "shmget(KEY,SIZE,FLAGS)" 8 4
3648 Calls the System V IPC function shmget. Returns the shared memory
3649 segment id, or the undefined value if there is an error.
3650 .Ip "shmread(ID,VAR,POS,SIZE)" 8 4
3651 .Ip "shmwrite(ID,STRING,POS,SIZE)" 8
3652 Reads or writes the System V shared memory segment ID starting at
3653 position POS for size SIZE by attaching to it, copying in/out, and
3654 detaching from it. When reading, VAR must be a variable which
3655 will hold the data read. When writing, if STRING is too long,
3656 only SIZE bytes are used; if STRING is too short, nulls are
3657 written to fill out SIZE bytes. Return true if successful, or
3658 false if there is an error.
3659 .Ip "shutdown(SOCKET,HOW)" 8 3
3660 Shuts down a socket connection in the manner indicated by HOW, which has
3661 the same interpretation as in the system call of the same name.
3662 .Ip "sin(EXPR)" 8 4
3663 .Ip "sin EXPR" 8
3664 Returns the sine of EXPR (expressed in radians).
3665 If EXPR is omitted, returns sine of $_.
3666 .Ip "sleep(EXPR)" 8 6
3669 Causes the script to sleep for EXPR seconds, or forever if no EXPR.
3670 May be interrupted by sending the process a SIGALRM.
3671 Returns the number of seconds actually slept.
3672 You probably cannot mix alarm() and sleep() calls, since sleep() is
3673 often implemented using alarm().
3674 .Ip "socket(SOCKET,DOMAIN,TYPE,PROTOCOL)" 8 3
3675 Opens a socket of the specified kind and attaches it to filehandle SOCKET.
3676 DOMAIN, TYPE and PROTOCOL are specified the same as for the system call of the same name.
3678 You may need to run h2ph on sys/socket.h to get the proper values handy
3679 in a perl library file.
3680 Returns true if successful.
3681 See the example in the section on Interprocess Communication.
3682 .Ip "socketpair(SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL)" 8 3
3683 Creates an unnamed pair of sockets in the specified domain, of the specified type.
3685 DOMAIN, TYPE and PROTOCOL are specified the same as for the system call of the same name.
3687 If unimplemented, yields a fatal error.
3688 Returns true if successful.
3689 .Ip "sort(SUBROUTINE LIST)" 8 9
3691 .Ip "sort SUBROUTINE LIST" 8
3692 .Ip "sort BLOCK LIST" 8
3694 Sorts the LIST and returns the sorted array value.
3695 Nonexistent values of arrays are stripped out.
3696 If SUBROUTINE or BLOCK is omitted, sorts in standard string comparison order.
3697 If SUBROUTINE is specified, gives the name of a subroutine that returns
3698 an integer less than, equal to, or greater than 0,
3699 depending on how the elements of the array are to be ordered.
3700 (The <=> and cmp operators are extremely useful in such routines.)
3701 SUBROUTINE may be a scalar variable name, in which case the value provides
3702 the name of the subroutine to use.
3703 In place of a SUBROUTINE name, you can provide a BLOCK as an anonymous,
3704 in-line sort subroutine.
3706 In the interests of efficiency the normal calling code for subroutines
3707 is bypassed, with the following effects: the subroutine may not be a recursive
3708 subroutine, and the two elements to be compared are passed into the subroutine
3709 not via @_ but as $a and $b (see example below).
3710 They are passed by reference so don't modify $a and $b.
3717 @articles = sort @files;
3720 # same thing, but with explicit sort routine
3721 @articles = sort {$a cmp $b} @files;
3724 # same thing in reversed order
3725 @articles = sort {$b cmp $a} @files;
3728 # sort numerically ascending
3729 @articles = sort {$a <=> $b} @files;
3732 # sort numerically descending
3733 @articles = sort {$b <=> $a} @files;
3736 # sort using explicit subroutine name
3737 sub byage {
3738     $age{$a} <=> $age{$b}; # presuming integers
3739 }
3740 @sortedclass = sort byage @class;
3743 sub reverse { $b cmp $a; }
3744 @harry = (\'dog\',\'cat\',\'x\',\'Cain\',\'Abel\');
3745 @george = (\'gone\',\'chased\',\'yz\',\'Punished\',\'Axed\');
3746 print sort @harry;
3747 # prints AbelCaincatdogx
3748 print sort reverse @harry;
3749 # prints xdogcatCainAbel
3750 print sort @george, \'to\', @harry;
3751 # prints AbelAxedCainPunishedcatchaseddoggonetoxyz
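A comparison routine can also combine several keys.
The following sketch orders names by a value held in an associative array, oldest first, and breaks ties alphabetically; %age, @class and byage are made-up names and data.

```perl
%age = ('ann', 31, 'bob', 25, 'cy', 31);
@class = keys(%age);

# Descending by age; the || falls through to a string compare on ties.
sub byage { $age{$b} <=> $age{$a} || $a cmp $b; }

@sorted = sort byage @class;
print "@sorted\n";
```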
3754 .Ip "splice(ARRAY,OFFSET,LENGTH,LIST)" 8 8
3755 .Ip "splice(ARRAY,OFFSET,LENGTH)" 8
3756 .Ip "splice(ARRAY,OFFSET)" 8
3757 Removes the elements designated by OFFSET and LENGTH from an array, and
3758 replaces them with the elements of LIST, if any.
3759 Returns the elements removed from the array.
3760 The array grows or shrinks as necessary.
3761 If LENGTH is omitted, removes everything from OFFSET onward.
3762 The following equivalencies hold (assuming $[ == 0):
3765 push(@a,$x,$y)\h'|3.5i'splice(@a,$#a+1,0,$x,$y)
3766 pop(@a)\h'|3.5i'splice(@a,-1)
3767 shift(@a)\h'|3.5i'splice(@a,0,1)
3768 unshift(@a,$x,$y)\h'|3.5i'splice(@a,0,0,$x,$y)
3769 $a[$x] = $y\h'|3.5i'splice(@a,$x,1,$y);
3771 Example, assuming array lengths are passed before arrays:
3773 sub aeq { # compare two array values
3774     local(@a) = splice(@_,0,shift);
3775     local(@b) = splice(@_,0,shift);
3776     return 0 unless @a == @b; # same len?
3777     while (@a) {
3778         return 0 if pop(@a) ne pop(@b);
3779     }
3780     return 1;
3781 }
3782 if (&aeq($len,@foo[1..$len],0+@bar,@bar)) { ... }
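The basic removing and inserting forms can be sketched as follows (illustrative data, assuming $[ is 0):

```perl
@a = (10, 20, 30, 40, 50);
@gone = splice(@a, 1, 2);          # removes (20, 30); @a is (10, 40, 50)
splice(@a, 1, 0, 'x', 'y');        # inserts; @a is (10, 'x', 'y', 40, 50)
@tail = splice(@a, 3);             # removes (40, 50); @a is (10, 'x', 'y')
print "@gone | @tail | @a\n";
```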
3785 .Ip "split(/PATTERN/,EXPR,LIMIT)" 8 8
3786 .Ip "split(/PATTERN/,EXPR)" 8 8
3787 .Ip "split(/PATTERN/)" 8
3789 Splits a string into an array of strings, and returns it.
3790 (If not in an array context, returns the number of fields found and splits into the @_ array.
3792 (In an array context, you can force the split into @_
3793 by using ?? as the pattern delimiters, but it still returns the array value.))
3794 If EXPR is omitted, splits the $_ string.
3795 If PATTERN is also omitted, splits on whitespace (/[\ \et\en]+/).
3796 Anything matching PATTERN is taken to be a delimiter separating the fields.
3797 (Note that the delimiter may be longer than one character.)
3798 If LIMIT is specified, splits into no more than that many fields (though it
3799 may split into fewer).
3800 If LIMIT is unspecified, trailing null fields are stripped (which
3801 potential users of pop() would do well to remember).
3802 A pattern matching the null string (not to be confused with a null pattern //,
3803 which is just one member of the set of patterns matching a null string)
3804 will split the value of EXPR into separate characters at each point it matches that way. For example:
3809 print join(\':\', split(/ */, \'hi there\'));
3812 produces the output \*(L'h:i:t:h:e:r:e\*(R'.
3814 The LIMIT parameter can be used to partially split a line
3817 ($login, $passwd, $remainder) = split(\|/\|:\|/\|, $_, 3);
3820 (When assigning to a list, if LIMIT is omitted, perl supplies a LIMIT one
3821 larger than the number of variables in the list, to avoid unnecessary work.
3822 For the list above LIMIT would have been 4 by default.
3823 In time critical applications it behooves you not to split into
3824 more fields than you really need.)
3826 If the PATTERN contains parentheses, additional array elements are created
3827 from each matching substring in the delimiter.
3829 split(/([,-])/,"1-10,20");
3831 produces the array value (1,\'-\',10,\',\',20).
3835 The pattern /PATTERN/ may be replaced with an expression to specify patterns
3836 that vary at runtime.
3837 (To do runtime compilation only once, use /$variable/o.)
3838 As a special case, specifying a space (\'\ \') will split on white space
3839 just as split with no arguments does, but leading white space does NOT
3840 produce a null first field.
3841 Thus, split(\'\ \') can be used to emulate awk's
3843 default behavior, whereas
3844 split(/\ /) will give you as many null initial fields as there are leading spaces.
3851 open(passwd, \'/etc/passwd\');
3852 while (<passwd>) {
3854     ($login, $passwd, $uid, $gid, $gcos, $home, $shell) = split(\|/\|:\|/\|);
3860     \&...
3861 }
3864 (Note that $shell above will still have a newline on it. See chop().)
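The main variations described above can be gathered into one sketch.
The sample strings and variable names are illustrative only.

```perl
$line = "root:x:0:0:Super User:/:/bin/sh";
($login, $passwd, $rest) = split(/:/, $line, 3);  # $rest keeps its colons
@parts  = split(/([,-])/, "1-10,20");             # delimiters are kept too
@words  = split(' ', "  leading space");          # no null first field
@fields = split(/:/, "a:b::");                    # trailing nulls dropped
print "$rest\n", scalar(@parts), " ", scalar(@words), " ", scalar(@fields), "\n";
```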
3867 .Ip "sprintf(FORMAT,LIST)" 8 4
3868 Returns a string formatted by the usual printf conventions.
3869 The * character is not supported.
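A few of the common conversions, as a sketch with fixed widths written out in the format:

```perl
# Right-justified integer, left-justified string, zero-padded float, hex.
$s = sprintf("%5d|%-5s|%05.1f|%x", 42, "ab", 3.14159, 255);
print "$s\n";
```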
3870 .Ip "sqrt(EXPR)" 8 4
3872 Returns the square root of EXPR.
3873 If EXPR is omitted, returns square root of $_.
3874 .Ip "srand(EXPR)" 8 4
3876 Sets the random number seed for the rand operator.
3879 If EXPR is omitted, does srand(time).
3880 .Ip "stat(FILEHANDLE)" 8 8
3881 .Ip "stat FILEHANDLE" 8
3883 .Ip "stat SCALARVARIABLE" 8
3884 Returns a 13-element array giving the statistics for a file, either the file
3885 opened via FILEHANDLE, or named by EXPR.
3886 Returns a null list if the stat fails.
3887 Typically used as follows:
3891 ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
3892     $atime,$mtime,$ctime,$blksize,$blocks)
3893         = stat($filename);
3896 If stat is passed the special filehandle consisting of an underline,
3897 no stat is done, but the current contents of the stat structure from
3898 the last stat or filetest are returned.
3903 if (-x $file && (($d) = stat(_)) && $d < 0) {
3904     print "$file is executable NFS file\en";
3905 }
3908 (This only works on machines for which the device number is negative under NFS.)
3909 .Ip "study(SCALAR)" 8 6
3910 .Ip "study SCALAR" 8
3912 Takes extra time to study SCALAR ($_ if unspecified) in anticipation of
3913 doing many pattern matches on the string before it is next modified.
3914 This may or may not save time, depending on the nature and number of patterns
3915 you are searching on, and on the distribution of character frequencies in
3916 the string to be searched\*(--you probably want to compare runtimes with and
3917 without it to see which runs faster.
3918 Those loops which scan for many short constant strings (including the constant
3919 parts of more complex patterns) will benefit most.
3920 You may have only one study active at a time\*(--if you study a different
3921 scalar the first is \*(L"unstudied\*(R".
3922 (The way study works is this: a linked list of every character in the string
3923 to be searched is made, so we know, for example, where all the \*(L'k\*(R' characters are.
3925 From each search string, the rarest character is selected, based on some
3926 static frequency tables constructed from some C programs and English text.
3927 Only those places that contain this \*(L"rarest\*(R" character are examined.)
3929 For example, here is a loop which inserts index producing entries before any line
3930 containing a certain pattern:
3934 while (<>) { study;
3936     print ".IX foo\en" if /\ebfoo\eb/;
3937     print ".IX bar\en" if /\ebbar\eb/;
3938     print ".IX blurfl\en" if /\ebblurfl\eb/;
3940     print;
3941 }
3944 In searching for /\ebfoo\eb/, only those locations in $_ that contain \*(L'f\*(R'
3945 will be looked at, because \*(L'f\*(R' is rarer than \*(L'o\*(R'.
3946 In general, this is a big win except in pathological cases.
3947 The only question is whether it saves you more time than it took to build
3948 the linked list in the first place.
3950 Note that if you have to look for strings that you don't know till runtime,
3951 you can build an entire loop as a string and eval that to avoid recompiling
3952 all your patterns all the time.
3953 Together with undefining $/ to input entire files as one record, this can
3954 be very fast, often faster than specialized programs like fgrep.
3955 The following scans a list of files (@files)
3956 for a list of words (@words), and prints out the names of those files that contain a match:
3961 $search = \'while (<>) { study;\';
3962 foreach $word (@words) {
3963     $search .= "++\e$seen{\e$ARGV} if /\e\eb$word\e\eb/;\en";
3964 }
3965 $search .= "}";
3966 @ARGV = @files;
3967 undef $/;
3968 eval $search; # this screams
3969 $/ = "\en"; # put back to normal input delim
3970 foreach $file (sort keys(%seen)) {
3971     print $file, "\en";
3972 }
3975 .Ip "substr(EXPR,OFFSET,LEN)" 8 2
3976 .Ip "substr(EXPR,OFFSET)" 8 2
3977 Extracts a substring out of EXPR and returns it.
3978 First character is at offset 0, or whatever you've set $[ to.
3979 If OFFSET is negative, starts that far from the end of the string.
3980 If LEN is omitted, returns everything to the end of the string.
3981 You can use the substr() function as an lvalue, in which case EXPR must be an lvalue.
3983 If you assign something shorter than LEN, the string will shrink, and
3984 if you assign something longer than LEN, the string will grow to accommodate it.
3985 To keep the string the same length you may need to pad or chop your value using sprintf().
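Both the extracting and the assigning forms are shown in the sketch below (illustrative names, assuming $[ is 0):

```perl
$s = "Hello, world";
$head = substr($s, 0, 5);          # "Hello"
$tail = substr($s, -5);            # "world"
substr($s, 0, 5) = "Goodbye";      # longer value: $s grows
substr($s, -5) = "all";            # shorter value: $s shrinks
print "$head $tail $s\n";
```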
3987 .Ip "symlink(OLDFILE,NEWFILE)" 8 2
3988 Creates a new filename symbolically linked to the old filename.
3989 Returns 1 for success, 0 otherwise.
3990 On systems that don't support symbolic links, produces a fatal error at run time.
3992 To check for that, use eval:
3995 $symlink_exists = (eval \'symlink("","");\', $@ eq \'\');
3998 .Ip "syscall(LIST)" 8 6
3999 .Ip "syscall LIST" 8
4000 Calls the system call specified as the first element of the list, passing
4001 the remaining elements as arguments to the system call.
4002 If unimplemented, produces a fatal error.
4003 The arguments are interpreted as follows: if a given argument is numeric,
4004 the argument is passed as an int.
4005 If not, the pointer to the string value is passed.
4006 You are responsible to make sure a string is pre-extended long enough
4007 to receive any result that might be written into a string.
4008 If your integer arguments are not literals and have never been interpreted
4009 in a numeric context, you may need to add 0 to them to force them to look like numbers.
4013 require 'syscall.ph'; # may need to run h2ph
4014 syscall(&SYS_write, fileno(STDOUT), "hi there\en", 9);
4017 .Ip "sysread(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
4018 .Ip "sysread(FILEHANDLE,SCALAR,LENGTH)" 8 5
4019 Attempts to read LENGTH bytes of data into variable SCALAR from the specified
4020 FILEHANDLE, using the system call read(2).
4021 It bypasses stdio, so mixing this with other kinds of reads may cause confusion.
4023 Returns the number of bytes actually read, or undef if there was an error.
4024 SCALAR will be grown or shrunk to the length actually read.
4025 An OFFSET may be specified to place the read data at some other place
4026 than the beginning of the string.
4027 .Ip "system(LIST)" 8 6
4029 Does exactly the same thing as \*(L"exec LIST\*(R" except that a fork
4030 is done first, and the parent process waits for the child process to complete.
4031 Note that argument processing varies depending on the number of arguments.
4032 The return value is the exit status of the program as returned by the wait() call.
4034 To get the actual exit value divide by 256.
4037 .Ip "syswrite(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
4038 .Ip "syswrite(FILEHANDLE,SCALAR,LENGTH)" 8 5
4039 Attempts to write LENGTH bytes of data from variable SCALAR to the specified
4040 FILEHANDLE, using the system call write(2).
4041 It bypasses stdio, so mixing this with prints may cause confusion.
4043 Returns the number of bytes actually written, or undef if there was an error.
4044 An OFFSET may be specified to take the data to be written from some place
4045 other than the beginning of the string.
4046 .Ip "tell(FILEHANDLE)" 8 6
4047 .Ip "tell FILEHANDLE" 8 6
4049 Returns the current file position for FILEHANDLE.
4050 FILEHANDLE may be an expression whose value gives the name of the actual filehandle.
4052 If FILEHANDLE is omitted, assumes the file last read.
4053 .Ip "telldir(DIRHANDLE)" 8 5
4054 .Ip "telldir DIRHANDLE" 8
4055 Returns the current position of the readdir() routines on DIRHANDLE.
4056 The value may be given to seekdir() to access a particular location in a directory.
4058 Has the same caveats about possible directory compaction as the corresponding
4059 system library routine.
4060 .Ip "time" 8 4
4061 Returns the number of non-leap seconds since 00:00:00 UTC, January 1, 1970.
4062 Suitable for feeding to gmtime() and localtime().
4063 .Ip "times" 8 4
4064 Returns a four-element array giving the user and system times, in seconds, for this
4065 process and the children of this process.
4067 ($user,$system,$cuser,$csystem) = times;
4069 .Ip "tr/SEARCHLIST/REPLACEMENTLIST/cds" 8 5
4070 .Ip "y/SEARCHLIST/REPLACEMENTLIST/cds" 8
4071 Translates all occurrences of the characters found in the search list with
4072 the corresponding character in the replacement list.
4073 It returns the number of characters replaced or deleted.
4074 If no string is specified via the =~ or !~ operator,
4075 the $_ string is translated.
4076 (The string specified with =~ must be a scalar variable, an array element,
4077 or an assignment to one of those, i.e. an lvalue.)
4082 For sed devotees, y is provided as a synonym for tr.
4084 If the SEARCHLIST is delimited by bracketing quotes, the REPLACEMENTLIST
4085 has its own pair of quotes, which may or may not be bracketing quotes, e.g.
4086 tr[A-Z][a-z] or tr(+-*/)/ABCD/.
4088 If the c modifier is specified, the SEARCHLIST character set is complemented.
4089 If the d modifier is specified, any characters specified by SEARCHLIST that
4090 are not found in REPLACEMENTLIST are deleted.
4091 (Note that this is slightly more flexible than the behavior of some
4093 programs, which delete anything they find in the SEARCHLIST, period.)
4094 If the s modifier is specified, sequences of characters that were translated
4095 to the same character are squashed down to a single instance of the character.
4097 If the d modifier was used, the REPLACEMENTLIST is always interpreted exactly
4099 Otherwise, if the REPLACEMENTLIST is shorter than the SEARCHLIST,
4100 the final character is replicated till it is long enough.
4101 If the REPLACEMENTLIST is null, the SEARCHLIST is replicated.
4102 This latter is useful for counting characters in a class, or for squashing
4103 character sequences in a class.
4108 $ARGV[1] \|=~ \|y/A\-Z/a\-z/; \h'|3i'# canonicalize to lower case
4110 $cnt = tr/*/*/; \h'|3i'# count the stars in $_
4112 $cnt = tr/0\-9//; \h'|3i'# count the digits in $_
4114 tr/a\-zA\-Z//s; \h'|3i'# bookkeeper \-> bokeper
4116 ($HOST = $host) =~ tr/a\-z/A\-Z/;
4118 y/a\-zA\-Z/ /cs; \h'|3i'# change non-alphas to single space
4120 tr/\e200\-\e377/\e0\-\e177/;\h'|3i'# delete 8th bit
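A few more cases, sketched here with made-up strings, show the return value, the c and d modifiers together, and the replication of a short REPLACEMENTLIST:

```perl
$_ = "Hello, World 123";
$digits = tr/0-9//;                       # counts 3 digits; $_ is left unchanged

($letters  = $_) =~ tr/a-zA-Z//cd;        # delete everything that is not a letter
($squeezed = "aaa   bbb") =~ tr/ //s;     # runs of spaces squashed to one space
($mapped   = "abcdef") =~ tr/a-f/xy/;     # final y is replicated for c through f
```

Copying into a new variable inside the parentheses, as in the $HOST example above, leaves the original string alone.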
4123 .Ip "truncate(FILEHANDLE,LENGTH)" 8 4
4124 .Ip "truncate(EXPR,LENGTH)" 8
4125 Truncates the file opened on FILEHANDLE, or named by EXPR, to the specified
4127 Produces a fatal error if truncate isn't implemented on your system.
4128 .Ip "umask(EXPR)" 8 4
4131 Sets the umask for the process and returns the old one.
4132 If EXPR is omitted, merely returns current umask.
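A minimal sketch of saving and restoring the mask follows. The value 027 is just an example.

```perl
$old = umask(027);          # set a new mask, remembering the previous one
$now = umask;               # no argument: merely report the current mask
printf("old mask %03o, current mask %03o\n", $old, $now);
umask($old);                # put things back the way they were
```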
4133 .Ip "undef(EXPR)" 8 6
4136 Undefines the value of EXPR, which must be an lvalue.
4137 Use only on a scalar value, an entire array, or a subroutine name (using &).
4138 (Undef will probably not do what you expect on most predefined variables or
4140 Always returns the undefined value.
4141 You can omit the EXPR, in which case nothing is undefined, but you still
4142 get an undefined value that you could, for instance, return from a subroutine.
4148 undef $bar{'blurfl'};
4152 return (wantarray ? () : undef) if $they_blew_it;
4155 .Ip "unlink(LIST)" 8 4
4157 Deletes a list of files.
4158 Returns the number of files successfully deleted.
4162 $cnt = unlink \'a\', \'b\', \'c\';
4167 Note: unlink will not delete directories unless you are superuser and the
4171 Even if these conditions are met, be warned that unlinking a directory
4172 can inflict damage on your filesystem.
4174 .Ip "unpack(TEMPLATE,EXPR)" 8 4
4175 Unpack does the reverse of pack: it takes a string representing
4176 a structure and expands it out into an array value, returning the array
4178 (In a scalar context, it merely returns the first value produced.)
4179 The TEMPLATE has the same format as in the pack function.
4180 Here's a subroutine that does substring:
4185 local($what,$where,$howmuch) = @_;
4186 unpack("x$where a$howmuch", $what);
4192 sub ord { unpack("c",$_[0]); }
4195 In addition, you may prefix a field with a %<number> to indicate that
4196 you want a <number>-bit checksum of the items instead of the items themselves.
4197 Default is a 16-bit checksum.
4198 For example, the following computes the same number as the System V sum program:
4203 $checksum += unpack("%16C*", $_);
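The same ideas can be tried on a literal string. This is only a sketch with invented data, not a replacement for the examples above:

```perl
$str    = "abcdefg";
$piece  = unpack("x2 a3", $str);            # skip 2 bytes, take 3
@codes  = unpack("C*", "abc");              # one unsigned char value per byte
$sum    = unpack("%16C*", "abc") % 65536;   # 16-bit checksum: 97 + 98 + 99
```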
4208 .Ip "unshift(ARRAY,LIST)" 8 4
4209 Does the opposite of a
4211 Or the opposite of a
4213 depending on how you look at it.
4214 Prepends list to the front of the array, and returns the number of elements in the new array.
4218 unshift(ARGV, \'\-e\') unless $ARGV[0] =~ /^\-/;
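A short sketch with an invented array shows that the list goes on the front as a unit, in its original order:

```perl
@queue = ('c', 'd');
$n     = unshift(@queue, 'a', 'b');   # @queue is now (a, b, c, d)
$first = shift(@queue);               # takes 'a' back off the front
```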
4221 .Ip "utime(LIST)" 8 2
4222 .Ip "utime LIST" 8 2
4223 Changes the access and modification times on each file of a list of files.
4224 The first two elements of the list must be the NUMERICAL access and
4225 modification times, in that order.
4226 Returns the number of files successfully changed.
4227 The inode modification time of each file is set to the current time.
4228 Example of a \*(L"touch\*(R" command:
4234 utime $now, $now, @ARGV;
4237 .Ip "values(ASSOC_ARRAY)" 8 6
4238 .Ip "values ASSOC_ARRAY" 8
4239 Returns a normal array consisting of all the values of the named associative array.
4241 The values are returned in an apparently random order, but it is the same order
4242 as either the keys() or each() function would produce on the same array.
4243 See also keys() and each().
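The matching order can be checked directly, provided the array is not modified between the two calls. The array contents here are made up:

```perl
%age = ('tom', 30, 'dick', 25, 'harry', 41);
@k   = keys(%age);
@v   = values(%age);

$ok = 0;
for ($i = 0; $i <= $#k; $i++) {
    $ok++ if $age{$k[$i]} == $v[$i];   # the i'th value belongs to the i'th key
}
```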
4244 .Ip "vec(EXPR,OFFSET,BITS)" 8 2
4245 Treats a string as a vector of unsigned integers, and returns the value
4246 of the bitfield specified.
4247 May also be assigned to.
4248 BITS must be a power of two from 1 to 32.
4250 Vectors created with vec() can also be manipulated with the logical operators
4252 which will assume a bit vector operation is desired when both operands are
4254 This interpretation is not enabled unless there is at least one vec() in
4255 your program, to protect older programs.
4257 To transform a bit vector into a string or array of 0's and 1's, use these:
4260 $bits = unpack("b*", $vector);
4261 @bits = split(//, unpack("b*", $vector));
4264 If you know the exact length in bits, it can be used in place of the *.
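Here is a small sketch of assigning through vec() and looking at the result. The variable names are illustrative only.

```perl
$vector = '';
vec($vector, 0, 1) = 1;            # set bit 0
vec($vector, 3, 1) = 1;            # set bit 3
$bits  = unpack("b*", $vector);    # lowest bit first
$third = vec($vector, 3, 1);       # read a bit back

$bytes = '';
vec($bytes, 0, 8) = 72;            # 8-bit fields are just bytes
vec($bytes, 1, 8) = 105;
```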
4266 Waits for a child process to terminate and returns the pid of the deceased
4267 process, or -1 if there are no child processes.
4268 The status is returned in $?.
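A minimal sketch, assuming a system with fork, of collecting a child's exit value:

```perl
if ($pid = fork) {
    $dead = wait;              # pid of the child that just finished
    $exit = $? >> 8;           # its exit value is the high byte of $?
}
elsif (defined $pid) {
    exit 3;                    # the child does nothing but exit
}
else {
    die "can't fork: $!";
}
```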
4269 .Ip "waitpid(PID,FLAGS)" 8 6
4270 Waits for a particular child process to terminate and returns the pid of the deceased
4271 process, or -1 if there is no such child process.
4272 The status is returned in $?.
4276 require "sys/wait.h";
4278 waitpid(-1,&WNOHANG);
4281 then you can do a non-blocking wait for any process. Non-blocking wait
4282 is only available on machines supporting either the
4287 However, waiting for a particular pid with FLAGS of 0 is implemented
4288 everywhere. (Perl emulates the system call by remembering the status
4289 values of processes that have exited but have not been harvested by the
4292 Returns true if the context of the currently executing subroutine
4293 is looking for an array value.
4294 Returns false if the context is looking for a scalar.
4297 return wantarray ? () : undef;
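To see which context a call is in, something like the following sketch will do. The subroutine name is invented.

```perl
sub context { wantarray ? 'array' : 'scalar'; }

@r = &context();       # called in an array context
$s = &context();       # called in a scalar context
```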
4300 .Ip "warn(LIST)" 8 4
4302 Produces a message on STDERR just like \*(L"die\*(R", but doesn't exit.
4303 .Ip "write(FILEHANDLE)" 8 6
4306 Writes a formatted record (possibly multi-line) to the specified file,
4307 using the format associated with that file.
4308 By default the format for a file is the one having the same name as the
4309 filehandle, but the format for the current output channel (see
4311 may be set explicitly
4312 by assigning the name of the format to the $~ variable.
4314 Top of form processing is handled automatically:
4315 if there is insufficient room on the current page for the formatted
4316 record, the page is advanced by writing a form feed,
4317 a special top-of-page format is used
4318 to format the new page header, and then the record is written.
4319 By default the top-of-page format is the name of the filehandle with
4320 \*(L"_TOP\*(R" appended, but it may be dynamically set to the
4321 format of your choice by assigning the name to the $^ variable while
4322 the filehandle is selected.
4323 The number of lines remaining on the current page is in variable $-, which
4324 can be set to 0 to force a new page.
4326 If FILEHANDLE is unspecified, output goes to the current default output channel,
4329 but may be changed by the
4332 If the FILEHANDLE is an EXPR, then the expression is evaluated and the
4333 resulting string is used to look up the name of the FILEHANDLE at run time.
4334 For more on formats, see the section on formats later on.
4336 Note that write is NOT the opposite of read.
4339 operators have the following associativity and precedence:
4342 nonassoc\h'|1i'print printf exec system sort reverse
4343 \h'1.5i'chmod chown kill unlink utime die return
4345 right\h'|1i'= += \-= *= etc.
4352 nonassoc\h'|1i'== != <=> eq ne cmp
4353 nonassoc\h'|1i'< > <= >= lt gt le ge
4354 nonassoc\h'|1i'chdir exit eval reset sleep rand umask
4355 nonassoc\h'|1i'\-r \-w \-x etc.
4360 right\h'|1i'! ~ and unary minus
4362 nonassoc\h'|1i'++ \-\|\-
4363 left\h'|1i'\*(L'(\*(R'
4366 As mentioned earlier, if any list operator (print, etc.) or
4367 any unary operator (chdir, etc.)
4368 is followed by a left parenthesis as the next token on the same line,
4369 the operator and arguments within parentheses are taken to
4370 be of highest precedence, just like a normal function call.
4374 chdir $foo || die;\h'|3i'# (chdir $foo) || die
4375 chdir($foo) || die;\h'|3i'# (chdir $foo) || die
4376 chdir ($foo) || die;\h'|3i'# (chdir $foo) || die
4377 chdir +($foo) || die;\h'|3i'# (chdir $foo) || die
4379 but, because * is higher precedence than ||:
4381 chdir $foo * 20;\h'|3i'# chdir ($foo * 20)
4382 chdir($foo) * 20;\h'|3i'# (chdir $foo) * 20
4383 chdir ($foo) * 20;\h'|3i'# (chdir $foo) * 20
4384 chdir +($foo) * 20;\h'|3i'# chdir ($foo * 20)
4386 rand 10 * 20;\h'|3i'# rand (10 * 20)
4387 rand(10) * 20;\h'|3i'# (rand 10) * 20
4388 rand (10) * 20;\h'|3i'# (rand 10) * 20
4389 rand +(10) * 20;\h'|3i'# rand (10 * 20)
4392 In the absence of parentheses,
4393 the precedence of list operators such as print, sort or chmod is
4394 either very high or very low depending on whether you look at the left
4395 side of the operator or the right side of it.
4399 @ary = (1, 3, sort 4, 2);
4400 print @ary; # prints 1324
4403 the commas on the right of the sort are evaluated before the sort, but
4404 the commas on the left are evaluated after.
4405 In other words, list operators tend to gobble up all the arguments that
4406 follow them, and then act like a simple term with regard to the preceding
4408 Note that you have to be careful with parens:
4412 # These evaluate exit before doing the print:
4413 print($foo, exit); # Obviously not what you want.
4414 print $foo, exit; # Nor is this.
4417 # These do the print before evaluating exit:
4418 (print $foo), exit; # This is what you want.
4419 print($foo), exit; # Or this.
4420 print ($foo), exit; # Or even this.
4424 print ($foo & 255) + 1, "\en";
4427 probably doesn't do what you expect at first glance.
4429 A subroutine may be declared as follows:
4436 Any arguments passed to the routine come in as array @_,
4437 that is ($_[0], $_[1], .\|.\|.).
4438 The array @_ is a local array, but its values are references to the
4439 actual scalar parameters.
4440 The return value of the subroutine is the value of the last expression
4441 evaluated, and can be either an array value or a scalar value.
4442 Alternatively, a return statement may be used to specify the returned value and
4443 exit the subroutine.
4444 To create local variables see the
4448 A subroutine is called using the
4450 operator or the & operator.
4457 local($max) = pop(@_);
4459 $max = $foo \|if \|$max < $foo;
4465 $bestday = &MAX($mon,$tue,$wed,$thu,$fri);
4470 # get a line, combining continuation lines
4471 # that start with whitespace
4473 $thisline = $lookahead;
4474 line: while ($lookahead = <STDIN>) {
4475 if ($lookahead \|=~ \|/\|^[ \^\e\|t]\|/\|) {
4476 $thisline \|.= \|$lookahead;
4485 $lookahead = <STDIN>; # get first line
4486 while ($_ = do get_line(\|)) {
4493 Use array assignment to a local list to name your formal arguments:
4496 local($key, $value) = @_;
4497 $foo{$key} = $value unless $foo{$key};
4501 This also has the effect of turning call-by-reference into call-by-value,
4502 since the assignment copies the values.
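The difference can be sketched with two invented subroutines, one working through $_[0] and one working on a local copy:

```perl
sub bump_in_place { $_[0]++; }                      # changes the caller's variable
sub bump_copy     { local($n) = @_; $n++; $n; }     # changes only a private copy

$count = 5;
&bump_in_place($count);        # $count itself is incremented
$new = &bump_copy($count);     # $count is untouched; the result comes back in $new
```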
4504 Subroutines may be called recursively.
4505 If a subroutine is called using the & form, the argument list is optional.
4506 If omitted, no @_ array is set up for the subroutine; the @_ array at the
4507 time of the call is visible to the subroutine instead.
4510 do foo(1,2,3); # pass three arguments
4511 &foo(1,2,3); # the same
4513 do foo(); # pass a null list
4515 &foo; # pass no arguments\*(--more efficient
4518 .Sh "Passing By Reference"
4519 Sometimes you don't want to pass the value of an array to a subroutine but
4520 rather the name of it, so that the subroutine can modify the global copy
4521 of it rather than working with a local copy.
4522 In perl you can refer to all the objects of a particular name by prefixing
4523 the name with a star: *foo.
4524 When evaluated, it produces a scalar value that represents all the objects
4525 of that name, including any filehandle, format or subroutine.
4526 When assigned to within a local() operation, it causes the name mentioned
4527 to refer to whatever * value was assigned to it.
4532 local(*someary) = @_;
4533 foreach $elem (@someary) {
4541 Assignment to *name is currently recommended only inside a local().
4542 You can actually assign to *name anywhere, but the previous referent of
4543 *name may be stranded forever.
4544 This may or may not bother you.
4546 Note that scalars are already passed by reference, so you can modify scalar
4547 arguments without using this mechanism by referring explicitly to the $_[nnn]
4549 You can modify all the elements of an array by passing all the elements
4550 as scalars, but you have to use the * mechanism to push, pop or change the
4552 The * mechanism will probably be more efficient in any case.
4554 Since a *name value contains unprintable binary data, if it is used as
4555 an argument in a print, or as a %s argument in a printf or sprintf, it
4556 then has the value '*name', just so it prints out pretty.
4558 Even if you don't want to modify an array, this mechanism is useful for
4559 passing multiple arrays in a single LIST, since normally the LIST mechanism
4560 will merge all the array values so that you can't extract out the individual arrays.
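As a sketch of passing two arrays this way (the subroutine and array names are invented for the example):

```perl
sub append_all {
    local(*dest, *src) = @_;      # @dest and @src now name the caller's arrays
    push(@dest, @src);            # changes the caller's array, not a copy
    scalar(@dest);                # return the new length
}

@x = (1, 2, 3);
@y = (4, 5);
$len = &append_all(*x, *y);       # @x grows; @y is left alone
```

Had the arrays been passed as (@x, @y), the subroutine would have seen a single five-element list and could not have told where one array ended and the other began.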
4562 .Sh "Regular Expressions"
4563 The patterns used in pattern matching are regular expressions such as
4564 those supplied in the Version 8 regexp routines.
4565 (In fact, the routines are derived from Henry Spencer's freely redistributable
4566 reimplementation of the V8 routines.)
4567 In addition, \ew matches an alphanumeric character (including \*(L"_\*(R") and \eW a nonalphanumeric.
4568 Word boundaries may be matched by \eb, and non-boundaries by \eB.
4569 A whitespace character is matched by \es, non-whitespace by \eS.
4570 A numeric character is matched by \ed, non-numeric by \eD.
4571 You may use \ew, \es and \ed within character classes.
4572 Also, \en, \er, \ef, \et and \eNNN have their normal interpretations.
4573 Within character classes \eb represents backspace rather than a word boundary.
4574 Alternatives may be separated by |.
4575 The bracketing construct \|(\ .\|.\|.\ \|) may also be used, in which case \e<digit>
4576 matches the digit'th substring.
4577 (Outside of the pattern, always use $ instead of \e in front of the digit.
4578 The scope of $<digit> (and $\`, $& and $\')
4579 extends to the end of the enclosing BLOCK or eval string, or to
4580 the next pattern match with subexpressions.
4581 The \e<digit> notation sometimes works outside the current pattern, but should
4582 not be relied upon.)
4583 You may have as many parentheses as you wish. If you have more than 9
4584 substrings, the variables $10, $11, ... refer to the corresponding
4585 substring. Within the pattern, \e10, \e11,
4586 etc. refer back to substrings if there have been at least that many left parens
4587 before the backreference. Otherwise (for backward compatibility) \e10
4588 is the same as \e010, a backspace,
4589 and \e11 the same as \e011, a tab.
4591 (\e1 through \e9 are always backreferences.)
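A backreference inside the pattern and the corresponding variables outside it can be sketched like this, using a made-up string:

```perl
$_ = "bookkeeper";
if (/(\w)\1/) {            # a word character followed by itself
    $double = $1;          # outside the pattern, use $1 rather than \1
    $before = $`;
    $after  = $';
}
```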
4593 $+ returns whatever the last bracket match matched.
4594 $& returns the entire matched string.
4595 ($0 used to return the same thing, but not any more.)
4596 $\` returns everything before the matched string.
4597 $\' returns everything after the matched string.
4601 s/\|^\|([^ \|]*\|) \|*([^ \|]*\|)\|/\|$2 $1\|/; # swap first two words
4604 if (/\|Time: \|(.\|.\|):\|(.\|.\|):\|(.\|.\|)\|/\|) {
4611 By default, the ^ character is only guaranteed to match at the beginning
4613 the $ character only at the end (or before the newline at the end)
4616 does certain optimizations with the assumption that the string contains
4618 The behavior of ^ and $ on embedded newlines will be inconsistent.
4619 You may, however, wish to treat a string as a multi-line buffer, such that
4620 the ^ will match after any newline within the string, and $ will match
4622 At the cost of a little more overhead, you can do this by setting the variable
4624 Setting it back to 0 makes
4626 revert to its old behavior.
4628 To facilitate multi-line substitutions, the . character never matches a newline
4629 (even when $* is 0).
4630 In particular, the following leaves a newline on the $_ string:
4634 s/.*(some_string).*/$1/;
4636 If the newline is unwanted, try one of
4638 s/.*(some_string).*\en/$1/;
4639 s/.*(some_string)[^\e000]*/$1/;
4640 s/.*(some_string)(.|\en)*/$1/;
4641 chop; s/.*(some_string).*/$1/;
4642 /(some_string)/ && ($_ = $1);
4645 Any item of a regular expression may be followed with digits in curly brackets
4646 of the form {n,m}, where n gives the minimum number of times to match the item
4647 and m gives the maximum.
4648 The form {n} is equivalent to {n,n} and matches exactly n times.
4649 The form {n,} matches n or more times.
4650 (If a curly bracket occurs in any other context, it is treated as a regular
4652 The * modifier is equivalent to {0,}, the + modifier to {1,} and the ? modifier to {0,1}.
4654 There is no limit to the size of n or m, but large numbers will chew up
4657 You will note that all backslashed metacharacters in
4660 such as \eb, \ew, \en.
4661 Unlike some other regular expression languages, there are no backslashed
4662 symbols that aren't alphanumeric.
4663 So anything that looks like \e\e, \e(, \e), \e<, \e>, \e{, or \e} is always
4664 interpreted as a literal character, not a metacharacter.
4665 This makes it simple to quote a string that you want to use for a pattern
4666 but that you are afraid might contain metacharacters.
4667 Simply quote all the non-alphanumeric characters:
4670 $pattern =~ s/(\eW)/\e\e$1/g;
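To see the effect, here is a sketch with an invented pattern string containing two metacharacters:

```perl
$pattern = "a.b*c";
($quoted = $pattern) =~ s/(\W)/\\$1/g;          # backslash every non-alphanumeric

$hit  = ("xxa.b*cxx" =~ /$quoted/) ? 1 : 0;     # the literal text is found
$miss = ("aXbbbc"    =~ /$quoted/) ? 1 : 0;     # . and * no longer act as metacharacters
```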
4674 Output record formats for use with the
4676 operator may be declared as follows:
4685 If name is omitted, format \*(L"STDOUT\*(R" is defined.
4686 FORMLIST consists of a sequence of lines, each of which may be of one of three
4691 A \*(L"picture\*(R" line giving the format for one output line.
4693 An argument line supplying values to plug into a picture line.
4695 Picture lines are printed exactly as they look, except for certain fields
4696 that substitute values into the line.
4697 Each picture field starts with either @ or ^.
4698 The @ field (not to be confused with the array marker @) is the normal
4699 case; ^ fields are used
4700 to do rudimentary multi-line text block filling.
4701 The length of the field is supplied by padding out the field
4702 with multiple <, >, or | characters to specify, respectively, left justification,
4703 right justification, or centering.
4704 As an alternate form of right justification,
4705 you may also use # characters (with an optional .) to specify a numeric field.
4706 (Use of ^ instead of @ causes the field to be blanked if undefined.)
4707 If any of the values supplied for these fields contains a newline, only
4708 the text up to the newline is printed.
4709 The special field @* can be used for printing multi-line values.
4710 It should appear by itself on a line.
4712 The values are specified on the following line, in the same order as
4714 The values should be separated by commas.
4716 Picture fields that begin with ^ rather than @ are treated specially.
4717 The value supplied must be a scalar variable name which contains a text
4720 puts as much text as it can into the field, and then chops off the front
4721 of the string so that the next time the variable is referenced,
4722 more of the text can be printed.
4723 Normally you would use a sequence of fields in a vertical stack to print
4724 out a block of text.
4725 If you like, you can end the final field with .\|.\|., which will appear in the
4726 output if the text was too long to appear in its entirety.
4727 You can change which characters are legal to break on by changing the
4728 variable $: to a list of the desired characters.
4730 Since use of ^ fields can produce variable length records if the text to be
4731 formatted is short, you can suppress blank lines by putting the tilde (~)
4732 character anywhere in the line.
4733 (Normally you should put it in the front if possible, for visibility.)
4734 The tilde will be translated to a space upon output.
4735 If you put a second tilde contiguous to the first, the line will be repeated
4736 until all the fields on the line are exhausted.
4737 (If you use a field of the @ variety, the expression you supply had better
4738 not give the same value every time forever!)
4747 # a report on the /etc/passwd file
4750 Name Login Office Uid Gid Home
4751 ------------------------------------------------------------------
4754 @<<<<<<<<<<<<<<<<<< @||||||| @<<<<<<@>>>> @>>>> @<<<<<<<<<<<<<<<<<
4755 $name, $login, $office,$uid,$gid, $home
4759 # a report from a bug report form
4762 @<<<<<<<<<<<<<<<<<<<<<<< @||| @>>>>>>>>>>>>>>>>>>>>>>>
4764 ------------------------------------------------------------------
4767 Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4769 Index: @<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4770 \& $index, $description
4771 Priority: @<<<<<<<<<< Date: @<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4772 \& $priority, $date, $description
4773 From: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4774 \& $from, $description
4775 Assigned to: @<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4776 \& $programmer, $description
4777 \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4779 \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4781 \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4783 \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4785 \&~ ^<<<<<<<<<<<<<<<<<<<<<<<...
4793 It is possible to intermix prints with writes on the same output channel,
4794 but you'll have to handle $\- (lines left on the page) yourself.
4796 If you are printing lots of fields that are usually blank, you should consider
4797 using the reset operator between records.
4798 Not only is it more efficient, but it can prevent the bug of adding another
4799 field and forgetting to zero it.
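For something small enough to try by hand, here is a sketch of a two-column report with a top-of-page format. The filehandle name, the data and the scratch file name are all invented. The report is written to a file and read back so that the result can be inspected.

```perl
$file = "/tmp/format_demo.$$";          # hypothetical scratch file name
open(REPORT, ">$file") || die "can't create $file: $!";

format REPORT_TOP =
Name       Count
----------------
.

format REPORT =
@<<<<<<<<< @>>>>
$name,     $count
.

foreach $pair ('apples 3', 'pears 12') {
    ($name, $count) = split(' ', $pair);
    write(REPORT);                       # the first write also emits REPORT_TOP
}
close(REPORT);

open(REPORT, $file) || die "can't reopen $file: $!";
@lines = <REPORT>;                       # two header lines, then one line per record
close(REPORT);
unlink($file);
```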
4800 .Sh "Interprocess Communication"
4801 The IPC facilities of perl are built on the Berkeley socket mechanism.
4802 If you don't have sockets, you can ignore this section.
4803 The calls have the same names as the corresponding system calls,
4804 but the arguments tend to differ, for two reasons.
4805 First, perl file handles work differently than C file descriptors.
4806 Second, perl already knows the length of its strings, so you don't need
4807 to pass that information.
4808 Here is a sample client (untested):
4811 ($them,$port) = @ARGV;
4812 $port = 2345 unless $port;
4813 $them = 'localhost' unless $them;
4815 $SIG{'INT'} = 'dokill';
4816 sub dokill { kill 9,$child if $child; }
4818 require 'sys/socket.ph';
4820 $sockaddr = 'S n a4 x8';
4821 chop($hostname = `hostname`);
4823 ($name, $aliases, $proto) = getprotobyname('tcp');
4824 ($name, $aliases, $port) = getservbyname($port, 'tcp')
4825 unless $port =~ /^\ed+$/;
4827 ($name, $aliases, $type, $len, $thisaddr) = gethostbyname($hostname);
4833 ($name, $aliases, $type, $len, $thataddr) = gethostbyname($them);
4835 $this = pack($sockaddr, &AF_INET, 0, $thisaddr);
4836 $that = pack($sockaddr, &AF_INET, $port, $thataddr);
4838 socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!";
4839 bind(S, $this) || die "bind: $!";
4840 connect(S, $that) || die "connect: $!";
4842 select(S); $| = 1; select(stdout);
4844 if ($child = fork) {
4858 And here's a server:
4862 $port = 2345 unless $port;
4864 require 'sys/socket.ph';
4866 $sockaddr = 'S n a4 x8';
4868 ($name, $aliases, $proto) = getprotobyname('tcp');
4869 ($name, $aliases, $port) = getservbyname($port, 'tcp')
4870 unless $port =~ /^\ed+$/;
4872 $this = pack($sockaddr, &AF_INET, $port, "\e0\e0\e0\e0");
4874 select(NS); $| = 1; select(stdout);
4876 socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!";
4877 bind(S, $this) || die "bind: $!";
4878 listen(S, 5) || die "listen: $!";
4880 select(S); $| = 1; select(stdout);
4883 print "Listening again\en";
4884 ($addr = accept(NS,S)) || die $!;
4885 print "accept ok\en";
4887 ($af,$port,$inetaddr) = unpack($sockaddr,$addr);
4888 @inetaddr = unpack('C4',$inetaddr);
4889 print "$af $port @inetaddr\en";
4898 .Sh "Predefined Names"
4899 The following names have special meaning to
4901 I could have used alphabetic symbols for some of these, but I didn't want
4902 to take the chance that someone would say reset \*(L"a\-zA\-Z\*(R" and wipe them all out.
4904 You'll just have to suffer along with these silly symbols.
4905 Most of them have reasonable mnemonics, or analogues in one of the shells.
4907 The default input and pattern-searching space.
4908 The following pairs are equivalent:
4912 while (<>) {\|.\|.\|. # only equivalent in while!
4913 while ($_ = <>) {\|.\|.\|.
4917 $_ \|=~ \|/\|^Subject:/
4928 (Mnemonic: underline is understood in certain operations.)
4930 The current input line number of the last filehandle that was read.
4932 Remember that only an explicit close on the filehandle resets the line number.
4933 Since <> never does an explicit close, line numbers increase across ARGV files
4934 (but see examples under eof).
4935 (Mnemonic: many programs use . to mean the current line number.)
4937 The input record separator, newline by default.
4940 RS variable, including treating blank lines as delimiters
4941 if set to the null string.
4942 You may set it to a multicharacter string to match a multi-character delimiter.
4944 Note that setting it to "\en\en" means something slightly different
4945 than setting it to "", if the file contains consecutive blank lines.
4946 Setting it to "" will treat two or more consecutive blank lines as a single
4948 Setting it to "\en\en" will blindly assume that the next input character
4949 belongs to the next paragraph, even if it's a newline.
4950 (Mnemonic: / is used to delimit line boundaries when quoting poetry.)
4952 The output field separator for the print operator.
4953 Ordinarily the print operator simply prints out the comma separated fields
4955 In order to get behavior more like
4957 set this variable as you would set
4959 OFS variable to specify what is printed between fields.
4960 (Mnemonic: what is printed when there is a , in your print statement.)
4962 This is like $, except that it applies to array values interpolated into
4963 a double-quoted string (or similar interpreted string).
4965 (Mnemonic: obvious, I think.)
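A brief sketch, with an invented array, of the default separator and a temporary change to it:

```perl
@fields  = ('a', 'b', 'c');
$default = "@fields";                 # elements joined with a space

{
    local($") = ',';                  # change the separator for this block only
    $joined = "@fields";
}
$after = "@fields";                   # back to the default
```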
4967 The output record separator for the print operator.
4968 Ordinarily the print operator simply prints out the comma separated fields
4969 you specify, with no trailing newline or record separator assumed.
4970 In order to get behavior more like
4972 set this variable as you would set
4974 ORS variable to specify what is printed at the end of the print.
4975 (Mnemonic: you set $\e instead of adding \en at the end of the print.
4976 Also, it's just like /, but it's what you get \*(L"back\*(R" from
4979 The output format for printed numbers.
4980 This variable is a half-hearted attempt to emulate
4983 There are times, however, when
4987 have differing notions of what
4989 Also, the initial value is %.20g rather than %.6g, so you need to set $#
4993 (Mnemonic: # is the number sign.)
4995 The current page number of the currently selected output channel.
4996 (Mnemonic: % is page number in nroff.)
4998 The current page length (printable lines) of the currently selected output channel.
5001 (Mnemonic: = has horizontal lines.)
5003 The number of lines left on the page of the currently selected output channel.
5004 (Mnemonic: lines_on_page \- lines_printed.)
5006 The name of the current report format for the currently selected output channel.
5008 Default is name of the filehandle.
5009 (Mnemonic: brother to $^.)
5011 The name of the current top-of-page format for the currently selected output channel.
5013 Default is name of the filehandle with \*(L"_TOP\*(R" appended.
5014 (Mnemonic: points to top of page.)
5016 If set to nonzero, forces a flush after every write or print on the currently
5017 selected output channel.
5021 will typically be line buffered if output is to the
5022 terminal and block buffered otherwise.
5023 Setting this variable is useful primarily when you are outputting to a pipe,
5024 such as when you are running a
5026 script under rsh and want to see the
5027 output as it's happening.
5028 (Mnemonic: when you want your pipes to be piping hot.)
5030 The process number of the
5032 running this script.
5033 (Mnemonic: same as shells.)
5035 The status returned by the last pipe close, backtick (\`\`) command or
5038 Note that this is the status word returned by the wait() system
5039 call, so the exit value of the subprocess is actually ($? >> 8).
5040 $? & 255 gives which signal, if any, the process died from, and whether
5041 there was a core dump.
5042 (Mnemonic: similar to sh and ksh.)
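Pulling the status word apart might look like the following sketch, which assumes a Unix-style sh is available to run:

```perl
system('sh', '-c', 'exit 3');   # run a command that exits with value 3
$exit = $? >> 8;                # the exit value proper
$low  = $? & 255;               # signal and core dump information; 0 for a normal exit
```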
5044 The string matched by the last successful pattern match
5045 (not counting any matches hidden
5046 within a BLOCK or eval enclosed by the current BLOCK).
5047 (Mnemonic: like & in some editors.)
5049 The string preceding whatever was matched by the last successful pattern match
5050 (not counting any matches hidden within a BLOCK or eval enclosed by the current BLOCK).
5052 (Mnemonic: \` often precedes a quoted string.)
5054 The string following whatever was matched by the last successful pattern match
5055 (not counting any matches hidden within a BLOCK or eval enclosed by the current BLOCK).
5057 (Mnemonic: \' often follows a quoted string.)
5064 print "$\`:$&:$\'\en"; # prints abc:def:ghi
5068 The last bracket matched by the last search pattern.
5069 This is useful if you don't know which of a set of alternative patterns
5074 /Version: \|(.*\|)|Revision: \|(.*\|)\|/ \|&& \|($rev = $+);
5077 (Mnemonic: be positive and forward looking.)
5079 Set to 1 to do multiline matching within a string, 0 to tell
5081 that it can assume that strings contain a single line, for the purpose
5082 of optimizing pattern matches.
5083 Pattern matches on strings containing multiple newlines can produce confusing
5084 results when $* is 0.
5086 (Mnemonic: * matches multiple things.)
5087 Note that this variable only influences the interpretation of ^ and $.
5088 A literal newline can be searched for even when $* == 0.
5090 Contains the name of the file containing the
5092 script being executed.
5093 Assigning to $0 modifies the argument area that the ps(1) program sees.
5094 (Mnemonic: same as sh and ksh.)
5096 Contains the subpattern from the corresponding set of parentheses in the last
5097 pattern matched, not counting patterns matched in nested blocks that have
5098 been exited already.
5099 (Mnemonic: like \edigit.)
5101 The index of the first element in an array, and of the first character in a substring.
5103 Default is 0, but you could set it to 1 to make
5108 when subscripting and when evaluating the index() and substr() functions.
5109 (Mnemonic: [ begins subscripts.)
5111 The string printed out when you say \*(L"perl -v\*(R".
5112 It can be used to determine at the beginning of a script whether the perl
5113 interpreter executing the script is in the right range of versions.
5114 If used in a numeric context, returns the version + patchlevel / 1000.
5119 # see if getc is available
5120 ($version,$patchlevel) =
5121 $] =~ /(\ed+\e.\ed+).*\enPatch level: (\ed+)/;
5122 print STDERR "(No filename completion available.)\en"
5123 if $version * 1000 + $patchlevel < 2016;
5125 or, used numerically,
5127 warn "No checksumming!\en" if $] < 3.019;
5130 (Mnemonic: Is this version of perl in the right bracket?)
5132 The subscript separator for multi-dimensional array emulation.
5133 If you refer to an associative array element as
5139 $foo{join($;, $a, $b, $c)}
5143 @foo{$a,$b,$c} # a slice\*(--note the @
5147 ($foo{$a},$foo{$b},$foo{$c})
5150 Default is "\e034", the same as SUBSEP in
5151 .IR awk .
5152 Note that if your keys contain binary data there might not be any safe
5153 value for $;.
5154 (Mnemonic: comma (the syntactic subscript separator) is a semi-semicolon.
5155 Yeah, I know, it's pretty lame, but $, is already taken for something more
5156 important.)
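As a rough illustration of the subscript-separator emulation, the following sketch stores one element under a two-part key and then recovers the parts by splitting the stored key on $;. The array name %table and the key values are invented for the example.

```perl
# Hypothetical data: %table and its keys are made up for this sketch.
$table{'row1', 'col2'} = 'value';

foreach $key (keys %table) {
    # The stored key is really join($;, 'row1', 'col2').
    ($row, $col) = split($;, $key);
    print "$row / $col => $table{$key}\n";
}
```

Splitting only works cleanly if no key component itself contains the $; character.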
5158 If used in a numeric context, yields the current value of errno, with all the
5159 usual caveats.
5160 (This means that you shouldn't depend on the value of $! to be anything
5161 in particular unless you've gotten a specific error return indicating a
5162 system error.)
5163 If used in a string context, yields the corresponding system error string.
5164 You can assign to $! in order to set errno
5165 if, for instance, you want $! to return the string for error n, or you want
5166 to set the exit value for the die operator.
5167 (Mnemonic: What just went bang?)
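A small sketch of the two contexts, assuming the path named below does not exist on your system (any failing system call would serve equally well). The variable names $errno and $text are chosen only for the example.

```perl
# The path is assumed not to exist; it is here only to force a failure.
if (!open(MISSING, '/nonexistent/dir/no-such-file')) {
    $errno = $! + 0;    # numeric context: the errno value
    $text  = "$!";      # string context: the system error string
    print "open failed with errno $errno ($text)\n";
}
```

Note that $! is examined only after open() has reported a failure, in keeping with the caveat above.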
5169 The perl syntax error message from the last eval command.
5170 If null, the last eval parsed and executed correctly (although the operations
5171 you invoked may have failed in the normal fashion).
5172 (Mnemonic: Where was the syntax error \*(L"at\*(R"?)
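The following sketch shows one way to test $@ after an eval. It uses the eval {} form, so it assumes a perl recent enough to have that construct, and it relies on a run-time error (division by zero) rather than a syntax error to populate $@. The variable names are illustrative.

```perl
$divisor = 0;                         # deliberately chosen to cause a failure
$quotient = eval { 10 / $divisor; };

if ($@ ne '') {
    print "eval failed: $@";          # the message already ends in a newline
} else {
    print "quotient is $quotient\n";
}
```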
5174 The real uid of this process.
5175 (Mnemonic: it's the uid you came FROM, if you're running setuid.)
5177 The effective uid of this process.
5182 $< = $>; # set real uid to the effective uid
5183 ($<,$>) = ($>,$<); # swap real and effective uid
5186 (Mnemonic: it's the uid you went TO, if you're running setuid.)
5187 Note: $< and $> can only be swapped on machines supporting setreuid().
5189 The real gid of this process.
5190 If you are on a machine that supports membership in multiple groups
5191 simultaneously, gives a space separated list of groups you are in.
5192 The first number is the one returned by getgid(), and the subsequent ones
5193 by getgroups(), one of which may be the same as the first number.
5194 (Mnemonic: parentheses are used to GROUP things.
5195 The real gid is the group you LEFT, if you're running setgid.)
5197 The effective gid of this process.
5198 If you are on a machine that supports membership in multiple groups
5199 simultaneously, gives a space separated list of groups you are in.
5200 The first number is the one returned by getegid(), and the subsequent ones
5201 by getgroups(), one of which may be the same as the first number.
5202 (Mnemonic: parentheses are used to GROUP things.
5203 The effective gid is the group that's RIGHT for you, if you're running setgid.)
5205 Note: $<, $>, $( and $) can only be set on machines that support the
5206 corresponding set[re][ug]id() routine.
5207 $( and $) can only be swapped on machines supporting setregid().
5209 The current set of characters after which a string may be broken to
5210 fill continuation fields (starting with ^) in a format.
5211 Default is "\ \en-", to break on whitespace or hyphens.
5212 (Mnemonic: a \*(L"colon\*(R" in poetry is a part of a line.)
5214 The current value of the debugging flags.
5219 The maximum system file descriptor, ordinarily 2. System file descriptors
5220 are passed to subprocesses, while higher file descriptors are not.
5221 During an open, system file descriptors are preserved even if the open
5222 fails. Ordinary file descriptors are closed before the open is attempted.
5224 The current value of the inplace-edit extension.
5225 Use undef to disable inplace editing.
5230 What formats output to perform a formfeed. Default is \ef.
5232 The internal flag that the debugger clears so that it doesn't
5233 debug itself. You could conceivably disable debugging yourself
5234 by clearing it.
5236 The time at which the script began running, in seconds since the epoch.
5237 The values returned by the
5238 .B \-M ,
5239 .B \-A
5240 and
5241 .B \-C
5242 filetests are based on this value.
5244 The current value of the warning switch.
5245 (Mnemonic: related to the
5246 .B \-w
5247 switch.)
5249 The name that Perl itself was executed as, from argv[0].
5251 $ARGV contains the name of the current file when reading from <>.
5253 The array ARGV contains the command line arguments intended for the script.
5254 Note that $#ARGV is generally the number of arguments minus one, since
5255 $ARGV[0] is the first argument, NOT the command name.
5256 See $0 for the command name.
5258 The array INC contains the list of places to look for
5259 .I perl
5260 scripts to be
5261 evaluated by the \*(L"do EXPR\*(R" command or the \*(L"require\*(R" command.
5262 It initially consists of the arguments to any
5263 .B \-I
5264 command line switches, followed
5265 by the default
5266 .I perl
5267 library, probably \*(L"/usr/local/lib/perl\*(R",
5268 followed by \*(L".\*(R", to represent the current directory.
5270 The associative array INC contains entries for each filename that has
5271 been included via \*(L"do\*(R" or \*(L"require\*(R".
5272 The key is the filename you specified, and the value is the location of
5273 the file actually found.
5274 The \*(L"require\*(R" command uses this array to determine whether
5275 a given file has already been included.
5277 The associative array ENV contains your current environment.
5278 Setting a value in ENV changes the environment for child processes.
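A minimal sketch of the effect on child processes, assuming a Unix-style shell is available for backticks. GREETING is a made-up variable name used only here.

```perl
# GREETING is a hypothetical environment variable for this sketch.
$ENV{'GREETING'} = 'hello from the parent';

# The backslash keeps perl from interpolating $GREETING itself,
# so it is the child shell that expands it from its environment.
$seen = `echo "\$GREETING"`;
print "child saw: $seen";
```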
5280 The associative array SIG is used to set signal handlers for various signals.
5285 sub handler { # 1st argument is signal name
5286     local($sig) = @_;
5287     print "Caught a SIG$sig\-\|\-shutting down\en";
5288     close(LOG);
5289     exit(0);
5290 }
5292 $SIG{\'INT\'} = \'handler\';
5293 $SIG{\'QUIT\'} = \'handler\';
5295 $SIG{\'INT\'} = \'DEFAULT\'; # restore default action
5296 $SIG{\'QUIT\'} = \'IGNORE\'; # ignore SIGQUIT
5299 The SIG array only contains values for the signals actually set within
5300 the perl script.
5301 .Sh "Packages"
5302 Perl provides a mechanism for alternate namespaces to protect packages from
5303 stomping on each other's variables.
5304 By default, a perl script starts compiling into the package known as \*(L"main\*(R".
5305 By use of the
5306 .I package
5307 declaration, you can switch namespaces.
5308 The scope of the package declaration is from the declaration itself to the end
5309 of the enclosing block (the same scope as the local() operator).
5310 Typically it would be the first declaration in a file to be included by
5311 the \*(L"require\*(R" operator.
5312 You can switch into a package in more than one place; it merely influences
5313 which symbol table is used by the compiler for the rest of that block.
5314 You can refer to variables and filehandles in other packages by prefixing
5315 the identifier with the package name and a single quote.
5316 If the package name is null, the \*(L"main\*(R" package is assumed.
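The sketch below illustrates the package-qualified notation with a single quote. The package name counter and the subroutine bump are invented for the example. Later versions of perl spell the separator as :: and may warn about the single-quote form, so treat this as an illustration of the notation described here rather than as current practice.

```perl
package counter;            # hypothetical package name
$count = 0;
sub bump { $count++; }      # this $count is $counter'count

package main;
$count = 100;               # a different variable: $main'count
&counter'bump;
&counter'bump;
print "main: ", $count, "  counter: ", $counter'count, "\n";
```

The two variables named $count never interfere with each other because each lives in its own symbol table.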
5318 Only identifiers starting with letters are stored in the package's symbol
5319 table.
5320 All other symbols are kept in package \*(L"main\*(R".
5321 In addition, the identifiers STDIN, STDOUT, STDERR, ARGV, ARGVOUT, ENV, INC
5322 and SIG are forced to be in package \*(L"main\*(R", even when used for
5323 other purposes than their built-in one.
5324 Note also that, if you have a package called \*(L"m\*(R", \*(L"s\*(R"
5325 or \*(L"y\*(R", then you can't use the qualified form of an identifier since it
5326 will be interpreted instead as a pattern match, a substitution
5327 or a translation.
5329 Eval'ed strings are compiled in the package in which the eval was compiled.
5331 (Assignments to $SIG{}, however, assume the signal handler specified is in the
5332 main package.
5333 Qualify the signal handler name if you wish to have a signal handler in
5334 a package.)
5335 For an example, examine perldb.pl in the perl library.
5336 It initially switches to the DB package so that the debugger doesn't interfere
5337 with variables in the script you are trying to debug.
5338 At various points, however, it temporarily switches back to the main package
5339 to evaluate various expressions in the context of the main package.
5341 The symbol table for a package happens to be stored in the associative array
5342 of that name prepended with an underscore.
5343 The value in each entry of the associative array is
5344 what you are referring to when you use the *name notation.
5345 In fact, the following have the same effect (in package main, anyway),
5346 though the first is more
5347 efficient because it does the symbol table lookups at compile time:
5350 local(*foo) = *bar;
5352 local($_main{'foo'}) = $_main{'bar'};
5355 You can use this to print out all the variables in a package, for instance.
5356 Here is dumpvar.pl from the perl library:
5359 package dumpvar;
5361 sub main'dumpvar {
5362 \&    ($package) = @_;
5363 \&    local(*stab) = eval("*_$package");
5364 \&    while (($key,$val) = each(%stab)) {
5365 \&        {
5366 \&            local(*entry) = $val;
5367 \&            if (defined $entry) {
5368 \&                print "\e$$key = '$entry'\en";
5369 \&            }
5371 \&            if (defined @entry) {
5372 \&                print "\e@$key = (\en";
5373 \&                foreach $num ($[ .. $#entry) {
5374 \&                    print "  $num\et'",$entry[$num],"'\en";
5375 \&                }
5376 \&                print ")\en";
5377 \&            }
5379 \&            if ($key ne "_$package" && defined %entry) {
5380 \&                print "\e%$key = (\en";
5381 \&                foreach $key (sort keys(%entry)) {
5382 \&                    print "  $key\et'",$entry{$key},"'\en";
5383 \&                }
5384 \&                print ")\en";
5385 \&            }
5386 \&        }
5387 \&    }
5388 \&}
5391 Note that, even though the subroutine is compiled in package dumpvar, the
5392 name of the subroutine is qualified so that its name is inserted into package
5393 \*(L"main\*(R".
5394 .Sh "Style"
5395 Each programmer will, of course, have his or her own preferences in regard
5396 to formatting, but there are some general guidelines that will make your
5397 programs easier to read.
5399 Just because you CAN do something a particular way doesn't mean that
5400 you SHOULD do it that way.
5401 .I Perl
5402 is designed to give you several ways to do anything, so consider picking
5403 the most readable one.
5404 For instance
5406 open(FOO,$foo) || die "Can't open $foo: $!";
5408 is better than
5410 die "Can't open $foo: $!" unless open(FOO,$foo);
5412 because the second way hides the main point of the statement in a
5413 modifier.
5414 On the other hand
5416 print "Starting analysis\en" if $verbose;
5418 is better than
5420 $verbose && print "Starting analysis\en";
5422 since the main point isn't whether the user typed -v or not.
5424 Similarly, just because an operator lets you assume default arguments
5425 doesn't mean that you have to make use of the defaults.
5426 The defaults are there for lazy systems programmers writing one-shot
5427 programs.
5428 If you want your program to be readable, consider supplying the argument.
5430 Along the same lines, just because you
5431 .I can
5432 omit parentheses in many places doesn't mean that you ought to:
5435 return print reverse sort num values array;
5436 return print(reverse(sort num (values(%array))));
5439 When in doubt, parenthesize.
5440 At the very least it will let some poor schmuck bounce on the % key in vi.
5442 Even if you aren't in doubt, consider the mental welfare of the person who
5443 has to maintain the code after you, and who will probably put parens in
5444 the wrong place.
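5446 Don't go through silly contortions to exit a loop at the top or the
5447 bottom, when
5448 .I perl
5449 provides the "last" operator so you can exit in the middle.
5450 Just outdent it a little to make it more visible:
5454     line:
5455       for (;;) {
5456           statements;
5457       last line if $foo;
5458           next line if /^#/;
5459           statements;
5460       }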
5464 Don't be afraid to use loop labels\*(--they're there to enhance readability as
5465 well as to allow multi-level loop breaks.
5468 For portability, when using features that may not be implemented on every
5469 machine, test the construct in an eval to see if it fails.
5470 If you know the version or patchlevel in which a particular feature was implemented,
5471 you can test $] to see if it will be there.
5473 Choose mnemonic identifiers.
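The portability guideline above (test a construct in an eval to see if it fails) might be sketched as follows. symlink() is used only as a stand-in for whatever feature you are unsure of, and $symlink_exists is a name chosen for the example.

```perl
# symlink() stands in for any feature that may be unimplemented locally.
# The call itself is expected to fail harmlessly; we only care whether
# perl raised a fatal "unimplemented" error, which eval traps in $@.
eval 'symlink("", "");';
$symlink_exists = ($@ eq '') ? 1 : 0;

print "symlink is ", ($symlink_exists ? "available" : "not available"), "\n";
```

The script can then branch on $symlink_exists instead of dying on machines that lack the call.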
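5476 .Sh "Debugging"
5477 If you invoke
5478 .I perl
5479 with a
5480 .B \-d
5481 switch, your script will be run under a debugging monitor.
5482 It will halt before the first executable statement and ask you for a
5483 command, such as: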
5484 .Ip "h" 12 4
5485 Prints out a help message.
5488 .Ip "s" 12 4
5489 Single step.
5490 Executes until it reaches the beginning of another statement.
5491 .Ip "n" 12 4
5492 Next.
5493 Executes over subroutine calls, until it reaches the beginning of the
5494 next statement.
5495 .Ip "f" 12 4
5496 Finish.
5497 Executes statements until it has finished the current subroutine.
5498 .Ip "c" 12 4
5499 Continue.
5500 Executes until the next breakpoint is reached.
5501 .Ip "c line" 12 4
5502 Continue to the specified line.
5503 Inserts a one-time-only breakpoint at the specified line.
5506 .Ip "l min+incr" 12 4
5507 List incr+1 lines starting at min.
5508 If min is omitted, starts where last listing left off.
5509 If incr is omitted, previous value of incr is used.
5510 .Ip "l min-max" 12 4
5511 List lines in the indicated range.
5512 .Ip "l line" 12 4
5513 List just the indicated line.
5516 .Ip "-" 12 4
5517 List previous window.
5518 .Ip "w line" 12 4
5519 List window around line.
5520 .Ip "l subname" 12 4
5521 List subroutine.
5522 If it's a long subroutine it just lists the beginning.
5523 Use \*(L"l\*(R" to list more.
5524 .Ip "/pattern/" 12 4
5525 Regular expression search forward for pattern; the final / is optional.
5526 .Ip "?pattern?" 12 4
5527 Regular expression search backward for pattern; the final ? is optional.
5528 .Ip "L" 12 4
5529 List lines that have breakpoints or actions.
5530 .Ip "S" 12 4
5531 Lists the names of all subroutines.
5532 .Ip "t" 12 4
5533 Toggle trace mode on or off.
5534 .Ip "b line condition" 12 4
5535 Set a breakpoint.
5536 If line is omitted, sets a breakpoint on the
5537 line that is about to be executed.
5538 If a condition is specified, it is evaluated each time the statement is
5539 reached and a breakpoint is taken only if the condition is true.
5540 Breakpoints may only be set on lines that begin an executable statement.
5541 .Ip "b subname condition" 12 4
5542 Set breakpoint at first executable line of subroutine.
5543 .Ip "d line" 12 4
5544 Delete breakpoint.
5545 If line is omitted, deletes the breakpoint on the
5546 line that is about to be executed.
5547 .Ip "D" 12 4
5548 Delete all breakpoints.
5549 .Ip "a line command" 12 4
5550 Set an action for line.
5551 A multi-line command may be entered by backslashing the newlines.
5552 .Ip "A" 12 4
5553 Delete all line actions.
5554 .Ip "< command" 12 4
5555 Set an action to happen before every debugger prompt.
5556 A multi-line command may be entered by backslashing the newlines.
5557 .Ip "> command" 12 4
5558 Set an action to happen after the prompt when you've just given a command
5559 to return to executing the script.
5560 A multi-line command may be entered by backslashing the newlines.
5561 .Ip "V package" 12 4
5562 List all variables in package.
5563 Default is main package.
5564 .Ip "! number" 12 4
5565 Redo a debugging command.
5566 If number is omitted, redoes the previous command.
5567 .Ip "! -number" 12 4
5568 Redo the command that was that many commands ago.
5569 .Ip "H -number" 12 4
5570 Display last n commands.
5571 Only commands longer than one character are listed.
5572 If number is omitted, lists them all.
5575 .Ip "command" 12 4
5576 Execute command as a perl statement.
5577 A missing semicolon will be supplied.
5578 .Ip "p expr" 12 4
5579 Same as \*(L"print DB'OUT expr\*(R".
5580 The DB'OUT filehandle is opened to /dev/tty, regardless of where STDOUT
5581 may be redirected to.
5583 If you want to modify the debugger, copy perldb.pl from the perl library
5584 to your current directory and modify it as necessary.
5585 (You'll also have to put -I. on your command line.)
5586 You can do some customization by setting up a .perldb file which contains
5587 initialization code.
5588 For instance, you could make aliases like these:
5591 $DB'alias{'len'} = 's/^len(.*)/p length($1)/';
5592 $DB'alias{'stop'} = 's/^stop (at|in)/b/';
5593 $DB'alias{'.'} =
5594   's/^\e./p "\e$DB\e'sub(\e$DB\e'line):\et",\e$DB\e'line[\e$DB\e'line]/';
5597 .Sh "Setuid Scripts"
5598 .I Perl
5599 is designed to make it easy to write secure setuid and setgid scripts.
5600 Unlike shells, which are based on multiple substitution passes on each line
5601 of the script,
5602 .I perl
5603 uses a more conventional evaluation scheme with fewer hidden \*(L"gotchas\*(R".
5604 Additionally, since the language has more built-in functionality, it
5605 has to rely less upon external (and possibly untrustworthy) programs to
5606 accomplish its purposes.
5608 In an unpatched 4.2 or 4.3bsd kernel, setuid scripts are intrinsically
5609 insecure, but this kernel feature can be disabled.
5610 If it is,
5611 .I perl
5612 can emulate the setuid and setgid mechanism when it notices the otherwise
5613 useless setuid/gid bits on perl scripts.
5614 If the kernel feature isn't disabled,
5615 .I perl
5616 will complain loudly that your setuid script is insecure.
5617 You'll need to either disable the kernel setuid script feature, or put
5618 a C wrapper around the script.
5620 When perl is executing a setuid script, it takes special precautions to
5621 prevent you from falling into any obvious traps.
5622 (In some ways, a perl script is more secure than the corresponding
5623 C program.)
5624 Any command line argument, environment variable, or input is marked as
5625 \*(L"tainted\*(R", and may not be used, directly or indirectly, in any
5626 command that invokes a subshell, or in any command that modifies files,
5627 directories or processes.
5628 Any variable that is set within an expression that has previously referenced
5629 a tainted value also becomes tainted (even if it is logically impossible
5630 for the tainted value to influence the variable).
5635 $foo = shift; # $foo is tainted
5636 $bar = $foo,\'bar\'; # $bar is also tainted
5637 $xxx = <>; # Tainted
5638 $path = $ENV{\'PATH\'}; # Tainted, but see below
5639 $abc = \'abc\'; # Not tainted
5642 system "echo $foo"; # Insecure
5643 system "/bin/echo", $foo; # Secure (doesn't use sh)
5644 system "echo $bar"; # Insecure
5645 system "echo $abc"; # Insecure until PATH set
5648 $ENV{\'PATH\'} = \'/bin:/usr/bin\';
5649 $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\';
5651 $path = $ENV{\'PATH\'}; # Not tainted
5652 system "echo $abc"; # Is secure now!
5655 open(FOO,"$foo"); # OK
5656 open(FOO,">$foo"); # Not OK
5658 open(FOO,"echo $foo|"); # Not OK, but...
5659 open(FOO,"-|") || exec \'echo\', $foo; # OK
5661 $zzz = `echo $foo`; # Insecure, zzz tainted
5663 unlink $abc,$foo; # Insecure
5664 umask $foo; # Insecure
5667 exec "echo $foo"; # Insecure
5668 exec "echo", $foo; # Secure (doesn't use sh)
5669 exec "sh", \'-c\', $foo; # Considered secure, alas
5672 The taintedness is associated with each scalar value, so some elements
5673 of an array can be tainted, and others not.
5675 If you try to do something insecure, you will get a fatal error saying
5676 something like \*(L"Insecure dependency\*(R" or \*(L"Insecure PATH\*(R".
5677 Note that you can still write an insecure system call or exec,
5678 but only by explicitly doing something like the last example above.
5679 You can also bypass the tainting mechanism by referencing
5680 subpatterns\*(--
5681 .I perl
5682 presumes that if you reference a substring using $1, $2, etc, you knew
5683 what you were doing when you wrote the pattern:
5686 $ARGV[0] =~ /^\-P(\ew+)$/;
5687 $printer = $1; # Not tainted
5690 This is fairly secure since \ew+ doesn't match shell metacharacters.
5691 Use of .+ would have been insecure, but
5692 .I perl
5693 doesn't check for that, so you must be careful with your patterns.
5694 This is the ONLY mechanism for untainting user supplied filenames if you
5695 want to do file operations on them (unless you make $> equal to $<).
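One point worth making explicit is that $1 should only be trusted when the match actually succeeded; otherwise it may still hold a value from an earlier match. The sketch below checks the result of the match first. The string in $arg stands in for $ARGV[0], and the printer name is invented for the example.

```perl
# Stand-in for $ARGV[0]; the printer name is made up for this sketch.
$arg = '-Plaser1';

if ($arg =~ /^-P(\w+)$/) {
    $printer = $1;          # only word characters can get this far
    print "printer: $printer\n";
} else {
    die "unacceptable printer argument\n";
}
```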
5697 It's also possible to get into trouble with other operations that don't care
5698 whether they use tainted values.
5699 Make judicious use of the file tests in dealing with any user-supplied
5700 filenames.
5701 When possible, do opens and such after setting $> = $<.
5702 .I Perl
5703 doesn't prevent you from opening tainted filenames for reading, so be
5704 careful what you print out.
5705 The tainting mechanism is intended to prevent stupid mistakes, not to remove
5706 the need for thought.
5707 .SH ENVIRONMENT
5708 .Ip HOME 12 4
5709 Used if chdir has no argument.
5710 .Ip LOGDIR 12 4
5711 Used if chdir has no argument and HOME is not set.
5712 .Ip PATH 12 4
5713 Used in executing subprocesses, and in finding the script if \-S
5714 is used.
5715 .Ip PERLLIB 12 4
5716 A colon-separated list of directories in which to look for Perl library
5717 files before looking in the standard library and the current directory.
5718 .Ip PERLDB 12 4
5719 The command used to get the debugger code. If unset, uses
5722 require 'perldb.pl'
5725 Apart from these,
5726 .I perl
5727 uses no other environment variables, except to make them available
5728 to the script being executed, and to child processes.
5729 However, scripts running setuid would do well to execute the following lines
5730 before doing anything else, just to keep people honest:
5734 $ENV{\'PATH\'} = \'/bin:/usr/bin\'; # or whatever you need
5735 $ENV{\'SHELL\'} = \'/bin/sh\' if $ENV{\'SHELL\'} ne \'\';
5736 $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\';
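5739 .SH AUTHOR
5740 Larry Wall <lwall@netlabs.com>
5741 .br
5742 MS-DOS port by Diomidis Spinellis <dds@cc.ic.ac.uk>
5743 .SH FILES
5744 /tmp/perl\-eXXXXXX temporary file for
5745 .B \-e
5746 commands.
5747 .SH SEE ALSO
5748 a2p awk to perl translator
5749 .br
5750 s2p sed to perl translator
5751 .SH DIAGNOSTICS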
5752 Compilation errors will tell you the line number of the error, with an
5753 indication of the next token or token type that was to be examined.
5754 (In the case of a script passed to
5755 .I perl
5756 via
5757 .B \-e
5758 switches, each
5759 .B \-e
5760 is counted as one line.)
5762 Setuid scripts have additional constraints that can produce error messages
5763 such as \*(L"Insecure dependency\*(R".
5764 See the section on setuid scripts.
5765 .SH TRAPS
5766 Accustomed
5767 .I awk
5768 users should take special note of the following:
5770 Semicolons are required after all simple statements in
5771 .I perl
5772 (except at the end of a block).
5773 Newline is not a statement delimiter.
5775 Curly brackets are required on ifs and whiles.
5777 Variables begin with $ or @ in
5778 .IR perl .
5780 Arrays index from 0 unless you set $[.
5781 Likewise string positions in substr() and index().
5783 You have to decide whether your array has numeric or string indices.
5785 Associative array values do not spring into existence upon mere reference.
5787 You have to decide whether you want to use string or numeric comparisons.
5789 Reading an input line does not split it for you. You get to split it yourself
5790 to an array.
5791 And the
5792 .I split
5793 operator has different arguments.
5795 The current input line is normally in $_, not $0.
5796 It generally does not have the newline stripped.
5797 ($0 is the name of the program executed.)
5799 $<digit> does not refer to fields\*(--it refers to substrings matched by the last
5800 match pattern.
5802 The
5803 .I print
5804 statement does not add field and record separators unless you set
5805 $, and $\e.
5807 You must open your files before you print to them.
5809 The range operator is \*(L".\|.\*(R", not comma.
5810 (The comma operator works as in C.)
5812 The match operator is \*(L"=~\*(R", not \*(L"~\*(R".
5813 (\*(L"~\*(R" is the one's complement operator, as in C.)
5815 The exponentiation operator is \*(L"**\*(R", not \*(L"^\*(R".
5816 (\*(L"^\*(R" is the XOR operator, as in C.)
5818 The concatenation operator is \*(L".\*(R", not the null string.
5819 (Using the null string would render \*(L"/pat/ /pat/\*(R" unparsable,
5820 since the third slash would be interpreted as a division operator\*(--the
5821 tokener is in fact slightly context sensitive for operators like /, ?, and <.
5822 And in fact, . itself can be the beginning of a number.)
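5830 The following variables work differently
5833 Awk \h'|2.5i'Perl
5834 ARGC \h'|2.5i'$#ARGV
5835 ARGV \h'|2.5i'@ARGV
5836 FILENAME\h'|2.5i'$ARGV
5837 FNR \h'|2.5i'$. \- something
5838 FS \h'|2.5i'(whatever you like)
5839 NF \h'|2.5i'$#Fld, or some such
5840 NR \h'|2.5i'$.
5841 OFMT \h'|2.5i'$#
5842 OFS \h'|2.5i'$,
5843 ORS \h'|2.5i'$\e
5844 RLENGTH \h'|2.5i'length($&)
5845 RS \h'|2.5i'$/
5846 RSTART \h'|2.5i'length($\`)
5847 SUBSEP \h'|2.5i'$;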
5851 When in doubt, run the
5852 .I awk
5853 construct through a2p and see what it gives you.
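Several of the points above (no automatic splitting, the newline left in place, arrays indexed from 0) can be seen together in a short sketch. The input line is a literal standing in for one read from <>, and @Fld is simply the array name used in the table above.

```perl
$_ = "alpha beta gamma\n";       # stands in for a line read from <>
chop;                            # remove the trailing newline yourself
@Fld = split(' ');               # awk-style split of $_ on whitespace

print "NF would be ", $#Fld + 1, "\n";
print "awk's \$2 is \$Fld[1] here: $Fld[1]\n";
```

With $[ left at its default of 0, awk's field N corresponds to element N-1 of the array.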
5855 Cerebral C programmers should take note of the following:
5857 Curly brackets are required on ifs and whiles.
5859 You should use \*(L"elsif\*(R" rather than \*(L"else if\*(R"
5870 There's no switch statement.
5872 Variables begin with $ or @ in
5873 .IR perl .
5875 Printf does not implement *.
5877 Comments begin with #, not /*.
5879 You can't take the address of anything.
5881 ARGV must be capitalized.
5883 The \*(L"system\*(R" calls link, unlink, rename, etc. return nonzero for success, not 0.
5885 Signal handlers deal with signal names, not numbers.
5887 Seasoned
5888 .I sed
5889 programmers should take note of the following:
5891 Backreferences in substitutions use $ rather than \e.
5893 The pattern matching metacharacters (, ), and | do not have backslashes in front.
5895 The range operator is .\|. rather than comma.
5897 Sharp shell programmers should take note of the following:
5899 The backtick operator does variable interpretation without regard to the
5900 presence of single quotes in the command.
5902 The backtick operator does no translation of the return value, unlike csh.
5904 Shells (especially csh) do several levels of substitution on each command line.
5905 .I Perl
5906 does substitution only in certain constructs such as double quotes,
5907 backticks, angle brackets and search patterns.
5909 Shells interpret scripts a little bit at a time.
5910 .I Perl
5911 compiles the whole program before executing it.
5913 The arguments are available via @ARGV, not $1, $2, etc.
5915 The environment is not automatically made available as variables.
5916 .SH ERRATA\0AND\0ADDENDA
5917 The Perl book,
5918 .I Programming\0Perl ,
5919 has the following omissions and goofs.
5921 On page 5, the examples which read
5924 eval "/usr/bin/perl
5926 should read
5928 eval "exec /usr/bin/perl
5932 On page 195, the equivalent to the System V sum program only works for
5933 very small files. To do larger files, use
5937 $checksum = unpack("%32C*",<>) % 32767;
5941 The descriptions of alarm and sleep refer to signal SIGALARM. These
5942 should refer to SIGALRM.
5944 The
5945 .B \-0
5946 switch to set the initial value of $/ was added to Perl after the book
5947 went to press.
5949 The
5950 .B \-l
5951 switch now does automatic line ending processing.
5953 The qx// construct is now a synonym for backticks.
5955 $0 may now be assigned to set the argument displayed by
5956 .I ps (1).
5958 The new @###.## format was omitted accidentally from the description
5959 of formats.
5961 It wasn't known at press time that s///ee caused multiple evaluations of
5962 the replacement expression. This is to be construed as a feature.
5964 (LIST) x $count now does array replication.
5966 There is now no limit on the number of parentheses in a regular expression.
5968 In double-quote context, more escapes are supported: \ee, \ea, \ex1b, \ec[,
5969 \el, \eL, \eu, \eU, \eE. The latter five control up/lower case translation.
5971 The
5972 .B $/
5973 variable may now be set to a multi-character delimiter.
5975 There is now a g modifier on ordinary pattern matching that causes it
5976 to iterate through a string finding multiple matches.
5978 All of the $^X variables are new except for $^T.
5980 The default top-of-form format for FILEHANDLE is now FILEHANDLE_TOP rather
5981 than top.
5983 The eval {} and sort {} constructs were added in version 4.018.
5985 The v and V (little-endian) template options for pack and unpack were
5986 added in 4.019.
5987 .SH BUGS
5989 .I Perl
5990 is at the mercy of your machine's definitions of various operations
5991 such as type casting, atof() and sprintf().
5993 If your stdio requires a seek or eof between reads and writes on a particular
5994 stream, so does
5995 .IR perl .
5996 (This doesn't apply to sysread() and syswrite().)
5998 While none of the built-in data types have any arbitrary size limits (apart
5999 from memory size), there are still a few arbitrary limits:
6000 a given identifier may not be longer than 255 characters,
6001 and no component of your PATH may be longer than 255 if you use \-S.
6002 A regular expression may not compile to more than 32767 bytes internally.
6004 .I Perl
6005 actually stands for Pathologically Eclectic Rubbish Lister, but don't tell
6006 anyone I said that.