perl 3.0 patch #23 patch #19, continued
8d063cd8 1.rn '' }`
450a55e4 2''' $Header: perl_man.1,v 3.0.1.7 90/08/09 04:24:03 lwall Locked $
8d063cd8 3'''
4''' $Log: perl.man.1,v $
450a55e4 5''' Revision 3.0.1.7 90/08/09 04:24:03 lwall
6''' patch19: added -x switch to extract script from input trash
7''' patch19: Added -c switch to do compilation only
8''' patch19: bare identifiers are now strings if no other interpretation possible
9''' patch19: -s now returns size of file
10''' patch19: Added __LINE__ and __FILE__ tokens
11''' patch19: Added __END__ token
12'''
13''' Revision 3.0.1.6 90/08/03 11:14:44 lwall
14''' patch19: Intermediate diffs for Randal
15'''
0f85fab0 16''' Revision 3.0.1.5 90/03/27 16:14:37 lwall
17''' patch16: .. now works using magical string increment
18'''
79a0689e 19''' Revision 3.0.1.4 90/03/12 16:44:33 lwall
20''' patch13: (LIST,) now legal
21''' patch13: improved LIST documentation
22''' patch13: example of if-elsif switch was wrong
23'''
ac58e20f 24''' Revision 3.0.1.3 90/02/28 17:54:32 lwall
25''' patch9: @array in scalar context now returns length of array
26''' patch9: in manual, example of open and ?: was backwards
27'''
ffed7fef 28''' Revision 3.0.1.2 89/11/17 15:30:03 lwall
29''' patch5: fixed some manual typos and indent problems
30'''
ae986130 31''' Revision 3.0.1.1 89/11/11 04:41:22 lwall
32''' patch2: explained about sh and ${1+"$@"}
33''' patch2: documented that space must separate word and '' string
34'''
a687059c 35''' Revision 3.0 89/10/18 15:21:29 lwall
36''' 3.0 baseline
8d063cd8 37'''
38'''
39.de Sh
40.br
41.ne 5
42.PP
43\fB\\$1\fR
44.PP
45..
46.de Sp
47.if t .sp .5v
48.if n .sp
49..
50.de Ip
51.br
52.ie \\n.$>=3 .ne \\$3
53.el .ne 3
54.IP "\\$1" \\$2
55..
56'''
57''' Set up \*(-- to give an unbreakable dash;
58''' string Tr holds user defined translation string.
59''' Bell System Logo is used as a dummy character.
60'''
378cc40b 61.tr \(*W-|\(bv\*(Tr
8d063cd8 62.ie n \{\
378cc40b 63.ds -- \(*W-
64.if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
65.if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
8d063cd8 66.ds L" ""
67.ds R" ""
68.ds L' '
69.ds R' '
70'br\}
71.el\{\
72.ds -- \(em\|
73.tr \*(Tr
74.ds L" ``
75.ds R" ''
76.ds L' `
77.ds R' '
78'br\}
a687059c 79.TH PERL 1 "\*(RP"
80.UC
8d063cd8 81.SH NAME
a687059c 82perl \- Practical Extraction and Report Language
8d063cd8 83.SH SYNOPSIS
a687059c 84.B perl
85[options] filename args
8d063cd8 86.SH DESCRIPTION
87.I Perl
a687059c 88is an interpreted language optimized for scanning arbitrary text files,
8d063cd8 89extracting information from those text files, and printing reports based
90on that information.
91It's also a good language for many system management tasks.
92The language is intended to be practical (easy to use, efficient, complete)
93rather than beautiful (tiny, elegant, minimal).
94It combines (in the author's opinion, anyway) some of the best features of C,
95\fIsed\fR, \fIawk\fR, and \fIsh\fR,
96so people familiar with those languages should have little difficulty with it.
97(Language historians will also note some vestiges of \fIcsh\fR, Pascal, and
98even BASIC-PLUS.)
99Expression syntax corresponds quite closely to C expression syntax.
a687059c 100Unlike most Unix utilities,
101.I perl
102does not arbitrarily limit the size of your data\*(--if you've got
103the memory,
104.I perl
105can slurp in your whole file as a single string.
106Recursion is of unlimited depth.
107And the hash tables used by associative arrays grow as necessary to prevent
108degraded performance.
109.I Perl
110uses sophisticated pattern matching techniques to scan large amounts of
111data very quickly.
112Although optimized for scanning text,
113.I perl
114can also deal with binary data, and can make dbm files look like associative
115arrays (where dbm is available).
116Setuid
117.I perl
118scripts are safer than C programs
119through a dataflow tracing mechanism which prevents many stupid security holes.
8d063cd8 120If you have a problem that would ordinarily use \fIsed\fR
121or \fIawk\fR or \fIsh\fR, but it
122exceeds their capabilities or must run a little faster,
123and you don't want to write the silly thing in C, then
124.I perl
125may be for you.
a687059c 126There are also translators to turn your
127.I sed
128and
129.I awk
130scripts into
131.I perl
132scripts.
8d063cd8 133OK, enough hype.
134.PP
135Upon startup,
136.I perl
137looks for your script in one of the following places:
138.Ip 1. 4 2
139Specified line by line via
140.B \-e
141switches on the command line.
142.Ip 2. 4 2
143Contained in the file specified by the first filename on the command line.
144(Note that systems supporting the #! notation invoke interpreters this way.)
145.Ip 3. 4 2
a687059c 146Passed in implicitly via standard input.
378cc40b 147This only works if there are no filename arguments\*(--to pass
a687059c 148arguments to a
149.I stdin
150script you must explicitly specify a \- for the script name.
8d063cd8 151.PP
152After locating your script,
153.I perl
154compiles it to an internal form.
155If the script is syntactically correct, it is executed.
156.Sh "Options"
83b4785a 157Note: on first reading this section may not make much sense to you. It's here
8d063cd8 158at the front for easy reference.
159.PP
160A single-character option may be combined with the following option, if any.
161This is particularly useful when invoking a script using the #! construct which
162only allows one argument. Example:
163.nf
164
165.ne 2
a687059c 166 #!/usr/bin/perl \-spi.bak # same as \-s \-p \-i.bak
8d063cd8 167 .\|.\|.
168
169.fi
170Options include:
171.TP 5
378cc40b 172.B \-a
a687059c 173turns on autosplit mode when used with a
174.B \-n
175or
176.BR \-p .
378cc40b 177An implicit split command to the @F array
178is done as the first thing inside the implicit while loop produced by
a687059c 179the
180.B \-n
181or
182.BR \-p .
378cc40b 183.nf
184
a687059c 185 perl \-ane \'print pop(@F), "\en";\'
378cc40b 186
187is equivalent to
188
189 while (<>) {
a687059c 190 @F = split(\' \');
191 print pop(@F), "\en";
378cc40b 192 }
193
194.fi
195.TP 5
450a55e4 196.B \-c
197causes
198.I perl
199to check the syntax of the script and then exit without executing it.
200.TP 5
a687059c 201.BI \-d
202runs the script under the perl debugger.
203See the section on Debugging.
204.TP 5
205.BI \-D number
8d063cd8 206sets debugging flags.
207To watch how it executes your script, use
a687059c 208.BR \-D14 .
8d063cd8 209(This only works if debugging is compiled into your
210.IR perl .)
a687059c 211Another nice value is \-D1024, which lists your compiled syntax tree.
212And \-D512 displays compiled regular expressions.
8d063cd8 213.TP 5
a687059c 214.BI \-e " commandline"
8d063cd8 215may be used to enter one line of script.
216Multiple
217.B \-e
218commands may be given to build up a multi-line script.
219If
220.B \-e
221is given,
222.I perl
223will not look for a script filename in the argument list.
224.TP 5
a687059c 225.BI \-i extension
8d063cd8 226specifies that files processed by the <> construct are to be edited
227in-place.
228It does this by renaming the input file, opening the output file by the
229same name, and selecting that output file as the default for print statements.
230The extension, if supplied, is added to the name of the
231old file to make a backup copy.
232If no extension is supplied, no backup is made.
a687059c 233Saying \*(L"perl \-p \-i.bak \-e "s/foo/bar/;" .\|.\|. \*(R" is the same as using
8d063cd8 234the script:
235.nf
236
237.ne 2
a687059c 238 #!/usr/bin/perl \-pi.bak
8d063cd8 239 s/foo/bar/;
240
241which is equivalent to
242
243.ne 14
378cc40b 244 #!/usr/bin/perl
8d063cd8 245 while (<>) {
246 if ($ARGV ne $oldargv) {
a687059c 247 rename($ARGV, $ARGV . \'.bak\');
248 open(ARGVOUT, ">$ARGV");
8d063cd8 249 select(ARGVOUT);
250 $oldargv = $ARGV;
251 }
252 s/foo/bar/;
253 }
254 continue {
255 print; # this prints to original filename
256 }
a687059c 257 select(STDOUT);
8d063cd8 258
259.fi
a687059c 260except that the
261.B \-i
262form doesn't need to compare $ARGV to $oldargv to know when
8d063cd8 263the filename has changed.
264It does, however, use ARGVOUT for the selected filehandle.
a687059c 265Note that
266.I STDOUT
267is restored as the default output filehandle after the loop.
378cc40b 268.Sp
269You can use eof to locate the end of each input file, in case you want
270to append to each file, or reset line numbering (see example under eof).
8d063cd8 271.TP 5
a687059c 272.BI \-I directory
8d063cd8 273may be used in conjunction with
274.B \-P
275to tell the C preprocessor where to look for include files.
276By default /usr/include and /usr/lib/perl are searched.
277.TP 5
278.B \-n
279causes
280.I perl
281to assume the following loop around your script, which makes it iterate
a687059c 282over filename arguments somewhat like \*(L"sed \-n\*(R" or \fIawk\fR:
8d063cd8 283.nf
284
285.ne 3
286 while (<>) {
378cc40b 287 .\|.\|. # your script goes here
8d063cd8 288 }
289
290.fi
291Note that the lines are not printed by default.
292See
293.B \-p
294to have lines printed.
378cc40b 295Here is an efficient way to delete all files older than a week:
296.nf
297
a687059c 298 find . \-mtime +7 \-print | perl \-ne \'chop;unlink;\'
378cc40b 299
300.fi
a687059c 301This is faster than using the \-exec switch of find because you don't have to
378cc40b 302start a process on every filename found.
8d063cd8 303.TP 5
304.B \-p
305causes
306.I perl
307to assume the following loop around your script, which makes it iterate
308over filename arguments somewhat like \fIsed\fR:
309.nf
310
311.ne 5
312 while (<>) {
378cc40b 313 .\|.\|. # your script goes here
8d063cd8 314 } continue {
315 print;
316 }
317
318.fi
319Note that the lines are printed automatically.
320To suppress printing use the
321.B \-n
322switch.
83b4785a 323A
324.B \-p
325overrides a
326.B \-n
327switch.
8d063cd8 328.TP 5
329.B \-P
330causes your script to be run through the C preprocessor before
331compilation by
a687059c 332.IR perl .
8d063cd8 333(Since both comments and cpp directives begin with the # character,
334you should avoid starting comments with any words recognized
335by the C preprocessor such as \*(L"if\*(R", \*(L"else\*(R" or \*(L"define\*(R".)
336.TP 5
337.B \-s
338enables some rudimentary switch parsing for switches on the command line
a687059c 339after the script name but before any filename arguments (or before a \-\|\-).
83b4785a 340Any switch found there is removed from @ARGV and sets the corresponding variable in the
8d063cd8 341.I perl
342script.
343The following script prints \*(L"true\*(R" if and only if the script is
a687059c 344invoked with a \-xyz switch.
8d063cd8 345.nf
346
347.ne 2
a687059c 348 #!/usr/bin/perl \-s
83b4785a 349 if ($xyz) { print "true\en"; }
8d063cd8 350
351.fi
378cc40b 352.TP 5
353.B \-S
a687059c 354makes
355.I perl
356use the PATH environment variable to search for the script
378cc40b 357(unless the name of the script starts with a slash).
358Typically this is used to emulate #! startup on machines that don't
359support #!, in the following manner:
360.nf
361
362 #!/usr/bin/perl
a687059c 363 eval "exec /usr/bin/perl \-S $0 $*"
378cc40b 364 if $running_under_some_shell;
365
366.fi
367The system ignores the first line and feeds the script to /bin/sh,
a687059c 368which proceeds to try to execute the
369.I perl
370script as a shell script.
378cc40b 371The shell executes the second line as a normal shell command, and thus
a687059c 372starts up the
373.I perl
374interpreter.
378cc40b 375On some systems $0 doesn't always contain the full pathname,
a687059c 376so the
377.B \-S
378tells
379.I perl
380to search for the script if necessary.
381After
382.I perl
383locates the script, it parses the lines and ignores them because
378cc40b 384the variable $running_under_some_shell is never true.
ae986130 385A better construct than $* would be ${1+"$@"}, which handles embedded spaces
386and such in the filenames, but doesn't work if the script is being interpreted
387by csh.
388In order to start up sh rather than csh, some systems may have to replace the
389#! line with a line containing just
390a colon, which will be politely ignored by perl.
450a55e4 391Other systems can't control that, and need a totally devious construct that
392will work under any of csh, sh or perl, such as the following:
393.nf
394
395.ne 3
396 eval '(exit $?0)' && eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
397 & eval 'exec /usr/bin/perl -S $0 $argv:q'
398 if 0;
399
400.fi
378cc40b 401.TP 5
a687059c 402.B \-u
403causes
404.I perl
405to dump core after compiling your script.
406You can then take this core dump and turn it into an executable file
407by using the undump program (not supplied).
408This speeds startup at the expense of some disk space (which you can
409minimize by stripping the executable).
410(Still, a "hello world" executable comes out to about 200K on my machine.)
411If you are going to run your executable as a set-id program then you
412should probably compile it using taintperl rather than normal perl.
413If you want to execute a portion of your script before dumping, use the
414dump operator instead.
Note: undump is platform specific and may not be available
for a particular port of perl.
a687059c 417.TP 5
378cc40b 418.B \-U
a687059c 419allows
420.I perl
421to do unsafe operations.
13281fa4 422Currently the only \*(L"unsafe\*(R" operation is the unlinking of directories while
378cc40b 423running as superuser.
424.TP 5
425.B \-v
a687059c 426prints the version and patchlevel of your
427.I perl
428executable.
378cc40b 429.TP 5
430.B \-w
431prints warnings about identifiers that are mentioned only once, and scalar
432variables that are used before being set.
433Also warns about redefined subroutines, and references to undefined
a687059c 434filehandles or filehandles opened readonly that you are attempting to
435write on.
436Also warns you if you use == on values that don't look like numbers, and if
437your subroutines recurse more than 100 deep.
450a55e4 438.TP 5
439.BI \-x directory
440tells
441.I perl
442that the script is embedded in a message.
443Leading garbage will be discarded until the first line that starts
444with #! and contains the string "perl".
445Any meaningful switches on that line will be applied (but only one
446group of switches, as with normal #! processing).
447If a directory name is specified, Perl will switch to that directory
448before running the script.
449The
450.B \-x
switch only controls the disposal of leading garbage.
452The script must be terminated with __END__ if there is trailing garbage
453to be ignored (the script can process any or all of the trailing garbage
454via standard input if desired).
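As a rough illustration of the kind of switch parsing that \-s performs, here is a minimal sketch that does the same job by hand.
It is not how perl implements the switch: the subroutine name parse_switches and the %switch array are made up for the example, and a lone \- is assumed to be left alone as a filename.

```perl
# Hypothetical emulation of -s: %switch stands in for the variables
# ($xyz and so on) that the real switch would set.
%switch = ();

sub parse_switches {
    while (@ARGV && $ARGV[0] =~ /^-./) {
        $arg = shift(@ARGV);           # remove the switch from @ARGV
        last if $arg eq '--';          # a bare -- ends switch parsing
        $arg =~ s/^-//;
        $switch{$arg} = 1;             # -xyz behaves like $xyz = 1
    }
}

@ARGV = ('-xyz', '-v', '--', '-notaswitch', 'file.txt');
&parse_switches();

print "true\n" if $switch{'xyz'};
print "remaining: @ARGV\n";            # remaining: -notaswitch file.txt
```

Anything after the \-\|\- is left in @ARGV untouched, even if it looks like a switch.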
8d063cd8 455.Sh "Data Types and Objects"
456.PP
a687059c 457.I Perl
458has three data types: scalars, arrays of scalars, and
459associative arrays of scalars.
460Normal arrays are indexed by number, and associative arrays by string.
8d063cd8 461.PP
a687059c 462The interpretation of operations and values in perl sometimes
463depends on the requirements
464of the context around the operation or value.
465There are three major contexts: string, numeric and array.
466Certain operations return array values
467in contexts wanting an array, and scalar values otherwise.
468(If this is true of an operation it will be mentioned in the documentation
469for that operation.)
470Operations which return scalars don't care whether the context is looking
471for a string or a number, but
472scalar variables and values are interpreted as strings or numbers
473as appropriate to the context.
378cc40b 474A scalar is interpreted as TRUE in the boolean sense if it is not the null
8d063cd8 475string or 0.
ffed7fef 476Booleans returned by operators are 1 for true and 0 or \'\' (the null
8d063cd8 477string) for false.
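The truth rule above can be tried out with a few sample values.
This is only a sketch; the subroutine name truth is invented for the example.

```perl
# Report how a value is treated in a boolean test.
sub truth {
    local($value) = @_;
    $value ? 'true' : 'false';
}

foreach $v ('', 0, '0', 1, 'abc', ' ') {
    print "'$v' is ", &truth($v), "\n";
}

$yes = (1 == 1);                       # operators return 1 for true
$no  = (1 == 2);                       # and 0 or '' for false
print "true comparison gives '$yes', false comparison gives '$no'\n";
```

Only the null string and 0 (as a number or as the string '0') come out false; a single space is true.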
478.PP
a687059c 479There are actually two varieties of null string: defined and undefined.
480Undefined null strings are returned when there is no real value for something,
481such as when there was an error, or at end of file, or when you refer
482to an uninitialized variable or element of an array.
483An undefined null string may become defined the first time you access it, but
484prior to that you can use the defined() operator to determine whether the
485value is defined or not.
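A small sketch of telling the two kinds of null string apart with defined().
The variable names and the describe subroutine are made up for the example.

```perl
# Print whether a value is defined, without changing it.
sub describe {
    local($label, $value) = @_;
    print $label, ': ', (defined($value) ? 'defined' : 'undefined'), "\n";
}

&describe('never assigned', $nothing);           # undefined
$empty = '';
&describe('empty string', $empty);               # defined, but still null
&describe('missing array element', $list[10]);   # undefined
&describe('missing table entry', $table{'nope'});# undefined
```

Note that $empty and $nothing both test false, which is why defined() is needed to distinguish them.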
486.PP
378cc40b 487References to scalar variables always begin with \*(L'$\*(R', even when referring
488to a scalar that is part of an array.
8d063cd8 489Thus:
490.nf
491
492.ne 3
378cc40b 493 $days \h'|2i'# a simple scalar variable
8d063cd8 494 $days[28] \h'|2i'# 29th element of array @days
a687059c 495 $days{\'Feb\'}\h'|2i'# one value from an associative array
378cc40b 496 $#days \h'|2i'# last index of array @days
8d063cd8 497
a687059c 498but entire arrays or array slices are denoted by \*(L'@\*(R':
8d063cd8 499
500 @days \h'|2i'# ($days[0], $days[1],\|.\|.\|. $days[n])
a687059c 501 @days[3,4,5]\h'|2i'# same as @days[3.\|.5]
502 @days{'a','c'}\h'|2i'# same as ($days{'a'},$days{'c'})
503
504and entire associative arrays are denoted by \*(L'%\*(R':
8d063cd8 505
a687059c 506 %days \h'|2i'# (key1, val1, key2, val2 .\|.\|.)
8d063cd8 507.fi
508.PP
a687059c 509Any of these eight constructs may serve as an lvalue,
378cc40b 510that is, may be assigned to.
a687059c 511(It also turns out that an assignment is itself an lvalue in
512certain contexts\*(--see examples under s, tr and chop.)
513Assignment to a scalar evaluates the righthand side in a scalar context,
514while assignment to an array or array slice evaluates the righthand side
515in an array context.
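The constructs listed above can be exercised together as follows.
The contents of @days and %days are sample data chosen for the example; note that the normal array and the associative array share a name without conflict.

```perl
@days = ('Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat');
%days = ('a', 'alpha', 'b', 'beta', 'c', 'gamma');

$one  = $days[1];              # scalar element of the normal array
$last = $#days;                # last index of @days
@some = @days[3,4,5];          # array slice
@pair = @days{'a','c'};        # slice of the associative array

print "$one $last @some @pair\n";   # Mon 6 Wed Thu Fri alpha gamma

$days[1]    = 'Monday';        # an element used as an lvalue
@days[5,6]  = ('F', 'S');      # a slice used as an lvalue
$days{'b'}  = 'BETA';          # an associative array value as an lvalue
```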
516.PP
378cc40b 517You may find the length of array @days by evaluating
8d063cd8 518\*(L"$#days\*(R", as in
519.IR csh .
378cc40b 520(Actually, it's not the length of the array, it's the subscript of the last element, since there is (ordinarily) a 0th element.)
521Assigning to $#days changes the length of the array.
522Shortening an array by this method does not actually destroy any values.
523Lengthening an array that was previously shortened recovers the values that
524were in those elements.
525You can also gain some measure of efficiency by preextending an array that
526is going to get big.
527(You can also extend an array by assigning to an element that is off the
528end of the array.
529This differs from assigning to $#whatever in that intervening values
530are set to null rather than recovered.)
531You can truncate an array down to nothing by assigning the null list () to
532it.
The following are exactly equivalent:
534.nf
535
536 @whatever = ();
537 $#whatever = $[ \- 1;
538
539.fi
8d063cd8 540.PP
ac58e20f 541If you evaluate an array in a scalar context, it returns the length of
542the array.
543The following is always true:
544.nf
545
546 @whatever == $#whatever \- $[ + 1;
547
548.fi
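To make the difference between the length of an array and its last subscript concrete, here is a short sketch.
It assumes $[ has its default value of 0, and the array name @list is arbitrary.

```perl
@list  = ('a', 'b', 'c');
$count = @list;                # array in a scalar context: 3
$last  = $#list;               # subscript of the last element: 2

$list[5] = 'f';                # assigning off the end extends the array
$grown = @list;                # now 6
$gap   = defined($list[4]) ? 'set' : 'unset';   # intervening values are null

@list = ();                    # truncate down to nothing
$empty_last = $#list;          # -1 when $[ is 0

print "$count $last $grown $gap $empty_last\n";   # 3 2 6 unset -1
```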
549.PP
a687059c 550Multi-dimensional arrays are not directly supported, but see the discussion
551of the $; variable later for a means of emulating multiple subscripts with
552an associative array.
ac58e20f 553You could also write a subroutine to turn multiple subscripts into a single
554subscript.
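One possible shape for such a subroutine is sketched below.
The name cell, the fixed width $NCOLS, and the row-by-row layout are all assumptions made for the example, not something perl provides.

```perl
$NCOLS = 4;                    # hypothetical number of columns per row

# Turn a (row, column) pair into a single subscript, row by row.
sub cell {
    local($row, $col) = @_;
    $row * $NCOLS + $col;
}

$grid[&cell(0, 0)] = 'top left';
$grid[&cell(2, 3)] = 'bottom right';

$where = &cell(2, 3);          # 2 * 4 + 3 = 11
$value = $grid[$where];
print "$where: $value\n";      # 11: bottom right
```

This only works if every row has the same width; the $; approach does not have that restriction.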
a687059c 555.PP
8d063cd8 556Every data type has its own namespace.
378cc40b 557You can, without fear of conflict, use the same name for a scalar variable,
8d063cd8 558an array, an associative array, a filehandle, a subroutine name, and/or
559a label.
a687059c 560Since variable and array references always start with \*(L'$\*(R', \*(L'@\*(R',
561or \*(L'%\*(R', the \*(L"reserved\*(R" words aren't in fact reserved
8d063cd8 562with respect to variable names.
563(They ARE reserved with respect to labels and filehandles, however, which
378cc40b 564don't have an initial special character.
a687059c 565Hint: you could say open(LOG,\'logfile\') rather than open(log,\'logfile\').
566Using uppercase filehandles also improves readability and protects you
567from conflict with future reserved words.)
8d063cd8 568Case IS significant\*(--\*(L"FOO\*(R", \*(L"Foo\*(R" and \*(L"foo\*(R" are all
569different names.
570Names which start with a letter may also contain digits and underscores.
571Names which do not start with a letter are limited to one character,
572e.g. \*(L"$%\*(R" or \*(L"$$\*(R".
a687059c 573(Most of the one character names have a predefined significance to
574.IR perl .
8d063cd8 575More later.)
576.PP
a687059c 577Numeric literals are specified in any of the usual floating point or
578integer formats:
579.nf
580
581.ne 5
582 12345
583 12345.67
584 .23E-10
585 0xffff # hex
586 0377 # octal
587
588.fi
8d063cd8 589String literals are delimited by either single or double quotes.
590They work much like shell quotes:
591double-quoted string literals are subject to backslash and variable
a687059c 592substitution; single-quoted strings are not (except for \e\' and \e\e).
8d063cd8 593The usual backslash rules apply for making characters such as newline, tab, etc.
594You can also embed newlines directly in your strings, i.e. they can end on
595a different line than they begin.
596This is nice, but if you forget your trailing quote, the error will not be
a687059c 597reported until
598.I perl
599finds another line containing the quote character, which
8d063cd8 600may be much further on in the script.
a687059c 601Variable substitution inside strings is limited to scalar variables, normal
602array values, and array slices.
603(In other words, identifiers beginning with $ or @, followed by an optional
604bracketed expression as a subscript.)
8d063cd8 605The following code segment prints out \*(L"The price is $100.\*(R"
606.nf
607
608.ne 2
a687059c 609 $Price = \'$100\';\h'|3.5i'# not interpreted
8d063cd8 610 print "The price is $Price.\e\|n";\h'|3.5i'# interpreted
611
612.fi
83b4785a 613Note that you can put curly brackets around the identifier to delimit it
614from following alphanumerics.
ae986130 615Also note that a single quoted string must be separated from a preceding
616word by a space, since single quote is a valid character in an identifier
617(see Packages).
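The quoting rules above, including the use of curly brackets to delimit an identifier, can be seen side by side in this sketch.
The variable names are arbitrary.

```perl
$Price = '$100';                   # single quotes: no substitution
$fruit = 'apple';

$plain  = 'Cost: $Price\n';        # stays literal, backslash and n included
$filled = "Cost: $Price";          # variable substitution happens once
$plural = "Two ${fruit}s";         # brackets end the name before the s

print "$plain\n$filled\n$plural\n";
```

Without the curly brackets, "Two $fruits" would refer to a different variable, $fruits.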
8d063cd8 618.PP
450a55e4 619Two special literals are __LINE__ and __FILE__, which represent the current
620line number and filename at that point in your program.
621They may only be used as separate tokens; they will not be interpolated
622into strings.
623In addition, the token __END__ may be used to indicate the logical end of the
624script before the actual end of file.
625Any following text is ignored (but if the script is being read from
626the standard input, then the rest of the input is available by reading
627from filehandle STDIN).
628The two control characters ^D and ^Z are synonyms for __END__.
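A typical use of the two literals is to label a diagnostic with the place it came from.
The following is only a sketch; the subroutine name where is made up.

```perl
# Build a location string from a filename and a line number.
sub where {
    local($file, $line) = @_;
    "$file, line $line";
}

$here = &where(__FILE__, __LINE__);    # separate tokens, so they are replaced
print "Checkpoint reached at $here\n";
print "In a string, __LINE__ is left alone\n";
```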
629.PP
630A word that doesn't have any other interpretation in the grammar will be
631treated as if it had single quotes around it.
632For this purpose, a word consists only of alphanumeric characters and underline,
633and must start with an alphabetic character.
634As with filehandles and labels, a bare word that consists entirely of
635lowercase letters risks conflict with future reserved words, and if you
636use the
637.B \-w
638switch, Perl will warn you about any such words.
639.PP
a687059c 640Array values are interpolated into double-quoted strings by joining all the
641elements of the array with the delimiter specified in the $" variable,
642space by default.
643(Since in versions of perl prior to 3.0 the @ character was not a metacharacter
644in double-quoted strings, the interpolation of @array, $array[EXPR],
645@array[LIST], $array{EXPR}, or @array{LIST} only happens if array is
646referenced elsewhere in the program or is predefined.)
647The following are equivalent:
648.nf
649
650.ne 4
651 $temp = join($",@ARGV);
652 system "echo $temp";
653
654 system "echo @ARGV";
655
656.fi
ae986130 657Within search patterns (which also undergo double-quotish substitution)
a687059c 658there is a bad ambiguity: Is /$foo[bar]/ to be
659interpreted as /${foo}[bar]/ (where [bar] is a character class for the
660regular expression) or as /${foo[bar]}/ (where [bar] is the subscript to
661array @foo)?
662If @foo doesn't otherwise exist, then it's obviously a character class.
663If @foo exists, perl takes a good guess about [bar], and is almost always right.
664If it does guess wrong, or if you're just plain paranoid,
665you can force the correct interpretation with curly brackets as above.
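The following sketch shows both points: the effect of $" on array interpolation, and curly brackets forcing one reading or the other inside a search pattern.
The variable names are arbitrary, and a numeric subscript is used in place of [bar] for the array case.

```perl
@parts = ('usr', 'local', 'bin');

$spaced = "@parts";            # usr local bin (space is the default)

$" = '/';                      # change the delimiter used for joining
$path = "/@parts";             # /usr/local/bin
$" = ' ';                      # put the default back

$foo = 'x';
@foo = ('y');

# $foo followed by the character class [bar]
$class = ('xb' =~ /^${foo}[bar]$/) ? 'yes' : 'no';

# element 0 of array @foo
$elem = ('y' =~ /^${foo[0]}$/) ? 'yes' : 'no';

print "$spaced $path $class $elem\n";
```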
666.PP
667A line-oriented form of quoting is based on the shell here-is syntax.
668Following a << you specify a string to terminate the quoted material, and all lines
669following the current line down to the terminating string are the value
670of the item.
671The terminating string may be either an identifier (a word), or some
672quoted text.
673If quoted, the type of quotes you use determines the treatment of the text,
674just as in regular quoting.
675An unquoted identifier works like double quotes.
676There must be no space between the << and the identifier.
677(If you put a space it will be treated as a null identifier, which is
678valid, and matches the first blank line\*(--see Merry Christmas example below.)
679The terminating string must appear by itself (unquoted and with no surrounding
680whitespace) on the terminating line.
681.nf
682
683 print <<EOF; # same as above
684The price is $Price.
685EOF
686
687 print <<"EOF"; # same as above
688The price is $Price.
689EOF
690
691 print << x 10; # null identifier is delimiter
692Merry Christmas!
693
694 print <<`EOC`; # execute commands
695echo hi there
696echo lo there
697EOC
698
699 print <<foo, <<bar; # you can stack them
700I said foo.
701foo
702I said bar.
703bar
704
705.fi
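To see how the type of quotes around the terminating string changes the treatment of the text, compare the two values built in this sketch.
The variable names are arbitrary.

```perl
$Price = '$100';

# Double quotes: variables are substituted, as in a double-quoted string.
$interpolated = <<"EOF";
The price is $Price.
EOF

# Single quotes: the text is taken literally.
$literal = <<'EOF';
The price is $Price.
EOF

print $interpolated, $literal;
```

The first line printed contains $100; the second still contains the characters $Price.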
8d063cd8 706Array literals are denoted by separating individual values by commas, and
79a0689e 707enclosing the list in parentheses:
708.nf
709
710 (LIST)
711
712.fi
8d063cd8 713In a context not requiring an array value, the value of the array literal
714is the value of the final element, as in the C comma operator.
715For example,
716.nf
717
83b4785a 718.ne 4
a687059c 719 @foo = (\'cc\', \'\-E\', $bar);
8d063cd8 720
721assigns the entire array value to array foo, but
722
a687059c 723 $foo = (\'cc\', \'\-E\', $bar);
8d063cd8 724
725.fi
726assigns the value of variable bar to variable foo.
79a0689e 727Note that the value of an actual array in a scalar context is the length
728of the array; the following assigns to $foo the value 3:
729.nf
730
731.ne 2
732 @foo = (\'cc\', \'\-E\', $bar);
733 $foo = @foo; # $foo gets 3
734
735.fi
736You may have an optional comma before the closing parenthesis of an
737array literal, so that you can say:
738.nf
739
740 @foo = (
741 1,
742 2,
743 3,
744 );
745
746.fi
747When a LIST is evaluated, each element of the list is evaluated in
748an array context, and the resulting array value is interpolated into LIST
749just as if each individual element were a member of LIST. Thus arrays
750lose their identity in a LIST\*(--the list
751
752 (@foo,@bar,&SomeSub)
753
754contains all the elements of @foo followed by all the elements of @bar,
755followed by all the elements returned by the subroutine named SomeSub.
756.PP
757A list value may also be subscripted like a normal array.
758Examples:
759.nf
760
761 $time = (stat($file))[8]; # stat returns array value
762 $digit = ('a','b','c','d','e','f')[$digit-10];
763 return (pop(@foo),pop(@foo))[0];
764
765.fi
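Subscripting a list value is handy when only one field of a result is wanted.
In this sketch the record is made-up sample data.

```perl
$record = 'root:x:0:0:Super User:/root:/bin/sh';   # hypothetical sample line

$uid    = (split(/:/, $record))[2];    # third field of the split result
$shell  = (split(/:/, $record))[6];    # seventh field
$letter = ('a','b','c','d','e','f')[3];

print "$uid $shell $letter\n";         # 0 /bin/sh d
```

The parentheses around the list are required; the subscript applies to the whole list value.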
766.PP
8d063cd8 767Array lists may be assigned to if and only if each element of the list
768is an lvalue:
769.nf
770
771 ($a, $b, $c) = (1, 2, 3);
772
a687059c 773 ($map{\'red\'}, $map{\'blue\'}, $map{\'green\'}) = (0x00f, 0x0f0, 0xf00);
774
775The final element may be an array or an associative array:
776
777 ($a, $b, @rest) = split;
778 local($a, $b, %rest) = @_;
8d063cd8 779
780.fi
a687059c 781You can actually put an array anywhere in the list, but the first array
782in the list will soak up all the values, and anything after it will get
783a null value.
784This may be useful in a local().
8d063cd8 785.PP
a687059c 786An associative array literal contains pairs of values to be interpreted
787as a key and a value:
788.nf
789
790.ne 2
791 # same as map assignment above
792 %map = ('red',0x00f,'blue',0x0f0,'green',0xf00);
793
794.fi
795Array assignment in a scalar context returns the number of elements
796produced by the expression on the right side of the assignment:
797.nf
798
799 $x = (($foo,$bar) = (3,2,1)); # set $x to 3, not 2
800
801.fi
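The rules for array literals and list assignment described above are collected in this sketch.
All variable names and values are arbitrary.

```perl
($first, $second, @rest) = (1, 2, 3, 4, 5);    # @rest soaks up 3, 4, 5

$bar    = 'file.c';
$scalar = ('cc', '-E', $bar);      # scalar context: value of final element
@array  = ('cc', '-E', $bar);
$length = @array;                  # an actual array gives its length: 3

$count = (($x, $y) = (3, 2, 1));   # number of values on the right side: 3

print "$first $second (@rest) $scalar $length $count\n";
# 1 2 (3 4 5) file.c 3 3
```

Note that $count is 3 even though only two variables were assigned; the extra value is simply dropped.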
8d063cd8 802.PP
803There are several other pseudo-literals that you should know about.
378cc40b 804If a string is enclosed by backticks (grave accents), it first undergoes
805variable substitution just like a double quoted string.
806It is then interpreted as a command, and the output of that command
807is the value of the pseudo-literal, like in a shell.
450a55e4 808In a scalar context, a single string consisting of all the output is
809returned.
810In an array context, an array of values is returned, one for each line
811of output.
812(You can set $/ to use a different line terminator.)
8d063cd8 813The command is executed each time the pseudo-literal is evaluated.
378cc40b 814The status value of the command is returned in $? (see Predefined Names
815for the interpretation of $?).
816Unlike in \f2csh\f1, no translation is done on the return
8d063cd8 817data\*(--newlines remain newlines.
378cc40b 818Unlike in any of the shells, single quotes do not hide variable names
819in the command from interpretation.
820To pass a $ through to the shell you need to hide it with a backslash.
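A short sketch of the backtick rules follows.
It assumes a Unix-like system where the commands are run by a Bourne-style shell and echo is available; the variable names are arbitrary.

```perl
$word = 'hello';

$single = `echo $word`;            # $word is substituted before the shell runs
$status = $?;                      # status value of the command

@lines = `echo one; echo two`;     # array context: one value per line
$howmany = @lines;

$shellvar = `x=shell; echo \$x`;   # backslash passes the $ through to the shell

print $single, "status $status, $howmany lines\n", $shellvar;
```

Each value keeps its trailing newline, since no translation is done on the returned data.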
8d063cd8 821.PP
822Evaluating a filehandle in angle brackets yields the next line
a687059c 823from that file (newline included, so it's never false until EOF, at
824which time an undefined value is returned).
8d063cd8 825Ordinarily you must assign that value to a variable,
ac58e20f 826but there is one situation where an automatic assignment happens.
8d063cd8 827If (and only if) the input symbol is the only thing inside the conditional of a
828.I while
829loop, the value is
830automatically assigned to the variable \*(L"$_\*(R".
831(This may seem like an odd thing to you, but you'll use the construct
832in almost every
833.I perl
834script you write.)
835Anyway, the following lines are equivalent to each other:
836.nf
837
a687059c 838.ne 5
839 while ($_ = <STDIN>) { print; }
840 while (<STDIN>) { print; }
841 for (\|;\|<STDIN>;\|) { print; }
842 print while $_ = <STDIN>;
843 print while <STDIN>;
8d063cd8 844
845.fi
846The filehandles
a687059c 847.IR STDIN ,
848.I STDOUT
849and
850.I STDERR
851are predefined.
852(The filehandles
8d063cd8 853.IR stdin ,
854.I stdout
855and
856.I stderr
a687059c 857will also work except in packages, where they would be interpreted as
858local identifiers rather than global.)
8d063cd8 859Additional filehandles may be created with the
860.I open
861function.
862.PP
378cc40b 863If a <FILEHANDLE> is used in a context that is looking for an array, an array
864consisting of all the input lines is returned, one line per array element.
865It's easy to make a LARGE data space this way, so use with care.
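The two ways of reading a filehandle can be compared with a scratch file.
This is a sketch only: the file name under /tmp is made up, and the filehandle names IN and OUT are arbitrary.

```perl
$tmp = "/tmp/perl_man_demo.$$";            # hypothetical scratch file
open(OUT, ">$tmp") || die "Can't create $tmp: $!";
print OUT "first\nsecond\nthird\n";
close(OUT);

open(IN, $tmp) || die "Can't read $tmp: $!";
$count = 0;
while (<IN>) {                             # each line is assigned to $_
    $count++;
}
close(IN);

open(IN, $tmp) || die "Can't read $tmp: $!";
@lines = <IN>;                             # array context: every line at once
close(IN);
unlink($tmp);

$n = @lines;
print "$count lines by loop, $n lines by array\n";
```

The loop holds one line at a time, while the array form holds the whole file in memory.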
866.PP
The null filehandle <> is special and can be used to emulate the behavior of
\fIsed\fR and \fIawk\fR.
Input from <> comes either from standard input, or from each file listed on
the command line.
Here's how it works: the first time <> is evaluated, the ARGV array is checked,
and if it is null, $ARGV[0] is set to \'-\', which when opened gives you standard
input.
The ARGV array is then processed as a list of filenames.
The loop
.nf

.ne 3
    while (<>) {
        .\|.\|.    # code for each line
    }

.ne 10
is equivalent to

    unshift(@ARGV, \'\-\') \|if \|$#ARGV < $[;
    while ($ARGV = shift) {
        open(ARGV, $ARGV);
        while (<ARGV>) {
            .\|.\|.    # code for each line
        }
    }

.fi
except that it isn't as cumbersome to say.
It really does shift array ARGV and put the current filename into
variable ARGV.
It also uses filehandle ARGV internally.
You can modify @ARGV before the first <> as long as you leave the first
filename at the beginning of the array.
Line numbers ($.) continue as if the input was one big happy file.
(But see example under eof for how to reset line numbers on each file.)
.PP
.ne 5
If you want to set @ARGV to your own list of files, go right ahead.
If you want to pass switches into your script, you can
put a loop on the front like this:
.nf

.ne 10
    while ($_ = $ARGV[0], /\|^\-/\|) {
        shift;
        last if /\|^\-\|\-$\|/\|;
        /\|^\-D\|(.*\|)/ \|&& \|($debug = $1);
        /\|^\-v\|/ \|&& \|$verbose++;
        .\|.\|.    # other switches
    }
    while (<>) {
        .\|.\|.    # code for each line
    }

.fi
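.PP
As a sketch of how the switch loop above leaves @ARGV, the following sets up
a made-up argument list (the switch values and file names are purely
illustrative) and runs the same loop over it.
It spells out shift(@ARGV) explicitly, which is a choice made here for
clarity rather than a requirement.

```perl
# Hypothetical command line: two switches, the "--" terminator, two files.
@ARGV = ('-v', '-Dtrace', '--', 'file1', 'file2');

while ($_ = $ARGV[0], /^-/) {
    shift(@ARGV);                  # drop the switch just examined
    last if /^--$/;                # "--" ends switch processing
    /^-D(.*)/ && ($debug = $1);    # -Dtrace sets $debug to "trace"
    /^-v/ && $verbose++;           # each -v bumps $verbose
}

# Only the file names remain for a later while (<>) loop to read.
print "debug=$debug verbose=$verbose files=@ARGV\n";
```

Run as written, this should print debug=trace verbose=1 files=file1 file2.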
The <> symbol will return FALSE only once.
If you call it again after this it will assume you are processing another
@ARGV list, and if you haven't set @ARGV, will input from
.IR STDIN .
.PP
If the string inside the angle brackets is a reference to a scalar variable
(e.g. <$foo>),
then that variable contains the name of the filehandle to input from.
.PP
If the string inside angle brackets is not a filehandle, it is interpreted
as a filename pattern to be globbed, and either an array of filenames or the
next filename in the list is returned, depending on context.
One level of $ interpretation is done first, but you can't say <$foo>
because that's an indirect filehandle as explained in the previous
paragraph.
You could insert curly brackets to force interpretation as a
filename glob: <${foo}>.
Example:
.nf

.ne 3
    while (<*.c>) {
        chmod 0644, $_;
    }

is equivalent to

.ne 5
    open(foo, "echo *.c | tr \-s \' \et\er\ef\' \'\e\e012\e\e012\e\e012\e\e012\'|");
    while (<foo>) {
        chop;
        chmod 0644, $_;
    }

.fi
In fact, it's currently implemented that way.
(Which means it will not work on filenames with spaces in them unless
you have /bin/csh on your machine.)
Of course, the shortest way to do the above is:
.nf

    chmod 0644, <*.c>;

.fi
.Sh "Syntax"
.PP
A
.I perl
script consists of a sequence of declarations and commands.
The only things that need to be declared in
.I perl
are report formats and subroutines.
See the sections below for more information on those declarations.
All uninitialized user-created objects are assumed to
start with a null or 0 value until they
are defined by some explicit operation such as assignment.
The sequence of commands is executed just once, unlike in
.I sed
and
.I awk
scripts, where the sequence of commands is executed for each input line.
While this means that you must explicitly loop over the lines of your input file
(or files), it also means you have much more control over which files and which
lines you look at.
(Actually, I'm lying\*(--it is possible to do an implicit loop with either the
.B \-n
or
.B \-p
switch.)
.PP
A declaration can be put anywhere a command can, but has no effect on the
execution of the primary sequence of commands\*(--declarations all take effect
at compile time.
Typically all the declarations are put at the beginning or the end of the script.
.PP
.I Perl
is, for the most part, a free-form language.
(The only exception to this is format declarations, for fairly obvious reasons.)
Comments are indicated by the # character, and extend to the end of the line.
If you attempt to use /* */ C comments, they will be interpreted either as
division or pattern matching, depending on the context.
So don't do that.
.Sh "Compound statements"
In
.IR perl ,
a sequence of commands may be treated as one command by enclosing it
in curly brackets.
We will call this a BLOCK.
.PP
The following compound commands may be used to control flow:
.nf

.ne 4
    if (EXPR) BLOCK
    if (EXPR) BLOCK else BLOCK
    if (EXPR) BLOCK elsif (EXPR) BLOCK .\|.\|. else BLOCK
    LABEL while (EXPR) BLOCK
    LABEL while (EXPR) BLOCK continue BLOCK
    LABEL for (EXPR; EXPR; EXPR) BLOCK
    LABEL foreach VAR (ARRAY) BLOCK
    LABEL BLOCK continue BLOCK

.fi
Note that, unlike C and Pascal, these are defined in terms of BLOCKs, not
statements.
This means that the curly brackets are \fIrequired\fR\*(--no dangling statements allowed.
If you want to write conditionals without curly brackets there are several
other ways to do it.
The following all do the same thing:
.nf

.ne 5
    if (!open(foo)) { die "Can't open $foo: $!"; }
    die "Can't open $foo: $!" unless open(foo);
    open(foo) || die "Can't open $foo: $!";    # foo or bust!
    open(foo) ? \'hi mom\' : die "Can't open $foo: $!";
                # a bit exotic, that last one

.fi
.PP
The
.I if
statement is straightforward.
Since BLOCKs are always bounded by curly brackets, there is never any
ambiguity about which
.I if
an
.I else
goes with.
If you use
.I unless
in place of
.IR if ,
the sense of the test is reversed.
.PP
The
.I while
statement executes the block as long as the expression is true
(does not evaluate to the null string or 0).
The LABEL is optional, and if present, consists of an identifier followed by
a colon.
The LABEL identifies the loop for the loop control statements
.IR next ,
.IR last ,
and
.I redo
(see below).
If there is a
.I continue
BLOCK, it is always executed just before
the conditional is about to be evaluated again, similarly to the third part
of a
.I for
loop in C.
Thus it can be used to increment a loop variable, even when the loop has
been continued via the
.I next
statement (similar to the C \*(L"continue\*(R" statement).
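.PP
A small sketch of that last point: the loop below skips even numbers with
next, yet the continue BLOCK still runs on every pass, so the counter keeps
advancing.
The variable names are illustrative only.

```perl
$i = 1;
@odd = ();
while ($i < 10) {
    next if $i % 2 == 0;    # even: jump straight to the continue BLOCK
    push(@odd, $i);
} continue {
    $i++;                   # runs after every pass, including after "next"
}
print "@odd\n";             # prints 1 3 5 7 9
```

Had the increment been placed at the bottom of the main BLOCK instead, the
first even number would have left the loop spinning forever.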
.PP
If the word
.I while
is replaced by the word
.IR until ,
the sense of the test is reversed, but the conditional is still tested before
the first iteration.
.PP
In either the
.I if
or the
.I while
statement, you may replace \*(L"(EXPR)\*(R" with a BLOCK, and the conditional
is true if the value of the last command in that block is true.
.PP
The
.I for
loop works exactly like the corresponding
.I while
loop:
.nf

.ne 12
    for ($i = 1; $i < 10; $i++) {
        .\|.\|.
    }

is the same as

    $i = 1;
    while ($i < 10) {
        .\|.\|.
    } continue {
        $i++;
    }
.fi
.PP
The foreach loop iterates over a normal array value and sets the variable
VAR to be each element of the array in turn.
The variable is implicitly local to the loop, and regains its former value
upon exiting the loop.
The \*(L"foreach\*(R" keyword is actually identical to the \*(L"for\*(R" keyword,
so you can use \*(L"foreach\*(R" for readability or \*(L"for\*(R" for brevity.
If VAR is omitted, $_ is set to each value.
If ARRAY is an actual array (as opposed to an expression returning an array
value), you can modify each element of the array
by modifying VAR inside the loop.
Examples:
.nf

.ne 5
    for (@ary) { s/foo/bar/; }

    foreach $elem (@elements) {
        $elem *= 2;
    }

.ne 3
    for ((10,9,8,7,6,5,4,3,2,1,\'BOOM\')) {
        print $_, "\en"; sleep(1);
    }

    for (1..15) { print "Merry Christmas\en"; }

.ne 3
    foreach $item (split(/:[\e\e\en:]*/, $ENV{\'TERMCAP\'})) {
        print "Item: $item\en";
    }

.fi
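.PP
The aliasing and the implicit localization described above can be seen in a
short sketch (the variable names are made up for illustration):

```perl
@prices = (1, 2, 3);
$elem = 'before';

foreach $elem (@prices) {
    $elem *= 2;             # modifies the array element itself
}

print "@prices\n";          # prints 2 4 6
print "$elem\n";            # prints before: the old value is restored
```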
.PP
The BLOCK by itself (labeled or not) is equivalent to a loop that executes
once.
Thus you can use any of the loop control statements in it to leave or
restart the block.
The
.I continue
block is optional.
This construct is particularly nice for doing case structures.
.nf

.ne 6
    foo: {
        if (/^abc/) { $abc = 1; last foo; }
        if (/^def/) { $def = 1; last foo; }
        if (/^xyz/) { $xyz = 1; last foo; }
        $nothing = 1;
    }

.fi
There is no official switch statement in perl, because there
are already several ways to write the equivalent.
In addition to the above, you could write
.nf

.ne 6
    foo: {
        $abc = 1, last foo if /^abc/;
        $def = 1, last foo if /^def/;
        $xyz = 1, last foo if /^xyz/;
        $nothing = 1;
    }

or

.ne 6
    foo: {
        /^abc/ && do { $abc = 1; last foo; };
        /^def/ && do { $def = 1; last foo; };
        /^xyz/ && do { $xyz = 1; last foo; };
        $nothing = 1;
    }

or

.ne 6
    foo: {
        /^abc/ && ($abc = 1, last foo);
        /^def/ && ($def = 1, last foo);
        /^xyz/ && ($xyz = 1, last foo);
        $nothing = 1;
    }

or even

.ne 8
    if (/^abc/)
        { $abc = 1; }
    elsif (/^def/)
        { $def = 1; }
    elsif (/^xyz/)
        { $xyz = 1; }
    else
        { $nothing = 1; }

.fi
As it happens, these are all optimized internally to a switch structure,
so perl jumps directly to the desired statement, and you needn't worry
about perl executing a lot of unnecessary statements when you have a string
of 50 elsifs, as long as you are testing the same simple scalar variable
using ==, eq, or pattern matching as above.
(If you're curious as to whether the optimizer has done this for a particular
case statement, you can use the \-D1024 switch to list the syntax tree
before execution.)
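.PP
For illustration, here is one of the case-structure forms wrapped in a
subroutine so that it can be tried on sample strings.
The subroutine name, the label and the returned strings are assumptions made
for this sketch, not anything perl defines.

```perl
sub classify {
    local($_) = @_;         # the string to test becomes $_
    local($kind);
    CASE: {
        /^abc/ && do { $kind = 'abc'; last CASE; };
        /^def/ && do { $kind = 'def'; last CASE; };
        $kind = 'nothing';  # reached only if no pattern matched
    }
    $kind;                  # value of the last expression is returned
}

print &classify('abcdef'), "\n";    # prints abc
print &classify('defcon'), "\n";    # prints def
print &classify('hello'), "\n";     # prints nothing
```

Because the bare BLOCK counts as a loop that runs once, the last statement
leaves it as soon as one case has matched.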
.Sh "Simple statements"
The only kind of simple statement is an expression evaluated for its side
effects.
Every simple statement must be terminated with a semicolon.
Note that this is like C, but unlike Pascal (and
.IR awk ).
.PP
Any simple statement may optionally be followed by a
single modifier, just before the terminating semicolon.
The possible modifiers are:
.nf

.ne 4
    if EXPR
    unless EXPR
    while EXPR
    until EXPR

.fi
The
.I if
and
.I unless
modifiers have the expected semantics.
The
.I while
and
.I until
modifiers also have the expected semantics (conditional evaluated first),
except when applied to a do-BLOCK command,
in which case the block executes once before the conditional is evaluated.
This is so that you can write loops like:
.nf

.ne 4
    do {
        $_ = <STDIN>;
        .\|.\|.
    } until $_ \|eq \|".\|\e\|n";

.fi
(See the
.I do
operator below. Note also that the loop control commands described later will
NOT work in this construct, since modifiers don't take loop labels.
Sorry.)
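.PP
The difference in evaluation order can be seen in a minimal sketch; the
variable names are illustrative.

```perl
$ran = 0;
do {
    $ran++;                 # body runs before the test is looked at
} until 1;                  # condition is true from the start

$skipped = 0;
$skipped++ until 1;         # plain statement: test comes first, body never runs

print "$ran $skipped\n";    # prints 1 0
```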
.Sh "Expressions"
Since
.I perl
expressions work almost exactly like C expressions, only the differences
will be mentioned here.
.PP
Here's what
.I perl
has that C doesn't:
.Ip ** 8 2
The exponentiation operator.
.Ip **= 8
The exponentiation assignment operator.
.Ip (\|) 8 3
The null list, used to initialize an array to null.
.Ip . 8
Concatenation of two strings.
.Ip .= 8
The concatenation assignment operator.
.Ip eq 8
String equality (== is numeric equality).
For a mnemonic just think of \*(L"eq\*(R" as a string.
(If you are used to the
.I awk
behavior of using == for either string or numeric equality
based on the current form of the comparands, beware!
You must be explicit here.)
.Ip ne 8
String inequality (!= is numeric inequality).
.Ip lt 8
String less than.
.Ip gt 8
String greater than.
.Ip le 8
String less than or equal.
.Ip ge 8
String greater than or equal.
.Ip =~ 8 2
Certain operations search or modify the string \*(L"$_\*(R" by default.
This operator makes that kind of operation work on some other string.
The right argument is a search pattern, substitution, or translation.
The left argument is what is supposed to be searched, substituted, or
translated instead of the default \*(L"$_\*(R".
The return value indicates the success of the operation.
(If the right argument is an expression other than a search pattern,
substitution, or translation, it is interpreted as a search pattern
at run time.
This is less efficient than an explicit search, since the pattern must
be compiled every time the expression is evaluated.)
The precedence of this operator is lower than unary minus and autoincrement/decrement, but higher than everything else.
.Ip !~ 8
Just like =~ except the return value is negated.
.Ip x 8
The repetition operator.
Returns a string consisting of the left operand repeated the
number of times specified by the right operand.
.nf

    print \'\-\' x 80;    # print row of dashes
    print \'\-\' x80;    # illegal, x80 is identifier

    print "\et" x ($tab/8), \' \' x ($tab%8);    # tab over

.fi
.Ip x= 8
The repetition assignment operator.
.Ip .\|. 8
The range operator, which is really two different operators depending
on the context.
In an array context, returns an array of values counting (by ones)
from the left value to the right value.
This is useful for writing \*(L"for (1..10)\*(R" loops and for doing
slice operations on arrays.
.Sp
In a scalar context, .\|. returns a boolean value.
The operator is bistable, like a flip-flop.
Each .\|. operator maintains its own boolean state.
It is false as long as its left operand is false.
Once the left operand is true, the range operator stays true
until the right operand is true,
AFTER which the range operator becomes false again.
(It doesn't become false till the next time the range operator is evaluated.
It can become false on the same evaluation it became true, but it still returns
true once.)
The right operand is not evaluated while the operator is in the \*(L"false\*(R" state,
and the left operand is not evaluated while the operator is in the \*(L"true\*(R" state.
The scalar .\|. operator is primarily intended for doing line number ranges
after
the fashion of \fIsed\fR or \fIawk\fR.
The precedence is a little lower than || and &&.
The value returned is either the null string for false, or a sequence number
(beginning with 1) for true.
The sequence number is reset for each range encountered.
The final sequence number in a range has the string \'E0\' appended to it, which
doesn't affect its numeric value, but gives you something to search for if you
want to exclude the endpoint.
You can exclude the beginning point by waiting for the sequence number to be
greater than 1.
If either operand of scalar .\|. is static, that operand is implicitly compared
to the $. variable, the current line number.
Examples:
.nf

.ne 6
As a scalar operator:
    if (101 .\|. 200) { print; }    # print 2nd hundred lines

    next line if (1 .\|. /^$/);    # skip header lines

    s/^/> / if (/^$/ .\|. eof());    # quote body

.ne 4
As an array operator:
    for (101 .\|. 200) { print; }    # print $_ 100 times

    @foo = @foo[$[ .\|. $#foo];    # an expensive no-op
    @foo = @foo[$#foo-4 .\|. $#foo];    # slice last 5 items

.fi
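.Sp
As a rough sketch of the sequence numbers and the E0 marker, the loop below
runs a scalar range over a made-up list of lines.
Both operands are written as explicit pattern matches against $line, so no
comparison with $. is involved.
The BEGIN and END markers and all variable names are assumptions for the
example.

```perl
@lines = ("head\n", "BEGIN\n", "a\n", "b\n", "END\n", "tail\n");
@seq = ();
@body = ();

foreach $line (@lines) {
    # One .. operator, evaluated once per line, keeps its own state.
    $n = ($line =~ /^BEGIN/) .. ($line =~ /^END/);
    push(@seq, $n);
    # Skip the start (sequence number 1) and the end (marked with E0).
    push(@body, $line) if $n > 1 && $n !~ /E0$/;
}

print join(',', @seq), "\n";    # prints ,1,2,3,4E0,
print @body;                    # prints the lines a and b
```

The empty first and last fields in the joined output are the null strings
returned while the range is false.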
.Ip \-x 8
A file test.
This unary operator takes one argument, either a filename or a filehandle,
and tests the associated file to see if something is true about it.
If the argument is omitted, tests $_, except for \-t, which tests
.IR STDIN .
It returns 1 for true and \'\' for false, or the undefined value if the
file doesn't exist.
Precedence is higher than logical and relational operators, but lower than
arithmetic operators.
The operator may be any of:
.nf
    \-r    File is readable by effective uid.
    \-w    File is writable by effective uid.
    \-x    File is executable by effective uid.
    \-o    File is owned by effective uid.
    \-R    File is readable by real uid.
    \-W    File is writable by real uid.
    \-X    File is executable by real uid.
    \-O    File is owned by real uid.
    \-e    File exists.
    \-z    File has zero size.
    \-s    File has non-zero size (returns size).
    \-f    File is a plain file.
    \-d    File is a directory.
    \-l    File is a symbolic link.
    \-p    File is a named pipe (FIFO).
    \-S    File is a socket.
    \-b    File is a block special file.
    \-c    File is a character special file.
    \-u    File has setuid bit set.
    \-g    File has setgid bit set.
    \-k    File has sticky bit set.
    \-t    Filehandle is opened to a tty.
    \-T    File is a text file.
    \-B    File is a binary file (opposite of \-T).

.fi
The interpretation of the file permission operators \-r, \-R, \-w, \-W, \-x and \-X
is based solely on the mode of the file and the uids and gids of the user.
There may be other reasons you can't actually read, write or execute the file.
Also note that, for the superuser, \-r, \-R, \-w and \-W always return 1, and
\-x and \-X return 1 if any execute bit is set in the mode.
Scripts run by the superuser may thus need to do a stat() in order to determine
the actual mode of the file, or temporarily set the uid to something else.
.Sp
Example:
.nf
.ne 7

    while (<>) {
        chop;
        next unless \-f $_;    # ignore specials
        .\|.\|.
    }

.fi
Note that \-s/a/b/ does not do a negated substitution.
Saying \-exp($foo) still works as expected, however\*(--only single letters
following a minus are interpreted as file tests.
.Sp
The \-T and \-B switches work as follows.
The first block or so of the file is examined for odd characters such as
strange control codes or metacharacters.
If too many odd characters (>10%) are found, it's a \-B file, otherwise it's a \-T file.
Also, any file containing null in the first block is considered a binary file.
If \-T or \-B is used on a filehandle, the current stdio buffer is examined
rather than the first block.
Both \-T and \-B return TRUE on a null file, or a file at EOF when testing
a filehandle.
.PP
If any of the file tests (or either stat operator) are given the special
filehandle consisting of a solitary underline, then the stat structure
of the previous file test (or stat operator) is used, saving a system
call.
(This doesn't work with \-t, and you need to remember that lstat and \-l
will leave values in the stat structure for the symbolic link, not the
real file.)
Example:
.nf

    print "Can do.\en" if -r $a || -w _ || -x _;

.ne 9
    stat($filename);
    print "Readable\en" if -r _;
    print "Writable\en" if -w _;
    print "Executable\en" if -x _;
    print "Setuid\en" if -u _;
    print "Setgid\en" if -g _;
    print "Sticky\en" if -k _;
    print "Text\en" if -T _;
    print "Binary\en" if -B _;

.fi
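.PP
The sketch below, which assumes a Unix-like system with a writable /tmp,
creates a small scratch file, performs one file test on it by name, and then
reuses the saved stat structure through the underline filehandle.
The file name is hypothetical.

```perl
$file = "/tmp/filetest_demo.$$";    # hypothetical scratch file
open(TMP, ">$file") || die "Can't create $file: $!";
print TMP "hello\n";                # six bytes on a Unix-like system
close(TMP);

$size  = -s $file;                  # does the stat; returns the size
$plain = -f _;                      # reuses that stat structure
$dir   = -d _;                      # and again, with no further system call

print "size=$size plain=", ($plain ? 'yes' : 'no'),
      " dir=", ($dir ? 'yes' : 'no'), "\n";
unlink($file);
```

Only the first test names the file; the two that follow it cost no extra
system call.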
.PP
Here is what C has that
.I perl
doesn't:
.Ip "unary &" 12
Address-of operator.
.Ip "unary *" 12
Dereference-address operator.
.Ip "(TYPE)" 12
Type casting operator.
.PP
Like C,
.I perl
does a certain amount of expression evaluation at compile time, whenever
it determines that all of the arguments to an operator are static and have
no side effects.
In particular, string concatenation happens at compile time between literals that don't do variable substitution.
Backslash interpretation also happens at compile time.
You can say
.nf

.ne 2
    \'Now is the time for all\' . "\|\e\|n" .
    \'good men to come to.\'

.fi
and this all reduces to one string internally.
.PP
The autoincrement operator has a little extra built-in magic to it.
If you increment a variable that is numeric, or that has ever been used in
a numeric context, you get a normal increment.
If, however, the variable has only been used in string contexts since it
was set, and has a value that is not null and matches the
pattern /^[a\-zA\-Z]*[0\-9]*$/, the increment is done
as a string, preserving each character within its range, with carry:
.nf

    print ++($foo = \'99\');    # prints \*(L'100\*(R'
    print ++($foo = \'a0\');    # prints \*(L'a1\*(R'
    print ++($foo = \'Az\');    # prints \*(L'Ba\*(R'
    print ++($foo = \'zz\');    # prints \*(L'aaa\*(R'

.fi
The autodecrement is not magical.
.PP
The range operator (in an array context) makes use of the magical
autoincrement algorithm if the minimum and maximum are strings.
You can say

    @alphabet = (\'A\' .. \'Z\');

to get all the letters of the alphabet, or

    $hexdigit = (0 .. 9, \'a\' .. \'f\')[$num & 15];

to get a hexadecimal digit, or

    @z2 = (\'01\' .. \'31\'); print @z2[$mday];

to get dates with leading zeros.
(If the final value specified is not in the sequence that the magical increment
would produce, the sequence goes until the next value would be longer than
the final value specified.)
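.PP
A few more cases, offered as a sketch, show the carry behaviour in both the
increment and the range; the variable names are made up.

```perl
$id = 'Zz';  $id++;             # carry past Zz adds a character
$tag = 'a9'; $tag++;            # the digit carries into the letter
@letters = ('a' .. 'e');
@days = ('08' .. '11');         # leading zero kept until the carry

print "$id $tag\n";             # prints AAa b0
print "@letters\n";             # prints a b c d e
print "@days\n";                # prints 08 09 10 11
```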
.PP
The || and && operators differ from C's in that, rather than returning 0 or 1,
they return the last value evaluated.
Thus, a portable way to find out the home directory might be:
.nf

    $home = $ENV{'HOME'} || $ENV{'LOGDIR'} ||
        (getpwuid($<))[7] || die "You're homeless!\en";

.fi
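.PP
The same property can be seen with plain values in a short sketch (the names
and values are illustrative):

```perl
$name = '' || 0 || 'guest';     # first true value wins
$port = 8080 || 80;             # 8080 is already true, so 80 is never looked at
$both = 'x' && 'y';             # && returns the last value it had to evaluate
$none = 'x' && '';              # here that last value is the false one

print "$name $port $both [$none]\n";    # prints guest 8080 y []
```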