perl.man.1

   1 .rn '' }`
   2 ''' $Header: perl_man.1,v 3.0.1.9 90/10/20 02:14:24 lwall Locked $
   3 '''
   4 ''' $Log:       perl.man.1,v $
   5 ''' Revision 3.0.1.9  90/10/20  02:14:24  lwall
   6 ''' patch37: fixed various typos in man page
   7 '''
   8 ''' Revision 3.0.1.8  90/10/15  18:16:19  lwall
   9 ''' patch29: added DATA filehandle to read stuff after __END__
  10 ''' patch29: added cmp and <=>
  11 ''' patch29: added -M, -A and -C
  12 '''
  13 ''' Revision 3.0.1.7  90/08/09  04:24:03  lwall
  14 ''' patch19: added -x switch to extract script from input trash
  15 ''' patch19: Added -c switch to do compilation only
  16 ''' patch19: bare identifiers are now strings if no other interpretation possible
  17 ''' patch19: -s now returns size of file
  18 ''' patch19: Added __LINE__ and __FILE__ tokens
  19 ''' patch19: Added __END__ token
  20 '''
  21 ''' Revision 3.0.1.6  90/08/03  11:14:44  lwall
  22 ''' patch19: Intermediate diffs for Randal
  23 '''
  24 ''' Revision 3.0.1.5  90/03/27  16:14:37  lwall
  25 ''' patch16: .. now works using magical string increment
  26 '''
  27 ''' Revision 3.0.1.4  90/03/12  16:44:33  lwall
  28 ''' patch13: (LIST,) now legal
  29 ''' patch13: improved LIST documentation
  30 ''' patch13: example of if-elsif switch was wrong
  31 '''
  32 ''' Revision 3.0.1.3  90/02/28  17:54:32  lwall
  33 ''' patch9: @array in scalar context now returns length of array
  34 ''' patch9: in manual, example of open and ?: was backwards
  35 '''
  36 ''' Revision 3.0.1.2  89/11/17  15:30:03  lwall
  37 ''' patch5: fixed some manual typos and indent problems
  38 '''
  39 ''' Revision 3.0.1.1  89/11/11  04:41:22  lwall
  40 ''' patch2: explained about sh and ${1+"$@"}
  41 ''' patch2: documented that space must separate word and '' string
  42 '''
  43 ''' Revision 3.0  89/10/18  15:21:29  lwall
  44 ''' 3.0 baseline
  45 '''
  46 '''
  47 .de Sh
  48 .br
  49 .ne 5
  50 .PP
  51 \fB\\$1\fR
  52 .PP
  53 ..
  54 .de Sp
  55 .if t .sp .5v
  56 .if n .sp
  57 ..
  58 .de Ip
  59 .br
  60 .ie \\n(.$>=3 .ne \\$3
  61 .el .ne 3
  62 .IP "\\$1" \\$2
  63 ..
  64 '''
  65 '''     Set up \*(-- to give an unbreakable dash;
  66 '''     string Tr holds user defined translation string.
  67 '''     Bell System Logo is used as a dummy character.
  68 '''
  69 .tr \(*W-|\(bv\*(Tr
  70 .ie n \{\
  71 .ds -- \(*W-
  72 .if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
  73 .if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
  74 .ds L" ""
  75 .ds R" ""
  76 .ds L' '
  77 .ds R' '
  78 'br\}
  79 .el\{\
  80 .ds -- \(em\|
  81 .tr \*(Tr
  82 .ds L" ``
  83 .ds R" ''
  84 .ds L' `
  85 .ds R' '
  86 'br\}
  87 .TH PERL 1 "\*(RP"
  88 .UC
  89 .SH NAME
  90 perl \- Practical Extraction and Report Language
  91 .SH SYNOPSIS
  92 .B perl
  93 [options] filename args
  94 .SH DESCRIPTION
  95 .I Perl
  96 is an interpreted language optimized for scanning arbitrary text files,
  97 extracting information from those text files, and printing reports based
  98 on that information.
  99 It's also a good language for many system management tasks.
 100 The language is intended to be practical (easy to use, efficient, complete)
 101 rather than beautiful (tiny, elegant, minimal).
 102 It combines (in the author's opinion, anyway) some of the best features of C,
 103 \fIsed\fR, \fIawk\fR, and \fIsh\fR,
 104 so people familiar with those languages should have little difficulty with it.
 105 (Language historians will also note some vestiges of \fIcsh\fR, Pascal, and
 106 even BASIC-PLUS.)
 107 Expression syntax corresponds quite closely to C expression syntax.
 108 Unlike most Unix utilities,
 109 .I perl
 110 does not arbitrarily limit the size of your data\*(--if you've got
 111 the memory,
 112 .I perl
 113 can slurp in your whole file as a single string.
 114 Recursion is of unlimited depth.
 115 And the hash tables used by associative arrays grow as necessary to prevent
 116 degraded performance.
 117 .I Perl
 118 uses sophisticated pattern matching techniques to scan large amounts of
 119 data very quickly.
 120 Although optimized for scanning text,
 121 .I perl
 122 can also deal with binary data, and can make dbm files look like associative
 123 arrays (where dbm is available).
 124 Setuid
 125 .I perl
 126 scripts are safer than C programs
 127 through a dataflow tracing mechanism which prevents many stupid security holes.
 128 If you have a problem that would ordinarily use \fIsed\fR
 129 or \fIawk\fR or \fIsh\fR, but it
 130 exceeds their capabilities or must run a little faster,
 131 and you don't want to write the silly thing in C, then
 132 .I perl
 133 may be for you.
 134 There are also translators to turn your
 135 .I sed
 136 and
 137 .I awk
 138 scripts into
 139 .I perl
 140 scripts.
 141 OK, enough hype.
 142 .PP
 143 Upon startup,
 144 .I perl
 145 looks for your script in one of the following places:
 146 .Ip 1. 4 2
 147 Specified line by line via
 148 .B \-e
 149 switches on the command line.
 150 .Ip 2. 4 2
 151 Contained in the file specified by the first filename on the command line.
 152 (Note that systems supporting the #! notation invoke interpreters this way.)
 153 .Ip 3. 4 2
 154 Passed in implicitly via standard input.
 155 This only works if there are no filename arguments\*(--to pass
 156 arguments to a
 157 .I stdin
 158 script you must explicitly specify a \- for the script name.
 159 .PP
 160 After locating your script,
 161 .I perl
 162 compiles it to an internal form.
 163 If the script is syntactically correct, it is executed.
 164 .Sh "Options"
 165 Note: on first reading this section may not make much sense to you.  It's here
 166 at the front for easy reference.
 167 .PP
 168 A single-character option may be combined with the following option, if any.
 169 This is particularly useful when invoking a script using the #! construct which
 170 only allows one argument.  Example:
 171 .nf
 172
 173 .ne 2
 174         #!/usr/bin/perl \-spi.bak       # same as \-s \-p \-i.bak
 175         .\|.\|.
 176
 177 .fi
 178 Options include:
 179 .TP 5
 180 .B \-a
 181 turns on autosplit mode when used with a
 182 .B \-n
 183 or
 184 .BR \-p .
 185 An implicit split command to the @F array
 186 is done as the first thing inside the implicit while loop produced by
 187 the
 188 .B \-n
 189 or
 190 .BR \-p .
 191 .nf
 192
 193         perl \-ane \'print pop(@F), "\en";\'
 194
 195 is equivalent to
 196
 197         while (<>) {
 198                 @F = split(\' \');
 199                 print pop(@F), "\en";
 200         }
 201
 202 .fi
 203 .TP 5
 204 .B \-c
 205 causes
 206 .I perl
 207 to check the syntax of the script and then exit without executing it.
 208 .TP 5
  209 .B \-d
 210 runs the script under the perl debugger.
 211 See the section on Debugging.
 212 .TP 5
 213 .BI \-D number
 214 sets debugging flags.
 215 To watch how it executes your script, use
 216 .BR \-D14 .
 217 (This only works if debugging is compiled into your
 218 .IR perl .)
 219 Another nice value is \-D1024, which lists your compiled syntax tree.
 220 And \-D512 displays compiled regular expressions.
 221 .TP 5
 222 .BI \-e " commandline"
 223 may be used to enter one line of script.
 224 Multiple
 225 .B \-e
 226 commands may be given to build up a multi-line script.
 227 If
 228 .B \-e
 229 is given,
 230 .I perl
 231 will not look for a script filename in the argument list.
 232 .TP 5
 233 .BI \-i extension
 234 specifies that files processed by the <> construct are to be edited
 235 in-place.
 236 It does this by renaming the input file, opening the output file by the
 237 same name, and selecting that output file as the default for print statements.
 238 The extension, if supplied, is added to the name of the
 239 old file to make a backup copy.
 240 If no extension is supplied, no backup is made.
 241 Saying \*(L"perl \-p \-i.bak \-e "s/foo/bar/;" .\|.\|. \*(R" is the same as using
 242 the script:
 243 .nf
 244
 245 .ne 2
 246         #!/usr/bin/perl \-pi.bak
 247         s/foo/bar/;
 248
 249 which is equivalent to
 250
 251 .ne 14
 252         #!/usr/bin/perl
 253         while (<>) {
 254                 if ($ARGV ne $oldargv) {
 255                         rename($ARGV, $ARGV . \'.bak\');
 256                         open(ARGVOUT, ">$ARGV");
 257                         select(ARGVOUT);
 258                         $oldargv = $ARGV;
 259                 }
 260                 s/foo/bar/;
 261         }
 262         continue {
 263             print;      # this prints to original filename
 264         }
 265         select(STDOUT);
 266
 267 .fi
 268 except that the
 269 .B \-i
 270 form doesn't need to compare $ARGV to $oldargv to know when
 271 the filename has changed.
 272 It does, however, use ARGVOUT for the selected filehandle.
 273 Note that
 274 .I STDOUT
 275 is restored as the default output filehandle after the loop.
 276 .Sp
 277 You can use eof to locate the end of each input file, in case you want
 278 to append to each file, or reset line numbering (see example under eof).
 279 .TP 5
 280 .BI \-I directory
 281 may be used in conjunction with
 282 .B \-P
 283 to tell the C preprocessor where to look for include files.
 284 By default /usr/include and /usr/lib/perl are searched.
 285 .TP 5
 286 .B \-n
 287 causes
 288 .I perl
 289 to assume the following loop around your script, which makes it iterate
 290 over filename arguments somewhat like \*(L"sed \-n\*(R" or \fIawk\fR:
 291 .nf
 292
 293 .ne 3
 294         while (<>) {
 295                 .\|.\|.         # your script goes here
 296         }
 297
 298 .fi
 299 Note that the lines are not printed by default.
 300 See
 301 .B \-p
 302 to have lines printed.
 303 Here is an efficient way to delete all files older than a week:
 304 .nf
 305
 306         find . \-mtime +7 \-print | perl \-ne \'chop;unlink;\'
 307
 308 .fi
 309 This is faster than using the \-exec switch of find because you don't have to
 310 start a process on every filename found.
 311 .TP 5
 312 .B \-p
 313 causes
 314 .I perl
 315 to assume the following loop around your script, which makes it iterate
 316 over filename arguments somewhat like \fIsed\fR:
 317 .nf
 318
 319 .ne 5
 320         while (<>) {
 321                 .\|.\|.         # your script goes here
 322         } continue {
 323                 print;
 324         }
 325
 326 .fi
 327 Note that the lines are printed automatically.
 328 To suppress printing use the
 329 .B \-n
 330 switch.
 331 A
 332 .B \-p
 333 overrides a
 334 .B \-n
 335 switch.
 336 .TP 5
 337 .B \-P
 338 causes your script to be run through the C preprocessor before
 339 compilation by
 340 .IR perl .
 341 (Since both comments and cpp directives begin with the # character,
 342 you should avoid starting comments with any words recognized
 343 by the C preprocessor such as \*(L"if\*(R", \*(L"else\*(R" or \*(L"define\*(R".)
 344 .TP 5
 345 .B \-s
 346 enables some rudimentary switch parsing for switches on the command line
 347 after the script name but before any filename arguments (or before a \-\|\-).
 348 Any switch found there is removed from @ARGV and sets the corresponding variable in the
 349 .I perl
 350 script.
 351 The following script prints \*(L"true\*(R" if and only if the script is
 352 invoked with a \-xyz switch.
 353 .nf
 354
 355 .ne 2
 356         #!/usr/bin/perl \-s
 357         if ($xyz) { print "true\en"; }
 358
 359 .fi
 360 .TP 5
 361 .B \-S
 362 makes
 363 .I perl
 364 use the PATH environment variable to search for the script
 365 (unless the name of the script starts with a slash).
 366 Typically this is used to emulate #! startup on machines that don't
 367 support #!, in the following manner:
 368 .nf
 369
 370         #!/usr/bin/perl
 371         eval "exec /usr/bin/perl \-S $0 $*"
 372                 if $running_under_some_shell;
 373
 374 .fi
 375 The system ignores the first line and feeds the script to /bin/sh,
 376 which proceeds to try to execute the
 377 .I perl
 378 script as a shell script.
 379 The shell executes the second line as a normal shell command, and thus
 380 starts up the
 381 .I perl
 382 interpreter.
 383 On some systems $0 doesn't always contain the full pathname,
 384 so the
 385 .B \-S
 386 tells
 387 .I perl
 388 to search for the script if necessary.
 389 After
 390 .I perl
 391 locates the script, it parses the lines and ignores them because
 392 the variable $running_under_some_shell is never true.
 393 A better construct than $* would be ${1+"$@"}, which handles embedded spaces
 394 and such in the filenames, but doesn't work if the script is being interpreted
 395 by csh.
 396 In order to start up sh rather than csh, some systems may have to replace the
 397 #! line with a line containing just
 398 a colon, which will be politely ignored by perl.
 399 Other systems can't control that, and need a totally devious construct that
 400 will work under any of csh, sh or perl, such as the following:
 401 .nf
 402
 403 .ne 3
 404         eval '(exit $?0)' && eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
 405         & eval 'exec /usr/bin/perl -S $0 $argv:q'
 406                 if 0;
 407
 408 .fi
 409 .TP 5
 410 .B \-u
 411 causes
 412 .I perl
 413 to dump core after compiling your script.
 414 You can then take this core dump and turn it into an executable file
 415 by using the undump program (not supplied).
 416 This speeds startup at the expense of some disk space (which you can
 417 minimize by stripping the executable).
 418 (Still, a "hello world" executable comes out to about 200K on my machine.)
 419 If you are going to run your executable as a set-id program then you
 420 should probably compile it using taintperl rather than normal perl.
 421 If you want to execute a portion of your script before dumping, use the
 422 dump operator instead.
  423 Note: undump is platform specific and may not be available
  424 for a particular port of perl.
 425 .TP 5
 426 .B \-U
 427 allows
 428 .I perl
 429 to do unsafe operations.
 430 Currently the only \*(L"unsafe\*(R" operation is the unlinking of directories while
 431 running as superuser.
 432 .TP 5
 433 .B \-v
 434 prints the version and patchlevel of your
 435 .I perl
 436 executable.
 437 .TP 5
 438 .B \-w
 439 prints warnings about identifiers that are mentioned only once, and scalar
 440 variables that are used before being set.
 441 Also warns about redefined subroutines, and references to undefined
 442 filehandles or filehandles opened readonly that you are attempting to
 443 write on.
 444 Also warns you if you use == on values that don't look like numbers, and if
 445 your subroutines recurse more than 100 deep.
 446 .TP 5
 447 .BI \-x directory
 448 tells
 449 .I perl
 450 that the script is embedded in a message.
 451 Leading garbage will be discarded until the first line that starts
 452 with #! and contains the string "perl".
 453 Any meaningful switches on that line will be applied (but only one
 454 group of switches, as with normal #! processing).
 455 If a directory name is specified, Perl will switch to that directory
 456 before running the script.
 457 The
 458 .B \-x
  459 switch only controls the disposal of leading garbage.
 460 The script must be terminated with __END__ if there is trailing garbage
 461 to be ignored (the script can process any or all of the trailing garbage
 462 via the DATA filehandle if desired).
 463 .Sh "Data Types and Objects"
 464 .PP
 465 .I Perl
 466 has three data types: scalars, arrays of scalars, and
 467 associative arrays of scalars.
 468 Normal arrays are indexed by number, and associative arrays by string.
 469 .PP
 470 The interpretation of operations and values in perl sometimes
 471 depends on the requirements
 472 of the context around the operation or value.
 473 There are three major contexts: string, numeric and array.
 474 Certain operations return array values
 475 in contexts wanting an array, and scalar values otherwise.
 476 (If this is true of an operation it will be mentioned in the documentation
 477 for that operation.)
 478 Operations which return scalars don't care whether the context is looking
 479 for a string or a number, but
 480 scalar variables and values are interpreted as strings or numbers
 481 as appropriate to the context.
 482 A scalar is interpreted as TRUE in the boolean sense if it is not the null
 483 string or 0.
 484 Booleans returned by operators are 1 for true and 0 or \'\' (the null
 485 string) for false.
 486 .PP
 487 There are actually two varieties of null string: defined and undefined.
 488 Undefined null strings are returned when there is no real value for something,
 489 such as when there was an error, or at end of file, or when you refer
 490 to an uninitialized variable or element of an array.
 491 An undefined null string may become defined the first time you access it, but
 492 prior to that you can use the defined() operator to determine whether the
 493 value is defined or not.
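The truth rule and the two kinds of null string can be seen in a few lines.
This is only a sketch: every variable name in it is made up for illustration,
and $never_set is deliberately never assigned.

```perl
# Which scalars count as TRUE?  Anything but the null string or 0.
@report = ();
foreach $value ('', 0, '0', 1, 'a', ' ') {
    push(@report, $value ? 'TRUE' : 'FALSE');
}
$report = join(' ', @report);   # FALSE FALSE FALSE TRUE TRUE TRUE

# Defined versus undefined null strings.
$was_defined = defined($never_set) ? 1 : 0;     # 0: never assigned
$empty = '';
$is_defined = defined($empty) ? 1 : 0;          # 1: null, but defined
$same_string = ($never_set eq $empty) ? 1 : 0;  # 1: both compare as ''
```

Note that the defined() test comes before $never_set is used in the comparison.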
 494 .PP
 495 References to scalar variables always begin with \*(L'$\*(R', even when referring
 496 to a scalar that is part of an array.
 497 Thus:
 498 .nf
 499
 500 .ne 3
 501     $days       \h'|2i'# a simple scalar variable
 502     $days[28]   \h'|2i'# 29th element of array @days
 503     $days{\'Feb\'}\h'|2i'# one value from an associative array
 504     $#days      \h'|2i'# last index of array @days
 505
 506 but entire arrays or array slices are denoted by \*(L'@\*(R':
 507
 508     @days       \h'|2i'# ($days[0], $days[1],\|.\|.\|. $days[n])
 509     @days[3,4,5]\h'|2i'# same as @days[3.\|.5]
 510     @days{'a','c'}\h'|2i'# same as ($days{'a'},$days{'c'})
 511
 512 and entire associative arrays are denoted by \*(L'%\*(R':
 513
 514     %days       \h'|2i'# (key1, val1, key2, val2 .\|.\|.)
 515 .fi
 516 .PP
 517 Any of these eight constructs may serve as an lvalue,
 518 that is, may be assigned to.
 519 (It also turns out that an assignment is itself an lvalue in
 520 certain contexts\*(--see examples under s, tr and chop.)
 521 Assignment to a scalar evaluates the righthand side in a scalar context,
 522 while assignment to an array or array slice evaluates the righthand side
 523 in an array context.
 524 .PP
 525 You may find the length of array @days by evaluating
 526 \*(L"$#days\*(R", as in
 527 .IR csh .
 528 (Actually, it's not the length of the array, it's the subscript of the last element, since there is (ordinarily) a 0th element.)
 529 Assigning to $#days changes the length of the array.
 530 Shortening an array by this method does not actually destroy any values.
 531 Lengthening an array that was previously shortened recovers the values that
 532 were in those elements.
 533 You can also gain some measure of efficiency by preextending an array that
 534 is going to get big.
 535 (You can also extend an array by assigning to an element that is off the
 536 end of the array.
 537 This differs from assigning to $#whatever in that intervening values
 538 are set to null rather than recovered.)
 539 You can truncate an array down to nothing by assigning the null list () to
 540 it.
  541 The following are exactly equivalent:
 542 .nf
 543
 544         @whatever = ();
 545         $#whatever = $[ \- 1;
 546
 547 .fi
 548 .PP
 549 If you evaluate an array in a scalar context, it returns the length of
 550 the array.
 551 The following is always true:
 552 .nf
 553
 554         @whatever == $#whatever \- $[ + 1;
 555
 556 .fi
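A short worked example of the length rules above, assuming $[ is left at its
default of 0.
The array contents are arbitrary.

```perl
@days = ('Mon', 'Tue', 'Wed');
$last = $#days;         # 2, the subscript of the last element
$count = @days;         # 3, the array evaluated in a scalar context

$days[5] = 'Sat';       # assigning past the end extends the array
$grown = @days;         # 6; elements 3 and 4 are null

$#days = 1;             # shorten to ('Mon', 'Tue')
$kept = "@days";        # Mon Tue

@days = ();             # truncate to nothing
$empty_last = $#days;   # -1, that is, $[ - 1
```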
 557 .PP
 558 Multi-dimensional arrays are not directly supported, but see the discussion
 559 of the $; variable later for a means of emulating multiple subscripts with
 560 an associative array.
 561 You could also write a subroutine to turn multiple subscripts into a single
 562 subscript.
 563 .PP
 564 Every data type has its own namespace.
 565 You can, without fear of conflict, use the same name for a scalar variable,
 566 an array, an associative array, a filehandle, a subroutine name, and/or
 567 a label.
 568 Since variable and array references always start with \*(L'$\*(R', \*(L'@\*(R',
 569 or \*(L'%\*(R', the \*(L"reserved\*(R" words aren't in fact reserved
 570 with respect to variable names.
 571 (They ARE reserved with respect to labels and filehandles, however, which
 572 don't have an initial special character.
 573 Hint: you could say open(LOG,\'logfile\') rather than open(log,\'logfile\').
 574 Using uppercase filehandles also improves readability and protects you
 575 from conflict with future reserved words.)
 576 Case IS significant\*(--\*(L"FOO\*(R", \*(L"Foo\*(R" and \*(L"foo\*(R" are all
 577 different names.
 578 Names which start with a letter may also contain digits and underscores.
 579 Names which do not start with a letter are limited to one character,
 580 e.g. \*(L"$%\*(R" or \*(L"$$\*(R".
 581 (Most of the one character names have a predefined significance to
 582 .IR perl .
 583 More later.)
 584 .PP
 585 Numeric literals are specified in any of the usual floating point or
 586 integer formats:
 587 .nf
 588
 589 .ne 5
 590     12345
 591     12345.67
 592     .23E-10
 593     0xffff      # hex
 594     0377        # octal
 595
 596 .fi
 597 String literals are delimited by either single or double quotes.
 598 They work much like shell quotes:
 599 double-quoted string literals are subject to backslash and variable
 600 substitution; single-quoted strings are not (except for \e\' and \e\e).
 601 The usual backslash rules apply for making characters such as newline, tab, etc.
 602 You can also embed newlines directly in your strings, i.e. they can end on
 603 a different line than they begin.
 604 This is nice, but if you forget your trailing quote, the error will not be
 605 reported until
 606 .I perl
 607 finds another line containing the quote character, which
 608 may be much further on in the script.
 609 Variable substitution inside strings is limited to scalar variables, normal
 610 array values, and array slices.
 611 (In other words, identifiers beginning with $ or @, followed by an optional
 612 bracketed expression as a subscript.)
 613 The following code segment prints out \*(L"The price is $100.\*(R"
 614 .nf
 615
 616 .ne 2
 617     $Price = \'$100\';\h'|3.5i'# not interpreted
 618     print "The price is $Price.\e\|n";\h'|3.5i'# interpreted
 619
 620 .fi
 621 Note that you can put curly brackets around the identifier to delimit it
 622 from following alphanumerics.
 623 Also note that a single quoted string must be separated from a preceding
 624 word by a space, since single quote is a valid character in an identifier
 625 (see Packages).
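Here is a sketch of what the curly brackets buy you.
The variable names are invented for the example, and $fruits is deliberately
left unset.

```perl
$Price = '$100';                         # single quotes: not interpreted
$fruit = 'apple';

$plain = "Two $fruit cost $Price.";      # Two apple cost $100.
$plural = "Two ${fruit}s cost $Price.";  # Two apples cost $100.
$wrong = "Two $fruits cost $Price.";     # $fruits is unset: Two  cost $100.
```

Without the brackets, the trailing s is taken as part of the identifier, and
the unset variable $fruits is substituted as a null string.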
 626 .PP
 627 Two special literals are __LINE__ and __FILE__, which represent the current
 628 line number and filename at that point in your program.
 629 They may only be used as separate tokens; they will not be interpolated
 630 into strings.
 631 In addition, the token __END__ may be used to indicate the logical end of the
 632 script before the actual end of file.
 633 Any following text is ignored (but may be read via the DATA filehandle).
  634 The two control characters ^D and ^Z are synonyms for __END__.
 635 .PP
 636 A word that doesn't have any other interpretation in the grammar will be
 637 treated as if it had single quotes around it.
 638 For this purpose, a word consists only of alphanumeric characters and underline,
 639 and must start with an alphabetic character.
 640 As with filehandles and labels, a bare word that consists entirely of
 641 lowercase letters risks conflict with future reserved words, and if you
 642 use the
 643 .B \-w
 644 switch, Perl will warn you about any such words.
 645 .PP
 646 Array values are interpolated into double-quoted strings by joining all the
 647 elements of the array with the delimiter specified in the $" variable,
 648 space by default.
 649 (Since in versions of perl prior to 3.0 the @ character was not a metacharacter
 650 in double-quoted strings, the interpolation of @array, $array[EXPR],
 651 @array[LIST], $array{EXPR}, or @array{LIST} only happens if array is
 652 referenced elsewhere in the program or is predefined.)
 653 The following are equivalent:
 654 .nf
 655
 656 .ne 4
 657         $temp = join($",@ARGV);
 658         system "echo $temp";
 659
 660         system "echo @ARGV";
 661
 662 .fi
 663 Within search patterns (which also undergo double-quotish substitution)
 664 there is a bad ambiguity:  Is /$foo[bar]/ to be
 665 interpreted as /${foo}[bar]/ (where [bar] is a character class for the
 666 regular expression) or as /${foo[bar]}/ (where [bar] is the subscript to
 667 array @foo)?
 668 If @foo doesn't otherwise exist, then it's obviously a character class.
 669 If @foo exists, perl takes a good guess about [bar], and is almost always right.
 670 If it does guess wrong, or if you're just plain paranoid,
 671 you can force the correct interpretation with curly brackets as above.
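The following sketch shows the $" delimiter at work and both forced
interpretations inside a pattern.
All names and values are made up for illustration.

```perl
@foo = ('first', 'second');
$foo = 'x';

$spaced = "@foo";       # first second
$" = '-';
$dashed = "@foo";       # first-second
$" = ' ';               # put the default delimiter back

# Scalar $foo followed by the character class [abc]:
$word = 'xb';
$class = ($word =~ /^${foo}[abc]$/) ? 1 : 0;         # 1

# Element 1 of array @foo:
$word = 'second';
$subscript = ($word =~ /^${foo[1]}$/) ? 1 : 0;       # 1
```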
 672 .PP
 673 A line-oriented form of quoting is based on the shell here-is syntax.
 674 Following a << you specify a string to terminate the quoted material, and all lines
 675 following the current line down to the terminating string are the value
 676 of the item.
 677 The terminating string may be either an identifier (a word), or some
 678 quoted text.
 679 If quoted, the type of quotes you use determines the treatment of the text,
 680 just as in regular quoting.
 681 An unquoted identifier works like double quotes.
 682 There must be no space between the << and the identifier.
 683 (If you put a space it will be treated as a null identifier, which is
 684 valid, and matches the first blank line\*(--see Merry Christmas example below.)
 685 The terminating string must appear by itself (unquoted and with no surrounding
 686 whitespace) on the terminating line.
 687 .nf
 688
 689         print <<EOF;            # same as above
 690 The price is $Price.
 691 EOF
 692
 693         print <<"EOF";          # same as above
 694 The price is $Price.
 695 EOF
 696
 697         print << x 10;          # null identifier is delimiter
 698 Merry Christmas!
 699
 700         print <<`EOC`;          # execute commands
 701 echo hi there
 702 echo lo there
 703 EOC
 704
 705         print <<foo, <<bar;     # you can stack them
 706 I said foo.
 707 foo
 708 I said bar.
 709 bar
 710
 711 .fi
 712 Array literals are denoted by separating individual values by commas, and
 713 enclosing the list in parentheses:
 714 .nf
 715
 716         (LIST)
 717
 718 .fi
 719 In a context not requiring an array value, the value of the array literal
 720 is the value of the final element, as in the C comma operator.
 721 For example,
 722 .nf
 723
 724 .ne 4
 725     @foo = (\'cc\', \'\-E\', $bar);
 726
 727 assigns the entire array value to array foo, but
 728
 729     $foo = (\'cc\', \'\-E\', $bar);
 730
 731 .fi
 732 assigns the value of variable bar to variable foo.
 733 Note that the value of an actual array in a scalar context is the length
 734 of the array; the following assigns to $foo the value 3:
 735 .nf
 736
 737 .ne 2
 738     @foo = (\'cc\', \'\-E\', $bar);
 739     $foo = @foo;                # $foo gets 3
 740
 741 .fi
 742 You may have an optional comma before the closing parenthesis of an
 743 array literal, so that you can say:
 744 .nf
 745
 746     @foo = (
 747         1,
 748         2,
 749         3,
 750     );
 751
 752 .fi
 753 When a LIST is evaluated, each element of the list is evaluated in
 754 an array context, and the resulting array value is interpolated into LIST
 755 just as if each individual element were a member of LIST.  Thus arrays
 756 lose their identity in a LIST\*(--the list
 757
 758         (@foo,@bar,&SomeSub)
 759
 760 contains all the elements of @foo followed by all the elements of @bar,
 761 followed by all the elements returned by the subroutine named SomeSub.
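To make the flattening concrete, here is a minimal sketch.
The body of SomeSub is an assumption made for the example; any subroutine
returning an array value would do.

```perl
sub SomeSub {
    ('x', 'y');         # value of the last expression is returned
}

@foo = (1, 2);
@bar = (3);
@all = (@foo, @bar, &SomeSub);

$count = @all;              # 5
$flat = join(',', @all);    # 1,2,3,x,y
```

Nothing in @all remembers which elements came from which array.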
 762 .PP
 763 A list value may also be subscripted like a normal array.
 764 Examples:
 765 .nf
 766
 767         $time = (stat($file))[8];       # stat returns array value
 768         $digit = ('a','b','c','d','e','f')[$digit-10];
 769         return (pop(@foo),pop(@foo))[0];
 770
 771 .fi
 772 .PP
 773 Array lists may be assigned to if and only if each element of the list
 774 is an lvalue:
 775 .nf
 776
 777     ($a, $b, $c) = (1, 2, 3);
 778
 779     ($map{\'red\'}, $map{\'blue\'}, $map{\'green\'}) = (0x00f, 0x0f0, 0xf00);
 780
 781 The final element may be an array or an associative array:
 782
 783     ($a, $b, @rest) = split;
 784     local($a, $b, %rest) = @_;
 785
 786 .fi
 787 You can actually put an array anywhere in the list, but the first array
 788 in the list will soak up all the values, and anything after it will get
 789 a null value.
 790 This may be useful in a local().
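A sketch of both cases: an array in the middle of the list, and the more
useful form where the array comes last in a local().
The subroutine total and all variable names are hypothetical.

```perl
($first, @middle, $last) = (1, 2, 3, 4);
$middle = join(',', @middle);           # 2,3,4
$last_set = defined($last) ? 1 : 0;     # 0: @middle soaked up everything

sub total {
    local($label, @nums) = @_;  # one scalar, then the array takes the rest
    local($sum) = 0;
    foreach $n (@nums) {
        $sum += $n;
    }
    "$label=$sum";
}

$result = &total('sum', 1, 2, 3);       # sum=6
```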
 791 .PP
 792 An associative array literal contains pairs of values to be interpreted
 793 as a key and a value:
 794 .nf
 795
 796 .ne 2
 797     # same as map assignment above
 798     %map = ('red',0x00f,'blue',0x0f0,'green',0xf00);
 799
 800 .fi
 801 Array assignment in a scalar context returns the number of elements
 802 produced by the expression on the right side of the assignment:
 803 .nf
 804
 805         $x = (($foo,$bar) = (3,2,1));   # set $x to 3, not 2
 806
 807 .fi
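One common use of this is to count fields as you split them.
The sample line below is made up, and the sketch reuses the %map colors from
above.

```perl
$line = 'root:x:0:0';
$nfields = (@fields = split(/:/, $line));       # 4 elements produced

# Only two values on the right, so the count is 2.
$picked = (($user, $uid) = @fields[0, 2]);      # 2

# An associative array literal is just a list of key, value pairs.
%map = ('red', 0x00f, 'blue', 0x0f0, 'green', 0xf00);
$red = $map{'red'};                             # 15
```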
 808 .PP
 809 There are several other pseudo-literals that you should know about.
 810 If a string is enclosed by backticks (grave accents), it first undergoes
 811 variable substitution just like a double quoted string.
 812 It is then interpreted as a command, and the output of that command
  813 is the value of the pseudo-literal, as in a shell.
 814 In a scalar context, a single string consisting of all the output is
 815 returned.
 816 In an array context, an array of values is returned, one for each line
 817 of output.
 818 (You can set $/ to use a different line terminator.)
 819 The command is executed each time the pseudo-literal is evaluated.
 820 The status value of the command is returned in $? (see Predefined Names
 821 for the interpretation of $?).
 822 Unlike in \f2csh\f1, no translation is done on the return
 823 data\*(--newlines remain newlines.
 824 Unlike in any of the shells, single quotes do not hide variable names
 825 in the command from interpretation.
 826 To pass a $ through to the shell you need to hide it with a backslash.
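The sketch below exercises each of these points.
It assumes a Unix-like system where the command is handed to a Bourne-style
shell and echo is available; the variable names are invented.

```perl
$all = `echo one; echo two`;        # scalar context: "one\ntwo\n"
@lines = `echo one; echo two`;      # array context: ("one\n", "two\n")
$status = $?;                       # 0 if the command succeeded
$nlines = @lines;                   # 2

$word = 'there';
$subst = `echo hi $word`;           # perl substitutes $word first
$quoted = `echo 'hi $word'`;        # single quotes do not hide it from perl

# The backslash hides the $ from perl, so the shell expands its own $word.
$passed = `word=shell; echo \$word`;
```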
 827 .PP
 828 Evaluating a filehandle in angle brackets yields the next line
 829 from that file (newline included, so it's never false until EOF, at
 830 which time an undefined value is returned).
 831 Ordinarily you must assign that value to a variable,
 832 but there is one situation where an automatic assignment happens.
 833 If (and only if) the input symbol is the only thing inside the conditional of a
 834 .I while
 835 loop, the value is
 836 automatically assigned to the variable \*(L"$_\*(R".
 837 (This may seem like an odd thing to you, but you'll use the construct
 838 in almost every
 839 .I perl
 840 script you write.)
 841 Anyway, the following lines are equivalent to each other:
 842 .nf
 843
 844 .ne 5
 845     while ($_ = <STDIN>) { print; }
 846     while (<STDIN>) { print; }
 847     for (\|;\|<STDIN>;\|) { print; }
 848     print while $_ = <STDIN>;
 849     print while <STDIN>;
 850
 851 .fi
 852 The filehandles
 853 .IR STDIN ,
 854 .I STDOUT
 855 and
 856 .I STDERR
 857 are predefined.
 858 (The filehandles
 859 .IR stdin ,
 860 .I stdout
 861 and
 862 .I stderr
 863 will also work except in packages, where they would be interpreted as
 864 local identifiers rather than global.)
 865 Additional filehandles may be created with the
 866 .I open
 867 function.
 868 .PP
 869 If a <FILEHANDLE> is used in a context that is looking for an array, an array
 870 consisting of all the input lines is returned, one line per array element.
 871 It's easy to make a LARGE data space this way, so use with care.
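A small sketch contrasting the two contexts.
The scratch file name and the filehandles OUT and IN are assumptions made for
the example.

```perl
$file = "/tmp/perl_demo.$$";            # hypothetical scratch file
open(OUT, ">$file") || die "Can't write $file: $!";
print OUT "one\ntwo\nthree\n";
close(OUT);

open(IN, $file) || die "Can't read $file: $!";
$first = <IN>;          # scalar context: just the next line
@rest = <IN>;           # array context: every remaining line
close(IN);
unlink($file);

$nrest = @rest;         # 2
```

After the array assignment the filehandle is at end of file, so a further
read in a scalar context would return an undefined value.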
 872 .PP
 873 The null filehandle <> is special and can be used to emulate the behavior of
 874 \fIsed\fR and \fIawk\fR.
 875 Input from <> comes either from standard input, or from each file listed on
 876 the command line.
 877 Here's how it works: the first time <> is evaluated, the ARGV array is checked,
 878 and if it is null, $ARGV[0] is set to \'\-\', which when opened gives you standard
 879 input.
 880 The ARGV array is then processed as a list of filenames.
 881 The loop
 882 .nf
 883
 884 .ne 3
 885         while (<>) {
 886                 .\|.\|.                 # code for each line
 887         }
 888
 889 .ne 10
 890 is equivalent to
 891
 892         unshift(@ARGV, \'\-\') \|if \|$#ARGV < $[;
 893         while ($ARGV = shift) {
 894                 open(ARGV, $ARGV);
 895                 while (<ARGV>) {
 896                         .\|.\|.         # code for each line
 897                 }
 898         }
 899
 900 .fi
 901 except that it isn't as cumbersome to say.
 902 It really does shift array ARGV and put the current filename into
 903 variable ARGV.
 904 It also uses filehandle ARGV internally.
 905 You can modify @ARGV before the first <> as long as you leave the first
 906 filename at the beginning of the array.
 907 Line numbers ($.) continue as if the input was one big happy file.
 908 (But see example under eof for how to reset line numbers on each file.)
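.PP
To see the line numbering carry over from one file to the next, something like the following can be tried.
It is only a sketch: the scratch file names under /tmp and all three subroutine names are assumptions made for the example.

```perl
# Sketch only: the /tmp file names and sub names are hypothetical.
sub write_file {
    local($name, $text) = @_;
    open(OUT, ">$name") || die "Can't create $name: $!";
    print OUT $text;
    close(OUT);
}

sub number_lines {
    local(@ARGV) = @_;          # <> shifts @ARGV, so use a local copy
    local(@seen);
    while (<>) {
        chop;                   # drop the trailing newline
        push(@seen, join(':', $., $_));
    }
    @seen;
}

sub demo {
    local(@files) = ("/tmp/perlman_a.$$", "/tmp/perlman_b.$$");
    local(@result);
    &write_file($files[0], "one\ntwo\n");
    &write_file($files[1], "three\n");
    @result = &number_lines(@files);
    unlink(@files);             # remove the scratch files
    @result;
}

print join("\n", &demo()), "\n";
```

The line from the second file is numbered 3, not 1, because <> never closes ARGV explicitly between files.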
 909 .PP
 910 .ne 5
 911 If you want to set @ARGV to your own list of files, go right ahead.
 912 If you want to pass switches into your script, you can
 913 put a loop on the front like this:
 914 .nf
 915
 916 .ne 10
 917         while ($_ = $ARGV[0], /\|^\-/\|) {
 918                 shift;
 919                 last if /\|^\-\|\-$\|/\|;
 920                 /\|^\-D\|(.*\|)/ \|&& \|($debug = $1);
 921                 /\|^\-v\|/ \|&& \|$verbose++;
 922                 .\|.\|.         # other switches
 923         }
 924         while (<>) {
 925                 .\|.\|.         # code for each line
 926         }
 927
 928 .fi
 929 The <> symbol will return FALSE only once.
 930 If you call it again after this it will assume you are processing another
 931 @ARGV list, and if you haven't set @ARGV, will input from
 932 .IR STDIN .
 933 .PP
 934 If the string inside the angle brackets is a reference to a scalar variable
 935 (e.g. <$foo>),
 936 then that variable contains the name of the filehandle to input from.
 937 .PP
 938 If the string inside angle brackets is not a filehandle, it is interpreted
 939 as a filename pattern to be globbed, and either an array of filenames or the
 940 next filename in the list is returned, depending on context.
 941 One level of $ interpretation is done first, but you can't say <$foo>
 942 because that's an indirect filehandle as explained in the previous
 943 paragraph.
 944 You could insert curly brackets to force interpretation as a
 945 filename glob: <${foo}>.
 946 Example:
 947 .nf
 948
 949 .ne 3
 950         while (<*.c>) {
 951                 chmod 0644, $_;
 952         }
 953
 954 is equivalent to
 955
 956 .ne 5
 957         open(foo, "echo *.c | tr \-s \' \et\er\ef\' \'\e\e012\e\e012\e\e012\e\e012\'|");
 958         while (<foo>) {
 959                 chop;
 960                 chmod 0644, $_;
 961         }
 962
 963 .fi
 964 In fact, it's currently implemented that way.
 965 (Which means it will not work on filenames with spaces in them unless
 966 you have /bin/csh on your machine.)
 967 Of course, the shortest way to do the above is:
 968 .nf
 969
 970         chmod 0644, <*.c>;
 971
 972 .fi
 973 .Sh "Syntax"
 974 .PP
 975 A
 976 .I perl
 977 script consists of a sequence of declarations and commands.
 978 The only things that need to be declared in
 979 .I perl
 980 are report formats and subroutines.
 981 See the sections below for more information on those declarations.
 982 All uninitialized user-created objects are assumed to
 983 start with a null or 0 value until they
 984 are defined by some explicit operation such as assignment.
 985 The sequence of commands is executed just once, unlike in
 986 .I sed
 987 and
 988 .I awk
 989 scripts, where the sequence of commands is executed for each input line.
 990 While this means that you must explicitly loop over the lines of your input file
 991 (or files), it also means you have much more control over which files and which
 992 lines you look at.
 993 (Actually, I'm lying\*(--it is possible to do an implicit loop with either the
 994 .B \-n
 995 or
 996 .B \-p
 997 switch.)
 998 .PP
 999 A declaration can be put anywhere a command can, but has no effect on the
1000 execution of the primary sequence of commands\*(--declarations all take effect
1001 at compile time.
1002 Typically all the declarations are put at the beginning or the end of the script.
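.PP
A minimal sketch of a declaration placed after the code that uses it.
The subroutine name and its message are made up for the example.

```perl
# Sketch only: &greet is a hypothetical subroutine.
print &greet('world'), "\n";    # the call comes before the declaration

sub greet {
    local($who) = @_;
    "hello, $who";              # value of the last expression is returned
}
```

Because the declaration takes effect at compile time, the call on the first line finds the subroutine already defined.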
1003 .PP
1004 .I Perl
1005 is, for the most part, a free-form language.
1006 (The only exception to this is format declarations, for fairly obvious reasons.)
1007 Comments are indicated by the # character, and extend to the end of the line.
1008 If you attempt to use /* */ C comments, they will be interpreted either as
1009 division or pattern matching, depending on the context.
1010 So don't do that.
1011 .Sh "Compound statements"
1012 In
1013 .IR perl ,
1014 a sequence of commands may be treated as one command by enclosing it
1015 in curly brackets.
1016 We will call this a BLOCK.
1017 .PP
1018 The following compound commands may be used to control flow:
1019 .nf
1020
1021 .ne 4
1022         if (EXPR) BLOCK
1023         if (EXPR) BLOCK else BLOCK
1024         if (EXPR) BLOCK elsif (EXPR) BLOCK .\|.\|. else BLOCK
1025         LABEL while (EXPR) BLOCK
1026         LABEL while (EXPR) BLOCK continue BLOCK
1027         LABEL for (EXPR; EXPR; EXPR) BLOCK
1028         LABEL foreach VAR (ARRAY) BLOCK
1029         LABEL BLOCK continue BLOCK
1030
1031 .fi
1032 Note that, unlike in C and Pascal, these are defined in terms of BLOCKs, not
1033 statements.
1034 This means that the curly brackets are \fIrequired\fR\*(--no dangling statements allowed.
1035 If you want to write conditionals without curly brackets there are several
1036 other ways to do it.
1037 The following all do the same thing:
1038 .nf
1039
1040 .ne 5
1041         if (!open(foo)) { die "Can't open $foo: $!"; }
1042         die "Can't open $foo: $!" unless open(foo);
1043         open(foo) || die "Can't open $foo: $!"; # foo or bust!
1044         open(foo) ? \'hi mom\' : die "Can't open $foo: $!";
1045                                 # a bit exotic, that last one
1046
1047 .fi
1048 .PP
1049 The
1050 .I if
1051 statement is straightforward.
1052 Since BLOCKs are always bounded by curly brackets, there is never any
1053 ambiguity about which
1054 .I if
1055 an
1056 .I else
1057 goes with.
1058 If you use
1059 .I unless
1060 in place of
1061 .IR if ,
1062 the sense of the test is reversed.
1063 .PP
1064 The
1065 .I while
1066 statement executes the block as long as the expression is true
1067 (does not evaluate to the null string or 0).
1068 The LABEL is optional, and if present, consists of an identifier followed by
1069 a colon.
1070 The LABEL identifies the loop for the loop control statements
1071 .IR next ,
1072 .IR last ,
1073 and
1074 .I redo
1075 (see below).
1076 If there is a
1077 .I continue
1078 BLOCK, it is always executed just before
1079 the conditional is about to be evaluated again, similarly to the third part
1080 of a
1081 .I for
1082 loop in C.
1083 Thus it can be used to increment a loop variable, even when the loop has
1084 been continued via the
1085 .I next
1086 statement (similar to the C \*(L"continue\*(R" statement).
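.PP
The interaction of next with a continue BLOCK can be sketched as follows.
The subroutine name is an assumption of the example, not anything predefined.

```perl
# Sketch only: &sum_odd is a hypothetical helper.
sub sum_odd {
    local($limit) = @_;
    local($i, $sum) = (1, 0);
    while ($i < $limit) {
        next if $i % 2 == 0;    # skips the addition, not the increment
        $sum += $i;
    } continue {
        $i++;                   # runs on every pass, even after next
    }
    $sum;
}

print &sum_odd(10), "\n";       # 1+3+5+7+9
```

Had the increment been written at the bottom of the main BLOCK instead, the first even number would have made the loop spin forever.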
1087 .PP
1088 If the word
1089 .I while
1090 is replaced by the word
1091 .IR until ,
1092 the sense of the test is reversed, but the conditional is still tested before
1093 the first iteration.
1094 .PP
1095 In either the
1096 .I if
1097 or the
1098 .I while
1099 statement, you may replace \*(L"(EXPR)\*(R" with a BLOCK, and the conditional
1100 is true if the value of the last command in that block is true.
1101 .PP
1102 The
1103 .I for
1104 loop works exactly like the corresponding
1105 .I while
1106 loop:
1107 .nf
1108
1109 .ne 12
1110         for ($i = 1; $i < 10; $i++) {
1111                 .\|.\|.
1112         }
1113
1114 is the same as
1115
1116         $i = 1;
1117         while ($i < 10) {
1118                 .\|.\|.
1119         } continue {
1120                 $i++;
1121         }
1122 .fi
1123 .PP
1124 The foreach loop iterates over a normal array value and sets the variable
1125 VAR to be each element of the array in turn.
1126 The variable is implicitly local to the loop, and regains its former value
1127 upon exiting the loop.
1128 The \*(L"foreach\*(R" keyword is actually identical to the \*(L"for\*(R" keyword,
1129 so you can use \*(L"foreach\*(R" for readability or \*(L"for\*(R" for brevity.
1130 If VAR is omitted, $_ is set to each value.
1131 If ARRAY is an actual array (as opposed to an expression returning an array
1132 value), you can modify each element of the array
1133 by modifying VAR inside the loop.
1134 Examples:
1135 .nf
1136
1137 .ne 5
1138         for (@ary) { s/foo/bar/; }
1139
1140         foreach $elem (@elements) {
1141                 $elem *= 2;
1142         }
1143
1144 .ne 3
1145         for ((10,9,8,7,6,5,4,3,2,1,\'BOOM\')) {
1146                 print $_, "\en"; sleep(1);
1147         }
1148
1149         for (1..15) { print "Merry Christmas\en"; }
1150
1151 .ne 3
1152         foreach $item (split(/:[\e\e\en:]*/, $ENV{\'TERMCAP\'})) {
1153                 print "Item: $item\en";
1154         }
1155
1156 .fi
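.PP
The two properties of the loop variable mentioned above (it is restored on exit, and it stands for the array element itself) can be checked with a sketch like this one.
The names $n, @list and foreach_demo are chosen for the illustration only.

```perl
# Sketch only: all names here are hypothetical.
sub foreach_demo {
    local($n) = 'before';
    local(@list) = (1, 2, 3);
    foreach $n (@list) {
        $n *= 10;               # modifies the array element itself
    }
    ($n, @list);                # $n has regained its former value
}

print join(' ', &foreach_demo()), "\n";
```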
1157 .PP
1158 The BLOCK by itself (labeled or not) is equivalent to a loop that executes
1159 once.
1160 Thus you can use any of the loop control statements in it to leave or
1161 restart the block.
1162 The
1163 .I continue
1164 block is optional.
1165 This construct is particularly nice for doing case structures.
1166 .nf
1167
1168 .ne 6
1169         foo: {
1170                 if (/^abc/) { $abc = 1; last foo; }
1171                 if (/^def/) { $def = 1; last foo; }
1172                 if (/^xyz/) { $xyz = 1; last foo; }
1173                 $nothing = 1;
1174         }
1175
1176 .fi
1177 There is no official switch statement in perl, because there
1178 are already several ways to write the equivalent.
1179 In addition to the above, you could write
1180 .nf
1181
1182 .ne 6
1183         foo: {
1184                 $abc = 1, last foo  if /^abc/;
1185                 $def = 1, last foo  if /^def/;
1186                 $xyz = 1, last foo  if /^xyz/;
1187                 $nothing = 1;
1188         }
1189
1190 or
1191
1192 .ne 6
1193         foo: {
1194                 /^abc/ && do { $abc = 1; last foo; };
1195                 /^def/ && do { $def = 1; last foo; };
1196                 /^xyz/ && do { $xyz = 1; last foo; };
1197                 $nothing = 1;
1198         }
1199
1200 or
1201
1202 .ne 6
1203         foo: {
1204                 /^abc/ && ($abc = 1, last foo);
1205                 /^def/ && ($def = 1, last foo);
1206                 /^xyz/ && ($xyz = 1, last foo);
1207                 $nothing = 1;
1208         }
1209
1210 or even
1211
1212 .ne 8
1213         if (/^abc/)
1214                 { $abc = 1; }
1215         elsif (/^def/)
1216                 { $def = 1; }
1217         elsif (/^xyz/)
1218                 { $xyz = 1; }
1219         else
1220                 {$nothing = 1;}
1221
1222 .fi
1223 As it happens, these are all optimized internally to a switch structure,
1224 so perl jumps directly to the desired statement, and you needn't worry
1225 about perl executing a lot of unnecessary statements when you have a string
1226 of 50 elsifs, as long as you are testing the same simple scalar variable
1227 using ==, eq, or pattern matching as above.
1228 (If you're curious as to whether the optimizer has done this for a particular
1229 case statement, you can use the \-D1024 switch to list the syntax tree
1230 before execution.)
1231 .Sh "Simple statements"
1232 The only kind of simple statement is an expression evaluated for its side
1233 effects.
1234 Every expression (simple statement) must be terminated with a semicolon.
1235 Note that this is like C, but unlike Pascal (and
1236 .IR awk ).
1237 .PP
1238 Any simple statement may optionally be followed by a
1239 single modifier, just before the terminating semicolon.
1240 The possible modifiers are:
1241 .nf
1242
1243 .ne 4
1244         if EXPR
1245         unless EXPR
1246         while EXPR
1247         until EXPR
1248
1249 .fi
1250 The
1251 .I if
1252 and
1253 .I unless
1254 modifiers have the expected semantics.
1255 The
1256 .I while
1257 and
1258 .I until
1259 modifiers also have the expected semantics (conditional evaluated first),
1260 except when applied to a do-BLOCK command,
1261 in which case the block executes once before the conditional is evaluated.
1262 This is so that you can write loops like:
1263 .nf
1264
1265 .ne 4
1266         do {
1267                 $_ = <STDIN>;
1268                 .\|.\|.
1269         } until $_ \|eq \|".\|\e\|n";
1270
1271 .fi
1272 (See the
1273 .I do
1274 operator below.  Note also that the loop control commands described later will
1275 NOT work in this construct, since modifiers don't take loop labels.
1276 Sorry.)
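.PP
The difference between the two forms shows up when the conditional is already true on entry.
The following sketch counts passes both ways; the subroutine names are invented for the comparison.

```perl
# Sketch only: both subroutine names are hypothetical.
sub passes_do {
    local($n) = @_;
    local($passes) = 0;
    do {
        $passes++;
    } until $passes >= $n;          # tested only after the first pass
    $passes;
}

sub passes_plain {
    local($n) = @_;
    local($passes) = 0;
    $passes++ until $passes >= $n;  # tested before the first pass
    $passes;
}

print &passes_do(0), ' ', &passes_plain(0), "\n";   # 1 0
print &passes_do(3), ' ', &passes_plain(3), "\n";   # 3 3
```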
1277 .Sh "Expressions"
1278 Since
1279 .I perl
1280 expressions work almost exactly like C expressions, only the differences
1281 will be mentioned here.
1282 .PP
1283 Here's what
1284 .I perl
1285 has that C doesn't:
1286 .Ip ** 8 2
1287 The exponentiation operator.
1288 .Ip **= 8
1289 The exponentiation assignment operator.
1290 .Ip (\|) 8 3
1291 The null list, used to initialize an array to null.
1292 .Ip . 8
1293 Concatenation of two strings.
1294 .Ip .= 8
1295 The concatenation assignment operator.
1296 .Ip eq 8
1297 String equality (== is numeric equality).
1298 For a mnemonic just think of \*(L"eq\*(R" as a string.
1299 (If you are used to the
1300 .I awk
1301 behavior of using == for either string or numeric equality
1302 based on the current form of the comparands, beware!
1303 You must be explicit here.)
1304 .Ip ne 8
1305 String inequality (!= is numeric inequality).
1306 .Ip lt 8
1307 String less than.
1308 .Ip gt 8
1309 String greater than.
1310 .Ip le 8
1311 String less than or equal.
1312 .Ip ge 8
1313 String greater than or equal.
1314 .Ip cmp 8
1315 String comparison, returning \-1, 0, or 1.
1316 .Ip <=> 8
1317 Numeric comparison, returning \-1, 0, or 1.
1318 .Ip =~ 8 2
1319 Certain operations search or modify the string \*(L"$_\*(R" by default.
1320 This operator makes that kind of operation work on some other string.
1321 The right argument is a search pattern, substitution, or translation.
1322 The left argument is what is supposed to be searched, substituted, or
1323 translated instead of the default \*(L"$_\*(R".
1324 The return value indicates the success of the operation.
1325 (If the right argument is an expression other than a search pattern,
1326 substitution, or translation, it is interpreted as a search pattern
1327 at run time.
1328 This is less efficient than an explicit search, since the pattern must
1329 be compiled every time the expression is evaluated.)
1330 The precedence of this operator is lower than unary minus and autoincrement/decrement, but higher than everything else.
1331 .Ip !~ 8
1332 Just like =~ except the return value is negated.
1333 .Ip x 8
1334 The repetition operator.
1335 Returns a string consisting of the left operand repeated the
1336 number of times specified by the right operand.
1337 .nf
1338
1339         print \'\-\' x 80;              # print row of dashes
1340         print \'\-\' x80;               # illegal, x80 is identifier
1341
1342         print "\et" x ($tab/8), \' \' x ($tab%8);       # tab over
1343
1344 .fi
1345 .Ip x= 8
1346 The repetition assignment operator.
1347 .Ip .\|. 8
1348 The range operator, which is really two different operators depending
1349 on the context.
1350 In an array context, returns an array of values counting (by ones)
1351 from the left value to the right value.
1352 This is useful for writing \*(L"for (1..10)\*(R" loops and for doing
1353 slice operations on arrays.
1354 .Sp
1355 In a scalar context, .\|. returns a boolean value.
1356 The operator is bistable, like a flip-flop.
1357 Each .\|. operator maintains its own boolean state.
1358 It is false as long as its left operand is false.
1359 Once the left operand is true, the range operator stays true
1360 until the right operand is true,
1361 AFTER which the range operator becomes false again.
1362 (It doesn't become false till the next time the range operator is evaluated.
1363 It can become false on the same evaluation it became true, but it still returns
1364 true once.)
1365 The right operand is not evaluated while the operator is in the \*(L"false\*(R" state,
1366 and the left operand is not evaluated while the operator is in the \*(L"true\*(R" state.
1367 The scalar .\|. operator is primarily intended for doing line number ranges
1368 after
1369 the fashion of \fIsed\fR or \fIawk\fR.
1370 The precedence is a little lower than || and &&.
1371 The value returned is either the null string for false, or a sequence number
1372 (beginning with 1) for true.
1373 The sequence number is reset for each range encountered.
1374 The final sequence number in a range has the string \'E0\' appended to it, which
1375 doesn't affect its numeric value, but gives you something to search for if you
1376 want to exclude the endpoint.
1377 You can exclude the beginning point by waiting for the sequence number to be
1378 greater than 1.
1379 If either operand of scalar .\|. is static, that operand is implicitly compared
1380 to the $. variable, the current line number.
1381 Examples:
1382 .nf
1383
1384 .ne 6
1385 As a scalar operator:
1386     if (101 .\|. 200) { print; }        # print 2nd hundred lines
1387
1388     next line if (1 .\|. /^$/); # skip header lines
1389
1390     s/^/> / if (/^$/ .\|. eof());       # quote body
1391
1392 .ne 4
1393 As an array operator:
1394     for (101 .\|. 200) { print; }       # print $_ 100 times
1395
1396     @foo = @foo[$[ .\|. $#foo]; # an expensive no-op
1397     @foo = @foo[$#foo-4 .\|. $#foo];    # slice last 5 items
1398
1399 .fi
1400 .Ip \-x 8
1401 A file test.
1402 This unary operator takes one argument, either a filename or a filehandle,
1403 and tests the associated file to see if something is true about it.
1404 If the argument is omitted, tests $_, except for \-t, which tests
1405 .IR STDIN .
1406 It returns 1 for true and \'\' for false, or the undefined value if the
1407 file doesn't exist.
1408 Precedence is higher than logical and relational operators, but lower than
1409 arithmetic operators.
1410 The operator may be any of:
1411 .nf
1412         \-r     File is readable by effective uid.
1413         \-w     File is writable by effective uid.
1414         \-x     File is executable by effective uid.
1415         \-o     File is owned by effective uid.
1416         \-R     File is readable by real uid.
1417         \-W     File is writable by real uid.
1418         \-X     File is executable by real uid.
1419         \-O     File is owned by real uid.
1420         \-e     File exists.
1421         \-z     File has zero size.
1422         \-s     File has non-zero size (returns size).
1423         \-f     File is a plain file.
1424         \-d     File is a directory.
1425         \-l     File is a symbolic link.
1426         \-p     File is a named pipe (FIFO).
1427         \-S     File is a socket.
1428         \-b     File is a block special file.
1429         \-c     File is a character special file.
1430         \-u     File has setuid bit set.
1431         \-g     File has setgid bit set.
1432         \-k     File has sticky bit set.
1433         \-t     Filehandle is opened to a tty.
1434         \-T     File is a text file.
1435         \-B     File is a binary file (opposite of \-T).
1436         \-M     Age of file in days when script started.
1437         \-A     Same for access time.
1438         \-C     Same for inode change time.
1439
1440 .fi
1441 The interpretation of the file permission operators \-r, \-R, \-w, \-W, \-x and \-X
1442 is based solely on the mode of the file and the uids and gids of the user.
1443 There may be other reasons you can't actually read, write or execute the file.
1444 Also note that, for the superuser, \-r, \-R, \-w and \-W always return 1, and
1445 \-x and \-X return 1 if any execute bit is set in the mode.
1446 Scripts run by the superuser may thus need to do a stat() in order to determine
1447 the actual mode of the file, or temporarily set the uid to something else.
1448 .Sp
1449 Example:
1450 .nf
1451 .ne 7
1452
1453         while (<>) {
1454                 chop;
1455                 next unless \-f $_;     # ignore specials
1456                 .\|.\|.
1457         }
1458
1459 .fi
1460 Note that \-s/a/b/ does not do a negated substitution.
1461 Saying \-exp($foo) still works as expected, however\*(--only single letters
1462 following a minus are interpreted as file tests.
1463 .Sp
1464 The \-T and \-B switches work as follows.
1465 The first block or so of the file is examined for odd characters such as
1466 strange control codes or metacharacters.
1467 If too many odd characters (>10%) are found, it's a \-B file, otherwise it's a \-T file.
1468 Also, any file containing null in the first block is considered a binary file.
1469 If \-T or \-B is used on a filehandle, the current stdio buffer is examined
1470 rather than the first block.
1471 Both \-T and \-B return TRUE on a null file, or a file at EOF when testing
1472 a filehandle.
1473 .PP
1474 If any of the file tests (or either stat operator) is given the special
1475 filehandle consisting of a solitary underline, then the stat structure
1476 of the previous file test (or stat operator) is used, saving a system
1477 call.
1478 (This doesn't work with \-t, and you need to remember that lstat and \-l
1479 will leave values in the stat structure for the symbolic link, not the
1480 real file.)
1481 Example:
1482 .nf
1483
1484         print "Can do.\en" if -r $a || -w _ || -x _;
1485
1486 .ne 9
1487         stat($filename);
1488         print "Readable\en" if -r _;
1489         print "Writable\en" if -w _;
1490         print "Executable\en" if -x _;
1491         print "Setuid\en" if -u _;
1492         print "Setgid\en" if -g _;
1493         print "Sticky\en" if -k _;
1494         print "Text\en" if -T _;
1495         print "Binary\en" if -B _;
1496
1497 .fi
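.PP
Returning to the scalar range operator described earlier in this list: its sequence number makes it possible to drop both endpoints of a range.
What follows is a sketch under assumed conventions; the BEGIN and END marker lines and the subroutine name are invented for it.

```perl
# Sketch only: the markers and &between_markers are hypothetical.
# The operator keeps its state between calls, so each input list
# should end outside a range.
sub between_markers {
    local(@in) = @_;
    local(@out, $seq);
    foreach (@in) {
        $seq = (/^BEGIN/ .. /^END/);    # null string outside, 1, 2, ... inside
        next unless $seq;               # outside any range
        next if $seq == 1;              # the BEGIN line itself
        next if $seq =~ /E0$/;          # the END line carries E0
        push(@out, $_);
    }
    @out;
}

print join(' ', &between_markers('x', 'BEGIN', 'a', 'b', 'END', 'y')), "\n";
```

Only the lines strictly between the markers survive; the numeric test on 1 still works for a range that begins and ends on the same line, since the appended E0 leaves the numeric value alone.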
1498 .PP
1499 Here is what C has that
1500 .I perl
1501 doesn't:
1502 .Ip "unary &" 12
1503 Address-of operator.
1504 .Ip "unary *" 12
1505 Dereference-address operator.
1506 .Ip "(TYPE)" 12
1507 Type casting operator.
1508 .PP
1509 Like C,
1510 .I perl
1511 does a certain amount of expression evaluation at compile time, whenever
1512 it determines that all of the arguments to an operator are static and have
1513 no side effects.
1514 In particular, string concatenation happens at compile time between literals that don't do variable substitution.
1515 Backslash interpretation also happens at compile time.
1516 You can say
1517 .nf
1518
1519 .ne 2
1520         \'Now is the time for all\' . "\|\e\|n" .
1521         \'good men to come to.\'
1522
1523 .fi
1524 and this all reduces to one string internally.
1525 .PP
1526 The autoincrement operator has a little extra built-in magic to it.
1527 If you increment a variable that is numeric, or that has ever been used in
1528 a numeric context, you get a normal increment.
1529 If, however, the variable has only been used in string contexts since it
1530 was set, and has a value that is not null and matches the
1531 pattern /^[a\-zA\-Z]*[0\-9]*$/, the increment is done
1532 as a string, preserving each character within its range, with carry:
1533 .nf
1534
1535         print ++($foo = \'99\');        # prints \*(L'100\*(R'
1536         print ++($foo = \'a0\');        # prints \*(L'a1\*(R'
1537         print ++($foo = \'Az\');        # prints \*(L'Ba\*(R'
1538         print ++($foo = \'zz\');        # prints \*(L'aaa\*(R'
1539
1540 .fi
1541 The autodecrement is not magical.
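.PP
One use for the string increment is generating successive labels.
This is a sketch only; the subroutine name and the starting label are arbitrary choices.

```perl
# Sketch only: &next_labels is a hypothetical helper.
sub next_labels {
    local($label, $count) = @_;
    local(@labels);
    while ($count-- > 0) {
        $label++;               # string increment with carry
        push(@labels, $label);
    }
    @labels;
}

print join(' ', &next_labels('aa8', 3)), "\n";
```

Note that $label is never used in a numeric context here; only $count is.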
1542 .PP
1543 The range operator (in an array context) makes use of the magical
1544 autoincrement algorithm if the minimum and maximum are strings.
1545 You can say
1546
1547         @alphabet = (\'A\' .. \'Z\');
1548
1549 to get all the letters of the alphabet, or
1550
1551         $hexdigit = (0 .. 9, \'a\' .. \'f\')[$num & 15];
1552
1553 to get a hexadecimal digit, or
1554
1555         @z2 = (\'01\' .. \'31\');  print @z2[$mday];
1556
1557 to get dates with leading zeros.
1558 (If the final value specified is not in the sequence that the magical increment
1559 would produce, the sequence goes until the next value would be longer than
1560 the final value specified.)
1561 .PP
1562 The || and && operators differ from C's in that, rather than returning 0 or 1,
1563 they return the last value evaluated.
1564 Thus, a portable way to find out the home directory might be:
1565 .nf
1566
1567         $home = $ENV{'HOME'} || $ENV{'LOGDIR'} ||
1568             (getpwuid($<))[7] || die "You're homeless!\en";
1569
1570 .fi
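.PP
The same behavior can be seen in isolation with a sketch like the following.
Both subroutine names are hypothetical, as are the sample arguments.

```perl
# Sketch only: &first_set and &both_set are hypothetical helpers.
sub first_set {
    local($x, $y, $default) = @_;
    $x || $y || $default;       # value of the first true operand
}

sub both_set {
    local($x, $y) = @_;
    $x && $y;                   # $y if $x is true, otherwise the false $x
}

print &first_set('', '/home/lwall', '/'), "\n";
print &both_set('left', 'right'), "\n";
```

Keep in mind that the string \'0\' counts as false, so a legitimate value of 0 will be passed over by ||.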