perl.man

   1 .rn '' }`
   2 ''' $Header: perl.man,v 4.0 91/03/20 01:38:08 lwall Locked $
   3 '''
   4 ''' $Log:       perl.man,v $
   5 ''' Revision 4.0  91/03/20  01:38:08  lwall
   6 ''' 4.0 baseline.
   7 '''
   8 '''
   9 .de Sh
  10 .br
  11 .ne 5
  12 .PP
  13 \fB\\$1\fR
  14 .PP
  15 ..
  16 .de Sp
  17 .if t .sp .5v
  18 .if n .sp
  19 ..
  20 .de Ip
  21 .br
  22 .ie \\n(.$>=3 .ne \\$3
  23 .el .ne 3
  24 .IP "\\$1" \\$2
  25 ..
  26 '''
  27 '''     Set up \*(-- to give an unbreakable dash;
  28 '''     string Tr holds user defined translation string.
  29 '''     Bell System Logo is used as a dummy character.
  30 '''
  31 .tr \(*W-|\(bv\*(Tr
  32 .ie n \{\
  33 .ds -- \(*W-
  34 .if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
  35 .if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
  36 .ds L" ""
  37 .ds R" ""
  38 .ds L' '
  39 .ds R' '
  40 'br\}
  41 .el\{\
  42 .ds -- \(em\|
  43 .tr \*(Tr
  44 .ds L" ``
  45 .ds R" ''
  46 .ds L' `
  47 .ds R' '
  48 'br\}
  49 .TH PERL 1 "\*(RP"
  50 .UC
  51 .SH NAME
  52 perl \- Practical Extraction and Report Language
  53 .SH SYNOPSIS
  54 .B perl
  55 [options] filename args
  56 .SH DESCRIPTION
  57 .I Perl
  58 is an interpreted language optimized for scanning arbitrary text files,
  59 extracting information from those text files, and printing reports based
  60 on that information.
  61 It's also a good language for many system management tasks.
  62 The language is intended to be practical (easy to use, efficient, complete)
  63 rather than beautiful (tiny, elegant, minimal).
  64 It combines (in the author's opinion, anyway) some of the best features of C,
  65 \fIsed\fR, \fIawk\fR, and \fIsh\fR,
  66 so people familiar with those languages should have little difficulty with it.
  67 (Language historians will also note some vestiges of \fIcsh\fR, Pascal, and
  68 even BASIC-PLUS.)
  69 Expression syntax corresponds quite closely to C expression syntax.
  70 Unlike most Unix utilities,
  71 .I perl
  72 does not arbitrarily limit the size of your data\*(--if you've got
  73 the memory,
  74 .I perl
  75 can slurp in your whole file as a single string.
  76 Recursion is of unlimited depth.
  77 And the hash tables used by associative arrays grow as necessary to prevent
  78 degraded performance.
  79 .I Perl
  80 uses sophisticated pattern matching techniques to scan large amounts of
  81 data very quickly.
  82 Although optimized for scanning text,
  83 .I perl
  84 can also deal with binary data, and can make dbm files look like associative
  85 arrays (where dbm is available).
  86 Setuid
  87 .I perl
  88 scripts are safer than C programs
  89 through a dataflow tracing mechanism which prevents many stupid security holes.
  90 If you have a problem that would ordinarily use \fIsed\fR
  91 or \fIawk\fR or \fIsh\fR, but it
  92 exceeds their capabilities or must run a little faster,
  93 and you don't want to write the silly thing in C, then
  94 .I perl
  95 may be for you.
  96 There are also translators to turn your
  97 .I sed
  98 and
  99 .I awk
 100 scripts into
 101 .I perl
 102 scripts.
 103 OK, enough hype.
 104 .PP
 105 Upon startup,
 106 .I perl
 107 looks for your script in one of the following places:
 108 .Ip 1. 4 2
 109 Specified line by line via
 110 .B \-e
 111 switches on the command line.
 112 .Ip 2. 4 2
 113 Contained in the file specified by the first filename on the command line.
 114 (Note that systems supporting the #! notation invoke interpreters this way.)
 115 .Ip 3. 4 2
 116 Passed in implicitly via standard input.
 117 This only works if there are no filename arguments\*(--to pass
 118 arguments to a
 119 .I stdin
 120 script you must explicitly specify a \- for the script name.
 121 .PP
 122 After locating your script,
 123 .I perl
 124 compiles it to an internal form.
 125 If the script is syntactically correct, it is executed.
 126 .Sh "Options"
 127 Note: on first reading this section may not make much sense to you.  It's here
 128 at the front for easy reference.
 129 .PP
 130 A single-character option may be combined with the following option, if any.
 131 This is particularly useful when invoking a script using the #! construct which
 132 only allows one argument.  Example:
 133 .nf
 134
 135 .ne 2
 136         #!/usr/bin/perl \-spi.bak       # same as \-s \-p \-i.bak
 137         .\|.\|.
 138
 139 .fi
 140 Options include:
 141 .TP 5
 142 .BI \-0 digits
 143 specifies the record separator ($/) as an octal number.
 144 If there are no digits, the null character is the separator.
 145 Other switches may precede or follow the digits.
 146 For example, if you have a version of
 147 .I find
 148 which can print filenames terminated by the null character, you can say this:
 149 .nf
 150
 151     find . \-name '*.bak' \-print0 | perl \-n0e unlink
 152
 153 .fi
 154 The special value 00 will cause Perl to slurp files in paragraph mode.
 155 The value 0777 will cause Perl to slurp files whole since there is no
 156 legal character with that value.
 157 .TP 5
 158 .B \-a
 159 turns on autosplit mode when used with a
 160 .B \-n
 161 or
 162 .BR \-p .
 163 An implicit split command to the @F array
 164 is done as the first thing inside the implicit while loop produced by
 165 the
 166 .B \-n
 167 or
 168 .BR \-p .
 169 .nf
 170
 171         perl \-ane \'print pop(@F), "\en";\'
 172
 173 is equivalent to
 174
 175         while (<>) {
 176                 @F = split(\' \');
 177                 print pop(@F), "\en";
 178         }
 179
 180 .fi
 181 .TP 5
 182 .B \-c
 183 causes
 184 .I perl
 185 to check the syntax of the script and then exit without executing it.
 186 .TP 5
 187 .B \-d
 188 runs the script under the perl debugger.
 189 See the section on Debugging.
 190 .TP 5
 191 .BI \-D number
 192 sets debugging flags.
 193 To watch how it executes your script, use
 194 .BR \-D14 .
 195 (This only works if debugging is compiled into your
 196 .IR perl .)
 197 Another nice value is \-D1024, which lists your compiled syntax tree.
 198 And \-D512 displays compiled regular expressions.
 199 .TP 5
 200 .BI \-e " commandline"
 201 may be used to enter one line of script.
 202 Multiple
 203 .B \-e
 204 commands may be given to build up a multi-line script.
 205 If
 206 .B \-e
 207 is given,
 208 .I perl
 209 will not look for a script filename in the argument list.
 210 .TP 5
 211 .BI \-i extension
 212 specifies that files processed by the <> construct are to be edited
 213 in-place.
 214 It does this by renaming the input file, opening the output file by the
 215 same name, and selecting that output file as the default for print statements.
 216 The extension, if supplied, is added to the name of the
 217 old file to make a backup copy.
 218 If no extension is supplied, no backup is made.
 219 Saying \*(L"perl \-p \-i.bak \-e "s/foo/bar/;" .\|.\|. \*(R" is the same as using
 220 the script:
 221 .nf
 222
 223 .ne 2
 224         #!/usr/bin/perl \-pi.bak
 225         s/foo/bar/;
 226
 227 which is equivalent to
 228
 229 .ne 14
 230         #!/usr/bin/perl
 231         while (<>) {
 232                 if ($ARGV ne $oldargv) {
 233                         rename($ARGV, $ARGV . \'.bak\');
 234                         open(ARGVOUT, ">$ARGV");
 235                         select(ARGVOUT);
 236                         $oldargv = $ARGV;
 237                 }
 238                 s/foo/bar/;
 239         }
 240         continue {
 241             print;      # this prints to original filename
 242         }
 243         select(STDOUT);
 244
 245 .fi
 246 except that the
 247 .B \-i
 248 form doesn't need to compare $ARGV to $oldargv to know when
 249 the filename has changed.
 250 It does, however, use ARGVOUT for the selected filehandle.
 251 Note that
 252 .I STDOUT
 253 is restored as the default output filehandle after the loop.
 254 .Sp
 255 You can use eof to locate the end of each input file, in case you want
 256 to append to each file, or reset line numbering (see example under eof).
 257 .TP 5
 258 .BI \-I directory
 259 may be used in conjunction with
 260 .B \-P
 261 to tell the C preprocessor where to look for include files.
 262 By default /usr/include and /usr/lib/perl are searched.
 263 .TP 5
 264 .BI \-l octnum
 265 enables automatic line-ending processing.  It has two effects:
 266 first, it automatically chops the line terminator when used with
 267 .B \-n
 268 or
 269 .BR \-p ,
 270 and second, it assigns $\e to have the value of
 271 .I octnum
 272 so that any print statements will have that line terminator added back on.  If
 273 .I octnum
 274 is omitted, sets $\e to the current value of $/.
 275 For instance, to trim lines to 80 columns:
 276 .nf
 277
 278         perl \-lpe \'substr($_, 80) = ""\'
 279
 280 .fi
 281 Note that the assignment $\e = $/ is done when the switch is processed,
 282 so the input record separator can be different than the output record
 283 separator if the
 284 .B \-l
 285 switch is followed by a
 286 .B \-0
 287 switch:
 288 .nf
 289
 290         gnufind / \-print0 | perl \-ln0e 'print "found $_" if \-p'
 291
 292 .fi
 293 This sets $\e to newline and then sets $/ to the null character.
 294 .TP 5
 295 .B \-n
 296 causes
 297 .I perl
 298 to assume the following loop around your script, which makes it iterate
 299 over filename arguments somewhat like \*(L"sed \-n\*(R" or \fIawk\fR:
 300 .nf
 301
 302 .ne 3
 303         while (<>) {
 304                 .\|.\|.         # your script goes here
 305         }
 306
 307 .fi
 308 Note that the lines are not printed by default.
 309 See
 310 .B \-p
 311 to have lines printed.
 312 Here is an efficient way to delete all files older than a week:
 313 .nf
 314
 315         find . \-mtime +7 \-print | perl \-nle \'unlink;\'
 316
 317 .fi
 318 This is faster than using the \-exec switch of find because you don't have to
 319 start a process on every filename found.
 320 .TP 5
 321 .B \-p
 322 causes
 323 .I perl
 324 to assume the following loop around your script, which makes it iterate
 325 over filename arguments somewhat like \fIsed\fR:
 326 .nf
 327
 328 .ne 5
 329         while (<>) {
 330                 .\|.\|.         # your script goes here
 331         } continue {
 332                 print;
 333         }
 334
 335 .fi
 336 Note that the lines are printed automatically.
 337 To suppress printing use the
 338 .B \-n
 339 switch.
 340 A
 341 .B \-p
 342 overrides a
 343 .B \-n
 344 switch.
 345 .TP 5
 346 .B \-P
 347 causes your script to be run through the C preprocessor before
 348 compilation by
 349 .IR perl .
 350 (Since both comments and cpp directives begin with the # character,
 351 you should avoid starting comments with any words recognized
 352 by the C preprocessor such as \*(L"if\*(R", \*(L"else\*(R" or \*(L"define\*(R".)
 353 .TP 5
 354 .B \-s
 355 enables some rudimentary switch parsing for switches on the command line
 356 after the script name but before any filename arguments (or before a \-\|\-).
 357 Any switch found there is removed from @ARGV and sets the corresponding variable in the
 358 .I perl
 359 script.
 360 The following script prints \*(L"true\*(R" if and only if the script is
 361 invoked with a \-xyz switch.
 362 .nf
 363
 364 .ne 2
 365         #!/usr/bin/perl \-s
 366         if ($xyz) { print "true\en"; }
 367
 368 .fi
 369 .TP 5
 370 .B \-S
 371 makes
 372 .I perl
 373 use the PATH environment variable to search for the script
 374 (unless the name of the script starts with a slash).
 375 Typically this is used to emulate #! startup on machines that don't
 376 support #!, in the following manner:
 377 .nf
 378
 379         #!/usr/bin/perl
 380         eval "exec /usr/bin/perl \-S $0 $*"
 381                 if $running_under_some_shell;
 382
 383 .fi
 384 The system ignores the first line and feeds the script to /bin/sh,
 385 which proceeds to try to execute the
 386 .I perl
 387 script as a shell script.
 388 The shell executes the second line as a normal shell command, and thus
 389 starts up the
 390 .I perl
 391 interpreter.
 392 On some systems $0 doesn't always contain the full pathname,
 393 so the
 394 .B \-S
 395 tells
 396 .I perl
 397 to search for the script if necessary.
 398 After
 399 .I perl
 400 locates the script, it parses the lines and ignores them because
 401 the variable $running_under_some_shell is never true.
 402 A better construct than $* would be ${1+"$@"}, which handles embedded spaces
 403 and such in the filenames, but doesn't work if the script is being interpreted
 404 by csh.
 405 In order to start up sh rather than csh, some systems may have to replace the
 406 #! line with a line containing just
 407 a colon, which will be politely ignored by perl.
 408 Other systems can't control that, and need a totally devious construct that
 409 will work under any of csh, sh or perl, such as the following:
 410 .nf
 411
 412 .ne 3
 413         eval '(exit $?0)' && eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
 414         & eval 'exec /usr/bin/perl -S $0 $argv:q'
 415                 if 0;
 416
 417 .fi
 418 .TP 5
 419 .B \-u
 420 causes
 421 .I perl
 422 to dump core after compiling your script.
 423 You can then take this core dump and turn it into an executable file
 424 by using the undump program (not supplied).
 425 This speeds startup at the expense of some disk space (which you can
 426 minimize by stripping the executable).
 427 (Still, a "hello world" executable comes out to about 200K on my machine.)
 428 If you are going to run your executable as a set-id program then you
 429 should probably compile it using taintperl rather than normal perl.
 430 If you want to execute a portion of your script before dumping, use the
 431 dump operator instead.
 432 Note: undump is platform specific and may not be available
 433 for a particular port of perl.
 434 .TP 5
 435 .B \-U
 436 allows
 437 .I perl
 438 to do unsafe operations.
 439 Currently the only \*(L"unsafe\*(R" operation is the unlinking of directories while
 440 running as superuser.
 441 .TP 5
 442 .B \-v
 443 prints the version and patchlevel of your
 444 .I perl
 445 executable.
 446 .TP 5
 447 .B \-w
 448 prints warnings about identifiers that are mentioned only once, and scalar
 449 variables that are used before being set.
 450 Also warns about redefined subroutines, and references to undefined
 451 filehandles or filehandles opened readonly that you are attempting to
 452 write on.
 453 Also warns you if you use == on values that don't look like numbers, and if
 454 your subroutines recurse more than 100 deep.
 455 .TP 5
 456 .BI \-x directory
 457 tells
 458 .I perl
 459 that the script is embedded in a message.
 460 Leading garbage will be discarded until the first line that starts
 461 with #! and contains the string "perl".
 462 Any meaningful switches on that line will be applied (but only one
 463 group of switches, as with normal #! processing).
 464 If a directory name is specified, Perl will switch to that directory
 465 before running the script.
 466 The
 467 .B \-x
 468 switch only controls the disposal of leading garbage.
 469 The script must be terminated with __END__ if there is trailing garbage
 470 to be ignored (the script can process any or all of the trailing garbage
 471 via the DATA filehandle if desired).
 472 .Sh "Data Types and Objects"
 473 .PP
 474 .I Perl
 475 has three data types: scalars, arrays of scalars, and
 476 associative arrays of scalars.
 477 Normal arrays are indexed by number, and associative arrays by string.
 478 .PP
 479 The interpretation of operations and values in perl sometimes
 480 depends on the requirements
 481 of the context around the operation or value.
 482 There are three major contexts: string, numeric and array.
 483 Certain operations return array values
 484 in contexts wanting an array, and scalar values otherwise.
 485 (If this is true of an operation it will be mentioned in the documentation
 486 for that operation.)
 487 Operations which return scalars don't care whether the context is looking
 488 for a string or a number, but
 489 scalar variables and values are interpreted as strings or numbers
 490 as appropriate to the context.
 491 A scalar is interpreted as TRUE in the boolean sense if it is not the null
 492 string or 0.
 493 Booleans returned by operators are 1 for true and 0 or \'\' (the null
 494 string) for false.
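.PP
As a rough illustration of how context selects the interpretation, the
following sketch uses one scalar as both a number and a string, and then
tests two false values.
The variable names are made up for this example, and $/ (newline by default)
stands in for an explicit newline escape.
.nf

```perl
$n = '10';
$sum = $n + 5;                      # numeric context: 15
$cat = $n . 5;                      # string context: the string 105
$zero = '0' ? 'true' : 'false';     # the string 0 counts as false
$null = ''  ? 'true' : 'false';     # so does the null string
print "sum=$sum cat=$cat zero=$zero null=$null", $/;
```

.fi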
 495 .PP
 496 There are actually two varieties of null string: defined and undefined.
 497 Undefined null strings are returned when there is no real value for something,
 498 such as when there was an error, or at end of file, or when you refer
 499 to an uninitialized variable or element of an array.
 500 An undefined null string may become defined the first time you access it, but
 501 prior to that you can use the defined() operator to determine whether the
 502 value is defined or not.
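.PP
A minimal sketch of the distinction, assuming a variable that has never been
assigned ($never_set is a hypothetical name chosen for this example).
The defined() test is made before any other access to the variable.
.nf

```perl
# $never_set is deliberately left unassigned.
$empty = '';                               # a defined null string
$d_empty = defined($empty) ? 1 : 0;        # 1
$d_never = defined($never_set) ? 1 : 0;    # 0
print "empty: $d_empty, never_set: $d_never", $/;
```

.fi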
 503 .PP
 504 References to scalar variables always begin with \*(L'$\*(R', even when referring
 505 to a scalar that is part of an array.
 506 Thus:
 507 .nf
 508
 509 .ne 3
 510     $days       \h'|2i'# a simple scalar variable
 511     $days[28]   \h'|2i'# 29th element of array @days
 512     $days{\'Feb\'}\h'|2i'# one value from an associative array
 513     $#days      \h'|2i'# last index of array @days
 514
 515 but entire arrays or array slices are denoted by \*(L'@\*(R':
 516
 517     @days       \h'|2i'# ($days[0], $days[1],\|.\|.\|. $days[n])
 518     @days[3,4,5]\h'|2i'# same as @days[3.\|.5]
 519     @days{'a','c'}\h'|2i'# same as ($days{'a'},$days{'c'})
 520
 521 and entire associative arrays are denoted by \*(L'%\*(R':
 522
 523     %days       \h'|2i'# (key1, val1, key2, val2 .\|.\|.)
 524 .fi
 525 .PP
 526 Any of these eight constructs may serve as an lvalue,
 527 that is, may be assigned to.
 528 (It also turns out that an assignment is itself an lvalue in
 529 certain contexts\*(--see examples under s, tr and chop.)
 530 Assignment to a scalar evaluates the righthand side in a scalar context,
 531 while assignment to an array or array slice evaluates the righthand side
 532 in an array context.
 533 .PP
 534 You may find the length of array @days by evaluating
 535 \*(L"$#days\*(R", as in
 536 .IR csh .
 537 (Actually, it's not the length of the array, it's the subscript of the last element, since there is (ordinarily) a 0th element.)
 538 Assigning to $#days changes the length of the array.
 539 Shortening an array by this method does not actually destroy any values.
 540 Lengthening an array that was previously shortened recovers the values that
 541 were in those elements.
 542 You can also gain some measure of efficiency by preextending an array that
 543 is going to get big.
 544 (You can also extend an array by assigning to an element that is off the
 545 end of the array.
 546 This differs from assigning to $#whatever in that intervening values
 547 are set to null rather than recovered.)
 548 You can truncate an array down to nothing by assigning the null list () to
 549 it.
 550 The following are exactly equivalent:
 551 .nf
 552
 553         @whatever = ();
 554         $#whatever = $[ \- 1;
 555
 556 .fi
 557 .PP
 558 If you evaluate an array in a scalar context, it returns the length of
 559 the array.
 560 The following is always true:
 561 .nf
 562
 563         @whatever == $#whatever \- $[ + 1;
 564
 565 .fi
 566 .PP
 567 Multi-dimensional arrays are not directly supported, but see the discussion
 568 of the $; variable later for a means of emulating multiple subscripts with
 569 an associative array.
 570 You could also write a subroutine to turn multiple subscripts into a single
 571 subscript.
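.PP
The subroutine approach might look something like the sketch below.
The names flat, $ncols and @grid are hypothetical, and the row-major layout
is only one possible choice, not a convention imposed by perl.
.nf

```perl
# Map a (row, column) pair onto one subscript of a flat array
# that stores $ncols columns per row.
$ncols = 3;
sub flat { local($row, $col) = @_; $row * $ncols + $col; }

$grid[&flat(1, 2)] = 'x';           # row 1, column 2 is element 5
print "element 5 is $grid[5]", $/;
```

.fi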
 572 .PP
 573 Every data type has its own namespace.
 574 You can, without fear of conflict, use the same name for a scalar variable,
 575 an array, an associative array, a filehandle, a subroutine name, and/or
 576 a label.
 577 Since variable and array references always start with \*(L'$\*(R', \*(L'@\*(R',
 578 or \*(L'%\*(R', the \*(L"reserved\*(R" words aren't in fact reserved
 579 with respect to variable names.
 580 (They ARE reserved with respect to labels and filehandles, however, which
 581 don't have an initial special character.
 582 Hint: you could say open(LOG,\'logfile\') rather than open(log,\'logfile\').
 583 Using uppercase filehandles also improves readability and protects you
 584 from conflict with future reserved words.)
 585 Case IS significant\*(--\*(L"FOO\*(R", \*(L"Foo\*(R" and \*(L"foo\*(R" are all
 586 different names.
 587 Names which start with a letter may also contain digits and underscores.
 588 Names which do not start with a letter are limited to one character,
 589 e.g. \*(L"$%\*(R" or \*(L"$$\*(R".
 590 (Most of the one character names have a predefined significance to
 591 .IR perl .
 592 More later.)
 593 .PP
 594 Numeric literals are specified in any of the usual floating point or
 595 integer formats:
 596 .nf
 597
 598 .ne 5
 599     12345
 600     12345.67
 601     .23E-10
 602     0xffff      # hex
 603     0377        # octal
 604
 605 .fi
 606 String literals are delimited by either single or double quotes.
 607 They work much like shell quotes:
 608 double-quoted string literals are subject to backslash and variable
 609 substitution; single-quoted strings are not (except for \e\' and \e\e).
 610 The usual backslash rules apply for making characters such as newline, tab,
 611 etc., as well as some more exotic forms:
 612 .nf
 613
 614         \et             tab
 615         \en             newline
 616         \er             return
 617         \ef             form feed
 618         \eb             backspace
 619         \ea             alarm (bell)
 620         \ee             escape
 621         \e033           octal char
 622         \ex1b           hex char
 623         \ec[            control char
 624         \el             lowercase next char
 625         \eu             uppercase next char
 626         \eL             lowercase till \eE
 627         \eU             uppercase till \eE
 628         \eE             end case modification
 629
 630 .fi
 631 You can also embed newlines directly in your strings, i.e. they can end on
 632 a different line than they begin.
 633 This is nice, but if you forget your trailing quote, the error will not be
 634 reported until
 635 .I perl
 636 finds another line containing the quote character, which
 637 may be much further on in the script.
 638 Variable substitution inside strings is limited to scalar variables, normal
 639 array values, and array slices.
 640 (In other words, identifiers beginning with $ or @, followed by an optional
 641 bracketed expression as a subscript.)
 642 The following code segment prints out \*(L"The price is $100.\*(R":
 643 .nf
 644
 645 .ne 2
 646     $Price = \'$100\';\h'|3.5i'# not interpreted
 647     print "The price is $Price.\e\|n";\h'|3.5i'# interpreted
 648
 649 .fi
 650 Note that you can put curly brackets around the identifier to delimit it
 651 from following alphanumerics.
 652 Also note that a single quoted string must be separated from a preceding
 653 word by a space, since single quote is a valid character in an identifier
 654 (see Packages).
 655 .PP
 656 Two special literals are __LINE__ and __FILE__, which represent the current
 657 line number and filename at that point in your program.
 658 They may only be used as separate tokens; they will not be interpolated
 659 into strings.
 660 In addition, the token __END__ may be used to indicate the logical end of the
 661 script before the actual end of file.
 662 Any following text is ignored (but may be read via the DATA filehandle).
 663 The two control characters ^D and ^Z are synonyms for __END__.
 664 .PP
 665 A word that doesn't have any other interpretation in the grammar will be
 666 treated as if it had single quotes around it.
 667 For this purpose, a word consists only of alphanumeric characters and underline,
 668 and must start with an alphabetic character.
 669 As with filehandles and labels, a bare word that consists entirely of
 670 lowercase letters risks conflict with future reserved words, and if you
 671 use the
 672 .B \-w
 673 switch, Perl will warn you about any such words.
 674 .PP
 675 Array values are interpolated into double-quoted strings by joining all the
 676 elements of the array with the delimiter specified in the $" variable,
 677 space by default.
 678 (Since in versions of perl prior to 3.0 the @ character was not a metacharacter
 679 in double-quoted strings, the interpolation of @array, $array[EXPR],
 680 @array[LIST], $array{EXPR}, or @array{LIST} only happens if array is
 681 referenced elsewhere in the program or is predefined.)
 682 The following are equivalent:
 683 .nf
 684
 685 .ne 4
 686         $temp = join($",@ARGV);
 687         system "echo $temp";
 688
 689         system "echo @ARGV";
 690
 691 .fi
 692 Within search patterns (which also undergo double-quotish substitution)
 693 there is a bad ambiguity:  Is /$foo[bar]/ to be
 694 interpreted as /${foo}[bar]/ (where [bar] is a character class for the
 695 regular expression) or as /${foo[bar]}/ (where [bar] is the subscript to
 696 array @foo)?
 697 If @foo doesn't otherwise exist, then it's obviously a character class.
 698 If @foo exists, perl takes a good guess about [bar], and is almost always right.
 699 If it does guess wrong, or if you're just plain paranoid,
 700 you can force the correct interpretation with curly brackets as above.
 701 .PP
 702 A line-oriented form of quoting is based on the shell here-is syntax.
 703 Following a << you specify a string to terminate the quoted material, and all lines
 704 following the current line down to the terminating string are the value
 705 of the item.
 706 The terminating string may be either an identifier (a word), or some
 707 quoted text.
 708 If quoted, the type of quotes you use determines the treatment of the text,
 709 just as in regular quoting.
 710 An unquoted identifier works like double quotes.
 711 There must be no space between the << and the identifier.
 712 (If you put a space it will be treated as a null identifier, which is
 713 valid, and matches the first blank line\*(--see Merry Christmas example below.)
 714 The terminating string must appear by itself (unquoted and with no surrounding
 715 whitespace) on the terminating line.
 716 .nf
 717
 718         print <<EOF;            # same as above
 719 The price is $Price.
 720 EOF
 721
 722         print <<"EOF";          # same as above
 723 The price is $Price.
 724 EOF
 725
 726         print << x 10;          # null identifier is delimiter
 727 Merry Christmas!
 728
 729         print <<`EOC`;          # execute commands
 730 echo hi there
 731 echo lo there
 732 EOC
 733
 734         print <<foo, <<bar;     # you can stack them
 735 I said foo.
 736 foo
 737 I said bar.
 738 bar
 739
 740 .fi
 741 Array literals are denoted by separating individual values by commas, and
 742 enclosing the list in parentheses:
 743 .nf
 744
 745         (LIST)
 746
 747 .fi
 748 In a context not requiring an array value, the value of the array literal
 749 is the value of the final element, as in the C comma operator.
 750 For example,
 751 .nf
 752
 753 .ne 4
 754     @foo = (\'cc\', \'\-E\', $bar);
 755
 756 assigns the entire array value to array foo, but
 757
 758     $foo = (\'cc\', \'\-E\', $bar);
 759
 760 .fi
 761 assigns the value of variable bar to variable foo.
 762 Note that the value of an actual array in a scalar context is the length
 763 of the array; the following assigns to $foo the value 3:
 764 .nf
 765
 766 .ne 2
 767     @foo = (\'cc\', \'\-E\', $bar);
 768     $foo = @foo;                # $foo gets 3
 769
 770 .fi
 771 You may have an optional comma before the closing parenthesis of an
 772 array literal, so that you can say:
 773 .nf
 774
 775     @foo = (
 776         1,
 777         2,
 778         3,
 779     );
 780
 781 .fi
 782 When a LIST is evaluated, each element of the list is evaluated in
 783 an array context, and the resulting array value is interpolated into LIST
 784 just as if each individual element were a member of LIST.  Thus arrays
 785 lose their identity in a LIST\*(--the list
 786
 787         (@foo,@bar,&SomeSub)
 788
 789 contains all the elements of @foo followed by all the elements of @bar,
 790 followed by all the elements returned by the subroutine named SomeSub.
 791 .PP
 792 A list value may also be subscripted like a normal array.
 793 Examples:
 794 .nf
 795
 796         $time = (stat($file))[8];       # stat returns array value
 797         $digit = ('a','b','c','d','e','f')[$digit-10];
 798         return (pop(@foo),pop(@foo))[0];
 799
 800 .fi
 801 .PP
 802 Array lists may be assigned to if and only if each element of the list
 803 is an lvalue:
 804 .nf
 805
 806     ($a, $b, $c) = (1, 2, 3);
 807
 808     ($map{\'red\'}, $map{\'blue\'}, $map{\'green\'}) = (0x00f, 0x0f0, 0xf00);
 809
 810 The final element may be an array or an associative array:
 811
 812     ($a, $b, @rest) = split;
 813     local($a, $b, %rest) = @_;
 814
 815 .fi
 816 You can actually put an array anywhere in the list, but the first array
 817 in the list will soak up all the values, and anything after it will get
 818 a null value.
 819 This may be useful in a local().
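.PP
A small sketch of the soaking-up behavior described above, using made-up
variable names.
The array in the middle of the list takes every remaining value, so the
scalar after it is left undefined.
.nf

```perl
($first, @middle, $last) = (1, 2, 3);
# @middle soaks up 2 and 3; nothing is left for $last
print "first=$first middle=@middle", $/;
print 'last is undefined', $/ unless defined($last);
```

.fi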
 820 .PP
 821 An associative array literal contains pairs of values to be interpreted
 822 as a key and a value:
 823 .nf
 824
 825 .ne 2
 826     # same as map assignment above
 827     %map = ('red',0x00f,'blue',0x0f0,'green',0xf00);
 828
 829 .fi
 830 Array assignment in a scalar context returns the number of elements
 831 produced by the expression on the right side of the assignment:
 832 .nf
 833
 834         $x = (($foo,$bar) = (3,2,1));   # set $x to 3, not 2
 835
 836 .fi
 837 .PP
 838 There are several other pseudo-literals that you should know about.
 839 If a string is enclosed by backticks (grave accents), it first undergoes
 840 variable substitution just like a double quoted string.
 841 It is then interpreted as a command, and the output of that command
 842 is the value of the pseudo-literal, as in a shell.
 843 In a scalar context, a single string consisting of all the output is
 844 returned.
 845 In an array context, an array of values is returned, one for each line
 846 of output.
 847 (You can set $/ to use a different line terminator.)
 848 The command is executed each time the pseudo-literal is evaluated.
 849 The status value of the command is returned in $? (see Predefined Names
 850 for the interpretation of $?).
 851 Unlike in \fIcsh\fR, no translation is done on the return
 852 data\*(--newlines remain newlines.
 853 Unlike in any of the shells, single quotes do not hide variable names
 854 in the command from interpretation.
 855 To pass a $ through to the shell you need to hide it with a backslash.
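.PP
The difference between the two contexts can be sketched as follows.
This assumes a Unix-like system whose shell provides echo; the variable
names are illustrative only.
.nf

```perl
$all   = `echo one; echo two`;      # scalar context: a single string
@lines = `echo one; echo two`;      # array context: one element per line
$count = @lines;                    # array in a scalar context: its length
print "array context returned $count lines", $/;
print "scalar context returned: $all";
```

.fi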
 856 .PP
 857 Evaluating a filehandle in angle brackets yields the next line
 858 from that file (newline included, so it's never false until EOF, at
 859 which time an undefined value is returned).
 860 Ordinarily you must assign that value to a variable,
 861 but there is one situation where an automatic assignment happens.
 862 If (and only if) the input symbol is the only thing inside the conditional of a
 863 .I while
 864 loop, the value is
 865 automatically assigned to the variable \*(L"$_\*(R".
 866 (This may seem like an odd thing to you, but you'll use the construct
 867 in almost every
 868 .I perl
 869 script you write.)
 870 Anyway, the following lines are equivalent to each other:
 871 .nf
 872
 873 .ne 5
 874     while ($_ = <STDIN>) { print; }
 875     while (<STDIN>) { print; }
 876     for (\|;\|<STDIN>;\|) { print; }
 877     print while $_ = <STDIN>;
 878     print while <STDIN>;
 879
 880 .fi
 881 The filehandles
 882 .IR STDIN ,
 883 .I STDOUT
 884 and
 885 .I STDERR
 886 are predefined.
 887 (The filehandles
 888 .IR stdin ,
 889 .I stdout
 890 and
 891 .I stderr
 892 will also work except in packages, where they would be interpreted as
 893 local identifiers rather than global.)
 894 Additional filehandles may be created with the
 895 .I open
 896 function.
 897 .PP
 898 If a <FILEHANDLE> is used in a context that is looking for an array, an array
 899 consisting of all the input lines is returned, one line per array element.
 900 It's easy to make a LARGE data space this way, so use with care.
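.PP
A small sketch of the two ways of reading a filehandle described above.
The scratch file name is an assumption chosen for the example, not
anything perl requires.
.nf
```perl
# The file name is a made-up scratch path used only for this sketch.
$tmp = "/tmp/perl_lines.$$";
open(OUT, ">$tmp") || die "Can't create $tmp: $!";
print OUT "alpha\n", "beta\n", "gamma\n";
close(OUT);

open(IN, $tmp) || die "Can't open $tmp: $!";
$count = 0;
while (<IN>) {          # scalar context: one line per evaluation, in $_
    $count++;
}
close(IN);

open(IN, $tmp) || die "Can't open $tmp: $!";
@all = <IN>;            # array context: every remaining line at once
close(IN);
unlink($tmp);

print "$count lines read one at a time\n";
print "last of ", $#all + 1, " slurped lines: $all[$#all]";
```
.fi
The array form holds the whole file in memory, which is why the loop form
is usually the better choice for large inputs.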
 901 .PP
 902 The null filehandle <> is special and can be used to emulate the behavior of
 903 \fIsed\fR and \fIawk\fR.
 904 Input from <> comes either from standard input, or from each file listed on
 905 the command line.
 906 Here's how it works: the first time <> is evaluated, the ARGV array is checked,
 907 and if it is null, $ARGV[0] is set to \'-\', which when opened gives you standard
 908 input.
 909 The ARGV array is then processed as a list of filenames.
 910 The loop
 911 .nf
 912
 913 .ne 3
 914         while (<>) {
 915                 .\|.\|.                 # code for each line
 916         }
 917
 918 .ne 10
 919 is equivalent to
 920
 921         unshift(@ARGV, \'\-\') \|if \|$#ARGV < $[;
 922         while ($ARGV = shift) {
 923                 open(ARGV, $ARGV);
 924                 while (<ARGV>) {
 925                         .\|.\|.         # code for each line
 926                 }
 927         }
 928
 929 .fi
 930 except that it isn't as cumbersome to say.
 931 It really does shift array ARGV and put the current filename into
 932 variable ARGV.
 933 It also uses filehandle ARGV internally.
 934 You can modify @ARGV before the first <> as long as you leave the first
 935 filename at the beginning of the array.
 936 Line numbers ($.) continue as if the input were one big happy file.
 937 (But see example under eof for how to reset line numbers on each file.)
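.PP
The behavior of <> can be tried without a command line by filling in @ARGV
by hand.
The following is a sketch only; the two scratch files are assumptions
created just for the demonstration.
.nf
```perl
# Two made-up scratch files stand in for files named on the command line.
@names = ("/tmp/perl_a.$$", "/tmp/perl_b.$$");
foreach $name (@names) {
    open(OUT, ">$name") || die "Can't create $name: $!";
    print OUT "only line\n";
    close(OUT);
}

@ARGV = @names;                     # pretend these were the arguments
while (<>) {
    push(@seen, $ARGV . ':' . $.);  # current file name and line number
}
unlink(@names);

print join("\n", @seen), "\n";
```
.fi
Each file contributes one line, and the second entry shows $. still
counting rather than starting over.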
 938 .PP
 939 .ne 5
 940 If you want to set @ARGV to your own list of files, go right ahead.
 941 If you want to pass switches into your script, you can
 942 put a loop on the front like this:
 943 .nf
 944
 945 .ne 10
 946         while ($_ = $ARGV[0], /\|^\-/\|) {
 947                 shift;
 948                 last if /\|^\-\|\-$\|/\|;
 949                 /\|^\-D\|(.*\|)/ \|&& \|($debug = $1);
 950                 /\|^\-v\|/ \|&& \|$verbose++;
 951                 .\|.\|.         # other switches
 952         }
 953         while (<>) {
 954                 .\|.\|.         # code for each line
 955         }
 956
 957 .fi
 958 The <> symbol will return FALSE only once.
 959 If you call it again after this it will assume you are processing another
 960 @ARGV list, and if you haven't set @ARGV, will input from
 961 .IR STDIN .
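.PP
To see what the switch loop leaves behind, it can be run against a
hand-made argument list.
This is a sketch under that assumption; the arguments and the file name
in them are hypothetical.
.nf
```perl
# Hypothetical arguments, set here so the sketch runs without a command line.
@ARGV = ('-v', '-Dtrace', '--', 'notes.txt');

while ($_ = $ARGV[0], /^-/) {
    shift(@ARGV);
    last if /^--$/;                 # "--" ends the switches
    /^-D(.*)/ && ($debug = $1);
    /^-v/ && $verbose++;
}

print "verbose=$verbose debug=$debug files=@ARGV\n";
```
.fi
Only the non-switch arguments remain in @ARGV, so a following
while (<>) loop would read just those files.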
 962 .PP
 963 If the string inside the angle brackets is a reference to a scalar variable
 964 (e.g. <$foo>),
 965 then that variable contains the name of the filehandle to input from.
 966 .PP
 967 If the string inside angle brackets is not a filehandle, it is interpreted
 968 as a filename pattern to be globbed, and either an array of filenames or the
 969 next filename in the list is returned, depending on context.
 970 One level of $ interpretation is done first, but you can't say <$foo>
 971 because that's an indirect filehandle as explained in the previous
 972 paragraph.
 973 You could insert curly brackets to force interpretation as a
 974 filename glob: <${foo}>.
 975 Example:
 976 .nf
 977
 978 .ne 3
 979         while (<*.c>) {
 980                 chmod 0644, $_;
 981         }
 982
 983 is equivalent to
 984
 985 .ne 5
 986         open(foo, "echo *.c | tr \-s \' \et\er\ef\' \'\e\e012\e\e012\e\e012\e\e012\'|");
 987         while (<foo>) {
 988                 chop;
 989                 chmod 0644, $_;
 990         }
 991
 992 .fi
 993 In fact, it's currently implemented that way.
 994 (Which means it will not work on filenames with spaces in them unless
 995 you have /bin/csh on your machine.)
 996 Of course, the shortest way to do the above is:
 997 .nf
 998
 999         chmod 0644, <*.c>;
1000
1001 .fi
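.PP
A sketch of globbing with one level of $ interpretation, as described
above.
The directory and file names are assumptions made up for the example.
.nf
```perl
# The directory and file names are made up for this sketch.
$dir = "/tmp/perl_glob.$$";
mkdir($dir, 0755) || die "Can't mkdir $dir: $!";
foreach $name ('a.c', 'b.c', 'c.h') {
    open(OUT, ">$dir/$name") || die "Can't create $dir/$name: $!";
    close(OUT);
}

@sources = <$dir/*.c>;      # $dir is substituted, then the pattern is globbed
$howmany = @sources;

unlink(<$dir/*>);           # array context again: remove everything created
rmdir($dir);

print "$howmany C files: @sources\n";
```
.fi
Because the text inside the angle brackets is more than a bare scalar
variable, it is taken as a pattern rather than as an indirect filehandle.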
1002 .Sh "Syntax"
1003 .PP
1004 A
1005 .I perl
1006 script consists of a sequence of declarations and commands.
1007 The only things that need to be declared in
1008 .I perl
1009 are report formats and subroutines.
1010 See the sections below for more information on those declarations.
1011 All uninitialized user-created objects are assumed to
1012 start with a null or 0 value until they
1013 are defined by some explicit operation such as assignment.
1014 The sequence of commands is executed just once, unlike in
1015 .I sed
1016 and
1017 .I awk
1018 scripts, where the sequence of commands is executed for each input line.
1019 While this means that you must explicitly loop over the lines of your input file
1020 (or files), it also means you have much more control over which files and which
1021 lines you look at.
1022 (Actually, I'm lying\*(--it is possible to do an implicit loop with either the
1023 .B \-n
1024 or
1025 .B \-p
1026 switch.)
1027 .PP
1028 A declaration can be put anywhere a command can, but has no effect on the
1029 execution of the primary sequence of commands\*(--declarations all take effect
1030 at compile time.
1031 Typically all the declarations are put at the beginning or the end of the script.
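.PP
As a brief illustration of declarations taking effect at compile time,
the subroutine below is declared after the command that calls it.
The name \fItwice\fR is hypothetical.
.nf
```perl
$answer = &twice(21);       # the call comes first in the file

sub twice {                 # the declaration is already known by then
    $_[0] * 2;
}

print "$answer\n";
```
.fi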
1032 .PP
1033 .I Perl
1034 is, for the most part, a free-form language.
1035 (The only exception to this is format declarations, for fairly obvious reasons.)
1036 Comments are indicated by the # character, and extend to the end of the line.
1037 If you attempt to use /* */ C comments, they will be interpreted either as
1038 division or pattern matching, depending on the context.
1039 So don't do that.
1040 .Sh "Compound statements"
1041 In
1042 .IR perl ,
1043 a sequence of commands may be treated as one command by enclosing it
1044 in curly brackets.
1045 We will call this a BLOCK.
1046 .PP
1047 The following compound commands may be used to control flow:
1048 .nf
1049
1050 .ne 4
1051         if (EXPR) BLOCK
1052         if (EXPR) BLOCK else BLOCK
1053         if (EXPR) BLOCK elsif (EXPR) BLOCK .\|.\|. else BLOCK
1054         LABEL while (EXPR) BLOCK
1055         LABEL while (EXPR) BLOCK continue BLOCK
1056         LABEL for (EXPR; EXPR; EXPR) BLOCK
1057         LABEL foreach VAR (ARRAY) BLOCK
1058         LABEL BLOCK continue BLOCK
1059
1060 .fi
1061 Note that, unlike C and Pascal, these are defined in terms of BLOCKs, not
1062 statements.
1063 This means that the curly brackets are \fIrequired\fR\*(--no dangling statements allowed.
1064 If you want to write conditionals without curly brackets there are several
1065 other ways to do it.
1066 The following all do the same thing:
1067 .nf
1068
1069 .ne 5
1070         if (!open(foo)) { die "Can't open $foo: $!"; }
1071         die "Can't open $foo: $!" unless open(foo);
1072         open(foo) || die "Can't open $foo: $!"; # foo or bust!
1073         open(foo) ? \'hi mom\' : die "Can't open $foo: $!";
1074                                 # a bit exotic, that last one
1075
1076 .fi
1077 .PP
1078 The
1079 .I if
1080 statement is straightforward.
1081 Since BLOCKs are always bounded by curly brackets, there is never any
1082 ambiguity about which
1083 .I if
1084 an
1085 .I else
1086 goes with.
1087 If you use
1088 .I unless
1089 in place of
1090 .IR if ,
1091 the sense of the test is reversed.
1092 .PP
1093 The
1094 .I while
1095 statement executes the block as long as the expression is true
1096 (does not evaluate to the null string or 0).
1097 The LABEL is optional, and if present, consists of an identifier followed by
1098 a colon.
1099 The LABEL identifies the loop for the loop control statements
1100 .IR next ,
1101 .IR last ,
1102 and
1103 .I redo
1104 (see below).
1105 If there is a
1106 .I continue
1107 BLOCK, it is always executed just before
1108 the conditional is about to be evaluated again, similarly to the third part
1109 of a
1110 .I for
1111 loop in C.
1112 Thus it can be used to increment a loop variable, even when the loop has
1113 been continued via the
1114 .I next
1115 statement (similar to the C \*(L"continue\*(R" statement).
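.PP
A minimal sketch of a
.I continue
BLOCK doing the increment even when the body is cut short; the variable
names are illustrative only.
.nf
```perl
$i = 0;
while ($i < 6) {
    next if $i % 2;         # odd: skip the rest of the body
    push(@even, $i);
} continue {
    $i++;                   # still runs after "next"
}

print "@even\n";
```
.fi
Had the increment been placed at the bottom of the main block instead,
the first odd value would have made the loop spin forever.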
1116 .PP
1117 If the word
1118 .I while
1119 is replaced by the word
1120 .IR until ,
1121 the sense of the test is reversed, but the conditional is still tested before
1122 the first iteration.
1123 .PP
1124 In either the
1125 .I if
1126 or the
1127 .I while
1128 statement, you may replace \*(L"(EXPR)\*(R" with a BLOCK, and the conditional
1129 is true if the value of the last command in that block is true.
1130 .PP
1131 The
1132 .I for
1133 loop works exactly like the corresponding
1134 .I while
1135 loop:
1136 .nf
1137
1138 .ne 12
1139         for ($i = 1; $i < 10; $i++) {
1140                 .\|.\|.
1141         }
1142
1143 is the same as
1144
1145         $i = 1;
1146         while ($i < 10) {
1147                 .\|.\|.
1148         } continue {
1149                 $i++;
1150         }
1151 .fi
1152 .PP
1153 The foreach loop iterates over a normal array value and sets the variable
1154 VAR to be each element of the array in turn.
1155 The variable is implicitly local to the loop, and regains its former value
1156 upon exiting the loop.
1157 The \*(L"foreach\*(R" keyword is actually identical to the \*(L"for\*(R" keyword,
1158 so you can use \*(L"foreach\*(R" for readability or \*(L"for\*(R" for brevity.
1159 If VAR is omitted, $_ is set to each value.
1160 If ARRAY is an actual array (as opposed to an expression returning an array
1161 value), you can modify each element of the array
1162 by modifying VAR inside the loop.
1163 Examples:
1164 .nf
1165
1166 .ne 5
1167         for (@ary) { s/foo/bar/; }
1168
1169         foreach $elem (@elements) {
1170                 $elem *= 2;
1171         }
1172
1173 .ne 3
1174         for ((10,9,8,7,6,5,4,3,2,1,\'BOOM\')) {
1175                 print $_, "\en"; sleep(1);
1176         }
1177
1178         for (1..15) { print "Merry Christmas\en"; }
1179
1180 .ne 3
1181         foreach $item (split(/:[\e\e\en:]*/, $ENV{\'TERMCAP\'})) {
1182                 print "Item: $item\en";
1183         }
1184
1185 .fi
1186 .PP
1187 The BLOCK by itself (labeled or not) is equivalent to a loop that executes
1188 once.
1189 Thus you can use any of the loop control statements in it to leave or
1190 restart the block.
1191 The
1192 .I continue
1193 block is optional.
1194 This construct is particularly nice for doing case structures.
1195 .nf
1196
1197 .ne 6
1198         foo: {
1199                 if (/^abc/) { $abc = 1; last foo; }
1200                 if (/^def/) { $def = 1; last foo; }
1201                 if (/^xyz/) { $xyz = 1; last foo; }
1202                 $nothing = 1;
1203         }
1204
1205 .fi
1206 There is no official switch statement in perl, because there
1207 are already several ways to write the equivalent.
1208 In addition to the above, you could write
1209 .nf
1210
1211 .ne 6
1212         foo: {
1213                 $abc = 1, last foo  if /^abc/;
1214                 $def = 1, last foo  if /^def/;
1215                 $xyz = 1, last foo  if /^xyz/;
1216                 $nothing = 1;
1217         }
1218
1219 or
1220
1221 .ne 6
1222         foo: {
1223                 /^abc/ && do { $abc = 1; last foo; };
1224                 /^def/ && do { $def = 1; last foo; };
1225                 /^xyz/ && do { $xyz = 1; last foo; };
1226                 $nothing = 1;
1227         }
1228
1229 or
1230
1231 .ne 6
1232         foo: {
1233                 /^abc/ && ($abc = 1, last foo);
1234                 /^def/ && ($def = 1, last foo);
1235                 /^xyz/ && ($xyz = 1, last foo);
1236                 $nothing = 1;
1237         }
1238
1239 or even
1240
1241 .ne 8
1242         if (/^abc/)
1243                 { $abc = 1; }
1244         elsif (/^def/)
1245                 { $def = 1; }
1246         elsif (/^xyz/)
1247                 { $xyz = 1; }
1248         else
1249                 {$nothing = 1;}
1250
1251 .fi
1252 As it happens, these are all optimized internally to a switch structure,
1253 so perl jumps directly to the desired statement, and you needn't worry
1254 about perl executing a lot of unnecessary statements when you have a string
1255 of 50 elsifs, as long as you are testing the same simple scalar variable
1256 using ==, eq, or pattern matching as above.
1257 (If you're curious as to whether the optimizer has done this for a particular
1258 case statement, you can use the \-D1024 switch to list the syntax tree
1259 before execution.)
1260 .Sh "Simple statements"
1261 The only kind of simple statement is an expression evaluated for its side
1262 effects.
1263 Every expression (simple statement) must be terminated with a semicolon.
1264 Note that this is like C, but unlike Pascal (and
1265 .IR awk ).
1266 .PP
1267 Any simple statement may optionally be followed by a
1268 single modifier, just before the terminating semicolon.
1269 The possible modifiers are:
1270 .nf
1271
1272 .ne 4
1273         if EXPR
1274         unless EXPR
1275         while EXPR
1276         until EXPR
1277
1278 .fi
1279 The
1280 .I if
1281 and
1282 .I unless
1283 modifiers have the expected semantics.
1284 The
1285 .I while
1286 and
1287 .I until
1288 modifiers also have the expected semantics (conditional evaluated first),
1289 except when applied to a do-BLOCK or a do-SUBROUTINE command,
1290 in which case the block executes once before the conditional is evaluated.
1291 This is so that you can write loops like:
1292 .nf
1293
1294 .ne 4
1295         do {
1296                 $_ = <STDIN>;
1297                 .\|.\|.
1298         } until $_ \|eq \|".\|\e\|n";
1299
1300 .fi
1301 (See the
1302 .I do
1303 operator below.  Note also that the loop control commands described later will
1304 NOT work in this construct, since modifiers don't take loop labels.
1305 Sorry.)
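.PP
The difference between a modifier on a do-BLOCK and a modifier on an
ordinary statement can be sketched like this (the variables are made up
for the example):
.nf
```perl
$n = 10;
$runs = 0;
do {
    $n++;
    $runs++;
} until $n > 5;             # already true, but the block ran once

$m = 10;
$m++ while $m < 5;          # ordinary modifier: tested first, never runs

print "runs=$runs n=$n m=$m\n";
```
.fi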
1306 .Sh "Expressions"
1307 Since
1308 .I perl
1309 expressions work almost exactly like C expressions, only the differences
1310 will be mentioned here.
1311 .PP
1312 Here's what
1313 .I perl
1314 has that C doesn't:
1315 .Ip ** 8 2
1316 The exponentiation operator.
1317 .Ip **= 8
1318 The exponentiation assignment operator.
1319 .Ip (\|) 8 3
1320 The null list, used to initialize an array to null.
1321 .Ip . 8
1322 Concatenation of two strings.
1323 .Ip .= 8
1324 The concatenation assignment operator.
1325 .Ip eq 8
1326 String equality (== is numeric equality).
1327 For a mnemonic just think of \*(L"eq\*(R" as a string.
1328 (If you are used to the
1329 .I awk
1330 behavior of using == for either string or numeric equality
1331 based on the current form of the comparands, beware!
1332 You must be explicit here.)
1333 .Ip ne 8
1334 String inequality (!= is numeric inequality).
1335 .Ip lt 8
1336 String less than.
1337 .Ip gt 8
1338 String greater than.
1339 .Ip le 8
1340 String less than or equal.
1341 .Ip ge 8
1342 String greater than or equal.
1343 .Ip cmp 8
1344 String comparison, returning -1, 0, or 1.
1345 .Ip <=> 8
1346 Numeric comparison, returning -1, 0, or 1.
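.Sp
A short sketch of how the string and numeric comparison operators can
disagree about the same operands; the values are chosen only to show the
difference.
.nf
```perl
$x = '10';
$y = '9';

$numeric = ($x <=> $y);     # compares 10 with 9 as numbers
$string  = ($x cmp $y);     # compares '1' with '9' character by character

$num_same = ('1.0' == 1)   ? 'yes' : 'no';
$str_same = ('1.0' eq '1') ? 'yes' : 'no';

print "numeric=$numeric string=$string\n";
print "== says $num_same, eq says $str_same\n";
```
.fi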
1347 .Ip =~ 8 2
1348 Certain operations search or modify the string \*(L"$_\*(R" by default.
1349 This operator makes that kind of operation work on some other string.
1350 The right argument is a search pattern, substitution, or translation.
1351 The left argument is what is supposed to be searched, substituted, or
1352 translated instead of the default \*(L"$_\*(R".
1353 The return value indicates the success of the operation.
1354 (If the right argument is an expression other than a search pattern,
1355 substitution, or translation, it is interpreted as a search pattern
1356 at run time.
1357 This is less efficient than an explicit search, since the pattern must
1358 be compiled every time the expression is evaluated.)
1359 The precedence of this operator is lower than unary minus and autoincrement/decrement, but higher than everything else.
1360 .Ip !~ 8
1361 Just like =~ except the return value is negated.
1362 .Ip x 8
1363 The repetition operator.
1364 Returns a string consisting of the left operand repeated the
1365 number of times specified by the right operand.
1366 In an array context, if the left operand is a list in parens, it repeats
1367 the list.
1368 .nf
1369
1370         print \'\-\' x 80;              # print row of dashes
1371         print \'\-\' x80;               # illegal, x80 is identifier
1372
1373         print "\et" x ($tab/8), \' \' x ($tab%8);       # tab over
1374
1375         @ones = (1) x 80;               # an array of 80 1's
1376         @ones = (5) x @ones;            # set all elements to 5
1377
1378 .fi
1379 .Ip x= 8
1380 The repetition assignment operator.
1381 Only works on scalars.
1382 .Ip .\|. 8
1383 The range operator, which is really two different operators depending
1384 on the context.
1385 In an array context, returns an array of values counting (by ones)
1386 from the left value to the right value.
1387 This is useful for writing \*(L"for (1..10)\*(R" loops and for doing
1388 slice operations on arrays.
1389 .Sp
1390 In a scalar context, .\|. returns a boolean value.
1391 The operator is bistable, like a flip-flop.
1392 Each .\|. operator maintains its own boolean state.
1393 It is false as long as its left operand is false.
1394 Once the left operand is true, the range operator stays true
1395 until the right operand is true,
1396 AFTER which the range operator becomes false again.
1397 (It doesn't become false till the next time the range operator is evaluated.
1398 It can become false on the same evaluation it became true, but it still returns
1399 true once.)
1400 The right operand is not evaluated while the operator is in the \*(L"false\*(R" state,
1401 and the left operand is not evaluated while the operator is in the \*(L"true\*(R" state.
1402 The scalar .\|. operator is primarily intended for doing line number ranges
1403 after
1404 the fashion of \fIsed\fR or \fIawk\fR.
1405 The precedence is a little lower than || and &&.
1406 The value returned is either the null string for false, or a sequence number
1407 (beginning with 1) for true.
1408 The sequence number is reset for each range encountered.
1409 The final sequence number in a range has the string \'E0\' appended to it, which
1410 doesn't affect its numeric value, but gives you something to search for if you
1411 want to exclude the endpoint.
1412 You can exclude the beginning point by waiting for the sequence number to be
1413 greater than 1.
1414 If either operand of scalar .\|. is static, that operand is implicitly compared
1415 to the $. variable, the current line number.
1416 Examples:
1417 .nf
1418
1419 .ne 6
1420 As a scalar operator:
1421     if (101 .\|. 200) { print; }        # print 2nd hundred lines
1422
1423     next line if (1 .\|. /^$/); # skip header lines
1424
1425     s/^/> / if (/^$/ .\|. eof());       # quote body
1426
1427 .ne 4
1428 As an array operator:
1429     for (101 .\|. 200) { print; }       # print $_ 100 times
1430
1431     @foo = @foo[$[ .\|. $#foo]; # an expensive no-op
1432     @foo = @foo[$#foo-4 .\|. $#foo];    # slice last 5 items
1433
1434 .fi
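.Sp
The sequence numbers and the \'E0\' marker can be observed with a sketch
such as the following.
It uses patterns on both sides so that $. plays no part, and the input
lines are made up for the example.
.nf
```perl
# Made-up input lines; patterns are used so that $. is not involved.
@input = ("intro\n", "BEGIN\n", "body\n", "END\n", "trailer\n");

foreach $line (@input) {
    $seq = ($line =~ /^BEGIN/ .. $line =~ /^END/);
    next unless $seq;               # null string outside the range
    push(@seqs, $seq);
    # keep only lines strictly between the two endpoints
    push(@inside, $line) unless $seq == 1 || $seq =~ /E0$/;
}

print "sequence values: @seqs\n";
print "between the markers: @inside";
```
.fi
Testing for a sequence number of 1 drops the starting line, and searching
for E0 drops the ending one.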
1435 .Ip \-x 8
1436 A file test.
1437 This unary operator takes one argument, either a filename or a filehandle,
1438 and tests the associated file to see if something is true about it.
1439 If the argument is omitted, tests $_, except for \-t, which tests
1440 .IR STDIN .
1441 It returns 1 for true and \'\' for false, or the undefined value if the
1442 file doesn't exist.
1443 Precedence is higher than logical and relational operators, but lower than
1444 arithmetic operators.
1445 The operator may be any of:
1446 .nf
1447         \-r     File is readable by effective uid.
1448         \-w     File is writable by effective uid.
1449         \-x     File is executable by effective uid.
1450         \-o     File is owned by effective uid.
1451         \-R     File is readable by real uid.
1452         \-W     File is writable by real uid.
1453         \-X     File is executable by real uid.
1454         \-O     File is owned by real uid.
1455         \-e     File exists.
1456         \-z     File has zero size.
1457         \-s     File has non-zero size (returns size).
1458         \-f     File is a plain file.
1459         \-d     File is a directory.
1460         \-l     File is a symbolic link.
1461         \-p     File is a named pipe (FIFO).
1462         \-S     File is a socket.
1463         \-b     File is a block special file.
1464         \-c     File is a character special file.
1465         \-u     File has setuid bit set.
1466         \-g     File has setgid bit set.
1467         \-k     File has sticky bit set.
1468         \-t     Filehandle is opened to a tty.
1469         \-T     File is a text file.
1470         \-B     File is a binary file (opposite of \-T).
1471         \-M     Age of file in days when script started.
1472         \-A     Same for access time.
1473         \-C     Same for inode change time.
1474
1475 .fi
1476 The interpretation of the file permission operators \-r, \-R, \-w, \-W, \-x and \-X
1477 is based solely on the mode of the file and the uids and gids of the user.
1478 There may be other reasons you can't actually read, write or execute the file.
1479 Also note that, for the superuser, \-r, \-R, \-w and \-W always return 1, and
1480 \-x and \-X return 1 if any execute bit is set in the mode.
1481 Scripts run by the superuser may thus need to do a stat() in order to determine
1482 the actual mode of the file, or temporarily set the uid to something else.
1483 .Sp
1484 Example:
1485 .nf
1486 .ne 7
1487
1488         while (<>) {
1489                 chop;
1490                 next unless \-f $_;     # ignore specials
1491                 .\|.\|.
1492         }
1493
1494 .fi
1495 Note that \-s/a/b/ does not do a negated substitution.
1496 Saying \-exp($foo) still works as expected, however\*(--only single letters
1497 following a minus are interpreted as file tests.
1498 .Sp
1499 The \-T and \-B switches work as follows.
1500 The first block or so of the file is examined for odd characters such as
1501 strange control codes or metacharacters.
1502 If too many odd characters (>10%) are found, it's a \-B file, otherwise it's a \-T file.
1503 Also, any file containing null in the first block is considered a binary file.
1504 If \-T or \-B is used on a filehandle, the current stdio buffer is examined
1505 rather than the first block.
1506 Both \-T and \-B return TRUE on a null file, or a file at EOF when testing
1507 a filehandle.
1508 .PP
1509 If any of the file tests (or either stat operator) are given the special
1510 filehandle consisting of a solitary underline, then the stat structure
1511 of the previous file test (or stat operator) is used, saving a system
1512 call.
1513 (This doesn't work with \-t, and you need to remember that lstat and \-l
1514 will leave values in the stat structure for the symbolic link, not the
1515 real file.)
1516 Example:
1517 .nf
1518
1519         print "Can do.\en" if -r $a || -w _ || -x _;
1520
1521 .ne 9
1522         stat($filename);
1523         print "Readable\en" if -r _;
1524         print "Writable\en" if -w _;
1525         print "Executable\en" if -x _;
1526         print "Setuid\en" if -u _;
1527         print "Setgid\en" if -g _;
1528         print "Sticky\en" if -k _;
1529         print "Text\en" if -T _;
1530         print "Binary\en" if -B _;
1531
1532 .fi
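.PP
For a version of the above that can be run as it stands, here is a sketch
that creates its own file to test.
The scratch file name is an assumption made for the example.
.nf
```perl
# A made-up scratch file; removed again at the end.
$file = "/tmp/perl_ftest.$$";
open(OUT, ">$file") || die "Can't create $file: $!";
print OUT "hello\n";
close(OUT);

$exists = -e $file;         # one stat call...
$size   = -s _;             # ...reused by the tests on _
$plain  = -f _;
$isdir  = -d _;
unlink($file);
$after  = -e $file;         # the file is gone now

print "size=$size plain=$plain\n";
print "is a directory\n" if $isdir;
print "still there\n" if $after;
```
.fi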
1533 .PP
1534 Here is what C has that
1535 .I perl
1536 doesn't:
1537 .Ip "unary &" 12
1538 Address-of operator.
1539 .Ip "unary *" 12
1540 Dereference-address operator.
1541 .Ip "(TYPE)" 12
1542 Type casting operator.
1543 .PP
1544 Like C,
1545 .I perl
1546 does a certain amount of expression evaluation at compile time, whenever
1547 it determines that all of the arguments to an operator are static and have
1548 no side effects.
1549 In particular, string concatenation happens at compile time between literals that don't do variable substitution.
1550 Backslash interpretation also happens at compile time.
1551 You can say
1552 .nf
1553
1554 .ne 2
1555         \'Now is the time for all\' . "\|\e\|n" .
1556         \'good men to come to.\'
1557
1558 .fi
1559 and this all reduces to one string internally.
1560 .PP
1561 The autoincrement operator has a little extra built-in magic to it.
1562 If you increment a variable that is numeric, or that has ever been used in
1563 a numeric context, you get a normal increment.
1564 If, however, the variable has only been used in string contexts since it
1565 was set, and has a value that is not null and matches the
1566 pattern /^[a\-zA\-Z]*[0\-9]*$/, the increment is done
1567 as a string, preserving each character within its range, with carry:
1568 .nf
1569
1570         print ++($foo = \'99\');        # prints \*(L'100\*(R'
1571         print ++($foo = \'a0\');        # prints \*(L'a1\*(R'
1572         print ++($foo = \'Az\');        # prints \*(L'Ba\*(R'
1573         print ++($foo = \'zz\');        # prints \*(L'aaa\*(R'
1574
1575 .fi
1576 The autodecrement is not magical.
1577 .PP
1578 The range operator (in an array context) makes use of the magical
1579 autoincrement algorithm if the minimum and maximum are strings.
1580 You can say
1581
1582         @alphabet = (\'A\' .. \'Z\');
1583
1584 to get all the letters of the alphabet, or
1585
1586         $hexdigit = (0 .. 9, \'a\' .. \'f\')[$num & 15];
1587
1588 to get a hexadecimal digit, or
1589
1590         @z2 = (\'01\' .. \'31\');  print $z2[$mday];
1591
1592 to get dates with leading zeros.
1593 (If the final value specified is not in the sequence that the magical increment
1594 would produce, the sequence goes until the next value would be longer than
1595 the final value specified.)
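.PP
A few more cases of the magical increment and of string ranges, offered as
a sketch; the variable names are illustrative.
.nf
```perl
$v = 'Zz';  ++$v;               # carry runs off the front of the string
$w = 'a9';  ++$w;               # digit wraps, letter steps
$n = 'a9';  $n += 0;  ++$n;     # used as a number first: plain increment

@cols = ('aa' .. 'ad');
@days = ('08' .. '11');

print "$v $w $n\n";
print "@cols\n";
print "@days\n";
```
.fi
The third case shows that once a variable has been used in a numeric
context, the string magic no longer applies.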
1596 .PP
1597 The || and && operators differ from C's in that, rather than returning 0 or 1,
1598 they return the last value evaluated.
1599 Thus, a portable way to find out the home directory might be:
1600 .nf
1601
1602         $home = $ENV{'HOME'} || $ENV{'LOGDIR'} ||
1603             (getpwuid($<))[7] || die "You're homeless!\en";
1604
1605 .fi
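.PP
The values that || and && hand back can be seen in a sketch like this one
(the variable names are made up):
.nf
```perl
$name  = '' || 0 || 'guest';    # first true operand
$empty = 0 || '';               # no true operand: the last one evaluated
$both  = 'yes' && 'again';      # && yields the right operand when the left is true
$stop  = 0 && 'unreached';      # && stops at the false left operand

print "name=$name both=$both stop=$stop empty=[$empty]\n";
```
.fi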
1606 ''' Beginning of part 2
1613 ''' Revision 3.0.1.11  91/01/11  18:17:08  lwall
1614 ''' patch42: fixed some man page entries
1615 '''
1616 ''' Revision 3.0.1.10  90/11/10  01:46:29  lwall
1617 ''' patch38: random cleanup
1618 ''' patch38: added alarm function
1619 '''
1620 ''' Revision 3.0.1.9  90/10/15  18:17:37  lwall
1621 ''' patch29: added caller
1622 ''' patch29: index and substr now have optional 3rd args
1623 ''' patch29: added SysV IPC
1624 '''
1625 ''' Revision 3.0.1.8  90/08/13  22:21:00  lwall
1626 ''' patch28: documented that you can't interpolate $) or $| in pattern
1627 '''
1628 ''' Revision 3.0.1.7  90/08/09  04:27:04  lwall
1629 ''' patch19: added require operator
1630 '''
1631 ''' Revision 3.0.1.6  90/08/03  11:15:29  lwall
1632 ''' patch19: Intermediate diffs for Randal
1633 '''
1634 ''' Revision 3.0.1.5  90/03/27  16:15:17  lwall
1635 ''' patch16: MSDOS support
1636 '''
1637 ''' Revision 3.0.1.4  90/03/12  16:46:02  lwall
1638 ''' patch13: documented behavior of @array = /noparens/
1639 '''
1640 ''' Revision 3.0.1.3  90/02/28  17:55:58  lwall
1641 ''' patch9: grep now returns number of items matched in scalar context
1642 ''' patch9: documented in-place modification capabilities of grep
1643 '''
1644 ''' Revision 3.0.1.2  89/11/17  15:30:16  lwall
1645 ''' patch5: fixed some manual typos and indent problems
1646 '''
1647 ''' Revision 3.0.1.1  89/11/11  04:43:10  lwall
1648 ''' patch2: made some line breaks depend on troff vs. nroff
1649 ''' patch2: example of unshift had args backwards
1650 '''
1651 ''' Revision 3.0  89/10/18  15:21:37  lwall
1652 ''' 3.0 baseline
1653 '''
1654 '''
1655 .PP
1656 Along with the literals and variables mentioned earlier,
1657 the operations in the following section can serve as terms in an expression.
1658 Some of these operations take a LIST as an argument.
1659 Such a list can consist of any combination of scalar arguments or array values;
1660 the array values will be included in the list as if each individual element were
1661 interpolated at that point in the list, forming a longer single-dimensional
1662 array value.
1663 Elements of the LIST should be separated by commas.
1664 If an operation is listed both with and without parentheses around its
1665 arguments, it means you can either use it as a unary operator or
1666 as a function call.
1667 To use it as a function call, the next token on the same line must
1668 be a left parenthesis.
1669 (There may be intervening white space.)
1670 Such a function then has highest precedence, as you would expect from
1671 a function.
1672 If any token other than a left parenthesis follows, then it is a
1673 unary operator, with a precedence depending only on whether it is a LIST
1674 operator or not.
1675 LIST operators have lowest precedence.
1676 All other unary operators have a precedence greater than relational operators
1677 but less than arithmetic operators.
1678 See the section on Precedence.
1679 .Ip "/PATTERN/" 8 4
1680 See m/PATTERN/.
1681 .Ip "?PATTERN?" 8 4
1682 This is just like the /pattern/ search, except that it matches only once between
1683 calls to the
1684 .I reset
1685 operator.
1686 This is a useful optimization when you only want to see the first occurrence of
1687 something in each file of a set of files, for instance.
1688 Only ?? patterns local to the current package are reset.
1689 .Ip "accept(NEWSOCKET,GENERICSOCKET)" 8 2
1690 Does the same thing that the accept system call does.
1691 Returns true if it succeeded, false otherwise.
1692 See example in section on Interprocess Communication.
1693 .Ip "alarm(SECONDS)" 8 4
1694 .Ip "alarm SECONDS" 8
1695 Arranges to have a SIGALRM delivered to this process after the specified number
1696 of seconds (minus 1, actually) have elapsed.  Thus, alarm(15) will cause
1697 a SIGALRM at some point more than 14 seconds in the future.
1698 Only one timer may be counting at once.  Each call disables the previous
1699 timer, and an argument of 0 may be supplied to cancel the previous timer
1700 without starting a new one.
1701 The returned value is the amount of time remaining on the previous timer.
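.Sp
A minimal sketch of starting and cancelling a timer, assuming no other
timer is already running in the process.
No signal handler is installed because the timer is cancelled long before
it could fire.
.nf
```perl
$previous  = alarm(30);     # start a timer; returns time left on any earlier one
$remaining = alarm(0);      # cancel it before it can fire

print "earlier timer had $previous seconds left\n";
print "cancelled with about $remaining seconds to go\n";
```
.fi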
1702 .Ip "atan2(Y,X)" 8 2
1703 Returns the arctangent of Y/X in the range
1704 .if t \-\(*p to \(*p.
1705 .if n \-PI to PI.
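.Sp
Because the signs of both arguments are taken into account, atan2 can tell
quadrants apart where a plain Y/X could not.
A small sketch, with illustrative variable names:
.nf
```perl
$pi     = 4 * atan2(1, 1);      # atan2(1,1) is a quarter of pi
$second = atan2(1, -1);         # second quadrant
$third  = atan2(-1, -1);        # third quadrant, same Y/X as the first

printf("pi is about %.5f\n", $pi);
printf("%.4f %.4f\n", $second, $third);
```
.fi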
1706 .Ip "bind(SOCKET,NAME)" 8 2
1707 Does the same thing that the bind system call does.
1708 Returns true if it succeeded, false otherwise.
1709 NAME should be a packed address of the proper type for the socket.
1710 See example in section on Interprocess Communication.
1711 .Ip "binmode(FILEHANDLE)" 8 4
1712 .Ip "binmode FILEHANDLE" 8 4
1713 Arranges for the file to be read in \*(L"binary\*(R" mode in operating systems
1714 that distinguish between binary and text files.
1715 Files that are not read in binary mode have CR LF sequences translated
1716 to LF on input and LF translated to CR LF on output.
1717 Binmode has no effect under Unix.
1718 If FILEHANDLE is an expression, the value is taken as the name of
1719 the filehandle.
1720 .Ip "caller(EXPR)"
1721 .Ip "caller"
1722 Returns the context of the current subroutine call:
1723 .nf
1724
1725         ($package,$filename,$line) = caller;
1726
1727 .fi
1728 With EXPR, returns some extra information that the debugger uses to print
1729 a stack trace.  The value of EXPR indicates how many call frames to go
1730 back before the current one.
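.Sp
A sketch of a subroutine reporting where it was called from.
The name \fIwhence\fR is hypothetical.
.nf
```perl
sub whence {
    local($package, $filename, $line) = caller;
    "$package line $line";      # value of the last expression is returned
}

$from = &whence();              # caller reports the line of this call
print "called from $from\n";
```
.fi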
1731 .Ip "chdir(EXPR)" 8 2
1732 .Ip "chdir EXPR" 8 2
1733 Changes the working directory to EXPR, if possible.
1734 If EXPR is omitted, changes to home directory.
1735 Returns 1 upon success, 0 otherwise.
1736 See example under
1737 .IR die .
1738 .Ip "chmod(LIST)" 8 2
1739 .Ip "chmod LIST" 8 2
1740 Changes the permissions of a list of files.
1741 The first element of the list must be the numerical mode.
1742 Returns the number of files successfully changed.
1743 .nf
1744
1745 .ne 2
1746         $cnt = chmod 0755, \'foo\', \'bar\';
1747         chmod 0755, @executables;
1748
1749 .fi
1750 .Ip "chop(LIST)" 8 7
1751 .Ip "chop(VARIABLE)" 8
1752 .Ip "chop VARIABLE" 8
1753 .Ip "chop" 8
1754 Chops off the last character of a string and returns the character chopped.
1755 It's used primarily to remove the newline from the end of an input record,
1756 but is much more efficient than s/\en// because it neither scans nor copies
1757 the string.
1758 If VARIABLE is omitted, chops $_.
1759 Example:
1760 .nf
1761
1762 .ne 5
1763         while (<>) {
1764                 chop;   # avoid \en on last field
1765                 @array = split(/:/);
1766                 .\|.\|.
1767         }
1768
1769 .fi
1770 You can actually chop anything that's an lvalue, including an assignment:
1771 .nf
1772
1773         chop($cwd = \`pwd\`);
1774         chop($answer = <STDIN>);
1775
1776 .fi
1777 If you chop a list, each element is chopped.
1778 Only the value of the last chop is returned.
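.Sp
A small sketch of chopping a list; the array contents are invented
for illustration.
.nf

```perl
        @words = ('cats', 'dogs', 'birdz');
        $last = chop(@words);   # every element loses its last character

        # @words is now ('cat', 'dog', 'bird').
        # $last holds only the character from the final chop, here 'z'.
```

.fi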
1779 .Ip "chown(LIST)" 8 2
1780 .Ip "chown LIST" 8 2
1781 Changes the owner (and group) of a list of files.
1782 The first two elements of the list must be the NUMERICAL uid and gid,
1783 in that order.
1784 Returns the number of files successfully changed.
1785 .nf
1786
1787 .ne 2
1788         $cnt = chown $uid, $gid, \'foo\', \'bar\';
1789         chown $uid, $gid, @filenames;
1790
1791 .fi
1792 .ne 23
1793 Here's an example of looking up non-numeric uids:
.nf

        print "User: ";
        $user = <STDIN>;
        chop($user);
        print "Files: ";
        $pattern = <STDIN>;
        chop($pattern);
.ie t \{\
        open(pass, \'/etc/passwd\') || die "Can't open passwd: $!\en";
'br\}
.el \{\
        open(pass, \'/etc/passwd\')
                || die "Can't open passwd: $!\en";
'br\}
        while (<pass>) {
                ($login,$pass,$uid,$gid) = split(/:/);
                $uid{$login} = $uid;
                $gid{$login} = $gid;
        }
        @ary = <${pattern}>;    # get filenames
        if ($uid{$user} eq \'\') {
                die "$user not in passwd file";
        }
        else {
                chown $uid{$user}, $gid{$user}, @ary;
        }

.fi
1823 .Ip "chroot(FILENAME)" 8 5
1824 .Ip "chroot FILENAME" 8
1825 Does the same as the system call of that name.
1826 If you don't know what it does, don't worry about it.
1827 If FILENAME is omitted, does chroot to $_.
1828 .Ip "close(FILEHANDLE)" 8 5
1829 .Ip "close FILEHANDLE" 8
1830 Closes the file or pipe associated with the file handle.
1831 You don't have to close FILEHANDLE if you are immediately going to
1832 do another open on it, since open will close it for you.
1833 (See
1834 .IR open .)
1835 However, an explicit close on an input file resets the line counter ($.), while
1836 the implicit close done by
1837 .I open
1838 does not.
1839 Also, closing a pipe will wait for the process executing on the pipe to complete,
1840 in case you want to look at the output of the pipe afterwards.
1841 Closing a pipe explicitly also puts the status value of the command into $?.
1842 Example:
1843 .nf
1844
1845 .ne 4
1846         open(OUTPUT, \'|sort >foo\');   # pipe to sort
1847         .\|.\|. # print stuff to output
1848         close OUTPUT;           # wait for sort to finish
1849         open(INPUT, \'foo\');   # get sort's results
1850
1851 .fi
1852 FILEHANDLE may be an expression whose value gives the real filehandle name.
1853 .Ip "closedir(DIRHANDLE)" 8 5
1854 .Ip "closedir DIRHANDLE" 8
1855 Closes a directory opened by opendir().
1856 .Ip "connect(SOCKET,NAME)" 8 2
1857 Does the same thing that the connect system call does.
1858 Returns true if it succeeded, false otherwise.
NAME should be a packed address of the proper type for the socket.
1860 See example in section on Interprocess Communication.
1861 .Ip "cos(EXPR)" 8 6
1862 .Ip "cos EXPR" 8 6
1863 Returns the cosine of EXPR (expressed in radians).
If EXPR is omitted, takes cosine of $_.
1865 .Ip "crypt(PLAINTEXT,SALT)" 8 6
1866 Encrypts a string exactly like the crypt() function in the C library.
1867 Useful for checking the password file for lousy passwords.
1868 Only the guys wearing white hats should do this.
1869 .Ip "dbmclose(ASSOC_ARRAY)" 8 6
1870 .Ip "dbmclose ASSOC_ARRAY" 8
1871 Breaks the binding between a dbm file and an associative array.
1872 The values remaining in the associative array are meaningless unless
1873 you happen to want to know what was in the cache for the dbm file.
1874 This function is only useful if you have ndbm.
1875 .Ip "dbmopen(ASSOC,DBNAME,MODE)" 8 6
1876 This binds a dbm or ndbm file to an associative array.
1877 ASSOC is the name of the associative array.
1878 (Unlike normal open, the first argument is NOT a filehandle, even though
1879 it looks like one).
1880 DBNAME is the name of the database (without the .dir or .pag extension).
1881 If the database does not exist, it is created with protection specified
1882 by MODE (as modified by the umask).
1883 If your system only supports the older dbm functions, you may only have one
1884 dbmopen in your program.
1885 If your system has neither dbm nor ndbm, calling dbmopen produces a fatal
1886 error.
1887 .Sp
1888 Values assigned to the associative array prior to the dbmopen are lost.
1889 A certain number of values from the dbm file are cached in memory.
1890 By default this number is 64, but you can increase it by preallocating
1891 that number of garbage entries in the associative array before the dbmopen.
1892 You can flush the cache if necessary with the reset command.
1893 .Sp
1894 If you don't have write access to the dbm file, you can only read
1895 associative array variables, not set them.
1896 If you want to test whether you can write, either use file tests or
1897 try setting a dummy array entry inside an eval, which will trap the error.
1898 .Sp
1899 Note that functions such as keys() and values() may return huge array values
1900 when used on large dbm files.
1901 You may prefer to use the each() function to iterate over large dbm files.
1902 Example:
1903 .nf
1904
1905 .ne 6
1906         # print out history file offsets
1907         dbmopen(HIST,'/usr/lib/news/history',0666);
1908         while (($key,$val) = each %HIST) {
1909                 print $key, ' = ', unpack('L',$val), "\en";
1910         }
1911         dbmclose(HIST);
1912
1913 .fi
1914 .Ip "defined(EXPR)" 8 6
1915 .Ip "defined EXPR" 8
1916 Returns a boolean value saying whether the lvalue EXPR has a real value
1917 or not.
1918 Many operations return the undefined value under exceptional conditions,
1919 such as end of file, uninitialized variable, system error and such.
1920 This function allows you to distinguish between an undefined null string
1921 and a defined null string with operations that might return a real null
1922 string, in particular referencing elements of an array.
1923 You may also check to see if arrays or subroutines exist.
1924 Use on predefined variables is not guaranteed to produce intuitive results.
1925 Examples:
1926 .nf
1927
1928 .ne 7
1929         print if defined $switch{'D'};
1930         print "$val\en" while defined($val = pop(@ary));
1931         die "Can't readlink $sym: $!"
1932                 unless defined($value = readlink $sym);
1933         eval '@foo = ()' if defined(@foo);
1934         die "No XYZ package defined" unless defined %_XYZ;
1935         sub foo { defined &bar ? &bar(@_) : die "No bar"; }
1936
1937 .fi
1938 See also undef.
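.Sp
The difference between a defined null string and an undefined value can be
sketched like this.
The array %switch and its contents are invented for the example.
.nf

```perl
        %switch = ('D', '', 'v', 1);

        $has_D  = defined $switch{'D'} ? 1 : 0; # 1: present, though null
        $has_x  = defined $switch{'x'} ? 1 : 0; # 0: never set
        $D_true = $switch{'D'} ? 1 : 0;         # 0: a null string is false
```

.fi
A plain truth test cannot tell the 'D' and 'x' cases apart; defined can.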
1939 .Ip "delete $ASSOC{KEY}" 8 6
1940 Deletes the specified value from the specified associative array.
1941 Returns the deleted value, or the undefined value if nothing was deleted.
1942 Deleting from $ENV{} modifies the environment.
1943 Deleting from an array bound to a dbm file deletes the entry from the dbm
1944 file.
1945 .Sp
1946 The following deletes all the values of an associative array:
1947 .nf
1948
1949 .ne 3
1950         foreach $key (keys %ARRAY) {
1951                 delete $ARRAY{$key};
1952         }
1953
1954 .fi
1955 (But it would be faster to use the
1956 .I reset
1957 command.
1958 Saying undef %ARRAY is faster yet.)
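.Sp
A short sketch of the return value, using a made-up array %color:
.nf

```perl
        %color = ('apple', 'red', 'pear', 'green');

        $gone = delete $color{'apple'};         # returns 'red'
        $none = delete $color{'plum'};          # nothing deleted: undefined
        @left = keys %color;                    # only 'pear' remains
```

.fi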
1959 .Ip "die(LIST)" 8
1960 .Ip "die LIST" 8
1961 Outside of an eval, prints the value of LIST to
1962 .I STDERR
1963 and exits with the current value of $!
1964 (errno).
1965 If $! is 0, exits with the value of ($? >> 8) (\`command\` status).
1966 If ($? >> 8) is 0, exits with 255.
1967 Inside an eval, the error message is stuffed into $@ and the eval is terminated
1968 with the undefined value.
1969 .Sp
1970 Equivalent examples:
.nf

.ne 3
.ie t \{\
        die "Can't cd to spool: $!\en" unless chdir \'/usr/spool/news\';
'br\}
.el \{\
        die "Can't cd to spool: $!\en"
                unless chdir \'/usr/spool/news\';
'br\}

        chdir \'/usr/spool/news\' || die "Can't cd to spool: $!\en";

.fi
1985 .Sp
If the value of LIST does not end in a newline, the current script line
1987 number and input line number (if any) are also printed, and a newline is
1988 supplied.
1989 Hint: sometimes appending \*(L", stopped\*(R" to your message will cause it to make
1990 better sense when the string \*(L"at foo line 123\*(R" is appended.
1991 Suppose you are running script \*(L"canasta\*(R".
1992 .nf
1993
1994 .ne 7
1995         die "/etc/games is no good";
1996         die "/etc/games is no good, stopped";
1997
1998 produce, respectively
1999
2000         /etc/games is no good at canasta line 123.
2001         /etc/games is no good, stopped at canasta line 123.
2002
2003 .fi
2004 See also
2005 .IR exit .
2006 .Ip "do BLOCK" 8 4
2007 Returns the value of the last command in the sequence of commands indicated
2008 by BLOCK.
2009 When modified by a loop modifier, executes the BLOCK once before testing the
2010 loop condition.
2011 (On other statements the loop modifiers test the conditional first.)
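.Sp
The difference in when the condition is tested can be sketched as follows.
The counters are illustrative only.
.nf

```perl
        # do BLOCK with a modifier: the block runs once before the test.
        $i = 10;
        $ran = 0;
        do { $ran++; $i++; } while ($i < 5);
        # $ran is 1 even though the condition was never true.

        # An ordinary statement with a modifier: the test comes first.
        $j = 10;
        $skipped = 0;
        $skipped++ while ($j < 5);
        # $skipped is still 0.
```

.fi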
2012 .Ip "do SUBROUTINE (LIST)" 8 3
2013 Executes a SUBROUTINE declared by a
2014 .I sub
2015 declaration, and returns the value
2016 of the last expression evaluated in SUBROUTINE.
2017 If there is no subroutine by that name, produces a fatal error.
2018 (You may use the \*(L"defined\*(R" operator to determine if a subroutine
2019 exists.)
2020 If you pass arrays as part of LIST you may wish to pass the length
2021 of the array in front of each array.
2022 (See the section on subroutines later on.)
2023 SUBROUTINE may be a scalar variable, in which case the variable contains
2024 the name of the subroutine to execute.
2025 The parentheses are required to avoid confusion with the \*(L"do EXPR\*(R"
2026 form.
2027 .Sp
2028 As an alternate form, you may call a subroutine by prefixing the name with
2029 an ampersand: &foo(@args).
2030 If you aren't passing any arguments, you don't have to use parentheses.
2031 If you omit the parentheses, no @_ array is passed to the subroutine.
2032 The & form is also used to specify subroutines to the defined and undef
2033 operators.
2034 .Ip "do EXPR" 8 3
2035 Uses the value of EXPR as a filename and executes the contents of the file
2036 as a
2037 .I perl
2038 script.
2039 Its primary use is to include subroutines from a
2040 .I perl
2041 subroutine library.
2042 .nf
2043
2044         do \'stat.pl\';
2045
2046 is just like
2047
2048         eval \`cat stat.pl\`;
2049
2050 .fi
2051 except that it's more efficient, more concise, keeps track of the current
2052 filename for error messages, and searches all the
2053 .B \-I
2054 libraries if the file
2055 isn't in the current directory (see also the @INC array in Predefined Names).
2056 It's the same, however, in that it does reparse the file every time you
2057 call it, so if you are going to use the file inside a loop you might prefer
2058 to use \-P and #include, at the expense of a little more startup time.
2059 (The main problem with #include is that cpp doesn't grok # comments\*(--a
2060 workaround is to use \*(L";#\*(R" for standalone comments.)
2061 Note that the following are NOT equivalent:
2062 .nf
2063
2064 .ne 2
2065         do $foo;        # eval a file
2066         do $foo();      # call a subroutine
2067
2068 .fi
2069 Note that inclusion of library routines is better done with
2070 the \*(L"require\*(R" operator.
2071 .Ip "dump LABEL" 8 6
2072 This causes an immediate core dump.
2073 Primarily this is so that you can use the undump program to turn your
2074 core dump into an executable binary after having initialized all your
2075 variables at the beginning of the program.
When the new binary is executed it will begin by executing a \*(L"goto LABEL\*(R"
2077 (with all the restrictions that goto suffers).
2078 Think of it as a goto with an intervening core dump and reincarnation.
2079 If LABEL is omitted, restarts the program from the top.
2080 WARNING: any files opened at the time of the dump will NOT be open any more
2081 when the program is reincarnated, with possible resulting confusion on the part
2082 of perl.
2083 See also \-u.
2084 .Sp
2085 Example:
2086 .nf
2087
2088 .ne 16
2089         #!/usr/bin/perl
2090         require 'getopt.pl';
2091         require 'stat.pl';
2092         %days = (
2093             'Sun',1,
2094             'Mon',2,
2095             'Tue',3,
2096             'Wed',4,
2097             'Thu',5,
2098             'Fri',6,
2099             'Sat',7);
2100
2101         dump QUICKSTART if $ARGV[0] eq '-d';
2102
2103     QUICKSTART:
2104         do Getopt('f');
2105
2106 .fi
2107 .Ip "each(ASSOC_ARRAY)" 8 6
2108 .Ip "each ASSOC_ARRAY" 8
Returns a 2-element array consisting of the key and value for the next
2110 value of an associative array, so that you can iterate over it.
2111 Entries are returned in an apparently random order.
2112 When the array is entirely read, a null array is returned (which when
2113 assigned produces a FALSE (0) value).
2114 The next call to each() after that will start iterating again.
2115 The iterator can be reset only by reading all the elements from the array.
2116 You must not modify the array while iterating over it.
2117 There is a single iterator for each associative array, shared by all
2118 each(), keys() and values() function calls in the program.
2119 The following prints out your environment like the printenv program, only
2120 in a different order:
2121 .nf
2122
2123 .ne 3
2124         while (($key,$value) = each %ENV) {
2125                 print "$key=$value\en";
2126         }
2127
2128 .fi
2129 See also keys() and values().
2130 .Ip "eof(FILEHANDLE)" 8 8
2131 .Ip "eof()" 8
2132 .Ip "eof" 8
2133 Returns 1 if the next read on FILEHANDLE will return end of file, or if
2134 FILEHANDLE is not open.
2135 FILEHANDLE may be an expression whose value gives the real filehandle name.
2136 (Note that this function actually reads a character and then ungetc's it,
2137 so it is not very useful in an interactive context.)
2138 An eof without an argument returns the eof status for the last file read.
2139 Empty parentheses () may be used to indicate the pseudo file formed of the
2140 files listed on the command line, i.e. eof() is reasonable to use inside
2141 a while (<>) loop to detect the end of only the last file.
2142 Use eof(ARGV) or eof without the parentheses to test EACH file in a while (<>) loop.
2143 Examples:
2144 .nf
2145
2146 .ne 7
2147         # insert dashes just before last line of last file
2148         while (<>) {
2149                 if (eof()) {
2150                         print "\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\en";
2151                 }
2152                 print;
2153         }
2154
2155 .ne 7
2156         # reset line numbering on each input file
2157         while (<>) {
2158                 print "$.\et$_";
2159                 if (eof) {      # Not eof().
2160                         close(ARGV);
2161                 }
2162         }
2163
2164 .fi
2165 .Ip "eval(EXPR)" 8 6
2166 .Ip "eval EXPR" 8 6
2167 EXPR is parsed and executed as if it were a little
2168 .I perl
2169 program.
2170 It is executed in the context of the current
2171 .I perl
2172 program, so that
2173 any variable settings, subroutine or format definitions remain afterwards.
2174 The value returned is the value of the last expression evaluated, just
2175 as with subroutines.
2176 If there is a syntax error or runtime error, or a die statement is
2177 executed, an undefined value is returned by
2178 eval, and $@ is set to the error message.
2179 If there was no error, $@ is guaranteed to be a null string.
2180 If EXPR is omitted, evaluates $_.
2181 The final semicolon, if any, may be omitted from the expression.
2182 .Sp
2183 Note that, since eval traps otherwise-fatal errors, it is useful for
2184 determining whether a particular feature
2185 (such as dbmopen or symlink) is implemented.
2186 It is also Perl's exception trapping mechanism, where the die operator is
2187 used to raise exceptions.
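.Sp
A minimal sketch of trapping an exception and checking $@ afterwards.
The message text and variable names are invented for the example.
.nf

```perl
        # A die inside the eval ends the eval, not the program.
        $ok = eval 'die "bad input, stopped" if 1; 1;';
        $err = $@;              # save it: the next eval resets $@

        # A successful eval returns its last value and clears $@.
        $sum = eval '2 + 3';
        $clean = ($@ eq '') ? 1 : 0;
```

.fi
Here $ok is undefined, $err begins with the die message followed by the
location inside the eval, $sum is 5 and $clean is 1.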
2188 .Ip "exec(LIST)" 8 8
2189 .Ip "exec LIST" 8 6
2190 If there is more than one argument in LIST, or if LIST is an array with
2191 more than one value,
2192 calls execvp() with the arguments in LIST.
2193 If there is only one scalar argument, the argument is checked for shell metacharacters.
2194 If there are any, the entire argument is passed to \*(L"/bin/sh \-c\*(R" for parsing.
2195 If there are none, the argument is split into words and passed directly to
2196 execvp(), which is more efficient.
2197 Note: exec (and system) do not flush your output buffer, so you may need to
2198 set $| to avoid lost output.
2199 Examples:
2200 .nf
2201
2202         exec \'/bin/echo\', \'Your arguments are: \', @ARGV;
2203         exec "sort $outfile | uniq";
2204
2205 .fi
2206 .Sp
2207 If you don't really want to execute the first argument, but want to lie
2208 to the program you are executing about its own name, you can specify
2209 the program you actually want to run by assigning that to a variable and
2210 putting the name of the variable in front of the LIST without a comma.
2211 (This always forces interpretation of the LIST as a multi-valued list, even
2212 if there is only a single scalar in the list.)
2213 Example:
2214 .nf
2215
2216 .ne 2
2217         $shell = '/bin/csh';
2218         exec $shell '-sh';              # pretend it's a login shell
2219
2220 .fi
2221 .Ip "exit(EXPR)" 8 6
2222 .Ip "exit EXPR" 8
2223 Evaluates EXPR and exits immediately with that value.
2224 Example:
2225 .nf
2226
2227 .ne 2
2228         $ans = <STDIN>;
2229         exit 0 \|if \|$ans \|=~ \|/\|^[Xx]\|/\|;
2230
2231 .fi
2232 See also
2233 .IR die .
2234 If EXPR is omitted, exits with 0 status.
2235 .Ip "exp(EXPR)" 8 3
2236 .Ip "exp EXPR" 8
2237 Returns
2238 .I e
2239 to the power of EXPR.
2240 If EXPR is omitted, gives exp($_).
2241 .Ip "fcntl(FILEHANDLE,FUNCTION,SCALAR)" 8 4
2242 Implements the fcntl(2) function.
2243 You'll probably have to say
2244 .nf
2245
2246         require "fcntl.ph";     # probably /usr/local/lib/perl/fcntl.ph
2247
2248 .fi
2249 first to get the correct function definitions.
2250 If fcntl.ph doesn't exist or doesn't have the correct definitions
2251 you'll have to roll
2252 your own, based on your C header files such as <sys/fcntl.h>.
2253 (There is a perl script called h2ph that comes with the perl kit
2254 which may help you in this.)
Argument processing and value return work just like ioctl below.
2256 Note that fcntl will produce a fatal error if used on a machine that doesn't implement
2257 fcntl(2).
2258 .Ip "fileno(FILEHANDLE)" 8 4
2259 .Ip "fileno FILEHANDLE" 8 4
2260 Returns the file descriptor for a filehandle.
2261 Useful for constructing bitmaps for select().
2262 If FILEHANDLE is an expression, the value is taken as the name of
2263 the filehandle.
2264 .Ip "flock(FILEHANDLE,OPERATION)" 8 4
2265 Calls flock(2) on FILEHANDLE.
2266 See manual page for flock(2) for definition of OPERATION.
2267 Returns true for success, false on failure.
2268 Will produce a fatal error if used on a machine that doesn't implement
2269 flock(2).
2270 Here's a mailbox appender for BSD systems.
2271 .nf
2272
2273 .ne 20
2274         $LOCK_SH = 1;
2275         $LOCK_EX = 2;
2276         $LOCK_NB = 4;
2277         $LOCK_UN = 8;
2278
2279         sub lock {
2280             flock(MBOX,$LOCK_EX);
2281             # and, in case someone appended
2282             # while we were waiting...
2283             seek(MBOX, 0, 2);
2284         }
2285
2286         sub unlock {
2287             flock(MBOX,$LOCK_UN);
2288         }
2289
2290         open(MBOX, ">>/usr/spool/mail/$ENV{'USER'}")
2291                 || die "Can't open mailbox: $!";
2292
2293         do lock();
2294         print MBOX $msg,"\en\en";
2295         do unlock();
2296
2297 .fi
2298 .Ip "fork" 8 4
2299 Does a fork() call.
2300 Returns the child pid to the parent process and 0 to the child process.
2301 Note: unflushed buffers remain unflushed in both processes, which means
2302 you may need to set $| to avoid duplicate output.
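.Sp
A sketch of the usual parent and child split.
It assumes that fork returns the undefined value when it fails, and that
wait returns the pid of the finished child and leaves its status in $?;
neither detail is stated in the entry above.
.nf

```perl
        $| = 1;                 # unbuffered, so nothing is printed twice
        $pid = fork;
        die "Can't fork: $!" unless defined $pid;

        if ($pid == 0) {
                exit 3;         # child: finish at once with status 3
        }

        $reaped = wait;         # parent: collect the child
        $status = $? >> 8;      # exit value is in the high byte
```

.fi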
2303 .Ip "getc(FILEHANDLE)" 8 4
2304 .Ip "getc FILEHANDLE" 8
2305 .Ip "getc" 8
2306 Returns the next character from the input file attached to FILEHANDLE, or
2307 a null string at EOF.
2308 If FILEHANDLE is omitted, reads from STDIN.
2309 .Ip "getlogin" 8 3
2310 Returns the current login from /etc/utmp, if any.
If null, use getpwuid:
.nf

        $login = getlogin || (getpwuid($<))[0] || "Somebody";

.fi
2315 .Ip "getpeername(SOCKET)" 8 3
Returns the packed sockaddr address of the other end of the SOCKET connection.
2317 .nf
2318
2319 .ne 4
2320         # An internet sockaddr
2321         $sockaddr = 'S n a4 x8';
2322         $hersockaddr = getpeername(S);
2323 .ie t \{\
2324         ($family, $port, $heraddr) = unpack($sockaddr,$hersockaddr);
2325 'br\}
2326 .el \{\
2327         ($family, $port, $heraddr) =
2328                         unpack($sockaddr,$hersockaddr);
2329 'br\}
2330
2331 .fi
2332 .Ip "getpgrp(PID)" 8 4
2333 .Ip "getpgrp PID" 8
Returns the current process group for the specified PID; use a PID of 0 for
the current process.
Will produce a fatal error if used on a machine that doesn't implement
getpgrp(2).
If PID is omitted, returns the process group of the current process.
2339 .Ip "getppid" 8 4
2340 Returns the process id of the parent process.
2341 .Ip "getpriority(WHICH,WHO)" 8 4
2342 Returns the current priority for a process, a process group, or a user.
2343 (See getpriority(2).)
2344 Will produce a fatal error if used on a machine that doesn't implement
2345 getpriority(2).
2346 .Ip "getpwnam(NAME)" 8
2347 .Ip "getgrnam(NAME)" 8
2348 .Ip "gethostbyname(NAME)" 8
2349 .Ip "getnetbyname(NAME)" 8
2350 .Ip "getprotobyname(NAME)" 8
2351 .Ip "getpwuid(UID)" 8
2352 .Ip "getgrgid(GID)" 8
2353 .Ip "getservbyname(NAME,PROTO)" 8
2354 .Ip "gethostbyaddr(ADDR,ADDRTYPE)" 8
2355 .Ip "getnetbyaddr(ADDR,ADDRTYPE)" 8
2356 .Ip "getprotobynumber(NUMBER)" 8
2357 .Ip "getservbyport(PORT,PROTO)" 8
2358 .Ip "getpwent" 8
2359 .Ip "getgrent" 8
2360 .Ip "gethostent" 8
2361 .Ip "getnetent" 8
2362 .Ip "getprotoent" 8
2363 .Ip "getservent" 8
2364 .Ip "setpwent" 8
2365 .Ip "setgrent" 8
2366 .Ip "sethostent(STAYOPEN)" 8
2367 .Ip "setnetent(STAYOPEN)" 8
2368 .Ip "setprotoent(STAYOPEN)" 8
2369 .Ip "setservent(STAYOPEN)" 8
2370 .Ip "endpwent" 8
2371 .Ip "endgrent" 8
2372 .Ip "endhostent" 8
2373 .Ip "endnetent" 8
2374 .Ip "endprotoent" 8
2375 .Ip "endservent" 8
2376 These routines perform the same functions as their counterparts in the
2377 system library.
2378 The return values from the various get routines are as follows:
2379 .nf
2380
2381         ($name,$passwd,$uid,$gid,
2382            $quota,$comment,$gcos,$dir,$shell) = getpw.\|.\|.
2383         ($name,$passwd,$gid,$members) = getgr.\|.\|.
2384         ($name,$aliases,$addrtype,$length,@addrs) = gethost.\|.\|.
2385         ($name,$aliases,$addrtype,$net) = getnet.\|.\|.
2386         ($name,$aliases,$proto) = getproto.\|.\|.
2387         ($name,$aliases,$port,$proto) = getserv.\|.\|.
2388
2389 .fi
2390 The $members value returned by getgr.\|.\|. is a space separated list
2391 of the login names of the members of the group.
2392 .Sp
2393 The @addrs value returned by the gethost.\|.\|. functions is a list of the
2394 raw addresses returned by the corresponding system library call.
2395 In the Internet domain, each address is four bytes long and you can unpack
2396 it by saying something like:
2397 .nf
2398
        ($a,$b,$c,$d) = unpack('C4',$addrs[0]);
2400
2401 .fi
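.Sp
To turn such a raw address into the familiar dotted form, something like the
following sketch should do.
The address is built with pack here as a stand-in for a value returned by
gethostbyname, so that the example needs no network.
.nf

```perl
        $addr = pack('C4', 127, 0, 0, 1);       # stand-in for a real address
        ($a, $b, $c, $d) = unpack('C4', $addr);
        $dotted = join('.', $a, $b, $c, $d);    # '127.0.0.1'
```

.fi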
2402 .Ip "getsockname(SOCKET)" 8 3
2403 Returns the packed sockaddr address of this end of the SOCKET connection.
2404 .nf
2405
2406 .ne 4
2407         # An internet sockaddr
2408         $sockaddr = 'S n a4 x8';
2409         $mysockaddr = getsockname(S);
2410 .ie t \{\
2411         ($family, $port, $myaddr) = unpack($sockaddr,$mysockaddr);
2412 'br\}
2413 .el \{\
2414         ($family, $port, $myaddr) =
2415                         unpack($sockaddr,$mysockaddr);
2416 'br\}
2417
2418 .fi
2419 .Ip "getsockopt(SOCKET,LEVEL,OPTNAME)" 8 3
2420 Returns the socket option requested, or undefined if there is an error.
2421 .Ip "gmtime(EXPR)" 8 4
2422 .Ip "gmtime EXPR" 8
2423 Converts a time as returned by the time function to a 9-element array with
2424 the time analyzed for the Greenwich timezone.
2425 Typically used as follows:
2426 .nf
2427
2428 .ne 3
2429 .ie t \{\
2430     ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime(time);
2431 'br\}
2432 .el \{\
2433     ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
2434                                                 gmtime(time);
2435 'br\}
2436
2437 .fi
2438 All array elements are numeric, and come straight out of a struct tm.
2439 In particular this means that $mon has the range 0.\|.11 and $wday has the
2440 range 0.\|.6.
2441 If EXPR is omitted, does gmtime(time).
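.Sp
Because the values come straight from a struct tm, $year counts from 1900
and $mon from 0, so both need adjusting before printing.
A sketch, assuming the usual Unix convention that time 0 is the start of
1970 GMT:
.nf

```perl
        ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime(0);

        $stamp = sprintf("%04d-%02d-%02d %02d:%02d:%02d",
                $year + 1900, $mon + 1, $mday, $hour, $min, $sec);
        # $stamp is '1970-01-01 00:00:00' and $wday is 4 (Thursday)
```

.fi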
2442 .Ip "goto LABEL" 8 6
2443 Finds the statement labeled with LABEL and resumes execution there.
2444 Currently you may only go to statements in the main body of the program
2445 that are not nested inside a do {} construct.
2446 This statement is not implemented very efficiently, and is here only to make
2447 the
2448 .IR sed -to- perl
2449 translator easier.
2450 I may change its semantics at any time, consistent with support for translated
2451 .I sed
2452 scripts.
2453 Use it at your own risk.
2454 Better yet, don't use it at all.
2455 .Ip "grep(EXPR,LIST)" 8 4
2456 Evaluates EXPR for each element of LIST (locally setting $_ to each element)
2457 and returns the array value consisting of those elements for which the
2458 expression evaluated to true.
2459 In a scalar context, returns the number of times the expression was true.
2460 .nf
2461
2462         @foo = grep(!/^#/, @bar);    # weed out comments
2463
2464 .fi
2465 Note that, since $_ is a reference into the array value, it can be
2466 used to modify the elements of the array.
2467 While this is useful and supported, it can cause bizarre results if
2468 the LIST is not a named array.
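.Sp
The three uses described above can be sketched together.
The arrays are invented for the example.
.nf

```perl
        @bar = ('# header', 'alpha', '# note', 'beta');

        @foo = grep(!/^#/, @bar);       # array context: ('alpha', 'beta')
        $comments = grep(/^#/, @bar);   # scalar context: 2

        # $_ refers to the real element, so a named array can be
        # modified in place.
        @sizes = (1, 2, 3);
        grep($_ *= 10, @sizes);         # @sizes is now (10, 20, 30)
```

.fi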
2469 .Ip "hex(EXPR)" 8 4
2470 .Ip "hex EXPR" 8
Returns the decimal value of EXPR interpreted as a hex string.
2472 (To interpret strings that might start with 0 or 0x see oct().)
2473 If EXPR is omitted, uses $_.
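.Sp
A sketch of when to reach for each function; the variable names are
illustrative.
.nf

```perl
        $byte = hex('ff');      # bare hex digits: 255
        $mask = oct('0x1f');    # a leading 0x makes oct read hex: 31
        $mode = oct('755');     # otherwise oct reads octal: 493
```

.fi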
2474 .Ip "index(STR,SUBSTR,POSITION)" 8 4
2475 .Ip "index(STR,SUBSTR)" 8 4
2476 Returns the position of the first occurrence of SUBSTR in STR at or after
2477 POSITION.
2478 If POSITION is omitted, starts searching from the beginning of the string.
2479 The return value is based at 0, or whatever you've
2480 set the $[ variable to.
2481 If the substring is not found, returns one less than the base, ordinarily \-1.
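.Sp
A sketch of stepping through successive matches with POSITION, assuming $[
has been left at its default of 0:
.nf

```perl
        $str = 'hello world';

        $first  = index($str, 'o');                     # 4
        $second = index($str, 'o', $first + 1);         # 7
        $none   = index($str, 'z');                     # -1: not found
```

.fi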
2482 .Ip "int(EXPR)" 8 4
2483 .Ip "int EXPR" 8
2484 Returns the integer portion of EXPR.
2485 If EXPR is omitted, uses $_.
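.Sp
This sketch assumes that \*(L"integer portion\*(R" means the fraction is simply
discarded, so that negative numbers move toward zero rather than down.
.nf

```perl
        $up   = int(7.9);               # 7
        $down = int(-7.9);              # -7, not -8
        $near = int(7.9 + 0.5);         # 8: one way to round a positive number
```

.fi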
2486 .Ip "ioctl(FILEHANDLE,FUNCTION,SCALAR)" 8 4
2487 Implements the ioctl(2) function.
2488 You'll probably have to say
2489 .nf
2490
2491         require "ioctl.ph";     # probably /usr/local/lib/perl/ioctl.ph
2492
2493 .fi
2494 first to get the correct function definitions.
2495 If ioctl.ph doesn't exist or doesn't have the correct definitions
2496 you'll have to roll
2497 your own, based on your C header files such as <sys/ioctl.h>.
2498 (There is a perl script called h2ph that comes with the perl kit
2499 which may help you in this.)
2500 SCALAR will be read and/or written depending on the FUNCTION\*(--a pointer
2501 to the string value of SCALAR will be passed as the third argument of
2502 the actual ioctl call.
2503 (If SCALAR has no string value but does have a numeric value, that value
2504 will be passed rather than a pointer to the string value.
2505 To guarantee this to be true, add a 0 to the scalar before using it.)
2506 The pack() and unpack() functions are useful for manipulating the values
2507 of structures used by ioctl().
2508 The following example sets the erase character to DEL.
2509 .nf
2510
2511 .ne 9
2512         require 'ioctl.ph';
2513         $sgttyb_t = "ccccs";            # 4 chars and a short
2514         if (ioctl(STDIN,$TIOCGETP,$sgttyb)) {
2515                 @ary = unpack($sgttyb_t,$sgttyb);
2516                 $ary[2] = 127;
2517                 $sgttyb = pack($sgttyb_t,@ary);
2518                 ioctl(STDIN,$TIOCSETP,$sgttyb)
2519                         || die "Can't ioctl: $!";
2520         }
2521
2522 .fi
2523 The return value of ioctl (and fcntl) is as follows:
2524 .nf
2525
2526 .ne 4
2527         if OS returns:\h'|3i'perl returns:
2528           -1\h'|3i'  undefined value
2529           0\h'|3i'  string "0 but true"
2530           anything else\h'|3i'  that number
2531
2532 .fi
2533 Thus perl returns true on success and false on failure, yet you can still
2534 easily determine the actual value returned by the operating system:
2535 .nf
2536
2537         ($retval = ioctl(...)) || ($retval = -1);
2538         printf "System returned %d\en", $retval;
2539 .fi
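.Sp
The point of the odd string in the table above can be seen in a small
sketch.
Here $retval is set by hand to stand in for what ioctl would return when
the operating system returns 0.
.nf

```perl
        $retval = '0 but true';         # stand-in for a successful ioctl

        $succeeded = $retval ? 1 : 0;   # 1: as a string it counts as true
        $number    = $retval + 0;       # 0: as a number it is still zero
```

.fi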
2540 .Ip "join(EXPR,LIST)" 8 8
2541 .Ip "join(EXPR,ARRAY)" 8
2542 Joins the separate strings of LIST or ARRAY into a single string with fields
2543 separated by the value of EXPR, and returns the string.
2544 Example:
2545 .nf
2546
2547 .ie t \{\
2548     $_ = join(\|\':\', $login,$passwd,$uid,$gid,$gcos,$home,$shell);
2549 'br\}
2550 .el \{\
2551     $_ = join(\|\':\',
2552                 $login,$passwd,$uid,$gid,$gcos,$home,$shell);
2553 'br\}
2554
2555 .fi
2556 See
2557 .IR split .
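.Sp
A sketch of join and split undoing each other; the field values are
invented.
.nf

```perl
        $line = join(':', 'guest', '*', 101, 10);       # 'guest:*:101:10'
        @fields = split(/:/, $line);                    # back to 4 fields
```

.fi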
2558 .Ip "keys(ASSOC_ARRAY)" 8 6
2559 .Ip "keys ASSOC_ARRAY" 8
2560 Returns a normal array consisting of all the keys of the named associative
2561 array.
2562 The keys are returned in an apparently random order, but it is the same order
2563 as either the values() or each() function produces (given that the associative array
2564 has not been modified).
2565 Here is yet another way to print your environment:
2566 .nf
2567
2568 .ne 5
2569         @keys = keys %ENV;
2570         @values = values %ENV;
2571         while ($#keys >= 0) {
2572                 print pop(@keys), \'=\', pop(@values), "\en";
2573         }
2574
2575 or how about sorted by key:
2576
2577 .ne 3
2578         foreach $key (sort(keys %ENV)) {
2579                 print $key, \'=\', $ENV{$key}, "\en";
2580         }
2581
2582 .fi
2583 .Ip "kill(LIST)" 8 8
2584 .Ip "kill LIST" 8 2
2585 Sends a signal to a list of processes.
2586 The first element of the list must be the signal to send.
2587 Returns the number of processes successfully signaled.
2588 .nf
2589
2590         $cnt = kill 1, $child1, $child2;
2591         kill 9, @goners;
2592
2593 .fi
2594 If the signal is negative, kills process groups instead of processes.
2595 (On System V, a negative \fIprocess\fR number will also kill process groups,
2596 but that's not portable.)
2597 You may use a signal name in quotes.
2598 .Ip "last LABEL" 8 8
2599 .Ip "last" 8
2600 The
2601 .I last
2602 command is like the
2603 .I break
2604 statement in C (as used in loops); it immediately exits the loop in question.
2605 If the LABEL is omitted, the command refers to the innermost enclosing loop.
2606 The
2607 .I continue
2608 block, if any, is not executed:
2609 .nf
2610
2611 .ne 4
2612         line: while (<STDIN>) {
2613                 last line if /\|^$/;    # exit when done with header
2614                 .\|.\|.
2615         }
2616
2617 .fi
2618 .Ip "length(EXPR)" 8 4
2619 .Ip "length EXPR" 8
2620 Returns the length in characters of the value of EXPR.
2621 If EXPR is omitted, returns length of $_.
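.Sp
A minimal sketch of both forms follows; the variable names are made up
for illustration.
.nf

```perl
# Explicit argument.
$name = "perl";
$n = length($name);             # 4

# No argument: measures $_ instead.
$_ = "Practical Extraction";
$m = length;                    # 20

print "$n $m\n";
```

.fi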
2622 .Ip "link(OLDFILE,NEWFILE)" 8 2
2623 Creates a new filename linked to the old filename.
2624 Returns 1 for success, 0 otherwise.
2625 .Ip "listen(SOCKET,QUEUESIZE)" 8 2
2626 Does the same thing that the listen system call does.
2627 Returns true if it succeeded, false otherwise.
2628 See example in section on Interprocess Communication.
2629 .Ip "local(LIST)" 8 4
2630 Declares the listed variables to be local to the enclosing block,
2631 subroutine, eval or \*(L"do\*(R".
2632 All the listed elements must be legal lvalues.
2633 This operator works by saving the current values of those variables in LIST
2634 on a hidden stack and restoring them upon exiting the block, subroutine or eval.
2635 This means that called subroutines can also reference the local variable,
2636 but not the global one.
2637 The LIST may be assigned to if desired, which allows you to initialize
2638 your local variables.
2639 (If no initializer is given for a particular variable, it is created with
2640 an undefined value.)
2641 Commonly this is used to name the parameters to a subroutine.
2642 Examples:
2643 .nf
2644
2645 .ne 13
2646         sub RANGEVAL {
2647                 local($min, $max, $thunk) = @_;
2648                 local($result) = \'\';
2649                 local($i);
2650
2651                 # Presumably $thunk makes reference to $i
2652
2653                 for ($i = $min; $i < $max; $i++) {
2654                         $result .= eval $thunk;
2655                 }
2656
2657                 $result;
2658         }
2659
2660 .ne 6
2661         if ($sw eq \'-v\') {
2662             # init local array with global array
2663             local(@ARGV) = @ARGV;
2664             unshift(@ARGV,\'echo\');
2665             system @ARGV;
2666         }
2667         # @ARGV restored
2668
2669 .ne 6
2670         # temporarily add to digits associative array
2671         if ($base12) {
2672                 # (NOTE: not claiming this is efficient!)
2673                 local(%digits) = (%digits,'t',10,'e',11);
2674                 do parse_num();
2675         }
2676
2677 .fi
2678 Note that local() is a run-time command, and so gets executed every time
2679 through a loop, using up more stack storage each time until it's all
2680 released at once when the loop is exited.
2681 .Ip "localtime(EXPR)" 8 4
2682 .Ip "localtime EXPR" 8
2683 Converts a time as returned by the time function to a 9-element array with
2684 the time analyzed for the local timezone.
2685 Typically used as follows:
2686 .nf
2687
2688 .ne 3
2689 .ie t \{\
2690     ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
2691 'br\}
2692 .el \{\
2693     ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
2694                                                 localtime(time);
2695 'br\}
2696
2697 .fi
2698 All array elements are numeric, and come straight out of a struct tm.
2699 In particular this means that $mon has the range 0.\|.11 and $wday has the
2700 range 0.\|.6.
2701 If EXPR is omitted, does localtime(time).
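.Sp
Because the fields come straight from a struct tm, $mon needs 1 added and
$year (which counts from 1900) needs 1900 added before printing.
The sketch below shows one way to build a readable timestamp; the names
$stamp and @dayname are illustrative assumptions.
.nf

```perl
# Turn the struct tm fields into a conventional date string.
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);

$stamp = sprintf("%04d-%02d-%02d %02d:%02d:%02d",
    $year + 1900,               # tm_year is years since 1900
    $mon + 1,                   # tm_mon runs 0..11
    $mday, $hour, $min, $sec);

@dayname = ('Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat');
print "$dayname[$wday] $stamp\n";
```

.fi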
2702 .Ip "log(EXPR)" 8 4
2703 .Ip "log EXPR" 8
2704 Returns logarithm (base
2705 .IR e )
2706 of EXPR.
2707 If EXPR is omitted, returns log of $_.
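.Sp
Logarithms in other bases can be had by dividing by the log of the base.
The helper name log10 below is a hypothetical one chosen for this sketch,
not a built-in.
.nf

```perl
# Base-10 logarithm built from the natural log.
sub log10 {
    local($n) = @_;
    log($n) / log(10);
}

$l = &log10(1000);
printf "%.3f\n", $l;
```

.fi
The result is floating point, so compare it with a tolerance rather than
with ==.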
2708 .Ip "lstat(FILEHANDLE)" 8 6
2709 .Ip "lstat FILEHANDLE" 8
2710 .Ip "lstat(EXPR)" 8
2711 .Ip "lstat SCALARVARIABLE" 8
2712 Does the same thing as the stat() function, but stats a symbolic link
2713 instead of the file the symbolic link points to.
2714 If symbolic links are unimplemented on your system, a normal stat is done.
2715 .Ip "m/PATTERN/io" 8 4
2716 .Ip "/PATTERN/io" 8
2717 Searches a string for a pattern match, and returns true (1) or false (\'\').
2718 If no string is specified via the =~ or !~ operator,
2719 the $_ string is searched.
2720 (The string specified with =~ need not be an lvalue\*(--it may be the result of an expression evaluation, but remember the =~ binds rather tightly.)
2721 See also the section on regular expressions.
2722 .Sp
2723 If / is the delimiter then the initial \*(L'm\*(R' is optional.
2724 With the \*(L'm\*(R' you can use any pair of non-alphanumeric characters
2725 as delimiters.
2726 This is particularly useful for matching Unix path names that contain \*(L'/\*(R'.
2727 If the final delimiter is followed by the optional letter \*(L'i\*(R', the matching is
2728 done in a case-insensitive manner.
2729 PATTERN may contain references to scalar variables, which will be interpolated
2730 (and the pattern recompiled) every time the pattern search is evaluated.
2731 (Note that $) and $| may not be interpolated because they look like end-of-string tests.)
2732 If you want such a pattern to be compiled only once, add an \*(L"o\*(R" after
2733 the trailing delimiter.
2734 This avoids expensive run-time recompilations, and
2735 is useful when the value you are interpolating won't change over the
2736 life of the script.
2737 If the PATTERN evaluates to a null string, the most recent successful
2738 regular expression is used instead.
2739 .Sp
2740 If used in a context that requires an array value, a pattern match returns an
2741 array consisting of the subexpressions matched by the parentheses in the
2742 pattern,
2743 i.e. ($1, $2, $3.\|.\|.).
2744 It does NOT actually set $1, $2, etc. in this case, nor does it set $+, $`, $&
2745 or $'.
2746 If the match fails, a null array is returned.
2747 If the match succeeds, but there were no parentheses, an array value of (1)
2748 is returned.
2749 .Sp
2750 Examples:
2751 .nf
2752
2753 .ne 4
2754     open(tty, \'/dev/tty\');
2755     <tty> \|=~ \|/\|^y\|/i \|&& \|do foo(\|);   # do foo if desired
2756
2757     if (/Version: \|*\|([0\-9.]*\|)\|/\|) { $version = $1; }
2758
2759     next if m#^/usr/spool/uucp#;
2760
2761 .ne 5
2762     # poor man's grep
2763     $arg = shift;
2764     while (<>) {
2765             print if /$arg/o;   # compile only once
2766     }
2767
2768     if (($F1, $F2, $Etc) = ($foo =~ /^(\eS+)\es+(\eS+)\es*(.*)/))
2769
2770 .fi
2771 This last example splits $foo into the first two words and the remainder
2772 of the line, and assigns those three fields to $F1, $F2 and $Etc.
2773 The conditional is true if any variables were assigned, i.e. if the pattern
2774 matched.
2775 .Ip "mkdir(FILENAME,MODE)" 8 3
2776 Creates the directory specified by FILENAME, with permissions specified by
2777 MODE (as modified by umask).
2778 If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno).
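.Sp
A short sketch of checking the result follows.
The scratch path under /tmp is an assumption made for illustration; note
the leading 0 that makes the mode octal.
.nf

```perl
$dir = "/tmp/mkdir_demo_$$";    # hypothetical scratch directory

if (mkdir($dir, 0755)) {        # 0755 is octal; umask may clear bits
    $made = 1;
}
else {
    print STDERR "Can't create $dir: $!\n";
}

# A second attempt fails because the directory now exists.
$again = mkdir($dir, 0755);
print STDERR "second mkdir failed as expected: $!\n" unless $again;

rmdir($dir);                    # clean up
```

.fi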
2779 .Ip "msgctl(ID,CMD,ARG)" 8 4
2780 Calls the System V IPC function msgctl.  If CMD is &IPC_STAT, then ARG
2781 must be a variable which will hold the returned msqid_ds structure.
2782 Returns like ioctl: the undefined value for error, "0 but true" for
2783 zero, or the actual return value otherwise.
2784 .Ip "msgget(KEY,FLAGS)" 8 4
2785 Calls the System V IPC function msgget.  Returns the message queue id,
2786 or the undefined value if there is an error.
2787 .Ip "msgsnd(ID,MSG,FLAGS)" 8 4
2788 Calls the System V IPC function msgsnd to send the message MSG to the
2789 message queue ID.  MSG must begin with the long integer message type,
2790 which may be created with pack("L", $type).  Returns true if
2791 successful, or false if there is an error.
2792 .Ip "msgrcv(ID,VAR,SIZE,TYPE,FLAGS)" 8 4
2793 Calls the System V IPC function msgrcv to receive a message from
2794 message queue ID into variable VAR with a maximum message size of
2795 SIZE.  Note that if a message is received, the message type will be
2796 the first thing in VAR, and the maximum length of VAR is SIZE plus the
2797 size of the message type.  Returns true if successful, or false if
2798 there is an error.
2799 ''' Beginning of part 3
2800 ''' $Header: perl.man,v 4.0 91/03/20 01:38:08 lwall Locked $
2801 '''
2802 ''' $Log:       perl.man,v $
2803 ''' Revision 4.0  91/03/20  01:38:08  lwall
2804 ''' 4.0 baseline.
2805 '''
2806 ''' Revision 3.0.1.12  91/01/11  18:18:15  lwall
2807 ''' patch42: added binary and hex pack/unpack options
2808 '''
2809 ''' Revision 3.0.1.11  90/11/10  01:48:21  lwall
2810 ''' patch38: random cleanup
2811 ''' patch38: documented tr///cds
2812 '''
2813 ''' Revision 3.0.1.10  90/10/20  02:15:17  lwall
2814 ''' patch37: fixed various typos in man page
2815 '''
2816 ''' Revision 3.0.1.9  90/10/16  10:02:43  lwall
2817 ''' patch29: you can now read into the middle of a string
2818 ''' patch29: index and substr now have optional 3rd args
2819 ''' patch29: added scalar reverse
2820 ''' patch29: added scalar
2821 ''' patch29: added SysV IPC
2822 ''' patch29: added waitpid
2823 ''' patch29: added sysread and syswrite
2824 '''
2825 ''' Revision 3.0.1.8  90/08/09  04:39:04  lwall
2826 ''' patch19: added require operator
2827 ''' patch19: added truncate operator
2828 ''' patch19: unpack can do checksumming
2829 '''
2830 ''' Revision 3.0.1.7  90/08/03  11:15:42  lwall
2831 ''' patch19: Intermediate diffs for Randal
2832 '''
2833 ''' Revision 3.0.1.6  90/03/27  16:17:56  lwall
2834 ''' patch16: MSDOS support
2835 '''
2836 ''' Revision 3.0.1.5  90/03/12  16:52:21  lwall
2837 ''' patch13: documented that print $filehandle &foo is ambiguous
2838 ''' patch13: added splice operator: @oldelems = splice(@array,$offset,$len,LIST)
2839 '''
2840 ''' Revision 3.0.1.4  90/02/28  18:00:09  lwall
2841 ''' patch9: added pipe function
2842 ''' patch9: documented how to handle arbitrary weird characters in filenames
2843 ''' patch9: documented the unflushed buffers problem on piped opens
2844 ''' patch9: documented how to force top of page
2845 '''
2846 ''' Revision 3.0.1.3  89/12/21  20:10:12  lwall
2847 ''' patch7: documented that s`pat`repl` does command substitution on replacement
2848 ''' patch7: documented that $timeleft from select() is likely not implemented
2849 '''
2850 ''' Revision 3.0.1.2  89/11/17  15:31:05  lwall
2851 ''' patch5: fixed some manual typos and indent problems
2852 ''' patch5: added warning about print making an array context
2853 '''
2854 ''' Revision 3.0.1.1  89/11/11  04:45:06  lwall
2855 ''' patch2: made some line breaks depend on troff vs. nroff
2856 '''
2857 ''' Revision 3.0  89/10/18  15:21:46  lwall
2858 ''' 3.0 baseline
2859 '''
2860 .Ip "next LABEL" 8 8
2861 .Ip "next" 8
2862 The
2863 .I next
2864 command is like the
2865 .I continue
2866 statement in C; it starts the next iteration of the loop:
2867 .nf
2868
2869 .ne 4
2870         line: while (<STDIN>) {
2871                 next line if /\|^#/;    # discard comments
2872                 .\|.\|.
2873         }
2874
2875 .fi
2876 Note that if there were a
2877 .I continue
2878 block on the above, it would get executed even on discarded lines.
2879 If the LABEL is omitted, the command refers to the innermost enclosing loop.
2880 .Ip "oct(EXPR)" 8 4
2881 .Ip "oct EXPR" 8
2882 Returns the decimal value of EXPR interpreted as an octal string.
2883 (If EXPR happens to start off with 0x, interprets it as a hex string instead.)
2884 The following will handle decimal, octal and hex in the standard notation:
2885 .nf
2886
2887         $val = oct($val) if $val =~ /^0/;
2888
2889 .fi
2890 If EXPR is omitted, uses $_.
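.Sp
The one-liner above can be wrapped in a subroutine and tried on each
notation.
The name tonum is a hypothetical helper used only in this sketch.
.nf

```perl
# Accept decimal, octal (leading 0) or hex (leading 0x) strings.
sub tonum {
    local($val) = @_;
    $val = oct($val) if $val =~ /^0/;
    $val;
}

$dec = &tonum("42");            # 42
$octal = &tonum("0755");        # 493
$hexval = &tonum("0x1F");       # 31

print "$dec $octal $hexval\n";
```

.fi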
2891 .Ip "open(FILEHANDLE,EXPR)" 8 8
2892 .Ip "open(FILEHANDLE)" 8
2893 .Ip "open FILEHANDLE" 8
2894 Opens the file whose filename is given by EXPR, and associates it with
2895 FILEHANDLE.
2896 If FILEHANDLE is an expression, its value is used as the name of the
2897 real filehandle wanted.
2898 If EXPR is omitted, the scalar variable of the same name as the FILEHANDLE
2899 contains the filename.
2900 If the filename begins with \*(L"<\*(R" or nothing, the file is opened for
2901 input.
2902 If the filename begins with \*(L">\*(R", the file is opened for output.
2903 If the filename begins with \*(L">>\*(R", the file is opened for appending.
2904 (You can put a \'+\' in front of the \'>\' or \'<\' to indicate that you
2905 want both read and write access to the file.)
2906 If the filename begins with \*(L"|\*(R", the filename is interpreted
2907 as a command to which output is to be piped, and if the filename ends
2908 with a \*(L"|\*(R", the filename is interpreted as a command which pipes
2909 input to us.
2910 (You may not have a command that pipes both in and out.)
2911 Opening \'\-\' opens
2912 .I STDIN
2913 and opening \'>\-\' opens
2914 .IR STDOUT .
2915 Open returns non-zero upon success, the undefined value otherwise.
2916 If the open involved a pipe, the return value happens to be the pid
2917 of the subprocess.
2918 Examples:
2919 .nf
2920
2921 .ne 3
2922         $article = 100;
2923         open article || die "Can't find article $article: $!\en";
2924         while (<article>) {\|.\|.\|.
2925
2926 .ie t \{\
2927         open(LOG, \'>>/usr/spool/news/twitlog\'\|);     # (log is reserved)
2928 'br\}
2929 .el \{\
2930         open(LOG, \'>>/usr/spool/news/twitlog\'\|);
2931                                         # (log is reserved)
2932 'br\}
2933
2934 .ie t \{\
2935         open(article, "caesar <$article |"\|);          # decrypt article
2936 'br\}
2937 .el \{\
2938         open(article, "caesar <$article |"\|);
2939                                         # decrypt article
2940 'br\}
2941
2942 .ie t \{\
2943         open(extract, "|sort >/tmp/Tmp$$"\|);           # $$ is our process#
2944 'br\}
2945 .el \{\
2946         open(extract, "|sort >/tmp/Tmp$$"\|);
2947                                         # $$ is our process#
2948 'br\}
2949
2950 .ne 7
2951         # process argument list of files along with any includes
2952
2953         foreach $file (@ARGV) {
2954                 do process($file, \'fh00\');    # no pun intended
2955         }
2956
2957         sub process {
2958                 local($filename, $input) = @_;
2959                 $input++;               # this is a string increment
2960                 unless (open($input, $filename)) {
2961                         print STDERR "Can't open $filename: $!\en";
2962                         return;
2963                 }
2964 .ie t \{\
2965                 while (<$input>) {              # note the use of indirection
2966 'br\}
2967 .el \{\
2968                 while (<$input>) {              # note use of indirection
2969 'br\}
2970                         if (/^#include "(.*)"/) {
2971                                 do process($1, $input);
2972                                 next;
2973                         }
2974                         .\|.\|.         # whatever
2975                 }
2976         }
2977
2978 .fi
2979 You may also, in the Bourne shell tradition, specify an EXPR beginning
2980 with \*(L">&\*(R", in which case the rest of the string
2981 is interpreted as the name of a filehandle
2982 (or file descriptor, if numeric) which is to be duped and opened.
2983 You may use & after >, >>, <, +>, +>> and +<.
2984 The mode you specify should match the mode of the original filehandle.
2985 Here is a script that saves, redirects, and restores
2986 .I STDOUT
2987 and
2988 .IR STDERR :
2989 .nf
2990
2991 .ne 21
2992         #!/usr/bin/perl
2993         open(SAVEOUT, ">&STDOUT");
2994         open(SAVEERR, ">&STDERR");
2995
2996         open(STDOUT, ">foo.out") || die "Can't redirect stdout";
2997         open(STDERR, ">&STDOUT") || die "Can't dup stdout";
2998
2999         select(STDERR); $| = 1;         # make unbuffered
3000         select(STDOUT); $| = 1;         # make unbuffered
3001
3002         print STDOUT "stdout 1\en";     # this works for
3003         print STDERR "stderr 1\en";     # subprocesses too
3004
3005         close(STDOUT);
3006         close(STDERR);
3007
3008         open(STDOUT, ">&SAVEOUT");
3009         open(STDERR, ">&SAVEERR");
3010
3011         print STDOUT "stdout 2\en";
3012         print STDERR "stderr 2\en";
3013
3014 .fi
3015 If you open a pipe on the command \*(L"\-\*(R", i.e. either \*(L"|\-\*(R" or \*(L"\-|\*(R",
3016 then there is an implicit fork done, and the return value of open
3017 is the pid of the child within the parent process, and 0 within the child
3018 process.
3019 (Use defined($pid) to determine if the open was successful.)
3020 The filehandle behaves normally for the parent, but i/o to that
3021 filehandle is piped from/to the
3022 .IR STDOUT / STDIN
3023 of the child process.
3024 In the child process the filehandle isn't opened\*(--i/o happens from/to
3025 the new
3026 .I STDOUT
3027 or
3028 .IR STDIN .
3029 Typically this is used like the normal piped open when you want to exercise
3030 more control over just how the pipe command gets executed, such as when
3031 you are running setuid, and don't want to have to scan shell commands
3032 for metacharacters.
3033 The following pairs are more or less equivalent:
3034 .nf
3035
3036 .ne 5
3037         open(FOO, "|tr \'[a\-z]\' \'[A\-Z]\'");
3038         open(FOO, "|\-") || exec \'tr\', \'[a\-z]\', \'[A\-Z]\';
3039
3040         open(FOO, "cat \-n '$file'|");
3041         open(FOO, "\-|") || exec \'cat\', \'\-n\', $file;
3042
3043 .fi
3044 Explicitly closing any piped filehandle causes the parent process to wait for the
3045 child to finish, and returns the status value in $?.
3046 Note: on any operation which may do a fork,
3047 unflushed buffers remain unflushed in both
3048 processes, which means you may need to set $| to
3049 avoid duplicate output.
3050 .Sp
3051 The filename that is passed to open will have leading and trailing
3052 whitespace deleted.
3053 In order to open a file with arbitrary weird characters in it, it's necessary
3054 to protect any leading and trailing whitespace thusly:
3055 .nf
3056
3057 .ne 2
3058         $file =~ s#^(\es)#./$1#;
3059         open(FOO, "< $file\e0");
3060
3061 .fi
3062 .Ip "opendir(DIRHANDLE,EXPR)" 8 3
3063 Opens a directory named EXPR for processing by readdir(), telldir(), seekdir(),
3064 rewinddir() and closedir().
3065 Returns true if successful.
3066 DIRHANDLEs have their own namespace separate from FILEHANDLEs.
3067 .Ip "ord(EXPR)" 8 4
3068 .Ip "ord EXPR" 8
3069 Returns the numeric ASCII value of the first character of EXPR.
3070 If EXPR is omitted, uses $_.
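.Sp
A brief sketch, with illustrative variable names, showing that only the
first character matters and that sprintf with %c goes the other way:
.nf

```perl
$code = ord("A");               # 65
$first = ord("Perl");           # 80, only the "P" is examined

$_ = "a";
$dflt = ord;                    # 97, taken from $_

$back = sprintf("%c", $code);   # "A" again
print "$code $first $dflt $back\n";
```

.fi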
3071 ''' Comments on f & d by gnb@melba.bby.oz.au    22/11/89
3072 .Ip "pack(TEMPLATE,LIST)" 8 4
3073 Takes an array or list of values and packs it into a binary structure,
3074 returning the string containing the structure.
3075 The TEMPLATE is a sequence of characters that give the order and type
3076 of values, as follows:
3077 .nf
3078
3079         A       An ASCII string, will be space padded.
3080         a       An ASCII string, will be null padded.
3081         c       A signed char value.
3082         C       An unsigned char value.
3083         s       A signed short value.
3084         S       An unsigned short value.
3085         i       A signed integer value.
3086         I       An unsigned integer value.
3087         l       A signed long value.
3088         L       An unsigned long value.
3089         n       A short in \*(L"network\*(R" order.
3090         N       A long in \*(L"network\*(R" order.
3091         f       A single-precision float in the native format.
3092         d       A double-precision float in the native format.
3093         p       A pointer to a string.
3094         x       A null byte.
3095         X       Back up a byte.
3096         @       Null fill to absolute position.
3097         u       A uuencoded string.
3098         b       A bit string (ascending bit order, like vec()).
3099         B       A bit string (descending bit order).
3100         h       A hex string (low nybble first).
3101         H       A hex string (high nybble first).
3102
3103 .fi
3104 Each letter may optionally be followed by a number which gives a repeat
3105 count.
3106 With all types except "a", "A", "b", "B", "h" and "H",
3107 the pack function will gobble up that many values
3108 from the LIST.
3109 A * for the repeat count means to use however many items are left.
3110 The "a" and "A" types gobble just one value, but pack it as a string of length
3111 count,
3112 padding with nulls or spaces as necessary.
3113 (When unpacking, "A" strips trailing spaces and nulls, but "a" does not.)
3114 Likewise, the "b" and "B" fields pack a string that many bits long.
3115 The "h" and "H" fields pack a string that many nybbles long.
3116 Real numbers (floats and doubles) are in the native machine format
3117 only; due to the multiplicity of floating formats around, and the lack
3118 of a standard \*(L"network\*(R" representation, no facility for
3119 interchange has been made.
3120 This means that packed floating point data
3121 written on one machine may not be readable on another - even if both
3122 use IEEE floating point arithmetic (as the endian-ness of the memory
3123 representation is not part of the IEEE spec).
3124 Note that perl uses
3125 doubles internally for all numeric calculation, and converting from
3126 double -> float -> double will lose precision (i.e. unpack("f",
3127 pack("f", $foo)) will not in general equal $foo).
3128 .br
3129 Examples:
3130 .nf
3131
3132         $foo = pack("cccc",65,66,67,68);
3133         # foo eq "ABCD"
3134         $foo = pack("c4",65,66,67,68);
3135         # same thing
3136
3137         $foo = pack("ccxxcc",65,66,67,68);
3138         # foo eq "AB\e0\e0CD"
3139
3140         $foo = pack("s2",1,2);
3141         # "\e1\e0\e2\e0" on little-endian
3142         # "\e0\e1\e0\e2" on big-endian
3143
3144         $foo = pack("a4","abcd","x","y","z");
3145         # "abcd"
3146
3147         $foo = pack("aaaa","abcd","x","y","z");
3148         # "axyz"
3149
3150         $foo = pack("a14","abcdefg");
3151         # "abcdefg\e0\e0\e0\e0\e0\e0\e0"
3152
3153         $foo = pack("i9pl", gmtime);
3154         # a real struct tm (on my system anyway)
3155
3156         sub bintodec {
3157             unpack("N", pack("B32", substr("0" x 32 . shift, -32)));
3158         }
3159 .fi
3160 The same template may generally also be used in the unpack function.
3161 .Ip "pipe(READHANDLE,WRITEHANDLE)" 8 3
3162 Opens a pair of connected pipes like the corresponding system call.
3163 Note that if you set up a loop of piped processes, deadlock can occur
3164 unless you are very careful.
3165 In addition, note that perl's pipes use stdio buffering, so you may need
3166 to set $| to flush your WRITEHANDLE after each command, depending on
3167 the application.
3168 [Requires version 3.0 patchlevel 9.]
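.Sp
The following is a sketch of the usual pattern, assuming a system with
fork: the child writes one line and the parent reads it.
The handle names READER and WRITER are arbitrary.
.nf

```perl
pipe(READER, WRITER) || die "Can't create pipe: $!";

$pid = fork;
die "Can't fork: $!" unless defined($pid);

if ($pid == 0) {                        # child: writes only
    close(READER);
    select(WRITER); $| = 1;             # flush after each print
    print WRITER "hello from child $$\n";
    close(WRITER);
    exit 0;
}

close(WRITER);          # parent: close the unused end so EOF can be seen
$line = <READER>;
close(READER);
wait;                   # reap the child

print "parent read: $line";
```

.fi
Each process closes the end it does not use; leaving the parent's copy of
WRITER open is one common way to produce the deadlock mentioned above.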
3169 .Ip "pop(ARRAY)" 8 6
3170 .Ip "pop ARRAY" 8
3171 Pops and returns the last value of the array, shortening the array by 1.
3172 Has the same effect as
3173 .nf
3174
3175         $tmp = $ARRAY[$#ARRAY\-\|\-];
3176
3177 .fi
3178 If there are no elements in the array, returns the undefined value.
3179 .Ip "print(FILEHANDLE LIST)" 8 10
3180 .Ip "print(LIST)" 8
3181 .Ip "print FILEHANDLE LIST" 8
3182 .Ip "print LIST" 8
3183 .Ip "print" 8
3184 Prints a string or a comma-separated list of strings.
3185 Returns non-zero if successful.
3186 FILEHANDLE may be a scalar variable name, in which case the variable contains
3187 the name of the filehandle, thus introducing one level of indirection.
3188 (NOTE: If FILEHANDLE is a variable and the next token is a term, it may be
3189 misinterpreted as an operator unless you interpose a + or put parens around
3190 the arguments.)
3191 If FILEHANDLE is omitted, prints by default to standard output (or to the
3192 last selected output channel\*(--see select()).
3193 If LIST is also omitted, prints $_ to
3194 .IR STDOUT .
3195 To set the default output channel to something other than
3196 .I STDOUT
3197 use the select operation.
3198 Note that, because print takes a LIST, anything in the LIST is evaluated
3199 in an array context, and any subroutine that you call will have one or more
3200 of its expressions evaluated in an array context.
3201 Also be careful not to follow the print keyword with a left parenthesis
3202 unless you want the corresponding right parenthesis to terminate the
3203 arguments to the print\*(--interpose a + or put parens around all the arguments.
3204 .Ip "printf(FILEHANDLE LIST)" 8 10
3205 .Ip "printf(LIST)" 8
3206 .Ip "printf FILEHANDLE LIST" 8
3207 .Ip "printf LIST" 8
3208 Equivalent to a \*(L"print FILEHANDLE sprintf(LIST)\*(R".
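.Sp
As a small sketch of that equivalence (the format string and values are
made up for illustration), the two statements below should print the same
line:
.nf

```perl
$fmt = "%-6s %5.1f%%\n";        # left-justified name, one decimal place

printf STDOUT $fmt, "disk", 87.5;
print STDOUT sprintf($fmt, "disk", 87.5);

# sprintf alone returns the string instead of printing it.
$line = sprintf($fmt, "disk", 87.5);
```

.fi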
3209 .Ip "push(ARRAY,LIST)" 8 7
3210 Treats ARRAY (@ is optional) as a stack, and pushes the values of LIST
3211 onto the end of ARRAY.
3212 The length of ARRAY increases by the length of LIST.
3213 Has the same effect as
3214 .nf
3215
3216     for $value (LIST) {
3217             $ARRAY[++$#ARRAY] = $value;
3218     }
3219
3220 .fi
3221 but is more efficient.
3222 .Ip "q/STRING/" 8 5
3223 .Ip "qq/STRING/" 8
3224 .Ip "qx/STRING/" 8
3225 These are not really functions, but simply syntactic sugar to let you
3226 avoid putting too many backslashes into quoted strings.
3227 The q operator is a generalized single quote, and the qq operator a
3228 generalized double quote.
3229 The qx operator is a generalized backquote.
3230 Any non-alphanumeric delimiter can be used in place of /, including newline.
3231 If the delimiter is an opening bracket or parenthesis, the final delimiter
3232 will be the corresponding closing bracket or parenthesis.
3233 (Embedded occurrences of the closing bracket need to be backslashed as usual.)
3234 Examples:
3235 .nf
3236
3237 .ne 5
3238         $foo = q!I said, "You said, \'She said it.\'"!;
3239         $bar = q(\'This is it.\');
3240         $today = qx{ date };
3241         $_ .= qq
3242 *** The previous line contains the naughty word "$&".\en
3243                 if /(ibm|apple|awk)/;      # :-)
3244
3245 .fi
3246 .Ip "rand(EXPR)" 8 8
3247 .Ip "rand EXPR" 8
3248 .Ip "rand" 8
3249 Returns a random fractional number between 0 and the value of EXPR.
3250 (EXPR should be positive.)
3251 If EXPR is omitted, returns a value between 0 and 1.
3252 See also srand().
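.Sp
A common use is picking a random integer.
The sketch below assumes, as current implementations do, that the result
is always strictly less than EXPR, so truncating it gives 0 through EXPR
minus 1.
.nf

```perl
srand(time);                    # seed the generator; see srand()

$roll = int(rand(6)) + 1;       # a die roll, 1 through 6
$frac = rand;                   # no EXPR: a fraction between 0 and 1

print "rolled $roll, fraction $frac\n";
```

.fi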
3253 .Ip "read(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
3254 .Ip "read(FILEHANDLE,SCALAR,LENGTH)" 8 5
3255 Attempts to read LENGTH bytes of data into variable SCALAR from the specified
3256 FILEHANDLE.
3257 Returns the number of bytes actually read, or undef if there was an error.
3258 SCALAR will be grown or shrunk to the length actually read.
3259 An OFFSET may be specified to place the read data at some other place
3260 than the beginning of the string.
3261 This call is actually implemented in terms of stdio's fread call.  To get
3262 a true read system call, see sysread.
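.Sp
The sketch below reads a file in fixed-size chunks, first replacing the
buffer each time and then using OFFSET to append to it.
The scratch file under /tmp and all variable names are assumptions made
for illustration.
.nf

```perl
$file = "/tmp/read_demo_$$";            # hypothetical scratch file
open(OUT, ">$file") || die "Can't write $file: $!";
print OUT "AAAABBBBCC";
close(OUT);

# Read 4-byte records; the last one comes back short.
open(IN, $file) || die "Can't read $file: $!";
@recs = ();
while (($got = read(IN, $buf, 4)) > 0) {
    push(@recs, "$got:$buf");
}
close(IN);

# Same file again, using OFFSET to append each chunk to $all.
open(IN, $file) || die "Can't read $file: $!";
$all = '';
while (read(IN, $all, 4, length($all))) {}
close(IN);
unlink($file);

print join(' ', @recs), "\n";
```

.fi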
3263 .Ip "readdir(DIRHANDLE)" 8 3
3264 .Ip "readdir DIRHANDLE" 8
3265 Returns the next directory entry for a directory opened by opendir().
3266 If used in an array context, returns all the rest of the entries in the
3267 directory.
3268 If there are no more entries, returns an undefined value in a scalar context
3269 or a null list in an array context.
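.Sp
Here is a sketch of both calling styles against a scratch directory.
The path and file names are hypothetical, and the pattern that discards
the "." and ".." entries assumes a Unix-style directory.
.nf

```perl
$dir = "/tmp/readdir_demo_$$";          # hypothetical scratch directory
mkdir($dir, 0755) || die "Can't mkdir $dir: $!";
foreach $name ('a.txt', 'b.txt') {
    open(F, ">$dir/$name") || die "Can't create $dir/$name: $!";
    close(F);
}

# Array context: all remaining entries at once.
opendir(DIR, $dir) || die "Can't opendir $dir: $!";
@all = readdir(DIR);
closedir(DIR);
@files = sort(grep(!/^\.\.?$/, @all));  # drop . and ..

# Scalar context: one entry per call, undefined at the end.
opendir(DIR, $dir) || die "Can't opendir $dir: $!";
$count = 0;
while (defined($entry = readdir(DIR))) {
    $count++;
}
closedir(DIR);

unlink("$dir/a.txt", "$dir/b.txt");
rmdir($dir);
print "@files\n";
```

.fi
Testing with defined() in the loop matters, since a file could be named "0".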
3270 .Ip "readlink(EXPR)" 8 6
3271 .Ip "readlink EXPR" 8
3272 Returns the value of a symbolic link, if symbolic links are implemented.
3273 If not, gives a fatal error.
3274 If there is some system error, returns the undefined value and sets $! (errno).
3275 If EXPR is omitted, uses $_.
3276 .Ip "recv(SOCKET,SCALAR,LEN,FLAGS)" 8 4
3277 Receives a message on a socket.
3278 Attempts to receive LEN bytes of data into variable SCALAR from the specified
3279 SOCKET filehandle.
3280 Returns the address of the sender, or the undefined value if there's an error.
3281 SCALAR will be grown or shrunk to the length actually read.
3282 Takes the same flags as the system call of the same name.
3283 .Ip "redo LABEL" 8 8
3284 .Ip "redo" 8
3285 The
3286 .I redo
3287 command restarts the loop block without evaluating the conditional again.
3288 The
3289 .I continue
3290 block, if any, is not executed.
3291 If the LABEL is omitted, the command refers to the innermost enclosing loop.
3292 This command is normally used by programs that want to lie to themselves
3293 about what was just input:
3294 .nf
3295
3296 .ne 16
3297         # a simpleminded Pascal comment stripper
3298         # (warning: assumes no { or } in strings)
3299         line: while (<STDIN>) {
3300                 while (s|\|({.*}.*\|){.*}|$1 \||) {}
3301                 s|{.*}| \||;
3302                 if (s|{.*| \||) {
3303                         $front = $_;
3304                         while (<STDIN>) {
3305                                 if (\|/\|}/\|) {        # end of comment?
3306                                         s|^|$front{|;
3307                                         redo line;
3308                                 }
3309                         }
3310                 }
3311                 print;
3312         }
3313
3314 .fi
3315 .Ip "rename(OLDNAME,NEWNAME)" 8 2
3316 Changes the name of a file.
3317 Returns 1 for success, 0 otherwise.
3318 Will not work across filesystem boundaries.
3319 .Ip "require(EXPR)" 8 6
3320 .Ip "require EXPR" 8
3321 .Ip "require" 8
3322 Includes the library file specified by EXPR, or by $_ if EXPR is not supplied.
3323 Has semantics similar to the following subroutine:
3324 .nf
3325
3326         sub require {
3327             local($filename) = @_;
3328             return 1 if $INC{$filename};
3329             local($realfilename,$result);
3330             ITER: {
3331                 foreach $prefix (@INC) {
3332                     $realfilename = "$prefix/$filename";
3333                     if (-f $realfilename) {
3334                         $result = do $realfilename;
3335                         last ITER;
3336                     }
3337                 }
3338                 die "Can't find $filename in \e@INC";
3339             }
3340             die $@ if $@;
3341             die "$filename did not return true value" unless $result;
3342             $INC{$filename} = $realfilename;
3343             $result;
3344         }
3345
3346 .fi
3347 Note that the file will not be included twice under the same specified name.
3348 .Ip "reset(EXPR)" 8 6
3349 .Ip "reset EXPR" 8
3350 .Ip "reset" 8
3351 Generally used in a
3352 .I continue
3353 block at the end of a loop to clear variables and reset ?? searches
3354 so that they work again.
3355 The expression is interpreted as a list of single characters (hyphens allowed
3356 for ranges).
3357 All variables and arrays beginning with one of those letters are reset to
3358 their pristine state.
3359 If the expression is omitted, one-match searches (?pattern?) are reset to
3360 match again.
3361 Only resets variables or searches in the current package.
3362 Always returns 1.
3363 Examples:
3364 .nf
3365
3366 .ne 3
3367     reset \'X\';        \h'|2i'# reset all X variables
3368     reset \'a\-z\';\h'|2i'# reset lower case variables
3369     reset;      \h'|2i'# just reset ?? searches
3370
3371 .fi
3372 Note: resetting \*(L"A\-Z\*(R" is not recommended since you'll wipe out your ARGV and ENV
3373 arrays.
3374 .Sp
3375 The use of reset on dbm associative arrays does not change the dbm file.
3376 (It does, however, flush any entries cached by perl, which may be useful if
3377 you are sharing the dbm file.
3378 Then again, maybe not.)
3379 .Ip "return LIST" 8 3
3380 Returns from a subroutine with the value specified.
3381 (Note that a subroutine can automatically return
3382 the value of the last expression evaluated.
3383 That's the preferred method\*(--use of an explicit
3384 .I return
3385 is a bit slower.)
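.Sp
A sketch contrasting the two styles follows; both subroutine names are
hypothetical.
An explicit return earns its keep mainly when leaving a subroutine early.
.nf

```perl
# Implicit: the value of the last expression is returned.
sub larger {
    local($x, $y) = @_;
    $x > $y ? $x : $y;
}

# Explicit: leave the loop and the subroutine as soon as a match is found.
sub first_negative {
    local(@nums) = @_;
    foreach $n (@nums) {
        return $n if $n < 0;
    }
    0;                          # nothing negative found
}

$big = &larger(3, 7);
$neg = &first_negative(4, -2, -9);
$none = &first_negative(1, 2);
print "$big $neg $none\n";
```

.fi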
3386 .Ip "reverse(LIST)" 8 4
3387 .Ip "reverse LIST" 8
3388 In an array context, returns an array value consisting of the elements
3389 of LIST in the opposite order.
3390 In a scalar context, returns a string value consisting of the bytes of
3391 the first element of LIST in the opposite order.
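.Sp
As a small illustration of the two contexts (a sketch only; the variable
names are made up for the example):
.nf

```perl
    @list = ('a', 'b', 'c');
    @backwards = reverse(@list);    # array context: ('c', 'b', 'a')
    $word = 'hello';
    $drow = reverse($word);         # scalar context: 'olleh'
```

.fi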
3392 .Ip "rewinddir(DIRHANDLE)" 8 5
3393 .Ip "rewinddir DIRHANDLE" 8
3394 Sets the current position to the beginning of the directory for the readdir() routine on DIRHANDLE.
3395 .Ip "rindex(STR,SUBSTR,POSITION)" 8 6
3396 .Ip "rindex(STR,SUBSTR)" 8 4
3397 Works just like index except that it
3398 returns the position of the LAST occurrence of SUBSTR in STR.
3399 If POSITION is specified, returns the last occurrence at or before that
3400 position.
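.Sp
For instance, the following sketch picks the last component off a path name.
It assumes $[ is 0, and the path and variable names are only illustrative:
.nf

```perl
    $path = '/usr/local/bin/perl';
    $last = rindex($path, '/');               # 14
    $prev = rindex($path, '/', $last - 1);    # 10
    $name = substr($path, $last + 1);         # 'perl'
```

.fi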
3401 .Ip "rmdir(FILENAME)" 8 4
3402 .Ip "rmdir FILENAME" 8
3403 Deletes the directory specified by FILENAME if it is empty.
3404 If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno).
3405 If FILENAME is omitted, uses $_.
3406 .Ip "s/PATTERN/REPLACEMENT/gieo" 8 3
3407 Searches a string for a pattern, and if found, replaces that pattern with the
3408 replacement text and returns the number of substitutions made.
3409 Otherwise it returns false (0).
3410 The \*(L"g\*(R" is optional, and if present, indicates that all occurrences
3411 of the pattern are to be replaced.
3412 The \*(L"i\*(R" is also optional, and if present, indicates that matching
3413 is to be done in a case-insensitive manner.
3414 The \*(L"e\*(R" is likewise optional, and if present, indicates that
3415 the replacement string is to be evaluated as an expression rather than just
3416 as a double-quoted string.
3417 Any non-alphanumeric delimiter may replace the slashes;
3418 if single quotes are used, no
3419 interpretation is done on the replacement string (the e modifier overrides
3420 this, however); if backquotes are used, the replacement string is a command
3421 to execute whose output will be used as the actual replacement text.
3422 If no string is specified via the =~ or !~ operator,
3423 the $_ string is searched and modified.
3424 (The string specified with =~ must be a scalar variable, an array element,
3425 or an assignment to one of those, i.e. an lvalue.)
3426 If the pattern contains a $ that looks like a variable rather than an
3427 end-of-string test, the variable will be interpolated into the pattern at
3428 run-time.
3429 If you only want the pattern compiled once the first time the variable is
3430 interpolated, add an \*(L"o\*(R" at the end.
3431 If the PATTERN evaluates to a null string, the most recent successful
3432 regular expression is used instead.
3433 See also the section on regular expressions.
3434 Examples:
3435 .nf
3436
3437     s/\|\e\|bgreen\e\|b/mauve/g;                # don't change wintergreen
3438
3439     $path \|=~ \|s|\|/usr/bin|\|/usr/local/bin|;
3440
3441     s/Login: $foo/Login: $bar/; # run-time pattern
3442
3443     ($foo = $bar) =~ s/bar/foo/;
3444
3445     $_ = \'abc123xyz\';
3446     s/\ed+/$&*2/e;              # yields \*(L'abc246xyz\*(R'
3447     s/\ed+/sprintf("%5d",$&)/e; # yields \*(L'abc  246xyz\*(R'
3448     s/\ew/$& x 2/eg;            # yields \*(L'aabbcc  224466xxyyzz\*(R'
3449
3450     s/\|([^ \|]*\|) *\|([^ \|]*\|)\|/\|$2 $1/;  # reverse 1st two fields
3451
3452 .fi
3453 (Note the use of $ instead of \|\e\| in the last example.  See section
3454 on regular expressions.)
3455 .Ip "scalar(EXPR)" 8 3
3456 Forces EXPR to be interpreted in a scalar context and returns the value
3457 of EXPR.
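.Sp
This matters mostly inside a list, where an array would otherwise be
interpolated element by element.
A minimal sketch (the array names are hypothetical):
.nf

```perl
    @items = ('a', 'b', 'c');
    @report = (scalar(@items), 'items');    # (3, 'items')
    @flat = (@items, 'items');              # ('a', 'b', 'c', 'items')
```

.fi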
3458 .Ip "seek(FILEHANDLE,POSITION,WHENCE)" 8 3
3459 Randomly positions the file pointer for FILEHANDLE, just like the fseek()
3460 call of stdio.
3461 FILEHANDLE may be an expression whose value gives the name of the filehandle.
3462 Returns 1 upon success, 0 otherwise.
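.Sp
As with fseek(), a WHENCE of 0 measures POSITION from the start of the file,
1 from the current position, and 2 from the end.
The sketch below assumes a writable /tmp directory; the file name and the
filehandle FH are invented for the example:
.nf

```perl
    $tmp = "/tmp/seekdemo.$$";
    open(FH, "+>$tmp") || die "Can't open $tmp: $!";
    print FH 'abcdefghij';
    seek(FH, 3, 0);             # 3 bytes from the start
    read(FH, $buf, 4);          # $buf is 'defg'
    seek(FH, -2, 2);            # 2 bytes before the end
    read(FH, $end, 2);          # $end is 'ij'
    close(FH);
    unlink($tmp);
```

.fi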
3463 .Ip "seekdir(DIRHANDLE,POS)" 8 3
3464 Sets the current position for the readdir() routine on DIRHANDLE.
3465 POS must be a value returned by telldir().
3466 Has the same caveats about possible directory compaction as the corresponding
3467 system library routine.
3468 .Ip "select(FILEHANDLE)" 8 3
3469 .Ip "select" 8 3
3470 Returns the currently selected filehandle.
3471 Sets the current default filehandle for output, if FILEHANDLE is supplied.
3472 This has two effects: first, a
3473 .I write
3474 or a
3475 .I print
3476 without a filehandle will default to this FILEHANDLE.
3477 Second, references to variables related to output will refer to this output
3478 channel.
3479 For example, if you have to set the top of form format for more than
3480 one output channel, you might do the following:
3481 .nf
3482
3483 .ne 4
3484         select(REPORT1);
3485         $^ = \'report1_top\';
3486         select(REPORT2);
3487         $^ = \'report2_top\';
3488
3489 .fi
3490 FILEHANDLE may be an expression whose value gives the name of the actual filehandle.
3491 Thus:
3492 .nf
3493
3494         $oldfh = select(STDERR); $| = 1; select($oldfh);
3495
3496 .fi
3497 .Ip "select(RBITS,WBITS,EBITS,TIMEOUT)" 8 3
3498 This calls the select system call with the bitmasks specified, which can
3499 be constructed using fileno() and vec(), along these lines:
3500 .nf
3501
3502         $rin = $win = $ein = '';
3503         vec($rin,fileno(STDIN),1) = 1;
3504         vec($win,fileno(STDOUT),1) = 1;
3505         $ein = $rin | $win;
3506
3507 .fi
3508 If you want to select on many filehandles you might wish to write a subroutine:
3509 .nf
3510
3511         sub fhbits {
3512             local(@fhlist) = split(' ',$_[0]);
3513             local($bits);
3514             for (@fhlist) {
3515                 vec($bits,fileno($_),1) = 1;
3516             }
3517             $bits;
3518         }
3519         $rin = &fhbits('STDIN TTY SOCK');
3520
3521 .fi
3522 The usual idiom is:
3523 .nf
3524
3525         ($nfound,$timeleft) =
3526           select($rout=$rin, $wout=$win, $eout=$ein, $timeout);
3527
3528 or to block until something becomes ready:
3529
3530 .ie t \{\
3531         $nfound = select($rout=$rin, $wout=$win, $eout=$ein, undef);
3532 'br\}
3533 .el \{\
3534         $nfound = select($rout=$rin, $wout=$win,
3535                                 $eout=$ein, undef);
3536 'br\}
3537
3538 .fi
3539 Any of the bitmasks can also be undef.
3540 The timeout, if specified, is in seconds, which may be fractional.
3541 NOTE: not all implementations are capable of returning the $timeleft.
3542 If not, they always return $timeleft equal to the supplied $timeout.
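.Sp
Since all three bitmasks may be undef and the timeout may be fractional,
one possible use (sketched here, on the assumption that your system's select
honors fractional timeouts) is a sleep of less than a second:
.nf

```perl
    # wait about a quarter of a second without watching any filehandles
    $nfound = select(undef, undef, undef, 0.25);    # 0: nothing became ready
```

.fi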
3543 .Ip "semctl(ID,SEMNUM,CMD,ARG)" 8 4
3544 Calls the System V IPC function semctl.  If CMD is &IPC_STAT or
3545 &GETALL, then ARG must be a variable which will hold the returned
3546 semid_ds structure or semaphore value array.  Returns like ioctl: the
3547 undefined value for error, "0 but true" for zero, or the actual return
3548 value otherwise.
3549 .Ip "semget(KEY,NSEMS,SIZE,FLAGS)" 8 4
3550 Calls the System V IPC function semget.  Returns the semaphore id, or
3551 the undefined value if there is an error.
3552 .Ip "semop(KEY,OPSTRING)" 8 4
3553 Calls the System V IPC function semop to perform semaphore operations
3554 such as signaling and waiting.  OPSTRING must be a packed array of
3555 semop structures.  Each semop structure can be generated with
3556 \&'pack("sss", $semnum, $semop, $semflag)'.  The number of semaphore
3557 operations is implied by the length of OPSTRING.  Returns true if
3558 successful, or false if there is an error.  As an example, the
3559 following code waits on semaphore $semnum of semaphore id $semid:
3560 .nf
3561
3562         $semop = pack("sss", $semnum, -1, 0);
3563         die "Semaphore trouble: $!\en" unless semop($semid, $semop);
3564
3565 .fi
3566 To signal the semaphore, replace "-1" with "1".
3567 .Ip "send(SOCKET,MSG,FLAGS,TO)" 8 4
3568 .Ip "send(SOCKET,MSG,FLAGS)" 8
3569 Sends a message on a socket.
3570 Takes the same flags as the system call of the same name.
3571 On unconnected sockets you must specify a destination to send TO.
3572 Returns the number of characters sent, or the undefined value if
3573 there is an error.
3574 .Ip "setpgrp(PID,PGRP)" 8 4
3575 Sets the current process group for the specified PID, 0 for the current
3576 process.
3577 Will produce a fatal error if used on a machine that doesn't implement
3578 setpgrp(2).
3579 .Ip "setpriority(WHICH,WHO,PRIORITY)" 8 4
3580 Sets the current priority for a process, a process group, or a user.
3581 (See setpriority(2).)
3582 Will produce a fatal error if used on a machine that doesn't implement
3583 setpriority(2).
3584 .Ip "setsockopt(SOCKET,LEVEL,OPTNAME,OPTVAL)" 8 3
3585 Sets the socket option requested.
3586 Returns undefined if there is an error.
3587 OPTVAL may be specified as undef if you don't want to pass an argument.
3588 .Ip "shift(ARRAY)" 8 6
3589 .Ip "shift ARRAY" 8
3590 .Ip "shift" 8
3591 Shifts the first value of the array off and returns it,
3592 shortening the array by 1 and moving everything down.
3593 If there are no elements in the array, returns the undefined value.
3594 If ARRAY is omitted, shifts the @ARGV array in the main program, and the @_
3595 array in subroutines.
3596 (This is determined lexically.)
3597 See also unshift(), push() and pop().
3598 Shift() and unshift() do the same thing to the left end of an array that push()
3599 and pop() do to the right end.
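.Sp
A short sketch of the four operators working on the two ends of one array
(the array and variable names are illustrative only):
.nf

```perl
    @queue = ('a', 'b', 'c');
    $first = shift(@queue);       # 'a'; @queue is now ('b', 'c')
    unshift(@queue, $first);      # back to ('a', 'b', 'c')
    $last = pop(@queue);          # 'c'; @queue is now ('a', 'b')
    push(@queue, $last);          # back to ('a', 'b', 'c')
    @empty = ();
    $none = shift(@empty);        # the undefined value
```

.fi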
3600 .Ip "shmctl(ID,CMD,ARG)" 8 4
3601 Calls the System V IPC function shmctl.  If CMD is &IPC_STAT, then ARG
3602 must be a variable which will hold the returned shmid_ds structure.
3603 Returns like ioctl: the undefined value for error, "0 but true" for
3604 zero, or the actual return value otherwise.
3605 .Ip "shmget(KEY,SIZE,FLAGS)" 8 4
3606 Calls the System V IPC function shmget.  Returns the shared memory
3607 segment id, or the undefined value if there is an error.
3608 .Ip "shmread(ID,VAR,POS,SIZE)" 8 4
3609 .Ip "shmwrite(ID,STRING,POS,SIZE)" 8
3610 Reads or writes the System V shared memory segment ID starting at
3611 position POS for size SIZE by attaching to it, copying in/out, and
3612 detaching from it.  When reading, VAR must be a variable which
3613 will hold the data read.  When writing, if STRING is too long,
3614 only SIZE bytes are used; if STRING is too short, nulls are
written to fill out SIZE bytes.  Returns true if successful, or
3616 false if there is an error.
3617 .Ip "shutdown(SOCKET,HOW)" 8 3
3618 Shuts down a socket connection in the manner indicated by HOW, which has
3619 the same interpretation as in the system call of the same name.
3620 .Ip "sin(EXPR)" 8 4
3621 .Ip "sin EXPR" 8
3622 Returns the sine of EXPR (expressed in radians).
3623 If EXPR is omitted, returns sine of $_.
3624 .Ip "sleep(EXPR)" 8 6
3625 .Ip "sleep EXPR" 8
3626 .Ip "sleep" 8
3627 Causes the script to sleep for EXPR seconds, or forever if no EXPR.
May be interrupted by sending the process a SIGALRM.
3629 Returns the number of seconds actually slept.
3630 .Ip "socket(SOCKET,DOMAIN,TYPE,PROTOCOL)" 8 3
3631 Opens a socket of the specified kind and attaches it to filehandle SOCKET.
3632 DOMAIN, TYPE and PROTOCOL are specified the same as for the system call
3633 of the same name.
3634 You may need to run h2ph on sys/socket.h to get the proper values handy
3635 in a perl library file.
Returns true if successful.
3637 See the example in the section on Interprocess Communication.
3638 .Ip "socketpair(SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL)" 8 3
3639 Creates an unnamed pair of sockets in the specified domain, of the specified
3640 type.
3641 DOMAIN, TYPE and PROTOCOL are specified the same as for the system call
3642 of the same name.
3643 If unimplemented, yields a fatal error.
Returns true if successful.
3645 .Ip "sort(SUBROUTINE LIST)" 8 9
3646 .Ip "sort(LIST)" 8
3647 .Ip "sort SUBROUTINE LIST" 8
3648 .Ip "sort LIST" 8
3649 Sorts the LIST and returns the sorted array value.
3650 Nonexistent values of arrays are stripped out.
3651 If SUBROUTINE is omitted, sorts in standard string comparison order.
3652 If SUBROUTINE is specified, gives the name of a subroutine that returns
3653 an integer less than, equal to, or greater than 0,
3654 depending on how the elements of the array are to be ordered.
3655 In the interests of efficiency the normal calling code for subroutines
3656 is bypassed, with the following effects: the subroutine may not be a recursive
3657 subroutine, and the two elements to be compared are passed into the subroutine
3658 not via @_ but as $a and $b (see example below).
3659 They are passed by reference so don't modify $a and $b.
3660 SUBROUTINE may be a scalar variable name, in which case the value provides
3661 the name of the subroutine to use.
3662 Examples:
3663 .nf
3664
3665 .ne 4
3666         sub byage {
3667             $age{$a} - $age{$b};        # presuming integers
3668         }
3669         @sortedclass = sort byage @class;
3670
3671 .ne 9
3672         sub reverse { $a lt $b ? 1 : $a gt $b ? \-1 : 0; }
3673         @harry = (\'dog\',\'cat\',\'x\',\'Cain\',\'Abel\');
3674         @george = (\'gone\',\'chased\',\'yz\',\'Punished\',\'Axed\');
3675         print sort @harry;
3676                 # prints AbelCaincatdogx
3677         print sort reverse @harry;
3678                 # prints xdogcatCainAbel
3679         print sort @george, \'to\', @harry;
3680                 # prints AbelAxedCainPunishedcatchaseddoggonetoxyz
3681
3682 .fi
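.Sp
The difference between the default string ordering and a numeric SUBROUTINE
can be sketched as follows.
The names bynum, @list and $by are invented for the example, and the
subtraction presumes integers, as in byage above:
.nf

```perl
    sub bynum { $a - $b; }              # presuming integers
    @list = (10, 9, 100, 1);
    @strings = sort @list;              # (1, 10, 100, 9)
    @numbers = sort bynum @list;        # (1, 9, 10, 100)
    $by = 'bynum';
    @again = sort $by @list;            # name of the subroutine taken from $by
```

.fi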
3683 .Ip "splice(ARRAY,OFFSET,LENGTH,LIST)" 8 8
3684 .Ip "splice(ARRAY,OFFSET,LENGTH)" 8
3685 .Ip "splice(ARRAY,OFFSET)" 8
3686 Removes the elements designated by OFFSET and LENGTH from an array, and
3687 replaces them with the elements of LIST, if any.
3688 Returns the elements removed from the array.
3689 The array grows or shrinks as necessary.
3690 If LENGTH is omitted, removes everything from OFFSET onward.
3691 The following equivalencies hold (assuming $[ == 0):
3692 .nf
3693
3694         push(@a,$x,$y)\h'|3.5i'splice(@a,$#a+1,0,$x,$y)
3695         pop(@a)\h'|3.5i'splice(@a,-1)
3696         shift(@a)\h'|3.5i'splice(@a,0,1)
3697         unshift(@a,$x,$y)\h'|3.5i'splice(@a,0,0,$x,$y)
3698         $a[$x] = $y\h'|3.5i'splice(@a,$x,1,$y);
3699
3700 Example, assuming array lengths are passed before arrays:
3701
3702         sub aeq {       # compare two array values
3703                 local(@a) = splice(@_,0,shift);
3704                 local(@b) = splice(@_,0,shift);
3705                 return 0 unless @a == @b;       # same len?
3706                 while (@a) {
3707                     return 0 if pop(@a) ne pop(@b);
3708                 }
3709                 return 1;
3710         }
3711         if (&aeq($len,@foo[1..$len],0+@bar,@bar)) { ... }
3712
3713 .fi
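.Sp
A further sketch, showing both the returned elements and the array that is
left behind (it assumes $[ is 0; the values are arbitrary):
.nf

```perl
    @a = ('a', 'b', 'c', 'd', 'e');
    @removed = splice(@a, 1, 2, 'X', 'Y', 'Z');
    # @removed is ('b', 'c'); @a is ('a', 'X', 'Y', 'Z', 'd', 'e')
    @tail = splice(@a, 4);
    # @tail is ('d', 'e'); @a is ('a', 'X', 'Y', 'Z')
```

.fi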
3714 .Ip "split(/PATTERN/,EXPR,LIMIT)" 8 8
3715 .Ip "split(/PATTERN/,EXPR)" 8 8
3716 .Ip "split(/PATTERN/)" 8
3717 .Ip "split" 8
3718 Splits a string into an array of strings, and returns it.
3719 (If not in an array context, returns the number of fields found and splits
3720 into the @_ array.
3721 (In an array context, you can force the split into @_
3722 by using ?? as the pattern delimiters, but it still returns the array value.))
3723 If EXPR is omitted, splits the $_ string.
3724 If PATTERN is also omitted, splits on whitespace (/[\ \et\en]+/).
3725 Anything matching PATTERN is taken to be a delimiter separating the fields.
3726 (Note that the delimiter may be longer than one character.)
3727 If LIMIT is specified, splits into no more than that many fields (though it
3728 may split into fewer).
3729 If LIMIT is unspecified, trailing null fields are stripped (which
3730 potential users of pop() would do well to remember).
3731 A pattern matching the null string (not to be confused with a null pattern //,
3732 which is just one member of the set of patterns matching a null string)
3733 will split the value of EXPR into separate characters at each point it
3734 matches that way.
3735 For example:
3736 .nf
3737
3738         print join(\':\', split(/ */, \'hi there\'));
3739
3740 .fi
3741 produces the output \*(L'h:i:t:h:e:r:e\*(R'.
3742 .Sp
The LIMIT parameter can be used to partially split a line:
3744 .nf
3745
3746         ($login, $passwd, $remainder) = split(\|/\|:\|/\|, $_, 3);
3747
3748 .fi
3749 (When assigning to a list, if LIMIT is omitted, perl supplies a LIMIT one
3750 larger than the number of variables in the list, to avoid unnecessary work.
3751 For the list above LIMIT would have been 4 by default.
3752 In time critical applications it behooves you not to split into
3753 more fields than you really need.)
3754 .Sp
3755 If the PATTERN contains parentheses, additional array elements are created
3756 from each matching substring in the delimiter.
3757 .Sp
3758         split(/([,-])/,"1-10,20");
3759 .Sp
3760 produces the array value
3761 .Sp
3762         (1,'-',10,',',20)
3763 .Sp
3764 The pattern /PATTERN/ may be replaced with an expression to specify patterns
3765 that vary at runtime.
3766 (To do runtime compilation only once, use /$variable/o.)
3767 As a special case, specifying a space (\'\ \') will split on white space
3768 just as split with no arguments does, but leading white space does NOT
3769 produce a null first field.
3770 Thus, split(\'\ \') can be used to emulate
3771 .IR awk 's
3772 default behavior, whereas
3773 split(/\ /) will give you as many null initial fields as there are
3774 leading spaces.
3775 .Sp
3776 Example:
3777 .nf
3778
3779 .ne 5
3780         open(passwd, \'/etc/passwd\');
3781         while (<passwd>) {
3782 .ie t \{\
3783                 ($login, $passwd, $uid, $gid, $gcos, $home, $shell) = split(\|/\|:\|/\|);
3784 'br\}
3785 .el \{\
3786                 ($login, $passwd, $uid, $gid, $gcos, $home, $shell)
3787                         = split(\|/\|:\|/\|);
3788 'br\}
3789                 .\|.\|.
3790         }
3791
3792 .fi
3793 (Note that $shell above will still have a newline on it.  See chop().)
3794 See also
3795 .IR join .
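.Sp
The rules about trailing null fields, LIMIT and leading white space can be
seen side by side in this sketch (the strings and array names are made up):
.nf

```perl
    $line = 'a:b:c:::';
    @f = split(/:/, $line);        # ('a', 'b', 'c'); trailing nulls stripped
    @g = split(/:/, $line, 2);     # ('a', 'b:c:::')
    $text = '  leading space';
    @w = split(' ', $text);        # ('leading', 'space')
    @s = split(/ /, $text);        # ('', '', 'leading', 'space')
```

.fi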
3796 .Ip "sprintf(FORMAT,LIST)" 8 4
3797 Returns a string formatted by the usual printf conventions.
3798 The * character is not supported.
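.Sp
A brief sketch of a few common conversions (the format and values are
arbitrary):
.nf

```perl
    $s = sprintf('%-6s|%5.2f|%03d', 'ab', 3.14159, 7);
    # $s is 'ab    | 3.14|007'
```

.fi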
3799 .Ip "sqrt(EXPR)" 8 4
3800 .Ip "sqrt EXPR" 8
Returns the square root of EXPR.
3802 If EXPR is omitted, returns square root of $_.
3803 .Ip "srand(EXPR)" 8 4
3804 .Ip "srand EXPR" 8
3805 Sets the random number seed for the
3806 .I rand
3807 operator.
3808 If EXPR is omitted, does srand(time).
3809 .Ip "stat(FILEHANDLE)" 8 8
3810 .Ip "stat FILEHANDLE" 8
3811 .Ip "stat(EXPR)" 8
3812 .Ip "stat SCALARVARIABLE" 8
3813 Returns a 13-element array giving the statistics for a file, either the file
3814 opened via FILEHANDLE, or named by EXPR.
3815 Typically used as follows:
3816 .nf
3817
3818 .ne 3
3819     ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
3820        $atime,$mtime,$ctime,$blksize,$blocks)
3821            = stat($filename);
3822
3823 .fi
3824 If stat is passed the special filehandle consisting of an underline,
3825 no stat is done, but the current contents of the stat structure from
3826 the last stat or filetest are returned.
3827 Example:
3828 .nf
3829
3830 .ne 3
3831         if (-x $file && (($d) = stat(_)) && $d < 0) {
3832                 print "$file is executable NFS file\en";
3833         }
3834
3835 .fi
3836 .Ip "study(SCALAR)" 8 6
3837 .Ip "study SCALAR" 8
.Ip "study" 8
3839 Takes extra time to study SCALAR ($_ if unspecified) in anticipation of
3840 doing many pattern matches on the string before it is next modified.
3841 This may or may not save time, depending on the nature and number of patterns
3842 you are searching on, and on the distribution of character frequencies in
3843 the string to be searched\*(--you probably want to compare runtimes with and
3844 without it to see which runs faster.
3845 Those loops which scan for many short constant strings (including the constant
3846 parts of more complex patterns) will benefit most.
3847 You may have only one study active at a time\*(--if you study a different
3848 scalar the first is \*(L"unstudied\*(R".
3849 (The way study works is this: a linked list of every character in the string
3850 to be searched is made, so we know, for example, where all the \*(L'k\*(R' characters
3851 are.
3852 From each search string, the rarest character is selected, based on some
3853 static frequency tables constructed from some C programs and English text.
3854 Only those places that contain this \*(L"rarest\*(R" character are examined.)
3855 .Sp
For example, here is a loop which inserts index-producing entries before any line
3857 containing a certain pattern:
3858 .nf
3859
3860 .ne 8
3861         while (<>) {
3862                 study;
3863                 print ".IX foo\en" if /\ebfoo\eb/;
3864                 print ".IX bar\en" if /\ebbar\eb/;
3865                 print ".IX blurfl\en" if /\ebblurfl\eb/;
3866                 .\|.\|.
3867                 print;
3868         }
3869
3870 .fi
3871 In searching for /\ebfoo\eb/, only those locations in $_ that contain \*(L'f\*(R'
3872 will be looked at, because \*(L'f\*(R' is rarer than \*(L'o\*(R'.
3873 In general, this is a big win except in pathological cases.
3874 The only question is whether it saves you more time than it took to build
3875 the linked list in the first place.
3876 .Sp
3877 Note that if you have to look for strings that you don't know till runtime,
3878 you can build an entire loop as a string and eval that to avoid recompiling
3879 all your patterns all the time.
3880 Together with undefining $/ to input entire files as one record, this can
3881 be very fast, often faster than specialized programs like fgrep.
3882 The following scans a list of files (@files)
3883 for a list of words (@words), and prints out the names of those files that
3884 contain a match:
3885 .nf
3886
3887 .ne 12
3888         $search = \'while (<>) { study;\';
3889         foreach $word (@words) {
3890             $search .= "++\e$seen{\e$ARGV} if /\eb$word\eb/;\en";
3891         }
3892         $search .= "}";
3893         @ARGV = @files;
3894         undef $/;
3895         eval $search;           # this screams
3896         $/ = "\en";             # put back to normal input delim
3897         foreach $file (sort keys(%seen)) {
3898             print $file, "\en";
3899         }
3900
3901 .fi
3902 .Ip "substr(EXPR,OFFSET,LEN)" 8 2
3903 .Ip "substr(EXPR,OFFSET)" 8 2
3904 Extracts a substring out of EXPR and returns it.
3905 First character is at offset 0, or whatever you've set $[ to.
3906 If OFFSET is negative, starts that far from the end of the string.
3907 If LEN is omitted, returns everything to the end of the string.
3908 You can use the substr() function as an lvalue, in which case EXPR must
3909 be an lvalue.
3910 If you assign something shorter than LEN, the string will shrink, and
3911 if you assign something longer than LEN, the string will grow to accommodate it.
3912 To keep the string the same length you may need to pad or chop your value using
3913 sprintf().
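.Sp
The following sketch shows extraction, assignment, and one way of padding or
chopping with sprintf() to preserve the length.
It assumes $[ is 0, and the strings are only examples:
.nf

```perl
    $s = 'Hello, world';
    $w = substr($s, 7);             # 'world'
    $t = substr($s, -5, 3);         # 'wor'
    substr($s, 0, 5) = 'Bye';       # $s shrinks to 'Bye, world'
    substr($s, 0, 3) = sprintf('%-3.3s', 'Goodnight');
    # exactly 3 characters assigned, so $s is 'Goo, world'
```

.fi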
3914 .Ip "symlink(OLDFILE,NEWFILE)" 8 2
3915 Creates a new filename symbolically linked to the old filename.
3916 Returns 1 for success, 0 otherwise.
3917 On systems that don't support symbolic links, produces a fatal error at
3918 run time.
3919 To check for that, use eval:
3920 .nf
3921
3922         $symlink_exists = (eval \'symlink("","");\', $@ eq \'\');
3923
3924 .fi
3925 .Ip "syscall(LIST)" 8 6
3926 .Ip "syscall LIST" 8
3927 Calls the system call specified as the first element of the list, passing
3928 the remaining elements as arguments to the system call.
3929 If unimplemented, produces a fatal error.
3930 The arguments are interpreted as follows: if a given argument is numeric,
3931 the argument is passed as an int.
3932 If not, the pointer to the string value is passed.
You are responsible for making sure that a string is pre-extended long enough
3934 to receive any result that might be written into a string.
3935 If your integer arguments are not literals and have never been interpreted
3936 in a numeric context, you may need to add 0 to them to force them to look
3937 like numbers.
3938 .nf
3939
3940         require 'syscall.ph';           # may need to run h2ph
3941         syscall(&SYS_write, fileno(STDOUT), "hi there\en", 9);
3942
3943 .fi
3944 .Ip "sysread(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
3945 .Ip "sysread(FILEHANDLE,SCALAR,LENGTH)" 8 5
3946 Attempts to read LENGTH bytes of data into variable SCALAR from the specified
3947 FILEHANDLE, using the system call read(2).
3948 It bypasses stdio, so mixing this with other kinds of reads may cause
3949 confusion.
3950 Returns the number of bytes actually read, or undef if there was an error.
3951 SCALAR will be grown or shrunk to the length actually read.
3952 An OFFSET may be specified to place the read data at some other place
3953 than the beginning of the string.
3954 .Ip "system(LIST)" 8 6
3955 .Ip "system LIST" 8
3956 Does exactly the same thing as \*(L"exec LIST\*(R" except that a fork
3957 is done first, and the parent process waits for the child process to complete.
3958 Note that argument processing varies depending on the number of arguments.
3959 The return value is the exit status of the program as returned by the wait()
3960 call.
3961 To get the actual exit value divide by 256.
3962 See also
3963 .IR exec .
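.Sp
A sketch of recovering the exit value, assuming a Bourne shell named sh can
be found along your PATH (the variable names are illustrative):
.nf

```perl
    $status = system('sh', '-c', 'exit 3');    # wait() status: 768
    $exit = $status / 256;                      # actual exit value: 3
```

.fi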
3964 .Ip "syswrite(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
3965 .Ip "syswrite(FILEHANDLE,SCALAR,LENGTH)" 8 5
3966 Attempts to write LENGTH bytes of data from variable SCALAR to the specified
3967 FILEHANDLE, using the system call write(2).
3968 It bypasses stdio, so mixing this with prints may cause
3969 confusion.
3970 Returns the number of bytes actually written, or undef if there was an error.
An OFFSET may be specified to take the data to be written from some other place
than the beginning of the string.
3973 .Ip "tell(FILEHANDLE)" 8 6
3974 .Ip "tell FILEHANDLE" 8 6
3975 .Ip "tell" 8
3976 Returns the current file position for FILEHANDLE.
3977 FILEHANDLE may be an expression whose value gives the name of the actual
3978 filehandle.
3979 If FILEHANDLE is omitted, assumes the file last read.
3980 .Ip "telldir(DIRHANDLE)" 8 5
3981 .Ip "telldir DIRHANDLE" 8
3982 Returns the current position of the readdir() routines on DIRHANDLE.
3983 Value may be given to seekdir() to access a particular location in
3984 a directory.
3985 Has the same caveats about possible directory compaction as the corresponding
3986 system library routine.
3987 .Ip "time" 8 4
3988 Returns the number of non-leap seconds since 00:00:00 UTC, January 1, 1970.
3989 Suitable for feeding to gmtime() and localtime().
3990 .Ip "times" 8 4
3991 Returns a four-element array giving the user and system times, in seconds, for this
3992 process and the children of this process.
3993 .Sp
3994     ($user,$system,$cuser,$csystem) = times;
3995 .Sp
3996 .Ip "tr/SEARCHLIST/REPLACEMENTLIST/cds" 8 5
3997 .Ip "y/SEARCHLIST/REPLACEMENTLIST/cds" 8
Translates all occurrences of the characters found in the search list into
the corresponding characters in the replacement list.
4000 It returns the number of characters replaced or deleted.
4001 If no string is specified via the =~ or !~ operator,
4002 the $_ string is translated.
4003 (The string specified with =~ must be a scalar variable, an array element,
4004 or an assignment to one of those, i.e. an lvalue.)
4005 For
4006 .I sed
4007 devotees,
4008 .I y
4009 is provided as a synonym for
4010 .IR tr .
4011 .Sp
4012 If the c modifier is specified, the SEARCHLIST character set is complemented.
4013 If the d modifier is specified, any characters specified by SEARCHLIST that
4014 are not found in REPLACEMENTLIST are deleted.
4015 (Note that this is slightly more flexible than the behavior of some
4016 .I tr
4017 programs, which delete anything they find in the SEARCHLIST, period.)
4018 If the s modifier is specified, sequences of characters that were translated
4019 to the same character are squashed down to 1 instance of the character.
4020 .Sp
4021 If the d modifier was used, the REPLACEMENTLIST is always interpreted exactly
4022 as specified.
4023 Otherwise, if the REPLACEMENTLIST is shorter than the SEARCHLIST,
4024 the final character is replicated till it is long enough.
4025 If the REPLACEMENTLIST is null, the SEARCHLIST is replicated.
4026 This latter is useful for counting characters in a class, or for squashing
4027 character sequences in a class.
4028 .Sp
4029 Examples:
4030 .nf
4031
4032     $ARGV[1] \|=~ \|y/A\-Z/a\-z/;       \h'|3i'# canonicalize to lower case
4033
4034     $cnt = tr/*/*/;             \h'|3i'# count the stars in $_
4035
4036     $cnt = tr/0\-9//;           \h'|3i'# count the digits in $_
4037
4038     tr/a\-zA\-Z//s;     \h'|3i'# bookkeeper \-> bokeper
4039
4040     ($HOST = $host) =~ tr/a\-z/A\-Z/;
4041
4042     y/a\-zA\-Z/ /cs;    \h'|3i'# change non-alphas to single space
4043
4044     tr/\e200\-\e377/\e0\-\e177/;\h'|3i'# delete 8th bit
4045
4046 .fi
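.Sp
The return value and the d modifier can be sketched together like this
(the string and variable names are made up):
.nf

```perl
    $_ = 'hello, world';
    $n = tr/a-z//;                     # $n is 10; $_ is unchanged
    ($up = $_) =~ tr/a-z/A-Z/;         # $up is 'HELLO, WORLD'
    ($nov = $_) =~ tr/aeiou//d;        # $nov is 'hll, wrld'
```

.fi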
4047 .Ip "truncate(FILEHANDLE,LENGTH)" 8 4
4048 .Ip "truncate(EXPR,LENGTH)" 8
4049 Truncates the file opened on FILEHANDLE, or named by EXPR, to the specified
4050 length.
4051 Produces a fatal error if truncate isn't implemented on your system.
4052 .Ip "umask(EXPR)" 8 4
4053 .Ip "umask EXPR" 8
4054 .Ip "umask" 8
4055 Sets the umask for the process and returns the old one.
4056 If EXPR is omitted, merely returns current umask.
4057 .Ip "undef(EXPR)" 8 6
4058 .Ip "undef EXPR" 8
4059 .Ip "undef" 8
4060 Undefines the value of EXPR, which must be an lvalue.
4061 Use only on a scalar value, an entire array, or a subroutine name (using &).
4062 (Undef will probably not do what you expect on most predefined variables or
4063 dbm array values.)
4064 Always returns the undefined value.
4065 You can omit the EXPR, in which case nothing is undefined, but you still
4066 get an undefined value that you could, for instance, return from a subroutine.
4067 Examples:
4068 .nf
4069
4070 .ne 6
4071         undef $foo;
4072         undef $bar{'blurfl'};
4073         undef @ary;
4074         undef %assoc;
4075         undef &mysub;
4076         return (wantarray ? () : undef) if $they_blew_it;
4077
4078 .fi
4079 .Ip "unlink(LIST)" 8 4
4080 .Ip "unlink LIST" 8
4081 Deletes a list of files.
4082 Returns the number of files successfully deleted.
4083 .nf
4084
4085 .ne 2
4086         $cnt = unlink \'a\', \'b\', \'c\';
4087         unlink @goners;
4088         unlink <*.bak>;
4089
4090 .fi
4091 Note: unlink will not delete directories unless you are superuser and the
4092 .B \-U
4093 flag is supplied to
4094 .IR perl .
4095 Even if these conditions are met, be warned that unlinking a directory
4096 can inflict damage on your filesystem.
4097 Use rmdir instead.
4098 .Ip "unpack(TEMPLATE,EXPR)" 8 4
4099 Unpack does the reverse of pack: it takes a string representing
4100 a structure and expands it out into an array value, returning the array
4101 value.
4102 (In a scalar context, it merely returns the first value produced.)
4103 The TEMPLATE has the same format as in the pack function.
4104 Here's a subroutine that does substring:
4105 .nf
4106
4107 .ne 4
4108         sub substr {
4109                 local($what,$where,$howmuch) = @_;
4110                 unpack("x$where a$howmuch", $what);
4111         }
4112
4113 .ne 3
4114 and then there's
4115
4116         sub ord { unpack("c",$_[0]); }
4117
4118 .fi
4119 In addition, you may prefix a field with a %<number> to indicate that
4120 you want a <number>-bit checksum of the items instead of the items themselves.
4121 Default is a 16-bit checksum.
4122 For example, the following computes the same number as the System V sum program:
4123 .nf
4124
4125 .ne 4
4126         while (<>) {
4127             $checksum += unpack("%16C*", $_);
4128         }
4129         $checksum %= 65536;
4130
4131 .fi
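.Sp
As a further sketch, here is a round trip through pack and unpack, followed
by a small 8-bit checksum (the template and values are arbitrary):
.nf

```perl
    $rec = pack('A3 n C', 'abc', 258, 65);
    ($tag, $num, $byte) = unpack('A3 n C', $rec);    # ('abc', 258, 65)
    $sum = unpack('%8C*', 'AB');                     # (65 + 66) % 256 is 131
```

.fi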
4132 .Ip "unshift(ARRAY,LIST)" 8 4
4133 Does the opposite of a
4134 .IR shift .
4135 Or the opposite of a
4136 .IR push ,
4137 depending on how you look at it.
4138 Prepends list to the front of the array, and returns the number of elements
4139 in the new array.
4140 .nf
4141
4142         unshift(ARGV, \'\-e\') unless $ARGV[0] =~ /^\-/;
4143
4144 .fi
4145 .Ip "utime(LIST)" 8 2
4146 .Ip "utime LIST" 8 2
4147 Changes the access and modification times on each file of a list of files.
4148 The first two elements of the list must be the NUMERICAL access and
4149 modification times, in that order.
4150 Returns the number of files successfully changed.
4151 The inode modification time of each file is set to the current time.
4152 Example of a \*(L"touch\*(R" command:
4153 .nf
4154
4155 .ne 3
4156         #!/usr/bin/perl
4157         $now = time;
4158         utime $now, $now, @ARGV;
4159
4160 .fi
4161 .Ip "values(ASSOC_ARRAY)" 8 6
4162 .Ip "values ASSOC_ARRAY" 8
4163 Returns a normal array consisting of all the values of the named associative
4164 array.
4165 The values are returned in an apparently random order, but it is the same order
4166 as either the keys() or each() function would produce on the same array.
4167 See also keys() and each().
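.Sp
The shared ordering means that parallel arrays built from keys() and values()
line up, provided the associative array is not changed in between.
A sketch with made-up data:
.nf

        %age = ("tom", 30, "ann", 25, "bob", 41);   # sample data
        @k = keys(%age);
        @v = values(%age);
        for ($i = 0; $i < @k; $i++) {
            # $k[$i] and $v[$i] describe the same entry
            die "order mismatch" unless $age{$k[$i]} == $v[$i];
        }

.fi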
4168 .Ip "vec(EXPR,OFFSET,BITS)" 8 2
4169 Treats a string as a vector of unsigned integers, and returns the value
4170 of the bitfield specified.
4171 May also be assigned to.
4172 BITS must be a power of two from 1 to 32.
4173 .Sp
4174 Vectors created with vec() can also be manipulated with the logical operators
4175 |, & and ^,
4176 which will assume a bit vector operation is desired when both operands are
4177 strings.
4178 This interpretation is not enabled unless there is at least one vec() in
4179 your program, to protect older programs.
4180 .Sp
4181 To transform a bit vector into a string or array of 0's and 1's, use these:
4182 .nf
4183
4184         $bits = unpack("b*", $vector);
4185         @bits = split(//, unpack("b*", $vector));
4186
4187 .fi
4188 If you know the exact length in bits, it can be used in place of the *.
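.Sp
Here is a small sketch of building two bit vectors and combining them;
the variable names are arbitrary:
.nf

        $vector = "";
        vec($vector, 0, 1) = 1;             # set bit 0
        vec($vector, 3, 1) = 1;             # set bit 3
        $other = "";
        vec($other, 1, 1) = 1;              # set bit 1
        $union = $vector | $other;          # both operands are strings
        $bits = unpack("b*", $union);       # "11010000"
        $third = vec($union, 3, 1);         # 1

.fi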
4189 .Ip "wait" 8 6
4190 Waits for a child process to terminate and returns the pid of the deceased
4191 process, or -1 if there are no child processes.
4192 The status is returned in $?.
4193 .Ip "waitpid(PID,FLAGS)" 8 6
4194 Waits for a particular child process to terminate and returns the pid of the deceased
4195 process, or -1 if there is no such child process.
4196 The status is returned in $?.
If you say
.nf

        require "sys/wait.ph";
        .\|.\|.
        waitpid(-1,&WNOHANG);

.fi
4205 then you can do a non-blocking wait for any process.  Non-blocking wait
4206 is only available on machines supporting either the
4207 .I waitpid (2)
4208 or
4209 .I wait4 (2)
4210 system calls.
4211 However, waiting for a particular pid with FLAGS of 0 is implemented
4212 everywhere.  (Perl emulates the system call by remembering the status
4213 values of processes that have exited but have not been harvested by the
4214 Perl script yet.)
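.Sp
A minimal sketch of waiting for one particular child with FLAGS of 0
(the exit value of 3 is arbitrary):
.nf

        if ($pid = fork) {
            $dead = waitpid($pid, 0);       # blocks until that child exits
            $status = $? >> 8;              # exit value of the child: 3
        }
        elsif (defined($pid)) {
            exit 3;                         # in the child
        }
        else {
            die "can't fork: $!";
        }

.fi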
4215 .Ip "wantarray" 8 4
4216 Returns true if the context of the currently executing subroutine
4217 is looking for an array value.
4218 Returns false if the context is looking for a scalar.
4219 .nf
4220
4221         return wantarray ? () : undef;
4222
4223 .fi
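A subroutine can use this to return something different in each context.
A sketch, with a hypothetical subroutine name:
.nf

        sub context {
            wantarray ? "array" : "scalar";
        }
        @result = &context();       # ("array")
        $result = &context();       # "scalar"

.fi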
4224 .Ip "warn(LIST)" 8 4
4225 .Ip "warn LIST" 8
4226 Produces a message on STDERR just like \*(L"die\*(R", but doesn't exit.
4227 .Ip "write(FILEHANDLE)" 8 6
4228 .Ip "write(EXPR)" 8
4229 .Ip "write" 8
4230 Writes a formatted record (possibly multi-line) to the specified file,
4231 using the format associated with that file.
By default the format for a file is the one having the same name as the
filehandle, but the format for the current output channel (see
4234 .IR select )
4235 may be set explicitly
4236 by assigning the name of the format to the $~ variable.
4237 .Sp
4238 Top of form processing is handled automatically:
4239 if there is insufficient room on the current page for the formatted
4240 record, the page is advanced by writing a form feed,
4241 a special top-of-page format is used
4242 to format the new page header, and then the record is written.
4243 By default the top-of-page format is \*(L"top\*(R", but it
4244 may be set to the
4245 format of your choice by assigning the name to the $^ variable.
4246 The number of lines remaining on the current page is in variable $-, which
4247 can be set to 0 to force a new page.
4248 .Sp
4249 If FILEHANDLE is unspecified, output goes to the current default output channel,
4250 which starts out as
4251 .I STDOUT
4252 but may be changed by the
4253 .I select
4254 operator.
4255 If the FILEHANDLE is an EXPR, then the expression is evaluated and the
4256 resulting string is used to look up the name of the FILEHANDLE at run time.
4257 For more on formats, see the section on formats later on.
4258 .Sp
4259 Note that write is NOT the opposite of read.
4260 ''' Beginning of part 4
4261 ''' $Header: perl.man,v 4.0 91/03/20 01:38:08 lwall Locked $
4262 '''
4263 ''' $Log:       perl.man,v $
4264 ''' Revision 4.0  91/03/20  01:38:08  lwall
4265 ''' 4.0 baseline.
4266 '''
4267 ''' Revision 3.0.1.14  91/01/11  18:18:53  lwall
4268 ''' patch42: started an addendum and errata section in the man page
4269 '''
4270 ''' Revision 3.0.1.13  90/11/10  01:51:00  lwall
4271 ''' patch38: random cleanup
4272 '''
4273 ''' Revision 3.0.1.12  90/10/20  02:15:43  lwall
4274 ''' patch37: patch37: fixed various typos in man page
4275 '''
4276 ''' Revision 3.0.1.11  90/10/16  10:04:28  lwall
4277 ''' patch29: added @###.## fields to format
4278 '''
4279 ''' Revision 3.0.1.10  90/08/09  04:47:35  lwall
4280 ''' patch19: added require operator
4281 ''' patch19: added numeric interpretation of $]
4282 '''
4283 ''' Revision 3.0.1.9  90/08/03  11:15:58  lwall
4284 ''' patch19: Intermediate diffs for Randal
4285 '''
4286 ''' Revision 3.0.1.8  90/03/27  16:19:31  lwall
4287 ''' patch16: MSDOS support
4288 '''
4289 ''' Revision 3.0.1.7  90/03/14  12:29:50  lwall
4290 ''' patch15: man page falsely states that you can't subscript array values
4291 '''
4292 ''' Revision 3.0.1.6  90/03/12  16:54:04  lwall
4293 ''' patch13: improved documentation of *name
4294 '''
4295 ''' Revision 3.0.1.5  90/02/28  18:01:52  lwall
4296 ''' patch9: $0 is now always the command name
4297 '''
4298 ''' Revision 3.0.1.4  89/12/21  20:12:39  lwall
4299 ''' patch7: documented that package'filehandle works as well as $package'variable
4300 ''' patch7: documented which identifiers are always in package main
4301 '''
4302 ''' Revision 3.0.1.3  89/11/17  15:32:25  lwall
4303 ''' patch5: fixed some manual typos and indent problems
4304 ''' patch5: clarified difference between $! and $@
4305 '''
4306 ''' Revision 3.0.1.2  89/11/11  04:46:40  lwall
4307 ''' patch2: made some line breaks depend on troff vs. nroff
4308 ''' patch2: clarified operation of ^ and $ when $* is false
4309 '''
4310 ''' Revision 3.0.1.1  89/10/26  23:18:43  lwall
4311 ''' patch1: documented the desirability of unnecessary parentheses
4312 '''
4313 ''' Revision 3.0  89/10/18  15:21:55  lwall
4314 ''' 3.0 baseline
4315 '''
4316 .Sh "Precedence"
4317 .I Perl
4318 operators have the following associativity and precedence:
4319 .nf
4320
4321 nonassoc\h'|1i'print printf exec system sort reverse
4322 \h'1.5i'chmod chown kill unlink utime die return
4323 left\h'|1i',
4324 right\h'|1i'= += \-= *= etc.
4325 right\h'|1i'?:
4326 nonassoc\h'|1i'.\|.
4327 left\h'|1i'||
4328 left\h'|1i'&&
4329 left\h'|1i'| ^
4330 left\h'|1i'&
4331 nonassoc\h'|1i'== != <=> eq ne cmp
4332 nonassoc\h'|1i'< > <= >= lt gt le ge
4333 nonassoc\h'|1i'chdir exit eval reset sleep rand umask
4334 nonassoc\h'|1i'\-r \-w \-x etc.
4335 left\h'|1i'<< >>
4336 left\h'|1i'+ \- .
4337 left\h'|1i'* / % x
4338 left\h'|1i'=~ !~
4339 right\h'|1i'! ~ and unary minus
4340 right\h'|1i'**
4341 nonassoc\h'|1i'++ \-\|\-
4342 left\h'|1i'\*(L'(\*(R'
4343
4344 .fi
4345 As mentioned earlier, if any list operator (print, etc.) or
4346 any unary operator (chdir, etc.)
4347 is followed by a left parenthesis as the next token on the same line,
4348 the operator and arguments within parentheses are taken to
4349 be of highest precedence, just like a normal function call.
4350 Examples:
4351 .nf
4352
4353         chdir $foo || die;\h'|3i'# (chdir $foo) || die
4354         chdir($foo) || die;\h'|3i'# (chdir $foo) || die
4355         chdir ($foo) || die;\h'|3i'# (chdir $foo) || die
4356         chdir +($foo) || die;\h'|3i'# (chdir $foo) || die
4357
4358 but, because * is higher precedence than ||:
4359
4360         chdir $foo * 20;\h'|3i'# chdir ($foo * 20)
4361         chdir($foo) * 20;\h'|3i'# (chdir $foo) * 20
4362         chdir ($foo) * 20;\h'|3i'# (chdir $foo) * 20
4363         chdir +($foo) * 20;\h'|3i'# chdir ($foo * 20)
4364
4365         rand 10 * 20;\h'|3i'# rand (10 * 20)
4366         rand(10) * 20;\h'|3i'# (rand 10) * 20
4367         rand (10) * 20;\h'|3i'# (rand 10) * 20
4368         rand +(10) * 20;\h'|3i'# rand (10 * 20)
4369
4370 .fi
In the absence of parentheses,
the precedence of list operators such as print, sort or chmod is
either very high or very low depending on whether you look at the left
side of the operator or the right side of it.
4375 For example, in
4376 .nf
4377
4378         @ary = (1, 3, sort 4, 2);
4379         print @ary;             # prints 1324
4380
4381 .fi
4382 the commas on the right of the sort are evaluated before the sort, but
4383 the commas on the left are evaluated after.
4384 In other words, list operators tend to gobble up all the arguments that
4385 follow them, and then act like a simple term with regard to the preceding
4386 expression.
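.Sp
Adding parentheses changes how much the list operator gobbles.
A sketch contrasting the two forms (the variable names are arbitrary):
.nf

        @greedy = (1, 3, sort 4, 2, 0);         # sort takes 4, 2, 0
        @fenced = (1, 3, sort(4, 2), 0);        # sort takes only 4, 2
        $greedy_str = join("", @greedy);        # "13024"
        $fenced_str = join("", @fenced);        # "13240"

.fi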
4387 Note that you have to be careful with parens:
4388 .nf
4389
4390 .ne 3
4391         # These evaluate exit before doing the print:
4392         print($foo, exit);      # Obviously not what you want.
4393         print $foo, exit;       # Nor is this.
4394
4395 .ne 4
4396         # These do the print before evaluating exit:
4397         (print $foo), exit;     # This is what you want.
4398         print($foo), exit;      # Or this.
4399         print ($foo), exit;     # Or even this.
4400
4401 Also note that
4402
4403         print ($foo & 255) + 1, "\en";
4404
4405 .fi
4406 probably doesn't do what you expect at first glance.
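.Sp
The left parenthesis following print makes ($foo & 255) the whole argument
list, so the + 1 is applied to the value returned by print and the result is
discarded.
Wrapping the entire list in its own parentheses is one way to say what was
probably meant.
A sketch, with an arbitrary value for $foo:
.nf

        $foo = 511;
        $val = ($foo & 255) + 1;        # 256
        print(($foo & 255) + 1);        # prints 256

.fi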
4407 .Sh "Subroutines"
4408 A subroutine may be declared as follows:
4409 .nf
4410
4411     sub NAME BLOCK
4412
4413 .fi
4414 .PP
4415 Any arguments passed to the routine come in as array @_,
4416 that is ($_[0], $_[1], .\|.\|.).
4417 The array @_ is a local array, but its values are references to the
4418 actual scalar parameters.
4419 The return value of the subroutine is the value of the last expression
4420 evaluated, and can be either an array value or a scalar value.
Alternatively, a return statement may be used to specify the returned value and
exit the subroutine.
4423 To create local variables see the
4424 .I local
4425 operator.
4426 .PP
4427 A subroutine is called using the
4428 .I do
4429 operator or the & operator.
4430 .nf
4431
4432 .ne 12
4433 Example:
4434
4435         sub MAX {
4436                 local($max) = pop(@_);
4437                 foreach $foo (@_) {
4438                         $max = $foo \|if \|$max < $foo;
4439                 }
4440                 $max;
4441         }
4442
4443         .\|.\|.
4444         $bestday = &MAX($mon,$tue,$wed,$thu,$fri);
4445
4446 .ne 21
4447 Example:
4448
4449         # get a line, combining continuation lines
4450         #  that start with whitespace
4451         sub get_line {
4452                 $thisline = $lookahead;
4453                 line: while ($lookahead = <STDIN>) {
4454                         if ($lookahead \|=~ \|/\|^[ \^\e\|t]\|/\|) {
4455                                 $thisline \|.= \|$lookahead;
4456                         }
4457                         else {
4458                                 last line;
4459                         }
4460                 }
4461                 $thisline;
4462         }
4463
4464         $lookahead = <STDIN>;   # get first line
4465         while ($_ = do get_line(\|)) {
4466                 .\|.\|.
4467         }
4468
4469 .fi
4470 .nf
4471 .ne 6
4472 Use array assignment to a local list to name your formal arguments:
4473
4474         sub maybeset {
4475                 local($key, $value) = @_;
4476                 $foo{$key} = $value unless $foo{$key};
4477         }
4478
4479 .fi
4480 This also has the effect of turning call-by-reference into call-by-value,
4481 since the assignment copies the values.
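.Sp
A sketch of the difference, using two hypothetical subroutines:
.nf

        sub bump {
            $_[0]++;                    # changes the caller's variable
        }
        sub bump_copy {
            local($n) = @_;
            $n++;                       # changes only the local copy
        }
        $x = 5;
        &bump($x);                      # $x is now 6
        &bump_copy($x);                 # $x is still 6

.fi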
4482 .Sp
4483 Subroutines may be called recursively.
4484 If a subroutine is called using the & form, the argument list is optional.
If omitted, no @_ array is set up for the subroutine; the @_ array at the
time of the call is visible to the subroutine instead.
4487 .nf
4488
4489         do foo(1,2,3);          # pass three arguments
4490         &foo(1,2,3);            # the same
4491
4492         do foo();               # pass a null list
4493         &foo();                 # the same
4494         &foo;                   # pass no arguments\*(--more efficient
4495
4496 .fi
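Because the & form without an argument list leaves the caller's @_ in place,
it can be used to hand the current arguments along to another subroutine.
A sketch with hypothetical names:
.nf

        sub shout {
            $_[0] . "!";                # sees whatever @_ is visible
        }
        sub relay {
            &shout;                     # no list: shout sees relay's @_
        }
        $r = &relay("hi");              # "hi!"

.fi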
4497 .Sh "Passing By Reference"
4498 Sometimes you don't want to pass the value of an array to a subroutine but
4499 rather the name of it, so that the subroutine can modify the global copy
4500 of it rather than working with a local copy.
4501 In perl you can refer to all the objects of a particular name by prefixing
4502 the name with a star: *foo.
4503 When evaluated, it produces a scalar value that represents all the objects
4504 of that name, including any filehandle, format or subroutine.
4505 When assigned to within a local() operation, it causes the name mentioned
4506 to refer to whatever * value was assigned to it.
4507 Example:
4508 .nf
4509
4510         sub doubleary {
4511             local(*someary) = @_;
4512             foreach $elem (@someary) {
4513                 $elem *= 2;
4514             }
4515         }
4516         do doubleary(*foo);
4517         do doubleary(*bar);
4518
4519 .fi
4520 Assignment to *name is currently recommended only inside a local().
4521 You can actually assign to *name anywhere, but the previous referent of
4522 *name may be stranded forever.
4523 This may or may not bother you.
4524 .Sp
4525 Note that scalars are already passed by reference, so you can modify scalar
4526 arguments without using this mechanism by referring explicitly to the $_[nnn]
4527 in question.
4528 You can modify all the elements of an array by passing all the elements
4529 as scalars, but you have to use the * mechanism to push, pop or change the
4530 size of an array.
4531 The * mechanism will probably be more efficient in any case.
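.Sp
For example, here is a sketch of a subroutine that changes the size of the
caller's array (the names are arbitrary):
.nf

        sub append_sum {
            local(*ary) = @_;
            local($sum) = 0;
            foreach $elem (@ary) {
                $sum += $elem;
            }
            push(@ary, $sum);           # grows the caller's array
        }
        @nums = (1, 2, 3);
        &append_sum(*nums);             # @nums is now (1, 2, 3, 6)

.fi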
4532 .Sp
4533 Since a *name value contains unprintable binary data, if it is used as
4534 an argument in a print, or as a %s argument in a printf or sprintf, it
4535 then has the value '*name', just so it prints out pretty.
4536 .Sp
4537 Even if you don't want to modify an array, this mechanism is useful for
4538 passing multiple arrays in a single LIST, since normally the LIST mechanism
4539 will merge all the array values so that you can't extract out the
4540 individual arrays.
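.Sp
A sketch of passing two arrays without having them merge (the subroutine
and array names are hypothetical):
.nf

        sub longer {
            local(*left, *right) = @_;
            @left > @right ? "first" : "second";
        }
        @odds = (1, 3, 5);
        @evens = (2, 4);
        $which = &longer(*odds, *evens);        # "first"

.fi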
4541 .Sh "Regular Expressions"
4542 The patterns used in pattern matching are regular expressions such as
4543 those supplied in the Version 8 regexp routines.
4544 (In fact, the routines are derived from Henry Spencer's freely redistributable
4545 reimplementation of the V8 routines.)
4546 In addition, \ew matches an alphanumeric character (including \*(L"_\*(R") and \eW a nonalphanumeric.
4547 Word boundaries may be matched by \eb, and non-boundaries by \eB.
4548 A whitespace character is matched by \es, non-whitespace by \eS.
4549 A numeric character is matched by \ed, non-numeric by \eD.
4550 You may use \ew, \es and \ed within character classes.
4551 Also, \en, \er, \ef, \et and \eNNN have their normal interpretations.
4552 Within character classes \eb represents backspace rather than a word boundary.
4553 Alternatives may be separated by |.
4554 The bracketing construct \|(\ .\|.\|.\ \|) may also be used, in which case \e<digit>
4555 matches the digit'th substring.
4556 (Outside of the pattern, always use $ instead of \e in front of the digit.
4557 The scope of $<digit> (and $\`, $& and $\')
4558 extends to the end of the enclosing BLOCK or eval string, or to
4559 the next pattern match with subexpressions.
4560 The \e<digit> notation sometimes works outside the current pattern, but should
4561 not be relied upon.)
4562 You may have as many parentheses as you wish.  If you have more than 9
4563 substrings, the variables $10, $11, ... refer to the corresponding
4564 substring.  Within the pattern, \e10, \e11,
4565 etc. refer back to substrings if there have been at least that many left parens
before the backreference.  Otherwise (for backward compatibility) \e10
4567 is the same as \e010, a backspace,
4568 and \e11 the same as \e011, a tab.
4569 And so on.
4570 (\e1 through \e9 are always backreferences.)
4571 .PP
4572 $+ returns whatever the last bracket match matched.
4573 $& returns the entire matched string.
4574 ($0 used to return the same thing, but not any more.)
4575 $\` returns everything before the matched string.
4576 $\' returns everything after the matched string.
4577 Examples:
4578 .nf
4579
4580         s/\|^\|([^ \|]*\|) \|*([^ \|]*\|)\|/\|$2 $1\|/; # swap first two words
4581
4582 .ne 5
4583         if (/\|Time: \|(.\|.\|):\|(.\|.\|):\|(.\|.\|)\|/\|) {
4584                 $hours = $1;
4585                 $minutes = $2;
4586                 $seconds = $3;
4587         }
4588
4589 .fi
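The match variables can be seen together in a sketch like this one
(the input string is made up):
.nf

        $_ = "key=value";
        if (/^([^=]*)=(.*)$/) {
            $name = $1;                 # "key"
            $val = $2;                  # "value"
            $last = $+;                 # "value", the last bracket match
            $all = $&;                  # "key=value"
        }

.fi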
4590 By default, the ^ character is only guaranteed to match at the beginning
4591 of the string,
4592 the $ character only at the end (or before the newline at the end)
4593 and
4594 .I perl
4595 does certain optimizations with the assumption that the string contains
4596 only one line.
4597 The behavior of ^ and $ on embedded newlines will be inconsistent.
4598 You may, however, wish to treat a string as a multi-line buffer, such that
4599 the ^ will match after any newline within the string, and $ will match
4600 before any newline.
4601 At the cost of a little more overhead, you can do this by setting the variable
4602 $* to 1.
4603 Setting it back to 0 makes
4604 .I perl
4605 revert to its old behavior.
4606 .PP
4607 To facilitate multi-line substitutions, the . character never matches a newline
4608 (even when $* is 0).
4609 In particular, the following leaves a newline on the $_ string:
4610 .nf
4611
4612         $_ = <STDIN>;
4613         s/.*(some_string).*/$1/;
4614
4615 If the newline is unwanted, try one of
4616
4617         s/.*(some_string).*\en/$1/;
4618         s/.*(some_string)[^\e000]*/$1/;
4619         s/.*(some_string)(.|\en)*/$1/;
4620         chop; s/.*(some_string).*/$1/;
4621         /(some_string)/ && ($_ = $1);
4622
4623 .fi
4624 Any item of a regular expression may be followed with digits in curly brackets
4625 of the form {n,m}, where n gives the minimum number of times to match the item
4626 and m gives the maximum.
4627 The form {n} is equivalent to {n,n} and matches exactly n times.
4628 The form {n,} matches n or more times.
4629 (If a curly bracket occurs in any other context, it is treated as a regular
4630 character.)
4631 The * modifier is equivalent to {0,}, the + modifier to {1,} and the ? modifier
4632 to {0,1}.
4633 There is no limit to the size of n or m, but large numbers will chew up
4634 more memory.
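.Sp
A sketch of the counted forms (the sample strings are arbitrary):
.nf

        $ok = ("555-1234" =~ /^[0-9]{3}-[0-9]{4}$/);        # true
        $short = ("55-1234" =~ /^[0-9]{3}-[0-9]{4}$/);      # false
        $plus = ("aaa" =~ /^a+$/);                          # true
        $curly = ("aaa" =~ /^a{1,}$/);                      # true, same as a+

.fi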
4635 .Sp
4636 You will note that all backslashed metacharacters in
4637 .I perl
4638 are alphanumeric,
4639 such as \eb, \ew, \en.
4640 Unlike some other regular expression languages, there are no backslashed
4641 symbols that aren't alphanumeric.
4642 So anything that looks like \e\e, \e(, \e), \e<, \e>, \e{, or \e} is always
4643 interpreted as a literal character, not a metacharacter.
4644 This makes it simple to quote a string that you want to use for a pattern
4645 but that you are afraid might contain metacharacters.
4646 Simply quote all the non-alphanumeric characters:
4647 .nf
4648
4649         $pattern =~ s/(\eW)/\e\e$1/g;
4650
4651 .fi
4652 .Sh "Formats"
4653 Output record formats for use with the
4654 .I write
operator may be declared as follows:
4656 .nf
4657
4658 .ne 3
4659     format NAME =
4660     FORMLIST
4661     .
4662
4663 .fi
4664 If name is omitted, format \*(L"STDOUT\*(R" is defined.
4665 FORMLIST consists of a sequence of lines, each of which may be of one of three
4666 types:
4667 .Ip 1. 4
4668 A comment.
4669 .Ip 2. 4
4670 A \*(L"picture\*(R" line giving the format for one output line.
4671 .Ip 3. 4
4672 An argument line supplying values to plug into a picture line.
4673 .PP
4674 Picture lines are printed exactly as they look, except for certain fields
4675 that substitute values into the line.
4676 Each picture field starts with either @ or ^.
4677 The @ field (not to be confused with the array marker @) is the normal
4678 case; ^ fields are used
4679 to do rudimentary multi-line text block filling.
4680 The length of the field is supplied by padding out the field
4681 with multiple <, >, or | characters to specify, respectively, left justification,
4682 right justification, or centering.
4683 As an alternate form of right justification,
4684 you may also use # characters (with an optional .) to specify a numeric field.
4685 (Use of ^ instead of @ causes the field to be blanked if undefined.)
4686 If any of the values supplied for these fields contains a newline, only
4687 the text up to the newline is printed.
4688 The special field @* can be used for printing multi-line values.
4689 It should appear by itself on a line.
4690 .PP
4691 The values are specified on the following line, in the same order as
4692 the picture fields.
4693 The values should be separated by commas.
4694 .PP
4695 Picture fields that begin with ^ rather than @ are treated specially.
4696 The value supplied must be a scalar variable name which contains a text
4697 string.
4698 .I Perl
4699 puts as much text as it can into the field, and then chops off the front
4700 of the string so that the next time the variable is referenced,
4701 more of the text can be printed.
4702 Normally you would use a sequence of fields in a vertical stack to print
4703 out a block of text.
4704 If you like, you can end the final field with .\|.\|., which will appear in the
4705 output if the text was too long to appear in its entirety.
4706 You can change which characters are legal to break on by changing the
4707 variable $: to a list of the desired characters.
4708 .PP
4709 Since use of ^ fields can produce variable length records if the text to be
4710 formatted is short, you can suppress blank lines by putting the tilde (~)
4711 character anywhere in the line.
4712 (Normally you should put it in the front if possible, for visibility.)
4713 The tilde will be translated to a space upon output.
4714 If you put a second tilde contiguous to the first, the line will be repeated
4715 until all the fields on the line are exhausted.
4716 (If you use a field of the @ variety, the expression you supply had better
4717 not give the same value every time forever!)
4718 .PP
4719 Examples:
4720 .nf
4721 .lg 0
4722 .cs R 25
4723 .ft C
4724
4725 .ne 10
4726 # a report on the /etc/passwd file
4727 format top =
4728 \&                        Passwd File
4729 Name                Login    Office   Uid   Gid Home
4730 ------------------------------------------------------------------
4731 \&.
4732 format STDOUT =
4733 @<<<<<<<<<<<<<<<<<< @||||||| @<<<<<<@>>>> @>>>> @<<<<<<<<<<<<<<<<<
4734 $name,              $login,  $office,$uid,$gid, $home
4735 \&.
4736
4737 .ne 29
4738 # a report from a bug report form
4739 format top =
4740 \&                        Bug Reports
4741 @<<<<<<<<<<<<<<<<<<<<<<<     @|||         @>>>>>>>>>>>>>>>>>>>>>>>
4742 $system,                      $%,         $date
4743 ------------------------------------------------------------------
4744 \&.
4745 format STDOUT =
4746 Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4747 \&         $subject
4748 Index: @<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4749 \&       $index,                       $description
4750 Priority: @<<<<<<<<<< Date: @<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4751 \&          $priority,        $date,   $description
4752 From: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4753 \&      $from,                         $description
4754 Assigned to: @<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4755 \&             $programmer,            $description
4756 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4757 \&                                     $description
4758 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4759 \&                                     $description
4760 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4761 \&                                     $description
4762 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4763 \&                                     $description
4764 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<...
4765 \&                                     $description
4766 \&.
4767
4768 .ft R
4769 .cs R
4770 .lg
4771 .fi
4772 It is possible to intermix prints with writes on the same output channel,
4773 but you'll have to handle $\- (lines left on the page) yourself.
4774 .PP
4775 If you are printing lots of fields that are usually blank, you should consider
4776 using the reset operator between records.
4777 Not only is it more efficient, but it can prevent the bug of adding another
4778 field and forgetting to zero it.
4779 .Sh "Interprocess Communication"
4780 The IPC facilities of perl are built on the Berkeley socket mechanism.
4781 If you don't have sockets, you can ignore this section.
4782 The calls have the same names as the corresponding system calls,
4783 but the arguments tend to differ, for two reasons.
4784 First, perl file handles work differently than C file descriptors.
4785 Second, perl already knows the length of its strings, so you don't need
4786 to pass that information.
4787 Here is a sample client (untested):
4788 .nf
4789
4790         ($them,$port) = @ARGV;
4791         $port = 2345 unless $port;
4792         $them = 'localhost' unless $them;
4793
4794         $SIG{'INT'} = 'dokill';
4795         sub dokill { kill 9,$child if $child; }
4796
4797         require 'sys/socket.ph';
4798
4799         $sockaddr = 'S n a4 x8';
4800         chop($hostname = `hostname`);
4801
4802         ($name, $aliases, $proto) = getprotobyname('tcp');
4803         ($name, $aliases, $port) = getservbyname($port, 'tcp')
4804                 unless $port =~ /^\ed+$/;
4805 .ie t \{\
4806         ($name, $aliases, $type, $len, $thisaddr) = gethostbyname($hostname);
4807 'br\}
4808 .el \{\
4809         ($name, $aliases, $type, $len, $thisaddr) =
4810                                         gethostbyname($hostname);
4811 'br\}
4812         ($name, $aliases, $type, $len, $thataddr) = gethostbyname($them);
4813
4814         $this = pack($sockaddr, &AF_INET, 0, $thisaddr);
4815         $that = pack($sockaddr, &AF_INET, $port, $thataddr);
4816
4817         socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!";
4818         bind(S, $this) || die "bind: $!";
4819         connect(S, $that) || die "connect: $!";
4820
        select(S); $| = 1; select(STDOUT);
4822
4823         if ($child = fork) {
4824                 while (<>) {
4825                         print S;
4826                 }
4827                 sleep 3;
4828                 do dokill();
4829         }
4830         else {
4831                 while (<S>) {
4832                         print;
4833                 }
4834         }
4835
4836 .fi
4837 And here's a server:
4838 .nf
4839
4840         ($port) = @ARGV;
4841         $port = 2345 unless $port;
4842
4843         require 'sys/socket.ph';
4844
4845         $sockaddr = 'S n a4 x8';
4846
4847         ($name, $aliases, $proto) = getprotobyname('tcp');
4848         ($name, $aliases, $port) = getservbyname($port, 'tcp')
4849                 unless $port =~ /^\ed+$/;
4850
4851         $this = pack($sockaddr, &AF_INET, $port, "\e0\e0\e0\e0");
4852
        select(NS); $| = 1; select(STDOUT);
4854
4855         socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!";
4856         bind(S, $this) || die "bind: $!";
        listen(S, 5) || die "listen: $!";
4858
        select(S); $| = 1; select(STDOUT);
4860
4861         for (;;) {
4862                 print "Listening again\en";
4863                 ($addr = accept(NS,S)) || die $!;
4864                 print "accept ok\en";
4865
4866                 ($af,$port,$inetaddr) = unpack($sockaddr,$addr);
4867                 @inetaddr = unpack('C4',$inetaddr);
4868                 print "$af $port @inetaddr\en";
4869
4870                 while (<NS>) {
4871                         print;
4872                         print NS;
4873                 }
4874         }
4875
4876 .fi
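The $sockaddr template used by both programs can be tried out without
opening a socket.
A sketch, assuming that AF_INET is 2 (true on many systems, but check
sys/socket.ph on yours) and using a made-up address and port:
.nf

        $sockaddr = "S n a4 x8";
        $af_inet = 2;                           # assumption: value of AF_INET
        $addr = pack("C4", 127, 0, 0, 1);       # four address bytes
        $packed = pack($sockaddr, $af_inet, 2345, $addr);
        $len = length($packed);                 # 16 bytes in all
        ($af, $port, $raw) = unpack($sockaddr, $packed);
        @octets = unpack("C4", $raw);           # (127, 0, 0, 1)

.fi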
4877 .Sh "Predefined Names"
4878 The following names have special meaning to
4879 .IR perl .
4880 I could have used alphabetic symbols for some of these, but I didn't want
4881 to take the chance that someone would say reset \*(L"a\-zA\-Z\*(R" and wipe them all
4882 out.
4883 You'll just have to suffer along with these silly symbols.
4884 Most of them have reasonable mnemonics, or analogues in one of the shells.
4885 .Ip $_ 8
4886 The default input and pattern-searching space.
4887 The following pairs are equivalent:
4888 .nf
4889
4890 .ne 2
4891         while (<>) {\|.\|.\|.   # only equivalent in while!
4892         while ($_ = <>) {\|.\|.\|.
4893
4894 .ne 2
4895         /\|^Subject:/
4896         $_ \|=~ \|/\|^Subject:/
4897
4898 .ne 2
4899         y/a\-z/A\-Z/
4900         $_ =~ y/a\-z/A\-Z/
4901
4902 .ne 2
4903         chop
4904         chop($_)
4905
4906 .fi
4907 (Mnemonic: underline is understood in certain operations.)
4908 .Ip $. 8
4909 The current input line number of the last filehandle that was read.
4910 Readonly.
4911 Remember that only an explicit close on the filehandle resets the line number.
4912 Since <> never does an explicit close, line numbers increase across ARGV files
4913 (but see examples under eof).
4914 (Mnemonic: many programs use . to mean the current line number.)
4915 .Ip $/ 8
4916 The input record separator, newline by default.
4917 Works like
4918 .IR awk 's
4919 RS variable, including treating blank lines as delimiters
4920 if set to the null string.
You may set it to a multi-character string to match a multi-character
delimiter.
4923 (Mnemonic: / is used to delimit line boundaries when quoting poetry.)
4924 .Ip $, 8
4925 The output field separator for the print operator.
4926 Ordinarily the print operator simply prints out the comma separated fields
4927 you specify.
4928 In order to get behavior more like
4929 .IR awk ,
4930 set this variable as you would set
4931 .IR awk 's
4932 OFS variable to specify what is printed between fields.
4933 (Mnemonic: what is printed when there is a , in your print statement.)
4934 .Ip $"" 8
4935 This is like $, except that it applies to array values interpolated into
4936 a double-quoted string (or similar interpreted string).
4937 Default is a space.
4938 (Mnemonic: obvious, I think.)
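.Sp
A sketch of the effect (the array contents are arbitrary):
.nf

        @list = ("a", "b", "c");
        $spaced = "@list";              # "a b c" with the default separator
        $" = "-";
        $dashed = "@list";              # "a-b-c"
        $" = " ";                       # put the default back

.fi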
4939 .Ip $\e 8
4940 The output record separator for the print operator.
4941 Ordinarily the print operator simply prints out the comma separated fields
4942 you specify, with no trailing newline or record separator assumed.
4943 In order to get behavior more like
4944 .IR awk ,
4945 set this variable as you would set
4946 .IR awk 's
4947 ORS variable to specify what is printed at the end of the print.
4948 (Mnemonic: you set $\e instead of adding \en at the end of the print.
4949 Also, it's just like /, but it's what you get \*(L"back\*(R" from
4950 .IR perl .)
4951 .Ip $# 8
4952 The output format for printed numbers.
4953 This variable is a half-hearted attempt to emulate
4954 .IR awk 's
4955 OFMT variable.
4956 There are times, however, when
4957 .I awk
4958 and
4959 .I perl
4960 have differing notions of what
4961 is in fact numeric.
4962 Also, the initial value is %.20g rather than %.6g, so you need to set $#
4963 explicitly to get
4964 .IR awk 's
4965 value.
4966 (Mnemonic: # is the number sign.)
4967 .Ip $% 8
4968 The current page number of the currently selected output channel.
4969 (Mnemonic: % is page number in nroff.)
4970 .Ip $= 8
4971 The current page length (printable lines) of the currently selected output
4972 channel.
4973 Default is 60.
4974 (Mnemonic: = has horizontal lines.)
4975 .Ip $\- 8
4976 The number of lines left on the page of the currently selected output channel.
4977 (Mnemonic: lines_on_page \- lines_printed.)
4978 .Ip $~ 8
4979 The name of the current report format for the currently selected output
4980 channel.
4981 (Mnemonic: brother to $^.)
4982 .Ip $^ 8
4983 The name of the current top-of-page format for the currently selected output
4984 channel.
4985 (Mnemonic: points to top of page.)
4986 .Ip $| 8
4987 If set to nonzero, forces a flush after every write or print on the currently
4988 selected output channel.
4989 Default is 0.
4990 Note that
4991 .I STDOUT
4992 will typically be line buffered if output is to the
4993 terminal and block buffered otherwise.
4994 Setting this variable is useful primarily when you are outputting to a pipe,
4995 such as when you are running a
4996 .I perl
4997 script under rsh and want to see the
4998 output as it's happening.
4999 (Mnemonic: when you want your pipes to be piping hot.)
5000 .Ip $$ 8
5001 The process number of the
5002 .I perl
5003 running this script.
5004 (Mnemonic: same as shells.)
5005 .Ip $? 8
5006 The status returned by the last pipe close, backtick (\`\`) command or
5007 .I system
5008 operator.
5009 Note that this is the status word returned by the wait() system
5010 call, so the exit value of the subprocess is actually ($? >> 8).
5011 $? & 255 gives which signal, if any, the process died from, and whether
5012 there was a core dump.
5013 (Mnemonic: similar to sh and ksh.)
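The following sketch picks the status word apart.
It assumes the traditional wait() layout, in which the low seven bits
hold the signal number and the 0200 bit flags a core dump; the variable
names are only illustrative:
.nf

.ne 4
        system("sh", "-c", "exit 3");
        $exit   = $? >> 8;      # 3, the exit value of the subprocess
        $signal = $? & 127;     # 0 here, since no signal was involved
        $core   = $? & 128;     # nonzero only if there was a core dump

.fi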
5014 .Ip $& 8 4
5015 The string matched by the last pattern match (not counting any matches hidden
5016 within a BLOCK or eval enclosed by the current BLOCK).
5017 (Mnemonic: like & in some editors.)
5018 .Ip $\` 8 4
5019 The string preceding whatever was matched by the last pattern match
5020 (not counting any matches hidden within a BLOCK or eval enclosed by the current
5021 BLOCK).
5022 (Mnemonic: \` often precedes a quoted string.)
5023 .Ip $\' 8 4
5024 The string following whatever was matched by the last pattern match
5025 (not counting any matches hidden within a BLOCK or eval enclosed by the current
5026 BLOCK).
5027 (Mnemonic: \' often follows a quoted string.)
5028 Example:
5029 .nf
5030
5031 .ne 3
5032         $_ = \'abcdefghi\';
5033         /def/;
5034         print "$\`:$&:$\'\en";          # prints abc:def:ghi
5035
5036 .fi
5037 .Ip $+ 8 4
5038 The last bracket matched by the last search pattern.
5039 This is useful if you don't know which of a set of alternative patterns
5040 matched.
5041 For example:
5042 .nf
5043
5044     /Version: \|(.*\|)|Revision: \|(.*\|)\|/ \|&& \|($rev = $+);
5045
5046 .fi
5047 (Mnemonic: be positive and forward looking.)
5048 .Ip $* 8 2
5049 Set to 1 to do multiline matching within a string, 0 to tell
5050 .I perl
5051 that it can assume that strings contain a single line, for the purpose
5052 of optimizing pattern matches.
5053 Pattern matches on strings containing multiple newlines can produce confusing
5054 results when $* is 0.
5055 Default is 0.
5056 (Mnemonic: * matches multiple things.)
5057 Note that this variable only influences the interpretation of ^ and $.
5058 A literal newline can be searched for even when $* == 0.
5059 .Ip $0 8
5060 Contains the name of the file containing the
5061 .I perl
5062 script being executed.
5063 Assigning to $0 modifies the argument area that the ps(1) program sees.
5064 (Mnemonic: same as sh and ksh.)
5065 .Ip $<digit> 8
5066 Contains the subpattern from the corresponding set of parentheses in the last
5067 pattern matched, not counting patterns matched in nested blocks that have
5068 been exited already.
5069 (Mnemonic: like \edigit.)
5070 .Ip $[ 8 2
5071 The index of the first element in an array, and of the first character in
5072 a substring.
5073 Default is 0, but you could set it to 1 to make
5074 .I perl
5075 behave more like
5076 .I awk
5077 (or Fortran)
5078 when subscripting and when evaluating the index() and substr() functions.
5079 (Mnemonic: [ begins subscripts.)
5080 .Ip $] 8 2
5081 The string printed out when you say \*(L"perl -v\*(R".
5082 It can be used to determine at the beginning of a script whether the perl
5083 interpreter executing the script is in the right range of versions.
5084 If used in a numeric context, returns the version + patchlevel / 1000.
5085 Example:
5086 .nf
5087
5088 .ne 8
5089         # see if getc is available
5090         ($version,$patchlevel) =
5091                  $] =~ /(\ed+\e.\ed+).*\enPatch level: (\ed+)/;
5092         print STDERR "(No filename completion available.)\en"
5093                  if $version * 1000 + $patchlevel < 2016;
5094
5095 or, used numerically,
5096
5097         warn "No checksumming!\en" if $] < 3.019;
5098
5099 .fi
5100 (Mnemonic: Is this version of perl in the right bracket?)
5101 .Ip $; 8 2
5102 The subscript separator for multi-dimensional array emulation.
5103 If you refer to an associative array element as
5104 .nf
5105         $foo{$a,$b,$c}
5106
5107 it really means
5108
5109         $foo{join($;, $a, $b, $c)}
5110
5111 But don't put
5112
5113         @foo{$a,$b,$c}          # a slice\*(--note the @
5114
5115 which means
5116
5117         ($foo{$a},$foo{$b},$foo{$c})
5118
5119 .fi
5120 Default is "\e034", the same as SUBSEP in
5121 .IR awk .
5122 Note that if your keys contain binary data there might not be any safe
5123 value for $;.
5124 (Mnemonic: comma (the syntactic subscript separator) is a semi-semicolon.
5125 Yeah, I know, it's pretty lame, but $, is already taken for something more
5126 important.)
5127 .Ip $! 8 2
5128 If used in a numeric context, yields the current value of errno, with all the
5129 usual caveats.
5130 (This means that you shouldn't depend on the value of $! to be anything
5131 in particular unless you've gotten a specific error return indicating a
5132 system error.)
5133 If used in a string context, yields the corresponding system error string.
5134 You can assign to $! in order to set errno
5135 if, for instance, you want $! to return the string for error n, or you want
5136 to set the exit value for the die operator.
5137 (Mnemonic: What just went bang?)
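For example (the filename is hypothetical, and the exact message text
depends on your system):
.nf

.ne 4
        $file = "/no/such/file";
        open(IN, $file) || warn "Can't open $file: $!\en";
        $! = 2;                         # set errno by hand
        printf "errno %d: %s\en", $! + 0, $!;

.fi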
5138 .Ip $@ 8 2
5139 The perl syntax error message from the last eval command.
5140 If null, the last eval parsed and executed correctly (although the operations
5141 you invoked may have failed in the normal fashion).
5142 (Mnemonic: Where was the syntax error \*(L"at\*(R"?)
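A minimal sketch (the broken snippet in $code is made up for illustration):
.nf

.ne 3
        $code = \'$x = 1 +;\';            # deliberately bad syntax
        eval $code;
        print STDERR "eval failed: $@" if $@;

.fi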
5143 .Ip $< 8 2
5144 The real uid of this process.
5145 (Mnemonic: it's the uid you came FROM, if you're running setuid.)
5146 .Ip $> 8 2
5147 The effective uid of this process.
5148 Example:
5149 .nf
5150
5151 .ne 2
5152         $< = $>;        # set real uid to the effective uid
5153         ($<,$>) = ($>,$<);      # swap real and effective uid
5154
5155 .fi
5156 (Mnemonic: it's the uid you went TO, if you're running setuid.)
5157 Note: $< and $> can only be swapped on machines supporting setreuid().
5158 .Ip $( 8 2
5159 The real gid of this process.
5160 If you are on a machine that supports membership in multiple groups
5161 simultaneously, gives a space separated list of groups you are in.
5162 The first number is the one returned by getgid(), and the subsequent ones
5163 by getgroups(), one of which may be the same as the first number.
5164 (Mnemonic: parentheses are used to GROUP things.
5165 The real gid is the group you LEFT, if you're running setgid.)
5166 .Ip $) 8 2
5167 The effective gid of this process.
5168 If you are on a machine that supports membership in multiple groups
5169 simultaneously, gives a space separated list of groups you are in.
5170 The first number is the one returned by getegid(), and the subsequent ones
5171 by getgroups(), one of which may be the same as the first number.
5172 (Mnemonic: parentheses are used to GROUP things.
5173 The effective gid is the group that's RIGHT for you, if you're running setgid.)
5174 .Sp
5175 Note: $<, $>, $( and $) can only be set on machines that support the
5176 corresponding set[re][ug]id() routine.
5177 $( and $) can only be swapped on machines supporting setregid().
5178 .Ip $: 8 2
5179 The current set of characters after which a string may be broken to
5180 fill continuation fields (starting with ^) in a format.
5181 Default is "\ \en-", to break on whitespace or hyphens.
5182 (Mnemonic: a \*(L"colon\*(R" in poetry is a part of a line.)
5183 .Ip $^D 8 2
5184 The current value of the debugging flags.
5185 (Mnemonic: value of
5186 .B \-D
5187 switch.)
5188 .Ip $^I 8 2
5189 The current value of the inplace-edit extension.
5190 Use undef to disable inplace editing.
5191 (Mnemonic: value of
5192 .B \-i
5193 switch.)
5194 .Ip $^P 8 2
5195 The name that Perl itself was invoked as, from argv[0].
5196 .Ip $^T 8 2
5197 The time at which the script began running, in seconds since the epoch.
5198 The values returned by the
5199 .BR \-M ,
5200 .B \-A
5201 and
5202 .B \-C
5203 filetests are based on this value.
5204 .Ip $^W 8 2
5205 The current value of the warning switch.
5206 (Mnemonic: related to the
5207 .B \-w
5208 switch.)
5209 .Ip $ARGV 8 3
5210 Contains the name of the current file when reading from <>.
5211 .Ip @ARGV 8 3
5212 The array ARGV contains the command line arguments intended for the script.
5213 Note that $#ARGV is generally the number of arguments minus one, since
5214 $ARGV[0] is the first argument, NOT the command name.
5215 See $0 for the command name.
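For instance, to list the arguments (a sketch that assumes $[ still has
its default value of 0):
.nf

.ne 4
        print "$0 got ", $#ARGV + 1, " arguments\en";
        foreach $i (0 .. $#ARGV) {
            print "  $i: $ARGV[$i]\en";
        }

.fi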
5216 .Ip @INC 8 3
5217 The array INC contains the list of places to look for
5218 .I perl
5219 scripts to be
5220 evaluated by the \*(L"do EXPR\*(R" command or the \*(L"require\*(R" command.
5221 It initially consists of the arguments to any
5222 .B \-I
5223 command line switches, followed
5224 by the default
5225 .I perl
5226 library, probably \*(L"/usr/local/lib/perl\*(R",
5227 followed by \*(L".\*(R", to represent the current directory.
5228 .Ip %INC 8 3
5229 The associative array INC contains entries for each filename that has
5230 been included via \*(L"do\*(R" or \*(L"require\*(R".
5231 The key is the filename you specified, and the value is the location of
5232 the file actually found.
5233 The \*(L"require\*(R" command uses this array to determine whether
5234 a given file has already been included.
5235 .Ip $ENV{expr} 8 2
5236 The associative array ENV contains your current environment.
5237 Setting a value in ENV changes the environment for child processes.
5238 .Ip $SIG{expr} 8 2
5239 The associative array SIG is used to set signal handlers for various signals.
5240 Example:
5241 .nf
5242
5243 .ne 12
5244         sub handler {   # 1st argument is signal name
5245                 local($sig) = @_;
5246                 print "Caught a SIG$sig\-\|\-shutting down\en";
5247                 close(LOG);
5248                 exit(0);
5249         }
5250
5251         $SIG{\'INT\'} = \'handler\';
5252         $SIG{\'QUIT\'} = \'handler\';
5253         .\|.\|.
5254         $SIG{\'INT\'} = \'DEFAULT\';    # restore default action
5255         $SIG{\'QUIT\'} = \'IGNORE\';    # ignore SIGQUIT
5256
5257 .fi
5258 The SIG array only contains values for the signals actually set within
5259 the perl script.
5260 .Sh "Packages"
5261 Perl provides a mechanism for alternate namespaces to protect packages from
5262 stomping on each other's variables.
5263 By default, a perl script starts compiling into the package known as \*(L"main\*(R".
5264 By use of the
5265 .I package
5266 declaration, you can switch namespaces.
5267 The scope of the package declaration is from the declaration itself to the end
5268 of the enclosing block (the same scope as the local() operator).
5269 Typically it would be the first declaration in a file to be included by
5270 the \*(L"require\*(R" operator.
5271 You can switch into a package in more than one place; it merely influences
5272 which symbol table is used by the compiler for the rest of that block.
5273 You can refer to variables and filehandles in other packages by prefixing
5274 the identifier with the package name and a single quote.
5275 If the package name is null, the \*(L"main\*(R" package is assumed.
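A small sketch of these rules (the package \*(L"counter\*(R" and its
contents are hypothetical):
.nf

.ne 7
        package counter;
        $total = 0;
        sub bump { $total++; }

        package main;
        &counter'bump;                          # call into the other package
        print "total is $counter'total\en";      # prints total is 1

.fi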
5276 .PP
5277 Only identifiers starting with letters are stored in the package's symbol
5278 table.
5279 All other symbols are kept in package \*(L"main\*(R".
5280 In addition, the identifiers STDIN, STDOUT, STDERR, ARGV, ARGVOUT, ENV, INC
5281 and SIG are forced to be in package \*(L"main\*(R", even when used for
5282 other purposes than their built-in one.
5283 Note also that, if you have a package called \*(L"m\*(R", \*(L"s\*(R"
5284 or \*(L"y\*(R", then you can't use the qualified form of an identifier since it
5285 will be interpreted instead as a pattern match, a substitution
5286 or a translation.
5287 .PP
5288 Eval'ed strings are compiled in the package in which the eval was
5289 compiled.
5290 (Assignments to $SIG{}, however, assume the signal handler specified is in the
5291 main package.
5292 Qualify the signal handler name if you wish to have a signal handler in
5293 a package.)
5294 For an example, examine perldb.pl in the perl library.
5295 It initially switches to the DB package so that the debugger doesn't interfere
5296 with variables in the script you are trying to debug.
5297 At various points, however, it temporarily switches back to the main package
5298 to evaluate various expressions in the context of the main package.
5299 .PP
5300 The symbol table for a package happens to be stored in the associative array
5301 of that name prepended with an underscore.
5302 The value in each entry of the associative array is
5303 what you are referring to when you use the *name notation.
5304 In fact, the following have the same effect (in package main, anyway),
5305 though the first is more
5306 efficient because it does the symbol table lookups at compile time:
5307 .nf
5308
5309 .ne 2
5310         local(*foo) = *bar;
5311         local($_main{'foo'}) = $_main{'bar'};
5312
5313 .fi
5314 You can use this to print out all the variables in a package, for instance.
5315 Here is dumpvar.pl from the perl library:
5316 .nf
5317 .ne 11
5318         package dumpvar;
5319
5320         sub main'dumpvar {
5321         \&    ($package) = @_;
5322         \&    local(*stab) = eval("*_$package");
5323         \&    while (($key,$val) = each(%stab)) {
5324         \&        {
5325         \&            local(*entry) = $val;
5326         \&            if (defined $entry) {
5327         \&                print "\e$$key = '$entry'\en";
5328         \&            }
5329 .ne 7
5330         \&            if (defined @entry) {
5331         \&                print "\e@$key = (\en";
5332         \&                foreach $num ($[ .. $#entry) {
5333         \&                    print "  $num\et'",$entry[$num],"'\en";
5334         \&                }
5335         \&                print ")\en";
5336         \&            }
5337 .ne 10
5338         \&            if ($key ne "_$package" && defined %entry) {
5339         \&                print "\e%$key = (\en";
5340         \&                foreach $key (sort keys(%entry)) {
5341         \&                    print "  $key\et'",$entry{$key},"'\en";
5342         \&                }
5343         \&                print ")\en";
5344         \&            }
5345         \&        }
5346         \&    }
5347         }
5348
5349 .fi
5350 Note that, even though the subroutine is compiled in package dumpvar, the
5351 name of the subroutine is qualified so that its name is inserted into package
5352 \*(L"main\*(R".
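Assuming dumpvar.pl can be found via @INC, you might call it like this
to dump everything in package main:
.nf

.ne 2
        require 'dumpvar.pl';
        &dumpvar('main');

.fi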
5353 .Sh "Style"
5354 Each programmer will, of course, have his or her own preferences in regard
5355 to formatting, but there are some general guidelines that will make your
5356 programs easier to read.
5357 .Ip 1. 4 4
5358 Just because you CAN do something a particular way doesn't mean that
5359 you SHOULD do it that way.
5360 .I Perl
5361 is designed to give you several ways to do anything, so consider picking
5362 the most readable one.
5363 For instance
5364
5365         open(FOO,$foo) || die "Can't open $foo: $!";
5366
5367 is better than
5368
5369         die "Can't open $foo: $!" unless open(FOO,$foo);
5370
5371 because the second way hides the main point of the statement in a
5372 modifier.
5373 On the other hand
5374
5375         print "Starting analysis\en" if $verbose;
5376
5377 is better than
5378
5379         $verbose && print "Starting analysis\en";
5380
5381 since the main point isn't whether the user typed -v or not.
5382 .Sp
5383 Similarly, just because an operator lets you assume default arguments
5384 doesn't mean that you have to make use of the defaults.
5385 The defaults are there for lazy systems programmers writing one-shot
5386 programs.
5387 If you want your program to be readable, consider supplying the argument.
5388 .Sp
5389 Along the same lines, just because you
5390 .I can
5391 omit parentheses in many places doesn't mean that you ought to:
5392 .nf
5393
5394         return print reverse sort num values %array;
5395         return print(reverse(sort num (values(%array))));
5396
5397 .fi
5398 When in doubt, parenthesize.
5399 At the very least it will let some poor schmuck bounce on the % key in vi.
5400 .Sp
5401 Even if you aren't in doubt, consider the mental welfare of the person who
5402 has to maintain the code after you, and who will probably put parens in
5403 the wrong place.
5404 .Ip 2. 4 4
5405 Don't go through silly contortions to exit a loop at the top or the
5406 bottom, when
5407 .I perl
5408 provides the "last" operator so you can exit in the middle.
5409 Just outdent it a little to make it more visible:
5410 .nf
5411
5412 .ne 7
5413     line:
5414         for (;;) {
5415             statements;
5416         last line if $foo;
5417             next line if /^#/;
5418             statements;
5419         }
5420
5421 .fi
5422 .Ip 3. 4 4
5423 Don't be afraid to use loop labels\*(--they're there to enhance readability as
5424 well as to allow multi-level loop breaks.
5425 See the previous example.
5426 .Ip 4. 4 4
5427 For portability, when using features that may not be implemented on every
5428 machine, test the construct in an eval to see if it fails.
5429 If you know in which version or patchlevel a particular feature was implemented,
5430 you can test $] to see if it will be there.
5431 .Ip 5. 4 4
5432 Choose mnemonic identifiers.
5433 .Ip 6. 4 4
5434 Be consistent.
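.PP
As a sketch of the portability test suggested in guideline 4, the
following probes for symlink() without aborting the script on machines
that lack it:
.nf

.ne 6
        eval \'symlink("", "");\';
        if ($@ eq "") {
            print "symlink is available\en";
        } else {
            print "no symlink on this machine\en";
        }

.fi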
5435 .Sh "Debugging"
5436 If you invoke
5437 .I perl
5438 with a
5439 .B \-d
5440 switch, your script will be run under a debugging monitor.
5441 It will halt before the first executable statement and ask you for a
5442 command, such as:
5443 .Ip "h" 12 4
5444 Prints out a help message.
5445 .Ip "T" 12 4
5446 Stack trace.
5447 .Ip "s" 12 4
5448 Single step.
5449 Executes until it reaches the beginning of another statement.
5450 .Ip "n" 12 4
5451 Next.
5452 Executes over subroutine calls, until it reaches the beginning of the
5453 next statement.
5454 .Ip "f" 12 4
5455 Finish.
5456 Executes statements until it has finished the current subroutine.
5457 .Ip "c" 12 4
5458 Continue.
5459 Executes until the next breakpoint is reached.
5460 .Ip "c line" 12 4
5461 Continue to the specified line.
5462 Inserts a one-time-only breakpoint at the specified line.
5463 .Ip "<CR>" 12 4
5464 Repeat last n or s.
5465 .Ip "l min+incr" 12 4
5466 List incr+1 lines starting at min.
5467 If min is omitted, starts where last listing left off.
5468 If incr is omitted, previous value of incr is used.
5469 .Ip "l min-max" 12 4
5470 List lines in the indicated range.
5471 .Ip "l line" 12 4
5472 List just the indicated line.
5473 .Ip "l" 12 4
5474 List next window.
5475 .Ip "-" 12 4
5476 List previous window.
5477 .Ip "w line" 12 4
5478 List window around line.
5479 .Ip "l subname" 12 4
5480 List subroutine.
5481 If it's a long subroutine it just lists the beginning.
5482 Use \*(L"l\*(R" to list more.
5483 .Ip "/pattern/" 12 4
5484 Regular expression search forward for pattern; the final / is optional.
5485 .Ip "?pattern?" 12 4
5486 Regular expression search backward for pattern; the final ? is optional.
5487 .Ip "L" 12 4
5488 List lines that have breakpoints or actions.
5489 .Ip "S" 12 4
5490 Lists the names of all subroutines.
5491 .Ip "t" 12 4
5492 Toggle trace mode on or off.
5493 .Ip "b line condition" 12 4
5494 Set a breakpoint.
5495 If line is omitted, sets a breakpoint on the
5496 line that is about to be executed.
5497 If a condition is specified, it is evaluated each time the statement is
5498 reached and a breakpoint is taken only if the condition is true.
5499 Breakpoints may only be set on lines that begin an executable statement.
5500 .Ip "b subname condition" 12 4
5501 Set breakpoint at first executable line of subroutine.
5502 .Ip "d line" 12 4
5503 Delete breakpoint.
5504 If line is omitted, deletes the breakpoint on the
5505 line that is about to be executed.
5506 .Ip "D" 12 4
5507 Delete all breakpoints.
5508 .Ip "a line command" 12 4
5509 Set an action for line.
5510 A multi-line command may be entered by backslashing the newlines.
5511 .Ip "A" 12 4
5512 Delete all line actions.
5513 .Ip "< command" 12 4
5514 Set an action to happen before every debugger prompt.
5515 A multi-line command may be entered by backslashing the newlines.
5516 .Ip "> command" 12 4
5517 Set an action to happen after the prompt when you've just given a command
5518 to return to executing the script.
5519 A multi-line command may be entered by backslashing the newlines.
5520 .Ip "V package" 12 4
5521 List all variables in package.
5522 Default is main package.
5523 .Ip "! number" 12 4
5524 Redo a debugging command.
5525 If number is omitted, redoes the previous command.
5526 .Ip "! -number" 12 4
5527 Redo the command that was that many commands ago.
5528 .Ip "H -number" 12 4
5529 Display the most recent commands, up to the given number.
5530 Only commands longer than one character are listed.
5531 If number is omitted, lists them all.
5532 .Ip "q or ^D" 12 4
5533 Quit.
5534 .Ip "command" 12 4
5535 Execute command as a perl statement.
5536 A missing semicolon will be supplied.
5537 .Ip "p expr" 12 4
5538 Same as \*(L"print DB'OUT expr\*(R".
5539 The DB'OUT filehandle is opened to /dev/tty, regardless of where STDOUT
5540 may be redirected to.
5541 .PP
5542 If you want to modify the debugger, copy perldb.pl from the perl library
5543 to your current directory and modify it as necessary.
5544 (You'll also have to put -I. on your command line.)
5545 You can do some customization by setting up a .perldb file which contains
5546 initialization code.
5547 For instance, you could make aliases like these:
5548 .nf
5549
5550     $DB'alias{'len'} = 's/^len(.*)/p length($1)/';
5551     $DB'alias{'stop'} = 's/^stop (at|in)/b/';
5552     $DB'alias{'.'} =
5553       's/^\e./p "\e$DB\e'sub(\e$DB\e'line):\et",\e$DB\e'line[\e$DB\e'line]/';
5554
5555 .fi
5556 .Sh "Setuid Scripts"
5557 .I Perl
5558 is designed to make it easy to write secure setuid and setgid scripts.
5559 Unlike shells, which are based on multiple substitution passes on each line
5560 of the script,
5561 .I perl
5562 uses a more conventional evaluation scheme with fewer hidden \*(L"gotchas\*(R".
5563 Additionally, since the language has more built-in functionality, it
5564 has to rely less upon external (and possibly untrustworthy) programs to
5565 accomplish its purposes.
5566 .PP
5567 In an unpatched 4.2 or 4.3bsd kernel, setuid scripts are intrinsically
5568 insecure, but this kernel feature can be disabled.
5569 If it is,
5570 .I perl
5571 can emulate the setuid and setgid mechanism when it notices the otherwise
5572 useless setuid/gid bits on perl scripts.
5573 If the kernel feature isn't disabled,
5574 .I perl
5575 will complain loudly that your setuid script is insecure.
5576 You'll need to either disable the kernel setuid script feature, or put
5577 a C wrapper around the script.
5578 .PP
5579 When perl is executing a setuid script, it takes special precautions to
5580 prevent you from falling into any obvious traps.
5581 (In some ways, a perl script is more secure than the corresponding
5582 C program.)
5583 Any command line argument, environment variable, or input is marked as
5584 \*(L"tainted\*(R", and may not be used, directly or indirectly, in any
5585 command that invokes a subshell, or in any command that modifies files,
5586 directories or processes.
5587 Any variable that is set within an expression that has previously referenced
5588 a tainted value also becomes tainted (even if it is logically impossible
5589 for the tainted value to influence the variable).
5590 For example:
5591 .nf
5592
5593 .ne 5
5594         $foo = shift;                   # $foo is tainted
5595         $bar = $foo,\'bar\';            # $bar is also tainted
5596         $xxx = <>;                      # Tainted
5597         $path = $ENV{\'PATH\'}; # Tainted, but see below
5598         $abc = \'abc\';                 # Not tainted
5599
5600 .ne 4
5601         system "echo $foo";             # Insecure
5602         system "/bin/echo", $foo;       # Secure (doesn't use sh)
5603         system "echo $bar";             # Insecure
5604         system "echo $abc";             # Insecure until PATH set
5605
5606 .ne 5
5607         $ENV{\'PATH\'} = \'/bin:/usr/bin\';
5608         $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\';
5609
5610         $path = $ENV{\'PATH\'}; # Not tainted
5611         system "echo $abc";             # Is secure now!
5612
5613 .ne 5
5614         open(FOO,"$foo");               # OK
5615         open(FOO,">$foo");              # Not OK
5616
5617         open(FOO,"echo $foo|"); # Not OK, but...
5618         open(FOO,"-|") || exec \'echo\', $foo;  # OK
5619
5620         $zzz = `echo $foo`;             # Insecure, zzz tainted
5621
5622         unlink $abc,$foo;               # Insecure
5623         umask $foo;                     # Insecure
5624
5625 .ne 3
5626         exec "echo $foo";               # Insecure
5627         exec "echo", $foo;              # Secure (doesn't use sh)
5628         exec "sh", \'-c\', $foo;        # Considered secure, alas
5629
5630 .fi
5631 The taintedness is associated with each scalar value, so some elements
5632 of an array can be tainted, and others not.
5633 .PP
5634 If you try to do something insecure, you will get a fatal error saying
5635 something like \*(L"Insecure dependency\*(R" or \*(L"Insecure PATH\*(R".
5636 Note that you can still write an insecure system call or exec,
5637 but only by explicitly doing something like the last example above.
5638 You can also bypass the tainting mechanism by referencing
5639 subpatterns\*(--\c
5640 .I perl
5641 presumes that if you reference a substring using $1, $2, etc, you knew
5642 what you were doing when you wrote the pattern:
5643 .nf
5644
5645         $ARGV[0] =~ /^\-P(\ew+)$/;
5646         $printer = $1;          # Not tainted
5647
5648 .fi
5649 This is fairly secure since \ew+ doesn't match shell metacharacters.
5650 Use of .+ would have been insecure, but
5651 .I perl
5652 doesn't check for that, so you must be careful with your patterns.
5653 This is the ONLY mechanism for untainting user-supplied filenames if you
5654 want to do file operations on them (unless you make $> equal to $<).
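Since a failed match leaves $1 holding whatever an earlier match put
there, a more defensive sketch checks the match before using the
subpattern (the character class and the name $file are only illustrative):
.nf

.ne 5
        if ($ARGV[0] =~ /^([\ew.]+)$/) {
            $file = $1;                 # untainted
        } else {
            die "Suspicious filename\en";
        }

.fi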
5655 .PP
5656 It's also possible to get into trouble with other operations that don't care
5657 whether they use tainted values.
5658 Make judicious use of the file tests in dealing with any user-supplied
5659 filenames.
5660 When possible, do opens and such after setting $> = $<.
5661 .I Perl
5662 doesn't prevent you from opening tainted filenames for reading, so be
5663 careful what you print out.
5664 The tainting mechanism is intended to prevent stupid mistakes, not to remove
5665 the need for thought.
5666 .SH ENVIRONMENT
5667 .I Perl
5668 uses PATH in executing subprocesses, and in finding the script if \-S
5669 is used.
5670 HOME or LOGDIR are used if chdir has no argument.
5671 .PP
5672 Apart from these,
5673 .I perl
5674 uses no environment variables, except to make them available
5675 to the script being executed, and to child processes.
5676 However, scripts running setuid would do well to execute the following lines
5677 before doing anything else, just to keep people honest:
5678 .nf
5679
5680 .ne 3
5681     $ENV{\'PATH\'} = \'/bin:/usr/bin\';    # or whatever you need
5682     $ENV{\'SHELL\'} = \'/bin/sh\' if $ENV{\'SHELL\'} ne \'\';
5683     $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\';
5684
5685 .fi
5686 .SH AUTHOR
5687 Larry Wall <lwall@jpl-devvax.Jpl.Nasa.Gov>
5688 .br
5689 MS-DOS port by Diomidis Spinellis <dds@cc.ic.ac.uk>
5690 .SH FILES
5691 /tmp/perl\-eXXXXXX      temporary file for
5692 .B \-e
5693 commands.
5694 .SH SEE ALSO
5695 a2p     awk to perl translator
5696 .br
5697 s2p     sed to perl translator
5698 .SH DIAGNOSTICS
5699 Compilation errors will tell you the line number of the error, with an
5700 indication of the next token or token type that was to be examined.
5701 (In the case of a script passed to
5702 .I perl
5703 via
5704 .B \-e
5705 switches, each
5706 .B \-e
5707 is counted as one line.)
5708 .PP
5709 Setuid scripts have additional constraints that can produce error messages
5710 such as \*(L"Insecure dependency\*(R".
5711 See the section on setuid scripts.
5712 .SH TRAPS
5713 Accustomed
5714 .IR awk
5715 users should take special note of the following:
5716 .Ip * 4 2
5717 Semicolons are required after all simple statements in
5718 .IR perl .
5719 Newline
5720 is not a statement delimiter.
5721 .Ip * 4 2
5722 Curly brackets are required on ifs and whiles.
5723 .Ip * 4 2
5724 Variables begin with $ or @ in
5725 .IR perl .
5726 .Ip * 4 2
5727 Arrays index from 0 unless you set $[.
5728 Likewise string positions in substr() and index().
5729 .Ip * 4 2
5730 You have to decide whether your array has numeric or string indices.
5731 .Ip * 4 2
5732 Associative array values do not spring into existence upon mere reference.
5733 .Ip * 4 2
5734 You have to decide whether you want to use string or numeric comparisons.
5735 .Ip * 4 2
5736 Reading an input line does not split it for you.  You get to split it yourself
5737 to an array.
5738 And the
5739 .I split
5740 operator has different arguments.
5741 .Ip * 4 2
5742 The current input line is normally in $_, not $0.
5743 It generally does not have the newline stripped.
5744 ($0 is the name of the program executed.)
5745 .Ip * 4 2
5746 $<digit> does not refer to fields\*(--it refers to substrings matched by the last
5747 match pattern.
5748 .Ip * 4 2
5749 The
5750 .I print
5751 statement does not add field and record separators unless you set
5752 $, and $\e.
5753 .Ip * 4 2
5754 You must open your files before you print to them.
5755 .Ip * 4 2
5756 The range operator is \*(L".\|.\*(R", not comma.
5757 (The comma operator works as in C.)
5758 .Ip * 4 2
5759 The match operator is \*(L"=~\*(R", not \*(L"~\*(R".
5760 (\*(L"~\*(R" is the one's complement operator, as in C.)
5761 .Ip * 4 2
5762 The exponentiation operator is \*(L"**\*(R", not \*(L"^\*(R".
5763 (\*(L"^\*(R" is the XOR operator, as in C.)
5764 .Ip * 4 2
5765 The concatenation operator is \*(L".\*(R", not the null string.
5766 (Using the null string would render \*(L"/pat/ /pat/\*(R" unparsable,
5767 since the third slash would be interpreted as a division operator\*(--the
5768 tokener is in fact slightly context sensitive for operators like /, ?, and <.
5769 And in fact, . itself can be the beginning of a number.)
5770 .Ip * 4 2
5771 .IR Next ,
5772 .I exit
5773 and
5774 .I continue
5775 work differently.
5776 .Ip * 4 2
5777 The following variables work differently:
5778 .nf
5779
5780           Awk   \h'|2.5i'Perl
5781           ARGC  \h'|2.5i'$#ARGV
5782           ARGV[0]       \h'|2.5i'$0
5783           FILENAME\h'|2.5i'$ARGV
5784           FNR   \h'|2.5i'$. \- something
5785           FS    \h'|2.5i'(whatever you like)
5786           NF    \h'|2.5i'$#Fld, or some such
5787           NR    \h'|2.5i'$.
5788           OFMT  \h'|2.5i'$#
5789           OFS   \h'|2.5i'$,
5790           ORS   \h'|2.5i'$\e
5791           RLENGTH       \h'|2.5i'length($&)
5792           RS    \h'|2.5i'$/
5793           RSTART        \h'|2.5i'length($\`)
5794           SUBSEP        \h'|2.5i'$;
5795
5796 .fi
5797 .Ip * 4 2
5798 When in doubt, run the
5799 .I awk
5800 construct through a2p and see what it gives you.
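.PP
Several of these points can be seen together in a small sketch.
The sample line and the names @Fld and $nf are invented for illustration,
$[ is assumed to be 0, and the output of a2p may well look different:
.nf

        $, = ' ';                       # OFS; empty unless you set it
        $\e = "\en";                    # ORS; likewise
        $_ = "root:0:/bin/sh\en";       # stands in for a line read by <STDIN>
        chop;                           # the newline is still attached
        @Fld = split(/:/, $_);          # no automatic field splitting
        $nf = $#Fld + 1;                # roughly awk's NF
        print $nf, $Fld[0], $Fld[2];    # prints "3 root /bin/sh"

.fi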
5801 .PP
5802 Cerebral C programmers should take note of the following:
5803 .Ip * 4 2
5804 Curly brackets are required on ifs and whiles.
5805 .Ip * 4 2
5806 You should use \*(L"elsif\*(R" rather than \*(L"else if\*(R".
5807 .Ip * 4 2
5808 .I Break
5809 and
5810 .I continue
5811 become
5812 .I last
5813 and
5814 .IR next ,
5815 respectively.
5816 .Ip * 4 2
5817 There's no switch statement.
5818 .Ip * 4 2
5819 Variables begin with $ or @ in
5820 .IR perl .
5821 .Ip * 4 2
5822 Printf does not implement the \*(L"*\*(R" format for interpolating field widths.
5823 .Ip * 4 2
5824 Comments begin with #, not /*.
5825 .Ip * 4 2
5826 You can't take the address of anything.
5827 .Ip * 4 2
5828 ARGV must be capitalized.
5829 .Ip * 4 2
5830 The \*(L"system\*(R" calls link, unlink, rename, etc. return nonzero for success, not 0.
5831 .Ip * 4 2
5832 Signal handlers deal with signal names, not numbers.
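.PP
Some of these differences in one place, as a sketch only.
The scratch file name, the handler name and the array @seen are invented
for the example:
.nf

        sub handler {                   # receives the signal name, e.g. "INT"
            local($sig) = @_;
            print "Caught SIG$sig\en";
            exit 1;
        }
        $SIG{'INT'} = 'handler';        # signals are named, not numbered

        foreach $n (1, 2, 3, 4, 5) {
            if ($n == 2) {              # curly brackets are required
                next;                   # C's continue
            }
            elsif ($n == 4) {           # not "else if"
                last;                   # C's break
            }
            push(@seen, $n);            # ends up holding (1, 3)
        }

        $tmp = "/tmp/perl.trap.$$";
        open(TMP, ">$tmp") || die "Can't create $tmp: $!\en";
        close(TMP);
        unlink($tmp) || die "Can't unlink $tmp: $!\en";   # true means success

.fi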
5833 .PP
5834 Seasoned
5835 .I sed
5836 programmers should take note of the following:
5837 .Ip * 4 2
5838 Backreferences in substitutions use $ rather than \e.
5839 .Ip * 4 2
5840 The pattern matching metacharacters (, ), and | do not have backslashes in front.
5841 .Ip * 4 2
5842 The range operator is .\|. rather than comma.
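.PP
As a rough illustration (the sample string and the name $pet are invented),
the sed command s/\e(a*\e)\e(b*\e)/\e2\e1/ and a pattern with alternatives
might be written:
.nf

        $_ = "aaabb cat";
        s/(a*)(b*)/$2$1/;               # $_ is now "bbaaa cat"
        $pet = /cat|dog/ ? 1 : 0;       # plain | alternates; $pet is 1

.fi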
5843 .PP
5844 Sharp shell programmers should take note of the following:
5845 .Ip * 4 2
5846 The backtick operator does variable interpretation without regard to the
5847 presence of single quotes in the command.
5848 .Ip * 4 2
5849 The backtick operator does no translation of the return value, unlike csh.
5850 .Ip * 4 2
5851 Shells (especially csh) do several levels of substitution on each command line.
5852 .I Perl
5853 does substitution only in certain constructs such as double quotes,
5854 backticks, angle brackets and search patterns.
5855 .Ip * 4 2
5856 Shells interpret scripts a little bit at a time.
5857 .I Perl
5858 compiles the whole program before executing it.
5859 .Ip * 4 2
5860 The arguments are available via @ARGV, not $1, $2, etc.
5861 .Ip * 4 2
5862 The environment is not automatically made available as variables.
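.PP
A sketch of several of these points; the variable names are invented,
$[ is assumed to be 0, and an echo command is assumed to be available:
.nf

        $home = $ENV{'HOME'};           # the environment lives in %ENV
        $first = $ARGV[0];              # sh's $1
        $count = $#ARGV + 1;            # sh's $#
        $out = `echo '$count args'`;    # $count is interpolated despite the quotes
        chop($out);                     # the trailing newline is kept, so remove it

.fi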
5863 .SH ERRATA\0AND\0ADDENDA
5864 The Perl book,
5865 .IR Programming\0Perl ,
5866 has the following omissions and goofs.
5867 .PP
5868 On page 5, the examples which read
5869 .nf
5870
5871         eval "/usr/bin/perl
5872
5873 should read
5874
5875         eval "exec /usr/bin/perl
5876
5877 .fi
5878 .PP
5879 On page 195, the equivalent to the System V sum program only works for
5880 very small files.  To do larger files, use
5881 .nf
5882
5883         undef $/;
5884         $checksum = unpack("%32C*",<>) % 32767;
5885
5886 .fi
5887 .PP
5888 The
5889 .B \-0
5890 switch to set the initial value of $/ was added to Perl after the book
5891 went to press.
5892 .PP
5893 The
5894 .B \-l
5895 switch now does automatic line ending processing.
5896 .PP
5897 The qx// construct is now a synonym for backticks.
5898 .PP
5899 $0 may now be assigned to set the argument displayed by
5900 .IR ps (1).
5901 .PP
5902 The new @###.## format was omitted accidentally from the description
5903 on formats.
5904 .PP
5905 It wasn't known at press time that s///ee caused multiple evaluations of
5906 the replacement expression.  This is to be construed as a feature.
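.PP
For example, with an invented variable $name, the doubled e might be used
to expand a variable that is named in the data:
.nf

        $name = 'world';
        $_ = 'hello, $name';            # single quotes, so $name stays literal
        s/(\e$\ew+)/$1/ee;              # once gives the text '$name', twice its value
        # $_ is now "hello, world"

.fi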
5907 .PP
5908 (LIST) x $count now does array replication.
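.PP
A short sketch of the difference, using invented variable names:
.nf

        @ones = (1) x 5;                # (1, 1, 1, 1, 1)
        @ab = ('a', 'b') x 2;           # ('a', 'b', 'a', 'b')
        $dashes = '-' x 10;             # no parentheses: a string is repeated

.fi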
5909 .PP
5910 There is now no limit on the number of parentheses in a regular expression.
5911 .PP
5912 In double-quote context, more escapes are supported: \ee, \ea, \ex1b, \ec[,
5913 \el, \eL, \eu, \eU, \eE.  The last five control upper/lower case translation.
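.PP
As an illustration, with an invented sample string:
.nf

        $name = 'larry WALL';
        $up = "\eU$name\eE";            # "LARRY WALL"
        $cap = "\eu\eL$name\eE";        # "Larry wall"
        $same = ("\ee" eq "\ex1b" && "\ee" eq "\ec[");    # all three spell ESC

.fi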
5914 .PP
5915 The
5916 .B $/
5917 variable may now be set to a multi-character delimiter.
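.PP
A sketch of reading records that end with a line of %%.
The scratch file name and the filehandle names are invented for the example:
.nf

        $tmp = "/tmp/perl.rs.$$";
        open(OUT, ">$tmp") || die "Can't create $tmp: $!\en";
        print OUT "one\en%%\entwo\en%%\en";
        close(OUT);

        $/ = "%%\en";                   # two-character marker plus newline
        open(IN, $tmp) || die "Can't open $tmp: $!\en";
        @recs = <IN>;                   # ("one\en%%\en", "two\en%%\en")
        close(IN);
        unlink($tmp);

.fi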
5918 .SH BUGS
5919 .PP
5920 .I Perl
5921 is at the mercy of your machine's definitions of various operations
5922 such as type casting, atof() and sprintf().
5923 .PP
5924 If your stdio requires a seek or eof between reads and writes on a particular
5925 stream, so does
5926 .IR perl .
5927 .PP
5928 While none of the built-in data types have any arbitrary size limits (apart
5929 from memory size), there are still a few arbitrary limits:
5930 a given identifier may not be longer than 255 characters;
5931 sprintf is limited on many machines to 128 characters per field (unless the format
5932 specifier is exactly %s);
5933 and no component of your PATH may be longer than 255 characters if you use \-S.
5934 .PP
5935 .I Perl
5936 actually stands for Pathologically Eclectic Rubbish Lister, but don't tell
5937 anyone I said that.
5938 .rn }` ''