perl.man

   1 .rn '' }`
   2 ''' $RCSfile: perl.man,v $$Revision: 4.1 $$Date: 92/08/07 18:25:59 $
   3 '''
   4 ''' $Log:       perl.man,v $
   5 ''' Revision 4.1  92/08/07  18:25:59  lwall
   6 '''
   7 ''' Revision 4.0.1.6  92/06/08  15:07:29  lwall
   8 ''' patch20: documented that numbers may contain underline
   9 ''' patch20: clarified that DATA may only be read from main script
  10 ''' patch20: relaxed requirement for semicolon at the end of a block
  11 ''' patch20: added ... as variant on ..
  12 ''' patch20: documented need for 1; at the end of a required file
  13 ''' patch20: extended bracket-style quotes to two-arg operators: s()() and tr()()
  14 ''' patch20: paragraph mode now skips extra newlines automatically
  15 ''' patch20: documented PERLLIB and PERLDB
  16 ''' patch20: documented limit on size of regexp
  17 '''
  18 ''' Revision 4.0.1.5  91/11/11  16:42:00  lwall
  19 ''' patch19: added little-endian pack/unpack options
  20 '''
  21 ''' Revision 4.0.1.4  91/11/05  18:11:05  lwall
  22 ''' patch11: added sort {} LIST
  23 ''' patch11: added eval {}
  24 ''' patch11: documented meaning of scalar(%foo)
  25 ''' patch11: sprintf() now supports any length of s field
  26 '''
  27 ''' Revision 4.0.1.3  91/06/10  01:26:02  lwall
  28 ''' patch10: documented some newer features in addenda
  29 '''
  30 ''' Revision 4.0.1.2  91/06/07  11:41:23  lwall
  31 ''' patch4: added global modifier for pattern matches
  32 ''' patch4: default top-of-form format is now FILEHANDLE_TOP
  33 ''' patch4: added $^P variable to control calling of perldb routines
  34 ''' patch4: added $^F variable to specify maximum system fd, default 2
  35 ''' patch4: changed old $^P to $^X
  36 '''
  37 ''' Revision 4.0.1.1  91/04/11  17:50:44  lwall
  38 ''' patch1: fixed some typos
  39 '''
  40 ''' Revision 4.0  91/03/20  01:38:08  lwall
  41 ''' 4.0 baseline.
  42 '''
  43 '''
  44 .de Sh
  45 .br
  46 .ne 5
  47 .PP
  48 \fB\\$1\fR
  49 .PP
  50 ..
  51 .de Sp
  52 .if t .sp .5v
  53 .if n .sp
  54 ..
  55 .de Ip
  56 .br
  57 .ie \\n(.$>=3 .ne \\$3
  58 .el .ne 3
  59 .IP "\\$1" \\$2
  60 ..
  61 '''
  62 '''     Set up \*(-- to give an unbreakable dash;
  63 '''     string Tr holds user defined translation string.
  64 '''     Bell System Logo is used as a dummy character.
  65 '''
  66 .tr \(*W-|\(bv\*(Tr
  67 .ie n \{\
  68 .ds -- \(*W-
  69 .if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
  70 .if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
  71 .ds L" ""
  72 .ds R" ""
  73 .ds L' '
  74 .ds R' '
  75 'br\}
  76 .el\{\
  77 .ds -- \(em\|
  78 .tr \*(Tr
  79 .ds L" ``
  80 .ds R" ''
  81 .ds L' `
  82 .ds R' '
  83 'br\}
  84 .TH PERL 1 "\*(RP"
  85 .UC
  86 .SH NAME
  87 perl \- Practical Extraction and Report Language
  88 .SH SYNOPSIS
  89 .B perl
  90 [options] filename args
  91 .SH DESCRIPTION
  92 .I Perl
  93 is an interpreted language optimized for scanning arbitrary text files,
  94 extracting information from those text files, and printing reports based
  95 on that information.
  96 It's also a good language for many system management tasks.
  97 The language is intended to be practical (easy to use, efficient, complete)
  98 rather than beautiful (tiny, elegant, minimal).
  99 It combines (in the author's opinion, anyway) some of the best features of C,
 100 \fIsed\fR, \fIawk\fR, and \fIsh\fR,
 101 so people familiar with those languages should have little difficulty with it.
 102 (Language historians will also note some vestiges of \fIcsh\fR, Pascal, and
 103 even BASIC-PLUS.)
 104 Expression syntax corresponds quite closely to C expression syntax.
 105 Unlike most Unix utilities,
 106 .I perl
 107 does not arbitrarily limit the size of your data\*(--if you've got
 108 the memory,
 109 .I perl
 110 can slurp in your whole file as a single string.
 111 Recursion is of unlimited depth.
 112 And the hash tables used by associative arrays grow as necessary to prevent
 113 degraded performance.
 114 .I Perl
 115 uses sophisticated pattern matching techniques to scan large amounts of
 116 data very quickly.
 117 Although optimized for scanning text,
 118 .I perl
 119 can also deal with binary data, and can make dbm files look like associative
 120 arrays (where dbm is available).
 121 Setuid
 122 .I perl
 123 scripts are safer than C programs
 124 through a dataflow tracing mechanism which prevents many stupid security holes.
 125 If you have a problem that would ordinarily use \fIsed\fR
 126 or \fIawk\fR or \fIsh\fR, but it
 127 exceeds their capabilities or must run a little faster,
 128 and you don't want to write the silly thing in C, then
 129 .I perl
 130 may be for you.
 131 There are also translators to turn your
 132 .I sed
 133 and
 134 .I awk
 135 scripts into
 136 .I perl
 137 scripts.
 138 OK, enough hype.
 139 .PP
 140 Upon startup,
 141 .I perl
 142 looks for your script in one of the following places:
 143 .Ip 1. 4 2
 144 Specified line by line via
 145 .B \-e
 146 switches on the command line.
 147 .Ip 2. 4 2
 148 Contained in the file specified by the first filename on the command line.
 149 (Note that systems supporting the #! notation invoke interpreters this way.)
 150 .Ip 3. 4 2
 151 Passed in implicitly via standard input.
 152 This only works if there are no filename arguments\*(--to pass
 153 arguments to a
 154 .I stdin
 155 script you must explicitly specify a \- for the script name.
 156 .PP
 157 After locating your script,
 158 .I perl
 159 compiles it to an internal form.
 160 If the script is syntactically correct, it is executed.
 161 .Sh "Options"
 162 Note: on first reading this section may not make much sense to you.  It's here
 163 at the front for easy reference.
 164 .PP
 165 A single-character option may be combined with the following option, if any.
 166 This is particularly useful when invoking a script using the #! construct which
 167 only allows one argument.  Example:
 168 .nf
 169
 170 .ne 2
 171         #!/usr/bin/perl \-spi.bak       # same as \-s \-p \-i.bak
 172         .\|.\|.
 173
 174 .fi
 175 Options include:
 176 .TP 5
 177 .BI \-0 digits
 178 specifies the record separator ($/) as an octal number.
 179 If there are no digits, the null character is the separator.
 180 Other switches may precede or follow the digits.
 181 For example, if you have a version of
 182 .I find
 183 which can print filenames terminated by the null character, you can say this:
 184 .nf
 185
 186     find . \-name '*.bak' \-print0 | perl \-n0e unlink
 187
 188 .fi
 189 The special value 00 will cause Perl to slurp files in paragraph mode.
 190 The value 0777 will cause Perl to slurp files whole since there is no
 191 legal character with that value.
 192 .TP 5
 193 .B \-a
 194 turns on autosplit mode when used with a
 195 .B \-n
 196 or
 197 .BR \-p .
 198 An implicit split command to the @F array
 199 is done as the first thing inside the implicit while loop produced by
 200 the
 201 .B \-n
 202 or
 203 .BR \-p .
 204 .nf
 205
 206         perl \-ane \'print pop(@F), "\en";\'
 207
 208 is equivalent to
 209
 210         while (<>) {
 211                 @F = split(\' \');
 212                 print pop(@F), "\en";
 213         }
 214
 215 .fi
 216 .TP 5
 217 .B \-c
 218 causes
 219 .I perl
 220 to check the syntax of the script and then exit without executing it.
 221 .TP 5
  222 .B \-d
 223 runs the script under the perl debugger.
 224 See the section on Debugging.
 225 .TP 5
 226 .BI \-D number
 227 sets debugging flags.
 228 To watch how it executes your script, use
 229 .BR \-D14 .
 230 (This only works if debugging is compiled into your
 231 .IR perl .)
 232 Another nice value is \-D1024, which lists your compiled syntax tree.
 233 And \-D512 displays compiled regular expressions.
 234 .TP 5
 235 .BI \-e " commandline"
 236 may be used to enter one line of script.
 237 Multiple
 238 .B \-e
 239 commands may be given to build up a multi-line script.
 240 If
 241 .B \-e
 242 is given,
 243 .I perl
 244 will not look for a script filename in the argument list.
 245 .TP 5
 246 .BI \-i extension
 247 specifies that files processed by the <> construct are to be edited
 248 in-place.
 249 It does this by renaming the input file, opening the output file by the
 250 same name, and selecting that output file as the default for print statements.
 251 The extension, if supplied, is added to the name of the
 252 old file to make a backup copy.
 253 If no extension is supplied, no backup is made.
 254 Saying \*(L"perl \-p \-i.bak \-e "s/foo/bar/;" .\|.\|. \*(R" is the same as using
 255 the script:
 256 .nf
 257
 258 .ne 2
 259         #!/usr/bin/perl \-pi.bak
 260         s/foo/bar/;
 261
 262 which is equivalent to
 263
 264 .ne 14
 265         #!/usr/bin/perl
 266         while (<>) {
 267                 if ($ARGV ne $oldargv) {
 268                         rename($ARGV, $ARGV . \'.bak\');
 269                         open(ARGVOUT, ">$ARGV");
 270                         select(ARGVOUT);
 271                         $oldargv = $ARGV;
 272                 }
 273                 s/foo/bar/;
 274         }
 275         continue {
 276             print;      # this prints to original filename
 277         }
 278         select(STDOUT);
 279
 280 .fi
 281 except that the
 282 .B \-i
 283 form doesn't need to compare $ARGV to $oldargv to know when
 284 the filename has changed.
 285 It does, however, use ARGVOUT for the selected filehandle.
 286 Note that
 287 .I STDOUT
 288 is restored as the default output filehandle after the loop.
 289 .Sp
 290 You can use eof to locate the end of each input file, in case you want
 291 to append to each file, or reset line numbering (see example under eof).
 292 .TP 5
 293 .BI \-I directory
 294 may be used in conjunction with
 295 .B \-P
 296 to tell the C preprocessor where to look for include files.
 297 By default /usr/include and /usr/lib/perl are searched.
 298 .TP 5
 299 .BI \-l octnum
 300 enables automatic line-ending processing.  It has two effects:
 301 first, it automatically chops the line terminator when used with
 302 .B \-n
 303 or
  304 .BR \-p ,
 305 and second, it assigns $\e to have the value of
 306 .I octnum
 307 so that any print statements will have that line terminator added back on.  If
 308 .I octnum
 309 is omitted, sets $\e to the current value of $/.
 310 For instance, to trim lines to 80 columns:
 311 .nf
 312
  313         perl \-lpe \'substr($_, 80) = ""\'
 314
 315 .fi
 316 Note that the assignment $\e = $/ is done when the switch is processed,
 317 so the input record separator can be different than the output record
 318 separator if the
 319 .B \-l
 320 switch is followed by a
 321 .B \-0
 322 switch:
 323 .nf
 324
 325         gnufind / -print0 | perl -ln0e 'print "found $_" if -p'
 326
 327 .fi
 328 This sets $\e to newline and then sets $/ to the null character.
 329 .TP 5
 330 .B \-n
 331 causes
 332 .I perl
 333 to assume the following loop around your script, which makes it iterate
 334 over filename arguments somewhat like \*(L"sed \-n\*(R" or \fIawk\fR:
 335 .nf
 336
 337 .ne 3
 338         while (<>) {
 339                 .\|.\|.         # your script goes here
 340         }
 341
 342 .fi
 343 Note that the lines are not printed by default.
 344 See
 345 .B \-p
 346 to have lines printed.
 347 Here is an efficient way to delete all files older than a week:
 348 .nf
 349
 350         find . \-mtime +7 \-print | perl \-nle \'unlink;\'
 351
 352 .fi
 353 This is faster than using the \-exec switch of find because you don't have to
 354 start a process on every filename found.
 355 .TP 5
 356 .B \-p
 357 causes
 358 .I perl
 359 to assume the following loop around your script, which makes it iterate
 360 over filename arguments somewhat like \fIsed\fR:
 361 .nf
 362
 363 .ne 5
 364         while (<>) {
 365                 .\|.\|.         # your script goes here
 366         } continue {
 367                 print;
 368         }
 369
 370 .fi
 371 Note that the lines are printed automatically.
 372 To suppress printing use the
 373 .B \-n
 374 switch.
 375 A
 376 .B \-p
 377 overrides a
 378 .B \-n
 379 switch.
 380 .TP 5
 381 .B \-P
 382 causes your script to be run through the C preprocessor before
 383 compilation by
 384 .IR perl .
 385 (Since both comments and cpp directives begin with the # character,
 386 you should avoid starting comments with any words recognized
 387 by the C preprocessor such as \*(L"if\*(R", \*(L"else\*(R" or \*(L"define\*(R".)
 388 .TP 5
 389 .B \-s
 390 enables some rudimentary switch parsing for switches on the command line
 391 after the script name but before any filename arguments (or before a \-\|\-).
 392 Any switch found there is removed from @ARGV and sets the corresponding variable in the
 393 .I perl
 394 script.
 395 The following script prints \*(L"true\*(R" if and only if the script is
 396 invoked with a \-xyz switch.
 397 .nf
 398
 399 .ne 2
 400         #!/usr/bin/perl \-s
 401         if ($xyz) { print "true\en"; }
 402
 403 .fi
 404 .TP 5
 405 .B \-S
 406 makes
 407 .I perl
 408 use the PATH environment variable to search for the script
 409 (unless the name of the script starts with a slash).
 410 Typically this is used to emulate #! startup on machines that don't
 411 support #!, in the following manner:
 412 .nf
 413
 414         #!/usr/bin/perl
 415         eval "exec /usr/bin/perl \-S $0 $*"
 416                 if $running_under_some_shell;
 417
 418 .fi
 419 The system ignores the first line and feeds the script to /bin/sh,
 420 which proceeds to try to execute the
 421 .I perl
 422 script as a shell script.
 423 The shell executes the second line as a normal shell command, and thus
 424 starts up the
 425 .I perl
 426 interpreter.
 427 On some systems $0 doesn't always contain the full pathname,
 428 so the
 429 .B \-S
 430 tells
 431 .I perl
 432 to search for the script if necessary.
 433 After
 434 .I perl
 435 locates the script, it parses the lines and ignores them because
 436 the variable $running_under_some_shell is never true.
 437 A better construct than $* would be ${1+"$@"}, which handles embedded spaces
 438 and such in the filenames, but doesn't work if the script is being interpreted
 439 by csh.
 440 In order to start up sh rather than csh, some systems may have to replace the
 441 #! line with a line containing just
 442 a colon, which will be politely ignored by perl.
 443 Other systems can't control that, and need a totally devious construct that
 444 will work under any of csh, sh or perl, such as the following:
 445 .nf
 446
 447 .ne 3
 448         eval '(exit $?0)' && eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
 449         & eval 'exec /usr/bin/perl -S $0 $argv:q'
 450                 if 0;
 451
 452 .fi
 453 .TP 5
 454 .B \-u
 455 causes
 456 .I perl
 457 to dump core after compiling your script.
 458 You can then take this core dump and turn it into an executable file
 459 by using the undump program (not supplied).
 460 This speeds startup at the expense of some disk space (which you can
 461 minimize by stripping the executable).
 462 (Still, a "hello world" executable comes out to about 200K on my machine.)
 463 If you are going to run your executable as a set-id program then you
 464 should probably compile it using taintperl rather than normal perl.
 465 If you want to execute a portion of your script before dumping, use the
 466 dump operator instead.
  467 Note: undump is platform specific and may not be available
  468 for a particular port of perl.
 469 .TP 5
 470 .B \-U
 471 allows
 472 .I perl
 473 to do unsafe operations.
 474 Currently the only \*(L"unsafe\*(R" operations are the unlinking of directories while
 475 running as superuser, and running setuid programs with fatal taint checks
 476 turned into warnings.
 477 .TP 5
 478 .B \-v
 479 prints the version and patchlevel of your
 480 .I perl
 481 executable.
 482 .TP 5
 483 .B \-w
 484 prints warnings about identifiers that are mentioned only once, and scalar
 485 variables that are used before being set.
 486 Also warns about redefined subroutines, and references to undefined
  487 filehandles or filehandles opened read-only that you are attempting to
 488 write on.
 489 Also warns you if you use == on values that don't look like numbers, and if
 490 your subroutines recurse more than 100 deep.
 491 .TP 5
 492 .BI \-x directory
 493 tells
 494 .I perl
 495 that the script is embedded in a message.
 496 Leading garbage will be discarded until the first line that starts
 497 with #! and contains the string "perl".
 498 Any meaningful switches on that line will be applied (but only one
 499 group of switches, as with normal #! processing).
 500 If a directory name is specified, Perl will switch to that directory
 501 before running the script.
 502 The
 503 .B \-x
  504 switch only controls the disposal of leading garbage.
 505 The script must be terminated with _\|_END_\|_ if there is trailing garbage
 506 to be ignored (the script can process any or all of the trailing garbage
 507 via the DATA filehandle if desired).
 508 .Sh "Data Types and Objects"
 509 .PP
 510 .I Perl
 511 has three data types: scalars, arrays of scalars, and
 512 associative arrays of scalars.
 513 Normal arrays are indexed by number, and associative arrays by string.
 514 .PP
 515 The interpretation of operations and values in perl sometimes
 516 depends on the requirements
 517 of the context around the operation or value.
 518 There are three major contexts: string, numeric and array.
 519 Certain operations return array values
 520 in contexts wanting an array, and scalar values otherwise.
 521 (If this is true of an operation it will be mentioned in the documentation
 522 for that operation.)
 523 Operations which return scalars don't care whether the context is looking
 524 for a string or a number, but
 525 scalar variables and values are interpreted as strings or numbers
 526 as appropriate to the context.
 527 A scalar is interpreted as TRUE in the boolean sense if it is not the null
 528 string or 0.
 529 Booleans returned by operators are 1 for true and 0 or \'\' (the null
 530 string) for false.
 531 .PP
 532 There are actually two varieties of null string: defined and undefined.
 533 Undefined null strings are returned when there is no real value for something,
 534 such as when there was an error, or at end of file, or when you refer
 535 to an uninitialized variable or element of an array.
 536 An undefined null string may become defined the first time you access it, but
 537 prior to that you can use the defined() operator to determine whether the
 538 value is defined or not.
 539 .PP
 540 References to scalar variables always begin with \*(L'$\*(R', even when referring
 541 to a scalar that is part of an array.
 542 Thus:
 543 .nf
 544
 545 .ne 3
 546     $days       \h'|2i'# a simple scalar variable
 547     $days[28]   \h'|2i'# 29th element of array @days
 548     $days{\'Feb\'}\h'|2i'# one value from an associative array
 549     $#days      \h'|2i'# last index of array @days
 550
 551 but entire arrays or array slices are denoted by \*(L'@\*(R':
 552
 553     @days       \h'|2i'# ($days[0], $days[1],\|.\|.\|. $days[n])
 554     @days[3,4,5]\h'|2i'# same as @days[3.\|.5]
 555     @days{'a','c'}\h'|2i'# same as ($days{'a'},$days{'c'})
 556
 557 and entire associative arrays are denoted by \*(L'%\*(R':
 558
 559     %days       \h'|2i'# (key1, val1, key2, val2 .\|.\|.)
 560 .fi
 561 .PP
 562 Any of these eight constructs may serve as an lvalue,
 563 that is, may be assigned to.
 564 (It also turns out that an assignment is itself an lvalue in
 565 certain contexts\*(--see examples under s, tr and chop.)
 566 Assignment to a scalar evaluates the righthand side in a scalar context,
 567 while assignment to an array or array slice evaluates the righthand side
 568 in an array context.
 569 .PP
 570 You may find the length of array @days by evaluating
 571 \*(L"$#days\*(R", as in
 572 .IR csh .
  573 (Strictly speaking, this is the subscript of the last element rather than the length, since there is ordinarily a 0th element.)
 574 Assigning to $#days changes the length of the array.
 575 Shortening an array by this method does not actually destroy any values.
 576 Lengthening an array that was previously shortened recovers the values that
 577 were in those elements.
 578 You can also gain some measure of efficiency by preextending an array that
 579 is going to get big.
 580 (You can also extend an array by assigning to an element that is off the
 581 end of the array.
 582 This differs from assigning to $#whatever in that intervening values
 583 are set to null rather than recovered.)
 584 You can truncate an array down to nothing by assigning the null list () to
 585 it.
  586 The following are exactly equivalent:
 587 .nf
 588
 589         @whatever = ();
 590         $#whatever = $[ \- 1;
 591
 592 .fi
 593 .PP
 594 If you evaluate an array in a scalar context, it returns the length of
 595 the array.
 596 The following is always true:
 597 .nf
 598
 599         scalar(@whatever) == $#whatever \- $[ + 1;
 600
 601 .fi
 602 If you evaluate an associative array in a scalar context, it returns
 603 a value which is true if and only if the array contains any elements.
 604 (If there are any elements, the value returned is a string consisting
 605 of the number of used buckets and the number of allocated buckets, separated
 606 by a slash.)
 607 .PP
 608 Multi-dimensional arrays are not directly supported, but see the discussion
 609 of the $; variable later for a means of emulating multiple subscripts with
 610 an associative array.
 611 You could also write a subroutine to turn multiple subscripts into a single
 612 subscript.
 613 .PP
 614 Every data type has its own namespace.
 615 You can, without fear of conflict, use the same name for a scalar variable,
 616 an array, an associative array, a filehandle, a subroutine name, and/or
 617 a label.
 618 Since variable and array references always start with \*(L'$\*(R', \*(L'@\*(R',
 619 or \*(L'%\*(R', the \*(L"reserved\*(R" words aren't in fact reserved
 620 with respect to variable names.
 621 (They ARE reserved with respect to labels and filehandles, however, which
 622 don't have an initial special character.
 623 Hint: you could say open(LOG,\'logfile\') rather than open(log,\'logfile\').
 624 Using uppercase filehandles also improves readability and protects you
 625 from conflict with future reserved words.)
 626 Case IS significant\*(--\*(L"FOO\*(R", \*(L"Foo\*(R" and \*(L"foo\*(R" are all
 627 different names.
 628 Names which start with a letter may also contain digits and underscores.
 629 Names which do not start with a letter are limited to one character,
 630 e.g. \*(L"$%\*(R" or \*(L"$$\*(R".
 631 (Most of the one character names have a predefined significance to
 632 .IR perl .
 633 More later.)
 634 .PP
 635 Numeric literals are specified in any of the usual floating point or
 636 integer formats:
 637 .nf
 638
 639 .ne 6
 640     12345
 641     12345.67
 642     .23E-10
 643     0xffff      # hex
 644     0377        # octal
 645     4_294_967_296
 646
 647 .fi
 648 String literals are delimited by either single or double quotes.
 649 They work much like shell quotes:
 650 double-quoted string literals are subject to backslash and variable
 651 substitution; single-quoted strings are not (except for \e\' and \e\e).
 652 The usual backslash rules apply for making characters such as newline, tab,
 653 etc., as well as some more exotic forms:
 654 .nf
 655
 656         \et             tab
 657         \en             newline
 658         \er             return
 659         \ef             form feed
 660         \eb             backspace
 661         \ea             alarm (bell)
 662         \ee             escape
 663         \e033           octal char
 664         \ex1b           hex char
 665         \ec[            control char
 666         \el             lowercase next char
 667         \eu             uppercase next char
 668         \eL             lowercase till \eE
 669         \eU             uppercase till \eE
 670         \eE             end case modification
 671
 672 .fi
 673 You can also embed newlines directly in your strings, i.e. they can end on
 674 a different line than they begin.
 675 This is nice, but if you forget your trailing quote, the error will not be
 676 reported until
 677 .I perl
 678 finds another line containing the quote character, which
 679 may be much further on in the script.
 680 Variable substitution inside strings is limited to scalar variables, normal
 681 array values, and array slices.
 682 (In other words, identifiers beginning with $ or @, followed by an optional
 683 bracketed expression as a subscript.)
  684 The following code segment prints out \*(L"The price is $100.\*(R":
 685 .nf
 686
 687 .ne 2
 688     $Price = \'$100\';\h'|3.5i'# not interpreted
 689     print "The price is $Price.\e\|n";\h'|3.5i'# interpreted
 690
 691 .fi
 692 Note that you can put curly brackets around the identifier to delimit it
 693 from following alphanumerics.
 694 Also note that a single quoted string must be separated from a preceding
 695 word by a space, since single quote is a valid character in an identifier
 696 (see Packages).
 697 .PP
 698 Two special literals are _\|_LINE_\|_ and _\|_FILE_\|_, which represent the current
 699 line number and filename at that point in your program.
 700 They may only be used as separate tokens; they will not be interpolated
 701 into strings.
 702 In addition, the token _\|_END_\|_ may be used to indicate the logical end of the
 703 script before the actual end of file.
 704 Any following text is ignored, but may be read via the DATA filehandle.
 705 (The DATA filehandle may read data only from the main script, but not from
 706 any required file or evaluated string.)
 707 The two control characters ^D and ^Z are synonyms for _\|_END_\|_.
 708 .PP
 709 A word that doesn't have any other interpretation in the grammar will be
 710 treated as if it had single quotes around it.
 711 For this purpose, a word consists only of alphanumeric characters and underline,
 712 and must start with an alphabetic character.
 713 As with filehandles and labels, a bare word that consists entirely of
 714 lowercase letters risks conflict with future reserved words, and if you
 715 use the
 716 .B \-w
 717 switch, Perl will warn you about any such words.
 718 .PP
 719 Array values are interpolated into double-quoted strings by joining all the
 720 elements of the array with the delimiter specified in the $" variable,
 721 space by default.
 722 (Since in versions of perl prior to 3.0 the @ character was not a metacharacter
 723 in double-quoted strings, the interpolation of @array, $array[EXPR],
 724 @array[LIST], $array{EXPR}, or @array{LIST} only happens if array is
 725 referenced elsewhere in the program or is predefined.)
 726 The following are equivalent:
 727 .nf
 728
 729 .ne 4
 730         $temp = join($",@ARGV);
 731         system "echo $temp";
 732
 733         system "echo @ARGV";
 734
 735 .fi
 736 Within search patterns (which also undergo double-quotish substitution)
 737 there is a bad ambiguity:  Is /$foo[bar]/ to be
 738 interpreted as /${foo}[bar]/ (where [bar] is a character class for the
 739 regular expression) or as /${foo[bar]}/ (where [bar] is the subscript to
 740 array @foo)?
 741 If @foo doesn't otherwise exist, then it's obviously a character class.
 742 If @foo exists, perl takes a good guess about [bar], and is almost always right.
 743 If it does guess wrong, or if you're just plain paranoid,
 744 you can force the correct interpretation with curly brackets as above.
 745 .PP
 746 A line-oriented form of quoting is based on the shell here-is syntax.
 747 Following a << you specify a string to terminate the quoted material, and all lines
 748 following the current line down to the terminating string are the value
 749 of the item.
 750 The terminating string may be either an identifier (a word), or some
 751 quoted text.
 752 If quoted, the type of quotes you use determines the treatment of the text,
 753 just as in regular quoting.
 754 An unquoted identifier works like double quotes.
 755 There must be no space between the << and the identifier.
 756 (If you put a space it will be treated as a null identifier, which is
 757 valid, and matches the first blank line\*(--see Merry Christmas example below.)
 758 The terminating string must appear by itself (unquoted and with no surrounding
 759 whitespace) on the terminating line.
 760 .nf
 761
 762         print <<EOF;            # same as above
 763 The price is $Price.
 764 EOF
 765
 766         print <<"EOF";          # same as above
 767 The price is $Price.
 768 EOF
 769
 770         print << x 10;          # null identifier is delimiter
 771 Merry Christmas!
 772
 773         print <<`EOC`;          # execute commands
 774 echo hi there
 775 echo lo there
 776 EOC
 777
 778         print <<foo, <<bar;     # you can stack them
 779 I said foo.
 780 foo
 781 I said bar.
 782 bar
 783
 784 .fi
 785 Array literals are denoted by separating individual values by commas, and
 786 enclosing the list in parentheses:
 787 .nf
 788
 789         (LIST)
 790
 791 .fi
 792 In a context not requiring an array value, the value of the array literal
 793 is the value of the final element, as in the C comma operator.
 794 For example,
 795 .nf
 796
 797 .ne 4
 798     @foo = (\'cc\', \'\-E\', $bar);
 799
 800 assigns the entire array value to array foo, but
 801
 802     $foo = (\'cc\', \'\-E\', $bar);
 803
 804 .fi
 805 assigns the value of variable bar to variable foo.
 806 Note that the value of an actual array in a scalar context is the length
 807 of the array; the following assigns to $foo the value 3:
 808 .nf
 809
 810 .ne 2
 811     @foo = (\'cc\', \'\-E\', $bar);
 812     $foo = @foo;                # $foo gets 3
 813
 814 .fi
 815 You may have an optional comma before the closing parenthesis of an
 816 array literal, so that you can say:
 817 .nf
 818
 819     @foo = (
 820         1,
 821         2,
 822         3,
 823     );
 824
 825 .fi
 826 When a LIST is evaluated, each element of the list is evaluated in
 827 an array context, and the resulting array value is interpolated into LIST
 828 just as if each individual element were a member of LIST.  Thus arrays
 829 lose their identity in a LIST\*(--the list
 830
 831         (@foo,@bar,&SomeSub)
 832
 833 contains all the elements of @foo followed by all the elements of @bar,
 834 followed by all the elements returned by the subroutine named SomeSub.
 835 .PP
 836 A list value may also be subscripted like a normal array.
 837 Examples:
 838 .nf
 839
 840         $time = (stat($file))[8];       # stat returns array value
 841         $digit = ('a','b','c','d','e','f')[$digit-10];
 842         return (pop(@foo),pop(@foo))[0];
 843
 844 .fi
 845 .PP
 846 Array lists may be assigned to if and only if each element of the list
 847 is an lvalue:
 848 .nf
 849
 850     ($a, $b, $c) = (1, 2, 3);
 851
 852     ($map{\'red\'}, $map{\'blue\'}, $map{\'green\'}) = (0x00f, 0x0f0, 0xf00);
 853
 854 The final element may be an array or an associative array:
 855
 856     ($a, $b, @rest) = split;
 857     local($a, $b, %rest) = @_;
 858
 859 .fi
 860 You can actually put an array anywhere in the list, but the first array
 861 in the list will soak up all the values, and anything after it will get
 862 a null value.
 863 This may be useful in a local().
 864 .PP
 865 An associative array literal contains pairs of values to be interpreted
 866 as a key and a value:
 867 .nf
 868
 869 .ne 2
 870     # same as map assignment above
 871     %map = ('red',0x00f,'blue',0x0f0,'green',0xf00);
 872
 873 .fi
 874 Array assignment in a scalar context returns the number of elements
 875 produced by the expression on the right side of the assignment:
 876 .nf
 877
 878         $x = (($foo,$bar) = (3,2,1));   # set $x to 3, not 2
 879
 880 .fi
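.PP
One use of this, sketched here with made-up data, is to learn how many
values an expression produced while keeping only the first few:
.nf

```perl
@fields = ("root", "x", 0, 0, "Operator", "/", "/bin/sh");  # sample record
$count = (($login, $passwd) = @fields);
print "$login has $count fields\n";     # root has 7 fields
```

.fi
The same trick makes a list assignment usable as a conditional: it is
false only when the right side produces no elements at all.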
 881 .PP
 882 There are several other pseudo-literals that you should know about.
 883 If a string is enclosed by backticks (grave accents), it first undergoes
 884 variable substitution just like a double quoted string.
 885 It is then interpreted as a command, and the output of that command
 886 is the value of the pseudo-literal, like in a shell.
 887 In a scalar context, a single string consisting of all the output is
 888 returned.
 889 In an array context, an array of values is returned, one for each line
 890 of output.
 891 (You can set $/ to use a different line terminator.)
 892 The command is executed each time the pseudo-literal is evaluated.
 893 The status value of the command is returned in $? (see Predefined Names
 894 for the interpretation of $?).
 895 Unlike in \f2csh\f1, no translation is done on the return
 896 data\*(--newlines remain newlines.
 897 Unlike in any of the shells, single quotes do not hide variable names
 898 in the command from interpretation.
 899 To pass a $ through to the shell you need to hide it with a backslash.
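.PP
The following sketch shows the two contexts side by side.
It assumes a Unix-like shell with an echo command; the variable names
are made up for illustration.
.nf

```perl
$who = "world";
$all = `echo hello $who; echo goodbye`;     # scalar: one string, two newlines
@lines = `echo hello $who; echo goodbye`;   # array: one element per line
$status = $?;                               # status of the most recent command
$count = @lines;
print "$count lines, status $status, first: $lines[0]";
```

.fi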
 900 .PP
 901 Evaluating a filehandle in angle brackets yields the next line
 902 from that file (newline included, so it's never false until EOF, at
 903 which time an undefined value is returned).
 904 Ordinarily you must assign that value to a variable,
 905 but there is one situation where an automatic assignment happens.
 906 If (and only if) the input symbol is the only thing inside the conditional of a
 907 .I while
 908 loop, the value is
 909 automatically assigned to the variable \*(L"$_\*(R".
 910 (This may seem like an odd thing to you, but you'll use the construct
 911 in almost every
 912 .I perl
 913 script you write.)
 914 Anyway, the following lines are equivalent to each other:
 915 .nf
 916
 917 .ne 5
 918     while ($_ = <STDIN>) { print; }
 919     while (<STDIN>) { print; }
 920     for (\|;\|<STDIN>;\|) { print; }
 921     print while $_ = <STDIN>;
 922     print while <STDIN>;
 923
 924 .fi
 925 The filehandles
 926 .IR STDIN ,
 927 .I STDOUT
 928 and
 929 .I STDERR
 930 are predefined.
 931 (The filehandles
 932 .IR stdin ,
 933 .I stdout
 934 and
 935 .I stderr
 936 will also work except in packages, where they would be interpreted as
 937 local identifiers rather than global.)
 938 Additional filehandles may be created with the
 939 .I open
 940 function.
 941 .PP
 942 If a <FILEHANDLE> is used in a context that is looking for an array, an array
 943 consisting of all the input lines is returned, one line per array element.
 944 It's easy to make a LARGE data space this way, so use with care.
 945 .PP
 946 The null filehandle <> is special and can be used to emulate the behavior of
 947 \fIsed\fR and \fIawk\fR.
 948 Input from <> comes either from standard input, or from each file listed on
 949 the command line.
 950 Here's how it works: the first time <> is evaluated, the ARGV array is checked,
 951 and if it is null, $ARGV[0] is set to \'-\', which when opened gives you standard
 952 input.
 953 The ARGV array is then processed as a list of filenames.
 954 The loop
 955 .nf
 956
 957 .ne 3
 958         while (<>) {
 959                 .\|.\|.                 # code for each line
 960         }
 961
 962 .ne 10
 963 is equivalent to the following Perl-like pseudo code:
 964
 965         unshift(@ARGV, \'\-\') \|if \|$#ARGV < $[;
 966         while ($ARGV = shift) {
 967                 open(ARGV, $ARGV);
 968                 while (<ARGV>) {
 969                         .\|.\|.         # code for each line
 970                 }
 971         }
 972
 973 .fi
 974 except that it isn't as cumbersome to say, and will actually work.
 975 It really does shift array ARGV and put the current filename into
 976 variable ARGV.
 977 It also uses filehandle ARGV internally\*(--<> is just a synonym for
 978 <ARGV>, which is magical.
 979 (The pseudo code above doesn't work because it treats <ARGV> as non-magical.)
 980 .PP
 981 You can modify @ARGV before the first <> as long as the array ends up
 982 containing the list of filenames you really want.
 983 Line numbers ($.) continue as if the input was one big happy file.
 984 (But see example under eof for how to reset line numbers on each file.)
 985 .PP
 986 .ne 5
 987 If you want to set @ARGV to your own list of files, go right ahead.
 988 If you want to pass switches into your script, you can
 989 put a loop on the front like this:
 990 .nf
 991
 992 .ne 10
 993         while ($_ = $ARGV[0], /\|^\-/\|) {
 994                 shift;
  995                 last if /\|^\-\|\-$\|/\|;
 996                 /\|^\-D\|(.*\|)/ \|&& \|($debug = $1);
 997                 /\|^\-v\|/ \|&& \|$verbose++;
 998                 .\|.\|.         # other switches
 999         }
1000         while (<>) {
1001                 .\|.\|.         # code for each line
1002         }
1003
1004 .fi
1005 The <> symbol will return FALSE only once.
1006 If you call it again after this it will assume you are processing another
1007 @ARGV list, and if you haven't set @ARGV, will input from
1008 .IR STDIN .
1009 .PP
1010 If the string inside the angle brackets is a reference to a scalar variable
1011 (e.g. <$foo>),
1012 then that variable contains the name of the filehandle to input from.
1013 .PP
1014 If the string inside angle brackets is not a filehandle, it is interpreted
1015 as a filename pattern to be globbed, and either an array of filenames or the
1016 next filename in the list is returned, depending on context.
1017 One level of $ interpretation is done first, but you can't say <$foo>
1018 because that's an indirect filehandle as explained in the previous
1019 paragraph.
1020 You could insert curly brackets to force interpretation as a
1021 filename glob: <${foo}>.
1022 Example:
1023 .nf
1024
1025 .ne 3
1026         while (<*.c>) {
1027                 chmod 0644, $_;
1028         }
1029
1030 is equivalent to
1031
1032 .ne 5
1033         open(foo, "echo *.c | tr \-s \' \et\er\ef\' \'\e\e012\e\e012\e\e012\e\e012\'|");
1034         while (<foo>) {
1035                 chop;
1036                 chmod 0644, $_;
1037         }
1038
1039 .fi
1040 In fact, it's currently implemented that way.
1041 (Which means it will not work on filenames with spaces in them unless
1042 you have /bin/csh on your machine.)
1043 Of course, the shortest way to do the above is:
1044 .nf
1045
1046         chmod 0644, <*.c>;
1047
1048 .fi
1049 .Sh "Syntax"
1050 .PP
1051 A
1052 .I perl
1053 script consists of a sequence of declarations and commands.
1054 The only things that need to be declared in
1055 .I perl
1056 are report formats and subroutines.
1057 See the sections below for more information on those declarations.
1058 All uninitialized user-created objects are assumed to
1059 start with a null or 0 value until they
1060 are defined by some explicit operation such as assignment.
1061 The sequence of commands is executed just once, unlike in
1062 .I sed
1063 and
1064 .I awk
1065 scripts, where the sequence of commands is executed for each input line.
1066 While this means that you must explicitly loop over the lines of your input file
1067 (or files), it also means you have much more control over which files and which
1068 lines you look at.
1069 (Actually, I'm lying\*(--it is possible to do an implicit loop with either the
1070 .B \-n
1071 or
1072 .B \-p
1073 switch.)
1074 .PP
1075 A declaration can be put anywhere a command can, but has no effect on the
1076 execution of the primary sequence of commands\*(--declarations all take effect
1077 at compile time.
1078 Typically all the declarations are put at the beginning or the end of the script.
1079 .PP
1080 .I Perl
1081 is, for the most part, a free-form language.
1082 (The only exception to this is format declarations, for fairly obvious reasons.)
1083 Comments are indicated by the # character, and extend to the end of the line.
 1084 If you attempt to use /* */ C comments, they will be interpreted either as
1085 division or pattern matching, depending on the context.
1086 So don't do that.
1087 .Sh "Compound statements"
1088 In
1089 .IR perl ,
1090 a sequence of commands may be treated as one command by enclosing it
1091 in curly brackets.
1092 We will call this a BLOCK.
1093 .PP
1094 The following compound commands may be used to control flow:
1095 .nf
1096
1097 .ne 4
1098         if (EXPR) BLOCK
1099         if (EXPR) BLOCK else BLOCK
1100         if (EXPR) BLOCK elsif (EXPR) BLOCK .\|.\|. else BLOCK
1101         LABEL while (EXPR) BLOCK
1102         LABEL while (EXPR) BLOCK continue BLOCK
1103         LABEL for (EXPR; EXPR; EXPR) BLOCK
1104         LABEL foreach VAR (ARRAY) BLOCK
1105         LABEL BLOCK continue BLOCK
1106
1107 .fi
1108 Note that, unlike C and Pascal, these are defined in terms of BLOCKs, not
1109 statements.
1110 This means that the curly brackets are \fIrequired\fR\*(--no dangling statements allowed.
1111 If you want to write conditionals without curly brackets there are several
1112 other ways to do it.
1113 The following all do the same thing:
1114 .nf
1115
1116 .ne 5
1117         if (!open(foo)) { die "Can't open $foo: $!"; }
1118         die "Can't open $foo: $!" unless open(foo);
1119         open(foo) || die "Can't open $foo: $!"; # foo or bust!
1120         open(foo) ? \'hi mom\' : die "Can't open $foo: $!";
1121                                 # a bit exotic, that last one
1122
1123 .fi
1124 .PP
1125 The
1126 .I if
1127 statement is straightforward.
1128 Since BLOCKs are always bounded by curly brackets, there is never any
1129 ambiguity about which
1130 .I if
1131 an
1132 .I else
1133 goes with.
1134 If you use
1135 .I unless
1136 in place of
1137 .IR if ,
1138 the sense of the test is reversed.
1139 .PP
1140 The
1141 .I while
1142 statement executes the block as long as the expression is true
1143 (does not evaluate to the null string or 0).
1144 The LABEL is optional, and if present, consists of an identifier followed by
1145 a colon.
1146 The LABEL identifies the loop for the loop control statements
1147 .IR next ,
1148 .IR last ,
1149 and
1150 .I redo
1151 (see below).
1152 If there is a
1153 .I continue
1154 BLOCK, it is always executed just before
1155 the conditional is about to be evaluated again, similarly to the third part
1156 of a
1157 .I for
1158 loop in C.
1159 Thus it can be used to increment a loop variable, even when the loop has
1160 been continued via the
1161 .I next
1162 statement (similar to the C \*(L"continue\*(R" statement).
1163 .PP
1164 If the word
1165 .I while
1166 is replaced by the word
1167 .IR until ,
1168 the sense of the test is reversed, but the conditional is still tested before
1169 the first iteration.
1170 .PP
1171 In either the
1172 .I if
1173 or the
1174 .I while
1175 statement, you may replace \*(L"(EXPR)\*(R" with a BLOCK, and the conditional
1176 is true if the value of the last command in that block is true.
1177 .PP
1178 The
1179 .I for
1180 loop works exactly like the corresponding
1181 .I while
1182 loop:
1183 .nf
1184
1185 .ne 12
1186         for ($i = 1; $i < 10; $i++) {
1187                 .\|.\|.
1188         }
1189
1190 is the same as
1191
1192         $i = 1;
1193         while ($i < 10) {
1194                 .\|.\|.
1195         } continue {
1196                 $i++;
1197         }
1198 .fi
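.PP
To see why the continue BLOCK matters, here is a small sketch (the
variable names are arbitrary) in which next skips the rest of the body
but the increment still happens:
.nf

```perl
$i = 0;
$odd = "";
while ($i < 6) {
    next if $i % 2 == 0;    # skip even values
    $odd .= $i;
} continue {
    $i++;                   # runs after every pass, even one cut short by next
}
print "$odd\n";             # 135
```

.fi
Had the increment been the last statement of the main BLOCK instead,
the first next would have skipped it and the loop would never end.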
1199 .PP
1200 The foreach loop iterates over a normal array value and sets the variable
1201 VAR to be each element of the array in turn.
1202 The variable is implicitly local to the loop, and regains its former value
1203 upon exiting the loop.
1204 The \*(L"foreach\*(R" keyword is actually identical to the \*(L"for\*(R" keyword,
1205 so you can use \*(L"foreach\*(R" for readability or \*(L"for\*(R" for brevity.
1206 If VAR is omitted, $_ is set to each value.
1207 If ARRAY is an actual array (as opposed to an expression returning an array
1208 value), you can modify each element of the array
1209 by modifying VAR inside the loop.
1210 Examples:
1211 .nf
1212
1213 .ne 5
1214         for (@ary) { s/foo/bar/; }
1215
1216         foreach $elem (@elements) {
1217                 $elem *= 2;
1218         }
1219
1220 .ne 3
1221         for ((10,9,8,7,6,5,4,3,2,1,\'BOOM\')) {
1222                 print $_, "\en"; sleep(1);
1223         }
1224
1225         for (1..15) { print "Merry Christmas\en"; }
1226
1227 .ne 3
1228         foreach $item (split(/:[\e\e\en:]*/, $ENV{\'TERMCAP\'})) {
1229                 print "Item: $item\en";
1230         }
1231
1232 .fi
1233 .PP
1234 The BLOCK by itself (labeled or not) is equivalent to a loop that executes
1235 once.
1236 Thus you can use any of the loop control statements in it to leave or
1237 restart the block.
1238 The
1239 .I continue
1240 block is optional.
1241 This construct is particularly nice for doing case structures.
1242 .nf
1243
1244 .ne 6
1245         foo: {
1246                 if (/^abc/) { $abc = 1; last foo; }
1247                 if (/^def/) { $def = 1; last foo; }
1248                 if (/^xyz/) { $xyz = 1; last foo; }
1249                 $nothing = 1;
1250         }
1251
1252 .fi
1253 There is no official switch statement in perl, because there
1254 are already several ways to write the equivalent.
1255 In addition to the above, you could write
1256 .nf
1257
1258 .ne 6
1259         foo: {
1260                 $abc = 1, last foo  if /^abc/;
1261                 $def = 1, last foo  if /^def/;
1262                 $xyz = 1, last foo  if /^xyz/;
1263                 $nothing = 1;
1264         }
1265
1266 or
1267
1268 .ne 6
1269         foo: {
1270                 /^abc/ && do { $abc = 1; last foo; };
1271                 /^def/ && do { $def = 1; last foo; };
1272                 /^xyz/ && do { $xyz = 1; last foo; };
1273                 $nothing = 1;
1274         }
1275
1276 or
1277
1278 .ne 6
1279         foo: {
1280                 /^abc/ && ($abc = 1, last foo);
1281                 /^def/ && ($def = 1, last foo);
1282                 /^xyz/ && ($xyz = 1, last foo);
1283                 $nothing = 1;
1284         }
1285
1286 or even
1287
1288 .ne 8
1289         if (/^abc/)
1290                 { $abc = 1; }
1291         elsif (/^def/)
1292                 { $def = 1; }
1293         elsif (/^xyz/)
1294                 { $xyz = 1; }
1295         else
1296                 {$nothing = 1;}
1297
1298 .fi
1299 As it happens, these are all optimized internally to a switch structure,
1300 so perl jumps directly to the desired statement, and you needn't worry
1301 about perl executing a lot of unnecessary statements when you have a string
1302 of 50 elsifs, as long as you are testing the same simple scalar variable
1303 using ==, eq, or pattern matching as above.
1304 (If you're curious as to whether the optimizer has done this for a particular
1305 case statement, you can use the \-D1024 switch to list the syntax tree
1306 before execution.)
1307 .Sh "Simple statements"
1308 The only kind of simple statement is an expression evaluated for its side
1309 effects.
1310 Every simple statement must be terminated with a semicolon, unless it is the
1311 final statement in a block, in which case the semicolon is optional.
 1312 (A semicolon is still encouraged there if the block takes up more than one line.)
1313 .PP
1314 Any simple statement may optionally be followed by a
1315 single modifier, just before the terminating semicolon.
1316 The possible modifiers are:
1317 .nf
1318
1319 .ne 4
1320         if EXPR
1321         unless EXPR
1322         while EXPR
1323         until EXPR
1324
1325 .fi
1326 The
1327 .I if
1328 and
1329 .I unless
1330 modifiers have the expected semantics.
1331 The
1332 .I while
1333 and
1334 .I until
1335 modifiers also have the expected semantics (conditional evaluated first),
1336 except when applied to a do-BLOCK or a do-SUBROUTINE command,
1337 in which case the block executes once before the conditional is evaluated.
1338 This is so that you can write loops like:
1339 .nf
1340
1341 .ne 4
1342         do {
1343                 $_ = <STDIN>;
1344                 .\|.\|.
1345         } until $_ \|eq \|".\|\e\|n";
1346
1347 .fi
1348 (See the
1349 .I do
1350 operator below.  Note also that the loop control commands described later will
1351 NOT work in this construct, since modifiers don't take loop labels.
1352 Sorry.)
1353 .Sh "Expressions"
1354 Since
1355 .I perl
1356 expressions work almost exactly like C expressions, only the differences
1357 will be mentioned here.
1358 .PP
1359 Here's what
1360 .I perl
1361 has that C doesn't:
1362 .Ip ** 8 2
1363 The exponentiation operator.
1364 .Ip **= 8
1365 The exponentiation assignment operator.
1366 .Ip (\|) 8 3
1367 The null list, used to initialize an array to null.
1368 .Ip . 8
1369 Concatenation of two strings.
1370 .Ip .= 8
1371 The concatenation assignment operator.
1372 .Ip eq 8
1373 String equality (== is numeric equality).
1374 For a mnemonic just think of \*(L"eq\*(R" as a string.
1375 (If you are used to the
1376 .I awk
1377 behavior of using == for either string or numeric equality
1378 based on the current form of the comparands, beware!
1379 You must be explicit here.)
1380 .Ip ne 8
1381 String inequality (!= is numeric inequality).
1382 .Ip lt 8
1383 String less than.
1384 .Ip gt 8
1385 String greater than.
1386 .Ip le 8
1387 String less than or equal.
1388 .Ip ge 8
1389 String greater than or equal.
1390 .Ip cmp 8
1391 String comparison, returning -1, 0, or 1.
1392 .Ip <=> 8
1393 Numeric comparison, returning -1, 0, or 1.
1394 .Ip =~ 8 2
1395 Certain operations search or modify the string \*(L"$_\*(R" by default.
1396 This operator makes that kind of operation work on some other string.
1397 The right argument is a search pattern, substitution, or translation.
1398 The left argument is what is supposed to be searched, substituted, or
1399 translated instead of the default \*(L"$_\*(R".
1400 The return value indicates the success of the operation.
1401 (If the right argument is an expression other than a search pattern,
1402 substitution, or translation, it is interpreted as a search pattern
1403 at run time.
1404 This is less efficient than an explicit search, since the pattern must
1405 be compiled every time the expression is evaluated.)
1406 The precedence of this operator is lower than unary minus and autoincrement/decrement, but higher than everything else.
1407 .Ip !~ 8
1408 Just like =~ except the return value is negated.
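.Sp
A short sketch of both operators at work; the path is made-up sample data:
.nf

```perl
$path = "/usr/local/bin/perl";
($base = $path) =~ s#.*/##;         # substitute on a copy, leaving $path alone
$absolute = ($path =~ m#^/#);       # true, since $path starts with a slash
$plain = ($path !~ /[0-9]/);        # true, since $path contains no digits
print "$base $absolute $plain\n";   # perl 1 1
```

.fi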
1409 .Ip x 8
1410 The repetition operator.
1411 Returns a string consisting of the left operand repeated the
1412 number of times specified by the right operand.
1413 In an array context, if the left operand is a list in parens, it repeats
1414 the list.
1415 .nf
1416
1417         print \'\-\' x 80;              # print row of dashes
1418         print \'\-\' x80;               # illegal, x80 is identifier
1419
1420         print "\et" x ($tab/8), \' \' x ($tab%8);       # tab over
1421
1422         @ones = (1) x 80;               # an array of 80 1's
1423         @ones = (5) x @ones;            # set all elements to 5
1424
1425 .fi
1426 .Ip x= 8
1427 The repetition assignment operator.
1428 Only works on scalars.
1429 .Ip .\|. 8
1430 The range operator, which is really two different operators depending
1431 on the context.
1432 In an array context, returns an array of values counting (by ones)
1433 from the left value to the right value.
1434 This is useful for writing \*(L"for (1..10)\*(R" loops and for doing
1435 slice operations on arrays.
1436 .Sp
1437 In a scalar context, .\|. returns a boolean value.
1438 The operator is bistable, like a flip-flop, and
1439 emulates the line-range (comma) operator of sed, awk, and various editors.
1440 Each .\|. operator maintains its own boolean state.
1441 It is false as long as its left operand is false.
1442 Once the left operand is true, the range operator stays true
1443 until the right operand is true,
1444 AFTER which the range operator becomes false again.
1445 (It doesn't become false till the next time the range operator is evaluated.
1446 It can test the right operand and become false on the
1447 same evaluation it became true (as in awk), but it still returns true once.
1448 If you don't want it to test the right operand till the next
1449 evaluation (as in sed), use three dots (.\|.\|.) instead of two.)
1450 The right operand is not evaluated while the operator is in the \*(L"false\*(R" state,
1451 and the left operand is not evaluated while the operator is in the \*(L"true\*(R" state.
1452 The precedence is a little lower than || and &&.
1453 The value returned is either the null string for false, or a sequence number
1454 (beginning with 1) for true.
1455 The sequence number is reset for each range encountered.
1456 The final sequence number in a range has the string \'E0\' appended to it, which
1457 doesn't affect its numeric value, but gives you something to search for if you
1458 want to exclude the endpoint.
1459 You can exclude the beginning point by waiting for the sequence number to be
1460 greater than 1.
1461 If either operand of scalar .\|. is static, that operand is implicitly compared
1462 to the $. variable, the current line number.
1463 Examples:
1464 .nf
1465
1466 .ne 6
1467 As a scalar operator:
1468     if (101 .\|. 200) { print; }        # print 2nd hundred lines
1469
1470     next line if (1 .\|. /^$/); # skip header lines
1471
1472     s/^/> / if (/^$/ .\|. eof());       # quote body
1473
1474 .ne 4
1475 As an array operator:
1476     for (101 .\|. 200) { print; }       # print $_ 100 times
1477
1478     @foo = @foo[$[ .\|. $#foo]; # an expensive no-op
1479     @foo = @foo[$#foo-4 .\|. $#foo];    # slice last 5 items
1480
1481 .fi
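.Sp
Here is a sketch of the sequence numbers in use, with made-up data
standing in for input lines.
Neither operand is static, so nothing is compared against the current
line number here.
.nf

```perl
@lines = ("a", "BEGIN", "b", "c", "END", "d");
@body = ();
foreach $line (@lines) {
    $seq = ($line eq "BEGIN") .. ($line eq "END");
    next unless $seq;           # null string: outside the range
    next if $seq == 1;          # first line of the range
    next if $seq =~ /E0$/;      # last line of the range
    push(@body, $line);
}
print "@body\n";                # b c
```

.fi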
1482 .Ip \-x 8
1483 A file test.
1484 This unary operator takes one argument, either a filename or a filehandle,
1485 and tests the associated file to see if something is true about it.
1486 If the argument is omitted, tests $_, except for \-t, which tests
1487 .IR STDIN .
1488 It returns 1 for true and \'\' for false, or the undefined value if the
1489 file doesn't exist.
1490 Precedence is higher than logical and relational operators, but lower than
1491 arithmetic operators.
1492 The operator may be any of:
1493 .nf
1494         \-r     File is readable by effective uid/gid.
1495         \-w     File is writable by effective uid/gid.
1496         \-x     File is executable by effective uid/gid.
1497         \-o     File is owned by effective uid.
1498         \-R     File is readable by real uid/gid.
1499         \-W     File is writable by real uid/gid.
1500         \-X     File is executable by real uid/gid.
1501         \-O     File is owned by real uid.
1502         \-e     File exists.
1503         \-z     File has zero size.
1504         \-s     File has non-zero size (returns size).
1505         \-f     File is a plain file.
1506         \-d     File is a directory.
1507         \-l     File is a symbolic link.
1508         \-p     File is a named pipe (FIFO).
1509         \-S     File is a socket.
1510         \-b     File is a block special file.
1511         \-c     File is a character special file.
1512         \-u     File has setuid bit set.
1513         \-g     File has setgid bit set.
1514         \-k     File has sticky bit set.
1515         \-t     Filehandle is opened to a tty.
1516         \-T     File is a text file.
1517         \-B     File is a binary file (opposite of \-T).
1518         \-M     Age of file in days when script started.
1519         \-A     Same for access time.
1520         \-C     Same for inode change time.
1521
1522 .fi
1523 The interpretation of the file permission operators \-r, \-R, \-w, \-W, \-x and \-X
1524 is based solely on the mode of the file and the uids and gids of the user.
1525 There may be other reasons you can't actually read, write or execute the file.
1526 Also note that, for the superuser, \-r, \-R, \-w and \-W always return 1, and
1527 \-x and \-X return 1 if any execute bit is set in the mode.
1528 Scripts run by the superuser may thus need to do a stat() in order to determine
1529 the actual mode of the file, or temporarily set the uid to something else.
1530 .Sp
1531 Example:
1532 .nf
1533 .ne 7
1534
1535         while (<>) {
1536                 chop;
1537                 next unless \-f $_;     # ignore specials
1538                 .\|.\|.
1539         }
1540
1541 .fi
1542 Note that \-s/a/b/ does not do a negated substitution.
1543 Saying \-exp($foo) still works as expected, however\*(--only single letters
1544 following a minus are interpreted as file tests.
1545 .Sp
1546 The \-T and \-B switches work as follows.
1547 The first block or so of the file is examined for odd characters such as
1548 strange control codes or metacharacters.
 1549 If too many odd characters (>10%) are found, it's a \-B file; otherwise it's a \-T file.
1550 Also, any file containing null in the first block is considered a binary file.
1551 If \-T or \-B is used on a filehandle, the current stdio buffer is examined
1552 rather than the first block.
1553 Both \-T and \-B return TRUE on a null file, or a file at EOF when testing
1554 a filehandle.
1555 .PP
1556 If any of the file tests (or either stat operator) are given the special
1557 filehandle consisting of a solitary underline, then the stat structure
1558 of the previous file test (or stat operator) is used, saving a system
1559 call.
 1560 (This doesn't work with \-t, and you need to remember that lstat and \-l
1561 will leave values in the stat structure for the symbolic link, not the
1562 real file.)
1563 Example:
1564 .nf
1565
1566         print "Can do.\en" if -r $a || -w _ || -x _;
1567
1568 .ne 9
1569         stat($filename);
1570         print "Readable\en" if -r _;
1571         print "Writable\en" if -w _;
1572         print "Executable\en" if -x _;
1573         print "Setuid\en" if -u _;
1574         print "Setgid\en" if -g _;
1575         print "Sticky\en" if -k _;
1576         print "Text\en" if -T _;
1577         print "Binary\en" if -B _;
1578
1579 .fi
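.PP
The same saving applies when the first test is a file test rather than
a stat.
A sketch with hypothetical paths (the last one is assumed not to exist):
.nf

```perl
foreach $file ("/", "/etc/passwd", "/no/such/file") {
    next unless -e $file;           # the only stat call for this file
    $kind{$file} = (-d _) ? "directory"
                 : (-f _) ? "plain file"
                 :          "something special";
    $size{$file} = (-s _) || 0;     # -s yields the null string for size 0
    print "$file: $kind{$file}, $size{$file} bytes\n";
}
```

.fi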
1580 .PP
1581 Here is what C has that
1582 .I perl
1583 doesn't:
1584 .Ip "unary &" 12
1585 Address-of operator.
1586 .Ip "unary *" 12
1587 Dereference-address operator.
1588 .Ip "(TYPE)" 12
1589 Type casting operator.
1590 .PP
1591 Like C,
1592 .I perl
1593 does a certain amount of expression evaluation at compile time, whenever
1594 it determines that all of the arguments to an operator are static and have
1595 no side effects.
1596 In particular, string concatenation happens at compile time between literals that don't do variable substitution.
1597 Backslash interpretation also happens at compile time.
1598 You can say
1599 .nf
1600
1601 .ne 2
1602         \'Now is the time for all\' . "\|\e\|n" .
1603         \'good men to come to.\'
1604
1605 .fi
1606 and this all reduces to one string internally.
1607 .PP
1608 The autoincrement operator has a little extra built-in magic to it.
1609 If you increment a variable that is numeric, or that has ever been used in
1610 a numeric context, you get a normal increment.
1611 If, however, the variable has only been used in string contexts since it
1612 was set, and has a value that is not null and matches the
1613 pattern /^[a\-zA\-Z]*[0\-9]*$/, the increment is done
1614 as a string, preserving each character within its range, with carry:
1615 .nf
1616
1617         print ++($foo = \'99\');        # prints \*(L'100\*(R'
1618         print ++($foo = \'a0\');        # prints \*(L'a1\*(R'
1619         print ++($foo = \'Az\');        # prints \*(L'Ba\*(R'
1620         print ++($foo = \'zz\');        # prints \*(L'aaa\*(R'
1621
1622 .fi
1623 The autodecrement is not magical.
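.PP
One use for the magic, sketched with made-up variable names, is
generating a run of labels.
Note that the decrement at the end treats its operand as a plain number:
.nf

```perl
$label = "aa8";
@ids = ();
foreach $n (1 .. 3) {
    push(@ids, $label++);       # aa8, aa9, then ab0 with carry
}
$down = "ab0";
$down--;                        # not magical: "ab0" counts as 0 here
print "@ids $label $down\n";    # aa8 aa9 ab0 ab1 -1
```

.fi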
1624 .PP
1625 The range operator (in an array context) makes use of the magical
1626 autoincrement algorithm if the minimum and maximum are strings.
1627 You can say
1628
1629         @alphabet = (\'A\' .. \'Z\');
1630
1631 to get all the letters of the alphabet, or
1632
1633         $hexdigit = (0 .. 9, \'a\' .. \'f\')[$num & 15];
1634
1635 to get a hexadecimal digit, or
1636
1637         @z2 = (\'01\' .. \'31\');  print @z2[$mday];
1638
1639 to get dates with leading zeros.
1640 (If the final value specified is not in the sequence that the magical increment
1641 would produce, the sequence goes until the next value would be longer than
1642 the final value specified.)
1643 .PP
1644 The || and && operators differ from C's in that, rather than returning 0 or 1,
1645 they return the last value evaluated.
1646 Thus, a portable way to find out the home directory might be:
1647 .nf
1648
1649         $home = $ENV{'HOME'} || $ENV{'LOGDIR'} ||
1650             (getpwuid($<))[7] || die "You're homeless!\en";
1651
1652 .fi
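.PP
The same property makes || handy for supplying defaults.
A sketch with a made-up settings table; note that a legitimate value of
0 or the null string is replaced by the default as well:
.nf

```perl
%opt = ("width", 0, "pager", "more");
$width  = $opt{"width"}  || 80;         # 0 is false, so the default wins
$pager  = $opt{"pager"}  || "cat";      # "more" is true and is returned as is
$editor = $opt{"editor"} || "vi";       # no such key, so the default wins
$last = $pager && $editor;              # both true: && returns the last one
print "$width $pager $editor $last\n";  # 80 more vi vi
```

.fi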
1653 .PP
1654 Along with the literals and variables mentioned earlier,
1655 the operations in the following section can serve as terms in an expression.
1656 Some of these operations take a LIST as an argument.
1657 Such a list can consist of any combination of scalar arguments or array values;
1658 the array values will be included in the list as if each individual element were
1659 interpolated at that point in the list, forming a longer single-dimensional
1660 array value.
1661 Elements of the LIST should be separated by commas.
1662 If an operation is listed both with and without parentheses around its
1663 arguments, it means you can either use it as a unary operator or
1664 as a function call.
1665 To use it as a function call, the next token on the same line must
1666 be a left parenthesis.
1667 (There may be intervening white space.)
1668 Such a function then has highest precedence, as you would expect from
1669 a function.
1670 If any token other than a left parenthesis follows, then it is a
1671 unary operator, with a precedence depending only on whether it is a LIST
1672 operator or not.
1673 LIST operators have lowest precedence.
1674 All other unary operators have a precedence greater than relational operators
1675 but less than arithmetic operators.
1676 See the section on Precedence.
1677 .PP
1678 For operators that can be used in either a scalar or array context,
1679 failure is generally indicated in a scalar context by returning
1680 the undefined value, and in an array context by returning the null list.
1681 Remember though that
1682 THERE IS NO GENERAL RULE FOR CONVERTING A LIST INTO A SCALAR.
1683 Each operator decides which sort of scalar it would be most
1684 appropriate to return.
1685 Some operators return the length of the list
1686 that would have been returned in an array context.
1687 Some operators return the first value in the list.
1688 Some operators return the last value in the list.
1689 Some operators return a count of successful operations.
1690 In general, they do what you want, unless you want consistency.
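.PP
A few cases, sketched with made-up variable names, show how the answer
depends on what is being converted:
.nf

```perl
@list = ("x", "y", "z");
$count = @list;                     # an array yields its length
$last = ("x", "y", "z");            # a comma list yields its final value
($first) = @list;                   # array context: first element is kept
$assigned = (($p, $q) = @list);     # list assignment yields the right-side count
print "$count $last $first $assigned\n";    # 3 z x 3
```

.fi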
1691 .Ip "/PATTERN/" 8 4
1692 See m/PATTERN/.
1693 .Ip "?PATTERN?" 8 4
1694 This is just like the /pattern/ search, except that it matches only once between
1695 calls to the
1696 .I reset
1697 operator.
1698 This is a useful optimization when you only want to see the first occurrence of
1699 something in each file of a set of files, for instance.
1700 Only ?? patterns local to the current package are reset.
1701 .Ip "accept(NEWSOCKET,GENERICSOCKET)" 8 2
1702 Does the same thing that the accept system call does.
1703 Returns true if it succeeded, false otherwise.
1704 See example in section on Interprocess Communication.
1705 .Ip "alarm(SECONDS)" 8 4
1706 .Ip "alarm SECONDS" 8
1707 Arranges to have a SIGALRM delivered to this process after the specified number
1708 of seconds (minus 1, actually) have elapsed.  Thus, alarm(15) will cause
1709 a SIGALRM at some point more than 14 seconds in the future.
1710 Only one timer may be counting at once.  Each call disables the previous
1711 timer, and an argument of 0 may be supplied to cancel the previous timer
1712 without starting a new one.
1713 The returned value is the amount of time remaining on the previous timer.
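.Sp
A minimal sketch of starting and then cancelling a timer, using only the
behavior described above ($line and $left are names chosen for the example):
.nf

        alarm(30);              # ask for a SIGALRM in about 30 seconds
        $line = <STDIN>;        # an uncaught SIGALRM would end the process
        $left = alarm(0);       # cancel the timer; $left is what remained

.fi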
1714 .Ip "atan2(Y,X)" 8 2
1715 Returns the arctangent of Y/X in the range
1716 .if t \-\(*p to \(*p.
1717 .if n \-PI to PI.
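.Sp
For example, since the arctangent of 1 is one quarter of
.if t \(*p,
.if n PI,
one common way to obtain that constant is:
.nf

        $pi = 4 * atan2(1,1);

.fi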
1718 .Ip "bind(SOCKET,NAME)" 8 2
1719 Does the same thing that the bind system call does.
1720 Returns true if it succeeded, false otherwise.
1721 NAME should be a packed address of the proper type for the socket.
1722 See example in section on Interprocess Communication.
1723 .Ip "binmode(FILEHANDLE)" 8 4
1724 .Ip "binmode FILEHANDLE" 8 4
1725 Arranges for the file to be read in \*(L"binary\*(R" mode in operating systems
1726 that distinguish between binary and text files.
1727 Files that are not read in binary mode have CR LF sequences translated
1728 to LF on input and LF translated to CR LF on output.
1729 Binmode has no effect under Unix.
1730 If FILEHANDLE is an expression, the value is taken as the name of
1731 the filehandle.
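.Sp
A sketch of typical use; the filehandle and file name are purely illustrative:
.nf

        open(IMAGE, 'picture.gif')
                || die "Can't open picture.gif: $!\en";
        binmode(IMAGE);         # call it before reading from IMAGE
        read(IMAGE, $header, 6);

.fi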
1732 .Ip "caller(EXPR)"
1733 .Ip "caller"
1734 Returns the context of the current subroutine call:
1735 .nf
1736
1737         ($package,$filename,$line) = caller;
1738
1739 .fi
1740 With EXPR, returns some extra information that the debugger uses to print
1741 a stack trace.  The value of EXPR indicates how many call frames to go
1742 back before the current one.
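.Sp
As an illustration, a subroutine can report where it was called from
(the subroutine name is made up for the example):
.nf

        sub whence {
                local($package,$filename,$line) = caller;
                print STDERR "called from $filename line $line\en";
        }

        &whence;

.fi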
1743 .Ip "chdir(EXPR)" 8 2
1744 .Ip "chdir EXPR" 8 2
1745 Changes the working directory to EXPR, if possible.
1746 If EXPR is omitted, changes to home directory.
1747 Returns 1 upon success, 0 otherwise.
1748 See example under
1749 .IR die .
1750 .Ip "chmod(LIST)" 8 2
1751 .Ip "chmod LIST" 8 2
1752 Changes the permissions of a list of files.
1753 The first element of the list must be the numerical mode.
1754 Returns the number of files successfully changed.
1755 .nf
1756
1757 .ne 2
1758         $cnt = chmod 0755, \'foo\', \'bar\';
1759         chmod 0755, @executables;
1760
1761 .fi
1762 .Ip "chop(LIST)" 8 7
1763 .Ip "chop(VARIABLE)" 8
1764 .Ip "chop VARIABLE" 8
1765 .Ip "chop" 8
1766 Chops off the last character of a string and returns the character chopped.
1767 It's used primarily to remove the newline from the end of an input record,
1768 but is much more efficient than s/\en// because it neither scans nor copies
1769 the string.
1770 If VARIABLE is omitted, chops $_.
1771 Example:
1772 .nf
1773
1774 .ne 5
1775         while (<>) {
1776                 chop;   # avoid \en on last field
1777                 @array = split(/:/);
1778                 .\|.\|.
1779         }
1780
1781 .fi
1782 You can actually chop anything that's an lvalue, including an assignment:
1783 .nf
1784
1785         chop($cwd = \`pwd\`);
1786         chop($answer = <STDIN>);
1787
1788 .fi
1789 If you chop a list, each element is chopped.
1790 Only the value of the last chop is returned.
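.Sp
For instance, to strip the trailing newline from every line of input read
all at once (a sketch; @lines is just an example name):
.nf

        @lines = <STDIN>;
        chop(@lines);           # chops each element of @lines

.fi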
1791 .Ip "chown(LIST)" 8 2
1792 .Ip "chown LIST" 8 2
1793 Changes the owner (and group) of a list of files.
1794 The first two elements of the list must be the NUMERICAL uid and gid,
1795 in that order.
1796 Returns the number of files successfully changed.
1797 .nf
1798
1799 .ne 2
1800         $cnt = chown $uid, $gid, \'foo\', \'bar\';
1801         chown $uid, $gid, @filenames;
1802
1803 .fi
1804 .ne 23
1805 Here's an example that looks up non-numeric uids in the passwd file:
1806 .nf
1807
1808         print "User: ";
1809         $user = <STDIN>;
1810         chop($user);
1811         print "Files: ";
1812         $pattern = <STDIN>;
1813         chop($pattern);
1814 .ie t \{\
1815         open(pass, \'/etc/passwd\') || die "Can't open passwd: $!\en";
1816 'br\}
1817 .el \{\
1818         open(pass, \'/etc/passwd\')
1819                 || die "Can't open passwd: $!\en";
1820 'br\}
1821         while (<pass>) {
1822                 ($login,$pass,$uid,$gid) = split(/:/);
1823                 $uid{$login} = $uid;
1824                 $gid{$login} = $gid;
1825         }
1826         @ary = <${pattern}>;    # get filenames
1827         if ($uid{$user} eq \'\') {
1828                 die "$user not in passwd file";
1829         }
1830         else {
1831                 chown $uid{$user}, $gid{$user}, @ary;
1832         }
1833
1834 .fi
1835 .Ip "chroot(FILENAME)" 8 5
1836 .Ip "chroot FILENAME" 8
1837 Does the same as the system call of that name.
1838 If you don't know what it does, don't worry about it.
1839 If FILENAME is omitted, does chroot to $_.
1840 .Ip "close(FILEHANDLE)" 8 5
1841 .Ip "close FILEHANDLE" 8
1842 Closes the file or pipe associated with the file handle, returning true only
1843 if stdio successfully flushes buffers and closes the system file descriptor.
1844 You don't have to close FILEHANDLE if you are immediately going to
1845 do another open on it, since open will close it for you.
1846 (See
1847 .IR open .)
1848 However, an explicit close on an input file resets the line counter ($.), while
1849 the implicit close done by
1850 .I open
1851 does not.
1852 Also, closing a pipe will wait for the process executing on the pipe to complete,
1853 in case you want to look at the output of the pipe afterwards.
1854 Closing a pipe explicitly also puts the status value of the command into $?.
1855 Example:
1856 .nf
1857
1858 .ne 4
1859         open(OUTPUT, \'|sort >foo\');   # pipe to sort
1860         .\|.\|. # print stuff to output
1861         close OUTPUT;           # wait for sort to finish
1862         open(INPUT, \'foo\');   # get sort's results
1863
1864 .fi
1865 FILEHANDLE may be an expression whose value gives the real filehandle name.
1866 .Ip "closedir(DIRHANDLE)" 8 5
1867 .Ip "closedir DIRHANDLE" 8
1868 Closes a directory opened by opendir().
1869 .Ip "connect(SOCKET,NAME)" 8 2
1870 Does the same thing that the connect system call does.
1871 Returns true if it succeeded, false otherwise.
1872 NAME should be a packed address of the proper type for the socket.
1873 See example in section on Interprocess Communication.
1874 .Ip "cos(EXPR)" 8 6
1875 .Ip "cos EXPR" 8 6
1876 Returns the cosine of EXPR (expressed in radians).
1877 If EXPR is omitted, takes cosine of $_.
1878 .Ip "crypt(PLAINTEXT,SALT)" 8 6
1879 Encrypts a string exactly like the crypt() function in the C library.
1880 Useful for checking the password file for lousy passwords.
1881 Only the guys wearing white hats should do this.
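.Sp
A sketch of checking one guess against one account, assuming the encrypted
password is readable via getpwnam (that is, not hidden in a shadow file);
$user and $guess are placeholders:
.nf

        ($name,$passwd) = getpwnam($user);
        $salt = substr($passwd, 0, 2);  # traditional 2-character salt
        if (crypt($guess, $salt) eq $passwd) {
                print "$user has a lousy password\en";
        }

.fi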
1882 .Ip "dbmclose(ASSOC_ARRAY)" 8 6
1883 .Ip "dbmclose ASSOC_ARRAY" 8
1884 Breaks the binding between a dbm file and an associative array.
1885 The values remaining in the associative array are meaningless unless
1886 you happen to want to know what was in the cache for the dbm file.
1887 This function is only useful if you have ndbm.
1888 .Ip "dbmopen(ASSOC,DBNAME,MODE)" 8 6
1889 This binds a dbm or ndbm file to an associative array.
1890 ASSOC is the name of the associative array.
1891 (Unlike normal open, the first argument is NOT a filehandle, even though
1892 it looks like one).
1893 DBNAME is the name of the database (without the .dir or .pag extension).
1894 If the database does not exist, it is created with protection specified
1895 by MODE (as modified by the umask).
1896 If your system only supports the older dbm functions, you may perform only one
1897 dbmopen in your program.
1898 If your system has neither dbm nor ndbm, calling dbmopen produces a fatal
1899 error.
1900 .Sp
1901 Values assigned to the associative array prior to the dbmopen are lost.
1902 A certain number of values from the dbm file are cached in memory.
1903 By default this number is 64, but you can increase it by preallocating
1904 that number of garbage entries in the associative array before the dbmopen.
1905 You can flush the cache if necessary with the reset command.
1906 .Sp
1907 If you don't have write access to the dbm file, you can only read
1908 associative array variables, not set them.
1909 If you want to test whether you can write, either use file tests or
1910 try setting a dummy array entry inside an eval, which will trap the error.
1911 .Sp
1912 Note that functions such as keys() and values() may return huge array values
1913 when used on large dbm files.
1914 You may prefer to use the each() function to iterate over large dbm files.
1915 Example:
1916 .nf
1917
1918 .ne 6
1919         # print out history file offsets
1920         dbmopen(HIST,'/usr/lib/news/history',0666);
1921         while (($key,$val) = each %HIST) {
1922                 print $key, ' = ', unpack('L',$val), "\en";
1923         }
1924         dbmclose(HIST);
1925
1926 .fi
1927 .Ip "defined(EXPR)" 8 6
1928 .Ip "defined EXPR" 8
1929 Returns a boolean value saying whether the lvalue EXPR has a real value
1930 or not.
1931 Many operations return the undefined value under exceptional conditions,
1932 such as end of file, uninitialized variable, system error and such.
1933 This function allows you to distinguish between an undefined null string
1934 and a defined null string with operations that might return a real null
1935 string, in particular referencing elements of an array.
1936 You may also check to see if arrays or subroutines exist.
1937 Use on predefined variables is not guaranteed to produce intuitive results.
1938 Examples:
1939 .nf
1940
1941 .ne 7
1942         print if defined $switch{'D'};
1943         print "$val\en" while defined($val = pop(@ary));
1944         die "Can't readlink $sym: $!"
1945                 unless defined($value = readlink $sym);
1946         eval '@foo = ()' if defined(@foo);
1947         die "No XYZ package defined" unless defined %_XYZ;
1948         sub foo { defined &$bar ? &$bar(@_) : die "No bar"; }
1949
1950 .fi
1951 See also undef.
1952 .Ip "delete $ASSOC{KEY}" 8 6
1953 Deletes the specified value from the specified associative array.
1954 Returns the deleted value, or the undefined value if nothing was deleted.
1955 Deleting from $ENV{} modifies the environment.
1956 Deleting from an array bound to a dbm file deletes the entry from the dbm
1957 file.
1958 .Sp
1959 The following deletes all the values of an associative array:
1960 .nf
1961
1962 .ne 3
1963         foreach $key (keys %ARRAY) {
1964                 delete $ARRAY{$key};
1965         }
1966
1967 .fi
1968 (But it would be faster to use the
1969 .I reset
1970 command.
1971 Saying undef %ARRAY is faster yet.)
1972 .Ip "die(LIST)" 8
1973 .Ip "die LIST" 8
1974 Outside of an eval, prints the value of LIST to
1975 .I STDERR
1976 and exits with the current value of $!
1977 (errno).
1978 If $! is 0, exits with the value of ($? >> 8) (\`command\` status).
1979 If ($? >> 8) is 0, exits with 255.
1980 Inside an eval, the error message is stuffed into $@ and the eval is terminated
1981 with the undefined value.
1982 .Sp
1983 Equivalent examples:
1984 .nf
1985
1986 .ne 3
1987 .ie t \{\
1988         die "Can't cd to spool: $!\en" unless chdir \'/usr/spool/news\';
1989 'br\}
1990 .el \{\
1991         die "Can't cd to spool: $!\en"
1992                 unless chdir \'/usr/spool/news\';
1993 'br\}
1994
1995         chdir \'/usr/spool/news\' || die "Can't cd to spool: $!\en"
1996
1997 .fi
1998 .Sp
1999 If the value of LIST does not end in a newline, the current script line
2000 number and input line number (if any) are also printed, and a newline is
2001 supplied.
2002 Hint: sometimes appending \*(L", stopped\*(R" to your message will cause it to make
2003 better sense when the string \*(L"at foo line 123\*(R" is appended.
2004 Suppose you are running script \*(L"canasta\*(R".
2005 .nf
2006
2007 .ne 7
2008         die "/etc/games is no good";
2009         die "/etc/games is no good, stopped";
2010
2011 produce, respectively
2012
2013         /etc/games is no good at canasta line 123.
2014         /etc/games is no good, stopped at canasta line 123.
2015
2016 .fi
2017 See also
2018 .IR exit .
2019 .Ip "do BLOCK" 8 4
2020 Returns the value of the last command in the sequence of commands indicated
2021 by BLOCK.
2022 When modified by a loop modifier, executes the BLOCK once before testing the
2023 loop condition.
2024 (On other statements the loop modifiers test the conditional first.)
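.Sp
For example, this loop always asks at least once, because the condition is
not tested until the BLOCK has run ($ans is just an example name):
.nf

        do {
                print "Continue? (y/n) ";
                $ans = <STDIN>;
        } until !defined($ans) || $ans =~ /^[yn]/i;

.fi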
2025 .Ip "do SUBROUTINE (LIST)" 8 3
2026 Executes a SUBROUTINE declared by a
2027 .I sub
2028 declaration, and returns the value
2029 of the last expression evaluated in SUBROUTINE.
2030 If there is no subroutine by that name, produces a fatal error.
2031 (You may use the \*(L"defined\*(R" operator to determine if a subroutine
2032 exists.)
2033 If you pass arrays as part of LIST you may wish to pass the length
2034 of the array in front of each array.
2035 (See the section on subroutines later on.)
2036 The parentheses are required to avoid confusion with the \*(L"do EXPR\*(R"
2037 form.
2038 .Sp
2039 SUBROUTINE may also be a single scalar variable, in which case
2040 the name of the subroutine to execute is taken from the variable.
2041 .Sp
2042 As an alternate (and preferred) form,
2043 you may call a subroutine by prefixing the name with
2044 an ampersand: &foo(@args).
2045 If you aren't passing any arguments, you don't have to use parentheses.
2046 If you omit the parentheses, no @_ array is passed to the subroutine.
2047 The & form is also used to specify subroutines to the defined and undef
2048 operators:
2049 .nf
2050
2051         if (defined &$var) { &$var($parm); undef &$var; }
2052
2053 .fi
2054 .Ip "do EXPR" 8 3
2055 Uses the value of EXPR as a filename and executes the contents of the file
2056 as a
2057 .I perl
2058 script.
2059 Its primary use is to include subroutines from a
2060 .I perl
2061 subroutine library.
2062 .nf
2063
2064         do \'stat.pl\';
2065
2066 is just like
2067
2068         eval \`cat stat.pl\`;
2069
2070 .fi
2071 except that it's more efficient, more concise, keeps track of the current
2072 filename for error messages, and searches all the
2073 .B \-I
2074 libraries if the file
2075 isn't in the current directory (see also the @INC array in Predefined Names).
2076 It's the same, however, in that it does reparse the file every time you
2077 call it, so if you are going to use the file inside a loop you might prefer
2078 to use \-P and #include, at the expense of a little more startup time.
2079 (The main problem with #include is that cpp doesn't grok # comments\*(--a
2080 workaround is to use \*(L";#\*(R" for standalone comments.)
2081 Note that the following are NOT equivalent:
2082 .nf
2083
2084 .ne 2
2085         do $foo;        # eval a file
2086         do $foo();      # call a subroutine
2087
2088 .fi
2089 Note that inclusion of library routines is better done with
2090 the \*(L"require\*(R" operator.
2091 .Ip "dump LABEL" 8 6
2092 This causes an immediate core dump.
2093 Primarily this is so that you can use the undump program to turn your
2094 core dump into an executable binary after having initialized all your
2095 variables at the beginning of the program.
2096 When the new binary is executed it will begin by executing a "goto LABEL"
2097 (with all the restrictions that goto suffers).
2098 Think of it as a goto with an intervening core dump and reincarnation.
2099 If LABEL is omitted, restarts the program from the top.
2100 WARNING: any files opened at the time of the dump will NOT be open any more
2101 when the program is reincarnated, with possible resulting confusion on the part
2102 of perl.
2103 See also \-u.
2104 .Sp
2105 Example:
2106 .nf
2107
2108 .ne 16
2109         #!/usr/bin/perl
2110         require 'getopt.pl';
2111         require 'stat.pl';
2112         %days = (
2113             'Sun',1,
2114             'Mon',2,
2115             'Tue',3,
2116             'Wed',4,
2117             'Thu',5,
2118             'Fri',6,
2119             'Sat',7);
2120
2121         dump QUICKSTART if $ARGV[0] eq '-d';
2122
2123     QUICKSTART:
2124         do Getopt('f');
2125
2126 .fi
2127 .Ip "each(ASSOC_ARRAY)" 8 6
2128 .Ip "each ASSOC_ARRAY" 8
2129 Returns a 2 element array consisting of the key and value for the next
2130 value of an associative array, so that you can iterate over it.
2131 Entries are returned in an apparently random order.
2132 When the array is entirely read, a null array is returned (which when
2133 assigned produces a FALSE (0) value).
2134 The next call to each() after that will start iterating again.
2135 The iterator can be reset only by reading all the elements from the array.
2136 You must not modify the array while iterating over it.
2137 There is a single iterator for each associative array, shared by all
2138 each(), keys() and values() function calls in the program.
2139 The following prints out your environment like the printenv program, only
2140 in a different order:
2141 .nf
2142
2143 .ne 3
2144         while (($key,$value) = each %ENV) {
2145                 print "$key=$value\en";
2146         }
2147
2148 .fi
2149 See also keys() and values().
2150 .Ip "eof(FILEHANDLE)" 8 8
2151 .Ip "eof()" 8
2152 .Ip "eof" 8
2153 Returns 1 if the next read on FILEHANDLE will return end of file, or if
2154 FILEHANDLE is not open.
2155 FILEHANDLE may be an expression whose value gives the real filehandle name.
2156 (Note that this function actually reads a character and then ungetc's it,
2157 so it is not very useful in an interactive context.)
2158 An eof without an argument returns the eof status for the last file read.
2159 Empty parentheses () may be used to indicate the pseudo file formed of the
2160 files listed on the command line, i.e. eof() is reasonable to use inside
2161 a while (<>) loop to detect the end of only the last file.
2162 Use eof(ARGV) or eof without the parentheses to test EACH file in a while (<>) loop.
2163 Examples:
2164 .nf
2165
2166 .ne 7
2167         # insert dashes just before last line of last file
2168         while (<>) {
2169                 if (eof()) {
2170                         print "\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\en";
2171                 }
2172                 print;
2173         }
2174
2175 .ne 7
2176         # reset line numbering on each input file
2177         while (<>) {
2178                 print "$.\et$_";
2179                 if (eof) {      # Not eof().
2180                         close(ARGV);
2181                 }
2182         }
2183
2184 .fi
2185 .Ip "eval(EXPR)" 8 6
2186 .Ip "eval EXPR" 8 6
2187 .Ip "eval BLOCK" 8 6
2188 EXPR is parsed and executed as if it were a little
2189 .I perl
2190 program.
2191 It is executed in the context of the current
2192 .I perl
2193 program, so that
2194 any variable settings, subroutine or format definitions remain afterwards.
2195 The value returned is the value of the last expression evaluated, just
2196 as with subroutines.
2197 If there is a syntax error or runtime error, or a die statement is
2198 executed, an undefined value is returned by
2199 eval, and $@ is set to the error message.
2200 If there was no error, $@ is guaranteed to be a null string.
2201 If EXPR is omitted, evaluates $_.
2202 The final semicolon, if any, may be omitted from the expression.
2203 .Sp
2204 Note that, since eval traps otherwise-fatal errors, it is useful for
2205 determining whether a particular feature
2206 (such as dbmopen or symlink) is implemented.
2207 It is also Perl's exception trapping mechanism, where the die operator is
2208 used to raise exceptions.
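.Sp
A small sketch of that use; the subroutine and its message are invented for
the example:
.nf

        sub risky { die "out of cheese\en"; }

        eval '&risky';
        print STDERR "trapped: $@" if $@;

.fi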
2209 .Sp
2210 If the code to be executed doesn't vary, you may use
2211 the eval-BLOCK form to trap run-time errors without incurring
2212 the penalty of recompiling each time.
2213 The error, if any, is still returned in $@.
2214 Evaluating a single-quoted string (as EXPR) has the same effect, except that
2215 the eval-EXPR form reports syntax errors at run time via $@, whereas the
2216 eval-BLOCK form reports syntax errors at compile time.  The eval-EXPR form
2217 is optimized to eval-BLOCK the first time it succeeds.  (Since the replacement
2218 side of a substitution is considered a single-quoted string when you
2219 use the e modifier, the same optimization occurs there.)  Examples:
2220 .nf
2221
2222 .ne 11
2223         # make divide-by-zero non-fatal
2224         eval { $answer = $a / $b; }; warn $@ if $@;
2225
2226         # optimized to same thing after first use
2227         eval '$answer = $a / $b'; warn $@ if $@;
2228
2229         # a compile-time error
2230         eval { $answer = };
2231
2232         # a run-time error
2233         eval '$answer =';       # sets $@
2234
2235 .fi
2236 .Ip "exec(LIST)" 8 8
2237 .Ip "exec LIST" 8 6
2238 If there is more than one argument in LIST, or if LIST is an array with
2239 more than one value,
2240 calls execvp() with the arguments in LIST.
2241 If there is only one scalar argument, the argument is checked for shell metacharacters.
2242 If there are any, the entire argument is passed to \*(L"/bin/sh \-c\*(R" for parsing.
2243 If there are none, the argument is split into words and passed directly to
2244 execvp(), which is more efficient.
2245 Note: exec (and system) do not flush your output buffer, so you may need to
2246 set $| to avoid lost output.
2247 Examples:
2248 .nf
2249
2250         exec \'/bin/echo\', \'Your arguments are: \', @ARGV;
2251         exec "sort $outfile | uniq";
2252
2253 .fi
2254 .Sp
2255 If you don't really want to execute the first argument, but want to lie
2256 to the program you are executing about its own name, you can specify
2257 the program you actually want to run by assigning that to a variable and
2258 putting the name of the variable in front of the LIST without a comma.
2259 (This always forces interpretation of the LIST as a multi-valued list, even
2260 if there is only a single scalar in the list.)
2261 Example:
2262 .nf
2263
2264 .ne 2
2265         $shell = '/bin/csh';
2266         exec $shell '-sh';              # pretend it's a login shell
2267
2268 .fi
2269 .Ip "exit(EXPR)" 8 6
2270 .Ip "exit EXPR" 8
2271 Evaluates EXPR and exits immediately with that value.
2272 Example:
2273 .nf
2274
2275 .ne 2
2276         $ans = <STDIN>;
2277         exit 0 \|if \|$ans \|=~ \|/\|^[Xx]\|/\|;
2278
2279 .fi
2280 See also
2281 .IR die .
2282 If EXPR is omitted, exits with 0 status.
2283 .Ip "exp(EXPR)" 8 3
2284 .Ip "exp EXPR" 8
2285 Returns
2286 .I e
2287 to the power of EXPR.
2288 If EXPR is omitted, gives exp($_).
2289 .Ip "fcntl(FILEHANDLE,FUNCTION,SCALAR)" 8 4
2290 Implements the fcntl(2) function.
2291 You'll probably have to say
2292 .nf
2293
2294         require "fcntl.ph";     # probably /usr/local/lib/perl/fcntl.ph
2295
2296 .fi
2297 first to get the correct function definitions.
2298 If fcntl.ph doesn't exist or doesn't have the correct definitions
2299 you'll have to roll
2300 your own, based on your C header files such as <sys/fcntl.h>.
2301 (There is a perl script called h2ph that comes with the perl kit
2302 which may help you in this.)
2303 Argument processing and value return work just like ioctl below.
2304 Note that fcntl will produce a fatal error if used on a machine that doesn't implement
2305 fcntl(2).
2306 .Ip "fileno(FILEHANDLE)" 8 4
2307 .Ip "fileno FILEHANDLE" 8 4
2308 Returns the file descriptor for a filehandle.
2309 Useful for constructing bitmaps for select().
2310 If FILEHANDLE is an expression, the value is taken as the name of
2311 the filehandle.
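.Sp
For example, to build a bitmap that asks select() about STDIN only and to
wait up to 5 seconds for input (a sketch; the variable names and the timeout
are arbitrary):
.nf

        $rin = '';
        vec($rin, fileno(STDIN), 1) = 1;
        $nfound = select($rout = $rin, undef, undef, 5);

.fi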
2312 .Ip "flock(FILEHANDLE,OPERATION)" 8 4
2313 Calls flock(2) on FILEHANDLE.
2314 See manual page for flock(2) for definition of OPERATION.
2315 Returns true for success, false on failure.
2316 Will produce a fatal error if used on a machine that doesn't implement
2317 flock(2).
2318 Here's a mailbox appender for BSD systems.
2319 .nf
2320
2321 .ne 20
2322         $LOCK_SH = 1;
2323         $LOCK_EX = 2;
2324         $LOCK_NB = 4;
2325         $LOCK_UN = 8;
2326
2327         sub lock {
2328             flock(MBOX,$LOCK_EX);
2329             # and, in case someone appended
2330             # while we were waiting...
2331             seek(MBOX, 0, 2);
2332         }
2333
2334         sub unlock {
2335             flock(MBOX,$LOCK_UN);
2336         }
2337
2338         open(MBOX, ">>/usr/spool/mail/$ENV{'USER'}")
2339                 || die "Can't open mailbox: $!";
2340
2341         do lock();
2342         print MBOX $msg,"\en\en";
2343         do unlock();
2344
2345 .fi
2346 .Ip "fork" 8 4
2347 Does a fork() system call.
2348 Returns the child pid to the parent process and 0 to the child process,
2349 or undef if the fork is unsuccessful.
2350 Note: unflushed buffers remain unflushed in both processes, which means
2351 you may need to set $| to avoid duplicate output.
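.Sp
A sketch of the usual three-way test on the return value (the command run
by the child is only an example):
.nf

        $| = 1;                 # avoid duplicated buffered output
        if ($pid = fork) {
                wait;           # parent: $pid is the child's pid
        }
        elsif (defined $pid) {
                exec 'date';    # child: $pid is 0
        }
        else {
                die "Can't fork: $!\en";
        }

.fi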
2352 .Ip "getc(FILEHANDLE)" 8 4
2353 .Ip "getc FILEHANDLE" 8
2354 .Ip "getc" 8
2355 Returns the next character from the input file attached to FILEHANDLE, or
2356 a null string at EOF.
2357 If FILEHANDLE is omitted, reads from STDIN.
2358 .Ip "getlogin" 8 3
2359 Returns the current login from /etc/utmp, if any.
2360 If null, use getpwuid.
2361
2362         $login = getlogin || (getpwuid($<))[0] || "Somebody";
2363
2364 .Ip "getpeername(SOCKET)" 8 3
2365 Returns the packed sockaddr address of the other end of the SOCKET connection.
2366 .nf
2367
2368 .ne 4
2369         # An internet sockaddr
2370         $sockaddr = 'S n a4 x8';
2371         $hersockaddr = getpeername(S);
2372 .ie t \{\
2373         ($family, $port, $heraddr) = unpack($sockaddr,$hersockaddr);
2374 'br\}
2375 .el \{\
2376         ($family, $port, $heraddr) =
2377                         unpack($sockaddr,$hersockaddr);
2378 'br\}
2379
2380 .fi
2381 .Ip "getpgrp(PID)" 8 4
2382 .Ip "getpgrp PID" 8
2383 Returns the current process group for the specified PID; a PID of 0 means
2384 the current process.
2385 Will produce a fatal error if used on a machine that doesn't implement
2386 getpgrp(2).
2387 If PID is omitted, returns the process group of the current process.
2388 .Ip "getppid" 8 4
2389 Returns the process id of the parent process.
2390 .Ip "getpriority(WHICH,WHO)" 8 4
2391 Returns the current priority for a process, a process group, or a user.
2392 (See getpriority(2).)
2393 Will produce a fatal error if used on a machine that doesn't implement
2394 getpriority(2).
2395 .Ip "getpwnam(NAME)" 8
2396 .Ip "getgrnam(NAME)" 8
2397 .Ip "gethostbyname(NAME)" 8
2398 .Ip "getnetbyname(NAME)" 8
2399 .Ip "getprotobyname(NAME)" 8
2400 .Ip "getpwuid(UID)" 8
2401 .Ip "getgrgid(GID)" 8
2402 .Ip "getservbyname(NAME,PROTO)" 8
2403 .Ip "gethostbyaddr(ADDR,ADDRTYPE)" 8
2404 .Ip "getnetbyaddr(ADDR,ADDRTYPE)" 8
2405 .Ip "getprotobynumber(NUMBER)" 8
2406 .Ip "getservbyport(PORT,PROTO)" 8
2407 .Ip "getpwent" 8
2408 .Ip "getgrent" 8
2409 .Ip "gethostent" 8
2410 .Ip "getnetent" 8
2411 .Ip "getprotoent" 8
2412 .Ip "getservent" 8
2413 .Ip "setpwent" 8
2414 .Ip "setgrent" 8
2415 .Ip "sethostent(STAYOPEN)" 8
2416 .Ip "setnetent(STAYOPEN)" 8
2417 .Ip "setprotoent(STAYOPEN)" 8
2418 .Ip "setservent(STAYOPEN)" 8
2419 .Ip "endpwent" 8
2420 .Ip "endgrent" 8
2421 .Ip "endhostent" 8
2422 .Ip "endnetent" 8
2423 .Ip "endprotoent" 8
2424 .Ip "endservent" 8
2425 These routines perform the same functions as their counterparts in the
2426 system library.
2427 Within an array context,
2428 the return values from the various get routines are as follows:
2429 .nf
2430
2431         ($name,$passwd,$uid,$gid,
2432            $quota,$comment,$gcos,$dir,$shell) = getpw.\|.\|.
2433         ($name,$passwd,$gid,$members) = getgr.\|.\|.
2434         ($name,$aliases,$addrtype,$length,@addrs) = gethost.\|.\|.
2435         ($name,$aliases,$addrtype,$net) = getnet.\|.\|.
2436         ($name,$aliases,$proto) = getproto.\|.\|.
2437         ($name,$aliases,$port,$proto) = getserv.\|.\|.
2438
2439 .fi
2440 (If the entry doesn't exist you get a null list.)
2441 .Sp
2442 Within a scalar context, you get the name, unless the function was a
2443 lookup by name, in which case you get the other thing, whatever it is.
2444 (If the entry doesn't exist you get the undefined value.)
2445 For example:
2446 .nf
2447
2448         $uid = getpwnam
2449         $name = getpwuid
2450         $name = getpwent
2451         $gid = getgrnam
2452         $name = getgrgid
2453         $name = getgrent
2454         etc.
2455
2456 .fi
2457 The $members value returned by getgr.\|.\|. is a space separated list
2458 of the login names of the members of the group.
2459 .Sp
2460 For the gethost.\|.\|. functions, if the h_errno variable is supported in C,
2461 it will be returned to you via $? if the function call fails.
2462 The @addrs value returned by a successful call is a list of the
2463 raw addresses returned by the corresponding system library call.
2464 In the Internet domain, each address is four bytes long and you can unpack
2465 it by saying something like:
2466 .nf
2467
2468         ($a,$b,$c,$d) = unpack('C4',$addrs[0]);
2469
2470 .fi
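.Sp
Putting these together, a sketch that prints the first address of a host in
dotted form (the host name is only an example):
.nf

        ($name,$aliases,$addrtype,$length,@addrs) =
                gethostbyname('localhost');
        die "Can't find host\en" unless defined $name;
        print join('.', unpack('C4',$addrs[0])), "\en";

.fi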
2471 .Ip "getsockname(SOCKET)" 8 3
2472 Returns the packed sockaddr address of this end of the SOCKET connection.
2473 .nf
2474
2475 .ne 4
2476         # An internet sockaddr
2477         $sockaddr = 'S n a4 x8';
2478         $mysockaddr = getsockname(S);
2479 .ie t \{\
2480         ($family, $port, $myaddr) = unpack($sockaddr,$mysockaddr);
2481 'br\}
2482 .el \{\
2483         ($family, $port, $myaddr) =
2484                         unpack($sockaddr,$mysockaddr);
2485 'br\}
2486
2487 .fi
2488 .Ip "getsockopt(SOCKET,LEVEL,OPTNAME)" 8 3
2489 Returns the socket option requested, or undefined if there is an error.
2490 .Ip "gmtime(EXPR)" 8 4
2491 .Ip "gmtime EXPR" 8
2492 Converts a time as returned by the time function to a 9-element array with
2493 the time analyzed for the Greenwich timezone.
2494 Typically used as follows:
2495 .nf
2496
2497 .ne 3
2498 .ie t \{\
2499     ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime(time);
2500 'br\}
2501 .el \{\
2502     ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
2503                                                 gmtime(time);
2504 'br\}
2505
2506 .fi
2507 All array elements are numeric, and come straight out of a struct tm.
2508 In particular this means that $mon has the range 0.\|.11 and $wday has the
2509 range 0.\|.6.
2510 If EXPR is omitted, does gmtime(time).
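.Sp
Because the values come from a struct tm, $year is the number of years since
1900, and $mon is most easily used as an array index.
A sketch (the @month array is defined only for the example):
.nf

        @month = ('Jan','Feb','Mar','Apr','May','Jun',
                  'Jul','Aug','Sep','Oct','Nov','Dec');
        ($sec,$min,$hour,$mday,$mon,$year) = gmtime(time);
        printf "%d %s %d %02d:%02d:%02d GMT\en",
                $mday, $month[$mon], 1900 + $year, $hour, $min, $sec;

.fi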
2511 .Ip "goto LABEL" 8 6
2512 Finds the statement labeled with LABEL and resumes execution there.
2513 Currently you may only go to statements in the main body of the program
2514 that are not nested inside a do {} construct.
2515 This statement is not implemented very efficiently, and is here only to make
2516 the
2517 .IR sed -to- perl
2518 translator easier.
2519 I may change its semantics at any time, consistent with support for translated
2520 .I sed
2521 scripts.
2522 Use it at your own risk.
2523 Better yet, don't use it at all.
2524 .Ip "grep(EXPR,LIST)" 8 4
2525 Evaluates EXPR for each element of LIST (locally setting $_ to each element)
2526 and returns the array value consisting of those elements for which the
2527 expression evaluated to true.
2528 In a scalar context, returns the number of times the expression was true.
2529 .nf
2530
2531         @foo = grep(!/^#/, @bar);    # weed out comments
2532
2533 .fi
2534 Note that, since $_ is a reference into the array value, it can be
2535 used to modify the elements of the array.
2536 While this is useful and supported, it can cause bizarre results if
2537 the LIST is not a named array.
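.Sp
Two short sketches, one of the scalar result and one of modifying elements
through $_ (@bar and @names are example arrays):
.nf

        $comments = grep(/^#/, @bar);   # number of comment lines
        grep(tr/a-z/A-Z/, @names);      # upper-cases each element in place

.fi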
2538 .Ip "hex(EXPR)" 8 4
2539 .Ip "hex EXPR" 8
2540 Returns the decimal value of EXPR interpreted as a hex string.
2541 (To interpret strings that might start with 0 or 0x see oct().)
2542 If EXPR is omitted, uses $_.
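.Sp
For example ($n is just an example name):
.nf

        $n = hex('ff');         # 255
        $n = hex('1F');         # 31

.fi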
2543 .Ip "index(STR,SUBSTR,POSITION)" 8 4
2544 .Ip "index(STR,SUBSTR)" 8 4
2545 Returns the position of the first occurrence of SUBSTR in STR at or after
2546 POSITION.
2547 If POSITION is omitted, starts searching from the beginning of the string.
2548 The return value is based at 0, or whatever you've
2549 set the $[ variable to.
2550 If the substring is not found, returns one less than the base, ordinarily \-1.
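.Sp
POSITION makes it easy to find every occurrence in turn.
A sketch, assuming $[ has been left at 0 and $sub is not empty
($str and $sub are example names):
.nf

        $pos = \-1;
        while (($pos = index($str, $sub, $pos + 1)) >= 0) {
                print "found at $pos\en";
        }

.fi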
2551 .Ip "int(EXPR)" 8 4
2552 .Ip "int EXPR" 8
2553 Returns the integer portion of EXPR.
2554 If EXPR is omitted, uses $_.
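.Sp
For example (a sketch; the second line is a common rounding idiom that is
only meant for non-negative values of $x):
.nf

        $whole = int(7.9);      # 7
        $near = int($x + 0.5);  # nearest integer to $x

.fi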
2555 .Ip "ioctl(FILEHANDLE,FUNCTION,SCALAR)" 8 4
2556 Implements the ioctl(2) function.
2557 You'll probably have to say
2558 .nf
2559
2560         require "ioctl.ph";     # probably /usr/local/lib/perl/ioctl.ph
2561
2562 .fi
2563 first to get the correct function definitions.
2564 If ioctl.ph doesn't exist or doesn't have the correct definitions
2565 you'll have to roll
2566 your own, based on your C header files such as <sys/ioctl.h>.
2567 (There is a perl script called h2ph that comes with the perl kit
2568 which may help you in this.)
2569 SCALAR will be read and/or written depending on the FUNCTION\*(--a pointer
2570 to the string value of SCALAR will be passed as the third argument of
2571 the actual ioctl call.
2572 (If SCALAR has no string value but does have a numeric value, that value
2573 will be passed rather than a pointer to the string value.
2574 To guarantee this to be true, add a 0 to the scalar before using it.)
2575 The pack() and unpack() functions are useful for manipulating the values
2576 of structures used by ioctl().
2577 The following example sets the erase character to DEL.
2578 .nf
2579
2580 .ne 9
2581         require 'ioctl.ph';
2582         $sgttyb_t = "ccccs";            # 4 chars and a short
2583         if (ioctl(STDIN,$TIOCGETP,$sgttyb)) {
2584                 @ary = unpack($sgttyb_t,$sgttyb);
2585                 $ary[2] = 127;
2586                 $sgttyb = pack($sgttyb_t,@ary);
2587                 ioctl(STDIN,$TIOCSETP,$sgttyb)
2588                         || die "Can't ioctl: $!";
2589         }
2590
2591 .fi
2592 The return value of ioctl (and fcntl) is as follows:
2593 .nf
2594
2595 .ne 4
2596         if OS returns:\h'|3i'perl returns:
2597           -1\h'|3i'  undefined value
2598           0\h'|3i'  string "0 but true"
2599           anything else\h'|3i'  that number
2600
2601 .fi
2602 Thus perl returns true on success and false on failure, yet you can still
2603 easily determine the actual value returned by the operating system:
2604 .nf
2605
2606         ($retval = ioctl(...)) || ($retval = -1);
2607         printf "System returned %d\en", $retval;
2608 .fi
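The "0 but true" convention can be tried without a real device. In the sketch below, fake_ioctl is a hypothetical stand-in for an ioctl call on which the operating system returned 0; it is not part of perl:

```perl
# Hypothetical stand-in: pretend the OS call succeeded and returned 0.
sub fake_ioctl { "0 but true"; }

$retval = &fake_ioctl();
$ok     = $retval ? 1 : 0;     # 1: as a truth value the string is true
$number = $retval + 0;         # 0: as a number it is zero
```

This is why the idiom above only needs to substitute -1 when the result is false.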
2609 .Ip "join(EXPR,LIST)" 8 8
2610 .Ip "join(EXPR,ARRAY)" 8
2611 Joins the separate strings of LIST or ARRAY into a single string with fields
2612 separated by the value of EXPR, and returns the string.
2613 Example:
2614 .nf
2615
2616 .ie t \{\
2617     $_ = join(\|\':\', $login,$passwd,$uid,$gid,$gcos,$home,$shell);
2618 'br\}
2619 .el \{\
2620     $_ = join(\|\':\',
2621                 $login,$passwd,$uid,$gid,$gcos,$home,$shell);
2622 'br\}
2623
2624 .fi
2625 See
2626 .IR split .
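As a sketch of how join and split undo each other, assuming no field contains the separator itself (the record below is invented for illustration):

```perl
@fields = ("guest", "*", "100", "10", "Guest User", "/home/guest", "/bin/sh");

$line  = join(':', @fields);     # "guest:*:100:10:Guest User:/home/guest:/bin/sh"
@again = split(/:/, $line);      # the same seven fields back
```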
2627 .Ip "keys(ASSOC_ARRAY)" 8 6
2628 .Ip "keys ASSOC_ARRAY" 8
2629 Returns a normal array consisting of all the keys of the named associative
2630 array.
2631 (In a scalar context, returns the number of keys.)
2632 The keys are returned in an apparently random order, but it is the same order
2633 as either the values() or each() function produces (given that the associative array
2634 has not been modified).
2635 Here is yet another way to print your environment:
2636 .nf
2637
2638 .ne 5
2639         @keys = keys %ENV;
2640         @values = values %ENV;
2641         while ($#keys >= 0) {
2642                 print pop(@keys), \'=\', pop(@values), "\en";
2643         }
2644
2645 or how about sorted by key:
2646
2647 .ne 3
2648         foreach $key (sort(keys %ENV)) {
2649                 print $key, \'=\', $ENV{$key}, "\en";
2650         }
2651
2652 .fi
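The promise that keys() and values() agree on order can be checked directly. This is only a sketch; %age and the other names are invented for illustration:

```perl
%age = ("tom", 30, "dick", 25, "harry", 41);

$count = keys(%age);            # scalar context: 3
@k = keys(%age);
@v = values(%age);

# Since %age is not modified in between, $v[$i] belongs to $k[$i].
$same = 1;
for ($i = 0; $i <= $#k; $i++) {
    $same = 0 unless $age{$k[$i]} == $v[$i];
}
```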
2653 .Ip "kill(LIST)" 8 8
2654 .Ip "kill LIST" 8 2
2655 Sends a signal to a list of processes.
2656 The first element of the list must be the signal to send.
2657 Returns the number of processes successfully signaled.
2658 .nf
2659
2660         $cnt = kill 1, $child1, $child2;
2661         kill 9, @goners;
2662
2663 .fi
2664 If the signal is negative, kills process groups instead of processes.
2665 (On System V, a negative \fIprocess\fR number will also kill process groups,
2666 but that's not portable.)
2667 You may use a signal name in quotes.
2668 .Ip "last LABEL" 8 8
2669 .Ip "last" 8
2670 The
2671 .I last
2672 command is like the
2673 .I break
2674 statement in C (as used in loops); it immediately exits the loop in question.
2675 If the LABEL is omitted, the command refers to the innermost enclosing loop.
2676 The
2677 .I continue
2678 block, if any, is not executed:
2679 .nf
2680
2681 .ne 4
2682         line: while (<STDIN>) {
2683                 last line if /\|^$/;    # exit when done with header
2684                 .\|.\|.
2685         }
2686
2687 .fi
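The LABEL matters most in nested loops, where a bare last would only leave the inner one. A sketch with made-up data:

```perl
@rows  = ("1 2 3", "4 -5 6", "7 8 9");
$found = '';
$seen  = 0;

row: foreach $row (@rows) {
    foreach $n (split(' ', $row)) {
        $seen++;
        if ($n < 0) {
            $found = $n;
            last row;           # leaves both loops at once
        }
    }
}
# $found is -5 and $seen is 5; the third row is never examined
```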
2688 .Ip "length(EXPR)" 8 4
2689 .Ip "length EXPR" 8
2690 Returns the length in characters of the value of EXPR.
2691 If EXPR is omitted, returns length of $_.
2692 .Ip "link(OLDFILE,NEWFILE)" 8 2
2693 Creates a new filename linked to the old filename.
2694 Returns 1 for success, 0 otherwise.
2695 .Ip "listen(SOCKET,QUEUESIZE)" 8 2
2696 Does the same thing that the listen system call does.
2697 Returns true if it succeeded, false otherwise.
2698 See example in section on Interprocess Communication.
2699 .Ip "local(LIST)" 8 4
2700 Declares the listed variables to be local to the enclosing block,
2701 subroutine, eval or \*(L"do\*(R".
2702 All the listed elements must be legal lvalues.
2703 This operator works by saving the current values of those variables in LIST
2704 on a hidden stack and restoring them upon exiting the block, subroutine or eval.
2705 This means that called subroutines can also reference the local variable,
2706 but not the global one.
2707 The LIST may be assigned to if desired, which allows you to initialize
2708 your local variables.
2709 (If no initializer is given for a particular variable, it is created with
2710 an undefined value.)
2711 Commonly this is used to name the parameters to a subroutine.
2712 Examples:
2713 .nf
2714
2715 .ne 13
2716         sub RANGEVAL {
2717                 local($min, $max, $thunk) = @_;
2718                 local($result) = \'\';
2719                 local($i);
2720
2721                 # Presumably $thunk makes reference to $i
2722
2723                 for ($i = $min; $i < $max; $i++) {
2724                         $result .= eval $thunk;
2725                 }
2726
2727                 $result;
2728         }
2729
2730 .ne 6
2731         if ($sw eq \'-v\') {
2732             # init local array with global array
2733             local(@ARGV) = @ARGV;
2734             unshift(@ARGV,\'echo\');
2735             system @ARGV;
2736         }
2737         # @ARGV restored
2738
2739 .ne 6
2740         # temporarily add to digits associative array
2741         if ($base12) {
2742                 # (NOTE: not claiming this is efficient!)
2743                 local(%digits) = (%digits,'t',10,'e',11);
2744                 do parse_num();
2745         }
2746
2747 .fi
2748 Note that local() is a run-time command, and so gets executed every time
2749 through a loop, using up more stack storage each time until it's all
2750 released at once when the loop is exited.
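The point about called subroutines seeing the local value can be shown in a few lines. The subroutine and variable names here are invented for illustration:

```perl
$level = "global";

sub show  { $level; }            # reads whichever $level is current
sub outer {
    local($level) = "local";     # old value saved on the hidden stack
    &show();                     # show() sees "local"
}

$inside = &outer();              # "local"
$after  = &show();               # "global": restored when outer() returned
```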
2751 .Ip "localtime(EXPR)" 8 4
2752 .Ip "localtime EXPR" 8
2753 Converts a time as returned by the time function to a 9-element array with
2754 the time analyzed for the local timezone.
2755 Typically used as follows:
2756 .nf
2757
2758 .ne 3
2759 .ie t \{\
2760     ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
2761 'br\}
2762 .el \{\
2763     ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
2764                                                 localtime(time);
2765 'br\}
2766
2767 .fi
2768 All array elements are numeric, and come straight out of a struct tm.
2769 In particular this means that $mon has the range 0.\|.11 and $wday has the
2770 range 0.\|.6.
2771 If EXPR is omitted, does localtime(time).
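Because the fields come straight from a struct tm, $mon needs 1 added and $year needs 1900 added before being shown to humans. A sketch of a simple timestamp (the @days table and the output layout are my own choices, not anything perl prescribes):

```perl
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);

@days  = ('Sun','Mon','Tue','Wed','Thu','Fri','Sat');   # $wday 0 is Sunday
$stamp = sprintf("%s %04d-%02d-%02d %02d:%02d:%02d",
    $days[$wday], $year + 1900, $mon + 1, $mday, $hour, $min, $sec);
```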
2772 .Ip "log(EXPR)" 8 4
2773 .Ip "log EXPR" 8
2774 Returns logarithm (base
2775 .IR e )
2776 of EXPR.
2777 If EXPR is omitted, returns log of $_.
2778 .Ip "lstat(FILEHANDLE)" 8 6
2779 .Ip "lstat FILEHANDLE" 8
2780 .Ip "lstat(EXPR)" 8
2781 .Ip "lstat SCALARVARIABLE" 8
2782 Does the same thing as the stat() function, but stats a symbolic link
2783 instead of the file the symbolic link points to.
2784 If symbolic links are unimplemented on your system, a normal stat is done.
2785 .Ip "m/PATTERN/gio" 8 4
2786 .Ip "/PATTERN/gio" 8
2787 Searches a string for a pattern match, and returns true (1) or false (\'\').
2788 If no string is specified via the =~ or !~ operator,
2789 the $_ string is searched.
2790 (The string specified with =~ need not be an lvalue\*(--it may be the result of an expression evaluation, but remember the =~ binds rather tightly.)
2791 See also the section on regular expressions.
2792 .Sp
2793 If / is the delimiter then the initial \*(L'm\*(R' is optional.
2794 With the \*(L'm\*(R' you can use any pair of non-alphanumeric characters
2795 as delimiters.
2796 This is particularly useful for matching Unix path names that contain \*(L'/\*(R'.
2797 If the final delimiter is followed by the optional letter \*(L'i\*(R', the matching is
2798 done in a case-insensitive manner.
2799 PATTERN may contain references to scalar variables, which will be interpolated
2800 (and the pattern recompiled) every time the pattern search is evaluated.
2801 (Note that $) and $| may not be interpolated because they look like end-of-string tests.)
2802 If you want such a pattern to be compiled only once, add an \*(L"o\*(R" after
2803 the trailing delimiter.
2804 This avoids expensive run-time recompilations, and
2805 is useful when the value you are interpolating won't change over the
2806 life of the script.
2807 If the PATTERN evaluates to a null string, the most recent successful
2808 regular expression is used instead.
2809 .Sp
2810 If used in a context that requires an array value, a pattern match returns an
2811 array consisting of the subexpressions matched by the parentheses in the
2812 pattern,
2813 i.e. ($1, $2, $3.\|.\|.).
2814 It does NOT actually set $1, $2, etc. in this case, nor does it set $+, $`, $&
2815 or $'.
2816 If the match fails, a null array is returned.
2817 If the match succeeds, but there were no parentheses, an array value of (1)
2818 is returned.
2819 .Sp
2820 Examples:
2821 .nf
2822
2823 .ne 4
2824     open(tty, \'/dev/tty\');
2825     <tty> \|=~ \|/\|^y\|/i \|&& \|do foo(\|);   # do foo if desired
2826
2827     if (/Version: \|*\|([0\-9.]*\|)\|/\|) { $version = $1; }
2828
2829     next if m#^/usr/spool/uucp#;
2830
2831 .ne 5
2832     # poor man's grep
2833     $arg = shift;
2834     while (<>) {
2835             print if /$arg/o;   # compile only once
2836     }
2837
2838     if (($F1, $F2, $Etc) = ($foo =~ /^(\eS+)\es+(\eS+)\es*(.*)/))
2839
2840 .fi
2841 This last example splits $foo into the first two words and the remainder
2842 of the line, and assigns those three fields to $F1, $F2 and $Etc.
2843 The conditional is true if any variables were assigned, i.e. if the pattern
2844 matched.
2845 .Sp
2846 The \*(L"g\*(R" modifier specifies global pattern matching\*(--that is,
2847 matching as many times as possible within the string.  How it behaves
2848 depends on the context.  In an array context, it returns a list of
2849 all the substrings matched by all the parentheses in the regular expression.
2850 If there are no parentheses, it returns a list of all the matched strings,
2851 as if there were parentheses around the whole pattern.  In a scalar context,
2852 it iterates through the string, returning TRUE each time it matches, and
2853 FALSE when it eventually runs out of matches.  (In other words, it remembers
2854 where it left off last time and restarts the search at that point.)  It
2855 presumes that you have not modified the string since the last match.
2856 Modifying the string between matches may result in undefined behavior.
2857 (You can actually get away with in-place modifications via substr()
2858 that do not change the length of the entire string.  In general, however,
2859 you should be using s///g for such modifications.)  Examples:
2860 .nf
2861
2862         # array context
2863         ($one,$five,$fifteen) = (\`uptime\` =~ /(\ed+\e.\ed+)/g);
2864
2865         # scalar context
2866         $/ = ""; $* = 1;
2867         while ($paragraph = <>) {
2868             while ($paragraph =~ /[a-z][\'")]*[.!?]+[\'")]*\es/g) {
2869                 $sentences++;
2870             }
2871         }
2872         print "$sentences\en";
2873
2874 .fi
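To put the three calling styles side by side, here is a sketch run against one made-up string:

```perl
$line = "alice=3, bob=17, carol=42";

# Array context, no g: the parenthesized pieces of the first match.
($name, $num) = ($line =~ /(\w+)=(\d+)/);      # ("alice", 3)

# Array context with g: the pieces of every match, in order.
@all = ($line =~ /(\w+)=(\d+)/g);   # ("alice",3,"bob",17,"carol",42)

# Scalar context with g: one match per trip through the loop.
$total = 0;
while ($line =~ /=(\d+)/g) {
    $total += $1;
}
# $total is 62
```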
2875 .Ip "mkdir(FILENAME,MODE)" 8 3
2876 Creates the directory specified by FILENAME, with permissions specified by
2877 MODE (as modified by umask).
2878 If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno).
2879 .Ip "msgctl(ID,CMD,ARG)" 8 4
2880 Calls the System V IPC function msgctl.  If CMD is &IPC_STAT, then ARG
2881 must be a variable which will hold the returned msqid_ds structure.
2882 Returns like ioctl: the undefined value for error, "0 but true" for
2883 zero, or the actual return value otherwise.
2884 .Ip "msgget(KEY,FLAGS)" 8 4
2885 Calls the System V IPC function msgget.  Returns the message queue id,
2886 or the undefined value if there is an error.
2887 .Ip "msgsnd(ID,MSG,FLAGS)" 8 4
2888 Calls the System V IPC function msgsnd to send the message MSG to the
2889 message queue ID.  MSG must begin with the long integer message type,
2890 which may be created with pack("L", $type).  Returns true if
2891 successful, or false if there is an error.
2892 .Ip "msgrcv(ID,VAR,SIZE,TYPE,FLAGS)" 8 4
2893 Calls the System V IPC function msgrcv to receive a message from
2894 message queue ID into variable VAR with a maximum message size of
2895 SIZE.  Note that if a message is received, the message type will be
2896 the first thing in VAR, and the maximum length of VAR is SIZE plus the
2897 size of the message type.  Returns true if successful, or false if
2898 there is an error.
2899 .Ip "next LABEL" 8 8
2900 .Ip "next" 8
2901 The
2902 .I next
2903 command is like the
2904 .I continue
2905 statement in C; it starts the next iteration of the loop:
2906 .nf
2907
2908 .ne 4
2909         line: while (<STDIN>) {
2910                 next line if /\|^#/;    # discard comments
2911                 .\|.\|.
2912         }
2913
2914 .fi
2915 Note that if there were a
2916 .I continue
2917 block on the above, it would get executed even on discarded lines.
2918 If the LABEL is omitted, the command refers to the innermost enclosing loop.
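A sketch of the interaction with a continue block, using an in-memory list in place of STDIN (the counters are illustrative):

```perl
@lines = ("# comment", "one", "# another", "two");
$seen = 0;
$kept = 0;

line: while (defined($_ = shift(@lines))) {
    next line if /^#/;      # discard comments
    $kept++;
}
continue {
    $seen++;                # still runs for the discarded lines
}
# $kept is 2, $seen is 4
```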
2919 .Ip "oct(EXPR)" 8 4
2920 .Ip "oct EXPR" 8
2921 Returns the decimal value of EXPR interpreted as an octal string.
2922 (If EXPR happens to start off with 0x, interprets it as a hex string instead.)
2923 The following will handle decimal, octal and hex in the standard notation:
2924 .nf
2925
2926         $val = oct($val) if $val =~ /^0/;
2927
2928 .fi
2929 If EXPR is omitted, uses $_.
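Wrapped in a subroutine, the idiom above might look like the following sketch. The name tonum is hypothetical:

```perl
sub tonum {
    local($val) = @_;
    $val = oct($val) if $val =~ /^0/;   # leading 0: octal, or hex if 0x
    $val + 0;                           # otherwise plain decimal
}

@nums = (&tonum("42"), &tonum("052"), &tonum("0x2a"));   # (42, 42, 42)
```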
2930 .Ip "open(FILEHANDLE,EXPR)" 8 8
2931 .Ip "open(FILEHANDLE)" 8
2932 .Ip "open FILEHANDLE" 8
2933 Opens the file whose filename is given by EXPR, and associates it with
2934 FILEHANDLE.
2935 If FILEHANDLE is an expression, its value is used as the name of the
2936 real filehandle wanted.
2937 If EXPR is omitted, the scalar variable of the same name as the FILEHANDLE
2938 contains the filename.
2939 If the filename begins with \*(L"<\*(R" or nothing, the file is opened for
2940 input.
2941 If the filename begins with \*(L">\*(R", the file is opened for output.
2942 If the filename begins with \*(L">>\*(R", the file is opened for appending.
2943 (You can put a \'+\' in front of the \'>\' or \'<\' to indicate that you
2944 want both read and write access to the file.)
2945 If the filename begins with \*(L"|\*(R", the filename is interpreted
2946 as a command to which output is to be piped, and if the filename ends
2947 with a \*(L"|\*(R", the filename is interpreted as a command which pipes
2948 input to us.
2949 (You may not have a command that pipes both in and out.)
2950 Opening \'\-\' opens
2951 .I STDIN
2952 and opening \'>\-\' opens
2953 .IR STDOUT .
2954 Open returns non-zero upon success, the undefined value otherwise.
2955 If the open involved a pipe, the return value happens to be the pid
2956 of the subprocess.
2957 Examples:
2958 .nf
2959
2960 .ne 3
2961         $article = 100;
2962         open article || die "Can't find article $article: $!\en";
2963         while (<article>) {\|.\|.\|.
2964
2965 .ie t \{\
2966         open(LOG, \'>>/usr/spool/news/twitlog\'\|);     # (log is reserved)
2967 'br\}
2968 .el \{\
2969         open(LOG, \'>>/usr/spool/news/twitlog\'\|);
2970                                         # (log is reserved)
2971 'br\}
2972
2973 .ie t \{\
2974         open(article, "caesar <$article |"\|);          # decrypt article
2975 'br\}
2976 .el \{\
2977         open(article, "caesar <$article |"\|);
2978                                         # decrypt article
2979 'br\}
2980
2981 .ie t \{\
2982         open(extract, "|sort >/tmp/Tmp$$"\|);           # $$ is our process#
2983 'br\}
2984 .el \{\
2985         open(extract, "|sort >/tmp/Tmp$$"\|);
2986                                         # $$ is our process#
2987 'br\}
2988
2989 .ne 7
2990         # process argument list of files along with any includes
2991
2992         foreach $file (@ARGV) {
2993                 do process($file, \'fh00\');    # no pun intended
2994         }
2995
2996         sub process {
2997                 local($filename, $input) = @_;
2998                 $input++;               # this is a string increment
2999                 unless (open($input, $filename)) {
3000                         print STDERR "Can't open $filename: $!\en";
3001                         return;
3002                 }
3003 .ie t \{\
3004                 while (<$input>) {              # note the use of indirection
3005 'br\}
3006 .el \{\
3007                 while (<$input>) {              # note use of indirection
3008 'br\}
3009                         if (/^#include "(.*)"/) {
3010                                 do process($1, $input);
3011                                 next;
3012                         }
3013                         .\|.\|.         # whatever
3014                 }
3015         }
3016
3017 .fi
3018 You may also, in the Bourne shell tradition, specify an EXPR beginning
3019 with \*(L">&\*(R", in which case the rest of the string
3020 is interpreted as the name of a filehandle
3021 (or file descriptor, if numeric) which is to be duped and opened.
3022 You may use & after >, >>, <, +>, +>> and +<.
3023 The mode you specify should match the mode of the original filehandle.
3024 Here is a script that saves, redirects, and restores
3025 .I STDOUT
3026 and
3027 .IR STDERR :
3028 .nf
3029
3030 .ne 21
3031         #!/usr/bin/perl
3032         open(SAVEOUT, ">&STDOUT");
3033         open(SAVEERR, ">&STDERR");
3034
3035         open(STDOUT, ">foo.out") || die "Can't redirect stdout";
3036         open(STDERR, ">&STDOUT") || die "Can't dup stdout";
3037
3038         select(STDERR); $| = 1;         # make unbuffered
3039         select(STDOUT); $| = 1;         # make unbuffered
3040
3041         print STDOUT "stdout 1\en";     # this works for
3042         print STDERR "stderr 1\en";     # subprocesses too
3043
3044         close(STDOUT);
3045         close(STDERR);
3046
3047         open(STDOUT, ">&SAVEOUT");
3048         open(STDERR, ">&SAVEERR");
3049
3050         print STDOUT "stdout 2\en";
3051         print STDERR "stderr 2\en";
3052
3053 .fi
3054 If you open a pipe on the command \*(L"\-\*(R", i.e. either \*(L"|\-\*(R" or \*(L"\-|\*(R",
3055 then there is an implicit fork done, and the return value of open
3056 is the pid of the child within the parent process, and 0 within the child
3057 process.
3058 (Use defined($pid) to determine if the open was successful.)
3059 The filehandle behaves normally for the parent, but i/o to that
3060 filehandle is piped from/to the
3061 .IR STDOUT / STDIN
3062 of the child process.
3063 In the child process the filehandle isn't opened\*(--i/o happens from/to
3064 the new
3065 .I STDOUT
3066 or
3067 .IR STDIN .
3068 Typically this is used like the normal piped open when you want to exercise
3069 more control over just how the pipe command gets executed, such as when
3070 you are running setuid, and don't want to have to scan shell commands
3071 for metacharacters.
3072 The following pairs are more or less equivalent:
3073 .nf
3074
3075 .ne 5
3076         open(FOO, "|tr \'[a\-z]\' \'[A\-Z]\'");
3077         open(FOO, "|\-") || exec \'tr\', \'[a\-z]\', \'[A\-Z]\';
3078
3079         open(FOO, "cat \-n '$file'|");
3080         open(FOO, "\-|") || exec \'cat\', \'\-n\', $file;
3081
3082 .fi
3083 Explicitly closing any piped filehandle causes the parent process to wait for the
3084 child to finish, and returns the status value in $?.
3085 Note: on any operation which may do a fork,
3086 unflushed buffers remain unflushed in both
3087 processes, which means you may need to set $| to
3088 avoid duplicate output.
3089 .Sp
3090 The filename that is passed to open will have leading and trailing
3091 whitespace deleted.
3092 In order to open a file with arbitrary weird characters in it, it's necessary
3093 to protect any leading and trailing whitespace thusly:
3094 .nf
3095
3096 .ne 2
3097         $file =~ s#^(\es)#./$1#;
3098         open(FOO, "< $file\e0");
3099
3100 .fi
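The implicit fork described earlier for "\-|" can be sketched as follows, assuming a system with fork. The handle name KID and the message are made up; note the defined() test, since 0 is a legitimate return value in the child:

```perl
$pid = open(KID, "-|");
die "Can't fork: $!" unless defined($pid);

if ($pid == 0) {
    # Child: KID is not open here; our STDOUT feeds the parent's KID.
    print "hello from child\n";
    exit 0;
}

# Parent: read what the child wrote.
@got = <KID>;
close(KID);                 # waits for the child; its status is in $?
```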
3101 .Ip "opendir(DIRHANDLE,EXPR)" 8 3
3102 Opens a directory named EXPR for processing by readdir(), telldir(), seekdir(),
3103 rewinddir() and closedir().
3104 Returns true if successful.
3105 DIRHANDLEs have their own namespace separate from FILEHANDLEs.
3106 .Ip "ord(EXPR)" 8 4
3107 .Ip "ord EXPR" 8
3108 Returns the numeric ascii value of the first character of EXPR.
3109 If EXPR is omitted, uses $_.
3110 ''' Comments on f & d by gnb@melba.bby.oz.au    22/11/89
3111 .Ip "pack(TEMPLATE,LIST)" 8 4
3112 Takes an array or list of values and packs it into a binary structure,
3113 returning the string containing the structure.
3114 The TEMPLATE is a sequence of characters that give the order and type
3115 of values, as follows:
3116 .nf
3117
3118         A       An ascii string, will be space padded.
3119         a       An ascii string, will be null padded.
3120         c       A signed char value.
3121         C       An unsigned char value.
3122         s       A signed short value.
3123         S       An unsigned short value.
3124         i       A signed integer value.
3125         I       An unsigned integer value.
3126         l       A signed long value.
3127         L       An unsigned long value.
3128         n       A short in \*(L"network\*(R" order.
3129         N       A long in \*(L"network\*(R" order.
3130         f       A single-precision float in the native format.
3131         d       A double-precision float in the native format.
3132         p       A pointer to a string.
3133         v       A short in \*(L"VAX\*(R" (little-endian) order.
3134         V       A long in \*(L"VAX\*(R" (little-endian) order.
3135         x       A null byte.
3136         X       Back up a byte.
3137         @       Null fill to absolute position.
3138         u       A uuencoded string.
3139         b       A bit string (ascending bit order, like vec()).
3140         B       A bit string (descending bit order).
3141         h       A hex string (low nybble first).
3142         H       A hex string (high nybble first).
3143
3144 .fi
3145 Each letter may optionally be followed by a number which gives a repeat
3146 count.
3147 With all types except "a", "A", "b", "B", "h" and "H",
3148 the pack function will gobble up that many values
3149 from the LIST.
3150 A * for the repeat count means to use however many items are left.
3151 The "a" and "A" types gobble just one value, but pack it as a string of length
3152 count,
3153 padding with nulls or spaces as necessary.
3154 (When unpacking, "A" strips trailing spaces and nulls, but "a" does not.)
3155 Likewise, the "b" and "B" fields pack a string that many bits long.
3156 The "h" and "H" fields pack a string that many nybbles long.
3157 Real numbers (floats and doubles) are in the native machine format
3158 only; due to the multiplicity of floating formats around, and the lack
3159 of a standard \*(L"network\*(R" representation, no facility for
3160 interchange has been made.
3161 This means that packed floating point data
3162 written on one machine may not be readable on another - even if both
3163 use IEEE floating point arithmetic (as the endian-ness of the memory
3164 representation is not part of the IEEE spec).
3165 Note that perl uses
3166 doubles internally for all numeric calculation, and converting from
3167 double -> float -> double will lose precision (i.e. unpack("f",
3168 pack("f", $foo)) will not in general equal $foo).
3169 .br
3170 Examples:
3171 .nf
3172
3173         $foo = pack("cccc",65,66,67,68);
3174         # foo eq "ABCD"
3175         $foo = pack("c4",65,66,67,68);
3176         # same thing
3177
3178         $foo = pack("ccxxcc",65,66,67,68);
3179         # foo eq "AB\e0\e0CD"
3180
3181         $foo = pack("s2",1,2);
3182         # "\e1\e0\e2\e0" on little-endian
3183         # "\e0\e1\e0\e2" on big-endian
3184
3185         $foo = pack("a4","abcd","x","y","z");
3186         # "abcd"
3187
3188         $foo = pack("aaaa","abcd","x","y","z");
3189         # "axyz"
3190
3191         $foo = pack("a14","abcdefg");
3192         # "abcdefg\e0\e0\e0\e0\e0\e0\e0"
3193
3194         $foo = pack("i9pl", gmtime);
3195         # a real struct tm (on my system anyway)
3196
3197         sub bintodec {
3198             unpack("N", pack("B32", substr("0" x 32 . shift, -32)));
3199         }
3200 .fi
3201 The same template may generally also be used in the unpack function.
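For instance, a fixed-layout record can be built and taken apart with one template. This is only a sketch; the record layout and field names are invented:

```perl
$template = "A8nC";         # 8-byte padded name, network short, unsigned char

$rec = pack($template, "perl", 4036, 1);
$len = length($rec);        # 11: 8 + 2 + 1 bytes

($name, $port, $flag) = unpack($template, $rec);
# ("perl", 4036, 1): unpacking "A" stripped the trailing spaces
```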
3202 .Ip "pipe(READHANDLE,WRITEHANDLE)" 8 3
3203 Opens a pair of connected pipes like the corresponding system call.
3204 Note that if you set up a loop of piped processes, deadlock can occur
3205 unless you are very careful.
3206 In addition, note that perl's pipes use stdio buffering, so you may need
3207 to set $| to flush your WRITEHANDLE after each command, depending on
3208 the application.
3209 [Requires version 3.0 patchlevel 9.]
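A minimal one-way sketch, assuming a system with fork (the handle names and message are illustrative). Each side closes the end it does not use, and the writer sets $| as suggested above:

```perl
pipe(READER, WRITER) || die "Can't pipe: $!";

$pid = fork;
die "Can't fork: $!" unless defined($pid);

if ($pid == 0) {
    close(READER);
    select(WRITER); $| = 1;     # flush after each print
    print WRITER "ping\n";
    exit 0;
}

close(WRITER);                  # otherwise READER would never see EOF
$msg = <READER>;
close(READER);
waitpid($pid, 0);
```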
3210 .Ip "pop(ARRAY)" 8
3211 .Ip "pop ARRAY" 8 6
3212 Pops and returns the last value of the array, shortening the array by 1.
3213 Has the same effect as
3214 .nf
3215
3216         $tmp = $ARRAY[$#ARRAY\-\|\-];
3217
3218 .fi
3219 If there are no elements in the array, returns the undefined value.
3220 .Ip "print(FILEHANDLE LIST)" 8 10
3221 .Ip "print(LIST)" 8
3222 .Ip "print FILEHANDLE LIST" 8
3223 .Ip "print LIST" 8
3224 .Ip "print" 8
3225 Prints a string or a comma-separated list of strings.
3226 Returns non-zero if successful.
3227 FILEHANDLE may be a scalar variable name, in which case the variable contains
3228 the name of the filehandle, thus introducing one level of indirection.
3229 (NOTE: If FILEHANDLE is a variable and the next token is a term, it may be
3230 misinterpreted as an operator unless you interpose a + or put parens around
3231 the arguments.)
3232 If FILEHANDLE is omitted, prints by default to standard output (or to the
3233 last selected output channel\*(--see select()).
3234 If LIST is also omitted, prints $_ to
3235 .IR STDOUT .
3236 To set the default output channel to something other than
3237 .I STDOUT
3238 use the select operation.
3239 Note that, because print takes a LIST, anything in the LIST is evaluated
3240 in an array context, and any subroutine that you call will have one or more
3241 of its expressions evaluated in an array context.
3242 Also be careful not to follow the print keyword with a left parenthesis
3243 unless you want the corresponding right parenthesis to terminate the
3244 arguments to the print\*(--interpose a + or put parens around all the arguments.
3245 .Ip "printf(FILEHANDLE LIST)" 8 10
3246 .Ip "printf(LIST)" 8
3247 .Ip "printf FILEHANDLE LIST" 8
3248 .Ip "printf LIST" 8
3249 Equivalent to \*(L"print FILEHANDLE sprintf(LIST)\*(R".
3250 .Ip "push(ARRAY,LIST)" 8 7
3251 Treats ARRAY (@ is optional) as a stack, and pushes the values of LIST
3252 onto the end of ARRAY.
3253 The length of ARRAY increases by the length of LIST.
3254 Has the same effect as
3255 .nf
3256
3257     for $value (LIST) {
3258             $ARRAY[++$#ARRAY] = $value;
3259     }
3260
3261 .fi
3262 but is more efficient.  Returns the new number of elements in the array.
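A sketch of push and pop working together as a stack (names are illustrative):

```perl
@stack = ();
$n    = push(@stack, "a", "b", "c");    # 3: the new number of elements
$top  = pop(@stack);                    # "c"
$left = $#stack + 1;                    # 2

@empty = ();
$none  = pop(@empty);                   # the undefined value
```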
3263 .Ip "q/STRING/" 8 5
3264 .Ip "qq/STRING/" 8
3265 .Ip "qx/STRING/" 8
3266 These are not really functions, but simply syntactic sugar to let you
3267 avoid putting too many backslashes into quoted strings.
3268 The q operator is a generalized single quote, and the qq operator a
3269 generalized double quote.
3270 The qx operator is a generalized backquote.
3271 Any non-alphanumeric delimiter can be used in place of /, including newline.
3272 If the delimiter is an opening bracket or parenthesis, the final delimiter
3273 will be the corresponding closing bracket or parenthesis.
3274 (Embedded occurrences of the closing bracket need to be backslashed as usual.)
3275 Examples:
3276 .nf
3277
3278 .ne 5
3279         $foo = q!I said, "You said, \'She said it.\'"!;
3280         $bar = q(\'This is it.\');
3281         $today = qx{ date };
3282         $_ .= qq
3283 *** The previous line contains the naughty word "$&".\en
3284                 if /(ibm|apple|awk)/;      # :-)
3285
3286 .fi
3287 .Ip "rand(EXPR)" 8 8
3288 .Ip "rand EXPR" 8
3289 .Ip "rand" 8
3290 Returns a random fractional number between 0 and the value of EXPR.
3291 (EXPR should be positive.)
3292 If EXPR is omitted, returns a value between 0 and 1.
3293 See also srand().
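Combined with int(), rand gives random integers and random array elements. The sketch below assumes that rand never returns the value of EXPR itself (I believe this holds, but check on your system); @colors is made up:

```perl
srand(time);
@colors = ("red", "green", "blue");

$roll = int(rand(6)) + 1;                       # a die roll, 1 through 6
$pick = $colors[int(rand($#colors + 1))];       # some element of @colors
```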
3294 .Ip "read(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
3295 .Ip "read(FILEHANDLE,SCALAR,LENGTH)" 8 5
3296 Attempts to read LENGTH bytes of data into variable SCALAR from the specified
3297 FILEHANDLE.
3298 Returns the number of bytes actually read, or undef if there was an error.
3299 SCALAR will be grown or shrunk to the length actually read.
3300 An OFFSET may be specified to place the read data at some other place
3301 than the beginning of the string.
3302 This call is actually implemented in terms of stdio's fread call.  To get
3303 a true read system call, see sysread.
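The OFFSET form is handy for accumulating a buffer in pieces. A sketch using a scratch file (the path under /tmp is an assumption; adjust for your system):

```perl
$file = "/tmp/readdemo$$";
open(OUT, ">$file") || die "Can't write $file: $!";
print OUT "0123456789";
close(OUT);

open(IN, $file) || die "Can't read $file: $!";
$buf = "";
$n1 = read(IN, $buf, 4);        # 4 bytes; $buf is "0123"
$n2 = read(IN, $buf, 4, 4);     # 4 more placed at offset 4: "01234567"
$n3 = read(IN, $buf, 100, 8);   # only 2 bytes remain: "0123456789"
close(IN);
unlink($file);
```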
3304 .Ip "readdir(DIRHANDLE)" 8 3
3305 .Ip "readdir DIRHANDLE" 8
3306 Returns the next directory entry for a directory opened by opendir().
3307 If used in an array context, returns all the rest of the entries in the
3308 directory.
3309 If there are no more entries, returns an undefined value in a scalar context
3310 or a null list in an array context.
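A sketch that lists the current directory without the "." and ".." entries, then shows the scalar-context result once the entries are used up (DIR and the variable names are illustrative):

```perl
opendir(DIR, '.') || die "Can't opendir .: $!";
@entries = grep(!/^\.\.?$/, readdir(DIR));  # array context: all the rest
$next = readdir(DIR);                       # nothing left: undefined
closedir(DIR);
```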
3311 .Ip "readlink(EXPR)" 8 6
3312 .Ip "readlink EXPR" 8
3313 Returns the value of a symbolic link, if symbolic links are implemented.
3314 If not, gives a fatal error.
3315 If there is some system error, returns the undefined value and sets $! (errno).
3316 If EXPR is omitted, uses $_.
3317 .Ip "recv(SOCKET,SCALAR,LEN,FLAGS)" 8 4
3318 Receives a message on a socket.
3319 Attempts to receive LEN bytes of data into variable SCALAR from the specified
3320 SOCKET filehandle.
3321 Returns the address of the sender, or the undefined value if there's an error.
3322 SCALAR will be grown or shrunk to the length actually read.
3323 Takes the same flags as the system call of the same name.
3324 .Ip "redo LABEL" 8 8
3325 .Ip "redo" 8
3326 The
3327 .I redo
3328 command restarts the loop block without evaluating the conditional again.
3329 The
3330 .I continue
3331 block, if any, is not executed.
3332 If the LABEL is omitted, the command refers to the innermost enclosing loop.
3333 This command is normally used by programs that want to lie to themselves
3334 about what was just input:
3335 .nf
3336
3337 .ne 16
3338         # a simpleminded Pascal comment stripper
3339         # (warning: assumes no { or } in strings)
3340         line: while (<STDIN>) {
3341                 while (s|\|({.*}.*\|){.*}|$1 \||) {}
3342                 s|{.*}| \||;
3343                 if (s|{.*| \||) {
3344                         $front = $_;
3345                         while (<STDIN>) {
3346                                 if (\|/\|}/\|) {        # end of comment?
3347                                         s|^|$front{|;
3348                                         redo line;
3349                                 }
3350                         }
3351                 }
3352                 print;
3353         }
3354
3355 .fi
3356 .Ip "rename(OLDNAME,NEWNAME)" 8 2
3357 Changes the name of a file.
3358 Returns 1 for success, 0 otherwise.
3359 Will not work across filesystem boundaries.
3360 .Ip "require(EXPR)" 8 6
3361 .Ip "require EXPR" 8
3362 .Ip "require" 8
3363 Includes the library file specified by EXPR, or by $_ if EXPR is not supplied.
3364 Has semantics similar to the following subroutine:
3365 .nf
3366
3367         sub require {
3368             local($filename) = @_;
3369             return 1 if $INC{$filename};
3370             local($realfilename,$result);
3371             ITER: {
3372                 foreach $prefix (@INC) {
3373                     $realfilename = "$prefix/$filename";
3374                     if (-f $realfilename) {
3375                         $result = do $realfilename;
3376                         last ITER;
3377                     }
3378                 }
3379                 die "Can't find $filename in \e@INC";
3380             }
3381             die $@ if $@;
3382             die "$filename did not return true value" unless $result;
3383             $INC{$filename} = $realfilename;
3384             $result;
3385         }
3386
3387 .fi
3388 Note that the file will not be included twice under the same specified name.
3389 The file must return true as the last statement to indicate successful
3390 execution of any initialization code, so it's customary to end
3391 such a file with \*(L"1;\*(R" unless you're sure it'll return true otherwise.
3392 .Ip "reset(EXPR)" 8 6
3393 .Ip "reset EXPR" 8
3394 .Ip "reset" 8
3395 Generally used in a
3396 .I continue
3397 block at the end of a loop to clear variables and reset ?? searches
3398 so that they work again.
3399 The expression is interpreted as a list of single characters (hyphens allowed
3400 for ranges).
3401 All variables and arrays beginning with one of those letters are reset to
3402 their pristine state.
3403 If the expression is omitted, one-match searches (?pattern?) are reset to
3404 match again.
3405 Only resets variables or searches in the current package.
3406 Always returns 1.
3407 Examples:
3408 .nf
3409
3410 .ne 3
3411     reset \'X\';        \h'|2i'# reset all X variables
3412     reset \'a\-z\';\h'|2i'# reset lower case variables
3413     reset;      \h'|2i'# just reset ?? searches
3414
3415 .fi
3416 Note: resetting \*(L"A\-Z\*(R" is not recommended since you'll wipe out your ARGV and ENV
3417 arrays.
3418 .Sp
3419 The use of reset on dbm associative arrays does not change the dbm file.
3420 (It does, however, flush any entries cached by perl, which may be useful if
3421 you are sharing the dbm file.
3422 Then again, maybe not.)
3423 .Ip "return LIST" 8 3
3424 Returns from a subroutine with the value specified.
3425 (Note that a subroutine can automatically return
3426 the value of the last expression evaluated.
3427 That's the preferred method\*(--use of an explicit
3428 .I return
3429 is a bit slower.)
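.Sp
A sketch contrasting the two styles (both subroutine names are invented
for the illustration):
.nf

        sub larger {
            local($x, $y) = @_;
            $x > $y ? $x : $y;          # last expression is the return value
        }
        sub firstneg {
            local($n);
            foreach $n (@_) {
                return $n if $n < 0;    # leave the subroutine at once
            }
            0;                          # no negative number found
        }
        print &larger(3, 7), "\en";             # prints 7
        print &firstneg(4, -2, -9), "\en";      # prints -2

.fi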
3430 .Ip "reverse(LIST)" 8 4
3431 .Ip "reverse LIST" 8
3432 In an array context, returns an array value consisting of the elements
3433 of LIST in the opposite order.
3434 In a scalar context, returns a string value consisting of the bytes of
3435 the first element of LIST in the opposite order.
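.Sp
The difference between the two contexts is easy to trip over, so here is
a small illustration (the variable names are arbitrary):
.nf

        @countdown = reverse(1, 2, 3);  # array context: (3, 2, 1)
        $backwards = reverse('hello');  # scalar context: 'olleh'
        print reverse('hello'), "\en";  # print supplies an array context,
                                        # so this prints hello

.fi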
3436 .Ip "rewinddir(DIRHANDLE)" 8 5
3437 .Ip "rewinddir DIRHANDLE" 8
3438 Sets the current position to the beginning of the directory for the readdir() routine on DIRHANDLE.
3439 .Ip "rindex(STR,SUBSTR,POSITION)" 8 6
3440 .Ip "rindex(STR,SUBSTR)" 8 4
3441 Works just like index except that it
3442 returns the position of the LAST occurrence of SUBSTR in STR.
3443 If POSITION is specified, returns the last occurrence at or before that
3444 position.
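.Sp
For instance, to pick the last component out of a path name (a sketch
only; the path is made up, and $[ is assumed to be 0):
.nf

        $path = '/usr/local/bin/perl';
        $slash = rindex($path, '/');            # 14
        $base = substr($path, $slash + 1);      # 'perl'
        $prev = rindex($path, '/', $slash - 1); # 10

.fi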
3445 .Ip "rmdir(FILENAME)" 8 4
3446 .Ip "rmdir FILENAME" 8
3447 Deletes the directory specified by FILENAME if it is empty.
3448 If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno).
3449 If FILENAME is omitted, uses $_.
3450 .Ip "s/PATTERN/REPLACEMENT/gieo" 8 3
3451 Searches a string for a pattern, and if found, replaces that pattern with the
3452 replacement text and returns the number of substitutions made.
3453 Otherwise it returns false (0).
3454 The \*(L"g\*(R" is optional, and if present, indicates that all occurrences
3455 of the pattern are to be replaced.
3456 The \*(L"i\*(R" is also optional, and if present, indicates that matching
3457 is to be done in a case-insensitive manner.
3458 The \*(L"e\*(R" is likewise optional, and if present, indicates that
3459 the replacement string is to be evaluated as an expression rather than just
3460 as a double-quoted string.
3461 Any non-alphanumeric delimiter may replace the slashes;
3462 if single quotes are used, no
3463 interpretation is done on the replacement string (the e modifier overrides
3464 this, however); if backquotes are used, the replacement string is a command
3465 to execute whose output will be used as the actual replacement text.
3466 If the PATTERN is delimited by bracketing quotes, the REPLACEMENT
3467 has its own pair of quotes, which may or may not be bracketing quotes, e.g.
3468 s(foo)(bar) or s<foo>/bar/.
3469 If no string is specified via the =~ or !~ operator,
3470 the $_ string is searched and modified.
3471 (The string specified with =~ must be a scalar variable, an array element,
3472 or an assignment to one of those, i.e. an lvalue.)
3473 If the pattern contains a $ that looks like a variable rather than an
3474 end-of-string test, the variable will be interpolated into the pattern at
3475 run-time.
3476 If you only want the pattern compiled once the first time the variable is
3477 interpolated, add an \*(L"o\*(R" at the end.
3478 If the PATTERN evaluates to a null string, the most recent successful
3479 regular expression is used instead.
3480 See also the section on regular expressions.
3481 Examples:
3482 .nf
3483
3484     s/\|\e\|bgreen\e\|b/mauve/g;                # don't change wintergreen
3485
3486     $path \|=~ \|s|\|/usr/bin|\|/usr/local/bin|;
3487
3488     s/Login: $foo/Login: $bar/; # run-time pattern
3489
3490     ($foo = $bar) =~ s/bar/foo/;
3491
3492     $_ = \'abc123xyz\';
3493     s/\ed+/$&*2/e;              # yields \*(L'abc246xyz\*(R'
3494     s/\ed+/sprintf("%5d",$&)/e; # yields \*(L'abc  246xyz\*(R'
3495     s/\ew/$& x 2/eg;            # yields \*(L'aabbcc  224466xxyyzz\*(R'
3496
3497     s/\|([^ \|]*\|) *\|([^ \|]*\|)\|/\|$2 $1/;  # reverse 1st two fields
3498
3499 .fi
3500 (Note the use of $ instead of \|\e\| in the last example.  See section
3501 on regular expressions.)
3502 .Ip "scalar(EXPR)" 8 3
3503 Forces EXPR to be interpreted in a scalar context and returns the value
3504 of EXPR.
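.Sp
A typical use is to get the length of an array in a place where an
array context would otherwise be supplied.
A minimal sketch (the array contents are arbitrary):
.nf

        @list = ('a', 'b', 'c');
        print "count: ", scalar(@list), "\en";  # prints "count: 3"
        print "items: ", @list, "\en";          # prints "items: abc"

.fi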
3505 .Ip "seek(FILEHANDLE,POSITION,WHENCE)" 8 3
3506 Randomly positions the file pointer for FILEHANDLE, just like the fseek()
3507 call of stdio.
3508 FILEHANDLE may be an expression whose value gives the name of the filehandle.
3509 Returns 1 upon success, 0 otherwise.
3510 .Ip "seekdir(DIRHANDLE,POS)" 8 3
3511 Sets the current position for the readdir() routine on DIRHANDLE.
3512 POS must be a value returned by telldir().
3513 Has the same caveats about possible directory compaction as the corresponding
3514 system library routine.
3515 .Ip "select(FILEHANDLE)" 8 3
3516 .Ip "select" 8 3
3517 Returns the currently selected filehandle.
3518 Sets the current default filehandle for output, if FILEHANDLE is supplied.
3519 This has two effects: first, a
3520 .I write
3521 or a
3522 .I print
3523 without a filehandle will default to this FILEHANDLE.
3524 Second, references to variables related to output will refer to this output
3525 channel.
3526 For example, if you have to set the top of form format for more than
3527 one output channel, you might do the following:
3528 .nf
3529
3530 .ne 4
3531         select(REPORT1);
3532         $^ = \'report1_top\';
3533         select(REPORT2);
3534         $^ = \'report2_top\';
3535
3536 .fi
3537 FILEHANDLE may be an expression whose value gives the name of the actual filehandle.
3538 Thus:
3539 .nf
3540
3541         $oldfh = select(STDERR); $| = 1; select($oldfh);
3542
3543 .fi
3544 .Ip "select(RBITS,WBITS,EBITS,TIMEOUT)" 8 3
3545 This calls the select system call with the bitmasks specified, which can
3546 be constructed using fileno() and vec(), along these lines:
3547 .nf
3548
3549         $rin = $win = $ein = '';
3550         vec($rin,fileno(STDIN),1) = 1;
3551         vec($win,fileno(STDOUT),1) = 1;
3552         $ein = $rin | $win;
3553
3554 .fi
3555 If you want to select on many filehandles you might wish to write a subroutine:
3556 .nf
3557
3558         sub fhbits {
3559             local(@fhlist) = split(' ',$_[0]);
3560             local($bits);
3561             for (@fhlist) {
3562                 vec($bits,fileno($_),1) = 1;
3563             }
3564             $bits;
3565         }
3566         $rin = &fhbits('STDIN TTY SOCK');
3567
3568 .fi
3569 The usual idiom is:
3570 .nf
3571
3572         ($nfound,$timeleft) =
3573           select($rout=$rin, $wout=$win, $eout=$ein, $timeout);
3574
3575 or to block until something becomes ready:
3576
3577 .ie t \{\
3578         $nfound = select($rout=$rin, $wout=$win, $eout=$ein, undef);
3579 'br\}
3580 .el \{\
3581         $nfound = select($rout=$rin, $wout=$win,
3582                                 $eout=$ein, undef);
3583 'br\}
3584
3585 .fi
3586 Any of the bitmasks can also be undef.
3587 The timeout, if specified, is in seconds, which may be fractional.
3588 NOTE: not all implementations are capable of returning the $timeleft.
3589 If not, they always return $timeleft equal to the supplied $timeout.
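.Sp
With all three bitmasks undefined, the call amounts to a sleep of finer
granularity than sleep() offers, at least on systems whose select
accepts fractional timeouts.
A sketch:
.nf

        select(undef, undef, undef, 0.25);      # pause about a quarter second

.fi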
3590 .Ip "semctl(ID,SEMNUM,CMD,ARG)" 8 4
3591 Calls the System V IPC function semctl.  If CMD is &IPC_STAT or
3592 &GETALL, then ARG must be a variable which will hold the returned
3593 semid_ds structure or semaphore value array.  Returns like ioctl: the
3594 undefined value for error, "0 but true" for zero, or the actual return
3595 value otherwise.
3596 .Ip "semget(KEY,NSEMS,FLAGS)" 8 4
3597 Calls the System V IPC function semget.  Returns the semaphore id, or
3598 the undefined value if there is an error.
3599 .Ip "semop(KEY,OPSTRING)" 8 4
3600 Calls the System V IPC function semop to perform semaphore operations
3601 such as signaling and waiting.  OPSTRING must be a packed array of
3602 semop structures.  Each semop structure can be generated with
3603 \&'pack("sss", $semnum, $semop, $semflag)'.  The number of semaphore
3604 operations is implied by the length of OPSTRING.  Returns true if
3605 successful, or false if there is an error.  As an example, the
3606 following code waits on semaphore $semnum of semaphore id $semid:
3607 .nf
3608
3609         $semop = pack("sss", $semnum, -1, 0);
3610         die "Semaphore trouble: $!\en" unless semop($semid, $semop);
3611
3612 .fi
3613 To signal the semaphore, replace "-1" with "1".
3614 .Ip "send(SOCKET,MSG,FLAGS,TO)" 8 4
3615 .Ip "send(SOCKET,MSG,FLAGS)" 8
3616 Sends a message on a socket.
3617 Takes the same flags as the system call of the same name.
3618 On unconnected sockets you must specify a destination to send TO.
3619 Returns the number of characters sent, or the undefined value if
3620 there is an error.
3621 .Ip "setpgrp(PID,PGRP)" 8 4
3622 Sets the current process group for the specified PID, 0 for the current
3623 process.
3624 Will produce a fatal error if used on a machine that doesn't implement
3625 setpgrp(2).
3626 .Ip "setpriority(WHICH,WHO,PRIORITY)" 8 4
3627 Sets the current priority for a process, a process group, or a user.
3628 (See setpriority(2).)
3629 Will produce a fatal error if used on a machine that doesn't implement
3630 setpriority(2).
3631 .Ip "setsockopt(SOCKET,LEVEL,OPTNAME,OPTVAL)" 8 3
3632 Sets the socket option requested.
Returns the undefined value if there is an error.
3634 OPTVAL may be specified as undef if you don't want to pass an argument.
3635 .Ip "shift(ARRAY)" 8 6
3636 .Ip "shift ARRAY" 8
3637 .Ip "shift" 8
3638 Shifts the first value of the array off and returns it,
3639 shortening the array by 1 and moving everything down.
3640 If there are no elements in the array, returns the undefined value.
3641 If ARRAY is omitted, shifts the @ARGV array in the main program, and the @_
3642 array in subroutines.
3643 (This is determined lexically.)
3644 See also unshift(), push() and pop().
3645 Shift() and unshift() do the same thing to the left end of an array that push()
3646 and pop() do to the right end.
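.Sp
A common use is to pick arguments off one at a time.
In this sketch the subroutine name and its arguments are invented for
illustration:
.nf

        sub greet {
            local($greeting) = shift;   # shifts @_ inside a subroutine
            local($name);
            foreach $name (@_) {
                print "$greeting, $name\en";
            }
        }
        &greet('Hello', 'Larry', 'Randal');
        $script_arg = shift;            # shifts @ARGV in the main program

.fi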
3647 .Ip "shmctl(ID,CMD,ARG)" 8 4
3648 Calls the System V IPC function shmctl.  If CMD is &IPC_STAT, then ARG
3649 must be a variable which will hold the returned shmid_ds structure.
3650 Returns like ioctl: the undefined value for error, "0 but true" for
3651 zero, or the actual return value otherwise.
3652 .Ip "shmget(KEY,SIZE,FLAGS)" 8 4
3653 Calls the System V IPC function shmget.  Returns the shared memory
3654 segment id, or the undefined value if there is an error.
3655 .Ip "shmread(ID,VAR,POS,SIZE)" 8 4
3656 .Ip "shmwrite(ID,STRING,POS,SIZE)" 8
3657 Reads or writes the System V shared memory segment ID starting at
3658 position POS for size SIZE by attaching to it, copying in/out, and
3659 detaching from it.  When reading, VAR must be a variable which
3660 will hold the data read.  When writing, if STRING is too long,
3661 only SIZE bytes are used; if STRING is too short, nulls are
written to fill out SIZE bytes.  Returns true if successful, or
false if there is an error.
3664 .Ip "shutdown(SOCKET,HOW)" 8 3
3665 Shuts down a socket connection in the manner indicated by HOW, which has
3666 the same interpretation as in the system call of the same name.
3667 .Ip "sin(EXPR)" 8 4
3668 .Ip "sin EXPR" 8
3669 Returns the sine of EXPR (expressed in radians).
3670 If EXPR is omitted, returns sine of $_.
3671 .Ip "sleep(EXPR)" 8 6
3672 .Ip "sleep EXPR" 8
3673 .Ip "sleep" 8
3674 Causes the script to sleep for EXPR seconds, or forever if no EXPR.
3675 May be interrupted by sending the process a SIGALRM.
3676 Returns the number of seconds actually slept.
3677 You probably cannot mix alarm() and sleep() calls, since sleep() is
3678 often implemented using alarm().
3679 .Ip "socket(SOCKET,DOMAIN,TYPE,PROTOCOL)" 8 3
3680 Opens a socket of the specified kind and attaches it to filehandle SOCKET.
3681 DOMAIN, TYPE and PROTOCOL are specified the same as for the system call
3682 of the same name.
3683 You may need to run h2ph on sys/socket.h to get the proper values handy
3684 in a perl library file.
Returns true if successful.
3686 See the example in the section on Interprocess Communication.
3687 .Ip "socketpair(SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL)" 8 3
3688 Creates an unnamed pair of sockets in the specified domain, of the specified
3689 type.
3690 DOMAIN, TYPE and PROTOCOL are specified the same as for the system call
3691 of the same name.
3692 If unimplemented, yields a fatal error.
Returns true if successful.
3694 .Ip "sort(SUBROUTINE LIST)" 8 9
3695 .Ip "sort(LIST)" 8
3696 .Ip "sort SUBROUTINE LIST" 8
3697 .Ip "sort BLOCK LIST" 8
3698 .Ip "sort LIST" 8
3699 Sorts the LIST and returns the sorted array value.
3700 Nonexistent values of arrays are stripped out.
3701 If SUBROUTINE or BLOCK is omitted, sorts in standard string comparison order.
3702 If SUBROUTINE is specified, gives the name of a subroutine that returns
3703 an integer less than, equal to, or greater than 0,
3704 depending on how the elements of the array are to be ordered.
3705 (The <=> and cmp operators are extremely useful in such routines.)
3706 SUBROUTINE may be a scalar variable name, in which case the value provides
3707 the name of the subroutine to use.
3708 In place of a SUBROUTINE name, you can provide a BLOCK as an anonymous,
3709 in-line sort subroutine.
3710 .Sp
3711 In the interests of efficiency the normal calling code for subroutines
3712 is bypassed, with the following effects: the subroutine may not be a recursive
3713 subroutine, and the two elements to be compared are passed into the subroutine
3714 not via @_ but as $a and $b (see example below).
3715 They are passed by reference so don't modify $a and $b.
3716 .Sp
3717 Examples:
3718 .nf
3719
3720 .ne 2
3721         # sort lexically
3722         @articles = sort @files;
3723
3724 .ne 2
3725         # same thing, but with explicit sort routine
3726         @articles = sort {$a cmp $b} @files;
3727
3728 .ne 2
3729         # same thing in reversed order
3730         @articles = sort {$b cmp $a} @files;
3731
3732 .ne 2
3733         # sort numerically ascending
3734         @articles = sort {$a <=> $b} @files;
3735
3736 .ne 2
3737         # sort numerically descending
3738         @articles = sort {$b <=> $a} @files;
3739
3740 .ne 5
3741         # sort using explicit subroutine name
3742         sub byage {
3743             $age{$a} <=> $age{$b};      # presuming integers
3744         }
3745         @sortedclass = sort byage @class;
3746
3747 .ne 9
3748         sub reverse { $b cmp $a; }
3749         @harry = (\'dog\',\'cat\',\'x\',\'Cain\',\'Abel\');
3750         @george = (\'gone\',\'chased\',\'yz\',\'Punished\',\'Axed\');
3751         print sort @harry;
3752                 # prints AbelCaincatdogx
3753         print sort reverse @harry;
3754                 # prints xdogcatCainAbel
3755         print sort @george, \'to\', @harry;
3756                 # prints AbelAxedCainPunishedcatchaseddoggonetoxyz
3757
3758 .fi
3759 .Ip "splice(ARRAY,OFFSET,LENGTH,LIST)" 8 8
3760 .Ip "splice(ARRAY,OFFSET,LENGTH)" 8
3761 .Ip "splice(ARRAY,OFFSET)" 8
3762 Removes the elements designated by OFFSET and LENGTH from an array, and
3763 replaces them with the elements of LIST, if any.
3764 Returns the elements removed from the array.
3765 The array grows or shrinks as necessary.
3766 If LENGTH is omitted, removes everything from OFFSET onward.
3767 The following equivalencies hold (assuming $[ == 0):
3768 .nf
3769
3770         push(@a,$x,$y)\h'|3.5i'splice(@a,$#a+1,0,$x,$y)
3771         pop(@a)\h'|3.5i'splice(@a,-1)
3772         shift(@a)\h'|3.5i'splice(@a,0,1)
3773         unshift(@a,$x,$y)\h'|3.5i'splice(@a,0,0,$x,$y)
3774         $a[$x] = $y\h'|3.5i'splice(@a,$x,1,$y);
3775
3776 Example, assuming array lengths are passed before arrays:
3777
3778         sub aeq {       # compare two array values
3779                 local(@a) = splice(@_,0,shift);
3780                 local(@b) = splice(@_,0,shift);
3781                 return 0 unless @a == @b;       # same len?
3782                 while (@a) {
3783                     return 0 if pop(@a) ne pop(@b);
3784                 }
3785                 return 1;
3786         }
3787         if (&aeq($len,@foo[1..$len],0+@bar,@bar)) { ... }
3788
3789 .fi
3790 .Ip "split(/PATTERN/,EXPR,LIMIT)" 8 8
3791 .Ip "split(/PATTERN/,EXPR)" 8 8
3792 .Ip "split(/PATTERN/)" 8
3793 .Ip "split" 8
3794 Splits a string into an array of strings, and returns it.
3795 (If not in an array context, returns the number of fields found and splits
3796 into the @_ array.
3797 (In an array context, you can force the split into @_
3798 by using ?? as the pattern delimiters, but it still returns the array value.))
3799 If EXPR is omitted, splits the $_ string.
3800 If PATTERN is also omitted, splits on whitespace (/[\ \et\en]+/).
3801 Anything matching PATTERN is taken to be a delimiter separating the fields.
3802 (Note that the delimiter may be longer than one character.)
3803 If LIMIT is specified, splits into no more than that many fields (though it
3804 may split into fewer).
3805 If LIMIT is unspecified, trailing null fields are stripped (which
3806 potential users of pop() would do well to remember).
3807 A pattern matching the null string (not to be confused with a null pattern //,
3808 which is just one member of the set of patterns matching a null string)
3809 will split the value of EXPR into separate characters at each point it
3810 matches that way.
3811 For example:
3812 .nf
3813
3814         print join(\':\', split(/ */, \'hi there\'));
3815
3816 .fi
3817 produces the output \*(L'h:i:t:h:e:r:e\*(R'.
3818 .Sp
The LIMIT parameter can be used to partially split a line:
3820 .nf
3821
3822         ($login, $passwd, $remainder) = split(\|/\|:\|/\|, $_, 3);
3823
3824 .fi
3825 (When assigning to a list, if LIMIT is omitted, perl supplies a LIMIT one
3826 larger than the number of variables in the list, to avoid unnecessary work.
3827 For the list above LIMIT would have been 4 by default.
3828 In time critical applications it behooves you not to split into
3829 more fields than you really need.)
3830 .Sp
3831 If the PATTERN contains parentheses, additional array elements are created
3832 from each matching substring in the delimiter.
3833 .Sp
3834         split(/([,-])/,"1-10,20");
3835 .Sp
3836 produces the array value
3837 .Sp
3838         (1,'-',10,',',20)
3839 .Sp
3840 The pattern /PATTERN/ may be replaced with an expression to specify patterns
3841 that vary at runtime.
3842 (To do runtime compilation only once, use /$variable/o.)
3843 As a special case, specifying a space (\'\ \') will split on white space
3844 just as split with no arguments does, but leading white space does NOT
3845 produce a null first field.
3846 Thus, split(\'\ \') can be used to emulate
3847 .IR awk 's
3848 default behavior, whereas
3849 split(/\ /) will give you as many null initial fields as there are
3850 leading spaces.
3851 .Sp
3852 Example:
3853 .nf
3854
3855 .ne 5
3856         open(passwd, \'/etc/passwd\');
3857         while (<passwd>) {
3858 .ie t \{\
3859                 ($login, $passwd, $uid, $gid, $gcos, $home, $shell) = split(\|/\|:\|/\|);
3860 'br\}
3861 .el \{\
3862                 ($login, $passwd, $uid, $gid, $gcos, $home, $shell)
3863                         = split(\|/\|:\|/\|);
3864 'br\}
3865                 .\|.\|.
3866         }
3867
3868 .fi
3869 (Note that $shell above will still have a newline on it.  See chop().)
3870 See also
3871 .IR join .
3872 .Ip "sprintf(FORMAT,LIST)" 8 4
3873 Returns a string formatted by the usual printf conventions.
3874 The * character is not supported.
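.Sp
A few illustrative calls (the values are arbitrary):
.nf

        $zeros = sprintf("%05d", 42);           # '00042'
        $left = sprintf("%-6s|", 'abc');        # 'abc   |'
        $fixed = sprintf("%.2f", 3.14159);      # '3.14'
        $hex = sprintf("%x", 255);              # 'ff'
        $pct = sprintf("%d%%", 50);             # '50%'

.fi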
3875 .Ip "sqrt(EXPR)" 8 4
3876 .Ip "sqrt EXPR" 8
Returns the square root of EXPR.
3878 If EXPR is omitted, returns square root of $_.
3879 .Ip "srand(EXPR)" 8 4
3880 .Ip "srand EXPR" 8
3881 Sets the random number seed for the
3882 .I rand
3883 operator.
3884 If EXPR is omitted, does srand(time).
3885 .Ip "stat(FILEHANDLE)" 8 8
3886 .Ip "stat FILEHANDLE" 8
3887 .Ip "stat(EXPR)" 8
3888 .Ip "stat SCALARVARIABLE" 8
3889 Returns a 13-element array giving the statistics for a file, either the file
3890 opened via FILEHANDLE, or named by EXPR.
3891 Returns a null list if the stat fails.
3892 Typically used as follows:
3893 .nf
3894
3895 .ne 3
3896     ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
3897        $atime,$mtime,$ctime,$blksize,$blocks)
3898            = stat($filename);
3899
3900 .fi
3901 If stat is passed the special filehandle consisting of an underline,
3902 no stat is done, but the current contents of the stat structure from
3903 the last stat or filetest are returned.
3904 Example:
3905 .nf
3906
3907 .ne 3
3908         if (-x $file && (($d) = stat(_)) && $d < 0) {
3909                 print "$file is executable NFS file\en";
3910         }
3911
3912 .fi
3913 (This only works on machines for which the device number is negative under NFS.)
3914 .Ip "study(SCALAR)" 8 6
3915 .Ip "study SCALAR" 8
.Ip "study" 8
3917 Takes extra time to study SCALAR ($_ if unspecified) in anticipation of
3918 doing many pattern matches on the string before it is next modified.
3919 This may or may not save time, depending on the nature and number of patterns
3920 you are searching on, and on the distribution of character frequencies in
3921 the string to be searched\*(--you probably want to compare runtimes with and
3922 without it to see which runs faster.
3923 Those loops which scan for many short constant strings (including the constant
3924 parts of more complex patterns) will benefit most.
3925 You may have only one study active at a time\*(--if you study a different
3926 scalar the first is \*(L"unstudied\*(R".
3927 (The way study works is this: a linked list of every character in the string
3928 to be searched is made, so we know, for example, where all the \*(L'k\*(R' characters
3929 are.
3930 From each search string, the rarest character is selected, based on some
3931 static frequency tables constructed from some C programs and English text.
3932 Only those places that contain this \*(L"rarest\*(R" character are examined.)
3933 .Sp
For example, here is a loop which inserts index-producing entries before any line
3935 containing a certain pattern:
3936 .nf
3937
3938 .ne 8
3939         while (<>) {
3940                 study;
3941                 print ".IX foo\en" if /\ebfoo\eb/;
3942                 print ".IX bar\en" if /\ebbar\eb/;
3943                 print ".IX blurfl\en" if /\ebblurfl\eb/;
3944                 .\|.\|.
3945                 print;
3946         }
3947
3948 .fi
3949 In searching for /\ebfoo\eb/, only those locations in $_ that contain \*(L'f\*(R'
3950 will be looked at, because \*(L'f\*(R' is rarer than \*(L'o\*(R'.
3951 In general, this is a big win except in pathological cases.
3952 The only question is whether it saves you more time than it took to build
3953 the linked list in the first place.
3954 .Sp
3955 Note that if you have to look for strings that you don't know till runtime,
3956 you can build an entire loop as a string and eval that to avoid recompiling
3957 all your patterns all the time.
3958 Together with undefining $/ to input entire files as one record, this can
3959 be very fast, often faster than specialized programs like fgrep.
3960 The following scans a list of files (@files)
3961 for a list of words (@words), and prints out the names of those files that
3962 contain a match:
3963 .nf
3964
3965 .ne 12
3966         $search = \'while (<>) { study;\';
3967         foreach $word (@words) {
3968             $search .= "++\e$seen{\e$ARGV} if /\e\eb$word\e\eb/;\en";
3969         }
3970         $search .= "}";
3971         @ARGV = @files;
3972         undef $/;
3973         eval $search;           # this screams
3974         $/ = "\en";             # put back to normal input delim
3975         foreach $file (sort keys(%seen)) {
3976             print $file, "\en";
3977         }
3978
3979 .fi
3980 .Ip "substr(EXPR,OFFSET,LEN)" 8 2
3981 .Ip "substr(EXPR,OFFSET)" 8 2
3982 Extracts a substring out of EXPR and returns it.
3983 First character is at offset 0, or whatever you've set $[ to.
3984 If OFFSET is negative, starts that far from the end of the string.
3985 If LEN is omitted, returns everything to the end of the string.
3986 You can use the substr() function as an lvalue, in which case EXPR must
3987 be an lvalue.
3988 If you assign something shorter than LEN, the string will shrink, and
3989 if you assign something longer than LEN, the string will grow to accommodate it.
3990 To keep the string the same length you may need to pad or chop your value using
3991 sprintf().
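.Sp
A short sketch of both uses (the string is arbitrary, and $[ is assumed
to be 0):
.nf

        $s = 'The quick brown fox';
        $word = substr($s, 4, 5);       # 'quick'
        $tail = substr($s, -3);         # 'fox'
        substr($s, 4, 5) = 'slow';      # $s is now 'The slow brown fox'
        substr($s, 0, 3) = sprintf("%-3.3s", 'A');
                                        # padded: 'A   slow brown fox'

.fi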
3992 .Ip "symlink(OLDFILE,NEWFILE)" 8 2
3993 Creates a new filename symbolically linked to the old filename.
3994 Returns 1 for success, 0 otherwise.
3995 On systems that don't support symbolic links, produces a fatal error at
3996 run time.
3997 To check for that, use eval:
3998 .nf
3999
4000         $symlink_exists = (eval \'symlink("","");\', $@ eq \'\');
4001
4002 .fi
4003 .Ip "syscall(LIST)" 8 6
4004 .Ip "syscall LIST" 8
4005 Calls the system call specified as the first element of the list, passing
4006 the remaining elements as arguments to the system call.
4007 If unimplemented, produces a fatal error.
4008 The arguments are interpreted as follows: if a given argument is numeric,
4009 the argument is passed as an int.
4010 If not, the pointer to the string value is passed.
You are responsible for making sure a string is pre-extended long enough
4012 to receive any result that might be written into a string.
4013 If your integer arguments are not literals and have never been interpreted
4014 in a numeric context, you may need to add 0 to them to force them to look
4015 like numbers.
4016 .nf
4017
4018         require 'syscall.ph';           # may need to run h2ph
4019         syscall(&SYS_write, fileno(STDOUT), "hi there\en", 9);
4020
4021 .fi
4022 .Ip "sysread(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
4023 .Ip "sysread(FILEHANDLE,SCALAR,LENGTH)" 8 5
4024 Attempts to read LENGTH bytes of data into variable SCALAR from the specified
4025 FILEHANDLE, using the system call read(2).
4026 It bypasses stdio, so mixing this with other kinds of reads may cause
4027 confusion.
4028 Returns the number of bytes actually read, or undef if there was an error.
4029 SCALAR will be grown or shrunk to the length actually read.
4030 An OFFSET may be specified to place the read data at some other place
4031 than the beginning of the string.
4032 .Ip "system(LIST)" 8 6
4033 .Ip "system LIST" 8
4034 Does exactly the same thing as \*(L"exec LIST\*(R" except that a fork
4035 is done first, and the parent process waits for the child process to complete.
4036 Note that argument processing varies depending on the number of arguments.
4037 The return value is the exit status of the program as returned by the wait()
4038 call.
4039 To get the actual exit value divide by 256.
4040 See also
4041 .IR exec .
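.Sp
For example, on a Unix-like system with a Bourne shell on the search
path (an assumption of this sketch):
.nf

        $status = system('sh', '-c', 'exit 3');
        $exit = int($status / 256);     # 3, the value passed to exit
        print "exit value $exit\en";

.fi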
4042 .Ip "syswrite(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
4043 .Ip "syswrite(FILEHANDLE,SCALAR,LENGTH)" 8 5
4044 Attempts to write LENGTH bytes of data from variable SCALAR to the specified
4045 FILEHANDLE, using the system call write(2).
4046 It bypasses stdio, so mixing this with prints may cause
4047 confusion.
4048 Returns the number of bytes actually written, or undef if there was an error.
An OFFSET may be specified to write the data from some place other than
the beginning of the string.
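.Sp
As a sketch of how sysread and syswrite fit together, the following loop
copies STDIN to STDOUT and uses OFFSET to finish a partial write (the
8192-byte buffer size is an arbitrary choice):
.nf

        while ($len = sysread(STDIN, $buf, 8192)) {
            $offset = 0;
            while ($len) {              # a write may be partial
                $written = syswrite(STDOUT, $buf, $len, $offset);
                die "System write error: $!\en" unless defined $written;
                $len -= $written;
                $offset += $written;
            }
        }
        die "System read error: $!\en" unless defined $len;

.fi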
4051 .Ip "tell(FILEHANDLE)" 8 6
4052 .Ip "tell FILEHANDLE" 8 6
4053 .Ip "tell" 8
4054 Returns the current file position for FILEHANDLE.
4055 FILEHANDLE may be an expression whose value gives the name of the actual
4056 filehandle.
4057 If FILEHANDLE is omitted, assumes the file last read.
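.Sp
Together with seek(), this lets a program remember a position and come
back to it later.
A sketch using a scratch file (the file name is arbitrary, and a
writable /tmp is assumed):
.nf

        $file = "/tmp/tell.$$";
        open(TMP, ">$file") || die "Can't create $file: $!\en";
        print TMP "one\entwo\enthree\en";
        close(TMP);
        open(TMP, $file) || die "Can't open $file: $!\en";
        $first = <TMP>;                 # the line "one"
        $mark = tell(TMP);              # 4, just past the first line
        @rest = <TMP>;                  # read to end of file
        seek(TMP, $mark, 0);            # WHENCE 0: offset from start of file
        $again = <TMP>;                 # the line "two" once more
        close(TMP);
        unlink($file);

.fi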
4058 .Ip "telldir(DIRHANDLE)" 8 5
4059 .Ip "telldir DIRHANDLE" 8
4060 Returns the current position of the readdir() routines on DIRHANDLE.
4061 Value may be given to seekdir() to access a particular location in
4062 a directory.
4063 Has the same caveats about possible directory compaction as the corresponding
4064 system library routine.
4065 .Ip "time" 8 4
4066 Returns the number of non-leap seconds since 00:00:00 UTC, January 1, 1970.
4067 Suitable for feeding to gmtime() and localtime().
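.Sp
For instance, to measure elapsed wall-clock time in whole seconds (a
sketch; the sleep merely stands in for real work):
.nf

        $start = time;
        sleep(2);                       # the work being timed
        $elapsed = time - $start;
        print "took about $elapsed seconds\en";

.fi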
4068 .Ip "times" 8 4
4069 Returns a four-element array giving the user and system times, in seconds, for this
4070 process and the children of this process.
4071 .Sp
4072     ($user,$system,$cuser,$csystem) = times;
4073 .Sp
4074 .Ip "tr/SEARCHLIST/REPLACEMENTLIST/cds" 8 5
4075 .Ip "y/SEARCHLIST/REPLACEMENTLIST/cds" 8
Translates each occurrence of a character found in the search list into
the corresponding character in the replacement list.
4078 It returns the number of characters replaced or deleted.
4079 If no string is specified via the =~ or !~ operator,
4080 the $_ string is translated.
4081 (The string specified with =~ must be a scalar variable, an array element,
4082 or an assignment to one of those, i.e. an lvalue.)
4083 For
4084 .I sed
4085 devotees,
4086 .I y
4087 is provided as a synonym for
4088 .IR tr .
4089 If the SEARCHLIST is delimited by bracketing quotes, the REPLACEMENTLIST
4090 has its own pair of quotes, which may or may not be bracketing quotes, e.g.
4091 tr[A-Z][a-z] or tr(+-*/)/ABCD/.
4092 .Sp
4093 If the c modifier is specified, the SEARCHLIST character set is complemented.
4094 If the d modifier is specified, any characters specified by SEARCHLIST that
4095 are not found in REPLACEMENTLIST are deleted.
4096 (Note that this is slightly more flexible than the behavior of some
4097 .I tr
4098 programs, which delete anything they find in the SEARCHLIST, period.)
4099 If the s modifier is specified, sequences of characters that were translated
4100 to the same character are squashed down to 1 instance of the character.
4101 .Sp
4102 If the d modifier was used, the REPLACEMENTLIST is always interpreted exactly
4103 as specified.
4104 Otherwise, if the REPLACEMENTLIST is shorter than the SEARCHLIST,
4105 the final character is replicated till it is long enough.
4106 If the REPLACEMENTLIST is null, the SEARCHLIST is replicated.
4107 The latter is useful for counting characters in a class, or for squashing
4108 character sequences in a class.
4109 .Sp
4110 Examples:
4111 .nf
4112
4113     $ARGV[1] \|=~ \|y/A\-Z/a\-z/;       \h'|3i'# canonicalize to lower case
4114
4115     $cnt = tr/*/*/;             \h'|3i'# count the stars in $_
4116
4117     $cnt = tr/0\-9//;           \h'|3i'# count the digits in $_
4118
4119     tr/a\-zA\-Z//s;     \h'|3i'# bookkeeper \-> bokeper
4120
4121     ($HOST = $host) =~ tr/a\-z/A\-Z/;
4122
4123     y/a\-zA\-Z/ /cs;    \h'|3i'# change non-alphas to single space
4124
4125     tr/\e200\-\e377/\e0\-\e177/;\h'|3i'# delete 8th bit
4126
4127 .fi
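.Sp
The modifiers and the replication rules described above can be seen together
in the following sketch.
The strings and variable names are made up for illustration, and the expected
results in the comments follow from the rules stated in this entry.

```perl
# Illustrative input; none of these names come from the manual.
$s = "Hello, World 42";

($lower = $s) =~ tr/A-Z/a-z/;               # canonicalize: "hello, world 42"
($digits = $s) =~ tr/0-9//cd;               # c + d: delete all non-digits: "42"
($squashed = "bookkeeper") =~ tr/a-zA-Z//s; # s: squash runs: "bokeper"
$vowels = ($s =~ tr/aeiou//);               # null REPLACEMENTLIST: count only
($padded = "abcde") =~ tr/a-e/xy/;          # short list: final y is replicated
```

Because a null REPLACEMENTLIST replicates the SEARCHLIST, the counting form
leaves $s unchanged.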
4128 .Ip "truncate(FILEHANDLE,LENGTH)" 8 4
4129 .Ip "truncate(EXPR,LENGTH)" 8
4130 Truncates the file opened on FILEHANDLE, or named by EXPR, to the specified
4131 length.
4132 Produces a fatal error if truncate isn't implemented on your system.
4133 .Ip "umask(EXPR)" 8 4
4134 .Ip "umask EXPR" 8
4135 .Ip "umask" 8
4136 Sets the umask for the process and returns the old one.
4137 If EXPR is omitted, merely returns current umask.
4138 .Ip "undef(EXPR)" 8 6
4139 .Ip "undef EXPR" 8
4140 .Ip "undef" 8
4141 Undefines the value of EXPR, which must be an lvalue.
4142 Use only on a scalar value, an entire array, or a subroutine name (using &).
4143 (Undef will probably not do what you expect on most predefined variables or
4144 dbm array values.)
4145 Always returns the undefined value.
4146 You can omit the EXPR, in which case nothing is undefined, but you still
4147 get an undefined value that you could, for instance, return from a subroutine.
4148 Examples:
4149 .nf
4150
4151 .ne 6
4152         undef $foo;
4153         undef $bar{'blurfl'};
4154         undef @ary;
4155         undef %assoc;
4156         undef &mysub;
4157         return (wantarray ? () : undef) if $they_blew_it;
4158
4159 .fi
4160 .Ip "unlink(LIST)" 8 4
4161 .Ip "unlink LIST" 8
4162 Deletes a list of files.
4163 Returns the number of files successfully deleted.
4164 .nf
4165
4166 .ne 2
4167         $cnt = unlink \'a\', \'b\', \'c\';
4168         unlink @goners;
4169         unlink <*.bak>;
4170
4171 .fi
4172 Note: unlink will not delete directories unless you are superuser and the
4173 .B \-U
4174 flag is supplied to
4175 .IR perl .
4176 Even if these conditions are met, be warned that unlinking a directory
4177 can inflict damage on your filesystem.
4178 Use rmdir instead.
4179 .Ip "unpack(TEMPLATE,EXPR)" 8 4
4180 Unpack does the reverse of pack: it takes a string representing
4181 a structure and expands it out into an array value, returning the array
4182 value.
4183 (In a scalar context, it merely returns the first value produced.)
4184 The TEMPLATE has the same format as in the pack function.
4185 Here's a subroutine that does substring:
4186 .nf
4187
4188 .ne 4
4189         sub substr {
4190                 local($what,$where,$howmuch) = @_;
4191                 unpack("x$where a$howmuch", $what);
4192         }
4193
4194 .ne 3
4195 and then there's
4196
4197         sub ord { unpack("c",$_[0]); }
4198
4199 .fi
4200 In addition, you may prefix a field with a %<number> to indicate that
4201 you want a <number>-bit checksum of the items instead of the items themselves.
4202 Default is a 16-bit checksum.
4203 For example, the following computes the same number as the System V sum program:
4204 .nf
4205
4206 .ne 4
4207         while (<>) {
4208             $checksum += unpack("%16C*", $_);
4209         }
4210         $checksum %= 65536;
4211
4212 .fi
4213 The following efficiently counts the number of set bits in a bit vector:
4214 .nf
4215
4216         $setbits = unpack("%32b*", $selectmask);
4217
4218 .fi
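.Sp
As a small self-contained check of the two checksum idioms above, the
following sketch builds its own data rather than reading a file or a real
select mask.
The bit pattern and the string "ABC" are arbitrary illustrative values.

```perl
# A ten-bit vector with bits 0, 3 and 9 set (illustrative data only).
$selectmask = pack("b*", "1001000001");
$setbits = unpack("%32b*", $selectmask);    # counts the 1 bits

# Byte-sum checksum of a short string: 65 + 66 + 67.
$checksum = unpack("%16C*", "ABC");
$checksum %= 65536;
```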
4219 .Ip "unshift(ARRAY,LIST)" 8 4
4220 Does the opposite of a
4221 .IR shift .
4222 Or the opposite of a
4223 .IR push ,
4224 depending on how you look at it.
4225 Prepends list to the front of the array, and returns the new number of elements
4226 in the array.
4227 .nf
4228
4229         unshift(ARGV, \'\-e\') unless $ARGV[0] =~ /^\-/;
4230
4231 .fi
4232 Note the LIST is prepended whole, not one element at a time, so the prepended
4233 elements stay in the same order.  Use reverse to do the reverse.
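.Sp
A minimal sketch of the ordering rule, using a made-up array named @queue:

```perl
@queue = (3, 4);
$n = unshift(@queue, 1, 2);           # LIST goes on whole: (1, 2, 3, 4)
unshift(@queue, reverse('a', 'b'));   # reversed first: (b, a, 1, 2, 3, 4)
$order = join(' ', @queue);
```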
4234 .Ip "utime(LIST)" 8 2
4235 .Ip "utime LIST" 8 2
4236 Changes the access and modification times on each file of a list of files.
4237 The first two elements of the list must be the NUMERICAL access and
4238 modification times, in that order.
4239 Returns the number of files successfully changed.
4240 The inode modification time of each file is set to the current time.
4241 Example of a \*(L"touch\*(R" command:
4242 .nf
4243
4244 .ne 3
4245         #!/usr/bin/perl
4246         $now = time;
4247         utime $now, $now, @ARGV;
4248
4249 .fi
4250 .Ip "values(ASSOC_ARRAY)" 8 6
4251 .Ip "values ASSOC_ARRAY" 8
4252 Returns a normal array consisting of all the values of the named associative
4253 array.
4254 (In a scalar context, returns the number of values.)
4255 The values are returned in an apparently random order, but it is the same order
4256 as either the keys() or each() function would produce on the same array.
4257 See also keys() and each().
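.Sp
The correspondence between keys() and values() can be checked with a sketch
like the following.
The array %age and its contents are illustrative, and the array must not be
modified between the two calls.

```perl
%age = ('tom', 30, 'ann', 25, 'bob', 41);
@k = keys(%age);
@v = values(%age);

# Position $i in @v holds the value stored under the key at position $i in @k.
$matches = 0;
for ($i = 0; $i < @k; $i++) {
    $matches++ if $age{$k[$i]} == $v[$i];
}
$count = values(%age);      # scalar context: the number of values
```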
4258 .Ip "vec(EXPR,OFFSET,BITS)" 8 2
4259 Treats a string as a vector of unsigned integers, and returns the value
4260 of the bitfield specified.
4261 May also be assigned to.
4262 BITS must be a power of two from 1 to 32.
4263 .Sp
4264 Vectors created with vec() can also be manipulated with the logical operators
4265 |, & and ^,
4266 which will assume a bit vector operation is desired when both operands are
4267 strings.
4268 This interpretation is not enabled unless there is at least one vec() in
4269 your program, to protect older programs.
4270 .Sp
4271 To transform a bit vector into a string or array of 0's and 1's, use these:
4272 .nf
4273
4274         $bits = unpack("b*", $vector);
4275         @bits = split(//, unpack("b*", $vector));
4276
4277 .fi
4278 If you know the exact length in bits, it can be used in place of the *.
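.Sp
The following sketch puts these pieces together: it sets a few one-bit fields,
reads one back, renders the vector as a string of 0's and 1's, and combines two
vectors with the | operator.
The bit positions are arbitrary illustrative choices.

```perl
$vector = '';
foreach $i (1, 4, 6) {
    vec($vector, $i, 1) = 1;            # assign to individual bits
}
$is_set  = vec($vector, 4, 1);          # read one bit back
$is_zero = vec($vector, 5, 1);
$bits = unpack("b*", $vector);          # bit 0 comes first in the string

# Both operands are strings, so | works on the bit vectors themselves.
$union = $vector | pack("b*", "10000000");
$union_bits = unpack("b*", $union);
```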
4279 .Ip "wait" 8 6
4280 Waits for a child process to terminate and returns the pid of the deceased
4281 process, or -1 if there are no child processes.
4282 The status is returned in $?.
4283 .Ip "waitpid(PID,FLAGS)" 8 6
4284 Waits for a particular child process to terminate and returns the pid of the deceased
4285 process, or -1 if there is no such child process.
4286 The status is returned in $?.
4287 If you say
4288 .nf
4289
4290         require "sys/wait.ph";
4291         .\|.\|.
4292         waitpid(-1,&WNOHANG);
4293
4294 .fi
4295 then you can do a non-blocking wait for any process.  Non-blocking wait
4296 is only available on machines supporting either the
4297 .I waitpid (2)
4298 or
4299 .I wait4 (2)
4300 system calls.
4301 However, waiting for a particular pid with FLAGS of 0 is implemented
4302 everywhere.  (Perl emulates the system call by remembering the status
4303 values of processes that have exited but have not been harvested by the
4304 Perl script yet.)
4305 .Ip "wantarray" 8 4
4306 Returns true if the context of the currently executing subroutine
4307 is looking for an array value.
4308 Returns false if the context is looking for a scalar.
4309 .nf
4310
4311         return wantarray ? () : undef;
4312
4313 .fi
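.Sp
A slightly fuller sketch shows a routine that adapts both its success and
failure values to the caller's context.
The subroutine name lookup and the array %table are hypothetical.

```perl
%table = ('one', 1);

sub lookup {
    local($key) = @_;
    # On failure: a null list in array context, undef in scalar context.
    return (wantarray ? () : undef) unless defined $table{$key};
    return wantarray ? ($key, $table{$key}) : $table{$key};
}

@hit  = &lookup('one');     # array context
$hit  = &lookup('one');     # scalar context
@miss = &lookup('two');
$miss = &lookup('two');
```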
4314 .Ip "warn(LIST)" 8 4
4315 .Ip "warn LIST" 8
4316 Produces a message on STDERR just like \*(L"die\*(R", but doesn't exit.
4317 .Ip "write(FILEHANDLE)" 8 6
4318 .Ip "write(EXPR)" 8
4319 .Ip "write" 8
4320 Writes a formatted record (possibly multi-line) to the specified file,
4321 using the format associated with that file.
4322 By default the format for a file is the one having the same name as the
4323 filehandle, but the format for the current output channel (see
4324 .IR select )
4325 may be set explicitly
4326 by assigning the name of the format to the $~ variable.
4327 .Sp
4328 Top of form processing is handled automatically:
4329 if there is insufficient room on the current page for the formatted
4330 record, the page is advanced by writing a form feed,
4331 a special top-of-page format is used
4332 to format the new page header, and then the record is written.
4333 By default the top-of-page format is the name of the filehandle with
4334 \*(L"_TOP\*(R" appended, but it may be dynamically set to the
4335 format of your choice by assigning the name to the $^ variable while
4336 the filehandle is selected.
4337 The number of lines remaining on the current page is in variable $-, which
4338 can be set to 0 to force a new page.
4339 .Sp
4340 If FILEHANDLE is unspecified, output goes to the current default output channel,
4341 which starts out as
4342 .I STDOUT
4343 but may be changed by the
4344 .I select
4345 operator.
4346 If the FILEHANDLE is an EXPR, then the expression is evaluated and the
4347 resulting string is used to look up the name of the FILEHANDLE at run time.
4348 For more on formats, see the section on formats later on.
4349 .Sp
4350 Note that write is NOT the opposite of read.
4351 .Sh "Precedence"
4352 .I Perl
4353 operators have the following associativity and precedence:
4354 .nf
4355
4356 nonassoc\h'|1i'print printf exec system sort reverse
4357 \h'1.5i'chmod chown kill unlink utime die return
4358 left\h'|1i',
4359 right\h'|1i'= += \-= *= etc.
4360 right\h'|1i'?:
4361 nonassoc\h'|1i'.\|.
4362 left\h'|1i'||
4363 left\h'|1i'&&
4364 left\h'|1i'| ^
4365 left\h'|1i'&
4366 nonassoc\h'|1i'== != <=> eq ne cmp
4367 nonassoc\h'|1i'< > <= >= lt gt le ge
4368 nonassoc\h'|1i'chdir exit eval reset sleep rand umask
4369 nonassoc\h'|1i'\-r \-w \-x etc.
4370 left\h'|1i'<< >>
4371 left\h'|1i'+ \- .
4372 left\h'|1i'* / % x
4373 left\h'|1i'=~ !~
4374 right\h'|1i'! ~ and unary minus
4375 right\h'|1i'**
4376 nonassoc\h'|1i'++ \-\|\-
4377 left\h'|1i'\*(L'(\*(R'
4378
4379 .fi
4380 As mentioned earlier, if any list operator (print, etc.) or
4381 any unary operator (chdir, etc.)
4382 is followed by a left parenthesis as the next token on the same line,
4383 the operator and arguments within parentheses are taken to
4384 be of highest precedence, just like a normal function call.
4385 Examples:
4386 .nf
4387
4388         chdir $foo || die;\h'|3i'# (chdir $foo) || die
4389         chdir($foo) || die;\h'|3i'# (chdir $foo) || die
4390         chdir ($foo) || die;\h'|3i'# (chdir $foo) || die
4391         chdir +($foo) || die;\h'|3i'# (chdir $foo) || die
4392
4393 but, because * is higher precedence than ||:
4394
4395         chdir $foo * 20;\h'|3i'# chdir ($foo * 20)
4396         chdir($foo) * 20;\h'|3i'# (chdir $foo) * 20
4397         chdir ($foo) * 20;\h'|3i'# (chdir $foo) * 20
4398         chdir +($foo) * 20;\h'|3i'# chdir ($foo * 20)
4399
4400         rand 10 * 20;\h'|3i'# rand (10 * 20)
4401         rand(10) * 20;\h'|3i'# (rand 10) * 20
4402         rand (10) * 20;\h'|3i'# (rand 10) * 20
4403         rand +(10) * 20;\h'|3i'# rand (10 * 20)
4404
4405 .fi
4406 In the absence of parentheses,
4407 the precedence of list operators such as print, sort or chmod is
4408 either very high or very low depending on whether you look at the left
4409 side of the operator or the right side of it.
4410 For example, in
4411 .nf
4412
4413         @ary = (1, 3, sort 4, 2);
4414         print @ary;             # prints 1324
4415
4416 .fi
4417 the commas on the right of the sort are evaluated before the sort, but
4418 the commas on the left are evaluated after.
4419 In other words, list operators tend to gobble up all the arguments that
4420 follow them, and then act like a simple term with regard to the preceding
4421 expression.
4422 Note that you have to be careful with parens:
4423 .nf
4424
4425 .ne 3
4426         # These evaluate exit before doing the print:
4427         print($foo, exit);      # Obviously not what you want.
4428         print $foo, exit;       # Nor is this.
4429
4430 .ne 4
4431         # These do the print before evaluating exit:
4432         (print $foo), exit;     # This is what you want.
4433         print($foo), exit;      # Or this.
4434         print ($foo), exit;     # Or even this.
4435
4436 Also note that
4437
4438         print ($foo & 255) + 1, "\en";
4439
4440 .fi
4441 probably doesn't do what you expect at first glance.
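.Sp
The reason is the parenthesis rule given earlier: the print takes only
($foo & 255) as its argument list, and the + 1 and the newline are applied to
print's return value.
The same rule can be observed without printing anything by using a unary
operator whose result is easy to inspect.
The sketch below uses length for that purpose; the value 511 is arbitrary.

```perl
# List operators gobble everything to their right.
@ary = (1, 3, sort 4, 2);
$joined = join('', @ary);               # "1324"

$foo = 511;                             # $foo & 255 is 255

# A left parenthesis right after the operator makes it a function call:
$x = length ($foo & 255) + 1;           # (length 255) + 1

# A leading + hides the parenthesis, so the whole sum is the argument:
$y = length +($foo & 255) + 1;          # length (255 + 1)
```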
4442 .Sh "Subroutines"
4443 A subroutine may be declared as follows:
4444 .nf
4445
4446     sub NAME BLOCK
4447
4448 .fi
4449 .PP
4450 Any arguments passed to the routine come in as array @_,
4451 that is ($_[0], $_[1], .\|.\|.).
4452 The array @_ is a local array, but its values are references to the
4453 actual scalar parameters.
4454 The return value of the subroutine is the value of the last expression
4455 evaluated, and can be either an array value or a scalar value.
4456 Alternatively, a return statement may be used to specify the returned value and
4457 exit the subroutine.
4458 To create local variables see the
4459 .I local
4460 operator.
4461 .PP
4462 A subroutine is called using the
4463 .I do
4464 operator or the & operator.
4465 .nf
4466
4467 .ne 12
4468 Example:
4469
4470         sub MAX {
4471                 local($max) = pop(@_);
4472                 foreach $foo (@_) {
4473                         $max = $foo \|if \|$max < $foo;
4474                 }
4475                 $max;
4476         }
4477
4478         .\|.\|.
4479         $bestday = &MAX($mon,$tue,$wed,$thu,$fri);
4480
4481 .ne 21
4482 Example:
4483
4484         # get a line, combining continuation lines
4485         #  that start with whitespace
4486         sub get_line {
4487                 $thisline = $lookahead;
4488                 line: while ($lookahead = <STDIN>) {
4489                         if ($lookahead \|=~ \|/\|^[ \^\e\|t]\|/\|) {
4490                                 $thisline \|.= \|$lookahead;
4491                         }
4492                         else {
4493                                 last line;
4494                         }
4495                 }
4496                 $thisline;
4497         }
4498
4499         $lookahead = <STDIN>;   # get first line
4500         while ($_ = do get_line(\|)) {
4501                 .\|.\|.
4502         }
4503
4504 .fi
4505 .nf
4506 .ne 6
4507 Use array assignment to a local list to name your formal arguments:
4508
4509         sub maybeset {
4510                 local($key, $value) = @_;
4511                 $foo{$key} = $value unless $foo{$key};
4512         }
4513
4514 .fi
4515 This also has the effect of turning call-by-reference into call-by-value,
4516 since the assignment copies the values.
4517 .Sp
4518 Subroutines may be called recursively.
4519 If a subroutine is called using the & form, the argument list is optional.
4520 If omitted, no @_ array is set up for the subroutine; the @_ array at the
4521 time of the call is visible to the subroutine instead.
4522 .nf
4523
4524         do foo(1,2,3);          # pass three arguments
4525         &foo(1,2,3);            # the same
4526
4527         do foo();               # pass a null list
4528         &foo();                 # the same
4529         &foo;                   # pass no arguments\*(--more efficient
4530
4531 .fi
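.Sp
The difference between working on @_ directly and copying it into local
variables, and the effect of omitting the argument list, can be sketched as
follows.
All subroutine and variable names here are hypothetical.

```perl
sub bump      { $_[0]++; }                      # changes the caller's scalar
sub bump_copy { local($n) = @_; $n++; $n; }     # changes only a local copy

$count = 5;
&bump($count);                  # $count is modified through @_
$copy = &bump_copy($count);     # $count is left alone

sub first_arg { $_[0]; }
sub relay     { &first_arg; }   # no argument list: first_arg sees relay's @_
$seen = &relay('hello');
```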
4532 .Sh "Passing By Reference"
4533 Sometimes you don't want to pass the value of an array to a subroutine but
4534 rather the name of it, so that the subroutine can modify the global copy
4535 of it rather than working with a local copy.
4536 In perl you can refer to all the objects of a particular name by prefixing
4537 the name with a star: *foo.
4538 When evaluated, it produces a scalar value that represents all the objects
4539 of that name, including any filehandle, format or subroutine.
4540 When assigned to within a local() operation, it causes the name mentioned
4541 to refer to whatever * value was assigned to it.
4542 Example:
4543 .nf
4544
4545         sub doubleary {
4546             local(*someary) = @_;
4547             foreach $elem (@someary) {
4548                 $elem *= 2;
4549             }
4550         }
4551         do doubleary(*foo);
4552         do doubleary(*bar);
4553
4554 .fi
4555 Assignment to *name is currently recommended only inside a local().
4556 You can actually assign to *name anywhere, but the previous referent of
4557 *name may be stranded forever.
4558 This may or may not bother you.
4559 .Sp
4560 Note that scalars are already passed by reference, so you can modify scalar
4561 arguments without using this mechanism by referring explicitly to the $_[nnn]
4562 in question.
4563 You can modify all the elements of an array by passing all the elements
4564 as scalars, but you have to use the * mechanism to push, pop or change the
4565 size of an array.
4566 The * mechanism will probably be more efficient in any case.
4567 .Sp
4568 Since a *name value contains unprintable binary data, if it is used as
4569 an argument in a print, or as a %s argument in a printf or sprintf, it
4570 then has the value '*name', just so it prints out pretty.
4571 .Sp
4572 Even if you don't want to modify an array, this mechanism is useful for
4573 passing multiple arrays in a single LIST, since normally the LIST mechanism
4574 will merge all the array values so that you can't extract out the
4575 individual arrays.
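.Sp
As a sketch of that last point, the following routine receives two arrays by
name, keeps them distinct, and also changes the size of one of them.
The names add_pairs, lhs, rhs, first and second are made up for illustration.

```perl
sub add_pairs {
    local(*lhs, *rhs) = @_;     # @lhs and @rhs now name the caller's arrays
    local(@sum, $i);
    for ($i = 0; $i < @lhs; $i++) {
        push(@sum, $lhs[$i] + $rhs[$i]);
    }
    push(@lhs, 'touched');      # size change is visible to the caller
    @sum;
}

@first  = (1, 2, 3);
@second = (10, 20, 30);
@total  = &add_pairs(*first, *second);
```

Passing @first and @second directly would instead deliver one merged list of
six scalars in @_.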
4576 .Sh "Regular Expressions"
4577 The patterns used in pattern matching are regular expressions such as
4578 those supplied in the Version 8 regexp routines.
4579 (In fact, the routines are derived from Henry Spencer's freely redistributable
4580 reimplementation of the V8 routines.)
4581 In addition, \ew matches an alphanumeric character (including \*(L"_\*(R") and \eW a nonalphanumeric.
4582 Word boundaries may be matched by \eb, and non-boundaries by \eB.
4583 A whitespace character is matched by \es, non-whitespace by \eS.
4584 A numeric character is matched by \ed, non-numeric by \eD.
4585 You may use \ew, \es and \ed within character classes.
4586 Also, \en, \er, \ef, \et and \eNNN have their normal interpretations.
4587 Within character classes \eb represents backspace rather than a word boundary.
4588 Alternatives may be separated by |.
4589 The bracketing construct \|(\ .\|.\|.\ \|) may also be used, in which case \e<digit>
4590 matches the digit'th substring.
4591 (Outside of the pattern, always use $ instead of \e in front of the digit.
4592 The scope of $<digit> (and $\`, $& and $\')
4593 extends to the end of the enclosing BLOCK or eval string, or to
4594 the next pattern match with subexpressions.
4595 The \e<digit> notation sometimes works outside the current pattern, but should
4596 not be relied upon.)
4597 You may have as many parentheses as you wish.  If you have more than 9
4598 substrings, the variables $10, $11, ... refer to the corresponding
4599 substring.  Within the pattern, \e10, \e11,
4600 etc. refer back to substrings if there have been at least that many left parens
4601 before the backreference.  Otherwise (for backward compatibilty) \e10
4602 is the same as \e010, a backspace,
4603 and \e11 the same as \e011, a tab.
4604 And so on.
4605 (\e1 through \e9 are always backreferences.)
4606 .PP
4607 $+ returns whatever the last bracket match matched.
4608 $& returns the entire matched string.
4609 ($0 used to return the same thing, but not any more.)
4610 $\` returns everything before the matched string.
4611 $\' returns everything after the matched string.
4612 Examples:
4613 .nf
4614
4615         s/\|^\|([^ \|]*\|) \|*([^ \|]*\|)\|/\|$2 $1\|/; # swap first two words
4616
4617 .ne 5
4618         if (/\|Time: \|(.\|.\|):\|(.\|.\|):\|(.\|.\|)\|/\|) {
4619                 $hours = $1;
4620                 $minutes = $2;
4621                 $seconds = $3;
4622         }
4623
4624 .fi
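.Sp
The match variables described above can be inspected side by side with a
sketch such as this one.
The input strings are invented for the purpose.

```perl
$_ = "Uptime Time: 12:34:56 ok";
if (/Time: (..):(..):(..)/) {
    ($hours, $minutes, $seconds) = ($1, $2, $3);
    ($before, $matched, $after) = ($`, $&, $');
    $last = $+;                         # the last bracket match
}

# Inside the pattern use \1; outside of it use $1.
$dup = "hello hello world";
$dup =~ s/(\w+) \1/$1/;
```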
4625 By default, the ^ character is only guaranteed to match at the beginning
4626 of the string,
4627 the $ character only at the end (or before the newline at the end)
4628 and
4629 .I perl
4630 does certain optimizations with the assumption that the string contains
4631 only one line.
4632 The behavior of ^ and $ on embedded newlines will be inconsistent.
4633 You may, however, wish to treat a string as a multi-line buffer, such that
4634 the ^ will match after any newline within the string, and $ will match
4635 before any newline.
4636 At the cost of a little more overhead, you can do this by setting the variable
4637 $* to 1.
4638 Setting it back to 0 makes
4639 .I perl
4640 revert to its old behavior.
4641 .PP
4642 To facilitate multi-line substitutions, the . character never matches a newline
4643 (even when $* is 0).
4644 In particular, the following leaves a newline on the $_ string:
4645 .nf
4646
4647         $_ = <STDIN>;
4648         s/.*(some_string).*/$1/;
4649
4650 If the newline is unwanted, try one of
4651
4652         s/.*(some_string).*\en/$1/;
4653         s/.*(some_string)[^\e000]*/$1/;
4654         s/.*(some_string)(.|\en)*/$1/;
4655         chop; s/.*(some_string).*/$1/;
4656         /(some_string)/ && ($_ = $1);
4657
4658 .fi
4659 Any item of a regular expression may be followed with digits in curly brackets
4660 of the form {n,m}, where n gives the minimum number of times to match the item
4661 and m gives the maximum.
4662 The form {n} is equivalent to {n,n} and matches exactly n times.
4663 The form {n,} matches n or more times.
4664 (If a curly bracket occurs in any other context, it is treated as a regular
4665 character.)
4666 The * modifier is equivalent to {0,}, the + modifier to {1,} and the ? modifier
4667 to {0,1}.
4668 There is no limit to the size of n or m, but large numbers will chew up
4669 more memory.
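.Sp
A short sketch of the counted forms, tried against a made-up list of words:

```perl
@words = ('b', 'ab', 'aab', 'aaab', 'aaaab');
foreach $w (@words) {
    push(@two_or_three, $w) if $w =~ /^a{2,3}b$/;
    push(@exactly_two, $w)  if $w =~ /^a{2}b$/;     # same as {2,2}
    push(@three_plus, $w)   if $w =~ /^a{3,}b$/;
    push(@optional, $w)     if $w =~ /^a{0,1}b$/;   # same as /^a?b$/
}
```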
4670 .Sp
4671 You will note that all backslashed metacharacters in
4672 .I perl
4673 are alphanumeric,
4674 such as \eb, \ew, \en.
4675 Unlike some other regular expression languages, there are no backslashed
4676 symbols that aren't alphanumeric.
4677 So anything that looks like \e\e, \e(, \e), \e<, \e>, \e{, or \e} is always
4678 interpreted as a literal character, not a metacharacter.
4679 This makes it simple to quote a string that you want to use for a pattern
4680 but that you are afraid might contain metacharacters.
4681 Simply quote all the non-alphanumeric characters:
4682 .nf
4683
4684         $pattern =~ s/(\eW)/\e\e$1/g;
4685
4686 .fi
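.Sp
To see what the quoting buys you, compare a quoted and an unquoted copy of the
same string used as a pattern.
The strings below are illustrative only.

```perl
$literal = 'a.b*c';
($pattern = $literal) =~ s/(\W)/\\$1/g;         # quote every non-alphanumeric

$hit      = ('xa.b*cx'  =~ /$pattern/) ? 1 : 0; # the literal text is found
$miss     = ('xaXbbbcx' =~ /$pattern/) ? 1 : 0; # . and * are no longer special
$unquoted = ('xaXbbbcx' =~ /$literal/) ? 1 : 0; # unquoted, . and * still act
```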
4687 .Sh "Formats"
4688 Output record formats for use with the
4689 .I write
4690 operator may be declared as follows:
4691 .nf
4692
4693 .ne 3
4694     format NAME =
4695     FORMLIST
4696     .
4697
4698 .fi
4699 If NAME is omitted, format \*(L"STDOUT\*(R" is defined.
4700 FORMLIST consists of a sequence of lines, each of which may be of one of three
4701 types:
4702 .Ip 1. 4
4703 A comment.
4704 .Ip 2. 4
4705 A \*(L"picture\*(R" line giving the format for one output line.
4706 .Ip 3. 4
4707 An argument line supplying values to plug into a picture line.
4708 .PP
4709 Picture lines are printed exactly as they look, except for certain fields
4710 that substitute values into the line.
4711 Each picture field starts with either @ or ^.
4712 The @ field (not to be confused with the array marker @) is the normal
4713 case; ^ fields are used
4714 to do rudimentary multi-line text block filling.
4715 The length of the field is supplied by padding out the field
4716 with multiple <, >, or | characters to specify, respectively, left justification,
4717 right justification, or centering.
4718 As an alternate form of right justification,
4719 you may also use # characters (with an optional .) to specify a numeric field.
4720 (Use of ^ instead of @ causes the field to be blanked if undefined.)
4721 If any of the values supplied for these fields contains a newline, only
4722 the text up to the newline is printed.
4723 The special field @* can be used for printing multi-line values.
4724 It should appear by itself on a line.
4725 .PP
4726 The values are specified on the following line, in the same order as
4727 the picture fields.
4728 The values should be separated by commas.
4729 .PP
4730 Picture fields that begin with ^ rather than @ are treated specially.
4731 The value supplied must be a scalar variable name which contains a text
4732 string.
4733 .I Perl
4734 puts as much text as it can into the field, and then chops off the front
4735 of the string so that the next time the variable is referenced,
4736 more of the text can be printed.
4737 Normally you would use a sequence of fields in a vertical stack to print
4738 out a block of text.
4739 If you like, you can end the final field with .\|.\|., which will appear in the
4740 output if the text was too long to appear in its entirety.
4741 You can change which characters are legal to break on by changing the
4742 variable $: to a list of the desired characters.
4743 .PP
4744 Since use of ^ fields can produce variable length records if the text to be
4745 formatted is short, you can suppress blank lines by putting the tilde (~)
4746 character anywhere in the line.
4747 (Normally you should put it in the front if possible, for visibility.)
4748 The tilde will be translated to a space upon output.
4749 If you put a second tilde contiguous to the first, the line will be repeated
4750 until all the fields on the line are exhausted.
4751 (If you use a field of the @ variety, the expression you supply had better
4752 not give the same value every time forever!)
4753 .PP
4754 Examples:
4755 .nf
4756 .lg 0
4757 .cs R 25
4758 .ft C
4759
4760 .ne 10
4761 # a report on the /etc/passwd file
4762 format STDOUT_TOP =
4763 \&                        Passwd File
4764 Name                Login    Office   Uid   Gid Home
4765 ------------------------------------------------------------------
4766 \&.
4767 format STDOUT =
4768 @<<<<<<<<<<<<<<<<<< @||||||| @<<<<<<@>>>> @>>>> @<<<<<<<<<<<<<<<<<
4769 $name,              $login,  $office,$uid,$gid, $home
4770 \&.
4771
4772 .ne 29
4773 # a report from a bug report form
4774 format STDOUT_TOP =
4775 \&                        Bug Reports
4776 @<<<<<<<<<<<<<<<<<<<<<<<     @|||         @>>>>>>>>>>>>>>>>>>>>>>>
4777 $system,                      $%,         $date
4778 ------------------------------------------------------------------
4779 \&.
4780 format STDOUT =
4781 Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4782 \&         $subject
4783 Index: @<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4784 \&       $index,                       $description
4785 Priority: @<<<<<<<<<< Date: @<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4786 \&          $priority,        $date,   $description
4787 From: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4788 \&      $from,                         $description
4789 Assigned to: @<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4790 \&             $programmer,            $description
4791 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4792 \&                                     $description
4793 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4794 \&                                     $description
4795 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4796 \&                                     $description
4797 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4798 \&                                     $description
4799 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<...
4800 \&                                     $description
4801 \&.
4802
4803 .ft R
4804 .cs R
4805 .lg
4806 .fi
4807 It is possible to intermix prints with writes on the same output channel,
4808 but you'll have to handle $\- (lines left on the page) yourself.
4809 .PP
4810 If you are printing lots of fields that are usually blank, you should consider
4811 using the reset operator between records.
4812 Not only is it more efficient, but it can prevent the bug of adding another
4813 field and forgetting to zero it.
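.PP
For a complete program small enough to try, the sketch below declares a format
and its top-of-page format for a filehandle named REPORT and writes two
records.
The filehandle name and the data are hypothetical.
So that the result can be examined, the sketch opens REPORT onto a string,
which assumes a later perl that supports opening a filehandle on a scalar
reference; with an ordinary file or STDOUT the format declarations are the
same.

```perl
$report = '';
# Assumption: this perl can open a filehandle onto a scalar.
open(REPORT, '>', \$report) || die "open: $!";

format REPORT_TOP =
Name       Uid
--------------
.
format REPORT =
@<<<<<<<<< @>>
$name,     $uid
.

foreach $pair ('root:0', 'daemon:1') {
    ($name, $uid) = split(/:/, $pair);
    write(REPORT);              # the header is emitted before the first record
}
close(REPORT);
```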
4814 .Sh "Interprocess Communication"
4815 The IPC facilities of perl are built on the Berkeley socket mechanism.
4816 If you don't have sockets, you can ignore this section.
4817 The calls have the same names as the corresponding system calls,
4818 but the arguments tend to differ, for two reasons.
4819 First, perl file handles work differently than C file descriptors.
4820 Second, perl already knows the length of its strings, so you don't need
4821 to pass that information.
4822 Here is a sample client (untested):
4823 .nf
4824
4825         ($them,$port) = @ARGV;
4826         $port = 2345 unless $port;
4827         $them = 'localhost' unless $them;
4828
4829         $SIG{'INT'} = 'dokill';
4830         sub dokill { kill 9,$child if $child; }
4831
4832         require 'sys/socket.ph';
4833
4834         $sockaddr = 'S n a4 x8';
4835         chop($hostname = `hostname`);
4836
4837         ($name, $aliases, $proto) = getprotobyname('tcp');
4838         ($name, $aliases, $port) = getservbyname($port, 'tcp')
4839                 unless $port =~ /^\ed+$/;
4840 .ie t \{\
4841         ($name, $aliases, $type, $len, $thisaddr) = gethostbyname($hostname);
4842 'br\}
4843 .el \{\
4844         ($name, $aliases, $type, $len, $thisaddr) =
4845                                         gethostbyname($hostname);
4846 'br\}
4847         ($name, $aliases, $type, $len, $thataddr) = gethostbyname($them);
4848
4849         $this = pack($sockaddr, &AF_INET, 0, $thisaddr);
4850         $that = pack($sockaddr, &AF_INET, $port, $thataddr);
4851
4852         socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!";
4853         bind(S, $this) || die "bind: $!";
4854         connect(S, $that) || die "connect: $!";
4855
4856         select(S); $| = 1; select(stdout);
4857
4858         if ($child = fork) {
4859                 while (<>) {
4860                         print S;
4861                 }
4862                 sleep 3;
4863                 do dokill();
4864         }
4865         else {
4866                 while (<S>) {
4867                         print;
4868                 }
4869         }
4870
4871 .fi
4872 And here's a server:
4873 .nf
4874
4875         ($port) = @ARGV;
4876         $port = 2345 unless $port;
4877
4878         require 'sys/socket.ph';
4879
4880         $sockaddr = 'S n a4 x8';
4881
4882         ($name, $aliases, $proto) = getprotobyname('tcp');
4883         ($name, $aliases, $port) = getservbyname($port, 'tcp')
4884                 unless $port =~ /^\ed+$/;
4885
4886         $this = pack($sockaddr, &AF_INET, $port, "\e0\e0\e0\e0");
4887
4888         select(NS); $| = 1; select(stdout);
4889
4890         socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!";
4891         bind(S, $this) || die "bind: $!";
4892         listen(S, 5) || die "listen: $!";
4893
4894         select(S); $| = 1; select(stdout);
4895
4896         for (;;) {
4897                 print "Listening again\en";
4898                 ($addr = accept(NS,S)) || die $!;
4899                 print "accept ok\en";
4900
4901                 ($af,$port,$inetaddr) = unpack($sockaddr,$addr);
4902                 @inetaddr = unpack('C4',$inetaddr);
4903                 print "$af $port @inetaddr\en";
4904
4905                 while (<NS>) {
4906                         print;
4907                         print NS;
4908                 }
4909         }
4910
4911 .fi
4912 .Sh "Predefined Names"
4913 The following names have special meaning to
4914 .IR perl .
4915 I could have used alphabetic symbols for some of these, but I didn't want
4916 to take the chance that someone would say reset \*(L"a\-zA\-Z\*(R" and wipe them all
4917 out.
4918 You'll just have to suffer along with these silly symbols.
4919 Most of them have reasonable mnemonics, or analogues in one of the shells.
4920 .Ip $_ 8
4921 The default input and pattern-searching space.
4922 The following pairs are equivalent:
4923 .nf
4924
4925 .ne 2
4926         while (<>) {\|.\|.\|.   # only equivalent in while!
4927         while ($_ = <>) {\|.\|.\|.
4928
4929 .ne 2
4930         /\|^Subject:/
4931         $_ \|=~ \|/\|^Subject:/
4932
4933 .ne 2
4934         y/a\-z/A\-Z/
4935         $_ =~ y/a\-z/A\-Z/
4936
4937 .ne 2
4938         chop
4939         chop($_)
4940
4941 .fi
4942 (Mnemonic: underline is understood in certain operations.)
4943 .Ip $. 8
4944 The current input line number of the last filehandle that was read.
4945 Read-only.
4946 Remember that only an explicit close on the filehandle resets the line number.
4947 Since <> never does an explicit close, line numbers increase across ARGV files
4948 (but see examples under eof).
4949 (Mnemonic: many programs use . to mean the current line number.)
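.Sp
As a minimal sketch, the following loop numbers the lines of each file
named on the command line separately.
It assumes you want $. to restart at 1 for every file, which is why it
closes ARGV explicitly at each end of file:
.nf

.ne 4
        while (<>) {
                print "$ARGV:$.:$_";
                close(ARGV) if eof;     # explicit close resets $.
        }

.fi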
4950 .Ip $/ 8
4951 The input record separator, newline by default.
4952 Works like
4953 .IR awk 's
4954 RS variable, including treating blank lines as delimiters
4955 if set to the null string.
4956 You may set it to a multicharacter string to match a multi-character
4957 delimiter.
4958 Note that setting it to "\en\en" means something slightly different
4959 than setting it to "", if the file contains consecutive blank lines.
4960 Setting it to "" will treat two or more consecutive blank lines as a single
4961 blank line.
4962 Setting it to "\en\en" will blindly assume that the next input character
4963 belongs to the next paragraph, even if it's a newline.
4964 (Mnemonic: / is used to delimit line boundaries when quoting poetry.)
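.Sp
To make the difference concrete, here is a small sketch that counts
paragraphs on the standard input (or in the files named on the command line).
The variable name $paragraphs is only illustrative:
.nf

.ne 6
        $/ = "";                # paragraph mode: runs of blank lines count once
        $paragraphs = 0;
        while (<>) {
                $paragraphs++;
        }
        print "$paragraphs paragraphs\en";

.fi
With $/ set to "\en\en" instead, any extra newlines between paragraphs
would show up at the start of the following record, or as records of
their own, so the count could come out higher.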
4965 .Ip $, 8
4966 The output field separator for the print operator.
4967 Ordinarily the print operator simply prints out the comma separated fields
4968 you specify.
4969 In order to get behavior more like
4970 .IR awk ,
4971 set this variable as you would set
4972 .IR awk 's
4973 OFS variable to specify what is printed between fields.
4974 (Mnemonic: what is printed when there is a , in your print statement.)
4975 .Ip $"" 8
4976 This is like $, except that it applies to array values interpolated into
4977 a double-quoted string (or similar interpreted string).
4978 Default is a space.
4979 (Mnemonic: obvious, I think.)
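.Sp
For instance, assuming a made-up array called @path:
.nf

.ne 3
        @path = ('usr', 'local', 'lib');
        $" = '/';
        print "/@path\en";      # prints /usr/local/lib

.fi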
4980 .Ip $\e 8
4981 The output record separator for the print operator.
4982 Ordinarily the print operator simply prints out the comma separated fields
4983 you specify, with no trailing newline or record separator assumed.
4984 In order to get behavior more like
4985 .IR awk ,
4986 set this variable as you would set
4987 .IR awk 's
4988 ORS variable to specify what is printed at the end of the print.
4989 (Mnemonic: you set $\e instead of adding \en at the end of the print.
4990 Also, it's just like /, but it's what you get \*(L"back\*(R" from
4991 .IR perl .)
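.Sp
A short sketch of $, and $\e working together, assuming you want
colon-separated fields with one record per line (the field values are
only illustrative):
.nf

.ne 3
        $, = ':';               # printed between the arguments of print
        $\e = "\en";            # printed after the last argument
        print 'root', 0, '/bin/sh';     # prints root:0:/bin/sh and a newline

.fi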
4992 .Ip $# 8
4993 The output format for printed numbers.
4994 This variable is a half-hearted attempt to emulate
4995 .IR awk 's
4996 OFMT variable.
4997 There are times, however, when
4998 .I awk
4999 and
5000 .I perl
5001 have differing notions of what
5002 is in fact numeric.
5003 Also, the initial value is %.20g rather than %.6g, so you need to set $#
5004 explicitly to get
5005 .IR awk 's
5006 value.
5007 (Mnemonic: # is the number sign.)
5008 .Ip $% 8
5009 The current page number of the currently selected output channel.
5010 (Mnemonic: % is page number in nroff.)
5011 .Ip $= 8
5012 The current page length (printable lines) of the currently selected output
5013 channel.
5014 Default is 60.
5015 (Mnemonic: = has horizontal lines.)
5016 .Ip $\- 8
5017 The number of lines left on the page of the currently selected output channel.
5018 (Mnemonic: lines_on_page \- lines_printed.)
5019 .Ip $~ 8
5020 The name of the current report format for the currently selected output
5021 channel.
5022 Default is name of the filehandle.
5023 (Mnemonic: brother to $^.)
5024 .Ip $^ 8
5025 The name of the current top-of-page format for the currently selected output
5026 channel.
5027 Default is name of the filehandle with \*(L"_TOP\*(R" appended.
5028 (Mnemonic: points to top of page.)
5029 .Ip $| 8
5030 If set to nonzero, forces a flush after every write or print on the currently
5031 selected output channel.
5032 Default is 0.
5033 Note that
5034 .I STDOUT
5035 will typically be line buffered if output is to the
5036 terminal and block buffered otherwise.
5037 Setting this variable is useful primarily when you are outputting to a pipe,
5038 such as when you are running a
5039 .I perl
5040 script under rsh and want to see the
5041 output as it's happening.
5042 (Mnemonic: when you want your pipes to be piping hot.)
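.Sp
Since $| applies only to the currently selected channel, one way to
unbuffer a single filehandle is to select it, set $|, and then reselect
the previous handle.
A sketch, assuming a filehandle LOG that you have already opened for output:
.nf

.ne 3
        $oldfh = select(LOG);   # select returns the previously selected handle
        $| = 1;                 # affects LOG only
        select($oldfh);

.fi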
5043 .Ip $$ 8
5044 The process number of the
5045 .I perl
5046 running this script.
5047 (Mnemonic: same as shells.)
5048 .Ip $? 8
5049 The status returned by the last pipe close, backtick (\`\`) command or
5050 .I system
5051 operator.
5052 Note that this is the status word returned by the wait() system
5053 call, so the exit value of the subprocess is actually ($? >> 8).
5054 $? & 255 gives which signal, if any, the process died from, and whether
5055 there was a core dump.
5056 (Mnemonic: similar to sh and ksh.)
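.Sp
A sketch of taking the status word apart, using a subshell that exits
with 3 purely for illustration (the variable names are made up, and a
Unix sh is assumed):
.nf

.ne 5
        system('sh', '-c', 'exit 3');
        $exit   = $? >> 8;      # 3 here
        $signal = $? & 127;     # 0, since no signal was involved
        $core   = $? & 128;     # nonzero only if core was dumped
        print "exit=$exit signal=$signal\en";

.fi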
5057 .Ip $& 8 4
5058 The string matched by the last successful pattern match
5059 (not counting any matches hidden
5060 within a BLOCK or eval enclosed by the current BLOCK).
5061 (Mnemonic: like & in some editors.)
5062 .Ip $\` 8 4
5063 The string preceding whatever was matched by the last successful pattern match
5064 (not counting any matches hidden within a BLOCK or eval enclosed by the current
5065 BLOCK).
5066 (Mnemonic: \` often precedes a quoted string.)
5067 .Ip $\' 8 4
5068 The string following whatever was matched by the last successful pattern match
5069 (not counting any matches hidden within a BLOCK or eval enclosed by the current
5070 BLOCK).
5071 (Mnemonic: \' often follows a quoted string.)
5072 Example:
5073 .nf
5074
5075 .ne 3
5076         $_ = \'abcdefghi\';
5077         /def/;
5078         print "$\`:$&:$\'\en";          # prints abc:def:ghi
5079
5080 .fi
5081 .Ip $+ 8 4
5082 The last bracket matched by the last search pattern.
5083 This is useful if you don't know which of a set of alternative patterns
5084 matched.
5085 For example:
5086 .nf
5087
5088     /Version: \|(.*\|)|Revision: \|(.*\|)\|/ \|&& \|($rev = $+);
5089
5090 .fi
5091 (Mnemonic: be positive and forward looking.)
5092 .Ip $* 8 2
5093 Set to 1 to do multiline matching within a string, 0 to tell
5094 .I perl
5095 that it can assume that strings contain a single line, for the purpose
5096 of optimizing pattern matches.
5097 Pattern matches on strings containing multiple newlines can produce confusing
5098 results when $* is 0.
5099 Default is 0.
5100 (Mnemonic: * matches multiple things.)
5101 Note that this variable only influences the interpretation of ^ and $.
5102 A literal newline can be searched for even when $* == 0.
5103 .Ip $0 8
5104 Contains the name of the file containing the
5105 .I perl
5106 script being executed.
5107 Assigning to $0 modifies the argument area that the ps(1) program sees.
5108 (Mnemonic: same as sh and ksh.)
5109 .Ip $<digit> 8
5110 Contains the subpattern from the corresponding set of parentheses in the last
5111 pattern matched, not counting patterns matched in nested blocks that have
5112 been exited already.
5113 (Mnemonic: like \edigit.)
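.Sp
A small sketch, using a made-up date string:
.nf

.ne 3
        if ('92/08/07' =~ m#(\ed+)/(\ed+)/(\ed+)#) {
                print "year $1, month $2, day $3\en";
        }

.fi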
5114 .Ip $[ 8 2
5115 The index of the first element in an array, and of the first character in
5116 a substring.
5117 Default is 0, but you could set it to 1 to make
5118 .I perl
5119 behave more like
5120 .I awk
5121 (or Fortran)
5122 when subscripting and when evaluating the index() and substr() functions.
5123 (Mnemonic: [ begins subscripts.)
5124 .Ip $] 8 2
5125 The string printed out when you say \*(L"perl -v\*(R".
5126 It can be used to determine at the beginning of a script whether the perl
5127 interpreter executing the script is in the right range of versions.
5128 If used in a numeric context, returns the version + patchlevel / 1000.
5129 Example:
5130 .nf
5131
5132 .ne 8
5133         # see if getc is available
5134         ($version,$patchlevel) =
5135                  $] =~ /(\ed+\e.\ed+).*\enPatch level: (\ed+)/;
5136         print STDERR "(No filename completion available.)\en"
5137                  if $version * 1000 + $patchlevel < 2016;
5138
5139 or, used numerically,
5140
5141         warn "No checksumming!\en" if $] < 3.019;
5142
5143 .fi
5144 (Mnemonic: Is this version of perl in the right bracket?)
5145 .Ip $; 8 2
5146 The subscript separator for multi-dimensional array emulation.
5147 If you refer to an associative array element as
5148 .nf
5149         $foo{$a,$b,$c}
5150
5151 it really means
5152
5153         $foo{join($;, $a, $b, $c)}
5154
5155 But don't put
5156
5157         @foo{$a,$b,$c}          # a slice\*(--note the @
5158
5159 which means
5160
5161         ($foo{$a},$foo{$b},$foo{$c})
5162
5163 .fi
5164 Default is "\e034", the same as SUBSEP in
5165 .IR awk .
5166 Note that if your keys contain binary data there might not be any safe
5167 value for $;.
5168 (Mnemonic: comma (the syntactic subscript separator) is a semi-semicolon.
5169 Yeah, I know, it's pretty lame, but $, is already taken for something more
5170 important.)
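.Sp
To recover the individual subscripts later, you can split the key on $;.
A sketch, with %grid as an illustrative array name and $; left at its
default value:
.nf

.ne 6
        $grid{1,2} = 'x';
        $grid{3,4} = 'o';
        foreach $key (sort keys(%grid)) {
                ($row, $col) = split($;, $key);
                print "row $row col $col holds $grid{$key}\en";
        }

.fi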
5171 .Ip $! 8 2
5172 If used in a numeric context, yields the current value of errno, with all the
5173 usual caveats.
5174 (This means that you shouldn't depend on the value of $! to be anything
5175 in particular unless you've gotten a specific error return indicating a
5176 system error.)
5177 If used in a string context, yields the corresponding system error string.
5178 You can assign to $! in order to set errno
5179 if, for instance, you want $! to return the string for error n, or you want
5180 to set the exit value for the die operator.
5181 (Mnemonic: What just went bang?)
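.Sp
A sketch showing both contexts at once; the pathname is deliberately one
that is assumed not to exist:
.nf

.ne 2
        open(FH, '/no/such/file') ||
                printf("open failed: errno %d (%s)\en", $!, $!);

.fi
Here %d asks for the numeric value and %s for the string.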
5182 .Ip $@ 8 2
5183 The perl syntax error message from the last eval command.
5184 If null, the last eval parsed and executed correctly (although the operations
5185 you invoked may have failed in the normal fashion).
5186 (Mnemonic: Where was the syntax error \*(L"at\*(R"?)
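.Sp
A sketch of the usual test, using a string that is deliberately not
valid perl:
.nf

.ne 2
        eval 'print (;';
        print "eval failed: $@" if $@;

.fi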
5187 .Ip $< 8 2
5188 The real uid of this process.
5189 (Mnemonic: it's the uid you came FROM, if you're running setuid.)
5190 .Ip $> 8 2
5191 The effective uid of this process.
5192 Example:
5193 .nf
5194
5195 .ne 2
5196         $< = $>;        # set real uid to the effective uid
5197         ($<,$>) = ($>,$<);      # swap real and effective uid
5198
5199 .fi
5200 (Mnemonic: it's the uid you went TO, if you're running setuid.)
5201 Note: $< and $> can only be swapped on machines supporting setreuid().
5202 .Ip $( 8 2
5203 The real gid of this process.
5204 If you are on a machine that supports membership in multiple groups
5205 simultaneously, gives a space separated list of groups you are in.
5206 The first number is the one returned by getgid(), and the subsequent ones
5207 by getgroups(), one of which may be the same as the first number.
5208 (Mnemonic: parentheses are used to GROUP things.
5209 The real gid is the group you LEFT, if you're running setgid.)
5210 .Ip $) 8 2
5211 The effective gid of this process.
5212 If you are on a machine that supports membership in multiple groups
5213 simultaneously, gives a space separated list of groups you are in.
5214 The first number is the one returned by getegid(), and the subsequent ones
5215 by getgroups(), one of which may be the same as the first number.
5216 (Mnemonic: parentheses are used to GROUP things.
5217 The effective gid is the group that's RIGHT for you, if you're running setgid.)
5218 .Sp
5219 Note: $<, $>, $( and $) can only be set on machines that support the
5220 corresponding set[re][ug]id() routine.
5221 $( and $) can only be swapped on machines supporting setregid().
5222 .Ip $: 8 2
5223 The current set of characters after which a string may be broken to
5224 fill continuation fields (starting with ^) in a format.
5225 Default is "\ \en-", to break on whitespace or hyphens.
5226 (Mnemonic: a \*(L"colon\*(R" in poetry is a part of a line.)
5227 .Ip $^D 8 2
5228 The current value of the debugging flags.
5229 (Mnemonic: value of
5230 .B \-D
5231 switch.)
5232 .Ip $^F 8 2
5233 The maximum system file descriptor, ordinarily 2.  System file descriptors
5234 are passed to subprocesses, while higher file descriptors are not.
5235 During an open, system file descriptors are preserved even if the open
5236 fails.  Ordinary file descriptors are closed before the open is attempted.
5237 .Ip $^I 8 2
5238 The current value of the inplace-edit extension.
5239 Use undef to disable inplace editing.
5240 (Mnemonic: value of
5241 .B \-i
5242 switch.)
5243 .Ip $^L 8 2
5244 What formats output to perform a formfeed.  Default is \ef.
5245 .Ip $^P 8 2
5246 The internal flag that the debugger clears so that it doesn't
5247 debug itself.  You could conceivably disable debugging yourself
5248 by clearing it.
5249 .Ip $^T 8 2
5250 The time at which the script began running, in seconds since the epoch.
5251 The values returned by the
5252 .BR \-M ,
5253 .B \-A
5254 and
5255 .B \-C
5256 filetests are based on this value.
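.Sp
For instance, a sketch that reports how long the script has been running:
.nf

        printf("Running for %d seconds\en", time \- $^T);

.fi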
5257 .Ip $^W 8 2
5258 The current value of the warning switch.
5259 (Mnemonic: related to the
5260 .B \-w
5261 switch.)
5262 .Ip $^X 8 2
5263 The name that Perl itself was executed as, from argv[0].
5264 .Ip $ARGV 8 3
5265 Contains the name of the current file when reading from <>.
5266 .Ip @ARGV 8 3
5267 The array ARGV contains the command line arguments intended for the script.
5268 Note that $#ARGV is generally the number of arguments minus one, since
5269 $ARGV[0] is the first argument, NOT the command name.
5270 See $0 for the command name.
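.Sp
For example, a script that needs at least one argument might begin like
this sketch (the usage text is only illustrative, and $[ is assumed to
be 0):
.nf

.ne 2
        die "Usage: $0 file ...\en" if $#ARGV < 0;
        print "Got ", $#ARGV + 1, " arguments, the first is $ARGV[0]\en";

.fi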
5271 .Ip @INC 8 3
5272 The array INC contains the list of places to look for
5273 .I perl
5274 scripts to be
5275 evaluated by the \*(L"do EXPR\*(R" command or the \*(L"require\*(R" command.
5276 It initially consists of the arguments to any
5277 .B \-I
5278 command line switches, followed
5279 by the default
5280 .I perl
5281 library, probably \*(L"/usr/local/lib/perl\*(R",
5282 followed by \*(L".\*(R", to represent the current directory.
5283 .Ip %INC 8 3
5284 The associative array INC contains entries for each filename that has
5285 been included via \*(L"do\*(R" or \*(L"require\*(R".
5286 The key is the filename you specified, and the value is the location of
5287 the file actually found.
5288 The \*(L"require\*(R" command uses this array to determine whether
5289 a given file has already been included.
5290 .Ip $ENV{expr} 8 2
5291 The associative array ENV contains your current environment.
5292 Setting a value in ENV changes the environment for child processes.
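.Sp
A sketch, with GREETING as a made-up variable name:
.nf

.ne 2
        $ENV{'GREETING'} = 'hello';
        system('echo $GREETING');       # the child shell prints hello

.fi
The single quotes keep perl from interpolating $GREETING itself, so it is
the subshell that looks the value up in its environment.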
5293 .Ip $SIG{expr} 8 2
5294 The associative array SIG is used to set signal handlers for various signals.
5295 Example:
5296 .nf
5297
5298 .ne 12
5299         sub handler {   # 1st argument is signal name
5300                 local($sig) = @_;
5301                 print "Caught a SIG$sig\-\|\-shutting down\en";
5302                 close(LOG);
5303                 exit(0);
5304         }
5305
5306         $SIG{\'INT\'} = \'handler\';
5307         $SIG{\'QUIT\'} = \'handler\';
5308         .\|.\|.
5309         $SIG{\'INT\'} = \'DEFAULT\';    # restore default action
5310         $SIG{\'QUIT\'} = \'IGNORE\';    # ignore SIGQUIT
5311
5312 .fi
5313 The SIG array only contains values for the signals actually set within
5314 the perl script.
5315 .Sh "Packages"
5316 Perl provides a mechanism for alternate namespaces to protect packages from
5317 stomping on each other's variables.
5318 By default, a perl script starts compiling into the package known as \*(L"main\*(R".
5319 By use of the
5320 .I package
5321 declaration, you can switch namespaces.
5322 The scope of the package declaration is from the declaration itself to the end
5323 of the enclosing block (the same scope as the local() operator).
5324 Typically it would be the first declaration in a file to be included by
5325 the \*(L"require\*(R" operator.
5326 You can switch into a package in more than one place; it merely influences
5327 which symbol table is used by the compiler for the rest of that block.
5328 You can refer to variables and filehandles in other packages by prefixing
5329 the identifier with the package name and a single quote.
5330 If the package name is null, the \*(L"main\*(R" package is assumed.
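.PP
A minimal sketch of qualified names in the single-quote notation
described here, using an invented package called \*(L"counter\*(R":
.nf

.ne 7
        package counter;
        $count = 5;             # really $counter'count

        package main;
        $count = 1;             # really $main'count
        $total = $counter'count + $main'count;
        print "total is $total\en";     # prints total is 6

.fi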
5331 .PP
5332 Only identifiers starting with letters are stored in the package's symbol
5333 table.
5334 All other symbols are kept in package \*(L"main\*(R".
5335 In addition, the identifiers STDIN, STDOUT, STDERR, ARGV, ARGVOUT, ENV, INC
5336 and SIG are forced to be in package \*(L"main\*(R", even when used for
5337 other purposes than their built-in one.
5338 Note also that, if you have a package called \*(L"m\*(R", \*(L"s\*(R"
5339 or \*(L"y\*(R", then you can't use the qualified form of an identifier since it
5340 will be interpreted instead as a pattern match, a substitution
5341 or a translation.
5342 .PP
5343 Eval'ed strings are compiled in the package in which the eval was
5344 compiled.
5345 (Assignments to $SIG{}, however, assume the signal handler specified is in the
5346 main package.
5347 Qualify the signal handler name if you wish to have a signal handler in
5348 a package.)
5349 For an example, examine perldb.pl in the perl library.
5350 It initially switches to the DB package so that the debugger doesn't interfere
5351 with variables in the script you are trying to debug.
5352 At various points, however, it temporarily switches back to the main package
5353 to evaluate various expressions in the context of the main package.
5354 .PP
5355 The symbol table for a package happens to be stored in the associative array
5356 of that name prepended with an underscore.
5357 The value in each entry of the associative array is
5358 what you are referring to when you use the *name notation.
5359 In fact, the following have the same effect (in package main, anyway),
5360 though the first is more
5361 efficient because it does the symbol table lookups at compile time:
5362 .nf
5363
5364 .ne 2
5365         local(*foo) = *bar;
5366         local($_main{'foo'}) = $_main{'bar'};
5367
5368 .fi
5369 You can use this to print out all the variables in a package, for instance.
5370 Here is dumpvar.pl from the perl library:
5371 .nf
5372 .ne 11
5373         package dumpvar;
5374
5375         sub main'dumpvar {
5376         \&    ($package) = @_;
5377         \&    local(*stab) = eval("*_$package");
5378         \&    while (($key,$val) = each(%stab)) {
5379         \&        {
5380         \&            local(*entry) = $val;
5381         \&            if (defined $entry) {
5382         \&                print "\e$$key = '$entry'\en";
5383         \&            }
5384 .ne 7
5385         \&            if (defined @entry) {
5386         \&                print "\e@$key = (\en";
5387         \&                foreach $num ($[ .. $#entry) {
5388         \&                    print "  $num\et'",$entry[$num],"'\en";
5389         \&                }
5390         \&                print ")\en";
5391         \&            }
5392 .ne 10
5393         \&            if ($key ne "_$package" && defined %entry) {
5394         \&                print "\e%$key = (\en";
5395         \&                foreach $key (sort keys(%entry)) {
5396         \&                    print "  $key\et'",$entry{$key},"'\en";
5397         \&                }
5398         \&                print ")\en";
5399         \&            }
5400         \&        }
5401         \&    }
5402         }
5403
5404 .fi
5405 Note that, even though the subroutine is compiled in package dumpvar, the
5406 name of the subroutine is qualified so that its name is inserted into package
5407 \*(L"main\*(R".
5408 .Sh "Style"
5409 Each programmer will, of course, have his or her own preferences in regard
5410 to formatting, but there are some general guidelines that will make your
5411 programs easier to read.
5412 .Ip 1. 4 4
5413 Just because you CAN do something a particular way doesn't mean that
5414 you SHOULD do it that way.
5415 .I Perl
5416 is designed to give you several ways to do anything, so consider picking
5417 the most readable one.
5418 For instance
5419
5420         open(FOO,$foo) || die "Can't open $foo: $!";
5421
5422 is better than
5423
5424         die "Can't open $foo: $!" unless open(FOO,$foo);
5425
5426 because the second way hides the main point of the statement in a
5427 modifier.
5428 On the other hand
5429
5430         print "Starting analysis\en" if $verbose;
5431
5432 is better than
5433
5434         $verbose && print "Starting analysis\en";
5435
5436 since the main point isn't whether the user typed -v or not.
5437 .Sp
5438 Similarly, just because an operator lets you assume default arguments
5439 doesn't mean that you have to make use of the defaults.
5440 The defaults are there for lazy systems programmers writing one-shot
5441 programs.
5442 If you want your program to be readable, consider supplying the argument.
5443 .Sp
5444 Along the same lines, just because you
5445 .I can
5446 omit parentheses in many places doesn't mean that you ought to:
5447 .nf
5448
5449         return print reverse sort num values array;
5450         return print(reverse(sort num (values(%array))));
5451
5452 .fi
5453 When in doubt, parenthesize.
5454 At the very least it will let some poor schmuck bounce on the % key in vi.
5455 .Sp
5456 Even if you aren't in doubt, consider the mental welfare of the person who
5457 has to maintain the code after you, and who will probably put parens in
5458 the wrong place.
5459 .Ip 2. 4 4
5460 Don't go through silly contortions to exit a loop at the top or the
5461 bottom, when
5462 .I perl
5463 provides the "last" operator so you can exit in the middle.
5464 Just outdent it a little to make it more visible:
5465 .nf
5466
5467 .ne 7
5468     line:
5469         for (;;) {
5470             statements;
5471         last line if $foo;
5472             next line if /^#/;
5473             statements;
5474         }
5475
5476 .fi
5477 .Ip 3. 4 4
5478 Don't be afraid to use loop labels\*(--they're there to enhance readability as
5479 well as to allow multi-level loop breaks.
5480 See last example.
5481 .Ip 4. 4 4
5482 For portability, when using features that may not be implemented on every
5483 machine, test the construct in an eval to see if it fails.
5484 If you know the version or patchlevel in which a particular feature was implemented,
5485 you can test $] to see if it will be there.
5486 .Ip 5. 4 4
5487 Choose mnemonic identifiers.
5488 .Ip 6. 4 4
5489 Be consistent.
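.PP
As a sketch of guideline 4, the following tests whether symlink() is
implemented on the machine at hand by trying it inside an eval.
The variable name $has_symlink is only illustrative:
.nf

.ne 2
        $has_symlink = (eval 'symlink("", "");', $@ eq '');
        print "No symbolic links here\en" unless $has_symlink;

.fi
The call itself is expected to fail harmlessly where symlink() exists;
what matters is whether the eval left an error message in $@.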
5490 .Sh "Debugging"
5491 If you invoke
5492 .I perl
5493 with a
5494 .B \-d
5495 switch, your script will be run under a debugging monitor.
5496 It will halt before the first executable statement and ask you for a
5497 command, such as:
5498 .Ip "h" 12 4
5499 Prints out a help message.
5500 .Ip "T" 12 4
5501 Stack trace.
5502 .Ip "s" 12 4
5503 Single step.
5504 Executes until it reaches the beginning of another statement.
5505 .Ip "n" 12 4
5506 Next.
5507 Executes over subroutine calls, until it reaches the beginning of the
5508 next statement.
5509 .Ip "f" 12 4
5510 Finish.
5511 Executes statements until it has finished the current subroutine.
5512 .Ip "c" 12 4
5513 Continue.
5514 Executes until the next breakpoint is reached.
5515 .Ip "c line" 12 4
5516 Continue to the specified line.
5517 Inserts a one-time-only breakpoint at the specified line.
5518 .Ip "<CR>" 12 4
5519 Repeat last n or s.
5520 .Ip "l min+incr" 12 4
5521 List incr+1 lines starting at min.
5522 If min is omitted, starts where last listing left off.
5523 If incr is omitted, previous value of incr is used.
5524 .Ip "l min-max" 12 4
5525 List lines in the indicated range.
5526 .Ip "l line" 12 4
5527 List just the indicated line.
5528 .Ip "l" 12 4
5529 List next window.
5530 .Ip "-" 12 4
5531 List previous window.
5532 .Ip "w line" 12 4
5533 List window around line.
5534 .Ip "l subname" 12 4
5535 List subroutine.
5536 If it's a long subroutine it just lists the beginning.
5537 Use \*(L"l\*(R" to list more.
5538 .Ip "/pattern/" 12 4
5539 Regular expression search forward for pattern; the final / is optional.
5540 .Ip "?pattern?" 12 4
5541 Regular expression search backward for pattern; the final ? is optional.
5542 .Ip "L" 12 4
5543 List lines that have breakpoints or actions.
5544 .Ip "S" 12 4
5545 Lists the names of all subroutines.
5546 .Ip "t" 12 4
5547 Toggle trace mode on or off.
5548 .Ip "b line condition" 12 4
5549 Set a breakpoint.
5550 If line is omitted, sets a breakpoint on the
5551 line that is about to be executed.
5552 If a condition is specified, it is evaluated each time the statement is
5553 reached and a breakpoint is taken only if the condition is true.
5554 Breakpoints may only be set on lines that begin an executable statement.
5555 .Ip "b subname condition" 12 4
5556 Set breakpoint at first executable line of subroutine.
5557 .Ip "d line" 12 4
5558 Delete breakpoint.
5559 If line is omitted, deletes the breakpoint on the
5560 line that is about to be executed.
5561 .Ip "D" 12 4
5562 Delete all breakpoints.
5563 .Ip "a line command" 12 4
5564 Set an action for line.
5565 A multi-line command may be entered by backslashing the newlines.
5566 .Ip "A" 12 4
5567 Delete all line actions.
5568 .Ip "< command" 12 4
5569 Set an action to happen before every debugger prompt.
5570 A multi-line command may be entered by backslashing the newlines.
5571 .Ip "> command" 12 4
5572 Set an action to happen after the prompt when you've just given a command
5573 to return to executing the script.
5574 A multi-line command may be entered by backslashing the newlines.
5575 .Ip "V package" 12 4
5576 List all variables in package.
5577 Default is main package.
5578 .Ip "! number" 12 4
5579 Redo a debugging command.
5580 If number is omitted, redoes the previous command.
5581 .Ip "! -number" 12 4
5582 Redo the command that was that many commands ago.
5583 .Ip "H -number" 12 4
5584 Display the last number commands.
5585 Only commands longer than one character are listed.
5586 If number is omitted, lists them all.
5587 .Ip "q or ^D" 12 4
5588 Quit.
5589 .Ip "command" 12 4
5590 Execute command as a perl statement.
5591 A missing semicolon will be supplied.
5592 .Ip "p expr" 12 4
5593 Same as \*(L"print DB'OUT expr\*(R".
5594 The DB'OUT filehandle is opened to /dev/tty, regardless of where STDOUT
5595 may be redirected to.
5596 .PP
5597 If you want to modify the debugger, copy perldb.pl from the perl library
5598 to your current directory and modify it as necessary.
5599 (You'll also have to put -I. on your command line.)
5600 You can do some customization by setting up a .perldb file which contains
5601 initialization code.
5602 For instance, you could make aliases like these:
5603 .nf
5604
5605     $DB'alias{'len'} = 's/^len(.*)/p length($1)/';
5606     $DB'alias{'stop'} = 's/^stop (at|in)/b/';
5607     $DB'alias{'.'} =
5608       's/^\e./p "\e$DB\e'sub(\e$DB\e'line):\et",\e$DB\e'line[\e$DB\e'line]/';
5609
5610 .fi
5611 .Sh "Setuid Scripts"
5612 .I Perl
5613 is designed to make it easy to write secure setuid and setgid scripts.
5614 Unlike shells, which are based on multiple substitution passes on each line
5615 of the script,
5616 .I perl
5617 uses a more conventional evaluation scheme with fewer hidden \*(L"gotchas\*(R".
5618 Additionally, since the language has more built-in functionality, it
5619 has to rely less upon external (and possibly untrustworthy) programs to
5620 accomplish its purposes.
5621 .PP
5622 In an unpatched 4.2 or 4.3bsd kernel, setuid scripts are intrinsically
5623 insecure, but this kernel feature can be disabled.
5624 If it is,
5625 .I perl
5626 can emulate the setuid and setgid mechanism when it notices the otherwise
5627 useless setuid/gid bits on perl scripts.
5628 If the kernel feature isn't disabled,
5629 .I perl
5630 will complain loudly that your setuid script is insecure.
5631 You'll need to either disable the kernel setuid script feature, or put
5632 a C wrapper around the script.
5633 .PP
5634 When perl is executing a setuid script, it takes special precautions to
5635 prevent you from falling into any obvious traps.
5636 (In some ways, a perl script is more secure than the corresponding
5637 C program.)
5638 Any command line argument, environment variable, or input is marked as
5639 \*(L"tainted\*(R", and may not be used, directly or indirectly, in any
5640 command that invokes a subshell, or in any command that modifies files,
5641 directories or processes.
5642 Any variable that is set within an expression that has previously referenced
5643 a tainted value also becomes tainted (even if it is logically impossible
5644 for the tainted value to influence the variable).
5645 For example:
5646 .nf
5647
5648 .ne 5
5649         $foo = shift;                   # $foo is tainted
5650         $bar = $foo,\'bar\';            # $bar is also tainted
5651         $xxx = <>;                      # Tainted
5652         $path = $ENV{\'PATH\'}; # Tainted, but see below
5653         $abc = \'abc\';                 # Not tainted
5654
5655 .ne 4
5656         system "echo $foo";             # Insecure
5657         system "/bin/echo", $foo;       # Secure (doesn't use sh)
5658         system "echo $bar";             # Insecure
5659         system "echo $abc";             # Insecure until PATH set
5660
5661 .ne 5
5662         $ENV{\'PATH\'} = \'/bin:/usr/bin\';
5663         $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\';
5664
5665         $path = $ENV{\'PATH\'}; # Not tainted
5666         system "echo $abc";             # Is secure now!
5667
5668 .ne 5
5669         open(FOO,"$foo");               # OK
5670         open(FOO,">$foo");              # Not OK
5671
5672         open(FOO,"echo $foo|"); # Not OK, but...
5673         open(FOO,"-|") || exec \'echo\', $foo;  # OK
5674
5675         $zzz = `echo $foo`;             # Insecure, zzz tainted
5676
5677         unlink $abc,$foo;               # Insecure
5678         umask $foo;                     # Insecure
5679
5680 .ne 3
5681         exec "echo $foo";               # Insecure
5682         exec "echo", $foo;              # Secure (doesn't use sh)
5683         exec "sh", \'-c\', $foo;        # Considered secure, alas
5684
5685 .fi
5686 The taintedness is associated with each scalar value, so some elements
5687 of an array can be tainted, and others not.
5688 .PP
5689 If you try to do something insecure, you will get a fatal error saying
5690 something like \*(L"Insecure dependency\*(R" or \*(L"Insecure PATH\*(R".
5691 Note that you can still write an insecure system call or exec,
5692 but only by explicitly doing something like the last example above.
5693 You can also bypass the tainting mechanism by referencing
5694 subpatterns\*(--\c
5695 .I perl
5696 presumes that if you reference a substring using $1, $2, etc, you knew
5697 what you were doing when you wrote the pattern:
5698 .nf
5699
5700         $ARGV[0] =~ /^\-P(\ew+)$/;
5701         $printer = $1;          # Not tainted
5702
5703 .fi
5704 This is fairly secure since \ew+ doesn't match shell metacharacters.
5705 Use of .+ would have been insecure, but
5706 .I perl
5707 doesn't check for that, so you must be careful with your patterns.
5708 This is the ONLY mechanism for untainting user supplied filenames if you
5709 want to do file operations on them (unless you make $> equal to $<).
5710 .PP
5711 It's also possible to get into trouble with other operations that don't care
5712 whether they use tainted values.
5713 Make judicious use of the file tests in dealing with any user-supplied
5714 filenames.
5715 When possible, do opens and such after setting $> = $<.
5716 .I Perl
5717 doesn't prevent you from opening tainted filenames for reading, so be
5718 careful what you print out.
5719 The tainting mechanism is intended to prevent stupid mistakes, not to remove
5720 the need for thought.
5721 .SH ENVIRONMENT
5722 .Ip HOME 12 4
5723 Used if chdir has no argument.
5724 .Ip LOGDIR 12 4
5725 Used if chdir has no argument and HOME is not set.
5726 .Ip PATH 12 4
5727 Used in executing subprocesses, and in finding the script if \-S
5728 is used.
5729 .Ip PERLLIB 12 4
5730 A colon-separated list of directories in which to look for Perl library
5731 files before looking in the standard library and the current directory.
5732 .Ip PERLDB 12 4
5733 The command used to get the debugger code.  If unset, uses
5734 .br
5735
5736         require 'perldb.pl'
5737
5738 .PP
5739 Apart from these,
5740 .I perl
5741 uses no other environment variables, except to make them available
5742 to the script being executed, and to child processes.
5743 However, scripts running setuid would do well to execute the following lines
5744 before doing anything else, just to keep people honest:
5745 .nf
5746
5747 .ne 3
5748     $ENV{\'PATH\'} = \'/bin:/usr/bin\';    # or whatever you need
5749     $ENV{\'SHELL\'} = \'/bin/sh\' if $ENV{\'SHELL\'} ne \'\';
5750     $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\';
5751
5752 .fi
5753 .SH AUTHOR
5754 Larry Wall <lwall@netlabs.com>
5755 .br
5756 MS-DOS port by Diomidis Spinellis <dds@cc.ic.ac.uk>
5757 .SH FILES
5758 /tmp/perl\-eXXXXXX      temporary file for
5759 .B \-e
5760 commands.
5761 .SH SEE ALSO
5762 a2p     awk to perl translator
5763 .br
5764 s2p     sed to perl translator
5765 .SH DIAGNOSTICS
5766 Compilation errors will tell you the line number of the error, with an
5767 indication of the next token or token type that was to be examined.
5768 (In the case of a script passed to
5769 .I perl
5770 via
5771 .B \-e
5772 switches, each
5773 .B \-e
5774 is counted as one line.)
5775 .PP
5776 Setuid scripts have additional constraints that can produce error messages
5777 such as \*(L"Insecure dependency\*(R".
5778 See the section on setuid scripts.
5779 .SH TRAPS
5780 Accustomed
5781 .IR awk
5782 users should take special note of the following:
5783 .Ip * 4 2
5784 Semicolons are required after all simple statements in
5785 .I perl
5786 (except at the end of a block).
5787 Newline is not a statement delimiter.
5788 .Ip * 4 2
5789 Curly brackets are required on ifs and whiles.
5790 .Ip * 4 2
5791 Variables begin with $ or @ in
5792 .IR perl .
5793 .Ip * 4 2
5794 Arrays index from 0 unless you set $[.
The same applies to string positions in substr() and index().
5796 .Ip * 4 2
5797 You have to decide whether your array has numeric or string indices.
5798 .Ip * 4 2
5799 Associative array values do not spring into existence upon mere reference.
5800 .Ip * 4 2
5801 You have to decide whether you want to use string or numeric comparisons.
5802 .Ip * 4 2
Reading an input line does not split it for you.  You get to split it yourself
into an array.
5805 And the
5806 .I split
5807 operator has different arguments.
5808 .Ip * 4 2
5809 The current input line is normally in $_, not $0.
5810 It generally does not have the newline stripped.
5811 ($0 is the name of the program executed.)
5812 .Ip * 4 2
5813 $<digit> does not refer to fields\*(--it refers to substrings matched by the last
5814 match pattern.
5815 .Ip * 4 2
5816 The
5817 .I print
5818 statement does not add field and record separators unless you set
5819 $, and $\e.
5820 .Ip * 4 2
5821 You must open your files before you print to them.
5822 .Ip * 4 2
5823 The range operator is \*(L".\|.\*(R", not comma.
5824 (The comma operator works as in C.)
5825 .Ip * 4 2
5826 The match operator is \*(L"=~\*(R", not \*(L"~\*(R".
5827 (\*(L"~\*(R" is the one's complement operator, as in C.)
5828 .Ip * 4 2
5829 The exponentiation operator is \*(L"**\*(R", not \*(L"^\*(R".
5830 (\*(L"^\*(R" is the XOR operator, as in C.)
5831 .Ip * 4 2
5832 The concatenation operator is \*(L".\*(R", not the null string.
5833 (Using the null string would render \*(L"/pat/ /pat/\*(R" unparsable,
5834 since the third slash would be interpreted as a division operator\*(--the
5835 tokener is in fact slightly context sensitive for operators like /, ?, and <.
5836 And in fact, . itself can be the beginning of a number.)
5837 .Ip * 4 2
5838 .IR Next ,
5839 .I exit
5840 and
5841 .I continue
5842 work differently.
5843 .Ip * 4 2
The following variables work differently:
5845 .nf
5846
5847           Awk   \h'|2.5i'Perl
5848           ARGC  \h'|2.5i'$#ARGV
5849           ARGV[0]       \h'|2.5i'$0
5850           FILENAME\h'|2.5i'$ARGV
5851           FNR   \h'|2.5i'$. \- something
5852           FS    \h'|2.5i'(whatever you like)
5853           NF    \h'|2.5i'$#Fld, or some such
5854           NR    \h'|2.5i'$.
5855           OFMT  \h'|2.5i'$#
5856           OFS   \h'|2.5i'$,
5857           ORS   \h'|2.5i'$\e
5858           RLENGTH       \h'|2.5i'length($&)
5859           RS    \h'|2.5i'$/
5860           RSTART        \h'|2.5i'length($\`)
5861           SUBSEP        \h'|2.5i'$;
5862
5863 .fi
5864 .Ip * 4 2
5865 When in doubt, run the
5866 .I awk
5867 construct through a2p and see what it gives you.
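.PP
Several of the awk traps above show up together in a rough translation of the
awk program { print $2, NF } run with FS=":" and OFS="-".
This is an illustrative sketch only; the sample input line and the array name
@Fld are assumptions.
.nf

```perl
$, = '-';                    # awk OFS; print adds no separator unless this is set
$\ = "\n";                   # awk ORS
$line = "root:x:0:0\n";      # stands in for a line read from a filehandle
chop($line);                 # the newline is not stripped automatically
@Fld = split(/:/, $line);    # fields are not split automatically
print $Fld[1], $#Fld + 1;    # awk's $2 and NF; arrays index from 0
```

.fi
With the sample line above this prints x-4.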
5868 .PP
5869 Cerebral C programmers should take note of the following:
5870 .Ip * 4 2
5871 Curly brackets are required on ifs and whiles.
5872 .Ip * 4 2
You should use \*(L"elsif\*(R" rather than \*(L"else if\*(R".
5874 .Ip * 4 2
5875 .I Break
5876 and
5877 .I continue
5878 become
5879 .I last
5880 and
5881 .IR next ,
5882 respectively.
5883 .Ip * 4 2
5884 There's no switch statement.
5885 .Ip * 4 2
5886 Variables begin with $ or @ in
5887 .IR perl .
5888 .Ip * 4 2
5889 Printf does not implement *.
5890 .Ip * 4 2
5891 Comments begin with #, not /*.
5892 .Ip * 4 2
5893 You can't take the address of anything.
5894 .Ip * 4 2
5895 ARGV must be capitalized.
5896 .Ip * 4 2
5897 The \*(L"system\*(R" calls link, unlink, rename, etc. return nonzero for success, not 0.
5898 .Ip * 4 2
5899 Signal handlers deal with signal names, not numbers.
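.PP
A short sketch of how some of the C traps above look in practice.
The loop bounds, the array name @seen and the file path are made up for
illustration.
.nf

```perl
@seen = ();
for ($i = 0; $i < 10; $i++) {
    next if $i % 2;                 # C's continue
    last if $i > 6;                 # C's break
    if ($i == 0) {                  # curly brackets are mandatory
        push(@seen, 'zero');
    }
    elsif ($i == 2) {               # elsif, not else if
        push(@seen, 'two');
    }
    else {
        push(@seen, $i);
    }
}
print join(' ', @seen), "\n";

# unlink returns the number of files removed, so 0 means failure.
$removed = unlink('/no/such/dir/no-such-file');   # hypothetical path
print "nothing removed\n" unless $removed;
```

.fi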
5900 .PP
5901 Seasoned
5902 .I sed
5903 programmers should take note of the following:
5904 .Ip * 4 2
5905 Backreferences in substitutions use $ rather than \e.
5906 .Ip * 4 2
5907 The pattern matching metacharacters (, ), and | do not have backslashes in front.
5908 .Ip * 4 2
5909 The range operator is .\|. rather than comma.
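.PP
The first two sed traps above can be sketched as follows.
The sample strings and variable names are assumptions chosen for illustration.
.nf

```perl
$_ = 'hello world';
s/(\w+) (\w+)/$2 $1/;    # sed: s/\([a-z]*\) \([a-z]*\)/\2 \1/
print "$_\n";

$animal = 'cat';
$kind = ($animal =~ /^(cat|dog)$/) ? 'pet' : 'other';   # (, ) and | unescaped
print "$kind\n";
```

.fi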
5910 .PP
5911 Sharp shell programmers should take note of the following:
5912 .Ip * 4 2
5913 The backtick operator does variable interpretation without regard to the
5914 presence of single quotes in the command.
5915 .Ip * 4 2
5916 The backtick operator does no translation of the return value, unlike csh.
5917 .Ip * 4 2
5918 Shells (especially csh) do several levels of substitution on each command line.
5919 .I Perl
5920 does substitution only in certain constructs such as double quotes,
5921 backticks, angle brackets and search patterns.
5922 .Ip * 4 2
5923 Shells interpret scripts a little bit at a time.
5924 .I Perl
5925 compiles the whole program before executing it.
5926 .Ip * 4 2
5927 The arguments are available via @ARGV, not $1, $2, etc.
5928 .Ip * 4 2
5929 The environment is not automatically made available as variables.
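.PP
A brief sketch of the shell traps above, assuming a Unix system where echo is
available through the shell.
The variable names are arbitrary.
.nf

```perl
$who = 'world';
$out = `echo 'hello $who'`;    # $who is interpolated despite the single quotes
print $out;                    # the trailing newline is still part of $out

$home  = $ENV{'HOME'};         # environment variables live in %ENV
$first = $ARGV[0];             # first command-line argument, not $1
$count = $#ARGV + 1;           # number of arguments
```

.fi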
5930 .SH ERRATA\0AND\0ADDENDA
5931 The Perl book,
.IR Programming\0Perl ,
5933 has the following omissions and goofs.
5934 .PP
5935 On page 5, the examples which read
5936 .nf
5937
5938         eval "/usr/bin/perl
5939
5940 should read
5941
5942         eval "exec /usr/bin/perl
5943
5944 .fi
5945 .PP
5946 On page 195, the equivalent to the System V sum program only works for
5947 very small files.  To do larger files, use
5948 .nf
5949
5950         undef $/;
5951         $checksum = unpack("%32C*",<>) % 32767;
5952
5953 .fi
5954 .PP
5955 The descriptions of alarm and sleep refer to signal SIGALARM.  These
5956 should refer to SIGALRM.
5957 .PP
5958 The
5959 .B \-0
5960 switch to set the initial value of $/ was added to Perl after the book
5961 went to press.
5962 .PP
5963 The
5964 .B \-l
5965 switch now does automatic line ending processing.
5966 .PP
5967 The qx// construct is now a synonym for backticks.
5968 .PP
5969 $0 may now be assigned to set the argument displayed by
.IR ps (1).
5971 .PP
5972 The new @###.## format was omitted accidentally from the description
5973 on formats.
5974 .PP
5975 It wasn't known at press time that s///ee caused multiple evaluations of
5976 the replacement expression.  This is to be construed as a feature.
5977 .PP
5978 (LIST) x $count now does array replication.
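.PP
A small sketch of the difference between array replication and ordinary string
repetition; the variable names are arbitrary.
.nf

```perl
@row   = (0, 1) x 3;      # array replication: (0, 1, 0, 1, 0, 1)
@blank = ('') x 5;        # five empty strings
$rule  = '-' x 10;        # no parentheses on the left: string repetition
print join(',', @row), "\n";
print "$rule\n";
```

.fi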
5979 .PP
5980 There is now no limit on the number of parentheses in a regular expression.
5981 .PP
5982 In double-quote context, more escapes are supported: \ee, \ea, \ex1b, \ec[,
\el, \eL, \eu, \eU, \eE.  The last five control upper- and lowercase translation.
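.PP
The case-translation escapes can be sketched as follows, using arbitrary sample
strings.
.nf

```perl
$s = "\LABC\E-\Uabc\E-\uabc-\lABC";   # abc-ABC-Abc-aBC
print "$s\n";

$name = 'lARRY';
$t = "\u\L$name\E";                   # lowercase everything, then capitalize
print "$t\n";
```

.fi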
5984 .PP
5985 The
5986 .B $/
5987 variable may now be set to a multi-character delimiter.
5988 .PP
5989 There is now a g modifier on ordinary pattern matching that causes it
5990 to iterate through a string finding multiple matches.
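.PP
A minimal sketch of the g modifier in both contexts; the sample string is
arbitrary.
.nf

```perl
$_ = 'a1b22c333';
@nums = /(\d+)/g;                 # array context: every match at once
print join(' ', @nums), "\n";

$total = 0;
while (/(\d+)/g) {                # scalar context: one match per iteration
    $total += $1;
}
print "$total\n";
```

.fi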
5991 .PP
5992 All of the $^X variables are new except for $^T.
5993 .PP
5994 The default top-of-form format for FILEHANDLE is now FILEHANDLE_TOP rather
5995 than top.
5996 .PP
5997 The eval {} and sort {} constructs were added in version 4.018.
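.PP
Both block forms can be sketched briefly; the data and the error message are
made up.
.nf

```perl
@sorted = sort { $a <=> $b } (10, 9, 100);   # numeric sort with an inline block
print join(' ', @sorted), "\n";

eval { die "oops\n"; };                      # traps the die; the script continues
print "caught: $@" if $@;
```

.fi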
5998 .PP
5999 The v and V (little-endian) template options for pack and unpack were
6000 added in 4.019.
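.PP
A short sketch of the little-endian templates, using arbitrary values.
.nf

```perl
$short = pack('v', 258);          # 16-bit little-endian: bytes 2, 1
$long  = pack('V', 1);            # 32-bit little-endian: bytes 1, 0, 0, 0
@bytes = unpack('C*', $short);
print join(' ', @bytes), "\n";
($n) = unpack('v', $short);       # round trip back to 258
print "$n\n";
```

.fi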
6001 .SH BUGS
6002 .PP
6003 .I Perl
6004 is at the mercy of your machine's definitions of various operations
6005 such as type casting, atof() and sprintf().
6006 .PP
If your stdio requires a seek or eof between reads and writes on a particular
6008 stream, so does
6009 .IR perl .
6010 (This doesn't apply to sysread() and syswrite().)
6011 .PP
6012 While none of the built-in data types have any arbitrary size limits (apart
6013 from memory size), there are still a few arbitrary limits:
6014 a given identifier may not be longer than 255 characters,
6015 and no component of your PATH may be longer than 255 if you use \-S.
6016 A regular expression may not compile to more than 32767 bytes internally.
6017 .PP
6018 .I Perl
6019 actually stands for Pathologically Eclectic Rubbish Lister, but don't tell
6020 anyone I said that.
6021 .rn }` ''