perl.man

   1 .rn '' }`
   2 ''' $RCSfile: perl.man,v $$Revision: 4.0.1.1 $$Date: 91/04/11 17:50:44 $
   3 '''
   4 ''' $Log:       perl.man,v $
   5 ''' Revision 4.0.1.1  91/04/11  17:50:44  lwall
   6 ''' patch1: fixed some typos
   7 '''
   8 ''' Revision 4.0  91/03/20  01:38:08  lwall
   9 ''' 4.0 baseline.
  10 '''
  11 '''
  12 .de Sh
  13 .br
  14 .ne 5
  15 .PP
  16 \fB\\$1\fR
  17 .PP
  18 ..
  19 .de Sp
  20 .if t .sp .5v
  21 .if n .sp
  22 ..
  23 .de Ip
  24 .br
  25 .ie \\n(.$>=3 .ne \\$3
  26 .el .ne 3
  27 .IP "\\$1" \\$2
  28 ..
  29 '''
  30 '''     Set up \*(-- to give an unbreakable dash;
  31 '''     string Tr holds user defined translation string.
  32 '''     Bell System Logo is used as a dummy character.
  33 '''
  34 .tr \(*W-|\(bv\*(Tr
  35 .ie n \{\
  36 .ds -- \(*W-
  37 .if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
  38 .if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
  39 .ds L" ""
  40 .ds R" ""
  41 .ds L' '
  42 .ds R' '
  43 'br\}
  44 .el\{\
  45 .ds -- \(em\|
  46 .tr \*(Tr
  47 .ds L" ``
  48 .ds R" ''
  49 .ds L' `
  50 .ds R' '
  51 'br\}
  52 .TH PERL 1 "\*(RP"
  53 .UC
  54 .SH NAME
  55 perl \- Practical Extraction and Report Language
  56 .SH SYNOPSIS
  57 .B perl
  58 [options] filename args
  59 .SH DESCRIPTION
  60 .I Perl
  61 is an interpreted language optimized for scanning arbitrary text files,
  62 extracting information from those text files, and printing reports based
  63 on that information.
  64 It's also a good language for many system management tasks.
  65 The language is intended to be practical (easy to use, efficient, complete)
  66 rather than beautiful (tiny, elegant, minimal).
  67 It combines (in the author's opinion, anyway) some of the best features of C,
  68 \fIsed\fR, \fIawk\fR, and \fIsh\fR,
  69 so people familiar with those languages should have little difficulty with it.
  70 (Language historians will also note some vestiges of \fIcsh\fR, Pascal, and
  71 even BASIC-PLUS.)
  72 Expression syntax corresponds quite closely to C expression syntax.
  73 Unlike most Unix utilities,
  74 .I perl
  75 does not arbitrarily limit the size of your data\*(--if you've got
  76 the memory,
  77 .I perl
  78 can slurp in your whole file as a single string.
  79 Recursion is of unlimited depth.
  80 And the hash tables used by associative arrays grow as necessary to prevent
  81 degraded performance.
  82 .I Perl
  83 uses sophisticated pattern matching techniques to scan large amounts of
  84 data very quickly.
  85 Although optimized for scanning text,
  86 .I perl
  87 can also deal with binary data, and can make dbm files look like associative
  88 arrays (where dbm is available).
  89 Setuid
  90 .I perl
  91 scripts are safer than C programs
  92 through a dataflow tracing mechanism which prevents many stupid security holes.
  93 If you have a problem that would ordinarily use \fIsed\fR
  94 or \fIawk\fR or \fIsh\fR, but it
  95 exceeds their capabilities or must run a little faster,
  96 and you don't want to write the silly thing in C, then
  97 .I perl
  98 may be for you.
  99 There are also translators to turn your
 100 .I sed
 101 and
 102 .I awk
 103 scripts into
 104 .I perl
 105 scripts.
 106 OK, enough hype.
 107 .PP
 108 Upon startup,
 109 .I perl
 110 looks for your script in one of the following places:
 111 .Ip 1. 4 2
 112 Specified line by line via
 113 .B \-e
 114 switches on the command line.
 115 .Ip 2. 4 2
 116 Contained in the file specified by the first filename on the command line.
 117 (Note that systems supporting the #! notation invoke interpreters this way.)
 118 .Ip 3. 4 2
 119 Passed in implicitly via standard input.
 120 This only works if there are no filename arguments\*(--to pass
 121 arguments to a
 122 .I stdin
 123 script you must explicitly specify a \- for the script name.
 124 .PP
 125 After locating your script,
 126 .I perl
 127 compiles it to an internal form.
 128 If the script is syntactically correct, it is executed.
 129 .Sh "Options"
 130 Note: on first reading this section may not make much sense to you.  It's here
 131 at the front for easy reference.
 132 .PP
 133 A single-character option may be combined with the following option, if any.
 134 This is particularly useful when invoking a script using the #! construct which
 135 only allows one argument.  Example:
 136 .nf
 137
 138 .ne 2
 139         #!/usr/bin/perl \-spi.bak       # same as \-s \-p \-i.bak
 140         .\|.\|.
 141
 142 .fi
 143 Options include:
 144 .TP 5
 145 .BI \-0 digits
 146 specifies the record separator ($/) as an octal number.
 147 If there are no digits, the null character is the separator.
 148 Other switches may precede or follow the digits.
 149 For example, if you have a version of
 150 .I find
 151 which can print filenames terminated by the null character, you can say this:
 152 .nf
 153
 154     find . \-name '*.bak' \-print0 | perl \-n0e unlink
 155
 156 .fi
 157 The special value 00 will cause Perl to slurp files in paragraph mode.
 158 The value 0777 will cause Perl to slurp files whole since there is no
 159 legal character with that value.
 160 .TP 5
 161 .B \-a
 162 turns on autosplit mode when used with a
 163 .B \-n
 164 or
 165 .BR \-p .
 166 An implicit split command to the @F array
 167 is done as the first thing inside the implicit while loop produced by
 168 the
 169 .B \-n
 170 or
 171 .BR \-p .
 172 .nf
 173
 174         perl \-ane \'print pop(@F), "\en";\'
 175
 176 is equivalent to
 177
 178         while (<>) {
 179                 @F = split(\' \');
 180                 print pop(@F), "\en";
 181         }
 182
 183 .fi
 184 .TP 5
 185 .B \-c
 186 causes
 187 .I perl
 188 to check the syntax of the script and then exit without executing it.
 189 .TP 5
 190 .BI \-d
 191 runs the script under the perl debugger.
 192 See the section on Debugging.
 193 .TP 5
 194 .BI \-D number
 195 sets debugging flags.
 196 To watch how it executes your script, use
 197 .BR \-D14 .
 198 (This only works if debugging is compiled into your
 199 .IR perl .)
 200 Another nice value is \-D1024, which lists your compiled syntax tree.
 201 And \-D512 displays compiled regular expressions.
 202 .TP 5
 203 .BI \-e " commandline"
 204 may be used to enter one line of script.
 205 Multiple
 206 .B \-e
 207 commands may be given to build up a multi-line script.
 208 If
 209 .B \-e
 210 is given,
 211 .I perl
 212 will not look for a script filename in the argument list.
 213 .TP 5
 214 .BI \-i extension
 215 specifies that files processed by the <> construct are to be edited
 216 in-place.
 217 It does this by renaming the input file, opening the output file by the
 218 same name, and selecting that output file as the default for print statements.
 219 The extension, if supplied, is added to the name of the
 220 old file to make a backup copy.
 221 If no extension is supplied, no backup is made.
 222 Saying \*(L"perl \-p \-i.bak \-e "s/foo/bar/;" .\|.\|. \*(R" is the same as using
 223 the script:
 224 .nf
 225
 226 .ne 2
 227         #!/usr/bin/perl \-pi.bak
 228         s/foo/bar/;
 229
 230 which is equivalent to
 231
 232 .ne 14
 233         #!/usr/bin/perl
 234         while (<>) {
 235                 if ($ARGV ne $oldargv) {
 236                         rename($ARGV, $ARGV . \'.bak\');
 237                         open(ARGVOUT, ">$ARGV");
 238                         select(ARGVOUT);
 239                         $oldargv = $ARGV;
 240                 }
 241                 s/foo/bar/;
 242         }
 243         continue {
 244             print;      # this prints to original filename
 245         }
 246         select(STDOUT);
 247
 248 .fi
 249 except that the
 250 .B \-i
 251 form doesn't need to compare $ARGV to $oldargv to know when
 252 the filename has changed.
 253 It does, however, use ARGVOUT for the selected filehandle.
 254 Note that
 255 .I STDOUT
 256 is restored as the default output filehandle after the loop.
 257 .Sp
 258 You can use eof to locate the end of each input file, in case you want
 259 to append to each file, or reset line numbering (see example under eof).
 260 .TP 5
 261 .BI \-I directory
 262 may be used in conjunction with
 263 .B \-P
 264 to tell the C preprocessor where to look for include files.
 265 By default /usr/include and /usr/lib/perl are searched.
 266 .TP 5
 267 .BI \-l octnum
 268 enables automatic line-ending processing.  It has two effects:
 269 first, it automatically chops the line terminator when used with
 270 .B \-n
 271 or
  272 .BR \-p ,
 273 and second, it assigns $\e to have the value of
 274 .I octnum
 275 so that any print statements will have that line terminator added back on.  If
 276 .I octnum
 277 is omitted, sets $\e to the current value of $/.
 278 For instance, to trim lines to 80 columns:
 279 .nf
 280
  281         perl \-lpe \'substr($_, 80) = ""\'
 282
 283 .fi
 284 Note that the assignment $\e = $/ is done when the switch is processed,
 285 so the input record separator can be different than the output record
 286 separator if the
 287 .B \-l
 288 switch is followed by a
 289 .B \-0
 290 switch:
 291 .nf
 292
  293         gnufind / \-print0 | perl \-ln0e 'print "found $_" if \-p'
 294
 295 .fi
 296 This sets $\e to newline and then sets $/ to the null character.
 297 .TP 5
 298 .B \-n
 299 causes
 300 .I perl
 301 to assume the following loop around your script, which makes it iterate
 302 over filename arguments somewhat like \*(L"sed \-n\*(R" or \fIawk\fR:
 303 .nf
 304
 305 .ne 3
 306         while (<>) {
 307                 .\|.\|.         # your script goes here
 308         }
 309
 310 .fi
 311 Note that the lines are not printed by default.
 312 See
 313 .B \-p
 314 to have lines printed.
 315 Here is an efficient way to delete all files older than a week:
 316 .nf
 317
 318         find . \-mtime +7 \-print | perl \-nle \'unlink;\'
 319
 320 .fi
 321 This is faster than using the \-exec switch of find because you don't have to
 322 start a process on every filename found.
 323 .TP 5
 324 .B \-p
 325 causes
 326 .I perl
 327 to assume the following loop around your script, which makes it iterate
 328 over filename arguments somewhat like \fIsed\fR:
 329 .nf
 330
 331 .ne 5
 332         while (<>) {
 333                 .\|.\|.         # your script goes here
 334         } continue {
 335                 print;
 336         }
 337
 338 .fi
 339 Note that the lines are printed automatically.
 340 To suppress printing use the
 341 .B \-n
 342 switch.
 343 A
 344 .B \-p
 345 overrides a
 346 .B \-n
 347 switch.
 348 .TP 5
 349 .B \-P
 350 causes your script to be run through the C preprocessor before
 351 compilation by
 352 .IR perl .
 353 (Since both comments and cpp directives begin with the # character,
 354 you should avoid starting comments with any words recognized
 355 by the C preprocessor such as \*(L"if\*(R", \*(L"else\*(R" or \*(L"define\*(R".)
 356 .TP 5
 357 .B \-s
 358 enables some rudimentary switch parsing for switches on the command line
 359 after the script name but before any filename arguments (or before a \-\|\-).
 360 Any switch found there is removed from @ARGV and sets the corresponding variable in the
 361 .I perl
 362 script.
 363 The following script prints \*(L"true\*(R" if and only if the script is
 364 invoked with a \-xyz switch.
 365 .nf
 366
 367 .ne 2
 368         #!/usr/bin/perl \-s
 369         if ($xyz) { print "true\en"; }
 370
 371 .fi
 372 .TP 5
 373 .B \-S
 374 makes
 375 .I perl
 376 use the PATH environment variable to search for the script
 377 (unless the name of the script starts with a slash).
 378 Typically this is used to emulate #! startup on machines that don't
 379 support #!, in the following manner:
 380 .nf
 381
 382         #!/usr/bin/perl
 383         eval "exec /usr/bin/perl \-S $0 $*"
 384                 if $running_under_some_shell;
 385
 386 .fi
 387 The system ignores the first line and feeds the script to /bin/sh,
 388 which proceeds to try to execute the
 389 .I perl
 390 script as a shell script.
 391 The shell executes the second line as a normal shell command, and thus
 392 starts up the
 393 .I perl
 394 interpreter.
 395 On some systems $0 doesn't always contain the full pathname,
 396 so the
 397 .B \-S
 398 tells
 399 .I perl
 400 to search for the script if necessary.
 401 After
 402 .I perl
 403 locates the script, it parses the lines and ignores them because
 404 the variable $running_under_some_shell is never true.
 405 A better construct than $* would be ${1+"$@"}, which handles embedded spaces
 406 and such in the filenames, but doesn't work if the script is being interpreted
 407 by csh.
 408 In order to start up sh rather than csh, some systems may have to replace the
 409 #! line with a line containing just
 410 a colon, which will be politely ignored by perl.
 411 Other systems can't control that, and need a totally devious construct that
 412 will work under any of csh, sh or perl, such as the following:
 413 .nf
 414
 415 .ne 3
  416         eval '(exit $?0)' && eval 'exec /usr/bin/perl \-S $0 ${1+"$@"}'
  417         & eval 'exec /usr/bin/perl \-S $0 $argv:q'
 418                 if 0;
 419
 420 .fi
 421 .TP 5
 422 .B \-u
 423 causes
 424 .I perl
 425 to dump core after compiling your script.
 426 You can then take this core dump and turn it into an executable file
 427 by using the undump program (not supplied).
 428 This speeds startup at the expense of some disk space (which you can
 429 minimize by stripping the executable).
 430 (Still, a "hello world" executable comes out to about 200K on my machine.)
 431 If you are going to run your executable as a set-id program then you
 432 should probably compile it using taintperl rather than normal perl.
 433 If you want to execute a portion of your script before dumping, use the
 434 dump operator instead.
  435 Note: undump is platform specific and may not be available
  436 for a particular port of perl.
 437 .TP 5
 438 .B \-U
 439 allows
 440 .I perl
 441 to do unsafe operations.
 442 Currently the only \*(L"unsafe\*(R" operation is the unlinking of directories while
 443 running as superuser.
 444 .TP 5
 445 .B \-v
 446 prints the version and patchlevel of your
 447 .I perl
 448 executable.
 449 .TP 5
 450 .B \-w
 451 prints warnings about identifiers that are mentioned only once, and scalar
 452 variables that are used before being set.
 453 Also warns about redefined subroutines, and references to undefined
 454 filehandles or filehandles opened readonly that you are attempting to
 455 write on.
 456 Also warns you if you use == on values that don't look like numbers, and if
 457 your subroutines recurse more than 100 deep.
 458 .TP 5
 459 .BI \-x directory
 460 tells
 461 .I perl
 462 that the script is embedded in a message.
 463 Leading garbage will be discarded until the first line that starts
 464 with #! and contains the string "perl".
 465 Any meaningful switches on that line will be applied (but only one
 466 group of switches, as with normal #! processing).
 467 If a directory name is specified, Perl will switch to that directory
 468 before running the script.
 469 The
 470 .B \-x
  471 switch only controls the disposal of leading garbage.
 472 The script must be terminated with __END__ if there is trailing garbage
 473 to be ignored (the script can process any or all of the trailing garbage
 474 via the DATA filehandle if desired).
 475 .Sh "Data Types and Objects"
 476 .PP
 477 .I Perl
 478 has three data types: scalars, arrays of scalars, and
 479 associative arrays of scalars.
 480 Normal arrays are indexed by number, and associative arrays by string.
 481 .PP
 482 The interpretation of operations and values in perl sometimes
 483 depends on the requirements
 484 of the context around the operation or value.
 485 There are three major contexts: string, numeric and array.
 486 Certain operations return array values
 487 in contexts wanting an array, and scalar values otherwise.
 488 (If this is true of an operation it will be mentioned in the documentation
 489 for that operation.)
 490 Operations which return scalars don't care whether the context is looking
 491 for a string or a number, but
 492 scalar variables and values are interpreted as strings or numbers
 493 as appropriate to the context.
 494 A scalar is interpreted as TRUE in the boolean sense if it is not the null
 495 string or 0.
 496 Booleans returned by operators are 1 for true and 0 or \'\' (the null
 497 string) for false.
 498 .PP
 499 There are actually two varieties of null string: defined and undefined.
 500 Undefined null strings are returned when there is no real value for something,
 501 such as when there was an error, or at end of file, or when you refer
 502 to an uninitialized variable or element of an array.
 503 An undefined null string may become defined the first time you access it, but
 504 prior to that you can use the defined() operator to determine whether the
 505 value is defined or not.
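.PP
As a minimal sketch of the difference between the two kinds of null string,
the following classifies three lookups with the defined() operator.
The names %age, %status and $name are made up for the illustration, and
$/ is printed simply because it holds a newline by default.
.nf

```perl
# Minimal sketch; %age, %status and $name are illustrative names only.
%age = ('alice', 30, 'bob', '');        # bob has a defined null string

foreach $name ('alice', 'bob', 'carol') {
    if (!defined($age{$name})) {
        $status{$name} = 'undefined';           # no value at all
    } elsif ($age{$name} eq '') {
        $status{$name} = 'defined but null';
    } else {
        $status{$name} = 'defined';
    }
}
print join(', ', $status{'alice'}, $status{'bob'}, $status{'carol'}), $/;
```

.fi
Only the lookup for carol, which was never assigned, is reported as undefined.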
 506 .PP
 507 References to scalar variables always begin with \*(L'$\*(R', even when referring
 508 to a scalar that is part of an array.
 509 Thus:
 510 .nf
 511
 512 .ne 3
 513     $days       \h'|2i'# a simple scalar variable
 514     $days[28]   \h'|2i'# 29th element of array @days
 515     $days{\'Feb\'}\h'|2i'# one value from an associative array
 516     $#days      \h'|2i'# last index of array @days
 517
 518 but entire arrays or array slices are denoted by \*(L'@\*(R':
 519
 520     @days       \h'|2i'# ($days[0], $days[1],\|.\|.\|. $days[n])
 521     @days[3,4,5]\h'|2i'# same as @days[3.\|.5]
 522     @days{'a','c'}\h'|2i'# same as ($days{'a'},$days{'c'})
 523
 524 and entire associative arrays are denoted by \*(L'%\*(R':
 525
 526     %days       \h'|2i'# (key1, val1, key2, val2 .\|.\|.)
 527 .fi
 528 .PP
 529 Any of these eight constructs may serve as an lvalue,
 530 that is, may be assigned to.
 531 (It also turns out that an assignment is itself an lvalue in
 532 certain contexts\*(--see examples under s, tr and chop.)
 533 Assignment to a scalar evaluates the righthand side in a scalar context,
 534 while assignment to an array or array slice evaluates the righthand side
 535 in an array context.
 536 .PP
 537 You may find the length of array @days by evaluating
 538 \*(L"$#days\*(R", as in
 539 .IR csh .
 540 (Actually, it's not the length of the array, it's the subscript of the last element, since there is (ordinarily) a 0th element.)
 541 Assigning to $#days changes the length of the array.
 542 Shortening an array by this method does not actually destroy any values.
 543 Lengthening an array that was previously shortened recovers the values that
 544 were in those elements.
 545 You can also gain some measure of efficiency by preextending an array that
 546 is going to get big.
 547 (You can also extend an array by assigning to an element that is off the
 548 end of the array.
 549 This differs from assigning to $#whatever in that intervening values
 550 are set to null rather than recovered.)
 551 You can truncate an array down to nothing by assigning the null list () to
 552 it.
  553 The following are exactly equivalent:
 554 .nf
 555
 556         @whatever = ();
 557         $#whatever = $[ \- 1;
 558
 559 .fi
 560 .PP
 561 If you evaluate an array in a scalar context, it returns the length of
 562 the array.
 563 The following is always true:
 564 .nf
 565
 566         @whatever == $#whatever \- $[ + 1;
 567
 568 .fi
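.PP
The relationship between the last subscript and the length can be sketched
as follows.
This assumes the array base $[ has been left at its default of 0; the
variable names other than @days are invented for the example.
.nf

```perl
@days = ('Mon', 'Tue', 'Wed');
$last  = $#days;        # 2: subscript of the last element
$count = @days;         # 3: the array in a scalar context
$#days = 0;             # shorten the array to one element
$short = @days;         # 1
@days = ();             # truncate to nothing
$empty = $#days;        # -1 when the array base is 0
print "$last $count $short $empty", $/;
```

.fi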
 569 .PP
 570 Multi-dimensional arrays are not directly supported, but see the discussion
 571 of the $; variable later for a means of emulating multiple subscripts with
 572 an associative array.
 573 You could also write a subroutine to turn multiple subscripts into a single
 574 subscript.
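.PP
Both approaches might look something like the sketch below.
The names %grid, @flat, $COLS and the subroutine idx are hypothetical, and
the fixed row width of 4 is an arbitrary assumption.
.nf

```perl
# 1. Associative array: $grid{1,2} joins the subscripts with $; as the key.
$grid{1,2} = 'x';
$key  = join($;, 1, 2);         # the key that is actually stored
$same = $grid{$key};            # the same element, fetched by its real key

# 2. Subroutine folding (row, column) into one subscript.
$COLS = 4;                      # assumed fixed row width
sub idx { local($row, $col) = @_; $row * $COLS + $col; }
$flat[&idx(1, 2)] = 'y';        # stored at $flat[6]
print "$same $flat[6]", $/;
```

.fi
The associative array needs no fixed dimensions; the subroutine form keeps
the data in a normal array at the cost of choosing a row width in advance.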
 575 .PP
 576 Every data type has its own namespace.
 577 You can, without fear of conflict, use the same name for a scalar variable,
 578 an array, an associative array, a filehandle, a subroutine name, and/or
 579 a label.
 580 Since variable and array references always start with \*(L'$\*(R', \*(L'@\*(R',
 581 or \*(L'%\*(R', the \*(L"reserved\*(R" words aren't in fact reserved
 582 with respect to variable names.
 583 (They ARE reserved with respect to labels and filehandles, however, which
 584 don't have an initial special character.
 585 Hint: you could say open(LOG,\'logfile\') rather than open(log,\'logfile\').
 586 Using uppercase filehandles also improves readability and protects you
 587 from conflict with future reserved words.)
 588 Case IS significant\*(--\*(L"FOO\*(R", \*(L"Foo\*(R" and \*(L"foo\*(R" are all
 589 different names.
 590 Names which start with a letter may also contain digits and underscores.
 591 Names which do not start with a letter are limited to one character,
 592 e.g. \*(L"$%\*(R" or \*(L"$$\*(R".
 593 (Most of the one character names have a predefined significance to
 594 .IR perl .
 595 More later.)
 596 .PP
 597 Numeric literals are specified in any of the usual floating point or
 598 integer formats:
 599 .nf
 600
 601 .ne 5
 602     12345
 603     12345.67
 604     .23E-10
 605     0xffff      # hex
 606     0377        # octal
 607
 608 .fi
 609 String literals are delimited by either single or double quotes.
 610 They work much like shell quotes:
 611 double-quoted string literals are subject to backslash and variable
 612 substitution; single-quoted strings are not (except for \e\' and \e\e).
 613 The usual backslash rules apply for making characters such as newline, tab,
 614 etc., as well as some more exotic forms:
 615 .nf
 616
 617         \et             tab
 618         \en             newline
 619         \er             return
 620         \ef             form feed
 621         \eb             backspace
 622         \ea             alarm (bell)
 623         \ee             escape
 624         \e033           octal char
 625         \ex1b           hex char
 626         \ec[            control char
 627         \el             lowercase next char
 628         \eu             uppercase next char
 629         \eL             lowercase till \eE
 630         \eU             uppercase till \eE
 631         \eE             end case modification
 632
 633 .fi
 634 You can also embed newlines directly in your strings, i.e. they can end on
 635 a different line than they begin.
 636 This is nice, but if you forget your trailing quote, the error will not be
 637 reported until
 638 .I perl
 639 finds another line containing the quote character, which
 640 may be much further on in the script.
 641 Variable substitution inside strings is limited to scalar variables, normal
 642 array values, and array slices.
 643 (In other words, identifiers beginning with $ or @, followed by an optional
 644 bracketed expression as a subscript.)
 645 The following code segment prints out \*(L"The price is $100.\*(R"
 646 .nf
 647
 648 .ne 2
 649     $Price = \'$100\';\h'|3.5i'# not interpreted
 650     print "The price is $Price.\e\|n";\h'|3.5i'# interpreted
 651
 652 .fi
 653 Note that you can put curly brackets around the identifier to delimit it
 654 from following alphanumerics.
 655 Also note that a single quoted string must be separated from a preceding
 656 word by a space, since single quote is a valid character in an identifier
 657 (see Packages).
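.PP
A small sketch of the curly bracket form follows; $fruit is a made-up
variable, and $fruits is deliberately never set.
.nf

```perl
$fruit  = 'apple';
$plain  = "I ate two $fruits.";     # reads the unset variable $fruits
$braced = "I ate two ${fruit}s.";   # reads $fruit, then a literal s
print $braced, $/;
```

.fi
Without the brackets the trailing s is taken as part of the variable name,
so $plain ends up with nothing between the last space and the period.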
 658 .PP
 659 Two special literals are __LINE__ and __FILE__, which represent the current
 660 line number and filename at that point in your program.
 661 They may only be used as separate tokens; they will not be interpolated
 662 into strings.
 663 In addition, the token __END__ may be used to indicate the logical end of the
 664 script before the actual end of file.
 665 Any following text is ignored (but may be read via the DATA filehandle).
 666 The two control characters ^D and ^Z are synonyms for __END__.
 667 .PP
 668 A word that doesn't have any other interpretation in the grammar will be
 669 treated as if it had single quotes around it.
 670 For this purpose, a word consists only of alphanumeric characters and underline,
 671 and must start with an alphabetic character.
 672 As with filehandles and labels, a bare word that consists entirely of
 673 lowercase letters risks conflict with future reserved words, and if you
 674 use the
 675 .B \-w
 676 switch, Perl will warn you about any such words.
 677 .PP
 678 Array values are interpolated into double-quoted strings by joining all the
 679 elements of the array with the delimiter specified in the $" variable,
 680 space by default.
 681 (Since in versions of perl prior to 3.0 the @ character was not a metacharacter
 682 in double-quoted strings, the interpolation of @array, $array[EXPR],
 683 @array[LIST], $array{EXPR}, or @array{LIST} only happens if array is
 684 referenced elsewhere in the program or is predefined.)
 685 The following are equivalent:
 686 .nf
 687
 688 .ne 4
 689         $temp = join($",@ARGV);
 690         system "echo $temp";
 691
 692         system "echo @ARGV";
 693
 694 .fi
 695 Within search patterns (which also undergo double-quotish substitution)
 696 there is a bad ambiguity:  Is /$foo[bar]/ to be
 697 interpreted as /${foo}[bar]/ (where [bar] is a character class for the
 698 regular expression) or as /${foo[bar]}/ (where [bar] is the subscript to
 699 array @foo)?
 700 If @foo doesn't otherwise exist, then it's obviously a character class.
 701 If @foo exists, perl takes a good guess about [bar], and is almost always right.
 702 If it does guess wrong, or if you're just plain paranoid,
 703 you can force the correct interpretation with curly brackets as above.
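.PP
The two bracketed forms can be tried out with a sketch like this one.
The values given to $foo and @foo, and the result variables, are
assumptions chosen only to make the two readings distinguishable.
.nf

```perl
$foo = 'ab';
@foo = ('zero', 'one');
$i = 1;
$as_class = ('abc' =~ /^${foo}[bc]/) ? 'yes' : 'no';  # $foo, then class [bc]
$as_elem  = ('one' =~ /^${foo[$i]}/) ? 'yes' : 'no';  # element $foo[1]
$no_match = ('abz' =~ /^${foo}[bc]/) ? 'yes' : 'no';  # z is not in [bc]
print "$as_class $as_elem $no_match", $/;
```

.fi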
 704 .PP
 705 A line-oriented form of quoting is based on the shell here-is syntax.
 706 Following a << you specify a string to terminate the quoted material, and all lines
 707 following the current line down to the terminating string are the value
 708 of the item.
 709 The terminating string may be either an identifier (a word), or some
 710 quoted text.
 711 If quoted, the type of quotes you use determines the treatment of the text,
 712 just as in regular quoting.
 713 An unquoted identifier works like double quotes.
 714 There must be no space between the << and the identifier.
 715 (If you put a space it will be treated as a null identifier, which is
 716 valid, and matches the first blank line\*(--see Merry Christmas example below.)
 717 The terminating string must appear by itself (unquoted and with no surrounding
 718 whitespace) on the terminating line.
 719 .nf
 720
 721         print <<EOF;            # same as above
 722 The price is $Price.
 723 EOF
 724
 725         print <<"EOF";          # same as above
 726 The price is $Price.
 727 EOF
 728
 729         print << x 10;          # null identifier is delimiter
 730 Merry Christmas!
 731
 732         print <<`EOC`;          # execute commands
 733 echo hi there
 734 echo lo there
 735 EOC
 736
 737         print <<foo, <<bar;     # you can stack them
 738 I said foo.
 739 foo
 740 I said bar.
 741 bar
 742
 743 .fi
 744 Array literals are denoted by separating individual values by commas, and
 745 enclosing the list in parentheses:
 746 .nf
 747
 748         (LIST)
 749
 750 .fi
 751 In a context not requiring an array value, the value of the array literal
 752 is the value of the final element, as in the C comma operator.
 753 For example,
 754 .nf
 755
 756 .ne 4
 757     @foo = (\'cc\', \'\-E\', $bar);
 758
 759 assigns the entire array value to array foo, but
 760
 761     $foo = (\'cc\', \'\-E\', $bar);
 762
 763 .fi
 764 assigns the value of variable bar to variable foo.
 765 Note that the value of an actual array in a scalar context is the length
 766 of the array; the following assigns to $foo the value 3:
 767 .nf
 768
 769 .ne 2
 770     @foo = (\'cc\', \'\-E\', $bar);
 771     $foo = @foo;                # $foo gets 3
 772
 773 .fi
 774 You may have an optional comma before the closing parenthesis of an
 775 array literal, so that you can say:
 776 .nf
 777
 778     @foo = (
 779         1,
 780         2,
 781         3,
 782     );
 783
 784 .fi
 785 When a LIST is evaluated, each element of the list is evaluated in
 786 an array context, and the resulting array value is interpolated into LIST
 787 just as if each individual element were a member of LIST.  Thus arrays
 788 lose their identity in a LIST\*(--the list
 789
 790         (@foo,@bar,&SomeSub)
 791
 792 contains all the elements of @foo followed by all the elements of @bar,
 793 followed by all the elements returned by the subroutine named SomeSub.
 794 .PP
 795 A list value may also be subscripted like a normal array.
 796 Examples:
 797 .nf
 798
 799         $time = (stat($file))[8];       # stat returns array value
 800         $digit = ('a','b','c','d','e','f')[$digit-10];
 801         return (pop(@foo),pop(@foo))[0];
 802
 803 .fi
 804 .PP
 805 Array lists may be assigned to if and only if each element of the list
 806 is an lvalue:
 807 .nf
 808
 809     ($a, $b, $c) = (1, 2, 3);
 810
 811     ($map{\'red\'}, $map{\'blue\'}, $map{\'green\'}) = (0x00f, 0x0f0, 0xf00);
 812
 813 The final element may be an array or an associative array:
 814
 815     ($a, $b, @rest) = split;
 816     local($a, $b, %rest) = @_;
 817
 818 .fi
 819 You can actually put an array anywhere in the list, but the first array
 820 in the list will soak up all the values, and anything after it will get
 821 a null value.
 822 This may be useful in a local().
 823 .PP
 824 An associative array literal contains pairs of values to be interpreted
 825 as a key and a value:
 826 .nf
 827
 828 .ne 2
 829     # same as map assignment above
 830     %map = ('red',0x00f,'blue',0x0f0,'green',0xf00);
 831
 832 .fi
 833 Array assignment in a scalar context returns the number of elements
 834 produced by the expression on the right side of the assignment:
 835 .nf
 836
 837         $x = (($foo,$bar) = (3,2,1));   # set $x to 3, not 2
 838
 839 .fi
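.PP
The assignment rules above can be gathered into one short sketch.
All variable names here are illustrative.
.nf

```perl
($first, @rest) = (1, 2, 3);            # $first is 1, @rest is (2, 3)
(@all, $last)   = (1, 2, 3);            # @all takes everything
%map = ('red', 0x00f, 'blue', 0x0f0);   # key, value, key, value
$count = (($x, $y) = (7, 8, 9));        # 3, though only 7 and 8 are stored
$is_def = defined($last) ? 'defined' : 'undefined';
print "$first | @rest | @all | $is_def | $map{'blue'} | $count", $/;
```

.fi
Because @all comes first in its list, nothing is left over for $last.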
 840 .PP
 841 There are several other pseudo-literals that you should know about.
 842 If a string is enclosed by backticks (grave accents), it first undergoes
 843 variable substitution just like a double quoted string.
 844 It is then interpreted as a command, and the output of that command
  845 is the value of the pseudo-literal, as in a shell.
 846 In a scalar context, a single string consisting of all the output is
 847 returned.
 848 In an array context, an array of values is returned, one for each line
 849 of output.
 850 (You can set $/ to use a different line terminator.)
 851 The command is executed each time the pseudo-literal is evaluated.
 852 The status value of the command is returned in $? (see Predefined Names
 853 for the interpretation of $?).
  854 Unlike in \fIcsh\fR, no translation is done on the return
 855 data\*(--newlines remain newlines.
 856 Unlike in any of the shells, single quotes do not hide variable names
 857 in the command from interpretation.
 858 To pass a $ through to the shell you need to hide it with a backslash.
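.PP
A minimal sketch of the two contexts follows.
It assumes a Unix-like system where the shell provides an echo command;
the variable names are invented for the example.
.nf

```perl
$all    = `echo one; echo two`;     # scalar context: one string
@lines  = `echo one; echo two`;     # array context: one element per line
$status = $?;                       # 0 when the command succeeded
$nlines = @lines;                   # number of lines returned
print "lines: $nlines, status: $status", $/;
```

.fi
The command runs twice here, once for each evaluation of the backticks.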
 859 .PP
 860 Evaluating a filehandle in angle brackets yields the next line
 861 from that file (newline included, so it's never false until EOF, at
 862 which time an undefined value is returned).
 863 Ordinarily you must assign that value to a variable,
 864 but there is one situation where an automatic assignment happens.
 865 If (and only if) the input symbol is the only thing inside the conditional of a
 866 .I while
 867 loop, the value is
 868 automatically assigned to the variable \*(L"$_\*(R".
 869 (This may seem like an odd thing to you, but you'll use the construct
 870 in almost every
 871 .I perl
 872 script you write.)
 873 Anyway, the following lines are equivalent to each other:
 874 .nf
 875
 876 .ne 5
 877     while ($_ = <STDIN>) { print; }
 878     while (<STDIN>) { print; }
 879     for (\|;\|<STDIN>;\|) { print; }
 880     print while $_ = <STDIN>;
 881     print while <STDIN>;
 882
 883 .fi
 884 The filehandles
 885 .IR STDIN ,
 886 .I STDOUT
 887 and
 888 .I STDERR
 889 are predefined.
 890 (The filehandles
 891 .IR stdin ,
 892 .I stdout
 893 and
 894 .I stderr
 895 will also work except in packages, where they would be interpreted as
 896 local identifiers rather than global.)
 897 Additional filehandles may be created with the
 898 .I open
 899 function.
 900 .PP
 901 If a <FILEHANDLE> is used in a context that is looking for an array, an array
 902 consisting of all the input lines is returned, one line per array element.
 903 It's easy to make a LARGE data space this way, so use with care.
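.PP
The difference between the two contexts can be sketched as follows.
The scratch file name and the handle names OUT and IN are assumptions chosen
for the demonstration:
.nf

```perl
$file = "/tmp/perl_man_lines.$$";       # hypothetical scratch file
open(OUT, ">$file") || die "Can't create $file: $!";
print OUT "alpha\n", "beta\n", "gamma\n";
close(OUT);

open(IN, $file) || die "Can't open $file: $!";
$first = <IN>;          # scalar context: just the next line, newline included
@rest  = <IN>;          # array context: every remaining line, one per element
close(IN);
unlink($file);

print "first: $first";
print "remaining lines: ", scalar(@rest), "\n";
```

.fi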
 904 .PP
 905 The null filehandle <> is special and can be used to emulate the behavior of
 906 \fIsed\fR and \fIawk\fR.
 907 Input from <> comes either from standard input, or from each file listed on
 908 the command line.
 909 Here's how it works: the first time <> is evaluated, the ARGV array is checked,
 910 and if it is null, $ARGV[0] is set to \'-\', which when opened gives you standard
 911 input.
 912 The ARGV array is then processed as a list of filenames.
 913 The loop
 914 .nf
 915
 916 .ne 3
 917         while (<>) {
 918                 .\|.\|.                 # code for each line
 919         }
 920
 921 .ne 10
 922 is equivalent to
 923
 924         unshift(@ARGV, \'\-\') \|if \|$#ARGV < $[;
 925         while ($ARGV = shift) {
 926                 open(ARGV, $ARGV);
 927                 while (<ARGV>) {
 928                         .\|.\|.         # code for each line
 929                 }
 930         }
 931
 932 .fi
 933 except that it isn't as cumbersome to say.
 934 It really does shift array @ARGV and put the current filename into
 935 variable $ARGV.
 936 It also uses filehandle ARGV internally.
 937 You can modify @ARGV before the first <> as long as you leave the first
 938 filename at the beginning of the array.
 939 Line numbers ($.) continue as if the input were one big happy file.
 940 (But see example under eof for how to reset line numbers on each file.)
 941 .PP
 942 .ne 5
 943 If you want to set @ARGV to your own list of files, go right ahead.
 944 If you want to pass switches into your script, you can
 945 put a loop on the front like this:
 946 .nf
 947
 948 .ne 10
 949         while ($_ = $ARGV[0], /\|^\-/\|) {
 950                 shift;
 951                 last if /\|^\-\|\-$\|/\|;
 952                 /\|^\-D\|(.*\|)/ \|&& \|($debug = $1);
 953                 /\|^\-v\|/ \|&& \|$verbose++;
 954                 .\|.\|.         # other switches
 955         }
 956         while (<>) {
 957                 .\|.\|.         # code for each line
 958         }
 959
 960 .fi
 961 The <> symbol will return FALSE only once.
 962 If you call it again after this it will assume you are processing another
 963 @ARGV list, and if you haven't set @ARGV, will input from
 964 .IR STDIN .
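.PP
As a rough, self-contained illustration of switch parsing followed by <>, the
sketch below fills in @ARGV by hand instead of taking it from a real command
line.
The scratch file and the switch values are invented:
.nf

```perl
$file = "/tmp/perl_man_args.$$";        # hypothetical scratch file
open(OUT, ">$file") || die "Can't create $file: $!";
print OUT "line 1\n", "line 2\n";
close(OUT);

# Pretend the script was run as:  script -v -Dtrace -- FILE
@ARGV = ('-v', '-Dtrace', '--', $file);

while ($_ = $ARGV[0], /^-/) {
    shift(@ARGV);
    last if /^--$/;                     # "--" ends the switches
    /^-D(.*)/ && ($debug = $1);
    /^-v/ && $verbose++;
}

$count = 0;
while (<>) {                            # reads the files left in @ARGV
    $count++;
}
unlink($file);

print "debug=$debug verbose=$verbose lines=$count\n";
```

.fi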
 965 .PP
 966 If the string inside the angle brackets is a reference to a scalar variable
 967 (e.g. <$foo>),
 968 then that variable contains the name of the filehandle to input from.
 969 .PP
 970 If the string inside angle brackets is not a filehandle, it is interpreted
 971 as a filename pattern to be globbed, and either an array of filenames or the
 972 next filename in the list is returned, depending on context.
 973 One level of $ interpretation is done first, but you can't say <$foo>
 974 because that's an indirect filehandle as explained in the previous
 975 paragraph.
 976 You could insert curly brackets to force interpretation as a
 977 filename glob: <${foo}>.
 978 Example:
 979 .nf
 980
 981 .ne 3
 982         while (<*.c>) {
 983                 chmod 0644, $_;
 984         }
 985
 986 is equivalent to
 987
 988 .ne 5
 989         open(foo, "echo *.c | tr \-s \' \et\er\ef\' \'\e\e012\e\e012\e\e012\e\e012\'|");
 990         while (<foo>) {
 991                 chop;
 992                 chmod 0644, $_;
 993         }
 994
 995 .fi
 996 In fact, it's currently implemented that way.
 997 (Which means it will not work on filenames with spaces in them unless
 998 you have /bin/csh on your machine.)
 999 Of course, the shortest way to do the above is:
1000 .nf
1001
1002         chmod 0644, <*.c>;
1003
1004 .fi
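.PP
Here is a small sketch of globbing through a variable, using the curly bracket
form mentioned above.
The scratch directory and file names are assumptions made up for the example:
.nf

```perl
$dir = "/tmp/perl_man_glob.$$";         # hypothetical scratch directory
mkdir($dir, 0755) || die "Can't mkdir $dir: $!";
foreach $name ('a.c', 'b.c', 'notes.txt') {
    open(OUT, ">$dir/$name") || die "Can't create $dir/$name: $!";
    close(OUT);
}

$pattern = "$dir/*.c";
@sources = <${pattern}>;                # curly brackets force a filename glob
$changed = chmod 0644, @sources;        # number of files successfully changed

unlink(<$dir/*>);                       # clean up the scratch files
rmdir($dir);

print "matched ", scalar(@sources), ", changed $changed\n";
```

.fi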
1005 .Sh "Syntax"
1006 .PP
1007 A
1008 .I perl
1009 script consists of a sequence of declarations and commands.
1010 The only things that need to be declared in
1011 .I perl
1012 are report formats and subroutines.
1013 See the sections below for more information on those declarations.
1014 All uninitialized user-created objects are assumed to
1015 start with a null or 0 value until they
1016 are defined by some explicit operation such as assignment.
1017 The sequence of commands is executed just once, unlike in
1018 .I sed
1019 and
1020 .I awk
1021 scripts, where the sequence of commands is executed for each input line.
1022 While this means that you must explicitly loop over the lines of your input file
1023 (or files), it also means you have much more control over which files and which
1024 lines you look at.
1025 (Actually, I'm lying\*(--it is possible to do an implicit loop with either the
1026 .B \-n
1027 or
1028 .B \-p
1029 switch.)
1030 .PP
1031 A declaration can be put anywhere a command can, but has no effect on the
1032 execution of the primary sequence of commands\*(--declarations all take effect
1033 at compile time.
1034 Typically all the declarations are put at the beginning or the end of the script.
1035 .PP
1036 .I Perl
1037 is, for the most part, a free-form language.
1038 (The only exception to this is format declarations, for fairly obvious reasons.)
1039 Comments are indicated by the # character, and extend to the end of the line.
1040 If you attempt to use /* */ C comments, they will be interpreted either as
1041 division or pattern matching, depending on the context.
1042 So don't do that.
1043 .Sh "Compound statements"
1044 In
1045 .IR perl ,
1046 a sequence of commands may be treated as one command by enclosing it
1047 in curly brackets.
1048 We will call this a BLOCK.
1049 .PP
1050 The following compound commands may be used to control flow:
1051 .nf
1052
1053 .ne 4
1054         if (EXPR) BLOCK
1055         if (EXPR) BLOCK else BLOCK
1056         if (EXPR) BLOCK elsif (EXPR) BLOCK .\|.\|. else BLOCK
1057         LABEL while (EXPR) BLOCK
1058         LABEL while (EXPR) BLOCK continue BLOCK
1059         LABEL for (EXPR; EXPR; EXPR) BLOCK
1060         LABEL foreach VAR (ARRAY) BLOCK
1061         LABEL BLOCK continue BLOCK
1062
1063 .fi
1064 Note that, unlike C and Pascal, these are defined in terms of BLOCKs, not
1065 statements.
1066 This means that the curly brackets are \fIrequired\fR\*(--no dangling statements allowed.
1067 If you want to write conditionals without curly brackets there are several
1068 other ways to do it.
1069 The following all do the same thing:
1070 .nf
1071
1072 .ne 5
1073         if (!open(foo)) { die "Can't open $foo: $!"; }
1074         die "Can't open $foo: $!" unless open(foo);
1075         open(foo) || die "Can't open $foo: $!"; # foo or bust!
1076         open(foo) ? \'hi mom\' : die "Can't open $foo: $!";
1077                                 # a bit exotic, that last one
1078
1079 .fi
1080 .PP
1081 The
1082 .I if
1083 statement is straightforward.
1084 Since BLOCKs are always bounded by curly brackets, there is never any
1085 ambiguity about which
1086 .I if
1087 an
1088 .I else
1089 goes with.
1090 If you use
1091 .I unless
1092 in place of
1093 .IR if ,
1094 the sense of the test is reversed.
1095 .PP
1096 The
1097 .I while
1098 statement executes the block as long as the expression is true
1099 (does not evaluate to the null string or 0).
1100 The LABEL is optional, and if present, consists of an identifier followed by
1101 a colon.
1102 The LABEL identifies the loop for the loop control statements
1103 .IR next ,
1104 .IR last ,
1105 and
1106 .I redo
1107 (see below).
1108 If there is a
1109 .I continue
1110 BLOCK, it is always executed just before
1111 the conditional is about to be evaluated again, similarly to the third part
1112 of a
1113 .I for
1114 loop in C.
1115 Thus it can be used to increment a loop variable, even when the loop has
1116 been continued via the
1117 .I next
1118 statement (similar to the C \*(L"continue\*(R" statement).
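.PP
A short sketch of the point about
.IR continue :
the increment still happens when an iteration is cut short with
.IR next .
The variable names are arbitrary:
.nf

```perl
$i = 0;
$evens = '';
while ($i < 6) {
    next if $i % 2;         # skip odd values; the continue BLOCK still runs
    $evens .= $i;
} continue {
    $i++;                   # executed before the conditional is tested again
}
print "evens: $evens\n";
```

.fi
Without the continue BLOCK, the next statement would skip the increment and the
loop would never finish.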
1119 .PP
1120 If the word
1121 .I while
1122 is replaced by the word
1123 .IR until ,
1124 the sense of the test is reversed, but the conditional is still tested before
1125 the first iteration.
1126 .PP
1127 In either the
1128 .I if
1129 or the
1130 .I while
1131 statement, you may replace \*(L"(EXPR)\*(R" with a BLOCK, and the conditional
1132 is true if the value of the last command in that block is true.
1133 .PP
1134 The
1135 .I for
1136 loop works exactly like the corresponding
1137 .I while
1138 loop:
1139 .nf
1140
1141 .ne 12
1142         for ($i = 1; $i < 10; $i++) {
1143                 .\|.\|.
1144         }
1145
1146 is the same as
1147
1148         $i = 1;
1149         while ($i < 10) {
1150                 .\|.\|.
1151         } continue {
1152                 $i++;
1153         }
1154 .fi
1155 .PP
1156 The foreach loop iterates over a normal array value and sets the variable
1157 VAR to be each element of the array in turn.
1158 The variable is implicitly local to the loop, and regains its former value
1159 upon exiting the loop.
1160 The \*(L"foreach\*(R" keyword is actually identical to the \*(L"for\*(R" keyword,
1161 so you can use \*(L"foreach\*(R" for readability or \*(L"for\*(R" for brevity.
1162 If VAR is omitted, $_ is set to each value.
1163 If ARRAY is an actual array (as opposed to an expression returning an array
1164 value), you can modify each element of the array
1165 by modifying VAR inside the loop.
1166 Examples:
1167 .nf
1168
1169 .ne 5
1170         for (@ary) { s/foo/bar/; }
1171
1172         foreach $elem (@elements) {
1173                 $elem *= 2;
1174         }
1175
1176 .ne 3
1177         for ((10,9,8,7,6,5,4,3,2,1,\'BOOM\')) {
1178                 print $_, "\en"; sleep(1);
1179         }
1180
1181         for (1..15) { print "Merry Christmas\en"; }
1182
1183 .ne 3
1184         foreach $item (split(/:[\e\e\en:]*/, $ENV{\'TERMCAP\'})) {
1185                 print "Item: $item\en";
1186         }
1187
1188 .fi
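.PP
The two properties described above (VAR is local to the loop, and modifying VAR
modifies the array) can be checked with a sketch like this one, in which the
names are invented:
.nf

```perl
$elem = 'outer';                # value held before the loop
@prices = (1, 2, 3);

foreach $elem (@prices) {
    $elem *= 2;                 # modifies the array element itself
}

print "prices: @prices\n";
print "elem: $elem\n";          # the former value is back after the loop
```

.fi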
1189 .PP
1190 The BLOCK by itself (labeled or not) is equivalent to a loop that executes
1191 once.
1192 Thus you can use any of the loop control statements in it to leave or
1193 restart the block.
1194 The
1195 .I continue
1196 block is optional.
1197 This construct is particularly nice for doing case structures.
1198 .nf
1199
1200 .ne 6
1201         foo: {
1202                 if (/^abc/) { $abc = 1; last foo; }
1203                 if (/^def/) { $def = 1; last foo; }
1204                 if (/^xyz/) { $xyz = 1; last foo; }
1205                 $nothing = 1;
1206         }
1207
1208 .fi
1209 There is no official switch statement in perl, because there
1210 are already several ways to write the equivalent.
1211 In addition to the above, you could write
1212 .nf
1213
1214 .ne 6
1215         foo: {
1216                 $abc = 1, last foo  if /^abc/;
1217                 $def = 1, last foo  if /^def/;
1218                 $xyz = 1, last foo  if /^xyz/;
1219                 $nothing = 1;
1220         }
1221
1222 or
1223
1224 .ne 6
1225         foo: {
1226                 /^abc/ && do { $abc = 1; last foo; };
1227                 /^def/ && do { $def = 1; last foo; };
1228                 /^xyz/ && do { $xyz = 1; last foo; };
1229                 $nothing = 1;
1230         }
1231
1232 or
1233
1234 .ne 6
1235         foo: {
1236                 /^abc/ && ($abc = 1, last foo);
1237                 /^def/ && ($def = 1, last foo);
1238                 /^xyz/ && ($xyz = 1, last foo);
1239                 $nothing = 1;
1240         }
1241
1242 or even
1243
1244 .ne 8
1245         if (/^abc/)
1246                 { $abc = 1; }
1247         elsif (/^def/)
1248                 { $def = 1; }
1249         elsif (/^xyz/)
1250                 { $xyz = 1; }
1251         else
1252                 {$nothing = 1;}
1253
1254 .fi
1255 As it happens, these are all optimized internally to a switch structure,
1256 so perl jumps directly to the desired statement, and you needn't worry
1257 about perl executing a lot of unnecessary statements when you have a string
1258 of 50 elsifs, as long as you are testing the same simple scalar variable
1259 using ==, eq, or pattern matching as above.
1260 (If you're curious as to whether the optimizer has done this for a particular
1261 case statement, you can use the \-D1024 switch to list the syntax tree
1262 before execution.)
1263 .Sh "Simple statements"
1264 The only kind of simple statement is an expression evaluated for its side
1265 effects.
1266 Every expression (simple statement) must be terminated with a semicolon.
1267 Note that this is like C, but unlike Pascal (and
1268 .IR awk ).
1269 .PP
1270 Any simple statement may optionally be followed by a
1271 single modifier, just before the terminating semicolon.
1272 The possible modifiers are:
1273 .nf
1274
1275 .ne 4
1276         if EXPR
1277         unless EXPR
1278         while EXPR
1279         until EXPR
1280
1281 .fi
1282 The
1283 .I if
1284 and
1285 .I unless
1286 modifiers have the expected semantics.
1287 The
1288 .I while
1289 and
1290 .I until
1291 modifiers also have the expected semantics (conditional evaluated first),
1292 except when applied to a do-BLOCK or a do-SUBROUTINE command,
1293 in which case the block executes once before the conditional is evaluated.
1294 This is so that you can write loops like:
1295 .nf
1296
1297 .ne 4
1298         do {
1299                 $_ = <STDIN>;
1300                 .\|.\|.
1301         } until $_ \|eq \|".\|\e\|n";
1302
1303 .fi
1304 (See the
1305 .I do
1306 operator below.  Note also that the loop control commands described later will
1307 NOT work in this construct, since modifiers don't take loop labels.
1308 Sorry.)
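.PP
The difference in evaluation order can be sketched like this (the counters are
just illustrative names):
.nf

```perl
$n = 10;
$passes = 0;
do {
    $passes++;
} until $n > 5;             # do-BLOCK: the block runs once before the test

$m = 10;
$skipped = 0;
$skipped++ until $m > 5;    # plain statement: the test comes first

print "passes=$passes skipped=$skipped\n";
```

.fi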
1309 .Sh "Expressions"
1310 Since
1311 .I perl
1312 expressions work almost exactly like C expressions, only the differences
1313 will be mentioned here.
1314 .PP
1315 Here's what
1316 .I perl
1317 has that C doesn't:
1318 .Ip ** 8 2
1319 The exponentiation operator.
1320 .Ip **= 8
1321 The exponentiation assignment operator.
1322 .Ip (\|) 8 3
1323 The null list, used to initialize an array to null.
1324 .Ip . 8
1325 Concatenation of two strings.
1326 .Ip .= 8
1327 The concatenation assignment operator.
1328 .Ip eq 8
1329 String equality (== is numeric equality).
1330 For a mnemonic just think of \*(L"eq\*(R" as a string.
1331 (If you are used to the
1332 .I awk
1333 behavior of using == for either string or numeric equality
1334 based on the current form of the comparands, beware!
1335 You must be explicit here.)
1336 .Ip ne 8
1337 String inequality (!= is numeric inequality).
1338 .Ip lt 8
1339 String less than.
1340 .Ip gt 8
1341 String greater than.
1342 .Ip le 8
1343 String less than or equal.
1344 .Ip ge 8
1345 String greater than or equal.
1346 .Ip cmp 8
1347 String comparison, returning -1, 0, or 1.
1348 .Ip <=> 8
1349 Numeric comparison, returning -1, 0, or 1.
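.Sp
A small sketch contrasting the string and numeric comparison operators listed
above; the values are arbitrary:
.nf

```perl
$x = '1.0';
$y = '1';
$num_equal = ($x == $y) ? 'yes' : 'no';     # compared as numbers
$str_equal = ($x eq $y) ? 'yes' : 'no';     # compared as strings

$by_string = ('10' cmp '9');    # as strings, "10" sorts before "9"
$by_number = (10 <=> 9);        # as numbers, 10 is greater than 9

print "== $num_equal, eq $str_equal, cmp $by_string, <=> $by_number\n";
```

.fi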
1350 .Ip =~ 8 2
1351 Certain operations search or modify the string \*(L"$_\*(R" by default.
1352 This operator makes that kind of operation work on some other string.
1353 The right argument is a search pattern, substitution, or translation.
1354 The left argument is what is supposed to be searched, substituted, or
1355 translated instead of the default \*(L"$_\*(R".
1356 The return value indicates the success of the operation.
1357 (If the right argument is an expression other than a search pattern,
1358 substitution, or translation, it is interpreted as a search pattern
1359 at run time.
1360 This is less efficient than an explicit search, since the pattern must
1361 be compiled every time the expression is evaluated.)
1362 The precedence of this operator is lower than unary minus and autoincrement/decrement, but higher than everything else.
1363 .Ip !~ 8
1364 Just like =~ except the return value is negated.
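.Sp
The sketch below applies a search, a substitution and a translation to
variables other than $_.
The path and variable names are made up:
.nf

```perl
$path = '/usr/local/bin/perl';

$is_perl = ($path =~ /perl$/);              # search pattern: true on a match
$no_tmp  = ($path !~ /^\/tmp/);             # negated: true when there is no match

($copy = $path) =~ s/local/contrib/;        # substitution applied to $copy
($upper = $path) =~ tr/a-z/A-Z/;            # translation applied to $upper

$pat = 'bin';
$hit = ($path =~ $pat);                     # expression used as a run-time pattern

print "$copy\n$upper\n";
```

.fi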
1365 .Ip x 8
1366 The repetition operator.
1367 Returns a string consisting of the left operand repeated the
1368 number of times specified by the right operand.
1369 In an array context, if the left operand is a list in parens, it repeats
1370 the list.
1371 .nf
1372
1373         print \'\-\' x 80;              # print row of dashes
1374         print \'\-\' x80;               # illegal, x80 is identifier
1375
1376         print "\et" x ($tab/8), \' \' x ($tab%8);       # tab over
1377
1378         @ones = (1) x 80;               # an array of 80 1's
1379         @ones = (5) x @ones;            # set all elements to 5
1380
1381 .fi
1382 .Ip x= 8
1383 The repetition assignment operator.
1384 Only works on scalars.
1385 .Ip .\|. 8
1386 The range operator, which is really two different operators depending
1387 on the context.
1388 In an array context, returns an array of values counting (by ones)
1389 from the left value to the right value.
1390 This is useful for writing \*(L"for (1..10)\*(R" loops and for doing
1391 slice operations on arrays.
1392 .Sp
1393 In a scalar context, .\|. returns a boolean value.
1394 The operator is bistable, like a flip-flop.
1395 Each .\|. operator maintains its own boolean state.
1396 It is false as long as its left operand is false.
1397 Once the left operand is true, the range operator stays true
1398 until the right operand is true,
1399 AFTER which the range operator becomes false again.
1400 (It doesn't become false till the next time the range operator is evaluated.
1401 It can become false on the same evaluation it became true, but it still returns
1402 true once.)
1403 The right operand is not evaluated while the operator is in the \*(L"false\*(R" state,
1404 and the left operand is not evaluated while the operator is in the \*(L"true\*(R" state.
1405 The scalar .\|. operator is primarily intended for doing line number ranges
1406 after
1407 the fashion of \fIsed\fR or \fIawk\fR.
1408 The precedence is a little lower than || and &&.
1409 The value returned is either the null string for false, or a sequence number
1410 (beginning with 1) for true.
1411 The sequence number is reset for each range encountered.
1412 The final sequence number in a range has the string \'E0\' appended to it, which
1413 doesn't affect its numeric value, but gives you something to search for if you
1414 want to exclude the endpoint.
1415 You can exclude the beginning point by waiting for the sequence number to be
1416 greater than 1.
1417 If either operand of scalar .\|. is static, that operand is implicitly compared
1418 to the $. variable, the current line number.
1419 Examples:
1420 .nf
1421
1422 .ne 6
1423 As a scalar operator:
1424     if (101 .\|. 200) { print; }        # print 2nd hundred lines
1425
1426     next line if (1 .\|. /^$/); # skip header lines
1427
1428     s/^/> / if (/^$/ .\|. eof());       # quote body
1429
1430 .ne 4
1431 As an array operator:
1432     for (101 .\|. 200) { print; }       # print $_ 100 times
1433
1434     @foo = @foo[$[ .\|. $#foo]; # an expensive no-op
1435     @foo = @foo[$#foo-4 .\|. $#foo];    # slice last 5 items
1436
1437 .fi
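.Sp
To make the sequence numbers and the E0 marker concrete, here is a sketch that
runs the scalar range operator over a made-up list of lines instead of real
input:
.nf

```perl
@lines = ("header\n", "BEGIN\n", "a\n", "b\n", "END\n", "trailer\n");
@seen = ();

foreach (@lines) {
    $seq = (/^BEGIN/ .. /^END/);    # scalar context: flip-flop with sequence number
    next unless $seq;               # null string while outside the range
    chop($line = $_);
    push(@seen, "$seq:$line");
}

print join(' ', @seen), "\n";
```

.fi
Only the last line of the range carries the E0 suffix, so a test such as
$seq !~ /E0$/ would leave the endpoint out.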
1438 .Ip \-x 8
1439 A file test.
1440 This unary operator takes one argument, either a filename or a filehandle,
1441 and tests the associated file to see if something is true about it.
1442 If the argument is omitted, tests $_, except for \-t, which tests
1443 .IR STDIN .
1444 It returns 1 for true and \'\' for false, or the undefined value if the
1445 file doesn't exist.
1446 Precedence is higher than logical and relational operators, but lower than
1447 arithmetic operators.
1448 The operator may be any of:
1449 .nf
1450         \-r     File is readable by effective uid.
1451         \-w     File is writable by effective uid.
1452         \-x     File is executable by effective uid.
1453         \-o     File is owned by effective uid.
1454         \-R     File is readable by real uid.
1455         \-W     File is writable by real uid.
1456         \-X     File is executable by real uid.
1457         \-O     File is owned by real uid.
1458         \-e     File exists.
1459         \-z     File has zero size.
1460         \-s     File has non-zero size (returns size).
1461         \-f     File is a plain file.
1462         \-d     File is a directory.
1463         \-l     File is a symbolic link.
1464         \-p     File is a named pipe (FIFO).
1465         \-S     File is a socket.
1466         \-b     File is a block special file.
1467         \-c     File is a character special file.
1468         \-u     File has setuid bit set.
1469         \-g     File has setgid bit set.
1470         \-k     File has sticky bit set.
1471         \-t     Filehandle is opened to a tty.
1472         \-T     File is a text file.
1473         \-B     File is a binary file (opposite of \-T).
1474         \-M     Age of file in days when script started.
1475         \-A     Same for access time.
1476         \-C     Same for inode change time.
1477
1478 .fi
1479 The interpretation of the file permission operators \-r, \-R, \-w, \-W, \-x and \-X
1480 is based solely on the mode of the file and the uids and gids of the user.
1481 There may be other reasons you can't actually read, write or execute the file.
1482 Also note that, for the superuser, \-r, \-R, \-w and \-W always return 1, and
1483 \-x and \-X return 1 if any execute bit is set in the mode.
1484 Scripts run by the superuser may thus need to do a stat() in order to determine
1485 the actual mode of the file, or temporarily set the uid to something else.
1486 .Sp
1487 Example:
1488 .nf
1489 .ne 7
1490
1491         while (<>) {
1492                 chop;
1493                 next unless \-f $_;     # ignore specials
1494                 .\|.\|.
1495         }
1496
1497 .fi
1498 Note that \-s/a/b/ does not do a negated substitution.
1499 Saying \-exp($foo) still works as expected, however\*(--only single letters
1500 following a minus are interpreted as file tests.
1501 .Sp
1502 The \-T and \-B switches work as follows.
1503 The first block or so of the file is examined for odd characters such as
1504 strange control codes or metacharacters.
1505 If too many odd characters (>10%) are found, it's a \-B file, otherwise it's a \-T file.
1506 Also, any file containing null in the first block is considered a binary file.
1507 If \-T or \-B is used on a filehandle, the current stdio buffer is examined
1508 rather than the first block.
1509 Both \-T and \-B return TRUE on a null file, or a file at EOF when testing
1510 a filehandle.
1511 .PP
1512 If any of the file tests (or either stat operator) are given the special
1513 filehandle consisting of a solitary underline, then the stat structure
1514 of the previous file test (or stat operator) is used, saving a system
1515 call.
1516 (This doesn't work with \-t, and you need to remember that lstat and \-l
1517 will leave values in the stat structure for the symbolic link, not the
1518 real file.)
1519 Example:
1520 .nf
1521
1522         print "Can do.\en" if -r $a || -w _ || -x _;
1523
1524 .ne 9
1525         stat($filename);
1526         print "Readable\en" if -r _;
1527         print "Writable\en" if -w _;
1528         print "Executable\en" if -x _;
1529         print "Setuid\en" if -u _;
1530         print "Setgid\en" if -g _;
1531         print "Sticky\en" if -k _;
1532         print "Text\en" if -T _;
1533         print "Binary\en" if -B _;
1534
1535 .fi
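.PP
A self-contained sketch of a few file tests, reusing the stat structure through
the underline filehandle.
The scratch file name is an assumption:
.nf

```perl
$file = "/tmp/perl_man_ftest.$$";       # hypothetical scratch file
open(OUT, ">$file") || die "Can't create $file: $!";
print OUT "hello\n";
close(OUT);

$exists = -e $file;         # does a stat on the file
$size   = -s _;             # reuses that stat: size in bytes
$plain  = -f _;             # true for a plain file
$isdir  = -d _;             # false here

unlink($file);
$gone = -e $file;           # undefined once the file no longer exists

print "size=$size plain=$plain\n";
print "after unlink: ", (defined($gone) ? 'defined' : 'undefined'), "\n";
```

.fi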
1536 .PP
1537 Here is what C has that
1538 .I perl
1539 doesn't:
1540 .Ip "unary &" 12
1541 Address-of operator.
1542 .Ip "unary *" 12
1543 Dereference-address operator.
1544 .Ip "(TYPE)" 12
1545 Type casting operator.
1546 .PP
1547 Like C,
1548 .I perl
1549 does a certain amount of expression evaluation at compile time, whenever
1550 it determines that all of the arguments to an operator are static and have
1551 no side effects.
1552 In particular, string concatenation happens at compile time between literals that don't do variable substitution.
1553 Backslash interpretation also happens at compile time.
1554 You can say
1555 .nf
1556
1557 .ne 2
1558         \'Now is the time for all\' . "\|\e\|n" .
1559         \'good men to come to.\'
1560
1561 .fi
1562 and this all reduces to one string internally.
1563 .PP
1564 The autoincrement operator has a little extra built-in magic to it.
1565 If you increment a variable that is numeric, or that has ever been used in
1566 a numeric context, you get a normal increment.
1567 If, however, the variable has only been used in string contexts since it
1568 was set, and has a value that is not null and matches the
1569 pattern /^[a\-zA\-Z]*[0\-9]*$/, the increment is done
1570 as a string, preserving each character within its range, with carry:
1571 .nf
1572
1573         print ++($foo = \'99\');        # prints \*(L'100\*(R'
1574         print ++($foo = \'a0\');        # prints \*(L'a1\*(R'
1575         print ++($foo = \'Az\');        # prints \*(L'Ba\*(R'
1576         print ++($foo = \'zz\');        # prints \*(L'aaa\*(R'
1577
1578 .fi
1579 The autodecrement is not magical.
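.PP
A brief sketch contrasting the two operators on string values (the variable
names are arbitrary):
.nf

```perl
$id  = 'a9';
$id++;              # string increment: the digit carries into the letter
$ver = 'Zz';
$ver++;             # carry out of the leftmost character adds a new one
$n   = 'aa';
$n--;               # no magic: 'aa' is treated as the number 0

print "$id $ver $n\n";
```

.fi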
1580 .PP
1581 The range operator (in an array context) makes use of the magical
1582 autoincrement algorithm if the minimum and maximum are strings.
1583 You can say
1584
1585         @alphabet = (\'A\' .. \'Z\');
1586
1587 to get all the letters of the alphabet, or
1588
1589         $hexdigit = (0 .. 9, \'a\' .. \'f\')[$num & 15];
1590
1591 to get a hexadecimal digit, or
1592
1593         @z2 = (\'01\' .. \'31\');  print @z2[$mday];
1594
1595 to get dates with leading zeros.
1596 (If the final value specified is not in the sequence that the magical increment
1597 would produce, the sequence goes until the next value would be longer than
1598 the final value specified.)
1599 .PP
1600 The || and && operators differ from C's in that, rather than returning 0 or 1,
1601 they return the last value evaluated.
1602 Thus, a portable way to find out the home directory might be:
1603 .nf
1604
1605         $home = $ENV{'HOME'} || $ENV{'LOGDIR'} ||
1606             (getpwuid($<))[7] || die "You're homeless!\en";
1607
1608 .fi
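.PP
The same idiom works for any kind of default value.
A minimal sketch, in which the %config array and its keys are invented:
.nf

```perl
%config = ('pager', '', 'editor', 'emacs');     # hypothetical settings

$pager  = $config{'pager'}  || 'more';  # '' is false, so the default is returned
$editor = $config{'editor'} || 'vi';    # 'emacs' is true, so it is returned as is
$both   = 'first' && 'second';          # && also returns the last value evaluated

print "$pager $editor $both\n";
```

.fi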
1609 ''' Beginning of part 2
1610 ''' $RCSfile: perl.man,v $$Revision: 4.0.1.1 $$Date: 91/04/11 17:50:44 $
1611 '''
1612 ''' $Log:       perl.man,v $
1613 ''' Revision 4.0.1.1  91/04/11  17:50:44  lwall
1614 ''' patch1: fixed some typos
1615 '''
1616 ''' Revision 4.0  91/03/20  01:38:08  lwall
1617 ''' 4.0 baseline.
1618 '''
1619 ''' Revision 3.0.1.11  91/01/11  18:17:08  lwall
1620 ''' patch42: fixed some man page entries
1621 '''
1622 ''' Revision 3.0.1.10  90/11/10  01:46:29  lwall
1623 ''' patch38: random cleanup
1624 ''' patch38: added alarm function
1625 '''
1626 ''' Revision 3.0.1.9  90/10/15  18:17:37  lwall
1627 ''' patch29: added caller
1628 ''' patch29: index and substr now have optional 3rd args
1629 ''' patch29: added SysV IPC
1630 '''
1631 ''' Revision 3.0.1.8  90/08/13  22:21:00  lwall
1632 ''' patch28: documented that you can't interpolate $) or $| in pattern
1633 '''
1634 ''' Revision 3.0.1.7  90/08/09  04:27:04  lwall
1635 ''' patch19: added require operator
1636 '''
1637 ''' Revision 3.0.1.6  90/08/03  11:15:29  lwall
1638 ''' patch19: Intermediate diffs for Randal
1639 '''
1640 ''' Revision 3.0.1.5  90/03/27  16:15:17  lwall
1641 ''' patch16: MSDOS support
1642 '''
1643 ''' Revision 3.0.1.4  90/03/12  16:46:02  lwall
1644 ''' patch13: documented behavior of @array = /noparens/
1645 '''
1646 ''' Revision 3.0.1.3  90/02/28  17:55:58  lwall
1647 ''' patch9: grep now returns number of items matched in scalar context
1648 ''' patch9: documented in-place modification capabilities of grep
1649 '''
1650 ''' Revision 3.0.1.2  89/11/17  15:30:16  lwall
1651 ''' patch5: fixed some manual typos and indent problems
1652 '''
1653 ''' Revision 3.0.1.1  89/11/11  04:43:10  lwall
1654 ''' patch2: made some line breaks depend on troff vs. nroff
1655 ''' patch2: example of unshift had args backwards
1656 '''
1657 ''' Revision 3.0  89/10/18  15:21:37  lwall
1658 ''' 3.0 baseline
1659 '''
1660 '''
1661 .PP
1662 Along with the literals and variables mentioned earlier,
1663 the operations in the following section can serve as terms in an expression.
1664 Some of these operations take a LIST as an argument.
1665 Such a list can consist of any combination of scalar arguments or array values;
1666 the array values will be included in the list as if each individual element were
1667 interpolated at that point in the list, forming a longer single-dimensional
1668 array value.
1669 Elements of the LIST should be separated by commas.
1670 If an operation is listed both with and without parentheses around its
1671 arguments, it means you can either use it as a unary operator or
1672 as a function call.
1673 To use it as a function call, the next token on the same line must
1674 be a left parenthesis.
1675 (There may be intervening white space.)
1676 Such a function then has highest precedence, as you would expect from
1677 a function.
1678 If any token other than a left parenthesis follows, then it is a
1679 unary operator, with a precedence depending only on whether it is a LIST
1680 operator or not.
1681 LIST operators have lowest precedence.
1682 All other unary operators have a precedence greater than relational operators
1683 but less than arithmetic operators.
1684 See the section on Precedence.
1685 .Ip "/PATTERN/" 8 4
1686 See m/PATTERN/.
1687 .Ip "?PATTERN?" 8 4
1688 This is just like the /pattern/ search, except that it matches only once between
1689 calls to the
1690 .I reset
1691 operator.
1692 This is a useful optimization when you only want to see the first occurrence of
1693 something in each file of a set of files, for instance.
1694 Only ?? patterns local to the current package are reset.
1695 .Ip "accept(NEWSOCKET,GENERICSOCKET)" 8 2
1696 Does the same thing that the accept system call does.
1697 Returns true if it succeeded, false otherwise.
1698 See example in section on Interprocess Communication.
1699 .Ip "alarm(SECONDS)" 8 4
1700 .Ip "alarm SECONDS" 8
1701 Arranges to have a SIGALRM delivered to this process after the specified number
1702 of seconds (minus 1, actually) have elapsed.  Thus, alarm(15) will cause
1703 a SIGALRM at some point more than 14 seconds in the future.
1704 Only one timer may be counting at once.  Each call disables the previous
1705 timer, and an argument of 0 may be supplied to cancel the previous timer
1706 without starting a new one.
1707 The returned value is the amount of time remaining on the previous timer.
1708 .Ip "atan2(Y,X)" 8 2
1709 Returns the arctangent of Y/X in the range
1710 .if t \-\(*p to \(*p.
1711 .if n \-PI to PI.
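.Sp
Because both arguments are given, the result lands in the right quadrant.
A small worked sketch, in which the conversion to degrees is added only for
readability:
.nf

```perl
$pi  = 4 * atan2(1, 1);                 # atan2(1,1) is PI/4
$deg = atan2(1, -1) * 180 / $pi;        # Y positive, X negative: second quadrant

printf("pi is about %.5f\n", $pi);
printf("angle is %.1f degrees\n", $deg);
```

.fi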
1712 .Ip "bind(SOCKET,NAME)" 8 2
1713 Does the same thing that the bind system call does.
1714 Returns true if it succeeded, false otherwise.
1715 NAME should be a packed address of the proper type for the socket.
1716 See example in section on Interprocess Communication.
1717 .Ip "binmode(FILEHANDLE)" 8 4
1718 .Ip "binmode FILEHANDLE" 8 4
1719 Arranges for the file to be read in \*(L"binary\*(R" mode in operating systems
1720 that distinguish between binary and text files.
1721 Files that are not read in binary mode have CR LF sequences translated
1722 to LF on input and LF translated to CR LF on output.
1723 Binmode has no effect under Unix.
1724 If FILEHANDLE is an expression, the value is taken as the name of
1725 the filehandle.
1726 .Ip "caller(EXPR)"
1727 .Ip "caller"
1728 Returns the context of the current subroutine call:
1729 .nf
1730
1731         ($package,$filename,$line) = caller;
1732
1733 .fi
1734 With EXPR, returns some extra information that the debugger uses to print
1735 a stack trace.  The value of EXPR indicates how many call frames to go
1736 back before the current one.
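.Sp
A minimal sketch of a subroutine that reports where it was called from.
The subroutine name is hypothetical, and the package and file reported depend
on how the script is run:
.nf

```perl
sub where_called {
    local($package, $filename, $line) = caller;
    "called from $package, $filename line $line";
}

$report = &where_called();
print "$report\n";
```

.fi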
1737 .Ip "chdir(EXPR)" 8 2
1738 .Ip "chdir EXPR" 8 2
1739 Changes the working directory to EXPR, if possible.
1740 If EXPR is omitted, changes to the home directory.
1741 Returns 1 upon success, 0 otherwise.
1742 See example under
1743 .IR die .
1744 .Ip "chmod(LIST)" 8 2
1745 .Ip "chmod LIST" 8 2
1746 Changes the permissions of a list of files.
1747 The first element of the list must be the numerical mode.
1748 Returns the number of files successfully changed.
1749 .nf
1750
1751 .ne 2
1752         $cnt = chmod 0755, \'foo\', \'bar\';
1753         chmod 0755, @executables;
1754
1755 .fi
1756 .Ip "chop(LIST)" 8 7
1757 .Ip "chop(VARIABLE)" 8
1758 .Ip "chop VARIABLE" 8
1759 .Ip "chop" 8
1760 Chops off the last character of a string and returns the character chopped.
1761 It's used primarily to remove the newline from the end of an input record,
1762 but is much more efficient than s/\en// because it neither scans nor copies
1763 the string.
1764 If VARIABLE is omitted, chops $_.
1765 Example:
1766 .nf
1767
1768 .ne 5
1769         while (<>) {
1770                 chop;   # avoid \en on last field
1771                 @array = split(/:/);
1772                 .\|.\|.
1773         }
1774
1775 .fi
1776 You can actually chop anything that's an lvalue, including an assignment:
1777 .nf
1778
1779         chop($cwd = \`pwd\`);
1780         chop($answer = <STDIN>);
1781
1782 .fi
1783 If you chop a list, each element is chopped.
1784 Only the value of the last chop is returned.
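.Sp
The return value and the list form can be sketched as follows; the sample strings are made up.

```perl
$line = "alpha:beta:gamma\n";
$removed = chop($line);         # $removed holds the newline, $line loses it

@words = ("one\n", "two\n");
chop(@words);                   # every element of the list is chopped
print "$line / @words\n";
```

Note that chop removes whatever the last character is, so chopping a string that does not end in a newline discards real data.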
1785 .Ip "chown(LIST)" 8 2
1786 .Ip "chown LIST" 8 2
1787 Changes the owner (and group) of a list of files.
1788 The first two elements of the list must be the NUMERICAL uid and gid,
1789 in that order.
1790 Returns the number of files successfully changed.
1791 .nf
1792
1793 .ne 2
1794         $cnt = chown $uid, $gid, \'foo\', \'bar\';
1795         chown $uid, $gid, @filenames;
1796
1797 .fi
1798 .ne 23
1799 Here's an example of looking up non-numeric uids:
1800 .nf
1801
1802         print "User: ";
1803         $user = <STDIN>;
1804         chop($user);
1805         print "Files: ";
1806         $pattern = <STDIN>;
1807         chop($pattern);
1808 .ie t \{\
1809         open(pass, \'/etc/passwd\') || die "Can't open passwd: $!\en";
1810 'br\}
1811 .el \{\
1812         open(pass, \'/etc/passwd\')
1813                 || die "Can't open passwd: $!\en";
1814 'br\}
1815         while (<pass>) {
1816                 ($login,$pass,$uid,$gid) = split(/:/);
1817                 $uid{$login} = $uid;
1818                 $gid{$login} = $gid;
1819         }
1820         @ary = <${pattern}>;    # get filenames
1821         if ($uid{$user} eq \'\') {
1822                 die "$user not in passwd file";
1823         }
1824         else {
1825                 chown $uid{$user}, $gid{$user}, @ary;
1826         }
1827
1828 .fi
1829 .Ip "chroot(FILENAME)" 8 5
1830 .Ip "chroot FILENAME" 8
1831 Does the same as the system call of that name.
1832 If you don't know what it does, don't worry about it.
1833 If FILENAME is omitted, does chroot to $_.
1834 .Ip "close(FILEHANDLE)" 8 5
1835 .Ip "close FILEHANDLE" 8
1836 Closes the file or pipe associated with the file handle.
1837 You don't have to close FILEHANDLE if you are immediately going to
1838 do another open on it, since open will close it for you.
1839 (See
1840 .IR open .)
1841 However, an explicit close on an input file resets the line counter ($.), while
1842 the implicit close done by
1843 .I open
1844 does not.
1845 Also, closing a pipe will wait for the process executing on the pipe to complete,
1846 in case you want to look at the output of the pipe afterwards.
1847 Closing a pipe explicitly also puts the status value of the command into $?.
1848 Example:
1849 .nf
1850
1851 .ne 4
1852         open(OUTPUT, \'|sort >foo\');   # pipe to sort
1853         .\|.\|. # print stuff to output
1854         close OUTPUT;           # wait for sort to finish
1855         open(INPUT, \'foo\');   # get sort's results
1856
1857 .fi
1858 FILEHANDLE may be an expression whose value gives the real filehandle name.
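.Sp
The status value left in $? by an explicit close on a pipe might be examined along these lines.
This is only a sketch: it assumes a Unix-like system with sh available, and the command is chosen purely so that the exit status is predictable.

```perl
# The command below is illustrative; it prints nothing and exits with 3.
open(CMD, "sh -c 'exit 3' |") || die "Can't start command: $!\n";
@output = <CMD>;                # read whatever the command prints
close(CMD);                     # waits for the command and sets $?
$exit_status = $? >> 8;         # the exit status is in the high byte
print "command exited with status $exit_status\n";
```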
1859 .Ip "closedir(DIRHANDLE)" 8 5
1860 .Ip "closedir DIRHANDLE" 8
1861 Closes a directory opened by opendir().
1862 .Ip "connect(SOCKET,NAME)" 8 2
1863 Does the same thing that the connect system call does.
1864 Returns true if it succeeded, false otherwise.
1865 NAME should be a packed address of the proper type for the socket.
1866 See example in section on Interprocess Communication.
1867 .Ip "cos(EXPR)" 8 6
1868 .Ip "cos EXPR" 8 6
1869 Returns the cosine of EXPR (expressed in radians).
1870 If EXPR is omitted, takes cosine of $_.
1871 .Ip "crypt(PLAINTEXT,SALT)" 8 6
1872 Encrypts a string exactly like the crypt() function in the C library.
1873 Useful for checking the password file for lousy passwords.
1874 Only the guys wearing white hats should do this.
1875 .Ip "dbmclose(ASSOC_ARRAY)" 8 6
1876 .Ip "dbmclose ASSOC_ARRAY" 8
1877 Breaks the binding between a dbm file and an associative array.
1878 The values remaining in the associative array are meaningless unless
1879 you happen to want to know what was in the cache for the dbm file.
1880 This function is only useful if you have ndbm.
1881 .Ip "dbmopen(ASSOC,DBNAME,MODE)" 8 6
1882 This binds a dbm or ndbm file to an associative array.
1883 ASSOC is the name of the associative array.
1884 (Unlike normal open, the first argument is NOT a filehandle, even though
1885 it looks like one).
1886 DBNAME is the name of the database (without the .dir or .pag extension).
1887 If the database does not exist, it is created with protection specified
1888 by MODE (as modified by the umask).
1889 If your system only supports the older dbm functions, you may only have one
1890 dbmopen in your program.
1891 If your system has neither dbm nor ndbm, calling dbmopen produces a fatal
1892 error.
1893 .Sp
1894 Values assigned to the associative array prior to the dbmopen are lost.
1895 A certain number of values from the dbm file are cached in memory.
1896 By default this number is 64, but you can increase it by preallocating
1897 that number of garbage entries in the associative array before the dbmopen.
1898 You can flush the cache if necessary with the reset command.
1899 .Sp
1900 If you don't have write access to the dbm file, you can only read
1901 associative array variables, not set them.
1902 If you want to test whether you can write, either use file tests or
1903 try setting a dummy array entry inside an eval, which will trap the error.
1904 .Sp
1905 Note that functions such as keys() and values() may return huge array values
1906 when used on large dbm files.
1907 You may prefer to use the each() function to iterate over large dbm files.
1908 Example:
1909 .nf
1910
1911 .ne 6
1912         # print out history file offsets
1913         dbmopen(HIST,'/usr/lib/news/history',0666);
1914         while (($key,$val) = each %HIST) {
1915                 print $key, ' = ', unpack('L',$val), "\en";
1916         }
1917         dbmclose(HIST);
1918
1919 .fi
1920 .Ip "defined(EXPR)" 8 6
1921 .Ip "defined EXPR" 8
1922 Returns a boolean value saying whether the lvalue EXPR has a real value
1923 or not.
1924 Many operations return the undefined value under exceptional conditions,
1925 such as end of file, uninitialized variable, system error and such.
1926 This function allows you to distinguish between an undefined null string
1927 and a defined null string with operations that might return a real null
1928 string, in particular referencing elements of an array.
1929 You may also check to see if arrays or subroutines exist.
1930 Use on predefined variables is not guaranteed to produce intuitive results.
1931 Examples:
1932 .nf
1933
1934 .ne 7
1935         print if defined $switch{'D'};
1936         print "$val\en" while defined($val = pop(@ary));
1937         die "Can't readlink $sym: $!"
1938                 unless defined($value = readlink $sym);
1939         eval '@foo = ()' if defined(@foo);
1940         die "No XYZ package defined" unless defined %_XYZ;
1941         sub foo { defined &bar ? &bar(@_) : die "No bar"; }
1942
1943 .fi
1944 See also undef.
1945 .Ip "delete $ASSOC{KEY}" 8 6
1946 Deletes the specified value from the specified associative array.
1947 Returns the deleted value, or the undefined value if nothing was deleted.
1948 Deleting from $ENV{} modifies the environment.
1949 Deleting from an array bound to a dbm file deletes the entry from the dbm
1950 file.
1951 .Sp
1952 The following deletes all the values of an associative array:
1953 .nf
1954
1955 .ne 3
1956         foreach $key (keys %ARRAY) {
1957                 delete $ARRAY{$key};
1958         }
1959
1960 .fi
1961 (But it would be faster to use the
1962 .I reset
1963 command.
1964 Saying undef %ARRAY is faster yet.)
1965 .Ip "die(LIST)" 8
1966 .Ip "die LIST" 8
1967 Outside of an eval, prints the value of LIST to
1968 .I STDERR
1969 and exits with the current value of $!
1970 (errno).
1971 If $! is 0, exits with the value of ($? >> 8) (\`command\` status).
1972 If ($? >> 8) is 0, exits with 255.
1973 Inside an eval, the error message is stuffed into $@ and the eval is terminated
1974 with the undefined value.
1975 .Sp
1976 Equivalent examples:
1977 .nf
1978
1979 .ne 3
1980 .ie t \{\
1981         die "Can't cd to spool: $!\en" unless chdir \'/usr/spool/news\';
1982 'br\}
1983 .el \{\
1984         die "Can't cd to spool: $!\en"
1985                 unless chdir \'/usr/spool/news\';
1986 'br\}
1987
1988         chdir \'/usr/spool/news\' || die "Can't cd to spool: $!\en"
1989
1990 .fi
1991 .Sp
1992 If the value of LIST does not end in a newline, the current script line
1993 number and input line number (if any) are also printed, and a newline is
1994 supplied.
1995 Hint: sometimes appending \*(L", stopped\*(R" to your message will cause it to make
1996 better sense when the string \*(L"at foo line 123\*(R" is appended.
1997 Suppose you are running script \*(L"canasta\*(R".
1998 .nf
1999
2000 .ne 7
2001         die "/etc/games is no good";
2002         die "/etc/games is no good, stopped";
2003
2004 produce, respectively
2005
2006         /etc/games is no good at canasta line 123.
2007         /etc/games is no good, stopped at canasta line 123.
2008
2009 .fi
2010 See also
2011 .IR exit .
2012 .Ip "do BLOCK" 8 4
2013 Returns the value of the last command in the sequence of commands indicated
2014 by BLOCK.
2015 When modified by a loop modifier, executes the BLOCK once before testing the
2016 loop condition.
2017 (On other statements the loop modifiers test the conditional first.)
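.Sp
Both behaviors can be seen in a short sketch; the variable names are illustrative.

```perl
$count = 0;
do {
    $count++;                   # the body runs before the condition is tested
} until $count >= 0;            # true on entry, yet the body still ran once

$x = 3;
$y = 7;
$larger = do {                  # value of the last command in the block
    $tmp = $x;
    $tmp = $y if $y > $tmp;
    $tmp;
};
print "count=$count larger=$larger\n";
```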
2018 .Ip "do SUBROUTINE (LIST)" 8 3
2019 Executes a SUBROUTINE declared by a
2020 .I sub
2021 declaration, and returns the value
2022 of the last expression evaluated in SUBROUTINE.
2023 If there is no subroutine by that name, produces a fatal error.
2024 (You may use the \*(L"defined\*(R" operator to determine if a subroutine
2025 exists.)
2026 If you pass arrays as part of LIST you may wish to pass the length
2027 of the array in front of each array.
2028 (See the section on subroutines later on.)
2029 SUBROUTINE may be a scalar variable, in which case the variable contains
2030 the name of the subroutine to execute.
2031 The parentheses are required to avoid confusion with the \*(L"do EXPR\*(R"
2032 form.
2033 .Sp
2034 As an alternate form, you may call a subroutine by prefixing the name with
2035 an ampersand: &foo(@args).
2036 If you aren't passing any arguments, you don't have to use parentheses.
2037 If you omit the parentheses, no @_ array is passed to the subroutine.
2038 The & form is also used to specify subroutines to the defined and undef
2039 operators.
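.Sp
The ampersand form, a subroutine name held in a scalar, and the existence test might look like this.
The subroutine add and the variable names are made up for the example.

```perl
# add is a hypothetical subroutine used only for illustration.
sub add {
    local($x, $y) = @_;
    $x + $y;                    # value of the last expression is returned
}

$sum = &add(2, 3);              # ampersand form with arguments

$name = 'add';                  # scalar containing the subroutine's name
$same = &$name(10, 20);

$exists = defined(&add) ? 'yes' : 'no';
print "$sum $same $exists\n";
```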
2040 .Ip "do EXPR" 8 3
2041 Uses the value of EXPR as a filename and executes the contents of the file
2042 as a
2043 .I perl
2044 script.
2045 Its primary use is to include subroutines from a
2046 .I perl
2047 subroutine library.
2048 .nf
2049
2050         do \'stat.pl\';
2051
2052 is just like
2053
2054         eval \`cat stat.pl\`;
2055
2056 .fi
2057 except that it's more efficient, more concise, keeps track of the current
2058 filename for error messages, and searches all the
2059 .B \-I
2060 libraries if the file
2061 isn't in the current directory (see also the @INC array in Predefined Names).
2062 It's the same, however, in that it does reparse the file every time you
2063 call it, so if you are going to use the file inside a loop you might prefer
2064 to use \-P and #include, at the expense of a little more startup time.
2065 (The main problem with #include is that cpp doesn't grok # comments\*(--a
2066 workaround is to use \*(L";#\*(R" for standalone comments.)
2067 Note that the following are NOT equivalent:
2068 .nf
2069
2070 .ne 2
2071         do $foo;        # eval a file
2072         do $foo();      # call a subroutine
2073
2074 .fi
2075 Note that inclusion of library routines is better done with
2076 the \*(L"require\*(R" operator.
2077 .Ip "dump LABEL" 8 6
2078 This causes an immediate core dump.
2079 Primarily this is so that you can use the undump program to turn your
2080 core dump into an executable binary after having initialized all your
2081 variables at the beginning of the program.
2082 When the new binary is executed it will begin by executing a \*(L"goto LABEL\*(R"
2083 (with all the restrictions that goto suffers).
2084 Think of it as a goto with an intervening core dump and reincarnation.
2085 If LABEL is omitted, restarts the program from the top.
2086 WARNING: any files opened at the time of the dump will NOT be open any more
2087 when the program is reincarnated, with possible resulting confusion on the part
2088 of perl.
2089 See also \-u.
2090 .Sp
2091 Example:
2092 .nf
2093
2094 .ne 16
2095         #!/usr/bin/perl
2096         require 'getopt.pl';
2097         require 'stat.pl';
2098         %days = (
2099             'Sun',1,
2100             'Mon',2,
2101             'Tue',3,
2102             'Wed',4,
2103             'Thu',5,
2104             'Fri',6,
2105             'Sat',7);
2106
2107         dump QUICKSTART if $ARGV[0] eq '-d';
2108
2109     QUICKSTART:
2110         do Getopt('f');
2111
2112 .fi
2113 .Ip "each(ASSOC_ARRAY)" 8 6
2114 .Ip "each ASSOC_ARRAY" 8
2115 Returns a 2-element array consisting of the key and value for the next
2116 value of an associative array, so that you can iterate over it.
2117 Entries are returned in an apparently random order.
2118 When the array is entirely read, a null array is returned (which when
2119 assigned produces a FALSE (0) value).
2120 The next call to each() after that will start iterating again.
2121 The iterator can be reset only by reading all the elements from the array.
2122 You must not modify the array while iterating over it.
2123 There is a single iterator for each associative array, shared by all
2124 each(), keys() and values() function calls in the program.
2125 The following prints out your environment like the printenv program, only
2126 in a different order:
2127 .nf
2128
2129 .ne 3
2130         while (($key,$value) = each %ENV) {
2131                 print "$key=$value\en";
2132         }
2133
2134 .fi
2135 See also keys() and values().
2136 .Ip "eof(FILEHANDLE)" 8 8
2137 .Ip "eof()" 8
2138 .Ip "eof" 8
2139 Returns 1 if the next read on FILEHANDLE will return end of file, or if
2140 FILEHANDLE is not open.
2141 FILEHANDLE may be an expression whose value gives the real filehandle name.
2142 (Note that this function actually reads a character and then ungetc's it,
2143 so it is not very useful in an interactive context.)
2144 An eof without an argument returns the eof status for the last file read.
2145 Empty parentheses () may be used to indicate the pseudo file formed of the
2146 files listed on the command line, i.e. eof() is reasonable to use inside
2147 a while (<>) loop to detect the end of only the last file.
2148 Use eof(ARGV) or eof without the parentheses to test EACH file in a while (<>) loop.
2149 Examples:
2150 .nf
2151
2152 .ne 7
2153         # insert dashes just before last line of last file
2154         while (<>) {
2155                 if (eof()) {
2156                         print "\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\en";
2157                 }
2158                 print;
2159         }
2160
2161 .ne 7
2162         # reset line numbering on each input file
2163         while (<>) {
2164                 print "$.\et$_";
2165                 if (eof) {      # Not eof().
2166                         close(ARGV);
2167                 }
2168         }
2169
2170 .fi
2171 .Ip "eval(EXPR)" 8 6
2172 .Ip "eval EXPR" 8 6
2173 EXPR is parsed and executed as if it were a little
2174 .I perl
2175 program.
2176 It is executed in the context of the current
2177 .I perl
2178 program, so that
2179 any variable settings, subroutine or format definitions remain afterwards.
2180 The value returned is the value of the last expression evaluated, just
2181 as with subroutines.
2182 If there is a syntax error or runtime error, or a die statement is
2183 executed, an undefined value is returned by
2184 eval, and $@ is set to the error message.
2185 If there was no error, $@ is guaranteed to be a null string.
2186 If EXPR is omitted, evaluates $_.
2187 The final semicolon, if any, may be omitted from the expression.
2188 .Sp
2189 Note that, since eval traps otherwise-fatal errors, it is useful for
2190 determining whether a particular feature
2191 (such as dbmopen or symlink) is implemented.
2192 It is also Perl's exception trapping mechanism, where the die operator is
2193 used to raise exceptions.
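.Sp
A minimal sketch of exception trapping with eval and die follows.
The subroutine risky is hypothetical; it raises an exception for negative input.

```perl
# risky is a made-up subroutine that dies on negative input.
sub risky {
    local($n) = @_;
    die "negative input\n" if $n < 0;
    $n * 2;
}

$ok = eval '&risky(21)';        # no error: value of the last expression
$ok_error = $@;                 # null string after a successful eval

$bad = eval '&risky(-1)';       # die inside eval: undefined value returned
$bad_error = $@;                # holds the message passed to die

print "ok=$ok error=$bad_error";
```

Copying $@ right after each eval matters because the next eval resets it.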
2194 .Ip "exec(LIST)" 8 8
2195 .Ip "exec LIST" 8 6
2196 If there is more than one argument in LIST, or if LIST is an array with
2197 more than one value,
2198 calls execvp() with the arguments in LIST.
2199 If there is only one scalar argument, the argument is checked for shell metacharacters.
2200 If there are any, the entire argument is passed to \*(L"/bin/sh \-c\*(R" for parsing.
2201 If there are none, the argument is split into words and passed directly to
2202 execvp(), which is more efficient.
2203 Note: exec (and system) do not flush your output buffer, so you may need to
2204 set $| to avoid lost output.
2205 Examples:
2206 .nf
2207
2208         exec \'/bin/echo\', \'Your arguments are: \', @ARGV;
2209         exec "sort $outfile | uniq";
2210
2211 .fi
2212 .Sp
2213 If you don't really want to execute the first argument, but want to lie
2214 to the program you are executing about its own name, you can specify
2215 the program you actually want to run by assigning that to a variable and
2216 putting the name of the variable in front of the LIST without a comma.
2217 (This always forces interpretation of the LIST as a multi-valued list, even
2218 if there is only a single scalar in the list.)
2219 Example:
2220 .nf
2221
2222 .ne 2
2223         $shell = '/bin/csh';
2224         exec $shell '-sh';              # pretend it's a login shell
2225
2226 .fi
2227 .Ip "exit(EXPR)" 8 6
2228 .Ip "exit EXPR" 8
2229 Evaluates EXPR and exits immediately with that value.
2230 Example:
2231 .nf
2232
2233 .ne 2
2234         $ans = <STDIN>;
2235         exit 0 \|if \|$ans \|=~ \|/\|^[Xx]\|/\|;
2236
2237 .fi
2238 See also
2239 .IR die .
2240 If EXPR is omitted, exits with 0 status.
2241 .Ip "exp(EXPR)" 8 3
2242 .Ip "exp EXPR" 8
2243 Returns
2244 .I e
2245 to the power of EXPR.
2246 If EXPR is omitted, gives exp($_).
2247 .Ip "fcntl(FILEHANDLE,FUNCTION,SCALAR)" 8 4
2248 Implements the fcntl(2) function.
2249 You'll probably have to say
2250 .nf
2251
2252         require "fcntl.ph";     # probably /usr/local/lib/perl/fcntl.ph
2253
2254 .fi
2255 first to get the correct function definitions.
2256 If fcntl.ph doesn't exist or doesn't have the correct definitions
2257 you'll have to roll
2258 your own, based on your C header files such as <sys/fcntl.h>.
2259 (There is a perl script called h2ph that comes with the perl kit
2260 which may help you in this.)
2261 Argument processing and value return work just like ioctl below.
2262 Note that fcntl will produce a fatal error if used on a machine that doesn't implement
2263 fcntl(2).
2264 .Ip "fileno(FILEHANDLE)" 8 4
2265 .Ip "fileno FILEHANDLE" 8 4
2266 Returns the file descriptor for a filehandle.
2267 Useful for constructing bitmaps for select().
2268 If FILEHANDLE is an expression, the value is taken as the name of
2269 the filehandle.
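.Sp
Building a bitmap of descriptors might be sketched like this, using vec() to set one bit per file descriptor.
The variable name $rin is illustrative, and the example stops short of actually calling select().

```perl
$rin = '';
vec($rin, fileno(STDIN), 1) = 1;    # set the bit for STDIN's descriptor
vec($rin, fileno(STDERR), 1) = 1;   # and the bit for STDERR's descriptor
print "descriptors: ", fileno(STDIN), " and ", fileno(STDERR), "\n";
```

The resulting string could then be handed to select() as its read bitmap.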
2270 .Ip "flock(FILEHANDLE,OPERATION)" 8 4
2271 Calls flock(2) on FILEHANDLE.
2272 See manual page for flock(2) for definition of OPERATION.
2273 Returns true for success, false on failure.
2274 Will produce a fatal error if used on a machine that doesn't implement
2275 flock(2).
2276 Here's a mailbox appender for BSD systems.
2277 .nf
2278
2279 .ne 20
2280         $LOCK_SH = 1;
2281         $LOCK_EX = 2;
2282         $LOCK_NB = 4;
2283         $LOCK_UN = 8;
2284
2285         sub lock {
2286             flock(MBOX,$LOCK_EX);
2287             # and, in case someone appended
2288             # while we were waiting...
2289             seek(MBOX, 0, 2);
2290         }
2291
2292         sub unlock {
2293             flock(MBOX,$LOCK_UN);
2294         }
2295
2296         open(MBOX, ">>/usr/spool/mail/$ENV{'USER'}")
2297                 || die "Can't open mailbox: $!";
2298
2299         do lock();
2300         print MBOX $msg,"\en\en";
2301         do unlock();
2302
2303 .fi
2304 .Ip "fork" 8 4
2305 Does a fork() call.
2306 Returns the child pid to the parent process and 0 to the child process.
2307 Note: unflushed buffers remain unflushed in both processes, which means
2308 you may need to set $| to avoid duplicate output.
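.Sp
A typical parent and child split might be sketched as follows.
The exit status 7 is arbitrary, and the test for an undefined return value on failure is an assumption not stated above.

```perl
$| = 1;                         # unbuffered, so nothing is printed twice
$pid = fork;
die "Can't fork: $!\n" unless defined $pid;   # assumption: undefined on failure

if ($pid == 0) {
    # child process
    exit 7;
}

# parent process: $pid is the child's process id
$reaped = wait;                 # wait for the child to finish
$child_status = $? >> 8;        # its exit status
print "child $reaped exited with status $child_status\n";
```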
2309 .Ip "getc(FILEHANDLE)" 8 4
2310 .Ip "getc FILEHANDLE" 8
2311 .Ip "getc" 8
2312 Returns the next character from the input file attached to FILEHANDLE, or
2313 a null string at EOF.
2314 If FILEHANDLE is omitted, reads from STDIN.
2315 .Ip "getlogin" 8 3
2316 Returns the current login from /etc/utmp, if any.
2317 If null, use getpwuid.
.nf
2318
2319         $login = getlogin || (getpwuid($<))[0] || "Somebody";
2320
.fi
2321 .Ip "getpeername(SOCKET)" 8 3
2322 Returns the packed sockaddr address of the other end of the SOCKET connection.
2323 .nf
2324
2325 .ne 4
2326         # An internet sockaddr
2327         $sockaddr = 'S n a4 x8';
2328         $hersockaddr = getpeername(S);
2329 .ie t \{\
2330         ($family, $port, $heraddr) = unpack($sockaddr,$hersockaddr);
2331 'br\}
2332 .el \{\
2333         ($family, $port, $heraddr) =
2334                         unpack($sockaddr,$hersockaddr);
2335 'br\}
2336
2337 .fi
2338 .Ip "getpgrp(PID)" 8 4
2339 .Ip "getpgrp PID" 8
2340 Returns the current process group for the specified PID, 0 for the current
2341 process.
2342 Will produce a fatal error if used on a machine that doesn't implement
2343 getpgrp(2).
2344 If PID is omitted, returns process group of current process.
2345 .Ip "getppid" 8 4
2346 Returns the process id of the parent process.
2347 .Ip "getpriority(WHICH,WHO)" 8 4
2348 Returns the current priority for a process, a process group, or a user.
2349 (See getpriority(2).)
2350 Will produce a fatal error if used on a machine that doesn't implement
2351 getpriority(2).
2352 .Ip "getpwnam(NAME)" 8
2353 .Ip "getgrnam(NAME)" 8
2354 .Ip "gethostbyname(NAME)" 8
2355 .Ip "getnetbyname(NAME)" 8
2356 .Ip "getprotobyname(NAME)" 8
2357 .Ip "getpwuid(UID)" 8
2358 .Ip "getgrgid(GID)" 8
2359 .Ip "getservbyname(NAME,PROTO)" 8
2360 .Ip "gethostbyaddr(ADDR,ADDRTYPE)" 8
2361 .Ip "getnetbyaddr(ADDR,ADDRTYPE)" 8
2362 .Ip "getprotobynumber(NUMBER)" 8
2363 .Ip "getservbyport(PORT,PROTO)" 8
2364 .Ip "getpwent" 8
2365 .Ip "getgrent" 8
2366 .Ip "gethostent" 8
2367 .Ip "getnetent" 8
2368 .Ip "getprotoent" 8
2369 .Ip "getservent" 8
2370 .Ip "setpwent" 8
2371 .Ip "setgrent" 8
2372 .Ip "sethostent(STAYOPEN)" 8
2373 .Ip "setnetent(STAYOPEN)" 8
2374 .Ip "setprotoent(STAYOPEN)" 8
2375 .Ip "setservent(STAYOPEN)" 8
2376 .Ip "endpwent" 8
2377 .Ip "endgrent" 8
2378 .Ip "endhostent" 8
2379 .Ip "endnetent" 8
2380 .Ip "endprotoent" 8
2381 .Ip "endservent" 8
2382 These routines perform the same functions as their counterparts in the
2383 system library.
2384 The return values from the various get routines are as follows:
2385 .nf
2386
2387         ($name,$passwd,$uid,$gid,
2388            $quota,$comment,$gcos,$dir,$shell) = getpw.\|.\|.
2389         ($name,$passwd,$gid,$members) = getgr.\|.\|.
2390         ($name,$aliases,$addrtype,$length,@addrs) = gethost.\|.\|.
2391         ($name,$aliases,$addrtype,$net) = getnet.\|.\|.
2392         ($name,$aliases,$proto) = getproto.\|.\|.
2393         ($name,$aliases,$port,$proto) = getserv.\|.\|.
2394
2395 .fi
2396 The $members value returned by getgr.\|.\|. is a space-separated list
2397 of the login names of the members of the group.
2398 .Sp
2399 The @addrs value returned by the gethost.\|.\|. functions is a list of the
2400 raw addresses returned by the corresponding system library call.
2401 In the Internet domain, each address is four bytes long and you can unpack
2402 it by saying something like:
2403 .nf
2404
2405         ($a,$b,$c,$d) = unpack('C4',$addr[0]);
2406
2407 .fi
2408 .Ip "getsockname(SOCKET)" 8 3
2409 Returns the packed sockaddr address of this end of the SOCKET connection.
2410 .nf
2411
2412 .ne 4
2413         # An internet sockaddr
2414         $sockaddr = 'S n a4 x8';
2415         $mysockaddr = getsockname(S);
2416 .ie t \{\
2417         ($family, $port, $myaddr) = unpack($sockaddr,$mysockaddr);
2418 'br\}
2419 .el \{\
2420         ($family, $port, $myaddr) =
2421                         unpack($sockaddr,$mysockaddr);
2422 'br\}
2423
2424 .fi
2425 .Ip "getsockopt(SOCKET,LEVEL,OPTNAME)" 8 3
2426 Returns the socket option requested, or undefined if there is an error.
2427 .Ip "gmtime(EXPR)" 8 4
2428 .Ip "gmtime EXPR" 8
2429 Converts a time as returned by the time function to a 9-element array with
2430 the time analyzed for the Greenwich timezone.
2431 Typically used as follows:
2432 .nf
2433
2434 .ne 3
2435 .ie t \{\
2436     ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime(time);
2437 'br\}
2438 .el \{\
2439     ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
2440                                                 gmtime(time);
2441 'br\}
2442
2443 .fi
2444 All array elements are numeric, and come straight out of a struct tm.
2445 In particular this means that $mon has the range 0.\|.11 and $wday has the
2446 range 0.\|.6.
2447 If EXPR is omitted, does gmtime(time).
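.Sp
Because the values come straight out of a struct tm, $mon needs 1 added and $year needs 1900 added before printing.
A sketch using time 0 (the start of the Unix epoch), with an illustrative table of day names:

```perl
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime(0);
@dayname = ('Sun','Mon','Tue','Wed','Thu','Fri','Sat');   # indexed by $wday
$stamp = sprintf("%s %04d-%02d-%02d %02d:%02d:%02d",
    $dayname[$wday], $year + 1900, $mon + 1, $mday, $hour, $min, $sec);
print "$stamp\n";               # Thu 1970-01-01 00:00:00
```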
2448 .Ip "goto LABEL" 8 6
2449 Finds the statement labeled with LABEL and resumes execution there.
2450 Currently you may only go to statements in the main body of the program
2451 that are not nested inside a do {} construct.
2452 This statement is not implemented very efficiently, and is here only to make
2453 the
2454 .IR sed -to- perl
2455 translator easier.
2456 I may change its semantics at any time, consistent with support for translated
2457 .I sed
2458 scripts.
2459 Use it at your own risk.
2460 Better yet, don't use it at all.
2461 .Ip "grep(EXPR,LIST)" 8 4
2462 Evaluates EXPR for each element of LIST (locally setting $_ to each element)
2463 and returns the array value consisting of those elements for which the
2464 expression evaluated to true.
2465 In a scalar context, returns the number of times the expression was true.
2466 .nf
2467
2468         @foo = grep(!/^#/, @bar);    # weed out comments
2469
2470 .fi
2471 Note that, since $_ is a reference into the array value, it can be
2472 used to modify the elements of the array.
2473 While this is useful and supported, it can cause bizarre results if
2474 the LIST is not a named array.
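.Sp
The array and scalar results, and modification through $_, can be sketched with made-up data:

```perl
@lines = ("# comment", "alpha", "beta", "# another");
@kept  = grep(!/^#/, @lines);   # array context: the matching elements
$count = grep(/^#/, @lines);    # scalar context: how many matched

@words = ("one", "two");
grep(s/^/> /, @words);          # $_ refers to each element, so @words changes
print "@kept / $count / @words\n";
```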
2475 .Ip "hex(EXPR)" 8 4
2476 .Ip "hex EXPR" 8
2477 Returns the decimal value of EXPR interpreted as a hex string.
2478 (To interpret strings that might start with 0 or 0x see oct().)
2479 If EXPR is omitted, uses $_.
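.Sp
A brief sketch contrasting hex() with oct() for strings that carry a prefix; the sample values are arbitrary.

```perl
$plain    = hex('ff');          # 255
$upper    = hex('1F');          # 31
$prefixed = oct('0x1F');        # oct() understands the 0x prefix: 31
$octal    = oct('017');         # a leading 0 means octal: 15
print "$plain $upper $prefixed $octal\n";
```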
2480 .Ip "index(STR,SUBSTR,POSITION)" 8 4
2481 .Ip "index(STR,SUBSTR)" 8 4
2482 Returns the position of the first occurrence of SUBSTR in STR at or after
2483 POSITION.
2484 If POSITION is omitted, starts searching from the beginning of the string.
2485 The return value is based at 0, or whatever you've
2486 set the $[ variable to.
2487 If the substring is not found, returns one less than the base, ordinarily \-1.
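.Sp
Using POSITION to find successive occurrences might look like this, assuming $[ has been left at its default of 0.

```perl
$str = "hello world";
$first  = index($str, "o");               # 4
$second = index($str, "o", $first + 1);   # 7: search resumes after the first hit
$none   = index($str, "xyz");             # -1: not found
print "$first $second $none\n";
```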
2488 .Ip "int(EXPR)" 8 4
2489 .Ip "int EXPR" 8
2490 Returns the integer portion of EXPR.
2491 If EXPR is omitted, uses $_.
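.Sp
Note that int() truncates toward zero rather than rounding, as this sketch with arbitrary values shows.

```perl
$up      = int(7.9);            # 7, the fraction is discarded
$down    = int(-7.9);           # -7, truncation is toward zero
$rounded = int(7.9 + 0.5);      # 8: adding 0.5 first rounds a positive value
print "$up $down $rounded\n";
```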
2492 .Ip "ioctl(FILEHANDLE,FUNCTION,SCALAR)" 8 4
2493 Implements the ioctl(2) function.
2494 You'll probably have to say
2495 .nf
2496
2497         require "ioctl.ph";     # probably /usr/local/lib/perl/ioctl.ph
2498
2499 .fi
2500 first to get the correct function definitions.
2501 If ioctl.ph doesn't exist or doesn't have the correct definitions
2502 you'll have to roll
2503 your own, based on your C header files such as <sys/ioctl.h>.
2504 (There is a perl script called h2ph that comes with the perl kit
2505 which may help you in this.)
2506 SCALAR will be read and/or written depending on the FUNCTION\*(--a pointer
2507 to the string value of SCALAR will be passed as the third argument of
2508 the actual ioctl call.
2509 (If SCALAR has no string value but does have a numeric value, that value
2510 will be passed rather than a pointer to the string value.
2511 To guarantee this to be true, add a 0 to the scalar before using it.)
2512 The pack() and unpack() functions are useful for manipulating the values
2513 of structures used by ioctl().
2514 The following example sets the erase character to DEL.
2515 .nf
2516
2517 .ne 9
2518         require 'ioctl.ph';
2519         $sgttyb_t = "ccccs";            # 4 chars and a short
2520         if (ioctl(STDIN,$TIOCGETP,$sgttyb)) {
2521                 @ary = unpack($sgttyb_t,$sgttyb);
2522                 $ary[2] = 127;
2523                 $sgttyb = pack($sgttyb_t,@ary);
2524                 ioctl(STDIN,$TIOCSETP,$sgttyb)
2525                         || die "Can't ioctl: $!";
2526         }
2527
2528 .fi
2529 The return value of ioctl (and fcntl) is as follows:
2530 .nf
2531
2532 .ne 4
2533         if OS returns:\h'|3i'perl returns:
2534           -1\h'|3i'  undefined value
2535           0\h'|3i'  string "0 but true"
2536           anything else\h'|3i'  that number
2537
2538 .fi
2539 Thus perl returns true on success and false on failure, yet you can still
2540 easily determine the actual value returned by the operating system:
2541 .nf
2542
2543         ($retval = ioctl(...)) || ($retval = -1);
2544         printf "System returned %d\en", $retval;
2545 .fi
2546 .Ip "join(EXPR,LIST)" 8 8
2547 .Ip "join(EXPR,ARRAY)" 8
2548 Joins the separate strings of LIST or ARRAY into a single string with fields
2549 separated by the value of EXPR, and returns the string.
2550 Example:
2551 .nf
2552
2553 .ie t \{\
2554     $_ = join(\|\':\', $login,$passwd,$uid,$gid,$gcos,$home,$shell);
2555 'br\}
2556 .el \{\
2557     $_ = join(\|\':\',
2558                 $login,$passwd,$uid,$gid,$gcos,$home,$shell);
2559 'br\}
2560
2561 .fi
2562 See
2563 .IR split .
2564 .Ip "keys(ASSOC_ARRAY)" 8 6
2565 .Ip "keys ASSOC_ARRAY" 8
2566 Returns a normal array consisting of all the keys of the named associative
2567 array.
2568 The keys are returned in an apparently random order, but it is the same order
2569 as either the values() or each() function produces (given that the associative array
2570 has not been modified).
2571 Here is yet another way to print your environment:
2572 .nf
2573
2574 .ne 5
2575         @keys = keys %ENV;
2576         @values = values %ENV;
2577         while ($#keys >= 0) {
2578                 print pop(@keys), \'=\', pop(@values), "\en";
2579         }
2580
2581 or how about sorted by key:
2582
2583 .ne 3
2584         foreach $key (sort(keys %ENV)) {
2585                 print $key, \'=\', $ENV{$key}, "\en";
2586         }
2587
2588 .fi
2589 .Ip "kill(LIST)" 8 8
2590 .Ip "kill LIST" 8 2
2591 Sends a signal to a list of processes.
2592 The first element of the list must be the signal to send.
2593 Returns the number of processes successfully signaled.
2594 .nf
2595
2596         $cnt = kill 1, $child1, $child2;
2597         kill 9, @goners;
2598
2599 .fi
2600 If the signal is negative, kills process groups instead of processes.
2601 (On System V, a negative \fIprocess\fR number will also kill process groups,
2602 but that's not portable.)
2603 You may use a signal name in quotes.
2604 .Ip "last LABEL" 8 8
2605 .Ip "last" 8
2606 The
2607 .I last
2608 command is like the
2609 .I break
2610 statement in C (as used in loops); it immediately exits the loop in question.
2611 If the LABEL is omitted, the command refers to the innermost enclosing loop.
2612 The
2613 .I continue
2614 block, if any, is not executed:
2615 .nf
2616
2617 .ne 4
2618         line: while (<STDIN>) {
2619                 last line if /\|^$/;    # exit when done with header
2620                 .\|.\|.
2621         }
2622
2623 .fi
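.Sp
The LABEL form matters most with nested loops.
A minimal sketch, assuming some made-up rows of numbers (the names @rows and
$found are only illustrative), that leaves both loops as soon as a negative
number turns up:
.nf

```perl
# Hypothetical data: find the first row that contains a negative number.
@rows = ('3 7 9', '4 -2 8', '5 6 -1');
$found = '';

ROW: foreach $row (@rows) {
        foreach $n (split(' ', $row)) {
                if ($n < 0) {
                        $found = $row;
                        last ROW;       # leave both loops at once
                }
        }
}
print "first row with a negative: $found\n";
```

.fi
Without the label, last would only leave the inner loop and the outer loop
would go on to examine the remaining rows.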
2624 .Ip "length(EXPR)" 8 4
2625 .Ip "length EXPR" 8
2626 Returns the length in characters of the value of EXPR.
2627 If EXPR is omitted, returns length of $_.
2628 .Ip "link(OLDFILE,NEWFILE)" 8 2
2629 Creates a new filename linked to the old filename.
2630 Returns 1 for success, 0 otherwise.
2631 .Ip "listen(SOCKET,QUEUESIZE)" 8 2
2632 Does the same thing that the listen system call does.
2633 Returns true if it succeeded, false otherwise.
2634 See example in section on Interprocess Communication.
2635 .Ip "local(LIST)" 8 4
2636 Declares the listed variables to be local to the enclosing block,
2637 subroutine, eval or \*(L"do\*(R".
2638 All the listed elements must be legal lvalues.
2639 This operator works by saving the current values of those variables in LIST
2640 on a hidden stack and restoring them upon exiting the block, subroutine or eval.
2641 This means that called subroutines can also reference the local variable,
2642 but not the global one.
2643 The LIST may be assigned to if desired, which allows you to initialize
2644 your local variables.
2645 (If no initializer is given for a particular variable, it is created with
2646 an undefined value.)
2647 Commonly this is used to name the parameters to a subroutine.
2648 Examples:
2649 .nf
2650
2651 .ne 13
2652         sub RANGEVAL {
2653                 local($min, $max, $thunk) = @_;
2654                 local($result) = \'\';
2655                 local($i);
2656
2657                 # Presumably $thunk makes reference to $i
2658
2659                 for ($i = $min; $i < $max; $i++) {
2660                         $result .= eval $thunk;
2661                 }
2662
2663                 $result;
2664         }
2665
2666 .ne 6
2667         if ($sw eq \'-v\') {
2668             # init local array with global array
2669             local(@ARGV) = @ARGV;
2670             unshift(@ARGV,\'echo\');
2671             system @ARGV;
2672         }
2673         # @ARGV restored
2674
2675 .ne 6
2676         # temporarily add to digits associative array
2677         if ($base12) {
2678                 # (NOTE: not claiming this is efficient!)
2679                 local(%digits) = (%digits,'t',10,'e',11);
2680                 do parse_num();
2681         }
2682
2683 .fi
2684 Note that local() is a run-time command, and so gets executed every time
2685 through a loop, using up more stack storage each time until it's all
2686 released at once when the loop is exited.
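.Sp
The statement that called subroutines see the local value can be illustrated
with a small sketch.
The subroutine and variable names here (show, wrapper, $level, $seen) are
invented for the example:
.nf

```perl
$level = 'global';
$seen = '';

sub show {
        $seen .= "$level ";     # sees whichever $level is current
}

sub wrapper {
        local($level) = 'local';
        &show();                # the called subroutine sees 'local'
}

&show();                        # appends "global "
&wrapper();                     # appends "local "
&show();                        # appends "global " again: value restored
print $seen, "\n";
```

.fi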
2687 .Ip "localtime(EXPR)" 8 4
2688 .Ip "localtime EXPR" 8
2689 Converts a time as returned by the time function to a 9-element array with
2690 the time analyzed for the local timezone.
2691 Typically used as follows:
2692 .nf
2693
2694 .ne 3
2695 .ie t \{\
2696     ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
2697 'br\}
2698 .el \{\
2699     ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
2700                                                 localtime(time);
2701 'br\}
2702
2703 .fi
2704 All array elements are numeric, and come straight out of a struct tm.
2705 In particular this means that $mon has the range 0.\|.11, $wday has the
2706 range 0.\|.6 (with 0 meaning Sunday), and $year is the number of years since 1900.
2707 If EXPR is omitted, does localtime(time).
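.Sp
Because the fields come straight out of a struct tm, some of them need
adjusting before they are shown to people.
The sketch below is one possible way to do that; the @dayname table and the
output layout are assumptions made for the example:
.nf

```perl
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);

# Illustrative lookup table; index 0 is Sunday, as in struct tm.
@dayname = ('Sun','Mon','Tue','Wed','Thu','Fri','Sat');

# $mon counts from 0 and $year counts from 1900.
$stamp = sprintf("%s %04d-%02d-%02d %02d:%02d:%02d",
        $dayname[$wday], $year + 1900, $mon + 1, $mday, $hour, $min, $sec);
print $stamp, "\n";
```

.fi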
2708 .Ip "log(EXPR)" 8 4
2709 .Ip "log EXPR" 8
2710 Returns logarithm (base
2711 .IR e )
2712 of EXPR.
2713 If EXPR is omitted, returns log of $_.
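.Sp
Logarithms in another base can be had by dividing two natural logarithms.
A small sketch, in which the subroutine name log10 is merely an illustrative
choice and not a built-in:
.nf

```perl
sub log10 {
        local($n) = @_;
        log($n) / log(10);      # change of base
}

printf "%.3f\n", &log10(1000);
```

.fi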
2714 .Ip "lstat(FILEHANDLE)" 8 6
2715 .Ip "lstat FILEHANDLE" 8
2716 .Ip "lstat(EXPR)" 8
2717 .Ip "lstat SCALARVARIABLE" 8
2718 Does the same thing as the stat() function, but stats a symbolic link
2719 instead of the file the symbolic link points to.
2720 If symbolic links are unimplemented on your system, a normal stat is done.
2721 .Ip "m/PATTERN/io" 8 4
2722 .Ip "/PATTERN/io" 8
2723 Searches a string for a pattern match, and returns true (1) or false (\'\').
2724 If no string is specified via the =~ or !~ operator,
2725 the $_ string is searched.
2726 (The string specified with =~ need not be an lvalue\*(--it may be the result of an expression evaluation, but remember the =~ binds rather tightly.)
2727 See also the section on regular expressions.
2728 .Sp
2729 If / is the delimiter then the initial \*(L'm\*(R' is optional.
2730 With the \*(L'm\*(R' you can use any pair of non-alphanumeric characters
2731 as delimiters.
2732 This is particularly useful for matching Unix path names that contain \*(L'/\*(R'.
2733 If the final delimiter is followed by the optional letter \*(L'i\*(R', the matching is
2734 done in a case-insensitive manner.
2735 PATTERN may contain references to scalar variables, which will be interpolated
2736 (and the pattern recompiled) every time the pattern search is evaluated.
2737 (Note that $) and $| may not be interpolated because they look like end-of-string tests.)
2738 If you want such a pattern to be compiled only once, add an \*(L"o\*(R" after
2739 the trailing delimiter.
2740 This avoids expensive run-time recompilations, and
2741 is useful when the value you are interpolating won't change over the
2742 life of the script.
2743 If the PATTERN evaluates to a null string, the most recent successful
2744 regular expression is used instead.
2745 .Sp
2746 If used in a context that requires an array value, a pattern match returns an
2747 array consisting of the subexpressions matched by the parentheses in the
2748 pattern,
2749 i.e. ($1, $2, $3.\|.\|.).
2750 It does NOT actually set $1, $2, etc. in this case, nor does it set $+, $`, $&
2751 or $'.
2752 If the match fails, a null array is returned.
2753 If the match succeeds, but there were no parentheses, an array value of (1)
2754 is returned.
2755 .Sp
2756 Examples:
2757 .nf
2758
2759 .ne 4
2760     open(tty, \'/dev/tty\');
2761     <tty> \|=~ \|/\|^y\|/i \|&& \|do foo(\|);   # do foo if desired
2762
2763     if (/Version: \|*\|([0\-9.]*\|)\|/\|) { $version = $1; }
2764
2765     next if m#^/usr/spool/uucp#;
2766
2767 .ne 5
2768     # poor man's grep
2769     $arg = shift;
2770     while (<>) {
2771             print if /$arg/o;   # compile only once
2772     }
2773
2774     if (($F1, $F2, $Etc) = ($foo =~ /^(\eS+)\es+(\eS+)\es*(.*)/))
2775
2776 .fi
2777 This last example splits $foo into the first two words and the remainder
2778 of the line, and assigns those three fields to $F1, $F2 and $Etc.
2779 The conditional is true if any variables were assigned, i.e. if the pattern
2780 matched.
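.Sp
The array-context return values described above can be seen side by side in
the following sketch.
The sample string and the variable names are made up for the example:
.nf

```perl
$line = 'Subject: Re: perl question';

# Array context: the parenthesized pieces come back as a list.
($header, $body) = ($line =~ /^(\w+):\s*(.*)/);

# Success with no parentheses in the pattern yields the list (1).
@hit  = ('abc' =~ /b/);
# Failure yields a null array.
@miss = ('abc' =~ /z/);

# Case-insensitive match with a delimiter other than '/'.
$is_reply = ($body =~ m#^re:#i) ? 'yes' : 'no';

print "$header / $body / reply=$is_reply\n";
print scalar(@hit), " ", scalar(@miss), "\n";
```

.fi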
2781 .Ip "mkdir(FILENAME,MODE)" 8 3
2782 Creates the directory specified by FILENAME, with permissions specified by
2783 MODE (as modified by umask).
2784 If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno).
2785 .Ip "msgctl(ID,CMD,ARG)" 8 4
2786 Calls the System V IPC function msgctl.  If CMD is &IPC_STAT, then ARG
2787 must be a variable which will hold the returned msqid_ds structure.
2788 Returns like ioctl: the undefined value for error, "0 but true" for
2789 zero, or the actual return value otherwise.
2790 .Ip "msgget(KEY,FLAGS)" 8 4
2791 Calls the System V IPC function msgget.  Returns the message queue id,
2792 or the undefined value if there is an error.
2793 .Ip "msgsnd(ID,MSG,FLAGS)" 8 4
2794 Calls the System V IPC function msgsnd to send the message MSG to the
2795 message queue ID.  MSG must begin with the long integer message type,
2796 which may be created with pack("L", $type).  Returns true if
2797 successful, or false if there is an error.
2798 .Ip "msgrcv(ID,VAR,SIZE,TYPE,FLAGS)" 8 4
2799 Calls the System V IPC function msgrcv to receive a message from
2800 message queue ID into variable VAR with a maximum message size of
2801 SIZE.  Note that if a message is received, the message type will be
2802 the first thing in VAR, and the maximum length of VAR is SIZE plus the
2803 size of the message type.  Returns true if successful, or false if
2804 there is an error.
2805 ''' Beginning of part 3
2814 '''
2815 ''' Revision 3.0.1.12  91/01/11  18:18:15  lwall
2816 ''' patch42: added binary and hex pack/unpack options
2817 '''
2818 ''' Revision 3.0.1.11  90/11/10  01:48:21  lwall
2819 ''' patch38: random cleanup
2820 ''' patch38: documented tr///cds
2821 '''
2822 ''' Revision 3.0.1.10  90/10/20  02:15:17  lwall
2823 ''' patch37: patch37: fixed various typos in man page
2824 '''
2825 ''' Revision 3.0.1.9  90/10/16  10:02:43  lwall
2826 ''' patch29: you can now read into the middle string
2827 ''' patch29: index and substr now have optional 3rd args
2828 ''' patch29: added scalar reverse
2829 ''' patch29: added scalar
2830 ''' patch29: added SysV IPC
2831 ''' patch29: added waitpid
2832 ''' patch29: added sysread and syswrite
2833 '''
2834 ''' Revision 3.0.1.8  90/08/09  04:39:04  lwall
2835 ''' patch19: added require operator
2836 ''' patch19: added truncate operator
2837 ''' patch19: unpack can do checksumming
2838 '''
2839 ''' Revision 3.0.1.7  90/08/03  11:15:42  lwall
2840 ''' patch19: Intermediate diffs for Randal
2841 '''
2842 ''' Revision 3.0.1.6  90/03/27  16:17:56  lwall
2843 ''' patch16: MSDOS support
2844 '''
2845 ''' Revision 3.0.1.5  90/03/12  16:52:21  lwall
2846 ''' patch13: documented that print $filehandle &foo is ambiguous
2847 ''' patch13: added splice operator: @oldelems = splice(@array,$offset,$len,LIST)
2848 '''
2849 ''' Revision 3.0.1.4  90/02/28  18:00:09  lwall
2850 ''' patch9: added pipe function
2851 ''' patch9: documented how to handle arbitrary weird characters in filenames
2852 ''' patch9: documented the unflushed buffers problem on piped opens
2853 ''' patch9: documented how to force top of page
2854 '''
2855 ''' Revision 3.0.1.3  89/12/21  20:10:12  lwall
2856 ''' patch7: documented that s`pat`repl` does command substitution on replacement
2857 ''' patch7: documented that $timeleft from select() is likely not implemented
2858 '''
2859 ''' Revision 3.0.1.2  89/11/17  15:31:05  lwall
2860 ''' patch5: fixed some manual typos and indent problems
2861 ''' patch5: added warning about print making an array context
2862 '''
2863 ''' Revision 3.0.1.1  89/11/11  04:45:06  lwall
2864 ''' patch2: made some line breaks depend on troff vs. nroff
2865 '''
2866 ''' Revision 3.0  89/10/18  15:21:46  lwall
2867 ''' 3.0 baseline
2868 '''
2869 .Ip "next LABEL" 8 8
2870 .Ip "next" 8
2871 The
2872 .I next
2873 command is like the
2874 .I continue
2875 statement in C; it starts the next iteration of the loop:
2876 .nf
2877
2878 .ne 4
2879         line: while (<STDIN>) {
2880                 next line if /\|^#/;    # discard comments
2881                 .\|.\|.
2882         }
2883
2884 .fi
2885 Note that if there were a
2886 .I continue
2887 block on the above, it would get executed even on discarded lines.
2888 If the LABEL is omitted, the command refers to the innermost enclosing loop.
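.Sp
To make the remark about the continue block concrete, here is a sketch that
counts every line while keeping only the non-comment ones.
The array @lines stands in for real input and is invented for the example:
.nf

```perl
@lines = ("# comment\n", "alpha\n", "# another\n", "beta\n");
$count = 0;
$kept = '';

while ($line = shift(@lines)) {
        next if $line =~ /^#/;  # discard comments
        $kept .= $line;
}
continue {
        $count++;               # runs for every line, discarded or not
}
print "kept ", length($kept), " bytes from $count lines\n";
```

.fi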
2889 .Ip "oct(EXPR)" 8 4
2890 .Ip "oct EXPR" 8
2891 Returns the decimal value of EXPR interpreted as an octal string.
2892 (If EXPR happens to start off with 0x, interprets it as a hex string instead.)
2893 The following will handle decimal, octal and hex in the standard notation:
2894 .nf
2895
2896         $val = oct($val) if $val =~ /^0/;
2897
2898 .fi
2899 If EXPR is omitted, uses $_.
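.Sp
The one-line idiom above can be wrapped in a subroutine when several values
need converting.
The name tonum is hypothetical, chosen only for this sketch:
.nf

```perl
sub tonum {
        local($val) = @_;
        $val = oct($val) if $val =~ /^0/;       # leading 0: octal or hex
        $val + 0;
}

print &tonum('755'), "\n";      # decimal: 755
print &tonum('0755'), "\n";     # octal:   493
print &tonum('0x1ff'), "\n";    # hex:     511
```

.fi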
2900 .Ip "open(FILEHANDLE,EXPR)" 8 8
2901 .Ip "open(FILEHANDLE)" 8
2902 .Ip "open FILEHANDLE" 8
2903 Opens the file whose filename is given by EXPR, and associates it with
2904 FILEHANDLE.
2905 If FILEHANDLE is an expression, its value is used as the name of the
2906 real filehandle wanted.
2907 If EXPR is omitted, the scalar variable of the same name as the FILEHANDLE
2908 contains the filename.
2909 If the filename begins with \*(L"<\*(R" or nothing, the file is opened for
2910 input.
2911 If the filename begins with \*(L">\*(R", the file is opened for output.
2912 If the filename begins with \*(L">>\*(R", the file is opened for appending.
2913 (You can put a \'+\' in front of the \'>\' or \'<\' to indicate that you
2914 want both read and write access to the file.)
2915 If the filename begins with \*(L"|\*(R", the filename is interpreted
2916 as a command to which output is to be piped, and if the filename ends
2917 with a \*(L"|\*(R", the filename is interpreted as a command which pipes
2918 input to us.
2919 (You may not have a command that pipes both in and out.)
2920 Opening \'\-\' opens
2921 .I STDIN
2922 and opening \'>\-\' opens
2923 .IR STDOUT .
2924 Open returns non-zero upon success, the undefined value otherwise.
2925 If the open involved a pipe, the return value happens to be the pid
2926 of the subprocess.
2927 Examples:
2928 .nf
2929
2930 .ne 3
2931         $article = 100;
2932         open article || die "Can't find article $article: $!\en";
2933         while (<article>) {\|.\|.\|.
2934
2935 .ie t \{\
2936         open(LOG, \'>>/usr/spool/news/twitlog\'\|);     # (log is reserved)
2937 'br\}
2938 .el \{\
2939         open(LOG, \'>>/usr/spool/news/twitlog\'\|);
2940                                         # (log is reserved)
2941 'br\}
2942
2943 .ie t \{\
2944         open(article, "caesar <$article |"\|);          # decrypt article
2945 'br\}
2946 .el \{\
2947         open(article, "caesar <$article |"\|);
2948                                         # decrypt article
2949 'br\}
2950
2951 .ie t \{\
2952         open(extract, "|sort >/tmp/Tmp$$"\|);           # $$ is our process#
2953 'br\}
2954 .el \{\
2955         open(extract, "|sort >/tmp/Tmp$$"\|);
2956                                         # $$ is our process#
2957 'br\}
2958
2959 .ne 7
2960         # process argument list of files along with any includes
2961
2962         foreach $file (@ARGV) {
2963                 do process($file, \'fh00\');    # no pun intended
2964         }
2965
2966         sub process {
2967                 local($filename, $input) = @_;
2968                 $input++;               # this is a string increment
2969                 unless (open($input, $filename)) {
2970                         print STDERR "Can't open $filename: $!\en";
2971                         return;
2972                 }
2973 .ie t \{\
2974                 while (<$input>) {              # note the use of indirection
2975 'br\}
2976 .el \{\
2977                 while (<$input>) {              # note use of indirection
2978 'br\}
2979                         if (/^#include "(.*)"/) {
2980                                 do process($1, $input);
2981                                 next;
2982                         }
2983                         .\|.\|.         # whatever
2984                 }
2985         }
2986
2987 .fi
2988 You may also, in the Bourne shell tradition, specify an EXPR beginning
2989 with \*(L">&\*(R", in which case the rest of the string
2990 is interpreted as the name of a filehandle
2991 (or file descriptor, if numeric) which is to be duped and opened.
2992 You may use & after >, >>, <, +>, +>> and +<.
2993 The mode you specify should match the mode of the original filehandle.
2994 Here is a script that saves, redirects, and restores
2995 .I STDOUT
2996 and
2997 .IR STDERR :
2998 .nf
2999
3000 .ne 21
3001         #!/usr/bin/perl
3002         open(SAVEOUT, ">&STDOUT");
3003         open(SAVEERR, ">&STDERR");
3004
3005         open(STDOUT, ">foo.out") || die "Can't redirect stdout";
3006         open(STDERR, ">&STDOUT") || die "Can't dup stdout";
3007
3008         select(STDERR); $| = 1;         # make unbuffered
3009         select(STDOUT); $| = 1;         # make unbuffered
3010
3011         print STDOUT "stdout 1\en";     # this works for
3012         print STDERR "stderr 1\en";     # subprocesses too
3013
3014         close(STDOUT);
3015         close(STDERR);
3016
3017         open(STDOUT, ">&SAVEOUT");
3018         open(STDERR, ">&SAVEERR");
3019
3020         print STDOUT "stdout 2\en";
3021         print STDERR "stderr 2\en";
3022
3023 .fi
3024 If you open a pipe on the command \*(L"\-\*(R", i.e. either \*(L"|\-\*(R" or \*(L"\-|\*(R",
3025 then there is an implicit fork done, and the return value of open
3026 is the pid of the child within the parent process, and 0 within the child
3027 process.
3028 (Use defined($pid) to determine if the open was successful.)
3029 The filehandle behaves normally for the parent, but i/o to that
3030 filehandle is piped from/to the
3031 .IR STDOUT / STDIN
3032 of the child process.
3033 In the child process the filehandle isn't opened\*(--i/o happens from/to
3034 the new
3035 .I STDOUT
3036 or
3037 .IR STDIN .
3038 Typically this is used like the normal piped open when you want to exercise
3039 more control over just how the pipe command gets executed, such as when
3040 you are running setuid, and don't want to have to scan shell commands
3041 for metacharacters.
3042 The following pairs are more or less equivalent:
3043 .nf
3044
3045 .ne 5
3046         open(FOO, "|tr \'[a\-z]\' \'[A\-Z]\'");
3047         open(FOO, "|\-") || exec \'tr\', \'[a\-z]\', \'[A\-Z]\';
3048
3049         open(FOO, "cat \-n '$file'|");
3050         open(FOO, "\-|") || exec \'cat\', \'\-n\', $file;
3051
3052 .fi
3053 Explicitly closing any piped filehandle causes the parent process to wait for the
3054 child to finish, and returns the status value in $?.
3055 Note: on any operation which may do a fork,
3056 unflushed buffers remain unflushed in both
3057 processes, which means you may need to set $| to
3058 avoid duplicate output.
3059 .Sp
3060 The filename that is passed to open will have leading and trailing
3061 whitespace deleted.
3062 In order to open a file with arbitrary weird characters in it, it's necessary
3063 to protect any leading and trailing whitespace thusly:
3064 .nf
3065
3066 .ne 2
3067         $file =~ s#^(\es)#./$1#;
3068         open(FOO, "< $file\e0");
3069
3070 .fi
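.Sp
As a compact illustration of the output, append and input modes together,
the following sketch writes a scratch file, appends to it and reads it back.
The file name under /tmp is an assumption made for the example:
.nf

```perl
$file = "/tmp/open_demo.$$";    # hypothetical scratch file name

open(OUT, ">$file")  || die "Can't create $file: $!\n";
print OUT "first\n";
close(OUT);

open(OUT, ">>$file") || die "Can't append to $file: $!\n";
print OUT "second\n";
close(OUT);

open(IN, $file)      || die "Can't read $file: $!\n";
@lines = <IN>;                  # array context: all lines at once
close(IN);

unlink($file);
print "read back ", scalar(@lines), " lines\n";
```

.fi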
3071 .Ip "opendir(DIRHANDLE,EXPR)" 8 3
3072 Opens a directory named EXPR for processing by readdir(), telldir(), seekdir(),
3073 rewinddir() and closedir().
3074 Returns true if successful.
3075 DIRHANDLEs have their own namespace separate from FILEHANDLEs.
3076 .Ip "ord(EXPR)" 8 4
3077 .Ip "ord EXPR" 8
3078 Returns the numeric ascii value of the first character of EXPR.
3079 If EXPR is omitted, uses $_.
3080 ''' Comments on f & d by gnb@melba.bby.oz.au    22/11/89
3081 .Ip "pack(TEMPLATE,LIST)" 8 4
3082 Takes an array or list of values and packs it into a binary structure,
3083 returning the string containing the structure.
3084 The TEMPLATE is a sequence of characters that give the order and type
3085 of values, as follows:
3086 .nf
3087
3088         A       An ascii string, will be space padded.
3089         a       An ascii string, will be null padded.
3090         c       A signed char value.
3091         C       An unsigned char value.
3092         s       A signed short value.
3093         S       An unsigned short value.
3094         i       A signed integer value.
3095         I       An unsigned integer value.
3096         l       A signed long value.
3097         L       An unsigned long value.
3098         n       A short in \*(L"network\*(R" order.
3099         N       A long in \*(L"network\*(R" order.
3100         f       A single-precision float in the native format.
3101         d       A double-precision float in the native format.
3102         p       A pointer to a string.
3103         x       A null byte.
3104         X       Back up a byte.
3105         @       Null fill to absolute position.
3106         u       A uuencoded string.
3107         b       A bit string (ascending bit order, like vec()).
3108         B       A bit string (descending bit order).
3109         h       A hex string (low nybble first).
3110         H       A hex string (high nybble first).
3111
3112 .fi
3113 Each letter may optionally be followed by a number which gives a repeat
3114 count.
3115 With all types except "a", "A", "b", "B", "h" and "H",
3116 the pack function will gobble up that many values
3117 from the LIST.
3118 A * for the repeat count means to use however many items are left.
3119 The "a" and "A" types gobble just one value, but pack it as a string of length
3120 count,
3121 padding with nulls or spaces as necessary.
3122 (When unpacking, "A" strips trailing spaces and nulls, but "a" does not.)
3123 Likewise, the "b" and "B" fields pack a string that many bits long.
3124 The "h" and "H" fields pack a string that many nybbles long.
3125 Real numbers (floats and doubles) are in the native machine format
3126 only; due to the multiplicity of floating formats around, and the lack
3127 of a standard \*(L"network\*(R" representation, no facility for
3128 interchange has been made.
3129 This means that packed floating point data
3130 written on one machine may not be readable on another - even if both
3131 use IEEE floating point arithmetic (as the endian-ness of the memory
3132 representation is not part of the IEEE spec).
3133 Note that perl uses
3134 doubles internally for all numeric calculation, and converting from
3135 double -> float -> double will lose precision (i.e. unpack("f",
3136 pack("f", $foo)) will not in general equal $foo).
3137 .br
3138 Examples:
3139 .nf
3140
3141         $foo = pack("cccc",65,66,67,68);
3142         # foo eq "ABCD"
3143         $foo = pack("c4",65,66,67,68);
3144         # same thing
3145
3146         $foo = pack("ccxxcc",65,66,67,68);
3147         # foo eq "AB\e0\e0CD"
3148
3149         $foo = pack("s2",1,2);
3150         # "\e1\e0\e2\e0" on little-endian
3151         # "\e0\e1\e0\e2" on big-endian
3152
3153         $foo = pack("a4","abcd","x","y","z");
3154         # "abcd"
3155
3156         $foo = pack("aaaa","abcd","x","y","z");
3157         # "axyz"
3158
3159         $foo = pack("a14","abcdefg");
3160         # "abcdefg\e0\e0\e0\e0\e0\e0\e0"
3161
3162         $foo = pack("i9pl", gmtime);
3163         # a real struct tm (on my system anyway)
3164
3165         sub bintodec {
3166             unpack("N", pack("B32", substr("0" x 32 . shift, -32)));
3167         }
3168 .fi
3169 The same template may generally also be used in the unpack function.
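.Sp
A sketch of such a round trip through pack and unpack follows.
The record layout (an 8-byte name followed by a short and a long in network
order) is invented purely for illustration:
.nf

```perl
# Hypothetical layout: 8-byte name, 16-bit id, 32-bit size.
$template = "A8nN";
$rec = pack($template, 'perl', 258, 16909060);

($name, $id, $size) = unpack($template, $rec);

print length($rec), " bytes\n";         # 8 + 2 + 4 = 14
print "$name $id $size\n";              # "A" stripped the padding
```

.fi
Because n and N fix the byte order, the packed string is the same on
little-endian and big-endian machines, unlike the s2 example above.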
3170 .Ip "pipe(READHANDLE,WRITEHANDLE)" 8 3
3171 Opens a pair of connected pipes like the corresponding system call.
3172 Note that if you set up a loop of piped processes, deadlock can occur
3173 unless you are very careful.
3174 In addition, note that perl's pipes use stdio buffering, so you may need
3175 to set $| to flush your WRITEHANDLE after each command, depending on
3176 the application.
3177 [Requires version 3.0 patchlevel 9.]
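.Sp
A minimal sketch of a pipe shared between a parent and a forked child is
shown below.
It assumes a system where fork is available; the handle names READER and
WRITER are arbitrary:
.nf

```perl
pipe(READER, WRITER) || die "Can't create pipe: $!\n";

$pid = fork;
die "Can't fork: $!\n" unless defined($pid);

if ($pid == 0) {                # child: write one line and exit
        close(READER);
        print WRITER "hello from child $$\n";
        close(WRITER);          # flushes the stdio buffer
        exit 0;
}

close(WRITER);                  # parent: close its copy to see EOF
$line = <READER>;
close(READER);
wait;                           # reap the child

print "parent got: $line";
```

.fi
Each process closes the end it does not use; if the parent kept WRITER open,
a read to end of file on READER would never finish.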
3178 .Ip "pop(ARRAY)" 8 6
3179 .Ip "pop ARRAY" 8
3180 Pops and returns the last value of the array, shortening the array by 1.
3181 Has the same effect as
3182 .nf
3183
3184         $tmp = $ARRAY[$#ARRAY\-\|\-];
3185
3186 .fi
3187 If there are no elements in the array, returns the undefined value.
3188 .Ip "print(FILEHANDLE LIST)" 8 10
3189 .Ip "print(LIST)" 8
3190 .Ip "print FILEHANDLE LIST" 8
3191 .Ip "print LIST" 8
3192 .Ip "print" 8
3193 Prints a string or a comma-separated list of strings.
3194 Returns non-zero if successful.
3195 FILEHANDLE may be a scalar variable name, in which case the variable contains
3196 the name of the filehandle, thus introducing one level of indirection.
3197 (NOTE: If FILEHANDLE is a variable and the next token is a term, it may be
3198 misinterpreted as an operator unless you interpose a + or put parens around
3199 the arguments.)
3200 If FILEHANDLE is omitted, prints by default to standard output (or to the
3201 last selected output channel\*(--see select()).
3202 If LIST is also omitted, prints $_ to
3203 .IR STDOUT .
3204 To set the default output channel to something other than
3205 .I STDOUT
3206 use the select operation.
3207 Note that, because print takes a LIST, anything in the LIST is evaluated
3208 in an array context, and any subroutine that you call will have one or more
3209 of its expressions evaluated in an array context.
3210 Also be careful not to follow the print keyword with a left parenthesis
3211 unless you want the corresponding right parenthesis to terminate the
3212 arguments to the print\*(--interpose a + or put parens around all the arguments.
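.Sp
Both the filehandle indirection and the parenthesis pitfall can be seen in
the following sketch.
It writes to a scratch file (the name under /tmp is an assumption) so that
the result can be read back and inspected:
.nf

```perl
$file = "/tmp/print_demo.$$";   # hypothetical scratch file name
open(OUT, ">$file") || die "Can't create $file: $!\n";

$fh = 'OUT';
print $fh "via a variable\n";   # one level of indirection

$old = select(OUT);             # make OUT the default output channel
print (1+2)*3;                  # prints "3": the parens end the arguments
print "\n";
print +(1+2)*3, "\n";           # prints "9"
print((1+2)*3, "\n");           # prints "9"
select($old);
close(OUT);

open(IN, $file) || die "Can't read $file: $!\n";
@out = <IN>;
close(IN);
unlink($file);
print @out;
```

.fi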
3213 .Ip "printf(FILEHANDLE LIST)" 8 10
3214 .Ip "printf(LIST)" 8
3215 .Ip "printf FILEHANDLE LIST" 8
3216 .Ip "printf LIST" 8
3217 Equivalent to a \*(L"print FILEHANDLE sprintf(LIST)\*(R".
3218 .Ip "push(ARRAY,LIST)" 8 7
3219 Treats ARRAY (@ is optional) as a stack, and pushes the values of LIST
3220 onto the end of ARRAY.
3221 The length of ARRAY increases by the length of LIST.
3222 Has the same effect as
3223 .nf
3224
3225     for $value (LIST) {
3226             $ARRAY[++$#ARRAY] = $value;
3227     }
3228
3229 .fi
3230 but is more efficient.
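.Sp
Used together with pop, push gives a simple stack.
A short sketch, with array and variable names chosen only for illustration:
.nf

```perl
@stack = ();
push(@stack, 'a');
push(@stack, 'b', 'c');         # several values at once

$top  = pop(@stack);            # 'c'
$next = pop(@stack);            # 'b'
print "popped $top then $next; ", scalar(@stack), " left\n";

@empty = ();
$none = pop(@empty);            # nothing to pop
print "pop of an empty array is undefined\n" unless defined($none);
```

.fi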
3231 .Ip "q/STRING/" 8 5
3232 .Ip "qq/STRING/" 8
3233 .Ip "qx/STRING/" 8
3234 These are not really functions, but simply syntactic sugar to let you
3235 avoid putting too many backslashes into quoted strings.
3236 The q operator is a generalized single quote, and the qq operator a
3237 generalized double quote.
3238 The qx operator is a generalized backquote.
3239 Any non-alphanumeric delimiter can be used in place of /, including newline.
3240 If the delimiter is an opening bracket or parenthesis, the final delimiter
3241 will be the corresponding closing bracket or parenthesis.
3242 (Embedded occurrences of the closing bracket need to be backslashed as usual.)
3243 Examples:
3244 .nf
3245
3246 .ne 5
3247         $foo = q!I said, "You said, \'She said it.\'"!;
3248         $bar = q(\'This is it.\');
3249         $today = qx{ date };
3250         $_ .= qq
3251 *** The previous line contains the naughty word "$&".\en
3252                 if /(ibm|apple|awk)/;      # :-)
3253
3254 .fi
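.Sp
The difference between q and qq shows up as soon as the string contains a
variable.
The following sketch uses made-up strings and leaves out qx, since that
would run an external command:
.nf

```perl
$who = 'world';

$single = q(It's "quoted" but $who is not interpolated);
$double = qq{It's "quoted" and $who is interpolated};
$path   = q|/usr/local/bin|;    # no need to escape the slashes

print "$single\n$double\n$path\n";
```

.fi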
3255 .Ip "rand(EXPR)" 8 8
3256 .Ip "rand EXPR" 8
3257 .Ip "rand" 8
3258 Returns a random fractional number between 0 and the value of EXPR.
3259 (EXPR should be positive.)
3260 If EXPR is omitted, returns a value between 0 and 1.
3261 See also srand().
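.Sp
A common use is to turn the fraction into an integer with int.
The sketch below assumes that rand(EXPR) returns a value strictly less than
EXPR; the variable names are illustrative only:
.nf

```perl
srand(time);                            # seed once, near the start

$roll = int(rand(6)) + 1;               # an integer from 1 to 6
@colors = ('red', 'green', 'blue');
$pick = $colors[int(rand(3))];          # one element chosen at random

print "rolled $roll, picked $pick\n";
```

.fi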
3262 .Ip "read(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
3263 .Ip "read(FILEHANDLE,SCALAR,LENGTH)" 8 5
3264 Attempts to read LENGTH bytes of data into variable SCALAR from the specified
3265 FILEHANDLE.
3266 Returns the number of bytes actually read, or undef if there was an error.
3267 SCALAR will be grown or shrunk to the length actually read.
3268 An OFFSET may be specified to place the read data at some other place
3269 than the beginning of the string.
3270 This call is actually implemented in terms of stdio's fread call.  To get
3271 a true read system call, see sysread.
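.Sp
The effect of OFFSET, and of a short read at end of file, can be sketched
as follows.
The scratch file name and its ten-byte contents are assumptions made for the
example:
.nf

```perl
$file = "/tmp/read_demo.$$";    # hypothetical scratch file name
open(OUT, ">$file") || die "Can't create $file: $!\n";
print OUT "abcdefghij";
close(OUT);

open(IN, $file) || die "Can't read $file: $!\n";
$got1 = read(IN, $buf, 4);      # $buf is "abcd"
$got2 = read(IN, $buf, 4, 4);   # placed at offset 4: "abcdefgh"
$both = $buf;
$got3 = read(IN, $buf, 4);      # only 2 bytes remain: $buf is "ij"
close(IN);
unlink($file);

print "$got1 $got2 $got3 $both $buf\n";
```

.fi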
3272 .Ip "readdir(DIRHANDLE)" 8 3
3273 .Ip "readdir DIRHANDLE" 8
3274 Returns the next directory entry for a directory opened by opendir().
3275 If used in an array context, returns all the rest of the entries in the
3276 directory.
3277 If there are no more entries, returns an undefined value in a scalar context
3278 or a null list in an array context.
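.Sp
A sketch of a complete directory listing, using opendir, readdir in an array
context, and closedir, is given below.
So that it is self-contained it first creates a scratch directory with two
empty files; all of the names are hypothetical:
.nf

```perl
$dir = "/tmp/readdir_demo.$$";  # hypothetical scratch directory
mkdir($dir, 0755) || die "Can't mkdir $dir: $!\n";
foreach $name ('b.txt', 'a.txt') {
        open(F, ">$dir/$name") || die "Can't create $dir/$name: $!\n";
        close(F);
}

opendir(DIR, $dir) || die "Can't opendir $dir: $!\n";
@names = grep(!/^\.\.?$/, readdir(DIR));        # drop "." and ".."
closedir(DIR);
@names = sort(@names);          # readdir order is not alphabetical

unlink("$dir/a.txt", "$dir/b.txt");
rmdir($dir);
print join(' ', @names), "\n";
```

.fi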
3279 .Ip "readlink(EXPR)" 8 6
3280 .Ip "readlink EXPR" 8
3281 Returns the value of a symbolic link, if symbolic links are implemented.
3282 If not, gives a fatal error.
3283 If there is some system error, returns the undefined value and sets $! (errno).
3284 If EXPR is omitted, uses $_.
3285 .Ip "recv(SOCKET,SCALAR,LEN,FLAGS)" 8 4
3286 Receives a message on a socket.
3287 Attempts to receive LEN bytes of data into variable SCALAR from the specified
3288 SOCKET filehandle.
3289 Returns the address of the sender, or the undefined value if there's an error.
3290 SCALAR will be grown or shrunk to the length actually read.
3291 Takes the same flags as the system call of the same name.
3292 .Ip "redo LABEL" 8 8
3293 .Ip "redo" 8
3294 The
3295 .I redo
3296 command restarts the loop block without evaluating the conditional again.
3297 The
3298 .I continue
3299 block, if any, is not executed.
3300 If the LABEL is omitted, the command refers to the innermost enclosing loop.
3301 This command is normally used by programs that want to lie to themselves
3302 about what was just input:
3303 .nf
3304
3305 .ne 16
3306         # a simpleminded Pascal comment stripper
3307         # (warning: assumes no { or } in strings)
3308         line: while (<STDIN>) {
3309                 while (s|\|({.*}.*\|){.*}|$1 \||) {}
3310                 s|{.*}| \||;
3311                 if (s|{.*| \||) {
3312                         $front = $_;
3313                         while (<STDIN>) {
3314                                 if (\|/\|}/\|) {        # end of comment?
3315                                         s|^|$front{|;
3316                                         redo line;
3317                                 }
3318                         }
3319                 }
3320                 print;
3321         }
3322
3323 .fi
3324 .Ip "rename(OLDNAME,NEWNAME)" 8 2
3325 Changes the name of a file.
3326 Returns 1 for success, 0 otherwise.
3327 Will not work across filesystem boundaries.
3328 .Ip "require(EXPR)" 8 6
3329 .Ip "require EXPR" 8
3330 .Ip "require" 8
3331 Includes the library file specified by EXPR, or by $_ if EXPR is not supplied.
3332 Has semantics similar to the following subroutine:
3333 .nf
3334
3335         sub require {
3336             local($filename) = @_;
3337             return 1 if $INC{$filename};
3338             local($realfilename,$result);
3339             ITER: {
3340                 foreach $prefix (@INC) {
3341                     $realfilename = "$prefix/$filename";
3342                     if (-f $realfilename) {
3343                         $result = do $realfilename;
3344                         last ITER;
3345                     }
3346                 }
3347                 die "Can't find $filename in \e@INC";
3348             }
3349             die $@ if $@;
3350             die "$filename did not return true value" unless $result;
3351             $INC{$filename} = $realfilename;
3352             $result;
3353         }
3354
3355 .fi
3356 Note that the file will not be included twice under the same specified name.
3357 .Ip "reset(EXPR)" 8 6
3358 .Ip "reset EXPR" 8
3359 .Ip "reset" 8
3360 Generally used in a
3361 .I continue
3362 block at the end of a loop to clear variables and reset ?? searches
3363 so that they work again.
3364 The expression is interpreted as a list of single characters (hyphens allowed
3365 for ranges).
3366 All variables and arrays beginning with one of those letters are reset to
3367 their pristine state.
3368 If the expression is omitted, one-match searches (?pattern?) are reset to
3369 match again.
3370 Only resets variables or searches in the current package.
3371 Always returns 1.
3372 Examples:
3373 .nf
3374
3375 .ne 3
3376     reset \'X\';        \h'|2i'# reset all X variables
3377     reset \'a\-z\';\h'|2i'# reset lower case variables
3378     reset;      \h'|2i'# just reset ?? searches
3379
3380 .fi
3381 Note: resetting \*(L"A\-Z\*(R" is not recommended since you'll wipe out your ARGV and ENV
3382 arrays.
3383 .Sp
3384 The use of reset on dbm associative arrays does not change the dbm file.
3385 (It does, however, flush any entries cached by perl, which may be useful if
3386 you are sharing the dbm file.
3387 Then again, maybe not.)
3388 .Ip "return LIST" 8 3
3389 Returns from a subroutine with the value specified.
3390 (Note that a subroutine can automatically return
3391 the value of the last expression evaluated.
3392 That's the preferred method\*(--use of an explicit
3393 .I return
3394 is a bit slower.)
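.Sp
As a rough sketch of the two styles (the subroutine names max and
first_negative are made up for this illustration and are not part of perl):
the first relies on the value of the last expression evaluated, the second
uses an explicit return to leave a loop early.
.nf

```perl
    # Hypothetical helper: the last expression evaluated is the result.
    sub max {
        local($x, $y) = @_;
        $x > $y ? $x : $y;
    }

    # Hypothetical helper: return leaves the loop and the subroutine
    # as soon as a negative number is seen; otherwise the result is 0.
    sub first_negative {
        local($n);
        foreach $n (@_) {
            return $n if $n < 0;
        }
        0;
    }

    $bigger = &max(3, 7);                       # 7
    $neg = &first_negative(4, -2, 9, -5);       # -2
    $none = &first_negative(1, 2, 3);           # 0
```

.fi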
3395 .Ip "reverse(LIST)" 8 4
3396 .Ip "reverse LIST" 8
3397 In an array context, returns an array value consisting of the elements
3398 of LIST in the opposite order.
3399 In a scalar context, returns a string value consisting of the bytes of
3400 the first element of LIST in the opposite order.
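.Sp
A minimal sketch of the two contexts, using made-up variable names:
.nf

```perl
    @list = ('a', 'b', 'c');
    @backward = reverse(@list);         # array context: ('c', 'b', 'a')

    $word = 'hello';
    $drow = reverse($word);             # scalar context: 'olleh'
```

.fi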
3401 .Ip "rewinddir(DIRHANDLE)" 8 5
3402 .Ip "rewinddir DIRHANDLE" 8
3403 Sets the current position to the beginning of the directory for the readdir() routine on DIRHANDLE.
3404 .Ip "rindex(STR,SUBSTR,POSITION)" 8 6
3405 .Ip "rindex(STR,SUBSTR)" 8 4
3406 Works just like index except that it
3407 returns the position of the LAST occurrence of SUBSTR in STR.
3408 If POSITION is specified, returns the last occurrence at or before that
3409 position.
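.Sp
For instance, the following sketch picks a path apart at its slashes.
It assumes $[ has been left at 0, and the path is only an illustration:
.nf

```perl
    $path = '/usr/local/bin/perl';
    $slash = rindex($path, '/');                # 14, the last slash
    $base = substr($path, $slash + 1);          # 'perl'
    $prev = rindex($path, '/', $slash - 1);     # 10, the slash before that
```

.fi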
3410 .Ip "rmdir(FILENAME)" 8 4
3411 .Ip "rmdir FILENAME" 8
3412 Deletes the directory specified by FILENAME if it is empty.
3413 If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno).
3414 If FILENAME is omitted, uses $_.
3415 .Ip "s/PATTERN/REPLACEMENT/gieo" 8 3
3416 Searches a string for a pattern, and if found, replaces that pattern with the
3417 replacement text and returns the number of substitutions made.
3418 Otherwise it returns false (0).
3419 The \*(L"g\*(R" is optional, and if present, indicates that all occurrences
3420 of the pattern are to be replaced.
3421 The \*(L"i\*(R" is also optional, and if present, indicates that matching
3422 is to be done in a case-insensitive manner.
3423 The \*(L"e\*(R" is likewise optional, and if present, indicates that
3424 the replacement string is to be evaluated as an expression rather than just
3425 as a double-quoted string.
3426 Any non-alphanumeric delimiter may replace the slashes;
3427 if single quotes are used, no
3428 interpretation is done on the replacement string (the e modifier overrides
3429 this, however); if backquotes are used, the replacement string is a command
3430 to execute whose output will be used as the actual replacement text.
3431 If no string is specified via the =~ or !~ operator,
3432 the $_ string is searched and modified.
3433 (The string specified with =~ must be a scalar variable, an array element,
3434 or an assignment to one of those, i.e. an lvalue.)
3435 If the pattern contains a $ that looks like a variable rather than an
3436 end-of-string test, the variable will be interpolated into the pattern at
3437 run-time.
3438 If you only want the pattern compiled once the first time the variable is
3439 interpolated, add an \*(L"o\*(R" at the end.
3440 If the PATTERN evaluates to a null string, the most recent successful
3441 regular expression is used instead.
3442 See also the section on regular expressions.
3443 Examples:
3444 .nf
3445
3446     s/\|\e\|bgreen\e\|b/mauve/g;                # don't change wintergreen
3447
3448     $path \|=~ \|s|\|/usr/bin|\|/usr/local/bin|;
3449
3450     s/Login: $foo/Login: $bar/; # run-time pattern
3451
3452     ($foo = $bar) =~ s/bar/foo/;
3453
3454     $_ = \'abc123xyz\';
3455     s/\ed+/$&*2/e;              # yields \*(L'abc246xyz\*(R'
3456     s/\ed+/sprintf("%5d",$&)/e; # yields \*(L'abc  246xyz\*(R'
3457     s/\ew/$& x 2/eg;            # yields \*(L'aabbcc  224466xxyyzz\*(R'
3458
3459     s/\|([^ \|]*\|) *\|([^ \|]*\|)\|/\|$2 $1/;  # reverse 1st two fields
3460
3461 .fi
3462 (Note the use of $ instead of \|\e\| in the last example.  See the section
3463 on regular expressions.)
3464 .Ip "scalar(EXPR)" 8 3
3465 Forces EXPR to be interpreted in a scalar context and returns the value
3466 of EXPR.
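.Sp
One place this may be handy is inside a list, where an array would otherwise
be interpolated element by element.
A small sketch with made-up names:
.nf

```perl
    @items = ('x', 'y', 'z');
    $count = scalar(@items);                    # 3, the number of elements
    @pair = (scalar(@items), 'items');          # (3, 'items')
    @flat = (@items, 'items');                  # ('x', 'y', 'z', 'items')
```

.fi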
3467 .Ip "seek(FILEHANDLE,POSITION,WHENCE)" 8 3
3468 Randomly positions the file pointer for FILEHANDLE, just like the fseek()
3469 call of stdio.
3470 FILEHANDLE may be an expression whose value gives the name of the filehandle.
3471 Returns 1 upon success, 0 otherwise.
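.Sp
As with fseek(), a WHENCE of 0 counts from the start of the file, 1 from the
current position, and 2 from the end.
The following sketch assumes a scratch file under /tmp whose name is invented
for the example:
.nf

```perl
    $file = "/tmp/seekdemo.$$";                 # hypothetical scratch file
    open(FH, "+>$file") || die "Can't open $file: $!";
    print FH 'abcdefghij';

    seek(FH, 3, 0) || die "Can't seek: $!";     # 3 bytes from the start
    read(FH, $buf, 4);                          # reads 'defg'
    $here = tell(FH);                           # now 7

    seek(FH, -2, 2) || die "Can't seek: $!";    # 2 bytes before the end
    read(FH, $tail, 2);                         # reads 'ij'

    close(FH);
    unlink($file);
```

.fi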
3472 .Ip "seekdir(DIRHANDLE,POS)" 8 3
3473 Sets the current position for the readdir() routine on DIRHANDLE.
3474 POS must be a value returned by telldir().
3475 Has the same caveats about possible directory compaction as the corresponding
3476 system library routine.
3477 .Ip "select(FILEHANDLE)" 8 3
3478 .Ip "select" 8 3
3479 Returns the currently selected filehandle.
3480 Sets the current default filehandle for output, if FILEHANDLE is supplied.
3481 This has two effects: first, a
3482 .I write
3483 or a
3484 .I print
3485 without a filehandle will default to this FILEHANDLE.
3486 Second, references to variables related to output will refer to this output
3487 channel.
3488 For example, if you have to set the top of form format for more than
3489 one output channel, you might do the following:
3490 .nf
3491
3492 .ne 4
3493         select(REPORT1);
3494         $^ = \'report1_top\';
3495         select(REPORT2);
3496         $^ = \'report2_top\';
3497
3498 .fi
3499 FILEHANDLE may be an expression whose value gives the name of the actual filehandle.
3500 Thus:
3501 .nf
3502
3503         $oldfh = select(STDERR); $| = 1; select($oldfh);
3504
3505 .fi
3506 .Ip "select(RBITS,WBITS,EBITS,TIMEOUT)" 8 3
3507 This calls the select system call with the bitmasks specified, which can
3508 be constructed using fileno() and vec(), along these lines:
3509 .nf
3510
3511         $rin = $win = $ein = '';
3512         vec($rin,fileno(STDIN),1) = 1;
3513         vec($win,fileno(STDOUT),1) = 1;
3514         $ein = $rin | $win;
3515
3516 .fi
3517 If you want to select on many filehandles you might wish to write a subroutine:
3518 .nf
3519
3520         sub fhbits {
3521             local(@fhlist) = split(' ',$_[0]);
3522             local($bits);
3523             for (@fhlist) {
3524                 vec($bits,fileno($_),1) = 1;
3525             }
3526             $bits;
3527         }
3528         $rin = &fhbits('STDIN TTY SOCK');
3529
3530 .fi
3531 The usual idiom is:
3532 .nf
3533
3534         ($nfound,$timeleft) =
3535           select($rout=$rin, $wout=$win, $eout=$ein, $timeout);
3536
3537 or to block until something becomes ready:
3538
3539 .ie t \{\
3540         $nfound = select($rout=$rin, $wout=$win, $eout=$ein, undef);
3541 'br\}
3542 .el \{\
3543         $nfound = select($rout=$rin, $wout=$win,
3544                                 $eout=$ein, undef);
3545 'br\}
3546
3547 .fi
3548 Any of the bitmasks can also be undef.
3549 The timeout, if specified, is in seconds, which may be fractional.
3550 NOTE: not all implementations are capable of returning the $timeleft.
3551 If not, they always return $timeleft equal to the supplied $timeout.
3552 .Ip "semctl(ID,SEMNUM,CMD,ARG)" 8 4
3553 Calls the System V IPC function semctl.  If CMD is &IPC_STAT or
3554 &GETALL, then ARG must be a variable which will hold the returned
3555 semid_ds structure or semaphore value array.  Returns like ioctl: the
3556 undefined value for error, "0 but true" for zero, or the actual return
3557 value otherwise.
3558 .Ip "semget(KEY,NSEMS,FLAGS)" 8 4
3559 Calls the System V IPC function semget.  Returns the semaphore id, or
3560 the undefined value if there is an error.
3561 .Ip "semop(KEY,OPSTRING)" 8 4
3562 Calls the System V IPC function semop to perform semaphore operations
3563 such as signaling and waiting.  OPSTRING must be a packed array of
3564 semop structures.  Each semop structure can be generated with
3565 \&'pack("sss", $semnum, $semop, $semflag)'.  The number of semaphore
3566 operations is implied by the length of OPSTRING.  Returns true if
3567 successful, or false if there is an error.  As an example, the
3568 following code waits on semaphore $semnum of semaphore id $semid:
3569 .nf
3570
3571         $semop = pack("sss", $semnum, -1, 0);
3572         die "Semaphore trouble: $!\en" unless semop($semid, $semop);
3573
3574 .fi
3575 To signal the semaphore, replace "-1" with "1".
3576 .Ip "send(SOCKET,MSG,FLAGS,TO)" 8 4
3577 .Ip "send(SOCKET,MSG,FLAGS)" 8
3578 Sends a message on a socket.
3579 Takes the same flags as the system call of the same name.
3580 On unconnected sockets you must specify a destination to send TO.
3581 Returns the number of characters sent, or the undefined value if
3582 there is an error.
3583 .Ip "setpgrp(PID,PGRP)" 8 4
3584 Sets the current process group for the specified PID, 0 for the current
3585 process.
3586 Will produce a fatal error if used on a machine that doesn't implement
3587 setpgrp(2).
3588 .Ip "setpriority(WHICH,WHO,PRIORITY)" 8 4
3589 Sets the current priority for a process, a process group, or a user.
3590 (See setpriority(2).)
3591 Will produce a fatal error if used on a machine that doesn't implement
3592 setpriority(2).
3593 .Ip "setsockopt(SOCKET,LEVEL,OPTNAME,OPTVAL)" 8 3
3594 Sets the socket option requested.
3595 Returns undefined if there is an error.
3596 OPTVAL may be specified as undef if you don't want to pass an argument.
3597 .Ip "shift(ARRAY)" 8 6
3598 .Ip "shift ARRAY" 8
3599 .Ip "shift" 8
3600 Shifts the first value of the array off and returns it,
3601 shortening the array by 1 and moving everything down.
3602 If there are no elements in the array, returns the undefined value.
3603 If ARRAY is omitted, shifts the @ARGV array in the main program, and the @_
3604 array in subroutines.
3605 (This is determined lexically.)
3606 See also unshift(), push() and pop().
3607 Shift() and unshift() do the same thing to the left end of an array that pop()
3608 and push() do to the right end.
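.Sp
A short sketch of the four operations working on both ends of one array
(the array names are made up):
.nf

```perl
    @queue = ('a', 'b', 'c');
    $first = shift(@queue);             # 'a'; @queue is now ('b', 'c')
    unshift(@queue, 'z');               # @queue is now ('z', 'b', 'c')
    $last = pop(@queue);                # 'c'; @queue is now ('z', 'b')
    push(@queue, 'y');                  # @queue is now ('z', 'b', 'y')

    @empty = ();
    $nothing = shift(@empty);           # the undefined value
```

.fi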
3609 .Ip "shmctl(ID,CMD,ARG)" 8 4
3610 Calls the System V IPC function shmctl.  If CMD is &IPC_STAT, then ARG
3611 must be a variable which will hold the returned shmid_ds structure.
3612 Returns like ioctl: the undefined value for error, "0 but true" for
3613 zero, or the actual return value otherwise.
3614 .Ip "shmget(KEY,SIZE,FLAGS)" 8 4
3615 Calls the System V IPC function shmget.  Returns the shared memory
3616 segment id, or the undefined value if there is an error.
3617 .Ip "shmread(ID,VAR,POS,SIZE)" 8 4
3618 .Ip "shmwrite(ID,STRING,POS,SIZE)" 8
3619 Reads or writes the System V shared memory segment ID starting at
3620 position POS for size SIZE by attaching to it, copying in/out, and
3621 detaching from it.  When reading, VAR must be a variable which
3622 will hold the data read.  When writing, if STRING is too long,
3623 only SIZE bytes are used; if STRING is too short, nulls are
3624 written to fill out SIZE bytes.  Returns true if successful, or
3625 false if there is an error.
3626 .Ip "shutdown(SOCKET,HOW)" 8 3
3627 Shuts down a socket connection in the manner indicated by HOW, which has
3628 the same interpretation as in the system call of the same name.
3629 .Ip "sin(EXPR)" 8 4
3630 .Ip "sin EXPR" 8
3631 Returns the sine of EXPR (expressed in radians).
3632 If EXPR is omitted, returns sine of $_.
3633 .Ip "sleep(EXPR)" 8 6
3634 .Ip "sleep EXPR" 8
3635 .Ip "sleep" 8
3636 Causes the script to sleep for EXPR seconds, or forever if no EXPR.
3637 May be interrupted by sending the process a SIGALRM.
3638 Returns the number of seconds actually slept.
3639 .Ip "socket(SOCKET,DOMAIN,TYPE,PROTOCOL)" 8 3
3640 Opens a socket of the specified kind and attaches it to filehandle SOCKET.
3641 DOMAIN, TYPE and PROTOCOL are specified the same as for the system call
3642 of the same name.
3643 You may need to run h2ph on sys/socket.h to get the proper values handy
3644 in a perl library file.
3645 Returns true if successful.
3646 See the example in the section on Interprocess Communication.
3647 .Ip "socketpair(SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL)" 8 3
3648 Creates an unnamed pair of sockets in the specified domain, of the specified
3649 type.
3650 DOMAIN, TYPE and PROTOCOL are specified the same as for the system call
3651 of the same name.
3652 If unimplemented, yields a fatal error.
3653 Returns true if successful.
3654 .Ip "sort(SUBROUTINE LIST)" 8 9
3655 .Ip "sort(LIST)" 8
3656 .Ip "sort SUBROUTINE LIST" 8
3657 .Ip "sort LIST" 8
3658 Sorts the LIST and returns the sorted array value.
3659 Nonexistent values of arrays are stripped out.
3660 If SUBROUTINE is omitted, sorts in standard string comparison order.
3661 If SUBROUTINE is specified, gives the name of a subroutine that returns
3662 an integer less than, equal to, or greater than 0,
3663 depending on how the elements of the array are to be ordered.
3664 In the interests of efficiency the normal calling code for subroutines
3665 is bypassed, with the following effects: the subroutine may not be a recursive
3666 subroutine, and the two elements to be compared are passed into the subroutine
3667 not via @_ but as $a and $b (see example below).
3668 They are passed by reference so don't modify $a and $b.
3669 SUBROUTINE may be a scalar variable name, in which case the value provides
3670 the name of the subroutine to use.
3671 Examples:
3672 .nf
3673
3674 .ne 4
3675         sub byage {
3676             $age{$a} - $age{$b};        # presuming integers
3677         }
3678         @sortedclass = sort byage @class;
3679
3680 .ne 9
3681         sub backwards { $a lt $b ? 1 : $a gt $b ? \-1 : 0; }
3682         @harry = (\'dog\',\'cat\',\'x\',\'Cain\',\'Abel\');
3683         @george = (\'gone\',\'chased\',\'yz\',\'Punished\',\'Axed\');
3684         print sort @harry;
3685                 # prints AbelCaincatdogx
3686         print sort backwards @harry;
3687                 # prints xdogcatCainAbel
3688         print sort @george, \'to\', @harry;
3689                 # prints AbelAxedCainPunishedcatchaseddoggonetoxyz
3690
3691 .fi
3692 .Ip "splice(ARRAY,OFFSET,LENGTH,LIST)" 8 8
3693 .Ip "splice(ARRAY,OFFSET,LENGTH)" 8
3694 .Ip "splice(ARRAY,OFFSET)" 8
3695 Removes the elements designated by OFFSET and LENGTH from an array, and
3696 replaces them with the elements of LIST, if any.
3697 Returns the elements removed from the array.
3698 The array grows or shrinks as necessary.
3699 If LENGTH is omitted, removes everything from OFFSET onward.
3700 The following equivalencies hold (assuming $[ == 0):
3701 .nf
3702
3703         push(@a,$x,$y)\h'|3.5i'splice(@a,$#a+1,0,$x,$y)
3704         pop(@a)\h'|3.5i'splice(@a,-1)
3705         shift(@a)\h'|3.5i'splice(@a,0,1)
3706         unshift(@a,$x,$y)\h'|3.5i'splice(@a,0,0,$x,$y)
3707         $a[$x] = $y\h'|3.5i'splice(@a,$x,1,$y);
3708
3709 Example, assuming array lengths are passed before arrays:
3710
3711         sub aeq {       # compare two array values
3712                 local(@a) = splice(@_,0,shift);
3713                 local(@b) = splice(@_,0,shift);
3714                 return 0 unless @a == @b;       # same len?
3715                 while (@a) {
3716                     return 0 if pop(@a) ne pop(@b);
3717                 }
3718                 return 1;
3719         }
3720         if (&aeq($len,@foo[1..$len],0+@bar,@bar)) { ... }
3721
3722 .fi
3723 .Ip "split(/PATTERN/,EXPR,LIMIT)" 8 8
3724 .Ip "split(/PATTERN/,EXPR)" 8 8
3725 .Ip "split(/PATTERN/)" 8
3726 .Ip "split" 8
3727 Splits a string into an array of strings, and returns it.
3728 (If not in an array context, returns the number of fields found and splits
3729 into the @_ array.
3730 (In an array context, you can force the split into @_
3731 by using ?? as the pattern delimiters, but it still returns the array value.))
3732 If EXPR is omitted, splits the $_ string.
3733 If PATTERN is also omitted, splits on whitespace (/[\ \et\en]+/).
3734 Anything matching PATTERN is taken to be a delimiter separating the fields.
3735 (Note that the delimiter may be longer than one character.)
3736 If LIMIT is specified, splits into no more than that many fields (though it
3737 may split into fewer).
3738 If LIMIT is unspecified, trailing null fields are stripped (which
3739 potential users of pop() would do well to remember).
3740 A pattern matching the null string (not to be confused with a null pattern //,
3741 which is just one member of the set of patterns matching a null string)
3742 will split the value of EXPR into separate characters at each point it
3743 matches that way.
3744 For example:
3745 .nf
3746
3747         print join(\':\', split(/ */, \'hi there\'));
3748
3749 .fi
3750 produces the output \*(L'h:i:t:h:e:r:e\*(R'.
3751 .Sp
3752 The LIMIT parameter can be used to partially split a line:
3753 .nf
3754
3755         ($login, $passwd, $remainder) = split(\|/\|:\|/\|, $_, 3);
3756
3757 .fi
3758 (When assigning to a list, if LIMIT is omitted, perl supplies a LIMIT one
3759 larger than the number of variables in the list, to avoid unnecessary work.
3760 For the list above LIMIT would have been 4 by default.
3761 In time critical applications it behooves you not to split into
3762 more fields than you really need.)
3763 .Sp
3764 If the PATTERN contains parentheses, additional array elements are created
3765 from each matching substring in the delimiter.
3766 .Sp
3767         split(/([,-])/,"1-10,20");
3768 .Sp
3769 produces the array value
3770 .Sp
3771         (1,'-',10,',',20)
3772 .Sp
3773 The pattern /PATTERN/ may be replaced with an expression to specify patterns
3774 that vary at runtime.
3775 (To do runtime compilation only once, use /$variable/o.)
3776 As a special case, specifying a space (\'\ \') will split on white space
3777 just as split with no arguments does, but leading white space does NOT
3778 produce a null first field.
3779 Thus, split(\'\ \') can be used to emulate
3780 .IR awk 's
3781 default behavior, whereas
3782 split(/\ /) will give you as many null initial fields as there are
3783 leading spaces.
3784 .Sp
3785 Example:
3786 .nf
3787
3788 .ne 5
3789         open(passwd, \'/etc/passwd\');
3790         while (<passwd>) {
3791 .ie t \{\
3792                 ($login, $passwd, $uid, $gid, $gcos, $home, $shell) = split(\|/\|:\|/\|);
3793 'br\}
3794 .el \{\
3795                 ($login, $passwd, $uid, $gid, $gcos, $home, $shell)
3796                         = split(\|/\|:\|/\|);
3797 'br\}
3798                 .\|.\|.
3799         }
3800
3801 .fi
3802 (Note that $shell above will still have a newline on it.  See chop().)
3803 See also
3804 .IR join .
3805 .Ip "sprintf(FORMAT,LIST)" 8 4
3806 Returns a string formatted by the usual printf conventions.
3807 The * character is not supported.
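.Sp
A brief sketch of a few common conversions, with field widths given as
literal numbers since * is unavailable (the values are arbitrary):
.nf

```perl
    $line = sprintf("%-8s|%5d|%6.2f|%03d|%x", 'perl', 42, 3.14159, 7, 255);
    # $line is now 'perl    |   42|  3.14|007|ff'
```

.fi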
3808 .Ip "sqrt(EXPR)" 8 4
3809 .Ip "sqrt EXPR" 8
3810 Returns the square root of EXPR.
3811 If EXPR is omitted, returns square root of $_.
3812 .Ip "srand(EXPR)" 8 4
3813 .Ip "srand EXPR" 8
3814 Sets the random number seed for the
3815 .I rand
3816 operator.
3817 If EXPR is omitted, does srand(time).
3818 .Ip "stat(FILEHANDLE)" 8 8
3819 .Ip "stat FILEHANDLE" 8
3820 .Ip "stat(EXPR)" 8
3821 .Ip "stat SCALARVARIABLE" 8
3822 Returns a 13-element array giving the statistics for a file, either the file
3823 opened via FILEHANDLE, or named by EXPR.
3824 Typically used as follows:
3825 .nf
3826
3827 .ne 3
3828     ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
3829        $atime,$mtime,$ctime,$blksize,$blocks)
3830            = stat($filename);
3831
3832 .fi
3833 If stat is passed the special filehandle consisting of an underline,
3834 no stat is done, but the current contents of the stat structure from
3835 the last stat or filetest are returned.
3836 Example:
3837 .nf
3838
3839 .ne 3
3840         if (-x $file && (($d) = stat(_)) && $d < 0) {
3841                 print "$file is executable NFS file\en";
3842         }
3843
3844 .fi
3845 .Ip "study(SCALAR)" 8 6
3846 .Ip "study SCALAR" 8
3847 .Ip "study" 8
3848 Takes extra time to study SCALAR ($_ if unspecified) in anticipation of
3849 doing many pattern matches on the string before it is next modified.
3850 This may or may not save time, depending on the nature and number of patterns
3851 you are searching on, and on the distribution of character frequencies in
3852 the string to be searched\*(--you probably want to compare runtimes with and
3853 without it to see which runs faster.
3854 Those loops which scan for many short constant strings (including the constant
3855 parts of more complex patterns) will benefit most.
3856 You may have only one study active at a time\*(--if you study a different
3857 scalar the first is \*(L"unstudied\*(R".
3858 (The way study works is this: a linked list of every character in the string
3859 to be searched is made, so we know, for example, where all the \*(L'k\*(R' characters
3860 are.
3861 From each search string, the rarest character is selected, based on some
3862 static frequency tables constructed from some C programs and English text.
3863 Only those places that contain this \*(L"rarest\*(R" character are examined.)
3864 .Sp
3865 For example, here is a loop which inserts index producing entries before any line
3866 containing a certain pattern:
3867 .nf
3868
3869 .ne 8
3870         while (<>) {
3871                 study;
3872                 print ".IX foo\en" if /\ebfoo\eb/;
3873                 print ".IX bar\en" if /\ebbar\eb/;
3874                 print ".IX blurfl\en" if /\ebblurfl\eb/;
3875                 .\|.\|.
3876                 print;
3877         }
3878
3879 .fi
3880 In searching for /\ebfoo\eb/, only those locations in $_ that contain \*(L'f\*(R'
3881 will be looked at, because \*(L'f\*(R' is rarer than \*(L'o\*(R'.
3882 In general, this is a big win except in pathological cases.
3883 The only question is whether it saves you more time than it took to build
3884 the linked list in the first place.
3885 .Sp
3886 Note that if you have to look for strings that you don't know till runtime,
3887 you can build an entire loop as a string and eval that to avoid recompiling
3888 all your patterns all the time.
3889 Together with undefining $/ to input entire files as one record, this can
3890 be very fast, often faster than specialized programs like fgrep.
3891 The following scans a list of files (@files)
3892 for a list of words (@words), and prints out the names of those files that
3893 contain a match:
3894 .nf
3895
3896 .ne 12
3897         $search = \'while (<>) { study;\';
3898         foreach $word (@words) {
3899             $search .= "++\e$seen{\e$ARGV} if /\eb$word\eb/;\en";
3900         }
3901         $search .= "}";
3902         @ARGV = @files;
3903         undef $/;
3904         eval $search;           # this screams
3905         $/ = "\en";             # put back to normal input delim
3906         foreach $file (sort keys(%seen)) {
3907             print $file, "\en";
3908         }
3909
3910 .fi
3911 .Ip "substr(EXPR,OFFSET,LEN)" 8 2
3912 .Ip "substr(EXPR,OFFSET)" 8 2
3913 Extracts a substring out of EXPR and returns it.
3914 First character is at offset 0, or whatever you've set $[ to.
3915 If OFFSET is negative, starts that far from the end of the string.
3916 If LEN is omitted, returns everything to the end of the string.
3917 You can use the substr() function as an lvalue, in which case EXPR must
3918 be an lvalue.
3919 If you assign something shorter than LEN, the string will shrink, and
3920 if you assign something longer than LEN, the string will grow to accommodate it.
3921 To keep the string the same length you may need to pad or chop your value using
3922 sprintf().
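.Sp
The following sketch shows extraction and assignment, assuming $[ has been
left at 0 (the strings are only examples):
.nf

```perl
    $s = 'The black cat';
    $word = substr($s, 4, 5);           # 'black'
    $tail = substr($s, -3);             # 'cat'
    substr($s, 4, 5) = 'grey';          # $s shrinks to 'The grey cat'

    $t = 'abcdef';
    substr($t, 0, 3) = sprintf("%-3s", 'x');    # padded: $t is 'x  def'
```

.fi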
3923 .Ip "symlink(OLDFILE,NEWFILE)" 8 2
3924 Creates a new filename symbolically linked to the old filename.
3925 Returns 1 for success, 0 otherwise.
3926 On systems that don't support symbolic links, produces a fatal error at
3927 run time.
3928 To check for that, use eval:
3929 .nf
3930
3931         $symlink_exists = (eval \'symlink("","");\', $@ eq \'\');
3932
3933 .fi
3934 .Ip "syscall(LIST)" 8 6
3935 .Ip "syscall LIST" 8
3936 Calls the system call specified as the first element of the list, passing
3937 the remaining elements as arguments to the system call.
3938 If unimplemented, produces a fatal error.
3939 The arguments are interpreted as follows: if a given argument is numeric,
3940 the argument is passed as an int.
3941 If not, the pointer to the string value is passed.
3942 You are responsible for making sure that a string is pre-extended long enough
3943 to receive any result that might be written into it.
3944 If your integer arguments are not literals and have never been interpreted
3945 in a numeric context, you may need to add 0 to them to force them to look
3946 like numbers.
3947 .nf
3948
3949         require 'syscall.ph';           # may need to run h2ph
3950         syscall(&SYS_write, fileno(STDOUT), "hi there\en", 9);
3951
3952 .fi
3953 .Ip "sysread(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
3954 .Ip "sysread(FILEHANDLE,SCALAR,LENGTH)" 8 5
3955 Attempts to read LENGTH bytes of data into variable SCALAR from the specified
3956 FILEHANDLE, using the system call read(2).
3957 It bypasses stdio, so mixing this with other kinds of reads may cause
3958 confusion.
3959 Returns the number of bytes actually read, or undef if there was an error.
3960 SCALAR will be grown or shrunk to the length actually read.
3961 An OFFSET may be specified to place the read data at some other place
3962 than the beginning of the string.
3963 .Ip "system(LIST)" 8 6
3964 .Ip "system LIST" 8
3965 Does exactly the same thing as \*(L"exec LIST\*(R" except that a fork
3966 is done first, and the parent process waits for the child process to complete.
3967 Note that argument processing varies depending on the number of arguments.
3968 The return value is the exit status of the program as returned by the wait()
3969 call.
3970 To get the actual exit value divide by 256.
3971 See also
3972 .IR exec .
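.Sp
A small sketch of recovering the exit value, assuming a Bourne shell is
available as sh on the search path:
.nf

```perl
    $status = system('sh', '-c', 'exit 3');     # list form, no extra shell
    $exit = int($status / 256);                 # 3

    $ok = system('sh', '-c', 'exit 0');
    print "command failed" if $ok != 0;         # prints nothing here
```

.fi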
3973 .Ip "syswrite(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
3974 .Ip "syswrite(FILEHANDLE,SCALAR,LENGTH)" 8 5
3975 Attempts to write LENGTH bytes of data from variable SCALAR to the specified
3976 FILEHANDLE, using the system call write(2).
3977 It bypasses stdio, so mixing this with prints may cause
3978 confusion.
3979 Returns the number of bytes actually written, or undef if there was an error.
3980 An OFFSET may be specified to write the data from some part of the string
3981 other than the beginning.
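.Sp
A minimal sketch of an unbuffered write followed by an unbuffered read,
assuming a scratch file under /tmp whose name is invented for the example.
Here the OFFSET given to syswrite selects where in the string the written
bytes start:
.nf

```perl
    $file = "/tmp/sysio.$$";                    # hypothetical scratch file
    $data = 'hello, world';

    open(OUT, ">$file") || die "Can't create $file: $!";
    $wrote = syswrite(OUT, $data, 5, 7);        # writes 'world'
    close(OUT);

    open(IN, $file) || die "Can't read $file: $!";
    $buf = '';
    $got = sysread(IN, $buf, 100);              # only 5 bytes are there
    close(IN);
    unlink($file);
```

.fi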
3982 .Ip "tell(FILEHANDLE)" 8 6
3983 .Ip "tell FILEHANDLE" 8 6
3984 .Ip "tell" 8
3985 Returns the current file position for FILEHANDLE.
3986 FILEHANDLE may be an expression whose value gives the name of the actual
3987 filehandle.
3988 If FILEHANDLE is omitted, assumes the file last read.
3989 .Ip "telldir(DIRHANDLE)" 8 5
3990 .Ip "telldir DIRHANDLE" 8
3991 Returns the current position of the readdir() routines on DIRHANDLE.
3992 Value may be given to seekdir() to access a particular location in
3993 a directory.
3994 Has the same caveats about possible directory compaction as the corresponding
3995 system library routine.
3996 .Ip "time" 8 4
3997 Returns the number of non-leap seconds since 00:00:00 UTC, January 1, 1970.
3998 Suitable for feeding to gmtime() and localtime().
3999 .Ip "times" 8 4
4000 Returns a four-element array giving the user and system times, in seconds, for this
4001 process and the children of this process.
4002 .Sp
4003     ($user,$system,$cuser,$csystem) = times;
4004 .Sp
4005 .Ip "tr/SEARCHLIST/REPLACEMENTLIST/cds" 8 5
4006 .Ip "y/SEARCHLIST/REPLACEMENTLIST/cds" 8
4007 Replaces each occurrence of a character found in the search list with
4008 the corresponding character in the replacement list.
4009 It returns the number of characters replaced or deleted.
4010 If no string is specified via the =~ or !~ operator,
4011 the $_ string is translated.
4012 (The string specified with =~ must be a scalar variable, an array element,
4013 or an assignment to one of those, i.e. an lvalue.)
4014 For
4015 .I sed
4016 devotees,
4017 .I y
4018 is provided as a synonym for
4019 .IR tr .
4020 .Sp
4021 If the c modifier is specified, the SEARCHLIST character set is complemented.
4022 If the d modifier is specified, any characters specified by SEARCHLIST that
4023 are not found in REPLACEMENTLIST are deleted.
4024 (Note that this is slightly more flexible than the behavior of some
4025 .I tr
4026 programs, which delete anything they find in the SEARCHLIST, period.)
4027 If the s modifier is specified, sequences of characters that were translated
4028 to the same character are squashed down to 1 instance of the character.
4029 .Sp
4030 If the d modifier was used, the REPLACEMENTLIST is always interpreted exactly
4031 as specified.
4032 Otherwise, if the REPLACEMENTLIST is shorter than the SEARCHLIST,
4033 the final character is replicated till it is long enough.
4034 If the REPLACEMENTLIST is null, the SEARCHLIST is replicated.
4035 This latter is useful for counting characters in a class, or for squashing
4036 character sequences in a class.
4037 .Sp
4038 Examples:
4039 .nf
4040
4041     $ARGV[1] \|=~ \|y/A\-Z/a\-z/;       \h'|3i'# canonicalize to lower case
4042
4043     $cnt = tr/*/*/;             \h'|3i'# count the stars in $_
4044
4045     $cnt = tr/0\-9//;           \h'|3i'# count the digits in $_
4046
4047     tr/a\-zA\-Z//s;     \h'|3i'# bookkeeper \-> bokeper
4048
4049     ($HOST = $host) =~ tr/a\-z/A\-Z/;
4050
4051     y/a\-zA\-Z/ /cs;    \h'|3i'# change non-alphas to single space
4052
4053     tr/\e200\-\e377/\e0\-\e177/;\h'|3i'# delete 8th bit
4054
4055 .fi
4056 .Ip "truncate(FILEHANDLE,LENGTH)" 8 4
4057 .Ip "truncate(EXPR,LENGTH)" 8
4058 Truncates the file opened on FILEHANDLE, or named by EXPR, to the specified
4059 length.
4060 Produces a fatal error if truncate isn't implemented on your system.
4061 .Ip "umask(EXPR)" 8 4
4062 .Ip "umask EXPR" 8
4063 .Ip "umask" 8
4064 Sets the umask for the process and returns the old one.
4065 If EXPR is omitted, merely returns current umask.
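.Sp
A common pattern, sketched here with an arbitrary mask of 022, is to save
the old value and put it back afterwards:
.nf

```perl
    $old = umask(022);          # set a new mask, remember the previous one
    $current = umask;           # query only: 022
    umask($old);                # restore the original mask
    $restored = umask;
```

.fi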
4066 .Ip "undef(EXPR)" 8 6
4067 .Ip "undef EXPR" 8
4068 .Ip "undef" 8
4069 Undefines the value of EXPR, which must be an lvalue.
4070 Use only on a scalar value, an entire array, or a subroutine name (using &).
4071 (Undef will probably not do what you expect on most predefined variables or
4072 dbm array values.)
4073 Always returns the undefined value.
4074 You can omit the EXPR, in which case nothing is undefined, but you still
4075 get an undefined value that you could, for instance, return from a subroutine.
4076 Examples:
4077 .nf
4078
4079 .ne 6
4080         undef $foo;
4081         undef $bar{'blurfl'};
4082         undef @ary;
4083         undef %assoc;
4084         undef &mysub;
4085         return (wantarray ? () : undef) if $they_blew_it;
4086
4087 .fi
4088 .Ip "unlink(LIST)" 8 4
4089 .Ip "unlink LIST" 8
4090 Deletes a list of files.
4091 Returns the number of files successfully deleted.
4092 .nf
4093
4094 .ne 2
4095         $cnt = unlink \'a\', \'b\', \'c\';
4096         unlink @goners;
4097         unlink <*.bak>;
4098
4099 .fi
4100 Note: unlink will not delete directories unless you are superuser and the
4101 .B \-U
4102 flag is supplied to
4103 .IR perl .
4104 Even if these conditions are met, be warned that unlinking a directory
4105 can inflict damage on your filesystem.
4106 Use rmdir instead.
4107 .Ip "unpack(TEMPLATE,EXPR)" 8 4
4108 Unpack does the reverse of pack: it takes a string representing
4109 a structure and expands it out into an array value, returning the array
4110 value.
4111 (In a scalar context, it merely returns the first value produced.)
4112 The TEMPLATE has the same format as in the pack function.
4113 Here's a subroutine that does substring:
4114 .nf
4115
4116 .ne 4
4117         sub substr {
4118                 local($what,$where,$howmuch) = @_;
4119                 unpack("x$where a$howmuch", $what);
4120         }
4121
4122 .ne 3
4123 and then there's
4124
4125         sub ord { unpack("c",$_[0]); }
4126
4127 .fi
4128 In addition, you may prefix a field with a %<number> to indicate that
4129 you want a <number>-bit checksum of the items instead of the items themselves.
4130 Default is a 16-bit checksum.
4131 For example, the following computes the same number as the System V sum program:
4132 .nf
4133
4134 .ne 4
4135         while (<>) {
4136             $checksum += unpack("%16C*", $_);
4137         }
4138         $checksum %= 65536;
4139
4140 .fi
4141 .Ip "unshift(ARRAY,LIST)" 8 4
4142 Does the opposite of a
4143 .IR shift .
4144 Or the opposite of a
4145 .IR push ,
4146 depending on how you look at it.
4147 Prepends list to the front of the array, and returns the number of elements
4148 in the new array.
4149 .nf
4150
4151         unshift(@ARGV, \'\-e\') unless $ARGV[0] =~ /^\-/;
4152
4153 .fi
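The list is prepended as a unit, so its elements keep their order.
A small sketch with a made-up array @queue:
.nf

```perl
@queue = ('c', 'd');
$n = unshift(@queue, 'a', 'b');  # 'a' and 'b' go on the front, in that order
print "$n: @queue\n";

$front = shift(@queue);          # shift undoes one step of the unshift
print "$front\n";
```

.fi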
4154 .Ip "utime(LIST)" 8 2
4155 .Ip "utime LIST" 8 2
4156 Changes the access and modification times on each file of a list of files.
4157 The first two elements of the list must be the NUMERICAL access and
4158 modification times, in that order.
4159 Returns the number of files successfully changed.
4160 The inode modification time of each file is set to the current time.
4161 Example of a \*(L"touch\*(R" command:
4162 .nf
4163
4164 .ne 3
4165         #!/usr/bin/perl
4166         $now = time;
4167         utime $now, $now, @ARGV;
4168
4169 .fi
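The two times need not be the current time.
The following sketch sets arbitrary times on a scratch file and reads them
back with stat; the file name and the two time values are invented for the
example.
.nf

```perl
$file = "/tmp/utime$$";          # made-up scratch file
open(TMP, ">$file") || die "Can't create $file: $!";
close(TMP);

$atime = 86400 * 365;            # arbitrary: one year after the epoch
$mtime = 86400 * 366;            # arbitrary: one day later
$changed = utime($atime, $mtime, $file);

# Elements 8 and 9 of stat are the access and modification times.
($gotatime, $gotmtime) = (stat($file))[8, 9];
print "changed $changed file, mtime is now $gotmtime\n";
unlink($file);
```

.fi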
4170 .Ip "values(ASSOC_ARRAY)" 8 6
4171 .Ip "values ASSOC_ARRAY" 8
4172 Returns a normal array consisting of all the values of the named associative
4173 array.
4174 The values are returned in an apparently random order, but it is the same order
4175 as either the keys() or each() function would produce on the same array.
4176 See also keys() and each().
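.Sp
The matching order of keys() and values() can be relied on as long as the
array is not modified between the two calls.
A minimal sketch, using a made-up associative array %ages:
.nf

```perl
%ages = ('tom', 30, 'dick', 25, 'harry', 41);
@k = keys(%ages);
@v = values(%ages);

# The order is arbitrary, but $v[$i] is the value stored under $k[$i].
for ($i = 0; $i < @k; $i++) {
    print "$k[$i] is $v[$i]\n";
}
```

.fi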
4177 .Ip "vec(EXPR,OFFSET,BITS)" 8 2
4178 Treats a string as a vector of unsigned integers, and returns the value
4179 of the bitfield specified.
4180 May also be assigned to.
4181 BITS must be a power of two from 1 to 32.
4182 .Sp
4183 Vectors created with vec() can also be manipulated with the logical operators
4184 |, & and ^,
4185 which will assume a bit vector operation is desired when both operands are
4186 strings.
4187 This interpretation is not enabled unless there is at least one vec() in
4188 your program, to protect older programs.
4189 .Sp
4190 To transform a bit vector into a string or array of 0's and 1's, use these:
4191 .nf
4192
4193         $bits = unpack("b*", $vector);
4194         @bits = split(//, unpack("b*", $vector));
4195
4196 .fi
4197 If you know the exact length in bits, it can be used in place of the *.
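.Sp
The following sketch builds two small bit vectors, combines them with |, and
prints the result with the unpack idiom above.
The variable names are invented for the example.
.nf

```perl
$vector = '';
vec($vector, 0, 1) = 1;          # set bit 0
vec($vector, 3, 1) = 1;          # set bit 3
vec($vector, 9, 1) = 1;          # set bit 9; the string grows to two bytes

$other = '';
vec($other, 1, 1) = 1;           # set bit 1

$union = $vector | $other;       # both operands are strings: bitwise or

$bits = unpack("b*", $vector);
$ubits = unpack("b*", $union);
print "$bits\n$ubits\n";
print "bit 3 is ", vec($vector, 3, 1), "\n";
```

.fi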
4198 .Ip "wait" 8 6
4199 Waits for a child process to terminate and returns the pid of the deceased
4200 process, or -1 if there are no child processes.
4201 The status is returned in $?.
4202 .Ip "waitpid(PID,FLAGS)" 8 6
4203 Waits for a particular child process to terminate and returns the pid of the deceased
4204 process, or -1 if there is no such child process.
4205 The status is returned in $?.
4206 If you say
4207 .nf
4208
4209         require "sys/wait.ph";
4210         .\|.\|.
4211         waitpid(-1,&WNOHANG);
4212
4213 .fi
4214 then you can do a non-blocking wait for any process.  Non-blocking wait
4215 is only available on machines supporting either the
4216 .I waitpid (2)
4217 or
4218 .I wait4 (2)
4219 system calls.
4220 However, waiting for a particular pid with FLAGS of 0 is implemented
4221 everywhere.  (Perl emulates the system call by remembering the status
4222 values of processes that have exited but have not been harvested by the
4223 Perl script yet.)
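.Sp
A blocking wait for one particular child might look like the sketch below.
It assumes a system with fork; the exit status 3 is an arbitrary value chosen
for the example.
.nf

```perl
if ($pid = fork) {
    # Parent: block until that particular child is gone.
    $dead = waitpid($pid, 0);
    $status = $? >> 8;           # the high byte of $? is the exit status
    print "child $dead exited with status $status\n";
}
elsif (defined $pid) {
    exit 3;                      # child: exit at once with a made-up status
}
else {
    die "Can't fork: $!";
}
```

.fi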
4224 .Ip "wantarray" 8 4
4225 Returns true if the context of the currently executing subroutine
4226 is looking for an array value.
4227 Returns false if the context is looking for a scalar.
4228 .nf
4229
4230         return wantarray ? () : undef;
4231
4232 .fi
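A routine can use wantarray to return something different in each context.
The subroutine name words in this sketch is made up:
.nf

```perl
# Returns the words of a string in an array context,
# and the number of words in a scalar context.
sub words {
    local(@w) = split(' ', $_[0]);
    wantarray ? @w : scalar(@w);
}

@list = &words("now is the time");
$count = &words("now is the time");
print "$count words: @list\n";
```

.fi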
4233 .Ip "warn(LIST)" 8 4
4234 .Ip "warn LIST" 8
4235 Produces a message on STDERR just like \*(L"die\*(R", but doesn't exit.
4236 .Ip "write(FILEHANDLE)" 8 6
4237 .Ip "write(EXPR)" 8
4238 .Ip "write" 8
4239 Writes a formatted record (possibly multi-line) to the specified file,
4240 using the format associated with that file.
4241 By default the format for a file is the one having the same name as the
4242 filehandle, but the format for the current output channel (see
4243 .IR select )
4244 may be set explicitly
4245 by assigning the name of the format to the $~ variable.
4246 .Sp
4247 Top of form processing is handled automatically:
4248 if there is insufficient room on the current page for the formatted
4249 record, the page is advanced by writing a form feed,
4250 a special top-of-page format is used
4251 to format the new page header, and then the record is written.
4252 By default the top-of-page format is \*(L"top\*(R", but it
4253 may be set to the
4254 format of your choice by assigning the name to the $^ variable.
4255 The number of lines remaining on the current page is in variable $-, which
4256 can be set to 0 to force a new page.
4257 .Sp
4258 If FILEHANDLE is unspecified, output goes to the current default output channel,
4259 which starts out as
4260 .I STDOUT
4261 but may be changed by the
4262 .I select
4263 operator.
4264 If the FILEHANDLE is an EXPR, then the expression is evaluated and the
4265 resulting string is used to look up the name of the FILEHANDLE at run time.
4266 For more on formats, see the section on formats later on.
4267 .Sp
4268 Note that write is NOT the opposite of read.
4269 ''' Beginning of part 4
4279 ''' Revision 3.0.1.14  91/01/11  18:18:53  lwall
4280 ''' patch42: started an addendum and errata section in the man page
4281 '''
4282 ''' Revision 3.0.1.13  90/11/10  01:51:00  lwall
4283 ''' patch38: random cleanup
4284 '''
4285 ''' Revision 3.0.1.12  90/10/20  02:15:43  lwall
4286 ''' patch37: patch37: fixed various typos in man page
4287 '''
4288 ''' Revision 3.0.1.11  90/10/16  10:04:28  lwall
4289 ''' patch29: added @###.## fields to format
4290 '''
4291 ''' Revision 3.0.1.10  90/08/09  04:47:35  lwall
4292 ''' patch19: added require operator
4293 ''' patch19: added numeric interpretation of $]
4294 '''
4295 ''' Revision 3.0.1.9  90/08/03  11:15:58  lwall
4296 ''' patch19: Intermediate diffs for Randal
4297 '''
4298 ''' Revision 3.0.1.8  90/03/27  16:19:31  lwall
4299 ''' patch16: MSDOS support
4300 '''
4301 ''' Revision 3.0.1.7  90/03/14  12:29:50  lwall
4302 ''' patch15: man page falsely states that you can't subscript array values
4303 '''
4304 ''' Revision 3.0.1.6  90/03/12  16:54:04  lwall
4305 ''' patch13: improved documentation of *name
4306 '''
4307 ''' Revision 3.0.1.5  90/02/28  18:01:52  lwall
4308 ''' patch9: $0 is now always the command name
4309 '''
4310 ''' Revision 3.0.1.4  89/12/21  20:12:39  lwall
4311 ''' patch7: documented that package'filehandle works as well as $package'variable
4312 ''' patch7: documented which identifiers are always in package main
4313 '''
4314 ''' Revision 3.0.1.3  89/11/17  15:32:25  lwall
4315 ''' patch5: fixed some manual typos and indent problems
4316 ''' patch5: clarified difference between $! and $@
4317 '''
4318 ''' Revision 3.0.1.2  89/11/11  04:46:40  lwall
4319 ''' patch2: made some line breaks depend on troff vs. nroff
4320 ''' patch2: clarified operation of ^ and $ when $* is false
4321 '''
4322 ''' Revision 3.0.1.1  89/10/26  23:18:43  lwall
4323 ''' patch1: documented the desirability of unnecessary parentheses
4324 '''
4325 ''' Revision 3.0  89/10/18  15:21:55  lwall
4326 ''' 3.0 baseline
4327 '''
4328 .Sh "Precedence"
4329 .I Perl
4330 operators have the following associativity and precedence:
4331 .nf
4332
4333 nonassoc\h'|1i'print printf exec system sort reverse
4334 \h'1.5i'chmod chown kill unlink utime die return
4335 left\h'|1i',
4336 right\h'|1i'= += \-= *= etc.
4337 right\h'|1i'?:
4338 nonassoc\h'|1i'.\|.
4339 left\h'|1i'||
4340 left\h'|1i'&&
4341 left\h'|1i'| ^
4342 left\h'|1i'&
4343 nonassoc\h'|1i'== != <=> eq ne cmp
4344 nonassoc\h'|1i'< > <= >= lt gt le ge
4345 nonassoc\h'|1i'chdir exit eval reset sleep rand umask
4346 nonassoc\h'|1i'\-r \-w \-x etc.
4347 left\h'|1i'<< >>
4348 left\h'|1i'+ \- .
4349 left\h'|1i'* / % x
4350 left\h'|1i'=~ !~
4351 right\h'|1i'! ~ and unary minus
4352 right\h'|1i'**
4353 nonassoc\h'|1i'++ \-\|\-
4354 left\h'|1i'\*(L'(\*(R'
4355
4356 .fi
4357 As mentioned earlier, if any list operator (print, etc.) or
4358 any unary operator (chdir, etc.)
4359 is followed by a left parenthesis as the next token on the same line,
4360 the operator and arguments within parentheses are taken to
4361 be of highest precedence, just like a normal function call.
4362 Examples:
4363 .nf
4364
4365         chdir $foo || die;\h'|3i'# (chdir $foo) || die
4366         chdir($foo) || die;\h'|3i'# (chdir $foo) || die
4367         chdir ($foo) || die;\h'|3i'# (chdir $foo) || die
4368         chdir +($foo) || die;\h'|3i'# (chdir $foo) || die
4369
4370 but, because * is higher precedence than ||:
4371
4372         chdir $foo * 20;\h'|3i'# chdir ($foo * 20)
4373         chdir($foo) * 20;\h'|3i'# (chdir $foo) * 20
4374         chdir ($foo) * 20;\h'|3i'# (chdir $foo) * 20
4375         chdir +($foo) * 20;\h'|3i'# chdir ($foo * 20)
4376
4377         rand 10 * 20;\h'|3i'# rand (10 * 20)
4378         rand(10) * 20;\h'|3i'# (rand 10) * 20
4379         rand (10) * 20;\h'|3i'# (rand 10) * 20
4380         rand +(10) * 20;\h'|3i'# rand (10 * 20)
4381
4382 .fi
4383 In the absence of parentheses,
4384 the precedence of list operators such as print, sort or chmod is
4385 either very high or very low depending on whether you look at the left
4386 side of the operator or the right side of it.
4387 For example, in
4388 .nf
4389
4390         @ary = (1, 3, sort 4, 2);
4391         print @ary;             # prints 1324
4392
4393 .fi
4394 the commas on the right of the sort are evaluated before the sort, but
4395 the commas on the left are evaluated after.
4396 In other words, list operators tend to gobble up all the arguments that
4397 follow them, and then act like a simple term with regard to the preceding
4398 expression.
4399 Note that you have to be careful with parens:
4400 .nf
4401
4402 .ne 3
4403         # These evaluate exit before doing the print:
4404         print($foo, exit);      # Obviously not what you want.
4405         print $foo, exit;       # Nor is this.
4406
4407 .ne 4
4408         # These do the print before evaluating exit:
4409         (print $foo), exit;     # This is what you want.
4410         print($foo), exit;      # Or this.
4411         print ($foo), exit;     # Or even this.
4412
4413 Also note that
4414
4415         print ($foo & 255) + 1, "\en";
4416
4417 .fi
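4418 probably doesn't do what you expect at first glance: the parenthesized ($foo & 255) is taken as the complete argument list of print, so the 1 is added to the value that print returns and the "\en" is never printed.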
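.PP
These parsing rules can be checked with operators whose results are easy to
predict.
The sketch below uses reverse as the list operator and length as the unary
operator; the variable names are invented.
.nf

```perl
# A list operator gobbles everything to its right...
@x = (reverse 'a', 'b', 'c');
# ...unless a left parenthesis follows it, with or without a space.
@y = (reverse('a', 'b'), 'c');
@z = (reverse ('a', 'b'), 'c');
print "@x / @y / @z\n";

# A unary operator binds less tightly than x, so these differ.
$s = "ab";
$n1 = length $s x 3;             # length($s x 3)
$n2 = length($s) x 3;            # (length $s) x 3
print "$n1 $n2\n";
```

.fi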
4419 .Sh "Subroutines"
4420 A subroutine may be declared as follows:
4421 .nf
4422
4423     sub NAME BLOCK
4424
4425 .fi
4426 .PP
4427 Any arguments passed to the routine come in as array @_,
4428 that is ($_[0], $_[1], .\|.\|.).
4429 The array @_ is a local array, but its values are references to the
4430 actual scalar parameters.
4431 The return value of the subroutine is the value of the last expression
4432 evaluated, and can be either an array value or a scalar value.
4433 Alternately, a return statement may be used to specify the returned value and
4434 exit the subroutine.
4435 To create local variables see the
4436 .I local
4437 operator.
4438 .PP
4439 A subroutine is called using the
4440 .I do
4441 operator or the & operator.
4442 .nf
4443
4444 .ne 12
4445 Example:
4446
4447         sub MAX {
4448                 local($max) = pop(@_);
4449                 foreach $foo (@_) {
4450                         $max = $foo \|if \|$max < $foo;
4451                 }
4452                 $max;
4453         }
4454
4455         .\|.\|.
4456         $bestday = &MAX($mon,$tue,$wed,$thu,$fri);
4457
4458 .ne 21
4459 Example:
4460
4461         # get a line, combining continuation lines
4462         #  that start with whitespace
4463         sub get_line {
4464                 $thisline = $lookahead;
4465                 line: while ($lookahead = <STDIN>) {
4466                         if ($lookahead \|=~ \|/\|^[ \^\e\|t]\|/\|) {
4467                                 $thisline \|.= \|$lookahead;
4468                         }
4469                         else {
4470                                 last line;
4471                         }
4472                 }
4473                 $thisline;
4474         }
4475
4476         $lookahead = <STDIN>;   # get first line
4477         while ($_ = do get_line(\|)) {
4478                 .\|.\|.
4479         }
4480
4481 .fi
4482 .nf
4483 .ne 6
4484 Use array assignment to a local list to name your formal arguments:
4485
4486         sub maybeset {
4487                 local($key, $value) = @_;
4488                 $foo{$key} = $value unless $foo{$key};
4489         }
4490
4491 .fi
4492 This also has the effect of turning call-by-reference into call-by-value,
4493 since the assignment copies the values.
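.Sp
The difference can be seen by comparing a routine that works on $_[0]
directly with one that copies its argument first.
Both subroutine names in this sketch are made up.
.nf

```perl
# Modifies the caller's variable through $_[0].
sub upcase_in_place {
    $_[0] =~ tr/a-z/A-Z/;
}

# Copies its argument first, so the caller's variable is untouched.
sub upcase_copy {
    local($str) = @_;
    $str =~ tr/a-z/A-Z/;
    $str;
}

$one = "hello";
$two = "world";
&upcase_in_place($one);
$result = &upcase_copy($two);
print "$one $two $result\n";
```

.fi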
4494 .Sp
4495 Subroutines may be called recursively.
4496 If a subroutine is called using the & form, the argument list is optional.
4497 If omitted, no @_ array is set up for the subroutine; the @_ array at the
4498 time of the call is visible to the subroutine instead.
4499 .nf
4500
4501         do foo(1,2,3);          # pass three arguments
4502         &foo(1,2,3);            # the same
4503
4504         do foo();               # pass a null list
4505         &foo();                 # the same
4506         &foo;                   # pass no arguments\*(--more efficient
4507
4508 .fi
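The last form is handy when one subroutine simply hands its own arguments on
to another.
A minimal sketch, with made-up subroutine names total and average:
.nf

```perl
sub total {
    local($t) = 0;
    foreach $n (@_) {
        $t += $n;
    }
    $t;
}

sub average {
    local($sum) = &total;        # no argument list: total sees average's @_
    @_ ? $sum / @_ : 0;
}

$avg = &average(2, 4, 9);
print "average is $avg\n";
```

.fi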
4509 .Sh "Passing By Reference"
4510 Sometimes you don't want to pass the value of an array to a subroutine but
4511 rather the name of it, so that the subroutine can modify the global copy
4512 of it rather than working with a local copy.
4513 In perl you can refer to all the objects of a particular name by prefixing
4514 the name with a star: *foo.
4515 When evaluated, it produces a scalar value that represents all the objects
4516 of that name, including any filehandle, format or subroutine.
4517 When assigned to within a local() operation, it causes the name mentioned
4518 to refer to whatever * value was assigned to it.
4519 Example:
4520 .nf
4521
4522         sub doubleary {
4523             local(*someary) = @_;
4524             foreach $elem (@someary) {
4525                 $elem *= 2;
4526             }
4527         }
4528         do doubleary(*foo);
4529         do doubleary(*bar);
4530
4531 .fi
4532 Assignment to *name is currently recommended only inside a local().
4533 You can actually assign to *name anywhere, but the previous referent of
4534 *name may be stranded forever.
4535 This may or may not bother you.
4536 .Sp
4537 Note that scalars are already passed by reference, so you can modify scalar
4538 arguments without using this mechanism by referring explicitly to the $_[nnn]
4539 in question.
4540 You can modify all the elements of an array by passing all the elements
4541 as scalars, but you have to use the * mechanism to push, pop or change the
4542 size of an array.
4543 The * mechanism will probably be more efficient in any case.
4544 .Sp
4545 Since a *name value contains unprintable binary data, if it is used as
4546 an argument in a print, or as a %s argument in a printf or sprintf, it
4547 then has the value '*name', just so it prints out pretty.
4548 .Sp
4549 Even if you don't want to modify an array, this mechanism is useful for
4550 passing multiple arrays in a single LIST, since normally the LIST mechanism
4551 will merge all the array values so that you can't extract out the
4552 individual arrays.
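.Sp
As a sketch of passing two arrays in one LIST, the made-up routine below
receives both by name, reads them separately, and changes the size of the
first one in the caller:
.nf

```perl
sub pairs {
    local(*first, *second) = @_;
    local(@out, $i);
    for ($i = 0; $i < @first; $i++) {
        push(@out, "$first[$i]=$second[$i]");
    }
    push(@first, 'seen');        # changes the caller's array, not a copy
    @out;
}

@names = ('a', 'b');
@nums = (1, 2);
@result = &pairs(*names, *nums);
print "@result / @names\n";
```

.fi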
4553 .Sh "Regular Expressions"
4554 The patterns used in pattern matching are regular expressions such as
4555 those supplied in the Version 8 regexp routines.
4556 (In fact, the routines are derived from Henry Spencer's freely redistributable
4557 reimplementation of the V8 routines.)
4558 In addition, \ew matches an alphanumeric character (including \*(L"_\*(R") and \eW a nonalphanumeric.
4559 Word boundaries may be matched by \eb, and non-boundaries by \eB.
4560 A whitespace character is matched by \es, non-whitespace by \eS.
4561 A numeric character is matched by \ed, non-numeric by \eD.
4562 You may use \ew, \es and \ed within character classes.
4563 Also, \en, \er, \ef, \et and \eNNN have their normal interpretations.
4564 Within character classes \eb represents backspace rather than a word boundary.
4565 Alternatives may be separated by |.
4566 The bracketing construct \|(\ .\|.\|.\ \|) may also be used, in which case \e<digit>
4567 matches the digit'th substring.
4568 (Outside of the pattern, always use $ instead of \e in front of the digit.
4569 The scope of $<digit> (and $\`, $& and $\')
4570 extends to the end of the enclosing BLOCK or eval string, or to
4571 the next pattern match with subexpressions.
4572 The \e<digit> notation sometimes works outside the current pattern, but should
4573 not be relied upon.)
4574 You may have as many parentheses as you wish.  If you have more than 9
4575 substrings, the variables $10, $11, ... refer to the corresponding
4576 substring.  Within the pattern, \e10, \e11,
4577 etc. refer back to substrings if there have been at least that many left parens
4578 before the backreference.  Otherwise (for backward compatibility) \e10
4579 is the same as \e010, a backspace,
4580 and \e11 the same as \e011, a tab.
4581 And so on.
4582 (\e1 through \e9 are always backreferences.)
4583 .PP
4584 $+ returns whatever the last bracket match matched.
4585 $& returns the entire matched string.
4586 ($0 used to return the same thing, but not any more.)
4587 $\` returns everything before the matched string.
4588 $\' returns everything after the matched string.
4589 Examples:
4590 .nf
4591
4592         s/\|^\|([^ \|]*\|) \|*([^ \|]*\|)\|/\|$2 $1\|/; # swap first two words
4593
4594 .ne 5
4595         if (/\|Time: \|(.\|.\|):\|(.\|.\|):\|(.\|.\|)\|/\|) {
4596                 $hours = $1;
4597                 $minutes = $2;
4598                 $seconds = $3;
4599         }
4600
4601 .fi
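The other match variables can be saved in the same way.
This sketch copies them right after the match, since the next successful
match will overwrite them; the sample string is invented.
.nf

```perl
$_ = "Time: 12:34:56 GMT";
if (/(\d\d):(\d\d):(\d\d)/) {
    ($before, $match, $after, $last) = ($`, $&, $', $+);
    print "[$before] [$match] [$after] [$last]\n";
}
```

.fi
.PP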
4602 By default, the ^ character is only guaranteed to match at the beginning
4603 of the string,
4604 the $ character only at the end (or before the newline at the end)
4605 and
4606 .I perl
4607 does certain optimizations with the assumption that the string contains
4608 only one line.
4609 The behavior of ^ and $ on embedded newlines will be inconsistent.
4610 You may, however, wish to treat a string as a multi-line buffer, such that
4611 the ^ will match after any newline within the string, and $ will match
4612 before any newline.
4613 At the cost of a little more overhead, you can do this by setting the variable
4614 $* to 1.
4615 Setting it back to 0 makes
4616 .I perl
4617 revert to its old behavior.
4618 .PP
4619 To facilitate multi-line substitutions, the . character never matches a newline
4620 (even when $* is 0).
4621 In particular, the following leaves a newline on the $_ string:
4622 .nf
4623
4624         $_ = <STDIN>;
4625         s/.*(some_string).*/$1/;
4626
4627 If the newline is unwanted, try one of
4628
4629         s/.*(some_string).*\en/$1/;
4630         s/.*(some_string)[^\e000]*/$1/;
4631         s/.*(some_string)(.|\en)*/$1/;
4632         chop; s/.*(some_string).*/$1/;
4633         /(some_string)/ && ($_ = $1);
4634
4635 .fi
4636 Any item of a regular expression may be followed with digits in curly brackets
4637 of the form {n,m}, where n gives the minimum number of times to match the item
4638 and m gives the maximum.
4639 The form {n} is equivalent to {n,n} and matches exactly n times.
4640 The form {n,} matches n or more times.
4641 (If a curly bracket occurs in any other context, it is treated as a regular
4642 character.)
4643 The * modifier is equivalent to {0,}, the + modifier to {1,} and the ? modifier
4644 to {0,1}.
4645 There is no limit to the size of n or m, but large numbers will chew up
4646 more memory.
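.Sp
The three forms can be compared on a few sample strings.
In this sketch the strings and the array %seen are made up:
.nf

```perl
foreach $str ('ab', 'aab', 'aaaab') {
    $two     = ($str =~ /^a{2}b$/)   ? 'yes' : 'no';  # exactly two a's
    $upto3   = ($str =~ /^a{1,3}b$/) ? 'yes' : 'no';  # one to three a's
    $atleast = ($str =~ /^a{2,}b$/)  ? 'yes' : 'no';  # two or more a's
    $seen{$str} = "$two $upto3 $atleast";
    print "$str: $seen{$str}\n";
}
```

.fi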
4647 .Sp
4648 You will note that all backslashed metacharacters in
4649 .I perl
4650 are alphanumeric,
4651 such as \eb, \ew, \en.
4652 Unlike some other regular expression languages, there are no backslashed
4653 symbols that aren't alphanumeric.
4654 So anything that looks like \e\e, \e(, \e), \e<, \e>, \e{, or \e} is always
4655 interpreted as a literal character, not a metacharacter.
4656 This makes it simple to quote a string that you want to use for a pattern
4657 but that you are afraid might contain metacharacters.
4658 Simply quote all the non-alphanumeric characters:
4659 .nf
4660
4661         $pattern =~ s/(\eW)/\e\e$1/g;
4662
4663 .fi
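A short sketch of the effect, using an invented pattern string that contains
metacharacters:
.nf

```perl
$pattern = 'a.b(c)';
($quoted = $pattern) =~ s/(\W)/\\$1/g;
print "$quoted\n";

# The quoted pattern now matches only the literal text.
$hit  = ('xa.b(c)y' =~ /$quoted/) ? 1 : 0;
$miss = ('xaxb(c)y' =~ /$quoted/) ? 1 : 0;
print "$hit $miss\n";
```

.fi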
4664 .Sh "Formats"
4665 Output record formats for use with the
4666 .I write
4667 operator may be declared as follows:
4668 .nf
4669
4670 .ne 3
4671     format NAME =
4672     FORMLIST
4673     .
4674
4675 .fi
4676 If NAME is omitted, format \*(L"STDOUT\*(R" is defined.
4677 FORMLIST consists of a sequence of lines, each of which may be of one of three
4678 types:
4679 .Ip 1. 4
4680 A comment.
4681 .Ip 2. 4
4682 A \*(L"picture\*(R" line giving the format for one output line.
4683 .Ip 3. 4
4684 An argument line supplying values to plug into a picture line.
4685 .PP
4686 Picture lines are printed exactly as they look, except for certain fields
4687 that substitute values into the line.
4688 Each picture field starts with either @ or ^.
4689 The @ field (not to be confused with the array marker @) is the normal
4690 case; ^ fields are used
4691 to do rudimentary multi-line text block filling.
4692 The length of the field is supplied by padding out the field
4693 with multiple <, >, or | characters to specify, respectively, left justification,
4694 right justification, or centering.
4695 As an alternate form of right justification,
4696 you may also use # characters (with an optional .) to specify a numeric field.
4697 (Use of ^ instead of @ causes the field to be blanked if undefined.)
4698 If any of the values supplied for these fields contains a newline, only
4699 the text up to the newline is printed.
4700 The special field @* can be used for printing multi-line values.
4701 It should appear by itself on a line.
4702 .PP
4703 The values are specified on the following line, in the same order as
4704 the picture fields.
4705 The values should be separated by commas.
4706 .PP
4707 Picture fields that begin with ^ rather than @ are treated specially.
4708 The value supplied must be a scalar variable name which contains a text
4709 string.
4710 .I Perl
4711 puts as much text as it can into the field, and then chops off the front
4712 of the string so that the next time the variable is referenced,
4713 more of the text can be printed.
4714 Normally you would use a sequence of fields in a vertical stack to print
4715 out a block of text.
4716 If you like, you can end the final field with .\|.\|., which will appear in the
4717 output if the text was too long to appear in its entirety.
4718 You can change which characters are legal to break on by changing the
4719 variable $: to a list of the desired characters.
4720 .PP
4721 Since use of ^ fields can produce variable length records if the text to be
4722 formatted is short, you can suppress blank lines by putting the tilde (~)
4723 character anywhere in the line.
4724 (Normally you should put it in the front if possible, for visibility.)
4725 The tilde will be translated to a space upon output.
4726 If you put a second tilde contiguous to the first, the line will be repeated
4727 until all the fields on the line are exhausted.
4728 (If you use a field of the @ variety, the expression you supply had better
4729 not give the same value every time forever!)
4730 .PP
4731 Examples:
4732 .nf
4733 .lg 0
4734 .cs R 25
4735 .ft C
4736
4737 .ne 10
4738 # a report on the /etc/passwd file
4739 format top =
4740 \&                        Passwd File
4741 Name                Login    Office   Uid   Gid Home
4742 ------------------------------------------------------------------
4743 \&.
4744 format STDOUT =
4745 @<<<<<<<<<<<<<<<<<< @||||||| @<<<<<<@>>>> @>>>> @<<<<<<<<<<<<<<<<<
4746 $name,              $login,  $office,$uid,$gid, $home
4747 \&.
4748
4749 .ne 29
4750 # a report from a bug report form
4751 format top =
4752 \&                        Bug Reports
4753 @<<<<<<<<<<<<<<<<<<<<<<<     @|||         @>>>>>>>>>>>>>>>>>>>>>>>
4754 $system,                      $%,         $date
4755 ------------------------------------------------------------------
4756 \&.
4757 format STDOUT =
4758 Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4759 \&         $subject
4760 Index: @<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4761 \&       $index,                       $description
4762 Priority: @<<<<<<<<<< Date: @<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4763 \&          $priority,        $date,   $description
4764 From: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4765 \&      $from,                         $description
4766 Assigned to: @<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4767 \&             $programmer,            $description
4768 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4769 \&                                     $description
4770 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4771 \&                                     $description
4772 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4773 \&                                     $description
4774 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4775 \&                                     $description
4776 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<...
4777 \&                                     $description
4778 \&.
4779
4780 .ft R
4781 .cs R
4782 .lg
4783 .fi
4784 It is possible to intermix prints with writes on the same output channel,
4785 but you'll have to handle $\- (lines left on the page) yourself.
4786 .PP
4787 If you are printing lots of fields that are usually blank, you should consider
4788 using the reset operator between records.
4789 Not only is it more efficient, but it can prevent the bug of adding another
4790 field and forgetting to zero it.
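.PP
For a smaller experiment than the reports above, the following sketch writes
one record to a scratch file using a ^ field and a repeated ~~ line, then
reads the result back.
The filehandle REPORT, the file name and the variables are made up for the
example, and the field widths are arbitrary.
.nf

```perl
$file = "/tmp/fmt$$";            # made-up scratch file
open(REPORT, ">$file") || die "Can't create $file: $!";

format REPORT =
@<<<<<< ^<<<<<<<<<
$label, $text
~~      ^<<<<<<<<<
        $text
.

$label = "Note:";
$text = "now is the time for all good men";
write(REPORT);                   # uses format REPORT, named after the handle
close(REPORT);

open(REPORT, $file) || die "Can't read $file: $!";
@lines = <REPORT>;
close(REPORT);
unlink($file);
print @lines;
```

.fi
The ~~ line is repeated until $text is used up, so the number of output lines
depends on the length of the text.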
4791 .Sh "Interprocess Communication"
4792 The IPC facilities of perl are built on the Berkeley socket mechanism.
4793 If you don't have sockets, you can ignore this section.
4794 The calls have the same names as the corresponding system calls,
4795 but the arguments tend to differ, for two reasons.
4796 First, perl file handles work differently than C file descriptors.
4797 Second, perl already knows the length of its strings, so you don't need
4798 to pass that information.
4799 Here is a sample client (untested):
4800 .nf
4801
4802         ($them,$port) = @ARGV;
4803         $port = 2345 unless $port;
4804         $them = 'localhost' unless $them;
4805
4806         $SIG{'INT'} = 'dokill';
4807         sub dokill { kill 9,$child if $child; }
4808
4809         require 'sys/socket.ph';
4810
4811         $sockaddr = 'S n a4 x8';
4812         chop($hostname = `hostname`);
4813
4814         ($name, $aliases, $proto) = getprotobyname('tcp');
4815         ($name, $aliases, $port) = getservbyname($port, 'tcp')
4816                 unless $port =~ /^\ed+$/;
4817 .ie t \{\
4818         ($name, $aliases, $type, $len, $thisaddr) = gethostbyname($hostname);
4819 'br\}
4820 .el \{\
4821         ($name, $aliases, $type, $len, $thisaddr) =
4822                                         gethostbyname($hostname);
4823 'br\}
4824         ($name, $aliases, $type, $len, $thataddr) = gethostbyname($them);
4825
4826         $this = pack($sockaddr, &AF_INET, 0, $thisaddr);
4827         $that = pack($sockaddr, &AF_INET, $port, $thataddr);
4828
4829         socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!";
4830         bind(S, $this) || die "bind: $!";
4831         connect(S, $that) || die "connect: $!";
4832
4833         select(S); $| = 1; select(STDOUT);
4834
4835         if ($child = fork) {
4836                 while (<>) {
4837                         print S;
4838                 }
4839                 sleep 3;
4840                 do dokill();
4841         }
4842         else {
4843                 while (<S>) {
4844                         print;
4845                 }
4846         }
4847
4848 .fi
4849 And here's a server:
4850 .nf
4851
4852         ($port) = @ARGV;
4853         $port = 2345 unless $port;
4854
4855         require 'sys/socket.ph';
4856
4857         $sockaddr = 'S n a4 x8';
4858
4859         ($name, $aliases, $proto) = getprotobyname('tcp');
4860         ($name, $aliases, $port) = getservbyname($port, 'tcp')
4861                 unless $port =~ /^\ed+$/;
4862
4863         $this = pack($sockaddr, &AF_INET, $port, "\e0\e0\e0\e0");
4864
4865         select(NS); $| = 1; select(STDOUT);
4866
4867         socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!";
4868         bind(S, $this) || die "bind: $!";
4869         listen(S, 5) || die "listen: $!";
4870
4871         select(S); $| = 1; select(STDOUT);
4872
4873         for (;;) {
4874                 print "Listening again\en";
4875                 ($addr = accept(NS,S)) || die $!;
4876                 print "accept ok\en";
4877
4878                 ($af,$port,$inetaddr) = unpack($sockaddr,$addr);
4879                 @inetaddr = unpack('C4',$inetaddr);
4880                 print "$af $port @inetaddr\en";
4881
4882                 while (<NS>) {
4883                         print;
4884                         print NS;
4885                 }
4886         }
4887
4888 .fi
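The $sockaddr template used in both programs can be tried out without a
network.
The sketch below packs and unpacks an address by hand.
It assumes that &AF_INET is 2 and that the system's sockaddr_in structure has
the layout the template describes; check sys/socket.ph on your machine.
.nf

```perl
$sockaddr = 'S n a4 x8';
$af_inet = 2;                    # assumed value of &AF_INET
$addr = pack('C4', 127, 0, 0, 1);        # 127.0.0.1 as four bytes
$packed = pack($sockaddr, $af_inet, 2345, $addr);

($af, $port, $inetaddr) = unpack($sockaddr, $packed);
@inetaddr = unpack('C4', $inetaddr);
print "$af $port ", join('.', @inetaddr), "\n";
print "length ", length($packed), "\n";
```

.fi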
4889 .Sh "Predefined Names"
4890 The following names have special meaning to
4891 .IR perl .
4892 I could have used alphabetic symbols for some of these, but I didn't want
4893 to take the chance that someone would say reset \*(L"a\-zA\-Z\*(R" and wipe them all
4894 out.
4895 You'll just have to suffer along with these silly symbols.
4896 Most of them have reasonable mnemonics, or analogues in one of the shells.
4897 .Ip $_ 8
4898 The default input and pattern-searching space.
4899 The following pairs are equivalent:
4900 .nf
4901
4902 .ne 2
4903         while (<>) {\|.\|.\|.   # only equivalent in while!
4904         while ($_ = <>) {\|.\|.\|.
4905
4906 .ne 2
4907         /\|^Subject:/
4908         $_ \|=~ \|/\|^Subject:/
4909
4910 .ne 2
4911         y/a\-z/A\-Z/
4912         $_ =~ y/a\-z/A\-Z/
4913
4914 .ne 2
4915         chop
4916         chop($_)
4917
4918 .fi
4919 (Mnemonic: underline is understood in certain operations.)
4920 .Ip $. 8
4921 The current input line number of the last filehandle that was read.
4922 Readonly.
4923 Remember that only an explicit close on the filehandle resets the line number.
4924 Since <> never does an explicit close, line numbers increase across ARGV files
4925 (but see examples under eof).
4926 (Mnemonic: many programs use . to mean the current line number.)
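.Sp
The effect of an explicit close can be seen by reading the same file several
times.
In this sketch the scratch file name is made up:
.nf

```perl
$file = "/tmp/lineno$$";         # made-up scratch file
open(OUT, ">$file") || die "Can't create $file: $!";
print OUT "one\ntwo\nthree\n";
close(OUT);

open(IN, $file) || die "Can't read $file: $!";
while (<IN>) { }
$first = $.;

open(IN, $file) || die "Can't read $file: $!";  # reopened with no close
while (<IN>) { }
$second = $.;                    # numbering has continued

close(IN);                       # an explicit close resets the count
open(IN, $file) || die "Can't read $file: $!";
while (<IN>) { }
$third = $.;
close(IN);
unlink($file);
print "$first $second $third\n";
```

.fi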
4927 .Ip $/ 8
4928 The input record separator, newline by default.
4929 Works like
4930 .IR awk 's
4931 RS variable, including treating blank lines as delimiters
4932 if set to the null string.
4933 You may set it to a multi-character string to match a multi-character
4934 delimiter.
4935 (Mnemonic: / is used to delimit line boundaries when quoting poetry.)
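.Sp
A minimal sketch of reading paragraphs by setting $/ to the null string; the
scratch file and its contents are invented for the example.
.nf

```perl
$file = "/tmp/rs$$";             # made-up scratch file
open(OUT, ">$file") || die "Can't create $file: $!";
print OUT "line one\nline two\n\n\npara two\n";
close(OUT);

open(IN, $file) || die "Can't read $file: $!";
$/ = '';                         # null string: blank lines delimit records
@paras = <IN>;
$/ = "\n";                       # back to the default
close(IN);
unlink($file);
print scalar(@paras), " paragraphs\n";
```

.fi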
4936 .Ip $, 8
4937 The output field separator for the print operator.
4938 Ordinarily the print operator simply prints out the comma separated fields
4939 you specify.
4940 In order to get behavior more like
4941 .IR awk ,
4942 set this variable as you would set
4943 .IR awk 's
4944 OFS variable to specify what is printed between fields.
4945 (Mnemonic: what is printed when there is a , in your print statement.)
4946 .Ip $"" 8
4947 This is like $, except that it applies to array values interpolated into
4948 a double-quoted string (or similar interpreted string).
4949 Default is a space.
4950 (Mnemonic: obvious, I think.)
4951 .Ip $\e 8
4952 The output record separator for the print operator.
4953 Ordinarily the print operator simply prints out the comma separated fields
4954 you specify, with no trailing newline or record separator assumed.
4955 In order to get behavior more like
4956 .IR awk ,
4957 set this variable as you would set
4958 .IR awk 's
4959 ORS variable to specify what is printed at the end of the print.
4960 (Mnemonic: you set $\e instead of adding \en at the end of the print.
4961 Also, it's just like /, but it's what you get \*(L"back\*(R" from
4962 .IR perl .)
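.Sp
As a minimal sketch (not taken from the perl distribution), the three
separator variables $, and $" and $\e can be set together to see where
each one applies.
The values chosen here are arbitrary and serve only as an illustration.
.nf

```perl
@list = ('a', 'b', 'c');
$, = ':';       # printed between the arguments of print
$\ = "\n";      # printed after the last argument of print
$" = '-';       # joins array elements interpolated into a string
$line = "@list";
print 'x', 'y', $line;      # prints x:y:a-b-c and a newline
```

.fi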
4963 .Ip $# 8
4964 The output format for printed numbers.
4965 This variable is a half-hearted attempt to emulate
4966 .IR awk 's
4967 OFMT variable.
4968 There are times, however, when
4969 .I awk
4970 and
4971 .I perl
4972 have differing notions of what
4973 is in fact numeric.
4974 Also, the initial value is %.20g rather than %.6g, so you need to set $#
4975 explicitly to get
4976 .IR awk 's
4977 value.
4978 (Mnemonic: # is the number sign.)
4979 .Ip $% 8
4980 The current page number of the currently selected output channel.
4981 (Mnemonic: % is page number in nroff.)
4982 .Ip $= 8
4983 The current page length (printable lines) of the currently selected output
4984 channel.
4985 Default is 60.
4986 (Mnemonic: = has horizontal lines.)
4987 .Ip $\- 8
4988 The number of lines left on the page of the currently selected output channel.
4989 (Mnemonic: lines_on_page \- lines_printed.)
4990 .Ip $~ 8
4991 The name of the current report format for the currently selected output
4992 channel.
4993 (Mnemonic: brother to $^.)
4994 .Ip $^ 8
4995 The name of the current top-of-page format for the currently selected output
4996 channel.
4997 (Mnemonic: points to top of page.)
4998 .Ip $| 8
4999 If set to nonzero, forces a flush after every write or print on the currently
5000 selected output channel.
5001 Default is 0.
5002 Note that
5003 .I STDOUT
5004 will typically be line buffered if output is to the
5005 terminal and block buffered otherwise.
5006 Setting this variable is useful primarily when you are outputting to a pipe,
5007 such as when you are running a
5008 .I perl
5009 script under rsh and want to see the
5010 output as it's happening.
5011 (Mnemonic: when you want your pipes to be piping hot.)
5012 .Ip $$ 8
5013 The process number of the
5014 .I perl
5015 running this script.
5016 (Mnemonic: same as shells.)
5017 .Ip $? 8
5018 The status returned by the last pipe close, backtick (\`\`) command or
5019 .I system
5020 operator.
5021 Note that this is the status word returned by the wait() system
5022 call, so the exit value of the subprocess is actually ($? >> 8).
5023 $? & 255 gives which signal, if any, the process died from, and whether
5024 there was a core dump.
5025 (Mnemonic: similar to sh and ksh.)
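.Sp
The following sketch shows one way the status word might be taken apart.
It assumes that a Bourne shell named sh can be found through PATH, and
that the low byte follows the traditional wait() layout (seven bits of
signal number plus a core dump bit).
.nf

```perl
# Run a child that exits with value 3, bypassing perl's own shell lookup.
system('sh', '-c', 'exit 3');
$exit   = $? >> 8;      # exit value of the subprocess
$signal = $? & 127;     # signal that killed it, or 0 if none
$core   = $? & 128;     # nonzero if there was a core dump
print "exit=$exit signal=$signal core=", ($core ? 1 : 0), "\n";
# prints exit=3 signal=0 core=0
```

.fi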
5026 .Ip $& 8 4
5027 The string matched by the last pattern match (not counting any matches hidden
5028 within a BLOCK or eval enclosed by the current BLOCK).
5029 (Mnemonic: like & in some editors.)
5030 .Ip $\` 8 4
5031 The string preceding whatever was matched by the last pattern match
5032 (not counting any matches hidden within a BLOCK or eval enclosed by the current
5033 BLOCK).
5034 (Mnemonic: \` often precedes a quoted string.)
5035 .Ip $\' 8 4
5036 The string following whatever was matched by the last pattern match
5037 (not counting any matches hidden within a BLOCK or eval enclosed by the current
5038 BLOCK).
5039 (Mnemonic: \' often follows a quoted string.)
5040 Example:
5041 .nf
5042
5043 .ne 3
5044         $_ = \'abcdefghi\';
5045         /def/;
5046         print "$\`:$&:$\'\en";          # prints abc:def:ghi
5047
5048 .fi
5049 .Ip $+ 8 4
5050 The last bracket matched by the last search pattern.
5051 This is useful if you don't know which of a set of alternative patterns
5052 matched.
5053 For example:
5054 .nf
5055
5056     /Version: \|(.*\|)|Revision: \|(.*\|)\|/ \|&& \|($rev = $+);
5057
5058 .fi
5059 (Mnemonic: be positive and forward looking.)
5060 .Ip $* 8 2
5061 Set to 1 to do multiline matching within a string, 0 to tell
5062 .I perl
5063 that it can assume that strings contain a single line, for the purpose
5064 of optimizing pattern matches.
5065 Pattern matches on strings containing multiple newlines can produce confusing
5066 results when $* is 0.
5067 Default is 0.
5068 (Mnemonic: * matches multiple things.)
5069 Note that this variable only influences the interpretation of ^ and $.
5070 A literal newline can be searched for even when $* == 0.
5071 .Ip $0 8
5072 Contains the name of the file containing the
5073 .I perl
5074 script being executed.
5075 Assigning to $0 modifies the argument area that the ps(1) program sees.
5076 (Mnemonic: same as sh and ksh.)
5077 .Ip $<digit> 8
5078 Contains the subpattern from the corresponding set of parentheses in the last
5079 pattern matched, not counting patterns matched in nested blocks that have
5080 been exited already.
5081 (Mnemonic: like \edigit.)
5082 .Ip $[ 8 2
5083 The index of the first element in an array, and of the first character in
5084 a substring.
5085 Default is 0, but you could set it to 1 to make
5086 .I perl
5087 behave more like
5088 .I awk
5089 (or Fortran)
5090 when subscripting and when evaluating the index() and substr() functions.
5091 (Mnemonic: [ begins subscripts.)
5092 .Ip $] 8 2
5093 The string printed out when you say \*(L"perl -v\*(R".
5094 It can be used to determine at the beginning of a script whether the perl
5095 interpreter executing the script is in the right range of versions.
5096 If used in a numeric context, returns the version + patchlevel / 1000.
5097 Example:
5098 .nf
5099
5100 .ne 8
5101         # see if getc is available
5102         ($version,$patchlevel) =
5103                  $] =~ /(\ed+\e.\ed+).*\enPatch level: (\ed+)/;
5104         print STDERR "(No filename completion available.)\en"
5105                  if $version * 1000 + $patchlevel < 2016;
5106
5107 or, used numerically,
5108
5109         warn "No checksumming!\en" if $] < 3.019;
5110
5111 .fi
5112 (Mnemonic: Is this version of perl in the right bracket?)
5113 .Ip $; 8 2
5114 The subscript separator for multi-dimensional array emulation.
5115 If you refer to an associative array element as
5116 .nf
5117         $foo{$a,$b,$c}
5118
5119 it really means
5120
5121         $foo{join($;, $a, $b, $c)}
5122
5123 But don't put
5124
5125         @foo{$a,$b,$c}          # a slice\*(--note the @
5126
5127 which means
5128
5129         ($foo{$a},$foo{$b},$foo{$c})
5130
5131 .fi
5132 Default is "\e034", the same as SUBSEP in
5133 .IR awk .
5134 Note that if your keys contain binary data there might not be any safe
5135 value for $;.
5136 (Mnemonic: comma (the syntactic subscript separator) is a semi-semicolon.
5137 Yeah, I know, it's pretty lame, but $, is already taken for something more
5138 important.)
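.Sp
As an illustration, the sketch below stores one element under a
two-part subscript and then recovers the parts by splitting the key on $;.
The array name %cell and the subscripts are hypothetical, and the keys are
assumed not to contain the separator themselves.
.nf

```perl
$cell{2,5} = 'x';               # same as $cell{join($;, 2, 5)}
$joined = join($;, 2, 5);
$sep = $;;                      # copy, to keep the pattern readable
foreach $key (keys %cell) {
    ($row, $col) = split(/$sep/, $key);     # recover the subscripts
    print "row=$row col=$col value=$cell{$key}\n";
    # prints row=2 col=5 value=x
}
print "same key\n" if $cell{$joined} eq 'x';
```

.fi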
5139 .Ip $! 8 2
5140 If used in a numeric context, yields the current value of errno, with all the
5141 usual caveats.
5142 (This means that you shouldn't depend on the value of $! to be anything
5143 in particular unless you've gotten a specific error return indicating a
5144 system error.)
5145 If used in a string context, yields the corresponding system error string.
5146 You can assign to $! in order to set errno
5147 if, for instance, you want $! to return the string for error n, or you want
5148 to set the exit value for the die operator.
5149 (Mnemonic: What just went bang?)
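.Sp
A short sketch of the two contexts follows.
The path is hypothetical and is assumed not to exist, so that the open
fails; the exact number and message depend on the system.
.nf

```perl
$file = '/no/such/directory/file';  # hypothetical, assumed not to exist
if (open(IN, $file)) {
    close(IN);
} else {
    # Copy $! right away, before another system call can change errno.
    $errno   = $! + 0;      # numeric context: the value of errno
    $message = "$!";        # string context: the system error string
    print "Can't open $file: $message (errno $errno)\n";
}
```

.fi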
5150 .Ip $@ 8 2
5151 The perl syntax error message from the last eval command.
5152 If null, the last eval parsed and executed correctly (although the operations
5153 you invoked may have failed in the normal fashion).
5154 (Mnemonic: Where was the syntax error \*(L"at\*(R"?)
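.Sp
For instance, this sketch evaluates one deliberately malformed string and
one valid string, and saves $@ after each.
The variable names are chosen only for illustration.
.nf

```perl
eval '$x = ;';                  # deliberately malformed
$failed = $@;                   # non-null: holds the error message
eval '$sum = 1 + 2;';
$passed = $@;                   # null: parsed and executed correctly
print "first eval failed: $failed" if $failed ne '';
print "second eval ok, sum is $sum\n" if $passed eq '';
```

.fi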
5155 .Ip $< 8 2
5156 The real uid of this process.
5157 (Mnemonic: it's the uid you came FROM, if you're running setuid.)
5158 .Ip $> 8 2
5159 The effective uid of this process.
5160 Example:
5161 .nf
5162
5163 .ne 2
5164         $< = $>;        # set real uid to the effective uid
5165         ($<,$>) = ($>,$<);      # swap real and effective uid
5166
5167 .fi
5168 (Mnemonic: it's the uid you went TO, if you're running setuid.)
5169 Note: $< and $> can only be swapped on machines supporting setreuid().
5170 .Ip $( 8 2
5171 The real gid of this process.
5172 If you are on a machine that supports membership in multiple groups
5173 simultaneously, gives a space separated list of groups you are in.
5174 The first number is the one returned by getgid(), and the subsequent ones
5175 by getgroups(), one of which may be the same as the first number.
5176 (Mnemonic: parentheses are used to GROUP things.
5177 The real gid is the group you LEFT, if you're running setgid.)
5178 .Ip $) 8 2
5179 The effective gid of this process.
5180 If you are on a machine that supports membership in multiple groups
5181 simultaneously, gives a space separated list of groups you are in.
5182 The first number is the one returned by getegid(), and the subsequent ones
5183 by getgroups(), one of which may be the same as the first number.
5184 (Mnemonic: parentheses are used to GROUP things.
5185 The effective gid is the group that's RIGHT for you, if you're running setgid.)
5186 .Sp
5187 Note: $<, $>, $( and $) can only be set on machines that support the
5188 corresponding set[re][ug]id() routine.
5189 $( and $) can only be swapped on machines supporting setregid().
5190 .Ip $: 8 2
5191 The current set of characters after which a string may be broken to
5192 fill continuation fields (starting with ^) in a format.
5193 Default is "\ \en-", to break on whitespace or hyphens.
5194 (Mnemonic: a \*(L"colon\*(R" in poetry is a part of a line.)
5195 .Ip $^D 8 2
5196 The current value of the debugging flags.
5197 (Mnemonic: value of
5198 .B \-D
5199 switch.)
5200 .Ip $^I 8 2
5201 The current value of the inplace-edit extension.
5202 Use undef to disable inplace editing.
5203 (Mnemonic: value of
5204 .B \-i
5205 switch.)
5206 .Ip $^P 8 2
5207 The name that Perl itself was invoked as, from argv[0].
5208 .Ip $^T 8 2
5209 The time at which the script began running, in seconds since the epoch.
5210 The values returned by the
5211 .B \-M ,
5212 .B \-A
5213 and
5214 .B \-C
5215 filetests are based on this value.
5216 .Ip $^W 8 2
5217 The current value of the warning switch.
5218 (Mnemonic: related to the
5219 .B \-w
5220 switch.)
5221 .Ip $ARGV 8 3
5222 Contains the name of the current file when reading from <>.
5223 .Ip @ARGV 8 3
5224 The array ARGV contains the command line arguments intended for the script.
5225 Note that $#ARGV is generally the number of arguments minus one, since
5226 $ARGV[0] is the first argument, NOT the command name.
5227 See $0 for the command name.
5228 .Ip @INC 8 3
5229 The array INC contains the list of places to look for
5230 .I perl
5231 scripts to be
5232 evaluated by the \*(L"do EXPR\*(R" command or the \*(L"require\*(R" command.
5233 It initially consists of the arguments to any
5234 .B \-I
5235 command line switches, followed
5236 by the default
5237 .I perl
5238 library, probably \*(L"/usr/local/lib/perl\*(R",
5239 followed by \*(L".\*(R", to represent the current directory.
5240 .Ip %INC 8 3
5241 The associative array INC contains entries for each filename that has
5242 been included via \*(L"do\*(R" or \*(L"require\*(R".
5243 The key is the filename you specified, and the value is the location of
5244 the file actually found.
5245 The \*(L"require\*(R" command uses this array to determine whether
5246 a given file has already been included.
5247 .Ip $ENV{expr} 8 2
5248 The associative array ENV contains your current environment.
5249 Setting a value in ENV changes the environment for child processes.
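.Sp
As a sketch, the child shell below sees a value set by the script.
GREETING is a hypothetical variable name, and a Unix shell with an echo
command is assumed.
.nf

```perl
$ENV{'GREETING'} = 'hello';     # hypothetical variable name
$seen = `echo \$GREETING`;      # the child shell expands it, not perl
chop($seen);                    # remove the trailing newline
print "child saw: $seen\n";     # prints child saw: hello
```

.fi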
5250 .Ip $SIG{expr} 8 2
5251 The associative array SIG is used to set signal handlers for various signals.
5252 Example:
5253 .nf
5254
5255 .ne 12
5256         sub handler {   # 1st argument is signal name
5257                 local($sig) = @_;
5258                 print "Caught a SIG$sig\-\|\-shutting down\en";
5259                 close(LOG);
5260                 exit(0);
5261         }
5262
5263         $SIG{\'INT\'} = \'handler\';
5264         $SIG{\'QUIT\'} = \'handler\';
5265         .\|.\|.
5266         $SIG{\'INT\'} = \'DEFAULT\';    # restore default action
5267         $SIG{\'QUIT\'} = \'IGNORE\';    # ignore SIGQUIT
5268
5269 .fi
5270 The SIG array only contains values for the signals actually set within
5271 the perl script.
5272 .Sh "Packages"
5273 Perl provides a mechanism for alternate namespaces to protect packages from
5274 stomping on each other's variables.
5275 By default, a perl script starts compiling into the package known as \*(L"main\*(R".
5276 By use of the
5277 .I package
5278 declaration, you can switch namespaces.
5279 The scope of the package declaration is from the declaration itself to the end
5280 of the enclosing block (the same scope as the local() operator).
5281 Typically it would be the first declaration in a file to be included by
5282 the \*(L"require\*(R" operator.
5283 You can switch into a package in more than one place; it merely influences
5284 which symbol table is used by the compiler for the rest of that block.
5285 You can refer to variables and filehandles in other packages by prefixing
5286 the identifier with the package name and a single quote.
5287 If the package name is null, the \*(L"main\*(R" package is assumed.
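.PP
The following sketch illustrates the qualified form with a hypothetical
package named counter.
It uses the single quote syntax described here; later versions of perl
prefer a double colon for the same purpose.
.nf

```perl
package counter;                # hypothetical package name
$count = 0;
sub bump { $count++; }          # refers to $counter'count

package main;
$count = 100;                   # a different variable: $main'count
&counter'bump;
&counter'bump;
print "counter: $counter'count, main: $count\n";
# prints counter: 2, main: 100
```

.fi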
5288 .PP
5289 Only identifiers starting with letters are stored in the package's symbol
5290 table.
5291 All other symbols are kept in package \*(L"main\*(R".
5292 In addition, the identifiers STDIN, STDOUT, STDERR, ARGV, ARGVOUT, ENV, INC
5293 and SIG are forced to be in package \*(L"main\*(R", even when used for
5294 other purposes than their built-in one.
5295 Note also that, if you have a package called \*(L"m\*(R", \*(L"s\*(R"
5296 or \*(L"y\*(R", then you can't use the qualified form of an identifier since it
5297 will be interpreted instead as a pattern match, a substitution
5298 or a translation.
5299 .PP
5300 Eval'ed strings are compiled in the package in which the eval was
5301 compiled.
5302 (Assignments to $SIG{}, however, assume the signal handler specified is in the
5303 main package.
5304 Qualify the signal handler name if you wish to have a signal handler in
5305 a package.)
5306 For an example, examine perldb.pl in the perl library.
5307 It initially switches to the DB package so that the debugger doesn't interfere
5308 with variables in the script you are trying to debug.
5309 At various points, however, it temporarily switches back to the main package
5310 to evaluate various expressions in the context of the main package.
5311 .PP
5312 The symbol table for a package happens to be stored in the associative array
5313 of that name prepended with an underscore.
5314 The value in each entry of the associative array is
5315 what you are referring to when you use the *name notation.
5316 In fact, the following have the same effect (in package main, anyway),
5317 though the first is more
5318 efficient because it does the symbol table lookups at compile time:
5319 .nf
5320
5321 .ne 2
5322         local(*foo) = *bar;
5323         local($_main{'foo'}) = $_main{'bar'};
5324
5325 .fi
5326 You can use this to print out all the variables in a package, for instance.
5327 Here is dumpvar.pl from the perl library:
5328 .nf
5329 .ne 11
5330         package dumpvar;
5331
5332         sub main'dumpvar {
5333         \&    ($package) = @_;
5334         \&    local(*stab) = eval("*_$package");
5335         \&    while (($key,$val) = each(%stab)) {
5336         \&        {
5337         \&            local(*entry) = $val;
5338         \&            if (defined $entry) {
5339         \&                print "\e$$key = '$entry'\en";
5340         \&            }
5341 .ne 7
5342         \&            if (defined @entry) {
5343         \&                print "\e@$key = (\en";
5344         \&                foreach $num ($[ .. $#entry) {
5345         \&                    print "  $num\et'",$entry[$num],"'\en";
5346         \&                }
5347         \&                print ")\en";
5348         \&            }
5349 .ne 10
5350         \&            if ($key ne "_$package" && defined %entry) {
5351         \&                print "\e%$key = (\en";
5352         \&                foreach $key (sort keys(%entry)) {
5353         \&                    print "  $key\et'",$entry{$key},"'\en";
5354         \&                }
5355         \&                print ")\en";
5356         \&            }
5357         \&        }
5358         \&    }
5359         }
5360
5361 .fi
5362 Note that, even though the subroutine is compiled in package dumpvar, the
5363 name of the subroutine is qualified so that its name is inserted into package
5364 \*(L"main\*(R".
5365 .Sh "Style"
5366 Each programmer will, of course, have his or her own preferences in regard
5367 to formatting, but there are some general guidelines that will make your
5368 programs easier to read.
5369 .Ip 1. 4 4
5370 Just because you CAN do something a particular way doesn't mean that
5371 you SHOULD do it that way.
5372 .I Perl
5373 is designed to give you several ways to do anything, so consider picking
5374 the most readable one.
5375 For instance
5376
5377         open(FOO,$foo) || die "Can't open $foo: $!";
5378
5379 is better than
5380
5381         die "Can't open $foo: $!" unless open(FOO,$foo);
5382
5383 because the second way hides the main point of the statement in a
5384 modifier.
5385 On the other hand
5386
5387         print "Starting analysis\en" if $verbose;
5388
5389 is better than
5390
5391         $verbose && print "Starting analysis\en";
5392
5393 since the main point isn't whether the user typed -v or not.
5394 .Sp
5395 Similarly, just because an operator lets you assume default arguments
5396 doesn't mean that you have to make use of the defaults.
5397 The defaults are there for lazy systems programmers writing one-shot
5398 programs.
5399 If you want your program to be readable, consider supplying the argument.
5400 .Sp
5401 Along the same lines, just because you
5402 .I can
5403 omit parentheses in many places doesn't mean that you ought to:
5404 .nf
5405
5406         return print reverse sort num values array;
5407         return print(reverse(sort num (values(%array))));
5408
5409 .fi
5410 When in doubt, parenthesize.
5411 At the very least it will let some poor schmuck bounce on the % key in vi.
5412 .Sp
5413 Even if you aren't in doubt, consider the mental welfare of the person who
5414 has to maintain the code after you, and who will probably put parens in
5415 the wrong place.
5416 .Ip 2. 4 4
5417 Don't go through silly contortions to exit a loop at the top or the
5418 bottom, when
5419 .I perl
5420 provides the "last" operator so you can exit in the middle.
5421 Just outdent it a little to make it more visible:
5422 .nf
5423
5424 .ne 7
5425     line:
5426         for (;;) {
5427             statements;
5428         last line if $foo;
5429             next line if /^#/;
5430             statements;
5431         }
5432
5433 .fi
5434 .Ip 3. 4 4
5435 Don't be afraid to use loop labels\*(--they're there to enhance readability as
5436 well as to allow multi-level loop breaks.
5437 See last example.
5438 .Ip 4. 4 4
5439 For portability, when using features that may not be implemented on every
5440 machine, test the construct in an eval to see if it fails.
5441 If you know in which version or patchlevel a particular feature was implemented,
5442 you can test $] to see if it will be there.
5443 .Ip 5. 4 4
5444 Choose mnemonic identifiers.
5445 .Ip 6. 4 4
5446 Be consistent.
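.PP
Guideline 4 can be sketched as follows.
The symlink() call is used only as an example of a feature that may be
missing on some machines, and the 3.019 threshold is borrowed from the
example given under $].
.nf

```perl
# The eval yields 1 if symlink() is implemented, even though the call
# itself fails; it yields undef if perl dies because it is unimplemented.
$has_symlink = (eval 'symlink("", ""); 1;') ? 1 : 0;
print "symlinks ", ($has_symlink ? "are" : "are not"), " available\n";

# Version test, using $] in a numeric context.
print "checksumming should be available\n" if $] >= 3.019;
```

.fi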
5447 .Sh "Debugging"
5448 If you invoke
5449 .I perl
5450 with a
5451 .B \-d
5452 switch, your script will be run under a debugging monitor.
5453 It will halt before the first executable statement and ask you for a
5454 command, such as:
5455 .Ip "h" 12 4
5456 Prints out a help message.
5457 .Ip "T" 12 4
5458 Stack trace.
5459 .Ip "s" 12 4
5460 Single step.
5461 Executes until it reaches the beginning of another statement.
5462 .Ip "n" 12 4
5463 Next.
5464 Executes over subroutine calls, until it reaches the beginning of the
5465 next statement.
5466 .Ip "f" 12 4
5467 Finish.
5468 Executes statements until it has finished the current subroutine.
5469 .Ip "c" 12 4
5470 Continue.
5471 Executes until the next breakpoint is reached.
5472 .Ip "c line" 12 4
5473 Continue to the specified line.
5474 Inserts a one-time-only breakpoint at the specified line.
5475 .Ip "<CR>" 12 4
5476 Repeat last n or s.
5477 .Ip "l min+incr" 12 4
5478 List incr+1 lines starting at min.
5479 If min is omitted, starts where last listing left off.
5480 If incr is omitted, previous value of incr is used.
5481 .Ip "l min-max" 12 4
5482 List lines in the indicated range.
5483 .Ip "l line" 12 4
5484 List just the indicated line.
5485 .Ip "l" 12 4
5486 List next window.
5487 .Ip "-" 12 4
5488 List previous window.
5489 .Ip "w line" 12 4
5490 List window around line.
5491 .Ip "l subname" 12 4
5492 List subroutine.
5493 If it's a long subroutine it just lists the beginning.
5494 Use \*(L"l\*(R" to list more.
5495 .Ip "/pattern/" 12 4
5496 Regular expression search forward for pattern; the final / is optional.
5497 .Ip "?pattern?" 12 4
5498 Regular expression search backward for pattern; the final ? is optional.
5499 .Ip "L" 12 4
5500 List lines that have breakpoints or actions.
5501 .Ip "S" 12 4
5502 Lists the names of all subroutines.
5503 .Ip "t" 12 4
5504 Toggle trace mode on or off.
5505 .Ip "b line condition" 12 4
5506 Set a breakpoint.
5507 If line is omitted, sets a breakpoint on the
5508 line that is about to be executed.
5509 If a condition is specified, it is evaluated each time the statement is
5510 reached and a breakpoint is taken only if the condition is true.
5511 Breakpoints may only be set on lines that begin an executable statement.
5512 .Ip "b subname condition" 12 4
5513 Set breakpoint at first executable line of subroutine.
5514 .Ip "d line" 12 4
5515 Delete breakpoint.
5516 If line is omitted, deletes the breakpoint on the
5517 line that is about to be executed.
5518 .Ip "D" 12 4
5519 Delete all breakpoints.
5520 .Ip "a line command" 12 4
5521 Set an action for line.
5522 A multi-line command may be entered by backslashing the newlines.
5523 .Ip "A" 12 4
5524 Delete all line actions.
5525 .Ip "< command" 12 4
5526 Set an action to happen before every debugger prompt.
5527 A multi-line command may be entered by backslashing the newlines.
5528 .Ip "> command" 12 4
5529 Set an action to happen after the prompt when you've just given a command
5530 to return to executing the script.
5531 A multi-line command may be entered by backslashing the newlines.
5532 .Ip "V package" 12 4
5533 List all variables in package.
5534 Default is main package.
5535 .Ip "! number" 12 4
5536 Redo a debugging command.
5537 If number is omitted, redoes the previous command.
5538 .Ip "! -number" 12 4
5539 Redo the command that was that many commands ago.
5540 .Ip "H -number" 12 4
5541 Display the last number commands.
5542 Only commands longer than one character are listed.
5543 If number is omitted, lists them all.
5544 .Ip "q or ^D" 12 4
5545 Quit.
5546 .Ip "command" 12 4
5547 Execute command as a perl statement.
5548 A missing semicolon will be supplied.
5549 .Ip "p expr" 12 4
5550 Same as \*(L"print DB'OUT expr\*(R".
5551 The DB'OUT filehandle is opened to /dev/tty, regardless of where STDOUT
5552 may be redirected to.
5553 .PP
5554 If you want to modify the debugger, copy perldb.pl from the perl library
5555 to your current directory and modify it as necessary.
5556 (You'll also have to put -I. on your command line.)
5557 You can do some customization by setting up a .perldb file which contains
5558 initialization code.
5559 For instance, you could make aliases like these:
5560 .nf
5561
5562     $DB'alias{'len'} = 's/^len(.*)/p length($1)/';
5563     $DB'alias{'stop'} = 's/^stop (at|in)/b/';
5564     $DB'alias{'.'} =
5565       's/^\e./p "\e$DB\e'sub(\e$DB\e'line):\et",\e$DB\e'line[\e$DB\e'line]/';
5566
5567 .fi
5568 .Sh "Setuid Scripts"
5569 .I Perl
5570 is designed to make it easy to write secure setuid and setgid scripts.
5571 Unlike shells, which are based on multiple substitution passes on each line
5572 of the script,
5573 .I perl
5574 uses a more conventional evaluation scheme with fewer hidden \*(L"gotchas\*(R".
5575 Additionally, since the language has more built-in functionality, it
5576 has to rely less upon external (and possibly untrustworthy) programs to
5577 accomplish its purposes.
5578 .PP
5579 In an unpatched 4.2 or 4.3bsd kernel, setuid scripts are intrinsically
5580 insecure, but this kernel feature can be disabled.
5581 If it is,
5582 .I perl
5583 can emulate the setuid and setgid mechanism when it notices the otherwise
5584 useless setuid/gid bits on perl scripts.
5585 If the kernel feature isn't disabled,
5586 .I perl
5587 will complain loudly that your setuid script is insecure.
5588 You'll need to either disable the kernel setuid script feature, or put
5589 a C wrapper around the script.
5590 .PP
5591 When perl is executing a setuid script, it takes special precautions to
5592 prevent you from falling into any obvious traps.
5593 (In some ways, a perl script is more secure than the corresponding
5594 C program.)
5595 Any command line argument, environment variable, or input is marked as
5596 \*(L"tainted\*(R", and may not be used, directly or indirectly, in any
5597 command that invokes a subshell, or in any command that modifies files,
5598 directories or processes.
5599 Any variable that is set within an expression that has previously referenced
5600 a tainted value also becomes tainted (even if it is logically impossible
5601 for the tainted value to influence the variable).
5602 For example:
5603 .nf
5604
5605 .ne 5
5606         $foo = shift;                   # $foo is tainted
5607         $bar = $foo,\'bar\';            # $bar is also tainted
5608         $xxx = <>;                      # Tainted
5609         $path = $ENV{\'PATH\'}; # Tainted, but see below
5610         $abc = \'abc\';                 # Not tainted
5611
5612 .ne 4
5613         system "echo $foo";             # Insecure
5614         system "/bin/echo", $foo;       # Secure (doesn't use sh)
5615         system "echo $bar";             # Insecure
5616         system "echo $abc";             # Insecure until PATH set
5617
5618 .ne 5
5619         $ENV{\'PATH\'} = \'/bin:/usr/bin\';
5620         $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\';
5621
5622         $path = $ENV{\'PATH\'}; # Not tainted
5623         system "echo $abc";             # Is secure now!
5624
5625 .ne 5
5626         open(FOO,"$foo");               # OK
5627         open(FOO,">$foo");              # Not OK
5628
5629         open(FOO,"echo $foo|"); # Not OK, but...
5630         open(FOO,"-|") || exec \'echo\', $foo;  # OK
5631
5632         $zzz = `echo $foo`;             # Insecure, zzz tainted
5633
5634         unlink $abc,$foo;               # Insecure
5635         umask $foo;                     # Insecure
5636
5637 .ne 3
5638         exec "echo $foo";               # Insecure
5639         exec "echo", $foo;              # Secure (doesn't use sh)
5640         exec "sh", \'-c\', $foo;        # Considered secure, alas
5641
5642 .fi
5643 The taintedness is associated with each scalar value, so some elements
5644 of an array can be tainted, and others not.
5645 .PP
5646 If you try to do something insecure, you will get a fatal error saying
5647 something like \*(L"Insecure dependency\*(R" or \*(L"Insecure PATH\*(R".
5648 Note that you can still write an insecure system call or exec,
5649 but only by explicitly doing something like the last example above.
5650 You can also bypass the tainting mechanism by referencing
5651 subpatterns\*(--\c
5652 .I perl
5653 presumes that if you reference a substring using $1, $2, etc, you knew
5654 what you were doing when you wrote the pattern:
5655 .nf
5656
5657         $ARGV[0] =~ /^\-P(\ew+)$/;
5658         $printer = $1;          # Not tainted
5659
5660 .fi
5661 This is fairly secure since \ew+ doesn't match shell metacharacters.
5662 Use of .+ would have been insecure, but
5663 .I perl
5664 doesn't check for that, so you must be careful with your patterns.
5665 This is the ONLY mechanism for untainting user supplied filenames if you
5666 want to do file operations on them (unless you make $> equal to $<).
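.PP
If the match fails, $1 may still hold a value from an earlier match, so a
more careful sketch tests the result of the match before using the
subpattern.
The argument value below is hypothetical and stands in for $ARGV[0].
.nf

```perl
$arg = '-Plaser1';              # stands in for $ARGV[0]
if ($arg =~ /^-P(\w+)$/) {
    $printer = $1;              # considered untainted by perl
} else {
    die "Bad printer argument\n";
}
print "printer is $printer\n";  # prints printer is laser1
```

.fi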
5667 .PP
5668 It's also possible to get into trouble with other operations that don't care
5669 whether they use tainted values.
5670 Make judicious use of the file tests in dealing with any user-supplied
5671 filenames.
5672 When possible, do opens and such after setting $> = $<.
5673 .I Perl
5674 doesn't prevent you from opening tainted filenames for reading, so be
5675 careful what you print out.
5676 The tainting mechanism is intended to prevent stupid mistakes, not to remove
5677 the need for thought.
5678 .SH ENVIRONMENT
5679 .I Perl
5680 uses PATH in executing subprocesses, and in finding the script if \-S
5681 is used.
5682 HOME or LOGDIR are used if chdir has no argument.
5683 .PP
5684 Apart from these,
5685 .I perl
5686 uses no environment variables, except to make them available
5687 to the script being executed, and to child processes.
5688 However, scripts running setuid would do well to execute the following lines
5689 before doing anything else, just to keep people honest:
5690 .nf
5691
5692 .ne 3
5693     $ENV{\'PATH\'} = \'/bin:/usr/bin\';    # or whatever you need
5694     $ENV{\'SHELL\'} = \'/bin/sh\' if $ENV{\'SHELL\'} ne \'\';
5695     $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\';
5696
5697 .fi
5698 .SH AUTHOR
5699 Larry Wall <lwall@jpl-devvax.Jpl.Nasa.Gov>
5700 .br
5701 MS-DOS port by Diomidis Spinellis <dds@cc.ic.ac.uk>
5702 .SH FILES
5703 /tmp/perl\-eXXXXXX      temporary file for
5704 .B \-e
5705 commands.
5706 .SH SEE ALSO
5707 a2p     awk to perl translator
5708 .br
5709 s2p     sed to perl translator
5710 .SH DIAGNOSTICS
5711 Compilation errors will tell you the line number of the error, with an
5712 indication of the next token or token type that was to be examined.
5713 (In the case of a script passed to
5714 .I perl
5715 via
5716 .B \-e
5717 switches, each
5718 .B \-e
5719 is counted as one line.)
5720 .PP
5721 Setuid scripts have additional constraints that can produce error messages
5722 such as \*(L"Insecure dependency\*(R".
5723 See the section on setuid scripts.
5724 .SH TRAPS
5725 Accustomed
5726 .IR awk
5727 users should take special note of the following:
5728 .Ip * 4 2
5729 Semicolons are required after all simple statements in
5730 .IR perl .
5731 Newline
5732 is not a statement delimiter.
5733 .Ip * 4 2
5734 Curly brackets are required on ifs and whiles.
5735 .Ip * 4 2
5736 Variables begin with $ or @ in
5737 .IR perl .
5738 .Ip * 4 2
5739 Arrays index from 0 unless you set $[.
5740 Likewise string positions in substr() and index().
5741 .Ip * 4 2
5742 You have to decide whether your array has numeric or string indices.
5743 .Ip * 4 2
5744 Associative array values do not spring into existence upon mere reference.
5745 .Ip * 4 2
5746 You have to decide whether you want to use string or numeric comparisons.
5747 .Ip * 4 2
5748 Reading an input line does not split it for you.  You get to split it yourself
5749 into an array.
5750 And the
5751 .I split
5752 operator has different arguments.
5753 .Ip * 4 2
5754 The current input line is normally in $_, not $0.
5755 It generally does not have the newline stripped.
5756 ($0 is the name of the program executed.)
5757 .Ip * 4 2
5758 $<digit> does not refer to fields\*(--it refers to substrings matched by the last
5759 match pattern.
5760 .Ip * 4 2
5761 The
5762 .I print
5763 statement does not add field and record separators unless you set
5764 $, and $\e.
5765 .Ip * 4 2
5766 You must open your files before you print to them.
5767 .Ip * 4 2
5768 The range operator is \*(L".\|.\*(R", not comma.
5769 (The comma operator works as in C.)
5770 .Ip * 4 2
5771 The match operator is \*(L"=~\*(R", not \*(L"~\*(R".
5772 (\*(L"~\*(R" is the one's complement operator, as in C.)
5773 .Ip * 4 2
5774 The exponentiation operator is \*(L"**\*(R", not \*(L"^\*(R".
5775 (\*(L"^\*(R" is the XOR operator, as in C.)
5776 .Ip * 4 2
5777 The concatenation operator is \*(L".\*(R", not the null string.
5778 (Using the null string would render \*(L"/pat/ /pat/\*(R" unparsable,
5779 since the third slash would be interpreted as a division operator\*(--the
5780 tokener is in fact slightly context sensitive for operators like /, ?, and <.
5781 And in fact, . itself can be the beginning of a number.)
5782 .Ip * 4 2
5783 .IR Next ,
5784 .I exit
5785 and
5786 .I continue
5787 work differently.
5788 .Ip * 4 2
5789 The following variables work differently:
5790 .nf
5791
5792           Awk   \h'|2.5i'Perl
5793           ARGC  \h'|2.5i'$#ARGV
5794           ARGV[0]       \h'|2.5i'$0
5795           FILENAME\h'|2.5i'$ARGV
5796           FNR   \h'|2.5i'$. \- something
5797           FS    \h'|2.5i'(whatever you like)
5798           NF    \h'|2.5i'$#Fld, or some such
5799           NR    \h'|2.5i'$.
5800           OFMT  \h'|2.5i'$#
5801           OFS   \h'|2.5i'$,
5802           ORS   \h'|2.5i'$\e
5803           RLENGTH       \h'|2.5i'length($&)
5804           RS    \h'|2.5i'$/
5805           RSTART        \h'|2.5i'length($\`)
5806           SUBSEP        \h'|2.5i'$;
5807
5808 .fi
5809 .Ip * 4 2
5810 When in doubt, run the
5811 .I awk
5812 construct through a2p and see what it gives you.
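.PP
Several of the points above can be seen together in a small sketch.
It is not taken from any particular
.I awk
program: the sample line and the array name @Fld are assumptions made
for illustration, and the arithmetic on $#Fld assumes $[ has not been changed.
.nf

        $_ = "alpha beta gamma";        # stands in for one input line
        @Fld = split(' ');              # you split the line yourself
        $, = ':';                       # plays the role of OFS
        print $Fld[0], $#Fld + 1;       # prints "alpha:3"; $#Fld + 1 is like NF

.fi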
5813 .PP
5814 Cerebral C programmers should take note of the following:
5815 .Ip * 4 2
5816 Curly brackets are required on ifs and whiles.
5817 .Ip * 4 2
5818 You should use \*(L"elsif\*(R" rather than \*(L"else if\*(R".
5819 .Ip * 4 2
5820 .I Break
5821 and
5822 .I continue
5823 become
5824 .I last
5825 and
5826 .IR next ,
5827 respectively.
5828 .Ip * 4 2
5829 There's no switch statement.
5830 .Ip * 4 2
5831 Variables begin with $ or @ in
5832 .IR perl .
5833 .Ip * 4 2
5834 Printf does not implement the * width and precision specifier.
5835 .Ip * 4 2
5836 Comments begin with #, not /*.
5837 .Ip * 4 2
5838 You can't take the address of anything.
5839 .Ip * 4 2
5840 ARGV must be capitalized.
5841 .Ip * 4 2
5842 The \*(L"system\*(R" calls link, unlink, rename, etc. return nonzero for success, not 0.
5843 .Ip * 4 2
5844 Signal handlers deal with signal names, not numbers.
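.PP
As a rough illustration of the loop-control points above, the following
sketch uses curly brackets on every branch, elsif, next and last.
The variable names and the list of numbers are made up for the example.
.nf

        $sum = 0;
        foreach $n (1, 2, 3, 4, 5, 6) {
            if ($n == 2) {
                next;                   # where C would say continue
            } elsif ($n > 4) {
                last;                   # where C would say break
            }
            $sum += $n;
        }
        print $sum;                     # prints 8 (1 + 3 + 4)

.fi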
5845 .PP
5846 Seasoned
5847 .I sed
5848 programmers should take note of the following:
5849 .Ip * 4 2
5850 Backreferences in substitutions use $ rather than \e.
5851 .Ip * 4 2
5852 The pattern matching metacharacters (, ), and | do not have backslashes in front.
5853 .Ip * 4 2
5854 The range operator is .\|. rather than comma.
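.PP
The first two points can be sketched as follows.
The sample string is an assumption made for illustration only.
.nf

        $_ = "hello world";
        s/(hello|goodbye) (world)/$2 $1/;       # bare ( | ) and $2 $1
        print;                                  # prints "world hello"

.fi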
5855 .PP
5856 Sharp shell programmers should take note of the following:
5857 .Ip * 4 2
5858 The backtick operator does variable interpretation without regard to the
5859 presence of single quotes in the command.
5860 .Ip * 4 2
5861 The backtick operator does no translation of the return value, unlike csh.
5862 .Ip * 4 2
5863 Shells (especially csh) do several levels of substitution on each command line.
5864 .I Perl
5865 does substitution only in certain constructs such as double quotes,
5866 backticks, angle brackets and search patterns.
5867 .Ip * 4 2
5868 Shells interpret scripts a little bit at a time.
5869 .I Perl
5870 compiles the whole program before executing it.
5871 .Ip * 4 2
5872 The arguments are available via @ARGV, not $1, $2, etc.
5873 .Ip * 4 2
5874 The environment is not automatically made available as variables.
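.PP
A minimal sketch of the last two points follows.
The assignment to @ARGV merely stands in for real command-line arguments,
and the names $first and $path are hypothetical.
.nf

        @ARGV = ('one', 'two');         # stands in for real arguments
        $first = $ARGV[0];              # what the shell calls $1
        $path = $ENV{'PATH'};           # what the shell calls $PATH
        print "$first of ", $#ARGV + 1; # prints "one of 2"

.fi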
5875 .SH ERRATA\0AND\0ADDENDA
5876 The Perl book,
5877 .IR Programming\0Perl ,
5878 has the following omissions and goofs.
5879 .PP
5880 On page 5, the examples which read
5881 .nf
5882
5883         eval "/usr/bin/perl
5884
5885 should read
5886
5887         eval "exec /usr/bin/perl
5888
5889 .fi
5890 .PP
5891 On page 195, the equivalent to the System V sum program only works for
5892 very small files.  To do larger files, use
5893 .nf
5894
5895         undef $/;
5896         $checksum = unpack("%32C*",<>) % 32767;
5897
5898 .fi
5899 .PP
5900 The
5901 .B \-0
5902 switch to set the initial value of $/ was added to Perl after the book
5903 went to press.
5904 .PP
5905 The
5906 .B \-l
5907 switch now does automatic line ending processing.
5908 .PP
5909 The qx// construct is now a synonym for backticks.
5910 .PP
5911 $0 may now be assigned to set the argument displayed by
5912 .IR ps (1).
5913 .PP
5914 The new @###.## format was omitted accidentally from the description
5915 on formats.
5916 .PP
5917 It wasn't known at press time that s///ee caused multiple evaluations of
5918 the replacement expression.  This is to be construed as a feature.
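.PP
As an illustration only (the variable $x and its value are made up for
the example), the double evaluation can be sketched like this:
.nf

        $x = 2;
        $_ = '$x * 3';          # single quotes, so nothing is interpolated yet
        s/(.+)/$1/ee;           # first eval gives the string, second evaluates it
        print;                  # prints 6

.fi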
5919 .PP
5920 (LIST) x $count now does array replication.
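.PP
For instance (a small sketch; the array names are merely illustrative):
.nf

        @ones = (1) x 5;                # (1, 1, 1, 1, 1)
        @ab = ('a', 'b') x 2;           # ('a', 'b', 'a', 'b')
        print join(',', @ab);           # prints "a,b,a,b"

.fi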
5921 .PP
5922 There is now no limit on the number of parentheses in a regular expression.
5923 .PP
5924 In double-quote context, more escapes are supported: \ee, \ea, \ex1b, \ec[,
5925 \el, \eL, \eu, \eU, \eE.  The latter five control up/lower case translation.
5926 .PP
5927 The
5928 .B $/
5929 variable may now be set to a multi-character delimiter.
5930 .SH BUGS
5931 .PP
5932 .I Perl
5933 is at the mercy of your machine's definitions of various operations
5934 such as type casting, atof() and sprintf().
5935 .PP
5936 If your stdio requires a seek or eof between reads and writes on a particular
5937 stream, so does
5938 .IR perl .
5939 (This doesn't apply to sysread() and syswrite().)
5940 .PP
5941 While none of the built-in data types have any arbitrary size limits (apart
5942 from memory size), there are still a few arbitrary limits:
5943 a given identifier may not be longer than 255 characters;
5944 sprintf is limited on many machines to 128 characters per field (unless the format
5945 specifier is exactly %s);
5946 and no component of your PATH may be longer than 255 characters if you use \-S.
5947 .PP
5948 .I Perl
5949 actually stands for Pathologically Eclectic Rubbish Lister, but don't tell
5950 anyone I said that.
5951 .rn }` ''