perl.man

   1 .rn '' }`
   2 ''' $RCSfile: perl.man,v $$Revision: 4.0.1.2 $$Date: 91/06/07 11:41:23 $
   3 '''
   4 ''' $Log:       perl.man,v $
   5 ''' Revision 4.0.1.2  91/06/07  11:41:23  lwall
   6 ''' patch4: added global modifier for pattern matches
   7 ''' patch4: default top-of-form format is now FILEHANDLE_TOP
   8 ''' patch4: added $^P variable to control calling of perldb routines
   9 ''' patch4: added $^F variable to specify maximum system fd, default 2
  10 ''' patch4: changed old $^P to $^X
  11 '''
  12 ''' Revision 4.0.1.1  91/04/11  17:50:44  lwall
  13 ''' patch1: fixed some typos
  14 '''
  15 ''' Revision 4.0  91/03/20  01:38:08  lwall
  16 ''' 4.0 baseline.
  17 '''
  18 '''
  19 .de Sh
  20 .br
  21 .ne 5
  22 .PP
  23 \fB\\$1\fR
  24 .PP
  25 ..
  26 .de Sp
  27 .if t .sp .5v
  28 .if n .sp
  29 ..
  30 .de Ip
  31 .br
  32 .ie \\n(.$>=3 .ne \\$3
  33 .el .ne 3
  34 .IP "\\$1" \\$2
  35 ..
  36 '''
  37 '''     Set up \*(-- to give an unbreakable dash;
  38 '''     string Tr holds user defined translation string.
  39 '''     Bell System Logo is used as a dummy character.
  40 '''
  41 .tr \(*W-|\(bv\*(Tr
  42 .ie n \{\
  43 .ds -- \(*W-
  44 .if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
  45 .if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
  46 .ds L" ""
  47 .ds R" ""
  48 .ds L' '
  49 .ds R' '
  50 'br\}
  51 .el\{\
  52 .ds -- \(em\|
  53 .tr \*(Tr
  54 .ds L" ``
  55 .ds R" ''
  56 .ds L' `
  57 .ds R' '
  58 'br\}
  59 .TH PERL 1 "\*(RP"
  60 .UC
  61 .SH NAME
  62 perl \- Practical Extraction and Report Language
  63 .SH SYNOPSIS
  64 .B perl
  65 [options] filename args
  66 .SH DESCRIPTION
  67 .I Perl
  68 is an interpreted language optimized for scanning arbitrary text files,
  69 extracting information from those text files, and printing reports based
  70 on that information.
  71 It's also a good language for many system management tasks.
  72 The language is intended to be practical (easy to use, efficient, complete)
  73 rather than beautiful (tiny, elegant, minimal).
  74 It combines (in the author's opinion, anyway) some of the best features of C,
  75 \fIsed\fR, \fIawk\fR, and \fIsh\fR,
  76 so people familiar with those languages should have little difficulty with it.
  77 (Language historians will also note some vestiges of \fIcsh\fR, Pascal, and
  78 even BASIC-PLUS.)
  79 Expression syntax corresponds quite closely to C expression syntax.
  80 Unlike most Unix utilities,
  81 .I perl
  82 does not arbitrarily limit the size of your data\*(--if you've got
  83 the memory,
  84 .I perl
  85 can slurp in your whole file as a single string.
  86 Recursion is of unlimited depth.
  87 And the hash tables used by associative arrays grow as necessary to prevent
  88 degraded performance.
  89 .I Perl
  90 uses sophisticated pattern matching techniques to scan large amounts of
  91 data very quickly.
  92 Although optimized for scanning text,
  93 .I perl
  94 can also deal with binary data, and can make dbm files look like associative
  95 arrays (where dbm is available).
  96 Setuid
  97 .I perl
  98 scripts are safer than C programs
  99 through a dataflow tracing mechanism which prevents many stupid security holes.
 100 If you have a problem that would ordinarily use \fIsed\fR
 101 or \fIawk\fR or \fIsh\fR, but it
 102 exceeds their capabilities or must run a little faster,
 103 and you don't want to write the silly thing in C, then
 104 .I perl
 105 may be for you.
 106 There are also translators to turn your
 107 .I sed
 108 and
 109 .I awk
 110 scripts into
 111 .I perl
 112 scripts.
 113 OK, enough hype.
 114 .PP
 115 Upon startup,
 116 .I perl
 117 looks for your script in one of the following places:
 118 .Ip 1. 4 2
 119 Specified line by line via
 120 .B \-e
 121 switches on the command line.
 122 .Ip 2. 4 2
 123 Contained in the file specified by the first filename on the command line.
 124 (Note that systems supporting the #! notation invoke interpreters this way.)
 125 .Ip 3. 4 2
 126 Passed in implicitly via standard input.
 127 This only works if there are no filename arguments\*(--to pass
 128 arguments to a
 129 .I stdin
 130 script you must explicitly specify a \- for the script name.
 131 .PP
 132 After locating your script,
 133 .I perl
 134 compiles it to an internal form.
 135 If the script is syntactically correct, it is executed.
 136 .Sh "Options"
 137 Note: on first reading this section may not make much sense to you.  It's here
 138 at the front for easy reference.
 139 .PP
 140 A single-character option may be combined with the following option, if any.
 141 This is particularly useful when invoking a script using the #! construct which
 142 only allows one argument.  Example:
 143 .nf
 144
 145 .ne 2
 146         #!/usr/bin/perl \-spi.bak       # same as \-s \-p \-i.bak
 147         .\|.\|.
 148
 149 .fi
 150 Options include:
 151 .TP 5
 152 .BI \-0 digits
 153 specifies the record separator ($/) as an octal number.
 154 If there are no digits, the null character is the separator.
 155 Other switches may precede or follow the digits.
 156 For example, if you have a version of
 157 .I find
 158 which can print filenames terminated by the null character, you can say this:
 159 .nf
 160
 161     find . \-name '*.bak' \-print0 | perl \-n0e unlink
 162
 163 .fi
 164 The special value 00 will cause Perl to slurp files in paragraph mode.
 165 The value 0777 will cause Perl to slurp files whole since there is no
 166 legal character with that value.
 167 .TP 5
 168 .B \-a
 169 turns on autosplit mode when used with a
 170 .B \-n
 171 or
 172 .BR \-p .
 173 An implicit split command to the @F array
 174 is done as the first thing inside the implicit while loop produced by
 175 the
 176 .B \-n
 177 or
 178 .BR \-p .
 179 .nf
 180
 181         perl \-ane \'print pop(@F), "\en";\'
 182
 183 is equivalent to
 184
 185         while (<>) {
 186                 @F = split(\' \');
 187                 print pop(@F), "\en";
 188         }
 189
 190 .fi
 191 .TP 5
 192 .B \-c
 193 causes
 194 .I perl
 195 to check the syntax of the script and then exit without executing it.
 196 .TP 5
  197 .B \-d
 198 runs the script under the perl debugger.
 199 See the section on Debugging.
 200 .TP 5
 201 .BI \-D number
 202 sets debugging flags.
 203 To watch how it executes your script, use
 204 .BR \-D14 .
 205 (This only works if debugging is compiled into your
 206 .IR perl .)
 207 Another nice value is \-D1024, which lists your compiled syntax tree.
 208 And \-D512 displays compiled regular expressions.
 209 .TP 5
 210 .BI \-e " commandline"
 211 may be used to enter one line of script.
 212 Multiple
 213 .B \-e
 214 commands may be given to build up a multi-line script.
 215 If
 216 .B \-e
 217 is given,
 218 .I perl
 219 will not look for a script filename in the argument list.
 220 .TP 5
 221 .BI \-i extension
 222 specifies that files processed by the <> construct are to be edited
 223 in-place.
 224 It does this by renaming the input file, opening the output file by the
 225 same name, and selecting that output file as the default for print statements.
 226 The extension, if supplied, is added to the name of the
 227 old file to make a backup copy.
 228 If no extension is supplied, no backup is made.
 229 Saying \*(L"perl \-p \-i.bak \-e "s/foo/bar/;" .\|.\|. \*(R" is the same as using
 230 the script:
 231 .nf
 232
 233 .ne 2
 234         #!/usr/bin/perl \-pi.bak
 235         s/foo/bar/;
 236
 237 which is equivalent to
 238
 239 .ne 14
 240         #!/usr/bin/perl
 241         while (<>) {
 242                 if ($ARGV ne $oldargv) {
 243                         rename($ARGV, $ARGV . \'.bak\');
 244                         open(ARGVOUT, ">$ARGV");
 245                         select(ARGVOUT);
 246                         $oldargv = $ARGV;
 247                 }
 248                 s/foo/bar/;
 249         }
 250         continue {
 251             print;      # this prints to original filename
 252         }
 253         select(STDOUT);
 254
 255 .fi
 256 except that the
 257 .B \-i
 258 form doesn't need to compare $ARGV to $oldargv to know when
 259 the filename has changed.
 260 It does, however, use ARGVOUT for the selected filehandle.
 261 Note that
 262 .I STDOUT
 263 is restored as the default output filehandle after the loop.
 264 .Sp
 265 You can use eof to locate the end of each input file, in case you want
 266 to append to each file, or reset line numbering (see example under eof).
 267 .TP 5
 268 .BI \-I directory
 269 may be used in conjunction with
 270 .B \-P
 271 to tell the C preprocessor where to look for include files.
 272 By default /usr/include and /usr/lib/perl are searched.
 273 .TP 5
 274 .BI \-l octnum
 275 enables automatic line-ending processing.  It has two effects:
 276 first, it automatically chops the line terminator when used with
 277 .B \-n
 278 or
  279 .BR \-p ,
 280 and second, it assigns $\e to have the value of
 281 .I octnum
 282 so that any print statements will have that line terminator added back on.  If
 283 .I octnum
 284 is omitted, sets $\e to the current value of $/.
 285 For instance, to trim lines to 80 columns:
 286 .nf
 287
  288         perl \-lpe \'substr($_, 80) = ""\'
 289
 290 .fi
 291 Note that the assignment $\e = $/ is done when the switch is processed,
  292 so the input record separator can be different from the output record
 293 separator if the
 294 .B \-l
 295 switch is followed by a
 296 .B \-0
 297 switch:
 298 .nf
 299
  300         gnufind / \-print0 | perl \-ln0e 'print "found $_" if \-p'
 301
 302 .fi
 303 This sets $\e to newline and then sets $/ to the null character.
 304 .TP 5
 305 .B \-n
 306 causes
 307 .I perl
 308 to assume the following loop around your script, which makes it iterate
 309 over filename arguments somewhat like \*(L"sed \-n\*(R" or \fIawk\fR:
 310 .nf
 311
 312 .ne 3
 313         while (<>) {
 314                 .\|.\|.         # your script goes here
 315         }
 316
 317 .fi
 318 Note that the lines are not printed by default.
 319 See
 320 .B \-p
 321 to have lines printed.
 322 Here is an efficient way to delete all files older than a week:
 323 .nf
 324
 325         find . \-mtime +7 \-print | perl \-nle \'unlink;\'
 326
 327 .fi
 328 This is faster than using the \-exec switch of find because you don't have to
 329 start a process on every filename found.
 330 .TP 5
 331 .B \-p
 332 causes
 333 .I perl
 334 to assume the following loop around your script, which makes it iterate
 335 over filename arguments somewhat like \fIsed\fR:
 336 .nf
 337
 338 .ne 5
 339         while (<>) {
 340                 .\|.\|.         # your script goes here
 341         } continue {
 342                 print;
 343         }
 344
 345 .fi
 346 Note that the lines are printed automatically.
 347 To suppress printing use the
 348 .B \-n
 349 switch.
 350 A
 351 .B \-p
 352 overrides a
 353 .B \-n
 354 switch.
 355 .TP 5
 356 .B \-P
 357 causes your script to be run through the C preprocessor before
 358 compilation by
 359 .IR perl .
 360 (Since both comments and cpp directives begin with the # character,
 361 you should avoid starting comments with any words recognized
 362 by the C preprocessor such as \*(L"if\*(R", \*(L"else\*(R" or \*(L"define\*(R".)
 363 .TP 5
 364 .B \-s
 365 enables some rudimentary switch parsing for switches on the command line
 366 after the script name but before any filename arguments (or before a \-\|\-).
 367 Any switch found there is removed from @ARGV and sets the corresponding variable in the
 368 .I perl
 369 script.
 370 The following script prints \*(L"true\*(R" if and only if the script is
 371 invoked with a \-xyz switch.
 372 .nf
 373
 374 .ne 2
 375         #!/usr/bin/perl \-s
 376         if ($xyz) { print "true\en"; }
 377
 378 .fi
 379 .TP 5
 380 .B \-S
 381 makes
 382 .I perl
 383 use the PATH environment variable to search for the script
 384 (unless the name of the script starts with a slash).
 385 Typically this is used to emulate #! startup on machines that don't
 386 support #!, in the following manner:
 387 .nf
 388
 389         #!/usr/bin/perl
 390         eval "exec /usr/bin/perl \-S $0 $*"
 391                 if $running_under_some_shell;
 392
 393 .fi
 394 The system ignores the first line and feeds the script to /bin/sh,
 395 which proceeds to try to execute the
 396 .I perl
 397 script as a shell script.
 398 The shell executes the second line as a normal shell command, and thus
 399 starts up the
 400 .I perl
 401 interpreter.
 402 On some systems $0 doesn't always contain the full pathname,
 403 so the
 404 .B \-S
 405 tells
 406 .I perl
 407 to search for the script if necessary.
 408 After
 409 .I perl
 410 locates the script, it parses the lines and ignores them because
 411 the variable $running_under_some_shell is never true.
 412 A better construct than $* would be ${1+"$@"}, which handles embedded spaces
 413 and such in the filenames, but doesn't work if the script is being interpreted
 414 by csh.
 415 In order to start up sh rather than csh, some systems may have to replace the
 416 #! line with a line containing just
 417 a colon, which will be politely ignored by perl.
 418 Other systems can't control that, and need a totally devious construct that
 419 will work under any of csh, sh or perl, such as the following:
 420 .nf
 421
 422 .ne 3
  423         eval '(exit $?0)' && eval 'exec /usr/bin/perl \-S $0 ${1+"$@"}'
  424         & eval 'exec /usr/bin/perl \-S $0 $argv:q'
 425                 if 0;
 426
 427 .fi
 428 .TP 5
 429 .B \-u
 430 causes
 431 .I perl
 432 to dump core after compiling your script.
 433 You can then take this core dump and turn it into an executable file
 434 by using the undump program (not supplied).
 435 This speeds startup at the expense of some disk space (which you can
 436 minimize by stripping the executable).
 437 (Still, a "hello world" executable comes out to about 200K on my machine.)
 438 If you are going to run your executable as a set-id program then you
 439 should probably compile it using taintperl rather than normal perl.
 440 If you want to execute a portion of your script before dumping, use the
 441 dump operator instead.
  442 Note: undump is platform specific and may not be available
  443 for a particular port of perl.
 444 .TP 5
 445 .B \-U
 446 allows
 447 .I perl
 448 to do unsafe operations.
 449 Currently the only \*(L"unsafe\*(R" operation is the unlinking of directories while
 450 running as superuser.
 451 .TP 5
 452 .B \-v
 453 prints the version and patchlevel of your
 454 .I perl
 455 executable.
 456 .TP 5
 457 .B \-w
 458 prints warnings about identifiers that are mentioned only once, and scalar
 459 variables that are used before being set.
 460 Also warns about redefined subroutines, and references to undefined
 461 filehandles or filehandles opened readonly that you are attempting to
 462 write on.
 463 Also warns you if you use == on values that don't look like numbers, and if
 464 your subroutines recurse more than 100 deep.
 465 .TP 5
 466 .BI \-x directory
 467 tells
 468 .I perl
 469 that the script is embedded in a message.
 470 Leading garbage will be discarded until the first line that starts
 471 with #! and contains the string "perl".
 472 Any meaningful switches on that line will be applied (but only one
 473 group of switches, as with normal #! processing).
 474 If a directory name is specified, Perl will switch to that directory
 475 before running the script.
 476 The
 477 .B \-x
  478 switch only controls the disposal of leading garbage.
 479 The script must be terminated with __END__ if there is trailing garbage
 480 to be ignored (the script can process any or all of the trailing garbage
 481 via the DATA filehandle if desired).
 482 .Sh "Data Types and Objects"
 483 .PP
 484 .I Perl
 485 has three data types: scalars, arrays of scalars, and
 486 associative arrays of scalars.
 487 Normal arrays are indexed by number, and associative arrays by string.
 488 .PP
 489 The interpretation of operations and values in perl sometimes
 490 depends on the requirements
 491 of the context around the operation or value.
 492 There are three major contexts: string, numeric and array.
 493 Certain operations return array values
 494 in contexts wanting an array, and scalar values otherwise.
 495 (If this is true of an operation it will be mentioned in the documentation
 496 for that operation.)
 497 Operations which return scalars don't care whether the context is looking
 498 for a string or a number, but
 499 scalar variables and values are interpreted as strings or numbers
 500 as appropriate to the context.
 501 A scalar is interpreted as TRUE in the boolean sense if it is not the null
 502 string or 0.
 503 Booleans returned by operators are 1 for true and 0 or \'\' (the null
 504 string) for false.
 505 .PP
 506 There are actually two varieties of null string: defined and undefined.
 507 Undefined null strings are returned when there is no real value for something,
 508 such as when there was an error, or at end of file, or when you refer
 509 to an uninitialized variable or element of an array.
 510 An undefined null string may become defined the first time you access it, but
 511 prior to that you can use the defined() operator to determine whether the
 512 value is defined or not.
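     .PP
     A minimal sketch of the distinction, using a made-up variable name and
     applying defined() directly to a variable that has never been assigned:
     .nf

         $before = defined($count) ? 'defined' : 'undefined';   # 'undefined'
         $count = '';
         $after = defined($count) ? 'defined' : 'undefined';    # 'defined'

     .fi
     Both times $count would print as the null string; only defined() tells
     the two states apart.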
 513 .PP
 514 References to scalar variables always begin with \*(L'$\*(R', even when referring
 515 to a scalar that is part of an array.
 516 Thus:
 517 .nf
 518
 519 .ne 3
 520     $days       \h'|2i'# a simple scalar variable
 521     $days[28]   \h'|2i'# 29th element of array @days
 522     $days{\'Feb\'}\h'|2i'# one value from an associative array
 523     $#days      \h'|2i'# last index of array @days
 524
 525 but entire arrays or array slices are denoted by \*(L'@\*(R':
 526
 527     @days       \h'|2i'# ($days[0], $days[1],\|.\|.\|. $days[n])
 528     @days[3,4,5]\h'|2i'# same as @days[3.\|.5]
 529     @days{'a','c'}\h'|2i'# same as ($days{'a'},$days{'c'})
 530
 531 and entire associative arrays are denoted by \*(L'%\*(R':
 532
 533     %days       \h'|2i'# (key1, val1, key2, val2 .\|.\|.)
 534 .fi
 535 .PP
 536 Any of these eight constructs may serve as an lvalue,
 537 that is, may be assigned to.
 538 (It also turns out that an assignment is itself an lvalue in
 539 certain contexts\*(--see examples under s, tr and chop.)
 540 Assignment to a scalar evaluates the righthand side in a scalar context,
 541 while assignment to an array or array slice evaluates the righthand side
 542 in an array context.
 543 .PP
 544 You may find the length of array @days by evaluating
 545 \*(L"$#days\*(R", as in
 546 .IR csh .
 547 (Actually, it's not the length of the array, it's the subscript of the last element, since there is (ordinarily) a 0th element.)
 548 Assigning to $#days changes the length of the array.
 549 Shortening an array by this method does not actually destroy any values.
 550 Lengthening an array that was previously shortened recovers the values that
 551 were in those elements.
 552 You can also gain some measure of efficiency by preextending an array that
 553 is going to get big.
 554 (You can also extend an array by assigning to an element that is off the
 555 end of the array.
 556 This differs from assigning to $#whatever in that intervening values
 557 are set to null rather than recovered.)
 558 You can truncate an array down to nothing by assigning the null list () to
 559 it.
  560 The following are exactly equivalent:
 561 .nf
 562
 563         @whatever = ();
 564         $#whatever = $[ \- 1;
 565
 566 .fi
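     .PP
     The following sketch, with made-up array names, shows assignment to the
     last subscript and to an element beyond the end; the comments assume $[
     is left at its default of 0:
     .nf

         @list = (1, 2, 3, 4, 5);
         $#list = 2;             # @list now has 3 elements: (1, 2, 3)
         $#big = 999;            # preextend @big to 1000 elements
         $big[1500] = 'x';       # extends @big to 1501 elements
         $length = @big;         # 1501

     .fi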
 567 .PP
 568 If you evaluate an array in a scalar context, it returns the length of
 569 the array.
 570 The following is always true:
 571 .nf
 572
 573         @whatever == $#whatever \- $[ + 1;
 574
 575 .fi
 576 .PP
 577 Multi-dimensional arrays are not directly supported, but see the discussion
 578 of the $; variable later for a means of emulating multiple subscripts with
 579 an associative array.
 580 You could also write a subroutine to turn multiple subscripts into a single
 581 subscript.
 582 .PP
 583 Every data type has its own namespace.
 584 You can, without fear of conflict, use the same name for a scalar variable,
 585 an array, an associative array, a filehandle, a subroutine name, and/or
 586 a label.
 587 Since variable and array references always start with \*(L'$\*(R', \*(L'@\*(R',
 588 or \*(L'%\*(R', the \*(L"reserved\*(R" words aren't in fact reserved
 589 with respect to variable names.
 590 (They ARE reserved with respect to labels and filehandles, however, which
 591 don't have an initial special character.
 592 Hint: you could say open(LOG,\'logfile\') rather than open(log,\'logfile\').
 593 Using uppercase filehandles also improves readability and protects you
 594 from conflict with future reserved words.)
 595 Case IS significant\*(--\*(L"FOO\*(R", \*(L"Foo\*(R" and \*(L"foo\*(R" are all
 596 different names.
 597 Names which start with a letter may also contain digits and underscores.
 598 Names which do not start with a letter are limited to one character,
 599 e.g. \*(L"$%\*(R" or \*(L"$$\*(R".
 600 (Most of the one character names have a predefined significance to
 601 .IR perl .
 602 More later.)
 603 .PP
 604 Numeric literals are specified in any of the usual floating point or
 605 integer formats:
 606 .nf
 607
 608 .ne 5
 609     12345
 610     12345.67
 611     .23E-10
 612     0xffff      # hex
 613     0377        # octal
 614
 615 .fi
 616 String literals are delimited by either single or double quotes.
 617 They work much like shell quotes:
 618 double-quoted string literals are subject to backslash and variable
 619 substitution; single-quoted strings are not (except for \e\' and \e\e).
 620 The usual backslash rules apply for making characters such as newline, tab,
 621 etc., as well as some more exotic forms:
 622 .nf
 623
 624         \et             tab
 625         \en             newline
 626         \er             return
 627         \ef             form feed
 628         \eb             backspace
 629         \ea             alarm (bell)
 630         \ee             escape
 631         \e033           octal char
 632         \ex1b           hex char
 633         \ec[            control char
 634         \el             lowercase next char
 635         \eu             uppercase next char
 636         \eL             lowercase till \eE
 637         \eU             uppercase till \eE
 638         \eE             end case modification
 639
 640 .fi
 641 You can also embed newlines directly in your strings, i.e. they can end on
 642 a different line than they begin.
 643 This is nice, but if you forget your trailing quote, the error will not be
 644 reported until
 645 .I perl
 646 finds another line containing the quote character, which
 647 may be much further on in the script.
 648 Variable substitution inside strings is limited to scalar variables, normal
 649 array values, and array slices.
 650 (In other words, identifiers beginning with $ or @, followed by an optional
 651 bracketed expression as a subscript.)
 652 The following code segment prints out \*(L"The price is $100.\*(R"
 653 .nf
 654
 655 .ne 2
 656     $Price = \'$100\';\h'|3.5i'# not interpreted
 657     print "The price is $Price.\e\|n";\h'|3.5i'# interpreted
 658
 659 .fi
 660 Note that you can put curly brackets around the identifier to delimit it
 661 from following alphanumerics.
 662 Also note that a single quoted string must be separated from a preceding
 663 word by a space, since single quote is a valid character in an identifier
 664 (see Packages).
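     .PP
     A short sketch of the curly-bracket form; the variable names are
     illustrative only:
     .nf

         $fruit = 'apple';
         $plural = "${fruit}s";      # 'apples'
         $oops = "$fruits";          # looks up $fruits instead, which is unset here

     .fi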
 665 .PP
 666 Two special literals are __LINE__ and __FILE__, which represent the current
 667 line number and filename at that point in your program.
 668 They may only be used as separate tokens; they will not be interpolated
 669 into strings.
 670 In addition, the token __END__ may be used to indicate the logical end of the
 671 script before the actual end of file.
 672 Any following text is ignored (but may be read via the DATA filehandle).
 673 The two control characters ^D and ^Z are synonyms for __END__.
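     .PP
     A brief sketch of the two special literals used as separate tokens; the
     variable names are illustrative only:
     .nf

         $where = __FILE__ . ' line ' . __LINE__;    # filename and line number
         $text = "__LINE__";                         # not interpolated

     .fi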
 674 .PP
 675 A word that doesn't have any other interpretation in the grammar will be
 676 treated as if it had single quotes around it.
 677 For this purpose, a word consists only of alphanumeric characters and underline,
 678 and must start with an alphabetic character.
 679 As with filehandles and labels, a bare word that consists entirely of
 680 lowercase letters risks conflict with future reserved words, and if you
 681 use the
 682 .B \-w
 683 switch, Perl will warn you about any such words.
 684 .PP
 685 Array values are interpolated into double-quoted strings by joining all the
 686 elements of the array with the delimiter specified in the $" variable,
 687 space by default.
 688 (Since in versions of perl prior to 3.0 the @ character was not a metacharacter
 689 in double-quoted strings, the interpolation of @array, $array[EXPR],
 690 @array[LIST], $array{EXPR}, or @array{LIST} only happens if array is
 691 referenced elsewhere in the program or is predefined.)
 692 The following are equivalent:
 693 .nf
 694
 695 .ne 4
 696         $temp = join($",@ARGV);
 697         system "echo $temp";
 698
 699         system "echo @ARGV";
 700
 701 .fi
 702 Within search patterns (which also undergo double-quotish substitution)
 703 there is a bad ambiguity:  Is /$foo[bar]/ to be
 704 interpreted as /${foo}[bar]/ (where [bar] is a character class for the
 705 regular expression) or as /${foo[bar]}/ (where [bar] is the subscript to
 706 array @foo)?
 707 If @foo doesn't otherwise exist, then it's obviously a character class.
 708 If @foo exists, perl takes a good guess about [bar], and is almost always right.
 709 If it does guess wrong, or if you're just plain paranoid,
 710 you can force the correct interpretation with curly brackets as above.
 711 .PP
 712 A line-oriented form of quoting is based on the shell here-is syntax.
 713 Following a << you specify a string to terminate the quoted material, and all lines
 714 following the current line down to the terminating string are the value
 715 of the item.
 716 The terminating string may be either an identifier (a word), or some
 717 quoted text.
 718 If quoted, the type of quotes you use determines the treatment of the text,
 719 just as in regular quoting.
 720 An unquoted identifier works like double quotes.
 721 There must be no space between the << and the identifier.
 722 (If you put a space it will be treated as a null identifier, which is
 723 valid, and matches the first blank line\*(--see Merry Christmas example below.)
 724 The terminating string must appear by itself (unquoted and with no surrounding
 725 whitespace) on the terminating line.
 726 .nf
 727
 728         print <<EOF;            # same as above
 729 The price is $Price.
 730 EOF
 731
 732         print <<"EOF";          # same as above
 733 The price is $Price.
 734 EOF
 735
 736         print << x 10;          # null identifier is delimiter
 737 Merry Christmas!
 738
 739         print <<`EOC`;          # execute commands
 740 echo hi there
 741 echo lo there
 742 EOC
 743
 744         print <<foo, <<bar;     # you can stack them
 745 I said foo.
 746 foo
 747 I said bar.
 748 bar
 749
 750 .fi
 751 Array literals are denoted by separating individual values by commas, and
 752 enclosing the list in parentheses:
 753 .nf
 754
 755         (LIST)
 756
 757 .fi
 758 In a context not requiring an array value, the value of the array literal
 759 is the value of the final element, as in the C comma operator.
 760 For example,
 761 .nf
 762
 763 .ne 4
 764     @foo = (\'cc\', \'\-E\', $bar);
 765
 766 assigns the entire array value to array foo, but
 767
 768     $foo = (\'cc\', \'\-E\', $bar);
 769
 770 .fi
 771 assigns the value of variable bar to variable foo.
 772 Note that the value of an actual array in a scalar context is the length
 773 of the array; the following assigns to $foo the value 3:
 774 .nf
 775
 776 .ne 2
 777     @foo = (\'cc\', \'\-E\', $bar);
 778     $foo = @foo;                # $foo gets 3
 779
 780 .fi
 781 You may have an optional comma before the closing parenthesis of an
 782 array literal, so that you can say:
 783 .nf
 784
 785     @foo = (
 786         1,
 787         2,
 788         3,
 789     );
 790
 791 .fi
 792 When a LIST is evaluated, each element of the list is evaluated in
 793 an array context, and the resulting array value is interpolated into LIST
 794 just as if each individual element were a member of LIST.  Thus arrays
 795 lose their identity in a LIST\*(--the list
 796
 797         (@foo,@bar,&SomeSub)
 798
 799 contains all the elements of @foo followed by all the elements of @bar,
 800 followed by all the elements returned by the subroutine named SomeSub.
 801 .PP
 802 A list value may also be subscripted like a normal array.
 803 Examples:
 804 .nf
 805
 806         $time = (stat($file))[8];       # stat returns array value
 807         $digit = ('a','b','c','d','e','f')[$digit-10];
 808         return (pop(@foo),pop(@foo))[0];
 809
 810 .fi
 811 .PP
 812 Array lists may be assigned to if and only if each element of the list
 813 is an lvalue:
 814 .nf
 815
 816     ($a, $b, $c) = (1, 2, 3);
 817
 818     ($map{\'red\'}, $map{\'blue\'}, $map{\'green\'}) = (0x00f, 0x0f0, 0xf00);
 819
 820 The final element may be an array or an associative array:
 821
 822     ($a, $b, @rest) = split;
 823     local($a, $b, %rest) = @_;
 824
 825 .fi
 826 You can actually put an array anywhere in the list, but the first array
 827 in the list will soak up all the values, and anything after it will get
 828 a null value.
 829 This may be useful in a local().
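     .PP
     For instance, in this sketch (variable names are illustrative) the array
     in the middle of the list takes everything that is left:
     .nf

         ($first, @middle, $last) = (1, 2, 3, 4);
         # $first is 1, @middle is (2, 3, 4), $last gets a null value

     .fi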
 830 .PP
 831 An associative array literal contains pairs of values to be interpreted
 832 as a key and a value:
 833 .nf
 834
 835 .ne 2
 836     # same as map assignment above
 837     %map = ('red',0x00f,'blue',0x0f0,'green',0xf00);
 838
 839 .fi
 840 Array assignment in a scalar context returns the number of elements
 841 produced by the expression on the right side of the assignment:
 842 .nf
 843
 844         $x = (($foo,$bar) = (3,2,1));   # set $x to 3, not 2
 845
 846 .fi
 847 .PP
 848 There are several other pseudo-literals that you should know about.
 849 If a string is enclosed by backticks (grave accents), it first undergoes
 850 variable substitution just like a double quoted string.
 851 It is then interpreted as a command, and the output of that command
  852 is the value of the pseudo-literal, as in a shell.
 853 In a scalar context, a single string consisting of all the output is
 854 returned.
 855 In an array context, an array of values is returned, one for each line
 856 of output.
 857 (You can set $/ to use a different line terminator.)
 858 The command is executed each time the pseudo-literal is evaluated.
 859 The status value of the command is returned in $? (see Predefined Names
 860 for the interpretation of $?).
  861 Unlike in \fIcsh\fR, no translation is done on the return
 862 data\*(--newlines remain newlines.
 863 Unlike in any of the shells, single quotes do not hide variable names
 864 in the command from interpretation.
 865 To pass a $ through to the shell you need to hide it with a backslash.
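     .PP
     A minimal sketch of both contexts, assuming a Unix-like system on which the
     shell provides an echo command:
     .nf

         $greeting = `echo hello`;       # scalar context: all output as one string
         chop($greeting);                # remove the trailing newline
         @lines = `echo one; echo two`;  # array context: one element per line
         $status = $?;                   # status of the most recent command

     .fi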
 866 .PP
 867 Evaluating a filehandle in angle brackets yields the next line
 868 from that file (newline included, so it's never false until EOF, at
 869 which time an undefined value is returned).
 870 Ordinarily you must assign that value to a variable,
 871 but there is one situation where an automatic assignment happens.
 872 If (and only if) the input symbol is the only thing inside the conditional of a
 873 .I while
 874 loop, the value is
 875 automatically assigned to the variable \*(L"$_\*(R".
 876 (This may seem like an odd thing to you, but you'll use the construct
 877 in almost every
 878 .I perl
 879 script you write.)
 880 Anyway, the following lines are equivalent to each other:
 881 .nf
 882
 883 .ne 5
 884     while ($_ = <STDIN>) { print; }
 885     while (<STDIN>) { print; }
 886     for (\|;\|<STDIN>;\|) { print; }
 887     print while $_ = <STDIN>;
 888     print while <STDIN>;
 889
 890 .fi
 891 The filehandles
 892 .IR STDIN ,
 893 .I STDOUT
 894 and
 895 .I STDERR
 896 are predefined.
 897 (The filehandles
 898 .IR stdin ,
 899 .I stdout
 900 and
 901 .I stderr
 902 will also work except in packages, where they would be interpreted as
 903 local identifiers rather than global.)
 904 Additional filehandles may be created with the
 905 .I open
 906 function.
 907 .PP
 908 If a <FILEHANDLE> is used in a context that is looking for an array, an array
 909 consisting of all the input lines is returned, one line per array element.
 910 It's easy to make a LARGE data space this way, so use with care.
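.PP
For instance, assuming the whole of standard input fits comfortably in memory
(the variable names are hypothetical):
.nf

.ne 3
        @lines = <STDIN>;               # array context: reads every remaining line
        $count = @lines;                # an array in a scalar context gives its length
        print "Read $count lines\en";

.fi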
 911 .PP
 912 The null filehandle <> is special and can be used to emulate the behavior of
 913 \fIsed\fR and \fIawk\fR.
 914 Input from <> comes either from standard input, or from each file listed on
 915 the command line.
 916 Here's how it works: the first time <> is evaluated, the ARGV array is checked,
 917 and if it is null, $ARGV[0] is set to \'-\', which when opened gives you standard
 918 input.
 919 The ARGV array is then processed as a list of filenames.
 920 The loop
 921 .nf
 922
 923 .ne 3
 924         while (<>) {
 925                 .\|.\|.                 # code for each line
 926         }
 927
 928 .ne 10
 929 is equivalent to
 930
 931         unshift(@ARGV, \'\-\') \|if \|$#ARGV < $[;
 932         while ($ARGV = shift) {
 933                 open(ARGV, $ARGV);
 934                 while (<ARGV>) {
 935                         .\|.\|.         # code for each line
 936                 }
 937         }
 938
 939 .fi
 940 except that it isn't as cumbersome to say.
 941 It really does shift array ARGV and put the current filename into
 942 variable ARGV.
 943 It also uses filehandle ARGV internally.
 944 You can modify @ARGV before the first <> as long as you leave the first
 945 filename at the beginning of the array.
 946 Line numbers ($.) continue as if the input was one big happy file.
 947 (But see example under eof for how to reset line numbers on each file.)
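.PP
A small sketch of how these variables fit together: it prefixes each
line with the name of the file it came from and the running line number
(this assumes the filenames were given on the command line):
.nf

.ne 3
        while (<>) {
                print $ARGV, \':\', $., \':\', $_;
        }

.fi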
 948 .PP
 949 .ne 5
 950 If you want to set @ARGV to your own list of files, go right ahead.
 951 If you want to pass switches into your script, you can
 952 put a loop on the front like this:
 953 .nf
 954
 955 .ne 10
 956         while ($_ = $ARGV[0], /\|^\-/\|) {
 957                 shift;
 958                 last if /\|^\-\|\-$\|/\|;
 959                 /\|^\-D\|(.*\|)/ \|&& \|($debug = $1);
 960                 /\|^\-v\|/ \|&& \|$verbose++;
 961                 .\|.\|.         # other switches
 962         }
 963         while (<>) {
 964                 .\|.\|.         # code for each line
 965         }
 966
 967 .fi
 968 The <> symbol will return FALSE only once.
 969 If you call it again after this it will assume you are processing another
 970 @ARGV list, and if you haven't set @ARGV, will input from
 971 .IR STDIN .
 972 .PP
 973 If the string inside the angle brackets is a reference to a scalar variable
 974 (e.g. <$foo>),
 975 then that variable contains the name of the filehandle to input from.
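.PP
For example (the variable name $fh is merely illustrative):
.nf

.ne 3
        $fh = \'STDIN\';
        $line = <$fh>;          # reads the next line from STDIN
        print $line;

.fi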
 976 .PP
 977 If the string inside angle brackets is not a filehandle, it is interpreted
 978 as a filename pattern to be globbed, and either an array of filenames or the
 979 next filename in the list is returned, depending on context.
 980 One level of $ interpretation is done first, but you can't say <$foo>
 981 because that's an indirect filehandle as explained in the previous
 982 paragraph.
 983 You could insert curly brackets to force interpretation as a
 984 filename glob: <${foo}>.
 985 Example:
 986 .nf
 987
 988 .ne 3
 989         while (<*.c>) {
 990                 chmod 0644, $_;
 991         }
 992
 993 is equivalent to
 994
 995 .ne 5
 996         open(foo, "echo *.c | tr \-s \' \et\er\ef\' \'\e\e012\e\e012\e\e012\e\e012\'|");
 997         while (<foo>) {
 998                 chop;
 999                 chmod 0644, $_;
1000         }
1001
1002 .fi
1003 In fact, it's currently implemented that way.
1004 (Which means it will not work on filenames with spaces in them unless
1005 you have /bin/csh on your machine.)
1006 Of course, the shortest way to do the above is:
1007 .nf
1008
1009         chmod 0644, <*.c>;
1010
1011 .fi
1012 .Sh "Syntax"
1013 .PP
1014 A
1015 .I perl
1016 script consists of a sequence of declarations and commands.
1017 The only things that need to be declared in
1018 .I perl
1019 are report formats and subroutines.
1020 See the sections below for more information on those declarations.
1021 All uninitialized user-created objects are assumed to
1022 start with a null or 0 value until they
1023 are defined by some explicit operation such as assignment.
1024 The sequence of commands is executed just once, unlike in
1025 .I sed
1026 and
1027 .I awk
1028 scripts, where the sequence of commands is executed for each input line.
1029 While this means that you must explicitly loop over the lines of your input file
1030 (or files), it also means you have much more control over which files and which
1031 lines you look at.
1032 (Actually, I'm lying\*(--it is possible to do an implicit loop with either the
1033 .B \-n
1034 or
1035 .B \-p
1036 switch.)
1037 .PP
1038 A declaration can be put anywhere a command can, but has no effect on the
1039 execution of the primary sequence of commands\*(--declarations all take effect
1040 at compile time.
1041 Typically all the declarations are put at the beginning or the end of the script.
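.PP
As a sketch of what this means in practice, a subroutine may be called
before the point where its declaration appears in the script
(the subroutine name
.I greet
is hypothetical):
.nf

.ne 6
        &greet(\'world\');               # works: the declaration is already compiled

        sub greet {
                local($name) = @_;
                print "Hello, $name\en";
        }

.fi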
1042 .PP
1043 .I Perl
1044 is, for the most part, a free-form language.
1045 (The only exception to this is format declarations, for fairly obvious reasons.)
1046 Comments are indicated by the # character, and extend to the end of the line.
1047 If you attempt to use /* */ C comments, they will be interpreted either as
1048 division or pattern matching, depending on the context.
1049 So don't do that.
1050 .Sh "Compound statements"
1051 In
1052 .IR perl ,
1053 a sequence of commands may be treated as one command by enclosing it
1054 in curly brackets.
1055 We will call this a BLOCK.
1056 .PP
1057 The following compound commands may be used to control flow:
1058 .nf
1059
1060 .ne 4
1061         if (EXPR) BLOCK
1062         if (EXPR) BLOCK else BLOCK
1063         if (EXPR) BLOCK elsif (EXPR) BLOCK .\|.\|. else BLOCK
1064         LABEL while (EXPR) BLOCK
1065         LABEL while (EXPR) BLOCK continue BLOCK
1066         LABEL for (EXPR; EXPR; EXPR) BLOCK
1067         LABEL foreach VAR (ARRAY) BLOCK
1068         LABEL BLOCK continue BLOCK
1069
1070 .fi
1071 Note that, unlike C and Pascal, these are defined in terms of BLOCKs, not
1072 statements.
1073 This means that the curly brackets are \fIrequired\fR\*(--no dangling statements allowed.
1074 If you want to write conditionals without curly brackets there are several
1075 other ways to do it.
1076 The following all do the same thing:
1077 .nf
1078
1079 .ne 5
1080         if (!open(foo)) { die "Can't open $foo: $!"; }
1081         die "Can't open $foo: $!" unless open(foo);
1082         open(foo) || die "Can't open $foo: $!"; # foo or bust!
1083         open(foo) ? \'hi mom\' : die "Can't open $foo: $!";
1084                                 # a bit exotic, that last one
1085
1086 .fi
1087 .PP
1088 The
1089 .I if
1090 statement is straightforward.
1091 Since BLOCKs are always bounded by curly brackets, there is never any
1092 ambiguity about which
1093 .I if
1094 an
1095 .I else
1096 goes with.
1097 If you use
1098 .I unless
1099 in place of
1100 .IR if ,
1101 the sense of the test is reversed.
1102 .PP
1103 The
1104 .I while
1105 statement executes the block as long as the expression is true
1106 (does not evaluate to the null string or 0).
1107 The LABEL is optional, and if present, consists of an identifier followed by
1108 a colon.
1109 The LABEL identifies the loop for the loop control statements
1110 .IR next ,
1111 .IR last ,
1112 and
1113 .I redo
1114 (see below).
1115 If there is a
1116 .I continue
1117 BLOCK, it is always executed just before
1118 the conditional is about to be evaluated again, similarly to the third part
1119 of a
1120 .I for
1121 loop in C.
1122 Thus it can be used to increment a loop variable, even when the loop has
1123 been continued via the
1124 .I next
1125 statement (similar to the C \*(L"continue\*(R" statement).
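.PP
A minimal sketch of the interaction between
.I next
and the
.I continue
BLOCK (the variable $i is arbitrary):
.nf

.ne 7
        $i = 0;
        while ($i < 10) {
                next if $i % 2;         # odd: skip the print, not the continue BLOCK
                print "$i\en";
        } continue {
                $i++;                   # still runs after next, so the loop ends
        }

.fi
This prints the even numbers from 0 through 8.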
1126 .PP
1127 If the word
1128 .I while
1129 is replaced by the word
1130 .IR until ,
1131 the sense of the test is reversed, but the conditional is still tested before
1132 the first iteration.
1133 .PP
1134 In either the
1135 .I if
1136 or the
1137 .I while
1138 statement, you may replace \*(L"(EXPR)\*(R" with a BLOCK, and the conditional
1139 is true if the value of the last command in that block is true.
1140 .PP
1141 The
1142 .I for
1143 loop works exactly like the corresponding
1144 .I while
1145 loop:
1146 .nf
1147
1148 .ne 12
1149         for ($i = 1; $i < 10; $i++) {
1150                 .\|.\|.
1151         }
1152
1153 is the same as
1154
1155         $i = 1;
1156         while ($i < 10) {
1157                 .\|.\|.
1158         } continue {
1159                 $i++;
1160         }
1161 .fi
1162 .PP
1163 The foreach loop iterates over a normal array value and sets the variable
1164 VAR to be each element of the array in turn.
1165 The variable is implicitly local to the loop, and regains its former value
1166 upon exiting the loop.
1167 The \*(L"foreach\*(R" keyword is actually identical to the \*(L"for\*(R" keyword,
1168 so you can use \*(L"foreach\*(R" for readability or \*(L"for\*(R" for brevity.
1169 If VAR is omitted, $_ is set to each value.
1170 If ARRAY is an actual array (as opposed to an expression returning an array
1171 value), you can modify each element of the array
1172 by modifying VAR inside the loop.
1173 Examples:
1174 .nf
1175
1176 .ne 5
1177         for (@ary) { s/foo/bar/; }
1178
1179         foreach $elem (@elements) {
1180                 $elem *= 2;
1181         }
1182
1183 .ne 3
1184         for ((10,9,8,7,6,5,4,3,2,1,\'BOOM\')) {
1185                 print $_, "\en"; sleep(1);
1186         }
1187
1188         for (1..15) { print "Merry Christmas\en"; }
1189
1190 .ne 3
1191         foreach $item (split(/:[\e\e\en:]*/, $ENV{\'TERMCAP\'})) {
1192                 print "Item: $item\en";
1193         }
1194
1195 .fi
1196 .PP
1197 The BLOCK by itself (labeled or not) is equivalent to a loop that executes
1198 once.
1199 Thus you can use any of the loop control statements in it to leave or
1200 restart the block.
1201 The
1202 .I continue
1203 block is optional.
1204 This construct is particularly nice for doing case structures.
1205 .nf
1206
1207 .ne 6
1208         foo: {
1209                 if (/^abc/) { $abc = 1; last foo; }
1210                 if (/^def/) { $def = 1; last foo; }
1211                 if (/^xyz/) { $xyz = 1; last foo; }
1212                 $nothing = 1;
1213         }
1214
1215 .fi
1216 There is no official switch statement in perl, because there
1217 are already several ways to write the equivalent.
1218 In addition to the above, you could write
1219 .nf
1220
1221 .ne 6
1222         foo: {
1223                 $abc = 1, last foo  if /^abc/;
1224                 $def = 1, last foo  if /^def/;
1225                 $xyz = 1, last foo  if /^xyz/;
1226                 $nothing = 1;
1227         }
1228
1229 or
1230
1231 .ne 6
1232         foo: {
1233                 /^abc/ && do { $abc = 1; last foo; };
1234                 /^def/ && do { $def = 1; last foo; };
1235                 /^xyz/ && do { $xyz = 1; last foo; };
1236                 $nothing = 1;
1237         }
1238
1239 or
1240
1241 .ne 6
1242         foo: {
1243                 /^abc/ && ($abc = 1, last foo);
1244                 /^def/ && ($def = 1, last foo);
1245                 /^xyz/ && ($xyz = 1, last foo);
1246                 $nothing = 1;
1247         }
1248
1249 or even
1250
1251 .ne 8
1252         if (/^abc/)
1253                 { $abc = 1; }
1254         elsif (/^def/)
1255                 { $def = 1; }
1256         elsif (/^xyz/)
1257                 { $xyz = 1; }
1258         else
1259                 {$nothing = 1;}
1260
1261 .fi
1262 As it happens, these are all optimized internally to a switch structure,
1263 so perl jumps directly to the desired statement, and you needn't worry
1264 about perl executing a lot of unnecessary statements when you have a string
1265 of 50 elsifs, as long as you are testing the same simple scalar variable
1266 using ==, eq, or pattern matching as above.
1267 (If you're curious as to whether the optimizer has done this for a particular
1268 case statement, you can use the \-D1024 switch to list the syntax tree
1269 before execution.)
1270 .Sh "Simple statements"
1271 The only kind of simple statement is an expression evaluated for its side
1272 effects.
1273 Every expression (simple statement) must be terminated with a semicolon.
1274 Note that this is like C, but unlike Pascal (and
1275 .IR awk ).
1276 .PP
1277 Any simple statement may optionally be followed by a
1278 single modifier, just before the terminating semicolon.
1279 The possible modifiers are:
1280 .nf
1281
1282 .ne 4
1283         if EXPR
1284         unless EXPR
1285         while EXPR
1286         until EXPR
1287
1288 .fi
1289 The
1290 .I if
1291 and
1292 .I unless
1293 modifiers have the expected semantics.
1294 The
1295 .I while
1296 and
1297 .I until
1298 modifiers also have the expected semantics (conditional evaluated first),
1299 except when applied to a do-BLOCK or a do-SUBROUTINE command,
1300 in which case the block executes once before the conditional is evaluated.
1301 This is so that you can write loops like:
1302 .nf
1303
1304 .ne 4
1305         do {
1306                 $_ = <STDIN>;
1307                 .\|.\|.
1308         } until $_ \|eq \|".\|\e\|n";
1309
1310 .fi
1311 (See the
1312 .I do
1313 operator below.  Note also that the loop control commands described later will
1314 NOT work in this construct, since modifiers don't take loop labels.
1315 Sorry.)
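.PP
A few sketches of the modifiers in use (all variable names are illustrative):
.nf

.ne 5
        print "Debugging enabled\en" if $debug;
        print "Running quietly\en" unless $verbose;
        $n = 1;
        $n *= 2 until $n > 1000;        # leaves $n at 1024
        print "$n\en";

.fi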
1316 .Sh "Expressions"
1317 Since
1318 .I perl
1319 expressions work almost exactly like C expressions, only the differences
1320 will be mentioned here.
1321 .PP
1322 Here's what
1323 .I perl
1324 has that C doesn't:
1325 .Ip ** 8 2
1326 The exponentiation operator.
1327 .Ip **= 8
1328 The exponentiation assignment operator.
1329 .Ip (\|) 8 3
1330 The null list, used to initialize an array to null.
1331 .Ip . 8
1332 Concatenation of two strings.
1333 .Ip .= 8
1334 The concatenation assignment operator.
1335 .Ip eq 8
1336 String equality (== is numeric equality).
1337 For a mnemonic just think of \*(L"eq\*(R" as a string.
1338 (If you are used to the
1339 .I awk
1340 behavior of using == for either string or numeric equality
1341 based on the current form of the comparands, beware!
1342 You must be explicit here.)
1343 .Ip ne 8
1344 String inequality (!= is numeric inequality).
1345 .Ip lt 8
1346 String less than.
1347 .Ip gt 8
1348 String greater than.
1349 .Ip le 8
1350 String less than or equal.
1351 .Ip ge 8
1352 String greater than or equal.
1353 .Ip cmp 8
1354 String comparison, returning -1, 0, or 1.
1355 .Ip <=> 8
1356 Numeric comparison, returning -1, 0, or 1.
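.Sp
These two operators are handy in sort subroutines.
A sketch, assuming hypothetical subroutine and array names:
.nf

.ne 6
        sub bynumber { $a <=> $b; }
        sub byname { $a cmp $b; }

        @list = (10, 9, 100);
        @nums = sort bynumber @list;    # 9, 10, 100
        @strs = sort byname @list;      # 10, 100, 9

.fi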
1357 .Ip =~ 8 2
1358 Certain operations search or modify the string \*(L"$_\*(R" by default.
1359 This operator makes that kind of operation work on some other string.
1360 The right argument is a search pattern, substitution, or translation.
1361 The left argument is what is supposed to be searched, substituted, or
1362 translated instead of the default \*(L"$_\*(R".
1363 The return value indicates the success of the operation.
1364 (If the right argument is an expression other than a search pattern,
1365 substitution, or translation, it is interpreted as a search pattern
1366 at run time.
1367 This is less efficient than an explicit search, since the pattern must
1368 be compiled every time the expression is evaluated.)
1369 The precedence of this operator is lower than unary minus and autoincrement/decrement, but higher than everything else.
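.Sp
A sketch of each of the three forms (the variable names are hypothetical):
.nf

.ne 3
        print "yes\en" if $answer =~ /^y/i;     # search $answer rather than $_
        $line =~ s/\es+$//;                     # strip trailing white space from $line
        ($upper = $name) =~ tr/a\-z/A\-Z/;      # copy $name, then upcase the copy

.fi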
1370 .Ip !~ 8
1371 Just like =~ except the return value is negated.
1372 .Ip x 8
1373 The repetition operator.
1374 Returns a string consisting of the left operand repeated the
1375 number of times specified by the right operand.
1376 In an array context, if the left operand is a list in parens, it repeats
1377 the list.
1378 .nf
1379
1380         print \'\-\' x 80;              # print row of dashes
1381         print \'\-\' x80;               # illegal, x80 is identifier
1382
1383         print "\et" x ($tab/8), \' \' x ($tab%8);       # tab over
1384
1385         @ones = (1) x 80;               # an array of 80 1's
1386         @ones = (5) x @ones;            # set all elements to 5
1387
1388 .fi
1389 .Ip x= 8
1390 The repetition assignment operator.
1391 Only works on scalars.
1392 .Ip .\|. 8
1393 The range operator, which is really two different operators depending
1394 on the context.
1395 In an array context, returns an array of values counting (by ones)
1396 from the left value to the right value.
1397 This is useful for writing \*(L"for (1..10)\*(R" loops and for doing
1398 slice operations on arrays.
1399 .Sp
1400 In a scalar context, .\|. returns a boolean value.
1401 The operator is bistable, like a flip-flop.
1402 Each .\|. operator maintains its own boolean state.
1403 It is false as long as its left operand is false.
1404 Once the left operand is true, the range operator stays true
1405 until the right operand is true,
1406 AFTER which the range operator becomes false again.
1407 (It doesn't become false till the next time the range operator is evaluated.
1408 It can become false on the same evaluation it became true, but it still returns
1409 true once.)
1410 The right operand is not evaluated while the operator is in the \*(L"false\*(R" state,
1411 and the left operand is not evaluated while the operator is in the \*(L"true\*(R" state.
1412 The scalar .\|. operator is primarily intended for doing line number ranges
1413 after
1414 the fashion of \fIsed\fR or \fIawk\fR.
1415 The precedence is a little lower than || and &&.
1416 The value returned is either the null string for false, or a sequence number
1417 (beginning with 1) for true.
1418 The sequence number is reset for each range encountered.
1419 The final sequence number in a range has the string \'E0\' appended to it, which
1420 doesn't affect its numeric value, but gives you something to search for if you
1421 want to exclude the endpoint.
1422 You can exclude the beginning point by waiting for the sequence number to be
1423 greater than 1.
1424 If either operand of scalar .\|. is static, that operand is implicitly compared
1425 to the $. variable, the current line number.
1426 Examples:
1427 .nf
1428
1429 .ne 6
1430 As a scalar operator:
1431     if (101 .\|. 200) { print; }        # print 2nd hundred lines
1432
1433     next line if (1 .\|. /^$/); # skip header lines
1434
1435     s/^/> / if (/^$/ .\|. eof());       # quote body
1436
1437 .ne 4
1438 As an array operator:
1439     for (101 .\|. 200) { print; }       # print $_ 100 times
1440
1441     @foo = @foo[$[ .\|. $#foo]; # an expensive no-op
1442     @foo = @foo[$#foo-4 .\|. $#foo];    # slice last 5 items
1443
1444 .fi
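.Sp
As a sketch of the sequence number in use, the following prints the lines
strictly between two marker lines, excluding both endpoints
(the BEGIN and END markers and the variable $seq are hypothetical):
.nf

.ne 7
    while (<>) {
        if ($seq = (/^BEGIN/ .\|. /^END/)) {
            next if $seq == 1;          # first line of the range
            next if $seq =~ /E0$/;      # last line of the range
            print;
        }
    }

.fi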
1445 .Ip \-x 8
1446 A file test.
1447 This unary operator takes one argument, either a filename or a filehandle,
1448 and tests the associated file to see if something is true about it.
1449 If the argument is omitted, tests $_, except for \-t, which tests
1450 .IR STDIN .
1451 It returns 1 for true and \'\' for false, or the undefined value if the
1452 file doesn't exist.
1453 Precedence is higher than logical and relational operators, but lower than
1454 arithmetic operators.
1455 The operator may be any of:
1456 .nf
1457         \-r     File is readable by effective uid.
1458         \-w     File is writable by effective uid.
1459         \-x     File is executable by effective uid.
1460         \-o     File is owned by effective uid.
1461         \-R     File is readable by real uid.
1462         \-W     File is writable by real uid.
1463         \-X     File is executable by real uid.
1464         \-O     File is owned by real uid.
1465         \-e     File exists.
1466         \-z     File has zero size.
1467         \-s     File has non-zero size (returns size).
1468         \-f     File is a plain file.
1469         \-d     File is a directory.
1470         \-l     File is a symbolic link.
1471         \-p     File is a named pipe (FIFO).
1472         \-S     File is a socket.
1473         \-b     File is a block special file.
1474         \-c     File is a character special file.
1475         \-u     File has setuid bit set.
1476         \-g     File has setgid bit set.
1477         \-k     File has sticky bit set.
1478         \-t     Filehandle is opened to a tty.
1479         \-T     File is a text file.
1480         \-B     File is a binary file (opposite of \-T).
1481         \-M     Age of file in days when script started.
1482         \-A     Same for access time.
1483         \-C     Same for inode change time.
1484
1485 .fi
1486 The interpretation of the file permission operators \-r, \-R, \-w, \-W, \-x and \-X
1487 is based solely on the mode of the file and the uids and gids of the user.
1488 There may be other reasons you can't actually read, write or execute the file.
1489 Also note that, for the superuser, \-r, \-R, \-w and \-W always return 1, and
1490 \-x and \-X return 1 if any execute bit is set in the mode.
1491 Scripts run by the superuser may thus need to do a stat() in order to determine
1492 the actual mode of the file, or temporarily set the uid to something else.
1493 .Sp
1494 Example:
1495 .nf
1496 .ne 7
1497
1498         while (<>) {
1499                 chop;
1500                 next unless \-f $_;     # ignore specials
1501                 .\|.\|.
1502         }
1503
1504 .fi
1505 Note that \-s/a/b/ does not do a negated substitution.
1506 Saying \-exp($foo) still works as expected, however\*(--only single letters
1507 following a minus are interpreted as file tests.
1508 .Sp
1509 The \-T and \-B switches work as follows.
1510 The first block or so of the file is examined for odd characters such as
1511 strange control codes or metacharacters.
1512 If too many odd characters (>10%) are found, it's a \-B file, otherwise it's a \-T file.
1513 Also, any file containing null in the first block is considered a binary file.
1514 If \-T or \-B is used on a filehandle, the current stdio buffer is examined
1515 rather than the first block.
1516 Both \-T and \-B return TRUE on a null file, or a file at EOF when testing
1517 a filehandle.
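.Sp
A sketch combining several tests, assuming the script is given a list of
filenames as arguments (the variable names are illustrative):
.nf

.ne 5
        foreach $file (@ARGV) {
                next unless \-f $file;          # plain files only
                $kind = (\-T $file) ? \'text\' : \'binary\';
                print "$file: $kind, ", (\-s $file) + 0, " bytes\en";
        }

.fi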
1518 .PP
1519 If any of the file tests (or either stat operator) are given the special
1520 filehandle consisting of a solitary underline, then the stat structure
1521 of the previous file test (or stat operator) is used, saving a system
1522 call.
1523 (This doesn't work with \-t, and you need to remember that lstat and \-l
1524 will leave values in the stat structure for the symbolic link, not the
1525 real file.)
1526 Example:
1527 .nf
1528
1529         print "Can do.\en" if -r $a || -w _ || -x _;
1530
1531 .ne 9
1532         stat($filename);
1533         print "Readable\en" if -r _;
1534         print "Writable\en" if -w _;
1535         print "Executable\en" if -x _;
1536         print "Setuid\en" if -u _;
1537         print "Setgid\en" if -g _;
1538         print "Sticky\en" if -k _;
1539         print "Text\en" if -T _;
1540         print "Binary\en" if -B _;
1541
1542 .fi
1543 .PP
1544 Here is what C has that
1545 .I perl
1546 doesn't:
1547 .Ip "unary &" 12
1548 Address-of operator.
1549 .Ip "unary *" 12
1550 Dereference-address operator.
1551 .Ip "(TYPE)" 12
1552 Type casting operator.
1553 .PP
1554 Like C,
1555 .I perl
1556 does a certain amount of expression evaluation at compile time, whenever
1557 it determines that all of the arguments to an operator are static and have
1558 no side effects.
1559 In particular, string concatenation happens at compile time between literals that don't do variable substitution.
1560 Backslash interpretation also happens at compile time.
1561 You can say
1562 .nf
1563
1564 .ne 2
1565         \'Now is the time for all\' . "\|\e\|n" .
1566         \'good men to come to.\'
1567
1568 .fi
1569 and this all reduces to one string internally.
1570 .PP
1571 The autoincrement operator has a little extra built-in magic to it.
1572 If you increment a variable that is numeric, or that has ever been used in
1573 a numeric context, you get a normal increment.
1574 If, however, the variable has only been used in string contexts since it
1575 was set, and has a value that is not null and matches the
1576 pattern /^[a\-zA\-Z]*[0\-9]*$/, the increment is done
1577 as a string, preserving each character within its range, with carry:
1578 .nf
1579
1580         print ++($foo = \'99\');        # prints \*(L'100\*(R'
1581         print ++($foo = \'a0\');        # prints \*(L'a1\*(R'
1582         print ++($foo = \'Az\');        # prints \*(L'Ba\*(R'
1583         print ++($foo = \'zz\');        # prints \*(L'aaa\*(R'
1584
1585 .fi
1586 The autodecrement is not magical.
1587 .PP
1588 The range operator (in an array context) makes use of the magical
1589 autoincrement algorithm if the minimum and maximum are strings.
1590 You can say
1591
1592         @alphabet = (\'A\' .. \'Z\');
1593
1594 to get all the letters of the alphabet, or
1595
1596         $hexdigit = (0 .. 9, \'a\' .. \'f\')[$num & 15];
1597
1598 to get a hexadecimal digit, or
1599
1600         @z2 = (\'01\' .. \'31\');  print @z2[$mday];
1601
1602 to get dates with leading zeros.
1603 (If the final value specified is not in the sequence that the magical increment
1604 would produce, the sequence goes until the next value would be longer than
1605 the final value specified.)
1606 .PP
1607 The || and && operators differ from C's in that, rather than returning 0 or 1,
1608 they return the last value evaluated.
1609 Thus, a portable way to find out the home directory might be:
1610 .nf
1611
1612         $home = $ENV{'HOME'} || $ENV{'LOGDIR'} ||
1613             (getpwuid($<))[7] || die "You're homeless!\en";
1614
1615 .fi
1616 .PP
1617 Along with the literals and variables mentioned earlier,
1618 the operations in the following section can serve as terms in an expression.
1619 Some of these operations take a LIST as an argument.
1620 Such a list can consist of any combination of scalar arguments or array values;
1621 the array values will be included in the list as if each individual element were
1622 interpolated at that point in the list, forming a longer single-dimensional
1623 array value.
1624 Elements of the LIST should be separated by commas.
1625 If an operation is listed both with and without parentheses around its
1626 arguments, it means you can either use it as a unary operator or
1627 as a function call.
1628 To use it as a function call, the next token on the same line must
1629 be a left parenthesis.
1630 (There may be intervening white space.)
1631 Such a function then has highest precedence, as you would expect from
1632 a function.
1633 If any token other than a left parenthesis follows, then it is a
1634 unary operator, with a precedence depending only on whether it is a LIST
1635 operator or not.
1636 LIST operators have lowest precedence.
1637 All other unary operators have a precedence greater than relational operators
1638 but less than arithmetic operators.
1639 See the section on Precedence.
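.PP
The distinction matters in practice.
A sketch of the consequences (the variables and messages are hypothetical):
.nf

.ne 4
        print (1+2)*3;                  # function call: prints 3
        print ((1+2)*3);                # prints 9
        chdir $dir || die "Can't cd";   # unary operator: (chdir $dir) || die
        unlink $tmp || die "No file";   # LIST operator: unlink($tmp || die)

.fi
In the last line the die is reached only if $tmp itself is false, not when
the unlink fails; say unlink($tmp) || die to test the result of the unlink.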
1640 .Ip "/PATTERN/" 8 4
1641 See m/PATTERN/.
1642 .Ip "?PATTERN?" 8 4
1643 This is just like the /pattern/ search, except that it matches only once between
1644 calls to the
1645 .I reset
1646 operator.
1647 This is a useful optimization when you only want to see the first occurrence of
1648 something in each file of a set of files, for instance.
1649 Only ?? patterns local to the current package are reset.
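.Sp
A sketch that prints the first Subject line of each file named on the
command line (the pattern is only an example):
.nf

.ne 4
        while (<>) {
                print "$ARGV: $_" if ?^Subject:?;
                reset if eof;           # let ?? match again in the next file
        }

.fi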
1650 .Ip "accept(NEWSOCKET,GENERICSOCKET)" 8 2
1651 Does the same thing that the accept system call does.
1652 Returns true if it succeeded, false otherwise.
1653 See example in section on Interprocess Communication.
1654 .Ip "alarm(SECONDS)" 8 4
1655 .Ip "alarm SECONDS" 8
1656 Arranges to have a SIGALRM delivered to this process after the specified number
1657 of seconds (minus 1, actually) have elapsed.  Thus, alarm(15) will cause
1658 a SIGALRM at some point more than 14 seconds in the future.
1659 Only one timer may be counting at once.  Each call disables the previous
1660 timer, and an argument of 0 may be supplied to cancel the previous timer
1661 without starting a new one.
1662 The returned value is the amount of time remaining on the previous timer.
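.Sp
A sketch of a timed prompt, assuming a handler is installed through the
%SIG array (see Predefined Names); the handler name is hypothetical:
.nf

.ne 7
        sub timed_out { die "Timed out waiting for input\en"; }

        $SIG{\'ALRM\'} = \'timed_out\';
        print "Continue? ";
        alarm(15);
        $answer = <STDIN>;
        alarm(0);                       # cancel the timer once input arrives

.fi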
1663 .Ip "atan2(Y,X)" 8 2
1664 Returns the arctangent of Y/X in the range
1665 .if t \-\(*p to \(*p.
1666 .if n \-PI to PI.
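.Sp
Since the arctangent of 1 is a quarter of
.if t \(*p,
.if n PI,
a common idiom (sketched here with an arbitrary variable name) is:
.nf

        $pi = atan2(1,1) * 4;

.fi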
1667 .Ip "bind(SOCKET,NAME)" 8 2
1668 Does the same thing that the bind system call does.
1669 Returns true if it succeeded, false otherwise.
1670 NAME should be a packed address of the proper type for the socket.
1671 See example in section on Interprocess Communication.
1672 .Ip "binmode(FILEHANDLE)" 8 4
1673 .Ip "binmode FILEHANDLE" 8 4
1674 Arranges for the file to be read in \*(L"binary\*(R" mode in operating systems
1675 that distinguish between binary and text files.
1676 Files that are not read in binary mode have CR LF sequences translated
1677 to LF on input and LF translated to CR LF on output.
1678 Binmode has no effect under Unix.
1679 If FILEHANDLE is an expression, the value is taken as the name of
1680 the filehandle.
1681 .Ip "caller(EXPR)"
1682 .Ip "caller"
1683 Returns the context of the current subroutine call:
1684 .nf
1685
1686         ($package,$filename,$line) = caller;
1687
1688 .fi
1689 With EXPR, returns some extra information that the debugger uses to print
1690 a stack trace.  The value of EXPR indicates how many call frames to go
1691 back before the current one.
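.Sp
A sketch of a subroutine that reports where it was called from
(the subroutine name is hypothetical):
.nf

.ne 6
        sub whence {
                local($package, $filename, $line) = caller;
                print "Called from $filename line $line\en";
        }

        &whence;

.fi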
1692 .Ip "chdir(EXPR)" 8 2
1693 .Ip "chdir EXPR" 8 2
1694 Changes the working directory to EXPR, if possible.
1695 If EXPR is omitted, changes to the home directory.
1696 Returns 1 upon success, 0 otherwise.
1697 See example under
1698 .IR die .
1699 .Ip "chmod(LIST)" 8 2
1700 .Ip "chmod LIST" 8 2
1701 Changes the permissions of a list of files.
1702 The first element of the list must be the numerical mode.
1703 Returns the number of files successfully changed.
1704 .nf
1705
1706 .ne 2
1707         $cnt = chmod 0755, \'foo\', \'bar\';
1708         chmod 0755, @executables;
1709
1710 .fi
1711 .Ip "chop(LIST)" 8 7
1712 .Ip "chop(VARIABLE)" 8
1713 .Ip "chop VARIABLE" 8
1714 .Ip "chop" 8
1715 Chops off the last character of a string and returns the character chopped.
1716 It's used primarily to remove the newline from the end of an input record,
1717 but is much more efficient than s/\en// because it neither scans nor copies
1718 the string.
1719 If VARIABLE is omitted, chops $_.
1720 Example:
1721 .nf
1722
1723 .ne 5
1724         while (<>) {
1725                 chop;   # avoid \en on last field
1726                 @array = split(/:/);
1727                 .\|.\|.
1728         }
1729
1730 .fi
1731 You can actually chop anything that's an lvalue, including an assignment:
1732 .nf
1733
1734         chop($cwd = \`pwd\`);
1735         chop($answer = <STDIN>);
1736
1737 .fi
1738 If you chop a list, each element is chopped.
1739 Only the value of the last chop is returned.
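.Sp
For example, to strip the newline from every line of input at once
(a sketch; the array name is arbitrary, and it assumes every line,
including the last, ends in a newline):
.nf

        chop(@lines = <STDIN>);

.fi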
1740 .Ip "chown(LIST)" 8 2
1741 .Ip "chown LIST" 8 2
1742 Changes the owner (and group) of a list of files.
1743 The first two elements of the list must be the NUMERICAL uid and gid,
1744 in that order.
1745 Returns the number of files successfully changed.
1746 .nf
1747
1748 .ne 2
1749         $cnt = chown $uid, $gid, \'foo\', \'bar\';
1750         chown $uid, $gid, @filenames;
1751
1752 .fi
1753 .ne 23
1754 Here's an example that looks up non-numeric uids in the passwd file:
1755 .nf
1756
1757         print "User: ";
1758         $user = <STDIN>;
1759         chop($user);
1760         print "Files: ";
1761         $pattern = <STDIN>;
1762         chop($pattern);
1763 .ie t \{\
1764         open(pass, \'/etc/passwd\') || die "Can't open passwd: $!\en";
1765 'br\}
1766 .el \{\
1767         open(pass, \'/etc/passwd\')
1768                 || die "Can't open passwd: $!\en";
1769 'br\}
1770         while (<pass>) {
1771                 ($login,$pass,$uid,$gid) = split(/:/);
1772                 $uid{$login} = $uid;
1773                 $gid{$login} = $gid;
1774         }
1775         @ary = <${pattern}>;    # get filenames
1776         if ($uid{$user} eq \'\') {
1777                 die "$user not in passwd file";
1778         }
1779         else {
1780                 chown $uid{$user}, $gid{$user}, @ary;
1781         }
1782
1783 .fi
1784 .Ip "chroot(FILENAME)" 8 5
1785 .Ip "chroot FILENAME" 8
1786 Does the same as the system call of that name.
1787 If you don't know what it does, don't worry about it.
1788 If FILENAME is omitted, does chroot to $_.
1789 .Ip "close(FILEHANDLE)" 8 5
1790 .Ip "close FILEHANDLE" 8
1791 Closes the file or pipe associated with the file handle.
1792 You don't have to close FILEHANDLE if you are immediately going to
1793 do another open on it, since open will close it for you.
1794 (See
1795 .IR open .)
1796 However, an explicit close on an input file resets the line counter ($.), while
1797 the implicit close done by
1798 .I open
1799 does not.
1800 Also, closing a pipe will wait for the process executing on the pipe to complete,
1801 in case you want to look at the output of the pipe afterwards.
1802 Closing a pipe explicitly also puts the status value of the command into $?.
1803 Example:
1804 .nf
1805
1806 .ne 4
1807         open(OUTPUT, \'|sort >foo\');   # pipe to sort
1808         .\|.\|. # print stuff to output
1809         close OUTPUT;           # wait for sort to finish
1810         open(INPUT, \'foo\');   # get sort's results
1811
1812 .fi
1813 FILEHANDLE may be an expression whose value gives the real filehandle name.
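.Sp
A hedged sketch of picking up the exit status after reading from a pipe.
It assumes a Unix-like system with /bin/sh available; the command string and
the variable names are only illustrative.
.nf

.ne 5
        open(CMD, "sh -c 'echo hello; exit 3' |")
                || die "Can't start command: $!";
        @lines = <CMD>;         # read everything the command wrote
        close(CMD);             # waits for the command, then sets $?
        $status = $? >> 8;      # exit value of the command, 3 here

.fi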
1814 .Ip "closedir(DIRHANDLE)" 8 5
1815 .Ip "closedir DIRHANDLE" 8
1816 Closes a directory opened by opendir().
1817 .Ip "connect(SOCKET,NAME)" 8 2
1818 Does the same thing that the connect system call does.
1819 Returns true if it succeeded, false otherwise.
1820 NAME should be a packed address of the proper type for the socket.
1821 See example in section on Interprocess Communication.
1822 .Ip "cos(EXPR)" 8 6
1823 .Ip "cos EXPR" 8 6
1824 Returns the cosine of EXPR (expressed in radians).
1825 If EXPR is omitted, takes the cosine of $_.
1826 .Ip "crypt(PLAINTEXT,SALT)" 8 6
1827 Encrypts a string exactly like the crypt() function in the C library.
1828 Useful for checking the password file for lousy passwords.
1829 Only the guys wearing white hats should do this.
1830 .Ip "dbmclose(ASSOC_ARRAY)" 8 6
1831 .Ip "dbmclose ASSOC_ARRAY" 8
1832 Breaks the binding between a dbm file and an associative array.
1833 The values remaining in the associative array are meaningless unless
1834 you happen to want to know what was in the cache for the dbm file.
1835 This function is only useful if you have ndbm.
1836 .Ip "dbmopen(ASSOC,DBNAME,MODE)" 8 6
1837 This binds a dbm or ndbm file to an associative array.
1838 ASSOC is the name of the associative array.
1839 (Unlike normal open, the first argument is NOT a filehandle, even though
1840 it looks like one).
1841 DBNAME is the name of the database (without the .dir or .pag extension).
1842 If the database does not exist, it is created with protection specified
1843 by MODE (as modified by the umask).
1844 If your system only supports the older dbm functions, you may only have one
1845 dbmopen in your program.
1846 If your system has neither dbm nor ndbm, calling dbmopen produces a fatal
1847 error.
1848 .Sp
1849 Values assigned to the associative array prior to the dbmopen are lost.
1850 A certain number of values from the dbm file are cached in memory.
1851 By default this number is 64, but you can increase it by preallocating
1852 that number of garbage entries in the associative array before the dbmopen.
1853 You can flush the cache if necessary with the reset command.
1854 .Sp
1855 If you don't have write access to the dbm file, you can only read
1856 associative array variables, not set them.
1857 If you want to test whether you can write, either use file tests or
1858 try setting a dummy array entry inside an eval, which will trap the error.
1859 .Sp
1860 Note that functions such as keys() and values() may return huge array values
1861 when used on large dbm files.
1862 You may prefer to use the each() function to iterate over large dbm files.
1863 Example:
1864 .nf
1865
1866 .ne 6
1867         # print out history file offsets
1868         dbmopen(HIST,'/usr/lib/news/history',0666);
1869         while (($key,$val) = each %HIST) {
1870                 print $key, ' = ', unpack('L',$val), "\en";
1871         }
1872         dbmclose(HIST);
1873
1874 .fi
1875 .Ip "defined(EXPR)" 8 6
1876 .Ip "defined EXPR" 8
1877 Returns a boolean value saying whether the lvalue EXPR has a real value
1878 or not.
1879 Many operations return the undefined value under exceptional conditions,
1880 such as end of file, uninitialized variable, system error and such.
1881 This function allows you to distinguish between an undefined null string
1882 and a defined null string with operations that might return a real null
1883 string, in particular referencing elements of an array.
1884 You may also check to see if arrays or subroutines exist.
1885 Use on predefined variables is not guaranteed to produce intuitive results.
1886 Examples:
1887 .nf
1888
1889 .ne 7
1890         print if defined $switch{'D'};
1891         print "$val\en" while defined($val = pop(@ary));
1892         die "Can't readlink $sym: $!"
1893                 unless defined($value = readlink $sym);
1894         eval '@foo = ()' if defined(@foo);
1895         die "No XYZ package defined" unless defined %_XYZ;
1896         sub foo { defined &bar ? &bar(@_) : die "No bar"; }
1897
1898 .fi
1899 See also undef.
1900 .Ip "delete $ASSOC{KEY}" 8 6
1901 Deletes the specified value from the specified associative array.
1902 Returns the deleted value, or the undefined value if nothing was deleted.
1903 Deleting from $ENV{} modifies the environment.
1904 Deleting from an array bound to a dbm file deletes the entry from the dbm
1905 file.
1906 .Sp
1907 The following deletes all the values of an associative array:
1908 .nf
1909
1910 .ne 3
1911         foreach $key (keys %ARRAY) {
1912                 delete $ARRAY{$key};
1913         }
1914
1915 .fi
1916 (But it would be faster to use the
1917 .I reset
1918 command.
1919 Saying undef %ARRAY is faster yet.)
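.Sp
A small illustration of the return value (the array and its keys are made up):
.nf

.ne 3
        %color = ('apple', 'red', 'lime', 'green');
        $was = delete $color{'apple'};  # 'red'; the entry is gone
        $none = delete $color{'plum'};  # undefined; nothing to delete

.fi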
1920 .Ip "die(LIST)" 8
1921 .Ip "die LIST" 8
1922 Outside of an eval, prints the value of LIST to
1923 .I STDERR
1924 and exits with the current value of $!
1925 (errno).
1926 If $! is 0, exits with the value of ($? >> 8) (\`command\` status).
1927 If ($? >> 8) is 0, exits with 255.
1928 Inside an eval, the error message is stuffed into $@ and the eval is terminated
1929 with the undefined value.
1930 .Sp
1931 Equivalent examples:
1932 .nf
1933
1934 .ne 3
1935 .ie t \{\
1936         die "Can't cd to spool: $!\en" unless chdir \'/usr/spool/news\';
1937 'br\}
1938 .el \{\
1939         die "Can't cd to spool: $!\en"
1940                 unless chdir \'/usr/spool/news\';
1941 'br\}
1942
1943         chdir \'/usr/spool/news\' || die "Can't cd to spool: $!\en";
1944
1945 .fi
1946 .Sp
1947 If the last element of LIST does not end in a newline, the current script line
1948 number and input line number (if any) are also printed, and a newline is
1949 supplied.
1950 Hint: sometimes appending \*(L", stopped\*(R" to your message will cause it to make
1951 better sense when the string \*(L"at foo line 123\*(R" is appended.
1952 Suppose you are running script \*(L"canasta\*(R".
1953 .nf
1954
1955 .ne 7
1956         die "/etc/games is no good";
1957         die "/etc/games is no good, stopped";
1958
1959 produce, respectively
1960
1961         /etc/games is no good at canasta line 123.
1962         /etc/games is no good, stopped at canasta line 123.
1963
1964 .fi
1965 See also
1966 .IR exit .
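.Sp
As a sketch of the eval case described above, the die below is trapped rather
than ending the program (the message text is arbitrary):
.nf

.ne 4
        $result = eval 'die "out of cheese, stopped"; 1;';
        # $result is undefined, and $@ holds the message, which begins
        # "out of cheese, stopped at" followed by the location
        $failed = ($@ ne '');

.fi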
1967 .Ip "do BLOCK" 8 4
1968 Returns the value of the last command in the sequence of commands indicated
1969 by BLOCK.
1970 When modified by a loop modifier, executes the BLOCK once before testing the
1971 loop condition.
1972 (On other statements the loop modifiers test the conditional first.)
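.Sp
A brief sketch of the difference (variable names are illustrative):
.nf

.ne 7
        $n = 10;
        $ran = 0;
        do {
                $ran++;
        } while ($n < 5);               # body runs once, then the test fails
        $skipped = 0;
        $skipped++ while ($n < 5);      # test comes first, so this never runs

.fi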
1973 .Ip "do SUBROUTINE (LIST)" 8 3
1974 Executes a SUBROUTINE declared by a
1975 .I sub
1976 declaration, and returns the value
1977 of the last expression evaluated in SUBROUTINE.
1978 If there is no subroutine by that name, produces a fatal error.
1979 (You may use the \*(L"defined\*(R" operator to determine if a subroutine
1980 exists.)
1981 If you pass arrays as part of LIST you may wish to pass the length
1982 of the array in front of each array.
1983 (See the section on subroutines later on.)
1984 SUBROUTINE may be a scalar variable, in which case the variable contains
1985 the name of the subroutine to execute.
1986 The parentheses are required to avoid confusion with the \*(L"do EXPR\*(R"
1987 form.
1988 .Sp
1989 As an alternate form, you may call a subroutine by prefixing the name with
1990 an ampersand: &foo(@args).
1991 If you aren't passing any arguments, you don't have to use parentheses.
1992 If you omit the parentheses, no @_ array is passed to the subroutine.
1993 The & form is also used to specify subroutines to the defined and undef
1994 operators.
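.Sp
A short sketch of the ampersand form, including a call through a scalar
holding the subroutine name (the subroutine add and the variables are
invented for the example):
.nf

.ne 5
        sub add { local($x, $y) = @_; $x + $y; }

        $sum = &add(2, 3);      # 5
        $name = 'add';
        $also = &$name(4, 5);   # 9, the name is taken from $name

.fi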
1995 .Ip "do EXPR" 8 3
1996 Uses the value of EXPR as a filename and executes the contents of the file
1997 as a
1998 .I perl
1999 script.
2000 Its primary use is to include subroutines from a
2001 .I perl
2002 subroutine library.
2003 .nf
2004
2005         do \'stat.pl\';
2006
2007 is just like
2008
2009         eval \`cat stat.pl\`;
2010
2011 .fi
2012 except that it's more efficient, more concise, keeps track of the current
2013 filename for error messages, and searches all the
2014 .B \-I
2015 libraries if the file
2016 isn't in the current directory (see also the @INC array in Predefined Names).
2017 It's the same, however, in that it does reparse the file every time you
2018 call it, so if you are going to use the file inside a loop you might prefer
2019 to use \-P and #include, at the expense of a little more startup time.
2020 (The main problem with #include is that cpp doesn't grok # comments\*(--a
2021 workaround is to use \*(L";#\*(R" for standalone comments.)
2022 Note that the following are NOT equivalent:
2023 .nf
2024
2025 .ne 2
2026         do $foo;        # eval a file
2027         do $foo();      # call a subroutine
2028
2029 .fi
2030 Note that inclusion of library routines is better done with
2031 the \*(L"require\*(R" operator.
2032 .Ip "dump LABEL" 8 6
2033 This causes an immediate core dump.
2034 Primarily this is so that you can use the undump program to turn your
2035 core dump into an executable binary after having initialized all your
2036 variables at the beginning of the program.
2037 When the new binary is executed it will begin by executing a "goto LABEL"
2038 (with all the restrictions that goto suffers).
2039 Think of it as a goto with an intervening core dump and reincarnation.
2040 If LABEL is omitted, restarts the program from the top.
2041 WARNING: any files opened at the time of the dump will NOT be open any more
2042 when the program is reincarnated, with possible resulting confusion on the part
2043 of perl.
2044 See also \-u.
2045 .Sp
2046 Example:
2047 .nf
2048
2049 .ne 16
2050         #!/usr/bin/perl
2051         require 'getopt.pl';
2052         require 'stat.pl';
2053         %days = (
2054             'Sun',1,
2055             'Mon',2,
2056             'Tue',3,
2057             'Wed',4,
2058             'Thu',5,
2059             'Fri',6,
2060             'Sat',7);
2061
2062         dump QUICKSTART if $ARGV[0] eq '-d';
2063
2064     QUICKSTART:
2065         do Getopt('f');
2066
2067 .fi
2068 .Ip "each(ASSOC_ARRAY)" 8 6
2069 .Ip "each ASSOC_ARRAY" 8
2070 Returns a 2 element array consisting of the key and value for the next
2071 value of an associative array, so that you can iterate over it.
2072 Entries are returned in an apparently random order.
2073 When the array is entirely read, a null array is returned (which when
2074 assigned produces a FALSE (0) value).
2075 The next call to each() after that will start iterating again.
2076 The iterator can be reset only by reading all the elements from the array.
2077 You must not modify the array while iterating over it.
2078 There is a single iterator for each associative array, shared by all
2079 each(), keys() and values() function calls in the program.
2080 The following prints out your environment like the printenv program, only
2081 in a different order:
2082 .nf
2083
2084 .ne 3
2085         while (($key,$value) = each %ENV) {
2086                 print "$key=$value\en";
2087         }
2088
2089 .fi
2090 See also keys() and values().
2091 .Ip "eof(FILEHANDLE)" 8 8
2092 .Ip "eof()" 8
2093 .Ip "eof" 8
2094 Returns 1 if the next read on FILEHANDLE will return end of file, or if
2095 FILEHANDLE is not open.
2096 FILEHANDLE may be an expression whose value gives the real filehandle name.
2097 (Note that this function actually reads a character and then ungetc's it,
2098 so it is not very useful in an interactive context.)
2099 An eof without an argument returns the eof status for the last file read.
2100 Empty parentheses () may be used to indicate the pseudo file formed of the
2101 files listed on the command line, i.e. eof() is reasonable to use inside
2102 a while (<>) loop to detect the end of only the last file.
2103 Use eof(ARGV) or eof without the parentheses to test EACH file in a while (<>) loop.
2104 Examples:
2105 .nf
2106
2107 .ne 7
2108         # insert dashes just before last line of last file
2109         while (<>) {
2110                 if (eof()) {
2111                         print "\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\en";
2112                 }
2113                 print;
2114         }
2115
2116 .ne 7
2117         # reset line numbering on each input file
2118         while (<>) {
2119                 print "$.\et$_";
2120                 if (eof) {      # Not eof().
2121                         close(ARGV);
2122                 }
2123         }
2124
2125 .fi
2126 .Ip "eval(EXPR)" 8 6
2127 .Ip "eval EXPR" 8 6
2128 EXPR is parsed and executed as if it were a little
2129 .I perl
2130 program.
2131 It is executed in the context of the current
2132 .I perl
2133 program, so that
2134 any variable settings, subroutine or format definitions remain afterwards.
2135 The value returned is the value of the last expression evaluated, just
2136 as with subroutines.
2137 If there is a syntax error or runtime error, or a die statement is
2138 executed, an undefined value is returned by
2139 eval, and $@ is set to the error message.
2140 If there was no error, $@ is guaranteed to be a null string.
2141 If EXPR is omitted, evaluates $_.
2142 The final semicolon, if any, may be omitted from the expression.
2143 .Sp
2144 Note that, since eval traps otherwise-fatal errors, it is useful for
2145 determining whether a particular feature
2146 (such as dbmopen or symlink) is implemented.
2147 It is also Perl's exception trapping mechanism, where the die operator is
2148 used to raise exceptions.
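.Sp
One way to sketch such a feature test, using symlink as the feature in
question (the variable name is arbitrary, and the empty arguments are there
only so that nothing is created):
.nf

.ne 2
        eval 'symlink("", "");';
        $have_symlink = ($@ eq '') ? 1 : 0;  # 0 if eval trapped a fatal error

.fi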
2149 .Ip "exec(LIST)" 8 8
2150 .Ip "exec LIST" 8 6
2151 If there is more than one argument in LIST, or if LIST is an array with
2152 more than one value,
2153 calls execvp() with the arguments in LIST.
2154 If there is only one scalar argument, the argument is checked for shell metacharacters.
2155 If there are any, the entire argument is passed to \*(L"/bin/sh \-c\*(R" for parsing.
2156 If there are none, the argument is split into words and passed directly to
2157 execvp(), which is more efficient.
2158 Note: exec (and system) do not flush your output buffer, so you may need to
2159 set $| to avoid lost output.
2160 Examples:
2161 .nf
2162
2163         exec \'/bin/echo\', \'Your arguments are: \', @ARGV;
2164         exec "sort $outfile | uniq";
2165
2166 .fi
2167 .Sp
2168 If you don't really want to execute the first argument, but want to lie
2169 to the program you are executing about its own name, you can specify
2170 the program you actually want to run by assigning that to a variable and
2171 putting the name of the variable in front of the LIST without a comma.
2172 (This always forces interpretation of the LIST as a multi-valued list, even
2173 if there is only a single scalar in the list.)
2174 Example:
2175 .nf
2176
2177 .ne 2
2178         $shell = '/bin/csh';
2179         exec $shell '-sh';              # pretend it's a login shell
2180
2181 .fi
2182 .Ip "exit(EXPR)" 8 6
2183 .Ip "exit EXPR" 8
2184 Evaluates EXPR and exits immediately with that value.
2185 Example:
2186 .nf
2187
2188 .ne 2
2189         $ans = <STDIN>;
2190         exit 0 \|if \|$ans \|=~ \|/\|^[Xx]\|/\|;
2191
2192 .fi
2193 See also
2194 .IR die .
2195 If EXPR is omitted, exits with 0 status.
2196 .Ip "exp(EXPR)" 8 3
2197 .Ip "exp EXPR" 8
2198 Returns
2199 .I e
2200 to the power of EXPR.
2201 If EXPR is omitted, gives exp($_).
2202 .Ip "fcntl(FILEHANDLE,FUNCTION,SCALAR)" 8 4
2203 Implements the fcntl(2) function.
2204 You'll probably have to say
2205 .nf
2206
2207         require "fcntl.ph";     # probably /usr/local/lib/perl/fcntl.ph
2208
2209 .fi
2210 first to get the correct function definitions.
2211 If fcntl.ph doesn't exist or doesn't have the correct definitions
2212 you'll have to roll
2213 your own, based on your C header files such as <sys/fcntl.h>.
2214 (There is a perl script called h2ph that comes with the perl kit
2215 which may help you in this.)
2216 Argument processing and value return work just like ioctl below.
2217 Note that fcntl will produce a fatal error if used on a machine that doesn't implement
2218 fcntl(2).
2219 .Ip "fileno(FILEHANDLE)" 8 4
2220 .Ip "fileno FILEHANDLE" 8 4
2221 Returns the file descriptor for a filehandle.
2222 Useful for constructing bitmaps for select().
2223 If FILEHANDLE is an expression, the value is taken as the name of
2224 the filehandle.
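.Sp
A minimal sketch of building such a bitmap with vec (the variable $rin is
just a conventional name, not anything required):
.nf

.ne 2
        $rin = '';
        vec($rin, fileno(STDIN), 1) = 1;        # set the bit for STDIN

.fi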
2225 .Ip "flock(FILEHANDLE,OPERATION)" 8 4
2226 Calls flock(2) on FILEHANDLE.
2227 See manual page for flock(2) for definition of OPERATION.
2228 Returns true for success, false on failure.
2229 Will produce a fatal error if used on a machine that doesn't implement
2230 flock(2).
2231 Here's a mailbox appender for BSD systems.
2232 .nf
2233
2234 .ne 20
2235         $LOCK_SH = 1;
2236         $LOCK_EX = 2;
2237         $LOCK_NB = 4;
2238         $LOCK_UN = 8;
2239
2240         sub lock {
2241             flock(MBOX,$LOCK_EX);
2242             # and, in case someone appended
2243             # while we were waiting...
2244             seek(MBOX, 0, 2);
2245         }
2246
2247         sub unlock {
2248             flock(MBOX,$LOCK_UN);
2249         }
2250
2251         open(MBOX, ">>/usr/spool/mail/$ENV{'USER'}")
2252                 || die "Can't open mailbox: $!";
2253
2254         do lock();
2255         print MBOX $msg,"\en\en";
2256         do unlock();
2257
2258 .fi
2259 .Ip "fork" 8 4
2260 Does a fork() call.
2261 Returns the child pid to the parent process and 0 to the child process.
2262 Note: unflushed buffers remain unflushed in both processes, which means
2263 you may need to set $| to avoid duplicate output.
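.Sp
A hedged sketch of the usual pattern.
Treating an undefined return value as failure is an assumption drawn from
later perl documentation, and the exit value 7 is arbitrary.
.nf

.ne 11
        $| = 1;                         # avoid duplicated buffered output
        if ($pid = fork) {
                wait;                   # parent: wait for the child
                $status = $? >> 8;      # 7, the child's exit value
        }
        elsif (defined $pid) {
                exit 7;                 # child: $pid is 0 here
        }
        else {
                die "Can't fork: $!";
        }

.fi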
2264 .Ip "getc(FILEHANDLE)" 8 4
2265 .Ip "getc FILEHANDLE" 8
2266 .Ip "getc" 8
2267 Returns the next character from the input file attached to FILEHANDLE, or
2268 a null string at EOF.
2269 If FILEHANDLE is omitted, reads from STDIN.
2270 .Ip "getlogin" 8 3
2271 Returns the current login from /etc/utmp, if any.
2272 If null, use getpwuid.
2273
2274         $login = getlogin || (getpwuid($<))[0] || "Somebody";
2275
2276 .Ip "getpeername(SOCKET)" 8 3
2277 Returns the packed sockaddr address of the other end of the SOCKET connection.
2278 .nf
2279
2280 .ne 4
2281         # An internet sockaddr
2282         $sockaddr = 'S n a4 x8';
2283         $hersockaddr = getpeername(S);
2284 .ie t \{\
2285         ($family, $port, $heraddr) = unpack($sockaddr,$hersockaddr);
2286 'br\}
2287 .el \{\
2288         ($family, $port, $heraddr) =
2289                         unpack($sockaddr,$hersockaddr);
2290 'br\}
2291
2292 .fi
2293 .Ip "getpgrp(PID)" 8 4
2294 .Ip "getpgrp PID" 8
2295 Returns the current process group for the specified PID, 0 for the current
2296 process.
2297 Will produce a fatal error if used on a machine that doesn't implement
2298 getpgrp(2).
2299 If PID is omitted, returns the process group of the current process.
2300 .Ip "getppid" 8 4
2301 Returns the process id of the parent process.
2302 .Ip "getpriority(WHICH,WHO)" 8 4
2303 Returns the current priority for a process, a process group, or a user.
2304 (See getpriority(2).)
2305 Will produce a fatal error if used on a machine that doesn't implement
2306 getpriority(2).
2307 .Ip "getpwnam(NAME)" 8
2308 .Ip "getgrnam(NAME)" 8
2309 .Ip "gethostbyname(NAME)" 8
2310 .Ip "getnetbyname(NAME)" 8
2311 .Ip "getprotobyname(NAME)" 8
2312 .Ip "getpwuid(UID)" 8
2313 .Ip "getgrgid(GID)" 8
2314 .Ip "getservbyname(NAME,PROTO)" 8
2315 .Ip "gethostbyaddr(ADDR,ADDRTYPE)" 8
2316 .Ip "getnetbyaddr(ADDR,ADDRTYPE)" 8
2317 .Ip "getprotobynumber(NUMBER)" 8
2318 .Ip "getservbyport(PORT,PROTO)" 8
2319 .Ip "getpwent" 8
2320 .Ip "getgrent" 8
2321 .Ip "gethostent" 8
2322 .Ip "getnetent" 8
2323 .Ip "getprotoent" 8
2324 .Ip "getservent" 8
2325 .Ip "setpwent" 8
2326 .Ip "setgrent" 8
2327 .Ip "sethostent(STAYOPEN)" 8
2328 .Ip "setnetent(STAYOPEN)" 8
2329 .Ip "setprotoent(STAYOPEN)" 8
2330 .Ip "setservent(STAYOPEN)" 8
2331 .Ip "endpwent" 8
2332 .Ip "endgrent" 8
2333 .Ip "endhostent" 8
2334 .Ip "endnetent" 8
2335 .Ip "endprotoent" 8
2336 .Ip "endservent" 8
2337 These routines perform the same functions as their counterparts in the
2338 system library.
2339 The return values from the various get routines are as follows:
2340 .nf
2341
2342         ($name,$passwd,$uid,$gid,
2343            $quota,$comment,$gcos,$dir,$shell) = getpw.\|.\|.
2344         ($name,$passwd,$gid,$members) = getgr.\|.\|.
2345         ($name,$aliases,$addrtype,$length,@addrs) = gethost.\|.\|.
2346         ($name,$aliases,$addrtype,$net) = getnet.\|.\|.
2347         ($name,$aliases,$proto) = getproto.\|.\|.
2348         ($name,$aliases,$port,$proto) = getserv.\|.\|.
2349
2350 .fi
2351 The $members value returned by getgr.\|.\|. is a space separated list
2352 of the login names of the members of the group.
2353 .Sp
2354 The @addrs value returned by the gethost.\|.\|. functions is a list of the
2355 raw addresses returned by the corresponding system library call.
2356 In the Internet domain, each address is four bytes long and you can unpack
2357 it by saying something like:
2358 .nf
2359
2360         ($a,$b,$c,$d) = unpack('C4',$addrs[0]);
2361
2362 .fi
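.Sp
To go one step further and print an address in the familiar dotted form, a
sketch like the following may help.
The packed value here is a stand-in for an element of @addrs, so that the
example does not depend on any name lookup.
.nf

.ne 3
        $addr = pack('C4', 127, 0, 0, 1);       # stand-in for $addrs[0]
        ($a,$b,$c,$d) = unpack('C4', $addr);
        $dotted = join('.', $a, $b, $c, $d);    # "127.0.0.1"

.fi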
2363 .Ip "getsockname(SOCKET)" 8 3
2364 Returns the packed sockaddr address of this end of the SOCKET connection.
2365 .nf
2366
2367 .ne 4
2368         # An internet sockaddr
2369         $sockaddr = 'S n a4 x8';
2370         $mysockaddr = getsockname(S);
2371 .ie t \{\
2372         ($family, $port, $myaddr) = unpack($sockaddr,$mysockaddr);
2373 'br\}
2374 .el \{\
2375         ($family, $port, $myaddr) =
2376                         unpack($sockaddr,$mysockaddr);
2377 'br\}
2378
2379 .fi
2380 .Ip "getsockopt(SOCKET,LEVEL,OPTNAME)" 8 3
2381 Returns the socket option requested, or undefined if there is an error.
2382 .Ip "gmtime(EXPR)" 8 4
2383 .Ip "gmtime EXPR" 8
2384 Converts a time as returned by the time function to a 9-element array with
2385 the time analyzed for the Greenwich timezone.
2386 Typically used as follows:
2387 .nf
2388
2389 .ne 3
2390 .ie t \{\
2391     ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime(time);
2392 'br\}
2393 .el \{\
2394     ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
2395                                                 gmtime(time);
2396 'br\}
2397
2398 .fi
2399 All array elements are numeric, and come straight out of a struct tm.
2400 In particular this means that $mon has the range 0.\|.11 and $wday has the
2401 range 0.\|.6.
2402 If EXPR is omitted, does gmtime(time).
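.Sp
Since $mon counts from 0 and $year, as a struct tm field, counts from 1900,
a printable date needs small adjustments.
A sketch, using time 0 so that the result is predictable (the names @dayname
and $stamp are illustrative):
.nf

.ne 5
        ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime(0);
        @dayname = ('Sun','Mon','Tue','Wed','Thu','Fri','Sat');
        $stamp = sprintf("%s %04d-%02d-%02d",
                $dayname[$wday], $year + 1900, $mon + 1, $mday);
        # $stamp is "Thu 1970-01-01"

.fi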
2403 .Ip "goto LABEL" 8 6
2404 Finds the statement labeled with LABEL and resumes execution there.
2405 Currently you may only go to statements in the main body of the program
2406 that are not nested inside a do {} construct.
2407 This statement is not implemented very efficiently, and is here only to make
2408 the
2409 .IR sed -to- perl
2410 translator easier.
2411 I may change its semantics at any time, consistent with support for translated
2412 .I sed
2413 scripts.
2414 Use it at your own risk.
2415 Better yet, don't use it at all.
2416 .Ip "grep(EXPR,LIST)" 8 4
2417 Evaluates EXPR for each element of LIST (locally setting $_ to each element)
2418 and returns the array value consisting of those elements for which the
2419 expression evaluated to true.
2420 In a scalar context, returns the number of times the expression was true.
2421 .nf
2422
2423         @foo = grep(!/^#/, @bar);    # weed out comments
2424
2425 .fi
2426 Note that, since $_ is a reference into the array value, it can be
2427 used to modify the elements of the array.
2428 While this is useful and supported, it can cause bizarre results if
2429 the LIST is not a named array.
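.Sp
A short sketch of both points, the scalar count and modification through $_
(the array contents are made up):
.nf

.ne 4
        @words = ('one', 'two', 'three');
        $n = grep(/t/, @words);         # scalar context: 2 elements match
        grep(s/^t/T/, @words);          # $_ aliases each element
        # @words is now ('one', 'Two', 'Three')

.fi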
2430 .Ip "hex(EXPR)" 8 4
2431 .Ip "hex EXPR" 8
2432 Returns the decimal value of EXPR interpreted as a hex string.
2433 (To interpret strings that might start with 0 or 0x see oct().)
2434 If EXPR is omitted, uses $_.
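.Sp
For instance (a small sketch; the values are arbitrary):
.nf

.ne 3
        $n = hex('1f');         # 31
        $m = oct('0x1f');       # 31, oct() understands the 0x prefix
        $o = oct('017');        # 15, a leading 0 means octal

.fi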
2435 .Ip "index(STR,SUBSTR,POSITION)" 8 4
2436 .Ip "index(STR,SUBSTR)" 8 4
2437 Returns the position of the first occurrence of SUBSTR in STR at or after
2438 POSITION.
2439 If POSITION is omitted, starts searching from the beginning of the string.
2440 The return value is based at 0, or whatever you've
2441 set the $[ variable to.
2442 If the substring is not found, returns one less than the base, ordinarily \-1.
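.Sp
A brief sketch, assuming $[ has been left at 0 (the string is arbitrary):
.nf

.ne 4
        $str = 'banana';
        $first = index($str, 'an');             # 1
        $next = index($str, 'an', $first + 1);  # 3
        $none = index($str, 'xy');              # -1, not found

.fi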
2443 .Ip "int(EXPR)" 8 4
2444 .Ip "int EXPR" 8
2445 Returns the integer portion of EXPR.
2446 If EXPR is omitted, uses $_.
2447 .Ip "ioctl(FILEHANDLE,FUNCTION,SCALAR)" 8 4
2448 Implements the ioctl(2) function.
2449 You'll probably have to say
2450 .nf
2451
2452         require "ioctl.ph";     # probably /usr/local/lib/perl/ioctl.ph
2453
2454 .fi
2455 first to get the correct function definitions.
2456 If ioctl.ph doesn't exist or doesn't have the correct definitions
2457 you'll have to roll
2458 your own, based on your C header files such as <sys/ioctl.h>.
2459 (There is a perl script called h2ph that comes with the perl kit
2460 which may help you in this.)
2461 SCALAR will be read and/or written depending on the FUNCTION\*(--a pointer
2462 to the string value of SCALAR will be passed as the third argument of
2463 the actual ioctl call.
2464 (If SCALAR has no string value but does have a numeric value, that value
2465 will be passed rather than a pointer to the string value.
2466 To guarantee this to be true, add a 0 to the scalar before using it.)
2467 The pack() and unpack() functions are useful for manipulating the values
2468 of structures used by ioctl().
2469 The following example sets the erase character to DEL.
2470 .nf
2471
2472 .ne 9
2473         require 'ioctl.ph';
2474         $sgttyb_t = "ccccs";            # 4 chars and a short
2475         if (ioctl(STDIN,$TIOCGETP,$sgttyb)) {
2476                 @ary = unpack($sgttyb_t,$sgttyb);
2477                 $ary[2] = 127;
2478                 $sgttyb = pack($sgttyb_t,@ary);
2479                 ioctl(STDIN,$TIOCSETP,$sgttyb)
2480                         || die "Can't ioctl: $!";
2481         }
2482
2483 .fi
2484 The return value of ioctl (and fcntl) is as follows:
2485 .nf
2486
2487 .ne 4
2488         if OS returns:\h'|3i'perl returns:
2489           -1\h'|3i'  undefined value
2490           0\h'|3i'  string "0 but true"
2491           anything else\h'|3i'  that number
2492
2493 .fi
2494 Thus perl returns true on success and false on failure, yet you can still
2495 easily determine the actual value returned by the operating system:
2496 .nf
2497
2498         ($retval = ioctl(...)) || ($retval = -1);
2499         printf "System returned %d\en", $retval;
2500 .fi
2501 .Ip "join(EXPR,LIST)" 8 8
2502 .Ip "join(EXPR,ARRAY)" 8
2503 Joins the separate strings of LIST or ARRAY into a single string with fields
2504 separated by the value of EXPR, and returns the string.
2505 Example:
2506 .nf
2507
2508 .ie t \{\
2509     $_ = join(\|\':\', $login,$passwd,$uid,$gid,$gcos,$home,$shell);
2510 'br\}
2511 .el \{\
2512     $_ = join(\|\':\',
2513                 $login,$passwd,$uid,$gid,$gcos,$home,$shell);
2514 'br\}
2515
2516 .fi
2517 See
2518 .IR split .
2519 .Ip "keys(ASSOC_ARRAY)" 8 6
2520 .Ip "keys ASSOC_ARRAY" 8
2521 Returns a normal array consisting of all the keys of the named associative
2522 array.
2523 The keys are returned in an apparently random order, but it is the same order
2524 as either the values() or each() function produces (given that the associative array
2525 has not been modified).
2526 Here is yet another way to print your environment:
2527 .nf
2528
2529 .ne 5
2530         @keys = keys %ENV;
2531         @values = values %ENV;
2532         while ($#keys >= 0) {
2533                 print pop(@keys), \'=\', pop(@values), "\en";
2534         }
2535
2536 or how about sorted by key:
2537
2538 .ne 3
2539         foreach $key (sort(keys %ENV)) {
2540                 print $key, \'=\', $ENV{$key}, "\en";
2541         }
2542
2543 .fi
2544 .Ip "kill(LIST)" 8 8
2545 .Ip "kill LIST" 8 2
2546 Sends a signal to a list of processes.
2547 The first element of the list must be the signal to send.
2548 Returns the number of processes successfully signaled.
2549 .nf
2550
2551         $cnt = kill 1, $child1, $child2;
2552         kill 9, @goners;
2553
2554 .fi
2555 If the signal is negative, kills process groups instead of processes.
2556 (On System V, a negative \fIprocess\fR number will also kill process groups,
2557 but that's not portable.)
2558 You may use a signal name in quotes.
2559 .Ip "last LABEL" 8 8
2560 .Ip "last" 8
2561 The
2562 .I last
2563 command is like the
2564 .I break
2565 statement in C (as used in loops); it immediately exits the loop in question.
2566 If the LABEL is omitted, the command refers to the innermost enclosing loop.
2567 The
2568 .I continue
2569 block, if any, is not executed:
2570 .nf
2571
2572 .ne 4
2573         line: while (<STDIN>) {
2574                 last line if /\|^$/;    # exit when done with header
2575                 .\|.\|.
2576         }
2577
2578 .fi
2579 .Ip "length(EXPR)" 8 4
2580 .Ip "length EXPR" 8
2581 Returns the length in characters of the value of EXPR.
2582 If EXPR is omitted, returns length of $_.
2583 .Ip "link(OLDFILE,NEWFILE)" 8 2
2584 Creates a new filename linked to the old filename.
2585 Returns 1 for success, 0 otherwise.
2586 .Ip "listen(SOCKET,QUEUESIZE)" 8 2
2587 Does the same thing that the listen system call does.
2588 Returns true if it succeeded, false otherwise.
2589 See example in section on Interprocess Communication.
2590 .Ip "local(LIST)" 8 4
2591 Declares the listed variables to be local to the enclosing block,
2592 subroutine, eval or \*(L"do\*(R".
2593 All the listed elements must be legal lvalues.
2594 This operator works by saving the current values of those variables in LIST
2595 on a hidden stack and restoring them upon exiting the block, subroutine or eval.
2596 This means that called subroutines can also reference the local variable,
2597 but not the global one.
2598 The LIST may be assigned to if desired, which allows you to initialize
2599 your local variables.
2600 (If no initializer is given for a particular variable, it is created with
2601 an undefined value.)
2602 Commonly this is used to name the parameters to a subroutine.
2603 Examples:
2604 .nf
2605
2606 .ne 13
2607         sub RANGEVAL {
2608                 local($min, $max, $thunk) = @_;
2609                 local($result) = \'\';
2610                 local($i);
2611
2612                 # Presumably $thunk makes reference to $i
2613
2614                 for ($i = $min; $i < $max; $i++) {
2615                         $result .= eval $thunk;
2616                 }
2617
2618                 $result;
2619         }
2620
2621 .ne 6
2622         if ($sw eq \'-v\') {
2623             # init local array with global array
2624             local(@ARGV) = @ARGV;
2625             unshift(@ARGV,\'echo\');
2626             system @ARGV;
2627         }
2628         # @ARGV restored
2629
2630 .ne 6
2631         # temporarily add to digits associative array
2632         if ($base12) {
2633                 # (NOTE: not claiming this is efficient!)
2634                 local(%digits) = (%digits,'t',10,'e',11);
2635                 do parse_num();
2636         }
2637
2638 .fi
2639 Note that local() is a run-time command, and so gets executed every time
2640 through a loop, using up more stack storage each time until it's all
2641 released at once when the loop is exited.
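.Sp
The following sketch (all names are hypothetical) illustrates that a called
subroutine sees the local value rather than the global one:
.nf

.ne 9
        $x = \'global\';
        sub show { print "$x\en"; }
        sub test {
                local($x) = \'local\';
                &show;          # prints "local"
        }
        &test;
        &show;                  # prints "global"

.fi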
2642 .Ip "localtime(EXPR)" 8 4
2643 .Ip "localtime EXPR" 8
2644 Converts a time as returned by the time function to a 9-element array with
2645 the time analyzed for the local timezone.
2646 Typically used as follows:
2647 .nf
2648
2649 .ne 3
2650 .ie t \{\
2651     ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
2652 'br\}
2653 .el \{\
2654     ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
2655                                                 localtime(time);
2656 'br\}
2657
2658 .fi
2659 All array elements are numeric, and come straight out of a struct tm.
2660 In particular this means that $mon has the range 0.\|.11 and $wday has the
2661 range 0.\|.6.
2662 If EXPR is omitted, does localtime(time).
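.Sp
Because the values come from a struct tm, $year counts years since 1900, and
$mon needs 1 added to it before it is shown to people.
A sketch of a printable timestamp:
.nf

.ne 3
        ($sec,$min,$hour,$mday,$mon,$year) = localtime(time);
        printf("%d/%02d/%02d %02d:%02d:%02d\en",
                $year + 1900, $mon + 1, $mday, $hour, $min, $sec);

.fi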
2663 .Ip "log(EXPR)" 8 4
2664 .Ip "log EXPR" 8
2665 Returns logarithm (base
2666 .IR e )
2667 of EXPR.
2668 If EXPR is omitted, returns log of $_.
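.Sp
Logarithms to other bases can be had by dividing by the log of the base.
A sketch (log10 is a hypothetical helper here, not a builtin):
.nf

        sub log10 { log($_[0]) / log(10); }
        print &log10(1000), "\en";      # approximately 3

.fi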
2669 .Ip "lstat(FILEHANDLE)" 8 6
2670 .Ip "lstat FILEHANDLE" 8
2671 .Ip "lstat(EXPR)" 8
2672 .Ip "lstat SCALARVARIABLE" 8
2673 Does the same thing as the stat() function, but stats a symbolic link
2674 instead of the file the symbolic link points to.
2675 If symbolic links are unimplemented on your system, a normal stat is done.
2676 .Ip "m/PATTERN/gio" 8 4
2677 .Ip "/PATTERN/gio" 8
2678 Searches a string for a pattern match, and returns true (1) or false (\'\').
2679 If no string is specified via the =~ or !~ operator,
2680 the $_ string is searched.
2681 (The string specified with =~ need not be an lvalue\*(--it may be the result of an expression evaluation, but remember the =~ binds rather tightly.)
2682 See also the section on regular expressions.
2683 .Sp
2684 If / is the delimiter then the initial \*(L'm\*(R' is optional.
2685 With the \*(L'm\*(R' you can use any pair of non-alphanumeric characters
2686 as delimiters.
2687 This is particularly useful for matching Unix path names that contain \*(L'/\*(R'.
2688 If the final delimiter is followed by the optional letter \*(L'i\*(R', the matching is
2689 done in a case-insensitive manner.
2690 PATTERN may contain references to scalar variables, which will be interpolated
2691 (and the pattern recompiled) every time the pattern search is evaluated.
2692 (Note that $) and $| may not be interpolated because they look like end-of-string tests.)
2693 If you want such a pattern to be compiled only once, add an \*(L"o\*(R" after
2694 the trailing delimiter.
2695 This avoids expensive run-time recompilations, and
2696 is useful when the value you are interpolating won't change over the
2697 life of the script.
2698 If the PATTERN evaluates to a null string, the most recent successful
2699 regular expression is used instead.
2700 .Sp
2701 If used in a context that requires an array value, a pattern match returns an
2702 array consisting of the subexpressions matched by the parentheses in the
2703 pattern,
2704 i.e. ($1, $2, $3.\|.\|.).
2705 It does NOT actually set $1, $2, etc. in this case, nor does it set $+, $`, $&
2706 or $'.
2707 If the match fails, a null array is returned.
2708 If the match succeeds, but there were no parentheses, an array value of (1)
2709 is returned.
2710 .Sp
2711 Examples:
2712 .nf
2713
2714 .ne 4
2715     open(tty, \'/dev/tty\');
2716     <tty> \|=~ \|/\|^y\|/i \|&& \|do foo(\|);   # do foo if desired
2717
2718     if (/Version: \|*\|([0\-9.]*\|)\|/\|) { $version = $1; }
2719
2720     next if m#^/usr/spool/uucp#;
2721
2722 .ne 5
2723     # poor man's grep
2724     $arg = shift;
2725     while (<>) {
2726             print if /$arg/o;   # compile only once
2727     }
2728
2729     if (($F1, $F2, $Etc) = ($foo =~ /^(\eS+)\es+(\eS+)\es*(.*)/))
2730
2731 .fi
2732 This last example splits $foo into the first two words and the remainder
2733 of the line, and assigns those three fields to $F1, $F2 and $Etc.
2734 The conditional is true if any variables were assigned, i.e. if the pattern
2735 matched.
2736 .Sp
2737 The \*(L"g\*(R" modifier specifies global pattern matching\*(--that is,
2738 matching as many times as possible within the string.  How it behaves
2739 depends on the context.  In an array context, it returns a list of
2740 all the substrings matched by all the parentheses in the regular expression.
2741 If there are no parentheses, it returns a list of all the matched strings,
2742 as if there were parentheses around the whole pattern.  In a scalar context,
2743 it iterates through the string, returning TRUE each time it matches, and
2744 FALSE when it eventually runs out of matches.  (In other words, it remembers
2745 where it left off last time and restarts the search at that point.)  It
2746 presumes that you have not modified the string since the last match.
2747 Modifying the string between matches may result in undefined behavior.
2748 (You can actually get away with in-place modifications via substr()
2749 that do not change the length of the entire string.  In general, however,
2750 you should be using s///g for such modifications.)  Examples:
2751 .nf
2752
2753         # array context
2754         ($one,$five,$fifteen) = (\`uptime\` =~ /(\ed+\e.\ed+)/g);
2755
2756         # scalar context
2757         $/ = \'\'; $* = 1;       # paragraph mode, multi-line matching
2758         while ($paragraph = <>) {
2759             while ($paragraph =~ /[a-z][\'")]*[.!?]+[\'")]*\es/g) {
2760                 $sentences++;
2761             }
2762         }
2763         print "$sentences\en";
2764
2765 .fi
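.Sp
With no parentheses in the pattern, the array form of \*(L"g\*(R" returns
the matched strings themselves.
A sketch, with $line as a made-up input string:
.nf

        $line = "the quick brown fox";
        @words = ($line =~ /\ew+/g);
        print $#words + 1, "\en";       # prints 4

.fi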
2766 .Ip "mkdir(FILENAME,MODE)" 8 3
2767 Creates the directory specified by FILENAME, with permissions specified by
2768 MODE (as modified by umask).
2769 If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno).
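.Sp
MODE is normally written in octal.
A sketch, with $dir as a hypothetical directory name:
.nf

        $dir = "/tmp/work$$";
        mkdir($dir, 0755) || die "Can't mkdir $dir: $!\en";

.fi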
2770 .Ip "msgctl(ID,CMD,ARG)" 8 4
2771 Calls the System V IPC function msgctl.  If CMD is &IPC_STAT, then ARG
2772 must be a variable which will hold the returned msqid_ds structure.
2773 Returns like ioctl: the undefined value for error, "0 but true" for
2774 zero, or the actual return value otherwise.
2775 .Ip "msgget(KEY,FLAGS)" 8 4
2776 Calls the System V IPC function msgget.  Returns the message queue id,
2777 or the undefined value if there is an error.
2778 .Ip "msgsnd(ID,MSG,FLAGS)" 8 4
2779 Calls the System V IPC function msgsnd to send the message MSG to the
2780 message queue ID.  MSG must begin with the long integer message type,
2781 which may be created with pack("L", $type).  Returns true if
2782 successful, or false if there is an error.
2783 .Ip "msgrcv(ID,VAR,SIZE,TYPE,FLAGS)" 8 4
2784 Calls the System V IPC function msgrcv to receive a message from
2785 message queue ID into variable VAR with a maximum message size of
2786 SIZE.  Note that if a message is received, the message type will be
2787 the first thing in VAR, and the maximum length of VAR is SIZE plus the
2788 size of the message type.  Returns true if successful, or false if
2789 there is an error.
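.Sp
A rough sketch of the message layout described above.
It assumes that $id came from a successful msgget(), that $type and $text are
your own variables, and that 1024 bytes is enough for any message.
A TYPE of 0 asks for the first message on the queue, as with the system call.
.nf

.ne 6
        $msg = pack("L", $type) . $text;
        msgsnd($id, $msg, 0) || die "msgsnd failed: $!\en";

        msgrcv($id, $buf, 1024, 0, 0) || die "msgrcv failed: $!\en";
        $rtype = unpack("L", $buf);
        $rtext = substr($buf, length(pack("L", 0)));

.fi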
2790 .Ip "next LABEL" 8 8
2791 .Ip "next" 8
2792 The
2793 .I next
2794 command is like the
2795 .I continue
2796 statement in C; it starts the next iteration of the loop:
2797 .nf
2798
2799 .ne 4
2800         line: while (<STDIN>) {
2801                 next line if /\|^#/;    # discard comments
2802                 .\|.\|.
2803         }
2804
2805 .fi
2806 Note that if there were a
2807 .I continue
2808 block on the above, it would get executed even on discarded lines.
2809 If the LABEL is omitted, the command refers to the innermost enclosing loop.
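.Sp
A sketch of a loop that relies on this: the counter is advanced in the
continue block, so the loop still terminates when an iteration is skipped.
.nf

.ne 8
        $i = 0;
        while ($i < 5) {
                next if $i == 2;
                print "$i\en";          # prints 0, 1, 3 and 4
        }
        continue {
                $i++;
        }

.fi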
2810 .Ip "oct(EXPR)" 8 4
2811 .Ip "oct EXPR" 8
2812 Returns the decimal value of EXPR interpreted as an octal string.
2813 (If EXPR happens to start off with 0x, interprets it as a hex string instead.)
2814 The following will handle decimal, octal and hex in the standard notation:
2815 .nf
2816
2817         $val = oct($val) if $val =~ /^0/;
2818
2819 .fi
2820 If EXPR is omitted, uses $_.
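.Sp
For instance, a file mode typed in by a user arrives as a string and has to
be converted before use.
A sketch (the variable names are illustrative):
.nf

        $perms = oct("755");            # 493
        $mask = oct("0x1f");            # 31
        chmod($perms, $file);

.fi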
2821 .Ip "open(FILEHANDLE,EXPR)" 8 8
2822 .Ip "open(FILEHANDLE)" 8
2823 .Ip "open FILEHANDLE" 8
2824 Opens the file whose filename is given by EXPR, and associates it with
2825 FILEHANDLE.
2826 If FILEHANDLE is an expression, its value is used as the name of the
2827 real filehandle wanted.
2828 If EXPR is omitted, the scalar variable of the same name as the FILEHANDLE
2829 contains the filename.
2830 If the filename begins with \*(L"<\*(R" or nothing, the file is opened for
2831 input.
2832 If the filename begins with \*(L">\*(R", the file is opened for output.
2833 If the filename begins with \*(L">>\*(R", the file is opened for appending.
2834 (You can put a \'+\' in front of the \'>\' or \'<\' to indicate that you
2835 want both read and write access to the file.)
2836 If the filename begins with \*(L"|\*(R", the filename is interpreted
2837 as a command to which output is to be piped, and if the filename ends
2838 with a \*(L"|\*(R", the filename is interpreted as a command which pipes
2839 input to us.
2840 (You may not have a command that pipes both in and out.)
2841 Opening \'\-\' opens
2842 .I STDIN
2843 and opening \'>\-\' opens
2844 .IR STDOUT .
2845 Open returns non-zero upon success, the undefined value otherwise.
2846 If the open involved a pipe, the return value happens to be the pid
2847 of the subprocess.
2848 Examples:
2849 .nf
2850
2851 .ne 3
2852         $article = 100;
2853         open article || die "Can't find article $article: $!\en";
2854         while (<article>) {\|.\|.\|.
2855
2856 .ie t \{\
2857         open(LOG, \'>>/usr/spool/news/twitlog\'\|);     # (log is reserved)
2858 'br\}
2859 .el \{\
2860         open(LOG, \'>>/usr/spool/news/twitlog\'\|);
2861                                         # (log is reserved)
2862 'br\}
2863
2864 .ie t \{\
2865         open(article, "caesar <$article |"\|);          # decrypt article
2866 'br\}
2867 .el \{\
2868         open(article, "caesar <$article |"\|);
2869                                         # decrypt article
2870 'br\}
2871
2872 .ie t \{\
2873         open(extract, "|sort >/tmp/Tmp$$"\|);           # $$ is our process#
2874 'br\}
2875 .el \{\
2876         open(extract, "|sort >/tmp/Tmp$$"\|);
2877                                         # $$ is our process#
2878 'br\}
2879
2880 .ne 7
2881         # process argument list of files along with any includes
2882
2883         foreach $file (@ARGV) {
2884                 do process($file, \'fh00\');    # no pun intended
2885         }
2886
2887         sub process {
2888                 local($filename, $input) = @_;
2889                 $input++;               # this is a string increment
2890                 unless (open($input, $filename)) {
2891                         print STDERR "Can't open $filename: $!\en";
2892                         return;
2893                 }
2894 .ie t \{\
2895                 while (<$input>) {              # note the use of indirection
2896 'br\}
2897 .el \{\
2898                 while (<$input>) {              # note use of indirection
2899 'br\}
2900                         if (/^#include "(.*)"/) {
2901                                 do process($1, $input);
2902                                 next;
2903                         }
2904                         .\|.\|.         # whatever
2905                 }
2906         }
2907
2908 .fi
2909 You may also, in the Bourne shell tradition, specify an EXPR beginning
2910 with \*(L">&\*(R", in which case the rest of the string
2911 is interpreted as the name of a filehandle
2912 (or file descriptor, if numeric) which is to be duped and opened.
2913 You may use & after >, >>, <, +>, +>> and +<.
2914 The mode you specify should match the mode of the original filehandle.
2915 Here is a script that saves, redirects, and restores
2916 .I STDOUT
2917 and
2918 .IR STDERR :
2919 .nf
2920
2921 .ne 21
2922         #!/usr/bin/perl
2923         open(SAVEOUT, ">&STDOUT");
2924         open(SAVEERR, ">&STDERR");
2925
2926         open(STDOUT, ">foo.out") || die "Can't redirect stdout";
2927         open(STDERR, ">&STDOUT") || die "Can't dup stdout";
2928
2929         select(STDERR); $| = 1;         # make unbuffered
2930         select(STDOUT); $| = 1;         # make unbuffered
2931
2932         print STDOUT "stdout 1\en";     # this works for
2933         print STDERR "stderr 1\en";     # subprocesses too
2934
2935         close(STDOUT);
2936         close(STDERR);
2937
2938         open(STDOUT, ">&SAVEOUT");
2939         open(STDERR, ">&SAVEERR");
2940
2941         print STDOUT "stdout 2\en";
2942         print STDERR "stderr 2\en";
2943
2944 .fi
2945 If you open a pipe on the command \*(L"\-\*(R", i.e. either \*(L"|\-\*(R" or \*(L"\-|\*(R",
2946 then there is an implicit fork done, and the return value of open
2947 is the pid of the child within the parent process, and 0 within the child
2948 process.
2949 (Use defined($pid) to determine if the open was successful.)
2950 The filehandle behaves normally for the parent, but i/o to that
2951 filehandle is piped from/to the
2952 .IR STDOUT / STDIN
2953 of the child process.
2954 In the child process the filehandle isn't opened\*(--i/o happens from/to
2955 the new
2956 .I STDOUT
2957 or
2958 .IR STDIN .
2959 Typically this is used like the normal piped open when you want to exercise
2960 more control over just how the pipe command gets executed, such as when
2961 you are running setuid, and don't want to have to scan shell commands
2962 for metacharacters.
2963 The following pairs are more or less equivalent:
2964 .nf
2965
2966 .ne 5
2967         open(FOO, "|tr \'[a\-z]\' \'[A\-Z]\'");
2968         open(FOO, "|\-") || exec \'tr\', \'[a\-z]\', \'[A\-Z]\';
2969
2970         open(FOO, "cat \-n '$file'|");
2971         open(FOO, "\-|") || exec \'cat\', \'\-n\', $file;
2972
2973 .fi
2974 Explicitly closing any piped filehandle causes the parent process to wait for the
2975 child to finish, and returns the status value in $?.
2976 Note: on any operation which may do a fork,
2977 unflushed buffers remain unflushed in both
2978 processes, which means you may need to set $| to
2979 avoid duplicate output.
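.Sp
A sketch of the forking form of open, checking the result with defined() as
described above (KID is an arbitrary filehandle name):
.nf

.ne 13
        $pid = open(KID, "\-|");
        die "Can't fork: $!\en" unless defined($pid);
        if ($pid) {                     # parent reads child's STDOUT
                while (<KID>) {
                        print "child said: $_";
                }
                close(KID);             # waits; status is in $?
        }
        else {                          # child writes to its STDOUT
                print "hello from $$\en";
                exit 0;
        }

.fi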
2980 .Sp
2981 The filename that is passed to open will have leading and trailing
2982 whitespace deleted.
2983 In order to open a file with arbitrary weird characters in it, it's necessary
2984 to protect any leading and trailing whitespace thusly:
2985 .nf
2986
2987 .ne 2
2988         $file =~ s#^(\es)#./$1#;
2989         open(FOO, "< $file\e0");
2990
2991 .fi
2992 .Ip "opendir(DIRHANDLE,EXPR)" 8 3
2993 Opens a directory named EXPR for processing by readdir(), telldir(), seekdir(),
2994 rewinddir() and closedir().
2995 Returns true if successful.
2996 DIRHANDLEs have their own namespace separate from FILEHANDLEs.
2997 .Ip "ord(EXPR)" 8 4
2998 .Ip "ord EXPR" 8
2999 Returns the numeric ascii value of the first character of EXPR.
3000 If EXPR is omitted, uses $_.
3001 ''' Comments on f & d by gnb@melba.bby.oz.au    22/11/89
3002 .Ip "pack(TEMPLATE,LIST)" 8 4
3003 Takes an array or list of values and packs it into a binary structure,
3004 returning the string containing the structure.
3005 The TEMPLATE is a sequence of characters that give the order and type
3006 of values, as follows:
3007 .nf
3008
3009         A       An ascii string, will be space padded.
3010         a       An ascii string, will be null padded.
3011         c       A signed char value.
3012         C       An unsigned char value.
3013         s       A signed short value.
3014         S       An unsigned short value.
3015         i       A signed integer value.
3016         I       An unsigned integer value.
3017         l       A signed long value.
3018         L       An unsigned long value.
3019         n       A short in \*(L"network\*(R" order.
3020         N       A long in \*(L"network\*(R" order.
3021         f       A single-precision float in the native format.
3022         d       A double-precision float in the native format.
3023         p       A pointer to a string.
3024         x       A null byte.
3025         X       Back up a byte.
3026         @       Null fill to absolute position.
3027         u       A uuencoded string.
3028         b       A bit string (ascending bit order, like vec()).
3029         B       A bit string (descending bit order).
3030         h       A hex string (low nybble first).
3031         H       A hex string (high nybble first).
3032
3033 .fi
3034 Each letter may optionally be followed by a number which gives a repeat
3035 count.
3036 With all types except "a", "A", "b", "B", "h" and "H",
3037 the pack function will gobble up that many values
3038 from the LIST.
3039 A * for the repeat count means to use however many items are left.
3040 The "a" and "A" types gobble just one value, but pack it as a string of length
3041 count,
3042 padding with nulls or spaces as necessary.
3043 (When unpacking, "A" strips trailing spaces and nulls, but "a" does not.)
3044 Likewise, the "b" and "B" fields pack a string that many bits long.
3045 The "h" and "H" fields pack a string that many nybbles long.
3046 Real numbers (floats and doubles) are in the native machine format
3047 only; due to the multiplicity of floating formats around, and the lack
3048 of a standard \*(L"network\*(R" representation, no facility for
3049 interchange has been made.
3050 This means that packed floating point data
3051 written on one machine may not be readable on another, even if both
3052 use IEEE floating point arithmetic (as the endian-ness of the memory
3053 representation is not part of the IEEE spec).
3054 Note that perl uses
3055 doubles internally for all numeric calculation, and converting from
3056 double -> float -> double will lose precision (i.e. unpack("f",
3057 pack("f", $foo)) will not in general equal $foo).
3058 .br
3059 Examples:
3060 .nf
3061
3062         $foo = pack("cccc",65,66,67,68);
3063         # foo eq "ABCD"
3064         $foo = pack("c4",65,66,67,68);
3065         # same thing
3066
3067         $foo = pack("ccxxcc",65,66,67,68);
3068         # foo eq "AB\e0\e0CD"
3069
3070         $foo = pack("s2",1,2);
3071         # "\e1\e0\e2\e0" on little-endian
3072         # "\e0\e1\e0\e2" on big-endian
3073
3074         $foo = pack("a4","abcd","x","y","z");
3075         # "abcd"
3076
3077         $foo = pack("aaaa","abcd","x","y","z");
3078         # "axyz"
3079
3080         $foo = pack("a14","abcdefg");
3081         # "abcdefg\e0\e0\e0\e0\e0\e0\e0"
3082
3083         $foo = pack("i9pl", gmtime);
3084         # a real struct tm (on my system anyway)
3085
3086         sub bintodec {
3087             unpack("N", pack("B32", substr("0" x 32 . shift, -32)));
3088         }
3089 .fi
3090 The same template may generally also be used in the unpack function.
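.Sp
A sketch of a round trip through pack and unpack (the field layout is
invented for the example):
.nf

.ne 4
        $rec = pack("A8n", "perl", 4);
        ($name, $ver) = unpack("A8n", $rec);
        # $name eq "perl" (trailing spaces stripped), $ver == 4
        print length($rec), "\en";      # prints 10

.fi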
3091 .Ip "pipe(READHANDLE,WRITEHANDLE)" 8 3
3092 Opens a pair of connected pipes like the corresponding system call.
3093 Note that if you set up a loop of piped processes, deadlock can occur
3094 unless you are very careful.
3095 In addition, note that perl's pipes use stdio buffering, so you may need
3096 to set $| to flush your WRITEHANDLE after each command, depending on
3097 the application.
3098 [Requires version 3.0 patchlevel 9.]
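.Sp
A sketch of a parent reading from a forked child (RDR and WTR are arbitrary
handle names):
.nf

.ne 18
        pipe(RDR, WTR);
        $pid = fork;
        die "Can't fork: $!\en" unless defined($pid);
        if ($pid) {                     # parent
                close(WTR);
                while (<RDR>) {
                        print "got: $_";
                }
                close(RDR);
                wait;
        }
        else {                          # child
                close(RDR);
                print WTR "hello\en";
                close(WTR);             # flushes the buffered output
                exit 0;
        }

.fi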
3099 .Ip "pop(ARRAY)" 8
3100 .Ip "pop ARRAY" 8 6
3101 Pops and returns the last value of the array, shortening the array by 1.
3102 Has the same effect as
3103 .nf
3104
3105         $tmp = $ARRAY[$#ARRAY\-\|\-];
3106
3107 .fi
3108 If there are no elements in the array, returns the undefined value.
3109 .Ip "print(FILEHANDLE LIST)" 8 10
3110 .Ip "print(LIST)" 8
3111 .Ip "print FILEHANDLE LIST" 8
3112 .Ip "print LIST" 8
3113 .Ip "print" 8
3114 Prints a string or a comma-separated list of strings.
3115 Returns non-zero if successful.
3116 FILEHANDLE may be a scalar variable name, in which case the variable contains
3117 the name of the filehandle, thus introducing one level of indirection.
3118 (NOTE: If FILEHANDLE is a variable and the next token is a term, it may be
3119 misinterpreted as an operator unless you interpose a + or put parens around
3120 the arguments.)
3121 If FILEHANDLE is omitted, prints by default to standard output (or to the
3122 last selected output channel\*(--see select()).
3123 If LIST is also omitted, prints $_ to
3124 .IR STDOUT .
3125 To set the default output channel to something other than
3126 .I STDOUT
3127 use the select operation.
3128 Note that, because print takes a LIST, anything in the LIST is evaluated
3129 in an array context, and any subroutine that you call will have one or more
3130 of its expressions evaluated in an array context.
3131 Also be careful not to follow the print keyword with a left parenthesis
3132 unless you want the corresponding right parenthesis to terminate the
3133 arguments to the print\*(--interpose a + or put parens around all the arguments.
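.Sp
A sketch of the parenthesis pitfall just described:
.nf

.ne 3
        print (1+2)*4;          # prints 3; the *4 applies to print's result
        print +(1+2)*4;         # prints 12
        print((1+2)*4);         # prints 12

.fi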
3134 .Ip "printf(FILEHANDLE LIST)" 8 10
3135 .Ip "printf(LIST)" 8
3136 .Ip "printf FILEHANDLE LIST" 8
3137 .Ip "printf LIST" 8
3138 Equivalent to a \*(L"print FILEHANDLE sprintf(LIST)\*(R".
3139 .Ip "push(ARRAY,LIST)" 8 7
3140 Treats ARRAY (@ is optional) as a stack, and pushes the values of LIST
3141 onto the end of ARRAY.
3142 The length of ARRAY increases by the length of LIST.
3143 Has the same effect as
3144 .nf
3145
3146     for $value (LIST) {
3147             $ARRAY[++$#ARRAY] = $value;
3148     }
3149
3150 .fi
3151 but is more efficient.
3152 .Ip "q/STRING/" 8 5
3153 .Ip "qq/STRING/" 8
3154 .Ip "qx/STRING/" 8
3155 These are not really functions, but simply syntactic sugar to let you
3156 avoid putting too many backslashes into quoted strings.
3157 The q operator is a generalized single quote, and the qq operator a
3158 generalized double quote.
3159 The qx operator is a generalized backquote.
3160 Any non-alphanumeric delimiter can be used in place of /, including newline.
3161 If the delimiter is an opening bracket or parenthesis, the final delimiter
3162 will be the corresponding closing bracket or parenthesis.
3163 (Embedded occurrences of the closing bracket need to be backslashed as usual.)
3164 Examples:
3165 .nf
3166
3167 .ne 5
3168         $foo = q!I said, "You said, \'She said it.\'"!;
3169         $bar = q(\'This is it.\');
3170         $today = qx{ date };
3171         $_ .= qq
3172 *** The previous line contains the naughty word "$&".\en
3173                 if /(ibm|apple|awk)/;      # :-)
3174
3175 .fi
3176 .Ip "rand(EXPR)" 8 8
3177 .Ip "rand EXPR" 8
3178 .Ip "rand" 8
3179 Returns a random fractional number between 0 and the value of EXPR.
3180 (EXPR should be positive.)
3181 If EXPR is omitted, returns a value between 0 and 1.
3182 See also srand().
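.Sp
Use int() to turn the result into a whole number.
A sketch of a die roll and a random pick from a hypothetical array @list:
.nf

.ne 3
        srand(time);
        $roll = int(rand(6)) + 1;               # 1 through 6
        $pick = $list[int(rand($#list + 1))];

.fi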
3183 .Ip "read(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
3184 .Ip "read(FILEHANDLE,SCALAR,LENGTH)" 8 5
3185 Attempts to read LENGTH bytes of data into variable SCALAR from the specified
3186 FILEHANDLE.
3187 Returns the number of bytes actually read, or undef if there was an error.
3188 SCALAR will be grown or shrunk to the length actually read.
3189 An OFFSET may be specified to place the read data at some other place
3190 than the beginning of the string.
3191 This call is actually implemented in terms of stdio's fread call.  To get
3192 a true read system call, see sysread.
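.Sp
A sketch that reads a file in fixed-size chunks and uses OFFSET to append
each chunk to what has been read so far (the file name and the chunk size
of 512 are arbitrary):
.nf

.ne 6
        open(IN, "/etc/motd") || die "Can't open /etc/motd: $!\en";
        $buf = \'\';
        while (($n = read(IN, $buf, 512, length($buf))) > 0) {
                $total += $n;
        }
        close(IN);

.fi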
3193 .Ip "readdir(DIRHANDLE)" 8 3
3194 .Ip "readdir DIRHANDLE" 8
3195 Returns the next directory entry for a directory opened by opendir().
3196 If used in an array context, returns all the rest of the entries in the
3197 directory.
3198 If there are no more entries, returns an undefined value in a scalar context
3199 or a null list in an array context.
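.Sp
A sketch that collects every entry except the current and parent directory
entries (DIR is an arbitrary handle name):
.nf

.ne 3
        opendir(DIR, \'.\') || die "Can't open directory: $!\en";
        @files = grep(!/^\e.\e.?$/, readdir(DIR));
        closedir(DIR);

.fi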
3200 .Ip "readlink(EXPR)" 8 6
3201 .Ip "readlink EXPR" 8
3202 Returns the value of a symbolic link, if symbolic links are implemented.
3203 If not, gives a fatal error.
3204 If there is some system error, returns the undefined value and sets $! (errno).
3205 If EXPR is omitted, uses $_.
3206 .Ip "recv(SOCKET,SCALAR,LEN,FLAGS)" 8 4
3207 Receives a message on a socket.
3208 Attempts to receive LEN bytes of data into variable SCALAR from the specified
3209 SOCKET filehandle.
3210 Returns the address of the sender, or the undefined value if there's an error.
3211 SCALAR will be grown or shrunk to the length actually read.
3212 Takes the same flags as the system call of the same name.
3213 .Ip "redo LABEL" 8 8
3214 .Ip "redo" 8
3215 The
3216 .I redo
3217 command restarts the loop block without evaluating the conditional again.
3218 The
3219 .I continue
3220 block, if any, is not executed.
3221 If the LABEL is omitted, the command refers to the innermost enclosing loop.
3222 This command is normally used by programs that want to lie to themselves
3223 about what was just input:
3224 .nf
3225
3226 .ne 16
3227         # a simpleminded Pascal comment stripper
3228         # (warning: assumes no { or } in strings)
3229         line: while (<STDIN>) {
3230                 while (s|\|({.*}.*\|){.*}|$1 \||) {}
3231                 s|{.*}| \||;
3232                 if (s|{.*| \||) {
3233                         $front = $_;
3234                         while (<STDIN>) {
3235                                 if (\|/\|}/\|) {        # end of comment?
3236                                         s|^|$front{|;
3237                                         redo line;
3238                                 }
3239                         }
3240                 }
3241                 print;
3242         }
3243
3244 .fi
3245 .Ip "rename(OLDNAME,NEWNAME)" 8 2
3246 Changes the name of a file.
3247 Returns 1 for success, 0 otherwise.
3248 Will not work across filesystem boundaries.
3249 .Ip "require(EXPR)" 8 6
3250 .Ip "require EXPR" 8
3251 .Ip "require" 8
3252 Includes the library file specified by EXPR, or by $_ if EXPR is not supplied.
3253 Has semantics similar to the following subroutine:
3254 .nf
3255
3256         sub require {
3257             local($filename) = @_;
3258             return 1 if $INC{$filename};
3259             local($realfilename,$result);
3260             ITER: {
3261                 foreach $prefix (@INC) {
3262                     $realfilename = "$prefix/$filename";
3263                     if (-f $realfilename) {
3264                         $result = do $realfilename;
3265                         last ITER;
3266                     }
3267                 }
3268                 die "Can't find $filename in \e@INC";
3269             }
3270             die $@ if $@;
3271             die "$filename did not return true value" unless $result;
3272             $INC{$filename} = $realfilename;
3273             $result;
3274         }
3275
3276 .fi
3277 Note that the file will not be included twice under the same specified name.
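.Sp
Since the file must return a true value, library files conventionally end
with a true statement.
A sketch of a hypothetical library mylib.pl, assumed to live in a directory
listed in @INC, and a script that uses it:
.nf

.ne 10
        # mylib.pl
        sub max {
                local($x, $y) = @_;
                $x > $y ? $x : $y;
        }
        1;      # so that require succeeds

        # script
        require \'mylib.pl\';
        print &max(3, 7), "\en";        # prints 7

.fi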
3278 .Ip "reset(EXPR)" 8 6
3279 .Ip "reset EXPR" 8
3280 .Ip "reset" 8
3281 Generally used in a
3282 .I continue
3283 block at the end of a loop to clear variables and reset ?? searches
3284 so that they work again.
3285 The expression is interpreted as a list of single characters (hyphens allowed
3286 for ranges).
3287 All variables and arrays beginning with one of those letters are reset to
3288 their pristine state.
3289 If the expression is omitted, one-match searches (?pattern?) are reset to
3290 match again.
3291 Only resets variables or searches in the current package.
3292 Always returns 1.
3293 Examples:
3294 .nf
3295
3296 .ne 3
3297     reset \'X\';        \h'|2i'# reset all X variables
3298     reset \'a\-z\';\h'|2i'# reset lower case variables
3299     reset;      \h'|2i'# just reset ?? searches
3300
3301 .fi
3302 Note: resetting \*(L"A\-Z\*(R" is not recommended since you'll wipe out your ARGV and ENV
3303 arrays.
3304 .Sp
3305 The use of reset on dbm associative arrays does not change the dbm file.
3306 (It does, however, flush any entries cached by perl, which may be useful if
3307 you are sharing the dbm file.
3308 Then again, maybe not.)
3309 .Ip "return LIST" 8 3
3310 Returns from a subroutine with the value specified.
3311 (Note that a subroutine can automatically return
3312 the value of the last expression evaluated.
3313 That's the preferred method\*(--use of an explicit
3314 .I return
3315 is a bit slower.)
3316 .Ip "reverse(LIST)" 8 4
3317 .Ip "reverse LIST" 8
3318 In an array context, returns an array value consisting of the elements
3319 of LIST in the opposite order.
3320 In a scalar context, returns a string value consisting of the bytes of
3321 the first element of LIST in the opposite order.
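.Sp
A sketch of the two contexts (scalar() is used to force the second one):
.nf

.ne 3
        @r = reverse(1, 2, 3);                  # (3, 2, 1)
        print scalar(reverse("hello")), "\en";  # prints "olleh"
        print reverse("hello"), "\en";          # array context: prints "hello"

.fi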
3322 .Ip "rewinddir(DIRHANDLE)" 8 5
3323 .Ip "rewinddir DIRHANDLE" 8
3324 Sets the current position to the beginning of the directory for the readdir() routine on DIRHANDLE.
3325 .Ip "rindex(STR,SUBSTR,POSITION)" 8 6
3326 .Ip "rindex(STR,SUBSTR)" 8 4
3327 Works just like index except that it
3328 returns the position of the LAST occurrence of SUBSTR in STR.
3329 If POSITION is specified, returns the last occurrence at or before that
3330 position.
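.Sp
A sketch that picks the last component out of a path name, assuming $[ has
its default value of 0:
.nf

        $path = "/usr/local/bin/perl";
        $base = substr($path, rindex($path, "/") + 1);  # "perl"

.fi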
3331 .Ip "rmdir(FILENAME)" 8 4
3332 .Ip "rmdir FILENAME" 8
3333 Deletes the directory specified by FILENAME if it is empty.
3334 If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno).
3335 If FILENAME is omitted, uses $_.
3336 .Ip "s/PATTERN/REPLACEMENT/gieo" 8 3
3337 Searches a string for a pattern, and if found, replaces that pattern with the
3338 replacement text and returns the number of substitutions made.
3339 Otherwise it returns false (0).
3340 The \*(L"g\*(R" is optional, and if present, indicates that all occurrences
3341 of the pattern are to be replaced.
3342 The \*(L"i\*(R" is also optional, and if present, indicates that matching
3343 is to be done in a case-insensitive manner.
3344 The \*(L"e\*(R" is likewise optional, and if present, indicates that
3345 the replacement string is to be evaluated as an expression rather than just
3346 as a double-quoted string.
3347 Any non-alphanumeric delimiter may replace the slashes;
3348 if single quotes are used, no
3349 interpretation is done on the replacement string (the e modifier overrides
3350 this, however); if backquotes are used, the replacement string is a command
3351 to execute whose output will be used as the actual replacement text.
3352 If no string is specified via the =~ or !~ operator,
3353 the $_ string is searched and modified.
3354 (The string specified with =~ must be a scalar variable, an array element,
3355 or an assignment to one of those, i.e. an lvalue.)
3356 If the pattern contains a $ that looks like a variable rather than an
3357 end-of-string test, the variable will be interpolated into the pattern at
3358 run-time.
3359 If you only want the pattern compiled once the first time the variable is
3360 interpolated, add an \*(L"o\*(R" at the end.
3361 If the PATTERN evaluates to a null string, the most recent successful
3362 regular expression is used instead.
3363 See also the section on regular expressions.
3364 Examples:
3365 .nf
3366
3367     s/\|\e\|bgreen\e\|b/mauve/g;                # don't change wintergreen
3368
3369     $path \|=~ \|s|\|/usr/bin|\|/usr/local/bin|;
3370
3371     s/Login: $foo/Login: $bar/; # run-time pattern
3372
3373     ($foo = $bar) =~ s/bar/foo/;
3374
3375     $_ = \'abc123xyz\';
3376     s/\ed+/$&*2/e;              # yields \*(L'abc246xyz\*(R'
3377     s/\ed+/sprintf("%5d",$&)/e; # yields \*(L'abc  246xyz\*(R'
3378     s/\ew/$& x 2/eg;            # yields \*(L'aabbcc  224466xxyyzz\*(R'
3379
3380     s/\|([^ \|]*\|) *\|([^ \|]*\|)\|/\|$2 $1/;  # reverse 1st two fields
3381
3382 .fi
3383 (Note the use of $ instead of \|\e\| in the last example.  See section
3384 on regular expressions.)
3385 .Ip "scalar(EXPR)" 8 3
3386 Forces EXPR to be interpreted in a scalar context and returns the value
3387 of EXPR.
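.Sp
For instance, an array evaluated in a scalar context yields its length, so
the following sketch (with a made-up array @list) prints a count rather than
the elements:
.nf

        @list = ('a', 'b', 'c');
        $count = scalar(@list);                 # 3
        print "Items: ", scalar(@list), "\en";  # prints "Items: 3"

.fi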
3388 .Ip "seek(FILEHANDLE,POSITION,WHENCE)" 8 3
3389 Randomly positions the file pointer for FILEHANDLE, just like the fseek()
3390 call of stdio.
3391 FILEHANDLE may be an expression whose value gives the name of the filehandle.
3392 Returns 1 upon success, 0 otherwise.
3393 .Ip "seekdir(DIRHANDLE,POS)" 8 3
3394 Sets the current position for the readdir() routine on DIRHANDLE.
3395 POS must be a value returned by telldir().
3396 Has the same caveats about possible directory compaction as the corresponding
3397 system library routine.
3398 .Ip "select(FILEHANDLE)" 8 3
3399 .Ip "select" 8 3
3400 Returns the currently selected filehandle.
3401 Sets the current default filehandle for output, if FILEHANDLE is supplied.
3402 This has two effects: first, a
3403 .I write
3404 or a
3405 .I print
3406 without a filehandle will default to this FILEHANDLE.
3407 Second, references to variables related to output will refer to this output
3408 channel.
3409 For example, if you have to set the top of form format for more than
3410 one output channel, you might do the following:
3411 .nf
3412
3413 .ne 4
3414         select(REPORT1);
3415         $^ = \'report1_top\';
3416         select(REPORT2);
3417         $^ = \'report2_top\';
3418
3419 .fi
3420 FILEHANDLE may be an expression whose value gives the name of the actual filehandle.
3421 Thus:
3422 .nf
3423
3424         $oldfh = select(STDERR); $| = 1; select($oldfh);
3425
3426 .fi
3427 .Ip "select(RBITS,WBITS,EBITS,TIMEOUT)" 8 3
3428 This calls the select system call with the bitmasks specified, which can
3429 be constructed using fileno() and vec(), along these lines:
3430 .nf
3431
3432         $rin = $win = $ein = '';
3433         vec($rin,fileno(STDIN),1) = 1;
3434         vec($win,fileno(STDOUT),1) = 1;
3435         $ein = $rin | $win;
3436
3437 .fi
3438 If you want to select on many filehandles you might wish to write a subroutine:
3439 .nf
3440
3441         sub fhbits {
3442             local(@fhlist) = split(' ',$_[0]);
3443             local($bits);
3444             for (@fhlist) {
3445                 vec($bits,fileno($_),1) = 1;
3446             }
3447             $bits;
3448         }
3449         $rin = &fhbits('STDIN TTY SOCK');
3450
3451 .fi
3452 The usual idiom is:
3453 .nf
3454
3455         ($nfound,$timeleft) =
3456           select($rout=$rin, $wout=$win, $eout=$ein, $timeout);
3457
3458 or to block until something becomes ready:
3459
3460 .ie t \{\
3461         $nfound = select($rout=$rin, $wout=$win, $eout=$ein, undef);
3462 'br\}
3463 .el \{\
3464         $nfound = select($rout=$rin, $wout=$win,
3465                                 $eout=$ein, undef);
3466 'br\}
3467
3468 .fi
3469 Any of the bitmasks can also be undef.
3470 The timeout, if specified, is in seconds, which may be fractional.
3471 NOTE: not all implementations are capable of returning the $timeleft.
3472 If not, they always return $timeleft equal to the supplied $timeout.
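.Sp
Because the bitmasks may be undef and the timeout may be fractional, one
possible use is a pause finer than one second.
A sketch, assuming the underlying select call honors fractional timeouts:
.nf

        # pause for about 250 milliseconds; nothing can become ready,
        # so the call returns 0 when the timeout expires
        $nfound = select(undef, undef, undef, 0.25);

.fi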
3473 .Ip "semctl(ID,SEMNUM,CMD,ARG)" 8 4
3474 Calls the System V IPC function semctl.  If CMD is &IPC_STAT or
3475 &GETALL, then ARG must be a variable which will hold the returned
3476 semid_ds structure or semaphore value array.  Returns like ioctl: the
3477 undefined value for error, "0 but true" for zero, or the actual return
3478 value otherwise.
.Ip "semget(KEY,NSEMS,FLAGS)" 8 4
3480 Calls the System V IPC function semget.  Returns the semaphore id, or
3481 the undefined value if there is an error.
3482 .Ip "semop(KEY,OPSTRING)" 8 4
3483 Calls the System V IPC function semop to perform semaphore operations
3484 such as signaling and waiting.  OPSTRING must be a packed array of
3485 semop structures.  Each semop structure can be generated with
3486 \&'pack("sss", $semnum, $semop, $semflag)'.  The number of semaphore
3487 operations is implied by the length of OPSTRING.  Returns true if
3488 successful, or false if there is an error.  As an example, the
3489 following code waits on semaphore $semnum of semaphore id $semid:
3490 .nf
3491
3492         $semop = pack("sss", $semnum, -1, 0);
3493         die "Semaphore trouble: $!\en" unless semop($semid, $semop);
3494
3495 .fi
3496 To signal the semaphore, replace "-1" with "1".
3497 .Ip "send(SOCKET,MSG,FLAGS,TO)" 8 4
3498 .Ip "send(SOCKET,MSG,FLAGS)" 8
3499 Sends a message on a socket.
3500 Takes the same flags as the system call of the same name.
3501 On unconnected sockets you must specify a destination to send TO.
3502 Returns the number of characters sent, or the undefined value if
3503 there is an error.
3504 .Ip "setpgrp(PID,PGRP)" 8 4
3505 Sets the current process group for the specified PID, 0 for the current
3506 process.
3507 Will produce a fatal error if used on a machine that doesn't implement
3508 setpgrp(2).
3509 .Ip "setpriority(WHICH,WHO,PRIORITY)" 8 4
3510 Sets the current priority for a process, a process group, or a user.
3511 (See setpriority(2).)
3512 Will produce a fatal error if used on a machine that doesn't implement
3513 setpriority(2).
3514 .Ip "setsockopt(SOCKET,LEVEL,OPTNAME,OPTVAL)" 8 3
3515 Sets the socket option requested.
3516 Returns undefined if there is an error.
3517 OPTVAL may be specified as undef if you don't want to pass an argument.
3518 .Ip "shift(ARRAY)" 8 6
3519 .Ip "shift ARRAY" 8
3520 .Ip "shift" 8
3521 Shifts the first value of the array off and returns it,
3522 shortening the array by 1 and moving everything down.
3523 If there are no elements in the array, returns the undefined value.
3524 If ARRAY is omitted, shifts the @ARGV array in the main program, and the @_
3525 array in subroutines.
3526 (This is determined lexically.)
3527 See also unshift(), push() and pop().
Shift() and unshift() do the same thing to the left end of an array that pop()
and push() do to the right end.
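.Sp
A short sketch of the symmetry between the four operators, using a throwaway
array @q:
.nf

        @q = (1, 2, 3);
        $first = shift(@q);             # $first is 1, @q is now (2, 3)
        unshift(@q, 0);                 # @q is now (0, 2, 3)
        $last = pop(@q);                # $last is 3, @q is now (0, 2)
        push(@q, 9);                    # @q is now (0, 2, 9)

.fi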
3530 .Ip "shmctl(ID,CMD,ARG)" 8 4
3531 Calls the System V IPC function shmctl.  If CMD is &IPC_STAT, then ARG
3532 must be a variable which will hold the returned shmid_ds structure.
3533 Returns like ioctl: the undefined value for error, "0 but true" for
3534 zero, or the actual return value otherwise.
3535 .Ip "shmget(KEY,SIZE,FLAGS)" 8 4
3536 Calls the System V IPC function shmget.  Returns the shared memory
3537 segment id, or the undefined value if there is an error.
3538 .Ip "shmread(ID,VAR,POS,SIZE)" 8 4
3539 .Ip "shmwrite(ID,STRING,POS,SIZE)" 8
3540 Reads or writes the System V shared memory segment ID starting at
3541 position POS for size SIZE by attaching to it, copying in/out, and
3542 detaching from it.  When reading, VAR must be a variable which
3543 will hold the data read.  When writing, if STRING is too long,
3544 only SIZE bytes are used; if STRING is too short, nulls are
written to fill out SIZE bytes.  Returns true if successful, or
3546 false if there is an error.
3547 .Ip "shutdown(SOCKET,HOW)" 8 3
3548 Shuts down a socket connection in the manner indicated by HOW, which has
3549 the same interpretation as in the system call of the same name.
3550 .Ip "sin(EXPR)" 8 4
3551 .Ip "sin EXPR" 8
3552 Returns the sine of EXPR (expressed in radians).
3553 If EXPR is omitted, returns sine of $_.
3554 .Ip "sleep(EXPR)" 8 6
3555 .Ip "sleep EXPR" 8
3556 .Ip "sleep" 8
3557 Causes the script to sleep for EXPR seconds, or forever if no EXPR.
May be interrupted by sending the process a SIGALRM.
3559 Returns the number of seconds actually slept.
3560 .Ip "socket(SOCKET,DOMAIN,TYPE,PROTOCOL)" 8 3
3561 Opens a socket of the specified kind and attaches it to filehandle SOCKET.
3562 DOMAIN, TYPE and PROTOCOL are specified the same as for the system call
3563 of the same name.
3564 You may need to run h2ph on sys/socket.h to get the proper values handy
3565 in a perl library file.
Returns true if successful.
3567 See the example in the section on Interprocess Communication.
3568 .Ip "socketpair(SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL)" 8 3
3569 Creates an unnamed pair of sockets in the specified domain, of the specified
3570 type.
3571 DOMAIN, TYPE and PROTOCOL are specified the same as for the system call
3572 of the same name.
3573 If unimplemented, yields a fatal error.
Returns true if successful.
3575 .Ip "sort(SUBROUTINE LIST)" 8 9
3576 .Ip "sort(LIST)" 8
3577 .Ip "sort SUBROUTINE LIST" 8
3578 .Ip "sort LIST" 8
3579 Sorts the LIST and returns the sorted array value.
3580 Nonexistent values of arrays are stripped out.
3581 If SUBROUTINE is omitted, sorts in standard string comparison order.
3582 If SUBROUTINE is specified, gives the name of a subroutine that returns
3583 an integer less than, equal to, or greater than 0,
3584 depending on how the elements of the array are to be ordered.
3585 (The <=> and cmp operators are extremely useful in such routines.)
3586 In the interests of efficiency the normal calling code for subroutines
3587 is bypassed, with the following effects: the subroutine may not be a recursive
3588 subroutine, and the two elements to be compared are passed into the subroutine
3589 not via @_ but as $a and $b (see example below).
3590 They are passed by reference so don't modify $a and $b.
3591 SUBROUTINE may be a scalar variable name, in which case the value provides
3592 the name of the subroutine to use.
3593 Examples:
3594 .nf
3595
3596 .ne 4
3597         sub byage {
3598             $age{$a} <=> $age{$b};      # presuming integers
3599         }
3600         @sortedclass = sort byage @class;
3601
3602 .ne 9
3603         sub reverse { $b cmp $a; }
3604         @harry = (\'dog\',\'cat\',\'x\',\'Cain\',\'Abel\');
3605         @george = (\'gone\',\'chased\',\'yz\',\'Punished\',\'Axed\');
3606         print sort @harry;
3607                 # prints AbelCaincatdogx
3608         print sort reverse @harry;
3609                 # prints xdogcatCainAbel
3610         print sort @george, \'to\', @harry;
3611                 # prints AbelAxedCainPunishedcatchaseddoggonetoxyz
3612
3613 .fi
3614 .Ip "splice(ARRAY,OFFSET,LENGTH,LIST)" 8 8
3615 .Ip "splice(ARRAY,OFFSET,LENGTH)" 8
3616 .Ip "splice(ARRAY,OFFSET)" 8
3617 Removes the elements designated by OFFSET and LENGTH from an array, and
3618 replaces them with the elements of LIST, if any.
3619 Returns the elements removed from the array.
3620 The array grows or shrinks as necessary.
3621 If LENGTH is omitted, removes everything from OFFSET onward.
3622 The following equivalencies hold (assuming $[ == 0):
3623 .nf
3624
3625         push(@a,$x,$y)\h'|3.5i'splice(@a,$#a+1,0,$x,$y)
3626         pop(@a)\h'|3.5i'splice(@a,-1)
3627         shift(@a)\h'|3.5i'splice(@a,0,1)
3628         unshift(@a,$x,$y)\h'|3.5i'splice(@a,0,0,$x,$y)
3629         $a[$x] = $y\h'|3.5i'splice(@a,$x,1,$y);
3630
3631 Example, assuming array lengths are passed before arrays:
3632
3633         sub aeq {       # compare two array values
3634                 local(@a) = splice(@_,0,shift);
3635                 local(@b) = splice(@_,0,shift);
3636                 return 0 unless @a == @b;       # same len?
3637                 while (@a) {
3638                     return 0 if pop(@a) ne pop(@b);
3639                 }
3640                 return 1;
3641         }
3642         if (&aeq($len,@foo[1..$len],0+@bar,@bar)) { ... }
3643
3644 .fi
3645 .Ip "split(/PATTERN/,EXPR,LIMIT)" 8 8
3646 .Ip "split(/PATTERN/,EXPR)" 8 8
3647 .Ip "split(/PATTERN/)" 8
3648 .Ip "split" 8
3649 Splits a string into an array of strings, and returns it.
3650 (If not in an array context, returns the number of fields found and splits
3651 into the @_ array.
3652 (In an array context, you can force the split into @_
3653 by using ?? as the pattern delimiters, but it still returns the array value.))
3654 If EXPR is omitted, splits the $_ string.
3655 If PATTERN is also omitted, splits on whitespace (/[\ \et\en]+/).
3656 Anything matching PATTERN is taken to be a delimiter separating the fields.
3657 (Note that the delimiter may be longer than one character.)
3658 If LIMIT is specified, splits into no more than that many fields (though it
3659 may split into fewer).
3660 If LIMIT is unspecified, trailing null fields are stripped (which
3661 potential users of pop() would do well to remember).
3662 A pattern matching the null string (not to be confused with a null pattern //,
3663 which is just one member of the set of patterns matching a null string)
3664 will split the value of EXPR into separate characters at each point it
3665 matches that way.
3666 For example:
3667 .nf
3668
3669         print join(\':\', split(/ */, \'hi there\'));
3670
3671 .fi
3672 produces the output \*(L'h:i:t:h:e:r:e\*(R'.
3673 .Sp
The LIMIT parameter can be used to partially split a line:
3675 .nf
3676
3677         ($login, $passwd, $remainder) = split(\|/\|:\|/\|, $_, 3);
3678
3679 .fi
3680 (When assigning to a list, if LIMIT is omitted, perl supplies a LIMIT one
3681 larger than the number of variables in the list, to avoid unnecessary work.
3682 For the list above LIMIT would have been 4 by default.
3683 In time critical applications it behooves you not to split into
3684 more fields than you really need.)
3685 .Sp
3686 If the PATTERN contains parentheses, additional array elements are created
3687 from each matching substring in the delimiter.
3688 .Sp
3689         split(/([,-])/,"1-10,20");
3690 .Sp
3691 produces the array value
3692 .Sp
3693         (1,'-',10,',',20)
3694 .Sp
3695 The pattern /PATTERN/ may be replaced with an expression to specify patterns
3696 that vary at runtime.
3697 (To do runtime compilation only once, use /$variable/o.)
3698 As a special case, specifying a space (\'\ \') will split on white space
3699 just as split with no arguments does, but leading white space does NOT
3700 produce a null first field.
3701 Thus, split(\'\ \') can be used to emulate
3702 .IR awk 's
3703 default behavior, whereas
3704 split(/\ /) will give you as many null initial fields as there are
3705 leading spaces.
3706 .Sp
3707 Example:
3708 .nf
3709
3710 .ne 5
3711         open(passwd, \'/etc/passwd\');
3712         while (<passwd>) {
3713 .ie t \{\
3714                 ($login, $passwd, $uid, $gid, $gcos, $home, $shell) = split(\|/\|:\|/\|);
3715 'br\}
3716 .el \{\
3717                 ($login, $passwd, $uid, $gid, $gcos, $home, $shell)
3718                         = split(\|/\|:\|/\|);
3719 'br\}
3720                 .\|.\|.
3721         }
3722
3723 .fi
3724 (Note that $shell above will still have a newline on it.  See chop().)
3725 See also
3726 .IR join .
3727 .Ip "sprintf(FORMAT,LIST)" 8 4
3728 Returns a string formatted by the usual printf conventions.
3729 The * character is not supported.
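.Sp
A few common conversions, as an illustrative sketch (the values and variable
names are arbitrary):
.nf

        $padded = sprintf("%05d", 42);          # "00042"
        $left = sprintf("%-6s|", 'ab');         # "ab    |"
        $fixed = sprintf("%.2f", 3.14159);      # "3.14"
        $hex = sprintf("%x", 255);              # "ff"

.fi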
3730 .Ip "sqrt(EXPR)" 8 4
3731 .Ip "sqrt EXPR" 8
Returns the square root of EXPR.
3733 If EXPR is omitted, returns square root of $_.
3734 .Ip "srand(EXPR)" 8 4
3735 .Ip "srand EXPR" 8
3736 Sets the random number seed for the
3737 .I rand
3738 operator.
3739 If EXPR is omitted, does srand(time).
3740 .Ip "stat(FILEHANDLE)" 8 8
3741 .Ip "stat FILEHANDLE" 8
3742 .Ip "stat(EXPR)" 8
3743 .Ip "stat SCALARVARIABLE" 8
3744 Returns a 13-element array giving the statistics for a file, either the file
3745 opened via FILEHANDLE, or named by EXPR.
3746 Typically used as follows:
3747 .nf
3748
3749 .ne 3
3750     ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
3751        $atime,$mtime,$ctime,$blksize,$blocks)
3752            = stat($filename);
3753
3754 .fi
3755 If stat is passed the special filehandle consisting of an underline,
3756 no stat is done, but the current contents of the stat structure from
3757 the last stat or filetest are returned.
3758 Example:
3759 .nf
3760
3761 .ne 3
3762         if (-x $file && (($d) = stat(_)) && $d < 0) {
3763                 print "$file is executable NFS file\en";
3764         }
3765
3766 .fi
3767 (This only works on machines for which the device number is negative under NFS.)
3768 .Ip "study(SCALAR)" 8 6
3769 .Ip "study SCALAR" 8
.Ip "study" 8
3771 Takes extra time to study SCALAR ($_ if unspecified) in anticipation of
3772 doing many pattern matches on the string before it is next modified.
3773 This may or may not save time, depending on the nature and number of patterns
3774 you are searching on, and on the distribution of character frequencies in
3775 the string to be searched\*(--you probably want to compare runtimes with and
3776 without it to see which runs faster.
3777 Those loops which scan for many short constant strings (including the constant
3778 parts of more complex patterns) will benefit most.
3779 You may have only one study active at a time\*(--if you study a different
3780 scalar the first is \*(L"unstudied\*(R".
3781 (The way study works is this: a linked list of every character in the string
3782 to be searched is made, so we know, for example, where all the \*(L'k\*(R' characters
3783 are.
3784 From each search string, the rarest character is selected, based on some
3785 static frequency tables constructed from some C programs and English text.
3786 Only those places that contain this \*(L"rarest\*(R" character are examined.)
3787 .Sp
For example, here is a loop which inserts index-producing entries before any line
3789 containing a certain pattern:
3790 .nf
3791
3792 .ne 8
3793         while (<>) {
3794                 study;
3795                 print ".IX foo\en" if /\ebfoo\eb/;
3796                 print ".IX bar\en" if /\ebbar\eb/;
3797                 print ".IX blurfl\en" if /\ebblurfl\eb/;
3798                 .\|.\|.
3799                 print;
3800         }
3801
3802 .fi
3803 In searching for /\ebfoo\eb/, only those locations in $_ that contain \*(L'f\*(R'
3804 will be looked at, because \*(L'f\*(R' is rarer than \*(L'o\*(R'.
3805 In general, this is a big win except in pathological cases.
3806 The only question is whether it saves you more time than it took to build
3807 the linked list in the first place.
3808 .Sp
3809 Note that if you have to look for strings that you don't know till runtime,
3810 you can build an entire loop as a string and eval that to avoid recompiling
3811 all your patterns all the time.
3812 Together with undefining $/ to input entire files as one record, this can
3813 be very fast, often faster than specialized programs like fgrep.
3814 The following scans a list of files (@files)
3815 for a list of words (@words), and prints out the names of those files that
3816 contain a match:
3817 .nf
3818
3819 .ne 12
3820         $search = \'while (<>) { study;\';
3821         foreach $word (@words) {
            $search .= "++\e$seen{\e$ARGV} if /\e\eb$word\e\eb/;\en";
3823         }
3824         $search .= "}";
3825         @ARGV = @files;
3826         undef $/;
3827         eval $search;           # this screams
3828         $/ = "\en";             # put back to normal input delim
3829         foreach $file (sort keys(%seen)) {
3830             print $file, "\en";
3831         }
3832
3833 .fi
3834 .Ip "substr(EXPR,OFFSET,LEN)" 8 2
3835 .Ip "substr(EXPR,OFFSET)" 8 2
3836 Extracts a substring out of EXPR and returns it.
3837 First character is at offset 0, or whatever you've set $[ to.
3838 If OFFSET is negative, starts that far from the end of the string.
3839 If LEN is omitted, returns everything to the end of the string.
3840 You can use the substr() function as an lvalue, in which case EXPR must
3841 be an lvalue.
3842 If you assign something shorter than LEN, the string will shrink, and
3843 if you assign something longer than LEN, the string will grow to accommodate it.
3844 To keep the string the same length you may need to pad or chop your value using
3845 sprintf().
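.Sp
A sketch of both uses on an illustrative string (assuming $[ is 0):
.nf

        $s = 'Hello, world';
        $w = substr($s, 7);             # "world"
        $e = substr($s, -5, 3);         # "wor"
        substr($s, 0, 5) = 'Howdy';     # same length: "Howdy, world"
        substr($s, 0, 5) = 'Hi';        # shorter: shrinks to "Hi, world"

.fi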
3846 .Ip "symlink(OLDFILE,NEWFILE)" 8 2
3847 Creates a new filename symbolically linked to the old filename.
3848 Returns 1 for success, 0 otherwise.
3849 On systems that don't support symbolic links, produces a fatal error at
3850 run time.
3851 To check for that, use eval:
3852 .nf
3853
3854         $symlink_exists = (eval \'symlink("","");\', $@ eq \'\');
3855
3856 .fi
3857 .Ip "syscall(LIST)" 8 6
3858 .Ip "syscall LIST" 8
3859 Calls the system call specified as the first element of the list, passing
3860 the remaining elements as arguments to the system call.
3861 If unimplemented, produces a fatal error.
3862 The arguments are interpreted as follows: if a given argument is numeric,
3863 the argument is passed as an int.
3864 If not, the pointer to the string value is passed.
You are responsible for making sure a string is pre-extended long enough
3866 to receive any result that might be written into a string.
3867 If your integer arguments are not literals and have never been interpreted
3868 in a numeric context, you may need to add 0 to them to force them to look
3869 like numbers.
3870 .nf
3871
3872         require 'syscall.ph';           # may need to run h2ph
3873         syscall(&SYS_write, fileno(STDOUT), "hi there\en", 9);
3874
3875 .fi
3876 .Ip "sysread(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
3877 .Ip "sysread(FILEHANDLE,SCALAR,LENGTH)" 8 5
3878 Attempts to read LENGTH bytes of data into variable SCALAR from the specified
3879 FILEHANDLE, using the system call read(2).
3880 It bypasses stdio, so mixing this with other kinds of reads may cause
3881 confusion.
3882 Returns the number of bytes actually read, or undef if there was an error.
3883 SCALAR will be grown or shrunk to the length actually read.
3884 An OFFSET may be specified to place the read data at some other place
3885 than the beginning of the string.
3886 .Ip "system(LIST)" 8 6
3887 .Ip "system LIST" 8
3888 Does exactly the same thing as \*(L"exec LIST\*(R" except that a fork
3889 is done first, and the parent process waits for the child process to complete.
3890 Note that argument processing varies depending on the number of arguments.
3891 The return value is the exit status of the program as returned by the wait()
3892 call.
3893 To get the actual exit value divide by 256.
3894 See also
3895 .IR exec .
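.Sp
As a sketch of decoding the status (the command is only an example, and a
Bourne shell at /bin/sh is assumed; the low-order bits follow the usual
wait() layout rather than anything specific to perl):
.nf

        $status = system('/bin/sh', '-c', 'exit 3');
        $exit = $status >> 8;           # same as int($status / 256); here 3
        $signal = $status & 127;        # nonzero if a signal killed the child

.fi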
3896 .Ip "syswrite(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
3897 .Ip "syswrite(FILEHANDLE,SCALAR,LENGTH)" 8 5
3898 Attempts to write LENGTH bytes of data from variable SCALAR to the specified
3899 FILEHANDLE, using the system call write(2).
3900 It bypasses stdio, so mixing this with prints may cause
3901 confusion.
3902 Returns the number of bytes actually written, or undef if there was an error.
An OFFSET may be specified to take the data to be written from some place
other than the beginning of the string.
3905 .Ip "tell(FILEHANDLE)" 8 6
3906 .Ip "tell FILEHANDLE" 8 6
3907 .Ip "tell" 8
3908 Returns the current file position for FILEHANDLE.
3909 FILEHANDLE may be an expression whose value gives the name of the actual
3910 filehandle.
3911 If FILEHANDLE is omitted, assumes the file last read.
3912 .Ip "telldir(DIRHANDLE)" 8 5
3913 .Ip "telldir DIRHANDLE" 8
3914 Returns the current position of the readdir() routines on DIRHANDLE.
3915 Value may be given to seekdir() to access a particular location in
3916 a directory.
3917 Has the same caveats about possible directory compaction as the corresponding
3918 system library routine.
3919 .Ip "time" 8 4
3920 Returns the number of non-leap seconds since 00:00:00 UTC, January 1, 1970.
3921 Suitable for feeding to gmtime() and localtime().
3922 .Ip "times" 8 4
3923 Returns a four-element array giving the user and system times, in seconds, for this
3924 process and the children of this process.
3925 .Sp
3926     ($user,$system,$cuser,$csystem) = times;
3927 .Sp
3928 .Ip "tr/SEARCHLIST/REPLACEMENTLIST/cds" 8 5
3929 .Ip "y/SEARCHLIST/REPLACEMENTLIST/cds" 8
3930 Translates all occurrences of the characters found in the search list with
3931 the corresponding character in the replacement list.
3932 It returns the number of characters replaced or deleted.
3933 If no string is specified via the =~ or !~ operator,
3934 the $_ string is translated.
3935 (The string specified with =~ must be a scalar variable, an array element,
3936 or an assignment to one of those, i.e. an lvalue.)
3937 For
3938 .I sed
3939 devotees,
3940 .I y
3941 is provided as a synonym for
3942 .IR tr .
3943 .Sp
3944 If the c modifier is specified, the SEARCHLIST character set is complemented.
3945 If the d modifier is specified, any characters specified by SEARCHLIST that
3946 are not found in REPLACEMENTLIST are deleted.
3947 (Note that this is slightly more flexible than the behavior of some
3948 .I tr
3949 programs, which delete anything they find in the SEARCHLIST, period.)
3950 If the s modifier is specified, sequences of characters that were translated
3951 to the same character are squashed down to 1 instance of the character.
3952 .Sp
3953 If the d modifier was used, the REPLACEMENTLIST is always interpreted exactly
3954 as specified.
3955 Otherwise, if the REPLACEMENTLIST is shorter than the SEARCHLIST,
3956 the final character is replicated till it is long enough.
3957 If the REPLACEMENTLIST is null, the SEARCHLIST is replicated.
3958 This latter is useful for counting characters in a class, or for squashing
3959 character sequences in a class.
3960 .Sp
3961 Examples:
3962 .nf
3963
3964     $ARGV[1] \|=~ \|y/A\-Z/a\-z/;       \h'|3i'# canonicalize to lower case
3965
3966     $cnt = tr/*/*/;             \h'|3i'# count the stars in $_
3967
3968     $cnt = tr/0\-9//;           \h'|3i'# count the digits in $_
3969
3970     tr/a\-zA\-Z//s;     \h'|3i'# bookkeeper \-> bokeper
3971
3972     ($HOST = $host) =~ tr/a\-z/A\-Z/;
3973
3974     y/a\-zA\-Z/ /cs;    \h'|3i'# change non-alphas to single space
3975
3976     tr/\e200\-\e377/\e0\-\e177/;\h'|3i'# delete 8th bit
3977
3978 .fi
3979 .Ip "truncate(FILEHANDLE,LENGTH)" 8 4
3980 .Ip "truncate(EXPR,LENGTH)" 8
3981 Truncates the file opened on FILEHANDLE, or named by EXPR, to the specified
3982 length.
3983 Produces a fatal error if truncate isn't implemented on your system.
3984 .Ip "umask(EXPR)" 8 4
3985 .Ip "umask EXPR" 8
3986 .Ip "umask" 8
3987 Sets the umask for the process and returns the old one.
3988 If EXPR is omitted, merely returns current umask.
3989 .Ip "undef(EXPR)" 8 6
3990 .Ip "undef EXPR" 8
3991 .Ip "undef" 8
3992 Undefines the value of EXPR, which must be an lvalue.
3993 Use only on a scalar value, an entire array, or a subroutine name (using &).
3994 (Undef will probably not do what you expect on most predefined variables or
3995 dbm array values.)
3996 Always returns the undefined value.
3997 You can omit the EXPR, in which case nothing is undefined, but you still
3998 get an undefined value that you could, for instance, return from a subroutine.
3999 Examples:
4000 .nf
4001
4002 .ne 6
4003         undef $foo;
4004         undef $bar{'blurfl'};
4005         undef @ary;
4006         undef %assoc;
4007         undef &mysub;
4008         return (wantarray ? () : undef) if $they_blew_it;
4009
4010 .fi
4011 .Ip "unlink(LIST)" 8 4
4012 .Ip "unlink LIST" 8
4013 Deletes a list of files.
4014 Returns the number of files successfully deleted.
4015 .nf
4016
4017 .ne 2
4018         $cnt = unlink \'a\', \'b\', \'c\';
4019         unlink @goners;
4020         unlink <*.bak>;
4021
4022 .fi
4023 Note: unlink will not delete directories unless you are superuser and the
4024 .B \-U
4025 flag is supplied to
4026 .IR perl .
4027 Even if these conditions are met, be warned that unlinking a directory
4028 can inflict damage on your filesystem.
4029 Use rmdir instead.
4030 .Ip "unpack(TEMPLATE,EXPR)" 8 4
4031 Unpack does the reverse of pack: it takes a string representing
4032 a structure and expands it out into an array value, returning the array
4033 value.
4034 (In a scalar context, it merely returns the first value produced.)
4035 The TEMPLATE has the same format as in the pack function.
4036 Here's a subroutine that does substring:
4037 .nf
4038
4039 .ne 4
4040         sub substr {
4041                 local($what,$where,$howmuch) = @_;
4042                 unpack("x$where a$howmuch", $what);
4043         }
4044
4045 .ne 3
4046 and then there's
4047
4048         sub ord { unpack("c",$_[0]); }
4049
4050 .fi
4051 In addition, you may prefix a field with a %<number> to indicate that
4052 you want a <number>-bit checksum of the items instead of the items themselves.
4053 Default is a 16-bit checksum.
4054 For example, the following computes the same number as the System V sum program:
4055 .nf
4056
4057 .ne 4
4058         while (<>) {
4059             $checksum += unpack("%16C*", $_);
4060         }
4061         $checksum %= 65536;
4062
4063 .fi
4064 .Ip "unshift(ARRAY,LIST)" 8 4
4065 Does the opposite of a
4066 .IR shift .
4067 Or the opposite of a
4068 .IR push ,
4069 depending on how you look at it.
4070 Prepends list to the front of the array, and returns the number of elements
4071 in the new array.
4072 .nf
4073
4074         unshift(ARGV, \'\-e\') unless $ARGV[0] =~ /^\-/;
4075
4076 .fi
4077 .Ip "utime(LIST)" 8 2
4078 .Ip "utime LIST" 8 2
4079 Changes the access and modification times on each file of a list of files.
4080 The first two elements of the list must be the NUMERICAL access and
4081 modification times, in that order.
4082 Returns the number of files successfully changed.
4083 The inode modification time of each file is set to the current time.
4084 Example of a \*(L"touch\*(R" command:
4085 .nf
4086
4087 .ne 3
4088         #!/usr/bin/perl
4089         $now = time;
4090         utime $now, $now, @ARGV;
4091
4092 .fi
4093 .Ip "values(ASSOC_ARRAY)" 8 6
4094 .Ip "values ASSOC_ARRAY" 8
4095 Returns a normal array consisting of all the values of the named associative
4096 array.
4097 The values are returned in an apparently random order, but it is the same order
4098 as either the keys() or each() function would produce on the same array.
4099 See also keys() and each().
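.Sp
Because the orders agree, the two lists can be walked in parallel.
A sketch, with a made-up %color array (the order of the output lines is
not predictable):
.nf

        %color = ('apple', 'red', 'banana', 'yellow', 'lime', 'green');
        @k = keys(%color);
        @v = values(%color);
        for ($i = 0; $i <= $#k; $i++) {
            print "$k[$i] is $v[$i]\en";        # e.g. "apple is red"
        }

.fi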
4100 .Ip "vec(EXPR,OFFSET,BITS)" 8 2
4101 Treats a string as a vector of unsigned integers, and returns the value
4102 of the bitfield specified.
4103 May also be assigned to.
4104 BITS must be a power of two from 1 to 32.
4105 .Sp
4106 Vectors created with vec() can also be manipulated with the logical operators
4107 |, & and ^,
4108 which will assume a bit vector operation is desired when both operands are
4109 strings.
4110 This interpretation is not enabled unless there is at least one vec() in
4111 your program, to protect older programs.
4112 .Sp
4113 To transform a bit vector into a string or array of 0's and 1's, use these:
4114 .nf
4115
4116         $bits = unpack("b*", $vector);
4117         @bits = split(//, unpack("b*", $vector));
4118
4119 .fi
4120 If you know the exact length in bits, it can be used in place of the *.
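.Sp
For example, this sketch sets three bits in an initially empty vector and
then renders it (the variable names are illustrative):
.nf

        $vector = '';
        vec($vector, 0, 1) = 1;
        vec($vector, 3, 1) = 1;
        vec($vector, 9, 1) = 1;
        $bits = unpack("b*", $vector);          # "1001000001000000"
        $set = vec($vector, 3, 1);              # 1
        $clear = vec($vector, 4, 1);            # 0

.fi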
4121 .Ip "wait" 8 6
4122 Waits for a child process to terminate and returns the pid of the deceased
4123 process, or -1 if there are no child processes.
4124 The status is returned in $?.
4125 .Ip "waitpid(PID,FLAGS)" 8 6
4126 Waits for a particular child process to terminate and returns the pid of the deceased
4127 process, or -1 if there is no such child process.
4128 The status is returned in $?.
4129 If you say
4130 .nf
4131
        require "sys/wait.ph";
4133         .\|.\|.
4134         waitpid(-1,&WNOHANG);
4135
4136 .fi
4137 then you can do a non-blocking wait for any process.  Non-blocking wait
4138 is only available on machines supporting either the
.IR waitpid (2)
or
.IR wait4 (2)
4142 system calls.
4143 However, waiting for a particular pid with FLAGS of 0 is implemented
4144 everywhere.  (Perl emulates the system call by remembering the status
4145 values of processes that have exited but have not been harvested by the
4146 Perl script yet.)
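.Sp
A minimal sketch of the portable case, waiting for one particular child
with FLAGS of 0.
The exit status 3 is arbitrary, and the variable names are illustrative:

```perl
if ($pid = fork) {
    # Parent: block until this particular child is done.
    $reaped = waitpid($pid, 0);
    $status = $? >> 8;           # the exit value is in the high byte of $?
    print "child $reaped exited with status $status\n";
}
elsif (defined($pid)) {
    # Child: exit immediately with an arbitrary status.
    exit 3;
}
else {
    die "fork: $!";
}
```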
4147 .Ip "wantarray" 8 4
4148 Returns true if the context of the currently executing subroutine
4149 is looking for an array value.
4150 Returns false if the context is looking for a scalar.
4151 .nf
4152
4153         return wantarray ? () : undef;
4154
4155 .fi
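.Sp
As a sketch of how a subroutine might use this (the subroutine name and
its colon-separated input are hypothetical), the same call can hand back
a list in an array context and a count in a scalar context:

```perl
sub fields {
    local(@f) = split(/:/, $_[0]);
    return wantarray ? @f : scalar(@f);
}

@parts = &fields('root:0:1');    # array context: ('root', '0', '1')
$count = &fields('root:0:1');    # scalar context: 3
```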
4156 .Ip "warn(LIST)" 8 4
4157 .Ip "warn LIST" 8
4158 Produces a message on STDERR just like \*(L"die\*(R", but doesn't exit.
4159 .Ip "write(FILEHANDLE)" 8 6
4160 .Ip "write(EXPR)" 8
4161 .Ip "write" 8
4162 Writes a formatted record (possibly multi-line) to the specified file,
4163 using the format associated with that file.
By default the format for a file is the one having the same name as the
4165 filehandle, but the format for the current output channel (see
4166 .IR select )
4167 may be set explicitly
4168 by assigning the name of the format to the $~ variable.
4169 .Sp
4170 Top of form processing is handled automatically:
4171 if there is insufficient room on the current page for the formatted
4172 record, the page is advanced by writing a form feed,
4173 a special top-of-page format is used
4174 to format the new page header, and then the record is written.
By default the top-of-page format is the name of the filehandle with
\*(L"_TOP\*(R" appended, but it may be set to the
format of your choice by assigning the name to the $^ variable.
4178 The number of lines remaining on the current page is in variable $-, which
4179 can be set to 0 to force a new page.
4180 .Sp
4181 If FILEHANDLE is unspecified, output goes to the current default output channel,
4182 which starts out as
4183 .I STDOUT
4184 but may be changed by the
4185 .I select
4186 operator.
4187 If the FILEHANDLE is an EXPR, then the expression is evaluated and the
4188 resulting string is used to look up the name of the FILEHANDLE at run time.
4189 For more on formats, see the section on formats later on.
4190 .Sp
4191 Note that write is NOT the opposite of read.
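.Sp
A small sketch of selecting a format by name.
The format SUMMARY and the variables $name and $count are hypothetical;
note that the period ending the format must stand alone at the start of
its line:

```perl
format SUMMARY =
@<<<<<<<<< @>>>>
$name,     $count
.

$~ = 'SUMMARY';                  # use SUMMARY rather than the format named STDOUT

($name, $count) = ('apples', 12);
write;                           # one record on the current output channel
($name, $count) = ('pears', 7);
write;
```

Since no top-of-page format is defined here, no page header is printed.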
4192 .Sh "Precedence"
4193 .I Perl
4194 operators have the following associativity and precedence:
4195 .nf
4196
4197 nonassoc\h'|1i'print printf exec system sort reverse
4198 \h'1.5i'chmod chown kill unlink utime die return
4199 left\h'|1i',
4200 right\h'|1i'= += \-= *= etc.
4201 right\h'|1i'?:
4202 nonassoc\h'|1i'.\|.
4203 left\h'|1i'||
4204 left\h'|1i'&&
4205 left\h'|1i'| ^
4206 left\h'|1i'&
4207 nonassoc\h'|1i'== != <=> eq ne cmp
4208 nonassoc\h'|1i'< > <= >= lt gt le ge
4209 nonassoc\h'|1i'chdir exit eval reset sleep rand umask
4210 nonassoc\h'|1i'\-r \-w \-x etc.
4211 left\h'|1i'<< >>
4212 left\h'|1i'+ \- .
4213 left\h'|1i'* / % x
4214 left\h'|1i'=~ !~
4215 right\h'|1i'! ~ and unary minus
4216 right\h'|1i'**
4217 nonassoc\h'|1i'++ \-\|\-
4218 left\h'|1i'\*(L'(\*(R'
4219
4220 .fi
4221 As mentioned earlier, if any list operator (print, etc.) or
4222 any unary operator (chdir, etc.)
4223 is followed by a left parenthesis as the next token on the same line,
4224 the operator and arguments within parentheses are taken to
4225 be of highest precedence, just like a normal function call.
4226 Examples:
4227 .nf
4228
4229         chdir $foo || die;\h'|3i'# (chdir $foo) || die
4230         chdir($foo) || die;\h'|3i'# (chdir $foo) || die
4231         chdir ($foo) || die;\h'|3i'# (chdir $foo) || die
4232         chdir +($foo) || die;\h'|3i'# (chdir $foo) || die
4233
4234 but, because * is higher precedence than ||:
4235
4236         chdir $foo * 20;\h'|3i'# chdir ($foo * 20)
4237         chdir($foo) * 20;\h'|3i'# (chdir $foo) * 20
4238         chdir ($foo) * 20;\h'|3i'# (chdir $foo) * 20
4239         chdir +($foo) * 20;\h'|3i'# chdir ($foo * 20)
4240
4241         rand 10 * 20;\h'|3i'# rand (10 * 20)
4242         rand(10) * 20;\h'|3i'# (rand 10) * 20
4243         rand (10) * 20;\h'|3i'# (rand 10) * 20
4244         rand +(10) * 20;\h'|3i'# rand (10 * 20)
4245
4246 .fi
4247 In the absence of parentheses,
4248 the precedence of list operators such as print, sort or chmod is
4249 either very high or very low depending on whether you look at the left
side of the operator or the right side of it.
4251 For example, in
4252 .nf
4253
4254         @ary = (1, 3, sort 4, 2);
4255         print @ary;             # prints 1324
4256
4257 .fi
4258 the commas on the right of the sort are evaluated before the sort, but
4259 the commas on the left are evaluated after.
4260 In other words, list operators tend to gobble up all the arguments that
4261 follow them, and then act like a simple term with regard to the preceding
4262 expression.
4263 Note that you have to be careful with parens:
4264 .nf
4265
4266 .ne 3
4267         # These evaluate exit before doing the print:
4268         print($foo, exit);      # Obviously not what you want.
4269         print $foo, exit;       # Nor is this.
4270
4271 .ne 4
4272         # These do the print before evaluating exit:
4273         (print $foo), exit;     # This is what you want.
4274         print($foo), exit;      # Or this.
4275         print ($foo), exit;     # Or even this.
4276
4277 Also note that
4278
4279         print ($foo & 255) + 1, "\en";
4280
4281 .fi
4282 probably doesn't do what you expect at first glance.
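.PP
To spell that out with an illustrative value for $foo: the parentheses
end the argument list of print, so only $foo & 255 is printed, and the
+ 1 and the newline apply to the return value of print and are thrown away.
A sketch of the problem and of one way around it:

```perl
$foo = 511;                          # illustrative value; $foo & 255 is 255

print ($foo & 255) + 1, "\n";        # prints 255 and no newline
print "\n";
print (($foo & 255) + 1, "\n");      # prints 256 and a newline
```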
4283 .Sh "Subroutines"
4284 A subroutine may be declared as follows:
4285 .nf
4286
4287     sub NAME BLOCK
4288
4289 .fi
4290 .PP
4291 Any arguments passed to the routine come in as array @_,
4292 that is ($_[0], $_[1], .\|.\|.).
4293 The array @_ is a local array, but its values are references to the
4294 actual scalar parameters.
4295 The return value of the subroutine is the value of the last expression
4296 evaluated, and can be either an array value or a scalar value.
Alternatively, a return statement may be used to specify the returned value and
4298 exit the subroutine.
4299 To create local variables see the
4300 .I local
4301 operator.
4302 .PP
4303 A subroutine is called using the
4304 .I do
4305 operator or the & operator.
4306 .nf
4307
4308 .ne 12
4309 Example:
4310
4311         sub MAX {
4312                 local($max) = pop(@_);
4313                 foreach $foo (@_) {
4314                         $max = $foo \|if \|$max < $foo;
4315                 }
4316                 $max;
4317         }
4318
4319         .\|.\|.
4320         $bestday = &MAX($mon,$tue,$wed,$thu,$fri);
4321
4322 .ne 21
4323 Example:
4324
4325         # get a line, combining continuation lines
4326         #  that start with whitespace
4327         sub get_line {
4328                 $thisline = $lookahead;
4329                 line: while ($lookahead = <STDIN>) {
4330                         if ($lookahead \|=~ \|/\|^[ \^\e\|t]\|/\|) {
4331                                 $thisline \|.= \|$lookahead;
4332                         }
4333                         else {
4334                                 last line;
4335                         }
4336                 }
4337                 $thisline;
4338         }
4339
4340         $lookahead = <STDIN>;   # get first line
4341         while ($_ = do get_line(\|)) {
4342                 .\|.\|.
4343         }
4344
4345 .fi
4346 .nf
4347 .ne 6
4348 Use array assignment to a local list to name your formal arguments:
4349
4350         sub maybeset {
4351                 local($key, $value) = @_;
4352                 $foo{$key} = $value unless $foo{$key};
4353         }
4354
4355 .fi
4356 This also has the effect of turning call-by-reference into call-by-value,
4357 since the assignment copies the values.
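.PP
The difference can be sketched with two hypothetical subroutines, one
that works on $_[0] directly and one that copies its argument first:

```perl
sub incr {
    $_[0]++;                     # $_[0] is the caller's variable itself
}

sub incr_copy {
    local($n) = @_;              # the assignment copies the value
    $n++;                        # so this changes only the copy
    $n;
}

$count = 5;
&incr($count);                   # $count is now 6
$other = &incr_copy($count);     # $count is still 6, $other is 7
```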
4358 .Sp
4359 Subroutines may be called recursively.
4360 If a subroutine is called using the & form, the argument list is optional.
4361 If omitted, no @_ array is set up for the subroutine; the @_ array at the
time of the call is visible to the subroutine instead.
4363 .nf
4364
4365         do foo(1,2,3);          # pass three arguments
4366         &foo(1,2,3);            # the same
4367
4368         do foo();               # pass a null list
4369         &foo();                 # the same
4370         &foo;                   # pass no arguments\*(--more efficient
4371
4372 .fi
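.PP
A sketch of the last form, using two hypothetical subroutines: because
total is called without an argument list, it works on the @_ of
whoever called it.

```perl
sub total {
    local($sum) = 0;
    foreach $n (@_) {
        $sum += $n;
    }
    $sum;
}

sub total_of_args {
    &total;                      # no argument list: total sees our @_
}

$t = &total_of_args(1, 2, 3);    # 6
```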
4373 .Sh "Passing By Reference"
4374 Sometimes you don't want to pass the value of an array to a subroutine but
4375 rather the name of it, so that the subroutine can modify the global copy
4376 of it rather than working with a local copy.
4377 In perl you can refer to all the objects of a particular name by prefixing
4378 the name with a star: *foo.
4379 When evaluated, it produces a scalar value that represents all the objects
4380 of that name, including any filehandle, format or subroutine.
4381 When assigned to within a local() operation, it causes the name mentioned
4382 to refer to whatever * value was assigned to it.
4383 Example:
4384 .nf
4385
4386         sub doubleary {
4387             local(*someary) = @_;
4388             foreach $elem (@someary) {
4389                 $elem *= 2;
4390             }
4391         }
4392         do doubleary(*foo);
4393         do doubleary(*bar);
4394
4395 .fi
4396 Assignment to *name is currently recommended only inside a local().
4397 You can actually assign to *name anywhere, but the previous referent of
4398 *name may be stranded forever.
4399 This may or may not bother you.
4400 .Sp
4401 Note that scalars are already passed by reference, so you can modify scalar
4402 arguments without using this mechanism by referring explicitly to the $_[nnn]
4403 in question.
4404 You can modify all the elements of an array by passing all the elements
4405 as scalars, but you have to use the * mechanism to push, pop or change the
4406 size of an array.
4407 The * mechanism will probably be more efficient in any case.
4408 .Sp
4409 Since a *name value contains unprintable binary data, if it is used as
4410 an argument in a print, or as a %s argument in a printf or sprintf, it
4411 then has the value '*name', just so it prints out pretty.
4412 .Sp
4413 Even if you don't want to modify an array, this mechanism is useful for
4414 passing multiple arrays in a single LIST, since normally the LIST mechanism
4415 will merge all the array values so that you can't extract out the
4416 individual arrays.
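.PP
For instance, here is a sketch of a subroutine that receives two arrays
and keeps them apart (the names add_pairs, @left and @right are
hypothetical):

```perl
sub add_pairs {
    local(*x, *y) = @_;          # @x and @y now name the caller's arrays
    local(@sum, $i);
    for ($i = 0; $i <= $#x && $i <= $#y; $i++) {
        push(@sum, $x[$i] + $y[$i]);
    }
    @sum;
}

@left  = (1, 2, 3);
@right = (10, 20, 30);
@both  = &add_pairs(*left, *right);      # (11, 22, 33)
```

Had the arrays been passed as &add_pairs(@left, @right), the subroutine
would have seen a single list of six values.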
4417 .Sh "Regular Expressions"
4418 The patterns used in pattern matching are regular expressions such as
4419 those supplied in the Version 8 regexp routines.
4420 (In fact, the routines are derived from Henry Spencer's freely redistributable
4421 reimplementation of the V8 routines.)
4422 In addition, \ew matches an alphanumeric character (including \*(L"_\*(R") and \eW a nonalphanumeric.
4423 Word boundaries may be matched by \eb, and non-boundaries by \eB.
4424 A whitespace character is matched by \es, non-whitespace by \eS.
4425 A numeric character is matched by \ed, non-numeric by \eD.
4426 You may use \ew, \es and \ed within character classes.
4427 Also, \en, \er, \ef, \et and \eNNN have their normal interpretations.
4428 Within character classes \eb represents backspace rather than a word boundary.
4429 Alternatives may be separated by |.
4430 The bracketing construct \|(\ .\|.\|.\ \|) may also be used, in which case \e<digit>
4431 matches the digit'th substring.
4432 (Outside of the pattern, always use $ instead of \e in front of the digit.
4433 The scope of $<digit> (and $\`, $& and $\')
4434 extends to the end of the enclosing BLOCK or eval string, or to
4435 the next pattern match with subexpressions.
4436 The \e<digit> notation sometimes works outside the current pattern, but should
4437 not be relied upon.)
4438 You may have as many parentheses as you wish.  If you have more than 9
4439 substrings, the variables $10, $11, ... refer to the corresponding
4440 substring.  Within the pattern, \e10, \e11,
4441 etc. refer back to substrings if there have been at least that many left parens
before the backreference.  Otherwise (for backward compatibility) \e10
4443 is the same as \e010, a backspace,
4444 and \e11 the same as \e011, a tab.
4445 And so on.
4446 (\e1 through \e9 are always backreferences.)
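.PP
A short sketch of both notations, using a made-up input line: the
backreference is written \e1 inside the pattern and $1 outside it.

```perl
$_ = 'it was the the best of times';

if (/\b(\w+) \1\b/) {                # \1 inside the pattern
    $dup = $1;                       # $1 outside the pattern
    print "doubled word: $dup\n";    # doubled word: the
}
```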
4447 .PP
4448 $+ returns whatever the last bracket match matched.
4449 $& returns the entire matched string.
4450 ($0 used to return the same thing, but not any more.)
4451 $\` returns everything before the matched string.
4452 $\' returns everything after the matched string.
4453 Examples:
4454 .nf
4455
4456         s/\|^\|([^ \|]*\|) \|*([^ \|]*\|)\|/\|$2 $1\|/; # swap first two words
4457
4458 .ne 5
4459         if (/\|Time: \|(.\|.\|):\|(.\|.\|):\|(.\|.\|)\|/\|) {
4460                 $hours = $1;
4461                 $minutes = $2;
4462                 $seconds = $3;
4463         }
4464
4465 .fi
4466 By default, the ^ character is only guaranteed to match at the beginning
4467 of the string,
4468 the $ character only at the end (or before the newline at the end)
4469 and
4470 .I perl
4471 does certain optimizations with the assumption that the string contains
4472 only one line.
4473 The behavior of ^ and $ on embedded newlines will be inconsistent.
4474 You may, however, wish to treat a string as a multi-line buffer, such that
4475 the ^ will match after any newline within the string, and $ will match
4476 before any newline.
4477 At the cost of a little more overhead, you can do this by setting the variable
4478 $* to 1.
4479 Setting it back to 0 makes
4480 .I perl
4481 revert to its old behavior.
4482 .PP
4483 To facilitate multi-line substitutions, the . character never matches a newline
4484 (even when $* is 0).
4485 In particular, the following leaves a newline on the $_ string:
4486 .nf
4487
4488         $_ = <STDIN>;
4489         s/.*(some_string).*/$1/;
4490
4491 If the newline is unwanted, try one of
4492
4493         s/.*(some_string).*\en/$1/;
4494         s/.*(some_string)[^\e000]*/$1/;
4495         s/.*(some_string)(.|\en)*/$1/;
4496         chop; s/.*(some_string).*/$1/;
4497         /(some_string)/ && ($_ = $1);
4498
4499 .fi
4500 Any item of a regular expression may be followed with digits in curly brackets
4501 of the form {n,m}, where n gives the minimum number of times to match the item
4502 and m gives the maximum.
4503 The form {n} is equivalent to {n,n} and matches exactly n times.
4504 The form {n,} matches n or more times.
4505 (If a curly bracket occurs in any other context, it is treated as a regular
4506 character.)
4507 The * modifier is equivalent to {0,}, the + modifier to {1,} and the ? modifier
4508 to {0,1}.
4509 There is no limit to the size of n or m, but large numbers will chew up
4510 more memory.
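.Sp
Two illustrative uses of the curly bracket forms (the strings are made up):

```perl
# Exactly four digits, two digits, two digits.
$is_date = ('1991-06-07' =~ /^\d{4}-\d{2}-\d{2}$/) ? 1 : 0;      # 1

# {2,3} takes as many as it can, here three, leaving one for a*.
($head, $tail) = ('aaaa' =~ /^(a{2,3})(a*)$/);                   # ('aaa', 'a')
```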
4511 .Sp
4512 You will note that all backslashed metacharacters in
4513 .I perl
4514 are alphanumeric,
4515 such as \eb, \ew, \en.
4516 Unlike some other regular expression languages, there are no backslashed
4517 symbols that aren't alphanumeric.
4518 So anything that looks like \e\e, \e(, \e), \e<, \e>, \e{, or \e} is always
4519 interpreted as a literal character, not a metacharacter.
4520 This makes it simple to quote a string that you want to use for a pattern
4521 but that you are afraid might contain metacharacters.
4522 Simply quote all the non-alphanumeric characters:
4523 .nf
4524
4525         $pattern =~ s/(\eW)/\e\e$1/g;
4526
4527 .fi
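.PP
A sketch of the effect, with a hypothetical string that happens to
contain two metacharacters:

```perl
$pattern = 'a.b*c';                  # hypothetical text with metacharacters
$pattern =~ s/(\W)/\\$1/g;           # now a\.b\*c

$hit  = ('xxa.b*cxx' =~ /$pattern/) ? 1 : 0;    # 1: the literal text is there
$miss = ('xxaXbbcxx' =~ /$pattern/) ? 1 : 0;    # 0: . and * are no longer special
```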
4528 .Sh "Formats"
4529 Output record formats for use with the
4530 .I write
operator may be declared as follows:
4532 .nf
4533
4534 .ne 3
4535     format NAME =
4536     FORMLIST
4537     .
4538
4539 .fi
If NAME is omitted, format \*(L"STDOUT\*(R" is defined.
4541 FORMLIST consists of a sequence of lines, each of which may be of one of three
4542 types:
4543 .Ip 1. 4
4544 A comment.
4545 .Ip 2. 4
4546 A \*(L"picture\*(R" line giving the format for one output line.
4547 .Ip 3. 4
4548 An argument line supplying values to plug into a picture line.
4549 .PP
4550 Picture lines are printed exactly as they look, except for certain fields
4551 that substitute values into the line.
4552 Each picture field starts with either @ or ^.
4553 The @ field (not to be confused with the array marker @) is the normal
4554 case; ^ fields are used
4555 to do rudimentary multi-line text block filling.
4556 The length of the field is supplied by padding out the field
4557 with multiple <, >, or | characters to specify, respectively, left justification,
4558 right justification, or centering.
4559 As an alternate form of right justification,
4560 you may also use # characters (with an optional .) to specify a numeric field.
4561 (Use of ^ instead of @ causes the field to be blanked if undefined.)
4562 If any of the values supplied for these fields contains a newline, only
4563 the text up to the newline is printed.
4564 The special field @* can be used for printing multi-line values.
4565 It should appear by itself on a line.
4566 .PP
4567 The values are specified on the following line, in the same order as
4568 the picture fields.
4569 The values should be separated by commas.
4570 .PP
4571 Picture fields that begin with ^ rather than @ are treated specially.
4572 The value supplied must be a scalar variable name which contains a text
4573 string.
4574 .I Perl
4575 puts as much text as it can into the field, and then chops off the front
4576 of the string so that the next time the variable is referenced,
4577 more of the text can be printed.
4578 Normally you would use a sequence of fields in a vertical stack to print
4579 out a block of text.
4580 If you like, you can end the final field with .\|.\|., which will appear in the
4581 output if the text was too long to appear in its entirety.
4582 You can change which characters are legal to break on by changing the
4583 variable $: to a list of the desired characters.
4584 .PP
4585 Since use of ^ fields can produce variable length records if the text to be
4586 formatted is short, you can suppress blank lines by putting the tilde (~)
4587 character anywhere in the line.
4588 (Normally you should put it in the front if possible, for visibility.)
4589 The tilde will be translated to a space upon output.
4590 If you put a second tilde contiguous to the first, the line will be repeated
4591 until all the fields on the line are exhausted.
4592 (If you use a field of the @ variety, the expression you supply had better
4593 not give the same value every time forever!)
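.PP
A minimal sketch of a ^ field continued with a doubled tilde.
The format name NOTE and the variable $text are hypothetical; the second
picture line repeats until $text has been used up:

```perl
format NOTE =
Note: ^<<<<<<<<<<<<<<<<<<
      $text
~~    ^<<<<<<<<<<<<<<<<<<
      $text
.

$~ = 'NOTE';
$text = 'The quick brown fox jumps over the lazy dog near the river bank.';
write;

# Each ^ field chopped off what it printed, so nothing is left in $text.
print "left over: '$text'\n";
```

Copy the string first if you still need it after the write.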
4594 .PP
4595 Examples:
4596 .nf
4597 .lg 0
4598 .cs R 25
4599 .ft C
4600
4601 .ne 10
4602 # a report on the /etc/passwd file
4603 format STDOUT_TOP =
4604 \&                        Passwd File
4605 Name                Login    Office   Uid   Gid Home
4606 ------------------------------------------------------------------
4607 \&.
4608 format STDOUT =
4609 @<<<<<<<<<<<<<<<<<< @||||||| @<<<<<<@>>>> @>>>> @<<<<<<<<<<<<<<<<<
4610 $name,              $login,  $office,$uid,$gid, $home
4611 \&.
4612
4613 .ne 29
4614 # a report from a bug report form
4615 format STDOUT_TOP =
4616 \&                        Bug Reports
4617 @<<<<<<<<<<<<<<<<<<<<<<<     @|||         @>>>>>>>>>>>>>>>>>>>>>>>
4618 $system,                      $%,         $date
4619 ------------------------------------------------------------------
4620 \&.
4621 format STDOUT =
4622 Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4623 \&         $subject
4624 Index: @<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4625 \&       $index,                       $description
4626 Priority: @<<<<<<<<<< Date: @<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4627 \&          $priority,        $date,   $description
4628 From: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4629 \&      $from,                         $description
4630 Assigned to: @<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4631 \&             $programmer,            $description
4632 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4633 \&                                     $description
4634 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4635 \&                                     $description
4636 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4637 \&                                     $description
4638 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4639 \&                                     $description
4640 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<...
4641 \&                                     $description
4642 \&.
4643
4644 .ft R
4645 .cs R
4646 .lg
4647 .fi
4648 It is possible to intermix prints with writes on the same output channel,
4649 but you'll have to handle $\- (lines left on the page) yourself.
4650 .PP
4651 If you are printing lots of fields that are usually blank, you should consider
4652 using the reset operator between records.
4653 Not only is it more efficient, but it can prevent the bug of adding another
4654 field and forgetting to zero it.
4655 .Sh "Interprocess Communication"
4656 The IPC facilities of perl are built on the Berkeley socket mechanism.
4657 If you don't have sockets, you can ignore this section.
4658 The calls have the same names as the corresponding system calls,
4659 but the arguments tend to differ, for two reasons.
4660 First, perl file handles work differently than C file descriptors.
4661 Second, perl already knows the length of its strings, so you don't need
4662 to pass that information.
4663 Here is a sample client (untested):
4664 .nf
4665
4666         ($them,$port) = @ARGV;
4667         $port = 2345 unless $port;
4668         $them = 'localhost' unless $them;
4669
4670         $SIG{'INT'} = 'dokill';
4671         sub dokill { kill 9,$child if $child; }
4672
4673         require 'sys/socket.ph';
4674
4675         $sockaddr = 'S n a4 x8';
4676         chop($hostname = `hostname`);
4677
4678         ($name, $aliases, $proto) = getprotobyname('tcp');
4679         ($name, $aliases, $port) = getservbyname($port, 'tcp')
4680                 unless $port =~ /^\ed+$/;
4681 .ie t \{\
4682         ($name, $aliases, $type, $len, $thisaddr) = gethostbyname($hostname);
4683 'br\}
4684 .el \{\
4685         ($name, $aliases, $type, $len, $thisaddr) =
4686                                         gethostbyname($hostname);
4687 'br\}
4688         ($name, $aliases, $type, $len, $thataddr) = gethostbyname($them);
4689
4690         $this = pack($sockaddr, &AF_INET, 0, $thisaddr);
4691         $that = pack($sockaddr, &AF_INET, $port, $thataddr);
4692
4693         socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!";
4694         bind(S, $this) || die "bind: $!";
4695         connect(S, $that) || die "connect: $!";
4696
4697         select(S); $| = 1; select(stdout);
4698
4699         if ($child = fork) {
4700                 while (<>) {
4701                         print S;
4702                 }
4703                 sleep 3;
4704                 do dokill();
4705         }
4706         else {
4707                 while (<S>) {
4708                         print;
4709                 }
4710         }
4711
4712 .fi
4713 And here's a server:
4714 .nf
4715
4716         ($port) = @ARGV;
4717         $port = 2345 unless $port;
4718
4719         require 'sys/socket.ph';
4720
4721         $sockaddr = 'S n a4 x8';
4722
4723         ($name, $aliases, $proto) = getprotobyname('tcp');
4724         ($name, $aliases, $port) = getservbyname($port, 'tcp')
4725                 unless $port =~ /^\ed+$/;
4726
4727         $this = pack($sockaddr, &AF_INET, $port, "\e0\e0\e0\e0");
4728
4729         select(NS); $| = 1; select(stdout);
4730
4731         socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!";
4732         bind(S, $this) || die "bind: $!";
        listen(S, 5) || die "listen: $!";
4734
4735         select(S); $| = 1; select(stdout);
4736
4737         for (;;) {
4738                 print "Listening again\en";
4739                 ($addr = accept(NS,S)) || die $!;
4740                 print "accept ok\en";
4741
4742                 ($af,$port,$inetaddr) = unpack($sockaddr,$addr);
4743                 @inetaddr = unpack('C4',$inetaddr);
4744                 print "$af $port @inetaddr\en";
4745
4746                 while (<NS>) {
4747                         print;
4748                         print NS;
4749                 }
4750         }
4751
4752 .fi
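.PP
Both programs build their socket addresses with the template 'S n a4 x8'.
The sketch below packs and unpacks one such address by hand.
It uses the literal 2 for the address family, which is only an assumption
(a common value of AF_INET), so that it runs without sys/socket.ph:

```perl
$sockaddr = 'S n a4 x8';

# Assumption: 2 stands in for &AF_INET; real code should get it from sys/socket.ph.
$addr = pack($sockaddr, 2, 2345, pack('C4', 127, 0, 0, 1));

($af, $port, $inetaddr) = unpack($sockaddr, $addr);
@octets = unpack('C4', $inetaddr);

print join('.', @octets), " port $port\n";   # 127.0.0.1 port 2345
print length($addr), "\n";                   # 16
```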
4753 .Sh "Predefined Names"
4754 The following names have special meaning to
4755 .IR perl .
4756 I could have used alphabetic symbols for some of these, but I didn't want
4757 to take the chance that someone would say reset \*(L"a\-zA\-Z\*(R" and wipe them all
4758 out.
4759 You'll just have to suffer along with these silly symbols.
4760 Most of them have reasonable mnemonics, or analogues in one of the shells.
4761 .Ip $_ 8
4762 The default input and pattern-searching space.
4763 The following pairs are equivalent:
4764 .nf
4765
4766 .ne 2
4767         while (<>) {\|.\|.\|.   # only equivalent in while!
4768         while ($_ = <>) {\|.\|.\|.
4769
4770 .ne 2
4771         /\|^Subject:/
4772         $_ \|=~ \|/\|^Subject:/
4773
4774 .ne 2
4775         y/a\-z/A\-Z/
4776         $_ =~ y/a\-z/A\-Z/
4777
4778 .ne 2
4779         chop
4780         chop($_)
4781
4782 .fi
4783 (Mnemonic: underline is understood in certain operations.)
4784 .Ip $. 8
4785 The current input line number of the last filehandle that was read.
Read-only.
4787 Remember that only an explicit close on the filehandle resets the line number.
4788 Since <> never does an explicit close, line numbers increase across ARGV files
4789 (but see examples under eof).
4790 (Mnemonic: many programs use . to mean the current line number.)
4791 .Ip $/ 8
4792 The input record separator, newline by default.
4793 Works like
4794 .IR awk 's
4795 RS variable, including treating blank lines as delimiters
4796 if set to the null string.
You may set it to a multi-character string to match a multi-character
4798 delimiter.
4799 (Mnemonic: / is used to delimit line boundaries when quoting poetry.)
4800 .Ip $, 8
4801 The output field separator for the print operator.
4802 Ordinarily the print operator simply prints out the comma separated fields
4803 you specify.
4804 In order to get behavior more like
4805 .IR awk ,
4806 set this variable as you would set
4807 .IR awk 's
4808 OFS variable to specify what is printed between fields.
4809 (Mnemonic: what is printed when there is a , in your print statement.)
4810 .Ip $"" 8
4811 This is like $, except that it applies to array values interpolated into
4812 a double-quoted string (or similar interpreted string).
4813 Default is a space.
4814 (Mnemonic: obvious, I think.)
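.Sp
A sketch with a made-up array, changing the separator only inside a block:

```perl
@fields = ('root', 0, '/');

$spaced = "@fields";             # root 0 /
{
    local($") = ':';             # temporary separator inside this block
    $joined = "@fields";         # root:0:/
}
```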
4815 .Ip $\e 8
4816 The output record separator for the print operator.
4817 Ordinarily the print operator simply prints out the comma separated fields
4818 you specify, with no trailing newline or record separator assumed.
4819 In order to get behavior more like
4820 .IR awk ,
4821 set this variable as you would set
4822 .IR awk 's
4823 ORS variable to specify what is printed at the end of the print.
4824 (Mnemonic: you set $\e instead of adding \en at the end of the print.
4825 Also, it's just like /, but it's what you get \*(L"back\*(R" from
4826 .IR perl .)
4827 .Ip $# 8
4828 The output format for printed numbers.
4829 This variable is a half-hearted attempt to emulate
4830 .IR awk 's
4831 OFMT variable.
4832 There are times, however, when
4833 .I awk
4834 and
4835 .I perl
4836 have differing notions of what
4837 is in fact numeric.
4838 Also, the initial value is %.20g rather than %.6g, so you need to set $#
4839 explicitly to get
4840 .IR awk 's
4841 value.
4842 (Mnemonic: # is the number sign.)
4843 .Ip $% 8
4844 The current page number of the currently selected output channel.
4845 (Mnemonic: % is page number in nroff.)
4846 .Ip $= 8
4847 The current page length (printable lines) of the currently selected output
4848 channel.
4849 Default is 60.
4850 (Mnemonic: = has horizontal lines.)
4851 .Ip $\- 8
4852 The number of lines left on the page of the currently selected output channel.
4853 (Mnemonic: lines_on_page \- lines_printed.)
4854 .Ip $~ 8
4855 The name of the current report format for the currently selected output
4856 channel.
Default is the name of the filehandle.
4858 (Mnemonic: brother to $^.)
4859 .Ip $^ 8
4860 The name of the current top-of-page format for the currently selected output
4861 channel.
Default is the name of the filehandle with \*(L"_TOP\*(R" appended.
4863 (Mnemonic: points to top of page.)
4864 .Ip $| 8
4865 If set to nonzero, forces a flush after every write or print on the currently
4866 selected output channel.
4867 Default is 0.
4868 Note that
4869 .I STDOUT
4870 will typically be line buffered if output is to the
4871 terminal and block buffered otherwise.
4872 Setting this variable is useful primarily when you are outputting to a pipe,
4873 such as when you are running a
4874 .I perl
4875 script under rsh and want to see the
4876 output as it's happening.
4877 (Mnemonic: when you want your pipes to be piping hot.)
4878 .Ip $$ 8
4879 The process number of the
4880 .I perl
4881 running this script.
4882 (Mnemonic: same as shells.)
4883 .Ip $? 8
4884 The status returned by the last pipe close, backtick (\`\`) command or
4885 .I system
4886 operator.
4887 Note that this is the status word returned by the wait() system
4888 call, so the exit value of the subprocess is actually ($? >> 8).
4889 $? & 255 gives which signal, if any, the process died from, and whether
4890 there was a core dump.
4891 (Mnemonic: similar to sh and ksh.)
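.Sp
A sketch of taking the status word apart.
It assumes a sh on the search path, and the exit status 3 is arbitrary:

```perl
# Run a shell that exits with status 3.
system('sh', '-c', 'exit 3');

$exit_value = $? >> 8;           # 3
$low_byte   = $? & 255;          # 0: no signal and no core dump
```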
4892 .Ip $& 8 4
4893 The string matched by the last pattern match (not counting any matches hidden
4894 within a BLOCK or eval enclosed by the current BLOCK).
4895 (Mnemonic: like & in some editors.)
4896 .Ip $\` 8 4
4897 The string preceding whatever was matched by the last pattern match
4898 (not counting any matches hidden within a BLOCK or eval enclosed by the current
4899 BLOCK).
4900 (Mnemonic: \` often precedes a quoted string.)
4901 .Ip $\' 8 4
4902 The string following whatever was matched by the last pattern match
4903 (not counting any matches hidden within a BLOCK or eval enclosed by the current
4904 BLOCK).
4905 (Mnemonic: \' often follows a quoted string.)
4906 Example:
4907 .nf
4908
4909 .ne 3
4910         $_ = \'abcdefghi\';
4911         /def/;
4912         print "$\`:$&:$\'\en";          # prints abc:def:ghi
4913
4914 .fi
4915 .Ip $+ 8 4
4916 The last bracket matched by the last search pattern.
4917 This is useful if you don't know which of a set of alternative patterns
4918 matched.
4919 For example:
4920 .nf
4921
4922     /Version: \|(.*\|)|Revision: \|(.*\|)\|/ \|&& \|($rev = $+);
4923
4924 .fi
4925 (Mnemonic: be positive and forward looking.)
4926 .Ip $* 8 2
4927 Set to 1 to do multiline matching within a string, 0 to tell
4928 .I perl
4929 that it can assume that strings contain a single line, for the purpose
4930 of optimizing pattern matches.
4931 Pattern matches on strings containing multiple newlines can produce confusing
4932 results when $* is 0.
4933 Default is 0.
4934 (Mnemonic: * matches multiple things.)
4935 Note that this variable only influences the interpretation of ^ and $.
4936 A literal newline can be searched for even when $* == 0.
4937 .Ip $0 8
4938 Contains the name of the file containing the
4939 .I perl
4940 script being executed.
4941 Assigning to $0 modifies the argument area that the ps(1) program sees.
4942 (Mnemonic: same as sh and ksh.)
4943 .Ip $<digit> 8
4944 Contains the subpattern from the corresponding set of parentheses in the last
4945 pattern matched, not counting patterns matched in nested blocks that have
4946 been exited already.
4947 (Mnemonic: like \edigit.)
4948 .Ip $[ 8 2
4949 The index of the first element in an array, and of the first character in
4950 a substring.
4951 Default is 0, but you could set it to 1 to make
4952 .I perl
4953 behave more like
4954 .I awk
4955 (or Fortran)
4956 when subscripting and when evaluating the index() and substr() functions.
4957 (Mnemonic: [ begins subscripts.)
4958 .Ip $] 8 2
4959 The string printed out when you say \*(L"perl -v\*(R".
4960 It can be used to determine at the beginning of a script whether the perl
4961 interpreter executing the script is in the right range of versions.
4962 If used in a numeric context, returns the version + patchlevel / 1000.
4963 Example:
4964 .nf
4965
4966 .ne 8
4967         # see if getc is available
4968         ($version,$patchlevel) =
4969                  $] =~ /(\ed+\e.\ed+).*\enPatch level: (\ed+)/;
4970         print STDERR "(No filename completion available.)\en"
4971                  if $version * 1000 + $patchlevel < 2016;
4972
4973 or, used numerically,
4974
4975         warn "No checksumming!\en" if $] < 3.019;
4976
4977 .fi
4978 (Mnemonic: Is this version of perl in the right bracket?)
4979 .Ip $; 8 2
4980 The subscript separator for multi-dimensional array emulation.
4981 If you refer to an associative array element as
4982 .nf
4983         $foo{$a,$b,$c}
4984
4985 it really means
4986
4987         $foo{join($;, $a, $b, $c)}
4988
4989 But don't put
4990
4991         @foo{$a,$b,$c}          # a slice\*(--note the @
4992
4993 which means
4994
4995         ($foo{$a},$foo{$b},$foo{$c})
4996
4997 .fi
4998 Default is "\e034", the same as SUBSEP in
4999 .IR awk .
5000 Note that if your keys contain binary data there might not be any safe
5001 value for $;.
5002 (Mnemonic: comma (the syntactic subscript separator) is a semi-semicolon.
5003 Yeah, I know, it's pretty lame, but $, is already taken for something more
5004 important.)
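.Sp
As a minimal sketch of the emulation (the array name and data are hypothetical),
a multi-dimensional key can be taken apart again by splitting on $;,
provided that no subscript contains the separator itself:
.nf

```perl
# Hypothetical 2-D table emulated with one associative array.
$table{1,2} = 'x';
$table{3,4} = 'y';

# $table{1,2} and $table{join($;, 1, 2)} name the same element.
$same = ($table{join($;, 1, 2)} eq 'x');

foreach $key (sort keys(%table)) {
    ($row, $col) = split(/$;/, $key);   # recover the two subscripts
    print "row $row, col $col: $table{$key}\n";
}
```

.fi
If the keys may contain arbitrary bytes, this round trip is not reliable, as noted above.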
5005 .Ip $! 8 2
5006 If used in a numeric context, yields the current value of errno, with all the
5007 usual caveats.
5008 (This means that you shouldn't depend on the value of $! to be anything
5009 in particular unless you've gotten a specific error return indicating a
5010 system error.)
5011 If used in a string context, yields the corresponding system error string.
5012 You can assign to $! in order to set errno
5013 if, for instance, you want $! to return the string for error n, or you want
5014 to set the exit value for the die operator.
5015 (Mnemonic: What just went bang?)
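.Sp
The two contexts can be sketched as follows.
The path is hypothetical and is assumed not to exist, and the value 2 is only an
illustrative error number.
Note that $! is copied immediately after the failing call, in line with the caveat above.
.nf

```perl
# The path is hypothetical and assumed not to exist.
if (!open(MISSING, '/no/such/dir/no-such-file')) {
    $errno  = $! + 0;       # numeric context: the errno value
    $errstr = "$!";         # string context: the system error string
    print "open failed: $errstr (errno $errno)\n";
}

$! = 2;                     # assign to set errno ...
$text = "$!";               # ... then read back the string for error 2
print "error 2 reads: $text\n";
```

.fi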
5016 .Ip $@ 8 2
5017 The perl syntax error message from the last eval command.
5018 If null, the last eval parsed and executed correctly (although the operations
5019 you invoked may have failed in the normal fashion).
5020 (Mnemonic: Where was the syntax error \*(L"at\*(R"?)
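.Sp
A small sketch of testing $@ after an eval.
The first string is deliberately malformed; the variable names are hypothetical.
.nf

```perl
eval '$x = 1 +;';               # deliberately malformed
$first_error = $@;              # save it, since the next eval sets $@ again
print "eval failed: $first_error" if $first_error ne '';

eval '$sum = 1 + 2;';
print "sum is $sum\n" if $@ eq '';
```

.fi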
5021 .Ip $< 8 2
5022 The real uid of this process.
5023 (Mnemonic: it's the uid you came FROM, if you're running setuid.)
5024 .Ip $> 8 2
5025 The effective uid of this process.
5026 Example:
5027 .nf
5028
5029 .ne 2
5030         $< = $>;        # set real uid to the effective uid
5031         ($<,$>) = ($>,$<);      # swap real and effective uid
5032
5033 .fi
5034 (Mnemonic: it's the uid you went TO, if you're running setuid.)
5035 Note: $< and $> can only be swapped on machines supporting setreuid().
5036 .Ip $( 8 2
5037 The real gid of this process.
5038 If you are on a machine that supports membership in multiple groups
5039 simultaneously, gives a space separated list of groups you are in.
5040 The first number is the one returned by getgid(), and the subsequent ones
5041 by getgroups(), one of which may be the same as the first number.
5042 (Mnemonic: parentheses are used to GROUP things.
5043 The real gid is the group you LEFT, if you're running setgid.)
5044 .Ip $) 8 2
5045 The effective gid of this process.
5046 If you are on a machine that supports membership in multiple groups
5047 simultaneously, gives a space separated list of groups you are in.
5048 The first number is the one returned by getegid(), and the subsequent ones
5049 by getgroups(), one of which may be the same as the first number.
5050 (Mnemonic: parentheses are used to GROUP things.
5051 The effective gid is the group that's RIGHT for you, if you're running setgid.)
5052 .Sp
5053 Note: $<, $>, $( and $) can only be set on machines that support the
5054 corresponding set[re][ug]id() routine.
5055 $( and $) can only be swapped on machines supporting setregid().
5056 .Ip $: 8 2
5057 The current set of characters after which a string may be broken to
5058 fill continuation fields (starting with ^) in a format.
5059 Default is "\ \en-", to break on whitespace or hyphens.
5060 (Mnemonic: a \*(L"colon\*(R" in poetry is a part of a line.)
5061 .Ip $^D 8 2
5062 The current value of the debugging flags.
5063 (Mnemonic: value of
5064 .B \-D
5065 switch.)
5066 .Ip $^F 8 2
5067 The maximum system file descriptor, ordinarily 2.  System file descriptors
5068 are passed to subprocesses, while higher file descriptors are not.
5069 During an open, system file descriptors are preserved even if the open
5070 fails.  Ordinary file descriptors are closed before the open is attempted.
5071 .Ip $^I 8 2
5072 The current value of the inplace-edit extension.
5073 Use undef to disable inplace editing.
5074 (Mnemonic: value of
5075 .B \-i
5076 switch.)
5077 .Ip $^P 8 2
5078 The internal flag that the debugger clears so that it doesn't
5079 debug itself.  You could conceivably disable debugging yourself
5080 by clearing it.
5081 .Ip $^T 8 2
5082 The time at which the script began running, in seconds since the epoch.
5083 The values returned by the
5084 .BR \-M ,
5085 .B \-A
5086 and
5087 .B \-C
5088 filetests are based on this value.
5089 .Ip $^W 8 2
5090 The current value of the warning switch.
5091 (Mnemonic: related to the
5092 .B \-w
5093 switch.)
5094 .Ip $^X 8 2
5095 The name that Perl itself was executed as, from argv[0].
5096 .Ip $ARGV 8 3
5097 Contains the name of the current file when reading from <>.
5098 .Ip @ARGV 8 3
5099 The array ARGV contains the command line arguments intended for the script.
5100 Note that $#ARGV is generally the number of arguments minus one, since
5101 $ARGV[0] is the first argument, NOT the command name.
5102 See $0 for the command name.
5103 .Ip @INC 8 3
5104 The array INC contains the list of places to look for
5105 .I perl
5106 scripts to be
5107 evaluated by the \*(L"do EXPR\*(R" command or the \*(L"require\*(R" command.
5108 It initially consists of the arguments to any
5109 .B \-I
5110 command line switches, followed
5111 by the default
5112 .I perl
5113 library, probably \*(L"/usr/local/lib/perl\*(R",
5114 followed by \*(L".\*(R", to represent the current directory.
5115 .Ip %INC 8 3
5116 The associative array INC contains entries for each filename that has
5117 been included via \*(L"do\*(R" or \*(L"require\*(R".
5118 The key is the filename you specified, and the value is the location of
5119 the file actually found.
5120 The \*(L"require\*(R" command uses this array to determine whether
5121 a given file has already been included.
5122 .Ip $ENV{expr} 8 2
5123 The associative array ENV contains your current environment.
5124 Setting a value in ENV changes the environment for child processes.
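.Sp
As a sketch of the effect on child processes, assuming a Unix shell with an echo
command (the variable name GREETING is hypothetical):
.nf

```perl
$ENV{'GREETING'} = 'hello';       # hypothetical variable name
$seen = `echo \$GREETING`;        # the backslash leaves $GREETING for the shell
chop($seen);                      # drop the trailing newline
print "child saw: $seen\n";
```

.fi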
5125 .Ip $SIG{expr} 8 2
5126 The associative array SIG is used to set signal handlers for various signals.
5127 Example:
5128 .nf
5129
5130 .ne 12
5131         sub handler {   # 1st argument is signal name
5132                 local($sig) = @_;
5133                 print "Caught a SIG$sig\-\|\-shutting down\en";
5134                 close(LOG);
5135                 exit(0);
5136         }
5137
5138         $SIG{\'INT\'} = \'handler\';
5139         $SIG{\'QUIT\'} = \'handler\';
5140         .\|.\|.
5141         $SIG{\'INT\'} = \'DEFAULT\';    # restore default action
5142         $SIG{\'QUIT\'} = \'IGNORE\';    # ignore SIGQUIT
5143
5144 .fi
5145 The SIG array only contains values for the signals actually set within
5146 the perl script.
5147 .Sh "Packages"
5148 Perl provides a mechanism for alternate namespaces to protect packages from
5149 stomping on each other's variables.
5150 By default, a perl script starts compiling into the package known as \*(L"main\*(R".
5151 By use of the
5152 .I package
5153 declaration, you can switch namespaces.
5154 The scope of the package declaration is from the declaration itself to the end
5155 of the enclosing block (the same scope as the local() operator).
5156 Typically it would be the first declaration in a file to be included by
5157 the \*(L"require\*(R" operator.
5158 You can switch into a package in more than one place; it merely influences
5159 which symbol table is used by the compiler for the rest of that block.
5160 You can refer to variables and filehandles in other packages by prefixing
5161 the identifier with the package name and a single quote.
5162 If the package name is null, the \*(L"main\*(R" package is assumed.
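.PP
A small sketch of the qualified-name rule, using a hypothetical package called
counter and two distinct variables that share the name $total.
Interpreters newer than the one described here may warn about the single-quote spelling.
.nf

```perl
package counter;            # hypothetical package name
$total = 5;                 # this is $counter'total

package main;
$total = 1;                 # a different variable, $main'total
$other = $counter'total;    # qualified reference into the other package
print "main has $total, counter has $other\n";
```

.fi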
5163 .PP
5164 Only identifiers starting with letters are stored in the package's symbol
5165 table.
5166 All other symbols are kept in package \*(L"main\*(R".
5167 In addition, the identifiers STDIN, STDOUT, STDERR, ARGV, ARGVOUT, ENV, INC
5168 and SIG are forced to be in package \*(L"main\*(R", even when used for
5169 other purposes than their built-in one.
5170 Note also that, if you have a package called \*(L"m\*(R", \*(L"s\*(R"
5171 or \*(L"y\*(R", then you can't use the qualified form of an identifier since it
5172 will be interpreted instead as a pattern match, a substitution
5173 or a translation.
5174 .PP
5175 Eval'ed strings are compiled in the package in which the eval was
5176 compiled.
5177 (Assignments to $SIG{}, however, assume the signal handler specified is in the
5178 main package.
5179 Qualify the signal handler name if you wish to have a signal handler in
5180 a package.)
5181 For an example, examine perldb.pl in the perl library.
5182 It initially switches to the DB package so that the debugger doesn't interfere
5183 with variables in the script you are trying to debug.
5184 At various points, however, it temporarily switches back to the main package
5185 to evaluate various expressions in the context of the main package.
5186 .PP
5187 The symbol table for a package happens to be stored in the associative array
5188 of that name prepended with an underscore.
5189 The value in each entry of the associative array is
5190 what you are referring to when you use the *name notation.
5191 In fact, the following have the same effect (in package main, anyway),
5192 though the first is more
5193 efficient because it does the symbol table lookups at compile time:
5194 .nf
5195
5196 .ne 2
5197         local(*foo) = *bar;
5198         local($_main{'foo'}) = $_main{'bar'};
5199
5200 .fi
5201 You can use this to print out all the variables in a package, for instance.
5202 Here is dumpvar.pl from the perl library:
5203 .nf
5204 .ne 11
5205         package dumpvar;
5206
5207         sub main'dumpvar {
5208         \&    ($package) = @_;
5209         \&    local(*stab) = eval("*_$package");
5210         \&    while (($key,$val) = each(%stab)) {
5211         \&        {
5212         \&            local(*entry) = $val;
5213         \&            if (defined $entry) {
5214         \&                print "\e$$key = '$entry'\en";
5215         \&            }
5216 .ne 7
5217         \&            if (defined @entry) {
5218         \&                print "\e@$key = (\en";
5219         \&                foreach $num ($[ .. $#entry) {
5220         \&                    print "  $num\et'",$entry[$num],"'\en";
5221         \&                }
5222         \&                print ")\en";
5223         \&            }
5224 .ne 10
5225         \&            if ($key ne "_$package" && defined %entry) {
5226         \&                print "\e%$key = (\en";
5227         \&                foreach $key (sort keys(%entry)) {
5228         \&                    print "  $key\et'",$entry{$key},"'\en";
5229         \&                }
5230         \&                print ")\en";
5231         \&            }
5232         \&        }
5233         \&    }
5234         }
5235
5236 .fi
5237 Note that, even though the subroutine is compiled in package dumpvar, the
5238 name of the subroutine is qualified so that its name is inserted into package
5239 \*(L"main\*(R".
5240 .Sh "Style"
5241 Each programmer will, of course, have his or her own preferences in regard
5242 to formatting, but there are some general guidelines that will make your
5243 programs easier to read.
5244 .Ip 1. 4 4
5245 Just because you CAN do something a particular way doesn't mean that
5246 you SHOULD do it that way.
5247 .I Perl
5248 is designed to give you several ways to do anything, so consider picking
5249 the most readable one.
5250 For instance
5251
5252         open(FOO,$foo) || die "Can't open $foo: $!";
5253
5254 is better than
5255
5256         die "Can't open $foo: $!" unless open(FOO,$foo);
5257
5258 because the second way hides the main point of the statement in a
5259 modifier.
5260 On the other hand
5261
5262         print "Starting analysis\en" if $verbose;
5263
5264 is better than
5265
5266         $verbose && print "Starting analysis\en";
5267
5268 since the main point isn't whether the user typed -v or not.
5269 .Sp
5270 Similarly, just because an operator lets you assume default arguments
5271 doesn't mean that you have to make use of the defaults.
5272 The defaults are there for lazy systems programmers writing one-shot
5273 programs.
5274 If you want your program to be readable, consider supplying the argument.
5275 .Sp
5276 Along the same lines, just because you
5277 .I can
5278 omit parentheses in many places doesn't mean that you ought to:
5279 .nf
5280
5281         return print reverse sort num values array;
5282         return print(reverse(sort num (values(%array))));
5283
5284 .fi
5285 When in doubt, parenthesize.
5286 At the very least it will let some poor schmuck bounce on the % key in vi.
5287 .Sp
5288 Even if you aren't in doubt, consider the mental welfare of the person who
5289 has to maintain the code after you, and who will probably put parens in
5290 the wrong place.
5291 .Ip 2. 4 4
5292 Don't go through silly contortions to exit a loop at the top or the
5293 bottom, when
5294 .I perl
5295 provides the "last" operator so you can exit in the middle.
5296 Just outdent it a little to make it more visible:
5297 .nf
5298
5299 .ne 7
5300     line:
5301         for (;;) {
5302             statements;
5303         last line if $foo;
5304             next line if /^#/;
5305             statements;
5306         }
5307
5308 .fi
5309 .Ip 3. 4 4
5310 Don't be afraid to use loop labels\*(--they're there to enhance readability as
5311 well as to allow multi-level loop breaks.
5312 See last example.
5313 .Ip 4. 4 4
5314 For portability, when using features that may not be implemented on every
5315 machine, test the construct in an eval to see if it fails.
5316 If you know in what version or patchlevel a particular feature was implemented,
5317 you can test $] to see if it will be there.
5318 .Ip 5. 4 4
5319 Choose mnemonic identifiers.
5320 .Ip 6. 4 4
5321 Be consistent.
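.PP
Guideline 4 can be sketched as follows.
The symlink probe is one possible test of a feature that is a fatal error where
unsupported; the threshold 4.0 and the variable names are illustrative assumptions only.
.nf

```perl
# Probe for symlink(), which is fatal at run time where unsupported.
$symlink_exists = (eval 'symlink("","");', $@ eq '');
print "symbolic links are ", ($symlink_exists ? "" : "not "), "supported\n";

# Or compare $] numerically; 4.0 is only an illustrative threshold.
$new_enough = ($] >= 4.0);
warn "This script wants perl 4.0 or later\n" unless $new_enough;
```

.fi
The eval keeps a missing feature from killing the script, so the result can be
used to choose a fallback.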
5322 .Sh "Debugging"
5323 If you invoke
5324 .I perl
5325 with a
5326 .B \-d
5327 switch, your script will be run under a debugging monitor.
5328 It will halt before the first executable statement and ask you for a
5329 command, such as:
5330 .Ip "h" 12 4
5331 Prints out a help message.
5332 .Ip "T" 12 4
5333 Stack trace.
5334 .Ip "s" 12 4
5335 Single step.
5336 Executes until it reaches the beginning of another statement.
5337 .Ip "n" 12 4
5338 Next.
5339 Executes over subroutine calls, until it reaches the beginning of the
5340 next statement.
5341 .Ip "f" 12 4
5342 Finish.
5343 Executes statements until it has finished the current subroutine.
5344 .Ip "c" 12 4
5345 Continue.
5346 Executes until the next breakpoint is reached.
5347 .Ip "c line" 12 4
5348 Continue to the specified line.
5349 Inserts a one-time-only breakpoint at the specified line.
5350 .Ip "<CR>" 12 4
5351 Repeat last n or s.
5352 .Ip "l min+incr" 12 4
5353 List incr+1 lines starting at min.
5354 If min is omitted, starts where last listing left off.
5355 If incr is omitted, previous value of incr is used.
5356 .Ip "l min-max" 12 4
5357 List lines in the indicated range.
5358 .Ip "l line" 12 4
5359 List just the indicated line.
5360 .Ip "l" 12 4
5361 List next window.
5362 .Ip "-" 12 4
5363 List previous window.
5364 .Ip "w line" 12 4
5365 List window around line.
5366 .Ip "l subname" 12 4
5367 List subroutine.
5368 If it's a long subroutine it just lists the beginning.
5369 Use \*(L"l\*(R" to list more.
5370 .Ip "/pattern/" 12 4
5371 Regular expression search forward for pattern; the final / is optional.
5372 .Ip "?pattern?" 12 4
5373 Regular expression search backward for pattern; the final ? is optional.
5374 .Ip "L" 12 4
5375 List lines that have breakpoints or actions.
5376 .Ip "S" 12 4
5377 Lists the names of all subroutines.
5378 .Ip "t" 12 4
5379 Toggle trace mode on or off.
5380 .Ip "b line condition" 12 4
5381 Set a breakpoint.
5382 If line is omitted, sets a breakpoint on the
5383 line that is about to be executed.
5384 If a condition is specified, it is evaluated each time the statement is
5385 reached and a breakpoint is taken only if the condition is true.
5386 Breakpoints may only be set on lines that begin an executable statement.
5387 .Ip "b subname condition" 12 4
5388 Set breakpoint at first executable line of subroutine.
5389 .Ip "d line" 12 4
5390 Delete breakpoint.
5391 If line is omitted, deletes the breakpoint on the
5392 line that is about to be executed.
5393 .Ip "D" 12 4
5394 Delete all breakpoints.
5395 .Ip "a line command" 12 4
5396 Set an action for line.
5397 A multi-line command may be entered by backslashing the newlines.
5398 .Ip "A" 12 4
5399 Delete all line actions.
5400 .Ip "< command" 12 4
5401 Set an action to happen before every debugger prompt.
5402 A multi-line command may be entered by backslashing the newlines.
5403 .Ip "> command" 12 4
5404 Set an action to happen after the prompt when you've just given a command
5405 to return to executing the script.
5406 A multi-line command may be entered by backslashing the newlines.
5407 .Ip "V package" 12 4
5408 List all variables in package.
5409 Default is main package.
5410 .Ip "! number" 12 4
5411 Redo a debugging command.
5412 If number is omitted, redoes the previous command.
5413 .Ip "! -number" 12 4
5414 Redo the command that was that many commands ago.
5415 .Ip "H -number" 12 4
5416 Display the last number commands.
5417 Only commands longer than one character are listed.
5418 If number is omitted, lists them all.
5419 .Ip "q or ^D" 12 4
5420 Quit.
5421 .Ip "command" 12 4
5422 Execute command as a perl statement.
5423 A missing semicolon will be supplied.
5424 .Ip "p expr" 12 4
5425 Same as \*(L"print DB'OUT expr\*(R".
5426 The DB'OUT filehandle is opened to /dev/tty, regardless of where STDOUT
5427 may be redirected to.
5428 .PP
5429 If you want to modify the debugger, copy perldb.pl from the perl library
5430 to your current directory and modify it as necessary.
5431 (You'll also have to put -I. on your command line.)
5432 You can do some customization by setting up a .perldb file which contains
5433 initialization code.
5434 For instance, you could make aliases like these:
5435 .nf
5436
5437     $DB'alias{'len'} = 's/^len(.*)/p length($1)/';
5438     $DB'alias{'stop'} = 's/^stop (at|in)/b/';
5439     $DB'alias{'.'} =
5440       's/^\e./p "\e$DB\e'sub(\e$DB\e'line):\et",\e$DB\e'line[\e$DB\e'line]/';
5441
5442 .fi
5443 .Sh "Setuid Scripts"
5444 .I Perl
5445 is designed to make it easy to write secure setuid and setgid scripts.
5446 Unlike shells, which are based on multiple substitution passes on each line
5447 of the script,
5448 .I perl
5449 uses a more conventional evaluation scheme with fewer hidden \*(L"gotchas\*(R".
5450 Additionally, since the language has more built-in functionality, it
5451 has to rely less upon external (and possibly untrustworthy) programs to
5452 accomplish its purposes.
5453 .PP
5454 In an unpatched 4.2 or 4.3bsd kernel, setuid scripts are intrinsically
5455 insecure, but this kernel feature can be disabled.
5456 If it is,
5457 .I perl
5458 can emulate the setuid and setgid mechanism when it notices the otherwise
5459 useless setuid/gid bits on perl scripts.
5460 If the kernel feature isn't disabled,
5461 .I perl
5462 will complain loudly that your setuid script is insecure.
5463 You'll need to either disable the kernel setuid script feature, or put
5464 a C wrapper around the script.
5465 .PP
5466 When perl is executing a setuid script, it takes special precautions to
5467 prevent you from falling into any obvious traps.
5468 (In some ways, a perl script is more secure than the corresponding
5469 C program.)
5470 Any command line argument, environment variable, or input is marked as
5471 \*(L"tainted\*(R", and may not be used, directly or indirectly, in any
5472 command that invokes a subshell, or in any command that modifies files,
5473 directories or processes.
5474 Any variable that is set within an expression that has previously referenced
5475 a tainted value also becomes tainted (even if it is logically impossible
5476 for the tainted value to influence the variable).
5477 For example:
5478 .nf
5479
5480 .ne 5
5481         $foo = shift;                   # $foo is tainted
5482         $bar = $foo,\'bar\';            # $bar is also tainted
5483         $xxx = <>;                      # Tainted
5484         $path = $ENV{\'PATH\'}; # Tainted, but see below
5485         $abc = \'abc\';                 # Not tainted
5486
5487 .ne 4
5488         system "echo $foo";             # Insecure
5489         system "/bin/echo", $foo;       # Secure (doesn't use sh)
5490         system "echo $bar";             # Insecure
5491         system "echo $abc";             # Insecure until PATH set
5492
5493 .ne 5
5494         $ENV{\'PATH\'} = \'/bin:/usr/bin\';
5495         $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\';
5496
5497         $path = $ENV{\'PATH\'}; # Not tainted
5498         system "echo $abc";             # Is secure now!
5499
5500 .ne 5
5501         open(FOO,"$foo");               # OK
5502         open(FOO,">$foo");              # Not OK
5503
5504         open(FOO,"echo $foo|"); # Not OK, but...
5505         open(FOO,"-|") || exec \'echo\', $foo;  # OK
5506
5507         $zzz = `echo $foo`;             # Insecure, zzz tainted
5508
5509         unlink $abc,$foo;               # Insecure
5510         umask $foo;                     # Insecure
5511
5512 .ne 3
5513         exec "echo $foo";               # Insecure
5514         exec "echo", $foo;              # Secure (doesn't use sh)
5515         exec "sh", \'-c\', $foo;        # Considered secure, alas
5516
5517 .fi
5518 The taintedness is associated with each scalar value, so some elements
5519 of an array can be tainted, and others not.
5520 .PP
5521 If you try to do something insecure, you will get a fatal error saying
5522 something like \*(L"Insecure dependency\*(R" or \*(L"Insecure PATH\*(R".
5523 Note that you can still write an insecure system call or exec,
5524 but only by explicitly doing something like the last example above.
5525 You can also bypass the tainting mechanism by referencing
5526 subpatterns\*(--\c
5527 .I perl
5528 presumes that if you reference a substring using $1, $2, etc., you knew
5529 what you were doing when you wrote the pattern:
5530 .nf
5531
5532         $ARGV[0] =~ /^\-P(\ew+)$/;
5533         $printer = $1;          # Not tainted
5534
5535 .fi
5536 This is fairly secure since \ew+ doesn't match shell metacharacters.
5537 Use of .+ would have been insecure, but
5538 .I perl
5539 doesn't check for that, so you must be careful with your patterns.
5540 This is the ONLY mechanism for untainting user supplied filenames if you
5541 want to do file operations on them (unless you make $> equal to $<).
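.PP
The pattern above is applied without checking whether it matched.
As a hedged sketch (the variable $arg and the printer name are hypothetical
stand-ins for a real argument), testing the match first ensures that $1 comes
from this pattern and not from some earlier one:
.nf

```perl
# $arg stands in for a user-supplied argument such as $ARGV[0].
$arg = '-Plaser1';
if ($arg =~ /^-P(\w+)$/) {
    $printer = $1;              # only word characters got through
} else {
    die "Bad printer argument\n";
}
print "printer is $printer\n";
```

.fi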
5542 .PP
5543 It's also possible to get into trouble with other operations that don't care
5544 whether they use tainted values.
5545 Make judicious use of the file tests in dealing with any user-supplied
5546 filenames.
5547 When possible, do opens and such after setting $> = $<.
5548 .I Perl
5549 doesn't prevent you from opening tainted filenames for reading, so be
5550 careful what you print out.
5551 The tainting mechanism is intended to prevent stupid mistakes, not to remove
5552 the need for thought.
5553 .SH ENVIRONMENT
5554 .I Perl
5555 uses PATH in executing subprocesses, and in finding the script if \-S
5556 is used.
5557 HOME or LOGDIR are used if chdir has no argument.
5558 .PP
5559 Apart from these,
5560 .I perl
5561 uses no environment variables, except to make them available
5562 to the script being executed, and to child processes.
5563 However, scripts running setuid would do well to execute the following lines
5564 before doing anything else, just to keep people honest:
5565 .nf
5566
5567 .ne 3
5568     $ENV{\'PATH\'} = \'/bin:/usr/bin\';    # or whatever you need
5569     $ENV{\'SHELL\'} = \'/bin/sh\' if $ENV{\'SHELL\'} ne \'\';
5570     $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\';
5571
5572 .fi
5573 .SH AUTHOR
5574 Larry Wall <lwall@jpl-devvax.Jpl.Nasa.Gov>
5575 .br
5576 MS-DOS port by Diomidis Spinellis <dds@cc.ic.ac.uk>
5577 .SH FILES
5578 /tmp/perl\-eXXXXXX      temporary file for
5579 .B \-e
5580 commands.
5581 .SH SEE ALSO
5582 a2p     awk to perl translator
5583 .br
5584 s2p     sed to perl translator
5585 .SH DIAGNOSTICS
5586 Compilation errors will tell you the line number of the error, with an
5587 indication of the next token or token type that was to be examined.
5588 (In the case of a script passed to
5589 .I perl
5590 via
5591 .B \-e
5592 switches, each
5593 .B \-e
5594 is counted as one line.)
5595 .PP
5596 Setuid scripts have additional constraints that can produce error messages
5597 such as \*(L"Insecure dependency\*(R".
5598 See the section on setuid scripts.
5599 .SH TRAPS
5600 Accustomed
5601 .IR awk
5602 users should take special note of the following:
5603 .Ip * 4 2
5604 Semicolons are required after all simple statements in
5605 .IR perl .
5606 Newline
5607 is not a statement delimiter.
5608 .Ip * 4 2
5609 Curly brackets are required on ifs and whiles.
5610 .Ip * 4 2
5611 Variables begin with $ or @ in
5612 .IR perl .
5613 .Ip * 4 2
5614 Arrays index from 0 unless you set $[.
5615 Likewise string positions in substr() and index().
5616 .Ip * 4 2
5617 You have to decide whether your array has numeric or string indices.
5618 .Ip * 4 2
5619 Associative array values do not spring into existence upon mere reference.
5620 .Ip * 4 2
5621 You have to decide whether you want to use string or numeric comparisons.
5622 .Ip * 4 2
5623 Reading an input line does not split it for you.  You get to split it yourself
5624 into an array.
5625 And the
5626 .I split
5627 operator has different arguments.
5628 .Ip * 4 2
5629 The current input line is normally in $_, not $0.
5630 It generally does not have the newline stripped.
5631 ($0 is the name of the program executed.)
5632 .Ip * 4 2
5633 $<digit> does not refer to fields\*(--it refers to substrings matched by the last
5634 match pattern.
5635 .Ip * 4 2
5636 The
5637 .I print
5638 statement does not add field and record separators unless you set
5639 $, and $\e.
5640 .Ip * 4 2
5641 You must open your files before you print to them.
5642 .Ip * 4 2
5643 The range operator is \*(L".\|.\*(R", not comma.
5644 (The comma operator works as in C.)
5645 .Ip * 4 2
5646 The match operator is \*(L"=~\*(R", not \*(L"~\*(R".
5647 (\*(L"~\*(R" is the one's complement operator, as in C.)
5648 .Ip * 4 2
5649 The exponentiation operator is \*(L"**\*(R", not \*(L"^\*(R".
5650 (\*(L"^\*(R" is the XOR operator, as in C.)
5651 .Ip * 4 2
5652 The concatenation operator is \*(L".\*(R", not the null string.
5653 (Using the null string would render \*(L"/pat/ /pat/\*(R" unparsable,
5654 since the third slash would be interpreted as a division operator\*(--the
5655 tokener is in fact slightly context sensitive for operators like /, ?, and <.
5656 And in fact, . itself can be the beginning of a number.)
5657 .Ip * 4 2
5658 .IR Next ,
5659 .I exit
5660 and
5661 .I continue
5662 work differently.
5663 .Ip * 4 2
5664 The following variables work differently:
5665 .nf
5666
5667           Awk   \h'|2.5i'Perl
5668           ARGC  \h'|2.5i'$#ARGV
5669           ARGV[0]       \h'|2.5i'$0
5670           FILENAME\h'|2.5i'$ARGV
5671           FNR   \h'|2.5i'$. \- something
5672           FS    \h'|2.5i'(whatever you like)
5673           NF    \h'|2.5i'$#Fld, or some such
5674           NR    \h'|2.5i'$.
5675           OFMT  \h'|2.5i'$#
5676           OFS   \h'|2.5i'$,
5677           ORS   \h'|2.5i'$\e
5678           RLENGTH       \h'|2.5i'length($&)
5679           RS    \h'|2.5i'$/
5680           RSTART        \h'|2.5i'length($\`)
5681           SUBSEP        \h'|2.5i'$;
5682
5683 .fi
5684 .Ip * 4 2
5685 When in doubt, run the
5686 .I awk
5687 construct through a2p and see what it gives you.
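.PP
Several of the points above can be seen together in a rough perl counterpart of
the awk program that prints the second field and NF.
This is only a sketch on one hypothetical input line; the names @Fld and $NF are
illustrative.
.nf

```perl
# Roughly awk's  { print $2, NF }  applied to one hypothetical line.
$_ = "alpha beta gamma\n";
chop;                        # the newline is not stripped for you
@Fld = split(' ', $_);       # nor is the line split into fields
$NF = $#Fld + 1;             # arrays index from 0 unless $[ is set
$, = ' ';                    # like OFS
$\ = "\n";                   # like ORS
print $Fld[1], $NF;          # awk's $2 is $Fld[1]
```

.fi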
5688 .PP
5689 Cerebral C programmers should take note of the following:
5690 .Ip * 4 2
5691 Curly brackets are required on ifs and whiles.
5692 .Ip * 4 2
5693 You should use \*(L"elsif\*(R" rather than \*(L"else if\*(R".
5694 .Ip * 4 2
5695 .I Break
5696 and
5697 .I continue
5698 become
5699 .I last
5700 and
5701 .IR next ,
5702 respectively.
5703 .Ip * 4 2
5704 There's no switch statement.
5705 .Ip * 4 2
5706 Variables begin with $ or @ in
5707 .IR perl .
5708 .Ip * 4 2
5709 Printf does not implement *.
5710 .Ip * 4 2
5711 Comments begin with #, not /*.
5712 .Ip * 4 2
5713 You can't take the address of anything.
5714 .Ip * 4 2
5715 ARGV must be capitalized.
5716 .Ip * 4 2
5717 The \*(L"system\*(R" calls link, unlink, rename, etc. return nonzero for success, not 0.
5718 .Ip * 4 2
5719 Signal handlers deal with signal names, not numbers.
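.PP
A short sketch of two of these points, the return value of unlink and the
spelling of elsif.
The scratch file name is hypothetical, and /tmp is assumed to be writable.
.nf

```perl
$tmp = "/tmp/perl-trap-demo.$$";       # hypothetical scratch file
open(TMP, ">$tmp") || die "Can't create $tmp: $!";
close(TMP);

$removed = unlink($tmp);               # nonzero means success
if ($removed == 0) {
    warn "Can't unlink $tmp: $!\n";
} elsif (-e $tmp) {                    # elsif, not else if
    warn "$tmp is still there\n";
}
```

.fi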
5720 .PP
5721 Seasoned
5722 .I sed
5723 programmers should take note of the following:
5724 .Ip * 4 2
5725 Backreferences in substitutions use $ rather than \e.
5726 .Ip * 4 2
5727 The pattern matching metacharacters (, ), and | do not have backslashes in front.
5728 .Ip * 4 2
5729 The range operator is .\|. rather than comma.
5730 .PP
5731 Sharp shell programmers should take note of the following:
5732 .Ip * 4 2
5733 The backtick operator does variable interpretation without regard to the
5734 presence of single quotes in the command.
5735 .Ip * 4 2
5736 The backtick operator does no translation of the return value, unlike csh.
5737 .Ip * 4 2
5738 Shells (especially csh) do several levels of substitution on each command line.
5739 .I Perl
5740 does substitution only in certain constructs such as double quotes,
5741 backticks, angle brackets and search patterns.
5742 .Ip * 4 2
5743 Shells interpret scripts a little bit at a time.
5744 .I Perl
5745 compiles the whole program before executing it.
5746 .Ip * 4 2
5747 The arguments are available via @ARGV, not $1, $2, etc.
5748 .Ip * 4 2
5749 The environment is not automatically made available as variables.
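.PP
The first two points can be sketched as follows, assuming a Unix shell with an
echo command; the variable names are hypothetical.
.nf

```perl
$word = 'hello';
$out = `echo '$word'`;      # perl interpolates $word despite the single quotes
print "got: $out";          # the trailing newline is still there
```

.fi
The shell only ever sees the already-interpolated command, and the output comes
back exactly as the command wrote it.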
5750 .SH ERRATA\0AND\0ADDENDA
5751 The Perl book,
5752 .IR Programming\0Perl ,
5753 has the following omissions and goofs.
5754 .PP
5755 On page 5, the examples which read
5756 .nf
5757
5758         eval "/usr/bin/perl
5759
5760 should read
5761
5762         eval "exec /usr/bin/perl
5763
5764 .fi
5765 .PP
5766 On page 195, the equivalent to the System V sum program only works for
5767 very small files.  To do larger files, use
5768 .nf
5769
5770         undef $/;
5771         $checksum = unpack("%32C*",<>) % 32767;
5772
5773 .fi
5774 .PP
5775 The
5776 .B \-0
5777 switch to set the initial value of $/ was added to Perl after the book
5778 went to press.
5779 .PP
5780 The
5781 .B \-l
5782 switch now does automatic line ending processing.
5783 .PP
5784 The qx// construct is now a synonym for backticks.
5785 .PP
5786 $0 may now be assigned to set the argument displayed by
5787 .IR ps (1).
5788 .PP
5789 The new @###.## format was omitted accidentally from the description
5790 on formats.
5791 .PP
5792 It wasn't known at press time that s///ee caused multiple evaluations of
5793 the replacement expression.  This is to be construed as a feature.
5794 .PP
5795 (LIST) x $count now does array replication.
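.PP
A brief sketch of the difference the parentheses make (the variable names are
hypothetical):
.nf

```perl
@dashes = ('-') x 5;            # five one-character elements
@pairs  = (1, 2) x 3;           # 1 2 1 2 1 2
$line   = '-' x 5;              # without the parentheses: string repetition
print join(' ', @pairs), "\n";
```

.fi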
5796 .PP
5797 There is now no limit on the number of parentheses in a regular expression.
5798 .PP
5799 In double-quote context, more escapes are supported: \ee, \ea, \ex1b, \ec[,
5800 \el, \eL, \eu, \eU, \eE.  The latter five control up/lower case translation.
5801 .PP
5802 The
5803 .B $/
5804 variable may now be set to a multi-character delimiter.
5805 .SH BUGS
5806 .PP
5807 .I Perl
5808 is at the mercy of your machine's definitions of various operations
5809 such as type casting, atof() and sprintf().
5810 .PP
5811 If your stdio requires a seek or eof between reads and writes on a particular
5812 stream, so does
5813 .IR perl .
5814 (This doesn't apply to sysread() and syswrite().)
5815 .PP
5816 While none of the built-in data types have any arbitrary size limits (apart
5817 from memory size), there are still a few arbitrary limits:
5818 a given identifier may not be longer than 255 characters;
5819 sprintf is limited on many machines to 128 characters per field (unless the format
5820 specifier is exactly %s);
5821 and no component of your PATH may be longer than 255 if you use \-S.
5822 .PP
5823 .I Perl
5824 actually stands for Pathologically Eclectic Rubbish Lister, but don't tell
5825 anyone I said that.
5826 .rn }` ''