perl.man

   1 .rn '' }`
   2 ''' $RCSfile: perl.man,v $$Revision: 4.0.1.3 $$Date: 91/06/10 01:26:02 $
   3 '''
   4 ''' $Log:       perl.man,v $
   5 ''' Revision 4.0.1.3  91/06/10  01:26:02  lwall
   6 ''' patch10: documented some newer features in addenda
   7 '''
   8 ''' Revision 4.0.1.2  91/06/07  11:41:23  lwall
   9 ''' patch4: added global modifier for pattern matches
  10 ''' patch4: default top-of-form format is now FILEHANDLE_TOP
  11 ''' patch4: added $^P variable to control calling of perldb routines
  12 ''' patch4: added $^F variable to specify maximum system fd, default 2
  13 ''' patch4: changed old $^P to $^X
  14 '''
  15 ''' Revision 4.0.1.1  91/04/11  17:50:44  lwall
  16 ''' patch1: fixed some typos
  17 '''
  18 ''' Revision 4.0  91/03/20  01:38:08  lwall
  19 ''' 4.0 baseline.
  20 '''
  21 '''
  22 .de Sh
  23 .br
  24 .ne 5
  25 .PP
  26 \fB\\$1\fR
  27 .PP
  28 ..
  29 .de Sp
  30 .if t .sp .5v
  31 .if n .sp
  32 ..
  33 .de Ip
  34 .br
  35 .ie \\n(.$>=3 .ne \\$3
  36 .el .ne 3
  37 .IP "\\$1" \\$2
  38 ..
  39 '''
  40 '''     Set up \*(-- to give an unbreakable dash;
  41 '''     string Tr holds user defined translation string.
  42 '''     Bell System Logo is used as a dummy character.
  43 '''
  44 .tr \(*W-|\(bv\*(Tr
  45 .ie n \{\
  46 .ds -- \(*W-
  47 .if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
  48 .if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
  49 .ds L" ""
  50 .ds R" ""
  51 .ds L' '
  52 .ds R' '
  53 'br\}
  54 .el\{\
  55 .ds -- \(em\|
  56 .tr \*(Tr
  57 .ds L" ``
  58 .ds R" ''
  59 .ds L' `
  60 .ds R' '
  61 'br\}
  62 .TH PERL 1 "\*(RP"
  63 .UC
  64 .SH NAME
  65 perl \- Practical Extraction and Report Language
  66 .SH SYNOPSIS
  67 .B perl
  68 [options] filename args
  69 .SH DESCRIPTION
  70 .I Perl
  71 is an interpreted language optimized for scanning arbitrary text files,
  72 extracting information from those text files, and printing reports based
  73 on that information.
  74 It's also a good language for many system management tasks.
  75 The language is intended to be practical (easy to use, efficient, complete)
  76 rather than beautiful (tiny, elegant, minimal).
  77 It combines (in the author's opinion, anyway) some of the best features of C,
  78 \fIsed\fR, \fIawk\fR, and \fIsh\fR,
  79 so people familiar with those languages should have little difficulty with it.
  80 (Language historians will also note some vestiges of \fIcsh\fR, Pascal, and
  81 even BASIC-PLUS.)
  82 Expression syntax corresponds quite closely to C expression syntax.
  83 Unlike most Unix utilities,
  84 .I perl
  85 does not arbitrarily limit the size of your data\*(--if you've got
  86 the memory,
  87 .I perl
  88 can slurp in your whole file as a single string.
  89 Recursion is of unlimited depth.
  90 And the hash tables used by associative arrays grow as necessary to prevent
  91 degraded performance.
  92 .I Perl
  93 uses sophisticated pattern matching techniques to scan large amounts of
  94 data very quickly.
  95 Although optimized for scanning text,
  96 .I perl
  97 can also deal with binary data, and can make dbm files look like associative
  98 arrays (where dbm is available).
  99 Setuid
 100 .I perl
 101 scripts are safer than C programs
 102 through a dataflow tracing mechanism which prevents many stupid security holes.
 103 If you have a problem that would ordinarily use \fIsed\fR
 104 or \fIawk\fR or \fIsh\fR, but it
 105 exceeds their capabilities or must run a little faster,
 106 and you don't want to write the silly thing in C, then
 107 .I perl
 108 may be for you.
 109 There are also translators to turn your
 110 .I sed
 111 and
 112 .I awk
 113 scripts into
 114 .I perl
 115 scripts.
 116 OK, enough hype.
 117 .PP
 118 Upon startup,
 119 .I perl
 120 looks for your script in one of the following places:
 121 .Ip 1. 4 2
 122 Specified line by line via
 123 .B \-e
 124 switches on the command line.
 125 .Ip 2. 4 2
 126 Contained in the file specified by the first filename on the command line.
 127 (Note that systems supporting the #! notation invoke interpreters this way.)
 128 .Ip 3. 4 2
 129 Passed in implicitly via standard input.
 130 This only works if there are no filename arguments\*(--to pass
 131 arguments to a
 132 .I stdin
 133 script you must explicitly specify a \- for the script name.
 134 .PP
 135 After locating your script,
 136 .I perl
 137 compiles it to an internal form.
 138 If the script is syntactically correct, it is executed.
 139 .Sh "Options"
 140 Note: on first reading this section may not make much sense to you.  It's here
 141 at the front for easy reference.
 142 .PP
 143 A single-character option may be combined with the following option, if any.
 144 This is particularly useful when invoking a script using the #! construct which
 145 only allows one argument.  Example:
 146 .nf
 147
 148 .ne 2
 149         #!/usr/bin/perl \-spi.bak       # same as \-s \-p \-i.bak
 150         .\|.\|.
 151
 152 .fi
 153 Options include:
 154 .TP 5
 155 .BI \-0 digits
 156 specifies the record separator ($/) as an octal number.
 157 If there are no digits, the null character is the separator.
 158 Other switches may precede or follow the digits.
 159 For example, if you have a version of
 160 .I find
 161 which can print filenames terminated by the null character, you can say this:
 162 .nf
 163
 164     find . \-name '*.bak' \-print0 | perl \-n0e unlink
 165
 166 .fi
 167 The special value 00 will cause Perl to slurp files in paragraph mode.
 168 The value 0777 will cause Perl to slurp files whole since there is no
 169 legal character with that value.
 170 .TP 5
 171 .B \-a
 172 turns on autosplit mode when used with a
 173 .B \-n
 174 or
 175 .BR \-p .
 176 An implicit split command to the @F array
 177 is done as the first thing inside the implicit while loop produced by
 178 the
 179 .B \-n
 180 or
 181 .BR \-p .
 182 .nf
 183
 184         perl \-ane \'print pop(@F), "\en";\'
 185
 186 is equivalent to
 187
 188         while (<>) {
 189                 @F = split(\' \');
 190                 print pop(@F), "\en";
 191         }
 192
 193 .fi
 194 .TP 5
 195 .B \-c
 196 causes
 197 .I perl
 198 to check the syntax of the script and then exit without executing it.
 199 .TP 5
 200 .BI \-d
 201 runs the script under the perl debugger.
 202 See the section on Debugging.
 203 .TP 5
 204 .BI \-D number
 205 sets debugging flags.
 206 To watch how it executes your script, use
 207 .BR \-D14 .
 208 (This only works if debugging is compiled into your
 209 .IR perl .)
 210 Another nice value is \-D1024, which lists your compiled syntax tree.
 211 And \-D512 displays compiled regular expressions.
 212 .TP 5
 213 .BI \-e " commandline"
 214 may be used to enter one line of script.
 215 Multiple
 216 .B \-e
 217 commands may be given to build up a multi-line script.
 218 If
 219 .B \-e
 220 is given,
 221 .I perl
 222 will not look for a script filename in the argument list.
 223 .TP 5
 224 .BI \-i extension
 225 specifies that files processed by the <> construct are to be edited
 226 in-place.
 227 It does this by renaming the input file, opening the output file by the
 228 same name, and selecting that output file as the default for print statements.
 229 The extension, if supplied, is added to the name of the
 230 old file to make a backup copy.
 231 If no extension is supplied, no backup is made.
 232 Saying \*(L"perl \-p \-i.bak \-e "s/foo/bar/;" .\|.\|. \*(R" is the same as using
 233 the script:
 234 .nf
 235
 236 .ne 2
 237         #!/usr/bin/perl \-pi.bak
 238         s/foo/bar/;
 239
 240 which is equivalent to
 241
 242 .ne 14
 243         #!/usr/bin/perl
 244         while (<>) {
 245                 if ($ARGV ne $oldargv) {
 246                         rename($ARGV, $ARGV . \'.bak\');
 247                         open(ARGVOUT, ">$ARGV");
 248                         select(ARGVOUT);
 249                         $oldargv = $ARGV;
 250                 }
 251                 s/foo/bar/;
 252         }
 253         continue {
 254             print;      # this prints to original filename
 255         }
 256         select(STDOUT);
 257
 258 .fi
 259 except that the
 260 .B \-i
 261 form doesn't need to compare $ARGV to $oldargv to know when
 262 the filename has changed.
 263 It does, however, use ARGVOUT for the selected filehandle.
 264 Note that
 265 .I STDOUT
 266 is restored as the default output filehandle after the loop.
 267 .Sp
 268 You can use eof to locate the end of each input file, in case you want
 269 to append to each file, or reset line numbering (see example under eof).
 270 .TP 5
 271 .BI \-I directory
 272 may be used in conjunction with
 273 .B \-P
 274 to tell the C preprocessor where to look for include files.
 275 By default /usr/include and /usr/lib/perl are searched.
 276 .TP 5
 277 .BI \-l octnum
 278 enables automatic line-ending processing.  It has two effects:
 279 first, it automatically chops the line terminator when used with
 280 .B \-n
 281 or
 282 .BR \-p ,
 283 and second, it assigns $\e to have the value of
 284 .I octnum
 285 so that any print statements will have that line terminator added back on.  If
 286 .I octnum
 287 is omitted, sets $\e to the current value of $/.
 288 For instance, to trim lines to 80 columns:
 289 .nf
 290
 291         perl -lpe \'substr($_, 80) = ""\'
 292
 293 .fi
 294 Note that the assignment $\e = $/ is done when the switch is processed,
 295 so the input record separator can be different than the output record
 296 separator if the
 297 .B \-l
 298 switch is followed by a
 299 .B \-0
 300 switch:
 301 .nf
 302
 303         gnufind / -print0 | perl -ln0e 'print "found $_" if -p'
 304
 305 .fi
 306 This sets $\e to newline and then sets $/ to the null character.
 307 .TP 5
 308 .B \-n
 309 causes
 310 .I perl
 311 to assume the following loop around your script, which makes it iterate
 312 over filename arguments somewhat like \*(L"sed \-n\*(R" or \fIawk\fR:
 313 .nf
 314
 315 .ne 3
 316         while (<>) {
 317                 .\|.\|.         # your script goes here
 318         }
 319
 320 .fi
 321 Note that the lines are not printed by default.
 322 See
 323 .B \-p
 324 to have lines printed.
 325 Here is an efficient way to delete all files older than a week:
 326 .nf
 327
 328         find . \-mtime +7 \-print | perl \-nle \'unlink;\'
 329
 330 .fi
 331 This is faster than using the \-exec switch of find because you don't have to
 332 start a process on every filename found.
 333 .TP 5
 334 .B \-p
 335 causes
 336 .I perl
 337 to assume the following loop around your script, which makes it iterate
 338 over filename arguments somewhat like \fIsed\fR:
 339 .nf
 340
 341 .ne 5
 342         while (<>) {
 343                 .\|.\|.         # your script goes here
 344         } continue {
 345                 print;
 346         }
 347
 348 .fi
 349 Note that the lines are printed automatically.
 350 To suppress printing use the
 351 .B \-n
 352 switch.
 353 A
 354 .B \-p
 355 overrides a
 356 .B \-n
 357 switch.
 358 .TP 5
 359 .B \-P
 360 causes your script to be run through the C preprocessor before
 361 compilation by
 362 .IR perl .
 363 (Since both comments and cpp directives begin with the # character,
 364 you should avoid starting comments with any words recognized
 365 by the C preprocessor such as \*(L"if\*(R", \*(L"else\*(R" or \*(L"define\*(R".)
 366 .TP 5
 367 .B \-s
 368 enables some rudimentary switch parsing for switches on the command line
 369 after the script name but before any filename arguments (or before a \-\|\-).
 370 Any switch found there is removed from @ARGV and sets the corresponding variable in the
 371 .I perl
 372 script.
 373 The following script prints \*(L"true\*(R" if and only if the script is
 374 invoked with a \-xyz switch.
 375 .nf
 376
 377 .ne 2
 378         #!/usr/bin/perl \-s
 379         if ($xyz) { print "true\en"; }
 380
 381 .fi
 382 .TP 5
 383 .B \-S
 384 makes
 385 .I perl
 386 use the PATH environment variable to search for the script
 387 (unless the name of the script starts with a slash).
 388 Typically this is used to emulate #! startup on machines that don't
 389 support #!, in the following manner:
 390 .nf
 391
 392         #!/usr/bin/perl
 393         eval "exec /usr/bin/perl \-S $0 $*"
 394                 if $running_under_some_shell;
 395
 396 .fi
 397 The system ignores the first line and feeds the script to /bin/sh,
 398 which proceeds to try to execute the
 399 .I perl
 400 script as a shell script.
 401 The shell executes the second line as a normal shell command, and thus
 402 starts up the
 403 .I perl
 404 interpreter.
 405 On some systems $0 doesn't always contain the full pathname,
 406 so the
 407 .B \-S
 408 tells
 409 .I perl
 410 to search for the script if necessary.
 411 After
 412 .I perl
 413 locates the script, it parses the lines and ignores them because
 414 the variable $running_under_some_shell is never true.
 415 A better construct than $* would be ${1+"$@"}, which handles embedded spaces
 416 and such in the filenames, but doesn't work if the script is being interpreted
 417 by csh.
 418 In order to start up sh rather than csh, some systems may have to replace the
 419 #! line with a line containing just
 420 a colon, which will be politely ignored by perl.
 421 Other systems can't control that, and need a totally devious construct that
 422 will work under any of csh, sh or perl, such as the following:
 423 .nf
 424
 425 .ne 3
 426         eval '(exit $?0)' && eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
 427         & eval 'exec /usr/bin/perl -S $0 $argv:q'
 428                 if 0;
 429
 430 .fi
 431 .TP 5
 432 .B \-u
 433 causes
 434 .I perl
 435 to dump core after compiling your script.
 436 You can then take this core dump and turn it into an executable file
 437 by using the undump program (not supplied).
 438 This speeds startup at the expense of some disk space (which you can
 439 minimize by stripping the executable).
 440 (Still, a "hello world" executable comes out to about 200K on my machine.)
 441 If you are going to run your executable as a set-id program then you
 442 should probably compile it using taintperl rather than normal perl.
 443 If you want to execute a portion of your script before dumping, use the
 444 dump operator instead.
 445 Note: undump is platform specific and may not be available
 446 for a particular port of perl.
 447 .TP 5
 448 .B \-U
 449 allows
 450 .I perl
 451 to do unsafe operations.
 452 Currently the only \*(L"unsafe\*(R" operation is the unlinking of directories while
 453 running as superuser.
 454 .TP 5
 455 .B \-v
 456 prints the version and patchlevel of your
 457 .I perl
 458 executable.
 459 .TP 5
 460 .B \-w
 461 prints warnings about identifiers that are mentioned only once, and scalar
 462 variables that are used before being set.
 463 Also warns about redefined subroutines, and references to undefined
 464 filehandles or filehandles opened readonly that you are attempting to
 465 write on.
 466 Also warns you if you use == on values that don't look like numbers, and if
 467 your subroutines recurse more than 100 deep.
 468 .TP 5
 469 .BI \-x directory
 470 tells
 471 .I perl
 472 that the script is embedded in a message.
 473 Leading garbage will be discarded until the first line that starts
 474 with #! and contains the string "perl".
 475 Any meaningful switches on that line will be applied (but only one
 476 group of switches, as with normal #! processing).
 477 If a directory name is specified, Perl will switch to that directory
 478 before running the script.
 479 The
 480 .B \-x
 481 switch only controls the disposal of leading garbage.
 482 The script must be terminated with __END__ if there is trailing garbage
 483 to be ignored (the script can process any or all of the trailing garbage
 484 via the DATA filehandle if desired).
 485 .Sh "Data Types and Objects"
 486 .PP
 487 .I Perl
 488 has three data types: scalars, arrays of scalars, and
 489 associative arrays of scalars.
 490 Normal arrays are indexed by number, and associative arrays by string.
 491 .PP
 492 The interpretation of operations and values in perl sometimes
 493 depends on the requirements
 494 of the context around the operation or value.
 495 There are three major contexts: string, numeric and array.
 496 Certain operations return array values
 497 in contexts wanting an array, and scalar values otherwise.
 498 (If this is true of an operation it will be mentioned in the documentation
 499 for that operation.)
 500 Operations which return scalars don't care whether the context is looking
 501 for a string or a number, but
 502 scalar variables and values are interpreted as strings or numbers
 503 as appropriate to the context.
 504 A scalar is interpreted as TRUE in the boolean sense if it is not the null
 505 string or 0.
 506 Booleans returned by operators are 1 for true and 0 or \'\' (the null
 507 string) for false.
 508 .PP
 509 There are actually two varieties of null string: defined and undefined.
 510 Undefined null strings are returned when there is no real value for something,
 511 such as when there was an error, or at end of file, or when you refer
 512 to an uninitialized variable or element of an array.
 513 An undefined null string may become defined the first time you access it, but
 514 prior to that you can use the defined() operator to determine whether the
 515 value is defined or not.
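The truth and definedness rules just described can be illustrated with a short sketch.
The values and the variable names ($never_set, $empty and so on) are made up for the example and are not part of perl itself; the code tries to stick to constructs covered in this manual.

```perl
# Sketch: truth values and defined(), using illustrative data only.
@values = ('', 0, '0', 'abc', 42);
@truth = ();
foreach $v (@values) {
    # The null string and 0 (as a number or as the string '0') are false.
    push(@truth, $v ? 'true' : 'false');
}
print join(' ', @truth), "\n";          # false false false true true

# $never_set is a hypothetical variable that is never assigned anywhere.
$was_defined = defined($never_set) ? 'defined' : 'undefined';
# $empty holds a null string that has been explicitly assigned.
$empty = '';
$empty_defined = defined($empty) ? 'defined' : 'undefined';
print "$was_defined $empty_defined\n";  # undefined defined
```

Both $never_set and $empty are false in a boolean test; only defined() tells them apart.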
 516 .PP
 517 References to scalar variables always begin with \*(L'$\*(R', even when referring
 518 to a scalar that is part of an array.
 519 Thus:
 520 .nf
 521
 522 .ne 3
 523     $days       \h'|2i'# a simple scalar variable
 524     $days[28]   \h'|2i'# 29th element of array @days
 525     $days{\'Feb\'}\h'|2i'# one value from an associative array
 526     $#days      \h'|2i'# last index of array @days
 527
 528 but entire arrays or array slices are denoted by \*(L'@\*(R':
 529
 530     @days       \h'|2i'# ($days[0], $days[1],\|.\|.\|. $days[n])
 531     @days[3,4,5]\h'|2i'# same as @days[3.\|.5]
 532     @days{'a','c'}\h'|2i'# same as ($days{'a'},$days{'c'})
 533
 534 and entire associative arrays are denoted by \*(L'%\*(R':
 535
 536     %days       \h'|2i'# (key1, val1, key2, val2 .\|.\|.)
 537 .fi
 538 .PP
 539 Any of these eight constructs may serve as an lvalue,
 540 that is, may be assigned to.
 541 (It also turns out that an assignment is itself an lvalue in
 542 certain contexts\*(--see examples under s, tr and chop.)
 543 Assignment to a scalar evaluates the righthand side in a scalar context,
 544 while assignment to an array or array slice evaluates the righthand side
 545 in an array context.
 546 .PP
 547 You may find the length of array @days by evaluating
 548 \*(L"$#days\*(R", as in
 549 .IR csh .
 550 (Actually, it's not the length of the array, it's the subscript of the last element, since there is (ordinarily) a 0th element.)
 551 Assigning to $#days changes the length of the array.
 552 Shortening an array by this method does not actually destroy any values.
 553 Lengthening an array that was previously shortened recovers the values that
 554 were in those elements.
 555 You can also gain some measure of efficiency by preextending an array that
 556 is going to get big.
 557 (You can also extend an array by assigning to an element that is off the
 558 end of the array.
 559 This differs from assigning to $#whatever in that intervening values
 560 are set to null rather than recovered.)
 561 You can truncate an array down to nothing by assigning the null list () to
 562 it.
 563 The following are exactly equivalent:
 564 .nf
 565
 566         @whatever = ();
 567         $#whatever = $[ \- 1;
 568
 569 .fi
 570 .PP
 571 If you evaluate an array in a scalar context, it returns the length of
 572 the array.
 573 The following is always true:
 574 .nf
 575
 576         @whatever == $#whatever \- $[ + 1;
 577
 578 .fi
 579 .PP
 580 Multi-dimensional arrays are not directly supported, but see the discussion
 581 of the $; variable later for a means of emulating multiple subscripts with
 582 an associative array.
 583 You could also write a subroutine to turn multiple subscripts into a single
 584 subscript.
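As a rough sketch of both approaches, the following emulates a two-dimensional array first with an associative array and then with a subscript-folding subroutine.
The names %grid, @cells, $width and &flat are hypothetical and chosen only for illustration; a fixed row width of 4 is assumed.

```perl
# 1. A comma-separated subscript on an associative array is joined with $;.
$grid{2,3} = 'x';
$key = join($;, 2, 3);
print "same element\n" if $grid{$key} eq 'x';

# 2. A subroutine that folds (row, column) into one subscript of a normal array.
$width = 4;                     # assumed number of columns per row
sub flat {
    local($row, $col) = @_;
    $row * $width + $col;       # value of the last expression is returned
}
$cells[&flat(1, 2)] = 'y';      # row 1, column 2 lands in element 6
print "element 6 is $cells[6]\n";
```

The associative form needs no fixed dimensions, while the subroutine form keeps the data in a normal array at the cost of choosing a width up front.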
 585 .PP
 586 Every data type has its own namespace.
 587 You can, without fear of conflict, use the same name for a scalar variable,
 588 an array, an associative array, a filehandle, a subroutine name, and/or
 589 a label.
 590 Since variable and array references always start with \*(L'$\*(R', \*(L'@\*(R',
 591 or \*(L'%\*(R', the \*(L"reserved\*(R" words aren't in fact reserved
 592 with respect to variable names.
 593 (They ARE reserved with respect to labels and filehandles, however, which
 594 don't have an initial special character.
 595 Hint: you could say open(LOG,\'logfile\') rather than open(log,\'logfile\').
 596 Using uppercase filehandles also improves readability and protects you
 597 from conflict with future reserved words.)
 598 Case IS significant\*(--\*(L"FOO\*(R", \*(L"Foo\*(R" and \*(L"foo\*(R" are all
 599 different names.
 600 Names which start with a letter may also contain digits and underscores.
 601 Names which do not start with a letter are limited to one character,
 602 e.g. \*(L"$%\*(R" or \*(L"$$\*(R".
 603 (Most of the one character names have a predefined significance to
 604 .IR perl .
 605 More later.)
 606 .PP
 607 Numeric literals are specified in any of the usual floating point or
 608 integer formats:
 609 .nf
 610
 611 .ne 5
 612     12345
 613     12345.67
 614     .23E-10
 615     0xffff      # hex
 616     0377        # octal
 617
 618 .fi
 619 String literals are delimited by either single or double quotes.
 620 They work much like shell quotes:
 621 double-quoted string literals are subject to backslash and variable
 622 substitution; single-quoted strings are not (except for \e\' and \e\e).
 623 The usual backslash rules apply for making characters such as newline, tab,
 624 etc., as well as some more exotic forms:
 625 .nf
 626
 627         \et             tab
 628         \en             newline
 629         \er             return
 630         \ef             form feed
 631         \eb             backspace
 632         \ea             alarm (bell)
 633         \ee             escape
 634         \e033           octal char
 635         \ex1b           hex char
 636         \ec[            control char
 637         \el             lowercase next char
 638         \eu             uppercase next char
 639         \eL             lowercase till \eE
 640         \eU             uppercase till \eE
 641         \eE             end case modification
 642
 643 .fi
 644 You can also embed newlines directly in your strings, i.e. they can end on
 645 a different line than they begin.
 646 This is nice, but if you forget your trailing quote, the error will not be
 647 reported until
 648 .I perl
 649 finds another line containing the quote character, which
 650 may be much further on in the script.
 651 Variable substitution inside strings is limited to scalar variables, normal
 652 array values, and array slices.
 653 (In other words, identifiers beginning with $ or @, followed by an optional
 654 bracketed expression as a subscript.)
 655 The following code segment prints out \*(L"The price is $100.\*(R"
 656 .nf
 657
 658 .ne 2
 659     $Price = \'$100\';\h'|3.5i'# not interpreted
 660     print "The price is $Price.\e\|n";\h'|3.5i'# interpreted
 661
 662 .fi
 663 Note that you can put curly brackets around the identifier to delimit it
 664 from following alphanumerics.
 665 Also note that a single quoted string must be separated from a preceding
 666 word by a space, since single quote is a valid character in an identifier
 667 (see Packages).
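A small sketch of the curly bracket rule, using a hypothetical variable $fruit:
without the brackets, the trailing letter is taken as part of the variable name.

```perl
$fruit = 'apple';                   # hypothetical variable for illustration
$plain  = "I like $fruits.";        # reads the unset variable $fruits
$braced = "I like ${fruit}s.";      # brackets end the name after "fruit"
print "$plain\n$braced\n";
# prints:
#   I like .
#   I like apples.
```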
 668 .PP
 669 Two special literals are __LINE__ and __FILE__, which represent the current
 670 line number and filename at that point in your program.
 671 They may only be used as separate tokens; they will not be interpolated
 672 into strings.
 673 In addition, the token __END__ may be used to indicate the logical end of the
 674 script before the actual end of file.
 675 Any following text is ignored (but may be read via the DATA filehandle).
 676 The two control characters ^D and ^Z are synonyms for __END__.
 677 .PP
 678 A word that doesn't have any other interpretation in the grammar will be
 679 treated as if it had single quotes around it.
 680 For this purpose, a word consists only of alphanumeric characters and underline,
 681 and must start with an alphabetic character.
 682 As with filehandles and labels, a bare word that consists entirely of
 683 lowercase letters risks conflict with future reserved words, and if you
 684 use the
 685 .B \-w
 686 switch, Perl will warn you about any such words.
 687 .PP
 688 Array values are interpolated into double-quoted strings by joining all the
 689 elements of the array with the delimiter specified in the $" variable,
 690 space by default.
 691 (Since in versions of perl prior to 3.0 the @ character was not a metacharacter
 692 in double-quoted strings, the interpolation of @array, $array[EXPR],
 693 @array[LIST], $array{EXPR}, or @array{LIST} only happens if array is
 694 referenced elsewhere in the program or is predefined.)
 695 The following are equivalent:
 696 .nf
 697
 698 .ne 4
 699         $temp = join($",@ARGV);
 700         system "echo $temp";
 701
 702         system "echo @ARGV";
 703
 704 .fi
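The effect of $" on array interpolation can be seen in a short sketch.
The array @parts and its contents are illustrative only.

```perl
@parts = ('usr', 'local', 'bin');   # illustrative data
$spaced = "@parts";                 # joined with the default $", a space
$" = '/';
$path = "/@parts";                  # now joined with slashes
$" = ' ';                           # restore the default delimiter
print "$spaced\n$path\n";
# prints:
#   usr local bin
#   /usr/local/bin
```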
 705 Within search patterns (which also undergo double-quotish substitution)
 706 there is a bad ambiguity:  Is /$foo[bar]/ to be
 707 interpreted as /${foo}[bar]/ (where [bar] is a character class for the
 708 regular expression) or as /${foo[bar]}/ (where [bar] is the subscript to
 709 array @foo)?
 710 If @foo doesn't otherwise exist, then it's obviously a character class.
 711 If @foo exists, perl takes a good guess about [bar], and is almost always right.
 712 If it does guess wrong, or if you're just plain paranoid,
 713 you can force the correct interpretation with curly brackets as above.
 714 .PP
 715 A line-oriented form of quoting is based on the shell here-is syntax.
 716 Following a << you specify a string to terminate the quoted material, and all lines
 717 following the current line down to the terminating string are the value
 718 of the item.
 719 The terminating string may be either an identifier (a word), or some
 720 quoted text.
 721 If quoted, the type of quotes you use determines the treatment of the text,
 722 just as in regular quoting.
 723 An unquoted identifier works like double quotes.
 724 There must be no space between the << and the identifier.
 725 (If you put a space it will be treated as a null identifier, which is
 726 valid, and matches the first blank line\*(--see Merry Christmas example below.)
 727 The terminating string must appear by itself (unquoted and with no surrounding
 728 whitespace) on the terminating line.
 729 .nf
 730
 731         print <<EOF;            # same as above
 732 The price is $Price.
 733 EOF
 734
 735         print <<"EOF";          # same as above
 736 The price is $Price.
 737 EOF
 738
 739         print << x 10;          # null identifier is delimiter
 740 Merry Christmas!
 741
 742         print <<`EOC`;          # execute commands
 743 echo hi there
 744 echo lo there
 745 EOC
 746
 747         print <<foo, <<bar;     # you can stack them
 748 I said foo.
 749 foo
 750 I said bar.
 751 bar
 752
 753 .fi
 754 Array literals are denoted by separating individual values by commas, and
 755 enclosing the list in parentheses:
 756 .nf
 757
 758         (LIST)
 759
 760 .fi
 761 In a context not requiring an array value, the value of the array literal
 762 is the value of the final element, as in the C comma operator.
 763 For example,
 764 .nf
 765
 766 .ne 4
 767     @foo = (\'cc\', \'\-E\', $bar);
 768
 769 assigns the entire array value to array foo, but
 770
 771     $foo = (\'cc\', \'\-E\', $bar);
 772
 773 .fi
 774 assigns the value of variable bar to variable foo.
 775 Note that the value of an actual array in a scalar context is the length
 776 of the array; the following assigns to $foo the value 3:
 777 .nf
 778
 779 .ne 2
 780     @foo = (\'cc\', \'\-E\', $bar);
 781     $foo = @foo;                # $foo gets 3
 782
 783 .fi
 784 You may have an optional comma before the closing parenthesis of an
 785 array literal, so that you can say:
 786 .nf
 787
 788     @foo = (
 789         1,
 790         2,
 791         3,
 792     );
 793
 794 .fi
 795 When a LIST is evaluated, each element of the list is evaluated in
 796 an array context, and the resulting array value is interpolated into LIST
 797 just as if each individual element were a member of LIST.  Thus arrays
 798 lose their identity in a LIST\*(--the list
 799
 800         (@foo,@bar,&SomeSub)
 801
 802 contains all the elements of @foo followed by all the elements of @bar,
 803 followed by all the elements returned by the subroutine named SomeSub.
 804 .PP
 805 A list value may also be subscripted like a normal array.
 806 Examples:
 807 .nf
 808
 809         $time = (stat($file))[8];       # stat returns array value
 810         $digit = ('a','b','c','d','e','f')[$digit-10];
 811         return (pop(@foo),pop(@foo))[0];
 812
 813 .fi
 814 .PP
 815 Array lists may be assigned to if and only if each element of the list
 816 is an lvalue:
 817 .nf
 818
 819     ($a, $b, $c) = (1, 2, 3);
 820
 821     ($map{\'red\'}, $map{\'blue\'}, $map{\'green\'}) = (0x00f, 0x0f0, 0xf00);
 822
 823 The final element may be an array or an associative array:
 824
 825     ($a, $b, @rest) = split;
 826     local($a, $b, %rest) = @_;
 827
 828 .fi
 829 You can actually put an array anywhere in the list, but the first array
 830 in the list will soak up all the values, and anything after it will get
 831 a null value.
 832 This may be useful in a local().
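The following sketch shows an array in the middle of a list soaking up the remaining values.
The variable names are arbitrary; the test against the null string is used so that the example does not depend on whether the leftover scalar counts as defined.

```perl
($first, @middle, $last) = (1, 2, 3, 4);
$count = @middle;                           # array in scalar context: its length
$last_state = ($last eq '') ? 'null' : 'set';
print "first=$first middle=@middle count=$count last=$last_state\n";
# prints: first=1 middle=2 3 4 count=3 last=null
```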
 833 .PP
 834 An associative array literal contains pairs of values to be interpreted
 835 as a key and a value:
 836 .nf
 837
 838 .ne 2
 839     # same as map assignment above
 840     %map = ('red',0x00f,'blue',0x0f0,'green',0xf00);
 841
 842 .fi
 843 Array assignment in a scalar context returns the number of elements
 844 produced by the expression on the right side of the assignment:
 845 .nf
 846
 847         $x = (($foo,$bar) = (3,2,1));   # set $x to 3, not 2
 848
 849 .fi
 850 .PP
 851 There are several other pseudo-literals that you should know about.
 852 If a string is enclosed by backticks (grave accents), it first undergoes
 853 variable substitution just like a double quoted string.
 854 It is then interpreted as a command, and the output of that command
 855 is the value of the pseudo-literal, like in a shell.
 856 In a scalar context, a single string consisting of all the output is
 857 returned.
 858 In an array context, an array of values is returned, one for each line
 859 of output.
 860 (You can set $/ to use a different line terminator.)
 861 The command is executed each time the pseudo-literal is evaluated.
 862 The status value of the command is returned in $? (see Predefined Names
 863 for the interpretation of $?).
 864 Unlike in \fIcsh\fR, no translation is done on the return
 865 data\*(--newlines remain newlines.
 866 Unlike in any of the shells, single quotes do not hide variable names
 867 in the command from interpretation.
 868 To pass a $ through to the shell you need to hide it with a backslash.
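A minimal sketch of the backtick rules above, assuming a Unix-like system where the shell provides echo.
The variable $name is hypothetical.

```perl
$name = 'world';                        # hypothetical variable
$greeting = `echo hello $name`;         # $name is substituted by perl
$status = $?;                           # status of the command just run
@lines = `echo one; echo two`;          # array context: one element per line
$count = @lines;
$literal = `echo '\$name'`;             # the backslash passes $ through to the shell
print $greeting, "lines: $count, status: $status\n", $literal;
# prints:
#   hello world
#   lines: 2, status: 0
#   $name
```

In the last command the shell's single quotes alone would not have protected $name from perl; the backslash is what does that.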
 869 .PP
 870 Evaluating a filehandle in angle brackets yields the next line
 871 from that file (newline included, so it's never false until EOF, at
 872 which time an undefined value is returned).
 873 Ordinarily you must assign that value to a variable,
 874 but there is one situation where an automatic assignment happens.
 875 If (and only if) the input symbol is the only thing inside the conditional of a
 876 .I while
 877 loop, the value is
 878 automatically assigned to the variable \*(L"$_\*(R".
 879 (This may seem like an odd thing to you, but you'll use the construct
 880 in almost every
 881 .I perl
 882 script you write.)
 883 Anyway, the following lines are equivalent to each other:
 884 .nf
 885
 886 .ne 5
 887     while ($_ = <STDIN>) { print; }
 888     while (<STDIN>) { print; }
 889     for (\|;\|<STDIN>;\|) { print; }
 890     print while $_ = <STDIN>;
 891     print while <STDIN>;
 892
 893 .fi
 894 The filehandles
 895 .IR STDIN ,
 896 .I STDOUT
 897 and
 898 .I STDERR
 899 are predefined.
 900 (The filehandles
 901 .IR stdin ,
 902 .I stdout
 903 and
 904 .I stderr
 905 will also work except in packages, where they would be interpreted as
 906 local identifiers rather than global.)
 907 Additional filehandles may be created with the
 908 .I open
 909 function.
 910 .PP
 911 If a <FILEHANDLE> is used in a context that is looking for an array, an array
 912 consisting of all the input lines is returned, one line per array element.
 913 It's easy to make a LARGE data space this way, so use with care.
 914 .PP
 915 The null filehandle <> is special and can be used to emulate the behavior of
 916 \fIsed\fR and \fIawk\fR.
 917 Input from <> comes either from standard input, or from each file listed on
 918 the command line.
 919 Here's how it works: the first time <> is evaluated, the ARGV array is checked,
 920 and if it is null, $ARGV[0] is set to \'-\', which when opened gives you standard
 921 input.
 922 The ARGV array is then processed as a list of filenames.
 923 The loop
 924 .nf
 925
 926 .ne 3
 927         while (<>) {
 928                 .\|.\|.                 # code for each line
 929         }
 930
 931 .ne 10
 932 is equivalent to
 933
 934         unshift(@ARGV, \'\-\') \|if \|$#ARGV < $[;
 935         while ($ARGV = shift) {
 936                 open(ARGV, $ARGV);
 937                 while (<ARGV>) {
 938                         .\|.\|.         # code for each line
 939                 }
 940         }
 941
 942 .fi
 943 except that it isn't as cumbersome to say.
 944 It really does shift array ARGV and put the current filename into
 945 variable ARGV.
 946 It also uses filehandle ARGV internally.
 947 You can modify @ARGV before the first <> as long as you leave the first
 948 filename at the beginning of the array.
 949 Line numbers ($.) continue as if the input was one big happy file.
 950 (But see example under eof for how to reset line numbers on each file.)
 951 .PP
 952 .ne 5
 953 If you want to set @ARGV to your own list of files, go right ahead.
 954 If you want to pass switches into your script, you can
 955 put a loop on the front like this:
 956 .nf
 957
 958 .ne 10
        while ($_ = $ARGV[0], /\|^\-/\|) {
                shift;
                last if /\|^\-\|\-$\|/\|;
                /\|^\-D\|(.*\|)/ \|&& \|($debug = $1);
                /\|^\-v\|/ \|&& \|$verbose++;
                .\|.\|.         # other switches
        }
        while (<>) {
                .\|.\|.         # code for each line
        }
 969
 970 .fi
 971 The <> symbol will return FALSE only once.
 972 If you call it again after this it will assume you are processing another
 973 @ARGV list, and if you haven't set @ARGV, will input from
 974 .IR STDIN .
 975 .PP
 976 If the string inside the angle brackets is a reference to a scalar variable
 977 (e.g. <$foo>),
 978 then that variable contains the name of the filehandle to input from.
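.PP
A small sketch of an indirect filehandle, assuming a shell that provides
echo.
The handle name WORDS and the variable names are made up for the
illustration.
.nf

```perl
# The handle name WORDS is arbitrary; the pipe just supplies a few lines.
$handle = 'WORDS';
open($handle, "(echo alpha; echo beta) |") || die "Can't start echo: $!";

# <$handle> reads from the filehandle whose name is stored in $handle.
$first  = <$handle>;
$second = <$handle>;
close($handle);
```

.fi
This is handy when a subroutine has to read from whichever filehandle its
caller names.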
 979 .PP
 980 If the string inside angle brackets is not a filehandle, it is interpreted
 981 as a filename pattern to be globbed, and either an array of filenames or the
 982 next filename in the list is returned, depending on context.
 983 One level of $ interpretation is done first, but you can't say <$foo>
 984 because that's an indirect filehandle as explained in the previous
 985 paragraph.
 986 You could insert curly brackets to force interpretation as a
 987 filename glob: <${foo}>.
 988 Example:
 989 .nf
 990
 991 .ne 3
 992         while (<*.c>) {
 993                 chmod 0644, $_;
 994         }
 995
 996 is equivalent to
 997
 998 .ne 5
 999         open(foo, "echo *.c | tr \-s \' \et\er\ef\' \'\e\e012\e\e012\e\e012\e\e012\'|");
1000         while (<foo>) {
1001                 chop;
1002                 chmod 0644, $_;
1003         }
1004
1005 .fi
1006 In fact, it's currently implemented that way.
1007 (Which means it will not work on filenames with spaces in them unless
1008 you have /bin/csh on your machine.)
1009 Of course, the shortest way to do the above is:
1010 .nf
1011
1012         chmod 0644, <*.c>;
1013
1014 .fi
1015 .Sh "Syntax"
1016 .PP
1017 A
1018 .I perl
1019 script consists of a sequence of declarations and commands.
1020 The only things that need to be declared in
1021 .I perl
1022 are report formats and subroutines.
1023 See the sections below for more information on those declarations.
1024 All uninitialized user-created objects are assumed to
1025 start with a null or 0 value until they
1026 are defined by some explicit operation such as assignment.
1027 The sequence of commands is executed just once, unlike in
1028 .I sed
1029 and
1030 .I awk
1031 scripts, where the sequence of commands is executed for each input line.
1032 While this means that you must explicitly loop over the lines of your input file
1033 (or files), it also means you have much more control over which files and which
1034 lines you look at.
1035 (Actually, I'm lying\*(--it is possible to do an implicit loop with either the
1036 .B \-n
1037 or
1038 .B \-p
1039 switch.)
1040 .PP
1041 A declaration can be put anywhere a command can, but has no effect on the
1042 execution of the primary sequence of commands\*(--declarations all take effect
1043 at compile time.
1044 Typically all the declarations are put at the beginning or the end of the script.
1045 .PP
1046 .I Perl
1047 is, for the most part, a free-form language.
1048 (The only exception to this is format declarations, for fairly obvious reasons.)
1049 Comments are indicated by the # character, and extend to the end of the line.
If you attempt to use /* */ C comments, they will be interpreted either as
division or pattern matching, depending on the context.
1052 So don't do that.
1053 .Sh "Compound statements"
1054 In
1055 .IR perl ,
1056 a sequence of commands may be treated as one command by enclosing it
1057 in curly brackets.
1058 We will call this a BLOCK.
1059 .PP
1060 The following compound commands may be used to control flow:
1061 .nf
1062
1063 .ne 4
1064         if (EXPR) BLOCK
1065         if (EXPR) BLOCK else BLOCK
1066         if (EXPR) BLOCK elsif (EXPR) BLOCK .\|.\|. else BLOCK
1067         LABEL while (EXPR) BLOCK
1068         LABEL while (EXPR) BLOCK continue BLOCK
1069         LABEL for (EXPR; EXPR; EXPR) BLOCK
1070         LABEL foreach VAR (ARRAY) BLOCK
1071         LABEL BLOCK continue BLOCK
1072
1073 .fi
1074 Note that, unlike C and Pascal, these are defined in terms of BLOCKs, not
1075 statements.
1076 This means that the curly brackets are \fIrequired\fR\*(--no dangling statements allowed.
1077 If you want to write conditionals without curly brackets there are several
1078 other ways to do it.
1079 The following all do the same thing:
1080 .nf
1081
1082 .ne 5
1083         if (!open(foo)) { die "Can't open $foo: $!"; }
1084         die "Can't open $foo: $!" unless open(foo);
1085         open(foo) || die "Can't open $foo: $!"; # foo or bust!
1086         open(foo) ? \'hi mom\' : die "Can't open $foo: $!";
1087                                 # a bit exotic, that last one
1088
1089 .fi
1090 .PP
1091 The
1092 .I if
1093 statement is straightforward.
1094 Since BLOCKs are always bounded by curly brackets, there is never any
1095 ambiguity about which
1096 .I if
1097 an
1098 .I else
1099 goes with.
1100 If you use
1101 .I unless
1102 in place of
1103 .IR if ,
1104 the sense of the test is reversed.
1105 .PP
1106 The
1107 .I while
1108 statement executes the block as long as the expression is true
1109 (does not evaluate to the null string or 0).
1110 The LABEL is optional, and if present, consists of an identifier followed by
1111 a colon.
1112 The LABEL identifies the loop for the loop control statements
1113 .IR next ,
1114 .IR last ,
1115 and
1116 .I redo
1117 (see below).
1118 If there is a
1119 .I continue
1120 BLOCK, it is always executed just before
1121 the conditional is about to be evaluated again, similarly to the third part
1122 of a
1123 .I for
1124 loop in C.
1125 Thus it can be used to increment a loop variable, even when the loop has
1126 been continued via the
1127 .I next
1128 statement (similar to the C \*(L"continue\*(R" statement).
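.PP
Here is a minimal sketch of a
.I continue
BLOCK doing the increment.
The variable names are arbitrary.
.nf

```perl
$i = 0;
$evens = '';
while ($i < 10) {
    next if $i % 2;         # odd: skip the rest of the body ...
    $evens .= $i;
} continue {
    $i++;                   # ... but the continue BLOCK still runs
}
print "$evens\n";           # prints 02468
```

.fi
Had the increment been written at the bottom of the main BLOCK instead,
the first
.I next
would have skipped it and the loop would never finish.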
1129 .PP
1130 If the word
1131 .I while
1132 is replaced by the word
1133 .IR until ,
1134 the sense of the test is reversed, but the conditional is still tested before
1135 the first iteration.
1136 .PP
1137 In either the
1138 .I if
1139 or the
1140 .I while
1141 statement, you may replace \*(L"(EXPR)\*(R" with a BLOCK, and the conditional
1142 is true if the value of the last command in that block is true.
1143 .PP
1144 The
1145 .I for
1146 loop works exactly like the corresponding
1147 .I while
1148 loop:
1149 .nf
1150
1151 .ne 12
1152         for ($i = 1; $i < 10; $i++) {
1153                 .\|.\|.
1154         }
1155
1156 is the same as
1157
1158         $i = 1;
1159         while ($i < 10) {
1160                 .\|.\|.
1161         } continue {
1162                 $i++;
1163         }
1164 .fi
1165 .PP
1166 The foreach loop iterates over a normal array value and sets the variable
1167 VAR to be each element of the array in turn.
1168 The variable is implicitly local to the loop, and regains its former value
1169 upon exiting the loop.
1170 The \*(L"foreach\*(R" keyword is actually identical to the \*(L"for\*(R" keyword,
1171 so you can use \*(L"foreach\*(R" for readability or \*(L"for\*(R" for brevity.
1172 If VAR is omitted, $_ is set to each value.
1173 If ARRAY is an actual array (as opposed to an expression returning an array
1174 value), you can modify each element of the array
1175 by modifying VAR inside the loop.
1176 Examples:
1177 .nf
1178
1179 .ne 5
1180         for (@ary) { s/foo/bar/; }
1181
1182         foreach $elem (@elements) {
1183                 $elem *= 2;
1184         }
1185
1186 .ne 3
1187         for ((10,9,8,7,6,5,4,3,2,1,\'BOOM\')) {
1188                 print $_, "\en"; sleep(1);
1189         }
1190
1191         for (1..15) { print "Merry Christmas\en"; }
1192
1193 .ne 3
1194         foreach $item (split(/:[\e\e\en:]*/, $ENV{\'TERMCAP\'})) {
1195                 print "Item: $item\en";
1196         }
1197
1198 .fi
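.PP
The two less obvious properties, that VAR is restored afterwards and that
an actual array can be modified through it, can be seen in this short
sketch (variable names are arbitrary):
.nf

```perl
$price = 'untouched';
@prices = (5, 10, 20);

# $price stands for each element in turn, so doubling it doubles the element.
foreach $price (@prices) {
    $price *= 2;
}

# Afterwards @prices is (10, 20, 40) and $price has its old value back.
print join(' ', @prices), " / $price\n";    # prints 10 20 40 / untouched
```

.fi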
1199 .PP
1200 The BLOCK by itself (labeled or not) is equivalent to a loop that executes
1201 once.
1202 Thus you can use any of the loop control statements in it to leave or
1203 restart the block.
1204 The
1205 .I continue
1206 block is optional.
1207 This construct is particularly nice for doing case structures.
1208 .nf
1209
1210 .ne 6
1211         foo: {
1212                 if (/^abc/) { $abc = 1; last foo; }
1213                 if (/^def/) { $def = 1; last foo; }
1214                 if (/^xyz/) { $xyz = 1; last foo; }
1215                 $nothing = 1;
1216         }
1217
1218 .fi
1219 There is no official switch statement in perl, because there
1220 are already several ways to write the equivalent.
1221 In addition to the above, you could write
1222 .nf
1223
1224 .ne 6
1225         foo: {
1226                 $abc = 1, last foo  if /^abc/;
1227                 $def = 1, last foo  if /^def/;
1228                 $xyz = 1, last foo  if /^xyz/;
1229                 $nothing = 1;
1230         }
1231
1232 or
1233
1234 .ne 6
1235         foo: {
1236                 /^abc/ && do { $abc = 1; last foo; };
1237                 /^def/ && do { $def = 1; last foo; };
1238                 /^xyz/ && do { $xyz = 1; last foo; };
1239                 $nothing = 1;
1240         }
1241
1242 or
1243
1244 .ne 6
1245         foo: {
1246                 /^abc/ && ($abc = 1, last foo);
1247                 /^def/ && ($def = 1, last foo);
1248                 /^xyz/ && ($xyz = 1, last foo);
1249                 $nothing = 1;
1250         }
1251
1252 or even
1253
1254 .ne 8
1255         if (/^abc/)
1256                 { $abc = 1; }
1257         elsif (/^def/)
1258                 { $def = 1; }
1259         elsif (/^xyz/)
1260                 { $xyz = 1; }
1261         else
1262                 {$nothing = 1;}
1263
1264 .fi
1265 As it happens, these are all optimized internally to a switch structure,
1266 so perl jumps directly to the desired statement, and you needn't worry
1267 about perl executing a lot of unnecessary statements when you have a string
1268 of 50 elsifs, as long as you are testing the same simple scalar variable
1269 using ==, eq, or pattern matching as above.
1270 (If you're curious as to whether the optimizer has done this for a particular
1271 case statement, you can use the \-D1024 switch to list the syntax tree
1272 before execution.)
1273 .Sh "Simple statements"
1274 The only kind of simple statement is an expression evaluated for its side
1275 effects.
1276 Every expression (simple statement) must be terminated with a semicolon.
1277 Note that this is like C, but unlike Pascal (and
1278 .IR awk ).
1279 .PP
1280 Any simple statement may optionally be followed by a
1281 single modifier, just before the terminating semicolon.
1282 The possible modifiers are:
1283 .nf
1284
1285 .ne 4
1286         if EXPR
1287         unless EXPR
1288         while EXPR
1289         until EXPR
1290
1291 .fi
1292 The
1293 .I if
1294 and
1295 .I unless
1296 modifiers have the expected semantics.
1297 The
1298 .I while
1299 and
1300 .I until
1301 modifiers also have the expected semantics (conditional evaluated first),
1302 except when applied to a do-BLOCK or a do-SUBROUTINE command,
1303 in which case the block executes once before the conditional is evaluated.
1304 This is so that you can write loops like:
1305 .nf
1306
1307 .ne 4
1308         do {
1309                 $_ = <STDIN>;
1310                 .\|.\|.
1311         } until $_ \|eq \|".\|\e\|n";
1312
1313 .fi
1314 (See the
1315 .I do
1316 operator below.  Note also that the loop control commands described later will
1317 NOT work in this construct, since modifiers don't take loop labels.
1318 Sorry.)
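.PP
The difference between the two cases shows up when the conditional is
false from the start, as in this sketch (variable names are arbitrary):
.nf

```perl
# Plain statement with a modifier: the conditional is tested first,
# so the increment never happens.
$plain = 0;
$plain++ while $plain < 0;

# do-BLOCK with a modifier: the block runs once before the test.
$block = 0;
do {
    $block++;
} while $block < 0;

print "$plain $block\n";    # prints 0 1
```

.fi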
1319 .Sh "Expressions"
1320 Since
1321 .I perl
1322 expressions work almost exactly like C expressions, only the differences
1323 will be mentioned here.
1324 .PP
1325 Here's what
1326 .I perl
1327 has that C doesn't:
1328 .Ip ** 8 2
1329 The exponentiation operator.
1330 .Ip **= 8
1331 The exponentiation assignment operator.
1332 .Ip (\|) 8 3
1333 The null list, used to initialize an array to null.
1334 .Ip . 8
1335 Concatenation of two strings.
1336 .Ip .= 8
1337 The concatenation assignment operator.
1338 .Ip eq 8
1339 String equality (== is numeric equality).
1340 For a mnemonic just think of \*(L"eq\*(R" as a string.
1341 (If you are used to the
1342 .I awk
1343 behavior of using == for either string or numeric equality
1344 based on the current form of the comparands, beware!
1345 You must be explicit here.)
1346 .Ip ne 8
1347 String inequality (!= is numeric inequality).
1348 .Ip lt 8
1349 String less than.
1350 .Ip gt 8
1351 String greater than.
1352 .Ip le 8
1353 String less than or equal.
1354 .Ip ge 8
1355 String greater than or equal.
1356 .Ip cmp 8
1357 String comparison, returning -1, 0, or 1.
1358 .Ip <=> 8
1359 Numeric comparison, returning -1, 0, or 1.
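.Sp
A short sketch of how the string and numeric operators can disagree about
the same operands.
The variable names and the numerically subroutine are made up for the
illustration.
.nf

```perl
$x = '10';
$y = '9';
$numeric = ($x <=> $y);     # 1: ten is greater than nine
$string  = ($x cmp $y);     # -1: the character 1 sorts before 9

$same_number = ('1.0' == '1') ? 'yes' : 'no';   # yes
$same_string = ('1.0' eq '1') ? 'yes' : 'no';   # no

# The same distinction decides how a list gets sorted.
sub numerically { $a <=> $b; }
@nums = (10, 9, 100);
@by_value  = sort numerically @nums;    # 9 10 100
@by_string = sort @nums;                # 10 100 9
```

.fi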
1360 .Ip =~ 8 2
1361 Certain operations search or modify the string \*(L"$_\*(R" by default.
1362 This operator makes that kind of operation work on some other string.
1363 The right argument is a search pattern, substitution, or translation.
1364 The left argument is what is supposed to be searched, substituted, or
1365 translated instead of the default \*(L"$_\*(R".
1366 The return value indicates the success of the operation.
1367 (If the right argument is an expression other than a search pattern,
1368 substitution, or translation, it is interpreted as a search pattern
1369 at run time.
1370 This is less efficient than an explicit search, since the pattern must
1371 be compiled every time the expression is evaluated.)
1372 The precedence of this operator is lower than unary minus and autoincrement/decrement, but higher than everything else.
1373 .Ip !~ 8
1374 Just like =~ except the return value is negated.
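.Sp
The following sketch applies a search, a run-time pattern, a substitution
and a translation to a variable other than $_.
All names are arbitrary.
.nf

```perl
$line = 'the quick brown fox';

# Match against $line rather than $_; the result is true or false.
$has_fox = ($line =~ /fox/) ? 1 : 0;        # 1
$no_dog  = ($line !~ /dog/) ? 1 : 0;        # 1

# A right argument that is an ordinary expression is compiled
# as a search pattern at run time.
$pattern = 'br.wn';
$dynamic = ($line =~ $pattern) ? 1 : 0;     # 1

# Substitute in place; the value is the number of substitutions made.
$changed = ($line =~ s/quick/slow/);        # 1

# Translate in place; the value is the number of characters translated.
$count = ($line =~ tr/a-z/A-Z/);            # 15
```

.fi
Afterwards $line holds THE SLOW BROWN FOX.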
1375 .Ip x 8
1376 The repetition operator.
1377 Returns a string consisting of the left operand repeated the
1378 number of times specified by the right operand.
1379 In an array context, if the left operand is a list in parens, it repeats
1380 the list.
1381 .nf
1382
1383         print \'\-\' x 80;              # print row of dashes
1384         print \'\-\' x80;               # illegal, x80 is identifier
1385
1386         print "\et" x ($tab/8), \' \' x ($tab%8);       # tab over
1387
1388         @ones = (1) x 80;               # an array of 80 1's
1389         @ones = (5) x @ones;            # set all elements to 5
1390
1391 .fi
1392 .Ip x= 8
1393 The repetition assignment operator.
1394 Only works on scalars.
1395 .Ip .\|. 8
1396 The range operator, which is really two different operators depending
1397 on the context.
1398 In an array context, returns an array of values counting (by ones)
1399 from the left value to the right value.
1400 This is useful for writing \*(L"for (1..10)\*(R" loops and for doing
1401 slice operations on arrays.
1402 .Sp
1403 In a scalar context, .\|. returns a boolean value.
The operator is bistable, like a flip-flop.
1405 Each .\|. operator maintains its own boolean state.
1406 It is false as long as its left operand is false.
1407 Once the left operand is true, the range operator stays true
1408 until the right operand is true,
1409 AFTER which the range operator becomes false again.
1410 (It doesn't become false till the next time the range operator is evaluated.
1411 It can become false on the same evaluation it became true, but it still returns
1412 true once.)
1413 The right operand is not evaluated while the operator is in the \*(L"false\*(R" state,
1414 and the left operand is not evaluated while the operator is in the \*(L"true\*(R" state.
1415 The scalar .\|. operator is primarily intended for doing line number ranges
1416 after
1417 the fashion of \fIsed\fR or \fIawk\fR.
1418 The precedence is a little lower than || and &&.
1419 The value returned is either the null string for false, or a sequence number
1420 (beginning with 1) for true.
1421 The sequence number is reset for each range encountered.
1422 The final sequence number in a range has the string \'E0\' appended to it, which
1423 doesn't affect its numeric value, but gives you something to search for if you
1424 want to exclude the endpoint.
1425 You can exclude the beginning point by waiting for the sequence number to be
1426 greater than 1.
1427 If either operand of scalar .\|. is static, that operand is implicitly compared
1428 to the $. variable, the current line number.
1429 Examples:
1430 .nf
1431
1432 .ne 6
1433 As a scalar operator:
1434     if (101 .\|. 200) { print; }        # print 2nd hundred lines
1435
1436     next line if (1 .\|. /^$/); # skip header lines
1437
1438     s/^/> / if (/^$/ .\|. eof());       # quote body
1439
1440 .ne 4
1441 As an array operator:
1442     for (101 .\|. 200) { print; }       # print $_ 100 times
1443
1444     @foo = @foo[$[ .\|. $#foo]; # an expensive no-op
1445     @foo = @foo[$#foo-4 .\|. $#foo];    # slice last 5 items
1446
1447 .fi
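.Sp
The sequence numbers and the \'E0\' marker are easier to follow in a
worked case.
This sketch runs the scalar operator over an in-memory list instead of a
file; the data and the variable names are invented for the illustration.
Both operands are pattern matches rather than static values, so nothing
is compared against $. here.
.nf

```perl
@lines = ('header', 'BEGIN', 'body 1', 'body 2', 'END', 'trailer');
@inclusive = ();
@interior = ();

foreach $line (@lines) {
    # Null string outside the range; 1, 2, 3 ... inside it,
    # with E0 appended to the final number.
    $seq = ($line =~ /^BEGIN$/ .. $line =~ /^END$/);
    next unless $seq;
    push(@inclusive, $line);

    # Drop the first line (sequence number 1) and the last (ends in E0).
    push(@interior, $line) if $seq > 1 && $seq !~ /E0$/;
}
```

.fi
Here @inclusive ends up holding the four lines from BEGIN through END, and
@interior only the two body lines.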
1448 .Ip \-x 8
1449 A file test.
1450 This unary operator takes one argument, either a filename or a filehandle,
1451 and tests the associated file to see if something is true about it.
1452 If the argument is omitted, tests $_, except for \-t, which tests
1453 .IR STDIN .
1454 It returns 1 for true and \'\' for false, or the undefined value if the
1455 file doesn't exist.
1456 Precedence is higher than logical and relational operators, but lower than
1457 arithmetic operators.
1458 The operator may be any of:
1459 .nf
1460         \-r     File is readable by effective uid.
1461         \-w     File is writable by effective uid.
1462         \-x     File is executable by effective uid.
1463         \-o     File is owned by effective uid.
1464         \-R     File is readable by real uid.
1465         \-W     File is writable by real uid.
1466         \-X     File is executable by real uid.
1467         \-O     File is owned by real uid.
1468         \-e     File exists.
1469         \-z     File has zero size.
1470         \-s     File has non-zero size (returns size).
1471         \-f     File is a plain file.
1472         \-d     File is a directory.
1473         \-l     File is a symbolic link.
1474         \-p     File is a named pipe (FIFO).
1475         \-S     File is a socket.
1476         \-b     File is a block special file.
1477         \-c     File is a character special file.
1478         \-u     File has setuid bit set.
1479         \-g     File has setgid bit set.
1480         \-k     File has sticky bit set.
1481         \-t     Filehandle is opened to a tty.
1482         \-T     File is a text file.
1483         \-B     File is a binary file (opposite of \-T).
1484         \-M     Age of file in days when script started.
1485         \-A     Same for access time.
1486         \-C     Same for inode change time.
1487
1488 .fi
1489 The interpretation of the file permission operators \-r, \-R, \-w, \-W, \-x and \-X
1490 is based solely on the mode of the file and the uids and gids of the user.
1491 There may be other reasons you can't actually read, write or execute the file.
1492 Also note that, for the superuser, \-r, \-R, \-w and \-W always return 1, and
1493 \-x and \-X return 1 if any execute bit is set in the mode.
1494 Scripts run by the superuser may thus need to do a stat() in order to determine
1495 the actual mode of the file, or temporarily set the uid to something else.
1496 .Sp
1497 Example:
1498 .nf
1499 .ne 7
1500
1501         while (<>) {
1502                 chop;
1503                 next unless \-f $_;     # ignore specials
1504                 .\|.\|.
1505         }
1506
1507 .fi
1508 Note that \-s/a/b/ does not do a negated substitution.
1509 Saying \-exp($foo) still works as expected, however\*(--only single letters
1510 following a minus are interpreted as file tests.
1511 .Sp
1512 The \-T and \-B switches work as follows.
1513 The first block or so of the file is examined for odd characters such as
1514 strange control codes or metacharacters.
1515 If too many odd characters (>10%) are found, it's a \-B file, otherwise it's a \-T file.
1516 Also, any file containing null in the first block is considered a binary file.
1517 If \-T or \-B is used on a filehandle, the current stdio buffer is examined
1518 rather than the first block.
1519 Both \-T and \-B return TRUE on a null file, or a file at EOF when testing
1520 a filehandle.
1521 .PP
1522 If any of the file tests (or either stat operator) are given the special
1523 filehandle consisting of a solitary underline, then the stat structure
1524 of the previous file test (or stat operator) is used, saving a system
1525 call.
(This doesn't work with \-t, and you need to remember that lstat and \-l
1527 will leave values in the stat structure for the symbolic link, not the
1528 real file.)
1529 Example:
1530 .nf
1531
1532         print "Can do.\en" if -r $a || -w _ || -x _;
1533
1534 .ne 9
1535         stat($filename);
1536         print "Readable\en" if -r _;
1537         print "Writable\en" if -w _;
1538         print "Executable\en" if -x _;
1539         print "Setuid\en" if -u _;
1540         print "Setgid\en" if -g _;
1541         print "Sticky\en" if -k _;
1542         print "Text\en" if -T _;
1543         print "Binary\en" if -B _;
1544
1545 .fi
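.PP
As a further sketch, a subroutine can classify a pathname with a single
stat by testing the name once and then reusing the underline filehandle.
The subroutine name, its return strings and the second pathname are
hypothetical.
.nf

```perl
# &describe and its return strings are hypothetical names for this sketch.
sub describe {
    local($path) = @_;
    return 'missing'    unless -e $path;    # one stat call ...
    return 'directory'  if -d _;            # ... reused by the tests on _
    return 'empty file' if -f _ && -z _;
    return 'file'       if -f _;
    return 'special';
}

print &describe('/'), "\n";                 # prints directory
print &describe('/no/such/path/here'), "\n";
```

.fi
The second call prints missing on any system where that path does not
exist.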
1546 .PP
1547 Here is what C has that
1548 .I perl
1549 doesn't:
1550 .Ip "unary &" 12
1551 Address-of operator.
1552 .Ip "unary *" 12
1553 Dereference-address operator.
1554 .Ip "(TYPE)" 12
1555 Type casting operator.
1556 .PP
1557 Like C,
1558 .I perl
1559 does a certain amount of expression evaluation at compile time, whenever
1560 it determines that all of the arguments to an operator are static and have
1561 no side effects.
1562 In particular, string concatenation happens at compile time between literals that don't do variable substitution.
1563 Backslash interpretation also happens at compile time.
1564 You can say
1565 .nf
1566
1567 .ne 2
1568         \'Now is the time for all\' . "\|\e\|n" .
1569         \'good men to come to.\'
1570
1571 .fi
1572 and this all reduces to one string internally.
1573 .PP
1574 The autoincrement operator has a little extra built-in magic to it.
1575 If you increment a variable that is numeric, or that has ever been used in
1576 a numeric context, you get a normal increment.
1577 If, however, the variable has only been used in string contexts since it
1578 was set, and has a value that is not null and matches the
1579 pattern /^[a\-zA\-Z]*[0\-9]*$/, the increment is done
1580 as a string, preserving each character within its range, with carry:
1581 .nf
1582
1583         print ++($foo = \'99\');        # prints \*(L'100\*(R'
1584         print ++($foo = \'a0\');        # prints \*(L'a1\*(R'
1585         print ++($foo = \'Az\');        # prints \*(L'Ba\*(R'
1586         print ++($foo = \'zz\');        # prints \*(L'aaa\*(R'
1587
1588 .fi
1589 The autodecrement is not magical.
1590 .PP
1591 The range operator (in an array context) makes use of the magical
1592 autoincrement algorithm if the minimum and maximum are strings.
1593 You can say
1594
1595         @alphabet = (\'A\' .. \'Z\');
1596
1597 to get all the letters of the alphabet, or
1598
1599         $hexdigit = (0 .. 9, \'a\' .. \'f\')[$num & 15];
1600
1601 to get a hexadecimal digit, or
1602
        @z2 = (\'01\' .. \'31\');  print $z2[$mday];
1604
1605 to get dates with leading zeros.
1606 (If the final value specified is not in the sequence that the magical increment
1607 would produce, the sequence goes until the next value would be longer than
1608 the final value specified.)
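.PP
A brief sketch of the carry and of the stopping rule, using arbitrary
array names:
.nf

```perl
@cols = ('aa' .. 'ad');         # aa ab ac ad
@wrap = ('x' .. 'ab');          # x y z aa ab: the carry adds a letter

# The final value is never produced by incrementing 'a', so the list
# stops before the next value ('aa') would be longer than 'E'.
@stop = ('a' .. 'E');
$howmany = $#stop + 1;          # 26, the letters a through z
```

.fi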
1609 .PP
1610 The || and && operators differ from C's in that, rather than returning 0 or 1,
1611 they return the last value evaluated.
1612 Thus, a portable way to find out the home directory might be:
1613 .nf
1614
1615         $home = $ENV{'HOME'} || $ENV{'LOGDIR'} ||
1616             (getpwuid($<))[7] || die "You're homeless!\en";
1617
1618 .fi
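.PP
The same property gives a compact way to supply defaults.
In this sketch %opt and its keys are hypothetical, standing in for
switches parsed earlier:
.nf

```perl
%opt = ('depth', 3, 'label', '');

$depth = $opt{'depth'} || 1;            # 3: the first true value wins
$label = $opt{'label'} || 'untitled';   # untitled: the null string is false
$both  = $opt{'depth'} && 'set';        # set: && yields its right side
                                        # when the left side is true
```

.fi
Remember that 0 and the null string are both false, so a legitimate value
of 0 would be replaced by the default too.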
1619 .PP
1620 Along with the literals and variables mentioned earlier,
1621 the operations in the following section can serve as terms in an expression.
1622 Some of these operations take a LIST as an argument.
1623 Such a list can consist of any combination of scalar arguments or array values;
1624 the array values will be included in the list as if each individual element were
1625 interpolated at that point in the list, forming a longer single-dimensional
1626 array value.
1627 Elements of the LIST should be separated by commas.
1628 If an operation is listed both with and without parentheses around its
1629 arguments, it means you can either use it as a unary operator or
1630 as a function call.
1631 To use it as a function call, the next token on the same line must
1632 be a left parenthesis.
1633 (There may be intervening white space.)
1634 Such a function then has highest precedence, as you would expect from
1635 a function.
1636 If any token other than a left parenthesis follows, then it is a
1637 unary operator, with a precedence depending only on whether it is a LIST
1638 operator or not.
1639 LIST operators have lowest precedence.
1640 All other unary operators have a precedence greater than relational operators
1641 but less than arithmetic operators.
1642 See the section on Precedence.
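.PP
A small sketch of both points: arrays flattening inside a LIST, and the
effect of the parentheses on how much a LIST operator takes.
It uses
.I reverse
only because its effect on a list is easy to see; the array names are
arbitrary.
.nf

```perl
@low  = (1, 2);
@high = (8, 9);
@all  = (0, @low, 'mid', @high);    # one flat array
$count = $#all + 1;                 # 6

# With parentheses, reverse is a function call and takes only its arguments.
@called = (reverse(1, 2), 3);       # 2 1 3

# Without them it is a LIST operator and takes everything to its right.
@listop = (reverse 1, 2, 3);        # 3 2 1
```

.fi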
1643 .Ip "/PATTERN/" 8 4
1644 See m/PATTERN/.
1645 .Ip "?PATTERN?" 8 4
1646 This is just like the /pattern/ search, except that it matches only once between
1647 calls to the
1648 .I reset
1649 operator.
1650 This is a useful optimization when you only want to see the first occurrence of
1651 something in each file of a set of files, for instance.
1652 Only ?? patterns local to the current package are reset.
1653 .Ip "accept(NEWSOCKET,GENERICSOCKET)" 8 2
1654 Does the same thing that the accept system call does.
1655 Returns true if it succeeded, false otherwise.
1656 See example in section on Interprocess Communication.
1657 .Ip "alarm(SECONDS)" 8 4
1658 .Ip "alarm SECONDS" 8
1659 Arranges to have a SIGALRM delivered to this process after the specified number
1660 of seconds (minus 1, actually) have elapsed.  Thus, alarm(15) will cause
1661 a SIGALRM at some point more than 14 seconds in the future.
1662 Only one timer may be counting at once.  Each call disables the previous
1663 timer, and an argument of 0 may be supplied to cancel the previous timer
1664 without starting a new one.
1665 The returned value is the amount of time remaining on the previous timer.
1666 .Ip "atan2(Y,X)" 8 2
1667 Returns the arctangent of Y/X in the range
1668 .if t \-\(*p to \(*p.
1669 .if n \-PI to PI.
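.Sp
Because both arguments are given, the quadrant is taken into account.
A minimal sketch, in which the degrees subroutine is a hypothetical
helper:
.nf

```perl
$pi = atan2(1, 1) * 4;              # atan2(1,1) is PI/4

# Hypothetical helper: angle of the point (X,Y) in degrees.
sub degrees {
    local($y, $x) = @_;
    atan2($y, $x) * 180 / $pi;
}

printf("%.1f\n", &degrees(1, 1));   # prints 45.0
printf("%.1f\n", &degrees(1, -1));  # prints 135.0
```

.fi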
1670 .Ip "bind(SOCKET,NAME)" 8 2
1671 Does the same thing that the bind system call does.
1672 Returns true if it succeeded, false otherwise.
1673 NAME should be a packed address of the proper type for the socket.
1674 See example in section on Interprocess Communication.
1675 .Ip "binmode(FILEHANDLE)" 8 4
1676 .Ip "binmode FILEHANDLE" 8 4
1677 Arranges for the file to be read in \*(L"binary\*(R" mode in operating systems
1678 that distinguish between binary and text files.
1679 Files that are not read in binary mode have CR LF sequences translated
1680 to LF on input and LF translated to CR LF on output.
1681 Binmode has no effect under Unix.
1682 If FILEHANDLE is an expression, the value is taken as the name of
1683 the filehandle.
1684 .Ip "caller(EXPR)"
1685 .Ip "caller"
1686 Returns the context of the current subroutine call:
1687 .nf
1688
1689         ($package,$filename,$line) = caller;
1690
1691 .fi
1692 With EXPR, returns some extra information that the debugger uses to print
1693 a stack trace.  The value of EXPR indicates how many call frames to go
1694 back before the current one.
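.Sp
A minimal sketch of a subroutine that reports where it was called from.
The subroutine name and the wording of its message are made up for the
illustration.
.nf

```perl
# &whence is a hypothetical helper that reports its caller.
sub whence {
    local($package, $filename, $line) = caller;
    "called from package $package, line $line";
}

$report = &whence;
print $report, "\n";
```

.fi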
1695 .Ip "chdir(EXPR)" 8 2
1696 .Ip "chdir EXPR" 8 2
1697 Changes the working directory to EXPR, if possible.
1698 If EXPR is omitted, changes to home directory.
1699 Returns 1 upon success, 0 otherwise.
1700 See example under
1701 .IR die .
1702 .Ip "chmod(LIST)" 8 2
1703 .Ip "chmod LIST" 8 2
1704 Changes the permissions of a list of files.
1705 The first element of the list must be the numerical mode.
1706 Returns the number of files successfully changed.
1707 .nf
1708
1709 .ne 2
1710         $cnt = chmod 0755, \'foo\', \'bar\';
1711         chmod 0755, @executables;
1712
1713 .fi
1714 .Ip "chop(LIST)" 8 7
1715 .Ip "chop(VARIABLE)" 8
1716 .Ip "chop VARIABLE" 8
1717 .Ip "chop" 8
1718 Chops off the last character of a string and returns the character chopped.
1719 It's used primarily to remove the newline from the end of an input record,
1720 but is much more efficient than s/\en// because it neither scans nor copies
1721 the string.
1722 If VARIABLE is omitted, chops $_.
1723 Example:
1724 .nf
1725
1726 .ne 5
1727         while (<>) {
1728                 chop;   # avoid \en on last field
1729                 @array = split(/:/);
1730                 .\|.\|.
1731         }
1732
1733 .fi
1734 You can actually chop anything that's an lvalue, including an assignment:
1735 .nf
1736
1737         chop($cwd = \`pwd\`);
1738         chop($answer = <STDIN>);
1739
1740 .fi
1741 If you chop a list, each element is chopped.
1742 Only the value of the last chop is returned.
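.Sp
A short sketch of the return values, using arbitrary variable names.
Note that chop removes whatever character is last, newline or not.
.nf

```perl
$line = "name:value\n";
$last = chop($line);            # $last is "\n", $line is "name:value"

# Chopping a list chops every element; only the last
# character removed is returned.
@words = ('abc', 'xyz');
$tail = chop(@words);           # @words is ('ab', 'xy'), $tail is 'z'

# No newline here, so a real character is lost.
$bare = 'EOF';
chop($bare);                    # $bare is 'EO'
```

.fi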
1743 .Ip "chown(LIST)" 8 2
1744 .Ip "chown LIST" 8 2
1745 Changes the owner (and group) of a list of files.
1746 The first two elements of the list must be the NUMERICAL uid and gid,
1747 in that order.
1748 Returns the number of files successfully changed.
1749 .nf
1750
1751 .ne 2
1752         $cnt = chown $uid, $gid, \'foo\', \'bar\';
1753         chown $uid, $gid, @filenames;
1754
1755 .fi
1756 .ne 23
1757 Here's an example that looks up non-numeric uids in the passwd file:
1758 .nf
1759
1760         print "User: ";
1761         $user = <STDIN>;
1762         chop($user);
1763         print "Files: ";

1764         $pattern = <STDIN>;
1765         chop($pattern);
1766 .ie t \{\
1767         open(pass, \'/etc/passwd\') || die "Can't open passwd: $!\en";
1768 'br\}
1769 .el \{\
1770         open(pass, \'/etc/passwd\')
1771                 || die "Can't open passwd: $!\en";
1772 'br\}
1773         while (<pass>) {
1774                 ($login,$pass,$uid,$gid) = split(/:/);
1775                 $uid{$login} = $uid;
1776                 $gid{$login} = $gid;
1777         }
1778         @ary = <${pattern}>;    # get filenames
1779         if ($uid{$user} eq \'\') {
1780                 die "$user not in passwd file";
1781         }
1782         else {
1783                 chown $uid{$user}, $gid{$user}, @ary;
1784         }
1785
1786 .fi
1787 .Ip "chroot(FILENAME)" 8 5
1788 .Ip "chroot FILENAME" 8
1789 Does the same as the system call of that name.
1790 If you don't know what it does, don't worry about it.
1791 If FILENAME is omitted, does chroot to $_.
1792 .Ip "close(FILEHANDLE)" 8 5
1793 .Ip "close FILEHANDLE" 8
1794 Closes the file or pipe associated with the file handle.
1795 You don't have to close FILEHANDLE if you are immediately going to
1796 do another open on it, since open will close it for you.
1797 (See
1798 .IR open .)
1799 However, an explicit close on an input file resets the line counter ($.), while
1800 the implicit close done by
1801 .I open
1802 does not.
1803 Also, closing a pipe will wait for the process executing on the pipe to complete,
1804 in case you want to look at the output of the pipe afterwards.
1805 Closing a pipe explicitly also puts the status value of the command into $?.
1806 Example:
1807 .nf
1808
1809 .ne 4
1810         open(OUTPUT, \'|sort >foo\');   # pipe to sort
1811         .\|.\|. # print stuff to output
1812         close OUTPUT;           # wait for sort to finish
1813         open(INPUT, \'foo\');   # get sort's results
1814
1815 .fi
1816 FILEHANDLE may be an expression whose value gives the real filehandle name.
1817 .Ip "closedir(DIRHANDLE)" 8 5
1818 .Ip "closedir DIRHANDLE" 8
1819 Closes a directory opened by opendir().
1820 .Ip "connect(SOCKET,NAME)" 8 2
1821 Does the same thing that the connect system call does.
1822 Returns true if it succeeded, false otherwise.
1823 NAME should be a packed address of the proper type for the socket.
1824 See example in section on Interprocess Communication.
1825 .Ip "cos(EXPR)" 8 6
1826 .Ip "cos EXPR" 8 6
1827 Returns the cosine of EXPR (expressed in radians).
1828 If EXPR is omitted, takes the cosine of $_.
1829 .Ip "crypt(PLAINTEXT,SALT)" 8 6
1830 Encrypts a string exactly like the crypt() function in the C library.
1831 Useful for checking the password file for lousy passwords.
1832 Only the guys wearing white hats should do this.
1833 .Ip "dbmclose(ASSOC_ARRAY)" 8 6
1834 .Ip "dbmclose ASSOC_ARRAY" 8
1835 Breaks the binding between a dbm file and an associative array.
1836 The values remaining in the associative array are meaningless unless
1837 you happen to want to know what was in the cache for the dbm file.
1838 This function is only useful if you have ndbm.
1839 .Ip "dbmopen(ASSOC,DBNAME,MODE)" 8 6
1840 This binds a dbm or ndbm file to an associative array.
1841 ASSOC is the name of the associative array.
1842 (Unlike normal open, the first argument is NOT a filehandle, even though
1843 it looks like one).
1844 DBNAME is the name of the database (without the .dir or .pag extension).
1845 If the database does not exist, it is created with protection specified
1846 by MODE (as modified by the umask).
1847 If your system only supports the older dbm functions, you may only have one
1848 dbmopen in your program.
1849 If your system has neither dbm nor ndbm, calling dbmopen produces a fatal
1850 error.
1851 .Sp
1852 Values assigned to the associative array prior to the dbmopen are lost.
1853 A certain number of values from the dbm file are cached in memory.
1854 By default this number is 64, but you can increase it by preallocating
1855 that number of garbage entries in the associative array before the dbmopen.
1856 You can flush the cache if necessary with the reset command.
1857 .Sp
1858 If you don't have write access to the dbm file, you can only read
1859 associative array variables, not set them.
1860 If you want to test whether you can write, either use file tests or
1861 try setting a dummy array entry inside an eval, which will trap the error.
1862 .Sp
1863 Note that functions such as keys() and values() may return huge array values
1864 when used on large dbm files.
1865 You may prefer to use the each() function to iterate over large dbm files.
1866 Example:
1867 .nf
1868
1869 .ne 6
1870         # print out history file offsets
1871         dbmopen(HIST,'/usr/lib/news/history',0666);
1872         while (($key,$val) = each %HIST) {
1873                 print $key, ' = ', unpack('L',$val), "\en";
1874         }
1875         dbmclose(HIST);
1876
1877 .fi
1878 .Ip "defined(EXPR)" 8 6
1879 .Ip "defined EXPR" 8
1880 Returns a boolean value saying whether the lvalue EXPR has a real value
1881 or not.
1882 Many operations return the undefined value under exceptional conditions,
1883 such as end of file, uninitialized variable, system error and such.
1884 This function allows you to distinguish between an undefined null string
1885 and a defined null string with operations that might return a real null
1886 string, in particular referencing elements of an array.
1887 You may also check to see if arrays or subroutines exist.
1888 Use on predefined variables is not guaranteed to produce intuitive results.
1889 Examples:
1890 .nf
1891
1892 .ne 7
1893         print if defined $switch{'D'};
1894         print "$val\en" while defined($val = pop(@ary));
1895         die "Can't readlink $sym: $!"
1896                 unless defined($value = readlink $sym);
1897         eval '@foo = ()' if defined(@foo);
1898         die "No XYZ package defined" unless defined %_XYZ;
1899         sub foo { defined &bar ? &bar(@_) : die "No bar"; }
1900
1901 .fi
1902 See also undef.
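.Sp
The difference between an undefined value and a defined null string can be
sketched as follows (the array contents are invented for the example):
.nf

```perl
%switch = ('D', '');             # key D exists, with a null string value
$d_state = defined $switch{'D'} ? 'defined' : 'undefined';
$x_state = defined $switch{'X'} ? 'defined' : 'undefined';

# A null string in the middle does not end the loop; only running
# out of elements (pop returning the undefined value) does.
@ary = ('a', '', 'b');
$count = 0;
$count++ while defined($val = pop(@ary));
```

.fi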
1903 .Ip "delete $ASSOC{KEY}" 8 6
1904 Deletes the specified value from the specified associative array.
1905 Returns the deleted value, or the undefined value if nothing was deleted.
1906 Deleting from $ENV{} modifies the environment.
1907 Deleting from an array bound to a dbm file deletes the entry from the dbm
1908 file.
1909 .Sp
1910 The following deletes all the values of an associative array:
1911 .nf
1912
1913 .ne 3
1914         foreach $key (keys %ARRAY) {
1915                 delete $ARRAY{$key};
1916         }
1917
1918 .fi
1919 (But it would be faster to use the
1920 .I reset
1921 command.
1922 Saying undef %ARRAY is faster yet.)
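.Sp
A minimal sketch of the return value, using an invented associative array:
.nf

```perl
%color = ('sky', 'blue', 'grass', 'green');
$gone = delete $color{'sky'};    # returns the deleted value
$none = delete $color{'sea'};    # no such key: returns the undefined value
@left = keys %color;             # only grass remains
```

.fi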
1923 .Ip "die(LIST)" 8
1924 .Ip "die LIST" 8
1925 Outside of an eval, prints the value of LIST to
1926 .I STDERR
1927 and exits with the current value of $!
1928 (errno).
1929 If $! is 0, exits with the value of ($? >> 8) (\`command\` status).
1930 If ($? >> 8) is 0, exits with 255.
1931 Inside an eval, the error message is stuffed into $@ and the eval is terminated
1932 with the undefined value.
1933 .Sp
1934 Equivalent examples:
1935 .nf
1936
1937 .ne 3
1938 .ie t \{\
1939         die "Can't cd to spool: $!\en" unless chdir \'/usr/spool/news\';
1940 'br\}
1941 .el \{\
1942         die "Can't cd to spool: $!\en"
1943                 unless chdir \'/usr/spool/news\';
1944 'br\}
1945
1946         chdir \'/usr/spool/news\' || die "Can't cd to spool: $!\en";
1947
1948 .fi
1949 .Sp
1950 If the value of LIST does not end in a newline, the current script line
1951 number and input line number (if any) are also printed, and a newline is
1952 supplied.
1953 Hint: sometimes appending \*(L", stopped\*(R" to your message will cause it to make
1954 better sense when the string \*(L"at foo line 123\*(R" is appended.
1955 Suppose you are running script \*(L"canasta\*(R".
1956 .nf
1957
1958 .ne 7
1959         die "/etc/games is no good";
1960         die "/etc/games is no good, stopped";
1961
1962 produce, respectively
1963
1964         /etc/games is no good at canasta line 123.
1965         /etc/games is no good, stopped at canasta line 123.
1966
1967 .fi
1968 See also
1969 .IR exit .
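.Sp
The effect of the trailing newline can be observed without leaving the
program by trapping the message in an eval.
This is only a sketch; the exact file name in the appended text depends on
where the die is executed.
.nf

```perl
eval 'die "no newline"';
$with_line = $@;        # message plus " at ... line N." and a newline

eval 'die "has newline\n"';
$as_given = $@;         # message exactly as written
```

.fi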
1970 .Ip "do BLOCK" 8 4
1971 Returns the value of the last command in the sequence of commands indicated
1972 by BLOCK.
1973 When modified by a loop modifier, executes the BLOCK once before testing the
1974 loop condition.
1975 (On other statements the loop modifiers test the conditional first.)
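.Sp
A short sketch of the difference, with conditions chosen so that they are
false from the start:
.nf

```perl
$count = 0;
do { $count++; } while ($count < 0);    # BLOCK runs once before the test

$other = 0;
$other++ while ($other < 0);            # test comes first: never runs
```

.fi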
1976 .Ip "do SUBROUTINE (LIST)" 8 3
1977 Executes a SUBROUTINE declared by a
1978 .I sub
1979 declaration, and returns the value
1980 of the last expression evaluated in SUBROUTINE.
1981 If there is no subroutine by that name, produces a fatal error.
1982 (You may use the \*(L"defined\*(R" operator to determine if a subroutine
1983 exists.)
1984 If you pass arrays as part of LIST you may wish to pass the length
1985 of the array in front of each array.
1986 (See the section on subroutines later on.)
1987 SUBROUTINE may be a scalar variable, in which case the variable contains
1988 the name of the subroutine to execute.
1989 The parentheses are required to avoid confusion with the \*(L"do EXPR\*(R"
1990 form.
1991 .Sp
1992 As an alternate form, you may call a subroutine by prefixing the name with
1993 an ampersand: &foo(@args).
1994 If you aren't passing any arguments, you don't have to use parentheses.
1995 If you omit the parentheses, no @_ array is passed to the subroutine.
1996 The & form is also used to specify subroutines to the defined and undef
1997 operators.
1998 .Ip "do EXPR" 8 3
1999 Uses the value of EXPR as a filename and executes the contents of the file
2000 as a
2001 .I perl
2002 script.
2003 Its primary use is to include subroutines from a
2004 .I perl
2005 subroutine library.
2006 .nf
2007
2008         do \'stat.pl\';
2009
2010 is just like
2011
2012         eval \`cat stat.pl\`;
2013
2014 .fi
2015 except that it's more efficient, more concise, keeps track of the current
2016 filename for error messages, and searches all the
2017 .B \-I
2018 libraries if the file
2019 isn't in the current directory (see also the @INC array in Predefined Names).
2020 It's the same, however, in that it does reparse the file every time you
2021 call it, so if you are going to use the file inside a loop you might prefer
2022 to use \-P and #include, at the expense of a little more startup time.
2023 (The main problem with #include is that cpp doesn't grok # comments\*(--a
2024 workaround is to use \*(L";#\*(R" for standalone comments.)
2025 Note that the following are NOT equivalent:
2026 .nf
2027
2028 .ne 2
2029         do $foo;        # eval a file
2030         do $foo();      # call a subroutine
2031
2032 .fi
2033 Note that inclusion of library routines is better done with
2034 the \*(L"require\*(R" operator.
2035 .Ip "dump LABEL" 8 6
2036 This causes an immediate core dump.
2037 Primarily this is so that you can use the undump program to turn your
2038 core dump into an executable binary after having initialized all your
2039 variables at the beginning of the program.
2040 When the new binary is executed it will begin by executing a "goto LABEL"
2041 (with all the restrictions that goto suffers).
2042 Think of it as a goto with an intervening core dump and reincarnation.
2043 If LABEL is omitted, restarts the program from the top.
2044 WARNING: any files opened at the time of the dump will NOT be open any more
2045 when the program is reincarnated, with possible resulting confusion on the part
2046 of perl.
2047 See also \-u.
2048 .Sp
2049 Example:
2050 .nf
2051
2052 .ne 16
2053         #!/usr/bin/perl
2054         require 'getopt.pl';
2055         require 'stat.pl';
2056         %days = (
2057             'Sun',1,
2058             'Mon',2,
2059             'Tue',3,
2060             'Wed',4,
2061             'Thu',5,
2062             'Fri',6,
2063             'Sat',7);
2064
2065         dump QUICKSTART if $ARGV[0] eq '-d';
2066
2067     QUICKSTART:
2068         do Getopt('f');
2069
2070 .fi
2071 .Ip "each(ASSOC_ARRAY)" 8 6
2072 .Ip "each ASSOC_ARRAY" 8
2073 Returns a 2 element array consisting of the key and value for the next
2074 value of an associative array, so that you can iterate over it.
2075 Entries are returned in an apparently random order.
2076 When the array is entirely read, a null array is returned (which when
2077 assigned produces a FALSE (0) value).
2078 The next call to each() after that will start iterating again.
2079 The iterator can be reset only by reading all the elements from the array.
2080 You must not modify the array while iterating over it.
2081 There is a single iterator for each associative array, shared by all
2082 each(), keys() and values() function calls in the program.
2083 The following prints out your environment like the printenv program, only
2084 in a different order:
2085 .nf
2086
2087 .ne 3
2088         while (($key,$value) = each %ENV) {
2089                 print "$key=$value\en";
2090         }
2091
2092 .fi
2093 See also keys() and values().
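.Sp
Because the order is unpredictable, each() suits work that does not depend
on order, such as totalling.
A sketch with invented data:
.nf

```perl
%age = ('tom', 30, 'ann', 25, 'bob', 41);
$total = 0;
$seen = 0;
while (($name, $years) = each %age) {
    $total += $years;
    $seen++;
}
```

.fi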
2094 .Ip "eof(FILEHANDLE)" 8 8
2095 .Ip "eof()" 8
2096 .Ip "eof" 8
2097 Returns 1 if the next read on FILEHANDLE will return end of file, or if
2098 FILEHANDLE is not open.
2099 FILEHANDLE may be an expression whose value gives the real filehandle name.
2100 (Note that this function actually reads a character and then ungetc's it,
2101 so it is not very useful in an interactive context.)
2102 An eof without an argument returns the eof status for the last file read.
2103 Empty parentheses () may be used to indicate the pseudo file formed of the
2104 files listed on the command line, i.e. eof() is reasonable to use inside
2105 a while (<>) loop to detect the end of only the last file.
2106 Use eof(ARGV) or eof without the parentheses to test EACH file in a while (<>) loop.
2107 Examples:
2108 .nf
2109
2110 .ne 7
2111         # insert dashes just before last line of last file
2112         while (<>) {
2113                 if (eof()) {
2114                         print "\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\en";
2115                 }
2116                 print;
2117         }
2118
2119 .ne 7
2120         # reset line numbering on each input file
2121         while (<>) {
2122                 print "$.\et$_";
2123                 if (eof) {      # Not eof().
2124                         close(ARGV);
2125                 }
2126         }
2127
2128 .fi
2129 .Ip "eval(EXPR)" 8 6
2130 .Ip "eval EXPR" 8 6
2131 EXPR is parsed and executed as if it were a little
2132 .I perl
2133 program.
2134 It is executed in the context of the current
2135 .I perl
2136 program, so that
2137 any variable settings, subroutine or format definitions remain afterwards.
2138 The value returned is the value of the last expression evaluated, just
2139 as with subroutines.
2140 If there is a syntax error or runtime error, or a die statement is
2141 executed, an undefined value is returned by
2142 eval, and $@ is set to the error message.
2143 If there was no error, $@ is guaranteed to be a null string.
2144 If EXPR is omitted, evaluates $_.
2145 The final semicolon, if any, may be omitted from the expression.
2146 .Sp
2147 Note that, since eval traps otherwise-fatal errors, it is useful for
2148 determining whether a particular feature
2149 (such as dbmopen or symlink) is implemented.
2150 It is also Perl's exception trapping mechanism, where the die operator is
2151 used to raise exceptions.
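.Sp
Both uses can be sketched as follows.
The symlink probe assumes that an unimplemented function raises a fatal
error, as described above; the variable names are illustrative only.
.nf

```perl
# Feature test: the trailing 1 is reached only if symlink did not abort.
$has_symlink = eval('symlink("", ""); 1') ? 1 : 0;

# Trapping a runtime error.
$quotient = eval('$zero = 0; 1 / $zero');
$error = $@;            # save the message before the next eval resets $@

$sum = eval('2 + 3');
$clean = $@;            # null string, since nothing went wrong
```

.fi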
2152 .Ip "exec(LIST)" 8 8
2153 .Ip "exec LIST" 8 6
2154 If there is more than one argument in LIST, or if LIST is an array with
2155 more than one value,
2156 calls execvp() with the arguments in LIST.
2157 If there is only one scalar argument, the argument is checked for shell metacharacters.
2158 If there are any, the entire argument is passed to \*(L"/bin/sh \-c\*(R" for parsing.
2159 If there are none, the argument is split into words and passed directly to
2160 execvp(), which is more efficient.
2161 Note: exec (and system) do not flush your output buffer, so you may need to
2162 set $| to avoid lost output.
2163 Examples:
2164 .nf
2165
2166         exec \'/bin/echo\', \'Your arguments are: \', @ARGV;
2167         exec "sort $outfile | uniq";
2168
2169 .fi
2170 .Sp
2171 If you don't really want to execute the first argument, but want to lie
2172 to the program you are executing about its own name, you can specify
2173 the program you actually want to run by assigning that to a variable and
2174 putting the name of the variable in front of the LIST without a comma.
2175 (This always forces interpretation of the LIST as a multi-valued list, even
2176 if there is only a single scalar in the list.)
2177 Example:
2178 .nf
2179
2180 .ne 2
2181         $shell = '/bin/csh';
2182         exec $shell '-sh';              # pretend it's a login shell
2183
2184 .fi
2185 .Ip "exit(EXPR)" 8 6
2186 .Ip "exit EXPR" 8
2187 Evaluates EXPR and exits immediately with that value.
2188 Example:
2189 .nf
2190
2191 .ne 2
2192         $ans = <STDIN>;
2193         exit 0 \|if \|$ans \|=~ \|/\|^[Xx]\|/\|;
2194
2195 .fi
2196 See also
2197 .IR die .
2198 If EXPR is omitted, exits with 0 status.
2199 .Ip "exp(EXPR)" 8 3
2200 .Ip "exp EXPR" 8
2201 Returns
2202 .I e
2203 to the power of EXPR.
2204 If EXPR is omitted, gives exp($_).
2205 .Ip "fcntl(FILEHANDLE,FUNCTION,SCALAR)" 8 4
2206 Implements the fcntl(2) function.
2207 You'll probably have to say
2208 .nf
2209
2210         require "fcntl.ph";     # probably /usr/local/lib/perl/fcntl.ph
2211
2212 .fi
2213 first to get the correct function definitions.
2214 If fcntl.ph doesn't exist or doesn't have the correct definitions
2215 you'll have to roll
2216 your own, based on your C header files such as <sys/fcntl.h>.
2217 (There is a perl script called h2ph that comes with the perl kit
2218 which may help you in this.)
2219 Argument processing and value return work just like ioctl below.
2220 Note that fcntl will produce a fatal error if used on a machine that doesn't implement
2221 fcntl(2).
2222 .Ip "fileno(FILEHANDLE)" 8 4
2223 .Ip "fileno FILEHANDLE" 8 4
2224 Returns the file descriptor for a filehandle.
2225 Useful for constructing bitmaps for select().
2226 If FILEHANDLE is an expression, the value is taken as the name of
2227 the filehandle.
2228 .Ip "flock(FILEHANDLE,OPERATION)" 8 4
2229 Calls flock(2) on FILEHANDLE.
2230 See manual page for flock(2) for definition of OPERATION.
2231 Returns true for success, false on failure.
2232 Will produce a fatal error if used on a machine that doesn't implement
2233 flock(2).
2234 Here's a mailbox appender for BSD systems.
2235 .nf
2236
2237 .ne 20
2238         $LOCK_SH = 1;
2239         $LOCK_EX = 2;
2240         $LOCK_NB = 4;
2241         $LOCK_UN = 8;
2242
2243         sub lock {
2244             flock(MBOX,$LOCK_EX);
2245             # and, in case someone appended
2246             # while we were waiting...
2247             seek(MBOX, 0, 2);
2248         }
2249
2250         sub unlock {
2251             flock(MBOX,$LOCK_UN);
2252         }
2253
2254         open(MBOX, ">>/usr/spool/mail/$ENV{'USER'}")
2255                 || die "Can't open mailbox: $!";
2256
2257         do lock();
2258         print MBOX $msg,"\en\en";
2259         do unlock();
2260
2261 .fi
2262 .Ip "fork" 8 4
2263 Does a fork() call.
2264 Returns the child pid to the parent process and 0 to the child process.
2265 Note: unflushed buffers remain unflushed in both processes, which means
2266 you may need to set $| to avoid duplicate output.
2267 .Ip "getc(FILEHANDLE)" 8 4
2268 .Ip "getc FILEHANDLE" 8
2269 .Ip "getc" 8
2270 Returns the next character from the input file attached to FILEHANDLE, or
2271 a null string at EOF.
2272 If FILEHANDLE is omitted, reads from STDIN.
2273 .Ip "getlogin" 8 3
2274 Returns the current login from /etc/utmp, if any.
2275 If null, use getpwuid.
.nf
2276
2277         $login = getlogin || (getpwuid($<))[0] || "Somebody";
2278
.fi
2279 .Ip "getpeername(SOCKET)" 8 3
2280 Returns the packed sockaddr address of the other end of the SOCKET connection.
2281 .nf
2282
2283 .ne 4
2284         # An internet sockaddr
2285         $sockaddr = 'S n a4 x8';
2286         $hersockaddr = getpeername(S);
2287 .ie t \{\
2288         ($family, $port, $heraddr) = unpack($sockaddr,$hersockaddr);
2289 'br\}
2290 .el \{\
2291         ($family, $port, $heraddr) =
2292                         unpack($sockaddr,$hersockaddr);
2293 'br\}
2294
2295 .fi
2296 .Ip "getpgrp(PID)" 8 4
2297 .Ip "getpgrp PID" 8
2298 Returns the current process group for the specified PID (use a PID of 0
2299 for the current process).
2300 Will produce a fatal error if used on a machine that doesn't implement
2301 getpgrp(2).
2302 If PID is omitted, returns the process group of the current process.
2303 .Ip "getppid" 8 4
2304 Returns the process id of the parent process.
2305 .Ip "getpriority(WHICH,WHO)" 8 4
2306 Returns the current priority for a process, a process group, or a user.
2307 (See getpriority(2).)
2308 Will produce a fatal error if used on a machine that doesn't implement
2309 getpriority(2).
2310 .Ip "getpwnam(NAME)" 8
2311 .Ip "getgrnam(NAME)" 8
2312 .Ip "gethostbyname(NAME)" 8
2313 .Ip "getnetbyname(NAME)" 8
2314 .Ip "getprotobyname(NAME)" 8
2315 .Ip "getpwuid(UID)" 8
2316 .Ip "getgrgid(GID)" 8
2317 .Ip "getservbyname(NAME,PROTO)" 8
2318 .Ip "gethostbyaddr(ADDR,ADDRTYPE)" 8
2319 .Ip "getnetbyaddr(ADDR,ADDRTYPE)" 8
2320 .Ip "getprotobynumber(NUMBER)" 8
2321 .Ip "getservbyport(PORT,PROTO)" 8
2322 .Ip "getpwent" 8
2323 .Ip "getgrent" 8
2324 .Ip "gethostent" 8
2325 .Ip "getnetent" 8
2326 .Ip "getprotoent" 8
2327 .Ip "getservent" 8
2328 .Ip "setpwent" 8
2329 .Ip "setgrent" 8
2330 .Ip "sethostent(STAYOPEN)" 8
2331 .Ip "setnetent(STAYOPEN)" 8
2332 .Ip "setprotoent(STAYOPEN)" 8
2333 .Ip "setservent(STAYOPEN)" 8
2334 .Ip "endpwent" 8
2335 .Ip "endgrent" 8
2336 .Ip "endhostent" 8
2337 .Ip "endnetent" 8
2338 .Ip "endprotoent" 8
2339 .Ip "endservent" 8
2340 These routines perform the same functions as their counterparts in the
2341 system library.
2342 The return values from the various get routines are as follows:
2343 .nf
2344
2345         ($name,$passwd,$uid,$gid,
2346            $quota,$comment,$gcos,$dir,$shell) = getpw.\|.\|.
2347         ($name,$passwd,$gid,$members) = getgr.\|.\|.
2348         ($name,$aliases,$addrtype,$length,@addrs) = gethost.\|.\|.
2349         ($name,$aliases,$addrtype,$net) = getnet.\|.\|.
2350         ($name,$aliases,$proto) = getproto.\|.\|.
2351         ($name,$aliases,$port,$proto) = getserv.\|.\|.
2352
2353 .fi
2354 The $members value returned by getgr.\|.\|. is a space separated list
2355 of the login names of the members of the group.
2356 .Sp
2357 The @addrs value returned by the gethost.\|.\|. functions is a list of the
2358 raw addresses returned by the corresponding system library call.
2359 In the Internet domain, each address is four bytes long and you can unpack
2360 it by saying something like:
2361 .nf
2362
2363         ($a,$b,$c,$d) = unpack('C4',$addrs[0]);
2364
2365 .fi
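.Sp
To try the unpacking without a name server, a four-byte address can be
built by hand.
In this sketch the packed string merely stands in for an element of @addrs,
and the address itself is an arbitrary example:
.nf

```perl
$addr = pack('C4', 192, 0, 2, 1);         # stand-in for one raw address
($a, $b, $c, $d) = unpack('C4', $addr);   # four unsigned bytes
$dotted = join('.', $a, $b, $c, $d);
print "$dotted\n";
```

.fi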
2366 .Ip "getsockname(SOCKET)" 8 3
2367 Returns the packed sockaddr address of this end of the SOCKET connection.
2368 .nf
2369
2370 .ne 4
2371         # An internet sockaddr
2372         $sockaddr = 'S n a4 x8';
2373         $mysockaddr = getsockname(S);
2374 .ie t \{\
2375         ($family, $port, $myaddr) = unpack($sockaddr,$mysockaddr);
2376 'br\}
2377 .el \{\
2378         ($family, $port, $myaddr) =
2379                         unpack($sockaddr,$mysockaddr);
2380 'br\}
2381
2382 .fi
2383 .Ip "getsockopt(SOCKET,LEVEL,OPTNAME)" 8 3
2384 Returns the socket option requested, or undefined if there is an error.
2385 .Ip "gmtime(EXPR)" 8 4
2386 .Ip "gmtime EXPR" 8
2387 Converts a time as returned by the time function to a 9-element array with
2388 the time analyzed for the Greenwich timezone.
2389 Typically used as follows:
2390 .nf
2391
2392 .ne 3
2393 .ie t \{\
2394     ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime(time);
2395 'br\}
2396 .el \{\
2397     ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
2398                                                 gmtime(time);
2399 'br\}
2400
2401 .fi
2402 All array elements are numeric, and come straight out of a struct tm.
2403 In particular this means that $mon has the range 0.\|.11 and $wday has the
2404 range 0.\|.6.
2405 If EXPR is omitted, does gmtime(time).
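.Sp
As a sketch of the ranges, time 0 (the start of 1970, a Thursday, on
systems that count from that date) analyzes as follows:
.nf

```perl
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime(0);
# $mon is 0 for January, $year counts from 1900, $wday is 0 for Sunday
$stamp = join(' ', $year, $mon, $mday, $wday, $yday);
```

.fi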
2406 .Ip "goto LABEL" 8 6
2407 Finds the statement labeled with LABEL and resumes execution there.
2408 Currently you may only go to statements in the main body of the program
2409 that are not nested inside a do {} construct.
2410 This statement is not implemented very efficiently, and is here only to make
2411 the
2412 .IR sed -to- perl
2413 translator easier.
2414 I may change its semantics at any time, consistent with support for translated
2415 .I sed
2416 scripts.
2417 Use it at your own risk.
2418 Better yet, don't use it at all.
2419 .Ip "grep(EXPR,LIST)" 8 4
2420 Evaluates EXPR for each element of LIST (locally setting $_ to each element)
2421 and returns the array value consisting of those elements for which the
2422 expression evaluated to true.
2423 In a scalar context, returns the number of times the expression was true.
2424 .nf
2425
2426         @foo = grep(!/^#/, @bar);    # weed out comments
2427
2428 .fi
2429 Note that, since $_ is a reference into the array value, it can be
2430 used to modify the elements of the array.
2431 While this is useful and supported, it can cause bizarre results if
2432 the LIST is not a named array.
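.Sp
The scalar form and the modifying form can be sketched like this, with
invented array contents:
.nf

```perl
@bar = ('# comment', 'code', '# more', 'data');
@foo = grep(!/^#/, @bar);        # array context: the matching elements
$count = grep(/^#/, @bar);       # scalar context: how many matched

@nums = (1, 2, 3);
grep($_ *= 2, @nums);            # $_ refers to each element, so @nums changes
```

.fi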
2433 .Ip "hex(EXPR)" 8 4
2434 .Ip "hex EXPR" 8
2435 Returns the decimal value of EXPR interpreted as a hex string.
2436 (To interpret strings that might start with 0 or 0x see oct().)
2437 If EXPR is omitted, uses $_.
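.Sp
A brief sketch contrasting the two functions on sample strings:
.nf

```perl
$plain = hex('ff');          # bare hex digits
$mixed = hex('1F');          # upper case digits are accepted too
$prefixed = oct('0x1f');     # oct() recognizes the leading 0x
$octal = oct('755');         # and otherwise reads the string as octal
```

.fi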
2438 .Ip "index(STR,SUBSTR,POSITION)" 8 4
2439 .Ip "index(STR,SUBSTR)" 8 4
2440 Returns the position of the first occurrence of SUBSTR in STR at or after
2441 POSITION.
2442 If POSITION is omitted, starts searching from the beginning of the string.
2443 The return value is based at 0, or whatever you've
2444 set the $[ variable to.
2445 If the substring is not found, returns one less than the base, ordinarily \-1.
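.Sp
A sketch of the three cases, assuming $[ has been left at its default of 0:
.nf

```perl
$str = 'hello world';
$first = index($str, 'o');        # first occurrence
$later = index($str, 'o', 5);     # first occurrence at or after position 5
$missing = index($str, 'z');      # not found
```

.fi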
2446 .Ip "int(EXPR)" 8 4
2447 .Ip "int EXPR" 8
2448 Returns the integer portion of EXPR.
2449 If EXPR is omitted, uses $_.
2450 .Ip "ioctl(FILEHANDLE,FUNCTION,SCALAR)" 8 4
2451 Implements the ioctl(2) function.
2452 You'll probably have to say
2453 .nf
2454
2455         require "ioctl.ph";     # probably /usr/local/lib/perl/ioctl.ph
2456
2457 .fi
2458 first to get the correct function definitions.
2459 If ioctl.ph doesn't exist or doesn't have the correct definitions
2460 you'll have to roll
2461 your own, based on your C header files such as <sys/ioctl.h>.
2462 (There is a perl script called h2ph that comes with the perl kit
2463 which may help you in this.)
2464 SCALAR will be read and/or written depending on the FUNCTION\*(--a pointer
2465 to the string value of SCALAR will be passed as the third argument of
2466 the actual ioctl call.
2467 (If SCALAR has no string value but does have a numeric value, that value
2468 will be passed rather than a pointer to the string value.
2469 To guarantee this to be true, add a 0 to the scalar before using it.)
2470 The pack() and unpack() functions are useful for manipulating the values
2471 of structures used by ioctl().
2472 The following example sets the erase character to DEL.
2473 .nf
2474
2475 .ne 9
2476         require 'ioctl.ph';
2477         $sgttyb_t = "ccccs";            # 4 chars and a short
2478         if (ioctl(STDIN,$TIOCGETP,$sgttyb)) {
2479                 @ary = unpack($sgttyb_t,$sgttyb);
2480                 $ary[2] = 127;
2481                 $sgttyb = pack($sgttyb_t,@ary);
2482                 ioctl(STDIN,$TIOCSETP,$sgttyb)
2483                         || die "Can't ioctl: $!";
2484         }
2485
2486 .fi
2487 The return value of ioctl (and fcntl) is as follows:
2488 .nf
2489
2490 .ne 4
2491         if OS returns:\h'|3i'perl returns:
2492           -1\h'|3i'  undefined value
2493           0\h'|3i'  string "0 but true"
2494           anything else\h'|3i'  that number
2495
2496 .fi
2497 Thus perl returns true on success and false on failure, yet you can still
2498 easily determine the actual value returned by the operating system:
2499 .nf
2500
2501         ($retval = ioctl(...)) || ($retval = -1);
2502         printf "System returned %d\en", $retval;
2503 .fi
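.Sp
The mapping can be tried without a real device.
In the sketch below fake_syscall is a made-up subroutine that only imitates
the return convention in the table above; it makes no system call.
.nf

```perl
# fake_syscall is hypothetical: it converts a pretend OS return value
# the same way the table describes for ioctl and fcntl.
sub fake_syscall {
    local($os_value) = @_;
    local($result);                             # starts out undefined
    $result = "0 but true" if $os_value == 0;
    $result = $os_value if $os_value != 0 && $os_value != -1;
    $result;
}

($ok = &fake_syscall(0)) || ($ok = -1);     # true, numerically 0
($bad = &fake_syscall(-1)) || ($bad = -1);  # undefined, so reset to -1
printf "System returned %d and %d\n", $ok, $bad;
```

.fi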
2504 .Ip "join(EXPR,LIST)" 8 8
2505 .Ip "join(EXPR,ARRAY)" 8
2506 Joins the separate strings of LIST or ARRAY into a single string with fields
2507 separated by the value of EXPR, and returns the string.
2508 Example:
2509 .nf
2510
2511 .ie t \{\
2512     $_ = join(\|\':\', $login,$passwd,$uid,$gid,$gcos,$home,$shell);
2513 'br\}
2514 .el \{\
2515     $_ = join(\|\':\',
2516                 $login,$passwd,$uid,$gid,$gcos,$home,$shell);
2517 'br\}
2518
2519 .fi
2520 See
2521 .IR split .
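.Sp
A sketch of join and split undoing each other, on an invented record:
.nf

```perl
$record = join(':', 'lwall', 'x', 100, 10);
@fields = split(/:/, $record);        # back to the four separate strings
$again = join(':', @fields);          # an array works as well as a list
```

.fi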
2522 .Ip "keys(ASSOC_ARRAY)" 8 6
2523 .Ip "keys ASSOC_ARRAY" 8
2524 Returns a normal array consisting of all the keys of the named associative
2525 array.
2526 The keys are returned in an apparently random order, but it is the same order
2527 as either the values() or each() function produces (given that the associative array
2528 has not been modified).
2529 Here is yet another way to print your environment:
2530 .nf
2531
2532 .ne 5
2533         @keys = keys %ENV;
2534         @values = values %ENV;
2535         while ($#keys >= 0) {
2536                 print pop(@keys), \'=\', pop(@values), "\en";
2537         }
2538
2539 or how about sorted by key:
2540
2541 .ne 3
2542         foreach $key (sort(keys %ENV)) {
2543                 print $key, \'=\', $ENV{$key}, "\en";
2544         }
2545
2546 .fi
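.Sp
The claim that keys() and values() agree in order can be checked with a
sketch like the following (the array contents are invented):
.nf

```perl
%ext = ('pl', 'perl', 'c', 'C', 'sh', 'shell');
@sorted = sort(keys %ext);

@k = keys %ext;
@v = values %ext;
$aligned = 1;
while ($#k >= 0) {
    $key = pop(@k);
    $val = pop(@v);
    # each popped value should belong to the key popped with it
    $aligned = 0 unless $ext{$key} eq $val;
}
```

.fi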
2547 .Ip "kill(LIST)" 8 8
2548 .Ip "kill LIST" 8 2
2549 Sends a signal to a list of processes.
2550 The first element of the list must be the signal to send.
2551 Returns the number of processes successfully signaled.
2552 .nf
2553
2554         $cnt = kill 1, $child1, $child2;
2555         kill 9, @goners;
2556
2557 .fi
2558 If the signal is negative, kills process groups instead of processes.
2559 (On System V, a negative \fIprocess\fR number will also kill process groups,
2560 but that's not portable.)
2561 You may use a signal name in quotes.
2562 .Ip "last LABEL" 8 8
2563 .Ip "last" 8
2564 The
2565 .I last
2566 command is like the
2567 .I break
2568 statement in C (as used in loops); it immediately exits the loop in question.
2569 If the LABEL is omitted, the command refers to the innermost enclosing loop.
2570 The
2571 .I continue
2572 block, if any, is not executed:
2573 .nf
2574
2575 .ne 4
2576         line: while (<STDIN>) {
2577                 last line if /\|^$/;    # exit when done with header
2578                 .\|.\|.
2579         }
2580
2581 .fi
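.Sp
That the continue block is skipped can be seen in a sketch that counts how
often it runs; the message lines here are invented:
.nf

```perl
@message = ("From: larry\n", "Subject: hi\n", "\n", "Body text\n");
$headers = 0;
$continued = 0;
$i = 0;
LINE: while ($i < @message) {
    last LINE if $message[$i] =~ /^$/;   # empty line ends the header
    $headers++;
}
continue {
    $i++;
    $continued++;       # not reached on the iteration that does last
}
```

.fi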
2582 .Ip "length(EXPR)" 8 4
2583 .Ip "length EXPR" 8
2584 Returns the length in characters of the value of EXPR.
2585 If EXPR is omitted, returns length of $_.
2586 .Ip "link(OLDFILE,NEWFILE)" 8 2
2587 Creates a new filename linked to the old filename.
2588 Returns 1 for success, 0 otherwise.
2589 .Ip "listen(SOCKET,QUEUESIZE)" 8 2
2590 Does the same thing that the listen system call does.
2591 Returns true if it succeeded, false otherwise.
2592 See example in section on Interprocess Communication.
2593 .Ip "local(LIST)" 8 4
2594 Declares the listed variables to be local to the enclosing block,
2595 subroutine, eval or \*(L"do\*(R".
2596 All the listed elements must be legal lvalues.
2597 This operator works by saving the current values of those variables in LIST
2598 on a hidden stack and restoring them upon exiting the block, subroutine or eval.
2599 This means that called subroutines can also reference the local variable,
2600 but not the global one.
2601 The LIST may be assigned to if desired, which allows you to initialize
2602 your local variables.
2603 (If no initializer is given for a particular variable, it is created with
2604 an undefined value.)
2605 Commonly this is used to name the parameters to a subroutine.
2606 Examples:
2607 .nf
2608
2609 .ne 13
2610         sub RANGEVAL {
2611                 local($min, $max, $thunk) = @_;
2612                 local($result) = \'\';
2613                 local($i);
2614
2615                 # Presumably $thunk makes reference to $i
2616
2617                 for ($i = $min; $i < $max; $i++) {
2618                         $result .= eval $thunk;
2619                 }
2620
2621                 $result;
2622         }
2623
2624 .ne 6
2625         if ($sw eq \'-v\') {
2626             # init local array with global array
2627             local(@ARGV) = @ARGV;
2628             unshift(@ARGV,\'echo\');
2629             system @ARGV;
2630         }
2631         # @ARGV restored
2632
2633 .ne 6
2634         # temporarily add to digits associative array
2635         if ($base12) {
2636                 # (NOTE: not claiming this is efficient!)
2637                 local(%digits) = (%digits,'t',10,'e',11);
2638                 do parse_num();
2639         }
2640
2641 .fi
2642 Note that local() is a run-time command, and so gets executed every time
2643 through a loop, using up more stack storage each time until it's all
2644 released at once when the loop is exited.
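.Sp
The remark that called subroutines see the local value can be checked with
a small sketch.
The subroutine and variable names here are hypothetical:
.nf

```perl
$level = 'global';
sub show {
    $level;                     # whatever value is current at call time
}
sub wrapper {
    local($level) = 'local';
    &show();                    # the callee sees the local value
}
$inside = &wrapper();
$after = &show();               # the global value is back again
```

.fi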
2645 .Ip "localtime(EXPR)" 8 4
2646 .Ip "localtime EXPR" 8
2647 Converts a time as returned by the time function to a 9-element array with
2648 the time analyzed for the local timezone.
2649 Typically used as follows:
2650 .nf
2651
2652 .ne 3
2653 .ie t \{\
2654     ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
2655 'br\}
2656 .el \{\
2657     ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
2658                                                 localtime(time);
2659 'br\}
2660
2661 .fi
2662 All array elements are numeric, and come straight out of a struct tm.
In particular this means that $mon has the range 0.\|.11, $wday has the
range 0.\|.6 (0 is Sunday), and $year is the number of years since 1900.
2665 If EXPR is omitted, does localtime(time).
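.Sp
Because the fields come straight from a struct tm, a little arithmetic is
usually needed before printing them.
The following is only a sketch: the @months table and the $stamp layout are
assumptions made for the example, and 1900 is added to $year because a
struct tm counts years from 1900.
.nf

```perl
@months = ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
           'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec');
($sec, $min, $hour, $mday, $mon, $year) = localtime(time);
$stamp = sprintf("%s %2d %02d:%02d:%02d %d",
        $months[$mon], $mday, $hour, $min, $sec, $year + 1900);
```

.fi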
2666 .Ip "log(EXPR)" 8 4
2667 .Ip "log EXPR" 8
2668 Returns logarithm (base
2669 .IR e )
2670 of EXPR.
2671 If EXPR is omitted, returns log of $_.
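.Sp
Logarithms to another base follow from the usual change-of-base identity.
A minimal sketch, in which the subroutine name log10 is made up:
.nf

```perl
sub log10 {
    local($n) = @_;
    log($n) / log(10);          # change of base: ln(n) / ln(10)
}
$digits = &log10(1000);         # approximately 3
```

.fi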
2672 .Ip "lstat(FILEHANDLE)" 8 6
2673 .Ip "lstat FILEHANDLE" 8
2674 .Ip "lstat(EXPR)" 8
2675 .Ip "lstat SCALARVARIABLE" 8
2676 Does the same thing as the stat() function, but stats a symbolic link
2677 instead of the file the symbolic link points to.
2678 If symbolic links are unimplemented on your system, a normal stat is done.
2679 .Ip "m/PATTERN/gio" 8 4
2680 .Ip "/PATTERN/gio" 8
2681 Searches a string for a pattern match, and returns true (1) or false (\'\').
2682 If no string is specified via the =~ or !~ operator,
2683 the $_ string is searched.
2684 (The string specified with =~ need not be an lvalue\*(--it may be the result of an expression evaluation, but remember the =~ binds rather tightly.)
2685 See also the section on regular expressions.
2686 .Sp
2687 If / is the delimiter then the initial \*(L'm\*(R' is optional.
2688 With the \*(L'm\*(R' you can use any pair of non-alphanumeric characters
2689 as delimiters.
2690 This is particularly useful for matching Unix path names that contain \*(L'/\*(R'.
2691 If the final delimiter is followed by the optional letter \*(L'i\*(R', the matching is
2692 done in a case-insensitive manner.
2693 PATTERN may contain references to scalar variables, which will be interpolated
2694 (and the pattern recompiled) every time the pattern search is evaluated.
2695 (Note that $) and $| may not be interpolated because they look like end-of-string tests.)
2696 If you want such a pattern to be compiled only once, add an \*(L"o\*(R" after
2697 the trailing delimiter.
2698 This avoids expensive run-time recompilations, and
2699 is useful when the value you are interpolating won't change over the
2700 life of the script.
2701 If the PATTERN evaluates to a null string, the most recent successful
2702 regular expression is used instead.
2703 .Sp
2704 If used in a context that requires an array value, a pattern match returns an
2705 array consisting of the subexpressions matched by the parentheses in the
2706 pattern,
2707 i.e. ($1, $2, $3.\|.\|.).
2708 It does NOT actually set $1, $2, etc. in this case, nor does it set $+, $`, $&
2709 or $'.
2710 If the match fails, a null array is returned.
2711 If the match succeeds, but there were no parentheses, an array value of (1)
2712 is returned.
2713 .Sp
2714 Examples:
2715 .nf
2716
2717 .ne 4
2718     open(tty, \'/dev/tty\');
2719     <tty> \|=~ \|/\|^y\|/i \|&& \|do foo(\|);   # do foo if desired
2720
2721     if (/Version: \|*\|([0\-9.]*\|)\|/\|) { $version = $1; }
2722
2723     next if m#^/usr/spool/uucp#;
2724
2725 .ne 5
2726     # poor man's grep
2727     $arg = shift;
2728     while (<>) {
2729             print if /$arg/o;   # compile only once
2730     }
2731
2732     if (($F1, $F2, $Etc) = ($foo =~ /^(\eS+)\es+(\eS+)\es*(.*)/))
2733
2734 .fi
2735 This last example splits $foo into the first two words and the remainder
2736 of the line, and assigns those three fields to $F1, $F2 and $Etc.
2737 The conditional is true if any variables were assigned, i.e. if the pattern
2738 matched.
2739 .Sp
2740 The \*(L"g\*(R" modifier specifies global pattern matching\*(--that is,
2741 matching as many times as possible within the string.  How it behaves
2742 depends on the context.  In an array context, it returns a list of
2743 all the substrings matched by all the parentheses in the regular expression.
2744 If there are no parentheses, it returns a list of all the matched strings,
2745 as if there were parentheses around the whole pattern.  In a scalar context,
2746 it iterates through the string, returning TRUE each time it matches, and
2747 FALSE when it eventually runs out of matches.  (In other words, it remembers
2748 where it left off last time and restarts the search at that point.)  It
2749 presumes that you have not modified the string since the last match.
2750 Modifying the string between matches may result in undefined behavior.
2751 (You can actually get away with in-place modifications via substr()
2752 that do not change the length of the entire string.  In general, however,
2753 you should be using s///g for such modifications.)  Examples:
2754 .nf
2755
2756         # array context
2757         ($one,$five,$fifteen) = (\`uptime\` =~ /(\ed+\e.\ed+)/g);
2758
2759         # scalar context
        $/ = ""; $* = 1;        # paragraph mode, multi-line matching
2761         while ($paragraph = <>) {
2762             while ($paragraph =~ /[a-z][\'")]*[.!?]+[\'")]*\es/g) {
2763                 $sentences++;
2764             }
2765         }
2766         print "$sentences\en";
2767
2768 .fi
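.Sp
The three behaviors described above (subexpressions returned in an array
context, a global match in an array context, and a global match iterated in
a scalar context) can be compared side by side.
This is a sketch on a made-up input string; bracketed character classes are
used only to keep the example free of backslashes.
.nf

```perl
$line = 'Version: 4.0.1 patch 10';

# array context: the parenthesized subexpressions are returned
($major, $minor) = ($line =~ /Version: *([0-9]+)[.]([0-9]+)/);

# array context with g: every matched string is returned
@nums = ($line =~ /[0-9]+/g);

# scalar context with g: each evaluation finds the next match
$count = 0;
$count++ while $line =~ /[0-9]+/g;
```

.fi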
2769 .Ip "mkdir(FILENAME,MODE)" 8 3
2770 Creates the directory specified by FILENAME, with permissions specified by
2771 MODE (as modified by umask).
2772 If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno).
2773 .Ip "msgctl(ID,CMD,ARG)" 8 4
2774 Calls the System V IPC function msgctl.  If CMD is &IPC_STAT, then ARG
2775 must be a variable which will hold the returned msqid_ds structure.
2776 Returns like ioctl: the undefined value for error, "0 but true" for
2777 zero, or the actual return value otherwise.
2778 .Ip "msgget(KEY,FLAGS)" 8 4
2779 Calls the System V IPC function msgget.  Returns the message queue id,
2780 or the undefined value if there is an error.
2781 .Ip "msgsnd(ID,MSG,FLAGS)" 8 4
2782 Calls the System V IPC function msgsnd to send the message MSG to the
2783 message queue ID.  MSG must begin with the long integer message type,
2784 which may be created with pack("L", $type).  Returns true if
2785 successful, or false if there is an error.
2786 .Ip "msgrcv(ID,VAR,SIZE,TYPE,FLAGS)" 8 4
2787 Calls the System V IPC function msgrcv to receive a message from
2788 message queue ID into variable VAR with a maximum message size of
2789 SIZE.  Note that if a message is received, the message type will be
2790 the first thing in VAR, and the maximum length of VAR is SIZE plus the
2791 size of the message type.  Returns true if successful, or false if
2792 there is an error.
2793 .Ip "next LABEL" 8 8
2794 .Ip "next" 8
2795 The
2796 .I next
2797 command is like the
2798 .I continue
2799 statement in C; it starts the next iteration of the loop:
2800 .nf
2801
2802 .ne 4
2803         line: while (<STDIN>) {
2804                 next line if /\|^#/;    # discard comments
2805                 .\|.\|.
2806         }
2807
2808 .fi
2809 Note that if there were a
2810 .I continue
2811 block on the above, it would get executed even on discarded lines.
2812 If the LABEL is omitted, the command refers to the innermost enclosing loop.
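.Sp
The interaction with a
.I continue
block can be seen by counting.
In this sketch (the data and the counter names are made up) the counter in
the
.I continue
block advances even for lines that are skipped:
.nf

```perl
@lines = ('# comment', 'code', '# more', 'data');
$kept = 0;
$seen = 0;
while ($#lines >= 0) {
    $line = shift(@lines);
    next if $line =~ /^#/;      # discard comments
    $kept++;
} continue {
    $seen++;                    # runs for discarded lines too
}
```

.fi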
2813 .Ip "oct(EXPR)" 8 4
2814 .Ip "oct EXPR" 8
2815 Returns the decimal value of EXPR interpreted as an octal string.
2816 (If EXPR happens to start off with 0x, interprets it as a hex string instead.)
2817 The following will handle decimal, octal and hex in the standard notation:
2818 .nf
2819
2820         $val = oct($val) if $val =~ /^0/;
2821
2822 .fi
2823 If EXPR is omitted, uses $_.
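.Sp
Wrapped in a subroutine, the idiom above might look like this.
The name tonum is hypothetical, and the sketch assumes well-formed input:
.nf

```perl
sub tonum {
    local($val) = @_;
    $val = oct($val) if $val =~ /^0/;   # leading 0: octal, or hex if 0x
    $val + 0;
}
$octal = &tonum('0755');        # 493
$hex = &tonum('0x1F');          # 31
$dec = &tonum('42');            # 42
```

.fi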
2824 .Ip "open(FILEHANDLE,EXPR)" 8 8
2825 .Ip "open(FILEHANDLE)" 8
2826 .Ip "open FILEHANDLE" 8
2827 Opens the file whose filename is given by EXPR, and associates it with
2828 FILEHANDLE.
2829 If FILEHANDLE is an expression, its value is used as the name of the
2830 real filehandle wanted.
2831 If EXPR is omitted, the scalar variable of the same name as the FILEHANDLE
2832 contains the filename.
2833 If the filename begins with \*(L"<\*(R" or nothing, the file is opened for
2834 input.
2835 If the filename begins with \*(L">\*(R", the file is opened for output.
2836 If the filename begins with \*(L">>\*(R", the file is opened for appending.
2837 (You can put a \'+\' in front of the \'>\' or \'<\' to indicate that you
2838 want both read and write access to the file.)
2839 If the filename begins with \*(L"|\*(R", the filename is interpreted
2840 as a command to which output is to be piped, and if the filename ends
with a \*(L"|\*(R", the filename is interpreted as a command that pipes
2842 input to us.
2843 (You may not have a command that pipes both in and out.)
2844 Opening \'\-\' opens
2845 .I STDIN
2846 and opening \'>\-\' opens
2847 .IR STDOUT .
2848 Open returns non-zero upon success, the undefined value otherwise.
2849 If the open involved a pipe, the return value happens to be the pid
2850 of the subprocess.
2851 Examples:
2852 .nf
2853
2854 .ne 3
2855         $article = 100;
2856         open article || die "Can't find article $article: $!\en";
2857         while (<article>) {\|.\|.\|.
2858
2859 .ie t \{\
2860         open(LOG, \'>>/usr/spool/news/twitlog\'\|);     # (log is reserved)
2861 'br\}
2862 .el \{\
2863         open(LOG, \'>>/usr/spool/news/twitlog\'\|);
2864                                         # (log is reserved)
2865 'br\}
2866
2867 .ie t \{\
2868         open(article, "caesar <$article |"\|);          # decrypt article
2869 'br\}
2870 .el \{\
2871         open(article, "caesar <$article |"\|);
2872                                         # decrypt article
2873 'br\}
2874
2875 .ie t \{\
2876         open(extract, "|sort >/tmp/Tmp$$"\|);           # $$ is our process#
2877 'br\}
2878 .el \{\
2879         open(extract, "|sort >/tmp/Tmp$$"\|);
2880                                         # $$ is our process#
2881 'br\}
2882
2883 .ne 7
2884         # process argument list of files along with any includes
2885
2886         foreach $file (@ARGV) {
2887                 do process($file, \'fh00\');    # no pun intended
2888         }
2889
2890         sub process {
2891                 local($filename, $input) = @_;
2892                 $input++;               # this is a string increment
2893                 unless (open($input, $filename)) {
2894                         print STDERR "Can't open $filename: $!\en";
2895                         return;
2896                 }
2897 .ie t \{\
2898                 while (<$input>) {              # note the use of indirection
2899 'br\}
2900 .el \{\
2901                 while (<$input>) {              # note use of indirection
2902 'br\}
2903                         if (/^#include "(.*)"/) {
2904                                 do process($1, $input);
2905                                 next;
2906                         }
2907                         .\|.\|.         # whatever
2908                 }
2909         }
2910
2911 .fi
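.Sp
The output, append and input modes described above can be exercised in one
short round trip.
This is a sketch only: the scratch file name under /tmp is an assumption,
and the filehandle names are arbitrary.
.nf

```perl
$file = "/tmp/perlman$$";               # hypothetical scratch file
open(OUT, ">$file") || die "Can't create $file: $!";
print OUT 'first line';
close(OUT);

open(OUT, ">>$file") || die "Can't append to $file: $!";
print OUT ', appended';
close(OUT);

open(IN, "<$file") || die "Can't read $file: $!";
$text = <IN>;
close(IN);
unlink($file);
```

.fi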
2912 You may also, in the Bourne shell tradition, specify an EXPR beginning
2913 with \*(L">&\*(R", in which case the rest of the string
2914 is interpreted as the name of a filehandle
2915 (or file descriptor, if numeric) which is to be duped and opened.
2916 You may use & after >, >>, <, +>, +>> and +<.
2917 The mode you specify should match the mode of the original filehandle.
2918 Here is a script that saves, redirects, and restores
2919 .I STDOUT
2920 and
2921 .IR STDERR :
2922 .nf
2923
2924 .ne 21
2925         #!/usr/bin/perl
2926         open(SAVEOUT, ">&STDOUT");
2927         open(SAVEERR, ">&STDERR");
2928
2929         open(STDOUT, ">foo.out") || die "Can't redirect stdout";
2930         open(STDERR, ">&STDOUT") || die "Can't dup stdout";
2931
2932         select(STDERR); $| = 1;         # make unbuffered
2933         select(STDOUT); $| = 1;         # make unbuffered
2934
2935         print STDOUT "stdout 1\en";     # this works for
2936         print STDERR "stderr 1\en";     # subprocesses too
2937
2938         close(STDOUT);
2939         close(STDERR);
2940
2941         open(STDOUT, ">&SAVEOUT");
2942         open(STDERR, ">&SAVEERR");
2943
2944         print STDOUT "stdout 2\en";
2945         print STDERR "stderr 2\en";
2946
2947 .fi
2948 If you open a pipe on the command \*(L"\-\*(R", i.e. either \*(L"|\-\*(R" or \*(L"\-|\*(R",
2949 then there is an implicit fork done, and the return value of open
2950 is the pid of the child within the parent process, and 0 within the child
2951 process.
2952 (Use defined($pid) to determine if the open was successful.)
2953 The filehandle behaves normally for the parent, but i/o to that
2954 filehandle is piped from/to the
2955 .IR STDOUT / STDIN
2956 of the child process.
2957 In the child process the filehandle isn't opened\*(--i/o happens from/to
2958 the new
2959 .I STDOUT
2960 or
2961 .IR STDIN .
2962 Typically this is used like the normal piped open when you want to exercise
2963 more control over just how the pipe command gets executed, such as when
2964 you are running setuid, and don't want to have to scan shell commands
2965 for metacharacters.
2966 The following pairs are more or less equivalent:
2967 .nf
2968
2969 .ne 5
2970         open(FOO, "|tr \'[a\-z]\' \'[A\-Z]\'");
2971         open(FOO, "|\-") || exec \'tr\', \'[a\-z]\', \'[A\-Z]\';
2972
2973         open(FOO, "cat \-n '$file'|");
2974         open(FOO, "\-|") || exec \'cat\', \'\-n\', $file;
2975
2976 .fi
2977 Explicitly closing any piped filehandle causes the parent process to wait for the
2978 child to finish, and returns the status value in $?.
2979 Note: on any operation which may do a fork,
2980 unflushed buffers remain unflushed in both
2981 processes, which means you may need to set $| to
2982 avoid duplicate output.
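.Sp
A forking open need not exec anything; the child may simply print.
The sketch below assumes a system where fork works and uses made-up names
(KID, $line, $status).
It checks defined($pid) as recommended above.
.nf

```perl
$pid = open(KID, "-|");         # implicit fork
die "Can't fork: $!" unless defined($pid);
if ($pid == 0) {
    print 'hello from child';   # child: STDOUT is the pipe
    exit 0;
}
$line = <KID>;                  # parent: read what the child printed
close(KID);                     # waits for the child and sets $?
$status = $?;
```

.fi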
2983 .Sp
2984 The filename that is passed to open will have leading and trailing
2985 whitespace deleted.
2986 In order to open a file with arbitrary weird characters in it, it's necessary
2987 to protect any leading and trailing whitespace thusly:
2988 .nf
2989
2990 .ne 2
2991         $file =~ s#^(\es)#./$1#;
2992         open(FOO, "< $file\e0");
2993
2994 .fi
2995 .Ip "opendir(DIRHANDLE,EXPR)" 8 3
2996 Opens a directory named EXPR for processing by readdir(), telldir(), seekdir(),
2997 rewinddir() and closedir().
2998 Returns true if successful.
2999 DIRHANDLEs have their own namespace separate from FILEHANDLEs.
3000 .Ip "ord(EXPR)" 8 4
3001 .Ip "ord EXPR" 8
Returns the numeric ASCII value of the first character of EXPR.
3003 If EXPR is omitted, uses $_.
3004 ''' Comments on f & d by gnb@melba.bby.oz.au    22/11/89
3005 .Ip "pack(TEMPLATE,LIST)" 8 4
3006 Takes an array or list of values and packs it into a binary structure,
3007 returning the string containing the structure.
3008 The TEMPLATE is a sequence of characters that give the order and type
3009 of values, as follows:
3010 .nf
3011
3012         A       An ascii string, will be space padded.
3013         a       An ascii string, will be null padded.
3014         c       A signed char value.
3015         C       An unsigned char value.
3016         s       A signed short value.
3017         S       An unsigned short value.
3018         i       A signed integer value.
3019         I       An unsigned integer value.
3020         l       A signed long value.
3021         L       An unsigned long value.
3022         n       A short in \*(L"network\*(R" order.
3023         N       A long in \*(L"network\*(R" order.
3024         f       A single-precision float in the native format.
3025         d       A double-precision float in the native format.
3026         p       A pointer to a string.
3027         x       A null byte.
3028         X       Back up a byte.
3029         @       Null fill to absolute position.
3030         u       A uuencoded string.
3031         b       A bit string (ascending bit order, like vec()).
3032         B       A bit string (descending bit order).
3033         h       A hex string (low nybble first).
3034         H       A hex string (high nybble first).
3035
3036 .fi
3037 Each letter may optionally be followed by a number which gives a repeat
3038 count.
3039 With all types except "a", "A", "b", "B", "h" and "H",
3040 the pack function will gobble up that many values
3041 from the LIST.
3042 A * for the repeat count means to use however many items are left.
3043 The "a" and "A" types gobble just one value, but pack it as a string of length
3044 count,
3045 padding with nulls or spaces as necessary.
3046 (When unpacking, "A" strips trailing spaces and nulls, but "a" does not.)
3047 Likewise, the "b" and "B" fields pack a string that many bits long.
3048 The "h" and "H" fields pack a string that many nybbles long.
3049 Real numbers (floats and doubles) are in the native machine format
3050 only; due to the multiplicity of floating formats around, and the lack
3051 of a standard \*(L"network\*(R" representation, no facility for
3052 interchange has been made.
3053 This means that packed floating point data
written on one machine may not be readable on another, even if both
3055 use IEEE floating point arithmetic (as the endian-ness of the memory
3056 representation is not part of the IEEE spec).
3057 Note that perl uses
3058 doubles internally for all numeric calculation, and converting from
3059 double -> float -> double will lose precision (i.e. unpack("f",
3060 pack("f", $foo)) will not in general equal $foo).
3061 .br
3062 Examples:
3063 .nf
3064
3065         $foo = pack("cccc",65,66,67,68);
3066         # foo eq "ABCD"
3067         $foo = pack("c4",65,66,67,68);
3068         # same thing
3069
3070         $foo = pack("ccxxcc",65,66,67,68);
3071         # foo eq "AB\e0\e0CD"
3072
3073         $foo = pack("s2",1,2);
3074         # "\e1\e0\e2\e0" on little-endian
3075         # "\e0\e1\e0\e2" on big-endian
3076
3077         $foo = pack("a4","abcd","x","y","z");
3078         # "abcd"
3079
3080         $foo = pack("aaaa","abcd","x","y","z");
3081         # "axyz"
3082
3083         $foo = pack("a14","abcdefg");
3084         # "abcdefg\e0\e0\e0\e0\e0\e0\e0"
3085
3086         $foo = pack("i9pl", gmtime);
3087         # a real struct tm (on my system anyway)
3088
3089         sub bintodec {
3090             unpack("N", pack("B32", substr("0" x 32 . shift, -32)));
3091         }
3092 .fi
3093 The same template may generally also be used in the unpack function.
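.Sp
A round trip through pack and unpack shows the point.
The record layout here (an 8-byte name, a network-order short and an
unsigned char) is invented for the example:
.nf

```perl
$rec = pack("A8nC", 'perl', 4, 36);
$len = length($rec);                    # 8 + 2 + 1 bytes
($name, $major, $patch) = unpack("A8nC", $rec);  # "A" strips the padding
$hex = unpack("H*", pack("n", 258));    # network order: high byte first
```

.fi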
3094 .Ip "pipe(READHANDLE,WRITEHANDLE)" 8 3
3095 Opens a pair of connected pipes like the corresponding system call.
3096 Note that if you set up a loop of piped processes, deadlock can occur
3097 unless you are very careful.
3098 In addition, note that perl's pipes use stdio buffering, so you may need
3099 to set $| to flush your WRITEHANDLE after each command, depending on
3100 the application.
3101 [Requires version 3.0 patchlevel 9.]
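.Sp
A one-way pipe between a parent and a forked child might be set up as
follows.
This is a sketch that assumes fork is available; the handle names READER
and WRITER are arbitrary.
Each side closes the end it does not use, and closing WRITER in the child
flushes the stdio buffer.
.nf

```perl
pipe(READER, WRITER) || die "Can't make pipe: $!";
$pid = fork;
die "Can't fork: $!" unless defined($pid);
if ($pid == 0) {
    close(READER);
    print WRITER 'ping';
    close(WRITER);              # flushes the buffered output
    exit 0;
}
close(WRITER);                  # so the parent can see end of file
$msg = <READER>;
close(READER);
wait;
```

.fi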
3102 .Ip "pop(ARRAY)" 8
3103 .Ip "pop ARRAY" 8 6
3104 Pops and returns the last value of the array, shortening the array by 1.
3105 Has the same effect as
3106 .nf
3107
3108         $tmp = $ARRAY[$#ARRAY\-\|\-];
3109
3110 .fi
3111 If there are no elements in the array, returns the undefined value.
3112 .Ip "print(FILEHANDLE LIST)" 8 10
3113 .Ip "print(LIST)" 8
3114 .Ip "print FILEHANDLE LIST" 8
3115 .Ip "print LIST" 8
3116 .Ip "print" 8
3117 Prints a string or a comma-separated list of strings.
3118 Returns non-zero if successful.
3119 FILEHANDLE may be a scalar variable name, in which case the variable contains
3120 the name of the filehandle, thus introducing one level of indirection.
3121 (NOTE: If FILEHANDLE is a variable and the next token is a term, it may be
3122 misinterpreted as an operator unless you interpose a + or put parens around
3123 the arguments.)
3124 If FILEHANDLE is omitted, prints by default to standard output (or to the
3125 last selected output channel\*(--see select()).
3126 If LIST is also omitted, prints $_ to
3127 .IR STDOUT .
3128 To set the default output channel to something other than
3129 .I STDOUT
3130 use the select operation.
3131 Note that, because print takes a LIST, anything in the LIST is evaluated
3132 in an array context, and any subroutine that you call will have one or more
3133 of its expressions evaluated in an array context.
3134 Also be careful not to follow the print keyword with a left parenthesis
3135 unless you want the corresponding right parenthesis to terminate the
3136 arguments to the print\*(--interpose a + or put parens around all the arguments.
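.Sp
The parenthesis pitfall is easiest to see by capturing the output.
The sketch below selects a scratch file (its name is an assumption) as the
default output channel, prints three ways, and reads the result back:
.nf

```perl
$file = "/tmp/perlprint$$";             # hypothetical scratch file
open(OUT, ">$file") || die "Can't create $file: $!";
$old = select(OUT);             # OUT becomes the default output channel
print (1 + 2) * 3;              # prints 3; the "* 3" is thrown away
print ' ';
print +(1 + 2) * 3;             # prints 9
print ' ';
print((1 + 2) * 3);             # prints 9
select($old);
close(OUT);

open(IN, "<$file") || die "Can't read $file: $!";
$text = <IN>;
close(IN);
unlink($file);
```

.fi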
3137 .Ip "printf(FILEHANDLE LIST)" 8 10
3138 .Ip "printf(LIST)" 8
3139 .Ip "printf FILEHANDLE LIST" 8
3140 .Ip "printf LIST" 8
3141 Equivalent to a \*(L"print FILEHANDLE sprintf(LIST)\*(R".
3142 .Ip "push(ARRAY,LIST)" 8 7
3143 Treats ARRAY (@ is optional) as a stack, and pushes the values of LIST
3144 onto the end of ARRAY.
3145 The length of ARRAY increases by the length of LIST.
3146 Has the same effect as
3147 .nf
3148
3149     for $value (LIST) {
3150             $ARRAY[++$#ARRAY] = $value;
3151     }
3152
3153 .fi
3154 but is more efficient.
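.Sp
Used together, push and pop give an ordinary stack.
A short sketch with made-up names:
.nf

```perl
@stack = ();
push(@stack, 'a', 'b', 'c');    # array grows by three elements
$top = pop(@stack);             # 'c'; array shrinks by one
$depth = $#stack + 1;           # 2 elements remain
@empty = ();
$none = pop(@empty);            # undefined: nothing to pop
```

.fi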
3155 .Ip "q/STRING/" 8 5
3156 .Ip "qq/STRING/" 8
3157 .Ip "qx/STRING/" 8
3158 These are not really functions, but simply syntactic sugar to let you
3159 avoid putting too many backslashes into quoted strings.
3160 The q operator is a generalized single quote, and the qq operator a
3161 generalized double quote.
3162 The qx operator is a generalized backquote.
3163 Any non-alphanumeric delimiter can be used in place of /, including newline.
3164 If the delimiter is an opening bracket or parenthesis, the final delimiter
3165 will be the corresponding closing bracket or parenthesis.
3166 (Embedded occurrences of the closing bracket need to be backslashed as usual.)
3167 Examples:
3168 .nf
3169
3170 .ne 5
3171         $foo = q!I said, "You said, \'She said it.\'"!;
3172         $bar = q(\'This is it.\');
3173         $today = qx{ date };
3174         $_ .= qq
3175 *** The previous line contains the naughty word "$&".\en
3176                 if /(ibm|apple|awk)/;      # :-)
3177
3178 .fi
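.Sp
The difference between q and qq is the same as that between single and
double quotes, as this sketch (with made-up strings) suggests:
.nf

```perl
$who = 'Larry';
$single = q(Hello, $who);       # like single quotes: no interpolation
$double = qq(Hello, $who);      # like double quotes: interpolates
$quoted = q!He said "it's fine"!;       # both quote marks kept as is
```

.fi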
3179 .Ip "rand(EXPR)" 8 8
3180 .Ip "rand EXPR" 8
3181 .Ip "rand" 8
3182 Returns a random fractional number between 0 and the value of EXPR.
3183 (EXPR should be positive.)
3184 If EXPR is omitted, returns a value between 0 and 1.
3185 See also srand().
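.Sp
Since the result is fractional, int() is normally applied to get a whole
number.
In this sketch the seed 42 and the @deck array are arbitrary choices made
for the example:
.nf

```perl
srand(42);                      # arbitrary seed
$roll = int(rand(6)) + 1;       # whole number from 1 to 6
@deck = ('ace', 'king', 'queen', 'jack');
$pick = $deck[int(rand($#deck + 1))];   # random element of @deck
```

.fi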
3186 .Ip "read(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
3187 .Ip "read(FILEHANDLE,SCALAR,LENGTH)" 8 5
3188 Attempts to read LENGTH bytes of data into variable SCALAR from the specified
3189 FILEHANDLE.
3190 Returns the number of bytes actually read, or undef if there was an error.
3191 SCALAR will be grown or shrunk to the length actually read.
3192 An OFFSET may be specified to place the read data at some other place
3193 than the beginning of the string.
3194 This call is actually implemented in terms of stdio's fread call.  To get
3195 a true read system call, see sysread.
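.Sp
The OFFSET argument lets successive reads accumulate in one buffer.
The following sketch writes a ten-byte scratch file (the file name is an
assumption) and reads it back in pieces:
.nf

```perl
$file = "/tmp/perlread$$";              # hypothetical scratch file
open(F, ">$file") || die "Can't create $file: $!";
print F 'abcdefghij';
close(F);

open(F, "<$file") || die "Can't read $file: $!";
$got = read(F, $buf, 4);        # 4 bytes; $buf is 'abcd'
$more = read(F, $buf, 4, 4);    # placed at offset 4; $buf is 'abcdefgh'
$rest = read(F, $buf, 100, 8);  # only 2 bytes were left
close(F);
unlink($file);
```

.fi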
3196 .Ip "readdir(DIRHANDLE)" 8 3
3197 .Ip "readdir DIRHANDLE" 8
3198 Returns the next directory entry for a directory opened by opendir().
3199 If used in an array context, returns all the rest of the entries in the
3200 directory.
3201 If there are no more entries, returns an undefined value in a scalar context
3202 or a null list in an array context.
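.Sp
Both calling styles can be tried on the current directory.
This sketch assumes that the directory does not change while it runs, and
it uses rewinddir() to start over between the two passes:
.nf

```perl
opendir(DIR, '.') || die "Can't open current directory: $!";
@all = readdir(DIR);            # array context: all remaining entries
rewinddir(DIR);
$count = 0;
$count++ while defined($entry = readdir(DIR));  # scalar: one at a time
closedir(DIR);
```

.fi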
3203 .Ip "readlink(EXPR)" 8 6
3204 .Ip "readlink EXPR" 8
3205 Returns the value of a symbolic link, if symbolic links are implemented.
3206 If not, gives a fatal error.
3207 If there is some system error, returns the undefined value and sets $! (errno).
3208 If EXPR is omitted, uses $_.
3209 .Ip "recv(SOCKET,SCALAR,LEN,FLAGS)" 8 4
3210 Receives a message on a socket.
Attempts to receive LEN bytes of data into variable SCALAR from the specified
3212 SOCKET filehandle.
3213 Returns the address of the sender, or the undefined value if there's an error.
3214 SCALAR will be grown or shrunk to the length actually read.
3215 Takes the same flags as the system call of the same name.
3216 .Ip "redo LABEL" 8 8
3217 .Ip "redo" 8
3218 The
3219 .I redo
3220 command restarts the loop block without evaluating the conditional again.
3221 The
3222 .I continue
3223 block, if any, is not executed.
3224 If the LABEL is omitted, the command refers to the innermost enclosing loop.
3225 This command is normally used by programs that want to lie to themselves
3226 about what was just input:
3227 .nf
3228
3229 .ne 16
3230         # a simpleminded Pascal comment stripper
3231         # (warning: assumes no { or } in strings)
3232         line: while (<STDIN>) {
3233                 while (s|\|({.*}.*\|){.*}|$1 \||) {}
3234                 s|{.*}| \||;
3235                 if (s|{.*| \||) {
3236                         $front = $_;
3237                         while (<STDIN>) {
3238                                 if (\|/\|}/\|) {        # end of comment?
3239                                         s|^|$front{|;
3240                                         redo line;
3241                                 }
3242                         }
3243                 }
3244                 print;
3245         }
3246
3247 .fi
3248 .Ip "rename(OLDNAME,NEWNAME)" 8 2
3249 Changes the name of a file.
3250 Returns 1 for success, 0 otherwise.
3251 Will not work across filesystem boundaries.
3252 .Ip "require(EXPR)" 8 6
3253 .Ip "require EXPR" 8
3254 .Ip "require" 8
3255 Includes the library file specified by EXPR, or by $_ if EXPR is not supplied.
3256 Has semantics similar to the following subroutine:
3257 .nf
3258
3259         sub require {
3260             local($filename) = @_;
3261             return 1 if $INC{$filename};
3262             local($realfilename,$result);
3263             ITER: {
3264                 foreach $prefix (@INC) {
3265                     $realfilename = "$prefix/$filename";
3266                     if (-f $realfilename) {
3267                         $result = do $realfilename;
3268                         last ITER;
3269                     }
3270                 }
3271                 die "Can't find $filename in \e@INC";
3272             }
3273             die $@ if $@;
3274             die "$filename did not return true value" unless $result;
3275             $INC{$filename} = $realfilename;
3276             $result;
3277         }
3278
3279 .fi
3280 Note that the file will not be included twice under the same specified name.
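.Sp
That a file is included only once can be demonstrated with a throwaway
library.
Everything named here is hypothetical: the sketch writes a one-line library
into /tmp, adds that directory to @INC, and requires the file twice.
.nf

```perl
$lib = "perlreq$$.pl";                  # hypothetical library name
open(LIB, ">/tmp/$lib") || die "Can't create /tmp/$lib: $!";
print LIB '$loads++; 1;';       # counts each inclusion, returns true
close(LIB);

unshift(@INC, '/tmp');
require $lib;                   # included and executed
require $lib;                   # already in %INC, so skipped
unlink("/tmp/$lib");
```

.fi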
3281 .Ip "reset(EXPR)" 8 6
3282 .Ip "reset EXPR" 8
3283 .Ip "reset" 8
3284 Generally used in a
3285 .I continue
3286 block at the end of a loop to clear variables and reset ?? searches
3287 so that they work again.
3288 The expression is interpreted as a list of single characters (hyphens allowed
3289 for ranges).
3290 All variables and arrays beginning with one of those letters are reset to
3291 their pristine state.
3292 If the expression is omitted, one-match searches (?pattern?) are reset to
3293 match again.
3294 Only resets variables or searches in the current package.
3295 Always returns 1.
3296 Examples:
3297 .nf
3298
3299 .ne 3
3300     reset \'X\';        \h'|2i'# reset all X variables
3301     reset \'a\-z\';\h'|2i'# reset lower case variables
3302     reset;      \h'|2i'# just reset ?? searches
3303
3304 .fi
3305 Note: resetting \*(L"A\-Z\*(R" is not recommended since you'll wipe out your ARGV and ENV
3306 arrays.
3307 .Sp
3308 The use of reset on dbm associative arrays does not change the dbm file.
3309 (It does, however, flush any entries cached by perl, which may be useful if
3310 you are sharing the dbm file.
3311 Then again, maybe not.)
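.Sp
The effect on ordinary variables can be sketched as follows.
The variable names are made up; only those beginning with the letter given
to reset are cleared.
.nf

```perl
$xray = 1;
@xlist = (1, 2);
$yarn = 3;
reset 'x';                      # clears $xray and @xlist, not $yarn
$cleared = !defined($xray) && $#xlist == -1;
```

.fi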
3312 .Ip "return LIST" 8 3
3313 Returns from a subroutine with the value specified.
3314 (Note that a subroutine can automatically return
3315 the value of the last expression evaluated.
3316 That's the preferred method\*(--use of an explicit
3317 .I return
3318 is a bit slower.)
3319 .Ip "reverse(LIST)" 8 4
3320 .Ip "reverse LIST" 8
3321 In an array context, returns an array value consisting of the elements
3322 of LIST in the opposite order.
3323 In a scalar context, returns a string value consisting of the bytes of
3324 the first element of LIST in the opposite order.
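.Sp
The two contexts give rather different results, as this small sketch
shows:
.nf

```perl
@rev = reverse('a', 'b', 'c');  # array context: ('c', 'b', 'a')
$str = reverse('hello');        # scalar context: 'olleh'
```

.fi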
3325 .Ip "rewinddir(DIRHANDLE)" 8 5
3326 .Ip "rewinddir DIRHANDLE" 8
3327 Sets the current position to the beginning of the directory for the readdir() routine on DIRHANDLE.
3328 .Ip "rindex(STR,SUBSTR,POSITION)" 8 6
3329 .Ip "rindex(STR,SUBSTR)" 8 4
3330 Works just like index except that it
3331 returns the position of the LAST occurrence of SUBSTR in STR.
3332 If POSITION is specified, returns the last occurrence at or before that
3333 position.
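.Sp
A common use is to split a path name at its last slash.
The sketch below uses a made-up path and assumes positions are counted
from 0:
.nf

```perl
$path = '/usr/local/bin/perl';
$slash = rindex($path, '/');            # 14, the last slash
$base = substr($path, $slash + 1);      # 'perl'
$prev = rindex($path, '/', $slash - 1); # 10, the slash before that
```

.fi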
3334 .Ip "rmdir(FILENAME)" 8 4
3335 .Ip "rmdir FILENAME" 8
3336 Deletes the directory specified by FILENAME if it is empty.
3337 If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno).
3338 If FILENAME is omitted, uses $_.
3339 .Ip "s/PATTERN/REPLACEMENT/gieo" 8 3
3340 Searches a string for a pattern, and if found, replaces that pattern with the
3341 replacement text and returns the number of substitutions made.
3342 Otherwise it returns false (0).
3343 The \*(L"g\*(R" is optional, and if present, indicates that all occurrences
3344 of the pattern are to be replaced.
3345 The \*(L"i\*(R" is also optional, and if present, indicates that matching
3346 is to be done in a case-insensitive manner.
3347 The \*(L"e\*(R" is likewise optional, and if present, indicates that
3348 the replacement string is to be evaluated as an expression rather than just
3349 as a double-quoted string.
3350 Any non-alphanumeric delimiter may replace the slashes;
3351 if single quotes are used, no
3352 interpretation is done on the replacement string (the e modifier overrides
3353 this, however); if backquotes are used, the replacement string is a command
3354 to execute whose output will be used as the actual replacement text.
3355 If no string is specified via the =~ or !~ operator,
3356 the $_ string is searched and modified.
3357 (The string specified with =~ must be a scalar variable, an array element,
3358 or an assignment to one of those, i.e. an lvalue.)
3359 If the pattern contains a $ that looks like a variable rather than an
3360 end-of-string test, the variable will be interpolated into the pattern at
3361 run-time.
3362 If you only want the pattern compiled once the first time the variable is
3363 interpolated, add an \*(L"o\*(R" at the end.
3364 If the PATTERN evaluates to a null string, the most recent successful
3365 regular expression is used instead.
3366 See also the section on regular expressions.
3367 Examples:
3368 .nf
3369
3370     s/\|\e\|bgreen\e\|b/mauve/g;                # don't change wintergreen
3371
3372     $path \|=~ \|s|\|/usr/bin|\|/usr/local/bin|;
3373
3374     s/Login: $foo/Login: $bar/; # run-time pattern
3375
3376     ($foo = $bar) =~ s/bar/foo/;
3377
3378     $_ = \'abc123xyz\';
3379     s/\ed+/$&*2/e;              # yields \*(L'abc246xyz\*(R'
3380     s/\ed+/sprintf("%5d",$&)/e; # yields \*(L'abc  246xyz\*(R'
3381     s/\ew/$& x 2/eg;            # yields \*(L'aabbcc  224466xxyyzz\*(R'
3382
3383     s/\|([^ \|]*\|) *\|([^ \|]*\|)\|/\|$2 $1/;  # reverse 1st two fields
3384
3385 .fi
3386 (Note the use of $ instead of \|\e\| in the last example.  See section
3387 on regular expressions.)
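.Sp
The return value can be put to use as well.
The following sketch (variable names are illustrative, not part of the
examples above) counts the substitutions made and shows the false result
when nothing matches:
.nf

```perl
    $line  = 'a,b,,c';
    $count = ($line =~ s/,/:/g);    # 3; $line is now 'a:b::c'
    $none  = ($line =~ s/;/:/g);    # false: no match, $line unchanged
    ($copy = $line) =~ s/:+/-/g;    # edits the copy only: 'a-b-c'
    $width = 'x=5';
    $width =~ s/[0-9]+/$& * 2/e;    # replacement is evaluated: 'x=10'
```

.fi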
3388 .Ip "scalar(EXPR)" 8 3
3389 Forces EXPR to be interpreted in a scalar context and returns the value
3390 of EXPR.
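.Sp
A small sketch of the difference this makes inside a list, where an array
would otherwise be expanded (the array names are illustrative):
.nf

```perl
    @items = ('a', 'b', 'c');
    @plain = (@items, 'x');             # array context: ('a','b','c','x')
    @count = (scalar(@items), 'x');     # scalar context: (3, 'x')
```

.fi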
3391 .Ip "seek(FILEHANDLE,POSITION,WHENCE)" 8 3
3392 Randomly positions the file pointer for FILEHANDLE, just like the fseek()
3393 call of stdio.
3394 FILEHANDLE may be an expression whose value gives the name of the filehandle.
3395 Returns 1 upon success, 0 otherwise.
3396 .Ip "seekdir(DIRHANDLE,POS)" 8 3
3397 Sets the current position for the readdir() routine on DIRHANDLE.
3398 POS must be a value returned by telldir().
3399 Has the same caveats about possible directory compaction as the corresponding
3400 system library routine.
3401 .Ip "select(FILEHANDLE)" 8 3
3402 .Ip "select" 8 3
3403 Returns the currently selected filehandle.
3404 Sets the current default filehandle for output, if FILEHANDLE is supplied.
3405 This has two effects: first, a
3406 .I write
3407 or a
3408 .I print
3409 without a filehandle will default to this FILEHANDLE.
3410 Second, references to variables related to output will refer to this output
3411 channel.
3412 For example, if you have to set the top of form format for more than
3413 one output channel, you might do the following:
3414 .nf
3415
3416 .ne 4
3417         select(REPORT1);
3418         $^ = \'report1_top\';
3419         select(REPORT2);
3420         $^ = \'report2_top\';
3421
3422 .fi
3423 FILEHANDLE may be an expression whose value gives the name of the actual filehandle.
3424 Thus:
3425 .nf
3426
3427         $oldfh = select(STDERR); $| = 1; select($oldfh);
3428
3429 .fi
3430 .Ip "select(RBITS,WBITS,EBITS,TIMEOUT)" 8 3
3431 This calls the select system call with the bitmasks specified, which can
3432 be constructed using fileno() and vec(), along these lines:
3433 .nf
3434
3435         $rin = $win = $ein = '';
3436         vec($rin,fileno(STDIN),1) = 1;
3437         vec($win,fileno(STDOUT),1) = 1;
3438         $ein = $rin | $win;
3439
3440 .fi
3441 If you want to select on many filehandles you might wish to write a subroutine:
3442 .nf
3443
3444         sub fhbits {
3445             local(@fhlist) = split(' ',$_[0]);
3446             local($bits);
3447             for (@fhlist) {
3448                 vec($bits,fileno($_),1) = 1;
3449             }
3450             $bits;
3451         }
3452         $rin = &fhbits('STDIN TTY SOCK');
3453
3454 .fi
3455 The usual idiom is:
3456 .nf
3457
3458         ($nfound,$timeleft) =
3459           select($rout=$rin, $wout=$win, $eout=$ein, $timeout);
3460
3461 or to block until something becomes ready:
3462
3463 .ie t \{\
3464         $nfound = select($rout=$rin, $wout=$win, $eout=$ein, undef);
3465 'br\}
3466 .el \{\
3467         $nfound = select($rout=$rin, $wout=$win,
3468                                 $eout=$ein, undef);
3469 'br\}
3470
3471 .fi
3472 Any of the bitmasks can also be undef.
3473 The timeout, if specified, is in seconds, which may be fractional.
3474 NOTE: not all implementations are capable of returning the $timeleft.
3475 If not, they always return $timeleft equal to the supplied $timeout.
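.Sp
Since any bitmask may be undef and the timeout may be fractional, one
possible use is a short pause.
This is a sketch only, and assumes the underlying system call honors
fractional timeouts:
.nf

```perl
    # All three bitmasks are undef, so there is nothing to watch and
    # only the timeout matters.
    $nfound = select(undef, undef, undef, 0.25);   # waits about 0.25 second
```

.fi
Here $nfound comes back as 0, since no filehandle could become ready.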
3476 .Ip "semctl(ID,SEMNUM,CMD,ARG)" 8 4
3477 Calls the System V IPC function semctl.  If CMD is &IPC_STAT or
3478 &GETALL, then ARG must be a variable which will hold the returned
3479 semid_ds structure or semaphore value array.  Returns like ioctl: the
3480 undefined value for error, "0 but true" for zero, or the actual return
3481 value otherwise.
3482 .Ip "semget(KEY,NSEMS,SIZE,FLAGS)" 8 4
3483 Calls the System V IPC function semget.  Returns the semaphore id, or
3484 the undefined value if there is an error.
3485 .Ip "semop(KEY,OPSTRING)" 8 4
3486 Calls the System V IPC function semop to perform semaphore operations
3487 such as signaling and waiting.  OPSTRING must be a packed array of
3488 semop structures.  Each semop structure can be generated with
3489 \&'pack("sss", $semnum, $semop, $semflag)'.  The number of semaphore
3490 operations is implied by the length of OPSTRING.  Returns true if
3491 successful, or false if there is an error.  As an example, the
3492 following code waits on semaphore $semnum of semaphore id $semid:
3493 .nf
3494
3495         $semop = pack("sss", $semnum, -1, 0);
3496         die "Semaphore trouble: $!\en" unless semop($semid, $semop);
3497
3498 .fi
3499 To signal the semaphore, replace "-1" with "1".
3500 .Ip "send(SOCKET,MSG,FLAGS,TO)" 8 4
3501 .Ip "send(SOCKET,MSG,FLAGS)" 8
3502 Sends a message on a socket.
3503 Takes the same flags as the system call of the same name.
3504 On unconnected sockets you must specify a destination to send TO.
3505 Returns the number of characters sent, or the undefined value if
3506 there is an error.
3507 .Ip "setpgrp(PID,PGRP)" 8 4
3508 Sets the current process group for the specified PID, 0 for the current
3509 process.
3510 Will produce a fatal error if used on a machine that doesn't implement
3511 setpgrp(2).
3512 .Ip "setpriority(WHICH,WHO,PRIORITY)" 8 4
3513 Sets the current priority for a process, a process group, or a user.
3514 (See setpriority(2).)
3515 Will produce a fatal error if used on a machine that doesn't implement
3516 setpriority(2).
3517 .Ip "setsockopt(SOCKET,LEVEL,OPTNAME,OPTVAL)" 8 3
3518 Sets the socket option requested.
3519 Returns the undefined value if there is an error.
3520 OPTVAL may be specified as undef if you don't want to pass an argument.
3521 .Ip "shift(ARRAY)" 8 6
3522 .Ip "shift ARRAY" 8
3523 .Ip "shift" 8
3524 Shifts the first value of the array off and returns it,
3525 shortening the array by 1 and moving everything down.
3526 If there are no elements in the array, returns the undefined value.
3527 If ARRAY is omitted, shifts the @ARGV array in the main program, and the @_
3528 array in subroutines.
3529 (This is determined lexically.)
3530 See also unshift(), push() and pop().
3531 Shift() and unshift() do the same thing to the left end of an array that push()
3532 and pop() do to the right end.
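.Sp
A brief sketch of the four operations side by side (the array names are
illustrative):
.nf

```perl
    @queue = ('a', 'b', 'c');
    $first = shift(@queue);       # 'a'; @queue is now ('b','c')
    unshift(@queue, 'z');         # left end grows: ('z','b','c')
    $last = pop(@queue);          # 'c'; @queue is now ('z','b')
    push(@queue, 'y');            # right end grows: ('z','b','y')
    @empty = ();
    $nothing = shift(@empty);     # the undefined value
```

.fi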
3533 .Ip "shmctl(ID,CMD,ARG)" 8 4
3534 Calls the System V IPC function shmctl.  If CMD is &IPC_STAT, then ARG
3535 must be a variable which will hold the returned shmid_ds structure.
3536 Returns like ioctl: the undefined value for error, "0 but true" for
3537 zero, or the actual return value otherwise.
3538 .Ip "shmget(KEY,SIZE,FLAGS)" 8 4
3539 Calls the System V IPC function shmget.  Returns the shared memory
3540 segment id, or the undefined value if there is an error.
3541 .Ip "shmread(ID,VAR,POS,SIZE)" 8 4
3542 .Ip "shmwrite(ID,STRING,POS,SIZE)" 8
3543 Reads or writes the System V shared memory segment ID starting at
3544 position POS for size SIZE by attaching to it, copying in/out, and
3545 detaching from it.  When reading, VAR must be a variable which
3546 will hold the data read.  When writing, if STRING is too long,
3547 only SIZE bytes are used; if STRING is too short, nulls are
3548 written to fill out SIZE bytes.  Returns true if successful, or
3549 false if there is an error.
3550 .Ip "shutdown(SOCKET,HOW)" 8 3
3551 Shuts down a socket connection in the manner indicated by HOW, which has
3552 the same interpretation as in the system call of the same name.
3553 .Ip "sin(EXPR)" 8 4
3554 .Ip "sin EXPR" 8
3555 Returns the sine of EXPR (expressed in radians).
3556 If EXPR is omitted, returns the sine of $_.
3557 .Ip "sleep(EXPR)" 8 6
3558 .Ip "sleep EXPR" 8
3559 .Ip "sleep" 8
3560 Causes the script to sleep for EXPR seconds, or forever if no EXPR.
3561 May be interrupted by sending the process a SIGALRM.
3562 Returns the number of seconds actually slept.
3563 .Ip "socket(SOCKET,DOMAIN,TYPE,PROTOCOL)" 8 3
3564 Opens a socket of the specified kind and attaches it to filehandle SOCKET.
3565 DOMAIN, TYPE and PROTOCOL are specified the same as for the system call
3566 of the same name.
3567 You may need to run h2ph on sys/socket.h to get the proper values handy
3568 in a perl library file.
3569 Returns true if successful.
3570 See the example in the section on Interprocess Communication.
3571 .Ip "socketpair(SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL)" 8 3
3572 Creates an unnamed pair of sockets in the specified domain, of the specified
3573 type.
3574 DOMAIN, TYPE and PROTOCOL are specified the same as for the system call
3575 of the same name.
3576 If unimplemented, yields a fatal error.
3577 Returns true if successful.
3578 .Ip "sort(SUBROUTINE LIST)" 8 9
3579 .Ip "sort(LIST)" 8
3580 .Ip "sort SUBROUTINE LIST" 8
3581 .Ip "sort LIST" 8
3582 Sorts the LIST and returns the sorted array value.
3583 Nonexistent values of arrays are stripped out.
3584 If SUBROUTINE is omitted, sorts in standard string comparison order.
3585 If SUBROUTINE is specified, gives the name of a subroutine that returns
3586 an integer less than, equal to, or greater than 0,
3587 depending on how the elements of the array are to be ordered.
3588 (The <=> and cmp operators are extremely useful in such routines.)
3589 In the interests of efficiency the normal calling code for subroutines
3590 is bypassed, with the following effects: the subroutine may not be a recursive
3591 subroutine, and the two elements to be compared are passed into the subroutine
3592 not via @_ but as $a and $b (see example below).
3593 They are passed by reference, so don't modify $a and $b.
3594 SUBROUTINE may be a scalar variable name, in which case the value provides
3595 the name of the subroutine to use.
3596 Examples:
3597 .nf
3598
3599 .ne 4
3600         sub byage {
3601             $age{$a} <=> $age{$b};      # presuming integers
3602         }
3603         @sortedclass = sort byage @class;
3604
3605 .ne 9
3606         sub reverse { $b cmp $a; }
3607         @harry = (\'dog\',\'cat\',\'x\',\'Cain\',\'Abel\');
3608         @george = (\'gone\',\'chased\',\'yz\',\'Punished\',\'Axed\');
3609         print sort @harry;
3610                 # prints AbelCaincatdogx
3611         print sort reverse @harry;
3612                 # prints xdogcatCainAbel
3613         print sort @george, \'to\', @harry;
3614                 # prints AbelAxedCainPunishedcatchaseddoggonetoxyz
3615
3616 .fi
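.Sp
As a further sketch, the fragment below contrasts the default string order
with a numeric comparison routine, and takes the routine name from a scalar
variable.
The names bynum, backwards and $how are hypothetical.
.nf

```perl
    sub bynum     { $a <=> $b; }
    sub backwards { $b cmp $a; }
    @nums    = (10, 9, 100, 1);
    @words   = ('dog', 'cat', 'x');
    @string  = sort @nums;              # string order:  1 10 100 9
    @numeric = sort bynum @nums;        # numeric order: 1 9 10 100
    $how     = 'backwards';
    @rev     = sort $how @words;        # name taken from $how: x dog cat
```

.fi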
3617 .Ip "splice(ARRAY,OFFSET,LENGTH,LIST)" 8 8
3618 .Ip "splice(ARRAY,OFFSET,LENGTH)" 8
3619 .Ip "splice(ARRAY,OFFSET)" 8
3620 Removes the elements designated by OFFSET and LENGTH from an array, and
3621 replaces them with the elements of LIST, if any.
3622 Returns the elements removed from the array.
3623 The array grows or shrinks as necessary.
3624 If LENGTH is omitted, removes everything from OFFSET onward.
3625 The following equivalencies hold (assuming $[ == 0):
3626 .nf
3627
3628         push(@a,$x,$y)\h'|3.5i'splice(@a,$#a+1,0,$x,$y)
3629         pop(@a)\h'|3.5i'splice(@a,-1)
3630         shift(@a)\h'|3.5i'splice(@a,0,1)
3631         unshift(@a,$x,$y)\h'|3.5i'splice(@a,0,0,$x,$y)
3632         $a[$x] = $y\h'|3.5i'splice(@a,$x,1,$y);
3633
3634 Example, assuming array lengths are passed before arrays:
3635
3636         sub aeq {       # compare two array values
3637                 local(@a) = splice(@_,0,shift);
3638                 local(@b) = splice(@_,0,shift);
3639                 return 0 unless @a == @b;       # same len?
3640                 while (@a) {
3641                     return 0 if pop(@a) ne pop(@b);
3642                 }
3643                 return 1;
3644         }
3645         if (&aeq($len,@foo[1..$len],0+@bar,@bar)) { ... }
3646
3647 .fi
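.Sp
A shorter sketch of replacement in the middle of an array, assuming $[ is 0
(the array names are illustrative):
.nf

```perl
    @a = ('a', 'b', 'c', 'd', 'e');
    @removed = splice(@a, 1, 2, 'X', 'Y', 'Z');   # removes ('b','c')
                                        # @a is now ('a','X','Y','Z','d','e')
    @tail = splice(@a, 4);              # no LENGTH: removes ('d','e')
                                        # @a is now ('a','X','Y','Z')
```

.fi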
3648 .Ip "split(/PATTERN/,EXPR,LIMIT)" 8 8
3649 .Ip "split(/PATTERN/,EXPR)" 8 8
3650 .Ip "split(/PATTERN/)" 8
3651 .Ip "split" 8
3652 Splits a string into an array of strings, and returns it.
3653 (If not in an array context, returns the number of fields found and splits
3654 into the @_ array.
3655 (In an array context, you can force the split into @_
3656 by using ?? as the pattern delimiters, but it still returns the array value.))
3657 If EXPR is omitted, splits the $_ string.
3658 If PATTERN is also omitted, splits on whitespace (/[\ \et\en]+/).
3659 Anything matching PATTERN is taken to be a delimiter separating the fields.
3660 (Note that the delimiter may be longer than one character.)
3661 If LIMIT is specified, splits into no more than that many fields (though it
3662 may split into fewer).
3663 If LIMIT is unspecified, trailing null fields are stripped (which
3664 potential users of pop() would do well to remember).
3665 A pattern matching the null string (not to be confused with a null pattern //,
3666 which is just one member of the set of patterns matching a null string)
3667 will split the value of EXPR into separate characters at each point it
3668 matches that way.
3669 For example:
3670 .nf
3671
3672         print join(\':\', split(/ */, \'hi there\'));
3673
3674 .fi
3675 produces the output \*(L'h:i:t:h:e:r:e\*(R'.
3676 .Sp
3677 The LIMIT parameter can be used to partially split a line:
3678 .nf
3679
3680         ($login, $passwd, $remainder) = split(\|/\|:\|/\|, $_, 3);
3681
3682 .fi
3683 (When assigning to a list, if LIMIT is omitted, perl supplies a LIMIT one
3684 larger than the number of variables in the list, to avoid unnecessary work.
3685 For the list above LIMIT would have been 4 by default.
3686 In time critical applications it behooves you not to split into
3687 more fields than you really need.)
3688 .Sp
3689 If the PATTERN contains parentheses, additional array elements are created
3690 from each matching substring in the delimiter.
3691 .Sp
3692         split(/([,-])/,"1-10,20");
3693 .Sp
3694 produces the array value
3695 .Sp
3696         (1,'-',10,',',20)
3697 .Sp
3698 The pattern /PATTERN/ may be replaced with an expression to specify patterns
3699 that vary at runtime.
3700 (To do runtime compilation only once, use /$variable/o.)
3701 As a special case, specifying a space (\'\ \') will split on white space
3702 just as split with no arguments does, but leading white space does NOT
3703 produce a null first field.
3704 Thus, split(\'\ \') can be used to emulate
3705 .IR awk 's
3706 default behavior, whereas
3707 split(/\ /) will give you as many null initial fields as there are
3708 leading spaces.
3709 .Sp
3710 Example:
3711 .nf
3712
3713 .ne 5
3714         open(passwd, \'/etc/passwd\');
3715         while (<passwd>) {
3716 .ie t \{\
3717                 ($login, $passwd, $uid, $gid, $gcos, $home, $shell) = split(\|/\|:\|/\|);
3718 'br\}
3719 .el \{\
3720                 ($login, $passwd, $uid, $gid, $gcos, $home, $shell)
3721                         = split(\|/\|:\|/\|);
3722 'br\}
3723                 .\|.\|.
3724         }
3725
3726 .fi
3727 (Note that $shell above will still have a newline on it.  See chop().)
3728 See also
3729 .IR join .
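.Sp
The treatment of trailing null fields and of leading white space can be
seen in a sketch such as this one (the sample strings and variable names
are illustrative):
.nf

```perl
    $rec  = 'a:b:c:::';
    @f    = split(/:/, $rec);           # no LIMIT: trailing nulls stripped, 3 fields
    @g    = split(/:/, $rec, 10);       # LIMIT given: trailing nulls kept, 6 fields
    $line = '  foo bar';
    @awk  = split(' ', $line);          # ('foo','bar')
    @raw  = split(/ /, $line);          # ('','','foo','bar')
```

.fi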
3730 .Ip "sprintf(FORMAT,LIST)" 8 4
3731 Returns a string formatted by the usual printf conventions.
3732 The * character is not supported.
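.Sp
A sketch of some common conversions follows.
Because * is not available, one possible workaround is to interpolate a
width into the format string; $w here is a hypothetical variable.
.nf

```perl
    $row = sprintf('%-6s|%5.2f|%03d', 'ab', 3.14159, 7);
                                        # 'ab    | 3.14|007'
    $w   = 8;
    $pad = sprintf("%${w}s", 'right');  # format is '%8s': '   right'
```

.fi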
3733 .Ip "sqrt(EXPR)" 8 4
3734 .Ip "sqrt EXPR" 8
3735 Returns the square root of EXPR.
3736 If EXPR is omitted, returns the square root of $_.
3737 .Ip "srand(EXPR)" 8 4
3738 .Ip "srand EXPR" 8
3739 Sets the random number seed for the
3740 .I rand
3741 operator.
3742 If EXPR is omitted, does srand(time).
3743 .Ip "stat(FILEHANDLE)" 8 8
3744 .Ip "stat FILEHANDLE" 8
3745 .Ip "stat(EXPR)" 8
3746 .Ip "stat SCALARVARIABLE" 8
3747 Returns a 13-element array giving the statistics for a file, either the file
3748 opened via FILEHANDLE, or named by EXPR.
3749 Typically used as follows:
3750 .nf
3751
3752 .ne 3
3753     ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
3754        $atime,$mtime,$ctime,$blksize,$blocks)
3755            = stat($filename);
3756
3757 .fi
3758 If stat is passed the special filehandle consisting of an underline,
3759 no stat is done, but the current contents of the stat structure from
3760 the last stat or filetest are returned.
3761 Example:
3762 .nf
3763
3764 .ne 3
3765         if (-x $file && (($d) = stat(_)) && $d < 0) {
3766                 print "$file is executable NFS file\en";
3767         }
3768
3769 .fi
3770 (This only works on machines for which the device number is negative under NFS.)
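.Sp
As a sketch of both forms, the fragment below creates a small file, stats
it by name, and then reuses the stat structure left behind by a file test.
The scratch file name under /tmp is a hypothetical choice.
.nf

```perl
    $file = "/tmp/stat_demo.$$";          # hypothetical scratch file
    open(OUT, ">$file") || die "cannot create $file: $!";
    print OUT 'twelve bytes';
    close(OUT);
    @st    = stat($file);                 # 13 elements on success
    $size  = $st[7];                      # 12
    $mtime = $st[9];
    if (-r $file) {                       # the file test does a stat...
        @again = stat(_);                 # ...whose result stat(_) returns
    }
    unlink($file);
```

.fi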
3771 .Ip "study(SCALAR)" 8 6
3772 .Ip "study SCALAR" 8
3773 .Ip "study" 8
3774 Takes extra time to study SCALAR ($_ if unspecified) in anticipation of
3775 doing many pattern matches on the string before it is next modified.
3776 This may or may not save time, depending on the nature and number of patterns
3777 you are searching on, and on the distribution of character frequencies in
3778 the string to be searched\*(--you probably want to compare runtimes with and
3779 without it to see which runs faster.
3780 Those loops which scan for many short constant strings (including the constant
3781 parts of more complex patterns) will benefit most.
3782 You may have only one study active at a time\*(--if you study a different
3783 scalar the first is \*(L"unstudied\*(R".
3784 (The way study works is this: a linked list of every character in the string
3785 to be searched is made, so we know, for example, where all the \*(L'k\*(R' characters
3786 are.
3787 From each search string, the rarest character is selected, based on some
3788 static frequency tables constructed from some C programs and English text.
3789 Only those places that contain this \*(L"rarest\*(R" character are examined.)
3790 .Sp
3791 For example, here is a loop which inserts index-producing entries before any line
3792 containing a certain pattern:
3793 .nf
3794
3795 .ne 8
3796         while (<>) {
3797                 study;
3798                 print ".IX foo\en" if /\ebfoo\eb/;
3799                 print ".IX bar\en" if /\ebbar\eb/;
3800                 print ".IX blurfl\en" if /\ebblurfl\eb/;
3801                 .\|.\|.
3802                 print;
3803         }
3804
3805 .fi
3806 In searching for /\ebfoo\eb/, only those locations in $_ that contain \*(L'f\*(R'
3807 will be looked at, because \*(L'f\*(R' is rarer than \*(L'o\*(R'.
3808 In general, this is a big win except in pathological cases.
3809 The only question is whether it saves you more time than it took to build
3810 the linked list in the first place.
3811 .Sp
3812 Note that if you have to look for strings that you don't know till runtime,
3813 you can build an entire loop as a string and eval that to avoid recompiling
3814 all your patterns all the time.
3815 Together with undefining $/ to input entire files as one record, this can
3816 be very fast, often faster than specialized programs like fgrep.
3817 The following scans a list of files (@files)
3818 for a list of words (@words), and prints out the names of those files that
3819 contain a match:
3820 .nf
3821
3822 .ne 12
3823         $search = \'while (<>) { study;\';
3824         foreach $word (@words) {
3825             $search .= "++\e$seen{\e$ARGV} if /\eb$word\eb/;\en";
3826         }
3827         $search .= "}";
3828         @ARGV = @files;
3829         undef $/;
3830         eval $search;           # this screams
3831         $/ = "\en";             # put back to normal input delim
3832         foreach $file (sort keys(%seen)) {
3833             print $file, "\en";
3834         }
3835
3836 .fi
3837 .Ip "substr(EXPR,OFFSET,LEN)" 8 2
3838 .Ip "substr(EXPR,OFFSET)" 8 2
3839 Extracts a substring out of EXPR and returns it.
3840 First character is at offset 0, or whatever you've set $[ to.
3841 If OFFSET is negative, starts that far from the end of the string.
3842 If LEN is omitted, returns everything to the end of the string.
3843 You can use the substr() function as an lvalue, in which case EXPR must
3844 be an lvalue.
3845 If you assign something shorter than LEN, the string will shrink, and
3846 if you assign something longer than LEN, the string will grow to accommodate it.
3847 To keep the string the same length you may need to pad or chop your value using
3848 sprintf().
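.Sp
A sketch of extraction and of assignment through substr(), assuming $[ is 0
(the string and variable names are illustrative):
.nf

```perl
    $s = 'The black cat';
    $w = substr($s, 4, 5);              # 'black'
    $e = substr($s, -3);                # 'cat'
    substr($s, 4, 5) = 'tabby';         # same length: 'The tabby cat'
    substr($s, 4, 5) = 'red';           # shorter, string shrinks: 'The red cat'
    substr($s, 0, 3) = sprintf('%-3.3s', 'A');
                                        # padded to 3, length kept: 'A   red cat'
```

.fi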
3849 .Ip "symlink(OLDFILE,NEWFILE)" 8 2
3850 Creates a new filename symbolically linked to the old filename.
3851 Returns 1 for success, 0 otherwise.
3852 On systems that don't support symbolic links, produces a fatal error at
3853 run time.
3854 To check for that, use eval:
3855 .nf
3856
3857         $symlink_exists = (eval \'symlink("","");\', $@ eq \'\');
3858
3859 .fi
3860 .Ip "syscall(LIST)" 8 6
3861 .Ip "syscall LIST" 8
3862 Calls the system call specified as the first element of the list, passing
3863 the remaining elements as arguments to the system call.
3864 If unimplemented, produces a fatal error.
3865 The arguments are interpreted as follows: if a given argument is numeric,
3866 the argument is passed as an int.
3867 If not, the pointer to the string value is passed.
3868 You are responsible for making sure a string is pre-extended long enough
3869 to receive any result that might be written into a string.
3870 If your integer arguments are not literals and have never been interpreted
3871 in a numeric context, you may need to add 0 to them to force them to look
3872 like numbers.
3873 .nf
3874
3875         require 'syscall.ph';           # may need to run h2ph
3876         syscall(&SYS_write, fileno(STDOUT), "hi there\en", 9);
3877
3878 .fi
3879 .Ip "sysread(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
3880 .Ip "sysread(FILEHANDLE,SCALAR,LENGTH)" 8 5
3881 Attempts to read LENGTH bytes of data into variable SCALAR from the specified
3882 FILEHANDLE, using the system call read(2).
3883 It bypasses stdio, so mixing this with other kinds of reads may cause
3884 confusion.
3885 Returns the number of bytes actually read, or undef if there was an error.
3886 SCALAR will be grown or shrunk to the length actually read.
3887 An OFFSET may be specified to place the read data at some other place
3888 than the beginning of the string.
3889 .Ip "system(LIST)" 8 6
3890 .Ip "system LIST" 8
3891 Does exactly the same thing as \*(L"exec LIST\*(R" except that a fork
3892 is done first, and the parent process waits for the child process to complete.
3893 Note that argument processing varies depending on the number of arguments.
3894 The return value is the exit status of the program as returned by the wait()
3895 call.
3896 To get the actual exit value divide by 256.
3897 See also
3898 .IR exec .
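.Sp
A sketch of recovering the exit value from the returned status.
It assumes a Unix-like system with sh on the search path; the commands
themselves are illustrative.
.nf

```perl
    $status = system('sh', '-c', 'exit 3');   # status as returned by wait()
    $exit   = int($status / 256);             # 3, the actual exit value
    $ok     = (system('sh', '-c', 'exit 0') == 0);
                                              # true: a zero status means success
```

.fi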
3899 .Ip "syswrite(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
3900 .Ip "syswrite(FILEHANDLE,SCALAR,LENGTH)" 8 5
3901 Attempts to write LENGTH bytes of data from variable SCALAR to the specified
3902 FILEHANDLE, using the system call write(2).
3903 It bypasses stdio, so mixing this with prints may cause
3904 confusion.
3905 Returns the number of bytes actually written, or undef if there was an error.
3906 An OFFSET may be specified to take the data to be written from some other place
3907 than the beginning of the string.
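.Sp
The sketch below pairs syswrite with sysread and uses an OFFSET in each
call.
The scratch file name and the sample data are hypothetical; no stdio reads
or prints are mixed in on these filehandles.
.nf

```perl
    $file = "/tmp/sysio_demo.$$";         # hypothetical scratch file
    open(OUT, ">$file") || die "cannot create $file: $!";
    $data  = 'headerPAYLOAD';
    $wrote = syswrite(OUT, $data, 7, 6);  # 7 bytes starting at offset 6: 'PAYLOAD'
    close(OUT);
    open(IN, $file) || die "cannot read $file: $!";
    $buf = 'keep:';
    $got = sysread(IN, $buf, 100, 5);     # data lands at offset 5 of $buf
    close(IN);                            # $buf is now 'keep:PAYLOAD'
    unlink($file);
```

.fi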
3908 .Ip "tell(FILEHANDLE)" 8 6
3909 .Ip "tell FILEHANDLE" 8 6
3910 .Ip "tell" 8
3911 Returns the current file position for FILEHANDLE.
3912 FILEHANDLE may be an expression whose value gives the name of the actual
3913 filehandle.
3914 If FILEHANDLE is omitted, assumes the file last read.
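.Sp
A sketch of tell used together with seek, with WHENCE values as for
fseek().
The scratch file name is a hypothetical choice.
.nf

```perl
    $file = "/tmp/seek_demo.$$";          # hypothetical scratch file
    open(FH, "+>$file") || die "cannot open $file: $!";
    print FH 'hello world';
    $end = tell(FH);                      # 11, just past what was written
    seek(FH, 6, 0) || die "cannot seek: $!";   # WHENCE 0: from the start
    read(FH, $buf, 5);                    # 'world'
    $pos = tell(FH);                      # 11 again
    seek(FH, -5, 2);                      # WHENCE 2: relative to the end
    $back = tell(FH);                     # 6
    close(FH);
    unlink($file);
```

.fi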
3915 .Ip "telldir(DIRHANDLE)" 8 5
3916 .Ip "telldir DIRHANDLE" 8
3917 Returns the current position of the readdir() routines on DIRHANDLE.
3918 Value may be given to seekdir() to access a particular location in
3919 a directory.
3920 Has the same caveats about possible directory compaction as the corresponding
3921 system library routine.
3922 .Ip "time" 8 4
3923 Returns the number of non-leap seconds since 00:00:00 UTC, January 1, 1970.
3924 Suitable for feeding to gmtime() and localtime().
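.Sp
As a sketch, the value can be broken down with gmtime() or used in plain
arithmetic, since it is just a count of seconds (the variable names are
illustrative):
.nf

```perl
    $now = time;
    ($sec, $min, $hour, $mday, $mon, $year) = gmtime($now);
    @epoch = gmtime(0);                   # second 0 begins (0,0,0,1,0,70,...)
    $later = $now + 60 * 60;              # one hour from now, still in seconds
```

.fi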
3925 .Ip "times" 8 4
3926 Returns a four-element array giving the user and system times, in seconds, for this
3927 process and the children of this process.
3928 .Sp
3929     ($user,$system,$cuser,$csystem) = times;
3930 .Sp
3931 .Ip "tr/SEARCHLIST/REPLACEMENTLIST/cds" 8 5
3932 .Ip "y/SEARCHLIST/REPLACEMENTLIST/cds" 8
3933 Translates all occurrences of the characters found in the search list into
3934 the corresponding characters in the replacement list.
3935 It returns the number of characters replaced or deleted.
3936 If no string is specified via the =~ or !~ operator,
3937 the $_ string is translated.
3938 (The string specified with =~ must be a scalar variable, an array element,
3939 or an assignment to one of those, i.e. an lvalue.)
3940 For
3941 .I sed
3942 devotees,
3943 .I y
3944 is provided as a synonym for
3945 .IR tr .
3946 .Sp
3947 If the c modifier is specified, the SEARCHLIST character set is complemented.
3948 If the d modifier is specified, any characters specified by SEARCHLIST that
3949 are not found in REPLACEMENTLIST are deleted.
3950 (Note that this is slightly more flexible than the behavior of some
3951 .I tr
3952 programs, which delete anything they find in the SEARCHLIST, period.)
3953 If the s modifier is specified, sequences of characters that were translated
3954 to the same character are squashed down to 1 instance of the character.
3955 .Sp
3956 If the d modifier was used, the REPLACEMENTLIST is always interpreted exactly
3957 as specified.
3958 Otherwise, if the REPLACEMENTLIST is shorter than the SEARCHLIST,
3959 the final character is replicated till it is long enough.
3960 If the REPLACEMENTLIST is null, the SEARCHLIST is replicated.
3961 This latter is useful for counting characters in a class, or for squashing
3962 character sequences in a class.
3963 .Sp
3964 Examples:
3965 .nf
3966
3967     $ARGV[1] \|=~ \|y/A\-Z/a\-z/;       \h'|3i'# canonicalize to lower case
3968
3969     $cnt = tr/*/*/;             \h'|3i'# count the stars in $_
3970
3971     $cnt = tr/0\-9//;           \h'|3i'# count the digits in $_
3972
3973     tr/a\-zA\-Z//s;     \h'|3i'# bookkeeper \-> bokeper
3974
3975     ($HOST = $host) =~ tr/a\-z/A\-Z/;
3976
3977     y/a\-zA\-Z/ /cs;    \h'|3i'# change non-alphas to single space
3978
3979     tr/\e200\-\e377/\e0\-\e177/;\h'|3i'# delete 8th bit
3980
3981 .fi
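.Sp
A further sketch showing the return value and the d and s modifiers on
copies of a string (the sample text and variable names are illustrative):
.nf

```perl
    $_ = 'Hello, World 2024';
    $digits = tr/0-9//;                 # 4; $_ itself is unchanged
    ($low = $_) =~ tr/A-Z/a-z/;         # 'hello, world 2024'
    ($nov = $_) =~ tr/aeiou//d;         # vowels deleted: 'Hll, Wrld 2024'
    ($sq = 'bookkeeper') =~ tr/a-z//s;  # runs squashed: 'bokeper'
```

.fi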
3982 .Ip "truncate(FILEHANDLE,LENGTH)" 8 4
3983 .Ip "truncate(EXPR,LENGTH)" 8
3984 Truncates the file opened on FILEHANDLE, or named by EXPR, to the specified
3985 length.
3986 Produces a fatal error if truncate isn't implemented on your system.
3987 .Ip "umask(EXPR)" 8 4
3988 .Ip "umask EXPR" 8
3989 .Ip "umask" 8
3990 Sets the umask for the process and returns the old one.
3991 If EXPR is omitted, merely returns current umask.
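.Sp
One way to change the mask temporarily and then restore it, as a sketch
(the mask value 077 is only an example):
.nf

```perl
    $old = umask(077);      # no group or other access; old mask is returned
    $cur = umask;           # 077 (decimal 63): this form only reads the mask
    umask($old);            # put the original mask back
    $now = umask;           # equal to $old again
```

.fi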
3992 .Ip "undef(EXPR)" 8 6
3993 .Ip "undef EXPR" 8
3994 .Ip "undef" 8
3995 Undefines the value of EXPR, which must be an lvalue.
3996 Use only on a scalar value, an entire array, or a subroutine name (using &).
3997 (Undef will probably not do what you expect on most predefined variables or
3998 dbm array values.)
3999 Always returns the undefined value.
4000 You can omit the EXPR, in which case nothing is undefined, but you still
4001 get an undefined value that you could, for instance, return from a subroutine.
4002 Examples:
4003 .nf
4004
4005 .ne 6
4006         undef $foo;
4007         undef $bar{'blurfl'};
4008         undef @ary;
4009         undef %assoc;
4010         undef &mysub;
4011         return (wantarray ? () : undef) if $they_blew_it;
4012
4013 .fi
4014 .Ip "unlink(LIST)" 8 4
4015 .Ip "unlink LIST" 8
4016 Deletes a list of files.
4017 Returns the number of files successfully deleted.
4018 .nf
4019
4020 .ne 2
4021         $cnt = unlink \'a\', \'b\', \'c\';
4022         unlink @goners;
4023         unlink <*.bak>;
4024
4025 .fi
4026 Note: unlink will not delete directories unless you are superuser and the
4027 .B \-U
4028 flag is supplied to
4029 .IR perl .
4030 Even if these conditions are met, be warned that unlinking a directory
4031 can inflict damage on your filesystem.
4032 Use rmdir instead.
4033 .Ip "unpack(TEMPLATE,EXPR)" 8 4
4034 Unpack does the reverse of pack: it takes a string representing
4035 a structure and expands it out into an array value, returning the array
4036 value.
4037 (In a scalar context, it merely returns the first value produced.)
4038 The TEMPLATE has the same format as in the pack function.
4039 Here's a subroutine that does substring:
4040 .nf
4041
4042 .ne 4
4043         sub substr {
4044                 local($what,$where,$howmuch) = @_;
4045                 unpack("x$where a$howmuch", $what);
4046         }
4047
4048 .ne 3
4049 and then there's
4050
4051         sub ord { unpack("c",$_[0]); }
4052
4053 .fi
4054 In addition, you may prefix a field with a %<number> to indicate that
4055 you want a <number>-bit checksum of the items instead of the items themselves.
4056 Default is a 16-bit checksum.
4057 For example, the following computes the same number as the System V sum program:
4058 .nf
4059
4060 .ne 4
4061         while (<>) {
4062             $checksum += unpack("%16C*", $_);
4063         }
4064         $checksum %= 65536;
4065
4066 .fi
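.Sp
As another sketch, a fixed-width record can be taken apart with A fields.
The record layout below is hypothetical.
.nf

```perl
    $rec = 'jsmith  0042/bin/sh';
    ($login, $uid, $shell) = unpack('A8 A4 A*', $rec);
                                        # 'jsmith', '0042', '/bin/sh'
    $first = unpack('A8 A4 A*', $rec);  # scalar context: 'jsmith' only
    $sum   = unpack('%16C*', 'AB');     # 65 + 66 = 131
```

.fi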
4067 .Ip "unshift(ARRAY,LIST)" 8 4
4068 Does the opposite of a
4069 .IR shift .
4070 Or the opposite of a
4071 .IR push ,
4072 depending on how you look at it.
4073 Prepends list to the front of the array, and returns the number of elements
4074 in the new array.
4075 .nf
4076
4077         unshift(ARGV, \'\-e\') unless $ARGV[0] =~ /^\-/;
4078
4079 .fi
4080 .Ip "utime(LIST)" 8 2
4081 .Ip "utime LIST" 8 2
4082 Changes the access and modification times on each file of a list of files.
4083 The first two elements of the list must be the NUMERICAL access and
4084 modification times, in that order.
4085 Returns the number of files successfully changed.
4086 The inode modification time of each file is set to the current time.
4087 Example of a \*(L"touch\*(R" command:
4088 .nf
4089
4090 .ne 3
4091         #!/usr/bin/perl
4092         $now = time;
4093         utime $now, $now, @ARGV;
4094
4095 .fi
4096 .Ip "values(ASSOC_ARRAY)" 8 6
4097 .Ip "values ASSOC_ARRAY" 8
4098 Returns a normal array consisting of all the values of the named associative
4099 array.
4100 The values are returned in an apparently random order, but it is the same order
4101 as either the keys() or each() function would produce on the same array.
4102 See also keys() and each().
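.Sp
The matching order of keys() and values() can be checked with a sketch
like the following (the array %age and its contents are illustrative):
.nf

```perl
    %age  = ('tom', 30, 'ann', 25, 'bob', 41);
    @who  = keys(%age);
    @old  = values(%age);                 # same order as @who
    $same = 0;
    for ($i = 0; $i < @who; $i++) {
        $same++ if $age{$who[$i]} == $old[$i];
    }                                     # $same ends up as 3
    $total = 0;
    foreach $n (values(%age)) { $total += $n; }   # 96
```

.fi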
4103 .Ip "vec(EXPR,OFFSET,BITS)" 8 2
4104 Treats a string as a vector of unsigned integers, and returns the value
4105 of the bitfield specified.
4106 May also be assigned to.
4107 BITS must be a power of two from 1 to 32.
4108 .Sp
4109 Vectors created with vec() can also be manipulated with the logical operators
4110 |, & and ^,
4111 which will assume a bit vector operation is desired when both operands are
4112 strings.
4113 This interpretation is not enabled unless there is at least one vec() in
4114 your program, to protect older programs.
4115 .Sp
4116 To transform a bit vector into a string or array of 0's and 1's, use these:
4117 .nf
4118
4119         $bits = unpack("b*", $vector);
4120         @bits = split(//, unpack("b*", $vector));
4121
4122 .fi
4123 If you know the exact length in bits, it can be used in place of the *.
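.Sp
A sketch of setting and testing single bits, and of combining two vectors
with the | operator (the variable names are illustrative):
.nf

```perl
    $vector = '';
    vec($vector, 0, 1) = 1;
    vec($vector, 3, 1) = 1;
    vec($vector, 9, 1) = 1;
    $bits  = unpack('b*', $vector);       # '1001000001000000'
    $on    = vec($vector, 3, 1);          # 1
    $off   = vec($vector, 4, 1);          # 0
    $other = '';
    vec($other, 4, 1) = 1;
    $both  = unpack('b*', $vector | $other);   # '1001100001000000'
```

.fi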
4124 .Ip "wait" 8 6
4125 Waits for a child process to terminate and returns the pid of the deceased
4126 process, or -1 if there are no child processes.
4127 The status is returned in $?.
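.Sp
A sketch of the usual pairing with fork, assuming fork is available and
that no other child processes are outstanding.
The exit value 7 is arbitrary.
.nf

```perl
    $pid = fork;
    die "cannot fork: $!" unless defined($pid);
    if ($pid == 0) {
        exit(7);                          # child: finish at once with value 7
    }
    $dead = wait;                         # parent: pid of the deceased child
    $code = int($? / 256);                # 7
    $none = wait;                         # -1: no child processes remain
```

.fi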
4128 .Ip "waitpid(PID,FLAGS)" 8 6
4129 Waits for a particular child process to terminate and returns the pid of the deceased
4130 process, or -1 if there is no such child process.
4131 The status is returned in $?.
4132 If you say
4133 .nf
4134
        require "sys/wait.ph";
4136         .\|.\|.
4137         waitpid(-1,&WNOHANG);
4138
4139 .fi
4140 then you can do a non-blocking wait for any process.  Non-blocking wait
4141 is only available on machines supporting either the
4142 .I waitpid (2)
4143 or
4144 .I wait4 (2)
4145 system calls.
4146 However, waiting for a particular pid with FLAGS of 0 is implemented
4147 everywhere.  (Perl emulates the system call by remembering the status
4148 values of processes that have exited but have not been harvested by the
4149 Perl script yet.)
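.Sp
A sketch of waiting for one specific child with FLAGS of 0, which should
work even where non-blocking wait is unavailable.
The command run by the child is only an example:
.nf

        $pid = fork;
        die "Can't fork: $!\en" unless defined $pid;
        if ($pid == 0) {
                exec 'sleep', '1';              # in the child
                die "Can't exec: $!\en";
        }
        $got = waitpid($pid, 0);                # blocks until that child exits
        print "child $pid done, status $?\en" if $got == $pid;

.fi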
4150 .Ip "wantarray" 8 4
4151 Returns true if the context of the currently executing subroutine
4152 is looking for an array value.
4153 Returns false if the context is looking for a scalar.
4154 .nf
4155
4156         return wantarray ? () : undef;
4157
4158 .fi
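.Sp
A slightly fuller sketch follows.
The subroutine name minmax is made up; it returns a list in an array
context and a single number in a scalar context:
.nf

        sub minmax {
                local($min, $max) = ($_[0], $_[0]);
                foreach $n (@_) {
                        $min = $n if $n < $min;
                        $max = $n if $n > $max;
                }
                wantarray ? ($min, $max) : $max - $min;
        }
        ($lo, $hi) = &minmax(4, 9, 2);          # array context: (2, 9)
        $range = &minmax(4, 9, 2);              # scalar context: 7

.fi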
4159 .Ip "warn(LIST)" 8 4
4160 .Ip "warn LIST" 8
4161 Produces a message on STDERR just like \*(L"die\*(R", but doesn't exit.
4162 .Ip "write(FILEHANDLE)" 8 6
4163 .Ip "write(EXPR)" 8
4164 .Ip "write" 8
4165 Writes a formatted record (possibly multi-line) to the specified file,
4166 using the format associated with that file.
By default the format for a file is the one having the same name as the
4168 filehandle, but the format for the current output channel (see
4169 .IR select )
4170 may be set explicitly
4171 by assigning the name of the format to the $~ variable.
4172 .Sp
4173 Top of form processing is handled automatically:
4174 if there is insufficient room on the current page for the formatted
4175 record, the page is advanced by writing a form feed,
4176 a special top-of-page format is used
4177 to format the new page header, and then the record is written.
By default the top-of-page format is the name of the filehandle with
\*(L"_TOP\*(R" appended, but it
4179 may be set to the
4180 format of your choice by assigning the name to the $^ variable.
4181 The number of lines remaining on the current page is in variable $-, which
4182 can be set to 0 to force a new page.
4183 .Sp
4184 If FILEHANDLE is unspecified, output goes to the current default output channel,
4185 which starts out as
4186 .I STDOUT
4187 but may be changed by the
4188 .I select
4189 operator.
4190 If the FILEHANDLE is an EXPR, then the expression is evaluated and the
4191 resulting string is used to look up the name of the FILEHANDLE at run time.
4192 For more on formats, see the section on formats later on.
4193 .Sp
4194 Note that write is NOT the opposite of read.
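.Sp
As a minimal sketch of choosing a format by name at run time (the format
name SUMMARY and the variables $item and $count are invented for this
example):
.nf

.ne 9
format SUMMARY =
@<<<<<<<<< @>>>>>
$item,     $count
\&.

$~ = 'SUMMARY';                 # use SUMMARY rather than STDOUT
($item, $count) = ('widgets', 42);
write;                          # one line holding widgets and 42

.fi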
4195 .Sh "Precedence"
4196 .I Perl
4197 operators have the following associativity and precedence:
4198 .nf
4199
4200 nonassoc\h'|1i'print printf exec system sort reverse
4201 \h'1.5i'chmod chown kill unlink utime die return
4202 left\h'|1i',
4203 right\h'|1i'= += \-= *= etc.
4204 right\h'|1i'?:
4205 nonassoc\h'|1i'.\|.
4206 left\h'|1i'||
4207 left\h'|1i'&&
4208 left\h'|1i'| ^
4209 left\h'|1i'&
4210 nonassoc\h'|1i'== != <=> eq ne cmp
4211 nonassoc\h'|1i'< > <= >= lt gt le ge
4212 nonassoc\h'|1i'chdir exit eval reset sleep rand umask
4213 nonassoc\h'|1i'\-r \-w \-x etc.
4214 left\h'|1i'<< >>
4215 left\h'|1i'+ \- .
4216 left\h'|1i'* / % x
4217 left\h'|1i'=~ !~
4218 right\h'|1i'! ~ and unary minus
4219 right\h'|1i'**
4220 nonassoc\h'|1i'++ \-\|\-
4221 left\h'|1i'\*(L'(\*(R'
4222
4223 .fi
4224 As mentioned earlier, if any list operator (print, etc.) or
4225 any unary operator (chdir, etc.)
4226 is followed by a left parenthesis as the next token on the same line,
4227 the operator and arguments within parentheses are taken to
4228 be of highest precedence, just like a normal function call.
4229 Examples:
4230 .nf
4231
4232         chdir $foo || die;\h'|3i'# (chdir $foo) || die
4233         chdir($foo) || die;\h'|3i'# (chdir $foo) || die
4234         chdir ($foo) || die;\h'|3i'# (chdir $foo) || die
4235         chdir +($foo) || die;\h'|3i'# (chdir $foo) || die
4236
4237 but, because * is higher precedence than ||:
4238
4239         chdir $foo * 20;\h'|3i'# chdir ($foo * 20)
4240         chdir($foo) * 20;\h'|3i'# (chdir $foo) * 20
4241         chdir ($foo) * 20;\h'|3i'# (chdir $foo) * 20
4242         chdir +($foo) * 20;\h'|3i'# chdir ($foo * 20)
4243
4244         rand 10 * 20;\h'|3i'# rand (10 * 20)
4245         rand(10) * 20;\h'|3i'# (rand 10) * 20
4246         rand (10) * 20;\h'|3i'# (rand 10) * 20
4247         rand +(10) * 20;\h'|3i'# rand (10 * 20)
4248
4249 .fi
4250 In the absence of parentheses,
4251 the precedence of list operators such as print, sort or chmod is
either very high or very low depending on whether you look at the left
side of the operator or the right side of it.
4254 For example, in
4255 .nf
4256
4257         @ary = (1, 3, sort 4, 2);
4258         print @ary;             # prints 1324
4259
4260 .fi
4261 the commas on the right of the sort are evaluated before the sort, but
4262 the commas on the left are evaluated after.
4263 In other words, list operators tend to gobble up all the arguments that
4264 follow them, and then act like a simple term with regard to the preceding
4265 expression.
4266 Note that you have to be careful with parens:
4267 .nf
4268
4269 .ne 3
4270         # These evaluate exit before doing the print:
4271         print($foo, exit);      # Obviously not what you want.
4272         print $foo, exit;       # Nor is this.
4273
4274 .ne 4
4275         # These do the print before evaluating exit:
4276         (print $foo), exit;     # This is what you want.
4277         print($foo), exit;      # Or this.
4278         print ($foo), exit;     # Or even this.
4279
4280 Also note that
4281
4282         print ($foo & 255) + 1, "\en";
4283
4284 .fi
4285 probably doesn't do what you expect at first glance.
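.PP
The parentheses immediately following print are taken as its complete
argument list, so the addition and the newline are applied to the value
returned by print and are never printed.
A sketch of two ways to get the intended output, assuming $foo holds a
number:
.nf

        $foo = 511;
        print (($foo & 255) + 1, "\en");        # prints 256
        print +($foo & 255) + 1, "\en";         # prints 256

.fi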
4286 .Sh "Subroutines"
4287 A subroutine may be declared as follows:
4288 .nf
4289
4290     sub NAME BLOCK
4291
4292 .fi
4293 .PP
4294 Any arguments passed to the routine come in as array @_,
4295 that is ($_[0], $_[1], .\|.\|.).
4296 The array @_ is a local array, but its values are references to the
4297 actual scalar parameters.
4298 The return value of the subroutine is the value of the last expression
4299 evaluated, and can be either an array value or a scalar value.
Alternatively, a return statement may be used to specify the returned value and
4301 exit the subroutine.
4302 To create local variables see the
4303 .I local
4304 operator.
4305 .PP
4306 A subroutine is called using the
4307 .I do
4308 operator or the & operator.
4309 .nf
4310
4311 .ne 12
4312 Example:
4313
4314         sub MAX {
4315                 local($max) = pop(@_);
4316                 foreach $foo (@_) {
4317                         $max = $foo \|if \|$max < $foo;
4318                 }
4319                 $max;
4320         }
4321
4322         .\|.\|.
4323         $bestday = &MAX($mon,$tue,$wed,$thu,$fri);
4324
4325 .ne 21
4326 Example:
4327
4328         # get a line, combining continuation lines
4329         #  that start with whitespace
4330         sub get_line {
4331                 $thisline = $lookahead;
4332                 line: while ($lookahead = <STDIN>) {
4333                         if ($lookahead \|=~ \|/\|^[ \^\e\|t]\|/\|) {
4334                                 $thisline \|.= \|$lookahead;
4335                         }
4336                         else {
4337                                 last line;
4338                         }
4339                 }
4340                 $thisline;
4341         }
4342
4343         $lookahead = <STDIN>;   # get first line
4344         while ($_ = do get_line(\|)) {
4345                 .\|.\|.
4346         }
4347
4348 .fi
4349 .nf
4350 .ne 6
4351 Use array assignment to a local list to name your formal arguments:
4352
4353         sub maybeset {
4354                 local($key, $value) = @_;
4355                 $foo{$key} = $value unless $foo{$key};
4356         }
4357
4358 .fi
4359 This also has the effect of turning call-by-reference into call-by-value,
4360 since the assignment copies the values.
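.Sp
To illustrate the difference, here is a sketch with two made-up
routines: one modifies its argument in place through $_[0], the other
works on a local copy.
.nf

        sub incr_ref { $_[0]++; }
        sub incr_val { local($n) = @_; $n++; $n; }

        $count = 5;
        &incr_ref($count);              # $count is now 6
        $new = &incr_val($count);       # $new is 7, $count is still 6

.fi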
4361 .Sp
4362 Subroutines may be called recursively.
4363 If a subroutine is called using the & form, the argument list is optional.
4364 If omitted, no @_ array is set up for the subroutine; the @_ array at the
time of the call is visible to the subroutine instead.
4366 .nf
4367
4368         do foo(1,2,3);          # pass three arguments
4369         &foo(1,2,3);            # the same
4370
4371         do foo();               # pass a null list
4372         &foo();                 # the same
4373         &foo;                   # pass no arguments\*(--more efficient
4374
4375 .fi
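.Sp
A short sketch of the last form (the subroutine names are invented):
because inner is called without an argument list, it sees the @_ of its
caller directly.
.nf

        sub inner { $_[0] . '-' . $_[1]; }
        sub outer { &inner; }                   # no new @_ is set up
        print &outer('a', 'b'), "\en";          # prints a-b

.fi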
4376 .Sh "Passing By Reference"
4377 Sometimes you don't want to pass the value of an array to a subroutine but
4378 rather the name of it, so that the subroutine can modify the global copy
4379 of it rather than working with a local copy.
4380 In perl you can refer to all the objects of a particular name by prefixing
4381 the name with a star: *foo.
4382 When evaluated, it produces a scalar value that represents all the objects
4383 of that name, including any filehandle, format or subroutine.
4384 When assigned to within a local() operation, it causes the name mentioned
4385 to refer to whatever * value was assigned to it.
4386 Example:
4387 .nf
4388
4389         sub doubleary {
4390             local(*someary) = @_;
4391             foreach $elem (@someary) {
4392                 $elem *= 2;
4393             }
4394         }
4395         do doubleary(*foo);
4396         do doubleary(*bar);
4397
4398 .fi
4399 Assignment to *name is currently recommended only inside a local().
4400 You can actually assign to *name anywhere, but the previous referent of
4401 *name may be stranded forever.
4402 This may or may not bother you.
4403 .Sp
4404 Note that scalars are already passed by reference, so you can modify scalar
4405 arguments without using this mechanism by referring explicitly to the $_[nnn]
4406 in question.
4407 You can modify all the elements of an array by passing all the elements
4408 as scalars, but you have to use the * mechanism to push, pop or change the
4409 size of an array.
4410 The * mechanism will probably be more efficient in any case.
4411 .Sp
4412 Since a *name value contains unprintable binary data, if it is used as
4413 an argument in a print, or as a %s argument in a printf or sprintf, it
4414 then has the value '*name', just so it prints out pretty.
4415 .Sp
4416 Even if you don't want to modify an array, this mechanism is useful for
4417 passing multiple arrays in a single LIST, since normally the LIST mechanism
4418 will merge all the array values so that you can't extract out the
4419 individual arrays.
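.Sp
As a sketch of that last point (the subroutine and array names are only
illustrative), two arrays can be passed as *names and kept distinct
inside the subroutine:
.nf

        sub longer {
                local(*first, *second) = @_;
                @first > @second ? 'first' : 'second';
        }
        @a = (1, 2, 3);
        @b = (4, 5);
        print &longer(*a, *b), "\en";           # prints first

.fi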
4420 .Sh "Regular Expressions"
4421 The patterns used in pattern matching are regular expressions such as
4422 those supplied in the Version 8 regexp routines.
4423 (In fact, the routines are derived from Henry Spencer's freely redistributable
4424 reimplementation of the V8 routines.)
4425 In addition, \ew matches an alphanumeric character (including \*(L"_\*(R") and \eW a nonalphanumeric.
4426 Word boundaries may be matched by \eb, and non-boundaries by \eB.
4427 A whitespace character is matched by \es, non-whitespace by \eS.
4428 A numeric character is matched by \ed, non-numeric by \eD.
4429 You may use \ew, \es and \ed within character classes.
4430 Also, \en, \er, \ef, \et and \eNNN have their normal interpretations.
4431 Within character classes \eb represents backspace rather than a word boundary.
4432 Alternatives may be separated by |.
4433 The bracketing construct \|(\ .\|.\|.\ \|) may also be used, in which case \e<digit>
4434 matches the digit'th substring.
4435 (Outside of the pattern, always use $ instead of \e in front of the digit.
4436 The scope of $<digit> (and $\`, $& and $\')
4437 extends to the end of the enclosing BLOCK or eval string, or to
4438 the next pattern match with subexpressions.
4439 The \e<digit> notation sometimes works outside the current pattern, but should
4440 not be relied upon.)
4441 You may have as many parentheses as you wish.  If you have more than 9
4442 substrings, the variables $10, $11, ... refer to the corresponding
4443 substring.  Within the pattern, \e10, \e11,
4444 etc. refer back to substrings if there have been at least that many left parens
before the backreference.  Otherwise (for backward compatibility) \e10
4446 is the same as \e010, a backspace,
4447 and \e11 the same as \e011, a tab.
4448 And so on.
4449 (\e1 through \e9 are always backreferences.)
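.PP
For example, a backreference inside the pattern can detect a doubled
word.
This is only a sketch, and the sample string is invented:
.nf

        $_ = 'Paris in the the spring';
        if (/\eb(\ew+) \e1\eb/) {
                print "doubled word: $1\en";    # prints doubled word: the
        }

.fi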
4450 .PP
4451 $+ returns whatever the last bracket match matched.
4452 $& returns the entire matched string.
4453 ($0 used to return the same thing, but not any more.)
4454 $\` returns everything before the matched string.
4455 $\' returns everything after the matched string.
4456 Examples:
4457 .nf
4458
4459         s/\|^\|([^ \|]*\|) \|*([^ \|]*\|)\|/\|$2 $1\|/; # swap first two words
4460
4461 .ne 5
4462         if (/\|Time: \|(.\|.\|):\|(.\|.\|):\|(.\|.\|)\|/\|) {
4463                 $hours = $1;
4464                 $minutes = $2;
4465                 $seconds = $3;
4466         }
4467
4468 .fi
4469 By default, the ^ character is only guaranteed to match at the beginning
4470 of the string,
4471 the $ character only at the end (or before the newline at the end)
4472 and
4473 .I perl
4474 does certain optimizations with the assumption that the string contains
4475 only one line.
4476 The behavior of ^ and $ on embedded newlines will be inconsistent.
4477 You may, however, wish to treat a string as a multi-line buffer, such that
4478 the ^ will match after any newline within the string, and $ will match
4479 before any newline.
4480 At the cost of a little more overhead, you can do this by setting the variable
4481 $* to 1.
4482 Setting it back to 0 makes
4483 .I perl
4484 revert to its old behavior.
4485 .PP
4486 To facilitate multi-line substitutions, the . character never matches a newline
4487 (even when $* is 0).
4488 In particular, the following leaves a newline on the $_ string:
4489 .nf
4490
4491         $_ = <STDIN>;
4492         s/.*(some_string).*/$1/;
4493
4494 If the newline is unwanted, try one of
4495
4496         s/.*(some_string).*\en/$1/;
4497         s/.*(some_string)[^\e000]*/$1/;
4498         s/.*(some_string)(.|\en)*/$1/;
4499         chop; s/.*(some_string).*/$1/;
4500         /(some_string)/ && ($_ = $1);
4501
4502 .fi
4503 Any item of a regular expression may be followed with digits in curly brackets
4504 of the form {n,m}, where n gives the minimum number of times to match the item
4505 and m gives the maximum.
4506 The form {n} is equivalent to {n,n} and matches exactly n times.
4507 The form {n,} matches n or more times.
4508 (If a curly bracket occurs in any other context, it is treated as a regular
4509 character.)
4510 The * modifier is equivalent to {0,}, the + modifier to {1,} and the ? modifier
4511 to {0,1}.
4512 There is no limit to the size of n or m, but large numbers will chew up
4513 more memory.
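.Sp
A brief sketch of the curly-bracket forms (the sample strings are made
up); each of the three statements prints its word:
.nf

        print "zip\en"  if '90210' =~ /^\ed{5}$/;       # exactly five digits
        print "word\en" if 'perl' =~ /^\ew{2,8}$/;      # two to eight characters
        print "long\en" if 'aaaa' =~ /^a{3,}$/;         # three or more a's

.fi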
4514 .Sp
4515 You will note that all backslashed metacharacters in
4516 .I perl
4517 are alphanumeric,
4518 such as \eb, \ew, \en.
4519 Unlike some other regular expression languages, there are no backslashed
4520 symbols that aren't alphanumeric.
4521 So anything that looks like \e\e, \e(, \e), \e<, \e>, \e{, or \e} is always
4522 interpreted as a literal character, not a metacharacter.
4523 This makes it simple to quote a string that you want to use for a pattern
4524 but that you are afraid might contain metacharacters.
4525 Simply quote all the non-alphanumeric characters:
4526 .nf
4527
4528         $pattern =~ s/(\eW)/\e\e$1/g;
4529
4530 .fi
4531 .Sh "Formats"
4532 Output record formats for use with the
4533 .I write
operator may be declared as follows:
4535 .nf
4536
4537 .ne 3
4538     format NAME =
4539     FORMLIST
4540     .
4541
4542 .fi
4543 If name is omitted, format \*(L"STDOUT\*(R" is defined.
4544 FORMLIST consists of a sequence of lines, each of which may be of one of three
4545 types:
4546 .Ip 1. 4
4547 A comment.
4548 .Ip 2. 4
4549 A \*(L"picture\*(R" line giving the format for one output line.
4550 .Ip 3. 4
4551 An argument line supplying values to plug into a picture line.
4552 .PP
4553 Picture lines are printed exactly as they look, except for certain fields
4554 that substitute values into the line.
4555 Each picture field starts with either @ or ^.
4556 The @ field (not to be confused with the array marker @) is the normal
4557 case; ^ fields are used
4558 to do rudimentary multi-line text block filling.
4559 The length of the field is supplied by padding out the field
4560 with multiple <, >, or | characters to specify, respectively, left justification,
4561 right justification, or centering.
4562 As an alternate form of right justification,
4563 you may also use # characters (with an optional .) to specify a numeric field.
4564 (Use of ^ instead of @ causes the field to be blanked if undefined.)
4565 If any of the values supplied for these fields contains a newline, only
4566 the text up to the newline is printed.
4567 The special field @* can be used for printing multi-line values.
4568 It should appear by itself on a line.
4569 .PP
4570 The values are specified on the following line, in the same order as
4571 the picture fields.
4572 The values should be separated by commas.
4573 .PP
4574 Picture fields that begin with ^ rather than @ are treated specially.
4575 The value supplied must be a scalar variable name which contains a text
4576 string.
4577 .I Perl
4578 puts as much text as it can into the field, and then chops off the front
4579 of the string so that the next time the variable is referenced,
4580 more of the text can be printed.
4581 Normally you would use a sequence of fields in a vertical stack to print
4582 out a block of text.
4583 If you like, you can end the final field with .\|.\|., which will appear in the
4584 output if the text was too long to appear in its entirety.
4585 You can change which characters are legal to break on by changing the
4586 variable $: to a list of the desired characters.
4587 .PP
4588 Since use of ^ fields can produce variable length records if the text to be
4589 formatted is short, you can suppress blank lines by putting the tilde (~)
4590 character anywhere in the line.
4591 (Normally you should put it in the front if possible, for visibility.)
4592 The tilde will be translated to a space upon output.
4593 If you put a second tilde contiguous to the first, the line will be repeated
4594 until all the fields on the line are exhausted.
4595 (If you use a field of the @ variety, the expression you supply had better
4596 not give the same value every time forever!)
4597 .PP
4598 Examples:
4599 .nf
4600 .lg 0
4601 .cs R 25
4602 .ft C
4603
4604 .ne 10
4605 # a report on the /etc/passwd file
4606 format STDOUT_TOP =
4607 \&                        Passwd File
4608 Name                Login    Office   Uid   Gid Home
4609 ------------------------------------------------------------------
4610 \&.
4611 format STDOUT =
4612 @<<<<<<<<<<<<<<<<<< @||||||| @<<<<<<@>>>> @>>>> @<<<<<<<<<<<<<<<<<
4613 $name,              $login,  $office,$uid,$gid, $home
4614 \&.
4615
4616 .ne 29
4617 # a report from a bug report form
4618 format STDOUT_TOP =
4619 \&                        Bug Reports
4620 @<<<<<<<<<<<<<<<<<<<<<<<     @|||         @>>>>>>>>>>>>>>>>>>>>>>>
4621 $system,                      $%,         $date
4622 ------------------------------------------------------------------
4623 \&.
4624 format STDOUT =
4625 Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4626 \&         $subject
4627 Index: @<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4628 \&       $index,                       $description
4629 Priority: @<<<<<<<<<< Date: @<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4630 \&          $priority,        $date,   $description
4631 From: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4632 \&      $from,                         $description
4633 Assigned to: @<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4634 \&             $programmer,            $description
4635 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4636 \&                                     $description
4637 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4638 \&                                     $description
4639 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4640 \&                                     $description
4641 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
4642 \&                                     $description
4643 \&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<...
4644 \&                                     $description
4645 \&.
4646
4647 .ft R
4648 .cs R
4649 .lg
4650 .fi
4651 It is possible to intermix prints with writes on the same output channel,
4652 but you'll have to handle $\- (lines left on the page) yourself.
4653 .PP
4654 If you are printing lots of fields that are usually blank, you should consider
4655 using the reset operator between records.
4656 Not only is it more efficient, but it can prevent the bug of adding another
4657 field and forgetting to zero it.
4658 .Sh "Interprocess Communication"
4659 The IPC facilities of perl are built on the Berkeley socket mechanism.
4660 If you don't have sockets, you can ignore this section.
4661 The calls have the same names as the corresponding system calls,
4662 but the arguments tend to differ, for two reasons.
4663 First, perl file handles work differently than C file descriptors.
4664 Second, perl already knows the length of its strings, so you don't need
4665 to pass that information.
4666 Here is a sample client (untested):
4667 .nf
4668
4669         ($them,$port) = @ARGV;
4670         $port = 2345 unless $port;
4671         $them = 'localhost' unless $them;
4672
4673         $SIG{'INT'} = 'dokill';
4674         sub dokill { kill 9,$child if $child; }
4675
4676         require 'sys/socket.ph';
4677
4678         $sockaddr = 'S n a4 x8';
4679         chop($hostname = `hostname`);
4680
4681         ($name, $aliases, $proto) = getprotobyname('tcp');
4682         ($name, $aliases, $port) = getservbyname($port, 'tcp')
4683                 unless $port =~ /^\ed+$/;
4684 .ie t \{\
4685         ($name, $aliases, $type, $len, $thisaddr) = gethostbyname($hostname);
4686 'br\}
4687 .el \{\
4688         ($name, $aliases, $type, $len, $thisaddr) =
4689                                         gethostbyname($hostname);
4690 'br\}
4691         ($name, $aliases, $type, $len, $thataddr) = gethostbyname($them);
4692
4693         $this = pack($sockaddr, &AF_INET, 0, $thisaddr);
4694         $that = pack($sockaddr, &AF_INET, $port, $thataddr);
4695
4696         socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!";
4697         bind(S, $this) || die "bind: $!";
4698         connect(S, $that) || die "connect: $!";
4699
4700         select(S); $| = 1; select(stdout);
4701
4702         if ($child = fork) {
4703                 while (<>) {
4704                         print S;
4705                 }
4706                 sleep 3;
4707                 do dokill();
4708         }
4709         else {
4710                 while (<S>) {
4711                         print;
4712                 }
4713         }
4714
4715 .fi
4716 And here's a server:
4717 .nf
4718
4719         ($port) = @ARGV;
4720         $port = 2345 unless $port;
4721
4722         require 'sys/socket.ph';
4723
4724         $sockaddr = 'S n a4 x8';
4725
4726         ($name, $aliases, $proto) = getprotobyname('tcp');
4727         ($name, $aliases, $port) = getservbyname($port, 'tcp')
4728                 unless $port =~ /^\ed+$/;
4729
4730         $this = pack($sockaddr, &AF_INET, $port, "\e0\e0\e0\e0");
4731
4732         select(NS); $| = 1; select(stdout);
4733
4734         socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!";
4735         bind(S, $this) || die "bind: $!";
        listen(S, 5) || die "listen: $!";
4737
4738         select(S); $| = 1; select(stdout);
4739
4740         for (;;) {
4741                 print "Listening again\en";
4742                 ($addr = accept(NS,S)) || die $!;
4743                 print "accept ok\en";
4744
4745                 ($af,$port,$inetaddr) = unpack($sockaddr,$addr);
4746                 @inetaddr = unpack('C4',$inetaddr);
4747                 print "$af $port @inetaddr\en";
4748
4749                 while (<NS>) {
4750                         print;
4751                         print NS;
4752                 }
4753         }
4754
4755 .fi
4756 .Sh "Predefined Names"
4757 The following names have special meaning to
4758 .IR perl .
4759 I could have used alphabetic symbols for some of these, but I didn't want
4760 to take the chance that someone would say reset \*(L"a\-zA\-Z\*(R" and wipe them all
4761 out.
4762 You'll just have to suffer along with these silly symbols.
4763 Most of them have reasonable mnemonics, or analogues in one of the shells.
4764 .Ip $_ 8
4765 The default input and pattern-searching space.
4766 The following pairs are equivalent:
4767 .nf
4768
4769 .ne 2
4770         while (<>) {\|.\|.\|.   # only equivalent in while!
4771         while ($_ = <>) {\|.\|.\|.
4772
4773 .ne 2
4774         /\|^Subject:/
4775         $_ \|=~ \|/\|^Subject:/
4776
4777 .ne 2
4778         y/a\-z/A\-Z/
4779         $_ =~ y/a\-z/A\-Z/
4780
4781 .ne 2
4782         chop
4783         chop($_)
4784
4785 .fi
4786 (Mnemonic: underline is understood in certain operations.)
4787 .Ip $. 8
4788 The current input line number of the last filehandle that was read.
4789 Readonly.
4790 Remember that only an explicit close on the filehandle resets the line number.
4791 Since <> never does an explicit close, line numbers increase across ARGV files
4792 (but see examples under eof).
4793 (Mnemonic: many programs use . to mean the current line number.)
4794 .Ip $/ 8
4795 The input record separator, newline by default.
4796 Works like
4797 .IR awk 's
4798 RS variable, including treating blank lines as delimiters
4799 if set to the null string.
4800 You may set it to a multicharacter string to match a multi-character
4801 delimiter.
4802 (Mnemonic: / is used to delimit line boundaries when quoting poetry.)
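.Sp
A sketch of reading a paragraph at a time, assuming input whose
paragraphs are separated by blank lines (STDIN is used here purely for
illustration):
.nf

        $/ = "";                        # treat blank lines as delimiters
        $count = 0;
        while (<STDIN>) {
                $count++;               # $_ holds one whole paragraph
        }
        print "$count paragraphs\en";

.fi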
4803 .Ip $, 8
4804 The output field separator for the print operator.
4805 Ordinarily the print operator simply prints out the comma separated fields
4806 you specify.
4807 In order to get behavior more like
4808 .IR awk ,
4809 set this variable as you would set
4810 .IR awk 's
4811 OFS variable to specify what is printed between fields.
4812 (Mnemonic: what is printed when there is a , in your print statement.)
4813 .Ip $"" 8
4814 This is like $, except that it applies to array values interpolated into
4815 a double-quoted string (or similar interpreted string).
4816 Default is a space.
4817 (Mnemonic: obvious, I think.)
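.Sp
A sketch, with made-up values:
.nf

        @path = ('usr', 'local', 'bin');
        $" = '/';
        print "/@path\en";              # prints /usr/local/bin

.fi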
4818 .Ip $\e 8
4819 The output record separator for the print operator.
4820 Ordinarily the print operator simply prints out the comma separated fields
4821 you specify, with no trailing newline or record separator assumed.
4822 In order to get behavior more like
4823 .IR awk ,
4824 set this variable as you would set
4825 .IR awk 's
4826 ORS variable to specify what is printed at the end of the print.
4827 (Mnemonic: you set $\e instead of adding \en at the end of the print.
4828 Also, it's just like /, but it's what you get \*(L"back\*(R" from
4829 .IR perl .)
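.Sp
For instance, a sketch that sets both the field and the record
separator (the values chosen are arbitrary):
.nf

        $, = ':';                       # printed between fields
        $\e = "\en";                    # printed at the end of each print
        print 'root', 0, '/bin/sh';     # prints root:0:/bin/sh and a newline

.fi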
4830 .Ip $# 8
4831 The output format for printed numbers.
4832 This variable is a half-hearted attempt to emulate
4833 .IR awk 's
4834 OFMT variable.
4835 There are times, however, when
4836 .I awk
4837 and
4838 .I perl
4839 have differing notions of what
4840 is in fact numeric.
4841 Also, the initial value is %.20g rather than %.6g, so you need to set $#
4842 explicitly to get
4843 .IR awk 's
4844 value.
4845 (Mnemonic: # is the number sign.)
4846 .Ip $% 8
4847 The current page number of the currently selected output channel.
4848 (Mnemonic: % is page number in nroff.)
4849 .Ip $= 8
4850 The current page length (printable lines) of the currently selected output
4851 channel.
4852 Default is 60.
4853 (Mnemonic: = has horizontal lines.)
4854 .Ip $\- 8
4855 The number of lines left on the page of the currently selected output channel.
4856 (Mnemonic: lines_on_page \- lines_printed.)
4857 .Ip $~ 8
4858 The name of the current report format for the currently selected output
4859 channel.
4860 Default is name of the filehandle.
4861 (Mnemonic: brother to $^.)
4862 .Ip $^ 8
4863 The name of the current top-of-page format for the currently selected output
4864 channel.
4865 Default is name of the filehandle with \*(L"_TOP\*(R" appended.
4866 (Mnemonic: points to top of page.)
4867 .Ip $| 8
4868 If set to nonzero, forces a flush after every write or print on the currently
4869 selected output channel.
4870 Default is 0.
4871 Note that
4872 .I STDOUT
4873 will typically be line buffered if output is to the
4874 terminal and block buffered otherwise.
4875 Setting this variable is useful primarily when you are outputting to a pipe,
4876 such as when you are running a
4877 .I perl
4878 script under rsh and want to see the
4879 output as it's happening.
4880 (Mnemonic: when you want your pipes to be piping hot.)
4881 .Ip $$ 8
4882 The process number of the
4883 .I perl
4884 running this script.
4885 (Mnemonic: same as shells.)
4886 .Ip $? 8
4887 The status returned by the last pipe close, backtick (\`\`) command or
4888 .I system
4889 operator.
4890 Note that this is the status word returned by the wait() system
4891 call, so the exit value of the subprocess is actually ($? >> 8).
4892 $? & 255 gives which signal, if any, the process died from, and whether
4893 there was a core dump.
4894 (Mnemonic: similar to sh and ksh.)
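.Sp
A sketch of decoding the status word after system.
The command is only an example, and the split of the low byte into a
signal number (low seven bits) and a core dump flag (the 128 bit) is an
assumption that holds on typical Unix systems:
.nf

        system('sh', '-c', 'exit 2');
        $exit   = $? >> 8;              # 2
        $signal = $? & 127;             # 0 unless killed by a signal
        $core   = $? & 128;             # nonzero if there was a core dump
        print "exit $exit, signal $signal\en";

.fi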
4895 .Ip $& 8 4
4896 The string matched by the last pattern match (not counting any matches hidden
4897 within a BLOCK or eval enclosed by the current BLOCK).
4898 (Mnemonic: like & in some editors.)
4899 .Ip $\` 8 4
4900 The string preceding whatever was matched by the last pattern match
4901 (not counting any matches hidden within a BLOCK or eval enclosed by the current
4902 BLOCK).
4903 (Mnemonic: \` often precedes a quoted string.)
4904 .Ip $\' 8 4
4905 The string following whatever was matched by the last pattern match
4906 (not counting any matches hidden within a BLOCK or eval enclosed by the current
4907 BLOCK).
4908 (Mnemonic: \' often follows a quoted string.)
4909 Example:
4910 .nf
4911
4912 .ne 3
4913         $_ = \'abcdefghi\';
4914         /def/;
4915         print "$\`:$&:$\'\en";          # prints abc:def:ghi
4916
4917 .fi
4918 .Ip $+ 8 4
4919 The last bracket matched by the last search pattern.
4920 This is useful if you don't know which of a set of alternative patterns
4921 matched.
4922 For example:
4923 .nf
4924
4925     /Version: \|(.*\|)|Revision: \|(.*\|)\|/ \|&& \|($rev = $+);
4926
4927 .fi
4928 (Mnemonic: be positive and forward looking.)
4929 .Ip $* 8 2
4930 Set to 1 to do multiline matching within a string, 0 to tell
4931 .I perl
4932 that it can assume that strings contain a single line, for the purpose
4933 of optimizing pattern matches.
4934 Pattern matches on strings containing multiple newlines can produce confusing
4935 results when $* is 0.
4936 Default is 0.
4937 (Mnemonic: * matches multiple things.)
4938 Note that this variable only influences the interpretation of ^ and $.
4939 A literal newline can be searched for even when $* == 0.
4940 .Ip $0 8
4941 Contains the name of the file containing the
4942 .I perl
4943 script being executed.
4944 Assigning to $0 modifies the argument area that the ps(1) program sees.
4945 (Mnemonic: same as sh and ksh.)
4946 .Ip $<digit> 8
4947 Contains the subpattern from the corresponding set of parentheses in the last
4948 pattern matched, not counting patterns matched in nested blocks that have
4949 been exited already.
4950 (Mnemonic: like \edigit.)
4951 .Ip $[ 8 2
4952 The index of the first element in an array, and of the first character in
4953 a substring.
4954 Default is 0, but you could set it to 1 to make
4955 .I perl
4956 behave more like
4957 .I awk
4958 (or Fortran)
4959 when subscripting and when evaluating the index() and substr() functions.
4960 (Mnemonic: [ begins subscripts.)
4961 .Ip $] 8 2
4962 The string printed out when you say \*(L"perl -v\*(R".
4963 It can be used to determine at the beginning of a script whether the perl
4964 interpreter executing the script is in the right range of versions.
4965 If used in a numeric context, returns the version + patchlevel / 1000.
4966 Example:
4967 .nf
4968
4969 .ne 8
4970         # see if getc is available
4971         ($version,$patchlevel) =
4972                  $] =~ /(\ed+\e.\ed+).*\enPatch level: (\ed+)/;
4973         print STDERR "(No filename completion available.)\en"
4974                  if $version * 1000 + $patchlevel < 2016;
4975
4976 or, used numerically,
4977
4978         warn "No checksumming!\en" if $] < 3.019;
4979
4980 .fi
4981 (Mnemonic: Is this version of perl in the right bracket?)
4982 .Ip $; 8 2
4983 The subscript separator for multi-dimensional array emulation.
4984 If you refer to an associative array element as
4985 .nf
4986         $foo{$a,$b,$c}
4987
4988 it really means
4989
4990         $foo{join($;, $a, $b, $c)}
4991
4992 But don't put
4993
4994         @foo{$a,$b,$c}          # a slice\*(--note the @
4995
4996 which means
4997
4998         ($foo{$a},$foo{$b},$foo{$c})
4999
5000 .fi
5001 Default is "\e034", the same as SUBSEP in
5002 .IR awk .
5003 Note that if your keys contain binary data there might not be any safe
5004 value for $;.
5005 (Mnemonic: comma (the syntactic subscript separator) is a semi-semicolon.
5006 Yeah, I know, it's pretty lame, but $, is already taken for something more
5007 important.)
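.Sp
A small sketch of the emulation at work, assuming $; still has its default
value and that none of the subscripts contains that character.
The %cell array and the $sep copy are names invented for this example.
.nf

```perl
%cell = ();
$cell{1,2} = 'x';                       # stored under join($;, 1, 2)
$cell{3,4} = 'y';
$same = $cell{join($;, 1, 2)};          # fetches the same element
$sep = $;;                              # copy, so the pattern below is unambiguous
foreach $key (sort keys(%cell)) {
    ($row, $col) = split(/$sep/, $key); # recover the two subscripts
    print "row $row, col $col: $cell{$key}\n";
}
```

.fi
Splitting the key on the separator is the only way to get the individual
subscripts back, which is why binary keys are a problem.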
5008 .Ip $! 8 2
5009 If used in a numeric context, yields the current value of errno, with all the
5010 usual caveats.
5011 (This means that you shouldn't depend on the value of $! to be anything
5012 in particular unless you've gotten a specific error return indicating a
5013 system error.)
5014 If used in a string context, yields the corresponding system error string.
5015 You can assign to $! in order to set errno
5016 if, for instance, you want $! to return the string for error n, or you want
5017 to set the exit value for the die operator.
5018 (Mnemonic: What just went bang?)
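.Sp
To illustrate the two contexts, here is a minimal sketch that looks at $!
only after an open has actually failed.
The filename is hypothetical and is assumed not to exist.
.nf

```perl
$file = '/nonexistent/hypothetical/file';   # assumed not to exist
if (open(FH, $file)) {
    close(FH);
} else {
    $errno = $! + 0;     # numeric context: the errno value
    $text  = "$!";       # string context: the system error string
    print "Can't open $file: $text (errno $errno)\n";
}
```

.fi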
5019 .Ip $@ 8 2
5020 The perl syntax error message from the last eval command.
5021 If null, the last eval parsed and executed correctly (although the operations
5022 you invoked may have failed in the normal fashion).
5023 (Mnemonic: Where was the syntax error \*(L"at\*(R"?)
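.Sp
The following sketch shows both outcomes.
The first string is deliberately malformed, so $@ comes back non-null;
the second compiles and runs, so $@ is null afterwards.
The variable names are illustrative only.
.nf

```perl
eval 'print "unbalanced" if (1';     # deliberately malformed code
$err = $@;                           # non-null: the eval did not compile
print "first eval failed: $err" if $err;

eval '$sum = 1 + 2;';                # well-formed code
print "second eval ok, sum is $sum\n" unless $@;
```

.fi
Copy $@ right away if you need it later, since the next eval resets it.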
5024 .Ip $< 8 2
5025 The real uid of this process.
5026 (Mnemonic: it's the uid you came FROM, if you're running setuid.)
5027 .Ip $> 8 2
5028 The effective uid of this process.
5029 Example:
5030 .nf
5031
5032 .ne 2
5033         $< = $>;        # set real uid to the effective uid
5034         ($<,$>) = ($>,$<);      # swap real and effective uid
5035
5036 .fi
5037 (Mnemonic: it's the uid you went TO, if you're running setuid.)
5038 Note: $< and $> can only be swapped on machines supporting setreuid().
5039 .Ip $( 8 2
5040 The real gid of this process.
5041 If you are on a machine that supports membership in multiple groups
5042 simultaneously, gives a space separated list of groups you are in.
5043 The first number is the one returned by getgid(), and the subsequent ones
5044 by getgroups(), one of which may be the same as the first number.
5045 (Mnemonic: parentheses are used to GROUP things.
5046 The real gid is the group you LEFT, if you're running setgid.)
5047 .Ip $) 8 2
5048 The effective gid of this process.
5049 If you are on a machine that supports membership in multiple groups
5050 simultaneously, gives a space separated list of groups you are in.
5051 The first number is the one returned by getegid(), and the subsequent ones
5052 by getgroups(), one of which may be the same as the first number.
5053 (Mnemonic: parentheses are used to GROUP things.
5054 The effective gid is the group that's RIGHT for you, if you're running setgid.)
5055 .Sp
5056 Note: $<, $>, $( and $) can only be set on machines that support the
5057 corresponding set[re][ug]id() routine.
5058 $( and $) can only be swapped on machines supporting setregid().
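.Sp
Since $( may hold a whole list of numbers, a script usually wants to split it.
A minimal sketch, assuming a system on which perl can report group ids;
the variable names are invented for the example.
.nf

```perl
$gids = $(;                        # one number, or a space separated list
@groups = split(' ', $gids);
$primary = shift(@groups);         # the value from getgid()
print "real gid $primary; other groups: @groups\n";
```

.fi
The same treatment works for $) if you want the effective gid instead.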
5059 .Ip $: 8 2
5060 The current set of characters after which a string may be broken to
5061 fill continuation fields (starting with ^) in a format.
5062 Default is "\ \en-", to break on whitespace or hyphens.
5063 (Mnemonic: a \*(L"colon\*(R" in poetry is a part of a line.)
5064 .Ip $^D 8 2
5065 The current value of the debugging flags.
5066 (Mnemonic: value of
5067 .B \-D
5068 switch.)
5069 .Ip $^F 8 2
5070 The maximum system file descriptor, ordinarily 2.  System file descriptors
5071 are passed to subprocesses, while higher file descriptors are not.
5072 During an open, system file descriptors are preserved even if the open
5073 fails.  Ordinary file descriptors are closed before the open is attempted.
5074 .Ip $^I 8 2
5075 The current value of the inplace-edit extension.
5076 Use undef to disable inplace editing.
5077 (Mnemonic: value of
5078 .B \-i
5079 switch.)
5080 .Ip $^P 8 2
5081 The internal flag that the debugger clears so that it doesn't
debug itself.  You could conceivably disable debugging yourself
5083 by clearing it.
5084 .Ip $^T 8 2
5085 The time at which the script began running, in seconds since the epoch.
5086 The values returned by the
.BR \-M ,
5088 .B \-A
5089 and
5090 .B \-C
5091 filetests are based on this value.
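.Sp
For instance, a script can work out how long it has been running, or how
old its own source file was at startup.
This is only a sketch; it assumes $0 names a file that still exists, and
skips the second message otherwise.
.nf

```perl
$elapsed = time - $^T;             # seconds since the script started
$age = -M $0;                      # age of the script file in days, measured from $^T
print "running for $elapsed second(s)\n";
print "script last modified $age day(s) before startup\n" if defined($age);
```

.fi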
5092 .Ip $^W 8 2
5093 The current value of the warning switch.
5094 (Mnemonic: related to the
5095 .B \-w
5096 switch.)
5097 .Ip $^X 8 2
5098 The name that Perl itself was executed as, from argv[0].
5099 .Ip $ARGV 8 3
Contains the name of the current file when reading from <>.
5101 .Ip @ARGV 8 3
5102 The array ARGV contains the command line arguments intended for the script.
Note that $#ARGV is generally the number of arguments minus one, since
5104 $ARGV[0] is the first argument, NOT the command name.
5105 See $0 for the command name.
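.Sp
A short sketch of the arithmetic, assuming $[ has been left at 0:
.nf

```perl
$count = $#ARGV + 1;               # $#ARGV is the last index, so add one
print "$0: $count argument(s)\n";
foreach $i (0 .. $#ARGV) {
    print "  ARGV[$i] = $ARGV[$i]\n";
}
```

.fi
With no arguments at all, $#ARGV is \-1 and the loop body never runs.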
5106 .Ip @INC 8 3
5107 The array INC contains the list of places to look for
5108 .I perl
5109 scripts to be
5110 evaluated by the \*(L"do EXPR\*(R" command or the \*(L"require\*(R" command.
5111 It initially consists of the arguments to any
5112 .B \-I
5113 command line switches, followed
5114 by the default
5115 .I perl
5116 library, probably \*(L"/usr/local/lib/perl\*(R",
5117 followed by \*(L".\*(R", to represent the current directory.
5118 .Ip %INC 8 3
5119 The associative array INC contains entries for each filename that has
5120 been included via \*(L"do\*(R" or \*(L"require\*(R".
5121 The key is the filename you specified, and the value is the location of
5122 the file actually found.
5123 The \*(L"require\*(R" command uses this array to determine whether
5124 a given file has already been included.
5125 .Ip $ENV{expr} 8 2
5126 The associative array ENV contains your current environment.
5127 Setting a value in ENV changes the environment for child processes.
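.Sp
The effect on child processes can be seen with a sketch like this one,
which assumes a Unix-style sh is available to run the backticks.
GREETING is a hypothetical variable name.
.nf

```perl
$ENV{'GREETING'} = 'hello';        # hypothetical variable name
$seen = `echo \$GREETING`;         # the child shell expands it, not perl
print "child saw: $seen";
```

.fi
The backslash keeps perl from interpolating $GREETING itself, so the value
really does come from the environment of the child.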
5128 .Ip $SIG{expr} 8 2
5129 The associative array SIG is used to set signal handlers for various signals.
5130 Example:
5131 .nf
5132
5133 .ne 12
5134         sub handler {   # 1st argument is signal name
5135                 local($sig) = @_;
5136                 print "Caught a SIG$sig\-\|\-shutting down\en";
5137                 close(LOG);
5138                 exit(0);
5139         }
5140
5141         $SIG{\'INT\'} = \'handler\';
5142         $SIG{\'QUIT\'} = \'handler\';
5143         .\|.\|.
5144         $SIG{\'INT\'} = \'DEFAULT\';    # restore default action
5145         $SIG{\'QUIT\'} = \'IGNORE\';    # ignore SIGQUIT
5146
5147 .fi
5148 The SIG array only contains values for the signals actually set within
5149 the perl script.
5150 .Sh "Packages"
5151 Perl provides a mechanism for alternate namespaces to protect packages from
stomping on each other's variables.
5153 By default, a perl script starts compiling into the package known as \*(L"main\*(R".
5154 By use of the
5155 .I package
5156 declaration, you can switch namespaces.
5157 The scope of the package declaration is from the declaration itself to the end
5158 of the enclosing block (the same scope as the local() operator).
5159 Typically it would be the first declaration in a file to be included by
5160 the \*(L"require\*(R" operator.
5161 You can switch into a package in more than one place; it merely influences
5162 which symbol table is used by the compiler for the rest of that block.
5163 You can refer to variables and filehandles in other packages by prefixing
5164 the identifier with the package name and a single quote.
If the package name is null, the \*(L"main\*(R" package is assumed.
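.PP
As an illustration only, the sketch below keeps a running total in a
package of its own while the main package uses the same variable name for
something else.
The package name counter and the subroutine name bump are made up, and the
single-quote qualifier is the one described here; newer perl releases
prefer a different spelling and may warn about this one.
.nf

```perl
package counter;
$total = 0;                        # this is $counter'total

sub main'bump {                    # qualified, so callable from main as &bump
    $total += $_[0];               # compiled in counter: updates $counter'total
}

package main;
$total = 100;                      # a separate variable, $main'total
&bump(5);
&bump(7);
print "counter: ", $counter'total, "; main: ", $total, "\n";
```

.fi
The two totals never interfere, because each unqualified $total is looked
up in the package that was current when that line was compiled.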
5166 .PP
Only identifiers starting with letters are stored in the package's symbol
5168 table.
5169 All other symbols are kept in package \*(L"main\*(R".
5170 In addition, the identifiers STDIN, STDOUT, STDERR, ARGV, ARGVOUT, ENV, INC
5171 and SIG are forced to be in package \*(L"main\*(R", even when used for
5172 other purposes than their built-in one.
5173 Note also that, if you have a package called \*(L"m\*(R", \*(L"s\*(R"
or \*(L"y\*(R", then you can't use the qualified form of an identifier since it
5175 will be interpreted instead as a pattern match, a substitution
5176 or a translation.
5177 .PP
Eval'ed strings are compiled in the package in which the eval was compiled.
5180 (Assignments to $SIG{}, however, assume the signal handler specified is in the
5181 main package.
5182 Qualify the signal handler name if you wish to have a signal handler in
5183 a package.)
5184 For an example, examine perldb.pl in the perl library.
5185 It initially switches to the DB package so that the debugger doesn't interfere
5186 with variables in the script you are trying to debug.
5187 At various points, however, it temporarily switches back to the main package
5188 to evaluate various expressions in the context of the main package.
5189 .PP
5190 The symbol table for a package happens to be stored in the associative array
5191 of that name prepended with an underscore.
5192 The value in each entry of the associative array is
5193 what you are referring to when you use the *name notation.
5194 In fact, the following have the same effect (in package main, anyway),
5195 though the first is more
5196 efficient because it does the symbol table lookups at compile time:
5197 .nf
5198
5199 .ne 2
5200         local(*foo) = *bar;
5201         local($_main{'foo'}) = $_main{'bar'};
5202
5203 .fi
5204 You can use this to print out all the variables in a package, for instance.
5205 Here is dumpvar.pl from the perl library:
5206 .nf
5207 .ne 11
5208         package dumpvar;
5209
5210         sub main'dumpvar {
5211         \&    ($package) = @_;
5212         \&    local(*stab) = eval("*_$package");
5213         \&    while (($key,$val) = each(%stab)) {
5214         \&        {
5215         \&            local(*entry) = $val;
5216         \&            if (defined $entry) {
5217         \&                print "\e$$key = '$entry'\en";
5218         \&            }
5219 .ne 7
5220         \&            if (defined @entry) {
5221         \&                print "\e@$key = (\en";
5222         \&                foreach $num ($[ .. $#entry) {
5223         \&                    print "  $num\et'",$entry[$num],"'\en";
5224         \&                }
5225         \&                print ")\en";
5226         \&            }
5227 .ne 10
5228         \&            if ($key ne "_$package" && defined %entry) {
5229         \&                print "\e%$key = (\en";
5230         \&                foreach $key (sort keys(%entry)) {
5231         \&                    print "  $key\et'",$entry{$key},"'\en";
5232         \&                }
5233         \&                print ")\en";
5234         \&            }
5235         \&        }
5236         \&    }
5237         }
5238
5239 .fi
5240 Note that, even though the subroutine is compiled in package dumpvar, the
5241 name of the subroutine is qualified so that its name is inserted into package
5242 \*(L"main\*(R".
5243 .Sh "Style"
Each programmer will, of course, have his or her own preferences in regard
5245 to formatting, but there are some general guidelines that will make your
5246 programs easier to read.
5247 .Ip 1. 4 4
5248 Just because you CAN do something a particular way doesn't mean that
5249 you SHOULD do it that way.
5250 .I Perl
5251 is designed to give you several ways to do anything, so consider picking
5252 the most readable one.
5253 For instance
5254
5255         open(FOO,$foo) || die "Can't open $foo: $!";
5256
5257 is better than
5258
5259         die "Can't open $foo: $!" unless open(FOO,$foo);
5260
5261 because the second way hides the main point of the statement in a
5262 modifier.
5263 On the other hand
5264
5265         print "Starting analysis\en" if $verbose;
5266
5267 is better than
5268
5269         $verbose && print "Starting analysis\en";
5270
5271 since the main point isn't whether the user typed -v or not.
5272 .Sp
5273 Similarly, just because an operator lets you assume default arguments
5274 doesn't mean that you have to make use of the defaults.
5275 The defaults are there for lazy systems programmers writing one-shot
5276 programs.
5277 If you want your program to be readable, consider supplying the argument.
5278 .Sp
5279 Along the same lines, just because you
5280 .I can
5281 omit parentheses in many places doesn't mean that you ought to:
5282 .nf
5283
5284         return print reverse sort num values array;
5285         return print(reverse(sort num (values(%array))));
5286
5287 .fi
5288 When in doubt, parenthesize.
5289 At the very least it will let some poor schmuck bounce on the % key in vi.
5290 .Sp
5291 Even if you aren't in doubt, consider the mental welfare of the person who
5292 has to maintain the code after you, and who will probably put parens in
5293 the wrong place.
5294 .Ip 2. 4 4
5295 Don't go through silly contortions to exit a loop at the top or the
5296 bottom, when
5297 .I perl
5298 provides the "last" operator so you can exit in the middle.
5299 Just outdent it a little to make it more visible:
5300 .nf
5301
5302 .ne 7
5303     line:
5304         for (;;) {
5305             statements;
5306         last line if $foo;
5307             next line if /^#/;
5308             statements;
5309         }
5310
5311 .fi
5312 .Ip 3. 4 4
5313 Don't be afraid to use loop labels\*(--they're there to enhance readability as
5314 well as to allow multi-level loop breaks.
See the previous example.
5316 .Ip 4. 4 4
5317 For portability, when using features that may not be implemented on every
5318 machine, test the construct in an eval to see if it fails.
If you know what version or patchlevel a particular feature was implemented in,
5320 you can test $] to see if it will be there.
5321 .Ip 5. 4 4
5322 Choose mnemonic identifiers.
5323 .Ip 6. 4 4
5324 Be consistent.
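.PP
The portability guideline (number 4) might be followed along these lines.
This is a sketch rather than a recipe: symlink is used merely as an example
of a function that is missing on some machines, and the 4.0 threshold is a
hypothetical cutoff, not a statement about when anything was added.
.nf

```perl
# symlink() is not implemented on every system, so probe it inside an eval.
$have_symlink = eval('symlink("", ""); 1') ? 1 : 0;

# Hypothetical threshold: pretend the feature appeared in version 4.0.
$new_enough = ($] >= 4.0) ? 1 : 0;

print "symlink available: $have_symlink; version ok: $new_enough\n";
```

.fi
The eval only has to survive; the symlink call itself is expected to fail
harmlessly on machines where it exists.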
5325 .Sh "Debugging"
5326 If you invoke
5327 .I perl
5328 with a
5329 .B \-d
5330 switch, your script will be run under a debugging monitor.
5331 It will halt before the first executable statement and ask you for a
5332 command, such as:
5333 .Ip "h" 12 4
5334 Prints out a help message.
5335 .Ip "T" 12 4
5336 Stack trace.
5337 .Ip "s" 12 4
5338 Single step.
5339 Executes until it reaches the beginning of another statement.
5340 .Ip "n" 12 4
5341 Next.
5342 Executes over subroutine calls, until it reaches the beginning of the
5343 next statement.
5344 .Ip "f" 12 4
5345 Finish.
5346 Executes statements until it has finished the current subroutine.
5347 .Ip "c" 12 4
5348 Continue.
5349 Executes until the next breakpoint is reached.
5350 .Ip "c line" 12 4
5351 Continue to the specified line.
5352 Inserts a one-time-only breakpoint at the specified line.
5353 .Ip "<CR>" 12 4
5354 Repeat last n or s.
5355 .Ip "l min+incr" 12 4
5356 List incr+1 lines starting at min.
5357 If min is omitted, starts where last listing left off.
5358 If incr is omitted, previous value of incr is used.
5359 .Ip "l min-max" 12 4
5360 List lines in the indicated range.
5361 .Ip "l line" 12 4
5362 List just the indicated line.
5363 .Ip "l" 12 4
5364 List next window.
5365 .Ip "-" 12 4
5366 List previous window.
5367 .Ip "w line" 12 4
5368 List window around line.
5369 .Ip "l subname" 12 4
5370 List subroutine.
5371 If it's a long subroutine it just lists the beginning.
5372 Use \*(L"l\*(R" to list more.
5373 .Ip "/pattern/" 12 4
5374 Regular expression search forward for pattern; the final / is optional.
5375 .Ip "?pattern?" 12 4
5376 Regular expression search backward for pattern; the final ? is optional.
5377 .Ip "L" 12 4
5378 List lines that have breakpoints or actions.
5379 .Ip "S" 12 4
5380 Lists the names of all subroutines.
5381 .Ip "t" 12 4
5382 Toggle trace mode on or off.
5383 .Ip "b line condition" 12 4
5384 Set a breakpoint.
5385 If line is omitted, sets a breakpoint on the
5386 line that is about to be executed.
5387 If a condition is specified, it is evaluated each time the statement is
5388 reached and a breakpoint is taken only if the condition is true.
5389 Breakpoints may only be set on lines that begin an executable statement.
5390 .Ip "b subname condition" 12 4
5391 Set breakpoint at first executable line of subroutine.
5392 .Ip "d line" 12 4
5393 Delete breakpoint.
5394 If line is omitted, deletes the breakpoint on the
5395 line that is about to be executed.
5396 .Ip "D" 12 4
5397 Delete all breakpoints.
5398 .Ip "a line command" 12 4
5399 Set an action for line.
5400 A multi-line command may be entered by backslashing the newlines.
5401 .Ip "A" 12 4
5402 Delete all line actions.
5403 .Ip "< command" 12 4
5404 Set an action to happen before every debugger prompt.
5405 A multi-line command may be entered by backslashing the newlines.
5406 .Ip "> command" 12 4
5407 Set an action to happen after the prompt when you've just given a command
5408 to return to executing the script.
5409 A multi-line command may be entered by backslashing the newlines.
5410 .Ip "V package" 12 4
5411 List all variables in package.
5412 Default is main package.
5413 .Ip "! number" 12 4
5414 Redo a debugging command.
5415 If number is omitted, redoes the previous command.
5416 .Ip "! -number" 12 4
5417 Redo the command that was that many commands ago.
5418 .Ip "H -number" 12 4
5419 Display last n commands.
5420 Only commands longer than one character are listed.
5421 If number is omitted, lists them all.
5422 .Ip "q or ^D" 12 4
5423 Quit.
5424 .Ip "command" 12 4
5425 Execute command as a perl statement.
5426 A missing semicolon will be supplied.
5427 .Ip "p expr" 12 4
5428 Same as \*(L"print DB'OUT expr\*(R".
5429 The DB'OUT filehandle is opened to /dev/tty, regardless of where STDOUT
5430 may be redirected to.
5431 .PP
5432 If you want to modify the debugger, copy perldb.pl from the perl library
5433 to your current directory and modify it as necessary.
5434 (You'll also have to put -I. on your command line.)
5435 You can do some customization by setting up a .perldb file which contains
5436 initialization code.
5437 For instance, you could make aliases like these:
5438 .nf
5439
5440     $DB'alias{'len'} = 's/^len(.*)/p length($1)/';
5441     $DB'alias{'stop'} = 's/^stop (at|in)/b/';
5442     $DB'alias{'.'} =
5443       's/^\e./p "\e$DB\e'sub(\e$DB\e'line):\et",\e$DB\e'line[\e$DB\e'line]/';
5444
5445 .fi
5446 .Sh "Setuid Scripts"
5447 .I Perl
5448 is designed to make it easy to write secure setuid and setgid scripts.
5449 Unlike shells, which are based on multiple substitution passes on each line
5450 of the script,
5451 .I perl
5452 uses a more conventional evaluation scheme with fewer hidden \*(L"gotchas\*(R".
5453 Additionally, since the language has more built-in functionality, it
5454 has to rely less upon external (and possibly untrustworthy) programs to
5455 accomplish its purposes.
5456 .PP
5457 In an unpatched 4.2 or 4.3bsd kernel, setuid scripts are intrinsically
5458 insecure, but this kernel feature can be disabled.
5459 If it is,
5460 .I perl
5461 can emulate the setuid and setgid mechanism when it notices the otherwise
5462 useless setuid/gid bits on perl scripts.
5463 If the kernel feature isn't disabled,
5464 .I perl
5465 will complain loudly that your setuid script is insecure.
5466 You'll need to either disable the kernel setuid script feature, or put
5467 a C wrapper around the script.
5468 .PP
5469 When perl is executing a setuid script, it takes special precautions to
5470 prevent you from falling into any obvious traps.
5471 (In some ways, a perl script is more secure than the corresponding
5472 C program.)
5473 Any command line argument, environment variable, or input is marked as
5474 \*(L"tainted\*(R", and may not be used, directly or indirectly, in any
5475 command that invokes a subshell, or in any command that modifies files,
5476 directories or processes.
5477 Any variable that is set within an expression that has previously referenced
5478 a tainted value also becomes tainted (even if it is logically impossible
5479 for the tainted value to influence the variable).
5480 For example:
5481 .nf
5482
5483 .ne 5
5484         $foo = shift;                   # $foo is tainted
5485         $bar = $foo,\'bar\';            # $bar is also tainted
5486         $xxx = <>;                      # Tainted
5487         $path = $ENV{\'PATH\'}; # Tainted, but see below
5488         $abc = \'abc\';                 # Not tainted
5489
5490 .ne 4
5491         system "echo $foo";             # Insecure
5492         system "/bin/echo", $foo;       # Secure (doesn't use sh)
5493         system "echo $bar";             # Insecure
5494         system "echo $abc";             # Insecure until PATH set
5495
5496 .ne 5
5497         $ENV{\'PATH\'} = \'/bin:/usr/bin\';
5498         $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\';
5499
5500         $path = $ENV{\'PATH\'}; # Not tainted
5501         system "echo $abc";             # Is secure now!
5502
5503 .ne 5
5504         open(FOO,"$foo");               # OK
5505         open(FOO,">$foo");              # Not OK
5506
5507         open(FOO,"echo $foo|"); # Not OK, but...
5508         open(FOO,"-|") || exec \'echo\', $foo;  # OK
5509
5510         $zzz = `echo $foo`;             # Insecure, zzz tainted
5511
5512         unlink $abc,$foo;               # Insecure
5513         umask $foo;                     # Insecure
5514
5515 .ne 3
5516         exec "echo $foo";               # Insecure
5517         exec "echo", $foo;              # Secure (doesn't use sh)
5518         exec "sh", \'-c\', $foo;        # Considered secure, alas
5519
5520 .fi
5521 The taintedness is associated with each scalar value, so some elements
5522 of an array can be tainted, and others not.
5523 .PP
5524 If you try to do something insecure, you will get a fatal error saying
5525 something like \*(L"Insecure dependency\*(R" or \*(L"Insecure PATH\*(R".
5526 Note that you can still write an insecure system call or exec,
5527 but only by explicitly doing something like the last example above.
5528 You can also bypass the tainting mechanism by referencing
5529 subpatterns\*(--\c
5530 .I perl
presumes that if you reference a substring using $1, $2, etc., you knew
5532 what you were doing when you wrote the pattern:
5533 .nf
5534
5535         $ARGV[0] =~ /^\-P(\ew+)$/;
5536         $printer = $1;          # Not tainted
5537
5538 .fi
5539 This is fairly secure since \ew+ doesn't match shell metacharacters.
5540 Use of .+ would have been insecure, but
5541 .I perl
5542 doesn't check for that, so you must be careful with your patterns.
5543 This is the ONLY mechanism for untainting user supplied filenames if you
5544 want to do file operations on them (unless you make $> equal to $<).
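.PP
One way to be careful is to refuse to go on when the pattern fails, rather
than picking up a stale $1 from some earlier match.
The helper below is a hypothetical sketch of that habit, with the option
strings passed in directly so that it can be tried outside a setuid script.
.nf

```perl
sub printer_name {                 # hypothetical helper
    local($arg) = @_;
    return '' unless $arg =~ /^-P(\w+)$/;   # reject anything unexpected
    return $1;                     # only word characters can get here
}

$printer = &printer_name('-Plaser1');
$bogus = &printer_name('-Plp;rm -rf /');
print "printer <$printer>, bogus <$bogus>\n";
```

.fi
An empty return value tells the caller that the argument was not acceptable.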
5545 .PP
5546 It's also possible to get into trouble with other operations that don't care
5547 whether they use tainted values.
5548 Make judicious use of the file tests in dealing with any user-supplied
5549 filenames.
5550 When possible, do opens and such after setting $> = $<.
5551 .I Perl
5552 doesn't prevent you from opening tainted filenames for reading, so be
5553 careful what you print out.
5554 The tainting mechanism is intended to prevent stupid mistakes, not to remove
5555 the need for thought.
5556 .SH ENVIRONMENT
5557 .I Perl
5558 uses PATH in executing subprocesses, and in finding the script if \-S
5559 is used.
5560 HOME or LOGDIR are used if chdir has no argument.
5561 .PP
5562 Apart from these,
5563 .I perl
5564 uses no environment variables, except to make them available
5565 to the script being executed, and to child processes.
5566 However, scripts running setuid would do well to execute the following lines
5567 before doing anything else, just to keep people honest:
5568 .nf
5569
5570 .ne 3
5571     $ENV{\'PATH\'} = \'/bin:/usr/bin\';    # or whatever you need
5572     $ENV{\'SHELL\'} = \'/bin/sh\' if $ENV{\'SHELL\'} ne \'\';
5573     $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\';
5574
5575 .fi
5576 .SH AUTHOR
5577 Larry Wall <lwall@jpl-devvax.Jpl.Nasa.Gov>
5578 .br
5579 MS-DOS port by Diomidis Spinellis <dds@cc.ic.ac.uk>
5580 .SH FILES
5581 /tmp/perl\-eXXXXXX      temporary file for
5582 .B \-e
5583 commands.
5584 .SH SEE ALSO
5585 a2p     awk to perl translator
5586 .br
5587 s2p     sed to perl translator
5588 .SH DIAGNOSTICS
5589 Compilation errors will tell you the line number of the error, with an
5590 indication of the next token or token type that was to be examined.
5591 (In the case of a script passed to
5592 .I perl
5593 via
5594 .B \-e
5595 switches, each
5596 .B \-e
5597 is counted as one line.)
5598 .PP
5599 Setuid scripts have additional constraints that can produce error messages
5600 such as \*(L"Insecure dependency\*(R".
5601 See the section on setuid scripts.
5602 .SH TRAPS
5603 Accustomed
5604 .IR awk
5605 users should take special note of the following:
5606 .Ip * 4 2
5607 Semicolons are required after all simple statements in
5608 .IR perl .
5609 Newline
5610 is not a statement delimiter.
5611 .Ip * 4 2
5612 Curly brackets are required on ifs and whiles.
5613 .Ip * 4 2
5614 Variables begin with $ or @ in
5615 .IR perl .
5616 .Ip * 4 2
5617 Arrays index from 0 unless you set $[.
5618 Likewise string positions in substr() and index().
5619 .Ip * 4 2
5620 You have to decide whether your array has numeric or string indices.
5621 .Ip * 4 2
5622 Associative array values do not spring into existence upon mere reference.
5623 .Ip * 4 2
5624 You have to decide whether you want to use string or numeric comparisons.
5625 .Ip * 4 2
5626 Reading an input line does not split it for you.  You get to split it yourself
into an array.
5628 And the
5629 .I split
5630 operator has different arguments.
5631 .Ip * 4 2
5632 The current input line is normally in $_, not $0.
5633 It generally does not have the newline stripped.
5634 ($0 is the name of the program executed.)
5635 .Ip * 4 2
5636 $<digit> does not refer to fields\*(--it refers to substrings matched by the last
5637 match pattern.
5638 .Ip * 4 2
5639 The
5640 .I print
5641 statement does not add field and record separators unless you set
5642 $, and $\e.
5643 .Ip * 4 2
5644 You must open your files before you print to them.
5645 .Ip * 4 2
5646 The range operator is \*(L".\|.\*(R", not comma.
5647 (The comma operator works as in C.)
5648 .Ip * 4 2
5649 The match operator is \*(L"=~\*(R", not \*(L"~\*(R".
5650 (\*(L"~\*(R" is the one's complement operator, as in C.)
5651 .Ip * 4 2
5652 The exponentiation operator is \*(L"**\*(R", not \*(L"^\*(R".
5653 (\*(L"^\*(R" is the XOR operator, as in C.)
5654 .Ip * 4 2
5655 The concatenation operator is \*(L".\*(R", not the null string.
5656 (Using the null string would render \*(L"/pat/ /pat/\*(R" unparsable,
5657 since the third slash would be interpreted as a division operator\*(--the
5658 tokener is in fact slightly context sensitive for operators like /, ?, and <.
5659 And in fact, . itself can be the beginning of a number.)
5660 .Ip * 4 2
5661 .IR Next ,
5662 .I exit
5663 and
5664 .I continue
5665 work differently.
5666 .Ip * 4 2
The following variables work differently:
5668 .nf
5669
5670           Awk   \h'|2.5i'Perl
5671           ARGC  \h'|2.5i'$#ARGV
5672           ARGV[0]       \h'|2.5i'$0
5673           FILENAME\h'|2.5i'$ARGV
5674           FNR   \h'|2.5i'$. \- something
5675           FS    \h'|2.5i'(whatever you like)
5676           NF    \h'|2.5i'$#Fld, or some such
5677           NR    \h'|2.5i'$.
5678           OFMT  \h'|2.5i'$#
5679           OFS   \h'|2.5i'$,
5680           ORS   \h'|2.5i'$\e
5681           RLENGTH       \h'|2.5i'length($&)
5682           RS    \h'|2.5i'$/
5683           RSTART        \h'|2.5i'length($\`)
5684           SUBSEP        \h'|2.5i'$;
5685
5686 .fi
5687 .Ip * 4 2
5688 When in doubt, run the
5689 .I awk
5690 construct through a2p and see what it gives you.
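.PP
Several of the points above can be seen together in a short sketch.
The sample line is invented, $[ is assumed to be 0, and @Fld is simply the
kind of name an awk user might pick.
.nf

```perl
$_ = "alpha beta gamma\n";         # a line as <> would return it
chop;                              # the newline is not stripped for you
@Fld = split(' ', $_);             # nor is the line split into fields
$nf = $#Fld + 1;                   # awk's NF
{
    local($,) = '-';               # awk's OFS
    local($\) = "\n";              # awk's ORS
    print $Fld[0], $Fld[2];        # prints alpha-gamma
}
```

.fi
The separators are set with local() inside a block so that later print
statements are not affected.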
5691 .PP
5692 Cerebral C programmers should take note of the following:
5693 .Ip * 4 2
5694 Curly brackets are required on ifs and whiles.
5695 .Ip * 4 2
You should use \*(L"elsif\*(R" rather than \*(L"else if\*(R".
5697 .Ip * 4 2
5698 .I Break
5699 and
5700 .I continue
5701 become
5702 .I last
5703 and
5704 .IR next ,
5705 respectively.
5706 .Ip * 4 2
5707 There's no switch statement.
5708 .Ip * 4 2
5709 Variables begin with $ or @ in
5710 .IR perl .
5711 .Ip * 4 2
5712 Printf does not implement *.
5713 .Ip * 4 2
5714 Comments begin with #, not /*.
5715 .Ip * 4 2
5716 You can't take the address of anything.
5717 .Ip * 4 2
5718 ARGV must be capitalized.
5719 .Ip * 4 2
5720 The \*(L"system\*(R" calls link, unlink, rename, etc. return nonzero for success, not 0.
5721 .Ip * 4 2
5722 Signal handlers deal with signal names, not numbers.
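.PP
A brief sketch of how a few of these look in practice; the variable names
are arbitrary.
.nf

```perl
$sum = 0;
foreach $i (1 .. 10) {
    next if $i % 2;                # C's continue: skip odd numbers
    last if $i > 6;                # C's break: stop after 6
    $sum += $i;                    # adds 2, 4 and 6
}
if ($sum < 10) {                   # braces are required, even for one statement
    $size = 'small';
} elsif ($sum < 100) {             # elsif, not else if
    $size = 'medium';
} else {
    $size = 'large';
}
print "$sum is $size\n";           # prints 12 is medium
```

.fi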
5723 .PP
5724 Seasoned
5725 .I sed
5726 programmers should take note of the following:
5727 .Ip * 4 2
5728 Backreferences in substitutions use $ rather than \e.
5729 .Ip * 4 2
5730 The pattern matching metacharacters (, ), and | do not have backslashes in front.
5731 .Ip * 4 2
5732 The range operator is .\|. rather than comma.
5733 .PP
5734 Sharp shell programmers should take note of the following:
5735 .Ip * 4 2
5736 The backtick operator does variable interpretation without regard to the
5737 presence of single quotes in the command.
5738 .Ip * 4 2
5739 The backtick operator does no translation of the return value, unlike csh.
5740 .Ip * 4 2
5741 Shells (especially csh) do several levels of substitution on each command line.
5742 .I Perl
5743 does substitution only in certain constructs such as double quotes,
5744 backticks, angle brackets and search patterns.
5745 .Ip * 4 2
5746 Shells interpret scripts a little bit at a time.
5747 .I Perl
5748 compiles the whole program before executing it.
5749 .Ip * 4 2
5750 The arguments are available via @ARGV, not $1, $2, etc.
5751 .Ip * 4 2
5752 The environment is not automatically made available as variables.
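.PP
The first two points can be checked with a sketch such as this one, which
assumes a Unix-style sh is available to run the command.
.nf

```perl
$who = 'world';
$out = `echo 'hello ${who}'`;      # perl expands $who despite the single quotes
chop($out);                        # the trailing newline is returned untouched
print "got: $out\n";               # prints got: hello world
```

.fi
A shell would have left the variable alone inside single quotes; perl
never looks at the quotes, since they are just characters in the command.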
5753 .SH ERRATA\0AND\0ADDENDA
5754 The Perl book,
.IR Programming\0Perl ,
5756 has the following omissions and goofs.
5757 .PP
5758 On page 5, the examples which read
5759 .nf
5760
5761         eval "/usr/bin/perl
5762
5763 should read
5764
5765         eval "exec /usr/bin/perl
5766
5767 .fi
5768 .PP
5769 On page 195, the equivalent to the System V sum program only works for
5770 very small files.  To do larger files, use
5771 .nf
5772
5773         undef $/;
5774         $checksum = unpack("%32C*",<>) % 32767;
5775
5776 .fi
5777 .PP
5778 The
5779 .B \-0
5780 switch to set the initial value of $/ was added to Perl after the book
5781 went to press.
5782 .PP
5783 The
5784 .B \-l
5785 switch now does automatic line ending processing.
5786 .PP
5787 The qx// construct is now a synonym for backticks.
5788 .PP
5789 $0 may now be assigned to set the argument displayed by
.IR ps (1).
5791 .PP
5792 The new @###.## format was omitted accidentally from the description
5793 on formats.
5794 .PP
5795 It wasn't known at press time that s///ee caused multiple evaluations of
5796 the replacement expression.  This is to be construed as a feature.
5797 .PP
5798 (LIST) x $count now does array replication.
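.PP
A minimal sketch of the difference between the list form and the familiar
string form, using made-up values:
.nf

```perl
@row = (0, 1) x 3;                 # the list itself is repeated
$line = '-' x 10;                  # the older string form still works
print "@row\n";                    # prints 0 1 0 1 0 1
print "$line\n";                   # prints ----------
```

.fi
The parentheses around the left operand are what select array replication.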
5799 .PP
5800 There is now no limit on the number of parentheses in a regular expression.
5801 .PP
5802 In double-quote context, more escapes are supported: \ee, \ea, \ex1b, \ec[,
5803 \el, \eL, \eu, \eU, \eE.  The latter five control up/lower case translation.
5804 .PP
5805 The
5806 .B $/
5807 variable may now be set to a multi-character delimiter.
5808 .PP
5809 There is now a g modifier on ordinary pattern matching that causes it
5810 to iterate through a string finding multiple matches.
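.PP
For example, a loop along these lines picks every run of digits out of a
string; the sample string and the @nums array are invented for the sketch.
.nf

```perl
$_ = 'a1b22c333';
@nums = ();
while (/(\d+)/g) {                 # each iteration resumes after the last match
    push(@nums, $1);
}
print join(',', @nums), "\n";      # prints 1,22,333
```

.fi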
5811 .PP
5812 All of the $^X variables are new except for $^T.
5813 .SH BUGS
5814 .PP
5815 .I Perl
5816 is at the mercy of your machine's definitions of various operations
5817 such as type casting, atof() and sprintf().
5818 .PP
If your stdio requires a seek or eof between reads and writes on a particular
5820 stream, so does
5821 .IR perl .
5822 (This doesn't apply to sysread() and syswrite().)
5823 .PP
5824 While none of the built-in data types have any arbitrary size limits (apart
5825 from memory size), there are still a few arbitrary limits:
5826 a given identifier may not be longer than 255 characters;
5827 sprintf is limited on many machines to 128 characters per field (unless the format
5828 specifier is exactly %s);
5829 and no component of your PATH may be longer than 255 if you use \-S.
5830 .PP
5831 .I Perl
5832 actually stands for Pathologically Eclectic Rubbish Lister, but don't tell
5833 anyone I said that.
5834 .rn }` ''