1 ''' Beginning of part 3
2 ''' $Header: perl.man.3,v 3.0 89/10/18 15:21:46 lwall Locked $
4 ''' $Log: perl.man.3,v $
5 ''' Revision 3.0 89/10/18 15:21:46 lwall
.Ip "next LABEL" 8 8
.Ip "next" 8
The
.I next
command is like the
.I continue
statement in C; it starts the next iteration of the loop:
.nf

	line: while (<STDIN>) {
		next line if /\|^#/;	# discard comments
		.\|.\|.
	}

.fi
Note that if there were a
.I continue
block on the above, it would get executed even on discarded lines.
If the LABEL is omitted, the command refers to the innermost enclosing loop.
.Ip "oct(EXPR)" 8 4
.Ip "oct EXPR" 8
Returns the decimal value of EXPR interpreted as an octal string.
(If EXPR happens to start off with 0x, interprets it as a hex string instead.)
The following will handle decimal, octal and hex in the standard notation:
.nf

	$val = oct($val) if $val =~ /^0/;

.fi
If EXPR is omitted, uses $_.
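As a rough sketch of the idiom above, the following normalizes a few numeric strings.
The subroutine name normalize and the sample inputs are assumptions made for illustration; they are not part of perl.

```perl
# normalize() is a hypothetical helper, not a perl builtin.
sub normalize {
    local($val) = @_;
    $val = oct($val) if $val =~ /^0/;   # leading 0: octal, or hex if 0x
    $val + 0;                           # force a numeric result
}

@seen = (&normalize("42"), &normalize("017"), &normalize("0x1f"));
print join(" ", @seen), "\n";           # prints 42 15 31
```

Strings without a leading zero are left alone, so plain decimal input passes through unchanged.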
39 .Ip "open(FILEHANDLE,EXPR)" 8 8
40 .Ip "open(FILEHANDLE)" 8
41 .Ip "open FILEHANDLE" 8
Opens the file whose filename is given by EXPR, and associates it with
FILEHANDLE.
If FILEHANDLE is an expression, its value is used as the name of the
real filehandle wanted.
If EXPR is omitted, the scalar variable of the same name as the FILEHANDLE
contains the filename.
If the filename begins with \*(L"<\*(R" or nothing, the file is opened for
input.
If the filename begins with \*(L">\*(R", the file is opened for output.
If the filename begins with \*(L">>\*(R", the file is opened for appending.
(You can put a \'+\' in front of the \'>\' or \'<\' to indicate that you
want both read and write access to the file.)
If the filename begins with \*(L"|\*(R", the filename is interpreted
as a command to which output is to be piped, and if the filename ends
with a \*(L"|\*(R", the filename is interpreted as a command which pipes
input to us.
(You may not have a command that pipes both in and out.)
Opening \'\-\' opens
.I STDIN
and opening \'>\-\' opens
.IR STDOUT .
Open returns non-zero upon success, the undefined value otherwise.
If the open involved a pipe, the return value happens to be the pid
of the subprocess.
Examples:
.nf

	open article || die "Can't find article $article: $!\en";
	while (<article>) {\|.\|.\|.

	open(LOG, \'>>/usr/spool/news/twitlog\'\|);	# (log is reserved)

	open(article, "caesar <$article |"\|);		# decrypt article

	open(extract, "|sort >/tmp/Tmp$$"\|);		# $$ is our process#

	# process argument list of files along with any includes

	foreach $file (@ARGV) {
		do process($file, \'fh00\');	# no pun intended
	}

	sub process {
		local($filename, $input) = @_;
		$input++;		# this is a string increment
		unless (open($input, $filename)) {
			print STDERR "Can't open $filename: $!\en";
			return;
		}
		while (<$input>) {		# note the use of indirection
			if (/^#include "(.*)"/) {
				do process($1, $input);
				next;
			}
			.\|.\|.		# whatever
		}
	}

.fi
104 You may also, in the Bourne shell tradition, specify an EXPR beginning
105 with \*(L">&\*(R", in which case the rest of the string
106 is interpreted as the name of a filehandle
107 (or file descriptor, if numeric) which is to be duped and opened.
Here is a script that saves, redirects, and restores
.I STDOUT
and
.IR STDERR :
.nf

	open(SAVEOUT, ">&STDOUT");
	open(SAVEERR, ">&STDERR");

	open(STDOUT, ">foo.out") || die "Can't redirect stdout";
	open(STDERR, ">&STDOUT") || die "Can't dup stdout";

	select(STDERR); $| = 1;		# make unbuffered
	select(STDOUT); $| = 1;		# make unbuffered

	print STDOUT "stdout 1\en";	# this works for
	print STDERR "stderr 1\en"; 	# subprocesses too

	close(STDOUT);
	close(STDERR);

	open(STDOUT, ">&SAVEOUT");
	open(STDERR, ">&SAVEERR");

	print STDOUT "stdout 2\en";
	print STDERR "stderr 2\en";

.fi
If you open a pipe on the command \*(L"\-\*(R", i.e. either \*(L"|\-\*(R" or \*(L"\-|\*(R",
then there is an implicit fork done, and the return value of open
is the pid of the child within the parent process, and 0 within the child
process.
(Use defined($pid) to determine if the open was successful.)
The filehandle behaves normally for the parent, but i/o to that
filehandle is piped from/to the
.IR STDOUT / STDIN
of the child process.
In the child process the filehandle isn't opened\*(--i/o happens from/to
the new
.I STDOUT
or
.IR STDIN .
Typically this is used like the normal piped open when you want to exercise
more control over just how the pipe command gets executed, such as when
you are running setuid, and don't want to have to scan shell commands
for metacharacters.
The following pairs are equivalent:
.nf

	open(FOO, "|tr \'[a\-z]\' \'[A\-Z]\'");
	open(FOO, "|\-") || exec \'tr\', \'[a\-z]\', \'[A\-Z]\';

	open(FOO, "cat \-n $file|");
	open(FOO, "\-|") || exec \'cat\', \'\-n\', $file;

.fi
167 Explicitly closing any piped filehandle causes the parent process to wait for the
168 child to finish, and returns the status value in $?.
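The implicit fork can be sketched as follows, assuming a system where fork is available.
The filehandle name KID and the message text are made up for the example.

```perl
$pid = open(KID, "-|");                 # implicit fork; KID reads from the child
die "Can't fork: $!\n" unless defined($pid);

if ($pid == 0) {                        # child: its STDOUT is the pipe
    print "hello from the child\n";
    exit 0;
}

$line = <KID>;                          # parent: read what the child printed
close(KID);                             # waits for the child, sets $?
$status = $?;
```

Testing defined($pid) before comparing it with 0 keeps a failed fork from being mistaken for the child branch.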
169 .Ip "opendir(DIRHANDLE,EXPR)" 8 3
170 Opens a directory named EXPR for processing by readdir(), telldir(), seekdir(),
171 rewinddir() and closedir().
172 Returns true if successful.
173 DIRHANDLEs have their own namespace separate from FILEHANDLEs.
.Ip "ord(EXPR)" 8 4
.Ip "ord EXPR" 8
Returns the ASCII value of the first character of EXPR.
If EXPR is omitted, uses $_.
178 .Ip "pack(TEMPLATE,LIST)" 8 4
179 Takes an array or list of values and packs it into a binary structure,
180 returning the string containing the structure.
181 The TEMPLATE is a sequence of characters that give the order and type
182 of values, as follows:
.nf

	A	An ASCII string, will be space padded.
	a	An ASCII string, will be null padded.
	c	A native char value.
	C	An unsigned char value.
	s	A signed short value.
	S	An unsigned short value.
	i	A signed integer value.
	I	An unsigned integer value.
	l	A signed long value.
	L	An unsigned long value.
	n	A short in \*(L"network\*(R" order.
	N	A long in \*(L"network\*(R" order.
	p	A pointer to a string.
	x	A null byte.

.fi
Each letter may optionally be followed by a number which gives a repeat
count.
With all types except "a" and "A" the pack function will gobble up that many values
from the LIST.
The "a" and "A" types gobble just one value, but pack it as a string that long,
padding with nulls or spaces as necessary.
(When unpacking, "A" strips trailing spaces and nulls, but "a" does not.)
Examples:
.nf

	$foo = pack("cccc",65,66,67,68);
	# foo eq "ABCD"
	$foo = pack("c4",65,66,67,68);
	# same thing

	$foo = pack("ccxxcc",65,66,67,68);
	# foo eq "AB\e0\e0CD"

	$foo = pack("s2",1,2);
	# "\e1\e0\e2\e0" on little-endian
	# "\e0\e1\e0\e2" on big-endian

	$foo = pack("a4","abcd","x","y","z");
	# "abcd"

	$foo = pack("aaaa","abcd","x","y","z");
	# "axyz"

	$foo = pack("a14","abcdefg");
	# "abcdefg\e0\e0\e0\e0\e0\e0\e0"

	$foo = pack("i9pl", gmtime());
	# a real struct tm (on my system anyway)

.fi
236 The same template may generally also be used in the unpack function.
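A minimal round-trip sketch, using one template for both directions.
The record layout (an 8-byte name, a network-order short, one unsigned char) is an assumption chosen only for illustration.

```perl
# The field layout here is made up for the example.
$template = "A8nC";
$rec = pack($template, "perl", 3, 65);
($name, $version, $letter) = unpack($template, $rec);

# "A" pads with spaces when packing and strips them when unpacking.
print length($rec), " $name $version $letter\n";    # prints 11 perl 3 65
```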
.Ip "pop(ARRAY)" 8
.Ip "pop ARRAY" 8 6
Pops and returns the last value of the array, shortening the array by 1.
Has the same effect as
.nf

	$tmp = $ARRAY[$#ARRAY\-\|\-];

.fi
If there are no elements in the array, returns the undefined value.
247 .Ip "print(FILEHANDLE LIST)" 8 10
249 .Ip "print FILEHANDLE LIST" 8
252 Prints a string or a comma-separated list of strings.
253 Returns non-zero if successful.
254 FILEHANDLE may be a scalar variable name, in which case the variable contains
255 the name of the filehandle, thus introducing one level of indirection.
If FILEHANDLE is omitted, prints by default to standard output (or to the
last selected output channel\*(--see select()).
If LIST is also omitted, prints $_ to
.IR STDOUT .
To set the default output channel to something other than
.I STDOUT
use the select operation.
263 .Ip "printf(FILEHANDLE LIST)" 8 10
265 .Ip "printf FILEHANDLE LIST" 8
267 Equivalent to a \*(L"print FILEHANDLE sprintf(LIST)\*(R".
268 .Ip "push(ARRAY,LIST)" 8 7
269 Treats ARRAY (@ is optional) as a stack, and pushes the values of LIST
270 onto the end of ARRAY.
271 The length of ARRAY increases by the length of LIST.
Has the same effect as
.nf

	for $value (LIST) {
		$ARRAY[++$#ARRAY] = $value;
	}

.fi
but is more efficient.
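The equivalence can be checked with a short sketch; the array names and values are arbitrary.

```perl
@stack = ('a', 'b');
push(@stack, 'c', 'd');             # append two values at once

@copy = ('a', 'b');
for $value ('c', 'd') {             # the longhand form
    $copy[++$#copy] = $value;
}

print "@stack\n";                   # prints a b c d
print "@copy\n";                    # prints a b c d
```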
.Ip "q/STRING/" 8 5
.Ip "qq/STRING/" 8
These are not really functions, but simply syntactic sugar to let you
avoid putting too many backslashes into quoted strings.
The q operator is a generalized single quote, and the qq operator a
generalized double quote.
Any delimiter can be used in place of /, including newline.
If the delimiter is an opening bracket or parenthesis, the final delimiter
will be the corresponding closing bracket or parenthesis.
(Embedded occurrences of the closing bracket need to be backslashed as usual.)
Examples:
.nf

	$foo = q!I said, "You said, \'She said it.\'"!;
	$bar = q(\'This is it.\');
	$_ .= qq
*** The previous line contains the naughty word "$&".\en
		if /(ibm|apple|awk)/;      # :-)

.fi
.Ip "rand(EXPR)" 8 8
.Ip "rand EXPR" 8
.Ip "rand" 8
Returns a random fractional number between 0 and the value of EXPR.
(EXPR should be positive.)
If EXPR is omitted, returns a value between 0 and 1.
309 .Ip "read(FILEHANDLE,SCALAR,LENGTH)" 8 5
Attempts to read LENGTH bytes of data into variable SCALAR from the specified
FILEHANDLE.
Returns the number of bytes actually read.
313 SCALAR will be grown or shrunk to the length actually read.
314 .Ip "readdir(DIRHANDLE)" 8 3
315 Returns the next directory entry for a directory opened by opendir().
If used in an array context, returns all the rest of the entries in the
directory.
If there are no more entries, returns an undefined value in a scalar context
319 or a null list in an array context.
320 .Ip "readlink(EXPR)" 8 6
321 .Ip "readlink EXPR" 8
322 Returns the value of a symbolic link, if symbolic links are implemented.
323 If not, gives a fatal error.
324 If there is some system error, returns the undefined value and sets $! (errno).
325 If EXPR is omitted, uses $_.
326 .Ip "recv(SOCKET,SCALAR,LEN,FLAGS)" 8 4
327 Receives a message on a socket.
Attempts to receive LEN bytes of data into variable SCALAR from the specified
SOCKET filehandle.
Returns the address of the sender, or the undefined value if there's an error.
331 SCALAR will be grown or shrunk to the length actually read.
332 Takes the same flags as the system call of the same name.
.Ip "redo LABEL" 8 8
.Ip "redo" 8
The
.I redo
command restarts the loop block without evaluating the conditional again.
The
.I continue
block, if any, is not executed.
If the LABEL is omitted, the command refers to the innermost enclosing loop.
342 This command is normally used by programs that want to lie to themselves
343 about what was just input:
347 # a simpleminded Pascal comment stripper
348 # (warning: assumes no { or } in strings)
349 line: while (<STDIN>) {
350 while (s|\|({.*}.*\|){.*}|$1 \||) {}
355 if (\|/\|}/\|) { # end of comment?
365 .Ip "rename(OLDNAME,NEWNAME)" 8 2
366 Changes the name of a file.
367 Returns 1 for success, 0 otherwise.
368 Will not work across filesystem boundaries.
369 .Ip "reset(EXPR)" 8 6
Generally used in a
.I continue
block at the end of a loop to clear variables and reset ?? searches
so that they work again.
The expression is interpreted as a list of single characters (hyphens allowed
for ranges).
All variables and arrays beginning with one of those letters are reset to
their pristine state.
If the expression is omitted, one-match searches (?pattern?) are reset to
match again.
Only resets variables or searches in the current package.
.nf

	reset \'X\';	\h'|2i'# reset all X variables
	reset \'a\-z\';\h'|2i'# reset lower case variables
	reset;	\h'|2i'# just reset ?? searches

.fi
Note: resetting \*(L"A\-Z\*(R" is not recommended since you'll wipe out your ARGV and ENV
arrays.
396 The use of reset on dbm associative arrays does not change the dbm file.
397 (It does, however, flush any entries cached by perl, which may be useful if
398 you are sharing the dbm file.
399 Then again, maybe not.)
400 .Ip "return LIST" 8 3
401 Returns from a subroutine with the value specified.
402 (Note that a subroutine can automatically return
403 the value of the last expression evaluated.
404 That's the preferred method\*(--use of an explicit
407 .Ip "reverse(LIST)" 8 4
409 Returns an array value consisting of the elements of LIST in the opposite order.
410 .Ip "rewinddir(DIRHANDLE)" 8 5
411 .Ip "rewinddir DIRHANDLE" 8
412 Sets the current position to the beginning of the directory for the readdir() routine on DIRHANDLE.
413 .Ip "rindex(STR,SUBSTR)" 8 4
414 Works just like index except that it
415 returns the position of the LAST occurrence of SUBSTR in STR.
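A small sketch contrasting the two functions; the path is an arbitrary example value.

```perl
$path = "/usr/local/bin/perl";
$first = index($path, "/");         # 0, the first slash
$last  = rindex($path, "/");        # 14, the last slash

# Everything after the last slash is the final path component.
$base = substr($path, $last + 1, length($path));
print "$first $last $base\n";       # prints 0 14 perl
```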
416 .Ip "rmdir(FILENAME)" 8 4
417 .Ip "rmdir FILENAME" 8
418 Deletes the directory specified by FILENAME if it is empty.
419 If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno).
420 If FILENAME is omitted, uses $_.
421 .Ip "s/PATTERN/REPLACEMENT/gieo" 8 3
422 Searches a string for a pattern, and if found, replaces that pattern with the
423 replacement text and returns the number of substitutions made.
424 Otherwise it returns false (0).
425 The \*(L"g\*(R" is optional, and if present, indicates that all occurrences
426 of the pattern are to be replaced.
427 The \*(L"i\*(R" is also optional, and if present, indicates that matching
428 is to be done in a case-insensitive manner.
429 The \*(L"e\*(R" is likewise optional, and if present, indicates that
430 the replacement string is to be evaluated as an expression rather than just
431 as a double-quoted string.
Any delimiter may replace the slashes; if single quotes are used, no
interpretation is done on the replacement string (the e modifier overrides
this, however).
435 If no string is specified via the =~ or !~ operator,
436 the $_ string is searched and modified.
437 (The string specified with =~ must be a scalar variable, an array element,
438 or an assignment to one of those, i.e. an lvalue.)
If the pattern contains a $ that looks like a variable rather than an
end-of-string test, the variable will be interpolated into the pattern at
run-time.
If you only want the pattern compiled once the first time the variable is
interpolated, add an \*(L"o\*(R" at the end.
See also the section on regular expressions.
Examples:
.nf

	s/\|\e\|bgreen\e\|b/mauve/g;		# don't change wintergreen

	$path \|=~ \|s|\|/usr/bin|\|/usr/local/bin|;

	s/Login: $foo/Login: $bar/;	# run-time pattern

	($foo = $bar) =~ s/bar/foo/;

	$_ = \'abc123xyz\';
	s/\ed+/$&*2/e;		# yields \*(L'abc246xyz\*(R'
	s/\ed+/sprintf("%5d",$&)/e;	# yields \*(L'abc  246xyz\*(R'
	s/\ew/$& x 2/eg;	# yields \*(L'aabbcc  224466xxyyzz\*(R'

	s/\|([^ \|]*\|) *\|([^ \|]*\|)\|/\|$2 $1/;	# reverse 1st two fields

.fi
464 (Note the use of $ instead of \|\e\| in the last example. See section
465 on regular expressions.)
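The return value and the copy-then-modify idiom can be seen together in this sketch.
The sample string and variable names are assumptions for illustration.

```perl
$line = "3 apples and 12 pears";

# Assign first, then substitute on the copy; $line is left untouched.
($doubled = $line) =~ s/\d+/$& * 2/eg;

# In a scalar assignment, s/// yields the number of substitutions made.
$count = ($line =~ s/\d+/N/g);

print "$doubled\n";                 # prints 6 apples and 24 pears
print "$count $line\n";             # prints 2 N apples and N pears
```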
466 .Ip "seek(FILEHANDLE,POSITION,WHENCE)" 8 3
Randomly positions the file pointer for FILEHANDLE, just like the fseek()
call of stdio.
469 FILEHANDLE may be an expression whose value gives the name of the filehandle.
470 Returns 1 upon success, 0 otherwise.
471 .Ip "seekdir(DIRHANDLE,POS)" 8 3
472 Sets the current position for the readdir() routine on DIRHANDLE.
POS must be a value returned by telldir().
474 Has the same caveats about possible directory compaction as the corresponding
475 system library routine.
476 .Ip "select(FILEHANDLE)" 8 3
478 Returns the currently selected filehandle.
479 Sets the current default filehandle for output, if FILEHANDLE is supplied.
This has two effects: first, a
.I write
or a
.I print
without a filehandle will default to this FILEHANDLE.
Second, references to variables related to output will refer to this output
channel.
For example, if you have to set the top of form format for more than
one output channel, you might do the following:
.nf

	select(report1);
	$^ = \'report1_top\';
	select(report2);
	$^ = \'report2_top\';

.fi
FILEHANDLE may be an expression whose value gives the name of the actual filehandle.
Thus:
.nf

	$oldfh = select(STDERR); $| = 1; select($oldfh);

.fi
505 .Ip "select(RBITS,WBITS,EBITS,TIMEOUT)" 8 3
506 This calls the select system call with the bitmasks specified, which can
507 be constructed using fileno() and vec(), along these lines:
.nf

	$rin = $win = $ein = '';
	vec($rin,fileno(STDIN),1) = 1;
	vec($win,fileno(STDOUT),1) = 1;

.fi
If you want to select on many filehandles you might wish to write a subroutine:
.nf

	sub fhbits {
	    local(@fhlist) = split(' ',$_[0]);
	    local($bits);
	    for (@fhlist) {
		vec($bits,fileno($_),1) = 1;
	    }
	    $bits;
	}
	$rin = &fhbits('STDIN TTY SOCK');

.fi
The usual idiom is:
.nf

	($nfound,$timeleft) =
	  select($rout=$rin, $wout=$win, $eout=$ein, $timeout);

or to block until something becomes ready:

	$nfound = select($rout=$rin, $wout=$win, $eout=$ein, undef);

.fi
541 Any of the bitmasks can also be undef.
542 The timeout, if specified, is in seconds, which may be fractional.
543 .Ip "setpgrp(PID,PGRP)" 8 4
Sets the current process group for the specified PID, 0 for the current
process.
Will produce a fatal error if used on a machine that doesn't implement
setpgrp(2).
548 .Ip "send(SOCKET,MSG,FLAGS,TO)" 8 4
549 .Ip "send(SOCKET,MSG,FLAGS)" 8
550 Sends a message on a socket.
551 Takes the same flags as the system call of the same name.
552 On unconnected sockets you must specify a destination to send TO.
Returns the number of characters sent, or the undefined value if
there is an error.
555 .Ip "setpriority(WHICH,WHO,PRIORITY)" 8 4
556 Sets the current priority for a process, a process group, or a user.
557 (See setpriority(2).)
Will produce a fatal error if used on a machine that doesn't implement
setpriority(2).
560 .Ip "setsockopt(SOCKET,LEVEL,OPTNAME,OPTVAL)" 8 3
561 Sets the socket option requested.
562 Returns undefined if there is an error.
563 OPTVAL may be specified as undef if you don't want to pass an argument.
564 .Ip "shift(ARRAY)" 8 6
567 Shifts the first value of the array off and returns it,
568 shortening the array by 1 and moving everything down.
569 If there are no elements in the array, returns the undefined value.
570 If ARRAY is omitted, shifts the @ARGV array in the main program, and the @_
571 array in subroutines.
572 See also unshift(), push() and pop().
573 Shift() and unshift() do the same thing to the left end of an array that push()
574 and pop() do to the right end.
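The symmetry between the two ends of an array can be sketched as follows; the array contents are arbitrary.

```perl
@queue = (1, 2, 3);
$head = shift(@queue);              # 1; @queue is now (2, 3)
unshift(@queue, 0);                 # @queue is now (0, 2, 3)
$tail = pop(@queue);                # 3; @queue is now (0, 2)
push(@queue, 9);                    # @queue is now (0, 2, 9)

@empty = ();
$nothing = shift(@empty);           # undefined: nothing to shift
print "$head $tail @queue\n";       # prints 1 3 0 2 9
```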
575 .Ip "shutdown(SOCKET,HOW)" 8 3
576 Shuts down a socket connection in the manner indicated by HOW, which has
577 the same interpretation as in the system call of the same name.
.Ip "sin(EXPR)" 8 4
.Ip "sin EXPR" 8
Returns the sine of EXPR (expressed in radians).
If EXPR is omitted, returns sine of $_.
582 .Ip "sleep(EXPR)" 8 6
585 Causes the script to sleep for EXPR seconds, or forever if no EXPR.
May be interrupted by sending the process a SIGALRM.
587 Returns the number of seconds actually slept.
588 .Ip "socket(SOCKET,DOMAIN,TYPE,PROTOCOL)" 8 3
589 Opens a socket of the specified kind and attaches it to filehandle SOCKET.
DOMAIN, TYPE and PROTOCOL are specified the same as for the system call
of the same name.
You may need to run makelib on sys/socket.h to get the proper values handy
in a perl library file.
Returns true if successful.
595 See the example in the section on Interprocess Communication.
596 .Ip "socketpair(SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL)" 8 3
Creates an unnamed pair of sockets in the specified domain, of the specified
type.
DOMAIN, TYPE and PROTOCOL are specified the same as for the system call
of the same name.
If unimplemented, yields a fatal error.
Returns true if successful.
603 .Ip "sort(SUBROUTINE LIST)" 8 9
605 .Ip "sort SUBROUTINE LIST" 8
607 Sorts the LIST and returns the sorted array value.
608 Nonexistent values of arrays are stripped out.
609 If SUBROUTINE is omitted, sorts in standard string comparison order.
610 If SUBROUTINE is specified, gives the name of a subroutine that returns
611 an integer less than, equal to, or greater than 0,
612 depending on how the elements of the array are to be ordered.
613 In the interests of efficiency the normal calling code for subroutines
614 is bypassed, with the following effects: the subroutine may not be a recursive
615 subroutine, and the two elements to be compared are passed into the subroutine
616 not via @_ but as $a and $b (see example below).
617 They are passed by reference so don't modify $a and $b.
618 SUBROUTINE may be a scalar variable name, in which case the value provides
619 the name of the subroutine to use.
Examples:
.nf

	sub byage {
	    $age{$a} - $age{$b};	# presuming integers
	}
	@sortedclass = sort byage @class;

	sub backwards { $a lt $b ? 1 : $a gt $b ? \-1 : 0; }
	@harry = (\'dog\',\'cat\',\'x\',\'Cain\',\'Abel\');
	@george = (\'gone\',\'chased\',\'yz\',\'Punished\',\'Axed\');
	print sort @harry;
		# prints AbelCaincatdogx
	print sort backwards @harry;
		# prints xdogcatCainAbel
	print sort @george, \'to\', @harry;
		# prints AbelAxedCainPunishedcatchaseddoggonetoxyz

.fi
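A self-contained sketch of a comparison subroutine, using made-up names and ages.
Note that the subroutine reads $a and $b directly rather than @_.

```perl
%age = ('al', 30, 'bo', 25, 'cy', 41);      # made-up data

# Negative, zero or positive result decides the order of $a and $b.
sub byage { $age{$a} - $age{$b}; }

@byage  = sort byage keys(%age);
@byname = sort keys(%age);                  # default string comparison

print "@byage\n";                           # prints bo al cy
print "@byname\n";                          # prints al bo cy
```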
641 .Ip "split(/PATTERN/,EXPR,LIMIT)" 8 8
642 .Ip "split(/PATTERN/,EXPR)" 8 8
643 .Ip "split(/PATTERN/)" 8
645 Splits a string into an array of strings, and returns it.
(If not in an array context, returns the number of fields found and splits
into the @_ array.)
648 If EXPR is omitted, splits the $_ string.
649 If PATTERN is also omitted, splits on whitespace (/[\ \et\en]+/).
650 Anything matching PATTERN is taken to be a delimiter separating the fields.
651 (Note that the delimiter may be longer than one character.)
652 If LIMIT is specified, splits into no more than that many fields (though it
653 may split into fewer).
654 If LIMIT is unspecified, trailing null fields are stripped (which
655 potential users of pop() would do well to remember).
A pattern matching the null string (not to be confused with a null pattern,
which is one member of the set of patterns matching a null string)
will split the value of EXPR into separate characters at each point it
matches that way.
For example:
.nf

	print join(\':\', split(/ */, \'hi there\'));

.fi
produces the output \*(L'h:i:t:h:e:r:e\*(R'.
The LIMIT parameter can be used to partially split a line:
.nf

	($login, $passwd, $remainder) = split(\|/\|:\|/\|, $_, 3);

.fi
(When assigning to a list, if LIMIT is omitted, perl supplies a LIMIT one
larger than the number of variables in the list, to avoid unnecessary work.
For the list above LIMIT would have been 4 by default.
In time critical applications it behooves you not to split into
more fields than you really need.)
If the PATTERN contains parentheses, additional array elements are created
from each matching substring in the delimiter.
.nf

	split(/([,-])/,"1-10,20");

.fi
produces the array value
.nf

	(1,\'-\',10,\',\',20)

.fi
689 The pattern /PATTERN/ may be replaced with an expression to specify patterns
690 that vary at runtime.
691 (To do runtime compilation only once, use /$variable/o.)
As a special case, specifying a space (\'\ \') will split on white space
just as split with no arguments does, but leading white space does NOT
produce a null first field.
Thus, split(\'\ \') can be used to emulate
.IR awk \|'s
default behavior, whereas
split(/\ /) will give you as many null initial fields as there are
leading spaces.
.nf

	open(passwd, \'/etc/passwd\');
	while (<passwd>) {
		($login, $passwd, $uid, $gid, $gcos, $home, $shell)
			= split(\|/\|:\|/\|);
		.\|.\|.
	}

.fi
718 (Note that $shell above will still have a newline on it. See chop().)
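The three behaviors described above (a LIMIT, parentheses in the pattern, and the special single-space case) can be sketched like this.
The password entry is a made-up example value.

```perl
$entry = "lwall:x:100:10:Larry:/home/lwall:/bin/sh";   # made-up entry

# LIMIT of 3: the third field keeps the rest of the line, colons and all.
($login, $passwd, $rest) = split(/:/, $entry, 3);

# Parentheses in the pattern return the delimiters as fields too.
@parts = split(/([,-])/, "1-10,20");

# A single space skips leading white space instead of yielding a null field.
@words = split(' ', "  leading and trailing  ");

print "$login|$rest\n";
print join('|', @parts), "\n";      # prints 1|-|10|,|20
print join('|', @words), "\n";      # prints leading|and|trailing
```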
721 .Ip "sprintf(FORMAT,LIST)" 8 4
722 Returns a string formatted by the usual printf conventions.
723 The * character is not supported.
.Ip "sqrt(EXPR)" 8 4
.Ip "sqrt EXPR" 8
Returns the square root of EXPR.
If EXPR is omitted, returns square root of $_.
728 .Ip "srand(EXPR)" 8 4
Sets the random number seed for the
.I rand
operator.
If EXPR is omitted, does srand(time).
734 .Ip "stat(FILEHANDLE)" 8 6
735 .Ip "stat FILEHANDLE" 8
737 Returns a 13-element array giving the statistics for a file, either the file
738 opened via FILEHANDLE, or named by EXPR.
739 Typically used as follows:
.nf

	($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
	   $atime,$mtime,$ctime,$blksize,$blocks)
	       = stat($filename);

.fi
If stat is passed the special filehandle consisting of an underline,
no stat is done, but the current contents of the stat structure from
the last stat or filetest are returned.
Example:
.nf

	if (-x $file && (($d) = stat(_)) && $d < 0) {
		print "$file is executable NFS file\en";
	}

.fi
760 .Ip "study(SCALAR)" 8 6
763 Takes extra time to study SCALAR ($_ if unspecified) in anticipation of
764 doing many pattern matches on the string before it is next modified.
765 This may or may not save time, depending on the nature and number of patterns
766 you are searching on, and on the distribution of character frequencies in
767 the string to be searched\*(--you probably want to compare runtimes with and
768 without it to see which runs faster.
769 Those loops which scan for many short constant strings (including the constant
770 parts of more complex patterns) will benefit most.
771 You may have only one study active at a time\*(--if you study a different
772 scalar the first is \*(L"unstudied\*(R".
(The way study works is this: a linked list of every character in the string
to be searched is made, so we know, for example, where all the \*(L'k\*(R' characters
are.
From each search string, the rarest character is selected, based on some
static frequency tables constructed from some C programs and English text.
Only those places that contain this \*(L"rarest\*(R" character are examined.)
780 For example, here is a loop which inserts index producing entries before any line
781 containing a certain pattern:
.nf

	while (<>) {
		study;
		print ".IX foo\en" if /\ebfoo\eb/;
		print ".IX bar\en" if /\ebbar\eb/;
		print ".IX blurfl\en" if /\ebblurfl\eb/;
		print;
	}

.fi
795 In searching for /\ebfoo\eb/, only those locations in $_ that contain \*(L'f\*(R'
796 will be looked at, because \*(L'f\*(R' is rarer than \*(L'o\*(R'.
797 In general, this is a big win except in pathological cases.
798 The only question is whether it saves you more time than it took to build
799 the linked list in the first place.
801 Note that if you have to look for strings that you don't know till runtime,
802 you can build an entire loop as a string and eval that to avoid recompiling
803 all your patterns all the time.
804 Together with setting $/ to input entire files as one record, this can
805 be very fast, often faster than specialized programs like fgrep.
The following scans a list of files (@files)
for a list of words (@words), and prints out the names of those files that
contain a match:
.nf

	$search = \'while (<>) { study;\';
	foreach $word (@words) {
	    $search .= "++\e$seen{\e$ARGV} if /\eb$word\eb/;\en";
	}
	$search .= "}";
	@ARGV = @files;
	$/ = "\e177";		# something that doesn't occur
	eval $search;		# this screams
	$/ = "\en";		# put back to normal input delim
	foreach $file (sort keys(%seen)) {
	    print $file, "\en";
	}

.fi
826 .Ip "substr(EXPR,OFFSET,LEN)" 8 2
827 Extracts a substring out of EXPR and returns it.
828 First character is at offset 0, or whatever you've set $[ to.
829 If OFFSET is negative, starts that far from the end of the string.
You can use the substr() function as an lvalue, in which case EXPR must
be an lvalue.
If you assign something shorter than LEN, the string will shrink, and
if you assign something longer than LEN, the string will grow to accommodate it.
To keep the string the same length you may need to pad or chop your value using
sprintf().
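A short sketch of both uses, assuming $[ has been left at its default of 0.
The sample string is arbitrary.

```perl
$s = "Hello, world";
$tail = substr($s, -5, 5);          # negative OFFSET counts from the end

# As an lvalue: replace 5 characters with 3, so the string shrinks by 2.
substr($s, 0, 5) = "Bye";

print "$tail\n";                    # prints world
print "$s\n";                       # prints Bye, world
```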
836 .Ip "syscall(LIST)" 8 6
838 Calls the system call specified as the first element of the list, passing
839 the remaining elements as arguments to the system call.
840 If unimplemented, produces a fatal error.
841 The arguments are interpreted as follows: if a given argument is numeric,
842 the argument is passed as an int.
843 If not, the pointer to the string value is passed.
It is your responsibility to make sure a string is pre-extended long enough
to receive any result that might be written into a string.
If your integer arguments are not literals and have never been interpreted
in a numeric context, you may need to add 0 to them to force them to look
like numbers.
.nf

	do 'syscall.h';		# may need to run makelib
	syscall(&SYS_write, fileno(STDOUT), "hi there\en", 9);

.fi
855 .Ip "system(LIST)" 8 6
857 Does exactly the same thing as \*(L"exec LIST\*(R" except that a fork
858 is done first, and the parent process waits for the child process to complete.
859 Note that argument processing varies depending on the number of arguments.
The return value is the exit status of the program as returned by the wait()
call.
To get the actual exit value divide by 256.
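A minimal sketch of recovering the exit value, assuming a Unix-like system where sh is on the path.
The exit code 3 is arbitrary.

```perl
# Run a child that exits with status 3 (list form of system).
$status = system('sh', '-c', 'exit 3');

# The wait() status keeps the exit value in the high byte.
$exit = int($status / 256);
print "$exit\n";                    # prints 3
```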
865 .Ip "symlink(OLDFILE,NEWFILE)" 8 2
866 Creates a new filename symbolically linked to the old filename.
867 Returns 1 for success, 0 otherwise.
On systems that don't support symbolic links, produces a fatal error at
run time.
To check for that, use eval:
873 $symlink_exists = (eval \'symlink("","");\', $@ eq \'\');
876 .Ip "tell(FILEHANDLE)" 8 6
877 .Ip "tell FILEHANDLE" 8 6
879 Returns the current file position for FILEHANDLE.
FILEHANDLE may be an expression whose value gives the name of the actual
filehandle.
If FILEHANDLE is omitted, assumes the file last read.
883 .Ip "telldir(DIRHANDLE)" 8 5
884 .Ip "telldir DIRHANDLE" 8
885 Returns the current position of the readdir() routines on DIRHANDLE.
Value may be given to seekdir() to access a particular location in
a directory.
888 Has the same caveats about possible directory compaction as the corresponding
889 system library routine.
.Ip "time" 8 4
Returns the number of non-leap seconds since January 1, 1970, UTC.
Suitable for feeding to gmtime() and localtime().
.Ip "times" 8 4
Returns a four-element array giving the user and system times, in seconds, for this
process and the children of this process.
.nf

	($user,$system,$cuser,$csystem) = times;

.fi
899 .Ip "tr/SEARCHLIST/REPLACEMENTLIST/" 8 5
900 .Ip "y/SEARCHLIST/REPLACEMENTLIST/" 8
Translates all occurrences of the characters found in the search list into
the corresponding characters in the replacement list.
It returns the number of characters replaced.
If no string is specified via the =~ or !~ operator,
the $_ string is translated.
(The string specified with =~ must be a scalar variable, an array element,
or an assignment to one of those, i.e. an lvalue.)
.I y
is provided as a synonym for
.IR tr .
917 $ARGV[1] \|=~ \|y/A\-Z/a\-z/; \h'|3i'# canonicalize to lower case
919 $cnt = tr/*/*/; \h'|3i'# count the stars in $_
921 ($HOST = $host) =~ tr/a\-z/A\-Z/;
923 y/\e001\-@[\-_{\-\e177/ /; \h'|3i'# change non-alphas to space
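Because the return value is the number of characters translated, mapping a character to itself counts it without changing the string.
A brief sketch with made-up input:

```perl
$_ = "a*b**c";
$stars = tr/*/*/;                       # counts the stars, $_ is unchanged

$host = "alpha.example";                # made-up host name
($HOST = $host) =~ tr/a-z/A-Z/;         # upper-case a copy, keep the original

print "$stars $_\n";                    # prints 3 a*b**c
print "$HOST $host\n";                  # prints ALPHA.EXAMPLE alpha.example
```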
926 .Ip "umask(EXPR)" 8 4
928 Sets the umask for the process and returns the old one.
929 If EXPR is omitted, merely returns current umask.
930 .Ip "undef(EXPR)" 8 6
933 Undefines the value of EXPR, which must be an lvalue.
934 Use only on a scalar value, an entire array, or a subroutine name (using &).
935 (Undef will probably not do what you expect on most predefined variables or
937 Always returns the undefined value.
938 You can omit the EXPR, in which case nothing is undefined, but you still
939 get an undefined value that you could, for instance, return from a subroutine.
945 undef $bar{'blurfl'};
949 return (wantarray ? () : undef) if $they_blew_it;
952 .Ip "unlink(LIST)" 8 4
954 Deletes a list of files.
955 Returns the number of files successfully deleted.
959 $cnt = unlink \'a\', \'b\', \'c\';
964 Note: unlink will not delete directories unless you are superuser and the
968 Even if these conditions are met, be warned that unlinking a directory
969 can inflict damage on your filesystem.
971 .Ip "unpack(TEMPLATE,EXPR)" 8 4
Unpack does the reverse of pack: it takes a string representing
a structure and expands it out into an array value, returning the array
value.
The TEMPLATE has the same format as in the pack function.
Here's a subroutine that does substring:
.nf

	sub substr {
		local($what,$where,$howmuch) = @_;
		unpack("x$where a$howmuch", $what);
	}

.fi
and then there's
.nf

	sub ord { unpack("c",$_[0]); }

.fi
991 .Ip "unshift(ARRAY,LIST)" 8 4
Does the opposite of a
.IR shift .
Or the opposite of a
.IR push ,
depending on how you look at it.
Prepends list to the front of the array, and returns the number of elements
in the new array.
.nf

	unshift(ARGV, \'\-e\') unless $ARGV[0] =~ /^\-/;

.fi
1004 .Ip "utime(LIST)" 8 2
1005 .Ip "utime LIST" 8 2
1006 Changes the access and modification times on each file of a list of files.
1007 The first two elements of the list must be the NUMERICAL access and
1008 modification times, in that order.
1009 Returns the number of files successfully changed.
1010 The inode modification time of each file is set to the current time.
Example of a \*(L"touch\*(R" command:
.nf

	$now = time;
	utime $now, $now, @ARGV;

.fi
1020 .Ip "values(ASSOC_ARRAY)" 8 6
1021 .Ip "values ASSOC_ARRAY" 8
Returns a normal array consisting of all the values of the named associative
array.
1024 The values are returned in an apparently random order, but it is the same order
1025 as either the keys() or each() function would produce on the same array.
1026 See also keys() and each().
1027 .Ip "vec(EXPR,OFFSET,BITS)" 8 2
1028 Treats a string as a vector of unsigned integers, and returns the value
1029 of the bitfield specified.
1030 May also be assigned to.
1031 BITS must be a power of two from 1 to 32.
Vectors created with vec() can also be manipulated with the logical operators
| and &,
which will assume a bit vector operation is desired when both operands are
strings.
This interpretation is not enabled unless there is at least one vec() in
your program, to protect older programs.
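A small sketch of one-bit fields and the string forms of & and |.
The variable names and bit positions are arbitrary.

```perl
$bits = '';
vec($bits, 0, 1) = 1;               # set bit 0
vec($bits, 3, 1) = 1;               # set bit 3

$other = '';
vec($other, 3, 1) = 1;
vec($other, 5, 1) = 1;

# Both operands are strings, so these act on the bit vectors.
$both = $bits & $other;             # only bit 3 is set in both
$any  = $bits | $other;             # bits 0, 3 and 5

print ord($bits), " ", ord($both), " ", ord($any), "\n";   # prints 9 8 41
```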
.Ip "wait" 8 6
Waits for a child process to terminate and returns the pid of the deceased
process.
The status is returned in $?.
.Ip "wantarray" 8 4
Returns true if the context of the currently executing subroutine
is looking for an array value.
Returns false if the context is looking for a scalar.
.nf

	return wantarray ? () : undef;

.fi
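The effect of the caller's context can be seen with a sketch like the following.
The subroutine name context is hypothetical.

```perl
# context() is a made-up name; it reports how it was called.
sub context { wantarray ? "array" : "scalar"; }

@a = &context();                    # array assignment: array context
$s = &context();                    # scalar assignment: scalar context

print "$a[0] $s\n";                 # prints array scalar
```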
1052 .Ip "warn(LIST)" 8 4
1054 Produces a message on STDERR just like \*(L"die\*(R", but doesn't exit.
1055 .Ip "write(FILEHANDLE)" 8 6
1058 Writes a formatted record (possibly multi-line) to the specified file,
1059 using the format associated with that file.
By default the format for a file is the one having the same name as the
filehandle, but the format for the current output channel (see
.IR select )
may be set explicitly
by assigning the name of the format to the $~ variable.
1066 Top of form processing is handled automatically:
1067 if there is insufficient room on the current page for the formatted
1068 record, the page is advanced, a special top-of-page format is used
1069 to format the new page header, and then the record is written.
By default the top-of-page format is \*(L"top\*(R", but it
may be set to the
format of your choice by assigning the name to the $^ variable.
If FILEHANDLE is unspecified, output goes to the current default output channel,
which starts out as
.I STDOUT
but may be changed by the
.I select
operator.
1080 If the FILEHANDLE is an EXPR, then the expression is evaluated and the
1081 resulting string is used to look up the name of the FILEHANDLE at run time.
1082 For more on formats, see the section on formats later on.
1084 Note that write is NOT the opposite of read.