1 ''' Beginning of part 2
2 ''' $Header: perl.man.2,v 2.0 88/06/05 00:09:30 root Exp $
4 ''' $Log: perl.man.2,v $
5 ''' Revision 2.0 88/06/05 00:09:30 root
6 ''' Baseline version 2.0.
10 Finds the statement labeled with LABEL and resumes execution there.
11 Currently you may only go to statements in the main body of the program
12 that are not nested inside a do {} construct.
13 This statement is not implemented very efficiently, and is here only to make
14 the sed-to-perl translator easier.
Returns the decimal value of EXPR interpreted as a hex string.
18 (To interpret strings that might start with 0 or 0x see oct().)
19 .Ip "index(STR,SUBSTR)" 8 4
20 Returns the position of SUBSTR in STR, based at 0, or whatever you've
21 set the $[ variable to.
22 If the substring is not found, returns one less than the base, ordinarily -1.
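The two return cases can be sketched as follows, assuming $[ is left at its default of 0. The sample string and the variable names are made up for illustration.

```perl
# Assumes $[ has its default value of 0.
$str = 'hello world';
$found   = index($str, 'world');   # position where the match starts
$missing = index($str, 'xyz');     # no match: one less than the base
print "found=$found missing=$missing\n";   # prints found=6 missing=-1
```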
24 Returns the integer portion of EXPR.
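A brief sketch, assuming the fractional part is simply dropped (truncation toward zero), which is how later perls behave. The rounding idiom on the last line is a common convention rather than anything this manual prescribes.

```perl
$whole   = int(7.9);         # 7
$neg     = int(-7.9);        # -7: the fraction is dropped, not rounded down
$rounded = int(7.5 + 0.5);   # 8: round a non-negative value to nearest
print "$whole $neg $rounded\n";
```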
25 .Ip "join(EXPR,LIST)" 8 8
26 .Ip "join(EXPR,ARRAY)" 8
27 Joins the separate strings of LIST or ARRAY into a single string with fields
28 separated by the value of EXPR, and returns the string.
32 $_ = join(\|':', $login,$passwd,$uid,$gid,$gcos,$home,$shell);
37 .Ip "keys(ASSOC_ARRAY)" 8 6
38 Returns a normal array consisting of all the keys of the named associative
40 The keys are returned in an apparently random order, but it is the same order
41 as either the values() or each() function produces (given that the associative array
42 has not been modified).
43 Here is yet another way to print your environment:
48 @values = values(ENV);
50 print pop(keys),'=',pop(values),"\en";
53 or how about sorted by key:
56 foreach $key (sort keys(ENV)) {
57 print $key,'=',$ENV{$key},"\en";
62 Sends a signal to a list of processes.
63 The first element of the list must be the (numerical) signal to send.
64 Returns the number of processes successfully signaled.
67 $cnt = kill 1,$child1,$child2;
71 If the signal is negative, kills process groups instead of processes.
72 (On System V, a negative \fIprocess\fR number will also kill process groups,
73 but that's not portable.)
80 statement in C (as used in loops); it immediately exits the loop in question.
81 If the LABEL is omitted, the command refers to the innermost enclosing loop.
84 block, if any, is not executed:
88 line: while (<stdin>) {
89 last line if /\|^$/; # exit when done with header
94 .Ip "length(EXPR)" 8 2
95 Returns the length in characters of the value of EXPR.
96 .Ip "link(OLDFILE,NEWFILE)" 8 2
97 Creates a new filename linked to the old filename.
98 Returns 1 for success, 0 otherwise.
100 Declares the listed (scalar) variables to be local to the enclosing block,
102 (The "do 'filename';" operator also counts as an eval.)
103 This operator works by saving the current values of those variables in LIST
104 on a hidden stack and restoring them upon exiting the block, subroutine or eval.
105 The LIST may be assigned to if desired, which allows you to initialize
106 your local variables.
107 Commonly this is used to name the parameters to a subroutine.
113 local($min, $max, $thunk) = @_;
117 # Presumably $thunk makes reference to $i
119 for ($i = $min; $i < $max; $i++) {
120 $result .= eval $thunk;
127 .Ip "localtime(EXPR)" 8 4
128 Converts a time as returned by the time function to a 9-element array with
129 the time analyzed for the local timezone.
130 Typically used as follows:
134 ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst)
138 All array elements are numeric, and come straight out of a struct tm.
139 In particular this means that $mon has the range 0..11 and $wday has the
142 Returns logarithm (base e) of EXPR.
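Logarithms to other bases follow from the usual change-of-base identity. A minimal sketch, with variable names chosen only for illustration:

```perl
$x = 1000;
$log10 = log($x) / log(10);   # base-10 logarithm via change of base
$log2  = log(8) / log(2);     # base-2 logarithm
# The results are floating point, so format them rather than
# comparing for exact equality.
printf("%.3f %.3f\n", $log10, $log2);
```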
149 statement in C; it starts the next iteration of the loop:
153 line: while (<stdin>) {
154 next line if /\|^#/; # discard comments
159 Note that if there were a
161 block on the above, it would get executed even on discarded lines.
162 If the LABEL is omitted, the command refers to the innermost enclosing loop.
164 Returns the decimal value of EXPR interpreted as an octal string.
165 (If EXPR happens to start off with 0x, interprets it as a hex string instead.)
166 The following will handle decimal, octal and hex in the standard notation:
169 $val = oct($val) if $val =~ /^0/;
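The same test can be applied to a whole list of literals. This is a sketch only: the input values and the array names are invented, and it assumes a perl that accepts the @-prefixed array syntax shown.

```perl
@in  = ('42', '0755', '0x1f');
@out = ();
foreach $val (@in) {
    # a leading 0 means octal, or hex when followed by x
    $val = oct($val) if $val =~ /^0/;
    push(@out, $val + 0);
}
print join(' ', @out), "\n";   # prints 42 493 31
$h = hex('ff');                # 255; hex() takes the digits without 0x
```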
172 .Ip "open(FILEHANDLE,EXPR)" 8 8
173 .Ip "open(FILEHANDLE)" 8
174 .Ip "open FILEHANDLE" 8
175 Opens the file whose filename is given by EXPR, and associates it with
177 If FILEHANDLE is an expression, its value is used as the name of the
178 real filehandle wanted.
179 If EXPR is omitted, the scalar variable of the same name as the FILEHANDLE
180 contains the filename.
181 If the filename begins with \*(L">\*(R", the file is opened for output.
182 If the filename begins with \*(L">>\*(R", the file is opened for appending.
183 If the filename begins with \*(L"|\*(R", the filename is interpreted
184 as a command to which output is to be piped, and if the filename ends
with a \*(L"|\*(R", the filename is interpreted as a command which pipes
187 (You may not have a command that pipes both in and out.)
188 Opening '\-' opens stdin and opening '>\-' opens stdout.
189 Open returns 1 upon success, '' otherwise.
195 open article || die "Can't find article $article";
196 while (<article>) {\|.\|.\|.
198 open(LOG, '>>/usr/spool/news/twitlog'\|); # (log is reserved)
open(article, "caesar <$article |"\|); # decrypt article
202 open(extract, "|sort >/tmp/Tmp$$"\|); # $$ is our process#
205 # process argument list of files along with any includes
207 foreach $file (@ARGV) {
208 do process($file,'fh00'); # no pun intended
212 local($filename,$input) = @_;
213 $input++; # this is a string increment
214 unless (open($input,$filename)) {
215 print stderr "Can't open $filename\en";
216 last; # note block inside sub
218 while (<$input>) { # note the use of indirection
219 if (/^#include "(.*)"/) {
220 do process($1,$input);
228 You may also, in the Bourne shell tradition, specify an EXPR beginning
229 with ">&", in which case the rest of the string
230 is interpreted as the name of a filehandle
231 (or file descriptor, if numeric) which is to be duped and opened.
Here is a script that saves, redirects, and restores stdout and stderr:
237 open(saveout,">&stdout");
238 open(saveerr,">&stderr");
240 open(stdout,">foo.out") || die "Can't redirect stdout";
241 open(stderr,">&stdout") || die "Can't dup stdout";
243 select(stderr); $| = 1; # make unbuffered
244 select(stdout); $| = 1; # make unbuffered
246 print stdout "stdout 1\en"; # this works for
247 print stderr "stderr 1\en"; # subprocesses too
252 open(stdout,">&saveout");
253 open(stderr,">&saveerr");
255 print stdout "stdout 2\en";
256 print stderr "stderr 2\en";
259 If you open a pipe on the command "-", i.e. either "|-" or "-|",
260 then there is an implicit fork done, and the return value of open
261 is the pid of the child within the parent process, and 0 within the child
263 The filehandle behaves normally for the parent, but i/o to that
264 filehandle is piped from/to the stdout/stdin of the child process.
265 In the child process the filehandle isn't opened--i/o happens from/to
266 the new stdout or stdin.
267 Typically this is used like the normal piped open when you want to exercise
268 more control over just how the pipe command gets executed, such as when
269 you are running setuid, and don't want to have to scan shell commands
271 The following pairs are equivalent:
275 open(FOO,"|tr '[a-z]' '[A-Z]'");
276 open(FOO,"|-") || exec 'tr', '[a-z]', '[A-Z]';
278 open(FOO,"cat -n $file|");
279 open(FOO,"-|") || exec 'cat', '-n', $file;
282 Explicitly closing the filehandle causes the parent process to wait for the
283 child to finish, and returns the status value in $?.
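A minimal sketch of the forking form with the child written inline rather than exec'ing a command. It assumes a Unix-like system and a perl that provides defined(); the filehandle name KID and the message text are made up.

```perl
$pid = open(KID, "-|");            # implicit fork; we read the child's stdout
die "Can't fork" unless defined($pid);
if ($pid == 0) {
    # child: its stdout is the pipe back to the parent
    print "hello from child\n";
    exit 0;
}
# parent: KID behaves like any other input filehandle
$line = <KID>;
close(KID);                        # waits for the child; status is left in $?
$status = $?;
print "got: $line";
```

Closing the filehandle before inspecting $? matters here, since the status is only collected when the parent waits for the child.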
Returns the ASCII value of the first character of EXPR.
288 Pops and returns the last value of the array, shortening the array by 1.
289 Has the same effect as
292 $tmp = $ARRAY[$#ARRAY]; $#ARRAY--;
295 .Ip "print FILEHANDLE LIST" 8 9
298 Prints a string or a comma-separated list of strings.
299 FILEHANDLE may be a scalar variable name, in which case the variable contains
300 the name of the filehandle, thus introducing one level of indirection.
301 If FILEHANDLE is omitted, prints by default to standard output (or to the
302 last selected output channel\*(--see select()).
303 If LIST is also omitted, prints $_ to stdout.
304 To set the default output channel to something other than stdout use the select operation.
305 .Ip "printf FILEHANDLE LIST" 8 9
307 Equivalent to a "print FILEHANDLE sprintf(LIST)".
308 .Ip "push(ARRAY,LIST)" 8 7
309 Treats ARRAY (@ is optional) as a stack, and pushes the values of LIST
310 onto the end of ARRAY.
311 The length of ARRAY increases by the length of LIST.
312 Has the same effect as
316 $ARRAY[$#ARRAY+1] = $value;
320 but is more efficient.
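The equivalence can be checked side by side. This is an illustrative sketch; the array names and values are invented.

```perl
@stack = ('a', 'b');
push(@stack, 'c', 'd');            # @stack is now a b c d

# the element-by-element form described above
@slow = ('a', 'b');
foreach $value ('c', 'd') {
    $slow[$#slow + 1] = $value;
}
print join(',', @stack), ' ', join(',', @slow), "\n";
```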
325 command restarts the loop block without evaluating the conditional again.
328 block, if any, is not executed.
329 If the LABEL is omitted, the command refers to the innermost enclosing loop.
330 This command is normally used by programs that want to lie to themselves
331 about what was just input:
335 # a simpleminded Pascal comment stripper
336 # (warning: assumes no { or } in strings)
337 line: while (<stdin>) {
338 while (s|\|({.*}.*\|){.*}|$1 \||) {}
343 if (\|/\|}/\|) { # end of comment?
353 .Ip "rename(OLDNAME,NEWNAME)" 8 2
354 Changes the name of a file.
355 Returns 1 for success, 0 otherwise.
359 block at the end of a loop to clear variables and reset ?? searches
360 so that they work again.
361 The expression is interpreted as a list of single characters (hyphens allowed
363 All variables and arrays beginning with one of those letters are reset to
364 their pristine state.
365 If the expression is omitted, one-match searches (?pattern?) are reset to
372 reset 'X'; \h'|2i'# reset all X variables
373 reset 'a-z';\h'|2i'# reset lower case variables
374 reset; \h'|2i'# just reset ?? searches
377 Note: resetting "A-Z" is not recommended since you'll wipe out your ARGV and ENV
379 .Ip "s/PATTERN/REPLACEMENT/gi" 8 3
380 Searches a string for a pattern, and if found, replaces that pattern with the
381 replacement text and returns the number of substitutions made.
382 Otherwise it returns false (0).
The \*(L"g\*(R" is optional, and if present, indicates that all occurrences
384 of the pattern are to be replaced.
385 The \*(L"i\*(R" is also optional, and if present, indicates that matching
386 is to be done in a case-insensitive manner.
387 Any delimiter may replace the slashes; if single quotes are used, no
388 interpretation is done on the replacement string.
389 If no string is specified via the =~ or !~ operator,
390 the $_ string is searched and modified.
391 (The string specified with =~ must be a scalar variable, an array element,
392 or an assignment to one of those, i.e. an lvalue.)
393 If the pattern contains a $ that looks like a variable rather than an
394 end-of-string test, the variable will be interpolated into the pattern at
396 See also the section on regular expressions.
400 s/\|\e\|bgreen\e\|b/mauve/g; # don't change wintergreen
402 $path \|=~ \|s|\|/usr/bin|\|/usr/local/bin|;
404 s/Login: $foo/Login: $bar/; # run-time pattern
406 s/\|([^ \|]*\|) *\|([^ \|]*\|)\|/\|$2 $1/; # reverse 1st two fields
408 ($foo = $bar) =~ s/bar/foo/;
(Note the use of $ instead of \|\e\| in the next-to-last example. See the section
on regular expressions.)
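The return value and the copy-then-edit idiom can be sketched as follows. The strings and variable names are made up for illustration.

```perl
$text  = 'green wintergreen green';
$count = ($text =~ s/\bgreen\b/mauve/g);   # number of substitutions made
$none  = ($text =~ s/purple/pink/);        # no match: returns false
($copy = $text) =~ s/mauve/teal/;          # edits $copy, leaves $text alone
print "$count: $text / $copy\n";
```

Without the g modifier only the first occurrence is replaced, which is why $copy keeps its second "mauve".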
413 .Ip "seek(FILEHANDLE,POSITION,WHENCE)" 8 3
414 Randomly positions the file pointer for FILEHANDLE, just like the fseek()
416 FILEHANDLE may be an expression whose value gives the name of the filehandle.
417 Returns 1 upon success, 0 otherwise.
418 .Ip "select(FILEHANDLE)" 8 3
419 Sets the current default filehandle for output.
420 This has two effects: first, a
424 without a filehandle will default to this FILEHANDLE.
425 Second, references to variables related to output will refer to this output
427 For example, if you have to set the top of form format for more than
428 one output channel, you might do the following:
438 Select happens to return TRUE if the file is currently open and FALSE otherwise,
439 but this has no effect on its operation.
440 FILEHANDLE may be an expression whose value gives the name of the actual filehandle.
441 .Ip "shift(ARRAY)" 8 6
444 Shifts the first value of the array off and returns it,
445 shortening the array by 1 and moving everything down.
446 If ARRAY is omitted, shifts the ARGV array.
447 See also unshift(), push() and pop().
448 Shift() and unshift() do the same thing to the left end of an array that push()
449 and pop() do to the right end.
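A short sketch of the four operations working on the two ends of one array. The array contents are invented.

```perl
@queue = ('b', 'c');
unshift(@queue, 'a');        # left end grows:    a b c
$first = shift(@queue);      # left end shrinks:  b c, $first is 'a'
push(@queue, 'd');           # right end grows:   b c d
$last = pop(@queue);         # right end shrinks: b c, $last is 'd'
print "$first $last ", join(',', @queue), "\n";
```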
452 Causes the script to sleep for EXPR seconds, or forever if no EXPR.
May be interrupted by sending the process a SIGALRM.
454 Returns the number of seconds actually slept.
455 .Ip "sort SUBROUTINE LIST" 8 7
457 Sorts the LIST and returns the sorted array value.
458 Nonexistent values of arrays are stripped out.
459 If SUBROUTINE is omitted, sorts in standard string comparison order.
460 If SUBROUTINE is specified, gives the name of a subroutine that returns
461 a -1, 0, or 1, depending on how the elements of the array are to be ordered.
462 In the interests of efficiency the normal calling code for subroutines
463 is bypassed, with the following effects: the subroutine may not be a recursive
464 subroutine, and the two elements to be compared are passed into the subroutine
465 not via @_ but as $a and $b (see example below).
466 SUBROUTINE may be a scalar variable name, in which case the value provides
467 the name of the subroutine to use.
473 $age{$a} < $age{$b} ? -1 : $age{$a} > $age{$b} ? 1 : 0;
475 @sortedclass = sort byage @class;
478 sub reverse { $a lt $b ? 1 : $a gt $b ? -1 : 0; }
479 @harry = ('dog','cat','x','Cain','Abel');
480 @george = ('gone','chased','yz','Punished','Axed');
print sort @harry;
# prints AbelCaincatdogx
483 print sort reverse @harry;
484 # prints xdogcatCainAbel
485 print sort @george,'to',@harry;
486 # prints AbelAxedCainPunishedcatchaseddoggonetoxyz
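Because the default order is a string comparison, numbers need a comparison subroutine of their own. A sketch, with the subroutine name bynum and the data chosen only for illustration:

```perl
# compare $a and $b as numbers; returns -1, 0 or 1
sub bynum { $a < $b ? -1 : $a > $b ? 1 : 0; }

@nums = (10, 9, 100, 1);
@as_strings = sort @nums;          # 1 10 100 9
@as_numbers = sort bynum @nums;    # 1 9 10 100
print join(' ', @as_strings), "\n", join(' ', @as_numbers), "\n";
```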
489 .Ip "split(/PATTERN/,EXPR)" 8 8
490 .Ip "split(/PATTERN/)" 8
492 Splits a string into an array of strings, and returns it.
493 If EXPR is omitted, splits the $_ string.
494 If PATTERN is also omitted, splits on whitespace (/[\ \et\en]+/).
495 Anything matching PATTERN is taken to be a delimiter separating the fields.
496 (Note that the delimiter may be longer than one character.)
497 Trailing null fields are stripped, which potential users of pop() would
499 A pattern matching the null string (not to be confused with a null pattern)
500 will split the value of EXPR into separate characters at each point it
505 print join(':',split(/ */,'hi there'));
508 produces the output 'h:i:t:h:e:r:e'.
510 The pattern /PATTERN/ may be replaced with an expression to specify patterns
511 that vary at runtime.
512 As a special case, specifying a space ('\ ') will split on white space
513 just as split with no arguments does, but leading white space does NOT
514 produce a null first field.
515 Thus, split('\ ') can be used to emulate awk's default behavior, whereas
516 split(/\ /) will give you as many null initial fields as there are
523 open(passwd, '/etc/passwd');
526 ($login, $passwd, $uid, $gid, $gcos, $home, $shell) = split(\|/\|:\|/\|);
529 ($login, $passwd, $uid, $gid, $gcos, $home, $shell)
530 = split(\|/\|:\|/\|);
536 (Note that $shell above will still have a newline on it. See chop().)
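The difference between the two space patterns, and the stripping of trailing null fields, can be seen in a small sketch. The input strings are made up.

```perl
$line  = '  alpha beta';
@awk   = split(' ', $line);      # awk-like: leading white space ignored
@plain = split(/ /, $line);      # one null field per leading space
@trail = split(/:/, 'a:b::');    # trailing null fields are stripped
print scalar(@awk), ' ', scalar(@plain), ' ', scalar(@trail), "\n";   # 2 4 2
```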
539 .Ip "sprintf(FORMAT,LIST)" 8 4
540 Returns a string formatted by the usual printf conventions.
541 The * character is not supported.
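A sketch of a few common conversions with fixed widths, since the * width is unavailable. The values are arbitrary.

```perl
# left-justified string, right-justified integer,
# two-decimal float, zero-padded integer
$line = sprintf("%-8s|%5d|%6.2f|%03d", 'perl', 42, 3.14159, 7);
print "$line\n";             # prints perl    |   42|  3.14|007
$hexdigits = sprintf("%x", 255);   # ff
```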
543 Return the square root of EXPR.
544 .Ip "stat(FILEHANDLE)" 8 6
546 Returns a 13-element array giving the statistics for a file, either the file
547 opened via FILEHANDLE, or named by EXPR.
548 Typically used as follows:
552 ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
553 $atime,$mtime,$ctime,$blksize,$blocks)
557 .Ip "study(SCALAR)" 8 6
559 Takes extra time to study SCALAR ($_ if unspecified) in anticipation of
560 doing many pattern matches on the string before it is next modified.
561 This may or may not save time, depending on the nature and number of patterns
562 you are searching on\*(--you probably want to compare runtimes with and
563 without it to see which runs faster.
564 Those loops which scan for many short constant strings (including the constant
565 parts of more complex patterns) will benefit most.
For example, a loop which inserts index-producing entries before any line
567 containing a certain pattern:
573 print ".IX foo\en" if /\ebfoo\eb/;
574 print ".IX bar\en" if /\ebbar\eb/;
575 print ".IX blurfl\en" if /\ebblurfl\eb/;
581 .Ip "substr(EXPR,OFFSET,LEN)" 8 2
582 Extracts a substring out of EXPR and returns it.
583 First character is at offset 0, or whatever you've set $[ to.
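For instance, assuming $[ is left at 0 (the string is made up):

```perl
$s    = 'perl manual';
$head = substr($s, 0, 4);    # 'perl'
$word = substr($s, 5, 6);    # 'manual': 6 characters starting at offset 5
print "$head/$word\n";
```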
584 .Ip "system LIST" 8 6
585 Does exactly the same thing as \*(L"exec LIST\*(R" except that a fork
586 is done first, and the parent process waits for the child process to complete.
587 Note that argument processing varies depending on the number of arguments.
588 The return value is the exit status of the program as returned by the wait()
590 To get the actual exit value divide by 256.
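Decoding the status word might look like the sketch below. It assumes a Unix-like system with sh on the search path; the exit code 3 is arbitrary.

```perl
$status = system('sh', '-c', 'exit 3');
$exit   = $status >> 8;      # same as int($status / 256)
$signal = $status & 127;     # nonzero if the child died from a signal
print "exit=$exit signal=$signal\n";   # prints exit=3 signal=0
```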
592 .Ip "symlink(OLDFILE,NEWFILE)" 8 2
593 Creates a new filename symbolically linked to the old filename.
594 Returns 1 for success, 0 otherwise.
595 On systems that don't support symbolic links, produces a fatal error at
597 To check for that, use eval:
600 $symlink_exists = (eval 'symlink("","");', $@ eq '');
603 .Ip "tell(FILEHANDLE)" 8 6
605 Returns the current file position for FILEHANDLE.
606 FILEHANDLE may be an expression whose value gives the name of the actual
608 If FILEHANDLE is omitted, assumes the file last read.
Returns the number of seconds since 00:00:00 GMT, January 1, 1970.
611 Suitable for feeding to gmtime() and localtime().
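Feeding a fixed value to gmtime() gives a result that does not depend on the local timezone. In this sketch the value 0 stands in for what time would return, and the date layout is only one possible choice.

```perl
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime(0);
# $mon counts from 0 and $year is years since 1900
$stamp = sprintf("%02d/%02d/%02d %02d:%02d:%02d",
                 $mon + 1, $mday, $year, $hour, $min, $sec);
print "$stamp wday=$wday\n";   # prints 01/01/70 00:00:00 wday=4
```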
613 Returns a four-element array giving the user and system times, in seconds, for this
614 process and the children of this process.
616 ($user,$system,$cuser,$csystem) = times;
618 .Ip "tr/SEARCHLIST/REPLACEMENTLIST/" 8 5
619 .Ip "y/SEARCHLIST/REPLACEMENTLIST/" 8
Translates all occurrences of the characters found in the search list into
the corresponding characters in the replacement list.
622 It returns the number of characters replaced.
623 If no string is specified via the =~ or !~ operator,
624 the $_ string is translated.
625 (The string specified with =~ must be a scalar variable, an array element,
626 or an assignment to one of those, i.e. an lvalue.)
631 is provided as a synonym for
636 $ARGV[1] \|=~ \|y/A-Z/a-z/; \h'|3i'# canonicalize to lower case
638 $cnt = tr/*/*/; \h'|3i'# count the stars in $_
640 ($HOST = $host) =~ tr/a-z/A-Z/;
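The return value is what makes the counting idiom work. A sketch with invented strings:

```perl
$_ = 'a*b**c';
$stars = tr/*/*/;                     # 3; each * is replaced by itself
$word = 'banana';
$changed = ($word =~ tr/a/o/);        # 3 characters translated
($upper = 'hello') =~ tr/a-z/A-Z/;    # translate a copy
print "$stars $changed $word $upper\n";   # prints 3 3 bonono HELLO
```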
643 .Ip "umask(EXPR)" 8 3
644 Sets the umask for the process and returns the old one.
645 .Ip "unlink LIST" 8 2
646 Deletes a list of files.
647 Returns the number of files successfully deleted.
651 $cnt = unlink 'a','b','c';
655 Note: unlink will not delete directories unless you are superuser and the \-U
656 flag is supplied to perl.
658 .Ip "unshift(ARRAY,LIST)" 8 4
659 Does the opposite of a shift.
660 Or the opposite of a push, depending on how you look at it.
661 Prepends list to the front of the array, and returns the number of elements
665 unshift(ARGV,'-e') unless $ARGV[0] =~ /^-/;
669 Changes the access and modification times on each file of a list of files.
670 The first two elements of the list must be the NUMERICAL access and
671 modification times, in that order.
672 Returns the number of files successfully changed.
673 The inode modification time of each file is set to the current time.
674 Example of a "touch" command:
680 utime $now,$now,@ARGV;
683 .Ip "values(ASSOC_ARRAY)" 8 6
684 Returns a normal array consisting of all the values of the named associative
686 The values are returned in an apparently random order, but it is the same order
687 as either the keys() or each() function produces (given that the associative array
688 has not been modified).
689 See also keys() and each().
691 Waits for a child process to terminate and returns the pid of the deceased
693 The status is returned in $?.
694 .Ip "write(FILEHANDLE)" 8 6
697 Writes a formatted record (possibly multi-line) to the specified file,
698 using the format associated with that file.
By default the format for a file is the one having the same name as the
700 filehandle, but the format for the current output channel (see
702 may be set explicitly
703 by assigning the name of the format to the $~ variable.
705 Top of form processing is handled automatically:
706 if there is insufficient room on the current page for the formatted
707 record, the page is advanced, a special top-of-page format is used
708 to format the new page header, and then the record is written.
709 By default the top-of-page format is \*(L"top\*(R", but it
711 format of your choice by assigning the name to the $^ variable.
713 If FILEHANDLE is unspecified, output goes to the current default output channel,
714 which starts out as stdout but may be changed by the
717 If the FILEHANDLE is an EXPR, then the expression is evaluated and the
718 resulting string is used to look up the name of the FILEHANDLE at run time.
719 For more on formats, see the section on formats later on.
721 Perl operators have the following associativity and precedence:
724 nonassoc\h'|1i'print printf exec system sort
725 \h'1.5i'chmod chown kill unlink utime
734 nonassoc\h'|1i'== != eq ne
735 nonassoc\h'|1i'< > <= >= lt gt le ge
736 nonassoc\h'|1i'chdir die exit eval reset sleep
737 nonassoc\h'|1i'-r -w -x etc.
742 right\h'|1i'! ~ and unary minus
747 Actually, the precedence of list operators such as print, sort or chmod is
748 either very high or very low depending on whether you look at the left
side of the operator or the right side of it.
752 @ary = (1, 3, sort 4, 2);
753 print @ary; # prints 1324
755 the commas on the right of the sort are evaluated before the sort, but
756 the commas on the left are evaluated after.
757 In other words, list operators tend to gobble up all the arguments that
758 follow them, and then act like a simple term with regard to the preceding
761 A subroutine may be declared as follows:
768 Any arguments passed to the routine come in as array @_,
769 that is ($_[0], $_[1], .\|.\|.).
770 The return value of the subroutine is the value of the last expression
772 To create local variables see the "local" operator.
774 A subroutine is called using the
783 local($max) = pop(@_);
785 $max = $foo \|if \|$max < $foo;
791 $bestday = do MAX($mon,$tue,$wed,$thu,$fri);
796 # get a line, combining continuation lines
797 # that start with whitespace
799 $thisline = $lookahead;
800 line: while ($lookahead = <stdin>) {
801 if ($lookahead \|=~ \|/\|^[ \^\e\|t]\|/\|) {
802 $thisline \|.= \|$lookahead;
811 $lookahead = <stdin>; # get first line
812 while ($_ = get_line(\|)) {
819 Use array assignment to local list to name your formal arguments:
822 local($key,$value) = @_;
823 $foo{$key} = $value unless $foo{$key};
827 Subroutines may be called recursively.
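Recursion depends on each call having its own copy of its arguments, which local provides. The sketch below is an illustration only: the subroutine name fact is made up, and it uses the &name(...) call form accepted by later perl releases, which is an assumption about your interpreter. With the call syntax described in this section the calls would be written with do instead.

```perl
# factorial: each level of recursion gets its own $n
sub fact {
    local($n) = @_;
    $n <= 1 ? 1 : $n * &fact($n - 1);
}

$result = &fact(5);
print "$result\n";     # prints 120
```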
828 .Sh "Regular Expressions"
829 The patterns used in pattern matching are regular expressions such as
830 those supplied in the Version 8 regexp routines.
831 (In fact, the routines are derived from Henry Spencer's freely redistributable
832 reimplementation of the V8 routines.)
833 In addition, \ew matches an alphanumeric character (including "_") and \eW a nonalphanumeric.
834 Word boundaries may be matched by \eb, and non-boundaries by \eB.
835 A whitespace character is matched by \es, non-whitespace by \eS.
836 A numeric character is matched by \ed, non-numeric by \eD.
837 You may use \ew, \es and \ed within character classes.
838 Also, \en, \er, \ef, \et and \eNNN have their normal interpretations.
839 Within character classes \eb represents backspace rather than a word boundary.
840 The bracketing construct \|(\ .\|.\|.\ \|) may also be used, in which case \e<digit>
841 matches the digit'th substring, where digit can range from 1 to 9.
842 (Outside of patterns, use $ instead of \e in front of the digit.
843 The scope of $<digit> extends to the end of the enclosing BLOCK, or to
844 the next pattern match with subexpressions.)
845 $+ returns whatever the last bracket match matched.
846 $& returns the entire matched string.
847 ($0 normally returns the same thing, but don't depend on it.)
848 Alternatives may be separated by |.
852 s/\|^\|([^ \|]*\|) \|*([^ \|]*\|)\|/\|$2 $1\|/; # swap first two words
855 if (/\|Time: \|(.\|.\|):\|(.\|.\|):\|(.\|.\|)\|/\|) {
862 By default, the ^ character matches only the beginning of the string, and
864 does certain optimizations with the assumption that the string contains
866 You may, however, wish to treat a string as a multi-line buffer, such that
867 the ^ will match after any newline within the string.
868 At the cost of a little more overhead, you can do this by setting the variable
870 Setting it back to 0 makes
872 revert to its old behavior.
874 To facilitate multi-line substitutions, the . character never matches a newline.
875 In particular, the following leaves a newline on the $_ string:
879 s/.*(some_string).*/$1/;
881 If the newline is unwanted, try one of
883 s/.*(some_string).*\en/$1/;
884 s/.*(some_string)[^\000]*/$1/;
885 s/.*(some_string)(.|\en)*/$1/;
886 chop; s/.*(some_string).*/$1/;
887 /(some_string)/ && ($_ = $1);
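The effect is easy to see by comparing the plain substitution with the chop variant. A sketch with a made-up input line:

```perl
$_ = "xx some_string yy\n";
s/.*(some_string).*/$1/;
$kept = $_;                 # "some_string\n": . stopped at the newline

$_ = "xx some_string yy\n";
chop; s/.*(some_string).*/$1/;
$chopped = $_;              # "some_string"
print length($kept), ' ', length($chopped), "\n";   # prints 12 11
```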
891 Output record formats for use with the
operator may be declared as follows:
902 If name is omitted, format \*(L"stdout\*(R" is defined.
903 FORMLIST consists of a sequence of lines, each of which may be of one of three
908 A \*(L"picture\*(R" line giving the format for one output line.
910 An argument line supplying values to plug into a picture line.
912 Picture lines are printed exactly as they look, except for certain fields
913 that substitute values into the line.
914 Each picture field starts with either @ or ^.
915 The @ field (not to be confused with the array marker @) is the normal
916 case; ^ fields are used
917 to do rudimentary multi-line text block filling.
918 The length of the field is supplied by padding out the field
with multiple <, >, or | characters to specify, respectively, left justification,
920 right justification, or centering.
921 If any of the values supplied for these fields contains a newline, only
922 the text up to the newline is printed.
923 The special field @* can be used for printing multi-line values.
924 It should appear by itself on a line.
926 The values are specified on the following line, in the same order as
928 They must currently be either scalar variable names or literals (or
930 Currently you can separate values with spaces, but commas may be placed
931 between values to prepare for possible future versions in which full expressions
932 are allowed as values.
934 Picture fields that begin with ^ rather than @ are treated specially.
935 The value supplied must be a scalar variable name which contains a text
938 puts as much text as it can into the field, and then chops off the front
939 of the string so that the next time the variable is referenced,
940 more of the text can be printed.
941 Normally you would use a sequence of fields in a vertical stack to print
943 If you like, you can end the final field with .\|.\|., which will appear in the
944 output if the text was too long to appear in its entirety.
946 Since use of ^ fields can produce variable length records if the text to be
947 formatted is short, you can suppress blank lines by putting the tilde (~)
948 character anywhere in the line.
949 (Normally you should put it in the front if possible.)
950 The tilde will be translated to a space upon output.
958 # a report on the /etc/passwd file
961 Name Login Office Uid Gid Home
962 ------------------------------------------------------------------
965 @<<<<<<<<<<<<<<<<<< @||||||| @<<<<<<@>>>> @>>>> @<<<<<<<<<<<<<<<<<
966 $name $login $office $uid $gid $home
970 # a report from a bug report form
973 @<<<<<<<<<<<<<<<<<<<<<<< @||| @>>>>>>>>>>>>>>>>>>>>>>>
975 ------------------------------------------------------------------
978 Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
980 Index: @<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
981 \& $index $description
982 Priority: @<<<<<<<<<< Date: @<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
983 \& $priority $date $description
984 From: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
985 \& $from $description
986 Assigned to: @<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
987 \& $programmer $description
988 \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
990 \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
992 \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
994 \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
996 \&~ ^<<<<<<<<<<<<<<<<<<<<<<<...
1002 It is possible to intermix prints with writes on the same output channel,
1003 but you'll have to handle $\- (lines left on the page) yourself.
1006 If you are printing lots of fields that are usually blank, you should consider
1007 using the reset operator between records.
1008 Not only is it more efficient, but it can prevent the bug of adding another
1009 field and forgetting to zero it.
1010 .Sh "Predefined Names"
1011 The following names have special meaning to
1013 I could have used alphabetic symbols for some of these, but I didn't want
1014 to take the chance that someone would say reset "a-zA-Z" and wipe them all
1016 You'll just have to suffer along with these silly symbols.
1017 Most of them have reasonable mnemonics, or analogues in one of the shells.
1019 The default input and pattern-searching space.
1020 The following pairs are equivalent:
1024 while (<>) {\|.\|.\|. # only equivalent in while!
1025 while ($_ = <>) {\|.\|.\|.
1029 $_ \|=~ \|/\|^Subject:/
1040 (Mnemonic: underline is understood in certain operations.)
1042 The current input line number of the last filehandle that was read.
1044 Remember that only an explicit close on the filehandle resets the line number.
1045 Since <> never does an explicit close, line numbers increase across ARGV files
1046 (but see examples under eof).
1047 (Mnemonic: many programs use . to mean the current line number.)
1049 The input record separator, newline by default.
1050 Works like awk's RS variable, including treating blank lines as delimiters
1051 if set to the null string.
1052 If set to a value longer than one character, only the first character is used.
1053 (Mnemonic: / is used to delimit line boundaries when quoting poetry.)
1055 The output field separator for the print operator.
1056 Ordinarily the print operator simply prints out the comma separated fields
1058 In order to get behavior more like awk, set this variable as you would set
1059 awk's OFS variable to specify what is printed between fields.
1060 (Mnemonic: what is printed when there is a , in your print statement.)
1062 The output record separator for the print operator.
1063 Ordinarily the print operator simply prints out the comma separated fields
1064 you specify, with no trailing newline or record separator assumed.
1065 In order to get behavior more like awk, set this variable as you would set
1066 awk's ORS variable to specify what is printed at the end of the print.
1067 (Mnemonic: you set $\e instead of adding \en at the end of the print.
1068 Also, it's just like /, but it's what you get \*(L"back\*(R" from perl.)
1070 The output format for printed numbers.
1071 This variable is a half-hearted attempt to emulate awk's OFMT variable.
1072 There are times, however, when awk and perl have differing notions of what
1074 Also, the initial value is %.20g rather than %.6g, so you need to set $#
1075 explicitly to get awk's value.
1076 (Mnemonic: # is the number sign.)
1078 The current page number of the currently selected output channel.
1079 (Mnemonic: % is page number in nroff.)
1081 The current page length (printable lines) of the currently selected output
1084 (Mnemonic: = has horizontal lines.)
1086 The number of lines left on the page of the currently selected output channel.
1087 (Mnemonic: lines_on_page - lines_printed.)
1089 The name of the current report format for the currently selected output
1091 (Mnemonic: brother to $^.)
1093 The name of the current top-of-page format for the currently selected output
1095 (Mnemonic: points to top of page.)
1097 If set to nonzero, forces a flush after every write or print on the currently
1098 selected output channel.
1100 Note that stdout will typically be line buffered if output is to the
1101 terminal and block buffered otherwise.
1102 Setting this variable is useful primarily when you are outputting to a pipe,
1103 such as when you are running a perl script under rsh and want to see the
1104 output as it's happening.
1105 (Mnemonic: when you want your pipes to be piping hot.)
1107 The process number of the perl interpreter
1109 running this script.
1110 (Mnemonic: same as shells.)
1112 The status returned by the last backtick (``) command or system operator.
1113 Note that this is the status word returned by the wait() system
1114 call, so the exit value of the subprocess is actually ($? >> 8).
1115 $? & 255 gives which signal, if any, the process died from, and whether
1116 there was a core dump.
1117 (Mnemonic: similar to sh and ksh.)
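The arithmetic above can be sketched as a small helper. It assumes a modern Perl 5 interpreter, and decode_status is a hypothetical name. Splitting the low byte into a signal number (low 7 bits) and a core-dump flag (bit 128) follows the conventional Unix wait() status layout, which the text above implies but does not spell out.

```perl
# Hypothetical helper: break a wait() status word into its parts.
sub decode_status {
    my ($status) = @_;
    my $exit   = $status >> 8;              # exit value of the subprocess
    my $signal = $status & 127;             # signal that killed it, if any
    my $core   = ($status & 128) ? 1 : 0;   # assumed core-dump flag bit
    return ($exit, $signal, $core);
}

my ($exit, $signal, $core) = decode_status(256);
print "exit=$exit signal=$signal core=$core\n";   # exit=1 signal=0 core=0
```

In a script you would pass $? to the helper right after a backtick command or system call.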
1119 The string matched by the last pattern match.
1120 (Mnemonic: like & in some editors.)
1122 The last bracket matched by the last search pattern.
1123 This is useful if you don't know which of a set of alternative patterns
1128 /Version: \|(.*\|)|Revision: \|(.*\|)\|/ \|&& \|($rev = $+);
1131 (Mnemonic: be positive and forward looking.)
1133 Set to 1 to do multiline matching within a string, 0 to assume strings contain
1136 (Mnemonic: * matches multiple things.)
1138 Contains the name of the file containing the perl
1140 script being executed.
1141 The value should be copied elsewhere before any pattern matching happens, which
1143 (Mnemonic: same as sh and ksh.)
1145 Contains the subpattern from the corresponding set of parentheses in the last
1146 pattern matched, not counting patterns matched in nested blocks that have
1147 been exited already.
1148 (Mnemonic: like \edigit.)
1150 The index of the first element in an array, and of the first character in
1152 Default is 0, but you could set it to 1 to make
1157 when subscripting and when evaluating the index() and substr() functions.
1158 (Mnemonic: [ begins subscripts.)
1160 If used in a numeric context, yields the current value of errno, with all the
1162 If used in a string context, yields the corresponding system error string.
1163 You can assign to $! in order to set errno
1164 if, for instance, you want $! to return the string for error n, or you want
1165 to set the exit value for the die operator.
1166 (Mnemonic: What just went bang?)
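A minimal sketch of the two contexts, assuming a modern Perl 5 interpreter. The helper name errno_pair is hypothetical, and the message text depends on the system, so none is shown.

```perl
# Hypothetical helper: return errno n as a number and as its message.
sub errno_pair {
    my ($n) = @_;
    local $! = $n;              # assigning to $! sets errno
    return ($! + 0, "$!");      # numeric context, then string context
}

my ($num, $msg) = errno_pair(2);
print "$num: $msg\n";
```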
1168 The error message from the last eval command.
1169 If null, the last eval parsed and executed correctly.
1170 (Mnemonic: Where was the syntax error "at"?)
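As an illustration, the following checks $@ after a good and a bad eval. It assumes a modern Perl 5 interpreter, and try_eval is a hypothetical name.

```perl
# Hypothetical helper: evaluate a string and report the value and $@.
sub try_eval {
    my ($code) = @_;
    my $value = eval $code;     # undefined if the code failed
    return ($value, $@);        # $@ is null when the eval succeeded
}

my ($v1, $e1) = try_eval('1 + 2');   # parses and runs
my ($v2, $e2) = try_eval('1 +');     # deliberate syntax error
print "first: $v1\n";
print "second failed: $e2" if $e2;
```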
1172 The real uid of this process.
1173 (Mnemonic: it's the uid you came FROM, if you're running setuid.)
1175 The effective uid of this process.
1179 $< = $>; # set real uid to the effective uid
1182 (Mnemonic: it's the uid you went TO, if you're running setuid.)
1184 The real gid of this process.
1185 If you are on a machine that supports membership in multiple groups
1186 simultaneously, gives a space separated list of groups you are in.
1187 The first number is the one returned by getgid(), and the subsequent ones
1188 by getgroups(), one of which may be the same as the first number.
1189 (Mnemonic: parens are used to GROUP things.
1190 The real gid is the group you LEFT, if you're running setgid.)
1192 The effective gid of this process.
1193 If you are on a machine that supports membership in multiple groups
1194 simultaneously, gives a space separated list of groups you are in.
1195 The first number is the one returned by getegid(), and the subsequent ones
1196 by getgroups(), one of which may be the same as the first number.
1197 (Mnemonic: parens are used to GROUP things.
1198 The effective gid is the group that's RIGHT for you, if you're running setgid.)
1200 Note: $<, $>, $( and $) can only be set on machines that support the
1201 corresponding set[re][ug]id() routine.
1203 The array ARGV contains the command line arguments intended for the script.
1204 Note that $#ARGV is generally the number of arguments minus one, since
1205 $ARGV[0] is the first argument, NOT the command name.
1206 See $0 for the command name.
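A short sketch of the off-by-one relationship, assuming a modern Perl 5 interpreter and the default array base of 0. The helper arg_summary and the file names are hypothetical; the helper substitutes its own arguments for a real command line.

```perl
# Hypothetical helper: pretend its arguments are the command line.
sub arg_summary {
    local @ARGV = @_;               # stand in for real arguments
    my $count = $#ARGV + 1;         # $#ARGV is the last index, so add one
    my $first = $count ? $ARGV[0] : '';
    return ($count, $first);
}

my ($count, $first) = arg_summary('input.txt', 'output.txt');
print "$count arguments, first is $first\n";
```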
1208 The array INC contains the list of places to look for perl scripts to be
1209 evaluated by the "do EXPR" command.
1210 It initially consists of the arguments to any -I command line switches, followed
1211 by the default perl library, probably "/usr/local/lib/perl".
1213 The associative array ENV contains your current environment.
1214 Setting a value in ENV changes the environment for child processes.
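The effect on child processes can be sketched as follows, assuming a modern Perl 5 interpreter and a Unix shell behind the backtick operator. GREETING is a hypothetical variable name.

```perl
$ENV{'GREETING'} = 'hello';       # hypothetical variable name
my $seen = `echo \$GREETING`;     # the child shell expands $GREETING
chomp($seen);
print "child saw: $seen\n";
```

The backslash keeps perl from interpolating $GREETING itself, so the value really comes from the child's environment.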
1216 The associative array SIG is used to set signal handlers for various signals.
1221 sub handler { # 1st argument is signal name
1223 print "Caught a SIG$sig--shutting down\en";
1228 $SIG{'INT'} = 'handler';
1229 $SIG{'QUIT'} = 'handler';
1231 $SIG{'INT'} = 'DEFAULT'; # restore default action
1232 $SIG{'QUIT'} = 'IGNORE'; # ignore SIGQUIT
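A self-contained variant of the handler setup, as a sketch only. It assumes a modern Perl 5 interpreter on a Unix system that provides SIGUSR1; note_signal and $caught are hypothetical names.

```perl
my $caught = '';

sub note_signal {
    my ($sig) = @_;             # 1st argument is the signal name
    $caught = $sig;
}

$SIG{'USR1'} = 'note_signal';   # handler given by subroutine name
kill 'USR1', $$;                # send SIGUSR1 to this process
sleep 1 unless $caught;         # allow a moment for delivery if needed
$SIG{'USR1'} = 'DEFAULT';       # restore default action
print "caught SIG$caught\n";
```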
1237 Perl currently uses no environment variables, except to make them available
1238 to the script being executed, and to child processes.
1239 However, scripts running setuid would do well to execute the following lines
1240 before doing anything else, just to keep people honest:
1244 $ENV{'PATH'} = '/bin:/usr/bin'; # or whatever you need
1245 $ENV{'SHELL'} = '/bin/sh' if $ENV{'SHELL'};
1246 $ENV{'IFS'} = '' if $ENV{'IFS'};
1250 Larry Wall <lwall@jpl-devvax.Jpl.Nasa.Gov>
1252 /tmp/perl\-eXXXXXX temporary file for
1256 a2p awk to perl translator
1258 s2p sed to perl translator
1260 perldb interactive perl debugger
1262 Compilation errors will tell you the line number of the error, with an
1263 indication of the next token or token type that was to be examined.
1264 (In the case of a script passed to
1270 is counted as one line.)
1272 Accustomed awk users should take special note of the following:
1274 Semicolons are required after all simple statements in perl. Newline
1275 is not a statement delimiter.
1277 Curly brackets are required on ifs and whiles.
1279 Variables begin with $ or @ in perl.
1281 Arrays index from 0 unless you set $[.
1282 Likewise string positions in substr() and index().
1284 You have to decide whether your array has numeric or string indices.
1286 Associative array values do not spring into existence upon mere reference.
1288 You have to decide whether you want to use string or numeric comparisons.
1290 Reading an input line does not split it for you. You get to split it yourself
1292 And split has different arguments.
1294 The current input line is normally in $_, not $0.
1295 It generally does not have the newline stripped.
1296 ($0 is initially the name of the program executed, then the last matched
1299 The current filename is $ARGV, not $FILENAME.
1300 NR, RS, ORS, OFS, and OFMT have equivalents with other symbols.
1301 FS doesn't have an equivalent, since you have to be explicit about
1304 $<digit> does not refer to fields--it refers to substrings matched by the last
1307 The print statement does not add field and record separators unless you set
1310 You must open your files before you print to them.
1312 The range operator is \*(L"..\*(R", not comma.
1313 (The comma operator works as in C.)
1315 The match operator is \*(L"=~\*(R", not \*(L"~\*(R".
1316 (\*(L"~\*(R" is the one's complement operator.)
1318 The concatenation operator is \*(L".\*(R", not the null string.
1319 (Using the null string would render \*(L"/pat/ /pat/\*(R" unparseable,
1320 since the third slash would be interpreted as a division operator\*(--the
1321 tokener is in fact slightly context sensitive for operators like /, ?, and <.
1322 And in fact, . itself can be the beginning of a number.)
1324 Next, exit, and continue work differently.
1326 When in doubt, run the awk construct through a2p and see what it gives you.
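Several of the points above (explicit splitting, 0-based arrays, the retained newline) show up in a rough equivalent of awk's "print $2". This assumes a modern Perl 5 interpreter; second_field is a hypothetical name.

```perl
# Hypothetical helper: return what awk would call $2.
sub second_field {
    my ($line) = @_;
    chomp($line);                       # the newline is not stripped for you
    my @fields = split(' ', $line);     # split on whitespace yourself
    return $fields[1];                  # arrays index from 0
}

print second_field("alpha beta gamma\n"), "\n";   # beta
```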
1328 Cerebral C programmers should take note of the following:
1330 Curly brackets are required on ifs and whiles.
1332 You should use \*(L"elsif\*(R" rather than \*(L"else if\*(R".
1334 Break and continue become last and next, respectively.
1336 There's no switch statement.
1338 Variables begin with $ or @ in perl.
1340 Printf does not implement *.
1342 Comments begin with #, not /*.
1344 You can't take the address of anything.
1346 ARGV must be capitalized.
1348 The \*(L"system\*(R" calls link, unlink, rename, etc. return nonzero for success, not 0.
1350 Signal handlers deal with signal names, not numbers.
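The return-value convention for the "system" calls can be sketched with unlink. This assumes a modern Perl 5 interpreter and a writable /tmp; the scratch file name is hypothetical.

```perl
my $path = "/tmp/perl_trap_demo.$$";    # hypothetical scratch file
open(my $fh, '>', $path) or die "cannot create $path: $!";
close($fh);

my $removed_first = unlink($path);      # nonzero: the file was removed
my $removed_again = unlink($path);      # 0: nothing left to remove
print "first=$removed_first again=$removed_again\n";
```

A C programmer's habit of testing for a 0 return would read both results backwards.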
1352 Seasoned sed programmers should take note of the following:
1354 Backreferences in substitutions use $ rather than \e.
1356 The pattern matching metacharacters (, ), and | do not have backslashes in front.
1358 The range operator is .. rather than comma.
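The first two points can be seen in one substitution, sketched here assuming a modern Perl 5 interpreter. The helper swap_words is a hypothetical name.

```perl
# Hypothetical helper: swap the first two words of a string.
sub swap_words {
    my ($s) = @_;
    # sed would write this as s/\(\w*\) \(\w*\)/\2 \1/
    $s =~ s/(\w+) (\w+)/$2 $1/;     # unescaped parens, $ backreferences
    return $s;
}

print swap_words("hello world"), "\n";   # world hello
```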
1360 Sharp shell programmers should take note of the following:
1362 The backtick operator does variable interpretation without regard to the
1363 presence of single quotes in the command.
1365 The backtick operator does no translation of the return value, unlike csh.
1367 Shells (especially csh) do several levels of substitution on each command line.
1368 Perl does substitution only in certain constructs such as double quotes,
1369 backticks, angle brackets and search patterns.
1371 Shells interpret scripts a little bit at a time.
1372 Perl compiles the whole program before executing it.
1374 The arguments are available via @ARGV, not $1, $2, etc.
1376 The environment is not automatically made available as variables.
1379 You can't currently dereference arrays or array elements inside a
1380 double-quoted string.
1381 You must assign them to a scalar and interpolate that.
1383 Associative arrays really ought to be first class objects.
1385 Perl is at the mercy of the C compiler's definitions of various operations
1386 such as % and atof().
1387 In particular, don't trust % on negative numbers.
1390 Perl actually stands for Pathologically Eclectic Rubbish Lister, but don't tell