Commit | Line | Data |
8d063cd8 |
1 | ''' Beginning of part 2 |
13281fa4 |
2 | ''' $Header: perl.man.2,v 2.0.1.1 88/06/28 16:31:49 root Exp $ |
8d063cd8 |
3 | ''' |
4 | ''' $Log: perl.man.2,v $ |
13281fa4 |
5 | ''' Revision 2.0.1.1 88/06/28 16:31:49 root |
6 | ''' patch1: fixed some quotes |
7 | ''' patch1: clarified semantics of study |
8 | ''' patch1: added example of y with short second string |
9 | ''' patch1: added example of unlink with <*> |
10 | ''' |
378cc40b |
11 | ''' Revision 2.0 88/06/05 00:09:30 root |
12 | ''' Baseline version 2.0. |
8d063cd8 |
13 | ''' |
14 | ''' |
15 | .Ip "goto LABEL" 8 6 |
16 | Finds the statement labeled with LABEL and resumes execution there. |
17 | Currently you may only go to statements in the main body of the program |
18 | that are not nested inside a do {} construct. |
19 | This statement is not implemented very efficiently, and is here only to make |
20 | the sed-to-perl translator easier. |
21 | Use at your own risk. |
22 | .Ip "hex(EXPR)" 8 2 |
23 | Returns the decimal value of EXPR interpreted as a hex string. |
24 | (To interpret strings that might start with 0 or 0x see oct().) |
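For instance (a minimal sketch; the literal and the variable name are purely illustrative):
.nf

$mask = hex('ff');    # $mask is now 255

.fi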
25 | .Ip "index(STR,SUBSTR)" 8 4 |
26 | Returns the position of SUBSTR in STR, based at 0, or whatever you've |
27 | set the $[ variable to. |
28 | If the substring is not found, returns one less than the base, ordinarily -1. |
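A short illustration, assuming $[ has been left at its default of 0
(the strings are made up for the example):
.nf

.ne 2
$pos = index('hello world', 'world');    # 6
$pos = index('hello world', 'xyzzy');    # -1, not found

.fi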
29 | .Ip "int(EXPR)" 8 3 |
30 | Returns the integer portion of EXPR. |
31 | .Ip "join(EXPR,LIST)" 8 8 |
32 | .Ip "join(EXPR,ARRAY)" 8 |
33 | Joins the separate strings of LIST or ARRAY into a single string with fields |
34 | separated by the value of EXPR, and returns the string. |
35 | Example: |
36 | .nf |
37 | |
38 | $_ = join(\|':', $login,$passwd,$uid,$gid,$gcos,$home,$shell); |
39 | |
40 | .fi |
41 | See |
42 | .IR split . |
43 | .Ip "keys(ASSOC_ARRAY)" 8 6 |
44 | Returns a normal array consisting of all the keys of the named associative |
45 | array. |
46 | The keys are returned in an apparently random order, but it is the same order |
47 | as either the values() or each() function produces (given that the associative array |
48 | has not been modified). |
49 | Here is yet another way to print your environment: |
50 | .nf |
51 | |
52 | .ne 5 |
53 | @keys = keys(ENV); |
54 | @values = values(ENV); |
55 | while ($#keys >= 0) { |
378cc40b |
56 | print pop(keys),'=',pop(values),"\en"; |
57 | } |
58 | |
59 | or how about sorted by key: |
60 | |
61 | .ne 3 |
62 | foreach $key (sort keys(ENV)) { |
63 | print $key,'=',$ENV{$key},"\en"; |
8d063cd8 |
64 | } |
65 | |
66 | .fi |
67 | .Ip "kill LIST" 8 2 |
68 | Sends a signal to a list of processes. |
69 | The first element of the list must be the (numerical) signal to send. |
8d063cd8 |
70 | Returns the number of processes successfully signaled. |
8d063cd8 |
71 | .nf |
72 | |
378cc40b |
73 | $cnt = kill 1,$child1,$child2; |
74 | kill 9,@goners; |
8d063cd8 |
75 | |
76 | .fi |
378cc40b |
77 | If the signal is negative, kills process groups instead of processes. |
78 | (On System V, a negative \fIprocess\fR number will also kill process groups, |
79 | but that's not portable.) |
8d063cd8 |
80 | .Ip "last LABEL" 8 8 |
81 | .Ip "last" 8 |
82 | The |
83 | .I last |
84 | command is like the |
85 | .I break |
86 | statement in C (as used in loops); it immediately exits the loop in question. |
87 | If the LABEL is omitted, the command refers to the innermost enclosing loop. |
88 | The |
89 | .I continue |
90 | block, if any, is not executed: |
91 | .nf |
92 | |
93 | .ne 4 |
94 | line: while (<stdin>) { |
95 | last line if /\|^$/; # exit when done with header |
96 | .\|.\|. |
97 | } |
98 | |
99 | .fi |
378cc40b |
100 | .Ip "length(EXPR)" 8 2 |
101 | Returns the length in characters of the value of EXPR. |
102 | .Ip "link(OLDFILE,NEWFILE)" 8 2 |
103 | Creates a new filename linked to the old filename. |
104 | Returns 1 for success, 0 otherwise. |
105 | .Ip "local(LIST)" 8 4 |
106 | Declares the listed (scalar) variables to be local to the enclosing block, |
107 | subroutine or eval. |
13281fa4 |
108 | (The \*(L"do 'filename';\*(R" operator also counts as an eval.) |
378cc40b |
109 | This operator works by saving the current values of those variables in LIST |
110 | on a hidden stack and restoring them upon exiting the block, subroutine or eval. |
111 | The LIST may be assigned to if desired, which allows you to initialize |
112 | your local variables. |
113 | Commonly this is used to name the parameters to a subroutine. |
114 | Examples: |
115 | .nf |
116 | |
117 | .ne 13 |
118 | sub RANGEVAL { |
119 | local($min, $max, $thunk) = @_; |
120 | local($result) = ''; |
121 | local($i); |
122 | |
123 | # Presumably $thunk makes reference to $i |
124 | |
125 | for ($i = $min; $i < $max; $i++) { |
126 | $result .= eval $thunk; |
127 | } |
128 | |
129 | $result; |
130 | } |
131 | |
132 | .fi |
8d063cd8 |
133 | .Ip "localtime(EXPR)" 8 4 |
134 | Converts a time as returned by the time function to a 9-element array with |
135 | the time analyzed for the local timezone. |
136 | Typically used as follows: |
137 | .nf |
138 | |
139 | .ne 3 |
140 | ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) |
141 | = localtime(time); |
142 | |
143 | .fi |
378cc40b |
144 | All array elements are numeric, and come straight out of a struct tm. |
145 | In particular this means that $mon has the range 0..11 and $wday has the |
146 | range 0..6. |
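Because $mon and $wday count from 0, they can index a table of names directly.
A sketch, assuming $[ is still 0 (the @months and @days arrays are illustrative,
not built in; day 0 of a struct tm is Sunday):
.nf

.ne 7
@months = ('Jan','Feb','Mar','Apr','May','Jun',
           'Jul','Aug','Sep','Oct','Nov','Dec');
@days = ('Sun','Mon','Tue','Wed','Thu','Fri','Sat');
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst)
    = localtime(time);
print $days[$wday],' ',$months[$mon],' ',$mday,"\en";

.fi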
8d063cd8 |
147 | .Ip "log(EXPR)" 8 3 |
148 | Returns logarithm (base e) of EXPR. |
149 | .Ip "next LABEL" 8 8 |
150 | .Ip "next" 8 |
151 | The |
152 | .I next |
153 | command is like the |
154 | .I continue |
155 | statement in C; it starts the next iteration of the loop: |
156 | .nf |
157 | |
158 | .ne 4 |
159 | line: while (<stdin>) { |
160 | next line if /\|^#/; # discard comments |
161 | .\|.\|. |
162 | } |
163 | |
164 | .fi |
165 | Note that if there were a |
166 | .I continue |
167 | block on the above, it would get executed even on discarded lines. |
168 | If the LABEL is omitted, the command refers to the innermost enclosing loop. |
8d063cd8 |
169 | .Ip "oct(EXPR)" 8 2 |
170 | Returns the decimal value of EXPR interpreted as an octal string. |
171 | (If EXPR happens to start off with 0x, interprets it as a hex string instead.) |
172 | The following will handle decimal, octal and hex in the standard notation: |
173 | .nf |
174 | |
175 | $val = oct($val) if $val =~ /^0/; |
176 | |
177 | .fi |
178 | .Ip "open(FILEHANDLE,EXPR)" 8 8 |
179 | .Ip "open(FILEHANDLE)" 8 |
180 | .Ip "open FILEHANDLE" 8 |
181 | Opens the file whose filename is given by EXPR, and associates it with |
182 | FILEHANDLE. |
378cc40b |
183 | If FILEHANDLE is an expression, its value is used as the name of the |
184 | real filehandle wanted. |
185 | If EXPR is omitted, the scalar variable of the same name as the FILEHANDLE |
8d063cd8 |
186 | contains the filename. |
187 | If the filename begins with \*(L">\*(R", the file is opened for output. |
188 | If the filename begins with \*(L">>\*(R", the file is opened for appending. |
189 | If the filename begins with \*(L"|\*(R", the filename is interpreted |
190 | as a command to which output is to be piped, and if the filename ends |
191 | with a \*(L"|\*(R", the filename is interpreted as a command which pipes |
192 | input to us. |
193 | (You may not have a command that pipes both in and out.) |
83b4785a |
194 | Opening '\-' opens stdin and opening '>\-' opens stdout. |
8d063cd8 |
195 | Open returns 1 upon success, '' otherwise. |
196 | Examples: |
197 | .nf |
198 | |
199 | .ne 3 |
378cc40b |
200 | $article = 100; |
201 | open article || die "Can't find article $article"; |
202 | while (<article>) {\|.\|.\|. |
203 | |
204 | open(LOG, '>>/usr/spool/news/twitlog'\|); # (log is reserved) |
8d063cd8 |
205 | |
378cc40b |
206 | open(article, "caesar <$article |"\|); # decrypt article |
8d063cd8 |
207 | |
378cc40b |
208 | open(extract, "|sort >/tmp/Tmp$$"\|); # $$ is our process# |
8d063cd8 |
209 | |
378cc40b |
210 | .ne 7 |
211 | # process argument list of files along with any includes |
212 | |
213 | foreach $file (@ARGV) { |
214 | do process($file,'fh00'); # no pun intended |
215 | } |
216 | |
217 | sub process {{ |
218 | local($filename,$input) = @_; |
219 | $input++; # this is a string increment |
220 | unless (open($input,$filename)) { |
221 | print stderr "Can't open $filename\en"; |
222 | last; # note block inside sub |
223 | } |
224 | while (<$input>) { # note the use of indirection |
225 | if (/^#include "(.*)"/) { |
226 | do process($1,$input); |
227 | next; |
228 | } |
229 | .\|.\|. # whatever |
230 | } |
231 | }} |
232 | |
233 | .fi |
234 | You may also, in the Bourne shell tradition, specify an EXPR beginning |
13281fa4 |
235 | with \*(L">&\*(R", in which case the rest of the string |
378cc40b |
236 | is interpreted as the name of a filehandle |
237 | (or file descriptor, if numeric) which is to be duped and opened. |
238 | Here is a script that saves, redirects, and restores stdout and stderr: |
239 | .nf |
240 | |
241 | .ne 21 |
242 | #!/usr/bin/perl |
243 | open(saveout,">&stdout"); |
244 | open(saveerr,">&stderr"); |
245 | |
246 | open(stdout,">foo.out") || die "Can't redirect stdout"; |
247 | open(stderr,">&stdout") || die "Can't dup stdout"; |
248 | |
249 | select(stderr); $| = 1; # make unbuffered |
250 | select(stdout); $| = 1; # make unbuffered |
251 | |
252 | print stdout "stdout 1\en"; # this works for |
253 | print stderr "stderr 1\en"; # subprocesses too |
254 | |
255 | close(stdout); |
256 | close(stderr); |
257 | |
258 | open(stdout,">&saveout"); |
259 | open(stderr,">&saveerr"); |
260 | |
261 | print stdout "stdout 2\en"; |
262 | print stderr "stderr 2\en"; |
8d063cd8 |
263 | |
264 | .fi |
13281fa4 |
265 | If you open a pipe on the command \*(L"-\*(R", i.e. either \*(L"|-\*(R" or \*(L"-|\*(R", |
378cc40b |
266 | then there is an implicit fork done, and the return value of open |
267 | is the pid of the child within the parent process, and 0 within the child |
268 | process. |
269 | The filehandle behaves normally for the parent, but i/o to that |
270 | filehandle is piped from/to the stdout/stdin of the child process. |
271 | In the child process the filehandle isn't opened--i/o happens from/to |
272 | the new stdout or stdin. |
273 | Typically this is used like the normal piped open when you want to exercise |
274 | more control over just how the pipe command gets executed, such as when |
275 | you are running setuid, and don't want to have to scan shell commands |
276 | for metacharacters. |
277 | The following pairs are equivalent: |
278 | .nf |
279 | |
280 | .ne 5 |
281 | open(FOO,"|tr '[a-z]' '[A-Z]'"); |
282 | open(FOO,"|-") || exec 'tr', '[a-z]', '[A-Z]'; |
283 | |
284 | open(FOO,"cat -n $file|"); |
285 | open(FOO,"-|") || exec 'cat', '-n', $file; |
286 | |
287 | .fi |
288 | Explicitly closing the filehandle causes the parent process to wait for the |
289 | child to finish, and returns the status value in $?. |
8d063cd8 |
290 | .Ip "ord(EXPR)" 8 3 |
291 | Returns the ASCII value of the first character of EXPR. |
292 | .Ip "pop ARRAY" 8 6 |
293 | .Ip "pop(ARRAY)" 8 |
294 | Pops and returns the last value of the array, shortening the array by 1. |
378cc40b |
295 | Has the same effect as |
296 | .nf |
297 | |
298 | $tmp = $ARRAY[$#ARRAY]; $#ARRAY--; |
299 | |
300 | .fi |
8d063cd8 |
301 | .Ip "print FILEHANDLE LIST" 8 9 |
302 | .Ip "print LIST" 8 |
303 | .Ip "print" 8 |
378cc40b |
304 | Prints a string or a comma-separated list of strings. |
305 | FILEHANDLE may be a scalar variable name, in which case the variable contains |
306 | the name of the filehandle, thus introducing one level of indirection. |
8d063cd8 |
307 | If FILEHANDLE is omitted, prints by default to standard output (or to the |
308 | last selected output channel\*(--see select()). |
309 | If LIST is also omitted, prints $_ to stdout. |
8d063cd8 |
310 | To set the default output channel to something other than stdout use the select operation. |
311 | .Ip "printf FILEHANDLE LIST" 8 9 |
312 | .Ip "printf LIST" 8 |
13281fa4 |
313 | Equivalent to a \*(L"print FILEHANDLE sprintf(LIST)\*(R". |
378cc40b |
314 | .Ip "push(ARRAY,LIST)" 8 7 |
315 | Treats ARRAY (@ is optional) as a stack, and pushes the values of LIST |
8d063cd8 |
316 | onto the end of ARRAY. |
378cc40b |
317 | The length of ARRAY increases by the length of LIST. |
8d063cd8 |
318 | Has the same effect as |
319 | .nf |
320 | |
378cc40b |
321 | for $value (LIST) { |
322 | $ARRAY[$#ARRAY+1] = $value; |
323 | } |
8d063cd8 |
324 | |
325 | .fi |
326 | but is more efficient. |
327 | .Ip "redo LABEL" 8 8 |
328 | .Ip "redo" 8 |
329 | The |
330 | .I redo |
331 | command restarts the loop block without evaluating the conditional again. |
332 | The |
333 | .I continue |
334 | block, if any, is not executed. |
335 | If the LABEL is omitted, the command refers to the innermost enclosing loop. |
336 | This command is normally used by programs that want to lie to themselves |
337 | about what was just input: |
338 | .nf |
339 | |
340 | .ne 16 |
341 | # a simpleminded Pascal comment stripper |
342 | # (warning: assumes no { or } in strings) |
343 | line: while (<stdin>) { |
344 | while (s|\|({.*}.*\|){.*}|$1 \||) {} |
345 | s|{.*}| \||; |
346 | if (s|{.*| \||) { |
347 | $front = $_; |
348 | while (<stdin>) { |
349 | if (\|/\|}/\|) { # end of comment? |
350 | s|^|$front{|; |
351 | redo line; |
352 | } |
353 | } |
354 | } |
355 | print; |
356 | } |
357 | |
358 | .fi |
359 | .Ip "rename(OLDNAME,NEWNAME)" 8 2 |
360 | Changes the name of a file. |
361 | Returns 1 for success, 0 otherwise. |
362 | .Ip "reset EXPR" 8 3 |
363 | Generally used in a |
364 | .I continue |
365 | block at the end of a loop to clear variables and reset ?? searches |
366 | so that they work again. |
367 | The expression is interpreted as a list of single characters (hyphens allowed |
368 | for ranges). |
378cc40b |
369 | All variables and arrays beginning with one of those letters are reset to |
370 | their pristine state. |
8d063cd8 |
371 | If the expression is omitted, one-match searches (?pattern?) are reset to |
372 | match again. |
373 | Always returns 1. |
374 | Examples: |
375 | .nf |
376 | |
377 | .ne 3 |
378 | reset 'X'; \h'|2i'# reset all X variables |
379 | reset 'a-z';\h'|2i'# reset lower case variables |
380 | reset; \h'|2i'# just reset ?? searches |
381 | |
382 | .fi |
378cc40b |
383 | Note: resetting "A-Z" is not recommended since you'll wipe out your ARGV and ENV |
384 | arrays. |
385 | .Ip "s/PATTERN/REPLACEMENT/gi" 8 3 |
8d063cd8 |
386 | Searches a string for a pattern, and if found, replaces that pattern with the |
387 | replacement text and returns the number of substitutions made. |
388 | Otherwise it returns false (0). |
389 | The \*(L"g\*(R" is optional, and if present, indicates that all occurrences |
390 | of the pattern are to be replaced. |
378cc40b |
391 | The \*(L"i\*(R" is also optional, and if present, indicates that matching |
392 | is to be done in a case-insensitive manner. |
8d063cd8 |
393 | Any delimiter may replace the slashes; if single quotes are used, no |
394 | interpretation is done on the replacement string. |
395 | If no string is specified via the =~ or !~ operator, |
396 | the $_ string is searched and modified. |
378cc40b |
397 | (The string specified with =~ must be a scalar variable, an array element, |
398 | or an assignment to one of those, i.e. an lvalue.) |
8d063cd8 |
399 | If the pattern contains a $ that looks like a variable rather than an |
400 | end-of-string test, the variable will be interpolated into the pattern at |
401 | run-time. |
402 | See also the section on regular expressions. |
403 | Examples: |
404 | .nf |
405 | |
406 | s/\|\e\|bgreen\e\|b/mauve/g; # don't change wintergreen |
407 | |
408 | $path \|=~ \|s|\|/usr/bin|\|/usr/local/bin|; |
409 | |
410 | s/Login: $foo/Login: $bar/; # run-time pattern |
411 | |
412 | s/\|([^ \|]*\|) *\|([^ \|]*\|)\|/\|$2 $1/; # reverse 1st two fields |
413 | |
378cc40b |
414 | ($foo = $bar) =~ s/bar/foo/; |
415 | |
8d063cd8 |
416 | .fi |
417 | (Note the use of $ instead of \|\e\| in the field-reversing example. See section |
418 | on regular expressions.) |
419 | .Ip "seek(FILEHANDLE,POSITION,WHENCE)" 8 3 |
420 | Randomly positions the file pointer for FILEHANDLE, just like the fseek() |
421 | call of stdio. |
378cc40b |
422 | FILEHANDLE may be an expression whose value gives the name of the filehandle. |
8d063cd8 |
423 | Returns 1 upon success, 0 otherwise. |
424 | .Ip "select(FILEHANDLE)" 8 3 |
425 | Sets the current default filehandle for output. |
426 | This has two effects: first, a |
427 | .I write |
428 | or a |
429 | .I print |
430 | without a filehandle will default to this FILEHANDLE. |
431 | Second, references to variables related to output will refer to this output |
432 | channel. |
433 | For example, if you have to set the top of form format for more than |
434 | one output channel, you might do the following: |
435 | .nf |
436 | |
437 | .ne 4 |
438 | select(report1); |
439 | $^ = 'report1_top'; |
440 | select(report2); |
441 | $^ = 'report2_top'; |
442 | |
443 | .fi |
444 | Select happens to return TRUE if the file is currently open and FALSE otherwise, |
445 | but this has no effect on its operation. |
378cc40b |
446 | FILEHANDLE may be an expression whose value gives the name of the actual filehandle. |
8d063cd8 |
447 | .Ip "shift(ARRAY)" 8 6 |
448 | .Ip "shift ARRAY" 8 |
449 | .Ip "shift" 8 |
378cc40b |
450 | Shifts the first value of the array off and returns it, |
451 | shortening the array by 1 and moving everything down. |
8d063cd8 |
452 | If ARRAY is omitted, shifts the ARGV array. |
83b4785a |
453 | See also unshift(), push() and pop(). |
454 | Shift() and unshift() do the same thing to the left end of an array that push() |
455 | and pop() do to the right end. |
8d063cd8 |
456 | .Ip "sleep EXPR" 8 6 |
457 | .Ip "sleep" 8 |
458 | Causes the script to sleep for EXPR seconds, or forever if no EXPR. |
459 | May be interrupted by sending the process a SIGALRM. |
460 | Returns the number of seconds actually slept. |
378cc40b |
461 | .Ip "sort SUBROUTINE LIST" 8 7 |
462 | .Ip "sort LIST" 8 |
463 | Sorts the LIST and returns the sorted array value. |
464 | Nonexistent values of arrays are stripped out. |
465 | If SUBROUTINE is omitted, sorts in standard string comparison order. |
466 | If SUBROUTINE is specified, gives the name of a subroutine that returns |
467 | a -1, 0, or 1, depending on how the elements of the array are to be ordered. |
468 | In the interests of efficiency the normal calling code for subroutines |
469 | is bypassed, with the following effects: the subroutine may not be a recursive |
470 | subroutine, and the two elements to be compared are passed into the subroutine |
471 | not via @_ but as $a and $b (see example below). |
472 | SUBROUTINE may be a scalar variable name, in which case the value provides |
473 | the name of the subroutine to use. |
474 | Examples: |
475 | .nf |
476 | |
477 | .ne 4 |
478 | sub byage { |
479 | $age{$a} < $age{$b} ? -1 : $age{$a} > $age{$b} ? 1 : 0; |
480 | } |
481 | @sortedclass = sort byage @class; |
482 | |
483 | .ne 9 |
484 | sub reverse { $a lt $b ? 1 : $a gt $b ? -1 : 0; } |
485 | @harry = ('dog','cat','x','Cain','Abel'); |
486 | @george = ('gone','chased','yz','Punished','Axed'); |
487 | print sort @harry; |
488 | # prints AbelCaincatdogx |
489 | print sort reverse @harry; |
490 | # prints xdogcatCainAbel |
491 | print sort @george,'to',@harry; |
492 | # prints AbelAxedCainPunishedcatchaseddoggonetoxyz |
493 | |
494 | .fi |
8d063cd8 |
495 | .Ip "split(/PATTERN/,EXPR)" 8 8 |
496 | .Ip "split(/PATTERN/)" 8 |
497 | .Ip "split" 8 |
498 | Splits a string into an array of strings, and returns it. |
499 | If EXPR is omitted, splits the $_ string. |
500 | If PATTERN is also omitted, splits on whitespace (/[\ \et\en]+/). |
501 | Anything matching PATTERN is taken to be a delimiter separating the fields. |
502 | (Note that the delimiter may be longer than one character.) |
503 | Trailing null fields are stripped, which potential users of pop() would |
504 | do well to remember. |
9bb9d9f7 |
505 | A pattern matching the null string (not to be confused with a null pattern) |
506 | will split the value of EXPR into separate characters at each point it |
507 | matches that way. |
508 | For example: |
509 | .nf |
510 | |
511 | print join(':',split(/ */,'hi there')); |
512 | |
513 | .fi |
514 | produces the output 'h:i:t:h:e:r:e'. |
515 | |
516 | The pattern /PATTERN/ may be replaced with an expression to specify patterns |
517 | that vary at runtime. |
518 | As a special case, specifying a space ('\ ') will split on white space |
519 | just as split with no arguments does, but leading white space does NOT |
520 | produce a null first field. |
521 | Thus, split('\ ') can be used to emulate awk's default behavior, whereas |
522 | split(/\ /) will give you as many null initial fields as there are |
523 | leading spaces. |
8d063cd8 |
524 | .sp |
525 | Example: |
526 | .nf |
527 | |
528 | .ne 5 |
529 | open(passwd, '/etc/passwd'); |
530 | while (<passwd>) { |
531 | .ie t \{\ |
532 | ($login, $passwd, $uid, $gid, $gcos, $home, $shell) = split(\|/\|:\|/\|); |
533 | 'br\} |
534 | .el \{\ |
535 | ($login, $passwd, $uid, $gid, $gcos, $home, $shell) |
536 | = split(\|/\|:\|/\|); |
537 | 'br\} |
538 | .\|.\|. |
539 | } |
540 | |
541 | .fi |
542 | (Note that $shell above will still have a newline on it. See chop().) |
543 | See also |
544 | .IR join . |
545 | .Ip "sprintf(FORMAT,LIST)" 8 4 |
546 | Returns a string formatted by the usual printf conventions. |
547 | The * character is not supported. |
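For example (a sketch; the values and variable names are arbitrary):
.nf

.ne 2
$id = sprintf("%05d", 42);    # "00042"
$line = sprintf("%-8s|%3d", 'perl', 2);    # "perl    |  2"

.fi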
548 | .Ip "sqrt(EXPR)" 8 3 |
549 | Returns the square root of EXPR. |
550 | .Ip "stat(FILEHANDLE)" 8 6 |
551 | .Ip "stat(EXPR)" 8 |
552 | Returns a 13-element array giving the statistics for a file, either the file |
553 | opened via FILEHANDLE, or named by EXPR. |
554 | Typically used as follows: |
555 | .nf |
556 | |
557 | .ne 3 |
558 | ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size, |
559 | $atime,$mtime,$ctime,$blksize,$blocks) |
560 | = stat($filename); |
561 | |
562 | .fi |
378cc40b |
563 | .Ip "study(SCALAR)" 8 6 |
564 | .Ip "study" 8 |
565 | Takes extra time to study SCALAR ($_ if unspecified) in anticipation of |
566 | doing many pattern matches on the string before it is next modified. |
567 | This may or may not save time, depending on the nature and number of patterns |
13281fa4 |
568 | you are searching on, and on the distribution of character frequencies in |
569 | the string to be searched\*(--you probably want to compare runtimes with and |
378cc40b |
570 | without it to see which runs faster. |
571 | Those loops which scan for many short constant strings (including the constant |
572 | parts of more complex patterns) will benefit most. |
13281fa4 |
573 | (The way study works is this: a linked list of every character in the string |
574 | to be searched is made, so we know, for example, where all the `k' characters |
575 | are. |
576 | From each search string, the rarest character is selected, based on some |
577 | static frequency tables constructed from some C programs and English text. |
578 | Only those places that contain this \*(L"rarest\*(R" character are examined.) |
579 | .Sp |
580 | For example, here is a loop which inserts index-producing entries before any line |
378cc40b |
581 | containing a certain pattern: |
582 | .nf |
583 | |
584 | .ne 8 |
585 | while (<>) { |
586 | study; |
587 | print ".IX foo\en" if /\ebfoo\eb/; |
588 | print ".IX bar\en" if /\ebbar\eb/; |
589 | print ".IX blurfl\en" if /\ebblurfl\eb/; |
590 | .\|.\|. |
591 | print; |
592 | } |
593 | |
594 | .fi |
13281fa4 |
595 | In searching for /\ebfoo\eb/, only those locations in $_ that contain `f' |
596 | will be looked at, because `f' is rarer than `o'. |
597 | In general, this is a big win except in pathological cases. |
598 | The only question is whether it saves you more time than it took to build |
599 | the linked list in the first place. |
600 | .Sp |
601 | Note that if you have to look for strings that you don't know till runtime, |
602 | you can build an entire loop as a string and eval that to avoid recompiling |
603 | all your patterns all the time. |
604 | Together with setting $/ to input entire files as one record, this can |
605 | be very fast, often faster than specialized programs like fgrep. |
606 | The following scans a list of files (@files) |
607 | for a list of words (@words), and prints out the names of those files that |
608 | contain a match: |
609 | .nf |
610 | |
611 | .ne 12 |
612 | $search = 'while (<>) { study;'; |
613 | foreach $word (@words) { |
614 | $search .= "++\e$seen{\e$ARGV} if /\e\eb$word\e\eb/;\en"; |
615 | } |
616 | $search .= "}"; |
617 | @ARGV = @files; |
618 | $/ = "\e177"; # something that doesn't occur |
619 | eval $search; # this screams |
620 | $/ = "\en"; # put back to normal input delim |
621 | foreach $file (sort keys(seen)) { |
622 | print $file,"\en"; |
623 | } |
624 | |
625 | .fi |
8d063cd8 |
626 | .Ip "substr(EXPR,OFFSET,LEN)" 8 2 |
627 | Extracts a substring out of EXPR and returns it. |
628 | First character is at offset 0, or whatever you've set $[ to. |
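For example, assuming $[ is still 0 (the string is illustrative):
.nf

$word = substr('abcdefgh', 2, 3);    # 'cde'

.fi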
629 | .Ip "system LIST" 8 6 |
630 | Does exactly the same thing as \*(L"exec LIST\*(R" except that a fork |
631 | is done first, and the parent process waits for the child process to complete. |
632 | Note that argument processing varies depending on the number of arguments. |
378cc40b |
633 | The return value is the exit status of the program as returned by the wait() |
634 | call. |
635 | To get the actual exit value divide by 256. |
636 | See also exec. |
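A sketch of checking the result (the command and the $old and $new variables
are only an illustration):
.nf

.ne 4
$status = system 'cp', $old, $new;
if ($status != 0) {
    print stderr "cp failed, exit value ", int($status / 256), "\en";
}

.fi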
637 | .Ip "symlink(OLDFILE,NEWFILE)" 8 2 |
638 | Creates a new filename symbolically linked to the old filename. |
639 | Returns 1 for success, 0 otherwise. |
640 | On systems that don't support symbolic links, produces a fatal error at |
641 | run time. |
642 | To check for that, use eval: |
643 | .nf |
644 | |
645 | $symlink_exists = (eval 'symlink("","");', $@ eq ''); |
646 | |
647 | .fi |
8d063cd8 |
648 | .Ip "tell(FILEHANDLE)" 8 6 |
649 | .Ip "tell" 8 |
650 | Returns the current file position for FILEHANDLE. |
378cc40b |
651 | FILEHANDLE may be an expression whose value gives the name of the actual |
652 | filehandle. |
8d063cd8 |
653 | If FILEHANDLE is omitted, assumes the file last read. |
654 | .Ip "time" 8 4 |
655 | Returns the number of seconds since January 1, 1970. |
656 | Suitable for feeding to gmtime() and localtime(). |
657 | .Ip "times" 8 4 |
658 | Returns a four-element array giving the user and system times, in seconds, for this |
659 | process and the children of this process. |
660 | .sp |
661 | ($user,$system,$cuser,$csystem) = times; |
662 | .sp |
663 | .Ip "tr/SEARCHLIST/REPLACEMENTLIST/" 8 5 |
664 | .Ip "y/SEARCHLIST/REPLACEMENTLIST/" 8 |
665 | Translates all occurrences of the characters found in the search list into |
666 | the corresponding character in the replacement list. |
667 | It returns the number of characters replaced. |
668 | If no string is specified via the =~ or !~ operator, |
669 | the $_ string is translated. |
378cc40b |
670 | (The string specified with =~ must be a scalar variable, an array element, |
671 | or an assignment to one of those, i.e. an lvalue.) |
8d063cd8 |
672 | For |
673 | .I sed |
674 | devotees, |
675 | .I y |
676 | is provided as a synonym for |
677 | .IR tr . |
678 | Examples: |
679 | .nf |
680 | |
681 | $ARGV[1] \|=~ \|y/A-Z/a-z/; \h'|3i'# canonicalize to lower case |
682 | |
683 | $cnt = tr/*/*/; \h'|3i'# count the stars in $_ |
684 | |
378cc40b |
685 | ($HOST = $host) =~ tr/a-z/A-Z/; |
686 | |
13281fa4 |
687 | y/\e001-@[-_{-\e177/ /; \h'|3i'# change non-alphas to space |
688 | |
8d063cd8 |
689 | .fi |
690 | .Ip "umask(EXPR)" 8 3 |
691 | Sets the umask for the process and returns the old one. |
692 | .Ip "unlink LIST" 8 2 |
693 | Deletes a list of files. |
8d063cd8 |
694 | Returns the number of files successfully deleted. |
8d063cd8 |
695 | .nf |
696 | |
378cc40b |
697 | .ne 2 |
698 | $cnt = unlink 'a','b','c'; |
699 | unlink @goners; |
13281fa4 |
700 | unlink <*.bak>; |
8d063cd8 |
701 | |
702 | .fi |
378cc40b |
703 | Note: unlink will not delete directories unless you are superuser and the \-U |
704 | flag is supplied to perl. |
83b4785a |
705 | .ne 7 |
8d063cd8 |
706 | .Ip "unshift(ARRAY,LIST)" 8 4 |
707 | Does the opposite of a shift. |
378cc40b |
708 | Or the opposite of a push, depending on how you look at it. |
8d063cd8 |
709 | Prepends list to the front of the array, and returns the number of elements |
710 | in the new array. |
711 | .nf |
712 | |
713 | unshift(ARGV,'-e') unless $ARGV[0] =~ /^-/; |
714 | |
715 | .fi |
378cc40b |
716 | .Ip "utime LIST" 8 2 |
717 | Changes the access and modification times on each file of a list of files. |
718 | The first two elements of the list must be the NUMERICAL access and |
719 | modification times, in that order. |
720 | Returns the number of files successfully changed. |
721 | The inode modification time of each file is set to the current time. |
13281fa4 |
722 | Example of a \*(L"touch\*(R" command: |
378cc40b |
723 | .nf |
724 | |
725 | .ne 3 |
726 | #!/usr/bin/perl |
727 | $now = time; |
728 | utime $now,$now,@ARGV; |
729 | |
730 | .fi |
8d063cd8 |
731 | .Ip "values(ASSOC_ARRAY)" 8 6 |
732 | Returns a normal array consisting of all the values of the named associative |
733 | array. |
734 | The values are returned in an apparently random order, but it is the same order |
735 | as either the keys() or each() function produces (given that the associative array |
736 | has not been modified). |
737 | See also keys() and each(). |
378cc40b |
738 | .Ip "wait" 8 6 |
739 | Waits for a child process to terminate and returns the pid of the deceased |
740 | process. |
741 | The status is returned in $?. |
8d063cd8 |
742 | .Ip "write(FILEHANDLE)" 8 6 |
743 | .Ip "write(EXPR)" 8 |
744 | .Ip "write(\|)" 8 |
745 | Writes a formatted record (possibly multi-line) to the specified file, |
746 | using the format associated with that file. |
747 | By default the format for a file is the one having the same name as the |
748 | filehandle, but the format for the current output channel (see |
749 | .IR select ) |
750 | may be set explicitly |
751 | by assigning the name of the format to the $~ variable. |
752 | .sp |
753 | Top of form processing is handled automatically: |
754 | if there is insufficient room on the current page for the formatted |
755 | record, the page is advanced, a special top-of-page format is used |
756 | to format the new page header, and then the record is written. |
757 | By default the top-of-page format is \*(L"top\*(R", but it |
758 | may be set to the |
759 | format of your choice by assigning the name to the $^ variable. |
760 | .sp |
761 | If FILEHANDLE is unspecified, output goes to the current default output channel, |
762 | which starts out as stdout but may be changed by the |
763 | .I select |
764 | operator. |
765 | If the FILEHANDLE is an EXPR, then the expression is evaluated and the |
766 | resulting string is used to look up the name of the FILEHANDLE at run time. |
767 | For more on formats, see the section on formats later on. |
378cc40b |
768 | .Sh "Precedence" |
769 | Perl operators have the following associativity and precedence: |
770 | .nf |
771 | |
772 | nonassoc\h'|1i'print printf exec system sort |
773 | \h'1.5i'chmod chown kill unlink utime |
774 | left\h'|1i', |
775 | right\h'|1i'= |
776 | right\h'|1i'?: |
777 | nonassoc\h'|1i'.. |
778 | left\h'|1i'|| |
779 | left\h'|1i'&& |
780 | left\h'|1i'| ^ |
781 | left\h'|1i'& |
782 | nonassoc\h'|1i'== != eq ne |
783 | nonassoc\h'|1i'< > <= >= lt gt le ge |
784 | nonassoc\h'|1i'chdir die exit eval reset sleep |
785 | nonassoc\h'|1i'-r -w -x etc. |
786 | left\h'|1i'<< >> |
787 | left\h'|1i'+ - . |
788 | left\h'|1i'* / % x |
789 | left\h'|1i'=~ !~ |
790 | right\h'|1i'! ~ and unary minus |
791 | nonassoc\h'|1i'++ -- |
792 | left\h'|1i''(' |
793 | |
794 | .fi |
795 | Actually, the precedence of list operators such as print, sort or chmod is |
796 | either very high or very low depending on whether you look at the left |
797 | side of the operator or the right side of it. |
798 | For example, in |
799 | |
800 | @ary = (1, 3, sort 4, 2); |
801 | print @ary; # prints 1324 |
802 | |
803 | the commas on the right of the sort are evaluated before the sort, but |
804 | the commas on the left are evaluated after. |
805 | In other words, list operators tend to gobble up all the arguments that |
806 | follow them, and then act like a simple term with regard to the preceding |
807 | expression. |
8d063cd8 |
808 | .Sh "Subroutines" |
809 | A subroutine may be declared as follows: |
810 | .nf |
811 | |
812 | sub NAME BLOCK |
813 | |
814 | .fi |
815 | .PP |
816 | Any arguments passed to the routine come in as array @_, |
817 | that is ($_[0], $_[1], .\|.\|.). |
818 | The return value of the subroutine is the value of the last expression |
819 | evaluated. |
13281fa4 |
820 | To create local variables see the \*(L"local\*(R" operator. |
8d063cd8 |
821 | .PP |
822 | A subroutine is called using the |
823 | .I do |
824 | operator. |
8d063cd8 |
825 | .nf |
826 | |
827 | .ne 12 |
828 | Example: |
829 | |
830 | sub MAX { |
378cc40b |
831 | local($max) = pop(@_); |
832 | foreach $foo (@_) { |
8d063cd8 |
833 | $max = $foo \|if \|$max < $foo; |
834 | } |
835 | $max; |
836 | } |
837 | |
838 | .\|.\|. |
839 | $bestday = do MAX($mon,$tue,$wed,$thu,$fri); |
840 | |
841 | .ne 21 |
842 | Example: |
843 | |
844 | # get a line, combining continuation lines |
845 | # that start with whitespace |
846 | sub get_line { |
847 | $thisline = $lookahead; |
848 | line: while ($lookahead = <stdin>) { |
849 | if ($lookahead \|=~ \|/\|^[ \^\e\|t]\|/\|) { |
850 | $thisline \|.= \|$lookahead; |
851 | } |
852 | else { |
853 | last line; |
854 | } |
855 | } |
856 | $thisline; |
857 | } |
858 | |
859 | $lookahead = <stdin>; # get first line |
860 | while ($_ = get_line(\|)) { |
861 | .\|.\|. |
862 | } |
863 | |
864 | .fi |
865 | .nf |
866 | .ne 6 |
378cc40b |
867 | Use array assignment to a local list to name your formal arguments: |
8d063cd8 |
868 | |
869 | sub maybeset { |
378cc40b |
870 | local($key,$value) = @_; |
8d063cd8 |
871 | $foo{$key} = $value unless $foo{$key}; |
872 | } |
873 | |
874 | .fi |
378cc40b |
875 | Subroutines may be called recursively. |
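As a small illustration of recursion, the sketch below computes a factorial. The subroutine name fact and the variable $n are made up for this example. It is written with the &NAME(...) call form accepted by later perls, which is an assumption about your interpreter; with the do operator described above the calls would read do fact(...) instead.

```perl
# Hypothetical example: recursive factorial.
# local() gives each call its own $n, so the caller's value
# is restored when the inner call returns.
sub fact {
    local($n) = @_;
    $n <= 1 ? 1 : $n * &fact($n - 1);   # last expression is the return value
}

$result = &fact(5);
print "5! = $result\n";
```

Because $n is declared with local, each level of the recursion sees its own copy, which is what makes the multiplication safe after the inner call returns.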
8d063cd8 |
876 | .Sh "Regular Expressions" |
877 | The patterns used in pattern matching are regular expressions such as |
378cc40b |
878 | those supplied in the Version 8 regexp routines. |
879 | (In fact, the routines are derived from Henry Spencer's freely redistributable |
880 | reimplementation of the V8 routines.) |
13281fa4 |
881 | In addition, \ew matches an alphanumeric character (including \*(L"_\*(R") and \eW a nonalphanumeric. |
8d063cd8 |
882 | Word boundaries may be matched by \eb, and non-boundaries by \eB. |
378cc40b |
883 | A whitespace character is matched by \es, non-whitespace by \eS. |
884 | A numeric character is matched by \ed, non-numeric by \eD. |
885 | You may use \ew, \es and \ed within character classes. |
886 | Also, \en, \er, \ef, \et and \eNNN have their normal interpretations. |
887 | Within character classes \eb represents backspace rather than a word boundary. |
888 | The bracketing construct \|(\ .\|.\|.\ \|) may also be used, in which case \e<digit> |
8d063cd8 |
889 | matches the digit'th substring, where digit can range from 1 to 9. |
378cc40b |
890 | (Outside of patterns, use $ instead of \e in front of the digit. |
891 | The scope of $<digit> extends to the end of the enclosing BLOCK, or to |
892 | the next pattern match with subexpressions.) |
8d063cd8 |
893 | $+ returns whatever the last bracket match matched. |
894 | $& returns the entire matched string. |
378cc40b |
895 | ($0 normally returns the same thing, but don't depend on it.) |
896 | Alternatives may be separated by |. |
8d063cd8 |
897 | Examples: |
898 | .nf |
899 | |
900 | s/\|^\|([^ \|]*\|) \|*([^ \|]*\|)\|/\|$2 $1\|/; # swap first two words |
901 | |
902 | .ne 5 |
903 | if (/\|Time: \|(.\|.\|):\|(.\|.\|):\|(.\|.\|)\|/\|) { |
904 | $hours = $1; |
905 | $minutes = $2; |
906 | $seconds = $3; |
907 | } |
908 | |
909 | .fi |
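The difference between \<digit> inside a pattern and $<digit> outside it can be sketched as follows. The sample string and the variable $dup are made up for illustration.

```perl
# Hypothetical example: find a doubled word.
# Inside the pattern, \1 refers back to the first parenthesized group;
# after the match, the same text is available as $1.
$_ = "this is is a test";
if (/\b(\w+) \1\b/) {
    $dup = $1;
    print "doubled word: $dup\n";
}
```

The \b at each end keeps the pattern from matching the tail of one word against the head of the next.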
910 | By default, the ^ character matches only the beginning of the string, and |
911 | .I perl |
912 | does certain optimizations with the assumption that the string contains |
913 | only one line. |
914 | You may, however, wish to treat a string as a multi-line buffer, such that |
915 | the ^ will match after any newline within the string. |
916 | At the cost of a little more overhead, you can do this by setting the variable |
917 | $* to 1. |
918 | Setting it back to 0 makes |
919 | .I perl |
920 | revert to its old behavior. |
378cc40b |
921 | .PP |
922 | To facilitate multi-line substitutions, the . character never matches a newline. |
923 | In particular, the following leaves a newline on the $_ string: |
924 | .nf |
925 | |
926 | $_ = <stdin>; |
927 | s/.*(some_string).*/$1/; |
928 | |
929 | If the newline is unwanted, try one of |
930 | |
931 | s/.*(some_string).*\en/$1/; |
932 | s/.*(some_string)[^\000]*/$1/; |
933 | s/.*(some_string)(.|\en)*/$1/; |
934 | chop; s/.*(some_string).*/$1/; |
935 | /(some_string)/ && ($_ = $1); |
936 | |
937 | .fi |
8d063cd8 |
938 | .Sh "Formats" |
939 | Output record formats for use with the |
940 | .I write |
941 | operator may be declared as follows: |
942 | .nf |
943 | |
944 | .ne 3 |
945 | format NAME = |
946 | FORMLIST |
947 | . |
948 | |
949 | .fi |
950 | If NAME is omitted, format \*(L"stdout\*(R" is defined. |
951 | FORMLIST consists of a sequence of lines, each of which may be of one of three |
952 | types: |
953 | .Ip 1. 4 |
954 | A comment. |
955 | .Ip 2. 4 |
956 | A \*(L"picture\*(R" line giving the format for one output line. |
957 | .Ip 3. 4 |
958 | An argument line supplying values to plug into a picture line. |
959 | .PP |
960 | Picture lines are printed exactly as they look, except for certain fields |
961 | that substitute values into the line. |
962 | Each picture field starts with either @ or ^. |
963 | The @ field (not to be confused with the array marker @) is the normal |
964 | case; ^ fields are used |
965 | to do rudimentary multi-line text block filling. |
966 | The length of the field is supplied by padding out the field |
967 | with multiple <, >, or | characters to specify, respectively, left justification, |
968 | right justification, or centering. |
969 | If any of the values supplied for these fields contains a newline, only |
970 | the text up to the newline is printed. |
971 | The special field @* can be used for printing multi-line values. |
972 | It should appear by itself on a line. |
973 | .PP |
974 | The values are specified on the following line, in the same order as |
975 | the picture fields. |
378cc40b |
976 | They must currently be either scalar variable names or literals (or |
8d063cd8 |
977 | pseudo-literals). |
978 | Currently you can separate values with spaces, but commas may be placed |
979 | between values to prepare for possible future versions in which full expressions |
980 | are allowed as values. |
981 | .PP |
982 | Picture fields that begin with ^ rather than @ are treated specially. |
378cc40b |
983 | The value supplied must be a scalar variable name which contains a text |
8d063cd8 |
984 | string. |
985 | .I Perl |
986 | puts as much text as it can into the field, and then chops off the front |
378cc40b |
987 | of the string so that the next time the variable is referenced, |
8d063cd8 |
988 | more of the text can be printed. |
989 | Normally you would use a sequence of fields in a vertical stack to print |
990 | out a block of text. |
991 | If you like, you can end the final field with .\|.\|., which will appear in the |
992 | output if the text was too long to appear in its entirety. |
993 | .PP |
994 | Since use of ^ fields can produce variable length records if the text to be |
995 | formatted is short, you can suppress blank lines by putting the tilde (~) |
996 | character anywhere in the line. |
997 | (Normally you should put it in the front if possible.) |
998 | The tilde will be translated to a space upon output. |
999 | .PP |
1000 | Examples: |
1001 | .nf |
1002 | .lg 0 |
1003 | .cs R 25 |
1004 | |
1005 | .ne 10 |
1006 | # a report on the /etc/passwd file |
1007 | format top = |
1008 | \& Passwd File |
1009 | Name Login Office Uid Gid Home |
1010 | ------------------------------------------------------------------ |
1011 | \&. |
1012 | format stdout = |
1013 | @<<<<<<<<<<<<<<<<<< @||||||| @<<<<<<@>>>> @>>>> @<<<<<<<<<<<<<<<<< |
1014 | $name $login $office $uid $gid $home |
1015 | \&. |
1016 | |
1017 | .ne 29 |
1018 | # a report from a bug report form |
1019 | format top = |
1020 | \& Bug Reports |
1021 | @<<<<<<<<<<<<<<<<<<<<<<< @||| @>>>>>>>>>>>>>>>>>>>>>>> |
1022 | $system; $%; $date |
1023 | ------------------------------------------------------------------ |
1024 | \&. |
1025 | format stdout = |
1026 | Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< |
1027 | \& $subject |
1028 | Index: @<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< |
1029 | \& $index $description |
1030 | Priority: @<<<<<<<<<< Date: @<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< |
1031 | \& $priority $date $description |
1032 | From: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< |
1033 | \& $from $description |
1034 | Assigned to: @<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< |
1035 | \& $programmer $description |
1036 | \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< |
1037 | \& $description |
1038 | \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< |
1039 | \& $description |
1040 | \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< |
1041 | \& $description |
1042 | \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< |
1043 | \& $description |
1044 | \&~ ^<<<<<<<<<<<<<<<<<<<<<<<... |
1045 | \& $description |
1046 | \&. |
1047 | |
1048 | .cs R |
1049 | .lg |
1050 | It is possible to intermix prints with writes on the same output channel, |
1051 | but you'll have to handle $\- (lines left on the page) yourself. |
1052 | .fi |
1053 | .PP |
1054 | If you are printing lots of fields that are usually blank, you should consider |
1055 | using the reset operator between records. |
1056 | Not only is it more efficient, but it can prevent the bug of adding another |
1057 | field and forgetting to zero it. |
1058 | .Sh "Predefined Names" |
1059 | The following names have special meaning to |
1060 | .IR perl . |
1061 | I could have used alphabetic symbols for some of these, but I didn't want |
13281fa4 |
1062 | to take the chance that someone would say reset \*(L"a-zA-Z\*(R" and wipe them all |
8d063cd8 |
1063 | out. |
1064 | You'll just have to suffer along with these silly symbols. |
1065 | Most of them have reasonable mnemonics, or analogues in one of the shells. |
1066 | .Ip $_ 8 |
1067 | The default input and pattern-searching space. |
1068 | The following pairs are equivalent: |
1069 | .nf |
1070 | |
1071 | .ne 2 |
1072 | while (<>) {\|.\|.\|. # only equivalent in while! |
1073 | while ($_ = <>) {\|.\|.\|. |
1074 | |
1075 | .ne 2 |
1076 | /\|^Subject:/ |
1077 | $_ \|=~ \|/\|^Subject:/ |
1078 | |
1079 | .ne 2 |
1080 | y/a-z/A-Z/ |
1081 | $_ =~ y/a-z/A-Z/ |
1082 | |
1083 | .ne 2 |
1084 | chop |
1085 | chop($_) |
1086 | |
1087 | .fi |
1088 | (Mnemonic: underline is understood in certain operations.) |
1089 | .Ip $. 8 |
378cc40b |
1090 | The current input line number of the last filehandle that was read. |
8d063cd8 |
1091 | Readonly. |
378cc40b |
1092 | Remember that only an explicit close on the filehandle resets the line number. |
1093 | Since <> never does an explicit close, line numbers increase across ARGV files |
1094 | (but see examples under eof). |
8d063cd8 |
1095 | (Mnemonic: many programs use . to mean the current line number.) |
1096 | .Ip $/ 8 |
1097 | The input record separator, newline by default. |
1098 | Works like awk's RS variable, including treating blank lines as delimiters |
1099 | if set to the null string. |
1100 | If set to a value longer than one character, only the first character is used. |
1101 | (Mnemonic: / is used to delimit line boundaries when quoting poetry.) |
1102 | .Ip $, 8 |
1103 | The output field separator for the print operator. |
1104 | Ordinarily the print operator simply prints out the comma separated fields |
1105 | you specify. |
1106 | In order to get behavior more like awk, set this variable as you would set |
1107 | awk's OFS variable to specify what is printed between fields. |
1108 | (Mnemonic: what is printed when there is a , in your print statement.) |
1109 | .Ip $\e 8 |
1110 | The output record separator for the print operator. |
1111 | Ordinarily the print operator simply prints out the comma separated fields |
1112 | you specify, with no trailing newline or record separator assumed. |
1113 | In order to get behavior more like awk, set this variable as you would set |
1114 | awk's ORS variable to specify what is printed at the end of the print. |
1115 | (Mnemonic: you set $\e instead of adding \en at the end of the print. |
1116 | Also, it's just like /, but it's what you get \*(L"back\*(R" from perl.) |
1117 | .Ip $# 8 |
1118 | The output format for printed numbers. |
1119 | This variable is a half-hearted attempt to emulate awk's OFMT variable. |
1120 | There are times, however, when awk and perl have differing notions of what |
1121 | is in fact numeric. |
1122 | Also, the initial value is %.20g rather than %.6g, so you need to set $# |
1123 | explicitly to get awk's value. |
1124 | (Mnemonic: # is the number sign.) |
1125 | .Ip $% 8 |
1126 | The current page number of the currently selected output channel. |
1127 | (Mnemonic: % is page number in nroff.) |
1128 | .Ip $= 8 |
1129 | The current page length (printable lines) of the currently selected output |
1130 | channel. |
1131 | Default is 60. |
1132 | (Mnemonic: = has horizontal lines.) |
1133 | .Ip $\- 8 |
1134 | The number of lines left on the page of the currently selected output channel. |
1135 | (Mnemonic: lines_on_page - lines_printed.) |
1136 | .Ip $~ 8 |
1137 | The name of the current report format for the currently selected output |
1138 | channel. |
1139 | (Mnemonic: brother to $^.) |
1140 | .Ip $^ 8 |
1141 | The name of the current top-of-page format for the currently selected output |
1142 | channel. |
1143 | (Mnemonic: points to top of page.) |
1144 | .Ip $| 8 |
1145 | If set to nonzero, forces a flush after every write or print on the currently |
1146 | selected output channel. |
1147 | Default is 0. |
1148 | Note that stdout will typically be line buffered if output is to the |
1149 | terminal and block buffered otherwise. |
1150 | Setting this variable is useful primarily when you are outputting to a pipe, |
1151 | such as when you are running a perl script under rsh and want to see the |
1152 | output as it's happening. |
1153 | (Mnemonic: when you want your pipes to be piping hot.) |
1154 | .Ip $$ 8 |
1155 | The process number of the |
1156 | .I perl |
1157 | running this script. |
1158 | (Mnemonic: same as shells.) |
1159 | .Ip $? 8 |
378cc40b |
1160 | The status returned by the last backtick (``) command or system operator. |
1161 | Note that this is the status word returned by the wait() system |
1162 | call, so the exit value of the subprocess is actually ($? >> 8). |
1163 | $? & 255 gives which signal, if any, the process died from, and whether |
1164 | there was a core dump. |
1165 | (Mnemonic: similar to sh and ksh.) |
1166 | .Ip $& 8 4 |
1167 | The string matched by the last pattern match. |
1168 | (Mnemonic: like & in some editors.) |
8d063cd8 |
1169 | .Ip $+ 8 4 |
1170 | The last bracket matched by the last search pattern. |
1171 | This is useful if you don't know which of a set of alternative patterns |
1172 | matched. |
1173 | For example: |
1174 | .nf |
1175 | |
1176 | /Version: \|(.*\|)|Revision: \|(.*\|)\|/ \|&& \|($rev = $+); |
1177 | |
1178 | .fi |
1179 | (Mnemonic: be positive and forward looking.) |
1180 | .Ip $* 8 2 |
1181 | Set to 1 to do multiline matching within a string, 0 to assume strings contain |
1182 | a single line. |
1183 | Default is 0. |
1184 | (Mnemonic: * matches multiple things.) |
1185 | .Ip $0 8 |
1186 | Contains the name of the file containing the |
1187 | .I perl |
1188 | script being executed. |
1189 | The value should be copied elsewhere before any pattern matching happens, which |
1190 | clobbers $0. |
1191 | (Mnemonic: same as sh and ksh.) |
83b4785a |
1192 | .Ip $<digit> 8 |
1193 | Contains the subpattern from the corresponding set of parentheses in the last |
1194 | pattern matched, not counting patterns matched in nested blocks that have |
1195 | been exited already. |
1196 | (Mnemonic: like \edigit.) |
8d063cd8 |
1197 | .Ip $[ 8 2 |
1198 | The index of the first element in an array, and of the first character in |
1199 | a substring. |
1200 | Default is 0, but you could set it to 1 to make |
1201 | .I perl |
1202 | behave more like |
1203 | .I awk |
1204 | (or Fortran) |
1205 | when subscripting and when evaluating the index() and substr() functions. |
1206 | (Mnemonic: [ begins subscripts.) |
1207 | .Ip $! 8 2 |
378cc40b |
1208 | If used in a numeric context, yields the current value of errno, with all the |
1209 | usual caveats. |
1210 | If used in a string context, yields the corresponding system error string. |
1211 | You can assign to $! in order to set errno |
1212 | if, for instance, you want $! to return the string for error n, or you want |
1213 | to set the exit value for the die operator. |
8d063cd8 |
1214 | (Mnemonic: What just went bang?) |
83b4785a |
1215 | .Ip $@ 8 2 |
1216 | The error message from the last eval command. |
1217 | If null, the last eval parsed and executed correctly. |
13281fa4 |
1218 | (Mnemonic: Where was the syntax error \*(L"at\*(R"?) |
378cc40b |
1219 | .Ip $< 8 2 |
1220 | The real uid of this process. |
1221 | (Mnemonic: it's the uid you came FROM, if you're running setuid.) |
1222 | .Ip $> 8 2 |
1223 | The effective uid of this process. |
1224 | Example: |
1225 | .nf |
1226 | |
1227 | $< = $>; # set real uid to the effective uid |
1228 | |
1229 | .fi |
1230 | (Mnemonic: it's the uid you went TO, if you're running setuid.) |
1231 | .Ip $( 8 2 |
1232 | The real gid of this process. |
1233 | If you are on a machine that supports membership in multiple groups |
1234 | simultaneously, gives a space separated list of groups you are in. |
1235 | The first number is the one returned by getgid(), and the subsequent ones |
1236 | by getgroups(), one of which may be the same as the first number. |
1237 | (Mnemonic: parens are used to GROUP things. |
1238 | The real gid is the group you LEFT, if you're running setgid.) |
1239 | .Ip $) 8 2 |
1240 | The effective gid of this process. |
1241 | If you are on a machine that supports membership in multiple groups |
1242 | simultaneously, gives a space separated list of groups you are in. |
1243 | The first number is the one returned by getegid(), and the subsequent ones |
1244 | by getgroups(), one of which may be the same as the first number. |
1245 | (Mnemonic: parens are used to GROUP things. |
1246 | The effective gid is the group that's RIGHT for you, if you're running setgid.) |
1247 | .Sp |
1248 | Note: $<, $>, $( and $) can only be set on machines that support the |
1249 | corresponding set[re][ug]id() routine. |
8d063cd8 |
1250 | .Ip @ARGV 8 3 |
1251 | The array ARGV contains the command line arguments intended for the script. |
1252 | Note that $#ARGV is generally the number of arguments minus one, since |
1253 | $ARGV[0] is the first argument, NOT the command name. |
1254 | See $0 for the command name. |
378cc40b |
1255 | .Ip @INC 8 3 |
1256 | The array INC contains the list of places to look for perl scripts to be |
13281fa4 |
1257 | evaluated by the \*(L"do EXPR\*(R" command. |
378cc40b |
1258 | It initially consists of the arguments to any -I command line switches, followed |
13281fa4 |
1259 | by the default perl library, probably \*(L"/usr/local/lib/perl\*(R". |
8d063cd8 |
1260 | .Ip $ENV{expr} 8 2 |
1261 | The associative array ENV contains your current environment. |
1262 | Setting a value in ENV changes the environment for child processes. |
1263 | .Ip $SIG{expr} 8 2 |
1264 | The associative array SIG is used to set signal handlers for various signals. |
1265 | Example: |
1266 | .nf |
1267 | |
1268 | .ne 12 |
1269 | sub handler { # 1st argument is signal name |
378cc40b |
1270 | local($sig) = @_; |
1271 | print "Caught a SIG$sig--shutting down\en"; |
1272 | close(LOG); |
8d063cd8 |
1273 | exit(0); |
1274 | } |
1275 | |
1276 | $SIG{'INT'} = 'handler'; |
1277 | $SIG{'QUIT'} = 'handler'; |
378cc40b |
1278 | .\|.\|. |
8d063cd8 |
1279 | $SIG{'INT'} = 'DEFAULT'; # restore default action |
1280 | $SIG{'QUIT'} = 'IGNORE'; # ignore SIGQUIT |
1281 | |
1282 | .fi |
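A few of the variables above can be seen working together in a short sketch. The sample values and the shell command are chosen only for illustration, and a Bourne-style sh is assumed to be available for system to run.

```perl
# Hypothetical example using the output separators and the status variable.
$, = ':';        # output field separator, like awk's OFS
$\ = "\n";       # output record separator, like awk's ORS
print 'root', 0, '/bin/sh';      # fields joined by ":" and ended by a newline

# Run a command that exits with status 3 (assumes sh exists).
system('sh -c "exit 3"');
$exit   = $? >> 8;     # exit value of the subprocess
$signal = $? & 255;    # signal and core dump information
print 'exit', $exit, 'signal', $signal;
```

Shifting $? right by eight bits is what separates the exit value from the signal information kept in the low byte.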
1283 | .SH ENVIRONMENT |
1284 | .I Perl |
1285 | currently uses no environment variables, except to make them available |
1286 | to the script being executed, and to child processes. |
1287 | However, scripts running setuid would do well to execute the following lines |
1288 | before doing anything else, just to keep people honest: |
1289 | .nf |
1290 | |
1291 | .ne 3 |
1292 | $ENV{'PATH'} = '/bin:/usr/bin'; # or whatever you need |
1293 | $ENV{'SHELL'} = '/bin/sh' if $ENV{'SHELL'}; |
1294 | $ENV{'IFS'} = '' if $ENV{'IFS'}; |
1295 | |
1296 | .fi |
1297 | .SH AUTHOR |
1298 | Larry Wall <lwall@jpl-devvax.Jpl.Nasa.Gov> |
1299 | .SH FILES |
1300 | /tmp/perl\-eXXXXXX temporary file for |
1301 | .B \-e |
1302 | commands. |
1303 | .SH SEE ALSO |
1304 | a2p awk to perl translator |
1305 | .br |
1306 | s2p sed to perl translator |
83b4785a |
1307 | .br |
1308 | perldb interactive perl debugger |
8d063cd8 |
1309 | .SH DIAGNOSTICS |
1310 | Compilation errors will tell you the line number of the error, with an |
1311 | indication of the next token or token type that was to be examined. |
1312 | (In the case of a script passed to |
1313 | .I perl |
1314 | via |
1315 | .B \-e |
1316 | switches, each |
1317 | .B \-e |
1318 | is counted as one line.) |
1319 | .SH TRAPS |
1320 | Accustomed awk users should take special note of the following: |
1321 | .Ip * 4 2 |
1322 | Semicolons are required after all simple statements in perl. Newline |
1323 | is not a statement delimiter. |
1324 | .Ip * 4 2 |
1325 | Curly brackets are required on ifs and whiles. |
1326 | .Ip * 4 2 |
1327 | Variables begin with $ or @ in perl. |
1328 | .Ip * 4 2 |
1329 | Arrays index from 0 unless you set $[. |
1330 | Likewise string positions in substr() and index(). |
1331 | .Ip * 4 2 |
1332 | You have to decide whether your array has numeric or string indices. |
1333 | .Ip * 4 2 |
378cc40b |
1334 | Associative array values do not spring into existence upon mere reference. |
1335 | .Ip * 4 2 |
8d063cd8 |
1336 | You have to decide whether you want to use string or numeric comparisons. |
1337 | .Ip * 4 2 |
1338 | Reading an input line does not split it for you. You get to split it yourself |
1339 | into an array. |
1340 | And split has different arguments. |
1341 | .Ip * 4 2 |
1342 | The current input line is normally in $_, not $0. |
1343 | It generally does not have the newline stripped. |
1344 | ($0 is initially the name of the program executed, then the last matched |
1345 | string.) |
1346 | .Ip * 4 2 |
1347 | The current filename is $ARGV, not $FILENAME. |
1348 | NR, RS, ORS, OFS, and OFMT have equivalents with other symbols. |
1349 | FS doesn't have an equivalent, since you have to be explicit about |
1350 | split statements. |
1351 | .Ip * 4 2 |
1352 | $<digit> does not refer to fields--it refers to substrings matched by the last |
1353 | match pattern. |
1354 | .Ip * 4 2 |
1355 | The print statement does not add field and record separators unless you set |
1356 | $, and $\e. |
1357 | .Ip * 4 2 |
1358 | You must open your files before you print to them. |
1359 | .Ip * 4 2 |
1360 | The range operator is \*(L"..\*(R", not comma. |
1361 | (The comma operator works as in C.) |
1362 | .Ip * 4 2 |
1363 | The match operator is \*(L"=~\*(R", not \*(L"~\*(R". |
1364 | (\*(L"~\*(R" is the one's complement operator.) |
1365 | .Ip * 4 2 |
1366 | The concatenation operator is \*(L".\*(R", not the null string. |
1367 | (Using the null string would render \*(L"/pat/ /pat/\*(R" unparseable, |
1368 | since the third slash would be interpreted as a division operator\*(--the |
1369 | tokener is in fact slightly context sensitive for operators like /, ?, and <. |
1370 | And in fact, . itself can be the beginning of a number.) |
1371 | .Ip * 4 2 |
8d063cd8 |
1372 | Next, exit, and continue work differently. |
1373 | .Ip * 4 2 |
1374 | When in doubt, run the awk construct through a2p and see what it gives you. |
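For awk users, the points above about splitting input by hand can be sketched like this. The sample record and the variable names are made up for illustration, and $[ is assumed to be left at its default of 0.

```perl
# Hypothetical example: do by hand what awk does with FS=":".
$_ = "guest:x:101:10:Guest User:/u/guest:/bin/sh\n";
chop;                          # the newline is not stripped for you
@fields = split(/:/, $_);      # no automatic field splitting
$login = $fields[0];           # awk would call this $1
$nf    = $#fields + 1;         # awk's NF
print "$login has $nf fields\n";
```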
1375 | .PP |
1376 | Cerebral C programmers should take note of the following: |
1377 | .Ip * 4 2 |
1378 | Curly brackets are required on ifs and whiles. |
1379 | .Ip * 4 2 |
1380 | You should use \*(L"elsif\*(R" rather than \*(L"else if\*(R". |
1381 | .Ip * 4 2 |
1382 | Break and continue become last and next, respectively. |
1383 | .Ip * 4 2 |
1384 | There's no switch statement. |
1385 | .Ip * 4 2 |
1386 | Variables begin with $ or @ in perl. |
1387 | .Ip * 4 2 |
1388 | Printf does not implement *. |
1389 | .Ip * 4 2 |
1390 | Comments begin with #, not /*. |
1391 | .Ip * 4 2 |
1392 | You can't take the address of anything. |
1393 | .Ip * 4 2 |
8d063cd8 |
1394 | ARGV must be capitalized. |
1395 | .Ip * 4 2 |
378cc40b |
1396 | The \*(L"system\*(R" calls link, unlink, rename, etc. return nonzero for success, not 0. |
8d063cd8 |
1397 | .Ip * 4 2 |
1398 | Signal handlers deal with signal names, not numbers. |
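For C programmers, several of the points above fit into one short loop. This is only a sketch; the variable names and the list of numbers are made up for illustration.

```perl
# Hypothetical example: next and last in place of continue and break,
# elsif in place of "else if", and curly brackets on every branch.
$sum = 0;
foreach $n (1, 2, 3, 4, 5, 6) {
    if ($n % 2) {
        next;                  # C's continue: skip odd numbers
    }
    elsif ($n > 4) {
        last;                  # C's break: stop at the first even number above 4
    }
    $sum += $n;
}
print "sum = $sum\n";
```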
1399 | .PP |
1400 | Seasoned sed programmers should take note of the following: |
1401 | .Ip * 4 2 |
1402 | Backreferences in substitutions use $ rather than \e. |
1403 | .Ip * 4 2 |
1404 | The pattern matching metacharacters (, ), and | do not have backslashes in front. |
378cc40b |
1405 | .Ip * 4 2 |
1406 | The range operator is .. rather than comma. |
1407 | .PP |
1408 | Sharp shell programmers should take note of the following: |
1409 | .Ip * 4 2 |
1410 | The backtick operator does variable interpretation without regard to the |
1411 | presence of single quotes in the command. |
1412 | .Ip * 4 2 |
1413 | The backtick operator does no translation of the return value, unlike csh. |
1414 | .Ip * 4 2 |
1415 | Shells (especially csh) do several levels of substitution on each command line. |
1416 | Perl does substitution only in certain constructs such as double quotes, |
1417 | backticks, angle brackets and search patterns. |
1418 | .Ip * 4 2 |
1419 | Shells interpret scripts a little bit at a time. |
1420 | Perl compiles the whole program before executing it. |
1421 | .Ip * 4 2 |
1422 | The arguments are available via @ARGV, not $1, $2, etc. |
1423 | .Ip * 4 2 |
1424 | The environment is not automatically made available as variables. |
8d063cd8 |
1425 | .SH BUGS |
1426 | .PP |
378cc40b |
1427 | You can't currently dereference arrays or array elements inside a |
1428 | double-quoted string. |
1429 | You must assign them to a scalar and interpolate that. |
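The workaround can be sketched as follows. The array @days and the scalars $first and $msg are made up for illustration.

```perl
# Hypothetical example: copy an array element to a scalar first,
# then interpolate the scalar into the double-quoted string.
@days = ('Mon', 'Tue', 'Wed');
$first = $days[0];
$msg = "The first day is $first";
print $msg, "\n";
```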
8d063cd8 |
1430 | .PP |
1431 | Associative arrays really ought to be first class objects. |
1432 | .PP |
378cc40b |
1433 | Perl is at the mercy of the C compiler's definitions of various operations |
1434 | such as % and atof(). |
1435 | In particular, don't trust % on negative numbers. |
8d063cd8 |
1436 | .PP |
1437 | .I Perl |
1438 | actually stands for Pathologically Eclectic Rubbish Lister, but don't tell |
1439 | anyone I said that. |
1440 | .rn }` '' |