Commit | Line | Data |
8d063cd8 |
1 | .rn '' }` |
83b4785a |
2 | ''' $Header: perl.man.1,v 1.0.1.2 88/01/30 17:04:07 root Exp $ |
8d063cd8 |
3 | ''' |
4 | ''' $Log: perl.man.1,v $ |
83b4785a |
5 | ''' Revision 1.0.1.2 88/01/30 17:04:07 root |
6 | ''' patch 11: random cleanup |
7 | ''' |
8 | ''' Revision 1.0.1.1 88/01/28 10:24:44 root |
9 | ''' patch8: added eval operator. |
10 | ''' |
8d063cd8 |
11 | ''' Revision 1.0 87/12/18 16:18:16 root |
12 | ''' Initial revision |
13 | ''' |
14 | ''' |
15 | .de Sh |
16 | .br |
17 | .ne 5 |
18 | .PP |
19 | \fB\\$1\fR |
20 | .PP |
21 | .. |
22 | .de Sp |
23 | .if t .sp .5v |
24 | .if n .sp |
25 | .. |
26 | .de Ip |
27 | .br |
28 | .ie \\n.$>=3 .ne \\$3 |
29 | .el .ne 3 |
30 | .IP "\\$1" \\$2 |
31 | .. |
32 | ''' |
33 | ''' Set up \*(-- to give an unbreakable dash; |
34 | ''' string Tr holds user defined translation string. |
35 | ''' Bell System Logo is used as a dummy character. |
36 | ''' |
37 | .tr \(bs-|\(bv\*(Tr |
38 | .ie n \{\ |
39 | .ds -- \(bs- |
40 | .if (\n(.H=4u)&(1m=24u) .ds -- \(bs\h'-12u'\(bs\h'-12u'-\" diablo 10 pitch |
41 | .if (\n(.H=4u)&(1m=20u) .ds -- \(bs\h'-12u'\(bs\h'-8u'-\" diablo 12 pitch |
42 | .ds L" "" |
43 | .ds R" "" |
44 | .ds L' ' |
45 | .ds R' ' |
46 | 'br\} |
47 | .el\{\ |
48 | .ds -- \(em\| |
49 | .tr \*(Tr |
50 | .ds L" `` |
51 | .ds R" '' |
52 | .ds L' ` |
53 | .ds R' ' |
54 | 'br\} |
55 | .TH PERL 1 LOCAL |
56 | .SH NAME |
57 | perl - Practical Extraction and Report Language |
58 | .SH SYNOPSIS |
59 | .B perl [options] filename args |
60 | .SH DESCRIPTION |
61 | .I Perl |
62 | is an interpreted language optimized for scanning arbitrary text files, |
63 | extracting information from those text files, and printing reports based |
64 | on that information. |
65 | It's also a good language for many system management tasks. |
66 | The language is intended to be practical (easy to use, efficient, complete) |
67 | rather than beautiful (tiny, elegant, minimal). |
68 | It combines (in the author's opinion, anyway) some of the best features of C, |
69 | \fIsed\fR, \fIawk\fR, and \fIsh\fR, |
70 | so people familiar with those languages should have little difficulty with it. |
71 | (Language historians will also note some vestiges of \fIcsh\fR, Pascal, and |
72 | even BASIC-PLUS.) |
73 | Expression syntax corresponds quite closely to C expression syntax. |
74 | If you have a problem that would ordinarily use \fIsed\fR |
75 | or \fIawk\fR or \fIsh\fR, but it |
76 | exceeds their capabilities or must run a little faster, |
77 | and you don't want to write the silly thing in C, then |
78 | .I perl |
79 | may be for you. |
80 | There are also translators to turn your sed and awk scripts into perl scripts. |
81 | OK, enough hype. |
82 | .PP |
83 | Upon startup, |
84 | .I perl |
85 | looks for your script in one of the following places: |
86 | .Ip 1. 4 2 |
87 | Specified line by line via |
88 | .B \-e |
89 | switches on the command line. |
90 | .Ip 2. 4 2 |
91 | Contained in the file specified by the first filename on the command line. |
92 | (Note that systems supporting the #! notation invoke interpreters this way.) |
93 | .Ip 3. 4 2 |
94 | Passed in via standard input. |
95 | .PP |
96 | After locating your script, |
97 | .I perl |
98 | compiles it to an internal form. |
99 | If the script is syntactically correct, it is executed. |
100 | .Sh "Options" |
83b4785a |
101 | Note: on first reading this section may not make much sense to you. It's here |
8d063cd8 |
102 | at the front for easy reference. |
103 | .PP |
104 | A single-character option may be combined with the following option, if any. |
105 | This is particularly useful when invoking a script using the #! construct which |
106 | only allows one argument. Example: |
107 | .nf |
108 | |
109 | .ne 2 |
110 | #!/bin/perl -spi.bak # same as -s -p -i.bak |
111 | .\|.\|. |
112 | |
113 | .fi |
114 | Options include: |
115 | .TP 5 |
116 | .B \-D<number> |
117 | sets debugging flags. |
118 | To watch how it executes your script, use |
119 | .B \-D14. |
120 | (This only works if debugging is compiled into your |
121 | .IR perl .) |
122 | .TP 5 |
123 | .B \-e commandline |
124 | may be used to enter one line of script. |
125 | Multiple |
126 | .B \-e |
127 | commands may be given to build up a multi-line script. |
128 | If |
129 | .B \-e |
130 | is given, |
131 | .I perl |
132 | will not look for a script filename in the argument list. |
133 | .TP 5 |
134 | .B \-i<extension> |
135 | specifies that files processed by the <> construct are to be edited |
136 | in-place. |
137 | It does this by renaming the input file, opening the output file by the |
138 | same name, and selecting that output file as the default for print statements. |
139 | The extension, if supplied, is added to the name of the |
140 | old file to make a backup copy. |
141 | If no extension is supplied, no backup is made. |
142 | Saying \*(L"perl -p -i.bak -e "s/foo/bar/;" ... \*(R" is the same as using |
143 | the script: |
144 | .nf |
145 | |
146 | .ne 2 |
147 | #!/bin/perl -pi.bak |
148 | s/foo/bar/; |
149 | |
150 | which is equivalent to |
151 | |
152 | .ne 14 |
153 | #!/bin/perl |
154 | while (<>) { |
155 | if ($ARGV ne $oldargv) { |
156 | rename($ARGV,$ARGV . '.bak'); |
157 | open(ARGVOUT,">$ARGV"); |
158 | select(ARGVOUT); |
159 | $oldargv = $ARGV; |
160 | } |
161 | s/foo/bar/; |
162 | } |
163 | continue { |
164 | print; # this prints to original filename |
165 | } |
166 | select(stdout); |
167 | |
168 | .fi |
169 | except that the \-i form doesn't need to compare $ARGV to $oldargv to know when |
170 | the filename has changed. |
171 | It does, however, use ARGVOUT for the selected filehandle. |
172 | Note that stdout is restored as the default output filehandle after the loop. |
173 | .TP 5 |
174 | .B \-I<directory> |
175 | may be used in conjunction with |
176 | .B \-P |
177 | to tell the C preprocessor where to look for include files. |
178 | By default /usr/include and /usr/lib/perl are searched. |
179 | .TP 5 |
180 | .B \-n |
181 | causes |
182 | .I perl |
183 | to assume the following loop around your script, which makes it iterate |
184 | over filename arguments somewhat like \*(L"sed -n\*(R" or \fIawk\fR: |
185 | .nf |
186 | |
187 | .ne 3 |
188 | while (<>) { |
189 | ... # your script goes here |
190 | } |
191 | |
192 | .fi |
193 | Note that the lines are not printed by default. |
194 | See |
195 | .B \-p |
196 | to have lines printed. |
197 | .TP 5 |
198 | .B \-p |
199 | causes |
200 | .I perl |
201 | to assume the following loop around your script, which makes it iterate |
202 | over filename arguments somewhat like \fIsed\fR: |
203 | .nf |
204 | |
205 | .ne 5 |
206 | while (<>) { |
207 | ... # your script goes here |
208 | } continue { |
209 | print; |
210 | } |
211 | |
212 | .fi |
213 | Note that the lines are printed automatically. |
214 | To suppress printing use the |
215 | .B \-n |
216 | switch. |
83b4785a |
217 | A |
218 | .B \-p |
219 | overrides a |
220 | .B \-n |
221 | switch. |
8d063cd8 |
222 | .TP 5 |
223 | .B \-P |
224 | causes your script to be run through the C preprocessor before |
225 | compilation by |
226 | .I perl. |
227 | (Since both comments and cpp directives begin with the # character, |
228 | you should avoid starting comments with any words recognized |
229 | by the C preprocessor such as \*(L"if\*(R", \*(L"else\*(R" or \*(L"define\*(R".) |
230 | .TP 5 |
231 | .B \-s |
232 | enables some rudimentary switch parsing for switches on the command line |
83b4785a |
233 | after the script name but before any filename arguments (or before a --). |
234 | Any switch found there is removed from @ARGV and sets the corresponding variable in the |
8d063cd8 |
235 | .I perl |
236 | script. |
237 | The following script prints \*(L"true\*(R" if and only if the script is |
83b4785a |
238 | invoked with a -xyz switch. |
8d063cd8 |
239 | .nf |
240 | |
241 | .ne 2 |
242 | #!/bin/perl -s |
83b4785a |
243 | if ($xyz) { print "true\en"; } |
8d063cd8 |
244 | |
245 | .fi |
246 | .Sh "Data Types and Objects" |
247 | .PP |
248 | Perl has about two and a half data types: strings, arrays of strings, and |
249 | associative arrays. |
250 | Strings and arrays of strings are first class objects, for the most part, |
251 | in the sense that they can be used as a whole as values in an expression. |
252 | Associative arrays can only be accessed on an association by association basis; |
253 | they don't have a value as a whole (at least not yet). |
254 | .PP |
255 | Strings are interpreted numerically as appropriate. |
256 | A string is interpreted as TRUE in the boolean sense if it is not the null |
257 | string or 0. |
258 | Booleans returned by operators are 1 for true and '0' or '' (the null |
259 | string) for false. |
260 | .PP |
261 | References to string variables always begin with \*(L'$\*(R', even when referring |
262 | to a string that is part of an array. |
263 | Thus: |
264 | .nf |
265 | |
266 | .ne 3 |
267 | $days \h'|2i'# a simple string variable |
268 | $days[28] \h'|2i'# 29th element of array @days |
269 | $days{'Feb'}\h'|2i'# one value from an associative array |
270 | |
271 | but entire arrays are denoted by \*(L'@\*(R': |
272 | |
273 | @days \h'|2i'# ($days[0], $days[1],\|.\|.\|. $days[n]) |
274 | |
275 | .fi |
276 | .PP |
277 | Any of these four constructs may be assigned to (in compiler lingo, may serve |
278 | as an lvalue). |
279 | (Additionally, you may find the length of array @days by evaluating |
280 | \*(L"$#days\*(R", as in |
281 | .IR csh . |
282 | [Actually, it's not the length of the array, it's the subscript of the last element, since there is (ordinarily) a 0th element.]) |
283 | .PP |
284 | Every data type has its own namespace. |
285 | You can, without fear of conflict, use the same name for a string variable, |
286 | an array, an associative array, a filehandle, a subroutine name, and/or |
287 | a label. |
288 | Since variable and array references always start with \*(L'$\*(R' |
289 | or \*(L'@\*(R', the \*(L"reserved\*(R" words aren't in fact reserved |
290 | with respect to variable names. |
291 | (They ARE reserved with respect to labels and filehandles, however, which |
292 | don't have an initial special character.) |
293 | Case IS significant\*(--\*(L"FOO\*(R", \*(L"Foo\*(R" and \*(L"foo\*(R" are all |
294 | different names. |
295 | Names which start with a letter may also contain digits and underscores. |
296 | Names which do not start with a letter are limited to one character, |
297 | e.g. \*(L"$%\*(R" or \*(L"$$\*(R". |
298 | (Many one character names have a predefined significance to |
299 | .I perl. |
300 | More later.) |
301 | .PP |
302 | String literals are delimited by either single or double quotes. |
303 | They work much like shell quotes: |
304 | double-quoted string literals are subject to backslash and variable |
305 | substitution; single-quoted strings are not. |
306 | The usual backslash rules apply for making characters such as newline, tab, etc. |
307 | You can also embed newlines directly in your strings, i.e. they can end on |
308 | a different line than they begin. |
309 | This is nice, but if you forget your trailing quote, the error will not be |
310 | reported until perl finds another line containing the quote character, which |
311 | may be much further on in the script. |
312 | Variable substitution inside strings is limited (currently) to simple string variables. |
313 | The following code segment prints out \*(L"The price is $100.\*(R" |
314 | .nf |
315 | |
316 | .ne 2 |
317 | $Price = '$100';\h'|3.5i'# not interpreted |
318 | print "The price is $Price.\e\|n";\h'|3.5i'# interpreted |
319 | |
320 | .fi |
83b4785a |
321 | Note that you can put curly brackets around the identifier to delimit it |
322 | from following alphanumerics. |
8d063cd8 |
323 | .PP |
324 | Array literals are denoted by separating individual values by commas, and |
325 | enclosing the list in parentheses. |
326 | In a context not requiring an array value, the value of the array literal |
327 | is the value of the final element, as in the C comma operator. |
328 | For example, |
329 | .nf |
330 | |
83b4785a |
331 | .ne 4 |
8d063cd8 |
332 | @foo = ('cc', '\-E', $bar); |
333 | |
334 | assigns the entire array value to array foo, but |
335 | |
336 | $foo = ('cc', '\-E', $bar); |
337 | |
338 | .fi |
339 | assigns the value of variable bar to variable foo. |
340 | Array lists may be assigned to if and only if each element of the list |
341 | is an lvalue: |
342 | .nf |
343 | |
344 | ($a, $b, $c) = (1, 2, 3); |
345 | |
346 | ($map{'red'}, $map{'blue'}, $map{'green'}) = (0x00f, 0x0f0, 0xf00); |
347 | |
348 | .fi |
349 | .PP |
350 | Numeric literals are specified in any of the usual floating point or |
351 | integer formats. |
352 | .PP |
353 | There are several other pseudo-literals that you should know about. |
354 | If a string is enclosed by backticks (grave accents), it is interpreted as |
355 | a command, and the output of that command is the value of the pseudo-literal, |
356 | just like in any of the standard shells. |
357 | The command is executed each time the pseudo-literal is evaluated. |
358 | Unlike in \f2csh\f1, no interpretation is done on the |
359 | data\*(--newlines remain newlines. |
83b4785a |
360 | The status value of the command is returned in $?. |
8d063cd8 |
361 | .PP |
362 | Evaluating a filehandle in angle brackets yields the next line |
363 | from that file (newline included, so it's never false until EOF). |
364 | Ordinarily you must assign that value to a variable, |
365 | but there is one situation in which an automatic assignment happens. |
366 | If (and only if) the input symbol is the only thing inside the conditional of a |
367 | .I while |
368 | loop, the value is |
369 | automatically assigned to the variable \*(L"$_\*(R". |
370 | (This may seem like an odd thing to you, but you'll use the construct |
371 | in almost every |
372 | .I perl |
373 | script you write.) |
374 | Anyway, the following lines are equivalent to each other: |
375 | .nf |
376 | |
377 | .ne 3 |
378 | while ($_ = <stdin>) { |
379 | while (<stdin>) { |
380 | for (\|;\|<stdin>;\|) { |
381 | |
382 | .fi |
383 | The filehandles |
384 | .IR stdin , |
385 | .I stdout |
386 | and |
387 | .I stderr |
388 | are predefined. |
389 | Additional filehandles may be created with the |
390 | .I open |
391 | function. |
392 | .PP |
393 | The null filehandle <> is special and can be used to emulate the behavior of |
394 | \fIsed\fR and \fIawk\fR. |
395 | Input from <> comes either from standard input, or from each file listed on |
396 | the command line. |
397 | Here's how it works: the first time <> is evaluated, the ARGV array is checked, |
398 | and if it is null, $ARGV[0] is set to '-', which when opened gives you standard |
399 | input. |
400 | The ARGV array is then processed as a list of filenames. |
401 | The loop |
402 | .nf |
403 | |
404 | .ne 3 |
405 | while (<>) { |
406 | .\|.\|. # code for each line |
407 | } |
408 | |
409 | .ne 10 |
410 | is equivalent to |
411 | |
412 | unshift(@ARGV, '\-') \|if \|$#ARGV < $[; |
413 | while ($ARGV = shift) { |
414 | open(ARGV, $ARGV); |
415 | while (<ARGV>) { |
416 | .\|.\|. # code for each line |
417 | } |
418 | } |
419 | |
420 | .fi |
421 | except that it isn't as cumbersome to say. |
422 | It really does shift array ARGV and put the current filename into |
423 | variable ARGV. |
424 | It also uses filehandle ARGV internally. |
425 | You can modify @ARGV before the first <> as long as you leave the first |
426 | filename at the beginning of the array. |
83b4785a |
427 | Line numbers ($.) continue as if the input was one big happy file. |
8d063cd8 |
428 | .PP |
83b4785a |
429 | .ne 5 |
8d063cd8 |
430 | If you want to set @ARGV to your own list of files, go right ahead. |
431 | If you want to pass switches into your script, you can |
432 | put a loop on the front like this: |
433 | .nf |
434 | |
435 | .ne 10 |
436 | while ($_ = $ARGV[0], /\|^\-/\|) { |
437 | shift; |
438 | last if /\|^\-\|\-$\|/\|; |
439 | /\|^\-D\|(.*\|)/ \|&& \|($debug = $1); |
440 | /\|^\-v\|/ \|&& \|$verbose++; |
441 | .\|.\|. # other switches |
442 | } |
443 | while (<>) { |
444 | .\|.\|. # code for each line |
445 | } |
446 | |
447 | .fi |
448 | The <> symbol will return FALSE only once. |
449 | If you call it again after this it will assume you are processing another |
450 | @ARGV list, and if you haven't set @ARGV, will input from stdin. |
451 | .Sh "Syntax" |
452 | .PP |
453 | A |
454 | .I perl |
455 | script consists of a sequence of declarations and commands. |
456 | The only things that need to be declared in |
457 | .I perl |
458 | are report formats and subroutines. |
459 | See the sections below for more information on those declarations. |
460 | All objects are assumed to start with a null or 0 value. |
461 | The sequence of commands is executed just once, unlike in |
462 | .I sed |
463 | and |
464 | .I awk |
465 | scripts, where the sequence of commands is executed for each input line. |
466 | While this means that you must explicitly loop over the lines of your input file |
467 | (or files), it also means you have much more control over which files and which |
468 | lines you look at. |
469 | (Actually, I'm lying\*(--it is possible to do an implicit loop with either the |
470 | .B \-n |
471 | or |
472 | .B \-p |
473 | switch.) |
474 | .PP |
475 | A declaration can be put anywhere a command can, but has no effect on the |
476 | execution of the primary sequence of commands. |
477 | Typically all the declarations are put at the beginning or the end of the script. |
478 | .PP |
479 | .I Perl |
480 | is, for the most part, a free-form language. |
481 | (The only exception to this is format declarations, for fairly obvious reasons.) |
482 | Comments are indicated by the # character, and extend to the end of the line. |
483 | If you attempt to use /* */ C comments, it will be interpreted either as |
484 | division or pattern matching, depending on the context. |
485 | So don't do that. |
486 | .Sh "Compound statements" |
487 | In |
488 | .IR perl , |
489 | a sequence of commands may be treated as one command by enclosing it |
490 | in curly brackets. |
491 | We will call this a BLOCK. |
492 | .PP |
493 | The following compound commands may be used to control flow: |
494 | .nf |
495 | |
496 | .ne 4 |
497 | if (EXPR) BLOCK |
498 | if (EXPR) BLOCK else BLOCK |
499 | if (EXPR) BLOCK elsif (EXPR) BLOCK ... else BLOCK |
500 | LABEL while (EXPR) BLOCK |
501 | LABEL while (EXPR) BLOCK continue BLOCK |
502 | LABEL for (EXPR; EXPR; EXPR) BLOCK |
503 | LABEL BLOCK continue BLOCK |
504 | |
505 | .fi |
83b4785a |
506 | Note that, unlike C and Pascal, these are defined in terms of BLOCKs, not |
8d063cd8 |
507 | statements. |
508 | This means that the curly brackets are \fIrequired\fR\*(--no dangling statements allowed. |
509 | If you want to write conditionals without curly brackets there are several |
510 | other ways to do it. |
511 | The following all do the same thing: |
512 | .nf |
513 | |
514 | .ne 5 |
515 | if (!open(foo)) { die "Can't open $foo"; } |
516 | die "Can't open $foo" unless open(foo); |
517 | open(foo) || die "Can't open $foo"; # foo or bust! |
518 | open(foo) ? 'hi mom' : die "Can't open $foo"; |
83b4785a |
519 | # a bit exotic, that last one |
8d063cd8 |
520 | |
521 | .fi |
522 | .PP |
523 | The |
524 | .I if |
525 | statement is straightforward. |
526 | Since BLOCKs are always bounded by curly brackets, there is never any |
527 | ambiguity about which |
528 | .I if |
529 | an |
530 | .I else |
531 | goes with. |
532 | If you use |
533 | .I unless |
534 | in place of |
535 | .IR if , |
536 | the sense of the test is reversed. |
537 | .PP |
538 | The |
539 | .I while |
540 | statement executes the block as long as the expression is true |
541 | (does not evaluate to the null string or 0). |
542 | The LABEL is optional, and if present, consists of an identifier followed by |
543 | a colon. |
544 | The LABEL identifies the loop for the loop control statements |
545 | .IR next , |
546 | .I last |
547 | and |
548 | .I redo |
549 | (see below). |
550 | If there is a |
551 | .I continue |
552 | BLOCK, it is always executed just before |
553 | the conditional is about to be evaluated again, similarly to the third part |
554 | of a |
555 | .I for |
556 | loop in C. |
557 | Thus it can be used to increment a loop variable, even when the loop has |
558 | been continued via the |
559 | .I next |
560 | statement (similar to the C \*(L"continue\*(R" statement). |
561 | .PP |
562 | If the word |
563 | .I while |
564 | is replaced by the word |
565 | .IR until , |
566 | the sense of the test is reversed, but the conditional is still tested before |
567 | the first iteration. |
568 | .PP |
569 | In either the |
570 | .I if |
571 | or the |
572 | .I while |
573 | statement, you may replace \*(L"(EXPR)\*(R" with a BLOCK, and the conditional |
574 | is true if the value of the last command in that block is true. |
575 | .PP |
576 | The |
577 | .I for |
578 | loop works exactly like the corresponding |
579 | .I while |
580 | loop: |
581 | .nf |
582 | |
583 | .ne 12 |
584 | for ($i = 1; $i < 10; $i++) { |
585 | .\|.\|. |
586 | } |
587 | |
588 | is the same as |
589 | |
590 | $i = 1; |
591 | while ($i < 10) { |
592 | .\|.\|. |
593 | } continue { |
594 | $i++; |
595 | } |
596 | .fi |
597 | .PP |
598 | The BLOCK by itself (labeled or not) is equivalent to a loop that executes |
599 | once. |
600 | Thus you can use any of the loop control statements in it to leave or |
601 | restart the block. |
602 | The |
603 | .I continue |
604 | block is optional. |
605 | This construct is particularly nice for doing case structures. |
606 | .nf |
607 | |
608 | .ne 6 |
609 | foo: { |
610 | if (/abc/) { $abc = 1; last foo; } |
611 | if (/def/) { $def = 1; last foo; } |
612 | if (/xyz/) { $xyz = 1; last foo; } |
613 | $nothing = 1; |
614 | } |
615 | |
616 | .fi |
617 | .Sh "Simple statements" |
618 | The only kind of simple statement is an expression evaluated for its side |
619 | effects. |
620 | Every expression (simple statement) must be terminated with a semicolon. |
621 | Note that this is like C, but unlike Pascal (and |
622 | .IR awk ). |
623 | .PP |
624 | Any simple statement may optionally be followed by a |
625 | single modifier, just before the terminating semicolon. |
626 | The possible modifiers are: |
627 | .nf |
628 | |
629 | .ne 4 |
630 | if EXPR |
631 | unless EXPR |
632 | while EXPR |
633 | until EXPR |
634 | |
635 | .fi |
636 | The |
637 | .I if |
638 | and |
639 | .I unless |
640 | modifiers have the expected semantics. |
641 | The |
642 | .I while |
643 | and |
644 | .I until |
645 | modifiers also have the expected semantics (conditional evaluated first), |
646 | except when applied to a do-BLOCK command, |
647 | in which case the block executes once before the conditional is evaluated. |
648 | This is so that you can write loops like: |
649 | .nf |
650 | |
651 | .ne 4 |
652 | do { |
653 | $_ = <stdin>; |
654 | .\|.\|. |
655 | } until $_ \|eq \|".\|\e\|n"; |
656 | |
657 | .fi |
658 | (See the |
659 | .I do |
660 | operator below. Note also that the loop control commands described later will |
83b4785a |
661 | NOT work in this construct, since modifiers don't take loop labels. |
8d063cd8 |
662 | Sorry.) |
663 | .Sh "Expressions" |
664 | Since |
665 | .I perl |
666 | expressions work almost exactly like C expressions, only the differences |
667 | will be mentioned here. |
668 | .PP |
669 | Here's what |
670 | .I perl |
671 | has that C doesn't: |
672 | .Ip (\|) 8 3 |
673 | The null list, used to initialize an array to null. |
674 | .Ip . 8 |
675 | Concatenation of two strings. |
676 | .Ip .= 8 |
677 | The corresponding assignment operator. |
678 | .Ip eq 8 |
679 | String equality (== is numeric equality). |
680 | For a mnemonic just think of \*(L"eq\*(R" as a string. |
681 | (If you are used to the |
682 | .I awk |
683 | behavior of using == for either string or numeric equality |
684 | based on the current form of the comparands, beware! |
685 | You must be explicit here.) |
686 | .Ip ne 8 |
687 | String inequality (!= is numeric inequality). |
688 | .Ip lt 8 |
689 | String less than. |
690 | .Ip gt 8 |
691 | String greater than. |
692 | .Ip le 8 |
693 | String less than or equal. |
694 | .Ip ge 8 |
695 | String greater than or equal. |
696 | .Ip =~ 8 2 |
697 | Certain operations search or modify the string \*(L"$_\*(R" by default. |
698 | This operator makes that kind of operation work on some other string. |
699 | The right argument is a search pattern, substitution, or translation. |
700 | The left argument is what is supposed to be searched, substituted, or |
701 | translated instead of the default \*(L"$_\*(R". |
702 | The return value indicates the success of the operation. |
703 | (If the right argument is an expression other than a search pattern, |
704 | substitution, or translation, it is interpreted as a search pattern |
705 | at run time. |
706 | This is less efficient than an explicit search, since the pattern must |
707 | be compiled every time the expression is evaluated.) |
708 | The precedence of this operator is lower than unary minus and autoincrement/decrement, but higher than everything else. |
709 | .Ip !~ 8 |
710 | Just like =~ except the return value is negated. |
711 | .Ip x 8 |
712 | The repetition operator. |
713 | Returns a string consisting of the left operand repeated the |
714 | number of times specified by the right operand. |
715 | .nf |
716 | |
717 | print '-' x 80; # print row of dashes |
718 | print '-' x80; # illegal, x80 is identifier |
719 | |
720 | print "\et" x ($tab/8), ' ' x ($tab%8); # tab over |
721 | |
722 | .fi |
723 | .Ip x= 8 |
724 | The corresponding assignment operator. |
725 | .Ip .. 8 |
726 | The range operator, which is bistable. |
727 | It is false as long as its left argument is false. |
728 | Once the left argument is true, it stays true until the right argument is true, |
729 | AFTER which it becomes false again. |
730 | (It doesn't become false till the next time it's evaluated. |
731 | It can become false on the same evaluation it became true, but it still returns |
732 | true once.) |
733 | The .. operator is primarily intended for doing line number ranges after |
734 | the fashion of \fIsed\fR or \fIawk\fR. |
735 | The precedence is a little lower than || and &&. |
736 | The value returned is either the null string for false, or a sequence number |
737 | (beginning with 1) for true. |
738 | The sequence number is reset for each range encountered. |
739 | The final sequence number in a range has the string 'E0' appended to it, which |
740 | doesn't affect its numeric value, but gives you something to search for if you |
741 | want to exclude the endpoint. |
742 | You can exclude the beginning point by waiting for the sequence number to be |
743 | greater than 1. |
744 | If either argument to .. is static, that argument is implicitly compared to |
745 | the $. variable, the current line number. |
746 | Examples: |
747 | .nf |
748 | |
749 | .ne 5 |
750 | if (101 .. 200) { print; } # print 2nd hundred lines |
751 | |
752 | next line if (1 .. /^$/); # skip header lines |
753 | |
754 | s/^/> / if (/^$/ .. eof()); # quote body |
755 | |
756 | .fi |
757 | .PP |
758 | Here is what C has that |
759 | .I perl |
760 | doesn't: |
761 | .Ip "unary &" 12 |
762 | Address-of operator. |
763 | .Ip "unary *" 12 |
764 | Dereference-address operator. |
765 | .PP |
766 | Like C, |
767 | .I perl |
768 | does a certain amount of expression evaluation at compile time, whenever |
769 | it determines that all of the arguments to an operator are static and have |
770 | no side effects. |
771 | In particular, string concatenation happens at compile time between literals that don't do variable substitution. |
772 | Backslash interpretation also happens at compile time. |
773 | You can say |
774 | .nf |
775 | |
776 | .ne 2 |
777 | 'Now is the time for all' . "\|\e\|n" . |
778 | 'good men to come to.' |
779 | |
780 | .fi |
781 | and this all reduces to one string internally. |
782 | .PP |
783 | Along with the literals and variables mentioned earlier, |
784 | the following operations can serve as terms in an expression: |
785 | .Ip "/PATTERN/" 8 4 |
786 | Searches a string for a pattern, and returns true (1) or false (''). |
787 | If no string is specified via the =~ or !~ operator, |
788 | the $_ string is searched. |
789 | (The string specified with =~ need not be an lvalue\*(--it may be the result of an expression evaluation, but remember the =~ binds rather tightly.) |
790 | See also the section on regular expressions. |
791 | .Sp |
792 | If you prepend an `m' you can use any pair of characters as delimiters. |
793 | This is particularly useful for matching Unix path names that contain `/'. |
794 | .Sp |
795 | Examples: |
796 | .nf |
797 | |
798 | .ne 4 |
799 | open(tty, '/dev/tty'); |
800 | <tty> \|=~ \|/\|^[Yy]\|/ \|&& \|do foo(\|); # do foo if desired |
801 | |
802 | if (/Version: \|*\|([0-9.]*\|)\|/\|) { $version = $1; } |
803 | |
804 | next if m#^/usr/spool/uucp#; |
805 | |
806 | .fi |
807 | .Ip "?PATTERN?" 8 4 |
808 | This is just like the /pattern/ search, except that it matches only once between |
809 | calls to the |
810 | .I reset |
811 | operator. |
812 | This is a useful optimization when you only want to see the first occurrence of |
813 | something in each of a set of files, for instance. |
814 | .Ip "chdir EXPR" 8 2 |
815 | Changes the working directory to EXPR, if possible. |
816 | Returns 1 upon success, 0 otherwise. |
817 | See example under die(). |
818 | .Ip "chmod LIST" 8 2 |
819 | Changes the permissions of a list of files. |
820 | The first element of the list must be the numerical mode. |
821 | LIST may be an array, in which case you may wish to use the unshift() |
822 | command to put the mode on the front of the array. |
823 | Returns the number of files successfully changed. |
824 | Note: in order to use the value you must put the whole thing in parentheses. |
825 | .nf |
826 | |
827 | $cnt = (chmod 0755,'foo','bar'); |
828 | |
829 | .fi |
830 | .Ip "chop(VARIABLE)" 8 5 |
831 | .Ip "chop" 8 |
832 | Chops off the last character of a string and returns it. |
833 | It's used primarily to remove the newline from the end of an input record, |
834 | but is much more efficient than s/\en// because it neither scans nor copies |
835 | the string. |
836 | If VARIABLE is omitted, chops $_. |
837 | Example: |
838 | .nf |
839 | |
840 | .ne 5 |
841 | while (<>) { |
842 | chop; # avoid \en on last field |
843 | @array = split(/:/); |
844 | .\|.\|. |
845 | } |
846 | |
847 | .fi |
848 | .Ip "chown LIST" 8 2 |
849 | Changes the owner (and group) of a list of files. |
850 | LIST may be an array. |
851 | The first two elements of the list must be the NUMERICAL uid and gid, in that order. |
852 | Returns the number of files successfully changed. |
853 | Note: in order to use the value you must put the whole thing in parentheses. |
854 | .nf |
855 | |
856 | $cnt = (chown $uid,$gid,'foo'); |
857 | |
858 | .fi |
83b4785a |
859 | .ne 18 |
8d063cd8 |
860 | Here's an example of looking up non-numeric uids: |
861 | .nf |
862 | |
863 | print "User: "; |
864 | $user = <stdin>; chop($user); # drop the newline so the lookup matches |
865 | open(pass,'/etc/passwd') || die "Can't open passwd"; |
866 | while (<pass>) { |
867 | ($login,$pass,$uid,$gid) = split(/:/); |
868 | $uid{$login} = $uid; |
869 | $gid{$login} = $gid; |
870 | } |
871 | @ary = ('foo','bar','bie','doll'); |
872 | if ($uid{$user} eq '') { |
873 | die "$user not in passwd file"; |
874 | } |
875 | else { |
876 | unshift(@ary,$uid{$user},$gid{$user}); |
877 | chown @ary; |
878 | } |
879 | |
880 | .fi |
881 | .Ip "close(FILEHANDLE)" 8 5 |
882 | .Ip "close FILEHANDLE" 8 |
883 | Closes the file or pipe associated with the file handle. |
884 | You don't have to close FILEHANDLE if you are immediately going to |
885 | do another open on it, since open will close it for you. |
886 | (See |
887 | .IR open .) |
888 | However, an explicit close on an input file resets the line counter ($.), while |
889 | the implicit close done by |
890 | .I open |
891 | does not. |
892 | Also, closing a pipe will wait for the process executing on the pipe to complete, |
893 | in case you want to look at the output of the pipe afterwards. |
894 | Example: |
895 | .nf |
896 | |
897 | .ne 4 |
898 | open(output,'|sort >foo'); # pipe to sort |
899 | ... # print stuff to output |
900 | close(output); # wait for sort to finish |
901 | open(input,'foo'); # get sort's results |
902 | |
903 | .fi |
904 | .Ip "crypt(PLAINTEXT,SALT)" 8 6 |
905 | Encrypts a string exactly like the crypt() function in the C library. |
906 | Useful for checking the password file for lousy passwords. |
907 | Only the guys wearing white hats should do this. |
908 | .Ip "die EXPR" 8 6 |
909 | Prints the value of EXPR to stderr and exits with a non-zero status. |
910 | Equivalent examples: |
911 | .nf |
912 | |
913 | .ne 3 |
914 | die "Can't cd to spool." unless chdir '/usr/spool/news'; |
915 | |
916 | (chdir '/usr/spool/news') || die "Can't cd to spool."; |
917 | |
918 | .fi |
919 | Note that the parentheses in the second example are necessary due to precedence. |
920 | See also |
921 | .IR exit . |
922 | .Ip "do BLOCK" 8 4 |
923 | Returns the value of the last command in the sequence of commands indicated |
924 | by BLOCK. |
925 | When modified by a loop modifier, executes the BLOCK once before testing the |
926 | loop condition. |
927 | (On other statements the loop modifiers test the conditional first.) |
928 | .Ip "do SUBROUTINE (LIST)" 8 3 |
929 | Executes a SUBROUTINE declared by a |
930 | .I sub |
931 | declaration, and returns the value |
932 | of the last expression evaluated in SUBROUTINE. |
933 | (See the section on subroutines later on.) |
934 | .Ip "each(ASSOC_ARRAY)" 8 6 |
935 | Returns a 2-element array consisting of the key and value for the next |
936 | entry of an associative array, so that you can iterate over it. |
937 | Entries are returned in an apparently random order. |
938 | When the array is entirely read, a null array is returned (which when |
939 | assigned produces a FALSE (0) value). |
940 | The next call to each() after that will start iterating again. |
941 | The iterator can be reset only by reading all the elements from the array. |
83b4785a |
942 | You should not modify the array while iterating over it. |
8d063cd8 |
943 | The following prints out your environment like the printenv program, only |
944 | in a different order: |
945 | .nf |
946 | |
947 | .ne 3 |
948 | while (($key,$value) = each(ENV)) { |
949 | print "$key=$value\en"; |
950 | } |
951 | |
952 | .fi |
953 | See also keys() and values(). |
954 | .Ip "eof(FILEHANDLE)" 8 8 |
955 | .Ip "eof" 8 |
956 | Returns 1 if the next read on FILEHANDLE will return end of file, or if |
957 | FILEHANDLE is not open. |
958 | If (FILEHANDLE) is omitted, the eof status is returned for the last file read. |
959 | The null filehandle may be used to indicate the pseudo file formed of the |
960 | files listed on the command line, i.e. eof() is reasonable to use inside |
961 | a while (<>) loop. |
962 | Example: |
963 | .nf |
964 | |
965 | .ne 7 |
966 | # insert dashes just before last line |
967 | while (<>) { |
968 | if (eof()) { |
969 | print "--------------\en"; |
970 | } |
971 | print; |
972 | } |
973 | |
974 | .fi |
83b4785a |
975 | .Ip "eval EXPR" 8 6 |
976 | EXPR is parsed and executed as if it were a little perl program. |
977 | It is executed in the context of the current perl program, so that |
978 | any variable settings, subroutine or format definitions remain afterwards. |
979 | The value returned is the value of the last expression evaluated, just |
980 | as with subroutines. |
981 | If there is a syntax error or runtime error, a null string is returned by |
982 | eval, and $@ is set to the error message. |
983 | If there was no error, $@ is null. |
8d063cd8 |
984 | .Ip "exec LIST" 8 6 |
985 | If there is more than one argument in LIST, |
986 | calls execvp() with the arguments in LIST. |
987 | If there is only one argument, the argument is checked for shell metacharacters. |
988 | If there are any, the entire argument is passed to /bin/sh -c for parsing. |
989 | If there are none, the argument is split into words and passed directly to |
990 | execvp(), which is more efficient. |
991 | Note: neither exec nor system flushes your output buffer, so you may need to |
992 | set $| to avoid lost output. |
993 | .Ip "exit EXPR" 8 6 |
994 | Evaluates EXPR and exits immediately with that value. |
995 | Example: |
996 | .nf |
997 | |
998 | .ne 2 |
999 | $ans = <stdin>; |
1000 | exit 0 \|if \|$ans \|=~ \|/\|^[Xx]\|/\|; |
1001 | |
1002 | .fi |
1003 | See also |
1004 | .IR die . |
1005 | .Ip "exp(EXPR)" 8 3 |
1006 | Returns e to the power of EXPR. |
1007 | .Ip "fork" 8 4 |
1008 | Does a fork() call. |
1009 | Returns the child pid to the parent process and 0 to the child process. |
1010 | Note: unflushed buffers remain unflushed in both processes, which means |
1011 | you may need to set $| to avoid duplicate output. |
1012 | .Ip "gmtime(EXPR)" 8 4 |
1013 | Converts a time as returned by the time function to a 9-element array with |
1014 | the time broken down for the Greenwich timezone (GMT). |
1015 | Typically used as follows: |
1016 | .nf |
1017 | |
1018 | .ne 3 |
1019 | ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) |
1020 | = gmtime(time); |
1021 | |
1022 | .fi |
1023 | All array elements are numeric. |
1024 | ''' End of part 1 |