Commit | Line | Data |
8d063cd8 |
1 | .rn '' }` |
a687059c |
2 | ''' $Header: perl.man.1,v 3.0 89/10/18 15:21:29 lwall Locked $ |
8d063cd8 |
3 | ''' |
4 | ''' $Log: perl.man.1,v $ |
a687059c |
5 | ''' Revision 3.0 89/10/18 15:21:29 lwall |
6 | ''' 3.0 baseline |
8d063cd8 |
7 | ''' |
8 | ''' |
9 | .de Sh |
10 | .br |
11 | .ne 5 |
12 | .PP |
13 | \fB\\$1\fR |
14 | .PP |
15 | .. |
16 | .de Sp |
17 | .if t .sp .5v |
18 | .if n .sp |
19 | .. |
20 | .de Ip |
21 | .br |
22 | .ie \\n.$>=3 .ne \\$3 |
23 | .el .ne 3 |
24 | .IP "\\$1" \\$2 |
25 | .. |
26 | ''' |
27 | ''' Set up \*(-- to give an unbreakable dash; |
28 | ''' string Tr holds user defined translation string. |
29 | ''' Bell System Logo is used as a dummy character. |
30 | ''' |
378cc40b |
31 | .tr \(*W-|\(bv\*(Tr |
8d063cd8 |
32 | .ie n \{\ |
378cc40b |
33 | .ds -- \(*W- |
34 | .if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch |
35 | .if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch |
8d063cd8 |
36 | .ds L" "" |
37 | .ds R" "" |
38 | .ds L' ' |
39 | .ds R' ' |
40 | 'br\} |
41 | .el\{\ |
42 | .ds -- \(em\| |
43 | .tr \*(Tr |
44 | .ds L" `` |
45 | .ds R" '' |
46 | .ds L' ` |
47 | .ds R' ' |
48 | 'br\} |
a687059c |
49 | .TH PERL 1 "\*(RP" |
50 | .UC |
8d063cd8 |
51 | .SH NAME |
a687059c |
52 | perl \- Practical Extraction and Report Language |
8d063cd8 |
53 | .SH SYNOPSIS |
a687059c |
54 | .B perl |
55 | [options] filename args |
8d063cd8 |
56 | .SH DESCRIPTION |
57 | .I Perl |
a687059c |
58 | is an interpreted language optimized for scanning arbitrary text files, |
8d063cd8 |
59 | extracting information from those text files, and printing reports based |
60 | on that information. |
61 | It's also a good language for many system management tasks. |
62 | The language is intended to be practical (easy to use, efficient, complete) |
63 | rather than beautiful (tiny, elegant, minimal). |
64 | It combines (in the author's opinion, anyway) some of the best features of C, |
65 | \fIsed\fR, \fIawk\fR, and \fIsh\fR, |
66 | so people familiar with those languages should have little difficulty with it. |
67 | (Language historians will also note some vestiges of \fIcsh\fR, Pascal, and |
68 | even BASIC-PLUS.) |
69 | Expression syntax corresponds quite closely to C expression syntax. |
a687059c |
70 | Unlike most Unix utilities, |
71 | .I perl |
72 | does not arbitrarily limit the size of your data\*(--if you've got |
73 | the memory, |
74 | .I perl |
75 | can slurp in your whole file as a single string. |
76 | Recursion is of unlimited depth. |
77 | And the hash tables used by associative arrays grow as necessary to prevent |
78 | degraded performance. |
79 | .I Perl |
80 | uses sophisticated pattern matching techniques to scan large amounts of |
81 | data very quickly. |
82 | Although optimized for scanning text, |
83 | .I perl |
84 | can also deal with binary data, and can make dbm files look like associative |
85 | arrays (where dbm is available). |
86 | Setuid |
87 | .I perl |
88 | scripts are safer than C programs |
89 | through a dataflow tracing mechanism which prevents many stupid security holes. |
8d063cd8 |
90 | If you have a problem that would ordinarily use \fIsed\fR |
91 | or \fIawk\fR or \fIsh\fR, but it |
92 | exceeds their capabilities or must run a little faster, |
93 | and you don't want to write the silly thing in C, then |
94 | .I perl |
95 | may be for you. |
a687059c |
96 | There are also translators to turn your |
97 | .I sed |
98 | and |
99 | .I awk |
100 | scripts into |
101 | .I perl |
102 | scripts. |
8d063cd8 |
103 | OK, enough hype. |
104 | .PP |
105 | Upon startup, |
106 | .I perl |
107 | looks for your script in one of the following places: |
108 | .Ip 1. 4 2 |
109 | Specified line by line via |
110 | .B \-e |
111 | switches on the command line. |
112 | .Ip 2. 4 2 |
113 | Contained in the file specified by the first filename on the command line. |
114 | (Note that systems supporting the #! notation invoke interpreters this way.) |
115 | .Ip 3. 4 2 |
a687059c |
116 | Passed in implicitly via standard input. |
378cc40b |
117 | This only works if there are no filename arguments\*(--to pass |
a687059c |
118 | arguments to a |
119 | .I stdin |
120 | script you must explicitly specify a \- for the script name. |
8d063cd8 |
121 | .PP |
122 | After locating your script, |
123 | .I perl |
124 | compiles it to an internal form. |
125 | If the script is syntactically correct, it is executed. |
126 | .Sh "Options" |
83b4785a |
127 | Note: on first reading this section may not make much sense to you. It's here |
8d063cd8 |
128 | at the front for easy reference. |
129 | .PP |
130 | A single-character option may be combined with the following option, if any. |
131 | This is particularly useful when invoking a script using the #! construct which |
132 | only allows one argument. Example: |
133 | .nf |
134 | |
135 | .ne 2 |
a687059c |
136 | #!/usr/bin/perl \-spi.bak # same as \-s \-p \-i.bak |
8d063cd8 |
137 | .\|.\|. |
138 | |
139 | .fi |
140 | Options include: |
141 | .TP 5 |
378cc40b |
142 | .B \-a |
a687059c |
143 | turns on autosplit mode when used with a |
144 | .B \-n |
145 | or |
146 | .BR \-p . |
378cc40b |
147 | An implicit split command to the @F array |
148 | is done as the first thing inside the implicit while loop produced by |
a687059c |
149 | the |
150 | .B \-n |
151 | or |
152 | .BR \-p . |
378cc40b |
153 | .nf |
154 | |
a687059c |
155 | perl \-ane \'print pop(@F), "\en";\' |
378cc40b |
156 | |
157 | is equivalent to |
158 | |
159 | while (<>) { |
a687059c |
160 | @F = split(\' \'); |
161 | print pop(@F), "\en"; |
378cc40b |
162 | } |
163 | |
164 | .fi |
165 | .TP 5 |
a687059c |
166 | .BI \-d |
167 | runs the script under the perl debugger. |
168 | See the section on Debugging. |
169 | .TP 5 |
170 | .BI \-D number |
8d063cd8 |
171 | sets debugging flags. |
172 | To watch how it executes your script, use |
a687059c |
173 | .BR \-D14 . |
8d063cd8 |
174 | (This only works if debugging is compiled into your |
175 | .IR perl .) |
a687059c |
176 | Another nice value is \-D1024, which lists your compiled syntax tree. |
177 | And \-D512 displays compiled regular expressions. |
8d063cd8 |
178 | .TP 5 |
a687059c |
179 | .BI \-e " commandline" |
8d063cd8 |
180 | may be used to enter one line of script. |
181 | Multiple |
182 | .B \-e |
183 | commands may be given to build up a multi-line script. |
184 | If |
185 | .B \-e |
186 | is given, |
187 | .I perl |
188 | will not look for a script filename in the argument list. |
189 | .TP 5 |
a687059c |
190 | .BI \-i extension |
8d063cd8 |
191 | specifies that files processed by the <> construct are to be edited |
192 | in-place. |
193 | It does this by renaming the input file, opening the output file by the |
194 | same name, and selecting that output file as the default for print statements. |
195 | The extension, if supplied, is added to the name of the |
196 | old file to make a backup copy. |
197 | If no extension is supplied, no backup is made. |
a687059c |
198 | Saying \*(L"perl \-p \-i.bak \-e "s/foo/bar/;" .\|.\|. \*(R" is the same as using |
8d063cd8 |
199 | the script: |
200 | .nf |
201 | |
202 | .ne 2 |
a687059c |
203 | #!/usr/bin/perl \-pi.bak |
8d063cd8 |
204 | s/foo/bar/; |
205 | |
206 | which is equivalent to |
207 | |
208 | .ne 14 |
378cc40b |
209 | #!/usr/bin/perl |
8d063cd8 |
210 | while (<>) { |
211 | if ($ARGV ne $oldargv) { |
a687059c |
212 | rename($ARGV, $ARGV . \'.bak\'); |
213 | open(ARGVOUT, ">$ARGV"); |
8d063cd8 |
214 | select(ARGVOUT); |
215 | $oldargv = $ARGV; |
216 | } |
217 | s/foo/bar/; |
218 | } |
219 | continue { |
220 | print; # this prints to original filename |
221 | } |
a687059c |
222 | select(STDOUT); |
8d063cd8 |
223 | |
224 | .fi |
a687059c |
225 | except that the |
226 | .B \-i |
227 | form doesn't need to compare $ARGV to $oldargv to know when |
8d063cd8 |
228 | the filename has changed. |
229 | It does, however, use ARGVOUT for the selected filehandle. |
a687059c |
230 | Note that |
231 | .I STDOUT |
232 | is restored as the default output filehandle after the loop. |
378cc40b |
233 | .Sp |
234 | You can use eof to locate the end of each input file, in case you want |
235 | to append to each file, or reset line numbering (see example under eof). |
8d063cd8 |
236 | .TP 5 |
a687059c |
237 | .BI \-I directory |
8d063cd8 |
238 | may be used in conjunction with |
239 | .B \-P |
240 | to tell the C preprocessor where to look for include files. |
241 | By default /usr/include and /usr/lib/perl are searched. |
242 | .TP 5 |
243 | .B \-n |
244 | causes |
245 | .I perl |
246 | to assume the following loop around your script, which makes it iterate |
a687059c |
247 | over filename arguments somewhat like \*(L"sed \-n\*(R" or \fIawk\fR: |
8d063cd8 |
248 | .nf |
249 | |
250 | .ne 3 |
251 | while (<>) { |
378cc40b |
252 | .\|.\|. # your script goes here |
8d063cd8 |
253 | } |
254 | |
255 | .fi |
256 | Note that the lines are not printed by default. |
257 | See |
258 | .B \-p |
259 | to have lines printed. |
378cc40b |
260 | Here is an efficient way to delete all files older than a week: |
261 | .nf |
262 | |
a687059c |
263 | find . \-mtime +7 \-print | perl \-ne \'chop;unlink;\' |
378cc40b |
264 | |
265 | .fi |
a687059c |
266 | This is faster than using the \-exec switch of find because you don't have to |
378cc40b |
267 | start a process on every filename found. |
8d063cd8 |
268 | .TP 5 |
269 | .B \-p |
270 | causes |
271 | .I perl |
272 | to assume the following loop around your script, which makes it iterate |
273 | over filename arguments somewhat like \fIsed\fR: |
274 | .nf |
275 | |
276 | .ne 5 |
277 | while (<>) { |
378cc40b |
278 | .\|.\|. # your script goes here |
8d063cd8 |
279 | } continue { |
280 | print; |
281 | } |
282 | |
283 | .fi |
284 | Note that the lines are printed automatically. |
285 | To suppress printing use the |
286 | .B \-n |
287 | switch. |
83b4785a |
288 | A |
289 | .B \-p |
290 | overrides a |
291 | .B \-n |
292 | switch. |
8d063cd8 |
293 | .TP 5 |
294 | .B \-P |
295 | causes your script to be run through the C preprocessor before |
296 | compilation by |
a687059c |
297 | .IR perl . |
8d063cd8 |
298 | (Since both comments and cpp directives begin with the # character, |
299 | you should avoid starting comments with any words recognized |
300 | by the C preprocessor such as \*(L"if\*(R", \*(L"else\*(R" or \*(L"define\*(R".) |
301 | .TP 5 |
302 | .B \-s |
303 | enables some rudimentary switch parsing for switches on the command line |
a687059c |
304 | after the script name but before any filename arguments (or before a \-\|\-). |
83b4785a |
305 | Any switch found there is removed from @ARGV and sets the corresponding variable in the |
8d063cd8 |
306 | .I perl |
307 | script. |
308 | The following script prints \*(L"true\*(R" if and only if the script is |
a687059c |
309 | invoked with a \-xyz switch. |
8d063cd8 |
310 | .nf |
311 | |
312 | .ne 2 |
a687059c |
313 | #!/usr/bin/perl \-s |
83b4785a |
314 | if ($xyz) { print "true\en"; } |
8d063cd8 |
315 | |
316 | .fi |
378cc40b |
317 | .TP 5 |
318 | .B \-S |
a687059c |
319 | makes |
320 | .I perl |
321 | use the PATH environment variable to search for the script |
378cc40b |
322 | (unless the name of the script starts with a slash). |
323 | Typically this is used to emulate #! startup on machines that don't |
324 | support #!, in the following manner: |
325 | .nf |
326 | |
327 | #!/usr/bin/perl |
a687059c |
328 | eval "exec /usr/bin/perl \-S $0 $*" |
378cc40b |
329 | if $running_under_some_shell; |
330 | |
331 | .fi |
332 | The system ignores the first line and feeds the script to /bin/sh, |
a687059c |
333 | which proceeds to try to execute the |
334 | .I perl |
335 | script as a shell script. |
378cc40b |
336 | The shell executes the second line as a normal shell command, and thus |
a687059c |
337 | starts up the |
338 | .I perl |
339 | interpreter. |
378cc40b |
340 | On some systems $0 doesn't always contain the full pathname, |
a687059c |
341 | so the |
342 | .B \-S |
343 | tells |
344 | .I perl |
345 | to search for the script if necessary. |
346 | After |
347 | .I perl |
348 | locates the script, it parses the lines and ignores them because |
378cc40b |
349 | the variable $running_under_some_shell is never true. |
350 | .TP 5 |
a687059c |
351 | .B \-u |
352 | causes |
353 | .I perl |
354 | to dump core after compiling your script. |
355 | You can then take this core dump and turn it into an executable file |
356 | by using the undump program (not supplied). |
357 | This speeds startup at the expense of some disk space (which you can |
358 | minimize by stripping the executable). |
359 | (Still, a "hello world" executable comes out to about 200K on my machine.) |
360 | If you are going to run your executable as a set-id program then you |
361 | should probably compile it using taintperl rather than normal perl. |
362 | If you want to execute a portion of your script before dumping, use the |
363 | dump operator instead. |
364 | .TP 5 |
378cc40b |
365 | .B \-U |
a687059c |
366 | allows |
367 | .I perl |
368 | to do unsafe operations. |
13281fa4 |
369 | Currently the only \*(L"unsafe\*(R" operation is the unlinking of directories while |
378cc40b |
370 | running as superuser. |
371 | .TP 5 |
372 | .B \-v |
a687059c |
373 | prints the version and patchlevel of your |
374 | .I perl |
375 | executable. |
378cc40b |
376 | .TP 5 |
377 | .B \-w |
378 | prints warnings about identifiers that are mentioned only once, and scalar |
379 | variables that are used before being set. |
380 | Also warns about redefined subroutines, and references to undefined |
a687059c |
381 | filehandles or filehandles opened readonly that you are attempting to |
382 | write on. |
383 | Also warns you if you use == on values that don't look like numbers, and if |
384 | your subroutines recurse more than 100 deep. |
8d063cd8 |
385 | .Sh "Data Types and Objects" |
386 | .PP |
a687059c |
387 | .I Perl |
388 | has three data types: scalars, arrays of scalars, and |
389 | associative arrays of scalars. |
390 | Normal arrays are indexed by number, and associative arrays by string. |
8d063cd8 |
391 | .PP |
a687059c |
392 | The interpretation of operations and values in perl sometimes |
393 | depends on the requirements |
394 | of the context around the operation or value. |
395 | There are three major contexts: string, numeric and array. |
396 | Certain operations return array values |
397 | in contexts wanting an array, and scalar values otherwise. |
398 | (If this is true of an operation it will be mentioned in the documentation |
399 | for that operation.) |
400 | Operations which return scalars don't care whether the context is looking |
401 | for a string or a number, but |
402 | scalar variables and values are interpreted as strings or numbers |
403 | as appropriate to the context. |
378cc40b |
404 | A scalar is interpreted as TRUE in the boolean sense if it is not the null |
8d063cd8 |
405 | string or 0. |
a687059c |
406 | Booleans returned by operators are 1 for true and \'0\' or \'\' (the null |
8d063cd8 |
407 | string) for false. |
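The context and truth rules above can be sketched as follows. This is a minimal illustration that assumes a current perl interpreter rather than the 3.0 release documented here; the subroutine names is_true, as_number and as_string are hypothetical.

```perl
use strict;
use warnings;

# Report whether a scalar counts as true in a boolean test.
sub is_true {
    my ($value) = @_;
    return $value ? 1 : 0;
}

# The same scalar is read as a number or as a string,
# depending on what the operator asks for.
sub as_number { my ($value) = @_; return $value + 5; }   # numeric context
sub as_string { my ($value) = @_; return $value . 5; }   # string context

# 'abc' is true; the null string, 0 and '0' are all false.
print is_true('abc'), is_true(''), is_true(0), is_true('0'), "\n";
print as_number('10'), " ", as_string('10'), "\n";       # 15 and 105
```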
408 | .PP |
a687059c |
409 | There are actually two varieties of null string: defined and undefined. |
410 | Undefined null strings are returned when there is no real value for something, |
411 | such as when there was an error, or at end of file, or when you refer |
412 | to an uninitialized variable or element of an array. |
413 | An undefined null string may become defined the first time you access it, but |
414 | prior to that you can use the defined() operator to determine whether the |
415 | value is defined or not. |
416 | .PP |
378cc40b |
417 | References to scalar variables always begin with \*(L'$\*(R', even when referring |
418 | to a scalar that is part of an array. |
8d063cd8 |
419 | Thus: |
420 | .nf |
421 | |
422 | .ne 3 |
378cc40b |
423 | $days \h'|2i'# a simple scalar variable |
8d063cd8 |
424 | $days[28] \h'|2i'# 29th element of array @days |
a687059c |
425 | $days{\'Feb\'}\h'|2i'# one value from an associative array |
378cc40b |
426 | $#days \h'|2i'# last index of array @days |
8d063cd8 |
427 | |
a687059c |
428 | but entire arrays or array slices are denoted by \*(L'@\*(R': |
8d063cd8 |
429 | |
430 | @days \h'|2i'# ($days[0], $days[1],\|.\|.\|. $days[n]) |
a687059c |
431 | @days[3,4,5]\h'|2i'# same as @days[3.\|.5] |
432 | @days{'a','c'}\h'|2i'# same as ($days{'a'},$days{'c'}) |
433 | |
434 | and entire associative arrays are denoted by \*(L'%\*(R': |
8d063cd8 |
435 | |
a687059c |
436 | %days \h'|2i'# (key1, val1, key2, val2 .\|.\|.) |
8d063cd8 |
437 | .fi |
438 | .PP |
a687059c |
439 | Any of these eight constructs may serve as an lvalue, |
378cc40b |
440 | that is, may be assigned to. |
a687059c |
441 | (It also turns out that an assignment is itself an lvalue in |
442 | certain contexts\*(--see examples under s, tr and chop.) |
443 | Assignment to a scalar evaluates the righthand side in a scalar context, |
444 | while assignment to an array or array slice evaluates the righthand side |
445 | in an array context. |
446 | .PP |
378cc40b |
447 | You may find the length of array @days by evaluating |
8d063cd8 |
448 | \*(L"$#days\*(R", as in |
449 | .IR csh . |
378cc40b |
450 | (Actually, it's not the length of the array, it's the subscript of the last element, since there is (ordinarily) a 0th element.) |
451 | Assigning to $#days changes the length of the array. |
452 | Shortening an array by this method does not actually destroy any values. |
453 | Lengthening an array that was previously shortened recovers the values that |
454 | were in those elements. |
455 | You can also gain some measure of efficiency by preextending an array that |
456 | is going to get big. |
457 | (You can also extend an array by assigning to an element that is off the |
458 | end of the array. |
459 | This differs from assigning to $#whatever in that intervening values |
460 | are set to null rather than recovered.) |
461 | You can truncate an array down to nothing by assigning the null list () to |
462 | it. |
463 | The following are exactly equivalent: |
464 | .nf |
465 | |
466 | @whatever = (); |
467 | $#whatever = $[ \- 1; |
468 | |
469 | .fi |
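As a rough sketch of the difference between the last subscript and the element count, assuming a current perl interpreter and the default array base of 0 (last_index and element_count are hypothetical helper names, not part of perl):

```perl
use strict;
use warnings;

# Return the subscript of the last element, as $#array does.
sub last_index {
    my @items = @_;
    return $#items;
}

# Return the number of elements: last subscript plus one under base 0.
sub element_count {
    my @items = @_;
    return $#items + 1;
}

my @days = ('Mon', 'Tue', 'Wed');
print "last index: ", last_index(@days), "\n";
print "count: ", element_count(@days), "\n";

# Assigning the null list leaves no last element at all.
@days = ();
print "after truncation: ", last_index(@days), "\n";
```

Note that this sketch does not exercise the value-recovery behavior described above, which may not hold in later releases.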
8d063cd8 |
470 | .PP |
a687059c |
471 | Multi-dimensional arrays are not directly supported, but see the discussion |
472 | of the $; variable later for a means of emulating multiple subscripts with |
473 | an associative array. |
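A minimal sketch of that emulation, assuming a current perl interpreter in which a comma-separated subscript on an associative array is joined with the value of $; (the names %grid and cell_key are hypothetical):

```perl
use strict;
use warnings;

# Build the key that a two-part subscript is assumed to produce.
sub cell_key {
    my ($row, $col) = @_;
    return join($;, $row, $col);
}

my %grid;
$grid{2, 3} = 'x';                       # stored under one joined key

# Both spellings are expected to reach the same element.
print "same element\n" if $grid{cell_key(2, 3)} eq $grid{2, 3};
```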
474 | .PP |
8d063cd8 |
475 | Every data type has its own namespace. |
378cc40b |
476 | You can, without fear of conflict, use the same name for a scalar variable, |
8d063cd8 |
477 | an array, an associative array, a filehandle, a subroutine name, and/or |
478 | a label. |
a687059c |
479 | Since variable and array references always start with \*(L'$\*(R', \*(L'@\*(R', |
480 | or \*(L'%\*(R', the \*(L"reserved\*(R" words aren't in fact reserved |
8d063cd8 |
481 | with respect to variable names. |
482 | (They ARE reserved with respect to labels and filehandles, however, which |
378cc40b |
483 | don't have an initial special character. |
a687059c |
484 | Hint: you could say open(LOG,\'logfile\') rather than open(log,\'logfile\'). |
485 | Using uppercase filehandles also improves readability and protects you |
486 | from conflict with future reserved words.) |
8d063cd8 |
487 | Case IS significant\*(--\*(L"FOO\*(R", \*(L"Foo\*(R" and \*(L"foo\*(R" are all |
488 | different names. |
489 | Names which start with a letter may also contain digits and underscores. |
490 | Names which do not start with a letter are limited to one character, |
491 | e.g. \*(L"$%\*(R" or \*(L"$$\*(R". |
a687059c |
492 | (Most of the one character names have a predefined significance to |
493 | .IR perl . |
8d063cd8 |
494 | More later.) |
495 | .PP |
a687059c |
496 | Numeric literals are specified in any of the usual floating point or |
497 | integer formats: |
498 | .nf |
499 | |
500 | .ne 5 |
501 | 12345 |
502 | 12345.67 |
503 | .23E-10 |
504 | 0xffff # hex |
505 | 0377 # octal |
506 | |
507 | .fi |
8d063cd8 |
508 | String literals are delimited by either single or double quotes. |
509 | They work much like shell quotes: |
510 | double-quoted string literals are subject to backslash and variable |
a687059c |
511 | substitution; single-quoted strings are not (except for \e\' and \e\e). |
8d063cd8 |
512 | The usual backslash rules apply for making characters such as newline, tab, etc. |
513 | You can also embed newlines directly in your strings, i.e. they can end on |
514 | a different line than they begin. |
515 | This is nice, but if you forget your trailing quote, the error will not be |
a687059c |
516 | reported until |
517 | .I perl |
518 | finds another line containing the quote character, which |
8d063cd8 |
519 | may be much further on in the script. |
a687059c |
520 | Variable substitution inside strings is limited to scalar variables, normal |
521 | array values, and array slices. |
522 | (In other words, identifiers beginning with $ or @, followed by an optional |
523 | bracketed expression as a subscript.) |
8d063cd8 |
524 | The following code segment prints out \*(L"The price is $100.\*(R" |
525 | .nf |
526 | |
527 | .ne 2 |
a687059c |
528 | $Price = \'$100\';\h'|3.5i'# not interpreted |
8d063cd8 |
529 | print "The price is $Price.\e\|n";\h'|3.5i'# interpreted |
530 | |
531 | .fi |
83b4785a |
532 | Note that you can put curly brackets around the identifier to delimit it |
533 | from following alphanumerics. |
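For instance, the following sketch (assuming a current perl interpreter; pluralize is a hypothetical helper) uses curly brackets so that the trailing letter is not taken as part of the variable name:

```perl
use strict;
use warnings;

# Append "s" to a word. Writing "$nouns" instead would refer to a
# different variable named $nouns, so the brackets are needed here.
sub pluralize {
    my ($noun) = @_;
    return "${noun}s";
}

print pluralize('file'), "\n";
```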
8d063cd8 |
534 | .PP |
a687059c |
535 | Array values are interpolated into double-quoted strings by joining all the |
536 | elements of the array with the delimiter specified in the $" variable, |
537 | space by default. |
538 | (Since in versions of perl prior to 3.0 the @ character was not a metacharacter |
539 | in double-quoted strings, the interpolation of @array, $array[EXPR], |
540 | @array[LIST], $array{EXPR}, or @array{LIST} only happens if array is |
541 | referenced elsewhere in the program or is predefined.) |
542 | The following are equivalent: |
543 | .nf |
544 | |
545 | .ne 4 |
546 | $temp = join($",@ARGV); |
547 | system "echo $temp"; |
548 | |
549 | system "echo @ARGV"; |
550 | |
551 | .fi |
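The effect of the $" variable can be sketched like this, assuming a current perl interpreter; interpolate_with is a hypothetical helper that changes the delimiter only for the duration of the call:

```perl
use strict;
use warnings;

# Interpolate a list into a double-quoted string using the given delimiter.
sub interpolate_with {
    my ($separator, @items) = @_;
    local $" = $separator;        # restored automatically on return
    return "@items";
}

my @parts = ('a', 'b', 'c');
print "@parts\n";                           # default delimiter is a space
print interpolate_with(',', @parts), "\n";  # elements joined with commas
```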
552 | Within search patterns (which also undergo double-quoteish substitution) |
553 | there is a bad ambiguity: Is /$foo[bar]/ to be |
554 | interpreted as /${foo}[bar]/ (where [bar] is a character class for the |
555 | regular expression) or as /${foo[bar]}/ (where [bar] is the subscript to |
556 | array @foo)? |
557 | If @foo doesn't otherwise exist, then it's obviously a character class. |
558 | If @foo exists, perl takes a good guess about [bar], and is almost always right. |
559 | If it does guess wrong, or if you're just plain paranoid, |
560 | you can force the correct interpretation with curly brackets as above. |
561 | .PP |
562 | A line-oriented form of quoting is based on the shell here-is syntax. |
563 | Following a << you specify a string to terminate the quoted material, and all lines |
564 | following the current line down to the terminating string are the value |
565 | of the item. |
566 | The terminating string may be either an identifier (a word), or some |
567 | quoted text. |
568 | If quoted, the type of quotes you use determines the treatment of the text, |
569 | just as in regular quoting. |
570 | An unquoted identifier works like double quotes. |
571 | There must be no space between the << and the identifier. |
572 | (If you put a space it will be treated as a null identifier, which is |
573 | valid, and matches the first blank line\*(--see Merry Christmas example below.) |
574 | The terminating string must appear by itself (unquoted and with no surrounding |
575 | whitespace) on the terminating line. |
576 | .nf |
577 | |
578 | print <<EOF; # same as above |
579 | The price is $Price. |
580 | EOF |
581 | |
582 | print <<"EOF"; # same as above |
583 | The price is $Price. |
584 | EOF |
585 | |
586 | print << x 10; # null identifier is delimiter |
587 | Merry Christmas! |
588 | |
589 | print <<`EOC`; # execute commands |
590 | echo hi there |
591 | echo lo there |
592 | EOC |
593 | |
594 | print <<foo, <<bar; # you can stack them |
595 | I said foo. |
596 | foo |
597 | I said bar. |
598 | bar |
599 | |
600 | .fi |
8d063cd8 |
601 | Array literals are denoted by separating individual values by commas, and |
602 | enclosing the list in parentheses. |
603 | In a context not requiring an array value, the value of the array literal |
604 | is the value of the final element, as in the C comma operator. |
605 | For example, |
606 | .nf |
607 | |
83b4785a |
608 | .ne 4 |
a687059c |
609 | @foo = (\'cc\', \'\-E\', $bar); |
8d063cd8 |
610 | |
611 | assigns the entire array value to array foo, but |
612 | |
a687059c |
613 | $foo = (\'cc\', \'\-E\', $bar); |
8d063cd8 |
614 | |
615 | .fi |
616 | assigns the value of variable bar to variable foo. |
617 | Array lists may be assigned to if and only if each element of the list |
618 | is an lvalue: |
619 | .nf |
620 | |
621 | ($a, $b, $c) = (1, 2, 3); |
622 | |
a687059c |
623 | ($map{\'red\'}, $map{\'blue\'}, $map{\'green\'}) = (0x00f, 0x0f0, 0xf00); |
624 | |
625 | The final element may be an array or an associative array: |
626 | |
627 | ($a, $b, @rest) = split; |
628 | local($a, $b, %rest) = @_; |
8d063cd8 |
629 | |
630 | .fi |
a687059c |
631 | You can actually put an array anywhere in the list, but the first array |
632 | in the list will soak up all the values, and anything after it will get |
633 | a null value. |
634 | This may be useful in a local(). |
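A small sketch of this soaking-up behavior, assuming a current perl interpreter (the subroutine names are hypothetical, and my is used where the text above would use local):

```perl
use strict;
use warnings;

# The final array collects whatever is left after the first value.
sub head_and_rest {
    my ($first, @rest) = @_;
    return ($first, scalar(@rest));
}

# An array in the middle takes everything; $last is left without a value.
sub last_is_defined {
    my ($first, @middle, $last) = @_;
    return defined($last) ? 1 : 0;
}

my ($head, $remaining) = head_and_rest('a', 'b', 'c');
print "$head plus $remaining more\n";
print "last defined: ", last_is_defined(1, 2, 3), "\n";
```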
8d063cd8 |
635 | .PP |
a687059c |
636 | An associative array literal contains pairs of values to be interpreted |
637 | as a key and a value: |
638 | .nf |
639 | |
640 | .ne 2 |
641 | # same as map assignment above |
642 | %map = ('red',0x00f,'blue',0x0f0,'green',0xf00); |
643 | |
644 | .fi |
645 | Array assignment in a scalar context returns the number of elements |
646 | produced by the expression on the right side of the assignment: |
647 | .nf |
648 | |
649 | $x = (($foo,$bar) = (3,2,1)); # set $x to 3, not 2 |
650 | |
651 | .fi |
8d063cd8 |
652 | .PP |
653 | There are several other pseudo-literals that you should know about. |
378cc40b |
654 | If a string is enclosed by backticks (grave accents), it first undergoes |
655 | variable substitution just like a double quoted string. |
656 | It is then interpreted as a command, and the output of that command |
657 | is the value of the pseudo-literal, like in a shell. |
8d063cd8 |
658 | The command is executed each time the pseudo-literal is evaluated. |
378cc40b |
659 | The status value of the command is returned in $? (see Predefined Names |
660 | for the interpretation of $?). |
661 | Unlike in \fIcsh\fR, no translation is done on the return |
8d063cd8 |
662 | data\*(--newlines remain newlines. |
378cc40b |
663 | Unlike in any of the shells, single quotes do not hide variable names |
664 | in the command from interpretation. |
665 | To pass a $ through to the shell you need to hide it with a backslash. |
8d063cd8 |
666 | .PP |
667 | Evaluating a filehandle in angle brackets yields the next line |
a687059c |
668 | from that file (newline included, so it's never false until EOF, at |
669 | which time an undefined value is returned). |
8d063cd8 |
670 | Ordinarily you must assign that value to a variable, |
671 | but there is one situation in which an automatic assignment happens. |
672 | If (and only if) the input symbol is the only thing inside the conditional of a |
673 | .I while |
674 | loop, the value is |
675 | automatically assigned to the variable \*(L"$_\*(R". |
676 | (This may seem like an odd thing to you, but you'll use the construct |
677 | in almost every |
678 | .I perl |
679 | script you write.) |
680 | Anyway, the following lines are equivalent to each other: |
681 | .nf |
682 | |
a687059c |
683 | .ne 5 |
684 | while ($_ = <STDIN>) { print; } |
685 | while (<STDIN>) { print; } |
686 | for (\|;\|<STDIN>;\|) { print; } |
687 | print while $_ = <STDIN>; |
688 | print while <STDIN>; |
8d063cd8 |
689 | |
690 | .fi |
691 | The filehandles |
a687059c |
692 | .IR STDIN , |
693 | .I STDOUT |
694 | and |
695 | .I STDERR |
696 | are predefined. |
697 | (The filehandles |
8d063cd8 |
698 | .IR stdin , |
699 | .I stdout |
700 | and |
701 | .I stderr |
a687059c |
702 | will also work except in packages, where they would be interpreted as |
703 | local identifiers rather than global.) |
8d063cd8 |
704 | Additional filehandles may be created with the |
705 | .I open |
706 | function. |
707 | .PP |
378cc40b |
708 | If a <FILEHANDLE> is used in a context that is looking for an array, an array |
709 | consisting of all the input lines is returned, one line per array element. |
710 | It's easy to make a LARGE data space this way, so use with care. |
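As an illustration only, the sketch below reads every line at once by evaluating a filehandle in an array context. It assumes a current perl interpreter that can open a filehandle on an in-memory string, which keeps the example self-contained; read_all_lines is a hypothetical helper.

```perl
use strict;
use warnings;

# Return all lines of the given text, one line per array element,
# with each trailing newline still attached.
sub read_all_lines {
    my ($text) = @_;
    open(my $fh, '<', \$text) or die "cannot open in-memory file: $!";
    my @lines = <$fh>;            # array context: the whole input at once
    close($fh);
    return @lines;
}

my @lines = read_all_lines("first\nsecond\n");
print "read ", scalar(@lines), " lines\n";
```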
711 | .PP |
8d063cd8 |
712 | The null filehandle <> is special and can be used to emulate the behavior of |
713 | \fIsed\fR and \fIawk\fR. |
714 | Input from <> comes either from standard input, or from each file listed on |
715 | the command line. |
716 | Here's how it works: the first time <> is evaluated, the ARGV array is checked, |
a687059c |
717 | and if it is null, $ARGV[0] is set to \'-\', which when opened gives you standard |
8d063cd8 |
718 | input. |
719 | The ARGV array is then processed as a list of filenames. |
720 | The loop |
721 | .nf |
722 | |
723 | .ne 3 |
724 | while (<>) { |
725 | .\|.\|. # code for each line |
726 | } |
727 | |
728 | .ne 10 |
729 | is equivalent to |
730 | |
a687059c |
731 | unshift(@ARGV, \'\-\') \|if \|$#ARGV < $[; |
8d063cd8 |
732 | while ($ARGV = shift) { |
733 | open(ARGV, $ARGV); |
734 | while (<ARGV>) { |
735 | .\|.\|. # code for each line |
736 | } |
737 | } |
738 | |
739 | .fi |
740 | except that it isn't as cumbersome to say. |
741 | It really does shift array ARGV and put the current filename into |
742 | variable ARGV. |
743 | It also uses filehandle ARGV internally. |
744 | You can modify @ARGV before the first <> as long as you leave the first |
745 | filename at the beginning of the array. |
83b4785a |
746 | Line numbers ($.) continue as if the input was one big happy file. |
378cc40b |
747 | (But see example under eof for how to reset line numbers on each file.) |
8d063cd8 |
748 | .PP |
83b4785a |
749 | .ne 5 |
378cc40b |
750 | If you want to set @ARGV to your own list of files, go right ahead. |
8d063cd8 |
751 | If you want to pass switches into your script, you can |
752 | put a loop on the front like this: |
753 | .nf |
754 | |
755 | .ne 10 |
756 | while ($_ = $ARGV[0], /\|^\-/\|) { |
757 | shift; |
758 | last if /\|^\-\|\-$\|/\|; |
759 | /\|^\-D\|(.*\|)/ \|&& \|($debug = $1); |
760 | /\|^\-v\|/ \|&& \|$verbose++; |
761 | .\|.\|. # other switches |
762 | } |
763 | while (<>) { |
764 | .\|.\|. # code for each line |
765 | } |
766 | |
767 | .fi |
768 | The <> symbol will return FALSE only once. |
769 | If you call it again after this it will assume you are processing another |
a687059c |
770 | @ARGV list, and if you haven't set @ARGV, will input from |
771 | .IR STDIN . |
378cc40b |
772 | .PP |
773 | If the string inside the angle brackets is a reference to a scalar variable |
774 | (e.g. <$foo>), |
775 | then that variable contains the name of the filehandle to input from. |
776 | .PP |
777 | If the string inside angle brackets is not a filehandle, it is interpreted |
778 | as a filename pattern to be globbed, and either an array of filenames or the |
779 | next filename in the list is returned, depending on context. |
780 | One level of $ interpretation is done first, but you can't say <$foo> |
781 | because that's an indirect filehandle as explained in the previous |
782 | paragraph. |
783 | You could insert curly brackets to force interpretation as a |
784 | filename glob: <${foo}>. |
785 | Example: |
786 | .nf |
787 | |
788 | .ne 3 |
789 | while (<*.c>) { |
a687059c |
790 | chmod 0644, $_; |
378cc40b |
791 | } |
792 | |
793 | is equivalent to |
794 | |
795 | .ne 5 |
a687059c |
796 | open(foo, "echo *.c | tr \-s \' \et\er\ef\' \'\e\e012\e\e012\e\e012\e\e012\'|"); |
378cc40b |
797 | while (<foo>) { |
798 | chop; |
a687059c |
799 | chmod 0644, $_; |
378cc40b |
800 | } |
801 | |
802 | .fi |
803 | In fact, it's currently implemented that way. |
a687059c |
804 | (Which means it will not work on filenames with spaces in them unless |
805 | you have /bin/csh on your machine.) |
378cc40b |
806 | Of course, the shortest way to do the above is: |
807 | .nf |
808 | |
a687059c |
809 | chmod 0644, <*.c>; |
378cc40b |
810 | |
811 | .fi |
8d063cd8 |
812 | .Sh "Syntax" |
813 | .PP |
814 | A |
815 | .I perl |
816 | script consists of a sequence of declarations and commands. |
817 | The only things that need to be declared in |
818 | .I perl |
819 | are report formats and subroutines. |
820 | See the sections below for more information on those declarations. |
a687059c |
821 | All uninitialized user-created objects are assumed to |
822 | start with a null or 0 value until they |
823 | are defined by some explicit operation such as assignment. |
8d063cd8 |
824 | The sequence of commands is executed just once, unlike in |
825 | .I sed |
826 | and |
827 | .I awk |
828 | scripts, where the sequence of commands is executed for each input line. |
829 | While this means that you must explicitly loop over the lines of your input file |
830 | (or files), it also means you have much more control over which files and which |
831 | lines you look at. |
832 | (Actually, I'm lying\*(--it is possible to do an implicit loop with either the |
833 | .B \-n |
834 | or |
835 | .B \-p |
836 | switch.) |
837 | .PP |
838 | A declaration can be put anywhere a command can, but has no effect on the |
a687059c |
839 | execution of the primary sequence of commands--declarations all take effect |
840 | at compile time. |
8d063cd8 |
841 | Typically all the declarations are put at the beginning or the end of the script. |
842 | .PP |
843 | .I Perl |
844 | is, for the most part, a free-form language. |
845 | (The only exception to this is format declarations, for fairly obvious reasons.) |
846 | Comments are indicated by the # character, and extend to the end of the line. |
847 | If you attempt to use /* */ C comments, they will be interpreted either as |
848 | division or pattern matching, depending on the context. |
849 | So don't do that. |
850 | .Sh "Compound statements" |
851 | In |
852 | .IR perl , |
853 | a sequence of commands may be treated as one command by enclosing it |
854 | in curly brackets. |
855 | We will call this a BLOCK. |
856 | .PP |
857 | The following compound commands may be used to control flow: |
858 | .nf |
859 | |
860 | .ne 4 |
861 | if (EXPR) BLOCK |
862 | if (EXPR) BLOCK else BLOCK |
378cc40b |
863 | if (EXPR) BLOCK elsif (EXPR) BLOCK .\|.\|. else BLOCK |
8d063cd8 |
864 | LABEL while (EXPR) BLOCK |
865 | LABEL while (EXPR) BLOCK continue BLOCK |
866 | LABEL for (EXPR; EXPR; EXPR) BLOCK |
378cc40b |
867 | LABEL foreach VAR (ARRAY) BLOCK |
8d063cd8 |
868 | LABEL BLOCK continue BLOCK |
869 | |
870 | .fi |
83b4785a |
871 | Note that, unlike C and Pascal, these are defined in terms of BLOCKs, not |
8d063cd8 |
872 | statements. |
873 | This means that the curly brackets are \fIrequired\fR\*(--no dangling statements allowed. |
874 | If you want to write conditionals without curly brackets there are several |
875 | other ways to do it. |
876 | The following all do the same thing: |
877 | .nf |
878 | |
879 | .ne 5 |
a687059c |
880 | if (!open(foo)) { die "Can't open $foo: $!"; } |
881 | die "Can't open $foo: $!" unless open(foo); |
882 | open(foo) || die "Can't open $foo: $!"; # foo or bust! |
883 | open(foo) ? \'hi mom\' : die "Can't open $foo: $!"; |
884 | # a bit exotic, that last one |
8d063cd8 |
885 | |
886 | .fi |
8d063cd8 |
887 | .PP |
888 | The |
889 | .I if |
890 | statement is straightforward. |
891 | Since BLOCKs are always bounded by curly brackets, there is never any |
892 | ambiguity about which |
893 | .I if |
894 | an |
895 | .I else |
896 | goes with. |
897 | If you use |
898 | .I unless |
899 | in place of |
900 | .IR if , |
901 | the sense of the test is reversed. |
902 | .PP |
903 | The |
904 | .I while |
905 | statement executes the block as long as the expression is true |
906 | (does not evaluate to the null string or 0). |
907 | The LABEL is optional, and if present, consists of an identifier followed by |
908 | a colon. |
909 | The LABEL identifies the loop for the loop control statements |
910 | .IR next , |
a687059c |
911 | .IR last , |
8d063cd8 |
912 | and |
913 | .I redo |
914 | (see below). |
915 | If there is a |
916 | .I continue |
917 | BLOCK, it is always executed just before |
918 | the conditional is about to be evaluated again, similarly to the third part |
919 | of a |
920 | .I for |
921 | loop in C. |
922 | Thus it can be used to increment a loop variable, even when the loop has |
923 | been continued via the |
924 | .I next |
925 | statement (similar to the C \*(L"continue\*(R" statement). |
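| .PP |
| For instance, in the following sketch (not one of the original examples) |
| the counter is still incremented when |
| .I next |
| skips the rest of the body, so the loop terminates after printing the even |
| numbers below 10: |
| .nf |
| |
| .ne 7 |
| $i = 0; |
| while ($i < 10) { |
| next if $i % 2; # odd: skip the print, but continue still runs |
| print $i, "\en"; |
| } continue { |
| $i++; |
| } |
| |
| .fi |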
926 | .PP |
927 | If the word |
928 | .I while |
929 | is replaced by the word |
930 | .IR until , |
931 | the sense of the test is reversed, but the conditional is still tested before |
932 | the first iteration. |
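| .PP |
| A brief illustration, with values picked only for the example: because the |
| test comes first, the body below never runs. |
| .nf |
| |
| .ne 4 |
| $n = 5; |
| until ($n >= 5) { |
| print "never printed\en"; |
| } |
| |
| .fi |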
933 | .PP |
934 | In either the |
935 | .I if |
936 | or the |
937 | .I while |
938 | statement, you may replace \*(L"(EXPR)\*(R" with a BLOCK, and the conditional |
939 | is true if the value of the last command in that block is true. |
940 | .PP |
941 | The |
942 | .I for |
943 | loop works exactly like the corresponding |
944 | .I while |
945 | loop: |
946 | .nf |
947 | |
948 | .ne 12 |
949 | for ($i = 1; $i < 10; $i++) { |
950 | .\|.\|. |
951 | } |
952 | |
953 | is the same as |
954 | |
955 | $i = 1; |
956 | while ($i < 10) { |
957 | .\|.\|. |
958 | } continue { |
959 | $i++; |
960 | } |
961 | .fi |
962 | .PP |
378cc40b |
963 | The foreach loop iterates over a normal array value and sets the variable |
964 | VAR to be each element of the array in turn. |
13281fa4 |
965 | The \*(L"foreach\*(R" keyword is actually identical to the \*(L"for\*(R" keyword, |
966 | so you can use \*(L"foreach\*(R" for readability or \*(L"for\*(R" for brevity. |
378cc40b |
967 | If VAR is omitted, $_ is set to each value. |
968 | If ARRAY is an actual array (as opposed to an expression returning an array |
969 | value), you can modify each element of the array |
970 | by modifying VAR inside the loop. |
971 | Examples: |
972 | .nf |
973 | |
974 | .ne 5 |
975 | for (@ary) { s/foo/bar/; } |
976 | |
977 | foreach $elem (@elements) { |
978 | $elem *= 2; |
979 | } |
980 | |
a687059c |
981 | .ne 3 |
982 | for ((10,9,8,7,6,5,4,3,2,1,\'BOOM\')) { |
983 | print $_, "\en"; sleep(1); |
378cc40b |
984 | } |
985 | |
a687059c |
986 | for (1..15) { print "Merry Christmas\en"; } |
987 | |
378cc40b |
988 | .ne 3 |
a687059c |
989 | foreach $item (split(/:[\e\e\en:]*/, $ENV{\'TERMCAP\'})) { |
378cc40b |
990 | print "Item: $item\en"; |
991 | } |
a687059c |
992 | |
378cc40b |
993 | .fi |
994 | .PP |
8d063cd8 |
995 | The BLOCK by itself (labeled or not) is equivalent to a loop that executes |
996 | once. |
997 | Thus you can use any of the loop control statements in it to leave or |
998 | restart the block. |
999 | The |
1000 | .I continue |
1001 | block is optional. |
1002 | This construct is particularly nice for doing case structures. |
1003 | .nf |
1004 | |
1005 | .ne 6 |
1006 | foo: { |
a687059c |
1007 | if (/^abc/) { $abc = 1; last foo; } |
1008 | if (/^def/) { $def = 1; last foo; } |
1009 | if (/^xyz/) { $xyz = 1; last foo; } |
8d063cd8 |
1010 | $nothing = 1; |
1011 | } |
1012 | |
1013 | .fi |
a687059c |
1014 | There is no official switch statement in perl, because there |
1015 | are already several ways to write the equivalent. |
1016 | In addition to the above, you could write |
378cc40b |
1017 | .nf |
1018 | |
a687059c |
1019 | .ne 6 |
1020 | foo: { |
1021 | $abc = 1, last foo if /^abc/; |
1022 | $def = 1, last foo if /^def/; |
1023 | $xyz = 1, last foo if /^xyz/; |
1024 | $nothing = 1; |
1025 | } |
1026 | |
1027 | or |
1028 | |
1029 | .ne 6 |
1030 | foo: { |
1031 | /^abc/ && do { $abc = 1; last foo; }; |
1032 | /^def/ && do { $def = 1; last foo; }; |
1033 | /^xyz/ && do { $xyz = 1; last foo; }; |
1034 | $nothing = 1; |
1035 | } |
1036 | |
1037 | or |
1038 | |
1039 | .ne 6 |
1040 | foo: { |
1041 | /^abc/ && ($abc = 1, last foo); |
1042 | /^def/ && ($def = 1, last foo); |
1043 | /^xyz/ && ($xyz = 1, last foo); |
1044 | $nothing = 1; |
1045 | } |
1046 | |
1047 | or even |
1048 | |
378cc40b |
1049 | .ne 8 |
a687059c |
1050 | if (/^abc/) |
1051 | { $abc = 1; } |
1052 | elsif (/^def/) |
1053 | { $def = 1; } |
1054 | elsif (/^xyz/) |
1055 | { $xyz = 1; } |
1056 | else |
1057 | { $nothing = 1; } |
378cc40b |
1058 | |
1059 | .fi |
a687059c |
1060 | As it happens, these are all optimized internally to a switch structure, |
1061 | so perl jumps directly to the desired statement, and you needn't worry |
1062 | about perl executing a lot of unnecessary statements when you have a string |
1063 | of 50 elsifs, as long as you are testing the same simple scalar variable |
1064 | using ==, eq, or pattern matching as above. |
1065 | (If you're curious as to whether the optimizer has done this for a particular |
1066 | case statement, you can use the \-D1024 switch to list the syntax tree |
1067 | before execution.) |
8d063cd8 |
1068 | .Sh "Simple statements" |
1069 | The only kind of simple statement is an expression evaluated for its side |
1070 | effects. |
1071 | Every expression (simple statement) must be terminated with a semicolon. |
1072 | Note that this is like C, but unlike Pascal (and |
1073 | .IR awk ). |
1074 | .PP |
1075 | Any simple statement may optionally be followed by a |
1076 | single modifier, just before the terminating semicolon. |
1077 | The possible modifiers are: |
1078 | .nf |
1079 | |
1080 | .ne 4 |
1081 | if EXPR |
1082 | unless EXPR |
1083 | while EXPR |
1084 | until EXPR |
1085 | |
1086 | .fi |
1087 | The |
1088 | .I if |
1089 | and |
1090 | .I unless |
1091 | modifiers have the expected semantics. |
1092 | The |
1093 | .I while |
1094 | and |
378cc40b |
1095 | .I until |
8d063cd8 |
1096 | modifiers also have the expected semantics (conditional evaluated first), |
1097 | except when applied to a do-BLOCK command, |
1098 | in which case the block executes once before the conditional is evaluated. |
1099 | This is so that you can write loops like: |
1100 | .nf |
1101 | |
1102 | .ne 4 |
1103 | do { |
a687059c |
1104 | $_ = <STDIN>; |
8d063cd8 |
1105 | .\|.\|. |
1106 | } until $_ \|eq \|".\|\e\|n"; |
1107 | |
1108 | .fi |
1109 | (See the |
1110 | .I do |
1111 | operator below. Note also that the loop control commands described later will |
83b4785a |
1112 | NOT work in this construct, since modifiers don't take loop labels. |
8d063cd8 |
1113 | Sorry.) |
1114 | .Sh "Expressions" |
1115 | Since |
1116 | .I perl |
1117 | expressions work almost exactly like C expressions, only the differences |
1118 | will be mentioned here. |
1119 | .PP |
1120 | Here's what |
1121 | .I perl |
1122 | has that C doesn't: |
a687059c |
1123 | .Ip ** 8 2 |
1124 | The exponentiation operator. |
1125 | .Ip **= 8 |
1126 | The exponentiation assignment operator. |
8d063cd8 |
1127 | .Ip (\|) 8 3 |
1128 | The null list, used to initialize an array to null. |
1129 | .Ip . 8 |
1130 | Concatenation of two strings. |
1131 | .Ip .= 8 |
a687059c |
1132 | The concatenation assignment operator. |
8d063cd8 |
1133 | .Ip eq 8 |
1134 | String equality (== is numeric equality). |
1135 | For a mnemonic just think of \*(L"eq\*(R" as a string. |
1136 | (If you are used to the |
1137 | .I awk |
1138 | behavior of using == for either string or numeric equality |
1139 | based on the current form of the comparands, beware! |
1140 | You must be explicit here.) |
1141 | .Ip ne 8 |
1142 | String inequality (!= is numeric inequality). |
1143 | .Ip lt 8 |
1144 | String less than. |
1145 | .Ip gt 8 |
1146 | String greater than. |
1147 | .Ip le 8 |
1148 | String less than or equal. |
1149 | .Ip ge 8 |
1150 | String greater than or equal. |
1151 | .Ip =~ 8 2 |
1152 | Certain operations search or modify the string \*(L"$_\*(R" by default. |
1153 | This operator makes that kind of operation work on some other string. |
1154 | The right argument is a search pattern, substitution, or translation. |
1155 | The left argument is what is supposed to be searched, substituted, or |
1156 | translated instead of the default \*(L"$_\*(R". |
1157 | The return value indicates the success of the operation. |
1158 | (If the right argument is an expression other than a search pattern, |
1159 | substitution, or translation, it is interpreted as a search pattern |
1160 | at run time. |
1161 | This is less efficient than an explicit search, since the pattern must |
1162 | be compiled every time the expression is evaluated.) |
1163 | The precedence of this operator is lower than unary minus and autoincrement/decrement, but higher than everything else. |
1164 | .Ip !~ 8 |
1165 | Just like =~ except the return value is negated. |
1166 | .Ip x 8 |
1167 | The repetition operator. |
1168 | Returns a string consisting of the left operand repeated the |
1169 | number of times specified by the right operand. |
1170 | .nf |
1171 | |
a687059c |
1172 | print \'\-\' x 80; # print row of dashes |
1173 | print \'\-\' x80; # illegal, x80 is identifier |
8d063cd8 |
1174 | |
a687059c |
1175 | print "\et" x ($tab/8), \' \' x ($tab%8); # tab over |
8d063cd8 |
1176 | |
1177 | .fi |
1178 | .Ip x= 8 |
a687059c |
1179 | The repetition assignment operator. |
1180 | .Ip .\|. 8 |
1181 | The range operator, which is really two different operators depending |
1182 | on the context. |
1183 | In an array context, returns an array of values counting (by ones) |
1184 | from the left value to the right value. |
1185 | This is useful for writing \*(L"for (1..10)\*(R" loops and for doing |
1186 | slice operations on arrays. |
1187 | .Sp |
1188 | In a scalar context, .\|. returns a boolean value. |
1189 | The operator is bistable, like a flip-flop. |
1190 | Each .\|. operator maintains its own boolean state. |
378cc40b |
1191 | It is false as long as its left operand is false. |
1192 | Once the left operand is true, the range operator stays true |
1193 | until the right operand is true, |
1194 | AFTER which the range operator becomes false again. |
a687059c |
1195 | (It doesn't become false till the next time the range operator is evaluated. |
8d063cd8 |
1196 | It can become false on the same evaluation it became true, but it still returns |
1197 | true once.) |
13281fa4 |
1198 | The right operand is not evaluated while the operator is in the \*(L"false\*(R" state, |
1199 | and the left operand is not evaluated while the operator is in the \*(L"true\*(R" state. |
a687059c |
1200 | The scalar .\|. operator is primarily intended for doing line number ranges |
1201 | after |
8d063cd8 |
1202 | the fashion of \fIsed\fR or \fIawk\fR. |
1203 | The precedence is a little lower than || and &&. |
1204 | The value returned is either the null string for false, or a sequence number |
1205 | (beginning with 1) for true. |
1206 | The sequence number is reset for each range encountered. |
a687059c |
1207 | The final sequence number in a range has the string \'E0\' appended to it, which |
8d063cd8 |
1208 | doesn't affect its numeric value, but gives you something to search for if you |
1209 | want to exclude the endpoint. |
1210 | You can exclude the beginning point by waiting for the sequence number to be |
1211 | greater than 1. |
a687059c |
1212 | If either operand of scalar .\|. is static, that operand is implicitly compared |
1213 | to the $. variable, the current line number. |
8d063cd8 |
1214 | Examples: |
1215 | .nf |
1216 | |
a687059c |
1217 | .ne 6 |
1218 | As a scalar operator: |
1219 | if (101 .\|. 200) { print; } # print 2nd hundred lines |
8d063cd8 |
1220 | |
a687059c |
1221 | next line if (1 .\|. /^$/); # skip header lines |
8d063cd8 |
1222 | |
a687059c |
1223 | s/^/> / if (/^$/ .\|. eof()); # quote body |
1224 | |
1225 | .ne 4 |
1226 | As an array operator: |
1227 | for (101 .\|. 200) { print; } # print $_ 100 times |
1228 | |
1229 | @foo = @foo[$[ .\|. $#foo]; # an expensive no-op |
1230 | @foo = @foo[$#foo-4 .\|. $#foo]; # slice last 5 items |
8d063cd8 |
1231 | |
1232 | .fi |
378cc40b |
1233 | .Ip \-x 8 |
1234 | A file test. |
1235 | This unary operator takes one argument, either a filename or a filehandle, |
1236 | and tests the associated file to see if something is true about it. |
a687059c |
1237 | If the argument is omitted, tests $_, except for \-t, which tests |
1238 | .IR STDIN . |
1239 | It returns 1 for true and \'\' for false, or the undefined value if the |
1240 | file doesn't exist. |
378cc40b |
1241 | Precedence is higher than logical and relational operators, but lower than |
1242 | arithmetic operators. |
1243 | The operator may be any of: |
1244 | .nf |
1245 | \-r File is readable by effective uid. |
a687059c |
1246 | \-w File is writable by effective uid. |
378cc40b |
1247 | \-x File is executable by effective uid. |
1248 | \-o File is owned by effective uid. |
1249 | \-R File is readable by real uid. |
a687059c |
1250 | \-W File is writable by real uid. |
378cc40b |
1251 | \-X File is executable by real uid. |
1252 | \-O File is owned by real uid. |
1253 | \-e File exists. |
1254 | \-z File has zero size. |
1255 | \-s File has non-zero size. |
1256 | \-f File is a plain file. |
1257 | \-d File is a directory. |
1258 | \-l File is a symbolic link. |
1259 | \-p File is a named pipe (FIFO). |
1260 | \-S File is a socket. |
1261 | \-b File is a block special file. |
1262 | \-c File is a character special file. |
1263 | \-u File has setuid bit set. |
1264 | \-g File has setgid bit set. |
1265 | \-k File has sticky bit set. |
1266 | \-t Filehandle is opened to a tty. |
1267 | \-T File is a text file. |
1268 | \-B File is a binary file (opposite of \-T). |
1269 | |
1270 | .fi |
1271 | The interpretation of the file permission operators \-r, \-R, \-w, \-W, \-x and \-X |
1272 | is based solely on the mode of the file and the uids and gids of the user. |
1273 | There may be other reasons you can't actually read, write or execute the file. |
1274 | Also note that, for the superuser, \-r, \-R, \-w and \-W always return 1, and |
1275 | \-x and \-X return 1 if any execute bit is set in the mode. |
1276 | Scripts run by the superuser may thus need to do a stat() in order to determine |
1277 | the actual mode of the file, or temporarily set the uid to something else. |
1278 | .Sp |
1279 | Example: |
1280 | .nf |
1281 | .ne 7 |
1282 | |
1283 | while (<>) { |
1284 | chop; |
1285 | next unless \-f $_; # ignore specials |
1286 | .\|.\|. |
1287 | } |
1288 | |
1289 | .fi |
a687059c |
1290 | Note that \-s/a/b/ does not do a negated substitution. |
1291 | Saying \-exp($foo) still works as expected, however\*(--only single letters |
378cc40b |
1292 | following a minus are interpreted as file tests. |
1293 | .Sp |
1294 | The \-T and \-B switches work as follows. |
1295 | The first block or so of the file is examined for odd characters such as |
1296 | strange control codes or metacharacters. |
1297 | If too many odd characters (>10%) are found, it's a \-B file, otherwise it's a \-T file. |
1298 | Also, any file containing null in the first block is considered a binary file. |
1299 | If \-T or \-B is used on a filehandle, the current stdio buffer is examined |
1300 | rather than the first block. |
378cc40b |
1301 | Both \-T and \-B return TRUE on a null file, or a file at EOF when testing |
1302 | a filehandle. |
8d063cd8 |
1303 | .PP |
a687059c |
1304 | If any of the file tests (or either stat operator) are given the special |
1305 | filehandle consisting of a solitary underline, then the stat structure |
1306 | of the previous file test (or stat operator) is used, saving a system |
1307 | call. |
1308 | (This doesn't work with \-t, and you need to remember that lstat and \-l |
1309 | will leave values in the stat structure for the symbolic link, not the |
1310 | real file.) |
1311 | Example: |
1312 | .nf |
1313 | |
1314 | print "Can do.\en" if -r $a || -w _ || -x _; |
1315 | |
1316 | .ne 9 |
1317 | stat($filename); |
1318 | print "Readable\en" if -r _; |
1319 | print "Writable\en" if -w _; |
1320 | print "Executable\en" if -x _; |
1321 | print "Setuid\en" if -u _; |
1322 | print "Setgid\en" if -g _; |
1323 | print "Sticky\en" if -k _; |
1324 | print "Text\en" if -T _; |
1325 | print "Binary\en" if -B _; |
1326 | |
1327 | .fi |
1328 | .PP |
8d063cd8 |
1329 | Here is what C has that |
1330 | .I perl |
1331 | doesn't: |
1332 | .Ip "unary &" 12 |
1333 | Address-of operator. |
1334 | .Ip "unary *" 12 |
1335 | Dereference-address operator. |
378cc40b |
1336 | .Ip "(TYPE)" 12 |
1337 | Type casting operator. |
8d063cd8 |
1338 | .PP |
1339 | Like C, |
1340 | .I perl |
1341 | does a certain amount of expression evaluation at compile time, whenever |
1342 | it determines that all of the arguments to an operator are static and have |
1343 | no side effects. |
1344 | In particular, string concatenation happens at compile time between literals that don't do variable substitution. |
1345 | Backslash interpretation also happens at compile time. |
1346 | You can say |
1347 | .nf |
1348 | |
1349 | .ne 2 |
a687059c |
1350 | \'Now is the time for all\' . "\|\e\|n" . |
1351 | \'good men to come to.\' |
8d063cd8 |
1352 | |
1353 | .fi |
1354 | and this all reduces to one string internally. |
1355 | .PP |
378cc40b |
1356 | The autoincrement operator has a little extra built-in magic to it. |
1357 | If you increment a variable that is numeric, or that has ever been used in |
1358 | a numeric context, you get a normal increment. |
1359 | If, however, the variable has only been used in string contexts since it |
1360 | was set, and has a value that is not null and matches the |
a687059c |
1361 | pattern /^[a\-zA\-Z]*[0\-9]*$/, the increment is done |
378cc40b |
1362 | as a string, preserving each character within its range, with carry: |
1363 | .nf |
1364 | |
a687059c |
1365 | print ++($foo = \'99\'); # prints \*(L'100\*(R' |
1366 | print ++($foo = \'a0\'); # prints \*(L'a1\*(R' |
1367 | print ++($foo = \'Az\'); # prints \*(L'Ba\*(R' |
1368 | print ++($foo = \'zz\'); # prints \*(L'aaa\*(R' |
378cc40b |
1369 | |
1370 | .fi |
1371 | The autodecrement is not magical. |