Commit | Line | Data |
8d063cd8 |
1 | .rn '' }` |
ae986130 |
2 | ''' $Header: perl.man.1,v 3.0.1.1 89/11/11 04:41:22 lwall Locked $ |
8d063cd8 |
3 | ''' |
4 | ''' $Log: perl.man.1,v $ |
ae986130 |
5 | ''' Revision 3.0.1.1 89/11/11 04:41:22 lwall |
6 | ''' patch2: explained about sh and ${1+"$@"} |
7 | ''' patch2: documented that space must separate word and '' string |
8 | ''' |
a687059c |
9 | ''' Revision 3.0 89/10/18 15:21:29 lwall |
10 | ''' 3.0 baseline |
8d063cd8 |
11 | ''' |
12 | ''' |
13 | .de Sh |
14 | .br |
15 | .ne 5 |
16 | .PP |
17 | \fB\\$1\fR |
18 | .PP |
19 | .. |
20 | .de Sp |
21 | .if t .sp .5v |
22 | .if n .sp |
23 | .. |
24 | .de Ip |
25 | .br |
26 | .ie \\n.$>=3 .ne \\$3 |
27 | .el .ne 3 |
28 | .IP "\\$1" \\$2 |
29 | .. |
30 | ''' |
31 | ''' Set up \*(-- to give an unbreakable dash; |
32 | ''' string Tr holds user defined translation string. |
33 | ''' Bell System Logo is used as a dummy character. |
34 | ''' |
378cc40b |
35 | .tr \(*W-|\(bv\*(Tr |
8d063cd8 |
36 | .ie n \{\ |
378cc40b |
37 | .ds -- \(*W- |
38 | .if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch |
39 | .if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch |
8d063cd8 |
40 | .ds L" "" |
41 | .ds R" "" |
42 | .ds L' ' |
43 | .ds R' ' |
44 | 'br\} |
45 | .el\{\ |
46 | .ds -- \(em\| |
47 | .tr \*(Tr |
48 | .ds L" `` |
49 | .ds R" '' |
50 | .ds L' ` |
51 | .ds R' ' |
52 | 'br\} |
a687059c |
53 | .TH PERL 1 "\*(RP" |
54 | .UC |
8d063cd8 |
55 | .SH NAME |
a687059c |
56 | perl \- Practical Extraction and Report Language |
8d063cd8 |
57 | .SH SYNOPSIS |
a687059c |
58 | .B perl |
59 | [options] filename args |
8d063cd8 |
60 | .SH DESCRIPTION |
61 | .I Perl |
a687059c |
62 | is an interpreted language optimized for scanning arbitrary text files, |
8d063cd8 |
63 | extracting information from those text files, and printing reports based |
64 | on that information. |
65 | It's also a good language for many system management tasks. |
66 | The language is intended to be practical (easy to use, efficient, complete) |
67 | rather than beautiful (tiny, elegant, minimal). |
68 | It combines (in the author's opinion, anyway) some of the best features of C, |
69 | \fIsed\fR, \fIawk\fR, and \fIsh\fR, |
70 | so people familiar with those languages should have little difficulty with it. |
71 | (Language historians will also note some vestiges of \fIcsh\fR, Pascal, and |
72 | even BASIC-PLUS.) |
73 | Expression syntax corresponds quite closely to C expression syntax. |
a687059c |
74 | Unlike most Unix utilities, |
75 | .I perl |
76 | does not arbitrarily limit the size of your data\*(--if you've got |
77 | the memory, |
78 | .I perl |
79 | can slurp in your whole file as a single string. |
80 | Recursion is of unlimited depth. |
81 | And the hash tables used by associative arrays grow as necessary to prevent |
82 | degraded performance. |
83 | .I Perl |
84 | uses sophisticated pattern matching techniques to scan large amounts of |
85 | data very quickly. |
86 | Although optimized for scanning text, |
87 | .I perl |
88 | can also deal with binary data, and can make dbm files look like associative |
89 | arrays (where dbm is available). |
90 | Setuid |
91 | .I perl |
92 | scripts are safer than C programs |
93 | through a dataflow tracing mechanism which prevents many stupid security holes. |
8d063cd8 |
94 | If you have a problem that would ordinarily use \fIsed\fR |
95 | or \fIawk\fR or \fIsh\fR, but it |
96 | exceeds their capabilities or must run a little faster, |
97 | and you don't want to write the silly thing in C, then |
98 | .I perl |
99 | may be for you. |
a687059c |
100 | There are also translators to turn your |
101 | .I sed |
102 | and |
103 | .I awk |
104 | scripts into |
105 | .I perl |
106 | scripts. |
8d063cd8 |
107 | OK, enough hype. |
108 | .PP |
109 | Upon startup, |
110 | .I perl |
111 | looks for your script in one of the following places: |
112 | .Ip 1. 4 2 |
113 | Specified line by line via |
114 | .B \-e |
115 | switches on the command line. |
116 | .Ip 2. 4 2 |
117 | Contained in the file specified by the first filename on the command line. |
118 | (Note that systems supporting the #! notation invoke interpreters this way.) |
119 | .Ip 3. 4 2 |
a687059c |
120 | Passed in implicitly via standard input. |
378cc40b |
121 | This only works if there are no filename arguments\*(--to pass |
a687059c |
122 | arguments to a |
123 | .I stdin |
124 | script you must explicitly specify a \- for the script name. |
8d063cd8 |
125 | .PP |
126 | After locating your script, |
127 | .I perl |
128 | compiles it to an internal form. |
129 | If the script is syntactically correct, it is executed. |
130 | .Sh "Options" |
83b4785a |
131 | Note: on first reading this section may not make much sense to you. It's here |
8d063cd8 |
132 | at the front for easy reference. |
133 | .PP |
134 | A single-character option may be combined with the following option, if any. |
135 | This is particularly useful when invoking a script using the #! construct which |
136 | only allows one argument. Example: |
137 | .nf |
138 | |
139 | .ne 2 |
a687059c |
140 | #!/usr/bin/perl \-spi.bak # same as \-s \-p \-i.bak |
8d063cd8 |
141 | .\|.\|. |
142 | |
143 | .fi |
144 | Options include: |
145 | .TP 5 |
378cc40b |
146 | .B \-a |
a687059c |
147 | turns on autosplit mode when used with a |
148 | .B \-n |
149 | or |
150 | .BR \-p . |
378cc40b |
151 | An implicit split command to the @F array |
152 | is done as the first thing inside the implicit while loop produced by |
a687059c |
153 | the |
154 | .B \-n |
155 | or |
156 | .BR \-p . |
378cc40b |
157 | .nf |
158 | |
a687059c |
159 | perl \-ane \'print pop(@F), "\en";\' |
378cc40b |
160 | |
161 | is equivalent to |
162 | |
163 | while (<>) { |
a687059c |
164 | @F = split(\' \'); |
165 | print pop(@F), "\en"; |
378cc40b |
166 | } |
167 | |
168 | .fi |
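As a rough sketch of what autosplit does for a single input line, the implicit split can be written out by hand. The subroutine name last_field is made up for this illustration; it is not part of perl.

```perl
# Hypothetical helper: do for one input line what -a does
# before the script body runs.
sub last_field {
    local($_) = @_;            # the current input line
    local(@F) = split(' ');    # the implicit split that -a adds
    pop(@F);                   # the script body from the example above
}

print &last_field("alpha beta gamma\n"), "\n";    # prints gamma
```

Splitting on ' ' discards leading and trailing whitespace, including the newline, so the last field comes back without a trailing newline.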
169 | .TP 5 |
a687059c |
170 | .BI \-d |
171 | runs the script under the perl debugger. |
172 | See the section on Debugging. |
173 | .TP 5 |
174 | .BI \-D number |
8d063cd8 |
175 | sets debugging flags. |
176 | To watch how it executes your script, use |
a687059c |
177 | .BR \-D14 . |
8d063cd8 |
178 | (This only works if debugging is compiled into your |
179 | .IR perl .) |
a687059c |
180 | Another nice value is \-D1024, which lists your compiled syntax tree. |
181 | And \-D512 displays compiled regular expressions. |
8d063cd8 |
182 | .TP 5 |
a687059c |
183 | .BI \-e " commandline" |
8d063cd8 |
184 | may be used to enter one line of script. |
185 | Multiple |
186 | .B \-e |
187 | commands may be given to build up a multi-line script. |
188 | If |
189 | .B \-e |
190 | is given, |
191 | .I perl |
192 | will not look for a script filename in the argument list. |
193 | .TP 5 |
a687059c |
194 | .BI \-i extension |
8d063cd8 |
195 | specifies that files processed by the <> construct are to be edited |
196 | in-place. |
197 | It does this by renaming the input file, opening the output file by the |
198 | same name, and selecting that output file as the default for print statements. |
199 | The extension, if supplied, is added to the name of the |
200 | old file to make a backup copy. |
201 | If no extension is supplied, no backup is made. |
a687059c |
202 | Saying \*(L"perl \-p \-i.bak \-e "s/foo/bar/;" .\|.\|. \*(R" is the same as using |
8d063cd8 |
203 | the script: |
204 | .nf |
205 | |
206 | .ne 2 |
a687059c |
207 | #!/usr/bin/perl \-pi.bak |
8d063cd8 |
208 | s/foo/bar/; |
209 | |
210 | which is equivalent to |
211 | |
212 | .ne 14 |
378cc40b |
213 | #!/usr/bin/perl |
8d063cd8 |
214 | while (<>) { |
215 | if ($ARGV ne $oldargv) { |
a687059c |
216 | rename($ARGV, $ARGV . \'.bak\'); |
217 | open(ARGVOUT, ">$ARGV"); |
8d063cd8 |
218 | select(ARGVOUT); |
219 | $oldargv = $ARGV; |
220 | } |
221 | s/foo/bar/; |
222 | } |
223 | continue { |
224 | print; # this prints to original filename |
225 | } |
a687059c |
226 | select(STDOUT); |
8d063cd8 |
227 | |
228 | .fi |
a687059c |
229 | except that the |
230 | .B \-i |
231 | form doesn't need to compare $ARGV to $oldargv to know when |
8d063cd8 |
232 | the filename has changed. |
233 | It does, however, use ARGVOUT for the selected filehandle. |
a687059c |
234 | Note that |
235 | .I STDOUT |
236 | is restored as the default output filehandle after the loop. |
378cc40b |
237 | .Sp |
238 | You can use eof to locate the end of each input file, in case you want |
239 | to append to each file, or reset line numbering (see example under eof). |
8d063cd8 |
240 | .TP 5 |
a687059c |
241 | .BI \-I directory |
8d063cd8 |
242 | may be used in conjunction with |
243 | .B \-P |
244 | to tell the C preprocessor where to look for include files. |
245 | By default /usr/include and /usr/lib/perl are searched. |
246 | .TP 5 |
247 | .B \-n |
248 | causes |
249 | .I perl |
250 | to assume the following loop around your script, which makes it iterate |
a687059c |
251 | over filename arguments somewhat like \*(L"sed \-n\*(R" or \fIawk\fR: |
8d063cd8 |
252 | .nf |
253 | |
254 | .ne 3 |
255 | while (<>) { |
378cc40b |
256 | .\|.\|. # your script goes here |
8d063cd8 |
257 | } |
258 | |
259 | .fi |
260 | Note that the lines are not printed by default. |
261 | See |
262 | .B \-p |
263 | to have lines printed. |
378cc40b |
264 | Here is an efficient way to delete all files older than a week: |
265 | .nf |
266 | |
a687059c |
267 | find . \-mtime +7 \-print | perl \-ne \'chop;unlink;\' |
378cc40b |
268 | |
269 | .fi |
a687059c |
270 | This is faster than using the \-exec switch of find because you don't have to |
378cc40b |
271 | start a process on every filename found. |
8d063cd8 |
272 | .TP 5 |
273 | .B \-p |
274 | causes |
275 | .I perl |
276 | to assume the following loop around your script, which makes it iterate |
277 | over filename arguments somewhat like \fIsed\fR: |
278 | .nf |
279 | |
280 | .ne 5 |
281 | while (<>) { |
378cc40b |
282 | .\|.\|. # your script goes here |
8d063cd8 |
283 | } continue { |
284 | print; |
285 | } |
286 | |
287 | .fi |
288 | Note that the lines are printed automatically. |
289 | To suppress printing use the |
290 | .B \-n |
291 | switch. |
83b4785a |
292 | A |
293 | .B \-p |
294 | overrides a |
295 | .B \-n |
296 | switch. |
8d063cd8 |
297 | .TP 5 |
298 | .B \-P |
299 | causes your script to be run through the C preprocessor before |
300 | compilation by |
a687059c |
301 | .IR perl . |
8d063cd8 |
302 | (Since both comments and cpp directives begin with the # character, |
303 | you should avoid starting comments with any words recognized |
304 | by the C preprocessor such as \*(L"if\*(R", \*(L"else\*(R" or \*(L"define\*(R".) |
305 | .TP 5 |
306 | .B \-s |
307 | enables some rudimentary switch parsing for switches on the command line |
a687059c |
308 | after the script name but before any filename arguments (or before a \-\|\-). |
83b4785a |
309 | Any switch found there is removed from @ARGV and sets the corresponding variable in the |
8d063cd8 |
310 | .I perl |
311 | script. |
312 | The following script prints \*(L"true\*(R" if and only if the script is |
a687059c |
313 | invoked with a \-xyz switch. |
8d063cd8 |
314 | .nf |
315 | |
316 | .ne 2 |
a687059c |
317 | #!/usr/bin/perl \-s |
83b4785a |
318 | if ($xyz) { print "true\en"; } |
8d063cd8 |
319 | |
320 | .fi |
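The effect of the switch parsing can be approximated in plain perl. This is only a sketch: the names parse_switches, %switch and @rest are invented for the illustration, and the value 1 stored for each switch is an assumption, since the text above says only that the corresponding variable is set.

```perl
# Hypothetical helper: emulate what -s does to an argument list.
# Records each leading switch in %switch and leaves the rest in @rest.
sub parse_switches {
    local(@args) = @_;
    local($name);
    %switch = ();
    while (@args && $args[0] =~ /^-(.+)/) {
        $name = $1;
        shift(@args);
        last if $name eq '-';     # a -- marks the end of the switches
        $switch{$name} = 1;       # assumed value; see the note above
    }
    @rest = @args;
}

&parse_switches('-xyz', '--', '-notaswitch', 'file');
print "true\n" if $switch{'xyz'};
```

Everything after the \-\|\- stays in @rest untouched, even if it looks like a switch.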
378cc40b |
321 | .TP 5 |
322 | .B \-S |
a687059c |
323 | makes |
324 | .I perl |
325 | use the PATH environment variable to search for the script |
378cc40b |
326 | (unless the name of the script starts with a slash). |
327 | Typically this is used to emulate #! startup on machines that don't |
328 | support #!, in the following manner: |
329 | .nf |
330 | |
331 | #!/usr/bin/perl |
a687059c |
332 | eval "exec /usr/bin/perl \-S $0 $*" |
378cc40b |
333 | if $running_under_some_shell; |
334 | |
335 | .fi |
336 | The system ignores the first line and feeds the script to /bin/sh, |
a687059c |
337 | which proceeds to try to execute the |
338 | .I perl |
339 | script as a shell script. |
378cc40b |
340 | The shell executes the second line as a normal shell command, and thus |
a687059c |
341 | starts up the |
342 | .I perl |
343 | interpreter. |
378cc40b |
344 | On some systems $0 doesn't always contain the full pathname, |
a687059c |
345 | so the |
346 | .B \-S |
347 | tells |
348 | .I perl |
349 | to search for the script if necessary. |
350 | After |
351 | .I perl |
352 | locates the script, it parses the lines and ignores them because |
378cc40b |
353 | the variable $running_under_some_shell is never true. |
ae986130 |
354 | A better construct than $* would be ${1+"$@"}, which handles embedded spaces |
355 | and such in the filenames, but doesn't work if the script is being interpreted |
356 | by csh. |
357 | In order to start up sh rather than csh, some systems may have to replace the |
358 | #! line with a line containing just |
359 | a colon, which will be politely ignored by perl. |
378cc40b |
360 | .TP 5 |
a687059c |
361 | .B \-u |
362 | causes |
363 | .I perl |
364 | to dump core after compiling your script. |
365 | You can then take this core dump and turn it into an executable file |
366 | by using the undump program (not supplied). |
367 | This speeds startup at the expense of some disk space (which you can |
368 | minimize by stripping the executable). |
369 | (Still, a "hello world" executable comes out to about 200K on my machine.) |
370 | If you are going to run your executable as a set-id program then you |
371 | should probably compile it using taintperl rather than normal perl. |
372 | If you want to execute a portion of your script before dumping, use the |
373 | dump operator instead. |
374 | .TP 5 |
378cc40b |
375 | .B \-U |
a687059c |
376 | allows |
377 | .I perl |
378 | to do unsafe operations. |
13281fa4 |
379 | Currently the only \*(L"unsafe\*(R" operation is the unlinking of directories while |
378cc40b |
380 | running as superuser. |
381 | .TP 5 |
382 | .B \-v |
a687059c |
383 | prints the version and patchlevel of your |
384 | .I perl |
385 | executable. |
378cc40b |
386 | .TP 5 |
387 | .B \-w |
388 | prints warnings about identifiers that are mentioned only once, and scalar |
389 | variables that are used before being set. |
390 | Also warns about redefined subroutines, and references to undefined |
a687059c |
391 | filehandles or filehandles opened readonly that you are attempting to |
392 | write on. |
393 | Also warns you if you use == on values that don't look like numbers, and if |
394 | your subroutines recurse more than 100 deep. |
8d063cd8 |
395 | .Sh "Data Types and Objects" |
396 | .PP |
a687059c |
397 | .I Perl |
398 | has three data types: scalars, arrays of scalars, and |
399 | associative arrays of scalars. |
400 | Normal arrays are indexed by number, and associative arrays by string. |
8d063cd8 |
401 | .PP |
a687059c |
402 | The interpretation of operations and values in perl sometimes |
403 | depends on the requirements |
404 | of the context around the operation or value. |
405 | There are three major contexts: string, numeric and array. |
406 | Certain operations return array values |
407 | in contexts wanting an array, and scalar values otherwise. |
408 | (If this is true of an operation it will be mentioned in the documentation |
409 | for that operation.) |
410 | Operations which return scalars don't care whether the context is looking |
411 | for a string or a number, but |
412 | scalar variables and values are interpreted as strings or numbers |
413 | as appropriate to the context. |
378cc40b |
414 | A scalar is interpreted as TRUE in the boolean sense if it is not the null |
8d063cd8 |
415 | string or 0. |
a687059c |
416 | Booleans returned by operators are 1 for true and \'0\' or \'\' (the null |
8d063cd8 |
417 | string) for false. |
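The truth rule can be checked with a few values. The subroutine name truth is made up for this sketch.

```perl
# Hypothetical helper: report how a scalar is interpreted as a boolean.
sub truth {
    local($value) = @_;
    $value ? 'TRUE' : 'FALSE';
}

# The null string and 0 (as a number or as the string '0') are false.
print &truth(''), ' ', &truth(0), ' ', &truth('0'), "\n";    # prints FALSE FALSE FALSE
# Anything else is true, including the result of a true comparison.
print &truth('abc'), ' ', &truth(1 == 1), "\n";              # prints TRUE TRUE
```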
418 | .PP |
a687059c |
419 | There are actually two varieties of null string: defined and undefined. |
420 | Undefined null strings are returned when there is no real value for something, |
421 | such as when there was an error, or at end of file, or when you refer |
422 | to an uninitialized variable or element of an array. |
423 | An undefined null string may become defined the first time you access it, but |
424 | prior to that you can use the defined() operator to determine whether the |
425 | value is defined or not. |
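A small sketch of the difference between the two null strings. The names state_of and $never_set are invented for the example.

```perl
# Hypothetical helper: report whether a value is defined.
sub state_of {
    local($value) = @_;
    defined($value) ? 'defined' : 'undefined';
}

$before = &state_of($never_set);    # variable has not been assigned yet
$never_set = '';                    # now holds a defined null string
$after = &state_of($never_set);

print "$before, then $after\n";     # prints undefined, then defined
```

Both values are false in a boolean test; only defined() tells them apart.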
426 | .PP |
378cc40b |
427 | References to scalar variables always begin with \*(L'$\*(R', even when referring |
428 | to a scalar that is part of an array. |
8d063cd8 |
429 | Thus: |
430 | .nf |
431 | |
432 | .ne 3 |
378cc40b |
433 | $days \h'|2i'# a simple scalar variable |
8d063cd8 |
434 | $days[28] \h'|2i'# 29th element of array @days |
a687059c |
435 | $days{\'Feb\'}\h'|2i'# one value from an associative array |
378cc40b |
436 | $#days \h'|2i'# last index of array @days |
8d063cd8 |
437 | |
a687059c |
438 | but entire arrays or array slices are denoted by \*(L'@\*(R': |
8d063cd8 |
439 | |
440 | @days \h'|2i'# ($days[0], $days[1],\|.\|.\|. $days[n]) |
a687059c |
441 | @days[3,4,5]\h'|2i'# same as @days[3.\|.5] |
442 | @days{'a','c'}\h'|2i'# same as ($days{'a'},$days{'c'}) |
443 | |
444 | and entire associative arrays are denoted by \*(L'%\*(R': |
8d063cd8 |
445 | |
a687059c |
446 | %days \h'|2i'# (key1, val1, key2, val2 .\|.\|.) |
8d063cd8 |
447 | .fi |
448 | .PP |
a687059c |
449 | Any of these eight constructs may serve as an lvalue, |
378cc40b |
450 | that is, may be assigned to. |
a687059c |
451 | (It also turns out that an assignment is itself an lvalue in |
452 | certain contexts\*(--see examples under s, tr and chop.) |
453 | Assignment to a scalar evaluates the righthand side in a scalar context, |
454 | while assignment to an array or array slice evaluates the righthand side |
455 | in an array context. |
456 | .PP |
378cc40b |
457 | You may find the length of array @days by evaluating |
8d063cd8 |
458 | \*(L"$#days\*(R", as in |
459 | .IR csh . |
378cc40b |
460 | (Actually, it's not the length of the array, it's the subscript of the last element, since there is (ordinarily) a 0th element.) |
461 | Assigning to $#days changes the length of the array. |
462 | Shortening an array by this method does not actually destroy any values. |
463 | Lengthening an array that was previously shortened recovers the values that |
464 | were in those elements. |
465 | You can also gain some measure of efficiency by preextending an array that |
466 | is going to get big. |
467 | (You can also extend an array by assigning to an element that is off the |
468 | end of the array. |
469 | This differs from assigning to $#whatever in that intervening values |
470 | are set to null rather than recovered.) |
471 | You can truncate an array down to nothing by assigning the null list () to |
472 | it. |
473 | The following are exactly equivalent: |
474 | .nf |
475 | |
476 | @whatever = (); |
477 | $#whatever = $[ \- 1; |
478 | |
479 | .fi |
8d063cd8 |
480 | .PP |
a687059c |
481 | Multi-dimensional arrays are not directly supported, but see the discussion |
482 | of the $; variable later for a means of emulating multiple subscripts with |
483 | an associative array. |
484 | .PP |
8d063cd8 |
485 | Every data type has its own namespace. |
378cc40b |
486 | You can, without fear of conflict, use the same name for a scalar variable, |
8d063cd8 |
487 | an array, an associative array, a filehandle, a subroutine name, and/or |
488 | a label. |
a687059c |
489 | Since variable and array references always start with \*(L'$\*(R', \*(L'@\*(R', |
490 | or \*(L'%\*(R', the \*(L"reserved\*(R" words aren't in fact reserved |
8d063cd8 |
491 | with respect to variable names. |
492 | (They ARE reserved with respect to labels and filehandles, however, which |
378cc40b |
493 | don't have an initial special character. |
a687059c |
494 | Hint: you could say open(LOG,\'logfile\') rather than open(log,\'logfile\'). |
495 | Using uppercase filehandles also improves readability and protects you |
496 | from conflict with future reserved words.) |
8d063cd8 |
497 | Case IS significant\*(--\*(L"FOO\*(R", \*(L"Foo\*(R" and \*(L"foo\*(R" are all |
498 | different names. |
499 | Names which start with a letter may also contain digits and underscores. |
500 | Names which do not start with a letter are limited to one character, |
501 | e.g. \*(L"$%\*(R" or \*(L"$$\*(R". |
a687059c |
502 | (Most of the one character names have a predefined significance to |
503 | .IR perl . |
8d063cd8 |
504 | More later.) |
505 | .PP |
a687059c |
506 | Numeric literals are specified in any of the usual floating point or |
507 | integer formats: |
508 | .nf |
509 | |
510 | .ne 5 |
511 | 12345 |
512 | 12345.67 |
513 | .23E-10 |
514 | 0xffff # hex |
515 | 0377 # octal |
516 | |
517 | .fi |
8d063cd8 |
518 | String literals are delimited by either single or double quotes. |
519 | They work much like shell quotes: |
520 | double-quoted string literals are subject to backslash and variable |
a687059c |
521 | substitution; single-quoted strings are not (except for \e\' and \e\e). |
8d063cd8 |
522 | The usual backslash rules apply for making characters such as newline, tab, etc. |
523 | You can also embed newlines directly in your strings, i.e. they can end on |
524 | a different line than they begin. |
525 | This is nice, but if you forget your trailing quote, the error will not be |
a687059c |
526 | reported until |
527 | .I perl |
528 | finds another line containing the quote character, which |
8d063cd8 |
529 | may be much further on in the script. |
a687059c |
530 | Variable substitution inside strings is limited to scalar variables, normal |
531 | array values, and array slices. |
532 | (In other words, identifiers beginning with $ or @, followed by an optional |
533 | bracketed expression as a subscript.) |
8d063cd8 |
534 | The following code segment prints out \*(L"The price is $100.\*(R" |
535 | .nf |
536 | |
537 | .ne 2 |
a687059c |
538 | $Price = \'$100\';\h'|3.5i'# not interpreted |
8d063cd8 |
539 | print "The price is $Price.\e\|n";\h'|3.5i'# interpreted |
540 | |
541 | .fi |
83b4785a |
542 | Note that you can put curly brackets around the identifier to delimit it |
543 | from following alphanumerics. |
ae986130 |
544 | Also note that a single quoted string must be separated from a preceding |
545 | word by a space, since single quote is a valid character in an identifier |
546 | (see Packages). |
8d063cd8 |
547 | .PP |
a687059c |
548 | Array values are interpolated into double-quoted strings by joining all the |
549 | elements of the array with the delimiter specified in the $" variable, |
550 | space by default. |
551 | (Since in versions of perl prior to 3.0 the @ character was not a metacharacter |
552 | in double-quoted strings, the interpolation of @array, $array[EXPR], |
553 | @array[LIST], $array{EXPR}, or @array{LIST} only happens if array is |
554 | referenced elsewhere in the program or is predefined.) |
555 | The following are equivalent: |
556 | .nf |
557 | |
558 | .ne 4 |
559 | $temp = join($",@ARGV); |
560 | system "echo $temp"; |
561 | |
562 | system "echo @ARGV"; |
563 | |
564 | .fi |
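The same equivalence can be seen without running a command, and the delimiter can be changed temporarily with local(). The names @words and with_delimiter are made up for this sketch.

```perl
@words = ('red', 'green', 'blue');

$joined = join($", @words);      # explicit join on the $" delimiter
$interpolated = "@words";        # same result by interpolation

# Hypothetical helper: interpolate @words with a different delimiter.
sub with_delimiter {
    local($") = shift(@_);       # restored when the subroutine returns
    "@words";
}

print $interpolated, "\n";            # prints red green blue
print &with_delimiter(','), "\n";     # prints red,green,blue
```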
ae986130 |
565 | Within search patterns (which also undergo double-quotish substitution) |
a687059c |
566 | there is a bad ambiguity: Is /$foo[bar]/ to be |
567 | interpreted as /${foo}[bar]/ (where [bar] is a character class for the |
568 | regular expression) or as /${foo[bar]}/ (where [bar] is the subscript to |
569 | array @foo)? |
570 | If @foo doesn't otherwise exist, then it's obviously a character class. |
571 | If @foo exists, perl takes a good guess about [bar], and is almost always right. |
572 | If it does guess wrong, or if you're just plain paranoid, |
573 | you can force the correct interpretation with curly brackets as above. |
574 | .PP |
575 | A line-oriented form of quoting is based on the shell here-is syntax. |
576 | Following a << you specify a string to terminate the quoted material, and all lines |
577 | following the current line down to the terminating string are the value |
578 | of the item. |
579 | The terminating string may be either an identifier (a word), or some |
580 | quoted text. |
581 | If quoted, the type of quotes you use determines the treatment of the text, |
582 | just as in regular quoting. |
583 | An unquoted identifier works like double quotes. |
584 | There must be no space between the << and the identifier. |
585 | (If you put a space it will be treated as a null identifier, which is |
586 | valid, and matches the first blank line\*(--see Merry Christmas example below.) |
587 | The terminating string must appear by itself (unquoted and with no surrounding |
588 | whitespace) on the terminating line. |
589 | .nf |
590 | |
591 | print <<EOF; # same as above |
592 | The price is $Price. |
593 | EOF |
594 | |
595 | print <<"EOF"; # same as above |
596 | The price is $Price. |
597 | EOF |
598 | |
599 | print << x 10; # null identifier is delimiter |
600 | Merry Christmas! |
601 | |
602 | print <<`EOC`; # execute commands |
603 | echo hi there |
604 | echo lo there |
605 | EOC |
606 | |
607 | print <<foo, <<bar; # you can stack them |
608 | I said foo. |
609 | foo |
610 | I said bar. |
611 | bar |
612 | |
613 | .fi |
8d063cd8 |
614 | Array literals are denoted by separating individual values by commas, and |
615 | enclosing the list in parentheses. |
616 | In a context not requiring an array value, the value of the array literal |
617 | is the value of the final element, as in the C comma operator. |
618 | For example, |
619 | .nf |
620 | |
83b4785a |
621 | .ne 4 |
a687059c |
622 | @foo = (\'cc\', \'\-E\', $bar); |
8d063cd8 |
623 | |
624 | assigns the entire array value to array foo, but |
625 | |
a687059c |
626 | $foo = (\'cc\', \'\-E\', $bar); |
8d063cd8 |
627 | |
628 | .fi |
629 | assigns the value of variable bar to variable foo. |
630 | Array lists may be assigned to if and only if each element of the list |
631 | is an lvalue: |
632 | .nf |
633 | |
634 | ($a, $b, $c) = (1, 2, 3); |
635 | |
a687059c |
636 | ($map{\'red\'}, $map{\'blue\'}, $map{\'green\'}) = (0x00f, 0x0f0, 0xf00); |
637 | |
638 | The final element may be an array or an associative array: |
639 | |
640 | ($a, $b, @rest) = split; |
641 | local($a, $b, %rest) = @_; |
8d063cd8 |
642 | |
643 | .fi |
a687059c |
644 | You can actually put an array anywhere in the list, but the first array |
645 | in the list will soak up all the values, and anything after it will get |
646 | a null value. |
647 | This may be useful in a local(). |
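A short illustration of an array soaking up the remaining values. The variable names are chosen only for the example.

```perl
# @middle takes everything after the first value,
# so nothing is left over for $last.
($first, @middle, $last) = (1, 2, 3, 4);

print "first: $first\n";        # prints first: 1
print "middle: @middle\n";      # prints middle: 2 3 4
print "last is empty\n" unless defined($last);
```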
8d063cd8 |
648 | .PP |
a687059c |
649 | An associative array literal contains pairs of values to be interpreted |
650 | as a key and a value: |
651 | .nf |
652 | |
653 | .ne 2 |
654 | # same as map assignment above |
655 | %map = ('red',0x00f,'blue',0x0f0,'green',0xf00); |
656 | |
657 | .fi |
658 | Array assignment in a scalar context returns the number of elements |
659 | produced by the expression on the right side of the assignment: |
660 | .nf |
661 | |
662 | $x = (($foo,$bar) = (3,2,1)); # set $x to 3, not 2 |
663 | |
664 | .fi |
8d063cd8 |
665 | .PP |
666 | There are several other pseudo-literals that you should know about. |
378cc40b |
667 | If a string is enclosed by backticks (grave accents), it first undergoes |
668 | variable substitution just like a double quoted string. |
669 | It is then interpreted as a command, and the output of that command |
670 | is the value of the pseudo-literal, like in a shell. |
8d063cd8 |
671 | The command is executed each time the pseudo-literal is evaluated. |
378cc40b |
672 | The status value of the command is returned in $? (see Predefined Names |
673 | for the interpretation of $?). |
674 | Unlike in \f2csh\f1, no translation is done on the return |
8d063cd8 |
675 | data\*(--newlines remain newlines. |
378cc40b |
676 | Unlike in any of the shells, single quotes do not hide variable names |
677 | in the command from interpretation. |
678 | To pass a $ through to the shell you need to hide it with a backslash. |
8d063cd8 |
679 | .PP |
680 | Evaluating a filehandle in angle brackets yields the next line |
a687059c |
681 | from that file (newline included, so it's never false until EOF, at |
682 | which time an undefined value is returned). |
8d063cd8 |
683 | Ordinarily you must assign that value to a variable, |
684 | but there is one situation in which an automatic assignment happens. |
685 | If (and only if) the input symbol is the only thing inside the conditional of a |
686 | .I while |
687 | loop, the value is |
688 | automatically assigned to the variable \*(L"$_\*(R". |
689 | (This may seem like an odd thing to you, but you'll use the construct |
690 | in almost every |
691 | .I perl |
692 | script you write.) |
693 | Anyway, the following lines are equivalent to each other: |
694 | .nf |
695 | |
a687059c |
696 | .ne 5 |
697 | while ($_ = <STDIN>) { print; } |
698 | while (<STDIN>) { print; } |
699 | for (\|;\|<STDIN>;\|) { print; } |
700 | print while $_ = <STDIN>; |
701 | print while <STDIN>; |
8d063cd8 |
702 | |
703 | .fi |
704 | The filehandles |
a687059c |
705 | .IR STDIN , |
706 | .I STDOUT |
707 | and |
708 | .I STDERR |
709 | are predefined. |
710 | (The filehandles |
8d063cd8 |
711 | .IR stdin , |
712 | .I stdout |
713 | and |
714 | .I stderr |
a687059c |
715 | will also work except in packages, where they would be interpreted as |
716 | local identifiers rather than global.) |
8d063cd8 |
717 | Additional filehandles may be created with the |
718 | .I open |
719 | function. |
720 | .PP |
378cc40b |
721 | If a <FILEHANDLE> is used in a context that is looking for an array, an array |
722 | consisting of all the input lines is returned, one line per array element. |
723 | It's easy to make a LARGE data space this way, so use with care. |
724 | .PP |
8d063cd8 |
725 | The null filehandle <> is special and can be used to emulate the behavior of |
726 | \fIsed\fR and \fIawk\fR. |
727 | Input from <> comes either from standard input, or from each file listed on |
728 | the command line. |
729 | Here's how it works: the first time <> is evaluated, the ARGV array is checked, |
a687059c |
730 | and if it is null, $ARGV[0] is set to \'-\', which when opened gives you standard |
8d063cd8 |
731 | input. |
732 | The ARGV array is then processed as a list of filenames. |
733 | The loop |
734 | .nf |
735 | |
736 | .ne 3 |
737 | while (<>) { |
738 | .\|.\|. # code for each line |
739 | } |
740 | |
741 | .ne 10 |
742 | is equivalent to |
743 | |
a687059c |
744 | unshift(@ARGV, \'\-\') \|if \|$#ARGV < $[; |
8d063cd8 |
745 | while ($ARGV = shift) { |
746 | open(ARGV, $ARGV); |
747 | while (<ARGV>) { |
748 | .\|.\|. # code for each line |
749 | } |
750 | } |
751 | |
752 | .fi |
753 | except that it isn't as cumbersome to say. |
754 | It really does shift array ARGV and put the current filename into |
755 | variable ARGV. |
756 | It also uses filehandle ARGV internally. |
757 | You can modify @ARGV before the first <> as long as you leave the first |
758 | filename at the beginning of the array. |
83b4785a |
759 | Line numbers ($.) continue as if the input was one big happy file. |
378cc40b |
760 | (But see example under eof for how to reset line numbers on each file.) |
8d063cd8 |
761 | .PP |
83b4785a |
762 | .ne 5 |
378cc40b |
763 | If you want to set @ARGV to your own list of files, go right ahead. |
8d063cd8 |
764 | If you want to pass switches into your script, you can |
765 | put a loop on the front like this: |
766 | .nf |
767 | |
768 | .ne 10 |
769 | while ($_ = $ARGV[0], /\|^\-/\|) { |
770 | shift; |
771 | last if /\|^\-\|\-$\|/\|; |
772 | /\|^\-D\|(.*\|)/ \|&& \|($debug = $1); |
773 | /\|^\-v\|/ \|&& \|$verbose++; |
774 | .\|.\|. # other switches |
775 | } |
776 | while (<>) { |
777 | .\|.\|. # code for each line |
778 | } |
779 | |
780 | .fi |
781 | The <> symbol will return FALSE only once. |
782 | If you call it again after this it will assume you are processing another |
a687059c |
783 | @ARGV list, and if you haven't set @ARGV, will input from |
784 | .IR STDIN . |
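To see what the switch loop leaves behind, it can be run against a sample argument list. The sample arguments are invented for this sketch, and shift is written as shift(@ARGV) so that the fragment behaves the same way if it is pasted into a subroutine.

```perl
# Sample command line, assumed for the illustration.
@ARGV = ('-Dfoo', '-v', '-v', '--', 'file1');

while ($_ = $ARGV[0], /^-/) {
    shift(@ARGV);
    last if /^--$/;                # -- ends the switches
    /^-D(.*)/ && ($debug = $1);
    /^-v/ && $verbose++;
}

print "debug=$debug verbose=$verbose files=@ARGV\n";
# prints debug=foo verbose=2 files=file1
```

Only the filenames remain in @ARGV, which is what a following while (<>) loop expects.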
378cc40b |
785 | .PP |
786 | If the string inside the angle brackets is a reference to a scalar variable |
787 | (e.g. <$foo>), |
788 | then that variable contains the name of the filehandle to input from. |
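For instance, the following sketch reads standard input through a scalar;
the variable name $fh is chosen here only for illustration:
.nf

.ne 4
	$fh = \'STDIN\';		# name of the filehandle, held as a string
	while (<$fh>) {
		print;			# echo each line read via the indirect handle
	}

.fi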
789 | .PP |
790 | If the string inside angle brackets is not a filehandle, it is interpreted |
791 | as a filename pattern to be globbed, and either an array of filenames or the |
792 | next filename in the list is returned, depending on context. |
793 | One level of $ interpretation is done first, but you can't say <$foo> |
794 | because that's an indirect filehandle as explained in the previous |
795 | paragraph. |
796 | You could insert curly brackets to force interpretation as a |
797 | filename glob: <${foo}>. |
798 | Example: |
799 | .nf |
800 | |
801 | .ne 3 |
802 | while (<*.c>) { |
a687059c |
803 | chmod 0644, $_; |
378cc40b |
804 | } |
805 | |
806 | is equivalent to |
807 | |
808 | .ne 5 |
a687059c |
809 | open(foo, "echo *.c | tr \-s \' \et\er\ef\' \'\e\e012\e\e012\e\e012\e\e012\'|"); |
378cc40b |
810 | while (<foo>) { |
811 | chop; |
a687059c |
812 | chmod 0644, $_; |
378cc40b |
813 | } |
814 | |
815 | .fi |
816 | In fact, it's currently implemented that way. |
a687059c |
817 | (Which means it will not work on filenames with spaces in them unless |
818 | you have /bin/csh on your machine.) |
378cc40b |
819 | Of course, the shortest way to do the above is: |
820 | .nf |
821 | |
a687059c |
822 | chmod 0644, <*.c>; |
378cc40b |
823 | |
824 | .fi |
8d063cd8 |
825 | .Sh "Syntax" |
826 | .PP |
827 | A |
828 | .I perl |
829 | script consists of a sequence of declarations and commands. |
830 | The only things that need to be declared in |
831 | .I perl |
832 | are report formats and subroutines. |
833 | See the sections below for more information on those declarations. |
a687059c |
834 | All uninitialized user-created objects are assumed to |
835 | start with a null or 0 value until they |
836 | are defined by some explicit operation such as assignment. |
8d063cd8 |
837 | The sequence of commands is executed just once, unlike in |
838 | .I sed |
839 | and |
840 | .I awk |
841 | scripts, where the sequence of commands is executed for each input line. |
842 | While this means that you must explicitly loop over the lines of your input file |
843 | (or files), it also means you have much more control over which files and which |
844 | lines you look at. |
845 | (Actually, I'm lying\*(--it is possible to do an implicit loop with either the |
846 | .B \-n |
847 | or |
848 | .B \-p |
849 | switch.) |
850 | .PP |
851 | A declaration can be put anywhere a command can, but has no effect on the |
a687059c |
852 | execution of the primary sequence of commands--declarations all take effect |
853 | at compile time. |
8d063cd8 |
854 | Typically all the declarations are put at the beginning or the end of the script. |
855 | .PP |
856 | .I Perl |
857 | is, for the most part, a free-form language. |
858 | (The only exception to this is format declarations, for fairly obvious reasons.) |
859 | Comments are indicated by the # character, and extend to the end of the line. |
860 | If you attempt to use /* */ C comments, they will be interpreted either as |
861 | division or pattern matching, depending on the context. |
862 | So don't do that. |
863 | .Sh "Compound statements" |
864 | In |
865 | .IR perl , |
866 | a sequence of commands may be treated as one command by enclosing it |
867 | in curly brackets. |
868 | We will call this a BLOCK. |
869 | .PP |
870 | The following compound commands may be used to control flow: |
871 | .nf |
872 | |
873 | .ne 4 |
874 | if (EXPR) BLOCK |
875 | if (EXPR) BLOCK else BLOCK |
378cc40b |
876 | if (EXPR) BLOCK elsif (EXPR) BLOCK .\|.\|. else BLOCK |
8d063cd8 |
877 | LABEL while (EXPR) BLOCK |
878 | LABEL while (EXPR) BLOCK continue BLOCK |
879 | LABEL for (EXPR; EXPR; EXPR) BLOCK |
378cc40b |
880 | LABEL foreach VAR (ARRAY) BLOCK |
8d063cd8 |
881 | LABEL BLOCK continue BLOCK |
882 | |
883 | .fi |
83b4785a |
884 | Note that, unlike C and Pascal, these are defined in terms of BLOCKs, not |
8d063cd8 |
885 | statements. |
886 | This means that the curly brackets are \fIrequired\fR\*(--no dangling statements allowed. |
887 | If you want to write conditionals without curly brackets there are several |
888 | other ways to do it. |
889 | The following all do the same thing: |
890 | .nf |
891 | |
892 | .ne 5 |
a687059c |
893 | if (!open(foo)) { die "Can't open $foo: $!"; } |
894 | die "Can't open $foo: $!" unless open(foo); |
895 | open(foo) || die "Can't open $foo: $!"; # foo or bust! |
896 | open(foo) ? \'hi mom\' : die "Can't open $foo: $!"; |
897 | # a bit exotic, that last one |
8d063cd8 |
898 | |
899 | .fi |
8d063cd8 |
900 | .PP |
901 | The |
902 | .I if |
903 | statement is straightforward. |
904 | Since BLOCKs are always bounded by curly brackets, there is never any |
905 | ambiguity about which |
906 | .I if |
907 | an |
908 | .I else |
909 | goes with. |
910 | If you use |
911 | .I unless |
912 | in place of |
913 | .IR if , |
914 | the sense of the test is reversed. |
915 | .PP |
916 | The |
917 | .I while |
918 | statement executes the block as long as the expression is true |
919 | (does not evaluate to the null string or 0). |
920 | The LABEL is optional, and if present, consists of an identifier followed by |
921 | a colon. |
922 | The LABEL identifies the loop for the loop control statements |
923 | .IR next , |
a687059c |
924 | .IR last , |
8d063cd8 |
925 | and |
926 | .I redo |
927 | (see below). |
928 | If there is a |
929 | .I continue |
930 | BLOCK, it is always executed just before |
931 | the conditional is about to be evaluated again, similarly to the third part |
932 | of a |
933 | .I for |
934 | loop in C. |
935 | Thus it can be used to increment a loop variable, even when the loop has |
936 | been continued via the |
937 | .I next |
938 | statement (similar to the C \*(L"continue\*(R" statement). |
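As a sketch of that last point (the counter $i and the limit of 10 are
assumptions made only for this illustration):
.nf

.ne 7
	$i = 0;
	while ($i < 10) {
		next if $i % 2;		# odd: skip the print, but not the increment
		print $i, "\en";
	} continue {
		$i++;			# runs after every pass, including those ended by next
	}

.fi
Without the continue BLOCK, the increment would have to precede the
next, or the loop would never advance past the first odd value.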
939 | .PP |
940 | If the word |
941 | .I while |
942 | is replaced by the word |
943 | .IR until , |
944 | the sense of the test is reversed, but the conditional is still tested before |
945 | the first iteration. |
946 | .PP |
947 | In either the |
948 | .I if |
949 | or the |
950 | .I while |
951 | statement, you may replace \*(L"(EXPR)\*(R" with a BLOCK, and the conditional |
952 | is true if the value of the last command in that block is true. |
953 | .PP |
954 | The |
955 | .I for |
956 | loop works exactly like the corresponding |
957 | .I while |
958 | loop: |
959 | .nf |
960 | |
961 | .ne 12 |
962 | for ($i = 1; $i < 10; $i++) { |
963 | .\|.\|. |
964 | } |
965 | |
966 | is the same as |
967 | |
968 | $i = 1; |
969 | while ($i < 10) { |
970 | .\|.\|. |
971 | } continue { |
972 | $i++; |
973 | } |
974 | .fi |
975 | .PP |
378cc40b |
976 | The foreach loop iterates over a normal array value and sets the variable |
977 | VAR to be each element of the array in turn. |
13281fa4 |
978 | The \*(L"foreach\*(R" keyword is actually identical to the \*(L"for\*(R" keyword, |
979 | so you can use \*(L"foreach\*(R" for readability or \*(L"for\*(R" for brevity. |
378cc40b |
980 | If VAR is omitted, $_ is set to each value. |
981 | If ARRAY is an actual array (as opposed to an expression returning an array |
982 | value), you can modify each element of the array |
983 | by modifying VAR inside the loop. |
984 | Examples: |
985 | .nf |
986 | |
987 | .ne 5 |
988 | for (@ary) { s/foo/bar/; } |
989 | |
990 | foreach $elem (@elements) { |
991 | $elem *= 2; |
992 | } |
993 | |
a687059c |
994 | .ne 3 |
995 | for ((10,9,8,7,6,5,4,3,2,1,\'BOOM\')) { |
996 | print $_, "\en"; sleep(1); |
378cc40b |
997 | } |
998 | |
a687059c |
999 | for (1..15) { print "Merry Christmas\en"; } |
1000 | |
378cc40b |
1001 | .ne 3 |
a687059c |
1002 | foreach $item (split(/:[\e\e\en:]*/, $ENV{\'TERMCAP\'})) { |
378cc40b |
1003 | print "Item: $item\en"; |
1004 | } |
a687059c |
1005 | |
378cc40b |
1006 | .fi |
1007 | .PP |
8d063cd8 |
1008 | The BLOCK by itself (labeled or not) is equivalent to a loop that executes |
1009 | once. |
1010 | Thus you can use any of the loop control statements in it to leave or |
1011 | restart the block. |
1012 | The |
1013 | .I continue |
1014 | block is optional. |
1015 | This construct is particularly nice for doing case structures. |
1016 | .nf |
1017 | |
1018 | .ne 6 |
1019 | foo: { |
a687059c |
1020 | if (/^abc/) { $abc = 1; last foo; } |
1021 | if (/^def/) { $def = 1; last foo; } |
1022 | if (/^xyz/) { $xyz = 1; last foo; } |
8d063cd8 |
1023 | $nothing = 1; |
1024 | } |
1025 | |
1026 | .fi |
a687059c |
1027 | There is no official switch statement in perl, because there |
1028 | are already several ways to write the equivalent. |
1029 | In addition to the above, you could write |
378cc40b |
1030 | .nf |
1031 | |
a687059c |
1032 | .ne 6 |
1033 | foo: { |
1034 | $abc = 1, last foo if /^abc/; |
1035 | $def = 1, last foo if /^def/; |
1036 | $xyz = 1, last foo if /^xyz/; |
1037 | $nothing = 1; |
1038 | } |
1039 | |
1040 | or |
1041 | |
1042 | .ne 6 |
1043 | foo: { |
1044 | /^abc/ && do { $abc = 1; last foo; }; |
1045 | /^def/ && do { $def = 1; last foo; }; |
1046 | /^xyz/ && do { $xyz = 1; last foo; }; |
1047 | $nothing = 1; |
1048 | } |
1049 | |
1050 | or |
1051 | |
1052 | .ne 6 |
1053 | foo: { |
1054 | /^abc/ && ($abc = 1, last foo); |
1055 | /^def/ && ($def = 1, last foo); |
1056 | /^xyz/ && ($xyz = 1, last foo); |
1057 | $nothing = 1; |
1058 | } |
1059 | |
1060 | or even |
1061 | |
378cc40b |
1062 | .ne 8 |
a687059c |
1063 | if (/^abc/) |
1064 | { $abc = 1; } |
1065 | elsif (/^def/) |
1066 | { $def = 1; } |
1067 | elsif (/^xyz/) |
1068 | { $xyz = 1; } |
1069 | else |
1070 | {$nothing = 1;} |
378cc40b |
1071 | |
1072 | .fi |
a687059c |
1073 | As it happens, these are all optimized internally to a switch structure, |
1074 | so perl jumps directly to the desired statement, and you needn't worry |
1075 | about perl executing a lot of unnecessary statements when you have a string |
1076 | of 50 elsifs, as long as you are testing the same simple scalar variable |
1077 | using ==, eq, or pattern matching as above. |
1078 | (If you're curious as to whether the optimizer has done this for a particular |
1079 | case statement, you can use the \-D1024 switch to list the syntax tree |
1080 | before execution.) |
8d063cd8 |
1081 | .Sh "Simple statements" |
1082 | The only kind of simple statement is an expression evaluated for its side |
1083 | effects. |
1084 | Every expression (simple statement) must be terminated with a semicolon. |
1085 | Note that this is like C, but unlike Pascal (and |
1086 | .IR awk ). |
1087 | .PP |
1088 | Any simple statement may optionally be followed by a |
1089 | single modifier, just before the terminating semicolon. |
1090 | The possible modifiers are: |
1091 | .nf |
1092 | |
1093 | .ne 4 |
1094 | if EXPR |
1095 | unless EXPR |
1096 | while EXPR |
1097 | until EXPR |
1098 | |
1099 | .fi |
1100 | The |
1101 | .I if |
1102 | and |
1103 | .I unless |
1104 | modifiers have the expected semantics. |
1105 | The |
1106 | .I while |
1107 | and |
378cc40b |
1108 | .I until |
8d063cd8 |
1109 | modifiers also have the expected semantics (conditional evaluated first), |
1110 | except when applied to a do-BLOCK command, |
1111 | in which case the block executes once before the conditional is evaluated. |
1112 | This is so that you can write loops like: |
1113 | .nf |
1114 | |
1115 | .ne 4 |
1116 | do { |
a687059c |
1117 | $_ = <STDIN>; |
8d063cd8 |
1118 | .\|.\|. |
1119 | } until $_ \|eq \|".\|\e\|n"; |
1120 | |
1121 | .fi |
1122 | (See the |
1123 | .I do |
1124 | operator below. Note also that the loop control commands described later will |
83b4785a |
1125 | NOT work in this construct, since modifiers don't take loop labels. |
8d063cd8 |
1126 | Sorry.) |
1127 | .Sh "Expressions" |
1128 | Since |
1129 | .I perl |
1130 | expressions work almost exactly like C expressions, only the differences |
1131 | will be mentioned here. |
1132 | .PP |
1133 | Here's what |
1134 | .I perl |
1135 | has that C doesn't: |
a687059c |
1136 | .Ip ** 8 2 |
1137 | The exponentiation operator. |
1138 | .Ip **= 8 |
1139 | The exponentiation assignment operator. |
8d063cd8 |
1140 | .Ip (\|) 8 3 |
1141 | The null list, used to initialize an array to null. |
1142 | .Ip . 8 |
1143 | Concatenation of two strings. |
1144 | .Ip .= 8 |
a687059c |
1145 | The concatenation assignment operator. |
8d063cd8 |
1146 | .Ip eq 8 |
1147 | String equality (== is numeric equality). |
1148 | For a mnemonic just think of \*(L"eq\*(R" as a string. |
1149 | (If you are used to the |
1150 | .I awk |
1151 | behavior of using == for either string or numeric equality |
1152 | based on the current form of the comparands, beware! |
1153 | You must be explicit here.) |
1154 | .Ip ne 8 |
1155 | String inequality (!= is numeric inequality). |
1156 | .Ip lt 8 |
1157 | String less than. |
1158 | .Ip gt 8 |
1159 | String greater than. |
1160 | .Ip le 8 |
1161 | String less than or equal. |
1162 | .Ip ge 8 |
1163 | String greater than or equal. |
1164 | .Ip =~ 8 2 |
1165 | Certain operations search or modify the string \*(L"$_\*(R" by default. |
1166 | This operator makes that kind of operation work on some other string. |
1167 | The right argument is a search pattern, substitution, or translation. |
1168 | The left argument is what is supposed to be searched, substituted, or |
1169 | translated instead of the default \*(L"$_\*(R". |
1170 | The return value indicates the success of the operation. |
1171 | (If the right argument is an expression other than a search pattern, |
1172 | substitution, or translation, it is interpreted as a search pattern |
1173 | at run time. |
1174 | This is less efficient than an explicit search, since the pattern must |
1175 | be compiled every time the expression is evaluated.) |
1176 | The precedence of this operator is lower than unary minus and autoincrement/decrement, but higher than everything else. |
1177 | .Ip !~ 8 |
1178 | Just like =~ except the return value is negated. |
1179 | .Ip x 8 |
1180 | The repetition operator. |
1181 | Returns a string consisting of the left operand repeated the |
1182 | number of times specified by the right operand. |
1183 | .nf |
1184 | |
a687059c |
1185 | print \'\-\' x 80; # print row of dashes |
1186 | print \'\-\' x80; # illegal, x80 is identifier |
8d063cd8 |
1187 | |
a687059c |
1188 | print "\et" x ($tab/8), \' \' x ($tab%8); # tab over |
8d063cd8 |
1189 | |
1190 | .fi |
1191 | .Ip x= 8 |
a687059c |
1192 | The repetition assignment operator. |
1193 | .Ip .\|. 8 |
1194 | The range operator, which is really two different operators depending |
1195 | on the context. |
1196 | In an array context, returns an array of values counting (by ones) |
1197 | from the left value to the right value. |
1198 | This is useful for writing \*(L"for (1..10)\*(R" loops and for doing |
1199 | slice operations on arrays. |
1200 | .Sp |
1201 | In a scalar context, .\|. returns a boolean value. |
1202 | The operator is bistable, like a flip-flop. |
1203 | Each .\|. operator maintains its own boolean state. |
378cc40b |
1204 | It is false as long as its left operand is false. |
1205 | Once the left operand is true, the range operator stays true |
1206 | until the right operand is true, |
1207 | AFTER which the range operator becomes false again. |
a687059c |
1208 | (It doesn't become false till the next time the range operator is evaluated. |
8d063cd8 |
1209 | It can become false on the same evaluation it became true, but it still returns |
1210 | true once.) |
13281fa4 |
1211 | The right operand is not evaluated while the operator is in the \*(L"false\*(R" state, |
1212 | and the left operand is not evaluated while the operator is in the \*(L"true\*(R" state. |
a687059c |
1213 | The scalar .\|. operator is primarily intended for doing line number ranges |
1214 | after |
8d063cd8 |
1215 | the fashion of \fIsed\fR or \fIawk\fR. |
1216 | The precedence is a little lower than || and &&. |
1217 | The value returned is either the null string for false, or a sequence number |
1218 | (beginning with 1) for true. |
1219 | The sequence number is reset for each range encountered. |
a687059c |
1220 | The final sequence number in a range has the string \'E0\' appended to it, which |
8d063cd8 |
1221 | doesn't affect its numeric value, but gives you something to search for if you |
1222 | want to exclude the endpoint. |
1223 | You can exclude the beginning point by waiting for the sequence number to be |
1224 | greater than 1. |
a687059c |
1225 | If either operand of scalar .\|. is static, that operand is implicitly compared |
1226 | to the $. variable, the current line number. |
8d063cd8 |
1227 | Examples: |
1228 | .nf |
1229 | |
a687059c |
1230 | .ne 6 |
1231 | As a scalar operator: |
1232 | if (101 .\|. 200) { print; } # print 2nd hundred lines |
8d063cd8 |
1233 | |
a687059c |
1234 | next line if (1 .\|. /^$/); # skip header lines |
8d063cd8 |
1235 | |
a687059c |
1236 | s/^/> / if (/^$/ .\|. eof()); # quote body |
1237 | |
1238 | .ne 4 |
1239 | As an array operator: |
1240 | for (101 .\|. 200) { print; } # print $_ 100 times |
1241 | |
1242 | @foo = @foo[$[ .\|. $#foo]; # an expensive no-op |
1243 | @foo = @foo[$#foo-4 .\|. $#foo]; # slice last 5 items |
8d063cd8 |
1244 | |
1245 | .fi |
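.Sp
The sequence number and the \'E0\' suffix can be used together to act only
on the lines strictly between two markers.
The following is a sketch; the BEGIN and END marker lines and the variable
$seq are assumptions made for illustration:
.nf

.ne 7
	while (<>) {
		$seq = /^BEGIN$/ .\|. /^END$/;	# null string outside the range; 1, 2, ... inside
		next unless $seq;		# not in a range
		next if $seq == 1;		# skip the starting line
		next if $seq =~ /E0$/;		# skip the ending line
		print;
	}

.fi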
378cc40b |
1246 | .Ip \-x 8 |
1247 | A file test. |
1248 | This unary operator takes one argument, either a filename or a filehandle, |
1249 | and tests the associated file to see if something is true about it. |
a687059c |
1250 | If the argument is omitted, tests $_, except for \-t, which tests |
1251 | .IR STDIN . |
1252 | It returns 1 for true and \'\' for false, or the undefined value if the |
1253 | file doesn't exist. |
378cc40b |
1254 | Precedence is higher than logical and relational operators, but lower than |
1255 | arithmetic operators. |
1256 | The operator may be any of: |
1257 | .nf |
1258 | \-r File is readable by effective uid. |
a687059c |
1259 | \-w File is writable by effective uid. |
378cc40b |
1260 | \-x File is executable by effective uid. |
1261 | \-o File is owned by effective uid. |
1262 | \-R File is readable by real uid. |
a687059c |
1263 | \-W File is writable by real uid. |
378cc40b |
1264 | \-X File is executable by real uid. |
1265 | \-O File is owned by real uid. |
1266 | \-e File exists. |
1267 | \-z File has zero size. |
1268 | \-s File has non-zero size. |
1269 | \-f File is a plain file. |
1270 | \-d File is a directory. |
1271 | \-l File is a symbolic link. |
1272 | \-p File is a named pipe (FIFO). |
1273 | \-S File is a socket. |
1274 | \-b File is a block special file. |
1275 | \-c File is a character special file. |
1276 | \-u File has setuid bit set. |
1277 | \-g File has setgid bit set. |
1278 | \-k File has sticky bit set. |
1279 | \-t Filehandle is opened to a tty. |
1280 | \-T File is a text file. |
1281 | \-B File is a binary file (opposite of \-T). |
1282 | |
1283 | .fi |
1284 | The interpretation of the file permission operators \-r, \-R, \-w, \-W, \-x and \-X |
1285 | is based solely on the mode of the file and the uids and gids of the user. |
1286 | There may be other reasons you can't actually read, write or execute the file. |
1287 | Also note that, for the superuser, \-r, \-R, \-w and \-W always return 1, and |
1288 | \-x and \-X return 1 if any execute bit is set in the mode. |
1289 | Scripts run by the superuser may thus need to do a stat() in order to determine |
1290 | the actual mode of the file, or temporarily set the uid to something else. |
1291 | .Sp |
1292 | Example: |
1293 | .nf |
1294 | .ne 7 |
1295 | |
1296 | while (<>) { |
1297 | chop; |
1298 | next unless \-f $_; # ignore specials |
1299 | .\|.\|. |
1300 | } |
1301 | |
1302 | .fi |
a687059c |
1303 | Note that \-s/a/b/ does not do a negated substitution. |
1304 | Saying \-exp($foo) still works as expected, however\*(--only single letters |
378cc40b |
1305 | following a minus are interpreted as file tests. |
1306 | .Sp |
1307 | The \-T and \-B switches work as follows. |
1308 | The first block or so of the file is examined for odd characters such as |
1309 | strange control codes or metacharacters. |
1310 | If too many odd characters (>10%) are found, it's a \-B file, otherwise it's a \-T file. |
1311 | Also, any file containing null in the first block is considered a binary file. |
1312 | If \-T or \-B is used on a filehandle, the current stdio buffer is examined |
1313 | rather than the first block. |
378cc40b |
1314 | Both \-T and \-B return TRUE on a null file, or a file at EOF when testing |
1315 | a filehandle. |
8d063cd8 |
1316 | .PP |
a687059c |
1317 | If any of the file tests (or either stat operator) are given the special |
1318 | filehandle consisting of a solitary underline, then the stat structure |
1319 | of the previous file test (or stat operator) is used, saving a system |
1320 | call. |
1321 | (This doesn't work with \-t, and you need to remember that lstat and \-l |
1322 | will leave values in the stat structure for the symbolic link, not the |
1323 | real file.) |
1324 | Example: |
1325 | .nf |
1326 | |
1327 | print "Can do.\en" if \-r $a || \-w _ || \-x _; |
1328 | |
1329 | .ne 9 |
1330 | stat($filename); |
1331 | print "Readable\en" if \-r _; |
1332 | print "Writable\en" if \-w _; |
1333 | print "Executable\en" if \-x _; |
1334 | print "Setuid\en" if \-u _; |
1335 | print "Setgid\en" if \-g _; |
1336 | print "Sticky\en" if \-k _; |
1337 | print "Text\en" if \-T _; |
1338 | print "Binary\en" if \-B _; |
1339 | |
1340 | .fi |
1341 | .PP |
8d063cd8 |
1342 | Here is what C has that |
1343 | .I perl |
1344 | doesn't: |
1345 | .Ip "unary &" 12 |
1346 | Address-of operator. |
1347 | .Ip "unary *" 12 |
1348 | Dereference-address operator. |
378cc40b |
1349 | .Ip "(TYPE)" 12 |
1350 | Type casting operator. |
8d063cd8 |
1351 | .PP |
1352 | Like C, |
1353 | .I perl |
1354 | does a certain amount of expression evaluation at compile time, whenever |
1355 | it determines that all of the arguments to an operator are static and have |
1356 | no side effects. |
1357 | In particular, string concatenation happens at compile time between literals that don't do variable substitution. |
1358 | Backslash interpretation also happens at compile time. |
1359 | You can say |
1360 | .nf |
1361 | |
1362 | .ne 2 |
a687059c |
1363 | \'Now is the time for all\' . "\|\e\|n" . |
1364 | \'good men to come to.\' |
8d063cd8 |
1365 | |
1366 | .fi |
1367 | and this all reduces to one string internally. |
1368 | .PP |
378cc40b |
1369 | The autoincrement operator has a little extra built-in magic to it. |
1370 | If you increment a variable that is numeric, or that has ever been used in |
1371 | a numeric context, you get a normal increment. |
1372 | If, however, the variable has only been used in string contexts since it |
1373 | was set, and has a value that is not null and matches the |
a687059c |
1374 | pattern /^[a\-zA\-Z]*[0\-9]*$/, the increment is done |
378cc40b |
1375 | as a string, preserving each character within its range, with carry: |
1376 | .nf |
1377 | |
a687059c |
1378 | print ++($foo = \'99\'); # prints \*(L'100\*(R' |
1379 | print ++($foo = \'a0\'); # prints \*(L'a1\*(R' |
1380 | print ++($foo = \'Az\'); # prints \*(L'Ba\*(R' |
1381 | print ++($foo = \'zz\'); # prints \*(L'aaa\*(R' |
378cc40b |
1382 | |
1383 | .fi |
1384 | The autodecrement is not magical. |