Commit | Line | Data |
8d063cd8 |
1 | .rn '' }` |
ac58e20f |
2 | ''' $Header: perl.man.1,v 3.0.1.3 90/02/28 17:54:32 lwall Locked $ |
8d063cd8 |
3 | ''' |
4 | ''' $Log: perl.man.1,v $ |
ac58e20f |
5 | ''' Revision 3.0.1.3 90/02/28 17:54:32 lwall |
6 | ''' patch9: @array in scalar context now returns length of array |
7 | ''' patch9: in manual, example of open and ?: was backwards |
8 | ''' |
ffed7fef |
9 | ''' Revision 3.0.1.2 89/11/17 15:30:03 lwall |
10 | ''' patch5: fixed some manual typos and indent problems |
11 | ''' |
ae986130 |
12 | ''' Revision 3.0.1.1 89/11/11 04:41:22 lwall |
13 | ''' patch2: explained about sh and ${1+"$@"} |
14 | ''' patch2: documented that space must separate word and '' string |
15 | ''' |
a687059c |
16 | ''' Revision 3.0 89/10/18 15:21:29 lwall |
17 | ''' 3.0 baseline |
8d063cd8 |
18 | ''' |
19 | ''' |
20 | .de Sh |
21 | .br |
22 | .ne 5 |
23 | .PP |
24 | \fB\\$1\fR |
25 | .PP |
26 | .. |
27 | .de Sp |
28 | .if t .sp .5v |
29 | .if n .sp |
30 | .. |
31 | .de Ip |
32 | .br |
33 | .ie \\n.$>=3 .ne \\$3 |
34 | .el .ne 3 |
35 | .IP "\\$1" \\$2 |
36 | .. |
37 | ''' |
38 | ''' Set up \*(-- to give an unbreakable dash; |
39 | ''' string Tr holds user defined translation string. |
40 | ''' Bell System Logo is used as a dummy character. |
41 | ''' |
378cc40b |
42 | .tr \(*W-|\(bv\*(Tr |
8d063cd8 |
43 | .ie n \{\ |
378cc40b |
44 | .ds -- \(*W- |
45 | .if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch |
46 | .if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch |
8d063cd8 |
47 | .ds L" "" |
48 | .ds R" "" |
49 | .ds L' ' |
50 | .ds R' ' |
51 | 'br\} |
52 | .el\{\ |
53 | .ds -- \(em\| |
54 | .tr \*(Tr |
55 | .ds L" `` |
56 | .ds R" '' |
57 | .ds L' ` |
58 | .ds R' ' |
59 | 'br\} |
a687059c |
60 | .TH PERL 1 "\*(RP" |
61 | .UC |
8d063cd8 |
62 | .SH NAME |
a687059c |
63 | perl \- Practical Extraction and Report Language |
8d063cd8 |
64 | .SH SYNOPSIS |
a687059c |
65 | .B perl |
66 | [options] filename args |
8d063cd8 |
67 | .SH DESCRIPTION |
68 | .I Perl |
a687059c |
69 | is an interpreted language optimized for scanning arbitrary text files, |
8d063cd8 |
70 | extracting information from those text files, and printing reports based |
71 | on that information. |
72 | It's also a good language for many system management tasks. |
73 | The language is intended to be practical (easy to use, efficient, complete) |
74 | rather than beautiful (tiny, elegant, minimal). |
75 | It combines (in the author's opinion, anyway) some of the best features of C, |
76 | \fIsed\fR, \fIawk\fR, and \fIsh\fR, |
77 | so people familiar with those languages should have little difficulty with it. |
78 | (Language historians will also note some vestiges of \fIcsh\fR, Pascal, and |
79 | even BASIC-PLUS.) |
80 | Expression syntax corresponds quite closely to C expression syntax. |
a687059c |
81 | Unlike most Unix utilities, |
82 | .I perl |
83 | does not arbitrarily limit the size of your data\*(--if you've got |
84 | the memory, |
85 | .I perl |
86 | can slurp in your whole file as a single string. |
87 | Recursion is of unlimited depth. |
88 | And the hash tables used by associative arrays grow as necessary to prevent |
89 | degraded performance. |
90 | .I Perl |
91 | uses sophisticated pattern matching techniques to scan large amounts of |
92 | data very quickly. |
93 | Although optimized for scanning text, |
94 | .I perl |
95 | can also deal with binary data, and can make dbm files look like associative |
96 | arrays (where dbm is available). |
97 | Setuid |
98 | .I perl |
99 | scripts are safer than C programs |
100 | through a dataflow tracing mechanism which prevents many stupid security holes. |
8d063cd8 |
101 | If you have a problem that would ordinarily use \fIsed\fR |
102 | or \fIawk\fR or \fIsh\fR, but it |
103 | exceeds their capabilities or must run a little faster, |
104 | and you don't want to write the silly thing in C, then |
105 | .I perl |
106 | may be for you. |
a687059c |
107 | There are also translators to turn your |
108 | .I sed |
109 | and |
110 | .I awk |
111 | scripts into |
112 | .I perl |
113 | scripts. |
8d063cd8 |
114 | OK, enough hype. |
115 | .PP |
116 | Upon startup, |
117 | .I perl |
118 | looks for your script in one of the following places: |
119 | .Ip 1. 4 2 |
120 | Specified line by line via |
121 | .B \-e |
122 | switches on the command line. |
123 | .Ip 2. 4 2 |
124 | Contained in the file specified by the first filename on the command line. |
125 | (Note that systems supporting the #! notation invoke interpreters this way.) |
126 | .Ip 3. 4 2 |
a687059c |
127 | Passed in implicitly via standard input. |
378cc40b |
128 | This only works if there are no filename arguments\*(--to pass |
a687059c |
129 | arguments to a |
130 | .I stdin |
131 | script you must explicitly specify a \- for the script name. |
8d063cd8 |
132 | .PP |
133 | After locating your script, |
134 | .I perl |
135 | compiles it to an internal form. |
136 | If the script is syntactically correct, it is executed. |
137 | .Sh "Options" |
83b4785a |
138 | Note: on first reading this section may not make much sense to you. It's here |
8d063cd8 |
139 | at the front for easy reference. |
140 | .PP |
141 | A single-character option may be combined with the following option, if any. |
142 | This is particularly useful when invoking a script using the #! construct which |
143 | only allows one argument. Example: |
144 | .nf |
145 | |
146 | .ne 2 |
a687059c |
147 | #!/usr/bin/perl \-spi.bak # same as \-s \-p \-i.bak |
8d063cd8 |
148 | .\|.\|. |
149 | |
150 | .fi |
151 | Options include: |
152 | .TP 5 |
378cc40b |
153 | .B \-a |
a687059c |
154 | turns on autosplit mode when used with a |
155 | .B \-n |
156 | or |
157 | .BR \-p . |
378cc40b |
158 | An implicit split command to the @F array |
159 | is done as the first thing inside the implicit while loop produced by |
a687059c |
160 | the |
161 | .B \-n |
162 | or |
163 | .BR \-p . |
378cc40b |
164 | .nf |
165 | |
a687059c |
166 | perl \-ane \'print pop(@F), "\en";\' |
378cc40b |
167 | |
168 | is equivalent to |
169 | |
170 | while (<>) { |
a687059c |
171 | @F = split(\' \'); |
172 | print pop(@F), "\en"; |
378cc40b |
173 | } |
174 | |
175 | .fi |
176 | .TP 5 |
a687059c |
177 | .BI \-d |
178 | runs the script under the perl debugger. |
179 | See the section on Debugging. |
180 | .TP 5 |
181 | .BI \-D number |
8d063cd8 |
182 | sets debugging flags. |
183 | To watch how it executes your script, use |
a687059c |
184 | .BR \-D14 . |
8d063cd8 |
185 | (This only works if debugging is compiled into your |
186 | .IR perl .) |
a687059c |
187 | Another nice value is \-D1024, which lists your compiled syntax tree. |
188 | And \-D512 displays compiled regular expressions. |
8d063cd8 |
189 | .TP 5 |
a687059c |
190 | .BI \-e " commandline" |
8d063cd8 |
191 | may be used to enter one line of script. |
192 | Multiple |
193 | .B \-e |
194 | commands may be given to build up a multi-line script. |
195 | If |
196 | .B \-e |
197 | is given, |
198 | .I perl |
199 | will not look for a script filename in the argument list. |
200 | .TP 5 |
a687059c |
201 | .BI \-i extension |
8d063cd8 |
202 | specifies that files processed by the <> construct are to be edited |
203 | in-place. |
204 | It does this by renaming the input file, opening the output file by the |
205 | same name, and selecting that output file as the default for print statements. |
206 | The extension, if supplied, is added to the name of the |
207 | old file to make a backup copy. |
208 | If no extension is supplied, no backup is made. |
a687059c |
209 | Saying \*(L"perl \-p \-i.bak \-e "s/foo/bar/;" .\|.\|. \*(R" is the same as using |
8d063cd8 |
210 | the script: |
211 | .nf |
212 | |
213 | .ne 2 |
a687059c |
214 | #!/usr/bin/perl \-pi.bak |
8d063cd8 |
215 | s/foo/bar/; |
216 | |
217 | which is equivalent to |
218 | |
219 | .ne 14 |
378cc40b |
220 | #!/usr/bin/perl |
8d063cd8 |
221 | while (<>) { |
222 | if ($ARGV ne $oldargv) { |
a687059c |
223 | rename($ARGV, $ARGV . \'.bak\'); |
224 | open(ARGVOUT, ">$ARGV"); |
8d063cd8 |
225 | select(ARGVOUT); |
226 | $oldargv = $ARGV; |
227 | } |
228 | s/foo/bar/; |
229 | } |
230 | continue { |
231 | print; # this prints to original filename |
232 | } |
a687059c |
233 | select(STDOUT); |
8d063cd8 |
234 | |
235 | .fi |
a687059c |
236 | except that the |
237 | .B \-i |
238 | form doesn't need to compare $ARGV to $oldargv to know when |
8d063cd8 |
239 | the filename has changed. |
240 | It does, however, use ARGVOUT for the selected filehandle. |
a687059c |
241 | Note that |
242 | .I STDOUT |
243 | is restored as the default output filehandle after the loop. |
378cc40b |
244 | .Sp |
245 | You can use eof to locate the end of each input file, in case you want |
246 | to append to each file, or reset line numbering (see example under eof). |
8d063cd8 |
247 | .TP 5 |
a687059c |
248 | .BI \-I directory |
8d063cd8 |
249 | may be used in conjunction with |
250 | .B \-P |
251 | to tell the C preprocessor where to look for include files. |
252 | By default /usr/include and /usr/lib/perl are searched. |
253 | .TP 5 |
254 | .B \-n |
255 | causes |
256 | .I perl |
257 | to assume the following loop around your script, which makes it iterate |
a687059c |
258 | over filename arguments somewhat like \*(L"sed \-n\*(R" or \fIawk\fR: |
8d063cd8 |
259 | .nf |
260 | |
261 | .ne 3 |
262 | while (<>) { |
378cc40b |
263 | .\|.\|. # your script goes here |
8d063cd8 |
264 | } |
265 | |
266 | .fi |
267 | Note that the lines are not printed by default. |
268 | See |
269 | .B \-p |
270 | to have lines printed. |
378cc40b |
271 | Here is an efficient way to delete all files older than a week: |
272 | .nf |
273 | |
a687059c |
274 | find . \-mtime +7 \-print | perl \-ne \'chop;unlink;\' |
378cc40b |
275 | |
276 | .fi |
a687059c |
277 | This is faster than using the \-exec switch of find because you don't have to |
378cc40b |
278 | start a process on every filename found. |
8d063cd8 |
279 | .TP 5 |
280 | .B \-p |
281 | causes |
282 | .I perl |
283 | to assume the following loop around your script, which makes it iterate |
284 | over filename arguments somewhat like \fIsed\fR: |
285 | .nf |
286 | |
287 | .ne 5 |
288 | while (<>) { |
378cc40b |
289 | .\|.\|. # your script goes here |
8d063cd8 |
290 | } continue { |
291 | print; |
292 | } |
293 | |
294 | .fi |
295 | Note that the lines are printed automatically. |
296 | To suppress printing use the |
297 | .B \-n |
298 | switch. |
83b4785a |
299 | A |
300 | .B \-p |
301 | overrides a |
302 | .B \-n |
303 | switch. |
8d063cd8 |
304 | .TP 5 |
305 | .B \-P |
306 | causes your script to be run through the C preprocessor before |
307 | compilation by |
a687059c |
308 | .IR perl . |
8d063cd8 |
309 | (Since both comments and cpp directives begin with the # character, |
310 | you should avoid starting comments with any words recognized |
311 | by the C preprocessor such as \*(L"if\*(R", \*(L"else\*(R" or \*(L"define\*(R".) |
312 | .TP 5 |
313 | .B \-s |
314 | enables some rudimentary switch parsing for switches on the command line |
a687059c |
315 | after the script name but before any filename arguments (or before a \-\|\-). |
83b4785a |
316 | Any switch found there is removed from @ARGV and sets the corresponding variable in the |
8d063cd8 |
317 | .I perl |
318 | script. |
319 | The following script prints \*(L"true\*(R" if and only if the script is |
a687059c |
320 | invoked with a \-xyz switch. |
8d063cd8 |
321 | .nf |
322 | |
323 | .ne 2 |
a687059c |
324 | #!/usr/bin/perl \-s |
83b4785a |
325 | if ($xyz) { print "true\en"; } |
8d063cd8 |
326 | |
327 | .fi |
378cc40b |
328 | .TP 5 |
329 | .B \-S |
a687059c |
330 | makes |
331 | .I perl |
332 | use the PATH environment variable to search for the script |
378cc40b |
333 | (unless the name of the script starts with a slash). |
334 | Typically this is used to emulate #! startup on machines that don't |
335 | support #!, in the following manner: |
336 | .nf |
337 | |
338 | #!/usr/bin/perl |
a687059c |
339 | eval "exec /usr/bin/perl \-S $0 $*" |
378cc40b |
340 | if $running_under_some_shell; |
341 | |
342 | .fi |
343 | The system ignores the first line and feeds the script to /bin/sh, |
a687059c |
344 | which proceeds to try to execute the |
345 | .I perl |
346 | script as a shell script. |
378cc40b |
347 | The shell executes the second line as a normal shell command, and thus |
a687059c |
348 | starts up the |
349 | .I perl |
350 | interpreter. |
378cc40b |
351 | On some systems $0 doesn't always contain the full pathname, |
a687059c |
352 | so the |
353 | .B \-S |
354 | tells |
355 | .I perl |
356 | to search for the script if necessary. |
357 | After |
358 | .I perl |
359 | locates the script, it parses the lines and ignores them because |
378cc40b |
360 | the variable $running_under_some_shell is never true. |
ae986130 |
361 | A better construct than $* would be ${1+"$@"}, which handles embedded spaces |
362 | and such in the filenames, but doesn't work if the script is being interpreted |
363 | by csh. |
364 | In order to start up sh rather than csh, some systems may have to replace the |
365 | #! line with a line containing just |
366 | a colon, which will be politely ignored by perl. |
378cc40b |
367 | .TP 5 |
a687059c |
368 | .B \-u |
369 | causes |
370 | .I perl |
371 | to dump core after compiling your script. |
372 | You can then take this core dump and turn it into an executable file |
373 | by using the undump program (not supplied). |
374 | This speeds startup at the expense of some disk space (which you can |
375 | minimize by stripping the executable). |
376 | (Still, a "hello world" executable comes out to about 200K on my machine.) |
377 | If you are going to run your executable as a set-id program then you |
378 | should probably compile it using taintperl rather than normal perl. |
379 | If you want to execute a portion of your script before dumping, use the |
380 | dump operator instead. |
381 | .TP 5 |
378cc40b |
382 | .B \-U |
a687059c |
383 | allows |
384 | .I perl |
385 | to do unsafe operations. |
13281fa4 |
386 | Currently the only \*(L"unsafe\*(R" operation is the unlinking of directories while |
378cc40b |
387 | running as superuser. |
388 | .TP 5 |
389 | .B \-v |
a687059c |
390 | prints the version and patchlevel of your |
391 | .I perl |
392 | executable. |
378cc40b |
393 | .TP 5 |
394 | .B \-w |
395 | prints warnings about identifiers that are mentioned only once, and scalar |
396 | variables that are used before being set. |
397 | Also warns about redefined subroutines, and references to undefined |
a687059c |
398 | filehandles or filehandles opened readonly that you are attempting to |
399 | write on. |
400 | Also warns you if you use == on values that don't look like numbers, and if |
401 | your subroutines recurse more than 100 deep. |
8d063cd8 |
402 | .Sh "Data Types and Objects" |
403 | .PP |
a687059c |
404 | .I Perl |
405 | has three data types: scalars, arrays of scalars, and |
406 | associative arrays of scalars. |
407 | Normal arrays are indexed by number, and associative arrays by string. |
8d063cd8 |
408 | .PP |
a687059c |
409 | The interpretation of operations and values in perl sometimes |
410 | depends on the requirements |
411 | of the context around the operation or value. |
412 | There are three major contexts: string, numeric and array. |
413 | Certain operations return array values |
414 | in contexts wanting an array, and scalar values otherwise. |
415 | (If this is true of an operation it will be mentioned in the documentation |
416 | for that operation.) |
417 | Operations which return scalars don't care whether the context is looking |
418 | for a string or a number, but |
419 | scalar variables and values are interpreted as strings or numbers |
420 | as appropriate to the context. |
378cc40b |
421 | A scalar is interpreted as TRUE in the boolean sense if it is not the null |
8d063cd8 |
422 | string or 0. |
ffed7fef |
423 | Booleans returned by operators are 1 for true and 0 or \'\' (the null |
8d063cd8 |
424 | string) for false. |
425 | .PP |
a687059c |
426 | There are actually two varieties of null string: defined and undefined. |
427 | Undefined null strings are returned when there is no real value for something, |
428 | such as when there was an error, or at end of file, or when you refer |
429 | to an uninitialized variable or element of an array. |
430 | An undefined null string may become defined the first time you access it, but |
431 | prior to that you can use the defined() operator to determine whether the |
432 | value is defined or not. |
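The truth and definedness rules above can be sketched as follows. This is a minimal illustration, not part of the original manual: the variables $v, $empty and $unset are made-up names, and the code is written to run under a current perl without "use strict" (it has not been checked against perl 3.0).

```perl
# Sketch only: @values, $empty and $unset are illustrative names.
@values = ('', 0, '0', 'a', 1, ' ');
foreach $v (@values) {
    # The null string and 0 are false; everything else here is true.
    print "'$v' is ", ($v ? 'TRUE' : 'FALSE'), "\n";
}

# defined() tells a defined null string from an undefined one.
$empty = '';
print "\$empty is defined\n"   if defined($empty);
print "\$unset is undefined\n" unless defined($unset);
```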
433 | .PP |
378cc40b |
434 | References to scalar variables always begin with \*(L'$\*(R', even when referring |
435 | to a scalar that is part of an array. |
8d063cd8 |
436 | Thus: |
437 | .nf |
438 | |
439 | .ne 3 |
378cc40b |
440 | $days \h'|2i'# a simple scalar variable |
8d063cd8 |
441 | $days[28] \h'|2i'# 29th element of array @days |
a687059c |
442 | $days{\'Feb\'}\h'|2i'# one value from an associative array |
378cc40b |
443 | $#days \h'|2i'# last index of array @days |
8d063cd8 |
444 | |
a687059c |
445 | but entire arrays or array slices are denoted by \*(L'@\*(R': |
8d063cd8 |
446 | |
447 | @days \h'|2i'# ($days[0], $days[1],\|.\|.\|. $days[n]) |
a687059c |
448 | @days[3,4,5]\h'|2i'# same as @days[3.\|.5] |
449 | @days{'a','c'}\h'|2i'# same as ($days{'a'},$days{'c'}) |
450 | |
451 | and entire associative arrays are denoted by \*(L'%\*(R': |
8d063cd8 |
452 | |
a687059c |
453 | %days \h'|2i'# (key1, val1, key2, val2 .\|.\|.) |
8d063cd8 |
454 | .fi |
455 | .PP |
a687059c |
456 | Any of these eight constructs may serve as an lvalue, |
378cc40b |
457 | that is, may be assigned to. |
a687059c |
458 | (It also turns out that an assignment is itself an lvalue in |
459 | certain contexts\*(--see examples under s, tr and chop.) |
460 | Assignment to a scalar evaluates the righthand side in a scalar context, |
461 | while assignment to an array or array slice evaluates the righthand side |
462 | in an array context. |
463 | .PP |
378cc40b |
464 | You may find the length of array @days by evaluating |
8d063cd8 |
465 | \*(L"$#days\*(R", as in |
466 | .IR csh . |
378cc40b |
467 | (Strictly speaking, that is the subscript of the last element rather than the length, since there is (ordinarily) a 0th element.) |
468 | Assigning to $#days changes the length of the array. |
469 | Shortening an array by this method does not actually destroy any values. |
470 | Lengthening an array that was previously shortened recovers the values that |
471 | were in those elements. |
472 | You can also gain some measure of efficiency by preextending an array that |
473 | is going to get big. |
474 | (You can also extend an array by assigning to an element that is off the |
475 | end of the array. |
476 | This differs from assigning to $#whatever in that intervening values |
477 | are set to null rather than recovered.) |
478 | You can truncate an array down to nothing by assigning the null list () to |
479 | it. |
480 | The following are exactly equivalent: |
481 | .nf |
482 | |
483 | @whatever = (); |
484 | $#whatever = $[ \- 1; |
485 | |
486 | .fi |
8d063cd8 |
487 | .PP |
ac58e20f |
488 | If you evaluate an array in a scalar context, it returns the length of |
489 | the array. |
490 | The following is always true: |
491 | .nf |
492 | |
493 | @whatever == $#whatever \- $[ + 1; |
494 | |
495 | .fi |
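As a small sketch of the difference between the array length and the last subscript, assuming the array base $[ is left at its default of 0 (the array @days and its contents are illustrative only):

```perl
# Sketch only: @days and its contents are illustrative.
@days  = ('Mon', 'Tue', 'Wed');
$count = @days;      # scalar context: number of elements
$last  = $#days;     # subscript of the last element
print "count=$count last=$last\n";   # prints "count=3 last=2"
```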
496 | .PP |
a687059c |
497 | Multi-dimensional arrays are not directly supported, but see the discussion |
498 | of the $; variable later for a means of emulating multiple subscripts with |
499 | an associative array. |
ac58e20f |
500 | You could also write a subroutine to turn multiple subscripts into a single |
501 | subscript. |
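One possible shape for such a subroutine is sketched below. It is an assumption-laden illustration rather than anything from the original manual: the names index2d, $ncols and @grid are hypothetical, and a fixed row width is assumed.

```perl
# Sketch only: index2d, $ncols and @grid are hypothetical names.
$ncols = 4;                       # assumed fixed number of columns per row

sub index2d {
    local($row, $col) = @_;
    $row * $ncols + $col;         # the value of the last expression is returned
}

$grid[&index2d(2, 3)] = 'x';      # stands in for "row 2, column 3"
print "stored at subscript ", &index2d(2, 3), "\n";
```

Flattening rows into one ordinary array keeps numeric indexing; the $; approach mentioned above uses an associative array instead.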
a687059c |
502 | .PP |
8d063cd8 |
503 | Every data type has its own namespace. |
378cc40b |
504 | You can, without fear of conflict, use the same name for a scalar variable, |
8d063cd8 |
505 | an array, an associative array, a filehandle, a subroutine name, and/or |
506 | a label. |
a687059c |
507 | Since variable and array references always start with \*(L'$\*(R', \*(L'@\*(R', |
508 | or \*(L'%\*(R', the \*(L"reserved\*(R" words aren't in fact reserved |
8d063cd8 |
509 | with respect to variable names. |
510 | (They ARE reserved with respect to labels and filehandles, however, which |
378cc40b |
511 | don't have an initial special character. |
a687059c |
512 | Hint: you could say open(LOG,\'logfile\') rather than open(log,\'logfile\'). |
513 | Using uppercase filehandles also improves readability and protects you |
514 | from conflict with future reserved words.) |
8d063cd8 |
515 | Case IS significant\*(--\*(L"FOO\*(R", \*(L"Foo\*(R" and \*(L"foo\*(R" are all |
516 | different names. |
517 | Names which start with a letter may also contain digits and underscores. |
518 | Names which do not start with a letter are limited to one character, |
519 | e.g. \*(L"$%\*(R" or \*(L"$$\*(R". |
a687059c |
520 | (Most of the one character names have a predefined significance to |
521 | .IR perl . |
8d063cd8 |
522 | More later.) |
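The separate namespaces can be seen in a short sketch like the following, in which the name "foo" is purely illustrative and is used at once for a scalar, an array, an associative array and a subroutine:

```perl
# Sketch only: the name "foo" is illustrative.
$foo = 'scalar';
@foo = ('array', 'values');
%foo = ('key', 'assoc');
sub foo { 'subroutine'; }

# Each reference picks its own namespace from the leading character.
print "$foo $foo[0] $foo{'key'} ", &foo(), "\n";
```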
523 | .PP |
a687059c |
524 | Numeric literals are specified in any of the usual floating point or |
525 | integer formats: |
526 | .nf |
527 | |
528 | .ne 5 |
529 | 12345 |
530 | 12345.67 |
531 | .23E-10 |
532 | 0xffff # hex |
533 | 0377 # octal |
534 | |
535 | .fi |
8d063cd8 |
536 | String literals are delimited by either single or double quotes. |
537 | They work much like shell quotes: |
538 | double-quoted string literals are subject to backslash and variable |
a687059c |
539 | substitution; single-quoted strings are not (except for \e\' and \e\e). |
8d063cd8 |
540 | The usual backslash rules apply for making characters such as newline, tab, etc. |
541 | You can also embed newlines directly in your strings, i.e. they can end on |
542 | a different line than they begin. |
543 | This is nice, but if you forget your trailing quote, the error will not be |
a687059c |
544 | reported until |
545 | .I perl |
546 | finds another line containing the quote character, which |
8d063cd8 |
547 | may be much further on in the script. |
a687059c |
548 | Variable substitution inside strings is limited to scalar variables, normal |
549 | array values, and array slices. |
550 | (In other words, identifiers beginning with $ or @, followed by an optional |
551 | bracketed expression as a subscript.) |
8d063cd8 |
552 | The following code segment prints out \*(L"The price is $100.\*(R" |
553 | .nf |
554 | |
555 | .ne 2 |
a687059c |
556 | $Price = \'$100\';\h'|3.5i'# not interpreted |
8d063cd8 |
557 | print "The price is $Price.\e\|n";\h'|3.5i'# interpreted |
558 | |
559 | .fi |
83b4785a |
560 | Note that you can put curly brackets around the identifier to delimit it |
561 | from following alphanumerics. |
ae986130 |
562 | Also note that a single quoted string must be separated from a preceding |
563 | word by a space, since single quote is a valid character in an identifier |
564 | (see Packages). |
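A minimal sketch of the curly-bracket form, using a made-up variable $day (the variable $dayth is deliberately left unset to show what happens without the brackets):

```perl
# Sketch only: $day is a made-up variable.
$day    = 29;
$plain  = "the $dayth day";     # looks up $dayth, which is unset
$braced = "the ${day}th day";   # braces end the identifier after "day"
print "$braced\n";
```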
8d063cd8 |
565 | .PP |
a687059c |
566 | Array values are interpolated into double-quoted strings by joining all the |
567 | elements of the array with the delimiter specified in the $" variable, |
568 | space by default. |
569 | (Since in versions of perl prior to 3.0 the @ character was not a metacharacter |
570 | in double-quoted strings, the interpolation of @array, $array[EXPR], |
571 | @array[LIST], $array{EXPR}, or @array{LIST} only happens if array is |
572 | referenced elsewhere in the program or is predefined.) |
573 | The following are equivalent: |
574 | .nf |
575 | |
576 | .ne 4 |
577 | $temp = join($",@ARGV); |
578 | system "echo $temp"; |
579 | |
580 | system "echo @ARGV"; |
581 | |
582 | .fi |
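The effect of the $" variable can also be seen without running a command. The following is a sketch only; @parts is an illustrative array:

```perl
# Sketch only: @parts is illustrative.
@parts  = ('a', 'b', 'c');
$spaced = "@parts";             # joined with the default $" (a space)
$" = ',';
$commas = "@parts";             # joined with the new delimiter
print "$spaced\n$commas\n";
```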
ae986130 |
583 | Within search patterns (which also undergo double-quotish substitution) |
a687059c |
584 | there is a bad ambiguity: Is /$foo[bar]/ to be |
585 | interpreted as /${foo}[bar]/ (where [bar] is a character class for the |
586 | regular expression) or as /${foo[bar]}/ (where [bar] is the subscript to |
587 | array @foo)? |
588 | If @foo doesn't otherwise exist, then it's obviously a character class. |
589 | If @foo exists, perl takes a good guess about [bar], and is almost always right. |
590 | If it does guess wrong, or if you're just plain paranoid, |
591 | you can force the correct interpretation with curly brackets as above. |
592 | .PP |
593 | A line-oriented form of quoting is based on the shell here-is syntax. |
594 | Following a << you specify a string to terminate the quoted material, and all lines |
595 | following the current line down to the terminating string are the value |
596 | of the item. |
597 | The terminating string may be either an identifier (a word), or some |
598 | quoted text. |
599 | If quoted, the type of quotes you use determines the treatment of the text, |
600 | just as in regular quoting. |
601 | An unquoted identifier works like double quotes. |
602 | There must be no space between the << and the identifier. |
603 | (If you put a space it will be treated as a null identifier, which is |
604 | valid, and matches the first blank line\*(--see Merry Christmas example below.) |
605 | The terminating string must appear by itself (unquoted and with no surrounding |
606 | whitespace) on the terminating line. |
607 | .nf |
608 | |
609 | print <<EOF; # same as above |
610 | The price is $Price. |
611 | EOF |
612 | |
613 | print <<"EOF"; # same as above |
614 | The price is $Price. |
615 | EOF |
616 | |
617 | print << x 10; # null identifier is delimiter |
618 | Merry Christmas! |
619 | |
620 | print <<`EOC`; # execute commands |
621 | echo hi there |
622 | echo lo there |
623 | EOC |
624 | |
625 | print <<foo, <<bar; # you can stack them |
626 | I said foo. |
627 | foo |
628 | I said bar. |
629 | bar |
630 | |
631 | .fi |
8d063cd8 |
632 | Array literals are denoted by separating individual values by commas, and |
633 | enclosing the list in parentheses. |
634 | In a context not requiring an array value, the value of the array literal |
635 | is the value of the final element, as in the C comma operator. |
636 | For example, |
637 | .nf |
638 | |
83b4785a |
639 | .ne 4 |
a687059c |
640 | @foo = (\'cc\', \'\-E\', $bar); |
8d063cd8 |
641 | |
642 | assigns the entire array value to array foo, but |
643 | |
a687059c |
644 | $foo = (\'cc\', \'\-E\', $bar); |
8d063cd8 |
645 | |
646 | .fi |
647 | assigns the value of variable bar to variable foo. |
648 | Array lists may be assigned to if and only if each element of the list |
649 | is an lvalue: |
650 | .nf |
651 | |
652 | ($a, $b, $c) = (1, 2, 3); |
653 | |
a687059c |
654 | ($map{\'red\'}, $map{\'blue\'}, $map{\'green\'}) = (0x00f, 0x0f0, 0xf00); |
655 | |
656 | The final element may be an array or an associative array: |
657 | |
658 | ($a, $b, @rest) = split; |
659 | local($a, $b, %rest) = @_; |
8d063cd8 |
660 | |
661 | .fi |
a687059c |
662 | You can actually put an array anywhere in the list, but the first array |
663 | in the list will soak up all the values, and anything after it will get |
664 | a null value. |
665 | This may be useful in a local(). |
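A short sketch of the soaking-up behavior, with illustrative variable names; note that $extra, which follows the array in the list, receives nothing:

```perl
# Sketch only: variable names are illustrative.
($first, @rest, $extra) = (1, 2, 3);
print "first=$first rest=", join(' ', @rest), "\n";
print "\$extra got no value\n" unless defined($extra);
```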
8d063cd8 |
666 | .PP |
a687059c |
667 | An associative array literal contains pairs of values to be interpreted |
668 | as a key and a value: |
669 | .nf |
670 | |
671 | .ne 2 |
672 | # same as map assignment above |
673 | %map = ('red',0x00f,'blue',0x0f0,'green',0xf00); |
674 | |
675 | .fi |
676 | Array assignment in a scalar context returns the number of elements |
677 | produced by the expression on the right side of the assignment: |
678 | .nf |
679 | |
680 | $x = (($foo,$bar) = (3,2,1)); # set $x to 3, not 2 |
681 | |
682 | .fi |
8d063cd8 |
683 | .PP |
684 | There are several other pseudo-literals that you should know about. |
378cc40b |
685 | If a string is enclosed by backticks (grave accents), it first undergoes |
686 | variable substitution just like a double quoted string. |
687 | It is then interpreted as a command, and the output of that command |
688 | is the value of the pseudo-literal, as in a shell. |
8d063cd8 |
689 | The command is executed each time the pseudo-literal is evaluated. |
378cc40b |
690 | The status value of the command is returned in $? (see Predefined Names |
691 | for the interpretation of $?). |
692 | Unlike in \f2csh\f1, no translation is done on the return |
8d063cd8 |
693 | data\*(--newlines remain newlines. |
378cc40b |
694 | Unlike in any of the shells, single quotes do not hide variable names |
695 | in the command from interpretation. |
696 | To pass a $ through to the shell you need to hide it with a backslash. |
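The backtick rules can be sketched as below. This assumes a Unix-like system where the shell provides an echo command; $word and $out are made-up names.

```perl
# Sketch only: assumes a shell with "echo"; $word and $out are made up.
$word = 'hello';
$out  = `echo $word`;           # $word is substituted before the shell runs
chop($out);                     # the trailing newline is kept, so remove it
print "got '$out', status $?\n";
```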
8d063cd8 |
697 | .PP |
698 | Evaluating a filehandle in angle brackets yields the next line |
a687059c |
699 | from that file (newline included, so it's never false until EOF, at |
700 | which time an undefined value is returned). |
8d063cd8 |
701 | Ordinarily you must assign that value to a variable, |
ac58e20f |
702 | but there is one situation where an automatic assignment happens. |
8d063cd8 |
703 | If (and only if) the input symbol is the only thing inside the conditional of a |
704 | .I while |
705 | loop, the value is |
706 | automatically assigned to the variable \*(L"$_\*(R". |
707 | (This may seem like an odd thing to you, but you'll use the construct |
708 | in almost every |
709 | .I perl |
710 | script you write.) |
711 | Anyway, the following lines are equivalent to each other: |
712 | .nf |
713 | |
a687059c |
714 | .ne 5 |
715 | while ($_ = <STDIN>) { print; } |
716 | while (<STDIN>) { print; } |
717 | for (\|;\|<STDIN>;\|) { print; } |
718 | print while $_ = <STDIN>; |
719 | print while <STDIN>; |
8d063cd8 |
720 | |
721 | .fi |
722 | The filehandles |
a687059c |
723 | .IR STDIN , |
724 | .I STDOUT |
725 | and |
726 | .I STDERR |
727 | are predefined. |
728 | (The filehandles |
8d063cd8 |
729 | .IR stdin , |
730 | .I stdout |
731 | and |
732 | .I stderr |
a687059c |
733 | will also work except in packages, where they would be interpreted as |
734 | local identifiers rather than global.) |
8d063cd8 |
735 | Additional filehandles may be created with the |
736 | .I open |
737 | function. |
738 | .PP |
378cc40b |
739 | If a <FILEHANDLE> is used in a context that is looking for an array, an array |
740 | consisting of all the input lines is returned, one line per array element. |
741 | It's easy to make a LARGE data space this way, so use with care. |
742 | .PP |
8d063cd8 |
743 | The null filehandle <> is special and can be used to emulate the behavior of |
744 | \fIsed\fR and \fIawk\fR. |
745 | Input from <> comes either from standard input, or from each file listed on |
746 | the command line. |
747 | Here's how it works: the first time <> is evaluated, the ARGV array is checked, |
a687059c |
748 | and if it is null, $ARGV[0] is set to \'-\', which when opened gives you standard |
8d063cd8 |
749 | input. |
750 | The ARGV array is then processed as a list of filenames. |
751 | The loop |
752 | .nf |
753 | |
754 | .ne 3 |
755 | while (<>) { |
756 | .\|.\|. # code for each line |
757 | } |
758 | |
759 | .ne 10 |
760 | is equivalent to |
761 | |
a687059c |
762 | unshift(@ARGV, \'\-\') \|if \|$#ARGV < $[; |
8d063cd8 |
763 | while ($ARGV = shift) { |
764 | open(ARGV, $ARGV); |
765 | while (<ARGV>) { |
766 | .\|.\|. # code for each line |
767 | } |
768 | } |
769 | |
770 | .fi |
771 | except that it isn't as cumbersome to say. |
772 | It really does shift array ARGV and put the current filename into |
773 | variable ARGV. |
774 | It also uses filehandle ARGV internally. |
775 | You can modify @ARGV before the first <> as long as you leave the first |
776 | filename at the beginning of the array. |
83b4785a |
777 | Line numbers ($.) continue as if the input was one big happy file. |
378cc40b |
778 | (But see example under eof for how to reset line numbers on each file.) |
8d063cd8 |
779 | .PP |
83b4785a |
780 | .ne 5 |
378cc40b |
781 | If you want to set @ARGV to your own list of files, go right ahead. |
8d063cd8 |
782 | If you want to pass switches into your script, you can |
783 | put a loop on the front like this: |
784 | .nf |
785 | |
786 | .ne 10 |
787 | while ($_ = $ARGV[0], /\|^\-/\|) { |
788 | shift; |
789 | last if /\|^\-\|\-$\|/\|; |
790 | /\|^\-D\|(.*\|)/ \|&& \|($debug = $1); |
791 | /\|^\-v\|/ \|&& \|$verbose++; |
792 | .\|.\|. # other switches |
793 | } |
794 | while (<>) { |
795 | .\|.\|. # code for each line |
796 | } |
797 | |
798 | .fi |
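The switch loop above can be exercised without a real command line by filling @ARGV by hand. This is a sketch under stated assumptions: the argument list is made up, and the loop guards against an empty @ARGV and names the array in the shift explicitly, which the original loop does not do.

```perl
use strict;
use warnings;

my ($debug, $verbose) = ('', 0);

# Stand-in for a real command line (hypothetical arguments).
@ARGV = ('-v', '-Dtrace', '-v', '--', '-notaswitch', 'input.txt');

while (@ARGV && $ARGV[0] =~ /^-/) {
    $_ = shift @ARGV;
    last if /^--$/;                 # explicit end of switches
    /^-D(.*)/ && ($debug = $1);     # -Dtrace sets $debug to "trace"
    /^-v/     && $verbose++;        # each -v raises the verbosity count
}

print "debug=$debug verbose=$verbose rest=@ARGV\n";
```

Whatever is left in @ARGV after the loop is what a following while (<>) would treat as its list of filenames.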
799 | The <> symbol will return FALSE only once. |
800 | If you call it again after this it will assume you are processing another |
a687059c |
801 | @ARGV list, and if you haven't set @ARGV, will input from |
802 | .IR STDIN . |
378cc40b |
803 | .PP |
804 | If the string inside the angle brackets is a reference to a scalar variable |
805 | (e.g. <$foo>), |
806 | then that variable contains the name of the filehandle to input from. |
807 | .PP |
808 | If the string inside angle brackets is not a filehandle, it is interpreted |
809 | as a filename pattern to be globbed, and either an array of filenames or the |
810 | next filename in the list is returned, depending on context. |
811 | One level of $ interpretation is done first, but you can't say <$foo> |
812 | because that's an indirect filehandle as explained in the previous |
813 | paragraph. |
814 | You could insert curly brackets to force interpretation as a |
815 | filename glob: <${foo}>. |
816 | Example: |
817 | .nf |
818 | |
819 | .ne 3 |
820 | while (<*.c>) { |
a687059c |
821 | chmod 0644, $_; |
378cc40b |
822 | } |
823 | |
824 | is equivalent to |
825 | |
826 | .ne 5 |
a687059c |
827 | open(foo, "echo *.c | tr \-s \' \et\er\ef\' \'\e\e012\e\e012\e\e012\e\e012\'|"); |
378cc40b |
828 | while (<foo>) { |
829 | chop; |
a687059c |
830 | chmod 0644, $_; |
378cc40b |
831 | } |
832 | |
833 | .fi |
834 | In fact, it's currently implemented that way. |
a687059c |
835 | (Which means it will not work on filenames with spaces in them unless |
836 | you have /bin/csh on your machine.) |
378cc40b |
837 | Of course, the shortest way to do the above is: |
838 | .nf |
839 | |
a687059c |
840 | chmod 0644, <*.c>; |
378cc40b |
841 | |
842 | .fi |
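The two forms of globbing described above, a literal pattern and a pattern held in a variable, can be compared in a scratch directory. This sketch assumes a modern perl interpreter with the File::Temp and Cwd modules; the file names are invented for the illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Cwd qw(getcwd);

# Build a throwaway directory holding three empty files (illustrative names).
my $dir = tempdir(CLEANUP => 1);
my $old = getcwd();
chdir($dir) || die "Can't chdir to $dir: $!";
for my $name ('a.c', 'b.c', 'notes.txt') {
    open(my $fh, '>', $name) || die "Can't create $name: $!";
    close($fh);
}

my @sources = <*.c>;          # array context: all matching names at once
my $pattern = '*.txt';
my @texts   = <${pattern}>;   # braces force a glob, not an indirect filehandle

chdir($old) || die "Can't chdir back to $old: $!";

print "C sources: @sources\n";
print "text files: @texts\n";
```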
8d063cd8 |
843 | .Sh "Syntax" |
844 | .PP |
845 | A |
846 | .I perl |
847 | script consists of a sequence of declarations and commands. |
848 | The only things that need to be declared in |
849 | .I perl |
850 | are report formats and subroutines. |
851 | See the sections below for more information on those declarations. |
ffed7fef |
852 | All uninitialized user-created objects are assumed to |
a687059c |
853 | start with a null or 0 value until they |
854 | are defined by some explicit operation such as assignment. |
8d063cd8 |
855 | The sequence of commands is executed just once, unlike in |
856 | .I sed |
857 | and |
858 | .I awk |
859 | scripts, where the sequence of commands is executed for each input line. |
860 | While this means that you must explicitly loop over the lines of your input file |
861 | (or files), it also means you have much more control over which files and which |
862 | lines you look at. |
863 | (Actually, I'm lying\*(--it is possible to do an implicit loop with either the |
864 | .B \-n |
865 | or |
866 | .B \-p |
867 | switch.) |
868 | .PP |
869 | A declaration can be put anywhere a command can, but has no effect on the |
a687059c |
870 | execution of the primary sequence of commands\*(--declarations all take effect |
871 | at compile time. |
8d063cd8 |
872 | Typically all the declarations are put at the beginning or the end of the script. |
873 | .PP |
874 | .I Perl |
875 | is, for the most part, a free-form language. |
876 | (The only exception to this is format declarations, for fairly obvious reasons.) |
877 | Comments are indicated by the # character, and extend to the end of the line. |
878 | If you attempt to use /* */ C comments, they will be interpreted either as |
879 | division or pattern matching, depending on the context. |
880 | So don't do that. |
881 | .Sh "Compound statements" |
882 | In |
883 | .IR perl , |
884 | a sequence of commands may be treated as one command by enclosing it |
885 | in curly brackets. |
886 | We will call this a BLOCK. |
887 | .PP |
888 | The following compound commands may be used to control flow: |
889 | .nf |
890 | |
891 | .ne 4 |
892 | if (EXPR) BLOCK |
893 | if (EXPR) BLOCK else BLOCK |
378cc40b |
894 | if (EXPR) BLOCK elsif (EXPR) BLOCK .\|.\|. else BLOCK |
8d063cd8 |
895 | LABEL while (EXPR) BLOCK |
896 | LABEL while (EXPR) BLOCK continue BLOCK |
897 | LABEL for (EXPR; EXPR; EXPR) BLOCK |
378cc40b |
898 | LABEL foreach VAR (ARRAY) BLOCK |
8d063cd8 |
899 | LABEL BLOCK continue BLOCK |
900 | |
901 | .fi |
83b4785a |
902 | Note that, unlike C and Pascal, these are defined in terms of BLOCKs, not |
8d063cd8 |
903 | statements. |
904 | This means that the curly brackets are \fIrequired\fR\*(--no dangling statements allowed. |
905 | If you want to write conditionals without curly brackets there are several |
906 | other ways to do it. |
907 | The following all do the same thing: |
908 | .nf |
909 | |
910 | .ne 5 |
a687059c |
911 | if (!open(foo)) { die "Can't open $foo: $!"; } |
912 | die "Can't open $foo: $!" unless open(foo); |
913 | open(foo) || die "Can't open $foo: $!"; # foo or bust! |
ac58e20f |
914 | open(foo) ? \'hi mom\' : die "Can't open $foo: $!"; |
a687059c |
915 | # a bit exotic, that last one |
8d063cd8 |
916 | |
917 | .fi |
8d063cd8 |
918 | .PP |
919 | The |
920 | .I if |
921 | statement is straightforward. |
922 | Since BLOCKs are always bounded by curly brackets, there is never any |
923 | ambiguity about which |
924 | .I if |
925 | an |
926 | .I else |
927 | goes with. |
928 | If you use |
929 | .I unless |
930 | in place of |
931 | .IR if , |
932 | the sense of the test is reversed. |
933 | .PP |
934 | The |
935 | .I while |
936 | statement executes the block as long as the expression is true |
937 | (does not evaluate to the null string or 0). |
938 | The LABEL is optional, and if present, consists of an identifier followed by |
939 | a colon. |
940 | The LABEL identifies the loop for the loop control statements |
941 | .IR next , |
a687059c |
942 | .IR last , |
8d063cd8 |
943 | and |
944 | .I redo |
945 | (see below). |
946 | If there is a |
947 | .I continue |
948 | BLOCK, it is always executed just before |
949 | the conditional is about to be evaluated again, similarly to the third part |
950 | of a |
951 | .I for |
952 | loop in C. |
953 | Thus it can be used to increment a loop variable, even when the loop has |
954 | been continued via the |
955 | .I next |
956 | statement (similar to the C \*(L"continue\*(R" statement). |
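A short sketch of the point just made: the continue BLOCK runs even when the body is cut short by next. The variable names are illustrative.

```perl
use strict;
use warnings;

my @odd;
my $i = 0;
while ($i < 10) {
    next if $i % 2 == 0;    # skip even values; the continue block still runs
    push(@odd, $i);
} continue {
    $i++;                   # executed after every iteration, even after next
}

print "@odd\n";             # prints 1 3 5 7 9
```

Had the increment been placed at the bottom of the main block instead, the first next would skip it and the loop would never terminate.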
957 | .PP |
958 | If the word |
959 | .I while |
960 | is replaced by the word |
961 | .IR until , |
962 | the sense of the test is reversed, but the conditional is still tested before |
963 | the first iteration. |
964 | .PP |
965 | In either the |
966 | .I if |
967 | or the |
968 | .I while |
969 | statement, you may replace \*(L"(EXPR)\*(R" with a BLOCK, and the conditional |
970 | is true if the value of the last command in that block is true. |
971 | .PP |
972 | The |
973 | .I for |
974 | loop works exactly like the corresponding |
975 | .I while |
976 | loop: |
977 | .nf |
978 | |
979 | .ne 12 |
980 | for ($i = 1; $i < 10; $i++) { |
981 | .\|.\|. |
982 | } |
983 | |
984 | is the same as |
985 | |
986 | $i = 1; |
987 | while ($i < 10) { |
988 | .\|.\|. |
989 | } continue { |
990 | $i++; |
991 | } |
992 | .fi |
993 | .PP |
378cc40b |
994 | The foreach loop iterates over a normal array value and sets the variable |
995 | VAR to be each element of the array in turn. |
13281fa4 |
996 | The \*(L"foreach\*(R" keyword is actually identical to the \*(L"for\*(R" keyword, |
997 | so you can use \*(L"foreach\*(R" for readability or \*(L"for\*(R" for brevity. |
378cc40b |
998 | If VAR is omitted, $_ is set to each value. |
999 | If ARRAY is an actual array (as opposed to an expression returning an array |
1000 | value), you can modify each element of the array |
1001 | by modifying VAR inside the loop. |
1002 | Examples: |
1003 | .nf |
1004 | |
1005 | .ne 5 |
1006 | for (@ary) { s/foo/bar/; } |
1007 | |
1008 | foreach $elem (@elements) { |
1009 | $elem *= 2; |
1010 | } |
1011 | |
a687059c |
1012 | .ne 3 |
1013 | for ((10,9,8,7,6,5,4,3,2,1,\'BOOM\')) { |
1014 | print $_, "\en"; sleep(1); |
378cc40b |
1015 | } |
1016 | |
a687059c |
1017 | for (1..15) { print "Merry Christmas\en"; } |
1018 | |
378cc40b |
1019 | .ne 3 |
a687059c |
1020 | foreach $item (split(/:[\e\e\en:]*/, $ENV{\'TERMCAP\'})) { |
378cc40b |
1021 | print "Item: $item\en"; |
1022 | } |
a687059c |
1023 | |
378cc40b |
1024 | .fi |
1025 | .PP |
8d063cd8 |
1026 | The BLOCK by itself (labeled or not) is equivalent to a loop that executes |
1027 | once. |
1028 | Thus you can use any of the loop control statements in it to leave or |
1029 | restart the block. |
1030 | The |
1031 | .I continue |
1032 | block is optional. |
1033 | This construct is particularly nice for doing case structures. |
1034 | .nf |
1035 | |
1036 | .ne 6 |
1037 | foo: { |
a687059c |
1038 | if (/^abc/) { $abc = 1; last foo; } |
1039 | if (/^def/) { $def = 1; last foo; } |
1040 | if (/^xyz/) { $xyz = 1; last foo; } |
8d063cd8 |
1041 | $nothing = 1; |
1042 | } |
1043 | |
1044 | .fi |
a687059c |
1045 | There is no official switch statement in perl, because there |
1046 | are already several ways to write the equivalent. |
1047 | In addition to the above, you could write |
378cc40b |
1048 | .nf |
1049 | |
a687059c |
1050 | .ne 6 |
1051 | foo: { |
ffed7fef |
1052 | $abc = 1, last foo if /^abc/; |
1053 | $def = 1, last foo if /^def/; |
1054 | $xyz = 1, last foo if /^xyz/; |
a687059c |
1055 | $nothing = 1; |
1056 | } |
1057 | |
1058 | or |
1059 | |
1060 | .ne 6 |
1061 | foo: { |
1062 | /^abc/ && do { $abc = 1; last foo; }; |
1063 | /^def/ && do { $def = 1; last foo; }; |
1064 | /^xyz/ && do { $xyz = 1; last foo; }; |
1065 | $nothing = 1; |
1066 | } |
1067 | |
1068 | or |
1069 | |
1070 | .ne 6 |
1071 | foo: { |
1072 | /^abc/ && ($abc = 1, last foo); |
1073 | /^def/ && ($def = 1, last foo); |
1074 | /^xyz/ && ($xyz = 1, last foo); |
1075 | $nothing = 1; |
1076 | } |
1077 | |
1078 | or even |
1079 | |
378cc40b |
1080 | .ne 8 |
a687059c |
1081 | if (/^abc/) |
1082 | { $abc = 1; } |
1083 | elsif (/^def/) |
1084 | { $def = 1; } |
1085 | elsif (/^xyz/) |
1086 | { $xyz = 1; } |
1087 | else |
1088 | {$nothing = 1;} |
378cc40b |
1089 | |
1090 | .fi |
a687059c |
1091 | As it happens, these are all optimized internally to a switch structure, |
1092 | so perl jumps directly to the desired statement, and you needn't worry |
1093 | about perl executing a lot of unnecessary statements when you have a string |
1094 | of 50 elsifs, as long as you are testing the same simple scalar variable |
1095 | using ==, eq, or pattern matching as above. |
1096 | (If you're curious as to whether the optimizer has done this for a particular |
1097 | case statement, you can use the \-D1024 switch to list the syntax tree |
1098 | before execution.) |
8d063cd8 |
1099 | .Sh "Simple statements" |
1100 | The only kind of simple statement is an expression evaluated for its side |
1101 | effects. |
1102 | Every expression (simple statement) must be terminated with a semicolon. |
1103 | Note that this is like C, but unlike Pascal (and |
1104 | .IR awk ). |
1105 | .PP |
1106 | Any simple statement may optionally be followed by a |
1107 | single modifier, just before the terminating semicolon. |
1108 | The possible modifiers are: |
1109 | .nf |
1110 | |
1111 | .ne 4 |
1112 | if EXPR |
1113 | unless EXPR |
1114 | while EXPR |
1115 | until EXPR |
1116 | |
1117 | .fi |
1118 | The |
1119 | .I if |
1120 | and |
1121 | .I unless |
1122 | modifiers have the expected semantics. |
1123 | The |
1124 | .I while |
1125 | and |
378cc40b |
1126 | .I until |
8d063cd8 |
1127 | modifiers also have the expected semantics (conditional evaluated first), |
1128 | except when applied to a do-BLOCK command, |
1129 | in which case the block executes once before the conditional is evaluated. |
1130 | This is so that you can write loops like: |
1131 | .nf |
1132 | |
1133 | .ne 4 |
1134 | do { |
a687059c |
1135 | $_ = <STDIN>; |
8d063cd8 |
1136 | .\|.\|. |
1137 | } until $_ \|eq \|".\|\e\|n"; |
1138 | |
1139 | .fi |
1140 | (See the |
1141 | .I do |
1142 | operator below. Note also that the loop control commands described later will |
83b4785a |
1143 | NOT work in this construct, since modifiers don't take loop labels. |
8d063cd8 |
1144 | Sorry.) |
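The difference between a modifier on a do-BLOCK and a modifier on an ordinary statement can be seen in a small sketch. The @queue array stands in for lines read from STDIN and is purely illustrative.

```perl
use strict;
use warnings;

# Stand-in for input lines; a line holding only a dot ends the input.
my @queue = ("first\n", "second\n", ".\n", "never read\n");
my @seen;
my $line;

# The block runs once before the condition is first tested.
do {
    $line = shift @queue;
    push(@seen, $line) unless $line eq ".\n";
} until $line eq ".\n";

# An ordinary statement tests its modifier first, so this never runs.
my $count = 0;
$count++ until 1;

print scalar(@seen), " lines before the dot, count=$count\n";
```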
1145 | .Sh "Expressions" |
1146 | Since |
1147 | .I perl |
1148 | expressions work almost exactly like C expressions, only the differences |
1149 | will be mentioned here. |
1150 | .PP |
1151 | Here's what |
1152 | .I perl |
1153 | has that C doesn't: |
a687059c |
1154 | .Ip ** 8 2 |
1155 | The exponentiation operator. |
1156 | .Ip **= 8 |
1157 | The exponentiation assignment operator. |
8d063cd8 |
1158 | .Ip (\|) 8 3 |
1159 | The null list, used to initialize an array to null. |
1160 | .Ip . 8 |
1161 | Concatenation of two strings. |
1162 | .Ip .= 8 |
a687059c |
1163 | The concatenation assignment operator. |
8d063cd8 |
1164 | .Ip eq 8 |
1165 | String equality (== is numeric equality). |
1166 | For a mnemonic just think of \*(L"eq\*(R" as a string. |
1167 | (If you are used to the |
1168 | .I awk |
1169 | behavior of using == for either string or numeric equality |
1170 | based on the current form of the comparands, beware! |
1171 | You must be explicit here.) |
1172 | .Ip ne 8 |
1173 | String inequality (!= is numeric inequality). |
1174 | .Ip lt 8 |
1175 | String less than. |
1176 | .Ip gt 8 |
1177 | String greater than. |
1178 | .Ip le 8 |
1179 | String less than or equal. |
1180 | .Ip ge 8 |
1181 | String greater than or equal. |
1182 | .Ip =~ 8 2 |
1183 | Certain operations search or modify the string \*(L"$_\*(R" by default. |
1184 | This operator makes that kind of operation work on some other string. |
1185 | The right argument is a search pattern, substitution, or translation. |
1186 | The left argument is what is supposed to be searched, substituted, or |
1187 | translated instead of the default \*(L"$_\*(R". |
1188 | The return value indicates the success of the operation. |
1189 | (If the right argument is an expression other than a search pattern, |
1190 | substitution, or translation, it is interpreted as a search pattern |
1191 | at run time. |
1192 | This is less efficient than an explicit search, since the pattern must |
1193 | be compiled every time the expression is evaluated.) |
1194 | The precedence of this operator is lower than unary minus and autoincrement/decrement, but higher than everything else. |
1195 | .Ip !~ 8 |
1196 | Just like =~ except the return value is negated. |
1197 | .Ip x 8 |
1198 | The repetition operator. |
1199 | Returns a string consisting of the left operand repeated the |
1200 | number of times specified by the right operand. |
1201 | .nf |
1202 | |
a687059c |
1203 | print \'\-\' x 80; # print row of dashes |
1204 | print \'\-\' x80; # illegal, x80 is identifier |
8d063cd8 |
1205 | |
a687059c |
1206 | print "\et" x ($tab/8), \' \' x ($tab%8); # tab over |
8d063cd8 |
1207 | |
1208 | .fi |
1209 | .Ip x= 8 |
a687059c |
1210 | The repetition assignment operator. |
1211 | .Ip .\|. 8 |
1212 | The range operator, which is really two different operators depending |
1213 | on the context. |
1214 | In an array context, returns an array of values counting (by ones) |
1215 | from the left value to the right value. |
1216 | This is useful for writing \*(L"for (1..10)\*(R" loops and for doing |
1217 | slice operations on arrays. |
1218 | .Sp |
1219 | In a scalar context, .\|. returns a boolean value. |
1220 | The operator is bistable, like a flip-flop. |
1221 | Each .\|. operator maintains its own boolean state. |
378cc40b |
1222 | It is false as long as its left operand is false. |
1223 | Once the left operand is true, the range operator stays true |
1224 | until the right operand is true, |
1225 | AFTER which the range operator becomes false again. |
a687059c |
1226 | (It doesn't become false till the next time the range operator is evaluated. |
8d063cd8 |
1227 | It can become false on the same evaluation it became true, but it still returns |
1228 | true once.) |
13281fa4 |
1229 | The right operand is not evaluated while the operator is in the \*(L"false\*(R" state, |
1230 | and the left operand is not evaluated while the operator is in the \*(L"true\*(R" state. |
a687059c |
1231 | The scalar .\|. operator is primarily intended for doing line number ranges |
1232 | after |
8d063cd8 |
1233 | the fashion of \fIsed\fR or \fIawk\fR. |
1234 | The precedence is a little lower than || and &&. |
1235 | The value returned is either the null string for false, or a sequence number |
1236 | (beginning with 1) for true. |
1237 | The sequence number is reset for each range encountered. |
a687059c |
1238 | The final sequence number in a range has the string \'E0\' appended to it, which |
8d063cd8 |
1239 | doesn't affect its numeric value, but gives you something to search for if you |
1240 | want to exclude the endpoint. |
1241 | You can exclude the beginning point by waiting for the sequence number to be |
1242 | greater than 1. |
a687059c |
1243 | If either operand of scalar .\|. is static, that operand is implicitly compared |
1244 | to the $. variable, the current line number. |
8d063cd8 |
1245 | Examples: |
1246 | .nf |
1247 | |
a687059c |
1248 | .ne 6 |
1249 | As a scalar operator: |
1250 | if (101 .\|. 200) { print; } # print 2nd hundred lines |
8d063cd8 |
1251 | |
a687059c |
1252 | next line if (1 .\|. /^$/); # skip header lines |
8d063cd8 |
1253 | |
a687059c |
1254 | s/^/> / if (/^$/ .\|. eof()); # quote body |
1255 | |
1256 | .ne 4 |
1257 | As an array operator: |
1258 | for (101 .\|. 200) { print; } # print $_ 100 times |
1259 | |
1260 | @foo = @foo[$[ .\|. $#foo]; # an expensive no-op |
1261 | @foo = @foo[$#foo-4 .\|. $#foo]; # slice last 5 items |
8d063cd8 |
1262 | |
1263 | .fi |
378cc40b |
1264 | .Ip \-x 8 |
1265 | A file test. |
1266 | This unary operator takes one argument, either a filename or a filehandle, |
1267 | and tests the associated file to see if something is true about it. |
a687059c |
1268 | If the argument is omitted, tests $_, except for \-t, which tests |
1269 | .IR STDIN . |
1270 | It returns 1 for true and \'\' for false, or the undefined value if the |
1271 | file doesn't exist. |
378cc40b |
1272 | Precedence is higher than logical and relational operators, but lower than |
1273 | arithmetic operators. |
1274 | The operator may be any of: |
1275 | .nf |
1276 | \-r File is readable by effective uid. |
a687059c |
1277 | \-w File is writable by effective uid. |
378cc40b |
1278 | \-x File is executable by effective uid. |
1279 | \-o File is owned by effective uid. |
1280 | \-R File is readable by real uid. |
a687059c |
1281 | \-W File is writable by real uid. |
378cc40b |
1282 | \-X File is executable by real uid. |
1283 | \-O File is owned by real uid. |
1284 | \-e File exists. |
1285 | \-z File has zero size. |
1286 | \-s File has non-zero size. |
1287 | \-f File is a plain file. |
1288 | \-d File is a directory. |
1289 | \-l File is a symbolic link. |
1290 | \-p File is a named pipe (FIFO). |
1291 | \-S File is a socket. |
1292 | \-b File is a block special file. |
1293 | \-c File is a character special file. |
1294 | \-u File has setuid bit set. |
1295 | \-g File has setgid bit set. |
1296 | \-k File has sticky bit set. |
1297 | \-t Filehandle is opened to a tty. |
1298 | \-T File is a text file. |
1299 | \-B File is a binary file (opposite of \-T). |
1300 | |
1301 | .fi |
1302 | The interpretation of the file permission operators \-r, \-R, \-w, \-W, \-x and \-X |
1303 | is based solely on the mode of the file and the uids and gids of the user. |
1304 | There may be other reasons you can't actually read, write or execute the file. |
1305 | Also note that, for the superuser, \-r, \-R, \-w and \-W always return 1, and |
1306 | \-x and \-X return 1 if any execute bit is set in the mode. |
1307 | Scripts run by the superuser may thus need to do a stat() in order to determine |
1308 | the actual mode of the file, or temporarily set the uid to something else. |
1309 | .Sp |
1310 | Example: |
1311 | .nf |
1312 | .ne 7 |
1313 | |
1314 | while (<>) { |
1315 | chop; |
1316 | next unless \-f $_; # ignore specials |
1317 | .\|.\|. |
1318 | } |
1319 | |
1320 | .fi |
a687059c |
1321 | Note that \-s/a/b/ does not do a negated substitution. |
1322 | Saying \-exp($foo) still works as expected, however\*(--only single letters |
378cc40b |
1323 | following a minus are interpreted as file tests. |
1324 | .Sp |
1325 | The \-T and \-B switches work as follows. |
1326 | The first block or so of the file is examined for odd characters such as |
1327 | strange control codes or metacharacters. |
1328 | If too many odd characters (>10%) are found, it's a \-B file, otherwise it's a \-T file. |
1329 | Also, any file containing null in the first block is considered a binary file. |
1330 | If \-T or \-B is used on a filehandle, the current stdio buffer is examined |
1331 | rather than the first block. |
378cc40b |
1332 | Both \-T and \-B return TRUE on a null file, or a file at EOF when testing |
1333 | a filehandle. |
8d063cd8 |
1334 | .PP |
a687059c |
1335 | If any of the file tests (or either stat operator) are given the special |
1336 | filehandle consisting of a solitary underline, then the stat structure |
1337 | of the previous file test (or stat operator) is used, saving a system |
1338 | call. |
1339 | (This doesn't work with \-t, and you need to remember that lstat and \-l |
1340 | will leave values in the stat structure for the symbolic link, not the |
1341 | real file.) |
1342 | Example: |
1343 | .nf |
1344 | |
1345 | print "Can do.\en" if -r $a || -w _ || -x _; |
1346 | |
1347 | .ne 9 |
1348 | stat($filename); |
1349 | print "Readable\en" if -r _; |
1350 | print "Writable\en" if -w _; |
1351 | print "Executable\en" if -x _; |
1352 | print "Setuid\en" if -u _; |
1353 | print "Setgid\en" if -g _; |
1354 | print "Sticky\en" if -k _; |
1355 | print "Text\en" if -T _; |
1356 | print "Binary\en" if -B _; |
1357 | |
1358 | .fi |
1359 | .PP |
8d063cd8 |
1360 | Here is what C has that |
1361 | .I perl |
1362 | doesn't: |
1363 | .Ip "unary &" 12 |
1364 | Address-of operator. |
1365 | .Ip "unary *" 12 |
1366 | Dereference-address operator. |
378cc40b |
1367 | .Ip "(TYPE)" 12 |
1368 | Type casting operator. |
8d063cd8 |
1369 | .PP |
1370 | Like C, |
1371 | .I perl |
1372 | does a certain amount of expression evaluation at compile time, whenever |
1373 | it determines that all of the arguments to an operator are static and have |
1374 | no side effects. |
1375 | In particular, string concatenation happens at compile time between literals that don't do variable substitution. |
1376 | Backslash interpretation also happens at compile time. |
1377 | You can say |
1378 | .nf |
1379 | |
1380 | .ne 2 |
a687059c |
1381 | \'Now is the time for all\' . "\|\e\|n" . |
1382 | \'good men to come to.\' |
8d063cd8 |
1383 | |
1384 | .fi |
1385 | and this all reduces to one string internally. |
1386 | .PP |
378cc40b |
1387 | The autoincrement operator has a little extra built-in magic to it. |
1388 | If you increment a variable that is numeric, or that has ever been used in |
1389 | a numeric context, you get a normal increment. |
1390 | If, however, the variable has only been used in string contexts since it |
1391 | was set, and has a value that is not null and matches the |
a687059c |
1392 | pattern /^[a\-zA\-Z]*[0\-9]*$/, the increment is done |
378cc40b |
1393 | as a string, preserving each character within its range, with carry: |
1394 | .nf |
1395 | |
a687059c |
1396 | print ++($foo = \'99\'); # prints \*(L'100\*(R' |
1397 | print ++($foo = \'a0\'); # prints \*(L'a1\*(R' |
1398 | print ++($foo = \'Az\'); # prints \*(L'Ba\*(R' |
1399 | print ++($foo = \'zz\'); # prints \*(L'aaa\*(R' |
378cc40b |
1400 | |
1401 | .fi |
1402 | The autodecrement is not magical. |
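The string increment can be checked with a small loop. This is a sketch; the starting values repeat the ones above, plus \'a9\' to show a carry from a digit into a letter.

```perl
use strict;
use warnings;

my @results;
for my $start ('99', 'a0', 'Az', 'zz', 'a9') {
    my $v = $start;     # fresh copy that has only been used as a string
    $v++;               # magical increment: carries within each character range
    push(@results, "$start -> $v");
}

print join("\n", @results), "\n";
```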