Commit | Line | Data |
a687059c |
1 | ''' Beginning of part 4 |
e5d73d77 |
2 | ''' $Header: perl_man.4,v 3.0.1.12 90/10/20 02:15:43 lwall Locked $ |
a687059c |
3 | ''' |
4 | ''' $Log: perl.man.4,v $ |
e5d73d77 |
5 | ''' Revision 3.0.1.12 90/10/20 02:15:43 lwall |
6 | ''' patch37: fixed various typos in man page |
7 | ''' |
76854fea |
8 | ''' Revision 3.0.1.11 90/10/16 10:04:28 lwall |
9 | ''' patch29: added @###.## fields to format |
10 | ''' |
33b78306 |
11 | ''' Revision 3.0.1.10 90/08/09 04:47:35 lwall |
12 | ''' patch19: added require operator |
13 | ''' patch19: added numeric interpretation of $] |
14 | ''' |
15 | ''' Revision 3.0.1.9 90/08/03 11:15:58 lwall |
16 | ''' patch19: Intermediate diffs for Randal |
17 | ''' |
0f85fab0 |
18 | ''' Revision 3.0.1.8 90/03/27 16:19:31 lwall |
19 | ''' patch16: MSDOS support |
20 | ''' |
63f2c1e1 |
21 | ''' Revision 3.0.1.7 90/03/14 12:29:50 lwall |
22 | ''' patch15: man page falsely states that you can't subscript array values |
23 | ''' |
79a0689e |
24 | ''' Revision 3.0.1.6 90/03/12 16:54:04 lwall |
25 | ''' patch13: improved documentation of *name |
26 | ''' |
ac58e20f |
27 | ''' Revision 3.0.1.5 90/02/28 18:01:52 lwall |
28 | ''' patch9: $0 is now always the command name |
29 | ''' |
663a0e37 |
30 | ''' Revision 3.0.1.4 89/12/21 20:12:39 lwall |
31 | ''' patch7: documented that package'filehandle works as well as $package'variable |
32 | ''' patch7: documented which identifiers are always in package main |
33 | ''' |
ffed7fef |
34 | ''' Revision 3.0.1.3 89/11/17 15:32:25 lwall |
35 | ''' patch5: fixed some manual typos and indent problems |
36 | ''' patch5: clarified difference between $! and $@ |
37 | ''' |
ae986130 |
38 | ''' Revision 3.0.1.2 89/11/11 04:46:40 lwall |
39 | ''' patch2: made some line breaks depend on troff vs. nroff |
40 | ''' patch2: clarified operation of ^ and $ when $* is false |
41 | ''' |
03a14243 |
42 | ''' Revision 3.0.1.1 89/10/26 23:18:43 lwall |
43 | ''' patch1: documented the desirability of unnecessary parentheses |
44 | ''' |
a687059c |
45 | ''' Revision 3.0 89/10/18 15:21:55 lwall |
46 | ''' 3.0 baseline |
47 | ''' |
48 | .Sh "Precedence" |
49 | .I Perl |
50 | operators have the following associativity and precedence: |
51 | .nf |
52 | |
53 | nonassoc\h'|1i'print printf exec system sort reverse |
54 | \h'1.5i'chmod chown kill unlink utime die return |
55 | left\h'|1i', |
56 | right\h'|1i'= += \-= *= etc. |
57 | right\h'|1i'?: |
58 | nonassoc\h'|1i'.\|. |
59 | left\h'|1i'|| |
60 | left\h'|1i'&& |
61 | left\h'|1i'| ^ |
62 | left\h'|1i'& |
63 | nonassoc\h'|1i'== != eq ne |
64 | nonassoc\h'|1i'< > <= >= lt gt le ge |
65 | nonassoc\h'|1i'chdir exit eval reset sleep rand umask |
66 | nonassoc\h'|1i'\-r \-w \-x etc. |
67 | left\h'|1i'<< >> |
68 | left\h'|1i'+ \- . |
69 | left\h'|1i'* / % x |
70 | left\h'|1i'=~ !~ |
71 | right\h'|1i'! ~ and unary minus |
72 | right\h'|1i'** |
73 | nonassoc\h'|1i'++ \-\|\- |
74 | left\h'|1i'\*(L'(\*(R' |
75 | |
76 | .fi |
77 | As mentioned earlier, if any list operator (print, etc.) or |
78 | any unary operator (chdir, etc.) |
79 | is followed by a left parenthesis as the next token on the same line, |
80 | the operator and arguments within parentheses are taken to |
81 | be of highest precedence, just like a normal function call. |
82 | Examples: |
83 | .nf |
84 | |
ffed7fef |
85 | chdir $foo || die;\h'|3i'# (chdir $foo) || die |
86 | chdir($foo) || die;\h'|3i'# (chdir $foo) || die |
87 | chdir ($foo) || die;\h'|3i'# (chdir $foo) || die |
88 | chdir +($foo) || die;\h'|3i'# (chdir $foo) || die |
a687059c |
89 | |
90 | but, because * is higher precedence than ||: |
91 | |
ffed7fef |
92 | chdir $foo * 20;\h'|3i'# chdir ($foo * 20) |
93 | chdir($foo) * 20;\h'|3i'# (chdir $foo) * 20 |
94 | chdir ($foo) * 20;\h'|3i'# (chdir $foo) * 20 |
95 | chdir +($foo) * 20;\h'|3i'# chdir ($foo * 20) |
a687059c |
96 | |
ffed7fef |
97 | rand 10 * 20;\h'|3i'# rand (10 * 20) |
98 | rand(10) * 20;\h'|3i'# (rand 10) * 20 |
99 | rand (10) * 20;\h'|3i'# (rand 10) * 20 |
100 | rand +(10) * 20;\h'|3i'# rand (10 * 20) |
a687059c |
101 | |
102 | .fi |
103 | In the absence of parentheses, |
104 | the precedence of list operators such as print, sort or chmod is |
105 | either very high or very low depending on whether you look at the left |
106 | side of the operator or the right side of it. |
107 | For example, in |
108 | .nf |
109 | |
110 | @ary = (1, 3, sort 4, 2); |
111 | print @ary; # prints 1324 |
112 | |
113 | .fi |
114 | the commas on the right of the sort are evaluated before the sort, but |
115 | the commas on the left are evaluated after. |
116 | In other words, list operators tend to gobble up all the arguments that |
117 | follow them, and then act like a simple term with regard to the preceding |
118 | expression. |
119 | Note that you have to be careful with parens: |
120 | .nf |
121 | |
122 | .ne 3 |
123 | # These evaluate exit before doing the print: |
124 | print($foo, exit); # Obviously not what you want. |
125 | print $foo, exit; # Nor is this. |
126 | |
127 | .ne 4 |
128 | # These do the print before evaluating exit: |
129 | (print $foo), exit; # This is what you want. |
130 | print($foo), exit; # Or this. |
131 | print ($foo), exit; # Or even this. |
132 | |
133 | Also note that |
134 | |
135 | print ($foo & 255) + 1, "\en"; |
136 | |
137 | .fi |
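138 | probably doesn't do what you expect at first glance: the parenthesized list is taken as the complete argument list of print, so only ($foo & 255) is printed, and the + 1, "\en" is applied to the value print returns. |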
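The parsing rules above can be checked with a small sketch. It assumes a current Perl 5 interpreter, and it uses int as a stand-in for the unary operators in the table (int is not one of the examples in this section); the variable names are made up for the demonstration.

```perl
# A list operator (sort) swallows everything to its right,
# then acts as a single term for the list on its left.
@ary = (1, 3, sort 4, 2);
$joined = join('', @ary);          # "1324"

# A named unary operator binds more loosely than *,
# unless its argument is parenthesized like a function call.
$loose  = int 2.5 * 2;             # int(2.5 * 2)   == 5
$tight  = int(2.5) * 2;            # (int 2.5) * 2  == 4
$spaced = int (2.5) * 2;           # (int 2.5) * 2  == 4
$plus   = int +(2.5) * 2;          # int((2.5) * 2) == 5
```

The unary plus in the last line keeps the parenthesis from being the next token after the operator, so the whole product becomes the argument again.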
139 | .Sh "Subroutines" |
140 | A subroutine may be declared as follows: |
141 | .nf |
142 | |
143 | sub NAME BLOCK |
144 | |
145 | .fi |
146 | .PP |
147 | Any arguments passed to the routine come in as array @_, |
148 | that is ($_[0], $_[1], .\|.\|.). |
149 | The array @_ is a local array, but its values are references to the |
150 | actual scalar parameters. |
151 | The return value of the subroutine is the value of the last expression |
152 | evaluated, and can be either an array value or a scalar value. |
153 | Alternatively, a return statement may be used to specify the returned value and |
154 | exit the subroutine. |
155 | To create local variables see the |
156 | .I local |
157 | operator. |
158 | .PP |
159 | A subroutine is called using the |
160 | .I do |
161 | operator or the & operator. |
162 | .nf |
163 | |
164 | .ne 12 |
165 | Example: |
166 | |
167 | sub MAX { |
168 | local($max) = pop(@_); |
169 | foreach $foo (@_) { |
170 | $max = $foo \|if \|$max < $foo; |
171 | } |
172 | $max; |
173 | } |
174 | |
175 | .\|.\|. |
176 | $bestday = &MAX($mon,$tue,$wed,$thu,$fri); |
177 | |
178 | .ne 21 |
179 | Example: |
180 | |
181 | # get a line, combining continuation lines |
182 | # that start with whitespace |
183 | sub get_line { |
184 | $thisline = $lookahead; |
185 | line: while ($lookahead = <STDIN>) { |
186 | if ($lookahead \|=~ \|/\|^[ \^\e\|t]\|/\|) { |
187 | $thisline \|.= \|$lookahead; |
188 | } |
189 | else { |
190 | last line; |
191 | } |
192 | } |
193 | $thisline; |
194 | } |
195 | |
196 | $lookahead = <STDIN>; # get first line |
197 | while ($_ = do get_line(\|)) { |
198 | .\|.\|. |
199 | } |
200 | |
201 | .fi |
202 | .nf |
203 | .ne 6 |
204 | Use array assignment to a local list to name your formal arguments: |
205 | |
206 | sub maybeset { |
207 | local($key, $value) = @_; |
208 | $foo{$key} = $value unless $foo{$key}; |
209 | } |
210 | |
211 | .fi |
212 | This also has the effect of turning call-by-reference into call-by-value, |
213 | since the assignment copies the values. |
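The difference between the two calling styles can be sketched as follows. The subroutine and variable names are hypothetical, and the sketch assumes a current Perl 5 interpreter run without strict checking, so that local() can be used as in the rest of this section.

```perl
# Modifies the caller's variable through the alias in @_.
sub bump_alias { $_[0] += 1; }

# Copies the argument first, so the caller's variable is untouched.
sub bump_copy {
    local($n) = @_;
    $n += 1;
    $n;
}

$count = 5;
&bump_alias($count);             # $count is now 6
$result = &bump_copy($count);    # $result is 7, $count is still 6
```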
214 | .Sp |
215 | Subroutines may be called recursively. |
216 | If a subroutine is called using the & form, the argument list is optional. |
217 | If omitted, no @_ array is set up for the subroutine; the @_ array at the |
218 | time of the call is visible to the subroutine instead. |
219 | .nf |
220 | |
221 | do foo(1,2,3); # pass three arguments |
222 | &foo(1,2,3); # the same |
223 | |
224 | do foo(); # pass a null list |
225 | &foo(); # the same |
226 | &foo; # pass no arguments--more efficient |
227 | |
228 | .fi |
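A short sketch of what the callee sees in each case, assuming a current Perl 5 interpreter (where the & forms behave as described here). The names count_args and caller_sub are invented for the illustration.

```perl
# Reports how many arguments are visible in @_.
sub count_args { scalar(@_); }

sub caller_sub {
    local($explicit) = &count_args();   # null list: callee gets its own empty @_
    local($implicit) = &count_args;     # no list: callee sees caller_sub's @_
    ($explicit, $implicit);
}

($with_list, $without_list) = &caller_sub('a', 'b', 'c');   # (0, 3)
```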
229 | .Sh "Passing By Reference" |
230 | Sometimes you don't want to pass the value of an array to a subroutine but |
231 | rather the name of it, so that the subroutine can modify the global copy |
232 | of it rather than working with a local copy. |
233 | In perl you can refer to all the objects of a particular name by prefixing |
234 | the name with a star: *foo. |
235 | When evaluated, it produces a scalar value that represents all the objects |
79a0689e |
236 | of that name, including any filehandle, format or subroutine. |
a687059c |
237 | When assigned to within a local() operation, it causes the name mentioned |
238 | to refer to whatever * value was assigned to it. |
239 | Example: |
240 | .nf |
241 | |
242 | sub doubleary { |
243 | local(*someary) = @_; |
244 | foreach $elem (@someary) { |
245 | $elem *= 2; |
246 | } |
247 | } |
248 | do doubleary(*foo); |
249 | do doubleary(*bar); |
250 | |
251 | .fi |
252 | Assignment to *name is currently recommended only inside a local(). |
253 | You can actually assign to *name anywhere, but the previous referent of |
254 | *name may be stranded forever. |
255 | This may or may not bother you. |
256 | .Sp |
257 | Note that scalars are already passed by reference, so you can modify scalar |
ae986130 |
258 | arguments without using this mechanism by referring explicitly to the $_[nnn] |
a687059c |
259 | in question. |
260 | You can modify all the elements of an array by passing all the elements |
261 | as scalars, but you have to use the * mechanism to push, pop or change the |
262 | size of an array. |
263 | The * mechanism will probably be more efficient in any case. |
264 | .Sp |
265 | Since a *name value contains unprintable binary data, if it is used as |
266 | an argument in a print, or as a %s argument in a printf or sprintf, it |
267 | then has the value '*name', just so it prints out pretty. |
79a0689e |
268 | .Sp |
269 | Even if you don't want to modify an array, this mechanism is useful for |
270 | passing multiple arrays in a single LIST, since normally the LIST mechanism |
271 | will merge all the array values so that you can't extract out the |
272 | individual arrays. |
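As a sketch of the case where the * mechanism is required, the following hypothetical subroutine changes the size of the caller's array, which cannot be done through the scalars in @_ alone. It assumes a current Perl 5 interpreter; append_sum and the variable names are not from the original text.

```perl
sub append_sum {
    local(*list) = @_;          # @list is now an alias for the caller's array
    local($sum) = 0;
    foreach $elem (@list) {
        $sum += $elem;
    }
    push(@list, $sum);          # changes the size of the caller's array
}

@nums = (1, 2, 3);
&append_sum(*nums);             # @nums is now (1, 2, 3, 6)
```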
a687059c |
273 | .Sh "Regular Expressions" |
274 | The patterns used in pattern matching are regular expressions such as |
275 | those supplied in the Version 8 regexp routines. |
276 | (In fact, the routines are derived from Henry Spencer's freely redistributable |
277 | reimplementation of the V8 routines.) |
278 | In addition, \ew matches an alphanumeric character (including \*(L"_\*(R") and \eW a nonalphanumeric. |
279 | Word boundaries may be matched by \eb, and non-boundaries by \eB. |
280 | A whitespace character is matched by \es, non-whitespace by \eS. |
281 | A numeric character is matched by \ed, non-numeric by \eD. |
282 | You may use \ew, \es and \ed within character classes. |
283 | Also, \en, \er, \ef, \et and \eNNN have their normal interpretations. |
284 | Within character classes \eb represents backspace rather than a word boundary. |
285 | Alternatives may be separated by |. |
286 | The bracketing construct \|(\ .\|.\|.\ \|) may also be used, in which case \e<digit> |
287 | matches the digit'th substring, where digit can range from 1 to 9. |
288 | (Outside of the pattern, always use $ instead of \e in front of the digit. |
289 | The scope of $<digit> (and $\`, $& and $\') |
290 | extends to the end of the enclosing BLOCK or eval string, or to |
291 | the next pattern match with subexpressions. |
292 | The \e<digit> notation sometimes works outside the current pattern, but should |
293 | not be relied upon.) |
294 | $+ returns whatever the last bracket match matched. |
295 | $& returns the entire matched string. |
ac58e20f |
296 | ($0 used to return the same thing, but not any more.) |
a687059c |
297 | $\` returns everything before the matched string. |
298 | $\' returns everything after the matched string. |
299 | Examples: |
300 | .nf |
301 | |
302 | s/\|^\|([^ \|]*\|) \|*([^ \|]*\|)\|/\|$2 $1\|/; # swap first two words |
303 | |
304 | .ne 5 |
305 | if (/\|Time: \|(.\|.\|):\|(.\|.\|):\|(.\|.\|)\|/\|) { |
306 | $hours = $1; |
307 | $minutes = $2; |
308 | $seconds = $3; |
309 | } |
310 | |
311 | .fi |
ae986130 |
312 | By default, the ^ character is only guaranteed to match at the beginning |
313 | of the string, |
314 | the $ character only at the end (or before the newline at the end) |
a687059c |
315 | and |
316 | .I perl |
317 | does certain optimizations with the assumption that the string contains |
318 | only one line. |
ae986130 |
319 | The behavior of ^ and $ on embedded newlines will be inconsistent. |
a687059c |
320 | You may, however, wish to treat a string as a multi-line buffer, such that |
321 | the ^ will match after any newline within the string, and $ will match |
322 | before any newline. |
323 | At the cost of a little more overhead, you can do this by setting the variable |
324 | $* to 1. |
325 | Setting it back to 0 makes |
326 | .I perl |
327 | revert to its old behavior. |
328 | .PP |
329 | To facilitate multi-line substitutions, the . character never matches a newline |
330 | (even when $* is 0). |
331 | In particular, the following leaves a newline on the $_ string: |
332 | .nf |
333 | |
334 | $_ = <STDIN>; |
335 | s/.*(some_string).*/$1/; |
336 | |
337 | If the newline is unwanted, try one of |
338 | |
339 | s/.*(some_string).*\en/$1/; |
340 | s/.*(some_string)[^\e000]*/$1/; |
341 | s/.*(some_string)(.|\en)*/$1/; |
342 | chop; s/.*(some_string).*/$1/; |
343 | /(some_string)/ && ($_ = $1); |
344 | |
345 | .fi |
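The effect on the trailing newline can be seen in a small sketch that uses fixed strings instead of input from STDIN. The variable names are made up, and a current Perl 5 interpreter is assumed.

```perl
$with_newline = "xx some_string yy\n";
$with_newline =~ s/.*(some_string).*/$1/;        # "some_string\n"

$without_newline = "xx some_string yy\n";
$without_newline =~ s/.*(some_string).*\n/$1/;   # "some_string"
```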
346 | Any item of a regular expression may be followed with digits in curly brackets |
347 | of the form {n,m}, where n gives the minimum number of times to match the item |
348 | and m gives the maximum. |
349 | The form {n} is equivalent to {n,n} and matches exactly n times. |
350 | The form {n,} matches n or more times. |
351 | (If a curly bracket occurs in any other context, it is treated as a regular |
352 | character.) |
353 | The * modifier is equivalent to {0,}, the + modifier to {1,} and the ? modifier |
354 | to {0,1}. |
355 | There is no limit to the size of n or m, but large numbers will chew up |
356 | more memory. |
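A brief sketch of the three curly-bracket forms, assuming a current Perl 5 interpreter; the test strings and variable names are only illustrative.

```perl
@hits = ();
foreach $s ('ab', 'aab', 'aaab', 'aaaab') {
    push(@hits, $s) if $s =~ /^a{2,3}b$/;     # two or three a's, then b
}
# @hits is ('aab', 'aaab')

$exact    = ('aaa'   =~ /^a{3}$/)  ? 1 : 0;   # 1: exactly three
$at_least = ('aaaaa' =~ /^a{2,}$/) ? 1 : 0;   # 1: two or more
$too_few  = ('a'     =~ /^a{2,}$/) ? 1 : 0;   # 0: only one
```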
357 | .Sp |
358 | You will note that all backslashed metacharacters in |
359 | .I perl |
360 | are alphanumeric, |
361 | such as \eb, \ew, \en. |
362 | Unlike some other regular expression languages, there are no backslashed |
363 | symbols that aren't alphanumeric. |
364 | So anything that looks like \e\e, \e(, \e), \e<, \e>, \e{, or \e} is always |
365 | interpreted as a literal character, not a metacharacter. |
366 | This makes it simple to quote a string that you want to use for a pattern |
367 | but that you are afraid might contain metacharacters. |
368 | Simply quote all the non-alphanumeric characters: |
369 | .nf |
370 | |
371 | $pattern =~ s/(\eW)/\e\e$1/g; |
372 | |
373 | .fi |
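To see what the quoting substitution buys you, here is a sketch with a hypothetical pattern string that contains two metacharacters. It assumes a current Perl 5 interpreter.

```perl
$pattern = 'a.b*c';

# Unquoted, . and * keep their special meaning.
$raw_match = ('axbbc' =~ /^$pattern$/) ? 1 : 0;   # 1

# Quote every non-alphanumeric character.
($quoted = $pattern) =~ s/(\W)/\\$1/g;            # a\.b\*c

$literal_hit  = ('a.b*c' =~ /^$quoted$/) ? 1 : 0; # 1: matches itself
$literal_miss = ('axbbc' =~ /^$quoted$/) ? 1 : 0; # 0: . and * are literal now
```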
374 | .Sh "Formats" |
375 | Output record formats for use with the |
376 | .I write |
377 | operator may be declared as follows: |
378 | .nf |
379 | |
380 | .ne 3 |
381 | format NAME = |
382 | FORMLIST |
383 | . |
384 | |
385 | .fi |
386 | If NAME is omitted, format \*(L"STDOUT\*(R" is defined. |
387 | FORMLIST consists of a sequence of lines, each of which may be of one of three |
388 | types: |
389 | .Ip 1. 4 |
390 | A comment. |
391 | .Ip 2. 4 |
392 | A \*(L"picture\*(R" line giving the format for one output line. |
393 | .Ip 3. 4 |
394 | An argument line supplying values to plug into a picture line. |
395 | .PP |
396 | Picture lines are printed exactly as they look, except for certain fields |
397 | that substitute values into the line. |
398 | Each picture field starts with either @ or ^. |
399 | The @ field (not to be confused with the array marker @) is the normal |
400 | case; ^ fields are used |
401 | to do rudimentary multi-line text block filling. |
402 | The length of the field is supplied by padding out the field |
403 | with multiple <, >, or | characters to specify, respectively, left justification, |
404 | right justification, or centering. |
76854fea |
405 | As an alternate form of right justification, |
406 | you may also use # characters (with an optional .) to specify a numeric field. |
a687059c |
407 | If any of the values supplied for these fields contains a newline, only |
408 | the text up to the newline is printed. |
409 | The special field @* can be used for printing multi-line values. |
410 | It should appear by itself on a line. |
411 | .PP |
412 | The values are specified on the following line, in the same order as |
413 | the picture fields. |
414 | The values should be separated by commas. |
415 | .PP |
416 | Picture fields that begin with ^ rather than @ are treated specially. |
417 | The value supplied must be a scalar variable name which contains a text |
418 | string. |
419 | .I Perl |
420 | puts as much text as it can into the field, and then chops off the front |
421 | of the string so that the next time the variable is referenced, |
422 | more of the text can be printed. |
423 | Normally you would use a sequence of fields in a vertical stack to print |
424 | out a block of text. |
425 | If you like, you can end the final field with .\|.\|., which will appear in the |
426 | output if the text was too long to appear in its entirety. |
427 | You can change which characters are legal to break on by changing the |
428 | variable $: to a list of the desired characters. |
429 | .PP |
430 | Since use of ^ fields can produce variable length records if the text to be |
431 | formatted is short, you can suppress blank lines by putting the tilde (~) |
432 | character anywhere in the line. |
433 | (Normally you should put it in the front if possible, for visibility.) |
434 | The tilde will be translated to a space upon output. |
435 | If you put a second tilde contiguous to the first, the line will be repeated |
436 | until all the fields on the line are exhausted. |
437 | (If you use a field of the @ variety, the expression you supply had better |
438 | not give the same value every time forever!) |
439 | .PP |
440 | Examples: |
441 | .nf |
442 | .lg 0 |
443 | .cs R 25 |
444 | .ft C |
445 | |
446 | .ne 10 |
447 | # a report on the /etc/passwd file |
448 | format top = |
449 | \& Passwd File |
450 | Name Login Office Uid Gid Home |
451 | ------------------------------------------------------------------ |
452 | \&. |
453 | format STDOUT = |
454 | @<<<<<<<<<<<<<<<<<< @||||||| @<<<<<<@>>>> @>>>> @<<<<<<<<<<<<<<<<< |
455 | $name, $login, $office,$uid,$gid, $home |
456 | \&. |
457 | |
458 | .ne 29 |
459 | # a report from a bug report form |
460 | format top = |
461 | \& Bug Reports |
462 | @<<<<<<<<<<<<<<<<<<<<<<< @||| @>>>>>>>>>>>>>>>>>>>>>>> |
463 | $system, $%, $date |
464 | ------------------------------------------------------------------ |
465 | \&. |
466 | format STDOUT = |
467 | Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< |
468 | \& $subject |
469 | Index: @<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< |
470 | \& $index, $description |
471 | Priority: @<<<<<<<<<< Date: @<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< |
472 | \& $priority, $date, $description |
473 | From: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< |
474 | \& $from, $description |
475 | Assigned to: @<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< |
476 | \& $programmer, $description |
477 | \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< |
478 | \& $description |
479 | \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< |
480 | \& $description |
481 | \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< |
482 | \& $description |
483 | \&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< |
484 | \& $description |
485 | \&~ ^<<<<<<<<<<<<<<<<<<<<<<<... |
486 | \& $description |
487 | \&. |
488 | |
489 | .ft R |
490 | .cs R |
491 | .lg |
492 | .fi |
493 | It is possible to intermix prints with writes on the same output channel, |
494 | but you'll have to handle $\- (lines left on the page) yourself. |
495 | .PP |
496 | If you are printing lots of fields that are usually blank, you should consider |
497 | using the reset operator between records. |
498 | Not only is it more efficient, but it can prevent the bug of adding another |
499 | field and forgetting to zero it. |
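A minimal sketch of a picture line with left-justified, right-justified and numeric fields follows. The filehandle OUT, the variables and the data are invented for the example. Writing into the scalar $buf through an in-memory filehandle is a feature of much later interpreters (Perl 5.8 and up) and is assumed here only so that the result can be examined; an ordinary file or STDOUT would be used the same way.

```perl
# OUT writes into $buf (in-memory handle; an assumption made
# only so the formatted line can be inspected afterwards).
open(OUT, '>', \$buf) || die "open: $!";

format OUT =
@<<<<<<<<< @>>>> @##.##
$name,     $qty, $price
.

($name, $qty, $price) = ('apple', 12, 1.5);
write(OUT);
close(OUT);
# $buf now holds one line: apple left-justified, 12 right-justified,
# and 1.5 rendered as 1.50 in the numeric field.
```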
500 | .Sh "Interprocess Communication" |
501 | The IPC facilities of perl are built on the Berkeley socket mechanism. |
502 | If you don't have sockets, you can ignore this section. |
503 | The calls have the same names as the corresponding system calls, |
504 | but the arguments tend to differ, for two reasons. |
505 | First, perl file handles work differently than C file descriptors. |
506 | Second, perl already knows the length of its strings, so you don't need |
507 | to pass that information. |
508 | Here is a sample client (untested): |
509 | .nf |
510 | |
511 | ($them,$port) = @ARGV; |
512 | $port = 2345 unless $port; |
513 | $them = 'localhost' unless $them; |
514 | |
515 | $SIG{'INT'} = 'dokill'; |
516 | sub dokill { kill 9,$child if $child; } |
517 | |
33b78306 |
518 | require 'sys/socket.ph'; |
a687059c |
519 | |
520 | $sockaddr = 'S n a4 x8'; |
521 | chop($hostname = `hostname`); |
522 | |
523 | ($name, $aliases, $proto) = getprotobyname('tcp'); |
524 | ($name, $aliases, $port) = getservbyname($port, 'tcp') |
0f85fab0 |
525 | unless $port =~ /^\ed+$/; |
ae986130 |
526 | .ie t \{\ |
a687059c |
527 | ($name, $aliases, $type, $len, $thisaddr) = gethostbyname($hostname); |
ae986130 |
528 | 'br\} |
529 | .el \{\ |
530 | ($name, $aliases, $type, $len, $thisaddr) = |
531 | gethostbyname($hostname); |
532 | 'br\} |
a687059c |
533 | ($name, $aliases, $type, $len, $thataddr) = gethostbyname($them); |
534 | |
535 | $this = pack($sockaddr, &AF_INET, 0, $thisaddr); |
536 | $that = pack($sockaddr, &AF_INET, $port, $thataddr); |
537 | |
538 | socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!"; |
539 | bind(S, $this) || die "bind: $!"; |
540 | connect(S, $that) || die "connect: $!"; |
541 | |
542 | select(S); $| = 1; select(stdout); |
543 | |
544 | if ($child = fork) { |
545 | while (<>) { |
546 | print S; |
547 | } |
548 | sleep 3; |
549 | do dokill(); |
550 | } |
551 | else { |
552 | while (<S>) { |
553 | print; |
554 | } |
555 | } |
556 | |
557 | .fi |
558 | And here's a server: |
559 | .nf |
560 | |
561 | ($port) = @ARGV; |
562 | $port = 2345 unless $port; |
563 | |
33b78306 |
564 | require 'sys/socket.ph'; |
a687059c |
565 | |
566 | $sockaddr = 'S n a4 x8'; |
567 | |
568 | ($name, $aliases, $proto) = getprotobyname('tcp'); |
569 | ($name, $aliases, $port) = getservbyname($port, 'tcp') |
0f85fab0 |
570 | unless $port =~ /^\ed+$/; |
a687059c |
571 | |
572 | $this = pack($sockaddr, &AF_INET, $port, "\e0\e0\e0\e0"); |
573 | |
574 | select(NS); $| = 1; select(stdout); |
575 | |
576 | socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!"; |
577 | bind(S, $this) || die "bind: $!"; |
578 | listen(S, 5) || die "listen: $!"; |
579 | |
580 | select(S); $| = 1; select(stdout); |
581 | |
582 | for (;;) { |
583 | print "Listening again\en"; |
584 | ($addr = accept(NS,S)) || die $!; |
585 | print "accept ok\en"; |
586 | |
ae986130 |
587 | ($af,$port,$inetaddr) = unpack($sockaddr,$addr); |
a687059c |
588 | @inetaddr = unpack('C4',$inetaddr); |
589 | print "$af $port @inetaddr\en"; |
590 | |
591 | while (<NS>) { |
592 | print; |
593 | print NS; |
594 | } |
595 | } |
596 | |
597 | .fi |
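The $sockaddr template used by both programs can be tried out without opening a socket. In this sketch the value 2 for AF_INET is an assumption (it is the usual value, but the real one should come from sys/socket.ph), and the port and address are arbitrary sample data. A current Perl 5 interpreter is assumed.

```perl
$sockaddr = 'S n a4 x8';   # family, port in network order, 4-byte address, padding
$af_inet  = 2;             # assumed value of AF_INET

$packed = pack($sockaddr, $af_inet, 2345, pack('C4', 127, 0, 0, 1));
$size   = length($packed);                   # 2 + 2 + 4 + 8 = 16 bytes

($af, $port, $inetaddr) = unpack($sockaddr, $packed);
@inetaddr = unpack('C4', $inetaddr);         # (127, 0, 0, 1)
```

This is the same unpacking the server does with the address returned by accept.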
598 | .Sh "Predefined Names" |
599 | The following names have special meaning to |
600 | .IR perl . |
601 | I could have used alphabetic symbols for some of these, but I didn't want |
602 | to take the chance that someone would say reset \*(L"a\-zA\-Z\*(R" and wipe them all |
603 | out. |
604 | You'll just have to suffer along with these silly symbols. |
605 | Most of them have reasonable mnemonics, or analogues in one of the shells. |
606 | .Ip $_ 8 |
607 | The default input and pattern-searching space. |
608 | The following pairs are equivalent: |
609 | .nf |
610 | |
611 | .ne 2 |
612 | while (<>) {\|.\|.\|. # only equivalent in while! |
613 | while ($_ = <>) {\|.\|.\|. |
614 | |
615 | .ne 2 |
616 | /\|^Subject:/ |
617 | $_ \|=~ \|/\|^Subject:/ |
618 | |
619 | .ne 2 |
620 | y/a\-z/A\-Z/ |
621 | $_ =~ y/a\-z/A\-Z/ |
622 | |
623 | .ne 2 |
624 | chop |
625 | chop($_) |
626 | |
627 | .fi |
628 | (Mnemonic: underline is understood in certain operations.) |
629 | .Ip $. 8 |
630 | The current input line number of the last filehandle that was read. |
631 | Readonly. |
632 | Remember that only an explicit close on the filehandle resets the line number. |
633 | Since <> never does an explicit close, line numbers increase across ARGV files |
634 | (but see examples under eof). |
635 | (Mnemonic: many programs use . to mean the current line number.) |
636 | .Ip $/ 8 |
637 | The input record separator, newline by default. |
638 | Works like |
639 | .IR awk 's |
640 | RS variable, including treating blank lines as delimiters |
641 | if set to the null string. |
642 | If set to a value longer than one character, only the first character is used. |
643 | (Mnemonic: / is used to delimit line boundaries when quoting poetry.) |
644 | .Ip $, 8 |
645 | The output field separator for the print operator. |
646 | Ordinarily the print operator simply prints out the comma separated fields |
647 | you specify. |
648 | In order to get behavior more like |
649 | .IR awk , |
650 | set this variable as you would set |
651 | .IR awk 's |
652 | OFS variable to specify what is printed between fields. |
653 | (Mnemonic: what is printed when there is a , in your print statement.) |
654 | .Ip $"" 8 |
655 | This is like $, except that it applies to array values interpolated into |
656 | a double-quoted string (or similar interpreted string). |
657 | Default is a space. |
658 | (Mnemonic: obvious, I think.) |
659 | .Ip $\e 8 |
660 | The output record separator for the print operator. |
661 | Ordinarily the print operator simply prints out the comma separated fields |
662 | you specify, with no trailing newline or record separator assumed. |
663 | In order to get behavior more like |
664 | .IR awk , |
665 | set this variable as you would set |
666 | .IR awk 's |
667 | ORS variable to specify what is printed at the end of the print. |
668 | (Mnemonic: you set $\e instead of adding \en at the end of the print. |
669 | Also, it's just like /, but it's what you get \*(L"back\*(R" from |
670 | .IR perl .) |
671 | .Ip $# 8 |
672 | The output format for printed numbers. |
673 | This variable is a half-hearted attempt to emulate |
674 | .IR awk 's |
675 | OFMT variable. |
676 | There are times, however, when |
677 | .I awk |
678 | and |
679 | .I perl |
680 | have differing notions of what |
681 | is in fact numeric. |
682 | Also, the initial value is %.20g rather than %.6g, so you need to set $# |
683 | explicitly to get |
684 | .IR awk 's |
685 | value. |
686 | (Mnemonic: # is the number sign.) |
687 | .Ip $% 8 |
688 | The current page number of the currently selected output channel. |
689 | (Mnemonic: % is page number in nroff.) |
690 | .Ip $= 8 |
691 | The current page length (printable lines) of the currently selected output |
692 | channel. |
693 | Default is 60. |
694 | (Mnemonic: = has horizontal lines.) |
695 | .Ip $\- 8 |
696 | The number of lines left on the page of the currently selected output channel. |
697 | (Mnemonic: lines_on_page \- lines_printed.) |
698 | .Ip $~ 8 |
699 | The name of the current report format for the currently selected output |
700 | channel. |
701 | (Mnemonic: brother to $^.) |
702 | .Ip $^ 8 |
703 | The name of the current top-of-page format for the currently selected output |
704 | channel. |
705 | (Mnemonic: points to top of page.) |
706 | .Ip $| 8 |
707 | If set to nonzero, forces a flush after every write or print on the currently |
708 | selected output channel. |
709 | Default is 0. |
710 | Note that |
711 | .I STDOUT |
712 | will typically be line buffered if output is to the |
713 | terminal and block buffered otherwise. |
714 | Setting this variable is useful primarily when you are outputting to a pipe, |
715 | such as when you are running a |
716 | .I perl |
717 | script under rsh and want to see the |
718 | output as it's happening. |
719 | (Mnemonic: when you want your pipes to be piping hot.) |
720 | .Ip $$ 8 |
721 | The process number of the |
722 | .I perl |
723 | running this script. |
724 | (Mnemonic: same as shells.) |
725 | .Ip $? 8 |
726 | The status returned by the last pipe close, backtick (\`\`) command or |
727 | .I system |
728 | operator. |
729 | Note that this is the status word returned by the wait() system |
730 | call, so the exit value of the subprocess is actually ($? >> 8). |
731 | $? & 255 gives which signal, if any, the process died from, and whether |
732 | there was a core dump. |
733 | (Mnemonic: similar to sh and ksh.) |
734 | .Ip $& 8 4 |
735 | The string matched by the last pattern match (not counting any matches hidden |
736 | within a BLOCK or eval enclosed by the current BLOCK). |
737 | (Mnemonic: like & in some editors.) |
738 | .Ip $\` 8 4 |
739 | The string preceding whatever was matched by the last pattern match |
740 | (not counting any matches hidden within a BLOCK or eval enclosed by the current |
741 | BLOCK). |
742 | (Mnemonic: \` often precedes a quoted string.) |
743 | .Ip $\' 8 4 |
744 | The string following whatever was matched by the last pattern match |
745 | (not counting any matches hidden within a BLOCK or eval enclosed by the current |
746 | BLOCK). |
747 | (Mnemonic: \' often follows a quoted string.) |
748 | Example: |
749 | .nf |
750 | |
751 | .ne 3 |
752 | $_ = \'abcdefghi\'; |
753 | /def/; |
754 | print "$\`:$&:$\'\en"; # prints abc:def:ghi |
755 | |
756 | .fi |
757 | .Ip $+ 8 4 |
758 | The last bracket matched by the last search pattern. |
759 | This is useful if you don't know which of a set of alternative patterns |
760 | matched. |
761 | For example: |
762 | .nf |
763 | |
764 | /Version: \|(.*\|)|Revision: \|(.*\|)\|/ \|&& \|($rev = $+); |
765 | |
766 | .fi |
767 | (Mnemonic: be positive and forward looking.) |
768 | .Ip $* 8 2 |
769 | Set to 1 to do multiline matching within a string, 0 to tell |
770 | .I perl |
771 | that it can assume that strings contain a single line, for the purpose |
772 | of optimizing pattern matches. |
773 | Pattern matches on strings containing multiple newlines can produce confusing |
774 | results when $* is 0. |
775 | Default is 0. |
776 | (Mnemonic: * matches multiple things.) |
777 | .Ip $0 8 |
778 | Contains the name of the file containing the |
779 | .I perl |
780 | script being executed. |
a687059c |
781 | (Mnemonic: same as sh and ksh.) |
782 | .Ip $<digit> 8 |
783 | Contains the subpattern from the corresponding set of parentheses in the last |
784 | pattern matched, not counting patterns matched in nested blocks that have |
785 | been exited already. |
786 | (Mnemonic: like \edigit.) |
787 | .Ip $[ 8 2 |
788 | The index of the first element in an array, and of the first character in |
789 | a substring. |
790 | Default is 0, but you could set it to 1 to make |
791 | .I perl |
792 | behave more like |
793 | .I awk |
794 | (or Fortran) |
795 | when subscripting and when evaluating the index() and substr() functions. |
796 | (Mnemonic: [ begins subscripts.) |
797 | .Ip $] 8 2 |
798 | The string printed out when you say \*(L"perl -v\*(R". |
799 | It can be used to determine at the beginning of a script whether the perl |
800 | interpreter executing the script is in the right range of versions. |
33b78306 |
801 | If used in a numeric context, returns the version + patchlevel / 1000. |
a687059c |
802 | Example: |
803 | .nf |
804 | |
33b78306 |
805 | .ne 8 |
a687059c |
806 | # see if getc is available |
807 | ($version,$patchlevel) = |
808 | $] =~ /(\ed+\e.\ed+).*\enPatch level: (\ed+)/; |
809 | print STDERR "(No filename completion available.)\en" |
810 | if $version * 1000 + $patchlevel < 2016; |
811 | |
33b78306 |
812 | or, used numerically, |
813 | |
e5d73d77 |
814 | warn "No checksumming!\en" if $] < 3.019; |
33b78306 |
815 | |
a687059c |
816 | .fi |
817 | (Mnemonic: Is this version of perl in the right bracket?) |
818 | .Ip $; 8 2 |
819 | The subscript separator for multi-dimensional array emulation. |
820 | If you refer to an associative array element as |
821 | .nf |
822 | $foo{$a,$b,$c} |
823 | |
824 | it really means |
825 | |
826 | $foo{join($;, $a, $b, $c)} |
827 | |
828 | But don't put |
829 | |
830 | @foo{$a,$b,$c} # a slice--note the @ |
831 | |
832 | which means |
833 | |
834 | ($foo{$a},$foo{$b},$foo{$c}) |
835 | |
836 | .fi |
837 | Default is "\e034", the same as SUBSEP in |
838 | .IR awk . |
839 | Note that if your keys contain binary data there might not be any safe |
840 | value for $;. |
841 | (Mnemonic: comma (the syntactic subscript separator) is a semi-semicolon. |
842 | Yeah, I know, it's pretty lame, but $, is already taken for something more |
843 | important.) |
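As a small sketch of the emulation described above (the array name %dist and its keys are made up for illustration, and $; is assumed to be at its default), the two subscript forms address the same element:

```perl
# Hypothetical data; %dist and the city codes are illustrative only.
$dist{'ny', 'la'} = 2445;            # multi-dimensional form

$key = join($;, 'ny', 'la');         # the key perl really stores
print "same element\n" if $dist{$key} == 2445;

$sep = $;;                           # copy the separator for use in a pattern
($from, $to) = split(/$sep/, $key);  # recover the original subscripts
print "$from -> $to\n";
```

Splitting on the separator only works if no subscript contains it, which is the binary-data caveat noted above.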
844 | .Ip $! 8 2 |
845 | If used in a numeric context, yields the current value of errno, with all the |
846 | usual caveats. |
ffed7fef |
847 | (This means that you shouldn't depend on the value of $! to be anything |
848 | in particular unless you've gotten a specific error return indicating a |
849 | system error.) |
a687059c |
850 | If used in a string context, yields the corresponding system error string. |
851 | You can assign to $! in order to set errno |
852 | if, for instance, you want $! to return the string for error n, or you want |
853 | to set the exit value for the die operator. |
854 | (Mnemonic: What just went bang?) |
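To illustrate the two contexts, here is a minimal sketch that assigns to $! and then reads it both ways. The value 2 is an assumption for the example (it is ENOENT on most Unix systems); the names $num and $str are hypothetical.

```perl
$! = 2;                 # set errno by hand (2 is assumed to be ENOENT)
$num = $! + 0;          # numeric context: the errno value
$str = "$!";            # string context: the system error string
print "errno $num means: $str\n";
```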
855 | .Ip $@ 8 2 |
ffed7fef |
856 | The perl syntax error message from the last eval command. |
857 | If null, the last eval parsed and executed correctly (although the operations |
858 | you invoked may have failed in the normal fashion). |
a687059c |
859 | (Mnemonic: Where was the syntax error \*(L"at\*(R"?) |
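A short sketch of checking $@ after an eval follows. The strings being evaluated and the names $err and $sum are made up for illustration.

```perl
eval '1 +;';                    # deliberately malformed
$err = $@;                      # save the message before the next eval
print "eval failed: $err" if $err ne '';

eval '$sum = 1 + 2;';           # parses and executes correctly
print "eval ok, sum is $sum\n" if $@ eq '';
```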
860 | .Ip $< 8 2 |
861 | The real uid of this process. |
862 | (Mnemonic: it's the uid you came FROM, if you're running setuid.) |
863 | .Ip $> 8 2 |
864 | The effective uid of this process. |
865 | Example: |
866 | .nf |
867 | |
868 | .ne 2 |
869 | $< = $>; # set real uid to the effective uid |
870 | ($<,$>) = ($>,$<); # swap real and effective uid |
871 | |
872 | .fi |
873 | (Mnemonic: it's the uid you went TO, if you're running setuid.) |
874 | Note: $< and $> can only be swapped on machines supporting setreuid(). |
875 | .Ip $( 8 2 |
876 | The real gid of this process. |
877 | If you are on a machine that supports membership in multiple groups |
878 | simultaneously, gives a space separated list of groups you are in. |
879 | The first number is the one returned by getgid(), and the subsequent ones |
880 | by getgroups(), one of which may be the same as the first number. |
881 | (Mnemonic: parentheses are used to GROUP things. |
882 | The real gid is the group you LEFT, if you're running setgid.) |
883 | .Ip $) 8 2 |
884 | The effective gid of this process. |
885 | If you are on a machine that supports membership in multiple groups |
886 | simultaneously, gives a space separated list of groups you are in. |
887 | The first number is the one returned by getegid(), and the subsequent ones |
888 | by getgroups(), one of which may be the same as the first number. |
889 | (Mnemonic: parentheses are used to GROUP things. |
890 | The effective gid is the group that's RIGHT for you, if you're running setgid.) |
891 | .Sp |
892 | Note: $<, $>, $( and $) can only be set on machines that support the |
893 | corresponding set[re][ug]id() routine. |
894 | $( and $) can only be swapped on machines supporting setregid(). |
895 | .Ip $: 8 2 |
896 | The current set of characters after which a string may be broken to |
897 | fill continuation fields (starting with ^) in a format. |
898 | Default is "\ \en-", to break on whitespace or hyphens. |
899 | (Mnemonic: a \*(L"colon\*(R" in poetry is a part of a line.) |
33b78306 |
900 | .Ip $ARGV 8 3 |
901 | Contains the name of the current file when reading from <>. |
a687059c |
902 | .Ip @ARGV 8 3 |
903 | The array ARGV contains the command line arguments intended for the script. |
904 | Note that $#ARGV is generally the number of arguments minus one, since |
905 | $ARGV[0] is the first argument, NOT the command name. |
906 | See $0 for the command name. |
907 | .Ip @INC 8 3 |
908 | The array INC contains the list of places to look for |
909 | .I perl |
910 | scripts to be |
e5d73d77 |
911 | evaluated by the \*(L"do EXPR\*(R" command or the \*(L"require\*(R" command. |
a687059c |
912 | It initially consists of the arguments to any |
913 | .B \-I |
914 | command line switches, followed |
915 | by the default |
916 | .I perl |
33b78306 |
917 | library, probably \*(L"/usr/local/lib/perl\*(R", |
918 | followed by \*(L".\*(R", to represent the current directory. |
919 | .Ip %INC 8 3 |
920 | The associative array INC contains entries for each filename that has |
921 | been included via \*(L"do\*(R" or \*(L"require\*(R". |
922 | The key is the filename you specified, and the value is the location of |
923 | the file actually found. |
924 | The \*(L"require\*(R" command uses this array to determine whether |
925 | a given file has already been included. |
a687059c |
926 | .Ip $ENV{expr} 8 2 |
927 | The associative array ENV contains your current environment. |
928 | Setting a value in ENV changes the environment for child processes. |
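As a rough sketch of the effect on child processes (GREETING is a hypothetical variable name, and a Unix shell with an echo command is assumed):

```perl
# GREETING is an illustrative name, not something perl itself uses.
$ENV{'GREETING'} = 'hello';
$out = `echo \$GREETING`;       # the subshell inherits the new value
print $out;
```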
929 | .Ip $SIG{expr} 8 2 |
930 | The associative array SIG is used to set signal handlers for various signals. |
931 | Example: |
932 | .nf |
933 | |
934 | .ne 12 |
935 | sub handler { # 1st argument is signal name |
936 | local($sig) = @_; |
937 | print "Caught a SIG$sig\-\|\-shutting down\en"; |
938 | close(LOG); |
939 | exit(0); |
940 | } |
941 | |
942 | $SIG{\'INT\'} = \'handler\'; |
943 | $SIG{\'QUIT\'} = \'handler\'; |
944 | .\|.\|. |
945 | $SIG{\'INT\'} = \'DEFAULT\'; # restore default action |
946 | $SIG{\'QUIT\'} = \'IGNORE\'; # ignore SIGQUIT |
947 | |
948 | .fi |
949 | The SIG array only contains values for the signals actually set within |
950 | the perl script. |
951 | .Sh "Packages" |
952 | Perl provides a mechanism for alternate namespaces to protect packages from |
953 | stomping on each other's variables. |
954 | By default, a perl script starts compiling into the package known as \*(L"main\*(R". |
955 | By use of the |
956 | .I package |
957 | declaration, you can switch namespaces. |
958 | The scope of the package declaration is from the declaration itself to the end |
959 | of the enclosing block (the same scope as the local() operator). |
960 | Typically it would be the first declaration in a file to be included by |
33b78306 |
961 | the \*(L"require\*(R" operator. |
a687059c |
962 | You can switch into a package in more than one place; it merely influences |
963 | which symbol table is used by the compiler for the rest of that block. |
663a0e37 |
964 | You can refer to variables and filehandles in other packages by prefixing |
965 | the identifier with the package name and a single quote. |
a687059c |
966 | If the package name is null, the \*(L"main\*(R" package is assumed. |
663a0e37 |
967 | .PP |
968 | Only identifiers starting with letters are stored in the package's symbol |
969 | table. |
970 | All other symbols are kept in package \*(L"main\*(R". |
971 | In addition, the identifiers STDIN, STDOUT, STDERR, ARGV, ARGVOUT, ENV, INC |
972 | and SIG are forced to be in package \*(L"main\*(R", even when used for |
973 | other purposes than their built-in one. |
974 | Note also that, if you have a package called \*(L"m\*(R", \*(L"s\*(R" |
975 | or \*(L"y\*(R", then you can't use the qualified form of an identifier since it |
976 | will be interpreted instead as a pattern match, a substitution |
977 | or a translation. |
978 | .PP |
a687059c |
979 | Eval'ed strings are compiled in the package in which the eval was |
980 | compiled. |
981 | (Assignments to $SIG{}, however, assume the signal handler specified is in the |
982 | main package. |
983 | Qualify the signal handler name if you wish to have a signal handler in |
984 | a package.) |
985 | For an example, examine perldb.pl in the perl library. |
986 | It initially switches to the DB package so that the debugger doesn't interfere |
987 | with variables in the script you are trying to debug. |
988 | At various points, however, it temporarily switches back to the main package |
989 | to evaluate various expressions in the context of the main package. |
990 | .PP |
991 | The symbol table for a package happens to be stored in the associative array |
992 | of that name prepended with an underscore. |
993 | The value in each entry of the associative array is |
994 | what you are referring to when you use the *name notation. |
995 | In fact, the following have the same effect (in package main, anyway), |
996 | though the first is more |
997 | efficient because it does the symbol table lookups at compile time: |
998 | .nf |
999 | |
1000 | .ne 2 |
1001 | local(*foo) = *bar; |
1002 | local($_main{'foo'}) = $_main{'bar'}; |
1003 | |
1004 | .fi |
1005 | You can use this to print out all the variables in a package, for instance. |
1006 | Here is dumpvar.pl from the perl library: |
1007 | .nf |
1008 | .ne 11 |
1009 | package dumpvar; |
1010 | |
1011 | sub main'dumpvar { |
1012 | \& ($package) = @_; |
1013 | \& local(*stab) = eval("*_$package"); |
1014 | \& while (($key,$val) = each(%stab)) { |
1015 | \& { |
1016 | \& local(*entry) = $val; |
1017 | \& if (defined $entry) { |
1018 | \& print "\e$$key = '$entry'\en"; |
1019 | \& } |
1020 | .ne 7 |
1021 | \& if (defined @entry) { |
1022 | \& print "\e@$key = (\en"; |
1023 | \& foreach $num ($[ .. $#entry) { |
1024 | \& print " $num\et'",$entry[$num],"'\en"; |
1025 | \& } |
1026 | \& print ")\en"; |
1027 | \& } |
1028 | .ne 10 |
1029 | \& if ($key ne "_$package" && defined %entry) { |
1030 | \& print "\e%$key = (\en"; |
1031 | \& foreach $key (sort keys(%entry)) { |
1032 | \& print " $key\et'",$entry{$key},"'\en"; |
1033 | \& } |
1034 | \& print ")\en"; |
1035 | \& } |
1036 | \& } |
1037 | \& } |
1038 | } |
1039 | |
1040 | .fi |
1041 | Note that, even though the subroutine is compiled in package dumpvar, the |
663a0e37 |
1042 | name of the subroutine is qualified so that its name is inserted into package |
a687059c |
1043 | \*(L"main\*(R". |
1044 | .Sh "Style" |
1045 | Each programmer will, of course, have his or her own preferences in regard |
1046 | to formatting, but there are some general guidelines that will make your |
1047 | programs easier to read. |
1048 | .Ip 1. 4 4 |
1049 | Just because you CAN do something a particular way doesn't mean that |
1050 | you SHOULD do it that way. |
1051 | .I Perl |
1052 | is designed to give you several ways to do anything, so consider picking |
1053 | the most readable one. |
1054 | For instance |
1055 | |
1056 | open(FOO,$foo) || die "Can't open $foo: $!"; |
1057 | |
1058 | is better than |
1059 | |
1060 | die "Can't open $foo: $!" unless open(FOO,$foo); |
1061 | |
1062 | because the second way hides the main point of the statement in a |
1063 | modifier. |
1064 | On the other hand |
1065 | |
1066 | print "Starting analysis\en" if $verbose; |
1067 | |
1068 | is better than |
1069 | |
1070 | $verbose && print "Starting analysis\en"; |
1071 | |
1072 | since the main point isn't whether the user typed -v or not. |
1073 | .Sp |
1074 | Similarly, just because an operator lets you assume default arguments |
1075 | doesn't mean that you have to make use of the defaults. |
1076 | The defaults are there for lazy systems programmers writing one-shot |
1077 | programs. |
1078 | If you want your program to be readable, consider supplying the argument. |
03a14243 |
1079 | .Sp |
1080 | Along the same lines, just because you |
1081 | .I can |
1082 | omit parentheses in many places doesn't mean that you ought to: |
1083 | .nf |
1084 | |
1085 | return print reverse sort num values array; |
1086 | return print(reverse(sort num (values(%array)))); |
1087 | |
1088 | .fi |
1089 | When in doubt, parenthesize. |
1090 | At the very least it will let some poor schmuck bounce on the % key in vi. |
a687059c |
1091 | .Ip 2. 4 4 |
1092 | Don't go through silly contortions to exit a loop at the top or the |
1093 | bottom, when |
1094 | .I perl |
1095 | provides the "last" operator so you can exit in the middle. |
1096 | Just outdent it a little to make it more visible: |
1097 | .nf |
1098 | |
1099 | .ne 7 |
1100 | line: |
1101 | for (;;) { |
1102 | statements; |
1103 | last line if $foo; |
1104 | next line if /^#/; |
1105 | statements; |
1106 | } |
1107 | |
1108 | .fi |
1109 | .Ip 3. 4 4 |
1110 | Don't be afraid to use loop labels\*(--they're there to enhance readability as |
1111 | well as to allow multi-level loop breaks. |
1112 | See last example. |
ffed7fef |
1113 | .Ip 4. 4 4 |
a687059c |
1114 | For portability, when using features that may not be implemented on every |
1115 | machine, test the construct in an eval to see if it fails. |
03a14243 |
1116 | If you know in what version or patchlevel a particular feature was implemented, |
1117 | you can test $] to see if it will be there. |
a687059c |
1118 | .Ip 5. 4 4 |
ffed7fef |
1119 | Choose mnemonic identifiers. |
1120 | .Ip 6. 4 4 |
a687059c |
1121 | Be consistent. |
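The portability advice in item 4 can be sketched as follows. Here symlink() stands in for any feature that may be missing on some machines, and $has_symlink is a hypothetical name.

```perl
# If symlink() is unimplemented the eval fails and returns undef;
# otherwise the trailing 1 makes it return true, even though the
# symlink call itself does nothing useful with empty arguments.
$has_symlink = eval 'symlink("", ""); 1';

if ($has_symlink) {
    print "symlink is available\n";
} else {
    print "no symlink here: $@";
}
```

The call is only compiled and attempted inside the eval, so a missing feature shows up in $@ instead of killing the script.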
1122 | .Sh "Debugging" |
1123 | If you invoke |
1124 | .I perl |
1125 | with a |
1126 | .B \-d |
1127 | switch, your script will be run under a debugging monitor. |
1128 | It will halt before the first executable statement and ask you for a |
1129 | command, such as: |
1130 | .Ip "h" 12 4 |
1131 | Prints out a help message. |
33b78306 |
1132 | .Ip "T" 12 4 |
1133 | Stack trace. |
a687059c |
1134 | .Ip "s" 12 4 |
1135 | Single step. |
1136 | Executes until it reaches the beginning of another statement. |
33b78306 |
1137 | .Ip "n" 12 4 |
1138 | Next. |
1139 | Executes over subroutine calls, until it reaches the beginning of the |
1140 | next statement. |
1141 | .Ip "f" 12 4 |
1142 | Finish. |
1143 | Executes statements until it has finished the current subroutine. |
a687059c |
1144 | .Ip "c" 12 4 |
1145 | Continue. |
1146 | Executes until the next breakpoint is reached. |
33b78306 |
1147 | .Ip "c line" 12 4 |
1148 | Continue to the specified line. |
1149 | Inserts a one-time-only breakpoint at the specified line. |
a687059c |
1150 | .Ip "<CR>" 12 4 |
33b78306 |
1151 | Repeat last n or s. |
a687059c |
1152 | .Ip "l min+incr" 12 4 |
1153 | List incr+1 lines starting at min. |
1154 | If min is omitted, starts where last listing left off. |
1155 | If incr is omitted, previous value of incr is used. |
1156 | .Ip "l min-max" 12 4 |
1157 | List lines in the indicated range. |
1158 | .Ip "l line" 12 4 |
1159 | List just the indicated line. |
1160 | .Ip "l" 12 4 |
33b78306 |
1161 | List next window. |
1162 | .Ip "-" 12 4 |
1163 | List previous window. |
1164 | .Ip "w line" 12 4 |
1165 | List window around line. |
a687059c |
1166 | .Ip "l subname" 12 4 |
1167 | List subroutine. |
1168 | If it's a long subroutine it just lists the beginning. |
1169 | Use \*(L"l\*(R" to list more. |
33b78306 |
1170 | .Ip "/pattern/" 12 4 |
1171 | Regular expression search forward for pattern; the final / is optional. |
1172 | .Ip "?pattern?" 12 4 |
1173 | Regular expression search backward for pattern; the final ? is optional. |
a687059c |
1174 | .Ip "L" 12 4 |
1175 | List lines that have breakpoints or actions. |
33b78306 |
1176 | .Ip "S" 12 4 |
1177 | Lists the names of all subroutines. |
a687059c |
1178 | .Ip "t" 12 4 |
1179 | Toggle trace mode on or off. |
33b78306 |
1180 | .Ip "b line condition" 12 4 |
a687059c |
1181 | Set a breakpoint. |
33b78306 |
1182 | If line is omitted, sets a breakpoint on the |
a687059c |
1183 | line that is about to be executed. |
33b78306 |
1184 | If a condition is specified, it is evaluated each time the statement is |
1185 | reached and a breakpoint is taken only if the condition is true. |
a687059c |
1186 | Breakpoints may only be set on lines that begin an executable statement. |
33b78306 |
1187 | .Ip "b subname condition" 12 4 |
a687059c |
1188 | Set breakpoint at first executable line of subroutine. |
a687059c |
1189 | .Ip "d line" 12 4 |
1190 | Delete breakpoint. |
33b78306 |
1191 | If line is omitted, deletes the breakpoint on the |
a687059c |
1192 | line that is about to be executed. |
1193 | .Ip "D" 12 4 |
1194 | Delete all breakpoints. |
a687059c |
1195 | .Ip "a line command" 12 4 |
1196 | Set an action for line. |
1197 | A multi-line command may be entered by backslashing the newlines. |
33b78306 |
1198 | .Ip "A" 12 4 |
1199 | Delete all line actions. |
a687059c |
1200 | .Ip "< command" 12 4 |
1201 | Set an action to happen before every debugger prompt. |
1202 | A multi-line command may be entered by backslashing the newlines. |
1203 | .Ip "> command" 12 4 |
1204 | Set an action to happen after the prompt when you've just given a command |
1205 | to return to executing the script. |
1206 | A multi-line command may be entered by backslashing the newlines. |
33b78306 |
1207 | .Ip "V package" 12 4 |
1208 | List all variables in package. |
1209 | Default is main package. |
a687059c |
1210 | .Ip "! number" 12 4 |
1211 | Redo a debugging command. |
1212 | If number is omitted, redoes the previous command. |
1213 | .Ip "! -number" 12 4 |
1214 | Redo the command that was that many commands ago. |
1215 | .Ip "H -number" 12 4 |
1216 | Display last n commands. |
1217 | Only commands longer than one character are listed. |
1218 | If number is omitted, lists them all. |
1219 | .Ip "q or ^D" 12 4 |
1220 | Quit. |
1221 | .Ip "command" 12 4 |
1222 | Execute command as a perl statement. |
1223 | A missing semicolon will be supplied. |
1224 | .Ip "p expr" 12 4 |
1225 | Same as \*(L"print DB'OUT expr\*(R". |
1226 | The DB'OUT filehandle is opened to /dev/tty, regardless of where STDOUT |
1227 | may be redirected to. |
1228 | .PP |
1229 | If you want to modify the debugger, copy perldb.pl from the perl library |
1230 | to your current directory and modify it as necessary. |
76854fea |
1231 | (You'll also have to put -I. on your command line.) |
a687059c |
1232 | You can do some customization by setting up a .perldb file which contains |
1233 | initialization code. |
1234 | For instance, you could make aliases like these: |
1235 | .nf |
1236 | |
ac58e20f |
1237 | $DB'alias{'len'} = 's/^len(.*)/p length($1)/'; |
1238 | $DB'alias{'stop'} = 's/^stop (at|in)/b/'; |
1239 | $DB'alias{'.'} = |
1240 | 's/^\e./p "\e$DB\e'sub(\e$DB\e'line):\et",\e$DB\e'line[\e$DB\e'line]/'; |
a687059c |
1241 | |
1242 | .fi |
1243 | .Sh "Setuid Scripts" |
1244 | .I Perl |
1245 | is designed to make it easy to write secure setuid and setgid scripts. |
1246 | Unlike shells, which are based on multiple substitution passes on each line |
1247 | of the script, |
1248 | .I perl |
1249 | uses a more conventional evaluation scheme with fewer hidden \*(L"gotchas\*(R". |
1250 | Additionally, since the language has more built-in functionality, it |
1251 | has to rely less upon external (and possibly untrustworthy) programs to |
1252 | accomplish its purposes. |
1253 | .PP |
1254 | In an unpatched 4.2 or 4.3bsd kernel, setuid scripts are intrinsically |
1255 | insecure, but this kernel feature can be disabled. |
1256 | If it is, |
1257 | .I perl |
1258 | can emulate the setuid and setgid mechanism when it notices the otherwise |
1259 | useless setuid/gid bits on perl scripts. |
1260 | If the kernel feature isn't disabled, |
1261 | .I perl |
1262 | will complain loudly that your setuid script is insecure. |
1263 | You'll need to either disable the kernel setuid script feature, or put |
1264 | a C wrapper around the script. |
1265 | .PP |
1266 | When perl is executing a setuid script, it takes special precautions to |
1267 | prevent you from falling into any obvious traps. |
1268 | (In some ways, a perl script is more secure than the corresponding |
1269 | C program.) |
1270 | Any command line argument, environment variable, or input is marked as |
1271 | \*(L"tainted\*(R", and may not be used, directly or indirectly, in any |
1272 | command that invokes a subshell, or in any command that modifies files, |
1273 | directories or processes. |
1274 | Any variable that is set within an expression that has previously referenced |
1275 | a tainted value also becomes tainted (even if it is logically impossible |
1276 | for the tainted value to influence the variable). |
1277 | For example: |
1278 | .nf |
1279 | |
1280 | .ne 5 |
1281 | $foo = shift; # $foo is tainted |
1282 | $bar = $foo,\'bar\'; # $bar is also tainted |
1283 | $xxx = <>; # Tainted |
1284 | $path = $ENV{\'PATH\'}; # Tainted, but see below |
1285 | $abc = \'abc\'; # Not tainted |
1286 | |
1287 | .ne 4 |
1288 | system "echo $foo"; # Insecure |
79a0689e |
1289 | system "/bin/echo", $foo; # Secure (doesn't use sh) |
a687059c |
1290 | system "echo $bar"; # Insecure |
1291 | system "echo $abc"; # Insecure until PATH set |
1292 | |
1293 | .ne 5 |
1294 | $ENV{\'PATH\'} = \'/bin:/usr/bin\'; |
1295 | $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\'; |
1296 | |
1297 | $path = $ENV{\'PATH\'}; # Not tainted |
1298 | system "echo $abc"; # Is secure now! |
1299 | |
1300 | .ne 5 |
1301 | open(FOO,"$foo"); # OK |
1302 | open(FOO,">$foo"); # Not OK |
1303 | |
1304 | open(FOO,"echo $foo|"); # Not OK, but... |
1305 | open(FOO,"-|") || exec \'echo\', $foo; # OK |
1306 | |
1307 | $zzz = `echo $foo`; # Insecure, zzz tainted |
1308 | |
1309 | unlink $abc,$foo; # Insecure |
1310 | umask $foo; # Insecure |
1311 | |
1312 | .ne 3 |
1313 | exec "echo $foo"; # Insecure |
1314 | exec "echo", $foo; # Secure (doesn't use sh) |
1315 | exec "sh", \'-c\', $foo; # Considered secure, alas |
1316 | |
1317 | .fi |
1318 | The taintedness is associated with each scalar value, so some elements |
1319 | of an array can be tainted, and others not. |
1320 | .PP |
1321 | If you try to do something insecure, you will get a fatal error saying |
1322 | something like \*(L"Insecure dependency\*(R" or \*(L"Insecure PATH\*(R". |
1323 | Note that you can still write an insecure system call or exec, |
ae986130 |
1324 | but only by explicitly doing something like the last example above. |
a687059c |
1325 | You can also bypass the tainting mechanism by referencing |
1326 | subpatterns\*(--\c |
1327 | .I perl |
1328 | presumes that if you reference a substring using $1, $2, etc., you knew |
1329 | what you were doing when you wrote the pattern: |
1330 | .nf |
1331 | |
1332 | $ARGV[0] =~ /^\-P(\ew+)$/; |
1333 | $printer = $1; # Not tainted |
1334 | |
1335 | .fi |
1336 | This is fairly secure since \ew+ doesn't match shell metacharacters. |
1337 | Use of .+ would have been insecure, but |
1338 | .I perl |
1339 | doesn't check for that, so you must be careful with your patterns. |
1340 | This is the ONLY mechanism for untainting user-supplied filenames if you |
1341 | want to do file operations on them (unless you make $> equal to $<). |
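A slightly more defensive sketch of the same idea checks that the match succeeded before trusting $1. The variable $arg stands in for $ARGV[0], and its value is made up for illustration. Run outside a setuid script, this shows only the pattern check, not the taint bookkeeping itself.

```perl
$arg = '-Plaser1';                  # hypothetical command line argument

if ($arg =~ /^-P(\w+)$/) {
    $printer = $1;                  # only word characters can get through
    print "printer is $printer\n";
} else {
    die "Bad printer argument\n";   # refuse anything the pattern rejects
}
```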
1342 | .PP |
1343 | It's also possible to get into trouble with other operations that don't care |
1344 | whether they use tainted values. |
1345 | Make judicious use of the file tests in dealing with any user-supplied |
1346 | filenames. |
1347 | When possible, do opens and such after setting $> = $<. |
1348 | .I Perl |
1349 | doesn't prevent you from opening tainted filenames for reading, so be |
1350 | careful what you print out. |
1351 | The tainting mechanism is intended to prevent stupid mistakes, not to remove |
1352 | the need for thought. |
1353 | .SH ENVIRONMENT |
1354 | .I Perl |
1355 | uses PATH in executing subprocesses, and in finding the script if \-S |
1356 | is used. |
1357 | HOME or LOGDIR are used if chdir has no argument. |
1358 | .PP |
1359 | Apart from these, |
1360 | .I perl |
1361 | uses no environment variables, except to make them available |
1362 | to the script being executed, and to child processes. |
1363 | However, scripts running setuid would do well to execute the following lines |
1364 | before doing anything else, just to keep people honest: |
1365 | .nf |
1366 | |
1367 | .ne 3 |
1368 | $ENV{\'PATH\'} = \'/bin:/usr/bin\'; # or whatever you need |
1369 | $ENV{\'SHELL\'} = \'/bin/sh\' if $ENV{\'SHELL\'} ne \'\'; |
1370 | $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\'; |
1371 | |
1372 | .fi |
1373 | .SH AUTHOR |
1374 | Larry Wall <lwall@jpl-devvax.Jpl.Nasa.Gov> |
0f85fab0 |
1375 | .br |
1376 | MS-DOS port by Diomidis Spinellis <dds@cc.ic.ac.uk> |
a687059c |
1377 | .SH FILES |
1378 | /tmp/perl\-eXXXXXX temporary file for |
1379 | .B \-e |
1380 | commands. |
1381 | .SH SEE ALSO |
1382 | a2p awk to perl translator |
1383 | .br |
1384 | s2p sed to perl translator |
1385 | .SH DIAGNOSTICS |
1386 | Compilation errors will tell you the line number of the error, with an |
1387 | indication of the next token or token type that was to be examined. |
1388 | (In the case of a script passed to |
1389 | .I perl |
1390 | via |
1391 | .B \-e |
1392 | switches, each |
1393 | .B \-e |
1394 | is counted as one line.) |
1395 | .PP |
1396 | Setuid scripts have additional constraints that can produce error messages |
1397 | such as \*(L"Insecure dependency\*(R". |
1398 | See the section on setuid scripts. |
1399 | .SH TRAPS |
1400 | Accustomed |
1401 | .IR awk |
1402 | users should take special note of the following: |
1403 | .Ip * 4 2 |
1404 | Semicolons are required after all simple statements in |
1405 | .IR perl . |
1406 | Newline |
1407 | is not a statement delimiter. |
1408 | .Ip * 4 2 |
1409 | Curly brackets are required on ifs and whiles. |
1410 | .Ip * 4 2 |
1411 | Variables begin with $ or @ in |
1412 | .IR perl . |
1413 | .Ip * 4 2 |
1414 | Arrays index from 0 unless you set $[. |
1415 | Likewise string positions in substr() and index(). |
1416 | .Ip * 4 2 |
1417 | You have to decide whether your array has numeric or string indices. |
1418 | .Ip * 4 2 |
1419 | Associative array values do not spring into existence upon mere reference. |
1420 | .Ip * 4 2 |
1421 | You have to decide whether you want to use string or numeric comparisons. |
1422 | .Ip * 4 2 |
1423 | Reading an input line does not split it for you. You get to split it yourself |
1424 | to an array. |
1425 | And the |
1426 | .I split |
1427 | operator has different arguments. |
1428 | .Ip * 4 2 |
1429 | The current input line is normally in $_, not $0. |
1430 | It generally does not have the newline stripped. |
ac58e20f |
1431 | ($0 is the name of the program executed.) |
a687059c |
1432 | .Ip * 4 2 |
1433 | $<digit> does not refer to fields\*(--it refers to substrings matched by the last |
1434 | match pattern. |
1435 | .Ip * 4 2 |
1436 | The |
1437 | .I print |
1438 | statement does not add field and record separators unless you set |
1439 | $, and $\e. |
1440 | .Ip * 4 2 |
1441 | You must open your files before you print to them. |
1442 | .Ip * 4 2 |
1443 | The range operator is \*(L".\|.\*(R", not comma. |
1444 | (The comma operator works as in C.) |
1445 | .Ip * 4 2 |
1446 | The match operator is \*(L"=~\*(R", not \*(L"~\*(R". |
1447 | (\*(L"~\*(R" is the one's complement operator, as in C.) |
1448 | .Ip * 4 2 |
1449 | The exponentiation operator is \*(L"**\*(R", not \*(L"^\*(R". |
1450 | (\*(L"^\*(R" is the XOR operator, as in C.) |
1451 | .Ip * 4 2 |
1452 | The concatenation operator is \*(L".\*(R", not the null string. |
1453 | (Using the null string would render \*(L"/pat/ /pat/\*(R" unparsable, |
1454 | since the third slash would be interpreted as a division operator\*(--the |
1455 | tokener is in fact slightly context sensitive for operators like /, ?, and <. |
1456 | And in fact, . itself can be the beginning of a number.) |
1457 | .Ip * 4 2 |
1458 | .IR Next , |
1459 | .I exit |
1460 | and |
1461 | .I continue |
1462 | work differently. |
1463 | .Ip * 4 2 |
1464 | The following variables work differently: |
1465 | .nf |
1466 | |
1467 | Awk \h'|2.5i'Perl |
1468 | ARGC \h'|2.5i'$#ARGV |
1469 | ARGV[0] \h'|2.5i'$0 |
1470 | FILENAME\h'|2.5i'$ARGV |
1471 | FNR \h'|2.5i'$. \- something |
1472 | FS \h'|2.5i'(whatever you like) |
1473 | NF \h'|2.5i'$#Fld, or some such |
1474 | NR \h'|2.5i'$. |
1475 | OFMT \h'|2.5i'$# |
1476 | OFS \h'|2.5i'$, |
1477 | ORS \h'|2.5i'$\e |
1478 | RLENGTH \h'|2.5i'length($&) |
ac58e20f |
1479 | RS \h'|2.5i'$/ |
a687059c |
1480 | RSTART \h'|2.5i'length($\`) |
1481 | SUBSEP \h'|2.5i'$; |
1482 | |
1483 | .fi |
1484 | .Ip * 4 2 |
1485 | When in doubt, run the |
1486 | .I awk |
1487 | construct through a2p and see what it gives you. |
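Several of the awk traps above can be seen in one small sketch. The input line is made up and stands in for a line read from a filehandle, @Fld follows the naming in the table above, and $[ is assumed to be at its default of 0.

```perl
$_ = "alpha beta gamma\n";      # stands in for a line of input
chop;                           # the newline is not stripped for you
@Fld = split(' ');              # you split the line yourself
$nf = $#Fld + 1;                # roughly awk's NF when arrays index from 0
print "$nf fields, last is $Fld[$#Fld]\n";
```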
1488 | .PP |
1489 | Cerebral C programmers should take note of the following: |
1490 | .Ip * 4 2 |
1491 | Curly brackets are required on ifs and whiles. |
1492 | .Ip * 4 2 |
1493 | You should use \*(L"elsif\*(R" rather than \*(L"else if\*(R". |
1494 | .Ip * 4 2 |
1495 | .I Break |
1496 | and |
1497 | .I continue |
1498 | become |
1499 | .I last |
1500 | and |
1501 | .IR next , |
1502 | respectively. |
1503 | .Ip * 4 2 |
1504 | There's no switch statement. |
1505 | .Ip * 4 2 |
1506 | Variables begin with $ or @ in |
1507 | .IR perl . |
1508 | .Ip * 4 2 |
1509 | Printf does not implement *. |
1510 | .Ip * 4 2 |
1511 | Comments begin with #, not /*. |
1512 | .Ip * 4 2 |
1513 | You can't take the address of anything. |
1514 | .Ip * 4 2 |
1515 | ARGV must be capitalized. |
1516 | .Ip * 4 2 |
1517 | The \*(L"system\*(R" calls link, unlink, rename, etc. return nonzero for success, not 0. |
1518 | .Ip * 4 2 |
1519 | Signal handlers deal with signal names, not numbers. |
a687059c |
1520 | .PP |
1521 | Seasoned |
1522 | .I sed |
1523 | programmers should take note of the following: |
1524 | .Ip * 4 2 |
1525 | Backreferences in substitutions use $ rather than \e. |
1526 | .Ip * 4 2 |
1527 | The pattern matching metacharacters (, ), and | do not have backslashes in front. |
1528 | .Ip * 4 2 |
1529 | The range operator is .\|. rather than comma. |
1530 | .PP |
1531 | Sharp shell programmers should take note of the following: |
1532 | .Ip * 4 2 |
1533 | The backtick operator does variable interpretation without regard to the |
1534 | presence of single quotes in the command. |
1535 | .Ip * 4 2 |
1536 | The backtick operator does no translation of the return value, unlike csh. |
1537 | .Ip * 4 2 |
1538 | Shells (especially csh) do several levels of substitution on each command line. |
1539 | .I Perl |
1540 | does substitution only in certain constructs such as double quotes, |
1541 | backticks, angle brackets and search patterns. |
1542 | .Ip * 4 2 |
1543 | Shells interpret scripts a little bit at a time. |
1544 | .I Perl |
1545 | compiles the whole program before executing it. |
1546 | .Ip * 4 2 |
1547 | The arguments are available via @ARGV, not $1, $2, etc. |
1548 | .Ip * 4 2 |
1549 | The environment is not automatically made available as variables. |
1550 | .SH BUGS |
1551 | .PP |
1552 | .I Perl |
1553 | is at the mercy of your machine's definitions of various operations |
1554 | such as type casting, atof() and sprintf(). |
1555 | .PP |
1556 | If your stdio requires a seek or eof between reads and writes on a particular |
1557 | stream, so does |
1558 | .IR perl . |
1559 | .PP |
1560 | While none of the built-in data types have any arbitrary size limits (apart |
1561 | from memory size), there are still a few arbitrary limits: |
1562 | a given identifier may not be longer than 255 characters; |
1563 | sprintf is limited on many machines to 128 characters per field (unless the format |
1564 | specifier is exactly %s); |
1565 | and no component of your PATH may be longer than 255 if you use \-S. |
1566 | .PP |
1567 | .I Perl |
1568 | actually stands for Pathologically Eclectic Rubbish Lister, but don't tell |
1569 | anyone I said that. |
1570 | .rn }` '' |