perl 3.0 patch #26 patch #19, continued
a687059c 1''' Beginning of part 4
33b78306 2''' $Header: perl_man.4,v 3.0.1.10 90/08/09 04:47:35 lwall Locked $
a687059c 3'''
4''' $Log: perl.man.4,v $
33b78306 5''' Revision 3.0.1.10 90/08/09 04:47:35 lwall
6''' patch19: added require operator
7''' patch19: added numeric interpretation of $]
8'''
9''' Revision 3.0.1.9 90/08/03 11:15:58 lwall
10''' patch19: Intermediate diffs for Randal
11'''
0f85fab0 12''' Revision 3.0.1.8 90/03/27 16:19:31 lwall
13''' patch16: MSDOS support
14'''
63f2c1e1 15''' Revision 3.0.1.7 90/03/14 12:29:50 lwall
16''' patch15: man page falsely states that you can't subscript array values
17'''
79a0689e 18''' Revision 3.0.1.6 90/03/12 16:54:04 lwall
19''' patch13: improved documentation of *name
20'''
ac58e20f 21''' Revision 3.0.1.5 90/02/28 18:01:52 lwall
22''' patch9: $0 is now always the command name
23'''
663a0e37 24''' Revision 3.0.1.4 89/12/21 20:12:39 lwall
25''' patch7: documented that package'filehandle works as well as $package'variable
26''' patch7: documented which identifiers are always in package main
27'''
ffed7fef 28''' Revision 3.0.1.3 89/11/17 15:32:25 lwall
29''' patch5: fixed some manual typos and indent problems
30''' patch5: clarified difference between $! and $@
31'''
ae986130 32''' Revision 3.0.1.2 89/11/11 04:46:40 lwall
33''' patch2: made some line breaks depend on troff vs. nroff
34''' patch2: clarified operation of ^ and $ when $* is false
35'''
03a14243 36''' Revision 3.0.1.1 89/10/26 23:18:43 lwall
37''' patch1: documented the desirability of unnecessary parentheses
38'''
a687059c 39''' Revision 3.0 89/10/18 15:21:55 lwall
40''' 3.0 baseline
41'''
42.Sh "Precedence"
43.I Perl
44operators have the following associativity and precedence:
45.nf
46
47nonassoc\h'|1i'print printf exec system sort reverse
48\h'1.5i'chmod chown kill unlink utime die return
49left\h'|1i',
50right\h'|1i'= += \-= *= etc.
51right\h'|1i'?:
52nonassoc\h'|1i'.\|.
53left\h'|1i'||
54left\h'|1i'&&
55left\h'|1i'| ^
56left\h'|1i'&
57nonassoc\h'|1i'== != eq ne
58nonassoc\h'|1i'< > <= >= lt gt le ge
59nonassoc\h'|1i'chdir exit eval reset sleep rand umask
60nonassoc\h'|1i'\-r \-w \-x etc.
61left\h'|1i'<< >>
62left\h'|1i'+ \- .
63left\h'|1i'* / % x
64left\h'|1i'=~ !~
65right\h'|1i'! ~ and unary minus
66right\h'|1i'**
67nonassoc\h'|1i'++ \-\|\-
68left\h'|1i'\*(L'(\*(R'
69
70.fi
71As mentioned earlier, if any list operator (print, etc.) or
72any unary operator (chdir, etc.)
73is followed by a left parenthesis as the next token on the same line,
74the operator and arguments within parentheses are taken to
75be of highest precedence, just like a normal function call.
76Examples:
77.nf
78
ffed7fef 79 chdir $foo || die;\h'|3i'# (chdir $foo) || die
80 chdir($foo) || die;\h'|3i'# (chdir $foo) || die
81 chdir ($foo) || die;\h'|3i'# (chdir $foo) || die
82 chdir +($foo) || die;\h'|3i'# (chdir $foo) || die
a687059c 83
84but, because * is higher precedence than ||:
85
ffed7fef 86 chdir $foo * 20;\h'|3i'# chdir ($foo * 20)
87 chdir($foo) * 20;\h'|3i'# (chdir $foo) * 20
88 chdir ($foo) * 20;\h'|3i'# (chdir $foo) * 20
89 chdir +($foo) * 20;\h'|3i'# chdir ($foo * 20)
a687059c 90
ffed7fef 91 rand 10 * 20;\h'|3i'# rand (10 * 20)
92 rand(10) * 20;\h'|3i'# (rand 10) * 20
93 rand (10) * 20;\h'|3i'# (rand 10) * 20
94 rand +(10) * 20;\h'|3i'# rand (10 * 20)
a687059c 95
96.fi
In the absence of parentheses,
the precedence of list operators such as print, sort or chmod is
either very high or very low depending on whether you look at the left
side of the operator or the right side of it.
101For example, in
102.nf
103
104 @ary = (1, 3, sort 4, 2);
105 print @ary; # prints 1324
106
107.fi
108the commas on the right of the sort are evaluated before the sort, but
109the commas on the left are evaluated after.
110In other words, list operators tend to gobble up all the arguments that
111follow them, and then act like a simple term with regard to the preceding
112expression.
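.PP
As a minimal sketch of this gobbling behavior (the array names are made up
for illustration), compare a list operator written without parentheses
with one whose arguments are enclosed in parentheses:
.nf

```perl
    # sort takes everything to its right: it sorts (4, 2), giving 1 3 2 4
    @gobbled = (1, 3, sort 4, 2);

    # parentheses right after the operator limit its arguments to (4),
    # so the trailing 2 stays outside the reverse: 1 3 4 2
    @limited = (1, 3, reverse(4), 2);

    $gobbled = join(' ', @gobbled);
    $limited = join(' ', @limited);
```

.fi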
113Note that you have to be careful with parens:
114.nf
115
116.ne 3
117 # These evaluate exit before doing the print:
118 print($foo, exit); # Obviously not what you want.
119 print $foo, exit; # Nor is this.
120
121.ne 4
122 # These do the print before evaluating exit:
123 (print $foo), exit; # This is what you want.
124 print($foo), exit; # Or this.
125 print ($foo), exit; # Or even this.
126
127Also note that
128
129 print ($foo & 255) + 1, "\en";
130
131.fi
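probably doesn't do what you expect at first glance:
the parenthesized ($foo & 255) is taken as the entire argument list of
print, and the + 1 is then added to the value that print returns.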
133.Sh "Subroutines"
134A subroutine may be declared as follows:
135.nf
136
137 sub NAME BLOCK
138
139.fi
140.PP
141Any arguments passed to the routine come in as array @_,
142that is ($_[0], $_[1], .\|.\|.).
143The array @_ is a local array, but its values are references to the
144actual scalar parameters.
145The return value of the subroutine is the value of the last expression
146evaluated, and can be either an array value or a scalar value.
Alternatively, a return statement may be used to specify the returned value and
exit the subroutine.
149To create local variables see the
150.I local
151operator.
152.PP
153A subroutine is called using the
154.I do
155operator or the & operator.
156.nf
157
158.ne 12
159Example:
160
    sub MAX {
        local($max) = pop(@_);
        foreach $foo (@_) {
            $max = $foo \|if \|$max < $foo;
        }
        $max;
    }
168
169 .\|.\|.
170 $bestday = &MAX($mon,$tue,$wed,$thu,$fri);
171
172.ne 21
173Example:
174
    # get a line, combining continuation lines
    #  that start with whitespace
    sub get_line {
        $thisline = $lookahead;
        line: while ($lookahead = <STDIN>) {
            if ($lookahead \|=~ \|/\|^[ \^\e\|t]\|/\|) {
                $thisline \|.= \|$lookahead;
            }
            else {
                last line;
            }
        }
        $thisline;
    }

    $lookahead = <STDIN>;    # get first line
    while ($_ = do get_line(\|)) {
        .\|.\|.
    }
194
195.fi
196.nf
197.ne 6
198Use array assignment to a local list to name your formal arguments:
199
    sub maybeset {
        local($key, $value) = @_;
        $foo{$key} = $value unless $foo{$key};
    }
204
205.fi
206This also has the effect of turning call-by-reference into call-by-value,
207since the assignment copies the values.
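.PP
A minimal sketch of the difference, assuming two made-up routines named
bump_in_place and bump_copy: the first changes its caller's variable
through $_[0], while the second changes only a local copy.
.nf

```perl
    sub bump_in_place {
        $_[0] = $_[0] + 1;          # $_[0] refers to the caller's scalar
    }

    sub bump_copy {
        local($n) = @_;             # the assignment copies the value
        $n = $n + 1;                # only the local copy changes
        $n;                         # return value: last expression
    }

    $count = 10;
    &bump_in_place($count);         # $count is now 11
    $result = &bump_copy($count);   # $count stays 11, $result is 12
```

.fi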
208.Sp
209Subroutines may be called recursively.
If a subroutine is called using the & form, the argument list is optional.
If omitted, no @_ array is set up for the subroutine; the @_ array at the
time of the call is visible to the subroutine instead.
213.nf
214
215 do foo(1,2,3); # pass three arguments
216 &foo(1,2,3); # the same
217
218 do foo(); # pass a null list
219 &foo(); # the same
220 &foo; # pass no arguments--more efficient
221
222.fi
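.PP
As a sketch of the last form (both routine names are hypothetical), a
routine called with & and no argument list sees the @_ of its caller:
.nf

```perl
    sub count_args {
        $seen = $#_ + 1;            # number of elements in @_
        $seen;
    }

    sub wrapper {
        &count_args;                # no list: count_args sees wrapper's @_
    }

    $shared = &wrapper('x', 'y', 'z');  # 3, the arguments given to wrapper
    $empty  = &count_args();            # 0, an explicit null list
```

.fi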
223.Sh "Passing By Reference"
224Sometimes you don't want to pass the value of an array to a subroutine but
225rather the name of it, so that the subroutine can modify the global copy
226of it rather than working with a local copy.
227In perl you can refer to all the objects of a particular name by prefixing
228the name with a star: *foo.
229When evaluated, it produces a scalar value that represents all the objects
79a0689e 230of that name, including any filehandle, format or subroutine.
a687059c 231When assigned to within a local() operation, it causes the name mentioned
232to refer to whatever * value was assigned to it.
233Example:
234.nf
235
    sub doubleary {
        local(*someary) = @_;
        foreach $elem (@someary) {
            $elem *= 2;
        }
    }
    do doubleary(*foo);
    do doubleary(*bar);
244
245.fi
246Assignment to *name is currently recommended only inside a local().
247You can actually assign to *name anywhere, but the previous referent of
248*name may be stranded forever.
249This may or may not bother you.
250.Sp
251Note that scalars are already passed by reference, so you can modify scalar
ae986130 252arguments without using this mechanism by referring explicitly to the $_[nnn]
a687059c 253in question.
254You can modify all the elements of an array by passing all the elements
255as scalars, but you have to use the * mechanism to push, pop or change the
256size of an array.
257The * mechanism will probably be more efficient in any case.
258.Sp
259Since a *name value contains unprintable binary data, if it is used as
260an argument in a print, or as a %s argument in a printf or sprintf, it
261then has the value '*name', just so it prints out pretty.
79a0689e 262.Sp
263Even if you don't want to modify an array, this mechanism is useful for
264passing multiple arrays in a single LIST, since normally the LIST mechanism
265will merge all the array values so that you can't extract out the
266individual arrays.
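.Sp
A sketch of that last use, assuming a hypothetical routine named
pair_sizes and two global arrays: each *name keeps its array separate,
where passing (@first, @second) would merge them into one list.
.nf

```perl
    sub pair_sizes {
        local(*left, *right) = @_;  # alias each passed *name in turn
        push(@left, 'extra');       # changes the caller's array
        ($#left + 1) . ' and ' . ($#right + 1);
    }

    @first  = (1, 2, 3);
    @second = ('a', 'b');
    $sizes  = &pair_sizes(*first, *second);   # "4 and 2"
    $grown  = $#first + 1;                    # 4: 'extra' was pushed
```

.fi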
a687059c 267.Sh "Regular Expressions"
268The patterns used in pattern matching are regular expressions such as
269those supplied in the Version 8 regexp routines.
270(In fact, the routines are derived from Henry Spencer's freely redistributable
271reimplementation of the V8 routines.)
272In addition, \ew matches an alphanumeric character (including \*(L"_\*(R") and \eW a nonalphanumeric.
273Word boundaries may be matched by \eb, and non-boundaries by \eB.
274A whitespace character is matched by \es, non-whitespace by \eS.
275A numeric character is matched by \ed, non-numeric by \eD.
276You may use \ew, \es and \ed within character classes.
277Also, \en, \er, \ef, \et and \eNNN have their normal interpretations.
278Within character classes \eb represents backspace rather than a word boundary.
279Alternatives may be separated by |.
280The bracketing construct \|(\ .\|.\|.\ \|) may also be used, in which case \e<digit>
281matches the digit'th substring, where digit can range from 1 to 9.
282(Outside of the pattern, always use $ instead of \e in front of the digit.
283The scope of $<digit> (and $\`, $& and $\')
284extends to the end of the enclosing BLOCK or eval string, or to
285the next pattern match with subexpressions.
286The \e<digit> notation sometimes works outside the current pattern, but should
287not be relied upon.)
288$+ returns whatever the last bracket match matched.
289$& returns the entire matched string.
ac58e20f 290($0 used to return the same thing, but not any more.)
a687059c 291$\` returns everything before the matched string.
292$\' returns everything after the matched string.
293Examples:
294.nf
295
296 s/\|^\|([^ \|]*\|) \|*([^ \|]*\|)\|/\|$2 $1\|/; # swap first two words
297
298.ne 5
299 if (/\|Time: \|(.\|.\|):\|(.\|.\|):\|(.\|.\|)\|/\|) {
300 $hours = $1;
301 $minutes = $2;
302 $seconds = $3;
303 }
304
305.fi
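.PP
The two examples above can be tried out as follows; this is only a
sketch, and the sample strings are invented for illustration.
.nf

```perl
    $line = 'hello world again';
    $line =~ s/^([^ ]*) *([^ ]*)/$2 $1/;    # swap first two words

    $stamp = 'Time: 12:34:56';
    if ($stamp =~ /Time: (..):(..):(..)/) {
        $hours = $1;
        $minutes = $2;
        $seconds = $3;
    }
```

.fi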
ae986130 306By default, the ^ character is only guaranteed to match at the beginning
307of the string,
308the $ character only at the end (or before the newline at the end)
a687059c 309and
310.I perl
311does certain optimizations with the assumption that the string contains
312only one line.
ae986130 313The behavior of ^ and $ on embedded newlines will be inconsistent.
a687059c 314You may, however, wish to treat a string as a multi-line buffer, such that
315the ^ will match after any newline within the string, and $ will match
316before any newline.
317At the cost of a little more overhead, you can do this by setting the variable
318$* to 1.
319Setting it back to 0 makes
320.I perl
321revert to its old behavior.
322.PP
323To facilitate multi-line substitutions, the . character never matches a newline
324(even when $* is 0).
325In particular, the following leaves a newline on the $_ string:
326.nf
327
328 $_ = <STDIN>;
329 s/.*(some_string).*/$1/;
330
331If the newline is unwanted, try one of
332
333 s/.*(some_string).*\en/$1/;
334 s/.*(some_string)[^\e000]*/$1/;
335 s/.*(some_string)(.|\en)*/$1/;
336 chop; s/.*(some_string).*/$1/;
337 /(some_string)/ && ($_ = $1);
338
339.fi
340Any item of a regular expression may be followed with digits in curly brackets
341of the form {n,m}, where n gives the minimum number of times to match the item
342and m gives the maximum.
343The form {n} is equivalent to {n,n} and matches exactly n times.
344The form {n,} matches n or more times.
345(If a curly bracket occurs in any other context, it is treated as a regular
346character.)
347The * modifier is equivalent to {0,}, the + modifier to {1,} and the ? modifier
348to {0,1}.
349There is no limit to the size of n or m, but large numbers will chew up
350more memory.
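.PP
A short sketch of the counted forms (the test strings are made up, and
each variable is set to 1 if the match succeeds and 0 if it fails):
.nf

```perl
    $two_to_three = ('aaa'  =~ /^a{2,3}$/) ? 1 : 0;   # 1: three fits {2,3}
    $too_many     = ('aaaa' =~ /^a{2,3}$/) ? 1 : 0;   # 0: four is over m
    $exactly_two  = ('aa'   =~ /^a{2}$/)   ? 1 : 0;   # 1: {2} means {2,2}
    $at_least_two = ('aaaa' =~ /^a{2,}$/)  ? 1 : 0;   # 1: no upper bound
    $plus_same    = ('aaaa' =~ /^a{1,}$/)  ? 1 : 0;   # 1: same as a+
```

.fi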
351.Sp
352You will note that all backslashed metacharacters in
353.I perl
354are alphanumeric,
355such as \eb, \ew, \en.
356Unlike some other regular expression languages, there are no backslashed
357symbols that aren't alphanumeric.
358So anything that looks like \e\e, \e(, \e), \e<, \e>, \e{, or \e} is always
359interpreted as a literal character, not a metacharacter.
360This makes it simple to quote a string that you want to use for a pattern
361but that you are afraid might contain metacharacters.
362Simply quote all the non-alphanumeric characters:
363.nf
364
365 $pattern =~ s/(\eW)/\e\e$1/g;
366
367.fi
368.Sh "Formats"
369Output record formats for use with the
370.I write
operator may be declared as follows:
372.nf
373
374.ne 3
375 format NAME =
376 FORMLIST
377 .
378
379.fi
380If name is omitted, format \*(L"STDOUT\*(R" is defined.
381FORMLIST consists of a sequence of lines, each of which may be of one of three
382types:
383.Ip 1. 4
384A comment.
385.Ip 2. 4
386A \*(L"picture\*(R" line giving the format for one output line.
387.Ip 3. 4
388An argument line supplying values to plug into a picture line.
389.PP
390Picture lines are printed exactly as they look, except for certain fields
391that substitute values into the line.
392Each picture field starts with either @ or ^.
393The @ field (not to be confused with the array marker @) is the normal
394case; ^ fields are used
395to do rudimentary multi-line text block filling.
396The length of the field is supplied by padding out the field
397with multiple <, >, or | characters to specify, respectively, left justification,
398right justification, or centering.
399If any of the values supplied for these fields contains a newline, only
400the text up to the newline is printed.
401The special field @* can be used for printing multi-line values.
402It should appear by itself on a line.
403.PP
404The values are specified on the following line, in the same order as
405the picture fields.
406The values should be separated by commas.
407.PP
408Picture fields that begin with ^ rather than @ are treated specially.
409The value supplied must be a scalar variable name which contains a text
410string.
411.I Perl
412puts as much text as it can into the field, and then chops off the front
413of the string so that the next time the variable is referenced,
414more of the text can be printed.
415Normally you would use a sequence of fields in a vertical stack to print
416out a block of text.
417If you like, you can end the final field with .\|.\|., which will appear in the
418output if the text was too long to appear in its entirety.
419You can change which characters are legal to break on by changing the
420variable $: to a list of the desired characters.
421.PP
422Since use of ^ fields can produce variable length records if the text to be
423formatted is short, you can suppress blank lines by putting the tilde (~)
424character anywhere in the line.
425(Normally you should put it in the front if possible, for visibility.)
426The tilde will be translated to a space upon output.
427If you put a second tilde contiguous to the first, the line will be repeated
428until all the fields on the line are exhausted.
429(If you use a field of the @ variety, the expression you supply had better
430not give the same value every time forever!)
431.PP
432Examples:
433.nf
434.lg 0
435.cs R 25
436.ft C
437
438.ne 10
# a report on the /etc/passwd file
format top =
\&                        Passwd File
Name                Login    Office   Uid   Gid Home
------------------------------------------------------------------
\&.
format STDOUT =
@<<<<<<<<<<<<<<<<<< @||||||| @<<<<<<@>>>> @>>>> @<<<<<<<<<<<<<<<<<
$name,              $login,  $office,$uid,$gid, $home
\&.
449
450.ne 29
# a report from a bug report form
format top =
\&                        Bug Reports
@<<<<<<<<<<<<<<<<<<<<<<<     @|||         @>>>>>>>>>>>>>>>>>>>>>>>
$system,                      $%,         $date
------------------------------------------------------------------
\&.
format STDOUT =
Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
\&         $subject
Index: @<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
\&       $index,                       $description
Priority: @<<<<<<<<<< Date: @<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
\&          $priority,        $date,   $description
From: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
\&      $from,                         $description
Assigned to: @<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
\&             $programmer,            $description
\&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
\&                                     $description
\&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
\&                                     $description
\&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
\&                                     $description
\&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
\&                                     $description
\&~                                    ^<<<<<<<<<<<<<<<<<<<<<<<...
\&                                     $description
\&.
480
481.ft R
482.cs R
483.lg
484.fi
485It is possible to intermix prints with writes on the same output channel,
486but you'll have to handle $\- (lines left on the page) yourself.
487.PP
488If you are printing lots of fields that are usually blank, you should consider
489using the reset operator between records.
490Not only is it more efficient, but it can prevent the bug of adding another
491field and forgetting to zero it.
492.Sh "Interprocess Communication"
493The IPC facilities of perl are built on the Berkeley socket mechanism.
494If you don't have sockets, you can ignore this section.
495The calls have the same names as the corresponding system calls,
496but the arguments tend to differ, for two reasons.
497First, perl file handles work differently than C file descriptors.
498Second, perl already knows the length of its strings, so you don't need
499to pass that information.
500Here is a sample client (untested):
501.nf
502
503 ($them,$port) = @ARGV;
504 $port = 2345 unless $port;
505 $them = 'localhost' unless $them;
506
507 $SIG{'INT'} = 'dokill';
508 sub dokill { kill 9,$child if $child; }
509
33b78306 510 require 'sys/socket.ph';
a687059c 511
512 $sockaddr = 'S n a4 x8';
513 chop($hostname = `hostname`);
514
515 ($name, $aliases, $proto) = getprotobyname('tcp');
516 ($name, $aliases, $port) = getservbyname($port, 'tcp')
0f85fab0 517 unless $port =~ /^\ed+$/;
ae986130 518.ie t \{\
a687059c 519 ($name, $aliases, $type, $len, $thisaddr) = gethostbyname($hostname);
ae986130 520'br\}
521.el \{\
522 ($name, $aliases, $type, $len, $thisaddr) =
523 gethostbyname($hostname);
524'br\}
a687059c 525 ($name, $aliases, $type, $len, $thataddr) = gethostbyname($them);
526
527 $this = pack($sockaddr, &AF_INET, 0, $thisaddr);
528 $that = pack($sockaddr, &AF_INET, $port, $thataddr);
529
530 socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!";
531 bind(S, $this) || die "bind: $!";
532 connect(S, $that) || die "connect: $!";
533
534 select(S); $| = 1; select(stdout);
535
    if ($child = fork) {
        while (<>) {
            print S;
        }
        sleep 3;
        do dokill();
    }
    else {
        while (<S>) {
            print;
        }
    }
548
549.fi
550And here's a server:
551.nf
552
553 ($port) = @ARGV;
554 $port = 2345 unless $port;
555
33b78306 556 require 'sys/socket.ph';
a687059c 557
558 $sockaddr = 'S n a4 x8';
559
560 ($name, $aliases, $proto) = getprotobyname('tcp');
561 ($name, $aliases, $port) = getservbyname($port, 'tcp')
0f85fab0 562 unless $port =~ /^\ed+$/;
a687059c 563
564 $this = pack($sockaddr, &AF_INET, $port, "\e0\e0\e0\e0");
565
566 select(NS); $| = 1; select(stdout);
567
568 socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!";
569 bind(S, $this) || die "bind: $!";
    listen(S, 5) || die "listen: $!";
571
572 select(S); $| = 1; select(stdout);
573
    for (;;) {
        print "Listening again\en";
        ($addr = accept(NS,S)) || die $!;
        print "accept ok\en";

        ($af,$port,$inetaddr) = unpack($sockaddr,$addr);
        @inetaddr = unpack('C4',$inetaddr);
        print "$af $port @inetaddr\en";

        while (<NS>) {
            print;
            print NS;
        }
    }
588
589.fi
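.PP
To see what the template in $sockaddr produces, here is a small sketch
that packs and unpacks an address without opening a socket.
It assumes that AF_INET has the value 2, as it does on most systems;
the samples above take the real value from sys/socket.ph.
.nf

```perl
    $sockaddr = 'S n a4 x8';
    $af_inet  = 2;                          # assumed value of AF_INET

    $addr   = pack('C4', 127, 0, 0, 1);     # four address bytes
    $packed = pack($sockaddr, $af_inet, 2345, $addr);
    $size   = length($packed);              # 2 + 2 + 4 + 8 = 16 bytes

    ($af, $port, $inetaddr) = unpack($sockaddr, $packed);
    @inetaddr = unpack('C4', $inetaddr);
    $dotted = join('.', @inetaddr);         # "127.0.0.1"
```

.fi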
590.Sh "Predefined Names"
591The following names have special meaning to
592.IR perl .
593I could have used alphabetic symbols for some of these, but I didn't want
594to take the chance that someone would say reset \*(L"a\-zA\-Z\*(R" and wipe them all
595out.
596You'll just have to suffer along with these silly symbols.
597Most of them have reasonable mnemonics, or analogues in one of the shells.
598.Ip $_ 8
599The default input and pattern-searching space.
600The following pairs are equivalent:
601.nf
602
603.ne 2
604 while (<>) {\|.\|.\|. # only equivalent in while!
605 while ($_ = <>) {\|.\|.\|.
606
607.ne 2
608 /\|^Subject:/
609 $_ \|=~ \|/\|^Subject:/
610
611.ne 2
612 y/a\-z/A\-Z/
613 $_ =~ y/a\-z/A\-Z/
614
615.ne 2
616 chop
617 chop($_)
618
619.fi
620(Mnemonic: underline is understood in certain operations.)
621.Ip $. 8
622The current input line number of the last filehandle that was read.
623Readonly.
624Remember that only an explicit close on the filehandle resets the line number.
625Since <> never does an explicit close, line numbers increase across ARGV files
626(but see examples under eof).
627(Mnemonic: many programs use . to mean the current line number.)
628.Ip $/ 8
629The input record separator, newline by default.
630Works like
631.IR awk 's
632RS variable, including treating blank lines as delimiters
633if set to the null string.
634If set to a value longer than one character, only the first character is used.
635(Mnemonic: / is used to delimit line boundaries when quoting poetry.)
636.Ip $, 8
637The output field separator for the print operator.
638Ordinarily the print operator simply prints out the comma separated fields
639you specify.
640In order to get behavior more like
641.IR awk ,
642set this variable as you would set
643.IR awk 's
644OFS variable to specify what is printed between fields.
645(Mnemonic: what is printed when there is a , in your print statement.)
646.Ip $"" 8
647This is like $, except that it applies to array values interpolated into
648a double-quoted string (or similar interpreted string).
649Default is a space.
650(Mnemonic: obvious, I think.)
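.Sp
A small sketch (the array contents are invented):
.nf

```perl
    @parts = ('usr', 'local', 'lib');
    $spaced = "@parts";             # "usr local lib", default separator
    $" = '/';
    $joined = "@parts";             # "usr/local/lib"
    $" = ' ';                       # put the default back
```

.fi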
651.Ip $\e 8
652The output record separator for the print operator.
653Ordinarily the print operator simply prints out the comma separated fields
654you specify, with no trailing newline or record separator assumed.
655In order to get behavior more like
656.IR awk ,
657set this variable as you would set
658.IR awk 's
659ORS variable to specify what is printed at the end of the print.
660(Mnemonic: you set $\e instead of adding \en at the end of the print.
661Also, it's just like /, but it's what you get \*(L"back\*(R" from
662.IR perl .)
663.Ip $# 8
664The output format for printed numbers.
665This variable is a half-hearted attempt to emulate
666.IR awk 's
667OFMT variable.
668There are times, however, when
669.I awk
670and
671.I perl
672have differing notions of what
673is in fact numeric.
674Also, the initial value is %.20g rather than %.6g, so you need to set $#
675explicitly to get
676.IR awk 's
677value.
678(Mnemonic: # is the number sign.)
679.Ip $% 8
680The current page number of the currently selected output channel.
681(Mnemonic: % is page number in nroff.)
682.Ip $= 8
683The current page length (printable lines) of the currently selected output
684channel.
685Default is 60.
686(Mnemonic: = has horizontal lines.)
687.Ip $\- 8
688The number of lines left on the page of the currently selected output channel.
689(Mnemonic: lines_on_page \- lines_printed.)
690.Ip $~ 8
691The name of the current report format for the currently selected output
692channel.
693(Mnemonic: brother to $^.)
694.Ip $^ 8
695The name of the current top-of-page format for the currently selected output
696channel.
697(Mnemonic: points to top of page.)
698.Ip $| 8
699If set to nonzero, forces a flush after every write or print on the currently
700selected output channel.
701Default is 0.
702Note that
703.I STDOUT
704will typically be line buffered if output is to the
705terminal and block buffered otherwise.
706Setting this variable is useful primarily when you are outputting to a pipe,
707such as when you are running a
708.I perl
709script under rsh and want to see the
710output as it's happening.
711(Mnemonic: when you want your pipes to be piping hot.)
712.Ip $$ 8
713The process number of the
714.I perl
715running this script.
716(Mnemonic: same as shells.)
717.Ip $? 8
718The status returned by the last pipe close, backtick (\`\`) command or
719.I system
720operator.
721Note that this is the status word returned by the wait() system
722call, so the exit value of the subprocess is actually ($? >> 8).
723$? & 255 gives which signal, if any, the process died from, and whether
724there was a core dump.
725(Mnemonic: similar to sh and ksh.)
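.Sp
As a sketch of that decoding, using invented status words rather than
a real subprocess, and assuming the usual wait() layout in which the low
seven bits hold the signal number and the next bit flags a core dump:
.nf

```perl
    $status = (3 << 8);             # pretend a child did exit(3)
    $exit   = $status >> 8;         # 3
    $signal = $status & 127;        # 0: not killed by a signal

    $status = 11 | 128;             # pretend signal 11 with a core dump
    $killed = $status & 127;        # 11
    $dumped = ($status & 128) ? 1 : 0;      # 1
```

.fi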
726.Ip $& 8 4
727The string matched by the last pattern match (not counting any matches hidden
728within a BLOCK or eval enclosed by the current BLOCK).
729(Mnemonic: like & in some editors.)
730.Ip $\` 8 4
731The string preceding whatever was matched by the last pattern match
732(not counting any matches hidden within a BLOCK or eval enclosed by the current
733BLOCK).
734(Mnemonic: \` often precedes a quoted string.)
735.Ip $\' 8 4
736The string following whatever was matched by the last pattern match
737(not counting any matches hidden within a BLOCK or eval enclosed by the current
738BLOCK).
739(Mnemonic: \' often follows a quoted string.)
740Example:
741.nf
742
743.ne 3
744 $_ = \'abcdefghi\';
745 /def/;
746 print "$\`:$&:$\'\en"; # prints abc:def:ghi
747
748.fi
749.Ip $+ 8 4
750The last bracket matched by the last search pattern.
751This is useful if you don't know which of a set of alternative patterns
752matched.
753For example:
754.nf
755
756 /Version: \|(.*\|)|Revision: \|(.*\|)\|/ \|&& \|($rev = $+);
757
758.fi
759(Mnemonic: be positive and forward looking.)
760.Ip $* 8 2
761Set to 1 to do multiline matching within a string, 0 to tell
762.I perl
763that it can assume that strings contain a single line, for the purpose
764of optimizing pattern matches.
765Pattern matches on strings containing multiple newlines can produce confusing
766results when $* is 0.
767Default is 0.
768(Mnemonic: * matches multiple things.)
769.Ip $0 8
770Contains the name of the file containing the
771.I perl
772script being executed.
a687059c 773(Mnemonic: same as sh and ksh.)
774.Ip $<digit> 8
775Contains the subpattern from the corresponding set of parentheses in the last
776pattern matched, not counting patterns matched in nested blocks that have
777been exited already.
778(Mnemonic: like \edigit.)
779.Ip $[ 8 2
780The index of the first element in an array, and of the first character in
781a substring.
782Default is 0, but you could set it to 1 to make
783.I perl
784behave more like
785.I awk
786(or Fortran)
787when subscripting and when evaluating the index() and substr() functions.
788(Mnemonic: [ begins subscripts.)
789.Ip $] 8 2
790The string printed out when you say \*(L"perl -v\*(R".
791It can be used to determine at the beginning of a script whether the perl
792interpreter executing the script is in the right range of versions.
33b78306 793If used in a numeric context, returns the version + patchlevel / 1000.
a687059c 794Example:
795.nf
796
33b78306 797.ne 8
a687059c 798 # see if getc is available
799 ($version,$patchlevel) =
800 $] =~ /(\ed+\e.\ed+).*\enPatch level: (\ed+)/;
801 print STDERR "(No filename completion available.)\en"
802 if $version * 1000 + $patchlevel < 2016;
803
33b78306 804or, used numerically,
805
    warn "No checksumming!\en" if $] < 3.019;
807
a687059c 808.fi
809(Mnemonic: Is this version of perl in the right bracket?)
810.Ip $; 8 2
811The subscript separator for multi-dimensional array emulation.
812If you refer to an associative array element as
813.nf
814 $foo{$a,$b,$c}
815
816it really means
817
818 $foo{join($;, $a, $b, $c)}
819
820But don't put
821
822 @foo{$a,$b,$c} # a slice--note the @
823
824which means
825
826 ($foo{$a},$foo{$b},$foo{$c})
827
828.fi
829Default is "\e034", the same as SUBSEP in
830.IR awk .
831Note that if your keys contain binary data there might not be any safe
832value for $;.
833(Mnemonic: comma (the syntactic subscript separator) is a semi-semicolon.
834Yeah, I know, it's pretty lame, but $, is already taken for something more
835important.)
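.Sp
A sketch of the equivalence (the array name and key values are made up):
.nf

```perl
    $grid{1,2,3} = 'cell';
    $key  = join($;, 1, 2, 3);
    $same = $grid{$key};                    # 'cell': the same element
    @pieces = split($;, $key);              # recover the subscripts
    $count = $#pieces + 1;                  # 3
```

.fi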
836.Ip $! 8 2
837If used in a numeric context, yields the current value of errno, with all the
838usual caveats.
ffed7fef 839(This means that you shouldn't depend on the value of $! to be anything
840in particular unless you've gotten a specific error return indicating a
841system error.)
a687059c 842If used in a string context, yields the corresponding system error string.
843You can assign to $! in order to set errno
844if, for instance, you want $! to return the string for error n, or you want
845to set the exit value for the die operator.
846(Mnemonic: What just went bang?)
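.Sp
A sketch of the two contexts; error number 2 is used only as an example,
and the text of its message depends on the system:
.nf

```perl
    $! = 2;                         # set errno
    $number  = $! + 0;              # numeric context: 2
    $message = "$!";                # string context: the system's text
```

.fi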
847.Ip $@ 8 2
ffed7fef 848The perl syntax error message from the last eval command.
849If null, the last eval parsed and executed correctly (although the operations
850you invoked may have failed in the normal fashion).
a687059c 851(Mnemonic: Where was the syntax error \*(L"at\*(R"?)
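.Sp
A sketch with two invented eval strings, one that parses and one that
does not:
.nf

```perl
    $sum  = eval '1 + 2';           # parses and runs; $sum is 3
    $good = $@;                     # null string

    $none = eval '(1 + 2';          # unbalanced parenthesis: does not parse
    $bad  = $@;                     # holds the syntax error message
```

.fi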
852.Ip $< 8 2
853The real uid of this process.
854(Mnemonic: it's the uid you came FROM, if you're running setuid.)
855.Ip $> 8 2
856The effective uid of this process.
857Example:
858.nf
859
860.ne 2
861 $< = $>; # set real uid to the effective uid
862 ($<,$>) = ($>,$<); # swap real and effective uid
863
864.fi
865(Mnemonic: it's the uid you went TO, if you're running setuid.)
866Note: $< and $> can only be swapped on machines supporting setreuid().
867.Ip $( 8 2
868The real gid of this process.
869If you are on a machine that supports membership in multiple groups
870simultaneously, gives a space separated list of groups you are in.
871The first number is the one returned by getgid(), and the subsequent ones
872by getgroups(), one of which may be the same as the first number.
873(Mnemonic: parentheses are used to GROUP things.
874The real gid is the group you LEFT, if you're running setgid.)
875.Ip $) 8 2
876The effective gid of this process.
877If you are on a machine that supports membership in multiple groups
878simultaneously, gives a space separated list of groups you are in.
879The first number is the one returned by getegid(), and the subsequent ones
880by getgroups(), one of which may be the same as the first number.
881(Mnemonic: parentheses are used to GROUP things.
882The effective gid is the group that's RIGHT for you, if you're running setgid.)
883.Sp
884Note: $<, $>, $( and $) can only be set on machines that support the
885corresponding set[re][ug]id() routine.
886$( and $) can only be swapped on machines supporting setregid().
887.Ip $: 8 2
888The current set of characters after which a string may be broken to
889fill continuation fields (starting with ^) in a format.
890Default is "\ \en-", to break on whitespace or hyphens.
891(Mnemonic: a \*(L"colon\*(R" in poetry is a part of a line.)
33b78306 892.Ip $ARGV 8 3
893contains the name of the current file when reading from <>.
a687059c 894.Ip @ARGV 8 3
895The array ARGV contains the command line arguments intended for the script.
Note that $#ARGV is generally the number of arguments minus one, since
897$ARGV[0] is the first argument, NOT the command name.
898See $0 for the command name.
899.Ip @INC 8 3
900The array INC contains the list of places to look for
901.I perl
902scripts to be
evaluated by the \*(L"do EXPR\*(R" command or the \*(L"require\*(R" command.
a687059c 904It initially consists of the arguments to any
905.B \-I
906command line switches, followed
907by the default
908.I perl
33b78306 909library, probably \*(L"/usr/local/lib/perl\*(R",
910followed by \*(L".\*(R", to represent the current directory.
911.Ip %INC 8 3
912The associative array INC contains entries for each filename that has
913been included via \*(L"do\*(R" or \*(L"require\*(R".
914The key is the filename you specified, and the value is the location of
915the file actually found.
916The \*(L"require\*(R" command uses this array to determine whether
917a given file has already been included.
a687059c 918.Ip $ENV{expr} 8 2
919The associative array ENV contains your current environment.
920Setting a value in ENV changes the environment for child processes.
921.Ip $SIG{expr} 8 2
922The associative array SIG is used to set signal handlers for various signals.
923Example:
924.nf
925
926.ne 12
927 sub handler { # 1st argument is signal name
928 local($sig) = @_;
929 print "Caught a SIG$sig\-\|\-shutting down\en";
930 close(LOG);
931 exit(0);
932 }
933
934 $SIG{\'INT\'} = \'handler\';
935 $SIG{\'QUIT\'} = \'handler\';
936 .\|.\|.
937 $SIG{\'INT\'} = \'DEFAULT\'; # restore default action
938 $SIG{\'QUIT\'} = \'IGNORE\'; # ignore SIGQUIT
939
940.fi
941The SIG array only contains values for the signals actually set within
942the perl script.
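.PP
As a small sketch of the note on @ARGV above: the argument count is
$#ARGV plus one, because $ARGV[0] holds the first argument and not the
command name.
The argument values below are made up for illustration, and the sketch
assumes $[ has been left at its default of 0.

```perl
# Hypothetical arguments, standing in for a real command line.
@ARGV = ('-v', 'input.txt', 'output.txt');

$last_index = $#ARGV;        # index of the last argument
$arg_count  = $#ARGV + 1;    # number of arguments
$first_arg  = $ARGV[0];      # first argument, NOT the command name
```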
.Sh "Packages"
Perl provides a mechanism for alternate namespaces to protect packages from
stomping on each other's variables.
By default, a perl script starts compiling into the package known as \*(L"main\*(R".
By use of the
.I package
declaration, you can switch namespaces.
The scope of the package declaration is from the declaration itself to the end
of the enclosing block (the same scope as the local() operator).
Typically it would be the first declaration in a file to be included by
the \*(L"require\*(R" operator.
You can switch into a package in more than one place; it merely influences
which symbol table is used by the compiler for the rest of that block.
You can refer to variables and filehandles in other packages by prefixing
the identifier with the package name and a single quote.
If the package name is null, the \*(L"main\*(R" package is assumed.
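.PP
The scoping and qualification rules above can be sketched as follows.
The package name tally and the subroutine bump are invented for
illustration; the point is only that the same unqualified name refers
to two different variables, and that the qualified form $tally'count
reaches the second one from package main.

```perl
$count = 1;               # lives in package main

{
    package tally;        # hypothetical package; in effect to end of block
    $count = 10;          # a different variable: $tally'count
    sub bump { $count++; }
}

# Back in package main here.
&tally'bump();
$main_value  = $count;          # main's $count is untouched
$tally_value = $tally'count;    # qualified reference into package tally
```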
.PP
Only identifiers starting with letters are stored in the package's symbol
table.
All other symbols are kept in package \*(L"main\*(R".
In addition, the identifiers STDIN, STDOUT, STDERR, ARGV, ARGVOUT, ENV, INC
and SIG are forced to be in package \*(L"main\*(R", even when used for
other purposes than their built-in one.
Note also that, if you have a package called \*(L"m\*(R", \*(L"s\*(R"
or \*(L"y\*(R", then you can't use the qualified form of an identifier since it
will be interpreted instead as a pattern match, a substitution
or a translation.
.PP
Eval'ed strings are compiled in the package in which the eval was compiled.
(Assignments to $SIG{}, however, assume the signal handler specified is in the
main package.
Qualify the signal handler name if you wish to have a signal handler in
a package.)
For an example, examine perldb.pl in the perl library.
It initially switches to the DB package so that the debugger doesn't interfere
with variables in the script you are trying to debug.
At various points, however, it temporarily switches back to the main package
to evaluate various expressions in the context of the main package.
.PP
The symbol table for a package happens to be stored in the associative array
of that name prepended with an underscore.
The value in each entry of the associative array is
what you are referring to when you use the *name notation.
In fact, the following have the same effect (in package main, anyway),
though the first is more
efficient because it does the symbol table lookups at compile time:
.nf

.ne 2
    local(*foo) = *bar;
    local($_main{'foo'}) = $_main{'bar'};

.fi
You can use this to print out all the variables in a package, for instance.
Here is dumpvar.pl from the perl library:
.nf
.ne 11
    package dumpvar;

    sub main'dumpvar {
    \&    ($package) = @_;
    \&    local(*stab) = eval("*_$package");
    \&    while (($key,$val) = each(%stab)) {
    \&        {
    \&            local(*entry) = $val;
    \&            if (defined $entry) {
    \&                print "\e$$key = '$entry'\en";
    \&            }
.ne 7
    \&            if (defined @entry) {
    \&                print "\e@$key = (\en";
    \&                foreach $num ($[ .. $#entry) {
    \&                    print "  $num\et'",$entry[$num],"'\en";
    \&                }
    \&                print ")\en";
    \&            }
.ne 10
    \&            if ($key ne "_$package" && defined %entry) {
    \&                print "\e%$key = (\en";
    \&                foreach $key (sort keys(%entry)) {
    \&                    print "  $key\et'",$entry{$key},"'\en";
    \&                }
    \&                print ")\en";
    \&            }
    \&        }
    \&    }
    }

.fi
Note that, even though the subroutine is compiled in package dumpvar, the
name of the subroutine is qualified so that its name is inserted into package
\*(L"main\*(R".
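.PP
The *name notation is also handy for handing a whole array to a
subroutine so that the subroutine can change it in place.
What follows is a minimal sketch of that idea, not code from the perl
library; the names doubleary, someary and nums are invented for
illustration.

```perl
# Doubles every element of the array whose *name is passed in.
sub doubleary {
    local(*someary) = @_;        # alias someary to the caller's symbol
    foreach $elem (@someary) {
        $elem *= 2;              # foreach aliases, so this edits in place
    }
}

@nums = (1, 2, 3);
&doubleary(*nums);               # pass the symbol table entry, not a copy
```

Because only the symbol table entry is passed, no copy of the array is
made on the way in or out.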
.Sh "Style"
Each programmer will, of course, have his or her own preferences in regard
to formatting, but there are some general guidelines that will make your
programs easier to read.
.Ip 1. 4 4
Just because you CAN do something a particular way doesn't mean that
you SHOULD do it that way.
.I Perl
is designed to give you several ways to do anything, so consider picking
the most readable one.
For instance

    open(FOO,$foo) || die "Can't open $foo: $!";

is better than

    die "Can't open $foo: $!" unless open(FOO,$foo);

because the second way hides the main point of the statement in a
modifier.
On the other hand

    print "Starting analysis\en" if $verbose;

is better than

    $verbose && print "Starting analysis\en";

since the main point isn't whether the user typed -v or not.
.Sp
Similarly, just because an operator lets you assume default arguments
doesn't mean that you have to make use of the defaults.
The defaults are there for lazy systems programmers writing one-shot
programs.
If you want your program to be readable, consider supplying the argument.
.Sp
Along the same lines, just because you
.I can
omit parentheses in many places doesn't mean that you ought to:
.nf

    return print reverse sort num values array;
    return print(reverse(sort num (values(%array))));

.fi
When in doubt, parenthesize.
At the very least it will let some poor schmuck bounce on the % key in vi.
.Ip 2. 4 4
Don't go through silly contortions to exit a loop at the top or the
bottom, when
.I perl
provides the "last" operator so you can exit in the middle.
Just outdent it a little to make it more visible:
.nf

.ne 7
    line:
        for (;;) {
            statements;
        last line if $foo;
            next line if /^#/;
            statements;
        }

.fi
.Ip 3. 4 4
Don't be afraid to use loop labels\*(--they're there to enhance readability as
well as to allow multi-level loop breaks.
See last example.
.Ip 4. 4 4
For portability, when using features that may not be implemented on every
machine, test the construct in an eval to see if it fails.
If you know in which version or patchlevel a particular feature was implemented,
you can test $] to see if it will be there.
.Ip 5. 4 4
Choose mnemonic identifiers.
.Ip 6. 4 4
Be consistent.
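.PP
The portability advice in guideline 4 might be applied along these
lines.
This is only a sketch: feature_works is a hypothetical helper, and
symlink is used merely as an example of a function that some machines
lack.
The helper evals a string of code and reports whether it ran without a
fatal error.

```perl
# Returns 1 if the code string runs without a fatal error, 0 otherwise.
sub feature_works {
    local($code) = @_;
    local($ok) = eval "$code; 1";   # eval yields nothing if $code dies
    $ok ? 1 : 0;
}

# Probe a feature that may be missing on this machine.
$have_symlink = &feature_works('symlink("", "")');

# A construct that always fails, to show the negative case.
$have_bogus = &feature_works('die "unimplemented\n"');
```

After a failed probe the error text is left in $@, should you want to
report it.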
.Sh "Debugging"
If you invoke
.I perl
with a
.B \-d
switch, your script will be run under a debugging monitor.
It will halt before the first executable statement and ask you for a
command, such as:
.Ip "h" 12 4
Prints out a help message.
.Ip "T" 12 4
Stack trace.
.Ip "s" 12 4
Single step.
Executes until it reaches the beginning of another statement.
.Ip "n" 12 4
Next.
Executes over subroutine calls, until it reaches the beginning of the
next statement.
.Ip "f" 12 4
Finish.
Executes statements until it has finished the current subroutine.
.Ip "c" 12 4
Continue.
Executes until the next breakpoint is reached.
.Ip "c line" 12 4
Continue to the specified line.
Inserts a one-time-only breakpoint at the specified line.
.Ip "<CR>" 12 4
Repeat last n or s.
.Ip "l min+incr" 12 4
List incr+1 lines starting at min.
If min is omitted, starts where last listing left off.
If incr is omitted, previous value of incr is used.
.Ip "l min-max" 12 4
List lines in the indicated range.
.Ip "l line" 12 4
List just the indicated line.
.Ip "l" 12 4
List next window.
.Ip "-" 12 4
List previous window.
.Ip "w line" 12 4
List window around line.
.Ip "l subname" 12 4
List subroutine.
If it's a long subroutine it just lists the beginning.
Use \*(L"l\*(R" to list more.
.Ip "/pattern/" 12 4
Regular expression search forward for pattern; the final / is optional.
.Ip "?pattern?" 12 4
Regular expression search backward for pattern; the final ? is optional.
.Ip "L" 12 4
List lines that have breakpoints or actions.
.Ip "S" 12 4
Lists the names of all subroutines.
.Ip "t" 12 4
Toggle trace mode on or off.
.Ip "b line condition" 12 4
Set a breakpoint.
If line is omitted, sets a breakpoint on the
line that is about to be executed.
If a condition is specified, it is evaluated each time the statement is
reached and a breakpoint is taken only if the condition is true.
Breakpoints may only be set on lines that begin an executable statement.
.Ip "b subname condition" 12 4
Set breakpoint at first executable line of subroutine.
.Ip "d line" 12 4
Delete breakpoint.
If line is omitted, deletes the breakpoint on the
line that is about to be executed.
.Ip "D" 12 4
Delete all breakpoints.
.Ip "a line command" 12 4
Set an action for line.
A multi-line command may be entered by backslashing the newlines.
.Ip "A" 12 4
Delete all line actions.
.Ip "< command" 12 4
Set an action to happen before every debugger prompt.
A multi-line command may be entered by backslashing the newlines.
.Ip "> command" 12 4
Set an action to happen after the prompt when you've just given a command
to return to executing the script.
A multi-line command may be entered by backslashing the newlines.
.Ip "V package" 12 4
List all variables in package.
Default is main package.
.Ip "! number" 12 4
Redo a debugging command.
If number is omitted, redoes the previous command.
.Ip "! -number" 12 4
Redo the command that was that many commands ago.
.Ip "H -number" 12 4
Display last n commands.
Only commands longer than one character are listed.
If number is omitted, lists them all.
.Ip "q or ^D" 12 4
Quit.
.Ip "command" 12 4
Execute command as a perl statement.
A missing semicolon will be supplied.
.Ip "p expr" 12 4
Same as \*(L"print DB'OUT expr\*(R".
The DB'OUT filehandle is opened to /dev/tty, regardless of where STDOUT
may be redirected to.
.PP
If you want to modify the debugger, copy perldb.pl from the perl library
to your current directory and modify it as necessary.
You can do some customization by setting up a .perldb file which contains
initialization code.
For instance, you could make aliases like these:
.nf

    $DB'alias{'len'} = 's/^len(.*)/p length($1)/';
    $DB'alias{'stop'} = 's/^stop (at|in)/b/';
    $DB'alias{'.'} =
      's/^\e./p "\e$DB\e'sub(\e$DB\e'line):\et",\e$DB\e'line[\e$DB\e'line]/';

.fi
.Sh "Setuid Scripts"
.I Perl
is designed to make it easy to write secure setuid and setgid scripts.
Unlike shells, which are based on multiple substitution passes on each line
of the script,
.I perl
uses a more conventional evaluation scheme with fewer hidden \*(L"gotchas\*(R".
Additionally, since the language has more built-in functionality, it
has to rely less upon external (and possibly untrustworthy) programs to
accomplish its purposes.
.PP
In an unpatched 4.2 or 4.3bsd kernel, setuid scripts are intrinsically
insecure, but this kernel feature can be disabled.
If it is,
.I perl
can emulate the setuid and setgid mechanism when it notices the otherwise
useless setuid/gid bits on perl scripts.
If the kernel feature isn't disabled,
.I perl
will complain loudly that your setuid script is insecure.
You'll need to either disable the kernel setuid script feature, or put
a C wrapper around the script.
.PP
When perl is executing a setuid script, it takes special precautions to
prevent you from falling into any obvious traps.
(In some ways, a perl script is more secure than the corresponding
C program.)
Any command line argument, environment variable, or input is marked as
\*(L"tainted\*(R", and may not be used, directly or indirectly, in any
command that invokes a subshell, or in any command that modifies files,
directories or processes.
Any variable that is set within an expression that has previously referenced
a tainted value also becomes tainted (even if it is logically impossible
for the tainted value to influence the variable).
For example:
.nf

.ne 5
    $foo = shift;                  # $foo is tainted
    $bar = $foo,\'bar\';             # $bar is also tainted
    $xxx = <>;                     # Tainted
    $path = $ENV{\'PATH\'};          # Tainted, but see below
    $abc = \'abc\';                  # Not tainted

.ne 4
    system "echo $foo";            # Insecure
    system "/bin/echo", $foo;      # Secure (doesn't use sh)
    system "echo $bar";            # Insecure
    system "echo $abc";            # Insecure until PATH set

.ne 5
    $ENV{\'PATH\'} = \'/bin:/usr/bin\';
    $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\';

    $path = $ENV{\'PATH\'};          # Not tainted
    system "echo $abc";            # Is secure now!

.ne 5
    open(FOO,"$foo");              # OK
    open(FOO,">$foo");             # Not OK

    open(FOO,"echo $foo|");        # Not OK, but...
    open(FOO,"-|") || exec \'echo\', $foo;    # OK

    $zzz = `echo $foo`;            # Insecure, zzz tainted

    unlink $abc,$foo;              # Insecure
    umask $foo;                    # Insecure

.ne 3
    exec "echo $foo";              # Insecure
    exec "echo", $foo;             # Secure (doesn't use sh)
    exec "sh", \'-c\', $foo;         # Considered secure, alas

.fi
The taintedness is associated with each scalar value, so some elements
of an array can be tainted, and others not.
.PP
If you try to do something insecure, you will get a fatal error saying
something like \*(L"Insecure dependency\*(R" or \*(L"Insecure PATH\*(R".
Note that you can still write an insecure system call or exec,
but only by explicitly doing something like the last example above.
You can also bypass the tainting mechanism by referencing
subpatterns\*(--\c
.I perl
presumes that if you reference a substring using $1, $2, etc, you knew
what you were doing when you wrote the pattern:
.nf

    $ARGV[0] =~ /^\-P(\ew+)$/;
    $printer = $1;                 # Not tainted

.fi
This is fairly secure since \ew+ doesn't match shell metacharacters.
Use of .+ would have been insecure, but
.I perl
doesn't check for that, so you must be careful with your patterns.
This is the ONLY mechanism for untainting user supplied filenames if you
want to do file operations on them (unless you make $> equal to $<).
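.PP
One way to keep such patterns in a single, easily audited place is a
small validating subroutine.
The sketch below is an assumption about how you might organize this,
not something perl provides: clean_name is a hypothetical helper, and
the pattern it uses accepts word characters only.
Run outside a setuid script nothing is actually tainted, so the sketch
shows only the accept and reject logic.

```perl
# Returns the name if it consists only of word characters, else ''.
sub clean_name {
    local($raw) = @_;
    if ($raw =~ /^(\w+)$/) {
        return $1;      # comes from a subpattern, so it is not tainted
    }
    return '';          # anything with shell metacharacters is rejected
}

$good = &clean_name('laser2');
$bad  = &clean_name('laser2; rm -rf /');
```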
.PP
It's also possible to get into trouble with other operations that don't care
whether they use tainted values.
Make judicious use of the file tests in dealing with any user-supplied
filenames.
When possible, do opens and such after setting $> = $<.
.I Perl
doesn't prevent you from opening tainted filenames for reading, so be
careful what you print out.
The tainting mechanism is intended to prevent stupid mistakes, not to remove
the need for thought.
.SH ENVIRONMENT
.I Perl
uses PATH in executing subprocesses, and in finding the script if \-S
is used.
HOME or LOGDIR are used if chdir has no argument.
.PP
Apart from these,
.I perl
uses no environment variables, except to make them available
to the script being executed, and to child processes.
However, scripts running setuid would do well to execute the following lines
before doing anything else, just to keep people honest:
.nf

.ne 3
    $ENV{\'PATH\'} = \'/bin:/usr/bin\';    # or whatever you need
    $ENV{\'SHELL\'} = \'/bin/sh\' if $ENV{\'SHELL\'} ne \'\';
    $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\';

.fi
.SH AUTHOR
Larry Wall <lwall@jpl-devvax.Jpl.Nasa.Gov>
.br
MS-DOS port by Diomidis Spinellis <dds@cc.ic.ac.uk>
.SH FILES
/tmp/perl\-eXXXXXX temporary file for
.B \-e
commands.
.SH SEE ALSO
a2p awk to perl translator
.br
s2p sed to perl translator
.SH DIAGNOSTICS
Compilation errors will tell you the line number of the error, with an
indication of the next token or token type that was to be examined.
(In the case of a script passed to
.I perl
via
.B \-e
switches, each
.B \-e
is counted as one line.)
.PP
Setuid scripts have additional constraints that can produce error messages
such as \*(L"Insecure dependency\*(R".
See the section on setuid scripts.
.SH TRAPS
Accustomed
.IR awk
users should take special note of the following:
.Ip * 4 2
Semicolons are required after all simple statements in
.IR perl .
Newline
is not a statement delimiter.
.Ip * 4 2
Curly brackets are required on ifs and whiles.
.Ip * 4 2
Variables begin with $ or @ in
.IR perl .
.Ip * 4 2
Arrays index from 0 unless you set $[.
Likewise string positions in substr() and index().
.Ip * 4 2
You have to decide whether your array has numeric or string indices.
.Ip * 4 2
Associative array values do not spring into existence upon mere reference.
.Ip * 4 2
You have to decide whether you want to use string or numeric comparisons.
.Ip * 4 2
Reading an input line does not split it for you.
You get to split it yourself to an array.
And the
.I split
operator has different arguments.
.Ip * 4 2
The current input line is normally in $_, not $0.
It generally does not have the newline stripped.
($0 is the name of the program executed.)
.Ip * 4 2
$<digit> does not refer to fields\*(--it refers to substrings matched by the last
match pattern.
.Ip * 4 2
The
.I print
statement does not add field and record separators unless you set
$, and $\e.
.Ip * 4 2
You must open your files before you print to them.
.Ip * 4 2
The range operator is \*(L".\|.\*(R", not comma.
(The comma operator works as in C.)
.Ip * 4 2
The match operator is \*(L"=~\*(R", not \*(L"~\*(R".
(\*(L"~\*(R" is the one's complement operator, as in C.)
.Ip * 4 2
The exponentiation operator is \*(L"**\*(R", not \*(L"^\*(R".
(\*(L"^\*(R" is the XOR operator, as in C.)
.Ip * 4 2
The concatenation operator is \*(L".\*(R", not the null string.
(Using the null string would render \*(L"/pat/ /pat/\*(R" unparsable,
since the third slash would be interpreted as a division operator\*(--the
tokener is in fact slightly context sensitive for operators like /, ?, and <.
And in fact, . itself can be the beginning of a number.)
.Ip * 4 2
.IR Next ,
.I exit
and
.I continue
work differently.
.Ip * 4 2
The following variables work differently:
.nf

    Awk \h'|2.5i'Perl
    ARGC \h'|2.5i'$#ARGV
    ARGV[0] \h'|2.5i'$0
    FILENAME\h'|2.5i'$ARGV
    FNR \h'|2.5i'$. \- something
    FS \h'|2.5i'(whatever you like)
    NF \h'|2.5i'$#Fld, or some such
    NR \h'|2.5i'$.
    OFMT \h'|2.5i'$#
    OFS \h'|2.5i'$,
    ORS \h'|2.5i'$\e
    RLENGTH \h'|2.5i'length($&)
    RS \h'|2.5i'$/
    RSTART \h'|2.5i'length($\`)
    SUBSEP \h'|2.5i'$;

.fi
.Ip * 4 2
When in doubt, run the
.I awk
construct through a2p and see what it gives you.
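.PP
Several of the points above (no automatic field splitting, the newline
left on the line, and NF having no direct counterpart) come together in
a sketch like the following.
The input line is made up, the array name Fld simply echoes the table
above, and the sketch assumes $[ is left at 0, so the field count is
$#Fld plus one.

```perl
$_ = "alpha beta gamma\n";    # stands in for a line read from <>

chop;                         # the newline is not stripped for you
@Fld = split(' ', $_);        # split on whitespace, as awk does by default

$nf    = $#Fld + 1;           # awk's NF
$first = $Fld[0];             # awk's $1
$last  = $Fld[$#Fld];         # awk's $NF
```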
.PP
Cerebral C programmers should take note of the following:
.Ip * 4 2
Curly brackets are required on ifs and whiles.
.Ip * 4 2
You should use \*(L"elsif\*(R" rather than \*(L"else if\*(R".
.Ip * 4 2
.I Break
and
.I continue
become
.I last
and
.IR next ,
respectively.
.Ip * 4 2
There's no switch statement.
.Ip * 4 2
Variables begin with $ or @ in
.IR perl .
.Ip * 4 2
Printf does not implement *.
.Ip * 4 2
Comments begin with #, not /*.
.Ip * 4 2
You can't take the address of anything.
.Ip * 4 2
ARGV must be capitalized.
.Ip * 4 2
The \*(L"system\*(R" calls link, unlink, rename, etc. return nonzero for success, not 0.
.Ip * 4 2
Signal handlers deal with signal names, not numbers.
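.PP
A short sketch of how the control-flow differences look in practice:
next and last take the place of C's continue and break, elsif replaces
else if, and every branch needs its curly brackets.
The variable names are invented for illustration.

```perl
@seen = ();
foreach $n (1 .. 10) {
    next if $n % 2;          # like C's "continue": skip odd numbers
    last if $n > 6;          # like C's "break": stop once past 6
    if ($n == 2) {
        $label = 'two';
    } elsif ($n == 4) {      # "elsif", not "else if"
        $label = 'four';
    } else {
        $label = 'many';
    }
    push(@seen, "$n=$label");
}
```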
.PP
Seasoned
.I sed
programmers should take note of the following:
.Ip * 4 2
Backreferences in substitutions use $ rather than \e.
.Ip * 4 2
The pattern matching metacharacters (, ), and | do not have backslashes in front.
.Ip * 4 2
The range operator is .\|. rather than comma.
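.PP
The first two points can be seen in a pair of substitutions.
This is only an illustrative sketch with made-up strings: the
parentheses and the vertical bar are written bare, and the replacement
refers to the matched groups as $1 and $2.

```perl
$_ = "hello world";
s/(\w+) (\w+)/$2 $1/;        # sed would write \(...\) and \2 \1
$swapped = $_;

$_ = "dog";
s/^(cat|dog)$/<$1>/;         # alternation with a bare |
$wrapped = $_;
```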
.PP
Sharp shell programmers should take note of the following:
.Ip * 4 2
The backtick operator does variable interpretation without regard to the
presence of single quotes in the command.
.Ip * 4 2
The backtick operator does no translation of the return value, unlike csh.
.Ip * 4 2
Shells (especially csh) do several levels of substitution on each command line.
.I Perl
does substitution only in certain constructs such as double quotes,
backticks, angle brackets and search patterns.
.Ip * 4 2
Shells interpret scripts a little bit at a time.
.I Perl
compiles the whole program before executing it.
.Ip * 4 2
The arguments are available via @ARGV, not $1, $2, etc.
.Ip * 4 2
The environment is not automatically made available as variables.
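.PP
The two backtick points can be checked with a sketch like this one.
It assumes a Unix-style shell with an echo command is available, and
the variable names are invented for illustration.
The perl variable is interpolated even though it sits inside single
quotes in the command, and the result comes back with its trailing
newline intact.

```perl
$name = 'world';

# perl interpolates $name before the shell ever sees the single quotes.
$out = `echo '$name'`;

# No translation of the return value: the newline is still there.
chop($trimmed = $out);
```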
.SH BUGS
.PP
.I Perl
is at the mercy of your machine's definitions of various operations
such as type casting, atof() and sprintf().
.PP
If your stdio requires a seek or eof between reads and writes on a particular
stream, so does
.IR perl .
.PP
While none of the built-in data types have any arbitrary size limits (apart
from memory size), there are still a few arbitrary limits:
a given identifier may not be longer than 255 characters;
sprintf is limited on many machines to 128 characters per field (unless the format
specifier is exactly %s);
and no component of your PATH may be longer than 255 if you use \-S.
.PP
.I Perl
actually stands for Pathologically Eclectic Rubbish Lister, but don't tell
anyone I said that.
.rn }` ''