=head1 NAME

perlsub - Perl subroutines

=head1 SYNOPSIS

To declare subroutines:

    sub NAME;             # A "forward" declaration.
    sub NAME(PROTO);      #  ditto, but with prototypes

    sub NAME BLOCK        # A declaration and a definition.
    sub NAME(PROTO) BLOCK #  ditto, but with prototypes

To define an anonymous subroutine at runtime:

    $subref = sub BLOCK;

To import subroutines:

    use PACKAGE qw(NAME1 NAME2 NAME3);

To call subroutines:

    NAME(LIST);    # & is optional with parens.
    NAME LIST;     # Parens optional if predeclared/imported.
    &NAME;         # Passes current @_ to subroutine.

=head1 DESCRIPTION
30
cb1a09d0 31Like many languages, Perl provides for user-defined subroutines. These
32may be located anywhere in the main program, loaded in from other files
33via the C<do>, C<require>, or C<use> keywords, or even generated on the
34fly using C<eval> or anonymous subroutines (closures). You can even call
c07a80fd 35a function indirectly using a variable containing its name or a CODE reference
36to it, as in C<$var = \&function>.
cb1a09d0 37
38The Perl model for function call and return values is simple: all
39functions are passed as parameters one single flat list of scalars, and
40all functions likewise return to their caller one single flat list of
41scalars. Any arrays or hashes in these call and return lists will
42collapse, losing their identities--but you may always use
43pass-by-reference instead to avoid this. Both call and return lists may
44contain as many or as few scalar elements as you'd like. (Often a
45function without an explicit return statement is called a subroutine, but
46there's really no difference from the language's perspective.)
47
48Any arguments passed to the routine come in as the array @_. Thus if you
49called a function with two arguments, those would be stored in C<$_[0]>
50and C<$_[1]>. The array @_ is a local array, but its values are implicit
51references (predating L<perlref>) to the actual scalar parameters. The
52return value of the subroutine is the value of the last expression
53evaluated. Alternatively, a return statement may be used to specify the
54returned value and exit the subroutine. If you return one or more arrays
55and/or hashes, these will be flattened together into one large
56indistinguishable list.
57
Perl does not have named formal parameters, but in practice all you do is
assign to a my() list of these. Any variables you use in the function
that aren't declared private are global variables. For the gory details
on creating private variables, see the sections below on
L</"Private Variables via my()"> and L</"Temporary Values via local()">.
To create protected environments for a set of functions in a separate
package (and probably a separate file), see L<perlmod/"Packages">.
a0d0e21e 65
66Example:
67
cb1a09d0 68 sub max {
69 my $max = shift(@_);
a0d0e21e 70 foreach $foo (@_) {
71 $max = $foo if $max < $foo;
72 }
cb1a09d0 73 return $max;
a0d0e21e 74 }
cb1a09d0 75 $bestday = max($mon,$tue,$wed,$thu,$fri);
a0d0e21e 76
77Example:
78
79 # get a line, combining continuation lines
80 # that start with whitespace
81
82 sub get_line {
cb1a09d0 83 $thisline = $lookahead; # GLOBAL VARIABLES!!
a0d0e21e 84 LINE: while ($lookahead = <STDIN>) {
85 if ($lookahead =~ /^[ \t]/) {
86 $thisline .= $lookahead;
87 }
88 else {
89 last LINE;
90 }
91 }
92 $thisline;
93 }
94
95 $lookahead = <STDIN>; # get first line
96 while ($_ = get_line()) {
97 ...
98 }
99
100Use array assignment to a local list to name your formal arguments:
101
102 sub maybeset {
103 my($key, $value) = @_;
cb1a09d0 104 $Foo{$key} = $value unless $Foo{$key};
a0d0e21e 105 }
106
This also has the effect of turning call-by-reference into call-by-value,
since the assignment copies the values. Otherwise a function is free to
do in-place modifications of @_ and change its caller's values.
110
111 upcase_in($v1, $v2); # this changes $v1 and $v2
112 sub upcase_in {
113 for (@_) { tr/a-z/A-Z/ }
114 }
115
116You aren't allowed to modify constants in this way, of course. If an
117argument were actually literal and you tried to change it, you'd take a
118(presumably fatal) exception. For example, this won't work:
119
120 upcase_in("frederick");
121
122It would be much safer if the upcase_in() function
123were written to return a copy of its parameters instead
124of changing them in place:
125
126 ($v3, $v4) = upcase($v1, $v2); # this doesn't
127 sub upcase {
128 my @parms = @_;
129 for (@parms) { tr/a-z/A-Z/ }
c07a80fd 130 # wantarray checks if we were called in list context
131 return wantarray ? @parms : $parms[0];
cb1a09d0 132 }
133
134Notice how this (unprototyped) function doesn't care whether it was passed
135real scalars or arrays. Perl will see everything as one big long flat @_
136parameter list. This is one of the ways where Perl's simple
137argument-passing style shines. The upcase() function would work perfectly
138well without changing the upcase() definition even if we fed it things
139like this:
140
141 @newlist = upcase(@list1, @list2);
142 @newlist = upcase( split /:/, $var );
143
144Do not, however, be tempted to do this:
145
146 (@a, @b) = upcase(@list1, @list2);
147
Because like its flat incoming parameter list, the return list is also
flat. So all you have managed to do here is store everything in @a and
make @b an empty list. See L</"Pass by Reference"> for alternatives.
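
The flattening described above can be observed directly. The following is a minimal sketch, assuming a current perl (it uses C<use warnings> and lexical arrays); the names @first and @second are illustrative only:

```perl
use strict;
use warnings;

# Same upcase() as above: returns upper-cased copies of its arguments.
sub upcase {
    my @parms = @_;
    for (@parms) { tr/a-z/A-Z/ }
    return wantarray ? @parms : $parms[0];
}

my @list1 = qw(a b);
my @list2 = qw(c d e);

# The first array on the left swallows the whole flat return list.
my (@first, @second) = upcase(@list1, @list2);

print scalar(@first),  " elements in \@first\n";
print scalar(@second), " elements in \@second\n";
```

All five values land in the first array, and the second is left empty.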
151
152A subroutine may be called using the "&" prefix. The "&" is optional in
153Perl 5, and so are the parens if the subroutine has been predeclared.
154(Note, however, that the "&" is I<NOT> optional when you're just naming
155the subroutine, such as when it's used as an argument to defined() or
156undef(). Nor is it optional when you want to do an indirect subroutine
157call with a subroutine name or reference using the C<&$subref()> or
158C<&{$subref}()> constructs. See L<perlref> for more on that.)
a0d0e21e 159
Subroutines may be called recursively. If a subroutine is called using
the "&" form, the argument list is optional, and if omitted, no @_ array is
set up for the subroutine: the @_ array at the time of the call is
visible to the subroutine instead. This is an efficiency mechanism that
new users may wish to avoid.
a0d0e21e 165
    &foo(1,2,3);        # pass three arguments
    foo(1,2,3);         # the same

    foo();              # pass a null list
    &foo();             # the same

    &foo;               # foo() gets current args, like foo(@_) !!
    foo;                # like foo() IFF sub foo pre-declared, else "foo"
174
c07a80fd 175Not only does the "&" form make the argument list optional, but it also
176disables any prototype checking on the arguments you do provide. This
177is partly for historical reasons, and partly for having a convenient way
178to cheat if you know what you're doing. See the section on Prototypes below.
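
To make the shared @_ of the bare C<&NAME;> form concrete, here is a small sketch, assuming a current perl; inner(), outer() and @seen are hypothetical names used only for illustration:

```perl
use strict;
use warnings;

my @seen;

sub inner {
    push @seen, scalar @_;   # record how many arguments arrived
    shift @_ if @_;          # modify @_ in place
}

sub outer {
    &inner;                  # no parens: inner() works on outer()'s own @_
    push @seen, scalar @_;   # so the shift above is visible here
    inner();                 # explicit empty list: inner() gets its own empty @_
}

outer(1, 2, 3);
print "@seen\n";
```

The first call to inner() sees three arguments and removes one of them from outer()'s @_; the second call, written with parentheses, sees none.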
179
cb1a09d0 180=head2 Private Variables via my()
181
182Synopsis:
183
184 my $foo; # declare $foo lexically local
185 my (@wid, %get); # declare list of variables local
186 my $foo = "flurp"; # declare $foo lexical, and init it
187 my @oof = @bar; # declare @oof lexical, and init it
188
189A "my" declares the listed variables to be confined (lexically) to the
190enclosing block, subroutine, C<eval>, or C<do/require/use>'d file. If
191more than one value is listed, the list must be placed in parens. All
192listed elements must be legal lvalues. Only alphanumeric identifiers may
193be lexically scoped--magical builtins like $/ must currently be localized with
194"local" instead.
195
196Unlike dynamic variables created by the "local" statement, lexical
197variables declared with "my" are totally hidden from the outside world,
198including any called subroutines (even if it's the same subroutine called
199from itself or elsewhere--every call gets its own copy).
200
201(An eval(), however, can see the lexical variables of the scope it is
202being evaluated in so long as the names aren't hidden by declarations within
203the eval() itself. See L<perlref>.)
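
A brief sketch of that visibility rule, assuming a current perl and using a string eval; the variable names are illustrative:

```perl
use strict;
use warnings;

my $x = 10;

# A string eval compiled here can see the enclosing lexical $x.
my $seen = eval '$x + 1';

# A my() inside the eval hides the outer $x for the rest of that eval only.
my $inner = eval 'my $x = 5; $x + 1';

print "$seen $inner $x\n";
```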
204
205The parameter list to my() may be assigned to if desired, which allows you
206to initialize your variables. (If no initializer is given for a
207particular variable, it is created with the undefined value.) Commonly
208this is used to name the parameters to a subroutine. Examples:
209
210 $arg = "fred"; # "global" variable
211 $n = cube_root(27);
212 print "$arg thinks the root is $n\n";
213 fred thinks the root is 3
214
215 sub cube_root {
216 my $arg = shift; # name doesn't matter
217 $arg **= 1/3;
218 return $arg;
219 }
220
The "my" is simply a modifier on something you might assign to. So when
you do assign to the variables in its argument list, the "my" doesn't
change whether those variables are viewed as a scalar or an array. So
224
225 my ($foo) = <STDIN>;
226 my @FOO = <STDIN>;
227
228both supply a list context to the righthand side, while
229
230 my $foo = <STDIN>;
231
232supplies a scalar context. But the following only declares one variable:
748a9306 233
cb1a09d0 234 my $foo, $bar = 1;
748a9306 235
cb1a09d0 236That has the same effect as
748a9306 237
cb1a09d0 238 my $foo;
239 $bar = 1;
a0d0e21e 240
cb1a09d0 241The declared variable is not introduced (is not visible) until after
242the current statement. Thus,
243
244 my $x = $x;
245
246can be used to initialize the new $x with the value of the old $x, and
247the expression
248
249 my $x = 123 and $x == 123
250
251is false unless the old $x happened to have the value 123.
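
The first of these idioms is handy for taking a private copy of an outer variable inside a block. A minimal sketch, assuming a current perl; @log is a hypothetical name used to record what each scope sees:

```perl
use strict;
use warnings;

my $x = 10;
my @log;

{
    my $x = $x;      # the right-hand $x is still the outer one, so this copies 10
    $x++;
    push @log, $x;   # the inner copy
}

push @log, $x;       # the outer $x was never touched
print "@log\n";
```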
252
253Some users may wish to encourage the use of lexically scoped variables.
254As an aid to catching implicit references to package variables,
255if you say
256
257 use strict 'vars';
258
259then any variable reference from there to the end of the enclosing
260block must either refer to a lexical variable, or must be fully
261qualified with the package name. A compilation error results
262otherwise. An inner block may countermand this with S<"no strict 'vars'">.
263
A my() has both a compile-time and a run-time effect. At compile time,
the compiler takes notice of it; the principal usefulness of this is to
quiet C<use strict 'vars'>. The actual initialization doesn't happen
until run time, so it gets executed every time through a loop.
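
One visible consequence of the run-time effect is that each pass through a loop gets a fresh variable, which closures can then capture separately. The sketch below assumes a perl recent enough to accept C<for my>; @subs and @values are illustrative names:

```perl
use strict;
use warnings;

my @subs;
for my $i (1 .. 3) {
    my $n = $i * 10;           # executed anew on every iteration
    push @subs, sub { $n };    # each closure keeps the $n of its own pass
}

my @values = map { $_->() } @subs;
print "@values\n";
```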
268
269Variables declared with "my" are not part of any package and are therefore
270never fully qualified with the package name. In particular, you're not
271allowed to try to make a package variable (or other global) lexical:
272
273 my $pack::var; # ERROR! Illegal syntax
274 my $_; # also illegal (currently)
275
In fact, dynamic variables (also known as package or global variables)
are still accessible using the fully qualified :: notation even while a
lexical of the same name is also visible:
279
280 package main;
281 local $x = 10;
282 my $x = 20;
283 print "$x and $::x\n";
284
285That will print out 20 and 10.
286
You may declare "my" variables at the outermost scope of a file to
totally hide any such identifiers from the outside world. This is similar
to C's static variables at the file level. To do this with a subroutine
requires the use of a closure (anonymous function). If a block (such as
an eval(), function, or C<package>) wants to create a private subroutine
that cannot be called from outside that block, it can declare a lexical
variable containing an anonymous sub reference:
294
295 my $secret_version = '1.001-beta';
296 my $secret_sub = sub { print $secret_version };
297 &$secret_sub();
298
299As long as the reference is never returned by any function within the
300module, no outside module can see the subroutine, since its name is not in
301any package's symbol table. Remember that it's not I<REALLY> called
302$some_pack::secret_version or anything; it's just $secret_version,
303unqualified and unqualifiable.
304
305This does not work with object methods, however; all object methods have
306to be in the symbol table of some package to be found.
307
308Just because the lexical variable is lexically (also called statically)
309scoped doesn't mean that within a function it works like a C static. It
310normally works more like a C auto. But here's a mechanism for giving a
311function private variables with both lexical scoping and a static
312lifetime. If you do want to create something like C's static variables,
313just enclose the whole function in an extra block, and put the
314static variable outside the function but in the block.
315
316 {
317 my $secret_val = 0;
318 sub gimme_another {
319 return ++$secret_val;
320 }
321 }
322 # $secret_val now becomes unreachable by the outside
323 # world, but retains its value between calls to gimme_another
324
If this function is being sourced in from a separate file
via C<require> or C<use>, then this is probably just fine. If it's
all in the main program, you'll need to arrange for the my()
to be executed early, either by putting the whole block above
your main program, or more likely, merely placing a BEGIN
sub around it to make sure it gets executed before your program
starts to run:
332
333 sub BEGIN {
334 my $secret_val = 0;
335 sub gimme_another {
336 return ++$secret_val;
337 }
338 }
339
See L<perlmod> about the BEGIN function.
341
=head2 Temporary Values via local()

B<NOTE>: In general, you should be using "my" instead of "local", because
it's faster and safer. Exceptions to this include the global punctuation
variables, filehandles and formats, and direct manipulation of the Perl
symbol table itself. Format variables often use "local" though, as do
other variables whose current value must be visible to called
subroutines.
350
351Synopsis:
352
353 local $foo; # declare $foo dynamically local
354 local (@wid, %get); # declare list of variables local
355 local $foo = "flurp"; # declare $foo dynamic, and init it
356 local @oof = @bar; # declare @oof dynamic, and init it
357
358 local *FH; # localize $FH, @FH, %FH, &FH ...
359 local *merlyn = *randal; # now $merlyn is really $randal, plus
360 # @merlyn is really @randal, etc
361 local *merlyn = 'randal'; # SAME THING: promote 'randal' to *randal
362 local *merlyn = \$randal; # just alias $merlyn, not @merlyn etc
363
A local() modifies its listed variables to be local to the enclosing
block (or subroutine, C<eval{}>, or C<do>) and to I<any subroutine called
from within that block>. A local() just gives temporary values to global
(meaning package) variables. This is known as dynamic scoping. Lexical
scoping is done with "my", which works more like C's auto declarations.
369
370If more than one variable is given to local(), they must be placed in
371parens. All listed elements must be legal lvalues. This operator works
372by saving the current values of those variables in its argument list on a
373hidden stack and restoring them upon exiting the block, subroutine or
374eval. This means that called subroutines can also reference the local
375variable, but not the global one. The argument list may be assigned to if
376desired, which allows you to initialize your local variables. (If no
377initializer is given for a particular variable, it is created with an
378undefined value.) Commonly this is used to name the parameters to a
379subroutine. Examples:
380
381 for $i ( 0 .. 9 ) {
382 $digits{$i} = $i;
383 }
384 # assume this function uses global %digits hash
385 parse_num();
386
387 # now temporarily add to %digits hash
388 if ($base12) {
389 # (NOTE: not claiming this is efficient!)
390 local %digits = (%digits, 't' => 10, 'e' => 11);
391 parse_num(); # parse_num gets this new %digits!
392 }
393 # old %digits restored here
394
Because local() is a run-time command, it gets executed every time
through a loop. In releases of Perl previous to 5.0, this used more stack
storage each time until the loop was exited. Perl now reclaims the space
each time through, but it's still more efficient to declare your variables
outside the loop.
400
401A local is simply a modifier on an lvalue expression. When you assign to
402a localized variable, the local doesn't change whether its list is viewed
403as a scalar or an array. So
404
405 local($foo) = <STDIN>;
406 local @FOO = <STDIN>;
407
408both supply a list context to the righthand side, while
409
410 local $foo = <STDIN>;
411
412supplies a scalar context.
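
To contrast the dynamic scoping of local() with the lexical scoping of my(), here is a small sketch, assuming a current perl (it uses C<our> to satisfy C<use strict>); the sub names are hypothetical:

```perl
use strict;
use warnings;

our $level = "global";           # a package variable, which local() can affect

sub show { return $level }       # reads whatever $level currently holds

sub with_local {
    local $level = "temporary";  # old value saved, restored when this sub exits
    return show();               # called subroutines see the temporary value
}

sub with_my {
    my $level = "lexical";       # a separate, private variable
    return show();               # show() still sees the package variable
}

my @got = (with_local(), with_my(), show());
print "@got\n";
```

Only the local() version changes what show() sees, and the original value is back once with_local() returns.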
413
=head2 Passing Symbol Table Entries (typeglobs)
415
416[Note: The mechanism described in this section was originally the only
417way to simulate pass-by-reference in older versions of Perl. While it
418still works fine in modern versions, the new reference mechanism is
419generally easier to work with. See below.]
a0d0e21e 420
421Sometimes you don't want to pass the value of an array to a subroutine
422but rather the name of it, so that the subroutine can modify the global
423copy of it rather than working with a local copy. In perl you can
cb1a09d0 424refer to all objects of a particular name by prefixing the name
a0d0e21e 425with a star: C<*foo>. This is often known as a "type glob", since the
426star on the front can be thought of as a wildcard match for all the
427funny prefix characters on variables and subroutines and such.
428
429When evaluated, the type glob produces a scalar value that represents
430all the objects of that name, including any filehandle, format or
431subroutine. When assigned to, it causes the name mentioned to refer to
432whatever "*" value was assigned to it. Example:
433
434 sub doubleary {
435 local(*someary) = @_;
436 foreach $elem (@someary) {
437 $elem *= 2;
438 }
439 }
440 doubleary(*foo);
441 doubleary(*bar);
442
443Note that scalars are already passed by reference, so you can modify
444scalar arguments without using this mechanism by referring explicitly
445to $_[0] etc. You can modify all the elements of an array by passing
446all the elements as scalars, but you have to use the * mechanism (or
447the equivalent reference mechanism) to push, pop or change the size of
448an array. It will certainly be faster to pass the typeglob (or reference).
449
450Even if you don't want to modify an array, this mechanism is useful for
451passing multiple arrays in a single LIST, since normally the LIST
452mechanism will merge all the array values so that you can't extract out
cb1a09d0 453the individual arrays. For more on typeglobs, see L<perldata/"Typeglobs">.
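
As a sketch of passing two arrays through typeglobs and keeping them distinct, consider the following. It assumes a current perl, so the aliased names are declared with C<our> to satisfy C<use strict>; lengths(), @one and @two are hypothetical names:

```perl
use strict;
use warnings;

our (@first, @second);          # typeglobs only reach package variables
@first  = (1, 2, 3);
@second = (10, 20);

sub lengths {
    our (@one, @two);           # alias names, declared for strict 'vars'
    local (*one, *two) = @_;    # @one now aliases the first glob's array, etc.
    push @one, 4;               # changes the caller's array through the alias
    return (scalar @one, scalar @two);
}

my @sizes = lengths(*first, *second);
print "@sizes / @first\n";
```

The two arrays arrive separately, and the push inside lengths() changes the size of the caller's @first, which passing the elements alone could not do.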
454
=head2 Pass by Reference
456
457If you want to pass more than one array or hash into a function--or
458return them from it--and have them maintain their integrity,
459then you're going to have to use an explicit pass-by-reference.
c07a80fd 460Before you do that, you need to understand references as detailed in L<perlref>.
461This section may not make much sense to you otherwise.
cb1a09d0 462
Here are a few simple examples. First, let's pass in several
arrays to a function and have it pop all of them, returning a new
list of all their former last elements:
466
467 @tailings = popmany ( \@a, \@b, \@c, \@d );
468
469 sub popmany {
470 my $aref;
471 my @retlist = ();
472 foreach $aref ( @_ ) {
473 push @retlist, pop @$aref;
474 }
475 return @retlist;
476 }
477
478Here's how you might write a function that returns a
479list of keys occurring in all the hashes passed to it:
480
481 @common = inter( \%foo, \%bar, \%joe );
482 sub inter {
483 my ($k, $href, %seen); # locals
484 foreach $href (@_) {
485 while ( $k = each %$href ) {
486 $seen{$k}++;
487 }
488 }
489 return grep { $seen{$_} == @_ } keys %seen;
490 }
491
492So far, we're just using the normal list return mechanism.
493What happens if you want to pass or return a hash? Well,
494if you're only using one of them, or you don't mind them
495concatenating, then the normal calling convention is ok, although
496a little expensive.
497
498Where people get into trouble is here:
499
500 (@a, @b) = func(@c, @d);
501or
502 (%a, %b) = func(%c, %d);
503
That syntax simply won't work. It just sets @a or %a and clears @b or
%b. Plus the function didn't get passed two separate arrays or
hashes: it got one long list in @_, as always.
507
If you can arrange for everyone to deal with this through references, it's
cleaner code, although not so nice to look at. Here's a function that
takes two array references as arguments, returning the two array references
in order of how many elements the arrays have in them:
512
513 ($aref, $bref) = func(\@c, \@d);
514 print "@$aref has more than @$bref\n";
515 sub func {
516 my ($cref, $dref) = @_;
517 if (@$cref > @$dref) {
518 return ($cref, $dref);
519 } else {
c07a80fd 520 return ($dref, $cref);
cb1a09d0 521 }
522 }
523
524It turns out that you can actually do this also:
525
526 (*a, *b) = func(\@c, \@d);
527 print "@a has more than @b\n";
528 sub func {
529 local (*c, *d) = @_;
530 if (@c > @d) {
531 return (\@c, \@d);
532 } else {
533 return (\@d, \@c);
534 }
535 }
536
537Here we're using the typeglobs to do symbol table aliasing. It's
538a tad subtle, though, and also won't work if you're using my()
539variables, since only globals (well, and local()s) are in the symbol table.
540
If you're passing around filehandles, you could usually just use the bare
typeglob, like *STDOUT, but typeglob references would be better because
they'll still work properly under C<use strict 'refs'>. For example:
544
545 splutter(\*STDOUT);
546 sub splutter {
547 my $fh = shift;
548 print $fh "her um well a hmmm\n";
549 }
550
551 $rec = get_rec(\*STDIN);
552 sub get_rec {
553 my $fh = shift;
554 return scalar <$fh>;
555 }
556
557If you're planning on generating new filehandles, you could do this:
558
    sub openit {
        my $path = shift;
        local *FH;
        return open (FH, $path) ? \*FH : undef;
    }
564
565Although that will actually produce a small memory leak. See the bottom
566of L<perlfunc/open()> for a somewhat cleaner way using the FileHandle
567functions supplied with the POSIX package.
568
=head2 Prototypes
570
571As of the 5.002 release of perl, if you declare
572
573 sub mypush (\@@)
574
c07a80fd 575then mypush() takes arguments exactly like push() does. The declaration
576of the function to be called must be visible at compile time. The prototype
577only affects the interpretation of new-style calls to the function, where
578new-style is defined as not using the C<&> character. In other words,
579if you call it like a builtin function, then it behaves like a builtin
580function. If you call it like an old-fashioned subroutine, then it
581behaves like an old-fashioned subroutine. It naturally falls out from
582this rule that prototypes have no influence on subroutine references
583like C<\&foo> or on indirect subroutine calls like C<&{$subref}>.
584
585Method calls are not influenced by prototypes either, because the
586function to be called is indeterminate at compile time, since it depends
587on inheritance.
cb1a09d0 588
c07a80fd 589Since the intent is primarily to let you define subroutines that work
590like builtin commands, here are the prototypes for some other functions
591that parse almost exactly like the corresponding builtins.
cb1a09d0 592
    Declared as                 Called as

    sub mylink ($$)             mylink $old, $new
    sub myvec ($$$)             myvec $var, $offset, 1
    sub myindex ($$;$)          myindex &getstring, "substr"
    sub mysyswrite ($$$;$)      mysyswrite $buf, 0, length($buf) - $off, $off
    sub myreverse (@)           myreverse $a,$b,$c
    sub myjoin ($@)             myjoin ":",$a,$b,$c
    sub mypop (\@)              mypop @array
    sub mysplice (\@$$@)        mysplice @array,@array,0,@pushme
    sub mykeys (\%)             mykeys %{$hashref}
    sub myopen (*;$)            myopen HANDLE, $name
    sub mypipe (**)             mypipe READHANDLE, WRITEHANDLE
    sub mygrep (&@)             mygrep { /foo/ } $a,$b,$c
    sub myrand ($)              myrand 42
    sub mytime ()               mytime
609
c07a80fd 610Any backslashed prototype character represents an actual argument
6e47f808 611that absolutely must start with that character. The value passed
612to the subroutine (as part of C<@_>) will be a reference to the
613actual argument given in the subroutine call, obtained by applying
614C<\> to that argument.
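
As an illustration of a backslashed prototype character, the mypush() declared at the start of this section might be fleshed out like this. This is only a sketch, assuming a current perl; the body and the return value are not taken from any real implementation:

```perl
use strict;
use warnings;

# Defined before use so the prototype is known when the call is compiled.
sub mypush (\@@) {
    my ($aref, @items) = @_;     # the first argument arrives as \@array
    push @$aref, @items;
    return scalar @$aref;        # new element count, like the builtin
}

my @stack = (1, 2);
my $count = mypush @stack, 3, 4; # called like push(): no backslash needed
print "$count: @stack\n";
```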
c07a80fd 615
616Unbackslashed prototype characters have special meanings. Any
617unbackslashed @ or % eats all the rest of the arguments, and forces
618list context. An argument represented by $ forces scalar context. An
619& requires an anonymous subroutine, which, if passed as the first
620argument, does not require the "sub" keyword or a subsequent comma. A
621* does whatever it has to do to turn the argument into a reference to a
622symbol table entry.
623
624A semicolon separates mandatory arguments from optional arguments.
625(It is redundant before @ or %.)
cb1a09d0 626
c07a80fd 627Note how the last three examples above are treated specially by the parser.
cb1a09d0 628mygrep() is parsed as a true list operator, myrand() is parsed as a
629true unary operator with unary precedence the same as rand(), and
630mytime() is truly argumentless, just like time(). That is, if you
631say
632
633 mytime +2;
634
635you'll get mytime() + 2, not mytime(2), which is how it would be parsed
636without the prototype.
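
The difference in parsing can be checked with a pair of stand-in subroutines. In this sketch, which assumes a current perl, mytime() returns a fixed number rather than a real time, and anytime() is a hypothetical unprototyped twin that also counts its arguments:

```perl
use strict;
use warnings;

sub mytime ()  { return 40 }          # empty prototype: takes no arguments
sub anytime    { return 40 + @_ }     # no prototype: an ordinary list operator

my $with    = mytime +2;              # parsed as mytime() + 2
my $without = anytime +2;             # parsed as anytime(+2)
print "$with $without\n";
```

With the empty prototype the C<+2> is added to the result; without it, the C<+2> is swallowed as the single argument.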
637
638The interesting thing about & is that you can generate new syntax with it:
639
640 sub try (&$) {
641 my($try,$catch) = @_;
642 eval { &$try };
643 if ($@) {
644 local $_ = $@;
645 &$catch;
646 }
647 }
648 sub catch (&) { @_ }
649
650 try {
651 die "phooey";
652 } catch {
653 /phooey/ and print "unphooey\n";
654 };
655
656That prints "unphooey". (Yes, there are still unresolved
657issues having to do with the visibility of @_. I'm ignoring that
658question for the moment. (But note that if we make @_ lexically
659scoped, those anonymous subroutines can act like closures... (Gee,
660is this sounding a little Lispish? (Nevermind.))))
661
662And here's a reimplementation of grep:
663
664 sub mygrep (&@) {
665 my $code = shift;
666 my @result;
667 foreach $_ (@_) {
6e47f808 668 push(@result, $_) if &$code;
cb1a09d0 669 }
670 @result;
671 }
a0d0e21e 672
cb1a09d0 673Some folks would prefer full alphanumeric prototypes. Alphanumerics have
674been intentionally left out of prototypes for the express purpose of
675someday in the future adding named, formal parameters. The current
676mechanism's main goal is to let module writers provide better diagnostics
677for module users. Larry feels the notation quite understandable to Perl
678programmers, and that it will not intrude greatly upon the meat of the
679module, nor make it harder to read. The line noise is visually
680encapsulated into a small pill that's easy to swallow.
681
682It's probably best to prototype new functions, not retrofit prototyping
683into older ones. That's because you must be especially careful about
684silent impositions of differing list versus scalar contexts. For example,
685if you decide that a function should take just one parameter, like this:
686
687 sub func ($) {
688 my $n = shift;
689 print "you gave me $n\n";
690 }
691
692and someone has been calling it with an array or expression
693returning a list:
694
695 func(@foo);
696 func( split /:/ );
697
Then you've just supplied an automatic scalar() in front of their
argument, which can be more than a bit surprising. The old @foo
which used to hold one thing doesn't get passed in. Instead,
the func() now gets passed in 1, that is, the number of elements
in @foo. And the split() gets called in a scalar context and
starts scribbling on your @_ parameter list.
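
The silent change of context is easy to reproduce. A minimal sketch, assuming a current perl; plain() and proto() are hypothetical stand-ins for the function before and after a C<($)> prototype is added:

```perl
use strict;
use warnings;

sub plain     { return $_[0] }   # no prototype: arguments are flattened
sub proto ($) { return $_[0] }   # ($) forces scalar context on the argument

my @foo = ("only");
my $old = plain(@foo);           # the first element of @foo
my $new = proto(@foo);           # scalar(@foo), that is, the element count
print "$old $new\n";
```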
704
705This is all very powerful, of course, and should only be used in moderation
706to make the world a better place.
707
=head2 Overriding Builtin Functions
a0d0e21e 709
710Many builtin functions may be overridden, though this should only be
711tried occasionally and for good reason. Typically this might be
712done by a package attempting to emulate missing builtin functionality
713on a non-Unix system.
714
715Overriding may only be done by importing the name from a
716module--ordinary predeclaration isn't good enough. However, the
717C<subs> pragma (compiler directive) lets you, in effect, predeclare subs
718via the import syntax, and these names may then override the builtin ones:
719
720 use subs 'chdir', 'chroot', 'chmod', 'chown';
721 chdir $somewhere;
722 sub chdir { ... }
723
724Library modules should not in general export builtin names like "open"
725or "chdir" as part of their default @EXPORT list, since these may
726sneak into someone else's namespace and change the semantics unexpectedly.
727Instead, if the module adds the name to the @EXPORT_OK list, then it's
728possible for a user to import the name explicitly, but not implicitly.
729That is, they could say
730
731 use Module 'open';
732
733and it would import the open override, but if they said
734
735 use Module;
736
737they would get the default imports without the overrides.
738
=head2 Autoloading
740
741If you call a subroutine that is undefined, you would ordinarily get an
742immediate fatal error complaining that the subroutine doesn't exist.
743(Likewise for subroutines being used as methods, when the method
744doesn't exist in any of the base classes of the class package.) If,
745however, there is an C<AUTOLOAD> subroutine defined in the package or
746packages that were searched for the original subroutine, then that
747C<AUTOLOAD> subroutine is called with the arguments that would have been
748passed to the original subroutine. The fully qualified name of the
749original subroutine magically appears in the $AUTOLOAD variable in the
750same package as the C<AUTOLOAD> routine. The name is not passed as an
751ordinary argument because, er, well, just because, that's why...
752
753Most C<AUTOLOAD> routines will load in a definition for the subroutine in
754question using eval, and then execute that subroutine using a special
755form of "goto" that erases the stack frame of the C<AUTOLOAD> routine
756without a trace. (See the standard C<AutoLoader> module, for example.)
757But an C<AUTOLOAD> routine can also just emulate the routine and never
cb1a09d0 758define it. For example, let's pretend that a function that wasn't defined
759should just call system() with those arguments. All you'd do is this:
760
    sub AUTOLOAD {
        my $program = $AUTOLOAD;
        $program =~ s/.*:://;
        system($program, @_);
    }
    date();
    who('am', 'i');
    ls('-l');
769
In fact, if you predeclare the functions you want to call that way, you don't
even need the parentheses:
772
773 use subs qw(date who ls);
774 date;
775 who "am", "i";
776 ls -l;
777
778A more complete example of this is the standard Shell module, which
a0d0e21e 779can treat undefined subroutine calls as calls to Unix programs.
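
The other style mentioned above, where C<AUTOLOAD> defines the missing routine and then jumps to it with the special form of "goto", can be sketched as follows. This assumes a current perl, and it installs an anonymous sub through a typeglob assignment rather than using eval; the Counter package and its %field table are hypothetical:

```perl
use strict;
use warnings;

package Counter;                      # hypothetical package for illustration

our $AUTOLOAD;
my %field = (hits => 0, misses => 0); # data the generated subs will update

sub AUTOLOAD {
    my $name = $AUTOLOAD;             # fully qualified, e.g. Counter::hits
    $name =~ s/.*:://;
    return if $name eq 'DESTROY';     # don't autoload destructors
    die "No such routine: $name" unless exists $field{$name};
    no strict 'refs';
    *{$AUTOLOAD} = sub { return ++$field{$name} };  # define it for next time
    goto &{$AUTOLOAD};                # replace this frame with the new sub
}

package main;

my $first  = Counter::hits();         # undefined, so it goes through AUTOLOAD
my $second = Counter::hits();         # now calls the installed sub directly
my $known  = defined &Counter::hits ? "defined" : "missing";
print "$first $second $known\n";
```

After the first call the routine exists in the symbol table, so later calls no longer pay the cost of going through C<AUTOLOAD>.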
780
Mechanisms are available for module writers to help split the modules
up into autoloadable files. See the standard AutoLoader module described
in L<AutoLoader>, the standard SelfLoader module in L<SelfLoader>, and
the document on adding C functions to perl code in L<perlxs>.
785
=head1 SEE ALSO
a0d0e21e 787
cb1a09d0 788See L<perlref> for more on references. See L<perlxs> if you'd
789like to learn about calling C subroutines from perl. See
790L<perlmod> to learn about bundling up your functions in
791separate files.