=head1 NAME

perlsub - Perl subroutines

=head1 SYNOPSIS

To declare subroutines:

    sub NAME;             # A "forward" declaration.
    sub NAME(PROTO);      #  ditto, but with prototypes

    sub NAME BLOCK        # A declaration and a definition.
    sub NAME(PROTO) BLOCK #  ditto, but with prototypes

To define an anonymous subroutine at runtime:

    $subref = sub BLOCK;

To import subroutines:

    use PACKAGE qw(NAME1 NAME2 NAME3);

To call subroutines:

    NAME(LIST);    # & is optional with parentheses.
    NAME LIST;     # Parentheses optional if pre-declared/imported.
    &NAME;         # Passes current @_ to subroutine.

=head1 DESCRIPTION

Like many languages, Perl provides for user-defined subroutines. These
may be located anywhere in the main program, loaded in from other files
via the C<do>, C<require>, or C<use> keywords, or even generated on the
fly using C<eval> or anonymous subroutines (closures). You can even call
a function indirectly using a variable containing its name or a CODE reference
to it, as in C<$var = \&function>.

The Perl model for function call and return values is simple: all
functions are passed as parameters one single flat list of scalars, and
all functions likewise return to their caller one single flat list of
scalars. Any arrays or hashes in these call and return lists will
collapse, losing their identities--but you may always use
pass-by-reference instead to avoid this. Both call and return lists may
contain as many or as few scalar elements as you'd like. (Often a
function without an explicit return statement is called a subroutine, but
there's really no difference from the language's perspective.)

Any arguments passed to the routine come in as the array @_. Thus if you
called a function with two arguments, those would be stored in C<$_[0]>
and C<$_[1]>. The array @_ is a local array, but its elements are
aliases for the actual scalar parameters. In particular, if an element
C<$_[0]> is updated, the corresponding argument is updated (or an error
occurs if it is not updatable). If an argument is an array or hash
element which did not exist when the function was called, that element is
created only when (and if) it is modified or if a reference to it is
taken. (Some earlier versions of Perl created the element whether or not
it was assigned to.) Note that assigning to the whole array @_ removes
the aliasing, and does not update any arguments.
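
As a rough sketch of the aliasing rules just described (the subroutine
names bump(), reset_args(), and peek() are made up for illustration and
are not part of Perl):

```perl
use strict;
use warnings;

# Hypothetical helper: changes the caller's variable through the alias.
sub bump { $_[0]++ }

# Hypothetical helper: assigning to all of @_ replaces the aliases,
# so the increment touches only the new element.
sub reset_args {
    @_ = (100);
    $_[0]++;
}

# Hypothetical helper: looks at its argument without modifying it.
sub peek { return defined $_[0] }

my $n = 1;
bump($n);           # $n is now 2
reset_args($n);     # $n is still 2

my %h;
peek($h{missing});  # never modified, so the element is not created

print "n = $n, elements in hash = ", scalar(keys %h), "\n";
```

Only bump() reaches back into the caller, because it is the only one that
writes through an element of the original @_.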

The return value of the subroutine is the value of the last expression
evaluated. Alternatively, a return statement may be used to specify the
returned value and exit the subroutine. If you return one or more arrays
and/or hashes, these will be flattened together into one large
indistinguishable list.

Perl does not have named formal parameters, but in practice all you do is
assign @_ to a my() list of variables. Any variables you use in the function
that aren't declared private are global variables. For the gory details
on creating private variables, see
L<"Private Variables via my()"> and L<"Temporary Values via local()">.
To create protected environments for a set of functions in a separate
package (and probably a separate file), see L<perlmod/"Packages">.

Example:

    sub max {
        my $max = shift(@_);
        foreach my $foo (@_) {
            $max = $foo if $max < $foo;
        }
        return $max;
    }
    $bestday = max($mon,$tue,$wed,$thu,$fri);

Example:

    # get a line, combining continuation lines
    #  that start with whitespace

    sub get_line {
        $thisline = $lookahead;  # GLOBAL VARIABLES!!
        LINE: while ($lookahead = <STDIN>) {
            if ($lookahead =~ /^[ \t]/) {
                $thisline .= $lookahead;
            }
            else {
                last LINE;
            }
        }
        $thisline;
    }

    $lookahead = <STDIN>;       # get first line
    while ($_ = get_line()) {
        ...
    }

Use array assignment to a local list to name your formal arguments:

    sub maybeset {
        my($key, $value) = @_;
        $Foo{$key} = $value unless $Foo{$key};
    }

This also has the effect of turning call-by-reference into call-by-value,
because the assignment copies the values. Otherwise a function is free to
do in-place modifications of @_ and change its caller's values.

    upcase_in($v1, $v2);  # this changes $v1 and $v2
    sub upcase_in {
        for (@_) { tr/a-z/A-Z/ }
    }

You aren't allowed to modify constants in this way, of course. If an
argument were actually literal and you tried to change it, you'd take a
(presumably fatal) exception. For example, this won't work:

    upcase_in("frederick");

It would be much safer if the upcase_in() function
were written to return a copy of its parameters instead
of changing them in place:

    ($v3, $v4) = upcase($v1, $v2);  # this doesn't change $v1 and $v2
    sub upcase {
        my @parms = @_;
        for (@parms) { tr/a-z/A-Z/ }
        # wantarray checks if we were called in list context
        return wantarray ? @parms : $parms[0];
    }

Notice how this (unprototyped) function doesn't care whether it was passed
real scalars or arrays. Perl will see everything as one big long flat @_
parameter list. This is one of the areas where Perl's simple
argument-passing style shines. The upcase() function would work perfectly
well without changing the upcase() definition even if we fed it things
like this:

    @newlist   = upcase(@list1, @list2);
    @newlist   = upcase( split /:/, $var );

Do not, however, be tempted to do this:

    (@a, @b)   = upcase(@list1, @list2);

Like its flat incoming parameter list, the return list is also
flat. So all you have managed to do here is store everything in @a and
make @b an empty list. See L</"Pass by Reference"> for alternatives.

A subroutine may be called using the "&" prefix. The "&" is optional
in modern Perls, and so are the parentheses if the subroutine has been
pre-declared. (Note, however, that the "&" is I<NOT> optional when
you're just naming the subroutine, such as when it's used as an
argument to defined() or undef(). Nor is it optional when you want to
do an indirect subroutine call with a subroutine name or reference
using the C<&$subref()> or C<&{$subref}()> constructs. See L<perlref>
for more on that.)

Subroutines may be called recursively. If a subroutine is called using
the "&" form, the argument list is optional, and if omitted, no @_ array is
set up for the subroutine: the @_ array at the time of the call is
visible to the subroutine instead. This is an efficiency mechanism that
new users may wish to avoid.

    &foo(1,2,3);        # pass three arguments
    foo(1,2,3);         # the same

    foo();              # pass a null list
    &foo();             # the same

    &foo;               # foo() gets current args, like foo(@_) !!
    foo;                # like foo() IFF sub foo pre-declared, else "foo"
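
To see what sharing the caller's @_ means in practice, here is a small
sketch (inner(), outer_share(), and outer_empty() are hypothetical names
chosen for the example):

```perl
use strict;
use warnings;

# How many arguments can inner() see?
sub inner { return scalar @_; }

# No argument list: inner() sees outer_share()'s own @_.
sub outer_share { return &inner; }

# Explicit empty list: inner() gets a fresh, empty @_.
sub outer_empty { return &inner(); }

my $shared = outer_share(1, 2, 3);   # 3
my $empty  = outer_empty(1, 2, 3);   # 0

print "shared: $shared, empty: $empty\n";
```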

Not only does the "&" form make the argument list optional, but it also
disables any prototype checking on the arguments you do provide. This
is partly for historical reasons, and partly for having a convenient way
to cheat if you know what you're doing. See the section on Prototypes below.
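
A minimal sketch of the difference, assuming a hypothetical function
first_arg() declared with a C<($)> prototype:

```perl
use strict;
use warnings;

# Hypothetical function that expects exactly one scalar.
sub first_arg ($) { return $_[0] }

my @items = (10, 20, 30);

# Prototype applies: @items is evaluated in scalar context, giving 3.
my $checked = first_arg(@items);

# Prototype bypassed: @_ is (10, 20, 30), so the result is 10.
my $unchecked = &first_arg(@items);

print "checked: $checked, unchecked: $unchecked\n";
```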

=head2 Private Variables via my()

Synopsis:

    my $foo;            # declare $foo lexically local
    my (@wid, %get);    # declare list of variables local
    my $foo = "flurp";  # declare $foo lexical, and init it
    my @oof = @bar;     # declare @oof lexical, and init it

A "my" declares the listed variables to be confined (lexically) to the
enclosing block, conditional (C<if/unless/elsif/else>), loop
(C<for/foreach/while/until/continue>), subroutine, C<eval>, or
C<do/require/use>'d file. If more than one value is listed, the list
must be placed in parentheses. All listed elements must be legal lvalues.
Only alphanumeric identifiers may be lexically scoped--magical
builtins like $/ must currently be localized with "local" instead.

Unlike dynamic variables created by the "local" statement, lexical
variables declared with "my" are totally hidden from the outside world,
including any called subroutines (even if it's the same subroutine called
from itself or elsewhere--every call gets its own copy).
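
The contrast with "local" can be sketched like this (the variable
C<$level> and the three subroutines are invented for the example, and
C<our> is used only so the sketch compiles under C<use strict>):

```perl
use strict;
use warnings;

our $level = "outer";               # a package variable

sub show_level { return $level }    # always reads the package variable

sub with_local {
    local $level = "inner";         # temporary value, visible to called subs
    return show_level();
}

sub with_my {
    my $level = "inner";            # lexical, invisible outside this block
    return show_level();
}

my $via_local = with_local();       # "inner"
my $via_my    = with_my();          # "outer"

print "local: $via_local, my: $via_my\n";
```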

(An eval(), however, can see the lexical variables of the scope it is
being evaluated in so long as the names aren't hidden by declarations within
the eval() itself. See L<perlref>.)
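
For instance, in this small sketch (the variable names are arbitrary) the
string handed to eval() can read C<$secret> because it is compiled within
the scope where C<$secret> was declared:

```perl
use strict;
use warnings;

my $secret = 42;

# The eval sees the enclosing lexical.
my $seen = eval '$secret + 1';

# A declaration inside the eval hides the outer variable.
my $hidden = eval 'my $secret = 1; $secret';

print "seen: $seen, hidden: $hidden\n";
```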

The parameter list to my() may be assigned to if desired, which allows you
to initialize your variables. (If no initializer is given for a
particular variable, it is created with the undefined value.) Commonly
this is used to name the parameters to a subroutine. Examples:

    $arg = "fred";        # "global" variable
    $n = cube_root(27);
    print "$arg thinks the root is $n\n";
 fred thinks the root is 3

    sub cube_root {
        my $arg = shift;  # name doesn't matter
        $arg **= 1/3;
        return $arg;
    }

The "my" is simply a modifier on something you might assign to. So when
you do assign to the variables in its argument list, the "my" doesn't
change whether those variables are viewed as a scalar or an array. So

    my ($foo) = <STDIN>;
    my @FOO = <STDIN>;

both supply a list context to the right-hand side, while

    my $foo = <STDIN>;

supplies a scalar context. But the following declares only one variable:

    my $foo, $bar = 1;

That has the same effect as

    my $foo;
    $bar = 1;

The declared variable is not introduced (is not visible) until after
the current statement. Thus,

    my $x = $x;

can be used to initialize the new $x with the value of the old $x, and
the expression

    my $x = 123 and $x == 123

is false unless the old $x happened to have the value 123.
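
A short sketch of this rule follows. It assumes the "old" $x is a package
variable declared with C<our>, purely so the example compiles under
C<use strict>; C<$copy> and C<$test> are illustrative names.

```perl
use strict;
use warnings;

our $x = 10;                  # the "old" $x
my ($copy, $test);

{
    my $x = $x + 1;           # the right-hand $x is still the old one
    $copy = $x;               # 11
}

{
    # The second $x is still the old $x (10), so the test is false.
    $test = (my $x = 123 and $x == 123);
}

print "copy: $copy, test is ", ($test ? "true" : "false"), "\n";
```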

Lexical scopes of control structures are not bounded precisely by the
braces that delimit their controlled blocks; control expressions are
part of the scope, too. Thus in the loop

    while (my $line = <>) {
        $line = lc $line;
    } continue {
        print $line;
    }

the scope of $line extends from its declaration throughout the rest of
the loop construct (including the C<continue> clause), but not beyond
it. Similarly, in the conditional

    if ((my $answer = <STDIN>) =~ /^yes$/i) {
        user_agrees();
    } elsif ($answer =~ /^no$/i) {
        user_disagrees();
    } else {
        chomp $answer;
        die "'$answer' is neither 'yes' nor 'no'";
    }

the scope of $answer extends from its declaration throughout the rest
of the conditional (including C<elsif> and C<else> clauses, if any),
but not beyond it.

(None of the foregoing applies to C<if/unless> or C<while/until>
modifiers appended to simple statements. Such modifiers are not
control structures and have no effect on scoping.)

The C<foreach> loop defaults to scoping its index variable dynamically
(in the manner of C<local>; see below). However, if the index
variable is prefixed with the keyword "my", then it is lexically
scoped instead. Thus in the loop

    for my $i (1, 2, 3) {
        some_function();
    }

the scope of $i extends to the end of the loop, but not beyond it, and
so the value of $i is unavailable in some_function().

Some users may wish to encourage the use of lexically scoped variables.
As an aid to catching implicit references to package variables,
if you say

    use strict 'vars';

then any variable reference from there to the end of the enclosing
block must either refer to a lexical variable, or must be fully
qualified with the package name. A compilation error results
otherwise. An inner block may countermand this with S<"no strict 'vars'">.

A my() has both a compile-time and a run-time effect. At compile time,
the compiler takes notice of it; the principal usefulness of this is to
quiet C<use strict 'vars'>. The actual initialization doesn't happen
until run time, so it gets executed every time through a loop.
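
Because the run-time part of my() is executed on every pass, each
iteration gets a fresh variable. A small sketch (all names hypothetical):

```perl
use strict;
use warnings;

my @subs;
for my $i (1 .. 3) {
    my $count = $i * 10;                 # executed anew on every iteration
    push @subs, sub { return $count };   # each closure keeps its own $count
}

my @values = map { $_->() } @subs;       # (10, 20, 30)

print "@values\n";
```

If $count were a single shared variable, all three closures would report
the same number.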

Variables declared with "my" are not part of any package and are therefore
never fully qualified with the package name. In particular, you're not
allowed to try to make a package variable (or other global) lexical:

    my $pack::var;      # ERROR!  Illegal syntax
    my $_;              # also illegal (currently)

In fact, dynamic variables (also known as package or global variables)
are still accessible using the fully qualified :: notation even while a
lexical of the same name is also visible:

    package main;
    local $x = 10;
    my    $x = 20;
    print "$x and $::x\n";

That will print out 20 and 10.

You may declare "my" variables at the outermost scope of a file to
hide any such identifiers totally from the outside world. This is similar
to C's static variables at the file level. To do this with a subroutine
requires the use of a closure (anonymous function). If a block (such as
an eval(), function, or C<package>) wants to create a private subroutine
that cannot be called from outside that block, it can declare a lexical
variable containing an anonymous sub reference:

    my $secret_version = '1.001-beta';
    my $secret_sub = sub { print $secret_version };
    &$secret_sub();

As long as the reference is never returned by any function within the
module, no outside module can see the subroutine, because its name is not in
any package's symbol table. Remember that it's not I<REALLY> called
$some_pack::secret_version or anything; it's just $secret_version,
unqualified and unqualifiable.

This does not work with object methods, however; all object methods have
to be in the symbol table of some package to be found.

Just because the lexical variable is lexically (also called statically)
scoped doesn't mean that within a function it works like a C static. It
normally works more like a C auto. But here's a mechanism for giving a
function private variables with both lexical scoping and a static
lifetime. If you do want to create something like C's static variables,
just enclose the whole function in an extra block, and put the
static variable outside the function but in the block.

    {
        my $secret_val = 0;
        sub gimme_another {
            return ++$secret_val;
        }
    }
    # $secret_val now becomes unreachable by the outside
    # world, but retains its value between calls to gimme_another

If this function is being sourced in from a separate file
via C<require> or C<use>, then this is probably just fine. If it's
all in the main program, you'll need to arrange for the my()
to be executed early, either by putting the whole block above
your main program, or more likely, placing merely a BEGIN
sub around it to make sure it gets executed before your program
starts to run:

    sub BEGIN {
        my $secret_val = 0;
        sub gimme_another {
            return ++$secret_val;
        }
    }

See L<perlmod> about the BEGIN function.

=head2 Temporary Values via local()

B<NOTE>: In general, you should be using "my" instead of "local", because
it's faster and safer. Exceptions to this include the global punctuation
variables, filehandles and formats, and direct manipulation of the Perl
symbol table itself. Format variables often use "local" though, as do
other variables whose current value must be visible to called
subroutines.

Synopsis:

    local $foo;                 # declare $foo dynamically local
    local (@wid, %get);         # declare list of variables local
    local $foo = "flurp";       # declare $foo dynamic, and init it
    local @oof = @bar;          # declare @oof dynamic, and init it

    local *FH;                  # localize $FH, @FH, %FH, &FH ...
    local *merlyn = *randal;    # now $merlyn is really $randal, plus
                                #     @merlyn is really @randal, etc
    local *merlyn = 'randal';   # SAME THING: promote 'randal' to *randal
    local *merlyn = \$randal;   # just alias $merlyn, not @merlyn etc

A local() modifies its listed variables to be local to the enclosing
block (or subroutine, C<eval{}>, or C<do>) and to I<any subroutine called
from within that block>. A local() just gives temporary values to global
(meaning package) variables. This is known as dynamic scoping. Lexical
scoping is done with "my", which works more like C's auto declarations.

If more than one variable is given to local(), they must be placed in
parentheses. All listed elements must be legal lvalues. This operator works
by saving the current values of those variables in its argument list on a
hidden stack and restoring them upon exiting the block, subroutine, or
eval. This means that called subroutines can also reference the local
variable, but not the global one. The argument list may be assigned to if
desired, which allows you to initialize your local variables. (If no
initializer is given for a particular variable, it is created with an
undefined value.) Commonly this is used to name the parameters to a
subroutine. Examples:

    for $i ( 0 .. 9 ) {
        $digits{$i} = $i;
    }
    # assume this function uses global %digits hash
    parse_num();

    # now temporarily add to %digits hash
    if ($base12) {
        # (NOTE: not claiming this is efficient!)
        local %digits = (%digits, 't' => 10, 'e' => 11);
        parse_num();  # parse_num gets this new %digits!
    }
    # old %digits restored here

Because local() is a run-time command, it gets executed every time
through a loop. In releases of Perl previous to 5.0, this used more stack
storage each time until the loop was exited. Perl now reclaims the space
each time through, but it's still more efficient to declare your variables
outside the loop.

A local is simply a modifier on an lvalue expression. When you assign to
a localized variable, the local doesn't change whether its list is viewed
as a scalar or an array. So

    local($foo) = <STDIN>;
    local @FOO = <STDIN>;

both supply a list context to the right-hand side, while

    local $foo = <STDIN>;

supplies a scalar context.

=head2 Passing Symbol Table Entries (typeglobs)

[Note: The mechanism described in this section was originally the only
way to simulate pass-by-reference in older versions of Perl. While it
still works fine in modern versions, the new reference mechanism is
generally easier to work with. See below.]

Sometimes you don't want to pass the value of an array to a subroutine
but rather the name of it, so that the subroutine can modify the global
copy of it rather than working with a local copy. In perl you can
refer to all objects of a particular name by prefixing the name
with a star: C<*foo>. This is often known as a "typeglob", because the
star on the front can be thought of as a wildcard match for all the
funny prefix characters on variables and subroutines and such.

When evaluated, the typeglob produces a scalar value that represents
all the objects of that name, including any filehandle, format, or
subroutine. When assigned to, it causes the name mentioned to refer to
whatever "*" value was assigned to it. Example:

    sub doubleary {
        local(*someary) = @_;
        foreach $elem (@someary) {
            $elem *= 2;
        }
    }
    doubleary(*foo);
    doubleary(*bar);

Note that scalars are already passed by reference, so you can modify
scalar arguments without using this mechanism by referring explicitly
to C<$_[0]> etc. You can modify all the elements of an array by passing
all the elements as scalars, but you have to use the * mechanism (or
the equivalent reference mechanism) to push, pop, or change the size of
an array. It will certainly be faster to pass the typeglob (or reference).

Even if you don't want to modify an array, this mechanism is useful for
passing multiple arrays in a single LIST, because normally the LIST
mechanism will merge all the array values so that you can't extract out
the individual arrays. For more on typeglobs, see
L<perldata/"Typeglobs and Filehandles">.

=head2 Pass by Reference

If you want to pass more than one array or hash into a function--or
return them from it--and have them maintain their integrity, then
you're going to have to use an explicit pass-by-reference. Before you
do that, you need to understand references as detailed in L<perlref>.
This section may not make much sense to you otherwise.

Here are a few simple examples. First, let's pass in several
arrays to a function and have it pop all of them, returning a new
list of all their former last elements:

    @tailings = popmany ( \@a, \@b, \@c, \@d );

    sub popmany {
        my $aref;
        my @retlist = ();
        foreach $aref ( @_ ) {
            push @retlist, pop @$aref;
        }
        return @retlist;
    }

Here's how you might write a function that returns a
list of keys occurring in all the hashes passed to it:

    @common = inter( \%foo, \%bar, \%joe );
    sub inter {
        my ($k, $href, %seen); # locals
        foreach $href (@_) {
            # defined() keeps the loop going for false keys such as "0"
            while ( defined($k = each %$href) ) {
                $seen{$k}++;
            }
        }
        return grep { $seen{$_} == @_ } keys %seen;
    }

So far, we're using just the normal list return mechanism.
What happens if you want to pass or return a hash? Well,
if you're using only one of them, or you don't mind them
concatenating, then the normal calling convention is ok, although
a little expensive.

Where people get into trouble is here:

    (@a, @b) = func(@c, @d);
or
    (%a, %b) = func(%c, %d);

That syntax simply won't work. It sets just @a or %a and clears the @b or
%b. Plus the function didn't get passed two separate arrays or
hashes: it got one long list in @_, as always.

If you can arrange for everyone to deal with this through references, it's
cleaner code, although not so nice to look at. Here's a function that
takes two array references as arguments, returning the two array references
in order of how many elements they have in them:

    ($aref, $bref) = func(\@c, \@d);
    print "@$aref has more than @$bref\n";
    sub func {
        my ($cref, $dref) = @_;
        if (@$cref > @$dref) {
            return ($cref, $dref);
        } else {
            return ($dref, $cref);
        }
    }

It turns out that you can actually do this also:

    (*a, *b) = func(\@c, \@d);
    print "@a has more than @b\n";
    sub func {
        local (*c, *d) = @_;
        if (@c > @d) {
            return (\@c, \@d);
        } else {
            return (\@d, \@c);
        }
    }

Here we're using the typeglobs to do symbol table aliasing. It's
a tad subtle, though, and also won't work if you're using my()
variables, because only globals (well, and local()s) are in the symbol table.

If you're passing around filehandles, you could usually just use the bare
typeglob, like *STDOUT, but typeglob references would be better because
they'll still work properly under C<use strict 'refs'>. For example:

    splutter(\*STDOUT);
    sub splutter {
        my $fh = shift;
        print $fh "her um well a hmmm\n";
    }

    $rec = get_rec(\*STDIN);
    sub get_rec {
        my $fh = shift;
        return scalar <$fh>;
    }

Another way to do this is using *HANDLE{IO}; see L<perlref> for usage
and caveats.

If you're planning on generating new filehandles, you could do this:

    sub openit {
        my $path = shift;
        local *FH;
        return open (FH, $path) ? *FH : undef;
    }

That will, however, produce a small memory leak. See the bottom
of L<perlfunc/open()> for a somewhat cleaner way using the IO::Handle
package.
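
One possible shape of that cleaner approach is sketched below. It uses
IO::File, a subclass of IO::Handle; whether this matches what
L<perlfunc/open()> recommends for your version of Perl is an assumption,
so check there first. The path in the usage line is deliberately bogus.

```perl
use strict;
use warnings;
use IO::File;

# Sketch of openit() that returns an IO::File object instead of a
# localized typeglob.  IO::File->new returns undef if the open fails.
sub openit {
    my $path = shift;
    return IO::File->new($path, "r");
}

my $fh = openit("/no/such/file/we/hope");
print "open failed: $!\n" unless defined $fh;
```

The handle is an ordinary scalar, so it is closed when the last reference
to it goes away.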

=head2 Prototypes

As of the 5.002 release of perl, if you declare

    sub mypush (\@@)

then mypush() takes arguments exactly like push() does. The declaration
of the function to be called must be visible at compile time. The prototype
affects only the interpretation of new-style calls to the function, where
new-style is defined as not using the C<&> character. In other words,
if you call it like a builtin function, then it behaves like a builtin
function. If you call it like an old-fashioned subroutine, then it
behaves like an old-fashioned subroutine. It naturally falls out from
this rule that prototypes have no influence on subroutine references
like C<\&foo> or on indirect subroutine calls like C<&{$subref}>.

Method calls are not influenced by prototypes either, because the
function to be called is indeterminate at compile time, since it depends
on inheritance.

Because the intent is primarily to let you define subroutines that work
like builtin commands, here are the prototypes for some other functions
that parse almost exactly like the corresponding builtins.

    Declared as                 Called as

    sub mylink ($$)             mylink $old, $new
    sub myvec ($$$)             myvec $var, $offset, 1
    sub myindex ($$;$)          myindex &getstring, "substr"
    sub mysyswrite ($$$;$)      mysyswrite $buf, 0, length($buf) - $off, $off
    sub myreverse (@)           myreverse $a,$b,$c
    sub myjoin ($@)             myjoin ":",$a,$b,$c
    sub mypop (\@)              mypop @array
    sub mysplice (\@$$@)        mysplice @array,@array,0,@pushme
    sub mykeys (\%)             mykeys %{$hashref}
    sub myopen (*;$)            myopen HANDLE, $name
    sub mypipe (**)             mypipe READHANDLE, WRITEHANDLE
    sub mygrep (&@)             mygrep { /foo/ } $a,$b,$c
    sub myrand ($)              myrand 42
    sub mytime ()               mytime

Any backslashed prototype character represents an actual argument
that absolutely must start with that character. The value passed
to the subroutine (as part of C<@_>) will be a reference to the
actual argument given in the subroutine call, obtained by applying
C<\> to that argument.
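
As an illustration, here is one way the mypush() declared above might be
filled in. The body is only a sketch and is not taken from Perl itself.

```perl
use strict;
use warnings;

# Sketch of a push() lookalike.  Thanks to the \@ in the prototype,
# the array named in the call arrives as a reference in $_[0].
sub mypush (\@@) {
    my ($aref, @new) = @_;
    push @$aref, @new;
    return scalar @$aref;
}

my @stack = (1, 2);
mypush @stack, 3, 4;    # called like the builtin; @stack grows in place

print "@stack\n";
```

The caller writes C<@stack>, not C<\@stack>; the prototype supplies the
backslash.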

Unbackslashed prototype characters have special meanings. Any
unbackslashed @ or % eats all the rest of the arguments, and forces
list context. An argument represented by $ forces scalar context. An
& requires an anonymous subroutine, which, if passed as the first
argument, does not require the "sub" keyword or a subsequent comma. A
* does whatever it has to do to turn the argument into a reference to a
symbol table entry.

A semicolon separates mandatory arguments from optional arguments.
(It is redundant before @ or %.)
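
For example, a myindex() with the C<($$;$)> prototype from the table
could be sketched as follows. The body is an assumption made for
illustration; it simply defers to the builtin index().

```perl
use strict;
use warnings;

# Two mandatory scalars, then one optional scalar after the semicolon.
sub myindex ($$;$) {
    my ($string, $substr, $pos) = @_;
    $pos = 0 unless defined $pos;     # default when the third argument is omitted
    return index($string, $substr, $pos);
}

my $first = myindex("hello world", "o");       # 4
my $next  = myindex("hello world", "o", 5);    # 7

print "first: $first, next: $next\n";
```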

Note how the last three examples above are treated specially by the parser.
mygrep() is parsed as a true list operator, myrand() is parsed as a
true unary operator with unary precedence the same as rand(), and
mytime() is truly without arguments, just like time(). That is, if you
say

    mytime +2;

you'll get mytime() + 2, not mytime(2), which is how it would be parsed
without the prototype.
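
The difference can be checked with a small sketch. Here mytime() is a
stand-in that returns a fixed number, and plaintime() is a hypothetical
twin without the prototype:

```perl
use strict;
use warnings;

sub mytime ()  { 100 }            # empty prototype: takes no arguments
sub plaintime  { return 100 }     # no prototype: an ordinary list operator

my $with_proto    = mytime +2;    # parsed as mytime() + 2
my $without_proto = plaintime +2; # parsed as plaintime(+2); the 2 is ignored

print "with: $with_proto, without: $without_proto\n";
```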

The interesting thing about & is that you can generate new syntax with it:

    sub try (&@) {
        my($try,$catch) = @_;
        eval { &$try };
        if ($@) {
            local $_ = $@;
            &$catch;
        }
    }
    sub catch (&) { $_[0] }

    try {
        die "phooey";
    } catch {
        /phooey/ and print "unphooey\n";
    };

That prints "unphooey". (Yes, there are still unresolved
issues having to do with the visibility of @_. I'm ignoring that
question for the moment. (But note that if we make @_ lexically
scoped, those anonymous subroutines can act like closures... (Gee,
is this sounding a little Lispish? (Never mind.))))

And here's a reimplementation of grep:

    sub mygrep (&@) {
        my $code = shift;
        my @result;
        foreach $_ (@_) {
            push(@result, $_) if &$code;
        }
        @result;
    }

Some folks would prefer full alphanumeric prototypes. Alphanumerics have
been intentionally left out of prototypes for the express purpose of
someday in the future adding named, formal parameters. The current
mechanism's main goal is to let module writers provide better diagnostics
for module users. Larry feels the notation quite understandable to Perl
programmers, and that it will not intrude greatly upon the meat of the
module, nor make it harder to read. The line noise is visually
encapsulated into a small pill that's easy to swallow.

It's probably best to prototype new functions, not retrofit prototyping
into older ones. That's because you must be especially careful about
silent impositions of differing list versus scalar contexts. For example,
if you decide that a function should take just one parameter, like this:

    sub func ($) {
        my $n = shift;
        print "you gave me $n\n";
    }

and someone has been calling it with an array or expression
returning a list:

    func(@foo);
    func( split /:/ );

Then you've just supplied an automatic scalar() in front of their
argument, which can be more than a bit surprising. The old @foo
which used to hold one thing doesn't get passed in. Instead,
the func() now gets passed in 1, that is, the number of elements
in @foo. And the split() gets called in a scalar context and
starts scribbling on your @_ parameter list.

This is all very powerful, of course, and should be used only in moderation
to make the world a better place.

=head2 Constant Functions

Functions with a prototype of C<()> are potential candidates for
inlining. If the result after optimization and constant folding is a
constant then it will be used in place of new-style calls to the
function. Old-style calls (that is, calls made using C<&>) are not
affected.

All of the following functions would be inlined.

    sub pi ()           { 3.14159 }             # Not exact, but close.
    sub PI ()           { 4 * atan2 1, 1 }      # As good as it gets,
                                                # and it's inlined, too!
    sub ST_DEV ()       { 0 }
    sub ST_INO ()       { 1 }

    sub FLAG_FOO ()     { 1 << 8 }
    sub FLAG_BAR ()     { 1 << 9 }
    sub FLAG_MASK ()    { FLAG_FOO | FLAG_BAR }

    sub OPT_BAZ ()      { 1 }
    sub BAZ_VAL () {
        if (OPT_BAZ) {
            return 23;
        }
        else {
            return 42;
        }
    }

If you redefine a subroutine which was eligible for inlining you'll get
a mandatory warning. (You can use this warning to tell whether or not a
particular subroutine is considered constant.) The warning is
considered severe enough not to be optional because previously compiled
invocations of the function will still be using the old value of the
function. If you need to be able to redefine the subroutine you need to
ensure that it isn't inlined, either by dropping the C<()> prototype
(which changes the calling semantics, so beware) or by thwarting the
inlining mechanism in some other way, such as

    my $dummy;
    sub not_inlined () {
        $dummy || 23
    }
809
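The reason the warning matters can be shown with a short sketch.  The
names C<ANSWER>, C<new_style> and C<old_style> are hypothetical, and the
redefinition is done by glob assignment at run time; any warning perl
emits is collected rather than printed.

```perl
use strict;
use warnings;

sub ANSWER () { 42 }              # eligible for inlining

sub new_style { return ANSWER }   # call replaced by the constant at compile time
sub old_style { return &ANSWER }  # & call: the sub is looked up at run time

my @warnings;
{
    # Collect any redefinition warning instead of letting it print.
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    *ANSWER = sub () { 54 };      # redefine the constant sub at run time
}

print "new_style() returns ", new_style(), "\n";   # still the old value
print "old_style() returns ", old_style(), "\n";   # sees the new definition
print "warning: $_" for @warnings;
```

Code compiled before the redefinition keeps the old value, while the
C<&> call picks up the new one, so the two callers now disagree.
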
=head2 Overriding Builtin Functions

Many builtin functions may be overridden, though this should be tried
only occasionally and for good reason.  Typically this might be
done by a package attempting to emulate missing builtin functionality
on a non-Unix system.

Overriding may be done only by importing the name from a
module--ordinary predeclaration isn't good enough.  However, the
C<subs> pragma (compiler directive) lets you, in effect, pre-declare subs
via the import syntax, and these names may then override the builtin ones:

    use subs 'chdir', 'chroot', 'chmod', 'chown';
    chdir $somewhere;
    sub chdir { ... }

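A complete, if contrived, version of the C<chdir> override might look
like the following sketch.  The C<@visited> log is a hypothetical
addition, and the real work is delegated to the builtin through the
C<CORE::> prefix.

```perl
use strict;
use warnings;
use subs 'chdir';              # pre-declare via import so the override applies

my @visited;                   # hypothetical log of requested directories

sub chdir {
    my ($dir) = @_;
    push @visited, $dir;
    return CORE::chdir($dir);  # hand over to the real builtin
}

chdir '.';                     # calls the sub above, not the builtin
print "visited: @visited\n";
```

Without the C<use subs> line, the bare C<chdir '.'> would still be
compiled as a call to the builtin.
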
Library modules should not in general export builtin names like "open"
or "chdir" as part of their default @EXPORT list, because these may
sneak into someone else's namespace and change the semantics unexpectedly.
Instead, if the module adds the name to the @EXPORT_OK list, then it's
possible for a user to import the name explicitly, but not implicitly.
That is, they could say

    use Module 'open';

and it would import the open override, but if they said

    use Module;

they would get the default imports without the overrides.

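The difference between the two forms of C<use> can be sketched in a
single file.  Everything here is hypothetical: C<My::Override> stands in
for a real module (it is defined inline and registered in C<%INC> so
that C<use> can find it), and it overrides C<hex> rather than C<open> to
keep the example harmless.

```perl
use strict;
use warnings;

BEGIN {
    package My::Override;               # hypothetical module, inlined for the demo
    use Exporter 'import';
    our @EXPORT    = ();                # nothing is exported by default
    our @EXPORT_OK = ('hex');           # the override must be requested by name
    sub hex { return "overridden:$_[0]" }
    $INC{'My/Override.pm'} = __FILE__;  # make "use My::Override" succeed
}

package Opted::In;
use My::Override 'hex';                 # explicit import: override is active here
sub demo { return hex('ff') }

package Plain;
use My::Override;                       # default import: builtin hex is untouched
sub demo { return hex('ff') }

package main;
print "Opted::In: ", Opted::In::demo(), "\n";
print "Plain:     ", Plain::demo(), "\n";
```

Only the package that asked for C<hex> by name sees the override; the
other keeps the builtin behaviour.
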
=head2 Autoloading

If you call a subroutine that is undefined, you would ordinarily get an
immediate fatal error complaining that the subroutine doesn't exist.
(Likewise for subroutines being used as methods, when the method
doesn't exist in any of the base classes of the class package.)  If,
however, there is an C<AUTOLOAD> subroutine defined in the package or
packages that were searched for the original subroutine, then that
C<AUTOLOAD> subroutine is called with the arguments that would have been
passed to the original subroutine.  The fully qualified name of the
original subroutine magically appears in the $AUTOLOAD variable in the
same package as the C<AUTOLOAD> routine.  The name is not passed as an
ordinary argument because, er, well, just because, that's why...

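To make the mechanism concrete, here is a small sketch of what an
C<AUTOLOAD> routine sees when an undefined method is called.  The
C<Greeter> class and the C<hello> method are hypothetical.

```perl
use strict;
use warnings;

package Greeter;               # hypothetical class defining only new()
our $AUTOLOAD;

sub new { return bless {}, shift }

sub AUTOLOAD {
    my ($self, @args) = @_;
    return if $AUTOLOAD =~ /::DESTROY$/;   # ignore object destruction
    # Report the fully qualified name and the arguments that were passed.
    return "$AUTOLOAD(@args)";
}

package main;

my $report = Greeter->new->hello('world', 42);
print "$report\n";
```

The arguments arrive in C<@_> exactly as they would for a real method,
and the requested name is available only through C<$AUTOLOAD>.
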
Most C<AUTOLOAD> routines will load in a definition for the subroutine in
question using eval, and then execute that subroutine using a special
form of "goto" that erases the stack frame of the C<AUTOLOAD> routine
without a trace.  (See the standard C<AutoLoader> module, for example.)
But an C<AUTOLOAD> routine can also just emulate the routine and never
define it.  For example, let's pretend that a function that wasn't defined
should just call system() with those arguments.  All you'd do is this:

    sub AUTOLOAD {
        my $program = $AUTOLOAD;
        $program =~ s/.*:://;
        system($program, @_);
    }
    date();
    who('am', 'i');
    ls('-l');

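The other approach mentioned above, defining the subroutine on first use
and then jumping to it, can be sketched as follows.  This is an
illustration only: it installs a closure instead of using C<eval>, and
the C<shout_*> naming scheme and the C<$autoload_calls> counter are
hypothetical.

```perl
use strict;
use warnings;

our $AUTOLOAD;
my $autoload_calls = 0;        # how many times AUTOLOAD itself has run

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;
    return if $name eq 'DESTROY';
    die "Undefined subroutine $AUTOLOAD called"
        unless $name =~ /^shout_(\w+)$/;
    my $word = uc $1;
    $autoload_calls++;

    # Install a real definition under the requested name ...
    no strict 'refs';
    *{$AUTOLOAD} = sub { return join ' ', $word, @_ };

    # ... then jump to it; goto replaces AUTOLOAD's stack frame and
    # passes the current @_ along.
    goto &{$AUTOLOAD};
}

my $first  = shout_hello('world');   # goes through AUTOLOAD
my $second = shout_hello('again');   # calls the installed sub directly

print "$first\n$second\n";
print "AUTOLOAD ran $autoload_calls time(s)\n";
```

Because the definition is installed on the first call, later calls never
reach C<AUTOLOAD> at all.
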
In fact, if you pre-declare the functions you want to call that way, you don't
even need the parentheses:

    use subs qw(date who ls);
    date;
    who "am", "i";
    ls '-l';    # quoted, so that -l isn't parsed as a file test

A more complete example of this is the standard Shell module, which
can treat undefined subroutine calls as calls to Unix programs.

Mechanisms are available for module writers to help split their modules
up into autoloadable files.  See the standard AutoLoader module
described in L<AutoLoader> and in L<AutoSplit>, the standard
SelfLoader module in L<SelfLoader>, and the document on adding C
functions to Perl code in L<perlxs>.

=head1 SEE ALSO

See L<perlref> for more on references.  See L<perlxs> if you'd
like to learn about calling C subroutines from Perl.  See
L<perlmod> to learn about bundling up your functions in
separate files.