3 perlsub - Perl subroutines
7 To declare subroutines:
9 sub NAME; # A "forward" declaration.
10 sub NAME(PROTO); # ditto, but with prototypes
12 sub NAME BLOCK # A declaration and a definition.
13 sub NAME(PROTO) BLOCK # ditto, but with prototypes
15 To define an anonymous subroutine at runtime:
17 $subref = sub BLOCK; # no proto
18 $subref = sub (PROTO) BLOCK; # with proto
20 To import subroutines:
22 use MODULE qw(NAME1 NAME2 NAME3);
26 NAME(LIST); # & is optional with parentheses.
27 NAME LIST; # Parentheses optional if predeclared/imported.
28 &NAME(LIST); # Circumvent prototypes.
29 &NAME; # Makes current @_ visible to called subroutine.
33 Like many languages, Perl provides for user-defined subroutines.
34 These may be located anywhere in the main program, loaded in from
35 other files via the C<do>, C<require>, or C<use> keywords, or
36 generated on the fly using C<eval> or anonymous subroutines (closures).
37 You can even call a function indirectly using a variable containing
38 its name or a CODE reference.
40 The Perl model for function call and return values is simple: all
41 functions are passed as parameters one single flat list of scalars, and
42 all functions likewise return to their caller one single flat list of
43 scalars. Any arrays or hashes in these call and return lists will
44 collapse, losing their identities--but you may always use
45 pass-by-reference instead to avoid this. Both call and return lists may
46 contain as many or as few scalar elements as you'd like. (Often a
47 function without an explicit return statement is called a subroutine, but
48 there's really no difference from Perl's perspective.)
50 Any arguments passed in show up in the array C<@_>. Therefore, if
51 you called a function with two arguments, those would be stored in
52 C<$_[0]> and C<$_[1]>. The array C<@_> is a local array, but its
53 elements are aliases for the actual scalar parameters. In particular,
54 if an element C<$_[0]> is updated, the corresponding argument is
55 updated (or an error occurs if it is not updatable). If an argument
56 is an array or hash element which did not exist when the function
57 was called, that element is created only when (and if) it is modified
58 or a reference to it is taken. (Some earlier versions of Perl
59 created the element whether or not the element was assigned to.)
60 Assigning to the whole array C<@_> removes that aliasing, and does
61 not update any arguments.
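The aliasing rules above can be sketched as follows. The helper names (C<double_in_place>, C<no_effect>, C<count_args>, C<set_first>) are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: modifies its arguments through the @_ aliases.
sub double_in_place {
    $_ *= 2 for @_;          # each element of @_ aliases a caller variable
    return;
}

# Hypothetical helper: assigning to the whole of @_ discards the aliases.
sub no_effect {
    @_ = (0) x @_;           # @_ now holds fresh values; the caller is untouched
    return;
}

# Hypothetical helpers: one only looks at @_, the other writes to $_[0].
sub count_args { return scalar @_ }
sub set_first  { $_[0] = 'set'; return }

my ($x, $y) = (1, 2);
double_in_place($x, $y);     # $x and $y become 2 and 4
no_effect($x, $y);           # $x and $y stay 2 and 4
print "$x $y\n";             # prints "2 4"

my %h;
count_args($h{missing});                        # element is never modified
my $after_read  = exists $h{missing} ? 1 : 0;   # not created
set_first($h{missing});                         # element is assigned via $_[0]
my $after_write = exists $h{missing} ? 1 : 0;   # created
print "$after_read $after_write\n";             # prints "0 1"
```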
63 The return value of a subroutine is the value of the last expression
64 evaluated. More explicitly, a C<return> statement may be used to exit the
65 subroutine, optionally specifying the returned value, which will be
66 evaluated in the appropriate context (list, scalar, or void) depending
67 on the context of the subroutine call. If you specify no return value,
68 the subroutine returns an empty list in list context, the undefined
69 value in scalar context, or nothing in void context. If you return
70 one or more aggregates (arrays and hashes), these will be flattened
71 together into one large indistinguishable list.
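A minimal sketch of these return rules, assuming two hypothetical subs, C<nothing> and C<both>:

```perl
use strict;
use warnings;

# Hypothetical sub with a bare return.
sub nothing { return }

# Hypothetical sub returning an array and a hash, which flatten into one list.
sub both {
    my @a = (1, 2);
    my %h = (k => 'v');
    return (@a, %h);
}

my @list   = nothing();      # empty list in list context
my $scalar = nothing();      # undef in scalar context
my @flat   = both();         # 1, 2, 'k', 'v' with no boundary between them

my $state = defined $scalar ? "defined" : "undef";
print scalar(@list), " $state ", scalar(@flat), "\n";   # prints "0 undef 4"
```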
73 Perl does not have named formal parameters. In practice all you
74 do is assign to a C<my()> list of these. Variables that aren't
75 declared to be private are global variables. For gory details
76 on creating private variables, see L<"Private Variables via my()">
77 and L<"Temporary Values via local()">. To create protected
78 environments for a set of functions in a separate package (and
79 probably a separate file), see L<perlmod/"Packages">.
86 $max = $foo if $max < $foo;
90 $bestday = max($mon,$tue,$wed,$thu,$fri);
94 # get a line, combining continuation lines
95 # that start with whitespace
98 $thisline = $lookahead; # global variables!
99 LINE: while (defined($lookahead = <STDIN>)) {
100 if ($lookahead =~ /^[ \t]/) {
101 $thisline .= $lookahead;
110 $lookahead = <STDIN>; # get first line
111 while (defined($line = get_line())) {
115 Assigning to a list of private variables to name your arguments:
118 my($key, $value) = @_;
119 $Foo{$key} = $value unless $Foo{$key};
122 Because the assignment copies the values, this also has the effect
123 of turning call-by-reference into call-by-value. Otherwise a
124 function is free to do in-place modifications of C<@_> and change
127 upcase_in($v1, $v2); # this changes $v1 and $v2
129 for (@_) { tr/a-z/A-Z/ }
132 You aren't allowed to modify constants in this way, of course. If an
133 argument were actually literal and you tried to change it, you'd take a
134 (presumably fatal) exception. For example, this won't work:
136 upcase_in("frederick");
138 It would be much safer if the C<upcase_in()> function
139 were written to return a copy of its parameters instead
140 of changing them in place:
142 ($v3, $v4) = upcase($v1, $v2); # this doesn't change $v1 and $v2
144 return unless defined wantarray; # void context, do nothing
146 for (@parms) { tr/a-z/A-Z/ }
147 return wantarray ? @parms : $parms[0];
150 Notice how this (unprototyped) function doesn't care whether it was
151 passed real scalars or arrays. Perl sees all arguments as one big,
152 long, flat parameter list in C<@_>. This is one area where
153 Perl's simple argument-passing style shines. The C<upcase()>
154 function would work perfectly well without changing the C<upcase()>
155 definition even if we fed it things like this:
157 @newlist = upcase(@list1, @list2);
158 @newlist = upcase( split /:/, $var );
160 Do not, however, be tempted to do this:
162 (@a, @b) = upcase(@list1, @list2);
164 Like the flattened incoming parameter list, the return list is also
165 flattened on return. So all you have managed to do here is store
166 everything in C<@a> and make C<@b> an empty list. See L<Pass by
167 Reference> for alternatives.
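The flattening can be observed directly. This sketch uses the C<upcase()> function described above; the anonymous-array workaround at the end is only one possible alternative, shown here as an assumption about what a caller might want.

```perl
use strict;
use warnings;

sub upcase {
    return unless defined wantarray;   # void context, do nothing
    my @parms = @_;
    for (@parms) { tr/a-z/A-Z/ }
    return wantarray ? @parms : $parms[0];
}

my @list1 = qw(a b);
my @list2 = qw(c);

my (@a, @b);
(@a, @b) = upcase(@list1, @list2);     # @a swallows the whole return list
print scalar(@a), " ", scalar(@b), "\n";   # prints "3 0"

# One way to keep the results apart: call once per list and store references.
my ($ra, $rb) = ([ upcase(@list1) ], [ upcase(@list2) ]);
print "@$ra | @$rb\n";                 # prints "A B | C"
```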
169 A subroutine may be called using an explicit C<&> prefix. The
170 C<&> is optional in modern Perl, as are parentheses if the
171 subroutine has been predeclared. The C<&> is I<not> optional
172 when just naming the subroutine, such as when it's used as
173 an argument to defined() or undef(). Nor is it optional when you
174 want to do an indirect subroutine call with a subroutine name or
175 reference using the C<&$subref()> or C<&{$subref}()> constructs,
176 although the C<$subref-E<gt>()> notation solves that problem.
177 See L<perlref> for more about all that.
179 Subroutines may be called recursively. If a subroutine is called
180 using the C<&> form, the argument list is optional, and if omitted,
181 no C<@_> array is set up for the subroutine: the C<@_> array at the
182 time of the call is visible to the subroutine instead. This is an
183 efficiency mechanism that new users may wish to avoid.
185 &foo(1,2,3); # pass three arguments
186 foo(1,2,3); # the same
188 foo(); # pass a null list
191 &foo; # foo() gets current args, like foo(@_) !!
192 foo; # like foo() IFF sub foo predeclared, else "foo"
194 Not only does the C<&> form make the argument list optional, it also
195 disables any prototype checking on arguments you do provide. This
196 is partly for historical reasons, and partly for having a convenient way
197 to cheat if you know what you're doing. See L<Prototypes> below.
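The difference between C<&foo;> and C<foo();> is easy to miss, so here is a small sketch. The subs C<show_args> and C<outer> are hypothetical names chosen for the illustration.

```perl
use strict;
use warnings;

# Hypothetical callee that reports how many arguments it can see.
sub show_args { return scalar @_ }

sub outer {
    my $with_amp   = &show_args;    # no new @_ is set up: callee sees outer's @_
    my $with_paren = show_args();   # explicit empty argument list
    return ($with_amp, $with_paren);
}

my ($shared, $empty) = outer('a', 'b', 'c');
print "$shared $empty\n";           # prints "3 0"
```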
199 Functions whose names are in all upper case are reserved to the Perl
200 core, as are modules whose names are in all lower case. A
201 function in all capitals is a loosely-held convention meaning it
202 will be called indirectly by the run-time system itself, usually
203 due to a triggered event. Functions that do special, pre-defined
204 things include C<BEGIN>, C<END>, C<AUTOLOAD>, and C<DESTROY>--plus
205 all functions mentioned in L<perltie>. The 5.005 release adds
206 C<INIT> to this list.
208 =head2 Private Variables via my()
212 my $foo; # declare $foo lexically local
213 my (@wid, %get); # declare list of variables local
214 my $foo = "flurp"; # declare $foo lexical, and init it
215 my @oof = @bar; # declare @oof lexical, and init it
217 The C<my> operator declares the listed variables to be lexically
218 confined to the enclosing block, conditional (C<if/unless/elsif/else>),
219 loop (C<for/foreach/while/until/continue>), subroutine, C<eval>,
220 or C<do/require/use>'d file. If more than one value is listed, the
221 list must be placed in parentheses. All listed elements must be
222 legal lvalues. Only alphanumeric identifiers may be lexically
223 scoped--magical built-ins like C<$/> must currently be C<local>ized
224 with C<local> instead.
226 Unlike dynamic variables created by the C<local> operator, lexical
227 variables declared with C<my> are totally hidden from the outside
228 world, including any called subroutines. This is true if it's the
229 same subroutine called from itself or elsewhere--every call gets its own copy.
232 This doesn't mean that a C<my> variable declared in a statically
233 enclosing lexical scope would be invisible. Only dynamic scopes
234 are cut off. For example, the C<bumpx()> function below has access
235 to the lexical $x variable because both the C<my> and the C<sub>
236 occurred at the same scope, presumably file scope.
241 An C<eval()>, however, can see lexical variables of the scope it is
242 being evaluated in, so long as the names aren't hidden by declarations within
243 the C<eval()> itself. See L<perlref>.
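A short sketch of both points: a named sub that shares a file-scoped lexical, and a string C<eval> that sees the lexicals of the scope it runs in. The variable C<$seen_by_eval> is a name introduced only for this example.

```perl
use strict;
use warnings;

my $x = 10;                          # file-scoped lexical
sub bumpx { return ++$x }            # compiled in the same scope, so it sees $x

bumpx();
my $seen_by_eval = eval '$x + 1';    # the string eval sees the enclosing $x
print "$x $seen_by_eval\n";          # prints "11 12"
```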
245 The parameter list to my() may be assigned to if desired, which allows you
246 to initialize your variables. (If no initializer is given for a
247 particular variable, it is created with the undefined value.) Commonly
248 this is used to name input parameters to a subroutine. Examples:
250 $arg = "fred"; # "global" variable
252 print "$arg thinks the root is $n\n";
253 fred thinks the root is 3
256 my $arg = shift; # name doesn't matter
261 The C<my> is simply a modifier on something you might assign to. So when
262 you do assign to variables in its argument list, C<my> doesn't
263 change whether those variables are viewed as a scalar or an array. So
265 my ($foo) = <STDIN>; # WRONG?
268 both supply a list context to the right-hand side, while
272 supplies a scalar context. But the following declares only one variable:
274 my $foo, $bar = 1; # WRONG
276 That has the same effect as
281 The declared variable is not introduced (is not visible) until after
282 the current statement. Thus,
286 can be used to initialize a new $x with the value of the old $x, and
289 my $x = 123 and $x == 123
291 is false unless the old $x happened to have the value C<123>.
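The following sketch shows both cases under C<use strict>. It assumes the "old" C<$x> is a package variable declared with C<our>, and it compares against a variable C<$new> rather than a literal, purely to keep the example free of an unrelated warning about assignment in a conditional.

```perl
use strict;
use warnings;

our $x  = 10;                # the "old" $x, a package variable
my $new = 123;
my ($copy, $test);

{
    my $x = $x + 1;          # the right-hand $x is still the old one
    $copy = $x;
}
{
    # The $x in the comparison is the old $x (10), not the new lexical.
    $test = (my $x = $new and $x == $new) ? "true" : "false";
}
print "$copy $test\n";       # prints "11 false"
```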
293 Lexical scopes of control structures are not bounded precisely by the
294 braces that delimit their controlled blocks; control expressions are
295 part of that scope, too. Thus in the loop
297 while (my $line = <>) {
303 the scope of $line extends from its declaration throughout the rest of
304 the loop construct (including the C<continue> clause), but not beyond
305 it. Similarly, in the conditional
307 if ((my $answer = <STDIN>) =~ /^yes$/i) {
309 } elsif ($answer =~ /^no$/i) {
313 die "'$answer' is neither 'yes' nor 'no'";
316 the scope of $answer extends from its declaration through the rest
317 of that conditional, including any C<elsif> and C<else> clauses, but not beyond it.
320 None of the foregoing text applies to C<if/unless> or C<while/until>
321 modifiers appended to simple statements. Such modifiers are not
322 control structures and have no effect on scoping.
324 The C<foreach> loop defaults to scoping its index variable dynamically
325 in the manner of C<local>. However, if the index variable is
326 prefixed with the keyword C<my>, or if there is already a lexical
327 by that name in scope, then a new lexical is created instead. Thus
330 for my $i (1, 2, 3) {
334 the scope of $i extends to the end of the loop, but not beyond it,
335 rendering the value of $i inaccessible within C<some_function()>.
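To make the contrast concrete, this sketch runs the same loop twice, once with a lexical index and once with a package variable as the index. It assumes a package variable C<$i> declared with C<our>, and C<some_function()> here is a stand-in that simply returns whatever C<$i> it can see.

```perl
use strict;
use warnings;

our $i = 'global';                   # package variable

sub some_function { return $i }      # refers to the package $i

my @lexical;
for my $i (1, 2, 3) {                # new lexical $i, invisible to callees
    push @lexical, some_function();
}

my @dynamic;
for $i (1, 2, 3) {                   # package $i is scoped dynamically
    push @dynamic, some_function();
}

print "@lexical\n";                  # prints "global global global"
print "@dynamic\n";                  # prints "1 2 3"
```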
337 Some users may wish to encourage the use of lexically scoped variables.
338 As an aid to catching implicit uses of package variables,
339 which are always global, if you say
343 then any variable mentioned from there to the end of the enclosing
344 block must either refer to a lexical variable, be predeclared via
345 C<use vars>, or else must be fully qualified with the package name.
346 A compilation error results otherwise. An inner block may countermand
347 this with C<no strict 'vars'>.
349 A C<my> has both a compile-time and a run-time effect. At compile
350 time, the compiler takes notice of it. The principal usefulness
351 of this is to quiet C<use strict 'vars'>, but it is also essential
352 for generation of closures as detailed in L<perlref>. Actual
353 initialization is delayed until run time, though, so it gets executed
354 at the appropriate time, such as each time through a loop, for example.
357 Variables declared with C<my> are not part of any package and are therefore
358 never fully qualified with the package name. In particular, you're not
359 allowed to try to make a package variable (or other global) lexical:
361 my $pack::var; # ERROR! Illegal syntax
362 my $_; # also illegal (currently)
364 In fact, dynamic variables (also known as package or global variables)
365 are still accessible using the fully qualified C<::> notation even while a
366 lexical of the same name is also visible:
371 print "$x and $::x\n";
373 That will print out C<20> and C<10>.
375 You may declare C<my> variables at the outermost scope of a file
376 to hide any such identifiers from the world outside that file. This
377 is similar in spirit to C's static variables when they are used at
378 the file level. To do this with a subroutine requires the use of
379 a closure (an anonymous function that accesses enclosing lexicals).
380 If you want to create a private subroutine that cannot be called
381 from outside that block, it can declare a lexical variable containing
382 an anonymous sub reference:
384 my $secret_version = '1.001-beta';
385 my $secret_sub = sub { print $secret_version };
388 As long as the reference is never returned by any function within the
389 module, no outside module can see the subroutine, because its name is not in
390 any package's symbol table. Remember that it's not I<REALLY> called
391 C<$some_pack::secret_version> or anything; it's just $secret_version,
392 unqualified and unqualifiable.
394 This does not work with object methods, however; all object methods
395 have to be in the symbol table of some package to be found. See
396 L<perlref/"Function Templates"> for something of a work-around to this.
399 =head2 Persistent Private Variables
401 Just because a lexical variable is lexically (also called statically)
402 scoped to its enclosing block, C<eval>, or C<do> FILE, this doesn't mean that
403 within a function it works like a C static. It normally works more
404 like a C auto, but with implicit garbage collection.
406 Unlike local variables in C or C++, Perl's lexical variables don't
407 necessarily get recycled just because their scope has exited.
408 If something more permanent is still aware of the lexical, it will
409 stick around. So long as something else references a lexical, that
410 lexical won't be freed--which is as it should be. You wouldn't want
411 memory being freed until you were done using it, or kept around once you
412 were done. Automatic garbage collection takes care of this for you.
414 This means that you can pass back or save away references to lexical
415 variables, whereas to return a pointer to a C auto is a grave error.
416 It also gives us a way to simulate C's function statics. Here's a
417 mechanism for giving a function private variables with both lexical
418 scoping and a static lifetime. If you do want to create something like
419 C's static variables, just enclose the whole function in an extra block,
420 and put the static variable outside the function but in the block.
425 return ++$secret_val;
428 # $secret_val now becomes unreachable by the outside
429 # world, but retains its value between calls to gimme_another
431 If this function is being sourced in from a separate file
432 via C<require> or C<use>, then this is probably just fine. If it's
433 all in the main program, you'll need to arrange for the C<my>
434 to be executed early, either by putting the whole block above
435 your main program, or more likely, placing merely a C<BEGIN>
436 sub around it to make sure it gets executed before your program
442 return ++$secret_val;
446 See L<perlmod/"Package Constructors and Destructors"> about the
447 special triggered functions, C<BEGIN> and C<INIT>.
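A complete, runnable sketch of the C<BEGIN> arrangement, in which the function is called before the point in the file where it is defined. The variables C<$first> and C<$second> are introduced only for this illustration.

```perl
use strict;
use warnings;

my $first = gimme_another();         # called above the definition

BEGIN {
    my $secret_val = 0;              # initialized at compile time
    sub gimme_another {
        return ++$secret_val;        # value persists between calls
    }
}

my $second = gimme_another();
print "$first $second\n";            # prints "1 2"
```

Without the C<BEGIN>, the assignment of C<0> to C<$secret_val> would run only when execution reached the block, which is after the first call.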
449 If declared at the outermost scope (the file scope), then lexicals
450 work somewhat like C's file statics. They are available to all
451 functions in that same file declared below them, but are inaccessible
452 from outside that file. This strategy is sometimes used in modules
453 to create private variables that the whole module can see.
455 =head2 Temporary Values via local()
457 B<WARNING>: In general, you should be using C<my> instead of C<local>, because
458 it's faster and safer. Exceptions to this include the global punctuation
459 variables, filehandles and formats, and direct manipulation of the Perl
460 symbol table itself. Format variables often use C<local> though, as do
461 other variables whose current value must be visible to called subroutines.
466 local $foo; # declare $foo dynamically local
467 local (@wid, %get); # declare list of variables local
468 local $foo = "flurp"; # declare $foo dynamic, and init it
469 local @oof = @bar; # declare @oof dynamic, and init it
471 local *FH; # localize $FH, @FH, %FH, &FH ...
472 local *merlyn = *randal; # now $merlyn is really $randal, plus
473 # @merlyn is really @randal, etc
474 local *merlyn = 'randal'; # SAME THING: promote 'randal' to *randal
475 local *merlyn = \$randal; # just alias $merlyn, not @merlyn etc
477 A C<local> modifies its listed variables to be "local" to the
478 enclosing block, C<eval>, or C<do FILE>--and to I<any subroutine
479 called from within that block>. A C<local> just gives temporary
480 values to global (meaning package) variables. It does I<not> create
481 a local variable. This is known as dynamic scoping. Lexical scoping
482 is done with C<my>, which works more like C's auto declarations.
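A side-by-side sketch of dynamic and lexical scoping. The names C<$level>, C<report>, C<with_local>, and C<with_my> are hypothetical.

```perl
use strict;
use warnings;

our $level = 'outer';                # package variable

sub report { return $level }         # reads the package variable at call time

sub with_local {
    local $level = 'inner';          # temporary value, seen by called subs
    return report();
}

sub with_my {
    my $level = 'inner';             # separate lexical, unseen by report()
    return report();
}

my @results = (with_local(), with_my(), report());
print "@results\n";                  # prints "inner outer outer"
```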
484 If more than one variable is given to C<local>, they must be placed in
485 parentheses. All listed elements must be legal lvalues. This operator works
486 by saving the current values of those variables in its argument list on a
487 hidden stack and restoring them upon exiting the block, subroutine, or
488 eval. This means that called subroutines can also reference the local
489 variable, but not the global one. The argument list may be assigned to if
490 desired, which allows you to initialize your local variables. (If no
491 initializer is given for a particular variable, it is created with an
492 undefined value.) Commonly this is used to name the parameters to a
493 subroutine. Examples:
498 # assume this function uses global %digits hash
501 # now temporarily add to %digits hash
503 # (NOTE: not claiming this is efficient!)
504 local %digits = (%digits, 't' => 10, 'e' => 11);
505 parse_num(); # parse_num gets this new %digits!
507 # old %digits restored here
509 Because C<local> is a run-time operator, it gets executed each time
510 through a loop. In releases of Perl previous to 5.0, this used more stack
511 storage each time until the loop was exited. Perl now reclaims the space
512 each time through, but it's still more efficient to declare your variables
515 A C<local> is simply a modifier on an lvalue expression. When you assign to
516 a C<local>ized variable, the C<local> doesn't change whether its list is viewed
517 as a scalar or an array. So
519 local($foo) = <STDIN>;
520 local @FOO = <STDIN>;
522 both supply a list context to the right-hand side, while
524 local $foo = <STDIN>;
526 supplies a scalar context.
528 A note about C<local()> and composite types is in order. Something
529 like C<local(%foo)> works by temporarily placing a brand new hash in
530 the symbol table. The old hash is left alone, but is hidden "behind"
533 This means the old variable is completely invisible via the symbol
534 table (i.e. the hash entry in the C<*foo> typeglob) for the duration
535 of the dynamic scope within which the C<local()> was seen. This
536 has the effect of allowing one to temporarily occlude any magic on
537 composite types. For instance, this will briefly alter a tied
538 hash to some other implementation:
540 tie %ahash, 'APackage';
544 tie %ahash, 'BPackage';
545 [..called code will see %ahash tied to 'BPackage'..]
548 [..%ahash is a normal (untied) hash here..]
551 [..%ahash back to its initial tied self again..]
553 As another example, a custom implementation of C<%ENV> might look like this:
558 tie %ENV, 'MyOwnEnv';
559 [..do your own fancy %ENV manipulation here..]
561 [..normal %ENV behavior here..]
563 It's also worth taking a moment to explain what happens when you
564 C<local>ize a member of a composite type (i.e. an array or hash element).
565 In this case, the element is C<local>ized I<by name>. This means that
566 when the scope of the C<local()> ends, the saved value will be
567 restored to the hash element whose key was named in the C<local()>, or
568 the array element whose index was named in the C<local()>. If that
569 element was deleted while the C<local()> was in effect (e.g. by a
570 C<delete()> from a hash or a C<shift()> of an array), it will spring
571 back into existence, possibly extending an array and filling in the
572 skipped elements with C<undef>. For instance, if you say
574 %hash = ( 'This' => 'is', 'a' => 'test' );
578 local($hash{'a'}) = 'drill';
579 while (my $e = pop(@ary)) {
584 $hash{'only a'} = 'test';
588 print join(' ', map { "$_ $hash{$_}" } sort keys %hash),".\n";
589 print "The array has ",scalar(@ary)," elements: ",
590 join(', ', map { defined $_ ? $_ : 'undef' } @ary),"\n";
597 This is a test only a test.
598 The array has 6 elements: 0, 1, 2, undef, undef, 5
600 The behavior of local() on non-existent members of composite
601 types is subject to change in future.
603 =head2 Passing Symbol Table Entries (typeglobs)
605 B<WARNING>: The mechanism described in this section was originally
606 the only way to simulate pass-by-reference in older versions of
607 Perl. While it still works fine in modern versions, the new reference
608 mechanism is generally easier to work with. See below.
610 Sometimes you don't want to pass the value of an array to a subroutine
611 but rather the name of it, so that the subroutine can modify the global
612 copy of it rather than working with a local copy. In perl you can
613 refer to all objects of a particular name by prefixing the name
614 with a star: C<*foo>. This is often known as a "typeglob", because the
615 star on the front can be thought of as a wildcard match for all the
616 funny prefix characters on variables and subroutines and such.
618 When evaluated, the typeglob produces a scalar value that represents
619 all the objects of that name, including any filehandle, format, or
620 subroutine. When assigned to, it causes the name mentioned to refer to
621 whatever C<*> value was assigned to it. Example:
624 local(*someary) = @_;
625 foreach $elem (@someary) {
632 Scalars are already passed by reference, so you can modify
633 scalar arguments without using this mechanism by referring explicitly
634 to C<$_[0]> etc. You can modify all the elements of an array by passing
635 all the elements as scalars, but you have to use the C<*> mechanism (or
636 the equivalent reference mechanism) to C<push>, C<pop>, or change the size of
637 an array. It will certainly be faster to pass the typeglob (or reference).
639 Even if you don't want to modify an array, this mechanism is useful for
640 passing multiple arrays in a single LIST, because normally the LIST
641 mechanism will merge all the array values so that you can't extract out
642 the individual arrays. For more on typeglobs, see
643 L<perldata/"Typeglobs and Filehandles">.
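As a sketch of why the typeglob (or a reference) is needed to change an array's size, consider the hypothetical sub C<push_marker> below. It assumes the arrays involved are package variables, since typeglobs cannot alias C<my> variables.

```perl
use strict;
use warnings;

our (@someary, @foo);                # package arrays, reachable via typeglobs

# Hypothetical sub: grows the caller's array through a localized glob alias.
sub push_marker {
    local (*someary) = @_;           # *someary now aliases the caller's glob
    push @someary, 'new';            # changes the size of the caller's array
    return;
}

@foo = (1, 2);
push_marker(*foo);
print "@foo\n";                      # prints "1 2 new"
```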
645 =head2 When to Still Use local()
647 Despite the existence of C<my>, there are three places where the
648 C<local> operator still shines. In fact, in these three places, you
649 I<must> use C<local> instead of C<my>.
653 =item 1. You need to give a global variable a temporary value, especially $_.
655 The global variables, like C<@ARGV> or the punctuation variables, must be
656 C<local>ized with C<local()>. This block reads in F</etc/motd>, and splits
657 it up into chunks separated by lines of equal signs, which are placed
661 local @ARGV = ("/etc/motd");
664 @Fields = split /^\s*=+\s*$/;
667 In particular, it's important to C<local>ize $_ in any routine that assigns
668 to it. Look out for implicit assignments in C<while> conditionals.
670 =item 2. You need to create a local file or directory handle or a local function.
672 A function that needs a filehandle of its own must use C<local()>
673 on a complete typeglob. This can be used to create new symbol table entries:
677 local (*READER, *WRITER); # not my!
678 pipe (READER, WRITER) or die "pipe: $!";
679 return (*READER, *WRITER);
681 ($head, $tail) = ioqueue();
683 See the Symbol module for a way to create anonymous symbol table entries.
686 Because assignment of a reference to a typeglob creates an alias, this
687 can be used to create what is effectively a local function, or at least,
691 local *grow = \&shrink; # only until this block exits
692 grow(); # really calls shrink()
693 move(); # if move() grow()s, it shrink()s too
695 grow(); # get the real grow() again
697 See L<perlref/"Function Templates"> for more about manipulating
698 functions by name in this way.
700 =item 3. You want to temporarily change just one element of an array or hash.
702 You can C<local>ize just one element of an aggregate. Usually this
706 local $SIG{INT} = 'IGNORE';
707 funct(); # uninterruptible
709 # interruptibility automatically restored here
711 But it also works on lexically declared aggregates. Prior to 5.005,
712 this operation could on occasion misbehave.
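A minimal sketch of localizing one element of a lexically declared hash. The hash C<%opt> and the sub C<debug_on> are hypothetical names.

```perl
use strict;
use warnings;

my %opt = (debug => 0);              # lexical aggregate

sub debug_on { return $opt{debug} }  # reads the current element value

my ($during, $after);
{
    local $opt{debug} = 1;           # temporary value for this block only
    $during = debug_on();
}
$after = debug_on();                 # old value restored on block exit

print "$during $after\n";            # prints "1 0"
```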
716 =head2 Pass by Reference
718 If you want to pass more than one array or hash into a function--or
719 return them from it--and have them maintain their integrity, then
720 you're going to have to use an explicit pass-by-reference. Before you
721 do that, you need to understand references as detailed in L<perlref>.
722 This section may not make much sense to you otherwise.
724 Here are a few simple examples. First, let's pass in several arrays
725 to a function and have it C<pop> all of them, returning a new list
726 of all their former last elements:
728 @tailings = popmany ( \@a, \@b, \@c, \@d );
733 foreach $aref ( @_ ) {
734 push @retlist, pop @$aref;
739 Here's how you might write a function that returns a
740 list of keys occurring in all the hashes passed to it:
742 @common = inter( \%foo, \%bar, \%joe );
744 my ($k, $href, %seen); # locals
746 while ( $k = each %$href ) {
750 return grep { $seen{$_} == @_ } keys %seen;
753 So far, we're using just the normal list return mechanism.
754 What happens if you want to pass or return a hash? Well,
755 if you're using only one of them, or you don't mind them
756 concatenating, then the normal calling convention is ok, although
759 Where people get into trouble is here:
761 (@a, @b) = func(@c, @d);
763 (%a, %b) = func(%c, %d);
765 That syntax simply won't work. It sets just C<@a> or C<%a> and
766 clears the C<@b> or C<%b>. Plus the function didn't get passed
767 into two separate arrays or hashes: it got one long list in C<@_>,
770 If you can arrange for everyone to deal with this through references, it's
771 cleaner code, although not so nice to look at. Here's a function that
772 takes two array references as arguments, returning the two array references
773 in order of how many elements the arrays have in them:
775 ($aref, $bref) = func(\@c, \@d);
776 print "@$aref has more than @$bref\n";
778 my ($cref, $dref) = @_;
779 if (@$cref > @$dref) {
780 return ($cref, $dref);
782 return ($dref, $cref);
786 It turns out that you can actually do this also:
788 (*a, *b) = func(\@c, \@d);
789 print "@a has more than @b\n";
799 Here we're using the typeglobs to do symbol table aliasing. It's
800 a tad subtle, though, and also won't work if you're using C<my>
801 variables, because only globals (even in disguise as C<local>s)
802 are in the symbol table.
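For comparison, the plain reference version works with C<my> variables as well. This is a self-contained sketch; the sub name C<larger_first> is hypothetical.

```perl
use strict;
use warnings;

# Returns its two array references ordered by element count, larger first.
sub larger_first {
    my ($cref, $dref) = @_;
    return @$cref > @$dref ? ($cref, $dref) : ($dref, $cref);
}

my @c = (1, 2, 3);
my @d = (4, 5);

my ($aref, $bref) = larger_first(\@d, \@c);
print scalar(@$aref), " ", scalar(@$bref), "\n";   # prints "3 2"
```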
804 If you're passing around filehandles, you could usually just use the bare
805 typeglob, like C<*STDOUT>, but typeglob references work, too.
811 print $fh "her um well a hmmm\n";
814 $rec = get_rec(\*STDIN);
820 If you're planning on generating new filehandles, you could do this.
821 Note that you pass back just the bare *FH, not a reference to it.
826 return open (FH, $path) ? *FH : undef;
831 Perl supports a very limited kind of compile-time argument checking
832 using function prototyping. If you declare
836 then C<mypush()> takes arguments exactly like C<push()> does. The
837 function declaration must be visible at compile time. The prototype
838 affects only interpretation of new-style calls to the function,
839 where new-style is defined as not using the C<&> character. In
840 other words, if you call it like a built-in function, then it behaves
841 like a built-in function. If you call it like an old-fashioned
842 subroutine, then it behaves like an old-fashioned subroutine. It
843 naturally falls out from this rule that prototypes have no influence
844 on subroutine references like C<\&foo> or on indirect subroutine
845 calls like C<&{$subref}> or C<$subref-E<gt>()>.
847 Method calls are not influenced by prototypes either, because the
848 function to be called is indeterminate at compile time, since
849 the exact code called depends on inheritance.
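The effect of the C<&> form on a prototyped sub can be seen in a small sketch. The sub C<first_arg_type> is a hypothetical name; it reports what kind of value arrives in C<$_[0]>.

```perl
use strict;
use warnings;

# Hypothetical sub with a (\@) prototype.
sub first_arg_type (\@) { return ref($_[0]) || 'plain scalar' }

my @list = (10, 20, 30);

my $new_style = first_arg_type(@list);    # prototype applies: receives \@list
my $old_style = &first_arg_type(@list);   # prototype ignored: receives 10, 20, 30

print "$new_style / $old_style\n";        # prints "ARRAY / plain scalar"
```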
851 Because the intent of this feature is primarily to let you define
852 subroutines that work like built-in functions, here are prototypes
853 for some other functions that parse almost exactly like the
854 corresponding built-in.
856     Declared as                 Called as
858     sub mylink ($$)             mylink $old, $new
859     sub myvec ($$$)             myvec $var, $offset, 1
860     sub myindex ($$;$)          myindex &getstring, "substr"
861     sub mysyswrite ($$$;$)      mysyswrite $buf, 0, length($buf) - $off, $off
862     sub myreverse (@)           myreverse $a, $b, $c
863     sub myjoin ($@)             myjoin ":", $a, $b, $c
864     sub mypop (\@)              mypop @array
865     sub mysplice (\@$$@)        mysplice @array, @array, 0, @pushme
866     sub mykeys (\%)             mykeys %{$hashref}
867     sub myopen (*;$)            myopen HANDLE, $name
868     sub mypipe (**)             mypipe READHANDLE, WRITEHANDLE
869     sub mygrep (&@)             mygrep { /foo/ } $a, $b, $c
870     sub myrand ($)              myrand 42
873 Any backslashed prototype character represents an actual argument
874 that absolutely must start with that character. The value passed
875 as part of C<@_> will be a reference to the actual argument given
876 in the subroutine call, obtained by applying C<\> to that argument.
878 Unbackslashed prototype characters have special meanings. Any
879 unbackslashed C<@> or C<%> eats all remaining arguments, and forces
880 list context. An argument represented by C<$> forces scalar context. An
881 C<&> requires an anonymous subroutine, which, if passed as the first
882 argument, does not require the C<sub> keyword or a subsequent comma. A
883 C<*> allows the subroutine to accept a bareword, constant, scalar expression,
884 typeglob, or a reference to a typeglob in that slot. The value will be
885 available to the subroutine either as a simple scalar, or (in the latter
886 two cases) as a reference to the typeglob.
888 A semicolon separates mandatory arguments from optional arguments.
889 It is redundant before C<@> or C<%>, which gobble up everything else.
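Two of these prototype characters in action, as a sketch. The bodies of C<mypush> and C<myindex> are assumptions written for this example; only their prototypes come from the text above.

```perl
use strict;
use warnings;

# (\@@): the first argument must be an array and arrives as a reference.
sub mypush (\@@) {
    my ($aref, @items) = @_;
    push @$aref, @items;
    return scalar @$aref;
}

# ($$;$): two mandatory scalars, then one optional scalar.
sub myindex ($$;$) {
    my ($str, $substr, $pos) = @_;
    return index($str, $substr, defined $pos ? $pos : 0);
}

my @stack = (1);
mypush @stack, 2, 3;                          # called like the built-in push

my $at    = myindex "hello world", "o";       # optional argument omitted
my $later = myindex "hello world", "o", 5;    # optional argument supplied

print "@stack $at $later\n";                  # prints "1 2 3 4 7"
```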
891 Note how the last three examples in the table above are treated
892 specially by the parser. C<mygrep()> is parsed as a true list
893 operator, C<myrand()> is parsed as a true unary operator with unary
894 precedence the same as C<rand()>, and C<mytime()> is truly without
895 arguments, just like C<time()>. That is, if you say
899 you'll get C<mytime() + 2>, not C<mytime(2)>, which is how it would be parsed
902 The interesting thing about C<&> is that you can generate new syntax with it,
903 provided it's in the initial position:
906 my($try,$catch) = @_;
913 sub catch (&) { $_[0] }
918 /phooey/ and print "unphooey\n";
921 That prints C<"unphooey">. (Yes, there are still unresolved
922 issues having to do with visibility of C<@_>. I'm ignoring that
923 question for the moment. (But note that if we make C<@_> lexically
924 scoped, those anonymous subroutines can act like closures... (Gee,
925 is this sounding a little Lispish? (Never mind.))))
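For readers who want something runnable, the following is one possible self-contained version of the C<try>/C<catch> idea. It is a sketch under assumptions: it uses C<eval> to trap the exception and returns the handler's value instead of printing, which may differ in detail from the fragments shown above.

```perl
use strict;
use warnings;

# try takes a block first, then whatever catch returns (a code reference).
sub try (&@) {
    my ($try, $catch) = @_;
    my @result = eval { $try->() };
    if ($@) {
        local $_ = $@;                   # expose the error to the handler as $_
        return $catch ? $catch->() : ();
    }
    return wantarray ? @result : $result[-1];
}

# catch merely hands its block back so that try receives it as an argument.
sub catch (&) { $_[0] }

my $message = try {
    die "phooey\n";
} catch {
    /phooey/ ? "unphooey" : "unknown";
};
print "$message\n";                      # prints "unphooey"
```

Note the trailing semicolon after the C<catch> block: the whole construct is still an ordinary subroutine call, not a statement.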
927 And here's a reimplementation of the Perl C<grep> operator:
933 push(@result, $_) if &$code;
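A complete subroutine built around that line might look like the sketch below. Everything other than the C<push> line and the C<(&@)> prototype from the table is an assumption.

```perl
use strict;
use warnings;

sub mygrep (&@) {
    my $code = shift;
    my @result;
    foreach (@_) {                       # $_ is aliased to each element in turn
        push(@result, $_) if &$code;
    }
    return @result;
}

my @matches = mygrep { /foo/ } qw(food bar buffoon);
print "@matches\n";                      # prints "food buffoon"
```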
938 Some folks would prefer full alphanumeric prototypes. Alphanumerics have
939 been intentionally left out of prototypes for the express purpose of
940 someday in the future adding named, formal parameters. The current
941 mechanism's main goal is to let module writers provide better diagnostics
942 for module users. Larry feels the notation quite understandable to Perl
943 programmers, and that it will not intrude greatly upon the meat of the
944 module, nor make it harder to read. The line noise is visually
945 encapsulated into a small pill that's easy to swallow.
947 It's probably best to prototype new functions, not retrofit prototyping
948 into older ones. That's because you must be especially careful about
949 silent impositions of differing list versus scalar contexts. For example,
950 if you decide that a function should take just one parameter, like this:
954 print "you gave me $n\n";
957 and someone has been calling it with an array or expression returning a list:
963 Then you've just supplied an automatic C<scalar> in front of their
964 argument, which can be more than a bit surprising. The old C<@foo>
965 which used to hold one thing doesn't get passed in. Instead,
966 C<func()> now gets passed in a C<1>; that is, the number of elements
967 in C<@foo>. And the C<split> gets called in scalar context so it
968 starts scribbling on your C<@_> parameter list. Ouch!
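The array half of this surprise is easy to reproduce. In the sketch below, C<old_func> and C<new_func> are hypothetical names for the same function before and after the prototype is added; the C<split> case is not covered.

```perl
use strict;
use warnings;

my @foo = ("only one thing");

# Without a prototype, the array is flattened into @_ as usual.
sub old_func     { my $n = shift; return "you gave me $n" }

# With ($), the same call site silently receives scalar(@foo) instead.
sub new_func ($) { my $n = shift; return "you gave me $n" }

print old_func(@foo), "\n";      # prints "you gave me only one thing"
print new_func(@foo), "\n";      # prints "you gave me 1"
```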
970 This is all very powerful, of course, and should be used only in moderation
971 to make the world a better place.
973 =head2 Constant Functions
975 Functions with a prototype of C<()> are potential candidates for
976 inlining. If the result after optimization and constant folding
977 is either a constant or a lexically-scoped scalar which has no other
978 references, then it will be used in place of function calls made
979 without C<&>. Calls made using C<&> are never inlined. (See
980 F<constant.pm> for an easy way to declare most constants.)
982 The following functions would all be inlined:
984 sub pi () { 3.14159 } # Not exact, but close.
985 sub PI () { 4 * atan2 1, 1 } # As good as it gets,
986 # and it's inlined, too!
990 sub FLAG_FOO () { 1 << 8 }
991 sub FLAG_BAR () { 1 << 9 }
992 sub FLAG_MASK () { FLAG_FOO | FLAG_BAR }
994 sub OPT_BAZ () { not (0x1B58 & FLAG_MASK) }
1004 sub N () { int(BAZ_VAL) / 3 }
1007 for (1..N) { $prod *= $_ }
1008 sub N_FACTORIAL () { $prod }
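A few of these definitions can be exercised together as follows. This sketch shows only how such functions are declared and called; it does not try to prove that a particular perl inlined them, and the C<$flags> value is simply borrowed from the example above.

```perl
use strict;
use warnings;

# Empty prototype plus a body that folds to a constant: candidates for inlining.
sub PI        () { 4 * atan2(1, 1) }
sub FLAG_FOO  () { 1 << 8 }
sub FLAG_BAR  () { 1 << 9 }
sub FLAG_MASK () { FLAG_FOO | FLAG_BAR }

my $flags = 0x1B58;
printf "%.5f %d %s\n",
    PI, FLAG_MASK, ($flags & FLAG_MASK) ? "set" : "clear";   # prints "3.14159 768 set"
```

Because the prototype is C<()>, these names can be used in expressions without parentheses and without swallowing whatever follows them.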
1011 If you redefine a subroutine that was eligible for inlining, you'll get
1012 a mandatory warning. (You can use this warning to tell whether or not a
1013 particular subroutine is considered constant.) The warning is
1014 considered severe enough not to be optional because previously compiled
1015 invocations of the function will still be using the old value of the
1016 function. If you need to be able to redefine the subroutine, you need to
1017 ensure that it isn't inlined, either by dropping the C<()> prototype
1018 (which changes calling semantics, so beware) or by thwarting the
1019 inlining mechanism in some other way, such as
1021 sub not_inlined () {
1025 =head2 Overriding Built-in Functions
1027 Many built-in functions may be overridden, though this should be tried
1028 only occasionally and for good reason. Typically this might be
1029 done by a package attempting to emulate missing built-in functionality
1030 on a non-Unix system.
1032 Overriding may be done only by importing the name from a
1033 module--ordinary predeclaration isn't good enough. However, the
1034 C<use subs> pragma lets you, in effect, predeclare subs
1035 via the import syntax, and these names may then override built-in ones:
1037 use subs 'chdir', 'chroot', 'chmod', 'chown';
1041 To unambiguously refer to the built-in form, precede the
1042 built-in name with the special package qualifier C<CORE::>. For example,
1043 saying C<CORE::open()> always refers to the built-in C<open()>, even
1044 if the current package has imported some other subroutine called
1045 C<&open()> from elsewhere. Even though it looks like a regular
1046 function call, it isn't: you can't take a reference to it, such as
1047 the incorrect C<\&CORE::open> might appear to produce.
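The two mechanisms can be combined in a short sketch: C<use subs> makes a local C<chdir> take precedence, and C<CORE::chdir> reaches the real built-in from inside the override. The C<@visited> log is a hypothetical addition that exists only to make the effect visible.

```perl
use strict;
use warnings;
use subs 'chdir';                # predeclare so that chdir below overrides the built-in

my @visited;                     # hypothetical log of requested directories

sub chdir {
    my $dir = shift;
    push @visited, $dir;
    return CORE::chdir($dir);    # the real built-in, named unambiguously
}

chdir '.' or die "chdir failed: $!";
print "visited: @visited\n";     # prints "visited: ."
```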
1049 Library modules should not in general export built-in names like C<open>
1050 or C<chdir> as part of their default C<@EXPORT> list, because these may
1051 sneak into someone else's namespace and change the semantics unexpectedly.
1052 Instead, if the module adds that name to C<@EXPORT_OK>, then it's
1053 possible for a user to import the name explicitly, but not implicitly.
1054 That is, they could say
1058 and it would import the C<open> override. But if they said
1062 they would get the default imports without overrides.
1064 The foregoing mechanism for overriding built-ins is restricted, quite
1065 deliberately, to the package that requests the import. There is a second
1066 method that is sometimes applicable when you wish to override a built-in
1067 everywhere, without regard to namespace boundaries. This is achieved by
1068 importing a sub into the special namespace C<CORE::GLOBAL::>. Here is an
1069 example that quite brazenly replaces the C<glob> operator with something
1070 that understands regular expressions.
1075 @EXPORT_OK = 'glob';
1081 my $where = ($sym =~ s/^GLOBAL_// ? 'CORE::GLOBAL' : caller(0));
1082 $pkg->export($where, $sym, @_);
1089 if (opendir D, '.') {
1090 @got = grep /$pat/, readdir D;
1097 And here's how it could be (ab)used:
1099 #use REGlob 'GLOBAL_glob'; # override glob() in ALL namespaces
1101 use REGlob 'glob'; # override glob() in Foo:: only
1102 print for <^[a-z_]+\.pm\$>; # show all pragmatic modules
1104 The initial comment shows a contrived, even dangerous example.
1105 By overriding C<glob> globally, you would be forcing the new (and
1106 subversive) behavior for the C<glob> operator for I<every> namespace,
1107 without the complete cognizance or cooperation of the modules that own
1108 those namespaces. Naturally, this should be done with extreme caution--if
1109 it must be done at all.
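For experimentation, a global override can also be installed directly, without an exporting module. The sketch below is an assumption-laden alternative to C<REGlob>: C<regex_glob> is a hypothetical helper, and the override affects every C<glob> call compiled after the C<BEGIN> block, in every package.

```perl
use strict;
use warnings;

# Hypothetical helper: names in the current directory matching a regex.
sub regex_glob {
    my $pat = shift;
    my @got;
    if (opendir my $dh, '.') {
        @got = grep { /$pat/ } readdir $dh;
        closedir $dh;
    }
    return @got;
}

# Install at compile time; only glob calls compiled after this see the override.
BEGIN {
    no warnings 'once';
    *CORE::GLOBAL::glob = \&regex_glob;
}

my @modules = glob('\.pm$');     # now treated as a regex, not a shell pattern
print "$_\n" for @modules;
```

Like C<REGlob>, this ignores the scalar-context behavior of the real C<glob>, so it is suitable only as a demonstration.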
1111 The C<REGlob> example above does not implement all the support needed to
1112 cleanly override perl's C<glob> operator. The built-in C<glob> has
1113 different behaviors depending on whether it appears in a scalar or list
1114 context, but our C<REGlob> doesn't. Indeed, many perl built-ins have such
1115 context-sensitive behaviors, and these must be adequately supported by
1116 a properly written override. For a fully functional example of overriding
1117 C<glob>, study the implementation of C<File::DosGlob> in the standard library.
1120 =head2 Autoloading
1122 If you call a subroutine that is undefined, you would ordinarily
1123 get an immediate, fatal error complaining that the subroutine doesn't
1124 exist. (Likewise for subroutines being used as methods, when the
1125 method doesn't exist in any base class of the class's package.)
1126 However, if an C<AUTOLOAD> subroutine is defined in the package or
1127 packages used to locate the original subroutine, then that
1128 C<AUTOLOAD> subroutine is called with the arguments that would have
1129 been passed to the original subroutine. The fully qualified name
1130 of the original subroutine magically appears in the global C<$AUTOLOAD>
1131 variable of the same package as the C<AUTOLOAD> routine. The name
1132 is not passed as an ordinary argument because, er, well, just
1133 because, that's why...
1135 Many C<AUTOLOAD> routines load in a definition for the requested
1136 subroutine using eval(), then execute that subroutine using a special
1137 form of goto() that erases the stack frame of the C<AUTOLOAD> routine
1138 without a trace. (See the source to the standard module documented
1139 in L<AutoLoader>, for example.) But an C<AUTOLOAD> routine can
1140 also just emulate the routine and never define it. For example,
1141 let's pretend that a function that wasn't defined should just invoke
1142 C<system> with those arguments. All you'd do is:
1145 my $program = $AUTOLOAD;
1146 $program =~ s/.*:://;
1147 system($program, @_);
1153 In fact, if you predeclare functions you want to call that way, you don't
1154 even need parentheses:
1156 use subs qw(date who ls);
1161 A more complete example of this is the standard Shell module, which
1162 can treat undefined subroutine calls as calls to external programs.
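A self-contained sketch of this emulating style follows. The C<$runner> hook is a hypothetical addition, not part of the Shell module: it defaults to C<system> but can be swapped out so the dispatch can be observed without running any external program.

```perl
use strict;
use warnings;

our $AUTOLOAD;
# Hypothetical hook so the sketch can be exercised without running real programs.
our $runner = sub { system(@_) };

sub AUTOLOAD {
    my $program = $AUTOLOAD;
    $program =~ s/.*:://;                 # strip the package qualifier
    return if $program eq 'DESTROY';      # never treat object destruction as a command
    return $runner->($program, @_);
}

{
    local $runner = sub { "would run: @_" };   # stand-in for system()
    my $out = who('am', 'i');                  # undefined sub, so AUTOLOAD handles it
    print "$out\n";                            # prints "would run: who am i"
}
```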
1164 Mechanisms are available to help module writers split their modules
1165 into autoloadable files. See the standard AutoLoader module
1166 described in L<AutoLoader> and in L<AutoSplit>, the standard
1167 SelfLoader module in L<SelfLoader>, and the document on adding C
1168 functions to Perl code in L<perlxs>.
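The load-then-C<goto> style mentioned earlier can be sketched without any of those modules. Here the C<%source> table is a hypothetical stand-in for the files a real loader would read; this is an illustration of the technique, not how AutoLoader or SelfLoader are implemented.

```perl
use strict;
use warnings;

our $AUTOLOAD;
# Hypothetical table of source text, standing in for split-out files.
my %source = (
    double => 'sub { return 2 * $_[0] }',
);

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    (my $short = $name) =~ s/.*:://;
    return if $short eq 'DESTROY';
    my $text = $source{$short}
        or die "Undefined subroutine &$name called";
    my $code = eval $text or die $@;      # compile the definition on demand
    no strict 'refs';
    *$name = $code;                       # install it so AUTOLOAD is skipped next time
    goto &$code;                          # re-dispatch with the same @_, dropping this frame
}

my $answer = double(21);
print "$answer\n";                        # prints "42"
```

After the first call, C<double> exists as an ordinary subroutine, so later calls no longer pass through C<AUTOLOAD>.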
1170 =head1 SEE ALSO
1172 See L<perlref/"Function Templates"> for more about references and closures.
1173 See L<perlxs> if you'd like to learn about calling C subroutines from Perl.
1174 See L<perlembed> if you'd like to learn about calling Perl subroutines from C.
1175 See L<perlmod> to learn about bundling up your functions in separate files.
1176 See L<perlmodlib> to learn what library modules come standard on your system.
1177 See L<perltoot> to learn how to make object method calls.