pod/perlsub.pod

   1 =head1 NAME
   2
   3 perlsub - Perl subroutines
   4
   5 =head1 SYNOPSIS
   6
   7 To declare subroutines:
   8
   9     sub NAME;             # A "forward" declaration.
  10     sub NAME(PROTO);      #  ditto, but with prototypes
  11
  12     sub NAME BLOCK        # A declaration and a definition.
  13     sub NAME(PROTO) BLOCK #  ditto, but with prototypes
  14
  15 To define an anonymous subroutine at runtime:
  16
  17     $subref = sub BLOCK;
  18
  19 To import subroutines:
  20
  21     use PACKAGE qw(NAME1 NAME2 NAME3);
  22
  23 To call subroutines:
  24
  25     NAME(LIST);    # & is optional with parens.
  26     NAME LIST;     # Parens optional if predeclared/imported.
  27     &NAME;         # Passes current @_ to subroutine.
  28
  29 =head1 DESCRIPTION
  30
  31 Like many languages, Perl provides for user-defined subroutines.  These
  32 may be located anywhere in the main program, loaded in from other files
  33 via the C<do>, C<require>, or C<use> keywords, or even generated on the
  34 fly using C<eval> or anonymous subroutines (closures).  You can even call
  35 a function indirectly using a variable containing its name or a CODE reference
  36 to it, as in C<$var = \&function>.
  37
  38 The Perl model for function call and return values is simple: all
  39 functions are passed as parameters one single flat list of scalars, and
  40 all functions likewise return to their caller one single flat list of
  41 scalars.  Any arrays or hashes in these call and return lists will
  42 collapse, losing their identities--but you may always use
  43 pass-by-reference instead to avoid this.  Both call and return lists may
  44 contain as many or as few scalar elements as you'd like.  (Often a
  45 function without an explicit return statement is called a subroutine, but
  46 there's really no difference from the language's perspective.)
  47
  48 Any arguments passed to the routine come in as the array @_.  Thus if you
  49 called a function with two arguments, those would be stored in C<$_[0]>
  50 and C<$_[1]>.  The array @_ is a local array, but its values are implicit
  51 references (predating L<perlref>) to the actual scalar parameters.  The
  52 return value of the subroutine is the value of the last expression
  53 evaluated.  Alternatively, a return statement may be used to specify the
  54 returned value and exit the subroutine.  If you return one or more arrays
  55 and/or hashes, these will be flattened together into one large
  56 indistinguishable list.
  57
  58 Perl does not have named formal parameters, but in practice all you do is
  59 assign to a my() list of these.  Any variables you use in the function
  60 that aren't declared private are global variables.  For the gory details
  61 on creating private variables, see the sections below on L</"Private
  62 Variables via my()"> and L</"Temporary Values via local()">.  To create
  63 protected environments for a set of functions in a separate package (and
  64 probably a separate file), see L<perlmod/"Packages">.
  65
  66 Example:
  67
  68     sub max {
  69         my $max = shift(@_);
  70         foreach $foo (@_) {
  71             $max = $foo if $max < $foo;
  72         }
  73         return $max;
  74     }
  75     $bestday = max($mon,$tue,$wed,$thu,$fri);
  76
  77 Example:
  78
  79     # get a line, combining continuation lines
  80     #  that start with whitespace
  81
  82     sub get_line {
  83         $thisline = $lookahead;  # GLOBAL VARIABLES!!
  84         LINE: while ($lookahead = <STDIN>) {
  85             if ($lookahead =~ /^[ \t]/) {
  86                 $thisline .= $lookahead;
  87             }
  88             else {
  89                 last LINE;
  90             }
  91         }
  92         $thisline;
  93     }
  94
  95     $lookahead = <STDIN>;       # get first line
  96     while ($_ = get_line()) {
  97         ...
  98     }
  99
 100 Use array assignment to a local list to name your formal arguments:
 101
 102     sub maybeset {
 103         my($key, $value) = @_;
 104         $Foo{$key} = $value unless $Foo{$key};
 105     }
 106
 107 This also has the effect of turning call-by-reference into call-by-value,
 108 since the assignment copies the values.  Otherwise a function is free to
 109 do in-place modifications of @_ and change its caller's values.
 110
 111     upcase_in($v1, $v2);  # this changes $v1 and $v2
 112     sub upcase_in {
 113         for (@_) { tr/a-z/A-Z/ }
 114     }
 115
 116 You aren't allowed to modify constants in this way, of course.  If an
 117 argument were actually literal and you tried to change it, you'd take a
 118 (presumably fatal) exception.   For example, this won't work:
 119
 120     upcase_in("frederick");
 121
 122 It would be much safer if the upcase_in() function
 123 were written to return a copy of its parameters instead
 124 of changing them in place:
 125
 126     ($v3, $v4) = upcase($v1, $v2);  # this doesn't
 127     sub upcase {
 128         my @parms = @_;
 129         for (@parms) { tr/a-z/A-Z/ }
 130         # wantarray checks if we were called in list context
 131         return wantarray ? @parms : $parms[0];
 132     }
 133
 134 Notice how this (unprototyped) function doesn't care whether it was passed
 135 real scalars or arrays.  Perl will see everything as one big long flat @_
 136 parameter list.  This is one of the ways where Perl's simple
 137 argument-passing style shines.  The upcase() function would work perfectly
 138 well without changing the upcase() definition even if we fed it things
 139 like this:
 140
 141     @newlist   = upcase(@list1, @list2);
 142     @newlist   = upcase( split /:/, $var );
 143
 144 Do not, however, be tempted to do this:
 145
 146     (@a, @b)   = upcase(@list1, @list2);
 147
 148 That's because, like the flat incoming parameter list, the return list
 149 is also flat.  So all you have managed to do here is store everything in
 150 @a and make @b an empty list.  See L</"Pass by Reference"> for alternatives.
 151
 152 A subroutine may be called using the "&" prefix.  The "&" is optional in
 153 Perl 5, and so are the parens if the subroutine has been predeclared.
 154 (Note, however, that the "&" is I<NOT> optional when you're just naming
 155 the subroutine, such as when it's used as an argument to defined() or
 156 undef().  Nor is it optional when you want to do an indirect subroutine
 157 call with a subroutine name or reference using the C<&$subref()> or
 158 C<&{$subref}()> constructs.  See L<perlref> for more on that.)
 159
 160 Subroutines may be called recursively.  If a subroutine is called using
 161 the "&" form, the argument list is optional, and if omitted, no @_ array is
 162 set up for the subroutine: the @_ array at the time of the call is
 163 visible to the subroutine instead.  This is an efficiency mechanism that
 164 new users may wish to avoid.
 165
 166     &foo(1,2,3);        # pass three arguments
 167     foo(1,2,3);         # the same
 168
 169     foo();              # pass a null list
 170     &foo();             # the same
 171
 172     &foo;               # foo() gets current args, like foo(@_) !!
 173     foo;                # like foo() IFF sub foo pre-declared, else "foo"
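
As a minimal sketch of the difference between C<&foo;> and C<foo()>,
the following uses three made-up subroutines (C<inner>, C<outer_amp> and
C<outer_paren> are illustrative names only) and assumes a reasonably
modern Perl 5:

```perl
use strict;
use warnings;

# Reports how many arguments it can see in @_.
sub inner { return scalar @_ }

# Calls inner with the "&" form and no argument list, so inner
# shares the @_ that outer_amp was called with.
sub outer_amp { return &inner; }

# Calls inner with an explicit empty list, so inner gets its own empty @_.
sub outer_paren { return inner(); }

print outer_amp(1, 2, 3), " ", outer_paren(1, 2, 3), "\n";   # prints "3 0"
```

The sharing is easy to trigger by accident, which is one reason to prefer
the parenthesized form unless the shared @_ is really what you want.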
 174
 175 Not only does the "&" form make the argument list optional, but it also
 176 disables any prototype checking on the arguments you do provide.  This
 177 is partly for historical reasons, and partly for having a convenient way
 178 to cheat if you know what you're doing.  See the section on Prototypes below.
 179
 180 =head2 Private Variables via my()
 181
 182 Synopsis:
 183
 184     my $foo;            # declare $foo lexically local
 185     my (@wid, %get);    # declare list of variables local
 186     my $foo = "flurp";  # declare $foo lexical, and init it
 187     my @oof = @bar;     # declare @oof lexical, and init it
 188
 189 A "my" declares the listed variables to be confined (lexically) to the
 190 enclosing block, subroutine, C<eval>, or C<do/require/use>'d file.  If
 191 more than one value is listed, the list must be placed in parens.  All
 192 listed elements must be legal lvalues.  Only alphanumeric identifiers may
 193 be lexically scoped--magical builtins like $/ must currently be localized with
 194 "local" instead.
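
For instance, a common way to read a whole file at once is to give C<$/>
a temporary undefined value with C<local>.  The sketch below reads from
an in-memory filehandle so that it is self-contained; opening a reference
to a scalar is an assumption about the Perl in use (it appeared in
releases later than the one described here), and the variable names are
invented for illustration.

```perl
use strict;
use warnings;

my $data = "alpha\nbeta\ngamma\n";

# Read line by line with the default record separator.
open(my $in, '<', \$data) or die "Cannot open in-memory file: $!";
my @lines = <$in>;
close($in);

# Read everything in one go: inside the do block $/ is undefined, and
# its previous value is restored automatically when the block exits.
open($in, '<', \$data) or die "Cannot open in-memory file: $!";
my $whole = do { local $/; <$in> };
close($in);

print scalar(@lines), " lines, ", length($whole), " characters\n";
# prints "3 lines, 17 characters"
```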
 195
 196 Unlike dynamic variables created by the "local" statement, lexical
 197 variables declared with "my" are totally hidden from the outside world,
 198 including any called subroutines (even if it's the same subroutine called
 199 from itself or elsewhere--every call gets its own copy).
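
The contrast with "local" can be sketched as follows.  The subroutine and
variable names are hypothetical, and the C<our> declaration assumes a
Perl recent enough to provide it:

```perl
use strict;
use warnings;

our $level = 'outer';            # a package (global) variable

# Reads whatever value the package variable has when it is called.
sub show_level { return $level }

sub with_local {
    local $level = 'localized';  # temporary value, visible to called subs
    return show_level();
}

sub with_my {
    my $level = 'lexical';       # a separate variable, hidden from show_level
    return show_level();
}

print with_local(), "\n";        # prints "localized"
print with_my(), "\n";           # prints "outer"
print show_level(), "\n";        # prints "outer"
```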
 200
 201 (An eval(), however, can see the lexical variables of the scope it is
 202 being evaluated in so long as the names aren't hidden by declarations within
 203 the eval() itself.  See L<perlref>.)
 204
 205 The parameter list to my() may be assigned to if desired, which allows you
 206 to initialize your variables.  (If no initializer is given for a
 207 particular variable, it is created with the undefined value.)  Commonly
 208 this is used to name the parameters to a subroutine.  Examples:
 209
 210     $arg = "fred";        # "global" variable
 211     $n = cube_root(27);
 212     print "$arg thinks the root is $n\n";
 213  fred thinks the root is 3
 214
 215     sub cube_root {
 216         my $arg = shift;  # name doesn't matter
 217         $arg **= 1/3;
 218         return $arg;
 219     }
 220
 221 The "my" is simply a modifier on something you might assign to.  So when
 222 you do assign to the variables in its argument list, the "my" doesn't
 223 change whether those variables are viewed as a scalar or an array.  So
 224
 225     my ($foo) = <STDIN>;
 226     my @FOO = <STDIN>;
 227
 228 both supply a list context to the righthand side, while
 229
 230     my $foo = <STDIN>;
 231
 232 supplies a scalar context.  But the following only declares one variable:
 233
 234     my $foo, $bar = 1;
 235
 236 That has the same effect as
 237
 238     my $foo;
 239     $bar = 1;
 240
 241 The declared variable is not introduced (is not visible) until after
 242 the current statement.  Thus,
 243
 244     my $x = $x;
 245
 246 can be used to initialize the new $x with the value of the old $x, and
 247 the expression
 248
 249     my $x = 123 and $x == 123
 250
 251 is false unless the old $x happened to have the value 123.
 252
 253 Some users may wish to encourage the use of lexically scoped variables.
 254 As an aid to catching implicit references to package variables,
 255 if you say
 256
 257     use strict 'vars';
 258
 259 then any variable reference from there to the end of the enclosing
 260 block must either refer to a lexical variable, or must be fully
 261 qualified with the package name.  A compilation error results
 262 otherwise.  An inner block may countermand this with S<"no strict 'vars'">.
 263
 264 A my() has both a compile-time and a run-time effect.  At compile time,
 265 the compiler takes notice of it; the principal usefulness of this is to
 266 quiet C<use strict 'vars'>.  The actual initialization doesn't happen
 267 until run time, so gets executed every time through a loop.
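
The run-time half can be seen in a loop: each pass through the body
creates a fresh variable, which a closure can then capture.  This is only
a small sketch with invented names, and the C<for my> syntax assumes a
Perl newer than the release described here:

```perl
use strict;
use warnings;

my @subs;
for my $i (1 .. 3) {
    my $square = $i * $i;                  # a new $square on every iteration
    push @subs, sub { return $square };    # each closure keeps its own copy
}

print join(',', map { $_->() } @subs), "\n";   # prints "1,4,9"
```

If the my() were executed only once, all three closures would share one
variable and report the same value.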
 268
 269 Variables declared with "my" are not part of any package and are therefore
 270 never fully qualified with the package name.  In particular, you're not
 271 allowed to try to make a package variable (or other global) lexical:
 272
 273     my $pack::var;      # ERROR!  Illegal syntax
 274     my $_;              # also illegal (currently)
 275
 276 In fact, dynamic variables (also known as package or global variables)
 277 are still accessible using the fully qualified :: notation even while a
 278 lexical of the same name is also visible:
 279
 280     package main;
 281     local $x = 10;
 282     my    $x = 20;
 283     print "$x and $::x\n";
 284
 285 That will print out 20 and 10.
 286
 287 You may declare "my" variables at the outermost scope of a file to
 288 totally hide any such identifiers from the outside world.  This is similar
 289 to C's static variables at the file level.  To do this with a subroutine
 290 requires the use of a closure (anonymous function).  If a block (such as
 291 an eval(), function, or C<package>) wants to create a private subroutine
 292 that cannot be called from outside that block, it can declare a lexical
 293 variable containing an anonymous sub reference:
 294
 295     my $secret_version = '1.001-beta';
 296     my $secret_sub = sub { print $secret_version };
 297     &$secret_sub();
 298
 299 As long as the reference is never returned by any function within the
 300 module, no outside module can see the subroutine, since its name is not in
 301 any package's symbol table.  Remember that it's not I<REALLY> called
 302 $some_pack::secret_version or anything; it's just $secret_version,
 303 unqualified and unqualifiable.
 304
 305 This does not work with object methods, however; all object methods have
 306 to be in the symbol table of some package to be found.
 307
 308 Just because the lexical variable is lexically (also called statically)
 309 scoped doesn't mean that within a function it works like a C static.  It
 310 normally works more like a C auto.  But here's a mechanism for giving a
 311 function private variables with both lexical scoping and a static
 312 lifetime.  If you do want to create something like C's static variables,
 313 just enclose the whole function in an extra block, and put the
 314 static variable outside the function but in the block.
 315
 316     {
 317         my $secret_val = 0;
 318         sub gimme_another {
 319             return ++$secret_val;
 320         }
 321     }
 322     # $secret_val now becomes unreachable by the outside
 323     # world, but retains its value between calls to gimme_another
 324
 325 If this function is being sourced in from a separate file
 326 via C<require> or C<use>, then this is probably just fine.  If it's
 327 all in the main program, you'll need to arrange for the my()
 328 to be executed early, either by putting the whole block above
 329 your main program, or more likely, merely placing a BEGIN
 330 sub around it to make sure it gets executed before your program
 331 starts to run:
 332
 333     sub BEGIN {
 334         my $secret_val = 0;
 335         sub gimme_another {
 336             return ++$secret_val;
 337         }
 338     }
 339
 340 See L<perlmod> about the BEGIN function.
 341
 342 =head2 Temporary Values via local()
 343
 344 B<NOTE>: In general, you should be using "my" instead of "local", because
 345 it's faster and safer.  Exceptions to this include the global punctuation
 346 variables, filehandles and formats, and direct manipulation of the Perl
 347 symbol table itself.  Format variables often use "local" though, as do
 348 other variables whose current value must be visible to called
 349 subroutines.
 350
 351 Synopsis:
 352
 353     local $foo;                 # declare $foo dynamically local
 354     local (@wid, %get);         # declare list of variables local
 355     local $foo = "flurp";       # declare $foo dynamic, and init it
 356     local @oof = @bar;          # declare @oof dynamic, and init it
 357
 358     local *FH;                  # localize $FH, @FH, %FH, &FH  ...
 359     local *merlyn = *randal;    # now $merlyn is really $randal, plus
 360                                 #     @merlyn is really @randal, etc
 361     local *merlyn = 'randal';   # SAME THING: promote 'randal' to *randal
 362     local *merlyn = \$randal;   # just alias $merlyn, not @merlyn etc
 363
 364 A local() modifies its listed variables to be local to the enclosing
 365 block (or subroutine, C<eval{}>, or C<do>) and I<to any subroutine called
 366 from within that block>.  A local() just gives temporary values to global
 367 (meaning package) variables.  This is known as dynamic scoping.  Lexical
 368 scoping is done with "my", which works more like C's auto declarations.
 369
 370 If more than one variable is given to local(), they must be placed in
 371 parens.  All listed elements must be legal lvalues.  This operator works
 372 by saving the current values of those variables in its argument list on a
 373 hidden stack and restoring them upon exiting the block, subroutine or
 374 eval.  This means that called subroutines can also reference the local
 375 variable, but not the global one.  The argument list may be assigned to if
 376 desired, which allows you to initialize your local variables.  (If no
 377 initializer is given for a particular variable, it is created with an
 378 undefined value.)  Commonly this is used to name the parameters to a
 379 subroutine.  Examples:
 380
 381     for $i ( 0 .. 9 ) {
 382         $digits{$i} = $i;
 383     }
 384     # assume this function uses global %digits hash
 385     parse_num();
 386
 387     # now temporarily add to %digits hash
 388     if ($base12) {
 389         # (NOTE: not claiming this is efficient!)
 390         local %digits  = (%digits, 't' => 10, 'e' => 11);
 391         parse_num();  # parse_num gets this new %digits!
 392     }
 393     # old %digits restored here
 394
 395 Because local() is a run-time command, it gets executed every time
 396 through a loop.  In releases of Perl previous to 5.0, this used more stack
 397 storage each time until the loop was exited.  Perl now reclaims the space
 398 each time through, but it's still more efficient to declare your variables
 399 outside the loop.
 400
 401 A local is simply a modifier on an lvalue expression.  When you assign to
 402 a localized variable, the local doesn't change whether its list is viewed
 403 as a scalar or an array.  So
 404
 405     local($foo) = <STDIN>;
 406     local @FOO = <STDIN>;
 407
 408 both supply a list context to the righthand side, while
 409
 410     local $foo = <STDIN>;
 411
 412 supplies a scalar context.
 413
 414 =head2 Passing Symbol Table Entries (typeglobs)
 415
 416 [Note:  The mechanism described in this section was originally the only
 417 way to simulate pass-by-reference in older versions of Perl.  While it
 418 still works fine in modern versions, the new reference mechanism is
 419 generally easier to work with.  See below.]
 420
 421 Sometimes you don't want to pass the value of an array to a subroutine
 422 but rather the name of it, so that the subroutine can modify the global
 423 copy of it rather than working with a local copy.  In perl you can
 424 refer to all objects of a particular name by prefixing the name
 425 with a star: C<*foo>.  This is often known as a "type glob", since the
 426 star on the front can be thought of as a wildcard match for all the
 427 funny prefix characters on variables and subroutines and such.
 428
 429 When evaluated, the type glob produces a scalar value that represents
 430 all the objects of that name, including any filehandle, format or
 431 subroutine.  When assigned to, it causes the name mentioned to refer to
 432 whatever "*" value was assigned to it.  Example:
 433
 434     sub doubleary {
 435         local(*someary) = @_;
 436         foreach $elem (@someary) {
 437             $elem *= 2;
 438         }
 439     }
 440     doubleary(*foo);
 441     doubleary(*bar);
 442
 443 Note that scalars are already passed by reference, so you can modify
 444 scalar arguments without using this mechanism by referring explicitly
 445 to $_[0] etc.  You can modify all the elements of an array by passing
 446 all the elements as scalars, but you have to use the * mechanism (or
 447 the equivalent reference mechanism) to push, pop or change the size of
 448 an array.  It will certainly be faster to pass the typeglob (or reference).
 449
 450 Even if you don't want to modify an array, this mechanism is useful for
 451 passing multiple arrays in a single LIST, since normally the LIST
 452 mechanism will merge all the array values so that you can't extract out
 453 the individual arrays.  For more on typeglobs, see L<perldata/"Typeglobs">.
 454
 455 =head2 Pass by Reference
 456
 457 If you want to pass more than one array or hash into a function--or
 458 return them from it--and have them maintain their integrity,
 459 then you're going to have to use an explicit pass-by-reference.
 460 Before you do that, you need to understand references as detailed in L<perlref>.
 461 This section may not make much sense to you otherwise.
 462
 463 Here are a few simple examples.  First, let's pass in several
 464 arrays to a function and have it pop all of them, returning a new
 465 list of all their former last elements:
 466
 467     @tailings = popmany ( \@a, \@b, \@c, \@d );
 468
 469     sub popmany {
 470         my $aref;
 471         my @retlist = ();
 472         foreach $aref ( @_ ) {
 473             push @retlist, pop @$aref;
 474         }
 475         return @retlist;
 476     }
 477
 478 Here's how you might write a function that returns a
 479 list of keys occurring in all the hashes passed to it:
 480
 481     @common = inter( \%foo, \%bar, \%joe );
 482     sub inter {
 483         my ($k, $href, %seen); # locals
 484         foreach $href (@_) {
 485             while ( defined($k = each %$href) ) {
 486                 $seen{$k}++;
 487             }
 488         }
 489         return grep { $seen{$_} == @_ } keys %seen;
 490     }
 491
 492 So far, we're just using the normal list return mechanism.
 493 What happens if you want to pass or return a hash?  Well,
 494 if you're only using one of them, or you don't mind them
 495 concatenating, then the normal calling convention is ok, although
 496 a little expensive.
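
For the single-hash case, the normal calling convention can be sketched
like this.  The function names and the contents of the hash are made up
for illustration:

```perl
use strict;
use warnings;

# Returns a hash as a flat list of key/value pairs.
sub defaults {
    my %d = (color => 'red', size => 'large');
    return %d;
}

# Rebuilds a private hash from the flat argument list.
sub describe {
    my %opt = @_;
    return join ', ', map { "$_=$opt{$_}" } sort keys %opt;
}

my %settings = defaults();          # copy the pairs into a new hash
print describe(%settings), "\n";    # prints "color=red, size=large"
```

Every key and value is copied on the way in and again on the way out,
which is where the expense comes from.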
 497
 498 Where people get into trouble is here:
 499
 500     (@a, @b) = func(@c, @d);
 501 or
 502     (%a, %b) = func(%c, %d);
 503
 504 That syntax simply won't work.  It just sets @a or %a and clears the @b or
 505 %b.  Plus the function didn't get passed two separate arrays or
 506 hashes: it got one long list in @_, as always.
 507
 508 If you can arrange for everyone to deal with this through references, it's
 509 cleaner code, although not so nice to look at.  Here's a function that
 510 takes two array references as arguments, returning those references
 511 in order of how many elements the arrays have in them:
 512
 513     ($aref, $bref) = func(\@c, \@d);
 514     print "@$aref has more than @$bref\n";
 515     sub func {
 516         my ($cref, $dref) = @_;
 517         if (@$cref > @$dref) {
 518             return ($cref, $dref);
 519         } else {
 520             return ($dref, $cref);
 521         }
 522     }
 523
 524 It turns out that you can actually do this also:
 525
 526     (*a, *b) = func(\@c, \@d);
 527     print "@a has more than @b\n";
 528     sub func {
 529         local (*c, *d) = @_;
 530         if (@c > @d) {
 531             return (\@c, \@d);
 532         } else {
 533             return (\@d, \@c);
 534         }
 535     }
 536
 537 Here we're using the typeglobs to do symbol table aliasing.  It's
 538 a tad subtle, though, and also won't work if you're using my()
 539 variables, since only globals (well, and local()s) are in the symbol table.
 540
 541 If you're passing around filehandles, you could usually just use the bare
 542 typeglob, like *STDOUT, but typeglob references would be better because
 543 they'll still work properly under C<use strict 'refs'>.  For example:
 544
 545     splutter(\*STDOUT);
 546     sub splutter {
 547         my $fh = shift;
 548         print $fh "her um well a hmmm\n";
 549     }
 550
 551     $rec = get_rec(\*STDIN);
 552     sub get_rec {
 553         my $fh = shift;
 554         return scalar <$fh>;
 555     }
 556
 557 If you're planning on generating new filehandles, you could do this:
 558
 559     sub openit {
 560         my $path = shift;
 561         local *FH;
 562         return open (FH, $path) ? \*FH : undef;
 563     }
 564
 565 Be aware, though, that this will actually produce a small memory leak.  See the bottom
 566 of L<perlfunc/open()> for a somewhat cleaner way using the FileHandle
 567 functions supplied with the POSIX package.
 568
 569 =head2 Prototypes
 570
 571 As of the 5.002 release of perl, if you declare
 572
 573     sub mypush (\@@)
 574
 575 then mypush() takes arguments exactly like push() does.  The declaration
 576 of the function to be called must be visible at compile time.  The prototype
 577 only affects the interpretation of new-style calls to the function, where
 578 new-style is defined as not using the C<&> character.  In other words,
 579 if you call it like a builtin function, then it behaves like a builtin
 580 function.  If you call it like an old-fashioned subroutine, then it
 581 behaves like an old-fashioned subroutine.  It naturally falls out from
 582 this rule that prototypes have no influence on subroutine references
 583 like C<\&foo> or on indirect subroutine calls like C<&{$subref}>.
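
A short sketch of that rule, using a hypothetical function
C<count_or_first> whose C<($)> prototype is honored only by the
new-style call:

```perl
use strict;
use warnings;

# Hypothetical function: returns its first argument.  With the ($)
# prototype, a new-style call evaluates the argument in scalar context.
sub count_or_first ($) {
    return $_[0];
}

my @items = ('a', 'b', 'c');

my $new_style = count_or_first(@items);    # scalar(@items), i.e. 3
my $old_style = &count_or_first(@items);   # prototype ignored, gets 'a'
my $subref    = \&count_or_first;
my $indirect  = &$subref(@items);          # prototype ignored here too

print "$new_style $old_style $indirect\n"; # prints "3 a a"
```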
 584
 585 Method calls are not influenced by prototypes either, because the
 586 function to be called is indeterminate at compile time, since it depends
 587 on inheritance.
 588
 589 Since the intent is primarily to let you define subroutines that work
 590 like builtin commands, here are the prototypes for some other functions
 591 that parse almost exactly like the corresponding builtins.
 592
 593     Declared as                 Called as
 594
 595     sub mylink ($$)             mylink $old, $new
 596     sub myvec ($$$)             myvec $var, $offset, 1
 597     sub myindex ($$;$)          myindex &getstring, "substr"
 598     sub mysyswrite ($$$;$)      mysyswrite $buf, 0, length($buf) - $off, $off
 599     sub myreverse (@)           myreverse $a,$b,$c
 600     sub myjoin ($@)             myjoin ":",$a,$b,$c
 601     sub mypop (\@)              mypop @array
 602     sub mysplice (\@$$@)        mysplice @array,@array,0,@pushme
 603     sub mykeys (\%)             mykeys %{$hashref}
 604     sub myopen (*;$)            myopen HANDLE, $name
 605     sub mypipe (**)             mypipe READHANDLE, WRITEHANDLE
 606     sub mygrep (&@)             mygrep { /foo/ } $a,$b,$c
 607     sub myrand ($)              myrand 42
 608     sub mytime ()               mytime
 609
 610 Any backslashed prototype character represents an actual argument
 611 that absolutely must start with that character.
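
As an illustration, here is one way mypush() might be written.  The body
is an assumption for the sake of the example; only the prototype comes
from the text above.

```perl
use strict;
use warnings;

# The \@ in the prototype makes Perl pass a reference to the array
# that the caller names, so no backslash is needed at the call site.
sub mypush (\@@) {
    my ($aref, @new) = @_;
    push @$aref, @new;
    return scalar @$aref;      # like push(), return the new length
}

my @stack = (1, 2);
my $count = mypush @stack, 3, 4;

print "@stack ($count)\n";     # prints "1 2 3 4 (4)"
```

Passing anything that does not start with C<@> as the first argument,
such as a scalar or a list literal, is rejected at compile time.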
 612
 613 Unbackslashed prototype characters have special meanings.  Any
 614 unbackslashed @ or % eats all the rest of the arguments, and forces
 615 list context.  An argument represented by $ forces scalar context.  An
 616 & requires an anonymous subroutine, which, if passed as the first
 617 argument, does not require the "sub" keyword or a subsequent comma.  A
 618 * does whatever it has to do to turn the argument into a reference to a
 619 symbol table entry.
 620
 621 A semicolon separates mandatory arguments from optional arguments.
 622 (It is redundant before @ or %.)
 623
 624 Note how the last three examples above are treated specially by the parser.
 625 mygrep() is parsed as a true list operator, myrand() is parsed as a
 626 true unary operator with unary precedence the same as rand(), and
 627 mytime() is truly argumentless, just like time().  That is, if you
 628 say
 629
 630     mytime +2;
 631
 632 you'll get mytime() + 2, not mytime(2), which is how it would be parsed
 633 without the prototype.
 634
 635 The interesting thing about & is that you can generate new syntax with it:
 636
 637     sub try (&$) {
 638         my($try,$catch) = @_;
 639         eval { &$try };
 640         if ($@) {
 641             local $_ = $@;
 642             &$catch;
 643         }
 644     }
 645     sub catch (&) { @_ }
 646
 647     try {
 648         die "phooey";
 649     } catch {
 650         /phooey/ and print "unphooey\n";
 651     };
 652
 653 That prints "unphooey".  (Yes, there are still unresolved
 654 issues having to do with the visibility of @_.  I'm ignoring that
 655 question for the moment.  (But note that if we make @_ lexically
 656 scoped, those anonymous subroutines can act like closures... (Gee,
 657 is this sounding a little Lispish?  (Nevermind.))))
 658
 659 And here's a reimplementation of grep:
 660
 661     sub mygrep (&@) {
 662         my $code = shift;
 663         my @result;
 664         foreach $_ (@_) {
 665             push(@result, $_) if &$code;
 666         }
 667         @result;
 668     }
 669
 670 Some folks would prefer full alphanumeric prototypes.  Alphanumerics have
 671 been intentionally left out of prototypes for the express purpose of
 672 someday in the future adding named, formal parameters.  The current
 673 mechanism's main goal is to let module writers provide better diagnostics
 674 for module users.  Larry feels the notation quite understandable to Perl
 675 programmers, and that it will not intrude greatly upon the meat of the
 676 module, nor make it harder to read.  The line noise is visually
 677 encapsulated into a small pill that's easy to swallow.
 678
 679 It's probably best to prototype new functions, not retrofit prototyping
 680 into older ones.  That's because you must be especially careful about
 681 silent impositions of differing list versus scalar contexts.  For example,
 682 if you decide that a function should take just one parameter, like this:
 683
 684     sub func ($) {
 685         my $n = shift;
 686         print "you gave me $n\n";
 687     }
 688
 689 and someone has been calling it with an array or expression
 690 returning a list:
 691
 692     func(@foo);
 693     func( split /:/ );
 694
 695 Then you've just supplied an automatic scalar() in front of their
 696 argument, which can be more than a bit surprising.  The old @foo
 697 which used to hold one thing doesn't get passed in.  Instead,
 698 the func() now gets passed in 1, that is, the number of elements
 699 in @foo.  And the split() gets called in a scalar context and
 700 starts scribbling on your @_ parameter list.
 701
 702 This is all very powerful, of course, and should only be used in moderation
 703 to make the world a better place.
 704
 705 =head2 Overriding Builtin Functions
 706
 707 Many builtin functions may be overridden, though this should only be
 708 tried occasionally and for good reason.  Typically this might be
 709 done by a package attempting to emulate missing builtin functionality
 710 on a non-Unix system.
 711
 712 Overriding may only be done by importing the name from a
 713 module--ordinary predeclaration isn't good enough.  However, the
 714 C<subs> pragma (compiler directive) lets you, in effect, predeclare subs
 715 via the import syntax, and these names may then override the builtin ones:
 716
 717     use subs 'chdir', 'chroot', 'chmod', 'chown';
 718     chdir $somewhere;
 719     sub chdir { ... }
 720
 721 Library modules should not in general export builtin names like "open"
 722 or "chdir" as part of their default @EXPORT list, since these may
 723 sneak into someone else's namespace and change the semantics unexpectedly.
 724 Instead, if the module adds the name to the @EXPORT_OK list, then it's
 725 possible for a user to import the name explicitly, but not implicitly.
 726 That is, they could say
 727
 728     use Module 'open';
 729
 730 and it would import the open override, but if they said
 731
 732     use Module;
 733
 734 they would get the default imports without the overrides.
 735
 736 =head2 Autoloading
 737
 738 If you call a subroutine that is undefined, you would ordinarily get an
 739 immediate fatal error complaining that the subroutine doesn't exist.
 740 (Likewise for subroutines being used as methods, when the method
 741 doesn't exist in any of the base classes of the class package.) If,
 742 however, there is an C<AUTOLOAD> subroutine defined in the package or
 743 packages that were searched for the original subroutine, then that
 744 C<AUTOLOAD> subroutine is called with the arguments that would have been
 745 passed to the original subroutine.  The fully qualified name of the
 746 original subroutine magically appears in the $AUTOLOAD variable in the
 747 same package as the C<AUTOLOAD> routine.  The name is not passed as an
 748 ordinary argument because, er, well, just because, that's why...
 749
 750 Most C<AUTOLOAD> routines will load in a definition for the subroutine in
 751 question using eval, and then execute that subroutine using a special
 752 form of "goto" that erases the stack frame of the C<AUTOLOAD> routine
 753 without a trace.  (See the standard C<AutoLoader> module, for example.)
 754 But an C<AUTOLOAD> routine can also just emulate the routine and never
 755 define it.   For example, let's pretend that a function that wasn't defined
 756 should just call system() with those arguments.  All you'd do is this:
 757
 758     sub AUTOLOAD {
 759         my $program = $AUTOLOAD;
 760         $program =~ s/.*:://;
 761         system($program, @_);
 762     }
 763     date();
 764     who('am', 'i');
 765     ls('-l');
 766
 767 In fact, if you predeclare the functions you want to call that way, you don't
 768 even need the parentheses:
 769
 770     use subs qw(date who ls);
 771     date;
 772     who "am", "i";
 773     ls '-l';
 774
 775 A more complete example of this is the standard Shell module, which
 776 can treat undefined subroutine calls as calls to Unix programs.
 777
 778 Mechanisms are available for module writers to help split the modules
 779 up into autoloadable files.  See the standard AutoLoader module described
 780 in L<AutoLoader>, the standard SelfLoader module in L<SelfLoader>, and
 781 the document on adding C functions to perl code in L<perlxs>.
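
The define-then-C<goto> style of C<AUTOLOAD> routine described earlier in
this section can be sketched as follows.  Instead of compiling source
text with eval, this version installs a closure through a typeglob
assignment, which is a swapped-in alternative rather than what
AutoLoader itself does; the C<%field> table and the name() and legs()
subroutines are hypothetical.

```perl
use strict;
use warnings;

our $AUTOLOAD;

# Hypothetical table of subroutine names this package will generate.
my %field = (name => 'Camel', legs => 4);

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;
    die "No such subroutine: $AUTOLOAD" unless exists $field{$name};

    # Install a real subroutine so later calls skip AUTOLOAD entirely.
    no strict 'refs';
    *{$AUTOLOAD} = sub { return $field{$name} };

    # Replace AUTOLOAD's stack frame with the newly defined subroutine.
    goto &$AUTOLOAD;
}

print name(), " has ", legs(), " legs\n";   # prints "Camel has 4 legs"
```

After the first call, name() and legs() exist in the symbol table, so
the cost of going through C<AUTOLOAD> is paid only once per name.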
 782
 783 =head1 SEE ALSO
 784
 785 See L<perlref> for more on references.  See L<perlxs> if you'd
 786 like to learn about calling C subroutines from perl.  See
 787 L<perlmod> to learn about bundling up your functions in
 788 separate files.