pod/perlsub.pod

   1 =head1 NAME
   2
   3 perlsub - Perl subroutines
   4
   5 =head1 SYNOPSIS
   6
   7 To declare subroutines:
   8
   9     sub NAME;             # A "forward" declaration.
  10     sub NAME(PROTO);      #  ditto, but with prototypes
  11
  12     sub NAME BLOCK        # A declaration and a definition.
  13     sub NAME(PROTO) BLOCK #  ditto, but with prototypes
  14
  15 To define an anonymous subroutine at runtime:
  16
  17     $subref = sub BLOCK;
  18
  19 To import subroutines:
  20
  21     use PACKAGE qw(NAME1 NAME2 NAME3);
  22
  23 To call subroutines:
  24
  25     NAME(LIST);    # & is optional with parens.
  26     NAME LIST;     # Parens optional if predeclared/imported.
  27     &NAME;         # Passes current @_ to subroutine.
  28
  29 =head1 DESCRIPTION
  30
  31 Like many languages, Perl provides for user-defined subroutines.  These
  32 may be located anywhere in the main program, loaded in from other files
  33 via the C<do>, C<require>, or C<use> keywords, or even generated on the
  34 fly using C<eval> or anonymous subroutines (closures).  You can even call
  35 a function indirectly using a variable containing its name or a CODE reference.
  36
  37 The Perl model for function call and return values is simple: all
  38 functions are passed as parameters one single flat list of scalars, and
  39 all functions likewise return to their caller one single flat list of
  40 scalars.  Any arrays or hashes in these call and return lists will
  41 collapse, losing their identities--but you may always use
  42 pass-by-reference instead to avoid this.  Both call and return lists may
  43 contain as many or as few scalar elements as you'd like.  (Often a
  44 function without an explicit return statement is called a subroutine, but
  45 there's really no difference from the language's perspective.)
  46
  47 Any arguments passed to the routine come in as the array @_.  Thus if you
  48 called a function with two arguments, those would be stored in C<$_[0]>
  49 and C<$_[1]>.  The array @_ is a local array, but its values are implicit
  50 references (predating L<perlref>) to the actual scalar parameters.  The
  51 return value of the subroutine is the value of the last expression
  52 evaluated.  Alternatively, a return statement may be used to specify the
  53 returned value and exit the subroutine.  If you return one or more arrays
  54 and/or hashes, these will be flattened together into one large
  55 indistinguishable list.
  56
  57 Perl does not have named formal parameters, but in practice all you do is
  58 assign to a my() list of these.  Any variables you use in the function
  59 that aren't declared private are global variables.  For the gory details
  60 on creating private variables, see the sections below on L<"Private
  61 Variables via my()"> and L</"Temporary Values via local()">.  To create
  62 protected environments for a set of functions in a separate package (and
  63 probably a separate file), see L<perlmod/"Packages">.
  64
  65 Example:
  66
  67     sub max {
  68         my $max = shift(@_);
  69         foreach $foo (@_) {
  70             $max = $foo if $max < $foo;
  71         }
  72         return $max;
  73     }
  74     $bestday = max($mon,$tue,$wed,$thu,$fri);
  75
  76 Example:
  77
  78     # get a line, combining continuation lines
  79     #  that start with whitespace
  80
  81     sub get_line {
  82         $thisline = $lookahead;  # GLOBAL VARIABLES!!
  83         LINE: while ($lookahead = <STDIN>) {
  84             if ($lookahead =~ /^[ \t]/) {
  85                 $thisline .= $lookahead;
  86             }
  87             else {
  88                 last LINE;
  89             }
  90         }
  91         $thisline;
  92     }
  93
  94     $lookahead = <STDIN>;       # get first line
  95     while ($_ = get_line()) {
  96         ...
  97     }
  98
  99 Use array assignment to a local list to name your formal arguments:
 100
 101     sub maybeset {
 102         my($key, $value) = @_;
 103         $Foo{$key} = $value unless $Foo{$key};
 104     }
 105
 106 This also has the effect of turning call-by-reference into call-by-value,
 107 since the assignment copies the values.  Otherwise a function is free to
  108 do in-place modifications of @_ and change its caller's values.
 109
 110     upcase_in($v1, $v2);  # this changes $v1 and $v2
 111     sub upcase_in {
 112         for (@_) { tr/a-z/A-Z/ }
 113     }
 114
 115 You aren't allowed to modify constants in this way, of course.  If an
 116 argument were actually literal and you tried to change it, you'd take a
 117 (presumably fatal) exception.   For example, this won't work:
 118
 119     upcase_in("frederick");
 120
 121 It would be much safer if the upcase_in() function
 122 were written to return a copy of its parameters instead
 123 of changing them in place:
 124
 125     ($v3, $v4) = upcase($v1, $v2);  # this doesn't
 126     sub upcase {
 127         my @parms = @_;
 128         for (@parms) { tr/a-z/A-Z/ }
 129         return @parms;
 130     }
 131
 132 Notice how this (unprototyped) function doesn't care whether it was passed
 133 real scalars or arrays.  Perl will see everything as one big long flat @_
 134 parameter list.  This is one of the ways where Perl's simple
 135 argument-passing style shines.  The upcase() function would work perfectly
 136 well without changing the upcase() definition even if we fed it things
 137 like this:
 138
 139     @newlist   = upcase(@list1, @list2);
 140     @newlist   = upcase( split /:/, $var );
 141
 142 Do not, however, be tempted to do this:
 143
 144     (@a, @b)   = upcase(@list1, @list2);
 145
 146 Because like its flat incoming parameter list, the return list is also
  147 flat.  So all you have managed to do here is store everything in @a and
 148 made @b an empty list.  See L</"Pass by Reference"> for alternatives.
 149
 150 A subroutine may be called using the "&" prefix.  The "&" is optional in
 151 Perl 5, and so are the parens if the subroutine has been predeclared.
 152 (Note, however, that the "&" is I<NOT> optional when you're just naming
 153 the subroutine, such as when it's used as an argument to defined() or
 154 undef().  Nor is it optional when you want to do an indirect subroutine
 155 call with a subroutine name or reference using the C<&$subref()> or
 156 C<&{$subref}()> constructs.  See L<perlref> for more on that.)
 157
 158 Subroutines may be called recursively.  If a subroutine is called using
 159 the "&" form, the argument list is optional, and if omitted, no @_ array is
 160 set up for the subroutine: the @_ array at the time of the call is
  161 visible to the subroutine instead.  This is an efficiency mechanism that
 162 new users may wish to avoid.
 163
 164     &foo(1,2,3);        # pass three arguments
 165     foo(1,2,3);         # the same
 166
 167     foo();              # pass a null list
 168     &foo();             # the same
 169
  170     &foo;               # foo() gets current args, like foo(@_) !!
 171     foo;                # like foo() IFF sub foo pre-declared, else "foo"
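
The difference between C<&foo;> and C<&foo();> is easy to miss, so here is a minimal sketch of it.  The names show_args() and outer() are hypothetical and exist only for this illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: reports how many arguments it can see.
sub show_args { return scalar @_ }

sub outer {
    my $shared = &show_args;      # no argument list: sees outer's @_
    my $empty  = &show_args();    # explicit empty list: gets a fresh @_
    return ($shared, $empty);
}

my ($shared, $empty) = outer(1, 2, 3);
print "shared=$shared empty=$empty\n";    # shared=3 empty=0
```

Because the first call shares the caller's @_, any changes it made to @_ would also be visible in outer().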
 172
 173 =head2 Private Variables via my()
 174
 175 Synopsis:
 176
 177     my $foo;            # declare $foo lexically local
 178     my (@wid, %get);    # declare list of variables local
 179     my $foo = "flurp";  # declare $foo lexical, and init it
 180     my @oof = @bar;     # declare @oof lexical, and init it
 181
 182 A "my" declares the listed variables to be confined (lexically) to the
 183 enclosing block, subroutine, C<eval>, or C<do/require/use>'d file.  If
 184 more than one value is listed, the list must be placed in parens.  All
 185 listed elements must be legal lvalues.  Only alphanumeric identifiers may
 186 be lexically scoped--magical builtins like $/ must currently be localized with
 187 "local" instead.
 188
 189 Unlike dynamic variables created by the "local" statement, lexical
 190 variables declared with "my" are totally hidden from the outside world,
 191 including any called subroutines (even if it's the same subroutine called
 192 from itself or elsewhere--every call gets its own copy).
 193
 194 (An eval(), however, can see the lexical variables of the scope it is
 195 being evaluated in so long as the names aren't hidden by declarations within
 196 the eval() itself.  See L<perlref>.)
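
As a small sketch of that rule (the variable names are made up for the example), a string eval() can read a lexical from the enclosing scope unless it declares its own variable of the same name:

```perl
use strict;
use warnings;

my $secret = 42;

# The string is compiled in the current lexical scope, so it sees $secret.
my $visible = eval '$secret + 1';

# A "my" inside the eval hides the outer variable of the same name.
my $hidden = eval 'my $secret = 1; $secret + 1';

print "$visible $hidden\n";    # 43 2
```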
 197
 198 The parameter list to my() may be assigned to if desired, which allows you
 199 to initialize your variables.  (If no initializer is given for a
 200 particular variable, it is created with the undefined value.)  Commonly
 201 this is used to name the parameters to a subroutine.  Examples:
 202
 203     $arg = "fred";        # "global" variable
 204     $n = cube_root(27);
 205     print "$arg thinks the root is $n\n";
 206  fred thinks the root is 3
 207
 208     sub cube_root {
 209         my $arg = shift;  # name doesn't matter
 210         $arg **= 1/3;
 211         return $arg;
 212     }
 213
 214 The "my" is simply a modifier on something you might assign to.  So when
 215 you do assign to the variables in its argument list, the "my" doesn't
  216 change whether those variables are viewed as a scalar or an array.  So
 217
 218     my ($foo) = <STDIN>;
 219     my @FOO = <STDIN>;
 220
 221 both supply a list context to the righthand side, while
 222
 223     my $foo = <STDIN>;
 224
 225 supplies a scalar context.  But the following only declares one variable:
 226
 227     my $foo, $bar = 1;
 228
 229 That has the same effect as
 230
 231     my $foo;
 232     $bar = 1;
 233
 234 The declared variable is not introduced (is not visible) until after
 235 the current statement.  Thus,
 236
 237     my $x = $x;
 238
 239 can be used to initialize the new $x with the value of the old $x, and
 240 the expression
 241
 242     my $x = 123 and $x == 123
 243
 244 is false unless the old $x happened to have the value 123.
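
The following sketch, with illustrative variable names only, shows the same rule using a nested block: the $x on the right-hand side of the inner declaration is still the outer one.

```perl
use strict;
use warnings;

my $x = 10;
my ($inner, $outer);
{
    my $x = $x + 1;    # the $x on the right is still the outer $x
    $inner = $x;
}
$outer = $x;           # the outer $x was never modified

print "$inner $outer\n";    # 11 10
```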
 245
 246 Some users may wish to encourage the use of lexically scoped variables.
 247 As an aid to catching implicit references to package variables,
 248 if you say
 249
 250     use strict 'vars';
 251
 252 then any variable reference from there to the end of the enclosing
 253 block must either refer to a lexical variable, or must be fully
 254 qualified with the package name.  A compilation error results
 255 otherwise.  An inner block may countermand this with S<"no strict 'vars'">.
 256
 257 A my() has both a compile-time and a run-time effect.  At compile time,
  258 the compiler takes notice of it; the principal usefulness of this is to
  259 quiet C<use strict 'vars'>.  The actual initialization doesn't happen
  260 until run time, so it gets executed every time through a loop.
 261
 262 Variables declared with "my" are not part of any package and are therefore
 263 never fully qualified with the package name.  In particular, you're not
 264 allowed to try to make a package variable (or other global) lexical:
 265
 266     my $pack::var;      # ERROR!  Illegal syntax
 267     my $_;              # also illegal (currently)
 268
  269 In fact, dynamic variables (also known as package or global variables)
  270 are still accessible using the fully qualified :: notation even while a
  271 lexical of the same name is also visible:
 272
 273     package main;
 274     local $x = 10;
 275     my    $x = 20;
 276     print "$x and $::x\n";
 277
 278 That will print out 20 and 10.
 279
  280 You may declare "my" variables at the outermost scope of a file to
  281 totally hide any such identifiers from the outside world.  This is similar
  282 to C's static variables at the file level.  To do this with a subroutine
 283 requires the use of a closure (anonymous function).  If a block (such as
 284 an eval(), function, or C<package>) wants to create a private subroutine
 285 that cannot be called from outside that block, it can declare a lexical
 286 variable containing an anonymous sub reference:
 287
 288     my $secret_version = '1.001-beta';
 289     my $secret_sub = sub { print $secret_version };
 290     &$secret_sub();
 291
 292 As long as the reference is never returned by any function within the
 293 module, no outside module can see the subroutine, since its name is not in
 294 any package's symbol table.  Remember that it's not I<REALLY> called
 295 $some_pack::secret_version or anything; it's just $secret_version,
 296 unqualified and unqualifiable.
 297
 298 This does not work with object methods, however; all object methods have
 299 to be in the symbol table of some package to be found.
 300
 301 Just because the lexical variable is lexically (also called statically)
 302 scoped doesn't mean that within a function it works like a C static.  It
 303 normally works more like a C auto.  But here's a mechanism for giving a
 304 function private variables with both lexical scoping and a static
 305 lifetime.  If you do want to create something like C's static variables,
 306 just enclose the whole function in an extra block, and put the
 307 static variable outside the function but in the block.
 308
 309     {
 310         my $secret_val = 0;
 311         sub gimme_another {
 312             return ++$secret_val;
 313         }
 314     }
 315     # $secret_val now becomes unreachable by the outside
 316     # world, but retains its value between calls to gimme_another
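
The same technique lets several functions share one private variable.  This is only a sketch; deposit() and balance() are hypothetical names, and the calls come after the block so that the initialization has already run:

```perl
use strict;
use warnings;

{
    my $balance = 0;    # visible only inside this block

    sub deposit { $balance += shift; return $balance }
    sub balance { return $balance }
}

deposit(5);
deposit(7);
print balance(), "\n";    # 12
```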
 317
 318 If this function is being sourced in from a separate file
 319 via C<require> or C<use>, then this is probably just fine.  If it's
 320 all in the main program, you'll need to arrange for the my()
 321 to be executed early, either by putting the whole block above
  322 your main program, or more likely, merely placing a BEGIN
 323 sub around it to make sure it gets executed before your program
 324 starts to run:
 325
 326     sub BEGIN {
 327         my $secret_val = 0;
 328         sub gimme_another {
 329             return ++$secret_val;
 330         }
 331     }
 332
  333 See L<perlmod> about the BEGIN function.
 334
 335 =head2 Temporary Values via local()
 336
 337 B<NOTE>: In general, you should be using "my" instead of "local", because
  338 it's faster and safer.  Exceptions to this include the global punctuation
 339 variables, filehandles and formats, and direct manipulation of the Perl
 340 symbol table itself.  Format variables often use "local" though, as do
 341 other variables whose current value must be visible to called
 342 subroutines.
 343
 344 Synopsis:
 345
 346     local $foo;                 # declare $foo dynamically local
 347     local (@wid, %get);         # declare list of variables local
 348     local $foo = "flurp";       # declare $foo dynamic, and init it
 349     local @oof = @bar;          # declare @oof dynamic, and init it
 350
 351     local *FH;                  # localize $FH, @FH, %FH, &FH  ...
 352     local *merlyn = *randal;    # now $merlyn is really $randal, plus
 353                                 #     @merlyn is really @randal, etc
 354     local *merlyn = 'randal';   # SAME THING: promote 'randal' to *randal
 355     local *merlyn = \$randal;   # just alias $merlyn, not @merlyn etc
 356
  357 A local() modifies its listed variables to be local to the enclosing
  358 block (or subroutine, C<eval{}>, or C<do>) and to I<any subroutine called
  359 from within that block>.  A local() just gives temporary values to global
 360 (meaning package) variables.  This is known as dynamic scoping.  Lexical
 361 scoping is done with "my", which works more like C's auto declarations.
 362
 363 If more than one variable is given to local(), they must be placed in
 364 parens.  All listed elements must be legal lvalues.  This operator works
 365 by saving the current values of those variables in its argument list on a
 366 hidden stack and restoring them upon exiting the block, subroutine or
 367 eval.  This means that called subroutines can also reference the local
 368 variable, but not the global one.  The argument list may be assigned to if
 369 desired, which allows you to initialize your local variables.  (If no
 370 initializer is given for a particular variable, it is created with an
 371 undefined value.)  Commonly this is used to name the parameters to a
 372 subroutine.  Examples:
 373
 374     for $i ( 0 .. 9 ) {
 375         $digits{$i} = $i;
 376     }
 377     # assume this function uses global %digits hash
 378     parse_num();
 379
 380     # now temporarily add to %digits hash
 381     if ($base12) {
 382         # (NOTE: not claiming this is efficient!)
 383         local %digits  = (%digits, 't' => 10, 'e' => 11);
 384         parse_num();  # parse_num gets this new %digits!
 385     }
 386     # old %digits restored here
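
A smaller, self-contained sketch of the same dynamic scoping follows.  The names $level, report() and nested() are assumptions made for the example:

```perl
use strict;
use warnings;

our $level = 0;    # a package variable, so local() may be applied to it

sub report { return "level=$level" }

sub nested {
    local $level = $level + 1;    # temporary value, restored on return
    return report();              # report() sees the temporary value
}

my @seen = (report(), nested(), report());
print "@seen\n";    # level=0 level=1 level=0
```

Had $level been declared with "my" inside nested(), report() would not have seen the new value at all.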
 387
  388 Because local() is a run-time command, it gets executed every time
 389 through a loop.  In releases of Perl previous to 5.0, this used more stack
 390 storage each time until the loop was exited.  Perl now reclaims the space
 391 each time through, but it's still more efficient to declare your variables
 392 outside the loop.
 393
 394 A local is simply a modifier on an lvalue expression.  When you assign to
 395 a localized variable, the local doesn't change whether its list is viewed
 396 as a scalar or an array.  So
 397
 398     local($foo) = <STDIN>;
 399     local @FOO = <STDIN>;
 400
 401 both supply a list context to the righthand side, while
 402
 403     local $foo = <STDIN>;
 404
 405 supplies a scalar context.
 406
 407 =head2 Passing Symbol Table Entries (typeglobs)
 408
 409 [Note:  The mechanism described in this section was originally the only
 410 way to simulate pass-by-reference in older versions of Perl.  While it
 411 still works fine in modern versions, the new reference mechanism is
 412 generally easier to work with.  See below.]
 413
 414 Sometimes you don't want to pass the value of an array to a subroutine
 415 but rather the name of it, so that the subroutine can modify the global
 416 copy of it rather than working with a local copy.  In perl you can
 417 refer to all objects of a particular name by prefixing the name
 418 with a star: C<*foo>.  This is often known as a "type glob", since the
 419 star on the front can be thought of as a wildcard match for all the
 420 funny prefix characters on variables and subroutines and such.
 421
 422 When evaluated, the type glob produces a scalar value that represents
 423 all the objects of that name, including any filehandle, format or
 424 subroutine.  When assigned to, it causes the name mentioned to refer to
 425 whatever "*" value was assigned to it.  Example:
 426
 427     sub doubleary {
 428         local(*someary) = @_;
 429         foreach $elem (@someary) {
 430             $elem *= 2;
 431         }
 432     }
 433     doubleary(*foo);
 434     doubleary(*bar);
 435
 436 Note that scalars are already passed by reference, so you can modify
 437 scalar arguments without using this mechanism by referring explicitly
 438 to $_[0] etc.  You can modify all the elements of an array by passing
 439 all the elements as scalars, but you have to use the * mechanism (or
 440 the equivalent reference mechanism) to push, pop or change the size of
 441 an array.  It will certainly be faster to pass the typeglob (or reference).
 442
 443 Even if you don't want to modify an array, this mechanism is useful for
 444 passing multiple arrays in a single LIST, since normally the LIST
 445 mechanism will merge all the array values so that you can't extract out
 446 the individual arrays.  For more on typeglobs, see L<perldata/"Typeglobs">.
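
As a sketch of changing an array's size through a typeglob, the hypothetical function append_ten() below aliases @someary to the caller's array and pushes onto it.  The C<our> declaration is only there to satisfy C<use strict>:

```perl
use strict;
use warnings;

our (@someary, @foo);

sub append_ten {
    local *someary = shift;    # @someary is now an alias for the caller's array
    push @someary, 10;         # grows the caller's array, not a copy
}

@foo = (1, 2, 3);
append_ten(*foo);
print "@foo\n";    # 1 2 3 10
```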
 447
 448 =head2 Pass by Reference
 449
 450 If you want to pass more than one array or hash into a function--or
 451 return them from it--and have them maintain their integrity,
 452 then you're going to have to use an explicit pass-by-reference.
 453 Before you do that, you need to understand references; see L<perlref>.
 454
 455 Here are a few simple examples.  First, let's pass in several
  456 arrays to a function and have it pop all of them, returning a new
 457 list of all their former last elements:
 458
 459     @tailings = popmany ( \@a, \@b, \@c, \@d );
 460
 461     sub popmany {
 462         my $aref;
 463         my @retlist = ();
 464         foreach $aref ( @_ ) {
 465             push @retlist, pop @$aref;
 466         }
 467         return @retlist;
 468     }
 469
 470 Here's how you might write a function that returns a
 471 list of keys occurring in all the hashes passed to it:
 472
 473     @common = inter( \%foo, \%bar, \%joe );
 474     sub inter {
 475         my ($k, $href, %seen); # locals
 476         foreach $href (@_) {
  477             while ( defined($k = each %$href) ) {
 478                 $seen{$k}++;
 479             }
 480         }
 481         return grep { $seen{$_} == @_ } keys %seen;
 482     }
 483
 484 So far, we're just using the normal list return mechanism.
 485 What happens if you want to pass or return a hash?  Well,
 486 if you're only using one of them, or you don't mind them
 487 concatenating, then the normal calling convention is ok, although
 488 a little expensive.
 489
 490 Where people get into trouble is here:
 491
 492     (@a, @b) = func(@c, @d);
 493 or
 494     (%a, %b) = func(%c, %d);
 495
 496 That syntax simply won't work.  It just sets @a or %a and clears the @b or
  497 %b.  Plus the function didn't get passed two separate arrays or
 498 hashes: it got one long list in @_, as always.
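
Returning references avoids the problem on the way out, because each reference is a single scalar in the return list.  A minimal sketch, using a hypothetical even_odd() function:

```perl
use strict;
use warnings;

# Hypothetical helper: sorts its arguments into even and odd numbers.
sub even_odd {
    my (@even, @odd);
    for my $n (@_) {
        if ($n % 2) { push @odd, $n } else { push @even, $n }
    }
    return (\@even, \@odd);    # two scalars, so nothing is flattened
}

my ($even, $odd) = even_odd(1 .. 6);
print "even: @$even, odd: @$odd\n";    # even: 2 4 6, odd: 1 3 5
```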
 499
 500 If you can arrange for everyone to deal with this through references, it's
 501 cleaner code, although not so nice to look at.  Here's a function that
  502 takes two array references as arguments, returning the two references
  503 in order of how many elements their arrays have in them:
 504
 505     ($aref, $bref) = func(\@c, \@d);
 506     print "@$aref has more than @$bref\n";
 507     sub func {
 508         my ($cref, $dref) = @_;
 509         if (@$cref > @$dref) {
  510             return ($dref, $cref);
 511         } else {
 512             return ($cref, $cref);
 513         }
 514     }
 515
 516 It turns out that you can actually do this also:
 517
 518     (*a, *b) = func(\@c, \@d);
 519     print "@a has more than @b\n";
 520     sub func {
 521         local (*c, *d) = @_;
 522         if (@c > @d) {
 523             return (\@c, \@d);
 524         } else {
 525             return (\@d, \@c);
 526         }
 527     }
 528
 529 Here we're using the typeglobs to do symbol table aliasing.  It's
 530 a tad subtle, though, and also won't work if you're using my()
 531 variables, since only globals (well, and local()s) are in the symbol table.
 532
 533 If you're passing around filehandles, you could usually just use the bare
 534 typeglob, like *STDOUT, but typeglobs references would be better because
 535 they'll still work properly under C<use strict 'refs'>.  For example:
 536
 537     splutter(\*STDOUT);
 538     sub splutter {
 539         my $fh = shift;
 540         print $fh "her um well a hmmm\n";
 541     }
 542
 543     $rec = get_rec(\*STDIN);
 544     sub get_rec {
 545         my $fh = shift;
 546         return scalar <$fh>;
 547     }
 548
 549 If you're planning on generating new filehandles, you could do this:
 550
 551     sub openit {
  552         my $path = shift;
 553         local *FH;
 554         return open (FH, $path) ? \*FH : undef;
 555     }
 556
 557 Although that will actually produce a small memory leak.  See the bottom
 558 of L<perlfunc/open()> for a somewhat cleaner way using the FileHandle
 559 functions supplied with the POSIX package.
 560
 561 =head2 Prototypes
 562
 563 As of the 5.002 release of perl, if you declare
 564
 565     sub mypush (\@@)
 566
 567 then mypush() takes arguments exactly like push() does.  (This only works
 568 for function calls that are visible at compile time, not indirect function
 569 calls through a C<&$func> reference nor for method calls as described in
 570 L<perlobj>.)
 571
 572 Here are the prototypes for some other functions that parse almost exactly
 573 like the corresponding builtins.
 574
 575     Declared as                 Called as
 576
 577     sub mylink ($$)             mylink $old, $new
 578     sub myvec ($$$)             myvec $var, $offset, 1
 579     sub myindex ($$;$)          myindex &getstring, "substr"
 580     sub mysyswrite ($$$;$)      mysyswrite $buf, 0, length($buf) - $off, $off
 581     sub myreverse (@)           myreverse $a,$b,$c
 582     sub myjoin ($@)             myjoin ":",$a,$b,$c
 583     sub mypop (\@)              mypop @array
 584     sub mysplice (\@$$@)        mysplice @array,@array,0,@pushme
 585     sub mykeys (\%)             mykeys %{$hashref}
 586     sub myopen (*;$)            myopen HANDLE, $name
 587     sub mypipe (**)             mypipe READHANDLE, WRITEHANDLE
 588     sub mygrep (&@)             mygrep { /foo/ } $a,$b,$c
 589     sub myrand ($)              myrand 42
 590     sub mytime ()               mytime
 591
 592 Any backslashed prototype character must be passed something starting
 593 with that character.  Any unbackslashed @ or % eats all the rest of the
 594 arguments, and forces list context.  An argument represented by $
 595 forces scalar context.  An & requires an anonymous subroutine, and *
 596 does whatever it has to do to turn the argument into a reference to a
 597 symbol table entry.  A semicolon separates mandatory arguments from
 598 optional arguments.
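
To make the backslashed form concrete, here is one possible body for the mypush() declared at the start of this section.  It is a sketch rather than a definitive implementation, and it must be compiled before the call for the prototype to take effect:

```perl
use strict;
use warnings;

# The \@ makes Perl pass a reference to the first argument,
# which must be a real array at the call site.
sub mypush (\@@) {
    my $aref = shift;
    push @$aref, @_;
    return scalar @$aref;    # new length, like push()
}

my @stack = (1, 2);
my $count = mypush @stack, 3, 4;
print "$count: @stack\n";    # 4: 1 2 3 4
```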
 599
 600 Note that the last three are syntactically distinguished by the lexer.
 601 mygrep() is parsed as a true list operator, myrand() is parsed as a
 602 true unary operator with unary precedence the same as rand(), and
 603 mytime() is truly argumentless, just like time().  That is, if you
 604 say
 605
 606     mytime +2;
 607
 608 you'll get mytime() + 2, not mytime(2), which is how it would be parsed
 609 without the prototype.
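
A short sketch of that parsing difference, using two hypothetical subs: answer() has an empty prototype, while plain() has none and simply counts its arguments.

```perl
use strict;
use warnings;

sub answer () { 40 }                  # empty prototype: takes no arguments
sub plain     { return scalar @_ }    # no prototype: ordinary list operator

my $sum  = answer + 2;    # parsed as answer() + 2
my $args = plain +2;      # parsed as plain(+2), so one argument is passed

print "$sum $args\n";    # 42 1
```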
 610
 611 The interesting thing about & is that you can generate new syntax with it:
 612
 613     sub try (&$) {
 614         my($try,$catch) = @_;
 615         eval { &$try };
 616         if ($@) {
 617             local $_ = $@;
 618             &$catch;
 619         }
 620     }
 621     sub catch (&) { @_ }
 622
 623     try {
 624         die "phooey";
 625     } catch {
 626         /phooey/ and print "unphooey\n";
 627     };
 628
 629 That prints "unphooey".  (Yes, there are still unresolved
 630 issues having to do with the visibility of @_.  I'm ignoring that
 631 question for the moment.  (But note that if we make @_ lexically
 632 scoped, those anonymous subroutines can act like closures... (Gee,
 633 is this sounding a little Lispish?  (Nevermind.))))
 634
 635 And here's a reimplementation of grep:
 636
 637     sub mygrep (&@) {
 638         my $code = shift;
 639         my @result;
 640         foreach $_ (@_) {
  641             push(@result, $_) if &$code;
 642         }
 643         @result;
 644     }
 645
 646 Some folks would prefer full alphanumeric prototypes.  Alphanumerics have
 647 been intentionally left out of prototypes for the express purpose of
 648 someday in the future adding named, formal parameters.  The current
 649 mechanism's main goal is to let module writers provide better diagnostics
 650 for module users.  Larry feels the notation quite understandable to Perl
 651 programmers, and that it will not intrude greatly upon the meat of the
 652 module, nor make it harder to read.  The line noise is visually
 653 encapsulated into a small pill that's easy to swallow.
 654
 655 It's probably best to prototype new functions, not retrofit prototyping
 656 into older ones.  That's because you must be especially careful about
 657 silent impositions of differing list versus scalar contexts.  For example,
 658 if you decide that a function should take just one parameter, like this:
 659
 660     sub func ($) {
 661         my $n = shift;
 662         print "you gave me $n\n";
 663     }
 664
 665 and someone has been calling it with an array or expression
 666 returning a list:
 667
 668     func(@foo);
 669     func( split /:/ );
 670
 671 Then you've just supplied an automatic scalar() in front of their
 672 argument, which can be more than a bit surprising.  The old @foo
 673 which used to hold one thing doesn't get passed in.  Instead,
  674 the func() now gets passed in 1, that is, the number of elements
 675 in @foo.  And the split() gets called in a scalar context and
 676 starts scribbling on your @_ parameter list.
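
The effect is easy to reproduce.  In this sketch the two hypothetical functions have identical bodies and differ only in the prototype:

```perl
use strict;
use warnings;

sub unprototyped   { return $_[0] }
sub prototyped ($) { return $_[0] }

my @foo = ('a', 'b', 'c');

my $first = unprototyped(@foo);    # list context: receives 'a', 'b', 'c'
my $count = prototyped(@foo);      # scalar context: receives the count

print "$first $count\n";    # a 3
```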
 677
 678 This is all very powerful, of course, and should only be used in moderation
 679 to make the world a better place.
 680
 681 =head2 Overriding Builtin Functions
 682
 683 Many builtin functions may be overridden, though this should only be
 684 tried occasionally and for good reason.  Typically this might be
 685 done by a package attempting to emulate missing builtin functionality
 686 on a non-Unix system.
 687
 688 Overriding may only be done by importing the name from a
 689 module--ordinary predeclaration isn't good enough.  However, the
 690 C<subs> pragma (compiler directive) lets you, in effect, predeclare subs
 691 via the import syntax, and these names may then override the builtin ones:
 692
 693     use subs 'chdir', 'chroot', 'chmod', 'chown';
 694     chdir $somewhere;
 695     sub chdir { ... }
 696
 697 Library modules should not in general export builtin names like "open"
 698 or "chdir" as part of their default @EXPORT list, since these may
 699 sneak into someone else's namespace and change the semantics unexpectedly.
 700 Instead, if the module adds the name to the @EXPORT_OK list, then it's
 701 possible for a user to import the name explicitly, but not implicitly.
 702 That is, they could say
 703
 704     use Module 'open';
 705
 706 and it would import the open override, but if they said
 707
 708     use Module;
 709
 710 they would get the default imports without the overrides.
 711
 712 =head2 Autoloading
 713
 714 If you call a subroutine that is undefined, you would ordinarily get an
 715 immediate fatal error complaining that the subroutine doesn't exist.
 716 (Likewise for subroutines being used as methods, when the method
 717 doesn't exist in any of the base classes of the class package.) If,
 718 however, there is an C<AUTOLOAD> subroutine defined in the package or
 719 packages that were searched for the original subroutine, then that
 720 C<AUTOLOAD> subroutine is called with the arguments that would have been
 721 passed to the original subroutine.  The fully qualified name of the
 722 original subroutine magically appears in the $AUTOLOAD variable in the
 723 same package as the C<AUTOLOAD> routine.  The name is not passed as an
 724 ordinary argument because, er, well, just because, that's why...
 725
 726 Most C<AUTOLOAD> routines will load in a definition for the subroutine in
 727 question using eval, and then execute that subroutine using a special
 728 form of "goto" that erases the stack frame of the C<AUTOLOAD> routine
 729 without a trace.  (See the standard C<AutoLoader> module, for example.)
 730 But an C<AUTOLOAD> routine can also just emulate the routine and never
 731 define it.   For example, let's pretend that a function that wasn't defined
 732 should just call system() with those arguments.  All you'd do is this:
 733
 734     sub AUTOLOAD {
 735         my $program = $AUTOLOAD;
 736         $program =~ s/.*:://;
 737         system($program, @_);
 738     }
 739     date();
  740     who('am', 'i');
 741     ls('-l');
 742
  743 In fact, if you predeclare the functions you want to call that way, you don't
 744 even need the parentheses:
 745
 746     use subs qw(date who ls);
 747     date;
 748     who "am", "i";
  749     ls '-l';
 750
 751 A more complete example of this is the standard Shell module, which
 752 can treat undefined subroutine calls as calls to Unix programs.
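
For experimenting without running external programs, an C<AUTOLOAD> routine can simply report what it was asked to do.  This is a sketch under stated assumptions: the package name Greeter and the function hello() are invented, and the DESTROY check is a common precaution rather than something this document requires.

```perl
use strict;
use warnings;

package Greeter;    # hypothetical package name

our $AUTOLOAD;

sub AUTOLOAD {
    my $name = $AUTOLOAD;            # e.g. "Greeter::hello"
    $name =~ s/.*:://;               # strip the package qualifier
    return if $name eq 'DESTROY';    # don't emulate object destruction
    return "called $name(" . join(', ', @_) . ")";
}

package main;

my $result = Greeter::hello('big', 'world');
print "$result\n";    # called hello(big, world)
```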
 753
  754 Mechanisms are available for module writers to help split the modules
  755 up into autoloadable files.  See the standard AutoLoader module described
  756 in L<AutoLoader>, the standard SelfLoader module in L<SelfLoader>, and
 757 the document on adding C functions to perl code in L<perlxs>.
 758
 759 =head1 SEE ALSO
 760
 761 See L<perlref> for more on references.  See L<perlxs> if you'd
 762 like to learn about calling C subroutines from perl.  See
 763 L<perlmod> to learn about bundling up your functions in
 764 separate files.