Integrate mainline

diff --git a/pod/perlsub.pod b/pod/perlsub.pod

index 47f507f..ce3b120 100644
--- a/pod/perlsub.pod
+++ b/pod/perlsub.pod
@@ -39,7 +39,7 @@ To call subroutines:
 Like many languages, Perl provides for user-defined subroutines.
 These may be located anywhere in the main program, loaded in from
 other files via the C<do>, C<require>, or C<use> keywords, or
-generated on the fly using C<eval> or anonymous subroutines (closures).
+generated on the fly using C<eval> or anonymous subroutines.
 You can even call a function indirectly using a variable containing
 its name or a CODE reference.
 
@@ -154,7 +154,7 @@ of changing them in place:
     }
 
 Notice how this (unprototyped) function doesn't care whether it was
-passed real scalars or arrays.  Perl sees all arugments as one big,
+passed real scalars or arrays.  Perl sees all arguments as one big,
 long, flat parameter list in C<@_>.  This is one area where
 Perl's simple argument-passing style shines.  The C<upcase()>
 function would work perfectly well without changing the C<upcase()>
@@ -169,8 +169,8 @@ Do not, however, be tempted to do this:
 
 Like the flattened incoming parameter list, the return list is also
 flattened on return.  So all you have managed to do here is stored
-everything in C<@a> and made C<@b> an empty list.  See L<Pass by
-Reference> for alternatives.
+everything in C<@a> and made C<@b> empty.  See 
+L<Pass by Reference> for alternatives.
 
 A subroutine may be called using an explicit C<&> prefix.  The
 C<&> is optional in modern Perl, as are parentheses if the
@@ -179,7 +179,7 @@ when just naming the subroutine, such as when it's used as
 an argument to defined() or undef().  Nor is it optional when you
 want to do an indirect subroutine call with a subroutine name or
 reference using the C<&$subref()> or C<&{$subref}()> constructs,
-although the C<$subref-E<gt>()> notation solves that problem.
+although the C<< $subref->() >> notation solves that problem.
 See L<perlref> for more about all that.
 
 Subroutines may be called recursively.  If a subroutine is called
@@ -207,9 +207,8 @@ core, as are modules whose names are in all lower case.  A
 function in all capitals is a loosely-held convention meaning it
 will be called indirectly by the run-time system itself, usually
 due to a triggered event.  Functions that do special, pre-defined
-things include C<BEGIN>, C<END>, C<AUTOLOAD>, and C<DESTROY>--plus
-all functions mentioned in L<perltie>.  The 5.005 release adds
-C<INIT> to this list.
+things include C<BEGIN>, C<CHECK>, C<INIT>, C<END>, C<AUTOLOAD>,
+C<CLONE> and C<DESTROY>--plus all functions mentioned in L<perltie>.
 
 =head2 Private Variables via my()
 
@@ -221,9 +220,9 @@ Synopsis:
     my @oof = @bar;    # declare @oof lexical, and init it
     my $x : Foo = $y;  # similar, with an attribute applied
 
-B<WARNING>: The use of attribute lists on C<my> declarations is
-experimental.  This feature should not be relied upon.  It may
-change or disappear in future releases of Perl.  See L<attributes>.
+B<WARNING>: The use of attribute lists on C<my> declarations is still
+evolving.  The current semantics and interface are subject to change.
+See L<attributes> and L<Attribute::Handlers>.
 
 The C<my> operator declares the listed variables to be lexically
 confined to the enclosing block, conditional (C<if/unless/elsif/else>),
@@ -328,9 +327,12 @@ the scope of $answer extends from its declaration through the rest
 of that conditional, including any C<elsif> and C<else> clauses, 
 but not beyond it.
 
-None of the foregoing text applies to C<if/unless> or C<while/until>
-modifiers appended to simple statements.  Such modifiers are not
-control structures and have no effect on scoping.
+B<NOTE:> The behaviour of a C<my> statement modified with a statement
+modifier conditional or loop construct (e.g. C<my $x if ...>) is
+B<undefined>.  The value of the C<my> variable may be C<undef>, any
+previously assigned value, or possibly anything else.  Don't rely on
+it.  Future versions of perl might do something different from the
+version of perl you try it out on.  Here be dragons.
 
 The C<foreach> loop defaults to scoping its index variable dynamically
 in the manner of C<local>.  However, if the index variable is
@@ -353,12 +355,12 @@ which are always global, if you say
 
 then any variable mentioned from there to the end of the enclosing
 block must either refer to a lexical variable, be predeclared via
-C<use vars>, or else must be fully qualified with the package name.
+C<our> or C<use vars>, or else must be fully qualified with the package name.
 A compilation error results otherwise.  An inner block may countermand
 this with C<no strict 'vars'>.
 
 A C<my> has both a compile-time and a run-time effect.  At compile
-time, the compiler takes notice of it.  The principle usefulness
+time, the compiler takes notice of it.  The principal usefulness
 of this is to quiet C<use strict 'vars'>, but it is also essential
 for generation of closures as detailed in L<perlref>.  Actual
 initialization is delayed until run time, though, so it gets executed
@@ -455,7 +457,7 @@ starts to run:
     }
 
 See L<perlmod/"Package Constructors and Destructors"> about the
-special triggered functions, C<BEGIN> and C<INIT>.
+special triggered functions, C<BEGIN>, C<CHECK>, C<INIT> and C<END>.
 
 If declared at the outermost scope (the file scope), then lexicals
 work somewhat like C's file statics.  They are available to all
@@ -561,6 +563,13 @@ hash to some other implementation:
     }
     [..%ahash back to its initial tied self again..]
 
+B<WARNING>: The code example above does not currently work as described.
+This will be fixed in a future release of Perl; in the meantime, avoid
+code that relies on any particular behaviour of localising tied arrays
+or hashes (localising individual elements is still okay).
+See L<perldelta/"Localising Tied Arrays and Hashes Is Broken"> for more
+details.
+
 As another example, a custom implementation of C<%ENV> might look
 like this:
 
@@ -611,6 +620,75 @@ Perl will print
 The behavior of local() on non-existent members of composite
 types is subject to change in future.
 
+=head2 Lvalue subroutines
+
+B<WARNING>: Lvalue subroutines are still experimental and the
+implementation may change in future versions of Perl.
+
+It is possible to return a modifiable value from a subroutine.
+To do this, you have to declare the subroutine to return an lvalue.
+
+    my $val;
+    sub canmod : lvalue {
+       # return $val; this doesn't work, don't say "return"
+       $val;
+    }
+    sub nomod {
+       $val;
+    }
+
+    canmod() = 5;   # assigns to $val
+    nomod()  = 5;   # ERROR
+
+The scalar/list context for the subroutine and for the right-hand
+side of assignment is determined as if the subroutine call is replaced
+by a scalar. For example, consider:
+
+    data(2,3) = get_data(3,4);
+
+Both subroutines here are called in a scalar context, while in:
+
+    (data(2,3)) = get_data(3,4);
+
+and in:
+
+    (data(2),data(3)) = get_data(3,4);
+
+all the subroutines are called in a list context.
+
+=over 4
+
+=item Lvalue subroutines are EXPERIMENTAL
+
+They appear to be convenient, but there are several reasons to be
+circumspect.
+
+You can't use the C<return> keyword; you must pass out the value before
+falling out of subroutine scope (see the comment in the example above).  This
+is usually not a problem, but it disallows an explicit return out of a
+deeply nested loop, which is sometimes a nice way out.
+
+They violate encapsulation.  A normal mutator can check the supplied
+argument before setting the attribute it is protecting; an lvalue
+subroutine never gets that chance.  Consider:
+
+    my $some_array_ref = [];   # protected by mutators ??
+
+    sub set_arr {              # normal mutator
+       my $val = shift;
+       die("expected array, you supplied ", ref $val)
+          unless ref $val eq 'ARRAY';
+       $some_array_ref = $val;
+    }
+    sub set_arr_lv : lvalue {  # lvalue mutator
+       $some_array_ref;
+    }
+
+    # set_arr_lv cannot stop this !
+    set_arr_lv() = { a => 1 };
+
+=back
+
 =head2 Passing Symbol Table Entries (typeglobs)
 
 B<WARNING>: The mechanism described in this section was originally
@@ -659,9 +737,11 @@ Despite the existence of C<my>, there are still three places where the
 C<local> operator still shines.  In fact, in these three places, you
 I<must> use C<local> instead of C<my>.
 
-=over
+=over 4
 
-=item 1. You need to give a global variable a temporary value, especially $_.
+=item 1.
+
+You need to give a global variable a temporary value, especially $_.
 
 The global variables, like C<@ARGV> or the punctuation variables, must be 
 C<local>ized with C<local()>.  This block reads in F</etc/motd>, and splits
@@ -678,7 +758,9 @@ in C<@Fields>.
 It particular, it's important to C<local>ize $_ in any routine that assigns
 to it.  Look out for implicit assignments in C<while> conditionals.
 
-=item 2. You need to create a local file or directory handle or a local function.
+=item 2.
+
+You need to create a local file or directory handle or a local function.
 
 A function that needs a filehandle of its own must use
 C<local()> on a complete typeglob.   This can be used to create new symbol
@@ -686,7 +768,7 @@ table entries:
 
     sub ioqueue {
         local  (*READER, *WRITER);    # not my!
-        pipe    (READER,  WRITER);    or die "pipe: $!";
+        pipe    (READER,  WRITER)     or die "pipe: $!";
         return (*READER, *WRITER);
     }
     ($head, $tail) = ioqueue();
@@ -708,7 +790,9 @@ a local alias.
 See L<perlref/"Function Templates"> for more about manipulating
 functions by name in this way.
 
-=item 3. You want to temporarily change just one element of an array or hash.
+=item 3.
+
+You want to temporarily change just one element of an array or hash.
 
 You can C<local>ize just one element of an aggregate.  Usually this
 is done on dynamics:
@@ -853,7 +937,7 @@ like a built-in function.  If you call it like an old-fashioned
 subroutine, then it behaves like an old-fashioned subroutine.  It
 naturally falls out from this rule that prototypes have no influence
 on subroutine references like C<\&foo> or on indirect subroutine
-calls like C<&{$subref}> or C<$subref-E<gt>()>.
+calls like C<&{$subref}> or C<< $subref->() >>.
 
 Method calls are not influenced by prototypes either, because the
 function to be called is indeterminate at compile time, since
@@ -886,15 +970,41 @@ that absolutely must start with that character.  The value passed
 as part of C<@_> will be a reference to the actual argument given
 in the subroutine call, obtained by applying C<\> to that argument.
 
+You can also backslash several argument types simultaneously by using
+the C<\[]> notation:
+
+    sub myref (\[$@%&*])
+
+will allow calling myref() as
+
+    myref $var
+    myref @array
+    myref %hash
+    myref &sub
+    myref *glob
+
+and the first argument of myref() will be a reference to
+a scalar, an array, a hash, a subroutine, or a glob.
+
 Unbackslashed prototype characters have special meanings.  Any
 unbackslashed C<@> or C<%> eats all remaining arguments, and forces
 list context.  An argument represented by C<$> forces scalar context.  An
 C<&> requires an anonymous subroutine, which, if passed as the first
-argument, does not require the C<sub> keyword or a subsequent comma.  A
-C<*> allows the subroutine to accept a bareword, constant, scalar expression,
+argument, does not require the C<sub> keyword or a subsequent comma.
+
+A C<*> allows the subroutine to accept a bareword, constant, scalar expression,
 typeglob, or a reference to a typeglob in that slot.  The value will be
 available to the subroutine either as a simple scalar, or (in the latter
-two cases) as a reference to the typeglob.
+two cases) as a reference to the typeglob.  If you wish to always convert
+such arguments to a typeglob reference, use Symbol::qualify_to_ref() as
+follows:
+
+    use Symbol 'qualify_to_ref';
+
+    sub foo (*) {
+       my $fh = qualify_to_ref(shift, caller);
+       ...
+    }
 
 A semicolon separates mandatory arguments from optional arguments.
 It is redundant before C<@> or C<%>, which gobble up everything else.
@@ -955,6 +1065,13 @@ programmers, and that it will not intrude greatly upon the meat of the
 module, nor make it harder to read.  The line noise is visually
 encapsulated into a small pill that's easy to swallow.
 
+If you try to use an alphanumeric sequence in a prototype you will
+generate an optional warning: "Illegal character in prototype...".
+Unfortunately, earlier versions of Perl allowed the prototype to be
+used as long as its prefix was a valid prototype.  The warning may be
+upgraded to a fatal error in a future version of Perl once the
+majority of offending code is fixed.
+
 It's probably best to prototype new functions, not retrofit prototyping
 into older ones.  That's because you must be especially careful about
 silent impositions of differing list versus scalar contexts.  For example,
@@ -1128,6 +1245,33 @@ a properly written override.  For a fully functional example of overriding
 C<glob>, study the implementation of C<File::DosGlob> in the standard
 library.
 
+When you override a built-in, your replacement should be consistent (if
+possible) with the built-in native syntax.  You can achieve this by using
+a suitable prototype.  To get the prototype of an overridable built-in,
+use the C<prototype> function with an argument of C<"CORE::builtin_name">
+(see L<perlfunc/prototype>).
+
+Note however that some built-ins can't have their syntax expressed by a
+prototype (such as C<system> or C<chomp>).  If you override them you won't
+be able to fully mimic their original syntax.
+
+The built-ins C<do>, C<require> and C<glob> can also be overridden, but due
+to special magic, their original syntax is preserved, and you don't have
+to define a prototype for their replacements.  (You can't override the
+C<do BLOCK> syntax, though.)
+
+C<require> has special additional dark magic: if you invoke your
+C<require> replacement as C<require Foo::Bar>, it will actually receive
+the argument C<"Foo/Bar.pm"> in @_.  See L<perlfunc/require>.
+
+And, as you'll have noticed from the previous example, if you override
+C<glob>, the C<< <*> >> glob operator is overridden as well.
+
+In a similar fashion, overriding the C<readline> function also overrides
+the equivalent I/O operator C<< <FILEHANDLE> >>.
+
+Finally, some built-ins (e.g. C<exists> or C<grep>) can't be overridden.
+
 =head2 Autoloading
 
 If you call a subroutine that is undefined, you would ordinarily
@@ -1182,7 +1326,7 @@ functions to Perl code in L<perlxs>.
 
 A subroutine declaration or definition may have a list of attributes
 associated with it.  If such an attribute list is present, it is
-broken up at space or comma boundaries and treated as though a
+broken up at space or colon boundaries and treated as though a
 C<use attributes> had been seen.  See L<attributes> for details
 about what attributes are currently supported.
 Unlike the limitation with the obsolescent C<use attrs>, the
@@ -1196,8 +1340,8 @@ nest properly.
 
 Examples of valid syntax (even though the attributes are unknown):
 
-    sub fnord (&\%) : switch(10,foo(7,3)) , ,  expensive ;
-    sub plugh () : Ugly('\(") , Bad ;
+    sub fnord (&\%) : switch(10,foo(7,3))  :  expensive ;
+    sub plugh () : Ugly('\(") :Bad ;
     sub xyzzy : _5x5 { ... }
 
 Examples of invalid syntax:
@@ -1206,7 +1350,7 @@ Examples of invalid syntax:
     sub snoid : Ugly('(') ;      # ()-string not balanced
     sub xyzzy : 5x5 ;            # "5x5" not a valid identifier
     sub plugh : Y2::north ;      # "Y2::north" not a simple identifier
-    sub snurt : foo + bar ;      # "+" not a comma or space
+    sub snurt : foo + bar ;      # "+" not a colon or space
 
 The attribute list is passed as a list of constant strings to the code
 which associates them with the subroutine.  In particular, the second example
@@ -1216,13 +1360,13 @@ parsed and invoked:
     use attributes __PACKAGE__, \&plugh, q[Ugly('\(")], 'Bad';
 
 For further details on attribute lists and their manipulation,
-see L<attributes>.
+see L<attributes> and L<Attribute::Handlers>.
 
 =head1 SEE ALSO
 
 See L<perlref/"Function Templates"> for more about references and closures.
 See L<perlxs> if you'd like to learn about calling C subroutines from Perl.  
-See L<perlembed> if you'd like to learn about calling PErl subroutines from C.  
+See L<perlembed> if you'd like to learn about calling Perl subroutines from C.  
 See L<perlmod> to learn about bundling up your functions in separate files.
 See L<perlmodlib> to learn what library modules come standard on your system.
 See L<perltoot> to learn how to make object method calls.
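
The encapsulation caveat in the new "Lvalue subroutines" section can be tried out with a short script. What follows is a minimal sketch, not part of the patch: `set_arr` and `set_arr_lv` are the names used in the added POD, while the `eval` wrapper, the `$ok` flag and the `print` calls are assumptions added here only to make the difference visible when the script runs.

```perl
use strict;
use warnings;

my $some_array_ref = [];    # the value the mutators are meant to protect

# Ordinary mutator: it sees the new value and can reject it.
sub set_arr {
    my $val = shift;
    die "expected array, you supplied ", ref($val), "\n"
        unless ref($val) eq 'ARRAY';
    $some_array_ref = $val;
}

# Lvalue mutator: the caller assigns straight to $some_array_ref,
# so no check can run.  No "return" is used; the last expression
# evaluated is the lvalue.
sub set_arr_lv : lvalue {
    $some_array_ref;
}

# The checked mutator refuses a hash reference (illustrative wrapper).
my $ok = eval { set_arr({ a => 1 }); 1 };
print "set_arr rejected the hash ref\n" unless $ok;

# The lvalue mutator accepts the same value without complaint.
set_arr_lv() = { a => 1 };
print "now holds a ", ref($some_array_ref), " reference\n";
```

The first call dies inside `set_arr` and leaves `$some_array_ref` untouched. The assignment through `set_arr_lv` then replaces the array reference with a hash reference, which is the hazard the section warns about.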