=head1 NAME

perlmod - Perl modules (packages and symbol tables)

=head1 DESCRIPTION

=head2 Packages

Perl provides a mechanism for alternative namespaces to protect
packages from stomping on each other's variables.  In fact, there's
really no such thing as a global variable in Perl.  The package
statement declares the compilation unit as being in the given
namespace.  The scope of the package declaration is from the
declaration itself through the end of the enclosing block, C<eval>,
or file, whichever comes first (the same scope as the my() and
local() operators).  Unqualified dynamic identifiers will be in
this namespace, except for those few identifiers that, if unqualified,
default to the main package instead of the current one, as described
below.  A package statement affects only dynamic variables--including
those you've used local() on--but I<not> lexical variables created
with my().  Typically it would be the first declaration in a file
included by the C<do>, C<require>, or C<use> operators.  You can
switch into a package in more than one place; it merely influences
which symbol table is used by the compiler for the rest of that
block.  You can refer to variables and filehandles in other packages
by prefixing the identifier with the package name and a double
colon: C<$Package::Variable>.  If the package name is null, the
C<main> package is assumed.  That is, C<$::sail> is equivalent to
C<$main::sail>.
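
A minimal sketch of these scoping rules follows.  The package name
C<Boat> and the array C<@seen> are invented for illustration only.

    use strict;
    use warnings;

    my @seen;                       # lexical: unaffected by package statements
    our $sail = "main";             # declares $main::sail

    {
        package Boat;               # lasts until the closing brace
        our $sail = "boat";         # declares $Boat::sail
        push @seen, $sail;          # unqualified name: $Boat::sail
        push @seen, $::sail;        # empty package name: $main::sail
        push @seen, $main::sail;    # same variable, spelled out
    }

    push @seen, $sail;              # back in package main: $main::sail
    push @seen, $Boat::sail;        # still reachable by its qualified name
    print "@seen\n";                # prints "boat main main main boat"

Note that the lexical C<@seen> stays visible inside the C<Boat> block,
because C<package> changes only where unqualified dynamic names are
looked up.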

The old package delimiter was a single quote, but double colon is now the
preferred delimiter, in part because it's more readable to humans, and
in part because it's more readable to B<emacs> macros.  It also makes C++
programmers feel like they know what's going on--as opposed to using the
single quote as separator, which was there to make Ada programmers feel
like they knew what's going on.  Because the old-fashioned syntax is still
supported for backwards compatibility, if you try to use a string like
C<"This is $owner's house">, you'll be accessing C<$owner::s>; that is,
the $s variable in package C<owner>, which is probably not what you meant.
Use braces to disambiguate, as in C<"This is ${owner}'s house">.

Packages may themselves contain package separators, as in
C<$OUTER::INNER::var>.  This implies nothing about the order of
name lookups, however.  There are no relative packages: all symbols
are either local to the current package, or must be fully qualified
from the outer package name down.  For instance, there is nowhere
within package C<OUTER> that C<$INNER::var> refers to
C<$OUTER::INNER::var>.  It would treat package C<INNER> as a totally
separate global package.
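
The absence of relative packages can be sketched as follows, assuming
nothing beyond the package names used above (the values and the array
C<@got> are made up):

    use strict;
    use warnings;

    $OUTER::INNER::var = "nested";      # lives in package OUTER::INNER
    $INNER::var        = "top-level";   # lives in the unrelated package INNER

    package OUTER;

    # Even inside OUTER, $INNER::var does not mean $OUTER::INNER::var.
    my @got = ($INNER::var, $OUTER::INNER::var);
    print "@got\n";                     # prints "top-level nested"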

Only identifiers starting with letters (or underscore) are stored
in a package's symbol table.  All other symbols are kept in package
C<main>, including all punctuation variables, like $_.  In addition,
when unqualified, the identifiers STDIN, STDOUT, STDERR, ARGV,
ARGVOUT, ENV, INC, and SIG are forced to be in package C<main>,
even when used for other purposes than their built-in one.  If you
have a package called C<m>, C<s>, or C<y>, then you can't use the
qualified form of an identifier because it would be instead interpreted
as a pattern match, a substitution, or a transliteration.

Variables beginning with underscore used to be forced into package
main, but we decided it was more useful for package writers to be able
to use leading underscore to indicate private variables and method names.
$_ is still global though.  See also
L<perlvar/"Technical Note on the Syntax of Variable Names">.

C<eval>ed strings are compiled in the package in which the eval() was
compiled.  (Assignments to C<$SIG{}>, however, assume the signal
handler specified is in the C<main> package.  Qualify the signal handler
name if you wish to have a signal handler in a package.)  For an
example, examine F<perldb.pl> in the Perl library.  It initially switches
to the C<DB> package so that the debugger doesn't interfere with variables
in the program you are trying to debug.  At various points, however, it
temporarily switches back to the C<main> package to evaluate various
expressions in the context of the C<main> package (or wherever you came
from).  See L<perldebug>.

The special symbol C<__PACKAGE__> contains the current package, but cannot
(easily) be used to construct variables.

See L<perlsub> for other scoping issues related to my() and local(),
and L<perlref> regarding closures.

=head2 Symbol Tables

The symbol table for a package happens to be stored in the hash of that
name with two colons appended.  The main symbol table's name is thus
C<%main::>, or C<%::> for short.  Likewise the symbol table for the nested
package mentioned earlier is named C<%OUTER::INNER::>.

The value in each entry of the hash is what you are referring to when you
use the C<*name> typeglob notation.  In fact, the following have the same
effect, though the first is more efficient because it does the symbol
table lookups at compile time:

    local *main::foo    = *main::bar;
    local $main::{foo}  = $main::{bar};

(Be sure to note the B<vast> difference between the second line above
and C<local $main::foo = $main::bar>. The former is accessing the hash
C<%main::>, which is the symbol table of package C<main>. The latter is
simply assigning scalar C<$bar> in package C<main> to scalar C<$foo> of
the same package.)

You can use this to print out all the variables in a package, for
instance.  The standard but antiquated F<dumpvar.pl> library and
the CPAN module Devel::Symdump make use of this.
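
As a rough sketch of the idea (not how F<dumpvar.pl> or Devel::Symdump
are actually implemented), the names in a package can be listed by
walking the keys of its symbol table hash.  The package C<Demo> and its
contents are hypothetical.

    use strict;
    use warnings;

    package Demo;                   # hypothetical package to inspect
    our $scalar = 42;
    our @array  = (1, 2, 3);
    sub routine { return "hello" }

    package main;

    # The keys of %Demo:: are the names in Demo's symbol table.
    my @names = sort keys %Demo::;
    print "$_\n" for @names;

The list should include at least C<array>, C<routine>, and C<scalar>;
depending on the version of Perl, other entries may appear as well.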

Assignment to a typeglob performs an aliasing operation, i.e.,

    *dick = *richard;

causes variables, subroutines, formats, and file and directory handles
accessible via the identifier C<richard> also to be accessible via the
identifier C<dick>.  If you want to alias only a particular variable or
subroutine, assign a reference instead:

    *dick = \$richard;

This makes $richard and $dick the same variable, but leaves
@richard and @dick as separate arrays.  Tricky, eh?

This mechanism may be used to pass and return cheap references
into or from subroutines if you don't want to copy the whole
thing.  It only works when assigning to dynamic variables, not
lexicals.

    %some_hash = ();			# can't be my()
    *some_hash = fn( \%another_hash );
    sub fn {
	local *hashsym = shift;
	# now use %hashsym normally, and you
	# will affect the caller's %another_hash
	my %nhash = (); # do what you want
	return \%nhash;
    }

On return, the reference will overwrite the hash slot in the
symbol table specified by the *some_hash typeglob.  This
is a somewhat tricky way of passing around references cheaply
when you don't want to have to remember to dereference variables
explicitly.

Another use of symbol tables is for making "constant" scalars.

    *PI = \3.14159265358979;

Now you cannot alter C<$PI>, which is probably a good thing all in all.
This isn't the same as a constant subroutine, which is subject to
optimization at compile-time.  A constant subroutine is one prototyped
to take no arguments and to return a constant expression.  See 
L<perlsub> for details on these.  The C<use constant> pragma is a
convenient shorthand for these.
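
The following sketch puts the three approaches side by side.  The names
C<E> and C<TAU> are chosen here purely for illustration.

    use strict;
    use warnings;

    our $PI;
    *PI = \3.14159265358979;        # $PI now aliases a read-only literal

    my $changed = eval { $PI = 3; 1 };      # the assignment dies
    print "cannot modify \$PI: $@" unless $changed;

    sub E () { 2.718281828459045 }  # constant subroutine: empty prototype
    use constant TAU => 6.283185307179586;  # the same idea via the pragma

    printf "%.2f %.2f %.2f\n", $PI, E, TAU; # prints "3.14 2.72 6.28"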

You can say C<*foo{PACKAGE}> and C<*foo{NAME}> to find out what name and
package the *foo symbol table entry comes from.  This may be useful
in a subroutine that gets passed typeglobs as arguments:

    sub identify_typeglob {
        my $glob = shift;
        print 'You gave me ', *{$glob}{PACKAGE}, '::', *{$glob}{NAME}, "\n";
    }
    identify_typeglob *foo;
    identify_typeglob *bar::baz;

This prints

    You gave me main::foo
    You gave me bar::baz

The C<*foo{THING}> notation can also be used to obtain references to the
individual elements of *foo.  See L<perlref>.
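
For instance, a small sketch (the variable names on the left are
arbitrary):

    use strict;
    use warnings;

    our $foo = "a scalar";
    our @foo = (1, 2, 3);
    sub foo { return "a subroutine" }

    my $scalar_ref = *foo{SCALAR};      # same as \$foo
    my $array_ref  = *foo{ARRAY};       # same as \@foo
    my $code_ref   = *foo{CODE};        # same as \&foo

    print $$scalar_ref, "\n";           # prints "a scalar"
    print scalar(@$array_ref), "\n";    # prints "3"
    print $code_ref->(), "\n";          # prints "a subroutine"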

Subroutine definitions (and declarations, for that matter) need
not necessarily be situated in the package whose symbol table they
occupy.  You can define a subroutine outside its package by
explicitly qualifying the name of the subroutine:

    package main;
    sub Some_package::foo { ... }   # &foo defined in Some_package

This is just a shorthand for a typeglob assignment at compile time:

    BEGIN { *Some_package::foo = sub { ... } }

and is I<not> the same as writing:

    {
	package Some_package;
	sub foo { ... }
    }

In the first two versions, the body of the subroutine is
lexically in the main package, I<not> in Some_package. So
something like this:

    package main;

    $Some_package::name = "fred";
    $main::name = "barney";

    sub Some_package::foo {
	print "in ", __PACKAGE__, ": \$name is '$name'\n";
    }

    Some_package::foo();

prints:

    in main: $name is 'barney'

rather than:

    in Some_package: $name is 'fred'

This also has implications for the use of the SUPER:: qualifier
(see L<perlobj>).

=head2 Package Constructors and Destructors

Four special subroutines act as package constructors and destructors.
These are the C<BEGIN>, C<CHECK>, C<INIT>, and C<END> routines.  The
C<sub> is optional for these routines.

A C<BEGIN> subroutine is executed as soon as possible, that is, the moment
it is completely defined, even before the rest of the containing file
is parsed.  You may have multiple C<BEGIN> blocks within a file--they
will execute in order of definition.  Because a C<BEGIN> block executes
immediately, it can pull in definitions of subroutines and such from other
files in time to be visible to the rest of the file.  Once a C<BEGIN>
has run, it is immediately undefined and any code it used is returned to
Perl's memory pool.  This means you can't ever explicitly call a C<BEGIN>.

An C<END> subroutine is executed as late as possible, that is, after
perl has finished running the program and just before the interpreter
exits, even if it is exiting as a result of a die() function.
(But not if it's polymorphing into another program via C<exec>, or
being blown out of the water by a signal--you have to trap that yourself
(if you can).)  You may have multiple C<END> blocks within a file--they
will execute in reverse order of definition; that is: last in, first
out (LIFO).  C<END> blocks are not executed when you run perl with the
C<-c> switch, or if compilation fails.

Inside an C<END> subroutine, C<$?> contains the value that the program is
going to pass to C<exit()>.  You can modify C<$?> to change the exit
value of the program.  Beware of changing C<$?> by accident (e.g. by
running something via C<system>).

Similar to C<BEGIN> blocks, C<INIT> blocks are run just before the
Perl runtime begins execution, in "first in, first out" (FIFO) order.
For example, the code generators documented in L<perlcc> make use of
C<INIT> blocks to initialize and resolve pointers to XSUBs.

Similar to C<END> blocks, C<CHECK> blocks are run just after the
Perl compile phase ends and before the run time begins, in
LIFO order.  C<CHECK> blocks are again useful in the Perl compiler
suite to save the compiled state of the program.

When you use the B<-n> and B<-p> switches to Perl, C<BEGIN> and
C<END> work just as they do in B<awk>, as a degenerate case.
Both C<BEGIN> and C<CHECK> blocks are run when you use the B<-c>
switch for a compile-only syntax check, although your main code
is not.
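
The relative order of the four kinds of block can be seen in a small
sketch like the one below.  It assumes the file is run directly as the
main program; the array C<@order> is invented for the demonstration.

    use strict;
    use warnings;

    our @order;                         # filled in as each block runs

    push @order, "main";                # ordinary run-time code
    END   { print "END runs last\n" }
    INIT  { push @order, "INIT" }
    CHECK { push @order, "CHECK" }
    BEGIN { push @order, "BEGIN" }

    print join(", ", @order), "\n";     # prints "BEGIN, CHECK, INIT, main"

The blocks are deliberately written out of order to show that their
position in the file does not decide when each kind runs.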

=head2 Perl Classes

There is no special class syntax in Perl, but a package may act
as a class if it provides subroutines to act as methods.  Such a
package may also derive some of its methods from another class (package)
by listing the other package name(s) in its global @ISA array (which 
must be a package global, not a lexical).

For more on this, see L<perltoot> and L<perlobj>.
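
A minimal sketch of a package acting as a class, with a second package
inheriting from it through C<@ISA>, might look like this.  The class
names C<Animal> and C<Dog> and their methods are hypothetical.

    use strict;
    use warnings;

    package Animal;
    sub new   { my $class = shift; return bless {}, $class }
    sub sound { return "some sound" }
    sub speak {
        my $self = shift;
        return ref($self) . " says " . $self->sound;
    }

    package Dog;
    our @ISA = ('Animal');              # package global, not my @ISA
    sub sound { return "Woof" }         # overrides Animal::sound

    package main;
    my $line = Dog->new->speak;         # new and speak are found via @ISA
    print "$line\n";                    # prints "Dog says Woof"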

=head2 Perl Modules

A module is just a set of related functions in a library file, i.e.,
a Perl package with the same name as the file.  It is specifically 
designed to be reusable by other modules or programs.  It may do this
by providing a mechanism for exporting some of its symbols into the
symbol table of any package using it.  Or it may function as a class
definition and make its semantics available implicitly through
method calls on the class and its objects, without explicitly
exporting anything.  Or it can do a little of both.

For example, to start a traditional, non-OO module called Some::Module,
create a file called F<Some/Module.pm> and start with this template:

    package Some::Module;  # assumes Some/Module.pm

    use strict;
    use warnings;

    BEGIN {
        use Exporter   ();
        our ($VERSION, @ISA, @EXPORT, @EXPORT_OK, %EXPORT_TAGS);

        # set the version for version checking
        $VERSION     = 1.00;
        # if using RCS/CVS, this may be preferred
        $VERSION = do { my @r = (q$Revision: 2.21 $ =~ /\d+/g); sprintf "%d."."%02d" x $#r, @r }; # must be all one line, for MakeMaker

        @ISA         = qw(Exporter);
        @EXPORT      = qw(&func1 &func2 &func4);
        %EXPORT_TAGS = ( );     # eg: TAG => [ qw!name1 name2! ],

        # your exported package globals go here,
        # as well as any optionally exported functions
        @EXPORT_OK   = qw($Var1 %Hashit &func3);
    }
    our @EXPORT_OK;

    # exported package globals go here
    our $Var1;
    our %Hashit;

    # non-exported package globals go here
    our @more;
    our $stuff;

    # initialize package globals, first exported ones
    $Var1   = '';
    %Hashit = ();

    # then the others (which are still accessible as $Some::Module::stuff)
    $stuff  = '';
    @more   = ();

    # all file-scoped lexicals must be created before
    # the functions below that use them.

    # file-private lexicals go here
    my $priv_var    = '';
    my %secret_hash = ();

    # here's a file-private function as a closure,
    # callable as &$priv_func;  it cannot be prototyped.
    my $priv_func = sub {
        # stuff goes here.
    };

    # make all your functions, whether exported or not;
    # remember to put something interesting in the {} stubs
    sub func1      {}    # no prototype
    sub func2()    {}    # proto'd void
    sub func3($$)  {}    # proto'd to 2 scalars

    # this one is exported by default (it is listed in @EXPORT above)
    sub func4(\%)  {}    # proto'd to 1 hash ref

    END { }       # module clean-up code here (global destructor)

    ## YOUR CODE GOES HERE

    1;  # don't forget to return a true value from the file

Then go on to declare and use your variables in functions without
any qualifications.  See L<Exporter> and L<perlmodlib> for
details on mechanics and style issues in module creation.

Perl modules are included into your program by saying

    use Module;

or

    use Module LIST;

This is exactly equivalent to

    BEGIN { require Module; import Module; }

or

    BEGIN { require Module; import Module LIST; }

As a special case

    use Module ();

is exactly equivalent to

    BEGIN { require Module; }

All Perl module files have the extension F<.pm>.  The C<use> operator
assumes this so you don't have to spell out "F<Module.pm>" in quotes.
This also helps to differentiate new modules from old F<.pl> and
F<.ph> files.  Module names are also capitalized unless they're
functioning as pragmas; pragmas are in effect compiler directives,
and are sometimes called "pragmatic modules" (or even "pragmata"
if you're a classicist).

The two statements:

    require SomeModule;
    require "SomeModule.pm";		

differ from each other in two ways.  In the first case, any double
colons in the module name, such as C<Some::Module>, are translated
into your system's directory separator, usually "/".   The second
case does not, and would have to be specified literally.  The other
difference is that seeing the first C<require> clues in the compiler
that uses of indirect object notation involving "SomeModule", as
in C<$ob = purge SomeModule>, are method calls, not function calls.
(Yes, this really can make a difference.)

Because the C<use> statement implies a C<BEGIN> block, the importing
of semantics happens as soon as the C<use> statement is compiled,
before the rest of the file is compiled.  This is how it is able
to function as a pragma mechanism, and also how modules are able to
declare subroutines that are then visible as list or unary operators for
the rest of the current file.  This will not work if you use C<require>
instead of C<use>.  With C<require> you can get into this problem:

    require Cwd;		# make Cwd:: accessible
    $here = Cwd::getcwd();

    use Cwd;			# import names from Cwd::
    $here = getcwd();

    require Cwd;	    	# make Cwd:: accessible
    $here = getcwd(); 		# oops! no main::getcwd()

In general, C<use Module ()> is recommended over C<require Module>,
because it determines module availability at compile time, not in the
middle of your program's execution.  An exception would be if two modules
each tried to C<use> each other, and each also called a function from
that other module.  In that case, it's easy to use C<require>s instead.
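
To make the compile-time behaviour concrete, here is a sketch of the
spelled-out form of C<use Cwd qw(getcwd)>.  It uses the method-call
syntax C<< Cwd->import(...) >> rather than the indirect object form
shown earlier; the variable names are arbitrary.

    use strict;
    use warnings;

    # Spelled-out form of "use Cwd qw(getcwd);"
    BEGIN {
        require Cwd;
        Cwd->import('getcwd');
    }

    my $here = getcwd();                # imported name, known at compile time
    my $same = Cwd::getcwd();           # qualified name needs no import
    print "imported and qualified calls agree\n" if $here eq $same;

Without the C<BEGIN> wrapper, the import would happen only at run
time, after the unqualified call to C<getcwd> had already been
compiled.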

Perl packages may be nested inside other package names, so we can have
package names containing C<::>.  But if we used that package name
directly as a filename it would make for unwieldy or impossible
filenames on some systems.  Therefore, if a module's name is, say,
C<Text::Soundex>, then its definition is actually found in the library
file F<Text/Soundex.pm>.
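
The translation from module name to file name can be sketched with a
small helper.  The function C<module_to_path> is hypothetical and is
not part of Perl; it merely mimics the rule described above.

    use strict;
    use warnings;

    # Hypothetical helper: turn a module name into the relative path
    # that require looks for.
    sub module_to_path {
        my $module = shift;
        (my $path = $module) =~ s{::}{/}g;
        return "$path.pm";
    }

    print module_to_path("Text::Soundex"), "\n";   # prints "Text/Soundex.pm"

    require File::Spec;                 # a core module, used as an example
    my $file = $INC{ module_to_path("File::Spec") };
    print "File::Spec was loaded from $file\n";

The keys of C<%INC> use this same slash-separated form, which is why
the lookup in the last lines finds the file that was loaded.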

Perl modules always have a F<.pm> file, but there may also be
dynamically linked executables (often ending in F<.so>) or autoloaded
subroutine definitions (often ending in F<.al>) associated with the
module.  If so, these will be entirely transparent to the user of
the module.  It is the responsibility of the F<.pm> file to load
(or arrange to autoload) any additional functionality.  For example,
although the POSIX module happens to do both dynamic loading and
autoloading, the user can say just C<use POSIX> to get it all.

=head1 SEE ALSO

See L<perlmodlib> for general style issues related to building Perl
modules and classes, as well as descriptions of the standard library
and CPAN, L<Exporter> for how Perl's standard import/export mechanism
works, L<perltoot> and L<perltootc> for an in-depth tutorial on
creating classes, L<perlobj> for a hard-core reference document on
objects, L<perlsub> for an explanation of functions and scoping,
and L<perlxstut> and L<perlguts> for more information on writing
extension modules.
Commit	Line	Data
a0d0e21e	1	=head1 NAME
a0d0e21e	2
f102b883	3	perlmod - Perl modules (packages and symbol tables)
a0d0e21e	4
	5	=head1 DESCRIPTION
	6
	7	=head2 Packages
	8
19799a22	9	Perl provides a mechanism for alternative namespaces to protect
19799a22	10	packages from stomping on each other's variables. In fact, there's
bc8df162	11	really no such thing as a global variable in Perl. The package
19799a22	12	statement declares the compilation unit as being in the given
	13	namespace. The scope of the package declaration is from the
	14	declaration itself through the end of the enclosing block, C<eval>,
	15	or file, whichever comes first (the same scope as the my() and
	16	local() operators). Unqualified dynamic identifiers will be in
	17	this namespace, except for those few identifiers that if unqualified,
	18	default to the main package instead of the current one as described
	19	below. A package statement affects only dynamic variables--including
	20	those you've used local() on--but I<not> lexical variables created
	21	with my(). Typically it would be the first declaration in a file
	22	included by the C<do>, C<require>, or C<use> operators. You can
	23	switch into a package in more than one place; it merely influences
	24	which symbol table is used by the compiler for the rest of that
	25	block. You can refer to variables and filehandles in other packages
	26	by prefixing the identifier with the package name and a double
	27	colon: C<$Package::Variable>. If the package name is null, the
	28	C<main> package is assumed. That is, C<$::sail> is equivalent to
	29	C<$main::sail>.
a0d0e21e	30
d3ebb66b	31	The old package delimiter was a single quote, but double colon is now the
	32	preferred delimiter, in part because it's more readable to humans, and
	33	in part because it's more readable to B<emacs> macros. It also makes C++
	34	programmers feel like they know what's going on--as opposed to using the
	35	single quote as separator, which was there to make Ada programmers feel
	36	like they knew what's going on. Because the old-fashioned syntax is still
	37	supported for backwards compatibility, if you try to use a string like
	38	C<"This is $owner's house">, you'll be accessing C<$owner::s>; that is,
	39	the $s variable in package C<owner>, which is probably not what you meant.
	40	Use braces to disambiguate, as in C<"This is ${owner}'s house">.
a0d0e21e	41
19799a22	42	Packages may themselves contain package separators, as in
	43	C<$OUTER::INNER::var>. This implies nothing about the order of
	44	name lookups, however. There are no relative packages: all symbols
a0d0e21e	45	are either local to the current package, or must be fully qualified
a0d0e21e	46	from the outer package name down. For instance, there is nowhere
19799a22	47	within package C<OUTER> that C<$INNER::var> refers to
	48	C<$OUTER::INNER::var>. It would treat package C<INNER> as a totally
	49	separate global package.
	50
	51	Only identifiers starting with letters (or underscore) are stored
	52	in a package's symbol table. All other symbols are kept in package
	53	C<main>, including all punctuation variables, like $_. In addition,
	54	when unqualified, the identifiers STDIN, STDOUT, STDERR, ARGV,
	55	ARGVOUT, ENV, INC, and SIG are forced to be in package C<main>,
	56	even when used for other purposes than their built-in one. If you
	57	have a package called C<m>, C<s>, or C<y>, then you can't use the
	58	qualified form of an identifier because it would be instead interpreted
	59	as a pattern match, a substitution, or a transliteration.
	60
	61	Variables beginning with underscore used to be forced into package
a0d0e21e	62	main, but we decided it was more useful for package writers to be able
cb1a09d0	63	to use leading underscore to indicate private variables and method names.
cea6626f	64	$_ is still global though. See also
cea6626f	65	L<perlvar/"Technical Note on the Syntax of Variable Names">.
a0d0e21e	66
19799a22	67	C<eval>ed strings are compiled in the package in which the eval() was
a0d0e21e	68	compiled. (Assignments to C<$SIG{}>, however, assume the signal
748a9306	69	handler specified is in the C<main> package. Qualify the signal handler
a0d0e21e	70	name if you wish to have a signal handler in a package.) For an
	71	example, examine F<perldb.pl> in the Perl library. It initially switches
	72	to the C<DB> package so that the debugger doesn't interfere with variables
19799a22	73	in the program you are trying to debug. At various points, however, it
a0d0e21e	74	temporarily switches back to the C<main> package to evaluate various
	75	expressions in the context of the C<main> package (or wherever you came
	76	from). See L<perldebug>.
	77
f102b883	78	The special symbol C<__PACKAGE__> contains the current package, but cannot
	79	(easily) be used to construct variables.
	80
5f05dabc	81	See L<perlsub> for other scoping issues related to my() and local(),
f102b883	82	and L<perlref> regarding closures.
cb1a09d0	83
a0d0e21e	84	=head2 Symbol Tables
a0d0e21e	85
aa689395	86	The symbol table for a package happens to be stored in the hash of that
aa689395	87	name with two colons appended. The main symbol table's name is thus
5803be0d	88	C<%main::>, or C<%::> for short. Likewise the symbol table for the nested
aa689395	89	package mentioned earlier is named C<%OUTER::INNER::>.
	90
	91	The value in each entry of the hash is what you are referring to when you
	92	use the C<*name> typeglob notation. In fact, the following have the same
	93	effect, though the first is more efficient because it does the symbol
	94	table lookups at compile time:
a0d0e21e	95
f102b883	96	local main::foo = main::bar;
f102b883	97	local $main::{foo} = $main::{bar};
a0d0e21e	98
bc8df162	99	(Be sure to note the B<vast> difference between the second line above
	100	and C<local $main::foo = $main::bar>. The former is accessing the hash
	101	C<%main::>, which is the symbol table of package C<main>. The latter is
	102	simply assigning scalar C<$bar> in package C<main> to scalar C<$foo> of
	103	the same package.)
	104
a0d0e21e	105	You can use this to print out all the variables in a package, for
4375e838	106	instance. The standard but antiquated F<dumpvar.pl> library and
19799a22	107	the CPAN module Devel::Symdump make use of this.
a0d0e21e	108
cb1a09d0	109	Assignment to a typeglob performs an aliasing operation, i.e.,
a0d0e21e	110
	111	dick = richard;
	112
5a964f20	113	causes variables, subroutines, formats, and file and directory handles
	114	accessible via the identifier C<richard> also to be accessible via the
	115	identifier C<dick>. If you want to alias only a particular variable or
19799a22	116	subroutine, assign a reference instead:
a0d0e21e	117
	118	*dick = \$richard;
	119
5a964f20	120	Which makes $richard and $dick the same variable, but leaves
a0d0e21e	121	@richard and @dick as separate arrays. Tricky, eh?
a0d0e21e	122
cb1a09d0	123	This mechanism may be used to pass and return cheap references
5803be0d	124	into or from subroutines if you don't want to copy the whole
5a964f20	125	thing. It only works when assigning to dynamic variables, not
5a964f20	126	lexicals.
cb1a09d0	127
5a964f20	128	%some_hash = (); # can't be my()
cb1a09d0	129	*some_hash = fn( \%another_hash );
	130	sub fn {
	131	local *hashsym = shift;
	132	# now use %hashsym normally, and you
	133	# will affect the caller's %another_hash
	134	my %nhash = (); # do what you want
5f05dabc	135	return \%nhash;
cb1a09d0	136	}
cb1a09d0	137
5f05dabc	138	On return, the reference will overwrite the hash slot in the
cb1a09d0	139	symbol table specified by the *some_hash typeglob. This
c36e9b62	140	is a somewhat tricky way of passing around references cheaply
5803be0d	141	when you don't want to have to remember to dereference variables
cb1a09d0	142	explicitly.
cb1a09d0	143
19799a22	144	Another use of symbol tables is for making "constant" scalars.
cb1a09d0	145
	146	*PI = \3.14159265358979;
	147
bc8df162	148	Now you cannot alter C<$PI>, which is probably a good thing all in all.
5a964f20	149	This isn't the same as a constant subroutine, which is subject to
5803be0d	150	optimization at compile-time. A constant subroutine is one prototyped
	151	to take no arguments and to return a constant expression. See
	152	L<perlsub> for details on these. The C<use constant> pragma is a
5a964f20	153	convenient shorthand for these.
cb1a09d0	154
55497cff	155	You can say C<foo{PACKAGE}> and C<foo{NAME}> to find out what name and
55497cff	156	package the *foo symbol table entry comes from. This may be useful
5a964f20	157	in a subroutine that gets passed typeglobs as arguments:
55497cff	158
	159	sub identify_typeglob {
	160	my $glob = shift;
	161	print 'You gave me ', {$glob}{PACKAGE}, '::', {$glob}{NAME}, "\n";
	162	}
	163	identify_typeglob *foo;
	164	identify_typeglob *bar::baz;
	165
	166	This prints
	167
	168	You gave me main::foo
	169	You gave me bar::baz
	170
19799a22	171	The C<*foo{THING}> notation can also be used to obtain references to the
5803be0d	172	individual elements of *foo. See L<perlref>.
55497cff	173
9263d47b	174	Subroutine definitions (and declarations, for that matter) need
	175	not necessarily be situated in the package whose symbol table they
	176	occupy. You can define a subroutine outside its package by
	177	explicitly qualifying the name of the subroutine:
	178
	179	package main;
	180	sub Some_package::foo { ... } # &foo defined in Some_package
	181
	182	This is just a shorthand for a typeglob assignment at compile time:
	183
	184	BEGIN { *Some_package::foo = sub { ... } }
	185
	186	and is I<not> the same as writing:
	187
	188	{
	189	package Some_package;
	190	sub foo { ... }
	191	}
	192
	193	In the first two versions, the body of the subroutine is
	194	lexically in the main package, I<not> in Some_package. So
	195	something like this:
	196
	197	package main;
	198
	199	$Some_package::name = "fred";
	200	$main::name = "barney";
	201
	202	sub Some_package::foo {
	203	print "in ", __PACKAGE__, ": \$name is '$name'\n";
	204	}
	205
	206	Some_package::foo();
	207
	208	prints:
	209
	210	in main: $name is 'barney'
	211
	212	rather than:
	213
	214	in Some_package: $name is 'fred'
	215
	216	This also has implications for the use of the SUPER:: qualifier
	217	(see L<perlobj>).
	218
a0d0e21e	219	=head2 Package Constructors and Destructors
a0d0e21e	220
7d981616	221	Four special subroutines act as package constructors and destructors.
7d30b5c4	222	These are the C<BEGIN>, C<CHECK>, C<INIT>, and C<END> routines. The
7d981616	223	C<sub> is optional for these routines.
a0d0e21e	224
f102b883	225	A C<BEGIN> subroutine is executed as soon as possible, that is, the moment
	226	it is completely defined, even before the rest of the containing file
	227	is parsed. You may have multiple C<BEGIN> blocks within a file--they
	228	will execute in order of definition. Because a C<BEGIN> block executes
	229	immediately, it can pull in definitions of subroutines and such from other
	230	files in time to be visible to the rest of the file. Once a C<BEGIN>
	231	has run, it is immediately undefined and any code it used is returned to
	232	Perl's memory pool. This means you can't ever explicitly call a C<BEGIN>.
a0d0e21e	233
4f25aa18	234	An C<END> subroutine is executed as late as possible, that is, after
	235	perl has finished running the program and just before the interpreter
	236	is being exited, even if it is exiting as a result of a die() function.
	237	(But not if it's polymorphing into another program via C<exec>, or
	238	being blown out of the water by a signal--you have to trap that yourself
	239	(if you can).) You may have multiple C<END> blocks within a file--they
	240	will execute in reverse order of definition; that is: last in, first
	241	out (LIFO). C<END> blocks are not executed when you run perl with the
db517d64	242	C<-c> switch, or if compilation fails.
a0d0e21e	243
19799a22	244	Inside an C<END> subroutine, C<$?> contains the value that the program is
c36e9b62	245	going to pass to C<exit()>. You can modify C<$?> to change the exit
19799a22	246	value of the program. Beware of changing C<$?> by accident (e.g. by
c36e9b62	247	running something via C<system>).
c36e9b62	248
4f25aa18	249	Similar to C<BEGIN> blocks, C<INIT> blocks are run just before the
	250	Perl runtime begins execution, in "first in, first out" (FIFO) order.
	251	For example, the code generators documented in L<perlcc> make use of
	252	C<INIT> blocks to initialize and resolve pointers to XSUBs.
	253
7d30b5c4	254	Similar to C<END> blocks, C<CHECK> blocks are run just after the
4f25aa18	255	Perl compile phase ends and before the run time begins, in
7d30b5c4	256	LIFO order. C<CHECK> blocks are again useful in the Perl compiler
4f25aa18	257	suite to save the compiled state of the program.
4f25aa18	258
19799a22	259	When you use the B<-n> and B<-p> switches to Perl, C<BEGIN> and
4375e838	260	C<END> work just as they do in B<awk>, as a degenerate case.
	261	Both C<BEGIN> and C<CHECK> blocks are run when you use the B<-c>
	262	switch for a compile-only syntax check, although your main code
	263	is not.
a0d0e21e	264
	265	=head2 Perl Classes
	266
19799a22	267	There is no special class syntax in Perl, but a package may act
5a964f20	268	as a class if it provides subroutines to act as methods. Such a
5a964f20	269	package may also derive some of its methods from another class (package)
19799a22	270	by listing the other package name(s) in its global @ISA array (which
5a964f20	271	must be a package global, not a lexical).
4633a7c4	272
f102b883	273	For more on this, see L<perltoot> and L<perlobj>.
a0d0e21e	274
=head2 Perl Modules

A module is just a set of related functions in a library file, i.e.,
a Perl package with the same name as the file. It is specifically
designed to be reusable by other modules or programs. It may do this
by providing a mechanism for exporting some of its symbols into the
symbol table of any package using it. Or it may function as a class
definition and make its semantics available implicitly through
method calls on the class and its objects, without explicitly
exporting anything. Or it can do a little of both.

For example, to start a traditional, non-OO module called Some::Module,
create a file called F<Some/Module.pm> and start with this template:

    package Some::Module;  # assumes Some/Module.pm

    use strict;
    use warnings;

    BEGIN {
        use Exporter   ();
        our ($VERSION, @ISA, @EXPORT, @EXPORT_OK, %EXPORT_TAGS);

        # set the version for version checking
        $VERSION     = 1.00;
        # if using RCS/CVS, this may be preferred
        $VERSION = do { my @r = (q$Revision: 2.21 $ =~ /\d+/g); sprintf "%d."."%02d" x $#r, @r }; # must be all one line, for MakeMaker

        @ISA         = qw(Exporter);
        @EXPORT      = qw(&func1 &func2);
        %EXPORT_TAGS = ( );     # eg: TAG => [ qw!name1 name2! ],

        # your exported package globals go here,
        # as well as any optionally exported functions
        @EXPORT_OK   = qw($Var1 %Hashit &func3);
    }
    our @EXPORT_OK;

    # exported package globals go here
    our $Var1;
    our %Hashit;

    # non-exported package globals go here
    our @more;
    our $stuff;

    # initialize package globals, first exported ones
    $Var1   = '';
    %Hashit = ();

    # then the others (which are still accessible as $Some::Module::stuff)
    $stuff  = '';
    @more   = ();

    # all file-scoped lexicals must be created before
    # the functions below that use them.

    # file-private lexicals go here
    my $priv_var    = '';
    my %secret_hash = ();

    # here's a file-private function as a closure,
    # callable as &$priv_func; it cannot be prototyped.
    my $priv_func = sub {
        # stuff goes here.
    };

    # make all your functions, whether exported or not;
    # remember to put something interesting in the {} stubs
    sub func1      {}    # no prototype
    sub func2()    {}    # proto'd void
    sub func3($$)  {}    # proto'd to 2 scalars

    # this one isn't exported, but could be called!
    sub func4(\%)  {}    # proto'd to 1 hash ref

    END { }       # module clean-up code here (global destructor)

    ## YOUR CODE GOES HERE

    1;  # don't forget to return a true value from the file

Then go on to declare and use your variables in functions without
any qualifications. See L<Exporter> and L<perlmodlib> for
details on mechanics and style issues in module creation.

Perl modules are included into your program by saying

    use Module;

or

    use Module LIST;

This is exactly equivalent to

    BEGIN { require Module; import Module; }

or

    BEGIN { require Module; import Module LIST; }

As a special case

    use Module ();

is exactly equivalent to

    BEGIN { require Module; }

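The forms above can be tried out in a single file with a sketch like
the following. The module C<My::Greeter> and its functions C<greet> and
C<shout> are hypothetical; the module is defined inline in a C<BEGIN>
block and registered in C<%INC> by hand, purely so that C<require> finds
it already loaded and the example needs no separate F<.pm> file.

```perl
use strict;
use warnings;

# Hypothetical inline module. The BEGIN block makes sure its setup has
# run before the "use" statements below are compiled.
BEGIN {
    package My::Greeter;
    use Exporter ();
    our @ISA       = qw(Exporter);
    our @EXPORT_OK = qw(greet shout);

    sub greet { return "Hello, $_[0]" }
    sub shout { return uc "$_[0]!" }

    # Mark the module as loaded so that require() does not search @INC.
    $INC{'My/Greeter.pm'} = __FILE__;
}

use My::Greeter qw(greet);    # require, then import only greet
use My::Greeter ();           # require only; import is not called

my $greeting = greet('world');               # imported name
my $loud     = My::Greeter::shout('hi');     # not imported: fully qualified

print "$greeting\n";
print "$loud\n";
```
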
All Perl module files have the extension F<.pm>. The C<use> operator
assumes this so you don't have to spell out "F<Module.pm>" in quotes.
This also helps to differentiate new modules from old F<.pl> and
F<.ph> files. Module names are also capitalized unless they're
functioning as pragmas; pragmas are in effect compiler directives,
and are sometimes called "pragmatic modules" (or even "pragmata"
if you're a classicist).

The two statements:

    require SomeModule;
    require "SomeModule.pm";

differ from each other in two ways. In the first case, any double
colons in the module name, such as C<Some::Module>, are translated
into your system's directory separator, usually "/". The second
case does not, and would have to be specified literally. The other
difference is that seeing the first C<require> clues in the compiler
that uses of indirect object notation involving "SomeModule", as
in C<$ob = purge SomeModule>, are method calls, not function calls.
(Yes, this really can make a difference.)

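The difference in name translation can be sketched as follows. The
modules used (C<File::Basename>, C<File::Spec> and C<Text::Wrap>) are
arbitrary choices for the example, and the hand-written substitution is
just one way to build a file name for a module whose name is only known
at run time.

```perl
use strict;
use warnings;

require File::Basename;     # bareword: "::" is translated for you
require "File/Spec.pm";     # string: the path must be spelled out

# With a name held in a variable, require sees a string, so the
# translation has to be done by hand.
my $module = 'Text::Wrap';
(my $file = "$module.pm") =~ s{::}{/}g;
require $file;

# Each loaded file is recorded in %INC under its path-style name.
for my $key ('File/Basename.pm', 'File/Spec.pm', $file) {
    print "$key loaded from $INC{$key}\n";
}
```
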
Because the C<use> statement implies a C<BEGIN> block, the importing
of semantics happens as soon as the C<use> statement is compiled,
before the rest of the file is compiled. This is how it is able
to function as a pragma mechanism, and also how modules are able to
declare subroutines that are then visible as list or unary operators for
the rest of the current file. This will not work if you use C<require>
instead of C<use>. With C<require> you can get into this problem:

    require Cwd;                # make Cwd:: accessible
    $here = Cwd::getcwd();

    use Cwd;                    # import names from Cwd::
    $here = getcwd();

    require Cwd;                # make Cwd:: accessible
    $here = getcwd();           # oops! no main::getcwd()

In general, C<use Module ()> is recommended over C<require Module>,
because it determines module availability at compile time, not in the
middle of your program's execution. An exception would be if two modules
each tried to C<use> each other, and each also called a function from
that other module. In that case, it's easy to use C<require>s instead.

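When C<require> is used anyway, the names can still be imported by
hand at run time. The following is a minimal sketch of that approach,
not a recommendation from this document; it relies on C<Cwd> providing
C<getcwd> through the standard C<import> mechanism.

```perl
use strict;
use warnings;

require Cwd;              # loaded at run time, nothing imported yet
Cwd->import('getcwd');    # the step "use Cwd" would have done at compile time

# The parentheses matter: getcwd was not known when this line was
# compiled, so it has to be written as an explicit function call.
my $here = getcwd();
print "$here\n";
```
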
Perl packages may be nested inside other package names, so we can have
package names containing C<::>. But if we used that package name
directly as a filename it would make for unwieldy or impossible
filenames on some systems. Therefore, if a module's name is, say,
C<Text::Soundex>, then its definition is actually found in the library
file F<Text/Soundex.pm>.

Perl modules always have a F<.pm> file, but there may also be
dynamically linked executables (often ending in F<.so>) or autoloaded
subroutine definitions (often ending in F<.al>) associated with the
module. If so, these will be entirely transparent to the user of
the module. It is the responsibility of the F<.pm> file to load
(or arrange to autoload) any additional functionality. For example,
although the POSIX module happens to do both dynamic loading and
autoloading, the user can say just C<use POSIX> to get it all.

=head1 SEE ALSO

See L<perlmodlib> for general style issues related to building Perl
modules and classes, as well as descriptions of the standard library
and CPAN; L<Exporter> for how Perl's standard import/export mechanism
works; L<perltoot> and L<perltootc> for an in-depth tutorial on
creating classes; L<perlobj> for a hard-core reference document on
objects; L<perlsub> for an explanation of functions and scoping;
and L<perlxstut> and L<perlguts> for more information on writing
extension modules.