=head1 NAME

perlmod - Perl modules (packages and symbol tables)

=head1 DESCRIPTION

=head2 Packages

Perl provides a mechanism for alternative namespaces to protect packages
from stomping on each other's variables. In fact, there's really no such
thing as a global variable in Perl (although some identifiers default
to the main package instead of the current one). The package statement
declares the compilation unit as being in the given namespace. The scope
of the package declaration is from the declaration itself through the end
of the enclosing block, C<eval>, C<sub>, or end of file, whichever comes
first (the same scope as the my() and local() operators). All further
unqualified dynamic identifiers will be in this namespace. A package
statement only affects dynamic variables--including those you've used
local() on--but I<not> lexical variables created with my(). Typically it
would be the first declaration in a file to be included by the C<require>
or C<use> operator. You can switch into a package in more than one place;
it merely influences which symbol table is used by the compiler for the
rest of that block. You can refer to variables and filehandles in other
packages by prefixing the identifier with the package name and a double
colon: C<$Package::Variable>. If the package name is null, the C<main>
package is assumed. That is, C<$::sail> is equivalent to C<$main::sail>.
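
As a minimal sketch of the scoping rule above, the following switches
into a package inside a block and then refers to both variables by their
qualified names. The package name C<Boat> and the array C<@seen> are
made up for illustration.

```perl
use strict;
use warnings;

our $sail = 'main sail';          # this is $main::sail

{
    package Boat;                 # in effect until the closing brace
    our $sail = 'boat sail';      # this is $Boat::sail
}

# Back in package main, the unqualified name means $main::sail again.
my @seen = ($sail, $::sail, $main::sail, $Boat::sail);
print join(', ', @seen), "\n";
```

The first three entries name the same variable, which is why C<$::sail>
can be used as shorthand for C<$main::sail>.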

(The old package delimiter was a single quote, but double colon
is now the preferred delimiter, in part because it's more readable
to humans, and in part because it's more readable to B<emacs> macros.
It also makes C++ programmers feel like they know what's going on.)

Packages may be nested inside other packages: C<$OUTER::INNER::var>. This
implies nothing about the order of name lookups, however. All symbols
are either local to the current package, or must be fully qualified
from the outer package name down. For instance, there is nowhere
within package C<OUTER> that C<$INNER::var> refers to C<$OUTER::INNER::var>.
It would treat package C<INNER> as a totally separate global package.
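
A short sketch of that lookup rule, using the package names from the
paragraph above (the array C<@seen> is an illustrative name):

```perl
use strict;
use warnings;

$OUTER::INNER::var = 'nested';
$INNER::var        = 'top-level';

package OUTER;

# Inside OUTER, $INNER::var still means the top-level package INNER;
# the nested variable must be spelled out in full.
my @seen = ($INNER::var, $OUTER::INNER::var);
print "@seen\n";
```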

Only identifiers starting with letters (or underscore) are stored in a
package's symbol table. All other symbols are kept in package C<main>,
including all of the punctuation variables like $_. In addition, when
unqualified, the identifiers STDIN, STDOUT, STDERR, ARGV, ARGVOUT, ENV,
INC, and SIG are forced to be in package C<main>, even when used for
purposes other than their built-in ones. Note also that, if you have a
package called C<m>, C<s>, or C<y>, then you can't use the qualified form
of an identifier because it will be interpreted instead as a pattern
match, a substitution, or a transliteration.

(Variables beginning with underscore used to be forced into package
main, but we decided it was more useful for package writers to be able
to use leading underscore to indicate private variables and method names.
$_ is still global though.)

Eval()ed strings are compiled in the package in which the eval() was
compiled. (Assignments to C<$SIG{}>, however, assume the signal
handler specified is in the C<main> package. Qualify the signal handler
name if you wish to have a signal handler in a package.) For an
example, examine F<perldb.pl> in the Perl library. It initially switches
to the C<DB> package so that the debugger doesn't interfere with variables
in the script you are trying to debug. At various points, however, it
temporarily switches back to the C<main> package to evaluate various
expressions in the context of the C<main> package (or wherever you came
from). See L<perldebug>.

The special symbol C<__PACKAGE__> contains the current package name, but
cannot (easily) be used to construct variables.
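
The following sketch shows one way around that limitation, assuming you
are willing to use a symbolic reference. The package C<Counter> and its
variable C<$count> are hypothetical.

```perl
use strict;
use warnings;

package Counter;

our $count = 3;

# __PACKAGE__ yields the current package name as a string.
my $name = __PACKAGE__;

# It does not interpolate in strings, so build the full variable
# name by concatenation and dereference it symbolically.
my $full  = __PACKAGE__ . '::count';
my $value = do {
    no strict 'refs';         # symbolic references are otherwise fatal
    ${$full};
};

print "$name has count $value\n";
```

Limiting C<no strict 'refs'> to the smallest possible block keeps the rest
of the file protected against accidental symbolic references.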

See L<perlsub> for other scoping issues related to my() and local(),
and L<perlref> regarding closures.

=head2 Symbol Tables

The symbol table for a package happens to be stored in the hash of that
name with two colons appended. The main symbol table's name is thus
C<%main::>, or C<%::> for short. Likewise, the symbol table for the nested
package mentioned earlier is named C<%OUTER::INNER::>.

The value in each entry of the hash is what you are referring to when you
use the C<*name> typeglob notation. In fact, the following have the same
effect, though the first is more efficient because it does the symbol
table lookups at compile time:

    local *main::foo    = *main::bar;
    local $main::{foo}  = $main::{bar};

You can use this to print out all the variables in a package, for
instance. The standard F<dumpvar.pl> library and the CPAN module
Devel::Symdump make use of this.
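
As a rough illustration of walking a symbol table, the sketch below lists
the identifiers defined in a small package. The package C<Zoo> and its
contents are invented for the example, and the exact set of keys may
include other entries depending on what else the package has loaded.

```perl
use strict;
use warnings;

package Zoo;
our $keeper = 'Ann';
our @cages  = (1, 2, 3);
sub feed { return 'fed' }

package main;

# %Zoo:: is the symbol table of package Zoo; its keys are identifier
# names. Keys ending in "::" would be nested packages, so skip them.
my @names = sort grep { !/::$/ } keys %Zoo::;
print "Zoo defines: @names\n";

# The entry for "cages" is a typeglob; check that its array slot is filled.
my $glob      = $Zoo::{cages};
my $has_array = defined *{$glob}{ARRAY} ? 1 : 0;
```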

Assignment to a typeglob performs an aliasing operation, i.e.,

    *dick = *richard;

causes variables, subroutines, formats, and file and directory handles
accessible via the identifier C<richard> also to be accessible via the
identifier C<dick>. If you want to alias only a particular variable or
subroutine, you can assign a reference instead:

    *dick = \$richard;

This makes $richard and $dick the same variable, but leaves
@richard and @dick as separate arrays. Tricky, eh?
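
The difference can be seen in a small sketch that aliases only the scalar
slot and then modifies both the scalar and the array through the new name.
The values assigned are arbitrary.

```perl
use strict;
use warnings;

our ($richard, @richard, $dick, @dick);

$richard = 'lionheart';
@richard = (1, 2, 3);

# Alias only the scalar slot: $dick and $richard become one variable.
*dick = \$richard;

$dick = 'nixon';            # also changes $richard
push @dick, 'separate';     # @richard is untouched

print "$richard / @richard / @dick\n";
```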

This mechanism may be used to pass and return cheap references
into or from subroutines if you don't want to copy the whole
thing. It only works when assigning to dynamic variables, not
lexicals.

    %some_hash = ();                    # can't be my()
    *some_hash = fn( \%another_hash );
    sub fn {
        local *hashsym = shift;
        # now use %hashsym normally, and you
        # will affect the caller's %another_hash
        my %nhash = ();                 # do what you want
        return \%nhash;
    }

On return, the reference will overwrite the hash slot in the
symbol table specified by the *some_hash typeglob. This
is a somewhat tricky way of passing around references cheaply
when you don't want to have to remember to dereference variables
explicitly.

Another use of symbol tables is for making "constant" scalars.

    *PI = \3.14159265358979;

Now you cannot alter $PI, which is probably a good thing all in all.
This isn't the same as a constant subroutine, which is subject to
optimization at compile time; the aliased scalar is not. A constant
subroutine is one prototyped to take no arguments and to return a
constant expression. See L<perlsub> for details on these. The
C<use constant> pragma is a convenient shorthand for these.
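
A brief sketch contrasting the two kinds of constant. The constant name
C<E> and the variables C<$error>, C<$area>, and C<$sum> are illustrative
only.

```perl
use strict;
use warnings;
use constant E => 2.718281828459045;   # a constant subroutine

our $PI;
*PI = \3.14159265358979;               # $PI now aliases a read-only value

# Assigning to $PI fails at run time, so trap the error with eval.
my $ok    = eval { $PI = 3; 1 };
my $error = $ok ? '' : $@;

my $area = $PI * 2 ** 2;               # reading $PI works as usual
my $sum  = E + 1;                      # E is called without a sigil

print "assignment failed: $error" if $error;
```

Note that the scalar interpolates in strings like any other variable,
whereas the constant subroutine does not.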

You can say C<*foo{PACKAGE}> and C<*foo{NAME}> to find out what name and
package the *foo symbol table entry comes from. This may be useful
in a subroutine that gets passed typeglobs as arguments:

    sub identify_typeglob {
        my $glob = shift;
        print 'You gave me ', *{$glob}{PACKAGE}, '::', *{$glob}{NAME}, "\n";
    }
    identify_typeglob *foo;
    identify_typeglob *bar::baz;

This prints

    You gave me main::foo
    You gave me bar::baz

The *foo{THING} notation can also be used to obtain references to the
individual elements of *foo; see L<perlref>.
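
For instance, the following sketch pulls the scalar, array, and code
slots out of a single typeglob. The values stored under C<foo> are
arbitrary.

```perl
use strict;
use warnings;

our $foo = 'scalar value';
our @foo = (10, 20);
sub foo { return 'from sub' }

my $sref = *foo{SCALAR};    # same as \$foo
my $aref = *foo{ARRAY};     # same as \@foo
my $cref = *foo{CODE};      # same as \&foo

my $summary = join ' | ', $$sref, scalar(@$aref), $cref->(),
                          *foo{PACKAGE}, *foo{NAME};
print "$summary\n";
```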

=head2 Package Constructors and Destructors

There are two special subroutine definitions that function as package
constructors and destructors. These are the C<BEGIN> and C<END>
routines. The C<sub> is optional for these routines.

A C<BEGIN> subroutine is executed as soon as possible, that is, the moment
it is completely defined, even before the rest of the containing file
is parsed. You may have multiple C<BEGIN> blocks within a file--they
will execute in order of definition. Because a C<BEGIN> block executes
immediately, it can pull in definitions of subroutines and such from other
files in time to be visible to the rest of the file. Once a C<BEGIN>
has run, it is immediately undefined and any code it used is returned to
Perl's memory pool. This means you can't ever explicitly call a C<BEGIN>.

An C<END> subroutine is executed as late as possible, that is, when
the interpreter is being exited, even if it is exiting as a result of
a die() function. (But not if it's polymorphing into another program
via C<exec>, or being blown out of the water by a signal--you have to
trap that yourself (if you can).) You may have multiple C<END> blocks
within a file--they will execute in reverse order of definition; that is:
last in, first out (LIFO).
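
The ordering rules can be observed with a sketch like the one below, which
records each step in an array. The array name C<@order> and the labels are
illustrative.

```perl
use strict;
use warnings;

our @order;

push @order, 'main code';

# The first END defined runs last, so it is the one that prints.
END   { push @order, 'first END'; print join(', ', @order), "\n" }
BEGIN { push @order, 'first BEGIN' }
END   { push @order, 'second END' }
BEGIN { push @order, 'second BEGIN' }
```

Both C<BEGIN> blocks run during compilation, before the C<push> on the
line above them, and the C<END> blocks run in reverse order at exit.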

Inside an C<END> subroutine, C<$?> contains the value that the script is
going to pass to C<exit()>. You can modify C<$?> to change the exit
value of the script. Beware of changing C<$?> by accident (e.g. by
running something via C<system>).

Note that when you use the B<-n> and B<-p> switches to Perl, C<BEGIN> and
C<END> work just as they do in B<awk>, as a degenerate case. As currently
implemented (and subject to change, since it's inconvenient at best),
both C<BEGIN> I<and> C<END> blocks are run when you use the B<-c> switch
for a compile-only syntax check, although your main code is not.

=head2 Perl Classes

There is no special class syntax in Perl, but a package may function
as a class if it provides subroutines to act as methods. Such a
package may also derive some of its methods from another class (package)
by listing the other package name in its global @ISA array (which
must be a package global, not a lexical).

For more on this, see L<perltoot> and L<perlobj>.
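
A minimal sketch of a package acting as a class and a second package
inheriting from it through @ISA. The class names C<Animal> and C<Dog> and
their methods are hypothetical.

```perl
use strict;
use warnings;

package Animal;
sub new   { my ($class, %args) = @_; return bless { %args }, $class }
sub name  { my $self = shift; return $self->{name} }
sub speak { my $self = shift; return $self->name . ' makes a sound' }

package Dog;
our @ISA = ('Animal');      # must be a package variable, not my @ISA
sub speak { my $self = shift; return $self->name . ' barks' }

package main;

my $dog    = Dog->new(name => 'Rex');   # new() is inherited from Animal
my $spoken = $dog->speak;               # Dog's own speak() is found first
print "$spoken\n";
```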

=head2 Perl Modules

A module is just a package that is defined in a library file of
the same name, and is designed to be reusable. It may do this by
providing a mechanism for exporting some of its symbols into the symbol
table of any package using it. Or it may function as a class
definition and make its semantics available implicitly through method
calls on the class and its objects, without explicit exportation of any
symbols. Or it can do a little of both.

For example, to start a normal module called Some::Module, create
a file called Some/Module.pm and start with this template:

    package Some::Module;  # assumes Some/Module.pm

    use strict;

    BEGIN {
        use Exporter   ();
        use vars       qw($VERSION @ISA @EXPORT @EXPORT_OK %EXPORT_TAGS);

        # set the version for version checking
        $VERSION     = 1.00;
        # if using RCS/CVS, this may be preferred
        $VERSION = do { my @r = (q$Revision: 2.21 $ =~ /\d+/g); sprintf "%d."."%02d" x $#r, @r }; # must be all one line, for MakeMaker

        @ISA         = qw(Exporter);
        @EXPORT      = qw(&func1 &func2);
        %EXPORT_TAGS = ( );     # eg: TAG => [ qw!name1 name2! ],

        # your exported package globals go here,
        # as well as any optionally exported functions
        @EXPORT_OK   = qw($Var1 %Hashit &func3);
    }
    use vars      @EXPORT_OK;

    # non-exported package globals go here
    use vars      qw(@more $stuff);

    # initialize package globals, first exported ones
    $Var1   = '';
    %Hashit = ();

    # then the others (which are still accessible as $Some::Module::stuff)
    $stuff  = '';
    @more   = ();

    # all file-scoped lexicals must be created before
    # the functions below that use them.

    # file-private lexicals go here
    my $priv_var    = '';
    my %secret_hash = ();

    # here's a file-private function as a closure,
    # callable as &$priv_func;  it cannot be prototyped.
    my $priv_func = sub {
        # stuff goes here.
    };

    # make all your functions, whether exported or not;
    # remember to put something interesting in the {} stubs
    sub func1      {}    # no prototype
    sub func2()    {}    # proto'd void
    sub func3($$)  {}    # proto'd to 2 scalars

    # this one isn't exported, but could be called!
    sub func4(\%)  {}    # proto'd to 1 hash ref

    END { }       # module clean-up code here (global destructor)

Then go on to declare and use your variables in functions
without any qualifications.
See L<Exporter> and L<perlmodlib> for details on
mechanics and style issues in module creation.
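
To see the export mechanism at work without creating a separate file, the
sketch below defines a small package inline and imports from it by hand.
The package C<Some::Greeter> and its function C<greet> are hypothetical; a
real module would live in its own F<.pm> file and be loaded with C<use>.

```perl
use strict;
use warnings;

package Some::Greeter;      # hypothetical; normally Some/Greeter.pm

use Exporter ();
our (@ISA, @EXPORT_OK);
BEGIN {
    @ISA       = ('Exporter');
    @EXPORT_OK = qw(greet);     # exported only on request
}

sub greet { my $who = shift; return "hello, $who" }

package main;

# Stand-in for "use Some::Greeter qw(greet);" since no file is loaded.
BEGIN { Some::Greeter->import('greet') }

my $message = greet 'world';    # usable as a list operator after import
print "$message\n";
```

The assignments to @ISA and @EXPORT_OK sit in a C<BEGIN> block so that
they have taken effect by the time the import runs at compile time.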

Perl modules are included into your program by saying

    use Module;

or

    use Module LIST;

This is exactly equivalent to

    BEGIN { require Module; import Module; }

or

    BEGIN { require Module; import Module LIST; }

As a special case

    use Module ();

is exactly equivalent to

    BEGIN { require Module; }
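
As an illustration, here is the spelled-out form applied to the core
module List::Util. This sketch uses the method-call arrow syntax for
C<import> rather than the indirect object form shown above.

```perl
use strict;
use warnings;

# Spelled-out form of "use List::Util qw(sum max);"
BEGIN { require List::Util; List::Util->import(qw(sum max)); }

# Because the import happened at compile time, both names can be
# used as list operators without parentheses.
my $total   = sum 1, 2, 3, 4;
my $largest = max 1, 2, 3, 4;
print "$total $largest\n";
```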

All Perl module files have the extension F<.pm>. C<use> assumes this so
that you don't have to spell out "F<Module.pm>" in quotes. This also
helps to differentiate new modules from old F<.pl> and F<.ph> files.
Module names are also capitalized unless they're functioning as pragmas.
"Pragmas" are in effect compiler directives, and are sometimes called
"pragmatic modules" (or even "pragmata" if you're a classicist).

The two statements:

    require SomeModule;
    require "SomeModule.pm";

differ from each other in two ways. In the first case, any double
colons in the module name, such as C<Some::Module>, are translated
into your system's directory separator, usually "/". The second
case does not, and would have to be specified literally. The other difference
is that seeing the first C<require> clues in the compiler that uses of
indirect object notation involving "SomeModule", as in C<$ob = purge SomeModule>,
are method calls, not function calls. (Yes, this really can make a difference.)
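
The first difference can be checked with two core modules, as in this
sketch. It relies on C<%INC>, which records each loaded file under its
path-style key; the array name C<@loaded> is illustrative.

```perl
use strict;
use warnings;

require File::Spec;             # bareword: "::" is translated for you
require "File/Basename.pm";     # string: the path is given literally

# Both files end up in %INC under keys written with "/".
my @loaded = grep { exists $INC{$_} } ('File/Spec.pm', 'File/Basename.pm');
print "loaded: @loaded\n";
```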

Because the C<use> statement implies a C<BEGIN> block, the importation
of semantics happens at the moment the C<use> statement is compiled,
before the rest of the file is compiled. This is how it is able
to function as a pragma mechanism, and also how modules are able to
declare subroutines that are then visible as list operators for
the rest of the current file. This will not work if you use C<require>
instead of C<use>. With require you can get into this problem:

    require Cwd;                # make Cwd:: accessible
    $here = Cwd::getcwd();

    use Cwd;                    # import names from Cwd::
    $here = getcwd();

    require Cwd;                # make Cwd:: accessible
    $here = getcwd();           # oops! no main::getcwd()

In general, C<use Module ()> is recommended over C<require Module>,
because it determines module availability at compile time, not in the
middle of your program's execution. An exception would be if two modules
each tried to C<use> each other, and each also called a function from
that other module. In that case, it's easy to use C<require>s instead.

Perl packages may be nested inside other package names, so we can have
package names containing C<::>. But if we used that package name
directly as a filename it would make for unwieldy or impossible
filenames on some systems. Therefore, if a module's name is, say,
C<Text::Soundex>, then its definition is actually found in the library
file F<Text/Soundex.pm>.

Perl modules always have a F<.pm> file, but there may also be dynamically
linked executables or autoloaded subroutine definitions associated with
the module. If so, these will be entirely transparent to the user of
the module. It is the responsibility of the F<.pm> file to load (or
arrange to autoload) any additional functionality. The POSIX module
happens to do both dynamic loading and autoloading, but the user can
say just C<use POSIX> to get it all.

For more information on writing extension modules, see L<perlxstut>
and L<perlguts>.

=head1 SEE ALSO

See L<perlmodlib> for general style issues related to building Perl
modules and classes as well as descriptions of the standard library and
CPAN, L<Exporter> for how Perl's standard import/export mechanism works,
L<perltoot> for an in-depth tutorial on creating classes, L<perlobj>
for a hard-core reference document on objects, and L<perlsub> for an
explanation of functions and scoping.
367L<perltoot> for an in-depth tutorial on creating classes, L<perlobj>
368for a hard-core reference document on objects, and L<perlsub> for an
369explanation of functions and scoping.