Commit | Line | Data |
a0d0e21e |
1 | =head1 NAME |
2 | |
f102b883 |
3 | perlmod - Perl modules (packages and symbol tables) |
a0d0e21e |
4 | |
5 | =head1 DESCRIPTION |
6 | |
7 | =head2 Packages |
8 | |
748a9306 |
9 | Perl provides a mechanism for alternative namespaces to protect packages |
d0c42abe |
10 | from stomping on each other's variables. In fact, apart from certain |
f102b883 |
11 | magical variables, there's really no such thing as a global variable |
12 | in Perl. The package statement declares the compilation unit as |
13 | being in the given namespace. The scope of the package declaration |
14 | is from the declaration itself through the end of the enclosing block, |
15 | C<eval>, C<sub>, or end of file, whichever comes first (the same scope |
16 | as the my() and local() operators). All further unqualified dynamic |
17 | identifiers will be in this namespace. A package statement affects |
18 | only dynamic variables--including those you've used local() on--but |
19 | I<not> lexical variables created with my(). Typically it would be |
20 | the first declaration in a file to be included by the C<require> or |
21 | C<use> operator. You can switch into a package in more than one place; |
22 | it influences merely which symbol table is used by the compiler for the |
23 | rest of that block. You can refer to variables and filehandles in other |
24 | packages by prefixing the identifier with the package name and a double |
25 | colon: C<$Package::Variable>. If the package name is null, the C<main> |
26 | package is assumed. That is, C<$::sail> is equivalent to C<$main::sail>. |
a0d0e21e |
27 | |
28 | (The old package delimiter was a single quote, but double colon |
29 | is now the preferred delimiter, in part because it's more readable |
30 | to humans, and in part because it's more readable to B<emacs> macros. |
31 | It also makes C++ programmers feel like they know what's going on.) |
32 | |
33 | Packages may be nested inside other packages: C<$OUTER::INNER::var>. This |
34 | implies nothing about the order of name lookups, however. All symbols |
35 | are either local to the current package, or must be fully qualified |
36 | from the outer package name down. For instance, there is nowhere |
37 | within package C<OUTER> that C<$INNER::var> refers to C<$OUTER::INNER::var>. |
38 | It would treat package C<INNER> as a totally separate global package. |
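A minimal sketch of the lookup rule above. The packages C<OUTER>, C<OUTER::INNER>, and C<INNER>, and the variables in them, are made up for illustration:

```perl
use strict;
use warnings;

{ package OUTER;        our $var = 'outer';           }
{ package OUTER::INNER; our $var = 'outer-inner';     }
{ package INNER;        our $var = 'top-level inner'; }

my ($seen, $full);
{
    package OUTER;
    # Inside OUTER, $INNER::var still names the separate top-level package.
    $seen = $INNER::var;           # 'top-level inner'
    $full = $OUTER::INNER::var;    # 'outer-inner'
}

our $sail = 'main sail';           # we are back in package main here
print "$seen\n$full\n$::sail\n";   # $::sail is shorthand for $main::sail
```

The lexicals C<$seen> and C<$full> are visible in every package in the file, because the package statement affects only dynamic (package) variables.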
39 | |
40 | Only identifiers starting with letters (or underscore) are stored in a |
cb1a09d0 |
41 | package's symbol table. All other symbols are kept in package C<main>, |
42 | including all of the punctuation variables like $_. In addition, the |
5f05dabc |
43 | identifiers STDIN, STDOUT, STDERR, ARGV, ARGVOUT, ENV, INC, and SIG are |
cb1a09d0 |
44 | forced to be in package C<main>, even when used for other purposes than |
54310121 |
45 | their builtin one. Note also that, if you have a package called C<m>, |
5f05dabc |
46 | C<s>, or C<y>, then you can't use the qualified form of an identifier |
cb1a09d0 |
47 | because it will be interpreted instead as a pattern match, a substitution, |
48 | or a translation. |
a0d0e21e |
49 | |
50 | (Variables beginning with underscore used to be forced into package |
51 | main, but we decided it was more useful for package writers to be able |
cb1a09d0 |
52 | to use leading underscore to indicate private variables and method names. |
53 | $_ is still global though.) |
a0d0e21e |
54 | |
55 | Eval()ed strings are compiled in the package in which the eval() was |
56 | compiled. (Assignments to C<$SIG{}>, however, assume the signal |
748a9306 |
57 | handler specified is in the C<main> package. Qualify the signal handler |
a0d0e21e |
58 | name if you wish to have a signal handler in a package.) For an |
59 | example, examine F<perldb.pl> in the Perl library. It initially switches |
60 | to the C<DB> package so that the debugger doesn't interfere with variables |
61 | in the script you are trying to debug. At various points, however, it |
62 | temporarily switches back to the C<main> package to evaluate various |
63 | expressions in the context of the C<main> package (or wherever you came |
64 | from). See L<perldebug>. |
65 | |
f102b883 |
66 | The special symbol C<__PACKAGE__> contains the current package, but cannot |
67 | (easily) be used to construct variables. |
68 | |
5f05dabc |
69 | See L<perlsub> for other scoping issues related to my() and local(), |
f102b883 |
70 | and L<perlref> regarding closures. |
cb1a09d0 |
71 | |
a0d0e21e |
72 | =head2 Symbol Tables |
73 | |
aa689395 |
74 | The symbol table for a package happens to be stored in the hash of that |
75 | name with two colons appended. The main symbol table's name is thus |
76 | C<%main::>, or C<%::> for short. Likewise, the symbol table for the nested |
77 | package mentioned earlier is named C<%OUTER::INNER::>. |
78 | |
79 | The value in each entry of the hash is what you are referring to when you |
80 | use the C<*name> typeglob notation. In fact, the following have the same |
81 | effect, though the first is more efficient because it does the symbol |
82 | table lookups at compile time: |
a0d0e21e |
83 | |
f102b883 |
84 | local *main::foo = *main::bar; |
85 | local $main::{foo} = $main::{bar}; |
a0d0e21e |
86 | |
87 | You can use this to print out all the variables in a package, for |
88 | instance. Here is F<dumpvar.pl> from the Perl library: |
89 | |
90 | package dumpvar; |
91 | sub main::dumpvar { |
92 | ($package) = @_; |
93 | local(*stab) = eval("*${package}::"); |
94 | while (($key,$val) = each(%stab)) { |
95 | local(*entry) = $val; |
96 | if (defined $entry) { |
97 | print "\$$key = '$entry'\n"; |
98 | } |
99 | |
100 | if (defined @entry) { |
101 | print "\@$key = (\n"; |
102 | foreach $num ($[ .. $#entry) { |
103 | print " $num\t'",$entry[$num],"'\n"; |
104 | } |
105 | print ")\n"; |
106 | } |
107 | |
108 | if ($key ne "${package}::" && defined %entry) { |
109 | print "\%$key = (\n"; |
110 | foreach $key (sort keys(%entry)) { |
111 | print " $key\t'",$entry{$key},"'\n"; |
112 | } |
113 | print ")\n"; |
114 | } |
115 | } |
116 | } |
117 | |
118 | Note that even though the subroutine is compiled in package C<dumpvar>, |
f102b883 |
119 | the name of the subroutine is qualified so that its name is inserted into |
120 | package C<main>. While popular many years ago, this is now considered |
121 | very poor style; in general, you should be writing modules and using the |
122 | normal export mechanism instead of hammering someone else's namespace, |
123 | even main's. |
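The listing above relies on C<defined @entry> and C<defined %entry>, which Perl 5.22 and later reject at compile time. As a rough sketch of the same idea under C<use strict>, the following walks a symbol table by name. The package C<Demo> and the helper C<list_symbols> are hypothetical names invented for this example:

```perl
use strict;
use warnings;

{
    package Demo;            # hypothetical package used only for this sketch
    our $scalar = 'one';
    our @list   = (1, 2);
    sub hello { 'hi' }
}

# Return the sorted identifier names stored in a package's symbol table.
sub list_symbols {
    my ($package) = @_;
    no strict 'refs';        # "${package}::" is a symbolic reference to the stash
    my @names = sort keys %{"${package}::"};
    return @names;
}

print "$_\n" for list_symbols('Demo');
```

The sketch reports only the names. Which of the scalar, array, hash, or subroutine slots is in use for each name has to be checked separately.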
a0d0e21e |
124 | |
cb1a09d0 |
125 | Assignment to a typeglob performs an aliasing operation, i.e., |
a0d0e21e |
126 | |
127 | *dick = *richard; |
128 | |
5f05dabc |
129 | causes variables, subroutines, and file handles accessible via the |
d0c42abe |
130 | identifier C<richard> to also be accessible via the identifier C<dick>. If |
5f05dabc |
131 | you want to alias only a particular variable or subroutine, you can |
a0d0e21e |
132 | assign a reference instead: |
133 | |
134 | *dick = \$richard; |
135 | |
136 | This makes $richard and $dick the same variable, but leaves |
137 | @richard and @dick as separate arrays. Tricky, eh? |
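The difference can be checked with a short sketch. The C<our> declarations are added here only so that the example runs under C<use strict>:

```perl
use strict;
use warnings;

our ($richard, @richard, $dick, @dick);
$richard = 'scalar';
@richard = (1, 2, 3);

*dick = \$richard;             # alias only the scalar slot
$dick = 'changed';             # writes through to $richard

print "$richard\n";            # changed
print scalar(@dick), "\n";     # 0: @dick is still its own, empty array
```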
138 | |
cb1a09d0 |
139 | This mechanism may be used to pass and return cheap references |
140 | into or from subroutines if you don't want to copy the whole |
141 | thing. |
142 | |
143 | %some_hash = (); |
144 | *some_hash = fn( \%another_hash ); |
145 | sub fn { |
146 | local *hashsym = shift; |
147 | # now use %hashsym normally, and you |
148 | # will affect the caller's %another_hash |
149 | my %nhash = (); # do what you want |
5f05dabc |
150 | return \%nhash; |
cb1a09d0 |
151 | } |
152 | |
5f05dabc |
153 | On return, the reference will overwrite the hash slot in the |
cb1a09d0 |
154 | symbol table specified by the *some_hash typeglob. This |
c36e9b62 |
155 | is a somewhat tricky way of passing around references cheaply |
cb1a09d0 |
156 | when you don't want to have to remember to dereference variables |
157 | explicitly. |
158 | |
159 | Another use of symbol tables is for making "constant" scalars. |
160 | |
161 | *PI = \3.14159265358979; |
162 | |
163 | Now you cannot alter $PI, which is probably a good thing all in all. |
f102b883 |
164 | This isn't the same as a constant subroutine (one prototyped to |
165 | take no arguments and to return a constant expression), which is |
166 | subject to optimization at compile-time. This isn't. See L<perlsub> |
167 | for details on these. |
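As a small sketch of what "cannot alter" means in practice, the assignment below is trapped with C<eval> so that the script keeps running. The C<our $PI> declaration is an addition needed only for C<use strict>:

```perl
use strict;
use warnings;

our $PI;
*PI = \3.14159265358979;       # $PI now aliases a read-only literal

# Trying to assign to $PI dies at run time; eval traps the error.
my $ok = eval { $PI = 3; 1 };
print "assignment failed: $@" unless $ok;
```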
cb1a09d0 |
168 | |
55497cff |
169 | You can say C<*foo{PACKAGE}> and C<*foo{NAME}> to find out what name and |
170 | package the *foo symbol table entry comes from. This may be useful |
171 | in a subroutine which is passed typeglobs as arguments: |
172 | |
173 | sub identify_typeglob { |
174 | my $glob = shift; |
175 | print 'You gave me ', *{$glob}{PACKAGE}, '::', *{$glob}{NAME}, "\n"; |
176 | } |
177 | identify_typeglob *foo; |
178 | identify_typeglob *bar::baz; |
179 | |
180 | This prints |
181 | |
182 | You gave me main::foo |
183 | You gave me bar::baz |
184 | |
185 | The *foo{THING} notation can also be used to obtain references to the |
186 | individual elements of *foo; see L<perlref>. |
187 | |
a0d0e21e |
188 | =head2 Package Constructors and Destructors |
189 | |
190 | There are two special subroutine definitions that function as package |
191 | constructors and destructors. These are the C<BEGIN> and C<END> |
192 | routines. The C<sub> is optional for these routines. |
193 | |
f102b883 |
194 | A C<BEGIN> subroutine is executed as soon as possible, that is, the moment |
195 | it is completely defined, even before the rest of the containing file |
196 | is parsed. You may have multiple C<BEGIN> blocks within a file--they |
197 | will execute in order of definition. Because a C<BEGIN> block executes |
198 | immediately, it can pull in definitions of subroutines and such from other |
199 | files in time to be visible to the rest of the file. Once a C<BEGIN> |
200 | has run, it is immediately undefined and any code it used is returned to |
201 | Perl's memory pool. This means you can't ever explicitly call a C<BEGIN>. |
a0d0e21e |
202 | |
203 | An C<END> subroutine is executed as late as possible, that is, when the |
204 | interpreter is being exited, even if it is exiting as a result of a |
205 | die() function. (But not if it's being blown out of the water by a |
206 | signal--you have to trap that yourself (if you can).) You may have |
748a9306 |
207 | multiple C<END> blocks within a file--they will execute in reverse |
a0d0e21e |
208 | order of definition; that is: last in, first out (LIFO). |
209 | |
c36e9b62 |
210 | Inside an C<END> subroutine, C<$?> contains the value that the script is |
211 | going to pass to C<exit()>. You can modify C<$?> to change the exit |
f102b883 |
212 | value of the script. Beware of changing C<$?> by accident (e.g. by |
c36e9b62 |
213 | running something via C<system>). |
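The ordering rules can be seen in a short sketch. The array C<@order> is a made-up name used here to record when each block runs:

```perl
use strict;
use warnings;

our @order;                              # filled in as each block runs

BEGIN { push @order, 'BEGIN 1' }
END   { print "END 1 (defined first, runs last)\n" }

push @order, 'main';                     # ordinary run-time code

BEGIN { push @order, 'BEGIN 2' }         # still runs before 'main' above
END   { print "END 2 (defined last, runs first)\n" }

print join(', ', @order), "\n";          # BEGIN 1, BEGIN 2, main
```

Both C<BEGIN> blocks run while the file is still being compiled, so they are recorded before the run-time C<push>. The two C<END> messages appear after the last C<print>, in reverse order of definition.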
214 | |
a0d0e21e |
215 | Note that when you use the B<-n> and B<-p> switches to Perl, C<BEGIN> |
216 | and C<END> work just as they do in B<awk>, as a degenerate case. |
217 | |
218 | =head2 Perl Classes |
219 | |
4633a7c4 |
220 | There is no special class syntax in Perl, but a package may function |
a0d0e21e |
221 | as a class if it provides subroutines that function as methods. Such a |
222 | package may also derive some of its methods from another class package |
5f05dabc |
223 | by listing the other package name in its @ISA array. |
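A minimal sketch of a package acting as a class and inheriting through C<@ISA>. The class names C<Animal> and C<Dog> and their methods are hypothetical:

```perl
use strict;
use warnings;

{
    package Animal;                       # hypothetical base class
    sub new   { my $class = shift; return bless {}, $class }
    sub speak { return 'generic noise' }
    sub name  { my $self = shift; return ref $self }
}
{
    package Dog;                          # hypothetical derived class
    our @ISA = ('Animal');                # inherit methods from Animal
    sub speak { return 'Woof' }           # override one method
}

my $dog  = Dog->new;                      # new() is found through @ISA
my $line = $dog->name . ' says ' . $dog->speak;
print "$line\n";                          # Dog says Woof
```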
4633a7c4 |
224 | |
f102b883 |
225 | For more on this, see L<perltoot> and L<perlobj>. |
a0d0e21e |
226 | |
227 | =head2 Perl Modules |
228 | |
c07a80fd |
229 | A module is just a package that is defined in a library file of |
a0d0e21e |
230 | the same name, and is designed to be reusable. It may do this by |
231 | providing a mechanism for exporting some of its symbols into the symbol |
232 | table of any package using it. Or it may function as a class |
233 | definition and make its semantics available implicitly through method |
234 | calls on the class and its objects, without explicit exportation of any |
235 | symbols. Or it can do a little of both. |
236 | |
9607fc9c |
237 | For example, to start a normal module called Some::Module, create |
238 | a file called Some/Module.pm and start with this template: |
239 | |
240 | package Some::Module; # assumes Some/Module.pm |
241 | |
242 | use strict; |
243 | |
244 | BEGIN { |
245 | use Exporter (); |
246 | use vars qw($VERSION @ISA @EXPORT @EXPORT_OK %EXPORT_TAGS); |
247 | |
248 | # set the version for version checking |
249 | $VERSION = 1.00; |
250 | # if using RCS/CVS, this may be preferred |
251 | $VERSION = do { my @r = (q$Revision: 2.21 $ =~ /\d+/g); sprintf "%d."."%02d" x $#r, @r }; # must be all one line, for MakeMaker |
252 | |
253 | @ISA = qw(Exporter); |
254 | @EXPORT = qw(&func1 &func2); |
255 | %EXPORT_TAGS = ( ); # eg: TAG => [ qw!name1 name2! ], |
256 | |
257 | # your exported package globals go here, |
258 | # as well as any optionally exported functions |
259 | @EXPORT_OK = qw($Var1 %Hashit &func3); |
260 | } |
261 | use vars @EXPORT_OK; |
262 | |
263 | # non-exported package globals go here |
264 | use vars qw(@more $stuff); |
265 | |
266 | # initialize package globals, first exported ones |
267 | $Var1 = ''; |
268 | %Hashit = (); |
269 | |
270 | # then the others (which are still accessible as $Some::Module::stuff) |
271 | $stuff = ''; |
272 | @more = (); |
273 | |
274 | # all file-scoped lexicals must be created before |
275 | # the functions below that use them. |
276 | |
277 | # file-private lexicals go here |
278 | my $priv_var = ''; |
279 | my %secret_hash = (); |
280 | |
281 | # here's a file-private function as a closure, |
282 | # callable as &$priv_func; it cannot be prototyped. |
283 | my $priv_func = sub { |
284 | # stuff goes here. |
285 | }; |
286 | |
287 | # make all your functions, whether exported or not; |
288 | # remember to put something interesting in the {} stubs |
289 | sub func1 {} # no prototype |
290 | sub func2() {} # proto'd void |
291 | sub func3($$) {} # proto'd to 2 scalars |
292 | |
293 | # this one isn't exported, but could be called! |
294 | sub func4(\%) {} # proto'd to 1 hash ref |
295 | |
296 | END { } # module clean-up code here (global destructor) |
4633a7c4 |
297 | |
298 | Then go on to declare and use your variables in functions |
299 | without any qualifications. |
f102b883 |
300 | See L<Exporter> and L<perlmodlib> for details on |
4633a7c4 |
301 | mechanics and style issues in module creation. |
302 | |
303 | Perl modules are included into your program by saying |
a0d0e21e |
304 | |
305 | use Module; |
306 | |
307 | or |
308 | |
309 | use Module LIST; |
310 | |
311 | This is exactly equivalent to |
312 | |
313 | BEGIN { require "Module.pm"; import Module; } |
314 | |
315 | or |
316 | |
317 | BEGIN { require "Module.pm"; import Module LIST; } |
318 | |
cb1a09d0 |
319 | As a special case |
320 | |
321 | use Module (); |
322 | |
323 | is exactly equivalent to |
324 | |
325 | BEGIN { require "Module.pm"; } |
326 | |
a0d0e21e |
327 | All Perl module files have the extension F<.pm>. C<use> assumes this so |
328 | that you don't have to spell out "F<Module.pm>" in quotes. This also |
329 | helps to differentiate new modules from old F<.pl> and F<.ph> files. |
330 | Module names are also capitalized unless they're functioning as pragmas. |
331 | "Pragmas" are in effect compiler directives, and are sometimes called |
332 | "pragmatic modules" (or even "pragmata" if you're a classicist). |
333 | |
334 | Because the C<use> statement implies a C<BEGIN> block, the importation |
335 | of semantics happens at the moment the C<use> statement is compiled, |
336 | before the rest of the file is compiled. This is how it is able |
337 | to function as a pragma mechanism, and also how modules are able to |
338 | declare subroutines that are then visible as list operators for |
339 | the rest of the current file. This will not work if you use C<require> |
cb1a09d0 |
340 | instead of C<use>. With require you can get into this problem: |
a0d0e21e |
341 | |
342 | require Cwd; # make Cwd:: accessible |
54310121 |
343 | $here = Cwd::getcwd(); |
a0d0e21e |
344 | |
5f05dabc |
345 | use Cwd; # import names from Cwd:: |
a0d0e21e |
346 | $here = getcwd(); |
347 | |
348 | require Cwd; # make Cwd:: accessible |
349 | $here = getcwd(); # oops! no main::getcwd() |
350 | |
cb1a09d0 |
351 | In general C<use Module ();> is recommended over C<require Module;>. |
352 | |
a0d0e21e |
353 | Perl packages may be nested inside other package names, so we can have |
354 | package names containing C<::>. But if we used that package name |
355 | directly as a filename it would make for unwieldy or impossible |
356 | filenames on some systems. Therefore, if a module's name is, say, |
357 | C<Text::Soundex>, then its definition is actually found in the library |
358 | file F<Text/Soundex.pm>. |
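The name-to-file mapping described above can be sketched in a few lines. The helper C<module_to_path> is a hypothetical name, not something Perl provides:

```perl
use strict;
use warnings;

# Convert a module name to the relative path searched for in @INC.
sub module_to_path {                 # hypothetical helper, not part of Perl
    my ($module) = @_;
    (my $path = $module) =~ s{::}{/}g;
    return "$path.pm";
}

print module_to_path('Text::Soundex'), "\n";   # Text/Soundex.pm
```

The same slash-separated form is the one under which a loaded file is recorded as a key of C<%INC>.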
359 | |
360 | Perl modules always have a F<.pm> file, but there may also be dynamically |
361 | linked executables or autoloaded subroutine definitions associated with |
362 | the module. If so, these will be entirely transparent to the user of |
363 | the module. It is the responsibility of the F<.pm> file to load (or |
364 | arrange to autoload) any additional functionality. The POSIX module |
365 | happens to do both dynamic loading and autoloading, but the user can |
5f05dabc |
366 | say just C<use POSIX> to get it all. |
a0d0e21e |
367 | |
f102b883 |
368 | For more information on writing extension modules, see L<perlxstut> |
a0d0e21e |
369 | and L<perlguts>. |
370 | |
f102b883 |
371 | =head1 SEE ALSO |
cb1a09d0 |
372 | |
f102b883 |
373 | See L<perlmodlib> for general style issues related to building Perl |
374 | modules and classes as well as descriptions of the standard library and |
375 | CPAN, L<Exporter> for how Perl's standard import/export mechanism works, |
376 | L<perltoot> for an in-depth tutorial on creating classes, L<perlobj> |
377 | for a hard-core reference document on objects, and L<perlsub> for an |
378 | explanation of functions and scoping. |