Commit | Line | Data |
a0d0e21e |
1 | =head1 NAME |
2 | |
3 | perlobj - Perl objects |
4 | |
5 | =head1 DESCRIPTION |
6 | |
7 | First of all, you need to understand what references are in Perl. See |
8 | L<perlref> for that. |
9 | |
10 | Here are three very simple definitions that you should find reassuring. |
11 | |
12 | =over 4 |
13 | |
14 | =item 1. |
15 | |
16 | An object is simply a reference that happens to know which class it |
17 | belongs to. |
18 | |
19 | =item 2. |
20 | |
21 | A class is simply a package that happens to provide methods to deal |
22 | with object references. |
23 | |
24 | =item 3. |
25 | |
26 | A method is simply a subroutine that expects an object reference (or |
27 | a package name, for static methods) as the first argument. |
28 | |
29 | =back |
30 | |
31 | We'll cover these points now in more depth. |
32 | |
33 | =head2 An Object is Simply a Reference |
34 | |
35 | Unlike say C++, Perl doesn't provide any special syntax for |
36 | constructors. A constructor is merely a subroutine that returns a |
cb1a09d0 |
37 | reference to something "blessed" into a class, generally the |
a0d0e21e |
38 | class that the subroutine is defined in. Here is a typical |
39 | constructor: |
40 | |
41 |     package Critter; |
42 |     sub new { bless {} } |
43 | |
44 | The C<{}> constructs a reference to an anonymous hash containing no |
45 | key/value pairs. The bless() takes that reference and tells the object |
46 | it references that it's now a Critter, and returns the reference. |
47 | Returning the reference is just a convenience: the referenced object itself |
48 | knows that it has been blessed, so the reference could equally have been |
49 | returned directly, like this: |
50 | |
51 |     sub new { |
52 |         my $self = {}; |
53 |         bless $self; |
54 |         return $self; |
55 |     } |
56 | |
57 | In fact, you often see such a thing in more complicated constructors |
58 | that wish to call methods in the class as part of the construction: |
59 | |
60 |     sub new { |
61 |         my $self = {}; |
62 |         bless $self; |
63 |         $self->initialize(); |
cb1a09d0 |
64 |         return $self; |
65 |     } |
66 | |
67 | If you care about inheritance (and you should; see L<perlmod/"Modules: |
68 | Creation, Use and Abuse">), then you want to use the two-arg form of bless |
69 | so that your constructors may be inherited: |
70 | |
71 |     sub new { |
72 |         my $class = shift; |
73 |         my $self = {}; |
74 |         bless $self, $class; |
75 |         $self->initialize(); |
76 |         return $self; |
77 |     } |
78 | |
d28ebecd |
79 | Or if you expect people to call not just C<CLASS-E<gt>new()> but also |
80 | C<$obj-E<gt>new()>, then use something like this. The initialize() |
cb1a09d0 |
81 | method used will be of whatever $class we blessed the |
82 | object into: |
83 | |
84 |     sub new { |
85 |         my $this = shift; |
86 |         my $class = ref($this) || $this; |
87 |         my $self = {}; |
88 |         bless $self, $class; |
89 |         $self->initialize(); |
90 |         return $self; |
a0d0e21e |
91 |     } |
92 | |
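The constructor patterns above can be put together as a small, self-contained sketch. The C<initialize> method and the C<name> and C<height> fields are illustrative assumptions, not part of any real class:

```perl
use strict;
use warnings;

package Critter;

# Inheritable constructor: works as Critter->new(...) or $obj->new(...).
sub new {
    my $this  = shift;
    my $class = ref($this) || $this;
    my $self  = {};
    bless $self, $class;
    $self->initialize(@_);
    return $self;
}

# Hypothetical initializer; the field names are made up for this sketch.
sub initialize {
    my ($self, %args) = @_;
    $self->{name}   = $args{name} // 'unnamed';
    $self->{height} = $args{height};
    return;
}

package main;

my $fred  = Critter->new(name => 'Fred', height => 1.5);
my $clone = $fred->new(name => 'Barney');   # called on an object

print ref($fred),  " ", $fred->{name},  "\n";
print ref($clone), " ", $clone->{name}, "\n";
```

Because the class name is taken from the first argument rather than hard-coded, a subclass that inherits C<new> should get objects blessed into the subclass.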
93 | Within the class package, the methods will typically deal with the |
94 | reference as an ordinary reference. Outside the class package, |
95 | the reference is generally treated as an opaque value that may |
96 | only be accessed through the class's methods. |
97 | |
748a9306 |
98 | A constructor may re-bless a referenced object currently belonging to |
a0d0e21e |
99 | another class, but then the new class is responsible for all cleanup |
100 | later. The previous blessing is forgotten, as an object may only |
101 | belong to one class at a time. (Although of course it's free to |
102 | inherit methods from many classes.) |
103 | |
104 | A clarification: Perl objects are blessed. References are not. Objects |
105 | know which package they belong to. References do not. The bless() |
106 | function simply uses the reference in order to find the object. Consider |
107 | the following example: |
108 | |
109 |     $a = {}; |
110 |     $b = $a; |
111 |     bless $a, 'BLAH'; |
112 |     print "\$b is a ", ref($b), "\n"; |
113 | |
114 | This reports $b as being a BLAH, so obviously bless() |
115 | operated on the object and not on the reference. |
116 | |
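As a minimal sketch of the same point, the core Scalar::Util module can report the class and the underlying type separately. The variable names here are arbitrary:

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);

my $first  = {};
my $second = $first;          # a second reference to the same hash
bless $first, 'BLAH';         # marks the hash, not the variable $first

print blessed($second), "\n"; # class seen through the other reference
print reftype($second), "\n"; # underlying data type is still a hash

my $other = {};               # a different hash that was never blessed
print((defined blessed($other) ? "blessed" : "not blessed"), "\n");
```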
117 | =head2 A Class is Simply a Package |
118 | |
119 | Unlike say C++, Perl doesn't provide any special syntax for class |
120 | definitions. You just use a package as a class by putting method |
121 | definitions into the class. |
122 | |
123 | There is a special array within each package called @ISA which says |
124 | where else to look for a method if you can't find it in the current |
125 | package. This is how Perl implements inheritance. Each element of the |
126 | @ISA array is just the name of another package that happens to be a |
127 | class package. The classes are searched (depth first) for missing |
128 | methods in the order that they occur in @ISA. The classes accessible |
cb1a09d0 |
129 | through @ISA are known as base classes of the current class. |
a0d0e21e |
130 | |
131 | If a missing method is found in one of the base classes, it is cached |
132 | in the current class for efficiency. Changing @ISA or defining new |
133 | subroutines invalidates the cache and causes Perl to do the lookup again. |
134 | |
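A small sketch of the depth-first search through @ISA, assuming Perl's default method resolution order. All four package names are hypothetical:

```perl
use strict;
use warnings;

package LivingThing;
sub kind { return "living thing" }

package Animal;
our @ISA = ('LivingThing');
sub speak { return "generic noise" }

package Pet;
sub kind { return "pet" }

package Dog;
our @ISA = ('Animal', 'Pet');   # searched in this order, depth first
sub speak { return "Woof" }

package main;

my $dog = bless {}, 'Dog';
print $dog->speak, "\n";   # found in Dog itself
print $dog->kind,  "\n";   # Animal's base class is searched before Pet
```

Here C<kind> resolves to the LivingThing version, because the search descends through Animal and its own @ISA before it ever looks at Pet.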
135 | If a method isn't found, but an AUTOLOAD routine is found, then |
136 | that is called on behalf of the missing method. |
137 | |
138 | If neither a method nor an AUTOLOAD routine is found in @ISA, then one |
139 | last try is made for the method (or an AUTOLOAD routine) in a class |
a2bdc9a5 |
140 | called UNIVERSAL. (Several commonly used methods are automatically |
141 | supplied in the UNIVERSAL class; see L<"Default UNIVERSAL methods"> for |
142 | more details.) If that doesn't work, Perl finally gives up and |
a0d0e21e |
143 | complains. |
144 | |
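The AUTOLOAD fallback can be sketched as follows. Using it to serve attribute reads is only one possible design, and the C<height> attribute is an assumption of the example:

```perl
use strict;
use warnings;

package Critter;

our $AUTOLOAD;   # Perl stores the fully qualified missing method name here

sub new {
    my $class = shift;
    my $self  = { @_ };
    return bless $self, $class;
}

# Called on behalf of any method that was not found through @ISA.
sub AUTOLOAD {
    my $self = shift;
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;               # strip the package qualifier
    return if $name eq 'DESTROY';    # ignore the implicit destructor call
    die "No such attribute: $name\n"
        unless ref($self) && exists $self->{$name};
    return $self->{$name};
}

package main;

my $fred = Critter->new(height => 1.5);
print $fred->height, "\n";   # no height() sub exists; AUTOLOAD answers
```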
145 | Perl classes only do method inheritance. Data inheritance is left |
146 | up to the class itself. By and large, this is not a problem in Perl, |
147 | because most classes model the attributes of their object using |
148 | an anonymous hash, which serves as its own little namespace to be |
149 | carved up by the various classes that might want to do something |
150 | with the object. |
151 | |
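To illustrate how several classes can share one anonymous hash, here is a sketch in which a derived class adds its own key next to the base class's key. The class names and the C<name> and C<owner> keys are hypothetical:

```perl
use strict;
use warnings;

package Critter;

sub new {
    my $class = shift;
    my $self  = bless {}, $class;
    $self->initialize(@_);
    return $self;
}

sub initialize {
    my ($self, %args) = @_;
    $self->{name} = $args{name};          # key owned by the base class
    return;
}

package Pet;
our @ISA = ('Critter');

sub initialize {
    my ($self, %args) = @_;
    $self->Critter::initialize(%args);    # let the base class fill in its keys
    $self->{owner} = $args{owner};        # then add this class's own key
    return;
}

package main;

my $rex = Pet->new(name => 'Rex', owner => 'Fred');
print join(", ", map { "$_=$rex->{$_}" } sort keys %$rex), "\n";
```

Nothing enforces which class owns which key, so the classes involved have to agree on names by convention.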
152 | =head2 A Method is Simply a Subroutine |
153 | |
154 | Unlike say C++, Perl doesn't provide any special syntax for method |
155 | definition. (It does provide a little syntax for method invocation |
156 | though. More on that later.) A method expects its first argument |
157 | to be the object or package it is being invoked on. There are just two |
158 | types of methods, which we'll call static and virtual, in honor of |
159 | the two C++ method types they most closely resemble. |
160 | |
161 | A static method expects a class name as the first argument. It |
162 | provides functionality for the class as a whole, not for any individual |
163 | object belonging to the class. Constructors are typically static |
164 | methods. Many static methods simply ignore their first argument, since |
165 | they already know what package they're in, and don't care what package |
166 | they were invoked via. (These aren't necessarily the same, since |
167 | static methods follow the inheritance tree just like ordinary virtual |
168 | methods.) Another typical use for static methods is to look up an |
169 | object by name: |
170 | |
171 |     sub find { |
172 |         my ($class, $name) = @_; |
173 |         $objtable{$name}; |
174 |     } |
175 | |
176 | A virtual method expects an object reference as its first argument. |
177 | Typically it shifts the first argument into a "self" or "this" variable, |
178 | and then uses that as an ordinary reference. |
179 | |
180 |     sub display { |
181 |         my $self = shift; |
182 |         my @keys = @_ ? @_ : sort keys %$self; |
183 |         foreach my $key (@keys) { |
184 |             print "\t$key => $self->{$key}\n"; |
185 |         } |
186 |     } |
187 | |
188 | =head2 Method Invocation |
189 | |
190 | There are two ways to invoke a method, one of which you're already |
191 | familiar with, and the other of which will look familiar. Perl 4 |
192 | already had an "indirect object" syntax that you use when you say |
193 | |
194 |     print STDERR "help!!!\n"; |
195 | |
196 | This same syntax can be used to call either static or virtual methods. |
197 | We'll use the two methods defined above, the static method to look up |
198 | an object reference and the virtual method to print out its attributes. |
199 | |
200 |     $fred = find Critter "Fred"; |
201 |     display $fred 'Height', 'Weight'; |
202 | |
203 | These could be combined into one statement by using a BLOCK in the |
204 | indirect object slot: |
205 | |
206 |     display {find Critter "Fred"} 'Height', 'Weight'; |
207 | |
d28ebecd |
208 | For C++ fans, there's also a syntax using -E<gt> notation that does exactly |
a0d0e21e |
209 | the same thing. The parentheses are required if there are any arguments. |
210 | |
211 |     $fred = Critter->find("Fred"); |
212 |     $fred->display('Height', 'Weight'); |
213 | |
214 | or in one statement, |
215 | |
216 |     Critter->find("Fred")->display('Height', 'Weight'); |
217 | |
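The fragments above can be assembled into a runnable sketch that uses the arrow syntax. The registry hash, the constructor signature and the attribute values are assumptions made for the example, and C<display> returns its lines here instead of printing them so that the result is easy to inspect:

```perl
use strict;
use warnings;

package Critter;

my %objtable;   # name => object registry, an assumption of this sketch

sub new {
    my ($class, $name, %attrs) = @_;
    my $self = { Name => $name, %attrs };
    bless $self, $class;
    $objtable{$name} = $self;
    return $self;
}

# Static method: invoked on the class name.
sub find {
    my ($class, $name) = @_;
    return $objtable{$name};
}

# Virtual method: invoked on an object reference.
sub display {
    my $self = shift;
    my @keys = @_ ? @_ : sort keys %$self;
    return map { "\t$_ => $self->{$_}\n" } @keys;
}

package main;

Critter->new('Fred', Height => 1.5, Weight => 70);
print Critter->find("Fred")->display('Height', 'Weight');
```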
218 | There are times when one syntax is more readable, and times when the |
219 | other syntax is more readable. The indirect object syntax is less |
220 | cluttered, but it has the same ambiguity as ordinary list operators. |
221 | Indirect object method calls are parsed using the same rule as list |
222 | operators: "If it looks like a function, it is a function". (Presuming |
223 | for the moment that you think two words in a row can look like a |
224 | function name. C++ programmers seem to think so with some regularity, |
225 | especially when the first word is "new".) Thus, the parens of |
226 | |
227 |     new Critter ('Barney', 1.5, 70) |
228 | |
229 | are assumed to surround ALL the arguments of the method call, regardless |
230 | of what comes after. Saying |
231 | |
232 |     new Critter ('Bam' x 2), 1.4, 45 |
233 | |
234 | would be equivalent to |
235 | |
236 |     Critter->new('Bam' x 2), 1.4, 45 |
237 | |
238 | which is unlikely to do what you want. |
239 | |
240 | There are times when you wish to specify which class's method to use. |
241 | In this case, you can call your method as an ordinary subroutine |
242 | call, being sure to pass the requisite first argument explicitly: |
243 | |
244 |     $fred = MyCritter::find("Critter", "Fred"); |
245 |     MyCritter::display($fred, 'Height', 'Weight'); |
246 | |
247 | Note however, that this does not do any inheritance. If you merely |
248 | wish to specify that Perl should I<START> looking for a method in a |
249 | particular package, use an ordinary method call, but qualify the method |
250 | name with the package like this: |
251 | |
252 |     $fred = Critter->MyCritter::find("Fred"); |
253 |     $fred->MyCritter::display('Height', 'Weight'); |
254 | |
cb1a09d0 |
255 | If you're trying to control where the method search begins I<and> you're |
256 | executing in the class itself, then you may use the SUPER pseudoclass, |
257 | which says to start looking in your base class's @ISA list without having |
258 | to explicitly name it: |
259 | |
260 |     $self->SUPER::display('Height', 'Weight'); |
261 | |
262 | Please note that the C<SUPER::> construct is I<only> meaningful within the |
263 | class. |
264 | |
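A short sketch of both ways to control where the method search begins. The C<describe> method is hypothetical, and the class layout simply assumes that MyCritter is derived from Critter:

```perl
use strict;
use warnings;

package Critter;
sub new      { my $class = shift; return bless {}, $class }
sub describe { return "a critter" }

package MyCritter;
our @ISA = ('Critter');

sub describe {
    my $self = shift;
    # SUPER starts the search in the @ISA of the package this sub
    # was compiled in, which is MyCritter.
    return $self->SUPER::describe() . " with extras";
}

package main;

my $pet = MyCritter->new;
print $pet->describe, "\n";            # overridden version, which calls SUPER
print $pet->Critter::describe, "\n";   # search explicitly started in Critter
```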
748a9306 |
265 | Sometimes you want to call a method when you don't know the method name |
266 | ahead of time. You can use the arrow form, replacing the method name |
267 | with a simple scalar variable containing the method name: |
268 | |
269 |     $method = $fast ? "findfirst" : "findbest"; |
270 |     $fred->$method(@args); |
271 | |
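As a runnable sketch of calling a method by name, the hypothetical Finder class below supplies the two methods named in the example above:

```perl
use strict;
use warnings;

package Finder;
sub new       { my $class = shift; return bless {}, $class }
sub findfirst { return "first match" }
sub findbest  { return "best match" }

package main;

my $finder = Finder->new;
my @results;
for my $fast (1, 0) {
    # The method name is chosen at run time and held in a plain scalar.
    my $method = $fast ? "findfirst" : "findbest";
    push @results, $finder->$method();
}
print "$_\n" for @results;
```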
a2bdc9a5 |
272 | =head2 Default UNIVERSAL methods |
273 | |
274 | The C<UNIVERSAL> package automatically contains the following methods that |
275 | are inherited by all other classes: |
276 | |
277 | =over 4 |
278 | |
279 | =item isa ( CLASS ) |
280 | |
281 | C<isa> returns I<true> if its object is blessed into C<CLASS> or a sub-class of C<CLASS>. |
282 | |
283 | C<isa> is also exportable and can be called as a sub with two arguments. This |
284 | makes it possible to check what a reference points to. Example: |
285 | |
286 |     use UNIVERSAL qw(isa); |
287 | |
288 |     if (isa($ref, 'ARRAY')) { |
289 |         ... |
290 |     } |
291 | |
292 | =item can ( METHOD ) |
293 | |
294 | C<can> checks to see if its object has a method called C<METHOD>. |
295 | If it does, then a reference to the sub is returned; if it does not, then |
296 | I<undef> is returned. |
297 | |
760ac839 |
298 | =item VERSION ( [ VERSION ] ) |
299 | |
300 | C<VERSION> returns the VERSION number of the class (package). If |
301 | an argument is given then it will check that the current version is not |
302 | less than the given argument. This method is normally called as a static |
303 | method. This method is also called when the C<VERSION> form of C<use> is |
304 | used. |
a2bdc9a5 |
305 | |
a2bdc9a5 |
306 | |
307 |     use A 1.2 qw(some imported subs); |
308 | |
309 |     A->VERSION( 1.2 ); |
310 | |
311 | =item class () |
312 | |
313 | C<class> returns the class name of its object. |
314 | |
315 | =item is_instance () |
316 | |
317 | C<is_instance> returns true if its object is an instance of some |
318 | class, false if its object is the class (package) itself. Example: |
319 | |
320 |     A->is_instance(); # False |
321 | |
322 |     $var = 'A'; |
323 |     $var->is_instance(); # False |
324 | |
325 |     $ref = bless [], 'A'; |
326 |     $ref->is_instance(); # True |
327 | |
a2bdc9a5 |
328 | =back |
329 | |
330 | B<NOTE:> C<can> directly uses Perl's internal code for method lookup, and |
331 | C<isa> uses a very similar method and caching strategy. This may cause |
332 | strange effects if the Perl code dynamically changes @ISA in any package. |
333 | |
334 | You may add other methods to the UNIVERSAL class via Perl or XS code. |
335 | |
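The following sketch exercises C<isa>, C<can> and C<VERSION> as methods. The classes, the C<display> method and the version number are made up for illustration, and only those three UNIVERSAL methods are used:

```perl
use strict;
use warnings;

package Critter;
our $VERSION = '1.20';
sub new     { my $class = shift; return bless {}, $class }
sub display { return "displaying" }

package MyCritter;
our @ISA = ('Critter');

package main;

my $pet = MyCritter->new;

print "isa Critter\n" if $pet->isa('Critter');

my $code = $pet->can('display');     # a code reference, or undef
print $pet->$code(), "\n" if $code;
print "cannot fly\n" unless $pet->can('fly');

print "version ", Critter->VERSION, "\n";
my $old_enough = eval { Critter->VERSION(1.1); 1 };   # succeeds
my $too_new    = eval { Critter->VERSION(2);   1 };   # dies inside the eval
print "1.1 is satisfied\n"   if $old_enough;
print "2 is not satisfied\n" unless $too_new;
```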
336 | =head2 Destructors |
a0d0e21e |
337 | |
338 | When the last reference to an object goes away, the object is |
339 | automatically destroyed. (This may even be after you exit, if you've |
340 | stored references in global variables.) If you want to capture control |
341 | just before the object is freed, you may define a DESTROY method in |
342 | your class. It will automatically be called at the appropriate moment, |
343 | and you can do any extra cleanup you need to do. |
344 | |
345 | Perl doesn't do nested destruction for you. If your constructor |
346 | reblessed a reference from one of your base classes, your DESTROY may |
347 | need to call DESTROY for any base classes that need it. But this only |
348 | applies to reblessed objects--an object reference that is merely |
349 | I<CONTAINED> in the current object will be freed and destroyed |
350 | automatically when the current object is freed. |
351 | |
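A minimal sketch of a destructor, which also shows that a merely contained object is destroyed automatically once its container goes away. The Handle class and the log array are hypothetical:

```perl
use strict;
use warnings;

package Handle;

my @log;   # records the order of destruction for this sketch

sub new {
    my ($class, $name) = @_;
    return bless { name => $name }, $class;
}

# Called automatically when the last reference to the object goes away.
sub DESTROY {
    my $self = shift;
    push @log, "destroying $self->{name}";
}

package main;

{
    my $outer = Handle->new('outer');
    $outer->{inner} = Handle->new('inner');   # contained, not reblessed
}   # last reference to $outer disappears here

print "$_\n" for @log;
```

The outer object's DESTROY runs first. The inner object follows as soon as the hash that held it is freed.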
748a9306 |
352 | =head2 WARNING |
353 | |
354 | An indirect object is limited to a name, a scalar variable, or a block, |
355 | because it would have to do too much lookahead otherwise, just like any |
d28ebecd |
356 | other postfix dereference in the language. The left side of -E<gt> is not so |
748a9306 |
357 | limited, because it's an infix operator, not a postfix operator. |
358 | |
359 | That means that below, A and B are equivalent to each other, and C and D |
360 | are equivalent, but AB and CD are different: |
361 | |
362 |     A: method $obref->{"fieldname"} |
363 |     B: (method $obref)->{"fieldname"} |
364 |     C: $obref->{"fieldname"}->method() |
365 |     D: method {$obref->{"fieldname"}} |
366 | |
a0d0e21e |
367 | =head2 Summary |
368 | |
369 | That's about all there is to it. Now you just need to go off and buy a |
370 | book about object-oriented design methodology, and bang your forehead |
371 | with it for the next six months or so. |
372 | |
cb1a09d0 |
373 | =head2 Two-Phased Garbage Collection |
374 | |
375 | For most purposes, Perl uses a fast and simple reference-counting |
376 | garbage collection system. For this reason, there's an extra |
377 | dereference going on at some level, so if you haven't built |
378 | your Perl executable using your C compiler's C<-O> flag, performance |
379 | will suffer. If you I<have> built Perl with C<cc -O>, then this |
380 | probably won't matter. |
381 | |
382 | A more serious concern is that unreachable memory with a non-zero |
383 | reference count will not normally get freed. Therefore, this is a bad |
384 | idea: |
385 | |
386 |     { |
387 |         my $a; |
388 |         $a = \$a; |
389 |     } |
390 | |
391 | Even though $a I<should> go away, it can't. When building recursive data |
392 | structures, you'll have to break the self-reference yourself explicitly |
393 | if you don't care to leak. For example, here's a self-referential |
394 | node such as one might use in a sophisticated tree structure: |
395 | |
396 |     sub new_node { |
397 |         my $self = shift; |
398 |         my $class = ref($self) || $self; |
399 |         my $node = {}; |
400 |         $node->{LEFT} = $node->{RIGHT} = $node; |
401 |         $node->{DATA} = [ @_ ]; |
402 |         return bless $node => $class; |
403 |     } |
404 | |
405 | If you create nodes like that, they (currently) won't go away unless you |
406 | break their self-reference yourself. (In other words, this is not to be |
407 | construed as a feature, and you shouldn't depend on it.) |
408 | |
409 | Almost. |
410 | |
411 | When an interpreter thread finally shuts down (usually when your program |
412 | exits), then a rather costly but complete mark-and-sweep style of garbage |
413 | collection is performed, and everything allocated by that thread gets |
414 | destroyed. This is essential to support Perl as an embedded or a |
415 | multithreadable language. For example, this program demonstrates Perl's |
416 | two-phased garbage collection: |
417 | |
418 |     #!/usr/bin/perl |
419 |     package Subtle; |
420 | |
421 |     sub new { |
422 |         my $test; |
423 |         $test = \$test; |
424 |         warn "CREATING " . \$test; |
425 |         return bless \$test; |
426 |     } |
427 | |
428 |     sub DESTROY { |
429 |         my $self = shift; |
430 |         warn "DESTROYING $self"; |
431 |     } |
432 | |
433 |     package main; |
434 | |
435 |     warn "starting program"; |
436 |     { |
437 |         my $a = Subtle->new; |
438 |         my $b = Subtle->new; |
439 |         $$a = 0; # break selfref |
440 |         warn "leaving block"; |
441 |     } |
442 | |
443 |     warn "just exited block"; |
444 |     warn "time to die..."; |
445 |     exit; |
446 | |
447 | When run as F</tmp/test>, the following output is produced: |
448 | |
449 |     starting program at /tmp/test line 18. |
450 |     CREATING SCALAR(0x8e5b8) at /tmp/test line 7. |
451 |     CREATING SCALAR(0x8e57c) at /tmp/test line 7. |
452 |     leaving block at /tmp/test line 23. |
453 |     DESTROYING Subtle=SCALAR(0x8e5b8) at /tmp/test line 13. |
454 |     just exited block at /tmp/test line 26. |
455 |     time to die... at /tmp/test line 27. |
456 |     DESTROYING Subtle=SCALAR(0x8e57c) during global destruction. |
457 | |
458 | Notice that "global destruction" bit there? That's the thread |
459 | garbage collector reaching the unreachable. |
460 | |
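Returning to the self-referential node shown earlier, here is a sketch of breaking the cycle by hand so that the node is freed as soon as it goes out of scope. The C<unlink_self> helper and the destruction counter are hypothetical names introduced only for this example:

```perl
use strict;
use warnings;

package Node;

my $destroyed = 0;   # counts DESTROY calls for this sketch

sub new {
    my $class = shift;
    my $node  = bless { DATA => [ @_ ] }, $class;
    $node->{LEFT} = $node->{RIGHT} = $node;   # self-referential
    return $node;
}

# Hypothetical helper: drop the references that point back at the node.
sub unlink_self {
    my $self = shift;
    delete @{$self}{qw(LEFT RIGHT)};
    return;
}

sub DESTROY { $destroyed++ }

package main;

{
    my $leaky = Node->new(1);
}   # still referenced by itself, so not destroyed here
print "after leaky block: $destroyed\n";

{
    my $tidy = Node->new(2);
    $tidy->unlink_self;
}   # reference count reaches zero, so DESTROY runs here
print "after tidy block: $destroyed\n";
```

The leaky node is only reclaimed later, during global destruction.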
461 | Objects are always destructed, even when regular refs aren't. In fact, they |
462 | are destructed in a separate pass before ordinary refs, to try to |
463 | prevent object destructors from using refs that have themselves been |
464 | destructed. Plain refs are only garbage collected if the destruct level |
465 | is greater than 0. You can test the higher levels of global destruction |
466 | by setting the PERL_DESTRUCT_LEVEL environment variable, presuming |
467 | C<-DDEBUGGING> was enabled during perl build time. |
468 | |
469 | A more complete garbage collection strategy will be implemented |
470 | at a future date. |
471 | |
a0d0e21e |
472 | =head1 SEE ALSO |
473 | |
cb1a09d0 |
474 | You should also check out L<perlbot> for other object tricks, traps, and tips, |
475 | as well as L<perlmod> for some style guides on constructing both modules |
476 | and classes. |