1 | =head1 NAME |
2 | |
3 | perltie - how to hide an object class in a simple variable |
4 | |
5 | =head1 SYNOPSIS |
6 | |
7 | tie VARIABLE, CLASSNAME, LIST |
8 | |
9 | untie VARIABLE |
10 | |
11 | =head1 DESCRIPTION |
12 | |
13 | Prior to release 5.0 of Perl, a programmer could use dbmopen() |
14 | to magically connect an on-disk database in the standard Unix dbm(3x) |
15 | format to a %HASH in their program. However, their Perl was built |
16 | with one particular dbm library or another, but not both, and the |
17 | mechanism couldn't be extended to other packages or types of variables. |
18 | |
19 | Now you can. |
20 | |
21 | The tie() function binds a variable to a class (package) that will provide |
22 | the implementation for access methods for that variable. Once this magic |
23 | has been performed, accessing a tied variable automatically triggers |
24 | method calls in the proper class. All of the complexity of the class is |
25 | hidden behind magic method calls. The method names are in ALL CAPS, |
26 | which is a convention that Perl uses to indicate that they're called |
27 | implicitly rather than explicitly--just like the BEGIN() and END() |
28 | functions. |
29 | |
30 | In the tie() call, C<VARIABLE> is the name of the variable to be |
31 | enchanted. C<CLASSNAME> is the name of a class implementing objects of |
32 | the correct type. Any additional arguments in the C<LIST> are passed to |
33 | the appropriate constructor method for that class--meaning TIESCALAR(), |
34 | TIEARRAY(), or TIEHASH(). (Typically these are arguments such as might be |
35 | passed to the dbminit() function of C.) The object returned by the |
36 | constructor is also returned by the tie() function, which is useful if |
37 | you want to call other methods in C<CLASSNAME>. (You don't actually |
38 | have to return a reference of the "right" type (e.g. HASH or C<CLASSNAME>) |
39 | so long as it's a properly blessed object.) |
40 | |
41 | |
42 | Unlike dbmopen(), the tie() function will not C<use> or C<require> a module |
43 | for you--you need to do that explicitly yourself. |
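
As a rough illustration of the mechanics described above, here is a minimal sketch of a tied scalar. The class name C<ReadCounter> and its reads() method are hypothetical, invented for this example only. The class is defined in the same file, so no C<use> or C<require> is needed, and the object returned by tie() is kept so that an ordinary method can be called on it later.

```perl
use strict;
use warnings;

# Hypothetical class, for illustration only: remembers a value
# and counts how many times it has been read.
package ReadCounter;

sub TIESCALAR {
    my ($class, $initial) = @_;
    # The object need not be a SCALAR ref; any blessed reference will do.
    return bless { VALUE => $initial, READS => 0 }, $class;
}

sub FETCH {
    my $self = shift;
    $self->{READS}++;
    return $self->{VALUE};
}

sub STORE {
    my ($self, $value) = @_;
    $self->{VALUE} = $value;
}

# An ordinary (explicitly called) method, reachable through
# the object that tie() returns.
sub reads { $_[0]{READS} }

package main;

my $object = tie my $var, 'ReadCounter', 42;
my $copy  = $var;       # implicitly calls FETCH
$var      = 7;          # implicitly calls STORE
my $again = $var;       # implicitly calls FETCH again
print "value=$again reads=", $object->reads, "\n";
```

The point of the sketch is only that plain reads and assignments on C<$var> turn into method calls, while the returned object gives access to anything else the class offers.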
44 | |
45 | =head2 Tying Scalars |
46 | |
47 | A class implementing a tied scalar should define the following methods: |
48 | TIESCALAR, FETCH, STORE, and possibly DESTROY. |
49 | |
50 | Let's look at each in turn, using as an example a tie class for |
51 | scalars that allows the user to do something like: |
52 | |
53 | tie $his_speed, 'Nice', getppid(); |
54 | tie $my_speed, 'Nice', $$; |
55 | |
56 | And now whenever either of those variables is accessed, its current |
57 | system priority is retrieved and returned. If those variables are set, |
58 | then the process's priority is changed! |
59 | |
60 | We'll use Jarkko Hietaniemi F<E<lt>Jarkko.Hietaniemi@hut.fiE<gt>>'s |
61 | BSD::Resource class (not included) to access the PRIO_PROCESS, PRIO_MIN, |
62 | and PRIO_MAX constants from your system, as well as the getpriority() and |
63 | setpriority() system calls. Here's the preamble of the class. |
64 | |
65 | package Nice; |
66 | use Carp; |
67 | use BSD::Resource; |
68 | use strict; |
69 | $Nice::DEBUG = 0 unless defined $Nice::DEBUG; |
70 | |
71 | =over |
72 | |
73 | =item TIESCALAR classname, LIST |
74 | |
75 | This is the constructor for the class. That means it is |
76 | expected to return a blessed reference to a new scalar |
77 | (probably anonymous) that it's creating. For example: |
78 | |
79 | sub TIESCALAR { |
80 | my $class = shift; |
81 | my $pid = shift || $$; # 0 means me |
82 | |
83 | if ($pid !~ /^\d+$/) { |
84 | carp "Nice::TieScalar got non-numeric pid $pid" if $^W; |
85 | return undef; |
86 | } |
87 | |
88 | unless (kill 0, $pid) { # EPERM or ESRCH, no doubt |
89 | carp "Nice::TieScalar got bad pid $pid: $!" if $^W; |
90 | return undef; |
91 | } |
92 | |
93 | return bless \$pid, $class; |
94 | } |
95 | |
96 | This tie class has chosen to return an error rather than raising an |
97 | exception if its constructor should fail. While this is how dbmopen() works, |
98 | other classes may well not wish to be so forgiving. It checks the global |
99 | variable C<$^W> to see whether to emit a bit of noise anyway. |
100 | |
101 | =item FETCH this |
102 | |
103 | This method will be triggered every time the tied variable is accessed |
104 | (read). It takes no arguments beyond its self reference, which is the |
105 | object representing the scalar we're dealing with. Since in this case |
106 | we're just using a SCALAR ref for the tied scalar object, a simple $$self |
107 | allows the method to get at the real value stored there. In our example |
108 | below, that real value is the process ID to which we've tied our variable. |
109 | |
110 | sub FETCH { |
111 | my $self = shift; |
112 | confess "wrong type" unless ref $self; |
113 | croak "usage error" if @_; |
114 | my $nicety; |
115 | local($!) = 0; |
116 | $nicety = getpriority(PRIO_PROCESS, $$self); |
117 | if ($!) { croak "getpriority failed: $!" } |
118 | return $nicety; |
119 | } |
120 | |
121 | This time we've decided to blow up (raise an exception) if the getpriority() |
122 | call fails--there's no place for us to return an error otherwise, and it's |
123 | probably the right thing to do. |
124 | |
125 | =item STORE this, value |
126 | |
127 | This method will be triggered every time the tied variable is set |
128 | (assigned). Beyond its self reference, it also expects one (and only one) |
129 | argument--the new value the user is trying to assign. |
130 | |
131 | sub STORE { |
132 | my $self = shift; |
133 | confess "wrong type" unless ref $self; |
134 | my $new_nicety = shift; |
135 | croak "usage error" if @_; |
136 | |
137 | if ($new_nicety < PRIO_MIN) { |
138 | carp sprintf |
139 | "WARNING: priority %d less than minimum system priority %d", |
140 | $new_nicety, PRIO_MIN if $^W; |
141 | $new_nicety = PRIO_MIN; |
142 | } |
143 | |
144 | if ($new_nicety > PRIO_MAX) { |
145 | carp sprintf |
146 | "WARNING: priority %d greater than maximum system priority %d", |
147 | $new_nicety, PRIO_MAX if $^W; |
148 | $new_nicety = PRIO_MAX; |
149 | } |
150 | |
151 | unless (defined setpriority(PRIO_PROCESS, $$self, $new_nicety)) { |
152 | confess "setpriority failed: $!"; |
153 | } |
154 | return $new_nicety; |
155 | } |
156 | |
157 | =item DESTROY this |
158 | |
159 | This method will be triggered when the tied variable needs to be destructed. |
160 | As with other object classes, such a method is seldom necessary, since Perl |
161 | deallocates its moribund object's memory for you automatically--this isn't |
162 | C++, you know. We'll use a DESTROY method here for debugging purposes only. |
163 | |
164 | sub DESTROY { |
165 | my $self = shift; |
166 | confess "wrong type" unless ref $self; |
167 | carp "[ Nice::DESTROY pid $$self ]" if $Nice::DEBUG; |
168 | } |
169 | |
170 | =back |
171 | |
172 | That's about all there is to it. Actually, it's more than all there |
173 | is to it, since we've done a few nice things here for the sake |
174 | of completeness, robustness, and general aesthetics. Simpler |
175 | TIESCALAR classes are certainly possible. |
176 | |
177 | =head2 Tying Arrays |
178 | |
179 | A class implementing a tied ordinary array should define the following |
180 | methods: TIEARRAY, FETCH, STORE, and perhaps DESTROY. |
181 | |
182 | B<WARNING>: Tied arrays are I<incomplete>. They are also distinctly lacking |
183 | something for the C<$#ARRAY> access (which is hard, as it's an lvalue), as |
184 | well as the other obvious array functions, like push(), pop(), shift(), |
185 | unshift(), and splice(). |
186 | |
187 | For this discussion, we'll implement an array whose indices are fixed at |
188 | its creation. If you try to access anything beyond those bounds, you'll |
189 | take an exception. (Well, if you access an individual element; an |
190 | aggregate assignment would be missed.) For example: |
191 | |
192 | require Bounded_Array; |
193 | tie @ary, Bounded_Array, 2; |
194 | $| = 1; |
195 | for $i (0 .. 10) { |
196 | print "setting index $i: "; |
197 | $ary[$i] = 10 * $i; |
199 | print "value of elt $i now $ary[$i]\n"; |
200 | } |
201 | |
202 | The preamble code for the class is as follows: |
203 | |
204 | package Bounded_Array; |
205 | use Carp; |
206 | use strict; |
207 | |
208 | =over |
209 | |
210 | =item TIEARRAY classname, LIST |
211 | |
212 | This is the constructor for the class. That means it is expected to |
213 | return a blessed reference through which the new array (probably an |
214 | anonymous ARRAY ref) will be accessed. |
215 | |
216 | In our example, just to show you that you don't I<really> have to return an |
217 | ARRAY reference, we'll choose a HASH reference to represent our object. |
218 | A HASH works out well as a generic record type: the C<{BOUND}> field will |
219 | store the maximum bound allowed, and the C<{ARRAY}> field will hold the |
220 | true ARRAY ref. If someone outside the class tries to dereference the |
221 | object returned (doubtless thinking it an ARRAY ref), they'll blow up. |
222 | This just goes to show you that you should respect an object's privacy. |
223 | |
224 | sub TIEARRAY { |
225 | my $class = shift; |
226 | my $bound = shift; |
227 | confess "usage: tie(\@ary, 'Bounded_Array', max_subscript)" |
228 | if @_ || $bound =~ /\D/; |
229 | return bless { |
230 | BOUND => $bound, |
231 | ARRAY => [], |
232 | }, $class; |
233 | } |
234 | |
235 | =item FETCH this, index |
236 | |
237 | This method will be triggered every time an individual element of the tied array |
238 | is accessed (read). It takes one argument beyond its self reference: the |
239 | index whose value we're trying to fetch. |
240 | |
241 | sub FETCH { |
242 | my($self,$idx) = @_; |
243 | if ($idx > $self->{BOUND}) { |
244 | confess "Array OOB: $idx > $self->{BOUND}"; |
245 | } |
246 | return $self->{ARRAY}[$idx]; |
247 | } |
248 | |
249 | As you may have noticed, the name of the FETCH method (et al.) is the same |
250 | for all accesses, even though the constructors differ in names (TIESCALAR |
251 | vs TIEARRAY). While in theory you could have the same class servicing |
252 | several tied types, in practice this becomes cumbersome, and it's easiest |
253 | to simply keep them at one tie type per class. |
254 | |
255 | =item STORE this, index, value |
256 | |
257 | This method will be triggered every time an element in the tied array is set |
258 | (written). It takes two arguments beyond its self reference: the index at |
259 | which we're trying to store something and the value we're trying to put |
260 | there. For example: |
261 | |
262 | sub STORE { |
263 | my($self, $idx, $value) = @_; |
264 | print "[STORE $value at $idx]\n" if $Bounded_Array::DEBUG; |
265 | if ($idx > $self->{BOUND} ) { |
266 | confess "Array OOB: $idx > $self->{BOUND}"; |
267 | } |
268 | return $self->{ARRAY}[$idx] = $value; |
269 | } |
270 | |
271 | =item DESTROY this |
272 | |
273 | This method will be triggered when the tied variable needs to be destructed. |
274 | As with the scalar tie class, this is almost never needed in a |
275 | language that does its own garbage collection, so this time we'll |
276 | just leave it out. |
277 | |
278 | =back |
279 | |
280 | The code we presented at the top of the tied array section accesses many |
281 | elements of the array, far more than the bound we set. Therefore, |
282 | it will blow up once it tries to access anything beyond index 2 of @ary, as |
283 | the following output demonstrates: |
284 | |
285 | setting index 0: value of elt 0 now 0 |
286 | setting index 1: value of elt 1 now 10 |
287 | setting index 2: value of elt 2 now 20 |
288 | setting index 3: Array OOB: 3 > 2 at Bounded_Array.pm line 39 |
289 | Bounded_Array::STORE called at testba line 12 |
290 | |
291 | =head2 Tying Hashes |
292 | |
293 | As the first Perl data type to be tied (see dbmopen()), associative arrays |
294 | have the most complete and useful tie() implementation. A class |
295 | implementing a tied associative array should define the following |
296 | methods: TIEHASH is the constructor. FETCH and STORE access the key and |
297 | value pairs. EXISTS reports whether a key is present in the hash, and |
298 | DELETE deletes one. CLEAR empties the hash by deleting all the key and |
299 | value pairs. FIRSTKEY and NEXTKEY implement the keys() and each() |
300 | functions to iterate over all the keys. And DESTROY is called when the |
301 | tied variable is garbage collected. |
302 | |
303 | If this seems like a lot, then feel free to merely inherit |
304 | from the standard TieHash module for most of your methods, redefining only |
305 | the interesting ones. See L<TieHash> for details. |
306 | |
307 | Remember that Perl distinguishes between a key not existing in the hash, |
308 | and the key existing in the hash but having a corresponding value of |
309 | C<undef>. The two possibilities can be tested with the C<exists()> and |
310 | C<defined()> functions. |
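
The distinction can be seen on an ordinary hash before any tying is involved. The following is a small sketch with made-up key names; a tied hash class is expected to preserve the same distinction through its EXISTS and FETCH methods.

```perl
use strict;
use warnings;

# One key is present but holds undef; the other key is missing entirely.
my %h = (present => undef);

my @report;
for my $key (qw(present absent)) {
    push @report, sprintf "%s: exists=%d defined=%d",
        $key,
        exists  $h{$key} ? 1 : 0,
        defined $h{$key} ? 1 : 0;
}
print "$_\n" for @report;
```

Only exists() tells the two cases apart; defined() reports false for both.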
311 | |
312 | Here's an example of a somewhat interesting tied hash class: it gives you |
313 | a hash representing a particular user's dotfiles. You index into the hash |
314 | with the name of the file (minus the dot) and you get back that dotfile's |
315 | contents. For example: |
316 | |
317 | use DotFiles; |
318 | tie %dot, DotFiles; |
319 | if ( $dot{profile} =~ /MANPATH/ || |
320 | $dot{login} =~ /MANPATH/ || |
321 | $dot{cshrc} =~ /MANPATH/ ) |
322 | { |
323 | print "you seem to set your manpath\n"; |
324 | } |
325 | |
326 | Or here's another sample of using our tied class: |
327 | |
328 | tie %him, DotFiles, 'daemon'; |
329 | foreach $f ( keys %him ) { |
330 | printf "daemon dot file %s is size %d\n", |
331 | $f, length $him{$f}; |
332 | } |
333 | |
334 | In our tied hash DotFiles example, we use a regular |
335 | hash for the object containing several important |
336 | fields, of which only the C<{LIST}> field will be what the |
337 | user thinks of as the real hash. |
338 | |
339 | =over 5 |
340 | |
341 | =item USER |
342 | |
343 | whose dot files this object represents |
344 | |
345 | =item HOME |
346 | |
347 | where those dotfiles live |
348 | |
349 | =item CLOBBER |
350 | |
351 | whether we should try to change or remove those dot files |
352 | |
353 | =item LIST |
354 | |
355 | the hash of dotfile names and content mappings |
356 | |
357 | =back |
358 | |
359 | Here's the start of F<DotFiles.pm>: |
360 | |
361 | package DotFiles; |
362 | use Carp; |
363 | sub whowasi { (caller(1))[3] . '()' } |
364 | my $DEBUG = 0; |
365 | sub debug { $DEBUG = @_ ? shift : 1 } |
366 | |
367 | For our example, we want to be able to emit debugging info to help in tracing |
368 | during development. We also keep one convenience function around |
369 | internally to help print out warnings; whowasi() returns the name of the |
370 | function that called it. |
371 | |
372 | Here are the methods for the DotFiles tied hash. |
373 | |
374 | =over |
375 | |
376 | =item TIEHASH classname, LIST |
377 | |
378 | This is the constructor for the class. That means it is expected to |
379 | return a blessed reference through which the new object (probably but not |
380 | necessarily an anonymous hash) will be accessed. |
381 | |
382 | Here's the constructor: |
383 | |
384 | sub TIEHASH { |
385 | my $self = shift; |
386 | my $user = shift || $>; |
387 | my $dotdir = shift || ''; |
388 | croak "usage: @{[&whowasi]} [USER [DOTDIR]]" if @_; |
389 | $user = getpwuid($user) if $user =~ /^\d+$/; |
390 | my $dir = (getpwnam($user))[7] |
391 | || croak "@{[&whowasi]}: no user $user"; |
392 | $dir .= "/$dotdir" if $dotdir; |
393 | |
394 | my $node = { |
395 | USER => $user, |
396 | HOME => $dir, |
397 | LIST => {}, |
398 | CLOBBER => 0, |
399 | }; |
400 | |
401 | opendir(DIR, $dir) |
402 | || croak "@{[&whowasi]}: can't opendir $dir: $!"; |
403 | foreach $dot ( grep /^\./ && -f "$dir/$_", readdir(DIR)) { |
404 | $dot =~ s/^\.//; |
405 | $node->{LIST}{$dot} = undef; |
406 | } |
407 | closedir DIR; |
408 | return bless $node, $self; |
409 | } |
410 | |
411 | It's probably worth mentioning that if you're going to filetest the |
412 | return values out of a readdir, you'd better prepend the directory |
413 | in question. Otherwise, since we didn't chdir() there, it would |
414 | have been testing the wrong file. |
415 | |
416 | =item FETCH this, key |
417 | |
418 | This method will be triggered every time an element in the tied hash is |
419 | accessed (read). It takes one argument beyond its self reference: the key |
420 | whose value we're trying to fetch. |
421 | |
422 | Here's the fetch for our DotFiles example. |
423 | |
424 | sub FETCH { |
425 | carp &whowasi if $DEBUG; |
426 | my $self = shift; |
427 | my $dot = shift; |
428 | my $dir = $self->{HOME}; |
429 | my $file = "$dir/.$dot"; |
430 | |
431 | unless (exists $self->{LIST}->{$dot} || -f $file) { |
432 | carp "@{[&whowasi]}: no $dot file" if $DEBUG; |
433 | return undef; |
434 | } |
435 | |
436 | if (defined $self->{LIST}->{$dot}) { |
437 | return $self->{LIST}->{$dot}; |
438 | } else { |
439 | return $self->{LIST}->{$dot} = `cat $dir/.$dot`; |
440 | } |
441 | } |
442 | |
443 | It was easy to write by having it call the Unix cat(1) command, but it |
444 | would probably be more portable to open the file manually (and somewhat |
445 | more efficient). Of course, since dot files are a Unixy concept, we're |
446 | not that concerned. |
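
For comparison, here is a sketch of what opening the file manually might look like. The helper name read_whole_file() is hypothetical and is not part of the DotFiles class shown here; the sketch also assumes a Perl recent enough to support lexical filehandles and three-argument open (5.6 or later).

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the original DotFiles class:
# return the entire contents of $file as one string, or undef
# if the file cannot be opened.
sub read_whole_file {
    my $file = shift;
    open(my $fh, '<', $file) or return undef;
    local $/;               # no record separator: read everything in one go
    my $contents = <$fh>;
    close($fh);
    return $contents;
}
```

Under those assumptions, the backtick expression in FETCH could be replaced by a call such as C<read_whole_file($file)>, which avoids starting a separate process for every uncached lookup.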
447 | |
448 | =item STORE this, key, value |
449 | |
450 | This method will be triggered every time an element in the tied hash is set |
451 | (written). It takes two arguments beyond its self reference: the key under |
452 | which we're trying to store something, and the value we're trying to put |
453 | there. |
454 | |
455 | Here in our DotFiles example, we'll be careful not to let |
456 | them try to overwrite the file unless they've called the clobber() |
457 | method on the original object reference returned by tie(). |
458 | |
459 | sub STORE { |
460 | carp &whowasi if $DEBUG; |
461 | my $self = shift; |
462 | my $dot = shift; |
463 | my $value = shift; |
464 | my $file = $self->{HOME} . "/.$dot"; |
465 | my $user = $self->{USER}; |
466 | |
467 | croak "@{[&whowasi]}: $file not clobberable" |
468 | unless $self->{CLOBBER}; |
469 | |
470 | open(F, "> $file") || croak "can't open $file: $!"; |
471 | print F $value; |
472 | close(F); |
473 | } |
474 | |
475 | If they wanted to clobber something, they might say: |
476 | |
477 | $ob = tie %daemon_dots, 'DotFiles', 'daemon'; |
478 | $ob->clobber(1); |
479 | $daemon_dots{signature} = "A true daemon\n"; |
480 | |
481 | Where the clobber method is simply: |
482 | |
483 | sub clobber { |
484 | my $self = shift; |
485 | $self->{CLOBBER} = @_ ? shift : 1; |
486 | } |
487 | |
488 | =item DELETE this, key |
489 | |
490 | This method is triggered when we remove an element from the hash, |
491 | typically by using the delete() function. Again, we'll |
492 | be careful to check whether they really want to clobber files. |
493 | |
494 | sub DELETE { |
495 | carp &whowasi if $DEBUG; |
496 | |
497 | my $self = shift; |
498 | my $dot = shift; |
499 | my $file = $self->{HOME} . "/.$dot"; |
500 | croak "@{[&whowasi]}: won't remove file $file" |
501 | unless $self->{CLOBBER}; |
502 | delete $self->{LIST}->{$dot}; |
503 | unlink($file) || carp "@{[&whowasi]}: can't unlink $file: $!"; |
504 | } |
505 | |
506 | =item CLEAR this |
507 | |
508 | This method is triggered when the whole hash is to be cleared, usually by |
509 | assigning the empty list to it. |
510 | |
511 | In our example, that would remove all the user's dotfiles! It's such a |
512 | dangerous thing that they'll have to set CLOBBER to something higher than |
513 | 1 to make it happen. |
514 | |
515 | sub CLEAR { |
516 | carp &whowasi if $DEBUG; |
517 | my $self = shift; |
518 | croak "@{[&whowasi]}: won't remove all dotfiles for $self->{USER}" |
519 | unless $self->{CLOBBER} > 1; |
520 | my $dot; |
521 | foreach $dot ( keys %{$self->{LIST}}) { |
522 | $self->DELETE($dot); |
523 | } |
524 | } |
525 | |
526 | =item EXISTS this, key |
527 | |
528 | This method is triggered when the user uses the exists() function |
529 | on a particular hash. In our example, we'll look at the C<{LIST}> |
530 | hash element for this: |
531 | |
532 | sub EXISTS { |
533 | carp &whowasi if $DEBUG; |
534 | my $self = shift; |
535 | my $dot = shift; |
536 | return exists $self->{LIST}->{$dot}; |
537 | } |
538 | |
539 | =item FIRSTKEY this |
540 | |
541 | This method will be triggered when the user is going |
542 | to iterate through the hash, such as via a keys() or each() |
543 | call. |
544 | |
545 | sub FIRSTKEY { |
546 | carp &whowasi if $DEBUG; |
547 | my $self = shift; |
548 | my $a = keys %{$self->{LIST}}; # reset each() iterator |
549 | each %{$self->{LIST}} |
550 | } |
551 | |
552 | =item NEXTKEY this, lastkey |
553 | |
554 | This method gets triggered during a keys() or each() iteration. It has a |
555 | second argument which is the last key that had been accessed. This is |
556 | useful if you're caring about ordering or calling the iterator from more |
557 | than one sequence, or not really storing things in a hash anywhere. |
558 | |
559 | For our example, we're using a real hash so we'll just do the simple |
560 | thing, but we'll have to indirect through the LIST field. |
561 | |
562 | sub NEXTKEY { |
563 | carp &whowasi if $DEBUG; |
564 | my $self = shift; |
565 | return each %{ $self->{LIST} } |
566 | } |
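
When ordering does matter, the lastkey argument can drive the iteration instead of each(). The following is a sketch only, using a hypothetical class named C<SortedHash> that is unrelated to DotFiles: FIRSTKEY returns the smallest key, and NEXTKEY returns the smallest key that sorts after the one just visited. It re-sorts on every step, so it favors clarity over speed.

```perl
use strict;
use warnings;

# Hypothetical class, for illustration only: a tied hash whose
# keys() and each() walk the keys in sorted (string) order.
package SortedHash;

sub TIEHASH { my $class = shift; return bless { DATA => {} }, $class }
sub STORE   { my ($self, $key, $value) = @_; $self->{DATA}{$key} = $value }
sub FETCH   { my ($self, $key) = @_; return $self->{DATA}{$key} }
sub EXISTS  { my ($self, $key) = @_; return exists $self->{DATA}{$key} }
sub DELETE  { my ($self, $key) = @_; return delete $self->{DATA}{$key} }
sub CLEAR   { my $self = shift; %{ $self->{DATA} } = () }

sub FIRSTKEY {
    my $self = shift;
    my @sorted = sort keys %{ $self->{DATA} };
    return $sorted[0];      # undef when the hash is empty
}

sub NEXTKEY {
    my ($self, $lastkey) = @_;
    # Keep only the keys that sort after the one just visited,
    # then hand back the smallest of them (undef ends the iteration).
    my @later = grep { $_ gt $lastkey } keys %{ $self->{DATA} };
    my ($next) = sort @later;
    return $next;
}

package main;

tie my %h, 'SortedHash';
@h{qw(pear apple fig)} = (1, 2, 3);
my @keys = keys %h;
print "@keys\n";
```

Because the position is recomputed from lastkey each time, this sketch keeps no iterator state in the object at all.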
567 | |
568 | =item DESTROY this |
569 | |
570 | This method is triggered when a tied hash is about to go out of |
571 | scope. You don't really need it unless you're trying to add debugging |
572 | or have auxiliary state to clean up. Here's a very simple function: |
573 | |
574 | sub DESTROY { |
575 | carp &whowasi if $DEBUG; |
576 | } |
577 | |
578 | =back |
579 | |
580 | Note that functions such as keys() and values() may return huge array |
581 | values when used on large objects, like DBM files. You may prefer to |
582 | use the each() function to iterate over such. Example: |
583 | |
584 | # print out history file offsets |
585 | use NDBM_File; |
586 | tie(%HIST, NDBM_File, '/usr/lib/news/history', 1, 0); |
587 | while (($key,$val) = each %HIST) { |
588 | print $key, ' = ', unpack('L',$val), "\n"; |
589 | } |
590 | untie(%HIST); |
591 | |
592 | =head2 Tying FileHandles |
593 | |
594 | This isn't implemented yet. Sorry; maybe someday. |
595 | |
596 | =head1 SEE ALSO |
597 | |
598 | See L<DB_File> or L<Config> for some interesting tie() implementations. |
599 | |
600 | =head1 BUGS |
601 | |
602 | Tied arrays are I<incomplete>. They are also distinctly lacking something |
603 | for the C<$#ARRAY> access (which is hard, as it's an lvalue), as well as |
604 | the other obvious array functions, like push(), pop(), shift(), unshift(), |
605 | and splice(). |
606 | |
607 | =head1 AUTHOR |
608 | |
609 | Tom Christiansen |