Commit | Line | Data |
cb1a09d0 |
1 | =head1 NAME |
2 | |
3 | perltie - how to hide an object class in a simple variable |
4 | |
5 | =head1 SYNOPSIS |
6 | |
7 | tie VARIABLE, CLASSNAME, LIST |
8 | |
6fdf61fb |
9 | $object = tied VARIABLE |
10 | |
cb1a09d0 |
11 | untie VARIABLE |
12 | |
13 | =head1 DESCRIPTION |
14 | |
15 | Prior to release 5.0 of Perl, a programmer could use dbmopen() |
5f05dabc |
16 | to connect an on-disk database in the standard Unix dbm(3x) |
17 | format magically to a %HASH in their program. However, their Perl was either |
cb1a09d0 |
18 | built with one particular dbm library or another, but not both, and |
19 | you couldn't extend this mechanism to other packages or types of variables. |
20 | |
21 | Now you can. |
22 | |
23 | The tie() function binds a variable to a class (package) that will provide |
24 | the implementation for access methods for that variable. Once this magic |
25 | has been performed, accessing a tied variable automatically triggers |
5a964f20 |
26 | method calls in the proper class. The complexity of the class is |
cb1a09d0 |
27 | hidden behind magic method calls. The method names are in ALL CAPS, |
28 | which is a convention that Perl uses to indicate that they're called |
29 | implicitly rather than explicitly--just like the BEGIN() and END() |
30 | functions. |
31 | |
32 | In the tie() call, C<VARIABLE> is the name of the variable to be |
33 | enchanted. C<CLASSNAME> is the name of a class implementing objects of |
34 | the correct type. Any additional arguments in the C<LIST> are passed to |
35 | the appropriate constructor method for that class--meaning TIESCALAR(), |
5f05dabc |
36 | TIEARRAY(), TIEHASH(), or TIEHANDLE(). (Typically these are arguments |
a7adf1f0 |
37 | such as might be passed to the dbminit() function of C.) The object |
38 | returned by the constructor is also returned by the tie() function, |
39 | which would be useful if you wanted to access other methods in |
40 | C<CLASSNAME>. (You don't actually have to return a reference to the right |
5f05dabc |
41 | "type" (e.g., HASH or C<CLASSNAME>) so long as it's a properly blessed |
a7adf1f0 |
42 | object.) You can also retrieve a reference to the underlying object |
43 | using the tied() function. |
cb1a09d0 |
44 | |
45 | Unlike dbmopen(), the tie() function will not C<use> or C<require> a module |
46 | for you--you need to do that explicitly yourself. |
47 | |
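As a minimal sketch of the tie(), tied(), and untie() calls shown in the synopsis, the following uses a hypothetical class named Counter whose FETCH hands back a running count. The class name and its behavior are assumptions made purely for illustration; they are not part of Perl or of the examples in this document.

```perl
use strict;
use warnings;

package Counter;    # hypothetical class, for illustration only

sub TIESCALAR {
    my ($class, $start) = @_;
    my $count = defined $start ? $start : 0;
    return bless \$count, $class;
}

sub FETCH {
    my $self = shift;
    return $$self++;        # hand back the current count, then bump it
}

sub STORE {
    my ($self, $value) = @_;
    $$self = $value;        # assignment resets the count
}

package main;

my $object = tie my $n, 'Counter', 10;   # the LIST (10) is passed to TIESCALAR
my $first  = $n;                         # calls FETCH
my $second = $n;                         # calls FETCH again
$n = 42;                                 # calls STORE
my $third  = $n;                         # calls FETCH

my $same = tied($n) == $object;          # tied() returns the same object

undef $object;                           # drop our extra reference first
untie $n;                                # $n is an ordinary scalar again
```

Dropping the extra reference in C<$object> before calling untie() avoids the situation described later under the C<untie> gotcha.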
48 | =head2 Tying Scalars |
49 | |
50 | A class implementing a tied scalar should define the following methods: |
301e8125 |
51 | TIESCALAR, FETCH, STORE, and possibly UNTIE and/or DESTROY. |
cb1a09d0 |
52 | |
53 | Let's look at each in turn, using as an example a tie class for |
54 | scalars that allows the user to do something like: |
55 | |
56 | tie $his_speed, 'Nice', getppid(); |
57 | tie $my_speed, 'Nice', $$; |
58 | |
59 | And now whenever either of those variables is accessed, its current |
60 | system priority is retrieved and returned. If those variables are set, |
61 | then the process's priority is changed! |
62 | |
5aabfad6 |
63 | We'll use Jarkko Hietaniemi <F<jhi@iki.fi>>'s BSD::Resource class (not |
64 | included) to access the PRIO_PROCESS, PRIO_MIN, and PRIO_MAX constants |
65 | from your system, as well as the getpriority() and setpriority() system |
66 | calls. Here's the preamble of the class. |
cb1a09d0 |
67 | |
68 | package Nice; |
69 | use Carp; |
70 | use BSD::Resource; |
71 | use strict; |
72 | $Nice::DEBUG = 0 unless defined $Nice::DEBUG; |
73 | |
13a2d996 |
74 | =over 4 |
cb1a09d0 |
75 | |
76 | =item TIESCALAR classname, LIST |
77 | |
78 | This is the constructor for the class. That means it is |
79 | expected to return a blessed reference to a new scalar |
80 | (probably anonymous) that it's creating. For example: |
81 | |
82 | sub TIESCALAR { |
83 | my $class = shift; |
84 | my $pid = shift || $$; # 0 means me |
85 | |
86 | if ($pid !~ /^\d+$/) { |
6fdf61fb |
87 | carp "Nice::Tie::Scalar got non-numeric pid $pid" if $^W; |
cb1a09d0 |
88 | return undef; |
89 | } |
90 | |
91 | unless (kill 0, $pid) { # EPERM or ESRCH, no doubt |
6fdf61fb |
92 | carp "Nice::Tie::Scalar got bad pid $pid: $!" if $^W; |
cb1a09d0 |
93 | return undef; |
94 | } |
95 | |
96 | return bless \$pid, $class; |
97 | } |
98 | |
99 | This tie class has chosen to return an error rather than raising an |
100 | exception if its constructor should fail. While this is how dbmopen() works, |
101 | other classes may well not wish to be so forgiving. It checks the global |
102 | variable C<$^W> to see whether to emit a bit of noise anyway. |
103 | |
104 | =item FETCH this |
105 | |
106 | This method will be triggered every time the tied variable is accessed |
107 | (read). It takes no arguments beyond its self reference, which is the |
5f05dabc |
108 | object representing the scalar we're dealing with. Because in this case |
109 | we're using just a SCALAR ref for the tied scalar object, a simple $$self |
cb1a09d0 |
110 | allows the method to get at the real value stored there. In our example |
111 | below, that real value is the process ID to which we've tied our variable. |
112 | |
113 | sub FETCH { |
114 | my $self = shift; |
115 | confess "wrong type" unless ref $self; |
116 | croak "usage error" if @_; |
117 | my $nicety; |
118 | local($!) = 0; |
119 | $nicety = getpriority(PRIO_PROCESS, $$self); |
120 | if ($!) { croak "getpriority failed: $!" } |
121 | return $nicety; |
122 | } |
123 | |
124 | This time we've decided to blow up (raise an exception) if getpriority() |
125 | fails--there's no place for us to return an error otherwise, and it's |
126 | probably the right thing to do. |
127 | |
128 | =item STORE this, value |
129 | |
130 | This method will be triggered every time the tied variable is set |
131 | (assigned). Beyond its self reference, it also expects one (and only one) |
a177e38d |
132 | argument--the new value the user is trying to assign. Don't worry about |
133 | returning a value from STORE -- the semantics of assignment returning the |
134 | assigned value is implemented with FETCH. |
cb1a09d0 |
135 | |
136 | sub STORE { |
137 | my $self = shift; |
138 | confess "wrong type" unless ref $self; |
139 | my $new_nicety = shift; |
140 | croak "usage error" if @_; |
141 | |
142 | if ($new_nicety < PRIO_MIN) { |
143 | carp sprintf |
144 | "WARNING: priority %d less than minimum system priority %d", |
145 | $new_nicety, PRIO_MIN if $^W; |
146 | $new_nicety = PRIO_MIN; |
147 | } |
148 | |
149 | if ($new_nicety > PRIO_MAX) { |
150 | carp sprintf |
151 | "WARNING: priority %d greater than maximum system priority %d", |
152 | $new_nicety, PRIO_MAX if $^W; |
153 | $new_nicety = PRIO_MAX; |
154 | } |
155 | |
156 | unless (defined setpriority(PRIO_PROCESS, $$self, $new_nicety)) { |
157 | confess "setpriority failed: $!"; |
158 | } |
cb1a09d0 |
159 | } |
160 | |
301e8125 |
161 | =item UNTIE this |
162 | |
163 | This method will be triggered when the C<untie> occurs. This can be useful |
164 | if the class needs to know when no further calls will be made. (Except DESTROY |
d5582e24 |
165 | of course.) See L<The C<untie> Gotcha> below for more details. |
301e8125 |
166 | |
cb1a09d0 |
167 | =item DESTROY this |
168 | |
169 | This method will be triggered when the tied variable needs to be destructed. |
5f05dabc |
170 | As with other object classes, such a method is seldom necessary, because Perl |
cb1a09d0 |
171 | deallocates its moribund object's memory for you automatically--this isn't |
172 | C++, you know. We'll use a DESTROY method here for debugging purposes only. |
173 | |
174 | sub DESTROY { |
175 | my $self = shift; |
176 | confess "wrong type" unless ref $self; |
177 | carp "[ Nice::DESTROY pid $$self ]" if $Nice::DEBUG; |
178 | } |
179 | |
180 | =back |
181 | |
182 | That's about all there is to it. Actually, it's more than all there |
5f05dabc |
183 | is to it, because we've done a few nice things here for the sake |
cb1a09d0 |
184 | of completeness, robustness, and general aesthetics. Simpler |
185 | TIESCALAR classes are certainly possible. |
186 | |
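For comparison, here is a sketch of one such simpler class. It keeps only the clamping idea from the STORE method above and drops the BSD::Resource dependency, so it can run anywhere. The class name Clamped and its MIN, MAX, and VALUE fields are hypothetical, chosen only for this illustration.

```perl
use strict;
use warnings;

package Clamped;    # hypothetical class, for illustration only
use Carp;

sub TIESCALAR {
    my ($class, $min, $max) = @_;
    croak "usage: tie SCALAR, '$class', MIN, MAX"
        unless defined $min && defined $max && $min <= $max;
    return bless { MIN => $min, MAX => $max, VALUE => $min }, $class;
}

sub FETCH {
    my $self = shift;
    return $self->{VALUE};
}

sub STORE {
    my ($self, $value) = @_;
    # Clamp quietly instead of warning, to keep the sketch short
    $value = $self->{MIN} if $value < $self->{MIN};
    $value = $self->{MAX} if $value > $self->{MAX};
    $self->{VALUE} = $value;
}

package main;

tie my $nice, 'Clamped', -20, 20;
$nice = 50;
my $high = $nice;    # 20, clamped to MAX
$nice = -99;
my $low  = $nice;    # -20, clamped to MIN
```

Because this object is a HASH ref rather than a SCALAR ref, the methods reach the stored value through C<$self-E<gt>{VALUE}> instead of C<$$self>.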
187 | =head2 Tying Arrays |
188 | |
189 | A class implementing a tied ordinary array should define the following |
301e8125 |
190 | methods: TIEARRAY, FETCH, STORE, FETCHSIZE, STORESIZE and perhaps UNTIE and/or DESTROY. |
cb1a09d0 |
191 | |
a60c0954 |
192 | FETCHSIZE and STORESIZE are used to provide C<$#array> and |
193 | equivalent C<scalar(@array)> access. |
c47ff5f1 |
194 | |
01020589 |
195 | The methods POP, PUSH, SHIFT, UNSHIFT, SPLICE, DELETE, and EXISTS are |
196 | required if the perl operator with the corresponding (but lowercase) name |
197 | is to operate on the tied array. The B<Tie::Array> class can be used as a |
198 | base class to implement the first five of these in terms of the basic |
199 | methods above. The default implementations of DELETE and EXISTS in |
200 | B<Tie::Array> simply C<croak>. |
a60c0954 |
201 | |
301e8125 |
202 | In addition EXTEND will be called when perl would have pre-extended |
a60c0954 |
203 | allocation in a real array. |
204 | |
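The following sketch shows the base-class approach in isolation: a class that inherits from B<Tie::Array> and defines only the basic methods, leaving PUSH and POP to the inherited implementations. The class name Logged_Array and its LOG field are hypothetical and exist only to make the inherited calls visible.

```perl
use strict;
use warnings;

package Logged_Array;    # hypothetical class, for illustration only
use Tie::Array;
our @ISA = ('Tie::Array');

sub TIEARRAY {
    my $class = shift;
    return bless { ARRAY => [], LOG => [] }, $class;
}

sub FETCH {
    my ($self, $index) = @_;
    return $self->{ARRAY}[$index];
}

sub STORE {
    my ($self, $index, $value) = @_;
    push @{ $self->{LOG} }, "STORE $index";    # record each call
    $self->{ARRAY}[$index] = $value;
}

sub FETCHSIZE {
    my $self = shift;
    return scalar @{ $self->{ARRAY} };
}

sub STORESIZE {
    my ($self, $count) = @_;
    $#{ $self->{ARRAY} } = $count - 1;
}

package main;

my $object = tie my @array, 'Logged_Array';
push @array, 'x', 'y';     # inherited PUSH calls STORE once per element
my $last = pop @array;     # inherited POP calls FETCH, then STORESIZE
my $size = @array;         # calls FETCHSIZE
```

Writing your own PUSH or POP, as the FixedElem_Array example below does, is only needed when the inherited versions are too slow or have the wrong semantics for your class.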
4ae85618 |
205 | For this discussion, we'll implement an array whose elements are a fixed |
206 | size at creation. If you try to create an element larger than the fixed |
207 | size, you'll take an exception. For example: |
cb1a09d0 |
208 | |
4ae85618 |
209 | use FixedElem_Array; |
210 | tie @array, 'FixedElem_Array', 3; |
211 | $array[0] = 'cat'; # ok. |
212 | $array[1] = 'dogs'; # exception, length('dogs') > 3. |
cb1a09d0 |
213 | |
214 | The preamble code for the class is as follows: |
215 | |
4ae85618 |
216 | package FixedElem_Array; |
cb1a09d0 |
217 | use Carp; |
218 | use strict; |
219 | |
13a2d996 |
220 | =over 4 |
cb1a09d0 |
221 | |
222 | =item TIEARRAY classname, LIST |
223 | |
224 | This is the constructor for the class. That means it is expected to |
225 | return a blessed reference through which the new array (probably an |
226 | anonymous ARRAY ref) will be accessed. |
227 | |
228 | In our example, just to show you that you don't I<really> have to return an |
229 | ARRAY reference, we'll choose a HASH reference to represent our object. |
4ae85618 |
230 | A HASH works out well as a generic record type: the C<{ELEMSIZE}> field will |
231 | store the maximum element size allowed, and the C<{ARRAY}> field will hold the |
cb1a09d0 |
232 | true ARRAY ref. If someone outside the class tries to dereference the |
233 | object returned (doubtless thinking it an ARRAY ref), they'll blow up. |
234 | This just goes to show you that you should respect an object's privacy. |
235 | |
236 | sub TIEARRAY { |
4ae85618 |
237 | my $class = shift; |
238 | my $elemsize = shift; |
239 | if ( @_ || $elemsize =~ /\D/ ) { |
240 | croak "usage: tie ARRAY, '" . __PACKAGE__ . "', elem_size"; |
241 | } |
242 | return bless { |
243 | ELEMSIZE => $elemsize, |
244 | ARRAY => [], |
245 | }, $class; |
cb1a09d0 |
246 | } |
247 | |
248 | =item FETCH this, index |
249 | |
250 | This method will be triggered every time an individual element of the tied array |
251 | is accessed (read). It takes one argument beyond its self reference: the |
252 | index whose value we're trying to fetch. |
253 | |
254 | sub FETCH { |
4ae85618 |
255 | my $self = shift; |
256 | my $index = shift; |
257 | return $self->{ARRAY}->[$index]; |
cb1a09d0 |
258 | } |
259 | |
301e8125 |
260 | If a negative array index is used to read from an array, the index |
0b931be4 |
261 | will be translated to a positive one internally by calling FETCHSIZE |
6f12eb6d |
262 | before being passed to FETCH. You may disable this feature by |
263 | assigning a true value to the variable C<$NEGATIVE_INDICES> in the |
264 | tied array class. |
301e8125 |
265 | |
cb1a09d0 |
266 | As you may have noticed, the name of the FETCH method (et al.) is the same |
267 | for all accesses, even though the constructors differ in names (TIESCALAR |
268 | vs TIEARRAY). While in theory you could have the same class servicing |
269 | several tied types, in practice this becomes cumbersome, and it's easiest |
5f05dabc |
270 | to keep them at simply one tie type per class. |
cb1a09d0 |
271 | |
272 | =item STORE this, index, value |
273 | |
274 | This method will be triggered every time an element in the tied array is set |
275 | (written). It takes two arguments beyond its self reference: the index at |
276 | which we're trying to store something and the value we're trying to put |
4ae85618 |
277 | there. |
278 | |
279 | In our example, C<undef> is really C<$self-E<gt>{ELEMSIZE}> number of |
280 | spaces so we have a little more work to do here: |
cb1a09d0 |
281 | |
282 | sub STORE { |
4ae85618 |
283 | my $self = shift; |
284 | my( $index, $value ) = @_; |
285 | if ( length $value > $self->{ELEMSIZE} ) { |
286 | croak "length of $value is greater than $self->{ELEMSIZE}"; |
cb1a09d0 |
287 | } |
4ae85618 |
288 | # fill in the blanks |
289 | $self->EXTEND( $index ) if $index > $self->FETCHSIZE(); |
290 | # right justify to keep element size for smaller elements |
291 | $self->{ARRAY}->[$index] = sprintf "%$self->{ELEMSIZE}s", $value; |
cb1a09d0 |
292 | } |
301e8125 |
293 | |
294 | Negative indexes are treated the same as with FETCH. |
295 | |
4ae85618 |
296 | =item FETCHSIZE this |
297 | |
298 | Returns the total number of items in the tied array associated with |
299 | object I<this>. (Equivalent to C<scalar(@array)>). For example: |
300 | |
301 | sub FETCHSIZE { |
302 | my $self = shift; |
303 | return scalar @{$self->{ARRAY}}; |
304 | } |
305 | |
306 | =item STORESIZE this, count |
307 | |
308 | Sets the total number of items in the tied array associated with |
309 | object I<this> to be I<count>. If this makes the array larger, then the |
310 | class's mapping of C<undef> should be returned for new positions. |
311 | If the array becomes smaller, then entries beyond I<count> should be |
312 | deleted. |
313 | |
314 | In our example, 'undef' is really an element containing |
315 | C<$self-E<gt>{ELEMSIZE}> number of spaces. Observe: |
316 | |
f9abed49 |
317 | sub STORESIZE { |
318 | my $self = shift; |
319 | my $count = shift; |
320 | if ( $count > $self->FETCHSIZE() ) { |
321 | foreach ( $self->FETCHSIZE() .. $count - 1 ) { |
322 | $self->STORE( $_, '' ); |
323 | } |
324 | } elsif ( $count < $self->FETCHSIZE() ) { |
325 | foreach ( 1 .. $self->FETCHSIZE() - $count ) { |
326 | $self->POP(); |
327 | } |
328 | } |
329 | } |
4ae85618 |
330 | |
331 | =item EXTEND this, count |
332 | |
333 | Informative call that the array is likely to grow to have I<count> entries. |
334 | Can be used to optimize allocation. This method need do nothing. |
335 | |
336 | In our example, we want to make sure there are no blank (C<undef>) |
337 | entries, so C<EXTEND> will make use of C<STORESIZE> to fill elements |
338 | as needed: |
339 | |
340 | sub EXTEND { |
341 | my $self = shift; |
342 | my $count = shift; |
343 | $self->STORESIZE( $count ); |
344 | } |
345 | |
346 | =item EXISTS this, key |
347 | |
348 | Verify that the element at index I<key> exists in the tied array I<this>. |
349 | |
350 | In our example, we will determine that if an element consists of |
351 | C<$self-E<gt>{ELEMSIZE}> spaces only, it does not exist: |
352 | |
353 | sub EXISTS { |
354 | my $self = shift; |
355 | my $index = shift; |
f9abed49 |
356 | return 0 if ! defined $self->{ARRAY}->[$index] || |
357 | $self->{ARRAY}->[$index] eq ' ' x $self->{ELEMSIZE}; |
358 | return 1; |
4ae85618 |
359 | } |
360 | |
361 | =item DELETE this, key |
362 | |
363 | Delete the element at index I<key> from the tied array I<this>. |
364 | |
ad0f383a |
365 | In our example, a deleted item is C<$self-E<gt>{ELEMSIZE}> spaces: |
4ae85618 |
366 | |
367 | sub DELETE { |
368 | my $self = shift; |
369 | my $index = shift; |
370 | return $self->STORE( $index, '' ); |
371 | } |
372 | |
373 | =item CLEAR this |
374 | |
375 | Clear (remove, delete, ...) all values from the tied array associated with |
376 | object I<this>. For example: |
377 | |
378 | sub CLEAR { |
379 | my $self = shift; |
380 | return $self->{ARRAY} = []; |
381 | } |
382 | |
383 | =item PUSH this, LIST |
384 | |
385 | Append elements of I<LIST> to the array. For example: |
386 | |
387 | sub PUSH { |
388 | my $self = shift; |
389 | my @list = @_; |
390 | my $last = $self->FETCHSIZE(); |
391 | $self->STORE( $last + $_, $list[$_] ) foreach 0 .. $#list; |
392 | return $self->FETCHSIZE(); |
393 | } |
394 | |
395 | =item POP this |
396 | |
397 | Remove last element of the array and return it. For example: |
398 | |
399 | sub POP { |
400 | my $self = shift; |
401 | return pop @{$self->{ARRAY}}; |
402 | } |
403 | |
404 | =item SHIFT this |
405 | |
406 | Remove the first element of the array (shifting other elements down) |
407 | and return it. For example: |
408 | |
409 | sub SHIFT { |
410 | my $self = shift; |
411 | return shift @{$self->{ARRAY}}; |
412 | } |
413 | |
414 | =item UNSHIFT this, LIST |
415 | |
416 | Insert LIST elements at the beginning of the array, moving existing elements |
417 | up to make room. For example: |
418 | |
419 | sub UNSHIFT { |
420 | my $self = shift; |
421 | my @list = @_; |
422 | my $size = scalar( @list ); |
423 | # make room for our list |
424 | @{$self->{ARRAY}}[ $size .. $#{$self->{ARRAY}} + $size ] |
425 | = @{$self->{ARRAY}}; |
426 | $self->STORE( $_, $list[$_] ) foreach 0 .. $#list; |
427 | } |
428 | |
429 | =item SPLICE this, offset, length, LIST |
430 | |
431 | Perform the equivalent of C<splice> on the array. |
432 | |
433 | I<offset> is optional and defaults to zero; negative values count back |
434 | from the end of the array. |
435 | |
436 | I<length> is optional and defaults to rest of the array. |
437 | |
438 | I<LIST> may be empty. |
439 | |
440 | Returns a list of the original I<length> elements at I<offset>. |
441 | |
442 | In our example, we'll use a little shortcut if there is a I<LIST>: |
443 | |
444 | sub SPLICE { |
445 | my $self = shift; |
446 | my $offset = shift || 0; |
447 | my $length = @_ ? shift : $self->FETCHSIZE() - $offset; |
448 | my @list = (); |
449 | if ( @_ ) { |
450 | tie @list, __PACKAGE__, $self->{ELEMSIZE}; |
451 | @list = @_; |
452 | } |
453 | return splice @{$self->{ARRAY}}, $offset, $length, @list; |
454 | } |
455 | |
301e8125 |
456 | =item UNTIE this |
457 | |
d5582e24 |
458 | Will be called when C<untie> happens. (See L<The C<untie> Gotcha> below.) |
cb1a09d0 |
459 | |
460 | =item DESTROY this |
461 | |
462 | This method will be triggered when the tied variable needs to be destructed. |
184e9718 |
463 | As with the scalar tie class, this is almost never needed in a |
cb1a09d0 |
464 | language that does its own garbage collection, so this time we'll |
465 | just leave it out. |
466 | |
467 | =back |
468 | |
cb1a09d0 |
469 | =head2 Tying Hashes |
470 | |
be3174d2 |
471 | Hashes were the first Perl data type to be tied (see dbmopen()). A class |
472 | implementing a tied hash should define the following methods: TIEHASH is |
473 | the constructor. FETCH and STORE access the key and value pairs. EXISTS |
474 | reports whether a key is present in the hash, and DELETE deletes one. |
475 | CLEAR empties the hash by deleting all the key and value pairs. FIRSTKEY |
476 | and NEXTKEY implement the keys() and each() functions to iterate over all |
a3bcc51e |
477 | the keys. SCALAR is triggered when the tied hash is evaluated in scalar |
478 | context. UNTIE is called when C<untie> happens, and DESTROY is called when |
301e8125 |
479 | the tied variable is garbage collected. |
aa689395 |
480 | |
481 | If this seems like a lot, then feel free to inherit from merely the |
d5582e24 |
482 | standard Tie::StdHash module for most of your methods, redefining only the |
aa689395 |
483 | interesting ones. See L<Tie::Hash> for details. |
cb1a09d0 |
484 | |
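As a sketch of that inheritance approach, the hypothetical class below derives from Tie::StdHash and redefines only the four methods that take a key, so that keys are folded to lower case. The class name Lower_Keys is an assumption made for illustration; TIEHASH, CLEAR, FIRSTKEY, and NEXTKEY all come from Tie::StdHash, which is loaded by C<use Tie::Hash>.

```perl
use strict;
use warnings;

package Lower_Keys;    # hypothetical class, for illustration only
use Tie::Hash;         # this module also defines Tie::StdHash
our @ISA = ('Tie::StdHash');

# Tie::StdHash uses the blessed hash itself as storage,
# so the overrides only need to normalize the key.
sub STORE  { my ($self, $key, $value) = @_; $self->{ lc $key } = $value }
sub FETCH  { my ($self, $key) = @_; return $self->{ lc $key } }
sub EXISTS { my ($self, $key) = @_; return exists $self->{ lc $key } }
sub DELETE { my ($self, $key) = @_; return delete $self->{ lc $key } }

package main;

tie my %h, 'Lower_Keys';
$h{Foo} = 1;
$h{FOO} = 2;                  # same slot as 'Foo'

my @keys  = keys %h;          # inherited FIRSTKEY and NEXTKEY
my $value = $h{fOo};
my $found = exists $h{FOO};
```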
485 | Remember that Perl distinguishes between a key not existing in the hash, |
486 | and the key existing in the hash but having a corresponding value of |
487 | C<undef>. The two possibilities can be tested with the C<exists()> and |
488 | C<defined()> functions. |
489 | |
490 | Here's an example of a somewhat interesting tied hash class: it gives you |
5f05dabc |
491 | a hash representing a particular user's dot files. You index into the hash |
492 | with the name of the file (minus the dot) and you get back that dot file's |
cb1a09d0 |
493 | contents. For example: |
494 | |
495 | use DotFiles; |
1f57c600 |
496 | tie %dot, 'DotFiles'; |
cb1a09d0 |
497 | if ( $dot{profile} =~ /MANPATH/ || |
498 | $dot{login} =~ /MANPATH/ || |
499 | $dot{cshrc} =~ /MANPATH/ ) |
500 | { |
5f05dabc |
501 | print "you seem to set your MANPATH\n"; |
cb1a09d0 |
502 | } |
503 | |
504 | Or here's another sample of using our tied class: |
505 | |
1f57c600 |
506 | tie %him, 'DotFiles', 'daemon'; |
cb1a09d0 |
507 | foreach $f ( keys %him ) { |
508 | printf "daemon dot file %s is size %d\n", |
509 | $f, length $him{$f}; |
510 | } |
511 | |
512 | In our tied hash DotFiles example, we use a regular |
513 | hash for the object containing several important |
514 | fields, of which only the C<{LIST}> field will be what the |
515 | user thinks of as the real hash. |
516 | |
517 | =over 5 |
518 | |
519 | =item USER |
520 | |
521 | whose dot files this object represents |
522 | |
523 | =item HOME |
524 | |
5f05dabc |
525 | where those dot files live |
cb1a09d0 |
526 | |
527 | =item CLOBBER |
528 | |
529 | whether we should try to change or remove those dot files |
530 | |
531 | =item LIST |
532 | |
5f05dabc |
533 | the hash of dot file names and content mappings |
cb1a09d0 |
534 | |
535 | =back |
536 | |
537 | Here's the start of F<DotFiles.pm>: |
538 | |
539 | package DotFiles; |
540 | use Carp; |
541 | sub whowasi { (caller(1))[3] . '()' } |
542 | my $DEBUG = 0; |
543 | sub debug { $DEBUG = @_ ? shift : 1 } |
544 | |
5f05dabc |
545 | For our example, we want to be able to emit debugging info to help in tracing |
cb1a09d0 |
546 | during development. We also keep one convenience function around |
547 | internally to help print out warnings; whowasi() returns the function name |
548 | that calls it. |
549 | |
550 | Here are the methods for the DotFiles tied hash. |
551 | |
13a2d996 |
552 | =over 4 |
cb1a09d0 |
553 | |
554 | =item TIEHASH classname, LIST |
555 | |
556 | This is the constructor for the class. That means it is expected to |
557 | return a blessed reference through which the new object (probably but not |
558 | necessarily an anonymous hash) will be accessed. |
559 | |
560 | Here's the constructor: |
561 | |
562 | sub TIEHASH { |
563 | my $self = shift; |
564 | my $user = shift || $>; |
565 | my $dotdir = shift || ''; |
566 | croak "usage: @{[&whowasi]} [USER [DOTDIR]]" if @_; |
567 | $user = getpwuid($user) if $user =~ /^\d+$/; |
568 | my $dir = (getpwnam($user))[7] |
569 | || croak "@{[&whowasi]}: no user $user"; |
570 | $dir .= "/$dotdir" if $dotdir; |
571 | |
572 | my $node = { |
573 | USER => $user, |
574 | HOME => $dir, |
575 | LIST => {}, |
576 | CLOBBER => 0, |
577 | }; |
578 | |
579 | opendir(DIR, $dir) |
580 | || croak "@{[&whowasi]}: can't opendir $dir: $!"; |
581 | foreach $dot ( grep /^\./ && -f "$dir/$_", readdir(DIR)) { |
582 | $dot =~ s/^\.//; |
583 | $node->{LIST}{$dot} = undef; |
584 | } |
585 | closedir DIR; |
586 | return bless $node, $self; |
587 | } |
588 | |
589 | It's probably worth mentioning that if you're going to filetest the |
590 | return values out of a readdir, you'd better prepend the directory |
5f05dabc |
591 | in question. Otherwise, because we didn't chdir() there, it would |
2ae324a7 |
592 | have been testing the wrong file. |
cb1a09d0 |
593 | |
594 | =item FETCH this, key |
595 | |
596 | This method will be triggered every time an element in the tied hash is |
597 | accessed (read). It takes one argument beyond its self reference: the key |
598 | whose value we're trying to fetch. |
599 | |
600 | Here's the fetch for our DotFiles example. |
601 | |
602 | sub FETCH { |
603 | carp &whowasi if $DEBUG; |
604 | my $self = shift; |
605 | my $dot = shift; |
606 | my $dir = $self->{HOME}; |
607 | my $file = "$dir/.$dot"; |
608 | |
609 | unless (exists $self->{LIST}->{$dot} || -f $file) { |
610 | carp "@{[&whowasi]}: no $dot file" if $DEBUG; |
611 | return undef; |
612 | } |
613 | |
614 | if (defined $self->{LIST}->{$dot}) { |
615 | return $self->{LIST}->{$dot}; |
616 | } else { |
617 | return $self->{LIST}->{$dot} = `cat $dir/.$dot`; |
618 | } |
619 | } |
620 | |
621 | It was easy to write by having it call the Unix cat(1) command, but it |
622 | would probably be more portable to open the file manually (and somewhat |
5f05dabc |
623 | more efficient). Of course, because dot files are a Unixy concept, we're |
cb1a09d0 |
624 | not that concerned. |
625 | |
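A minimal sketch of the more portable alternative follows. The helper name slurp_file() is hypothetical and is not part of the DotFiles class as shown; the temporary file is created only so that the sketch can run on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read a whole file without shelling out to cat(1).
# Returns undef (in scalar context) if the file cannot be opened.
sub slurp_file {
    my $file = shift;
    open(my $fh, '<', $file) or return;
    local $/;                 # slurp mode: read to end of file
    my $content = <$fh>;
    close($fh);
    return $content;
}

# Demonstration on a throwaway file
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "MANPATH=/usr/share/man\n";
close($tmp_fh);

my $content = slurp_file($tmp_name);
my $missing = slurp_file("$tmp_name.does-not-exist");
```

Under that assumption, the backtick expression in FETCH could be replaced with a call such as C<slurp_file($file)>, which also sidesteps shell quoting problems with unusual file names.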
626 | =item STORE this, key, value |
627 | |
628 | This method will be triggered every time an element in the tied hash is set |
629 | (written). It takes two arguments beyond its self reference: the key at |
630 | which we're trying to store something, and the value we're trying to put |
631 | there. |
632 | |
633 | Here in our DotFiles example, we'll be careful not to let |
634 | them try to overwrite the file unless they've called the clobber() |
635 | method on the original object reference returned by tie(). |
636 | |
637 | sub STORE { |
638 | carp &whowasi if $DEBUG; |
639 | my $self = shift; |
640 | my $dot = shift; |
641 | my $value = shift; |
642 | my $file = $self->{HOME} . "/.$dot"; |
643 | my $user = $self->{USER}; |
644 | |
645 | croak "@{[&whowasi]}: $file not clobberable" |
646 | unless $self->{CLOBBER}; |
647 | |
648 | open(F, "> $file") || croak "can't open $file: $!"; |
649 | print F $value; |
650 | close(F); |
651 | } |
652 | |
653 | If they wanted to clobber something, they might say: |
654 | |
655 | $ob = tie %daemon_dots, 'DotFiles', 'daemon'; |
656 | $ob->clobber(1); |
657 | $daemon_dots{signature} = "A true daemon\n"; |
658 | |
6fdf61fb |
659 | Another way to lay hands on a reference to the underlying object is to |
660 | use the tied() function, so they might alternately have set clobber |
661 | using: |
662 | |
663 | tie %daemon_dots, 'DotFiles', 'daemon'; |
664 | tied(%daemon_dots)->clobber(1); |
665 | |
666 | The clobber method is simply: |
cb1a09d0 |
667 | |
668 | sub clobber { |
669 | my $self = shift; |
670 | $self->{CLOBBER} = @_ ? shift : 1; |
671 | } |
672 | |
673 | =item DELETE this, key |
674 | |
675 | This method is triggered when we remove an element from the hash, |
676 | typically by using the delete() function. Again, we'll |
677 | be careful to check whether they really want to clobber files. |
678 | |
679 | sub DELETE { |
680 | carp &whowasi if $DEBUG; |
681 | |
682 | my $self = shift; |
683 | my $dot = shift; |
684 | my $file = $self->{HOME} . "/.$dot"; |
685 | croak "@{[&whowasi]}: won't remove file $file" |
686 | unless $self->{CLOBBER}; |
687 | delete $self->{LIST}->{$dot}; |
1f57c600 |
688 | my $success = unlink($file); |
689 | carp "@{[&whowasi]}: can't unlink $file: $!" unless $success; |
690 | $success; |
cb1a09d0 |
691 | } |
692 | |
1f57c600 |
693 | The value returned by DELETE becomes the return value of the call |
694 | to delete(). If you want to emulate the normal behavior of delete(), |
695 | you should return whatever FETCH would have returned for this key. |
696 | In this example, we have chosen instead to return a value which tells |
697 | the caller whether the file was successfully deleted. |
698 | |
cb1a09d0 |
699 | =item CLEAR this |
700 | |
701 | This method is triggered when the whole hash is to be cleared, usually by |
702 | assigning the empty list to it. |
703 | |
5f05dabc |
704 | In our example, that would remove all the user's dot files! It's such a |
cb1a09d0 |
705 | dangerous thing that they'll have to set CLOBBER to something higher than |
706 | 1 to make it happen. |
707 | |
708 | sub CLEAR { |
709 | carp &whowasi if $DEBUG; |
710 | my $self = shift; |
5f05dabc |
711 | croak "@{[&whowasi]}: won't remove all dot files for $self->{USER}" |
cb1a09d0 |
712 | unless $self->{CLOBBER} > 1; |
713 | my $dot; |
714 | foreach $dot ( keys %{$self->{LIST}}) { |
715 | $self->DELETE($dot); |
716 | } |
717 | } |
718 | |
719 | =item EXISTS this, key |
720 | |
721 | This method is triggered when the user uses the exists() function |
722 | on a particular hash. In our example, we'll look at the C<{LIST}> |
723 | hash element for this: |
724 | |
725 | sub EXISTS { |
726 | carp &whowasi if $DEBUG; |
727 | my $self = shift; |
728 | my $dot = shift; |
729 | return exists $self->{LIST}->{$dot}; |
730 | } |
731 | |
732 | =item FIRSTKEY this |
733 | |
734 | This method will be triggered when the user is going |
735 | to iterate through the hash, such as via a keys() or each() |
736 | call. |
737 | |
738 | sub FIRSTKEY { |
739 | carp &whowasi if $DEBUG; |
740 | my $self = shift; |
6fdf61fb |
741 | my $a = keys %{$self->{LIST}}; # reset each() iterator |
cb1a09d0 |
742 | each %{$self->{LIST}} |
743 | } |
744 | |
745 | =item NEXTKEY this, lastkey |
746 | |
747 | This method gets triggered during a keys() or each() iteration. It has a |
748 | second argument which is the last key that had been accessed. This is |
749 | useful if you're caring about ordering or calling the iterator from more |
750 | than one sequence, or not really storing things in a hash anywhere. |
751 | |
5f05dabc |
752 | For our example, we're using a real hash so we'll do just the simple |
753 | thing, but we'll have to go through the LIST field indirectly. |
cb1a09d0 |
754 | |
755 | sub NEXTKEY { |
756 | carp &whowasi if $DEBUG; |
757 | my $self = shift; |
758 | return each %{ $self->{LIST} } |
759 | } |
760 | |
a3bcc51e |
761 | =item SCALAR this |
762 | |
763 | This is called when the hash is evaluated in scalar context. In order |
764 | to mimic the behaviour of untied hashes, this method should return a |
765 | false value when the tied hash is considered empty. If this method does |
159b10bb |
766 | not exist, perl will make some educated guesses and return true when |
767 | the hash is inside an iteration. If this isn't the case, FIRSTKEY is |
768 | called, and the result will be a false value if FIRSTKEY returns the empty |
769 | list, true otherwise. |
a3bcc51e |
770 | |
47b1b33c |
771 | However, you should B<not> blindly rely on perl always doing the right |
772 | thing. Particularly, perl will mistakenly return true when you clear the |
773 | hash by repeatedly calling DELETE until it is empty. You are therefore |
774 | advised to supply your own SCALAR method when you want to be absolutely |
775 | sure that your hash behaves nicely in scalar context. |
776 | |
a3bcc51e |
777 | In our example we can just call C<scalar> on the underlying hash |
778 | referenced by C<$self-E<gt>{LIST}>: |
779 | |
780 | sub SCALAR { |
781 | carp &whowasi if $DEBUG; |
782 | my $self = shift; |
783 | return scalar %{ $self->{LIST} } |
784 | } |
785 | |
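To see a SCALAR method at work without touching anyone's dot files, here is a self-contained sketch. The class name Small_Hash is hypothetical; it reuses the C<{LIST}> layout from the DotFiles example but keeps everything in memory.

```perl
use strict;
use warnings;

package Small_Hash;    # hypothetical class, for illustration only

sub TIEHASH { my $class = shift; return bless { LIST => {} }, $class }
sub STORE   { my ($self, $key, $value) = @_; $self->{LIST}{$key} = $value }
sub FETCH   { my ($self, $key) = @_; return $self->{LIST}{$key} }
sub DELETE  { my ($self, $key) = @_; return delete $self->{LIST}{$key} }

sub FIRSTKEY {
    my $self  = shift;
    my $reset = keys %{ $self->{LIST} };    # reset each() iterator
    return each %{ $self->{LIST} };
}

sub NEXTKEY { my $self = shift; return each %{ $self->{LIST} } }

# Delegate scalar context to the underlying hash
sub SCALAR  { my $self = shift; return scalar %{ $self->{LIST} } }

package main;

tie my %h, 'Small_Hash';
$h{a} = 1;
my $before = %h ? 'non-empty' : 'empty';    # calls SCALAR
delete $h{a};                               # empty the hash via DELETE
my $after  = %h ? 'non-empty' : 'empty';    # calls SCALAR again
```

Because the class supplies its own SCALAR, emptying the hash through DELETE is reported correctly, which is the case the paragraph above warns about.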
301e8125 |
786 | =item UNTIE this |
787 | |
d5582e24 |
788 | This is called when C<untie> occurs. See L<The C<untie> Gotcha> below. |
301e8125 |
789 | |
cb1a09d0 |
790 | =item DESTROY this |
791 | |
792 | This method is triggered when a tied hash is about to go out of |
793 | scope. You don't really need it unless you're trying to add debugging |
794 | or have auxiliary state to clean up. Here's a very simple function: |
795 | |
796 | sub DESTROY { |
797 | carp &whowasi if $DEBUG; |
798 | } |
799 | |
800 | =back |
801 | |
1d2dff63 |
802 | Note that functions such as keys() and values() may return huge lists |
803 | when used on large objects, like DBM files. You may prefer to use the |
804 | each() function to iterate over such. Example: |
cb1a09d0 |
805 | |
806 | # print out history file offsets |
807 | use NDBM_File; |
1f57c600 |
808 | tie(%HIST, 'NDBM_File', '/usr/lib/news/history', 1, 0); |
cb1a09d0 |
809 | while (($key,$val) = each %HIST) { |
810 | print $key, ' = ', unpack('L',$val), "\n"; |
811 | } |
812 | untie(%HIST); |
813 | |
814 | =head2 Tying FileHandles |
815 | |
184e9718 |
816 | This is partially implemented now. |
a7adf1f0 |
817 | |
2ae324a7 |
818 | A class implementing a tied filehandle should define the following |
1d603a67 |
819 | methods: TIEHANDLE, at least one of PRINT, PRINTF, WRITE, READLINE, GETC, |
301e8125 |
820 | READ, and possibly CLOSE, UNTIE and DESTROY. The class can also provide: BINMODE, |
4592e6ca |
821 | OPEN, EOF, FILENO, SEEK, TELL - if the corresponding perl operators are |
822 | used on the handle. |
a7adf1f0 |
823 | |
7ff03255 |
824 | When STDERR is tied, its PRINT method will be called to issue warnings |
825 | and error messages. This feature is temporarily disabled during the call, |
826 | which means you can use C<warn()> inside PRINT without starting a recursive |
827 | loop. And just like C<__WARN__> and C<__DIE__> handlers, STDERR's PRINT |
828 | method may be called to report parser errors, so the caveats mentioned under |
829 | L<perlvar/%SIG> apply. |
830 | |
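The routing of warnings through a tied STDERR can be sketched as follows. The class name C<CaptureErr> and its C<lines> field are invented for this illustration; the sketch assumes a perl recent enough to call PRINT for C<warn()> on a tied STDERR, as described above.

```perl
use strict;
use warnings;

# Hypothetical class: keeps everything printed to the tied handle
# in an array stored inside the object.
package CaptureErr;

sub TIEHANDLE {
    my $class = shift;
    return bless { lines => [] }, $class;
}

sub PRINT {
    my $self = shift;
    push @{ $self->{lines} }, join '', @_;
    return 1;
}

package main;

my $obj = tie *STDERR, 'CaptureErr';
warn "something odd happened\n";      # delivered to CaptureErr::PRINT

# Copy the captured text out and drop our reference before untie,
# so that no extra reference to the tied object remains.
my @captured = @{ $obj->{lines} };
undef $obj;
untie *STDERR;

print "captured: $_" for @captured;
```

Dropping C<$obj> before the untie() keeps the example clear of the reference problem described under L<The C<untie> Gotcha>.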
831 | All of this is especially useful when perl is embedded in some other |
832 | program, where output to STDOUT and STDERR may have to be redirected |
833 | in some special way. See nvi and the Apache module for examples. |
a7adf1f0 |
834 | |
835 | In our example we're going to create a shouting handle. |
836 | |
837 | package Shout; |
838 | |
13a2d996 |
839 | =over 4 |
a7adf1f0 |
840 | |
841 | =item TIEHANDLE classname, LIST |
842 | |
843 | This is the constructor for the class. That means it is expected to |
184e9718 |
844 | return a blessed reference of some sort. The reference can be used to |
5f05dabc |
845 | hold some internal information. |
a7adf1f0 |
846 | |
7e1af8bc |
847 | sub TIEHANDLE { print "<shout>\n"; my $i; bless \$i, shift } |
a7adf1f0 |
848 | |
1d603a67 |
849 | =item WRITE this, LIST |
850 | |
851 | This method will be called when the handle is written to via the |
852 | C<syswrite> function. |
853 | |
854 | sub WRITE { |
855 | my $r = shift; |
856 | my($buf,$len,$offset) = @_; |
857 | print "WRITE called, \$buf=$buf, \$len=$len, \$offset=$offset"; |
858 | } |
859 | |
a7adf1f0 |
860 | =item PRINT this, LIST |
861 | |
46fc3d4c |
862 | This method will be triggered every time the tied handle is printed to |
863 | with the C<print()> function. |
184e9718 |
864 | Beyond its self reference it also expects the list that was passed to |
a7adf1f0 |
865 | the print function. |
866 | |
58f51617 |
867 | sub PRINT { my $r = shift; $$r++; print join($,,map(uc($_),@_)),$\ } |
868 | |
46fc3d4c |
869 | =item PRINTF this, LIST |
870 | |
871 | This method will be triggered every time the tied handle is printed to |
872 | with the C<printf()> function. |
873 | Beyond its self reference it also expects the format and list that were |
874 | passed to the printf function. |
875 | |
876 | sub PRINTF { |
877 | shift; |
878 | my $fmt = shift; |
7687bb23 |
879 | print sprintf($fmt, @_); |
46fc3d4c |
880 | } |
881 | |
1d603a67 |
882 | =item READ this, LIST |
2ae324a7 |
883 | |
884 | This method will be called when the handle is read from via the C<read> |
885 | or C<sysread> functions. |
886 | |
887 | sub READ { |
889a76e8 |
888 | my $self = shift; |
69801a40 |
889 | my $bufref = \$_[0]; |
889a76e8 |
890 | my(undef,$len,$offset) = @_; |
891 | print "READ called, \$buf=$bufref, \$len=$len, \$offset=$offset"; |
892 | # add to $$bufref, set $len to number of characters read |
893 | $len; |
2ae324a7 |
894 | } |
895 | |
58f51617 |
896 | =item READLINE this |
897 | |
2ae324a7 |
898 | This method will be called when the handle is read from via <HANDLE>. |
899 | The method should return undef when there is no more data. |
58f51617 |
900 | |
889a76e8 |
901 | sub READLINE { my $r = shift; "READLINE called $$r times\n"; } |
a7adf1f0 |
902 | |
2ae324a7 |
903 | =item GETC this |
904 | |
905 | This method will be called when the C<getc> function is called. |
906 | |
907 | sub GETC { print "Don't GETC, Get Perl"; return "a"; } |
908 | |
1d603a67 |
909 | =item CLOSE this |
910 | |
911 | This method will be called when the handle is closed via the C<close> |
912 | function. |
913 | |
914 | sub CLOSE { print "CLOSE called.\n" } |
915 | |
301e8125 |
916 | =item UNTIE this |
917 | |
918 | As with the other types of ties, this method will be called when C<untie> happens. |
d5582e24 |
919 | It may be appropriate to "auto CLOSE" when this occurs. See |
920 | L<The C<untie> Gotcha> below. |
301e8125 |
921 | |
a7adf1f0 |
922 | =item DESTROY this |
923 | |
924 | As with the other types of ties, this method will be called when the |
925 | tied handle is about to be destroyed. This is useful for debugging and |
926 | possibly cleaning up. |
927 | |
928 | sub DESTROY { print "</shout>\n" } |
929 | |
930 | =back |
931 | |
932 | Here's how to use our little example: |
933 | |
934 | tie(*FOO,'Shout'); |
935 | print FOO "hello\n"; |
936 | $a = 4; $b = 6; |
937 | print FOO $a, " plus ", $b, " equals ", $a + $b, "\n"; |
58f51617 |
938 | print <FOO>; |
cb1a09d0 |
939 | |
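A self-contained variant of the handle above may make the call sequence easier to check. This is only a sketch: the class name C<ShoutBuffer> and its C<out> and C<in> fields are invented for illustration, and it stores the upper-cased output in the object instead of printing it.

```perl
use strict;
use warnings;

package ShoutBuffer;    # hypothetical class name

# Extra arguments to tie() become the lines that READLINE hands back.
sub TIEHANDLE {
    my ($class, @lines) = @_;
    return bless { out => '', in => [@lines] }, $class;
}

# Upper-case the arguments and append them to the buffer,
# honouring $, and $\ when they are set.
sub PRINT {
    my $self = shift;
    my $sep  = defined $, ? $, : '';
    my $end  = defined $\ ? $\ : '';
    $self->{out} .= join($sep, map { uc($_) } @_) . $end;
    return 1;
}

sub PRINTF {
    my $self = shift;
    my $fmt  = shift;
    return $self->PRINT(sprintf($fmt, @_));
}

# Scalar context: next line, or undef when nothing is left.
# List context: all remaining lines.
sub READLINE {
    my $self = shift;
    return wantarray ? splice(@{ $self->{in} }) : shift @{ $self->{in} };
}

package main;

tie *FOO, 'ShoutBuffer', "first\n", "second\n";
print FOO "hello, ", "world\n";
printf FOO "%d plus %d equals %d\n", 4, 6, 4 + 6;
my $line = <FOO>;
my @rest = <FOO>;
my $out  = tied(*FOO)->{out};
untie *FOO;

print $out;
```

Keeping the output in the object rather than writing it to STDOUT makes the behaviour of PRINT and PRINTF easy to inspect afterwards.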
d7da42b7 |
940 | =head2 UNTIE this |
941 | |
942 | You can define for all tie types an UNTIE method that will be called |
d5582e24 |
943 | at untie(). See L<The C<untie> Gotcha> below. |
d7da42b7 |
944 | |
2752eb9f |
945 | =head2 The C<untie> Gotcha |
946 | |
947 | If you intend to make use of the object returned from either tie() or |
948 | tied(), and if the tie's target class defines a destructor, there is a |
949 | subtle gotcha you I<must> guard against. |
950 | |
951 | As setup, consider this (admittedly rather contrived) example of a |
952 | tie; all it does is use a file to keep a log of the values assigned to |
953 | a scalar. |
954 | |
955 | package Remember; |
956 | |
957 | use strict; |
9f1b1f2d |
958 | use warnings; |
2752eb9f |
959 | use IO::File; |
960 | |
961 | sub TIESCALAR { |
962 | my $class = shift; |
963 | my $filename = shift; |
964 | my $handle = IO::File->new($filename, ">") |
965 | or die "Cannot open $filename: $!\n"; |
966 | |
967 | print $handle "The Start\n"; |
968 | bless {FH => $handle, Value => 0}, $class; |
969 | } |
970 | |
971 | sub FETCH { |
972 | my $self = shift; |
973 | return $self->{Value}; |
974 | } |
975 | |
976 | sub STORE { |
977 | my $self = shift; |
978 | my $value = shift; |
979 | my $handle = $self->{FH}; |
980 | print $handle "$value\n"; |
981 | $self->{Value} = $value; |
982 | } |
983 | |
984 | sub DESTROY { |
985 | my $self = shift; |
986 | my $handle = $self->{FH}; |
987 | print $handle "The End\n"; |
988 | close $handle; |
989 | } |
990 | |
991 | 1; |
992 | |
993 | Here is an example that makes use of this tie: |
994 | |
995 | use strict; |
996 | use Remember; |
997 | |
998 | my $fred; |
999 | tie $fred, 'Remember', 'myfile.txt'; |
1000 | $fred = 1; |
1001 | $fred = 4; |
1002 | $fred = 5; |
1003 | untie $fred; |
1004 | system "cat myfile.txt"; |
1005 | |
1006 | This is the output when it is executed: |
1007 | |
1008 | The Start |
1009 | 1 |
1010 | 4 |
1011 | 5 |
1012 | The End |
1013 | |
1014 | So far so good. Those of you who have been paying attention will have |
1015 | spotted that the tied object hasn't been used so far. So let's add an |
1016 | extra method to the Remember class to allow comments to be included in |
1017 | the file -- say, something like this: |
1018 | |
1019 | sub comment { |
1020 | my $self = shift; |
1021 | my $text = shift; |
1022 | my $handle = $self->{FH}; |
1023 | print $handle $text, "\n"; |
1024 | } |
1025 | |
1026 | And here is the previous example modified to use the C<comment> method |
1027 | (which requires the tied object): |
1028 | |
1029 | use strict; |
1030 | use Remember; |
1031 | |
1032 | my ($fred, $x); |
1033 | $x = tie $fred, 'Remember', 'myfile.txt'; |
1034 | $fred = 1; |
1035 | $fred = 4; |
1036 | $x->comment("changing..."); |
1037 | $fred = 5; |
1038 | untie $fred; |
1039 | system "cat myfile.txt"; |
1040 | |
1041 | When this code is executed there is no output. Here's why: |
1042 | |
1043 | When a variable is tied, it is associated with the object which is the |
1044 | return value of the TIESCALAR, TIEARRAY, or TIEHASH function. This |
1045 | object normally has only one reference, namely, the implicit reference |
1046 | from the tied variable. When untie() is called, that reference is |
1047 | destroyed. Then, as in the first example above, the object's |
1048 | destructor (DESTROY) is called, which is normal for objects that have |
1049 | no more valid references; and thus the file is closed. |
1050 | |
1051 | In the second example, however, we have stored another reference to |
19799a22 |
1052 | the tied object in $x. That means that when untie() gets called |
2752eb9f |
1053 | there will still be a valid reference to the object in existence, so |
1054 | the destructor is not called at that time, and thus the file is not |
1055 | closed. There is no output because the file buffers |
1056 | have not been flushed to disk. |
1057 | |
1058 | Now that you know what the problem is, what can you do to avoid it? |
301e8125 |
1059 | Prior to the introduction of the optional UNTIE method the only way |
1060 | was the good old C<-w> flag, which will spot any instances where you call |
2752eb9f |
1061 | untie() and there are still valid references to the tied object. If |
9f1b1f2d |
1062 | the second script above had C<use warnings 'untie'> near the top |
1063 | or was run with the C<-w> flag, Perl prints this |
2752eb9f |
1064 | warning message: |
1065 | |
1066 | untie attempted while 1 inner references still exist |
1067 | |
1068 | To get the script to work properly and silence the warning, make sure |
1069 | there are no valid references to the tied object I<before> untie() is |
1070 | called: |
1071 | |
1072 | undef $x; |
1073 | untie $fred; |
1074 | |
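The effect of a lingering reference on the timing of DESTROY can be seen without involving a file. The following is a minimal sketch; the class C<Noisy> and the C<@events> array are made up for this illustration, and the C<untie> warning is switched off locally only to keep the output quiet.

```perl
use strict;
use warnings;

package Noisy;    # hypothetical class that records its own destruction

sub TIESCALAR { my $class = shift; return bless { value => undef }, $class }
sub FETCH     { $_[0]{value} }
sub STORE     { $_[0]{value} = $_[1] }
sub DESTROY   { push @main::events, 'DESTROY' }

package main;
our @events;

# Case 1: the tied variable holds the only reference,
# so untie() destroys the object immediately.
{
    my $fred;
    tie $fred, 'Noisy';
    untie $fred;
    push @events, 'after untie (no extra ref)';
}

# Case 2: $x keeps the object alive past untie();
# DESTROY runs only once $x is released.
{
    my $fred;
    my $x = tie $fred, 'Noisy';
    {
        no warnings 'untie';
        untie $fred;
    }
    push @events, 'after untie (extra ref)';
    undef $x;
}

print "$_\n" for @events;
```

In the first block DESTROY is recorded before the "after untie" marker; in the second it is recorded after it.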
301e8125 |
1075 | Now that UNTIE exists the class designer can decide which parts of the |
1076 | class functionality are really associated with C<untie> and which with |
1077 | the object being destroyed. What makes sense for a given class depends |
1078 | on whether the inner references are being kept so that non-tie-related |
1079 | methods can be called on the object. But in most cases it probably makes |
1080 | sense to move the functionality that would have been in DESTROY to the UNTIE |
1081 | method. |
1082 | |
1083 | If the UNTIE method exists then the warning above does not occur. Instead the |
1084 | UNTIE method is passed the count of "extra" references and can issue its own |
1085 | warning if appropriate. For example, to replicate the behavior without |
1086 | an UNTIE method, this method can be used: |
1087 | |
1088 | sub UNTIE |
1089 | { |
1090 | my ($obj,$count) = @_; |
1091 | carp "untie attempted while $count inner references still exist" if $count; |
1092 | } |
1093 | |
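To see the count that UNTIE receives, a small sketch along these lines may help. The class C<Counted> and the C<@counts> array are hypothetical names used only here.

```perl
use strict;
use warnings;

package Counted;    # hypothetical class

sub TIESCALAR { my $class = shift; return bless { value => undef }, $class }
sub FETCH     { $_[0]{value} }
sub STORE     { $_[0]{value} = $_[1] }

# Record the number of extra references reported at untie().
sub UNTIE {
    my ($obj, $count) = @_;
    push @main::counts, $count;
}

package main;
our @counts;

my ($plain, $held);

tie $plain, 'Counted';
untie $plain;                    # nothing else refers to the object

my $x = tie $held, 'Counted';
untie $held;                     # $x still refers to the object
undef $x;

print "extra references seen by UNTIE: @counts\n";
```

Because the class defines UNTIE, no warning is issued for the second untie() even though C<$x> is still live at that point.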
cb1a09d0 |
1094 | =head1 SEE ALSO |
1095 | |
1096 | See L<DB_File> or L<Config> for some interesting tie() implementations. |
3d0ae7ba |
1097 | A good starting point for many tie() implementations is with one of the |
1098 | modules L<Tie::Scalar>, L<Tie::Array>, L<Tie::Hash>, or L<Tie::Handle>. |
cb1a09d0 |
1099 | |
1100 | =head1 BUGS |
1101 | |
029149a3 |
1102 | The bucket usage information provided by C<scalar(%hash)> is not |
1103 | available. What this means is that using %tied_hash in boolean |
1104 | context doesn't work right (currently this always tests false, |
1105 | regardless of whether the hash is empty or has elements). |
1106 | |
1107 | Localizing tied arrays or hashes does not work. After exiting the |
1108 | scope the arrays or the hashes are not restored. |
1109 | |
e77edca3 |
1110 | Counting the number of entries in a hash via C<scalar(keys(%hash))> |
1111 | or C<scalar(values(%hash))> is inefficient since it needs to iterate |
1112 | through all the entries with FIRSTKEY/NEXTKEY. |
1113 | |
1114 | Tied hash/array slices cause multiple FETCH/STORE pairs; there are no |
1115 | tie methods for slice operations. |
1116 | |
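The per-element behaviour of slices can be observed with a logging class. This is a sketch under the assumption that the class only needs TIEHASH, STORE and FETCH for the operations shown; C<CountingHash> and its C<log> field are invented names.

```perl
use strict;
use warnings;

package CountingHash;    # hypothetical class that logs every access

sub TIEHASH {
    my $class = shift;
    return bless { data => {}, log => [] }, $class;
}

sub STORE {
    my ($self, $key, $value) = @_;
    push @{ $self->{log} }, "STORE $key";
    $self->{data}{$key} = $value;
}

sub FETCH {
    my ($self, $key) = @_;
    push @{ $self->{log} }, "FETCH $key";
    return $self->{data}{$key};
}

package main;

my %h;
tie %h, 'CountingHash';

@h{qw(a b c)} = (1, 2, 3);          # slice assignment
my @values = @h{qw(a b c)};         # slice read

my @log = @{ tied(%h)->{log} };
untie %h;

print "$_\n" for @log;
```

The log shows separate STORE and FETCH entries for each key, since the class is never handed the slice as a whole.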
c07a80fd |
1117 | You cannot easily tie a multilevel data structure (such as a hash of |
1118 | hashes) to a dbm file. The first problem is that all but GDBM and |
1119 | Berkeley DB have size limitations, but beyond that, you also have problems |
1120 | with how references are to be represented on disk. One experimental |
15c110d5 |
1121 | module that does attempt to address this need is DBM::Deep. Check your |
1122 | nearest CPAN site as described in L<perlmodlib> for source code. Note |
1123 | that despite its name, DBM::Deep does not use dbm. Another earlier attempt |
1124 | at solving the problem is MLDBM, which is also available on the CPAN, but |
1125 | which has some fairly serious limitations. |
c07a80fd |
1126 | |
e08f2115 |
1127 | Tied filehandles are still incomplete. sysopen(), truncate(), |
1128 | flock(), fcntl(), stat() and -X can't currently be trapped. |
1129 | |
cb1a09d0 |
1130 | =head1 AUTHOR |
1131 | |
1132 | Tom Christiansen |
a7adf1f0 |
1133 | |
46fc3d4c |
1134 | TIEHANDLE by Sven Verdoolaege <F<skimo@dns.ufsia.ac.be>> and Doug MacEachern <F<dougm@osf.org>> |
301e8125 |
1135 | |
1136 | UNTIE by Nick Ing-Simmons <F<nick@ing-simmons.net>> |
1137 | |
a3bcc51e |
1138 | SCALAR by Tassilo von Parseval <F<tassilo.von.parseval@rwth-aachen.de>> |
1139 | |
e1e60e72 |
1140 | Tying Arrays by Casey West <F<casey@geeknest.com>> |