1 | =head1 NAME |
2 | |
3 | perldsc - Perl Data Structures Cookbook |
4 | |
5 | =head1 DESCRIPTION |
6 | |
7 | The single feature most sorely lacking in the Perl programming language |
8 | prior to its 5.0 release was complex data structures. Even without direct |
9 | language support, some valiant programmers did manage to emulate them, but |
10 | it was hard work and not for the faint of heart. You could occasionally |
11 | get away with the C<$m{$LoL,$b}> notation borrowed from I<awk> in which the |
12 | keys are actually more like a single concatenated string C<"$LoL$b">, but |
13 | traversal and sorting were difficult. More desperate programmers even |
14 | hacked Perl's internal symbol table directly, a strategy that proved hard |
15 | to develop and maintain--to put it mildly. |
16 | |
17 | The 5.0 release of Perl let us have complex data structures. You |
18 | can now write something like this and, all of a sudden, you have an array |
19 | with three dimensions! |
20 | |
21 | for $x (1 .. 10) { |
22 |     for $y (1 .. 10) { |
23 |         for $z (1 .. 10) { |
24 |             $LoL[$x][$y][$z] = |
25 |                 $x ** $y + $z; |
26 |         } |
27 |     } |
28 | } |
29 | |
30 | Alas, however simple this may appear, underneath it's a much more |
31 | elaborate construct than meets the eye! |
32 | |
33 | How do you print it out? Why can't you just say C<print @LoL>? How do |
34 | you sort it? How can you pass it to a function or get one of these back |
35 | from a function? Is it an object? Can you save it to disk to read |
36 | back later? How do you access whole rows or columns of that matrix? Do |
37 | all the values have to be numeric? |
38 | |
39 | As you see, it's quite easy to become confused. While some small portion |
40 | of the blame for this can be attributed to the reference-based |
41 | implementation, it's really more due to a lack of existing documentation with |
42 | examples designed for the beginner. |
43 | |
44 | This document is meant to be a detailed but understandable treatment of |
45 | the many different sorts of data structures you might want to develop. It should |
46 | also serve as a cookbook of examples. That way, when you need to create one of these |
47 | complex data structures, you can just pinch, pilfer, or purloin |
48 | a drop-in example from here. |
49 | |
50 | Let's look at each of these possible constructs in detail. There are separate |
51 | documents on each of the following: |
52 | |
53 | =over 5 |
54 | |
55 | =item * arrays of arrays |
56 | |
57 | =item * hashes of arrays |
58 | |
59 | =item * arrays of hashes |
60 | |
61 | =item * hashes of hashes |
62 | |
63 | =item * more elaborate constructs |
64 | |
65 | =item * recursive and self-referential data structures |
66 | |
67 | =item * objects |
68 | |
69 | =back |
70 | |
71 | But for now, let's look at some of the general issues common to all |
72 | of these types of data structures. |
73 | |
74 | =head1 REFERENCES |
75 | |
76 | The most important thing to understand about all data structures in Perl |
77 | --including multidimensional arrays--is that even though they might |
78 | appear otherwise, Perl C<@ARRAY>s and C<%HASH>es are all internally |
79 | one-dimensional. They can only hold scalar values (meaning a string, |
80 | number, or a reference). They cannot directly contain other arrays or |
81 | hashes, but instead contain I<references> to other arrays or hashes. |
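
You can check this for yourself. The following is only a small sketch (the variable names are made up for illustration) that uses the built-in ref() function to show that each element of a nested array is a scalar holding a reference:

```perl
use strict;
use warnings;

# Hypothetical data, for illustration only.
my @matrix = ( [ 1, 2, 3 ], [ 4, 5, 6 ] );

# Each top-level element is a scalar that holds a reference.
my @kinds = map { ref($_) } @matrix;
print "element kinds: @kinds\n";

# The top level has only two elements, however many values sit below.
my $top_count = scalar @matrix;
print "top-level elements: $top_count\n";
```

The six numbers live one level down; @matrix itself holds just two scalars.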
82 | |
83 | You can't use a reference to an array or hash in quite the same way that |
84 | you would a real array or hash. For C or C++ programmers unused to distinguishing |
85 | between arrays and pointers to the same, this can be confusing. If so, |
86 | just think of it as the difference between a structure and a pointer to a |
87 | structure. |
88 | |
89 | You can (and should) read more about references in the perlref(1) man |
90 | page. Briefly, references are rather like pointers that know what they |
91 | point to. (Objects are also a kind of reference, but we won't be needing |
92 | them right away--if ever.) That means that whenever you have something |
93 | that looks like an access to an array or hash of two or more dimensions, |
94 | the base type is really just a one-dimensional entity that contains |
95 | references to the next level. It's just that you can I<use> it as though |
96 | it were a two-dimensional one. This is much like the way C programs |
97 | build multidimensional arrays out of arrays of pointers. |
99 | |
100 | $list[7][12] # array of arrays |
101 | $list[7]{string} # array of hashes |
102 | $hash{string}[7] # hash of arrays |
103 | $hash{string}{'another string'} # hash of hashes |
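
To make the "one-dimensional entity holding references" idea concrete, here is a minimal sketch. The names C<$row>, C<$long>, and C<$via> are invented for the example; it shows that the adjacent subscripts are shorthand for following a reference stored in the first level:

```perl
use strict;
use warnings;

my @list;
$list[7][12] = "twelve";      # the row springs into existence on demand

my $row  = $list[7];          # the "row" is just a reference to an array
my $kind = ref $row;
my $long = $list[7]->[12];    # explicit arrow between the subscripts
my $via  = $row->[12];        # same element, reached through the row reference

print "kind=$kind long=$long via=$via\n";
```

All three spellings reach the same element; the arrow between adjacent subscripts is simply optional.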
104 | |
105 | Now, because the top level only contains references, if you try to print |
106 | out your array with a simple print() function, you'll get something |
107 | that doesn't look very nice, like this: |
108 | |
109 | @LoL = ( [2, 3], [4, 5, 7], [0] ); |
110 | print $LoL[1][2]; |
111 | 7 |
112 | print @LoL; |
113 | ARRAY(0x83c38)ARRAY(0x8b194)ARRAY(0x8b1d0) |
114 | |
115 | |
116 | That's because Perl doesn't (ever) implicitly dereference your variables. |
117 | If you want to get at the thing a reference is referring to, then you have |
118 | to do this yourself using either prefix typing indicators, like |
119 | C<${$blah}>, C<@{$blah}>, C<@{$blah[$i]}>, or else postfix pointer arrows, |
120 | like C<$a-E<gt>[3]>, C<$h-E<gt>{fred}>, or even C<$ob-E<gt>method()-E<gt>[3]>. |
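
As a rough illustration of the two styles, here is a short sketch. The data and the variable names are hypothetical; only the dereferencing syntax matters:

```perl
use strict;
use warnings;

my $aref = [ 10, 20, 30, 40 ];          # reference to an anonymous array
my $href = { fred => "flintstone" };    # reference to an anonymous hash

# Prefix form: wrap the reference in braces and put the sigil in front.
my @all   = @{$aref};
my $third = ${$aref}[2];

# Postfix arrow form: usually easier to read.
my $fourth  = $aref->[3];
my $surname = $href->{fred};

print "all: @all\n";
print "third: $third, fourth: $fourth, surname: $surname\n";
```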
121 | |
122 | =head1 COMMON MISTAKES |
123 | |
124 | The two most common mistakes made in constructing something like |
125 | an array of arrays are either accidentally counting the number of |
126 | elements or else taking a reference to the same memory location |
127 | repeatedly. Here's the case where you just get the count instead |
128 | of a nested array: |
129 | |
130 | for $i (1..10) { |
131 | @list = somefunc($i); |
132 | $LoL[$i] = @list; # WRONG! |
133 | } |
134 | |
135 | That's just the simple case of assigning an array to a scalar and getting |
136 | its element count. If that's what you really and truly want, then you |
137 | might do well to consider being a tad more explicit about it, like this: |
138 | |
139 | for $i (1..10) { |
140 | @list = somefunc($i); |
141 | $counts[$i] = scalar @list; |
142 | } |
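
If the difference between the two assignments still seems slippery, this small sketch (with made-up data) puts them side by side:

```perl
use strict;
use warnings;

my @list = ( "fred", "barney", "wilma" );

my $count = @list;        # scalar context: the number of elements
my $copy  = [ @list ];    # a reference to a new array holding a copy

print "count is $count\n";
print "copy holds @$copy\n";
```

The first assignment throws the names away and keeps only how many there were; the second keeps the names themselves.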
143 | |
144 | Here's the case of taking a reference to the same memory location |
145 | again and again: |
146 | |
147 | for $i (1..10) { |
148 | @list = somefunc($i); |
149 | $LoL[$i] = \@list; # WRONG! |
150 | } |
151 | |
152 | So, just what's the big problem with that? It looks right, doesn't it? |
153 | After all, I just told you that you need an array of references, so by |
154 | golly, you've made me one! |
155 | |
156 | Unfortunately, while this is true, it's still broken. All the references |
157 | in @LoL refer to the I<very same place>, and they will therefore all hold |
158 | whatever was last in @list! It's similar to the problem demonstrated in |
159 | the following C program: |
160 | |
161 | #include <pwd.h> |
162 | main() { |
163 | struct passwd *getpwnam(), *rp, *dp; |
164 | rp = getpwnam("root"); |
165 | dp = getpwnam("daemon"); |
166 | |
167 | printf("daemon name is %s\nroot name is %s\n", |
168 | dp->pw_name, rp->pw_name); |
169 | } |
170 | |
171 | Which will print |
172 | |
173 | daemon name is daemon |
174 | root name is daemon |
175 | |
176 | The problem is that both C<rp> and C<dp> are pointers to the same location |
177 | in memory! In C, you'd have to remember to malloc() yourself some new |
178 | memory. In Perl, you'll want to use the array constructor C<[]> or the |
179 | hash constructor C<{}> instead. Here's the right way to do the preceding |
180 | broken code fragments: |
181 | |
182 | for $i (1..10) { |
183 | @list = somefunc($i); |
184 | $LoL[$i] = [ @list ]; |
185 | } |
186 | |
187 | The square brackets make a reference to a new array with a I<copy> |
188 | of what's in @list at the time of the assignment. This is what |
189 | you want. |
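
Here is a sketch that runs the broken and the fixed assignment next to each other. Note that somefunc() is only a stand-in invented for this example, as are the names @shared and @copied:

```perl
use strict;
use warnings;

# somefunc() is a hypothetical stand-in that returns a short list.
sub somefunc { my ($n) = @_; return ( $n, $n * 10 ); }

my ( @list, @shared, @copied );
for my $i ( 1 .. 3 ) {
    @list = somefunc($i);
    $shared[$i] = \@list;       # every slot points at the same @list
    $copied[$i] = [ @list ];    # every slot gets its own fresh copy
}

# All the shared slots show whatever was stored in @list last.
print "shared row 1: @{ $shared[1] }\n";
# The copied slots still show the values from their own pass.
print "copied row 1: @{ $copied[1] }\n";
```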
190 | |
191 | Note that this will produce something similar, but it's |
192 | much harder to read: |
193 | |
194 | for $i (1..10) { |
195 | @list = 0 .. $i; |
196 | @{$LoL[$i]} = @list; |
197 | } |
198 | |
199 | Is it the same? Well, maybe so--and maybe not. The subtle difference |
200 | is that when you assign something in square brackets, you know for sure |
201 | it's always a brand new reference with a new I<copy> of the data. |
202 | Something else could be going on in this new case with the C<@{$LoL[$i]}> |
203 | dereference on the left-hand-side of the assignment. It all depends on |
204 | whether C<$LoL[$i]> had been undefined to start with, or whether it |
205 | already contained a reference. If you had already populated @LoL with |
206 | references, as in |
207 | |
208 | $LoL[3] = \@another_list; |
209 | |
210 | then the assignment with the indirection on the left-hand-side would |
211 | use the existing reference that was already there: |
212 | |
213 | @{$LoL[3]} = @list; |
214 | |
215 | Of course, this I<would> have the "interesting" effect of clobbering |
216 | @another_list. (Have you ever noticed how when a programmer says |
217 | something is "interesting", that rather than meaning "intriguing", |
218 | they're disturbingly more apt to mean that it's "annoying", |
219 | "difficult", or both? :-) |
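
A short sketch of both outcomes, using made-up contents for @another_list and @list:

```perl
use strict;
use warnings;

my @another_list = ( "keep", "me" );
my @list         = ( 1, 2, 3 );
my @LoL;

$LoL[3] = \@another_list;    # slot 3 already holds a reference
@{ $LoL[3] } = @list;        # assigns through it, overwriting @another_list

@{ $LoL[4] } = @list;        # slot 4 was undefined, so a new array is created

print "another_list is now: @another_list\n";
print "slot 4 holds: @{ $LoL[4] }\n";
```

The same statement is harmless for slot 4 and destructive for slot 3, which is exactly why it is hard to reason about.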
220 | |
221 | So just remember to always use the array or hash constructors with C<[]> |
222 | or C<{}>, and you'll be fine, although it's not always optimally |
223 | efficient. |
224 | |
225 | Surprisingly, the following dangerous-looking construct will |
226 | actually work out fine: |
227 | |
228 | for $i (1..10) { |
229 | my @list = somefunc($i); |
230 | $LoL[$i] = \@list; |
231 | } |
232 | |
233 | That's because my() is more of a run-time statement than it is a |
234 | compile-time declaration I<per se>. This means that the my() variable is |
235 | remade afresh each time through the loop. So even though it I<looks> as |
236 | though you stored the same variable reference each time, you actually did |
237 | not! This is a subtle distinction that can produce more efficient code at |
238 | the risk of misleading all but the most experienced of programmers. So I |
239 | usually advise against teaching it to beginners. In fact, except for |
240 | passing arguments to functions, I seldom like to see the gimme-a-reference |
241 | operator (backslash) used much at all in code. Instead, I advise |
242 | beginners that they (and most of the rest of us) should try to use the |
243 | much more easily understood constructors C<[]> and C<{}> instead of |
244 | relying upon lexical (or dynamic) scoping and hidden reference-counting to |
245 | do the right thing behind the scenes. |
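
For the curious, here is a sketch showing that the my() version really does produce distinct arrays. As before, somefunc() is a hypothetical stand-in:

```perl
use strict;
use warnings;

# somefunc() is a hypothetical stand-in that returns a short list.
sub somefunc { my ($n) = @_; return ( $n, $n * 10 ); }

my @LoL;
for my $i ( 1 .. 3 ) {
    my @list = somefunc($i);    # a brand new @list on every pass
    $LoL[$i] = \@list;          # so each reference points somewhere different
}

print "row 1: @{ $LoL[1] }\n";
print "row 3: @{ $LoL[3] }\n";
```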
246 | |
247 | In summary: |
248 | |
249 | $LoL[$i] = [ @list ]; # usually best |
250 | $LoL[$i] = \@list; # perilous; just how my() was that list? |
251 | @{ $LoL[$i] } = @list; # way too tricky for most programmers |
252 | |
253 | |
254 | =head1 CAVEAT ON PRECEDENCE |
255 | |
256 | Speaking of things like C<@{$LoL[$i]}>, the following are actually the |
257 | same thing: |
258 | |
259 | $listref->[2][2] # clear |
260 | $$listref[2][2] # confusing |
261 | |
262 | That's because Perl's precedence rules on its five prefix dereferencers |
263 | (which look like someone swearing: C<$ @ * % &>) make them bind more |
264 | tightly than the postfix subscripting brackets or braces! This will no |
265 | doubt come as a great shock to the C or C++ programmer, who is quite |
266 | accustomed to using C<*a[i]> to mean what's pointed to by the I<i'th> |
267 | element of C<a>. That is, they first take the subscript, and only then |
268 | dereference the thing at that subscript. That's fine in C, but this isn't C. |
269 | |
270 | The seemingly equivalent construct in Perl, C<$$listref[$i]>, first |
271 | dereferences C<$listref>, treating it as a reference to an array, and |
272 | then gives you the I<i'th> value of the array pointed to by $listref. |
273 | If you wanted the C notion, you'd have to |
274 | write C<${$LoL[$i]}> to force the C<$LoL[$i]> to get evaluated first |
275 | before the leading C<$> dereferencer. |
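
The two readings can be compared directly. This is just a sketch with invented data; the trailing subscripts are added so that each expression yields a printable value:

```perl
use strict;
use warnings;

my @LoL     = ( [ "a", "b" ], [ "c", "d" ] );
my $listref = [ "x", "y", "z" ];

# $$listref[1] means ${$listref}[1]: dereference $listref, then subscript.
my $from_ref = $$listref[1];

# ${ $LoL[1] }[0] subscripts @LoL first, then dereferences that element.
my $from_row = ${ $LoL[1] }[0];

print "from_ref: $from_ref, from_row: $from_row\n";
```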
276 | |
277 | =head1 WHY YOU SHOULD ALWAYS C<use strict> |
278 | |
279 | If this is starting to sound scarier than it's worth, relax. Perl has |
280 | some features to help you avoid its most common pitfalls. The best |
281 | way to avoid getting confused is to start every program like this: |
282 | |
283 | #!/usr/bin/perl -w |
284 | use strict; |
285 | |
286 | This way, you'll be forced to declare all your variables with my(), and |
287 | accidental "symbolic dereferencing" will be disallowed. Therefore, if you'd done |
288 | this: |
289 | |
290 | my $listref = [ |
291 | [ "fred", "barney", "pebbles", "bambam", "dino", ], |
292 | [ "homer", "bart", "marge", "maggie", ], |
293 | [ "george", "jane", "alroy", "judy", ], |
294 | ]; |
295 | |
296 | print $listref[2][2]; |
297 | |
298 | The compiler would immediately flag that as an error I<at compile time>, |
299 | because you were accidentally accessing C<@listref>, an undeclared |
300 | variable, and it would thereby remind you to instead write: |
301 | |
302 | print $listref->[2][2] |
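
If you'd like to watch the compiler object without wrecking a real program, one approach is to hand the faulty code to a string eval and look at C<$@> afterward. This is only a sketch; the variable names C<$broken>, C<$result>, and C<$error> are invented here, and the exact wording of the message may vary between Perl versions:

```perl
use strict;
use warnings;

# The broken program is kept in a string so that its compile-time
# failure can be caught and examined rather than killing this script.
my $broken = <<'END_CODE';
use strict;
my $listref = [ [ "fred", "barney" ], [ "homer", "bart" ] ];
my $name = $listref[1][1];    # oops: this means @listref, not $listref
END_CODE

my $result = eval $broken;
my $error  = $@;

print "compile failed: ", ( defined $result ? "no" : "yes" ), "\n";
print "message mentions \@listref\n" if $error =~ /\@listref/;
```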
303 | |
304 | =head1 DEBUGGING |
305 | |
306 | The standard Perl debugger in 5.001 doesn't do a very nice job of |
307 | printing out complex data structures. However, the perl5db that |
308 | Ilya Zakharevich E<lt>F<ilya@math.ohio-state.edu>E<gt> |
309 | wrote, which is accessible at |
310 | |
311 | ftp://ftp.perl.com/pub/perl/ext/perl5db-kit-0.9.tar.gz |
312 | |
313 | has several new features, including command line editing as well |
314 | as the C<X> command to dump out complex data structures. For example, |
315 | given the assignment to $LoL above, here's the debugger output: |
316 | |
317 | DB<1> X $LoL |
318 | $LoL = ARRAY(0x13b5a0) |
319 | 0 ARRAY(0x1f0a24) |
320 | 0 'fred' |
321 | 1 'barney' |
322 | 2 'pebbles' |
323 | 3 'bambam' |
324 | 4 'dino' |
325 | 1 ARRAY(0x13b558) |
326 | 0 'homer' |
327 | 1 'bart' |
328 | 2 'marge' |
329 | 3 'maggie' |
330 | 2 ARRAY(0x13b540) |
331 | 0 'george' |
332 | 1 'jane' |
333 | 2 'alroy' |
334 | 3 'judy' |
335 | |
336 | There's also a lower-case B<x> command which is nearly the same. |
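
Outside the debugger, one option is the Data::Dumper module, assuming it is installed with your Perl (this document does not otherwise rely on it). A minimal sketch, with made-up data:

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Indent = 1;    # use a simple fixed indentation

# Hypothetical data, shaped like the example above.
my $listref = [
    [ "fred",  "barney" ],
    [ "homer", "bart" ],
];

my $dump = Dumper($listref);    # returns the structure as Perl source text
print $dump;
```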
337 | |
338 | =head1 CODE EXAMPLES |
339 | |
340 | Presented with little comment (these will get their own man pages someday) |
341 | here are short code examples illustrating access of various |
342 | types of data structures. |
343 | |
344 | =head1 LISTS OF LISTS |
345 | |
346 | =head2 Declaration of a LIST OF LISTS |
347 | |
348 | @LoL = ( |
349 | [ "fred", "barney" ], |
350 | [ "george", "jane", "elroy" ], |
351 | [ "homer", "marge", "bart" ], |
352 | ); |
353 | |
354 | =head2 Generation of a LIST OF LISTS |
355 | |
356 | # reading from file |
357 | while ( <> ) { |
358 |     push @LoL, [ split ]; |
359 | } |
360 | |
361 | # calling a function |
362 | for $i ( 1 .. 10 ) { |
363 |     $LoL[$i] = [ somefunc($i) ]; |
364 | } |
365 | |
366 | # using temp vars |
367 | for $i ( 1 .. 10 ) { |
368 |     @tmp = somefunc($i); |
369 |     $LoL[$i] = [ @tmp ]; |
370 | } |
371 | |
372 | # add to an existing row |
373 | push @{ $LoL[0] }, "wilma", "betty"; |
374 | |
375 | =head2 Access and Printing of a LIST OF LISTS |
376 | |
377 | # one element |
378 | $LoL[0][0] = "Fred"; |
379 | |
380 | # another element |
381 | $LoL[1][1] =~ s/(\w)/\u$1/; |
382 | |
383 | # print the whole thing with refs |
384 | for $aref ( @LoL ) { |
385 |     print "\t [ @$aref ],\n"; |
386 | } |
387 | |
388 | # print the whole thing with indices |
389 | for $i ( 0 .. $#LoL ) { |
390 |     print "\t [ @{$LoL[$i]} ],\n"; |
391 | } |
392 | |
393 | # print the whole thing one at a time |
394 | for $i ( 0 .. $#LoL ) { |
395 |     for $j ( 0 .. $#{$LoL[$i]} ) { |
396 |         print "elt $i $j is $LoL[$i][$j]\n"; |
397 |     } |
398 | } |
399 | |
400 | =head1 HASHES OF LISTS |
401 | |
402 | =head2 Declaration of a HASH OF LISTS |
403 | |
404 | %HoL = ( |
405 | "flintstones" => [ "fred", "barney" ], |
406 | "jetsons" => [ "george", "jane", "elroy" ], |
407 | "simpsons" => [ "homer", "marge", "bart" ], |
408 | ); |
409 | |
410 | =head2 Generation of a HASH OF LISTS |
411 | |
412 | # reading from file |
413 | # flintstones: fred barney wilma dino |
414 | while ( <> ) { |
415 |     next unless s/^(.*?):\s*//; |
416 |     $HoL{$1} = [ split ]; |
417 | } |
418 | |
419 | # reading from file; more temps |
420 | # flintstones: fred barney wilma dino |
421 | while ( $line = <> ) { |
422 |     ($who, $rest) = split /:\s*/, $line, 2; |
423 |     @fields = split ' ', $rest; |
424 |     $HoL{$who} = [ @fields ]; |
425 | } |
426 | |
427 | # calling a function that returns a list |
428 | for $group ( "simpsons", "jetsons", "flintstones" ) { |
429 |     $HoL{$group} = [ get_family($group) ]; |
430 | } |
431 | |
432 | # likewise, but using temps |
433 | for $group ( "simpsons", "jetsons", "flintstones" ) { |
434 |     @members = get_family($group); |
435 |     $HoL{$group} = [ @members ]; |
436 | } |
437 | |
438 | # append new members to an existing family |
439 | push @{ $HoL{"flintstones"} }, "wilma", "betty"; |
440 | |
441 | =head2 Access and Printing of a HASH OF LISTS |
442 | |
443 | # one element |
444 | $HoL{flintstones}[0] = "Fred"; |
445 | |
446 | # another element |
447 | $HoL{simpsons}[1] =~ s/(\w)/\u$1/; |
448 | |
449 | # print the whole thing |
450 | foreach $family ( keys %HoL ) { |
451 |     print "$family: @{ $HoL{$family} }\n"; |
452 | } |
453 | |
454 | # print the whole thing with indices |
455 | foreach $family ( keys %HoL ) { |
456 |     print "$family: "; |
457 |     foreach $i ( 0 .. $#{ $HoL{$family} } ) { |
458 |         print " $i = $HoL{$family}[$i]"; |
459 |     } |
460 |     print "\n"; |
461 | } |
462 | |
463 | # print the whole thing sorted by number of members |
464 | foreach $family ( sort { @{$HoL{$b}} <=> @{$HoL{$a}} } keys %HoL ) { |
465 |     print "$family: @{ $HoL{$family} }\n"; |
466 | } |
467 | # print the whole thing sorted by number of members and name |
468 | foreach $family ( sort { @{$HoL{$b}} <=> @{$HoL{$a}} || $a cmp $b } keys %HoL ) { |
469 |     print "$family: ", join(", ", sort @{ $HoL{$family} }), "\n"; |
    | } |
470 | |
471 | =head1 LISTS OF HASHES |
472 | |
473 | =head2 Declaration of a LIST OF HASHES |
474 | |
475 | @LoH = ( |
476 | { |
477 | Lead => "fred", |
478 | Friend => "barney", |
479 | }, |
480 | { |
481 | Lead => "george", |
482 | Wife => "jane", |
483 | Son => "elroy", |
484 | }, |
485 | { |
486 | Lead => "homer", |
487 | Wife => "marge", |
488 | Son => "bart", |
489 | } |
490 | ); |
491 | |
492 | =head2 Generation of a LIST OF HASHES |
493 | |
494 | # reading from file |
495 | # format: LEAD=fred FRIEND=barney |
496 | while ( <> ) { |
497 |     $rec = {}; |
498 |     for $field ( split ) { |
499 |         ($key, $value) = split /=/, $field; |
500 |         $rec->{$key} = $value; |
501 |     } |
502 |     push @LoH, $rec; |
503 | } |
504 | |
505 | # reading from file |
506 | # format: LEAD=fred FRIEND=barney |
507 | # no temp |
508 | while ( <> ) { |
509 |     push @LoH, { split /[\s=]+/ }; |
510 | } |
511 | |
512 | # calling a function that returns a key,value list, like |
513 | # "lead","fred","daughter","pebbles" |
514 | while ( %fields = getnextpairset() ) { |
515 |     push @LoH, { %fields }; |
516 | } |
517 | |
518 | # likewise, but using no temp vars |
519 | while (<>) { |
520 |     push @LoH, { parsepairs($_) }; |
521 | } |
522 | |
523 | # add key/value to an element |
524 | $LoH[0]{"pet"} = "dino"; |
525 | $LoH[2]{"pet"} = "santa's little helper"; |
526 | |
527 | =head2 Access and Printing of a LIST OF HASHES |
528 | |
529 | # one element |
530 | $LoH[0]{"lead"} = "fred"; |
531 | |
532 | # another element |
533 | $LoH[1]{"lead"} =~ s/(\w)/\u$1/; |
534 | |
535 | # print the whole thing with refs |
536 | for $href ( @LoH ) { |
537 |     print "{ "; |
538 |     for $role ( keys %$href ) { |
539 |         print "$role=$href->{$role} "; |
540 |     } |
541 |     print "}\n"; |
542 | } |
543 | |
544 | # print the whole thing with indices |
545 | for $i ( 0 .. $#LoH ) { |
546 |     print "$i is { "; |
547 |     for $role ( keys %{ $LoH[$i] } ) { |
548 |         print "$role=$LoH[$i]{$role} "; |
549 |     } |
550 |     print "}\n"; |
551 | } |
552 | |
553 | # print the whole thing one at a time |
554 | for $i ( 0 .. $#LoH ) { |
555 |     for $role ( keys %{ $LoH[$i] } ) { |
556 |         print "elt $i $role is $LoH[$i]{$role}\n"; |
557 |     } |
    | } |
558 | |
559 | =head1 HASHES OF HASHES |
560 | |
561 | =head2 Declaration of a HASH OF HASHES |
562 | |
563 | %HoH = ( |
564 |     "flintstones" => { |
565 |         "lead" => "fred", |
566 |         "pal" => "barney", |
567 |     }, |
568 |     "jetsons" => { |
569 |         "lead" => "george", |
570 |         "wife" => "jane", |
571 |         "his boy" => "elroy", |
572 |     }, |
573 |     "simpsons" => { |
574 |         "lead" => "homer", |
575 |         "wife" => "marge", |
576 |         "kid" => "bart", |
    |     }, |
577 | ); |
578 | |
579 | =head2 Generation of a HASH OF HASHES |
580 | |
581 | # reading from file |
582 | # flintstones: lead=fred pal=barney wife=wilma pet=dino |
583 | while ( <> ) { |
584 |     next unless s/^(.*?):\s*//; |
585 |     $who = $1; |
586 |     for $field ( split ) { |
587 |         ($key, $value) = split /=/, $field; |
588 |         $HoH{$who}{$key} = $value; |
589 |     } |
590 | } |
591 | |
592 | # reading from file; more temps |
593 | while ( <> ) { |
594 |     next unless s/^(.*?):\s*//; |
595 |     $who = $1; |
596 |     $rec = {}; |
597 |     $HoH{$who} = $rec; |
598 |     for $field ( split ) { |
599 |         ($key, $value) = split /=/, $field; |
600 |         $rec->{$key} = $value; |
601 |     } |
602 | } |
603 | |
604 | # calling a function that returns a key,value list, like |
605 | # "lead","fred","daughter","pebbles" |
606 | while ( %fields = getnextpairset() ) { |
607 |     push @a, { %fields }; |
608 | } |
609 | |
610 | # calling a function that returns a key,value hash |
611 | for $group ( "simpsons", "jetsons", "flintstones" ) { |
612 |     $HoH{$group} = { get_family($group) }; |
613 | } |
614 | |
615 | # likewise, but using temps |
616 | for $group ( "simpsons", "jetsons", "flintstones" ) { |
617 |     %members = get_family($group); |
618 |     $HoH{$group} = { %members }; |
619 | } |
620 | |
621 | # append new members to an existing family |
622 | %new_folks = ( |
623 |     "wife" => "wilma", |
624 |     "pet" => "dino", |
625 | ); |
626 | for $what (keys %new_folks) { |
627 |     $HoH{flintstones}{$what} = $new_folks{$what}; |
628 | } |
629 | |
630 | =head2 Access and Printing of a HASH OF HASHES |
631 | |
632 | # one element |
633 | $HoH{"flintstones"}{"wife"} = "wilma"; |
634 | |
635 | # another element |
636 | $HoH{simpsons}{lead} =~ s/(\w)/\u$1/; |
637 | |
638 | # print the whole thing |
639 | foreach $family ( keys %HoH ) { |
640 |     print "$family: { "; |
641 |     for $role ( keys %{ $HoH{$family} } ) { |
642 |         print "$role=$HoH{$family}{$role} "; |
643 |     } |
644 |     print "}\n"; |
645 | } |
646 | |
647 | # print the whole thing somewhat sorted |
648 | foreach $family ( sort keys %HoH ) { |
649 |     print "$family: { "; |
650 |     for $role ( sort keys %{ $HoH{$family} } ) { |
651 |         print "$role=$HoH{$family}{$role} "; |
652 |     } |
653 |     print "}\n"; |
654 | } |
655 | |
656 | # print the whole thing sorted by number of members |
657 | foreach $family ( sort { keys %{$HoH{$b}} <=> keys %{$HoH{$a}} } keys %HoH ) { |
658 |     print "$family: { "; |
659 |     for $role ( sort keys %{ $HoH{$family} } ) { |
660 |         print "$role=$HoH{$family}{$role} "; |
661 |     } |
662 |     print "}\n"; |
663 | } |
664 | |
665 | # establish a sort order (rank) for each role |
666 | $i = 0; |
667 | for ( qw(lead wife son daughter pal pet) ) { $rank{$_} = ++$i } |
668 | |
669 | # now print the whole thing sorted by number of members |
670 | foreach $family ( sort { keys %{$HoH{$b}} <=> keys %{$HoH{$a}} } keys %HoH ) { |
671 |     print "$family: { "; |
672 |     # and print these according to rank order |
673 |     for $role ( sort { $rank{$a} <=> $rank{$b} } keys %{ $HoH{$family} } ) { |
674 |         print "$role=$HoH{$family}{$role} "; |
675 |     } |
676 |     print "}\n"; |
677 | } |
678 | |
679 | =head1 MORE ELABORATE RECORDS |
680 | |
681 | =head2 Declaration of MORE ELABORATE RECORDS |
682 | |
683 | Here's a sample showing how to create and use a record whose fields are of |
684 | many different sorts: |
685 | |
686 | $rec = { |
687 | STRING => $string, |
688 | LIST => [ @old_values ], |
689 | LOOKUP => { %some_table }, |
690 | FUNC => \&some_function, |
691 | FANON => sub { $_[0] ** $_[1] }, |
692 | FH => \*STDOUT, |
693 | }; |
694 | |
695 | print $rec->{STRING}; |
696 | |
697 | print $rec->{LIST}[0]; |
698 | $last = pop @ { $rec->{LIST} }; |
699 | |
700 | print $rec->{LOOKUP}{"key"}; |
701 | ($first_k, $first_v) = each %{ $rec->{LOOKUP} }; |
702 | |
703 | $answer = &{ $rec->{FUNC} }($arg); |
704 | $answer = &{ $rec->{FANON} }($arg1, $arg2); |
705 | |
706 | # careful of extra block braces on fh ref |
707 | print { $rec->{FH} } "a string\n"; |
708 | |
709 | use FileHandle; |
710 | $rec->{FH}->autoflush(1); |
711 | $rec->{FH}->print(" a string\n"); |
712 | |
713 | =head2 Declaration of a HASH OF COMPLEX RECORDS |
714 | |
715 | %TV = ( |
716 | "flintstones" => { |
717 | series => "flintstones", |
718 | nights => [ qw(monday thursday friday) ], |
719 | members => [ |
720 | { name => "fred", role => "lead", age => 36, }, |
721 | { name => "wilma", role => "wife", age => 31, }, |
722 | { name => "pebbles", role => "kid", age => 4, }, |
723 | ], |
724 | }, |
725 | |
726 | "jetsons" => { |
727 | series => "jetsons", |
728 | nights => [ qw(wednesday saturday) ], |
729 | members => [ |
730 | { name => "george", role => "lead", age => 41, }, |
731 | { name => "jane", role => "wife", age => 39, }, |
732 | { name => "elroy", role => "kid", age => 9, }, |
733 | ], |
734 | }, |
735 | |
736 | "simpsons" => { |
737 | series => "simpsons", |
738 | nights => [ qw(monday) ], |
739 | members => [ |
740 | { name => "homer", role => "lead", age => 34, }, |
741 | { name => "marge", role => "wife", age => 37, }, |
742 | { name => "bart", role => "kid", age => 11, }, |
743 | ], |
744 | }, |
745 | ); |
746 | |
747 | =head2 Generation of a HASH OF COMPLEX RECORDS |
748 | |
749 | # reading from file |
750 | # this is most easily done by having the file itself be |
751 | # in the raw data format as shown above. perl is happy |
752 | # to parse complex datastructures if declared as data, so |
753 | # sometimes it's easiest to do that |
754 | |
755 | # here's a piece by piece build up |
756 | $rec = {}; |
757 | $rec->{series} = "flintstones"; |
758 | $rec->{nights} = [ find_days() ]; |
759 | |
760 | @members = (); |
761 | # assume this file in field=value syntax |
762 | while (<>) { |
763 | %fields = split /[\s=]+/; |
764 | push @members, { %fields }; |
765 | } |
766 | $rec->{members} = [ @members ]; |
767 | |
768 | # now remember the whole thing |
769 | $TV{ $rec->{series} } = $rec; |
770 | |
771 | ########################################################### |
772 | # now, you might want to make interesting extra fields that |
773 | # include pointers back into the same data structure, so that if |
774 | # you change one piece, it changes everywhere. for example, |
775 | # you might want a {kids} field that is an array reference |
776 | # to a list of the kids' records without having duplicate |
777 | # records and thus update problems. |
778 | ########################################################### |
779 | foreach $family (keys %TV) { |
780 | $rec = $TV{$family}; # temp pointer |
781 | @kids = (); |
782 | for $person ( @{$rec->{members}} ) { |
783 | if ($person->{role} =~ /kid|son|daughter/) { |
784 | push @kids, $person; |
785 | } |
786 | } |
787 | # REMEMBER: $rec and $TV{$family} point to same data!! |
788 | $rec->{kids} = [ @kids ]; |
789 | } |
790 | |
791 | # you copied the list, but the list itself contains pointers |
792 | # to uncopied objects. this means that if you make bart get |
793 | # older via |
794 | |
795 | $TV{simpsons}{kids}[0]{age}++; |
796 | |
797 | # then this would also change in |
798 | print $TV{simpsons}{members}[2]{age}; |
799 | |
800 | # because $TV{simpsons}{kids}[0] and $TV{simpsons}{members}[2] |
801 | # both point to the same underlying anonymous hash table |
802 | |
803 | # print the whole thing |
804 | foreach $family ( keys %TV ) { |
805 | print "the $family"; |
806 | print " is on during @{ $TV{$family}{nights} }\n"; |
807 | print "its members are:\n"; |
808 | for $who ( @{ $TV{$family}{members} } ) { |
809 | print " $who->{name} ($who->{role}), age $who->{age}\n"; |
810 | } |
811 | print "it turns out that $TV{$family}{members}[0]{name} has "; # lead is listed first |
812 | print scalar ( @{ $TV{$family}{kids} } ), " kids named "; |
813 | print join (", ", map { $_->{name} } @{ $TV{$family}{kids} } ); |
814 | print "\n"; |
815 | } |
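
The sharing between {kids} and {members} is easy to verify. Here is a cut-down sketch with just one family and two members (the trimmed data is an assumption made to keep the example short):

```perl
use strict;
use warnings;

my %TV = (
    simpsons => {
        members => [
            { name => "homer", role => "lead", age => 34 },
            { name => "bart",  role => "kid",  age => 11 },
        ],
    },
);

# Collect references to the kid records; the records are not copied.
my @kids = grep { $_->{role} eq "kid" } @{ $TV{simpsons}{members} };
$TV{simpsons}{kids} = [ @kids ];

$TV{simpsons}{kids}[0]{age}++;               # change it through one path...
my $age = $TV{simpsons}{members}[1]{age};    # ...and read it through the other

print "bart is now $age\n";
```

Copying the list with C<[ @kids ]> copies only the references, so both paths lead to the same anonymous hash.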
816 | |
817 | =head1 SEE ALSO |
818 | |
819 | L<perlref>, L<perllol>, L<perldata>, L<perlobj> |
820 | |
821 | =head1 AUTHOR |
822 | |
823 | Tom Christiansen E<lt>F<tchrist@perl.com>E<gt> |
824 | |
825 | Last update: |
826 | Tue Dec 12 09:20:26 MST 1995 |
827 | |