extra code in pp_concat, Take 2
[p5sagit/p5-mst-13.2.git] / pod / perldsc.pod
CommitLineData
cb1a09d0 1=head1 NAME
4633a7c4 2
cb1a09d0 3perldsc - Perl Data Structures Cookbook
4633a7c4 4
cb1a09d0 5=head1 DESCRIPTION
4633a7c4 6
7The single feature most sorely lacking in the Perl programming language
8prior to its 5.0 release was complex data structures. Even without direct
9language support, some valiant programmers did manage to emulate them, but
10it was hard work and not for the faint of heart. You could occasionally
19799a22 11get away with the C<$m{$AoA,$b}> notation borrowed from B<awk> in which the
12keys are actually more like a single concatenated string C<"$AoA$b">, but
4633a7c4 13traversal and sorting were difficult. More desperate programmers even
14hacked Perl's internal symbol table directly, a strategy that proved hard
15to develop and maintain--to put it mildly.
16
17The 5.0 release of Perl let us have complex data structures. You
d1be9408 18may now write something like this and all of a sudden, you'd have an array
4633a7c4 19with three dimensions!
20
84f709e7 21 for $x (1 .. 10) {
22 for $y (1 .. 10) {
23 for $z (1 .. 10) {
24 $AoA[$x][$y][$z] =
25 $x ** $y + $z;
4633a7c4 26 }
27 }
28 }
29
30Alas, however simple this may appear, underneath it's a much more
31elaborate construct than meets the eye!
32
19799a22 33How do you print it out? Why can't you say just C<print @AoA>? How do
4633a7c4 34you sort it? How can you pass it to a function or get one of these back
d1be9408 35from a function? Is it an object? Can you save it to disk to read
4633a7c4 36back later? How do you access whole rows or columns of that matrix? Do
4973169d 37all the values have to be numeric?
4633a7c4 38
39As you see, it's quite easy to become confused. While some small portion
40of the blame for this can be attributed to the reference-based
41implementation, it's really more due to a lack of existing documentation with
42examples designed for the beginner.
43
5f05dabc 44This document is meant to be a detailed but understandable treatment of the
45many different sorts of data structures you might want to develop. It
46should also serve as a cookbook of examples. That way, when you need to
47create one of these complex data structures, you can just pinch, pilfer, or
48purloin a drop-in example from here.
4633a7c4 49
50Let's look at each of these possible constructs in detail. There are separate
28757baa 51sections on each of the following:
4633a7c4 52
53=over 5
54
55=item * arrays of arrays
56
57=item * hashes of arrays
58
59=item * arrays of hashes
60
61=item * hashes of hashes
62
63=item * more elaborate constructs
64
4633a7c4 65=back
66
5a964f20 67But for now, let's look at general issues common to all
68these types of data structures.
4633a7c4 69
70=head1 REFERENCES
71
72The most important thing to understand about all data structures in Perl
73-- including multidimensional arrays--is that even though they might
74appear otherwise, Perl C<@ARRAY>s and C<%HASH>es are all internally
5f05dabc 75one-dimensional. They can hold only scalar values (meaning a string,
4633a7c4 76number, or a reference). They cannot directly contain other arrays or
77hashes, but instead contain I<references> to other arrays or hashes.
78
d1be9408 79You can't use a reference to an array or hash in quite the same way that you
5f05dabc 80would a real array or hash. For C or C++ programmers unused to
81distinguishing between arrays and pointers to the same, this can be
82confusing. If so, just think of it as the difference between a structure
83and a pointer to a structure.
4633a7c4 84
85You can (and should) read more about references in the perlref(1) man
86page. Briefly, references are rather like pointers that know what they
87point to. (Objects are also a kind of reference, but we won't be needing
4973169d 88them right away--if ever.) This means that when you have something which
89looks to you like an access to a two-or-more-dimensional array and/or hash,
90what's really going on is that the base type is
4633a7c4 91merely a one-dimensional entity that contains references to the next
92level. It's just that you can I<use> it as though it were a
93two-dimensional one. This is actually the way almost all C
94multidimensional arrays work as well.
95
19799a22 96 $array[7][12] # array of arrays
97 $array[7]{string} # array of hashes
4633a7c4 98 $hash{string}[7] # hash of arrays
99 $hash{string}{'another string'} # hash of hashes
100
5f05dabc 101Now, because the top level contains only references, if you try to print
4633a7c4 102out your array in with a simple print() function, you'll get something
103that doesn't look very nice, like this:
104
84f709e7 105 @AoA = ( [2, 3], [4, 5, 7], [0] );
19799a22 106 print $AoA[1][2];
4633a7c4 107 7
19799a22 108 print @AoA;
4633a7c4 109 ARRAY(0x83c38)ARRAY(0x8b194)ARRAY(0x8b1d0)
110
111
112That's because Perl doesn't (ever) implicitly dereference your variables.
113If you want to get at the thing a reference is referring to, then you have
114to do this yourself using either prefix typing indicators, like
115C<${$blah}>, C<@{$blah}>, C<@{$blah[$i]}>, or else postfix pointer arrows,
116like C<$a-E<gt>[3]>, C<$h-E<gt>{fred}>, or even C<$ob-E<gt>method()-E<gt>[3]>.
117
118=head1 COMMON MISTAKES
119
120The two most common mistakes made in constructing something like
121an array of arrays is either accidentally counting the number of
122elements or else taking a reference to the same memory location
123repeatedly. Here's the case where you just get the count instead
124of a nested array:
125
84f709e7 126 for $i (1..10) {
127 @array = somefunc($i);
128 $AoA[$i] = @array; # WRONG!
4973169d 129 }
4633a7c4 130
19799a22 131That's just the simple case of assigning an array to a scalar and getting
4633a7c4 132its element count. If that's what you really and truly want, then you
133might do well to consider being a tad more explicit about it, like this:
134
84f709e7 135 for $i (1..10) {
136 @array = somefunc($i);
137 $counts[$i] = scalar @array;
4973169d 138 }
4633a7c4 139
84f709e7 140Here's the case of taking a reference to the same memory location
141again and again:
4633a7c4 142
84f709e7 143 for $i (1..10) {
144 @array = somefunc($i);
145 $AoA[$i] = \@array; # WRONG!
146 }
147
148So, what's the big problem with that? It looks right, doesn't it?
149After all, I just told you that you need an array of references, so by
150golly, you've made me one!
151
152Unfortunately, while this is true, it's still broken. All the references
153in @AoA refer to the I<very same place>, and they will therefore all hold
154whatever was last in @array! It's similar to the problem demonstrated in
155the following C program:
156
157 #include <pwd.h>
158 main() {
159 struct passwd *getpwnam(), *rp, *dp;
160 rp = getpwnam("root");
161 dp = getpwnam("daemon");
162
163 printf("daemon name is %s\nroot name is %s\n",
164 dp->pw_name, rp->pw_name);
165 }
166
167Which will print
168
169 daemon name is daemon
170 root name is daemon
171
172The problem is that both C<rp> and C<dp> are pointers to the same location
173in memory! In C, you'd have to remember to malloc() yourself some new
174memory. In Perl, you'll want to use the array constructor C<[]> or the
175hash constructor C<{}> instead. Here's the right way to do the preceding
176broken code fragments:
177
178 for $i (1..10) {
179 @array = somefunc($i);
180 $AoA[$i] = [ @array ];
4973169d 181 }
4633a7c4 182
183The square brackets make a reference to a new array with a I<copy>
84f709e7 184of what's in @array at the time of the assignment. This is what
185you want.
4633a7c4 186
187Note that this will produce something similar, but it's
188much harder to read:
189
84f709e7 190 for $i (1..10) {
191 @array = 0 .. $i;
192 @{$AoA[$i]} = @array;
4973169d 193 }
4633a7c4 194
195Is it the same? Well, maybe so--and maybe not. The subtle difference
196is that when you assign something in square brackets, you know for sure
197it's always a brand new reference with a new I<copy> of the data.
84f709e7 198Something else could be going on in this new case with the C<@{$AoA[$i]}}>
4633a7c4 199dereference on the left-hand-side of the assignment. It all depends on
19799a22 200whether C<$AoA[$i]> had been undefined to start with, or whether it
201already contained a reference. If you had already populated @AoA with
4633a7c4 202references, as in
203
19799a22 204 $AoA[3] = \@another_array;
4633a7c4 205
206Then the assignment with the indirection on the left-hand-side would
207use the existing reference that was already there:
208
84f709e7 209 @{$AoA[3]} = @array;
4633a7c4 210
211Of course, this I<would> have the "interesting" effect of clobbering
19799a22 212@another_array. (Have you ever noticed how when a programmer says
4633a7c4 213something is "interesting", that rather than meaning "intriguing",
214they're disturbingly more apt to mean that it's "annoying",
215"difficult", or both? :-)
216
5f05dabc 217So just remember always to use the array or hash constructors with C<[]>
4633a7c4 218or C<{}>, and you'll be fine, although it's not always optimally
4973169d 219efficient.
4633a7c4 220
221Surprisingly, the following dangerous-looking construct will
222actually work out fine:
223
84f709e7 224 for $i (1..10) {
225 my @array = somefunc($i);
226 $AoA[$i] = \@array;
4973169d 227 }
4633a7c4 228
229That's because my() is more of a run-time statement than it is a
230compile-time declaration I<per se>. This means that the my() variable is
231remade afresh each time through the loop. So even though it I<looks> as
232though you stored the same variable reference each time, you actually did
233not! This is a subtle distinction that can produce more efficient code at
234the risk of misleading all but the most experienced of programmers. So I
235usually advise against teaching it to beginners. In fact, except for
236passing arguments to functions, I seldom like to see the gimme-a-reference
237operator (backslash) used much at all in code. Instead, I advise
238beginners that they (and most of the rest of us) should try to use the
239much more easily understood constructors C<[]> and C<{}> instead of
240relying upon lexical (or dynamic) scoping and hidden reference-counting to
241do the right thing behind the scenes.
242
243In summary:
244
84f709e7 245 $AoA[$i] = [ @array ]; # usually best
246 $AoA[$i] = \@array; # perilous; just how my() was that array?
247 @{ $AoA[$i] } = @array; # way too tricky for most programmers
4633a7c4 248
249
4973169d 250=head1 CAVEAT ON PRECEDENCE
4633a7c4 251
84f709e7 252Speaking of things like C<@{$AoA[$i]}>, the following are actually the
4633a7c4 253same thing:
254
19799a22 255 $aref->[2][2] # clear
256 $$aref[2][2] # confusing
4633a7c4 257
258That's because Perl's precedence rules on its five prefix dereferencers
259(which look like someone swearing: C<$ @ * % &>) make them bind more
260tightly than the postfix subscripting brackets or braces! This will no
261doubt come as a great shock to the C or C++ programmer, who is quite
262accustomed to using C<*a[i]> to mean what's pointed to by the I<i'th>
263element of C<a>. That is, they first take the subscript, and only then
264dereference the thing at that subscript. That's fine in C, but this isn't C.
265
19799a22 266The seemingly equivalent construct in Perl, C<$$aref[$i]> first does
267the deref of $aref, making it take $aref as a reference to an
4633a7c4 268array, and then dereference that, and finally tell you the I<i'th> value
19799a22 269of the array pointed to by $AoA. If you wanted the C notion, you'd have to
270write C<${$AoA[$i]}> to force the C<$AoA[$i]> to get evaluated first
4633a7c4 271before the leading C<$> dereferencer.
272
273=head1 WHY YOU SHOULD ALWAYS C<use strict>
274
275If this is starting to sound scarier than it's worth, relax. Perl has
276some features to help you avoid its most common pitfalls. The best
277way to avoid getting confused is to start every program like this:
278
279 #!/usr/bin/perl -w
280 use strict;
281
282This way, you'll be forced to declare all your variables with my() and
283also disallow accidental "symbolic dereferencing". Therefore if you'd done
284this:
285
19799a22 286 my $aref = [
84f709e7 287 [ "fred", "barney", "pebbles", "bambam", "dino", ],
288 [ "homer", "bart", "marge", "maggie", ],
289 [ "george", "jane", "elroy", "judy", ],
4633a7c4 290 ];
291
19799a22 292 print $aref[2][2];
4633a7c4 293
294The compiler would immediately flag that as an error I<at compile time>,
19799a22 295because you were accidentally accessing C<@aref>, an undeclared
5f05dabc 296variable, and it would thereby remind you to write instead:
4633a7c4 297
19799a22 298 print $aref->[2][2]
4633a7c4 299
300=head1 DEBUGGING
301
a6006777 302Before version 5.002, the standard Perl debugger didn't do a very nice job of
303printing out complex data structures. With 5.002 or above, the
4973169d 304debugger includes several new features, including command line editing as
305well as the C<x> command to dump out complex data structures. For
19799a22 306example, given the assignment to $AoA above, here's the debugger output:
4633a7c4 307
19799a22 308 DB<1> x $AoA
309 $AoA = ARRAY(0x13b5a0)
4633a7c4 310 0 ARRAY(0x1f0a24)
311 0 'fred'
312 1 'barney'
313 2 'pebbles'
314 3 'bambam'
315 4 'dino'
316 1 ARRAY(0x13b558)
317 0 'homer'
318 1 'bart'
319 2 'marge'
320 3 'maggie'
321 2 ARRAY(0x13b540)
322 0 'george'
323 1 'jane'
5f05dabc 324 2 'elroy'
4633a7c4 325 3 'judy'
326
cb1a09d0 327=head1 CODE EXAMPLES
328
54310121 329Presented with little comment (these will get their own manpages someday)
4973169d 330here are short code examples illustrating access of various
cb1a09d0 331types of data structures.
332
19799a22 333=head1 ARRAYS OF ARRAYS
cb1a09d0 334
d1be9408 335=head2 Declaration of an ARRAY OF ARRAYS
cb1a09d0 336
84f709e7 337 @AoA = (
338 [ "fred", "barney" ],
339 [ "george", "jane", "elroy" ],
340 [ "homer", "marge", "bart" ],
cb1a09d0 341 );
342
d1be9408 343=head2 Generation of an ARRAY OF ARRAYS
cb1a09d0 344
345 # reading from file
346 while ( <> ) {
19799a22 347 push @AoA, [ split ];
4973169d 348 }
cb1a09d0 349
350 # calling a function
84f709e7 351 for $i ( 1 .. 10 ) {
19799a22 352 $AoA[$i] = [ somefunc($i) ];
4973169d 353 }
cb1a09d0 354
355 # using temp vars
84f709e7 356 for $i ( 1 .. 10 ) {
357 @tmp = somefunc($i);
358 $AoA[$i] = [ @tmp ];
4973169d 359 }
cb1a09d0 360
361 # add to an existing row
84f709e7 362 push @{ $AoA[0] }, "wilma", "betty";
cb1a09d0 363
d1be9408 364=head2 Access and Printing of an ARRAY OF ARRAYS
cb1a09d0 365
366 # one element
84f709e7 367 $AoA[0][0] = "Fred";
cb1a09d0 368
369 # another element
19799a22 370 $AoA[1][1] =~ s/(\w)/\u$1/;
cb1a09d0 371
372 # print the whole thing with refs
84f709e7 373 for $aref ( @AoA ) {
cb1a09d0 374 print "\t [ @$aref ],\n";
4973169d 375 }
cb1a09d0 376
377 # print the whole thing with indices
84f709e7 378 for $i ( 0 .. $#AoA ) {
379 print "\t [ @{$AoA[$i]} ],\n";
4973169d 380 }
cb1a09d0 381
382 # print the whole thing one at a time
84f709e7 383 for $i ( 0 .. $#AoA ) {
384 for $j ( 0 .. $#{ $AoA[$i] } ) {
385 print "elt $i $j is $AoA[$i][$j]\n";
cb1a09d0 386 }
4973169d 387 }
cb1a09d0 388
19799a22 389=head1 HASHES OF ARRAYS
cb1a09d0 390
19799a22 391=head2 Declaration of a HASH OF ARRAYS
cb1a09d0 392
84f709e7 393 %HoA = (
394 flintstones => [ "fred", "barney" ],
395 jetsons => [ "george", "jane", "elroy" ],
396 simpsons => [ "homer", "marge", "bart" ],
cb1a09d0 397 );
398
19799a22 399=head2 Generation of a HASH OF ARRAYS
cb1a09d0 400
401 # reading from file
402 # flintstones: fred barney wilma dino
403 while ( <> ) {
84f709e7 404 next unless s/^(.*?):\s*//;
19799a22 405 $HoA{$1} = [ split ];
4973169d 406 }
cb1a09d0 407
408 # reading from file; more temps
409 # flintstones: fred barney wilma dino
84f709e7 410 while ( $line = <> ) {
411 ($who, $rest) = split /:\s*/, $line, 2;
412 @fields = split ' ', $rest;
413 $HoA{$who} = [ @fields ];
4973169d 414 }
cb1a09d0 415
416 # calling a function that returns a list
84f709e7 417 for $group ( "simpsons", "jetsons", "flintstones" ) {
19799a22 418 $HoA{$group} = [ get_family($group) ];
4973169d 419 }
cb1a09d0 420
421 # likewise, but using temps
84f709e7 422 for $group ( "simpsons", "jetsons", "flintstones" ) {
423 @members = get_family($group);
424 $HoA{$group} = [ @members ];
4973169d 425 }
cb1a09d0 426
427 # append new members to an existing family
84f709e7 428 push @{ $HoA{"flintstones"} }, "wilma", "betty";
cb1a09d0 429
19799a22 430=head2 Access and Printing of a HASH OF ARRAYS
cb1a09d0 431
432 # one element
84f709e7 433 $HoA{flintstones}[0] = "Fred";
cb1a09d0 434
435 # another element
19799a22 436 $HoA{simpsons}[1] =~ s/(\w)/\u$1/;
cb1a09d0 437
438 # print the whole thing
84f709e7 439 foreach $family ( keys %HoA ) {
440 print "$family: @{ $HoA{$family} }\n"
4973169d 441 }
cb1a09d0 442
443 # print the whole thing with indices
84f709e7 444 foreach $family ( keys %HoA ) {
445 print "family: ";
446 foreach $i ( 0 .. $#{ $HoA{$family} } ) {
19799a22 447 print " $i = $HoA{$family}[$i]";
cb1a09d0 448 }
449 print "\n";
4973169d 450 }
cb1a09d0 451
452 # print the whole thing sorted by number of members
84f709e7 453 foreach $family ( sort { @{$HoA{$b}} <=> @{$HoA{$a}} } keys %HoA ) {
19799a22 454 print "$family: @{ $HoA{$family} }\n"
4973169d 455 }
cb1a09d0 456
457 # print the whole thing sorted by number of members and name
84f709e7 458 foreach $family ( sort {
459 @{$HoA{$b}} <=> @{$HoA{$a}}
460 ||
461 $a cmp $b
462 } keys %HoA )
463 {
19799a22 464 print "$family: ", join(", ", sort @{ $HoA{$family} }), "\n";
4973169d 465 }
cb1a09d0 466
19799a22 467=head1 ARRAYS OF HASHES
cb1a09d0 468
d1be9408 469=head2 Declaration of an ARRAY OF HASHES
cb1a09d0 470
84f709e7 471 @AoH = (
cb1a09d0 472 {
84f709e7 473 Lead => "fred",
474 Friend => "barney",
cb1a09d0 475 },
476 {
84f709e7 477 Lead => "george",
478 Wife => "jane",
479 Son => "elroy",
cb1a09d0 480 },
481 {
84f709e7 482 Lead => "homer",
483 Wife => "marge",
484 Son => "bart",
cb1a09d0 485 }
486 );
487
d1be9408 488=head2 Generation of an ARRAY OF HASHES
cb1a09d0 489
490 # reading from file
491 # format: LEAD=fred FRIEND=barney
492 while ( <> ) {
84f709e7 493 $rec = {};
494 for $field ( split ) {
495 ($key, $value) = split /=/, $field;
496 $rec->{$key} = $value;
cb1a09d0 497 }
19799a22 498 push @AoH, $rec;
4973169d 499 }
cb1a09d0 500
501
502 # reading from file
503 # format: LEAD=fred FRIEND=barney
504 # no temp
505 while ( <> ) {
19799a22 506 push @AoH, { split /[\s+=]/ };
4973169d 507 }
cb1a09d0 508
19799a22 509 # calling a function that returns a key/value pair list, like
84f709e7 510 # "lead","fred","daughter","pebbles"
511 while ( %fields = getnextpairset() ) {
19799a22 512 push @AoH, { %fields };
4973169d 513 }
cb1a09d0 514
515 # likewise, but using no temp vars
516 while (<>) {
19799a22 517 push @AoH, { parsepairs($_) };
4973169d 518 }
cb1a09d0 519
520 # add key/value to an element
84f709e7 521 $AoH[0]{pet} = "dino";
19799a22 522 $AoH[2]{pet} = "santa's little helper";
cb1a09d0 523
d1be9408 524=head2 Access and Printing of an ARRAY OF HASHES
cb1a09d0 525
526 # one element
84f709e7 527 $AoH[0]{lead} = "fred";
cb1a09d0 528
529 # another element
19799a22 530 $AoH[1]{lead} =~ s/(\w)/\u$1/;
cb1a09d0 531
532 # print the whole thing with refs
84f709e7 533 for $href ( @AoH ) {
534 print "{ ";
535 for $role ( keys %$href ) {
536 print "$role=$href->{$role} ";
cb1a09d0 537 }
538 print "}\n";
4973169d 539 }
cb1a09d0 540
541 # print the whole thing with indices
84f709e7 542 for $i ( 0 .. $#AoH ) {
cb1a09d0 543 print "$i is { ";
84f709e7 544 for $role ( keys %{ $AoH[$i] } ) {
545 print "$role=$AoH[$i]{$role} ";
cb1a09d0 546 }
547 print "}\n";
4973169d 548 }
cb1a09d0 549
550 # print the whole thing one at a time
84f709e7 551 for $i ( 0 .. $#AoH ) {
552 for $role ( keys %{ $AoH[$i] } ) {
553 print "elt $i $role is $AoH[$i]{$role}\n";
cb1a09d0 554 }
4973169d 555 }
cb1a09d0 556
557=head1 HASHES OF HASHES
558
559=head2 Declaration of a HASH OF HASHES
560
84f709e7 561 %HoH = (
28757baa 562 flintstones => {
84f709e7 563 lead => "fred",
564 pal => "barney",
cb1a09d0 565 },
28757baa 566 jetsons => {
84f709e7 567 lead => "george",
568 wife => "jane",
569 "his boy" => "elroy",
4973169d 570 },
28757baa 571 simpsons => {
84f709e7 572 lead => "homer",
573 wife => "marge",
574 kid => "bart",
4973169d 575 },
576 );
cb1a09d0 577
578=head2 Generation of a HASH OF HASHES
579
580 # reading from file
581 # flintstones: lead=fred pal=barney wife=wilma pet=dino
582 while ( <> ) {
84f709e7 583 next unless s/^(.*?):\s*//;
584 $who = $1;
585 for $field ( split ) {
586 ($key, $value) = split /=/, $field;
cb1a09d0 587 $HoH{$who}{$key} = $value;
588 }
589
590
591 # reading from file; more temps
592 while ( <> ) {
84f709e7 593 next unless s/^(.*?):\s*//;
594 $who = $1;
595 $rec = {};
cb1a09d0 596 $HoH{$who} = $rec;
84f709e7 597 for $field ( split ) {
598 ($key, $value) = split /=/, $field;
599 $rec->{$key} = $value;
cb1a09d0 600 }
4973169d 601 }
cb1a09d0 602
cb1a09d0 603 # calling a function that returns a key,value hash
84f709e7 604 for $group ( "simpsons", "jetsons", "flintstones" ) {
cb1a09d0 605 $HoH{$group} = { get_family($group) };
4973169d 606 }
cb1a09d0 607
608 # likewise, but using temps
84f709e7 609 for $group ( "simpsons", "jetsons", "flintstones" ) {
610 %members = get_family($group);
cb1a09d0 611 $HoH{$group} = { %members };
4973169d 612 }
cb1a09d0 613
614 # append new members to an existing family
84f709e7 615 %new_folks = (
616 wife => "wilma",
617 pet => "dino",
cb1a09d0 618 );
4973169d 619
84f709e7 620 for $what (keys %new_folks) {
cb1a09d0 621 $HoH{flintstones}{$what} = $new_folks{$what};
4973169d 622 }
cb1a09d0 623
624=head2 Access and Printing of a HASH OF HASHES
625
626 # one element
84f709e7 627 $HoH{flintstones}{wife} = "wilma";
cb1a09d0 628
629 # another element
630 $HoH{simpsons}{lead} =~ s/(\w)/\u$1/;
631
632 # print the whole thing
84f709e7 633 foreach $family ( keys %HoH ) {
1fef88e7 634 print "$family: { ";
84f709e7 635 for $role ( keys %{ $HoH{$family} } ) {
636 print "$role=$HoH{$family}{$role} ";
cb1a09d0 637 }
638 print "}\n";
4973169d 639 }
cb1a09d0 640
641 # print the whole thing somewhat sorted
84f709e7 642 foreach $family ( sort keys %HoH ) {
1fef88e7 643 print "$family: { ";
84f709e7 644 for $role ( sort keys %{ $HoH{$family} } ) {
645 print "$role=$HoH{$family}{$role} ";
cb1a09d0 646 }
647 print "}\n";
4973169d 648 }
cb1a09d0 649
84f709e7 650
cb1a09d0 651 # print the whole thing sorted by number of members
84f709e7 652 foreach $family ( sort { keys %{$HoH{$b}} <=> keys %{$HoH{$a}} } keys %HoH ) {
1fef88e7 653 print "$family: { ";
84f709e7 654 for $role ( sort keys %{ $HoH{$family} } ) {
655 print "$role=$HoH{$family}{$role} ";
cb1a09d0 656 }
657 print "}\n";
4973169d 658 }
cb1a09d0 659
660 # establish a sort order (rank) for each role
84f709e7 661 $i = 0;
662 for ( qw(lead wife son daughter pal pet) ) { $rank{$_} = ++$i }
cb1a09d0 663
664 # now print the whole thing sorted by number of members
84f709e7 665 foreach $family ( sort { keys %{ $HoH{$b} } <=> keys %{ $HoH{$a} } } keys %HoH ) {
1fef88e7 666 print "$family: { ";
cb1a09d0 667 # and print these according to rank order
84f709e7 668 for $role ( sort { $rank{$a} <=> $rank{$b} } keys %{ $HoH{$family} } ) {
669 print "$role=$HoH{$family}{$role} ";
cb1a09d0 670 }
671 print "}\n";
4973169d 672 }
cb1a09d0 673
674
675=head1 MORE ELABORATE RECORDS
676
677=head2 Declaration of MORE ELABORATE RECORDS
678
679Here's a sample showing how to create and use a record whose fields are of
680many different sorts:
681
84f709e7 682 $rec = {
4973169d 683 TEXT => $string,
684 SEQUENCE => [ @old_values ],
685 LOOKUP => { %some_table },
686 THATCODE => \&some_function,
687 THISCODE => sub { $_[0] ** $_[1] },
688 HANDLE => \*STDOUT,
cb1a09d0 689 };
690
4973169d 691 print $rec->{TEXT};
cb1a09d0 692
84f709e7 693 print $rec->{SEQUENCE}[0];
694 $last = pop @ { $rec->{SEQUENCE} };
cb1a09d0 695
84f709e7 696 print $rec->{LOOKUP}{"key"};
697 ($first_k, $first_v) = each %{ $rec->{LOOKUP} };
cb1a09d0 698
84f709e7 699 $answer = $rec->{THATCODE}->($arg);
700 $answer = $rec->{THISCODE}->($arg1, $arg2);
cb1a09d0 701
702 # careful of extra block braces on fh ref
4973169d 703 print { $rec->{HANDLE} } "a string\n";
cb1a09d0 704
705 use FileHandle;
4973169d 706 $rec->{HANDLE}->autoflush(1);
707 $rec->{HANDLE}->print(" a string\n");
cb1a09d0 708
709=head2 Declaration of a HASH OF COMPLEX RECORDS
710
84f709e7 711 %TV = (
28757baa 712 flintstones => {
84f709e7 713 series => "flintstones",
4973169d 714 nights => [ qw(monday thursday friday) ],
cb1a09d0 715 members => [
84f709e7 716 { name => "fred", role => "lead", age => 36, },
717 { name => "wilma", role => "wife", age => 31, },
718 { name => "pebbles", role => "kid", age => 4, },
cb1a09d0 719 ],
720 },
721
28757baa 722 jetsons => {
84f709e7 723 series => "jetsons",
4973169d 724 nights => [ qw(wednesday saturday) ],
cb1a09d0 725 members => [
84f709e7 726 { name => "george", role => "lead", age => 41, },
727 { name => "jane", role => "wife", age => 39, },
728 { name => "elroy", role => "kid", age => 9, },
cb1a09d0 729 ],
730 },
731
28757baa 732 simpsons => {
84f709e7 733 series => "simpsons",
4973169d 734 nights => [ qw(monday) ],
cb1a09d0 735 members => [
84f709e7 736 { name => "homer", role => "lead", age => 34, },
737 { name => "marge", role => "wife", age => 37, },
738 { name => "bart", role => "kid", age => 11, },
cb1a09d0 739 ],
740 },
741 );
742
743=head2 Generation of a HASH OF COMPLEX RECORDS
744
84f709e7 745 # reading from file
746 # this is most easily done by having the file itself be
747 # in the raw data format as shown above. perl is happy
748 # to parse complex data structures if declared as data, so
749 # sometimes it's easiest to do that
cb1a09d0 750
84f709e7 751 # here's a piece by piece build up
752 $rec = {};
753 $rec->{series} = "flintstones";
cb1a09d0 754 $rec->{nights} = [ find_days() ];
755
84f709e7 756 @members = ();
cb1a09d0 757 # assume this file in field=value syntax
84f709e7 758 while (<>) {
759 %fields = split /[\s=]+/;
cb1a09d0 760 push @members, { %fields };
761 }
762 $rec->{members} = [ @members ];
763
764 # now remember the whole thing
765 $TV{ $rec->{series} } = $rec;
766
84f709e7 767 ###########################################################
768 # now, you might want to make interesting extra fields that
769 # include pointers back into the same data structure so if
770 # change one piece, it changes everywhere, like for example
771 # if you wanted a {kids} field that was a reference
772 # to an array of the kids' records without having duplicate
773 # records and thus update problems.
774 ###########################################################
775 foreach $family (keys %TV) {
776 $rec = $TV{$family}; # temp pointer
777 @kids = ();
778 for $person ( @{ $rec->{members} } ) {
779 if ($person->{role} =~ /kid|son|daughter/) {
cb1a09d0 780 push @kids, $person;
781 }
782 }
783 # REMEMBER: $rec and $TV{$family} point to same data!!
784 $rec->{kids} = [ @kids ];
785 }
786
84f709e7 787 # you copied the array, but the array itself contains pointers
788 # to uncopied objects. this means that if you make bart get
789 # older via
cb1a09d0 790
791 $TV{simpsons}{kids}[0]{age}++;
792
84f709e7 793 # then this would also change in
794 print $TV{simpsons}{members}[2]{age};
cb1a09d0 795
84f709e7 796 # because $TV{simpsons}{kids}[0] and $TV{simpsons}{members}[2]
797 # both point to the same underlying anonymous hash table
6ba6f0ec 798
84f709e7 799 # print the whole thing
800 foreach $family ( keys %TV ) {
801 print "the $family";
802 print " is on during @{ $TV{$family}{nights} }\n";
803 print "its members are:\n";
804 for $who ( @{ $TV{$family}{members} } ) {
cb1a09d0 805 print " $who->{name} ($who->{role}), age $who->{age}\n";
806 }
84f709e7 807 print "it turns out that $TV{$family}{lead} has ";
808 print scalar ( @{ $TV{$family}{kids} } ), " kids named ";
809 print join (", ", map { $_->{name} } @{ $TV{$family}{kids} } );
810 print "\n";
cb1a09d0 811 }
812
c07a80fd 813=head1 Database Ties
814
815You cannot easily tie a multilevel data structure (such as a hash of
816hashes) to a dbm file. The first problem is that all but GDBM and
817Berkeley DB have size limitations, but beyond that, you also have problems
818with how references are to be represented on disk. One experimental
5f05dabc 819module that does partially attempt to address this need is the MLDBM
f102b883 820module. Check your nearest CPAN site as described in L<perlmodlib> for
c07a80fd 821source code to MLDBM.
822
4633a7c4 823=head1 SEE ALSO
824
1fef88e7 825perlref(1), perllol(1), perldata(1), perlobj(1)
4633a7c4 826
827=head1 AUTHOR
828
9607fc9c 829Tom Christiansen <F<tchrist@perl.com>>
4633a7c4 830
84f709e7 831Last update:
28757baa 832Wed Oct 23 04:57:50 MET DST 1996