pod/perldsc.pod

   1 =head1 NAME
   2 X<data structure> X<complex data structure> X<struct>
   3
   4 perldsc - Perl Data Structures Cookbook
   5
   6 =head1 DESCRIPTION
   7
   8 The single feature most sorely lacking in the Perl programming language
   9 prior to its 5.0 release was complex data structures.  Even without direct
  10 language support, some valiant programmers did manage to emulate them, but
  11 it was hard work and not for the faint of heart.  You could occasionally
  12 get away with the C<$m{$AoA,$b}> notation borrowed from B<awk> in which the
  13 keys are actually more like a single concatenated string C<"$AoA$b">, but
  14 traversal and sorting were difficult.  More desperate programmers even
  15 hacked Perl's internal symbol table directly, a strategy that proved hard
  16 to develop and maintain--to put it mildly.
  17
  18 The 5.0 release of Perl let us have complex data structures.  You
  19 may now write something like this and all of a sudden, you'd have an array
  20 with three dimensions!
  21
  22     for $x (1 .. 10) {
  23         for $y (1 .. 10) {
  24             for $z (1 .. 10) {
  25                 $AoA[$x][$y][$z] =
  26                     $x ** $y + $z;
  27             }
  28         }
  29     }
  30
  31 Alas, however simple this may appear, underneath it's a much more
  32 elaborate construct than meets the eye!
  33
  34 How do you print it out?  Why can't you say just C<print @AoA>?  How do
  35 you sort it?  How can you pass it to a function or get one of these back
  36 from a function?  Is it an object?  Can you save it to disk to read
  37 back later?  How do you access whole rows or columns of that matrix?  Do
  38 all the values have to be numeric?
  39
  40 As you see, it's quite easy to become confused.  While some small portion
  41 of the blame for this can be attributed to the reference-based
  42 implementation, it's really more due to a lack of existing documentation with
  43 examples designed for the beginner.
  44
  45 This document is meant to be a detailed but understandable treatment of the
  46 many different sorts of data structures you might want to develop.  It
  47 should also serve as a cookbook of examples.  That way, when you need to
  48 create one of these complex data structures, you can just pinch, pilfer, or
  49 purloin a drop-in example from here.
  50
  51 Let's look at each of these possible constructs in detail.  There are separate
  52 sections on each of the following:
  53
  54 =over 5
  55
  56 =item * arrays of arrays
  57
  58 =item * hashes of arrays
  59
  60 =item * arrays of hashes
  61
  62 =item * hashes of hashes
  63
  64 =item * more elaborate constructs
  65
  66 =back
  67
  68 But for now, let's look at general issues common to all
  69 these types of data structures.
  70
  71 =head1 REFERENCES
  72 X<reference> X<dereference> X<dereferencing> X<pointer>
  73
  74 The most important thing to understand about all data structures in
  75 Perl--including multidimensional arrays--is that even though they might
  76 appear otherwise, Perl C<@ARRAY>s and C<%HASH>es are all internally
  77 one-dimensional.  They can hold only scalar values (meaning a string,
  78 number, or a reference).  They cannot directly contain other arrays or
  79 hashes, but instead contain I<references> to other arrays or hashes.
  80 X<multidimensional array> X<array, multidimensional>
  81
  82 You can't use a reference to an array or hash in quite the same way that you
  83 would a real array or hash.  For C or C++ programmers unused to
  84 distinguishing between arrays and pointers to the same, this can be
  85 confusing.  If so, just think of it as the difference between a structure
  86 and a pointer to a structure.
  87
  88 You can (and should) read more about references in the perlref(1) man
  89 page.  Briefly, references are rather like pointers that know what they
  90 point to.  (Objects are also a kind of reference, but we won't be needing
  91 them right away--if ever.)  This means that when you have something which
  92 looks to you like an access to a two-or-more-dimensional array and/or hash,
  93 what's really going on is that the base type is
  94 merely a one-dimensional entity that contains references to the next
  95 level.  It's just that you can I<use> it as though it were a
  96 two-dimensional one.  This is actually the way almost all C
  97 multidimensional arrays work as well.
  98
  99     $array[7][12]                       # array of arrays
 100     $array[7]{string}                   # array of hashes
 101     $hash{string}[7]                    # hash of arrays
 102     $hash{string}{'another string'}     # hash of hashes
 103
 104 Now, because the top level contains only references, if you try to print
 105 out your array in with a simple print() function, you'll get something
 106 that doesn't look very nice, like this:
 107
 108     @AoA = ( [2, 3], [4, 5, 7], [0] );
 109     print $AoA[1][2];
 110   7
 111     print @AoA;
 112   ARRAY(0x83c38)ARRAY(0x8b194)ARRAY(0x8b1d0)
 113
 114
 115 That's because Perl doesn't (ever) implicitly dereference your variables.
 116 If you want to get at the thing a reference is referring to, then you have
 117 to do this yourself using either prefix typing indicators, like
 118 C<${$blah}>, C<@{$blah}>, C<@{$blah[$i]}>, or else postfix pointer arrows,
 119 like C<$a-E<gt>[3]>, C<$h-E<gt>{fred}>, or even C<$ob-E<gt>method()-E<gt>[3]>.
 120
 121 =head1 COMMON MISTAKES
 122
 123 The two most common mistakes made in constructing something like
 124 an array of arrays is either accidentally counting the number of
 125 elements or else taking a reference to the same memory location
 126 repeatedly.  Here's the case where you just get the count instead
 127 of a nested array:
 128
 129     for $i (1..10) {
 130         @array = somefunc($i);
 131         $AoA[$i] = @array;      # WRONG!
 132     }
 133
 134 That's just the simple case of assigning an array to a scalar and getting
 135 its element count.  If that's what you really and truly want, then you
 136 might do well to consider being a tad more explicit about it, like this:
 137
 138     for $i (1..10) {
 139         @array = somefunc($i);
 140         $counts[$i] = scalar @array;
 141     }
 142
 143 Here's the case of taking a reference to the same memory location
 144 again and again:
 145
 146     for $i (1..10) {
 147         @array = somefunc($i);
 148         $AoA[$i] = \@array;     # WRONG!
 149     }
 150
 151 So, what's the big problem with that?  It looks right, doesn't it?
 152 After all, I just told you that you need an array of references, so by
 153 golly, you've made me one!
 154
 155 Unfortunately, while this is true, it's still broken.  All the references
 156 in @AoA refer to the I<very same place>, and they will therefore all hold
 157 whatever was last in @array!  It's similar to the problem demonstrated in
 158 the following C program:
 159
 160     #include <pwd.h>
 161     main() {
 162         struct passwd *getpwnam(), *rp, *dp;
 163         rp = getpwnam("root");
 164         dp = getpwnam("daemon");
 165
 166         printf("daemon name is %s\nroot name is %s\n",
 167                 dp->pw_name, rp->pw_name);
 168     }
 169
 170 Which will print
 171
 172     daemon name is daemon
 173     root name is daemon
 174
 175 The problem is that both C<rp> and C<dp> are pointers to the same location
 176 in memory!  In C, you'd have to remember to malloc() yourself some new
 177 memory.  In Perl, you'll want to use the array constructor C<[]> or the
 178 hash constructor C<{}> instead.   Here's the right way to do the preceding
 179 broken code fragments:
 180 X<[]> X<{}>
 181
 182     for $i (1..10) {
 183         @array = somefunc($i);
 184         $AoA[$i] = [ @array ];
 185     }
 186
 187 The square brackets make a reference to a new array with a I<copy>
 188 of what's in @array at the time of the assignment.  This is what
 189 you want.
 190
 191 Note that this will produce something similar, but it's
 192 much harder to read:
 193
 194     for $i (1..10) {
 195         @array = 0 .. $i;
 196         @{$AoA[$i]} = @array;
 197     }
 198
 199 Is it the same?  Well, maybe so--and maybe not.  The subtle difference
 200 is that when you assign something in square brackets, you know for sure
 201 it's always a brand new reference with a new I<copy> of the data.
 202 Something else could be going on in this new case with the C<@{$AoA[$i]}>
 203 dereference on the left-hand-side of the assignment.  It all depends on
 204 whether C<$AoA[$i]> had been undefined to start with, or whether it
 205 already contained a reference.  If you had already populated @AoA with
 206 references, as in
 207
 208     $AoA[3] = \@another_array;
 209
 210 Then the assignment with the indirection on the left-hand-side would
 211 use the existing reference that was already there:
 212
 213     @{$AoA[3]} = @array;
 214
 215 Of course, this I<would> have the "interesting" effect of clobbering
 216 @another_array.  (Have you ever noticed how when a programmer says
 217 something is "interesting", that rather than meaning "intriguing",
 218 they're disturbingly more apt to mean that it's "annoying",
 219 "difficult", or both?  :-)
 220
 221 So just remember always to use the array or hash constructors with C<[]>
 222 or C<{}>, and you'll be fine, although it's not always optimally
 223 efficient.
 224
 225 Surprisingly, the following dangerous-looking construct will
 226 actually work out fine:
 227
 228     for $i (1..10) {
 229         my @array = somefunc($i);
 230         $AoA[$i] = \@array;
 231     }
 232
 233 That's because my() is more of a run-time statement than it is a
 234 compile-time declaration I<per se>.  This means that the my() variable is
 235 remade afresh each time through the loop.  So even though it I<looks> as
 236 though you stored the same variable reference each time, you actually did
 237 not!  This is a subtle distinction that can produce more efficient code at
 238 the risk of misleading all but the most experienced of programmers.  So I
 239 usually advise against teaching it to beginners.  In fact, except for
 240 passing arguments to functions, I seldom like to see the gimme-a-reference
 241 operator (backslash) used much at all in code.  Instead, I advise
 242 beginners that they (and most of the rest of us) should try to use the
 243 much more easily understood constructors C<[]> and C<{}> instead of
 244 relying upon lexical (or dynamic) scoping and hidden reference-counting to
 245 do the right thing behind the scenes.
 246
 247 In summary:
 248
 249     $AoA[$i] = [ @array ];      # usually best
 250     $AoA[$i] = \@array;         # perilous; just how my() was that array?
 251     @{ $AoA[$i] } = @array;     # way too tricky for most programmers
 252
 253
 254 =head1 CAVEAT ON PRECEDENCE
 255 X<dereference, precedence> X<dereferencing, precedence>
 256
 257 Speaking of things like C<@{$AoA[$i]}>, the following are actually the
 258 same thing:
 259 X<< -> >>
 260
 261     $aref->[2][2]       # clear
 262     $$aref[2][2]        # confusing
 263
 264 That's because Perl's precedence rules on its five prefix dereferencers
 265 (which look like someone swearing: C<$ @ * % &>) make them bind more
 266 tightly than the postfix subscripting brackets or braces!  This will no
 267 doubt come as a great shock to the C or C++ programmer, who is quite
 268 accustomed to using C<*a[i]> to mean what's pointed to by the I<i'th>
 269 element of C<a>.  That is, they first take the subscript, and only then
 270 dereference the thing at that subscript.  That's fine in C, but this isn't C.
 271
 272 The seemingly equivalent construct in Perl, C<$$aref[$i]> first does
 273 the deref of $aref, making it take $aref as a reference to an
 274 array, and then dereference that, and finally tell you the I<i'th> value
 275 of the array pointed to by $AoA. If you wanted the C notion, you'd have to
 276 write C<${$AoA[$i]}> to force the C<$AoA[$i]> to get evaluated first
 277 before the leading C<$> dereferencer.
 278
 279 =head1 WHY YOU SHOULD ALWAYS C<use strict>
 280
 281 If this is starting to sound scarier than it's worth, relax.  Perl has
 282 some features to help you avoid its most common pitfalls.  The best
 283 way to avoid getting confused is to start every program like this:
 284
 285     #!/usr/bin/perl -w
 286     use strict;
 287
 288 This way, you'll be forced to declare all your variables with my() and
 289 also disallow accidental "symbolic dereferencing".  Therefore if you'd done
 290 this:
 291
 292     my $aref = [
 293         [ "fred", "barney", "pebbles", "bambam", "dino", ],
 294         [ "homer", "bart", "marge", "maggie", ],
 295         [ "george", "jane", "elroy", "judy", ],
 296     ];
 297
 298     print $aref[2][2];
 299
 300 The compiler would immediately flag that as an error I<at compile time>,
 301 because you were accidentally accessing C<@aref>, an undeclared
 302 variable, and it would thereby remind you to write instead:
 303
 304     print $aref->[2][2]
 305
 306 =head1 DEBUGGING
 307 X<data structure, debugging> X<complex data structure, debugging>
 308 X<AoA, debugging> X<HoA, debugging> X<AoH, debugging> X<HoH, debugging>
 309 X<array of arrays, debugging> X<hash of arrays, debugging>
 310 X<array of hashes, debugging> X<hash of hashes, debugging>
 311
 312 Before version 5.002, the standard Perl debugger didn't do a very nice job of
 313 printing out complex data structures.  With 5.002 or above, the
 314 debugger includes several new features, including command line editing as
 315 well as the C<x> command to dump out complex data structures.  For
 316 example, given the assignment to $AoA above, here's the debugger output:
 317
 318     DB<1> x $AoA
 319     $AoA = ARRAY(0x13b5a0)
 320        0  ARRAY(0x1f0a24)
 321           0  'fred'
 322           1  'barney'
 323           2  'pebbles'
 324           3  'bambam'
 325           4  'dino'
 326        1  ARRAY(0x13b558)
 327           0  'homer'
 328           1  'bart'
 329           2  'marge'
 330           3  'maggie'
 331        2  ARRAY(0x13b540)
 332           0  'george'
 333           1  'jane'
 334           2  'elroy'
 335           3  'judy'
 336
 337 =head1 CODE EXAMPLES
 338
 339 Presented with little comment (these will get their own manpages someday)
 340 here are short code examples illustrating access of various
 341 types of data structures.
 342
 343 =head1 ARRAYS OF ARRAYS
 344 X<array of arrays> X<AoA>
 345
 346 =head2 Declaration of an ARRAY OF ARRAYS
 347
 348  @AoA = (
 349         [ "fred", "barney" ],
 350         [ "george", "jane", "elroy" ],
 351         [ "homer", "marge", "bart" ],
 352       );
 353
 354 =head2 Generation of an ARRAY OF ARRAYS
 355
 356  # reading from file
 357  while ( <> ) {
 358      push @AoA, [ split ];
 359  }
 360
 361  # calling a function
 362  for $i ( 1 .. 10 ) {
 363      $AoA[$i] = [ somefunc($i) ];
 364  }
 365
 366  # using temp vars
 367  for $i ( 1 .. 10 ) {
 368      @tmp = somefunc($i);
 369      $AoA[$i] = [ @tmp ];
 370  }
 371
 372  # add to an existing row
 373  push @{ $AoA[0] }, "wilma", "betty";
 374
 375 =head2 Access and Printing of an ARRAY OF ARRAYS
 376
 377  # one element
 378  $AoA[0][0] = "Fred";
 379
 380  # another element
 381  $AoA[1][1] =~ s/(\w)/\u$1/;
 382
 383  # print the whole thing with refs
 384  for $aref ( @AoA ) {
 385      print "\t [ @$aref ],\n";
 386  }
 387
 388  # print the whole thing with indices
 389  for $i ( 0 .. $#AoA ) {
 390      print "\t [ @{$AoA[$i]} ],\n";
 391  }
 392
 393  # print the whole thing one at a time
 394  for $i ( 0 .. $#AoA ) {
 395      for $j ( 0 .. $#{ $AoA[$i] } ) {
 396          print "elt $i $j is $AoA[$i][$j]\n";
 397      }
 398  }
 399
 400 =head1 HASHES OF ARRAYS
 401 X<hash of arrays> X<HoA>
 402
 403 =head2 Declaration of a HASH OF ARRAYS
 404
 405  %HoA = (
 406         flintstones        => [ "fred", "barney" ],
 407         jetsons            => [ "george", "jane", "elroy" ],
 408         simpsons           => [ "homer", "marge", "bart" ],
 409       );
 410
 411 =head2 Generation of a HASH OF ARRAYS
 412
 413  # reading from file
 414  # flintstones: fred barney wilma dino
 415  while ( <> ) {
 416      next unless s/^(.*?):\s*//;
 417      $HoA{$1} = [ split ];
 418  }
 419
 420  # reading from file; more temps
 421  # flintstones: fred barney wilma dino
 422  while ( $line = <> ) {
 423      ($who, $rest) = split /:\s*/, $line, 2;
 424      @fields = split ' ', $rest;
 425      $HoA{$who} = [ @fields ];
 426  }
 427
 428  # calling a function that returns a list
 429  for $group ( "simpsons", "jetsons", "flintstones" ) {
 430      $HoA{$group} = [ get_family($group) ];
 431  }
 432
 433  # likewise, but using temps
 434  for $group ( "simpsons", "jetsons", "flintstones" ) {
 435      @members = get_family($group);
 436      $HoA{$group} = [ @members ];
 437  }
 438
 439  # append new members to an existing family
 440  push @{ $HoA{"flintstones"} }, "wilma", "betty";
 441
 442 =head2 Access and Printing of a HASH OF ARRAYS
 443
 444  # one element
 445  $HoA{flintstones}[0] = "Fred";
 446
 447  # another element
 448  $HoA{simpsons}[1] =~ s/(\w)/\u$1/;
 449
 450  # print the whole thing
 451  foreach $family ( keys %HoA ) {
 452      print "$family: @{ $HoA{$family} }\n"
 453  }
 454
 455  # print the whole thing with indices
 456  foreach $family ( keys %HoA ) {
 457      print "family: ";
 458      foreach $i ( 0 .. $#{ $HoA{$family} } ) {
 459          print " $i = $HoA{$family}[$i]";
 460      }
 461      print "\n";
 462  }
 463
 464  # print the whole thing sorted by number of members
 465  foreach $family ( sort { @{$HoA{$b}} <=> @{$HoA{$a}} } keys %HoA ) {
 466      print "$family: @{ $HoA{$family} }\n"
 467  }
 468
 469  # print the whole thing sorted by number of members and name
 470  foreach $family ( sort {
 471                             @{$HoA{$b}} <=> @{$HoA{$a}}
 472                                         ||
 473                                     $a cmp $b
 474             } keys %HoA )
 475  {
 476      print "$family: ", join(", ", sort @{ $HoA{$family} }), "\n";
 477  }
 478
 479 =head1 ARRAYS OF HASHES
 480 X<array of hashes> X<AoH>
 481
 482 =head2 Declaration of an ARRAY OF HASHES
 483
 484  @AoH = (
 485         {
 486             Lead     => "fred",
 487             Friend   => "barney",
 488         },
 489         {
 490             Lead     => "george",
 491             Wife     => "jane",
 492             Son      => "elroy",
 493         },
 494         {
 495             Lead     => "homer",
 496             Wife     => "marge",
 497             Son      => "bart",
 498         }
 499   );
 500
 501 =head2 Generation of an ARRAY OF HASHES
 502
 503  # reading from file
 504  # format: LEAD=fred FRIEND=barney
 505  while ( <> ) {
 506      $rec = {};
 507      for $field ( split ) {
 508          ($key, $value) = split /=/, $field;
 509          $rec->{$key} = $value;
 510      }
 511      push @AoH, $rec;
 512  }
 513
 514
 515  # reading from file
 516  # format: LEAD=fred FRIEND=barney
 517  # no temp
 518  while ( <> ) {
 519      push @AoH, { split /[\s+=]/ };
 520  }
 521
 522  # calling a function  that returns a key/value pair list, like
 523  # "lead","fred","daughter","pebbles"
 524  while ( %fields = getnextpairset() ) {
 525      push @AoH, { %fields };
 526  }
 527
 528  # likewise, but using no temp vars
 529  while (<>) {
 530      push @AoH, { parsepairs($_) };
 531  }
 532
 533  # add key/value to an element
 534  $AoH[0]{pet} = "dino";
 535  $AoH[2]{pet} = "santa's little helper";
 536
 537 =head2 Access and Printing of an ARRAY OF HASHES
 538
 539  # one element
 540  $AoH[0]{lead} = "fred";
 541
 542  # another element
 543  $AoH[1]{lead} =~ s/(\w)/\u$1/;
 544
 545  # print the whole thing with refs
 546  for $href ( @AoH ) {
 547      print "{ ";
 548      for $role ( keys %$href ) {
 549          print "$role=$href->{$role} ";
 550      }
 551      print "}\n";
 552  }
 553
 554  # print the whole thing with indices
 555  for $i ( 0 .. $#AoH ) {
 556      print "$i is { ";
 557      for $role ( keys %{ $AoH[$i] } ) {
 558          print "$role=$AoH[$i]{$role} ";
 559      }
 560      print "}\n";
 561  }
 562
 563  # print the whole thing one at a time
 564  for $i ( 0 .. $#AoH ) {
 565      for $role ( keys %{ $AoH[$i] } ) {
 566          print "elt $i $role is $AoH[$i]{$role}\n";
 567      }
 568  }
 569
 570 =head1 HASHES OF HASHES
 571 X<hash of hashes> X<HoH>
 572
 573 =head2 Declaration of a HASH OF HASHES
 574
 575  %HoH = (
 576         flintstones => {
 577                 lead      => "fred",
 578                 pal       => "barney",
 579         },
 580         jetsons     => {
 581                 lead      => "george",
 582                 wife      => "jane",
 583                 "his boy" => "elroy",
 584         },
 585         simpsons    => {
 586                 lead      => "homer",
 587                 wife      => "marge",
 588                 kid       => "bart",
 589         },
 590  );
 591
 592 =head2 Generation of a HASH OF HASHES
 593
 594  # reading from file
 595  # flintstones: lead=fred pal=barney wife=wilma pet=dino
 596  while ( <> ) {
 597      next unless s/^(.*?):\s*//;
 598      $who = $1;
 599      for $field ( split ) {
 600          ($key, $value) = split /=/, $field;
 601          $HoH{$who}{$key} = $value;
 602      }
 603
 604
 605  # reading from file; more temps
 606  while ( <> ) {
 607      next unless s/^(.*?):\s*//;
 608      $who = $1;
 609      $rec = {};
 610      $HoH{$who} = $rec;
 611      for $field ( split ) {
 612          ($key, $value) = split /=/, $field;
 613          $rec->{$key} = $value;
 614      }
 615  }
 616
 617  # calling a function  that returns a key,value hash
 618  for $group ( "simpsons", "jetsons", "flintstones" ) {
 619      $HoH{$group} = { get_family($group) };
 620  }
 621
 622  # likewise, but using temps
 623  for $group ( "simpsons", "jetsons", "flintstones" ) {
 624      %members = get_family($group);
 625      $HoH{$group} = { %members };
 626  }
 627
 628  # append new members to an existing family
 629  %new_folks = (
 630      wife => "wilma",
 631      pet  => "dino",
 632  );
 633
 634  for $what (keys %new_folks) {
 635      $HoH{flintstones}{$what} = $new_folks{$what};
 636  }
 637
 638 =head2 Access and Printing of a HASH OF HASHES
 639
 640  # one element
 641  $HoH{flintstones}{wife} = "wilma";
 642
 643  # another element
 644  $HoH{simpsons}{lead} =~ s/(\w)/\u$1/;
 645
 646  # print the whole thing
 647  foreach $family ( keys %HoH ) {
 648      print "$family: { ";
 649      for $role ( keys %{ $HoH{$family} } ) {
 650          print "$role=$HoH{$family}{$role} ";
 651      }
 652      print "}\n";
 653  }
 654
 655  # print the whole thing  somewhat sorted
 656  foreach $family ( sort keys %HoH ) {
 657      print "$family: { ";
 658      for $role ( sort keys %{ $HoH{$family} } ) {
 659          print "$role=$HoH{$family}{$role} ";
 660      }
 661      print "}\n";
 662  }
 663
 664
 665  # print the whole thing sorted by number of members
 666  foreach $family ( sort { keys %{$HoH{$b}} <=> keys %{$HoH{$a}} } keys %HoH ) {
 667      print "$family: { ";
 668      for $role ( sort keys %{ $HoH{$family} } ) {
 669          print "$role=$HoH{$family}{$role} ";
 670      }
 671      print "}\n";
 672  }
 673
 674  # establish a sort order (rank) for each role
 675  $i = 0;
 676  for ( qw(lead wife son daughter pal pet) ) { $rank{$_} = ++$i }
 677
 678  # now print the whole thing sorted by number of members
 679  foreach $family ( sort { keys %{ $HoH{$b} } <=> keys %{ $HoH{$a} } } keys %HoH ) {
 680      print "$family: { ";
 681      # and print these according to rank order
 682      for $role ( sort { $rank{$a} <=> $rank{$b} }  keys %{ $HoH{$family} } ) {
 683          print "$role=$HoH{$family}{$role} ";
 684      }
 685      print "}\n";
 686  }
 687
 688
 689 =head1 MORE ELABORATE RECORDS
 690 X<record> X<structure> X<struct>
 691
 692 =head2 Declaration of MORE ELABORATE RECORDS
 693
 694 Here's a sample showing how to create and use a record whose fields are of
 695 many different sorts:
 696
 697      $rec = {
 698          TEXT      => $string,
 699          SEQUENCE  => [ @old_values ],
 700          LOOKUP    => { %some_table },
 701          THATCODE  => \&some_function,
 702          THISCODE  => sub { $_[0] ** $_[1] },
 703          HANDLE    => \*STDOUT,
 704      };
 705
 706      print $rec->{TEXT};
 707
 708      print $rec->{SEQUENCE}[0];
 709      $last = pop @ { $rec->{SEQUENCE} };
 710
 711      print $rec->{LOOKUP}{"key"};
 712      ($first_k, $first_v) = each %{ $rec->{LOOKUP} };
 713
 714      $answer = $rec->{THATCODE}->($arg);
 715      $answer = $rec->{THISCODE}->($arg1, $arg2);
 716
 717      # careful of extra block braces on fh ref
 718      print { $rec->{HANDLE} } "a string\n";
 719
 720      use FileHandle;
 721      $rec->{HANDLE}->autoflush(1);
 722      $rec->{HANDLE}->print(" a string\n");
 723
 724 =head2 Declaration of a HASH OF COMPLEX RECORDS
 725
 726      %TV = (
 727         flintstones => {
 728             series   => "flintstones",
 729             nights   => [ qw(monday thursday friday) ],
 730             members  => [
 731                 { name => "fred",    role => "lead", age  => 36, },
 732                 { name => "wilma",   role => "wife", age  => 31, },
 733                 { name => "pebbles", role => "kid",  age  =>  4, },
 734             ],
 735         },
 736
 737         jetsons     => {
 738             series   => "jetsons",
 739             nights   => [ qw(wednesday saturday) ],
 740             members  => [
 741                 { name => "george",  role => "lead", age  => 41, },
 742                 { name => "jane",    role => "wife", age  => 39, },
 743                 { name => "elroy",   role => "kid",  age  =>  9, },
 744             ],
 745          },
 746
 747         simpsons    => {
 748             series   => "simpsons",
 749             nights   => [ qw(monday) ],
 750             members  => [
 751                 { name => "homer", role => "lead", age  => 34, },
 752                 { name => "marge", role => "wife", age => 37, },
 753                 { name => "bart",  role => "kid",  age  =>  11, },
 754             ],
 755          },
 756       );
 757
 758 =head2 Generation of a HASH OF COMPLEX RECORDS
 759
 760      # reading from file
 761      # this is most easily done by having the file itself be
 762      # in the raw data format as shown above.  perl is happy
 763      # to parse complex data structures if declared as data, so
 764      # sometimes it's easiest to do that
 765
 766      # here's a piece by piece build up
 767      $rec = {};
 768      $rec->{series} = "flintstones";
 769      $rec->{nights} = [ find_days() ];
 770
 771      @members = ();
 772      # assume this file in field=value syntax
 773      while (<>) {
 774          %fields = split /[\s=]+/;
 775          push @members, { %fields };
 776      }
 777      $rec->{members} = [ @members ];
 778
 779      # now remember the whole thing
 780      $TV{ $rec->{series} } = $rec;
 781
 782      ###########################################################
 783      # now, you might want to make interesting extra fields that
 784      # include pointers back into the same data structure so if
 785      # change one piece, it changes everywhere, like for example
 786      # if you wanted a {kids} field that was a reference
 787      # to an array of the kids' records without having duplicate
 788      # records and thus update problems.
 789      ###########################################################
 790      foreach $family (keys %TV) {
 791          $rec = $TV{$family}; # temp pointer
 792          @kids = ();
 793          for $person ( @{ $rec->{members} } ) {
 794              if ($person->{role} =~ /kid|son|daughter/) {
 795                  push @kids, $person;
 796              }
 797          }
 798          # REMEMBER: $rec and $TV{$family} point to same data!!
 799          $rec->{kids} = [ @kids ];
 800      }
 801
 802      # you copied the array, but the array itself contains pointers
 803      # to uncopied objects. this means that if you make bart get
 804      # older via
 805
 806      $TV{simpsons}{kids}[0]{age}++;
 807
 808      # then this would also change in
 809      print $TV{simpsons}{members}[2]{age};
 810
 811      # because $TV{simpsons}{kids}[0] and $TV{simpsons}{members}[2]
 812      # both point to the same underlying anonymous hash table
 813
 814      # print the whole thing
 815      foreach $family ( keys %TV ) {
 816          print "the $family";
 817          print " is on during @{ $TV{$family}{nights} }\n";
 818          print "its members are:\n";
 819          for $who ( @{ $TV{$family}{members} } ) {
 820              print " $who->{name} ($who->{role}), age $who->{age}\n";
 821          }
 822          print "it turns out that $TV{$family}{lead} has ";
 823          print scalar ( @{ $TV{$family}{kids} } ), " kids named ";
 824          print join (", ", map { $_->{name} } @{ $TV{$family}{kids} } );
 825          print "\n";
 826      }
 827
 828 =head1 Database Ties
 829
 830 You cannot easily tie a multilevel data structure (such as a hash of
 831 hashes) to a dbm file.  The first problem is that all but GDBM and
 832 Berkeley DB have size limitations, but beyond that, you also have problems
 833 with how references are to be represented on disk.  One experimental
 834 module that does partially attempt to address this need is the MLDBM
 835 module.  Check your nearest CPAN site as described in L<perlmodlib> for
 836 source code to MLDBM.
 837
 838 =head1 SEE ALSO
 839
 840 perlref(1), perllol(1), perldata(1), perlobj(1)
 841
 842 =head1 AUTHOR
 843
 844 Tom Christiansen <F<tchrist@perl.com>>
 845
 846 Last update:
 847 Wed Oct 23 04:57:50 MET DST 1996