lib/DBM/Deep/Internals.pod

   1 =head1 NAME
   2
   3 DBM::Deep::Internals
   4
   5 =head1 OUT OF DATE
   6
   7 This document is out-of-date. It describes an intermediate file format used
   8 during the development from 0.983 to 1.0000. It will be rewritten soon.
   9
  10 =head1 DESCRIPTION
  11
  12 This is a document describing the internal workings of L<DBM::Deep>. It is
  13 not necessary to read this document if you only intend to be a user. This
  14 document is intended for people who either want a deeper understanding of
  15 specifics of how L<DBM::Deep> works or who wish to help program
  16 L<DBM::Deep>.
  17
  18 =head1 CLASS LAYOUT
  19
  20 L<DBM::Deep> is broken up into five classes in three inheritance hierarchies.
  21
  22 =over 4
  23
  24 =item *
  25
  26 L<DBM::Deep> is the parent of L<DBM::Deep::Array> and L<DBM::Deep::Hash>.
  27 These classes form the immediate interface to the outside world. They are the
  28 classes that provide the TIE mechanisms as well as the OO methods.
  29
  30 =item *
  31
  32 L<DBM::Deep::Engine> is the layer that deals with the mechanics of reading
  33 and writing to the file. This is where the logic of the file layout is
  34 handled.
  35
  36 =item *
  37
  38 L<DBM::Deep::File> is the layer that deals with the physical file. As a
  39 singleton that every other object has a reference to, it also provides a place
  40 to handle datastructure-wide items, such as transactions.
  41
  42 =back
  43
  44 =head1 FILE LAYOUT
  45
  46 DBM::Deep uses a tagged file layout. Every section has a tag, a size, and then
  47 the data.
  48
  49 =head2 File header
  50
  51 =over 4
  52
  53 =item * File Signature
  54
  55 The first four bytes are 'DPDB' in network byte order, signifying that this is
  56 a DBM::Deep file.
  57
  58 =item * File tag/size
  59
  60 This is the tagging of the file header. The file used by versions prior to
  61 1.00 had a different fifth byte, allowing the difference to the determined.
  62
  63 =item * Version
  64
  65 This is four bytes containing the file version. This lets the file format change over time.
  66
  67 =item * Constants
  68
  69 These are the file-wide constants that determine how the file is laid out.
  70 They can only be set upon file creation.
  71
  72 =item * Transaction information
  73
  74 The current running transactions are stored here, as is the next transaction
  75 ID.
  76
  77 =item * Freespace information
  78
  79 Pointers into the next free sectors of the various sector sizes (Index,
  80 Bucketlist, and Data) are stored here.
  81
  82 =back
  83
  84 =head2 Index
  85
  86 The Index parts can be tagged either as Hash, Array, or Index. The latter
  87 is if there was a reindexing due to a bucketlist growing too large. The others
  88 are the root index for their respective datatypes. The index consists of a
  89 tag, a size, and then 256 sections containing file locations. Each section
  90 corresponds to each value representable in a byte.
  91
  92 The index is used as follows - whenever a hashed key is being looked up, the
  93 first byte is used to determine which location to go to from the root index.
  94 Then, if that's also an index, the second byte is used, and so forth until a
  95 bucketlist is found.
  96
  97 =head2 Bucketlist
  98
  99 This is the part that contains the link to the data section. A bucketlist
 100 defaults to being 16 buckets long (modifiable by the I<max_buckets>
 101 parameter used when creating a new file). Each bucket contains an MD5 and a
 102 location of the appropriate key section.
 103
 104 =head2 Key area
 105
 106 This is the part that handles transactional awareness. There are
 107 I<max_buckets> sections. Each section contains the location to the data
 108 section, a transaction ID, and whether that transaction considers this key to
 109 be deleted or not.
 110
 111 =head2 Data area
 112
 113 This is the part that actual stores the key, value, and class (if
 114 appropriate). The layout is:
 115
 116 =over 4
 117
 118 =item * tag
 119
 120 =item * length of the value
 121
 122 =item * the actual value
 123
 124 =item * keylength
 125
 126 =item * the actual key
 127
 128 =item * a byte indicating if this value has a classname
 129
 130 =item * the classname (if one is there)
 131
 132 =back
 133
 134 The key is stored after the value because the value is requested more often
 135 than the key.
 136
 137 =head1 PERFORMANCE
 138
 139 L<DBM::Deep> is written completely in Perl. It also is a multi-process DBM
 140 that uses the datafile as a method of synchronizing between multiple
 141 processes. This is unlike most RDBMSes like MySQL and Oracle. Furthermore,
 142 unlike all RDBMSes, L<DBM::Deep> stores both the data and the structure of
 143 that data as it would appear in a Perl program.
 144
 145 =head2 CPU
 146
 147 DBM::Deep attempts to be CPU-light. As it stores all the data on disk,
 148 DBM::Deep is I/O-bound, not CPU-bound.
 149
 150 =head2 RAM
 151
 152 DBM::Deep uses extremely little RAM relative to the amount of data you can
 153 access. You can iterate through a million keys (using C<each()>) without
 154 increasing your memeory usage at all.
 155
 156 =head2 DISK
 157
 158 DBM::Deep is I/O-bound, pure and simple. The faster your disk, the faster
 159 DBM::Deep will be. Currently, when performing C<my $x = $db-E<gt>{foo}>, there
 160 are a minimum of 4 seeks and 1332 + N bytes read (where N is the length of your
 161 data). (All values assume a medium filesize.) The actions taken are:
 162
 163 =over 4
 164
 165 =item 1 Lock the file
 166
 167 =item 1 Perform a stat() to determine if the inode has changed
 168
 169 =item 1 Go to the primary index for the $db (1 seek)
 170
 171 =item 1 Read the tag/size of the primary index (5 bytes)
 172
 173 =item 1 Read the body of the primary index (1024 bytes)
 174
 175 =item 1 Go to the bucketlist for this MD5 (1 seek)
 176
 177 =item 1 Read the tag/size of the bucketlist (5 bytes)
 178
 179 =item 1 Read the body of the bucketlist (144 bytes)
 180
 181 =item 1 Go to the keys location for this MD5 (1 seek)
 182
 183 =item 1 Read the tag/size of the keys section (5 bytes)
 184
 185 =item 1 Read the body of the keys location (144 bytes)
 186
 187 =item 1 Go to the data section that corresponds to this transaction ID. (1 seek)
 188
 189 =item 1 Read the tag/size of the data section (5 bytes)
 190
 191 =item 1 Read the value for this data (N bytes)
 192
 193 =item 1 Unlock the file
 194
 195 =back
 196
 197 Every additional level of indexing (if there are enough keys) requires an
 198 additional seek and the reading of 1029 additional bytes. If the value is
 199 blessed, an additional 1 seek and 9 + M bytes are read (where M is the length
 200 of the classname).
 201
 202 Arrays are (currently) even worse because they're considered "funny hashes"
 203 with the length stored as just another key. This means that if you do any sort
 204 of lookup with a negative index, this entire process is performed twice - once
 205 for the length and once for the value.
 206
 207 =head1 ACTUAL TESTS
 208
 209 =head2 SPEED
 210
 211 Obviously, DBM::Deep isn't going to be as fast as some C-based DBMs, such as
 212 the almighty I<BerkeleyDB>.  But it makes up for it in features like true
 213 multi-level hash/array support, and cross-platform FTPable files.  Even so,
 214 DBM::Deep is still pretty fast, and the speed stays fairly consistent, even
 215 with huge databases.  Here is some test data:
 216
 217     Adding 1,000,000 keys to new DB file...
 218
 219     At 100 keys, avg. speed is 2,703 keys/sec
 220     At 200 keys, avg. speed is 2,642 keys/sec
 221     At 300 keys, avg. speed is 2,598 keys/sec
 222     At 400 keys, avg. speed is 2,578 keys/sec
 223     At 500 keys, avg. speed is 2,722 keys/sec
 224     At 600 keys, avg. speed is 2,628 keys/sec
 225     At 700 keys, avg. speed is 2,700 keys/sec
 226     At 800 keys, avg. speed is 2,607 keys/sec
 227     At 900 keys, avg. speed is 2,190 keys/sec
 228     At 1,000 keys, avg. speed is 2,570 keys/sec
 229     At 2,000 keys, avg. speed is 2,417 keys/sec
 230     At 3,000 keys, avg. speed is 1,982 keys/sec
 231     At 4,000 keys, avg. speed is 1,568 keys/sec
 232     At 5,000 keys, avg. speed is 1,533 keys/sec
 233     At 6,000 keys, avg. speed is 1,787 keys/sec
 234     At 7,000 keys, avg. speed is 1,977 keys/sec
 235     At 8,000 keys, avg. speed is 2,028 keys/sec
 236     At 9,000 keys, avg. speed is 2,077 keys/sec
 237     At 10,000 keys, avg. speed is 2,031 keys/sec
 238     At 20,000 keys, avg. speed is 1,970 keys/sec
 239     At 30,000 keys, avg. speed is 2,050 keys/sec
 240     At 40,000 keys, avg. speed is 2,073 keys/sec
 241     At 50,000 keys, avg. speed is 1,973 keys/sec
 242     At 60,000 keys, avg. speed is 1,914 keys/sec
 243     At 70,000 keys, avg. speed is 2,091 keys/sec
 244     At 80,000 keys, avg. speed is 2,103 keys/sec
 245     At 90,000 keys, avg. speed is 1,886 keys/sec
 246     At 100,000 keys, avg. speed is 1,970 keys/sec
 247     At 200,000 keys, avg. speed is 2,053 keys/sec
 248     At 300,000 keys, avg. speed is 1,697 keys/sec
 249     At 400,000 keys, avg. speed is 1,838 keys/sec
 250     At 500,000 keys, avg. speed is 1,941 keys/sec
 251     At 600,000 keys, avg. speed is 1,930 keys/sec
 252     At 700,000 keys, avg. speed is 1,735 keys/sec
 253     At 800,000 keys, avg. speed is 1,795 keys/sec
 254     At 900,000 keys, avg. speed is 1,221 keys/sec
 255     At 1,000,000 keys, avg. speed is 1,077 keys/sec
 256
 257 This test was performed on a PowerMac G4 1gHz running Mac OS X 10.3.2 & Perl
 258 5.8.1, with an 80GB Ultra ATA/100 HD spinning at 7200RPM.  The hash keys and
 259 values were between 6 - 12 chars in length.  The DB file ended up at 210MB.
 260 Run time was 12 min 3 sec.
 261
 262 =head2 MEMORY USAGE
 263
 264 One of the great things about L<DBM::Deep> is that it uses very little memory.
 265 Even with huge databases (1,000,000+ keys) you will not see much increased
 266 memory on your process.  L<DBM::Deep> relies solely on the filesystem for storing
 267 and fetching data.  Here is output from I<top> before even opening a database
 268 handle:
 269
 270     PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
 271   22831 root      11   0  2716 2716  1296 R     0.0  0.2   0:07 perl
 272
 273 Basically the process is taking 2,716K of memory.  And here is the same
 274 process after storing and fetching 1,000,000 keys:
 275
 276     PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
 277   22831 root      14   0  2772 2772  1328 R     0.0  0.2  13:32 perl
 278
 279 Notice the memory usage increased by only 56K.  Test was performed on a 700mHz
 280 x86 box running Linux RedHat 7.2 & Perl 5.6.1.
 281
 282 =cut