=head1 NAME

DBM::Deep::Internals

=head1 DESCRIPTION

B<NOTE>: This document is out of date. It describes an intermediate file
format used during the development from 0.983 to 1.0000. It will be rewritten
soon.

This is a document describing the internal workings of L<DBM::Deep>. It is
not necessary to read this document if you only intend to be a user. This
document is intended for people who either want a deeper understanding of the
specifics of how L<DBM::Deep> works or who wish to help program
L<DBM::Deep>.

=head1 CLASS LAYOUT

L<DBM::Deep> is broken up into five classes in three inheritance hierarchies.

=over 4

=item *

L<DBM::Deep> is the parent of L<DBM::Deep::Array> and L<DBM::Deep::Hash>.
These classes form the immediate interface to the outside world. They are the
classes that provide the TIE mechanisms as well as the OO methods.

=item *

L<DBM::Deep::Engine> is the layer that deals with the mechanics of reading
from and writing to the file. This is where the logic of the file layout is
handled.

=item *

L<DBM::Deep::File> is the layer that deals with the physical file. As a
singleton that every other object holds a reference to, it also provides a
place to handle datastructure-wide concerns, such as transactions.

=back

=head1 FILE LAYOUT

DBM::Deep uses a tagged file layout. Every section has a tag, a size, and then
the data.

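The tag/size/data pattern can be sketched as follows. This Python sketch
assumes a 1-byte tag and a 4-byte network-order size (consistent with the
5-byte tag/size reads described under PERFORMANCE below); the exact field
widths are illustrative, not a byte-exact description of any particular file
version.

```python
import struct
from io import BytesIO

def read_section(fh):
    # 1-byte tag + 4-byte network-order size = a 5-byte tag/size header,
    # followed by `size` bytes of section data.
    tag = fh.read(1)
    (size,) = struct.unpack(">L", fh.read(4))
    data = fh.read(size)
    return tag, size, data

# Round-trip against a hand-built section with tag 'D' and 5 bytes of data
raw = b"D" + struct.pack(">L", 5) + b"hello"
tag, size, data = read_section(BytesIO(raw))
```

Every structure described below (index, bucketlist, key area, data area)
follows this same tag-then-size-then-data shape.
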
=head2 File header

=over 4

=item * File Signature

The first four bytes are 'DPDB' in network byte order, signifying that this is
a DBM::Deep file.

=item * File tag/size

This is the tagging of the file header. The file used by versions prior to
1.00 had a different fifth byte, allowing the difference to be determined.

=item * Version

This is four bytes containing the file version. This lets the file format
change over time.

=item * Constants

These are the file-wide constants that determine how the file is laid out.
They can only be set upon file creation.

=item * Transaction information

The current running transactions are stored here, as is the next transaction
ID.

=item * Freespace information

Pointers into the next free sectors of the various sector sizes (Index,
Bucketlist, and Data) are stored here.

=back

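A reader's first steps against this header might look like the following
sketch. The signature and the 4-byte version are from the description above;
treating the version as the first field after the header's tag/size is an
illustrative assumption, not the exact layout.

```python
import struct
from io import BytesIO

SIGNATURE = b"DPDB"

def read_header(fh):
    # Reject anything that does not start with the 4-byte 'DPDB' signature.
    if fh.read(4) != SIGNATURE:
        raise ValueError("not a DBM::Deep file")
    tag = fh.read(1)                       # fifth byte; differed pre-1.00
    (size,) = struct.unpack(">L", fh.read(4))
    (version,) = struct.unpack(">L", fh.read(4))  # assumed position
    return tag, size, version

# A minimal fabricated header: signature, tag 'h', 12-byte body, version 2
raw = SIGNATURE + b"h" + struct.pack(">L", 12) + struct.pack(">L", 2)
tag, size, version = read_header(BytesIO(raw))
```
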
=head2 Index

The Index parts can be tagged either as Hash, Array, or Index. The latter
occurs when a bucketlist grows too large and must be reindexed. The others
are the root index for their respective datatypes. The index consists of a
tag, a size, and then 256 sections containing file locations. Each section
corresponds to one of the values representable in a byte.

The index is used as follows: whenever a hashed key is being looked up, the
first byte is used to determine which location to go to from the root index.
Then, if that location is also an index, the second byte is used, and so
forth, until a bucketlist is found.

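The descent described above can be sketched like this. Python lists stand in
for on-disk index sectors, and the string "location" stands in for a file
offset; the hashing of keys with MD5 matches the bucketlist description
below.

```python
import hashlib

def descend(root, key):
    # Walk 256-way index levels using successive bytes of the key's MD5
    # until a non-index entry (here, anything that is not a list) is found.
    node = root
    for byte in hashlib.md5(key).digest():
        entry = node[byte]
        if not isinstance(entry, list):   # a bucketlist location, or a miss
            return entry
        node = entry                      # another index level: use next byte
    return None

# md5("foo") begins 0xac 0xbd, so place a nested index under slot 0xac
inner = [None] * 256
inner[0xbd] = "bucketlist@4096"           # fake file location
root = [None] * 256
root[0xac] = inner
found = descend(root, b"foo")
```
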
=head2 Bucketlist

This is the part that contains the link to the data section. A bucketlist
defaults to being 16 buckets long (modifiable by the I<max_buckets>
parameter used when creating a new file). Each bucket contains an MD5 and a
location of the appropriate key section.

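Looking a key up inside a bucketlist is then a linear scan of at most
I<max_buckets> entries, as in this sketch (tuples stand in for the on-disk
bucket records):

```python
import hashlib

def find_bucket(buckets, key):
    # Each bucket is a (md5, location) pair pointing at the key section
    # for that entry; scan for the bucket whose MD5 matches the key's.
    want = hashlib.md5(key).digest()
    for stored_md5, location in buckets:
        if stored_md5 == want:
            return location
    return None

buckets = [(hashlib.md5(b"foo").digest(), 8192),
           (hashlib.md5(b"baz").digest(), 8448)]
loc = find_bucket(buckets, b"foo")
```
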
=head2 Key area

This is the part that handles transactional awareness. There are
I<max_buckets> sections. Each section contains the location of the data
section, a transaction ID, and whether that transaction considers this key to
be deleted or not.

=head2 Data area

This is the part that actually stores the key, value, and class (if
appropriate). The layout is:

=over 4

=item * tag

=item * length of the value

=item * the actual value

=item * keylength

=item * the actual key

=item * a byte indicating if this value has a classname

=item * the classname (if one is there)

=back

The key is stored after the value because the value is requested more often
than the key.

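A value-first record in that spirit can be packed and unpacked as below. The
4-byte lengths, the C<0x00>/C<0x01> classname flag, and the length prefix on
the classname are all illustrative choices, not the exact on-disk encoding;
the point is the field order, which lets a reader that only wants the value
stop after the first field.

```python
import struct

def pack_record(key, value, classname=None):
    # Value first, then key, then the classname flag (and classname).
    out = struct.pack(">L", len(value)) + value
    out += struct.pack(">L", len(key)) + key
    if classname is None:
        return out + b"\x00"
    return out + b"\x01" + struct.pack(">L", len(classname)) + classname

def unpack_record(buf):
    (vlen,) = struct.unpack_from(">L", buf, 0)
    value = buf[4:4 + vlen]
    pos = 4 + vlen
    (klen,) = struct.unpack_from(">L", buf, pos)
    key = buf[pos + 4:pos + 4 + klen]
    pos += 4 + klen
    classname = None
    if buf[pos:pos + 1] == b"\x01":            # classname flag set
        (clen,) = struct.unpack_from(">L", buf, pos + 1)
        classname = buf[pos + 5:pos + 5 + clen]
    return key, value, classname

rec = pack_record(b"name", b"Alice", b"My::Class")
key, value, classname = unpack_record(rec)
```
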
=head1 PERFORMANCE

L<DBM::Deep> is written completely in Perl. It is also a multi-process DBM
that uses the datafile as its method of synchronizing between multiple
processes. This is unlike most RDBMSes, such as MySQL and Oracle. Furthermore,
unlike all RDBMSes, L<DBM::Deep> stores both the data and the structure of
that data as it would appear in a Perl program.

=head2 CPU

DBM::Deep attempts to be CPU-light. As it stores all the data on disk,
DBM::Deep is I/O-bound, not CPU-bound.

=head2 RAM

DBM::Deep uses extremely little RAM relative to the amount of data you can
access. You can iterate through a million keys (using C<each()>) without
increasing your memory usage at all.

=head2 DISK

DBM::Deep is I/O-bound, pure and simple. The faster your disk, the faster
DBM::Deep will be. Currently, when performing C<my $x = $db-E<gt>{foo}>, there
are a minimum of 4 seeks and 1332 + N bytes read (where N is the length of your
data). (All values assume a medium filesize.) The actions taken are:

=over 4

=item 1 Lock the file

=item 2 Perform a stat() to determine if the inode has changed

=item 3 Go to the primary index for the $db (1 seek)

=item 4 Read the tag/size of the primary index (5 bytes)

=item 5 Read the body of the primary index (1024 bytes)

=item 6 Go to the bucketlist for this MD5 (1 seek)

=item 7 Read the tag/size of the bucketlist (5 bytes)

=item 8 Read the body of the bucketlist (144 bytes)

=item 9 Go to the keys location for this MD5 (1 seek)

=item 10 Read the tag/size of the keys section (5 bytes)

=item 11 Read the body of the keys location (144 bytes)

=item 12 Go to the data section that corresponds to this transaction ID (1 seek)

=item 13 Read the tag/size of the data section (5 bytes)

=item 14 Read the value for this data (N bytes)

=item 15 Unlock the file

=back

Every additional level of indexing (if there are enough keys) requires an
additional seek and the reading of 1029 additional bytes. If the value is
blessed, an additional seek and 9 + M bytes are read (where M is the length
of the classname).

Arrays are (currently) even worse because they're considered "funny hashes"
with the length stored as just another key. This means that if you do any sort
of lookup with a negative index, this entire process is performed twice: once
for the length and once for the value.

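As a quick sanity check, the byte counts in the steps above tally to the
figures quoted:

```python
# Bytes read for one `$db->{foo}` fetch, per the steps above
TAG_SIZE = 5            # 1-byte tag + 4-byte size
INDEX_BODY = 1024       # 256 file-location entries
BUCKETLIST_BODY = 144
KEYS_BODY = 144

fixed_bytes = (TAG_SIZE + INDEX_BODY         # primary index
               + TAG_SIZE + BUCKETLIST_BODY  # bucketlist for the MD5
               + TAG_SIZE + KEYS_BODY        # keys section
               + TAG_SIZE)                   # data header (+ N for the value)

extra_index_level = TAG_SIZE + INDEX_BODY    # each additional index level
```

which gives the 1332 + N bytes and 1029 bytes per extra index level claimed
above.
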
=head1 ACTUAL TESTS

=head2 SPEED

Obviously, DBM::Deep isn't going to be as fast as some C-based DBMs, such as
the almighty I<BerkeleyDB>. But it makes up for that in features like true
multi-level hash/array support and cross-platform FTPable files. Even so,
DBM::Deep is still pretty fast, and the speed stays fairly consistent, even
with huge databases. Here is some test data:

  Adding 1,000,000 keys to new DB file...

  At 100 keys, avg. speed is 2,703 keys/sec
  At 200 keys, avg. speed is 2,642 keys/sec
  At 300 keys, avg. speed is 2,598 keys/sec
  At 400 keys, avg. speed is 2,578 keys/sec
  At 500 keys, avg. speed is 2,722 keys/sec
  At 600 keys, avg. speed is 2,628 keys/sec
  At 700 keys, avg. speed is 2,700 keys/sec
  At 800 keys, avg. speed is 2,607 keys/sec
  At 900 keys, avg. speed is 2,190 keys/sec
  At 1,000 keys, avg. speed is 2,570 keys/sec
  At 2,000 keys, avg. speed is 2,417 keys/sec
  At 3,000 keys, avg. speed is 1,982 keys/sec
  At 4,000 keys, avg. speed is 1,568 keys/sec
  At 5,000 keys, avg. speed is 1,533 keys/sec
  At 6,000 keys, avg. speed is 1,787 keys/sec
  At 7,000 keys, avg. speed is 1,977 keys/sec
  At 8,000 keys, avg. speed is 2,028 keys/sec
  At 9,000 keys, avg. speed is 2,077 keys/sec
  At 10,000 keys, avg. speed is 2,031 keys/sec
  At 20,000 keys, avg. speed is 1,970 keys/sec
  At 30,000 keys, avg. speed is 2,050 keys/sec
  At 40,000 keys, avg. speed is 2,073 keys/sec
  At 50,000 keys, avg. speed is 1,973 keys/sec
  At 60,000 keys, avg. speed is 1,914 keys/sec
  At 70,000 keys, avg. speed is 2,091 keys/sec
  At 80,000 keys, avg. speed is 2,103 keys/sec
  At 90,000 keys, avg. speed is 1,886 keys/sec
  At 100,000 keys, avg. speed is 1,970 keys/sec
  At 200,000 keys, avg. speed is 2,053 keys/sec
  At 300,000 keys, avg. speed is 1,697 keys/sec
  At 400,000 keys, avg. speed is 1,838 keys/sec
  At 500,000 keys, avg. speed is 1,941 keys/sec
  At 600,000 keys, avg. speed is 1,930 keys/sec
  At 700,000 keys, avg. speed is 1,735 keys/sec
  At 800,000 keys, avg. speed is 1,795 keys/sec
  At 900,000 keys, avg. speed is 1,221 keys/sec
  At 1,000,000 keys, avg. speed is 1,077 keys/sec

This test was performed on a 1GHz PowerMac G4 running Mac OS X 10.3.2 & Perl
5.8.1, with an 80GB Ultra ATA/100 HD spinning at 7200 RPM. The hash keys and
values were between 6 and 12 characters in length. The DB file ended up at
210MB. Run time was 12 min 3 sec.

=head2 MEMORY USAGE

One of the great things about L<DBM::Deep> is that it uses very little memory.
Even with huge databases (1,000,000+ keys) you will not see much increase in
your process's memory usage. L<DBM::Deep> relies solely on the filesystem for
storing and fetching data. Here is output from I<top> before even opening a
database handle:

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
  22831 root      11   0  2716 2716  1296 R     0.0  0.2   0:07 perl

Basically the process is taking 2,716K of memory. And here is the same
process after storing and fetching 1,000,000 keys:

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
  22831 root      14   0  2772 2772  1328 R     0.0  0.2  13:32 perl

Notice the memory usage increased by only 56K. The test was performed on a
700MHz x86 box running Linux RedHat 7.2 & Perl 5.6.1.

=cut