=head1 NAME

DBM::Deep::Internals

=head1 DESCRIPTION

This is a document describing the internal workings of L<DBM::Deep>. It is
not necessary to read this document if you only intend to be a user. This
document is intended for people who either want a deeper understanding of
the specifics of how L<DBM::Deep> works or who wish to help program
L<DBM::Deep>.

=head1 CLASS LAYOUT

L<DBM::Deep> is broken up into five classes in three inheritance hierarchies.

=over 4

=item *

L<DBM::Deep> is the parent of L<DBM::Deep::Array> and L<DBM::Deep::Hash>.
These classes form the immediate interface to the outside world. They are the
classes that provide the TIE mechanisms as well as the OO methods.

=item *

L<DBM::Deep::Engine> is the layer that deals with the mechanics of reading
and writing to the file. This is where the logic of the file layout is
handled.

=item *

L<DBM::Deep::File> is the layer that deals with the physical file. As a
singleton that every other object has a reference to, it also provides a place
to handle datastructure-wide items, such as transactions.

=back

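Both entry points land in these top-level classes. A minimal example of the
two documented front doors:

    use DBM::Deep;

    # OO interface: returns a DBM::Deep object.
    my $db = DBM::Deep->new( "foo.db" );
    $db->{key} = 'value';

    # TIE interface: the same machinery, via tie().
    tie my %hash, 'DBM::Deep', 'foo.db';
    $hash{key} = 'value';
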
=head1 FILE LAYOUT

DBM::Deep uses a tagged file layout. Every section has a tag, a size, and then
the data.

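Every access therefore follows the same pattern: seek to a location, read the
tag and size, then read that many bytes of data. A minimal sketch, assuming a
one-byte tag followed by a four-byte network-order size (an assumption that is
consistent with the five-byte tag/size reads counted under L</DISK> below):

    # Read one tagged section at $offset. The 'A N' layout (one-byte
    # tag, four-byte big-endian size) is an assumption, not the
    # canonical on-disk format.
    sub read_section {
        my ($fh, $offset) = @_;
        seek $fh, $offset, 0 or die "seek: $!";
        read $fh, my $header, 5 or die "read: $!";
        my ($tag, $size) = unpack 'A N', $header;
        read $fh, my $data, $size;
        return ( $tag, $size, $data );
    }
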
=head2 File header

=over 4

=item * File Signature

The first four bytes are 'DPDB' in network byte order, signifying that this is
a DBM::Deep file.

=item * File tag/size

This is the tagging of the file header. The file used by versions prior to
1.00 had a different fifth byte, allowing the difference to be determined.

=item * Version

This is four bytes containing the header version. This lets the header change
over time.

=item * Transaction information

The current running transactions are stored here, as is the next transaction
ID.

=item * Constants

These are the file-wide constants that determine how the file is laid out.
They can only be set upon file creation.

=back

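For example, opening a file would start with a signature check; a minimal
sketch, assuming only the four-byte 'DPDB' magic described above:

    # Verify the DBM::Deep file signature at the start of the file.
    sub check_signature {
        my ($fh) = @_;
        seek $fh, 0, 0 or die "seek: $!";
        read $fh, my $sig, 4 or die "read: $!";
        die "Not a DBM::Deep file\n" unless $sig eq 'DPDB';
    }
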
=head2 Index

The Index parts can be tagged either as Hash, Array, or Index. The latter
occurs when a bucketlist has grown too large and been reindexed. The others
are the root indices for their respective datatypes. The index consists of a
tag, a size, and then 256 sections containing file locations, one section for
each value representable in a byte.

The index is used as follows: whenever a hashed key is being looked up, the
first byte of the hash is used to determine which location to go to from the
root index. Then, if that location is also an index, the second byte is used,
and so forth until a bucketlist is found.

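A minimal sketch of that walk, reusing C<read_section()> from above. The
four-byte slot width follows from the 1024-byte index body counted under
L</DISK>; the C<'I'> tag letter is a placeholder, not the real on-disk tag:

    use Digest::MD5 qw( md5 );

    sub find_bucketlist {
        my ($fh, $root_loc, $key) = @_;
        my $loc = $root_loc;
        # Consume the MD5 one byte at a time; each byte picks one of
        # the 256 slots in the current index.
        for my $byte ( unpack 'C16', md5( $key ) ) {
            my ($tag, $size, $body) = read_section( $fh, $loc );
            return $loc if $tag ne 'I';    # reached a bucketlist
            $loc = unpack 'N', substr $body, $byte * 4, 4;
        }
        return $loc;
    }
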
=head2 Bucketlist

This is the part that contains the link to the data section. A bucketlist
defaults to being 16 buckets long (modifiable by the I<max_buckets>
parameter used when creating a new file). Each bucket contains an MD5 and
the location of the appropriate key section.

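A minimal sketch of scanning a bucketlist for a key, assuming each bucket is
a 16-byte MD5 followed by a four-byte location (the widths are an assumption;
the fields themselves are described above):

    sub find_bucket {
        my ($bucketlist_body, $md5, $max_buckets) = @_;
        for my $i ( 0 .. $max_buckets - 1 ) {
            my ($bucket_md5, $loc) = unpack 'a16 N',
                substr $bucketlist_body, $i * 20, 20;
            return $loc if $bucket_md5 eq $md5;
        }
        return;    # key is not in this bucketlist
    }
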
=head2 Key area

This is the part that handles transactional awareness. There are
I<max_buckets> sections. Each section contains the location of the data
section, a transaction ID, and whether that transaction considers this key
to be deleted or not.

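A minimal sketch of resolving a key for the current transaction. The
C<'N N C'> field widths are an assumption; the three fields themselves (data
location, transaction ID, deleted flag) come from the description above:

    sub data_loc_for_txn {
        my ($key_area_body, $max_buckets, $txn_id) = @_;
        for my $i ( 0 .. $max_buckets - 1 ) {
            my ($loc, $txn, $deleted) = unpack 'N N C',
                substr $key_area_body, $i * 9, 9;
            next unless $txn == $txn_id;
            return $deleted ? undef : $loc;
        }
        return;    # no entry for this transaction
    }
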
=head2 Data area

This is the part that actually stores the key, value, and class (if
appropriate). The layout is:

=over 4

=item * tag

=item * length of the value

=item * the actual value

=item * keylength

=item * the actual key

=item * a byte indicating if this value has a classname

=item * the classname (if one is there)

=back

The key is stored after the value because the value is requested more often
than the key.

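A minimal sketch of building one such record in the order listed above. The
four-byte lengths and the C<'D'> tag are assumptions; only the field order
comes from the layout description:

    sub pack_data_record {
        my ($key, $value, $class) = @_;
        my $record = 'D';                       # tag (placeholder)
        $record .= pack( 'N', length $value )   # length of the value
                 . $value                       # the actual value
                 . pack( 'N', length $key )     # keylength
                 . $key;                        # the actual key
        $record .= defined $class
            ? "\1" . $class                     # has a classname
            : "\0";                             # no classname
        return $record;
    }
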
=head1 PERFORMANCE

L<DBM::Deep> is written completely in Perl. It is also a multi-process DBM
that uses the datafile itself as the means of synchronizing between multiple
processes. This is unlike most RDBMSes, such as MySQL and Oracle. Furthermore,
unlike all RDBMSes, L<DBM::Deep> stores both the data and the structure of
that data as it would appear in a Perl program.

=head2 CPU

DBM::Deep attempts to be CPU-light. As it stores all the data on disk,
DBM::Deep is I/O-bound, not CPU-bound.

=head2 RAM

DBM::Deep uses extremely little RAM relative to the amount of data you can
access. You can iterate through a million keys (using C<each()>) without
increasing your memory usage at all.

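For example, this loop's memory footprint stays flat no matter how many keys
the database holds (a sketch using the documented TIE interface):

    use DBM::Deep;

    tie my %db, 'DBM::Deep', 'foo.db';

    # Keys are pulled from disk one at a time; nothing accumulates
    # in memory.
    while ( my ($key, $value) = each %db ) {
        print "$key\n";
    }
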
=head2 DISK

DBM::Deep is I/O-bound, pure and simple. The faster your disk, the faster
DBM::Deep will be. Currently, when performing C<my $x = $db-E<gt>{foo}>, there
are a minimum of 4 seeks and 1332 + N bytes read (where N is the length of your
data). (All values assume a medium filesize.) The actions taken are:

=over 4

=item 1 Lock the file

=item 2 Perform a stat() to determine if the inode has changed

=item 3 Go to the primary index for the $db (1 seek)

=item 4 Read the tag/size of the primary index (5 bytes)

=item 5 Read the body of the primary index (1024 bytes)

=item 6 Go to the bucketlist for this MD5 (1 seek)

=item 7 Read the tag/size of the bucketlist (5 bytes)

=item 8 Read the body of the bucketlist (144 bytes)

=item 9 Go to the keys location for this MD5 (1 seek)

=item 10 Read the tag/size of the keys section (5 bytes)

=item 11 Read the body of the keys location (144 bytes)

=item 12 Go to the data section that corresponds to this transaction ID. (1 seek)

=item 13 Read the tag/size of the data section (5 bytes)

=item 14 Read the value for this data (N bytes)

=item 15 Unlock the file

=back

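Tallying the reads in that list confirms the figures quoted above:

       1029 bytes   tag/size + body of the primary index  (5 + 1024)
        149 bytes   tag/size + body of the bucketlist     (5 + 144)
        149 bytes   tag/size + body of the keys section   (5 + 144)
      5 + N bytes   tag/size + value of the data section
    -------------
   1332 + N bytes read in total, over 4 seeks
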
Every additional level of indexing (if there are enough keys) requires an
additional seek and the reading of 1029 additional bytes. If the value is
blessed, an additional seek and 9 + M bytes are read (where M is the length
of the classname).

Arrays are (currently) even worse because they're considered "funny hashes"
with the length stored as just another key. This means that if you do any sort
of lookup with a negative index, this entire process is performed twice: once
for the length and once for the value.

=cut