diff --git a/lib/DBM/Deep/Internals.pod b/lib/DBM/Deep/Internals.pod
index d80c816..132bc9e 100644
--- a/lib/DBM/Deep/Internals.pod
+++ b/lib/DBM/Deep/Internals.pod
@@ -4,33 +4,37 @@ DBM::Deep::Internals

 =head1 DESCRIPTION

-This is a document describing the internal workings of L<DBM::Deep>. It is
+B<NOTE>: This document is out-of-date. It describes an intermediate file
+format used during the development from 0.983 to 1.0000. It will be rewritten
+soon.
+
+This is a document describing the internal workings of L<DBM::Deep>. It is
 not necessary to read this document if you only intend to be a user. This
 document is intended for people who either want a deeper understanding of
-specifics of how L<DBM::Deep> works or who wish to help program
-L<DBM::Deep>.
+specifics of how L<DBM::Deep> works or who wish to help program
+L<DBM::Deep>.

 =head1 CLASS LAYOUT

-L<DBM::Deep> is broken up into five classes in three inheritance hierarchies.
+L<DBM::Deep> is broken up into five classes in three inheritance hierarchies.

 =over 4

 =item *

-L<DBM::Deep> is the parent of L<DBM::Deep::Array> and L<DBM::Deep::Hash>.
+L<DBM::Deep> is the parent of L<DBM::Deep::Array> and L<DBM::Deep::Hash>.
 These classes form the immediate interface to the outside world. They are the
 classes that provide the TIE mechanisms as well as the OO methods.
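+
+For example (a minimal sketch; C<foo.db> is an arbitrary file name), the same
+database can be reached through either interface:
+
+    use DBM::Deep;
+
+    # OO interface: the object behaves like a hash reference.
+    my $db = DBM::Deep->new( "foo.db" );
+    $db->{key} = 'value';
+
+    # TIE interface: the same file through a tied hash.
+    tie my %hash, 'DBM::Deep', "foo.db";
+    print $hash{key}, "\n";    # prints 'value'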

 =item *

-L<DBM::Deep::Engine> is the layer that deals with the mechanics of reading
+L<DBM::Deep::Engine> is the layer that deals with the mechanics of reading
 and writing to the file. This is where the logic of the file layout is
 handled.

 =item *

-L<DBM::Deep::File> is the layer that deals with the physical file. As a
+L<DBM::Deep::File> is the layer that deals with the physical file. As a
 singleton that every other object has a reference to, it also provides a place
 to handle datastructure-wide items, such as transactions.

@@ -57,17 +61,22 @@ This is the tagging of the file header. The file used by versions prior to

 =item * Version

-This is four bytes containing the header version. This lets the header change over time.
+This is four bytes containing the file version. This lets the file format change over time.
+
+=item * Constants
+
+These are the file-wide constants that determine how the file is laid out.
+They can only be set upon file creation.

 =item * Transaction information

 The current running transactions are stored here, as is the next
 transaction ID.

-=item * Constants
+=item * Freespace information

-These are the file-wide constants that determine how the file is laid out.
-They can only be set upon file creation.
+Pointers into the next free sectors of the various sector sizes (Index,
+Bucketlist, and Data) are stored here.

 =back
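+
+As an illustration of the header just described, a reader for the first two
+fields might look like the following sketch. The C<DPDB> signature is the tag
+used by DBM::Deep files; the byte offsets and the big-endian integer encoding,
+however, are assumptions made for this example, not a normative description
+of the layout.
+
+    use strict;
+    use warnings;
+
+    open my $fh, '<:raw', 'foo.db' or die "Cannot open: $!";
+    read( $fh, my $header, 8 ) == 8 or die "Short read";
+
+    # 4-byte signature, then (assumed) a big-endian 32-bit version.
+    my ( $sig, $version ) = unpack 'a4 N', $header;
+    die "Not a DBM::Deep file" unless $sig eq 'DPDB';
+    print "File version: $version\n";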

@@ -126,10 +135,10 @@ than the key.

 =head1 PERFORMANCE

-L<DBM::Deep> is written completely in Perl. It also is a multi-process DBM
+L<DBM::Deep> is written completely in Perl. It also is a multi-process DBM
 that uses the datafile as a method of synchronizing between multiple
 processes. This is unlike most RDBMSes like MySQL and Oracle. Furthermore,
-unlike all RDBMSes, L<DBM::Deep> stores both the data and the structure of
+unlike all RDBMSes, L<DBM::Deep> stores both the data and the structure of
 that data as it would appear in a Perl program.

 =head2 CPU

@@ -148,7 +157,7 @@ increasing your memory usage at all.
 DBM::Deep is I/O-bound, pure and simple. The faster your disk, the faster
 DBM::Deep will be. Currently, when performing C<< $db->{foo} >>, there are a
 minimum of 4 seeks and 1332 + N bytes read (where N is the length of your
-data). (All values assume a medium filesize.) The actions take are:
+data). (All values assume a medium filesize.) The actions taken are:

 =over 4

@@ -194,4 +203,79 @@ with the length stored as just another key. This means that if you do any
 sort of lookup with a negative index, this entire process is performed twice
 - once for the length and once for the value.

+=head1 ACTUAL TESTS
+
+=head2 SPEED
+
+Obviously, DBM::Deep isn't going to be as fast as some C-based DBMs, such as
+the almighty I<BerkeleyDB>. But it makes up for it in features like true
+multi-level hash/array support, and cross-platform FTPable files. Even so,
+DBM::Deep is still pretty fast, and the speed stays fairly consistent, even
+with huge databases. Here is some test data:
+
+ Adding 1,000,000 keys to new DB file...
+
+ At 100 keys, avg. speed is 2,703 keys/sec
+ At 200 keys, avg. speed is 2,642 keys/sec
+ At 300 keys, avg. speed is 2,598 keys/sec
+ At 400 keys, avg. speed is 2,578 keys/sec
+ At 500 keys, avg. speed is 2,722 keys/sec
+ At 600 keys, avg. speed is 2,628 keys/sec
+ At 700 keys, avg. speed is 2,700 keys/sec
+ At 800 keys, avg. speed is 2,607 keys/sec
+ At 900 keys, avg. speed is 2,190 keys/sec
+ At 1,000 keys, avg. speed is 2,570 keys/sec
+ At 2,000 keys, avg. speed is 2,417 keys/sec
+ At 3,000 keys, avg. speed is 1,982 keys/sec
+ At 4,000 keys, avg. speed is 1,568 keys/sec
+ At 5,000 keys, avg. speed is 1,533 keys/sec
+ At 6,000 keys, avg. speed is 1,787 keys/sec
+ At 7,000 keys, avg. speed is 1,977 keys/sec
+ At 8,000 keys, avg. speed is 2,028 keys/sec
+ At 9,000 keys, avg. speed is 2,077 keys/sec
+ At 10,000 keys, avg. speed is 2,031 keys/sec
+ At 20,000 keys, avg. speed is 1,970 keys/sec
+ At 30,000 keys, avg. speed is 2,050 keys/sec
+ At 40,000 keys, avg. speed is 2,073 keys/sec
+ At 50,000 keys, avg. speed is 1,973 keys/sec
+ At 60,000 keys, avg. speed is 1,914 keys/sec
+ At 70,000 keys, avg. speed is 2,091 keys/sec
+ At 80,000 keys, avg. speed is 2,103 keys/sec
+ At 90,000 keys, avg. speed is 1,886 keys/sec
+ At 100,000 keys, avg. speed is 1,970 keys/sec
+ At 200,000 keys, avg. speed is 2,053 keys/sec
+ At 300,000 keys, avg. speed is 1,697 keys/sec
+ At 400,000 keys, avg. speed is 1,838 keys/sec
+ At 500,000 keys, avg. speed is 1,941 keys/sec
+ At 600,000 keys, avg. speed is 1,930 keys/sec
+ At 700,000 keys, avg. speed is 1,735 keys/sec
+ At 800,000 keys, avg. speed is 1,795 keys/sec
+ At 900,000 keys, avg. speed is 1,221 keys/sec
+ At 1,000,000 keys, avg. speed is 1,077 keys/sec
+
+This test was performed on a PowerMac G4 1GHz running Mac OS X 10.3.2 & Perl
+5.8.1, with an 80GB Ultra ATA/100 HD spinning at 7200RPM. The hash keys and
+values were between 6 - 12 chars in length. The DB file ended up at 210MB.
+Run time was 12 min 3 sec.
+
+=head2 MEMORY USAGE
+
+One of the great things about L<DBM::Deep> is that it uses very little memory.
+Even with huge databases (1,000,000+ keys) you will not see much increased
+memory on your process. L<DBM::Deep> relies solely on the filesystem for
+storing and fetching data. Here is output from I<top> before even opening a
+database handle:
+
+   PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
+ 22831 root      11   0  2716 2716  1296 R     0.0  0.2   0:07 perl
+
+Basically the process is taking 2,716K of memory. And here is the same
+process after storing and fetching 1,000,000 keys:
+
+   PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
+ 22831 root      14   0  2772 2772  1328 R     0.0  0.2  13:32 perl
+
+Notice the memory usage increased by only 56K. Test was performed on a 700MHz
+x86 box running Linux RedHat 7.2 & Perl 5.6.1.
+
 =cut