=head0 Adding transactions to DBM::Deep

For the past nine months, I've been working on adding transactions to
L<DBM::Deep|DBM::Deep>. During that time, I looked far and wide for an
accessible description of how a programmer should go about implementing
transactions. The only things I found were either extremely pedantic academic
papers or the code of complex applications. The former weren't very easy to
read and the latter were less so N<Have I<you> ever tried to read the source
for BDB or InnoDB? Reading perl's source is easier.>. This is the article I
wished I'd been able to read nine months ago.

=head1 What is DBM::Deep?

L<DBM::Deep|DBM::Deep> is a module written completely in Perl that provides a
way of storing Perl datastructures (scalars, hashes, and arrays) on disk
instead of in memory. The datafile produced can be transferred from one
machine to another, regardless of OS or Perl version. There are several
reasons why someone would want to do this.

=over 4

=item * Transparent Persistence

This is the ability to save a set of datastructures to disk and retrieve them
later without the vast majority of the program ever knowing that the data is
persisted. Furthermore, the datastructure is persisted immediately and not at
set marshalling periods.

=item * Huge datastructures

Normally, datastructures are limited by the size of RAM the server has.
L<DBM::Deep|DBM::Deep> allows the size of a given datastructure to be limited
by disk size instead (up to the given perl's largefile support).

=item * Database

Most programmers hear the word "database" and think "relational database
management system" (or RDBMS). A database is a more general term meaning
"place one stores data." This can be relational, object, or something else.
The software used to manage and query a database is a "database management
system" (DBMS).

L<DBM::Deep|DBM::Deep> provides one half of a DBMS - the data storage part.
Once the datastructures are on disk, L<DBM::Deep|DBM::Deep> provides the
capability to allow multiple processes to access the data. N<Presto is a start
at providing the other half of the DBMS using DBM::Deep as the engine.>

=back

=head1 How does DBM::Deep work?

L<DBM::Deep|DBM::Deep> works by tying a variable to a file on disk. Every
read and write goes directly to the file and modifies the file immediately. To
represent Perl's hashes and arrays, a record-based file format is used. There
is a file header storing file-wide values, such as the size of the internal
file pointers. Afterwards, there are the data records.

The most important feature of L<DBM::Deep|DBM::Deep> is that it can be
completely transparent. Other than the line tying the variable to the file, no
other part of your program needs to know that the variable being used isn't a
"normal" Perl variable. So, the following works just fine:

  # As far as the rest of the program is concerned, the following two lines
  # are identical - they produce a variable $foo that can be used as a hashref.
  # my $foo = {};
  my $foo = DBM::Deep->new( 'mydb.db' );

  $foo->{bar} = 'baz';
  $foo->{complex} = [
      { a => 'b' }, 0, '123', undef, [ 1 .. 5 ],
  ];

  # And so on and so forth.

=head2 DBM::Deep's file structure

L<DBM::Deep|DBM::Deep>'s file structure is record-based. The key (or array
index - arrays are currently just funny hashes internally) is hashed using MD5
and then stored in a cascade of Index and bucketlist records. The bucketlist
record stores the actual key string and pointers to where the data records are
stored. The data records themselves are one of Null, Scalar, or Reference.
Null represents an I<undef>, Scalar represents a string (numbers are
stringified internally for simplicity) and is allocated in fixed-size chunks.
Reference represents an array or hash reference and contains a pointer to an
Index and bucketlist cascade of its own. A Reference will also store the class
the hash or array reference is blessed into, meaning that almost all objects
can be stored safely.
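
Below is a minimal, in-memory sketch of the lookup flow just described. The
structure and helper names here are illustrative assumptions - the real
engine works with on-disk sectors, not Perl hashes:

  use Digest::MD5 qw( md5 );

  # Walk Index records one digest byte per level until we reach a
  # bucketlist, then scan it for the full key string.
  sub find_bucket {
      my ($node, $key) = @_;
      my @bytes = unpack 'C16', md5( $key );    # 16-byte binary digest

      my $depth = 0;
      while ( $node->{type} eq 'index' ) {
          $node = $node->{slots}[ $bytes[ $depth++ ] ]
              or return;                        # no entry for this key
      }

      # Each bucket holds the key string and a pointer to its data record.
      for my $bucket ( @{ $node->{buckets} } ) {
          return $bucket if $bucket->{key} eq $key;
      }
      return;
  }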

=head2 DBM::Deep's class hierarchy

Managing all of these functions takes a lot of different abstractions. There
are at least 3 different interfacing layers, and more if you look hard enough.
To manage this complexity, L<DBM::Deep|DBM::Deep> uses the following
abstractions:

=over 4

=item * Tying classes

These are the classes that manage the external face of the module. They manage
B<all> of the interactions with outside code - OO interface, tying, and
various utility methods. If they cannot handle the request themselves, they
delegate to the engine (see the sketch after this list). There are currently
three classes in this layer.

=item * Engine classes

These classes manage the file format and all of the ways that the records
interact with each other. Nearly every call will make requests to the File
classes for reading and/or writing data to the file. There are currently nine
classes in this layer, including a class for each record type.

=item * File class

This class mediates all interaction with the file system. Every read, write,
lock, and unlock goes through this class. There is currently one class in this
layer.

=item * Iterator classes

These are introspection classes that provide iteration for hashes. They manage
keeping track of where the next key should be and how to get there. There are
currently four classes in this layer.

=back
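
As a rough illustration of the layering, here is how a tied write might be
delegated downward. The package and method names are assumptions for the
sketch, not L<DBM::Deep|DBM::Deep>'s actual internal API:

  package Example::Tying;

  # The tying layer owns the Perl-facing interface; the engine owns the
  # file format, so all real work is handed off to the engine object.
  sub STORE {
      my ($self, $key, $value) = @_;
      return $self->{engine}->write_value( $self->{base_ref}, $key, $value );
  }

  package Example::Engine;

  # The engine, in turn, never touches the filesystem directly - it asks
  # the File class to do the raw writes at the right offset.
  sub write_value {
      my ($self, $base_ref, $key, $value) = @_;
      return $self->{file}->write_at( $self->{offset_for}{$key}, $value );
  }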

=head1 Why add transactions to DBM::Deep?

For the most part, L<DBM::Deep|DBM::Deep> functions perfectly well without
transactions. Most uses that I've seen tend to be either single-process
read/write persistence or a multi-process readonly cache for an expensive, but
static, lookup. With transactions, L<DBM::Deep|DBM::Deep> can now be used
safely for multi-process read/write persistence in situations that don't
need (or cannot use) a full RDBMS.

=head2 What are transactions?

Originally from the database world, a transaction is a way of isolating the
effects of a given set of actions, then applying them all at once. It's a way
of saying "I'm going to try the following steps and see if I like the result;
if I do, then I want everyone else looking at this datastore to see the
results all at once." The most common example is taken from banking. Let's say
that an application receives a request to have Joe pay Bob five zorkmids.
Without transactions, the application would take the money from Joe's account,
then add the money to Bob's account. But, what happens if the application
crashes after debiting Joe, but before crediting Bob? The application has made
money disappear. Or, vice versa, if Bob is credited before Joe is debited, the
application has created money.

With a transaction wrapping the money transfer, if the application crashes in
the middle, it's as if the action never happened. So, when the application
recovers from the crash, Joe and Bob still have the same amount of money in
their accounts as they did before. The transaction can restart and Bob can
finally receive his zorkmids.

More formally, transactions are generally considered to be proper when they
are ACID-compliant. ACID is an acronym that stands for the following:

=over 4

=item * Atomic

Either every change happens or none of the changes happen.

=item * Consistent

When the transaction begins and when it is committed, the database must be in
a legal state. This condition doesn't apply to L<DBM::Deep|DBM::Deep> as all
Perl datastructures are internally consistent.

=item * Isolated

As far as a transaction is concerned, it is the only thing running against the
database while it is running. Unlike most RDBMSes, L<DBM::Deep|DBM::Deep>
provides the strongest isolation level possible, usually called
I<Serializable> by most RDBMSes.

=item * Durable

Once the database says that a commit has happened, the commit is guaranteed,
regardless of what happens afterwards. I chose to not implement this
condition in L<DBM::Deep|DBM::Deep> N<This condition requires the presence of
at least one more file, which violates one of the original design goals.>.

=back

=head2 Why add them to DBM::Deep?

The ability to have actions occur either I<atomically> (as in the previous
example) or in I<isolation> from the rest of the users of the data is a
powerful thing. This allows for a large amount of safety and predictability in
how data transformations occur. Imagine, for example, that you have a set of
calculations that will update various variables. However, there are some
situations that will cause you to throw away all results and start over with a
different seed. Without transactions, you would have to put everything into
temporary variables, then transfer the values when the calculations were found
to be successful. If you ever add a new value or if a value is used in only
certain calculations, you may forget to do the correct thing. With
transactions, you start a transaction and do your thing within it. If the
calculations succeed, you commit. If they fail, you rollback and try again.

If you're thinking that this is very similar to how Subversion (SVN) or CVS
works, you're absolutely correct - they are transactional in exactly the same
way.
315b34eb 203
6c657dd3 204=head1 How it happened
d7866e36 205
2cfc2645 206The addition of transactions to L<DBM::Deep|DBM::Deep> has easily been the
6c657dd3 207single most complex software endeavor I've ever undertaken. While transactions
208are conceptually simple, the devil is in the details. And there were a B<lot>
209of details.
210
211=head2 The naive approach
212
213Initially, I hoped I could just copy the entire datastructure and mark it as
214owned by the transaction. This is the most straightforward solution and is
215extremely simple to implement. Whenever a transaction starts, copy the whole
216thing over to somewhere else. If the transaction is committed, overwrite the
217original with the transaction's version. If it's rolled back, throw it away.
218
219It's a popular solution as seen by the fact that it's the mechanism used in
220both L<Data::Transactional|Data::Transactional> and
221L<Tie::Scalar::Transactional|Tie::Scalar::Transactional>. While very simple to
222implement, it scales very poorly as the datastructure grows. As one of the
223primary usecases for L<DBM::Deep|DBM::Deep> is working with huge
224datastructures, this plan was dead on arrival.

=head2 The relational approach

As I'm also a MySQL DBA, I looked to how the InnoDB engine implements
transactions. Given that relational databases are designed to work with large
amounts of data, it made sense to look here next.

InnoDB implements transactions using MVCC
N<http://en.wikipedia.org/wiki/Multiversion_concurrency_control>. When a
transaction starts, it stores a timestamp corresponding to its start time.
Whenever a modification to a row is committed, the modification is
timestamped. When a transaction modifies a row, it copies the row into its
own scratchpad and modifies it. Whenever a transaction reads a row, it first
attempts to read the row from its scratchpad. If it's not there, then it reads
the version of the row whose timestamp is no later than the timestamp of the
transaction. When committing, the transaction's scratchpad is written out to
the main data area with the timestamp of the commit and the scratchpad is
thrown away. When rolling back, the scratchpad is thrown out.
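
A toy version of that read rule, following the description above. All of the
names and structures here are illustrative, not InnoDB's actual internals:

  # %$versions maps a row id to a list of timestamped committed versions.
  sub mvcc_read {
      my ($txn, $versions, $row_id) = @_;

      # A transaction always sees its own uncommitted changes first.
      return $txn->{scratchpad}{$row_id}
          if exists $txn->{scratchpad}{$row_id};

      # Otherwise, the newest committed version whose timestamp is no
      # later than the transaction's start time.
      my ($visible) =
          sort { $b->{ts} <=> $a->{ts} }
          grep { $_->{ts} <= $txn->{start_ts} }
          @{ $versions->{$row_id} || [] };

      return $visible ? $visible->{value} : undef;
  }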

At first, this mechanism looked promising and I whipped up a couple of spikes
(or code explorations) to try it out. The problem I ran into, again, was the
existence of large datastructures. When making large changes to a relational
database within a transaction, the engine can store the rows within the actual
table and mark them as being part of a transaction's scratchpad. Perl's
fractal datastructures, however, don't lend themselves to this kind of
treatment. The scratchpad would, in some pathological cases, be a
near-complete copy of the original datastructure. N<Funnily enough, this is
yet another example of the object-relational impedance mismatch
(http://en.wikipedia.org/wiki/Object-Relational_impedance_mismatch).>

=head2 The subversive approach

Despairing, I went to YAPC::NA::2006 hoping to discuss the problem with the
best minds in the Perl community. I was lucky enough to run into both Audrey
Tang (author of Pugs) and clkao (author of SVK). In between talks, I managed
to discuss the problems I'd run into with both of them. They looked at me
oddly and asked why I wasn't looking at Subversion (SVN) as a model for
transactions. My first reaction was "It's a source control application. What
does it know about transa- . . . Ohhhh!" And they smiled.

Like Perl datastructures, a filesystem is fractal. Directories contain both
files and directories. Directories act as hashes and files act as scalars
whose names are their hashkeys. When a modification is made to a SVN checkout,
SVN tracks the changes at the filename (or directory name) level. When a
commit is made, only those filenames which have changes are copied over to the
HEAD. Everything else remains untouched.

Translating this to hashes and hashkeys, this implies that transactional
information should be stored at the level of the hashkey. Or, in
L<DBM::Deep|DBM::Deep> terms, within the bucket for that key. As a nice
side-effect, other than the key's datastructure within the bucket, the entire
datafile is unaware of anything to do with transactions.

=head2 The spike

Spikes are kind of like a reconnaissance mission in the military. They go out
to get intel on the enemy and are explicitly not supposed to take any ground
or, in many cases, take out any of the enemy forces. In coding terms, the
spike is code meant to explore a problemspace that you B<will> throw away and
reimplement.

As transactions were going to be between the bucket for the key and the
datapointer to the value, my first thought was to put in another sector that
would handle this mapping. This had the advantage of changing nothing except
for adding one new sector type and the handling for it. Just doing this got me
to the point where I could pass the following test:

  my $db1 = DBM::Deep->new( $filename );
  my $db2 = DBM::Deep->new( $filename );

  $db1->{abc} = 'foo';

  is( $db1->{abc}, 'foo' );
  is( $db2->{abc}, 'foo' );

  $db1->begin_work();

  $db1->{abc} = 'floober';

  is( $db1->{abc}, 'floober' );
  is( $db2->{abc}, 'foo' );

Just that much was a major accomplishment.

=head2 Tests, tests, and more tests

I was lucky that, when I took over L<DBM::Deep|DBM::Deep>, Joe Huckaby (the
original author) handed me a comprehensive test suite. This meant that I
could add in transactions with a high degree of confidence that I hadn't
messed up non-transactional uses. The test suite was also invaluable for
working through the various situations that transactions can cause.

But, a test is only as good as the test-writer. For example, it was a while
before I realized that I needed to test C<is( $db1-E<gt>{abc}, 'foo' )>
I<before> modifying it in the transaction.

To pass that test, the code for retrieval needed to look first in the
transaction's spot and if that spot had never been assigned to, look at the
spot for the HEAD. While this is how SVN works, it wasn't an immediately
obvious test to write.
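
In bucket terms, the read path boils down to something like this sketch. The
field names are assumptions for illustration; slot 0 is the HEAD, as explained
in the next section:

  # Return the datapointer a given transaction should see for a bucket.
  sub get_data_pointer {
      my ($bucket, $trans_id) = @_;

      # A transaction sees its own write, if it has made one ...
      my $ptr = $bucket->{slots}[ $trans_id ];
      return $ptr if defined $ptr;

      # ... and otherwise falls through to the HEAD's value.
      return $bucket->{slots}[ 0 ];
  }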

=head2 The HEAD

In SVN, the HEAD revision is the latest revision checked into the repository.
When you do a local modification, you're doing a modification to your copy of
the HEAD. Then, you choose to either check in (C<commit()>) or revert
(C<rollback()>) your changes.

In order to make the code work for the base case (no transaction running), the
first entry in the transaction sector became the HEAD. Thus, it was assigned
transaction ID 0. This also had the really neat side-benefit that C<if (
$trans_id ) {}> will run the code if and only if L<DBM::Deep|DBM::Deep> is
in a running transaction.

=head2 Ending the spike

At this point, I had learned everything I needed from the spike. Yes, the
SVN idea looked like it was going to work. Yes, there were a lot of squibbly
details. No, it wasn't going to be finished before I left YAPC::NA. *sigh*

The biggest lessons learned from the spike were:

=over 4

=item 1 Tests are good

I seem to have to relearn this every project I work on. It's pretty sad, if
you ask me.

=item 2 The transaction sector is superfluous

As I worked with it, the transaction sector didn't add any information over
extending the actual bucket to have the transaction to datapointer mapping
within it.

=back

=head2 Protection from changes

After the missed test for checking that starting a transaction didn't lose the
connection to the HEAD, I started writing more and more tests, being very anal
about what I was checking. I wrote tests to check every piece of state I could
think of before and after every change in state, regardless of where the
change was made. Writing these tests immediately raised an issue with changing
the HEAD while a transaction is running. If the transaction has already edited
that key, it already has its new value. However, if it doesn't, it needs to be
protected from the change to the HEAD. This is the key piece for providing
I<Isolation>.

My first attempt to solve this problem focused on having the transaction
itself detect changes. But, the primary usecase for transactions is that each
transaction is going to be running in a separate process. Without implementing
IPC, the only common point between various processes is the datafile itself.
The only process aware of the change is the process making the change. Even
though it seemed counter-intuitive, the only sane mechanism was that each
process modifying the HEAD would also protect all running transactions from
its change, if needed.
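
In sketch form, the writing process does the protection itself before it
touches the HEAD's slot (again, the bucket layout here is illustrative):

  # Called by whichever process is about to overwrite the HEAD's value.
  sub write_head {
      my ($bucket, $new_ptr, @running_trans_ids) = @_;

      # Any running transaction that hasn't written to this bucket gets
      # a private copy of the old HEAD value, preserving its isolation.
      for my $id ( @running_trans_ids ) {
          $bucket->{slots}[ $id ] = $bucket->{slots}[ 0 ]
              unless defined $bucket->{slots}[ $id ];
      }

      # Only now is it safe to change what the HEAD points to.
      $bucket->{slots}[ 0 ] = $new_ptr;
  }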

=head2 Committing and rolling back

Now that changes are able to be made within a transaction and the transaction,
the HEAD, and other transactions are protected from one another, the next step
was to provide the ability to both commit and rollback these changes.

=head3 Rollback

Conceptually, rolling back should be the simpler to implement - just discard
the changes that have been made within the transaction and continue onward
with the HEAD. And, for the first attempt, that is exactly what I did. This
meant that the following test would pass:

  $db->{foo} = 'bar';

  $db->begin_work;

  is( $db->{foo}, 'bar' );

  $db->{foo} = 'baz';

  is( $db->{foo}, 'baz' );

  $db->rollback;

  is( $db->{foo}, 'bar' );

But, this didn't work out very well for repeated use of that transaction slot.
I threw a number of solutions at the problem, but none of them were
addressing the real issue - knowing which use of a transaction ID made the
change vs. which use of a transaction ID was accessing the value. The eventual
solution was the staleness counter described in L</Transactional staleness
counters>.

=head3 Committing

Committing is much harder than rolling back. The primary difficulty lies in
tracking exactly what this transaction has changed in order to copy those
changed bucket entries over to the HEAD. The good news is that only the actual
datapointers for that transaction need to be copied over - the actual data
sectors are left untouched.

The key to the solution lay in the decoupled nature of the code I was writing
along with the fact that every piece of the code had access to the engine
object, if needed. Committing (and rolling back) are both handled by the
Engine object. To get that information into the engine, each bucket modified
by the transaction would inform the engine object that it had been modified by
that transaction. When a commit occurs, the engine object iterates over the
modified buckets and transfers over the new datapointer and discards the old
one.
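
A sketch of that bookkeeping, with illustrative names (the real engine works
with on-disk sectors rather than Perl structures):

  # Each write within a transaction registers its bucket with the engine.
  sub mark_modified {
      my ($engine, $trans_id, $bucket) = @_;
      push @{ $engine->{modified}{ $trans_id } }, $bucket;
  }

  # On commit, only the datapointers move; the data sectors stay put.
  sub commit {
      my ($engine, $trans_id) = @_;

      my $buckets = delete $engine->{modified}{ $trans_id };
      for my $bucket ( @{ $buckets || [] } ) {
          $bucket->{slots}[ 0 ]         = $bucket->{slots}[ $trans_id ];
          $bucket->{slots}[ $trans_id ] = undef;
      }
  }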

=head2 Deleted marker

After some more tests, a final edge-case was found. Transactions are performed
copy-on-write. This means that if there isn't an entry for that transaction,
the HEAD is looked at. This doesn't work if a key has been deleted within a
transaction. So, the entry must be marked as deleted within the transaction so
that the HEAD isn't checked.

Likewise, when a new key is created in a transaction, the HEAD doesn't have an
entry for that key. Consider the following situation:

  ok( !exists $db1->{foo} );
  ok( !exists $db2->{foo} );

  $db1->begin_work();
  $db1->{foo} = 'bar';

  ok( !exists $db2->{foo} );

The entry for the HEAD for 'foo' needs to be marked as deleted so that
transactions which don't have 'foo' don't find something in the HEAD. To add
this, I originally used a separate flag for each datapointer to indicate if it
had been marked as deleted or not. I quickly recognized that a datapointer
can never have a value of 0 or 1 as those would point to the first and second
bytes of the datafile, respectively. As these are part of the header, those
are nonsensical values, so they can be re-used for metadata. 0 now means "This
slot has never been written to" and 1 means "This slot has been explicitly
deleted."
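
Folding those sentinel values into the read path sketched earlier gives
roughly this (still with illustrative field names):

  use constant {
      SLOT_EMPTY   => 0,    # never written: fall through to the HEAD
      SLOT_DELETED => 1,    # explicitly deleted at this level
  };

  sub resolve_pointer {
      my ($bucket, $trans_id) = @_;

      my $ptr = $bucket->{slots}[ $trans_id ] || SLOT_EMPTY;
      return undef if $ptr == SLOT_DELETED;
      return $ptr  if $ptr != SLOT_EMPTY;       # a real file offset

      # Fall back to the HEAD, which may itself be empty or deleted.
      my $head = $bucket->{slots}[ 0 ] || SLOT_EMPTY;
      return ( $head == SLOT_EMPTY || $head == SLOT_DELETED ) ? undef : $head;
  }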

=head2 Freespace management

Pre-1.0000 versions of L<DBM::Deep|DBM::Deep> didn't have any form of
freespace management. This meant that whenever a value was deleted, the old
value just sat around taking up space, even though it would never be accessed
again. While barely acceptable for non-transactional uses, this made
transactions unusable because transactions, as I've implemented them, are
predicated on the concept of parallel values that are (presumably) cleaned up
after the transaction is done with them.

Freespace had never been added before because it requires a different file
format than the one used in the pre-1.0000 versions. Because I had to change
the file format anyways B<and> I needed the feature, adding freespace now
seemed like a good plan.

Freespace was implemented by regularizing all the records so that
L<DBM::Deep|DBM::Deep> only has three different record sizes - Index,
BucketList, and Data. Each record type has a fixed length based on various
parameters the L<DBM::Deep|DBM::Deep> datafile is created with. (In order to
accommodate values of various sizes, Data records chain.) Whenever the engine
is finished with a sector, it is freed and added to a list of free sectors of
that sector's size. Whenever a new sector is requested, the freelist is
checked first. If the freelist has a sector, it's reused, otherwise a new
sector is added to the end of the datafile.
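
A toy version of that allocation policy, with one freelist per sector size.
This is an in-memory sketch; where and how the real freelists are stored is a
file-format detail not covered here:

  my %freelist;    # sector size => [ file offsets of freed sectors ]

  sub free_sector {
      my ($size, $offset) = @_;
      push @{ $freelist{$size} }, $offset;
  }

  sub alloc_sector {
      my ($fh, $size) = @_;

      # Reuse a freed sector of the right size if one is available ...
      if ( my $list = $freelist{$size} ) {
          return pop @$list if @$list;
      }

      # ... otherwise extend the file by one sector.
      seek $fh, 0, 2;    # SEEK_END
      my $offset = tell $fh;
      # (the caller then writes $size bytes at $offset)
      return $offset;
  }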

Just like everything else, I wrote a mess of tests for adding freespace
management. One of the tests I thought up was the following:

  $db->{foo} = [ 1 .. 3 ];
  my $arr = $db->{foo};

  is( $arr->[1], 2 ); # This always has worked.

  delete $db->{foo};

  isnt( $arr->[1], 2 );

If this were a Perl datastructure, the last test should pass. In the past,
that test would fail. The key concept I realized was that the C<$arr>
variable is pointing to a stale area in memory. So, by adding a staleness
counter that is incremented whenever the sector in use is deleted, I am able
to determine if the variable in question is looking for a stale version of
the sector. At this point, L<DBM::Deep|DBM::Deep> returns undef because, at
some point, the entry was deleted.
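
The check itself is tiny - each reference remembers the counter value from
when it was created, and the sector's counter is bumped every time the sector
is freed. A sketch with made-up field names:

  sub free_and_bump {
      my ($sector) = @_;
      $sector->{staleness}++;    # anything still pointing here is now stale
      # ... the sector then goes onto the freelist ...
  }

  sub is_stale {
      my ($reference, $sector) = @_;
      return $reference->{staleness} != $sector->{staleness};
  }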

=head2 Transactional staleness counters

Once it was implemented for freespace management, staleness counters proved to
be a very powerful concept for transactions themselves. Back in L</Protection
from changes>, I mentioned that other processes modifying the HEAD will
protect all running transactions from their effects. This provides
I<Isolation>. But, the running transaction doesn't know about these entries.
This is both a benefit and a drawback. It's a benefit that it makes tracking
modified buckets very simple (q.v. L</Committing>). But, it means that changes
made to protect the transaction are not tracked. If they're not cleaned up,
they will be seen the next time a transaction uses that transaction ID.

By providing a staleness counter for transactions, the cost of cleaning up
finished transactions is deferred until the space is actually used again. This
is at the cost of having less-than-optimal space utilization. Changing this in
the future would be completely transparent to users, so I felt it was an
acceptable tradeoff for quick delivery of a functional product.

=head2 Fiddly bits

At this point, all the major pieces were in place. All that was left was to
get all the fiddly bits into place. This included handling the behavior of
C<keys()>, simultaneous transactions with commits and rollbacks in various
orders, and making sure that transactions played nicely when a new Index
sector needed to be created due to reindexing a full Bucketlist sector. Of
these, C<keys()> was the hardest. This is when I actually implemented the
Iterator classes to handle walking the index/bucketlist chain.

=head1 The future

Basic transactions are only the first step. There are several features that
can be added on top of what's been provided. If, and in which order, any of
these are implemented is completely up to user feedback. (Note: these are
advanced topics - I cannot be held responsible for any bleeding ears.)

=head2 Conflict resolution

Conflict, in this context, is what happens when two transactions run
simultaneously and change the same piece of data. Like most relational
databases, L<DBM::Deep|DBM::Deep> uses a very simplistic form of conflict
resolution - last commit wins. This works quite well for a row-based RDBMS,
but doesn't work as well for fractal structures like hashes.

Contrast this with how Subversion handles conflict. It tracks when each
transaction was started. If the HEAD was changed after the transaction
started, the commit is rejected. It is up to the developer to pull in the
latest changes, mediate any conflicts, and then recommit. There are several
other ways to handle conflict resolution, many of which can be pulled from
Haskell's use of Software Transactional Memory (STM).
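
The SVN-style check is small enough to sketch here. Note that this is a
possible future behavior, not something L<DBM::Deep|DBM::Deep> does today,
and the version fields are invented for the illustration:

  # Reject the commit if the HEAD moved after this transaction began.
  sub try_commit {
      my ($engine, $txn) = @_;

      if ( $engine->{head_version} > $txn->{start_version} ) {
          die "Conflict: HEAD changed since this transaction began\n";
      }

      $engine->{head_version}++;
      # ... proceed with the normal commit path ...
  }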

=head2 Checkpoints

A transaction may have several steps within it. The first three may succeed,
but the fourth might fail. Instead of rolling back all the way to the
beginning, you might want to rollback to the last successful step and try
again. This is a checkpoint. Most RDBMSes provide them and they aren't very
difficult, conceptually, but I've seen just how "easy" some features can be
once you start really exploring the problemspace.

=head2 Sub-transactions

Similar to L</Checkpoints>, sub-transactions provide a mechanism for trying
something within a transaction without affecting the transaction. However,
instead of saying "These steps are safely finished," sub-transactions still
provide for the ability to rollback the primary transaction all the way.
Sub-transactions can also work better with libraries that may want to use
transactions themselves.

This, of all the features listed, is the one I'm most interested in
implementing next.

=head2 Durability

As mentioned in L</What are transactions?>, the 'D' in ACID stands for
I<Durable>. L<DBM::Deep|DBM::Deep> does not satisfy that criterion because
durability almost always requires another file (or files) for a commit log. I
deemed this unacceptable for this release because one of
L<DBM::Deep|DBM::Deep>'s features is the single datafile. To be honest, I
don't anticipate this being an issue for most users because the niche that
L<DBM::Deep|DBM::Deep> occupies is one that is tolerant to failure and a small
chance of potential dataloss.

However, Berkeley DB does provide durability with only a single file. If it
becomes necessary, cribbing from them could be an option.

=cut