package DBM::Deep::Engine;

use 5.006_000;

use strict;
use warnings FATAL => 'all';

use DBM::Deep::Iterator ();

# File-wide notes:
# * Every method in here assumes that the storage has been appropriately
#   safeguarded. This can be anything from flock() to some sort of manual
#   mutex. But, it's the caller's responsibility to make sure that this has
#   been done.

sub SIG_HASH     () { 'H' }
sub SIG_ARRAY    () { 'A' }

=head1 NAME

DBM::Deep::Engine

=head1 PURPOSE

This is an internal-use-only object for L<DBM::Deep/>. It mediates the low-level
mapping between the L<DBM::Deep/> objects and the storage medium.

The purpose of this documentation is to provide low-level documentation for
developers. It is B<not> intended to be used by the general public. This
documentation and what it documents can and will change without notice.

=head1 OVERVIEW

The engine exposes an API to the DBM::Deep objects (DBM::Deep, DBM::Deep::Array,
and DBM::Deep::Hash) for their use to access the actual stored values. This API
is the following:

=over 4

=item * new

=item * read_value

=item * get_classname

=item * make_reference

=item * key_exists

=item * delete_key

=item * write_value

=item * get_next_key

=item * setup

=item * begin_work

=item * commit

=item * rollback

=item * lock_exclusive

=item * lock_shared

=item * unlock

=back

They are explained in their own sections below. These methods, in turn, may
provide some bounds-checking, but primarily act to instantiate objects in the
Engine::Sector::* hierarchy and dispatch to them.

=head1 TRANSACTIONS

Transactions in DBM::Deep are implemented using a variant of MVCC. This attempts
to keep the amount of actual work done against the file low while still providing
Atomicity, Consistency, and Isolation. Durability, unfortunately, cannot be done
with only one file.

=head2 STALENESS

If another process uses a transaction slot and writes data to it, then
terminates, the data that process wrote is still within the file. To address
this, a transaction staleness counter is associated with every write. Each
time a transaction is started, the process increments that transaction's
staleness counter. If, when a value is read, the staleness counters aren't
identical, DBM::Deep will consider the value on disk to be stale and discard
it.
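
As a rough illustration of the staleness check described above, the following
self-contained sketch models a single transaction slot in memory. The names
C<%slot_staleness>, C<start_transaction>, C<write_in_slot>, and
C<read_from_slot> are hypothetical; they do not correspond to the on-disk
format or to any DBM::Deep API.

```perl

    use strict;
    use warnings;

    # Hypothetical in-memory model: one staleness counter per transaction slot.
    my %slot_staleness = ( 1 => 0 );
    my %entries;    # key => { slot, staleness, value }

    # Starting a transaction in a slot bumps that slot's counter.
    sub start_transaction {
        my ($slot) = @_;
        return ++$slot_staleness{$slot};
    }

    # Every write records the slot's counter as it was at write time.
    sub write_in_slot {
        my ($slot, $key, $value) = @_;
        $entries{$key} = {
            slot      => $slot,
            staleness => $slot_staleness{$slot},
            value     => $value,
        };
        return 1;
    }

    # A read discards entries whose recorded counter no longer matches.
    sub read_from_slot {
        my ($slot, $key) = @_;
        my $entry = $entries{$key} or return;
        return unless $entry->{slot} == $slot;
        if ( $entry->{staleness} != $slot_staleness{$slot} ) {
            delete $entries{$key};    # leftovers from a terminated process
            return;
        }
        return $entry->{value};
    }

    start_transaction(1);                     # process A takes slot 1
    write_in_slot( 1, foo => 'from A' );
    # ... process A terminates without committing ...
    start_transaction(1);                     # process B reuses slot 1
    my $seen = read_from_slot( 1, 'foo' );    # A's write no longer matches

```

Comparing counters at read time means nothing has to be cleaned up when a
process dies; leftovers are simply ignored the next time the slot is used.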

=head2 DURABILITY

The fourth leg of ACID is Durability, the guarantee that when a commit returns,
the data will be there the next time you read from it. This should hold
regardless of any crashes or powerdowns in between the commit and subsequent read.
DBM::Deep does provide that guarantee; once the commit returns, all of the data
has been transferred from the transaction shadow to the HEAD. The issue arises
with partial commits - a commit that is interrupted in some fashion. In keeping
with DBM::Deep's "tradition" of very light error-checking and non-existent
error-handling, there is no way to recover from a partial commit. (This is
probably a failure in Consistency as well as Durability.)

Other DBMSes use transaction logs (a separate file, generally) to achieve
Durability.  As DBM::Deep is a single-file DBM, we would have to do something
similar to what SQLite and BDB do in terms of committing using synchronized
writes. To do this, we would have to use a much higher RAM footprint and some
serious programming that makes my head hurt just to think about it.

=cut

=head2 read_value( $obj, $key )

This takes an object that provides _base_offset() and a string. It returns the
value stored in the corresponding Sector::Value's data section.

=cut

sub read_value { die "read_value must be implemented in a child class" }

=head2 get_classname( $obj )

This takes an object that provides _base_offset() and returns the classname (if
any) associated with it.

It delegates to Sector::Reference::get_classname() for the heavy lifting.

It performs a staleness check.

=cut

sub get_classname { die "get_classname must be implemented in a child class" }

=head2 make_reference( $obj, $old_key, $new_key )

This takes an object that provides _base_offset() and two strings. The
strings correspond to the old key and new key, respectively. This operation
is equivalent to (given C<< $db->{foo} = []; >>) C<< $db->{bar} = $db->{foo} >>.

This returns nothing.

=cut

sub make_reference { die "make_reference must be implemented in a child class" }

=head2 key_exists( $obj, $key )

This takes an object that provides _base_offset() and a string for
the key to be checked. This returns 1 for true and "" for false.

=cut

sub key_exists { die "key_exists must be implemented in a child class" }

=head2 delete_key( $obj, $key )

This takes an object that provides _base_offset() and a string for
the key to be deleted. This returns the result of the Sector::Reference
delete_key() method.

=cut

sub delete_key { die "delete_key must be implemented in a child class" }

=head2 write_value( $obj, $key, $value )

This takes an object that provides _base_offset(), a string for the
key, and a value. This value can be anything storable within L<DBM::Deep/>.

This returns 1 upon success.

=cut

sub write_value { die "write_value must be implemented in a child class" }

=head2 setup( $obj )

This takes an object that provides _base_offset(). It will do everything needed
in order to properly initialize all values for necessary functioning. If this is
called upon an already initialized object, this will also reset the inode.

This returns 1.

=cut

sub setup { die "setup must be implemented in a child class" }

=head2 begin_work( $obj )

This takes an object that provides _base_offset(). It will set up all necessary
bookkeeping in order to run all work within a transaction.

If $obj is already within a transaction, an error will be thrown. If there are
no more available transactions, an error will be thrown.

This returns undef.

=cut

sub begin_work { die "begin_work must be implemented in a child class" }

=head2 rollback( $obj )

This takes an object that provides _base_offset(). It will revert all
actions taken within the running transaction.

If $obj is not within a transaction, an error will be thrown.

This returns 1.

=cut

sub rollback { die "rollback must be implemented in a child class" }

=head2 commit( $obj )

This takes an object that provides _base_offset(). It will apply all
actions taken within the transaction to the HEAD.

If $obj is not within a transaction, an error will be thrown.

This returns 1.
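
Taken together, begin_work(), commit(), and rollback() are typically driven
in a commit-on-success, rollback-on-failure pattern. The sketch below shows
that pattern against a hypothetical in-memory stand-in, C<My::MockEngine>,
which only mimics the contract documented here (errors on misuse, commit and
rollback returning 1). It is not how a real engine child class stores data.

```perl

    use strict;
    use warnings;

    # Hypothetical stand-in for an engine child class. The $obj argument is
    # accepted to match the documented signatures but is otherwise unused.
    package My::MockEngine;

    sub new {
        my $class = shift;
        return bless { head => {}, shadow => undef }, $class;
    }

    sub begin_work {
        my ($self, $obj) = @_;
        die "Cannot begin_work within an active transaction\n"
            if $self->{shadow};
        $self->{shadow} = { %{ $self->{head} } };    # private working copy
        return;
    }

    sub write_value {
        my ($self, $obj, $key, $value) = @_;
        ( $self->{shadow} || $self->{head} )->{$key} = $value;
        return 1;
    }

    sub commit {
        my ($self, $obj) = @_;
        die "Cannot commit without an active transaction\n"
            unless $self->{shadow};
        $self->{head}   = $self->{shadow};           # apply to the HEAD
        $self->{shadow} = undef;
        return 1;
    }

    sub rollback {
        my ($self, $obj) = @_;
        die "Cannot rollback without an active transaction\n"
            unless $self->{shadow};
        $self->{shadow} = undef;                     # drop the working copy
        return 1;
    }

    package main;

    my $engine = My::MockEngine->new;
    my $obj    = {};    # stands in for a DBM::Deep object

    # Commit on success, roll back if anything inside the block dies.
    $engine->begin_work($obj);
    my $ok = eval {
        $engine->write_value( $obj, foo => 1 );
        die "simulated failure\n";
        1;
    };
    if ($ok) { $engine->commit($obj) } else { $engine->rollback($obj) }

```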

=cut

sub commit { die "commit must be implemented in a child class" }

=head2 get_next_key( $obj, $prev_key )

This takes an object that provides _base_offset() and an optional string
representing the prior key returned via a prior invocation of this method.

This method delegates to C<< DBM::Deep::Iterator->get_next_key() >>.
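
From the caller's side, iteration starts with a call that passes no key and
continues by passing back each key just returned. The sketch below shows that
loop using hypothetical stand-ins (C<My::ListIterator> and C<My::MockEngine>)
backed by a plain list; the real iterator walks sectors instead.

```perl

    use strict;
    use warnings;

    # Hypothetical list-backed iterator.
    package My::ListIterator;
    sub new {
        my ($class, $keys) = @_;
        return bless { keys => [@$keys], pos => 0 }, $class;
    }
    sub get_next_key {
        my ($self) = @_;
        return $self->{keys}[ $self->{pos}++ ];    # undef once exhausted
    }

    # Hypothetical engine whose get_next_key() has the same overall shape
    # as the method documented here.
    package My::MockEngine;
    sub new {
        my ($class, @keys) = @_;
        return bless { keys => \@keys }, $class;
    }
    sub get_next_key {
        my ($self, $obj, $prev_key) = @_;
        # No previous key means "start over": build a fresh iterator.
        $obj->{iterator} = My::ListIterator->new( $self->{keys} )
            unless defined $prev_key;
        return $obj->{iterator}->get_next_key($obj);
    }

    package main;

    my $engine = My::MockEngine->new(qw( apple banana cherry ));
    my $obj    = {};    # stands in for a DBM::Deep object
    my @seen;

    # First call passes no key; later calls pass the key just returned.
    my $key = $engine->get_next_key($obj);
    while ( defined $key ) {
        push @seen, $key;
        $key = $engine->get_next_key( $obj, $key );
    }

```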

=cut

# XXX Add staleness here
sub get_next_key {
    my $self = shift;
    my ($obj, $prev_key) = @_;

    # XXX Need to add logic about resetting the iterator if any key in the
    # reference has changed
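    # Test definedness, not truth, so that a previous key of "0" or "" does
    # not restart the iteration.
    unless ( defined $prev_key ) {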
        $obj->{iterator} = $self->iterator_class->new({
            base_offset => $obj->_base_offset,
            engine      => $self,
        });
    }

    return $obj->{iterator}->get_next_key( $obj );
}

=head2 lock_exclusive( $obj )

This takes an object that provides _base_offset(). It will guarantee that
the storage has taken precautions to be safe for a write.

This returns the result of the storage's lock_exclusive() method.

=cut

sub lock_exclusive {
    my $self = shift;
    my ($obj) = @_;
    return $self->storage->lock_exclusive( $obj );
}

=head2 lock_shared( $obj )

This takes an object that provides _base_offset(). It will guarantee that
the storage has taken precautions to be safe for a read.

This returns the result of the storage's lock_shared() method.

=cut

sub lock_shared {
    my $self = shift;
    my ($obj) = @_;
    return $self->storage->lock_shared( $obj );
}

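=head2 unlock( $obj )

This takes an object that provides _base_offset(). It will guarantee that
the storage has released the most recently-taken lock. If the storage's
unlock() returns a true value, flush() is called as well.

This returns the result of the storage's unlock() method.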
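
The sketch below pairs lock and unlock calls and shows when the flush happens.
It assumes a storage object that counts nested locks and returns a true value
from unlock() only when the last lock is released; that behaviour, and the
classes C<My::MockStorage> and C<My::MockEngine>, are assumptions made for
illustration rather than a description of any particular storage class.

```perl

    use strict;
    use warnings;

    # Hypothetical storage: counts nested locks and counts flushes.
    package My::MockStorage;
    sub new {
        my $class = shift;
        return bless { depth => 0, flushes => 0 }, $class;
    }
    sub lock_exclusive { my $self = shift; $self->{depth}++; return 1 }
    sub unlock {
        my $self = shift;
        return 0 unless $self->{depth};
        return --$self->{depth} == 0 ? 1 : 0;    # true only for the last lock
    }
    sub flush { my $self = shift; $self->{flushes}++; return }

    # Hypothetical engine mirroring the delegation shown in this file.
    package My::MockEngine;
    sub new {
        my $class = shift;
        return bless { storage => My::MockStorage->new }, $class;
    }
    sub storage        { $_[0]{storage} }
    sub lock_exclusive { $_[0]->storage->lock_exclusive( $_[1] ) }
    sub flush          { $_[0]->storage->flush; return }
    sub unlock {
        my ($self, $obj) = @_;
        my $rv = $self->storage->unlock($obj);
        $self->flush if $rv;
        return $rv;
    }

    package main;

    my $engine = My::MockEngine->new;
    my $obj    = {};    # stands in for a DBM::Deep object

    $engine->lock_exclusive($obj);        # outer lock
    $engine->lock_exclusive($obj);        # nested lock
    my $inner = $engine->unlock($obj);    # false here: a lock is still held
    my $outer = $engine->unlock($obj);    # true here: flush() runs once

```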

=cut

sub unlock {
    my $self = shift;
    my ($obj) = @_;

    my $rv = $self->storage->unlock( $obj );

    $self->flush if $rv;

    return $rv;
}

=head1 INTERNAL METHODS

The following methods are internal-use-only to DBM::Deep::Engine and its
child classes.

=cut

=head2 flush()

This takes no arguments. It will do everything necessary to flush all things to
disk. This is usually called during unlock() and setup().

This returns nothing.

=cut

sub flush {
    my $self = shift;

    # Why do we need to have the storage flush? Shouldn't autoflush take care of
    # things? -RobK, 2008-06-26
    $self->storage->flush;

    return;
}

=head2 load_sector( $loc )

This takes an id/location/offset and loads the sector based on the engine's
defined sector type.
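
Because load_sector() passes its whole argument list on to the sector class,
that class's load() receives the engine followed by the location. The sketch
below shows the dispatch with hypothetical classes C<My::Sector> and
C<My::Engine>; a real child class would return its own sector class from
sector_type().

```perl

    use strict;
    use warnings;

    # Hypothetical sector class; load() receives the engine and the offset.
    package My::Sector;
    sub load {
        my ($class, $engine, $offset) = @_;
        return bless { engine => $engine, offset => $offset }, $class;
    }

    # Hypothetical child engine: overriding sector_type() is all that
    # load_sector() needs in order to dispatch.
    package My::Engine;
    sub new         { my $class = shift; return bless {}, $class }
    sub sector_type { 'My::Sector' }
    sub load_sector { $_[0]->sector_type->load( @_ ) }

    package main;

    my $engine = My::Engine->new;
    my $sector = $engine->load_sector(128);

```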

=cut

sub load_sector { $_[0]->sector_type->load( @_ ) }

=head2 ACCESSORS

The following are readonly attributes.

=over 4

=item * storage

=back

=cut

sub storage { $_[0]{storage} }

sub sector_type { die "sector_type must be implemented in a child class" }

1;
__END__