=head0 Adding transactions to DBM::Deep

=head1 What is DBM::Deep?

L<DBM::Deep|DBM::Deep> is a module written completely in Perl that provides a
way of storing Perl datastructures (scalars, hashes, and arrays) on disk
instead of in memory. The datafile produced can be ftp'ed from one machine to
another, regardless of OS or Perl version. There are several reasons why
someone would want to do this.

=over 4

=item * Transparent Persistence

This is the ability to save a set of data structures to disk and retrieve them
later without the vast majority of the program even knowing that the data is
persisted. Furthermore, the datastructure is persisted immediately, not at set
marshalling periods.

=item * Huge datastructures

Normally, datastructures are limited by the size of RAM the server has.
L<DBM::Deep|DBM::Deep> allows the size of a given datastructure to be limited
by disk instead.

=item * IPC

While not a common use, this allows for inter-process communication without
worrying about the specifics of how a given OS handles IPC.

=back

And, with the release of 1.00, there is now a fourth reason -
software-transactional memory, or STM
(L<http://en.wikipedia.org/wiki/Software_transactional_memory>).

=head1 What are transactions?

Originally from the database world, a transaction is a way of isolating the
effects of a given set of actions, then applying them all at once. It's a way
of saying "I'm going to try the following steps, see if I like the result,
and, if I do, I want everyone else looking at this datastore to see the
results all at once." The most common example is taken from banking. Let's say
that an application receives a request to have Joe pay Bob five zorkmids.
Without transactions, the application would take the money from Joe's account,
then add the money to Bob's account. But, what happens if the application
crashes after debiting Joe, but before crediting Bob? The application has made
money disappear. Or, vice versa, if Bob is credited before Joe is debited, the
application has created money.

With a transaction wrapping the money transfer, if the application crashes in
the middle, it's as if the action never happened. So, when the application
recovers from the crash, Joe and Bob still have the same amount of money in
their accounts as they did before, and the transaction can restart so that Bob
can finally receive his zorkmids.
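
In code terms, the guarantee looks like this minimal in-memory sketch (plain
Python for illustration only - this is not DBM::Deep's API): stage both sides
of the transfer on a private copy, and publish the copy only when every step
has succeeded.

```python
def transfer(accounts, src, dst, amount):
    """Atomically move `amount` from src to dst, or change nothing at all."""
    staged = dict(accounts)            # work against a private copy
    staged[src] -= amount              # debit Joe
    if staged[src] < 0:
        raise ValueError("insufficient funds")
    staged[dst] += amount              # credit Bob
    accounts.clear()                   # publish both changes at once
    accounts.update(staged)

accounts = {"joe": 10, "bob": 0}
transfer(accounts, "joe", "bob", 5)
print(accounts)                        # {'joe': 5, 'bob': 5}

try:
    transfer(accounts, "joe", "bob", 100)  # fails partway through...
except ValueError:
    pass
print(accounts)                        # ...but no money was created or destroyed
```

If the process dies before the final publish step, the shared state never sees
a half-applied transfer - that is the atomicity the example relies on.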

More formally, transactions are generally considered proper when they are
ACID-compliant. ACID is an acronym that means the following:

=over 4

=item * Atomic

Either every change happens or none of the changes happen.

=item * Consistent

When the transaction begins and when it is committed, the database must be in
a legal state. This restriction doesn't apply to L<DBM::Deep|DBM::Deep> very
much.

=item * Isolated

As far as a transaction is concerned, it is the only thing running against the
database while it is running. Unlike most RDBMSes, L<DBM::Deep|DBM::Deep>
provides the strongest isolation level possible.

=item * Durable

Once the database says that a commit has happened, the commit is guaranteed,
regardless of what happens afterwards.

=back

=head1 Why add them to DBM::Deep?

The ability to have actions occur either I<atomically> (as in the previous
example) or in I<isolation> from the rest of the users of the data is a
powerful thing. It allows for a certain amount of safety and predictability in
how data transformations occur. Imagine, for example, that you have a set of
calculations that will update various variables. However, there are some
situations that will cause you to throw away all results and start over with a
different seed. Without transactions, you would have to put everything into
temporary variables, then transfer the values when the calculations were found
to be successful. With STM, you start a transaction and do your thing within
it. If the calculations succeed, you commit. If they fail, you rollback and
try again. If you're thinking that this is very similar to how SVN or CVS
works, you're absolutely correct - they are transactional in exactly the same
way.
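
The compute-then-maybe-discard workflow above can be sketched as a
commit/rollback retry loop (illustrative Python; C<begin_work>, C<commit>, and
C<rollback> are modelled with a plain dict, and the "seeds" are made up for
the example):

```python
def run_with_retries(data, compute, acceptable, max_tries=10):
    """Run `compute` inside a transaction; roll back and retry on a bad result."""
    for _ in range(max_tries):
        txn = dict(data)          # begin_work: a private view of the data
        compute(txn)              # every update lands in the transaction
        if acceptable(txn):
            data.update(txn)      # commit: publish all the changes at once
            return True
        # rollback: simply discard txn and start over with the next seed
    return False

seeds = iter([13, 7, 4])          # pretend the first two seeds give bad results
data = {}
ok = run_with_retries(
    data,
    compute=lambda txn: txn.__setitem__("result", next(seeds)),
    acceptable=lambda txn: txn["result"] % 2 == 0,
)
print(ok, data)                   # True {'result': 4}
```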

=head1 How it happened

=head2 The backstory

The addition of transactions to L<DBM::Deep|DBM::Deep> has easily been the
single most complex software endeavor I've ever undertaken. The first step was
to figure out exactly how transactions were going to work. After several
spikesN<These are throwaway coding explorations.>, the best design seemed to
look to SVN instead of relational databases. The more I investigated, the more
I ran up against the object-relational impedance mismatch
N<http://en.wikipedia.org/wiki/Object-Relational_Impedance_Mismatch>, this
time in terms of being able to translate designs. In the relational world,
transactions are generally implemented either as row-level locks or using MVCC
N<http://en.wikipedia.org/wiki/Multiversion_concurrency_control>. Both of
these assume that there is a I<row>, or singular object, that can be locked
transparently to everything else. This doesn't translate to a fractally
repeating structure like a hash or an array.

However, the design used by SVN deals with directories and files, which
correspond very closely to hashes and hashkeys. In SVN, the modifications are
stored in the file's structure. Translating this to hashes and hashkeys, this
means that transactional information should be stored in the keys. This means
that the entire datafile is unaware of anything to do with transactions,
except for the key's data structure within the bucket.

=head2 DBM::Deep's file structure

L<DBM::Deep|DBM::Deep>'s file structure is record-based. The key (or array
index - arrays are currently just funny hashes internally) is hashed using MD5
and then stored in a cascade of Index and Bucketlist records. The Bucketlist
record stores the actual key string and pointers to where the data records are
stored. The data records themselves are one of Null, Scalar, or Reference. A
Null represents an I<undef>, a Scalar represents a string (numbers are
stringified for simplicity) and is allocated in 256-byte chunks, and a
Reference represents an array or hash reference and contains a pointer to an
Index and Bucketlist cascade of its own.
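
As a rough model of that lookup path (Python for illustration; the real
fan-out, record fields, and bucket-splitting rules are simplified away here),
the MD5 digest's nibbles steer the descent through Index records until a
Bucketlist is reached:

```python
import hashlib

def md5_nibbles(key):
    """Hash the key and return its hex digest one nibble (0-15) at a time."""
    return [int(ch, 16) for ch in hashlib.md5(key.encode()).hexdigest()]

class Index:
    """Interior record: 16 slots, one per possible nibble value."""
    def __init__(self):
        self.slots = [None] * 16

class BucketList:
    """Leaf record: stores the actual key strings and their data pointers."""
    def __init__(self):
        self.buckets = {}

def store(root, key, data_ptr):
    node, depth = root, 0
    nibbles = md5_nibbles(key)
    while isinstance(node, Index):
        if node.slots[nibbles[depth]] is None:
            node.slots[nibbles[depth]] = BucketList()
        node = node.slots[nibbles[depth]]
        depth += 1
    node.buckets[key] = data_ptr          # the bucketlist keeps the full key

def fetch(root, key):
    node, depth = root, 0
    nibbles = md5_nibbles(key)
    while isinstance(node, Index):
        node = node.slots[nibbles[depth]]
        depth += 1
        if node is None:
            return None
    return node.buckets.get(key)

root = Index()
store(root, "abc", "pointer-to-scalar-record")
print(fetch(root, "abc"))       # pointer-to-scalar-record
print(fetch(root, "nope"))      # None
```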

=head2 Transactions in the keys

The first pass was to expand the Bucketlist sector to go from a simple key /
datapointer mapping to a more complex key / transaction / datapointer mapping.
Initially, I interposed a Transaction record that the bucketlist pointed to.
That then contained the transaction / datapointer mapping. This had the
advantage of changing nothing except for adding one new sector type and the
handling for it. This was very quickly merged into the Bucketlist record to
simplify the resulting code.

This first step got me to the point where I could pass the following test:

  my $db1 = DBM::Deep->new( $filename );
  my $db2 = DBM::Deep->new( $filename );

  $db1->{abc} = 'foo';

  is( $db1->{abc}, 'foo' );
  is( $db2->{abc}, 'foo' );

  $db1->begin_work();

  is( $db1->{abc}, 'foo' );
  is( $db2->{abc}, 'foo' );

  $db1->{abc} = 'floober';

  is( $db1->{abc}, 'floober' );
  is( $db2->{abc}, 'foo' );

Just that much was a major accomplishment. The first pass only looked in the
transaction's spot in the bucket for that key. That passed my first tests
because I didn't check that C<$db1-E<gt>{abc}> was still 'foo' I<before>
modifying it in the transaction. To pass that test, the code for retrieval
needed to look first in the transaction's spot and, if that spot had never
been assigned to, look at the spot for the HEAD.
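
That retrieval rule can be modelled with one datapointer slot per transaction
in each bucket entry (a sketch; the slot count and the "never assigned"
sentinel are inventions for illustration):

```python
NUM_TXNS = 4                 # slot 0 is the HEAD; slots 1..3 are transactions
UNASSIGNED = object()        # sentinel: this slot was never written

def new_entry():
    return [UNASSIGNED] * NUM_TXNS

def read(entry, trans_id):
    # Look first in the transaction's own spot...
    if entry[trans_id] is not UNASSIGNED:
        return entry[trans_id]
    # ...and fall back to the HEAD only if that spot was never assigned.
    head = entry[0]
    return None if head is UNASSIGNED else head

entry = new_entry()
entry[0] = "foo"                    # $db1->{abc} = 'foo' outside a transaction
assert read(entry, 1) == "foo"      # a transaction sees the HEAD's value

entry[1] = "floober"                # $db1->{abc} = 'floober' in transaction 1
assert read(entry, 1) == "floober"  # the transaction sees its own value
assert read(entry, 0) == "foo"      # everyone else still sees 'foo'
print("read path OK")
```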

=head2 The concept of the HEAD

This is a concept borrowed from SVN. In SVN, the HEAD revision is the latest
revision checked into the repository. When you do a local modification, you're
doing a modification to the HEAD. Then, you choose to either check in your
code (commit()) or revert (rollback()).

In L<DBM::Deep|DBM::Deep>, I chose to make the HEAD transaction ID 0. This has
several benefits:

=over 4

=item * Easy identification of a transaction

C<if ( $trans_id ) {}> will run the code if and only if we are in a running
transaction.

=item * The HEAD is the first bucket

In a given bucket, the HEAD is the first datapointer because we multiply the
size of the transactional bookkeeping by the transaction ID to find the offset
to seek into the file.

=back
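
Both benefits fall out of simple offset arithmetic. A sketch (the field sizes
are invented; the real bookkeeping widths depend on the file's parameters):

```python
DATAPTR_SIZE = 8                    # invented: bytes for a data pointer
STALENESS_SIZE = 4                  # invented: bytes for a staleness counter
ENTRY_SIZE = DATAPTR_SIZE + STALENESS_SIZE

def slot_offset(bucket_base, trans_id):
    """File offset of a transaction's slot within a bucket entry."""
    return bucket_base + trans_id * ENTRY_SIZE

print(slot_offset(1000, 0))   # 1000 -- the HEAD is the bucket base itself
print(slot_offset(1000, 3))   # 1036 -- transaction 3 sits three entries in

trans_id = 0
if trans_id:                  # falsy for the HEAD, truthy in a transaction
    print("inside a running transaction")
```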

=head2 Protection from changes

Let's assume that a transaction is running in one process and another process
is also modifying the same area in the data. The only way that process B can
notify process A that a change has occurred is through the common point - the
DBM file. Because of this, every process that changes the HEAD needs to
protect all currently running transactions by copying the pointer to the
original value into every transaction which hasn't already modified this
entry. (If it has, then the new value shouldn't overwrite the transaction's
modification.) This is the key piece for providing I<Isolation>.
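
A sketch of that protection step (Python; slot 0 is the HEAD, and an invented
C<UNASSIGNED> sentinel marks slots a transaction never wrote):

```python
UNASSIGNED = object()

def write_head(entry, new_value):
    """Process B changes the HEAD: first shield every running transaction."""
    old = entry[0]
    if old is not UNASSIGNED:
        for tid in range(1, len(entry)):
            if entry[tid] is UNASSIGNED:
                entry[tid] = old     # preserve the value the transaction saw
            # a slot the transaction already modified is left alone
    entry[0] = new_value

entry = ["foo", UNASSIGNED, "mine"]  # transaction 2 already modified this key
write_head(entry, "bar")

print(entry[0])   # bar  -- the HEAD moved on
print(entry[1])   # foo  -- transaction 1 keeps the value it started with
print(entry[2])   # mine -- transaction 2's change wasn't overwritten
```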

=head2 Tracking modified buckets

Rolling back changes is very simple - just don't apply them to the HEAD. The
next time that transaction ID is reused, the changes will be ignored (q.v.
L</Staleness counters>). Committing, however, requires that all the changes
be transferred over from the bucket entry for the given transaction ID to the
entry for the HEAD.
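
Commit is then a per-bucket transfer from the transaction's slot to the HEAD's
slot (a sketch; per the text above, the real code defers rollback cleanup via
staleness counters rather than clearing slots eagerly):

```python
UNASSIGNED = object()

def commit(entries, trans_id):
    """Move everything the transaction wrote over to the HEAD (slot 0)."""
    for entry in entries:
        if entry[trans_id] is not UNASSIGNED:
            entry[0] = entry[trans_id]
            entry[trans_id] = UNASSIGNED

def rollback(entries, trans_id):
    """Rolling back never touches the HEAD: just abandon the slots."""
    for entry in entries:
        entry[trans_id] = UNASSIGNED

abc = ["foo", UNASSIGNED]     # transaction 1 never touched this key
xyz = ["baz", "qux"]          # transaction 1 wrote 'qux' here

commit([abc, xyz], trans_id=1)
print(abc[0], xyz[0])         # foo qux
```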

=head2 Deleted marker

Transactions are performed copy-on-write. This means that if there isn't an
entry for that transaction, the HEAD is looked at. This doesn't work if a key
has been deleted within a transaction. So, the entry must be marked as deleted
within the transaction so that the HEAD isn't checked.

Likewise, when a new key is created in a transaction, the HEAD doesn't have an
entry for that key. Consider the following situation:

  ok( !exists $db1->{foo} );
  ok( !exists $db2->{foo} );

  $db1->begin_work();
  $db1->{foo} = 'bar';

  ok( !exists $db2->{foo} );

The entry for the HEAD for 'foo' needs to be marked as deleted so that
transactions which don't have 'foo' don't find something in the HEAD.
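
A deleted marker distinct from "never assigned" handles both cases. A sketch,
continuing the slot-per-transaction model (the sentinels are illustrative):

```python
UNASSIGNED = object()   # slot never written by this transaction
DELETED = object()      # slot explicitly deleted (or masked off)

def read(entry, trans_id):
    slot = entry[trans_id]
    if slot is DELETED:
        return None                  # deleted here: do NOT check the HEAD
    if slot is not UNASSIGNED:
        return slot
    head = entry[0]
    return None if head in (UNASSIGNED, DELETED) else head

# Key deleted inside transaction 1, but still present in the HEAD:
entry = ["bar", DELETED]
assert read(entry, 1) is None        # gone inside the transaction
assert read(entry, 0) == "bar"       # still there for everyone else

# Key created inside transaction 1: the HEAD slot is marked deleted
entry2 = [DELETED, "bar"]
assert read(entry2, 1) == "bar"      # exists $db1->{foo} is true
assert read(entry2, 0) is None       # exists $db2->{foo} is false
print("deleted-marker model OK")
```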

=head2 Freespace management

The second major piece of the 1.00 release was freespace management. In
pre-1.00 versions of L<DBM::Deep|DBM::Deep>, the space used by deleted keys
was not recycled. While always a requested feature, the complexity required to
implement freespace meant that it needed to wait for a complete rewrite of
several pieces, such as for transactions.

Freespace is implemented by regularizing all the records so that
L<DBM::Deep|DBM::Deep> only has three different record sizes - Index,
BucketList, and Data. Each record type has a fixed length based on various
parameters the L<DBM::Deep|DBM::Deep> datafile is created with. (In order to
accommodate values of various sizes, Data records chain.) Whenever a sector is
freed, it's added to a freelist of that sector's size. Whenever a new sector
is requested, the freelist is checked first. If the freelist has a sector,
it's reused; otherwise, a new sector is added to the end of the datafile.
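
With one freelist per record size, the allocation policy reads as a few lines
(a sketch; the offsets and sizes are invented):

```python
from collections import defaultdict

class Datafile:
    def __init__(self):
        self.end = 0                         # next fresh offset (end of file)
        self.freelists = defaultdict(list)   # record size -> freed offsets

    def alloc(self, size):
        """Reuse a freed sector of this size if possible, else grow the file."""
        if self.freelists[size]:
            return self.freelists[size].pop()
        offset, self.end = self.end, self.end + size
        return offset

    def free(self, offset, size):
        self.freelists[size].append(offset)

f = Datafile()
a = f.alloc(256)       # 0: a brand-new sector at the end of the file
b = f.alloc(256)       # 256: the file grows again
f.free(a, 256)         # the sector at 0 goes onto the 256-byte freelist
c = f.alloc(256)       # 0: recycled instead of growing the file
print(a, b, c)         # 0 256 0
```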

Freespace management did bring up another issue - staleness. It is possible to
have a pointer to a record in memory. If that record is deleted, then reused,
the pointer in memory has no way of determining that it was deleted and
re-added rather than modified. So, a staleness counter was added which is
incremented every time the sector is reused through the freelist. If you then
attempt to access that stale record, L<DBM::Deep|DBM::Deep> returns undef
because, at some point, the entry was deleted.

=head2 Staleness counters

Once they were implemented for freespace management, staleness counters proved
to be a very powerful concept for transactions themselves. Back in
L</Protection from changes>, I mentioned that other processes modifying the
HEAD will protect all running transactions from their effects. This provides
I<Isolation>. But, the running transaction doesn't know about these entries.
If they're not cleaned up, they will be seen the next time a transaction uses
that transaction ID.

By providing a staleness counter for transactions, the cost of cleaning up
finished transactions is deferred until the space is actually used again. This
comes at the cost of less-than-optimal space utilization. Changing this in the
future would be completely transparent to users, so I felt it was an
acceptable tradeoff for delivering working code quickly.
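
The mechanism behind both uses of staleness is the same, and can be sketched
as follows: a pointer remembers the counter value it saw, and any mismatch on
a later access means the sector was freed and reused in the meantime.

```python
class Sector:
    """An on-disk record; its staleness bumps every time it's recycled."""
    def __init__(self, value):
        self.staleness = 0
        self.value = value

    def reuse(self, new_value):
        self.staleness += 1          # incremented on reuse via the freelist
        self.value = new_value

class SectorPointer:
    """An in-memory reference: remembers the staleness it was created with."""
    def __init__(self, sector):
        self.sector = sector
        self.staleness = sector.staleness

    def get(self):
        if self.staleness != self.sector.staleness:
            return None              # stale: the record was deleted at some point
        return self.sector.value

sector = Sector("original tenant")
ptr = SectorPointer(sector)
print(ptr.get())                     # original tenant

sector.reuse("new tenant")           # freed and handed out again
print(ptr.get())                     # None -- the old pointer detects staleness
```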

=head1 Conclusion

Adding transactions to L<DBM::Deep|DBM::Deep> required rethinking the file
structure - transactional bookkeeping in the buckets, a HEAD with transaction
ID 0, deleted markers, freespace management, and staleness counters. The
result is a pure-Perl datastore whose users can try out changes in isolation
and then commit or rollback at will.

=cut