package DBIx::Class::Storage::DBI::Replicated::Introduction;

=head1 NAME

DBIx::Class::Storage::DBI::Replicated::Introduction - Minimum Need to Know

=head1 SYNOPSIS

This is an introductory document for L<DBIx::Class::Storage::DBI::Replicated>.

This document is not an overview of what replication is or why you should be
using it. It is not a document explaining how to set up MySQL native replication
either. Copious external resources are available for both. This document
presumes you have the basics down.

=head1 DESCRIPTION

L<DBIx::Class> supports a framework for using database replication. This system
is integrated completely, which means once it's set up you should be able to
start using a replication cluster automatically, without additional work or
changes to your code. Some caveats apply, primarily related to the proper use
of transactions (you are wrapping all your database modifying statements inside
a transaction, right ;) ). However, in our experience properly written DBIC will
work transparently with Replicated storage.

Currently we have support for MySQL native replication, which is relatively
easy to install and configure. We also currently support single master to one
or more replicants (also called 'slaves' in some documentation). However, the
framework is not specifically tied to MySQL, and supporting other replication
systems or topologies should be possible. Please bring your patches and ideas
to the #dbix-class IRC channel or the mailing list.

For an easy way to start playing with MySQL native replication, see:
L<MySQL::Sandbox>.

If you are using this with a L<Catalyst> based application, you may also wish
to see more recent updates to L<Catalyst::Model::DBIC::Schema>, which has
support for replication configuration options as well.

=head1 REPLICATED STORAGE

By default, when you start L<DBIx::Class>, your Schema (L<DBIx::Class::Schema>)
is assigned a storage_type, which when fully connected will reflect your
underlying storage engine as defined by your chosen database driver. For
example, if you connect to a MySQL database, your storage_type will be
L<DBIx::Class::Storage::DBI::mysql>. Your storage type class will contain
database specific code to help smooth over the differences between databases
and let L<DBIx::Class> do its thing.

If you want to use replication, you will override this setting so that the
replicated storage engine will 'wrap' your underlying storages and present to
the end programmer a unified interface. This wrapper storage class will
delegate method calls to either a master database or one or more replicated
databases based on whether they are read only (by default sent to the
replicants) or write (reserved for the master). Additionally, the Replicated
storage will monitor the health of your replicants and automatically drop them
should one exceed configurable parameters. Later, it can automatically restore
a replicant when its health is restored.
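
As a rough illustration of this delegation, ordinary calls would be routed
along the following lines. The C<Artist> result source and its C<name> column
are hypothetical placeholders, not part of this document:

  # Hypothetical 'Artist' source; any result source behaves the same way.
  my $artists = $schema->resultset('Artist');

  # A write: delegated to the master.
  $artists->create({ name => 'Example Artist' });

  # A read outside a transaction: by default sent to a replicant
  # chosen by the balancer.
  my @all = $artists->search({})->all;

Because replicants can lag behind the master, the read in this sketch is not
guaranteed to see the row that was just created.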

This gives you a very robust system, since you can add or drop replicants
and DBIC will automatically adjust itself accordingly.

Additionally, if you need high data integrity, such as when you are executing
a transaction, replicated storage will automatically delegate all database
traffic to the master storage. There are several ways to enable this high
integrity mode, but wrapping your statements inside a transaction is the easy
and canonical option.
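
A minimal sketch of the transaction approach, using
L<DBIx::Class::Schema/txn_do>. The C<Artist> source and its C<name> column are
hypothetical placeholders:

  $schema->txn_do(sub {
    # Inside the transaction both statements are sent to the master,
    # so the read is not subject to replication lag.
    my $artists = $schema->resultset('Artist');
    $artists->create({ name => 'Example Artist' });
    my $count = $artists->search({ name => 'Example Artist' })->count;
  });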

=head1 PARTS OF REPLICATED STORAGE

A replicated storage contains several parts. First, there is the replicated
storage itself (L<DBIx::Class::Storage::DBI::Replicated>). A replicated storage
takes a pool of replicants (L<DBIx::Class::Storage::DBI::Replicated::Pool>)
and a software balancer (L<DBIx::Class::Storage::DBI::Replicated::Balancer>).
The balancer does the job of splitting up all the read traffic amongst the
replicants in the Pool. Currently there are two types of balancers: a Random
one, which chooses a replicant in the Pool using a naive randomizer algorithm,
and a First one, which just uses the first replicant in the Pool (and obviously
is only of value when you have a single replicant).

=head1 REPLICATED STORAGE CONFIGURATION

All the parts of replication can be altered dynamically at runtime, which makes
it possible to create a system that automatically scales under load by creating
more replicants as needed, perhaps using a cloud system such as Amazon EC2.
However, for common use you can set up your replicated storage to be enabled at
the time you connect the databases. The following is a breakdown of how you
may wish to do this. Again, if you are using L<Catalyst>, I strongly recommend
you use (or upgrade to) the latest L<Catalyst::Model::DBIC::Schema>, which makes
this job even easier.

First, you need to get a C<$schema> object and set the storage_type:

  my $schema = MyApp::Schema->clone;
  $schema->storage_type([
    '::DBI::Replicated' => {
      balancer_type => '::Random',
      balancer_args => {
        auto_validate_every => 5,
        master_read_weight => 1
      },
      pool_args => {
        maximum_lag => 2,
      },
    }
  ]);

Then, you need to connect your L<DBIx::Class::Schema>.

  $schema->connection($dsn, $user, $pass);

Let's break down the settings. The method L<DBIx::Class::Schema/storage_type>
takes one mandatory parameter, a scalar value, and an optional second value
which is a hash reference of configuration options for that storage; as in the
example above, the two are passed together inside an array reference. In this
case, we are setting the Replicated storage type using '::DBI::Replicated' as
the first value. You will only use a different value if you are subclassing the
replicated storage, so for now just copy that first parameter.

The second parameter contains a hash reference of options that get passed to
the replicated storage. L<DBIx::Class::Storage::DBI::Replicated/balancer_type>
is the type of software load balancer you will use to split up traffic among
all your replicants. Right now we have two options, "::Random" and "::First".
You can review documentation for both at:

L<DBIx::Class::Storage::DBI::Replicated::Balancer::First>,
L<DBIx::Class::Storage::DBI::Replicated::Balancer::Random>.

In this case we will have three replicants, so the ::Random option is the only
one that makes sense.
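
By contrast, a setup with a single replicant could use the ::First balancer.
A sketch reusing the options shown earlier; the values are illustrative, not
recommendations:

  $schema->storage_type([
    '::DBI::Replicated' => {
      balancer_type => '::First',
      balancer_args => { auto_validate_every => 5 },
    }
  ]);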

'balancer_args' get passed to the balancer when it's instantiated. All
balancers have the 'auto_validate_every' option. This is the number of seconds
we allow to pass between validation checks on a load balanced replicant. The
higher the number, the greater the chance that your reads from a replicant
are inconsistent with what's on the master. Setting this number too low
will result in increased database loads, so choose a number with care. Our
experience is that setting the number around 5 seconds results in a good
performance / integrity balance.

'master_read_weight' is an option associated with the ::Random balancer. It is
a numeric weight that allows the master to take a share of the read traffic;
the example above sets it to 1. I usually leave this off (the default is off).

The 'pool_args' are configuration options associated with the replicant pool.
This object (L<DBIx::Class::Storage::DBI::Replicated::Pool>) manages all the
declared replicants. 'maximum_lag' is the number of seconds a replicant is
allowed to lag behind the master before being temporarily removed from the pool.
Keep in mind that the Balancer option 'auto_validate_every' determines how often
a replicant is tested against this condition, so the true possible lag can be
higher than the number you set. The default is zero.

No matter how low you set the maximum_lag or the auto_validate_every settings,
there is always the chance that your replicants will lag a bit behind the
master for the supported replication system built into MySQL. You can ensure
reliable reads by using a transaction, which will force both read and write
activity to the master; however, this will increase the load on your master
database.
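
Where a full transaction is more than you need,
L<DBIx::Class::Storage::DBI::Replicated> also documents C<execute_reliably>
and the C<force_pool> resultset attribute for sending individual reads to the
master. A sketch, assuming a hypothetical C<Artist> source and a primary key
value in C<$id>; check the documentation of your installed version before
relying on either:

  # Run a block of code with all traffic sent to the master.
  my $artist = $schema->storage->execute_reliably(sub {
    return $schema->resultset('Artist')->find($id);
  });

  # Or force a single resultset to read from the master.
  my $same_artist = $schema->resultset('Artist')
    ->search(undef, { force_pool => 'master' })
    ->find($id);

Either form adds read load to the master, so it is best kept for reads that
really cannot tolerate lag.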

After you've configured the replicated storage, you need to add the connection
information for the replicants:

  $schema->storage->connect_replicants(
    [$dsn1, $user, $pass, \%opts],
    [$dsn2, $user, $pass, \%opts],
    [$dsn3, $user, $pass, \%opts],
  );

These replicants should be configured as slaves to the master using the
instructions for MySQL native replication, or if you are just learning, you
will find L<MySQL::Sandbox> an easy way to set up a replication cluster.

And now your $schema object is properly configured! Enjoy!

=head1 AUTHOR

John Napiorkowski <jjnapiork@cpan.org>

=head1 LICENSE

You may distribute this code under the same terms as Perl itself.

=cut

1;