package DBIx::Class::Storage::DBI::Replicated::Introduction;

=head1 NAME

DBIx::Class::Storage::DBI::Replicated::Introduction - Minimum Need to Know

=head1 SYNOPSIS

This is an introductory document for L<DBIx::Class::Storage::DBI::Replicated>.

This document is not an overview of what replication is or why you should be
using it. It is not a document explaining how to set up MySQL native replication
either. Copious external resources are available for both. This document
presumes you have the basics down.

=head1 DESCRIPTION

L<DBIx::Class> supports a framework for using database replication. This system
is integrated completely, which means once it's set up you should be able to
automatically just start using a replication cluster without additional work or
changes to your code. Some caveats apply, primarily related to the proper use
of transactions (you are wrapping all your database modifying statements inside
a transaction, right ;) ). However, in our experience properly written DBIC will
work transparently with Replicated storage.

Currently we have support for MySQL native replication, which is relatively
easy to install and configure. We also currently support single master to one
or more replicants (also called 'slaves' in some documentation). However, the
framework is not specifically tied to MySQL, and supporting other replication
systems or topologies should be possible. Please bring your patches and ideas
to the #dbix-class IRC channel or the mailing list.

For an easy way to start playing with MySQL native replication, see:
L<MySQL::Sandbox>.

If you are using this with a L<Catalyst> based application, you may also wish
to see more recent updates to L<Catalyst::Model::DBIC::Schema>, which has
support for replication configuration options as well.

=head1 REPLICATED STORAGE

By default, when you start L<DBIx::Class>, your Schema (L<DBIx::Class::Schema>)
is assigned a storage_type, which when fully connected will reflect your
underlying storage engine as defined by your chosen database driver. For
example, if you connect to a MySQL database, your storage_type will be
L<DBIx::Class::Storage::DBI::mysql>. Your storage type class will contain
database specific code to help smooth over the differences between databases
and let L<DBIx::Class> do its thing.

If you want to use replication, you will override this setting so that the
replicated storage engine will 'wrap' your underlying storages and present to
the end programmer a unified interface. This wrapper storage class will
delegate method calls to either a master database or one or more replicated
databases based on whether they are read only (by default sent to the
replicants) or write (reserved for the master). Additionally, the Replicated
storage will monitor the health of your replicants and automatically drop any
that exceed configurable parameters. Later, it can automatically restore a
replicant when its health is restored.

This gives you a very robust system, since you can add or drop replicants
and DBIC will automatically adjust itself accordingly.

Additionally, if you need high data integrity, such as when you are executing
a transaction, replicated storage will automatically delegate all database
traffic to the master storage. There are several ways to enable this high
integrity mode, but wrapping your statements inside a transaction is the easy
and canonical option.

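As a minimal sketch of the transaction approach described above, the following
assumes a connected C<$schema> that uses Replicated storage and a hypothetical
C<Artist> result source with a C<name> column (neither is defined in this
document). Because all the work happens inside L<DBIx::Class::Schema/txn_do>,
both the read and the write should be sent to the master:

```perl
use strict;
use warnings;

# Hypothetical helper: rename an artist with every statement routed to the
# master. $schema is assumed to be a connected schema on Replicated storage;
# the 'Artist' source and its 'name' column are made up for illustration.
sub rename_artist {
    my ($schema, $id, $new_name) = @_;

    return $schema->txn_do(sub {
        # Inside the transaction this read is served by the master,
        # so it should not see stale replicant data.
        my $artist = $schema->resultset('Artist')->find($id)
            or die "No artist with id $id\n";

        $artist->update({ name => $new_name });
        return $artist;
    });
}
```

Calling C<rename_artist($schema, 42, 'New Name')> returns the updated row
object, or dies if the row is missing, in which case C<txn_do> rolls the
transaction back. For a reliable read outside a transaction,
L<DBIx::Class::Storage::DBI::Replicated> also documents an C<execute_reliably>
method; check the version you have installed before relying on it.
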
=head1 PARTS OF REPLICATED STORAGE

A replicated storage contains several parts. First, there is the replicated
storage itself (L<DBIx::Class::Storage::DBI::Replicated>). A replicated storage
takes a pool of replicants (L<DBIx::Class::Storage::DBI::Replicated::Pool>)
and a software balancer (L<DBIx::Class::Storage::DBI::Replicated::Balancer>).
The balancer does the job of splitting up all the read traffic amongst the
replicants in the Pool. Currently there are two types of balancers: a Random
one, which chooses a replicant in the Pool using a naive randomizer algorithm,
and a First one, which just uses the first replicant in the Pool (and obviously
is only of value when you have a single replicant).

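To make the difference between the two balancing strategies concrete, here is
a small standalone sketch. The C<pick_first> and C<pick_random> functions and
the replicant names are hypothetical and are not DBIC's internals; they only
illustrate the selection idea:

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only; these are not DBIC's internals.

# "First" strategy: always hand back the first replicant in the pool.
sub pick_first {
    my (@pool) = @_;
    return $pool[0];
}

# "Random" strategy: pick any replicant with equal probability.
sub pick_random {
    my (@pool) = @_;
    return $pool[ int rand @pool ];
}

my @pool = qw(replicant1 replicant2 replicant3);
print "first:  ", pick_first(@pool),  "\n";
print "random: ", pick_random(@pool), "\n";
```

With more than one replicant, the first strategy leaves every replicant but
one idle, which is why it only makes sense for a single-replicant pool.
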
=head1 REPLICATED STORAGE CONFIGURATION

All the parts of replication can be altered dynamically at runtime, which makes
it possible to create a system that automatically scales under load by creating
more replicants as needed, perhaps using a cloud system such as Amazon EC2.
However, for common use you can set up your replicated storage to be enabled at
the time you connect the databases. The following is a breakdown of how you
may wish to do this. Again, if you are using L<Catalyst>, I strongly recommend
you use (or upgrade to) the latest L<Catalyst::Model::DBIC::Schema>, which makes
this job even easier.

First, you need to connect your L<DBIx::Class::Schema>. Let's assume you have
such a schema called "MyApp::Schema".

  use MyApp::Schema;
  my $schema = MyApp::Schema->connect($dsn, $user, $pass);

Next, you need to set the storage_type.

  $schema->storage_type(
    '::DBI::Replicated' => {
      balancer_type => '::Random',
      balancer_args => {
        auto_validate_every => 5,
        master_read_weight => 1
      },
      pool_args => {
        maximum_lag => 2,
      },
    }
  );

Let's break down the settings. The method L<DBIx::Class::Schema/storage_type>
takes one mandatory parameter, a scalar value, and an optional second value
which is a hash reference of configuration options for that storage. In this
case, we are setting the Replicated storage type using '::DBI::Replicated' as
the first value. You will only use a different value if you are subclassing the
replicated storage, so for now just copy that first parameter.

The second parameter contains a hash reference of options that get passed to
the replicated storage. L<DBIx::Class::Storage::DBI::Replicated/balancer_type>
is the type of software load balancer you will use to split up traffic among
all your replicants. Right now we have two options, "::Random" and "::First".
You can review documentation for both at:

L<DBIx::Class::Storage::DBI::Replicated::Balancer::First>,
L<DBIx::Class::Storage::DBI::Replicated::Balancer::Random>.

In this case we will have three replicants, so the ::Random option is the only
one that makes sense.

'balancer_args' get passed to the balancer when it's instantiated. All
balancers have the 'auto_validate_every' option. This is the number of seconds
we allow to pass between validation checks on a load balanced replicant. So
the higher the number, the greater the chance that your reads from the
replicant may be inconsistent with what's on the master. Setting this number
too low will result in increased database loads, so choose a number with care.
Our experience is that setting the number around 5 seconds results in a good
performance / integrity balance.

'master_read_weight' is an option associated with the ::Random balancer. It
allows you to let the master be read from. I usually leave this off (default
is off).

The 'pool_args' are configuration options associated with the replicant pool.
This object (L<DBIx::Class::Storage::DBI::Replicated::Pool>) manages all the
declared replicants. 'maximum_lag' is the number of seconds a replicant is
allowed to lag behind the master before being temporarily removed from the pool.
Keep in mind that the Balancer option 'auto_validate_every' determines how often
a replicant is tested against this condition, so the true possible lag can be
higher than the number you set. The default is zero.

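The interaction between 'maximum_lag' and 'auto_validate_every' can be worked
through with the numbers from the configuration example. The sketch below is
standalone Perl, not DBIC code: the C<within_lag> helper, the lag readings, and
the inclusive comparison are all assumptions made for illustration, and the
worst-case figure is only a rough bound that assumes a replicant falls behind
by at most one second per second of wall-clock time.

```perl
use strict;
use warnings;

# Values taken from the storage_type configuration example.
my $maximum_lag         = 2;   # seconds, from pool_args
my $auto_validate_every = 5;   # seconds, from balancer_args

# Hypothetical helper (not part of DBIC): keep only the replicants whose
# reported lag is within the allowed maximum.
sub within_lag {
    my ($max_lag, %lag_for) = @_;
    my @ok = grep { $lag_for{$_} <= $max_lag } keys %lag_for;
    return sort @ok;
}

# Made-up lag readings, in seconds, at the moment of a validation check.
my %lag_for = (replicant1 => 0, replicant2 => 2, replicant3 => 9);
my @active  = within_lag($maximum_lag, %lag_for);
print "active: @active\n";    # replicant3 is left out of the pool

# Rough worst case between checks: a replicant that just passed at the limit
# may fall further behind for a whole validation interval.
my $worst_case = $maximum_lag + $auto_validate_every;
print "worst-case staleness: about $worst_case seconds\n";
```

With these settings a read could, in this simplified model, be about seven
seconds behind the master even though 'maximum_lag' is set to two.
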
No matter how low you set the maximum_lag or the auto_validate_every settings,
there is always the chance that your replicants will lag a bit behind the
master for the supported replication system built into MySQL. You can ensure
reliable reads by using a transaction, which will force both read and write
activity to the master. However, this will increase the load on your master
database.

After you've configured the replicated storage, you need to add the connection
information for the replicants:

  $schema->storage->connect_replicants(
    [$dsn1, $user, $pass, \%opts],
    [$dsn2, $user, $pass, \%opts],
    [$dsn3, $user, $pass, \%opts],
  );

These replicants should be configured as slaves to the master using the
instructions for MySQL native replication, or if you are just learning, you
will find L<MySQL::Sandbox> an easy way to set up a replication cluster.

And now your $schema object is properly configured! Enjoy!

=head1 AUTHOR

John Napiorkowski <jjnapiork@cpan.org>

=head1 LICENSE

You may distribute this code under the same terms as Perl itself.

=cut

1;