package DBIx::Class::Storage::DBI::Replicated::Introduction;

=head1 NAME

DBIx::Class::Storage::DBI::Replicated::Introduction - Minimum Need to Know

=head1 SYNOPSIS

This is an introductory document for L<DBIx::Class::Storage::DBI::Replicated>.

This document is not an overview of what replication is or why you should be
using it. It is not a document explaining how to set up MySQL native
replication either. Copious external resources are available for both. This
document presumes you have the basics down.
=head1 DESCRIPTION

L<DBIx::Class> supports a framework for using database replication. This system
is integrated completely, which means once it's set up you should be able to
automatically just start using a replication cluster without additional work or
changes to your code. Some caveats apply, primarily related to the proper use
of transactions (you are wrapping all your database modifying statements inside
a transaction, right ;) ) however in our experience properly written DBIC code
will work transparently with Replicated storage.

Currently we have support for MySQL native replication, which is relatively
easy to install and configure. We also currently support a single master with
one or more replicants (also called 'slaves' in some documentation). However
the framework is not specifically tied to MySQL, and supporting other
replication systems or topologies should be possible. Please bring your
patches and ideas to the #dbix-class IRC channel or the mailing list.

For an easy way to start playing with MySQL native replication, see:
L<MySQL::Sandbox>.

If you are using this with a L<Catalyst> based application, you may also wish
to see more recent updates to L<Catalyst::Model::DBIC::Schema>, which has
support for replication configuration options as well.
=head1 REPLICATED STORAGE

By default, when you start L<DBIx::Class>, your Schema (L<DBIx::Class::Schema>)
is assigned a storage_type, which when fully connected will reflect your
underlying storage engine as defined by your chosen database driver. For
example, if you connect to a MySQL database, your storage_type will be
L<DBIx::Class::Storage::DBI::mysql>. Your storage type class will contain
database specific code to help smooth over the differences between databases
and let L<DBIx::Class> do its thing.

If you want to use replication, you will override this setting so that the
replicated storage engine will 'wrap' your underlying storages and present to
the end programmer a unified interface. This wrapper storage class will
delegate method calls to either a master database or one or more replicated
databases based on whether they are read only (by default sent to the
replicants) or write (reserved for the master). Additionally, the Replicated
storage will monitor the health of your replicants and automatically drop
them should one exceed configurable parameters. Later, it can automatically
restore a replicant when its health is restored.
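
For example, the delegation just described is transparent at the resultset
level (a sketch; C<Artist> is a hypothetical result class, not part of this
distribution):

```perl
# Read traffic is delegated to a replicant by default
# ('Artist' is a hypothetical result class):
my $artist = $schema->resultset('Artist')->find(1);

# Write traffic is always reserved for the master:
$schema->resultset('Artist')->create({ name => 'New Artist' });
```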

This gives you a very robust system, since you can add or drop replicants
and DBIC will automatically adjust itself accordingly.

Additionally, if you need high data integrity, such as when you are executing
a transaction, replicated storage will automatically delegate all database
traffic to the master storage. There are several ways to enable this high
integrity mode, but wrapping your statements inside a transaction is the easy
and canonical option.
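
As a sketch of both approaches (C<Artist> is a hypothetical result class;
C<execute_reliably> is provided by L<DBIx::Class::Storage::DBI::Replicated>):

```perl
# Inside a transaction everything, reads included, goes to the master:
$schema->txn_do(sub {
    my $count = $schema->resultset('Artist')->count;   # read from master
    $schema->resultset('Artist')->create({ name => "Artist $count" });
});

# Or force master reads for a block without wrapping it in a transaction:
my @artists = $schema->storage->execute_reliably(sub {
    $schema->resultset('Artist')->all;
});
```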

=head1 PARTS OF REPLICATED STORAGE

A replicated storage contains several parts. First, there is the replicated
storage itself (L<DBIx::Class::Storage::DBI::Replicated>). A replicated storage
takes a pool of replicants (L<DBIx::Class::Storage::DBI::Replicated::Pool>)
and a software balancer (L<DBIx::Class::Storage::DBI::Replicated::Balancer>).
The balancer does the job of splitting up all the read traffic amongst the
replicants in the Pool. Currently there are two types of balancers: a Random
one, which chooses a Replicant in the Pool using a naive randomizer algorithm,
and a First balancer, which just uses the first one in the Pool (and obviously
is only of value when you have a single replicant).

=head1 REPLICATED STORAGE CONFIGURATION

All the parts of replication can be altered dynamically at runtime, which makes
it possible to create a system that automatically scales under load by creating
more replicants as needed, perhaps using a cloud system such as Amazon EC2.
However, for common use you can set up your replicated storage to be enabled at
the time you connect the databases. The following is a breakdown of how you
may wish to do this. Again, if you are using L<Catalyst>, I strongly recommend
you use (or upgrade to) the latest L<Catalyst::Model::DBIC::Schema>, which makes
this job even easier.

First, you need to get a C<$schema> object and set the storage_type:

  my $schema = MyApp::Schema->clone;
  $schema->storage_type([
    '::DBI::Replicated' => {
      balancer_type => '::Random',
      balancer_args => {
        auto_validate_every => 5,
        master_read_weight => 1,
      },
      pool_args => {
        maximum_lag => 2,
      },
    }
  ]);

Then, you need to connect your L<DBIx::Class::Schema>:

  $schema->connection($dsn, $user, $pass);

Let's break down the settings. The method L<DBIx::Class::Schema/storage_type>
takes one mandatory parameter, a scalar value, and an optional second value
which is a hash reference of configuration options for that storage. In this
case, we are setting the Replicated storage type using '::DBI::Replicated' as
the first value. You will only use a different value if you are subclassing
the replicated storage, so for now just copy that first parameter.

The second parameter contains a hash reference of options that get passed to
the replicated storage. L<DBIx::Class::Storage::DBI::Replicated/balancer_type>
is the type of software load balancer you will use to split up traffic among
all your replicants. Right now we have two options, "::Random" and "::First".
You can review documentation for both at:

L<DBIx::Class::Storage::DBI::Replicated::Balancer::First>,
L<DBIx::Class::Storage::DBI::Replicated::Balancer::Random>.

In this case we will have three replicants, so the ::Random option is the only
one that makes sense.
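
If you had only a single replicant, the ::First balancer would be the natural
choice instead (a sketch; the rest of the configuration stays the same):

```perl
$schema->storage_type([
  '::DBI::Replicated' => {
    balancer_type => '::First',   # always read from the first (only) replicant
  }
]);
```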

'balancer_args' get passed to the balancer when it's instantiated. All
balancers have the 'auto_validate_every' option. This is the number of seconds
we allow to pass between validation checks on a load balanced replicant. So
the higher the number, the greater the possibility that your reads from a
replicant may be inconsistent with what's on the master. Setting this number
too low will result in increased database load, so choose a number with care.
Our experience is that setting it to around 5 seconds results in a good
performance / integrity balance.

'master_read_weight' is an option associated with the ::Random balancer. It
allows you to let the master be read from. I usually leave this off (the
default is off).

The 'pool_args' are configuration options associated with the replicant pool.
This object (L<DBIx::Class::Storage::DBI::Replicated::Pool>) manages all the
declared replicants. 'maximum_lag' is the number of seconds a replicant is
allowed to lag behind the master before being temporarily removed from the
pool. Keep in mind that the Balancer option 'auto_validate_every' determines
how often a replicant is tested against this condition, so the true possible
lag can be higher than the number you set. The default is zero.

No matter how low you set the maximum_lag or the auto_validate_every settings,
with MySQL native replication there is always the chance that your replicants
will lag a bit behind the master. You can ensure reliable reads by using a
transaction, which will force both read and write activity to the master;
however, this will increase the load on your master database.

After you've configured the replicated storage, you need to add the connection
information for the replicants:

  $schema->storage->connect_replicants(
    [$dsn1, $user, $pass, \%opts],
    [$dsn2, $user, $pass, \%opts],
    [$dsn3, $user, $pass, \%opts],
  );

These replicants should be configured as slaves to the master using the
instructions for MySQL native replication, or if you are just learning, you
will find L<MySQL::Sandbox> an easy way to set up a replication cluster.

And now your $schema object is properly configured! Enjoy!

=head1 AUTHOR

John Napiorkowski <jjnapiork@cpan.org>

=head1 LICENSE

You may distribute this code under the same terms as Perl itself.

=cut

1;