lib/DBIx/Class/Storage/DBI/Replicated/Introduction.pod

=head1 NAME

DBIx::Class::Storage::DBI::Replicated::Introduction - Minimum Need to Know

=head1 SYNOPSIS

This is an introductory document for L<DBIx::Class::Storage::DBI::Replicated>.

This document is not an overview of what replication is or why you should be
using it. It is not a document explaining how to set up MySQL native
replication either. Copious external resources are available for both. This
document presumes you have the basics down.

=head1 DESCRIPTION

L<DBIx::Class> supports a framework for using database replication. This system
is integrated completely, which means that once it is set up you should be able
to start using a replication cluster without additional work or changes to your
code. Some caveats apply, primarily related to the proper use of transactions
(you are wrapping all your database-modifying statements inside a transaction,
right? ;) ). However, in our experience, properly written DBIC code will work
transparently with Replicated storage.

Currently we have support for MySQL native replication, which is relatively
easy to install and configure.  We also currently support a single master with
one or more replicants (also called 'slaves' in some documentation).  However,
the framework is not specifically tied to MySQL, and supporting other
replication systems or topologies should be possible.  Please bring your
patches and ideas to the #dbix-class IRC channel or the mailing list.

For an easy way to start playing with MySQL native replication, see
L<MySQL::Sandbox>.

If you are using this with a L<Catalyst> based application, you may also want
to see more recent updates to L<Catalyst::Model::DBIC::Schema>, which has
support for replication configuration options as well.

=head1 REPLICATED STORAGE

By default, when you start L<DBIx::Class>, your Schema (L<DBIx::Class::Schema>)
is assigned a storage_type, which when fully connected will reflect your
underlying storage engine as defined by your chosen database driver.  For
example, if you connect to a MySQL database, your storage_type will be
L<DBIx::Class::Storage::DBI::mysql>.  Your storage type class will contain
database-specific code to help smooth over the differences between databases
and let L<DBIx::Class> do its thing.

If you want to use replication, you will override this setting so that the
replicated storage engine will 'wrap' your underlying storages and present
a unified interface to the end programmer.  This wrapper storage class will
delegate method calls to either a master database or one or more replicated
databases, depending on whether they are read-only (by default sent to the
replicants) or writes (reserved for the master).  Additionally, the Replicated
storage will monitor the health of your replicants and automatically drop one
should it exceed configurable parameters.  Later, it can automatically restore
a replicant when its health is restored.
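
The read/write delegation described above can be pictured with a small,
self-contained sketch.  This is not the code used by the Replicated storage;
the C<route_for> helper and its list of read methods are hypothetical and only
illustrate the idea of choosing a backend by the kind of call.

  use strict;
  use warnings;

  # Hypothetical classification of storage methods; the real storage
  # decides this internally and its list may differ.
  my %is_read = map { $_ => 1 } qw(select select_single);

  # Return the kind of backend a storage call would be delegated to.
  sub route_for {
    my ($method) = @_;
    return $is_read{$method} ? 'replicant' : 'master';
  }

  print route_for('select'), "\n";   # replicant
  print route_for('insert'), "\n";   # master
  print route_for('update'), "\n";   # master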

This gives you a very robust system, since you can add or drop replicants
and DBIC will automatically adjust itself accordingly.

Additionally, if you need high data integrity, such as when you are executing
a transaction, replicated storage will automatically delegate all database
traffic to the master storage.  There are several ways to enable this high
integrity mode, but wrapping your statements inside a transaction is the
easiest and canonical option.
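
As a minimal sketch of the transaction route, the helper below wraps a write
and the read that follows it in C<txn_do> (see L<DBIx::Class::Schema/txn_do>).
The C<create_and_reread> name, the 'Artist' result source and its 'name' column
are placeholders, not part of DBIC; substitute your own schema.

  use strict;
  use warnings;

  # Create a row and immediately read it back.  Because both statements
  # run inside txn_do, Replicated storage should send them to the master,
  # so the read cannot hit a replicant that has not caught up yet.
  # 'Artist' and 'name' are placeholders for your own schema.
  sub create_and_reread {
    my ($schema, $name) = @_;

    return $schema->txn_do(sub {
      my $rs     = $schema->resultset('Artist');
      my $artist = $rs->create({ name => $name });
      return $rs->find($artist->id);
    });
  }

L<DBIx::Class::Storage::DBI::Replicated> documents further ways to reach the
master, such as C<execute_reliably> and the C<force_pool> resultset attribute;
consult it for the exact semantics.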

=head1 PARTS OF REPLICATED STORAGE

A replicated storage contains several parts.  First, there is the replicated
storage itself (L<DBIx::Class::Storage::DBI::Replicated>).  A replicated storage
takes a pool of replicants (L<DBIx::Class::Storage::DBI::Replicated::Pool>)
and a software balancer (L<DBIx::Class::Storage::DBI::Replicated::Balancer>).
The balancer does the job of splitting up all the read traffic amongst the
replicants in the Pool.  Currently there are two types of balancers: a Random
one, which chooses a replicant in the Pool using a naive randomizer algorithm,
and a First one, which just uses the first replicant in the Pool (and obviously
is only of value when you have a single replicant).
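
The difference between the two balancing strategies can be illustrated with
two tiny functions.  C<pick_first> and C<pick_random> are hypothetical
stand-ins written for this document, not the balancer classes themselves.

  use strict;
  use warnings;

  # Hypothetical stand-ins for the two balancing strategies; each takes
  # the list of currently usable replicants and returns one of them.
  sub pick_first  { my @pool = @_; return $pool[0] }
  sub pick_random { my @pool = @_; return $pool[ int rand @pool ] }

  my @pool = qw(replicant1 replicant2 replicant3);

  print pick_first(@pool), "\n";    # always replicant1
  print pick_random(@pool), "\n";   # any one of the three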

=head1 REPLICATED STORAGE CONFIGURATION

All the parts of replication can be altered dynamically at runtime, which makes
it possible to create a system that automatically scales under load by creating
more replicants as needed, perhaps using a cloud system such as Amazon EC2.
However, for common use you can set up your replicated storage to be enabled at
the time you connect the databases.  The following is a breakdown of how you
may wish to do this.  Again, if you are using L<Catalyst>, I strongly recommend
you use (or upgrade to) the latest L<Catalyst::Model::DBIC::Schema>, which makes
this job even easier.

First, you need to get a C<$schema> object and set the storage_type:

  my $schema = MyApp::Schema->clone;
  $schema->storage_type([
    '::DBI::Replicated' => {
      balancer_type => '::Random',
      balancer_args => {
        auto_validate_every => 5,
        master_read_weight => 1,
      },
      pool_args => {
        maximum_lag => 2,
      },
    },
  ]);

Then, you need to connect your L<DBIx::Class::Schema>.

  $schema->connection($dsn, $user, $pass);

Let's break down the settings.  The method L<DBIx::Class::Schema/storage_type>
takes one mandatory parameter, a scalar value, and an optional second value
which is a hash reference of configuration options for that storage.  In the
example above, both values are passed together inside a single array reference.
In this case, we are setting the Replicated storage type using
'::DBI::Replicated' as the first value.  You will only use a different value if
you are subclassing the replicated storage, so for now just copy that first
parameter.

The second parameter contains a hash reference of options that get passed to
the replicated storage.  L<DBIx::Class::Storage::DBI::Replicated/balancer_type>
is the type of software load balancer you will use to split up traffic among
all your replicants.  Right now we have two options, "::Random" and "::First".
You can review documentation for both at:

L<DBIx::Class::Storage::DBI::Replicated::Balancer::First>,
L<DBIx::Class::Storage::DBI::Replicated::Balancer::Random>.

In this case we will have three replicants, so the ::Random option is the only
one that makes sense.

'balancer_args' get passed to the balancer when it's instantiated.  All
balancers have the 'auto_validate_every' option.  This is the number of seconds
we allow to pass between validation checks on a load balanced replicant. So
the higher the number, the greater the chance that your reads from a replicant
will be inconsistent with what's on the master.  Setting this number too low
will result in increased database load, so choose a number with care.  Our
experience is that setting the number to around 5 seconds results in a good
performance / integrity balance.
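
To make the effect of 'auto_validate_every' concrete, here is a rough sketch of
a time-based check.  The C<validation_due> helper is hypothetical and assumes
the balancer simply compares the time since the last validation with the
configured interval; the actual implementation may differ in detail.

  use strict;
  use warnings;

  # Hypothetical helper: decide whether the replicants are due for another
  # health check.  $last_checked and $now are epoch seconds.
  sub validation_due {
    my ($last_checked, $now, $auto_validate_every) = @_;
    return ($now - $last_checked) > $auto_validate_every ? 1 : 0;
  }

  my $last_checked = 1_000;
  print validation_due($last_checked, 1_003, 5), "\n";   # 0, checked 3s ago
  print validation_due($last_checked, 1_007, 5), "\n";   # 1, checked 7s ago

With an interval of 5, a replicant that starts lagging right after a check can
keep serving reads for up to about 5 more seconds before it is looked at again.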

'master_read_weight' is an option associated with the ::Random balancer. It
allows you to let the master be read from.  I usually leave this off (default
is off).
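
The sketch below shows one way such a weight could work.  It assumes the weight
is expressed relative to a single replicant, so that a weight of 1 gives the
master the same chance as any one replicant; that interpretation and the
C<pick_backend> helper are assumptions made for illustration, so check
L<DBIx::Class::Storage::DBI::Replicated::Balancer::Random> for the real rules.

  use strict;
  use warnings;

  # Hypothetical weighted choice.  Each replicant counts as 1 and the master
  # counts as $master_read_weight; with at least one replicant, a weight of 0
  # never selects the master.
  sub pick_backend {
    my ($master_read_weight, @replicants) = @_;
    my $roll = rand(@replicants + $master_read_weight);
    return $roll >= @replicants ? 'master' : $replicants[ int $roll ];
  }

  my @replicants = qw(replicant1 replicant2 replicant3);
  print pick_backend(0, @replicants), "\n";   # never 'master'
  print pick_backend(1, @replicants), "\n";   # 'master' about 1 time in 4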

The 'pool_args' are configuration options associated with the replicant pool.
This object (L<DBIx::Class::Storage::DBI::Replicated::Pool>) manages all the
declared replicants.  'maximum_lag' is the number of seconds a replicant is
allowed to lag behind the master before being temporarily removed from the pool.
Keep in mind that the Balancer option 'auto_validate_every' determines how often
a replicant is tested against this condition, so the true possible lag can be
higher than the number you set.  The default is zero.
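
The following sketch models a validation pass over the pool.  The
C<validate_replicants> function here and the plain hashes standing in for
replicants are hypothetical; the sketch only shows how a 'maximum_lag'
threshold can drop a lagging replicant and later let it back in.

  use strict;
  use warnings;

  # Hypothetical validation pass: mark each replicant active only while its
  # measured lag (in seconds) is within $maximum_lag.  A replicant that was
  # dropped becomes active again once its lag recovers.
  sub validate_replicants {
    my ($maximum_lag, $replicants) = @_;
    for my $r (@$replicants) {
      $r->{active} = $r->{lag} <= $maximum_lag ? 1 : 0;
    }
    return grep { $_->{active} } @$replicants;
  }

  my @replicants = (
    { name => 'replicant1', lag => 0 },
    { name => 'replicant2', lag => 7 },
  );

  my @active = validate_replicants(2, \@replicants);
  print join(',', map { $_->{name} } @active), "\n";   # replicant1

  $replicants[1]{lag} = 1;                             # replicant2 catches up
  @active = validate_replicants(2, \@replicants);
  print join(',', map { $_->{name} } @active), "\n";   # replicant1,replicant2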

No matter how low you set the maximum_lag or the auto_validate_every settings,
there is always the chance that your replicants will lag a bit behind the
master for the supported replication system built into MySQL.  You can ensure
reliable reads by using a transaction, which will force both read and write
activity to the master; however, this will increase the load on your master
database.

After you've configured the replicated storage, you need to add the connection
information for the replicants:

  $schema->storage->connect_replicants(
    [$dsn1, $user, $pass, \%opts],
    [$dsn2, $user, $pass, \%opts],
    [$dsn3, $user, $pass, \%opts],
  );
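
If your replicants differ only by host name, the list of connection entries can
be generated rather than written out by hand.  The host names, credentials,
database name and the C<replicant_connect_info> helper below are all
placeholders made up for this sketch.

  use strict;
  use warnings;

  # Placeholder values; substitute your own hosts and credentials.
  my @hosts = qw(db-replica1 db-replica2 db-replica3);
  my ($user, $pass) = ('myapp', 'secret');
  my %opts = (RaiseError => 1);

  # One [$dsn, $user, $pass, \%opts] entry per replicant host.
  sub replicant_connect_info {
    my ($hosts, $user, $pass, $opts) = @_;
    return map { [ "dbi:mysql:database=myapp;host=$_", $user, $pass, $opts ] }
      @$hosts;
  }

  my @connect_info = replicant_connect_info(\@hosts, $user, $pass, \%opts);
  print scalar(@connect_info), "\n";   # 3

  # With a configured schema you would then call:
  #   $schema->storage->connect_replicants(@connect_info);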

These replicants should be configured as slaves to the master using the
instructions for MySQL native replication, or if you are just learning, you
will find L<MySQL::Sandbox> an easy way to set up a replication cluster.

And now your $schema object is properly configured!  Enjoy!

=head1 FURTHER QUESTIONS?

Check the list of L<additional DBIC resources|DBIx::Class/GETTING HELP/SUPPORT>.

=head1 COPYRIGHT AND LICENSE

This module is free software L<copyright|DBIx::Class/COPYRIGHT AND LICENSE>
by the L<DBIx::Class (DBIC) authors|DBIx::Class/AUTHORS>. You can
redistribute it and/or modify it under the same terms as the
L<DBIx::Class library|DBIx::Class/COPYRIGHT AND LICENSE>.