From: John Napiorkowski
Date: Fri, 10 Jul 2009 16:00:38 +0000 (+0000)
Subject: pod cleanup, fixed broken pod links, and new Introduction pod
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=212cc5c25c31b2ec3ff4b4e20283321617db79e6;p=dbsrgits%2FDBIx-Class-Historic.git

pod cleanup, fixed broken pod links, and new Introduction pod
---

diff --git a/lib/DBIx/Class/Storage/DBI/Replicated.pm b/lib/DBIx/Class/Storage/DBI/Replicated.pm
index a625f82..6302bf1 100644
--- a/lib/DBIx/Class/Storage/DBI/Replicated.pm
+++ b/lib/DBIx/Class/Storage/DBI/Replicated.pm
@@ -77,12 +77,15 @@ attribute 'force_pool'. For example:
 
 Now $RS will force everything (both reads and writes) to use whatever was setup
 as the master storage. 'master' is hardcoded to always point to the Master, but
 you can also use any Replicant name. Please see:
-L and the replicants attribute for more.
+L and the replicants attribute for more.
 
 Also see transactions and L</execute_reliably> for alternative ways to force
 read traffic to the master. In general, you should wrap your statements in a
 transaction when you are reading and writing to the same tables at the same
 time, since your replicants will often lag a bit behind the master.
+
+See L<DBIx::Class::Storage::DBI::Replicated::Introduction> for more help and
+walkthroughs.
 
 =head1 DESCRIPTION
@@ -162,7 +165,7 @@ has 'pool_type' => (
 
 =head2 pool_args
 
 Contains a hashref of initialized information to pass to the Balancer object.
-See L for available arguments.
+See L<DBIx::Class::Storage::DBI::Replicated::Pool> for available arguments.
 
 =cut
@@ -195,7 +198,7 @@ has 'balancer_type' => (
 
 =head2 balancer_args
 
 Contains a hashref of initialized information to pass to the Balancer object.
-See L for available arguments.
+See L<DBIx::Class::Storage::DBI::Replicated::Balancer> for available arguments.
 
 =cut

diff --git a/lib/DBIx/Class/Storage/DBI/Replicated/Introduction.pod b/lib/DBIx/Class/Storage/DBI/Replicated/Introduction.pod
new file mode 100644
index 0000000..d31417c
--- /dev/null
+++ b/lib/DBIx/Class/Storage/DBI/Replicated/Introduction.pod
@@ -0,0 +1,185 @@
package DBIx::Class::Storage::DBI::Replicated::Introduction;

=head1 NAME

DBIx::Class::Storage::DBI::Replicated::Introduction - Minimum Need to Know

=head1 SYNOPSIS

This is an introductory document for L<DBIx::Class::Storage::DBI::Replicated>.

This document is not an overview of what replication is or why you should be
using it. It is not a document explaining how to set up MySQL native
replication either. Copious external resources are available for both. This
document presumes you have the basics down.

=head1 DESCRIPTION

L<DBIx::Class> supports a framework for using database replication. This
system is integrated completely, which means that once it's set up you should
be able to just start using a replication cluster without additional work or
changes to your code. Some caveats apply, primarily related to the proper use
of transactions (you are wrapping all your database-modifying statements
inside a transaction, right? ;) ). However, in our experience, properly
written DBIC code works transparently with Replicated storage.

Currently we have support for MySQL native replication, which is relatively
easy to install and configure. We also currently support a single master
replicating to one or more replicants (also called 'slaves' in some
documentation). However, the framework is not specifically tied to MySQL, and
supporting other replication systems or topologies should be possible. Please
bring your patches and ideas to the #dbix-class IRC channel or the mailing
list.

For an easy way to start playing with MySQL native replication, see:
L<MySQL::Sandbox>.
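
To make the transaction advice above concrete, here is a minimal sketch
(assuming a connectable "MyApp::Schema" class, the same name used later in
this walkthrough, with a hypothetical 'Artist' result source and placeholder
connection values). Wrapping a write and its follow-up read in
L<DBIx::Class::Schema/txn_do> keeps both on the master, so the read can never
hit a replicant that has not caught up yet:

  # Sketch only: 'Artist' is a stand-in result source and $dsn, $user and
  # $pass are placeholders for your real connection information.
  use MyApp::Schema;

  my $schema = MyApp::Schema->connect($dsn, $user, $pass);

  $schema->txn_do(sub {
    # The create() is a write, so it goes to the master...
    my $artist = $schema->resultset('Artist')->create({ name => 'Dead Salmon' });

    # ...and so does this follow-up read, because everything inside the
    # transaction is pinned to the master rather than a possibly lagging
    # replicant.
    my $reread = $schema->resultset('Artist')->find($artist->id);
  });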

If you are using this with a L<Catalyst> based application, you may also wish
to see more recent updates to L<Catalyst::Model::DBIC::Schema>, which has
support for replication configuration options as well.

=head1 REPLICATED STORAGE

By default, when you start L<DBIx::Class>, your Schema (L<DBIx::Class::Schema>)
is assigned a storage_type, which when fully connected will reflect your
underlying storage engine as defined by your chosen database driver. For
example, if you connect to a MySQL database, your storage_type will be
L<DBIx::Class::Storage::DBI::mysql>. Your storage type class will contain
database specific code to help smooth over the differences between databases
and let L<DBIx::Class> do its thing.

If you want to use replication, you will override this setting so that the
replicated storage engine will 'wrap' your underlying storages and present a
unified interface to the end programmer. This wrapper storage class will
delegate method calls to either a master database or one or more replicated
databases based on whether they are read-only (by default sent to the
replicants) or writes (reserved for the master). Additionally, the Replicated
storage will monitor the health of your replicants and automatically drop them
should one exceed configurable parameters. Later, it can automatically restore
a replicant when its health returns.

This gives you a very robust system, since you can add or drop replicants and
DBIC will automatically adjust itself accordingly.

Additionally, if you need high data integrity, such as when you are executing
a transaction, replicated storage will automatically delegate all database
traffic to the master storage. There are several ways to enable this high
integrity mode, but wrapping your statements inside a transaction is the
easiest and most canonical option.

=head1 PARTS OF REPLICATED STORAGE

A replicated storage contains several parts. First, there is the replicated
storage itself (L<DBIx::Class::Storage::DBI::Replicated>), which wraps a pool
of replicants (L<DBIx::Class::Storage::DBI::Replicated::Pool>) and a software
balancer (L<DBIx::Class::Storage::DBI::Replicated::Balancer>). The balancer
does the job of splitting up all the read traffic amongst the replicants in
the Pool. Currently there are two types of balancers: a Random one, which
chooses a Replicant in the Pool using a naive randomizer algorithm, and a
First one, which just uses the first replicant in the Pool (and obviously is
only of value when you have a single replicant).

=head1 REPLICATED STORAGE CONFIGURATION

All the parts of replication can be altered dynamically at runtime, which
makes it possible to create a system that automatically scales under load by
creating more replicants as needed, perhaps using a cloud system such as
Amazon EC2. However, for common use you can set up your replicated storage to
be enabled at the time you connect to the databases. The following is a
breakdown of how you may wish to do this. Again, if you are using L<Catalyst>,
I strongly recommend you use (or upgrade to) the latest
L<Catalyst::Model::DBIC::Schema>, which makes this job even easier.

First, you need to connect your L<DBIx::Class::Schema>. Let's assume you have
such a schema called "MyApp::Schema".

  use MyApp::Schema;
  my $schema = MyApp::Schema->connect($dsn, $user, $pass);

Next, you need to set the storage_type:

  $schema->storage_type([
    '::DBI::Replicated' => {
      balancer_type => '::Random',
      balancer_args => {
        auto_validate_every => 5,
        master_read_weight  => 1,
      },
      pool_args => {
        maximum_lag => 2,
      },
    },
  ]);

Let's break down the settings. The method L<DBIx::Class::Schema/storage_type>
takes an Array Reference with one mandatory element, a scalar value naming the
storage class, and an optional second element, which is a Hash Reference of
configuration options for that storage.
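
Before digging into each option, a rough sanity check can make the settings
tangible. This is only a sketch under assumptions: the replicated storage is
in effect on the connected schema, and the wrapper storage exposes accessors
named after the pieces above ('balancer', 'pool', 'auto_validate_every',
'maximum_lag'):

  # Sketch only: accessor names are assumed to mirror the attribute names
  # used in the configuration above.
  if ( $schema->storage->isa('DBIx::Class::Storage::DBI::Replicated') ) {
    printf "validate replicants every %s seconds\n",
      $schema->storage->balancer->auto_validate_every;   # 5, from balancer_args
    printf "maximum allowed replicant lag: %s seconds\n",
      $schema->storage->pool->maximum_lag;                # 2, from pool_args
  }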

In this case, we are setting the Replicated storage type using
'::DBI::Replicated' as the first value. You will only use a different value if
you are subclassing the replicated storage, so for now just copy that first
parameter.

The second element contains a hash reference of settings that get passed to
the replicated storage. L<DBIx::Class::Storage::DBI::Replicated/balancer_type>
is the type of software load balancer you will use to split up traffic among
all your replicants. Right now we have two options, "::Random" and "::First".
You can review documentation for both at:

L<DBIx::Class::Storage::DBI::Replicated::Balancer::Random>,
L<DBIx::Class::Storage::DBI::Replicated::Balancer::First>.

In this case we will have three replicants, so the ::Random option is the only
one that makes sense.

'balancer_args' gets passed to the balancer when it's instantiated. All
balancers have the 'auto_validate_every' option. This is the number of seconds
we allow to pass between validation checks on a load-balanced replicant. The
higher the number, the more likely your reads from a replicant will be
inconsistent with what's on the master. Setting this number too low will
result in increased database load, so choose a number with care. Our
experience is that setting it to around 5 seconds results in a good
performance / integrity balance.

'master_read_weight' is an option associated with the ::Random balancer. It
allows you to let the master be read from as well. I usually leave this off
(the default is off).

The 'pool_args' are configuration options associated with the replicant pool.
This object (L<DBIx::Class::Storage::DBI::Replicated::Pool>) manages all the
declared replicants. 'maximum_lag' is the number of seconds a replicant is
allowed to lag behind the master before being temporarily removed from the
pool. Keep in mind that the Balancer option 'auto_validate_every' determines
how often a replicant is tested against this condition, so the true possible
lag can be higher than the number you set. The default is zero.

No matter how low you set maximum_lag or auto_validate_every, there is always
a chance that your replicants will lag a bit behind the master with the
replication system built into MySQL. You can ensure reliable reads by using a
transaction, which will force both read and write activity to the master;
however, this will increase the load on your master database.

After you've configured the replicated storage, you need to add the connection
information for the replicants:

  $schema->storage->connect_replicants(
    [$dsn1, $user, $pass, \%opts],
    [$dsn2, $user, $pass, \%opts],
    [$dsn3, $user, $pass, \%opts],
  );

These replicants should be configured as slaves to the master using the
instructions for MySQL native replication, or, if you are just learning, you
will find L<MySQL::Sandbox> an easy way to set up a replication cluster.

And now your $schema object is properly configured! Enjoy!

=head1 AUTHOR

John Napiorkowski

=head1 LICENSE

You may distribute this code under the same terms as Perl itself.

=cut

1;
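
As a postscript to the walkthrough, transactions are not the only way to make
sure a read sees the master. The 'force_pool' search attribute mentioned in
the Replicated.pm hunks near the top of this commit pins a single resultset to
a named storage. A small sketch, assuming the $schema configured above and a
hypothetical 'Artist' result source:

  # Sketch only: 'Artist' is a stand-in result source. Reads through
  # $master_rs bypass the replicants entirely, so they never see stale data,
  # at the cost of extra load on the master.
  my $master_rs = $schema->resultset('Artist')
                         ->search(undef, { force_pool => 'master' });

  my $up_to_date_count = $master_rs->count;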