From: Rob Kinyon
Date: Mon, 10 Aug 2009 02:58:21 +0000 (-0400)
Subject: Start on the first draft of the spec
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=7848cf7a67ef93ea2ac29e65c23a7fa0b177dad1;p=dbsrgits%2Fdbic-future.git

Start on the first draft of the spec
---

diff --git a/lib/DBIx/Class/Manual/Specification.pm b/lib/DBIx/Class/Manual/Specification.pm
new file mode 100644
index 0000000..4fc09f9
--- /dev/null
+++ b/lib/DBIx/Class/Manual/Specification.pm
@@ -0,0 +1,218 @@
+=head1 NAME
+
+DBIx::Class::Manual::Specification
+
+=head1 SYNOPSIS
+
+This document discusses the specification for DBIx::Class v0.09. Along with
+discussing new features, it will also include a proper specification for
+features that exist in the DBIx::Class 0.08 versions that will be supported in
+DBIx::Class 0.09. Where appropriate, it will also discuss specifications for
+Data::Query and any other modules/distributions.
+
+=head1 MOTIVATIONS
+
+DBIx::Class has become one of, if not the, premier ORMs in the Perl community.
+However, its feature set has grown organically. As such, some features that
+were considered a good idea at one time proved to be implemented in a way that
+prevented proper implementation of other features down the road. Examples
+include prefetching and searching.
+
+This document means to provide a solid foundation for future development so as
+to prevent painting the distribution into a corner for features that have not
+been conceived of at this time.
+
+=head1 GOALS
+
+This specification will be driven by the following design goals.
+
+=head2 Database-agnostic usage
+
+One of the biggest selling points of an ORM is that a programmer should be able
+to change the relational database used to back a given application without
+having to change anything but configuration. And, to a point, this works
+relatively well. Until it doesn't. DBIx::Class 0.09 will change that.
+
+=head2 Database-specific optimization
+
+As it turns out, the drive to be database-agnostic makes it very hard to
+optimize for a given specific database's strengths. In order to support as many
+databases as possible, the SQL generated generally ends up being at the lowest
+common denominator. This gives ORMs a bad name where performance is concerned.
+But it doesn't have to be that way, and DBIx::Class 0.09 will demonstrate how.
+
+=head2 Datastore-agnostic usage
+
+Data doesn't just live in relational databases anymore. There are many reasons
+for this, the primary one being that the relational model doesn't necessarily
+reflect the structure of certain data models. But there are many times when
+relational calculus represents the proper way of manipulating that data. Or,
+more commonly, a given relational query needs to merge data from relational and
+non-relational sources. DBIx::Class, along with Data::Query, will provide a
+single API for manipulating data in a relational fashion, regardless of whether
+that manipulation is in a datastore, within Perl, or both.
+
+=head2 Intuitive API
+
+Every API has unavoidable gribbly bits. This falls out of the fact that the
+problem space a given API solves can never be fully mapped to it (an extension
+of Godel's Incompleteness Theorem). However, like Perl itself, the API should be
+organized and optimized so that the user only needs to know a minimum of the
+API's functionality in order to accomplish the most common tasks. Furthermore,
+that minimum should be easily discoverable in simple synopses and examples.
+
+=head2 Extensible API and backend
+
+Without breaking or, in as many cases as possible I, existing backend
+implementation details, the functionality of DBIx::Class should be extensible.
+Both the API (new contact points) and backend (new datastores, etc.) should be
+extensible using only a simple API for doing so.
+This implies that as many pieces as possible of the backend functionality
+should be overridable separately from every other piece.
+
+=head1 RESTRICTIONS
+
+Fill this part in.
+
+=head1 RELATIONAL THEORY
+
+(B To understand the fundamental underpinnings of DBIx::Class
+0.09's choice of features, it is necessary to have a common understanding of
+relational theory. I recommend that everyone read this section at least once in
+order to be familiar with the terms I use throughout this document.)
+
+Relational theory (calculus or algebra - the difference is esoteric) is, at its
+roots, all about set theory and set manipulations. Each row in a table (or
+tuple) is an element of a set. The various clauses (or operations) in a SQL
+statement can be viewed as set manipulators that take one or more sets of tuples
+as input and provide a set of tuples as output.
+
+A SQL statement has many clauses and they are evaluated in a very specific order
+that is different from the order they are presented. (As mutations share all of
+their clauses with queries and evaluate in the same order, only queries will be
+discussed here.) The order of evaluation for a SELECT statement is:
+
+=over 4
+
+=item * FROM (with JOINs evaluated in left-to-right order)
+
+=item * WHERE (with AND/OR evaluated in left-to-right order within each
+parenthetical block)
+
+=item * GROUP BY
+
+=item * HAVING
+
+=item * SELECT
+
+=item * ORDER BY
+
+=back
+
+(Expand here.)
+
+=head1 FEATURES
+
+=head2 Basic Features
+
+An ORM must provide some very basic features. They are, in essence, the ability
+to:
+
+=over 4
+
+=item * Retrieve data from a relational database (a.k.a., select)
+
+=item * Insert, update, and delete in that relational database (a.k.a., mutate)
+
+=back
+
+These functions are generally achieved by providing some API to the SELECT
+statement that returns a collection of objects. These objects then allow for
+updating and deleting.
+The same API that allows for SELECT will also, generally, allow for creation
+and, possibly, deletion. Most ORMs do this modelling by mapping a class to a
+table and an instance of that class to a given row in that table.
+
+DBIx::Class takes a different approach. Given rows in a table are represented by
+objects, but the table as a whole is also represented by an object. There is no
+class that represents the table in itself. Instead, there is a class that is
+used to instantiate the rows of a table and another class that is used to
+instantiate objects that represent SQL clauses for statements against that
+table. The former is the Row class and the latter is the ResultSet class. This
+separation allows for greater expressivity when dealing with a given table.
+
+DBIx::Class 0.09 extends these concepts by adding a third layer - the source.
+
+(Expand here.)
+
+=head3 Sources
+
+A source, at its simplest, is the representation of a single table in a
+relational database. For example, the table `artists`. But a source can be
+much more than that. Within a relational database, "`artists JOIN cds`" forms a
+source containing the columns of both `artists` and `cds`. Taken to the logical
+extreme, the FROM clause of a query forms a single source. A subquery can also
+be viewed as a source. (In fact, any table or join can be viewed as the subquery
+"(SELECT * FROM table) AS table".)
+
+=head3 Resultsets
+
+The resultset is arguably the single most important breakthrough in DBIx::Class.
+It allows for the gradual building of a SQL statement and reuse of that building
+for more than just SELECT statements. In addition to being able to separate
+responsibilities by letting different pieces of an application (such as security)
+decorate the resultset appropriately, a resultset can also be reused for various
+needs. The same resultset can be used for a query, a mass update, a mass delete,
+or anything else.
+
+Under the hood, a resultset has a representation for each SQL clause. Each usage
+of the resultset will take the clauses that make sense for that usage and leave
+the others.
+
+(Expand here, detailing each usage of a resultset and the clauses that each
+uses.)
+
+=head3 Rows
+
+Traditionally, the row object is a hashref representing a row in a table. This
+representation is simple to implement, meets the 80/20 case, and is completely
+wrong. It fails when dealing with GROUP BY clauses and custom queries. The
+proper treatment of a row object is that it represents the SELECT clause of the
+resultset that generated it. In the 80/20 case, the SELECT clause will be all
+the columns of the primary table being selected. Moving the definition of the
+row object into the resultset brings that definition closer to its
+usage.
+
+Under the hood, the row object is defined by using anonymous classes built with
+roles, one role for each column. The roles for the columns are defined in the
+definitions for each table. Those roles flow from the source(s), through the
+resultset and into the row objects.
+
+=head2 Queries as streams
+
+A grouping of data can be viewed as either a collection or a stream. The main
+difference is that a collection is eager and a stream is lazy. For large
+datasets, collections can be very expensive in terms of memory and time (filling
+that memory). Streams, on the other hand, defer loading anything into memory
+until the last possible moment, but are easily convertible into a collection as
+needed. (See Higher-Order Perl for more information.)
+
+=head2
+
+=head1 TODO
+
+=over 4
+
+=item * L section needs to be filled in.
+
+=back
+
+=head1 AUTHOR(S)
+
+robkinyon: Rob Kinyon C<< >>
+
+=head1 LICENSE
+
+You may distribute this code under the same terms as Perl itself.
+
+=cut