From: Peter Rabbitson Date: Thu, 15 Oct 2015 15:32:34 +0000 (+0200) Subject: Clarify comments/documentation regarding DBIC/SQLA/DQ relationship X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=07fadea8d16b657cc1ee572b8f64fa7a688be853;p=dbsrgits%2FDBIx-Class-Historic.git Clarify comments/documentation regarding DBIC/SQLA/DQ relationship This discussion comes up several times per year, just put it to bed for good --- diff --git a/lib/DBIx/Class/SQLMaker.pm b/lib/DBIx/Class/SQLMaker.pm index fe07e28..25a0386 100644 --- a/lib/DBIx/Class/SQLMaker.pm +++ b/lib/DBIx/Class/SQLMaker.pm @@ -9,13 +9,13 @@ DBIx::Class::SQLMaker - An SQL::Abstract-based SQL maker class =head1 DESCRIPTION -This module is a subclass of L and includes a number of -DBIC-specific workarounds, not yet suitable for inclusion into the +This module is currently a subclass of L and includes a number of +DBIC-specific extensions/workarounds, not suitable for inclusion into the L core. It also provides all (and more than) the functionality of L, see L for more info. -Currently the enhancements to L are: +Currently the enhancements over L are: =over @@ -25,10 +25,102 @@ Currently the enhancements to L are: =item * C/C support (via extensions to the order_by parameter) +=item * A rudimentary multicolumn IN operator + =item * Support of C<...FOR UPDATE> type of select statement modifiers =back +=head1 ROADMAP + +Some maintainer musings on the current state of SQL generation within DBIC as +of Oct 2015 + +=head2 Folding of most (or all) of L into DBIC + +The rise of complex prefetch use, and the general streamlining of result +parsing within DBIC ended up pushing the actual SQL generation to the forefront +of many casual performance profiles. While the idea behind SQLA's API is sound, +the actual implementation is terribly inefficient (once again bumping into the +ridiculously high overhead of perl function calls). + +Given that SQLA has a B distinct life on its own, and is used within an +order of magnitude more projects compared to DBIC, it is prudent to B +disturb the current call chains within SQLA itself. Instead in the near future +an effort will be undertaken to seek a more thorough decoupling of DBIC SQL +generation from reliance on SQLA, possibly to a point where B at all. + +B library itself will continue being maintained> although +it is not likely to gain many extra features, notably dialect support, at least +not within the base C namespace. + +This work (if undertaken) will take into consideration the following +constraints: + +=over + +=item Main API compatibility + +The object returned by C<< $schema->storage->sqlmaker >> needs to be able to +satisfy most of the basic tests found in the current-at-the-time SQLA dist. +While things like L or L +or even worse L will definitely remain +unsupported, the rest of the tests should pass (within reason). + +=item Ability to plug back an SQL::Abstract (or derivative) + +During the initial work on L the test suite of DBIC turned out to +be an invaluable asset to iron out hard-to-reason-about corner cases. In +addition the test suite is much more vast and intricate than the tests of SQLA +itself. This state of affairs is way too valuable to sacrifice in order to gain +faster SQL generation. Thus a compile-time-ENV-check will be introduced along +with an extra CI configuration to ensure that DBIC is used with an off-the-CPAN +SQLA and that it continues to flawlessly run its entire test suite. While this +will undoubtedly complicate the implementation of the better performing SQL +generator, it will preserve both the usability of the test suite for external +projects and will keep L from regressions in the future. + +=back + +Aside from these constraints it is becoming more and more practical to simply +stop using SQLA in day-to-day production deployments of DBIC. The flexibility +of the internals is simply not worth the performance cost. + +=head2 Relationship to L + +When initial work on DQ was taking place, the tools in L<::Storage::DBIHacks +|http://github.com/dbsrgits/dbix-class/blob/current/blead/lib/DBIx/Class/Storage/DBIHacks.pm> +were only beginning to take shape, and it wasn't clear how important they will +become further down the road. In fact the I was +considered an ugly stop-gap, and even a couple of highly entertaining talks +were given to that effect. As the use-cases of DBIC were progressing, and +evidence for the importance of supporting arbitrary SQL was mounting, it became +clearer that DBIC itself would not really benefit in any way from an +integration with DQ, but on the contrary is likely to lose functionality while +the corners of the brand new DQ codebase are sanded off. + +The current status of DBIC/DQ integration is that the only benefit is for DQ by +having access to the very extensive "early adopter" test suite, in the same +manner as early DBIC benefitted tremendously from usurping the Class::DBI test +suite. As far as the DBIC user-base - there are no immediate practical upsides +to DQ integration, neither in terms of API nor in performance. + +So (as described higher up) the DBIC development effort will in the foreseable +future ignore the existence of DQ, and will continue optimizing the preexisting +SQLA-based solution, potentially "organically growing" its own compatible +implementation. Also (again, as described higher up) the ability to plug a +separate SQLA-compatible class providing the necessary surface API will remain +possible, and will be protected at all costs in order to continue providing DQ +access to the test cases of DBIC. + +In the short term, after one more pass over the ResultSet internals is +undertaken I, and before the SQLA/SQLMaker integration +takes place, the preexisting DQ-based branches will be pulled/modified/rebased +to get up-to-date with the current state of the codebase, which changed very +substantially since the last migration effort, especially in the SQL +classification meta-parsing codepath. + =cut use base qw/ @@ -146,8 +238,9 @@ sub select { if( $limiter = $self->can ('emulate_limit') ) { carp_unique( 'Support for the legacy emulate_limit() mechanism inherited from ' - . 'SQL::Abstract::Limit has been deprecated, and will be removed when ' - . 'DBIC transitions to Data::Query. If your code uses this type of ' + . 'SQL::Abstract::Limit has been deprecated, and will be removed at ' + . 'some future point, as it gets in the way of architectural and/or ' + . 'performance advances within DBIC. If your code uses this type of ' . 'limit specification please file an RT and provide the source of ' . 'your emulate_limit() implementation, so an acceptable upgrade-path ' . 'can be devised' @@ -211,9 +304,9 @@ sub insert { # optimized due to hotttnesss # my ($self, $table, $data, $options) = @_; - # SQLA will emit INSERT INTO $table ( ) VALUES ( ) + # FIXME SQLA will emit INSERT INTO $table ( ) VALUES ( ) # which is sadly understood only by MySQL. Change default behavior here, - # until SQLA2 comes with proper dialect support + # until we fold the extra pieces into SQLMaker properly if (! $_[2] or (ref $_[2] eq 'HASH' and !keys %{$_[2]} ) ) { my @bind; my $sql = sprintf( @@ -294,7 +387,10 @@ sub _recurse_fields { # things in the SQLA space need to have more info about the $rs they # create SQL for. The alternative would be to keep expanding the # signature of _select with more and more positional parameters, which -# is just gross. All hail SQLA2! +# is just gross. +# +# FIXME - this will have to transition out to a subclass when the effort +# of folding the SQLA machinery into SQLMaker takes place sub _parse_rs_attrs { my ($self, $arg) = @_; @@ -495,9 +591,14 @@ sub _join_condition { return $self->_recurse_where($cond); } -# This is hideously ugly, but SQLA does not understand multicol IN expressions -# FIXME TEMPORARY - DQ should have native syntax for this -# moved here to raise API questions +# !!! EXPERIMENTAL API !!! WILL CHANGE !!! +# +# This is rather odd, but vanilla SQLA does not have support for multicolumn IN +# expressions +# Currently has only one callsite in ResultSet, body moved into this subclass +# of SQLA to raise API questions like: +# - how do we convey a list of idents...? +# - can binds reside on lhs? # # !!! EXPERIMENTAL API !!! WILL CHANGE !!! sub _where_op_multicolumn_in { diff --git a/lib/DBIx/Class/SQLMaker/LimitDialects.pm b/lib/DBIx/Class/SQLMaker/LimitDialects.pm index cd4d1bd..47027cb 100644 --- a/lib/DBIx/Class/SQLMaker/LimitDialects.pm +++ b/lib/DBIx/Class/SQLMaker/LimitDialects.pm @@ -273,8 +273,8 @@ EOS # method, and the slower BETWEEN query is used instead # # FIXME - this is quite expensive, and does not perform caching of any sort - # as soon as some of the DQ work becomes viable consider switching this - # over + # as soon as some of the SQLA-inlining work becomes viable consider adding + # some rudimentary caching support if ( $rs_attrs->{order_by} and diff --git a/lib/DBIx/Class/Storage/DBI/mysql.pm b/lib/DBIx/Class/Storage/DBI/mysql.pm index b790154..1e76d6b 100644 --- a/lib/DBIx/Class/Storage/DBI/mysql.pm +++ b/lib/DBIx/Class/Storage/DBI/mysql.pm @@ -44,13 +44,19 @@ sub _prep_for_execute { return $self->next::method(@_) if ( $_[0] eq 'select' or $_[0] eq 'insert' ); - # FIXME FIXME FIXME - this is a terrible, gross, incomplete hack - # it should be trivial for mst to port this to DQ (and a good - # exercise as well, since we do not yet have such wide tree walking - # in place). For the time being this will work in limited cases, - # mainly complex update/delete, which is really all we want it for - # currently (allows us to fix some bugs without breaking MySQL in - # the process, and is also crucial for Shadow to be usable) + # FIXME FIXME FIXME - this is a terrible, gross, incomplete, MySQL-specific + # hack but it works rather well for the limited amount of actual use cases + # which can not be done in any other way on MySQL. This allows us to fix + # some bugs without breaking MySQL support in the process and is also + # crucial for more complex things like Shadow to be usable + # + # This code is just a pre-analyzer, working in tandem with ::SQLMaker::MySQL, + # where the possibly-set value of {_modification_target_referenced_re} is + # used to demarcate which part of the final SQL to double-wrap in a subquery. + # + # This is covered extensively by "offline" tests, so when the DQ work + # resumes, this will get flagged. Afaik there are no AST-visitor code of that + # magnitude yet (Oct 2015) within DQ, so a good exercise overall. # extract the source name, construct modification indicator re my $sm = $self->sql_maker; diff --git a/lib/DBIx/Class/Storage/DBIHacks.pm b/lib/DBIx/Class/Storage/DBIHacks.pm index 20966ac..c129b1e 100644 --- a/lib/DBIx/Class/Storage/DBIHacks.pm +++ b/lib/DBIx/Class/Storage/DBIHacks.pm @@ -2,9 +2,24 @@ package #hide from PAUSE DBIx::Class::Storage::DBIHacks; # -# This module contains code that should never have seen the light of day, -# does not belong in the Storage, or is otherwise unfit for public -# display. The arrival of SQLA2 should immediately obsolete 90% of this +# This module contains code supporting a battery of special cases and tests for +# many corner cases pushing the envelope of what DBIC can do. When work on +# these utilities began in mid 2009 (51a296b402c) it wasn't immediately obvious +# that these pieces, despite their misleading on-first-sighe-flakiness, will +# become part of the generic query rewriting machinery of DBIC, allowing it to +# both generate and process queries representing incredibly complex sets with +# reasonable efficiency. +# +# Now (end of 2015), more than 6 years later the routines in this class have +# stabilized enough, and are meticulously covered with tests, to a point where +# an effort to formalize them into user-facing APIs might be worthwhile. +# +# An implementor working on publicizing and/or replacing the routines with a +# more modern SQL generation framework should keep in mind that pretty much all +# existing tests are constructed on the basis of real-world code used in +# production somewhere. +# +# Please hack on this responsibly ;) # use strict; @@ -346,27 +361,53 @@ sub _adjust_select_args_for_complex_prefetch { }); } - # This is totally horrific - the {where} ends up in both the inner and outer query - # Unfortunately not much can be done until SQLA2 introspection arrives, and even - # then if where conditions apply to the *right* side of the prefetch, you may have - # to both filter the inner select (e.g. to apply a limit) and then have to re-filter - # the outer select to exclude joins you didn't want in the first place + # FIXME: The {where} ends up in both the inner and outer query, i.e. *twice* + # + # This is rather horrific, and while we currently *do* have enough + # introspection tooling available to attempt a stab at properly deciding + # whether or not to include the where condition on the outside, the + # machinery is still too slow to apply it here. + # Thus for the time being we do not attempt any sanitation of the where + # clause and just pass it through on both sides of the subquery. This *will* + # be addressed at a later stage, most likely after folding the SQL generator + # into SQLMaker proper # # OTOH it can be seen as a plus: (notes that this query would make a DBA cry ;) + # return $outer_attrs; } +# This is probably the ickiest, yet most relied upon part of the codebase: +# this is the place where we take arbitrary SQL input and break it into its +# constituent parts, making sure we know which *sources* are used in what +# *capacity* ( selecting / restricting / grouping / ordering / joining, etc ) +# Although the method is pretty horrific, the worst thing that can happen is +# for a classification failure, which in turn will result in a vocal exception, +# and will lead to a relatively prompt fix. +# The code has been slowly improving and is covered with a formiddable battery +# of tests, so can be considered "reliably stable" at this point (Oct 2015). # -# I KNOW THIS SUCKS! GET SQLA2 OUT THE DOOR SO THIS CAN DIE! +# A note to implementors attempting to "replace" this - keep in mind that while +# there are multiple optimization avenues, the actual "scan literal elements" +# part *MAY NEVER BE REMOVED*, even if it is limited only ot the (future) AST +# nodes that are deemed opaque (i.e. contain literal expressions). The use of +# blackbox literals is at this point firmly a user-facing API, and is one of +# *the* reasons DBIC remains as flexible as it is. In other words, when working +# on this keep in mind that the following is widespread and *encouraged* way +# of using DBIC in the wild when push comes to shove: +# +# $rs->search( {}, { +# select => \[ $random, @stuff], +# from => \[ $random, @stuff ], +# where => \[ $random, @stuff ], +# group_by => \[ $random, @stuff ], +# order_by => \[ $random, @stuff ], +# } ) +# +# Various incarnations of the above are reflected in many of the tests. If one +# gets to fail, you get to fix it. A "this is crazy, nobody does that" is not +# acceptable going forward. # -# Due to a lack of SQLA2 we fall back to crude scans of all the -# select/where/order/group attributes, in order to determine what -# aliases are needed to fulfill the query. This information is used -# throughout the code to prune unnecessary JOINs from the queries -# in an attempt to reduce the execution time. -# Although the method is pretty horrific, the worst thing that can -# happen is for it to fail due to some scalar SQL, which in turn will -# result in a vocal exception. sub _resolve_aliastypes_from_select_args { my ( $self, $attrs ) = @_; @@ -678,12 +719,7 @@ sub _group_over_selection { # of the external order and convert them to MIN(X) for ASC or MAX(X) # for DESC, and group_by the root columns. The end result should be # exactly what we expect - - # FIXME - this code is a joke, will need to be completely rewritten in - # the DQ branch. But I need to push a POC here, otherwise the - # pesky tests won't pass - # wrap any part of the order_by that "responds" to an ordering alias - # into a MIN/MAX + # $sql_maker ||= $self->sql_maker; $order_chunks ||= [ map { ref $_ eq 'ARRAY' ? $_ : [ $_ ] } $sql_maker->_order_by_chunks($attrs->{order_by}) @@ -691,6 +727,8 @@ sub _group_over_selection { my ($chunk, $is_desc) = $sql_maker->_split_order_chunk($order_chunks->[$o_idx][0]); + # we reached that far - wrap any part of the order_by that "responded" + # to an ordering alias into a MIN/MAX $new_order_by[$o_idx] = \[ sprintf( '%s( %s )%s', ($is_desc ? 'MAX' : 'MIN'), @@ -1041,7 +1079,9 @@ sub _extract_colinfo_of_stable_main_source_order_by_portion { # resultset {where} stacks # # FIXME - while relatively robust, this is still imperfect, one of the first -# things to tackle with DQ +# things to tackle when we get access to a formalized AST. Note that this code +# is covered by a *ridiculous* amount of tests, so starting with porting this +# code would be a rather good exercise sub _collapse_cond { my ($self, $where, $where_is_anded_array) = @_;