Details in attached IRC log
magnet#dbic-cabal_
20131108.log
====
[19:01:38] <mst> using aggregates and distinct seems like it should be an error
[19:02:12] <mst> since it makes no sense to ask for "GROUP BY everything we select" and then select aggregates AFAICS
[19:03:20] <mst> so, yes, what we did before was ... wtf
[19:04:44] <ribasushi> mst: the point is *sometimes* a select in a group-by may make sense is my gut feeling, I basically wanted to "send the nonsense" along, and have the rdbms be the judge
[19:05:52] <mst> well ... what we did before was definitely completely wrong
[19:06:09] <mst> what we're doing now ... I don't want to release it without thinking it through some more, but it's certainly less wrong
[19:07:40] <mst> but I mean ... I dunno, I think group_over_selection makes sense for subquery wrapping
[19:08:04] <mst> but in the case of 'distinct => 1' I think "can't do that with an aggregate in select, pass a real group by you fucking pansy" would be pretty good too
[19:08:14] <mst> since I'm really not convinced it ever can make sense
[19:08:27] <ribasushi> mst: and I am not convinced it never makes sense
[19:08:36] <ribasushi> mst: so bubbling it up doesn't seem like a loss
[19:09:09] <mst> hmmm
[19:09:44] <ribasushi> mst: also I do not want to *entirely* discourage mysql/sqlite-centric grouppage - if a person wants to do that, they should be able to
[19:09:56] <ribasushi> hence why the explicit \'' rule
[19:10:03] <ribasushi> (that is back from 2009-ish actually)
[19:10:05] <mst> actually, looking at this
[19:10:23] <mst> the sql looks like what the user asked for
[19:10:38] <ribasushi> how so?
[19:10:59] <ribasushi> distinct is "final selection grouper" not "first selection"
[19:11:00] <mst> well, they created a resultset of
[19:11:04] <ribasushi> i.e. it is inherently lazy
[19:11:29] <ribasushi> right, it does not take effect on the current selection, didn't from before I took over
[19:11:53] <mst> SELECT year, COUNT(me.cdid) AS cnt ....
[19:11:57] <mst> and then grabbed the 'cnt' column
[19:13:06] <mst> ribasushi: ResultSet originally (0.05000) did '$attrs->{group_by} ||= $attrs->{select} if delete $attrs->{distinct};' in new()
[19:13:15] <mst> so, yes, it takes effect on the current selection
[19:13:36] <mst> so sql is correct according to the original behaviour as implemented
[19:13:36] <ribasushi> mst: but it did this in *_resolve_attrs*
[19:13:55] <ribasushi> which is after the selection is modified by RSETColumn
[19:14:15] <ribasushi> or to put it diff. - RSetColumn does not do a subquery, just like a ->search doesn't
[19:14:30] <mst> it didn't originally, it did it immediately you supplied the distinct attribute to a search
[19:14:40] <mst> anything else is a bug; fix that and we don't need this crap
[19:14:52] <ribasushi> well - when I came around it was lazified via the late resolver
[19:15:08] <ribasushi> and I had bugs reported when that behavior changed later on when I was doin prefetches
[19:15:17] <ribasushi> so this is crap in production for years
[19:15:41] <ribasushi> mst: https://github.com/dbsrgits/dbix-class/blob/master/t/search/distinct.t behaviors on non-aggregates
[19:15:55] <ribasushi> it accreted stuff obviously, the ages of tests are different
[19:17:01] <ribasushi> mst: basically this boils to "is distinct => 1 lazy or eager" and I am afraid this ship sailed around 0.07
[19:17:17] <ribasushi> "fixing" it will serve nothing but bring breakage
[19:17:48] <mst> yeah, it looks like it got broken when _resolved_attrs was factored out of new()
[19:17:53] <ribasushi> nod
[19:18:28] <ribasushi> so - if we assume "distinct is lazy" - you see why I would go the way I did with the latest fix
[19:19:22] <mst> ribasushi: distinct is lazy works fine, you just have to disallow aggregates in the select list in that case
[19:19:30] <mst> and make them do an explicit group_by for that
[19:20:31] <ribasushi> mst: aggregates via the {} syntax have been supported correctly forever - that is distinct knows to ignore them
[19:20:45] <ribasushi> mst: this test is more or less "aggregate in literal" (due to being an alias)
[19:20:47] <mst> even as far as 08000 it was straight up
[19:20:51] <ribasushi> so I can't disallow it
[19:20:56] <ribasushi> because it is... a literal
[19:20:59] <mst> $attrs->{group_by} ||= $attrs->{select} if delete $attrs->{distinct};
[19:21:11] <mst> so if we started allowing them via the {} syntax that wasn't me
[19:21:17] <ribasushi> mst: I came around 0.08010 ;)
[19:22:01] <mst> ribasushi: $attrs->{group_by} ||= $attrs->{select} if delete $attrs->{distinct};
[19:22:07] <mst> ribasushi: in 08011 which is the first release you did
[19:22:34] <ribasushi> mst: it is very possibkle that I broke it when I wasn't knowing wtf is on
[19:22:46] <mst> I think you made lazy distinct work "better"
[19:22:51] <mst> when it shouldn't've been lazy
[19:22:59] <ribasushi> mst: in any case - it's old shit, so even if I am responsible it still boils to "people depend on that"
[19:23:13] <ribasushi> mst: nod, may very well be my rookie mistake ;)
[19:23:15] <mst> sure. at this point it's "I fucked it up, then you made it worse, we'll have to deprecate it"
[19:23:19] <mst> most likely
[19:23:34] <mst> and the only question is damage control so we don't break current usages while isolating the code
[19:24:48] <ribasushi> deprecating it - I wouldn't go that far, I use distinct a lot myself because it does the group construction for me without me thinking
[19:25:01] <ribasushi> juggling selection/ordering is hard enough ;)
+++ /dev/null
-List of things riba needs to clear out before next official in order
-of importance:
-(Keep Getty happy)
-
-- Clarify/warn on the distinct over multiple columns get_column()
an exception object with broken string overloading
- Clarify ambiguous behavior of distinct when used with ResultSetColumn
i.e. $rs->search({}, { distinct => 1 })->get_column (...)
+ - Explicitly deprecate combination of distinct and selecting a
+ non-column via $rs->get_column()
- Setting quote_names propagates to SQL::Translator when producing
SQLite DDL (it is one of the few producers *NOT* quoting by default)
- Fix incorrect binding of large integers on old versions of
}
}
- # collapse the selector to a literal so that it survives a possible distinct parse
- # if it turns out to be an aggregate - at least the user will get a proper exception
- # instead of silent drop of the group_by altogether
- my $new = bless {
- _select => \ $rsrc->storage->sql_maker->_recurse_fields($select),
+ return bless {
+ _select => $select,
_as => $column,
- _parent_resultset => $new_parent_rs }, $class;
- return $new;
+ _parent_resultset => $new_parent_rs
+ }, $class;
}
=head2 as_query
sub _resultset {
my $self = shift;
- return $self->{_resultset} ||= $self->{_parent_resultset}->search(undef,
- {
- select => [$self->{_select}],
- as => [$self->{_as}]
+ return $self->{_resultset} ||= do {
+
+ my $select = $self->{_select};
+
+ if ($self->{_parent_resultset}{attrs}{distinct}) {
+ my $alias = $self->{_parent_resultset}->current_source_alias;
+ my $rsrc = $self->{_parent_resultset}->result_source;
+ my %cols = map { $_ => 1, "$alias.$_" => 1 } $rsrc->columns;
+
+ unless( $cols{$select} ) {
+ carp_unique(
+ 'Use of distinct => 1 while selecting anything other than a column '
+ . 'declared on the primary ResultSource is deprecated - please supply '
+ . 'an explicit group_by instead'
+ );
+
+ # collapse the selector to a literal so that it survives the distinct parse
+ # if it turns out to be an aggregate - at least the user will get a proper exception
+ # instead of silent drop of the group_by altogether
+ $select = \ $rsrc->storage->sql_maker->_recurse_fields($select);
+ }
}
- );
+
+ $self->{_parent_resultset}->search(undef, {
+ columns => { $self->{_as} => $select }
+ });
+ };
}
1;
use Test::More;
use Test::Exception;
+use Test::Warn;
use DBI::Const::GetInfoType;
use Scalar::Util qw/weaken/;
select => [ \ 'YEAR(year)' ], as => ['y'], distinct => 1,
});
- is_deeply (
- [ sort ($rs->get_column ('y')->all) ],
+ my $y_rs = $rs->get_column ('y');
+
+ warnings_exist { is_deeply (
+ [ sort ($y_rs->all) ],
[ sort keys %$cds_per_year ],
'Years group successfully',
- );
+ ) } qr/
+ \QUse of distinct => 1 while selecting anything other than a column \E
+ \Qdeclared on the primary ResultSource is deprecated\E
+ /x, 'deprecation warning';
+
$rs->create ({ artist => 1, year => '0-1-1', title => 'Jesus Rap' });
is_deeply (
- [ sort $rs->get_column ('y')->all ],
+ [ sort $y_rs->all ],
[ 0, sort keys %$cds_per_year ],
'Zero-year groups successfully',
);
year => { '!=', undef }
]});
- is_deeply (
+ warnings_exist { is_deeply (
[ $restrict_rs->get_column('y')->all ],
- [ $rs->get_column ('y')->all ],
+ [ $y_rs->all ],
'Zero year was correctly excluded from resultset',
- );
+ ) } qr/
+ \QUse of distinct => 1 while selecting anything other than a column \E
+ \Qdeclared on the primary ResultSource is deprecated\E
+ /x, 'deprecation warning';
}
# make sure find hooks determine driver
# would silently drop the group_by entirely, likely ending up with nonsensival results
# With the current behavior the user will at least get a nice fat exception from the
# RDBMS (or maybe the RDBMS will even decide to handle the situation sensibly...)
- is_same_sql_bind(
+ warnings_exist { is_same_sql_bind(
$rstypes->{'implicitly grouped'}->get_column('cnt')->as_query,
'(
SELECT COUNT( me.cdid )
=> 'evancarrol'
] ],
'Expected (though nonsensical) SQL generated on rscol-with-distinct-over-function',
- );
+ ) } qr/
+ \QUse of distinct => 1 while selecting anything other than a column \E
+ \Qdeclared on the primary ResultSource is deprecated\E
+ /x, 'deprecation warning';
{
local $TODO = 'multiplying join leaks through to the count aggregate... this may never actually work';
],
},
sqlbind => \[
- "( SELECT (SELECT id FROM cd me LIMIT ?) FROM artist me )",
+ "( SELECT (SELECT me.id FROM cd me LIMIT ?) FROM artist me )",
[ $ROWS => 1 ],
],
},
],
},
sqlbind => \[
- "( SELECT me.artistid, me.name, me.rank, me.charfield, (SELECT id FROM cd me LIMIT ?) FROM artist me )",
+ "( SELECT me.artistid, me.name, me.rank, me.charfield, (SELECT me.id FROM cd me LIMIT ?) FROM artist me )",
[ $ROWS => 1 ],
],
},