Constant -> Value with some enhancements
[dbsrgits/SQL-Abstract-2.0-ish.git] / lib / SQL / Abstract / Manual / Specification.pod
CommitLineData
d6e108eb 1=head1 NAME
2
3SQL::Abstract::Manual::Specification
4
5=head1 SYNOPSIS
6
7This discusses the specification for the AST provided by L<SQL::Abstract>. It is
8meant to describe how the AST is structured, various components provided by
9L<SQL::Abstract> for use with this AST, how to manipulate the AST, and various
10uses for the AST once it is generated.
11
12=head1 MOTIVATIONS
13
14L<SQL::Abstract> has been in use for many years. Originally created to handle
15the where-clause formation found in L<DBIx::Abstract>, it was generalized to
16manage the creation of any SQL statement through the use of Perl structures.
17Through the beating it received as the SQL generation syntax for L<DBIx::Class>,
18various deficiencies were found and a generalized SQL AST was designed. This
19document describes that AST.
20
21=head1 GOALS
22
23The goals for this AST are as follows:
24
25=head2 SQL-specific semantics
26
27Instead of attempting to be an AST to handle any form of query, this will
28instead be specialized to manage SQL queries (and queries that map to SQL
29queries). This means that there will be support for SQL-specific features, such
30as placeholders.
31
32=head2 Perl-specific semantics
33
34This AST is meant to be used from within Perl5 only. So, it will take advantage
35of as many Perl-specific features that make sense to use. No attempt whatosever
36will be made to make this AST work within any other language, including Perl6.
37
38=head2 Whole-lifecycle management
39
40Whether a query is built out of whole cloth in one shot or cobbled together from
41several snippets over the lifetime of a process, this AST will support any way
42to construct the query. Queries can also be built from other queries, so an
43UPDATE statement could be used as the basis for a SELECT statement, DELETE
44statement, or even a DDL statement of some kind.
45
46=head2 Dialect-agnostic usage
47
48Even though SQL itself has several ANSI specifications (SQL-92 and SQL-99 among
49them), this only serves as a basis for what a given RDBMS will expect. However,
50every engine has its own specific extensions and specific ways of handling
51common features. The API to the AST will provide ways of expressing common
52functionality in a common language. The emitters (objects that follow the
53Visitor pattern) will be responsible for converting that common language into
54RDBMS-specific SQL.
55
56=head1 AST STRUCTURE
57
58The AST will be a HoA (hash of arrays). The keys to the hash will be the various
59clauses of a SQL statement, plus some metadata keys. All metadata keys will be
60identifiable as such by being prefixed with an underscore. All keys will be in
61lowercase.
62
63=head2 Metadata keys
64
65These are the additional metadata keys that the AST provides for.
66
67=over 4
68
69=item * _query
70
71This denotes what kind of query this AST should be interpreted as.
72
73=item *
74
75=back
76
77=head2 Structural units
78
79Structural units in the AST are supported by loaded components. L<SQL::Abstract>
80provides for the following structural units by default:
81
82=head3 Identifier
83
84This is a (potentially) fully canonicalized identifier for a table or column. Is
85is of the structure C< [schema][sep][table][sep]column > or
86C< [schema][sep]table >.
87
88In the case of a two-element identifier which could be C< table[sep]column > or
89C< schema[sep]table >, context will determine which it is. However, the AST
90doesn't care which it is, only that it properly parses.
91
10000e9e 92=head3 Value
d6e108eb 93
10000e9e 94A Value is a Perl scalar. It may either be a:
95
96=over 4
97
98=item * String
99
100A String is a quoted series of characters
101
102=item * Number
103
104A Number is an unquoted number in some numeric format
105
106=item * NULL
107
108NULL corresponds to Perl's C<undef>
109
110=item * BindParameter
111
112This corresponds to a value that will be passed in. This value is normally
113quoted in such a fashion so as to protect against SQL injection attacks. (q.v.
114L<DBI/quote()> for an example.)
115
116=back
117
118=head3
d6e108eb 119
120=head3 Function
121
122A Function is anything of the form C< name( arglist ) > where C<name> is a
123string and C<arglist> is a comma-separated list of Expressions.
124
81cd86f1 125Yes, a Subquery is legal as an argument for many functions. Some example
126functions are:
127
128=over 4
129
130=item * C<< IN >>
131
132=item * C<< MAX >>
133
134=item * C<< MIN >>
135
136=item * C<< SUM >>
137
138=back
d6e108eb 139
140=head3 Subquery
141
142A Subquery is another AST whose _query metadata parameter is set to "SELECT".
143
144Most places that a Subquery can be used would require a single value to be
145returned (single column, single row), but that is not something that the AST can
146easily enforce. The single-column restriction can possibly be enforced, but the
147single-row restriction is much more difficult and, in most cases, probably
148impossible.
149
81cd86f1 150Subqueries, when expressed in SQL, must bounded by parentheses.
151
d6e108eb 152=head3 Unary Operator
153
154A UnaryOperator takes a single argument on the RHS and is one of the following:
155
156=over 4
157
158=item * C<< NOT >>
159
160=back
161
162=head3 BinaryOperator
163
164A BinaryOperator takes two arguments (one on the LHS and one on the RHS) and is
165one of the following:
166
167=over 4
168
169=item * C<< = >>
170
171=item * C<< != >>
172
173=item * C<< > >>
174
175=item * C<< < >>
176
177=item * C<< >= >>
178
179=item * C<< <= >>
180
181=item * C<< IS >>
182
183=item * C<< IS NOT >>
184
d6e108eb 185=back
186
187Note that an operator can comprise of what would be multiple tokens in a normal
188parsing effort.
189
190=head3 Expression
191
192An expression can be any one of the following:
193
194=over 4
195
10000e9e 196=item * Value
d6e108eb 197
198=item * Function
199
200=item * Subquery
201
202=item * UnaryOperator Expression
203
204=item * Expression BinaryOperator Expression
205
81cd86f1 206=item * ( Expression )
207
d6e108eb 208=back
209
81cd86f1 210Parentheses indicate precedence and, in some situations, are necessary for
211certain operators.
212
d6e108eb 213=head2 SQL clauses
214
10000e9e 215These are all the legal and acceptable clauses within the AST that would
216correpsond to clauses in a SQL statement. Not all clauses are legal within a
217given RDBMS engine's SQL dialect and some clauses may be required in one and
218optional in another. Detecting and enforcing those engine-specific restrictions
219is the responsibility of the Visitor object.
220
221The clauses are defined with a yacc-like syntax. The various parts are:
222
223=over 4
224
225=item * :=
226
227This means "defined" and is used to create a new term to be used below.
228
229=item * []
230
231This means optional and indicates that the items within it are optional.
232
233=item * []*
234
235This means optional and repeating as many times as desired.
236
237=item * |
238
239This means alternation. It is a binary operator and indicates that either the
240left or right hand sides may be used, but not both.
241
242=item * C<< <> >>
243
244This is a grouping construct. It means that all elements within this construct
245are treated together for the purposes of optional, repeating, alternation, etc.
246
247=back
248
d6e108eb 249The expected clauses are (name and structure):
250
251=head3 select
252
81cd86f1 253This corresponds to the SELECT clause of a SELECT statement.
254
255A select clause is composed as follows:
256
257 SelectComponent := Expression [ [ AS ] String ]
258
259 SelectComponent
260 [ , SelectComponent ]*
d6e108eb 261
262=head3 tables
263
264This is a list of tables that this clause is affecting. It corresponds to the
81cd86f1 265FROM clause in a SELECT statement and the INSERT INTO/UPDATE/DELETE clauses in
266those respective statements. Depending on the _query metadata entry, the
267appropriate clause name will be used.
d6e108eb 268
269The tables clause has several RDBMS-specific variations. The AST will support
270all of them and it is up to the Visitor object constructing the actual SQL to
271validate and/or use what is provided as appropriate.
272
273A table clause is composed as follows:
274
275 TableIdentifier := Identifier [ [ AS ] String ]
81cd86f1 276 JoinType := < LEFT|RIGHT [ OUTER ] > | INNER | CROSS
d6e108eb 277
278 TableIdentifier
279 [
280 < , TableIdentifier >
281 | <
282 [ JoinType ] JOIN TableIdentifier
283 [
284 < USING ( Identifier [ , Identifier ] ) >
285 | < ON [ ( ] Expression [ , Expression ] [ ) ] >
286 ]
287 >
288 ]*
289
290Additionally, where aliases are provided for in the TableIdentifier, those
291aliases must be used as the tablename in subsequent Identifiers that identify a
292column of that table.
293
294=head3 where
295
81cd86f1 296This corresponds to the WHERE clause in a SELECT, UPDATE, or DELETE statement.
297
298A where clause is composed as follows:
299
300 WhereOperator := AND | OR
301 WhereExpression := Expression | Expression WhereOperator Expression
302
303 WhereExpression
304
d6e108eb 305=head3 set
306
81cd86f1 307This corresponds to the SET clause in an INSERT or UPDATE statement.
308
309A set clause is composed as follows:
310
311 SetComponent := Identifier = Expression
312
313 SetComponent [ , SetComponent ]*
314
315=head3 columns
316
317This corresponds to the optional list of columns in an INSERT statement.
318
319A columns clause is composed as follows:
320
321 ( Identifier [ , Identifier ]* )
322
d6e108eb 323=head3 values
324
81cd86f1 325This corresponds to the VALUES clause in an INSERT statement.
326
327A values clause is composed as follows:
328
329 ( Expression [ , Expression ]* )
330
331If there is a columns clause, the number of entries in the values clause must be
332equal to the number of entries in the columns clause.
333
d6e108eb 334=head3 orderby
335
81cd86f1 336This corresponds to the ORDER BY clause in a SELECT statement.
337
338An orderby clause is composed as follows:
339
10000e9e 340 OrderByComponent := XXX-TODO-XXX
81cd86f1 341 OrderByDirection := ASC | DESC
342
343 OrderByComponent [ OrderByDirection ]
344 [ , OrderByComponent [ OrderByDirection ] ]*
345
d6e108eb 346=head3 groupby
347
81cd86f1 348This corresponds to the GROUP BY clause in a SELECT statement.
349
350An groupby clause is composed as follows:
351
10000e9e 352 GroupByComponent := XXX-TODO-XXX
81cd86f1 353
354 GroupByComponent [ , GroupByComponent ]*
355
d6e108eb 356=head3 rows
357
81cd86f1 358This corresponds to the clause that is used in some RDBMS engines to limit the
359number of rows returned by a query. In MySQL, this would be the LIMIT clause.
360
361A rows clause is composed as follows:
362
363 Number [, Number ]
364
d6e108eb 365=head3 for
366
81cd86f1 367This corresponds to the clause that is used in some RDBMS engines to indicate
368what locks are to be taken by this SELECT statement.
369
370A for clause is composed as follows:
371
372 UPDATE | DELETE
373
374=head3 connectby
375
376This corresponds to the clause that is used in some RDBMS engines to provide for
377an adjacency-list query.
378
379A connectby clause is composed as follows:
380
381 Identifier, WhereExpression
382
d6e108eb 383=head1 AUTHORS
384
81cd86f1 385robkinyon: Rob Kinyon C<< <rkinyon@cpan.org> >>
d6e108eb 386
387=head1 LICENSE
388
389You may distribute this code under the same terms as Perl itself.
390
391=cut