Commit | Line | Data |
d6e108eb |
1 | =head1 NAME |
2 | |
3 | SQL::Abstract::Manual::Specification |
4 | |
5 | =head1 SYNOPSIS |
6 | |
7 | This discusses the specification for the AST provided by L<SQL::Abstract>. It is |
8 | meant to describe how the AST is structured, various components provided by |
9 | L<SQL::Abstract> for use with this AST, how to manipulate the AST, and various |
10 | uses for the AST once it is generated. |
11 | |
12 | =head1 MOTIVATIONS |
13 | |
14 | L<SQL::Abstract> has been in use for many years. Originally created to handle |
15 | the where-clause formation found in L<DBIx::Abstract>, it was generalized to |
16 | manage the creation of any SQL statement through the use of Perl structures. |
17 | Through the beating it received as the SQL generation syntax for L<DBIx::Class>, |
18 | various deficiencies were found and a generalized SQL AST was designed. This |
19 | document describes that AST. |
20 | |
21 | =head1 GOALS |
22 | |
23 | The goals for this AST are as follows: |
24 | |
25 | =head2 SQL-specific semantics |
26 | |
27 | Instead of attempting to be an AST to handle any form of query, this will |
28 | instead be specialized to manage SQL queries (and queries that map to SQL |
29 | queries). This means that there will be support for SQL-specific features, such |
30 | as placeholders. |
31 | |
32 | =head2 Perl-specific semantics |
33 | |
34 | This AST is meant to be used from within Perl5 only. So, it will take advantage |
35 | of as many Perl-specific features that make sense to use. No attempt whatosever |
36 | will be made to make this AST work within any other language, including Perl6. |
37 | |
38 | =head2 Whole-lifecycle management |
39 | |
40 | Whether a query is built out of whole cloth in one shot or cobbled together from |
41 | several snippets over the lifetime of a process, this AST will support any way |
42 | to construct the query. Queries can also be built from other queries, so an |
43 | UPDATE statement could be used as the basis for a SELECT statement, DELETE |
44 | statement, or even a DDL statement of some kind. |
45 | |
46 | =head2 Dialect-agnostic usage |
47 | |
48 | Even though SQL itself has several ANSI specifications (SQL-92 and SQL-99 among |
49 | them), this only serves as a basis for what a given RDBMS will expect. However, |
50 | every engine has its own specific extensions and specific ways of handling |
51 | common features. The API to the AST will provide ways of expressing common |
52 | functionality in a common language. The emitters (objects that follow the |
53 | Visitor pattern) will be responsible for converting that common language into |
54 | RDBMS-specific SQL. |
55 | |
56 | =head1 AST STRUCTURE |
57 | |
58 | The AST will be a HoA (hash of arrays). The keys to the hash will be the various |
59 | clauses of a SQL statement, plus some metadata keys. All metadata keys will be |
60 | identifiable as such by being prefixed with an underscore. All keys will be in |
61 | lowercase. |
62 | |
63 | =head2 Metadata keys |
64 | |
65 | These are the additional metadata keys that the AST provides for. |
66 | |
67 | =over 4 |
68 | |
69 | =item * _query |
70 | |
71 | This denotes what kind of query this AST should be interpreted as. |
72 | |
73 | =item * |
74 | |
75 | =back |
76 | |
77 | =head2 Structural units |
78 | |
79 | Structural units in the AST are supported by loaded components. L<SQL::Abstract> |
80 | provides for the following structural units by default: |
81 | |
82 | =head3 Identifier |
83 | |
84 | This is a (potentially) fully canonicalized identifier for a table or column. Is |
85 | is of the structure C< [schema][sep][table][sep]column > or |
86 | C< [schema][sep]table >. |
87 | |
88 | In the case of a two-element identifier which could be C< table[sep]column > or |
89 | C< schema[sep]table >, context will determine which it is. However, the AST |
90 | doesn't care which it is, only that it properly parses. |
91 | |
92 | =head3 Constant |
93 | |
94 | A Constant is a Perl scalar. It may either be a String (quoted series of |
81cd86f1 |
95 | characters) or a number (unquoted) or NULL (corresponds to Perl's C<undef>). |
d6e108eb |
96 | |
97 | =head3 Function |
98 | |
99 | A Function is anything of the form C< name( arglist ) > where C<name> is a |
100 | string and C<arglist> is a comma-separated list of Expressions. |
101 | |
81cd86f1 |
102 | Yes, a Subquery is legal as an argument for many functions. Some example |
103 | functions are: |
104 | |
105 | =over 4 |
106 | |
107 | =item * C<< IN >> |
108 | |
109 | =item * C<< MAX >> |
110 | |
111 | =item * C<< MIN >> |
112 | |
113 | =item * C<< SUM >> |
114 | |
115 | =back |
d6e108eb |
116 | |
117 | =head3 Subquery |
118 | |
119 | A Subquery is another AST whose _query metadata parameter is set to "SELECT". |
120 | |
121 | Most places that a Subquery can be used would require a single value to be |
122 | returned (single column, single row), but that is not something that the AST can |
123 | easily enforce. The single-column restriction can possibly be enforced, but the |
124 | single-row restriction is much more difficult and, in most cases, probably |
125 | impossible. |
126 | |
81cd86f1 |
127 | Subqueries, when expressed in SQL, must bounded by parentheses. |
128 | |
d6e108eb |
129 | =head3 Unary Operator |
130 | |
131 | A UnaryOperator takes a single argument on the RHS and is one of the following: |
132 | |
133 | =over 4 |
134 | |
135 | =item * C<< NOT >> |
136 | |
137 | =back |
138 | |
139 | =head3 BinaryOperator |
140 | |
141 | A BinaryOperator takes two arguments (one on the LHS and one on the RHS) and is |
142 | one of the following: |
143 | |
144 | =over 4 |
145 | |
146 | =item * C<< = >> |
147 | |
148 | =item * C<< != >> |
149 | |
150 | =item * C<< > >> |
151 | |
152 | =item * C<< < >> |
153 | |
154 | =item * C<< >= >> |
155 | |
156 | =item * C<< <= >> |
157 | |
158 | =item * C<< IS >> |
159 | |
160 | =item * C<< IS NOT >> |
161 | |
d6e108eb |
162 | =back |
163 | |
164 | Note that an operator can comprise of what would be multiple tokens in a normal |
165 | parsing effort. |
166 | |
167 | =head3 Expression |
168 | |
169 | An expression can be any one of the following: |
170 | |
171 | =over 4 |
172 | |
173 | =item * Constant |
174 | |
175 | =item * Function |
176 | |
177 | =item * Subquery |
178 | |
179 | =item * UnaryOperator Expression |
180 | |
181 | =item * Expression BinaryOperator Expression |
182 | |
81cd86f1 |
183 | =item * ( Expression ) |
184 | |
d6e108eb |
185 | =back |
186 | |
81cd86f1 |
187 | Parentheses indicate precedence and, in some situations, are necessary for |
188 | certain operators. |
189 | |
d6e108eb |
190 | =head2 SQL clauses |
191 | |
192 | The expected clauses are (name and structure): |
193 | |
194 | =head3 select |
195 | |
81cd86f1 |
196 | This corresponds to the SELECT clause of a SELECT statement. |
197 | |
198 | A select clause is composed as follows: |
199 | |
200 | SelectComponent := Expression [ [ AS ] String ] |
201 | |
202 | SelectComponent |
203 | [ , SelectComponent ]* |
d6e108eb |
204 | |
205 | =head3 tables |
206 | |
207 | This is a list of tables that this clause is affecting. It corresponds to the |
81cd86f1 |
208 | FROM clause in a SELECT statement and the INSERT INTO/UPDATE/DELETE clauses in |
209 | those respective statements. Depending on the _query metadata entry, the |
210 | appropriate clause name will be used. |
d6e108eb |
211 | |
212 | The tables clause has several RDBMS-specific variations. The AST will support |
213 | all of them and it is up to the Visitor object constructing the actual SQL to |
214 | validate and/or use what is provided as appropriate. |
215 | |
216 | A table clause is composed as follows: |
217 | |
218 | TableIdentifier := Identifier [ [ AS ] String ] |
81cd86f1 |
219 | JoinType := < LEFT|RIGHT [ OUTER ] > | INNER | CROSS |
d6e108eb |
220 | |
221 | TableIdentifier |
222 | [ |
223 | < , TableIdentifier > |
224 | | < |
225 | [ JoinType ] JOIN TableIdentifier |
226 | [ |
227 | < USING ( Identifier [ , Identifier ] ) > |
228 | | < ON [ ( ] Expression [ , Expression ] [ ) ] > |
229 | ] |
230 | > |
231 | ]* |
232 | |
233 | Additionally, where aliases are provided for in the TableIdentifier, those |
234 | aliases must be used as the tablename in subsequent Identifiers that identify a |
235 | column of that table. |
236 | |
237 | =head3 where |
238 | |
81cd86f1 |
239 | This corresponds to the WHERE clause in a SELECT, UPDATE, or DELETE statement. |
240 | |
241 | A where clause is composed as follows: |
242 | |
243 | WhereOperator := AND | OR |
244 | WhereExpression := Expression | Expression WhereOperator Expression |
245 | |
246 | WhereExpression |
247 | |
d6e108eb |
248 | =head3 set |
249 | |
81cd86f1 |
250 | This corresponds to the SET clause in an INSERT or UPDATE statement. |
251 | |
252 | A set clause is composed as follows: |
253 | |
254 | SetComponent := Identifier = Expression |
255 | |
256 | SetComponent [ , SetComponent ]* |
257 | |
258 | =head3 columns |
259 | |
260 | This corresponds to the optional list of columns in an INSERT statement. |
261 | |
262 | A columns clause is composed as follows: |
263 | |
264 | ( Identifier [ , Identifier ]* ) |
265 | |
d6e108eb |
266 | =head3 values |
267 | |
81cd86f1 |
268 | This corresponds to the VALUES clause in an INSERT statement. |
269 | |
270 | A values clause is composed as follows: |
271 | |
272 | ( Expression [ , Expression ]* ) |
273 | |
274 | If there is a columns clause, the number of entries in the values clause must be |
275 | equal to the number of entries in the columns clause. |
276 | |
d6e108eb |
277 | =head3 orderby |
278 | |
81cd86f1 |
279 | This corresponds to the ORDER BY clause in a SELECT statement. |
280 | |
281 | An orderby clause is composed as follows: |
282 | |
283 | OrderByComponent := XXX |
284 | OrderByDirection := ASC | DESC |
285 | |
286 | OrderByComponent [ OrderByDirection ] |
287 | [ , OrderByComponent [ OrderByDirection ] ]* |
288 | |
d6e108eb |
289 | =head3 groupby |
290 | |
81cd86f1 |
291 | This corresponds to the GROUP BY clause in a SELECT statement. |
292 | |
293 | An groupby clause is composed as follows: |
294 | |
295 | GroupByComponent := XXX |
296 | |
297 | GroupByComponent [ , GroupByComponent ]* |
298 | |
d6e108eb |
299 | =head3 rows |
300 | |
81cd86f1 |
301 | This corresponds to the clause that is used in some RDBMS engines to limit the |
302 | number of rows returned by a query. In MySQL, this would be the LIMIT clause. |
303 | |
304 | A rows clause is composed as follows: |
305 | |
306 | Number [, Number ] |
307 | |
d6e108eb |
308 | =head3 for |
309 | |
81cd86f1 |
310 | This corresponds to the clause that is used in some RDBMS engines to indicate |
311 | what locks are to be taken by this SELECT statement. |
312 | |
313 | A for clause is composed as follows: |
314 | |
315 | UPDATE | DELETE |
316 | |
317 | =head3 connectby |
318 | |
319 | This corresponds to the clause that is used in some RDBMS engines to provide for |
320 | an adjacency-list query. |
321 | |
322 | A connectby clause is composed as follows: |
323 | |
324 | Identifier, WhereExpression |
325 | |
d6e108eb |
326 | =head3 |
327 | |
328 | =head1 AUTHORS |
329 | |
81cd86f1 |
330 | robkinyon: Rob Kinyon C<< <rkinyon@cpan.org> >> |
d6e108eb |
331 | |
332 | =head1 LICENSE |
333 | |
334 | You may distribute this code under the same terms as Perl itself. |
335 | |
336 | =cut |