=head1 NAME

SQL::Translator::Manual

=head1 SYNOPSIS

SQL::Translator (AKA "SQLFairy") is a collection of modules for
transforming (mainly) SQL DDL files into a variety of other formats,
including other SQL dialects, documentation, images, and code. In
this manual, we will attempt to address how to use SQLFairy for common
tasks. For a lower-level discussion of how the code works, please
read the documentation for L<SQL::Translator>.

It may prove helpful to have a general understanding of the SQLFairy
code before continuing. The code can be broken into three conceptual
groupings:

=over 4

=item * Parsers

The parsers are responsible for reading the input files and describing
them to the Schema object middleware.

=item * Producers

The producers create the output as described by the Schema middleware.

=item * Schema objects

The Schema objects bridge the communication between the Parsers and
Producers by representing any parsed file through a standard set of
generic objects that represent concepts like Tables, Fields (columns),
Indices, Constraints, etc.

=back

It's not necessary to understand how to write or manipulate any
of these for most common tasks, but you should be aware of the concepts
as they will be referenced later in this document.

=head1 SQLFAIRY SCRIPTS

Most common tasks can be accomplished through the use of the script
interfaces to the SQL::Translator code. All SQLFairy scripts begin
with "sqlt." Here are the scripts and a description of what each
does:

=over 4

=item * sqlt

This is the main interface for text-to-text translations, e.g.,
converting a MySQL schema to Oracle.

=item * sqlt-diagram

This is a tailored interface for the Diagram producer and its myriad
options.

=item * sqlt-diff

This script will examine two schemas and report the SQL commands
(ALTER, CREATE) needed to turn the first schema into the second.

=item * sqlt-dumper

This script generates a Perl script that can be used to connect to a
database and dump the data in each table in different formats, similar
to the "mysqldump" program.

=item * sqlt-graph

This is an interface to the GraphViz visualization tool and its myriad
options.

=item * sqlt.cgi

This is a CGI script that presents an HTML form for uploading or
pasting a schema and choosing an output format and the output options.

=back

To read the full documentation for each script, use "perldoc" (or
execute any of the command-line scripts with the "--help" flag).

=head1 CONVERTING SQL DIALECTS

Probably the most common task SQLFairy is used for is to convert one
dialect of SQL to another. If you have a text description of an SQL
database (AKA a "DDL" -- "Data Definition Language"), then you should
use the "sqlt" script with switches to indicate the parser and
producer and the name of the text file as the final argument. For
example, to convert the "foo.sql" MySQL schema to a version suitable
for PostgreSQL, you would do the following:

  $ sqlt -f MySQL -t PostgreSQL foo.sql > foo-pg.sql

The "from" and "to" options are case-sensitive and must match exactly
the names of the Parser and Producer classes in SQL::Translator. For
a complete listing of your options, execute "sqlt" with the "--list"
flag.

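The same conversion can also be driven from Perl through the
SQL::Translator API rather than the "sqlt" script. This is a minimal
sketch patterned on the producer example later in this manual; the DDL
is inlined here so the snippet stands alone.

```perl
#!/usr/bin/perl
# Sketch: the MySQL-to-PostgreSQL conversion above, via the API.
use strict;
use warnings;
use SQL::Translator;

my $ddl = q[
    create table foo (
        foo_id int not null primary key
    );
];

my $t = SQL::Translator->new;
$t->parser('MySQL')        or die $t->error;
$t->producer('PostgreSQL') or die $t->error;

# translate() accepts a reference to a string of DDL, or a filename.
my $output = $t->translate( \$ddl ) or die $t->error;
print $output;
```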
=head1 EXTRACT SQL SCHEMAS DIRECTLY FROM DATABASE

It is possible to extract some schemas directly from the database
without parsing a text file (the "foo.sql" in the above example).
This can prove significantly faster than parsing a text file. To
do this, use the "DBI" parser and provide the necessary arguments to
connect to the database and indicate the producer class, like so:

  $ sqlt -f DBI --dsn dbi:mysql:FOO --db-user guest \
    --db-password p4ssw0rd -t PostgreSQL > foo

The "--list" option to "sqlt" will show the databases supported by
DBI parsers.

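The same extraction can be scripted through the API. The
parser-argument names below mirror the script's flags but are an
assumption; check them against the SQL::Translator::Parser::DBI
documentation, and note that the DSN and credentials are placeholders.

```perl
#!/usr/bin/perl
# Sketch: read a live schema via the DBI parser, emit PostgreSQL DDL.
# The dsn/db_user/db_password argument names mirror the sqlt flags;
# verify them against SQL::Translator::Parser::DBI before relying on this.
use strict;
use warnings;
use SQL::Translator;

my $t = SQL::Translator->new(
    parser      => 'DBI',
    parser_args => {
        dsn         => 'dbi:mysql:FOO',    # placeholder DSN
        db_user     => 'guest',            # placeholder credentials
        db_password => 'p4ssw0rd',
    },
    producer    => 'PostgreSQL',
);

my $output = $t->translate or die $t->error;
print $output;
```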
=head1 HANDLING NON-SQL DATA

Certain structured document formats can be easily thought of as
tables. SQLFairy can parse Microsoft Excel spreadsheets and
arbitrarily delimited text files just as if they were schemas which
contained only one table definition. The column names are normalized
to something sane for most databases (whitespace is converted to
underscores and non-word characters are removed), and the data in each
field is scanned to determine the appropriate data type (character,
integer, or float) and size. For instance, to convert a
comma-separated file to an SQLite database, do the following:

  $ sqlt -f xSV --fs ',' -t SQLite foo.csv > foo-sqlite.sql

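The normalization rule just described can be restated in a few lines
of Perl. This is a hypothetical illustration of the rule, not
SQLFairy's actual implementation:

```perl
#!/usr/bin/perl
# Hypothetical restatement of the column-name normalization described
# above: whitespace becomes underscores, other non-word characters go.
use strict;
use warnings;

sub normalize_column_name {
    my $name = shift;
    $name =~ s/\s+/_/g;    # whitespace -> underscores
    $name =~ s/\W//g;      # drop remaining non-word characters
    return $name;
}

print normalize_column_name('Order Date (UTC)'), "\n";   # Order_Date_UTC
```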
Additionally, there are non-SQL representations of relational schemas
such as XML and XMI. Currently the XMI support in SQLFairy is
experimental and not released. Also, the only XML supported
is our own version; however, it would be fairly easy to add an XML
parser for something like the TorqueDB (http://db.apache.org/torque/)
project. The actual parsing of XML should be trivial given the number
of XML parsers available, so all that would be left would be to map
the specific concepts in the source file to the Schema objects in
SQLFairy.

To convert a schema in SQLFairy's XML dialect to Oracle, do the following:

  $ sqlt -f XML-SQLFairy -t Oracle foo.xml > foo-oracle.sql

=head1 SERIALIZING SCHEMAS

Parsing a schema is generally the most computationally expensive
operation performed by SQLFairy, so it may behoove you to serialize a
parsed schema if you need to perform repeated conversions. For
example, as part of a build process the author converts a MySQL schema
first to YAML, then to PostgreSQL, Oracle, SQLite and Sybase.
Additionally, a variety of documentation in HTML and images is produced.
This can be accomplished like so:

  $ sqlt -f MySQL -t YAML schema-mysql.sql > schema.yaml
  $ sqlt -f YAML -t Oracle schema.yaml > schema-oracle.sql
  $ sqlt -f YAML -t PostgreSQL schema.yaml > schema-postgresql.sql
  $ ...

SQLFairy has three serialization producers, none of which is superior
to the others in its description of a schema.

=over 4

=item * XML-SQLFairy

This is the aforementioned XML format. It is essentially a direct
mapping of the Schema objects into XML. This can also provide a very
convenient bridge to describing a schema to a non-Perl application.
Providing a producer argument to "sqlt" of just "XML" will default to
using "XML-SQLFairy."

=item * Storable

This producer stores the Schema object using Perl's Storable.pm module
available on CPAN.

=item * YAML

This producer serializes the Schema object with the very readable
structured data format of YAML (http://www.yaml.org/). Earlier
examples show serializing to YAML.

=back
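
The Storable producer above leans on Perl's core Storable module. As a
generic illustration of the round trip (plain Perl, not SQLFairy code),
freezing a schema-like structure to disk and thawing it back looks
like this:

```perl
#!/usr/bin/perl
# Freeze a nested data structure to a temporary file and thaw it back --
# the same store/retrieve mechanism the Storable producer relies on.
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempfile);

my ( $fh, $file ) = tempfile( UNLINK => 1 );
close $fh;

my $schema_like = {
    tables => [ { name => 'foo' }, { name => 'bar' } ],
};

store( $schema_like, $file );
my $copy = retrieve($file);

print join( ', ', map { $_->{name} } @{ $copy->{tables} } ), "\n";   # foo, bar
```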

=head1 VISUALIZING SQL SCHEMAS

The visualization tools in SQLFairy can graphically represent the
tables, fields, datatypes and sizes, constraints, and foreign key
relationships in a very compact and intuitive format. This can be
very beneficial in understanding and documenting large or small schemas.
Two producers in SQLFairy will create pseudo-E/R (entity-relationship)
diagrams:

=over 4

=item * Diagram

The first visualization tool in SQLFairy, this producer uses libgd to
draw a picture of the schema. The tables are evenly distributed in
definition order running in columns (i.e., no graphing algorithms are
used), so many of the lines showing the foreign key relationships
may cross over each other and the table boxes. Please read the
documentation of the "sqlt-diagram" script for all the options
available to this producer.

=item * GraphViz

The layout of the GraphViz producer is far superior to the Diagram
producer as it uses the Graphviz binary from Bell Labs to create very
professional-looking graphs. There are several different layout
algorithms and node shapes available. Please see the documentation of
the "sqlt-graph" script for more information.

=back

=head1 AUTOMATED CODE-GENERATION

Given that so many applications interact with SQL databases, it's no
wonder that people have automated code to deal with this interaction.
Class::DBI from CPAN is one such module that allows a developer to
describe the relationships between tables and fields in class
declarations and then generates all the SQL to interact (SELECT,
UPDATE, DELETE, INSERT statements) at runtime. Obviously, the schema
already describes itself, so it only makes sense that you should be
able to generate this kind of code directly from the schema. The
"ClassDBI" producer in SQLFairy does just this, creating a Perl module
that inherits from Class::DBI and sets up most of the code needed to
interact with the database. Here is an example of how to do this:

  $ sqlt -f MySQL -t ClassDBI foo.sql > Foo.pm

Then simply edit Foo.pm as needed and include it in your code.

=head1 CREATING A DATA DUMPER SCRIPT

The Dumper producer creates a Perl script that can select the fields
in each table and then create "INSERT" statements for each record in
the database similar to the output generated by MySQL's "mysqldump"
program:

  $ sqlt -f YAML -t Dumper --dumper-db-user guest \
  > --dumper-db-pass p4ssw0rd --dumper-dsn dbi:mysql:FOO \
  > foo.yaml > foo-dumper.pl

And then execute the resulting script to dump the data:

  $ chmod +x foo-dumper.pl
  $ ./foo-dumper.pl > foo-data.sql

The dumper script also has a number of options available. Execute the
script with the "--help" flag to read about them.

=head1 DOCUMENTING WITH SQL::TRANSLATOR

SQLFairy offers two producers to help document schemas:

=over 4

=item * HTML

This producer creates a single HTML document which uses HTML
formatting to describe the Schema objects and to create hyperlinks on
foreign key relationships. This can be a surprisingly useful
documentation aid as it creates a very readable format that allows one
to jump easily to specific tables and fields. It's also possible to
plug in your own CSS to further control the presentation of the HTML.

=item * POD

This is arguably not that useful a producer by itself, but any of the
many POD-conversion tools could be used to further transform the
POD into something more interesting. The schema is basically
represented in POD sections where tables are broken down into fields,
indices, constraints, foreign keys, etc.

=back

=head1 TEMPLATE-BASED MANIPULATION OF SCHEMA OBJECTS

All of the producers which create text output could have been coded
using a templating system to mix the dynamic output in with static
text. CPAN offers several diverse templating systems, but few are as
powerful as Template Toolkit (http://www.template-toolkit.org/). You
can easily create your own producer without writing any Perl code at
all simply by writing a template using Template Toolkit's syntax. The
template will be passed a reference to the Schema object briefly
described at the beginning of this document and mentioned many times
throughout. For example, you could create a template that simply
prints the name of each table and its fields, like this:

  # file: schema.tt
  [% FOREACH table IN schema.get_tables %]
  Table: [% table.name %]
  Fields:
  [% FOREACH field IN table.get_fields -%]
      [% field.name %]
  [% END -%]
  [% END %]

And then process it like so:

  $ sqlt -f YAML -t TTSchema --template schema.tt foo.yaml

To create output like this:

  Table: foo
  Fields:
      foo_id
      foo_name

For more information on Template Toolkit, please install the
"Template" module and read the POD.

=head1 FINDING THE DIFFERENCES BETWEEN TWO SCHEMAS

As mentioned above, the "sqlt-diff" script examines two schemas and
creates SQL schema modification statements that can be used to
transform the first schema into the second. The flag syntax is
somewhat quirky:

  $ sqlt-diff foo-v1.sql=MySQL foo-v2.sql=Oracle > diff.sql

As demonstrated, the schemas need not even be from the same vendor,
though this is likely to produce some spurious results as
datatypes are not currently viewed as equivalent unless they match
exactly, even if they would be converted to the same type. For example,
MySQL's "integer" data type would be converted to Oracle's "number,"
but the differ isn't quite smart enough yet to figure this out. Also,
as the SQL to ALTER a field definition varies from database vendor to
vendor, these statements are made using just the keyword "CHANGE" and
will likely need to be corrected for the target database.

=head1 A UNIFIED GRAPHICAL INTERFACE

Seeing all the above options and scripts, you may be pining for a
single, graphical interface to handle all these transformations and
choices. This is exactly what the "sqlt.cgi" script provides. Simply
drop this script into your web server's CGI directory and enable the
execute bit and you can point your web browser to an HTML form which
provides a simple interface to all the SQLFairy parsers and producers.

=head1 PLUGIN YOUR OWN PARSERS AND PRODUCERS

Now that you have seen how the parsers and producers interact via the
Schema objects, you may wish to create your own versions to plug in.

Producers are probably the easier concept to grok, so let's cover that
first. By far the easiest way to create custom output is to use the
TTSchema producer in conjunction with a Template Toolkit template as
described earlier. However, you can also easily pass a reference to a
subroutine that SQL::Translator can call for the production of the
output. This subroutine will be passed a single argument of the
SQL::Translator object, which you can use to access the Schema objects.
Please read the POD for SQL::Translator and SQL::Translator::Schema to
learn the methods you can call. Here is a very simple example:

  #!/usr/bin/perl

  use strict;
  use SQL::Translator;

  my $input = q[
      create table foo (
          foo_id int not null default '0' primary key,
          foo_name varchar(30) not null default ''
      );

      create table bar (
          bar_id int not null default '0' primary key,
          bar_value varchar(100) not null default ''
      );
  ];

  my $t = SQL::Translator->new;
  $t->parser('MySQL') or die $t->error;
  $t->producer( \&produce ) or die $t->error;
  my $output = $t->translate( \$input ) or die $t->error;
  print $output;

  sub produce {
      my $tr     = shift;
      my $schema = $tr->schema;
      my $output = '';
      for my $t ( $schema->get_tables ) {
          $output .= join('', "Table = ", $t->name, "\n");
      }
      return $output;
  }

Executing this script produces the following:

  $ ./my-producer.pl
  Table = foo
  Table = bar

A custom parser will be passed two arguments: the SQL::Translator
object and the data to be parsed. In this example, the schema will be
represented in a simple text format. Each line is a table definition
where the fields are separated by colons. The first field is the
table name and the following fields are column definitions where the
column name, data type and size are separated by spaces. The
specifics of the example are unimportant -- what is being demonstrated
is that you have to decide how to parse the incoming data and then
map the concepts in the data to the Schema object.

  #!/usr/bin/perl

  use strict;
  use SQL::Translator;

  my $input =
      "foo:foo_id int 11:foo_name varchar 30\n" .
      "bar:bar_id int 11:bar_value varchar 30"
  ;

  my $t = SQL::Translator->new;
  $t->parser( \&parser ) or die $t->error;
  $t->producer('Oracle') or die $t->error;
  my $output = $t->translate( \$input ) or die $t->error;
  print $output;

  sub parser {
      my ( $tr, $data ) = @_;
      my $schema = $tr->schema;

      for my $line ( split( /\n/, $data ) ) {
          my ( $table_name, @fields ) = split( /:/, $line );
          my $table = $schema->add_table( name => $table_name )
              or die $schema->error;
          for ( @fields ) {
              my ( $f_name, $type, $size ) = split;
              $table->add_field(
                  name      => $f_name,
                  data_type => $type,
                  size      => $size,
              ) or die $table->error;
          }
      }

      return 1;
  }

And here is the output produced by this script:

  --
  -- Created by SQL::Translator::Producer::Oracle
  -- Created on Wed Mar 31 15:43:30 2004
  --
  --
  -- Table: foo
  --

  CREATE TABLE foo (
    foo_id number(11),
    foo_name varchar2(30)
  );

  --
  -- Table: bar
  --

  CREATE TABLE bar (
    bar_id number(11),
    bar_value varchar2(30)
  );

If you create a useful parser or producer, you are encouraged to
submit your work to the SQLFairy project!

=head1 PLUGIN TEMPLATE TOOLKIT PRODUCERS

You may find that the TTSchema producer doesn't give you enough control over
templating and you want to play with the Template config or add your own
variables. Or maybe you just have a really good template you want to submit to
SQLFairy :) If so, the SQL::Translator::Producer::TT::Base producer may be
just for you! Instead of working like a normal producer, it provides a base
class so you can cheaply build new producer modules based on templates.

Its simplest use is when we just want to put a single template in its own
module. So to create a Foo producer we create a F<Custom/Foo.pm> file as
follows, putting our template in the __DATA__ section.

  package Custom::Foo;
  use base qw/SQL::Translator::Producer::TT::Base/;
  # Use our new class as the producer
  sub produce { return __PACKAGE__->new( translator => shift )->run; };

  __DATA__
  [% FOREACH table IN schema.get_tables %]
  Table: [% table.name %]
  Fields:
  [% FOREACH field IN table.get_fields -%]
      [% field.name %]
  [% END -%]
  [% END %]

With that we get a producer called Custom::Foo that we can now call like a
normal producer (as long as the directory with F<Custom/Foo.pm> is in our @INC
path):

  $ sqlt -f YAML -t Custom-Foo foo.yaml

The template gets variables of C<schema> and C<translator> to use in building
its output. You also get a number of methods you can override to hook into the
template generation.

B<tt_config> Allows you to set the config options used by the Template object.
The Template Toolkit provides a huge number of options which allow you to do all
sorts of magic (See L<Template::Manual::Config> for details). This method
provides a hook into them by returning a hash of options for the Template. e.g.
Say you want to use the INTERPOLATE option to save some typing in your template:

  sub tt_config { ( INTERPOLATE => 1 ); }

Another common use for this is to add your own filters to the template:

  sub tt_config {(
      INTERPOLATE => 1,
      FILTERS     => { foo_filter => \&foo_filter, }
  );}

Another common extension is adding your own template variables. This is done
with B<tt_vars>:

  sub tt_vars { ( foo => "bar" ); }

What about using template files instead of DATA sections? You can already -- if
you give a template on the command line, your new producer will use that instead
of reading the DATA section:

  $ sqlt -f YAML -t Custom-Foo --template foo.tt foo.yaml

This is useful as you can set up a producer that adds a set of filters and
variables that you can then use in templates given on the command line. (There
is also a tt_schema method to override if you need even finer control over the
source of your template.) Note that if you leave out the DATA section
altogether, then your producer will require a template file name to be given.

See L<SQL::Translator::Producer::TT::Base> for more details.

=head1 AUTHOR

Ken Y. Clark E<lt>kclark@cpan.orgE<gt>.