=head1 NAME

SQL::Translator::Manual - sqlfairy user manual

=head1 SYNOPSIS

SQL::Translator (AKA "SQLFairy") is a collection of modules for
transforming (mainly) SQL DDL files into a variety of other formats,
including other SQL dialects, documentation, images, and code. In
this manual, we will attempt to address how to use SQLFairy for common
tasks. For a lower-level discussion of how the code works, please
read the documentation for L<SQL::Translator>.

It may prove helpful to have a general understanding of the SQLFairy
code before continuing. The code can be broken into three conceptual
groupings:

=over 4

=item * Parsers

The parsers are responsible for reading the input files and describing
them to the Schema object middleware.

=item * Producers

The producers create the output as described by the Schema middleware.

=item * Schema objects

The Schema objects bridge the communication between the Parsers and
Producers by representing any parsed file through a standard set of
generic objects to represent concepts like Tables, Fields (columns),
Indices, Constraints, etc.

=back

It's not necessary to understand how to write or manipulate any
of these for most common tasks, but you should be aware of the concepts
as they will be referenced later in this document.

=head1 SQLFAIRY SCRIPTS

Most common tasks can be accomplished through the use of the script
interfaces to the SQL::Translator code. All SQLFairy scripts begin
with "sqlt." Here are the scripts and a description of what they each
do:

=over 4

=item * sqlt

This is the main interface for text-to-text translations, e.g.,
converting a MySQL schema to Oracle.

=item * sqlt-diagram

This is a tailored interface for the Diagram producer and its myriad
options.

=item * sqlt-diff

This script will examine two schemas and report the SQL commands
(ALTER, CREATE) needed to turn the first schema into the second.

=item * sqlt-dumper

This script generates a Perl script that can be used to connect to a
database and dump the data in each table in different formats, similar
to the "mysqldump" program.

=item * sqlt-graph

This is an interface to the GraphViz visualization tool and its myriad
options.

=item * sqlt.cgi

This is a CGI script that presents an HTML form for uploading or
pasting a schema and choosing an output and the output options.

=back

To read the full documentation for each script, use "perldoc" (or
execute any of the command-line scripts with the "--help" flag).

=head1 CONVERTING SQL DIALECTS

Probably the most common task SQLFairy is used for is to convert one
dialect of SQL to another. If you have a text description of an SQL
database (AKA a "DDL" -- "Data Definition Language"), then you should
use the "sqlt" script with switches to indicate the parser and
producer and the name of the text file as the final argument. For
example, to convert the "foo.sql" MySQL schema to a version suitable
for PostgreSQL, you would do the following:

  $ sqlt -f MySQL -t PostgreSQL foo.sql > foo-pg.sql

The "from" and "to" options are case-sensitive and must match exactly
the names of the Parser and Producer classes in SQL::Translator. For
a complete listing of your options, execute "sqlt" with the "--list"
flag.

=head1 EXTRACT SQL SCHEMAS DIRECTLY FROM DATABASE

It is possible to extract some schemas directly from the database
without parsing a text file (the "foo.sql" in the above example).
This can prove significantly faster than parsing a text file. To
do this, use the "DBI" parser and provide the necessary arguments to
connect to the database and indicate the producer class, like so:

  $ sqlt -f DBI --dsn dbi:mysql:FOO --db-user guest \
    --db-password p4ssw0rd -t PostgreSQL > foo

The "--list" option to "sqlt" will show the databases supported by
DBI parsers.

=head1 HANDLING NON-SQL DATA

Certain structured document formats can be easily thought of as
tables. SQLFairy can parse Microsoft Excel spreadsheets and
arbitrarily delimited text files just as if they were schemas which
contained only one table definition. The column names are normalized
to something sane for most databases (whitespace is converted to
underscores and non-word characters are removed), and the data in each
field is scanned to determine the appropriate data type (character,
integer, or float) and size. For instance, to convert a
comma-separated file to an SQLite database, do the following:

  $ sqlt -f xSV --fs ',' -t SQLite foo.csv > foo-sqlite.sql

There is also a non-SQL representation of relational schemas, namely
XML. Currently, the only XML dialect supported is our own; however, it
would be fairly easy to add an XML parser for something like the TorqueDB
(http://db.apache.org/torque/) project. The actual parsing of XML should be
trivial given the number of XML parsers available, so all that would be left
would be to map the specific concepts in the source file to the Schema objects
in SQLFairy.

To convert a schema in SQLFairy's XML dialect to Oracle, do the following:

  $ sqlt -f XML-SQLFairy -t Oracle foo.xml > foo-oracle.sql

=head1 SERIALIZING SCHEMAS

Parsing a schema is generally the most computationally expensive
operation performed by SQLFairy, so it may behoove you to serialize a
parsed schema if you need to perform repeated conversions. For
example, as part of a build process the author converts a MySQL schema
first to YAML, then to PostgreSQL, Oracle, SQLite and Sybase.
Additionally, a variety of documentation in HTML and images is produced.
This can be accomplished like so:

  $ sqlt -f MySQL -t YAML schema-mysql.sql > schema.yaml
  $ sqlt -f YAML -t Oracle schema.yaml > schema-oracle.sql
  $ sqlt -f YAML -t PostgreSQL schema.yaml > schema-postgresql.sql
  $ ...

SQLFairy has three serialization producers, none of which is superior
to the others in its description of a schema.

=over 4

=item * XML-SQLFairy

This is the aforementioned XML format. It is essentially a direct
mapping of the Schema objects into XML. This can also provide a very
convenient bridge to describing a schema to a non-Perl application.
Providing a producer argument to "sqlt" of just "XML" will default to
using "XML-SQLFairy."

=item * Storable

This producer stores the Schema object using Perl's Storable.pm module
available on CPAN.

=item * YAML

This producer serializes the Schema object with the very readable
structured data format of YAML (http://www.yaml.org/). Earlier
examples show serializing to YAML.

=back

=head1 VISUALIZING SQL SCHEMAS

The visualization tools in SQLFairy can graphically represent the
tables, fields, datatypes and sizes, constraints, and foreign key
relationships in a very compact and intuitive format. This can be
very beneficial in understanding and documenting large or small schemas.
Two producers in SQLFairy will create pseudo-E/R (entity-relationship)
diagrams:

=over 4

=item * Diagram

The first visualization tool in SQLFairy, this producer uses libgd to
draw a picture of the schema. The tables are evenly distributed in
definition order running in columns (i.e., no graphing algorithms are
used), so many of the lines showing the foreign key relationships
may cross over each other and the table boxes. Please read the
documentation of the "sqlt-diagram" script for all the options
available to this producer.

=item * GraphViz

The layout of the GraphViz producer is far superior to the Diagram
producer as it uses the Graphviz binary from Bell Labs to create very
professional-looking graphs. There are several different layout
algorithms and node shapes available. Please see the documentation of
the "sqlt-graph" script for more information.

=back

=head1 AUTOMATED CODE-GENERATION

Given that so many applications interact with SQL databases, it's no
wonder that people have automated code to deal with this interaction.
Class::DBI from CPAN is one such module that allows a developer to
describe the relationships between tables and fields in class
declarations and then generates all the SQL to interact (SELECT,
UPDATE, DELETE, INSERT statements) at runtime. Obviously, the schema
already describes itself, so it only makes sense that you should be
able to generate this kind of code directly from the schema. The
"ClassDBI" producer in SQLFairy does just this, creating a Perl module
that inherits from Class::DBI and sets up most of the code needed to
interact with the database. Here is an example of how to do this:

  $ sqlt -f MySQL -t ClassDBI foo.sql > Foo.pm

Then simply edit Foo.pm as needed and include it in your code.

=head1 CREATING A DATA DUMPER SCRIPT

The Dumper producer creates a Perl script that can select the fields
in each table and then create "INSERT" statements for each record in
the database similar to the output generated by MySQL's "mysqldump"
program:

  $ sqlt -f YAML -t Dumper --dumper-db-user guest \
  >  --dumper-db-pass p4ssw0rd --dumper-dsn dbi:mysql:FOO \
  >  foo.yaml > foo-dumper.pl

And then execute the resulting script to dump the data:

  $ chmod +x foo-dumper.pl
  $ ./foo-dumper.pl > foo-data.sql

The dumper script also has a number of options available. Execute the
script with the "--help" flag to read about them.

=head1 DOCUMENTING WITH SQL::TRANSLATOR

SQLFairy offers two producers to help document schemas:

=over 4

=item * HTML

This producer creates a single HTML document which uses HTML
formatting to describe the Schema objects and to create hyperlinks on
foreign key relationships. This can be a surprisingly useful
documentation aid as it creates a very readable format that allows one
to jump easily to specific tables and fields. It's also possible to
plug in your own CSS to further control the presentation of the HTML.

=item * POD

cfdcb09e 270This is arguably not that useful of a producer by itself, but the
271number of POD-conversion tools could be used to further transform the
272POD into something more interesting. The schema is basically
273represented in POD sections where tables are broken down into fields,
274indices, constraints, foreign keys, etc.
2e2fc2b4 275
=back

=head1 TEMPLATE-BASED MANIPULATION OF SCHEMA OBJECTS

All of the producers which create text output could have been coded
using a templating system to mix in the dynamic output with static
text. CPAN offers several diverse templating systems, but few are as
powerful as Template Toolkit (http://www.template-toolkit.org/). You
can easily create your own producer without writing any Perl code at
all simply by writing a template using Template Toolkit's syntax. The
template will be passed a reference to the Schema object briefly
described at the beginning of this document and mentioned many times
throughout. For example, you could create a template that simply
prints the name of each table and field that looks like this:

  # file: schema.tt
  [% FOREACH table IN schema.get_tables %]
  Table: [% table.name %]
  Fields:
  [% FOREACH field IN table.get_fields -%]
    [% field.name %]
  [% END -%]
  [% END %]

And then process it like so:

  $ sqlt -f YAML -t TTSchema --template schema.tt foo.yaml

To create output like this:

  Table: foo
  Fields:
    foo_id
    foo_name

For more information on Template Toolkit, please install the
"Template" module and read the POD.

=head1 FINDING THE DIFFERENCES BETWEEN TWO SCHEMAS

As mentioned above, the "sqlt-diff" script examines two schemas and
creates SQL schema modification statements that can be used to
transform the first schema into the second. The flag syntax is
somewhat quirky:

  $ sqlt-diff foo-v1.sql=MySQL foo-v2.sql=Oracle > diff.sql

As demonstrated, the schemas need not even be from the same vendor,
though this is likely to produce some spurious results as
datatypes are not currently viewed as equivalent unless they match
exactly, even if they would be converted to the same type. For example,
MySQL's "integer" data type would be converted to Oracle's "number,"
but the differ isn't quite smart enough yet to figure this out. Also,
as the SQL to ALTER a field definition varies from database vendor to
vendor, these statements are made using just the keyword "CHANGE" and
will likely need to be corrected for the target database.

=head1 A UNIFIED GRAPHICAL INTERFACE

Seeing all the above options and scripts, you may be pining for a
single, graphical interface to handle all these transformations and
choices. This is exactly what the "sqlt.cgi" script provides. Simply
drop this script into your web server's CGI directory and enable the
execute bit and you can point your web browser to an HTML form which
provides a simple interface to all the SQLFairy parsers and producers.

=head1 PLUGIN YOUR OWN PARSERS AND PRODUCERS

Now that you have seen how the parsers and producers interact via the
Schema objects, you may wish to create your own versions to plug in.

Producers are probably the easier concept to grok, so let's cover that
first. By far the easiest way to create custom output is to use the
TTSchema producer in conjunction with a Template Toolkit template as
described earlier. However, you can also easily pass a reference to a
subroutine that SQL::Translator can call for the production of the
output. This subroutine will be passed a single argument of the
SQL::Translator object which you can use to access the Schema objects.
Please read the POD for SQL::Translator and SQL::Translator::Schema to
learn the methods you can call. Here is a very simple example:

  #!/usr/bin/perl

  use strict;
  use SQL::Translator;

  my $input = q[
      create table foo (
          foo_id int not null default '0' primary key,
          foo_name varchar(30) not null default ''
      );

      create table bar (
          bar_id int not null default '0' primary key,
          bar_value varchar(100) not null default ''
      );
  ];

  my $t = SQL::Translator->new;
  $t->parser('MySQL') or die $t->error;
  $t->producer( \&produce ) or die $t->error;
  my $output = $t->translate( \$input ) or die $t->error;
  print $output;

  sub produce {
      my $tr     = shift;
      my $schema = $tr->schema;
      my $output = '';
      for my $t ( $schema->get_tables ) {
          $output .= join('', "Table = ", $t->name, "\n");
      }
      return $output;
  }

Executing this script produces the following:

  $ ./my-producer.pl
  Table = foo
  Table = bar

A custom parser will be passed two arguments: the SQL::Translator
object and the data to be parsed. In this example, the schema will be
represented in a simple text format. Each line is a table definition
where the fields are separated by colons. The first field is the
table name and the following fields are column definitions where the
column name, data type and size are separated by spaces. The
specifics of the example are unimportant -- what is being demonstrated
is that you have to decide how to parse the incoming data and then
map the concepts in the data to the Schema object.

  #!/usr/bin/perl

  use strict;
  use SQL::Translator;

  my $input =
      "foo:foo_id int 11:foo_name varchar 30\n" .
      "bar:bar_id int 11:bar_value varchar 30"
  ;

  my $t = SQL::Translator->new;
  $t->parser( \&parser ) or die $t->error;
  $t->producer('Oracle') or die $t->error;
  my $output = $t->translate( \$input ) or die $t->error;
  print $output;

  sub parser {
      my ( $tr, $data ) = @_;
      my $schema = $tr->schema;

      for my $line ( split( /\n/, $data ) ) {
          my ( $table_name, @fields ) = split( /:/, $line );
          my $table = $schema->add_table( name => $table_name )
              or die $schema->error;
          for ( @fields ) {
              my ( $f_name, $type, $size ) = split;
              $table->add_field(
                  name      => $f_name,
                  data_type => $type,
                  size      => $size,
              ) or die $table->error;
          }
      }

      return 1;
  }

And here is the output produced by this script:

  --
  -- Created by SQL::Translator::Producer::Oracle
  -- Created on Wed Mar 31 15:43:30 2004
  --
  --
  -- Table: foo
  --

  CREATE TABLE foo (
    foo_id number(11),
    foo_name varchar2(30)
  );

  --
  -- Table: bar
  --

  CREATE TABLE bar (
    bar_id number(11),
    bar_value varchar2(30)
  );

If you create a useful parser or producer, you are encouraged to
submit your work to the SQLFairy project!

=head1 PLUGIN TEMPLATE TOOLKIT PRODUCERS

You may find that the TTSchema producer doesn't give you enough control over
templating and you want to play with the Template config or add your own
variables. Or maybe you just have a really good template you want to submit to
SQLFairy :) If so, the SQL::Translator::Producer::TT::Base producer may be
just for you! Instead of working like a normal producer, it provides a base
class so you can cheaply build new producer modules based on templates.

Its simplest use is when we just want to put a single template in its own
module. So to create a Foo producer we create a F<Custom/Foo.pm> file as
follows, putting our template in the __DATA__ section.

  package Custom::Foo;
  use base qw/SQL::Translator::Producer::TT::Base/;
  # Use our new class as the producer
  sub produce { return __PACKAGE__->new( translator => shift )->run; };

  __DATA__
  [% FOREACH table IN schema.get_tables %]
  Table: [% table.name %]
  Fields:
  [% FOREACH field IN table.get_fields -%]
    [% field.name %]
  [% END -%]
  [% END %]

For that we get a producer called Custom::Foo that we can now call like a
normal producer (as long as the directory with F<Custom/Foo.pm> is in our @INC
path):

  $ sqlt -f YAML -t Custom-Foo foo.yaml

The template gets variables of C<schema> and C<translator> to use in building
its output. You also get a number of methods you can override to hook into the
template generation.

B<tt_config> Allows you to set the config options used by the Template object.
The Template Toolkit provides a huge number of options which allow you to do all
sorts of magic (See L<Template::Manual::Config> for details). This method
provides a hook into them by returning a hash of options for the Template.
E.g., say you want to use the INTERPOLATE option to save some typing in your
template:

  sub tt_config { ( INTERPOLATE => 1 ); }

Another common use for this is to add your own filters to the template:

  sub tt_config {(
      INTERPOLATE => 1,
      FILTERS     => { foo_filter => \&foo_filter, }
  );}

Another common extension is adding your own template variables. This is done
with B<tt_vars>:

  sub tt_vars { ( foo => "bar" ); }

What about using template files instead of DATA sections? You can already - if
you give a template on the command line your new producer will use that instead
of reading the DATA section:

  $ sqlt -f YAML -t Custom-Foo --template foo.tt foo.yaml

This is useful as you can set up a producer that adds a set of filters and
variables that you can then use in templates given on the command line. (There
is also a tt_schema method to override if you need even finer control over the
source of your template.) Note that if you leave out the DATA section
altogether, your producer will require a template file name to be given.

466c88de 539See L<SQL::Translator::Producer::TT::Base> for more details.
08a6f829 540
2e2fc2b4 541=head1 AUTHOR
542
11ad2df9 543Ken Y. Clark E<lt>kclark@cpan.orgE<gt>.