lib/SQL/Translator/Manual.pod

   1 =head1 NAME
   2
   3 SQL::Translator::Manual
   4
   5 =head1 SYNOPSIS
   6
   7 SQL::Translator (AKA "SQLFairy") is a collection of modules for
   8 transforming (mainly) SQL DDL files into a variety of other formats,
   9 including other SQL dialects, documentation, images, and code.  In
  10 this manual, we will attempt to address how to use SQLFairy for common
  11 tasks.  For a lower-level discussion of how the code works, please
  12 read the documentation for L<SQL::Translator>.
  13
  14 It may prove helpful to have a general understanding of the SQLFairy
  15 code before continuing.  The code can be broken into three conceptual
  16 groupings:
  17
  18 =over 4
  19
  20 =item * Parsers
  21
  22 The parsers are responsible for reading the input files and describing
  23 them to the Schema object middleware.
  24
  25 =item * Producers
  26
  27 The producers create the output as described by the Schema middleware.
  28
  29 =item * Schema objects
  30
  31 The Schema objects bridge the communication between the Parsers and
  32 Producers by representing any parsed file through a standard set of
  33 generic objects to represent concepts like Tables, Fields (columns),
  34 Indices, Constraints, etc.
  35
  36 =back
  37
  38 It's not necessary to understand how to write or manipulate any
  39 of these for most common tasks, but you should aware of the concepts
  40 as they will be referenced later in this document.
  41
  42 =head1 SQLFAIRY SCRIPTS
  43
  44 Most common tasks can be accomplished through the use of the script
  45 interfaces to the SQL::Translator code.  All SQLFairy scripts begin
  46 with "sqlt."  Here are the scripts and a description of what they each
  47 do:
  48
  49 =over 4
  50
  51 =item * sqlt
  52
  53 This is the main interface for text-to-text translations, e.g.,
  54 converting a MySQL schema to Oracle.
  55
  56 =item * sqlt-diagram
  57
  58 This is a tailored interface for the Diagram producer and its many
  59 myriad options.
  60
  61 =item * sqlt-diff
  62
  63 This script will examine two schemas and report the SQL commands
  64 (ALTER, CREATE) needed to turn the first schema into the second.
  65
  66 =item * sqlt-dumper
  67
  68 This script generates a Perl script that can be used to connect to a
  69 database and dump the data in each table in different formats, similar
  70 to the "mysqldump" program.
  71
  72 =item * sqlt-graph
  73
  74 This is an interface to the GraphViz visualization tool and its myriad
  75 options.
  76
  77 =item * sqlt.cgi
  78
  79 This is a CGI script that presents an HTML form for uploading or
  80 pasting a schema and choosing an output and the output options.
  81
  82 =back
  83
  84 To read the full documentation for each script, use "perldoc" (or
  85 execute any of the command-line scripts with the "--help" flag).
  86
  87 =head1 CONVERTING SQL DIALECTS
  88
  89 Probably the most common task SQLFairy is used for is to convert one
  90 dialect of SQL to another.  If you have a text description of an SQL
  91 database (AKA a "DDL" -- "Data Definition Language"), then you should
  92 use the "sqlt" script with switches to indicate the parser and
  93 producer and the name of the text file as the final argument.  For
  94 example, to convert the "foo.sql" MySQL schema to a version suitable
  95 for PostgreSQL, you would do the following:
  96
  97   $ sqlt -f MySQL -t PostgreSQL foo.sql > foo-pg.sql
  98
  99 The "from" and "to" options are case-sensitive and must match exactly
 100 the names of the Parser and Producer classes in SQL::Translator.  For
 101 a complete listing of your options, execute "sqlt" with the "--list"
 102 flag.
 103
 104 =head1 EXTRACT SQL SCHEMAS DIRECTLY FROM DATABASE
 105
 106 It is possible to extract some schemas directly from the database
 107 without parsing a text file (the "foo.sql" in the above example).
 108 This can prove significantly faster than parsing a text file.  To
 109 do this, use the "DBI" parser and provide the necessary arguments to
 110 connect to the database and indicate the producer class, like so:
 111
 112   $ sqlt -f DBI --dsn dbi:mysql:FOO --db-user guest \
 113     --db-password p4ssw0rd -t PostgreSQL > foo
 114
 115 The "--list" option to "sqlt" will show the databases supported by
 116 DBI parsers.
 117
 118 =head1 HANDLING NON-SQL DATA
 119
 120 Certain structured document formats can be easily thought of as
 121 tables.  SQLFairy can parse Microsoft Excel spreadsheets and
 122 arbitrarily delimited text files just as if they were schemas which
 123 contained only one table definition.  The column names are normalized
 124 to something sane for most databases (whitespace is converted to
 125 underscores and non-word characters are removed), and the data in each
 126 field is scanned to determine the appropriate data type (character,
 127 integer, or float) and size.  For instance, to convert a
 128 comma-separated file to an SQLite database, do the following:
 129
 130   $ sqlt -f xSV --fs ',' -t SQLite foo.csv > foo-sqlite.sql
 131
 132 Additionally, there are non-SQL represenations of relational schemas
 133 such as XML and XMI.  Currently the XMI support in SQLFairy is
 134 experimental and not released.  Additionally, the only XML supported
 135 is our own version;  however, it would be fairly easy to add an XML
 136 parser for something like the TorqueDB (http://db.apache.org/torque/)
 137 project.  The actual parsing of XML should be trivial given the number
 138 of XML parsers available, so all that would be left would be to map
 139 the specific concepts in the source file to the Schema objects in
 140 SQLFairy.
 141
 142 To convert a schema in SQLFairy's XML dialect to Oracle, do the following:
 143
 144   $ sqlt -f XML-SQLFairy -t Oracle foo.xml > foo-oracle.sql
 145
 146 =head1 SERIALIZING SCHEMAS
 147
 148 via XML, YAML and Storable
 149 Parsing a schema is generally the most computationally expensive
 150 operation performed by SQLFairy, so it may behoove you to serialize a
 151 parsed schema if you need to perform repeated conversions.  For
 152 example, as part of a build process the author converts a MySQL schema
 153 first to YAML, then to PostgreSQL, Oracle, SQLite and Sybase.
 154 Additionally, a variety of documention in HTML and images is produced.
 155 This can be accomplished like so:
 156
 157   $ sqlt -f MySQL -t YAML schema-mysql.sql > schema.yaml
 158   $ sqlt -f YAML -t Oracle schema.yaml > schema-oracle.sql
 159   $ sqlt -f YAML -t PostgreSQL schema.yaml > schema-postgresql.sql
 160   $ ...
 161
 162 SQLFairy has three serialization producers, none of which is superior
 163 to the other in their description of a schema.
 164
 165 =over 4
 166
 167 =item * XML-SQLFairy
 168
 169 This is the aforementioned XML format.  It is essentially a direct
 170 mapping of the Schema objects into XML.  This can also provide a very
 171 convenient bridge to describing a schema to a non-Perl application.
 172 Providing a producer argument to "sqlt" of just "XML" will default to
 173 using "XML-SQLFairy."
 174
 175 =item * Storable
 176
 177 This producer stores the Schema object using Perl's Storable.pm module
 178 available on CPAN.
 179
 180 =item * YAML
 181
 182 This producer serialized the Schema object with the very readable
 183 structured data format of YAML (http://www.yaml.org/).  Earlier
 184 examples show serializing to YAML.
 185
 186 =back
 187
 188 =head1 VISUALIZING SQL SCHEMAS
 189
 190 The visualization tools in SQLFairy can graphically represent the
 191 tables, fields, datatypes and sizes, constraints, and foreign key
 192 relationships in a very compact and intuitive format.  This can be
 193 very beneficial in understanding and document large or small schemas.
 194 Two producers in SQLFairy will create pseudo-E/R (entity-relationship)
 195 diagrams:
 196
 197 =over 4
 198
 199 =item * Diagram
 200
 201 The first visualization tool in SQLFairy, this producer uses libgd to
 202 draw a picture of the schema.  The tables are evenly distributed in
 203 definition order running in columns (i.e., no graphing algorithms are
 204 used), so the end result may result in many crossed lines showing the
 205 foreign key relationships.  Please read the documentation of the
 206 "sqlt-diagram" script for all the options available to this producer.
 207
 208 =item * GraphViz
 209
 210 The layout of the GraphViz producer is far superior to the Diagram
 211 producer as it uses the Graphviz binary from Bell Labs to create very
 212 professional-looking graphs.  There are several different layout
 213 algorithms and node shapes available.  Please see the documentation of
 214 the "sqlt-graph" script for more information.
 215
 216 =back
 217
 218 =head1 AUTOMATED CODE-GENERATION
 219
 220 Given that so many applications interact with SQL databases, it's no
 221 wonder that people have automated code to deal with this interaction.
 222 Class::DBI from CPAN is one such module that allows a developer to
 223 describe the relationships between tables and fields in class
 224 declarations and then generates all the SQL to interact (SELECT,
 225 UPDATE, DELETE, INSERT statements) at runtime.  Obviously, the schema
 226 already describes itself, so it only makes sense that you should be
 227 able to generate this kind of code directly from the schema.  The
 228 "ClassDBI" producer in SQLFairy does just this, creating a Perl module
 229 that inherits from Class::DBI and sets up most of the code needed to
 230 interact with the database.  Here is an example of how to do this:
 231
 232   $ sqlt -f MySQL -t ClassDBI foo.sql > Foo.pm
 233
 234 Then simply edit Foo.pm as needed and include it in your code.
 235
 236 =head1 DOCUMENTING WITH SQL::TRANSLATOR
 237
 238 SQLFairy offers two producers to help document schemas:
 239
 240 =over 4
 241
 242 =item * HTML
 243
 244 This producer creates a single HTML document which uses HTML
 245 formatting to describe the Schema objects and to create hyperlinks on
 246 foreign key relationships.  This can be a surprisingly useful
 247 documentation aid as it creates a very readable format that allows one
 248 to jump easily to specific tables and fields.  It's also possible to
 249 plugin your own CSS to further control the presentation of the HTML.
 250
 251 =item * POD
 252
 253 This is arguably not that useful of a producer by itself, but the
 254 number of POD-conversion tools could be used to further transform the
 255 POD into something more interesting.  The schema is basically
 256 represented in POD sections where tables are broken down into fields,
 257 indices, constraints, foreign keys, etc.
 258
 259 =back
 260
 261 =head1 TEMPLATE-BASED MANIPULATION OF SCHEMA OBJECTS
 262
 263 All of the producers which create text output could have been coded
 264 using a templating system to mix in the dynamic output with static
 265 text.  CPAN offers several diverse templating systems, but few are as
 266 powerful as Template Toolkit (http://www.template-toolkit.org/).  You
 267 can easily create your own producer without writing any Perl code at
 268 all simply by writing a template using Template Toolkit's syntax.  The
 269 template will be passed a reference to the Schema object briefly
 270 described at the beginning of this document and mentioned many times
 271 throughout.  For example, you could create a template that simply
 272 prints the name of each table and field that looks like this:
 273
 274   # file: schema.tt
 275   [% FOREACH table IN schema.get_tables %]
 276   Table: [% table.name %]
 277   Fields:
 278   [% FOREACH field IN table.get_fields -%]
 279     [% field.name %]
 280   [% END -%]
 281   [% END %]
 282
 283 And the process it like so:
 284
 285   $ sqlt -f YAML -t TTSchema --template schema.tt foo.yaml
 286
 287 To create output like this:
 288
 289   Table: foo
 290   Fields:
 291     foo_id
 292     foo_name
 293
 294 For more information on Template Toolkit, please install the
 295 "Template" module and read the POD.
 296
 297 =head1 PLUGIN YOUR OWN PARSERS AND PRODUCERS
 298
 299 Now that you have seen how the parsers and producers interact via the
 300 Schema objects, you may wish to create your own versions to plugin.
 301
 302 =head1 AUTHOR
 303
 304 Ken Y. Clark E<lt>kclark@cpan.orgE<gt>.