Commit | Line | Data |
2e2fc2b4 |
1 | =head1 NAME |
2 | |
3 | SQL::Translator::Manual |
4 | |
5 | =head1 SYNOPSIS |
6 | |
cfdcb09e |
7 | SQL::Translator (AKA "SQLFairy") is a collection of modules for |
8 | transforming (mainly) SQL DDL files into a variety of other formats, |
9 | including other SQL dialects, documentation, images, and code. In |
10 | this manual, we will attempt to address how to use SQLFairy for common |
11 | tasks. For a lower-level discussion of how the code works, please |
12 | read the documentation for L<SQL::Translator>. |
2e2fc2b4 |
13 | |
cfdcb09e |
14 | It may prove helpful to have a general understanding of the SQLFairy |
15 | code before continuing. The code can be broken into three conceptual |
16 | groupings: |
2e2fc2b4 |
17 | |
cfdcb09e |
18 | =over 4 |
2e2fc2b4 |
19 | |
cfdcb09e |
20 | =item * Parsers |
2e2fc2b4 |
21 | |
cfdcb09e |
22 | The parsers are responsible for reading the input files and describing |
23 | them to the Schema object middleware. |
2e2fc2b4 |
24 | |
cfdcb09e |
25 | =item * Producers |
2e2fc2b4 |
26 | |
cfdcb09e |
27 | The producers create the output as described by the Schema middleware. |
2e2fc2b4 |
28 | |
cfdcb09e |
29 | =item * Schema objects |
30 | |
31 | The Schema objects bridge the communication between the Parsers and |
32 | Producers by representing any parsed file through a standard set of |
33 | generic objects to represent concepts like Tables, Fields (columns), |
34 | Indices, Constraints, etc. |
35 | |
36 | =back |
37 | |
38 | It's not necessary to understand how to write or manipulate any |
39 | of these for most common tasks, but you should aware of the concepts |
40 | as they will be referenced later in this document. |
41 | |
42 | =head1 SQLFAIRY SCRIPTS |
43 | |
44 | Most common tasks can be accomplished through the use of the script |
45 | interfaces to the SQL::Translator code. All SQLFairy scripts begin |
46 | with "sqlt." Here are the scripts and a description of what they each |
47 | do: |
48 | |
49 | =over 4 |
50 | |
51 | =item * sqlt |
52 | |
53 | This is the main interface for text-to-text translations, e.g., |
54 | converting a MySQL schema to Oracle. |
55 | |
56 | =item * sqlt-diagram |
57 | |
58 | This is a tailored interface for the Diagram producer and its many |
59 | myriad options. |
60 | |
61 | =item * sqlt-diff |
62 | |
63 | This script will examine two schemas and report the SQL commands |
64 | (ALTER, CREATE) needed to turn the first schema into the second. |
65 | |
66 | =item * sqlt-dumper |
67 | |
68 | This script generates a Perl script that can be used to connect to a |
69 | database and dump the data in each table in different formats, similar |
70 | to the "mysqldump" program. |
71 | |
72 | =item * sqlt-graph |
73 | |
74 | This is an interface to the GraphViz visualization tool and its myriad |
75 | options. |
76 | |
77 | =item * sqlt.cgi |
78 | |
79 | This is a CGI script that presents an HTML form for uploading or |
80 | pasting a schema and choosing an output and the output options. |
81 | |
82 | =back |
83 | |
84 | To read the full documentation for each script, use "perldoc" (or |
85 | execute any of the command-line scripts with the "--help" flag). |
86 | |
87 | =head1 CONVERTING SQL DIALECTS |
88 | |
89 | Probably the most common task SQLFairy is used for is to convert one |
90 | dialect of SQL to another. If you have a text description of an SQL |
91 | database (AKA a "DDL" -- "Data Definition Language"), then you should |
92 | use the "sqlt" script with switches to indicate the parser and |
93 | producer and the name of the text file as the final argument. For |
94 | example, to convert the "foo.sql" MySQL schema to a version suitable |
95 | for PostgreSQL, you would do the following: |
96 | |
97 | $ sqlt -f MySQL -t PostgreSQL foo.sql > foo-pg.sql |
98 | |
99 | The "from" and "to" options are case-sensitive and must match exactly |
100 | the names of the Parser and Producer classes in SQL::Translator. For |
101 | a complete listing of your options, execute "sqlt" with the "--list" |
102 | flag. |
103 | |
104 | =head1 EXTRACT SQL SCHEMAS DIRECTLY FROM DATABASE |
105 | |
106 | It is possible to extract some schemas directly from the database |
107 | without parsing a text file (the "foo.sql" in the above example). |
108 | This can prove significantly faster than parsing a text file. To |
109 | do this, use the "DBI" parser and provide the necessary arguments to |
110 | connect to the database and indicate the producer class, like so: |
111 | |
112 | $ sqlt -f DBI --dsn dbi:mysql:FOO --db-user guest \ |
113 | --db-password p4ssw0rd -t PostgreSQL > foo |
114 | |
115 | The "--list" option to "sqlt" will show the databases supported by |
116 | DBI parsers. |
117 | |
118 | =head1 HANDLING NON-SQL DATA |
119 | |
120 | Certain structured document formats can be easily thought of as |
121 | tables. SQLFairy can parse Microsoft Excel spreadsheets and |
122 | arbitrarily delimited text files just as if they were schemas which |
123 | contained only one table definition. The column names are normalized |
124 | to something sane for most databases (whitespace is converted to |
125 | underscores and non-word characters are removed), and the data in each |
126 | field is scanned to determine the appropriate data type (character, |
127 | integer, or float) and size. For instance, to convert a |
128 | comma-separated file to an SQLite database, do the following: |
129 | |
130 | $ sqlt -f xSV --fs ',' -t SQLite foo.csv > foo-sqlite.sql |
131 | |
132 | Additionally, there are non-SQL represenations of relational schemas |
133 | such as XML and XMI. Currently the XMI support in SQLFairy is |
134 | experimental and not released. Additionally, the only XML supported |
135 | is our own version; however, it would be fairly easy to add an XML |
136 | parser for something like the TorqueDB (http://db.apache.org/torque/) |
137 | project. The actual parsing of XML should be trivial given the number |
138 | of XML parsers available, so all that would be left would be to map |
139 | the specific concepts in the source file to the Schema objects in |
140 | SQLFairy. |
141 | |
142 | To convert a schema in SQLFairy's XML dialect to Oracle, do the following: |
143 | |
144 | $ sqlt -f XML-SQLFairy -t Oracle foo.xml > foo-oracle.sql |
145 | |
146 | =head1 SERIALIZING SCHEMAS |
2e2fc2b4 |
147 | |
cfdcb09e |
148 | Parsing a schema is generally the most computationally expensive |
149 | operation performed by SQLFairy, so it may behoove you to serialize a |
150 | parsed schema if you need to perform repeated conversions. For |
151 | example, as part of a build process the author converts a MySQL schema |
152 | first to YAML, then to PostgreSQL, Oracle, SQLite and Sybase. |
153 | Additionally, a variety of documention in HTML and images is produced. |
154 | This can be accomplished like so: |
155 | |
156 | $ sqlt -f MySQL -t YAML schema-mysql.sql > schema.yaml |
157 | $ sqlt -f YAML -t Oracle schema.yaml > schema-oracle.sql |
158 | $ sqlt -f YAML -t PostgreSQL schema.yaml > schema-postgresql.sql |
159 | $ ... |
160 | |
161 | SQLFairy has three serialization producers, none of which is superior |
162 | to the other in their description of a schema. |
163 | |
164 | =over 4 |
165 | |
166 | =item * XML-SQLFairy |
167 | |
168 | This is the aforementioned XML format. It is essentially a direct |
169 | mapping of the Schema objects into XML. This can also provide a very |
170 | convenient bridge to describing a schema to a non-Perl application. |
171 | Providing a producer argument to "sqlt" of just "XML" will default to |
172 | using "XML-SQLFairy." |
173 | |
174 | =item * Storable |
175 | |
176 | This producer stores the Schema object using Perl's Storable.pm module |
177 | available on CPAN. |
178 | |
179 | =item * YAML |
180 | |
181 | This producer serialized the Schema object with the very readable |
182 | structured data format of YAML (http://www.yaml.org/). Earlier |
183 | examples show serializing to YAML. |
184 | |
185 | =back |
186 | |
187 | =head1 VISUALIZING SQL SCHEMAS |
188 | |
189 | The visualization tools in SQLFairy can graphically represent the |
190 | tables, fields, datatypes and sizes, constraints, and foreign key |
191 | relationships in a very compact and intuitive format. This can be |
192 | very beneficial in understanding and document large or small schemas. |
193 | Two producers in SQLFairy will create pseudo-E/R (entity-relationship) |
194 | diagrams: |
195 | |
196 | =over 4 |
197 | |
198 | =item * Diagram |
199 | |
200 | The first visualization tool in SQLFairy, this producer uses libgd to |
201 | draw a picture of the schema. The tables are evenly distributed in |
202 | definition order running in columns (i.e., no graphing algorithms are |
203 | used), so the end result may result in many crossed lines showing the |
204 | foreign key relationships. Please read the documentation of the |
205 | "sqlt-diagram" script for all the options available to this producer. |
206 | |
207 | =item * GraphViz |
208 | |
209 | The layout of the GraphViz producer is far superior to the Diagram |
210 | producer as it uses the Graphviz binary from Bell Labs to create very |
211 | professional-looking graphs. There are several different layout |
212 | algorithms and node shapes available. Please see the documentation of |
213 | the "sqlt-graph" script for more information. |
214 | |
215 | =back |
216 | |
217 | =head1 AUTOMATED CODE-GENERATION |
218 | |
219 | Given that so many applications interact with SQL databases, it's no |
220 | wonder that people have automated code to deal with this interaction. |
221 | Class::DBI from CPAN is one such module that allows a developer to |
222 | describe the relationships between tables and fields in class |
223 | declarations and then generates all the SQL to interact (SELECT, |
224 | UPDATE, DELETE, INSERT statements) at runtime. Obviously, the schema |
225 | already describes itself, so it only makes sense that you should be |
226 | able to generate this kind of code directly from the schema. The |
227 | "ClassDBI" producer in SQLFairy does just this, creating a Perl module |
228 | that inherits from Class::DBI and sets up most of the code needed to |
229 | interact with the database. Here is an example of how to do this: |
230 | |
231 | $ sqlt -f MySQL -t ClassDBI foo.sql > Foo.pm |
232 | |
233 | Then simply edit Foo.pm as needed and include it in your code. |
234 | |
21521f16 |
235 | =head1 CREATING A DATA DUMPER SCRIPT |
236 | |
237 | The Dumper producer creates a Perl script that can select the fields |
238 | in each table and then create "INSERT" statements for each record in |
239 | the database similar to the output generated by MySQL's "mysqldump" |
240 | program: |
241 | |
242 | $ sqlt -f YAML -t Dumper --dumper-db-user guest \ |
243 | > --dumper-db-pass p4ssw0rd --dumper-dsn dbi:mysql:FOO \ |
244 | > foo.yaml > foo-dumper.pl |
245 | |
246 | And then execute the resulting script to dump the data: |
247 | |
248 | $ chmod +x foo-dumper.pl |
249 | $ ./foo-dumper.pl > foo-data.sql |
250 | |
251 | The dumper script also has a number of options available. Execute the |
252 | script with the "--help" flag to read about them. |
253 | |
cfdcb09e |
254 | =head1 DOCUMENTING WITH SQL::TRANSLATOR |
2e2fc2b4 |
255 | |
cfdcb09e |
256 | SQLFairy offers two producers to help document schemas: |
2e2fc2b4 |
257 | |
cfdcb09e |
258 | =over 4 |
2e2fc2b4 |
259 | |
cfdcb09e |
260 | =item * HTML |
2e2fc2b4 |
261 | |
cfdcb09e |
262 | This producer creates a single HTML document which uses HTML |
263 | formatting to describe the Schema objects and to create hyperlinks on |
264 | foreign key relationships. This can be a surprisingly useful |
265 | documentation aid as it creates a very readable format that allows one |
266 | to jump easily to specific tables and fields. It's also possible to |
267 | plugin your own CSS to further control the presentation of the HTML. |
2e2fc2b4 |
268 | |
cfdcb09e |
269 | =item * POD |
2e2fc2b4 |
270 | |
cfdcb09e |
271 | This is arguably not that useful of a producer by itself, but the |
272 | number of POD-conversion tools could be used to further transform the |
273 | POD into something more interesting. The schema is basically |
274 | represented in POD sections where tables are broken down into fields, |
275 | indices, constraints, foreign keys, etc. |
2e2fc2b4 |
276 | |
cfdcb09e |
277 | =back |
2e2fc2b4 |
278 | |
cfdcb09e |
279 | =head1 TEMPLATE-BASED MANIPULATION OF SCHEMA OBJECTS |
2e2fc2b4 |
280 | |
cfdcb09e |
281 | All of the producers which create text output could have been coded |
282 | using a templating system to mix in the dynamic output with static |
283 | text. CPAN offers several diverse templating systems, but few are as |
284 | powerful as Template Toolkit (http://www.template-toolkit.org/). You |
285 | can easily create your own producer without writing any Perl code at |
286 | all simply by writing a template using Template Toolkit's syntax. The |
287 | template will be passed a reference to the Schema object briefly |
288 | described at the beginning of this document and mentioned many times |
289 | throughout. For example, you could create a template that simply |
290 | prints the name of each table and field that looks like this: |
2e2fc2b4 |
291 | |
cfdcb09e |
292 | # file: schema.tt |
293 | [% FOREACH table IN schema.get_tables %] |
294 | Table: [% table.name %] |
295 | Fields: |
296 | [% FOREACH field IN table.get_fields -%] |
297 | [% field.name %] |
298 | [% END -%] |
299 | [% END %] |
2e2fc2b4 |
300 | |
cfdcb09e |
301 | And the process it like so: |
2e2fc2b4 |
302 | |
cfdcb09e |
303 | $ sqlt -f YAML -t TTSchema --template schema.tt foo.yaml |
2e2fc2b4 |
304 | |
cfdcb09e |
305 | To create output like this: |
2e2fc2b4 |
306 | |
cfdcb09e |
307 | Table: foo |
308 | Fields: |
309 | foo_id |
310 | foo_name |
2e2fc2b4 |
311 | |
cfdcb09e |
312 | For more information on Template Toolkit, please install the |
313 | "Template" module and read the POD. |
2e2fc2b4 |
314 | |
1b0fe900 |
315 | =head1 FINDING THE DIFFERENCES BETWEEN TWO SCHEMAS |
316 | |
317 | As mentioned above, the "sqlt-diff" schema examines two schemas and |
318 | creates SQL schema modification statements that can be used to |
319 | transform the first schema into the second. The flag syntax is |
320 | somewhat quirky: |
321 | |
322 | $ sqlt-diff foo-v1.sql=MySQL foo-v2.sql=Oracle > diff.sql |
323 | |
324 | As demonstrated, the schemas need not even be from the same vendor, |
325 | though this is likely to produce some spurious results as |
326 | datatypes are not currently viewed equivalent unless they match |
327 | exactly, even if they would be converted to the same. For example, |
328 | MySQL's "integer" data type would be converted to Oracle's "number," |
329 | but the differ isn't quite smart enough yet to figure this out. Also, |
330 | as the SQL to ALTER a field definition varies from database vendor to |
331 | vendor, these statements are made using just the keyword "CHANGE" and |
332 | will likely need to be corrected for the target database. |
333 | |
334 | =head1 A UNIFIED GRAPHICAL INTERFACE |
335 | |
336 | Seeing all the above options and scripts, you may be pining for a |
337 | single, graphical interface to handle all these transformations and |
338 | choices. This is exactly what the "sqlt.cgi" script provides. Simply |
339 | drop this script into your web server's CGI directory and enable the |
340 | execute bit and you can point your web browser to an HTML form which |
341 | provides a simple interface to all the SQLFairy parsers and producers. |
342 | |
cfdcb09e |
343 | =head1 PLUGIN YOUR OWN PARSERS AND PRODUCERS |
2e2fc2b4 |
344 | |
cfdcb09e |
345 | Now that you have seen how the parsers and producers interact via the |
346 | Schema objects, you may wish to create your own versions to plugin. |
2e2fc2b4 |
347 | |
348 | =head1 AUTHOR |
349 | |
977651a5 |
350 | Ken Y. Clark E<lt>kclark@cpan.orgE<gt>. |