1 package SQL::Translator;
3 # ----------------------------------------------------------------------
4 # $Id: Translator.pm,v 1.6 2002-03-27 12:41:52 dlc Exp $
5 # ----------------------------------------------------------------------
6 # Copyright (C) 2002 Ken Y. Clark <kycl4rk@users.sourceforge.net>,
7 # darren chamberlain <darren@cpan.org>
9 # This program is free software; you can redistribute it and/or
10 # modify it under the terms of the GNU General Public License as
11 # published by the Free Software Foundation; version 2.
13 # This program is distributed in the hope that it will be useful, but
14 # WITHOUT ANY WARRANTY; without even the implied warranty of
15 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
16 # General Public License for more details.
18 # You should have received a copy of the GNU General Public License
19 # along with this program; if not, write to the Free Software
20 # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
22 # -------------------------------------------------------------------
26 SQL::Translator - convert schema from one database to another
31 my $translator = SQL::Translator->new;
33 my $output = $translator->translate(
37 ) or die $translator->error;
42 This module attempts to simplify the task of converting one database
43 create syntax to another through the use of Parsers and Producers.
44 The idea is that any Parser can be used with any Producer in the
45 conversion process. So, if you wanted PostgreSQL-to-Oracle, you would
46 use the PostgreSQL parser and the Oracle producer.
48 Currently, the existing parsers use Parse::RecDescent, but this not
49 a requirement, or even a recommendation. New parser modules don't
50 necessarily have to use Parse::RecDescent, as long as the module
51 implements the appropriate API. With this separation of code, it is
52 hoped that developers will find it easy to add more database dialects
53 by using what's written, writing only what they need, and then
54 contributing their parsers or producers back to the project.
59 use vars qw($VERSION $DEFAULT_SUB $DEBUG);
60 $VERSION = sprintf "%d.%02d", q$Revision: 1.6 $ =~ /(\d+)\.(\d+)/;
61 $DEBUG = 1 unless defined $DEBUG;
63 # ----------------------------------------------------------------------
64 # The default behavior is to "pass through" values (note that the
65 # SQL::Translator instance is the first value ($_[0]), and the stuff
66 # to be parsed is the second value ($_[1])
67 # ----------------------------------------------------------------------
68 $DEFAULT_SUB = sub { $_[1] } unless defined $DEFAULT_SUB;
70 *isa = \&UNIVERSAL::isa;
76 The constructor is called B<new>, and accepts a optional hash of options.
81 =item parser (aka from)
85 =item producer (aka to)
89 =item filename (aka file)
97 All options are, well, optional; these attributes can be set via
98 instance methods. Internally, they are; no (non-syntactical)
99 advantage is gained by passing options to the constructor.
104 # ----------------------------------------------------------------------
108 # new takes an optional hash of arguments. These arguments may
109 # include a parser, specified with the keys "parser" or "from",
110 # and a producer, specified with the keys "producer" or "to".
112 # The values that can be passed as the parser or producer are
113 # given directly to the parser or producer methods, respectively.
114 # See the appropriate method description below for details about
115 # what each expects/accepts.
116 # ----------------------------------------------------------------------
119 my $args = isa($_[0], 'HASH') ? shift : { @_ };
120 my $self = bless { } => $class;
122 # ------------------------------------------------------------------
123 # Set the parser and producer.
125 # If a 'parser' or 'from' parameter is passed in, use that as the
126 # parser; if a 'producer' or 'to' parameter is passed in, use that
127 # as the producer; both default to $DEFAULT_SUB.
128 # ------------------------------------------------------------------
129 $self->parser( $args->{'parser'} || $args->{'from'} || $DEFAULT_SUB);
130 $self->producer($args->{'producer'} || $args->{'to'} || $DEFAULT_SUB);
132 # ------------------------------------------------------------------
133 # Set the parser_args and producer_args
134 # ------------------------------------------------------------------
135 for my $pargs (qw(parser_args producer_args)) {
136 $self->$pargs($args->{$pargs}) if defined $args->{$pargs};
139 # ------------------------------------------------------------------
140 # Set the data source, if 'filename' or 'file' is provided.
141 # ------------------------------------------------------------------
142 $args->{'filename'} ||= $args->{'file'} || "";
143 $self->filename($args->{'filename'}) if $args->{'filename'};
145 # ------------------------------------------------------------------
146 # Finally, if there is a 'data' parameter, use that in preference
147 # to filename and file
148 # ------------------------------------------------------------------
149 if (my $data = $args->{'data'}) {
153 $self->{'debug'} = $DEBUG;
154 $self->{'debug'} = $args->{'debug'} if (defined $args->{'debug'});
156 # ------------------------------------------------------------------
158 # ------------------------------------------------------------------
159 $self->error_out("");
169 The B<producer> method is an accessor/mutator, used to retrieve or
170 define what subroutine is called to produce the output. A subroutine
171 defined as a producer will be invoked as a function (not a method) and
172 passed 2 parameters: its container SQL::Translator instance and a
173 data structure. It is expected that the function transform the data
174 structure to a string. The SQL::Transformer instance is provided for
175 informational purposes; for example, the type of the parser can be
176 retrieved using the B<parser_type> method, and the B<error> and
177 B<debug> methods can be called when needed.
179 When defining a producer, one of several things can be passed
180 in: A module name (e.g., My::Groovy::Producer), a module name
181 relative to the SQL::Translator::Producer namespace (e.g., MySQL), a
182 module name and function combination (My::Groovy::Producer::transmogrify),
183 or a reference to an anonymous subroutine. If a full module name is
184 passed in (for the purposes of this method, a string containing "::"
185 is considered to be a module name), it is treated as a package, and a
186 function called "produce" will be invoked: $modulename::produce. If
187 $modulename cannot be loaded, the final portion is stripped off and
188 treated as a function. In other words, if there is no file named
189 My/Groovy/Producer/transmogrify.pm, SQL::Translator will attempt to load
190 My/Groovy/Producer.pm and use transmogrify as the name of the function,
191 instead of the default "produce".
193 my $tr = SQL::Translator->new;
195 # This will invoke My::Groovy::Producer::produce($tr, $data)
196 $tr->producer("My::Groovy::Producer");
198 # This will invoke SQL::Translator::Producer::Sybase::produce($tr, $data)
199 $tr->producer("Sybase");
201 # This will invoke My::Groovy::Producer::transmogrify($tr, $data),
202 # assuming that My::Groovy::Producer::transmogrify is not a module
204 $tr->producer("My::Groovy::Producer::transmogrify");
206 # This will invoke the referenced subroutine directly, as
207 # $subref->($tr, $data);
208 $tr->producer(\&my_producer);
210 There is also a method named B<producer_type>, which is a string
211 containing the classname to which the above B<produce> function
212 belongs. In the case of anonymous subroutines, this method returns
215 Finally, there is a method named B<producer_args>, which is both an
216 accessor and a mutator. Arbitrary data may be stored in name => value
217 pairs for the producer subroutine to access:
219 sub My::Random::producer {
220 my ($tr, $data) = @_;
221 my $pr_args = $tr->producer_args();
223 # $pr_args is a hashref.
225 Extra data passed to the B<producer> method is passed to
228 $tr->producer("xSV", delimiter => ',\s*');
230 # In SQL::Translator::Producer::xSV:
231 my $args = $tr->producer_args;
232 my $delimiter = $args->{'delimiter'}; # value is => ,\s*
236 # {{{ producer and producer_type
240 # {{{ producer as a mutator
242 my $producer = shift;
244 # {{{ Passed a module name (string containing "::")
245 if ($producer =~ /::/) {
248 # {{{ Module name was passed directly
249 # We try to load the name; if it doesn't load, there's
250 # a possibility that it has a function name attached to
252 if (load($producer)) {
253 $func_name = "produce";
256 # {{{ Module::function was passed
258 # Passed Module::Name::function; try to recover
259 my @func_parts = split /::/, $producer;
260 $func_name = pop @func_parts;
261 $producer = join "::", @func_parts;
263 # If this doesn't work, then we have a legitimate
265 load($producer) or die "Can't load $producer: $@";
268 # {{{ get code reference and assign
269 $self->{'producer'} = \&{ "$producer\::$func_name" };
270 $self->{'producer_type'} = $producer;
271 $self->debug("Got producer: $producer\::$func_name");
275 # {{{ passed an anonymous subroutine reference
276 elsif (isa($producer, 'CODE')) {
277 $self->{'producer'} = $producer;
278 $self->{'producer_type'} = "CODE";
279 $self->debug("Got producer: code ref");
282 # {{{ passed a string containing no "::"; relative package name
284 my $Pp = sprintf "SQL::Translator::Producer::$producer";
285 load($Pp) or die "Can't load $Pp: $@";
286 $self->{'producer'} = \&{ "$Pp\::produce" };
287 $self->{'producer_type'} = $Pp;
288 $self->debug("Got producer: $Pp");
291 # At this point, $self->{'producer'} contains a subroutine
292 # reference that is ready to run
294 # {{{ Anything left? If so, it's producer_args
295 $self->produser_args(@_) if (@_);
299 return $self->{'producer'};
303 # producer_type is an accessor that allows producer subs to get
304 # information about their origin. This is poptentially important;
305 # since all producer subs are called as subroutine refernces, there is
306 # no way for a producer to find out which package the sub lives in
307 # originally, for example.
308 sub producer_type { $_[0]->{'producer_type'} } # }}}
311 # Arbitrary name => value pairs of paramters can be passed to a
312 # producer using this method.
316 my $args = isa($_[0], 'HASH') ? shift : { @_ };
317 $self->{'producer_args'} = $args;
319 $self->{'producer_args'};
325 The B<parser> method defines or retrieves a subroutine that will be
326 called to perform the parsing. The basic idea is the same as that of
327 B<producer> (see above), except the default subroutine name is
328 "parse", and will be invoked as $module_name::parse($tr, $data).
329 Also, the parser subroutine will be passed a string containing the
330 entirety of the data to be parsed (or possibly a reference to a string?).
332 # Invokes SQL::Translator::Parser::MySQL::parse()
333 $tr->parser("MySQL");
335 # Invokes My::Groovy::Parser::parse()
336 $tr->parser("My::Groovy::Parser");
338 # Invoke an anonymous subroutine directly
340 my $dumper = Data::Dumper->new([ $_[1] ], [ "SQL" ]);
341 $dumper->Purity(1)->Terse(1)->Deepcopy(1);
342 return $dumper->Dump;
345 There is also B<parser_type> and B<parser_args>, which perform
346 analogously to B<producer_type> and B<producer_args>
350 # {{{ parser, parser_type, and parser_args
354 # {{{ parser as a mutator
358 # {{{ Passed a module name (string containing "::")
359 if ($parser =~ /::/) {
362 # {{{ Module name was passed directly
363 # We try to load the name; if it doesn't load, there's
364 # a possibility that it has a function name attached to
367 $func_name = "parse";
370 # {{{ Module::function was passed
372 # Passed Module::Name::function; try to recover
373 my @func_parts = split /::/, $parser;
374 $func_name = pop @func_parts;
375 $parser = join "::", @func_parts;
377 # If this doesn't work, then we have a legitimate
379 load($parser) or die "Can't load $parser: $@";
382 # {{{ get code reference and assign
383 $self->{'parser'} = \&{ "$parser\::$func_name" };
384 $self->{'parser_type'} = $parser;
385 $self->debug("Got parser: $parser\::$func_name");
389 # {{{ passed an anonymous subroutine reference
390 elsif (isa($parser, 'CODE')) {
391 $self->{'parser'} = $parser;
392 $self->{'parser_type'} = "CODE";
393 $self->debug("Got parser: code ref");
396 # {{{ passed a string containing no "::"; relative package name
398 my $Pp = sprintf "SQL::Translator::Parser::$parser";
399 load($Pp) or die "Can't load $Pp: $@";
400 $self->{'parser'} = \&{ "$Pp\::parse" };
401 $self->{'parser_type'} = $Pp;
402 $self->debug("Got parser: $Pp");
405 # At this point, $self->{'parser'} contains a subroutine
406 # reference that is ready to run
408 $self->parser_args(@_) if (@_);
411 return $self->{'parser'};
414 sub parser_type { $_[0]->{'parser_type'} }
420 my $args = isa($_[0], 'HASH') ? shift : { @_ };
421 $self->{'parser_args'} = $args;
423 $self->{'parser_args'};
429 The B<translate> method calls the subroutines referenced by the
430 B<parser> and B<producer> data members (described above). It accepts
431 as arguments a number of things, in key => value format, including
432 (potentially) a parser and a producer (they are passed directly to the
433 B<parser> and B<producer> methods).
435 Here is how the parameter list to B<translate> is parsed:
441 1 argument means it's the data to be parsed; which could be a string
442 (filename) or a refernce to a scalar (a string stored in memory), or a
443 reference to a hash, which is parsed as being more than one argument
446 # Parse the file /path/to/datafile
447 my $output = $tr->translate("/path/to/datafile");
449 # Parse the data contained in the string $data
450 my $output = $tr->translate(\$data);
454 More than 1 argument means its a hash of things, and it might be
455 setting a parser, producer, or datasource (this key is named
456 "filename" or "file" if it's a file, or "data" for a SCALAR reference.
458 # As above, parse /path/to/datafile, but with different producers
459 for my $prod ("MySQL", "XML", "Sybase") {
460 print $tr->translate(
462 filename => "/path/to/datafile",
466 # The filename hash key could also be:
467 datasource => \$data,
473 =head2 B<filename>, B<data>
475 Using the B<filename> method, the filename of the data to be parsed
476 can be set. This method can be used in conjunction with the B<data>
477 method, below. If both the B<filename> and B<data> methods are
478 invoked as mutators, the data set in the B<data> method is used.
480 $tr->filename("/my/data/files/create.sql");
484 my $create_script = do {
486 open CREATE, "/my/data/files/create.sql" or die $!;
489 $tr->data(\$create_script);
491 B<filename> takes a string, which is interpreted as a filename.
492 B<data> takes a reference to a string, which is used as the data o be
493 parsed. If a filename is set, then that file is opened and read when
494 the B<translate> method is called, as long as the data instance
499 # {{{ filename - get or set the filename
503 $self->{'filename'} = shift;
504 $self->debug("Got filename: $self->{'filename'}");
509 # {{{ data - get or set the data
510 # if $self->{'data'} is not set, but $self->{'filename'} is, then
511 # $self->{'filename'} is opened and read, whith the results put into
516 # {{{ Set $self->{'data'} to $_[0], if it is provided.
519 if (isa($data, "SCALAR")) {
520 $self->{'data'} = $data;
522 elsif (! ref $data) {
523 $self->{'data'} = \$data;
528 # {{{ If we have a filename but no data yet, populate.
529 if (not $self->{'data'} and my $filename = $self->filename) {
530 $self->debug("Opening '$filename' to get contents...");
535 unless (open FH, $filename) {
536 $self->error_out("Can't open $filename for reading: $!");
541 $self->{'data'} = \$data;
544 $self->error_out("Can't close $filename: $!");
550 return $self->{'data'};
556 my ($args, $parser, $producer);
558 # {{{ Parse arguments
560 # {{{ Passed a reference to a hash
561 if (isa($_[0], 'HASH')) {
563 $self->debug("translate: Got a hashref");
568 # {{{ Passed a reference to a string containing the data
569 elsif (isa($_[0], 'SCALAR')) {
570 # passed a ref to a string
571 $self->debug("translate: Got a SCALAR reference (string)");
576 # {{{ Not a reference; treat it as a filename
577 elsif (! ref $_[0]) {
578 # Not a ref, it's a filename
579 $self->debug("translate: Got a filename");
580 $self->filename($_[0]);
584 # {{{ Passed something else entirely.
586 # We're not impressed. Take your empty string and leave.
592 # You must pass in a hash, or you get nothing.
597 # ----------------------------------------------------------------------
598 # Can specify the data to be transformed using "filename", "file",
600 # ----------------------------------------------------------------------
601 if (my $filename = $args->{'filename'} || $args->{'file'}) {
602 $self->filename($filename);
605 if (my $data = $self->{'data'}) {
609 # ----------------------------------------------------------------
611 # ----------------------------------------------------------------
612 my $data = $self->data;
613 unless (defined $$data) {
614 $self->error_out("Empty data file!");
618 # ----------------------------------------------------------------
619 # Local reference to the parser subroutine
620 # ----------------------------------------------------------------
621 if ($parser = ($args->{'parser'} || $args->{'from'})) {
622 $self->parser($parser);
624 $parser = $self->parser;
627 # ----------------------------------------------------------------
628 # Local reference to the producer subroutine
629 # ----------------------------------------------------------------
630 if ($producer = ($args->{'producer'} || $args->{'to'})) {
631 $self->producer($producer);
633 $producer = $self->producer;
636 # ----------------------------------------------------------------
637 # Execute the parser, then execute the producer with that output
638 # ----------------------------------------------------------------
639 return $producer->($self, $parser->($self, $$data));
645 The error method returns the last error.
650 #-----------------------------------------------------
653 # Return the last error.
655 return shift()->{'error'} || '';
661 Record the error and return undef. The error can be retrieved by
662 calling programs using $tr->error.
664 For Parser or Producer writers, primarily.
671 if ( my $error = shift ) {
672 $self->{'error'} = $error;
680 If the global variable $SQL::Translator::DEBUG is set to a true value,
681 then calls to $tr->debug($msg) will be carped to STDERR. If $DEBUG is
682 not set, then this method does nothing.
690 # carp @_ if $self->{'debug'};
694 my $class = ref $self || $self;
695 carp "[$class] $_" for @_;
703 my $module = do { my $m = shift; $m =~ s[::][/]g; "$m.pm" };
704 return 1 if $INC{$module};
706 eval { require $module };
716 #-----------------------------------------------------
717 # Rescue the drowning and tie your shoestrings.
718 # Henry David Thoreau
719 #-----------------------------------------------------
723 Ken Y. Clark, E<lt>kclark@logsoft.comE<gt>,
724 darren chamberlain E<lt>darren@cpan.orgE<gt>
728 This program is free software; you can redistribute it and/or modify
729 it under the terms of the GNU General Public License as published by
730 the Free Software Foundation; version 2.
732 This program is distributed in the hope that it will be useful, but
733 WITHOUT ANY WARRANTY; without even the implied warranty of
734 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
735 General Public License for more details.
737 You should have received a copy of the GNU General Public License
738 along with this program; if not, write to the Free Software
739 Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
744 L<perl>, L<Parse::RecDescent>