1 package IO::Uncompress::AnyInflate ;
3 # for RFC1950, RFC1951 or RFC1952
9 use IO::Compress::Base::Common qw(createSelfTiedObject);
11 use IO::Uncompress::Adapter::Inflate ();
14 use IO::Uncompress::Base ;
15 use IO::Uncompress::Gunzip ;
16 use IO::Uncompress::Inflate ;
17 use IO::Uncompress::RawInflate ;
18 use IO::Uncompress::Unzip ;
22 our ($VERSION, @ISA, @EXPORT_OK, %EXPORT_TAGS, $AnyInflateError);
24 $VERSION = '2.000_08';
25 $AnyInflateError = '';
27 @ISA = qw( Exporter IO::Uncompress::Base );
28 @EXPORT_OK = qw( $AnyInflateError anyinflate ) ;
29 %EXPORT_TAGS = %IO::Uncompress::Base::DEFLATE_CONSTANTS ;
30 push @{ $EXPORT_TAGS{all} }, @EXPORT_OK ;
31 Exporter::export_ok_tags('all');
33 # TODO - allow the user to pick a set of the three formats to allow
34 # or just assume want to auto-detect any of the three formats.
39 my $obj = createSelfTiedObject($class, \$AnyInflateError);
40 $obj->_create(undef, 0, @_);
45 my $obj = createSelfTiedObject(undef, \$AnyInflateError);
46 return $obj->_inf(@_) ;
59 # any always needs both crc32 and adler32
60 $got->value('CRC32' => 1);
61 $got->value('ADLER32' => 1);
72 my ($obj, $errstr, $errno) = IO::Uncompress::Adapter::Inflate::mkUncompObject();
74 return $self->saveErrorString(undef, $errstr, $errno)
77 *$self->{Uncomp} = $obj;
79 my $magic = $self->ckMagic( qw( RawInflate Inflate Gunzip Unzip ) );
82 *$self->{Info} = $self->readHeader($magic)
98 my $keep = ref $self ;
99 for my $class ( map { "IO::Uncompress::$_" } @names)
101 bless $self => $class;
102 my $magic = $self->ckMagic();
106 #bless $self => $class;
110 $self->pushBack(*$self->{HeaderPending}) ;
111 *$self->{HeaderPending} = '' ;
114 bless $self => $keep;
126 IO::Uncompress::AnyInflate - Perl interface to read RFC 1950, 1951 & 1952 files/buffers
131 use IO::Uncompress::AnyInflate qw(anyinflate $AnyInflateError) ;
133 my $status = anyinflate $input => $output [,OPTS]
134 or die "anyinflate failed: $AnyInflateError\n";
136 my $z = new IO::Uncompress::AnyInflate $input [OPTS]
137 or die "anyinflate failed: $AnyInflateError\n";
139 $status = $z->read($buffer)
140 $status = $z->read($buffer, $length)
141 $status = $z->read($buffer, $length, $offset)
142 $line = $z->getline()
147 $status = $z->inflateSync()
150 $data = $z->getHeaderInfo()
152 $z->seek($position, $whence)
164 read($z, $buffer, $length);
165 read($z, $buffer, $length, $offset);
167 seek($z, $position, $whence)
178 B<WARNING -- This is a Beta release>.
182 =item * DO NOT use in production code.
184 =item * The documentation is incomplete in places.
186 =item * Parts of the interface defined here are tentative.
188 =item * Please report any problems you find.
196 This module provides a Perl interface that allows the reading of
197 files/buffers that conform to RFCs 1950, 1951 and 1952.
199 The module will auto-detect which, if any, of the three supported
200 compression formats is being used.
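
As a minimal sketch of the auto-detection described above, the following compresses the same text as gzip (RFC 1952) and as zlib (RFC 1950), then hands both buffers to C<anyinflate> without saying which format each one uses. The sample text and variable names are illustrative assumptions. Raw deflate (RFC 1951) is left out because it has no header to inspect.

```perl
use strict;
use warnings;
use IO::Compress::Gzip         qw(gzip $GzipError);
use IO::Compress::Deflate      qw(deflate $DeflateError);
use IO::Uncompress::AnyInflate qw(anyinflate $AnyInflateError);

my $text = "the quick brown fox\n" x 3;    # sample data, assumed for this sketch

# Compress the same text in two different container formats.
my ($gz, $zlib);
gzip(\$text => \$gz)      or die "gzip failed: $GzipError\n";
deflate(\$text => \$zlib) or die "deflate failed: $DeflateError\n";

# anyinflate is given no hint about the format of either buffer.
my ($from_gz, $from_zlib);
anyinflate(\$gz => \$from_gz)
    or die "anyinflate failed: $AnyInflateError\n";
anyinflate(\$zlib => \$from_zlib)
    or die "anyinflate failed: $AnyInflateError\n";
```

Both C<$from_gz> and C<$from_zlib> should end up holding the original text.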
204 =head1 Functional Interface
206 A top-level function, C<anyinflate>, is provided to carry out
207 "one-shot" uncompression between buffers and/or files. For finer
208 control over the uncompression process, see the L</"OO Interface">
211 use IO::Uncompress::AnyInflate qw(anyinflate $AnyInflateError) ;
213 anyinflate $input => $output [,OPTS]
214 or die "anyinflate failed: $AnyInflateError\n";
218 The functional interface needs Perl 5.005 or better.
221 =head2 anyinflate $input => $output [, OPTS]
224 C<anyinflate> expects at least two parameters, C<$input> and C<$output>.
226 =head3 The C<$input> parameter
228 The parameter, C<$input>, is used to define the source of
231 It can take one of the following forms:
237 If the C<$input> parameter is a simple scalar, it is assumed to be a
238 filename. This file will be opened for reading and the input data
239 will be read from it.
243 If the C<$input> parameter is a filehandle, the input data will be
245 The string '-' can be used as an alias for standard input.
247 =item A scalar reference
249 If C<$input> is a scalar reference, the input data will be read
252 =item An array reference
254 If C<$input> is an array reference, each element in the array must be a
257 The input data will be read from each file in turn.
259 The complete array will be walked to ensure that it only
260 contains valid filenames before any data is uncompressed.
264 =item An Input FileGlob string
266 If C<$input> is a string that is delimited by the characters "<" and ">"
267 C<anyinflate> will assume that it is an I<input fileglob string>. The
268 input is the list of files that match the fileglob.
270 If the fileglob does not match any files ...
272 See L<File::GlobMapper|File::GlobMapper> for more details.
277 If the C<$input> parameter is any other type, C<undef> will be returned.
281 =head3 The C<$output> parameter
283 The parameter C<$output> is used to control the destination of the
284 uncompressed data. This parameter can take one of these forms.
290 If the C<$output> parameter is a simple scalar, it is assumed to be a
291 filename. This file will be opened for writing and the uncompressed
292 data will be written to it.
296 If the C<$output> parameter is a filehandle, the uncompressed data
297 will be written to it.
298 The string '-' can be used as an alias for standard output.
301 =item A scalar reference
303 If C<$output> is a scalar reference, the uncompressed data will be
304 stored in C<$$output>.
308 =item An Array Reference
310 If C<$output> is an array reference, the uncompressed data will be
311 pushed onto the array.
313 =item An Output FileGlob
315 If C<$output> is a string that is delimited by the characters "<" and ">"
316 C<anyinflate> will assume that it is an I<output fileglob string>. The
317 output is the list of files that match the fileglob.
319 When C<$output> is a fileglob string, C<$input> must also be a fileglob
320 string. Anything else is an error.
324 If the C<$output> parameter is any other type, C<undef> will be returned.
330 When C<$input> maps to multiple files/buffers and C<$output> is a single
331 file/buffer the uncompressed input files/buffers will all be stored
332 in C<$output> as a single uncompressed stream.
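
The note above can be sketched as follows, assuming two small gzip files created in a temporary directory. The file names and contents are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;
use File::Temp                 qw(tempdir);
use IO::Compress::Gzip         qw(gzip $GzipError);
use IO::Uncompress::AnyInflate qw(anyinflate $AnyInflateError);

my $dir   = tempdir(CLEANUP => 1);
my @parts = ("first file\n", "second file\n");
my @files;

# Write each part to its own gzip file (hypothetical names).
for my $i (0 .. $#parts) {
    my $file = "$dir/part$i.gz";
    gzip(\$parts[$i] => $file) or die "gzip failed: $GzipError\n";
    push @files, $file;
}

# An array reference of filenames as input, a single scalar buffer as output.
my $joined;
anyinflate(\@files => \$joined)
    or die "anyinflate failed: $AnyInflateError\n";
```

After the call, C<$joined> is expected to hold the uncompressed contents of each file in array order.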
336 =head2 Optional Parameters
338 Unless specified below, the optional parameters for C<anyinflate>,
339 C<OPTS>, are the same as those used with the OO interface defined in the
340 L</"Constructor Options"> section below.
344 =item AutoClose =E<gt> 0|1
346 This option applies to any input or output data streams to
347 C<anyinflate> that are filehandles.
349 If C<AutoClose> is specified, and the value is true, it will result in all
350 input and/or output filehandles being closed once C<anyinflate> has
353 This parameter defaults to 0.
357 =item BinModeOut =E<gt> 0|1
359 When writing to a file or filehandle, set C<binmode> before writing to the
368 =item -Append =E<gt> 0|1
372 =item -MultiStream =E<gt> 0|1
374 Creates a new stream after each file.
387 To read the contents of the file C<file1.txt.Compressed> and write the
388 uncompressed data to the file C<file1.txt>.
392 use IO::Uncompress::AnyInflate qw(anyinflate $AnyInflateError) ;
394 my $input = "file1.txt.Compressed";
395 my $output = "file1.txt";
396 anyinflate $input => $output
397 or die "anyinflate failed: $AnyInflateError\n";
400 To read from an existing Perl filehandle, C<$input>, and write the
401 uncompressed data to a buffer, C<$buffer>.
405 use IO::Uncompress::AnyInflate qw(anyinflate $AnyInflateError) ;
408 my $input = new IO::File "<file1.txt.Compressed"
409 or die "Cannot open 'file1.txt.Compressed': $!\n" ;
411 anyinflate $input => \$buffer
412 or die "anyinflate failed: $AnyInflateError\n";
414 To uncompress all files in the directory "/my/home" that match "*.txt.Compressed" and store the uncompressed data in the same directory
418 use IO::Uncompress::AnyInflate qw(anyinflate $AnyInflateError) ;
420 anyinflate '</my/home/*.txt.Compressed>' => '</my/home/#1.txt>'
421 or die "anyinflate failed: $AnyInflateError\n";
423 and if you want to uncompress each file one at a time, this will do the trick
427 use IO::Uncompress::AnyInflate qw(anyinflate $AnyInflateError) ;
429 for my $input ( glob "/my/home/*.txt.Compressed" )
432 $output =~ s/\.Compressed$// ;
433 anyinflate $input => $output
434 or die "Error uncompressing '$input': $AnyInflateError\n";
441 The format of the constructor for IO::Uncompress::AnyInflate is shown below
444 my $z = new IO::Uncompress::AnyInflate $input [OPTS]
445 or die "IO::Uncompress::AnyInflate failed: $AnyInflateError\n";
447 Returns an C<IO::Uncompress::AnyInflate> object on success and undef on failure.
448 The variable C<$AnyInflateError> will contain an error message on failure.
450 If you are running Perl 5.005 or better the object, C<$z>, returned from
451 IO::Uncompress::AnyInflate can be used exactly like an L<IO::File|IO::File> filehandle.
452 This means that all normal input file operations can be carried out with
453 C<$z>. For example, to read a line from a compressed file/buffer you can
454 use either of these forms
456 $line = $z->getline();
459 The mandatory parameter C<$input> is used to determine the source of the
460 compressed data. This parameter can take one of three forms.
466 If the C<$input> parameter is a scalar, it is assumed to be a filename. This
467 file will be opened for reading and the compressed data will be read from it.
471 If the C<$input> parameter is a filehandle, the compressed data will be
473 The string '-' can be used as an alias for standard input.
476 =item A scalar reference
478 If C<$input> is a scalar reference, the compressed data will be read from
483 =head2 Constructor Options
486 The option names defined below are case insensitive and can be optionally
487 prefixed by a '-'. So all of the following are valid
494 OPTS is a combination of the following options:
498 =item -AutoClose =E<gt> 0|1
500 This option is only valid when the C<$input> parameter is a filehandle. If
501 specified, and the value is true, it will result in the file being closed once
502 either the C<close> method is called or the IO::Uncompress::AnyInflate object is
505 This parameter defaults to 0.
507 =item -MultiStream =E<gt> 0|1
511 Allows multiple concatenated compressed streams to be treated as a single
512 compressed stream. Decompression will stop once either the end of the
513 file/buffer is reached, an error is encountered (premature eof, corrupt
514 compressed data) or the end of a stream is not immediately followed by the
515 start of another stream.
517 This parameter defaults to 0.
521 =item -Prime =E<gt> $string
523 This option will uncompress the contents of C<$string> before processing the
526 This option can be useful when the compressed data is embedded in another
527 file/data structure and it is not possible to work out where the compressed
528 data begins without having to read the first few bytes. If this is the
529 case, the uncompression can be I<primed> with these bytes using this
532 =item -Transparent =E<gt> 0|1
534 If this option is set and the input file or buffer is not compressed data,
535 the module will allow reading of it anyway.
537 This option defaults to 1.
539 =item -BlockSize =E<gt> $num
541 When reading the compressed input data, IO::Uncompress::AnyInflate will read it in
542 blocks of C<$num> bytes.
544 This option defaults to 4096.
546 =item -InputLength =E<gt> $size
548 When present this option will limit the number of compressed bytes read
549 from the input file/buffer to C<$size>. This option can be used in the
550 situation where there is useful data directly after the compressed data
551 stream and you know beforehand the exact length of the compressed data
554 This option is mostly used when reading from a filehandle, in which case
555 the file pointer will be left pointing to the first byte directly after the
556 compressed data stream.
560 This option defaults to off.
562 =item -Append =E<gt> 0|1
564 This option controls what the C<read> method does with uncompressed data.
566 If set to 1, all uncompressed data will be appended to the output parameter
567 of the C<read> method.
569 If set to 0, the contents of the output parameter of the C<read> method
570 will be overwritten by the uncompressed data.
574 =item -Strict =E<gt> 0|1
578 This option controls whether the extra checks defined below are used when
579 carrying out the decompression. When Strict is on, the extra tests are
580 carried out; when Strict is off, they are not.
582 The default for this option is off.
585 If the input is an RFC 1950 data stream, the following will be checked:
594 The ADLER32 checksum field must be present.
598 The value of the ADLER32 field read must match the adler32 value of the
599 uncompressed data actually contained in the file.
605 If the input is a gzip (RFC 1952) data stream, the following will be checked:
614 If the FHCRC bit is set in the gzip FLG header byte, the CRC16 bytes in the
615 header must match the crc16 value of the gzip header actually read.
619 If the gzip header contains a name field (FNAME) it consists solely of ISO
624 If the gzip header contains a comment field (FCOMMENT) it consists solely
625 of ISO 8859-1 characters plus line-feed.
629 If the gzip FEXTRA header field is present it must conform to the sub-field
630 structure as defined in RFC 1952.
634 The CRC32 and ISIZE trailer fields must be present.
638 The value of the CRC32 field read must match the crc32 value of the
639 uncompressed data actually contained in the gzip file.
643 The value of the ISIZE field read must match the length of the
644 uncompressed data actually read from the file.
653 =item -ParseExtra =E<gt> 0|1
655 If the gzip FEXTRA header field is present and this option is set, it will
656 force the module to check that it conforms to the sub-field structure as
659 If C<Strict> is on, it will automatically enable this option.
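
The effect of the C<MultiStream>, C<Transparent> and C<Append> options described above can be sketched with in-memory buffers. The helper C<slurp_with> is a hypothetical convenience written for this illustration and is not part of the module. The plain-text sample deliberately does not resemble a gzip, zlib or zip header.

```perl
use strict;
use warnings;
use IO::Compress::Gzip         qw(gzip $GzipError);
use IO::Uncompress::AnyInflate qw($AnyInflateError);

# Hypothetical helper: read everything from $data using the given options.
# Append => 1 makes each read() add to $out instead of overwriting it.
sub slurp_with {
    my ($data, @opts) = @_;
    my $z = IO::Uncompress::AnyInflate->new(\$data, Append => 1, @opts)
        or return;    # constructor failed, see $AnyInflateError
    my $out = '';
    1 while $z->read($out) > 0;
    $z->close();
    return $out;
}

# Two gzip streams concatenated in a single buffer.
my ($first, $second) = ("first\n", "second\n");
my ($gz1, $gz2);
gzip(\$first  => \$gz1) or die "gzip failed: $GzipError\n";
gzip(\$second => \$gz2) or die "gzip failed: $GzipError\n";
my $both = $gz1 . $gz2;

my $one_stream = slurp_with($both);                      # MultiStream defaults to 0
my $all        = slurp_with($both, MultiStream => 1);    # reads both streams

my $plain_in = "plain text, not compressed\n";
my $plain    = slurp_with($plain_in);                    # Transparent defaults to 1
my $rejected = slurp_with($plain_in, Transparent => 0);  # expected to be undef
```

With the defaults, only the first stream is returned and uncompressed input is passed through unchanged. Turning C<Transparent> off should make the constructor fail for the plain-text buffer.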
679 $status = $z->read($buffer)
681 Reads a block of compressed data (the size of the compressed block is
682 determined by the C<BlockSize> option in the constructor), uncompresses it and
683 writes any uncompressed data into C<$buffer>. If the C<Append> parameter is
684 set in the constructor, the uncompressed data will be appended to the
685 C<$buffer> parameter. Otherwise C<$buffer> will be overwritten.
687 Returns the number of uncompressed bytes written to C<$buffer>, zero if eof
688 or a negative number on error.
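
A typical loop around this form of C<read> might look like the sketch below. It assumes an in-memory gzip buffer built purely for the example, and collects the output by hand because C<Append> is left off.

```perl
use strict;
use warnings;
use IO::Compress::Gzip         qw(gzip $GzipError);
use IO::Uncompress::AnyInflate qw($AnyInflateError);

my $text = "some sample data\n" x 1000;    # sample data, assumed for this sketch
my $gz;
gzip(\$text => \$gz) or die "gzip failed: $GzipError\n";

my $z = IO::Uncompress::AnyInflate->new(\$gz)
    or die "IO::Uncompress::AnyInflate failed: $AnyInflateError\n";

my ($out, $buffer, $status) = ('', '', 0);
while (($status = $z->read($buffer)) > 0) {
    $out .= $buffer;    # $buffer is overwritten by every call
}

# A negative status signals an error, zero signals end of file.
die "read failed: $AnyInflateError\n" if $status < 0;
$z->close();
```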
694 $status = $z->read($buffer, $length)
695 $status = $z->read($buffer, $length, $offset)
697 $status = read($z, $buffer, $length)
698 $status = read($z, $buffer, $length, $offset)
700 Attempt to read C<$length> bytes of uncompressed data into C<$buffer>.
702 The main difference between this form of the C<read> method and the
703 previous one is that this one will attempt to return I<exactly> C<$length>
704 bytes. The only circumstances in which it will not are when end-of-file
705 or an I/O error is encountered.
707 Returns the number of uncompressed bytes written to C<$buffer>, zero if eof
708 or a negative number on error.
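
As a short sketch of the fixed-length form, the following asks for 5 bytes and then for more bytes than remain. The 16-character sample string is an assumption chosen so the offsets are easy to follow.

```perl
use strict;
use warnings;
use IO::Compress::Gzip         qw(gzip $GzipError);
use IO::Uncompress::AnyInflate qw($AnyInflateError);

my $text = "0123456789abcdef";    # sample data, assumed for this sketch
my $gz;
gzip(\$text => \$gz) or die "gzip failed: $GzipError\n";

my $z = IO::Uncompress::AnyInflate->new(\$gz)
    or die "IO::Uncompress::AnyInflate failed: $AnyInflateError\n";

# Ask for exactly 5 uncompressed bytes.
my $head;
my $got_head = $z->read($head, 5);

# Ask for more than is left; end of file cuts the read short.
my $rest;
my $got_rest = $z->read($rest, 100);

$z->close();
```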
715 $line = $z->getline()
720 This method fully supports the use of the variable C<$/>
721 (or C<$INPUT_RECORD_SEPARATOR> or C<$RS> when C<English> is in use) to
722 determine what constitutes an end of line. Both paragraph mode and file
723 slurp mode are supported.
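
The sketch below reads the same buffer twice, once line by line with the default C<$/> and once in slurp mode with C<$/> undefined. The sample text is an assumption made for the example.

```perl
use strict;
use warnings;
use IO::Compress::Gzip         qw(gzip $GzipError);
use IO::Uncompress::AnyInflate qw($AnyInflateError);

my $text = "line one\nline two\n\nline four\n";    # sample data for this sketch
my $gz;
gzip(\$text => \$gz) or die "gzip failed: $GzipError\n";

# Default $/ : getline returns one newline-terminated line per call.
my $z = IO::Uncompress::AnyInflate->new(\$gz)
    or die "IO::Uncompress::AnyInflate failed: $AnyInflateError\n";
my @lines;
while (defined(my $line = $z->getline())) {
    push @lines, $line;
}
$z->close();

# Slurp mode: with $/ undefined, one call returns all remaining data.
my $whole;
{
    local $/;
    my $z2 = IO::Uncompress::AnyInflate->new(\$gz)
        or die "IO::Uncompress::AnyInflate failed: $AnyInflateError\n";
    $whole = $z2->getline();
    $z2->close();
}
```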
732 Read a single character.
738 $char = $z->ungetc($string)
746 $status = $z->inflateSync()
755 $hdr = $z->getHeaderInfo();
756 @hdrs = $z->getHeaderInfo();
758 This method returns either a hash reference (in scalar context) or a list
759 of hash references (in array context) that contains information about each
760 of the header fields in the compressed data stream(s).
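
As an illustration, the sketch below stores a name in a gzip header with the C<Name> option of L<IO::Compress::Gzip> and reads it back through C<getHeaderInfo>. The file name is hypothetical, and the C<Name> key is the only header field this sketch relies on.

```perl
use strict;
use warnings;
use IO::Compress::Gzip         qw(gzip $GzipError);
use IO::Uncompress::AnyInflate qw($AnyInflateError);

my $text = "header demo\n";    # sample data, assumed for this sketch
my $gz;
gzip(\$text => \$gz, Name => 'notes.txt')    # hypothetical original file name
    or die "gzip failed: $GzipError\n";

my $z = IO::Uncompress::AnyInflate->new(\$gz)
    or die "IO::Uncompress::AnyInflate failed: $AnyInflateError\n";

# Scalar context: a hash reference describing the current stream's header.
my $hdr  = $z->getHeaderInfo();
my $name = $hdr->{Name};

$z->close();
```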
772 Returns the uncompressed file offset.
783 Returns true if the end of the compressed input stream has been reached.
789 $z->seek($position, $whence);
790 seek($z, $position, $whence);
795 Provides a sub-set of the C<seek> functionality, with the restriction
796 that it is only legal to seek forward in the input file/buffer.
797 It is a fatal error to attempt to seek backward.
801 The C<$whence> parameter takes one of the usual values, namely SEEK_SET,
802 SEEK_CUR or SEEK_END.
804 Returns 1 on success, 0 on failure.
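
Forward seeking can be sketched as follows, assuming a 16-character sample string so that positions are easy to check. C<SEEK_SET> and C<SEEK_CUR> are imported from L<Fcntl>.

```perl
use strict;
use warnings;
use Fcntl                      qw(SEEK_SET SEEK_CUR);
use IO::Compress::Gzip         qw(gzip $GzipError);
use IO::Uncompress::AnyInflate qw($AnyInflateError);

my $text = "0123456789abcdef";    # sample data, assumed for this sketch
my $gz;
gzip(\$text => \$gz) or die "gzip failed: $GzipError\n";

my $z = IO::Uncompress::AnyInflate->new(\$gz)
    or die "IO::Uncompress::AnyInflate failed: $AnyInflateError\n";

# Skip forward to uncompressed offset 4, then read three bytes.
my $ok_abs = $z->seek(4, SEEK_SET);
my $first;
$z->read($first, 3);

# Skip two more bytes relative to the current position, then read again.
my $ok_rel = $z->seek(2, SEEK_CUR);
my $second;
$z->read($second, 3);

$z->close();
```

Seeking backward, for example to offset 0 at this point, would be a fatal error.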
813 This is a noop provided for completeness.
819 Returns true if the object currently refers to an open file/buffer.
823 my $prev = $z->autoflush()
824 my $prev = $z->autoflush(EXPR)
826 If the C<$z> object is associated with a file or a filehandle, this method
827 returns the current autoflush setting for the underlying filehandle. If
828 C<EXPR> is present, and is non-zero, it will enable flushing after every
829 write/print operation.
831 If C<$z> is associated with a buffer, this method has no effect and always
834 B<Note> that the special variable C<$|> B<cannot> be used to set or
835 retrieve the autoflush setting.
837 =head2 input_line_number
839 $z->input_line_number()
840 $z->input_line_number(EXPR)
844 Returns the current uncompressed line number. If C<EXPR> is present it has
845 the effect of setting the line number. Note that setting the line number
846 does not change the current position within the file/buffer being read.
848 The contents of C<$/> are used to determine what constitutes a line
858 If the C<$z> object is associated with a file or a filehandle, this method
859 will return the underlying file descriptor.
861 If the C<$z> object is associated with a buffer, this method will
871 Closes the input file/buffer.
875 For most versions of Perl this method will be automatically invoked if
876 the IO::Uncompress::AnyInflate object is destroyed (either explicitly or by the
877 variable with the reference to the object going out of scope). The
878 exceptions are Perl versions 5.005 through 5.00504 and 5.8.0. In
879 these cases, the C<close> method will be called automatically, but
880 not until global destruction of all live objects when the program is
883 Therefore, if you want your scripts to be able to run on all versions
884 of Perl, you should call C<close> explicitly and not rely on automatic
887 Returns true on success, otherwise 0.
889 If the C<AutoClose> option has been enabled when the IO::Uncompress::AnyInflate
890 object was created, and the object is associated with a file, the
891 underlying file will also be closed.
898 No symbolic constants are required by IO::Uncompress::AnyInflate at present.
904 Imports C<anyinflate> and C<$AnyInflateError>.
907 use IO::Uncompress::AnyInflate qw(anyinflate $AnyInflateError) ;
918 L<Compress::Zlib>, L<IO::Compress::Gzip>, L<IO::Uncompress::Gunzip>, L<IO::Compress::Deflate>, L<IO::Uncompress::Inflate>, L<IO::Compress::RawDeflate>, L<IO::Uncompress::RawInflate>, L<IO::Compress::Bzip2>, L<IO::Uncompress::Bunzip2>, L<IO::Compress::Lzop>, L<IO::Uncompress::UnLzop>, L<IO::Uncompress::AnyUncompress>
920 L<Compress::Zlib::FAQ|Compress::Zlib::FAQ>
922 L<File::GlobMapper|File::GlobMapper>, L<Archive::Zip|Archive::Zip>,
923 L<Archive::Tar|Archive::Tar>,
927 For RFC 1950, 1951 and 1952 see
928 F<http://www.faqs.org/rfcs/rfc1950.html>,
929 F<http://www.faqs.org/rfcs/rfc1951.html> and
930 F<http://www.faqs.org/rfcs/rfc1952.html>
932 The I<zlib> compression library was written by Jean-loup Gailly
933 F<gzip@prep.ai.mit.edu> and Mark Adler F<madler@alumni.caltech.edu>.
935 The primary site for the I<zlib> compression library is
936 F<http://www.zlib.org>.
938 The primary site for gzip is F<http://www.gzip.org>.
948 The I<IO::Uncompress::AnyInflate> module was written by Paul Marquess,
953 =head1 MODIFICATION HISTORY
955 See the Changes file.
957 =head1 COPYRIGHT AND LICENSE
960 Copyright (c) 2005-2006 Paul Marquess. All rights reserved.
962 This program is free software; you can redistribute it and/or
963 modify it under the same terms as Perl itself.