3 # Copyright (c) 1998-2004 Tom Hughes <tom@compton.nu>.
4 # All rights reserved. This program is free software; you can redistribute
5 # it and/or modify it under the same terms as Perl itself.
13 IO::Zlib - IO:: style interface to L<Compress::Zlib>
17 With any version of Perl 5 you can use the basic OO interface:
22 if ($fh->open("file.gz", "rb")) {
27 $fh = IO::Zlib->new("file.gz", "wb9");
33 $fh = IO::Zlib->new("file.gz", "rb");
36 undef $fh; # automatically closes the file
39 With Perl 5.004 you can also use the TIEHANDLE interface to access
40 compressed files just like ordinary files:
44 tie *FILE, 'IO::Zlib', "file.gz", "wb";
45 print FILE "line 1\nline2\n";
47 tie *FILE, 'IO::Zlib', "file.gz", "rb";
48 while (<FILE>) { print "LINE: ", $_ };
52 C<IO::Zlib> provides an IO:: style interface to L<Compress::Zlib> and
53 hence to gzip/zlib compressed files. It provides many of the same methods
54 as the L<IO::Handle> interface.
56 Starting from IO::Zlib version 1.02, IO::Zlib can also use an
57 external F<gzip> command. The default behaviour is to try to use
58 an external F<gzip> if no C<Compress::Zlib> can be loaded, unless
59 explicitly disabled by
61 use IO::Zlib qw(:gzip_external 0);
63 If explicitly enabled by
65 use IO::Zlib qw(:gzip_external 1);
67 then the external F<gzip> is used B<instead> of C<Compress::Zlib>.
75 Creates an C<IO::Zlib> object. If it receives any parameters, they are
76 passed to the method C<open>; if the open fails, the object is destroyed.
77 Otherwise, it is returned to the caller.
85 =item open ( FILENAME, MODE )
87 C<open> takes two arguments. The first is the name of the file to open
88 and the second is the open mode. The mode can be anything acceptable to
89 L<Compress::Zlib> and by extension anything acceptable to I<zlib> (that
90 basically means POSIX fopen() style mode strings plus an optional number
91 to indicate the compression level).
95 Returns true if the object currently refers to a opened file.
99 Close the file associated with the object and disassociate
100 the file from the handle.
101 Done automatically on destroy.
105 Return the next character from the file, or undef if none remain.
109 Return the next line from the file, or undef on end of string.
110 Can safely be called in an array context.
111 Currently ignores $/ ($INPUT_RECORD_SEPARATOR or $RS when L<English>
112 is in use) and treats lines as delimited by "\n".
116 Get all remaining lines from the file.
117 It will croak() if accidentally called in a scalar context.
119 =item print ( ARGS... )
121 Print ARGS to the file.
123 =item read ( BUF, NBYTES, [OFFSET] )
125 Read some bytes from the file.
126 Returns the number of bytes actually read, 0 on end-of-file, undef on error.
130 Returns true if the handle is currently positioned at end of file?
132 =item seek ( OFFSET, WHENCE )
134 Seek to a given position in the stream.
139 Return the current position in the stream, as a numeric offset.
144 Set the current position, using the opaque value returned by C<getpos()>.
149 Return the current position in the string, as an opaque object.
154 =head1 USING THE EXTERNAL GZIP
156 If the external F<gzip> is used, the following C<open>s are used:
158 open(FH, "gzip -dc $filename |") # for read opens
159 open(FH, " | gzip > $filename") # for write opens
161 You can modify the 'commands' for example to hardwire
162 an absolute path by e.g.
164 use IO::Zlib ':gzip_read_open' => '/some/where/gunzip -c %s |';
165 use IO::Zlib ':gzip_write_open' => '| /some/where/gzip.exe > %s';
167 The C<%s> is expanded to be the filename (C<sprintf> is used, so be
168 careful to escape any other C<%> signs). The 'commands' are checked
169 for sanity - they must contain the C<%s>, and the read open must end
170 with the pipe sign, and the write open must begin with the pipe sign.
176 =item has_Compress_Zlib
178 Returns true if C<Compress::Zlib> is available. Note that this does
179 not mean that C<Compress::Zlib> is being used: see L</gzip_external>
184 Undef if an external F<gzip> B<can> be used if C<Compress::Zlib> is
185 not available (see L</has_Compress_Zlib>), true if an external F<gzip>
186 is explicitly used, false if an external F<gzip> must not be used.
191 True if an external F<gzip> is being used, false if not.
195 Return the 'command' being used for opening a file for reading using an
198 =item gzip_write_open
200 Return the 'command' being used for opening a file for writing using an
209 =item IO::Zlib::getlines: must be called in list context
211 If you want read lines, you must read in list context.
213 =item IO::Zlib::gzopen_external: mode '...' is illegal
215 Use only modes 'rb' or 'wb' or /wb[1-9]/.
217 =item IO::Zlib::import: '...' is illegal
219 The known import symbols are the C<:gzip_external>, C<:gzip_read_open>,
220 and C<:gzip_write_open>. Anything else is not recognized.
222 =item IO::Zlib::import: ':gzip_external' requires an argument
224 The C<:gzip_external> requires one boolean argument.
226 =item IO::Zlib::import: 'gzip_read_open' requires an argument
228 The C<:gzip_external> requires one string argument.
230 =item IO::Zlib::import: 'gzip_read' '...' is illegal
232 The C<:gzip_read_open> argument must end with the pipe sign (|)
233 and have the C<%s> for the filename. See L</"USING THE EXTERNAL GZIP">.
235 =item IO::Zlib::import: 'gzip_write_open' requires an argument
237 The C<:gzip_external> requires one string argument.
239 =item IO::Zlib::import: 'gzip_write_open' '...' is illegal
241 The C<:gzip_write_open> argument must begin with the pipe sign (|)
242 and have the C<%s> for the filename. An output redirect (>) is also
243 often a good idea, depending on your operating system shell syntax.
244 See L</"USING THE EXTERNAL GZIP">.
246 =item IO::Zlib::import: no Compress::Zlib and no external gzip
248 Given that we failed to load C<Compress::Zlib> and that the use of
249 an external F<gzip> was disabled, IO::Zlib has not much chance of working.
251 =item IO::Zlib::open: needs a filename
253 No filename, no open.
255 =item IO::Zlib::READ: NBYTES must be specified
257 We must know how much to read.
259 =item IO::Zlib::WRITE: too long LENGTH
261 The LENGTH must be less than or equal to the buffer size.
263 =item IO::Zlib::WRITE: OFFSET is not supported
265 Offsets of gzipped streams are not supported.
272 L<perlop/"I/O Operators">,
278 Created by Tom Hughes E<lt>F<tom@compton.nu>E<gt>.
280 Support for external gzip added by Jarkko Hietaniemi E<lt>F<jhi@iki.fi>E<gt>.
284 Copyright (c) 1998-2004 Tom Hughes E<lt>F<tom@compton.nu>E<gt>.
285 All rights reserved. This program is free software; you can redistribute
286 it and/or modify it under the same terms as Perl itself.
293 use vars qw($VERSION $AUTOLOAD @ISA);
296 use Fcntl qw(SEEK_SET);
298 my $has_Compress_Zlib;
301 sub has_Compress_Zlib {
306 eval { require Compress::Zlib };
307 $has_Compress_Zlib = $@ ? 0 : 1;
313 # These might use some $^O logic.
314 my $gzip_read_open = "gzip -dc %s |";
315 my $gzip_write_open = "| gzip > %s";
324 sub gzip_write_open {
337 $has_Compress_Zlib || $gzip_external;
343 if ($_[0] eq ':gzip_external') {
346 $gzip_external = shift;
348 croak "$import: ':gzip_external' requires an argument";
351 elsif ($_[0] eq ':gzip_read_open') {
354 $gzip_read_open = shift;
355 croak "$import: ':gzip_read_open' '$gzip_read_open' is illegal"
356 unless $gzip_read_open =~ /^.+%s.+\|\s*$/;
358 croak "$import: ':gzip_read_open' requires an argument";
361 elsif ($_[0] eq ':gzip_write_open') {
364 $gzip_write_open = shift;
365 croak "$import: ':gzip_write_open' '$gzip_read_open' is illegal"
366 unless $gzip_write_open =~ /^\s*\|.+%s.*$/;
368 croak "$import: ':gzip_write_open' requires an argument";
380 if ((!$has_Compress_Zlib && !defined $gzip_external) || $gzip_external) {
381 # The undef *gzopen is really needed only during
382 # testing where we eval several 'use IO::Zlib's.
384 *gzopen = \&gzopen_external;
385 *IO::Handle::gzread = \&gzread_external;
386 *IO::Handle::gzwrite = \&gzwrite_external;
387 *IO::Handle::gzreadline = \&gzreadline_external;
388 *IO::Handle::gzeof = \&gzeof_external;
389 *IO::Handle::gzclose = \&gzclose_external;
392 croak "$import: no Compress::Zlib and no external gzip"
393 unless $has_Compress_Zlib;
394 *gzopen = \&Compress::Zlib::gzopen;
395 *gzread = \&Compress::Zlib::gzread;
396 *gzwrite = \&Compress::Zlib::gzwrite;
397 *gzreadline = \&Compress::Zlib::gzreadline;
398 *gzeof = \&Compress::Zlib::gzeof;
405 my $import = "IO::Zlib::import";
407 if (_import($import, @_)) {
408 croak "$import: '@_' is illegal";
414 @ISA = qw(Tie::Handle);
421 my $self = bless {}, $class;
423 return @args ? $self->OPEN(@args) : $self;
433 my $filename = shift;
436 croak "IO::Zlib::open: needs a filename" unless defined($filename);
438 $self->{'file'} = gzopen($filename,$mode);
440 return defined($self->{'file'}) ? $self : undef;
447 return undef unless defined($self->{'file'});
449 my $status = $self->{'file'}->gzclose();
451 delete $self->{'file'};
453 return ($status == 0) ? 1 : undef;
461 my $offset = $_[2] || 0;
463 croak "IO::Zlib::READ: NBYTES must be specified" unless defined($nbytes);
465 $$bufref = "" unless defined($$bufref);
467 my $bytesread = $self->{'file'}->gzread(substr($$bufref,$offset),$nbytes);
469 return undef if $bytesread < 0;
480 return () if $self->{'file'}->gzreadline($line) <= 0;
482 return $line unless wantarray;
486 while ($self->{'file'}->gzreadline($line) > 0)
501 croak "IO::Zlib::WRITE: too long LENGTH" unless $offset + $length <= length($buf);
503 return $self->{'file'}->gzwrite(substr($buf,$offset,$length));
510 return $self->{'file'}->gzeof();
523 _alias("new", @_) unless $aliased; # Some call new IO::Zlib directly...
527 tie *{$self}, $class, @args;
529 return tied(${$self}) ? bless $self, $class : undef;
536 return scalar tied(*{$self})->READLINE();
543 croak "IO::Zlib::getlines: must be called in list context"
546 return tied(*{$self})->READLINE();
553 return defined tied(*{$self})->{'file'};
560 $AUTOLOAD =~ s/.*:://;
561 $AUTOLOAD =~ tr/a-z/A-Z/;
563 return tied(*{$self})->$AUTOLOAD(@_);
566 sub gzopen_external {
567 my ($filename, $mode) = @_;
569 my $fh = IO::Handle->new();
571 # Because someone will try to read ungzipped files
572 # with this we peek and verify the signature. Yes,
573 # this means that we open the file twice (if it is
575 # Plenty of race conditions exist in this code, but
576 # the alternative would be to capture the stderr of
577 # gzip and parse it, which would be a portability nightmare.
578 if (-e $filename && open($fh, $filename)) {
581 my $rdb = read($fh, $sig, 2);
582 if ($rdb == 2 && $sig eq "\x1F\x8B") {
583 my $ropen = sprintf $gzip_read_open, $filename;
584 if (open($fh, $ropen)) {
591 seek($fh, 0, SEEK_SET) or
592 die "IO::Zlib: open('$filename', 'r'): seek: $!";
597 } elsif ($mode =~ /w/) {
599 $level = "-$1" if $mode =~ /([1-9])/;
600 # To maximize portability we would need to open
601 # two filehandles here, one for "| gzip $level"
602 # and another for "> $filename", and then when
603 # writing copy bytes from the first to the second.
604 # We are using IO::Handle objects for now, however,
605 # and they can only contain one stream at a time.
606 my $wopen = sprintf $gzip_write_open, $filename;
607 if (open($fh, $wopen)) {
615 croak "IO::Zlib::gzopen_external: mode '$mode' is illegal";
620 sub gzread_external {
621 # Use read() instead of syswrite() because people may
622 # mix reads and readlines, and we don't want to mess
623 # the stdio buffering. See also gzreadline_external()
624 # and gzwrite_external().
625 my $nread = read($_[0], $_[1], @_ == 3 ? $_[2] : 4096);
626 defined $nread ? $nread : -1;
629 sub gzwrite_external {
630 # Using syswrite() is okay (cf. gzread_external())
631 # since the bytes leave this process and buffering
632 # is therefore not an issue.
633 my $nwrote = syswrite($_[0], $_[1]);
634 defined $nwrote ? $nwrote : -1;
637 sub gzreadline_external {
638 # See the comment in gzread_external().
639 $_[1] = readline($_[0]);
640 return defined $_[1] ? length($_[1]) : -1;
647 sub gzclose_external {
649 # I am not entirely certain why this is needed but it seems
650 # the above close() always fails (as if the stream would have
651 # been already closed - something to do with using external
652 # processes via pipes?)