3 # Copyright (c) 1998-2004 Tom Hughes <tom@compton.nu>.
4 # All rights reserved. This program is free software; you can redistribute
5 # it and/or modify it under the same terms as Perl itself.
13 IO::Zlib - IO:: style interface to L<Compress::Zlib>
17 With any version of Perl 5 you can use the basic OO interface:
22 if ($fh->open("file.gz", "rb")) {
27 $fh = IO::Zlib->new("file.gz", "wb9");
33 $fh = IO::Zlib->new("file.gz", "rb");
36 undef $fh; # automatically closes the file
39 With Perl 5.004 you can also use the TIEHANDLE interface to access
40 compressed files just like ordinary files:
44 tie *FILE, 'IO::Zlib', "file.gz", "wb";
45 print FILE "line 1\nline2\n";
47 tie *FILE, 'IO::Zlib', "file.gz", "rb";
48 while (<FILE>) { print "LINE: ", $_ };
52 C<IO::Zlib> provides an IO:: style interface to L<Compress::Zlib> and
53 hence to gzip/zlib compressed files. It provides many of the same methods
54 as the L<IO::Handle> interface.
56 Starting from IO::Zlib version 1.02, IO::Zlib can also use an
57 external F<gzip> command. The default behaviour is to try to use
58 an external F<gzip> if no C<Compress::Zlib> can be loaded, unless
59 explicitly disabled by
61 use IO::Zlib qw(:gzip_external 0);
63 If explicitly enabled by
65 use IO::Zlib qw(:gzip_external 1);
67 then the external F<gzip> is used B<instead> of C<Compress::Zlib>.
75 Creates an C<IO::Zlib> object. If it receives any parameters, they are
76 passed to the method C<open>; if the open fails, the object is destroyed.
77 Otherwise, it is returned to the caller.
85 =item open ( FILENAME, MODE )
87 C<open> takes two arguments. The first is the name of the file to open
88 and the second is the open mode. The mode can be anything acceptable to
89 L<Compress::Zlib> and by extension anything acceptable to I<zlib> (that
90 basically means POSIX fopen() style mode strings plus an optional number
91 to indicate the compression level).
95 Returns true if the object currently refers to a opened file.
99 Close the file associated with the object and disassociate
100 the file from the handle.
101 Done automatically on destroy.
105 Return the next character from the file, or undef if none remain.
109 Return the next line from the file, or undef on end of string.
110 Can safely be called in an array context.
111 Currently ignores $/ ($INPUT_RECORD_SEPARATOR or $RS when L<English>
112 is in use) and treats lines as delimited by "\n".
116 Get all remaining lines from the file.
117 It will croak() if accidentally called in a scalar context.
119 =item print ( ARGS... )
121 Print ARGS to the file.
123 =item read ( BUF, NBYTES, [OFFSET] )
125 Read some bytes from the file.
126 Returns the number of bytes actually read, 0 on end-of-file, undef on error.
130 Returns true if the handle is currently positioned at end of file?
132 =item seek ( OFFSET, WHENCE )
134 Seek to a given position in the stream.
139 Return the current position in the stream, as a numeric offset.
144 Set the current position, using the opaque value returned by C<getpos()>.
149 Return the current position in the string, as an opaque object.
154 =head1 USING THE EXTERNAL GZIP
156 If the external F<gzip> is used, the following C<open>s are used:
158 open(FH, "gzip -dc $filename |") # for read opens
159 open(FH, " | gzip > $filename") # for write opens
161 You can modify the 'commands' for example to hardwire
162 an absolute path by e.g.
164 use IO::Zlib ':gzip_read_open' => '/some/where/gunzip -c %s |';
165 use IO::Zlib ':gzip_write_open' => '| /some/where/gzip.exe > %s';
167 The C<%s> is expanded to be the filename (C<sprintf> is used, so be
168 careful to escape any other C<%> signs). The 'commands' are checked
169 for sanity - they must contain the C<%s>, and the read open must end
170 with the pipe sign, and the write open must begin with the pipe sign.
176 =item has_Compress_Zlib
178 Returns true if C<Compress::Zlib> is available. Note that this does
179 not mean that C<Compress::Zlib> is being used: see L</gzip_external>
184 Undef if an external F<gzip> B<can> be used if C<Compress::Zlib> is
185 not available (see L</has_Compress_Zlib>), true if an external F<gzip>
186 is explicitly used, false if an external F<gzip> must not be used.
191 True if an external F<gzip> is being used, false if not.
195 Return the 'command' being used for opening a file for reading using an
198 =item gzip_write_open
200 Return the 'command' being used for opening a file for writing using an
209 =item IO::Zlib::getlines: must be called in list context
211 If you want read lines, you must read in list context.
213 =item IO::Zlib::gzopen_external: mode '...' is illegal
215 Use only modes 'rb' or 'wb' or /wb[1-9]/.
217 =item IO::Zlib::import: '...' is illegal
219 The known import symbols are the C<:gzip_external>, C<:gzip_read_open>,
220 and C<:gzip_write_open>. Anything else is not recognized.
222 =item IO::Zlib::import: ':gzip_external' requires an argument
224 The C<:gzip_external> requires one boolean argument.
226 =item IO::Zlib::import: 'gzip_read_open' requires an argument
228 The C<:gzip_external> requires one string argument.
230 =item IO::Zlib::import: 'gzip_read' '...' is illegal
232 The C<:gzip_read_open> argument must end with the pipe sign (|)
233 and have the C<%s> for the filename. See L</"USING THE EXTERNAL GZIP">.
235 =item IO::Zlib::import: 'gzip_write_open' requires an argument
237 The C<:gzip_external> requires one string argument.
239 =item IO::Zlib::import: 'gzip_write_open' '...' is illegal
241 The C<:gzip_write_open> argument must begin with the pipe sign (|)
242 and have the C<%s> for the filename. An output redirect (>) is also
243 often a good idea, depending on your operating system shell syntax.
244 See L</"USING THE EXTERNAL GZIP">.
246 =item IO::Zlib::import: no Compress::Zlib and no external gzip
248 Given that we failed to load C<Compress::Zlib> and that the use of
249 an external F<gzip> was disabled, IO::Zlib has not much chance of working.
251 =item IO::Zlib::open: needs a filename
253 No filename, no open.
255 =item IO::Zlib::READ: NBYTES must be specified
257 We must know how much to read.
259 =item IO::Zlib::READ: OFFSET is not supported
261 Offsets of gzipped streams are not supported.
263 =item IO::Zlib::WRITE: too long LENGTH
265 The LENGTH must be less than or equal to the buffer size.
267 =item IO::Zlib::WRITE: OFFSET is not supported
269 Offsets of gzipped streams are not supported.
276 L<perlop/"I/O Operators">,
282 Created by Tom Hughes E<lt>F<tom@compton.nu>E<gt>.
284 Support for external gzip added by Jarkko Hietaniemi E<lt>F<jhi@iki.fi>E<gt>.
288 Copyright (c) 1998-2004 Tom Hughes E<lt>F<tom@compton.nu>E<gt>.
289 All rights reserved. This program is free software; you can redistribute
290 it and/or modify it under the same terms as Perl itself.
297 use vars qw($VERSION $AUTOLOAD @ISA);
300 use Fcntl qw(SEEK_SET);
302 my $has_Compress_Zlib;
305 sub has_Compress_Zlib {
310 eval { require Compress::Zlib };
311 $has_Compress_Zlib = $@ ? 0 : 1;
317 # These might use some $^O logic.
318 my $gzip_read_open = "gzip -dc %s |";
319 my $gzip_write_open = "| gzip > %s";
328 sub gzip_write_open {
341 $has_Compress_Zlib || $gzip_external;
347 if ($_[0] eq ':gzip_external') {
350 $gzip_external = shift;
352 croak "$import: ':gzip_external' requires an argument";
355 elsif ($_[0] eq ':gzip_read_open') {
358 $gzip_read_open = shift;
359 croak "$import: ':gzip_read_open' '$gzip_read_open' is illegal"
360 unless $gzip_read_open =~ /^.+%s.+\|\s*$/;
362 croak "$import: ':gzip_read_open' requires an argument";
365 elsif ($_[0] eq ':gzip_write_open') {
368 $gzip_write_open = shift;
369 croak "$import: ':gzip_write_open' '$gzip_read_open' is illegal"
370 unless $gzip_write_open =~ /^\s*\|.+%s.*$/;
372 croak "$import: ':gzip_write_open' requires an argument";
384 if ((!$has_Compress_Zlib && !defined $gzip_external) || $gzip_external) {
385 # The undef *gzopen is really needed only during
386 # testing where we eval several 'use IO::Zlib's.
388 *gzopen = \&gzopen_external;
389 *IO::Handle::gzread = \&gzread_external;
390 *IO::Handle::gzwrite = \&gzwrite_external;
391 *IO::Handle::gzreadline = \&gzreadline_external;
392 *IO::Handle::gzclose = \&gzclose_external;
395 croak "$import: no Compress::Zlib and no external gzip"
396 unless $has_Compress_Zlib;
397 *gzopen = \&Compress::Zlib::gzopen;
398 *gzread = \&Compress::Zlib::gzread;
399 *gzwrite = \&Compress::Zlib::gzwrite;
400 *gzreadline = \&Compress::Zlib::gzreadline;
407 my $import = "IO::Zlib::import";
409 if (_import($import, @_)) {
410 croak "$import: '@_' is illegal";
416 @ISA = qw(Tie::Handle);
423 my $self = bless {}, $class;
425 return @args ? $self->OPEN(@args) : $self;
435 my $filename = shift;
438 croak "IO::Zlib::open: needs a filename" unless defined($filename);
440 $self->{'file'} = gzopen($filename,$mode);
443 return defined($self->{'file'}) ? $self : undef;
450 return undef unless defined($self->{'file'});
452 my $status = $self->{'file'}->gzclose();
454 delete $self->{'file'};
455 delete $self->{'eof'};
457 return ($status == 0) ? 1 : undef;
467 croak "IO::Zlib::READ: NBYTES must be specified" unless defined($nbytes);
468 croak "IO::Zlib::READ: OFFSET is not supported" if defined($offset) && $offset != 0;
470 return 0 if $self->{'eof'};
472 my $bytesread = $self->{'file'}->gzread($$bufref,$nbytes);
474 return undef if $bytesread < 0;
476 $self->{'eof'} = 1 if $bytesread < $nbytes;
487 return () if $self->{'file'}->gzreadline($line) <= 0;
489 return $line unless wantarray;
493 while ($self->{'file'}->gzreadline($line) > 0)
508 croak "IO::Zlib::WRITE: too long LENGTH" unless $length <= length($buf);
509 croak "IO::Zlib::WRITE: OFFSET not supported" if defined($offset) && $offset != 0;
511 return $self->{'file'}->gzwrite(substr($buf,0,$length));
518 return $self->{'eof'};
526 _alias("new", @_) unless $aliased; # Some call new IO::Zlib directly...
530 tie *{$self}, $class, @args;
532 return tied(${$self}) ? bless $self, $class : undef;
539 return scalar tied(*{$self})->READLINE();
546 croak "IO::Zlib::getlines: must be called in list context"
549 return tied(*{$self})->READLINE();
556 return defined tied(*{$self})->{'file'};
563 $AUTOLOAD =~ s/.*:://;
564 $AUTOLOAD =~ tr/a-z/A-Z/;
566 return tied(*{$self})->$AUTOLOAD(@_);
569 sub gzopen_external {
570 my ($filename, $mode) = @_;
572 my $fh = IO::Handle->new();
574 # Because someone will try to read ungzipped files
575 # with this we peek and verify the signature. Yes,
576 # this means that we open the file twice (if it is
578 # Plenty of race conditions exist in this code, but
579 # the alternative would be to capture the stderr of
580 # gzip and parse it, which would be a portability nightmare.
581 if (-e $filename && open($fh, $filename)) {
584 my $rdb = read($fh, $sig, 2);
585 if ($rdb == 2 && $sig eq "\x1F\x8B") {
586 my $ropen = sprintf $gzip_read_open, $filename;
587 if (open($fh, $ropen)) {
594 seek($fh, 0, SEEK_SET) or
595 die "IO::Zlib: open('$filename', 'r'): seek: $!";
600 } elsif ($mode =~ /w/) {
602 $level = "-$1" if $mode =~ /([1-9])/;
603 # To maximize portability we would need to open
604 # two filehandles here, one for "| gzip $level"
605 # and another for "> $filename", and then when
606 # writing copy bytes from the first to the second.
607 # We are using IO::Handle objects for now, however,
608 # and they can only contain one stream at a time.
609 my $wopen = sprintf $gzip_write_open, $filename;
610 if (open($fh, $wopen)) {
618 croak "IO::Zlib::gzopen_external: mode '$mode' is illegal";
623 sub gzread_external {
624 # Use read() instead of syswrite() because people may
625 # mix reads and readlines, and we don't want to mess
626 # the stdio buffering. See also gzreadline_external()
627 # and gzwrite_external().
628 my $nread = read($_[0], $_[1], @_ == 3 ? $_[2] : 4096);
629 defined $nread ? $nread : -1;
632 sub gzwrite_external {
633 # Using syswrite() is okay (cf. gzread_external())
634 # since the bytes leave this process and buffering
635 # is therefore not an issue.
636 my $nwrote = syswrite($_[0], $_[1]);
637 defined $nwrote ? $nwrote : -1;
640 sub gzreadline_external {
641 # See the comment in gzread_external().
642 $_[1] = readline($_[0]);
643 return defined $_[1] ? length($_[1]) : -1;
646 sub gzclose_external {
648 # I am not entirely certain why this is needed but it seems
649 # the above close() always fails (as if the stream would have
650 # been already closed - something to do with using external
651 # processes via pipes?)