3 # Copyright (c) 1998-2004 Tom Hughes <tom@compton.nu>.
4 # All rights reserved. This program is free software; you can redistribute
5 # it and/or modify it under the same terms as Perl itself.
13 IO::Zlib - IO:: style interface to L<Compress::Zlib>
17 With any version of Perl 5 you can use the basic OO interface:
22 if ($fh->open("file.gz", "rb")) {
27 $fh = IO::Zlib->new("file.gz", "wb9");
33 $fh = IO::Zlib->new("file.gz", "rb");
36 undef $fh; # automatically closes the file
39 With Perl 5.004 you can also use the TIEHANDLE interface to access
40 compressed files just like ordinary files:
44 tie *FILE, 'IO::Zlib', "file.gz", "wb";
45 print FILE "line 1\nline2\n";
47 tie *FILE, 'IO::Zlib', "file.gz", "rb";
48 while (<FILE>) { print "LINE: ", $_ };
52 C<IO::Zlib> provides an IO:: style interface to L<Compress::Zlib> and
53 hence to gzip/zlib compressed files. It provides many of the same methods
54 as the L<IO::Handle> interface.
56 Starting from IO::Zlib version 1.02, IO::Zlib can also use an
57 external F<gzip> command. The default behaviour is to try to use
58 an external F<gzip> if no C<Compress::Zlib> can be loaded, unless
59 explicitly disabled by
61 use IO::Zlib qw(:gzip_external 0);
63 If explicitly enabled by
65 use IO::Zlib qw(:gzip_external 1);
67 then the external F<gzip> is used B<instead> of C<Compress::Zlib>.
75 Creates an C<IO::Zlib> object. If it receives any parameters, they are
76 passed to the method C<open>; if the open fails, the object is destroyed.
77 Otherwise, it is returned to the caller.
85 =item open ( FILENAME, MODE )
87 C<open> takes two arguments. The first is the name of the file to open
88 and the second is the open mode. The mode can be anything acceptable to
89 L<Compress::Zlib> and by extension anything acceptable to I<zlib> (that
90 basically means POSIX fopen() style mode strings plus an optional number
91 to indicate the compression level).
95 Returns true if the object currently refers to a opened file.
99 Close the file associated with the object and disassociate
100 the file from the handle.
101 Done automatically on destroy.
105 Return the next character from the file, or undef if none remain.
109 Return the next line from the file, or undef on end of string.
110 Can safely be called in an array context.
111 Currently ignores $/ ($INPUT_RECORD_SEPARATOR or $RS when L<English>
112 is in use) and treats lines as delimited by "\n".
116 Get all remaining lines from the file.
117 It will croak() if accidentally called in a scalar context.
119 =item print ( ARGS... )
121 Print ARGS to the file.
123 =item read ( BUF, NBYTES, [OFFSET] )
125 Read some bytes from the file.
126 Returns the number of bytes actually read, 0 on end-of-file, undef on error.
130 Returns true if the handle is currently positioned at end of file?
132 =item seek ( OFFSET, WHENCE )
134 Seek to a given position in the stream.
139 Return the current position in the stream, as a numeric offset.
144 Set the current position, using the opaque value returned by C<getpos()>.
149 Return the current position in the string, as an opaque object.
154 =head1 USING THE EXTERNAL GZIP
156 If the external F<gzip> is used, the following C<open>s are used:
158 open(FH, "gzip -dc $filename |") # for read opens
159 open(FH, " | gzip > $filename") # for write opens
161 You can modify the 'commands' for example to hardwire
162 an absolute path by e.g.
164 use IO::Zlib ':gzip_read_open' => '/some/where/gunzip -c %s |';
165 use IO::Zlib ':gzip_write_open' => '| /some/where/gzip.exe > %s';
167 The C<%s> is expanded to be the filename (C<sprintf> is used, so be
168 careful to escape any other C<%> signs). The 'commands' are checked
169 for sanity - they must contain the C<%s>, and the read open must end
170 with the pipe sign, and the write open must begin with the pipe sign.
176 =item has_Compress_Zlib
178 Returns true if C<Compress::Zlib> is available. Note that this does
179 not mean that C<Compress::Zlib> is being used: see L</gzip_external>
184 Undef if an external F<gzip> B<can> be used if C<Compress::Zlib> is
185 not available (see L</has_Compress_Zlib>), true if an external F<gzip>
186 is explicitly used, false if an external F<gzip> must not be used.
191 True if an external F<gzip> is being used, false if not.
195 Return the 'command' being used for opening a file for reading using an
198 =item gzip_write_open
200 Return the 'command' being used for opening a file for writing using an
209 =item IO::Zlib::getlines: must be called in list context
211 If you want read lines, you must read in list context.
213 =item IO::Zlib::gzopen_external: mode '...' is illegal
215 Use only modes 'rb' or 'wb' or /wb[1-9]/.
217 =item IO::Zlib::import: '...' is illegal
219 The known import symbols are the C<:gzip_external>, C<:gzip_read_open>,
220 and C<:gzip_write_open>. Anything else is not recognized.
222 =item IO::Zlib::import: ':gzip_external' requires an argument
224 The C<:gzip_external> requires one boolean argument.
226 =item IO::Zlib::import: 'gzip_read_open' requires an argument
228 The C<:gzip_external> requires one string argument.
230 =item IO::Zlib::import: 'gzip_read' '...' is illegal
232 The C<:gzip_read_open> argument must end with the pipe sign (|)
233 and have the C<%s> for the filename. See L</"USING THE EXTERNAL GZIP">.
235 =item IO::Zlib::import: 'gzip_write_open' requires an argument
237 The C<:gzip_external> requires one string argument.
239 =item IO::Zlib::import: 'gzip_write_open' '...' is illegal
241 The C<:gzip_write_open> argument must begin with the pipe sign (|)
242 and have the C<%s> for the filename. An output redirect (>) is also
243 often a good idea, depending on your operating system shell syntax.
244 See L</"USING THE EXTERNAL GZIP">.
246 =item IO::Zlib::import: no Compress::Zlib and no external gzip
248 Given that we failed to load C<Compress::Zlib> and that the use of
249 an external F<gzip> was disabled, IO::Zlib has not much chance of working.
251 =item IO::Zlib::open: needs a filename
253 No filename, no open.
255 =item IO::Zlib::READ: NBYTES must be specified
257 We must know how much to read.
259 =item IO::Zlib::WRITE: too long LENGTH
261 The LENGTH must be less than or equal to the buffer size.
268 L<perlop/"I/O Operators">,
274 Created by Tom Hughes E<lt>F<tom@compton.nu>E<gt>.
276 Support for external gzip added by Jarkko Hietaniemi E<lt>F<jhi@iki.fi>E<gt>.
280 Copyright (c) 1998-2004 Tom Hughes E<lt>F<tom@compton.nu>E<gt>.
281 All rights reserved. This program is free software; you can redistribute
282 it and/or modify it under the same terms as Perl itself.
289 use vars qw($VERSION $AUTOLOAD @ISA);
292 use Fcntl qw(SEEK_SET);
294 my $has_Compress_Zlib;
297 sub has_Compress_Zlib {
302 eval { require Compress::Zlib };
303 $has_Compress_Zlib = $@ ? 0 : 1;
309 # These might use some $^O logic.
310 my $gzip_read_open = "gzip -dc %s |";
311 my $gzip_write_open = "| gzip > %s";
320 sub gzip_write_open {
333 $has_Compress_Zlib || $gzip_external;
339 if ($_[0] eq ':gzip_external') {
342 $gzip_external = shift;
344 croak "$import: ':gzip_external' requires an argument";
347 elsif ($_[0] eq ':gzip_read_open') {
350 $gzip_read_open = shift;
351 croak "$import: ':gzip_read_open' '$gzip_read_open' is illegal"
352 unless $gzip_read_open =~ /^.+%s.+\|\s*$/;
354 croak "$import: ':gzip_read_open' requires an argument";
357 elsif ($_[0] eq ':gzip_write_open') {
360 $gzip_write_open = shift;
361 croak "$import: ':gzip_write_open' '$gzip_read_open' is illegal"
362 unless $gzip_write_open =~ /^\s*\|.+%s.*$/;
364 croak "$import: ':gzip_write_open' requires an argument";
376 if ((!$has_Compress_Zlib && !defined $gzip_external) || $gzip_external) {
377 # The undef *gzopen is really needed only during
378 # testing where we eval several 'use IO::Zlib's.
380 *gzopen = \&gzopen_external;
381 *IO::Handle::gzread = \&gzread_external;
382 *IO::Handle::gzwrite = \&gzwrite_external;
383 *IO::Handle::gzreadline = \&gzreadline_external;
384 *IO::Handle::gzeof = \&gzeof_external;
385 *IO::Handle::gzclose = \&gzclose_external;
388 croak "$import: no Compress::Zlib and no external gzip"
389 unless $has_Compress_Zlib;
390 *gzopen = \&Compress::Zlib::gzopen;
391 *gzread = \&Compress::Zlib::gzread;
392 *gzwrite = \&Compress::Zlib::gzwrite;
393 *gzreadline = \&Compress::Zlib::gzreadline;
394 *gzeof = \&Compress::Zlib::gzeof;
401 my $import = "IO::Zlib::import";
403 if (_import($import, @_)) {
404 croak "$import: '@_' is illegal";
410 @ISA = qw(Tie::Handle);
417 my $self = bless {}, $class;
419 return @args ? $self->OPEN(@args) : $self;
429 my $filename = shift;
432 croak "IO::Zlib::open: needs a filename" unless defined($filename);
434 $self->{'file'} = gzopen($filename,$mode);
436 return defined($self->{'file'}) ? $self : undef;
443 return undef unless defined($self->{'file'});
445 my $status = $self->{'file'}->gzclose();
447 delete $self->{'file'};
449 return ($status == 0) ? 1 : undef;
457 my $offset = $_[2] || 0;
459 croak "IO::Zlib::READ: NBYTES must be specified" unless defined($nbytes);
461 $$bufref = "" unless defined($$bufref);
463 my $bytesread = $self->{'file'}->gzread(substr($$bufref,$offset),$nbytes);
465 return undef if $bytesread < 0;
476 return () if $self->{'file'}->gzreadline($line) <= 0;
478 return $line unless wantarray;
482 while ($self->{'file'}->gzreadline($line) > 0)
497 croak "IO::Zlib::WRITE: too long LENGTH" unless $offset + $length <= length($buf);
499 return $self->{'file'}->gzwrite(substr($buf,$offset,$length));
506 return $self->{'file'}->gzeof();
519 _alias("new", @_) unless $aliased; # Some call new IO::Zlib directly...
523 tie *{$self}, $class, @args;
525 return tied(${$self}) ? bless $self, $class : undef;
532 return scalar tied(*{$self})->READLINE();
539 croak "IO::Zlib::getlines: must be called in list context"
542 return tied(*{$self})->READLINE();
549 return defined tied(*{$self})->{'file'};
556 $AUTOLOAD =~ s/.*:://;
557 $AUTOLOAD =~ tr/a-z/A-Z/;
559 return tied(*{$self})->$AUTOLOAD(@_);
562 sub gzopen_external {
563 my ($filename, $mode) = @_;
565 my $fh = IO::Handle->new();
567 # Because someone will try to read ungzipped files
568 # with this we peek and verify the signature. Yes,
569 # this means that we open the file twice (if it is
571 # Plenty of race conditions exist in this code, but
572 # the alternative would be to capture the stderr of
573 # gzip and parse it, which would be a portability nightmare.
574 if (-e $filename && open($fh, $filename)) {
577 my $rdb = read($fh, $sig, 2);
578 if ($rdb == 2 && $sig eq "\x1F\x8B") {
579 my $ropen = sprintf $gzip_read_open, $filename;
580 if (open($fh, $ropen)) {
587 seek($fh, 0, SEEK_SET) or
588 die "IO::Zlib: open('$filename', 'r'): seek: $!";
593 } elsif ($mode =~ /w/) {
595 $level = "-$1" if $mode =~ /([1-9])/;
596 # To maximize portability we would need to open
597 # two filehandles here, one for "| gzip $level"
598 # and another for "> $filename", and then when
599 # writing copy bytes from the first to the second.
600 # We are using IO::Handle objects for now, however,
601 # and they can only contain one stream at a time.
602 my $wopen = sprintf $gzip_write_open, $filename;
603 if (open($fh, $wopen)) {
611 croak "IO::Zlib::gzopen_external: mode '$mode' is illegal";
616 sub gzread_external {
617 # Use read() instead of syswrite() because people may
618 # mix reads and readlines, and we don't want to mess
619 # the stdio buffering. See also gzreadline_external()
620 # and gzwrite_external().
621 my $nread = read($_[0], $_[1], @_ == 3 ? $_[2] : 4096);
622 defined $nread ? $nread : -1;
625 sub gzwrite_external {
626 # Using syswrite() is okay (cf. gzread_external())
627 # since the bytes leave this process and buffering
628 # is therefore not an issue.
629 my $nwrote = syswrite($_[0], $_[1]);
630 defined $nwrote ? $nwrote : -1;
633 sub gzreadline_external {
634 # See the comment in gzread_external().
635 $_[1] = readline($_[0]);
636 return defined $_[1] ? length($_[1]) : -1;
643 sub gzclose_external {
645 # I am not entirely certain why this is needed but it seems
646 # the above close() always fails (as if the stream would have
647 # been already closed - something to do with using external
648 # processes via pipes?)