=head1 NAME
-perlfaq5 - Files and Formats ($Revision: 1.18 $, $Date: 2002/05/30 07:04:25 $)
+perlfaq5 - Files and Formats ($Revision: 1.40 $, $Date: 2005/11/10 16:06:07 $)
=head1 DESCRIPTION
formats, and footers.
=head2 How do I flush/unbuffer an output filehandle? Why must I do this?
+X<flush> X<buffer> X<unbuffer> X<autoflush>
Perl does not support truly unbuffered output (except
insofar as you can C<syswrite(OUT, $char, 1)>), although it
print() or write(). Setting $| affects buffering only for
the currently selected default file handle. You choose this
handle with the one argument select() call (see
-L<perlvar/$|> and L<perlfunc/select>).
+L<perlvar/$E<verbar>> and L<perlfunc/select>).
Use select() to choose the desired handle, then set its
per-filehandle variables.
or IO::Socket:
use IO::Socket; # this one is kinda a pipe?
- my $sock = IO::Socket::INET->new( 'www.example.com:80' ) ;
+ my $sock = IO::Socket::INET->new( 'www.example.com:80' );
$sock->autoflush();
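
For an ordinary filehandle, the select() idiom described above might be sketched as follows. The helper name C<make_hot> is hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;
use IO::Handle;

# make_hot is a hypothetical helper name, not part of any module.
sub make_hot {
    my ($fh) = @_;
    my $previous = select($fh);   # make $fh the default output handle
    $| = 1;                       # enable autoflush on the selected handle
    select($previous);            # restore the previous default handle
    return $fh;
}

# The IO::Handle method has the same effect on a single handle:
STDERR->autoflush(1);
```

Restoring the previously selected handle matters, because otherwise later bare print() calls would go to the wrong place.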
=head2 How do I change one line in a file/delete a line in a file/insert a line in the middle of a file/append to the beginning of a file?
+X<file, editing>
Use the Tie::File module, which is included in the standard
distribution since Perl 5.8.0.
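
A minimal sketch of the Tie::File approach, assuming a small text file with ordinary newline-terminated lines. The sample file and its contents are made up so that the example is self-contained.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Hypothetical sample file, created only so the sketch can run on its own.
my ($fh, $filename) = tempfile( UNLINK => 1 );
print {$fh} "first\nsecond\nthird\n";
close $fh or die "can't close $filename: $!";

# Each array element is one line of the file, without its newline.
tie my @lines, 'Tie::File', $filename
    or die "can't tie $filename: $!";

$lines[1] = 'SECOND';                 # change one line
splice @lines, 2, 1;                  # delete the line "third"
splice @lines, 1, 0, 'inserted';      # insert a line in the middle
unshift @lines, 'new first line';     # add a line at the beginning

untie @lines;                         # finish with the file
```

Every change to the array is written through to the file, so inserting near the start of a large file can still be slow.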
=head2 How do I count the number of lines in a file?
+X<file, counting lines> X<lines> X<line>
One fairly efficient way is to count newlines in the file. The
following program uses a feature of tr///, as documented in L<perlop>.
This assumes no funny games with newline translations.
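
One way to sketch the newline-counting idea is shown below. The helper name C<count_lines> and the 4096-byte chunk size are arbitrary choices for illustration.

```perl
use strict;
use warnings;

# count_lines is a hypothetical helper name.
sub count_lines {
    my ($filename) = @_;
    open my $fh, '<', $filename or die "can't open $filename: $!";
    my ($lines, $buffer) = (0, '');
    # Read fixed-size chunks so the whole file never sits in memory.
    while (sysread $fh, $buffer, 4096) {
        $lines += ($buffer =~ tr/\n//);   # tr returns the number of matches
    }
    close $fh or die "can't close $filename: $!";
    return $lines;
}
```

A final line with no trailing newline is not counted by this sketch.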
+=head2 How can I use Perl's C<-i> option from within a program?
+X<-i> X<in-place>
+
+C<-i> sets the value of Perl's C<$^I> variable, which in turn affects
+the behavior of C<< <> >>; see L<perlrun> for more details. By
+modifying the appropriate variables directly, you can get the same
+behavior within a larger program. For example:
+
+ # ...
+ {
+ local($^I, @ARGV) = ('.orig', glob("*.c"));
+ while (<>) {
+ if ($. == 1) {
+ print "This line should appear at the top of each file\n";
+ }
+ s/\b(p)earl\b/${1}erl/i; # Correct typos, preserving case
+ print;
+ close ARGV if eof; # Reset $.
+ }
+ }
+ # $^I and @ARGV return to their old values here
+
+This block modifies all the C<.c> files in the current directory,
+leaving a backup of the original data from each file in a new
+C<.c.orig> file.
+
+=head2 How can I copy a file?
+X<copy> X<file, copy>
+
+(contributed by brian d foy)
+
+Use the File::Copy module. It comes with Perl and can do a
+true copy across file systems, and it does its magic in
+a portable fashion.
+
+ use File::Copy;
+
+ copy( $original, $new_copy ) or die "Copy failed: $!";
+
+If you can't use File::Copy, you'll have to do the work yourself:
+open the original file, open the destination file, then print
+to the destination file as you read the original.
+
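
The do-it-yourself steps might look like this sketch. The helper name C<copy_by_hand> and the 64 KB buffer size are assumptions made for the example, and unlike File::Copy it does not try to handle platform quirks.

```perl
use strict;
use warnings;

# copy_by_hand is a hypothetical helper name.
sub copy_by_hand {
    my ($from, $to) = @_;
    open my $in,  '<', $from or die "can't open $from: $!";
    open my $out, '>', $to   or die "can't open $to: $!";
    binmode $in;                 # treat the data as raw bytes
    binmode $out;
    my $buffer;
    while (read $in, $buffer, 65536) {
        print {$out} $buffer or die "can't write to $to: $!";
    }
    close $in;
    close $out or die "can't close $to: $!";
    return 1;
}
```
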
=head2 How do I make a temporary file name?
+X<file, temporary>
+
+If you don't need to know the name of the file, you can use C<open()>
+with C<undef> in place of the file name. The C<open()> function
+creates an anonymous temporary file.
+
+ open my $tmp, '+>', undef or die $!;
-Use the File::Temp module, see L<File::Temp> for more information.
+Otherwise, you can use the File::Temp module.
- use File::Temp qw/ tempfile tempdir /;
+ use File::Temp qw/ tempfile tempdir /;
$dir = tempdir( CLEANUP => 1 );
($fh, $filename) = tempfile( DIR => $dir );
my $count = 0;
until (defined(fileno(FH)) || $count++ > 100) {
$base_name =~ s/-(\d+)$/"-" . (1 + $1)/e;
+ # O_EXCL is required for security reasons.
sysopen(FH, $base_name, O_WRONLY|O_EXCL|O_CREAT);
}
if (defined(fileno(FH))) {
}
=head2 How can I manipulate fixed-record-length files?
+X<fixed-length> X<file, fixed-length records>
-The most efficient way is using pack() and unpack(). This is faster than
-using substr() when taking many, many strings. It is slower for just a few.
+The most efficient way is using L<pack()|perlfunc/"pack"> and
+L<unpack()|perlfunc/"unpack">. This is faster than using
+L<substr()|perlfunc/"substr"> when taking many, many strings. It is
+slower for just a few.
Here is a sample chunk of code to break up and put back together again
some fixed-format input lines, in this case from the output of a normal,
# sample input line:
# 15158 p5 T 0:00 perl /home/tchrist/scripts/now-what
- $PS_T = 'A6 A4 A7 A5 A*';
- open(PS, "ps|");
- print scalar <PS>;
- while (<PS>) {
- ($pid, $tt, $stat, $time, $command) = unpack($PS_T, $_);
- for $var (qw!pid tt stat time command!) {
- print "$var: <$$var>\n";
+ my $PS_T = 'A6 A4 A7 A5 A*';
+ open my $ps, '-|', 'ps';
+ print scalar <$ps>;
+ my @fields = qw( pid tt stat time command );
+ while (<$ps>) {
+ my %process;
+ @process{@fields} = unpack($PS_T, $_);
+ for my $field ( @fields ) {
+ print "$field: <$process{$field}>\n";
}
- print 'line=', pack($PS_T, $pid, $tt, $stat, $time, $command),
- "\n";
+ print 'line=', pack($PS_T, @process{@fields} ), "\n";
}
-We've used C<$$var> in a way that forbidden by C<use strict 'refs'>.
-That is, we've promoted a string to a scalar variable reference using
-symbolic references. This is okay in small programs, but doesn't scale
-well. It also only works on global variables, not lexicals.
+We've used a hash slice in order to easily handle the fields of each row.
+Storing the keys in an array means it's easy to operate on them as a
+group or loop over them with for. It also avoids polluting the program
+with global variables and using symbolic references.
=head2 How can I make a filehandle local to a subroutine? How do I pass filehandles between subroutines? How do I make an array of filehandles?
+X<filehandle, local> X<filehandle, passing> X<filehandle, reference>
As of perl5.6, open() autovivifies file and directory handles
as references if you pass it an uninitialized scalar variable.
check out the Symbol or IO::Handle modules.
=head2 How can I use a filehandle indirectly?
+X<filehandle, indirect>
An indirect filehandle is the use of something other than a symbol
in a place where a filehandle is expected. Here are ways
That block is a proper block like any other, so you can put more
complicated code there. This sends the message out to one of two places:
- $ok = -x "/bin/cat";
+ $ok = -x "/bin/cat";
print { $ok ? $fd[1] : $fd[2] } "cat stat $ok\n";
- print { $fd[ 1+ ($ok || 0) ] } "cat stat $ok\n";
+ print { $fd[ 1+ ($ok || 0) ] } "cat stat $ok\n";
This approach of treating C<print> and C<printf> like object method
calls doesn't work for the diamond operator. That's because it's a
game doesn't help you at all here.
=head2 How can I set up a footer format to be used with write()?
+X<footer>
There's no builtin way to do this, but L<perlform> has a couple of
techniques to make it possible for the intrepid hacker.
=head2 How can I write() into a string?
+X<write, into a string>
See L<perlform/"Accessing Formatting Internals"> for an swrite() function.
=head2 How can I output my numbers with commas added?
+X<number, commify>
+
+(contributed by brian d foy and Benjamin Goldberg)
+
+You can use L<Number::Format> to separate places in a number.
+It handles locale information for those of you who want to insert
+full stops instead (or anything else that they want to use,
+really).
+
+This subroutine will add commas to your number:
-This one from Benjamin Goldberg will do it for you:
+ sub commify {
+ local $_ = shift;
+ 1 while s/^([-+]?\d+)(\d{3})/$1,$2/;
+ return $_;
+ }
+
+This regex from Benjamin Goldberg will add commas to numbers:
s/(^[-+]?\d+?(?=(?>(?:\d{3})+)(?!\d))|\G\d{3}(?=\d))/$1,/g;
-or written verbosely:
+It is easier to see with comments:
s/(
^[-+]? # beginning of number.
- \d{1,3}? # first digits before first comma
+ \d+? # first digits before first comma
(?= # followed by, (but not included in the match) :
(?>(?:\d{3})+) # some positive multiple of three digits.
(?!\d) # an *exact* multiple, not x * 3 + 1 or whatever.
)/$1,/xg;
=head2 How can I translate tildes (~) in a filename?
+X<tilde> X<tilde expansion>
Use the <> (glob()) operator, documented in L<perlfunc>. Older
versions of Perl require that you have a shell installed that groks
}ex;
=head2 How come when I open a file read-write it wipes it out?
+X<clobber> X<read-write> X<clobbering> X<truncate> X<truncating>
Because you're using something like this, which truncates the file and
I<then> gives you read-write access:
open(FH, "+> /path/name"); # WRONG (almost always)
Whoops. You should instead use this, which will fail if the file
-doesn't exist.
+doesn't exist.
open(FH, "+< /path/name"); # open for update
To open a file without blocking, creating if necessary:
- sysopen(FH, "/tmp/somefile", O_WRONLY|O_NDELAY|O_CREAT)
- or die "can't open /tmp/somefile: $!":
+ sysopen(FH, "/foo/somefile", O_WRONLY|O_NDELAY|O_CREAT)
+ or die "can't open /foo/somefile: $!";
Be warned that neither creation nor deletion of files is guaranteed to
be an atomic operation over NFS. That is, two processes might both
See also the new L<perlopentut> if you have it (new for 5.6).
-=head2 Why do I sometimes get an "Argument list too long" when I use <*>?
+=head2 Why do I sometimes get an "Argument list too long" when I use E<lt>*E<gt>?
+X<argument list too long>
The C<< <> >> operator performs a globbing operation (see above).
In Perl versions earlier than v5.6.0, the internal glob() operator forks
one that doesn't use the shell to do globbing.
=head2 Is there a leak/bug in glob()?
+X<glob>
Due to the current implementation on some operating systems, when you
use the glob() function or its angle-bracket alias in a scalar
best therefore to use glob() only in list context.
=head2 How can I open a file with a leading ">" or trailing blanks?
+X<filename, special characters>
-Normally perl ignores trailing blanks in filenames, and interprets
-certain leading characters (or a trailing "|") to mean something
-special.
+(contributed by Brian McCauley)
-The three argument form of open() lets you specify the mode
-separately from the filename. The open() function treats
-special mode characters and whitespace in the filename as
-literals
+The special two argument form of Perl's open() function ignores
+trailing blanks in filenames and infers the mode from certain leading
+characters (or a trailing "|"). In older versions of Perl this was the
+only version of open() and so it is prevalent in old code and books.
+Unless you have a particular reason to use the two argument form, you
+should use the three argument form of open(), which does not treat any
+characters in the filename as special.
+
open FILE, "<", " file "; # filename is " file "
open FILE, ">", ">file"; # filename is ">file"
-It may be a lot clearer to use sysopen(), though:
-
- use Fcntl;
- $badpath = "<<<something really wicked ";
- sysopen (FH, $badpath, O_WRONLY | O_CREAT | O_TRUNC)
- or die "can't open $badpath: $!";
-
=head2 How can I reliably rename a file?
+X<rename> X<mv> X<move> X<file, rename> X<ren>
-If your operating system supports a proper mv(1) utility or its functional
-equivalent, this works:
+If your operating system supports a proper mv(1) utility or its
+functional equivalent, this works:
rename($old, $new) or system("mv", $old, $new);
Newer versions of File::Copy export a move() function.
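
A short sketch of the move() route, using throwaway files in a temporary directory. The file names are hypothetical.

```perl
use strict;
use warnings;
use File::Copy qw(move);
use File::Temp qw(tempdir);

# Hypothetical files in a scratch directory, so the sketch is self-contained.
my $dir = tempdir( CLEANUP => 1 );
my ($old, $new) = ("$dir/old.txt", "$dir/new.txt");

open my $fh, '>', $old or die "can't create $old: $!";
print {$fh} "data\n";
close $fh or die "can't close $old: $!";

# move() returns true on success and sets $! on failure.
move($old, $new) or die "can't move $old to $new: $!";
```
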
=head2 How can I lock a file?
+X<lock> X<file, lock> X<flock>
Perl's builtin flock() function (see L<perlfunc> for details) will call
flock(2) if that exists, fcntl(2) if it doesn't (on perl version 5.004 and
Slavish adherence to portability concerns shouldn't get in the way of
your getting your job done.)
-For more information on file locking, see also
+For more information on file locking, see also
L<perlopentut/"File Locking"> if you have it (new for 5.6).
=back
-=head2 Why can't I just open(FH, ">file.lock")?
+=head2 Why can't I just open(FH, "E<gt>file.lock")?
+X<lock, lockfile race condition>
A common bit of code B<NOT TO USE> is this:
atomic test-and-set instruction. In theory, this "ought" to work:
sysopen(FH, "file.lock", O_WRONLY|O_EXCL|O_CREAT)
- or die "can't open file.lock: $!":
+ or die "can't open file.lock: $!";
except that lamentably, file creation (and deletion) is not atomic
over NFS, so this won't work (at least, not every time) over the net.
these tend to involve busy-wait, which is also subdesirable.
=head2 I still don't get locking. I just want to increment the number in the file. How can I do this?
+X<counter> X<file, counter>
Didn't anyone ever tell you web-page hit counters were useless?
They don't count number of hits, they're a waste of time, and they serve
If the count doesn't impress your friends, then the code might. :-)
=head2 All I want to do is append a small amount of text to the end of a file. Do I still have to use locking?
+X<append> X<file, append>
If you are on a system that correctly implements flock() and you use the
example appending code from "perldoc -f flock" everything will be OK
systems where this probability is reduced to zero.
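
A sketch of appending under an exclusive lock, along the lines of the flock() documentation. The helper name C<append_locked> is hypothetical, and the sketch assumes a local filesystem where flock() works.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_END);

# append_locked is a hypothetical helper name.
sub append_locked {
    my ($filename, $text) = @_;
    open my $fh, '>>', $filename or die "can't open $filename: $!";
    flock($fh, LOCK_EX)          or die "can't lock $filename: $!";
    # Someone may have appended while we waited for the lock.
    seek($fh, 0, SEEK_END)       or die "can't seek in $filename: $!";
    print {$fh} $text            or die "can't write to $filename: $!";
    close $fh                    or die "can't close $filename: $!";
    return 1;                    # closing the handle releases the lock
}
```
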
=head2 How do I randomly update a binary file?
+X<file, binary patch>
If you're just trying to patch a binary, in many cases something as
simple as this works:
Don't forget them or you'll be quite sorry.
=head2 How do I get a file's timestamp in perl?
+X<timestamp> X<file, timestamp>
If you want to retrieve the time at which the file was last
read, written, or had its meta-data (owner, etc) changed,
-you use the B<-M>, B<-A>, or B<-C> file test operations as
+you use the B<-A>, B<-M>, or B<-C> file test operations as
documented in L<perlfunc>. These retrieve the age of the
file (measured against the start-time of your program) in
days as a floating point number. Some platforms may not have
for details.
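
As a rough illustration, the file test operators and stat() can be combined like this. The helper name C<file_times> and the hash keys are made up for the example.

```perl
use strict;
use warnings;

# file_times is a hypothetical helper name.
sub file_times {
    my ($file) = @_;
    my @st = stat($file) or die "can't stat $file: $!";
    my %times = (
        days_since_modified => -M $file,   # age relative to program start
        days_since_accessed => -A $file,
        mtime_epoch         => $st[9],     # seconds since the epoch
    );
    return \%times;
}
```

Passing the C<mtime_epoch> value to localtime() gives a human-readable date.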
=head2 How do I set a file's timestamp in perl?
+X<timestamp> X<file, timestamp>
You use the utime() function documented in L<perlfunc/utime>.
By way of example, here's a little program that copies the
Error checking is, as usual, left as an exercise for the reader.
-Note that utime() currently doesn't work correctly with Win95/NT
-ports. A bug has been reported. Check it carefully before using
-utime() on those platforms.
+The perldoc for utime also has an example that has the same
+effect as touch(1) on files that I<already exist>.
-=head2 How do I print to more than one file at once?
+Certain file systems have a limited ability to store the times
+on a file at the expected level of precision. For example, the
+FAT and HPFS filesystems are unable to create dates on files with
+a finer granularity than two seconds. This is a limitation of
+the filesystems, not of utime().
-If you only have to do this once, you can do this:
+=head2 How do I print to more than one file at once?
+X<print, to multiple files>
- for $fh (FH1, FH2, FH3) { print $fh "whatever\n" }
+To connect one filehandle to several output filehandles,
+you can use the IO::Tee or Tie::FileHandle::Multiplex modules.
-To connect up to one filehandle to several output filehandles, it's
-easiest to use the tee(1) program if you have it, and let it take care
-of the multiplexing:
+If you only have to do this once, you can print individually
+to each filehandle.
- open (FH, "| tee file1 file2 file3");
+ for $fh (FH1, FH2, FH3) { print $fh "whatever\n" }
-Or even:
+=head2 How can I read in an entire file all at once?
+X<slurp> X<file, slurping>
- # make STDOUT go to three files, plus original STDOUT
- open (STDOUT, "| tee file1 file2 file3") or die "Teeing off: $!\n";
- print "whatever\n" or die "Writing: $!\n";
- close(STDOUT) or die "Closing: $!\n";
+You can use the File::Slurp module to do it in one step.
-Otherwise you'll have to write your own multiplexing print
-function--or your own tee program--or use Tom Christiansen's,
-at http://www.cpan.org/authors/id/TOMC/scripts/tct.gz , which is
-written in Perl and offers much greater functionality
-than the stock version.
+ use File::Slurp;
-=head2 How can I read in an entire file all at once?
+ $all_of_it = read_file($filename); # entire file in scalar
+ @all_lines = read_file($filename); # one line per element
The customary Perl approach for processing all the lines in a file is to
do so one line at a time:
while (<INPUT>) {
chomp;
# do something with $_
- }
+ }
close(INPUT) || die "can't close $file: $!";
This is tremendously more efficient than reading the entire file into
$var = <INPUT>;
}
-That temporarily undefs your record separator, and will automatically
+That temporarily undefs your record separator, and will automatically
close the file at block exit. If the file is already open, just use this:
$var = do { local $/; <INPUT> };
and reads that many bytes into the buffer $var.
=head2 How can I read in a file by paragraphs?
+X<file, reading by paragraphs>
Use the C<$/> variable (see L<perlvar> for details). You can either
set it to C<""> to eliminate empty paragraphs (C<"abc\n\n\n\ndef">,
for instance, gets treated as two paragraphs and not three), or
C<"\n\n"> to accept empty paragraphs.
-Note that a blank line must have no blanks in it. Thus
+Note that a blank line must have no blanks in it. Thus
S<C<"fred\n \nstuff\n\n">> is one paragraph, but C<"fred\n\nstuff\n\n"> is two.
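
A small sketch of paragraph mode, reading from an in-memory filehandle so that it needs no external file. The sample text is invented for the example.

```perl
use strict;
use warnings;

# Invented sample text: three paragraphs separated by runs of blank lines.
my $text = "fred\n\n\n\nstuff\nmore stuff\n\nlast\n";
open my $fh, '<', \$text or die "can't open in-memory file: $!";

my @paragraphs;
{
    local $/ = "";              # paragraph mode: blank-line runs separate records
    while (my $para = <$fh>) {
        chomp $para;            # in paragraph mode, removes all trailing newlines
        push @paragraphs, $para;
    }
}
close $fh;

print scalar(@paragraphs), " paragraphs\n";
```

Localizing C<$/> inside a block keeps the change from leaking into the rest of the program.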
=head2 How can I read a single character from a file? From the keyboard?
+X<getc> X<file, reading one character at a time>
You can use the builtin C<getc()> function for most filehandles, but
it won't (easily) work on a terminal device. For STDIN, either use
pipes, and tty devices work, but I<not> files.
=head2 How do I do a C<tail -f> in perl?
+X<tail>
First try
There's also a File::Tail module from CPAN.
=head2 How do I dup() a filehandle in Perl?
+X<dup>
If you check L<perlfunc/open>, you'll see that several of the ways
to call open() should do the trick. For example:
- open(LOG, ">>/tmp/logfile");
+ open(LOG, ">>/foo/logfile");
open(STDERR, ">&LOG");
Or even with a literal numeric descriptor:
Note that "<&STDIN" makes a copy, but "<&=STDIN" makes
an alias. That means if you close an aliased handle, all
-aliases become inaccessible. This is not true with
+aliases become inaccessible. This is not true with
a copied one.
Error checking, as always, has been left as an exercise for the reader.
=head2 How do I close a file descriptor by number?
+X<file, closing file descriptors>
This should rarely be necessary, as the Perl close() function is to be
used for things that Perl opened itself, even if it was a dup of a
Or, just use the fdopen(3S) feature of open():
- {
- local *F;
+ {
+ local *F;
open F, "<&=$fd" or die "Cannot reopen fd=$fd: $!";
close F;
}
=head2 Why can't I use "C:\temp\foo" in DOS paths? Why doesn't `C:\temp\foo.exe` work?
+X<filename, DOS issues>
Whoops! You just put a tab and a formfeed into that filename!
Remember that within double quoted strings ("like\this"), the
are more portable, too.
=head2 Why doesn't glob("*.*") get all the files?
+X<glob>
Because even on non-Unix ports, Perl's glob function follows standard
Unix globbing semantics. You'll need C<glob("*")> to get all (non-hidden)
This is elaborately and painstakingly described in the
F<file-dir-perms> article in the "Far More Than You Ever Wanted To
-Know" collection in http://www.cpan.org/olddoc/FMTEYEWTK.tgz .
+Know" collection in http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz .
The executive summary: learn how your filesystem works. The
permissions on a file say what can happen to the data in that file.
the permissions of the file govern whether you're allowed to.
=head2 How do I select a random line from a file?
+X<file, selecting a random line>
Here's an algorithm from the Camel Book:
srand;
rand($.) < 1 && ($line = $_) while <>;
-This has a significant advantage in space over reading the whole
-file in. A simple proof by induction is available upon
-request if you doubt the algorithm's correctness.
+This has a significant advantage in space over reading the whole file
+in. You can find a proof of this method in I<The Art of Computer
+Programming>, Volume 2, Section 3.4.2, by Donald E. Knuth.
+
+You can use the File::Random module which provides a function
+for that algorithm:
+
+ use File::Random qw/random_line/;
+ my $line = random_line($filename);
+
+Another way is to use the Tie::File module, which treats the entire
+file as an array. Simply access a random array element.
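
The Tie::File variant could be sketched as below. The helper name C<random_line_tied> is hypothetical, and the file is opened read-only on the assumption that you only want to read from it.

```perl
use strict;
use warnings;
use Tie::File;
use Fcntl qw(O_RDONLY);

# random_line_tied is a hypothetical helper name.
sub random_line_tied {
    my ($filename) = @_;
    tie my @lines, 'Tie::File', $filename, mode => O_RDONLY
        or die "can't tie $filename: $!";
    # rand @lines yields a fractional index in [0, line count), truncated on use.
    my $line = @lines ? $lines[ rand @lines ] : undef;
    untie @lines;
    return $line;               # returned without its trailing newline
}
```

Asking for the array's size makes Tie::File scan the whole file, so this saves memory rather than time.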
=head2 Why do I get weird spaces when I print an array of lines?
=head1 AUTHOR AND COPYRIGHT
-Copyright (c) 1997-2002 Tom Christiansen and Nathan Torkington.
-All rights reserved.
+Copyright (c) 1997-2005 Tom Christiansen, Nathan Torkington, and
+other authors as noted. All rights reserved.
This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.