=head1 NAME
-perlfaq5 - Files and Formats ($Revision: 1.17 $, $Date: 2002/05/23 19:33:50 $)
+perlfaq5 - Files and Formats ($Revision: 1.36 $, $Date: 2005/04/22 19:04:48 $)
=head1 DESCRIPTION
print() or write(). Setting $| affects buffering only for
the currently selected default file handle. You choose this
handle with the one argument select() call (see
-L<perlvar/$|> and L<perlfunc/select>).
+L<perlvar/$E<verbar>> and L<perlfunc/select>).
Use select() to choose the desired handle, then set its
per-filehandle variables.
Some idioms can handle this in a single statement:
select((select(OUTPUT_HANDLE), $| = 1)[0]);
-
+
$| = 1, select $_ for select OUTPUT_HANDLE;
Some modules offer object-oriented access to handles and their
This assumes no funny games with newline translations.
+=head2 How can I use Perl's C<-i> option from within a program?
+
+C<-i> sets the value of Perl's C<$^I> variable, which in turn affects
+the behavior of C<< <> >>; see L<perlrun> for more details. By
+modifying the appropriate variables directly, you can get the same
+behavior within a larger program. For example:
+
+ # ...
+ {
+ local($^I, @ARGV) = ('.orig', glob("*.c"));
+ while (<>) {
+ if ($. == 1) {
+ print "This line should appear at the top of each file\n";
+ }
+ s/\b(p)earl\b/${1}erl/i; # Correct typos, preserving case
+ print;
+ close ARGV if eof; # Reset $.
+ }
+ }
+ # $^I and @ARGV return to their old values here
+
+This block modifies all the C<.c> files in the current directory,
+leaving a backup of the original data from each file in a new
+C<.c.orig> file.
+
+=head2 How can I copy a file?
+
+(contributed by brian d foy)
+
+Use the File::Copy module. It comes with Perl and can do a
+true copy across file systems, and it does its magic in
+a portable fashion.
+
+ use File::Copy;
+
+ copy( $original, $new_copy ) or die "Copy failed: $!";
+
+If you can't use File::Copy, you'll have to do the work yourself:
+open the original file, open the destination file, then print
+to the destination file as you read the original.
+
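As a minimal sketch of the do-it-yourself approach described above (not a replacement for File::Copy), the following copies a file one buffer at a time. The name C<copy_by_hand> and the 64 KB buffer size are assumptions made for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: copy $from to $to by reading and writing
# in binary mode, one buffer at a time.
sub copy_by_hand {
    my ($from, $to) = @_;

    open my $in,  '<', $from or die "can't read $from: $!";
    open my $out, '>', $to   or die "can't write $to: $!";
    binmode $in;
    binmode $out;

    my $buffer;
    while (1) {
        my $count = read($in, $buffer, 64 * 1024);
        die "read failed on $from: $!" unless defined $count;
        last if $count == 0;                  # end of file
        print {$out} $buffer or die "write failed on $to: $!";
    }

    close $out or die "can't close $to: $!";
    close $in;
    return 1;
}
```

This sketch copies only the file contents; permissions, timestamps, and other metadata are left alone.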
=head2 How do I make a temporary file name?
-Use the File::Temp module, see L<File::Temp> for more information.
+If you don't need to know the name of the file, you can use C<open()>
+with C<undef> in place of the file name. The C<open()> function
+creates an anonymous temporary file.
+
+ open my $tmp, '+>', undef or die $!;
- use File::Temp qw/ tempfile tempdir /;
+Otherwise, you can use the File::Temp module.
+
+ use File::Temp qw/ tempfile tempdir /;
$dir = tempdir( CLEANUP => 1 );
($fh, $filename) = tempfile( DIR => $dir );
my $count = 0;
until (defined(fileno(FH)) || $count++ > 100) {
$base_name =~ s/-(\d+)$/"-" . (1 + $1)/e;
+ # O_EXCL is required for security reasons.
sysopen(FH, $base_name, O_WRONLY|O_EXCL|O_CREAT);
}
if (defined(fileno(FH))
=head2 How can I manipulate fixed-record-length files?
-The most efficient way is using pack() and unpack(). This is faster than
-using substr() when taking many, many strings. It is slower for just a few.
+The most efficient way is using L<pack()|perlfunc/"pack"> and
+L<unpack()|perlfunc/"unpack">. This is faster than using
+L<substr()|perlfunc/"substr"> when taking many, many strings. It is
+slower for just a few.
Here is a sample chunk of code to break up and put back together again
some fixed-format input lines, in this case from the output of a normal,
# sample input line:
# 15158 p5 T 0:00 perl /home/tchrist/scripts/now-what
- $PS_T = 'A6 A4 A7 A5 A*';
- open(PS, "ps|");
- print scalar <PS>;
- while (<PS>) {
- ($pid, $tt, $stat, $time, $command) = unpack($PS_T, $_);
- for $var (qw!pid tt stat time command!) {
- print "$var: <$$var>\n";
+ my $PS_T = 'A6 A4 A7 A5 A*';
+ open my $ps, '-|', 'ps';
+ print scalar <$ps>;
+ my @fields = qw( pid tt stat time command );
+ while (<$ps>) {
+ my %process;
+ @process{@fields} = unpack($PS_T, $_);
+ for my $field ( @fields ) {
+ print "$field: <$process{$field}>\n";
}
- print 'line=', pack($PS_T, $pid, $tt, $stat, $time, $command),
- "\n";
+ print 'line=', pack($PS_T, @process{@fields} ), "\n";
}
-We've used C<$$var> in a way that forbidden by C<use strict 'refs'>.
-That is, we've promoted a string to a scalar variable reference using
-symbolic references. This is okay in small programs, but doesn't scale
-well. It also only works on global variables, not lexicals.
+We've used a hash slice in order to easily handle the fields of each row.
+Storing the keys in an array means it's easy to operate on them as a
+group or loop over them with for. It also avoids polluting the program
+with global variables and using symbolic references.
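The hash-slice technique can be tried without running ps at all. The sketch below is illustrative only: the record is made up and is built with pack() so that its column widths are guaranteed to match the template.

```perl
use strict;
use warnings;

my $PS_T   = 'A6 A4 A7 A5 A*';
my @fields = qw( pid tt stat time command );

# Made-up record, built with pack() so the column widths match $PS_T.
my $line = pack($PS_T, 15158, 'p5', 'T', '0:00', 'perl now-what');

my %process;
@process{@fields} = unpack($PS_T, $line);    # hash slice assignment

print "$_: <$process{$_}>\n" for @fields;

# Packing the same slice reproduces the original record.
my $rebuilt = pack($PS_T, @process{@fields});
print "round trip ok\n" if $rebuilt eq $line;
```

Because the C<A> format pads with spaces when packing and strips trailing spaces when unpacking, the fields come back without padding and the rebuilt line is identical to the input.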
=head2 How can I make a filehandle local to a subroutine? How do I pass filehandles between subroutines? How do I make an array of filehandles?
and use them in the place of named handles.
open my $fh, $file_name;
-
+
open local $fh, $file_name;
-
+
print $fh "Hello World!\n";
-
+
process_file( $fh );
Before perl5.6, you had to deal with various typeglob idioms
open FILE, "> $filename";
process_typeglob( *FILE );
process_reference( \*FILE );
-
+
sub process_typeglob { local *FH = shift; print FH "Typeglob!" }
sub process_reference { local $fh = shift; print $fh "Reference!" }
That block is a proper block like any other, so you can put more
complicated code there. This sends the message out to one of two places:
- $ok = -x "/bin/cat";
+ $ok = -x "/bin/cat";
print { $ok ? $fd[1] : $fd[2] } "cat stat $ok\n";
- print { $fd[ 1+ ($ok || 0) ] } "cat stat $ok\n";
+ print { $fd[ 1+ ($ok || 0) ] } "cat stat $ok\n";
This approach of treating C<print> and C<printf> like object methods
calls doesn't work for the diamond operator. That's because it's a
=head2 How can I output my numbers with commas added?
-This one from Benjamin Goldberg will do it for you:
+This subroutine will add commas to your number:
+
+ sub commify {
+ local $_ = shift;
+ 1 while s/^([-+]?\d+)(\d{3})/$1,$2/;
+ return $_;
+ }
+
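A short usage sketch follows. It repeats the subroutine so that the example stands alone, and the sample numbers are arbitrary:

```perl
use strict;
use warnings;

sub commify {
    local $_ = shift;
    1 while s/^([-+]?\d+)(\d{3})/$1,$2/;
    return $_;
}

# Only the integer part is touched, because the pattern is anchored
# at the start of the string and stops at the first non-digit.
for my $number (qw( 999 1234 -1234567 1234567.891 )) {
    printf "%-12s => %s\n", $number, commify($number);
}
```

Each pass of the loop inserts one comma, working from the right-hand end of the integer part, until the substitution no longer matches.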
+This regex from Benjamin Goldberg will add commas to numbers:
s/(^[-+]?\d+?(?=(?>(?:\d{3})+)(?!\d))|\G\d{3}(?=\d))/$1,/g;
-or written verbosely:
+It is easier to see with comments:
s/(
^[-+]? # beginning of number.
- \d{1,3}? # first digits before first comma
+ \d+? # first digits before first comma
(?= # followed by, (but not included in the match) :
(?>(?:\d{3})+) # some positive multiple of three digits.
(?!\d) # an *exact* multiple, not x * 3 + 1 or whatever.
open(FH, "+> /path/name"); # WRONG (almost always)
Whoops. You should instead use this, which will fail if the file
-doesn't exist.
+doesn't exist.
open(FH, "+< /path/name"); # open for update
To open a file without blocking, creating if necessary:
- sysopen(FH, "/tmp/somefile", O_WRONLY|O_NDELAY|O_CREAT)
- or die "can't open /tmp/somefile: $!":
+ sysopen(FH, "/foo/somefile", O_WRONLY|O_NDELAY|O_CREAT)
+ or die "can't open /foo/somefile: $!";
Be warned that neither creation nor deletion of files is guaranteed to
be an atomic operation over NFS. That is, two processes might both
See also the new L<perlopentut> if you have it (new for 5.6).
-=head2 Why do I sometimes get an "Argument list too long" when I use <*>?
+=head2 Why do I sometimes get an "Argument list too long" when I use E<lt>*E<gt>?
The C<< <> >> operator performs a globbing operation (see above).
In Perl versions earlier than v5.6.0, the internal glob() operator forks
Normally perl ignores trailing blanks in filenames, and interprets
certain leading characters (or a trailing "|") to mean something
-special.
+special.
The three argument form of open() lets you specify the mode
separately from the filename. The open() function treats
-special mode characters and whitespace in the filename as
+special mode characters and whitespace in the filename as
literals
open FILE, "<", " file "; # filename is " file "
=head2 How can I reliably rename a file?
-If your operating system supports a proper mv(1) utility or its functional
-equivalent, this works:
+If your operating system supports a proper mv(1) utility or its
+functional equivalent, this works:
rename($old, $new) or system("mv", $old, $new);
Slavish adherence to portability concerns shouldn't get in the way of
your getting your job done.)
-For more information on file locking, see also
+For more information on file locking, see also
L<perlopentut/"File Locking"> if you have it (new for 5.6).
=back
-=head2 Why can't I just open(FH, ">file.lock")?
+=head2 Why can't I just open(FH, "E<gt>file.lock")?
A common bit of code B<NOT TO USE> is this:
atomic test-and-set instruction. In theory, this "ought" to work:
sysopen(FH, "file.lock", O_WRONLY|O_EXCL|O_CREAT)
- or die "can't open file.lock: $!":
+ or die "can't open file.lock: $!";
except that lamentably, file creation (and deletion) is not atomic
over NFS, so this won't work (at least, not every time) over the net.
Error checking is, as usual, left as an exercise for the reader.
-Note that utime() currently doesn't work correctly with Win95/NT
-ports. A bug has been reported. Check it carefully before using
-utime() on those platforms.
+The perldoc for utime also has an example that has the same
+effect as touch(1) on files that I<already exist>.
-=head2 How do I print to more than one file at once?
+Certain file systems have a limited ability to store the times
+on a file at the expected level of precision. For example, the
+FAT and HPFS filesystems are unable to create dates on files with
+a finer granularity than two seconds. This is a limitation of
+the filesystems, not of utime().
-If you only have to do this once, you can do this:
+=head2 How do I print to more than one file at once?
- for $fh (FH1, FH2, FH3) { print $fh "whatever\n" }
+To connect one filehandle to several output filehandles,
+you can use the IO::Tee or Tie::FileHandle::Multiplex modules.
-To connect up to one filehandle to several output filehandles, it's
-easiest to use the tee(1) program if you have it, and let it take care
-of the multiplexing:
+If you only have to do this once, you can print individually
+to each filehandle.
- open (FH, "| tee file1 file2 file3");
+ for $fh (FH1, FH2, FH3) { print $fh "whatever\n" }
-Or even:
+=head2 How can I read in an entire file all at once?
- # make STDOUT go to three files, plus original STDOUT
- open (STDOUT, "| tee file1 file2 file3") or die "Teeing off: $!\n";
- print "whatever\n" or die "Writing: $!\n";
- close(STDOUT) or die "Closing: $!\n";
+You can use the File::Slurp module to do it in one step.
-Otherwise you'll have to write your own multiplexing print
-function--or your own tee program--or use Tom Christiansen's,
-at http://www.cpan.org/authors/id/TOMC/scripts/tct.gz , which is
-written in Perl and offers much greater functionality
-than the stock version.
+ use File::Slurp;
-=head2 How can I read in an entire file all at once?
+ $all_of_it = read_file($filename); # entire file in scalar
+ @all_lines = read_file($filename); # one line per element
The customary Perl approach for processing all the lines in a file is to
do so one line at a time:
while (<INPUT>) {
chomp;
# do something with $_
- }
+ }
close(INPUT) || die "can't close $file: $!";
This is tremendously more efficient than reading the entire file into
$var = <INPUT>;
}
-That temporarily undefs your record separator, and will automatically
+That temporarily undefs your record separator, and will automatically
close the file at block exit. If the file is already open, just use this:
$var = do { local $/; <INPUT> };
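The same idiom works with a lexical filehandle and a three-argument open(). This is a minimal sketch; the subroutine name C<slurp> is an assumption made for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: return the whole file as one string.
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "can't open $path: $!";
    local $/;                     # undef separator: read to end of file
    my $contents = <$fh>;
    close $fh or die "can't close $path: $!";
    return $contents;
}
```

Because C<$/> is localized inside the subroutine, the caller's record separator is restored as soon as the subroutine returns.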
for instance, gets treated as two paragraphs and not three), or
C<"\n\n"> to accept empty paragraphs.
-Note that a blank line must have no blanks in it. Thus
+Note that a blank line must have no blanks in it. Thus
S<C<"fred\n \nstuff\n\n">> is one paragraph, but C<"fred\n\nstuff\n\n"> is two.
=head2 How can I read a single character from a file? From the keyboard?
If you check L<perlfunc/open>, you'll see that several of the ways
to call open() should do the trick. For example:
- open(LOG, ">>/tmp/logfile");
+ open(LOG, ">>/foo/logfile");
open(STDERR, ">&LOG");
Or even with a literal numeric descriptor:
Note that "<&STDIN" makes a copy, but "<&=STDIN" make
an alias. That means if you close an aliased handle, all
-aliases become inaccessible. This is not true with
+aliases become inaccessible. This is not true with
a copied one.
Error checking, as always, has been left as an exercise for the reader.
Or, just use the fdopen(3S) feature of open():
- {
- local *F;
+ {
+ local *F;
open F, "<&=$fd" or die "Cannot reopen fd=$fd: $!";
close F;
}
This is elaborately and painstakingly described in the
F<file-dir-perms> article in the "Far More Than You Ever Wanted To
-Know" collection in http://www.cpan.org/olddoc/FMTEYEWTK.tgz .
+Know" collection in http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz .
The executive summary: learn how your filesystem works. The
permissions on a file say what can happen to the data in that file.
srand;
rand($.) < 1 && ($line = $_) while <>;
-This has a significant advantage in space over reading the whole
-file in. A simple proof by induction is available upon
-request if you doubt the algorithm's correctness.
+This has a significant advantage in space over reading the whole file
+in. You can find a proof of this method in I<The Art of Computer
+Programming>, Volume 2, Section 3.4.2, by Donald E. Knuth.
+
+You can use the File::Random module, which provides a function
+for that algorithm:
+
+ use File::Random qw/random_line/;
+ my $line = random_line($filename);
+
+Another way is to use the Tie::File module, which treats the entire
+file as an array. Simply access a random array element.
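A minimal sketch of the Tie::File approach follows. The subroutine name C<random_line_via_tie> is an assumption made for illustration, and the file is opened read-only so the tie cannot modify it:

```perl
use strict;
use warnings;
use Fcntl 'O_RDONLY';
use Tie::File;

# Hypothetical helper: return one randomly chosen line, without its
# trailing newline (Tie::File removes the record separator by default).
sub random_line_via_tie {
    my ($path) = @_;
    tie my @lines, 'Tie::File', $path, mode => O_RDONLY
        or die "can't tie $path: $!";
    my $line = @lines ? $lines[ int rand @lines ] : undef;
    untie @lines;
    return $line;
}
```

Counting the elements makes Tie::File scan the file to find the line boundaries, so unlike the one-pass method this needs a real, seekable file rather than a pipe.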
=head2 Why do I get weird spaces when I print an array of lines?
=head1 AUTHOR AND COPYRIGHT
-Copyright (c) 1997-2002 Tom Christiansen and Nathan Torkington.
-All rights reserved.
+Copyright (c) 1997-2005 Tom Christiansen, Nathan Torkington, and
+other authors as noted. All rights reserved.
This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.