=head1 NAME

perlfaq5 - Files and Formats ($Revision: 6019 $)

=head1 DESCRIPTION

This section deals with I/O and the "f" issues: filehandles, flushing,
formats, and footers.

=head2 How do I flush/unbuffer an output filehandle? Why must I do this?
X<flush> X<buffer> X<unbuffer> X<autoflush>

Perl does not support truly unbuffered output (except
insofar as you can C<syswrite(OUT, $char, 1)>), although it
does support "command buffering", in which a physical
write is performed after every output command.

The C standard I/O library (stdio) normally buffers
characters sent to devices so that there isn't a system call
for each byte. In most stdio implementations, the type of
output buffering and the size of the buffer vary according
to the type of device. Perl's print() and write() functions
normally buffer output, while syswrite() bypasses buffering
altogether.

If you want your output to be sent immediately when you
execute print() or write() (for instance, for some network
protocols), you must set the handle's autoflush flag. This
flag is the Perl variable $| and when it is set to a true
value, Perl will flush the handle's buffer after each
print() or write(). Setting $| affects buffering only for
the currently selected default filehandle. You choose this
handle with the one-argument select() call (see
L<perlvar/$E<verbar>> and L<perlfunc/select>).

Use select() to choose the desired handle, then set its
per-filehandle variables.

    $old_fh = select(OUTPUT_HANDLE);
    $| = 1;
    select($old_fh);

Some modules offer object-oriented access to handles and their
variables, although they may be overkill if this is the only
thing you do with them. You can use IO::Handle:

    use IO::Handle;
    open(DEV, ">/dev/printer");   # but is this?
    DEV->autoflush(1);

or IO::Socket:

    use IO::Socket;               # this one is kinda a pipe?
    my $sock = IO::Socket::INET->new( 'www.example.com:80' );

    $sock->autoflush();

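The effect of autoflush can be observed with a lexical filehandle. The
following is only a minimal sketch: the scratch file made with
File::Temp and the variable names are assumptions for illustration. With
autoflush on, the data should reach the file as soon as print() returns,
so the size check sees it while the handle is still open.

```perl
use strict;
use warnings;
use IO::Handle;                 # provides the autoflush() method
use File::Temp qw(tempfile);

# Hypothetical scratch file, used only to make the effect observable.
my ($out, $path) = tempfile(UNLINK => 1);

$out->autoflush(1);             # same effect as select($out); $| = 1;
print {$out} "flushed at once\n";

# The handle is still open, yet the bytes are already in the file.
my $size_while_open = -s $path;
print "bytes on disk before close: $size_while_open\n";

close $out or die "can't close $path: $!";
```

Without the autoflush() call, the size reported before close() would
normally be zero, because the text would still sit in Perl's buffer.
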
=head2 How do I change one line in a file/delete a line in a file/insert a line in the middle of a file/append to the beginning of a file?
X<file, editing>

Use the Tie::File module, which is included in the standard
distribution since Perl 5.8.0.

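As a rough sketch of what that looks like, the following ties an array
to a small scratch file and performs each of the four edits named in the
question. The file contents and variable names are made up for the
example; only C<tie>, C<splice>, C<unshift>, and C<untie> are doing the
real work.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Hypothetical three-line file to edit.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "first\nsecond\nthird\n";
close $tmp or die "can't close $path: $!";

# Each array element is one line of the file, without its newline.
tie my @lines, 'Tie::File', $path or die "can't tie $path: $!";

$lines[1] = 'SECOND';               # change one line
splice @lines, 2, 1;                # delete a line
splice @lines, 1, 0, 'inserted';    # insert a line in the middle
unshift @lines, 'header';           # add a line at the beginning

untie @lines;                       # finish with the file
```

Tie::File rewrites the file as the array changes, so for very large
files an edit near the beginning can be slow.
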
=head2 How do I count the number of lines in a file?
X<file, counting lines> X<lines> X<line>

One fairly efficient way is to count newlines in the file. The
following program uses a feature of tr///, as documented in L<perlop>.
If your text file doesn't end with a newline, then it's not really a
proper text file, so this may report one fewer line than you expect.

    $lines = 0;
    open(FILE, '<', $filename) or die "Can't open `$filename': $!";
    while (sysread FILE, $buffer, 4096) {
        $lines += ($buffer =~ tr/\n//);
    }
    close FILE;

This assumes no funny games with newline translations.

=head2 How can I use Perl's C<-i> option from within a program?
X<-i> X<in-place>

C<-i> sets the value of Perl's C<$^I> variable, which in turn affects
the behavior of C<< <> >>; see L<perlrun> for more details. By
modifying the appropriate variables directly, you can get the same
behavior within a larger program. For example:

    # ...
    {
        local($^I, @ARGV) = ('.orig', glob("*.c"));
        while (<>) {
            if ($. == 1) {
                print "This line should appear at the top of each file\n";
            }
            s/\b(p)earl\b/${1}erl/i;        # Correct typos, preserving case
            print;
            close ARGV if eof;              # Reset $.
        }
    }
    # $^I and @ARGV return to their old values here

This block modifies all the C<.c> files in the current directory,
leaving a backup of the original data from each file in a new
C<.c.orig> file.

=head2 How can I copy a file?
X<copy> X<file, copy>

(contributed by brian d foy)

Use the File::Copy module. It comes with Perl and can do a
true copy across file systems, and it does its magic in
a portable fashion.

    use File::Copy;

    copy( $original, $new_copy ) or die "Copy failed: $!";

If you can't use File::Copy, you'll have to do the work yourself:
open the original file, open the destination file, then print
to the destination file as you read the original.

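The do-it-yourself route might look something like the sketch below.
The helper name C<copy_by_hand> and the 64KB chunk size are assumptions
chosen for the example, not anything File::Copy prescribes. It reads the
original in chunks and prints each chunk to the destination, checking
every step.

```perl
use strict;
use warnings;

# Hypothetical helper: copy $original to $new_copy in fixed-size chunks.
sub copy_by_hand {
    my ($original, $new_copy) = @_;

    open my $in,  '<', $original or die "can't read $original: $!";
    open my $out, '>', $new_copy or die "can't write $new_copy: $!";
    binmode $in;                 # copy bytes, not translated text
    binmode $out;

    my $buffer;
    while (1) {
        my $got = read($in, $buffer, 64 * 1024);
        die "read error on $original: $!" unless defined $got;
        last if $got == 0;       # end of file
        print {$out} $buffer or die "write error on $new_copy: $!";
    }

    close $in;
    close $out or die "can't close $new_copy: $!";
    return 1;
}
```

You would call it as C<copy_by_hand($original, $new_copy)>. Unlike a
careful copy utility, this sketch makes no attempt to carry over
permissions or timestamps.
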
=head2 How do I make a temporary file name?
X<file, temporary>

If you don't need to know the name of the file, you can use C<open()>
with C<undef> in place of the file name. The C<open()> function
creates an anonymous temporary file.

    open my $tmp, '+>', undef or die $!;

Otherwise, you can use the File::Temp module.

    use File::Temp qw/ tempfile tempdir /;

    $dir = tempdir( CLEANUP => 1 );
    ($fh, $filename) = tempfile( DIR => $dir );

    # or if you don't need to know the filename

    $fh = tempfile( DIR => $dir );

File::Temp has been a standard module since Perl 5.6.1. If you
don't have a modern enough Perl installed, use the C<new_tmpfile>
class method from the IO::File module to get a filehandle opened for
reading and writing. Use it if you don't need to know the file's name:

    use IO::File;
    $fh = IO::File->new_tmpfile()
        or die "Unable to make new temporary file: $!";

If you're committed to creating a temporary file by hand, use the
process ID and/or the current time-value. If you need to have many
temporary files in one process, use a counter:

    BEGIN {
        use Fcntl;
        my $temp_dir = -d '/tmp' ? '/tmp' : $ENV{TMPDIR} || $ENV{TEMP};
        my $base_name = sprintf("%s/%d-%d-0000", $temp_dir, $$, time());

        sub temp_file {
            local *FH;
            my $count = 0;
            until (defined(fileno(FH)) || $count++ > 100) {
                $base_name =~ s/-(\d+)$/"-" . (1 + $1)/e;
                # O_EXCL is required for security reasons.
                sysopen(FH, $base_name, O_WRONLY|O_EXCL|O_CREAT);
            }

            if (defined(fileno(FH))) {
                return (*FH, $base_name);
            }
            else {
                return ();
            }
        }
    }

=head2 How can I manipulate fixed-record-length files?
X<fixed-length> X<file, fixed-length records>

The most efficient way is using L<pack()|perlfunc/"pack"> and
L<unpack()|perlfunc/"unpack">. This is faster than using
L<substr()|perlfunc/"substr"> when extracting many, many strings. It is
slower for just a few.

Here is a sample chunk of code to break up and put back together again
some fixed-format input lines, in this case from the output of a normal,
Berkeley-style ps:

    # sample input line:
    #   15158 p5  T      0:00 perl /home/tchrist/scripts/now-what
    my $PS_T = 'A6 A4 A7 A5 A*';
    open my $ps, '-|', 'ps' or die "can't run ps: $!";
    print scalar <$ps>;
    my @fields = qw( pid tt stat time command );
    while (<$ps>) {
        my %process;
        @process{@fields} = unpack($PS_T, $_);
        for my $field ( @fields ) {
            print "$field: <$process{$field}>\n";
        }
        print 'line=', pack($PS_T, @process{@fields} ), "\n";
    }

We've used a hash slice in order to easily handle the fields of each row.
Storing the keys in an array means it's easy to operate on them as a
group or loop over them with for. It also avoids polluting the program
with global variables and using symbolic references.

=head2 How can I make a filehandle local to a subroutine? How do I pass filehandles between subroutines? How do I make an array of filehandles?
X<filehandle, local> X<filehandle, passing> X<filehandle, reference>

As of perl5.6, open() autovivifies file and directory handles
as references if you pass it an uninitialized scalar variable.
You can then pass these references just like any other scalar,
and use them in the place of named handles.

    open my    $fh, $file_name;

    open local $fh, $file_name;

    print $fh "Hello World!\n";

    process_file( $fh );

If you like, you can store these filehandles in an array or a hash.
If you access them directly, they aren't simple scalars and you
need to give C<print> a little help by placing the filehandle
reference in braces. Perl can only figure it out on its own when
the filehandle reference is a simple scalar.

    my @fhs = ( $fh1, $fh2, $fh3 );

    for( $i = 0; $i <= $#fhs; $i++ ) {
        print {$fhs[$i]} "just another Perl answer, \n";
    }

Before perl5.6, you had to deal with various typeglob idioms
which you may see in older code.

    open FILE, "> $filename";
    process_typeglob(   *FILE );
    process_reference( \*FILE );

    sub process_typeglob  { local *FH = shift; print FH  "Typeglob!" }
    sub process_reference { local $fh = shift; print $fh "Reference!" }

If you want to create many anonymous handles, you should
check out the Symbol or IO::Handle modules.

=head2 How can I use a filehandle indirectly?
X<filehandle, indirect>

An indirect filehandle is the use of something other than a symbol
in a place where a filehandle is expected. Here are ways
to get indirect filehandles:

    $fh =   SOME_FH;       # bareword is strict-subs hostile
    $fh =  "SOME_FH";      # strict-refs hostile; same package only
    $fh =  *SOME_FH;       # typeglob
    $fh = \*SOME_FH;       # ref to typeglob (bless-able)
    $fh =  *SOME_FH{IO};   # blessed IO::Handle from *SOME_FH typeglob

Or, you can use the C<new> method from one of the IO::* modules to
create an anonymous filehandle, store that in a scalar variable,
and use it as though it were a normal filehandle.

    use IO::Handle;                     # 5.004 or higher
    $fh = IO::Handle->new();

Then use any of those as you would a normal filehandle. Anywhere that
Perl is expecting a filehandle, an indirect filehandle may be used
instead. An indirect filehandle is just a scalar variable that contains
a filehandle. Functions like C<print>, C<open>, C<seek>, or
the C<< <FH> >> diamond operator will accept either a named filehandle
or a scalar variable containing one:

    ($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR);
    print $ofh "Type it: ";
    $got = <$ifh>;
    print $efh "What was that: $got";

If you're passing a filehandle to a function, you can write
the function in two ways:

    sub accept_fh {
        my $fh = shift;
        print $fh "Sending to indirect filehandle\n";
    }

Or it can localize a typeglob and use the filehandle directly:

    sub accept_fh {
        local *FH = shift;
        print FH "Sending to localized filehandle\n";
    }

Both styles work with either objects or typeglobs of real filehandles.
(They might also work with strings under some circumstances, but this
is risky.)

    accept_fh(*STDOUT);
    accept_fh($handle);

In the examples above, we assigned the filehandle to a scalar variable
before using it. That is because only simple scalar variables, not
expressions or subscripts of hashes or arrays, can be used with
built-ins like C<print>, C<printf>, or the diamond operator. Using
something other than a simple scalar variable as a filehandle is
illegal and won't even compile:

    @fd = (*STDIN, *STDOUT, *STDERR);
    print $fd[1] "Type it: ";                           # WRONG
    $got = <$fd[0]>;                                    # WRONG
    print $fd[2] "What was that: $got";                 # WRONG

With C<print> and C<printf>, you get around this by using a block and
an expression where you would place the filehandle:

    print  { $fd[1] } "funny stuff\n";
    printf { $fd[1] } "Pity the poor %x.\n", 3_735_928_559;
    # Pity the poor deadbeef.

That block is a proper block like any other, so you can put more
complicated code there. This sends the message out to one of two places:

    $ok = -x "/bin/cat";
    print { $ok ? $fd[1] : $fd[2] } "cat stat $ok\n";
    print { $fd[ 1+ ($ok || 0) ]  } "cat stat $ok\n";

This approach of treating C<print> and C<printf> like object method
calls doesn't work for the diamond operator. That's because it's a
real operator, not just a function with a comma-less argument. Assuming
you've been storing typeglobs in your structure as we did above, you
can use the built-in function named C<readline> to read a record just
as C<< <> >> does. Given the initialization shown above for @fd, this
would work, but only because readline() requires a typeglob. It doesn't
work with objects or strings, which might be a bug we haven't fixed yet.

    $got = readline($fd[0]);

Let it be noted that the flakiness of indirect filehandles is not
related to whether they're strings, typeglobs, objects, or anything else.
It's the syntax of the fundamental operators. Playing the object
game doesn't help you at all here.

=head2 How can I set up a footer format to be used with write()?
X<footer>

There's no builtin way to do this, but L<perlform> has a couple of
techniques to make it possible for the intrepid hacker.

=head2 How can I write() into a string?
X<write, into a string>

See L<perlform/"Accessing Formatting Internals"> for an swrite() function.

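For orientation, the technique described there builds on formline()
and the format accumulator C<$^A>. The following is a sketch along
those lines; the picture line and the sample arguments in the call are
made up for illustration.

```perl
use strict;
use warnings;
use Carp;

# Format the arguments with a picture line and return the result
# as a string instead of writing it to a filehandle.
sub swrite {
    croak "usage: swrite PICTURE ARGS" unless @_;
    my $format = shift;
    $^A = "";                    # clear the format accumulator
    formline($format, @_);       # append the formatted line to $^A
    return $^A;
}

# Hypothetical picture: a 10-column left-justified field, then a
# 6-column right-justified field.
my $line = swrite('@<<<<<<<<< @>>>>>' . "\n", "apples", 42);
print $line;
```

The picture is single-quoted so that Perl does not try to interpolate
anything after the C<@> signs.
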
=head2 How can I output my numbers with commas added?
X<number, commify>

(contributed by brian d foy and Benjamin Goldberg)

You can use L<Number::Format> to separate places in a number.
It handles locale information for those of you who want to insert
full stops instead (or anything else that you want to use,
really).

This subroutine will add commas to your number:

    sub commify {
        local $_ = shift;
        1 while s/^([-+]?\d+)(\d{3})/$1,$2/;
        return $_;
    }

This regex from Benjamin Goldberg will add commas to numbers:

    s/(^[-+]?\d+?(?=(?>(?:\d{3})+)(?!\d))|\G\d{3}(?=\d))/$1,/g;

It is easier to see with comments:

    s/(
        ^[-+]?             # beginning of number.
        \d+?               # first digits before first comma
        (?=                # followed by, (but not included in the match) :
            (?>(?:\d{3})+) # some positive multiple of three digits.
            (?!\d)         # an *exact* multiple, not x * 3 + 1 or whatever.
        )
        |                  # or:
        \G\d{3}            # after the last group, get three digits
        (?=\d)             # but they have to have more digits after them.
    )/$1,/xg;

=head2 How can I translate tildes (~) in a filename?
X<tilde> X<tilde expansion>

Use the <> (glob()) operator, documented in L<perlfunc>. Older
versions of Perl require that you have a shell installed that groks
tildes. Recent perl versions have this feature built in. The
File::KGlob module (available from CPAN) gives more portable glob
functionality.

Within Perl, you may use this directly:

    $filename =~ s{
      ^ ~             # find a leading tilde
      (               # save this in $1
          [^/]        # a non-slash character
                *     # repeated 0 or more times (0 means me)
      )
    }{
      $1
          ? (getpwnam($1))[7]
          : ( $ENV{HOME} || $ENV{LOGDIR} )
    }ex;

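On a perl with built-in globbing, the glob() route can be as short as
the sketch below. The file name F<~/.exrc> is only a placeholder, and
the output depends on the account running the program.

```perl
use strict;
use warnings;

# glob() expands a leading tilde without any help from a shell.
my ($home) = glob('~');
print "home directory: $home\n";

# Hypothetical file name under the home directory.
my ($rc_file) = glob('~/.exrc');
print "expanded name:  $rc_file\n";
```

Calling glob() in list context, as here, also sidesteps the
scalar-context iteration behavior of the operator.
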
=head2 How come when I open a file read-write it wipes it out?
X<clobber> X<read-write> X<clobbering> X<truncate> X<truncating>

Because you're using something like this, which truncates the file and
I<then> gives you read-write access:

    open(FH, "+> /path/name");          # WRONG (almost always)

Whoops. You should instead use this, which will fail if the file
doesn't exist.

    open(FH, "+< /path/name");          # open for update

Using ">" always clobbers or creates. Using "<" never does
either. The "+" doesn't change this.

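The difference is easy to see on a throwaway file. This is just an
illustrative sketch; the scratch file from File::Temp and the variable
names are assumptions, not part of the answer above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file holding some existing data.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "precious data\n";
close $tmp or die "can't close $path: $!";

# "+<" opens for update and leaves the contents alone.
open my $update, '+<', $path or die "can't update $path: $!";
my $size_after_update_open = (-s $path) || 0;
close $update;

# "+>" truncates first, then grants read-write access.
open my $clobber, '+>', $path or die "can't clobber $path: $!";
my $size_after_clobber_open = (-s $path) || 0;
close $clobber;

print "after +<: $size_after_update_open bytes\n";
print "after +>: $size_after_clobber_open bytes\n";
```

The second size comes out as zero: the data was gone the moment the
file was opened, before anything was written.
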
Here are examples of many kinds of file opens. Those using sysopen()
all assume

    use Fcntl;

To open file for reading:

    open(FH, "< $path")                                 || die $!;
    sysopen(FH, $path, O_RDONLY)                        || die $!;

To open file for writing, create new file if needed or else truncate old file:

    open(FH, "> $path")                                 || die $!;
    sysopen(FH, $path, O_WRONLY|O_TRUNC|O_CREAT)        || die $!;
    sysopen(FH, $path, O_WRONLY|O_TRUNC|O_CREAT, 0666)  || die $!;

To open file for writing, create new file, file must not exist:

    sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT)         || die $!;
    sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT, 0666)   || die $!;

To open file for appending, create if necessary:

    open(FH, ">> $path")                                || die $!;
    sysopen(FH, $path, O_WRONLY|O_APPEND|O_CREAT)       || die $!;
    sysopen(FH, $path, O_WRONLY|O_APPEND|O_CREAT, 0666) || die $!;

To open file for appending, file must exist:

    sysopen(FH, $path, O_WRONLY|O_APPEND)               || die $!;

To open file for update, file must exist:

    open(FH, "+< $path")                                || die $!;
    sysopen(FH, $path, O_RDWR)                          || die $!;

To open file for update, create file if necessary:

    sysopen(FH, $path, O_RDWR|O_CREAT)                  || die $!;
    sysopen(FH, $path, O_RDWR|O_CREAT, 0666)            || die $!;

To open file for update, file must not exist:

    sysopen(FH, $path, O_RDWR|O_EXCL|O_CREAT)           || die $!;
    sysopen(FH, $path, O_RDWR|O_EXCL|O_CREAT, 0666)     || die $!;

To open a file without blocking, creating if necessary:

    sysopen(FH, "/foo/somefile", O_WRONLY|O_NDELAY|O_CREAT)
        or die "can't open /foo/somefile: $!";

Be warned that neither creation nor deletion of files is guaranteed to
be an atomic operation over NFS. That is, two processes might both
successfully create or unlink the same file! Therefore O_EXCL
isn't as exclusive as you might wish.

See also the new L<perlopentut> if you have it (new for 5.6).

=head2 Why do I sometimes get an "Argument list too long" when I use E<lt>*E<gt>?
X<argument list too long>

The C<< <> >> operator performs a globbing operation (see above).
In Perl versions earlier than v5.6.0, the internal glob() operator forks
csh(1) to do the actual glob expansion, but
csh can't handle more than 127 items and so gives the error message
C<Argument list too long>. People who installed tcsh as csh won't
have this problem, but their users may be surprised by it.

To get around this, either upgrade to Perl v5.6.0 or later, do the glob
yourself with readdir() and patterns, or use a module like File::KGlob,
one that doesn't use the shell to do globbing.

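Doing the glob yourself with readdir() might look roughly like this.
The helper name C<names_matching> is invented for the example, and the
regex is only an approximation of what the shell pattern C<*.c> means
(names ending in C<.c> that don't start with a dot).

```perl
use strict;
use warnings;

# Hypothetical helper: return the names in $dir that match $pattern,
# a compiled regex, without involving a shell.
sub names_matching {
    my ($dir, $pattern) = @_;
    opendir my $dh, $dir or die "can't opendir $dir: $!";
    my @names = grep { $_ =~ $pattern } readdir $dh;
    closedir $dh;
    return map { "$dir/$_" } sort @names;
}

# Roughly what glob("./*.c") would return.
my @c_files = names_matching('.', qr/^[^.].*\.c\z/);
print "$_\n" for @c_files;
```

Because no shell and no argument list are involved, the number of
matching names is limited only by available memory.
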
=head2 Is there a leak/bug in glob()?
X<glob>

Due to the current implementation on some operating systems, when you
use the glob() function or its angle-bracket alias in a scalar
context, you may cause a memory leak and/or unpredictable behavior. It's
best therefore to use glob() only in list context.

=head2 How can I open a file with a leading ">" or trailing blanks?
X<filename, special characters>

(contributed by Brian McCauley)

The special two-argument form of Perl's open() function ignores
trailing blanks in filenames and infers the mode from certain leading
characters (or a trailing "|"). In older versions of Perl this was the
only version of open() and so it is prevalent in old code and books.

Unless you have a particular reason to use the two-argument form you
should use the three-argument form of open() which does not treat any
characters in the filename as special.

    open FILE, "<", " file ";  # filename is " file "
    open FILE, ">", ">file";   # filename is ">file"

=head2 How can I reliably rename a file?
X<rename> X<mv> X<move> X<file, rename> X<ren>

If your operating system supports a proper mv(1) utility or its
functional equivalent, this works:

    rename($old, $new) or system("mv", $old, $new);

It may be more portable to use the File::Copy module instead.
You just copy the file to the new name (checking return
values), then delete the old one. This isn't really the same
semantically as a rename(), which preserves meta-information like
permissions, timestamps, inode info, etc.

Newer versions of File::Copy export a move() function.

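A small sketch of the move() route follows. The scratch directory and
the file names F<old.txt> and F<new.txt> are placeholders for the
example.

```perl
use strict;
use warnings;
use File::Copy qw(move);
use File::Temp qw(tempdir);

# Hypothetical file names inside a scratch directory.
my $dir = tempdir(CLEANUP => 1);
my ($old, $new) = ("$dir/old.txt", "$dir/new.txt");

open my $fh, '>', $old or die "can't create $old: $!";
print {$fh} "contents\n";
close $fh or die "can't close $old: $!";

# move() renames when it can and otherwise copies the file and
# deletes the original, for example across filesystems.
move($old, $new) or die "can't move $old to $new: $!";
```

When move() has to fall back to copying, the caveat above about
meta-information applies to it as well.
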
=head2 How can I lock a file?
X<lock> X<file, lock> X<flock>

Perl's builtin flock() function (see L<perlfunc> for details) will call
flock(2) if that exists, fcntl(2) if it doesn't (on perl version 5.004 and
later), and lockf(3) if neither of the two previous system calls exists.
On some systems, it may even use a different form of native locking.
Here are some gotchas with Perl's flock():

=over 4

=item 1

Produces a fatal error if none of the three system calls (or their
close equivalent) exists.

=item 2

lockf(3) does not provide shared locking, and requires that the
filehandle be open for writing (or appending, or read/writing).

=item 3

Some versions of flock() can't lock files over a network (e.g. on NFS file
systems), so you'd need to force the use of fcntl(2) when you build Perl.
But even this is dubious at best. See the flock entry of L<perlfunc>
and the F<INSTALL> file in the source distribution for information on
building Perl to do this.

Two potentially non-obvious but traditional flock semantics are that
it waits indefinitely until the lock is granted, and that its locks are
I<merely advisory>. Such discretionary locks are more flexible, but
offer fewer guarantees. This means that files locked with flock() may
be modified by programs that do not also use flock(). Cars that stop
for red lights get on well with each other, but not with cars that don't
stop for red lights. See the perlport manpage, your port's specific
documentation, or your system-specific local manpages for details. It's
best to assume traditional behavior if you're writing portable programs.
(If you're not, you should as always feel perfectly free to write
for your own system's idiosyncrasies (sometimes called "features").
Slavish adherence to portability concerns shouldn't get in the way of
your getting your job done.)

For more information on file locking, see also
L<perlopentut/"File Locking"> if you have it (new for 5.6).

=back

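If waiting indefinitely is not what you want, the C<LOCK_NB> flag asks
flock() to return immediately. The sketch below assumes a scratch file
from File::Temp stands in for the real data file; it only shows the
shape of a cooperating writer.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

# Hypothetical data file; any file opened for writing would do.
my ($fh, $path) = tempfile(UNLINK => 1);

# LOCK_NB makes flock() return at once instead of waiting.
my $got_lock = flock($fh, LOCK_EX | LOCK_NB);
if ($got_lock) {
    print {$fh} "only one cooperating writer at a time\n";
    flock($fh, LOCK_UN) or die "can't unlock $path: $!";
}
else {
    warn "$path is locked by someone else: $!\n";
}

close $fh or die "can't close $path: $!";
```

The lock is still advisory: it only keeps out other programs that also
call flock() on the same file.
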
=head2 Why can't I just open(FH, "E<gt>file.lock")?
X<lock, lockfile race condition>

A common bit of code B<NOT TO USE> is this:

    sleep(3) while -e "file.lock";      # PLEASE DO NOT USE
    open(LCK, "> file.lock");           # THIS BROKEN CODE

This is a classic race condition: you take two steps to do something
which must be done in one. That's why computer hardware provides an
atomic test-and-set instruction. In theory, this "ought" to work:

    sysopen(FH, "file.lock", O_WRONLY|O_EXCL|O_CREAT)
        or die "can't open file.lock: $!";

except that lamentably, file creation (and deletion) is not atomic
over NFS, so this won't work (at least, not every time) over the net.
Various schemes involving link() have been suggested, but
these tend to involve busy-wait, which is also undesirable.

=head2 I still don't get locking. I just want to increment the number in the file. How can I do this?
X<counter> X<file, counter>

Didn't anyone ever tell you web-page hit counters were useless?
They don't count the number of hits, they're a waste of time, and they
serve only to stroke the writer's vanity. It's better to pick a random
number; it's more realistic.

Anyway, this is what you can do if you can't help yourself.

    use Fcntl qw(:DEFAULT :flock);
    sysopen(FH, "numfile", O_RDWR|O_CREAT)   or die "can't open numfile: $!";
    flock(FH, LOCK_EX)                       or die "can't flock numfile: $!";
    $num = <FH> || 0;
    seek(FH, 0, 0)                           or die "can't rewind numfile: $!";
    truncate(FH, 0)                          or die "can't truncate numfile: $!";
    (print FH $num+1, "\n")                  or die "can't write numfile: $!";
    close FH                                 or die "can't close numfile: $!";

Here's a much better web-page hit counter:

    $hits = int( (time() - 850_000_000) / rand(1_000) );

If the count doesn't impress your friends, then the code might. :-)

=head2 All I want to do is append a small amount of text to the end of a file. Do I still have to use locking?
X<append> X<file, append>

If you are on a system that correctly implements flock() and you use the
example appending code from "perldoc -f flock" everything will be OK
even if the OS you are on doesn't implement append mode correctly (if
such a system exists). So if you are happy to restrict yourself to OSs
that implement flock() (and that's not really much of a restriction)
then that is what you should do.

If you know you are only going to use a system that does correctly
implement appending (i.e. not Win32) then you can omit the seek() from
the flock example code.

If you know you are only writing code to run on an OS and filesystem that
does implement append mode correctly (a local filesystem on a modern
Unix for example), and you keep the file in block-buffered mode and you
write less than one buffer-full of output between each manual flushing
of the buffer then each bufferload is almost guaranteed to be written to
the end of the file in one chunk without getting intermingled with
anyone else's output. You can also use the syswrite() function which is
simply a wrapper around your system's write(2) system call.

There is still a small theoretical chance that a signal will interrupt
the system level write() operation before completion. There is also a
possibility that some STDIO implementations may call multiple system
level write()s even if the buffer was empty to start. There may be some
systems where this probability is reduced to zero.

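For reference, the lock-then-seek-then-print pattern discussed here can
be sketched as below. This is an illustration in the spirit of the
flock() documentation rather than a copy of it, and the helper name
C<append_locked> is made up.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_END);

# Hypothetical helper: append one chunk of text to $path under a lock.
sub append_locked {
    my ($path, $text) = @_;

    open my $fh, '>>', $path or die "can't append to $path: $!";
    flock($fh, LOCK_EX)      or die "can't lock $path: $!";

    # Someone may have appended while we waited for the lock.
    seek($fh, 0, SEEK_END)   or die "can't seek in $path: $!";

    print {$fh} $text        or die "can't write to $path: $!";
    close $fh                or die "can't close $path: $!";   # also unlocks
    return 1;
}
```

A call such as C<append_locked($logfile, "one more line\n")> would add
the text at the end of the file. The seek() is the step you could drop
on systems known to implement append mode correctly.
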
=head2 How do I randomly update a binary file?
X<file, binary patch>

If you're just trying to patch a binary, in many cases something as
simple as this works:

    perl -i -pe 's{window manager}{window mangler}g' /usr/bin/emacs

However, if you have fixed sized records, then you might do something more
like this:

    $RECSIZE = 220; # size of record, in bytes
    $recno   = 37;  # which record to update
    open(FH, "+<somewhere") || die "can't update somewhere: $!";
    seek(FH, $recno * $RECSIZE, 0);
    read(FH, $record, $RECSIZE) == $RECSIZE || die "can't read record $recno: $!";
    # munge the record
    seek(FH, -$RECSIZE, 1);
    print FH $record;
    close FH;

Locking and error checking are left as an exercise for the reader.
Don't forget them or you'll be quite sorry.

=head2 How do I get a file's timestamp in perl?
X<timestamp> X<file, timestamp>

If you want to retrieve the time at which the file was last
read, written, or had its meta-data (owner, etc) changed,
you use the B<-A>, B<-M>, or B<-C> file test operations as
documented in L<perlfunc>. These retrieve the age of the
file (measured against the start-time of your program) in
days as a floating point number. Some platforms may not have
all of these times. See L<perlport> for details. To
retrieve the "raw" time in seconds since the epoch, you
would call the stat function, then use localtime(),
gmtime(), or POSIX::strftime() to convert this into
human-readable form.

Here's an example:

    $write_secs = (stat($file))[9];
    printf "file %s updated at %s\n", $file,
        scalar localtime($write_secs);

If you prefer something more legible, use the File::stat module
(part of the standard distribution in version 5.004 and later):

    # error checking left as an exercise for reader.
    use File::stat;
    use Time::localtime;
    $date_string = ctime(stat($file)->mtime);
    print "file $file updated at $date_string\n";

The POSIX::strftime() approach has the benefit of being,
in theory, independent of the current locale. See L<perllocale>
for details.

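A sketch of the POSIX::strftime() approach follows. The helper name
C<mtime_string> and the choice of a numeric, UTC-based format are
assumptions made for the example; any strftime() format would do.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: format a file's last-modified time as a
# numeric UTC date and time.
sub mtime_string {
    my ($file) = @_;
    my $write_secs = (stat($file))[9];
    die "can't stat $file: $!" unless defined $write_secs;
    return strftime('%Y-%m-%d %H:%M:%S', gmtime($write_secs));
}
```

You might call it as C<print mtime_string($file), "\n">. A format made
only of numeric fields such as these does not vary with the locale,
whereas names of days and months would.
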
=head2 How do I set a file's timestamp in perl?
X<timestamp> X<file, timestamp>

You use the utime() function documented in L<perlfunc/utime>.
By way of example, here's a little program that copies the
read and write times from its first argument to all the rest
of them.

    if (@ARGV < 2) {
        die "usage: cptimes timestamp_file other_files ...\n";
    }
    $timestamp = shift;
    ($atime, $mtime) = (stat($timestamp))[8,9];
    utime $atime, $mtime, @ARGV;

Error checking is, as usual, left as an exercise for the reader.

The perldoc for utime also has an example that has the same
effect as touch(1) on files that I<already exist>.

Certain file systems have a limited ability to store the times
on a file at the expected level of precision. For example, the
FAT and HPFS filesystems are unable to create dates on files with
a finer granularity than two seconds. This is a limitation of
the filesystems, not of utime().

755=head2 How do I print to more than one file at once?
d74e8afc 756X<print, to multiple files>
68dc0745 757
49d635f9 758To connect one filehandle to several output filehandles,
759you can use the IO::Tee or Tie::FileHandle::Multiplex modules.
68dc0745 760
49d635f9 761If you only have to do this once, you can print individually
762to each filehandle.
68dc0745 763
500071f4 764 for $fh (FH1, FH2, FH3) { print $fh "whatever\n" }
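
The same loop can be written with lexical filehandles, which also works
under C<use strict>. This is only a sketch: the two in-memory buffers are
an assumption that stands in for real output files.

```perl
use strict;
use warnings;

# In-memory buffers stand in for real files here (demo assumption).
my ($buf1, $buf2) = ('', '');
open(my $fh1, '>', \$buf1) or die "can't open first buffer: $!";
open(my $fh2, '>', \$buf2) or die "can't open second buffer: $!";

# Print the same text to each handle in turn.
for my $fh ($fh1, $fh2) {
    print {$fh} "whatever\n" or die "print failed: $!";
}

close($_) || die "can't close: $!" for $fh1, $fh2;
```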
5a964f20 765
49d635f9 766=head2 How can I read in an entire file all at once?
d74e8afc 767X<slurp> X<file, slurping>
68dc0745 768
49d635f9 769You can use the File::Slurp module to do it in one step.
68dc0745 770
49d635f9 771 use File::Slurp;
197aec24 772
49d635f9 773 $all_of_it = read_file($filename); # entire file in scalar
500071f4 774 @all_lines = read_file($filename); # one line per element
d92eb7b0 775
776The customary Perl approach for processing all the lines in a file is to
777do so one line at a time:
778
500071f4 779 open (INPUT, $file) || die "can't open $file: $!";
780 while (<INPUT>) {
781 chomp;
782 # do something with $_
783 }
784 close(INPUT) || die "can't close $file: $!";
d92eb7b0 785
786This is tremendously more efficient than reading the entire file into
787memory as an array of lines and then processing it one element at a time,
a6dd486b 788which is often--if not almost always--the wrong approach. Whenever
d92eb7b0 789you see someone do this:
790
500071f4 791 @lines = <INPUT>;
d92eb7b0 792
30852c57 793you should think long and hard about why you need everything loaded at
794once. It's just not a scalable solution. You might also find it more
795fun to use the standard Tie::File module, or the DB_File module's
796$DB_RECNO bindings, which allow you to tie an array to a file so that
797accessing an element of the array actually accesses the corresponding
798line in the file.
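
A minimal sketch of the Tie::File approach follows. The three-line
temporary file is an assumption made for the demonstration; with the
module's default settings the record separator is stripped from each
element.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# A throwaway three-line file, created only for this demonstration.
my ($tmp, $file) = tempfile(UNLINK => 1);
print {$tmp} "first\nsecond\nthird\n";
close($tmp) or die "can't close $file: $!";

tie(my @lines, 'Tie::File', $file) or die "can't tie $file: $!";

my $count  = @lines;     # number of lines in the file
my $second = $lines[1];  # fetched on demand, newline already removed
$lines[2]  = 'THIRD';    # assigning to an element rewrites that line

untie @lines;            # flush pending changes and release the file
```

Because lines are fetched only when accessed, the file does not have to
fit in memory all at once.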
d92eb7b0 799
f05bbc40 800You can read the entire filehandle contents into a scalar.
d92eb7b0 801
500071f4 802 {
d92eb7b0 803 local(*INPUT, $/);
804 open (INPUT, $file) || die "can't open $file: $!";
805 $var = <INPUT>;
500071f4 806 }
d92eb7b0 807
197aec24 808That temporarily undefs your record separator, and will automatically
d92eb7b0 809close the file at block exit. If the file is already open, just use this:
810
500071f4 811 $var = do { local $/; <INPUT> };
d92eb7b0 812
f05bbc40 813For ordinary files you can also use the read function.
814
815 read( INPUT, $var, -s INPUT );
816
817The third argument tests the byte size of the data on the INPUT filehandle
818and reads that many bytes into the buffer $var.
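
The slurping idiom can also be sketched with a lexical filehandle and the
three-argument form of open(). The temporary file here is an assumption
that stands in for C<$file>.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Sample file for the demonstration only.
my ($tmp, $file) = tempfile(UNLINK => 1);
print {$tmp} "line one\nline two\n";
close($tmp) or die "can't close $file: $!";

my $var = do {
    open(my $in, '<', $file) or die "can't open $file: $!";
    local $/;    # undefined record separator: one read returns everything
    <$in>;
};               # $in goes out of scope here, which closes the file
```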
819
68dc0745 820=head2 How can I read in a file by paragraphs?
d74e8afc 821X<file, reading by paragraphs>
68dc0745 822
65acb1b1 823Use the C<$/> variable (see L<perlvar> for details). You can either
68dc0745 824set it to C<""> to eliminate empty paragraphs (C<"abc\n\n\n\ndef">,
825for instance, gets treated as two paragraphs and not three), or
826C<"\n\n"> to accept empty paragraphs.
827
197aec24 828Note that a blank line must have no blanks in it. Thus
c4db748a 829S<C<"fred\n \nstuff\n\n">> is one paragraph, but C<"fred\n\nstuff\n\n"> is two.
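
To see the difference between the two settings, this sketch reads the
same text twice from an in-memory filehandle. The helper name C<records>
is hypothetical.

```perl
use strict;
use warnings;

my $text = "abc\n\n\n\ndef\n";

# Read $text record by record using the given value of $/.
sub records {
    my ($separator) = @_;
    local $/ = $separator;
    open(my $in, '<', \$text) or die "can't open in-memory file: $!";
    my @records = <$in>;
    close($in);
    return @records;
}

my @paragraph_mode = records("");      # a run of blank lines counts once
my @literal_mode   = records("\n\n");  # every "\n\n" ends a record

printf "%d vs %d records\n", scalar(@paragraph_mode), scalar(@literal_mode);
```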
65acb1b1 830
68dc0745 831=head2 How can I read a single character from a file? From the keyboard?
d74e8afc 832X<getc> X<file, reading one character at a time>
68dc0745 833
834You can use the builtin C<getc()> function for most filehandles, but
835it won't (easily) work on a terminal device. For STDIN, either use
a6dd486b 836the Term::ReadKey module from CPAN or use the sample code in
68dc0745 837L<perlfunc/getc>.
838
65acb1b1 839If your system supports the portable operating system programming
840interface (POSIX), you can use the following code, which you'll note
841turns off echo processing as well.
68dc0745 842
500071f4 843 #!/usr/bin/perl -w
844 use strict;
845 $| = 1;
846 for (1..4) {
847 my $got;
848 print "gimme: ";
849 $got = getone();
850 print "--> $got\n";
851 }
68dc0745 852 exit;
853
500071f4 854 BEGIN {
68dc0745 855 use POSIX qw(:termios_h);
856
857 my ($term, $oterm, $echo, $noecho, $fd_stdin);
858
859 $fd_stdin = fileno(STDIN);
860
861 $term = POSIX::Termios->new();
862 $term->getattr($fd_stdin);
863 $oterm = $term->getlflag();
864
865 $echo = ECHO | ECHOK | ICANON;
866 $noecho = $oterm & ~$echo;
867
868 sub cbreak {
500071f4 869 $term->setlflag($noecho);
870 $term->setcc(VTIME, 1);
871 $term->setattr($fd_stdin, TCSANOW);
872 }
ac9dac7f 873
68dc0745 874 sub cooked {
500071f4 875 $term->setlflag($oterm);
876 $term->setcc(VTIME, 0);
877 $term->setattr($fd_stdin, TCSANOW);
878 }
68dc0745 879
880 sub getone {
500071f4 881 my $key = '';
882 cbreak();
883 sysread(STDIN, $key, 1);
884 cooked();
885 return $key;
886 }
68dc0745 887
500071f4 888 }
68dc0745 889
500071f4 890 END { cooked() }
68dc0745 891
a6dd486b 892The Term::ReadKey module from CPAN may be easier to use. Recent versions
65acb1b1 893also include support for non-portable systems.
68dc0745 894
500071f4 895 use Term::ReadKey;
896 open(TTY, "</dev/tty") || die "can't open /dev/tty: $!";
897 print "Gimme a char: ";
898 ReadMode "raw";
899 $key = ReadKey 0, *TTY;
900 ReadMode "normal";
901 printf "\nYou said %s, char number %03d\n",
902 $key, ord $key;
68dc0745 903
65acb1b1 904=head2 How can I tell whether there's a character waiting on a filehandle?
68dc0745 905
5a964f20 906The very first thing you should do is look into getting the Term::ReadKey
65acb1b1 907extension from CPAN. As we mentioned earlier, it now even has limited
908support for non-portable (read: not open systems, closed, proprietary,
909not POSIX, not Unix, etc.) systems.
5a964f20 910
911You should also check out the Frequently Asked Questions list in
68dc0745 912comp.unix.* for things like this: the answer is essentially the same.
913It's very system dependent. Here's one solution that works on BSD
914systems:
915
500071f4 916 sub key_ready {
917 my($rin, $nfd);
918 vec($rin, fileno(STDIN), 1) = 1;
919 return $nfd = select($rin,undef,undef,0);
920 }
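
On systems where select() works for the handle in question, the standard
IO::Select module wraps the same call. In this sketch a pipe stands in
for STDIN (an assumption, so that the example runs unattended), and a
timeout of 0 makes can_read() return immediately.

```perl
use strict;
use warnings;
use IO::Select;

# A pipe stands in for STDIN so that the example needs no keyboard.
pipe(my $reader, my $writer) or die "can't create pipe: $!";

my $sel = IO::Select->new($reader);

my @ready  = $sel->can_read(0);      # timeout 0: poll without blocking
my $before = scalar @ready;          # nothing has been written yet

syswrite($writer, "x") or die "can't write to pipe: $!";

@ready    = $sel->can_read(0);
my $after = scalar @ready;           # one handle now has data waiting
```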
68dc0745 921
65acb1b1 922If you want to find out how many characters are waiting, there's
923also the FIONREAD ioctl call to be looked at. The I<h2ph> tool that
924comes with Perl tries to convert C include files to Perl code, which
925can be C<require>d. FIONREAD ends up defined as a function in the
926I<sys/ioctl.ph> file:
68dc0745 927
500071f4 928 require 'sys/ioctl.ph';
68dc0745 929
500071f4 930 $size = pack("L", 0);
931 ioctl(FH, FIONREAD(), $size) or die "Couldn't call ioctl: $!\n";
932 $size = unpack("L", $size);
68dc0745 933
5a964f20 934If I<h2ph> wasn't installed or doesn't work for you, you can
935I<grep> the include files by hand:
68dc0745 936
500071f4 937 % grep FIONREAD /usr/include/*/*
938 /usr/include/asm/ioctls.h:#define FIONREAD 0x541B
68dc0745 939
5a964f20 940Or write a small C program using the editor of champions:
68dc0745 941
500071f4 942 % cat > fionread.c
943 #include <stdio.h>
 #include <sys/ioctl.h>
944 int main(void) {
945 printf("%#08x\n", FIONREAD);
946 }
947 ^D
948 % cc -o fionread fionread.c
949 % ./fionread
950 0x4004667f
5a964f20 951
8305e449 952And then hard code it, leaving porting as an exercise to your successor.
5a964f20 953
500071f4 954 $FIONREAD = 0x4004667f; # XXX: opsys dependent
5a964f20 955
500071f4 956 $size = pack("L", 0);
957 ioctl(FH, $FIONREAD, $size) or die "Couldn't call ioctl: $!\n";
958 $size = unpack("L", $size);
5a964f20 959
a6dd486b 960FIONREAD requires a filehandle connected to a stream, meaning that sockets,
5a964f20 961pipes, and tty devices work, but I<not> files.
68dc0745 962
963=head2 How do I do a C<tail -f> in perl?
ac9dac7f 964X<tail> X<IO::Handle> X<File::Tail> X<clearerr>
68dc0745 965
966First try
967
500071f4 968 seek(GWFILE, 0, 1);
68dc0745 969
970The statement C<seek(GWFILE, 0, 1)> doesn't change the current position,
971but it does clear the end-of-file condition on the handle, so that the
ac9dac7f 972next C<< <GWFILE> >> makes Perl try again to read something.
68dc0745 973
974If that doesn't work (it relies on features of your stdio implementation),
975then you need something more like this:
976
977 for (;;) {
978 for ($curpos = tell(GWFILE); <GWFILE>; $curpos = tell(GWFILE)) {
979 # search for some stuff and put it into files
980 }
981 # sleep for a while
982 seek(GWFILE, $curpos, 0); # seek to where we had been
983 }
984
ac9dac7f 985If this still doesn't work, look into the C<clearerr> method
986from C<IO::Handle>, which resets the error and end-of-file states
987on the handle.
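
The pieces above might be combined as follows. This is a sketch rather
than a full C<tail -f>: the helper name C<read_new_lines> is
hypothetical, a temporary file stands in for the growing log, and a real
program would sleep between calls.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

# Return the lines available right now, then reset the end-of-file state
# so that a later call can pick up data appended in the meantime.
sub read_new_lines {
    my ($fh) = @_;
    my @lines = <$fh>;
    $fh->clearerr;      # reset the error and end-of-file states
    seek($fh, 0, 1);    # stay at the same position; also clears EOF
    return @lines;
}

# Demonstration: a temporary file plays the part of the growing log.
my ($log, $file) = tempfile(UNLINK => 1);
$log->autoflush(1);
open(my $tail, '<', $file) or die "can't open $file: $!";

print {$log} "first\n";
my @batch1 = read_new_lines($tail);
print {$log} "second\n";
my @batch2 = read_new_lines($tail);
```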
68dc0745 988
ac9dac7f 989There's also a C<File::Tail> module from CPAN.
65acb1b1 990
68dc0745 991=head2 How do I dup() a filehandle in Perl?
d74e8afc 992X<dup>
68dc0745 993
994If you check L<perlfunc/open>, you'll see that several of the ways
995to call open() should do the trick. For example:
996
500071f4 997 open(LOG, ">>/foo/logfile");
998 open(STDERR, ">&LOG");
68dc0745 999
1000Or even with a literal numeric descriptor:
1001
1002 $fd = $ENV{MHCONTEXTFD};
1003 open(MHCONTEXT, "<&=$fd"); # like fdopen(3S)
1004
c47ff5f1 1005Note that "<&STDIN" makes a copy, but "<&=STDIN" makes
5a964f20 1006an alias. That means if you close an aliased handle, all
197aec24 1007aliases become inaccessible. This is not true with
5a964f20 1008a copied one.
1009
1010Error checking, as always, has been left as an exercise for the reader.
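
Here is one way the exercise might look, using lexical handles and the
three-argument form of open(). Comparing fileno() values shows the
difference between a copy and an alias. The temporary file is an
assumption that stands in for a real log file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($log, $file) = tempfile(UNLINK => 1);

# ">&" dup()s the descriptor, so the copy gets a new descriptor number.
open(my $copy, '>&', $log) or die "can't dup: $!";

# ">&=" is like fdopen(3): the alias shares the original descriptor.
open(my $alias, '>&=', fileno($log)) or die "can't alias: $!";

printf "original=%d copy=%d alias=%d\n",
    fileno($log), fileno($copy), fileno($alias);

close($copy) or die "can't close copy: $!";   # $log remains usable
```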
68dc0745 1011
1012=head2 How do I close a file descriptor by number?
d74e8afc 1013X<file, closing file descriptors>
68dc0745 1014
1015This should rarely be necessary, as the Perl close() function is to be
1016used for things that Perl opened itself, even if it was a dup of a
a6dd486b 1017numeric descriptor as with MHCONTEXT above. But if you really have
68dc0745 1018to, you may be able to do this:
1019
500071f4 1020 require 'sys/syscall.ph';
1021 $rc = syscall(&SYS_close, $fd + 0); # must force numeric
1022 die "can't sysclose $fd: $!" if $rc == -1;
68dc0745 1023
a6dd486b 1024Or, just use the fdopen(3S) feature of open():
d92eb7b0 1025
500071f4 1026 {
197aec24 1027 local *F;
d92eb7b0 1028 open F, "<&=$fd" or die "Cannot reopen fd=$fd: $!";
1029 close F;
500071f4 1030 }
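
If the POSIX module is available, its close() function takes a
descriptor number directly, which avoids needing F<sys/syscall.ph>. In
this sketch the descriptor comes from POSIX::dup() on STDOUT, an
assumption made only to obtain a bare descriptor that no Perl filehandle
owns.

```perl
use strict;
use warnings;
use POSIX ();

# Obtain a bare descriptor that no Perl filehandle owns (demo only).
my $fd = POSIX::dup(fileno(STDOUT));
defined $fd or die "can't dup STDOUT: $!";

# POSIX::close() returns undef on failure and a true value on success.
my $closed = POSIX::close($fd);
defined $closed or die "can't close fd=$fd: $!";
```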
d92eb7b0 1031
883f1635 1032=head2 Why can't I use "C:\temp\foo" in DOS paths? Why doesn't `C:\temp\foo.exe` work?
d74e8afc 1033X<filename, DOS issues>
68dc0745 1034
1035Whoops! You just put a tab and a formfeed into that filename!
1036Remember that within double quoted strings ("like\this"), the
1037backslash is an escape character. The full list of these is in
1038L<perlop/Quote and Quote-like Operators>. Unsurprisingly, you don't
1039have a file called "c:(tab)emp(formfeed)oo" or
65acb1b1 1040"c:(tab)emp(formfeed)oo.exe" on your legacy DOS filesystem.
68dc0745 1041
1042Either single-quote your strings, or (preferably) use forward slashes.
46fc3d4c 1043Because every DOS and Windows version since about MS-DOS 2.0 has
68dc0745 1044treated C</> and C<\> the same in a path, you might as well use the
a6dd486b 1045one that doesn't clash with Perl--or the POSIX shell, ANSI C and C++,
65acb1b1 1046awk, Tcl, Java, or Python, just to mention a few. POSIX paths
1047are more portable, too.
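
A short sketch makes the difference visible by comparing string lengths.
The path itself is just the one from the question.

```perl
use strict;
use warnings;

my $mangled = "C:\temp\foo";   # \t becomes a tab, \f becomes a formfeed
my $single  = 'C:\temp\foo';   # single quotes leave the backslashes alone
my $forward = "C:/temp/foo";   # forward slashes need no escaping at all

# The mangled string is shorter: two escape pairs collapsed to one char each.
printf "%d %d %d\n", length($mangled), length($single), length($forward);
```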
68dc0745 1048
1049=head2 Why doesn't glob("*.*") get all the files?
d74e8afc 1050X<glob>
68dc0745 1051
1052Because even on non-Unix ports, Perl's glob function follows standard
46fc3d4c 1053Unix globbing semantics. You'll need C<glob("*")> to get all (non-hidden)
65acb1b1 1054files. This makes glob() portable even to legacy systems. Your
1055port may include proprietary globbing functions as well. Check its
1056documentation for details.
68dc0745 1057
1058=head2 Why does Perl let me delete read-only files? Why does C<-i> clobber protected files? Isn't this a bug in Perl?
1059
06a5f41f 1060This is elaborately and painstakingly described in the
1061F<file-dir-perms> article in the "Far More Than You Ever Wanted To
49d635f9 1062Know" collection in http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz .
68dc0745 1063
1064The executive summary: learn how your filesystem works. The
1065permissions on a file say what can happen to the data in that file.
1066The permissions on a directory say what can happen to the list of
1067files in that directory. If you delete a file, you're removing its
1068name from the directory (so the operation depends on the permissions
1069of the directory, not of the file). If you try to write to the file,
1070the permissions of the file govern whether you're allowed to.
1071
1072=head2 How do I select a random line from a file?
d74e8afc 1073X<file, selecting a random line>
68dc0745 1074
1075Here's an algorithm from the Camel Book:
1076
500071f4 1077 srand;
1078 rand($.) < 1 && ($line = $_) while <>;
68dc0745 1079
49d635f9 1080This has a significant advantage in space over reading the whole file
1081in. You can find a proof of this method in I<The Art of Computer
1082Programming>, Volume 2, Section 3.4.2, by Donald E. Knuth.
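
Spelled out, the one-liner keeps the current line with probability
1/N, where N is the number of lines read so far. The sketch below does
the same thing with an explicit counter. The sub name
C<random_line_from> and the in-memory sample data are assumptions made
for illustration.

```perl
use strict;
use warnings;

# Pick one line uniformly at random from an open filehandle in one pass.
sub random_line_from {
    my ($fh) = @_;
    my ($line, $seen) = (undef, 0);
    while (my $candidate = <$fh>) {
        $seen++;
        # Replace the current pick with probability 1/$seen.
        $line = $candidate if rand($seen) < 1;
    }
    return $line;
}

my $text = "alpha\nbeta\ngamma\n";   # sample data for the demonstration
open(my $in, '<', \$text) or die "can't open in-memory file: $!";
my $pick = random_line_from($in);
close($in);
print "picked: $pick";
```

Only one line is held at a time, which is where the space advantage
comes from.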
1083
1084You can use the File::Random module which provides a function
1085for that algorithm:
1086
1087 use File::Random qw/random_line/;
1088 my $line = random_line($filename);
1089
1090Another way is to use the Tie::File module, which treats the entire
1091file as an array. Simply access a random array element.
68dc0745 1092
65acb1b1 1093=head2 Why do I get weird spaces when I print an array of lines?
1094
1095Saying
1096
500071f4 1097 print "@lines\n";
65acb1b1 1098
1099joins together the elements of C<@lines> with a space between them.
1100If C<@lines> were C<("little", "fluffy", "clouds")> then the above
a6dd486b 1101statement would print
65acb1b1 1102
500071f4 1103 little fluffy clouds
65acb1b1 1104
1105but if each element of C<@lines> was a line of text, ending in a newline
1106character C<("little\n", "fluffy\n", "clouds\n")> then it would print:
1107
500071f4 1108 little
1109  fluffy
1110  clouds
65acb1b1 1111
1112If your array contains lines, just print them:
1113
500071f4 1114 print @lines;
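
The space comes from the list separator variable C<$"> (see L<perlvar>),
which is a single space by default. As a sketch, and assuming C<$,> has
not been changed, the following shows both behaviours and how to
interpolate without the extra spaces:

```perl
use strict;
use warnings;

my @lines = ("little\n", "fluffy\n", "clouds\n");

my $spaced = "@lines";           # elements joined with $", a space by default
my $plain  = join('', @lines);   # the text that print @lines would emit

# Temporarily empty the list separator to interpolate without spaces.
my $fixed = do { local $" = ''; "@lines" };
```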
1115
1116=head1 REVISION
1117
ac9dac7f 1118Revision: $Revision: 6019 $
500071f4 1119
ac9dac7f 1120Date: $Date: 2006-05-04 19:04:31 +0200 (jeu, 04 mai 2006) $
500071f4 1121
1122See L<perlfaq> for source control details and availability.
65acb1b1 1123
68dc0745 1124=head1 AUTHOR AND COPYRIGHT
1125
58103a2e 1126Copyright (c) 1997-2006 Tom Christiansen, Nathan Torkington, and
7678cced 1127other authors as noted. All rights reserved.
5a964f20 1128
5a7beb56 1129This documentation is free; you can redistribute it and/or modify it
1130under the same terms as Perl itself.
c8db1d39 1131
87275199 1132Irrespective of its distribution, all code examples here are in the public
c8db1d39 1133domain. You are permitted and encouraged to use this code and any
1134derivatives thereof in your own programs for fun or for profit as you
1135see fit. A simple comment in the code giving credit to the FAQ would
1136be courteous but is not required.