=head1 NAME

perlfaq5 - Files and Formats ($Revision: 1.38 $, $Date: 1999/05/23 16:08:30 $)

=head1 DESCRIPTION

This section deals with I/O and the "f" issues: filehandles, flushing,
formats, and footers.

=head2 How do I flush/unbuffer an output filehandle? Why must I do this?

The C standard I/O library (stdio) normally buffers characters sent to
devices. This is done for efficiency reasons so that there isn't a
system call for each byte. Any time you use print() or write() in
Perl, you go through this buffering. syswrite() circumvents stdio and
buffering.

In most stdio implementations, the type of output buffering and the size of
the buffer vary according to the type of device. Disk files are block
buffered, often with a buffer size of more than 2k. Pipes and sockets
are often buffered with a buffer size between 1/2 and 2k. Serial devices
(e.g. modems, terminals) are normally line-buffered, and stdio sends
the entire line when it gets the newline.

Perl does not support truly unbuffered output (except insofar as you can
C<syswrite(OUT, $char, 1)>). What it does instead support is "command
buffering", in which a physical write is performed after every output
command. This isn't as hard on your system as unbuffering, but does
get the output where you want it when you want it.

If you expect characters to get to your device when you print them there,
you'll want to autoflush its handle.
Use select() and the C<$|> variable to control autoflushing
(see L<perlvar/$|> and L<perlfunc/select>):

    $old_fh = select(OUTPUT_HANDLE);
    $| = 1;
    select($old_fh);

Or using the traditional idiom:

    select((select(OUTPUT_HANDLE), $| = 1)[0]);

Or if you don't mind slowly loading several thousand lines of module code
just because you're afraid of the C<$|> variable:

    use FileHandle;
    open(DEV, "+</dev/tty");      # ceci n'est pas une pipe
    DEV->autoflush(1);

or the newer IO::* modules:

    use IO::Handle;
    open(DEV, ">/dev/printer");   # but is this?
    DEV->autoflush(1);

or even this:

    use IO::Socket;               # this one is kinda a pipe?
    $sock = IO::Socket::INET->new(PeerAddr => 'www.perl.com',
                                  PeerPort => 'http(80)',
                                  Proto    => 'tcp');
    die "$!" unless $sock;

    $sock->autoflush();
    print $sock "GET / HTTP/1.0" . "\015\012" x 2;
    $document = join('', <$sock>);
    print "DOC IS: $document\n";

Note the bizarrely hardcoded carriage return and newline in their octal
equivalents. This is the ONLY way (currently) to assure a proper flush
on all platforms, including Macintosh. That's the way things work in
network programming: you really should specify the exact bit pattern
of the network line terminator. In practice, C<"\n\n"> often works,
but this is not portable.

See L<perlfaq9> for other examples of fetching URLs over the web.

=head2 How do I change one line in a file/delete a line in a file/insert a line in the middle of a file/append to the beginning of a file?

Those are operations of a text editor. Perl is not a text editor.
Perl is a programming language. You have to decompose the problem into
low-level calls to read, write, open, close, and seek.

Although humans have an easy time thinking of a text file as being a
sequence of lines that operates much like a stack of playing cards--or
punch cards--computers usually see the text file as a sequence of bytes.
In general, there's no direct way for Perl to seek to a particular line
of a file, insert text into a file, or remove text from a file.

(There are exceptions in special circumstances. You can add or remove
data at the very end of the file. A sequence of bytes can be replaced
with another sequence of the same length. The C<$DB_RECNO> array
bindings as documented in L<DB_File> also provide a direct way of
modifying a file. Files where all lines are the same length are also
easy to alter.)

The general solution is to create a temporary copy of the text file with
the changes you want, then copy that over the original. This assumes
no locking.

    $old = $file;
    $new = "$file.tmp.$$";
    $bak = "$file.orig";

    open(OLD, "< $old")         or die "can't open $old: $!";
    open(NEW, "> $new")         or die "can't open $new: $!";

    # Correct typos, preserving case
    while (<OLD>) {
        s/\b(p)earl\b/${1}erl/i;
        (print NEW $_)          or die "can't write to $new: $!";
    }

    close(OLD)                  or die "can't close $old: $!";
    close(NEW)                  or die "can't close $new: $!";

    rename($old, $bak)          or die "can't rename $old to $bak: $!";
    rename($new, $old)          or die "can't rename $new to $old: $!";

Perl can do this sort of thing for you automatically with the C<-i>
command-line switch or the closely-related C<$^I> variable (see
L<perlrun> for more details). Note that
C<-i> may require a suffix on some non-Unix systems; see the
platform-specific documentation that came with your port.

    # Renumber a series of tests from the command line
    perl -pi -e 's/(^\s+test\s+)\d+/ $1 . ++$count /e' t/op/taint.t

    # from a script
    local($^I, @ARGV) = ('.orig', glob("*.c"));
    while (<>) {
        if ($. == 1) {
            print "This line should appear at the top of each file\n";
        }
        s/\b(p)earl\b/${1}erl/i;        # Correct typos, preserving case
        print;
        close ARGV if eof;              # Reset $.
    }

If you need to seek to an arbitrary line of a file that changes
infrequently, you could build up an index of byte positions of where
the line ends are in the file. If the file is large, an index of
every tenth or hundredth line end would allow you to seek and read
fairly efficiently. If the file is sorted, try the look.pl library
(part of the standard perl distribution).
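
The index idea can be sketched roughly as follows. This is only an
illustration, not part of any standard library: the function names
C<build_line_index> and C<fetch_line> are made up for the example, the
index records every line rather than every tenth, and the lexical
filehandles with three-argument open() assume Perl 5.6 or later.

```perl
use strict;
use warnings;

# Build an index of the byte offset at which each line starts.
# $path is whatever file the caller wants indexed (hypothetical).
sub build_line_index {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "can't open $path: $!";
    my @offset = (0);                  # line 1 starts at byte 0
    while (<$fh>) {
        push @offset, tell($fh);       # where the following line starts
    }
    pop @offset;                       # last entry is end of file, not a line
    close($fh) or die "can't close $path: $!";
    return \@offset;
}

# Fetch line $n (counting from 1) with one seek and one read.
sub fetch_line {
    my ($path, $index, $n) = @_;
    return undef if $n < 1 || $n > @$index;
    open(my $fh, '<', $path) or die "can't open $path: $!";
    seek($fh, $index->[$n - 1], 0) or die "can't seek in $path: $!";
    my $line = <$fh>;
    close($fh);
    return $line;
}
```

The index is only valid as long as the file is unchanged, which is why
this suits files that change infrequently; rebuild it whenever the file
is rewritten.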

In the unique case of deleting lines at the end of a file, you
can use tell() and truncate(). The following code snippet deletes
the last line of a file without making a copy or reading the
whole file into memory:

    open (FH, "+< $file");
    while ( <FH> ) { $addr = tell(FH) unless eof(FH) }
    truncate(FH, $addr);

Error checking is left as an exercise for the reader.
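
For readers who would rather skip the exercise, one possible version
with the checks filled in might look like this. The function name
C<chop_last_line> is invented for the example, and the lexical
filehandle assumes Perl 5.6 or later. Starting C<$addr> at zero also
covers the one-line file, which then becomes empty.

```perl
use strict;
use warnings;

# Remove the last line of $file in place.
sub chop_last_line {
    my ($file) = @_;
    open(my $fh, '+<', $file) or die "can't open $file for update: $!";
    my $addr = 0;                      # byte offset where the final line starts
    while (<$fh>) {
        $addr = tell($fh) unless eof($fh);
    }
    truncate($fh, $addr) or die "can't truncate $file: $!";
    close($fh)           or die "can't close $file: $!";
    return 1;
}
```

Like the original snippet, this still reads through the whole file once;
it merely avoids holding it in memory or copying it.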

=head2 How do I count the number of lines in a file?

One fairly efficient way is to count newlines in the file. The
following program uses a feature of tr///, as documented in L<perlop>.
If your text file doesn't end with a newline, then it's not really a
proper text file, so this may report one fewer line than you expect.

    $lines = 0;
    open(FILE, $filename) or die "Can't open `$filename': $!";
    while (sysread FILE, $buffer, 4096) {
        $lines += ($buffer =~ tr/\n//);
    }
    close FILE;

This assumes no funny games with newline translations.

=head2 How do I make a temporary file name?

Use the C<new_tmpfile> class method from the IO::File module to get a
filehandle opened for reading and writing. Use it if you don't
need to know the file's name:

    use IO::File;
    $fh = IO::File->new_tmpfile()
        or die "Unable to make new temporary file: $!";

If you do need to know the file's name, you can use the C<tmpnam>
function from the POSIX module to get a filename that you then open
yourself:

    use Fcntl;
    use POSIX qw(tmpnam);

    # try new temporary filenames until we get one that didn't already
    # exist; the check should be unnecessary, but you can't be too careful
    do { $name = tmpnam() }
        until sysopen(FH, $name, O_RDWR|O_CREAT|O_EXCL);

    # install atexit-style handler so that when we exit or die,
    # we automatically delete this temporary file
    END { unlink($name) or die "Couldn't unlink $name : $!" }

    # now go on to use the file ...

If you're committed to creating a temporary file by hand, use the
process ID and/or the current time-value. If you need to have many
temporary files in one process, use a counter:

    BEGIN {
        use Fcntl;
        my $temp_dir = -d '/tmp' ? '/tmp' : $ENV{TMP} || $ENV{TEMP};
        my $base_name = sprintf("%s/%d-%d-0000", $temp_dir, $$, time());
        sub temp_file {
            local *FH;
            my $count = 0;
            until (defined(fileno(FH)) || $count++ > 100) {
                $base_name =~ s/-(\d+)$/"-" . (1 + $1)/e;
                sysopen(FH, $base_name, O_WRONLY|O_EXCL|O_CREAT);
            }
            if (defined(fileno(FH))) {
                return (*FH, $base_name);
            } else {
                return ();
            }
        }
    }

=head2 How can I manipulate fixed-record-length files?

The most efficient way is using pack() and unpack(). This is faster than
using substr() when taking many, many strings. It is slower for just a few.

Here is a sample chunk of code to break up and put back together again
some fixed-format input lines, in this case from the output of a normal,
Berkeley-style ps:

    # sample input line:
    # 15158 p5 T 0:00 perl /home/tchrist/scripts/now-what
    $PS_T = 'A6 A4 A7 A5 A*';
    open(PS, "ps|");
    print scalar <PS>;
    while (<PS>) {
        ($pid, $tt, $stat, $time, $command) = unpack($PS_T, $_);
        for $var (qw!pid tt stat time command!) {
            print "$var: <$$var>\n";
        }
        print 'line=', pack($PS_T, $pid, $tt, $stat, $time, $command),
                "\n";
    }

We've used C<$$var> in a way that is forbidden by C<use strict 'refs'>.
That is, we've promoted a string to a scalar variable reference using
symbolic references. This is ok in small programs, but doesn't scale
well. It also only works on global variables, not lexicals.
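
If the symbolic references bother you, one way around them is to unpack
into a hash slice keyed by field name. The following is a sketch under
that assumption: the function name C<parse_ps_line> is invented, and the
input line is built with pack() so the example doesn't depend on what
your local ps prints.

```perl
use strict;
use warnings;

my $PS_T   = 'A6 A4 A7 A5 A*';
my @fields = qw(pid tt stat time command);

# Unpack one fixed-format line into a hash keyed by field name.
sub parse_ps_line {
    my ($line) = @_;
    my %rec;
    @rec{@fields} = unpack($PS_T, $line);
    return \%rec;
}

# Build a line in the same layout, then take it apart again.
my $line = pack($PS_T, 15158, 'p5', 'T', '0:00', 'perl now-what');
my $rec  = parse_ps_line($line);
print "$_: <$rec->{$_}>\n" for @fields;
```

This runs cleanly under C<use strict> and works with lexical variables,
at the cost of a hash lookup per field.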

=head2 How can I make a filehandle local to a subroutine? How do I pass filehandles between subroutines? How do I make an array of filehandles?

The fastest, simplest, and most direct way is to localize the typeglob
of the filehandle in question:

    local *TmpHandle;

Typeglobs are fast (especially compared with the alternatives) and
reasonably easy to use, but they also have one subtle drawback. If you
had, for example, a function named TmpHandle(), or a variable named
%TmpHandle, you just hid it from yourself.

    sub findme {
        local *HostFile;
        open(HostFile, "</etc/hosts") or die "no /etc/hosts: $!";
        local $_;               # <- VERY IMPORTANT
        while (<HostFile>) {
            print if /\b127\.(0\.0\.)?1\b/;
        }
        # *HostFile automatically closes/disappears here
    }

Here's how to use typeglobs in a loop to open and store a bunch of
filehandles. We'll use as values of the hash an ordered
pair to make it easy to sort the hash in insertion order.

    @names = qw(motd termcap passwd hosts);
    my $i = 0;
    foreach $filename (@names) {
        local *FH;
        open(FH, "/etc/$filename") || die "$filename: $!";
        $file{$filename} = [ $i++, *FH ];
    }

    # Using the filehandles in the array
    foreach $name (sort { $file{$a}[0] <=> $file{$b}[0] } keys %file) {
        my $fh = $file{$name}[1];
        my $line = <$fh>;
        print "$name $. $line";
    }

For passing filehandles to functions, the easiest way is to
preface them with a star, as in func(*STDIN).
See L<perlfaq7/"Passing Filehandles"> for details.

If you want to create many anonymous handles, you should check out the
Symbol, FileHandle, or IO::Handle (etc.) modules. Here's the equivalent
code with Symbol::gensym, which is reasonably light-weight:

    foreach $filename (@names) {
        use Symbol;
        my $fh = gensym();
        open($fh, "/etc/$filename") || die "open /etc/$filename: $!";
        $file{$filename} = [ $i++, $fh ];
    }

Here's using the semi-object-oriented FileHandle module, which certainly
isn't light-weight:

    use FileHandle;

    foreach $filename (@names) {
        my $fh = FileHandle->new("/etc/$filename") or die "$filename: $!";
        $file{$filename} = [ $i++, $fh ];
    }

Please understand that whether the filehandle happens to be a (probably
localized) typeglob or an anonymous handle from one of the modules
in no way affects the bizarre rules for managing indirect handles.
See the next question.

=head2 How can I use a filehandle indirectly?

An indirect filehandle is using something other than a symbol
in a place that a filehandle is expected. Here are ways
to get indirect filehandles:

    $fh =   SOME_FH;       # bareword is strict-subs hostile
    $fh =  "SOME_FH";      # strict-refs hostile; same package only
    $fh =  *SOME_FH;       # typeglob
    $fh = \*SOME_FH;       # ref to typeglob (bless-able)
    $fh =  *SOME_FH{IO};   # blessed IO::Handle from *SOME_FH typeglob

Or, you can use the C<new> method from the FileHandle or IO modules to
create an anonymous filehandle, store that in a scalar variable,
and use it as though it were a normal filehandle.

    use FileHandle;
    $fh = FileHandle->new();

    use IO::Handle;                     # 5.004 or higher
    $fh = IO::Handle->new();

Then use any of those as you would a normal filehandle. Anywhere that
Perl is expecting a filehandle, an indirect filehandle may be used
instead. An indirect filehandle is just a scalar variable that contains
a filehandle. Functions like C<print>, C<open>, C<seek>, or
the C<< <FH> >> diamond operator will accept either a real filehandle
or a scalar variable containing one:

    ($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR);
    print $ofh "Type it: ";
    $got = <$ifh>;
    print $efh "What was that: $got";

If you're passing a filehandle to a function, you can write
the function in two ways:

    sub accept_fh {
        my $fh = shift;
        print $fh "Sending to indirect filehandle\n";
    }

Or it can localize a typeglob and use the filehandle directly:

    sub accept_fh {
        local *FH = shift;
        print FH "Sending to localized filehandle\n";
    }

Both styles work with either objects or typeglobs of real filehandles.
(They might also work with strings under some circumstances, but this
is risky.)

    accept_fh(*STDOUT);
    accept_fh($handle);

In the examples above, we assigned the filehandle to a scalar variable
before using it. That is because only simple scalar variables, not
expressions or subscripts of hashes or arrays, can be used with
built-ins like C<print>, C<printf>, or the diamond operator. Using
something other than a simple scalar variable as a filehandle is
illegal and won't even compile:

    @fd = (*STDIN, *STDOUT, *STDERR);
    print $fd[1] "Type it: ";                           # WRONG
    $got = <$fd[0]>;                                    # WRONG
    print $fd[2] "What was that: $got";                 # WRONG

With C<print> and C<printf>, you get around this by using a block and
an expression where you would place the filehandle:

    print  { $fd[1] } "funny stuff\n";
    printf { $fd[1] } "Pity the poor %x.\n", 3_735_928_559;
    # Pity the poor deadbeef.

That block is a proper block like any other, so you can put more
complicated code there. This sends the message out to one of two places:

    $ok = -x "/bin/cat";
    print { $ok ? $fd[1] : $fd[2] } "cat stat $ok\n";
    print { $fd[ 1+ ($ok || 0) ]  } "cat stat $ok\n";

This approach of treating C<print> and C<printf> like object method
calls doesn't work for the diamond operator. That's because it's a
real operator, not just a function with a comma-less argument. Assuming
you've been storing typeglobs in your structure as we did above, you
can use the built-in function named C<readline> to read a record just
as C<< <> >> does. Given the initialization shown above for @fd, this
would work, but only because readline() requires a typeglob. It doesn't
work with objects or strings, which might be a bug we haven't fixed yet.

    $got = readline($fd[0]);

Let it be noted that the flakiness of indirect filehandles is not
related to whether they're strings, typeglobs, objects, or anything else.
It's the syntax of the fundamental operators. Playing the object
game doesn't help you at all here.

=head2 How can I set up a footer format to be used with write()?

There's no builtin way to do this, but L<perlform> has a couple of
techniques to make it possible for the intrepid hacker.

=head2 How can I write() into a string?

See L<perlform/"Accessing Formatting Internals"> for an swrite() function.

=head2 How can I output my numbers with commas added?

This one will do it for you:

    sub commify {
        local $_ = shift;
        1 while s/^([-+]?\d+)(\d{3})/$1,$2/;
        return $_;
    }

    $n = 23659019423.2331;
    print "GOT: ", commify($n), "\n";

    GOT: 23,659,019,423.2331

You can't just:

    s/^([-+]?\d+)(\d{3})/$1,$2/g;

because you have to put the comma in and then recalculate your
position.

Alternatively, this code commifies all numbers in a line regardless of
whether they have decimal portions, are preceded by + or -, or
whatever:

    # from Andrew Johnson <ajohnson@gpu.srv.ualberta.ca>
    sub commify {
        my $input = shift;
        $input = reverse $input;
        $input =~ s<(\d\d\d)(?=\d)(?!\d*\.)><$1,>g;
        return scalar reverse $input;
    }

=head2 How can I translate tildes (~) in a filename?

Use the <> (glob()) operator, documented in L<perlfunc>. Older
versions of Perl require that you have a shell installed that groks
tildes. Recent perl versions have this feature built in. The
File::KGlob module (available from CPAN) gives more portable glob
functionality.

Within Perl, you may use this directly:

    $filename =~ s{
      ^ ~             # find a leading tilde
      (               # save this in $1
          [^/]        # a non-slash character
                *     # repeated 0 or more times (0 means me)
      )
    }{
      $1
          ? (getpwnam($1))[7]
          : ( $ENV{HOME} || $ENV{LOGDIR} )
    }ex;

=head2 How come when I open a file read-write it wipes it out?

Because you're using something like this, which truncates the file and
I<then> gives you read-write access:

    open(FH, "+> /path/name");          # WRONG (almost always)

Whoops. You should instead use this, which will fail if the file
doesn't exist.

    open(FH, "+< /path/name");          # open for update

Using ">" always clobbers or creates. Using "<" never does
either. The "+" doesn't change this.

Here are examples of many kinds of file opens. Those using sysopen()
all assume

    use Fcntl;

To open file for reading:

    open(FH, "< $path")                                 || die $!;
    sysopen(FH, $path, O_RDONLY)                        || die $!;

To open file for writing, create new file if needed or else truncate old file:

    open(FH, "> $path")                                 || die $!;
    sysopen(FH, $path, O_WRONLY|O_TRUNC|O_CREAT)        || die $!;
    sysopen(FH, $path, O_WRONLY|O_TRUNC|O_CREAT, 0666)  || die $!;

To open file for writing, create new file, file must not exist:

    sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT)         || die $!;
    sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT, 0666)   || die $!;

To open file for appending, create if necessary:

    open(FH, ">> $path")                                || die $!;
    sysopen(FH, $path, O_WRONLY|O_APPEND|O_CREAT)       || die $!;
    sysopen(FH, $path, O_WRONLY|O_APPEND|O_CREAT, 0666) || die $!;

To open file for appending, file must exist:

    sysopen(FH, $path, O_WRONLY|O_APPEND)               || die $!;

To open file for update, file must exist:

    open(FH, "+< $path")                                || die $!;
    sysopen(FH, $path, O_RDWR)                          || die $!;

To open file for update, create file if necessary:

    sysopen(FH, $path, O_RDWR|O_CREAT)                  || die $!;
    sysopen(FH, $path, O_RDWR|O_CREAT, 0666)            || die $!;

To open file for update, file must not exist:

    sysopen(FH, $path, O_RDWR|O_EXCL|O_CREAT)           || die $!;
    sysopen(FH, $path, O_RDWR|O_EXCL|O_CREAT, 0666)     || die $!;

To open a file without blocking, creating if necessary:

    sysopen(FH, "/tmp/somefile", O_WRONLY|O_NDELAY|O_CREAT)
        or die "can't open /tmp/somefile: $!";

Be warned that neither creation nor deletion of files is guaranteed to
be an atomic operation over NFS. That is, two processes might both
successfully create or unlink the same file! Therefore O_EXCL
isn't as exclusive as you might wish.

See also the new L<perlopentut> if you have it (new for 5.6).

=head2 Why do I sometimes get an "Argument list too long" when I use <*>?

The C<< <> >> operator performs a globbing operation (see above).
In Perl versions earlier than v5.6.0, the internal glob() operator forks
csh(1) to do the actual glob expansion, but
csh can't handle more than 127 items and so gives the error message
C<Argument list too long>. People who installed tcsh as csh won't
have this problem, but their users may be surprised by it.

To get around this, either upgrade to Perl v5.6.0 or later, do the glob
yourself with readdir() and patterns, or use a module like File::KGlob,
one that doesn't use the shell to do globbing.
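
Doing the glob yourself might look something like the sketch below.
The function name C<glob_by_readdir> is invented for the example, the
pattern is an ordinary Perl regular expression rather than a shell
wildcard, and the lexical directory handle assumes Perl 5.6 or later.

```perl
use strict;
use warnings;

# Return the entries of $dir whose names match the regex $pattern,
# sorted, without forking a shell.  Dot files are skipped, much as
# a shell "*" would skip them.
sub glob_by_readdir {
    my ($dir, $pattern) = @_;
    opendir(my $dh, $dir) or die "can't opendir $dir: $!";
    my @names = grep { !/^\./ && /$pattern/ } readdir($dh);
    closedir($dh);
    return map { "$dir/$_" } sort @names;
}

# Roughly what <*.c> would give in the current directory
my @c_files = glob_by_readdir('.', qr/\.c$/);
```

Because no argument list is ever handed to a shell, the number of
matching files is limited only by memory.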

=head2 Is there a leak/bug in glob()?

Due to the current implementation on some operating systems, when you
use the glob() function or its angle-bracket alias in a scalar
context, you may cause a memory leak and/or unpredictable behavior. It's
best therefore to use glob() only in list context.

=head2 How can I open a file with a leading ">" or trailing blanks?

Normally perl ignores trailing blanks in filenames, and interprets
certain leading characters (or a trailing "|") to mean something
special. To avoid this, you might want to use a routine like the one below.
It turns incomplete pathnames into explicit relative ones, and tacks a
trailing null byte on the name to make perl leave it alone:

    sub safe_filename {
        local $_ = shift;
        s#^([^./])#./$1#;
        $_ .= "\0";
        return $_;
    }

    $badpath = "<<<something really wicked ";
    $fn = safe_filename($badpath);
    open(FH, "> $fn") or die "couldn't open $badpath: $!";

This assumes that you are using POSIX (portable operating systems
interface) paths. If you are on a closed, non-portable, proprietary
system, you may have to adjust the C<"./"> above.

It would be a lot clearer to use sysopen(), though:

    use Fcntl;
    $badpath = "<<<something really wicked ";
    sysopen (FH, $badpath, O_WRONLY | O_CREAT | O_TRUNC)
        or die "can't open $badpath: $!";

For more information, see also the new L<perlopentut> if you have it
(new for 5.6).

=head2 How can I reliably rename a file?

Well, usually you just use Perl's rename() function. That may not
work everywhere, though, particularly when renaming files across file systems.
Some sub-Unix systems have broken ports that corrupt the semantics of
rename()--for example, WinNT does this right, but Win95 and Win98
are broken. (The last two parts are not surprising, but the first is. :-)

If your operating system supports a proper mv(1) program or its moral
equivalent, this works:

    rename($old, $new) or system("mv", $old, $new);

It may be more compelling to use the File::Copy module instead. You
just copy the file to the new name (checking return values),
then delete the old one. This isn't really the same semantically as a
real rename(), though, which preserves metainformation like
permissions, timestamps, inode info, etc.

Newer versions of File::Copy export a move() function.
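
A minimal sketch using move(), assuming your File::Copy is new enough
to provide it. The wrapper name C<reliable_rename> is invented for the
example. move() tries a plain rename first and falls back to copying
the file and deleting the original when that fails.

```perl
use strict;
use warnings;
use File::Copy qw(move);

# Rename $old to $new, letting File::Copy fall back to
# copy-and-delete when rename() cannot do the job
# (for instance, across file systems).
sub reliable_rename {
    my ($old, $new) = @_;
    move($old, $new) or die "can't move $old to $new: $!";
    return 1;
}
```

When the fallback path is taken, the caveat above still applies: the
new file is a copy, so metainformation such as the inode number is not
preserved.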

=head2 How can I lock a file?

Perl's builtin flock() function (see L<perlfunc> for details) will call
flock(2) if that exists, fcntl(2) if it doesn't (on perl version 5.004 and
later), and lockf(3) if neither of the two previous system calls exists.
On some systems, it may even use a different form of native locking.
Here are some gotchas with Perl's flock():

=over 4

=item 1

Produces a fatal error if none of the three system calls (or their
close equivalent) exists.

=item 2

lockf(3) does not provide shared locking, and requires that the
filehandle be open for writing (or appending, or read/writing).

=item 3

Some versions of flock() can't lock files over a network (e.g. on NFS file
systems), so you'd need to force the use of fcntl(2) when you build Perl.
But even this is dubious at best. See the flock entry of L<perlfunc>
and the F<INSTALL> file in the source distribution for information on
building Perl to do this.

Two potentially non-obvious but traditional flock semantics are that
it waits indefinitely until the lock is granted, and that its locks are
I<merely advisory>. Such discretionary locks are more flexible, but
offer fewer guarantees. This means that files locked with flock() may
be modified by programs that do not also use flock(). Cars that stop
for red lights get on well with each other, but not with cars that don't
stop for red lights. See the perlport manpage, your port's specific
documentation, or your system-specific local manpages for details. It's
best to assume traditional behavior if you're writing portable programs.
(If you're not, you should as always feel perfectly free to write
for your own system's idiosyncrasies (sometimes called "features").
Slavish adherence to portability concerns shouldn't get in the way of
your getting your job done.)

For more information on file locking, see also
L<perlopentut/"File Locking"> if you have it (new for 5.6).

=back
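
If waiting indefinitely is not what you want, you can ask for the lock
with C<LOCK_NB> and give up when someone else holds it. The following
is a sketch only: the function name C<try_lock> is invented, and the
lexical filehandle assumes Perl 5.6 or later.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Try to take an exclusive lock without waiting.
# Returns the locked handle, or nothing if the lock is held elsewhere.
sub try_lock {
    my ($path) = @_;
    open(my $fh, '>>', $path) or die "can't open $path: $!";
    unless (flock($fh, LOCK_EX | LOCK_NB)) {
        close($fh);
        return;
    }
    return $fh;        # the lock is released when $fh is closed
}
```

The lock is still merely advisory: it keeps out only those programs
that also call flock() on the same file.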

=head2 Why can't I just open(FH, ">file.lock")?

A common bit of code B<NOT TO USE> is this:

    sleep(3) while -e "file.lock";      # PLEASE DO NOT USE
    open(LCK, "> file.lock");           # THIS BROKEN CODE

This is a classic race condition: you take two steps to do something
which must be done in one. That's why computer hardware provides an
atomic test-and-set instruction. In theory, this "ought" to work:

    sysopen(FH, "file.lock", O_WRONLY|O_EXCL|O_CREAT)
        or die "can't open file.lock: $!";

except that lamentably, file creation (and deletion) is not atomic
over NFS, so this won't work (at least, not every time) over the net.
Various schemes involving link() have been suggested, but
these tend to involve busy-wait, which is also subdesirable.

=head2 I still don't get locking. I just want to increment the number in the file. How can I do this?

Didn't anyone ever tell you web-page hit counters were useless?
They don't count number of hits, they're a waste of time, and they serve
only to stroke the writer's vanity. It's better to pick a random number;
they're more realistic.

Anyway, this is what you can do if you can't help yourself.

    use Fcntl qw(:DEFAULT :flock);
    sysopen(FH, "numfile", O_RDWR|O_CREAT)  or die "can't open numfile: $!";
    flock(FH, LOCK_EX)                      or die "can't flock numfile: $!";
    $num = <FH> || 0;
    seek(FH, 0, 0)                          or die "can't rewind numfile: $!";
    truncate(FH, 0)                         or die "can't truncate numfile: $!";
    (print FH $num+1, "\n")                 or die "can't write numfile: $!";
    close FH                                or die "can't close numfile: $!";

Here's a much better web-page hit counter:

    $hits = int( (time() - 850_000_000) / rand(1_000) );

If the count doesn't impress your friends, then the code might. :-)

=head2 All I want to do is append a small amount of text to the end of a
file. Do I *still* have to use locking?

If you are on a system that correctly implements flock() and you use the
example appending code from "perldoc -f flock" everything will be OK
even if the OS you are on doesn't implement append mode correctly (if
such a system exists). So if you are happy to restrict yourself to OSs
that implement flock() (and that's not really much of a restriction)
then that is what you should do.
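
For reference, locked appending along those lines might be sketched as
follows. This is written in the spirit of the perldoc example rather
than copied from it; the function name C<locked_append> is invented,
and the lexical filehandle assumes Perl 5.6 or later.

```perl
use strict;
use warnings;
use Fcntl qw(:flock :seek);

# Append $text to $path while holding an exclusive lock.
sub locked_append {
    my ($path, $text) = @_;
    open(my $fh, '>>', $path) or die "can't open $path: $!";
    flock($fh, LOCK_EX)       or die "can't lock $path: $!";
    # someone may have appended while we were waiting for the lock
    seek($fh, 0, SEEK_END)    or die "can't seek in $path: $!";
    print {$fh} $text         or die "can't write to $path: $!";
    # close flushes the buffer and then releases the lock
    close($fh)                or die "can't close $path: $!";
    return 1;
}
```

Letting close() release the lock, instead of unlocking first, means the
buffered data is flushed while the lock is still held.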

If you know you are only going to use a system that does correctly
implement appending (i.e. not Win32) then you can omit the seek() from
the above code.

If you know you are only writing code to run on an OS and filesystem that
does implement append mode correctly (a local filesystem on a modern
Unix for example), and you keep the file in block-buffered mode and you
write less than one buffer-full of output between each manual flushing
of the buffer then each bufferload is almost guaranteed to be written to
the end of the file in one chunk without getting intermingled with
anyone else's output. You can also use the syswrite() function which is
simply a wrapper around your system's write(2) system call.

There is still a small theoretical chance that a signal will interrupt
the system level write() operation before completion. There is also a
possibility that some STDIO implementations may call multiple system
level write()s even if the buffer was empty to start. There may be some
systems where this probability is reduced to zero.

=head2 How do I randomly update a binary file?

If you're just trying to patch a binary, in many cases something as
simple as this works:

    perl -i -pe 's{window manager}{window mangler}g' /usr/bin/emacs

However, if you have fixed sized records, then you might do something more
like this:

    $RECSIZE = 220; # size of record, in bytes
    $recno   = 37;  # which record to update
    open(FH, "+<somewhere") || die "can't update somewhere: $!";
    seek(FH, $recno * $RECSIZE, 0);
    read(FH, $record, $RECSIZE) == $RECSIZE || die "can't read record $recno: $!";
    # munge the record
    seek(FH, -$RECSIZE, 1);
    print FH $record;
    close FH;

Locking and error checking are left as an exercise for the reader.
Don't forget them or you'll be quite sorry.
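
One way the exercise might be worked, with the locking and the checks
filled in, is sketched below. The function name C<update_record> and
its callback argument are invented for the example, and the lexical
filehandle assumes Perl 5.6 or later.

```perl
use strict;
use warnings;
use Fcntl qw(:flock :seek);

# Read record $recno from $path, hand it to $munge, and write the
# result back in place.  $munge must return a string of the same length.
sub update_record {
    my ($path, $recsize, $recno, $munge) = @_;
    open(my $fh, '+<', $path) or die "can't update $path: $!";
    binmode($fh);
    flock($fh, LOCK_EX) or die "can't lock $path: $!";
    seek($fh, $recno * $recsize, SEEK_SET)
        or die "can't seek to record $recno: $!";
    my $got = read($fh, my $record, $recsize);
    defined $got && $got == $recsize
        or die "can't read record $recno: $!";
    $record = $munge->($record);
    length($record) == $recsize
        or die "record $recno changed size";
    seek($fh, -$recsize, SEEK_CUR) or die "can't seek back: $!";
    print {$fh} $record or die "can't write record $recno: $!";
    close($fh) or die "can't close $path: $!";
    return 1;
}
```

A call such as C<update_record("somewhere", 220, 37, sub { uc $_[0] })>
would upper-case record 37 and leave every other byte of the file alone.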
68dc0745 777
68dc0745 778=head2 How do I get a file's timestamp in perl?
779
If you want to retrieve the time at which the file was last read,
written, or had its meta-data (owner, etc) changed, you use the B<-A>,
B<-M>, or B<-C> filetest operations as documented in L<perlfunc>. These
783retrieve the age of the file (measured against the start-time of your
784program) in days as a floating point number. To retrieve the "raw"
785time in seconds since the epoch, you would call the stat function,
786then use localtime(), gmtime(), or POSIX::strftime() to convert this
787into human-readable form.
788
789Here's an example:
790
791 $write_secs = (stat($file))[9];
c8db1d39 792 printf "file %s updated at %s\n", $file,
793 scalar localtime($write_secs);
68dc0745 794
795If you prefer something more legible, use the File::stat module
796(part of the standard distribution in version 5.004 and later):
797
65acb1b1 798 # error checking left as an exercise for reader.
68dc0745 799 use File::stat;
800 use Time::localtime;
801 $date_string = ctime(stat($file)->mtime);
802 print "file $file updated at $date_string\n";
803
65acb1b1 804The POSIX::strftime() approach has the benefit of being,
805in theory, independent of the current locale. See L<perllocale>
806for details.
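A minimal sketch of the POSIX::strftime() route might look like the following. The helper name C<mtime_string> is hypothetical, and the format deliberately uses only numeric fields and UTC so that the result does not depend on month or day names.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: return a file's modification time as
# "YYYY-MM-DD HH:MM:SS" in UTC.
sub mtime_string {
    my ($file) = @_;
    my @st = stat($file) or die "can't stat $file: $!";
    return strftime("%Y-%m-%d %H:%M:%S", gmtime($st[9]));
}
```

Swap gmtime() for localtime() if you want the time in the local zone instead.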
68dc0745 807
808=head2 How do I set a file's timestamp in perl?
809
810You use the utime() function documented in L<perlfunc/utime>.
811By way of example, here's a little program that copies the
812read and write times from its first argument to all the rest
813of them.
814
815 if (@ARGV < 2) {
816 die "usage: cptimes timestamp_file other_files ...\n";
817 }
818 $timestamp = shift;
819 ($atime, $mtime) = (stat($timestamp))[8,9];
820 utime $atime, $mtime, @ARGV;
821
65acb1b1 822Error checking is, as usual, left as an exercise for the reader.
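One way to do that exercise is sketched below. The function name C<copy_times> is invented for illustration; it relies only on stat() and utime() as described above.

```perl
use strict;
use warnings;

# Hypothetical helper: copy access and modification times from $source
# to each file in @targets, dying on the first failure.
sub copy_times {
    my ($source, @targets) = @_;
    my ($atime, $mtime) = (stat($source))[8, 9];
    die "can't stat $source: $!" unless defined $mtime;
    for my $target (@targets) {
        # utime() returns the number of files it changed.
        utime($atime, $mtime, $target) == 1
            or die "can't set times on $target: $!";
    }
    return scalar @targets;
}
```

Calling utime() once per file costs a few extra system calls but lets the error message name the file that failed.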
68dc0745 823
824Note that utime() currently doesn't work correctly with Win95/NT
825ports. A bug has been reported. Check it carefully before using
a6dd486b 826utime() on those platforms.
68dc0745 827
828=head2 How do I print to more than one file at once?
829
830If you only have to do this once, you can do this:
831
832 for $fh (FH1, FH2, FH3) { print $fh "whatever\n" }
833
To connect one filehandle to several output filehandles, it's
835easiest to use the tee(1) program if you have it, and let it take care
836of the multiplexing:
837
838 open (FH, "| tee file1 file2 file3");
839
5a964f20 840Or even:
841
842 # make STDOUT go to three files, plus original STDOUT
843 open (STDOUT, "| tee file1 file2 file3") or die "Teeing off: $!\n";
844 print "whatever\n" or die "Writing: $!\n";
845 close(STDOUT) or die "Closing: $!\n";
68dc0745 846
5a964f20 847Otherwise you'll have to write your own multiplexing print
a6dd486b 848function--or your own tee program--or use Tom Christiansen's,
849at http://www.perl.com/CPAN/authors/id/TOMC/scripts/tct.gz , which is
5a964f20 850written in Perl and offers much greater functionality
851than the stock version.
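If you do write your own multiplexing print function, a bare-bones version could look something like this sketch. The name C<print_to_all> is made up here, and the handles are assumed to be already open for writing.

```perl
use strict;
use warnings;

# Hypothetical helper: print the same text to every handle in @$handles.
sub print_to_all {
    my ($handles, @text) = @_;
    for my $fh (@$handles) {
        print {$fh} @text or die "can't write: $!";
    }
    return scalar @$handles;
}
```

It would be called as C<print_to_all([\*STDOUT, $log], "whatever\n")>, where C<$log> stands for any filehandle you have opened yourself.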
68dc0745 852
d92eb7b0 853=head2 How can I read in an entire file all at once?
854
855The customary Perl approach for processing all the lines in a file is to
856do so one line at a time:
857
858 open (INPUT, $file) || die "can't open $file: $!";
859 while (<INPUT>) {
860 chomp;
861 # do something with $_
862 }
863 close(INPUT) || die "can't close $file: $!";
864
865This is tremendously more efficient than reading the entire file into
866memory as an array of lines and then processing it one element at a time,
a6dd486b 867which is often--if not almost always--the wrong approach. Whenever
d92eb7b0 868you see someone do this:
869
870 @lines = <INPUT>;
871
a6dd486b 872you should think long and hard about why you need everything loaded
d92eb7b0 873at once. It's just not a scalable solution. You might also find it
more fun to use the standard DB_File module's $DB_RECNO bindings,
which allow you to tie an array to a file so that accessing an element
of the array actually accesses the corresponding line in the file.
877
878On very rare occasion, you may have an algorithm that demands that
879the entire file be in memory at once as one scalar. The simplest solution
a6dd486b 880to that is
d92eb7b0 881
882 $var = `cat $file`;
883
884Being in scalar context, you get the whole thing. In list context,
885you'd get a list of all the lines:
886
887 @lines = `cat $file`;
888
87275199 889This tiny but expedient solution is neat, clean, and portable to
890all systems on which decent tools have been installed. For those
891who prefer not to use the toolbox, you can of course read the file
892manually, although this makes for more complicated code.
d92eb7b0 893
894 {
895 local(*INPUT, $/);
896 open (INPUT, $file) || die "can't open $file: $!";
897 $var = <INPUT>;
898 }
899
900That temporarily undefs your record separator, and will automatically
901close the file at block exit. If the file is already open, just use this:
902
903 $var = do { local $/; <INPUT> };
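The same idea can be wrapped in a small function that uses a lexical filehandle. This is only a sketch; the name C<slurp> is our own and not a builtin.

```perl
use strict;
use warnings;

# Hypothetical helper: return the entire contents of $file as one scalar.
sub slurp {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "can't open $file: $!";
    local $/;                    # undef record separator: read to end of file
    my $content = <$fh>;
    close($fh) or die "can't close $file: $!";
    return $content;
}
```

Because C<$/> is localized inside the function, the caller's record separator is untouched after the call returns.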
904
68dc0745 905=head2 How can I read in a file by paragraphs?
906
65acb1b1 907Use the C<$/> variable (see L<perlvar> for details). You can either
68dc0745 908set it to C<""> to eliminate empty paragraphs (C<"abc\n\n\n\ndef">,
909for instance, gets treated as two paragraphs and not three), or
910C<"\n\n"> to accept empty paragraphs.
911
65acb1b1 912Note that a blank line must have no blanks in it. Thus C<"fred\n
913\nstuff\n\n"> is one paragraph, but C<"fred\n\nstuff\n\n"> is two.
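The difference between the two settings can be seen with a short sketch like the one below. The function name C<read_paragraphs> is hypothetical; it simply localizes C<$/> and reads every record from an already opened handle.

```perl
use strict;
use warnings;

# Hypothetical helper: read all records from $fh using $separator as $/.
# Pass "" for paragraph mode or "\n\n" to keep empty paragraphs.
sub read_paragraphs {
    my ($fh, $separator) = @_;
    local $/ = $separator;
    my @paragraphs = <$fh>;
    return @paragraphs;
}
```

Fed the text C<"abc\n\n\n\ndef\n">, paragraph mode yields two records, while C<"\n\n"> yields three, the middle one consisting only of newlines.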
914
68dc0745 915=head2 How can I read a single character from a file? From the keyboard?
916
917You can use the builtin C<getc()> function for most filehandles, but
918it won't (easily) work on a terminal device. For STDIN, either use
a6dd486b 919the Term::ReadKey module from CPAN or use the sample code in
68dc0745 920L<perlfunc/getc>.
921
65acb1b1 922If your system supports the portable operating system programming
923interface (POSIX), you can use the following code, which you'll note
924turns off echo processing as well.
68dc0745 925
926 #!/usr/bin/perl -w
927 use strict;
928 $| = 1;
929 for (1..4) {
930 my $got;
931 print "gimme: ";
932 $got = getone();
933 print "--> $got\n";
934 }
935 exit;
936
937 BEGIN {
938 use POSIX qw(:termios_h);
939
940 my ($term, $oterm, $echo, $noecho, $fd_stdin);
941
942 $fd_stdin = fileno(STDIN);
943
944 $term = POSIX::Termios->new();
945 $term->getattr($fd_stdin);
946 $oterm = $term->getlflag();
947
948 $echo = ECHO | ECHOK | ICANON;
949 $noecho = $oterm & ~$echo;
950
951 sub cbreak {
952 $term->setlflag($noecho);
953 $term->setcc(VTIME, 1);
954 $term->setattr($fd_stdin, TCSANOW);
955 }
956
957 sub cooked {
958 $term->setlflag($oterm);
959 $term->setcc(VTIME, 0);
960 $term->setattr($fd_stdin, TCSANOW);
961 }
962
963 sub getone {
964 my $key = '';
965 cbreak();
966 sysread(STDIN, $key, 1);
967 cooked();
968 return $key;
969 }
970
971 }
972
973 END { cooked() }
974
The Term::ReadKey module from CPAN may be easier to use. Recent versions
also include support for non-portable systems.
68dc0745 977
978 use Term::ReadKey;
979 open(TTY, "</dev/tty");
980 print "Gimme a char: ";
981 ReadMode "raw";
982 $key = ReadKey 0, *TTY;
983 ReadMode "normal";
984 printf "\nYou said %s, char number %03d\n",
985 $key, ord $key;
986
65acb1b1 987=head2 How can I tell whether there's a character waiting on a filehandle?
68dc0745 988
5a964f20 989The very first thing you should do is look into getting the Term::ReadKey
65acb1b1 990extension from CPAN. As we mentioned earlier, it now even has limited
991support for non-portable (read: not open systems, closed, proprietary,
992not POSIX, not Unix, etc) systems.
5a964f20 993
994You should also check out the Frequently Asked Questions list in
68dc0745 995comp.unix.* for things like this: the answer is essentially the same.
996It's very system dependent. Here's one solution that works on BSD
997systems:
998
999 sub key_ready {
1000 my($rin, $nfd);
1001 vec($rin, fileno(STDIN), 1) = 1;
1002 return $nfd = select($rin,undef,undef,0);
1003 }
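The standard IO::Select module wraps the same four-argument select() call, so a version for an arbitrary handle might be sketched as follows. The name C<handle_ready> is invented for this example, and the same system dependence applies.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical helper: true if a read on $fh would not block right now.
sub handle_ready {
    my ($fh) = @_;
    my @ready = IO::Select->new($fh)->can_read(0);   # 0 = poll, do not wait
    return scalar @ready;
}
```

Note that a handle at end of file also counts as readable, so a true result does not by itself promise that data is waiting.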
1004
65acb1b1 1005If you want to find out how many characters are waiting, there's
1006also the FIONREAD ioctl call to be looked at. The I<h2ph> tool that
1007comes with Perl tries to convert C include files to Perl code, which
1008can be C<require>d. FIONREAD ends up defined as a function in the
1009I<sys/ioctl.ph> file:
68dc0745 1010
5a964f20 1011 require 'sys/ioctl.ph';
68dc0745 1012
5a964f20 1013 $size = pack("L", 0);
1014 ioctl(FH, FIONREAD(), $size) or die "Couldn't call ioctl: $!\n";
1015 $size = unpack("L", $size);
68dc0745 1016
5a964f20 1017If I<h2ph> wasn't installed or doesn't work for you, you can
1018I<grep> the include files by hand:
68dc0745 1019
5a964f20 1020 % grep FIONREAD /usr/include/*/*
1021 /usr/include/asm/ioctls.h:#define FIONREAD 0x541B
68dc0745 1022
5a964f20 1023Or write a small C program using the editor of champions:
68dc0745 1024
    % cat > fionread.c
    #include <stdio.h>
    #include <sys/ioctl.h>
    int main(void) {
        printf("%#08x\n", FIONREAD);
        return 0;
    }
    ^D
65acb1b1 1031 % cc -o fionread fionread.c
5a964f20 1032 % ./fionread
1033 0x4004667f
1034
1035And then hard-code it, leaving porting as an exercise to your successor.
1036
1037 $FIONREAD = 0x4004667f; # XXX: opsys dependent
1038
1039 $size = pack("L", 0);
1040 ioctl(FH, $FIONREAD, $size) or die "Couldn't call ioctl: $!\n";
1041 $size = unpack("L", $size);
1042
a6dd486b 1043FIONREAD requires a filehandle connected to a stream, meaning that sockets,
5a964f20 1044pipes, and tty devices work, but I<not> files.
68dc0745 1045
1046=head2 How do I do a C<tail -f> in perl?
1047
1048First try
1049
1050 seek(GWFILE, 0, 1);
1051
1052The statement C<seek(GWFILE, 0, 1)> doesn't change the current position,
1053but it does clear the end-of-file condition on the handle, so that the
1054next <GWFILE> makes Perl try again to read something.
1055
1056If that doesn't work (it relies on features of your stdio implementation),
1057then you need something more like this:
1058
1059 for (;;) {
1060 for ($curpos = tell(GWFILE); <GWFILE>; $curpos = tell(GWFILE)) {
1061 # search for some stuff and put it into files
1062 }
1063 # sleep for a while
1064 seek(GWFILE, $curpos, 0); # seek to where we had been
1065 }
1066
1067If this still doesn't work, look into the POSIX module. POSIX defines
1068the clearerr() method, which can remove the end of file condition on a
1069filehandle. The method: read until end of file, clearerr(), read some
1070more. Lather, rinse, repeat.
1071
65acb1b1 1072There's also a File::Tail module from CPAN.
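The tell-and-seek loop above can be packaged as a function that returns whatever complete lines have arrived since the last call. This is a sketch under the assumption that the file only grows; the name C<read_new_lines> is made up for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# Hypothetical helper: return the complete lines added to $fh since byte
# offset $$pos_ref, and advance $$pos_ref past them.
sub read_new_lines {
    my ($fh, $pos_ref) = @_;
    # Seeking to the remembered offset also clears the end-of-file condition.
    seek($fh, $$pos_ref, SEEK_SET) or die "can't seek: $!";
    my @lines;
    while (defined(my $line = <$fh>)) {
        last unless $line =~ /\n\z/;    # partial line: wait for the rest
        push @lines, $line;
        $$pos_ref = tell($fh);
    }
    return @lines;
}
```

A caller would start with a position of 0 (or the current file size), then alternate between calling the function and sleeping for a while. Leaving a partial last line unread until its newline arrives avoids handing half a record to the rest of the program.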
1073
68dc0745 1074=head2 How do I dup() a filehandle in Perl?
1075
1076If you check L<perlfunc/open>, you'll see that several of the ways
1077to call open() should do the trick. For example:
1078
1079 open(LOG, ">>/tmp/logfile");
1080 open(STDERR, ">&LOG");
1081
1082Or even with a literal numeric descriptor:
1083
1084 $fd = $ENV{MHCONTEXTFD};
1085 open(MHCONTEXT, "<&=$fd"); # like fdopen(3S)
1086
Note that "<&STDIN" makes a copy, but "<&=STDIN" makes
an alias. That means if you close an aliased handle, all
aliases become inaccessible. This is not true with
a copied one.
1091
1092Error checking, as always, has been left as an exercise for the reader.
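With the error checking filled in, and using the three-argument form of open() with a lexical handle, a copy could be made roughly like this. The name C<dup_for_writing> is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return an independent duplicate of an output handle.
sub dup_for_writing {
    my ($fh) = @_;
    open(my $copy, '>&', $fh) or die "can't dup handle: $!";
    return $copy;
}
```

Since this makes a copy rather than an alias, closing the original handle leaves the duplicate usable.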
68dc0745 1093
1094=head2 How do I close a file descriptor by number?
1095
1096This should rarely be necessary, as the Perl close() function is to be
1097used for things that Perl opened itself, even if it was a dup of a
a6dd486b 1098numeric descriptor as with MHCONTEXT above. But if you really have
68dc0745 1099to, you may be able to do this:
1100
    require 'sys/syscall.ph';
    $rc = syscall(&SYS_close, $fd + 0);  # must force numeric
    die "can't sysclose $fd: $!" if $rc == -1;
1104
a6dd486b 1105Or, just use the fdopen(3S) feature of open():
d92eb7b0 1106
1107 {
1108 local *F;
1109 open F, "<&=$fd" or die "Cannot reopen fd=$fd: $!";
1110 close F;
1111 }
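Where the POSIX module is available, its close() function takes a descriptor number directly, which avoids both I<syscall.ph> and the temporary filehandle. The wrapper name C<close_fd> below is our own invention.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: close a raw descriptor number that no Perl
# filehandle owns. POSIX::close() returns undef on failure.
sub close_fd {
    my ($fd) = @_;
    defined POSIX::close($fd) or die "can't close fd $fd: $!";
    return 1;
}
```

Do not use this on a descriptor that a Perl filehandle still refers to, since Perl would later try to close it again.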
1112
=head2 Why can't I use "C:\temp\foo" in DOS paths? Why doesn't `C:\temp\foo.exe` work?
68dc0745 1114
1115Whoops! You just put a tab and a formfeed into that filename!
1116Remember that within double quoted strings ("like\this"), the
1117backslash is an escape character. The full list of these is in
1118L<perlop/Quote and Quote-like Operators>. Unsurprisingly, you don't
1119have a file called "c:(tab)emp(formfeed)oo" or
65acb1b1 1120"c:(tab)emp(formfeed)oo.exe" on your legacy DOS filesystem.
68dc0745 1121
1122Either single-quote your strings, or (preferably) use forward slashes.
46fc3d4c 1123Since all DOS and Windows versions since something like MS-DOS 2.0 or so
68dc0745 1124have treated C</> and C<\> the same in a path, you might as well use the
a6dd486b 1125one that doesn't clash with Perl--or the POSIX shell, ANSI C and C++,
65acb1b1 1126awk, Tcl, Java, or Python, just to mention a few. POSIX paths
1127are more portable, too.
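A quick way to see what happened to the filename is to compare string lengths, as in this small illustration. The variable names are arbitrary.

```perl
use strict;
use warnings;

my $wrong  = "C:\temp\foo";     # \t becomes a tab, \f becomes a formfeed
my $single = 'C:\temp\foo';     # single quotes: backslashes stay literal
my $slash  = "C:/temp/foo";     # forward slashes need no special care

printf "%-8s length %2d\n", 'wrong',  length $wrong;
printf "%-8s length %2d\n", 'single', length $single;
printf "%-8s length %2d\n", 'slash',  length $slash;
```

The double-quoted version is two characters shorter than the others because each backslash pair collapsed into a single control character.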
68dc0745 1128
1129=head2 Why doesn't glob("*.*") get all the files?
1130
1131Because even on non-Unix ports, Perl's glob function follows standard
46fc3d4c 1132Unix globbing semantics. You'll need C<glob("*")> to get all (non-hidden)
65acb1b1 1133files. This makes glob() portable even to legacy systems. Your
1134port may include proprietary globbing functions as well. Check its
1135documentation for details.
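The following sketch shows the difference on a scratch directory. It assumes the temporary directory path contains no spaces or glob metacharacters; the file names are arbitrary.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
for my $name ('notes.txt', 'README', '.hidden') {
    open(my $fh, '>', "$dir/$name") or die "can't create $dir/$name: $!";
    close($fh) or die "can't close $dir/$name: $!";
}

my @with_dot = glob("$dir/*.*");   # only non-hidden names containing a dot
my @all      = glob("$dir/*");     # every non-hidden name

print scalar(@with_dot), " file(s) match *.*\n";
print scalar(@all),      " file(s) match *\n";
```

Here C<*.*> misses F<README> because that name has no dot in it, and neither pattern picks up F<.hidden>.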
68dc0745 1136
1137=head2 Why does Perl let me delete read-only files? Why does C<-i> clobber protected files? Isn't this a bug in Perl?
1138
1139This is elaborately and painstakingly described in the "Far More Than
7b8d334a 1140You Ever Wanted To Know" in
68dc0745 1141http://www.perl.com/CPAN/doc/FMTEYEWTK/file-dir-perms .
1142
1143The executive summary: learn how your filesystem works. The
1144permissions on a file say what can happen to the data in that file.
1145The permissions on a directory say what can happen to the list of
1146files in that directory. If you delete a file, you're removing its
1147name from the directory (so the operation depends on the permissions
1148of the directory, not of the file). If you try to write to the file,
1149the permissions of the file govern whether you're allowed to.
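On a Unix-like filesystem the point can be demonstrated with a few lines such as these. This is only an illustration; behavior on other systems may differ.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/protected";

open(my $fh, '>', $file) or die "can't create $file: $!";
close($fh)               or die "can't close $file: $!";
chmod(0444, $file) == 1  or die "can't chmod $file: $!";

# The directory is writable, so the name can be removed even though
# the file's own mode forbids writing to its data.
my $removed = unlink($file);
print $removed ? "unlinked read-only file\n" : "unlink failed: $!\n";
```

The unlink() call consults the directory's permissions, which is why the read-only mode on the file does not stop it.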
1150
1151=head2 How do I select a random line from a file?
1152
1153Here's an algorithm from the Camel Book:
1154
1155 srand;
1156 rand($.) < 1 && ($line = $_) while <>;
1157
1158This has a significant advantage in space over reading the whole
5a964f20 1159file in. A simple proof by induction is available upon
a6dd486b 1160request if you doubt the algorithm's correctness.
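Spelled out as a function over an already opened handle, the same algorithm might read as follows. The name C<random_line> is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return one line chosen uniformly at random from $fh,
# or undef if the handle yields no lines.
sub random_line {
    my ($fh) = @_;
    my ($count, $pick) = (0, undef);
    while (defined(my $line = <$fh>)) {
        $count++;
        # Keep line number $count with probability 1/$count.
        $pick = $line if rand($count) < 1;
    }
    return $pick;
}
```

The reasoning runs like this: line k is picked with probability 1/k, and it survives each later line j with probability (j-1)/j. For a file of n lines those factors multiply out to 1/n, so every line is equally likely.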
68dc0745 1161
65acb1b1 1162=head2 Why do I get weird spaces when I print an array of lines?
1163
1164Saying
1165
1166 print "@lines\n";
1167
1168joins together the elements of C<@lines> with a space between them.
1169If C<@lines> were C<("little", "fluffy", "clouds")> then the above
a6dd486b 1170statement would print
65acb1b1 1171
1172 little fluffy clouds
1173
but if each element of C<@lines> was a line of text, ending in a newline
character C<("little\n", "fluffy\n", "clouds\n")>, then it would print:

    little
     fluffy
     clouds
1180
1181If your array contains lines, just print them:
1182
1183 print @lines;
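The separator used during interpolation is the C<$"> variable, which defaults to a single space. The short sketch below, with arbitrary variable names, shows that interpolation behaves like join() with that separator.

```perl
use strict;
use warnings;

my @lines = ("little\n", "fluffy\n", "clouds\n");

my $interpolated = "@lines";             # elements joined with $", a space
my $joined       = join('', @lines);     # no separator at all

# Interpolation with the separator temporarily set to the empty string.
my $custom = do { local $" = ''; "@lines" };

print $joined;
```

Printing the list directly, or joining it with an empty string, avoids the stray spaces without having to touch C<$">.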
1184
68dc0745 1185=head1 AUTHOR AND COPYRIGHT
1186
65acb1b1 1187Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington.
5a964f20 1188All rights reserved.
1189
c8db1d39 1190When included as an integrated part of the Standard Distribution
of Perl or of its documentation (printed or otherwise), this work is
1192covered under Perl's Artistic License. For separate distributions of
c8db1d39 1193all or part of this FAQ outside of that, see L<perlfaq>.
1194
87275199 1195Irrespective of its distribution, all code examples here are in the public
c8db1d39 1196domain. You are permitted and encouraged to use this code and any
1197derivatives thereof in your own programs for fun or for profit as you
1198see fit. A simple comment in the code giving credit to the FAQ would
1199be courteous but is not required.