=head1 NAME

perlfaq5 - Files and Formats ($Revision: 1.30 $, $Date: 2003/11/23 08:07:46 $)

=head1 DESCRIPTION

This section deals with I/O and the "f" issues: filehandles, flushing,
formats, and footers.

=head2 How do I flush/unbuffer an output filehandle? Why must I do this?

Perl does not support truly unbuffered output (except
insofar as you can C<syswrite(OUT, $char, 1)>), although it
does support "command buffering", in which a physical
write is performed after every output command.

The C standard I/O library (stdio) normally buffers
characters sent to devices so that there isn't a system call
for each byte. In most stdio implementations, the type of
output buffering and the size of the buffer vary according
to the type of device. Perl's print() and write() functions
normally buffer output, while syswrite() bypasses buffering
altogether.

If you want your output to be sent immediately when you
execute print() or write() (for instance, for some network
protocols), you must set the handle's autoflush flag. This
flag is the Perl variable $| and when it is set to a true
value, Perl will flush the handle's buffer after each
print() or write(). Setting $| affects buffering only for
the currently selected default filehandle. You choose this
handle with the one-argument select() call (see
L<perlvar/$E<verbar>> and L<perlfunc/select>).

Use select() to choose the desired handle, then set its
per-filehandle variables.

    $old_fh = select(OUTPUT_HANDLE);
    $| = 1;
    select($old_fh);

Some idioms can handle this in a single statement:

    select((select(OUTPUT_HANDLE), $| = 1)[0]);

    $| = 1, select $_ for select OUTPUT_HANDLE;

Some modules offer object-oriented access to handles and their
variables, although they may be overkill if this is the only
thing you do with them. You can use IO::Handle:

    use IO::Handle;
    open(DEV, ">/dev/printer") or die "can't open /dev/printer: $!";
    DEV->autoflush(1);

or IO::Socket:

    use IO::Socket;
    my $sock = IO::Socket::INET->new( 'www.example.com:80' )
        or die "can't connect: $@";

    $sock->autoflush();
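
As a rough illustration of the effect of autoflush, the following sketch
writes to a scratch file and checks its size on disk before and after the
flag is set. The scratch file and the variable names are assumptions made
for the example, and the exact size before flushing depends on the
platform's buffering.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

# Hypothetical scratch file, used only for this demonstration.
my ($out, $path) = tempfile(UNLINK => 1);

print {$out} "buffered\n";
my $size_before = -s $path;    # usually 0: the data is still in Perl's buffer

$out->autoflush(1);            # like selecting $out and setting $| = 1
print {$out} "flushed\n";
my $size_after = -s $path;     # both lines have now reached the file

printf "before: %d bytes, after: %d bytes\n", $size_before || 0, $size_after;
```

Turning autoflush on also flushes whatever was already buffered, which is
why the first line shows up on disk along with the second.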

=head2 How do I change one line in a file/delete a line in a file/insert a line in the middle of a file/append to the beginning of a file?

Use the Tie::File module, which is included in the standard
distribution since Perl 5.8.0.
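
A minimal sketch of how Tie::File covers each of the operations in the
question follows. The sample file and its contents are made up for the
example; every array operation is carried through to the file on disk.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Hypothetical sample file with three lines.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "first\nsecond\nthird\n";
close $fh or die "can't close $path: $!";

tie my @lines, 'Tie::File', $path or die "can't tie $path: $!";

$lines[1] = 'SECOND';                 # change one line
splice @lines, 2, 0, 'inserted';      # insert a line in the middle
shift @lines;                         # delete the first line
unshift @lines, 'new first line';     # add a line at the beginning

my @result = @lines;                  # copy the lines before untying
untie @lines;

print "$_\n" for @result;
```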

=head2 How do I count the number of lines in a file?

One fairly efficient way is to count newlines in the file. The
following program uses a feature of tr///, as documented in L<perlop>.
If your text file doesn't end with a newline, then it's not really a
proper text file, so this may report one fewer line than you expect.

    $lines = 0;
    open(FILE, $filename) or die "Can't open `$filename': $!";
    while (sysread FILE, $buffer, 4096) {
        $lines += ($buffer =~ tr/\n//);
    }
    close FILE;

This assumes no funny games with newline translations.

=head2 How can I use Perl's C<-i> option from within a program?

C<-i> sets the value of Perl's C<$^I> variable, which in turn affects
the behavior of C<< <> >>; see L<perlrun> for more details. By
modifying the appropriate variables directly, you can get the same
behavior within a larger program. For example:

    # ...
    {
        local($^I, @ARGV) = ('.orig', glob("*.c"));
        while (<>) {
            if ($. == 1) {
                print "This line should appear at the top of each file\n";
            }
            s/\b(p)earl\b/${1}erl/i;        # Correct typos, preserving case
            print;
            close ARGV if eof;              # Reset $.
        }
    }
    # $^I and @ARGV return to their old values here

This block modifies all the C<.c> files in the current directory,
leaving a backup of the original data from each file in a new
C<.c.orig> file.

=head2 How do I make a temporary file name?

Use the File::Temp module; see L<File::Temp> for more information.

    use File::Temp qw/ tempfile tempdir /;

    $dir = tempdir( CLEANUP => 1 );
    ($fh, $filename) = tempfile( DIR => $dir );

    # or if you don't need to know the filename

    $fh = tempfile( DIR => $dir );

File::Temp has been a standard module since Perl 5.6.1. If you
don't have a modern enough Perl installed, use the C<new_tmpfile>
class method from the IO::File module to get a filehandle opened for
reading and writing. Use it if you don't need to know the file's name:

    use IO::File;
    $fh = IO::File->new_tmpfile()
        or die "Unable to make new temporary file: $!";

If you're committed to creating a temporary file by hand, use the
process ID and/or the current time-value. If you need to have many
temporary files in one process, use a counter:

    BEGIN {
        use Fcntl;
        my $temp_dir = -d '/tmp' ? '/tmp' : $ENV{TMPDIR} || $ENV{TEMP};
        my $base_name = sprintf("%s/%d-%d-0000", $temp_dir, $$, time());
        sub temp_file {
            local *FH;
            my $count = 0;
            until (defined(fileno(FH)) || $count++ > 100) {
                $base_name =~ s/-(\d+)$/"-" . (1 + $1)/e;
                sysopen(FH, $base_name, O_WRONLY|O_EXCL|O_CREAT);
            }
            if (defined(fileno(FH))) {
                return (*FH, $base_name);
            } else {
                return ();
            }
        }
    }

=head2 How can I manipulate fixed-record-length files?

The most efficient way is using L<pack()|perlfunc/"pack"> and
L<unpack()|perlfunc/"unpack">. This is faster than using
L<substr()|perlfunc/"substr"> when taking many, many strings. It is
slower for just a few.

Here is a sample chunk of code to break up and put back together again
some fixed-format input lines, in this case from the output of a normal,
Berkeley-style ps:

    # sample input line:
    # 15158 p5 T 0:00 perl /home/tchrist/scripts/now-what
    my $PS_T = 'A6 A4 A7 A5 A*';
    open my $ps, '-|', 'ps' or die "can't run ps: $!";
    print scalar <$ps>;                 # pass the header line through
    my @fields = qw( pid tt stat time command );
    while (<$ps>) {
        my %process;
        @process{@fields} = unpack($PS_T, $_);
        for my $field ( @fields ) {
            print "$field: <$process{$field}>\n";
        }
        print 'line=', pack($PS_T, @process{@fields} ), "\n";
    }

We've used a hash slice in order to easily handle the fields of each row.
Storing the keys in an array means it's easy to operate on them as a
group or loop over them with for. It also avoids polluting the program
with global variables and using symbolic references.

=head2 How can I make a filehandle local to a subroutine? How do I pass filehandles between subroutines? How do I make an array of filehandles?

As of perl5.6, open() autovivifies file and directory handles
as references if you pass it an uninitialized scalar variable.
You can then pass these references just like any other scalar,
and use them in the place of named handles.

    open my $fh, $file_name;

    open local $fh, $file_name;

    print $fh "Hello World!\n";

    process_file( $fh );

Before perl5.6, you had to deal with various typeglob idioms
which you may see in older code.

    open FILE, "> $filename";
    process_typeglob(   *FILE );
    process_reference( \*FILE );

    sub process_typeglob  { local *FH = shift; print FH  "Typeglob!" }
    sub process_reference { local $fh = shift; print $fh "Reference!" }

If you want to create many anonymous handles, you should
check out the Symbol or IO::Handle modules.
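
Because an autovivified handle is an ordinary scalar, an array of
filehandles is just an array of such scalars. The sketch below is one
possible way to build and use one. It writes to in-memory files (Perl 5.8
or later) so that it has no side effects, and C<write_greeting> is a
hypothetical helper rather than anything standard.

```perl
use strict;
use warnings;

# In-memory files stand in for real files in this sketch.
my @buffers = ('', '', '');
my @handles;
for my $i (0 .. $#buffers) {
    open my $fh, '>', \$buffers[$i] or die "can't open buffer $i: $!";
    push @handles, $fh;              # a lexical handle is just a scalar
}

# Hypothetical helper: receives a handle like any other argument.
sub write_greeting {
    my ($fh, $who) = @_;
    print {$fh} "Hello, $who!\n";
}

write_greeting($handles[$_], "handle $_") for 0 .. $#handles;
close $_ for @handles;

print @buffers;
```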

=head2 How can I use a filehandle indirectly?

An indirect filehandle is something other than a symbol used
in a place where a filehandle is expected. Here are ways
to get indirect filehandles:

    $fh =   SOME_FH;       # bareword is strict-subs hostile
    $fh =  "SOME_FH";      # strict-refs hostile; same package only
    $fh =  *SOME_FH;       # typeglob
    $fh = \*SOME_FH;       # ref to typeglob (bless-able)
    $fh =  *SOME_FH{IO};   # blessed IO::Handle from *SOME_FH typeglob

Or, you can use the C<new> method from one of the IO::* modules to
create an anonymous filehandle, store that in a scalar variable,
and use it as though it were a normal filehandle.

    use IO::Handle;                     # 5.004 or higher
    $fh = IO::Handle->new();

Then use any of those as you would a normal filehandle. Anywhere that
Perl is expecting a filehandle, an indirect filehandle may be used
instead. An indirect filehandle is just a scalar variable that contains
a filehandle. Functions like C<print>, C<open>, C<seek>, or
the C<< <FH> >> diamond operator will accept either a named filehandle
or a scalar variable containing one:

    ($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR);
    print $ofh "Type it: ";
    $got = <$ifh>;
    print $efh "What was that: $got";

If you're passing a filehandle to a function, you can write
the function in two ways:

    sub accept_fh {
        my $fh = shift;
        print $fh "Sending to indirect filehandle\n";
    }

Or it can localize a typeglob and use the filehandle directly:

    sub accept_fh {
        local *FH = shift;
        print FH "Sending to localized filehandle\n";
    }

Both styles work with either objects or typeglobs of real filehandles.
(They might also work with strings under some circumstances, but this
is risky.)

    accept_fh(*STDOUT);
    accept_fh($handle);

In the examples above, we assigned the filehandle to a scalar variable
before using it. That is because only simple scalar variables, not
expressions or subscripts of hashes or arrays, can be used with
built-ins like C<print>, C<printf>, or the diamond operator. Using
something other than a simple scalar variable as a filehandle is
illegal and won't even compile:

    @fd = (*STDIN, *STDOUT, *STDERR);
    print $fd[1] "Type it: ";                           # WRONG
    $got = <$fd[0]>;                                    # WRONG
    print $fd[2] "What was that: $got";                 # WRONG

With C<print> and C<printf>, you get around this by using a block and
an expression where you would place the filehandle:

    print  { $fd[1] } "funny stuff\n";
    printf { $fd[1] } "Pity the poor %x.\n", 3_735_928_559;
    # Pity the poor deadbeef.

That block is a proper block like any other, so you can put more
complicated code there. This sends the message out to one of two places:

    $ok = -x "/bin/cat";
    print { $ok ? $fd[1] : $fd[2] } "cat stat $ok\n";
    print { $fd[ 1+ ($ok || 0) ]  } "cat stat $ok\n";

This approach of treating C<print> and C<printf> like object method
calls doesn't work for the diamond operator. That's because it's a
real operator, not just a function with a comma-less argument. Assuming
you've been storing typeglobs in your structure as we did above, you
can use the built-in function named C<readline> to read a record just
as C<< <> >> does. Given the initialization shown above for @fd, this
would work, but only because readline() requires a typeglob. It doesn't
work with objects or strings, which might be a bug we haven't fixed yet.

    $got = readline($fd[0]);

Let it be noted that the flakiness of indirect filehandles is not
related to whether they're strings, typeglobs, objects, or anything else.
It's the syntax of the fundamental operators. Playing the object
game doesn't help you at all here.
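
The two workarounds above, the block form of C<print> and readline(), can
be tried together in a small sketch. It assumes a Perl with in-memory
files (5.8 or later) and stores lexical handles, which are typeglob
references, in the array instead of C<*STDIN> and C<*STDOUT>; the buffer
contents are made up.

```perl
use strict;
use warnings;

# In-memory handles stand in for STDIN and STDOUT in this sketch.
my $input  = "first line\nsecond line\n";
my $output = '';
open my $in,  '<', \$input  or die "can't open input buffer: $!";
open my $out, '>', \$output or die "can't open output buffer: $!";
my @fd = ($in, $out);

# readline() takes an expression, so an array element is fine here.
my $got = readline($fd[0]);

# print needs the block form when the handle is not a simple scalar.
print { $fd[1] } "got: $got";
close $fd[1];

print $output;
```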

=head2 How can I set up a footer format to be used with write()?

There's no builtin way to do this, but L<perlform> has a couple of
techniques to make it possible for the intrepid hacker.

=head2 How can I write() into a string?

See L<perlform/"Accessing Formatting Internals"> for an swrite() function.

=head2 How can I output my numbers with commas added?

This subroutine will add commas to your number:

    sub commify {
        local $_ = shift;
        1 while s/^([-+]?\d+)(\d{3})/$1,$2/;
        return $_;
    }

This regex from Benjamin Goldberg will add commas to numbers:

    s/(^[-+]?\d+?(?=(?>(?:\d{3})+)(?!\d))|\G\d{3}(?=\d))/$1,/g;

It is easier to see with comments:

    s/(
        ^[-+]?            # beginning of number.
        \d{1,3}?          # first digits before first comma
        (?=               # followed by, (but not included in the match) :
            (?>(?:\d{3})+) # some positive multiple of three digits.
            (?!\d)         # an *exact* multiple, not x * 3 + 1 or whatever.
        )
      |                   # or:
        \G\d{3}           # after the last group, get three digits
        (?=\d)            # but they have to have more digits after them.
    )/$1,/xg;
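
To compare the two approaches side by side, something like the following
can be used. C<commify> is the subroutine shown above; C<commify_regex>
is a hypothetical wrapper around the single-regex version, and the sample
values are arbitrary.

```perl
use strict;
use warnings;

# The subroutine from the text, unchanged.
sub commify {
    local $_ = shift;
    1 while s/^([-+]?\d+)(\d{3})/$1,$2/;
    return $_;
}

# Hypothetical wrapper around the single-regex version, for comparison.
sub commify_regex {
    my $number = shift;
    $number =~ s/(^[-+]?\d+?(?=(?>(?:\d{3})+)(?!\d))|\G\d{3}(?=\d))/$1,/g;
    return $number;
}

for my $n (999, 1234567, '-1234.5678') {
    printf "%-12s %-14s %s\n", $n, commify($n), commify_regex($n);
}
```

Both versions only touch the integer part, so digits after a decimal point
are left alone.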

=head2 How can I translate tildes (~) in a filename?

Use the <> (glob()) operator, documented in L<perlfunc>. Older
versions of Perl require that you have a shell installed that groks
tildes. Recent perl versions have this feature built in. The
File::KGlob module (available from CPAN) gives more portable glob
functionality.

Within Perl, you may use this directly:

    $filename =~ s{
        ^ ~             # find a leading tilde
        (               # save this in $1
            [^/]        # a non-slash character
                *       # repeated 0 or more times (0 means me)
        )
    }{
        $1
            ? (getpwnam($1))[7]
            : ( $ENV{HOME} || $ENV{LOGDIR} )
    }ex;
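
Wrapped in a subroutine, the substitution above might be used like this.
C<expand_tilde> and the sample path are hypothetical names chosen for the
sketch; the C<~user> form relies on getpwnam(), which is not available on
every platform.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the substitution from the text.
sub expand_tilde {
    my $filename = shift;
    $filename =~ s{ ^ ~ ( [^/]* ) }{
        $1
            ? (getpwnam($1))[7]                 # ~user: look up the home directory
            : ( $ENV{HOME} || $ENV{LOGDIR} )    # bare ~: the current user
    }ex;
    return $filename;
}

print expand_tilde('~/notes.txt'), "\n";
```

A name without a leading tilde passes through unchanged, because the
pattern is anchored at the start of the string.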

=head2 How come when I open a file read-write it wipes it out?

Because you're using something like this, which truncates the file and
I<then> gives you read-write access:

    open(FH, "+> /path/name");          # WRONG (almost always)

Whoops. You should instead use this, which will fail if the file
doesn't exist.

    open(FH, "+< /path/name");          # open for update

Using ">" always clobbers or creates. Using "<" never does
either. The "+" doesn't change this.

Here are examples of many kinds of file opens. Those using sysopen()
all assume

    use Fcntl;

To open file for reading:

    open(FH, "< $path")                                 || die $!;
    sysopen(FH, $path, O_RDONLY)                        || die $!;

To open file for writing, create new file if needed or else truncate old file:

    open(FH, "> $path")                                 || die $!;
    sysopen(FH, $path, O_WRONLY|O_TRUNC|O_CREAT)        || die $!;
    sysopen(FH, $path, O_WRONLY|O_TRUNC|O_CREAT, 0666)  || die $!;

To open file for writing, create new file, file must not exist:

    sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT)         || die $!;
    sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT, 0666)   || die $!;

To open file for appending, create if necessary:

    open(FH, ">> $path")                                || die $!;
    sysopen(FH, $path, O_WRONLY|O_APPEND|O_CREAT)       || die $!;
    sysopen(FH, $path, O_WRONLY|O_APPEND|O_CREAT, 0666) || die $!;

To open file for appending, file must exist:

    sysopen(FH, $path, O_WRONLY|O_APPEND)               || die $!;

To open file for update, file must exist:

    open(FH, "+< $path")                                || die $!;
    sysopen(FH, $path, O_RDWR)                          || die $!;

To open file for update, create file if necessary:

    sysopen(FH, $path, O_RDWR|O_CREAT)                  || die $!;
    sysopen(FH, $path, O_RDWR|O_CREAT, 0666)            || die $!;

To open file for update, file must not exist:

    sysopen(FH, $path, O_RDWR|O_EXCL|O_CREAT)           || die $!;
    sysopen(FH, $path, O_RDWR|O_EXCL|O_CREAT, 0666)     || die $!;

To open a file without blocking, creating if necessary:

    sysopen(FH, "/tmp/somefile", O_WRONLY|O_NDELAY|O_CREAT)
        or die "can't open /tmp/somefile: $!";

Be warned that neither creation nor deletion of files is guaranteed to
be an atomic operation over NFS. That is, two processes might both
successfully create or unlink the same file! Therefore O_EXCL
isn't as exclusive as you might wish.

See also the new L<perlopentut> if you have it (new for 5.6).
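
The difference between "+<" and "+>" described at the start of this answer
is easy to see on a scratch file. The following is only a sketch; the
file, its contents, and the variable names are assumptions made for the
demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file holding one line of data.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "precious data\n";
close $tmp or die "can't close $path: $!";

# "+<" opens for update and leaves the contents alone.
open my $update, '+<', $path or die "can't update $path: $!";
my $kept = <$update>;
close $update;

# "+>" truncates first, so there is nothing left to read.
open my $clobber, '+>', $path or die "can't clobber $path: $!";
my $lost = <$clobber>;
close $clobber;

print "after +<: ", defined $kept ? $kept : "(nothing)\n";
print "after +>: ", defined $lost ? $lost : "(nothing)\n";
```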

=head2 Why do I sometimes get an "Argument list too long" when I use E<lt>*E<gt>?

The C<< <> >> operator performs a globbing operation (see above).
In Perl versions earlier than v5.6.0, the internal glob() operator forks
csh(1) to do the actual glob expansion, but
csh can't handle more than 127 items and so gives the error message
C<Argument list too long>. People who installed tcsh as csh won't
have this problem, but their users may be surprised by it.

To get around this, either upgrade to Perl v5.6.0 or later, do the glob
yourself with readdir() and patterns, or use a module like File::KGlob,
one that doesn't use the shell to do globbing.

=head2 Is there a leak/bug in glob()?

Due to the current implementation on some operating systems, when you
use the glob() function or its angle-bracket alias in a scalar
context, you may cause a memory leak and/or unpredictable behavior. It's
best therefore to use glob() only in list context.

=head2 How can I open a file with a leading ">" or trailing blanks?

Normally perl ignores trailing blanks in filenames, and interprets
certain leading characters (or a trailing "|") to mean something
special.

The three-argument form of open() lets you specify the mode
separately from the filename. The open() function treats
special mode characters and whitespace in the filename as
literals:

    open FILE, "<", " file ";   # filename is " file "
    open FILE, ">", ">file";    # filename is ">file"

It may be a lot clearer to use sysopen(), though:

    use Fcntl;
    $badpath = "<<<something really wicked ";
    sysopen (FH, $badpath, O_WRONLY | O_CREAT | O_TRUNC)
        or die "can't open $badpath: $!";

=head2 How can I reliably rename a file?

If your operating system supports a proper mv(1) utility or its
functional equivalent, this works:

    rename($old, $new) or system("mv", $old, $new);

It may be more portable to use the File::Copy module instead.
You just copy the file to the new name (checking return
values), then delete the old one. This isn't really the same
semantically as a rename(), which preserves meta-information like
permissions, timestamps, inode info, etc.

Newer versions of File::Copy export a move() function.
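
A short sketch of the File::Copy route, using its move() function, might
look like this. The scratch directory and the file names are hypothetical.

```perl
use strict;
use warnings;
use File::Copy qw(move);
use File::Temp qw(tempdir);

# Hypothetical file names inside a scratch directory.
my $dir = tempdir(CLEANUP => 1);
my ($old, $new) = ("$dir/old.txt", "$dir/new.txt");

open my $fh, '>', $old or die "can't create $old: $!";
print {$fh} "contents\n";
close $fh or die "can't close $old: $!";

# move() tries rename() first and falls back to copying and deleting.
move($old, $new) or die "can't move $old to $new: $!";

print "moved\n" if -e $new && !-e $old;
```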

=head2 How can I lock a file?

Perl's builtin flock() function (see L<perlfunc> for details) will call
flock(2) if that exists, fcntl(2) if it doesn't (on perl version 5.004 and
later), and lockf(3) if neither of the two previous system calls exists.
On some systems, it may even use a different form of native locking.
Here are some gotchas with Perl's flock():

=over 4

=item 1

Produces a fatal error if none of the three system calls (or their
close equivalent) exists.

=item 2

lockf(3) does not provide shared locking, and requires that the
filehandle be open for writing (or appending, or read/writing).

=item 3

Some versions of flock() can't lock files over a network (e.g. on NFS file
systems), so you'd need to force the use of fcntl(2) when you build Perl.
But even this is dubious at best. See the flock entry of L<perlfunc>
and the F<INSTALL> file in the source distribution for information on
building Perl to do this.

Two potentially non-obvious but traditional flock semantics are that
it waits indefinitely until the lock is granted, and that its locks are
I<merely advisory>. Such discretionary locks are more flexible, but
offer fewer guarantees. This means that files locked with flock() may
be modified by programs that do not also use flock(). Cars that stop
for red lights get on well with each other, but not with cars that don't
stop for red lights. See the perlport manpage, your port's specific
documentation, or your system-specific local manpages for details. It's
best to assume traditional behavior if you're writing portable programs.
(If you're not, you should as always feel perfectly free to write
for your own system's idiosyncrasies (sometimes called "features").
Slavish adherence to portability concerns shouldn't get in the way of
your getting your job done.)

For more information on file locking, see also
L<perlopentut/"File Locking"> if you have it (new for 5.6).

=back
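
As an illustration of the usual flock() pattern, here is a sketch of
appending to a file under an exclusive lock. It assumes a local file
system with working flock(); C<append_locked> and the log file name are
hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_END);
use File::Temp qw(tempdir);

# Hypothetical log file inside a scratch directory.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/app.log";

sub append_locked {
    my ($file, $text) = @_;
    open my $fh, '>>', $file or die "can't open $file: $!";
    flock($fh, LOCK_EX)      or die "can't lock $file: $!";   # waits until granted
    seek($fh, 0, SEEK_END)   or die "can't seek $file: $!";   # others may have appended
    print {$fh} $text        or die "can't write $file: $!";
    close $fh                or die "can't close $file: $!";  # flushes, then unlocks
}

append_locked($path, "one\n");
append_locked($path, "two\n");
```

The lock is advisory, so it only protects against other programs that
also call flock() on the same file.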

=head2 Why can't I just open(FH, "E<gt>file.lock")?

A common bit of code B<NOT TO USE> is this:

    sleep(3) while -e "file.lock";      # PLEASE DO NOT USE
    open(LCK, "> file.lock");           # THIS BROKEN CODE

This is a classic race condition: you take two steps to do something
which must be done in one. That's why computer hardware provides an
atomic test-and-set instruction. In theory, this "ought" to work:

    sysopen(FH, "file.lock", O_WRONLY|O_EXCL|O_CREAT)
        or die "can't open file.lock: $!";

except that lamentably, file creation (and deletion) is not atomic
over NFS, so this won't work (at least, not every time) over the net.
Various schemes involving link() have been suggested, but
these tend to involve busy-wait, which is also subdesirable.

=head2 I still don't get locking. I just want to increment the number in the file. How can I do this?

Didn't anyone ever tell you web-page hit counters were useless?
They don't count number of hits, they're a waste of time, and they serve
only to stroke the writer's vanity. It's better to pick a random number;
they're more realistic.

Anyway, this is what you can do if you can't help yourself.

    use Fcntl qw(:DEFAULT :flock);
    sysopen(FH, "numfile", O_RDWR|O_CREAT)  or die "can't open numfile: $!";
    flock(FH, LOCK_EX)                      or die "can't flock numfile: $!";
    $num = <FH> || 0;
    seek(FH, 0, 0)                          or die "can't rewind numfile: $!";
    truncate(FH, 0)                         or die "can't truncate numfile: $!";
    (print FH $num+1, "\n")                 or die "can't write numfile: $!";
    close FH                                or die "can't close numfile: $!";

Here's a much better web-page hit counter:

    $hits = int( (time() - 850_000_000) / rand(1_000) );

If the count doesn't impress your friends, then the code might. :-)

=head2 All I want to do is append a small amount of text to the end of a file. Do I still have to use locking?

If you are on a system that correctly implements flock() and you use the
example appending code from "perldoc -f flock" everything will be OK
even if the OS you are on doesn't implement append mode correctly (if
such a system exists). So if you are happy to restrict yourself to OSs
that implement flock() (and that's not really much of a restriction)
then that is what you should do.

If you know you are only going to use a system that does correctly
implement appending (i.e. not Win32) then you can omit the seek() from
the above code.

If you know you are only writing code to run on an OS and filesystem that
does implement append mode correctly (a local filesystem on a modern
Unix for example), and you keep the file in block-buffered mode and you
write less than one buffer-full of output between each manual flushing
of the buffer, then each bufferload is almost guaranteed to be written to
the end of the file in one chunk without getting intermingled with
anyone else's output. You can also use the syswrite() function, which is
simply a wrapper around your system's write(2) system call.

There is still a small theoretical chance that a signal will interrupt
the system level write() operation before completion. There is also a
possibility that some STDIO implementations may call multiple system
level write()s even if the buffer was empty to start. There may be some
systems where this probability is reduced to zero.

=head2 How do I randomly update a binary file?

If you're just trying to patch a binary, in many cases something as
simple as this works:

    perl -i -pe 's{window manager}{window mangler}g' /usr/bin/emacs

However, if you have fixed sized records, then you might do something more
like this:

    $RECSIZE = 220; # size of record, in bytes
    $recno   = 37;  # which record to update
    open(FH, "+<somewhere") || die "can't update somewhere: $!";
    seek(FH, $recno * $RECSIZE, 0);
    read(FH, $record, $RECSIZE) == $RECSIZE || die "can't read record $recno: $!";
    # munge the record
    seek(FH, -$RECSIZE, 1);
    print FH $record;
    close FH;

Locking and error checking are left as an exercise for the reader.
Don't forget them or you'll be quite sorry.

=head2 How do I get a file's timestamp in perl?

If you want to retrieve the time at which the file was last
read, written, or had its meta-data (owner, etc) changed,
you use the B<-A>, B<-M>, or B<-C> file test operations,
respectively, as documented in L<perlfunc>. These retrieve the
age of the file (measured against the start-time of your program) in
days as a floating point number. Some platforms may not have
all of these times. See L<perlport> for details. To
retrieve the "raw" time in seconds since the epoch, you
would call the stat function, then use localtime(),
gmtime(), or POSIX::strftime() to convert this into
human-readable form.

Here's an example:

    $write_secs = (stat($file))[9];
    printf "file %s updated at %s\n", $file,
        scalar localtime($write_secs);

If you prefer something more legible, use the File::stat module
(part of the standard distribution in version 5.004 and later):

    # error checking left as an exercise for reader.
    use File::stat;
    use Time::localtime;
    $date_string = ctime(stat($file)->mtime);
    print "file $file updated at $date_string\n";

The POSIX::strftime() approach has the benefit of being,
in theory, independent of the current locale. See L<perllocale>
for details.
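
The POSIX::strftime() approach mentioned above is not shown in the
examples, so here is a minimal sketch of it. The scratch file, the
timestamp value, and the output format are all assumptions made for the
demonstration; the file's times are set first so that the result is
predictable.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use File::Temp qw(tempfile);

# Hypothetical file whose timestamps are set to a known value first.
my ($fh, $path) = tempfile(UNLINK => 1);
close $fh or die "can't close $path: $!";
my $when = 1_000_000_000;                   # 2001-09-09 01:46:40 UTC
utime($when, $when, $path) or die "can't set times on $path: $!";

my $write_secs = (stat($path))[9];          # mtime, in seconds since the epoch
my $stamp = strftime("%Y-%m-%d %H:%M:%S", gmtime($write_secs));
print "file $path updated at $stamp UTC\n";
```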

=head2 How do I set a file's timestamp in perl?

You use the utime() function documented in L<perlfunc/utime>.
By way of example, here's a little program that copies the
read and write times from its first argument to all the rest
of them.

    if (@ARGV < 2) {
        die "usage: cptimes timestamp_file other_files ...\n";
    }
    $timestamp = shift;
    ($atime, $mtime) = (stat($timestamp))[8,9];
    utime $atime, $mtime, @ARGV;

Error checking is, as usual, left as an exercise for the reader.

Note that utime() currently doesn't work correctly with Win95/NT
ports. A bug has been reported. Check it carefully before using
utime() on those platforms.

=head2 How do I print to more than one file at once?

To connect one filehandle to several output filehandles,
you can use the IO::Tee or Tie::FileHandle::Multiplex modules.

If you only have to do this once, you can print individually
to each filehandle.

    for $fh (FH1, FH2, FH3) { print $fh "whatever\n" }

=head2 How can I read in an entire file all at once?

You can use the File::Slurp module to do it in one step.

    use File::Slurp;

    $all_of_it = read_file($filename); # entire file in scalar
    @all_lines = read_file($filename); # one line per element

The customary Perl approach for processing all the lines in a file is to
do so one line at a time:

    open (INPUT, $file)     || die "can't open $file: $!";
    while (<INPUT>) {
        chomp;
        # do something with $_
    }
    close(INPUT)            || die "can't close $file: $!";

This is tremendously more efficient than reading the entire file into
memory as an array of lines and then processing it one element at a time,
which is often--if not almost always--the wrong approach. Whenever
you see someone do this:

    @lines = <INPUT>;

you should think long and hard about why you need everything loaded at
once. It's just not a scalable solution. You might also find it more
fun to use the standard Tie::File module, or the DB_File module's
$DB_RECNO bindings, which allow you to tie an array to a file so that
accessing an element of the array actually accesses the corresponding
line in the file.

You can read the entire filehandle contents into a scalar.

    {
        local(*INPUT, $/);
        open (INPUT, $file)     || die "can't open $file: $!";
        $var = <INPUT>;
    }

That temporarily undefs your record separator, and will automatically
close the file at block exit. If the file is already open, just use this:

    $var = do { local $/; <INPUT> };

For ordinary files you can also use the read function.

    read( INPUT, $var, -s INPUT );

The third argument tests the byte size of the data on the INPUT filehandle
and reads that many bytes into the buffer $var.

=head2 How can I read in a file by paragraphs?

Use the C<$/> variable (see L<perlvar> for details). You can either
set it to C<""> to eliminate empty paragraphs (C<"abc\n\n\n\ndef">,
for instance, gets treated as two paragraphs and not three), or
C<"\n\n"> to accept empty paragraphs.

Note that a blank line must have no blanks in it. Thus
S<C<"fred\n \nstuff\n\n">> is one paragraph, but C<"fred\n\nstuff\n\n"> is two.
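
The two settings can be compared on the sample string from the text. This
sketch reads from an in-memory file (Perl 5.8 or later);
C<count_paragraphs> is a hypothetical helper written for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: count the records in $text for a given value of $/.
sub count_paragraphs {
    my ($text, $separator) = @_;
    local $/ = $separator;
    open my $fh, '<', \$text or die "can't open in-memory file: $!";
    my @paragraphs = <$fh>;
    close $fh;
    return scalar @paragraphs;
}

my $text = "abc\n\n\n\ndef";
printf "paragraph mode: %d records\n", count_paragraphs($text, "");
printf "two newlines:   %d records\n", count_paragraphs($text, "\n\n");
```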
65acb1b1 762
68dc0745 763=head2 How can I read a single character from a file? From the keyboard?
764
765You can use the builtin C<getc()> function for most filehandles, but
766it won't (easily) work on a terminal device. For STDIN, either use
a6dd486b 767the Term::ReadKey module from CPAN or use the sample code in
68dc0745 768L<perlfunc/getc>.
769
65acb1b1 770If your system supports the portable operating system programming
771interface (POSIX), you can use the following code, which you'll note
772turns off echo processing as well.
68dc0745 773
774 #!/usr/bin/perl -w
775 use strict;
776 $| = 1;
777 for (1..4) {
778     my $got;
779     print "gimme: ";
780     $got = getone();
781     print "--> $got\n";
782 }
783 exit;
784
785 BEGIN {
786     use POSIX qw(:termios_h);
787
788     my ($term, $oterm, $echo, $noecho, $fd_stdin);
789
790     $fd_stdin = fileno(STDIN);
791
792     $term = POSIX::Termios->new();
793     $term->getattr($fd_stdin);
794     $oterm = $term->getlflag();
795
796     $echo = ECHO | ECHOK | ICANON;
797     $noecho = $oterm & ~$echo;
798
799     sub cbreak {
800         $term->setlflag($noecho);
801         $term->setcc(VTIME, 1);
802         $term->setattr($fd_stdin, TCSANOW);
803     }
804
805     sub cooked {
806         $term->setlflag($oterm);
807         $term->setcc(VTIME, 0);
808         $term->setattr($fd_stdin, TCSANOW);
809     }
810
811     sub getone {
812         my $key = '';
813         cbreak();
814         sysread(STDIN, $key, 1);
815         cooked();
816         return $key;
817     }
818
819 }
820
821 END { cooked() }
822
a6dd486b 823The Term::ReadKey module from CPAN may be easier to use. Recent versions
65acb1b1 824also include support for non-portable systems.
68dc0745 825
826 use Term::ReadKey;
827 open(TTY, "</dev/tty") or die "can't open /dev/tty: $!";
828 print "Gimme a char: ";
829 ReadMode "raw";
830 $key = ReadKey 0, *TTY;
831 ReadMode "normal";
832 printf "\nYou said %s, char number %03d\n",
833 $key, ord $key;
834
65acb1b1 835=head2 How can I tell whether there's a character waiting on a filehandle?
68dc0745 836
5a964f20 837The very first thing you should do is look into getting the Term::ReadKey
65acb1b1 838extension from CPAN. As we mentioned earlier, it now even has limited
839support for non-portable (read: not open systems, closed, proprietary,
840not POSIX, not Unix, etc.) systems.
5a964f20 841
842You should also check out the Frequently Asked Questions list in
68dc0745 843comp.unix.* for things like this: the answer is essentially the same.
844It's very system dependent. Here's one solution that works on BSD
845systems:
846
847 sub key_ready {
848     my($rin, $nfd);
849     vec($rin, fileno(STDIN), 1) = 1;
850     return $nfd = select($rin,undef,undef,0);
851 }
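
If you would rather not build the bit vector by hand, the standard
IO::Select module wraps the same four-argument select() call. The
following is a minimal sketch, assuming a Unix-like system where
select() works on pipes; the C<handle_ready> name is hypothetical.

```perl
use strict;
use warnings;
use IO::Select;

# Return true if $fh can be read from right now without blocking.
sub handle_ready {
    my ($fh) = @_;
    my $sel   = IO::Select->new($fh);
    my @ready = $sel->can_read(0);     # timeout 0: poll, don't wait
    return scalar @ready;
}

# Demonstration on a pipe rather than the keyboard.
pipe(my $reader, my $writer) or die "pipe failed: $!";
my $before = handle_ready($reader) ? 1 : 0;
syswrite($writer, "x") or die "syswrite failed: $!";
my $after  = handle_ready($reader) ? 1 : 0;
print "ready before write: $before, after write: $after\n";
```

As with the hand-rolled version, this only says whether a read would
block, not how many characters are waiting.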
852
65acb1b1 853If you want to find out how many characters are waiting, there's
854also the FIONREAD ioctl call to be looked at. The I<h2ph> tool that
855comes with Perl tries to convert C include files to Perl code, which
856can be C<require>d. FIONREAD ends up defined as a function in the
857I<sys/ioctl.ph> file:
68dc0745 858
5a964f20 859 require 'sys/ioctl.ph';
68dc0745 860
5a964f20 861 $size = pack("L", 0);
862 ioctl(FH, FIONREAD(), $size) or die "Couldn't call ioctl: $!\n";
863 $size = unpack("L", $size);
68dc0745 864
5a964f20 865If I<h2ph> wasn't installed or doesn't work for you, you can
866I<grep> the include files by hand:
68dc0745 867
5a964f20 868 % grep FIONREAD /usr/include/*/*
869 /usr/include/asm/ioctls.h:#define FIONREAD 0x541B
68dc0745 870
5a964f20 871Or write a small C program using the editor of champions:
68dc0745 872
5a964f20 873 % cat > fionread.c
874 #include <stdio.h>
 #include <sys/ioctl.h>
875 int main(void) {
876     printf("%#08x\n", FIONREAD);
877 }
878 ^D
65acb1b1 879 % cc -o fionread fionread.c
5a964f20 880 % ./fionread
881 0x4004667f
882
8305e449 883And then hard code it, leaving porting as an exercise to your successor.
5a964f20 884
885 $FIONREAD = 0x4004667f; # XXX: opsys dependent
886
887 $size = pack("L", 0);
888 ioctl(FH, $FIONREAD, $size) or die "Couldn't call ioctl: $!\n";
889 $size = unpack("L", $size);
890
a6dd486b 891FIONREAD requires a filehandle connected to a stream, meaning that sockets,
5a964f20 892pipes, and tty devices work, but I<not> files.
68dc0745 893
894=head2 How do I do a C<tail -f> in perl?
895
896First try
897
898 seek(GWFILE, 0, 1);
899
900The statement C<seek(GWFILE, 0, 1)> doesn't change the current position,
901but it does clear the end-of-file condition on the handle, so that the
902next <GWFILE> makes Perl try again to read something.
903
904If that doesn't work (it relies on features of your stdio implementation),
905then you need something more like this:
906
907 for (;;) {
908     for ($curpos = tell(GWFILE); <GWFILE>; $curpos = tell(GWFILE)) {
909         # search for some stuff and put it into files
910     }
911     # sleep for a while
912     seek(GWFILE, $curpos, 0); # seek to where we had been
913 }
914
915If this still doesn't work, look into the POSIX module. POSIX defines
916the clearerr() method, which can remove the end of file condition on a
917filehandle. The method: read until end of file, clearerr(), read some
918more. Lather, rinse, repeat.
919
65acb1b1 920There's also a File::Tail module from CPAN.
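
To see the seek() trick in isolation, here is a small self-contained
sketch. It plays both the writer and the reader of a temporary file, so
no sleeping is needed; the file and variable names are assumptions made
for the demonstration only.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

my ($writer, $name) = tempfile(UNLINK => 1);
$writer->autoflush(1);                  # make appended lines visible at once
print {$writer} "first\n";

open(my $reader, '<', $name) or die "can't open $name: $!";
my @seen;
push @seen, $_ while <$reader>;         # read up to end of file

print {$writer} "second\n";             # the file grows after we hit EOF
seek($reader, 0, 1);                    # no movement, but clears the EOF condition
push @seen, $_ while <$reader>;         # picks up the newly appended line

chomp @seen;
print "saw: @seen\n";
```

A real C<tail -f> would wrap the last three steps in a loop with a
sleep() between passes.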
921
68dc0745 922=head2 How do I dup() a filehandle in Perl?
923
924If you check L<perlfunc/open>, you'll see that several of the ways
925to call open() should do the trick. For example:
926
927 open(LOG, ">>/tmp/logfile");
928 open(STDERR, ">&LOG");
929
930Or even with a literal numeric descriptor:
931
932 $fd = $ENV{MHCONTEXTFD};
933 open(MHCONTEXT, "<&=$fd"); # like fdopen(3S)
934
c47ff5f1 935Note that "<&STDIN" makes a copy, but "<&=STDIN" makes
5a964f20 936an alias. That means if you close an aliased handle, all
197aec24 937aliases become inaccessible. This is not true with
5a964f20 938a copied one.
939
940Error checking, as always, has been left as an exercise for the reader.
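
For readers who would rather not do that exercise, here is a sketch of
both forms with the checks filled in. It uses lexical filehandles and
the three-argument open(), which assumes a reasonably modern Perl; the
temporary log file is a stand-in for whatever handle you want to
duplicate.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($log, $logname) = tempfile(UNLINK => 1);

# ">&" duplicates the descriptor: $copy gets a new descriptor of its own.
open(my $copy, '>&', $log) or die "can't dup log handle: $!";

# ">&=" aliases it: $alias reuses the descriptor number that $log has.
open(my $alias, '>&=', $log) or die "can't alias log handle: $!";

my $copy_is_new_fd  = fileno($copy)  != fileno($log) ? 1 : 0;
my $alias_shares_fd = fileno($alias) == fileno($log) ? 1 : 0;
print "copy has its own descriptor: $copy_is_new_fd\n";
print "alias shares the descriptor: $alias_shares_fd\n";
```

Comparing fileno() values is a quick way to check which of the two
behaviors you got.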
68dc0745 941
942=head2 How do I close a file descriptor by number?
943
944This should rarely be necessary, as the Perl close() function is to be
945used for things that Perl opened itself, even if it was a dup of a
a6dd486b 946numeric descriptor as with MHCONTEXT above. But if you really have
68dc0745 947to, you may be able to do this:
948
949 require 'sys/syscall.ph';
950 $rc = syscall(&SYS_close, $fd + 0); # must force numeric
951 die "can't sysclose $fd: $!" if $rc == -1;
952
a6dd486b 953Or, just use the fdopen(3S) feature of open():
d92eb7b0 954
197aec24 955 {
956     local *F;
d92eb7b0 957     open F, "<&=$fd" or die "Cannot reopen fd=$fd: $!";
958     close F;
959 }
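
Another option, not mentioned above, is the close() function in the
standard POSIX module, which takes a descriptor number directly. The
sketch below assumes a system where the POSIX module is available and
creates its own spare descriptor with POSIX::dup() so that no Perl
filehandle is left pointing at a closed descriptor.

```perl
use strict;
use warnings;
use POSIX ();

# Make a descriptor that no Perl filehandle owns, so closing it by
# number cannot confuse Perl's own I/O layer.
my $fd = POSIX::dup(fileno(STDOUT));
defined $fd or die "can't dup STDOUT: $!";

# POSIX::close() returns undef on failure and a true value on success.
my $rc = POSIX::close($fd);
defined $rc or die "can't close fd $fd: $!";

# Closing the same number again fails, which shows it really is gone.
my $second = POSIX::close($fd);
print "second close ", (defined $second ? "succeeded" : "failed"), "\n";
```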
960
883f1635 961=head2 Why can't I use "C:\temp\foo" in DOS paths? Why doesn't `C:\temp\foo.exe` work?
68dc0745 962
963Whoops! You just put a tab and a formfeed into that filename!
964Remember that within double quoted strings ("like\this"), the
965backslash is an escape character. The full list of these is in
966L<perlop/Quote and Quote-like Operators>. Unsurprisingly, you don't
967have a file called "c:(tab)emp(formfeed)oo" or
65acb1b1 968"c:(tab)emp(formfeed)oo.exe" on your legacy DOS filesystem.
68dc0745 969
970Either single-quote your strings, or (preferably) use forward slashes.
46fc3d4c 971Because all DOS and Windows versions since roughly MS-DOS 2.0
68dc0745 972have treated C</> and C<\> the same in a path, you might as well use the
a6dd486b 973one that doesn't clash with Perl--or the POSIX shell, ANSI C and C++,
65acb1b1 974awk, Tcl, Java, or Python, just to mention a few. POSIX paths
975are more portable, too.
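
A short sketch makes the damage visible. The path is the one from the
question; the variable names are only for illustration.

```perl
use strict;
use warnings;

my $double  = "C:\temp\foo";    # \t and \f are interpolated as tab and formfeed
my $single  = 'C:\temp\foo';    # single quotes keep the backslashes
my $forward = "C:/temp/foo";    # forward slashes need no escaping

printf "double-quoted has a tab:      %s\n", $double =~ /\t/ ? "yes" : "no";
printf "double-quoted has a formfeed: %s\n", $double =~ /\f/ ? "yes" : "no";
printf "single-quoted length: %d, double-quoted length: %d\n",
    length($single), length($double);
```

The double-quoted string comes out two characters shorter because each
backslash pair collapsed into a single control character.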
68dc0745 976
977=head2 Why doesn't glob("*.*") get all the files?
978
979Because even on non-Unix ports, Perl's glob function follows standard
46fc3d4c 980Unix globbing semantics. You'll need C<glob("*")> to get all (non-hidden)
65acb1b1 981files. This makes glob() portable even to legacy systems. Your
982port may include proprietary globbing functions as well. Check its
983documentation for details.
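
The following sketch shows the difference on a scratch directory. The
three file names are made up for the demonstration, and the directory is
created with File::Temp so nothing of yours is touched.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
chdir($dir) or die "can't chdir to $dir: $!";

# Create three empty files; only one of them has a dot in its name.
for my $name (qw(README notes.txt Makefile)) {
    open(my $fh, '>', $name) or die "can't create $name: $!";
    close($fh);
}

my @dotted = sort glob("*.*");   # only names containing a literal dot
my @all    = sort glob("*");     # every non-hidden name

chdir($start) or die "can't chdir back to $start: $!";
print "*.* matched: @dotted\n";
print "*   matched: @all\n";
```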
68dc0745 984
985=head2 Why does Perl let me delete read-only files? Why does C<-i> clobber protected files? Isn't this a bug in Perl?
986
06a5f41f 987This is elaborately and painstakingly described in the
988F<file-dir-perms> article in the "Far More Than You Ever Wanted To
49d635f9 989Know" collection in http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz .
68dc0745 990
991The executive summary: learn how your filesystem works. The
992permissions on a file say what can happen to the data in that file.
993The permissions on a directory say what can happen to the list of
994files in that directory. If you delete a file, you're removing its
995name from the directory (so the operation depends on the permissions
996of the directory, not of the file). If you try to write to the file,
997the permissions of the file govern whether you're allowed to.
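
You can watch this happen with a few lines of Perl. The sketch assumes
Unix-style permission semantics (other filesystems may behave
differently) and works in a scratch directory created by File::Temp; the
file name is hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/protected.txt";

open(my $fh, '>', $file) or die "can't create $file: $!";
print {$fh} "precious data\n";
close($fh) or die "can't close $file: $!";

chmod(0444, $file) or die "can't chmod $file: $!";   # file is now read-only

# unlink() edits the directory, which is still writable, so it succeeds.
my $removed = unlink($file);
print "unlink returned $removed; file ",
      (-e $file ? "still exists" : "is gone"), "\n";
```

To protect the file from deletion you would have to remove write
permission from the directory instead.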
998
999=head2 How do I select a random line from a file?
1000
1001Here's an algorithm from the Camel Book:
1002
1003 srand;
1004 rand($.) < 1 && ($line = $_) while <>;
1005
49d635f9 1006This has a significant advantage in space over reading the whole file
1007in. You can find a proof of this method in I<The Art of Computer
1008Programming>, Volume 2, Section 3.4.2, by Donald E. Knuth.
1009
1010You can use the File::Random module which provides a function
1011for that algorithm:
1012
1013 use File::Random qw/random_line/;
1014 my $line = random_line($filename);
1015
1016Another way is to use the Tie::File module, which treats the entire
1017file as an array. Simply access a random array element.
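
If the one-liner above is too terse, here is the same idea spelled out
as a subroutine that works on any filehandle instead of C<< <> >> and
C<$.>. The C<pick_random_line> name and the in-memory sample data are
assumptions made for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: pick one line uniformly at random from a
# filehandle, keeping only the current candidate in memory.
sub pick_random_line {
    my ($fh) = @_;
    my ($choice, $count);
    while (my $line = <$fh>) {
        $count++;
        # Replace the candidate with probability 1/$count.
        $choice = $line if rand($count) < 1;
    }
    return $choice;
}

my $data = "alpha\nbeta\ngamma\n";
open(my $in, '<', \$data) or die "can't open in-memory file: $!";
my $picked = pick_random_line($in);
close($in);
print "picked: $picked";
```

The first line is always taken (rand(1) is always less than 1), and
each later line replaces it with steadily shrinking probability.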
68dc0745 1018
65acb1b1 1019=head2 Why do I get weird spaces when I print an array of lines?
1020
1021Saying
1022
1023 print "@lines\n";
1024
1025joins together the elements of C<@lines> with a space between them.
1026If C<@lines> were C<("little", "fluffy", "clouds")> then the above
a6dd486b 1027statement would print
65acb1b1 1028
1029 little fluffy clouds
1030
1031but if each element of C<@lines> was a line of text, ending in a newline
1032character C<("little\n", "fluffy\n", "clouds\n")> then it would print:
1033
1034 little
1035  fluffy
1036  clouds
1037
1038If your array contains lines, just print them:
1039
1040 print @lines;
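
The sketch below captures both forms in strings so the extra spaces can
be counted. It assumes the special variables C<$"> and C<$,> still have
their default values.

```perl
use strict;
use warnings;

my @lines = ("little\n", "fluffy\n", "clouds\n");

my $interpolated = "@lines";          # elements joined with $" (a space by default)
my $joined       = join('', @lines);  # what print @lines outputs when $, is unset

print "interpolated has ", ($interpolated =~ tr/ //), " extra spaces\n";
print $joined;
```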
1041
68dc0745 1042=head1 AUTHOR AND COPYRIGHT
1043
0bc0ad85 1044Copyright (c) 1997-2002 Tom Christiansen and Nathan Torkington.
5a964f20 1045All rights reserved.
1046
5a7beb56 1047This documentation is free; you can redistribute it and/or modify it
1048under the same terms as Perl itself.
c8db1d39 1049
87275199 1050Irrespective of its distribution, all code examples here are in the public
c8db1d39 1051domain. You are permitted and encouraged to use this code and any
1052derivatives thereof in your own programs for fun or for profit as you
1053see fit. A simple comment in the code giving credit to the FAQ would
1054be courteous but is not required.