=head1 NAME

perlfaq5 - Files and Formats

=head1 DESCRIPTION

This section deals with I/O and the "f" issues: filehandles, flushing,
formats, and footers.

=head2 How do I flush/unbuffer an output filehandle? Why must I do this?
X<flush> X<buffer> X<unbuffer> X<autoflush>

(contributed by brian d foy)

You might like to read Mark Jason Dominus's "Suffering From Buffering"
at http://perl.plover.com/FAQs/Buffering.html .

Perl normally buffers output so it doesn't make a system call for every
bit of output. By saving up output, it makes fewer expensive system calls.
For instance, in this little bit of code, you want to print a dot to the
screen for every line you process to watch the progress of your program.
Instead of seeing a dot for every line, Perl buffers the output and you
have a long wait before you see a row of 50 dots all at once:

    # long wait, then row of dots all at once
    while( <> ) {
        print ".";
        print "\n" unless ++$count % 50;

        #... expensive line processing operations
    }

To get around this, you have to unbuffer the output filehandle, in this
case, C<STDOUT>. You can set the special variable C<$|> to a true value
(mnemonic: making your filehandles "piping hot"):

    $|++;

    # dot shown immediately
    while( <> ) {
        print ".";
        print "\n" unless ++$count % 50;

        #... expensive line processing operations
    }

The C<$|> is one of the per-filehandle special variables, so each
filehandle has its own copy of its value. If you want to merge
standard output and standard error for instance, you have to unbuffer
each (although STDERR might be unbuffered by default):

    {
        my $previous_default = select(STDOUT);  # save previous default
        $|++;                                   # autoflush STDOUT
        select(STDERR);
        $|++;                                   # autoflush STDERR, to be sure
        select($previous_default);              # restore previous default
    }

    # now should alternate . and +
    while( 1 )
    {
        sleep 1;
        print STDOUT ".";
        print STDERR "+";
        print STDOUT "\n" unless ++$count % 25;
    }

Besides the C<$|> special variable, you can use C<binmode> to give
your filehandle a C<:unix> layer, which is unbuffered:

    binmode( STDOUT, ":unix" );

    while( 1 ) {
        sleep 1;
        print ".";
        print "\n" unless ++$count % 50;
    }

For more information on output layers, see the entries for C<binmode>
and C<open> in L<perlfunc>, and the C<PerlIO> module documentation.

If you are using C<IO::Handle> or one of its subclasses, you can
call the C<autoflush> method to change the settings of the
filehandle:

    use IO::Handle;
    open my $io_fh, ">", "output.txt" or die "Can't write output.txt: $!";
    $io_fh->autoflush(1);

The C<IO::Handle> objects also have a C<flush> method. You can flush
the buffer any time you want without turning on autoflush:

    $io_fh->flush;

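The effect of buffering can be observed by checking a file's size
before and after turning on autoflush. The following is a minimal
sketch, assuming a scratch file under a temporary directory (the path
and the strings written are made up for the demonstration):

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempdir);

# Hypothetical scratch location; any writable path would do.
my $dir  = tempdir( CLEANUP => 1 );
my $path = "$dir/output.txt";

open my $fh, '>', $path or die "Can't write $path: $!";

print $fh "buffered\n";
my $size_before = ( stat $path )[7];   # data is still in Perl's buffer

$fh->autoflush(1);                     # turning autoflush on also flushes now
print $fh "flushed\n";
my $size_after = ( stat $path )[7];    # both lines have reached the file

close $fh or die "Can't close $path: $!";

print "before: $size_before bytes, after: $size_after bytes\n";
```

The first C<stat> sees an empty file because the short string is still
sitting in the buffer; once autoflush is on, every C<print> reaches the
file immediately.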
=head2 How do I change, delete, or insert a line in a file, or append to the beginning of a file?
X<file, editing>

(contributed by brian d foy)

The basic idea of inserting, changing, or deleting a line from a text
file involves reading and printing the file to the point you want to
make the change, making the change, then reading and printing the rest
of the file. Perl doesn't provide random access to lines (especially
since the record input separator, C<$/>, is mutable), although modules
such as C<Tie::File> can fake it.

A Perl program to do these tasks takes the basic form of opening a
file, printing its lines, then closing the file:

    open my $in,  '<', $file       or die "Can't read old file: $!";
    open my $out, '>', "$file.new" or die "Can't write new file: $!";

    while( <$in> )
    {
        print $out $_;
    }

    close $out;

Within that basic form, add the parts that you need to insert, change,
or delete lines.

To prepend lines to the beginning, print those lines before you enter
the loop that prints the existing lines.

    open my $in,  '<', $file       or die "Can't read old file: $!";
    open my $out, '>', "$file.new" or die "Can't write new file: $!";

    print $out "# Add this line to the top\n"; # <--- HERE'S THE MAGIC

    while( <$in> )
    {
        print $out $_;
    }

    close $out;

To change existing lines, insert the code to modify the lines inside
the C<while> loop. In this case, the code finds all lowercased
versions of "perl" and capitalizes them. This happens for every line, so
be sure that you're supposed to do that on every line!

    open my $in,  '<', $file       or die "Can't read old file: $!";
    open my $out, '>', "$file.new" or die "Can't write new file: $!";

    print $out "# Add this line to the top\n";

    while( <$in> )
    {
        s/\b(perl)\b/Perl/g;
        print $out $_;
    }

    close $out;

To change only a particular line, the input line number, C<$.>, is
useful. First read and print the lines up to the one you want to
change. Next, read the single line you want to change, change it, and
print it. After that, read the rest of the lines and print those:

    while( <$in> )   # print the lines before the change
    {
        print $out $_;
        last if $. == 4; # line number before change
    }

    my $line = <$in>;
    $line =~ s/\b(perl)\b/Perl/g;
    print $out $line;

    while( <$in> )   # print the rest of the lines
    {
        print $out $_;
    }

To skip lines, use the looping controls. The C<next> in this example
skips comment lines, and the C<last> stops all processing once it
encounters either C<__END__> or C<__DATA__>.

    while( <$in> )
    {
        next if /^\s*#/;             # skip comment lines
        last if /^__(END|DATA)__$/;  # stop at end of code marker
        print $out $_;
    }

Do the same sort of thing to delete a particular line by using C<next>
to skip the lines you don't want to show up in the output. This
example skips every fifth line:

    while( <$in> )
    {
        next unless $. % 5;
        print $out $_;
    }

If, for some odd reason, you really want to see the whole file at once
rather than processing line-by-line, you can slurp it in (as long as
you can fit the whole thing in memory!):

    open my $in,  '<', $file       or die "Can't read old file: $!";
    open my $out, '>', "$file.new" or die "Can't write new file: $!";

    my @lines = do { local $/; <$in> }; # slurp!

    # do your magic here

    print $out @lines;

Modules such as C<File::Slurp> and C<Tie::File> can help with that
too. If you can, however, avoid reading the entire file at once. Perl
won't give that memory back to the operating system until the process
finishes.

You can also use Perl one-liners to modify a file in-place. The
following changes the first 'Fred' on each line to 'Barney' in
F<inFile.txt> (add C</g> to change every occurrence), overwriting
the file with the new contents. With the C<-p> switch, Perl wraps a
C<while> loop around the code you specify with C<-e>, and C<-i> turns
on in-place editing. The current line is in C<$_>. With C<-p>, Perl
automatically prints the value of C<$_> at the end of the loop. See
L<perlrun> for more details.

    perl -pi -e 's/Fred/Barney/' inFile.txt

To make a backup of C<inFile.txt>, give C<-i> a file extension to add:

    perl -pi.bak -e 's/Fred/Barney/' inFile.txt

To change only the fifth line, you can add a test checking C<$.>, the
input line number, then only perform the operation when the test
passes:

    perl -pi -e 's/Fred/Barney/ if $. == 5' inFile.txt

To add lines before a certain line, you can add a line (or lines!)
before Perl prints C<$_>:

    perl -pi -e 'print "Put before third line\n" if $. == 3' inFile.txt

You can even add a line to the beginning of a file, since the current
line prints at the end of the loop:

    perl -pi -e 'print "Put before first line\n" if $. == 1' inFile.txt

To insert a line after one already in the file, use the C<-n> switch.
It's just like C<-p> except that it doesn't print C<$_> at the end of
the loop, so you have to do that yourself. In this case, print C<$_>
first, then print the line that you want to add.

    perl -ni -e 'print; print "Put after fifth line\n" if $. == 5' inFile.txt

To delete lines, only print the ones that you want. With C<-p>, a
C<next> does not suppress the automatic print, so empty C<$_> instead:

    perl -ni -e 'print unless /d/' inFile.txt

    ... or ...

    perl -pi -e '$_ = "" if /d/' inFile.txt

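The program-based examples above write to F<$file.new> but leave the
original in place. One way to finish the job is to rename the new file
over the old one once it has been written. This is only a sketch:
C<edit_file> is a hypothetical helper name, the sample file is made up,
and on some platforms C<rename> will not replace an existing file.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# edit_file() is a hypothetical helper, not part of any module.
# It copies $file to "$file.new", running $callback on each line
# (the callback edits $_ and may undefine it to delete the line),
# then renames the new file over the original.
sub edit_file {
    my( $file, $callback ) = @_;

    open my $in,  '<', $file       or die "Can't read old file: $!";
    open my $out, '>', "$file.new" or die "Can't write new file: $!";

    while( <$in> )
    {
        $callback->();                # may change or undefine $_
        print $out $_ if defined $_;
    }

    close $in;
    close $out                or die "Can't close new file: $!";
    rename "$file.new", $file or die "Can't replace old file: $!";
    return;
}

# Demonstration on a made-up three-line file.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/notes.txt";
open my $fh, '>', $file or die "Can't write $file: $!";
print $fh "learn perl\n", "delete me\n", "perl again\n";
close $fh or die "Can't close $file: $!";

edit_file( $file, sub {
    s/\bperl\b/Perl/g;       # change every line
    undef $_ if $. == 2;     # delete the second line
} );

open $fh, '<', $file or die "Can't read $file: $!";
my $result = do { local $/; <$fh> };
close $fh;
print $result;
```

Writing the whole new file before the rename means a failure part of
the way through leaves the original untouched.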
=head2 How do I count the number of lines in a file?
X<file, counting lines> X<lines> X<line>

One fairly efficient way is to count newlines in the file. The
following program uses a feature of tr///, as documented in L<perlop>.
If your text file doesn't end with a newline, then it's not really a
proper text file, so this may report one fewer line than you expect.

    $lines = 0;
    open(FILE, $filename) or die "Can't open `$filename': $!";
    while (sysread FILE, $buffer, 4096) {
        $lines += ($buffer =~ tr/\n//);
    }
    close FILE;

This assumes no funny games with newline translations.

=head2 How do I delete the last N lines from a file?
X<lines> X<file>

(contributed by brian d foy)

The easiest conceptual solution is to count the lines in the
file then start at the beginning and print the number of lines
(minus the last N) to a new file.

Most often, the real question is how you can delete the last N
lines without making more than one pass over the file, or how to
do it without a lot of copying. The easy concept is the hard reality when
you might have millions of lines in your file.

One trick is to use C<File::ReadBackwards>, which starts at the end of
the file. That module provides an object that wraps the real filehandle
to make it easy for you to move around the file. Once you get to the
spot you need, you can get the actual filehandle and work with it as
normal. In this case, you get the file position at the end of the last
line you want to keep and truncate the file to that point:

    use File::ReadBackwards;

    my $filename = 'test.txt';
    my $Lines_to_truncate = 2;

    my $bw = File::ReadBackwards->new( $filename )
        or die "Could not read backwards in [$filename]: $!";

    my $lines_from_end = 0;
    until( $bw->eof or $lines_from_end == $Lines_to_truncate )
    {
        print "Got: ", $bw->readline;
        $lines_from_end++;
    }

    truncate( $filename, $bw->tell );

The C<File::ReadBackwards> module also has the advantage of setting
the input record separator to a regular expression.

You can also use the C<Tie::File> module which lets you access
the lines through a tied array. You can use normal array operations
to modify your file, including setting the last index and using
C<splice>.

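The C<Tie::File> approach might look like the following sketch, which
shrinks the tied array by setting its last index. The sample file, its
contents, and the variable names are assumptions made for the
demonstration.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempdir);

# Build a small sample file; the path and contents are made up.
my $dir      = tempdir( CLEANUP => 1 );
my $filename = "$dir/test.txt";
open my $fh, '>', $filename or die "Can't write $filename: $!";
print $fh "line $_\n" for 1 .. 5;
close $fh or die "Can't close $filename: $!";

my $lines_to_truncate = 2;

tie my @lines, 'Tie::File', $filename
    or die "Could not tie [$filename]: $!";

# Work out how many lines survive, never going below zero.
my $keep = @lines - $lines_to_truncate;
$keep = 0 if $keep < 0;
$#lines = $keep - 1;      # shrinking the array shortens the file

untie @lines;

# Read the file back to show what is left.
open $fh, '<', $filename or die "Can't read $filename: $!";
my $remaining = do { local $/; <$fh> };
close $fh;
print $remaining;
```

This is convenient rather than fast: C<Tie::File> still has to find the
line boundaries, so for very large files the C<File::ReadBackwards>
technique is likely to do less work.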
=head2 How can I use Perl's C<-i> option from within a program?
X<-i> X<in-place>

C<-i> sets the value of Perl's C<$^I> variable, which in turn affects
the behavior of C<< <> >>; see L<perlrun> for more details. By
modifying the appropriate variables directly, you can get the same
behavior within a larger program. For example:

    # ...
    {
        local($^I, @ARGV) = ('.orig', glob("*.c"));
        while (<>) {
            if ($. == 1) {
                print "This line should appear at the top of each file\n";
            }
            s/\b(p)earl\b/${1}erl/i;        # Correct typos, preserving case
            print;
            close ARGV if eof;              # Reset $.
        }
    }
    # $^I and @ARGV return to their old values here

This block modifies all the C<.c> files in the current directory,
leaving a backup of the original data from each file in a new
C<.c.orig> file.

=head2 How can I copy a file?
X<copy> X<file, copy> X<File::Copy>

(contributed by brian d foy)

Use the C<File::Copy> module. It comes with Perl and can do a
true copy across file systems, and it does its magic in
a portable fashion.

    use File::Copy;

    copy( $original, $new_copy ) or die "Copy failed: $!";

If you can't use C<File::Copy>, you'll have to do the work yourself:
open the original file, open the destination file, then print
to the destination file as you read the original. You also have to
remember to copy the permissions, owner, and group to the new file.

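Doing the work yourself could look roughly like this. It is a sketch
under stated assumptions: C<copy_by_hand> is a hypothetical name, the
64K read size is arbitrary, and only the permission bits are carried
over, since changing owner and group usually needs extra privileges.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# copy_by_hand() is a hypothetical name for this sketch.
sub copy_by_hand {
    my( $original, $new_copy ) = @_;

    open my $in,  '<', $original or die "Can't read $original: $!";
    open my $out, '>', $new_copy or die "Can't write $new_copy: $!";
    binmode $in;                 # copy bytes, not translated text
    binmode $out;

    my $buffer;
    while( read $in, $buffer, 65536 ) {
        print $out $buffer or die "Can't write to $new_copy: $!";
    }

    close $in;
    close $out or die "Can't close $new_copy: $!";

    # Carry the permission bits over to the copy.
    my $mode = ( stat $original )[2] & 07777;
    chmod $mode, $new_copy or die "Can't chmod $new_copy: $!";
    return 1;
}

# Demonstration with made-up file names in a scratch directory.
my $dir = tempdir( CLEANUP => 1 );
my( $original, $new_copy ) = ( "$dir/original", "$dir/copy" );

open my $fh, '>', $original or die "Can't write $original: $!";
print $fh "some data\n";
close $fh or die "Can't close $original: $!";
chmod 0640, $original or die "Can't chmod $original: $!";

copy_by_hand( $original, $new_copy );
```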
=head2 How do I make a temporary file name?
X<file, temporary>

If you don't need to know the name of the file, you can use C<open()>
with C<undef> in place of the file name. In Perl 5.8 or later, the
C<open()> function creates an anonymous temporary file:

    open my $tmp, '+>', undef or die $!;

Otherwise, you can use the C<File::Temp> module.

    use File::Temp qw/ tempfile tempdir /;

    $dir = tempdir( CLEANUP => 1 );
    ($fh, $filename) = tempfile( DIR => $dir );

    # or if you don't need to know the filename

    $fh = tempfile( DIR => $dir );

C<File::Temp> has been a standard module since Perl 5.6.1. If you
don't have a modern enough Perl installed, use the C<new_tmpfile>
class method from the C<IO::File> module to get a filehandle opened for
reading and writing. Use it if you don't need to know the file's name:

    use IO::File;
    $fh = IO::File->new_tmpfile()
        or die "Unable to make new temporary file: $!";

If you're committed to creating a temporary file by hand, use the
process ID and/or the current time-value. If you need to have many
temporary files in one process, use a counter:

    BEGIN {
        use Fcntl;
        my $temp_dir  = -d '/tmp' ? '/tmp' : $ENV{TMPDIR} || $ENV{TEMP};
        my $base_name = sprintf "%s/%d-%d-0000", $temp_dir, $$, time;

        sub temp_file {
            local *FH;
            my $count = 0;
            until( defined(fileno(FH)) || $count++ > 100 ) {
                $base_name =~ s/-(\d+)$/"-" . (1 + $1)/e;
                # O_EXCL is required for security reasons.
                sysopen FH, $base_name, O_WRONLY|O_EXCL|O_CREAT;
            }

            if( defined fileno(FH) ) {
                return (*FH, $base_name);
            }
            else {
                return ();
            }
        }

    }

=head2 How can I manipulate fixed-record-length files?
X<fixed-length> X<file, fixed-length records>

The most efficient way is using L<pack()|perlfunc/"pack"> and
L<unpack()|perlfunc/"unpack">. This is faster than using
L<substr()|perlfunc/"substr"> when taking many, many strings. It is
slower for just a few.

Here is a sample chunk of code to break up and put back together again
some fixed-format input lines, in this case from the output of a normal,
Berkeley-style ps:

    # sample input line:
    # 15158 p5 T 0:00 perl /home/tchrist/scripts/now-what
    my $PS_T = 'A6 A4 A7 A5 A*';
    open my $ps, '-|', 'ps';
    print scalar <$ps>;
    my @fields = qw( pid tt stat time command );
    while (<$ps>) {
        my %process;
        @process{@fields} = unpack($PS_T, $_);
        for my $field ( @fields ) {
            print "$field: <$process{$field}>\n";
        }
        print 'line=', pack($PS_T, @process{@fields} ), "\n";
    }

We've used a hash slice in order to easily handle the fields of each row.
Storing the keys in an array means it's easy to operate on them as a
group or loop over them with for. It also avoids polluting the program
with global variables and using symbolic references.

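The same round trip can be tried without depending on C<ps> output.
In this small sketch the record layout (an 8-character name, a
4-character code, then free-form text) and the field names are invented
for illustration:

```perl
use strict;
use warnings;

# Hypothetical record layout: 8-character name, 4-character code,
# then free-form text for the rest of the record.
my $template = 'A8 A4 A*';
my @fields   = qw( name code note );

# pack() pads each "A" field with spaces to its fixed width.
my $record = pack( $template, 'alice', 'X1', 'first entry' );

# unpack() splits the record again and strips the padding.
my %row;
@row{@fields} = unpack( $template, $record );

print "record=<$record>\n";
print "$_: <$row{$_}>\n" for @fields;
```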
=head2 How can I make a filehandle local to a subroutine? How do I pass filehandles between subroutines? How do I make an array of filehandles?
X<filehandle, local> X<filehandle, passing> X<filehandle, reference>

As of perl5.6, open() autovivifies file and directory handles
as references if you pass it an uninitialized scalar variable.
You can then pass these references just like any other scalar,
and use them in the place of named handles.

    open my    $fh, $file_name;

    open local $fh, $file_name;

    print $fh "Hello World!\n";

    process_file( $fh );

If you like, you can store these filehandles in an array or a hash.
If you access them directly, they aren't simple scalars and you
need to give C<print> a little help by placing the filehandle
reference in braces. Perl can only figure it out on its own when
the filehandle reference is a simple scalar.

    my @fhs = ( $fh1, $fh2, $fh3 );

    for( $i = 0; $i <= $#fhs; $i++ ) {
        print {$fhs[$i]} "just another Perl answer, \n";
    }

Before perl5.6, you had to deal with various typeglob idioms
which you may see in older code.

    open FILE, "> $filename";
    process_typeglob(   *FILE );
    process_reference( \*FILE );

    sub process_typeglob  { local *FH = shift; print FH  "Typeglob!" }
    sub process_reference { local $fh = shift; print $fh "Reference!" }

If you want to create many anonymous handles, you should
check out the Symbol or IO::Handle modules.

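Here is one way the pieces might fit together: lexical handles kept in
a hash, one passed to a subroutine and one printed to through a block.
The subroutine name C<write_greeting>, the hash keys, and the use of
in-memory files are all assumptions chosen to keep the sketch
self-contained.

```perl
use strict;
use warnings;

# write_greeting() is a hypothetical subroutine; it only needs a
# scalar holding a filehandle.
sub write_greeting {
    my( $fh, $name ) = @_;
    print $fh "Hello, $name!\n";
    return;
}

# In-memory files stand in for real ones; real code would open
# named files instead.
my %buffer = ( log => '', report => '' );
my %fh_for;
for my $name ( sort keys %buffer ) {
    open $fh_for{$name}, '>', \$buffer{$name}
        or die "Can't open in-memory file for $name: $!";
}

write_greeting( $fh_for{log}, 'log' );        # pass one handle along
print { $fh_for{report} } "report line\n";    # braces around the hash element

close $_ for values %fh_for;

print $buffer{log}, $buffer{report};
```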
=head2 How can I use a filehandle indirectly?
X<filehandle, indirect>

An indirect filehandle is the use of something other than a symbol
in a place where a filehandle is expected. Here are ways
to get indirect filehandles:

    $fh =   SOME_FH;       # bareword is strict-subs hostile
    $fh =  "SOME_FH";      # strict-refs hostile; same package only
    $fh =  *SOME_FH;       # typeglob
    $fh = \*SOME_FH;       # ref to typeglob (bless-able)
    $fh =  *SOME_FH{IO};   # blessed IO::Handle from *SOME_FH typeglob

Or, you can use the C<new> method from one of the IO::* modules to
create an anonymous filehandle, store that in a scalar variable,
and use it as though it were a normal filehandle.

    use IO::Handle;                     # 5.004 or higher
    $fh = IO::Handle->new();

Then use any of those as you would a normal filehandle. Anywhere that
Perl is expecting a filehandle, an indirect filehandle may be used
instead. An indirect filehandle is just a scalar variable that contains
a filehandle. Functions like C<print>, C<open>, C<seek>, or
the C<< <FH> >> diamond operator will accept either a named filehandle
or a scalar variable containing one:

    ($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR);
    print $ofh "Type it: ";
    $got = <$ifh>;
    print $efh "What was that: $got";

If you're passing a filehandle to a function, you can write
the function in two ways:

    sub accept_fh {
        my $fh = shift;
        print $fh "Sending to indirect filehandle\n";
    }

Or it can localize a typeglob and use the filehandle directly:

    sub accept_fh {
        local *FH = shift;
        print FH "Sending to localized filehandle\n";
    }

Both styles work with either objects or typeglobs of real filehandles.
(They might also work with strings under some circumstances, but this
is risky.)

    accept_fh(*STDOUT);
    accept_fh($handle);

In the examples above, we assigned the filehandle to a scalar variable
before using it. That is because only simple scalar variables, not
expressions or subscripts of hashes or arrays, can be used with
built-ins like C<print>, C<printf>, or the diamond operator. Using
something other than a simple scalar variable as a filehandle is
illegal and won't even compile:

    @fd = (*STDIN, *STDOUT, *STDERR);
    print $fd[1] "Type it: ";                           # WRONG
    $got = <$fd[0]>;                                    # WRONG
    print $fd[2] "What was that: $got";                 # WRONG

With C<print> and C<printf>, you get around this by using a block and
an expression where you would place the filehandle:

    print  { $fd[1] } "funny stuff\n";
    printf { $fd[1] } "Pity the poor %x.\n", 3_735_928_559;
    # Pity the poor deadbeef.

That block is a proper block like any other, so you can put more
complicated code there. This sends the message out to one of two places:

    $ok = -x "/bin/cat";
    print { $ok ? $fd[1] : $fd[2] } "cat stat $ok\n";
    print { $fd[ 1+ ($ok || 0) ]  } "cat stat $ok\n";

This approach of treating C<print> and C<printf> like object method
calls doesn't work for the diamond operator. That's because it's a
real operator, not just a function with a comma-less argument. Assuming
you've been storing typeglobs in your structure as we did above, you
can use the built-in function named C<readline> to read a record just
as C<< <> >> does. Given the initialization shown above for @fd, this
would work, but only because readline() requires a typeglob. It doesn't
work with objects or strings, which might be a bug we haven't fixed yet.

    $got = readline($fd[0]);

Let it be noted that the flakiness of indirect filehandles is not
related to whether they're strings, typeglobs, objects, or anything else.
It's the syntax of the fundamental operators. Playing the object
game doesn't help you at all here.

=head2 How can I set up a footer format to be used with write()?
X<footer>

There's no builtin way to do this, but L<perlform> has a couple of
techniques to make it possible for the intrepid hacker.

=head2 How can I write() into a string?
X<write, into a string>

See L<perlform/"Accessing Formatting Internals"> for an C<swrite()> function.

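For orientation, such a function can be built on C<formline> and the
format accumulator C<$^A>. The following is a sketch along the lines of
the one in L<perlform>, not a copy of it; the picture line and the
arguments are made up.

```perl
use strict;
use warnings;

# A sketch in the spirit of perlform's swrite(): formline() appends
# the formatted text to the accumulator $^A, which is then returned.
sub swrite {
    my( $picture, @args ) = @_;
    $^A = "";                     # start from an empty accumulator
    formline( $picture, @args );
    my $result = $^A;
    $^A = "";                     # leave the accumulator clean
    return $result;
}

# One left-justified and one right-justified field, five columns each.
my $string = swrite( '@<<<< @>>>>' . "\n", "ab", 42 );
print $string;
```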
=head2 How can I open a filehandle to a string?
X<string> X<open> X<IO::String> X<filehandle>

(contributed by Peter J. Holzer, hjp-usenet2@hjp.at)

Since Perl 5.8.0 a file handle referring to a string can be created by
calling open with a reference to that string instead of the filename.
This file handle can then be used to read from or write to the string:

    open(my $fh, '>', \$string) or die "Could not open string for writing";
    print $fh "foo\n";
    print $fh "bar\n";    # $string now contains "foo\nbar\n"

    # reopening the same handle closes it first
    open($fh, '<', \$string) or die "Could not open string for reading";
    my $x = <$fh>;    # $x now contains "foo\n"

With older versions of Perl, the C<IO::String> module provides similar
functionality.

=head2 How can I output my numbers with commas added?
X<number, commify>

(contributed by brian d foy and Benjamin Goldberg)

You can use L<Number::Format> to separate places in a number.
It handles locale information for those of you who want to insert
full stops instead (or anything else that they want to use,
really).

This subroutine will add commas to your number:

    sub commify {
        local $_ = shift;
        1 while s/^([-+]?\d+)(\d{3})/$1,$2/;
        return $_;
    }

This regex from Benjamin Goldberg will add commas to numbers:

    s/(^[-+]?\d+?(?=(?>(?:\d{3})+)(?!\d))|\G\d{3}(?=\d))/$1,/g;

It is easier to see with comments:

    s/(
        ^[-+]?             # beginning of number.
        \d+?               # first digits before first comma
        (?=                # followed by, (but not included in the match) :
            (?>(?:\d{3})+) # some positive multiple of three digits.
            (?!\d)         # an *exact* multiple, not x * 3 + 1 or whatever.
        )
        |                  # or:
        \G\d{3}            # after the last group, get three digits
        (?=\d)             # but they have to have more digits after them.
    )/$1,/xg;

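A quick way to convince yourself that both versions behave is to run
them on a few values. The sample numbers below are arbitrary:

```perl
use strict;
use warnings;

sub commify {
    local $_ = shift;
    1 while s/^([-+]?\d+)(\d{3})/$1,$2/;
    return $_;
}

# Try the subroutine on a short number, a long one, and a signed
# number with a fractional part.
my @formatted;
for my $number ( 999, 1234567, '-1234.5678' ) {
    push @formatted, commify( $number );
}
print "$_\n" for @formatted;

# The single-regex version, applied to a copy of the number.
( my $with_regex = '1234567' ) =~
    s/(^[-+]?\d+?(?=(?>(?:\d{3})+)(?!\d))|\G\d{3}(?=\d))/$1,/g;
print "$with_regex\n";
```

Digits after the decimal point are left alone because the pattern in
C<commify> is anchored to the start of the string and stops at the
first non-digit.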
=head2 How can I translate tildes (~) in a filename?
X<tilde> X<tilde expansion>

Use the E<lt>E<gt> (C<glob()>) operator, documented in L<perlfunc>.
Versions of Perl older than 5.6 require that you have a shell
installed that groks tildes. Later versions of Perl have this feature
built in. The C<File::KGlob> module (available from CPAN) gives more
portable glob functionality.

Within Perl, you may use this directly:

    $filename =~ s{
        ^ ~             # find a leading tilde
        (               # save this in $1
            [^/]        # a non-slash character
                *       # repeated 0 or more times (0 means me)
        )
    }{
        $1
            ? (getpwnam($1))[7]
            : ( $ENV{HOME} || $ENV{LOGDIR} )
    }ex;

=head2 How come when I open a file read-write it wipes it out?
X<clobber> X<read-write> X<clobbering> X<truncate> X<truncating>

Because you're using something like this, which truncates the file and
I<then> gives you read-write access:

    open(FH, "+> /path/name");     # WRONG (almost always)

Whoops. You should instead use this, which will fail if the file
doesn't exist.

    open(FH, "+< /path/name");     # open for update

Using ">" always clobbers or creates. Using "<" never does
either. The "+" doesn't change this.

Here are examples of many kinds of file opens. Those using sysopen()
all assume

    use Fcntl;

To open file for reading:

    open(FH, "< $path")                                 || die $!;
    sysopen(FH, $path, O_RDONLY)                        || die $!;

To open file for writing, create new file if needed or else truncate old file:

    open(FH, "> $path")                                 || die $!;
    sysopen(FH, $path, O_WRONLY|O_TRUNC|O_CREAT)        || die $!;
    sysopen(FH, $path, O_WRONLY|O_TRUNC|O_CREAT, 0666)  || die $!;

To open file for writing, create new file, file must not exist:

    sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT)         || die $!;
    sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT, 0666)   || die $!;

To open file for appending, create if necessary:

    open(FH, ">> $path")                                || die $!;
    sysopen(FH, $path, O_WRONLY|O_APPEND|O_CREAT)       || die $!;
    sysopen(FH, $path, O_WRONLY|O_APPEND|O_CREAT, 0666) || die $!;

To open file for appending, file must exist:

    sysopen(FH, $path, O_WRONLY|O_APPEND)               || die $!;

To open file for update, file must exist:

    open(FH, "+< $path")                                || die $!;
    sysopen(FH, $path, O_RDWR)                          || die $!;

To open file for update, create file if necessary:

    sysopen(FH, $path, O_RDWR|O_CREAT)                  || die $!;
    sysopen(FH, $path, O_RDWR|O_CREAT, 0666)            || die $!;

To open file for update, file must not exist:

    sysopen(FH, $path, O_RDWR|O_EXCL|O_CREAT)           || die $!;
    sysopen(FH, $path, O_RDWR|O_EXCL|O_CREAT, 0666)     || die $!;

To open a file without blocking, creating if necessary:

    sysopen(FH, "/foo/somefile", O_WRONLY|O_NDELAY|O_CREAT)
        or die "can't open /foo/somefile: $!";

Be warned that neither creation nor deletion of files is guaranteed to
be an atomic operation over NFS. That is, two processes might both
successfully create or unlink the same file! Therefore O_EXCL
isn't as exclusive as you might wish.

See also the new L<perlopentut> if you have it (new for 5.6).

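The difference between C<< +< >> and C<< +> >> is easy to see on a
scratch file. The sketch below uses the three-argument form of
C<open>; the file name and contents are made up for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch file for the demonstration; the name is arbitrary.
my $dir  = tempdir( CLEANUP => 1 );
my $path = "$dir/name";

open my $fh, '>', $path or die "Can't create $path: $!";
print $fh "precious data\n";
close $fh or die "Can't close $path: $!";

# "+<" opens for update and leaves the contents alone.
open $fh, '+<', $path or die "Can't open $path for update: $!";
my $kept = <$fh>;
close $fh;

# "+>" truncates first, so there is nothing left to read.
open $fh, '+>', $path or die "Can't clobber $path: $!";
my $after_clobber = <$fh>;
close $fh;

print "kept: $kept";
print "after clobber: ",
    ( defined $after_clobber ? $after_clobber : "(nothing)\n" );
```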
=head2 Why do I sometimes get an "Argument list too long" when I use E<lt>*E<gt>?
X<argument list too long>

The C<< <> >> operator performs a globbing operation (see above).
In Perl versions earlier than v5.6.0, the internal glob() operator forks
csh(1) to do the actual glob expansion, but
csh can't handle more than 127 items and so gives the error message
C<Argument list too long>. People who installed tcsh as csh won't
have this problem, but their users may be surprised by it.

To get around this, either upgrade to Perl v5.6.0 or later, do the glob
yourself with readdir() and patterns, or use a module like File::KGlob,
one that doesn't use the shell to do globbing.

=head2 Is there a leak/bug in glob()?
X<glob>

(contributed by brian d foy)

Starting with Perl 5.6.0, C<glob> is implemented internally rather
than relying on an external resource. As such, memory issues with
C<glob> aren't a problem in modern perls.

=head2 How can I open a file with a leading ">" or trailing blanks?
X<filename, special characters>

(contributed by Brian McCauley)

The special two argument form of Perl's open() function ignores
trailing blanks in filenames and infers the mode from certain leading
characters (or a trailing "|"). In older versions of Perl this was the
only version of open() and so it is prevalent in old code and books.

Unless you have a particular reason to use the two argument form you
should use the three argument form of open() which does not treat any
characters in the filename as special.

    open FILE, "<", " file ";  # filename is " file "
    open FILE, ">", ">file";   # filename is ">file"

=head2 How can I reliably rename a file?
X<rename> X<mv> X<move> X<file, rename>

If your operating system supports a proper mv(1) utility or its
functional equivalent, this works:

    rename($old, $new) or system("mv", $old, $new);

It may be more portable to use the C<File::Copy> module instead.
You just copy the file to the new name (checking return
values), then delete the old one. This isn't really the same
semantically as a C<rename()>, which preserves meta-information like
permissions, timestamps, inode info, etc.

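C<File::Copy> also exports a C<move> function that bundles those steps:
it tries a plain rename first and falls back to copying and deleting.
A minimal sketch, with made-up file names in a scratch directory:

```perl
use strict;
use warnings;
use File::Copy qw(move);
use File::Temp qw(tempdir);

# Made-up file names inside a scratch directory.
my $dir = tempdir( CLEANUP => 1 );
my( $old, $new ) = ( "$dir/old.txt", "$dir/new.txt" );

open my $fh, '>', $old or die "Can't create $old: $!";
print $fh "contents\n";
close $fh or die "Can't close $old: $!";

# move() renames when it can, otherwise copies and then deletes.
move( $old, $new ) or die "Can't move $old to $new: $!";

print "old exists: ", ( -e $old ? "yes" : "no" ), "\n";
print "new exists: ", ( -e $new ? "yes" : "no" ), "\n";
```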
811=head2 How can I lock a file?
d74e8afc 812X<lock> X<file, lock> X<flock>
68dc0745 813
54310121 814Perl's builtin flock() function (see L<perlfunc> for details) will call
68dc0745 815flock(2) if that exists, fcntl(2) if it doesn't (on perl version 5.004 and
816later), and lockf(3) if neither of the two previous system calls exists.
817On some systems, it may even use a different form of native locking.
818Here are some gotchas with Perl's flock():
819
820=over 4
821
822=item 1
823
824Produces a fatal error if none of the three system calls (or their
825close equivalent) exists.
826
827=item 2
828
829lockf(3) does not provide shared locking, and requires that the
830filehandle be open for writing (or appending, or read/writing).
831
832=item 3
833
d92eb7b0 834Some versions of flock() can't lock files over a network (e.g. on NFS file
835systems), so you'd need to force the use of fcntl(2) when you build Perl.
a6dd486b 836But even this is dubious at best. See the flock entry of L<perlfunc>
d92eb7b0 837and the F<INSTALL> file in the source distribution for information on
838building Perl to do this.
839
840Two potentially non-obvious but traditional flock semantics are that
a6dd486b 841it waits indefinitely until the lock is granted, and that its locks are
d92eb7b0 842I<merely advisory>. Such discretionary locks are more flexible, but
843offer fewer guarantees. This means that files locked with flock() may
844be modified by programs that do not also use flock(). Cars that stop
845for red lights get on well with each other, but not with cars that don't
846stop for red lights. See the perlport manpage, your port's specific
847documentation, or your system-specific local manpages for details. It's
848best to assume traditional behavior if you're writing portable programs.
a6dd486b 849(If you're not, you should as always feel perfectly free to write
d92eb7b0 850for your own system's idiosyncrasies (sometimes called "features").
851Slavish adherence to portability concerns shouldn't get in the way of
852your getting your job done.)
68dc0745 853
197aec24 854For more information on file locking, see also
13a2d996 855L<perlopentut/"File Locking"> if you have it (new for 5.6).
65acb1b1 856
68dc0745 857=back
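Since C<flock()> normally waits indefinitely, a program that would rather give up than block can add C<LOCK_NB> to the request. The following is only a sketch under that assumption; C<try_lock> is a hypothetical helper name, and the lock remains advisory as described above.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# try_lock() is a hypothetical helper name used only for this sketch.
# It returns a filehandle holding an exclusive advisory lock, or undef
# if the lock could not be taken immediately.
sub try_lock {
    my ($path) = @_;
    open my $fh, '>>', $path or die "can't open $path: $!";
    return flock($fh, LOCK_EX | LOCK_NB) ? $fh : undef;
}
```

The lock is released when the returned filehandle is closed or goes out of scope, so the caller has to keep it around for as long as the lock is needed.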
858
04d666b1 859=head2 Why can't I just open(FH, "E<gt>file.lock")?
d74e8afc 860X<lock, lockfile race condition>
68dc0745 861
862A common bit of code B<NOT TO USE> is this:
863
500071f4 864 sleep(3) while -e "file.lock"; # PLEASE DO NOT USE
865 open(LCK, "> file.lock"); # THIS BROKEN CODE
68dc0745 866
867This is a classic race condition: you take two steps to do something
868which must be done in one. That's why computer hardware provides an
869atomic test-and-set instruction. In theory, this "ought" to work:
870
500071f4 871 sysopen(FH, "file.lock", O_WRONLY|O_EXCL|O_CREAT)
9b55d3ab 872 or die "can't open file.lock: $!";
68dc0745 873
874except that lamentably, file creation (and deletion) is not atomic
875over NFS, so this won't work (at least, not every time) over the net.
65acb1b1 876Various schemes involving link() have been suggested, but
c195e131 877these tend to involve busy-wait, which is also less than desirable.
68dc0745 878
fc36a67e 879=head2 I still don't get locking. I just want to increment the number in the file. How can I do this?
d74e8afc 880X<counter> X<file, counter>
68dc0745 881
Didn't anyone ever tell you web-page hit counters were useless?
They don't count the number of hits, they're a waste of time, and they serve
only to stroke the writer's vanity. It's better to pick a random number;
it's more realistic.
68dc0745 886
5a964f20 887Anyway, this is what you can do if you can't help yourself.
68dc0745 888
500071f4 889 use Fcntl qw(:DEFAULT :flock);
890 sysopen(FH, "numfile", O_RDWR|O_CREAT) or die "can't open numfile: $!";
891 flock(FH, LOCK_EX) or die "can't flock numfile: $!";
892 $num = <FH> || 0;
893 seek(FH, 0, 0) or die "can't rewind numfile: $!";
894 truncate(FH, 0) or die "can't truncate numfile: $!";
895 (print FH $num+1, "\n") or die "can't write numfile: $!";
896 close FH or die "can't close numfile: $!";
68dc0745 897
46fc3d4c 898Here's a much better web-page hit counter:
68dc0745 899
500071f4 900 $hits = int( (time() - 850_000_000) / rand(1_000) );
68dc0745 901
902If the count doesn't impress your friends, then the code might. :-)
903
f52f3be2 904=head2 All I want to do is append a small amount of text to the end of a file. Do I still have to use locking?
d74e8afc 905X<append> X<file, append>
05caf3a7 906
If you are on a system that correctly implements C<flock> and you use
the example appending code from "perldoc -f flock", everything will be
OK even if the OS you are on doesn't implement append mode correctly
(if such a system exists). So if you are happy to restrict yourself to
OSs that implement C<flock> (and that's not really much of a
restriction), then that is what you should do.
05caf3a7 913
914If you know you are only going to use a system that does correctly
109f0441 915implement appending (i.e. not Win32) then you can omit the C<seek>
916from the code in the previous answer.
917
918If you know you are only writing code to run on an OS and filesystem
919that does implement append mode correctly (a local filesystem on a
920modern Unix for example), and you keep the file in block-buffered mode
921and you write less than one buffer-full of output between each manual
922flushing of the buffer then each bufferload is almost guaranteed to be
923written to the end of the file in one chunk without getting
924intermingled with anyone else's output. You can also use the
925C<syswrite> function which is simply a wrapper around your system's
926C<write(2)> system call.
05caf3a7 927
928There is still a small theoretical chance that a signal will interrupt
109f0441 929the system level C<write()> operation before completion. There is also
930a possibility that some STDIO implementations may call multiple system
931level C<write()>s even if the buffer was empty to start. There may be
932some systems where this probability is reduced to zero, and this is
933not a concern when using C<:perlio> instead of your system's STDIO.
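For reference, the locked-append pattern discussed above can be sketched roughly as follows. It is modeled on the kind of example shown in "perldoc -f flock" rather than copied from it, and C<append_line> is a hypothetical helper name.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_END);

# append_line() is a hypothetical helper name used only for this sketch.
sub append_line {
    my ($path, $text) = @_;
    open my $fh, '>>', $path or die "can't open $path: $!";
    flock($fh, LOCK_EX)      or die "can't lock $path: $!";
    # Someone may have appended while we waited for the lock, so
    # move to the end explicitly before writing.
    seek($fh, 0, SEEK_END)   or die "can't seek $path: $!";
    print {$fh} $text, "\n"  or die "can't write $path: $!";
    # close() flushes the buffer and releases the lock.
    close $fh                or die "can't close $path: $!";
    return 1;
}
```

On systems that implement append mode correctly, the C<seek> is redundant but harmless.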
05caf3a7 934
68dc0745 935=head2 How do I randomly update a binary file?
d74e8afc 936X<file, binary patch>
68dc0745 937
938If you're just trying to patch a binary, in many cases something as
939simple as this works:
940
500071f4 941 perl -i -pe 's{window manager}{window mangler}g' /usr/bin/emacs
68dc0745 942
943However, if you have fixed sized records, then you might do something more
944like this:
945
500071f4 946 $RECSIZE = 220; # size of record, in bytes
947 $recno = 37; # which record to update
948 open(FH, "+<somewhere") || die "can't update somewhere: $!";
949 seek(FH, $recno * $RECSIZE, 0);
950 read(FH, $record, $RECSIZE) == $RECSIZE || die "can't read record $recno: $!";
951 # munge the record
952 seek(FH, -$RECSIZE, 1);
953 print FH $record;
954 close FH;
68dc0745 955
956Locking and error checking are left as an exercise for the reader.
a6dd486b 957Don't forget them or you'll be quite sorry.
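One way the exercise might be approached is sketched below, with locking and error checks added to the fixed-size record update. The helper name C<update_record> and the callback argument C<$munge> are hypothetical, introduced only for this illustration.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_SET);

# update_record() is a hypothetical helper; $munge is a code reference
# that receives the old record and returns a new one of the same size.
sub update_record {
    my ($path, $recsize, $recno, $munge) = @_;
    open my $fh, '+<', $path or die "can't update $path: $!";
    binmode $fh;
    flock($fh, LOCK_EX) or die "can't lock $path: $!";
    seek($fh, $recno * $recsize, SEEK_SET) or die "can't seek $path: $!";
    my $got = read($fh, my $record, $recsize);
    die "can't read record $recno: $!" unless defined $got;
    die "short read on record $recno"  unless $got == $recsize;
    $record = $munge->($record);
    die "munged record has wrong size" unless length($record) == $recsize;
    # Go back to the start of the record before writing it out.
    seek($fh, $recno * $recsize, SEEK_SET) or die "can't seek $path: $!";
    print {$fh} $record or die "can't write $path: $!";
    close $fh or die "can't close $path: $!";
    return 1;
}
```

The size check after munging matters: a longer or shorter record would silently corrupt its neighbours.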
68dc0745 958
68dc0745 959=head2 How do I get a file's timestamp in perl?
d74e8afc 960X<timestamp> X<file, timestamp>
68dc0745 961
589a5df2 962If you want to retrieve the time at which the file was last read,
963written, or had its meta-data (owner, etc) changed, you use the B<-A>,
964B<-M>, or B<-C> file test operations as documented in L<perlfunc>.
965These retrieve the age of the file (measured against the start-time of
966your program) in days as a floating point number. Some platforms may
967not have all of these times. See L<perlport> for details. To retrieve
968the "raw" time in seconds since the epoch, you would call the stat
969function, then use C<localtime()>, C<gmtime()>, or
970C<POSIX::strftime()> to convert this into human-readable form.
68dc0745 971
972Here's an example:
973
500071f4 974 $write_secs = (stat($file))[9];
975 printf "file %s updated at %s\n", $file,
c8db1d39 976 scalar localtime($write_secs);
68dc0745 977
978If you prefer something more legible, use the File::stat module
979(part of the standard distribution in version 5.004 and later):
980
500071f4 981 # error checking left as an exercise for reader.
982 use File::stat;
983 use Time::localtime;
984 $date_string = ctime(stat($file)->mtime);
985 print "file $file updated at $date_string\n";
68dc0745 986
65acb1b1 987The POSIX::strftime() approach has the benefit of being,
988in theory, independent of the current locale. See L<perllocale>
989for details.
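The C<POSIX::strftime()> approach mentioned above could look something like this sketch. The helper name C<mtime_string> and the chosen format string are assumptions made for illustration.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# mtime_string() is a hypothetical helper name used only for this sketch.
sub mtime_string {
    my ($file) = @_;
    # stat() returns an empty list on failure.
    my @st = stat($file) or die "can't stat $file: $!";
    # Element 9 of the stat list is the last-modified time in epoch seconds.
    return strftime("%Y-%m-%d %H:%M:%S", localtime($st[9]));
}
```

Swapping C<localtime> for C<gmtime> gives the same timestamp in UTC.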
68dc0745 990
991=head2 How do I set a file's timestamp in perl?
d74e8afc 992X<timestamp> X<file, timestamp>
68dc0745 993
994You use the utime() function documented in L<perlfunc/utime>.
995By way of example, here's a little program that copies the
996read and write times from its first argument to all the rest
997of them.
998
500071f4 999 if (@ARGV < 2) {
1000 die "usage: cptimes timestamp_file other_files ...\n";
1001 }
1002 $timestamp = shift;
1003 ($atime, $mtime) = (stat($timestamp))[8,9];
1004 utime $atime, $mtime, @ARGV;
68dc0745 1005
65acb1b1 1006Error checking is, as usual, left as an exercise for the reader.
68dc0745 1007
19a1cd16 1008The perldoc for utime also has an example that has the same
1009effect as touch(1) on files that I<already exist>.
1010
Certain file systems have a limited ability to store the times
on a file at the expected level of precision. For example, the
FAT and HPFS filesystems are unable to create dates on files with
a finer granularity than two seconds. This is a limitation of
the filesystems, not of utime().
68dc0745 1016
1017=head2 How do I print to more than one file at once?
d74e8afc 1018X<print, to multiple files>
68dc0745 1019
49d635f9 1020To connect one filehandle to several output filehandles,
1021you can use the IO::Tee or Tie::FileHandle::Multiplex modules.
68dc0745 1022
49d635f9 1023If you only have to do this once, you can print individually
1024to each filehandle.
68dc0745 1025
500071f4 1026 for $fh (FH1, FH2, FH3) { print $fh "whatever\n" }
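The same loop also works with lexical filehandles, which is friendlier to C<use strict>. This is a minimal sketch; C<print_to_all> is a hypothetical helper name.

```perl
use strict;
use warnings;

# print_to_all() is a hypothetical helper name used only for this sketch.
# It takes a reference to an array of open filehandles plus the text.
sub print_to_all {
    my ($handles, @text) = @_;
    for my $fh (@$handles) {
        print {$fh} @text or die "can't write: $!";
    }
    return 1;
}
```

A call might look like C<print_to_all([$log, \*STDOUT], "whatever\n")>, where C<$log> is a filehandle you opened earlier.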
5a964f20 1027
49d635f9 1028=head2 How can I read in an entire file all at once?
d74e8afc 1029X<slurp> X<file, slurping>
68dc0745 1030
49d635f9 1031You can use the File::Slurp module to do it in one step.
68dc0745 1032
49d635f9 1033 use File::Slurp;
197aec24 1034
49d635f9 1035 $all_of_it = read_file($filename); # entire file in scalar
109f0441 1036 @all_lines = read_file($filename); # one line per element
d92eb7b0 1037
1038The customary Perl approach for processing all the lines in a file is to
1039do so one line at a time:
1040
500071f4 1041 open (INPUT, $file) || die "can't open $file: $!";
1042 while (<INPUT>) {
1043 chomp;
1044 # do something with $_
1045 }
1046 close(INPUT) || die "can't close $file: $!";
d92eb7b0 1047
1048This is tremendously more efficient than reading the entire file into
1049memory as an array of lines and then processing it one element at a time,
a6dd486b 1050which is often--if not almost always--the wrong approach. Whenever
d92eb7b0 1051you see someone do this:
1052
500071f4 1053 @lines = <INPUT>;
d92eb7b0 1054
you should think long and hard about why you need everything loaded at
once. It's just not a scalable solution. You might also find it more
fun to use the standard Tie::File module, or the DB_File module's
$DB_RECNO bindings, which allow you to tie an array to a file so that
accessing an element of the array actually accesses the corresponding
line in the file.
d92eb7b0 1061
f05bbc40 1062You can read the entire filehandle contents into a scalar.
d92eb7b0 1063
500071f4 1064 {
d92eb7b0 1065 local(*INPUT, $/);
1066 open (INPUT, $file) || die "can't open $file: $!";
1067 $var = <INPUT>;
500071f4 1068 }
d92eb7b0 1069
197aec24 1070That temporarily undefs your record separator, and will automatically
d92eb7b0 1071close the file at block exit. If the file is already open, just use this:
1072
500071f4 1073 $var = do { local $/; <INPUT> };
d92eb7b0 1074
f05bbc40 1075For ordinary files you can also use the read function.
1076
1077 read( INPUT, $var, -s INPUT );
1078
1079The third argument tests the byte size of the data on the INPUT filehandle
1080and reads that many bytes into the buffer $var.
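If you do need the whole file in a scalar, the record-separator technique above can be wrapped up with a lexical filehandle and the three-argument C<open>. This is only a sketch, and C<slurp> is a hypothetical helper name.

```perl
use strict;
use warnings;

# slurp() is a hypothetical helper name used only for this sketch.
sub slurp {
    my ($file) = @_;
    open my $fh, '<', $file or die "can't open $file: $!";
    local $/;    # undef record separator: one read returns the whole file
    my $content = <$fh>;
    close $fh or die "can't close $file: $!";
    return $content;
}
```

The C<local> keeps the change to C<$/> confined to the subroutine, so the caller's line-by-line reads are unaffected.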
1081
68dc0745 1082=head2 How can I read in a file by paragraphs?
d74e8afc 1083X<file, reading by paragraphs>
68dc0745 1084
65acb1b1 1085Use the C<$/> variable (see L<perlvar> for details). You can either
68dc0745 1086set it to C<""> to eliminate empty paragraphs (C<"abc\n\n\n\ndef">,
1087for instance, gets treated as two paragraphs and not three), or
1088C<"\n\n"> to accept empty paragraphs.
1089
197aec24 1090Note that a blank line must have no blanks in it. Thus
c4db748a 1091S<C<"fred\n \nstuff\n\n">> is one paragraph, but C<"fred\n\nstuff\n\n"> is two.
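The difference between the two settings of C<$/> can be seen with a small sketch like the following. The helper name C<read_paragraphs> is hypothetical.

```perl
use strict;
use warnings;

# read_paragraphs() is a hypothetical helper name used only for this sketch.
# $separator is "" for paragraph mode (runs of blank lines collapse)
# or "\n\n" to treat every pair of newlines as a terminator.
sub read_paragraphs {
    my ($fh, $separator) = @_;
    local $/ = $separator;
    my @paragraphs = <$fh>;
    return @paragraphs;
}
```

Fed the text C<"abc\n\n\n\ndef\n">, paragraph mode should yield two records, while C<"\n\n"> should yield three, the middle one being empty apart from its newlines.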
65acb1b1 1092
68dc0745 1093=head2 How can I read a single character from a file? From the keyboard?
d74e8afc 1094X<getc> X<file, reading one character at a time>
68dc0745 1095
1096You can use the builtin C<getc()> function for most filehandles, but
1097it won't (easily) work on a terminal device. For STDIN, either use
a6dd486b 1098the Term::ReadKey module from CPAN or use the sample code in
68dc0745 1099L<perlfunc/getc>.
1100
65acb1b1 1101If your system supports the portable operating system programming
1102interface (POSIX), you can use the following code, which you'll note
1103turns off echo processing as well.
68dc0745 1104
500071f4 1105 #!/usr/bin/perl -w
1106 use strict;
1107 $| = 1;
1108 for (1..4) {
1109 my $got;
1110 print "gimme: ";
1111 $got = getone();
1112 print "--> $got\n";
1113 }
68dc0745 1114 exit;
1115
500071f4 1116 BEGIN {
68dc0745 1117 use POSIX qw(:termios_h);
1118
1119 my ($term, $oterm, $echo, $noecho, $fd_stdin);
1120
1121 $fd_stdin = fileno(STDIN);
1122
1123 $term = POSIX::Termios->new();
1124 $term->getattr($fd_stdin);
1125 $oterm = $term->getlflag();
1126
1127 $echo = ECHO | ECHOK | ICANON;
1128 $noecho = $oterm & ~$echo;
1129
1130 sub cbreak {
500071f4 1131 $term->setlflag($noecho);
1132 $term->setcc(VTIME, 1);
1133 $term->setattr($fd_stdin, TCSANOW);
1134 }
ac9dac7f 1135
68dc0745 1136 sub cooked {
500071f4 1137 $term->setlflag($oterm);
1138 $term->setcc(VTIME, 0);
1139 $term->setattr($fd_stdin, TCSANOW);
1140 }
68dc0745 1141
1142 sub getone {
500071f4 1143 my $key = '';
1144 cbreak();
1145 sysread(STDIN, $key, 1);
1146 cooked();
1147 return $key;
1148 }
68dc0745 1149
500071f4 1150 }
68dc0745 1151
500071f4 1152 END { cooked() }
68dc0745 1153
The Term::ReadKey module from CPAN may be easier to use. Recent versions
also include support for non-portable systems.
68dc0745 1156
500071f4 1157 use Term::ReadKey;
1158 open(TTY, "</dev/tty");
1159 print "Gimme a char: ";
1160 ReadMode "raw";
1161 $key = ReadKey 0, *TTY;
1162 ReadMode "normal";
1163 printf "\nYou said %s, char number %03d\n",
1164 $key, ord $key;
68dc0745 1165
65acb1b1 1166=head2 How can I tell whether there's a character waiting on a filehandle?
68dc0745 1167
5a964f20 1168The very first thing you should do is look into getting the Term::ReadKey
65acb1b1 1169extension from CPAN. As we mentioned earlier, it now even has limited
1170support for non-portable (read: not open systems, closed, proprietary,
589a5df2 1171not POSIX, not Unix, etc.) systems.
5a964f20 1172
1173You should also check out the Frequently Asked Questions list in
68dc0745 1174comp.unix.* for things like this: the answer is essentially the same.
1175It's very system dependent. Here's one solution that works on BSD
1176systems:
1177
500071f4 1178 sub key_ready {
1179 my($rin, $nfd);
1180 vec($rin, fileno(STDIN), 1) = 1;
1181 return $nfd = select($rin,undef,undef,0);
1182 }
68dc0745 1183
65acb1b1 1184If you want to find out how many characters are waiting, there's
1185also the FIONREAD ioctl call to be looked at. The I<h2ph> tool that
1186comes with Perl tries to convert C include files to Perl code, which
1187can be C<require>d. FIONREAD ends up defined as a function in the
1188I<sys/ioctl.ph> file:
68dc0745 1189
500071f4 1190 require 'sys/ioctl.ph';
68dc0745 1191
500071f4 1192 $size = pack("L", 0);
1193 ioctl(FH, FIONREAD(), $size) or die "Couldn't call ioctl: $!\n";
1194 $size = unpack("L", $size);
68dc0745 1195
5a964f20 1196If I<h2ph> wasn't installed or doesn't work for you, you can
1197I<grep> the include files by hand:
68dc0745 1198
500071f4 1199 % grep FIONREAD /usr/include/*/*
1200 /usr/include/asm/ioctls.h:#define FIONREAD 0x541B
68dc0745 1201
5a964f20 1202Or write a small C program using the editor of champions:
68dc0745 1203
    % cat > fionread.c
    #include <stdio.h>
    #include <sys/ioctl.h>
    int main(void) {
        printf("%#08x\n", FIONREAD);
        return 0;
    }
1209 ^D
1210 % cc -o fionread fionread.c
1211 % ./fionread
1212 0x4004667f
5a964f20 1213
8305e449 1214And then hard code it, leaving porting as an exercise to your successor.
5a964f20 1215
500071f4 1216 $FIONREAD = 0x4004667f; # XXX: opsys dependent
5a964f20 1217
500071f4 1218 $size = pack("L", 0);
1219 ioctl(FH, $FIONREAD, $size) or die "Couldn't call ioctl: $!\n";
1220 $size = unpack("L", $size);
5a964f20 1221
a6dd486b 1222FIONREAD requires a filehandle connected to a stream, meaning that sockets,
5a964f20 1223pipes, and tty devices work, but I<not> files.
68dc0745 1224
1225=head2 How do I do a C<tail -f> in perl?
ac9dac7f 1226X<tail> X<IO::Handle> X<File::Tail> X<clearerr>
68dc0745 1227
1228First try
1229
500071f4 1230 seek(GWFILE, 0, 1);
68dc0745 1231
1232The statement C<seek(GWFILE, 0, 1)> doesn't change the current position,
1233but it does clear the end-of-file condition on the handle, so that the
ac9dac7f 1234next C<< <GWFILE> >> makes Perl try again to read something.
68dc0745 1235
1236If that doesn't work (it relies on features of your stdio implementation),
1237then you need something more like this:
1238
1239 for (;;) {
1240 for ($curpos = tell(GWFILE); <GWFILE>; $curpos = tell(GWFILE)) {
1241 # search for some stuff and put it into files
1242 }
1243 # sleep for a while
1244 seek(GWFILE, $curpos, 0); # seek to where we had been
1245 }
1246
ac9dac7f 1247If this still doesn't work, look into the C<clearerr> method
1248from C<IO::Handle>, which resets the error and end-of-file states
1249on the handle.
68dc0745 1250
ac9dac7f 1251There's also a C<File::Tail> module from CPAN.
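The C<seek> trick described first can be packaged as a small helper that a polling loop calls between sleeps. This is a sketch that assumes the trick works with your I/O layer; C<read_new_lines> is a hypothetical helper name.

```perl
use strict;
use warnings;

# read_new_lines() is a hypothetical helper name used only for this sketch.
# It returns whatever lines are currently available on the handle.
sub read_new_lines {
    my ($fh) = @_;
    my @lines;
    while (defined(my $line = <$fh>)) {
        push @lines, $line;
    }
    # Seeking zero bytes from the current position leaves the position
    # alone but clears the end-of-file condition for the next read.
    seek($fh, 0, 1);
    return @lines;
}
```

A caller would typically loop forever, processing the returned lines and then sleeping for a second or so. Note that the last line returned may be incomplete if the writer is in the middle of a write.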
65acb1b1 1252
68dc0745 1253=head2 How do I dup() a filehandle in Perl?
d74e8afc 1254X<dup>
68dc0745 1255
1256If you check L<perlfunc/open>, you'll see that several of the ways
1257to call open() should do the trick. For example:
1258
500071f4 1259 open(LOG, ">>/foo/logfile");
1260 open(STDERR, ">&LOG");
68dc0745 1261
1262Or even with a literal numeric descriptor:
1263
1264 $fd = $ENV{MHCONTEXTFD};
1265 open(MHCONTEXT, "<&=$fd"); # like fdopen(3S)
1266
Note that "<&STDIN" makes a copy, but "<&=STDIN" makes
an alias. That means if you close an aliased handle, all
aliases become inaccessible. This is not true with
a copied one.
1271
1272Error checking, as always, has been left as an exercise for the reader.
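With error checking added, a dup using the three-argument form of C<open()> and a lexical filehandle might be sketched like this. C<dup_for_writing> is a hypothetical helper name.

```perl
use strict;
use warnings;

# dup_for_writing() is a hypothetical helper name used only for this sketch.
# It returns a new write handle on a copy of the original's descriptor.
sub dup_for_writing {
    my ($orig) = @_;
    open(my $copy, '>&', $orig) or die "can't dup filehandle: $!";
    return $copy;
}
```

Because this makes a copy rather than an alias, closing the returned handle leaves the original open.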
68dc0745 1273
1274=head2 How do I close a file descriptor by number?
ee891a00 1275X<file, closing file descriptors> X<POSIX> X<close>
1276
1277If, for some reason, you have a file descriptor instead of a
1278filehandle (perhaps you used C<POSIX::open>), you can use the
1279C<close()> function from the C<POSIX> module:
68dc0745 1280
ee891a00 1281 use POSIX ();
109f0441 1282
ee891a00 1283 POSIX::close( $fd );
109f0441 1284
ac003c96 1285This should rarely be necessary, as the Perl C<close()> function is to be
68dc0745 1286used for things that Perl opened itself, even if it was a dup of a
ac003c96 1287numeric descriptor as with C<MHCONTEXT> above. But if you really have
68dc0745 1288to, you may be able to do this:
1289
500071f4 1290 require 'sys/syscall.ph';
1291 $rc = syscall(&SYS_close, $fd + 0); # must force numeric
    die "can't sysclose $fd: $!" if $rc == -1;
68dc0745 1293
ee891a00 1294Or, just use the fdopen(3S) feature of C<open()>:
d92eb7b0 1295
500071f4 1296 {
ee891a00 1297 open my( $fh ), "<&=$fd" or die "Cannot reopen fd=$fd: $!";
1298 close $fh;
500071f4 1299 }
d92eb7b0 1300
883f1635 1301=head2 Why can't I use "C:\temp\foo" in DOS paths? Why doesn't `C:\temp\foo.exe` work?
d74e8afc 1302X<filename, DOS issues>
68dc0745 1303
1304Whoops! You just put a tab and a formfeed into that filename!
1305Remember that within double quoted strings ("like\this"), the
1306backslash is an escape character. The full list of these is in
1307L<perlop/Quote and Quote-like Operators>. Unsurprisingly, you don't
1308have a file called "c:(tab)emp(formfeed)oo" or
65acb1b1 1309"c:(tab)emp(formfeed)oo.exe" on your legacy DOS filesystem.
68dc0745 1310
Either single-quote your strings, or (preferably) use forward slashes.
Because all DOS and Windows versions since something like MS-DOS 2.0
have treated C</> and C<\> the same in a path, you might as well use the
one that doesn't clash with Perl--or the POSIX shell, ANSI C and C++,
awk, Tcl, Java, or Python, just to mention a few. POSIX paths
are more portable, too.
68dc0745 1317
1318=head2 Why doesn't glob("*.*") get all the files?
d74e8afc 1319X<glob>
68dc0745 1320
1321Because even on non-Unix ports, Perl's glob function follows standard
46fc3d4c 1322Unix globbing semantics. You'll need C<glob("*")> to get all (non-hidden)
65acb1b1 1323files. This makes glob() portable even to legacy systems. Your
1324port may include proprietary globbing functions as well. Check its
1325documentation for details.
68dc0745 1326
1327=head2 Why does Perl let me delete read-only files? Why does C<-i> clobber protected files? Isn't this a bug in Perl?
1328
06a5f41f 1329This is elaborately and painstakingly described in the
1330F<file-dir-perms> article in the "Far More Than You Ever Wanted To
49d635f9 1331Know" collection in http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz .
68dc0745 1332
1333The executive summary: learn how your filesystem works. The
1334permissions on a file say what can happen to the data in that file.
1335The permissions on a directory say what can happen to the list of
1336files in that directory. If you delete a file, you're removing its
1337name from the directory (so the operation depends on the permissions
1338of the directory, not of the file). If you try to write to the file,
1339the permissions of the file govern whether you're allowed to.
1340
1341=head2 How do I select a random line from a file?
d74e8afc 1342X<file, selecting a random line>
68dc0745 1343
109f0441 1344Short of loading the file into a database or pre-indexing the lines in
1345the file, there are a couple of things that you can do.
1346
1347Here's a reservoir-sampling algorithm from the Camel Book:
68dc0745 1348
500071f4 1349 srand;
1350 rand($.) < 1 && ($line = $_) while <>;
68dc0745 1351
49d635f9 1352This has a significant advantage in space over reading the whole file
1353in. You can find a proof of this method in I<The Art of Computer
1354Programming>, Volume 2, Section 3.4.2, by Donald E. Knuth.
1355
109f0441 1356You can use the C<File::Random> module which provides a function
49d635f9 1357for that algorithm:
1358
1359 use File::Random qw/random_line/;
1360 my $line = random_line($filename);
1361
109f0441 1362Another way is to use the C<Tie::File> module, which treats the entire
49d635f9 1363file as an array. Simply access a random array element.
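A sketch of the C<Tie::File> approach follows. The helper name C<random_line_tied> is hypothetical, and the file is tied read-only on the assumption that you only want to look at it.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY);
use Tie::File;

# random_line_tied() is a hypothetical helper name used only for this sketch.
sub random_line_tied {
    my ($file) = @_;
    tie my @lines, 'Tie::File', $file, mode => O_RDONLY
        or die "can't tie $file: $!";
    # rand @lines gives a fraction below the line count; the array
    # subscript truncates it to an integer index.
    my $line = @lines ? $lines[ rand @lines ] : undef;
    untie @lines;
    return $line;
}
```

C<Tie::File> strips the line ending by default, and it has to scan the file to learn how many lines there are, so this saves memory rather than time.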
68dc0745 1364
65acb1b1 1365=head2 Why do I get weird spaces when I print an array of lines?
1366
109f0441 1367(contributed by brian d foy)
1368
1369If you are seeing spaces between the elements of your array when
1370you print the array, you are probably interpolating the array in
1371double quotes:
1372
1373 my @animals = qw(camel llama alpaca vicuna);
1374 print "animals are: @animals\n";
65acb1b1 1375
109f0441 1376It's the double quotes, not the C<print>, doing this. Whenever you
1377interpolate an array in a double quote context, Perl joins the
1378elements with spaces (or whatever is in C<$">, which is a space by
1379default):
65acb1b1 1380
109f0441 1381 animals are: camel llama alpaca vicuna
65acb1b1 1382
This is different from printing the array without the interpolation:
65acb1b1 1384
109f0441 1385 my @animals = qw(camel llama alpaca vicuna);
1386 print "animals are: ", @animals, "\n";
65acb1b1 1387
109f0441 1388Now the output doesn't have the spaces between the elements because
1389the elements of C<@animals> simply become part of the list to
1390C<print>:
65acb1b1 1391
109f0441 1392 animals are: camelllamaalpacavicuna
1393
You might notice this when each of the elements of your array ends with
a newline. You expect to print one element per line, but notice that
every line after the first is indented:
1397
1398 this is a line
1399 this is another line
1400 this is the third line
1401
1402That extra space comes from the interpolation of the array. If you
1403don't want to put anything between your array elements, don't use the
1404array in double quotes. You can send it to print without them:
65acb1b1 1405
500071f4 1406 print @lines;
1407
109f0441 1408=head2 How do I traverse a directory tree?
1409
1410(contributed by brian d foy)
1411
The C<File::Find> module, which comes with Perl, does all of the hard
work to traverse a directory structure. You simply
call the C<find> subroutine with a callback subroutine and the
directories you want to traverse:
1416
1417 use File::Find;
1418
1419 find( \&wanted, @directories );
1420
1421 sub wanted {
1422 # full path in $File::Find::name
1423 # just filename in $_
1424 ... do whatever you want to do ...
1425 }
1426
The C<File::Find::Closures> module, which you can download from CPAN, provides
many ready-to-use subroutines that you can use with C<File::Find>.

The C<File::Finder> module, which you can download from CPAN, can help you
create the callback subroutine using something closer to the syntax of
the C<find> command-line utility:
1433
1434 use File::Find;
1435 use File::Finder;
1436
1437 my $deep_dirs = File::Finder->depth->type('d')->ls->exec('rmdir','{}');
1438
1439 find( $deep_dirs->as_options, @places );
1440
1441The C<File::Find::Rule> module, which you can download from CPAN, has
1442a similar interface, but does the traversal for you too:
1443
1444 use File::Find::Rule;
1445
1446 my @files = File::Find::Rule->file()
1447 ->name( '*.pm' )
1448 ->in( @INC );
1449
1450=head2 How do I delete a directory tree?
1451
1452(contributed by brian d foy)
1453
If you have an empty directory, you can use Perl's built-in C<rmdir>. If
the directory is not empty (so, it contains files or subdirectories), you either
have to empty it yourself (a lot of work) or use a module to help you.
1457
1458The C<File::Path> module, which comes with Perl, has a C<rmtree> which
1459can take care of all of the hard work for you:
1460
1461 use File::Path qw(rmtree);
1462
1463 rmtree( \@directories, 0, 0 );
1464
1465The first argument to C<rmtree> is either a string representing a directory path
1466or an array reference. The second argument controls progress messages, and the
1467third argument controls the handling of files you don't have permissions to
1468delete. See the C<File::Path> module for the details.
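Newer versions of C<File::Path> (2.0 and later) also offer C<remove_tree>, which takes an options hash instead of positional flags. The sketch below assumes such a version is installed; C<delete_tree> is a hypothetical helper name.

```perl
use strict;
use warnings;
use File::Path qw(remove_tree);

# delete_tree() is a hypothetical helper name used only for this sketch.
# It returns true if the tree was removed without any reported errors.
sub delete_tree {
    my ($dir) = @_;
    remove_tree($dir, { error => \my $errors });
    # Each diagnostic is a one-pair hash: file name => message.
    # An empty file name marks a general error.
    for my $diag (@{ $errors || [] }) {
        my ($file, $message) = %$diag;
        warn $file eq '' ? "general error: $message\n"
                         : "problem removing $file: $message\n";
    }
    return !($errors && @$errors);
}
```

Collecting errors this way lets the caller decide whether a partial removal is acceptable.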
1469
1470=head2 How do I copy an entire directory?
1471
1472(contributed by Shlomi Fish)
1473
1474To do the equivalent of C<cp -R> (i.e. copy an entire directory tree
1475recursively) in portable Perl, you'll either need to write something yourself
or find a good CPAN module such as L<File::Copy::Recursive>.

=head1 REVISION
1478
109f0441 1479Revision: $Revision$
500071f4 1480
109f0441 1481Date: $Date$
500071f4 1482
1483See L<perlfaq> for source control details and availability.
65acb1b1 1484
68dc0745 1485=head1 AUTHOR AND COPYRIGHT
1486
109f0441 1487Copyright (c) 1997-2009 Tom Christiansen, Nathan Torkington, and
7678cced 1488other authors as noted. All rights reserved.
5a964f20 1489
5a7beb56 1490This documentation is free; you can redistribute it and/or modify it
1491under the same terms as Perl itself.
c8db1d39 1492
87275199 1493Irrespective of its distribution, all code examples here are in the public
c8db1d39 1494domain. You are permitted and encouraged to use this code and any
1495derivatives thereof in your own programs for fun or for profit as you
1496see fit. A simple comment in the code giving credit to the FAQ would
1497be courteous but is not required.