Re: [ID 20010608.010] File::Find re-entrancy
[p5sagit/p5-mst-13.2.git] / lib / File / Find.pm
CommitLineData
a0d0e21e 1package File::Find;
b75c8c73 2use strict;
b395063c 3use warnings;
4use 5.6.0;
a85af077 5our $VERSION = '1.01';
a0d0e21e 6require Exporter;
6280b799 7require Cwd;
a0d0e21e 8
f06db76b 9=head1 NAME
10
11find - traverse a file tree
12
13finddepth - traverse a directory structure depth-first
14
15=head1 SYNOPSIS
16
17 use File::Find;
81793b90 18 find(\&wanted, '/foo', '/bar');
f06db76b 19 sub wanted { ... }
237437d0 20
f06db76b 21 use File::Find;
81793b90 22 finddepth(\&wanted, '/foo', '/bar');
f06db76b 23 sub wanted { ... }
3cb6de81 24
81793b90 25 use File::Find;
26 find({ wanted => \&process, follow => 1 }, '.');
f06db76b 27
28=head1 DESCRIPTION
29
20408e3c 30The first argument to find() is either a hash reference describing the
81793b90 31operations to be performed for each file, or a code reference.
20408e3c 32
81793b90 33Here are the possible keys for the hash:
34
35=over 3
36
37=item C<wanted>
38
39The value should be a code reference. This code reference is called
40I<the wanted() function> below.
41
42=item C<bydepth>
43
44Reports the name of a directory only AFTER all its entries
45have been reported. Entry point finddepth() is a shortcut for
46specifying C<{ bydepth => 1 }> in the first argument of find().
47
719c805e 48=item C<preprocess>
49
7e47e6ff 50The value should be a code reference. This code reference is used to
51preprocess the current directory. The name of currently processed
52directory is in $File::Find::dir. Your preprocessing function is
53called after readdir() but before the loop that calls the wanted()
54function. It is called with a list of strings (actually file/directory
55names) and is expected to return a list of strings. The code can be
56used to sort the file/directory names alphabetically, numerically,
57or to filter out directory entries based on their name alone. When
58I<follow> or I<follow_fast> are in effect, C<preprocess> is a no-op.
719c805e 59
60=item C<postprocess>
61
7e47e6ff 62The value should be a code reference. It is invoked just before leaving
63the currently processed directory. It is called in void context with no
64arguments. The name of the current directory is in $File::Find::dir. This
65hook is handy for summarizing a directory, such as calculating its disk
66usage. When I<follow> or I<follow_fast> are in effect, C<preprocess> is a
67no-op.
719c805e 68
81793b90 69=item C<follow>
70
71Causes symbolic links to be followed. Since directory trees with symbolic
72links (followed) may contain files more than once and may even have
73cycles, a hash has to be built up with an entry for each file.
74This might be expensive both in space and time for a large
75directory tree. See I<follow_fast> and I<follow_skip> below.
76If either I<follow> or I<follow_fast> is in effect:
77
78=over 6
79
a45bd81d 80=item *
81793b90 81
f10e1564 82It is guaranteed that an I<lstat> has been called before the user's
81793b90 83I<wanted()> function is called. This enables fast file checks involving S< _>.
84
a45bd81d 85=item *
81793b90 86
87There is a variable C<$File::Find::fullname> which holds the absolute
88pathname of the file with all symbolic links resolved
89
90=back
91
92=item C<follow_fast>
93
f10e1564 94This is similar to I<follow> except that it may report some files more
95than once. It does detect cycles, however. Since only symbolic links
96have to be hashed, this is much cheaper both in space and time. If
97processing a file more than once (by the user's I<wanted()> function)
81793b90 98is worse than just taking time, the option I<follow> should be used.
99
100=item C<follow_skip>
101
102C<follow_skip==1>, which is the default, causes all files which are
103neither directories nor symbolic links to be ignored if they are about
104to be processed a second time. If a directory or a symbolic link
105are about to be processed a second time, File::Find dies.
106C<follow_skip==0> causes File::Find to die if any file is about to be
107processed a second time.
108C<follow_skip==2> causes File::Find to ignore any duplicate files and
7e47e6ff 109directories but to proceed normally otherwise.
20408e3c 110
f06db76b 111
81793b90 112=item C<no_chdir>
113
114Does not C<chdir()> to each directory as it recurses. The wanted()
115function will need to be aware of this, of course. In this case,
116C<$_> will be the same as C<$File::Find::name>.
117
118=item C<untaint>
119
120If find is used in taint-mode (-T command line switch or if EUID != UID
121or if EGID != GID) then internally directory names have to be untainted
7e47e6ff 122before they can be chdir'ed to. Therefore they are checked against a regular
123expression I<untaint_pattern>. Note that all names passed to the user's
124I<wanted()> function are still tainted. If this option is used while
125not in taint-mode, C<untaint> is a no-op.
81793b90 126
127=item C<untaint_pattern>
128
129See above. This should be set using the C<qr> quoting operator.
130The default is set to C<qr|^([-+@\w./]+)$|>.
ac06ab1d 131Note that the parantheses are vital.
81793b90 132
133=item C<untaint_skip>
134
7e47e6ff 135If set, a directory which fails the I<untaint_pattern> is skipped,
136including all its sub-directories. The default is to 'die' in such a case.
81793b90 137
138=back
139
140The wanted() function does whatever verifications you want.
141C<$File::Find::dir> contains the current directory name, and C<$_> the
142current filename within that directory. C<$File::Find::name> contains
f10e1564 143the complete pathname to the file. You are chdir()'d to
144C<$File::Find::dir> when the function is called, unless C<no_chdir>
7e47e6ff 145was specified. When C<follow> or C<follow_fast> are in effect, there is
f10e1564 146also a C<$File::Find::fullname>. The function may set
147C<$File::Find::prune> to prune the tree unless C<bydepth> was
148specified. Unless C<follow> or C<follow_fast> is specified, for
149compatibility reasons (find.pl, find2perl) there are in addition the
150following globals available: C<$File::Find::topdir>,
151C<$File::Find::topdev>, C<$File::Find::topino>,
e7b91b67 152C<$File::Find::topmode> and C<$File::Find::topnlink>.
47a735e8 153
20408e3c 154This library is useful for the C<find2perl> tool, which when fed,
f06db76b 155
156 find2perl / -name .nfs\* -mtime +7 \
81793b90 157 -exec rm -f {} \; -o -fstype nfs -prune
f06db76b 158
159produces something like:
160
161 sub wanted {
c7b9dd21 162 /^\.nfs.*\z/s &&
81793b90 163 (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_)) &&
f06db76b 164 int(-M _) > 7 &&
165 unlink($_)
166 ||
81793b90 167 ($nlink || (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))) &&
f06db76b 168 $dev < 0 &&
6280b799 169 ($File::Find::prune = 1);
f06db76b 170 }
171
81793b90 172Set the variable C<$File::Find::dont_use_nlink> if you're using AFS,
6280b799 173since AFS cheats.
f06db76b 174
f06db76b 175
176Here's another interesting wanted function. It will find all symlinks
177that don't resolve:
178
179 sub wanted {
81793b90 180 -l && !-e && print "bogus link: $File::Find::name\n";
237437d0 181 }
f06db76b 182
81793b90 183See also the script C<pfind> on CPAN for a nice application of this
184module.
185
186=head1 CAVEAT
187
f10e1564 188Be aware that the option to follow symbolic links can be dangerous.
81793b90 189Depending on the structure of the directory tree (including symbolic
190links to directories) you might traverse a given (physical) directory
191more than once (only if C<follow_fast> is in effect).
192Furthermore, deleting or changing files in a symbolically linked directory
193might cause very unpleasant surprises, since you delete or change files
194in an unknown directory.
0530a6c4 195
7e47e6ff 196=head1 NOTES
197
198=over 4
199
200=item *
201
202Mac OS (Classic) users should note a few differences:
203
204=over 4
205
206=item *
207
208The path separator is ':', not '/', and the current directory is denoted
209as ':', not '.'. You should be careful about specifying relative pathnames.
210While a full path always begins with a volume name, a relative pathname
211should always begin with a ':'. If specifying a volume name only, a
212trailing ':' is required.
213
214=item *
215
216C<$File::Find::dir> is guaranteed to end with a ':'. If C<$_>
217contains the name of a directory, that name may or may not end with a
218':'. Likewise, C<$File::Find::name>, which contains the complete
219pathname to that directory, and C<$File::Find::fullname>, which holds
220the absolute pathname of that directory with all symbolic links resolved,
221may or may not end with a ':'.
222
223=item *
224
225The default C<untaint_pattern> (see above) on Mac OS is set to
226C<qr|^(.+)$|>. Note that the parentheses are vital.
227
228=item *
229
230The invisible system file "Icon\015" is ignored. While this file may
231appear in every directory, there are some more invisible system files
232on every volume, which are all located at the volume root level (i.e.
233"MacintoshHD:"). These system files are B<not> excluded automatically.
234Your filter may use the following code to recognize invisible files or
235directories (requires Mac::Files):
236
237 use Mac::Files;
238
239 # invisible() -- returns 1 if file/directory is invisible,
240 # 0 if it's visible or undef if an error occured
241
242 sub invisible($) {
243 my $file = shift;
244 my ($fileCat, $fileInfo);
245 my $invisible_flag = 1 << 14;
246
247 if ( $fileCat = FSpGetCatInfo($file) ) {
248 if ($fileInfo = $fileCat->ioFlFndrInfo() ) {
249 return (($fileInfo->fdFlags & $invisible_flag) && 1);
250 }
251 }
252 return undef;
253 }
254
255Generally, invisible files are system files, unless an odd application
256decides to use invisible files for its own purposes. To distinguish
257such files from system files, you have to look at the B<type> and B<creator>
258file attributes. The MacPerl built-in functions C<GetFileInfo(FILE)> and
259C<SetFileInfo(CREATOR, TYPE, FILES)> offer access to these attributes
260(see MacPerl.pm for details).
261
262Files that appear on the desktop actually reside in an (hidden) directory
263named "Desktop Folder" on the particular disk volume. Note that, although
264all desktop files appear to be on the same "virtual" desktop, each disk
265volume actually maintains its own "Desktop Folder" directory.
266
267=back
268
269=back
0530a6c4 270
a85af077 271=head1 HISTORY
272
273File::Find used to produce incorrect results if called recursively.
274During the development of perl 5.8 this bug was fixed.
275The first fixed version of File::Find was 1.01.
276
f06db76b 277=cut
278
b75c8c73 279our @ISA = qw(Exporter);
280our @EXPORT = qw(find finddepth);
6280b799 281
a0d0e21e 282
81793b90 283use strict;
284my $Is_VMS;
7e47e6ff 285my $Is_MacOS;
81793b90 286
287require File::Basename;
7e47e6ff 288require File::Spec;
81793b90 289
9f826d6a 290# Should ideally be my() not our() but local() currently
291# refuses to operate on lexicals
292
293our %SLnkSeen;
294our ($wanted_callback, $avoid_nlink, $bydepth, $no_chdir, $follow,
719c805e 295 $follow_skip, $full_check, $untaint, $untaint_skip, $untaint_pat,
296 $pre_process, $post_process);
81793b90 297
298sub contract_name {
299 my ($cdir,$fn) = @_;
300
7e47e6ff 301 return substr($cdir,0,rindex($cdir,'/')) if $fn eq $File::Find::current_dir;
81793b90 302
303 $cdir = substr($cdir,0,rindex($cdir,'/')+1);
304
305 $fn =~ s|^\./||;
306
307 my $abs_name= $cdir . $fn;
308
309 if (substr($fn,0,3) eq '../') {
fecbda2b 310 1 while $abs_name =~ s!/[^/]*/\.\./!/!;
81793b90 311 }
312
313 return $abs_name;
314}
315
7e47e6ff 316# return the absolute name of a directory or file
317sub contract_name_Mac {
318 my ($cdir,$fn) = @_;
319 my $abs_name;
320
321 if ($fn =~ /^(:+)(.*)$/) { # valid pathname starting with a ':'
322
323 my $colon_count = length ($1);
324 if ($colon_count == 1) {
325 $abs_name = $cdir . $2;
326 return $abs_name;
327 }
328 else {
329 # need to move up the tree, but
330 # only if it's not a volume name
331 for (my $i=1; $i<$colon_count; $i++) {
332 unless ($cdir =~ /^[^:]+:$/) { # volume name
333 $cdir =~ s/[^:]+:$//;
334 }
335 else {
336 return undef;
337 }
338 }
339 $abs_name = $cdir . $2;
340 return $abs_name;
341 }
342
343 }
344 else {
345
346 # $fn may be a valid path to a directory or file or (dangling)
347 # symlink, without a leading ':'
348 if ( (-e $fn) || (-l $fn) ) {
349 if ($fn =~ /^[^:]+:/) { # a volume name like DataHD:*
350 return $fn; # $fn is already an absolute path
351 }
352 else {
353 $abs_name = $cdir . $fn;
354 return $abs_name;
355 }
356 }
357 else { # argh!, $fn is not a valid directory/file
358 return undef;
359 }
360 }
361}
81793b90 362
363sub PathCombine($$) {
364 my ($Base,$Name) = @_;
365 my $AbsName;
366
7e47e6ff 367 if ($Is_MacOS) {
368 # $Name is the resolved symlink (always a full path on MacOS),
369 # i.e. there's no need to call contract_name_Mac()
370 $AbsName = $Name;
371
372 # (simple) check for recursion
373 if ( ( $Base =~ /^$AbsName/) && (-d $AbsName) ) { # recursion
374 return undef;
375 }
81793b90 376 }
377 else {
7e47e6ff 378 if (substr($Name,0,1) eq '/') {
379 $AbsName= $Name;
380 }
381 else {
382 $AbsName= contract_name($Base,$Name);
383 }
81793b90 384
7e47e6ff 385 # (simple) check for recursion
386 my $newlen= length($AbsName);
387 if ($newlen <= length($Base)) {
388 if (($newlen == length($Base) || substr($Base,$newlen,1) eq '/')
389 && $AbsName eq substr($Base,0,$newlen))
390 {
391 return undef;
392 }
81793b90 393 }
394 }
395 return $AbsName;
396}
397
398sub Follow_SymLink($) {
399 my ($AbsName) = @_;
400
401 my ($NewName,$DEV, $INO);
402 ($DEV, $INO)= lstat $AbsName;
403
404 while (-l _) {
405 if ($SLnkSeen{$DEV, $INO}++) {
406 if ($follow_skip < 2) {
407 die "$AbsName is encountered a second time";
a0d0e21e 408 }
409 else {
81793b90 410 return undef;
a0d0e21e 411 }
412 }
81793b90 413 $NewName= PathCombine($AbsName, readlink($AbsName));
414 unless(defined $NewName) {
415 if ($follow_skip < 2) {
416 die "$AbsName is a recursive symbolic link";
417 }
418 else {
419 return undef;
a0d0e21e 420 }
81793b90 421 }
422 else {
423 $AbsName= $NewName;
424 }
425 ($DEV, $INO) = lstat($AbsName);
426 return undef unless defined $DEV; # dangling symbolic link
427 }
428
429 if ($full_check && $SLnkSeen{$DEV, $INO}++) {
7e47e6ff 430 if ( ($follow_skip < 1) || ((-d _) && ($follow_skip < 2)) ) {
81793b90 431 die "$AbsName encountered a second time";
432 }
433 else {
434 return undef;
435 }
436 }
437
438 return $AbsName;
439}
440
17f410f9 441our($dir, $name, $fullname, $prune);
81793b90 442sub _find_dir_symlnk($$$);
443sub _find_dir($$$);
444
7e47e6ff 445# check whether or not a scalar variable is tainted
446# (code straight from the Camel, 3rd ed., page 561)
447sub is_tainted_pp {
448 my $arg = shift;
449 my $nada = substr($arg, 0, 0); # zero-length
450 local $@;
451 eval { eval "# $nada" };
452 return length($@) != 0;
453}
454
81793b90 455sub _find_opt {
456 my $wanted = shift;
457 die "invalid top directory" unless defined $_[0];
458
9f826d6a 459 # This function must local()ize everything because callbacks may
460 # call find() or finddepth()
461
462 local %SLnkSeen;
463 local ($wanted_callback, $avoid_nlink, $bydepth, $no_chdir, $follow,
464 $follow_skip, $full_check, $untaint, $untaint_skip, $untaint_pat,
465 $pre_process, $post_process);
466 local($dir, $name, $fullname, $prune);
467
81793b90 468 my $cwd = $wanted->{bydepth} ? Cwd::fastcwd() : Cwd::cwd();
469 my $cwd_untainted = $cwd;
7e47e6ff 470 my $check_t_cwd = 1;
81793b90 471 $wanted_callback = $wanted->{wanted};
472 $bydepth = $wanted->{bydepth};
719c805e 473 $pre_process = $wanted->{preprocess};
474 $post_process = $wanted->{postprocess};
81793b90 475 $no_chdir = $wanted->{no_chdir};
476 $full_check = $wanted->{follow};
477 $follow = $full_check || $wanted->{follow_fast};
478 $follow_skip = $wanted->{follow_skip};
479 $untaint = $wanted->{untaint};
480 $untaint_pat = $wanted->{untaint_pattern};
481 $untaint_skip = $wanted->{untaint_skip};
482
e7b91b67 483 # for compatability reasons (find.pl, find2perl)
9f826d6a 484 local our ($topdir, $topdev, $topino, $topmode, $topnlink);
81793b90 485
486 # a symbolic link to a directory doesn't increase the link count
487 $avoid_nlink = $follow || $File::Find::dont_use_nlink;
488
e7b91b67 489 my ($abs_dir, $Is_Dir);
81793b90 490
491 Proc_Top_Item:
492 foreach my $TOP (@_) {
7e47e6ff 493 my $top_item = $TOP;
494
495 if ($Is_MacOS) {
496 ($topdev,$topino,$topmode,$topnlink) = $follow ? stat $top_item : lstat $top_item;
497 $top_item = ":$top_item"
498 if ( (-d _) && ($top_item =~ /^[^:]+\z/) );
499 }
500 else {
501 $top_item =~ s|/\z|| unless $top_item eq '/';
502 ($topdev,$topino,$topmode,$topnlink) = $follow ? stat $top_item : lstat $top_item;
503 }
504
505 $Is_Dir= 0;
506
507 if ($follow) {
508
509 if ($Is_MacOS) {
510 $cwd = "$cwd:" unless ($cwd =~ /:$/); # for safety
511
512 if ($top_item eq $File::Find::current_dir) {
513 $abs_dir = $cwd;
514 }
515 else {
516 $abs_dir = contract_name_Mac($cwd, $top_item);
517 unless (defined $abs_dir) {
518 warn "Can't determine absolute path for $top_item (No such file or directory)\n";
519 next Proc_Top_Item;
520 }
521 }
522
523 }
524 else {
525 if (substr($top_item,0,1) eq '/') {
526 $abs_dir = $top_item;
527 }
528 elsif ($top_item eq $File::Find::current_dir) {
529 $abs_dir = $cwd;
530 }
531 else { # care about any ../
532 $abs_dir = contract_name("$cwd/",$top_item);
533 }
534 }
535 $abs_dir= Follow_SymLink($abs_dir);
536 unless (defined $abs_dir) {
81793b90 537 warn "$top_item is a dangling symbolic link\n";
538 next Proc_Top_Item;
7e47e6ff 539 }
540
541 if (-d _) {
81793b90 542 _find_dir_symlnk($wanted, $abs_dir, $top_item);
543 $Is_Dir= 1;
7e47e6ff 544 }
545 }
81793b90 546 else { # no follow
7e47e6ff 547 $topdir = $top_item;
548 unless (defined $topnlink) {
549 warn "Can't stat $top_item: $!\n";
550 next Proc_Top_Item;
551 }
552 if (-d _) {
c7b9dd21 553 $top_item =~ s/\.dir\z// if $Is_VMS;
e7b91b67 554 _find_dir($wanted, $top_item, $topnlink);
81793b90 555 $Is_Dir= 1;
7e47e6ff 556 }
237437d0 557 else {
81793b90 558 $abs_dir= $top_item;
7e47e6ff 559 }
560 }
81793b90 561
7e47e6ff 562 unless ($Is_Dir) {
81793b90 563 unless (($_,$dir) = File::Basename::fileparse($abs_dir)) {
7e47e6ff 564 if ($Is_MacOS) {
565 ($dir,$_) = (':', $top_item); # $File::Find::dir, $_
566 }
567 else {
568 ($dir,$_) = ('./', $top_item);
569 }
81793b90 570 }
571
7e47e6ff 572 $abs_dir = $dir;
573 if (( $untaint ) && (is_tainted($dir) )) {
574 ( $abs_dir ) = $dir =~ m|$untaint_pat|;
81793b90 575 unless (defined $abs_dir) {
576 if ($untaint_skip == 0) {
7e47e6ff 577 die "directory $dir is still tainted";
81793b90 578 }
579 else {
580 next Proc_Top_Item;
581 }
582 }
7e47e6ff 583 }
81793b90 584
7e47e6ff 585 unless ($no_chdir || chdir $abs_dir) {
586 warn "Couldn't chdir $abs_dir: $!\n";
587 next Proc_Top_Item;
588 }
719911cc 589
7e47e6ff 590 $name = $abs_dir . $_; # $File::Find::name
719911cc 591
7e47e6ff 592 { &$wanted_callback }; # protect against wild "next"
81793b90 593
7e47e6ff 594 }
81793b90 595
7e47e6ff 596 unless ( $no_chdir ) {
597 if ( ($check_t_cwd) && (($untaint) && (is_tainted($cwd) )) ) {
598 ( $cwd_untainted ) = $cwd =~ m|$untaint_pat|;
599 unless (defined $cwd_untainted) {
600 die "insecure cwd in find(depth)";
601 }
602 $check_t_cwd = 0;
603 }
604 unless (chdir $cwd_untainted) {
605 die "Can't cd to $cwd: $!\n";
606 }
607 }
81793b90 608 }
609}
610
611# API:
612# $wanted
613# $p_dir : "parent directory"
614# $nlink : what came back from the stat
615# preconditions:
616# chdir (if not no_chdir) to dir
617
618sub _find_dir($$$) {
619 my ($wanted, $p_dir, $nlink) = @_;
620 my ($CdLvl,$Level) = (0,0);
621 my @Stack;
622 my @filenames;
623 my ($subcount,$sub_nlink);
624 my $SE= [];
625 my $dir_name= $p_dir;
7e47e6ff 626 my $dir_pref;
627 my $dir_rel;
628 my $tainted = 0;
629
630 if ($Is_MacOS) {
631 $dir_pref= ($p_dir =~ /:$/) ? $p_dir : "$p_dir:"; # preface
632 $dir_rel= ':'; # directory name relative to current directory
633 }
634 else {
635 $dir_pref= ( $p_dir eq '/' ? '/' : "$p_dir/" );
636 $dir_rel= '.'; # directory name relative to current directory
637 }
81793b90 638
639 local ($dir, $name, $prune, *DIR);
7e47e6ff 640
641 unless ( $no_chdir || ($p_dir eq $File::Find::current_dir)) {
81793b90 642 my $udir = $p_dir;
7e47e6ff 643 if (( $untaint ) && (is_tainted($p_dir) )) {
644 ( $udir ) = $p_dir =~ m|$untaint_pat|;
81793b90 645 unless (defined $udir) {
646 if ($untaint_skip == 0) {
647 die "directory $p_dir is still tainted";
648 }
649 else {
650 return;
651 }
237437d0 652 }
a0d0e21e 653 }
81793b90 654 unless (chdir $udir) {
655 warn "Can't cd to $udir: $!\n";
656 return;
657 }
658 }
7e47e6ff 659
660 # push the starting directory
57e73c4b 661 push @Stack,[$CdLvl,$p_dir,$dir_rel,-1] if $bydepth;
81793b90 662
7e47e6ff 663 if ($Is_MacOS) {
664 $p_dir = $dir_pref; # ensure trailing ':'
665 }
666
81793b90 667 while (defined $SE) {
668 unless ($bydepth) {
7e47e6ff 669 $dir= $p_dir; # $File::Find::dir
670 $name= $dir_name; # $File::Find::name
671 $_= ($no_chdir ? $dir_name : $dir_rel ); # $_
81793b90 672 # prune may happen here
7e47e6ff 673 $prune= 0;
674 { &$wanted_callback }; # protect against wild "next"
675 next if $prune;
81793b90 676 }
7e47e6ff 677
81793b90 678 # change to that directory
7e47e6ff 679 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
81793b90 680 my $udir= $dir_rel;
7e47e6ff 681 if ( ($untaint) && (($tainted) || ($tainted = is_tainted($dir_rel) )) ) {
682 ( $udir ) = $dir_rel =~ m|$untaint_pat|;
81793b90 683 unless (defined $udir) {
684 if ($untaint_skip == 0) {
7e47e6ff 685 if ($Is_MacOS) {
686 die "directory ($p_dir) $dir_rel is still tainted";
687 }
688 else {
689 die "directory (" . ($p_dir ne '/' ? $p_dir : '') . "/) $dir_rel is still tainted";
690 }
691 } else { # $untaint_skip == 1
692 next;
81793b90 693 }
694 }
695 }
696 unless (chdir $udir) {
7e47e6ff 697 if ($Is_MacOS) {
698 warn "Can't cd to ($p_dir) $udir: $!\n";
699 }
700 else {
701 warn "Can't cd to (" . ($p_dir ne '/' ? $p_dir : '') . "/) $udir: $!\n";
702 }
81793b90 703 next;
704 }
705 $CdLvl++;
706 }
707
7e47e6ff 708 if ($Is_MacOS) {
709 $dir_name = "$dir_name:" unless ($dir_name =~ /:$/);
710 }
711
712 $dir= $dir_name; # $File::Find::dir
81793b90 713
714 # Get the list of files in the current directory.
7e47e6ff 715 unless (opendir DIR, ($no_chdir ? $dir_name : $File::Find::current_dir)) {
81793b90 716 warn "Can't opendir($dir_name): $!\n";
717 next;
718 }
719 @filenames = readdir DIR;
720 closedir(DIR);
719c805e 721 @filenames = &$pre_process(@filenames) if $pre_process;
722 push @Stack,[$CdLvl,$dir_name,"",-2] if $post_process;
81793b90 723
724 if ($nlink == 2 && !$avoid_nlink) {
725 # This dir has no subdirectories.
726 for my $FN (@filenames) {
7e47e6ff 727 next if $FN =~ $File::Find::skip_pattern;
81793b90 728
7e47e6ff 729 $name = $dir_pref . $FN; # $File::Find::name
730 $_ = ($no_chdir ? $name : $FN); # $_
73396e07 731 { &$wanted_callback }; # protect against wild "next"
81793b90 732 }
733
734 }
735 else {
736 # This dir has subdirectories.
737 $subcount = $nlink - 2;
738
739 for my $FN (@filenames) {
7e47e6ff 740 next if $FN =~ $File::Find::skip_pattern;
81793b90 741 if ($subcount > 0 || $avoid_nlink) {
742 # Seen all the subdirs?
743 # check for directoriness.
744 # stat is faster for a file in the current directory
07867069 745 $sub_nlink = (lstat ($no_chdir ? $dir_pref . $FN : $FN))[3];
81793b90 746
747 if (-d _) {
748 --$subcount;
c7b9dd21 749 $FN =~ s/\.dir\z// if $Is_VMS;
81793b90 750 push @Stack,[$CdLvl,$dir_name,$FN,$sub_nlink];
751 }
752 else {
7e47e6ff 753 $name = $dir_pref . $FN; # $File::Find::name
754 $_= ($no_chdir ? $name : $FN); # $_
73396e07 755 { &$wanted_callback }; # protect against wild "next"
81793b90 756 }
757 }
07867069 758 else {
7e47e6ff 759 $name = $dir_pref . $FN; # $File::Find::name
760 $_= ($no_chdir ? $name : $FN); # $_
73396e07 761 { &$wanted_callback }; # protect against wild "next"
81793b90 762 }
763 }
764 }
17b275ff 765 }
766 continue {
57e73c4b 767 while ( defined ($SE = pop @Stack) ) {
81793b90 768 ($Level, $p_dir, $dir_rel, $nlink) = @$SE;
769 if ($CdLvl > $Level && !$no_chdir) {
7e47e6ff 770 my $tmp;
771 if ($Is_MacOS) {
772 $tmp = (':' x ($CdLvl-$Level)) . ':';
773 }
774 else {
775 $tmp = join('/',('..') x ($CdLvl-$Level));
776 }
777 die "Can't cd to $dir_name" . $tmp
778 unless chdir ($tmp);
81793b90 779 $CdLvl = $Level;
780 }
7e47e6ff 781
782 if ($Is_MacOS) {
783 # $pdir always has a trailing ':', except for the starting dir,
784 # where $dir_rel eq ':'
785 $dir_name = "$p_dir$dir_rel";
786 $dir_pref = "$dir_name:";
787 }
788 else {
789 $dir_name = ($p_dir eq '/' ? "/$dir_rel" : "$p_dir/$dir_rel");
790 $dir_pref = "$dir_name/";
791 }
792
719c805e 793 if ( $nlink == -2 ) {
7e47e6ff 794 $name = $dir = $p_dir; # $File::Find::name / dir
795 if ($Is_MacOS) {
796 $_ = ':'; # $_
797 }
798 else {
799 $_ = '.';
800 }
719c805e 801 &$post_process; # End-of-directory processing
7e47e6ff 802 }
803 elsif ( $nlink < 0 ) { # must be finddepth, report dirname now
804 $name = $dir_name;
805 if ($Is_MacOS) {
806 if ($dir_rel eq ':') { # must be the top dir, where we started
807 $name =~ s|:$||; # $File::Find::name
808 $p_dir = "$p_dir:" unless ($p_dir =~ /:$/);
809 }
810 $dir = $p_dir; # $File::Find::dir
811 $_ = ($no_chdir ? $name : $dir_rel); # $_
812 }
813 else {
814 if ( substr($name,-2) eq '/.' ) {
815 $name =~ s|/\.$||;
816 }
817 $dir = $p_dir;
818 $_ = ($no_chdir ? $dir_name : $dir_rel );
819 if ( substr($_,-2) eq '/.' ) {
820 s|/\.$||;
821 }
822 }
823 { &$wanted_callback }; # protect against wild "next"
824 }
825 else {
826 push @Stack,[$CdLvl,$p_dir,$dir_rel,-1] if $bydepth;
827 last;
828 }
81793b90 829 }
a0d0e21e 830 }
831}
832
81793b90 833
834# API:
835# $wanted
836# $dir_loc : absolute location of a dir
837# $p_dir : "parent directory"
838# preconditions:
839# chdir (if not no_chdir) to dir
840
841sub _find_dir_symlnk($$$) {
7e47e6ff 842 my ($wanted, $dir_loc, $p_dir) = @_; # $dir_loc is the absolute directory
81793b90 843 my @Stack;
844 my @filenames;
845 my $new_loc;
7e47e6ff 846 my $updir_loc = $dir_loc; # untainted parent directory
81793b90 847 my $SE = [];
848 my $dir_name = $p_dir;
7e47e6ff 849 my $dir_pref;
850 my $loc_pref;
851 my $dir_rel;
852 my $byd_flag; # flag for pending stack entry if $bydepth
853 my $tainted = 0;
854 my $ok = 1;
855
856 if ($Is_MacOS) {
857 $dir_pref = ($p_dir =~ /:$/) ? "$p_dir" : "$p_dir:";
858 $loc_pref = ($dir_loc =~ /:$/) ? "$dir_loc" : "$dir_loc:";
859 $dir_rel = ':'; # directory name relative to current directory
860 } else {
861 $dir_pref = ( $p_dir eq '/' ? '/' : "$p_dir/" );
862 $loc_pref = ( $dir_loc eq '/' ? '/' : "$dir_loc/" );
863 $dir_rel = '.'; # directory name relative to current directory
864 }
81793b90 865
866 local ($dir, $name, $fullname, $prune, *DIR);
7e47e6ff 867
868 unless ($no_chdir) {
869 # untaint the topdir
870 if (( $untaint ) && (is_tainted($dir_loc) )) {
871 ( $updir_loc ) = $dir_loc =~ m|$untaint_pat|; # parent dir, now untainted
872 # once untainted, $updir_loc is pushed on the stack (as parent directory);
873 # hence, we don't need to untaint the parent directory every time we chdir
874 # to it later
875 unless (defined $updir_loc) {
81793b90 876 if ($untaint_skip == 0) {
877 die "directory $dir_loc is still tainted";
878 }
879 else {
880 return;
881 }
882 }
883 }
7e47e6ff 884 $ok = chdir($updir_loc) unless ($p_dir eq $File::Find::current_dir);
885 unless ($ok) {
886 warn "Can't cd to $updir_loc: $!\n";
81793b90 887 return;
888 }
889 }
890
7e47e6ff 891 push @Stack,[$dir_loc,$updir_loc,$p_dir,$dir_rel,-1] if $bydepth;
892
893 if ($Is_MacOS) {
894 $p_dir = $dir_pref; # ensure trailing ':'
895 }
57e73c4b 896
81793b90 897 while (defined $SE) {
898
899 unless ($bydepth) {
7e47e6ff 900 # change (back) to parent directory (always untainted)
704ea872 901 unless ($no_chdir) {
7e47e6ff 902 unless (chdir $updir_loc) {
903 warn "Can't cd to $updir_loc: $!\n";
704ea872 904 next;
905 }
906 }
7e47e6ff 907 $dir= $p_dir; # $File::Find::dir
908 $name= $dir_name; # $File::Find::name
909 $_= ($no_chdir ? $dir_name : $dir_rel ); # $_
910 $fullname= $dir_loc; # $File::Find::fullname
81793b90 911 # prune may happen here
7e47e6ff 912 $prune= 0;
704ea872 913 lstat($_); # make sure file tests with '_' work
7e47e6ff 914 { &$wanted_callback }; # protect against wild "next"
915 next if $prune;
81793b90 916 }
917
918 # change to that directory
7e47e6ff 919 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
920 $updir_loc = $dir_loc;
921 if ( ($untaint) && (($tainted) || ($tainted = is_tainted($dir_loc) )) ) {
922 # untaint $dir_loc, what will be pushed on the stack as (untainted) parent dir
923 ( $updir_loc ) = $dir_loc =~ m|$untaint_pat|;
924 unless (defined $updir_loc) {
81793b90 925 if ($untaint_skip == 0) {
926 die "directory $dir_loc is still tainted";
a0d0e21e 927 }
237437d0 928 else {
81793b90 929 next;
237437d0 930 }
a0d0e21e 931 }
932 }
7e47e6ff 933 unless (chdir $updir_loc) {
934 warn "Can't cd to $updir_loc: $!\n";
81793b90 935 next;
936 }
937 }
938
7e47e6ff 939 if ($Is_MacOS) {
940 $dir_name = "$dir_name:" unless ($dir_name =~ /:$/);
941 }
942
943 $dir = $dir_name; # $File::Find::dir
81793b90 944
945 # Get the list of files in the current directory.
7e47e6ff 946 unless (opendir DIR, ($no_chdir ? $dir_loc : $File::Find::current_dir)) {
81793b90 947 warn "Can't opendir($dir_loc): $!\n";
948 next;
949 }
950 @filenames = readdir DIR;
951 closedir(DIR);
952
953 for my $FN (@filenames) {
7e47e6ff 954 next if $FN =~ $File::Find::skip_pattern;
81793b90 955
956 # follow symbolic links / do an lstat
07867069 957 $new_loc = Follow_SymLink($loc_pref.$FN);
81793b90 958
959 # ignore if invalid symlink
960 next unless defined $new_loc;
7e47e6ff 961
81793b90 962 if (-d _) {
7e47e6ff 963 push @Stack,[$new_loc,$updir_loc,$dir_name,$FN,1];
81793b90 964 }
965 else {
7e47e6ff 966 $fullname = $new_loc; # $File::Find::fullname
967 $name = $dir_pref . $FN; # $File::Find::name
968 $_ = ($no_chdir ? $name : $FN); # $_
73396e07 969 { &$wanted_callback }; # protect against wild "next"
81793b90 970 }
971 }
972
81793b90 973 }
974 continue {
57e73c4b 975 while (defined($SE = pop @Stack)) {
7e47e6ff 976 ($dir_loc, $updir_loc, $p_dir, $dir_rel, $byd_flag) = @$SE;
977 if ($Is_MacOS) {
978 # $p_dir always has a trailing ':', except for the starting dir,
979 # where $dir_rel eq ':'
980 $dir_name = "$p_dir$dir_rel";
981 $dir_pref = "$dir_name:";
982 $loc_pref = ($dir_loc =~ /:$/) ? $dir_loc : "$dir_loc:";
983 }
984 else {
985 $dir_name = ($p_dir eq '/' ? "/$dir_rel" : "$p_dir/$dir_rel");
986 $dir_pref = "$dir_name/";
987 $loc_pref = "$dir_loc/";
988 }
989 if ( $byd_flag < 0 ) { # must be finddepth, report dirname now
990 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
991 unless (chdir $updir_loc) { # $updir_loc (parent dir) is always untainted
992 warn "Can't cd to $updir_loc: $!\n";
993 next;
994 }
995 }
996 $fullname = $dir_loc; # $File::Find::fullname
997 $name = $dir_name; # $File::Find::name
998 if ($Is_MacOS) {
999 if ($dir_rel eq ':') { # must be the top dir, where we started
1000 $name =~ s|:$||; # $File::Find::name
1001 $p_dir = "$p_dir:" unless ($p_dir =~ /:$/);
1002 }
1003 $dir = $p_dir; # $File::Find::dir
1004 $_ = ($no_chdir ? $name : $dir_rel); # $_
1005 }
1006 else {
1007 if ( substr($name,-2) eq '/.' ) {
1008 $name =~ s|/\.$||; # $File::Find::name
1009 }
1010 $dir = $p_dir; # $File::Find::dir
1011 $_ = ($no_chdir ? $dir_name : $dir_rel); # $_
1012 if ( substr($_,-2) eq '/.' ) {
1013 s|/\.$||;
1014 }
1015 }
1016
1017 lstat($_); # make sure file tests with '_' work
1018 { &$wanted_callback }; # protect against wild "next"
1019 }
1020 else {
1021 push @Stack,[$dir_loc, $updir_loc, $p_dir, $dir_rel,-1] if $bydepth;
1022 last;
1023 }
a0d0e21e 1024 }
1025 }
1026}
1027
81793b90 1028
20408e3c 1029sub wrap_wanted {
81793b90 1030 my $wanted = shift;
1031 if ( ref($wanted) eq 'HASH' ) {
1032 if ( $wanted->{follow} || $wanted->{follow_fast}) {
1033 $wanted->{follow_skip} = 1 unless defined $wanted->{follow_skip};
1034 }
1035 if ( $wanted->{untaint} ) {
7e47e6ff 1036 $wanted->{untaint_pattern} = $File::Find::untaint_pattern
81793b90 1037 unless defined $wanted->{untaint_pattern};
1038 $wanted->{untaint_skip} = 0 unless defined $wanted->{untaint_skip};
1039 }
1040 return $wanted;
1041 }
1042 else {
1043 return { wanted => $wanted };
1044 }
a0d0e21e 1045}
1046
20408e3c 1047sub find {
81793b90 1048 my $wanted = shift;
1049 _find_opt(wrap_wanted($wanted), @_);
a0d0e21e 1050}
1051
55d729e4 1052sub finddepth {
81793b90 1053 my $wanted = wrap_wanted(shift);
1054 $wanted->{bydepth} = 1;
1055 _find_opt($wanted, @_);
20408e3c 1056}
6280b799 1057
7e47e6ff 1058# default
1059$File::Find::skip_pattern = qr/^\.{1,2}\z/;
1060$File::Find::untaint_pattern = qr|^([-+@\w./]+)$|;
1061
6280b799 1062# These are hard-coded for now, but may move to hint files.
10eba763 1063if ($^O eq 'VMS') {
81793b90 1064 $Is_VMS = 1;
7e47e6ff 1065 $File::Find::dont_use_nlink = 1;
1066}
1067elsif ($^O eq 'MacOS') {
1068 $Is_MacOS = 1;
1069 $File::Find::dont_use_nlink = 1;
1070 $File::Find::skip_pattern = qr/^Icon\015\z/;
1071 $File::Find::untaint_pattern = qr|^(.+)$|;
748a9306 1072}
1073
7e47e6ff 1074# this _should_ work properly on all platforms
1075# where File::Find can be expected to work
1076$File::Find::current_dir = File::Spec->curdir || '.';
1077
81793b90 1078$File::Find::dont_use_nlink = 1
497711e7 1079 if $^O eq 'os2' || $^O eq 'dos' || $^O eq 'amigaos' || $^O eq 'MSWin32' ||
fa6a1c44 1080 $^O eq 'cygwin' || $^O eq 'epoc';
6280b799 1081
20408e3c 1082# Set dont_use_nlink in your hint file if your system's stat doesn't
1083# report the number of links in a directory as an indication
1084# of the number of files.
1085# See, e.g. hints/machten.sh for MachTen 2.2.
81793b90 1086unless ($File::Find::dont_use_nlink) {
1087 require Config;
1088 $File::Find::dont_use_nlink = 1 if ($Config::Config{'dont_use_nlink'});
20408e3c 1089}
1090
7e47e6ff 1091# We need a function that checks if a scalar is tainted. Either use the
1092# Scalar::Util module's tainted() function or our (slower) pure Perl
1093# fallback is_tainted_pp()
1094{
1095 local $@;
1096 eval { require Scalar::Util };
1097 *is_tainted = $@ ? \&is_tainted_pp : \&Scalar::Util::tainted;
1098}
1099
a0d0e21e 11001;