More VERSION tuning: to avoid unnecessary Perl upgrades
[p5sagit/p5-mst-13.2.git] / lib / File / Find.pm
CommitLineData
a0d0e21e 1package File::Find;
b75c8c73 2use strict;
b395063c 3use warnings;
4use 5.6.0;
a85af077 5our $VERSION = '1.01';
a0d0e21e 6require Exporter;
6280b799 7require Cwd;
a0d0e21e 8
f06db76b 9=head1 NAME
10
11find - traverse a file tree
12
13finddepth - traverse a directory structure depth-first
14
15=head1 SYNOPSIS
16
17 use File::Find;
81793b90 18 find(\&wanted, '/foo', '/bar');
f06db76b 19 sub wanted { ... }
237437d0 20
f06db76b 21 use File::Find;
81793b90 22 finddepth(\&wanted, '/foo', '/bar');
f06db76b 23 sub wanted { ... }
3cb6de81 24
81793b90 25 use File::Find;
26 find({ wanted => \&process, follow => 1 }, '.');
f06db76b 27
28=head1 DESCRIPTION
29
20408e3c 30The first argument to find() is either a hash reference describing the
81793b90 31operations to be performed for each file, or a code reference.
20408e3c 32
81793b90 33Here are the possible keys for the hash:
34
35=over 3
36
37=item C<wanted>
38
39The value should be a code reference. This code reference is called
40I<the wanted() function> below.
41
42=item C<bydepth>
43
44Reports the name of a directory only AFTER all its entries
45have been reported. Entry point finddepth() is a shortcut for
46specifying C<{ bydepth => 1 }> in the first argument of find().
47
719c805e 48=item C<preprocess>
49
7e47e6ff 50The value should be a code reference. This code reference is used to
51preprocess the current directory. The name of currently processed
52directory is in $File::Find::dir. Your preprocessing function is
53called after readdir() but before the loop that calls the wanted()
54function. It is called with a list of strings (actually file/directory
55names) and is expected to return a list of strings. The code can be
56used to sort the file/directory names alphabetically, numerically,
57or to filter out directory entries based on their name alone. When
58I<follow> or I<follow_fast> are in effect, C<preprocess> is a no-op.
719c805e 59
60=item C<postprocess>
61
7e47e6ff 62The value should be a code reference. It is invoked just before leaving
63the currently processed directory. It is called in void context with no
64arguments. The name of the current directory is in $File::Find::dir. This
65hook is handy for summarizing a directory, such as calculating its disk
3fa6e24b 66usage. When I<follow> or I<follow_fast> are in effect, C<postprocess> is a
7e47e6ff 67no-op.
719c805e 68
81793b90 69=item C<follow>
70
71Causes symbolic links to be followed. Since directory trees with symbolic
72links (followed) may contain files more than once and may even have
73cycles, a hash has to be built up with an entry for each file.
74This might be expensive both in space and time for a large
75directory tree. See I<follow_fast> and I<follow_skip> below.
76If either I<follow> or I<follow_fast> is in effect:
77
78=over 6
79
a45bd81d 80=item *
81793b90 81
f10e1564 82It is guaranteed that an I<lstat> has been called before the user's
81793b90 83I<wanted()> function is called. This enables fast file checks involving S< _>.
84
a45bd81d 85=item *
81793b90 86
87There is a variable C<$File::Find::fullname> which holds the absolute
88pathname of the file with all symbolic links resolved
89
90=back
91
92=item C<follow_fast>
93
f10e1564 94This is similar to I<follow> except that it may report some files more
95than once. It does detect cycles, however. Since only symbolic links
96have to be hashed, this is much cheaper both in space and time. If
97processing a file more than once (by the user's I<wanted()> function)
81793b90 98is worse than just taking time, the option I<follow> should be used.
99
100=item C<follow_skip>
101
102C<follow_skip==1>, which is the default, causes all files which are
103neither directories nor symbolic links to be ignored if they are about
104to be processed a second time. If a directory or a symbolic link
105are about to be processed a second time, File::Find dies.
106C<follow_skip==0> causes File::Find to die if any file is about to be
107processed a second time.
108C<follow_skip==2> causes File::Find to ignore any duplicate files and
7e47e6ff 109directories but to proceed normally otherwise.
20408e3c 110
80e52b73 111=item C<dangling_symlinks>
112
113If true and a code reference, will be called with the symbolic link
114name and the directory it lives in as arguments. Otherwise, if true
115and warnings are on, warning "symbolic_link_name is a dangling
116symbolic link\n" will be issued. If false, the dangling symbolic link
117will be silently ignored.
f06db76b 118
81793b90 119=item C<no_chdir>
120
121Does not C<chdir()> to each directory as it recurses. The wanted()
122function will need to be aware of this, of course. In this case,
123C<$_> will be the same as C<$File::Find::name>.
124
125=item C<untaint>
126
127If find is used in taint-mode (-T command line switch or if EUID != UID
128or if EGID != GID) then internally directory names have to be untainted
7e47e6ff 129before they can be chdir'ed to. Therefore they are checked against a regular
130expression I<untaint_pattern>. Note that all names passed to the user's
131I<wanted()> function are still tainted. If this option is used while
132not in taint-mode, C<untaint> is a no-op.
81793b90 133
134=item C<untaint_pattern>
135
136See above. This should be set using the C<qr> quoting operator.
137The default is set to C<qr|^([-+@\w./]+)$|>.
ac06ab1d 138Note that the parantheses are vital.
81793b90 139
140=item C<untaint_skip>
141
7e47e6ff 142If set, a directory which fails the I<untaint_pattern> is skipped,
143including all its sub-directories. The default is to 'die' in such a case.
81793b90 144
145=back
146
147The wanted() function does whatever verifications you want.
148C<$File::Find::dir> contains the current directory name, and C<$_> the
149current filename within that directory. C<$File::Find::name> contains
f10e1564 150the complete pathname to the file. You are chdir()'d to
151C<$File::Find::dir> when the function is called, unless C<no_chdir>
7e47e6ff 152was specified. When C<follow> or C<follow_fast> are in effect, there is
f10e1564 153also a C<$File::Find::fullname>. The function may set
154C<$File::Find::prune> to prune the tree unless C<bydepth> was
155specified. Unless C<follow> or C<follow_fast> is specified, for
156compatibility reasons (find.pl, find2perl) there are in addition the
157following globals available: C<$File::Find::topdir>,
158C<$File::Find::topdev>, C<$File::Find::topino>,
e7b91b67 159C<$File::Find::topmode> and C<$File::Find::topnlink>.
47a735e8 160
20408e3c 161This library is useful for the C<find2perl> tool, which when fed,
f06db76b 162
163 find2perl / -name .nfs\* -mtime +7 \
81793b90 164 -exec rm -f {} \; -o -fstype nfs -prune
f06db76b 165
166produces something like:
167
168 sub wanted {
c7b9dd21 169 /^\.nfs.*\z/s &&
81793b90 170 (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_)) &&
f06db76b 171 int(-M _) > 7 &&
172 unlink($_)
173 ||
81793b90 174 ($nlink || (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))) &&
f06db76b 175 $dev < 0 &&
6280b799 176 ($File::Find::prune = 1);
f06db76b 177 }
178
81793b90 179Set the variable C<$File::Find::dont_use_nlink> if you're using AFS,
6280b799 180since AFS cheats.
f06db76b 181
f06db76b 182
183Here's another interesting wanted function. It will find all symlinks
184that don't resolve:
185
186 sub wanted {
81793b90 187 -l && !-e && print "bogus link: $File::Find::name\n";
237437d0 188 }
f06db76b 189
81793b90 190See also the script C<pfind> on CPAN for a nice application of this
191module.
192
193=head1 CAVEAT
194
f10e1564 195Be aware that the option to follow symbolic links can be dangerous.
81793b90 196Depending on the structure of the directory tree (including symbolic
197links to directories) you might traverse a given (physical) directory
198more than once (only if C<follow_fast> is in effect).
199Furthermore, deleting or changing files in a symbolically linked directory
200might cause very unpleasant surprises, since you delete or change files
201in an unknown directory.
0530a6c4 202
7e47e6ff 203=head1 NOTES
204
205=over 4
206
207=item *
208
209Mac OS (Classic) users should note a few differences:
210
211=over 4
212
213=item *
214
215The path separator is ':', not '/', and the current directory is denoted
216as ':', not '.'. You should be careful about specifying relative pathnames.
217While a full path always begins with a volume name, a relative pathname
218should always begin with a ':'. If specifying a volume name only, a
219trailing ':' is required.
220
221=item *
222
223C<$File::Find::dir> is guaranteed to end with a ':'. If C<$_>
224contains the name of a directory, that name may or may not end with a
225':'. Likewise, C<$File::Find::name>, which contains the complete
226pathname to that directory, and C<$File::Find::fullname>, which holds
227the absolute pathname of that directory with all symbolic links resolved,
228may or may not end with a ':'.
229
230=item *
231
232The default C<untaint_pattern> (see above) on Mac OS is set to
233C<qr|^(.+)$|>. Note that the parentheses are vital.
234
235=item *
236
237The invisible system file "Icon\015" is ignored. While this file may
238appear in every directory, there are some more invisible system files
239on every volume, which are all located at the volume root level (i.e.
240"MacintoshHD:"). These system files are B<not> excluded automatically.
241Your filter may use the following code to recognize invisible files or
242directories (requires Mac::Files):
243
244 use Mac::Files;
245
246 # invisible() -- returns 1 if file/directory is invisible,
247 # 0 if it's visible or undef if an error occured
248
249 sub invisible($) {
250 my $file = shift;
251 my ($fileCat, $fileInfo);
252 my $invisible_flag = 1 << 14;
253
254 if ( $fileCat = FSpGetCatInfo($file) ) {
255 if ($fileInfo = $fileCat->ioFlFndrInfo() ) {
256 return (($fileInfo->fdFlags & $invisible_flag) && 1);
257 }
258 }
259 return undef;
260 }
261
262Generally, invisible files are system files, unless an odd application
263decides to use invisible files for its own purposes. To distinguish
264such files from system files, you have to look at the B<type> and B<creator>
265file attributes. The MacPerl built-in functions C<GetFileInfo(FILE)> and
266C<SetFileInfo(CREATOR, TYPE, FILES)> offer access to these attributes
267(see MacPerl.pm for details).
268
269Files that appear on the desktop actually reside in an (hidden) directory
270named "Desktop Folder" on the particular disk volume. Note that, although
271all desktop files appear to be on the same "virtual" desktop, each disk
272volume actually maintains its own "Desktop Folder" directory.
273
274=back
275
276=back
0530a6c4 277
a85af077 278=head1 HISTORY
279
280File::Find used to produce incorrect results if called recursively.
281During the development of perl 5.8 this bug was fixed.
282The first fixed version of File::Find was 1.01.
283
f06db76b 284=cut
285
b75c8c73 286our @ISA = qw(Exporter);
287our @EXPORT = qw(find finddepth);
6280b799 288
a0d0e21e 289
81793b90 290use strict;
291my $Is_VMS;
7e47e6ff 292my $Is_MacOS;
81793b90 293
294require File::Basename;
7e47e6ff 295require File::Spec;
81793b90 296
9f826d6a 297# Should ideally be my() not our() but local() currently
298# refuses to operate on lexicals
299
300our %SLnkSeen;
301our ($wanted_callback, $avoid_nlink, $bydepth, $no_chdir, $follow,
719c805e 302 $follow_skip, $full_check, $untaint, $untaint_skip, $untaint_pat,
80e52b73 303 $pre_process, $post_process, $dangling_symlinks);
81793b90 304
305sub contract_name {
306 my ($cdir,$fn) = @_;
307
7e47e6ff 308 return substr($cdir,0,rindex($cdir,'/')) if $fn eq $File::Find::current_dir;
81793b90 309
310 $cdir = substr($cdir,0,rindex($cdir,'/')+1);
311
312 $fn =~ s|^\./||;
313
314 my $abs_name= $cdir . $fn;
315
316 if (substr($fn,0,3) eq '../') {
fecbda2b 317 1 while $abs_name =~ s!/[^/]*/\.\./!/!;
81793b90 318 }
319
320 return $abs_name;
321}
322
7e47e6ff 323# return the absolute name of a directory or file
324sub contract_name_Mac {
325 my ($cdir,$fn) = @_;
326 my $abs_name;
327
328 if ($fn =~ /^(:+)(.*)$/) { # valid pathname starting with a ':'
329
330 my $colon_count = length ($1);
331 if ($colon_count == 1) {
332 $abs_name = $cdir . $2;
333 return $abs_name;
334 }
335 else {
336 # need to move up the tree, but
337 # only if it's not a volume name
338 for (my $i=1; $i<$colon_count; $i++) {
339 unless ($cdir =~ /^[^:]+:$/) { # volume name
340 $cdir =~ s/[^:]+:$//;
341 }
342 else {
343 return undef;
344 }
345 }
346 $abs_name = $cdir . $2;
347 return $abs_name;
348 }
349
350 }
351 else {
352
353 # $fn may be a valid path to a directory or file or (dangling)
354 # symlink, without a leading ':'
355 if ( (-e $fn) || (-l $fn) ) {
356 if ($fn =~ /^[^:]+:/) { # a volume name like DataHD:*
357 return $fn; # $fn is already an absolute path
358 }
359 else {
360 $abs_name = $cdir . $fn;
361 return $abs_name;
362 }
363 }
364 else { # argh!, $fn is not a valid directory/file
365 return undef;
366 }
367 }
368}
81793b90 369
370sub PathCombine($$) {
371 my ($Base,$Name) = @_;
372 my $AbsName;
373
7e47e6ff 374 if ($Is_MacOS) {
375 # $Name is the resolved symlink (always a full path on MacOS),
376 # i.e. there's no need to call contract_name_Mac()
377 $AbsName = $Name;
378
379 # (simple) check for recursion
380 if ( ( $Base =~ /^$AbsName/) && (-d $AbsName) ) { # recursion
381 return undef;
382 }
81793b90 383 }
384 else {
7e47e6ff 385 if (substr($Name,0,1) eq '/') {
386 $AbsName= $Name;
387 }
388 else {
389 $AbsName= contract_name($Base,$Name);
390 }
81793b90 391
7e47e6ff 392 # (simple) check for recursion
393 my $newlen= length($AbsName);
394 if ($newlen <= length($Base)) {
395 if (($newlen == length($Base) || substr($Base,$newlen,1) eq '/')
396 && $AbsName eq substr($Base,0,$newlen))
397 {
398 return undef;
399 }
81793b90 400 }
401 }
402 return $AbsName;
403}
404
405sub Follow_SymLink($) {
406 my ($AbsName) = @_;
407
408 my ($NewName,$DEV, $INO);
409 ($DEV, $INO)= lstat $AbsName;
410
411 while (-l _) {
412 if ($SLnkSeen{$DEV, $INO}++) {
413 if ($follow_skip < 2) {
414 die "$AbsName is encountered a second time";
a0d0e21e 415 }
416 else {
81793b90 417 return undef;
a0d0e21e 418 }
419 }
81793b90 420 $NewName= PathCombine($AbsName, readlink($AbsName));
421 unless(defined $NewName) {
422 if ($follow_skip < 2) {
423 die "$AbsName is a recursive symbolic link";
424 }
425 else {
426 return undef;
a0d0e21e 427 }
81793b90 428 }
429 else {
430 $AbsName= $NewName;
431 }
432 ($DEV, $INO) = lstat($AbsName);
433 return undef unless defined $DEV; # dangling symbolic link
434 }
435
436 if ($full_check && $SLnkSeen{$DEV, $INO}++) {
7e47e6ff 437 if ( ($follow_skip < 1) || ((-d _) && ($follow_skip < 2)) ) {
81793b90 438 die "$AbsName encountered a second time";
439 }
440 else {
441 return undef;
442 }
443 }
444
445 return $AbsName;
446}
447
17f410f9 448our($dir, $name, $fullname, $prune);
81793b90 449sub _find_dir_symlnk($$$);
450sub _find_dir($$$);
451
7e47e6ff 452# check whether or not a scalar variable is tainted
453# (code straight from the Camel, 3rd ed., page 561)
454sub is_tainted_pp {
455 my $arg = shift;
456 my $nada = substr($arg, 0, 0); # zero-length
457 local $@;
458 eval { eval "# $nada" };
459 return length($@) != 0;
460}
461
81793b90 462sub _find_opt {
463 my $wanted = shift;
464 die "invalid top directory" unless defined $_[0];
465
9f826d6a 466 # This function must local()ize everything because callbacks may
467 # call find() or finddepth()
468
469 local %SLnkSeen;
470 local ($wanted_callback, $avoid_nlink, $bydepth, $no_chdir, $follow,
471 $follow_skip, $full_check, $untaint, $untaint_skip, $untaint_pat,
80e52b73 472 $pre_process, $post_process, $dangling_symlinks);
9f826d6a 473 local($dir, $name, $fullname, $prune);
474
80e52b73 475 my $cwd = $wanted->{bydepth} ? Cwd::fastcwd() : Cwd::cwd();
476 my $cwd_untainted = $cwd;
477 my $check_t_cwd = 1;
478 $wanted_callback = $wanted->{wanted};
479 $bydepth = $wanted->{bydepth};
480 $pre_process = $wanted->{preprocess};
481 $post_process = $wanted->{postprocess};
482 $no_chdir = $wanted->{no_chdir};
483 $full_check = $wanted->{follow};
484 $follow = $full_check || $wanted->{follow_fast};
485 $follow_skip = $wanted->{follow_skip};
486 $untaint = $wanted->{untaint};
487 $untaint_pat = $wanted->{untaint_pattern};
488 $untaint_skip = $wanted->{untaint_skip};
489 $dangling_symlinks = $wanted->{dangling_symlinks};
81793b90 490
e7b91b67 491 # for compatability reasons (find.pl, find2perl)
9f826d6a 492 local our ($topdir, $topdev, $topino, $topmode, $topnlink);
81793b90 493
494 # a symbolic link to a directory doesn't increase the link count
495 $avoid_nlink = $follow || $File::Find::dont_use_nlink;
496
e7b91b67 497 my ($abs_dir, $Is_Dir);
81793b90 498
499 Proc_Top_Item:
500 foreach my $TOP (@_) {
7e47e6ff 501 my $top_item = $TOP;
502
503 if ($Is_MacOS) {
504 ($topdev,$topino,$topmode,$topnlink) = $follow ? stat $top_item : lstat $top_item;
505 $top_item = ":$top_item"
3fa6e24b 506 if ( (-d _) && ( $top_item !~ /:/ ) );
7e47e6ff 507 }
508 else {
509 $top_item =~ s|/\z|| unless $top_item eq '/';
510 ($topdev,$topino,$topmode,$topnlink) = $follow ? stat $top_item : lstat $top_item;
511 }
512
513 $Is_Dir= 0;
514
515 if ($follow) {
516
517 if ($Is_MacOS) {
518 $cwd = "$cwd:" unless ($cwd =~ /:$/); # for safety
519
520 if ($top_item eq $File::Find::current_dir) {
521 $abs_dir = $cwd;
522 }
523 else {
524 $abs_dir = contract_name_Mac($cwd, $top_item);
525 unless (defined $abs_dir) {
3fa6e24b 526 warn "Can't determine absolute path for $top_item (No such file or directory)\n" if $^W;
7e47e6ff 527 next Proc_Top_Item;
528 }
529 }
530
531 }
532 else {
533 if (substr($top_item,0,1) eq '/') {
534 $abs_dir = $top_item;
535 }
536 elsif ($top_item eq $File::Find::current_dir) {
537 $abs_dir = $cwd;
538 }
539 else { # care about any ../
540 $abs_dir = contract_name("$cwd/",$top_item);
541 }
542 }
543 $abs_dir= Follow_SymLink($abs_dir);
544 unless (defined $abs_dir) {
80e52b73 545 if ($dangling_symlinks) {
546 if (ref $dangling_symlinks eq 'CODE') {
547 $dangling_symlinks->($top_item, $cwd);
548 } else {
549 warn "$top_item is a dangling symbolic link\n" if $^W;
550 }
551 }
81793b90 552 next Proc_Top_Item;
7e47e6ff 553 }
554
555 if (-d _) {
81793b90 556 _find_dir_symlnk($wanted, $abs_dir, $top_item);
557 $Is_Dir= 1;
7e47e6ff 558 }
559 }
81793b90 560 else { # no follow
7e47e6ff 561 $topdir = $top_item;
562 unless (defined $topnlink) {
3fa6e24b 563 warn "Can't stat $top_item: $!\n" if $^W;
7e47e6ff 564 next Proc_Top_Item;
565 }
566 if (-d _) {
c7b9dd21 567 $top_item =~ s/\.dir\z// if $Is_VMS;
e7b91b67 568 _find_dir($wanted, $top_item, $topnlink);
81793b90 569 $Is_Dir= 1;
7e47e6ff 570 }
237437d0 571 else {
81793b90 572 $abs_dir= $top_item;
7e47e6ff 573 }
574 }
81793b90 575
7e47e6ff 576 unless ($Is_Dir) {
81793b90 577 unless (($_,$dir) = File::Basename::fileparse($abs_dir)) {
7e47e6ff 578 if ($Is_MacOS) {
579 ($dir,$_) = (':', $top_item); # $File::Find::dir, $_
580 }
581 else {
582 ($dir,$_) = ('./', $top_item);
583 }
81793b90 584 }
585
7e47e6ff 586 $abs_dir = $dir;
587 if (( $untaint ) && (is_tainted($dir) )) {
588 ( $abs_dir ) = $dir =~ m|$untaint_pat|;
81793b90 589 unless (defined $abs_dir) {
590 if ($untaint_skip == 0) {
7e47e6ff 591 die "directory $dir is still tainted";
81793b90 592 }
593 else {
594 next Proc_Top_Item;
595 }
596 }
7e47e6ff 597 }
81793b90 598
7e47e6ff 599 unless ($no_chdir || chdir $abs_dir) {
3fa6e24b 600 warn "Couldn't chdir $abs_dir: $!\n" if $^W;
7e47e6ff 601 next Proc_Top_Item;
602 }
719911cc 603
7e47e6ff 604 $name = $abs_dir . $_; # $File::Find::name
719911cc 605
7e47e6ff 606 { &$wanted_callback }; # protect against wild "next"
81793b90 607
7e47e6ff 608 }
81793b90 609
7e47e6ff 610 unless ( $no_chdir ) {
611 if ( ($check_t_cwd) && (($untaint) && (is_tainted($cwd) )) ) {
612 ( $cwd_untainted ) = $cwd =~ m|$untaint_pat|;
613 unless (defined $cwd_untainted) {
614 die "insecure cwd in find(depth)";
615 }
616 $check_t_cwd = 0;
617 }
618 unless (chdir $cwd_untainted) {
619 die "Can't cd to $cwd: $!\n";
620 }
621 }
81793b90 622 }
623}
624
625# API:
626# $wanted
627# $p_dir : "parent directory"
628# $nlink : what came back from the stat
629# preconditions:
630# chdir (if not no_chdir) to dir
631
632sub _find_dir($$$) {
633 my ($wanted, $p_dir, $nlink) = @_;
634 my ($CdLvl,$Level) = (0,0);
635 my @Stack;
636 my @filenames;
637 my ($subcount,$sub_nlink);
638 my $SE= [];
639 my $dir_name= $p_dir;
7e47e6ff 640 my $dir_pref;
641 my $dir_rel;
642 my $tainted = 0;
643
644 if ($Is_MacOS) {
645 $dir_pref= ($p_dir =~ /:$/) ? $p_dir : "$p_dir:"; # preface
646 $dir_rel= ':'; # directory name relative to current directory
647 }
648 else {
649 $dir_pref= ( $p_dir eq '/' ? '/' : "$p_dir/" );
650 $dir_rel= '.'; # directory name relative to current directory
651 }
81793b90 652
653 local ($dir, $name, $prune, *DIR);
7e47e6ff 654
655 unless ( $no_chdir || ($p_dir eq $File::Find::current_dir)) {
81793b90 656 my $udir = $p_dir;
7e47e6ff 657 if (( $untaint ) && (is_tainted($p_dir) )) {
658 ( $udir ) = $p_dir =~ m|$untaint_pat|;
81793b90 659 unless (defined $udir) {
660 if ($untaint_skip == 0) {
661 die "directory $p_dir is still tainted";
662 }
663 else {
664 return;
665 }
237437d0 666 }
a0d0e21e 667 }
81793b90 668 unless (chdir $udir) {
3fa6e24b 669 warn "Can't cd to $udir: $!\n" if $^W;
81793b90 670 return;
671 }
672 }
7e47e6ff 673
674 # push the starting directory
57e73c4b 675 push @Stack,[$CdLvl,$p_dir,$dir_rel,-1] if $bydepth;
81793b90 676
7e47e6ff 677 if ($Is_MacOS) {
678 $p_dir = $dir_pref; # ensure trailing ':'
679 }
680
81793b90 681 while (defined $SE) {
682 unless ($bydepth) {
7e47e6ff 683 $dir= $p_dir; # $File::Find::dir
684 $name= $dir_name; # $File::Find::name
685 $_= ($no_chdir ? $dir_name : $dir_rel ); # $_
81793b90 686 # prune may happen here
7e47e6ff 687 $prune= 0;
688 { &$wanted_callback }; # protect against wild "next"
689 next if $prune;
81793b90 690 }
7e47e6ff 691
81793b90 692 # change to that directory
7e47e6ff 693 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
81793b90 694 my $udir= $dir_rel;
7e47e6ff 695 if ( ($untaint) && (($tainted) || ($tainted = is_tainted($dir_rel) )) ) {
696 ( $udir ) = $dir_rel =~ m|$untaint_pat|;
81793b90 697 unless (defined $udir) {
698 if ($untaint_skip == 0) {
7e47e6ff 699 if ($Is_MacOS) {
700 die "directory ($p_dir) $dir_rel is still tainted";
701 }
702 else {
703 die "directory (" . ($p_dir ne '/' ? $p_dir : '') . "/) $dir_rel is still tainted";
704 }
705 } else { # $untaint_skip == 1
706 next;
81793b90 707 }
708 }
709 }
710 unless (chdir $udir) {
7e47e6ff 711 if ($Is_MacOS) {
3fa6e24b 712 warn "Can't cd to ($p_dir) $udir: $!\n" if $^W;
7e47e6ff 713 }
714 else {
3fa6e24b 715 warn "Can't cd to (" . ($p_dir ne '/' ? $p_dir : '') . "/) $udir: $!\n" if $^W;
7e47e6ff 716 }
81793b90 717 next;
718 }
719 $CdLvl++;
720 }
721
7e47e6ff 722 if ($Is_MacOS) {
723 $dir_name = "$dir_name:" unless ($dir_name =~ /:$/);
724 }
725
726 $dir= $dir_name; # $File::Find::dir
81793b90 727
728 # Get the list of files in the current directory.
7e47e6ff 729 unless (opendir DIR, ($no_chdir ? $dir_name : $File::Find::current_dir)) {
3fa6e24b 730 warn "Can't opendir($dir_name): $!\n" if $^W;
81793b90 731 next;
732 }
733 @filenames = readdir DIR;
734 closedir(DIR);
719c805e 735 @filenames = &$pre_process(@filenames) if $pre_process;
736 push @Stack,[$CdLvl,$dir_name,"",-2] if $post_process;
81793b90 737
738 if ($nlink == 2 && !$avoid_nlink) {
739 # This dir has no subdirectories.
740 for my $FN (@filenames) {
7e47e6ff 741 next if $FN =~ $File::Find::skip_pattern;
81793b90 742
7e47e6ff 743 $name = $dir_pref . $FN; # $File::Find::name
744 $_ = ($no_chdir ? $name : $FN); # $_
73396e07 745 { &$wanted_callback }; # protect against wild "next"
81793b90 746 }
747
748 }
749 else {
750 # This dir has subdirectories.
751 $subcount = $nlink - 2;
752
753 for my $FN (@filenames) {
7e47e6ff 754 next if $FN =~ $File::Find::skip_pattern;
81793b90 755 if ($subcount > 0 || $avoid_nlink) {
756 # Seen all the subdirs?
757 # check for directoriness.
758 # stat is faster for a file in the current directory
07867069 759 $sub_nlink = (lstat ($no_chdir ? $dir_pref . $FN : $FN))[3];
81793b90 760
761 if (-d _) {
762 --$subcount;
c7b9dd21 763 $FN =~ s/\.dir\z// if $Is_VMS;
81793b90 764 push @Stack,[$CdLvl,$dir_name,$FN,$sub_nlink];
765 }
766 else {
7e47e6ff 767 $name = $dir_pref . $FN; # $File::Find::name
768 $_= ($no_chdir ? $name : $FN); # $_
73396e07 769 { &$wanted_callback }; # protect against wild "next"
81793b90 770 }
771 }
07867069 772 else {
7e47e6ff 773 $name = $dir_pref . $FN; # $File::Find::name
774 $_= ($no_chdir ? $name : $FN); # $_
73396e07 775 { &$wanted_callback }; # protect against wild "next"
81793b90 776 }
777 }
778 }
17b275ff 779 }
780 continue {
57e73c4b 781 while ( defined ($SE = pop @Stack) ) {
81793b90 782 ($Level, $p_dir, $dir_rel, $nlink) = @$SE;
783 if ($CdLvl > $Level && !$no_chdir) {
7e47e6ff 784 my $tmp;
785 if ($Is_MacOS) {
786 $tmp = (':' x ($CdLvl-$Level)) . ':';
787 }
788 else {
789 $tmp = join('/',('..') x ($CdLvl-$Level));
790 }
791 die "Can't cd to $dir_name" . $tmp
792 unless chdir ($tmp);
81793b90 793 $CdLvl = $Level;
794 }
7e47e6ff 795
796 if ($Is_MacOS) {
797 # $pdir always has a trailing ':', except for the starting dir,
798 # where $dir_rel eq ':'
799 $dir_name = "$p_dir$dir_rel";
800 $dir_pref = "$dir_name:";
801 }
802 else {
803 $dir_name = ($p_dir eq '/' ? "/$dir_rel" : "$p_dir/$dir_rel");
804 $dir_pref = "$dir_name/";
805 }
806
719c805e 807 if ( $nlink == -2 ) {
7e47e6ff 808 $name = $dir = $p_dir; # $File::Find::name / dir
809 if ($Is_MacOS) {
810 $_ = ':'; # $_
811 }
812 else {
813 $_ = '.';
814 }
719c805e 815 &$post_process; # End-of-directory processing
7e47e6ff 816 }
817 elsif ( $nlink < 0 ) { # must be finddepth, report dirname now
818 $name = $dir_name;
819 if ($Is_MacOS) {
820 if ($dir_rel eq ':') { # must be the top dir, where we started
821 $name =~ s|:$||; # $File::Find::name
822 $p_dir = "$p_dir:" unless ($p_dir =~ /:$/);
823 }
824 $dir = $p_dir; # $File::Find::dir
825 $_ = ($no_chdir ? $name : $dir_rel); # $_
826 }
827 else {
828 if ( substr($name,-2) eq '/.' ) {
829 $name =~ s|/\.$||;
830 }
831 $dir = $p_dir;
832 $_ = ($no_chdir ? $dir_name : $dir_rel );
833 if ( substr($_,-2) eq '/.' ) {
834 s|/\.$||;
835 }
836 }
837 { &$wanted_callback }; # protect against wild "next"
838 }
839 else {
840 push @Stack,[$CdLvl,$p_dir,$dir_rel,-1] if $bydepth;
841 last;
842 }
81793b90 843 }
a0d0e21e 844 }
845}
846
81793b90 847
848# API:
849# $wanted
850# $dir_loc : absolute location of a dir
851# $p_dir : "parent directory"
852# preconditions:
853# chdir (if not no_chdir) to dir
854
855sub _find_dir_symlnk($$$) {
7e47e6ff 856 my ($wanted, $dir_loc, $p_dir) = @_; # $dir_loc is the absolute directory
81793b90 857 my @Stack;
858 my @filenames;
859 my $new_loc;
7e47e6ff 860 my $updir_loc = $dir_loc; # untainted parent directory
81793b90 861 my $SE = [];
862 my $dir_name = $p_dir;
7e47e6ff 863 my $dir_pref;
864 my $loc_pref;
865 my $dir_rel;
866 my $byd_flag; # flag for pending stack entry if $bydepth
867 my $tainted = 0;
868 my $ok = 1;
869
870 if ($Is_MacOS) {
871 $dir_pref = ($p_dir =~ /:$/) ? "$p_dir" : "$p_dir:";
872 $loc_pref = ($dir_loc =~ /:$/) ? "$dir_loc" : "$dir_loc:";
873 $dir_rel = ':'; # directory name relative to current directory
874 } else {
875 $dir_pref = ( $p_dir eq '/' ? '/' : "$p_dir/" );
876 $loc_pref = ( $dir_loc eq '/' ? '/' : "$dir_loc/" );
877 $dir_rel = '.'; # directory name relative to current directory
878 }
81793b90 879
880 local ($dir, $name, $fullname, $prune, *DIR);
7e47e6ff 881
882 unless ($no_chdir) {
883 # untaint the topdir
884 if (( $untaint ) && (is_tainted($dir_loc) )) {
885 ( $updir_loc ) = $dir_loc =~ m|$untaint_pat|; # parent dir, now untainted
886 # once untainted, $updir_loc is pushed on the stack (as parent directory);
887 # hence, we don't need to untaint the parent directory every time we chdir
888 # to it later
889 unless (defined $updir_loc) {
81793b90 890 if ($untaint_skip == 0) {
891 die "directory $dir_loc is still tainted";
892 }
893 else {
894 return;
895 }
896 }
897 }
7e47e6ff 898 $ok = chdir($updir_loc) unless ($p_dir eq $File::Find::current_dir);
899 unless ($ok) {
3fa6e24b 900 warn "Can't cd to $updir_loc: $!\n" if $^W;
81793b90 901 return;
902 }
903 }
904
7e47e6ff 905 push @Stack,[$dir_loc,$updir_loc,$p_dir,$dir_rel,-1] if $bydepth;
906
907 if ($Is_MacOS) {
908 $p_dir = $dir_pref; # ensure trailing ':'
909 }
57e73c4b 910
81793b90 911 while (defined $SE) {
912
913 unless ($bydepth) {
7e47e6ff 914 # change (back) to parent directory (always untainted)
704ea872 915 unless ($no_chdir) {
7e47e6ff 916 unless (chdir $updir_loc) {
3fa6e24b 917 warn "Can't cd to $updir_loc: $!\n" if $^W;
704ea872 918 next;
919 }
920 }
7e47e6ff 921 $dir= $p_dir; # $File::Find::dir
922 $name= $dir_name; # $File::Find::name
923 $_= ($no_chdir ? $dir_name : $dir_rel ); # $_
924 $fullname= $dir_loc; # $File::Find::fullname
81793b90 925 # prune may happen here
7e47e6ff 926 $prune= 0;
704ea872 927 lstat($_); # make sure file tests with '_' work
7e47e6ff 928 { &$wanted_callback }; # protect against wild "next"
929 next if $prune;
81793b90 930 }
931
932 # change to that directory
7e47e6ff 933 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
934 $updir_loc = $dir_loc;
935 if ( ($untaint) && (($tainted) || ($tainted = is_tainted($dir_loc) )) ) {
936 # untaint $dir_loc, what will be pushed on the stack as (untainted) parent dir
937 ( $updir_loc ) = $dir_loc =~ m|$untaint_pat|;
938 unless (defined $updir_loc) {
81793b90 939 if ($untaint_skip == 0) {
940 die "directory $dir_loc is still tainted";
a0d0e21e 941 }
237437d0 942 else {
81793b90 943 next;
237437d0 944 }
a0d0e21e 945 }
946 }
7e47e6ff 947 unless (chdir $updir_loc) {
3fa6e24b 948 warn "Can't cd to $updir_loc: $!\n" if $^W;
81793b90 949 next;
950 }
951 }
952
7e47e6ff 953 if ($Is_MacOS) {
954 $dir_name = "$dir_name:" unless ($dir_name =~ /:$/);
955 }
956
957 $dir = $dir_name; # $File::Find::dir
81793b90 958
959 # Get the list of files in the current directory.
7e47e6ff 960 unless (opendir DIR, ($no_chdir ? $dir_loc : $File::Find::current_dir)) {
3fa6e24b 961 warn "Can't opendir($dir_loc): $!\n" if $^W;
81793b90 962 next;
963 }
964 @filenames = readdir DIR;
965 closedir(DIR);
966
967 for my $FN (@filenames) {
7e47e6ff 968 next if $FN =~ $File::Find::skip_pattern;
81793b90 969
970 # follow symbolic links / do an lstat
07867069 971 $new_loc = Follow_SymLink($loc_pref.$FN);
81793b90 972
973 # ignore if invalid symlink
974 next unless defined $new_loc;
7e47e6ff 975
81793b90 976 if (-d _) {
7e47e6ff 977 push @Stack,[$new_loc,$updir_loc,$dir_name,$FN,1];
81793b90 978 }
979 else {
7e47e6ff 980 $fullname = $new_loc; # $File::Find::fullname
981 $name = $dir_pref . $FN; # $File::Find::name
982 $_ = ($no_chdir ? $name : $FN); # $_
73396e07 983 { &$wanted_callback }; # protect against wild "next"
81793b90 984 }
985 }
986
81793b90 987 }
988 continue {
57e73c4b 989 while (defined($SE = pop @Stack)) {
7e47e6ff 990 ($dir_loc, $updir_loc, $p_dir, $dir_rel, $byd_flag) = @$SE;
991 if ($Is_MacOS) {
992 # $p_dir always has a trailing ':', except for the starting dir,
993 # where $dir_rel eq ':'
994 $dir_name = "$p_dir$dir_rel";
995 $dir_pref = "$dir_name:";
996 $loc_pref = ($dir_loc =~ /:$/) ? $dir_loc : "$dir_loc:";
997 }
998 else {
999 $dir_name = ($p_dir eq '/' ? "/$dir_rel" : "$p_dir/$dir_rel");
1000 $dir_pref = "$dir_name/";
1001 $loc_pref = "$dir_loc/";
1002 }
1003 if ( $byd_flag < 0 ) { # must be finddepth, report dirname now
1004 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
1005 unless (chdir $updir_loc) { # $updir_loc (parent dir) is always untainted
3fa6e24b 1006 warn "Can't cd to $updir_loc: $!\n" if $^W;
7e47e6ff 1007 next;
1008 }
1009 }
1010 $fullname = $dir_loc; # $File::Find::fullname
1011 $name = $dir_name; # $File::Find::name
1012 if ($Is_MacOS) {
1013 if ($dir_rel eq ':') { # must be the top dir, where we started
1014 $name =~ s|:$||; # $File::Find::name
1015 $p_dir = "$p_dir:" unless ($p_dir =~ /:$/);
1016 }
1017 $dir = $p_dir; # $File::Find::dir
1018 $_ = ($no_chdir ? $name : $dir_rel); # $_
1019 }
1020 else {
1021 if ( substr($name,-2) eq '/.' ) {
1022 $name =~ s|/\.$||; # $File::Find::name
1023 }
1024 $dir = $p_dir; # $File::Find::dir
1025 $_ = ($no_chdir ? $dir_name : $dir_rel); # $_
1026 if ( substr($_,-2) eq '/.' ) {
1027 s|/\.$||;
1028 }
1029 }
1030
1031 lstat($_); # make sure file tests with '_' work
1032 { &$wanted_callback }; # protect against wild "next"
1033 }
1034 else {
1035 push @Stack,[$dir_loc, $updir_loc, $p_dir, $dir_rel,-1] if $bydepth;
1036 last;
1037 }
a0d0e21e 1038 }
1039 }
1040}
1041
81793b90 1042
20408e3c 1043sub wrap_wanted {
81793b90 1044 my $wanted = shift;
1045 if ( ref($wanted) eq 'HASH' ) {
1046 if ( $wanted->{follow} || $wanted->{follow_fast}) {
1047 $wanted->{follow_skip} = 1 unless defined $wanted->{follow_skip};
1048 }
1049 if ( $wanted->{untaint} ) {
7e47e6ff 1050 $wanted->{untaint_pattern} = $File::Find::untaint_pattern
81793b90 1051 unless defined $wanted->{untaint_pattern};
1052 $wanted->{untaint_skip} = 0 unless defined $wanted->{untaint_skip};
1053 }
1054 return $wanted;
1055 }
1056 else {
1057 return { wanted => $wanted };
1058 }
a0d0e21e 1059}
1060
20408e3c 1061sub find {
81793b90 1062 my $wanted = shift;
1063 _find_opt(wrap_wanted($wanted), @_);
a0d0e21e 1064}
1065
55d729e4 1066sub finddepth {
81793b90 1067 my $wanted = wrap_wanted(shift);
1068 $wanted->{bydepth} = 1;
1069 _find_opt($wanted, @_);
20408e3c 1070}
6280b799 1071
7e47e6ff 1072# default
1073$File::Find::skip_pattern = qr/^\.{1,2}\z/;
1074$File::Find::untaint_pattern = qr|^([-+@\w./]+)$|;
1075
6280b799 1076# These are hard-coded for now, but may move to hint files.
10eba763 1077if ($^O eq 'VMS') {
81793b90 1078 $Is_VMS = 1;
7e47e6ff 1079 $File::Find::dont_use_nlink = 1;
1080}
1081elsif ($^O eq 'MacOS') {
1082 $Is_MacOS = 1;
1083 $File::Find::dont_use_nlink = 1;
1084 $File::Find::skip_pattern = qr/^Icon\015\z/;
1085 $File::Find::untaint_pattern = qr|^(.+)$|;
748a9306 1086}
1087
7e47e6ff 1088# this _should_ work properly on all platforms
1089# where File::Find can be expected to work
1090$File::Find::current_dir = File::Spec->curdir || '.';
1091
81793b90 1092$File::Find::dont_use_nlink = 1
497711e7 1093 if $^O eq 'os2' || $^O eq 'dos' || $^O eq 'amigaos' || $^O eq 'MSWin32' ||
2986a63f 1094 $^O eq 'cygwin' || $^O eq 'epoc' || $^O eq 'NetWare';
6280b799 1095
20408e3c 1096# Set dont_use_nlink in your hint file if your system's stat doesn't
1097# report the number of links in a directory as an indication
1098# of the number of files.
1099# See, e.g. hints/machten.sh for MachTen 2.2.
81793b90 1100unless ($File::Find::dont_use_nlink) {
1101 require Config;
1102 $File::Find::dont_use_nlink = 1 if ($Config::Config{'dont_use_nlink'});
20408e3c 1103}
1104
7e47e6ff 1105# We need a function that checks if a scalar is tainted. Either use the
1106# Scalar::Util module's tainted() function or our (slower) pure Perl
1107# fallback is_tainted_pp()
1108{
1109 local $@;
1110 eval { require Scalar::Util };
1111 *is_tainted = $@ ? \&is_tainted_pp : \&Scalar::Util::tainted;
1112}
1113
a0d0e21e 11141;