Manual integration error in #12235.
[p5sagit/p5-mst-13.2.git] / lib / File / Find.pm
CommitLineData
a0d0e21e 1package File::Find;
3b825e41 2use 5.006;
b75c8c73 3use strict;
b395063c 4use warnings;
88d01e8d 5our $VERSION = '1.02';
a0d0e21e 6require Exporter;
6280b799 7require Cwd;
a0d0e21e 8
f06db76b 9=head1 NAME
10
11find - traverse a file tree
12
13finddepth - traverse a directory structure depth-first
14
15=head1 SYNOPSIS
16
17 use File::Find;
81793b90 18 find(\&wanted, '/foo', '/bar');
f06db76b 19 sub wanted { ... }
237437d0 20
f06db76b 21 use File::Find;
81793b90 22 finddepth(\&wanted, '/foo', '/bar');
f06db76b 23 sub wanted { ... }
3cb6de81 24
81793b90 25 use File::Find;
26 find({ wanted => \&process, follow => 1 }, '.');
f06db76b 27
28=head1 DESCRIPTION
29
20408e3c 30The first argument to find() is either a hash reference describing the
81793b90 31operations to be performed for each file, or a code reference.
20408e3c 32
81793b90 33Here are the possible keys for the hash:
34
35=over 3
36
37=item C<wanted>
38
39The value should be a code reference. This code reference is called
40I<the wanted() function> below.
41
42=item C<bydepth>
43
44Reports the name of a directory only AFTER all its entries
45have been reported. Entry point finddepth() is a shortcut for
46specifying C<{ bydepth => 1 }> in the first argument of find().
47
719c805e 48=item C<preprocess>
49
7e47e6ff 50The value should be a code reference. This code reference is used to
51preprocess the current directory. The name of currently processed
52directory is in $File::Find::dir. Your preprocessing function is
53called after readdir() but before the loop that calls the wanted()
54function. It is called with a list of strings (actually file/directory
55names) and is expected to return a list of strings. The code can be
56used to sort the file/directory names alphabetically, numerically,
57or to filter out directory entries based on their name alone. When
58I<follow> or I<follow_fast> are in effect, C<preprocess> is a no-op.
719c805e 59
60=item C<postprocess>
61
7e47e6ff 62The value should be a code reference. It is invoked just before leaving
63the currently processed directory. It is called in void context with no
64arguments. The name of the current directory is in $File::Find::dir. This
65hook is handy for summarizing a directory, such as calculating its disk
3fa6e24b 66usage. When I<follow> or I<follow_fast> are in effect, C<postprocess> is a
7e47e6ff 67no-op.
719c805e 68
81793b90 69=item C<follow>
70
71Causes symbolic links to be followed. Since directory trees with symbolic
72links (followed) may contain files more than once and may even have
73cycles, a hash has to be built up with an entry for each file.
74This might be expensive both in space and time for a large
75directory tree. See I<follow_fast> and I<follow_skip> below.
76If either I<follow> or I<follow_fast> is in effect:
77
78=over 6
79
a45bd81d 80=item *
81793b90 81
f10e1564 82It is guaranteed that an I<lstat> has been called before the user's
81793b90 83I<wanted()> function is called. This enables fast file checks involving S< _>.
84
a45bd81d 85=item *
81793b90 86
87There is a variable C<$File::Find::fullname> which holds the absolute
88pathname of the file with all symbolic links resolved
89
90=back
91
92=item C<follow_fast>
93
f10e1564 94This is similar to I<follow> except that it may report some files more
95than once. It does detect cycles, however. Since only symbolic links
96have to be hashed, this is much cheaper both in space and time. If
97processing a file more than once (by the user's I<wanted()> function)
81793b90 98is worse than just taking time, the option I<follow> should be used.
99
100=item C<follow_skip>
101
102C<follow_skip==1>, which is the default, causes all files which are
103neither directories nor symbolic links to be ignored if they are about
104to be processed a second time. If a directory or a symbolic link
105are about to be processed a second time, File::Find dies.
106C<follow_skip==0> causes File::Find to die if any file is about to be
107processed a second time.
108C<follow_skip==2> causes File::Find to ignore any duplicate files and
7e47e6ff 109directories but to proceed normally otherwise.
20408e3c 110
80e52b73 111=item C<dangling_symlinks>
112
113If true and a code reference, will be called with the symbolic link
114name and the directory it lives in as arguments. Otherwise, if true
115and warnings are on, warning "symbolic_link_name is a dangling
116symbolic link\n" will be issued. If false, the dangling symbolic link
117will be silently ignored.
f06db76b 118
81793b90 119=item C<no_chdir>
120
121Does not C<chdir()> to each directory as it recurses. The wanted()
122function will need to be aware of this, of course. In this case,
123C<$_> will be the same as C<$File::Find::name>.
124
125=item C<untaint>
126
127If find is used in taint-mode (-T command line switch or if EUID != UID
128or if EGID != GID) then internally directory names have to be untainted
7e47e6ff 129before they can be chdir'ed to. Therefore they are checked against a regular
130expression I<untaint_pattern>. Note that all names passed to the user's
131I<wanted()> function are still tainted. If this option is used while
132not in taint-mode, C<untaint> is a no-op.
81793b90 133
134=item C<untaint_pattern>
135
136See above. This should be set using the C<qr> quoting operator.
137The default is set to C<qr|^([-+@\w./]+)$|>.
1cffc1dd 138Note that the parentheses are vital.
81793b90 139
140=item C<untaint_skip>
141
7e47e6ff 142If set, a directory which fails the I<untaint_pattern> is skipped,
143including all its sub-directories. The default is to 'die' in such a case.
81793b90 144
145=back
146
147The wanted() function does whatever verifications you want.
148C<$File::Find::dir> contains the current directory name, and C<$_> the
149current filename within that directory. C<$File::Find::name> contains
f10e1564 150the complete pathname to the file. You are chdir()'d to
151C<$File::Find::dir> when the function is called, unless C<no_chdir>
7e47e6ff 152was specified. When C<follow> or C<follow_fast> are in effect, there is
f10e1564 153also a C<$File::Find::fullname>. The function may set
154C<$File::Find::prune> to prune the tree unless C<bydepth> was
155specified. Unless C<follow> or C<follow_fast> is specified, for
156compatibility reasons (find.pl, find2perl) there are in addition the
157following globals available: C<$File::Find::topdir>,
158C<$File::Find::topdev>, C<$File::Find::topino>,
e7b91b67 159C<$File::Find::topmode> and C<$File::Find::topnlink>.
47a735e8 160
20408e3c 161This library is useful for the C<find2perl> tool, which when fed,
f06db76b 162
163 find2perl / -name .nfs\* -mtime +7 \
81793b90 164 -exec rm -f {} \; -o -fstype nfs -prune
f06db76b 165
166produces something like:
167
168 sub wanted {
c7b9dd21 169 /^\.nfs.*\z/s &&
81793b90 170 (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_)) &&
f06db76b 171 int(-M _) > 7 &&
172 unlink($_)
173 ||
81793b90 174 ($nlink || (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))) &&
f06db76b 175 $dev < 0 &&
6280b799 176 ($File::Find::prune = 1);
f06db76b 177 }
178
43dece2a 179Notice the C<_> in the above C<int(-M _)>: the C<_> is a magical
180filehandle that caches the information from the preceding
181stat(), lstat(), or filetest.
182
81793b90 183Set the variable C<$File::Find::dont_use_nlink> if you're using AFS,
6280b799 184since AFS cheats.
f06db76b 185
1cffc1dd 186Here's another interesting wanted function. It will find all symbolic
187links that don't resolve:
f06db76b 188
189 sub wanted {
81793b90 190 -l && !-e && print "bogus link: $File::Find::name\n";
237437d0 191 }
f06db76b 192
81793b90 193See also the script C<pfind> on CPAN for a nice application of this
194module.
195
196=head1 CAVEAT
197
f10e1564 198Be aware that the option to follow symbolic links can be dangerous.
81793b90 199Depending on the structure of the directory tree (including symbolic
200links to directories) you might traverse a given (physical) directory
201more than once (only if C<follow_fast> is in effect).
202Furthermore, deleting or changing files in a symbolically linked directory
203might cause very unpleasant surprises, since you delete or change files
204in an unknown directory.
0530a6c4 205
7e47e6ff 206=head1 NOTES
207
208=over 4
209
210=item *
211
212Mac OS (Classic) users should note a few differences:
213
214=over 4
215
216=item *
217
218The path separator is ':', not '/', and the current directory is denoted
219as ':', not '.'. You should be careful about specifying relative pathnames.
220While a full path always begins with a volume name, a relative pathname
221should always begin with a ':'. If specifying a volume name only, a
222trailing ':' is required.
223
224=item *
225
226C<$File::Find::dir> is guaranteed to end with a ':'. If C<$_>
227contains the name of a directory, that name may or may not end with a
228':'. Likewise, C<$File::Find::name>, which contains the complete
229pathname to that directory, and C<$File::Find::fullname>, which holds
230the absolute pathname of that directory with all symbolic links resolved,
231may or may not end with a ':'.
232
233=item *
234
235The default C<untaint_pattern> (see above) on Mac OS is set to
236C<qr|^(.+)$|>. Note that the parentheses are vital.
237
238=item *
239
240The invisible system file "Icon\015" is ignored. While this file may
241appear in every directory, there are some more invisible system files
242on every volume, which are all located at the volume root level (i.e.
243"MacintoshHD:"). These system files are B<not> excluded automatically.
244Your filter may use the following code to recognize invisible files or
245directories (requires Mac::Files):
246
247 use Mac::Files;
248
249 # invisible() -- returns 1 if file/directory is invisible,
1cffc1dd 250 # 0 if it's visible or undef if an error occurred
7e47e6ff 251
252 sub invisible($) {
253 my $file = shift;
254 my ($fileCat, $fileInfo);
255 my $invisible_flag = 1 << 14;
256
257 if ( $fileCat = FSpGetCatInfo($file) ) {
258 if ($fileInfo = $fileCat->ioFlFndrInfo() ) {
259 return (($fileInfo->fdFlags & $invisible_flag) && 1);
260 }
261 }
262 return undef;
263 }
264
265Generally, invisible files are system files, unless an odd application
266decides to use invisible files for its own purposes. To distinguish
267such files from system files, you have to look at the B<type> and B<creator>
268file attributes. The MacPerl built-in functions C<GetFileInfo(FILE)> and
269C<SetFileInfo(CREATOR, TYPE, FILES)> offer access to these attributes
270(see MacPerl.pm for details).
271
272Files that appear on the desktop actually reside in an (hidden) directory
273named "Desktop Folder" on the particular disk volume. Note that, although
274all desktop files appear to be on the same "virtual" desktop, each disk
275volume actually maintains its own "Desktop Folder" directory.
276
277=back
278
279=back
0530a6c4 280
a85af077 281=head1 HISTORY
282
283File::Find used to produce incorrect results if called recursively.
284During the development of perl 5.8 this bug was fixed.
285The first fixed version of File::Find was 1.01.
286
f06db76b 287=cut
288
b75c8c73 289our @ISA = qw(Exporter);
290our @EXPORT = qw(find finddepth);
6280b799 291
a0d0e21e 292
81793b90 293use strict;
294my $Is_VMS;
7e47e6ff 295my $Is_MacOS;
81793b90 296
297require File::Basename;
7e47e6ff 298require File::Spec;
81793b90 299
9f826d6a 300# Should ideally be my() not our() but local() currently
301# refuses to operate on lexicals
302
303our %SLnkSeen;
304our ($wanted_callback, $avoid_nlink, $bydepth, $no_chdir, $follow,
719c805e 305 $follow_skip, $full_check, $untaint, $untaint_skip, $untaint_pat,
80e52b73 306 $pre_process, $post_process, $dangling_symlinks);
81793b90 307
308sub contract_name {
309 my ($cdir,$fn) = @_;
310
7e47e6ff 311 return substr($cdir,0,rindex($cdir,'/')) if $fn eq $File::Find::current_dir;
81793b90 312
313 $cdir = substr($cdir,0,rindex($cdir,'/')+1);
314
315 $fn =~ s|^\./||;
316
317 my $abs_name= $cdir . $fn;
318
319 if (substr($fn,0,3) eq '../') {
fecbda2b 320 1 while $abs_name =~ s!/[^/]*/\.\./!/!;
81793b90 321 }
322
323 return $abs_name;
324}
325
7e47e6ff 326# return the absolute name of a directory or file
327sub contract_name_Mac {
328 my ($cdir,$fn) = @_;
329 my $abs_name;
330
331 if ($fn =~ /^(:+)(.*)$/) { # valid pathname starting with a ':'
332
333 my $colon_count = length ($1);
334 if ($colon_count == 1) {
335 $abs_name = $cdir . $2;
336 return $abs_name;
337 }
338 else {
339 # need to move up the tree, but
340 # only if it's not a volume name
341 for (my $i=1; $i<$colon_count; $i++) {
342 unless ($cdir =~ /^[^:]+:$/) { # volume name
343 $cdir =~ s/[^:]+:$//;
344 }
345 else {
346 return undef;
347 }
348 }
349 $abs_name = $cdir . $2;
350 return $abs_name;
351 }
352
353 }
354 else {
355
356 # $fn may be a valid path to a directory or file or (dangling)
357 # symlink, without a leading ':'
358 if ( (-e $fn) || (-l $fn) ) {
359 if ($fn =~ /^[^:]+:/) { # a volume name like DataHD:*
360 return $fn; # $fn is already an absolute path
361 }
362 else {
363 $abs_name = $cdir . $fn;
364 return $abs_name;
365 }
366 }
367 else { # argh!, $fn is not a valid directory/file
368 return undef;
369 }
370 }
371}
81793b90 372
373sub PathCombine($$) {
374 my ($Base,$Name) = @_;
375 my $AbsName;
376
7e47e6ff 377 if ($Is_MacOS) {
378 # $Name is the resolved symlink (always a full path on MacOS),
379 # i.e. there's no need to call contract_name_Mac()
380 $AbsName = $Name;
381
382 # (simple) check for recursion
383 if ( ( $Base =~ /^$AbsName/) && (-d $AbsName) ) { # recursion
384 return undef;
385 }
81793b90 386 }
387 else {
7e47e6ff 388 if (substr($Name,0,1) eq '/') {
389 $AbsName= $Name;
390 }
391 else {
392 $AbsName= contract_name($Base,$Name);
393 }
81793b90 394
7e47e6ff 395 # (simple) check for recursion
396 my $newlen= length($AbsName);
397 if ($newlen <= length($Base)) {
398 if (($newlen == length($Base) || substr($Base,$newlen,1) eq '/')
399 && $AbsName eq substr($Base,0,$newlen))
400 {
401 return undef;
402 }
81793b90 403 }
404 }
405 return $AbsName;
406}
407
408sub Follow_SymLink($) {
409 my ($AbsName) = @_;
410
411 my ($NewName,$DEV, $INO);
412 ($DEV, $INO)= lstat $AbsName;
413
414 while (-l _) {
415 if ($SLnkSeen{$DEV, $INO}++) {
416 if ($follow_skip < 2) {
417 die "$AbsName is encountered a second time";
a0d0e21e 418 }
419 else {
81793b90 420 return undef;
a0d0e21e 421 }
422 }
81793b90 423 $NewName= PathCombine($AbsName, readlink($AbsName));
424 unless(defined $NewName) {
425 if ($follow_skip < 2) {
426 die "$AbsName is a recursive symbolic link";
427 }
428 else {
429 return undef;
a0d0e21e 430 }
81793b90 431 }
432 else {
433 $AbsName= $NewName;
434 }
435 ($DEV, $INO) = lstat($AbsName);
436 return undef unless defined $DEV; # dangling symbolic link
437 }
438
439 if ($full_check && $SLnkSeen{$DEV, $INO}++) {
7e47e6ff 440 if ( ($follow_skip < 1) || ((-d _) && ($follow_skip < 2)) ) {
81793b90 441 die "$AbsName encountered a second time";
442 }
443 else {
444 return undef;
445 }
446 }
447
448 return $AbsName;
449}
450
17f410f9 451our($dir, $name, $fullname, $prune);
81793b90 452sub _find_dir_symlnk($$$);
453sub _find_dir($$$);
454
7e47e6ff 455# check whether or not a scalar variable is tainted
456# (code straight from the Camel, 3rd ed., page 561)
457sub is_tainted_pp {
458 my $arg = shift;
459 my $nada = substr($arg, 0, 0); # zero-length
460 local $@;
461 eval { eval "# $nada" };
462 return length($@) != 0;
463}
464
81793b90 465sub _find_opt {
466 my $wanted = shift;
467 die "invalid top directory" unless defined $_[0];
468
9f826d6a 469 # This function must local()ize everything because callbacks may
470 # call find() or finddepth()
471
472 local %SLnkSeen;
473 local ($wanted_callback, $avoid_nlink, $bydepth, $no_chdir, $follow,
474 $follow_skip, $full_check, $untaint, $untaint_skip, $untaint_pat,
80e52b73 475 $pre_process, $post_process, $dangling_symlinks);
9f826d6a 476 local($dir, $name, $fullname, $prune);
477
a0c9c202 478 my $cwd = $wanted->{bydepth} ? Cwd::fastcwd() : Cwd::getcwd();
80e52b73 479 my $cwd_untainted = $cwd;
480 my $check_t_cwd = 1;
481 $wanted_callback = $wanted->{wanted};
482 $bydepth = $wanted->{bydepth};
483 $pre_process = $wanted->{preprocess};
484 $post_process = $wanted->{postprocess};
485 $no_chdir = $wanted->{no_chdir};
486 $full_check = $wanted->{follow};
487 $follow = $full_check || $wanted->{follow_fast};
488 $follow_skip = $wanted->{follow_skip};
489 $untaint = $wanted->{untaint};
490 $untaint_pat = $wanted->{untaint_pattern};
491 $untaint_skip = $wanted->{untaint_skip};
492 $dangling_symlinks = $wanted->{dangling_symlinks};
81793b90 493
1cffc1dd 494 # for compatibility reasons (find.pl, find2perl)
9f826d6a 495 local our ($topdir, $topdev, $topino, $topmode, $topnlink);
81793b90 496
497 # a symbolic link to a directory doesn't increase the link count
498 $avoid_nlink = $follow || $File::Find::dont_use_nlink;
499
e7b91b67 500 my ($abs_dir, $Is_Dir);
81793b90 501
502 Proc_Top_Item:
503 foreach my $TOP (@_) {
7e47e6ff 504 my $top_item = $TOP;
505
506 if ($Is_MacOS) {
507 ($topdev,$topino,$topmode,$topnlink) = $follow ? stat $top_item : lstat $top_item;
508 $top_item = ":$top_item"
3fa6e24b 509 if ( (-d _) && ( $top_item !~ /:/ ) );
7e47e6ff 510 }
511 else {
512 $top_item =~ s|/\z|| unless $top_item eq '/';
513 ($topdev,$topino,$topmode,$topnlink) = $follow ? stat $top_item : lstat $top_item;
514 }
515
516 $Is_Dir= 0;
517
518 if ($follow) {
519
520 if ($Is_MacOS) {
521 $cwd = "$cwd:" unless ($cwd =~ /:$/); # for safety
522
523 if ($top_item eq $File::Find::current_dir) {
524 $abs_dir = $cwd;
525 }
526 else {
527 $abs_dir = contract_name_Mac($cwd, $top_item);
528 unless (defined $abs_dir) {
3fa6e24b 529 warn "Can't determine absolute path for $top_item (No such file or directory)\n" if $^W;
7e47e6ff 530 next Proc_Top_Item;
531 }
532 }
533
534 }
535 else {
536 if (substr($top_item,0,1) eq '/') {
537 $abs_dir = $top_item;
538 }
539 elsif ($top_item eq $File::Find::current_dir) {
540 $abs_dir = $cwd;
541 }
542 else { # care about any ../
543 $abs_dir = contract_name("$cwd/",$top_item);
544 }
545 }
546 $abs_dir= Follow_SymLink($abs_dir);
547 unless (defined $abs_dir) {
80e52b73 548 if ($dangling_symlinks) {
549 if (ref $dangling_symlinks eq 'CODE') {
550 $dangling_symlinks->($top_item, $cwd);
551 } else {
552 warn "$top_item is a dangling symbolic link\n" if $^W;
553 }
554 }
81793b90 555 next Proc_Top_Item;
7e47e6ff 556 }
557
558 if (-d _) {
81793b90 559 _find_dir_symlnk($wanted, $abs_dir, $top_item);
560 $Is_Dir= 1;
7e47e6ff 561 }
562 }
81793b90 563 else { # no follow
7e47e6ff 564 $topdir = $top_item;
565 unless (defined $topnlink) {
3fa6e24b 566 warn "Can't stat $top_item: $!\n" if $^W;
7e47e6ff 567 next Proc_Top_Item;
568 }
569 if (-d _) {
c7b9dd21 570 $top_item =~ s/\.dir\z// if $Is_VMS;
e7b91b67 571 _find_dir($wanted, $top_item, $topnlink);
81793b90 572 $Is_Dir= 1;
7e47e6ff 573 }
237437d0 574 else {
81793b90 575 $abs_dir= $top_item;
7e47e6ff 576 }
577 }
81793b90 578
7e47e6ff 579 unless ($Is_Dir) {
81793b90 580 unless (($_,$dir) = File::Basename::fileparse($abs_dir)) {
7e47e6ff 581 if ($Is_MacOS) {
582 ($dir,$_) = (':', $top_item); # $File::Find::dir, $_
583 }
584 else {
585 ($dir,$_) = ('./', $top_item);
586 }
81793b90 587 }
588
7e47e6ff 589 $abs_dir = $dir;
590 if (( $untaint ) && (is_tainted($dir) )) {
591 ( $abs_dir ) = $dir =~ m|$untaint_pat|;
81793b90 592 unless (defined $abs_dir) {
593 if ($untaint_skip == 0) {
7e47e6ff 594 die "directory $dir is still tainted";
81793b90 595 }
596 else {
597 next Proc_Top_Item;
598 }
599 }
7e47e6ff 600 }
81793b90 601
7e47e6ff 602 unless ($no_chdir || chdir $abs_dir) {
3fa6e24b 603 warn "Couldn't chdir $abs_dir: $!\n" if $^W;
7e47e6ff 604 next Proc_Top_Item;
605 }
719911cc 606
7e47e6ff 607 $name = $abs_dir . $_; # $File::Find::name
719911cc 608
7e47e6ff 609 { &$wanted_callback }; # protect against wild "next"
81793b90 610
7e47e6ff 611 }
81793b90 612
7e47e6ff 613 unless ( $no_chdir ) {
614 if ( ($check_t_cwd) && (($untaint) && (is_tainted($cwd) )) ) {
615 ( $cwd_untainted ) = $cwd =~ m|$untaint_pat|;
616 unless (defined $cwd_untainted) {
617 die "insecure cwd in find(depth)";
618 }
619 $check_t_cwd = 0;
620 }
621 unless (chdir $cwd_untainted) {
622 die "Can't cd to $cwd: $!\n";
623 }
624 }
81793b90 625 }
626}
627
628# API:
629# $wanted
630# $p_dir : "parent directory"
631# $nlink : what came back from the stat
632# preconditions:
633# chdir (if not no_chdir) to dir
634
635sub _find_dir($$$) {
636 my ($wanted, $p_dir, $nlink) = @_;
637 my ($CdLvl,$Level) = (0,0);
638 my @Stack;
639 my @filenames;
640 my ($subcount,$sub_nlink);
641 my $SE= [];
642 my $dir_name= $p_dir;
7e47e6ff 643 my $dir_pref;
39e79f6b 644 my $dir_rel = $File::Find::current_dir;
7e47e6ff 645 my $tainted = 0;
646
647 if ($Is_MacOS) {
648 $dir_pref= ($p_dir =~ /:$/) ? $p_dir : "$p_dir:"; # preface
7e47e6ff 649 }
650 else {
651 $dir_pref= ( $p_dir eq '/' ? '/' : "$p_dir/" );
7e47e6ff 652 }
81793b90 653
654 local ($dir, $name, $prune, *DIR);
7e47e6ff 655
656 unless ( $no_chdir || ($p_dir eq $File::Find::current_dir)) {
81793b90 657 my $udir = $p_dir;
7e47e6ff 658 if (( $untaint ) && (is_tainted($p_dir) )) {
659 ( $udir ) = $p_dir =~ m|$untaint_pat|;
81793b90 660 unless (defined $udir) {
661 if ($untaint_skip == 0) {
662 die "directory $p_dir is still tainted";
663 }
664 else {
665 return;
666 }
237437d0 667 }
a0d0e21e 668 }
81793b90 669 unless (chdir $udir) {
3fa6e24b 670 warn "Can't cd to $udir: $!\n" if $^W;
81793b90 671 return;
672 }
673 }
7e47e6ff 674
675 # push the starting directory
57e73c4b 676 push @Stack,[$CdLvl,$p_dir,$dir_rel,-1] if $bydepth;
81793b90 677
7e47e6ff 678 if ($Is_MacOS) {
679 $p_dir = $dir_pref; # ensure trailing ':'
680 }
681
81793b90 682 while (defined $SE) {
683 unless ($bydepth) {
7e47e6ff 684 $dir= $p_dir; # $File::Find::dir
685 $name= $dir_name; # $File::Find::name
686 $_= ($no_chdir ? $dir_name : $dir_rel ); # $_
81793b90 687 # prune may happen here
7e47e6ff 688 $prune= 0;
689 { &$wanted_callback }; # protect against wild "next"
690 next if $prune;
81793b90 691 }
7e47e6ff 692
81793b90 693 # change to that directory
7e47e6ff 694 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
81793b90 695 my $udir= $dir_rel;
7e47e6ff 696 if ( ($untaint) && (($tainted) || ($tainted = is_tainted($dir_rel) )) ) {
697 ( $udir ) = $dir_rel =~ m|$untaint_pat|;
81793b90 698 unless (defined $udir) {
699 if ($untaint_skip == 0) {
7e47e6ff 700 if ($Is_MacOS) {
701 die "directory ($p_dir) $dir_rel is still tainted";
702 }
703 else {
704 die "directory (" . ($p_dir ne '/' ? $p_dir : '') . "/) $dir_rel is still tainted";
705 }
706 } else { # $untaint_skip == 1
707 next;
81793b90 708 }
709 }
710 }
711 unless (chdir $udir) {
7e47e6ff 712 if ($Is_MacOS) {
3fa6e24b 713 warn "Can't cd to ($p_dir) $udir: $!\n" if $^W;
7e47e6ff 714 }
715 else {
3fa6e24b 716 warn "Can't cd to (" . ($p_dir ne '/' ? $p_dir : '') . "/) $udir: $!\n" if $^W;
7e47e6ff 717 }
81793b90 718 next;
719 }
720 $CdLvl++;
721 }
722
7e47e6ff 723 if ($Is_MacOS) {
724 $dir_name = "$dir_name:" unless ($dir_name =~ /:$/);
725 }
726
727 $dir= $dir_name; # $File::Find::dir
81793b90 728
729 # Get the list of files in the current directory.
7e47e6ff 730 unless (opendir DIR, ($no_chdir ? $dir_name : $File::Find::current_dir)) {
3fa6e24b 731 warn "Can't opendir($dir_name): $!\n" if $^W;
81793b90 732 next;
733 }
734 @filenames = readdir DIR;
735 closedir(DIR);
719c805e 736 @filenames = &$pre_process(@filenames) if $pre_process;
737 push @Stack,[$CdLvl,$dir_name,"",-2] if $post_process;
81793b90 738
739 if ($nlink == 2 && !$avoid_nlink) {
740 # This dir has no subdirectories.
741 for my $FN (@filenames) {
7e47e6ff 742 next if $FN =~ $File::Find::skip_pattern;
81793b90 743
7e47e6ff 744 $name = $dir_pref . $FN; # $File::Find::name
745 $_ = ($no_chdir ? $name : $FN); # $_
73396e07 746 { &$wanted_callback }; # protect against wild "next"
81793b90 747 }
748
749 }
750 else {
751 # This dir has subdirectories.
752 $subcount = $nlink - 2;
753
754 for my $FN (@filenames) {
7e47e6ff 755 next if $FN =~ $File::Find::skip_pattern;
81793b90 756 if ($subcount > 0 || $avoid_nlink) {
757 # Seen all the subdirs?
758 # check for directoriness.
759 # stat is faster for a file in the current directory
07867069 760 $sub_nlink = (lstat ($no_chdir ? $dir_pref . $FN : $FN))[3];
81793b90 761
762 if (-d _) {
763 --$subcount;
c7b9dd21 764 $FN =~ s/\.dir\z// if $Is_VMS;
81793b90 765 push @Stack,[$CdLvl,$dir_name,$FN,$sub_nlink];
766 }
767 else {
7e47e6ff 768 $name = $dir_pref . $FN; # $File::Find::name
769 $_= ($no_chdir ? $name : $FN); # $_
73396e07 770 { &$wanted_callback }; # protect against wild "next"
81793b90 771 }
772 }
07867069 773 else {
7e47e6ff 774 $name = $dir_pref . $FN; # $File::Find::name
775 $_= ($no_chdir ? $name : $FN); # $_
73396e07 776 { &$wanted_callback }; # protect against wild "next"
81793b90 777 }
778 }
779 }
17b275ff 780 }
781 continue {
57e73c4b 782 while ( defined ($SE = pop @Stack) ) {
81793b90 783 ($Level, $p_dir, $dir_rel, $nlink) = @$SE;
784 if ($CdLvl > $Level && !$no_chdir) {
7e47e6ff 785 my $tmp;
786 if ($Is_MacOS) {
787 $tmp = (':' x ($CdLvl-$Level)) . ':';
788 }
789 else {
790 $tmp = join('/',('..') x ($CdLvl-$Level));
791 }
792 die "Can't cd to $dir_name" . $tmp
793 unless chdir ($tmp);
81793b90 794 $CdLvl = $Level;
795 }
7e47e6ff 796
797 if ($Is_MacOS) {
798 # $pdir always has a trailing ':', except for the starting dir,
799 # where $dir_rel eq ':'
800 $dir_name = "$p_dir$dir_rel";
801 $dir_pref = "$dir_name:";
802 }
803 else {
804 $dir_name = ($p_dir eq '/' ? "/$dir_rel" : "$p_dir/$dir_rel");
805 $dir_pref = "$dir_name/";
806 }
807
719c805e 808 if ( $nlink == -2 ) {
7e47e6ff 809 $name = $dir = $p_dir; # $File::Find::name / dir
39e79f6b 810 $_ = $File::Find::current_dir;
719c805e 811 &$post_process; # End-of-directory processing
7e47e6ff 812 }
813 elsif ( $nlink < 0 ) { # must be finddepth, report dirname now
814 $name = $dir_name;
815 if ($Is_MacOS) {
816 if ($dir_rel eq ':') { # must be the top dir, where we started
817 $name =~ s|:$||; # $File::Find::name
818 $p_dir = "$p_dir:" unless ($p_dir =~ /:$/);
819 }
820 $dir = $p_dir; # $File::Find::dir
821 $_ = ($no_chdir ? $name : $dir_rel); # $_
822 }
823 else {
824 if ( substr($name,-2) eq '/.' ) {
825 $name =~ s|/\.$||;
826 }
827 $dir = $p_dir;
828 $_ = ($no_chdir ? $dir_name : $dir_rel );
829 if ( substr($_,-2) eq '/.' ) {
830 s|/\.$||;
831 }
832 }
833 { &$wanted_callback }; # protect against wild "next"
834 }
835 else {
836 push @Stack,[$CdLvl,$p_dir,$dir_rel,-1] if $bydepth;
837 last;
838 }
81793b90 839 }
a0d0e21e 840 }
841}
842
81793b90 843
844# API:
845# $wanted
846# $dir_loc : absolute location of a dir
847# $p_dir : "parent directory"
848# preconditions:
849# chdir (if not no_chdir) to dir
850
851sub _find_dir_symlnk($$$) {
7e47e6ff 852 my ($wanted, $dir_loc, $p_dir) = @_; # $dir_loc is the absolute directory
81793b90 853 my @Stack;
854 my @filenames;
855 my $new_loc;
7e47e6ff 856 my $updir_loc = $dir_loc; # untainted parent directory
81793b90 857 my $SE = [];
858 my $dir_name = $p_dir;
7e47e6ff 859 my $dir_pref;
860 my $loc_pref;
39e79f6b 861 my $dir_rel = $File::Find::current_dir;
7e47e6ff 862 my $byd_flag; # flag for pending stack entry if $bydepth
863 my $tainted = 0;
864 my $ok = 1;
865
866 if ($Is_MacOS) {
867 $dir_pref = ($p_dir =~ /:$/) ? "$p_dir" : "$p_dir:";
868 $loc_pref = ($dir_loc =~ /:$/) ? "$dir_loc" : "$dir_loc:";
7e47e6ff 869 } else {
870 $dir_pref = ( $p_dir eq '/' ? '/' : "$p_dir/" );
871 $loc_pref = ( $dir_loc eq '/' ? '/' : "$dir_loc/" );
7e47e6ff 872 }
81793b90 873
874 local ($dir, $name, $fullname, $prune, *DIR);
7e47e6ff 875
876 unless ($no_chdir) {
877 # untaint the topdir
878 if (( $untaint ) && (is_tainted($dir_loc) )) {
879 ( $updir_loc ) = $dir_loc =~ m|$untaint_pat|; # parent dir, now untainted
880 # once untainted, $updir_loc is pushed on the stack (as parent directory);
881 # hence, we don't need to untaint the parent directory every time we chdir
882 # to it later
883 unless (defined $updir_loc) {
81793b90 884 if ($untaint_skip == 0) {
885 die "directory $dir_loc is still tainted";
886 }
887 else {
888 return;
889 }
890 }
891 }
7e47e6ff 892 $ok = chdir($updir_loc) unless ($p_dir eq $File::Find::current_dir);
893 unless ($ok) {
3fa6e24b 894 warn "Can't cd to $updir_loc: $!\n" if $^W;
81793b90 895 return;
896 }
897 }
898
7e47e6ff 899 push @Stack,[$dir_loc,$updir_loc,$p_dir,$dir_rel,-1] if $bydepth;
900
901 if ($Is_MacOS) {
902 $p_dir = $dir_pref; # ensure trailing ':'
903 }
57e73c4b 904
81793b90 905 while (defined $SE) {
906
907 unless ($bydepth) {
7e47e6ff 908 # change (back) to parent directory (always untainted)
704ea872 909 unless ($no_chdir) {
7e47e6ff 910 unless (chdir $updir_loc) {
3fa6e24b 911 warn "Can't cd to $updir_loc: $!\n" if $^W;
704ea872 912 next;
913 }
914 }
7e47e6ff 915 $dir= $p_dir; # $File::Find::dir
916 $name= $dir_name; # $File::Find::name
917 $_= ($no_chdir ? $dir_name : $dir_rel ); # $_
918 $fullname= $dir_loc; # $File::Find::fullname
81793b90 919 # prune may happen here
7e47e6ff 920 $prune= 0;
704ea872 921 lstat($_); # make sure file tests with '_' work
7e47e6ff 922 { &$wanted_callback }; # protect against wild "next"
923 next if $prune;
81793b90 924 }
925
926 # change to that directory
7e47e6ff 927 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
928 $updir_loc = $dir_loc;
929 if ( ($untaint) && (($tainted) || ($tainted = is_tainted($dir_loc) )) ) {
930 # untaint $dir_loc, what will be pushed on the stack as (untainted) parent dir
931 ( $updir_loc ) = $dir_loc =~ m|$untaint_pat|;
932 unless (defined $updir_loc) {
81793b90 933 if ($untaint_skip == 0) {
934 die "directory $dir_loc is still tainted";
a0d0e21e 935 }
237437d0 936 else {
81793b90 937 next;
237437d0 938 }
a0d0e21e 939 }
940 }
7e47e6ff 941 unless (chdir $updir_loc) {
3fa6e24b 942 warn "Can't cd to $updir_loc: $!\n" if $^W;
81793b90 943 next;
944 }
945 }
946
7e47e6ff 947 if ($Is_MacOS) {
948 $dir_name = "$dir_name:" unless ($dir_name =~ /:$/);
949 }
950
951 $dir = $dir_name; # $File::Find::dir
81793b90 952
953 # Get the list of files in the current directory.
7e47e6ff 954 unless (opendir DIR, ($no_chdir ? $dir_loc : $File::Find::current_dir)) {
3fa6e24b 955 warn "Can't opendir($dir_loc): $!\n" if $^W;
81793b90 956 next;
957 }
958 @filenames = readdir DIR;
959 closedir(DIR);
960
961 for my $FN (@filenames) {
7e47e6ff 962 next if $FN =~ $File::Find::skip_pattern;
81793b90 963
964 # follow symbolic links / do an lstat
07867069 965 $new_loc = Follow_SymLink($loc_pref.$FN);
81793b90 966
967 # ignore if invalid symlink
968 next unless defined $new_loc;
7e47e6ff 969
81793b90 970 if (-d _) {
7e47e6ff 971 push @Stack,[$new_loc,$updir_loc,$dir_name,$FN,1];
81793b90 972 }
973 else {
7e47e6ff 974 $fullname = $new_loc; # $File::Find::fullname
975 $name = $dir_pref . $FN; # $File::Find::name
976 $_ = ($no_chdir ? $name : $FN); # $_
73396e07 977 { &$wanted_callback }; # protect against wild "next"
81793b90 978 }
979 }
980
81793b90 981 }
982 continue {
57e73c4b 983 while (defined($SE = pop @Stack)) {
7e47e6ff 984 ($dir_loc, $updir_loc, $p_dir, $dir_rel, $byd_flag) = @$SE;
985 if ($Is_MacOS) {
986 # $p_dir always has a trailing ':', except for the starting dir,
987 # where $dir_rel eq ':'
988 $dir_name = "$p_dir$dir_rel";
989 $dir_pref = "$dir_name:";
990 $loc_pref = ($dir_loc =~ /:$/) ? $dir_loc : "$dir_loc:";
991 }
992 else {
993 $dir_name = ($p_dir eq '/' ? "/$dir_rel" : "$p_dir/$dir_rel");
994 $dir_pref = "$dir_name/";
995 $loc_pref = "$dir_loc/";
996 }
997 if ( $byd_flag < 0 ) { # must be finddepth, report dirname now
998 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
999 unless (chdir $updir_loc) { # $updir_loc (parent dir) is always untainted
3fa6e24b 1000 warn "Can't cd to $updir_loc: $!\n" if $^W;
7e47e6ff 1001 next;
1002 }
1003 }
1004 $fullname = $dir_loc; # $File::Find::fullname
1005 $name = $dir_name; # $File::Find::name
1006 if ($Is_MacOS) {
1007 if ($dir_rel eq ':') { # must be the top dir, where we started
1008 $name =~ s|:$||; # $File::Find::name
1009 $p_dir = "$p_dir:" unless ($p_dir =~ /:$/);
1010 }
1011 $dir = $p_dir; # $File::Find::dir
1012 $_ = ($no_chdir ? $name : $dir_rel); # $_
1013 }
1014 else {
1015 if ( substr($name,-2) eq '/.' ) {
1016 $name =~ s|/\.$||; # $File::Find::name
1017 }
1018 $dir = $p_dir; # $File::Find::dir
1019 $_ = ($no_chdir ? $dir_name : $dir_rel); # $_
1020 if ( substr($_,-2) eq '/.' ) {
1021 s|/\.$||;
1022 }
1023 }
1024
1025 lstat($_); # make sure file tests with '_' work
1026 { &$wanted_callback }; # protect against wild "next"
1027 }
1028 else {
1029 push @Stack,[$dir_loc, $updir_loc, $p_dir, $dir_rel,-1] if $bydepth;
1030 last;
1031 }
a0d0e21e 1032 }
1033 }
1034}
1035
81793b90 1036
20408e3c 1037sub wrap_wanted {
81793b90 1038 my $wanted = shift;
1039 if ( ref($wanted) eq 'HASH' ) {
1040 if ( $wanted->{follow} || $wanted->{follow_fast}) {
1041 $wanted->{follow_skip} = 1 unless defined $wanted->{follow_skip};
1042 }
1043 if ( $wanted->{untaint} ) {
7e47e6ff 1044 $wanted->{untaint_pattern} = $File::Find::untaint_pattern
81793b90 1045 unless defined $wanted->{untaint_pattern};
1046 $wanted->{untaint_skip} = 0 unless defined $wanted->{untaint_skip};
1047 }
1048 return $wanted;
1049 }
1050 else {
1051 return { wanted => $wanted };
1052 }
a0d0e21e 1053}
1054
20408e3c 1055sub find {
81793b90 1056 my $wanted = shift;
1057 _find_opt(wrap_wanted($wanted), @_);
a0d0e21e 1058}
1059
55d729e4 1060sub finddepth {
81793b90 1061 my $wanted = wrap_wanted(shift);
1062 $wanted->{bydepth} = 1;
1063 _find_opt($wanted, @_);
20408e3c 1064}
6280b799 1065
7e47e6ff 1066# default
1067$File::Find::skip_pattern = qr/^\.{1,2}\z/;
1068$File::Find::untaint_pattern = qr|^([-+@\w./]+)$|;
1069
6280b799 1070# These are hard-coded for now, but may move to hint files.
10eba763 1071if ($^O eq 'VMS') {
81793b90 1072 $Is_VMS = 1;
7e47e6ff 1073 $File::Find::dont_use_nlink = 1;
1074}
1075elsif ($^O eq 'MacOS') {
1076 $Is_MacOS = 1;
1077 $File::Find::dont_use_nlink = 1;
1078 $File::Find::skip_pattern = qr/^Icon\015\z/;
1079 $File::Find::untaint_pattern = qr|^(.+)$|;
748a9306 1080}
1081
7e47e6ff 1082# this _should_ work properly on all platforms
1083# where File::Find can be expected to work
1084$File::Find::current_dir = File::Spec->curdir || '.';
1085
81793b90 1086$File::Find::dont_use_nlink = 1
497711e7 1087 if $^O eq 'os2' || $^O eq 'dos' || $^O eq 'amigaos' || $^O eq 'MSWin32' ||
2986a63f 1088 $^O eq 'cygwin' || $^O eq 'epoc' || $^O eq 'NetWare';
6280b799 1089
20408e3c 1090# Set dont_use_nlink in your hint file if your system's stat doesn't
1091# report the number of links in a directory as an indication
1092# of the number of files.
1093# See, e.g. hints/machten.sh for MachTen 2.2.
81793b90 1094unless ($File::Find::dont_use_nlink) {
1095 require Config;
1096 $File::Find::dont_use_nlink = 1 if ($Config::Config{'dont_use_nlink'});
20408e3c 1097}
1098
7e47e6ff 1099# We need a function that checks if a scalar is tainted. Either use the
1100# Scalar::Util module's tainted() function or our (slower) pure Perl
1101# fallback is_tainted_pp()
1102{
1103 local $@;
1104 eval { require Scalar::Util };
1105 *is_tainted = $@ ? \&is_tainted_pp : \&Scalar::Util::tainted;
1106}
1107
a0d0e21e 11081;