Doc tweaks.
[p5sagit/p5-mst-13.2.git] / lib / File / Find.pm
CommitLineData
a0d0e21e 1package File::Find;
3b825e41 2use 5.006;
b75c8c73 3use strict;
b395063c 4use warnings;
5fa2bf2b 5our $VERSION = '1.03';
a0d0e21e 6require Exporter;
6280b799 7require Cwd;
a0d0e21e 8
f06db76b 9=head1 NAME
10
11find - traverse a file tree
12
13finddepth - traverse a directory structure depth-first
14
15=head1 SYNOPSIS
16
17 use File::Find;
81793b90 18 find(\&wanted, '/foo', '/bar');
f06db76b 19 sub wanted { ... }
237437d0 20
f06db76b 21 use File::Find;
81793b90 22 finddepth(\&wanted, '/foo', '/bar');
f06db76b 23 sub wanted { ... }
3cb6de81 24
81793b90 25 use File::Find;
26 find({ wanted => \&process, follow => 1 }, '.');
f06db76b 27
28=head1 DESCRIPTION
29
20408e3c 30The first argument to find() is either a hash reference describing the
81793b90 31operations to be performed for each file, or a code reference.
20408e3c 32
81793b90 33Here are the possible keys for the hash:
34
35=over 3
36
37=item C<wanted>
38
39The value should be a code reference. This code reference is called
40I<the wanted() function> below.
41
42=item C<bydepth>
43
44Reports the name of a directory only AFTER all its entries
45have been reported. Entry point finddepth() is a shortcut for
46specifying C<{ bydepth => 1 }> in the first argument of find().
47
719c805e 48=item C<preprocess>
49
7e47e6ff 50The value should be a code reference. This code reference is used to
51preprocess the current directory. The name of currently processed
52directory is in $File::Find::dir. Your preprocessing function is
53called after readdir() but before the loop that calls the wanted()
54function. It is called with a list of strings (actually file/directory
55names) and is expected to return a list of strings. The code can be
56used to sort the file/directory names alphabetically, numerically,
57or to filter out directory entries based on their name alone. When
58I<follow> or I<follow_fast> are in effect, C<preprocess> is a no-op.
719c805e 59
60=item C<postprocess>
61
7e47e6ff 62The value should be a code reference. It is invoked just before leaving
63the currently processed directory. It is called in void context with no
64arguments. The name of the current directory is in $File::Find::dir. This
65hook is handy for summarizing a directory, such as calculating its disk
3fa6e24b 66usage. When I<follow> or I<follow_fast> are in effect, C<postprocess> is a
7e47e6ff 67no-op.
719c805e 68
81793b90 69=item C<follow>
70
71Causes symbolic links to be followed. Since directory trees with symbolic
72links (followed) may contain files more than once and may even have
73cycles, a hash has to be built up with an entry for each file.
74This might be expensive both in space and time for a large
75directory tree. See I<follow_fast> and I<follow_skip> below.
76If either I<follow> or I<follow_fast> is in effect:
77
78=over 6
79
a45bd81d 80=item *
81793b90 81
f10e1564 82It is guaranteed that an I<lstat> has been called before the user's
81793b90 83I<wanted()> function is called. This enables fast file checks involving S< _>.
84
a45bd81d 85=item *
81793b90 86
87There is a variable C<$File::Find::fullname> which holds the absolute
88pathname of the file with all symbolic links resolved
89
90=back
91
92=item C<follow_fast>
93
f10e1564 94This is similar to I<follow> except that it may report some files more
95than once. It does detect cycles, however. Since only symbolic links
96have to be hashed, this is much cheaper both in space and time. If
97processing a file more than once (by the user's I<wanted()> function)
81793b90 98is worse than just taking time, the option I<follow> should be used.
99
100=item C<follow_skip>
101
102C<follow_skip==1>, which is the default, causes all files which are
103neither directories nor symbolic links to be ignored if they are about
104to be processed a second time. If a directory or a symbolic link
105are about to be processed a second time, File::Find dies.
106C<follow_skip==0> causes File::Find to die if any file is about to be
107processed a second time.
108C<follow_skip==2> causes File::Find to ignore any duplicate files and
7e47e6ff 109directories but to proceed normally otherwise.
20408e3c 110
80e52b73 111=item C<dangling_symlinks>
112
113If true and a code reference, will be called with the symbolic link
114name and the directory it lives in as arguments. Otherwise, if true
115and warnings are on, warning "symbolic_link_name is a dangling
116symbolic link\n" will be issued. If false, the dangling symbolic link
117will be silently ignored.
f06db76b 118
81793b90 119=item C<no_chdir>
120
121Does not C<chdir()> to each directory as it recurses. The wanted()
122function will need to be aware of this, of course. In this case,
123C<$_> will be the same as C<$File::Find::name>.
124
125=item C<untaint>
126
127If find is used in taint-mode (-T command line switch or if EUID != UID
128or if EGID != GID) then internally directory names have to be untainted
7e47e6ff 129before they can be chdir'ed to. Therefore they are checked against a regular
130expression I<untaint_pattern>. Note that all names passed to the user's
131I<wanted()> function are still tainted. If this option is used while
132not in taint-mode, C<untaint> is a no-op.
81793b90 133
134=item C<untaint_pattern>
135
136See above. This should be set using the C<qr> quoting operator.
137The default is set to C<qr|^([-+@\w./]+)$|>.
1cffc1dd 138Note that the parentheses are vital.
81793b90 139
140=item C<untaint_skip>
141
7e47e6ff 142If set, a directory which fails the I<untaint_pattern> is skipped,
143including all its sub-directories. The default is to 'die' in such a case.
81793b90 144
145=back
146
147The wanted() function does whatever verifications you want.
148C<$File::Find::dir> contains the current directory name, and C<$_> the
149current filename within that directory. C<$File::Find::name> contains
f10e1564 150the complete pathname to the file. You are chdir()'d to
151C<$File::Find::dir> when the function is called, unless C<no_chdir>
7e47e6ff 152was specified. When C<follow> or C<follow_fast> are in effect, there is
f10e1564 153also a C<$File::Find::fullname>. The function may set
154C<$File::Find::prune> to prune the tree unless C<bydepth> was
155specified. Unless C<follow> or C<follow_fast> is specified, for
156compatibility reasons (find.pl, find2perl) there are in addition the
157following globals available: C<$File::Find::topdir>,
158C<$File::Find::topdev>, C<$File::Find::topino>,
e7b91b67 159C<$File::Find::topmode> and C<$File::Find::topnlink>.
47a735e8 160
20408e3c 161This library is useful for the C<find2perl> tool, which when fed,
f06db76b 162
163 find2perl / -name .nfs\* -mtime +7 \
81793b90 164 -exec rm -f {} \; -o -fstype nfs -prune
f06db76b 165
166produces something like:
167
168 sub wanted {
c7b9dd21 169 /^\.nfs.*\z/s &&
81793b90 170 (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_)) &&
f06db76b 171 int(-M _) > 7 &&
172 unlink($_)
173 ||
81793b90 174 ($nlink || (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))) &&
f06db76b 175 $dev < 0 &&
6280b799 176 ($File::Find::prune = 1);
f06db76b 177 }
178
43dece2a 179Notice the C<_> in the above C<int(-M _)>: the C<_> is a magical
180filehandle that caches the information from the preceding
181stat(), lstat(), or filetest.
182
1cffc1dd 183Here's another interesting wanted function. It will find all symbolic
184links that don't resolve:
f06db76b 185
186 sub wanted {
81793b90 187 -l && !-e && print "bogus link: $File::Find::name\n";
237437d0 188 }
f06db76b 189
81793b90 190See also the script C<pfind> on CPAN for a nice application of this
191module.
192
193=head1 CAVEAT
194
5fa2bf2b 195=over 2
196
197=item $dont_use_nlink
198
199You can set the variable C<$File::Find::dont_use_nlink> to 1, if you want to
200force File::Find to always stat directories. This was used for systems
201that do not have the correct C<nlink> count for directories. Examples are
202ISO-9660 (CD-R), AFS, and operating systems like OS/2, DOS and a couple of
203others.
204
205Since now File::Find should now detect such things on-the-fly and switch it
206self to using stat, this will probably not a problem to you.
207
208If you do set $dont_use_nlink to 1, you will notice slow-downs.
209
210=item symlinks
211
f10e1564 212Be aware that the option to follow symbolic links can be dangerous.
81793b90 213Depending on the structure of the directory tree (including symbolic
214links to directories) you might traverse a given (physical) directory
215more than once (only if C<follow_fast> is in effect).
216Furthermore, deleting or changing files in a symbolically linked directory
217might cause very unpleasant surprises, since you delete or change files
218in an unknown directory.
0530a6c4 219
5fa2bf2b 220=back
221
7e47e6ff 222=head1 NOTES
223
224=over 4
225
226=item *
227
228Mac OS (Classic) users should note a few differences:
229
230=over 4
231
232=item *
233
234The path separator is ':', not '/', and the current directory is denoted
235as ':', not '.'. You should be careful about specifying relative pathnames.
236While a full path always begins with a volume name, a relative pathname
237should always begin with a ':'. If specifying a volume name only, a
238trailing ':' is required.
239
240=item *
241
242C<$File::Find::dir> is guaranteed to end with a ':'. If C<$_>
243contains the name of a directory, that name may or may not end with a
244':'. Likewise, C<$File::Find::name>, which contains the complete
245pathname to that directory, and C<$File::Find::fullname>, which holds
246the absolute pathname of that directory with all symbolic links resolved,
247may or may not end with a ':'.
248
249=item *
250
251The default C<untaint_pattern> (see above) on Mac OS is set to
252C<qr|^(.+)$|>. Note that the parentheses are vital.
253
254=item *
255
256The invisible system file "Icon\015" is ignored. While this file may
257appear in every directory, there are some more invisible system files
258on every volume, which are all located at the volume root level (i.e.
259"MacintoshHD:"). These system files are B<not> excluded automatically.
260Your filter may use the following code to recognize invisible files or
261directories (requires Mac::Files):
262
263 use Mac::Files;
264
265 # invisible() -- returns 1 if file/directory is invisible,
1cffc1dd 266 # 0 if it's visible or undef if an error occurred
7e47e6ff 267
268 sub invisible($) {
269 my $file = shift;
270 my ($fileCat, $fileInfo);
271 my $invisible_flag = 1 << 14;
272
273 if ( $fileCat = FSpGetCatInfo($file) ) {
274 if ($fileInfo = $fileCat->ioFlFndrInfo() ) {
275 return (($fileInfo->fdFlags & $invisible_flag) && 1);
276 }
277 }
278 return undef;
279 }
280
281Generally, invisible files are system files, unless an odd application
282decides to use invisible files for its own purposes. To distinguish
283such files from system files, you have to look at the B<type> and B<creator>
284file attributes. The MacPerl built-in functions C<GetFileInfo(FILE)> and
285C<SetFileInfo(CREATOR, TYPE, FILES)> offer access to these attributes
286(see MacPerl.pm for details).
287
288Files that appear on the desktop actually reside in an (hidden) directory
289named "Desktop Folder" on the particular disk volume. Note that, although
290all desktop files appear to be on the same "virtual" desktop, each disk
291volume actually maintains its own "Desktop Folder" directory.
292
293=back
294
295=back
0530a6c4 296
a85af077 297=head1 HISTORY
298
299File::Find used to produce incorrect results if called recursively.
300During the development of perl 5.8 this bug was fixed.
301The first fixed version of File::Find was 1.01.
302
f06db76b 303=cut
304
b75c8c73 305our @ISA = qw(Exporter);
306our @EXPORT = qw(find finddepth);
6280b799 307
a0d0e21e 308
81793b90 309use strict;
310my $Is_VMS;
7e47e6ff 311my $Is_MacOS;
81793b90 312
313require File::Basename;
7e47e6ff 314require File::Spec;
81793b90 315
9f826d6a 316# Should ideally be my() not our() but local() currently
317# refuses to operate on lexicals
318
319our %SLnkSeen;
320our ($wanted_callback, $avoid_nlink, $bydepth, $no_chdir, $follow,
719c805e 321 $follow_skip, $full_check, $untaint, $untaint_skip, $untaint_pat,
80e52b73 322 $pre_process, $post_process, $dangling_symlinks);
81793b90 323
324sub contract_name {
325 my ($cdir,$fn) = @_;
326
7e47e6ff 327 return substr($cdir,0,rindex($cdir,'/')) if $fn eq $File::Find::current_dir;
81793b90 328
329 $cdir = substr($cdir,0,rindex($cdir,'/')+1);
330
331 $fn =~ s|^\./||;
332
333 my $abs_name= $cdir . $fn;
334
335 if (substr($fn,0,3) eq '../') {
fecbda2b 336 1 while $abs_name =~ s!/[^/]*/\.\./!/!;
81793b90 337 }
338
339 return $abs_name;
340}
341
7e47e6ff 342# return the absolute name of a directory or file
343sub contract_name_Mac {
344 my ($cdir,$fn) = @_;
345 my $abs_name;
346
347 if ($fn =~ /^(:+)(.*)$/) { # valid pathname starting with a ':'
348
349 my $colon_count = length ($1);
350 if ($colon_count == 1) {
351 $abs_name = $cdir . $2;
352 return $abs_name;
353 }
354 else {
355 # need to move up the tree, but
356 # only if it's not a volume name
357 for (my $i=1; $i<$colon_count; $i++) {
358 unless ($cdir =~ /^[^:]+:$/) { # volume name
359 $cdir =~ s/[^:]+:$//;
360 }
361 else {
362 return undef;
363 }
364 }
365 $abs_name = $cdir . $2;
366 return $abs_name;
367 }
368
369 }
370 else {
371
372 # $fn may be a valid path to a directory or file or (dangling)
373 # symlink, without a leading ':'
374 if ( (-e $fn) || (-l $fn) ) {
375 if ($fn =~ /^[^:]+:/) { # a volume name like DataHD:*
376 return $fn; # $fn is already an absolute path
377 }
378 else {
379 $abs_name = $cdir . $fn;
380 return $abs_name;
381 }
382 }
383 else { # argh!, $fn is not a valid directory/file
384 return undef;
385 }
386 }
387}
81793b90 388
389sub PathCombine($$) {
390 my ($Base,$Name) = @_;
391 my $AbsName;
392
7e47e6ff 393 if ($Is_MacOS) {
394 # $Name is the resolved symlink (always a full path on MacOS),
395 # i.e. there's no need to call contract_name_Mac()
396 $AbsName = $Name;
397
398 # (simple) check for recursion
399 if ( ( $Base =~ /^$AbsName/) && (-d $AbsName) ) { # recursion
400 return undef;
401 }
81793b90 402 }
403 else {
7e47e6ff 404 if (substr($Name,0,1) eq '/') {
405 $AbsName= $Name;
406 }
407 else {
408 $AbsName= contract_name($Base,$Name);
409 }
81793b90 410
7e47e6ff 411 # (simple) check for recursion
412 my $newlen= length($AbsName);
413 if ($newlen <= length($Base)) {
414 if (($newlen == length($Base) || substr($Base,$newlen,1) eq '/')
415 && $AbsName eq substr($Base,0,$newlen))
416 {
417 return undef;
418 }
81793b90 419 }
420 }
421 return $AbsName;
422}
423
424sub Follow_SymLink($) {
425 my ($AbsName) = @_;
426
427 my ($NewName,$DEV, $INO);
428 ($DEV, $INO)= lstat $AbsName;
429
430 while (-l _) {
431 if ($SLnkSeen{$DEV, $INO}++) {
432 if ($follow_skip < 2) {
433 die "$AbsName is encountered a second time";
a0d0e21e 434 }
435 else {
81793b90 436 return undef;
a0d0e21e 437 }
438 }
81793b90 439 $NewName= PathCombine($AbsName, readlink($AbsName));
440 unless(defined $NewName) {
441 if ($follow_skip < 2) {
442 die "$AbsName is a recursive symbolic link";
443 }
444 else {
445 return undef;
a0d0e21e 446 }
81793b90 447 }
448 else {
449 $AbsName= $NewName;
450 }
451 ($DEV, $INO) = lstat($AbsName);
452 return undef unless defined $DEV; # dangling symbolic link
453 }
454
455 if ($full_check && $SLnkSeen{$DEV, $INO}++) {
7e47e6ff 456 if ( ($follow_skip < 1) || ((-d _) && ($follow_skip < 2)) ) {
81793b90 457 die "$AbsName encountered a second time";
458 }
459 else {
460 return undef;
461 }
462 }
463
464 return $AbsName;
465}
466
17f410f9 467our($dir, $name, $fullname, $prune);
81793b90 468sub _find_dir_symlnk($$$);
469sub _find_dir($$$);
470
7e47e6ff 471# check whether or not a scalar variable is tainted
472# (code straight from the Camel, 3rd ed., page 561)
473sub is_tainted_pp {
474 my $arg = shift;
475 my $nada = substr($arg, 0, 0); # zero-length
476 local $@;
477 eval { eval "# $nada" };
478 return length($@) != 0;
479}
480
81793b90 481sub _find_opt {
482 my $wanted = shift;
483 die "invalid top directory" unless defined $_[0];
484
9f826d6a 485 # This function must local()ize everything because callbacks may
486 # call find() or finddepth()
487
488 local %SLnkSeen;
489 local ($wanted_callback, $avoid_nlink, $bydepth, $no_chdir, $follow,
490 $follow_skip, $full_check, $untaint, $untaint_skip, $untaint_pat,
80e52b73 491 $pre_process, $post_process, $dangling_symlinks);
9f826d6a 492 local($dir, $name, $fullname, $prune);
493
a0c9c202 494 my $cwd = $wanted->{bydepth} ? Cwd::fastcwd() : Cwd::getcwd();
80e52b73 495 my $cwd_untainted = $cwd;
496 my $check_t_cwd = 1;
497 $wanted_callback = $wanted->{wanted};
498 $bydepth = $wanted->{bydepth};
499 $pre_process = $wanted->{preprocess};
500 $post_process = $wanted->{postprocess};
501 $no_chdir = $wanted->{no_chdir};
502 $full_check = $wanted->{follow};
503 $follow = $full_check || $wanted->{follow_fast};
504 $follow_skip = $wanted->{follow_skip};
505 $untaint = $wanted->{untaint};
506 $untaint_pat = $wanted->{untaint_pattern};
507 $untaint_skip = $wanted->{untaint_skip};
508 $dangling_symlinks = $wanted->{dangling_symlinks};
81793b90 509
1cffc1dd 510 # for compatibility reasons (find.pl, find2perl)
9f826d6a 511 local our ($topdir, $topdev, $topino, $topmode, $topnlink);
81793b90 512
513 # a symbolic link to a directory doesn't increase the link count
514 $avoid_nlink = $follow || $File::Find::dont_use_nlink;
515
e7b91b67 516 my ($abs_dir, $Is_Dir);
81793b90 517
518 Proc_Top_Item:
519 foreach my $TOP (@_) {
7e47e6ff 520 my $top_item = $TOP;
521
522 if ($Is_MacOS) {
523 ($topdev,$topino,$topmode,$topnlink) = $follow ? stat $top_item : lstat $top_item;
524 $top_item = ":$top_item"
3fa6e24b 525 if ( (-d _) && ( $top_item !~ /:/ ) );
7e47e6ff 526 }
527 else {
528 $top_item =~ s|/\z|| unless $top_item eq '/';
529 ($topdev,$topino,$topmode,$topnlink) = $follow ? stat $top_item : lstat $top_item;
530 }
531
532 $Is_Dir= 0;
533
534 if ($follow) {
535
536 if ($Is_MacOS) {
537 $cwd = "$cwd:" unless ($cwd =~ /:$/); # for safety
538
539 if ($top_item eq $File::Find::current_dir) {
540 $abs_dir = $cwd;
541 }
542 else {
543 $abs_dir = contract_name_Mac($cwd, $top_item);
544 unless (defined $abs_dir) {
3fa6e24b 545 warn "Can't determine absolute path for $top_item (No such file or directory)\n" if $^W;
7e47e6ff 546 next Proc_Top_Item;
547 }
548 }
549
550 }
551 else {
552 if (substr($top_item,0,1) eq '/') {
553 $abs_dir = $top_item;
554 }
555 elsif ($top_item eq $File::Find::current_dir) {
556 $abs_dir = $cwd;
557 }
558 else { # care about any ../
559 $abs_dir = contract_name("$cwd/",$top_item);
560 }
561 }
562 $abs_dir= Follow_SymLink($abs_dir);
563 unless (defined $abs_dir) {
80e52b73 564 if ($dangling_symlinks) {
565 if (ref $dangling_symlinks eq 'CODE') {
566 $dangling_symlinks->($top_item, $cwd);
567 } else {
568 warn "$top_item is a dangling symbolic link\n" if $^W;
569 }
570 }
81793b90 571 next Proc_Top_Item;
7e47e6ff 572 }
573
574 if (-d _) {
81793b90 575 _find_dir_symlnk($wanted, $abs_dir, $top_item);
576 $Is_Dir= 1;
7e47e6ff 577 }
578 }
81793b90 579 else { # no follow
7e47e6ff 580 $topdir = $top_item;
581 unless (defined $topnlink) {
3fa6e24b 582 warn "Can't stat $top_item: $!\n" if $^W;
7e47e6ff 583 next Proc_Top_Item;
584 }
585 if (-d _) {
c7b9dd21 586 $top_item =~ s/\.dir\z// if $Is_VMS;
e7b91b67 587 _find_dir($wanted, $top_item, $topnlink);
81793b90 588 $Is_Dir= 1;
7e47e6ff 589 }
237437d0 590 else {
81793b90 591 $abs_dir= $top_item;
7e47e6ff 592 }
593 }
81793b90 594
7e47e6ff 595 unless ($Is_Dir) {
81793b90 596 unless (($_,$dir) = File::Basename::fileparse($abs_dir)) {
7e47e6ff 597 if ($Is_MacOS) {
598 ($dir,$_) = (':', $top_item); # $File::Find::dir, $_
599 }
600 else {
601 ($dir,$_) = ('./', $top_item);
602 }
81793b90 603 }
604
7e47e6ff 605 $abs_dir = $dir;
606 if (( $untaint ) && (is_tainted($dir) )) {
607 ( $abs_dir ) = $dir =~ m|$untaint_pat|;
81793b90 608 unless (defined $abs_dir) {
609 if ($untaint_skip == 0) {
7e47e6ff 610 die "directory $dir is still tainted";
81793b90 611 }
612 else {
613 next Proc_Top_Item;
614 }
615 }
7e47e6ff 616 }
81793b90 617
7e47e6ff 618 unless ($no_chdir || chdir $abs_dir) {
3fa6e24b 619 warn "Couldn't chdir $abs_dir: $!\n" if $^W;
7e47e6ff 620 next Proc_Top_Item;
621 }
719911cc 622
7e47e6ff 623 $name = $abs_dir . $_; # $File::Find::name
719911cc 624
7e47e6ff 625 { &$wanted_callback }; # protect against wild "next"
81793b90 626
7e47e6ff 627 }
81793b90 628
7e47e6ff 629 unless ( $no_chdir ) {
630 if ( ($check_t_cwd) && (($untaint) && (is_tainted($cwd) )) ) {
631 ( $cwd_untainted ) = $cwd =~ m|$untaint_pat|;
632 unless (defined $cwd_untainted) {
633 die "insecure cwd in find(depth)";
634 }
635 $check_t_cwd = 0;
636 }
637 unless (chdir $cwd_untainted) {
638 die "Can't cd to $cwd: $!\n";
639 }
640 }
81793b90 641 }
642}
643
644# API:
645# $wanted
646# $p_dir : "parent directory"
647# $nlink : what came back from the stat
648# preconditions:
649# chdir (if not no_chdir) to dir
650
651sub _find_dir($$$) {
652 my ($wanted, $p_dir, $nlink) = @_;
653 my ($CdLvl,$Level) = (0,0);
654 my @Stack;
655 my @filenames;
656 my ($subcount,$sub_nlink);
657 my $SE= [];
658 my $dir_name= $p_dir;
7e47e6ff 659 my $dir_pref;
39e79f6b 660 my $dir_rel = $File::Find::current_dir;
7e47e6ff 661 my $tainted = 0;
5fa2bf2b 662 my $no_nlink;
7e47e6ff 663
664 if ($Is_MacOS) {
665 $dir_pref= ($p_dir =~ /:$/) ? $p_dir : "$p_dir:"; # preface
7e47e6ff 666 }
667 else {
668 $dir_pref= ( $p_dir eq '/' ? '/' : "$p_dir/" );
7e47e6ff 669 }
81793b90 670
671 local ($dir, $name, $prune, *DIR);
7e47e6ff 672
673 unless ( $no_chdir || ($p_dir eq $File::Find::current_dir)) {
81793b90 674 my $udir = $p_dir;
7e47e6ff 675 if (( $untaint ) && (is_tainted($p_dir) )) {
676 ( $udir ) = $p_dir =~ m|$untaint_pat|;
81793b90 677 unless (defined $udir) {
678 if ($untaint_skip == 0) {
679 die "directory $p_dir is still tainted";
680 }
681 else {
682 return;
683 }
237437d0 684 }
a0d0e21e 685 }
81793b90 686 unless (chdir $udir) {
3fa6e24b 687 warn "Can't cd to $udir: $!\n" if $^W;
81793b90 688 return;
689 }
690 }
7e47e6ff 691
692 # push the starting directory
57e73c4b 693 push @Stack,[$CdLvl,$p_dir,$dir_rel,-1] if $bydepth;
81793b90 694
7e47e6ff 695 if ($Is_MacOS) {
696 $p_dir = $dir_pref; # ensure trailing ':'
697 }
698
81793b90 699 while (defined $SE) {
700 unless ($bydepth) {
7e47e6ff 701 $dir= $p_dir; # $File::Find::dir
702 $name= $dir_name; # $File::Find::name
703 $_= ($no_chdir ? $dir_name : $dir_rel ); # $_
81793b90 704 # prune may happen here
7e47e6ff 705 $prune= 0;
706 { &$wanted_callback }; # protect against wild "next"
707 next if $prune;
81793b90 708 }
7e47e6ff 709
81793b90 710 # change to that directory
7e47e6ff 711 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
81793b90 712 my $udir= $dir_rel;
7e47e6ff 713 if ( ($untaint) && (($tainted) || ($tainted = is_tainted($dir_rel) )) ) {
714 ( $udir ) = $dir_rel =~ m|$untaint_pat|;
81793b90 715 unless (defined $udir) {
716 if ($untaint_skip == 0) {
7e47e6ff 717 if ($Is_MacOS) {
718 die "directory ($p_dir) $dir_rel is still tainted";
719 }
720 else {
721 die "directory (" . ($p_dir ne '/' ? $p_dir : '') . "/) $dir_rel is still tainted";
722 }
723 } else { # $untaint_skip == 1
724 next;
81793b90 725 }
726 }
727 }
728 unless (chdir $udir) {
7e47e6ff 729 if ($Is_MacOS) {
3fa6e24b 730 warn "Can't cd to ($p_dir) $udir: $!\n" if $^W;
7e47e6ff 731 }
732 else {
3fa6e24b 733 warn "Can't cd to (" . ($p_dir ne '/' ? $p_dir : '') . "/) $udir: $!\n" if $^W;
7e47e6ff 734 }
81793b90 735 next;
736 }
737 $CdLvl++;
738 }
739
7e47e6ff 740 if ($Is_MacOS) {
741 $dir_name = "$dir_name:" unless ($dir_name =~ /:$/);
742 }
743
744 $dir= $dir_name; # $File::Find::dir
81793b90 745
746 # Get the list of files in the current directory.
7e47e6ff 747 unless (opendir DIR, ($no_chdir ? $dir_name : $File::Find::current_dir)) {
3fa6e24b 748 warn "Can't opendir($dir_name): $!\n" if $^W;
81793b90 749 next;
750 }
751 @filenames = readdir DIR;
752 closedir(DIR);
719c805e 753 @filenames = &$pre_process(@filenames) if $pre_process;
754 push @Stack,[$CdLvl,$dir_name,"",-2] if $post_process;
81793b90 755
5fa2bf2b 756 # default: use whatever was specifid
757 # (if $nlink >= 2, and $avoid_nlink == 0, this will switch back)
758 $no_nlink = $avoid_nlink;
759 # if dir has wrong nlink count, force switch to slower stat method
760 $no_nlink = 1 if ($nlink < 2);
761
762 if ($nlink == 2 && !$no_nlink) {
81793b90 763 # This dir has no subdirectories.
764 for my $FN (@filenames) {
7e47e6ff 765 next if $FN =~ $File::Find::skip_pattern;
81793b90 766
7e47e6ff 767 $name = $dir_pref . $FN; # $File::Find::name
768 $_ = ($no_chdir ? $name : $FN); # $_
73396e07 769 { &$wanted_callback }; # protect against wild "next"
81793b90 770 }
771
772 }
773 else {
774 # This dir has subdirectories.
775 $subcount = $nlink - 2;
776
777 for my $FN (@filenames) {
7e47e6ff 778 next if $FN =~ $File::Find::skip_pattern;
5fa2bf2b 779 if ($subcount > 0 || $no_nlink) {
81793b90 780 # Seen all the subdirs?
781 # check for directoriness.
782 # stat is faster for a file in the current directory
07867069 783 $sub_nlink = (lstat ($no_chdir ? $dir_pref . $FN : $FN))[3];
81793b90 784
785 if (-d _) {
786 --$subcount;
c7b9dd21 787 $FN =~ s/\.dir\z// if $Is_VMS;
81793b90 788 push @Stack,[$CdLvl,$dir_name,$FN,$sub_nlink];
789 }
790 else {
7e47e6ff 791 $name = $dir_pref . $FN; # $File::Find::name
792 $_= ($no_chdir ? $name : $FN); # $_
73396e07 793 { &$wanted_callback }; # protect against wild "next"
81793b90 794 }
795 }
07867069 796 else {
7e47e6ff 797 $name = $dir_pref . $FN; # $File::Find::name
798 $_= ($no_chdir ? $name : $FN); # $_
73396e07 799 { &$wanted_callback }; # protect against wild "next"
81793b90 800 }
801 }
802 }
17b275ff 803 }
804 continue {
57e73c4b 805 while ( defined ($SE = pop @Stack) ) {
81793b90 806 ($Level, $p_dir, $dir_rel, $nlink) = @$SE;
807 if ($CdLvl > $Level && !$no_chdir) {
7e47e6ff 808 my $tmp;
809 if ($Is_MacOS) {
810 $tmp = (':' x ($CdLvl-$Level)) . ':';
811 }
812 else {
813 $tmp = join('/',('..') x ($CdLvl-$Level));
814 }
815 die "Can't cd to $dir_name" . $tmp
816 unless chdir ($tmp);
81793b90 817 $CdLvl = $Level;
818 }
7e47e6ff 819
820 if ($Is_MacOS) {
821 # $pdir always has a trailing ':', except for the starting dir,
822 # where $dir_rel eq ':'
823 $dir_name = "$p_dir$dir_rel";
824 $dir_pref = "$dir_name:";
825 }
826 else {
827 $dir_name = ($p_dir eq '/' ? "/$dir_rel" : "$p_dir/$dir_rel");
828 $dir_pref = "$dir_name/";
829 }
830
719c805e 831 if ( $nlink == -2 ) {
7e47e6ff 832 $name = $dir = $p_dir; # $File::Find::name / dir
39e79f6b 833 $_ = $File::Find::current_dir;
719c805e 834 &$post_process; # End-of-directory processing
7e47e6ff 835 }
836 elsif ( $nlink < 0 ) { # must be finddepth, report dirname now
837 $name = $dir_name;
838 if ($Is_MacOS) {
839 if ($dir_rel eq ':') { # must be the top dir, where we started
840 $name =~ s|:$||; # $File::Find::name
841 $p_dir = "$p_dir:" unless ($p_dir =~ /:$/);
842 }
843 $dir = $p_dir; # $File::Find::dir
844 $_ = ($no_chdir ? $name : $dir_rel); # $_
845 }
846 else {
847 if ( substr($name,-2) eq '/.' ) {
848 $name =~ s|/\.$||;
849 }
850 $dir = $p_dir;
851 $_ = ($no_chdir ? $dir_name : $dir_rel );
852 if ( substr($_,-2) eq '/.' ) {
853 s|/\.$||;
854 }
855 }
856 { &$wanted_callback }; # protect against wild "next"
857 }
858 else {
859 push @Stack,[$CdLvl,$p_dir,$dir_rel,-1] if $bydepth;
860 last;
861 }
81793b90 862 }
a0d0e21e 863 }
864}
865
81793b90 866
867# API:
868# $wanted
869# $dir_loc : absolute location of a dir
870# $p_dir : "parent directory"
871# preconditions:
872# chdir (if not no_chdir) to dir
873
874sub _find_dir_symlnk($$$) {
7e47e6ff 875 my ($wanted, $dir_loc, $p_dir) = @_; # $dir_loc is the absolute directory
81793b90 876 my @Stack;
877 my @filenames;
878 my $new_loc;
7e47e6ff 879 my $updir_loc = $dir_loc; # untainted parent directory
81793b90 880 my $SE = [];
881 my $dir_name = $p_dir;
7e47e6ff 882 my $dir_pref;
883 my $loc_pref;
39e79f6b 884 my $dir_rel = $File::Find::current_dir;
7e47e6ff 885 my $byd_flag; # flag for pending stack entry if $bydepth
886 my $tainted = 0;
887 my $ok = 1;
888
889 if ($Is_MacOS) {
890 $dir_pref = ($p_dir =~ /:$/) ? "$p_dir" : "$p_dir:";
891 $loc_pref = ($dir_loc =~ /:$/) ? "$dir_loc" : "$dir_loc:";
7e47e6ff 892 } else {
893 $dir_pref = ( $p_dir eq '/' ? '/' : "$p_dir/" );
894 $loc_pref = ( $dir_loc eq '/' ? '/' : "$dir_loc/" );
7e47e6ff 895 }
81793b90 896
897 local ($dir, $name, $fullname, $prune, *DIR);
7e47e6ff 898
899 unless ($no_chdir) {
900 # untaint the topdir
901 if (( $untaint ) && (is_tainted($dir_loc) )) {
902 ( $updir_loc ) = $dir_loc =~ m|$untaint_pat|; # parent dir, now untainted
903 # once untainted, $updir_loc is pushed on the stack (as parent directory);
904 # hence, we don't need to untaint the parent directory every time we chdir
905 # to it later
906 unless (defined $updir_loc) {
81793b90 907 if ($untaint_skip == 0) {
908 die "directory $dir_loc is still tainted";
909 }
910 else {
911 return;
912 }
913 }
914 }
7e47e6ff 915 $ok = chdir($updir_loc) unless ($p_dir eq $File::Find::current_dir);
916 unless ($ok) {
3fa6e24b 917 warn "Can't cd to $updir_loc: $!\n" if $^W;
81793b90 918 return;
919 }
920 }
921
7e47e6ff 922 push @Stack,[$dir_loc,$updir_loc,$p_dir,$dir_rel,-1] if $bydepth;
923
924 if ($Is_MacOS) {
925 $p_dir = $dir_pref; # ensure trailing ':'
926 }
57e73c4b 927
81793b90 928 while (defined $SE) {
929
930 unless ($bydepth) {
7e47e6ff 931 # change (back) to parent directory (always untainted)
704ea872 932 unless ($no_chdir) {
7e47e6ff 933 unless (chdir $updir_loc) {
3fa6e24b 934 warn "Can't cd to $updir_loc: $!\n" if $^W;
704ea872 935 next;
936 }
937 }
7e47e6ff 938 $dir= $p_dir; # $File::Find::dir
939 $name= $dir_name; # $File::Find::name
940 $_= ($no_chdir ? $dir_name : $dir_rel ); # $_
941 $fullname= $dir_loc; # $File::Find::fullname
81793b90 942 # prune may happen here
7e47e6ff 943 $prune= 0;
704ea872 944 lstat($_); # make sure file tests with '_' work
7e47e6ff 945 { &$wanted_callback }; # protect against wild "next"
946 next if $prune;
81793b90 947 }
948
949 # change to that directory
7e47e6ff 950 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
951 $updir_loc = $dir_loc;
952 if ( ($untaint) && (($tainted) || ($tainted = is_tainted($dir_loc) )) ) {
953 # untaint $dir_loc, what will be pushed on the stack as (untainted) parent dir
954 ( $updir_loc ) = $dir_loc =~ m|$untaint_pat|;
955 unless (defined $updir_loc) {
81793b90 956 if ($untaint_skip == 0) {
957 die "directory $dir_loc is still tainted";
a0d0e21e 958 }
237437d0 959 else {
81793b90 960 next;
237437d0 961 }
a0d0e21e 962 }
963 }
7e47e6ff 964 unless (chdir $updir_loc) {
3fa6e24b 965 warn "Can't cd to $updir_loc: $!\n" if $^W;
81793b90 966 next;
967 }
968 }
969
7e47e6ff 970 if ($Is_MacOS) {
971 $dir_name = "$dir_name:" unless ($dir_name =~ /:$/);
972 }
973
974 $dir = $dir_name; # $File::Find::dir
81793b90 975
976 # Get the list of files in the current directory.
7e47e6ff 977 unless (opendir DIR, ($no_chdir ? $dir_loc : $File::Find::current_dir)) {
3fa6e24b 978 warn "Can't opendir($dir_loc): $!\n" if $^W;
81793b90 979 next;
980 }
981 @filenames = readdir DIR;
982 closedir(DIR);
983
984 for my $FN (@filenames) {
7e47e6ff 985 next if $FN =~ $File::Find::skip_pattern;
81793b90 986
987 # follow symbolic links / do an lstat
07867069 988 $new_loc = Follow_SymLink($loc_pref.$FN);
81793b90 989
990 # ignore if invalid symlink
991 next unless defined $new_loc;
7e47e6ff 992
81793b90 993 if (-d _) {
7e47e6ff 994 push @Stack,[$new_loc,$updir_loc,$dir_name,$FN,1];
81793b90 995 }
996 else {
7e47e6ff 997 $fullname = $new_loc; # $File::Find::fullname
998 $name = $dir_pref . $FN; # $File::Find::name
999 $_ = ($no_chdir ? $name : $FN); # $_
73396e07 1000 { &$wanted_callback }; # protect against wild "next"
81793b90 1001 }
1002 }
1003
81793b90 1004 }
1005 continue {
57e73c4b 1006 while (defined($SE = pop @Stack)) {
7e47e6ff 1007 ($dir_loc, $updir_loc, $p_dir, $dir_rel, $byd_flag) = @$SE;
1008 if ($Is_MacOS) {
1009 # $p_dir always has a trailing ':', except for the starting dir,
1010 # where $dir_rel eq ':'
1011 $dir_name = "$p_dir$dir_rel";
1012 $dir_pref = "$dir_name:";
1013 $loc_pref = ($dir_loc =~ /:$/) ? $dir_loc : "$dir_loc:";
1014 }
1015 else {
1016 $dir_name = ($p_dir eq '/' ? "/$dir_rel" : "$p_dir/$dir_rel");
1017 $dir_pref = "$dir_name/";
1018 $loc_pref = "$dir_loc/";
1019 }
1020 if ( $byd_flag < 0 ) { # must be finddepth, report dirname now
1021 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
1022 unless (chdir $updir_loc) { # $updir_loc (parent dir) is always untainted
3fa6e24b 1023 warn "Can't cd to $updir_loc: $!\n" if $^W;
7e47e6ff 1024 next;
1025 }
1026 }
1027 $fullname = $dir_loc; # $File::Find::fullname
1028 $name = $dir_name; # $File::Find::name
1029 if ($Is_MacOS) {
1030 if ($dir_rel eq ':') { # must be the top dir, where we started
1031 $name =~ s|:$||; # $File::Find::name
1032 $p_dir = "$p_dir:" unless ($p_dir =~ /:$/);
1033 }
1034 $dir = $p_dir; # $File::Find::dir
1035 $_ = ($no_chdir ? $name : $dir_rel); # $_
1036 }
1037 else {
1038 if ( substr($name,-2) eq '/.' ) {
1039 $name =~ s|/\.$||; # $File::Find::name
1040 }
1041 $dir = $p_dir; # $File::Find::dir
1042 $_ = ($no_chdir ? $dir_name : $dir_rel); # $_
1043 if ( substr($_,-2) eq '/.' ) {
1044 s|/\.$||;
1045 }
1046 }
1047
1048 lstat($_); # make sure file tests with '_' work
1049 { &$wanted_callback }; # protect against wild "next"
1050 }
1051 else {
1052 push @Stack,[$dir_loc, $updir_loc, $p_dir, $dir_rel,-1] if $bydepth;
1053 last;
1054 }
a0d0e21e 1055 }
1056 }
1057}
1058
81793b90 1059
20408e3c 1060sub wrap_wanted {
81793b90 1061 my $wanted = shift;
1062 if ( ref($wanted) eq 'HASH' ) {
1063 if ( $wanted->{follow} || $wanted->{follow_fast}) {
1064 $wanted->{follow_skip} = 1 unless defined $wanted->{follow_skip};
1065 }
1066 if ( $wanted->{untaint} ) {
7e47e6ff 1067 $wanted->{untaint_pattern} = $File::Find::untaint_pattern
81793b90 1068 unless defined $wanted->{untaint_pattern};
1069 $wanted->{untaint_skip} = 0 unless defined $wanted->{untaint_skip};
1070 }
1071 return $wanted;
1072 }
1073 else {
1074 return { wanted => $wanted };
1075 }
a0d0e21e 1076}
1077
20408e3c 1078sub find {
81793b90 1079 my $wanted = shift;
1080 _find_opt(wrap_wanted($wanted), @_);
a0d0e21e 1081}
1082
55d729e4 1083sub finddepth {
81793b90 1084 my $wanted = wrap_wanted(shift);
1085 $wanted->{bydepth} = 1;
1086 _find_opt($wanted, @_);
20408e3c 1087}
6280b799 1088
7e47e6ff 1089# default
1090$File::Find::skip_pattern = qr/^\.{1,2}\z/;
1091$File::Find::untaint_pattern = qr|^([-+@\w./]+)$|;
1092
6280b799 1093# These are hard-coded for now, but may move to hint files.
10eba763 1094if ($^O eq 'VMS') {
81793b90 1095 $Is_VMS = 1;
7e47e6ff 1096 $File::Find::dont_use_nlink = 1;
1097}
1098elsif ($^O eq 'MacOS') {
1099 $Is_MacOS = 1;
1100 $File::Find::dont_use_nlink = 1;
1101 $File::Find::skip_pattern = qr/^Icon\015\z/;
1102 $File::Find::untaint_pattern = qr|^(.+)$|;
748a9306 1103}
1104
7e47e6ff 1105# this _should_ work properly on all platforms
1106# where File::Find can be expected to work
1107$File::Find::current_dir = File::Spec->curdir || '.';
1108
81793b90 1109$File::Find::dont_use_nlink = 1
497711e7 1110 if $^O eq 'os2' || $^O eq 'dos' || $^O eq 'amigaos' || $^O eq 'MSWin32' ||
2986a63f 1111 $^O eq 'cygwin' || $^O eq 'epoc' || $^O eq 'NetWare';
6280b799 1112
20408e3c 1113# Set dont_use_nlink in your hint file if your system's stat doesn't
1114# report the number of links in a directory as an indication
1115# of the number of files.
1116# See, e.g. hints/machten.sh for MachTen 2.2.
81793b90 1117unless ($File::Find::dont_use_nlink) {
1118 require Config;
1119 $File::Find::dont_use_nlink = 1 if ($Config::Config{'dont_use_nlink'});
20408e3c 1120}
1121
7e47e6ff 1122# We need a function that checks if a scalar is tainted. Either use the
1123# Scalar::Util module's tainted() function or our (slower) pure Perl
1124# fallback is_tainted_pp()
1125{
1126 local $@;
1127 eval { require Scalar::Util };
1128 *is_tainted = $@ ? \&is_tainted_pp : \&Scalar::Util::tainted;
1129}
1130
a0d0e21e 11311;