The KOI8-R example wasn't quite right.
[p5sagit/p5-mst-13.2.git] / lib / File / Find.pm
CommitLineData
a0d0e21e 1package File::Find;
3b825e41 2use 5.006;
b75c8c73 3use strict;
b395063c 4use warnings;
cd68ec93 5use warnings::register;
6our $VERSION = '1.04';
a0d0e21e 7require Exporter;
6280b799 8require Cwd;
a0d0e21e 9
f06db76b 10=head1 NAME
11
12find - traverse a file tree
13
14finddepth - traverse a directory structure depth-first
15
16=head1 SYNOPSIS
17
18 use File::Find;
81793b90 19 find(\&wanted, '/foo', '/bar');
f06db76b 20 sub wanted { ... }
237437d0 21
f06db76b 22 use File::Find;
81793b90 23 finddepth(\&wanted, '/foo', '/bar');
f06db76b 24 sub wanted { ... }
3cb6de81 25
81793b90 26 use File::Find;
27 find({ wanted => \&process, follow => 1 }, '.');
f06db76b 28
29=head1 DESCRIPTION
30
20408e3c 31The first argument to find() is either a hash reference describing the
81793b90 32operations to be performed for each file, or a code reference.
20408e3c 33
81793b90 34Here are the possible keys for the hash:
35
36=over 3
37
38=item C<wanted>
39
40The value should be a code reference. This code reference is called
41I<the wanted() function> below.
42
43=item C<bydepth>
44
45Reports the name of a directory only AFTER all its entries
46have been reported. Entry point finddepth() is a shortcut for
47specifying C<{ bydepth => 1 }> in the first argument of find().
48
719c805e 49=item C<preprocess>
50
7e47e6ff 51The value should be a code reference. This code reference is used to
52preprocess the current directory. The name of currently processed
53directory is in $File::Find::dir. Your preprocessing function is
54called after readdir() but before the loop that calls the wanted()
55function. It is called with a list of strings (actually file/directory
56names) and is expected to return a list of strings. The code can be
57used to sort the file/directory names alphabetically, numerically,
58or to filter out directory entries based on their name alone. When
59I<follow> or I<follow_fast> are in effect, C<preprocess> is a no-op.
719c805e 60
61=item C<postprocess>
62
7e47e6ff 63The value should be a code reference. It is invoked just before leaving
64the currently processed directory. It is called in void context with no
65arguments. The name of the current directory is in $File::Find::dir. This
66hook is handy for summarizing a directory, such as calculating its disk
3fa6e24b 67usage. When I<follow> or I<follow_fast> are in effect, C<postprocess> is a
7e47e6ff 68no-op.
719c805e 69
81793b90 70=item C<follow>
71
72Causes symbolic links to be followed. Since directory trees with symbolic
73links (followed) may contain files more than once and may even have
74cycles, a hash has to be built up with an entry for each file.
75This might be expensive both in space and time for a large
76directory tree. See I<follow_fast> and I<follow_skip> below.
77If either I<follow> or I<follow_fast> is in effect:
78
79=over 6
80
a45bd81d 81=item *
81793b90 82
f10e1564 83It is guaranteed that an I<lstat> has been called before the user's
81793b90 84I<wanted()> function is called. This enables fast file checks involving S< _>.
85
a45bd81d 86=item *
81793b90 87
88There is a variable C<$File::Find::fullname> which holds the absolute
89pathname of the file with all symbolic links resolved
90
91=back
92
93=item C<follow_fast>
94
f10e1564 95This is similar to I<follow> except that it may report some files more
96than once. It does detect cycles, however. Since only symbolic links
97have to be hashed, this is much cheaper both in space and time. If
98processing a file more than once (by the user's I<wanted()> function)
81793b90 99is worse than just taking time, the option I<follow> should be used.
100
101=item C<follow_skip>
102
103C<follow_skip==1>, which is the default, causes all files which are
104neither directories nor symbolic links to be ignored if they are about
105to be processed a second time. If a directory or a symbolic link
106are about to be processed a second time, File::Find dies.
107C<follow_skip==0> causes File::Find to die if any file is about to be
108processed a second time.
109C<follow_skip==2> causes File::Find to ignore any duplicate files and
7e47e6ff 110directories but to proceed normally otherwise.
20408e3c 111
80e52b73 112=item C<dangling_symlinks>
113
114If true and a code reference, will be called with the symbolic link
115name and the directory it lives in as arguments. Otherwise, if true
116and warnings are on, warning "symbolic_link_name is a dangling
117symbolic link\n" will be issued. If false, the dangling symbolic link
118will be silently ignored.
f06db76b 119
81793b90 120=item C<no_chdir>
121
122Does not C<chdir()> to each directory as it recurses. The wanted()
123function will need to be aware of this, of course. In this case,
124C<$_> will be the same as C<$File::Find::name>.
125
126=item C<untaint>
127
128If find is used in taint-mode (-T command line switch or if EUID != UID
129or if EGID != GID) then internally directory names have to be untainted
7e47e6ff 130before they can be chdir'ed to. Therefore they are checked against a regular
131expression I<untaint_pattern>. Note that all names passed to the user's
132I<wanted()> function are still tainted. If this option is used while
133not in taint-mode, C<untaint> is a no-op.
81793b90 134
135=item C<untaint_pattern>
136
137See above. This should be set using the C<qr> quoting operator.
138The default is set to C<qr|^([-+@\w./]+)$|>.
1cffc1dd 139Note that the parentheses are vital.
81793b90 140
141=item C<untaint_skip>
142
7e47e6ff 143If set, a directory which fails the I<untaint_pattern> is skipped,
144including all its sub-directories. The default is to 'die' in such a case.
81793b90 145
146=back
147
148The wanted() function does whatever verifications you want.
149C<$File::Find::dir> contains the current directory name, and C<$_> the
150current filename within that directory. C<$File::Find::name> contains
f10e1564 151the complete pathname to the file. You are chdir()'d to
152C<$File::Find::dir> when the function is called, unless C<no_chdir>
7e47e6ff 153was specified. When C<follow> or C<follow_fast> are in effect, there is
f10e1564 154also a C<$File::Find::fullname>. The function may set
155C<$File::Find::prune> to prune the tree unless C<bydepth> was
156specified. Unless C<follow> or C<follow_fast> is specified, for
157compatibility reasons (find.pl, find2perl) there are in addition the
158following globals available: C<$File::Find::topdir>,
159C<$File::Find::topdev>, C<$File::Find::topino>,
e7b91b67 160C<$File::Find::topmode> and C<$File::Find::topnlink>.
47a735e8 161
20408e3c 162This library is useful for the C<find2perl> tool, which when fed,
f06db76b 163
164 find2perl / -name .nfs\* -mtime +7 \
81793b90 165 -exec rm -f {} \; -o -fstype nfs -prune
f06db76b 166
167produces something like:
168
169 sub wanted {
c7b9dd21 170 /^\.nfs.*\z/s &&
81793b90 171 (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_)) &&
f06db76b 172 int(-M _) > 7 &&
173 unlink($_)
174 ||
81793b90 175 ($nlink || (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))) &&
f06db76b 176 $dev < 0 &&
6280b799 177 ($File::Find::prune = 1);
f06db76b 178 }
179
43dece2a 180Notice the C<_> in the above C<int(-M _)>: the C<_> is a magical
181filehandle that caches the information from the preceding
182stat(), lstat(), or filetest.
183
1cffc1dd 184Here's another interesting wanted function. It will find all symbolic
185links that don't resolve:
f06db76b 186
187 sub wanted {
81793b90 188 -l && !-e && print "bogus link: $File::Find::name\n";
237437d0 189 }
f06db76b 190
81793b90 191See also the script C<pfind> on CPAN for a nice application of this
192module.
193
cd68ec93 194=head1 WARNINGS
195
196If you run your program with the C<-w> switch, or if you use the
197C<warnings> pragma, File::Find will report warnings for several weird
198situations. You can disable these warnings by putting the statement
199
200 no warnings 'File::Find';
201
202in the appropriate scope. See L<perllexwarn> for more info about lexical
203warnings.
204
81793b90 205=head1 CAVEAT
206
5fa2bf2b 207=over 2
208
209=item $dont_use_nlink
210
211You can set the variable C<$File::Find::dont_use_nlink> to 1, if you want to
212force File::Find to always stat directories. This was used for systems
213that do not have the correct C<nlink> count for directories. Examples are
214ISO-9660 (CD-R), AFS, and operating systems like OS/2, DOS and a couple of
215others.
216
217Since now File::Find should now detect such things on-the-fly and switch it
218self to using stat, this will probably not a problem to you.
219
220If you do set $dont_use_nlink to 1, you will notice slow-downs.
221
222=item symlinks
223
f10e1564 224Be aware that the option to follow symbolic links can be dangerous.
81793b90 225Depending on the structure of the directory tree (including symbolic
226links to directories) you might traverse a given (physical) directory
227more than once (only if C<follow_fast> is in effect).
228Furthermore, deleting or changing files in a symbolically linked directory
229might cause very unpleasant surprises, since you delete or change files
230in an unknown directory.
0530a6c4 231
5fa2bf2b 232=back
233
7e47e6ff 234=head1 NOTES
235
236=over 4
237
238=item *
239
240Mac OS (Classic) users should note a few differences:
241
242=over 4
243
244=item *
245
246The path separator is ':', not '/', and the current directory is denoted
247as ':', not '.'. You should be careful about specifying relative pathnames.
248While a full path always begins with a volume name, a relative pathname
249should always begin with a ':'. If specifying a volume name only, a
250trailing ':' is required.
251
252=item *
253
254C<$File::Find::dir> is guaranteed to end with a ':'. If C<$_>
255contains the name of a directory, that name may or may not end with a
256':'. Likewise, C<$File::Find::name>, which contains the complete
257pathname to that directory, and C<$File::Find::fullname>, which holds
258the absolute pathname of that directory with all symbolic links resolved,
259may or may not end with a ':'.
260
261=item *
262
263The default C<untaint_pattern> (see above) on Mac OS is set to
264C<qr|^(.+)$|>. Note that the parentheses are vital.
265
266=item *
267
268The invisible system file "Icon\015" is ignored. While this file may
269appear in every directory, there are some more invisible system files
270on every volume, which are all located at the volume root level (i.e.
271"MacintoshHD:"). These system files are B<not> excluded automatically.
272Your filter may use the following code to recognize invisible files or
273directories (requires Mac::Files):
274
275 use Mac::Files;
276
277 # invisible() -- returns 1 if file/directory is invisible,
1cffc1dd 278 # 0 if it's visible or undef if an error occurred
7e47e6ff 279
280 sub invisible($) {
281 my $file = shift;
282 my ($fileCat, $fileInfo);
283 my $invisible_flag = 1 << 14;
284
285 if ( $fileCat = FSpGetCatInfo($file) ) {
286 if ($fileInfo = $fileCat->ioFlFndrInfo() ) {
287 return (($fileInfo->fdFlags & $invisible_flag) && 1);
288 }
289 }
290 return undef;
291 }
292
293Generally, invisible files are system files, unless an odd application
294decides to use invisible files for its own purposes. To distinguish
295such files from system files, you have to look at the B<type> and B<creator>
296file attributes. The MacPerl built-in functions C<GetFileInfo(FILE)> and
297C<SetFileInfo(CREATOR, TYPE, FILES)> offer access to these attributes
298(see MacPerl.pm for details).
299
300Files that appear on the desktop actually reside in an (hidden) directory
301named "Desktop Folder" on the particular disk volume. Note that, although
302all desktop files appear to be on the same "virtual" desktop, each disk
303volume actually maintains its own "Desktop Folder" directory.
304
305=back
306
307=back
0530a6c4 308
a85af077 309=head1 HISTORY
310
311File::Find used to produce incorrect results if called recursively.
312During the development of perl 5.8 this bug was fixed.
313The first fixed version of File::Find was 1.01.
314
f06db76b 315=cut
316
b75c8c73 317our @ISA = qw(Exporter);
318our @EXPORT = qw(find finddepth);
6280b799 319
a0d0e21e 320
81793b90 321use strict;
322my $Is_VMS;
7e47e6ff 323my $Is_MacOS;
81793b90 324
325require File::Basename;
7e47e6ff 326require File::Spec;
81793b90 327
9f826d6a 328# Should ideally be my() not our() but local() currently
329# refuses to operate on lexicals
330
331our %SLnkSeen;
332our ($wanted_callback, $avoid_nlink, $bydepth, $no_chdir, $follow,
719c805e 333 $follow_skip, $full_check, $untaint, $untaint_skip, $untaint_pat,
80e52b73 334 $pre_process, $post_process, $dangling_symlinks);
81793b90 335
336sub contract_name {
337 my ($cdir,$fn) = @_;
338
7e47e6ff 339 return substr($cdir,0,rindex($cdir,'/')) if $fn eq $File::Find::current_dir;
81793b90 340
341 $cdir = substr($cdir,0,rindex($cdir,'/')+1);
342
343 $fn =~ s|^\./||;
344
345 my $abs_name= $cdir . $fn;
346
347 if (substr($fn,0,3) eq '../') {
fecbda2b 348 1 while $abs_name =~ s!/[^/]*/\.\./!/!;
81793b90 349 }
350
351 return $abs_name;
352}
353
7e47e6ff 354# return the absolute name of a directory or file
355sub contract_name_Mac {
356 my ($cdir,$fn) = @_;
357 my $abs_name;
358
359 if ($fn =~ /^(:+)(.*)$/) { # valid pathname starting with a ':'
360
361 my $colon_count = length ($1);
362 if ($colon_count == 1) {
363 $abs_name = $cdir . $2;
364 return $abs_name;
365 }
366 else {
367 # need to move up the tree, but
368 # only if it's not a volume name
369 for (my $i=1; $i<$colon_count; $i++) {
370 unless ($cdir =~ /^[^:]+:$/) { # volume name
371 $cdir =~ s/[^:]+:$//;
372 }
373 else {
374 return undef;
375 }
376 }
377 $abs_name = $cdir . $2;
378 return $abs_name;
379 }
380
381 }
382 else {
383
384 # $fn may be a valid path to a directory or file or (dangling)
385 # symlink, without a leading ':'
386 if ( (-e $fn) || (-l $fn) ) {
387 if ($fn =~ /^[^:]+:/) { # a volume name like DataHD:*
388 return $fn; # $fn is already an absolute path
389 }
390 else {
391 $abs_name = $cdir . $fn;
392 return $abs_name;
393 }
394 }
395 else { # argh!, $fn is not a valid directory/file
396 return undef;
397 }
398 }
399}
81793b90 400
401sub PathCombine($$) {
402 my ($Base,$Name) = @_;
403 my $AbsName;
404
7e47e6ff 405 if ($Is_MacOS) {
406 # $Name is the resolved symlink (always a full path on MacOS),
407 # i.e. there's no need to call contract_name_Mac()
408 $AbsName = $Name;
409
410 # (simple) check for recursion
411 if ( ( $Base =~ /^$AbsName/) && (-d $AbsName) ) { # recursion
412 return undef;
413 }
81793b90 414 }
415 else {
7e47e6ff 416 if (substr($Name,0,1) eq '/') {
417 $AbsName= $Name;
418 }
419 else {
420 $AbsName= contract_name($Base,$Name);
421 }
81793b90 422
7e47e6ff 423 # (simple) check for recursion
424 my $newlen= length($AbsName);
425 if ($newlen <= length($Base)) {
426 if (($newlen == length($Base) || substr($Base,$newlen,1) eq '/')
427 && $AbsName eq substr($Base,0,$newlen))
428 {
429 return undef;
430 }
81793b90 431 }
432 }
433 return $AbsName;
434}
435
436sub Follow_SymLink($) {
437 my ($AbsName) = @_;
438
439 my ($NewName,$DEV, $INO);
440 ($DEV, $INO)= lstat $AbsName;
441
442 while (-l _) {
443 if ($SLnkSeen{$DEV, $INO}++) {
444 if ($follow_skip < 2) {
445 die "$AbsName is encountered a second time";
a0d0e21e 446 }
447 else {
81793b90 448 return undef;
a0d0e21e 449 }
450 }
81793b90 451 $NewName= PathCombine($AbsName, readlink($AbsName));
452 unless(defined $NewName) {
453 if ($follow_skip < 2) {
454 die "$AbsName is a recursive symbolic link";
455 }
456 else {
457 return undef;
a0d0e21e 458 }
81793b90 459 }
460 else {
461 $AbsName= $NewName;
462 }
463 ($DEV, $INO) = lstat($AbsName);
464 return undef unless defined $DEV; # dangling symbolic link
465 }
466
cd68ec93 467 if ($full_check && defined $DEV && $SLnkSeen{$DEV, $INO}++) {
7e47e6ff 468 if ( ($follow_skip < 1) || ((-d _) && ($follow_skip < 2)) ) {
81793b90 469 die "$AbsName encountered a second time";
470 }
471 else {
472 return undef;
473 }
474 }
475
476 return $AbsName;
477}
478
17f410f9 479our($dir, $name, $fullname, $prune);
81793b90 480sub _find_dir_symlnk($$$);
481sub _find_dir($$$);
482
7e47e6ff 483# check whether or not a scalar variable is tainted
484# (code straight from the Camel, 3rd ed., page 561)
485sub is_tainted_pp {
486 my $arg = shift;
487 my $nada = substr($arg, 0, 0); # zero-length
488 local $@;
489 eval { eval "# $nada" };
490 return length($@) != 0;
491}
492
81793b90 493sub _find_opt {
494 my $wanted = shift;
495 die "invalid top directory" unless defined $_[0];
496
9f826d6a 497 # This function must local()ize everything because callbacks may
498 # call find() or finddepth()
499
500 local %SLnkSeen;
501 local ($wanted_callback, $avoid_nlink, $bydepth, $no_chdir, $follow,
502 $follow_skip, $full_check, $untaint, $untaint_skip, $untaint_pat,
80e52b73 503 $pre_process, $post_process, $dangling_symlinks);
9f826d6a 504 local($dir, $name, $fullname, $prune);
505
a0c9c202 506 my $cwd = $wanted->{bydepth} ? Cwd::fastcwd() : Cwd::getcwd();
80e52b73 507 my $cwd_untainted = $cwd;
508 my $check_t_cwd = 1;
509 $wanted_callback = $wanted->{wanted};
510 $bydepth = $wanted->{bydepth};
511 $pre_process = $wanted->{preprocess};
512 $post_process = $wanted->{postprocess};
513 $no_chdir = $wanted->{no_chdir};
514 $full_check = $wanted->{follow};
515 $follow = $full_check || $wanted->{follow_fast};
516 $follow_skip = $wanted->{follow_skip};
517 $untaint = $wanted->{untaint};
518 $untaint_pat = $wanted->{untaint_pattern};
519 $untaint_skip = $wanted->{untaint_skip};
520 $dangling_symlinks = $wanted->{dangling_symlinks};
81793b90 521
1cffc1dd 522 # for compatibility reasons (find.pl, find2perl)
9f826d6a 523 local our ($topdir, $topdev, $topino, $topmode, $topnlink);
81793b90 524
525 # a symbolic link to a directory doesn't increase the link count
526 $avoid_nlink = $follow || $File::Find::dont_use_nlink;
527
e7b91b67 528 my ($abs_dir, $Is_Dir);
81793b90 529
530 Proc_Top_Item:
531 foreach my $TOP (@_) {
7e47e6ff 532 my $top_item = $TOP;
533
534 if ($Is_MacOS) {
535 ($topdev,$topino,$topmode,$topnlink) = $follow ? stat $top_item : lstat $top_item;
536 $top_item = ":$top_item"
3fa6e24b 537 if ( (-d _) && ( $top_item !~ /:/ ) );
7e47e6ff 538 }
539 else {
540 $top_item =~ s|/\z|| unless $top_item eq '/';
541 ($topdev,$topino,$topmode,$topnlink) = $follow ? stat $top_item : lstat $top_item;
542 }
543
544 $Is_Dir= 0;
545
546 if ($follow) {
547
548 if ($Is_MacOS) {
549 $cwd = "$cwd:" unless ($cwd =~ /:$/); # for safety
550
551 if ($top_item eq $File::Find::current_dir) {
552 $abs_dir = $cwd;
553 }
554 else {
555 $abs_dir = contract_name_Mac($cwd, $top_item);
556 unless (defined $abs_dir) {
cd68ec93 557 warnings::warnif "Can't determine absolute path for $top_item (No such file or directory)\n";
7e47e6ff 558 next Proc_Top_Item;
559 }
560 }
561
562 }
563 else {
564 if (substr($top_item,0,1) eq '/') {
565 $abs_dir = $top_item;
566 }
567 elsif ($top_item eq $File::Find::current_dir) {
568 $abs_dir = $cwd;
569 }
570 else { # care about any ../
571 $abs_dir = contract_name("$cwd/",$top_item);
572 }
573 }
574 $abs_dir= Follow_SymLink($abs_dir);
575 unless (defined $abs_dir) {
80e52b73 576 if ($dangling_symlinks) {
577 if (ref $dangling_symlinks eq 'CODE') {
578 $dangling_symlinks->($top_item, $cwd);
579 } else {
cd68ec93 580 warnings::warnif "$top_item is a dangling symbolic link\n";
80e52b73 581 }
582 }
81793b90 583 next Proc_Top_Item;
7e47e6ff 584 }
585
586 if (-d _) {
81793b90 587 _find_dir_symlnk($wanted, $abs_dir, $top_item);
588 $Is_Dir= 1;
7e47e6ff 589 }
590 }
81793b90 591 else { # no follow
7e47e6ff 592 $topdir = $top_item;
593 unless (defined $topnlink) {
cd68ec93 594 warnings::warnif "Can't stat $top_item: $!\n";
7e47e6ff 595 next Proc_Top_Item;
596 }
597 if (-d _) {
c7b9dd21 598 $top_item =~ s/\.dir\z// if $Is_VMS;
e7b91b67 599 _find_dir($wanted, $top_item, $topnlink);
81793b90 600 $Is_Dir= 1;
7e47e6ff 601 }
237437d0 602 else {
81793b90 603 $abs_dir= $top_item;
7e47e6ff 604 }
605 }
81793b90 606
7e47e6ff 607 unless ($Is_Dir) {
81793b90 608 unless (($_,$dir) = File::Basename::fileparse($abs_dir)) {
7e47e6ff 609 if ($Is_MacOS) {
610 ($dir,$_) = (':', $top_item); # $File::Find::dir, $_
611 }
612 else {
613 ($dir,$_) = ('./', $top_item);
614 }
81793b90 615 }
616
7e47e6ff 617 $abs_dir = $dir;
618 if (( $untaint ) && (is_tainted($dir) )) {
619 ( $abs_dir ) = $dir =~ m|$untaint_pat|;
81793b90 620 unless (defined $abs_dir) {
621 if ($untaint_skip == 0) {
7e47e6ff 622 die "directory $dir is still tainted";
81793b90 623 }
624 else {
625 next Proc_Top_Item;
626 }
627 }
7e47e6ff 628 }
81793b90 629
7e47e6ff 630 unless ($no_chdir || chdir $abs_dir) {
cd68ec93 631 warnings::warnif "Couldn't chdir $abs_dir: $!\n";
7e47e6ff 632 next Proc_Top_Item;
633 }
719911cc 634
7e47e6ff 635 $name = $abs_dir . $_; # $File::Find::name
719911cc 636
7e47e6ff 637 { &$wanted_callback }; # protect against wild "next"
81793b90 638
7e47e6ff 639 }
81793b90 640
7e47e6ff 641 unless ( $no_chdir ) {
642 if ( ($check_t_cwd) && (($untaint) && (is_tainted($cwd) )) ) {
643 ( $cwd_untainted ) = $cwd =~ m|$untaint_pat|;
644 unless (defined $cwd_untainted) {
645 die "insecure cwd in find(depth)";
646 }
647 $check_t_cwd = 0;
648 }
649 unless (chdir $cwd_untainted) {
650 die "Can't cd to $cwd: $!\n";
651 }
652 }
81793b90 653 }
654}
655
656# API:
657# $wanted
658# $p_dir : "parent directory"
659# $nlink : what came back from the stat
660# preconditions:
661# chdir (if not no_chdir) to dir
662
663sub _find_dir($$$) {
664 my ($wanted, $p_dir, $nlink) = @_;
665 my ($CdLvl,$Level) = (0,0);
666 my @Stack;
667 my @filenames;
668 my ($subcount,$sub_nlink);
669 my $SE= [];
670 my $dir_name= $p_dir;
7e47e6ff 671 my $dir_pref;
39e79f6b 672 my $dir_rel = $File::Find::current_dir;
7e47e6ff 673 my $tainted = 0;
5fa2bf2b 674 my $no_nlink;
7e47e6ff 675
676 if ($Is_MacOS) {
677 $dir_pref= ($p_dir =~ /:$/) ? $p_dir : "$p_dir:"; # preface
7e47e6ff 678 }
679 else {
680 $dir_pref= ( $p_dir eq '/' ? '/' : "$p_dir/" );
7e47e6ff 681 }
81793b90 682
683 local ($dir, $name, $prune, *DIR);
7e47e6ff 684
685 unless ( $no_chdir || ($p_dir eq $File::Find::current_dir)) {
81793b90 686 my $udir = $p_dir;
7e47e6ff 687 if (( $untaint ) && (is_tainted($p_dir) )) {
688 ( $udir ) = $p_dir =~ m|$untaint_pat|;
81793b90 689 unless (defined $udir) {
690 if ($untaint_skip == 0) {
691 die "directory $p_dir is still tainted";
692 }
693 else {
694 return;
695 }
237437d0 696 }
a0d0e21e 697 }
81793b90 698 unless (chdir $udir) {
cd68ec93 699 warnings::warnif "Can't cd to $udir: $!\n";
81793b90 700 return;
701 }
702 }
7e47e6ff 703
704 # push the starting directory
57e73c4b 705 push @Stack,[$CdLvl,$p_dir,$dir_rel,-1] if $bydepth;
81793b90 706
7e47e6ff 707 if ($Is_MacOS) {
708 $p_dir = $dir_pref; # ensure trailing ':'
709 }
710
81793b90 711 while (defined $SE) {
712 unless ($bydepth) {
7e47e6ff 713 $dir= $p_dir; # $File::Find::dir
714 $name= $dir_name; # $File::Find::name
715 $_= ($no_chdir ? $dir_name : $dir_rel ); # $_
81793b90 716 # prune may happen here
7e47e6ff 717 $prune= 0;
718 { &$wanted_callback }; # protect against wild "next"
719 next if $prune;
81793b90 720 }
7e47e6ff 721
81793b90 722 # change to that directory
7e47e6ff 723 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
81793b90 724 my $udir= $dir_rel;
7e47e6ff 725 if ( ($untaint) && (($tainted) || ($tainted = is_tainted($dir_rel) )) ) {
726 ( $udir ) = $dir_rel =~ m|$untaint_pat|;
81793b90 727 unless (defined $udir) {
728 if ($untaint_skip == 0) {
7e47e6ff 729 if ($Is_MacOS) {
730 die "directory ($p_dir) $dir_rel is still tainted";
731 }
732 else {
733 die "directory (" . ($p_dir ne '/' ? $p_dir : '') . "/) $dir_rel is still tainted";
734 }
735 } else { # $untaint_skip == 1
736 next;
81793b90 737 }
738 }
739 }
740 unless (chdir $udir) {
7e47e6ff 741 if ($Is_MacOS) {
cd68ec93 742 warnings::warnif "Can't cd to ($p_dir) $udir: $!\n";
7e47e6ff 743 }
744 else {
cd68ec93 745 warnings::warnif "Can't cd to (" .
746 ($p_dir ne '/' ? $p_dir : '') . "/) $udir: $!\n";
7e47e6ff 747 }
81793b90 748 next;
749 }
750 $CdLvl++;
751 }
752
7e47e6ff 753 if ($Is_MacOS) {
754 $dir_name = "$dir_name:" unless ($dir_name =~ /:$/);
755 }
756
757 $dir= $dir_name; # $File::Find::dir
81793b90 758
759 # Get the list of files in the current directory.
7e47e6ff 760 unless (opendir DIR, ($no_chdir ? $dir_name : $File::Find::current_dir)) {
cd68ec93 761 warnings::warnif "Can't opendir($dir_name): $!\n";
81793b90 762 next;
763 }
764 @filenames = readdir DIR;
765 closedir(DIR);
719c805e 766 @filenames = &$pre_process(@filenames) if $pre_process;
767 push @Stack,[$CdLvl,$dir_name,"",-2] if $post_process;
81793b90 768
5fa2bf2b 769 # default: use whatever was specifid
770 # (if $nlink >= 2, and $avoid_nlink == 0, this will switch back)
771 $no_nlink = $avoid_nlink;
772 # if dir has wrong nlink count, force switch to slower stat method
773 $no_nlink = 1 if ($nlink < 2);
774
775 if ($nlink == 2 && !$no_nlink) {
81793b90 776 # This dir has no subdirectories.
777 for my $FN (@filenames) {
7e47e6ff 778 next if $FN =~ $File::Find::skip_pattern;
81793b90 779
7e47e6ff 780 $name = $dir_pref . $FN; # $File::Find::name
781 $_ = ($no_chdir ? $name : $FN); # $_
73396e07 782 { &$wanted_callback }; # protect against wild "next"
81793b90 783 }
784
785 }
786 else {
787 # This dir has subdirectories.
788 $subcount = $nlink - 2;
789
790 for my $FN (@filenames) {
7e47e6ff 791 next if $FN =~ $File::Find::skip_pattern;
5fa2bf2b 792 if ($subcount > 0 || $no_nlink) {
81793b90 793 # Seen all the subdirs?
794 # check for directoriness.
795 # stat is faster for a file in the current directory
07867069 796 $sub_nlink = (lstat ($no_chdir ? $dir_pref . $FN : $FN))[3];
81793b90 797
798 if (-d _) {
799 --$subcount;
c7b9dd21 800 $FN =~ s/\.dir\z// if $Is_VMS;
81793b90 801 push @Stack,[$CdLvl,$dir_name,$FN,$sub_nlink];
802 }
803 else {
7e47e6ff 804 $name = $dir_pref . $FN; # $File::Find::name
805 $_= ($no_chdir ? $name : $FN); # $_
73396e07 806 { &$wanted_callback }; # protect against wild "next"
81793b90 807 }
808 }
07867069 809 else {
7e47e6ff 810 $name = $dir_pref . $FN; # $File::Find::name
811 $_= ($no_chdir ? $name : $FN); # $_
73396e07 812 { &$wanted_callback }; # protect against wild "next"
81793b90 813 }
814 }
815 }
17b275ff 816 }
817 continue {
57e73c4b 818 while ( defined ($SE = pop @Stack) ) {
81793b90 819 ($Level, $p_dir, $dir_rel, $nlink) = @$SE;
820 if ($CdLvl > $Level && !$no_chdir) {
7e47e6ff 821 my $tmp;
822 if ($Is_MacOS) {
823 $tmp = (':' x ($CdLvl-$Level)) . ':';
824 }
825 else {
826 $tmp = join('/',('..') x ($CdLvl-$Level));
827 }
828 die "Can't cd to $dir_name" . $tmp
829 unless chdir ($tmp);
81793b90 830 $CdLvl = $Level;
831 }
7e47e6ff 832
833 if ($Is_MacOS) {
834 # $pdir always has a trailing ':', except for the starting dir,
835 # where $dir_rel eq ':'
836 $dir_name = "$p_dir$dir_rel";
837 $dir_pref = "$dir_name:";
838 }
839 else {
840 $dir_name = ($p_dir eq '/' ? "/$dir_rel" : "$p_dir/$dir_rel");
841 $dir_pref = "$dir_name/";
842 }
843
719c805e 844 if ( $nlink == -2 ) {
7e47e6ff 845 $name = $dir = $p_dir; # $File::Find::name / dir
39e79f6b 846 $_ = $File::Find::current_dir;
719c805e 847 &$post_process; # End-of-directory processing
7e47e6ff 848 }
849 elsif ( $nlink < 0 ) { # must be finddepth, report dirname now
850 $name = $dir_name;
851 if ($Is_MacOS) {
852 if ($dir_rel eq ':') { # must be the top dir, where we started
853 $name =~ s|:$||; # $File::Find::name
854 $p_dir = "$p_dir:" unless ($p_dir =~ /:$/);
855 }
856 $dir = $p_dir; # $File::Find::dir
857 $_ = ($no_chdir ? $name : $dir_rel); # $_
858 }
859 else {
860 if ( substr($name,-2) eq '/.' ) {
861 $name =~ s|/\.$||;
862 }
863 $dir = $p_dir;
864 $_ = ($no_chdir ? $dir_name : $dir_rel );
865 if ( substr($_,-2) eq '/.' ) {
866 s|/\.$||;
867 }
868 }
869 { &$wanted_callback }; # protect against wild "next"
870 }
871 else {
872 push @Stack,[$CdLvl,$p_dir,$dir_rel,-1] if $bydepth;
873 last;
874 }
81793b90 875 }
a0d0e21e 876 }
877}
878
81793b90 879
880# API:
881# $wanted
882# $dir_loc : absolute location of a dir
883# $p_dir : "parent directory"
884# preconditions:
885# chdir (if not no_chdir) to dir
886
887sub _find_dir_symlnk($$$) {
7e47e6ff 888 my ($wanted, $dir_loc, $p_dir) = @_; # $dir_loc is the absolute directory
81793b90 889 my @Stack;
890 my @filenames;
891 my $new_loc;
7e47e6ff 892 my $updir_loc = $dir_loc; # untainted parent directory
81793b90 893 my $SE = [];
894 my $dir_name = $p_dir;
7e47e6ff 895 my $dir_pref;
896 my $loc_pref;
39e79f6b 897 my $dir_rel = $File::Find::current_dir;
7e47e6ff 898 my $byd_flag; # flag for pending stack entry if $bydepth
899 my $tainted = 0;
900 my $ok = 1;
901
902 if ($Is_MacOS) {
903 $dir_pref = ($p_dir =~ /:$/) ? "$p_dir" : "$p_dir:";
904 $loc_pref = ($dir_loc =~ /:$/) ? "$dir_loc" : "$dir_loc:";
7e47e6ff 905 } else {
906 $dir_pref = ( $p_dir eq '/' ? '/' : "$p_dir/" );
907 $loc_pref = ( $dir_loc eq '/' ? '/' : "$dir_loc/" );
7e47e6ff 908 }
81793b90 909
910 local ($dir, $name, $fullname, $prune, *DIR);
7e47e6ff 911
912 unless ($no_chdir) {
913 # untaint the topdir
914 if (( $untaint ) && (is_tainted($dir_loc) )) {
915 ( $updir_loc ) = $dir_loc =~ m|$untaint_pat|; # parent dir, now untainted
916 # once untainted, $updir_loc is pushed on the stack (as parent directory);
917 # hence, we don't need to untaint the parent directory every time we chdir
918 # to it later
919 unless (defined $updir_loc) {
81793b90 920 if ($untaint_skip == 0) {
921 die "directory $dir_loc is still tainted";
922 }
923 else {
924 return;
925 }
926 }
927 }
7e47e6ff 928 $ok = chdir($updir_loc) unless ($p_dir eq $File::Find::current_dir);
929 unless ($ok) {
cd68ec93 930 warnings::warnif "Can't cd to $updir_loc: $!\n";
81793b90 931 return;
932 }
933 }
934
7e47e6ff 935 push @Stack,[$dir_loc,$updir_loc,$p_dir,$dir_rel,-1] if $bydepth;
936
937 if ($Is_MacOS) {
938 $p_dir = $dir_pref; # ensure trailing ':'
939 }
57e73c4b 940
81793b90 941 while (defined $SE) {
942
943 unless ($bydepth) {
7e47e6ff 944 # change (back) to parent directory (always untainted)
704ea872 945 unless ($no_chdir) {
7e47e6ff 946 unless (chdir $updir_loc) {
cd68ec93 947 warnings::warnif "Can't cd to $updir_loc: $!\n";
704ea872 948 next;
949 }
950 }
7e47e6ff 951 $dir= $p_dir; # $File::Find::dir
952 $name= $dir_name; # $File::Find::name
953 $_= ($no_chdir ? $dir_name : $dir_rel ); # $_
954 $fullname= $dir_loc; # $File::Find::fullname
81793b90 955 # prune may happen here
7e47e6ff 956 $prune= 0;
704ea872 957 lstat($_); # make sure file tests with '_' work
7e47e6ff 958 { &$wanted_callback }; # protect against wild "next"
959 next if $prune;
81793b90 960 }
961
962 # change to that directory
7e47e6ff 963 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
964 $updir_loc = $dir_loc;
965 if ( ($untaint) && (($tainted) || ($tainted = is_tainted($dir_loc) )) ) {
966 # untaint $dir_loc, what will be pushed on the stack as (untainted) parent dir
967 ( $updir_loc ) = $dir_loc =~ m|$untaint_pat|;
968 unless (defined $updir_loc) {
81793b90 969 if ($untaint_skip == 0) {
970 die "directory $dir_loc is still tainted";
a0d0e21e 971 }
237437d0 972 else {
81793b90 973 next;
237437d0 974 }
a0d0e21e 975 }
976 }
7e47e6ff 977 unless (chdir $updir_loc) {
cd68ec93 978 warnings::warnif "Can't cd to $updir_loc: $!\n";
81793b90 979 next;
980 }
981 }
982
7e47e6ff 983 if ($Is_MacOS) {
984 $dir_name = "$dir_name:" unless ($dir_name =~ /:$/);
985 }
986
987 $dir = $dir_name; # $File::Find::dir
81793b90 988
989 # Get the list of files in the current directory.
7e47e6ff 990 unless (opendir DIR, ($no_chdir ? $dir_loc : $File::Find::current_dir)) {
cd68ec93 991 warnings::warnif "Can't opendir($dir_loc): $!\n";
81793b90 992 next;
993 }
994 @filenames = readdir DIR;
995 closedir(DIR);
996
997 for my $FN (@filenames) {
7e47e6ff 998 next if $FN =~ $File::Find::skip_pattern;
81793b90 999
1000 # follow symbolic links / do an lstat
07867069 1001 $new_loc = Follow_SymLink($loc_pref.$FN);
81793b90 1002
1003 # ignore if invalid symlink
1004 next unless defined $new_loc;
7e47e6ff 1005
81793b90 1006 if (-d _) {
7e47e6ff 1007 push @Stack,[$new_loc,$updir_loc,$dir_name,$FN,1];
81793b90 1008 }
1009 else {
7e47e6ff 1010 $fullname = $new_loc; # $File::Find::fullname
1011 $name = $dir_pref . $FN; # $File::Find::name
1012 $_ = ($no_chdir ? $name : $FN); # $_
73396e07 1013 { &$wanted_callback }; # protect against wild "next"
81793b90 1014 }
1015 }
1016
81793b90 1017 }
1018 continue {
57e73c4b 1019 while (defined($SE = pop @Stack)) {
7e47e6ff 1020 ($dir_loc, $updir_loc, $p_dir, $dir_rel, $byd_flag) = @$SE;
1021 if ($Is_MacOS) {
1022 # $p_dir always has a trailing ':', except for the starting dir,
1023 # where $dir_rel eq ':'
1024 $dir_name = "$p_dir$dir_rel";
1025 $dir_pref = "$dir_name:";
1026 $loc_pref = ($dir_loc =~ /:$/) ? $dir_loc : "$dir_loc:";
1027 }
1028 else {
1029 $dir_name = ($p_dir eq '/' ? "/$dir_rel" : "$p_dir/$dir_rel");
1030 $dir_pref = "$dir_name/";
1031 $loc_pref = "$dir_loc/";
1032 }
1033 if ( $byd_flag < 0 ) { # must be finddepth, report dirname now
1034 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
1035 unless (chdir $updir_loc) { # $updir_loc (parent dir) is always untainted
cd68ec93 1036 warnings::warnif "Can't cd to $updir_loc: $!\n";
7e47e6ff 1037 next;
1038 }
1039 }
1040 $fullname = $dir_loc; # $File::Find::fullname
1041 $name = $dir_name; # $File::Find::name
1042 if ($Is_MacOS) {
1043 if ($dir_rel eq ':') { # must be the top dir, where we started
1044 $name =~ s|:$||; # $File::Find::name
1045 $p_dir = "$p_dir:" unless ($p_dir =~ /:$/);
1046 }
1047 $dir = $p_dir; # $File::Find::dir
1048 $_ = ($no_chdir ? $name : $dir_rel); # $_
1049 }
1050 else {
1051 if ( substr($name,-2) eq '/.' ) {
1052 $name =~ s|/\.$||; # $File::Find::name
1053 }
1054 $dir = $p_dir; # $File::Find::dir
1055 $_ = ($no_chdir ? $dir_name : $dir_rel); # $_
1056 if ( substr($_,-2) eq '/.' ) {
1057 s|/\.$||;
1058 }
1059 }
1060
1061 lstat($_); # make sure file tests with '_' work
1062 { &$wanted_callback }; # protect against wild "next"
1063 }
1064 else {
1065 push @Stack,[$dir_loc, $updir_loc, $p_dir, $dir_rel,-1] if $bydepth;
1066 last;
1067 }
a0d0e21e 1068 }
1069 }
1070}
1071
81793b90 1072
20408e3c 1073sub wrap_wanted {
81793b90 1074 my $wanted = shift;
1075 if ( ref($wanted) eq 'HASH' ) {
1076 if ( $wanted->{follow} || $wanted->{follow_fast}) {
1077 $wanted->{follow_skip} = 1 unless defined $wanted->{follow_skip};
1078 }
1079 if ( $wanted->{untaint} ) {
7e47e6ff 1080 $wanted->{untaint_pattern} = $File::Find::untaint_pattern
81793b90 1081 unless defined $wanted->{untaint_pattern};
1082 $wanted->{untaint_skip} = 0 unless defined $wanted->{untaint_skip};
1083 }
1084 return $wanted;
1085 }
1086 else {
1087 return { wanted => $wanted };
1088 }
a0d0e21e 1089}
1090
20408e3c 1091sub find {
81793b90 1092 my $wanted = shift;
1093 _find_opt(wrap_wanted($wanted), @_);
a0d0e21e 1094}
1095
55d729e4 1096sub finddepth {
81793b90 1097 my $wanted = wrap_wanted(shift);
1098 $wanted->{bydepth} = 1;
1099 _find_opt($wanted, @_);
20408e3c 1100}
6280b799 1101
7e47e6ff 1102# default
1103$File::Find::skip_pattern = qr/^\.{1,2}\z/;
1104$File::Find::untaint_pattern = qr|^([-+@\w./]+)$|;
1105
6280b799 1106# These are hard-coded for now, but may move to hint files.
10eba763 1107if ($^O eq 'VMS') {
81793b90 1108 $Is_VMS = 1;
7e47e6ff 1109 $File::Find::dont_use_nlink = 1;
1110}
1111elsif ($^O eq 'MacOS') {
1112 $Is_MacOS = 1;
1113 $File::Find::dont_use_nlink = 1;
1114 $File::Find::skip_pattern = qr/^Icon\015\z/;
1115 $File::Find::untaint_pattern = qr|^(.+)$|;
748a9306 1116}
1117
7e47e6ff 1118# this _should_ work properly on all platforms
1119# where File::Find can be expected to work
1120$File::Find::current_dir = File::Spec->curdir || '.';
1121
81793b90 1122$File::Find::dont_use_nlink = 1
497711e7 1123 if $^O eq 'os2' || $^O eq 'dos' || $^O eq 'amigaos' || $^O eq 'MSWin32' ||
2986a63f 1124 $^O eq 'cygwin' || $^O eq 'epoc' || $^O eq 'NetWare';
6280b799 1125
20408e3c 1126# Set dont_use_nlink in your hint file if your system's stat doesn't
1127# report the number of links in a directory as an indication
1128# of the number of files.
1129# See, e.g. hints/machten.sh for MachTen 2.2.
81793b90 1130unless ($File::Find::dont_use_nlink) {
1131 require Config;
1132 $File::Find::dont_use_nlink = 1 if ($Config::Config{'dont_use_nlink'});
20408e3c 1133}
1134
7e47e6ff 1135# We need a function that checks if a scalar is tainted. Either use the
1136# Scalar::Util module's tainted() function or our (slower) pure Perl
1137# fallback is_tainted_pp()
1138{
1139 local $@;
1140 eval { require Scalar::Util };
1141 *is_tainted = $@ ? \&is_tainted_pp : \&Scalar::Util::tainted;
1142}
1143
a0d0e21e 11441;