Re: [PATCH] Re: [PATCH] maintperl - File::Spec cwd() stuff
[p5sagit/p5-mst-13.2.git] / lib / File / Find.pm
CommitLineData
a0d0e21e 1package File::Find;
3b825e41 2use 5.006;
b75c8c73 3use strict;
b395063c 4use warnings;
cd68ec93 5use warnings::register;
14885b4f 6our $VERSION = '1.05';
a0d0e21e 7require Exporter;
6280b799 8require Cwd;
a0d0e21e 9
7bd31527 10#
11# Modified to ensure sub-directory traversal order is not inverded by stack
12# push and pops. That is remains in the same order as in the directory file,
13# or user pre-processing (EG:sorted).
14#
15
f06db76b 16=head1 NAME
17
abfdd623 18File::Find - Traverse a directory tree.
f06db76b 19
20=head1 SYNOPSIS
21
22 use File::Find;
6d355c6e 23 find(\&wanted, @directories_to_search);
f06db76b 24 sub wanted { ... }
237437d0 25
f06db76b 26 use File::Find;
abfdd623 27 finddepth(\&wanted, @directories_to_search);
f06db76b 28 sub wanted { ... }
3cb6de81 29
81793b90 30 use File::Find;
31 find({ wanted => \&process, follow => 1 }, '.');
f06db76b 32
33=head1 DESCRIPTION
34
abfdd623 35These are functions for searching through directory trees doing work
36on each file found similar to the Unix I<find> command. File::Find
37exports two functions, C<find> and C<finddepth>. They work similarly
38but have subtle differences.
39
40=over 4
41
42=item B<find>
43
44 find(\&wanted, @directories);
45 find(\%options, @directories);
46
47find() does a breadth-first search over the given @directories in the
6d355c6e 48order they are given. In essence, it works from the top down.
abfdd623 49
50For each file or directory found the &wanted subroutine is called (see
51below for details). Additionally, for each directory found it will go
52into that directory and continue the search.
53
54=item B<finddepth>
55
56 finddepth(\&wanted, @directories);
57 finddepth(\%options, @directories);
58
59finddepth() works just like find() except it does a depth-first search.
60It works from the bottom of the directory tree up.
61
62=back
63
64=head2 %options
65
20408e3c 66The first argument to find() is either a hash reference describing the
abfdd623 67operations to be performed for each file, or a code reference. The
68code reference is described in L<The wanted function> below.
20408e3c 69
81793b90 70Here are the possible keys for the hash:
71
72=over 3
73
74=item C<wanted>
75
abfdd623 76The value should be a code reference. This code reference is
77described in L<The wanted function> below.
81793b90 78
79=item C<bydepth>
80
81Reports the name of a directory only AFTER all its entries
82have been reported. Entry point finddepth() is a shortcut for
f801979b 83specifying C<{ bydepth =E<gt> 1 }> in the first argument of find().
81793b90 84
719c805e 85=item C<preprocess>
86
7e47e6ff 87The value should be a code reference. This code reference is used to
88preprocess the current directory. The name of currently processed
89directory is in $File::Find::dir. Your preprocessing function is
90called after readdir() but before the loop that calls the wanted()
91function. It is called with a list of strings (actually file/directory
92names) and is expected to return a list of strings. The code can be
93used to sort the file/directory names alphabetically, numerically,
94or to filter out directory entries based on their name alone. When
95I<follow> or I<follow_fast> are in effect, C<preprocess> is a no-op.
719c805e 96
97=item C<postprocess>
98
7e47e6ff 99The value should be a code reference. It is invoked just before leaving
100the currently processed directory. It is called in void context with no
101arguments. The name of the current directory is in $File::Find::dir. This
102hook is handy for summarizing a directory, such as calculating its disk
3fa6e24b 103usage. When I<follow> or I<follow_fast> are in effect, C<postprocess> is a
7e47e6ff 104no-op.
719c805e 105
81793b90 106=item C<follow>
107
108Causes symbolic links to be followed. Since directory trees with symbolic
109links (followed) may contain files more than once and may even have
110cycles, a hash has to be built up with an entry for each file.
111This might be expensive both in space and time for a large
112directory tree. See I<follow_fast> and I<follow_skip> below.
113If either I<follow> or I<follow_fast> is in effect:
114
115=over 6
116
a45bd81d 117=item *
81793b90 118
f10e1564 119It is guaranteed that an I<lstat> has been called before the user's
81793b90 120I<wanted()> function is called. This enables fast file checks involving S< _>.
121
a45bd81d 122=item *
81793b90 123
124There is a variable C<$File::Find::fullname> which holds the absolute
125pathname of the file with all symbolic links resolved
126
127=back
128
129=item C<follow_fast>
130
f10e1564 131This is similar to I<follow> except that it may report some files more
132than once. It does detect cycles, however. Since only symbolic links
133have to be hashed, this is much cheaper both in space and time. If
134processing a file more than once (by the user's I<wanted()> function)
81793b90 135is worse than just taking time, the option I<follow> should be used.
136
137=item C<follow_skip>
138
139C<follow_skip==1>, which is the default, causes all files which are
140neither directories nor symbolic links to be ignored if they are about
141to be processed a second time. If a directory or a symbolic link
142are about to be processed a second time, File::Find dies.
143C<follow_skip==0> causes File::Find to die if any file is about to be
144processed a second time.
145C<follow_skip==2> causes File::Find to ignore any duplicate files and
7e47e6ff 146directories but to proceed normally otherwise.
20408e3c 147
80e52b73 148=item C<dangling_symlinks>
149
150If true and a code reference, will be called with the symbolic link
151name and the directory it lives in as arguments. Otherwise, if true
152and warnings are on, warning "symbolic_link_name is a dangling
153symbolic link\n" will be issued. If false, the dangling symbolic link
154will be silently ignored.
f06db76b 155
81793b90 156=item C<no_chdir>
157
158Does not C<chdir()> to each directory as it recurses. The wanted()
159function will need to be aware of this, of course. In this case,
160C<$_> will be the same as C<$File::Find::name>.
161
162=item C<untaint>
163
164If find is used in taint-mode (-T command line switch or if EUID != UID
165or if EGID != GID) then internally directory names have to be untainted
7e47e6ff 166before they can be chdir'ed to. Therefore they are checked against a regular
167expression I<untaint_pattern>. Note that all names passed to the user's
168I<wanted()> function are still tainted. If this option is used while
169not in taint-mode, C<untaint> is a no-op.
81793b90 170
171=item C<untaint_pattern>
172
173See above. This should be set using the C<qr> quoting operator.
174The default is set to C<qr|^([-+@\w./]+)$|>.
1cffc1dd 175Note that the parentheses are vital.
81793b90 176
177=item C<untaint_skip>
178
7e47e6ff 179If set, a directory which fails the I<untaint_pattern> is skipped,
180including all its sub-directories. The default is to 'die' in such a case.
81793b90 181
182=back
183
abfdd623 184=head2 The wanted function
185
186The wanted() function does whatever verifications you want on each
187file and directory. It takes no arguments but rather does its work
188through a collection of variables.
189
190=over 4
191
f837ebe2 192=item C<$File::Find::dir> is the current directory name,
abfdd623 193
194=item C<$_> is the current filename within that directory
195
f837ebe2 196=item C<$File::Find::name> is the complete pathname to the file.
abfdd623 197
198=back
199
f837ebe2 200Don't modify these variables.
201
abfdd623 202For example, when examining the file /some/path/foo.ext you will have:
203
204 $File::Find::dir = /some/path/
205 $_ = foo.ext
206 $File::Find::name = /some/path/foo.ext
207
208You are chdir()'d toC<$File::Find::dir> when the function is called,
209unless C<no_chdir> was specified. Note that when changing to
210directories is in effect the root directory (F</>) is a somewhat
211special case inasmuch as the concatenation of C<$File::Find::dir>,
212C<'/'> and C<$_> is not literally equal to C<$File::Find::name>. The
213table below summarizes all variants:
5cf0a2f2 214
215 $File::Find::name $File::Find::dir $_
216 default / / .
217 no_chdir=>0 /etc / etc
218 /etc/x /etc x
abfdd623 219
5cf0a2f2 220 no_chdir=>1 / / /
221 /etc / /etc
222 /etc/x /etc /etc/x
223
224
225When <follow> or <follow_fast> are in effect, there is
f10e1564 226also a C<$File::Find::fullname>. The function may set
227C<$File::Find::prune> to prune the tree unless C<bydepth> was
228specified. Unless C<follow> or C<follow_fast> is specified, for
229compatibility reasons (find.pl, find2perl) there are in addition the
230following globals available: C<$File::Find::topdir>,
231C<$File::Find::topdev>, C<$File::Find::topino>,
e7b91b67 232C<$File::Find::topmode> and C<$File::Find::topnlink>.
47a735e8 233
20408e3c 234This library is useful for the C<find2perl> tool, which when fed,
f06db76b 235
236 find2perl / -name .nfs\* -mtime +7 \
81793b90 237 -exec rm -f {} \; -o -fstype nfs -prune
f06db76b 238
239produces something like:
240
241 sub wanted {
c7b9dd21 242 /^\.nfs.*\z/s &&
81793b90 243 (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_)) &&
f06db76b 244 int(-M _) > 7 &&
245 unlink($_)
246 ||
81793b90 247 ($nlink || (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))) &&
f06db76b 248 $dev < 0 &&
6280b799 249 ($File::Find::prune = 1);
f06db76b 250 }
251
43dece2a 252Notice the C<_> in the above C<int(-M _)>: the C<_> is a magical
253filehandle that caches the information from the preceding
254stat(), lstat(), or filetest.
255
1cffc1dd 256Here's another interesting wanted function. It will find all symbolic
257links that don't resolve:
f06db76b 258
259 sub wanted {
81793b90 260 -l && !-e && print "bogus link: $File::Find::name\n";
237437d0 261 }
f06db76b 262
81793b90 263See also the script C<pfind> on CPAN for a nice application of this
264module.
265
cd68ec93 266=head1 WARNINGS
267
268If you run your program with the C<-w> switch, or if you use the
269C<warnings> pragma, File::Find will report warnings for several weird
270situations. You can disable these warnings by putting the statement
271
272 no warnings 'File::Find';
273
274in the appropriate scope. See L<perllexwarn> for more info about lexical
275warnings.
276
81793b90 277=head1 CAVEAT
278
5fa2bf2b 279=over 2
280
281=item $dont_use_nlink
282
283You can set the variable C<$File::Find::dont_use_nlink> to 1, if you want to
6cf3b067 284force File::Find to always stat directories. This was used for file systems
285that do not have an C<nlink> count matching the number of sub-directories.
286Examples are ISO-9660 (CD-ROM), AFS, HPFS (OS/2 file system), FAT (DOS file
287system) and a couple of others.
5fa2bf2b 288
6cf3b067 289You shouldn't need to set this variable, since File::Find should now detect
290such file systems on-the-fly and switch itself to using stat. This works even
291for parts of your file system, like a mounted CD-ROM.
5fa2bf2b 292
6cf3b067 293If you do set C<$File::Find::dont_use_nlink> to 1, you will notice slow-downs.
5fa2bf2b 294
295=item symlinks
296
f10e1564 297Be aware that the option to follow symbolic links can be dangerous.
81793b90 298Depending on the structure of the directory tree (including symbolic
299links to directories) you might traverse a given (physical) directory
300more than once (only if C<follow_fast> is in effect).
301Furthermore, deleting or changing files in a symbolically linked directory
302might cause very unpleasant surprises, since you delete or change files
303in an unknown directory.
0530a6c4 304
5fa2bf2b 305=back
306
7e47e6ff 307=head1 NOTES
308
309=over 4
310
311=item *
312
313Mac OS (Classic) users should note a few differences:
314
315=over 4
316
317=item *
318
319The path separator is ':', not '/', and the current directory is denoted
320as ':', not '.'. You should be careful about specifying relative pathnames.
321While a full path always begins with a volume name, a relative pathname
322should always begin with a ':'. If specifying a volume name only, a
323trailing ':' is required.
324
325=item *
326
327C<$File::Find::dir> is guaranteed to end with a ':'. If C<$_>
328contains the name of a directory, that name may or may not end with a
329':'. Likewise, C<$File::Find::name>, which contains the complete
330pathname to that directory, and C<$File::Find::fullname>, which holds
331the absolute pathname of that directory with all symbolic links resolved,
332may or may not end with a ':'.
333
334=item *
335
336The default C<untaint_pattern> (see above) on Mac OS is set to
337C<qr|^(.+)$|>. Note that the parentheses are vital.
338
339=item *
340
341The invisible system file "Icon\015" is ignored. While this file may
342appear in every directory, there are some more invisible system files
343on every volume, which are all located at the volume root level (i.e.
344"MacintoshHD:"). These system files are B<not> excluded automatically.
345Your filter may use the following code to recognize invisible files or
346directories (requires Mac::Files):
347
348 use Mac::Files;
349
350 # invisible() -- returns 1 if file/directory is invisible,
1cffc1dd 351 # 0 if it's visible or undef if an error occurred
7e47e6ff 352
353 sub invisible($) {
354 my $file = shift;
355 my ($fileCat, $fileInfo);
356 my $invisible_flag = 1 << 14;
357
358 if ( $fileCat = FSpGetCatInfo($file) ) {
359 if ($fileInfo = $fileCat->ioFlFndrInfo() ) {
360 return (($fileInfo->fdFlags & $invisible_flag) && 1);
361 }
362 }
363 return undef;
364 }
365
366Generally, invisible files are system files, unless an odd application
367decides to use invisible files for its own purposes. To distinguish
368such files from system files, you have to look at the B<type> and B<creator>
369file attributes. The MacPerl built-in functions C<GetFileInfo(FILE)> and
370C<SetFileInfo(CREATOR, TYPE, FILES)> offer access to these attributes
371(see MacPerl.pm for details).
372
373Files that appear on the desktop actually reside in an (hidden) directory
374named "Desktop Folder" on the particular disk volume. Note that, although
375all desktop files appear to be on the same "virtual" desktop, each disk
376volume actually maintains its own "Desktop Folder" directory.
377
378=back
379
380=back
0530a6c4 381
a85af077 382=head1 HISTORY
383
384File::Find used to produce incorrect results if called recursively.
385During the development of perl 5.8 this bug was fixed.
386The first fixed version of File::Find was 1.01.
387
f06db76b 388=cut
389
b75c8c73 390our @ISA = qw(Exporter);
391our @EXPORT = qw(find finddepth);
6280b799 392
a0d0e21e 393
81793b90 394use strict;
395my $Is_VMS;
7e47e6ff 396my $Is_MacOS;
81793b90 397
398require File::Basename;
7e47e6ff 399require File::Spec;
81793b90 400
9f826d6a 401# Should ideally be my() not our() but local() currently
402# refuses to operate on lexicals
403
404our %SLnkSeen;
405our ($wanted_callback, $avoid_nlink, $bydepth, $no_chdir, $follow,
719c805e 406 $follow_skip, $full_check, $untaint, $untaint_skip, $untaint_pat,
80e52b73 407 $pre_process, $post_process, $dangling_symlinks);
81793b90 408
409sub contract_name {
410 my ($cdir,$fn) = @_;
411
7e47e6ff 412 return substr($cdir,0,rindex($cdir,'/')) if $fn eq $File::Find::current_dir;
81793b90 413
414 $cdir = substr($cdir,0,rindex($cdir,'/')+1);
415
416 $fn =~ s|^\./||;
417
418 my $abs_name= $cdir . $fn;
419
420 if (substr($fn,0,3) eq '../') {
fecbda2b 421 1 while $abs_name =~ s!/[^/]*/\.\./!/!;
81793b90 422 }
423
424 return $abs_name;
425}
426
7e47e6ff 427# return the absolute name of a directory or file
428sub contract_name_Mac {
429 my ($cdir,$fn) = @_;
430 my $abs_name;
431
432 if ($fn =~ /^(:+)(.*)$/) { # valid pathname starting with a ':'
433
434 my $colon_count = length ($1);
435 if ($colon_count == 1) {
436 $abs_name = $cdir . $2;
437 return $abs_name;
438 }
439 else {
440 # need to move up the tree, but
441 # only if it's not a volume name
442 for (my $i=1; $i<$colon_count; $i++) {
443 unless ($cdir =~ /^[^:]+:$/) { # volume name
444 $cdir =~ s/[^:]+:$//;
445 }
446 else {
447 return undef;
448 }
449 }
450 $abs_name = $cdir . $2;
451 return $abs_name;
452 }
453
454 }
455 else {
456
457 # $fn may be a valid path to a directory or file or (dangling)
458 # symlink, without a leading ':'
459 if ( (-e $fn) || (-l $fn) ) {
460 if ($fn =~ /^[^:]+:/) { # a volume name like DataHD:*
461 return $fn; # $fn is already an absolute path
462 }
463 else {
464 $abs_name = $cdir . $fn;
465 return $abs_name;
466 }
467 }
468 else { # argh!, $fn is not a valid directory/file
469 return undef;
470 }
471 }
472}
81793b90 473
474sub PathCombine($$) {
475 my ($Base,$Name) = @_;
476 my $AbsName;
477
7e47e6ff 478 if ($Is_MacOS) {
479 # $Name is the resolved symlink (always a full path on MacOS),
480 # i.e. there's no need to call contract_name_Mac()
481 $AbsName = $Name;
482
483 # (simple) check for recursion
484 if ( ( $Base =~ /^$AbsName/) && (-d $AbsName) ) { # recursion
485 return undef;
486 }
81793b90 487 }
488 else {
7e47e6ff 489 if (substr($Name,0,1) eq '/') {
490 $AbsName= $Name;
491 }
492 else {
493 $AbsName= contract_name($Base,$Name);
494 }
81793b90 495
7e47e6ff 496 # (simple) check for recursion
497 my $newlen= length($AbsName);
498 if ($newlen <= length($Base)) {
499 if (($newlen == length($Base) || substr($Base,$newlen,1) eq '/')
500 && $AbsName eq substr($Base,0,$newlen))
501 {
502 return undef;
503 }
81793b90 504 }
505 }
506 return $AbsName;
507}
508
509sub Follow_SymLink($) {
510 my ($AbsName) = @_;
511
512 my ($NewName,$DEV, $INO);
513 ($DEV, $INO)= lstat $AbsName;
514
515 while (-l _) {
516 if ($SLnkSeen{$DEV, $INO}++) {
517 if ($follow_skip < 2) {
518 die "$AbsName is encountered a second time";
a0d0e21e 519 }
520 else {
81793b90 521 return undef;
a0d0e21e 522 }
523 }
81793b90 524 $NewName= PathCombine($AbsName, readlink($AbsName));
525 unless(defined $NewName) {
526 if ($follow_skip < 2) {
527 die "$AbsName is a recursive symbolic link";
528 }
529 else {
530 return undef;
a0d0e21e 531 }
81793b90 532 }
533 else {
534 $AbsName= $NewName;
535 }
536 ($DEV, $INO) = lstat($AbsName);
537 return undef unless defined $DEV; # dangling symbolic link
538 }
539
cd68ec93 540 if ($full_check && defined $DEV && $SLnkSeen{$DEV, $INO}++) {
7e47e6ff 541 if ( ($follow_skip < 1) || ((-d _) && ($follow_skip < 2)) ) {
81793b90 542 die "$AbsName encountered a second time";
543 }
544 else {
545 return undef;
546 }
547 }
548
549 return $AbsName;
550}
551
17f410f9 552our($dir, $name, $fullname, $prune);
81793b90 553sub _find_dir_symlnk($$$);
554sub _find_dir($$$);
555
7e47e6ff 556# check whether or not a scalar variable is tainted
557# (code straight from the Camel, 3rd ed., page 561)
558sub is_tainted_pp {
559 my $arg = shift;
560 my $nada = substr($arg, 0, 0); # zero-length
561 local $@;
562 eval { eval "# $nada" };
563 return length($@) != 0;
564}
565
81793b90 566sub _find_opt {
567 my $wanted = shift;
568 die "invalid top directory" unless defined $_[0];
569
9f826d6a 570 # This function must local()ize everything because callbacks may
571 # call find() or finddepth()
572
573 local %SLnkSeen;
574 local ($wanted_callback, $avoid_nlink, $bydepth, $no_chdir, $follow,
575 $follow_skip, $full_check, $untaint, $untaint_skip, $untaint_pat,
80e52b73 576 $pre_process, $post_process, $dangling_symlinks);
17ab9c14 577 local($dir, $name, $fullname, $prune, $_);
9f826d6a 578
a0c9c202 579 my $cwd = $wanted->{bydepth} ? Cwd::fastcwd() : Cwd::getcwd();
80e52b73 580 my $cwd_untainted = $cwd;
581 my $check_t_cwd = 1;
582 $wanted_callback = $wanted->{wanted};
583 $bydepth = $wanted->{bydepth};
584 $pre_process = $wanted->{preprocess};
585 $post_process = $wanted->{postprocess};
586 $no_chdir = $wanted->{no_chdir};
587 $full_check = $wanted->{follow};
588 $follow = $full_check || $wanted->{follow_fast};
589 $follow_skip = $wanted->{follow_skip};
590 $untaint = $wanted->{untaint};
591 $untaint_pat = $wanted->{untaint_pattern};
592 $untaint_skip = $wanted->{untaint_skip};
593 $dangling_symlinks = $wanted->{dangling_symlinks};
81793b90 594
1cffc1dd 595 # for compatibility reasons (find.pl, find2perl)
9f826d6a 596 local our ($topdir, $topdev, $topino, $topmode, $topnlink);
81793b90 597
598 # a symbolic link to a directory doesn't increase the link count
599 $avoid_nlink = $follow || $File::Find::dont_use_nlink;
600
e7b91b67 601 my ($abs_dir, $Is_Dir);
81793b90 602
603 Proc_Top_Item:
604 foreach my $TOP (@_) {
7e47e6ff 605 my $top_item = $TOP;
606
607 if ($Is_MacOS) {
608 ($topdev,$topino,$topmode,$topnlink) = $follow ? stat $top_item : lstat $top_item;
609 $top_item = ":$top_item"
3fa6e24b 610 if ( (-d _) && ( $top_item !~ /:/ ) );
7e47e6ff 611 }
612 else {
613 $top_item =~ s|/\z|| unless $top_item eq '/';
614 ($topdev,$topino,$topmode,$topnlink) = $follow ? stat $top_item : lstat $top_item;
615 }
616
617 $Is_Dir= 0;
618
619 if ($follow) {
620
621 if ($Is_MacOS) {
622 $cwd = "$cwd:" unless ($cwd =~ /:$/); # for safety
623
624 if ($top_item eq $File::Find::current_dir) {
625 $abs_dir = $cwd;
626 }
627 else {
628 $abs_dir = contract_name_Mac($cwd, $top_item);
629 unless (defined $abs_dir) {
cd68ec93 630 warnings::warnif "Can't determine absolute path for $top_item (No such file or directory)\n";
7e47e6ff 631 next Proc_Top_Item;
632 }
633 }
634
635 }
636 else {
637 if (substr($top_item,0,1) eq '/') {
638 $abs_dir = $top_item;
639 }
640 elsif ($top_item eq $File::Find::current_dir) {
641 $abs_dir = $cwd;
642 }
643 else { # care about any ../
644 $abs_dir = contract_name("$cwd/",$top_item);
645 }
646 }
647 $abs_dir= Follow_SymLink($abs_dir);
648 unless (defined $abs_dir) {
80e52b73 649 if ($dangling_symlinks) {
650 if (ref $dangling_symlinks eq 'CODE') {
651 $dangling_symlinks->($top_item, $cwd);
652 } else {
cd68ec93 653 warnings::warnif "$top_item is a dangling symbolic link\n";
80e52b73 654 }
655 }
81793b90 656 next Proc_Top_Item;
7e47e6ff 657 }
658
659 if (-d _) {
81793b90 660 _find_dir_symlnk($wanted, $abs_dir, $top_item);
661 $Is_Dir= 1;
7e47e6ff 662 }
663 }
81793b90 664 else { # no follow
7e47e6ff 665 $topdir = $top_item;
666 unless (defined $topnlink) {
cd68ec93 667 warnings::warnif "Can't stat $top_item: $!\n";
7e47e6ff 668 next Proc_Top_Item;
669 }
670 if (-d _) {
544ff7a7 671 $top_item =~ s/\.dir\z//i if $Is_VMS;
e7b91b67 672 _find_dir($wanted, $top_item, $topnlink);
81793b90 673 $Is_Dir= 1;
7e47e6ff 674 }
237437d0 675 else {
81793b90 676 $abs_dir= $top_item;
7e47e6ff 677 }
678 }
81793b90 679
7e47e6ff 680 unless ($Is_Dir) {
81793b90 681 unless (($_,$dir) = File::Basename::fileparse($abs_dir)) {
7e47e6ff 682 if ($Is_MacOS) {
683 ($dir,$_) = (':', $top_item); # $File::Find::dir, $_
684 }
685 else {
686 ($dir,$_) = ('./', $top_item);
687 }
81793b90 688 }
689
7e47e6ff 690 $abs_dir = $dir;
691 if (( $untaint ) && (is_tainted($dir) )) {
692 ( $abs_dir ) = $dir =~ m|$untaint_pat|;
81793b90 693 unless (defined $abs_dir) {
694 if ($untaint_skip == 0) {
7e47e6ff 695 die "directory $dir is still tainted";
81793b90 696 }
697 else {
698 next Proc_Top_Item;
699 }
700 }
7e47e6ff 701 }
81793b90 702
7e47e6ff 703 unless ($no_chdir || chdir $abs_dir) {
cd68ec93 704 warnings::warnif "Couldn't chdir $abs_dir: $!\n";
7e47e6ff 705 next Proc_Top_Item;
706 }
719911cc 707
7e47e6ff 708 $name = $abs_dir . $_; # $File::Find::name
3bb6d3e5 709 $_ = $name if $no_chdir;
719911cc 710
abfdd623 711 { $wanted_callback->() }; # protect against wild "next"
81793b90 712
7e47e6ff 713 }
81793b90 714
7e47e6ff 715 unless ( $no_chdir ) {
716 if ( ($check_t_cwd) && (($untaint) && (is_tainted($cwd) )) ) {
717 ( $cwd_untainted ) = $cwd =~ m|$untaint_pat|;
718 unless (defined $cwd_untainted) {
719 die "insecure cwd in find(depth)";
720 }
721 $check_t_cwd = 0;
722 }
723 unless (chdir $cwd_untainted) {
724 die "Can't cd to $cwd: $!\n";
725 }
726 }
81793b90 727 }
728}
729
730# API:
731# $wanted
732# $p_dir : "parent directory"
733# $nlink : what came back from the stat
734# preconditions:
735# chdir (if not no_chdir) to dir
736
737sub _find_dir($$$) {
738 my ($wanted, $p_dir, $nlink) = @_;
739 my ($CdLvl,$Level) = (0,0);
740 my @Stack;
741 my @filenames;
742 my ($subcount,$sub_nlink);
743 my $SE= [];
744 my $dir_name= $p_dir;
7e47e6ff 745 my $dir_pref;
39e79f6b 746 my $dir_rel = $File::Find::current_dir;
7e47e6ff 747 my $tainted = 0;
5fa2bf2b 748 my $no_nlink;
7e47e6ff 749
750 if ($Is_MacOS) {
751 $dir_pref= ($p_dir =~ /:$/) ? $p_dir : "$p_dir:"; # preface
7e47e6ff 752 }
753 else {
754 $dir_pref= ( $p_dir eq '/' ? '/' : "$p_dir/" );
7e47e6ff 755 }
81793b90 756
757 local ($dir, $name, $prune, *DIR);
7e47e6ff 758
759 unless ( $no_chdir || ($p_dir eq $File::Find::current_dir)) {
81793b90 760 my $udir = $p_dir;
7e47e6ff 761 if (( $untaint ) && (is_tainted($p_dir) )) {
762 ( $udir ) = $p_dir =~ m|$untaint_pat|;
81793b90 763 unless (defined $udir) {
764 if ($untaint_skip == 0) {
765 die "directory $p_dir is still tainted";
766 }
767 else {
768 return;
769 }
237437d0 770 }
a0d0e21e 771 }
8d8eebbf 772 unless (chdir ($Is_VMS && $udir !~ /[\/\[<]+/ ? "./$udir" : $udir)) {
cd68ec93 773 warnings::warnif "Can't cd to $udir: $!\n";
81793b90 774 return;
775 }
776 }
7e47e6ff 777
778 # push the starting directory
57e73c4b 779 push @Stack,[$CdLvl,$p_dir,$dir_rel,-1] if $bydepth;
81793b90 780
7e47e6ff 781 if ($Is_MacOS) {
782 $p_dir = $dir_pref; # ensure trailing ':'
783 }
784
81793b90 785 while (defined $SE) {
786 unless ($bydepth) {
7e47e6ff 787 $dir= $p_dir; # $File::Find::dir
788 $name= $dir_name; # $File::Find::name
789 $_= ($no_chdir ? $dir_name : $dir_rel ); # $_
81793b90 790 # prune may happen here
7e47e6ff 791 $prune= 0;
abfdd623 792 { $wanted_callback->() }; # protect against wild "next"
7e47e6ff 793 next if $prune;
81793b90 794 }
7e47e6ff 795
81793b90 796 # change to that directory
7e47e6ff 797 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
81793b90 798 my $udir= $dir_rel;
7e47e6ff 799 if ( ($untaint) && (($tainted) || ($tainted = is_tainted($dir_rel) )) ) {
800 ( $udir ) = $dir_rel =~ m|$untaint_pat|;
81793b90 801 unless (defined $udir) {
802 if ($untaint_skip == 0) {
7e47e6ff 803 if ($Is_MacOS) {
804 die "directory ($p_dir) $dir_rel is still tainted";
805 }
806 else {
807 die "directory (" . ($p_dir ne '/' ? $p_dir : '') . "/) $dir_rel is still tainted";
808 }
809 } else { # $untaint_skip == 1
810 next;
81793b90 811 }
812 }
813 }
8d8eebbf 814 unless (chdir ($Is_VMS && $udir !~ /[\/\[<]+/ ? "./$udir" : $udir)) {
7e47e6ff 815 if ($Is_MacOS) {
cd68ec93 816 warnings::warnif "Can't cd to ($p_dir) $udir: $!\n";
7e47e6ff 817 }
818 else {
cd68ec93 819 warnings::warnif "Can't cd to (" .
820 ($p_dir ne '/' ? $p_dir : '') . "/) $udir: $!\n";
7e47e6ff 821 }
81793b90 822 next;
823 }
824 $CdLvl++;
825 }
826
7e47e6ff 827 if ($Is_MacOS) {
828 $dir_name = "$dir_name:" unless ($dir_name =~ /:$/);
829 }
830
831 $dir= $dir_name; # $File::Find::dir
81793b90 832
833 # Get the list of files in the current directory.
7e47e6ff 834 unless (opendir DIR, ($no_chdir ? $dir_name : $File::Find::current_dir)) {
cd68ec93 835 warnings::warnif "Can't opendir($dir_name): $!\n";
81793b90 836 next;
837 }
838 @filenames = readdir DIR;
839 closedir(DIR);
abfdd623 840 @filenames = $pre_process->(@filenames) if $pre_process;
719c805e 841 push @Stack,[$CdLvl,$dir_name,"",-2] if $post_process;
81793b90 842
5fa2bf2b 843 # default: use whatever was specifid
844 # (if $nlink >= 2, and $avoid_nlink == 0, this will switch back)
845 $no_nlink = $avoid_nlink;
846 # if dir has wrong nlink count, force switch to slower stat method
847 $no_nlink = 1 if ($nlink < 2);
848
849 if ($nlink == 2 && !$no_nlink) {
81793b90 850 # This dir has no subdirectories.
851 for my $FN (@filenames) {
7e47e6ff 852 next if $FN =~ $File::Find::skip_pattern;
81793b90 853
7e47e6ff 854 $name = $dir_pref . $FN; # $File::Find::name
855 $_ = ($no_chdir ? $name : $FN); # $_
abfdd623 856 { $wanted_callback->() }; # protect against wild "next"
81793b90 857 }
858
859 }
860 else {
861 # This dir has subdirectories.
862 $subcount = $nlink - 2;
863
7bd31527 864 # HACK: insert directories at this position. so as to preserve
865 # the user pre-processed ordering of files.
866 # EG: directory traversal is in user sorted order, not at random.
867 my $stack_top = @Stack;
868
81793b90 869 for my $FN (@filenames) {
7e47e6ff 870 next if $FN =~ $File::Find::skip_pattern;
5fa2bf2b 871 if ($subcount > 0 || $no_nlink) {
81793b90 872 # Seen all the subdirs?
873 # check for directoriness.
874 # stat is faster for a file in the current directory
07867069 875 $sub_nlink = (lstat ($no_chdir ? $dir_pref . $FN : $FN))[3];
81793b90 876
877 if (-d _) {
878 --$subcount;
544ff7a7 879 $FN =~ s/\.dir\z//i if $Is_VMS;
7bd31527 880 # HACK: replace push to preserve dir traversal order
881 #push @Stack,[$CdLvl,$dir_name,$FN,$sub_nlink];
882 splice @Stack, $stack_top, 0,
883 [$CdLvl,$dir_name,$FN,$sub_nlink];
81793b90 884 }
885 else {
7e47e6ff 886 $name = $dir_pref . $FN; # $File::Find::name
887 $_= ($no_chdir ? $name : $FN); # $_
abfdd623 888 { $wanted_callback->() }; # protect against wild "next"
81793b90 889 }
890 }
07867069 891 else {
7e47e6ff 892 $name = $dir_pref . $FN; # $File::Find::name
893 $_= ($no_chdir ? $name : $FN); # $_
abfdd623 894 { $wanted_callback->() }; # protect against wild "next"
81793b90 895 }
896 }
897 }
17b275ff 898 }
899 continue {
57e73c4b 900 while ( defined ($SE = pop @Stack) ) {
81793b90 901 ($Level, $p_dir, $dir_rel, $nlink) = @$SE;
902 if ($CdLvl > $Level && !$no_chdir) {
7e47e6ff 903 my $tmp;
904 if ($Is_MacOS) {
905 $tmp = (':' x ($CdLvl-$Level)) . ':';
906 }
907 else {
908 $tmp = join('/',('..') x ($CdLvl-$Level));
909 }
910 die "Can't cd to $dir_name" . $tmp
911 unless chdir ($tmp);
81793b90 912 $CdLvl = $Level;
913 }
7e47e6ff 914
915 if ($Is_MacOS) {
916 # $pdir always has a trailing ':', except for the starting dir,
917 # where $dir_rel eq ':'
918 $dir_name = "$p_dir$dir_rel";
919 $dir_pref = "$dir_name:";
920 }
921 else {
922 $dir_name = ($p_dir eq '/' ? "/$dir_rel" : "$p_dir/$dir_rel");
923 $dir_pref = "$dir_name/";
924 }
925
719c805e 926 if ( $nlink == -2 ) {
7e47e6ff 927 $name = $dir = $p_dir; # $File::Find::name / dir
39e79f6b 928 $_ = $File::Find::current_dir;
abfdd623 929 $post_process->(); # End-of-directory processing
7e47e6ff 930 }
931 elsif ( $nlink < 0 ) { # must be finddepth, report dirname now
932 $name = $dir_name;
933 if ($Is_MacOS) {
934 if ($dir_rel eq ':') { # must be the top dir, where we started
935 $name =~ s|:$||; # $File::Find::name
936 $p_dir = "$p_dir:" unless ($p_dir =~ /:$/);
937 }
938 $dir = $p_dir; # $File::Find::dir
939 $_ = ($no_chdir ? $name : $dir_rel); # $_
940 }
941 else {
942 if ( substr($name,-2) eq '/.' ) {
5cf0a2f2 943 substr($name, length($name) == 2 ? -1 : -2) = '';
7e47e6ff 944 }
945 $dir = $p_dir;
946 $_ = ($no_chdir ? $dir_name : $dir_rel );
947 if ( substr($_,-2) eq '/.' ) {
5cf0a2f2 948 substr($_, length($_) == 2 ? -1 : -2) = '';
7e47e6ff 949 }
950 }
abfdd623 951 { $wanted_callback->() }; # protect against wild "next"
7e47e6ff 952 }
953 else {
954 push @Stack,[$CdLvl,$p_dir,$dir_rel,-1] if $bydepth;
955 last;
956 }
81793b90 957 }
a0d0e21e 958 }
959}
960
81793b90 961
962# API:
963# $wanted
964# $dir_loc : absolute location of a dir
965# $p_dir : "parent directory"
966# preconditions:
967# chdir (if not no_chdir) to dir
968
969sub _find_dir_symlnk($$$) {
7e47e6ff 970 my ($wanted, $dir_loc, $p_dir) = @_; # $dir_loc is the absolute directory
81793b90 971 my @Stack;
972 my @filenames;
973 my $new_loc;
7e47e6ff 974 my $updir_loc = $dir_loc; # untainted parent directory
81793b90 975 my $SE = [];
976 my $dir_name = $p_dir;
7e47e6ff 977 my $dir_pref;
978 my $loc_pref;
39e79f6b 979 my $dir_rel = $File::Find::current_dir;
7e47e6ff 980 my $byd_flag; # flag for pending stack entry if $bydepth
981 my $tainted = 0;
982 my $ok = 1;
983
984 if ($Is_MacOS) {
985 $dir_pref = ($p_dir =~ /:$/) ? "$p_dir" : "$p_dir:";
986 $loc_pref = ($dir_loc =~ /:$/) ? "$dir_loc" : "$dir_loc:";
7e47e6ff 987 } else {
988 $dir_pref = ( $p_dir eq '/' ? '/' : "$p_dir/" );
989 $loc_pref = ( $dir_loc eq '/' ? '/' : "$dir_loc/" );
7e47e6ff 990 }
81793b90 991
992 local ($dir, $name, $fullname, $prune, *DIR);
7e47e6ff 993
994 unless ($no_chdir) {
995 # untaint the topdir
996 if (( $untaint ) && (is_tainted($dir_loc) )) {
997 ( $updir_loc ) = $dir_loc =~ m|$untaint_pat|; # parent dir, now untainted
998 # once untainted, $updir_loc is pushed on the stack (as parent directory);
999 # hence, we don't need to untaint the parent directory every time we chdir
1000 # to it later
1001 unless (defined $updir_loc) {
81793b90 1002 if ($untaint_skip == 0) {
1003 die "directory $dir_loc is still tainted";
1004 }
1005 else {
1006 return;
1007 }
1008 }
1009 }
7e47e6ff 1010 $ok = chdir($updir_loc) unless ($p_dir eq $File::Find::current_dir);
1011 unless ($ok) {
cd68ec93 1012 warnings::warnif "Can't cd to $updir_loc: $!\n";
81793b90 1013 return;
1014 }
1015 }
1016
7e47e6ff 1017 push @Stack,[$dir_loc,$updir_loc,$p_dir,$dir_rel,-1] if $bydepth;
1018
1019 if ($Is_MacOS) {
1020 $p_dir = $dir_pref; # ensure trailing ':'
1021 }
57e73c4b 1022
81793b90 1023 while (defined $SE) {
1024
1025 unless ($bydepth) {
7e47e6ff 1026 # change (back) to parent directory (always untainted)
704ea872 1027 unless ($no_chdir) {
7e47e6ff 1028 unless (chdir $updir_loc) {
cd68ec93 1029 warnings::warnif "Can't cd to $updir_loc: $!\n";
704ea872 1030 next;
1031 }
1032 }
7e47e6ff 1033 $dir= $p_dir; # $File::Find::dir
1034 $name= $dir_name; # $File::Find::name
1035 $_= ($no_chdir ? $dir_name : $dir_rel ); # $_
1036 $fullname= $dir_loc; # $File::Find::fullname
81793b90 1037 # prune may happen here
7e47e6ff 1038 $prune= 0;
704ea872 1039 lstat($_); # make sure file tests with '_' work
abfdd623 1040 { $wanted_callback->() }; # protect against wild "next"
7e47e6ff 1041 next if $prune;
81793b90 1042 }
1043
1044 # change to that directory
7e47e6ff 1045 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
1046 $updir_loc = $dir_loc;
1047 if ( ($untaint) && (($tainted) || ($tainted = is_tainted($dir_loc) )) ) {
1048 # untaint $dir_loc, what will be pushed on the stack as (untainted) parent dir
1049 ( $updir_loc ) = $dir_loc =~ m|$untaint_pat|;
1050 unless (defined $updir_loc) {
81793b90 1051 if ($untaint_skip == 0) {
1052 die "directory $dir_loc is still tainted";
a0d0e21e 1053 }
237437d0 1054 else {
81793b90 1055 next;
237437d0 1056 }
a0d0e21e 1057 }
1058 }
7e47e6ff 1059 unless (chdir $updir_loc) {
cd68ec93 1060 warnings::warnif "Can't cd to $updir_loc: $!\n";
81793b90 1061 next;
1062 }
1063 }
1064
7e47e6ff 1065 if ($Is_MacOS) {
1066 $dir_name = "$dir_name:" unless ($dir_name =~ /:$/);
1067 }
1068
1069 $dir = $dir_name; # $File::Find::dir
81793b90 1070
1071 # Get the list of files in the current directory.
7e47e6ff 1072 unless (opendir DIR, ($no_chdir ? $dir_loc : $File::Find::current_dir)) {
cd68ec93 1073 warnings::warnif "Can't opendir($dir_loc): $!\n";
81793b90 1074 next;
1075 }
1076 @filenames = readdir DIR;
1077 closedir(DIR);
1078
1079 for my $FN (@filenames) {
7e47e6ff 1080 next if $FN =~ $File::Find::skip_pattern;
81793b90 1081
1082 # follow symbolic links / do an lstat
07867069 1083 $new_loc = Follow_SymLink($loc_pref.$FN);
81793b90 1084
1085 # ignore if invalid symlink
1086 next unless defined $new_loc;
7e47e6ff 1087
81793b90 1088 if (-d _) {
7e47e6ff 1089 push @Stack,[$new_loc,$updir_loc,$dir_name,$FN,1];
81793b90 1090 }
1091 else {
7e47e6ff 1092 $fullname = $new_loc; # $File::Find::fullname
1093 $name = $dir_pref . $FN; # $File::Find::name
1094 $_ = ($no_chdir ? $name : $FN); # $_
abfdd623 1095 { $wanted_callback->() }; # protect against wild "next"
81793b90 1096 }
1097 }
1098
81793b90 1099 }
1100 continue {
57e73c4b 1101 while (defined($SE = pop @Stack)) {
7e47e6ff 1102 ($dir_loc, $updir_loc, $p_dir, $dir_rel, $byd_flag) = @$SE;
1103 if ($Is_MacOS) {
1104 # $p_dir always has a trailing ':', except for the starting dir,
1105 # where $dir_rel eq ':'
1106 $dir_name = "$p_dir$dir_rel";
1107 $dir_pref = "$dir_name:";
1108 $loc_pref = ($dir_loc =~ /:$/) ? $dir_loc : "$dir_loc:";
1109 }
1110 else {
1111 $dir_name = ($p_dir eq '/' ? "/$dir_rel" : "$p_dir/$dir_rel");
1112 $dir_pref = "$dir_name/";
1113 $loc_pref = "$dir_loc/";
1114 }
1115 if ( $byd_flag < 0 ) { # must be finddepth, report dirname now
1116 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
1117 unless (chdir $updir_loc) { # $updir_loc (parent dir) is always untainted
cd68ec93 1118 warnings::warnif "Can't cd to $updir_loc: $!\n";
7e47e6ff 1119 next;
1120 }
1121 }
1122 $fullname = $dir_loc; # $File::Find::fullname
1123 $name = $dir_name; # $File::Find::name
1124 if ($Is_MacOS) {
1125 if ($dir_rel eq ':') { # must be the top dir, where we started
1126 $name =~ s|:$||; # $File::Find::name
1127 $p_dir = "$p_dir:" unless ($p_dir =~ /:$/);
1128 }
1129 $dir = $p_dir; # $File::Find::dir
1130 $_ = ($no_chdir ? $name : $dir_rel); # $_
1131 }
1132 else {
1133 if ( substr($name,-2) eq '/.' ) {
f801979b 1134 substr($name, length($name) == 2 ? -1 : -2) = ''; # $File::Find::name
7e47e6ff 1135 }
1136 $dir = $p_dir; # $File::Find::dir
1137 $_ = ($no_chdir ? $dir_name : $dir_rel); # $_
1138 if ( substr($_,-2) eq '/.' ) {
f801979b 1139 substr($_, length($_) == 2 ? -1 : -2) = '';
7e47e6ff 1140 }
1141 }
1142
1143 lstat($_); # make sure file tests with '_' work
abfdd623 1144 { $wanted_callback->() }; # protect against wild "next"
7e47e6ff 1145 }
1146 else {
1147 push @Stack,[$dir_loc, $updir_loc, $p_dir, $dir_rel,-1] if $bydepth;
1148 last;
1149 }
a0d0e21e 1150 }
1151 }
1152}
1153
81793b90 1154
20408e3c 1155sub wrap_wanted {
81793b90 1156 my $wanted = shift;
1157 if ( ref($wanted) eq 'HASH' ) {
1158 if ( $wanted->{follow} || $wanted->{follow_fast}) {
1159 $wanted->{follow_skip} = 1 unless defined $wanted->{follow_skip};
1160 }
1161 if ( $wanted->{untaint} ) {
7e47e6ff 1162 $wanted->{untaint_pattern} = $File::Find::untaint_pattern
81793b90 1163 unless defined $wanted->{untaint_pattern};
1164 $wanted->{untaint_skip} = 0 unless defined $wanted->{untaint_skip};
1165 }
1166 return $wanted;
1167 }
1168 else {
1169 return { wanted => $wanted };
1170 }
a0d0e21e 1171}
1172
20408e3c 1173sub find {
81793b90 1174 my $wanted = shift;
1175 _find_opt(wrap_wanted($wanted), @_);
a0d0e21e 1176}
1177
55d729e4 1178sub finddepth {
81793b90 1179 my $wanted = wrap_wanted(shift);
1180 $wanted->{bydepth} = 1;
1181 _find_opt($wanted, @_);
20408e3c 1182}
6280b799 1183
7e47e6ff 1184# default
1185$File::Find::skip_pattern = qr/^\.{1,2}\z/;
1186$File::Find::untaint_pattern = qr|^([-+@\w./]+)$|;
1187
6280b799 1188# These are hard-coded for now, but may move to hint files.
10eba763 1189if ($^O eq 'VMS') {
81793b90 1190 $Is_VMS = 1;
7e47e6ff 1191 $File::Find::dont_use_nlink = 1;
1192}
1193elsif ($^O eq 'MacOS') {
1194 $Is_MacOS = 1;
1195 $File::Find::dont_use_nlink = 1;
1196 $File::Find::skip_pattern = qr/^Icon\015\z/;
1197 $File::Find::untaint_pattern = qr|^(.+)$|;
748a9306 1198}
1199
7e47e6ff 1200# this _should_ work properly on all platforms
1201# where File::Find can be expected to work
1202$File::Find::current_dir = File::Spec->curdir || '.';
1203
81793b90 1204$File::Find::dont_use_nlink = 1
497711e7 1205 if $^O eq 'os2' || $^O eq 'dos' || $^O eq 'amigaos' || $^O eq 'MSWin32' ||
1119cb72 1206 $^O eq 'cygwin' || $^O eq 'epoc' || $^O eq 'qnx' ||
1207 $^O eq 'nto';
6280b799 1208
20408e3c 1209# Set dont_use_nlink in your hint file if your system's stat doesn't
1210# report the number of links in a directory as an indication
1211# of the number of files.
1212# See, e.g. hints/machten.sh for MachTen 2.2.
81793b90 1213unless ($File::Find::dont_use_nlink) {
1214 require Config;
1215 $File::Find::dont_use_nlink = 1 if ($Config::Config{'dont_use_nlink'});
20408e3c 1216}
1217
7e47e6ff 1218# We need a function that checks if a scalar is tainted. Either use the
1219# Scalar::Util module's tainted() function or our (slower) pure Perl
1220# fallback is_tainted_pp()
1221{
1222 local $@;
1223 eval { require Scalar::Util };
1224 *is_tainted = $@ ? \&is_tainted_pp : \&Scalar::Util::tainted;
1225}
1226
a0d0e21e 12271;