Use UCHARAT() as suggested by Inaba Hiroto.
[p5sagit/p5-mst-13.2.git] / lib / File / Find.pm
CommitLineData
a0d0e21e 1package File::Find;
b75c8c73 2use strict;
b395063c 3use warnings;
4use 5.6.0;
a85af077 5our $VERSION = '1.01';
a0d0e21e 6require Exporter;
6280b799 7require Cwd;
a0d0e21e 8
f06db76b 9=head1 NAME
10
11find - traverse a file tree
12
13finddepth - traverse a directory structure depth-first
14
15=head1 SYNOPSIS
16
17 use File::Find;
81793b90 18 find(\&wanted, '/foo', '/bar');
f06db76b 19 sub wanted { ... }
237437d0 20
f06db76b 21 use File::Find;
81793b90 22 finddepth(\&wanted, '/foo', '/bar');
f06db76b 23 sub wanted { ... }
3cb6de81 24
81793b90 25 use File::Find;
26 find({ wanted => \&process, follow => 1 }, '.');
f06db76b 27
28=head1 DESCRIPTION
29
20408e3c 30The first argument to find() is either a hash reference describing the
81793b90 31operations to be performed for each file, or a code reference.
20408e3c 32
81793b90 33Here are the possible keys for the hash:
34
35=over 3
36
37=item C<wanted>
38
39The value should be a code reference. This code reference is called
40I<the wanted() function> below.
41
42=item C<bydepth>
43
44Reports the name of a directory only AFTER all its entries
45have been reported. Entry point finddepth() is a shortcut for
46specifying C<{ bydepth => 1 }> in the first argument of find().
47
719c805e 48=item C<preprocess>
49
7e47e6ff 50The value should be a code reference. This code reference is used to
51preprocess the current directory. The name of currently processed
52directory is in $File::Find::dir. Your preprocessing function is
53called after readdir() but before the loop that calls the wanted()
54function. It is called with a list of strings (actually file/directory
55names) and is expected to return a list of strings. The code can be
56used to sort the file/directory names alphabetically, numerically,
57or to filter out directory entries based on their name alone. When
58I<follow> or I<follow_fast> are in effect, C<preprocess> is a no-op.
719c805e 59
60=item C<postprocess>
61
7e47e6ff 62The value should be a code reference. It is invoked just before leaving
63the currently processed directory. It is called in void context with no
64arguments. The name of the current directory is in $File::Find::dir. This
65hook is handy for summarizing a directory, such as calculating its disk
3fa6e24b 66usage. When I<follow> or I<follow_fast> are in effect, C<postprocess> is a
7e47e6ff 67no-op.
719c805e 68
81793b90 69=item C<follow>
70
71Causes symbolic links to be followed. Since directory trees with symbolic
72links (followed) may contain files more than once and may even have
73cycles, a hash has to be built up with an entry for each file.
74This might be expensive both in space and time for a large
75directory tree. See I<follow_fast> and I<follow_skip> below.
76If either I<follow> or I<follow_fast> is in effect:
77
78=over 6
79
a45bd81d 80=item *
81793b90 81
f10e1564 82It is guaranteed that an I<lstat> has been called before the user's
81793b90 83I<wanted()> function is called. This enables fast file checks involving S< _>.
84
a45bd81d 85=item *
81793b90 86
87There is a variable C<$File::Find::fullname> which holds the absolute
88pathname of the file with all symbolic links resolved
89
90=back
91
92=item C<follow_fast>
93
f10e1564 94This is similar to I<follow> except that it may report some files more
95than once. It does detect cycles, however. Since only symbolic links
96have to be hashed, this is much cheaper both in space and time. If
97processing a file more than once (by the user's I<wanted()> function)
81793b90 98is worse than just taking time, the option I<follow> should be used.
99
100=item C<follow_skip>
101
102C<follow_skip==1>, which is the default, causes all files which are
103neither directories nor symbolic links to be ignored if they are about
104to be processed a second time. If a directory or a symbolic link
105are about to be processed a second time, File::Find dies.
106C<follow_skip==0> causes File::Find to die if any file is about to be
107processed a second time.
108C<follow_skip==2> causes File::Find to ignore any duplicate files and
7e47e6ff 109directories but to proceed normally otherwise.
20408e3c 110
80e52b73 111=item C<dangling_symlinks>
112
113If true and a code reference, will be called with the symbolic link
114name and the directory it lives in as arguments. Otherwise, if true
115and warnings are on, warning "symbolic_link_name is a dangling
116symbolic link\n" will be issued. If false, the dangling symbolic link
117will be silently ignored.
f06db76b 118
81793b90 119=item C<no_chdir>
120
121Does not C<chdir()> to each directory as it recurses. The wanted()
122function will need to be aware of this, of course. In this case,
123C<$_> will be the same as C<$File::Find::name>.
124
125=item C<untaint>
126
127If find is used in taint-mode (-T command line switch or if EUID != UID
128or if EGID != GID) then internally directory names have to be untainted
7e47e6ff 129before they can be chdir'ed to. Therefore they are checked against a regular
130expression I<untaint_pattern>. Note that all names passed to the user's
131I<wanted()> function are still tainted. If this option is used while
132not in taint-mode, C<untaint> is a no-op.
81793b90 133
134=item C<untaint_pattern>
135
136See above. This should be set using the C<qr> quoting operator.
137The default is set to C<qr|^([-+@\w./]+)$|>.
1cffc1dd 138Note that the parentheses are vital.
81793b90 139
140=item C<untaint_skip>
141
7e47e6ff 142If set, a directory which fails the I<untaint_pattern> is skipped,
143including all its sub-directories. The default is to 'die' in such a case.
81793b90 144
145=back
146
147The wanted() function does whatever verifications you want.
148C<$File::Find::dir> contains the current directory name, and C<$_> the
149current filename within that directory. C<$File::Find::name> contains
f10e1564 150the complete pathname to the file. You are chdir()'d to
151C<$File::Find::dir> when the function is called, unless C<no_chdir>
7e47e6ff 152was specified. When C<follow> or C<follow_fast> are in effect, there is
f10e1564 153also a C<$File::Find::fullname>. The function may set
154C<$File::Find::prune> to prune the tree unless C<bydepth> was
155specified. Unless C<follow> or C<follow_fast> is specified, for
156compatibility reasons (find.pl, find2perl) there are in addition the
157following globals available: C<$File::Find::topdir>,
158C<$File::Find::topdev>, C<$File::Find::topino>,
e7b91b67 159C<$File::Find::topmode> and C<$File::Find::topnlink>.
47a735e8 160
20408e3c 161This library is useful for the C<find2perl> tool, which when fed,
f06db76b 162
163 find2perl / -name .nfs\* -mtime +7 \
81793b90 164 -exec rm -f {} \; -o -fstype nfs -prune
f06db76b 165
166produces something like:
167
168 sub wanted {
c7b9dd21 169 /^\.nfs.*\z/s &&
81793b90 170 (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_)) &&
f06db76b 171 int(-M _) > 7 &&
172 unlink($_)
173 ||
81793b90 174 ($nlink || (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))) &&
f06db76b 175 $dev < 0 &&
6280b799 176 ($File::Find::prune = 1);
f06db76b 177 }
178
43dece2a 179Notice the C<_> in the above C<int(-M _)>: the C<_> is a magical
180filehandle that caches the information from the preceding
181stat(), lstat(), or filetest.
182
81793b90 183Set the variable C<$File::Find::dont_use_nlink> if you're using AFS,
6280b799 184since AFS cheats.
f06db76b 185
1cffc1dd 186Here's another interesting wanted function. It will find all symbolic
187links that don't resolve:
f06db76b 188
189 sub wanted {
81793b90 190 -l && !-e && print "bogus link: $File::Find::name\n";
237437d0 191 }
f06db76b 192
81793b90 193See also the script C<pfind> on CPAN for a nice application of this
194module.
195
196=head1 CAVEAT
197
f10e1564 198Be aware that the option to follow symbolic links can be dangerous.
81793b90 199Depending on the structure of the directory tree (including symbolic
200links to directories) you might traverse a given (physical) directory
201more than once (only if C<follow_fast> is in effect).
202Furthermore, deleting or changing files in a symbolically linked directory
203might cause very unpleasant surprises, since you delete or change files
204in an unknown directory.
0530a6c4 205
7e47e6ff 206=head1 NOTES
207
208=over 4
209
210=item *
211
212Mac OS (Classic) users should note a few differences:
213
214=over 4
215
216=item *
217
218The path separator is ':', not '/', and the current directory is denoted
219as ':', not '.'. You should be careful about specifying relative pathnames.
220While a full path always begins with a volume name, a relative pathname
221should always begin with a ':'. If specifying a volume name only, a
222trailing ':' is required.
223
224=item *
225
226C<$File::Find::dir> is guaranteed to end with a ':'. If C<$_>
227contains the name of a directory, that name may or may not end with a
228':'. Likewise, C<$File::Find::name>, which contains the complete
229pathname to that directory, and C<$File::Find::fullname>, which holds
230the absolute pathname of that directory with all symbolic links resolved,
231may or may not end with a ':'.
232
233=item *
234
235The default C<untaint_pattern> (see above) on Mac OS is set to
236C<qr|^(.+)$|>. Note that the parentheses are vital.
237
238=item *
239
240The invisible system file "Icon\015" is ignored. While this file may
241appear in every directory, there are some more invisible system files
242on every volume, which are all located at the volume root level (i.e.
243"MacintoshHD:"). These system files are B<not> excluded automatically.
244Your filter may use the following code to recognize invisible files or
245directories (requires Mac::Files):
246
247 use Mac::Files;
248
249 # invisible() -- returns 1 if file/directory is invisible,
1cffc1dd 250 # 0 if it's visible or undef if an error occurred
7e47e6ff 251
252 sub invisible($) {
253 my $file = shift;
254 my ($fileCat, $fileInfo);
255 my $invisible_flag = 1 << 14;
256
257 if ( $fileCat = FSpGetCatInfo($file) ) {
258 if ($fileInfo = $fileCat->ioFlFndrInfo() ) {
259 return (($fileInfo->fdFlags & $invisible_flag) && 1);
260 }
261 }
262 return undef;
263 }
264
265Generally, invisible files are system files, unless an odd application
266decides to use invisible files for its own purposes. To distinguish
267such files from system files, you have to look at the B<type> and B<creator>
268file attributes. The MacPerl built-in functions C<GetFileInfo(FILE)> and
269C<SetFileInfo(CREATOR, TYPE, FILES)> offer access to these attributes
270(see MacPerl.pm for details).
271
272Files that appear on the desktop actually reside in an (hidden) directory
273named "Desktop Folder" on the particular disk volume. Note that, although
274all desktop files appear to be on the same "virtual" desktop, each disk
275volume actually maintains its own "Desktop Folder" directory.
276
277=back
278
279=back
0530a6c4 280
a85af077 281=head1 HISTORY
282
283File::Find used to produce incorrect results if called recursively.
284During the development of perl 5.8 this bug was fixed.
285The first fixed version of File::Find was 1.01.
286
f06db76b 287=cut
288
b75c8c73 289our @ISA = qw(Exporter);
290our @EXPORT = qw(find finddepth);
6280b799 291
a0d0e21e 292
81793b90 293use strict;
294my $Is_VMS;
7e47e6ff 295my $Is_MacOS;
81793b90 296
297require File::Basename;
7e47e6ff 298require File::Spec;
81793b90 299
9f826d6a 300# Should ideally be my() not our() but local() currently
301# refuses to operate on lexicals
302
303our %SLnkSeen;
304our ($wanted_callback, $avoid_nlink, $bydepth, $no_chdir, $follow,
719c805e 305 $follow_skip, $full_check, $untaint, $untaint_skip, $untaint_pat,
80e52b73 306 $pre_process, $post_process, $dangling_symlinks);
81793b90 307
308sub contract_name {
309 my ($cdir,$fn) = @_;
310
7e47e6ff 311 return substr($cdir,0,rindex($cdir,'/')) if $fn eq $File::Find::current_dir;
81793b90 312
313 $cdir = substr($cdir,0,rindex($cdir,'/')+1);
314
315 $fn =~ s|^\./||;
316
317 my $abs_name= $cdir . $fn;
318
319 if (substr($fn,0,3) eq '../') {
fecbda2b 320 1 while $abs_name =~ s!/[^/]*/\.\./!/!;
81793b90 321 }
322
323 return $abs_name;
324}
325
7e47e6ff 326# return the absolute name of a directory or file
327sub contract_name_Mac {
328 my ($cdir,$fn) = @_;
329 my $abs_name;
330
331 if ($fn =~ /^(:+)(.*)$/) { # valid pathname starting with a ':'
332
333 my $colon_count = length ($1);
334 if ($colon_count == 1) {
335 $abs_name = $cdir . $2;
336 return $abs_name;
337 }
338 else {
339 # need to move up the tree, but
340 # only if it's not a volume name
341 for (my $i=1; $i<$colon_count; $i++) {
342 unless ($cdir =~ /^[^:]+:$/) { # volume name
343 $cdir =~ s/[^:]+:$//;
344 }
345 else {
346 return undef;
347 }
348 }
349 $abs_name = $cdir . $2;
350 return $abs_name;
351 }
352
353 }
354 else {
355
356 # $fn may be a valid path to a directory or file or (dangling)
357 # symlink, without a leading ':'
358 if ( (-e $fn) || (-l $fn) ) {
359 if ($fn =~ /^[^:]+:/) { # a volume name like DataHD:*
360 return $fn; # $fn is already an absolute path
361 }
362 else {
363 $abs_name = $cdir . $fn;
364 return $abs_name;
365 }
366 }
367 else { # argh!, $fn is not a valid directory/file
368 return undef;
369 }
370 }
371}
81793b90 372
373sub PathCombine($$) {
374 my ($Base,$Name) = @_;
375 my $AbsName;
376
7e47e6ff 377 if ($Is_MacOS) {
378 # $Name is the resolved symlink (always a full path on MacOS),
379 # i.e. there's no need to call contract_name_Mac()
380 $AbsName = $Name;
381
382 # (simple) check for recursion
383 if ( ( $Base =~ /^$AbsName/) && (-d $AbsName) ) { # recursion
384 return undef;
385 }
81793b90 386 }
387 else {
7e47e6ff 388 if (substr($Name,0,1) eq '/') {
389 $AbsName= $Name;
390 }
391 else {
392 $AbsName= contract_name($Base,$Name);
393 }
81793b90 394
7e47e6ff 395 # (simple) check for recursion
396 my $newlen= length($AbsName);
397 if ($newlen <= length($Base)) {
398 if (($newlen == length($Base) || substr($Base,$newlen,1) eq '/')
399 && $AbsName eq substr($Base,0,$newlen))
400 {
401 return undef;
402 }
81793b90 403 }
404 }
405 return $AbsName;
406}
407
408sub Follow_SymLink($) {
409 my ($AbsName) = @_;
410
411 my ($NewName,$DEV, $INO);
412 ($DEV, $INO)= lstat $AbsName;
413
414 while (-l _) {
415 if ($SLnkSeen{$DEV, $INO}++) {
416 if ($follow_skip < 2) {
417 die "$AbsName is encountered a second time";
a0d0e21e 418 }
419 else {
81793b90 420 return undef;
a0d0e21e 421 }
422 }
81793b90 423 $NewName= PathCombine($AbsName, readlink($AbsName));
424 unless(defined $NewName) {
425 if ($follow_skip < 2) {
426 die "$AbsName is a recursive symbolic link";
427 }
428 else {
429 return undef;
a0d0e21e 430 }
81793b90 431 }
432 else {
433 $AbsName= $NewName;
434 }
435 ($DEV, $INO) = lstat($AbsName);
436 return undef unless defined $DEV; # dangling symbolic link
437 }
438
439 if ($full_check && $SLnkSeen{$DEV, $INO}++) {
7e47e6ff 440 if ( ($follow_skip < 1) || ((-d _) && ($follow_skip < 2)) ) {
81793b90 441 die "$AbsName encountered a second time";
442 }
443 else {
444 return undef;
445 }
446 }
447
448 return $AbsName;
449}
450
17f410f9 451our($dir, $name, $fullname, $prune);
81793b90 452sub _find_dir_symlnk($$$);
453sub _find_dir($$$);
454
7e47e6ff 455# check whether or not a scalar variable is tainted
456# (code straight from the Camel, 3rd ed., page 561)
457sub is_tainted_pp {
458 my $arg = shift;
459 my $nada = substr($arg, 0, 0); # zero-length
460 local $@;
461 eval { eval "# $nada" };
462 return length($@) != 0;
463}
464
81793b90 465sub _find_opt {
466 my $wanted = shift;
467 die "invalid top directory" unless defined $_[0];
468
9f826d6a 469 # This function must local()ize everything because callbacks may
470 # call find() or finddepth()
471
472 local %SLnkSeen;
473 local ($wanted_callback, $avoid_nlink, $bydepth, $no_chdir, $follow,
474 $follow_skip, $full_check, $untaint, $untaint_skip, $untaint_pat,
80e52b73 475 $pre_process, $post_process, $dangling_symlinks);
9f826d6a 476 local($dir, $name, $fullname, $prune);
477
a0c9c202 478 my $cwd = $wanted->{bydepth} ? Cwd::fastcwd() : Cwd::getcwd();
80e52b73 479 my $cwd_untainted = $cwd;
480 my $check_t_cwd = 1;
481 $wanted_callback = $wanted->{wanted};
482 $bydepth = $wanted->{bydepth};
483 $pre_process = $wanted->{preprocess};
484 $post_process = $wanted->{postprocess};
485 $no_chdir = $wanted->{no_chdir};
486 $full_check = $wanted->{follow};
487 $follow = $full_check || $wanted->{follow_fast};
488 $follow_skip = $wanted->{follow_skip};
489 $untaint = $wanted->{untaint};
490 $untaint_pat = $wanted->{untaint_pattern};
491 $untaint_skip = $wanted->{untaint_skip};
492 $dangling_symlinks = $wanted->{dangling_symlinks};
81793b90 493
1cffc1dd 494 # for compatibility reasons (find.pl, find2perl)
9f826d6a 495 local our ($topdir, $topdev, $topino, $topmode, $topnlink);
81793b90 496
497 # a symbolic link to a directory doesn't increase the link count
498 $avoid_nlink = $follow || $File::Find::dont_use_nlink;
499
e7b91b67 500 my ($abs_dir, $Is_Dir);
81793b90 501
502 Proc_Top_Item:
503 foreach my $TOP (@_) {
7e47e6ff 504 my $top_item = $TOP;
505
506 if ($Is_MacOS) {
507 ($topdev,$topino,$topmode,$topnlink) = $follow ? stat $top_item : lstat $top_item;
508 $top_item = ":$top_item"
3fa6e24b 509 if ( (-d _) && ( $top_item !~ /:/ ) );
7e47e6ff 510 }
511 else {
512 $top_item =~ s|/\z|| unless $top_item eq '/';
513 ($topdev,$topino,$topmode,$topnlink) = $follow ? stat $top_item : lstat $top_item;
514 }
515
516 $Is_Dir= 0;
517
518 if ($follow) {
519
520 if ($Is_MacOS) {
521 $cwd = "$cwd:" unless ($cwd =~ /:$/); # for safety
522
523 if ($top_item eq $File::Find::current_dir) {
524 $abs_dir = $cwd;
525 }
526 else {
527 $abs_dir = contract_name_Mac($cwd, $top_item);
528 unless (defined $abs_dir) {
3fa6e24b 529 warn "Can't determine absolute path for $top_item (No such file or directory)\n" if $^W;
7e47e6ff 530 next Proc_Top_Item;
531 }
532 }
533
534 }
535 else {
536 if (substr($top_item,0,1) eq '/') {
537 $abs_dir = $top_item;
538 }
539 elsif ($top_item eq $File::Find::current_dir) {
540 $abs_dir = $cwd;
541 }
542 else { # care about any ../
543 $abs_dir = contract_name("$cwd/",$top_item);
544 }
545 }
546 $abs_dir= Follow_SymLink($abs_dir);
547 unless (defined $abs_dir) {
80e52b73 548 if ($dangling_symlinks) {
549 if (ref $dangling_symlinks eq 'CODE') {
550 $dangling_symlinks->($top_item, $cwd);
551 } else {
552 warn "$top_item is a dangling symbolic link\n" if $^W;
553 }
554 }
81793b90 555 next Proc_Top_Item;
7e47e6ff 556 }
557
558 if (-d _) {
81793b90 559 _find_dir_symlnk($wanted, $abs_dir, $top_item);
560 $Is_Dir= 1;
7e47e6ff 561 }
562 }
81793b90 563 else { # no follow
7e47e6ff 564 $topdir = $top_item;
565 unless (defined $topnlink) {
3fa6e24b 566 warn "Can't stat $top_item: $!\n" if $^W;
7e47e6ff 567 next Proc_Top_Item;
568 }
569 if (-d _) {
c7b9dd21 570 $top_item =~ s/\.dir\z// if $Is_VMS;
e7b91b67 571 _find_dir($wanted, $top_item, $topnlink);
81793b90 572 $Is_Dir= 1;
7e47e6ff 573 }
237437d0 574 else {
81793b90 575 $abs_dir= $top_item;
7e47e6ff 576 }
577 }
81793b90 578
7e47e6ff 579 unless ($Is_Dir) {
81793b90 580 unless (($_,$dir) = File::Basename::fileparse($abs_dir)) {
7e47e6ff 581 if ($Is_MacOS) {
582 ($dir,$_) = (':', $top_item); # $File::Find::dir, $_
583 }
584 else {
585 ($dir,$_) = ('./', $top_item);
586 }
81793b90 587 }
588
7e47e6ff 589 $abs_dir = $dir;
590 if (( $untaint ) && (is_tainted($dir) )) {
591 ( $abs_dir ) = $dir =~ m|$untaint_pat|;
81793b90 592 unless (defined $abs_dir) {
593 if ($untaint_skip == 0) {
7e47e6ff 594 die "directory $dir is still tainted";
81793b90 595 }
596 else {
597 next Proc_Top_Item;
598 }
599 }
7e47e6ff 600 }
81793b90 601
7e47e6ff 602 unless ($no_chdir || chdir $abs_dir) {
3fa6e24b 603 warn "Couldn't chdir $abs_dir: $!\n" if $^W;
7e47e6ff 604 next Proc_Top_Item;
605 }
719911cc 606
7e47e6ff 607 $name = $abs_dir . $_; # $File::Find::name
719911cc 608
7e47e6ff 609 { &$wanted_callback }; # protect against wild "next"
81793b90 610
7e47e6ff 611 }
81793b90 612
7e47e6ff 613 unless ( $no_chdir ) {
614 if ( ($check_t_cwd) && (($untaint) && (is_tainted($cwd) )) ) {
615 ( $cwd_untainted ) = $cwd =~ m|$untaint_pat|;
616 unless (defined $cwd_untainted) {
617 die "insecure cwd in find(depth)";
618 }
619 $check_t_cwd = 0;
620 }
621 unless (chdir $cwd_untainted) {
622 die "Can't cd to $cwd: $!\n";
623 }
624 }
81793b90 625 }
626}
627
628# API:
629# $wanted
630# $p_dir : "parent directory"
631# $nlink : what came back from the stat
632# preconditions:
633# chdir (if not no_chdir) to dir
634
635sub _find_dir($$$) {
636 my ($wanted, $p_dir, $nlink) = @_;
637 my ($CdLvl,$Level) = (0,0);
638 my @Stack;
639 my @filenames;
640 my ($subcount,$sub_nlink);
641 my $SE= [];
642 my $dir_name= $p_dir;
7e47e6ff 643 my $dir_pref;
644 my $dir_rel;
645 my $tainted = 0;
646
647 if ($Is_MacOS) {
648 $dir_pref= ($p_dir =~ /:$/) ? $p_dir : "$p_dir:"; # preface
649 $dir_rel= ':'; # directory name relative to current directory
650 }
651 else {
652 $dir_pref= ( $p_dir eq '/' ? '/' : "$p_dir/" );
653 $dir_rel= '.'; # directory name relative to current directory
654 }
81793b90 655
656 local ($dir, $name, $prune, *DIR);
7e47e6ff 657
658 unless ( $no_chdir || ($p_dir eq $File::Find::current_dir)) {
81793b90 659 my $udir = $p_dir;
7e47e6ff 660 if (( $untaint ) && (is_tainted($p_dir) )) {
661 ( $udir ) = $p_dir =~ m|$untaint_pat|;
81793b90 662 unless (defined $udir) {
663 if ($untaint_skip == 0) {
664 die "directory $p_dir is still tainted";
665 }
666 else {
667 return;
668 }
237437d0 669 }
a0d0e21e 670 }
81793b90 671 unless (chdir $udir) {
3fa6e24b 672 warn "Can't cd to $udir: $!\n" if $^W;
81793b90 673 return;
674 }
675 }
7e47e6ff 676
677 # push the starting directory
57e73c4b 678 push @Stack,[$CdLvl,$p_dir,$dir_rel,-1] if $bydepth;
81793b90 679
7e47e6ff 680 if ($Is_MacOS) {
681 $p_dir = $dir_pref; # ensure trailing ':'
682 }
683
81793b90 684 while (defined $SE) {
685 unless ($bydepth) {
7e47e6ff 686 $dir= $p_dir; # $File::Find::dir
687 $name= $dir_name; # $File::Find::name
688 $_= ($no_chdir ? $dir_name : $dir_rel ); # $_
81793b90 689 # prune may happen here
7e47e6ff 690 $prune= 0;
691 { &$wanted_callback }; # protect against wild "next"
692 next if $prune;
81793b90 693 }
7e47e6ff 694
81793b90 695 # change to that directory
7e47e6ff 696 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
81793b90 697 my $udir= $dir_rel;
7e47e6ff 698 if ( ($untaint) && (($tainted) || ($tainted = is_tainted($dir_rel) )) ) {
699 ( $udir ) = $dir_rel =~ m|$untaint_pat|;
81793b90 700 unless (defined $udir) {
701 if ($untaint_skip == 0) {
7e47e6ff 702 if ($Is_MacOS) {
703 die "directory ($p_dir) $dir_rel is still tainted";
704 }
705 else {
706 die "directory (" . ($p_dir ne '/' ? $p_dir : '') . "/) $dir_rel is still tainted";
707 }
708 } else { # $untaint_skip == 1
709 next;
81793b90 710 }
711 }
712 }
713 unless (chdir $udir) {
7e47e6ff 714 if ($Is_MacOS) {
3fa6e24b 715 warn "Can't cd to ($p_dir) $udir: $!\n" if $^W;
7e47e6ff 716 }
717 else {
3fa6e24b 718 warn "Can't cd to (" . ($p_dir ne '/' ? $p_dir : '') . "/) $udir: $!\n" if $^W;
7e47e6ff 719 }
81793b90 720 next;
721 }
722 $CdLvl++;
723 }
724
7e47e6ff 725 if ($Is_MacOS) {
726 $dir_name = "$dir_name:" unless ($dir_name =~ /:$/);
727 }
728
729 $dir= $dir_name; # $File::Find::dir
81793b90 730
731 # Get the list of files in the current directory.
7e47e6ff 732 unless (opendir DIR, ($no_chdir ? $dir_name : $File::Find::current_dir)) {
3fa6e24b 733 warn "Can't opendir($dir_name): $!\n" if $^W;
81793b90 734 next;
735 }
736 @filenames = readdir DIR;
737 closedir(DIR);
719c805e 738 @filenames = &$pre_process(@filenames) if $pre_process;
739 push @Stack,[$CdLvl,$dir_name,"",-2] if $post_process;
81793b90 740
741 if ($nlink == 2 && !$avoid_nlink) {
742 # This dir has no subdirectories.
743 for my $FN (@filenames) {
7e47e6ff 744 next if $FN =~ $File::Find::skip_pattern;
81793b90 745
7e47e6ff 746 $name = $dir_pref . $FN; # $File::Find::name
747 $_ = ($no_chdir ? $name : $FN); # $_
73396e07 748 { &$wanted_callback }; # protect against wild "next"
81793b90 749 }
750
751 }
752 else {
753 # This dir has subdirectories.
754 $subcount = $nlink - 2;
755
756 for my $FN (@filenames) {
7e47e6ff 757 next if $FN =~ $File::Find::skip_pattern;
81793b90 758 if ($subcount > 0 || $avoid_nlink) {
759 # Seen all the subdirs?
760 # check for directoriness.
761 # stat is faster for a file in the current directory
07867069 762 $sub_nlink = (lstat ($no_chdir ? $dir_pref . $FN : $FN))[3];
81793b90 763
764 if (-d _) {
765 --$subcount;
c7b9dd21 766 $FN =~ s/\.dir\z// if $Is_VMS;
81793b90 767 push @Stack,[$CdLvl,$dir_name,$FN,$sub_nlink];
768 }
769 else {
7e47e6ff 770 $name = $dir_pref . $FN; # $File::Find::name
771 $_= ($no_chdir ? $name : $FN); # $_
73396e07 772 { &$wanted_callback }; # protect against wild "next"
81793b90 773 }
774 }
07867069 775 else {
7e47e6ff 776 $name = $dir_pref . $FN; # $File::Find::name
777 $_= ($no_chdir ? $name : $FN); # $_
73396e07 778 { &$wanted_callback }; # protect against wild "next"
81793b90 779 }
780 }
781 }
17b275ff 782 }
783 continue {
57e73c4b 784 while ( defined ($SE = pop @Stack) ) {
81793b90 785 ($Level, $p_dir, $dir_rel, $nlink) = @$SE;
786 if ($CdLvl > $Level && !$no_chdir) {
7e47e6ff 787 my $tmp;
788 if ($Is_MacOS) {
789 $tmp = (':' x ($CdLvl-$Level)) . ':';
790 }
791 else {
792 $tmp = join('/',('..') x ($CdLvl-$Level));
793 }
794 die "Can't cd to $dir_name" . $tmp
795 unless chdir ($tmp);
81793b90 796 $CdLvl = $Level;
797 }
7e47e6ff 798
799 if ($Is_MacOS) {
800 # $pdir always has a trailing ':', except for the starting dir,
801 # where $dir_rel eq ':'
802 $dir_name = "$p_dir$dir_rel";
803 $dir_pref = "$dir_name:";
804 }
805 else {
806 $dir_name = ($p_dir eq '/' ? "/$dir_rel" : "$p_dir/$dir_rel");
807 $dir_pref = "$dir_name/";
808 }
809
719c805e 810 if ( $nlink == -2 ) {
7e47e6ff 811 $name = $dir = $p_dir; # $File::Find::name / dir
812 if ($Is_MacOS) {
813 $_ = ':'; # $_
814 }
815 else {
816 $_ = '.';
817 }
719c805e 818 &$post_process; # End-of-directory processing
7e47e6ff 819 }
820 elsif ( $nlink < 0 ) { # must be finddepth, report dirname now
821 $name = $dir_name;
822 if ($Is_MacOS) {
823 if ($dir_rel eq ':') { # must be the top dir, where we started
824 $name =~ s|:$||; # $File::Find::name
825 $p_dir = "$p_dir:" unless ($p_dir =~ /:$/);
826 }
827 $dir = $p_dir; # $File::Find::dir
828 $_ = ($no_chdir ? $name : $dir_rel); # $_
829 }
830 else {
831 if ( substr($name,-2) eq '/.' ) {
832 $name =~ s|/\.$||;
833 }
834 $dir = $p_dir;
835 $_ = ($no_chdir ? $dir_name : $dir_rel );
836 if ( substr($_,-2) eq '/.' ) {
837 s|/\.$||;
838 }
839 }
840 { &$wanted_callback }; # protect against wild "next"
841 }
842 else {
843 push @Stack,[$CdLvl,$p_dir,$dir_rel,-1] if $bydepth;
844 last;
845 }
81793b90 846 }
a0d0e21e 847 }
848}
849
81793b90 850
851# API:
852# $wanted
853# $dir_loc : absolute location of a dir
854# $p_dir : "parent directory"
855# preconditions:
856# chdir (if not no_chdir) to dir
857
858sub _find_dir_symlnk($$$) {
7e47e6ff 859 my ($wanted, $dir_loc, $p_dir) = @_; # $dir_loc is the absolute directory
81793b90 860 my @Stack;
861 my @filenames;
862 my $new_loc;
7e47e6ff 863 my $updir_loc = $dir_loc; # untainted parent directory
81793b90 864 my $SE = [];
865 my $dir_name = $p_dir;
7e47e6ff 866 my $dir_pref;
867 my $loc_pref;
868 my $dir_rel;
869 my $byd_flag; # flag for pending stack entry if $bydepth
870 my $tainted = 0;
871 my $ok = 1;
872
873 if ($Is_MacOS) {
874 $dir_pref = ($p_dir =~ /:$/) ? "$p_dir" : "$p_dir:";
875 $loc_pref = ($dir_loc =~ /:$/) ? "$dir_loc" : "$dir_loc:";
876 $dir_rel = ':'; # directory name relative to current directory
877 } else {
878 $dir_pref = ( $p_dir eq '/' ? '/' : "$p_dir/" );
879 $loc_pref = ( $dir_loc eq '/' ? '/' : "$dir_loc/" );
880 $dir_rel = '.'; # directory name relative to current directory
881 }
81793b90 882
883 local ($dir, $name, $fullname, $prune, *DIR);
7e47e6ff 884
885 unless ($no_chdir) {
886 # untaint the topdir
887 if (( $untaint ) && (is_tainted($dir_loc) )) {
888 ( $updir_loc ) = $dir_loc =~ m|$untaint_pat|; # parent dir, now untainted
889 # once untainted, $updir_loc is pushed on the stack (as parent directory);
890 # hence, we don't need to untaint the parent directory every time we chdir
891 # to it later
892 unless (defined $updir_loc) {
81793b90 893 if ($untaint_skip == 0) {
894 die "directory $dir_loc is still tainted";
895 }
896 else {
897 return;
898 }
899 }
900 }
7e47e6ff 901 $ok = chdir($updir_loc) unless ($p_dir eq $File::Find::current_dir);
902 unless ($ok) {
3fa6e24b 903 warn "Can't cd to $updir_loc: $!\n" if $^W;
81793b90 904 return;
905 }
906 }
907
7e47e6ff 908 push @Stack,[$dir_loc,$updir_loc,$p_dir,$dir_rel,-1] if $bydepth;
909
910 if ($Is_MacOS) {
911 $p_dir = $dir_pref; # ensure trailing ':'
912 }
57e73c4b 913
81793b90 914 while (defined $SE) {
915
916 unless ($bydepth) {
7e47e6ff 917 # change (back) to parent directory (always untainted)
704ea872 918 unless ($no_chdir) {
7e47e6ff 919 unless (chdir $updir_loc) {
3fa6e24b 920 warn "Can't cd to $updir_loc: $!\n" if $^W;
704ea872 921 next;
922 }
923 }
7e47e6ff 924 $dir= $p_dir; # $File::Find::dir
925 $name= $dir_name; # $File::Find::name
926 $_= ($no_chdir ? $dir_name : $dir_rel ); # $_
927 $fullname= $dir_loc; # $File::Find::fullname
81793b90 928 # prune may happen here
7e47e6ff 929 $prune= 0;
704ea872 930 lstat($_); # make sure file tests with '_' work
7e47e6ff 931 { &$wanted_callback }; # protect against wild "next"
932 next if $prune;
81793b90 933 }
934
935 # change to that directory
7e47e6ff 936 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
937 $updir_loc = $dir_loc;
938 if ( ($untaint) && (($tainted) || ($tainted = is_tainted($dir_loc) )) ) {
939 # untaint $dir_loc, what will be pushed on the stack as (untainted) parent dir
940 ( $updir_loc ) = $dir_loc =~ m|$untaint_pat|;
941 unless (defined $updir_loc) {
81793b90 942 if ($untaint_skip == 0) {
943 die "directory $dir_loc is still tainted";
a0d0e21e 944 }
237437d0 945 else {
81793b90 946 next;
237437d0 947 }
a0d0e21e 948 }
949 }
7e47e6ff 950 unless (chdir $updir_loc) {
3fa6e24b 951 warn "Can't cd to $updir_loc: $!\n" if $^W;
81793b90 952 next;
953 }
954 }
955
7e47e6ff 956 if ($Is_MacOS) {
957 $dir_name = "$dir_name:" unless ($dir_name =~ /:$/);
958 }
959
960 $dir = $dir_name; # $File::Find::dir
81793b90 961
962 # Get the list of files in the current directory.
7e47e6ff 963 unless (opendir DIR, ($no_chdir ? $dir_loc : $File::Find::current_dir)) {
3fa6e24b 964 warn "Can't opendir($dir_loc): $!\n" if $^W;
81793b90 965 next;
966 }
967 @filenames = readdir DIR;
968 closedir(DIR);
969
970 for my $FN (@filenames) {
7e47e6ff 971 next if $FN =~ $File::Find::skip_pattern;
81793b90 972
973 # follow symbolic links / do an lstat
07867069 974 $new_loc = Follow_SymLink($loc_pref.$FN);
81793b90 975
976 # ignore if invalid symlink
977 next unless defined $new_loc;
7e47e6ff 978
81793b90 979 if (-d _) {
7e47e6ff 980 push @Stack,[$new_loc,$updir_loc,$dir_name,$FN,1];
81793b90 981 }
982 else {
7e47e6ff 983 $fullname = $new_loc; # $File::Find::fullname
984 $name = $dir_pref . $FN; # $File::Find::name
985 $_ = ($no_chdir ? $name : $FN); # $_
73396e07 986 { &$wanted_callback }; # protect against wild "next"
81793b90 987 }
988 }
989
81793b90 990 }
991 continue {
57e73c4b 992 while (defined($SE = pop @Stack)) {
7e47e6ff 993 ($dir_loc, $updir_loc, $p_dir, $dir_rel, $byd_flag) = @$SE;
994 if ($Is_MacOS) {
995 # $p_dir always has a trailing ':', except for the starting dir,
996 # where $dir_rel eq ':'
997 $dir_name = "$p_dir$dir_rel";
998 $dir_pref = "$dir_name:";
999 $loc_pref = ($dir_loc =~ /:$/) ? $dir_loc : "$dir_loc:";
1000 }
1001 else {
1002 $dir_name = ($p_dir eq '/' ? "/$dir_rel" : "$p_dir/$dir_rel");
1003 $dir_pref = "$dir_name/";
1004 $loc_pref = "$dir_loc/";
1005 }
1006 if ( $byd_flag < 0 ) { # must be finddepth, report dirname now
1007 unless ($no_chdir || ($dir_rel eq $File::Find::current_dir)) {
1008 unless (chdir $updir_loc) { # $updir_loc (parent dir) is always untainted
3fa6e24b 1009 warn "Can't cd to $updir_loc: $!\n" if $^W;
7e47e6ff 1010 next;
1011 }
1012 }
1013 $fullname = $dir_loc; # $File::Find::fullname
1014 $name = $dir_name; # $File::Find::name
1015 if ($Is_MacOS) {
1016 if ($dir_rel eq ':') { # must be the top dir, where we started
1017 $name =~ s|:$||; # $File::Find::name
1018 $p_dir = "$p_dir:" unless ($p_dir =~ /:$/);
1019 }
1020 $dir = $p_dir; # $File::Find::dir
1021 $_ = ($no_chdir ? $name : $dir_rel); # $_
1022 }
1023 else {
1024 if ( substr($name,-2) eq '/.' ) {
1025 $name =~ s|/\.$||; # $File::Find::name
1026 }
1027 $dir = $p_dir; # $File::Find::dir
1028 $_ = ($no_chdir ? $dir_name : $dir_rel); # $_
1029 if ( substr($_,-2) eq '/.' ) {
1030 s|/\.$||;
1031 }
1032 }
1033
1034 lstat($_); # make sure file tests with '_' work
1035 { &$wanted_callback }; # protect against wild "next"
1036 }
1037 else {
1038 push @Stack,[$dir_loc, $updir_loc, $p_dir, $dir_rel,-1] if $bydepth;
1039 last;
1040 }
a0d0e21e 1041 }
1042 }
1043}
1044
81793b90 1045
20408e3c 1046sub wrap_wanted {
81793b90 1047 my $wanted = shift;
1048 if ( ref($wanted) eq 'HASH' ) {
1049 if ( $wanted->{follow} || $wanted->{follow_fast}) {
1050 $wanted->{follow_skip} = 1 unless defined $wanted->{follow_skip};
1051 }
1052 if ( $wanted->{untaint} ) {
7e47e6ff 1053 $wanted->{untaint_pattern} = $File::Find::untaint_pattern
81793b90 1054 unless defined $wanted->{untaint_pattern};
1055 $wanted->{untaint_skip} = 0 unless defined $wanted->{untaint_skip};
1056 }
1057 return $wanted;
1058 }
1059 else {
1060 return { wanted => $wanted };
1061 }
a0d0e21e 1062}
1063
20408e3c 1064sub find {
81793b90 1065 my $wanted = shift;
1066 _find_opt(wrap_wanted($wanted), @_);
a0d0e21e 1067}
1068
55d729e4 1069sub finddepth {
81793b90 1070 my $wanted = wrap_wanted(shift);
1071 $wanted->{bydepth} = 1;
1072 _find_opt($wanted, @_);
20408e3c 1073}
6280b799 1074
7e47e6ff 1075# default
1076$File::Find::skip_pattern = qr/^\.{1,2}\z/;
1077$File::Find::untaint_pattern = qr|^([-+@\w./]+)$|;
1078
6280b799 1079# These are hard-coded for now, but may move to hint files.
10eba763 1080if ($^O eq 'VMS') {
81793b90 1081 $Is_VMS = 1;
7e47e6ff 1082 $File::Find::dont_use_nlink = 1;
1083}
1084elsif ($^O eq 'MacOS') {
1085 $Is_MacOS = 1;
1086 $File::Find::dont_use_nlink = 1;
1087 $File::Find::skip_pattern = qr/^Icon\015\z/;
1088 $File::Find::untaint_pattern = qr|^(.+)$|;
748a9306 1089}
1090
7e47e6ff 1091# this _should_ work properly on all platforms
1092# where File::Find can be expected to work
1093$File::Find::current_dir = File::Spec->curdir || '.';
1094
81793b90 1095$File::Find::dont_use_nlink = 1
497711e7 1096 if $^O eq 'os2' || $^O eq 'dos' || $^O eq 'amigaos' || $^O eq 'MSWin32' ||
2986a63f 1097 $^O eq 'cygwin' || $^O eq 'epoc' || $^O eq 'NetWare';
6280b799 1098
20408e3c 1099# Set dont_use_nlink in your hint file if your system's stat doesn't
1100# report the number of links in a directory as an indication
1101# of the number of files.
1102# See, e.g. hints/machten.sh for MachTen 2.2.
81793b90 1103unless ($File::Find::dont_use_nlink) {
1104 require Config;
1105 $File::Find::dont_use_nlink = 1 if ($Config::Config{'dont_use_nlink'});
20408e3c 1106}
1107
7e47e6ff 1108# We need a function that checks if a scalar is tainted. Either use the
1109# Scalar::Util module's tainted() function or our (slower) pure Perl
1110# fallback is_tainted_pp()
1111{
1112 local $@;
1113 eval { require Scalar::Util };
1114 *is_tainted = $@ ? \&is_tainted_pp : \&Scalar::Util::tainted;
1115}
1116
a0d0e21e 11171;