From: Jarkko Hietaniemi Date: Thu, 26 Jun 2003 05:32:02 +0000 (+0000) Subject: Bite the bullet and apply the hash randomisation patch. X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=504f80c1f3625809f472c1ce21089fdae860d9fd;p=p5sagit%2Fp5-mst-13.2.git Bite the bullet and apply the hash randomisation patch. [perl #22371] Algorimic Complexity Attack on Perl 5.6.1, 5.8.0 p4raw-id: //depot/perl@19854 --- diff --git a/INSTALL b/INSTALL index cb73d21..1c494d2 100644 --- a/INSTALL +++ b/INSTALL @@ -836,6 +836,36 @@ Configure should detect this problem and warn you about problems with _exit vs. exit. If you have this problem, the fix is to go back to your sfio sources and correct iffe's guess about atexit. +=head2 Algorithmic Complexity Attacks on Hashes + +In Perls 5.8.0 and earlier it was easy to create degenerate hashes. +Processing such hashes would consume large amounts of CPU time, +causing a "Denial of Service" attack against Perl. Such hashes may be +a problem for example for mod_perl sites, sites with Perl CGI scripts +and web services, that process data originating from external sources. + +In Perl 5.8.1 a security feature was introduced to make it harder +to create such degenerate hashes. + +Because of this feature the keys(), values(), and each() functions +will return the hash elements in different order between different +runs of Perl even with the same data. One can still revert to the old +predictable order by setting the environment variable PERL_HASH_SEED, +see L. Another option is to add -DUSE_HASH_SEED_EXPLICIT to +the compilation flags, in which case one has to explicitly set the +PERL_HASH_SEED environment variable to enable the security feature, +or -DNO_HASH_SEED to completely disable the feature. + +B, and the +ordering has already changed several times during the lifetime of +Perl 5. Also, the ordering of hash keys already (in Perl 5.8.0 and +earlier) depends on the insertion order. 
+ +Note that because of this randomisation, for example, the Data::Dumper +results will be different between different runs of Perl since +Data::Dumper by default dumps hashes "unordered". The use of the +Data::Dumper C<Sortkeys> filter is recommended. + =head2 SOCKS Perl can be configured to be 'socksified', that is, to use the SOCKS diff --git a/embedvar.h b/embedvar.h index a1b5720..dcb980a 100644 --- a/embedvar.h +++ b/embedvar.h @@ -254,6 +254,7 @@ #define PL_gid (vTHX->Igid) #define PL_glob_index (vTHX->Iglob_index) #define PL_globalstash (vTHX->Iglobalstash) +#define PL_hash_seed (vTHX->Ihash_seed) #define PL_he_arenaroot (vTHX->Ihe_arenaroot) #define PL_he_root (vTHX->Ihe_root) #define PL_hintgv (vTHX->Ihintgv) @@ -556,6 +557,7 @@ #define PL_Igid PL_gid #define PL_Iglob_index PL_glob_index #define PL_Iglobalstash PL_globalstash +#define PL_Ihash_seed PL_hash_seed #define PL_Ihe_arenaroot PL_he_arenaroot #define PL_Ihe_root PL_he_root #define PL_Ihintgv PL_hintgv diff --git a/ext/Data/Dumper/Dumper.pm b/ext/Data/Dumper/Dumper.pm index f51b243..c00b218 100644 --- a/ext/Data/Dumper/Dumper.pm +++ b/ext/Data/Dumper/Dumper.pm @@ -1193,6 +1193,17 @@ XSUB implementation does not support them. SCALAR objects have the weirdest looking C workaround. +=head2 NOTE + +Starting from Perl 5.8.1 different runs of Perl will have different +ordering of hash keys. The change was done for greater security, +see L. This means that +different runs of Perl will have different Data::Dumper outputs if +the data contains hashes. If you need to have identical Data::Dumper +outputs from different runs of Perl, use the environment variable +PERL_HASH_SEED, see L. Using this restores +the old (platform-specific) ordering: an even prettier solution might +be to use the C<Sortkeys> filter of Data::Dumper. =head1 AUTHOR @@ -1202,7 +1213,6 @@ Copyright (c) 1996-98 Gurusamy Sarathy. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
- =head1 VERSION Version 2.12 (unreleased) diff --git a/ext/Data/Dumper/t/dumper.t b/ext/Data/Dumper/t/dumper.t index e1de62d..7663439 100755 --- a/ext/Data/Dumper/t/dumper.t +++ b/ext/Data/Dumper/t/dumper.t @@ -13,6 +13,9 @@ BEGIN { } } +# Since Perl 5.8.1 because otherwise hash ordering is really random. +local $Data::Dumper::Sortkeys = 1; + use Data::Dumper; use Config; my $Is_ebcdic = defined($Config{'ebcdic'}) && $Config{'ebcdic'} eq 'define'; @@ -94,11 +97,11 @@ $WANT = <<'EOT'; #$a = [ # 1, # { +# 'a' => $a, +# 'b' => $a->[1], # 'c' => [ # 'c' -# ], -# 'a' => $a, -# 'b' => $a->[1] +# ] # }, # $a->[1]{'c'} # ]; @@ -116,11 +119,11 @@ $WANT = <<'EOT'; #@a = ( # 1, # { +# 'a' => [], +# 'b' => {}, # 'c' => [ # 'c' -# ], -# 'a' => [], -# 'b' => {} +# ] # }, # [] # ); @@ -138,19 +141,19 @@ TEST q(Data::Dumper->Dumpxs([$a, $b], [qw(*a b)])) if $XS; ## $WANT = <<'EOT'; #%b = ( -# 'c' => [ -# 'c' -# ], # 'a' => [ # 1, # {}, -# [] +# [ +# 'c' +# ] # ], -# 'b' => {} +# 'b' => {}, +# 'c' => [] # ); #$b{'a'}[1] = \%b; -#$b{'a'}[2] = $b{'c'}; #$b{'b'} = \%b; +#$b{'c'} = $b{'a'}[2]; #$a = $b{'a'}; EOT @@ -163,15 +166,15 @@ $WANT = <<'EOT'; #$a = [ # 1, # { -# 'c' => [], # 'a' => [], -# 'b' => {} +# 'b' => {}, +# 'c' => [] # }, # [] #]; -#$a->[1]{'c'} = \@c; #$a->[1]{'a'} = $a; #$a->[1]{'b'} = $a->[1]; +#$a->[1]{'c'} = \@c; #$a->[2] = \@c; #$b = $a->[1]; EOT @@ -199,12 +202,12 @@ $WANT = <<'EOT'; # 1, # #1 # { +# a => $a, +# b => $a->[1], # c => [ # #0 # 'c' -# ], -# a => $a, -# b => $a->[1] +# ] # }, # #2 # $a->[1]{c} @@ -224,11 +227,11 @@ $WANT = <<'EOT'; #$VAR1 = [ # 1, # { +# 'a' => [], +# 'b' => {}, # 'c' => [ # 'c' -# ], -# 'a' => [], -# 'b' => {} +# ] # }, # [] #]; @@ -246,11 +249,11 @@ $WANT = <<'EOT'; #[ # 1, # { +# a => $VAR1, +# b => $VAR1->[1], # c => [ # 'c' -# ], -# a => $VAR1, -# b => $VAR1->[1] +# ] # }, # $VAR1->[1]{c} #] @@ -269,8 +272,8 @@ EOT ## $WANT = <<'EOT'; #$VAR1 = { -# "reftest" => \\1, -# "abc\0'\efg" => "mno\0" +# "abc\0'\efg" => "mno\0", +# 
"reftest" => \\1 #}; EOT @@ -284,8 +287,8 @@ $foo = { "abc\000\'\efg" => "mno\000", $WANT = <<"EOT"; #\$VAR1 = { -# 'reftest' => \\\\1, -# 'abc\0\\'\efg' => 'mno\0' +# 'abc\0\\'\efg' => 'mno\0', +# 'reftest' => \\\\1 #}; EOT @@ -320,15 +323,15 @@ EOT # do{my $o}, # #2 # { -# 'c' => [], # 'a' => 1, # 'b' => do{my $o}, +# 'c' => [], # 'd' => {} # } # ]; #*::foo{ARRAY}->[1] = $foo; -#*::foo{ARRAY}->[2]{'c'} = *::foo{ARRAY}; #*::foo{ARRAY}->[2]{'b'} = *::foo{SCALAR}; +#*::foo{ARRAY}->[2]{'c'} = *::foo{ARRAY}; #*::foo{ARRAY}->[2]{'d'} = *::foo{ARRAY}->[2]; #*::foo = *::foo{ARRAY}->[2]; #@bar = @{*::foo{ARRAY}}; @@ -349,15 +352,15 @@ EOT # -10, # do{my $o}, # { -# 'c' => [], # 'a' => 1, # 'b' => do{my $o}, +# 'c' => [], # 'd' => {} # } #]; #*::foo{ARRAY}->[1] = $foo; -#*::foo{ARRAY}->[2]{'c'} = *::foo{ARRAY}; #*::foo{ARRAY}->[2]{'b'} = *::foo{SCALAR}; +#*::foo{ARRAY}->[2]{'c'} = *::foo{ARRAY}; #*::foo{ARRAY}->[2]{'d'} = *::foo{ARRAY}->[2]; #*::foo = *::foo{ARRAY}->[2]; #$bar = *::foo{ARRAY}; @@ -379,13 +382,13 @@ EOT #*::foo = \5; #*::foo = \@bar; #*::foo = { -# 'c' => [], # 'a' => 1, # 'b' => do{my $o}, +# 'c' => [], # 'd' => {} #}; -#*::foo{HASH}->{'c'} = \@bar; #*::foo{HASH}->{'b'} = *::foo{SCALAR}; +#*::foo{HASH}->{'c'} = \@bar; #*::foo{HASH}->{'d'} = *::foo{HASH}; #$bar[2] = *::foo{HASH}; #%baz = %{*::foo{HASH}}; @@ -406,13 +409,13 @@ EOT #*::foo = \5; #*::foo = $bar; #*::foo = { -# 'c' => [], # 'a' => 1, # 'b' => do{my $o}, +# 'c' => [], # 'd' => {} #}; -#*::foo{HASH}->{'c'} = $bar; #*::foo{HASH}->{'b'} = *::foo{SCALAR}; +#*::foo{HASH}->{'c'} = $bar; #*::foo{HASH}->{'d'} = *::foo{HASH}; #$bar->[2] = *::foo{HASH}; #$baz = *::foo{HASH}; @@ -430,9 +433,9 @@ EOT # -10, # $foo, # { -# c => \@bar, # a => 1, # b => \5, +# c => \@bar, # d => $bar[2] # } #); @@ -452,9 +455,9 @@ EOT # -10, # $foo, # { -# c => $bar, # a => 1, # b => \5, +# c => $bar, # d => $bar->[2] # } #]; @@ -483,8 +486,8 @@ EOT ## $WANT = <<'EOT'; #%kennels = ( -# Second => \'Wags', -# First => \'Fido' +# 
First => \'Fido', +# Second => \'Wags' #); #@dogs = ( # ${$kennels{First}}, @@ -522,8 +525,8 @@ EOT ## $WANT = <<'EOT'; #%kennels = ( -# Second => \'Wags', -# First => \'Fido' +# First => \'Fido', +# Second => \'Wags' #); #@dogs = ( # ${$kennels{First}}, @@ -546,8 +549,8 @@ EOT # 'Fido', # 'Wags', # { -# Second => \$dogs[1], -# First => \$dogs[0] +# First => \$dogs[0], +# Second => \$dogs[1] # } #); #%kennels = %{$dogs[2]}; @@ -581,13 +584,13 @@ EOT # 'Fido', # 'Wags', # { -# Second => \'Wags', -# First => \'Fido' +# First => \'Fido', +# Second => \'Wags' # } #); #%kennels = ( -# Second => \'Wags', -# First => \'Fido' +# First => \'Fido', +# Second => \'Wags' #); EOT @@ -833,7 +836,6 @@ EOT { $i = 0; $a = { map { ("$_$_$_", ++$i) } 'I'..'Q' }; - local $Data::Dumper::Sortkeys = 1; ############# 193 ## diff --git a/hv.h b/hv.h index 6a51ca4..c43fc57 100644 --- a/hv.h +++ b/hv.h @@ -56,13 +56,20 @@ struct xpvhv { * (a) the hashed data being interpreted as "unsigned char" (new since 5.8, * a "char" can be either signed or signed, depending on the compiler) * (b) catering for old code that uses a "char" + * The "hash seed" feature was added in Perl 5.8.1 to perturb the results + * to avoid "algorithmic complexity attacks". 
*/ +#if defined(USE_HASH_SEED) || defined(USE_HASH_SEED_EXPLICIT) +# define PERL_HASH_SEED PL_hash_seed +#else +# define PERL_HASH_SEED 0 +#endif #define PERL_HASH(hash,str,len) \ STMT_START { \ register const char *s_PeRlHaSh_tmp = str; \ register const unsigned char *s_PeRlHaSh = (const unsigned char *)s_PeRlHaSh_tmp; \ register I32 i_PeRlHaSh = len; \ - register U32 hash_PeRlHaSh = 0; \ + register U32 hash_PeRlHaSh = PERL_HASH_SEED; \ while (i_PeRlHaSh--) { \ hash_PeRlHaSh += *s_PeRlHaSh++; \ hash_PeRlHaSh += (hash_PeRlHaSh << 10); \ diff --git a/intrpvar.h b/intrpvar.h index 44d6296..6d77cec 100644 --- a/intrpvar.h +++ b/intrpvar.h @@ -523,6 +523,8 @@ PERLVARI(Irunops_dbg, runops_proc_t, MEMBER_TO_FPTR(Perl_runops_debug)) PERLVARI(Ippid, IV, 0) #endif +PERLVARI(Ihash_seed, UV, 0) /* Hash initializer */ + PERLVAR(IDBassertion, SV *) PERLVARI(Icv_has_eval, I32, 0) /* PL_compcv includes an entereval or similar */ diff --git a/perl.c b/perl.c index f85b010..6b59701 100644 --- a/perl.c +++ b/perl.c @@ -275,6 +275,33 @@ perl_construct(pTHXx) PL_stashcache = newHV(); +#if defined(USE_HASH_SEED) || defined(USE_HASH_SEED_EXPLICIT) + /* [perl #22371] Algorimic Complexity Attack on Perl 5.6.1, 5.8.0 */ + { + char *s = PerlEnv_getenv("PERL_HASH_SEED"); + if (s) + while (isSPACE(*s)) s++; + if (s && isDIGIT(*s)) + PL_hash_seed = (UV)atoi(s); +#ifndef USE_HASH_SEED_EXPLICIT + else { + /* Compute a random seed */ + (void)seedDrand01((Rand_seed_t)seed()); + PL_srand_called = TRUE; + PL_hash_seed = (UV)(Drand01() * (NV)UV_MAX); +#if RANDBITS < (UVSIZE * 8) + { + int skip = (UVSIZE * 8) - RANDBITS; + PL_hash_seed >>= skip; + /* The low bits might need extra help. 
*/ + PL_hash_seed += (UV)(Drand01() * ((1 << skip) - 1)); + } +#endif /* RANDBITS < (UVSIZE * 8) */ + } +#endif /* USE_HASH_SEED_EXPLICIT */ + } +#endif /* #if defined(USE_HASH_SEED) || defined(USE_HASH_SEED_EXPLICIT) */ + ENTER; } diff --git a/perl.h b/perl.h index 9dbc248..61fab6c 100644 --- a/perl.h +++ b/perl.h @@ -2250,6 +2250,12 @@ typedef struct crypt_data { /* straight from /usr/include/crypt.h */ #if !defined(OS2) && !defined(MACOS_TRADITIONAL) # include "iperlsys.h" #endif + +/* [perl #22371] Algorimic Complexity Attack on Perl 5.6.1, 5.8.0 */ +#if !defined(NO_HASH_SEED) && !defined(USE_HASH_SEED) && !defined(USE_HASH_SEED_EXPLICIT) +# define USE_HASH_SEED +#endif + #include "regexp.h" #include "sv.h" #include "util.h" diff --git a/perlapi.h b/perlapi.h index e18dfbb..0f56a0a 100644 --- a/perlapi.h +++ b/perlapi.h @@ -266,6 +266,8 @@ END_EXTERN_C #define PL_glob_index (*Perl_Iglob_index_ptr(aTHX)) #undef PL_globalstash #define PL_globalstash (*Perl_Iglobalstash_ptr(aTHX)) +#undef PL_hash_seed +#define PL_hash_seed (*Perl_Ihash_seed_ptr(aTHX)) #undef PL_he_arenaroot #define PL_he_arenaroot (*Perl_Ihe_arenaroot_ptr(aTHX)) #undef PL_he_root diff --git a/pod/perlfunc.pod b/pod/perlfunc.pod index a36cda0..1000fc9 100644 --- a/pod/perlfunc.pod +++ b/pod/perlfunc.pod @@ -1267,9 +1267,12 @@ it. When called in scalar context, returns only the key for the next element in the hash. Entries are returned in an apparently random order. The actual random -order is subject to change in future versions of perl, but it is guaranteed -to be in the same order as either the C or C function -would produce on the same (unmodified) hash. +order is subject to change in future versions of perl, but it is +guaranteed to be in the same order as either the C or C +function would produce on the same (unmodified) hash. 
Since Perl +5.8.1 the ordering is different even between different runs of Perl +because of security reasons (see L) value), and C in @@ -2311,13 +2314,19 @@ first argument. Compare L. =item keys HASH -Returns a list consisting of all the keys of the named hash. (In -scalar context, returns the number of keys.) The keys are returned in -an apparently random order. The actual random order is subject to -change in future versions of perl, but it is guaranteed to be the same -order as either the C or C function produces (given -that the hash has not been modified). As a side effect, it resets -HASH's iterator. +Returns a list consisting of all the keys of the named hash. +(In scalar context, returns the number of keys.) + +The keys are returned in an apparently random order. The actual +random order is subject to change in future versions of perl, but it +is guaranteed to be the same order as either the C or C +function produces (given that the hash has not been modified). +Since Perl 5.8.1 the ordering is different even between different +runs of Perl because of security reasons (see L. Here is yet another way to print your environment: @@ -6205,12 +6214,19 @@ above.) =item values HASH -Returns a list consisting of all the values of the named hash. (In a -scalar context, returns the number of values.) The values are -returned in an apparently random order. The actual random order is -subject to change in future versions of perl, but it is guaranteed to -be the same order as either the C or C function would -produce on the same (unmodified) hash. +Returns a list consisting of all the values of the named hash. +(In a scalar context, returns the number of values.) + +The values are returned in an apparently random order. The actual +random order is subject to change in future versions of perl, but it +is guaranteed to be the same order as either the C or C +function would produce on the same (unmodified) hash. 
Since Perl +5.8.1 the ordering is different even between different runs of Perl +because of security reasons (see L. Note that the values are not copied, which means modifying them will modify the contents of the hash: @@ -6218,7 +6234,6 @@ modify the contents of the hash: for (values %hash) { s/foo/bar/g } # modifies %hash values for (@hash{keys %hash}) { s/foo/bar/g } # same -As a side effect, calling values() resets the HASH's internal iterator. See also C, C, and C. =item vec EXPR,OFFSET,BITS diff --git a/pod/perlrun.pod b/pod/perlrun.pod index c33c478..0a02df1 100644 --- a/pod/perlrun.pod +++ b/pod/perlrun.pod @@ -1106,6 +1106,26 @@ references. See L for more information. If using the C pragma without an explicit encoding name, the PERL_ENCODING environment variable is consulted for an encoding name. +=item PERL_HASH_SEED + +(Since Perl 5.8.1.) + +Used to randomise Perl's internal hash function. To emulate the +pre-5.8.1 behaviour, set to an integer (zero means exactly the same +order as 5.8.0). "Pre-5.8.1" means, among other things, that hash +keys will be ordered the same between different runs of Perl. + +The default behaviour is to randomise unless PERL_HASH_SEED is set. +If Perl has been compiled with -DUSE_HASH_SEED_EXPLICIT, the default +behaviour is B<not> to randomise unless PERL_HASH_SEED is set. + +If PERL_HASH_SEED is unset or set to a non-numeric string, Perl uses +the pseudorandom seed supplied by the operating system and libraries. +If unset, each different run of Perl will have different ordering of +the outputs of keys(), values(), and each(). + +See L for more information. + =item PERL_ROOT (specific to the VMS port) A translation concealed rooted logical name that contains perl and the diff --git a/pod/perlsec.pod b/pod/perlsec.pod index 1c2dbd2..92853dd 100644 --- a/pod/perlsec.pod +++ b/pod/perlsec.pod @@ -386,6 +386,62 @@ certain security pitfalls. See L for an overview and L for details, and L for security implications in particular.
+=head2 Algorithmic Complexity Attacks + +Certain internal algorithms used in the implementation of Perl can +be attacked by choosing the input carefully to consume large amounts +of either time or space or both. This can lead to so-called +I<Denial of Service> (DoS) attacks. + +=over 4 + +=item * + +Hash Function - the algorithm used to "order" hash elements has been +changed several times during the development of Perl, mainly to be +reasonably fast. In Perl 5.8.1 the security aspect was also taken +into account. + +In Perls before 5.8.1 one could rather easily generate data that as +hash keys would cause Perl to consume large amounts of time because +the internal structure of hashes would badly degenerate. In Perl 5.8.1 +the hash function is randomly perturbed by a pseudorandom seed which +makes generating such naughty hash keys harder. +See L for more information. + +The random perturbation is done by default but if one wants for some +reason to emulate the old behaviour one can set the environment variable +PERL_HASH_SEED to zero (or any other integer). One possible reason +for wanting to emulate the old behaviour is that in the new behaviour +consecutive runs of Perl will order hash keys differently, which may +confuse some applications (like Data::Dumper: the outputs of two +different runs are no longer identical). + +=item * + +Regular expressions - Perl's regular expression engine is a so-called +NFA (Nondeterministic Finite Automaton), which among other things means that it can +rather easily consume large amounts of both time and space if the +regular expression may match in several ways. Careful crafting of the +regular expressions can help but quite often there really isn't much +one can do (the book "Mastering Regular Expressions" is required +reading, see L). Running out of space manifests itself by +Perl running out of memory.
+ +=item * + +Sorting - the quicksort algorithm used in Perls before 5.8.0 to +implement the sort() function is very easy to trick into misbehaving +so that it consumes a lot of time. Nothing more is required than +re-sorting a list already sorted. Starting from Perl 5.8.0 a different +sorting algorithm, mergesort, is used. Mergesort is insensitive to +its input data, so it cannot be similarly fooled. + +=back + +See L for more information, +and any computer science textbook on algorithmic complexity. + =head1 SEE ALSO L for its description of cleaning up environment variables. diff --git a/sv.c b/sv.c index f001497..b6d0920 100644 --- a/sv.c +++ b/sv.c @@ -11269,6 +11269,7 @@ perl_clone_using(PerlInterpreter *proto_perl, UV flags, PL_glob_index = proto_perl->Iglob_index; PL_srand_called = proto_perl->Isrand_called; + PL_hash_seed = proto_perl->Ihash_seed; PL_uudmap['M'] = 0; /* reinits on demand */ PL_bitcount = Nullch; /* reinits on demand */
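
Note on the hv.h hunk: the change there amounts to starting the hash accumulator from a seed instead of from zero. The following is a minimal standalone sketch of that idea, not code from the patch. The names seeded_hash and bucket_for are hypothetical, and the mixing steps beyond the visible "+= (hash << 10)" line are assumed to be those of Bob Jenkins' one-at-a-time hash, because the hunk as shown is cut off before them.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): one-at-a-time hash whose
 * accumulator starts at `seed` rather than 0, mirroring the
 * "hash_PeRlHaSh = PERL_HASH_SEED" change in the PERL_HASH macro.
 * The steps after "hash += hash << 10" are an assumption based on the
 * standard one-at-a-time algorithm.
 */
static uint32_t seeded_hash(uint32_t seed, const char *str, size_t len)
{
    /* Read the key as unsigned bytes, as the hv.h comment requires. */
    const unsigned char *s = (const unsigned char *)str;
    uint32_t hash = seed;

    while (len--) {
        hash += *s++;
        hash += hash << 10;
        hash ^= hash >> 6;
    }
    /* Final avalanche of the one-at-a-time hash. */
    hash += hash << 3;
    hash ^= hash >> 11;
    hash += hash << 15;
    return hash;
}

/*
 * Hypothetical helper: map a key to a bucket in a table whose size
 * `nbuckets` is assumed to be a power of two.
 */
static size_t bucket_for(uint32_t seed, const char *key, size_t len,
                         size_t nbuckets)
{
    return (size_t)(seeded_hash(seed, key, len) & (uint32_t)(nbuckets - 1));
}
```

With a seed of 0 the sketch gives the plain one-at-a-time value (0xca2e9442 for the key "a"), which corresponds to the NO_HASH_SEED case where PERL_HASH_SEED is defined as 0. With any other seed the accumulator starts from a different state, so bucket assignments precomputed for seed 0 need not hold.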