Commit | Line | Data |
a0d0e21e |
1 | =head1 NAME |
2 | |
3 | perlsec - Perl security |
4 | |
5 | =head1 DESCRIPTION |
6 | |
425e5e39 |
7 | Perl is designed to make it easy to program securely even when running |
8 | with extra privileges, like setuid or setgid programs. Unlike most |
9 | command-line shells, which are based on multiple substitution passes on |
10 | each line of the script, Perl uses a more conventional evaluation scheme |
11 | with fewer hidden snags. Additionally, because the language has more |
12 | built-in functionality, it can rely less upon external (and possibly |
13 | untrustworthy) programs to accomplish its purposes. |
a0d0e21e |
14 | |
425e5e39 |
15 | Perl automatically enables a set of special security checks, called I<taint |
16 | mode>, when it detects its program running with differing real and effective |
17 | user or group IDs. The setuid bit in Unix permissions is mode 04000, the |
18 | setgid bit mode 02000; either or both may be set. You can also enable taint |
5f05dabc |
19 | mode explicitly by using the B<-T> command line flag. This flag is |
425e5e39 |
20 | I<strongly> suggested for server programs and any program run on behalf of |
21 | someone else, such as a CGI script. |
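
A script can confirm at run time that taint mode is really on. The following is a minimal sketch rather than part of the original text: it assumes Perl 5.8 or later, where the read-only variable C<${^TAINT}> reports the mode, and C<taint_mode_enabled> is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when full taint checks are active
# (perl -T, or automatically for setuid/setgid scripts), else 0.
# ${^TAINT} is 1 under -T, -1 under -t (warnings only), and 0 otherwise.
sub taint_mode_enabled {
    return ${^TAINT} == 1 ? 1 : 0;
}
```

A server program might refuse to start unless C<taint_mode_enabled()> returns true.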
a0d0e21e |
22 | |
425e5e39 |
23 | While in this mode, Perl takes special precautions called I<taint |
24 | checks> to prevent both obvious and subtle traps. Some of these checks |
25 | are reasonably simple, such as verifying that path directories aren't |
26 | writable by others; careful programmers have always used checks like |
27 | these. Other checks, however, are best supported by the language itself, |
28 | and it is these checks especially that contribute to making a setuid Perl |
29 | program more secure than the corresponding C program. |
30 | |
31 | You may not use data derived from outside your program to affect something |
32 | else outside your program--at least, not by accident. All command-line |
a034a98d |
33 | arguments, environment variables, locale information (see L<perllocale>), |
34 | and file input are marked as "tainted". Tainted data may not be used |
35 | directly or indirectly in any command that invokes a sub-shell, nor in any |
36 | command that modifies files, directories, or processes. Any variable set |
37 | within an expression that has previously referenced a tainted value itself |
38 | becomes tainted, even if it is logically impossible for the tainted value |
39 | to influence the variable. Because taintedness is associated with each |
40 | scalar value, some elements of an array can be tainted and others not. |
a0d0e21e |
41 | |
a0d0e21e |
42 | For example: |
43 | |
425e5e39 |
44 | $arg = shift; # $arg is tainted |
45 | $hid = $arg . 'bar'; # $hid is also tainted |
46 | $line = <>; # Tainted |
a0d0e21e |
47 | $path = $ENV{'PATH'}; # Tainted, but see below |
425e5e39 |
48 | $data = 'abc'; # Not tainted |
a0d0e21e |
49 | |
425e5e39 |
50 | system "echo $arg"; # Insecure |
51 | system "/bin/echo", $arg; # Secure (doesn't use sh) |
52 | system "echo $hid"; # Insecure |
53 | system "echo $data"; # Insecure until PATH set |
a0d0e21e |
54 | |
425e5e39 |
55 | $path = $ENV{'PATH'}; # $path now tainted |
a0d0e21e |
56 | |
425e5e39 |
57 | $ENV{'PATH'} = '/bin:/usr/bin'; |
58 | $ENV{'IFS'} = '' if $ENV{'IFS'} ne ''; |
a0d0e21e |
59 | |
425e5e39 |
60 | $path = $ENV{'PATH'}; # $path now NOT tainted |
61 | system "echo $data"; # Is secure now! |
a0d0e21e |
62 | |
425e5e39 |
63 | open(FOO, "< $arg"); # OK - read-only file |
64 | open(FOO, "> $arg"); # Not OK - trying to write |
a0d0e21e |
65 | |
425e5e39 |
66 | open(FOO,"echo $arg|"); # Not OK, but... |
67 | open(FOO,"-|") |
68 | or exec 'echo', $arg; # OK |
a0d0e21e |
69 | |
425e5e39 |
70 | $shout = `echo $arg`; # Insecure, $shout now tainted |
a0d0e21e |
71 | |
425e5e39 |
72 | unlink $data, $arg; # Insecure |
73 | umask $arg; # Insecure |
a0d0e21e |
74 | |
425e5e39 |
75 | exec "echo $arg"; # Insecure |
76 | exec "echo", $arg; # Secure (doesn't use the shell) |
77 | exec "sh", '-c', $arg; # Considered secure, alas! |
a0d0e21e |
78 | |
79 | If you try to do something insecure, you will get a fatal error saying |
80 | something like "Insecure dependency" or "Insecure PATH". Note that you |
425e5e39 |
81 | can still write an insecure B<system> or B<exec>, but only by explicitly |
82 | doing something like the last example above. |
83 | |
84 | =head2 Laundering and Detecting Tainted Data |
85 | |
86 | To test whether a variable contains tainted data, and whether using it would |
87 | thus trigger an "Insecure dependency" message, you can use the following |
88 | I<is_tainted()> function. |
89 | |
90 | sub is_tainted { |
91 | return ! eval { |
92 | join('',@_), kill 0; |
93 | 1; |
94 | }; |
95 | } |
96 | |
97 | This function makes use of the fact that the presence of tainted data |
98 | anywhere within an expression renders the entire expression tainted. It |
99 | would be inefficient for every operator to test every argument for |
100 | taintedness. Instead, Perl takes the slightly more efficient and conservative |
101 | approach: if any tainted value has been accessed within the |
102 | same expression, the whole expression is considered tainted. |
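
Where the C<Scalar::Util> module is available (it is bundled with later Perl releases), its C<tainted> function answers the same question without the C<eval> trick. The sketch below is an illustration under that assumption; the variable names are hypothetical.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# A command-line argument is tainted when the script runs under -T.
my $arg     = @ARGV ? $ARGV[0] : '';
# A literal in the program text is never tainted.
my $literal = 'abc';

printf "arg tainted: %s\n",     tainted($arg)     ? 'yes' : 'no';
printf "literal tainted: %s\n", tainted($literal) ? 'yes' : 'no';
```

Without B<-T>, both lines report "no", because nothing is tainted when taint mode is off.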
103 | |
5f05dabc |
104 | But testing for taintedness gets you only so far. Sometimes you just have |
425e5e39 |
105 | to clear your data's taintedness. The only way to bypass the tainting |
5f05dabc |
106 | mechanism is by referencing sub-patterns from a regular expression match. |
425e5e39 |
107 | Perl presumes that if you reference a substring using $1, $2, etc., you |
108 | knew what you were doing when you wrote the pattern. That means using |
109 | a bit of thought--don't just blindly untaint anything, or you defeat the |
a034a98d |
110 | entire mechanism. It's better to verify that the variable has only good |
111 | characters (for certain values of "good") rather than checking whether it |
112 | has any bad characters. That's because it's far too easy to miss bad |
113 | characters that you never thought of. |
425e5e39 |
114 | |
115 | Here's a test to make sure that the data contains nothing but "word" |
116 | characters (alphabetics, numerics, and underscores), a hyphen, an at sign, |
117 | or a dot. |
118 | |
119 | if ($data =~ /^([-\@\w.]+)$/) { |
120 | $data = $1; # $data now untainted |
121 | } else { |
122 | die "Bad data in $data"; # log this somewhere |
123 | } |
124 | |
5f05dabc |
125 | This is fairly secure because C</\w+/> doesn't normally match shell |
425e5e39 |
126 | metacharacters, nor are dot, dash, or at going to mean something special |
127 | to the shell. Use of C</.+/> would have been insecure in theory because |
128 | it lets everything through, but Perl doesn't check for that. The lesson |
129 | is that when untainting, you must be exceedingly careful with your patterns. |
130 | Laundering data using a regular expression is the I<ONLY> mechanism for |
131 | untainting dirty data, unless you use the strategy detailed below to fork |
132 | a child of lesser privilege. |
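
The whitelist test can be wrapped in a small routine so that every caller launders data the same way. This is a sketch only: C<launder_word> is a hypothetical name, and it anchors with C<\A> and C<\z> instead of C<^> and C<$>, because C<$> also matches just before a trailing newline.

```perl
use strict;
use warnings;

# Hypothetical helper: return an untainted copy of $value if it consists
# only of word characters, hyphens, at signs, and dots; otherwise undef.
sub launder_word {
    my ($value) = @_;
    return unless defined $value;
    # \A and \z anchor the whole string, so a trailing newline is rejected.
    return $1 if $value =~ /\A([-\@\w.]+)\z/;
    return;
}
```

A caller would treat an undefined result as bad input and log it, as in the example above.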
133 | |
a034a98d |
134 | The example does not untaint $data if C<use locale> is in effect, |
135 | because the characters matched by C<\w> are determined by the locale. |
136 | Perl considers that locale definitions are untrustworthy because they |
137 | contain data from outside the program. If you are writing a |
138 | locale-aware program, and want to launder data with a regular expression |
139 | containing C<\w>, put C<no locale> ahead of the expression in the same |
140 | block. See L<perllocale/SECURITY> for further discussion and examples. |
141 | |
425e5e39 |
142 | =head2 Cleaning Up Your Path |
143 | |
1fef88e7 |
144 | For "Insecure C<$ENV{PATH}>" messages, you need to set C<$ENV{'PATH'}> to a |
425e5e39 |
145 | known value, and each directory in the path must not be writable by anyone |
146 | other than its owner and group. You may be surprised to get this message even |
147 | if the pathname to your executable is fully qualified. This is I<not> |
148 | generated because you didn't supply a full path to the program; instead, |
149 | it's generated because you never set your PATH environment variable, or |
150 | you didn't set it to something that was safe. Because Perl can't |
151 | guarantee that the executable in question isn't itself going to turn |
152 | around and execute some other program that is dependent on your PATH, it |
153 | makes sure you set the PATH. |
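
A typical cleanup at the top of a script might look like the sketch below. The directory list is only an example, and the extra variables removed here (C<IFS>, C<CDPATH>, C<ENV>, C<BASH_ENV>) follow later revisions of this document; treat both as assumptions to adapt to your system.

```perl
use strict;
use warnings;

# Set PATH to a short, known list of directories.
$ENV{PATH} = '/bin:/usr/bin';

# Remove variables that can change how a shell started by a child behaves.
delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};
```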
a0d0e21e |
154 | |
155 | It's also possible to get into trouble with other operations that don't |
156 | care whether they use tainted values. Make judicious use of the file |
157 | tests in dealing with any user-supplied filenames. When possible, do |
158 | opens and such after setting C<$E<gt> = $E<lt>>. (Remember group IDs, |
425e5e39 |
159 | too!) Perl doesn't prevent you from opening tainted filenames for reading, |
a0d0e21e |
160 | so be careful what you print out. The tainting mechanism is intended to |
161 | prevent stupid mistakes, not to remove the need for thought. |
162 | |
425e5e39 |
163 | Perl does not call the shell to expand wildcards when you pass B<system> |
164 | and B<exec> explicit parameter lists instead of strings with possible shell |
165 | wildcards in them. Unfortunately, the B<open>, B<glob>, and |
5f05dabc |
166 | back-tick functions provide no such alternate calling convention, so more |
425e5e39 |
167 | subterfuge will be required. |
168 | |
169 | Perl provides a reasonably safe way to open a file or pipe from a setuid |
170 | or setgid program: just create a child process with reduced privilege that |
171 | does the dirty work for you. First, fork a child using the special |
172 | B<open> syntax that connects the parent and child by a pipe. Now the |
173 | child resets its ID set and any other per-process attributes, like |
174 | environment variables, umasks, current working directories, back to the |
175 | originals or known safe values. Then the child process, which no longer |
176 | has any special permissions, does the B<open> or other system call. |
177 | Finally, the child passes the data it managed to access back to the |
5f05dabc |
178 | parent. Because the file or pipe was opened in the child while running |
425e5e39 |
179 | under less privilege than the parent, it's not apt to be tricked into |
180 | doing something it shouldn't. |
181 | |
5f05dabc |
182 | Here's a way to do back-ticks reasonably safely. Notice how the B<exec> is |
425e5e39 |
183 | not called with a string that the shell could expand. This is by far the |
184 | best way to call something that might be subjected to shell escapes: just |
185 | never call the shell at all. By the time we get to the B<exec>, tainting |
186 | is turned off, however, so be careful what you call and what you pass it. |
cb1a09d0 |
187 | |
425e5e39 |
188 | use English; |
cb1a09d0 |
189 | die "Can't fork: $!" unless defined($pid = open(KID, "-|")); |
190 | if ($pid) { # parent |
191 | while (<KID>) { |
192 | # do something |
425e5e39 |
193 | } |
cb1a09d0 |
194 | close KID; |
195 | } else { |
425e5e39 |
196 | $EUID = $UID; |
197 | $EGID = $GID; # XXX: initgroups() not called |
198 | $ENV{PATH} = "/bin:/usr/bin"; |
199 | exec 'myprog', 'arg1', 'arg2'; |
200 | die "can't exec myprog: $!"; |
201 | } |
202 | |
203 | A similar strategy would work for wildcard expansion via C<glob>. |
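
One way the C<glob> variant could look is sketched here. It is an illustration, not a definitive implementation: C<unprivileged_glob> is a hypothetical name, and the child writes each name followed by a NUL byte so that names containing newlines survive the trip through the pipe.

```perl
use strict;
use warnings;

# Hypothetical helper: expand $pattern in a child running with the real
# (unprivileged) IDs and return the matching names to the parent.
sub unprivileged_glob {
    my ($pattern) = @_;
    my $pid = open(my $kid, '-|');   # implicit fork; child's STDOUT feeds $kid
    die "Can't fork: $!" unless defined $pid;

    if ($pid) {                      # parent
        local $/ = "\0";             # records are NUL-terminated
        chomp(my @names = <$kid>);
        close $kid;
        return @names;
    }

    # child: give up the extra privileges before touching the filesystem
    $> = $<;                         # effective UID := real UID
    $) = $(;                         # effective GID := real GID
    $ENV{PATH} = '/bin:/usr/bin';
    print "$_\0" for glob($pattern);
    exit 0;
}
```

Under taint mode the names read back from the child are tainted, like any other input, and must be laundered before use in a risky operation.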
204 | |
205 | Taint checking is most useful when, although you trust yourself not to have |
206 | written a program to give away the farm, you don't necessarily trust those |
207 | who end up using it not to try to trick it into doing something bad. This |
208 | is the kind of security checking that's useful for setuid programs and |
209 | programs launched on someone else's behalf, like CGI programs. |
210 | |
211 | This is quite different, however, from not even trusting the writer of the |
212 | code not to try to do something evil. That's the kind of trust needed |
213 | when someone hands you a program you've never seen before and says, "Here, |
214 | run this." For that kind of safety, check out the Safe module, |
215 | included standard in the Perl distribution. This module allows the |
216 | programmer to set up special compartments in which all system operations |
217 | are trapped and namespace access is carefully controlled. |
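
As a rough illustration of such a compartment, the sketch below evaluates two strings with the Safe module's C<reval> method under the default operator mask. The code strings and variable names are made up for the example.

```perl
use strict;
use warnings;
use Safe;

my $compartment = Safe->new;         # default operator mask

# Plain arithmetic is permitted inside the compartment.
my $sum = $compartment->reval('2 + 3');

# Starting a subprocess is not: reval returns undef and sets $@.
my $result = $compartment->reval('system("echo hi")');
my $error  = $@;

print "sum: $sum\n";
print "refused: $error" if $error;
```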
218 | |
219 | =head2 Security Bugs |
220 | |
221 | Beyond the obvious problems that stem from giving special privileges to |
222 | systems as flexible as scripts, on many versions of Unix, setuid scripts |
223 | are inherently insecure right from the start. The problem is a race |
224 | condition in the kernel. Between the time the kernel opens the file to |
225 | see which interpreter to run and when the (now-setuid) interpreter turns |
226 | around and reopens the file to interpret it, the file in question may have |
227 | changed, especially if you have symbolic links on your system. |
228 | |
229 | Fortunately, sometimes this kernel "feature" can be disabled. |
230 | Unfortunately, there are two ways to disable it. The system can simply |
231 | outlaw scripts with the setuid bit set, which doesn't help much. |
232 | Alternatively, it can simply ignore the setuid bit on scripts. If the |
233 | latter is true, Perl can emulate the setuid and setgid mechanism when it |
234 | notices the otherwise useless setuid/gid bits on Perl scripts. It does |
235 | this via a special executable called B<suidperl> that is automatically |
236 | invoked for you if it's needed. |
237 | |
238 | However, if the kernel setuid script feature isn't disabled, Perl will |
239 | complain loudly that your setuid script is insecure. You'll need to |
240 | either disable the kernel setuid script feature, or put a C wrapper around |
241 | the script. A C wrapper is just a compiled program that does nothing |
242 | except call your Perl program. Compiled programs are not subject to the |
243 | kernel bug that plagues setuid scripts. Here's a simple wrapper, written |
244 | in C: |
245 | |
246 | #include <unistd.h> /* declares execv() */ |
247 | #define REAL_PATH "/path/to/script" |
248 | int main(int ac, char **av) |
249 | { |
250 | execv(REAL_PATH, av); return 1; /* reached only if execv() fails */ |
cb1a09d0 |
251 | } |
252 | |
425e5e39 |
253 | Compile this wrapper into a binary executable and then make I<it> rather |
254 | than your script setuid or setgid. |
255 | |
256 | See the program B<wrapsuid> in the F<eg> directory of your Perl |
257 | distribution for a convenient way to do this automatically for all your |
258 | setuid Perl programs. It moves setuid scripts into files with the same |
259 | name plus a leading dot, and then compiles a wrapper like the one above |
260 | for each of them. |
261 | |
262 | In recent years, vendors have begun to supply systems free of this |
263 | inherent security bug. On such systems, when the kernel passes the name |
264 | of the setuid script to open to the interpreter, rather than using a |
265 | pathname subject to meddling, it instead passes F</dev/fd/3>. This is a |
266 | special file already opened on the script, so that there can be no race |
267 | condition for evil scripts to exploit. On these systems, Perl should be |
268 | compiled with C<-DSETUID_SCRIPTS_ARE_SECURE_NOW>. The B<Configure> |
269 | program that builds Perl tries to figure this out for itself, so you |
270 | should never have to specify this yourself. Most modern releases of |
271 | SysVr4 and BSD 4.4 use this approach to avoid the kernel race condition. |
272 | |
273 | Prior to release 5.003 of Perl, a bug in the code of B<suidperl> could |
274 | introduce a security hole in systems compiled with strict POSIX |
275 | compliance. |