=head1 NAME

perlsec - Perl security

=head1 DESCRIPTION

Perl is designed to make it easy to program securely even when running
with extra privileges, like setuid or setgid programs. Unlike most
command-line shells, which are based on multiple substitution passes on
each line of the script, Perl uses a more conventional evaluation scheme
with fewer hidden snags. Additionally, because the language has more
built-in functionality, it can rely less upon external (and possibly
untrustworthy) programs to accomplish its purposes.

Perl automatically enables a set of special security checks, called I<taint
mode>, when it detects its program running with differing real and effective
user or group IDs. The setuid bit in Unix permissions is mode 04000, the
setgid bit mode 02000; either or both may be set. You can also enable taint
mode explicitly by using the B<-T> command line flag. This flag is
I<strongly> suggested for server programs and any program run on behalf of
someone else, such as a CGI script.

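As a rough illustration of the two mode bits mentioned above, the following
sketch tests a numeric file mode for them. The helper name C<special_bits>
is hypothetical, and the script inspects itself via C<$0> only for want of a
better example file.

```perl
use strict;
use warnings;

# Hypothetical helper: report which special permission bits are set in a
# numeric file mode, such as the third field returned by stat().
sub special_bits {
    my ($mode) = @_;
    my @bits;
    push @bits, 'setuid' if $mode & 04000;
    push @bits, 'setgid' if $mode & 02000;
    return @bits;
}

# Report on the running script itself ($0); any readable path would do.
my @st = stat($0);
if (@st) {
    my @bits = special_bits($st[2]);
    print "$0: ", (@bits ? "@bits" : 'no setuid/setgid bits'), "\n";
}
```
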
While in this mode, Perl takes special precautions called I<taint
checks> to prevent both obvious and subtle traps. Some of these checks
are reasonably simple, such as verifying that path directories aren't
writable by others; careful programmers have always used checks like
these. Other checks, however, are best supported by the language itself,
and it is these checks especially that contribute to making a setuid Perl
program more secure than the corresponding C program.

You may not use data derived from outside your program to affect something
else outside your program--at least, not by accident. All command-line
arguments, environment variables, and file input are marked as "tainted".
Tainted data may not be used directly or indirectly in any command that
invokes a subshell, nor in any command that modifies files, directories,
or processes. Any variable set within an expression that has previously
referenced a tainted value itself becomes tainted, even if it is logically
impossible for the tainted value to influence the variable. Because
taintedness is associated with each scalar value, some elements of an
array can be tainted and others not.

For example:

    $arg = shift;               # $arg is tainted
    $hid = $arg, 'bar';         # $hid is also tainted
    $line = <>;                 # Tainted
    $path = $ENV{'PATH'};       # Tainted, but see below
    $data = 'abc';              # Not tainted

    system "echo $arg";         # Insecure
    system "/bin/echo", $arg;   # Secure (doesn't use sh)
    system "echo $hid";         # Insecure
    system "echo $data";        # Insecure until PATH set

    $path = $ENV{'PATH'};       # $path now tainted

    $ENV{'PATH'} = '/bin:/usr/bin';
    $ENV{'IFS'} = '' if $ENV{'IFS'} ne '';

    $path = $ENV{'PATH'};       # $path now NOT tainted
    system "echo $data";        # Is secure now!

    open(FOO, "< $arg");        # OK - read-only file
    open(FOO, "> $arg");        # Not OK - trying to write

    open(FOO,"echo $arg|");     # Not OK, but...
    open(FOO,"-|")
        or exec 'echo', $arg;   # OK

    $shout = `echo $arg`;       # Insecure, $shout now tainted

    unlink $data, $arg;         # Insecure
    umask $arg;                 # Insecure

    exec "echo $arg";           # Insecure
    exec "echo", $arg;          # Secure (doesn't use the shell)
    exec "sh", '-c', $arg;      # Considered secure, alas!

If you try to do something insecure, you will get a fatal error saying
something like "Insecure dependency" or "Insecure PATH". Note that you
can still write an insecure B<system> or B<exec>, but only by explicitly
doing something like the last example above.

=head2 Laundering and Detecting Tainted Data

To test whether a variable contains tainted data, and whether using it
would thus trigger an "Insecure dependency" message, you can use the
following I<is_tainted()> function.

    sub is_tainted {
        return ! eval {
            join('',@_), kill 0;
            1;
        };
    }

This function makes use of the fact that the presence of tainted data
anywhere within an expression renders the entire expression tainted. It
would be inefficient for every operator to test every argument for
taintedness. Instead, Perl takes a slightly more efficient and
conservative approach: if any tainted value has been accessed within the
same expression, the whole expression is considered tainted.

But testing for taintedness only gets you so far. Sometimes you just have
to clear your data's taintedness. The only way to bypass the tainting
mechanism is by referencing subpatterns from a regular expression match.
Perl presumes that if you reference a substring using $1, $2, etc., you
knew what you were doing when you wrote the pattern. That means using
a bit of thought--don't just blindly untaint anything, or you defeat the
entire mechanism. It's better to verify that the variable has only
good characters (for certain values of "good") rather than checking
whether it has any bad characters. That's because it's far too easy to
miss bad characters that you never thought of.

Here's a test to make sure that the data contains nothing but "word"
characters (alphabetics, numerics, and underscores), a hyphen, an at sign,
or a dot.

    if ($data =~ /^([-\@\w.]+)$/) {
        $data = $1;                     # $data now untainted
    } else {
        die "Bad data in $data";        # log this somewhere
    }

This is fairly secure since C</\w+/> doesn't normally match shell
metacharacters, nor are dot, dash, or at going to mean something special
to the shell. Use of C</.+/> would have been insecure in theory because
it lets everything through, but Perl doesn't check for that. The lesson
is that when untainting, you must be exceedingly careful with your patterns.
Laundering data using a regular expression is the I<ONLY> mechanism for
untainting dirty data, unless you use the strategy detailed below to fork
a child of lesser privilege.

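The whitelist test above could be wrapped in a small reusable routine, as in
the following sketch. The name C<launder_word> is hypothetical, and the
pattern is anchored with C<\A> and C<\z> rather than C<^> and C<$>, an
assumption made here so that a value ending in a newline is rejected too.

```perl
use strict;
use warnings;

# Hypothetical helper: return an untainted copy of $value if it consists
# solely of word characters, hyphens, at signs, and dots; otherwise undef.
sub launder_word {
    my ($value) = @_;
    return unless defined $value;
    # \A and \z anchor to the true start and end of the string; unlike $,
    # \z does not let a trailing newline slip through.
    if ($value =~ /\A([-\@\w.]+)\z/) {
        return $1;      # captured substrings are what Perl treats as clean
    }
    return;
}

for my $candidate ('alice@example.com', 'report-2.txt', 'x; rm -rf /') {
    my $clean = launder_word($candidate);
    print defined $clean ? "accepted: $clean\n" : "rejected: $candidate\n";
}
```

The routine returns the captured substring rather than the original
argument, because only the capture is considered clean.
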
=head2 Cleaning Up Your Path

For "Insecure C<$ENV{PATH}>" messages, you need to set C<$ENV{'PATH'}> to a
known value, and each directory in the path must be non-writable by others
than its owner and group. You may be surprised to get this message even
if the pathname to your executable is fully qualified. This is I<not>
generated because you didn't supply a full path to the program; instead,
it's generated because you never set your PATH environment variable, or
you didn't set it to something that was safe. Because Perl can't
guarantee that the executable in question isn't itself going to turn
around and execute some other program that is dependent on your PATH, it
makes sure you set the PATH.

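A minimal sketch of such a cleanup follows. The directory list is only an
assumption and should be replaced with directories known to be safe on the
system at hand. Deleting C<CDPATH>, C<ENV>, and C<BASH_ENV> along with
C<IFS> goes beyond what is described above; it follows the advice given in
later editions of this document.

```perl
use strict;
use warnings;

# Set PATH to a fixed list of directories. The list here is an assumption;
# substitute directories known to be safe on the system in question.
$ENV{PATH} = '/bin:/usr/bin';

# Remove other variables that can change how a shell behaves.
delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

print "PATH is now $ENV{PATH}\n";
```
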
It's also possible to get into trouble with other operations that don't
care whether they use tainted values. Make judicious use of the file
tests in dealing with any user-supplied filenames. When possible, do
opens and such after setting C<$E<gt> = $E<lt>>. (Remember group IDs,
too!) Perl doesn't prevent you from opening tainted filenames for reading,
so be careful what you print out. The tainting mechanism is intended to
prevent stupid mistakes, not to remove the need for thought.

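One way to apply a file test to a user-supplied name is sketched below. It
covers only the file-test part of the advice above, not the switch of
effective IDs. The helper name C<open_plain_file> is hypothetical, and the
three-argument form of C<open> assumes Perl 5.6 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: open a user-supplied name for reading, but hand back
# the filehandle only if it refers to a plain file.
sub open_plain_file {
    my ($name) = @_;
    # Three-argument open: no mode or pipe characters are parsed from $name.
    open(my $fh, '<', $name) or return;
    # Test the open handle rather than the name, so the file cannot be
    # swapped for something else between the test and the open.
    return unless -f $fh;
    return $fh;
}

my $fh = open_plain_file($0);
print defined $fh ? "opened $0\n" : "refused $0\n";
```
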
Perl does not call the shell to expand wildcards when you pass B<system>
and B<exec> explicit parameter lists instead of strings with possible shell
wildcards in them. Unfortunately, the B<open>, B<glob>, and
backtick functions provide no such alternate calling convention, so more
subterfuge will be required.

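The effect of an explicit parameter list can be observed with a sketch like
the one below. It relies on the list form of a piped C<open>, which exists
in Perl 5.8 and later and therefore postdates the statement above about
B<open>. The helper name C<run_without_shell> is hypothetical, and C<$^X>
(the running Perl binary) stands in for an external program.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command without a shell and return its output.
# Pass the program and at least one argument as separate list elements; a
# single string containing metacharacters would still go through the shell.
sub run_without_shell {
    my @command = @_;
    open(my $kid, '-|', @command) or die "can't run $command[0]: $!";
    local $/;                       # slurp all of the child's output
    my $output = <$kid>;
    close $kid;
    return $output;
}

# The child receives the asterisk as-is; nothing expands it.
my $literal = run_without_shell($^X, '-e', 'print $ARGV[0]', '*');
print "child saw: $literal\n";
```
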
Perl provides a reasonably safe way to open a file or pipe from a setuid
or setgid program: just create a child process with reduced privilege that
does the dirty work for you. First, fork a child using the special
B<open> syntax that connects the parent and child by a pipe. Now the
child resets its ID set and any other per-process attributes, like
environment variables, umasks, current working directories, back to the
originals or known safe values. Then the child process, which no longer
has any special permissions, does the B<open> or other system call.
Finally, the child passes the data it managed to access back to the
parent. Since the file or pipe was opened in the child while running
under less privilege than the parent, it's not apt to be tricked into
doing something it shouldn't.

Here's a way to do backticks reasonably safely. Notice how the B<exec> is
not called with a string that the shell could expand. This is by far the
best way to call something that might be subjected to shell escapes: just
never call the shell at all. By the time we get to the B<exec>, tainting
is turned off, however, so be careful what you call and what you pass it.

    use English;
    die "Can't fork: $!" unless defined($pid = open(KID, "-|"));
    if ($pid) {           # parent
        while (<KID>) {
            # do something
        }
        close KID;
    } else {
        $EUID = $UID;
        $EGID = $GID;    # XXX: initgroups() not called
        $ENV{PATH} = "/bin:/usr/bin";
        exec 'myprog', 'arg1', 'arg2';
        die "can't exec myprog: $!";
    }

A similar strategy would work for wildcard expansion via C<glob>.

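One possible shape for the C<glob> variant is sketched below; it is an
illustration, not a vetted implementation. The helper name C<glob_in_child>
and the NUL-terminated protocol between child and parent are assumptions
made for this example. The group ID is reset before the user ID, because a
process that has already given up its user privileges may no longer be
allowed to change groups.

```perl
use strict;
use warnings;

# Hypothetical helper: expand a wildcard pattern in a child process that has
# given up any extra privileges, and return the matching names to the parent.
sub glob_in_child {
    my ($pattern) = @_;
    my $pid = open(my $kid, '-|');      # fork; child's STDOUT feeds $kid
    die "can't fork: $!" unless defined $pid;

    if ($pid) {                         # parent
        local $/ = "\0";                # names arrive NUL-terminated
        chomp(my @names = <$kid>);
        close $kid;
        return @names;
    }

    # child
    $) = $(;                            # effective GID back to real GID
    $> = $<;                            # effective UID back to real UID
    print map { "$_\0" } glob($pattern);
    exit 0;
}

print "$_\n" for glob_in_child('/etc/host*');
```
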
Taint checking is most useful when, although you trust yourself not to have
written a program to give away the farm, you don't necessarily trust those
who end up using it not to try to trick it into doing something bad. This
is the kind of security checking that's useful for setuid programs and
programs launched on someone else's behalf, like CGI programs.

This is quite different, however, from not even trusting the writer of the
code not to try to do something evil. That's the kind of trust needed
when someone hands you a program you've never seen before and says, "Here,
run this." For that kind of safety, check out the Safe module,
included standard in the Perl distribution. This module allows the
programmer to set up special compartments in which all system operations
are trapped and namespace access is carefully controlled.

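A brief sketch of a compartment in use, assuming the C<new> and C<reval>
methods of the Safe module and its default operator mask; the code strings
being evaluated are made up for the example.

```perl
use strict;
use warnings;
use Safe;

my $compartment = Safe->new;

# Plain arithmetic is allowed by the default operator mask.
my $sum = $compartment->reval('2 + 3');
print "sum: $sum\n";

# system() is not among the operators permitted by default, so this code
# is refused and the reason is left in $@.
my $result = $compartment->reval('system("echo", "hello")');
my $error  = $@;
print "refused: $error" unless defined $result;
```
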
=head2 Security Bugs

Beyond the obvious problems that stem from giving special privileges to
systems as flexible as scripts, on many versions of Unix, setuid scripts
are inherently insecure right from the start. The problem is a race
condition in the kernel. Between the time the kernel opens the file to
see which interpreter to run and when the (now-setuid) interpreter turns
around and reopens the file to interpret it, the file in question may have
changed, especially if you have symbolic links on your system.

Fortunately, sometimes this kernel "feature" can be disabled.
Unfortunately, there are two ways to disable it. The system can simply
outlaw scripts with the setuid bit set, which doesn't help much.
Alternatively, it can simply ignore the setuid bit on scripts. If the
latter is true, Perl can emulate the setuid and setgid mechanism when it
notices the otherwise useless setuid/gid bits on Perl scripts. It does
this via a special executable called B<suidperl> that is automatically
invoked for you if it's needed.

However, if the kernel setuid script feature isn't disabled, Perl will
complain loudly that your setuid script is insecure. You'll need to
either disable the kernel setuid script feature, or put a C wrapper around
the script. A C wrapper is just a compiled program that does nothing
except call your Perl program. Compiled programs are not subject to the
kernel bug that plagues setuid scripts. Here's a simple wrapper, written
in C:

    #include <stdio.h>
    #include <unistd.h>

    #define REAL_PATH "/path/to/script"

    int main(int argc, char **argv)
    {
        execv(REAL_PATH, argv);
        perror(REAL_PATH);      /* reached only if execv fails */
        return 127;
    }

Compile this wrapper into a binary executable and then make I<it> rather
than your script setuid or setgid.

See the program B<wrapsuid> in the F<eg> directory of your Perl
distribution for a convenient way to do this automatically for all your
setuid Perl programs. It moves setuid scripts into files with the same
name plus a leading dot, and then compiles a wrapper like the one above
for each of them.

In recent years, vendors have begun to supply systems free of this
inherent security bug. On such systems, when the kernel passes the name
of the setuid script to open to the interpreter, rather than using a
pathname subject to meddling, it instead passes I</dev/fd/3>. This is a
special file already opened on the script, so that there can be no race
condition for evil scripts to exploit. On these systems, Perl should be
compiled with C<-DSETUID_SCRIPTS_ARE_SECURE_NOW>. The B<Configure>
program that builds Perl tries to figure this out for itself, so you
should never have to specify this yourself. Most modern releases of
SysVr4 and BSD 4.4 use this approach to avoid the kernel race condition.

Prior to release 5.003 of Perl, a bug in the code of B<suidperl> could
introduce a security hole in systems compiled with strict POSIX
compliance.