=head1 NAME

perlsec - Perl security

=head1 DESCRIPTION

Perl is designed to make it easy to program securely even when running
with extra privileges, like setuid or setgid programs. Unlike most
command line shells, which are based on multiple substitution passes on
each line of the script, Perl uses a more conventional evaluation scheme
with fewer hidden snags. Additionally, because the language has more
builtin functionality, it can rely less upon external (and possibly
untrustworthy) programs to accomplish its purposes.

Perl automatically enables a set of special security checks, called I<taint
mode>, when it detects its program running with differing real and effective
user or group IDs. The setuid bit in Unix permissions is mode 04000, the
setgid bit mode 02000; either or both may be set. You can also enable taint
mode explicitly by using the B<-T> command line flag. This flag is
I<strongly> suggested for server programs and any program run on behalf of
someone else, such as a CGI script. Once taint mode is on, it's on for
the remainder of your script.

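
As a minimal sketch of how a program might confirm which mode it is
running in, the following reads the C<${^TAINT}> variable. That variable
is not mentioned above; the sketch assumes a perl recent enough (5.8 or
later) to provide it, and the helper name C<taint_mode_status> is
hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: report how taint checking is configured.
# ${^TAINT} is 1 under -T, -1 when only taint warnings are enabled (-t),
# and 0 when taint mode is off.
sub taint_mode_status {
    return ${^TAINT} > 0 ? 'on'
         : ${^TAINT} < 0 ? 'warnings only'
         :                 'off';
}

print "Taint mode is ", taint_mode_status(), "\n";
```

Started with B<-T> on the command line, the sketch reports C<on>. A
server program could use such a check to refuse to start when it was
launched without the flag.
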
While in this mode, Perl takes special precautions called I<taint
checks> to prevent both obvious and subtle traps. Some of these checks
are reasonably simple, such as verifying that path directories aren't
writable by others; careful programmers have always used checks like
these. Other checks, however, are best supported by the language itself,
and it is these checks especially that contribute to making a set-id Perl
program more secure than the corresponding C program.

You may not use data derived from outside your program to affect
something else outside your program--at least, not by accident. All
command line arguments, environment variables, locale information (see
L<perllocale>), results of certain system calls (readdir(),
readlink(), the variable of shmread(), the messages returned by
msgrcv(), the password, gcos and shell fields returned by the
getpwxxx() calls), and all file input are marked as "tainted".
Tainted data may not be used directly or indirectly in any command
that invokes a sub-shell, nor in any command that modifies files,
directories, or processes, B<with the following exceptions>:

=over 4

=item *

If you pass a list of arguments to either C<system> or C<exec>,
the elements of that list are B<not> checked for taintedness.

=item *

Arguments to C<print> and C<syswrite> are B<not> checked for taintedness.

=back

Any variable set to a value derived from tainted data will itself be
tainted, even if it is logically impossible for the tainted data to
alter the variable. Because taintedness is associated with each scalar
value, some elements of an array can be tainted and others not.

For example:

    $arg = shift;                   # $arg is tainted
    $hid = $arg . 'bar';            # $hid is also tainted
    $line = <>;                     # Tainted
    $line = <STDIN>;                # Also tainted
    open FOO, "/home/me/bar" or die $!;
    $line = <FOO>;                  # Still tainted
    $path = $ENV{'PATH'};           # Tainted, but see below
    $data = 'abc';                  # Not tainted

    system "echo $arg";             # Insecure
    system "/bin/echo", $arg;       # Secure (doesn't use sh)
    system "echo $hid";             # Insecure
    system "echo $data";            # Insecure until PATH set

    $path = $ENV{'PATH'};           # $path now tainted

    $ENV{'PATH'} = '/bin:/usr/bin';
    delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};

    $path = $ENV{'PATH'};           # $path now NOT tainted
    system "echo $data";            # Is secure now!

    open(FOO, "< $arg");            # OK - read-only file
    open(FOO, "> $arg");            # Not OK - trying to write

    open(FOO,"echo $arg|");         # Not OK, but...
    open(FOO,"-|")
        or exec 'echo', $arg;       # OK

    $shout = `echo $arg`;           # Insecure, $shout now tainted

    unlink $data, $arg;             # Insecure
    umask $arg;                     # Insecure

    exec "echo $arg";               # Insecure
    exec "echo", $arg;              # Secure (doesn't use the shell)
    exec "sh", '-c', $arg;          # Considered secure, alas!

    @files = <*.c>;                 # insecure (uses readdir() or similar)
    @files = glob('*.c');           # insecure (uses readdir() or similar)

If you try to do something insecure, you will get a fatal error saying
something like "Insecure dependency" or "Insecure $ENV{PATH}". Note that you
can still write an insecure B<system> or B<exec>, but only by explicitly
doing something like the "considered secure" example above.

=head2 Laundering and Detecting Tainted Data

To test whether a variable contains tainted data, the use of which would
thus trigger an "Insecure dependency" message, check your nearby CPAN mirror
for the F<Taint.pm> module, which should become available around November
1997. Or you may be able to use the following I<is_tainted()> function.

    sub is_tainted {
        return ! eval {
            join('',@_), kill 0;
            1;
        };
    }

This function makes use of the fact that the presence of tainted data
anywhere within an expression renders the entire expression tainted. It
would be inefficient for every operator to test every argument for
taintedness. Instead, Perl takes a slightly more efficient and more
conservative approach: if any tainted value has been accessed within the
same expression, the whole expression is considered tainted.

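
As a hedged alternative to the hand-rolled function above, Perl
distributions from 5.8.0 onward bundle the Scalar::Util module, whose
C<tainted()> function answers the same question. Neither that module nor
the helper C<describe_taint> appears in the text above; both are
assumptions made for this sketch. It also runs without B<-T>, in which
case nothing is reported as tainted.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical helper: label a value as tainted or clean.
# tainted() returns false for everything unless taint mode is on.
sub describe_taint {
    my ($value) = @_;
    return tainted($value) ? 'tainted' : 'clean';
}

my $literal = 'abc';                                   # constant in the program
my $outside = defined $ENV{PATH} ? $ENV{PATH} : '';    # comes from outside

# Taint spreads through expressions: under -T, a value built from
# $outside is tainted too, even though its visible content is a literal.
my $derived = $literal . substr($outside, 0, 0);

printf "%-8s %s\n", 'literal', describe_taint($literal);
printf "%-8s %s\n", 'outside', describe_taint($outside);
printf "%-8s %s\n", 'derived', describe_taint($derived);
```

The C<$derived> line illustrates the rule described above: the result of
the concatenation has the same content as C<$literal>, yet it carries
the taint of the other operand.
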
But testing for taintedness gets you only so far. Sometimes you just
have to clear your data's taintedness. The only way to bypass the tainting
mechanism is by referencing subpatterns from a regular expression match.
Perl presumes that if you reference a substring using $1, $2, etc., you
knew what you were doing when you wrote the pattern. That means using
a bit of thought--don't just blindly untaint anything, or you defeat the
entire mechanism. It's better to verify that the variable has only good
characters (for certain values of "good") rather than checking whether it
has any bad characters. That's because it's far too easy to miss bad
characters that you never thought of.

Here's a test to make sure that the data contains nothing but "word"
characters (alphabetics, numerics, and underscores), a hyphen, an at sign,
or a dot.

    if ($data =~ /^([-\@\w.]+)$/) {
        $data = $1;                     # $data now untainted
    } else {
        die "Bad data in $data";        # log this somewhere
    }

This is fairly secure because C</\w+/> doesn't normally match shell
metacharacters, nor are dot, dash, or at going to mean something special
to the shell. Use of C</.+/> would have been insecure in theory because
it lets everything through, but Perl doesn't check for that. The lesson
is that when untainting, you must be exceedingly careful with your patterns.
Laundering data using a regular expression is the I<only> mechanism for
untainting dirty data, unless you use the strategy detailed below to fork
a child of lesser privilege.

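
The test above can be wrapped in a small function so that every caller
launders data the same way. This is only a sketch: the name
C<launder_word> is hypothetical, and it anchors the pattern with C<\A>
and C<\z> instead of C<^> and C<$>, a deliberate variation that also
rejects a trailing newline.

```perl
use strict;
use warnings;

# Hypothetical helper: return an untainted copy of $data if it consists
# solely of word characters, hyphens, at signs, and dots. Otherwise
# return undef so the caller can decide how to report the problem.
sub launder_word {
    my ($data) = @_;
    return undef unless defined $data;
    # \z (rather than $) also rejects a trailing newline.
    return $data =~ /\A([-\@\w.]+)\z/ ? $1 : undef;
}

for my $candidate ('report-2.txt', 'user@example.com', 'rm -rf /; echo') {
    my $clean = launder_word($candidate);
    print defined $clean ? "accepted: $clean\n" : "rejected: $candidate\n";
}
```

Returning C<undef> rather than dying leaves the logging decision with
the caller, which suits programs that must keep serving other requests
after rejecting one bad input.
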
The example does not untaint $data if C<use locale> is in effect,
because the characters matched by C<\w> are determined by the locale.
Perl considers that locale definitions are untrustworthy because they
contain data from outside the program. If you are writing a
locale-aware program, and want to launder data with a regular expression
containing C<\w>, put C<no locale> ahead of the expression in the same
block. See L<perllocale/SECURITY> for further discussion and examples.

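
A minimal sketch of that arrangement follows, assuming a program that
has C<use locale> in effect at file scope. The helper name
C<launder_without_locale> is hypothetical.

```perl
use strict;
use warnings;
use locale;    # the rest of this (hypothetical) program is locale-aware

# Hypothetical helper: launder $data with locale rules switched off for
# just the block that contains the pattern match.
sub launder_without_locale {
    my ($data) = @_;
    my $clean;
    {
        no locale;    # in effect only until the end of this block
        if ($data =~ /^([-\@\w.]+)$/) {
            $clean = $1;
        }
    }
    return $clean;    # undef when $data did not match
}

my $name = launder_without_locale('backup_01.tar');
print defined $name ? "laundered: $name\n" : "rejected\n";
```

Keeping C<no locale> inside its own bare block limits its effect to the
match itself, so the surrounding code stays locale-aware.
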
=head2 Switches On the "#!" Line

When you make a script executable, in order to make it usable as a
command, the system will pass switches to perl from the script's #!
line. Perl checks that any command line switches given to a setuid
(or setgid) script actually match the ones set on the #! line. Some
Unix and Unix-like environments impose a one-switch limit on the #!
line, so you may need to use something like C<-wU> instead of C<-w -U>
under such systems. (This issue should arise only in Unix or
Unix-like environments that support #! and setuid or setgid scripts.)

=head2 Cleaning Up Your Path

For "Insecure C<$ENV{PATH}>" messages, you need to set C<$ENV{'PATH'}> to a
known value, and each directory in the path must be non-writable by anyone
other than its owner and group. You may be surprised to get this message
even if the pathname to your executable is fully qualified. The message is
I<not> generated because you didn't supply a full path to the program;
instead, it's generated because you never set your PATH environment
variable, or you didn't set it to something that was safe. Because Perl
can't guarantee that the executable in question isn't itself going to turn
around and execute some other program that is dependent on your PATH, it
makes sure you set the PATH.

The PATH isn't the only environment variable which can cause problems.
Because some shells may use the variables IFS, CDPATH, ENV, and
BASH_ENV, Perl checks that those are either empty or untainted when
starting subprocesses. You may wish to add something like this to your
set-id and taint-checking scripts.

    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};   # Make %ENV safer

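
The two steps above (a known PATH and a scrubbed set of shell variables)
can be combined with a check of your own on the PATH directories. The
following is a sketch under stated assumptions: the helper name
C<world_writable_dirs> is hypothetical, the colon separator assumes a
Unix-style PATH, and the test on the 0002 permission bit only
approximates the rule described above rather than reproducing Perl's
internal check.

```perl
use strict;
use warnings;

# Hypothetical helper: list the directories in a colon-separated path
# that are writable by "other" users (the 0002 permission bit).
sub world_writable_dirs {
    my ($path) = @_;
    return grep {
        my @st = stat $_;               # empty list if the stat fails
        @st && ($st[2] & 0002);
    } split /:/, $path;
}

# Start subprocesses from a known environment.
$ENV{PATH} = '/bin:/usr/bin';
delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

my @risky = world_writable_dirs($ENV{PATH});
print @risky
    ? "World-writable directories in PATH: @risky\n"
    : "No world-writable directories found in PATH\n";
```

Directories that cannot be examined are skipped here; a stricter program
might prefer to treat them as unsafe.
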
It's also possible to get into trouble with other operations that don't
care whether they use tainted values. Make judicious use of the file
tests in dealing with any user-supplied filenames. When possible, do
opens and such B<after> properly dropping any special user (or group!)
privileges. Perl doesn't prevent you from opening tainted filenames for
reading, so be careful what you print out. The tainting mechanism is
intended to prevent stupid mistakes, not to remove the need for thought.

Perl does not call the shell to expand wildcards when you pass B<system>
and B<exec> explicit parameter lists instead of strings with possible shell
wildcards in them. Unfortunately, the B<open>, B<glob>, and
backtick functions provide no such alternate calling convention, so more
subterfuge will be required.

Perl provides a reasonably safe way to open a file or pipe from a setuid
or setgid program: just create a child process with reduced privilege that
does the dirty work for you. First, fork a child using the special
B<open> syntax that connects the parent and child by a pipe. Next, the
child resets its ID set and any other per-process attributes, such as
environment variables, umask, and current working directory, back to the
originals or known safe values. Then the child process, which no longer
has any special permissions, does the B<open> or other system call.
Finally, the child passes the data it managed to access back to the
parent. Because the file or pipe was opened in the child while running
under less privilege than the parent, it's not apt to be tricked into
doing something it shouldn't.

Here's a way to do backticks reasonably safely. Notice how the B<exec> is
not called with a string that the shell could expand. This is by far the
best way to call something that might be subjected to shell escapes: just
never call the shell at all.

    use English;
    die "Can't fork: $!" unless defined($pid = open(KID, "-|"));
    if ($pid) {           # parent
        while (<KID>) {
            # do something
        }
        close KID;
    } else {
        my @temp     = ($EUID, $EGID);
        my $orig_uid = $UID;
        my $orig_gid = $GID;
        $EUID = $UID;
        $EGID = $GID;
        # Drop privileges
        $UID  = $orig_uid;
        $GID  = $orig_gid;
        # Make sure privs are really gone
        ($EUID, $EGID) = @temp;
        die "Can't drop privileges"
            unless $UID == $EUID && $GID eq $EGID;
        $ENV{PATH} = "/bin:/usr/bin"; # Minimal PATH.
        # Consider sanitizing the environment even more.
        exec 'myprog', 'arg1', 'arg2'
            or die "can't exec myprog: $!";
    }

A similar strategy would work for wildcard expansion via C<glob>, although
you can use C<readdir> instead.

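
The C<readdir> route mentioned above might look like the following
sketch, which never involves a shell. The helper name
C<files_with_suffix> is hypothetical, and skipping names that begin with
a dot is an assumption made to approximate what C<< <*.c> >> would
return.

```perl
use strict;
use warnings;

# Hypothetical helper: return the names in $dir that end in $suffix,
# sorted, without involving a shell. Under taint mode the names are
# still tainted, because they come from readdir().
sub files_with_suffix {
    my ($dir, $suffix) = @_;
    opendir(my $dh, $dir) or die "Can't open directory $dir: $!";
    # Skip hidden names, then keep those with the requested suffix.
    my @names = grep { !/\A\./ && /\Q$suffix\E\z/ } readdir $dh;
    closedir $dh;
    @names = sort @names;
    return @names;
}

my @c_files = files_with_suffix('.', '.c');   # roughly what <*.c> yields
print "$_\n" for @c_files;
```

Because the suffix is quoted with C<\Q...\E>, characters such as the dot
are matched literally rather than as pattern metacharacters.
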
Taint checking is most useful when, although you trust yourself not to have
written a program to give away the farm, you don't necessarily trust those
who end up using it not to try to trick it into doing something bad. This
is the kind of security checking that's useful for set-id programs and
programs launched on someone else's behalf, like CGI programs.

This is quite different, however, from not even trusting the writer of the
code not to try to do something evil. That's the kind of trust needed
when someone hands you a program you've never seen before and says, "Here,
run this." For that kind of safety, check out the Safe module,
included standard in the Perl distribution. This module allows the
programmer to set up special compartments in which all system operations
are trapped and namespace access is carefully controlled.

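
A brief sketch of such a compartment follows, assuming the module's
default restrictions are acceptable. The code strings passed to
C<reval> are purely illustrative.

```perl
use strict;
use warnings;
use Safe;

# Create a compartment with Safe's default restrictions.
my $compartment = Safe->new;

# Harmless code is compiled and run inside the compartment.
my $sum = $compartment->reval('2 + 3');
print "2 + 3 evaluated in the compartment: $sum\n";

# Code that tries to run an external command is refused: reval returns
# undef and leaves the reason in $@.
my $result = $compartment->reval('system("echo", "hello")');
my $error  = $@;
print "system() was refused: $error" if !defined $result && $error;
```

Copying C<$@> into C<$error> straight away keeps the message from being
overwritten by a later C<eval>.
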
=head2 Security Bugs

Beyond the obvious problems that stem from giving special privileges to
systems as flexible as scripts, on many versions of Unix, set-id scripts
are inherently insecure right from the start. The problem is a race
condition in the kernel. Between the time the kernel opens the file to
see which interpreter to run and when the (now-set-id) interpreter turns
around and reopens the file to interpret it, the file in question may have
changed, especially if you have symbolic links on your system.

Fortunately, sometimes this kernel "feature" can be disabled.
Unfortunately, there are two ways to disable it. The system can simply
outlaw scripts with any set-id bit set, which doesn't help much.
Alternatively, it can simply ignore the set-id bits on scripts. If the
latter is true, Perl can emulate the setuid and setgid mechanism when it
notices the otherwise useless setuid/gid bits on Perl scripts. It does
this via a special executable called B<suidperl> that is automatically
invoked for you if it's needed.

However, if the kernel set-id script feature isn't disabled, Perl will
complain loudly that your set-id script is insecure. You'll need to
either disable the kernel set-id script feature, or put a C wrapper around
the script. A C wrapper is just a compiled program that does nothing
except call your Perl program. Compiled programs are not subject to the
kernel bug that plagues set-id scripts. Here's a simple wrapper, written
in C:

    #include <stdio.h>
    #include <unistd.h>

    #define REAL_PATH "/path/to/script"

    int main(int argc, char **argv)
    {
        execv(REAL_PATH, argv);
        perror(REAL_PATH);      /* reached only if execv() fails */
        return 1;
    }

Compile this wrapper into a binary executable and then make I<it> rather
than your script setuid or setgid.

In recent years, vendors have begun to supply systems free of this
inherent security bug. On such systems, when the kernel passes the name
of the set-id script to open to the interpreter, rather than using a
pathname subject to meddling, it instead passes I</dev/fd/3>. This is a
special file already opened on the script, so that there can be no race
condition for evil scripts to exploit. On these systems, Perl should be
compiled with C<-DSETUID_SCRIPTS_ARE_SECURE_NOW>. The B<Configure>
program that builds Perl tries to figure this out for itself, so you
should never have to specify this yourself. Most modern releases of
SysVr4 and BSD 4.4 use this approach to avoid the kernel race condition.

Prior to release 5.6.1 of Perl, bugs in the code of B<suidperl> could
introduce a security hole.

=head2 Protecting Your Programs

There are a number of ways to hide the source to your Perl programs,
with varying levels of "security".

First of all, however, you I<can't> take away read permission, because
the source code has to be readable in order to be compiled and
interpreted. (That doesn't mean that a CGI script's source is
readable by people on the web, though.) So you have to leave the
permissions at the socially friendly 0755 level. This lets only
people on your local system see your source.

Some people mistakenly regard this as a security problem. If your program does
insecure things, and relies on people not knowing how to exploit those
insecurities, it is not secure. It is often possible for someone to
determine the insecure things and exploit them without viewing the
source. Security through obscurity, the name for hiding your bugs
instead of fixing them, is little security indeed.

You can try using encryption via source filters (Filter::* from CPAN),
but crackers might be able to decrypt it. You can try using the
byte code compiler and interpreter, but crackers might
be able to de-compile it. You can try using the native-code compiler,
but crackers might be able to disassemble it. These
pose varying degrees of difficulty to people wanting to get at your
code, but none can definitively conceal it (this is true of every
language, not just Perl).

If you're concerned about people profiting from your code, then the
bottom line is that nothing but a restrictive licence will give you
legal security. License your software and pepper it with threatening
statements like "This is unpublished proprietary software of XYZ Corp.
Your access to it does not give you permission to use it blah blah
blah." You should see a lawyer to be sure your licence's wording will
stand up in court.

=head1 SEE ALSO

L<perlrun> for its description of cleaning up environment variables.