threads documentation: fork on UNIX might not copy all threads.
[p5sagit/p5-mst-13.2.git] / pod / perlipc.pod
CommitLineData
a0d0e21e 1=head1 NAME
2
184e9718 3perlipc - Perl interprocess communication (signals, fifos, pipes, safe subprocesses, sockets, and semaphores)
a0d0e21e 4
5=head1 DESCRIPTION
6
4633a7c4 7The basic IPC facilities of Perl are built out of the good old Unix
8signals, named pipes, pipe opens, the Berkeley socket routines, and SysV
9IPC calls. Each is used in slightly different situations.
10
11=head1 Signals
12
490f90af 13Perl uses a simple signal handling model: the %SIG hash contains names
14or references of user-installed signal handlers. These handlers will
15be called with an argument which is the name of the signal that
16triggered it. A signal may be generated intentionally from a
17particular keyboard sequence like control-C or control-Z, sent to you
18from another process, or triggered automatically by the kernel when
19special events transpire, like a child process exiting, your process
20running out of stack space, or hitting file size limit.
4633a7c4 21
a11adca0 22For example, to trap an interrupt signal, set up a handler like this:
4633a7c4 23
24 sub catch_zap {
25 my $signame = shift;
26 $shucks++;
27 die "Somebody sent me a SIG$signame";
54310121 28 }
4633a7c4 29 $SIG{INT} = 'catch_zap'; # could fail in modules
30 $SIG{INT} = \&catch_zap; # best strategy
31
490f90af 32Prior to Perl 5.7.3 it was necessary to do as little as you possibly
33could in your handler; notice how all we do is set a global variable
34and then raise an exception. That's because on most systems,
35libraries are not re-entrant; particularly, memory allocation and I/O
36routines are not. That meant that doing nearly I<anything> in your
37handler could in theory trigger a memory fault and subsequent core
ec488bcf 38dump - see L</Deferred Signals (Safe Signals)> below.
a11adca0 39
4633a7c4 40The names of the signals are the ones listed out by C<kill -l> on your
41system, or you can retrieve them from the Config module. Set up an
42@signame list indexed by number to get the name and a %signo table
43indexed by name to get the number:
44
45 use Config;
46 defined $Config{sig_name} || die "No sigs?";
47 foreach $name (split(' ', $Config{sig_name})) {
48 $signo{$name} = $i;
49 $signame[$i] = $name;
50 $i++;
54310121 51 }
4633a7c4 52
6a3992aa 53So to check whether signal 17 and SIGALRM were the same, do just this:
4633a7c4 54
55 print "signal #17 = $signame[17]\n";
54310121 56 if ($signo{ALRM}) {
4633a7c4 57 print "SIGALRM is $signo{ALRM}\n";
54310121 58 }
4633a7c4 59
60You may also choose to assign the strings C<'IGNORE'> or C<'DEFAULT'> as
61the handler, in which case Perl will try to discard the signal or do the
f648820c 62default thing.
63
19799a22 64On most Unix platforms, the C<CHLD> (sometimes also known as C<CLD>) signal
f648820c 65has special behavior with respect to a value of C<'IGNORE'>.
66Setting C<$SIG{CHLD}> to C<'IGNORE'> on such a platform has the effect of
67not creating zombie processes when the parent process fails to C<wait()>
68on its child processes (i.e. child processes are automatically reaped).
69Calling C<wait()> with C<$SIG{CHLD}> set to C<'IGNORE'> usually returns
70C<-1> on such platforms.
71
72Some signals can be neither trapped nor ignored, such as
4633a7c4 73the KILL and STOP (but not the TSTP) signals. One strategy for
74temporarily ignoring signals is to use a local() statement, which will be
75automatically restored once your block is exited. (Remember that local()
76values are "inherited" by functions called from within that block.)
77
78 sub precious {
79 local $SIG{INT} = 'IGNORE';
80 &more_functions;
54310121 81 }
4633a7c4 82 sub more_functions {
83 # interrupts still ignored, for now...
54310121 84 }
4633a7c4 85
86Sending a signal to a negative process ID means that you send the signal
fb73857a 87to the entire Unix process-group. This code sends a hang-up signal to all
88processes in the current process group (and sets $SIG{HUP} to IGNORE so
89it doesn't kill itself):
4633a7c4 90
91 {
92 local $SIG{HUP} = 'IGNORE';
93 kill HUP => -$$;
94 # snazzy writing of: kill('HUP', -$$)
95 }
a0d0e21e 96
4633a7c4 97Another interesting signal to send is signal number zero. This doesn't
1e9c1022 98actually affect a child process, but instead checks whether it's alive
54310121 99or has changed its UID.
a0d0e21e 100
4633a7c4 101 unless (kill 0 => $kid_pid) {
102 warn "something wicked happened to $kid_pid";
54310121 103 }
a0d0e21e 104
1e9c1022 105When directed at a process whose UID is not identical to that
106of the sending process, signal number zero may fail because
107you lack permission to send the signal, even though the process is alive.
bf003f36 108You may be able to determine the cause of failure using C<%!>.
1e9c1022 109
bf003f36 110 unless (kill 0 => $pid or $!{EPERM}) {
1e9c1022 111 warn "$pid looks dead";
112 }
113
4633a7c4 114You might also want to employ anonymous functions for simple signal
115handlers:
a0d0e21e 116
4633a7c4 117 $SIG{INT} = sub { die "\nOutta here!\n" };
a0d0e21e 118
4633a7c4 119But that will be problematic for the more complicated handlers that need
54310121 120to reinstall themselves. Because Perl's signal mechanism is currently
184e9718 121based on the signal(3) function from the C library, you may sometimes be so
4633a7c4 122misfortunate as to run on systems where that function is "broken", that
123is, it behaves in the old unreliable SysV way rather than the newer, more
124reasonable BSD and POSIX fashion. So you'll see defensive people writing
125signal handlers like this:
a0d0e21e 126
54310121 127 sub REAPER {
4633a7c4 128 $waitedpid = wait;
6a3992aa 129 # loathe sysV: it makes us not only reinstate
130 # the handler, but place it after the wait
54310121 131 $SIG{CHLD} = \&REAPER;
4633a7c4 132 }
133 $SIG{CHLD} = \&REAPER;
134 # now do something that forks...
135
816229cf 136or better still:
4633a7c4 137
6a3992aa 138 use POSIX ":sys_wait_h";
54310121 139 sub REAPER {
4633a7c4 140 my $child;
816229cf 141 # If a second child dies while in the signal handler caused by the
142 # first death, we won't get another signal. So must loop here else
143 # we will leave the unreaped child as a zombie. And the next time
144 # two children die we get another zombie. And so on.
1450d070 145 while (($child = waitpid(-1,WNOHANG)) > 0) {
4633a7c4 146 $Kid_Status{$child} = $?;
54310121 147 }
6a3992aa 148 $SIG{CHLD} = \&REAPER; # still loathe sysV
4633a7c4 149 }
150 $SIG{CHLD} = \&REAPER;
151 # do something that forks...
152
153Signal handling is also used for timeouts in Unix, While safely
154protected within an C<eval{}> block, you set a signal handler to trap
155alarm signals and then schedule to have one delivered to you in some
156number of seconds. Then try your blocking operation, clearing the alarm
157when it's done but not before you've exited your C<eval{}> block. If it
158goes off, you'll use die() to jump out of the block, much as you might
159using longjmp() or throw() in other languages.
160
161Here's an example:
162
54310121 163 eval {
4633a7c4 164 local $SIG{ALRM} = sub { die "alarm clock restart" };
54310121 165 alarm 10;
4633a7c4 166 flock(FH, 2); # blocking write lock
54310121 167 alarm 0;
4633a7c4 168 };
169 if ($@ and $@ !~ /alarm clock restart/) { die }
170
8a4f6ac2 171If the operation being timed out is system() or qx(), this technique
172is liable to generate zombies. If this matters to you, you'll
173need to do your own fork() and exec(), and kill the errant child process.
174
4633a7c4 175For more complex signal handling, you might see the standard POSIX
176module. Lamentably, this is almost entirely undocumented, but
177the F<t/lib/posix.t> file from the Perl source distribution has some
178examples in it.
179
28494392 180=head2 Handling the SIGHUP Signal in Daemons
181
182A process that usually starts when the system boots and shuts down
183when the system is shut down is called a daemon (Disk And Execution
184MONitor). If a daemon process has a configuration file which is
185modified after the process has been started, there should be a way to
186tell that process to re-read its configuration file, without stopping
187the process. Many daemons provide this mechanism using the C<SIGHUP>
188signal handler. When you want to tell the daemon to re-read the file
189you simply send it the C<SIGHUP> signal.
190
3031ea75 191Not all platforms automatically reinstall their (native) signal
192handlers after a signal delivery. This means that the handler works
193only the first time the signal is sent. The solution to this problem
194is to use C<POSIX> signal handlers if available, their behaviour
195is well-defined.
28494392 196
197The following example implements a simple daemon, which restarts
198itself every time the C<SIGHUP> signal is received. The actual code is
199located in the subroutine C<code()>, which simply prints some debug
200info to show that it works and should be replaced with the real code.
201
202 #!/usr/bin/perl -w
d6fd60d6 203
28494392 204 use POSIX ();
205 use FindBin ();
206 use File::Basename ();
207 use File::Spec::Functions;
d6fd60d6 208
28494392 209 $|=1;
d6fd60d6 210
28494392 211 # make the daemon cross-platform, so exec always calls the script
212 # itself with the right path, no matter how the script was invoked.
213 my $script = File::Basename::basename($0);
214 my $SELF = catfile $FindBin::Bin, $script;
d6fd60d6 215
28494392 216 # POSIX unmasks the sigprocmask properly
217 my $sigset = POSIX::SigSet->new();
218 my $action = POSIX::SigAction->new('sigHUP_handler',
219 $sigset,
220 &POSIX::SA_NODEFER);
221 POSIX::sigaction(&POSIX::SIGHUP, $action);
d6fd60d6 222
28494392 223 sub sigHUP_handler {
224 print "got SIGHUP\n";
225 exec($SELF, @ARGV) or die "Couldn't restart: $!\n";
226 }
d6fd60d6 227
28494392 228 code();
d6fd60d6 229
28494392 230 sub code {
231 print "PID: $$\n";
232 print "ARGV: @ARGV\n";
233 my $c = 0;
234 while (++$c) {
235 sleep 2;
236 print "$c\n";
237 }
238 }
239 __END__
240
241
4633a7c4 242=head1 Named Pipes
243
244A named pipe (often referred to as a FIFO) is an old Unix IPC
245mechanism for processes communicating on the same machine. It works
54310121 246just like a regular, connected anonymous pipes, except that the
4633a7c4 247processes rendezvous using a filename and don't have to be related.
248
249To create a named pipe, use the Unix command mknod(1) or on some
250systems, mkfifo(1). These may not be in your normal path.
251
252 # system return val is backwards, so && not ||
253 #
254 $ENV{PATH} .= ":/etc:/usr/etc";
54310121 255 if ( system('mknod', $path, 'p')
4633a7c4 256 && system('mkfifo', $path) )
257 {
5a964f20 258 die "mk{nod,fifo} $path failed";
54310121 259 }
4633a7c4 260
261
262A fifo is convenient when you want to connect a process to an unrelated
263one. When you open a fifo, the program will block until there's something
54310121 264on the other end.
4633a7c4 265
266For example, let's say you'd like to have your F<.signature> file be a
267named pipe that has a Perl program on the other end. Now every time any
6a3992aa 268program (like a mailer, news reader, finger program, etc.) tries to read
4633a7c4 269from that file, the reading program will block and your program will
6a3992aa 270supply the new signature. We'll use the pipe-checking file test B<-p>
4633a7c4 271to find out whether anyone (or anything) has accidentally removed our fifo.
272
273 chdir; # go home
274 $FIFO = '.signature';
275 $ENV{PATH} .= ":/etc:/usr/games";
276
277 while (1) {
278 unless (-p $FIFO) {
279 unlink $FIFO;
54310121 280 system('mknod', $FIFO, 'p')
4633a7c4 281 && die "can't mknod $FIFO: $!";
54310121 282 }
4633a7c4 283
284 # next line blocks until there's a reader
285 open (FIFO, "> $FIFO") || die "can't write $FIFO: $!";
286 print FIFO "John Smith (smith\@host.org)\n", `fortune -s`;
287 close FIFO;
6a3992aa 288 sleep 2; # to avoid dup signals
4633a7c4 289 }
a0d0e21e 290
ffc145e8 291=head2 Deferred Signals (Safe Signals)
5a964f20 292
490f90af 293In Perls before Perl 5.7.3 by installing Perl code to deal with
294signals, you were exposing yourself to danger from two things. First,
295few system library functions are re-entrant. If the signal interrupts
296while Perl is executing one function (like malloc(3) or printf(3)),
297and your signal handler then calls the same function again, you could
298get unpredictable behavior--often, a core dump. Second, Perl isn't
299itself re-entrant at the lowest levels. If the signal interrupts Perl
300while Perl is changing its own internal data structures, similarly
301unpredictable behaviour may result.
5a964f20 302
a11adca0 303There were two things you could do, knowing this: be paranoid or be
304pragmatic. The paranoid approach was to do as little as possible in your
5a964f20 305signal handler. Set an existing integer variable that already has a
306value, and return. This doesn't help you if you're in a slow system call,
307which will just restart. That means you have to C<die> to longjump(3) out
308of the handler. Even this is a little cavalier for the true paranoiac,
309who avoids C<die> in a handler because the system I<is> out to get you.
a11adca0 310The pragmatic approach was to say ``I know the risks, but prefer the
311convenience'', and to do anything you wanted in your signal handler,
312and be prepared to clean up core dumps now and again.
313
490f90af 314In Perl 5.7.3 and later to avoid these problems signals are
315"deferred"-- that is when the signal is delivered to the process by
316the system (to the C code that implements Perl) a flag is set, and the
317handler returns immediately. Then at strategic "safe" points in the
318Perl interpreter (e.g. when it is about to execute a new opcode) the
319flags are checked and the Perl level handler from %SIG is
320executed. The "deferred" scheme allows much more flexibility in the
321coding of signal handler as we know Perl interpreter is in a safe
322state, and that we are not in a system library function when the
323handler is called. However the implementation does differ from
324previous Perls in the following ways:
5a964f20 325
a11adca0 326=over 4
5a964f20 327
a11adca0 328=item Long running opcodes
329
490f90af 330As Perl interpreter only looks at the signal flags when it about to
331execute a new opcode if a signal arrives during a long running opcode
332(e.g. a regular expression operation on a very large string) then
333signal will not be seen until operation completes.
a11adca0 334
335=item Interrupting IO
336
490f90af 337When a signal is delivered (e.g. INT control-C) the operating system
338breaks into IO operations like C<read> (used to implement Perls
339E<lt>E<gt> operator). On older Perls the handler was called
340immediately (and as C<read> is not "unsafe" this worked well). With
341the "deferred" scheme the handler is not called immediately, and if
342Perl is using system's C<stdio> library that library may re-start the
343C<read> without returning to Perl and giving it a chance to call the
344%SIG handler. If this happens on your system the solution is to use
345C<:perlio> layer to do IO - at least on those handles which you want
346to be able to break into with signals. (The C<:perlio> layer checks
347the signal flags and calls %SIG handlers before resuming IO operation.)
348
349Note that the default in Perl 5.7.3 and later is to automatically use
350the C<:perlio> layer.
a11adca0 351
91d81acc 352Note that some networking library functions like gethostbyname() are
353known to have their own implementations of timeouts which may conflict
354with your timeouts. If you are having problems with such functions,
355you can try using the POSIX sigaction() function, which bypasses the
356Perl safe signals (note that this means subjecting yourself to
357possible memory corruption, as described above). Instead of setting
358C<$SIG{ALRM}> try something like the following:
359
360 use POSIX;
361 sigaction SIGALRM, new POSIX::SigAction sub { die "alarm\n" }
362 or die "Error setting SIGALRM handler: $!\n";
363
9ce5b4ad 364=item Restartable system calls
365
366On systems that supported it, older versions of Perl used the
367SA_RESTART flag when installing %SIG handlers. This meant that
368restartable system calls would continue rather than returning when
369a signal arrived. In order to deliver deferred signals promptly,
370Perl 5.7.3 and later do I<not> use SA_RESTART. Consequently,
371restartable system calls can fail (with $! set to C<EINTR>) in places
372where they previously would have succeeded.
373
374Note that the default C<:perlio> layer will retry C<read>, C<write>
375and C<close> as described above and that interrupted C<wait> and
376C<waitpid> calls will always be retried.
377
a11adca0 378=item Signals as "faults"
379
490f90af 380Certain signals e.g. SEGV, ILL, BUS are generated as a result of
381virtual memory or other "faults". These are normally fatal and there
382is little a Perl-level handler can do with them. (In particular the
383old signal scheme was particularly unsafe in such cases.) However if
384a %SIG handler is set the new scheme simply sets a flag and returns as
385described above. This may cause the operating system to try the
386offending machine instruction again and - as nothing has changed - it
387will generate the signal again. The result of this is a rather odd
388"loop". In future Perl's signal mechanism may be changed to avoid this
389- perhaps by simply disallowing %SIG handlers on signals of that
390type. Until then the work-round is not to set a %SIG handler on those
391signals. (Which signals they are is operating system dependant.)
a11adca0 392
393=item Signals triggered by operating system state
394
490f90af 395On some operating systems certain signal handlers are supposed to "do
396something" before returning. One example can be CHLD or CLD which
397indicates a child process has completed. On some operating systems the
398signal handler is expected to C<wait> for the completed child
399process. On such systems the deferred signal scheme will not work for
400those signals (it does not do the C<wait>). Again the failure will
401look like a loop as the operating system will re-issue the signal as
402there are un-waited-for completed child processes.
a11adca0 403
818c4caa 404=back
a0d0e21e 405
4ffa73a3 406If you want the old signal behaviour back regardless of possible
407memory corruption, set the environment variable C<PERL_SIGNALS> to
45c0772f 408C<"unsafe"> (a new feature since Perl 5.8.1).
4ffa73a3 409
4633a7c4 410=head1 Using open() for IPC
411
490f90af 412Perl's basic open() statement can also be used for unidirectional
413interprocess communication by either appending or prepending a pipe
414symbol to the second argument to open(). Here's how to start
415something up in a child process you intend to write to:
4633a7c4 416
54310121 417 open(SPOOLER, "| cat -v | lpr -h 2>/dev/null")
4633a7c4 418 || die "can't fork: $!";
419 local $SIG{PIPE} = sub { die "spooler pipe broke" };
420 print SPOOLER "stuff\n";
421 close SPOOLER || die "bad spool: $! $?";
422
423And here's how to start up a child process you intend to read from:
424
425 open(STATUS, "netstat -an 2>&1 |")
426 || die "can't fork: $!";
427 while (<STATUS>) {
428 next if /^(tcp|udp)/;
429 print;
54310121 430 }
a2eb9003 431 close STATUS || die "bad netstat: $! $?";
4633a7c4 432
433If one can be sure that a particular program is a Perl script that is
434expecting filenames in @ARGV, the clever programmer can write something
435like this:
436
5a964f20 437 % program f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile
4633a7c4 438
439and irrespective of which shell it's called from, the Perl program will
440read from the file F<f1>, the process F<cmd1>, standard input (F<tmpfile>
441in this case), the F<f2> file, the F<cmd2> command, and finally the F<f3>
442file. Pretty nifty, eh?
443
54310121 444You might notice that you could use backticks for much the
4633a7c4 445same effect as opening a pipe for reading:
446
447 print grep { !/^(tcp|udp)/ } `netstat -an 2>&1`;
448 die "bad netstat" if $?;
449
450While this is true on the surface, it's much more efficient to process the
451file one line or record at a time because then you don't have to read the
19799a22 452whole thing into memory at once. It also gives you finer control of the
4633a7c4 453whole process, letting you to kill off the child process early if you'd
454like.
455
456Be careful to check both the open() and the close() return values. If
457you're I<writing> to a pipe, you should also trap SIGPIPE. Otherwise,
458think of what happens when you start up a pipe to a command that doesn't
459exist: the open() will in all likelihood succeed (it only reflects the
460fork()'s success), but then your output will fail--spectacularly. Perl
461can't know whether the command worked because your command is actually
462running in a separate process whose exec() might have failed. Therefore,
6a3992aa 463while readers of bogus commands return just a quick end of file, writers
4633a7c4 464to bogus command will trigger a signal they'd better be prepared to
465handle. Consider:
466
5a964f20 467 open(FH, "|bogus") or die "can't fork: $!";
468 print FH "bang\n" or die "can't write: $!";
469 close FH or die "can't close: $!";
470
471That won't blow up until the close, and it will blow up with a SIGPIPE.
472To catch it, you could use this:
473
474 $SIG{PIPE} = 'IGNORE';
475 open(FH, "|bogus") or die "can't fork: $!";
476 print FH "bang\n" or die "can't write: $!";
477 close FH or die "can't close: status=$?";
4633a7c4 478
68dc0745 479=head2 Filehandles
480
5a964f20 481Both the main process and any child processes it forks share the same
482STDIN, STDOUT, and STDERR filehandles. If both processes try to access
45bc9206 483them at once, strange things can happen. You may also want to close
5a964f20 484or reopen the filehandles for the child. You can get around this by
485opening your pipe with open(), but on some systems this means that the
486child process cannot outlive the parent.
68dc0745 487
488=head2 Background Processes
489
490You can run a command in the background with:
491
7b05b7e3 492 system("cmd &");
68dc0745 493
494The command's STDOUT and STDERR (and possibly STDIN, depending on your
495shell) will be the same as the parent's. You won't need to catch
496SIGCHLD because of the double-fork taking place (see below for more
497details).
498
499=head2 Complete Dissociation of Child from Parent
500
501In some cases (starting server processes, for instance) you'll want to
893af57a 502completely dissociate the child process from the parent. This is
503often called daemonization. A well behaved daemon will also chdir()
504to the root directory (so it doesn't prevent unmounting the filesystem
505containing the directory from which it was launched) and redirect its
506standard file descriptors from and to F</dev/null> (so that random
507output doesn't wind up on the user's terminal).
508
509 use POSIX 'setsid';
510
511 sub daemonize {
512 chdir '/' or die "Can't chdir to /: $!";
513 open STDIN, '/dev/null' or die "Can't read /dev/null: $!";
514 open STDOUT, '>/dev/null'
515 or die "Can't write to /dev/null: $!";
516 defined(my $pid = fork) or die "Can't fork: $!";
517 exit if $pid;
518 setsid or die "Can't start a new session: $!";
519 open STDERR, '>&STDOUT' or die "Can't dup stdout: $!";
520 }
5a964f20 521
893af57a 522The fork() has to come before the setsid() to ensure that you aren't a
523process group leader (the setsid() will fail if you are). If your
524system doesn't have the setsid() function, open F</dev/tty> and use the
525C<TIOCNOTTY> ioctl() on it instead. See L<tty(4)> for details.
5a964f20 526
893af57a 527Non-Unix users should check their Your_OS::Process module for other
528solutions.
68dc0745 529
4633a7c4 530=head2 Safe Pipe Opens
531
532Another interesting approach to IPC is making your single program go
533multiprocess and communicate between (or even amongst) yourselves. The
534open() function will accept a file argument of either C<"-|"> or C<"|-">
535to do a very interesting thing: it forks a child connected to the
536filehandle you've opened. The child is running the same program as the
537parent. This is useful for safely opening a file when running under an
538assumed UID or GID, for example. If you open a pipe I<to> minus, you can
539write to the filehandle you opened and your kid will find it in his
540STDIN. If you open a pipe I<from> minus, you can read from the filehandle
541you opened whatever your kid writes to his STDOUT.
542
a1ce9542 543 use English '-no_match_vars';
4633a7c4 544 my $sleep_count = 0;
545
54310121 546 do {
c07a80fd 547 $pid = open(KID_TO_WRITE, "|-");
4633a7c4 548 unless (defined $pid) {
549 warn "cannot fork: $!";
550 die "bailing out" if $sleep_count++ > 6;
551 sleep 10;
54310121 552 }
4633a7c4 553 } until defined $pid;
554
555 if ($pid) { # parent
c07a80fd 556 print KID_TO_WRITE @some_data;
557 close(KID_TO_WRITE) || warn "kid exited $?";
4633a7c4 558 } else { # child
559 ($EUID, $EGID) = ($UID, $GID); # suid progs only
54310121 560 open (FILE, "> /safe/file")
4633a7c4 561 || die "can't open /safe/file: $!";
562 while (<STDIN>) {
563 print FILE; # child's STDIN is parent's KID
54310121 564 }
4633a7c4 565 exit; # don't forget this
54310121 566 }
4633a7c4 567
568Another common use for this construct is when you need to execute
569something without the shell's interference. With system(), it's
54310121 570straightforward, but you can't use a pipe open or backticks safely.
4633a7c4 571That's because there's no way to stop the shell from getting its hands on
572your arguments. Instead, use lower-level control to call exec() directly.
573
54310121 574Here's a safe backtick or pipe open for read:
4633a7c4 575
576 # add error processing as above
c07a80fd 577 $pid = open(KID_TO_READ, "-|");
4633a7c4 578
579 if ($pid) { # parent
c07a80fd 580 while (<KID_TO_READ>) {
4633a7c4 581 # do something interesting
54310121 582 }
c07a80fd 583 close(KID_TO_READ) || warn "kid exited $?";
4633a7c4 584
585 } else { # child
586 ($EUID, $EGID) = ($UID, $GID); # suid only
587 exec($program, @options, @args)
588 || die "can't exec program: $!";
589 # NOTREACHED
54310121 590 }
4633a7c4 591
592
593And here's a safe pipe open for writing:
594
595 # add error processing as above
c07a80fd 596 $pid = open(KID_TO_WRITE, "|-");
76c0e0db 597 $SIG{PIPE} = sub { die "whoops, $program pipe broke" };
4633a7c4 598
599 if ($pid) { # parent
600 for (@data) {
c07a80fd 601 print KID_TO_WRITE;
54310121 602 }
c07a80fd 603 close(KID_TO_WRITE) || warn "kid exited $?";
4633a7c4 604
605 } else { # child
606 ($EUID, $EGID) = ($UID, $GID);
607 exec($program, @options, @args)
608 || die "can't exec program: $!";
609 # NOTREACHED
54310121 610 }
4633a7c4 611
307eac13 612Since Perl 5.8.0, you can also use the list form of C<open> for pipes :
613the syntax
614
615 open KID_PS, "-|", "ps", "aux" or die $!;
616
617forks the ps(1) command (without spawning a shell, as there are more than
618three arguments to open()), and reads its standard output via the
ca585e4d 619C<KID_PS> filehandle. The corresponding syntax to read from command
620pipes (with C<"|-"> in place of C<"-|">) is also implemented.
307eac13 621
4633a7c4 622Note that these operations are full Unix forks, which means they may not be
623correctly implemented on alien systems. Additionally, these are not true
54310121 624multithreading. If you'd like to learn more about threading, see the
184e9718 625F<modules> file mentioned below in the SEE ALSO section.
4633a7c4 626
7b05b7e3 627=head2 Bidirectional Communication with Another Process
4633a7c4 628
629While this works reasonably well for unidirectional communication, what
630about bidirectional communication? The obvious thing you'd like to do
631doesn't actually work:
632
c07a80fd 633 open(PROG_FOR_READING_AND_WRITING, "| some program |")
4633a7c4 634
9f1b1f2d 635and if you forget to use the C<use warnings> pragma or the B<-w> flag,
636then you'll miss out entirely on the diagnostic message:
4633a7c4 637
638 Can't do bidirectional pipe at -e line 1.
639
640If you really want to, you can use the standard open2() library function
7b05b7e3 641to catch both ends. There's also an open3() for tridirectional I/O so you
4633a7c4 642can also catch your child's STDERR, but doing so would then require an
643awkward select() loop and wouldn't allow you to use normal Perl input
644operations.
645
646If you look at its source, you'll see that open2() uses low-level
5a964f20 647primitives like Unix pipe() and exec() calls to create all the connections.
4633a7c4 648While it might have been slightly more efficient by using socketpair(), it
649would have then been even less portable than it already is. The open2()
650and open3() functions are unlikely to work anywhere except on a Unix
651system or some other one purporting to be POSIX compliant.
652
653Here's an example of using open2():
654
655 use FileHandle;
656 use IPC::Open2;
5a964f20 657 $pid = open2(*Reader, *Writer, "cat -u -n" );
4633a7c4 658 print Writer "stuff\n";
659 $got = <Reader>;
660
6a3992aa 661The problem with this is that Unix buffering is really going to
662ruin your day. Even though your C<Writer> filehandle is auto-flushed,
4633a7c4 663and the process on the other end will get your data in a timely manner,
6a3992aa 664you can't usually do anything to force it to give it back to you
54310121 665in a similarly quick fashion. In this case, we could, because we
4633a7c4 666gave I<cat> a B<-u> flag to make it unbuffered. But very few Unix
667commands are designed to operate over pipes, so this seldom works
54310121 668unless you yourself wrote the program on the other end of the
4633a7c4 669double-ended pipe.
670
54310121 671A solution to this is the nonstandard F<Comm.pl> library. It uses
4633a7c4 672pseudo-ttys to make your program behave more reasonably:
673
674 require 'Comm.pl';
675 $ph = open_proc('cat -n');
676 for (1..10) {
677 print $ph "a line\n";
678 print "got back ", scalar <$ph>;
679 }
a0d0e21e 680
4633a7c4 681This way you don't have to have control over the source code of the
54310121 682program you're using. The F<Comm> library also has expect()
683and interact() functions. Find the library (and we hope its
4633a7c4 684successor F<IPC::Chat>) at your nearest CPAN archive as detailed
184e9718 685in the SEE ALSO section below.
a0d0e21e 686
c8db1d39 687The newer Expect.pm module from CPAN also addresses this kind of thing.
688This module requires two other modules from CPAN: IO::Pty and IO::Stty.
689It sets up a pseudo-terminal to interact with programs that insist on
a11adca0 690using talking to the terminal device driver. If your system is
c8db1d39 691amongst those supported, this may be your best bet.
692
5a964f20 693=head2 Bidirectional Communication with Yourself
694
695If you want, you may make low-level pipe() and fork()
696to stitch this together by hand. This example only
697talks to itself, but you could reopen the appropriate
698handles to STDIN and STDOUT and call other processes.
699
700 #!/usr/bin/perl -w
701 # pipe1 - bidirectional communication using two pipe pairs
702 # designed for the socketpair-challenged
703 use IO::Handle; # thousands of lines just for autoflush :-(
704 pipe(PARENT_RDR, CHILD_WTR); # XXX: failure?
705 pipe(CHILD_RDR, PARENT_WTR); # XXX: failure?
706 CHILD_WTR->autoflush(1);
707 PARENT_WTR->autoflush(1);
708
709 if ($pid = fork) {
710 close PARENT_RDR; close PARENT_WTR;
711 print CHILD_WTR "Parent Pid $$ is sending this\n";
712 chomp($line = <CHILD_RDR>);
713 print "Parent Pid $$ just read this: `$line'\n";
714 close CHILD_RDR; close CHILD_WTR;
715 waitpid($pid,0);
716 } else {
717 die "cannot fork: $!" unless defined $pid;
718 close CHILD_RDR; close CHILD_WTR;
719 chomp($line = <PARENT_RDR>);
720 print "Child Pid $$ just read this: `$line'\n";
721 print PARENT_WTR "Child Pid $$ is sending this\n";
722 close PARENT_RDR; close PARENT_WTR;
723 exit;
724 }
725
a11adca0 726But you don't actually have to make two pipe calls. If you
5a964f20 727have the socketpair() system call, it will do this all for you.
728
729 #!/usr/bin/perl -w
730 # pipe2 - bidirectional communication using socketpair
731 # "the best ones always go both ways"
732
733 use Socket;
734 use IO::Handle; # thousands of lines just for autoflush :-(
735 # We say AF_UNIX because although *_LOCAL is the
736 # POSIX 1003.1g form of the constant, many machines
737 # still don't have it.
738 socketpair(CHILD, PARENT, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
739 or die "socketpair: $!";
740
741 CHILD->autoflush(1);
742 PARENT->autoflush(1);
743
744 if ($pid = fork) {
745 close PARENT;
746 print CHILD "Parent Pid $$ is sending this\n";
747 chomp($line = <CHILD>);
748 print "Parent Pid $$ just read this: `$line'\n";
749 close CHILD;
750 waitpid($pid,0);
751 } else {
752 die "cannot fork: $!" unless defined $pid;
753 close CHILD;
754 chomp($line = <PARENT>);
755 print "Child Pid $$ just read this: `$line'\n";
756 print PARENT "Child Pid $$ is sending this\n";
757 close PARENT;
758 exit;
759 }
760
4633a7c4 761=head1 Sockets: Client/Server Communication
a0d0e21e 762
6a3992aa 763While not limited to Unix-derived operating systems (e.g., WinSock on PCs
4633a7c4 764provides socket support, as do some VMS libraries), you may not have
184e9718 765sockets on your system, in which case this section probably isn't going to do
6a3992aa 766you much good. With sockets, you can do both virtual circuits (i.e., TCP
767streams) and datagrams (i.e., UDP packets). You may be able to do even more
4633a7c4 768depending on your system.
769
770The Perl function calls for dealing with sockets have the same names as
771the corresponding system calls in C, but their arguments tend to differ
772for two reasons: first, Perl filehandles work differently than C file
773descriptors. Second, Perl already knows the length of its strings, so you
774don't need to pass that information.
a0d0e21e 775
4633a7c4 776One of the major problems with old socket code in Perl was that it used
777hard-coded values for some of the constants, which severely hurt
778portability. If you ever see code that does anything like explicitly
779setting C<$AF_INET = 2>, you know you're in for big trouble: An
780immeasurably superior approach is to use the C<Socket> module, which more
781reliably grants access to various constants and functions you'll need.
a0d0e21e 782
68dc0745 783If you're not writing a server/client for an existing protocol like
784NNTP or SMTP, you should give some thought to how your server will
785know when the client has finished talking, and vice-versa. Most
786protocols are based on one-line messages and responses (so one party
4a6725af 787knows the other has finished when a "\n" is received) or multi-line
68dc0745 788messages and responses that end with a period on an empty line
789("\n.\n" terminates a message/response).
790
5a964f20 791=head2 Internet Line Terminators
792
793The Internet line terminator is "\015\012". Under ASCII variants of
794Unix, that could usually be written as "\r\n", but under other systems,
795"\r\n" might at times be "\015\015\012", "\012\012\015", or something
796completely different. The standards specify writing "\015\012" to be
797conformant (be strict in what you provide), but they also recommend
798accepting a lone "\012" on input (but be lenient in what you require).
799We haven't always been very good about that in the code in this manpage,
800but unless you're on a Mac, you'll probably be ok.
801
4633a7c4 802=head2 Internet TCP Clients and Servers
a0d0e21e 803
4633a7c4 804Use Internet-domain sockets when you want to do client-server
805communication that might extend to machines outside of your own system.
806
807Here's a sample TCP client using Internet-domain sockets:
808
809 #!/usr/bin/perl -w
4633a7c4 810 use strict;
811 use Socket;
812 my ($remote,$port, $iaddr, $paddr, $proto, $line);
813
814 $remote = shift || 'localhost';
815 $port = shift || 2345; # random port
816 if ($port =~ /\D/) { $port = getservbyname($port, 'tcp') }
817 die "No port" unless $port;
818 $iaddr = inet_aton($remote) || die "no host: $remote";
819 $paddr = sockaddr_in($port, $iaddr);
820
821 $proto = getprotobyname('tcp');
822 socket(SOCK, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
823 connect(SOCK, $paddr) || die "connect: $!";
54310121 824 while (defined($line = <SOCK>)) {
4633a7c4 825 print $line;
54310121 826 }
4633a7c4 827
828 close (SOCK) || die "close: $!";
829 exit;
830
831And here's a corresponding server to go along with it. We'll
832leave the address as INADDR_ANY so that the kernel can choose
54310121 833the appropriate interface on multihomed hosts. If you want sit
c07a80fd 834on a particular interface (like the external side of a gateway
835or firewall machine), you should fill this in with your real address
836instead.
837
838 #!/usr/bin/perl -Tw
c07a80fd 839 use strict;
840 BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
841 use Socket;
842 use Carp;
5865a7df 843 my $EOL = "\015\012";
c07a80fd 844
54310121 845 sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }
c07a80fd 846
847 my $port = shift || 2345;
848 my $proto = getprotobyname('tcp');
51ee6500 849
5865a7df 850 ($port) = $port =~ /^(\d+)$/ or die "invalid port";
6a3992aa 851
c07a80fd 852 socket(Server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
54310121 853 setsockopt(Server, SOL_SOCKET, SO_REUSEADDR,
c07a80fd 854 pack("l", 1)) || die "setsockopt: $!";
855 bind(Server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
856 listen(Server,SOMAXCONN) || die "listen: $!";
857
858 logmsg "server started on port $port";
859
860 my $paddr;
861
862 $SIG{CHLD} = \&REAPER;
863
864 for ( ; $paddr = accept(Client,Server); close Client) {
865 my($port,$iaddr) = sockaddr_in($paddr);
866 my $name = gethostbyaddr($iaddr,AF_INET);
867
54310121 868 logmsg "connection from $name [",
869 inet_ntoa($iaddr), "]
c07a80fd 870 at port $port";
871
54310121 872 print Client "Hello there, $name, it's now ",
5a964f20 873 scalar localtime, $EOL;
54310121 874 }
c07a80fd 875
54310121 876And here's a multithreaded version. It's multithreaded in that
877like most typical servers, it spawns (forks) a slave server to
c07a80fd 878handle the client request so that the master server can quickly
879go back to service a new client.
4633a7c4 880
881 #!/usr/bin/perl -Tw
4633a7c4 882 use strict;
883 BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
a0d0e21e 884 use Socket;
4633a7c4 885 use Carp;
5865a7df 886 my $EOL = "\015\012";
a0d0e21e 887
4633a7c4 888 sub spawn; # forward declaration
54310121 889 sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }
a0d0e21e 890
4633a7c4 891 my $port = shift || 2345;
892 my $proto = getprotobyname('tcp');
51ee6500 893
5865a7df 894 ($port) = $port =~ /^(\d+)$/ or die "invalid port";
54310121 895
c07a80fd 896 socket(Server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
54310121 897 setsockopt(Server, SOL_SOCKET, SO_REUSEADDR,
c07a80fd 898 pack("l", 1)) || die "setsockopt: $!";
899 bind(Server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
900 listen(Server,SOMAXCONN) || die "listen: $!";
a0d0e21e 901
4633a7c4 902 logmsg "server started on port $port";
a0d0e21e 903
4633a7c4 904 my $waitedpid = 0;
905 my $paddr;
a0d0e21e 906
816229cf 907 use POSIX ":sys_wait_h";
54310121 908 sub REAPER {
816229cf 909 my $child;
910 while (($waitedpid = waitpid(-1,WNOHANG)) > 0) {
911 logmsg "reaped $waitedpid" . ($? ? " with exit $?" : '');
912 }
6a3992aa 913 $SIG{CHLD} = \&REAPER; # loathe sysV
4633a7c4 914 }
915
916 $SIG{CHLD} = \&REAPER;
917
54310121 918 for ( $waitedpid = 0;
919 ($paddr = accept(Client,Server)) || $waitedpid;
920 $waitedpid = 0, close Client)
4633a7c4 921 {
6a3992aa 922 next if $waitedpid and not $paddr;
4633a7c4 923 my($port,$iaddr) = sockaddr_in($paddr);
924 my $name = gethostbyaddr($iaddr,AF_INET);
925
54310121 926 logmsg "connection from $name [",
927 inet_ntoa($iaddr), "]
4633a7c4 928 at port $port";
a0d0e21e 929
54310121 930 spawn sub {
b921b357 931 $|=1;
5a964f20 932 print "Hello there, $name, it's now ", scalar localtime, $EOL;
933 exec '/usr/games/fortune' # XXX: `wrong' line terminators
4633a7c4 934 or confess "can't exec fortune: $!";
935 };
a0d0e21e 936
54310121 937 }
a0d0e21e 938
4633a7c4 939 sub spawn {
940 my $coderef = shift;
a0d0e21e 941
54310121 942 unless (@_ == 0 && $coderef && ref($coderef) eq 'CODE') {
4633a7c4 943 confess "usage: spawn CODEREF";
a0d0e21e 944 }
4633a7c4 945
946 my $pid;
947 if (!defined($pid = fork)) {
948 logmsg "cannot fork: $!";
949 return;
950 } elsif ($pid) {
951 logmsg "begat $pid";
6a3992aa 952 return; # I'm the parent
4633a7c4 953 }
6a3992aa 954 # else I'm the child -- go spawn
4633a7c4 955
c07a80fd 956 open(STDIN, "<&Client") || die "can't dup client to stdin";
957 open(STDOUT, ">&Client") || die "can't dup client to stdout";
4633a7c4 958 ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
959 exit &$coderef();
54310121 960 }
4633a7c4 961
962This server takes the trouble to clone off a child version via fork() for
963each incoming request. That way it can handle many requests at once,
964which you might not always want. Even if you don't fork(), the listen()
965will allow that many pending connections. Forking servers have to be
966particularly careful about cleaning up their dead children (called
967"zombies" in Unix parlance), because otherwise you'll quickly fill up your
968process table.
969
970We suggest that you use the B<-T> flag to use taint checking (see L<perlsec>)
971even if we aren't running setuid or setgid. This is always a good idea
972for servers and other programs run on behalf of someone else (like CGI
973scripts), because it lessens the chances that people from the outside will
974be able to compromise your system.
975
976Let's look at another TCP client. This one connects to the TCP "time"
977service on a number of different machines and shows how far their clocks
978differ from the system on which it's being run:
979
980 #!/usr/bin/perl -w
4633a7c4 981 use strict;
982 use Socket;
983
984 my $SECS_of_70_YEARS = 2208988800;
54310121 985 sub ctime { scalar localtime(shift) }
4633a7c4 986
54310121 987 my $iaddr = gethostbyname('localhost');
988 my $proto = getprotobyname('tcp');
989 my $port = getservbyname('time', 'tcp');
4633a7c4 990 my $paddr = sockaddr_in(0, $iaddr);
991 my($host);
992
993 $| = 1;
994 printf "%-24s %8s %s\n", "localhost", 0, ctime(time());
995
996 foreach $host (@ARGV) {
997 printf "%-24s ", $host;
998 my $hisiaddr = inet_aton($host) || die "unknown host";
999 my $hispaddr = sockaddr_in($port, $hisiaddr);
1000 socket(SOCKET, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
1001 connect(SOCKET, $hispaddr) || die "bind: $!";
1002 my $rtime = ' ';
1003 read(SOCKET, $rtime, 4);
1004 close(SOCKET);
1005 my $histime = unpack("N", $rtime) - $SECS_of_70_YEARS ;
1006 printf "%8d %s\n", $histime - time, ctime($histime);
a0d0e21e 1007 }
1008
4633a7c4 1009=head2 Unix-Domain TCP Clients and Servers
1010
a2eb9003 1011That's fine for Internet-domain clients and servers, but what about local
4633a7c4 1012communications? While you can use the same setup, sometimes you don't
1013want to. Unix-domain sockets are local to the current host, and are often
54310121 1014used internally to implement pipes. Unlike Internet domain sockets, Unix
4633a7c4 1015domain sockets can show up in the file system with an ls(1) listing.
1016
5a964f20 1017 % ls -l /dev/log
4633a7c4 1018 srw-rw-rw- 1 root 0 Oct 31 07:23 /dev/log
a0d0e21e 1019
4633a7c4 1020You can test for these with Perl's B<-S> file test:
1021
1022 unless ( -S '/dev/log' ) {
3ba19564 1023 die "something's wicked with the log system";
54310121 1024 }
4633a7c4 1025
1026Here's a sample Unix-domain client:
1027
1028 #!/usr/bin/perl -w
4633a7c4 1029 use Socket;
1030 use strict;
1031 my ($rendezvous, $line);
1032
1033 $rendezvous = shift || '/tmp/catsock';
1034 socket(SOCK, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
9607fc9c 1035 connect(SOCK, sockaddr_un($rendezvous)) || die "connect: $!";
54310121 1036 while (defined($line = <SOCK>)) {
4633a7c4 1037 print $line;
54310121 1038 }
4633a7c4 1039 exit;
1040
5a964f20 1041And here's a corresponding server. You don't have to worry about silly
1042network terminators here because Unix domain sockets are guaranteed
1043to be on the localhost, and thus everything works right.
4633a7c4 1044
1045 #!/usr/bin/perl -Tw
4633a7c4 1046 use strict;
1047 use Socket;
1048 use Carp;
1049
1050 BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
5865a7df 1051 sub spawn; # forward declaration
5a964f20 1052 sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }
4633a7c4 1053
1054 my $NAME = '/tmp/catsock';
1055 my $uaddr = sockaddr_un($NAME);
1056 my $proto = getprotobyname('tcp');
1057
c07a80fd 1058 socket(Server,PF_UNIX,SOCK_STREAM,0) || die "socket: $!";
4633a7c4 1059 unlink($NAME);
c07a80fd 1060 bind (Server, $uaddr) || die "bind: $!";
1061 listen(Server,SOMAXCONN) || die "listen: $!";
4633a7c4 1062
1063 logmsg "server started on $NAME";
1064
5a964f20 1065 my $waitedpid;
1066
816229cf 1067 use POSIX ":sys_wait_h";
5a964f20 1068 sub REAPER {
816229cf 1069 my $child;
1070 while (($waitedpid = waitpid(-1,WNOHANG)) > 0) {
1071 logmsg "reaped $waitedpid" . ($? ? " with exit $?" : '');
1072 }
5a964f20 1073 $SIG{CHLD} = \&REAPER; # loathe sysV
5a964f20 1074 }
1075
4633a7c4 1076 $SIG{CHLD} = \&REAPER;
1077
5a964f20 1078
54310121 1079 for ( $waitedpid = 0;
1080 accept(Client,Server) || $waitedpid;
1081 $waitedpid = 0, close Client)
4633a7c4 1082 {
1083 next if $waitedpid;
1084 logmsg "connection on $NAME";
54310121 1085 spawn sub {
4633a7c4 1086 print "Hello there, it's now ", scalar localtime, "\n";
1087 exec '/usr/games/fortune' or die "can't exec fortune: $!";
1088 };
54310121 1089 }
4633a7c4 1090
5865a7df 1091 sub spawn {
1092 my $coderef = shift;
1093
1094 unless (@_ == 0 && $coderef && ref($coderef) eq 'CODE') {
1095 confess "usage: spawn CODEREF";
1096 }
1097
1098 my $pid;
1099 if (!defined($pid = fork)) {
1100 logmsg "cannot fork: $!";
1101 return;
1102 } elsif ($pid) {
1103 logmsg "begat $pid";
1104 return; # I'm the parent
1105 }
1106 # else I'm the child -- go spawn
1107
1108 open(STDIN, "<&Client") || die "can't dup client to stdin";
1109 open(STDOUT, ">&Client") || die "can't dup client to stdout";
1110 ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
1111 exit &$coderef();
1112 }
1113
4633a7c4 1114As you see, it's remarkably similar to the Internet domain TCP server, so
1115much so, in fact, that we've omitted several duplicate functions--spawn(),
1116logmsg(), ctime(), and REAPER()--which are exactly the same as in the
1117other server.
1118
1119So why would you ever want to use a Unix domain socket instead of a
1120simpler named pipe? Because a named pipe doesn't give you sessions. You
1121can't tell one process's data from another's. With socket programming,
1122you get a separate session for each client: that's why accept() takes two
1123arguments.
1124
1125For example, let's say that you have a long running database server daemon
1126that you want folks from the World Wide Web to be able to access, but only
1127if they go through a CGI interface. You'd have a small, simple CGI
1128program that does whatever checks and logging you feel like, and then acts
1129as a Unix-domain client and connects to your private server.
1130
7b05b7e3 1131=head1 TCP Clients with IO::Socket
1132
1133For those preferring a higher-level interface to socket programming, the
1134IO::Socket module provides an object-oriented approach. IO::Socket is
1135included as part of the standard Perl distribution as of the 5.004
1136release. If you're running an earlier version of Perl, just fetch
106325ad 1137IO::Socket from CPAN, where you'll also find modules providing easy
7b05b7e3 1138interfaces to the following systems: DNS, FTP, Ident (RFC 931), NIS and
1139NISPlus, NNTP, Ping, POP3, SMTP, SNMP, SSLeay, Telnet, and Time--just
1140to name a few.
1141
1142=head2 A Simple Client
1143
1144Here's a client that creates a TCP connection to the "daytime"
1145service at port 13 of the host name "localhost" and prints out everything
1146that the server there cares to provide.
1147
1148 #!/usr/bin/perl -w
1149 use IO::Socket;
1150 $remote = IO::Socket::INET->new(
1151 Proto => "tcp",
1152 PeerAddr => "localhost",
1153 PeerPort => "daytime(13)",
1154 )
1155 or die "cannot connect to daytime port at localhost";
1156 while ( <$remote> ) { print }
1157
1158When you run this program, you should get something back that
1159looks like this:
1160
1161 Wed May 14 08:40:46 MDT 1997
1162
1163Here are what those parameters to the C<new> constructor mean:
1164
13a2d996 1165=over 4
7b05b7e3 1166
1167=item C<Proto>
1168
1169This is which protocol to use. In this case, the socket handle returned
1170will be connected to a TCP socket, because we want a stream-oriented
1171connection, that is, one that acts pretty much like a plain old file.
1172Not all sockets are this of this type. For example, the UDP protocol
1173can be used to make a datagram socket, used for message-passing.
1174
1175=item C<PeerAddr>
1176
1177This is the name or Internet address of the remote host the server is
1178running on. We could have specified a longer name like C<"www.perl.com">,
1179or an address like C<"204.148.40.9">. For demonstration purposes, we've
1180used the special hostname C<"localhost">, which should always mean the
1181current machine you're running on. The corresponding Internet address
1182for localhost is C<"127.1">, if you'd rather use that.
1183
1184=item C<PeerPort>
1185
1186This is the service name or port number we'd like to connect to.
1187We could have gotten away with using just C<"daytime"> on systems with a
1188well-configured system services file,[FOOTNOTE: The system services file
1189is in I</etc/services> under Unix] but just in case, we've specified the
1190port number (13) in parentheses. Using just the number would also have
1191worked, but constant numbers make careful programmers nervous.
1192
1193=back
1194
1195Notice how the return value from the C<new> constructor is used as
1196a filehandle in the C<while> loop? That's what's called an indirect
1197filehandle, a scalar variable containing a filehandle. You can use
1198it the same way you would a normal filehandle. For example, you
1199can read one line from it this way:
1200
1201 $line = <$handle>;
1202
1203all remaining lines from is this way:
1204
1205 @lines = <$handle>;
1206
1207and send a line of data to it this way:
1208
1209 print $handle "some data\n";
1210
1211=head2 A Webget Client
1212
1213Here's a simple client that takes a remote host to fetch a document
1214from, and then a list of documents to get from that host. This is a
1215more interesting client than the previous one because it first sends
1216something to the server before fetching the server's response.
1217
1218 #!/usr/bin/perl -w
1219 use IO::Socket;
1220 unless (@ARGV > 1) { die "usage: $0 host document ..." }
1221 $host = shift(@ARGV);
5a964f20 1222 $EOL = "\015\012";
1223 $BLANK = $EOL x 2;
7b05b7e3 1224 foreach $document ( @ARGV ) {
1225 $remote = IO::Socket::INET->new( Proto => "tcp",
1226 PeerAddr => $host,
1227 PeerPort => "http(80)",
1228 );
1229 unless ($remote) { die "cannot connect to http daemon on $host" }
1230 $remote->autoflush(1);
5a964f20 1231 print $remote "GET $document HTTP/1.0" . $BLANK;
7b05b7e3 1232 while ( <$remote> ) { print }
1233 close $remote;
1234 }
1235
1236The web server handing the "http" service, which is assumed to be at
4375e838 1237its standard port, number 80. If the web server you're trying to
7b05b7e3 1238connect to is at a different port (like 1080 or 8080), you should specify
c47ff5f1 1239as the named-parameter pair, C<< PeerPort => 8080 >>. The C<autoflush>
7b05b7e3 1240method is used on the socket because otherwise the system would buffer
1241up the output we sent it. (If you're on a Mac, you'll also need to
1242change every C<"\n"> in your code that sends data over the network to
1243be a C<"\015\012"> instead.)
1244
1245Connecting to the server is only the first part of the process: once you
1246have the connection, you have to use the server's language. Each server
1247on the network has its own little command language that it expects as
1248input. The string that we send to the server starting with "GET" is in
1249HTTP syntax. In this case, we simply request each specified document.
1250Yes, we really are making a new connection for each document, even though
1251it's the same host. That's the way you always used to have to speak HTTP.
1252Recent versions of web browsers may request that the remote server leave
1253the connection open a little while, but the server doesn't have to honor
1254such a request.
1255
1256Here's an example of running that program, which we'll call I<webget>:
1257
5a964f20 1258 % webget www.perl.com /guanaco.html
7b05b7e3 1259 HTTP/1.1 404 File Not Found
1260 Date: Thu, 08 May 1997 18:02:32 GMT
1261 Server: Apache/1.2b6
1262 Connection: close
1263 Content-type: text/html
1264
1265 <HEAD><TITLE>404 File Not Found</TITLE></HEAD>
1266 <BODY><H1>File Not Found</H1>
1267 The requested URL /guanaco.html was not found on this server.<P>
1268 </BODY>
1269
1270Ok, so that's not very interesting, because it didn't find that
1271particular document. But a long response wouldn't have fit on this page.
1272
1273For a more fully-featured version of this program, you should look to
1274the I<lwp-request> program included with the LWP modules from CPAN.
1275
1276=head2 Interactive Client with IO::Socket
1277
1278Well, that's all fine if you want to send one command and get one answer,
1279but what about setting up something fully interactive, somewhat like
1280the way I<telnet> works? That way you can type a line, get the answer,
1281type a line, get the answer, etc.
1282
1283This client is more complicated than the two we've done so far, but if
1284you're on a system that supports the powerful C<fork> call, the solution
1285isn't that rough. Once you've made the connection to whatever service
1286you'd like to chat with, call C<fork> to clone your process. Each of
1287these two identical process has a very simple job to do: the parent
1288copies everything from the socket to standard output, while the child
1289simultaneously copies everything from standard input to the socket.
1290To accomplish the same thing using just one process would be I<much>
1291harder, because it's easier to code two processes to do one thing than it
1292is to code one process to do two things. (This keep-it-simple principle
5a964f20 1293a cornerstones of the Unix philosophy, and good software engineering as
1294well, which is probably why it's spread to other systems.)
7b05b7e3 1295
1296Here's the code:
1297
1298 #!/usr/bin/perl -w
1299 use strict;
1300 use IO::Socket;
1301 my ($host, $port, $kidpid, $handle, $line);
1302
1303 unless (@ARGV == 2) { die "usage: $0 host port" }
1304 ($host, $port) = @ARGV;
1305
1306 # create a tcp connection to the specified host and port
1307 $handle = IO::Socket::INET->new(Proto => "tcp",
1308 PeerAddr => $host,
1309 PeerPort => $port)
1310 or die "can't connect to port $port on $host: $!";
1311
1312 $handle->autoflush(1); # so output gets there right away
1313 print STDERR "[Connected to $host:$port]\n";
1314
1315 # split the program into two processes, identical twins
1316 die "can't fork: $!" unless defined($kidpid = fork());
1317
1318 # the if{} block runs only in the parent process
1319 if ($kidpid) {
1320 # copy the socket to standard output
1321 while (defined ($line = <$handle>)) {
1322 print STDOUT $line;
1323 }
1324 kill("TERM", $kidpid); # send SIGTERM to child
1325 }
1326 # the else{} block runs only in the child process
1327 else {
1328 # copy standard input to the socket
1329 while (defined ($line = <STDIN>)) {
1330 print $handle $line;
1331 }
1332 }
1333
1334The C<kill> function in the parent's C<if> block is there to send a
1335signal to our child process (current running in the C<else> block)
1336as soon as the remote server has closed its end of the connection.
1337
7b05b7e3 1338If the remote server sends data a byte at time, and you need that
1339data immediately without waiting for a newline (which might not happen),
1340you may wish to replace the C<while> loop in the parent with the
1341following:
1342
1343 my $byte;
1344 while (sysread($handle, $byte, 1) == 1) {
1345 print STDOUT $byte;
1346 }
1347
1348Making a system call for each byte you want to read is not very efficient
1349(to put it mildly) but is the simplest to explain and works reasonably
1350well.
1351
1352=head1 TCP Servers with IO::Socket
1353
5a964f20 1354As always, setting up a server is little bit more involved than running a client.
7b05b7e3 1355The model is that the server creates a special kind of socket that
1356does nothing but listen on a particular port for incoming connections.
c47ff5f1 1357It does this by calling the C<< IO::Socket::INET->new() >> method with
7b05b7e3 1358slightly different arguments than the client did.
1359
13a2d996 1360=over 4
7b05b7e3 1361
1362=item Proto
1363
1364This is which protocol to use. Like our clients, we'll
1365still specify C<"tcp"> here.
1366
1367=item LocalPort
1368
1369We specify a local
1370port in the C<LocalPort> argument, which we didn't do for the client.
1371This is service name or port number for which you want to be the
1372server. (Under Unix, ports under 1024 are restricted to the
1373superuser.) In our sample, we'll use port 9000, but you can use
1374any port that's not currently in use on your system. If you try
1375to use one already in used, you'll get an "Address already in use"
19799a22 1376message. Under Unix, the C<netstat -a> command will show
7b05b7e3 1377which services current have servers.
1378
1379=item Listen
1380
1381The C<Listen> parameter is set to the maximum number of
1382pending connections we can accept until we turn away incoming clients.
1383Think of it as a call-waiting queue for your telephone.
1384The low-level Socket module has a special symbol for the system maximum, which
1385is SOMAXCONN.
1386
1387=item Reuse
1388
1389The C<Reuse> parameter is needed so that we restart our server
1390manually without waiting a few minutes to allow system buffers to
1391clear out.
1392
1393=back
1394
1395Once the generic server socket has been created using the parameters
1396listed above, the server then waits for a new client to connect
d1be9408 1397to it. The server blocks in the C<accept> method, which eventually accepts a
1398bidirectional connection from the remote client. (Make sure to autoflush
7b05b7e3 1399this handle to circumvent buffering.)
1400
1401To add to user-friendliness, our server prompts the user for commands.
1402Most servers don't do this. Because of the prompt without a newline,
1403you'll have to use the C<sysread> variant of the interactive client above.
1404
1405This server accepts one of five different commands, sending output
1406back to the client. Note that unlike most network servers, this one
1407only handles one incoming client at a time. Multithreaded servers are
f83494b9 1408covered in Chapter 6 of the Camel.
7b05b7e3 1409
1410Here's the code. We'll
1411
1412 #!/usr/bin/perl -w
1413 use IO::Socket;
1414 use Net::hostent; # for OO version of gethostbyaddr
1415
1416 $PORT = 9000; # pick something not in use
1417
1418 $server = IO::Socket::INET->new( Proto => 'tcp',
1419 LocalPort => $PORT,
1420 Listen => SOMAXCONN,
1421 Reuse => 1);
1422
1423 die "can't setup server" unless $server;
1424 print "[Server $0 accepting clients]\n";
1425
1426 while ($client = $server->accept()) {
1427 $client->autoflush(1);
1428 print $client "Welcome to $0; type help for command list.\n";
1429 $hostinfo = gethostbyaddr($client->peeraddr);
78fc38e1 1430 printf "[Connect from %s]\n", $hostinfo ? $hostinfo->name : $client->peerhost;
7b05b7e3 1431 print $client "Command? ";
1432 while ( <$client>) {
1433 next unless /\S/; # blank line
1434 if (/quit|exit/i) { last; }
1435 elsif (/date|time/i) { printf $client "%s\n", scalar localtime; }
1436 elsif (/who/i ) { print $client `who 2>&1`; }
1437 elsif (/cookie/i ) { print $client `/usr/games/fortune 2>&1`; }
1438 elsif (/motd/i ) { print $client `cat /etc/motd 2>&1`; }
1439 else {
1440 print $client "Commands: quit date who cookie motd\n";
1441 }
1442 } continue {
1443 print $client "Command? ";
1444 }
1445 close $client;
1446 }
1447
1448=head1 UDP: Message Passing
4633a7c4 1449
1450Another kind of client-server setup is one that uses not connections, but
1451messages. UDP communications involve much lower overhead but also provide
1452less reliability, as there are no promises that messages will arrive at
1453all, let alone in order and unmangled. Still, UDP offers some advantages
1454over TCP, including being able to "broadcast" or "multicast" to a whole
1455bunch of destination hosts at once (usually on your local subnet). If you
1456find yourself overly concerned about reliability and start building checks
6a3992aa 1457into your message system, then you probably should use just TCP to start
4633a7c4 1458with.
1459
90034919 1460Note that UDP datagrams are I<not> a bytestream and should not be treated
1461as such. This makes using I/O mechanisms with internal buffering
1462like stdio (i.e. print() and friends) especially cumbersome. Use syswrite(),
1463or better send(), like in the example below.
1464
4633a7c4 1465Here's a UDP program similar to the sample Internet TCP client given
7b05b7e3 1466earlier. However, instead of checking one host at a time, the UDP version
4633a7c4 1467will check many of them asynchronously by simulating a multicast and then
1468using select() to do a timed-out wait for I/O. To do something similar
1469with TCP, you'd have to use a different socket handle for each host.
1470
1471 #!/usr/bin/perl -w
1472 use strict;
4633a7c4 1473 use Socket;
1474 use Sys::Hostname;
1475
54310121 1476 my ( $count, $hisiaddr, $hispaddr, $histime,
1477 $host, $iaddr, $paddr, $port, $proto,
4633a7c4 1478 $rin, $rout, $rtime, $SECS_of_70_YEARS);
1479
1480 $SECS_of_70_YEARS = 2208988800;
1481
1482 $iaddr = gethostbyname(hostname());
1483 $proto = getprotobyname('udp');
1484 $port = getservbyname('time', 'udp');
1485 $paddr = sockaddr_in(0, $iaddr); # 0 means let kernel pick
1486
1487 socket(SOCKET, PF_INET, SOCK_DGRAM, $proto) || die "socket: $!";
1488 bind(SOCKET, $paddr) || die "bind: $!";
1489
1490 $| = 1;
1491 printf "%-12s %8s %s\n", "localhost", 0, scalar localtime time;
1492 $count = 0;
1493 for $host (@ARGV) {
1494 $count++;
1495 $hisiaddr = inet_aton($host) || die "unknown host";
1496 $hispaddr = sockaddr_in($port, $hisiaddr);
1497 defined(send(SOCKET, 0, 0, $hispaddr)) || die "send $host: $!";
1498 }
1499
1500 $rin = '';
1501 vec($rin, fileno(SOCKET), 1) = 1;
1502
1503 # timeout after 10.0 seconds
1504 while ($count && select($rout = $rin, undef, undef, 10.0)) {
1505 $rtime = '';
1506 ($hispaddr = recv(SOCKET, $rtime, 4, 0)) || die "recv: $!";
1507 ($port, $hisiaddr) = sockaddr_in($hispaddr);
1508 $host = gethostbyaddr($hisiaddr, AF_INET);
1509 $histime = unpack("N", $rtime) - $SECS_of_70_YEARS ;
1510 printf "%-12s ", $host;
1511 printf "%8d %s\n", $histime - time, scalar localtime($histime);
1512 $count--;
1513 }
1514
90034919 1515Note that this example does not include any retries and may consequently
1516fail to contact a reachable host. The most prominent reason for this
1517is congestion of the queues on the sending host if the number of
a31a806a 1518list of hosts to contact is sufficiently large.
90034919 1519
4633a7c4 1520=head1 SysV IPC
1521
1522While System V IPC isn't so widely used as sockets, it still has some
1523interesting uses. You can't, however, effectively use SysV IPC or
1524Berkeley mmap() to have shared memory so as to share a variable amongst
1525several processes. That's because Perl would reallocate your string when
1526you weren't wanting it to.
1527
54310121 1528Here's a small example showing shared memory usage.
a0d0e21e 1529
41d6edb2 1530 use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRWXU);
0ade1984 1531
a0d0e21e 1532 $size = 2000;
41d6edb2 1533 $id = shmget(IPC_PRIVATE, $size, S_IRWXU) || die "$!";
1534 print "shm key $id\n";
a0d0e21e 1535
1536 $message = "Message #1";
41d6edb2 1537 shmwrite($id, $message, 0, 60) || die "$!";
0ade1984 1538 print "wrote: '$message'\n";
41d6edb2 1539 shmread($id, $buff, 0, 60) || die "$!";
0ade1984 1540 print "read : '$buff'\n";
a0d0e21e 1541
0ade1984 1542 # the buffer of shmread is zero-character end-padded.
1543 substr($buff, index($buff, "\0")) = '';
1544 print "un" unless $buff eq $message;
1545 print "swell\n";
a0d0e21e 1546
41d6edb2 1547 print "deleting shm $id\n";
1548 shmctl($id, IPC_RMID, 0) || die "$!";
a0d0e21e 1549
1550Here's an example of a semaphore:
1551
0ade1984 1552 use IPC::SysV qw(IPC_CREAT);
1553
a0d0e21e 1554 $IPC_KEY = 1234;
41d6edb2 1555 $id = semget($IPC_KEY, 10, 0666 | IPC_CREAT ) || die "$!";
1556 print "shm key $id\n";
a0d0e21e 1557
a2eb9003 1558Put this code in a separate file to be run in more than one process.
a0d0e21e 1559Call the file F<take>:
1560
1561 # create a semaphore
1562
1563 $IPC_KEY = 1234;
41d6edb2 1564 $id = semget($IPC_KEY, 0 , 0 );
1565 die if !defined($id);
a0d0e21e 1566
1567 $semnum = 0;
1568 $semflag = 0;
1569
1570 # 'take' semaphore
1571 # wait for semaphore to be zero
1572 $semop = 0;
41d6edb2 1573 $opstring1 = pack("s!s!s!", $semnum, $semop, $semflag);
a0d0e21e 1574
1575 # Increment the semaphore count
1576 $semop = 1;
41d6edb2 1577 $opstring2 = pack("s!s!s!", $semnum, $semop, $semflag);
a0d0e21e 1578 $opstring = $opstring1 . $opstring2;
1579
41d6edb2 1580 semop($id,$opstring) || die "$!";
a0d0e21e 1581
a2eb9003 1582Put this code in a separate file to be run in more than one process.
a0d0e21e 1583Call this file F<give>:
1584
4633a7c4 1585 # 'give' the semaphore
a0d0e21e 1586 # run this in the original process and you will see
1587 # that the second process continues
1588
1589 $IPC_KEY = 1234;
41d6edb2 1590 $id = semget($IPC_KEY, 0, 0);
1591 die if !defined($id);
a0d0e21e 1592
1593 $semnum = 0;
1594 $semflag = 0;
1595
1596 # Decrement the semaphore count
1597 $semop = -1;
41d6edb2 1598 $opstring = pack("s!s!s!", $semnum, $semop, $semflag);
a0d0e21e 1599
41d6edb2 1600 semop($id,$opstring) || die "$!";
a0d0e21e 1601
7b05b7e3 1602The SysV IPC code above was written long ago, and it's definitely
0ade1984 1603clunky looking. For a more modern look, see the IPC::SysV module
1604which is included with Perl starting from Perl 5.005.
4633a7c4 1605
41d6edb2 1606A small example demonstrating SysV message queues:
1607
1608 use IPC::SysV qw(IPC_PRIVATE IPC_RMID IPC_CREAT S_IRWXU);
1609
1610 my $id = msgget(IPC_PRIVATE, IPC_CREAT | S_IRWXU);
1611
1612 my $sent = "message";
1613 my $type = 1234;
1614 my $rcvd;
1615 my $type_rcvd;
1616
1617 if (defined $id) {
1618 if (msgsnd($id, pack("l! a*", $type_sent, $sent), 0)) {
1619 if (msgrcv($id, $rcvd, 60, 0, 0)) {
1620 ($type_rcvd, $rcvd) = unpack("l! a*", $rcvd);
1621 if ($rcvd eq $sent) {
1622 print "okay\n";
1623 } else {
1624 print "not okay\n";
1625 }
1626 } else {
1627 die "# msgrcv failed\n";
1628 }
1629 } else {
1630 die "# msgsnd failed\n";
1631 }
1632 msgctl($id, IPC_RMID, 0) || die "# msgctl failed: $!\n";
1633 } else {
1634 die "# msgget failed\n";
1635 }
1636
4633a7c4 1637=head1 NOTES
1638
5a964f20 1639Most of these routines quietly but politely return C<undef> when they
1640fail instead of causing your program to die right then and there due to
1641an uncaught exception. (Actually, some of the new I<Socket> conversion
1642functions croak() on bad arguments.) It is therefore essential to
1643check return values from these functions. Always begin your socket
1644programs this way for optimal success, and don't forget to add B<-T>
1645taint checking flag to the #! line for servers:
4633a7c4 1646
5a964f20 1647 #!/usr/bin/perl -Tw
4633a7c4 1648 use strict;
1649 use sigtrap;
1650 use Socket;
1651
1652=head1 BUGS
1653
1654All these routines create system-specific portability problems. As noted
1655elsewhere, Perl is at the mercy of your C libraries for much of its system
1656behaviour. It's probably safest to assume broken SysV semantics for
6a3992aa 1657signals and to stick with simple TCP and UDP socket operations; e.g., don't
a2eb9003 1658try to pass open file descriptors over a local UDP datagram socket if you
4633a7c4 1659want your code to stand a chance of being portable.
1660
5a964f20 1661As mentioned in the signals section, because few vendors provide C
1662libraries that are safely re-entrant, the prudent programmer will do
1663little else within a handler beyond setting a numeric variable that
1664already exists; or, if locked into a slow (restarting) system call,
1665using die() to raise an exception and longjmp(3) out. In fact, even
1666these may in some cases cause a core dump. It's probably best to avoid
a11adca0 1667signals except where they are absolutely inevitable. This
5a964f20 1668will be addressed in a future release of Perl.
4633a7c4 1669
1670=head1 AUTHOR
1671
1672Tom Christiansen, with occasional vestiges of Larry Wall's original
7b05b7e3 1673version and suggestions from the Perl Porters.
4633a7c4 1674
1675=head1 SEE ALSO
1676
7b05b7e3 1677There's a lot more to networking than this, but this should get you
1678started.
1679
c04e1326 1680For intrepid programmers, the indispensable textbook is I<Unix
1681Network Programming, 2nd Edition, Volume 1> by W. Richard Stevens
1682(published by Prentice-Hall). Note that most books on networking
1683address the subject from the perspective of a C programmer; translation
1684to Perl is left as an exercise for the reader.
7b05b7e3 1685
1686The IO::Socket(3) manpage describes the object library, and the Socket(3)
1687manpage describes the low-level interface to sockets. Besides the obvious
1688functions in L<perlfunc>, you should also check out the F<modules> file
1689at your nearest CPAN site. (See L<perlmodlib> or best yet, the F<Perl
1690FAQ> for a description of what CPAN is and where to get it.)
1691
4633a7c4 1692Section 5 of the F<modules> file is devoted to "Networking, Device Control
6a3992aa 1693(modems), and Interprocess Communication", and contains numerous unbundled
4633a7c4 1694modules numerous networking modules, Chat and Expect operations, CGI
1695programming, DCE, FTP, IPC, NNTP, Proxy, Ptty, RPC, SNMP, SMTP, Telnet,
1696Threads, and ToolTalk--just to name a few.