=head1 NAME

perlipc - Perl interprocess communication (signals, fifos, pipes, safe subprocesses, sockets, and semaphores)

=head1 DESCRIPTION

The basic IPC facilities of Perl are built out of the good old Unix
signals, named pipes, pipe opens, the Berkeley socket routines, and SysV
IPC calls.  Each is used in slightly different situations.

=head1 Signals

Perl uses a simple signal handling model: the %SIG hash contains names
or references of user-installed signal handlers.  Each handler is
called with one argument, the name of the signal that triggered it.
A signal may be generated intentionally from a
particular keyboard sequence like control-C or control-Z, sent to you
from another process, or triggered automatically by the kernel when
special events transpire, like a child process exiting, your process
running out of stack space, or hitting a file size limit.

For example, to trap an interrupt signal, set up a handler like this:

    sub catch_zap {
        my $signame = shift;
        $shucks++;
        die "Somebody sent me a SIG$signame";
    }
    $SIG{INT} = 'catch_zap';  # could fail in modules
    $SIG{INT} = \&catch_zap;  # best strategy

Prior to Perl 5.7.3 it was necessary to do as little as you possibly
could in your handler; notice how all we do is set a global variable
and then raise an exception.  That's because on most systems,
libraries are not re-entrant; particularly, memory allocation and I/O
routines are not.  That meant that doing nearly I<anything> in your
handler could in theory trigger a memory fault and subsequent core
dump - see L</Deferred Signals (Safe Signals)> below.

The names of the signals are the ones listed out by C<kill -l> on your
system, or you can retrieve them from the Config module.  Set up an
@signame list indexed by number to get the name and a %signo table
indexed by name to get the number:

    use Config;
    defined($Config{sig_name}) || die "No sigs?";
    my $i = 0;    # signal numbers start at zero
    foreach $name (split(' ', $Config{sig_name})) {
        $signo{$name} = $i;
        $signame[$i] = $name;
        $i++;
    }

So to check whether signal 17 and SIGALRM were the same, do just this:

    print "signal #17 = $signame[17]\n";
    if ($signo{ALRM}) {
        print "SIGALRM is $signo{ALRM}\n";
    }

You may also choose to assign the strings C<'IGNORE'> or C<'DEFAULT'> as
the handler, in which case Perl will try to discard the signal or do the
default thing.

On most Unix platforms, the C<CHLD> (sometimes also known as C<CLD>) signal
has special behavior with respect to a value of C<'IGNORE'>.
Setting C<$SIG{CHLD}> to C<'IGNORE'> on such a platform has the effect of
not creating zombie processes when the parent process fails to C<wait()>
on its child processes (i.e. child processes are automatically reaped).
Calling C<wait()> with C<$SIG{CHLD}> set to C<'IGNORE'> usually returns
C<-1> on such platforms.
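
As a rough illustration of the auto-reaping behaviour just described,
the following sketch forks a child that exits at once and then calls
C<wait()>.  The one-second pause is an arbitrary choice to give the child
time to exit, and the C<-1> result is what most Unix platforms produce,
not a guarantee.

    local $SIG{CHLD} = 'IGNORE';   # ask the system to reap children for us

    defined(my $pid = fork) or die "cannot fork: $!";
    exit 0 if $pid == 0;           # child: exit immediately

    sleep 1;                       # parent: give the child time to exit
    my $reaped = wait;             # usually -1: nothing left to wait for
    print "wait() returned $reaped\n";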

Some signals can be neither trapped nor ignored, such as
the KILL and STOP (but not the TSTP) signals.  One strategy for
temporarily ignoring signals is to use a local() statement, which will be
automatically restored once your block is exited.  (Remember that local()
values are "inherited" by functions called from within that block.)

    sub precious {
        local $SIG{INT} = 'IGNORE';
        &more_functions;
    }
    sub more_functions {
        # interrupts still ignored, for now...
    }

Sending a signal to a negative process ID means that you send the signal
to the entire Unix process-group.  This code sends a hang-up signal to all
processes in the current process group (and sets $SIG{HUP} to IGNORE so
it doesn't kill itself):

    {
        local $SIG{HUP} = 'IGNORE';
        kill HUP => -$$;
        # snazzy writing of: kill('HUP', -$$)
    }

Another interesting signal to send is signal number zero.  This doesn't
actually affect a child process, but instead checks whether it's alive
or has changed its UID.

    unless (kill 0 => $kid_pid) {
        warn "something wicked happened to $kid_pid";
    }

When directed at a process whose UID is not identical to that
of the sending process, signal number zero may fail because
you lack permission to send the signal, even though the process is alive.
You may be able to determine the cause of failure using C<%!>.

    unless (kill 0 => $pid or $!{EPERM}) {
        warn "$pid looks dead";
    }
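
The two checks can be folded into one small helper.  This is only a
sketch: the name C<process_alive> is made up for illustration, and it
treats C<EPERM> as proof that the process exists.

    # Hypothetical helper: true if $pid exists, even when we lack
    # permission to signal it.
    sub process_alive {
        my ($pid) = @_;
        return 1 if kill 0 => $pid;   # signal 0 accepted: it exists
        return $!{EPERM} ? 1 : 0;     # EPERM: exists but not ours
    }

    print "I am alive\n" if process_alive($$);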

You might also want to employ anonymous functions for simple signal
handlers:

    $SIG{INT} = sub { die "\nOutta here!\n" };

But that will be problematic for the more complicated handlers that need
to reinstall themselves.  Because Perl's signal mechanism is currently
based on the signal(3) function from the C library, you may sometimes be so
misfortunate as to run on systems where that function is "broken", that
is, it behaves in the old unreliable SysV way rather than the newer, more
reasonable BSD and POSIX fashion.  So you'll see defensive people writing
signal handlers like this:

    sub REAPER {
        $waitedpid = wait;
        # loathe sysV: it makes us not only reinstate
        # the handler, but place it after the wait
        $SIG{CHLD} = \&REAPER;
    }
    $SIG{CHLD} = \&REAPER;
    # now do something that forks...

or better still:

    use POSIX ":sys_wait_h";
    sub REAPER {
        my $child;
        # If a second child dies while in the signal handler caused by the
        # first death, we won't get another signal. So must loop here else
        # we will leave the unreaped child as a zombie. And the next time
        # two children die we get another zombie. And so on.
        while (($child = waitpid(-1,WNOHANG)) > 0) {
            $Kid_Status{$child} = $?;
        }
        $SIG{CHLD} = \&REAPER;  # still loathe sysV
    }
    $SIG{CHLD} = \&REAPER;
    # do something that forks...

Signal handling is also used for timeouts in Unix.  While safely
protected within an C<eval{}> block, you set a signal handler to trap
alarm signals and then schedule to have one delivered to you in some
number of seconds.  Then try your blocking operation, clearing the alarm
when it's done but not before you've exited your C<eval{}> block.  If it
goes off, you'll use die() to jump out of the block, much as you might
using longjmp() or throw() in other languages.

Here's an example:

    eval {
        local $SIG{ALRM} = sub { die "alarm clock restart" };
        alarm 10;
        flock(FH, 2);   # blocking write lock
        alarm 0;
    };
    if ($@ and $@ !~ /alarm clock restart/) { die }

If the operation being timed out is system() or qx(), this technique
is liable to generate zombies.  If this matters to you, you'll
need to do your own fork() and exec(), and kill the errant child process.
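
One way to do that is sketched below.  The helper name
C<run_with_timeout> is hypothetical, C<$^X> (the running perl) merely
stands in for a slow command, and C<TERM> is an arbitrary choice of
signal; a child that ignores it would need C<KILL> instead.

    use POSIX ();

    # Hypothetical helper: run @cmd, killing it if it outlives $seconds.
    # Returns the wait status ($?) of the child.
    sub run_with_timeout {
        my ($seconds, @cmd) = @_;
        defined(my $pid = fork) or die "cannot fork: $!";
        if ($pid == 0) {                       # child
            exec { $cmd[0] } @cmd or warn "can't exec $cmd[0]: $!";
            POSIX::_exit(127);                 # only reached if exec failed
        }
        my $timed_out = !eval {
            local $SIG{ALRM} = sub { die "alarm\n" };
            alarm $seconds;
            waitpid($pid, 0);
            alarm 0;
            1;
        };
        alarm 0;
        if ($timed_out) {
            kill TERM => $pid;                 # stop the errant child
            waitpid($pid, 0);                  # reap it: no zombie left
        }
        return $?;
    }

    my $status = run_with_timeout(1, $^X, '-e', 'sleep 10');
    print "child was killed by signal ", $status & 127, "\n" if $status & 127;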

For more complex signal handling, you might see the standard POSIX
module.  Lamentably, this is almost entirely undocumented, but
the F<t/lib/posix.t> file from the Perl source distribution has some
examples in it.
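
As one small example of what the POSIX module offers, the sketch below
holds back C<SIGINT> around a critical section using sigprocmask().  The
choice of signal and the variable names are illustrative only.

    use POSIX qw(:signal_h);

    my $block = POSIX::SigSet->new(SIGINT);   # signals to hold back
    my $old   = POSIX::SigSet->new;           # receives the previous mask

    sigprocmask(SIG_BLOCK, $block, $old)
        or die "can't block SIGINT: $!";
    # ... critical section: a SIGINT arriving now stays pending ...
    sigprocmask(SIG_SETMASK, $old)
        or die "can't restore signal mask: $!";

A signal that arrived while blocked is delivered once the old mask is
restored.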

=head2 Handling the SIGHUP Signal in Daemons

A process that usually starts when the system boots and shuts down
when the system is shut down is called a daemon (Disk And Execution
MONitor).  If a daemon process has a configuration file which is
modified after the process has been started, there should be a way to
tell that process to re-read its configuration file, without stopping
the process.  Many daemons provide this mechanism using the C<SIGHUP>
signal handler.  When you want to tell the daemon to re-read the file
you simply send it the C<SIGHUP> signal.

Not all platforms automatically reinstall their (native) signal
handlers after a signal delivery.  This means that the handler works
only the first time the signal is sent.  The solution to this problem
is to use C<POSIX> signal handlers if available; their behaviour
is well-defined.

The following example implements a simple daemon, which restarts
itself every time the C<SIGHUP> signal is received.  The actual code is
located in the subroutine C<code()>, which simply prints some debug
info to show that it works and should be replaced with the real code.

    #!/usr/bin/perl -w

    use POSIX ();
    use FindBin ();
    use File::Basename ();
    use File::Spec::Functions;

    $|=1;

    # make the daemon cross-platform, so exec always calls the script
    # itself with the right path, no matter how the script was invoked.
    my $script = File::Basename::basename($0);
    my $SELF = catfile $FindBin::Bin, $script;

    # POSIX unmasks the sigprocmask properly
    my $sigset = POSIX::SigSet->new();
    my $action = POSIX::SigAction->new('sigHUP_handler',
                                       $sigset,
                                       &POSIX::SA_NODEFER);
    POSIX::sigaction(&POSIX::SIGHUP, $action);

    sub sigHUP_handler {
        print "got SIGHUP\n";
        exec($SELF, @ARGV) or die "Couldn't restart: $!\n";
    }

    code();

    sub code {
        print "PID: $$\n";
        print "ARGV: @ARGV\n";
        my $c = 0;
        while (++$c) {
            sleep 2;
            print "$c\n";
        }
    }
    __END__

=head1 Named Pipes

A named pipe (often referred to as a FIFO) is an old Unix IPC
mechanism for processes communicating on the same machine.  It works
just like regular, connected anonymous pipes, except that the
processes rendezvous using a filename and don't have to be related.

To create a named pipe, use the Unix command mknod(1) or on some
systems, mkfifo(1).  These may not be in your normal path.

    # system return val is backwards, so && not ||
    #
    $ENV{PATH} .= ":/etc:/usr/etc";
    if  (      system('mknod',  $path, 'p')
            && system('mkfifo', $path) )
    {
        die "mk{nod,fifo} $path failed";
    }
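
Where the POSIX module provides mkfifo(), the external commands can be
avoided altogether.  The sketch below assumes such a platform; the
helper name C<make_fifo>, the path, and the C<0700> mode are
placeholders chosen for illustration.

    use POSIX qw(mkfifo);

    # Hypothetical helper: make sure $path is a named pipe.
    sub make_fifo {
        my ($path) = @_;
        return 1 if -p $path;                 # already a fifo
        mkfifo($path, 0700) or die "mkfifo $path failed: $!";
        return 1;
    }

    my $path = "/tmp/example_fifo.$$";        # placeholder location
    make_fifo($path);
    print "$path is a named pipe\n" if -p $path;
    unlink $path;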

A fifo is convenient when you want to connect a process to an unrelated
one.  When you open a fifo, the program will block until there's something
on the other end.

For example, let's say you'd like to have your F<.signature> file be a
named pipe that has a Perl program on the other end.  Now every time any
program (like a mailer, news reader, finger program, etc.) tries to read
from that file, the reading program will block and your program will
supply the new signature.  We'll use the pipe-checking file test B<-p>
to find out whether anyone (or anything) has accidentally removed our fifo.

    chdir; # go home
    $FIFO = '.signature';
    $ENV{PATH} .= ":/etc:/usr/games";

    while (1) {
        unless (-p $FIFO) {
            unlink $FIFO;
            system('mknod', $FIFO, 'p')
                && die "can't mknod $FIFO: $!";
        }

        # next line blocks until there's a reader
        open (FIFO, "> $FIFO") || die "can't write $FIFO: $!";
        print FIFO "John Smith (smith\@host.org)\n", `fortune -s`;
        close FIFO;
        sleep 2;    # to avoid dup signals
    }

=head2 Deferred Signals (Safe Signals)

In Perls before Perl 5.7.3, by installing Perl code to deal with
signals, you were exposing yourself to danger from two things.  First,
few system library functions are re-entrant.  If the signal interrupts
while Perl is executing one function (like malloc(3) or printf(3)),
and your signal handler then calls the same function again, you could
get unpredictable behavior--often, a core dump.  Second, Perl isn't
itself re-entrant at the lowest levels.  If the signal interrupts Perl
while Perl is changing its own internal data structures, similarly
unpredictable behaviour may result.

There were two things you could do, knowing this: be paranoid or be
pragmatic.  The paranoid approach was to do as little as possible in your
signal handler.  Set an existing integer variable that already has a
value, and return.  This doesn't help you if you're in a slow system call,
which will just restart.  That means you have to C<die> to longjmp(3) out
of the handler.  Even this is a little cavalier for the true paranoiac,
who avoids C<die> in a handler because the system I<is> out to get you.
The pragmatic approach was to say "I know the risks, but prefer the
convenience", and to do anything you wanted in your signal handler,
and be prepared to clean up core dumps now and again.

In Perl 5.7.3 and later, to avoid these problems, signals are
"deferred": when the signal is delivered to the process by
the system (to the C code that implements Perl) a flag is set, and the
handler returns immediately.  Then at strategic "safe" points in the
Perl interpreter (e.g. when it is about to execute a new opcode) the
flags are checked and the Perl-level handler from %SIG is
executed.  The "deferred" scheme allows much more flexibility in the
coding of signal handlers, as we know the Perl interpreter is in a safe
state, and that we are not in a system library function when the
handler is called.  However, the implementation does differ from
previous Perls in the following ways:

=over 4

=item Long running opcodes

As the Perl interpreter only looks at the signal flags when it is about
to execute a new opcode, if a signal arrives during a long running opcode
(e.g. a regular expression operation on a very large string) then
the signal will not be seen until the operation completes.

=item Interrupting IO

When a signal is delivered (e.g. INT control-C) the operating system
breaks into IO operations like C<read> (used to implement Perl's
E<lt>E<gt> operator).  On older Perls the handler was called
immediately (and as C<read> is not "unsafe" this worked well).  With
the "deferred" scheme the handler is not called immediately, and if
Perl is using the system's C<stdio> library that library may re-start the
C<read> without returning to Perl and giving it a chance to call the
%SIG handler.  If this happens on your system the solution is to use the
C<:perlio> layer to do IO - at least on those handles which you want
to be able to break into with signals.  (The C<:perlio> layer checks
the signal flags and calls %SIG handlers before resuming the IO operation.)

Note that the default in Perl 5.7.3 and later is to automatically use
the C<:perlio> layer.

Note that some networking library functions like gethostbyname() are
known to have their own implementations of timeouts which may conflict
with your timeouts.  If you are having problems with such functions,
you can try using the POSIX sigaction() function, which bypasses the
Perl safe signals (note that this means subjecting yourself to
possible memory corruption, as described above).  Instead of setting
C<$SIG{ALRM}>:

    local $SIG{ALRM} = sub { die "alarm" };

try something like the following:

    use POSIX qw(SIGALRM);
    POSIX::sigaction(SIGALRM,
                     POSIX::SigAction->new(sub { die "alarm" }))
        or die "Error setting SIGALRM handler: $!\n";

=item Restartable system calls

On systems that supported it, older versions of Perl used the
SA_RESTART flag when installing %SIG handlers.  This meant that
restartable system calls would continue rather than returning when
a signal arrived.  In order to deliver deferred signals promptly,
Perl 5.7.3 and later do I<not> use SA_RESTART.  Consequently,
restartable system calls can fail (with $! set to C<EINTR>) in places
where they previously would have succeeded.

Note that the default C<:perlio> layer will retry C<read>, C<write>
and C<close> as described above and that interrupted C<wait> and
C<waitpid> calls will always be retried.

=item Signals as "faults"

Certain signals, e.g. SEGV, ILL, and BUS, are generated as a result of
virtual memory or other "faults".  These are normally fatal and there
is little a Perl-level handler can do with them.  (The
old signal scheme was particularly unsafe in such cases.)  However, if
a %SIG handler is set the new scheme simply sets a flag and returns as
described above.  This may cause the operating system to try the
offending machine instruction again and - as nothing has changed - it
will generate the signal again.  The result of this is a rather odd
"loop".  In future Perl's signal mechanism may be changed to avoid this
- perhaps by simply disallowing %SIG handlers on signals of that
type.  Until then the workaround is not to set a %SIG handler on those
signals.  (Which signals they are is operating system dependent.)

=item Signals triggered by operating system state

On some operating systems certain signal handlers are supposed to "do
something" before returning.  One example can be CHLD or CLD which
indicates a child process has completed.  On some operating systems the
signal handler is expected to C<wait> for the completed child
process.  On such systems the deferred signal scheme will not work for
those signals (it does not do the C<wait>).  Again the failure will
look like a loop as the operating system will re-issue the signal as
there are un-waited-for completed child processes.

=back
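
For calls that are not retried automatically, a program can retry on
C<EINTR> itself.  The following is a minimal sketch under that
assumption; the helper name C<retry_on_eintr> is made up, and it expects
a code reference that returns C<undef> on failure, as sysread() does.

    use Errno qw(EINTR);

    # Hypothetical helper: call $call until it succeeds or fails with
    # something other than EINTR.  Returns undef on a real failure,
    # leaving $! set.
    sub retry_on_eintr {
        my ($call) = @_;
        while (1) {
            my $result = $call->();
            return $result if defined $result;
            next if $! == EINTR;      # interrupted by a signal: try again
            return;                   # genuine error
        }
    }

    pipe(my $rd, my $wr) or die "pipe: $!";
    syswrite($wr, "hello\n");
    my $buf;
    my $n = retry_on_eintr(sub { sysread($rd, $buf, 4096) });
    die "read failed: $!" unless defined $n;
    print "read $n bytes\n";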

If you want the old signal behaviour back regardless of possible
memory corruption, set the environment variable C<PERL_SIGNALS> to
C<"unsafe"> (a new feature since Perl 5.8.1).

=head1 Using open() for IPC

Perl's basic open() statement can also be used for unidirectional
interprocess communication by either appending or prepending a pipe
symbol to the second argument to open().  Here's how to start
something up in a child process you intend to write to:

    open(SPOOLER, "| cat -v | lpr -h 2>/dev/null")
        || die "can't fork: $!";
    local $SIG{PIPE} = sub { die "spooler pipe broke" };
    print SPOOLER "stuff\n";
    close(SPOOLER) || die "bad spool: $! $?";

And here's how to start up a child process you intend to read from:

    open(STATUS, "netstat -an 2>&1 |")
        || die "can't fork: $!";
    while (<STATUS>) {
        next if /^(tcp|udp)/;
        print;
    }
    close(STATUS) || die "bad netstat: $! $?";

If one can be sure that a particular program is a Perl script that is
expecting filenames in @ARGV, the clever programmer can write something
like this:

    % program f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile

and irrespective of which shell it's called from, the Perl program will
read from the file F<f1>, the process F<cmd1>, standard input (F<tmpfile>
in this case), the F<f2> file, the F<cmd2> command, and finally the F<f3>
file.  Pretty nifty, eh?

You might notice that you could use backticks for much the
same effect as opening a pipe for reading:

    print grep { !/^(tcp|udp)/ } `netstat -an 2>&1`;
    die "bad netstat" if $?;

While this is true on the surface, it's much more efficient to process the
file one line or record at a time because then you don't have to read the
whole thing into memory at once.  It also gives you finer control of the
whole process, letting you kill off the child process early if you'd
like.

Be careful to check both the open() and the close() return values.  If
you're I<writing> to a pipe, you should also trap SIGPIPE.  Otherwise,
think of what happens when you start up a pipe to a command that doesn't
exist: the open() will in all likelihood succeed (it only reflects the
fork()'s success), but then your output will fail--spectacularly.  Perl
can't know whether the command worked because your command is actually
running in a separate process whose exec() might have failed.  Therefore,
while readers of bogus commands return just a quick end of file, writers
to bogus commands will trigger a signal they'd better be prepared to
handle.  Consider:

    open(FH, "|bogus")  or die "can't fork: $!";
    print FH "bang\n"   or die "can't write: $!";
    close FH            or die "can't close: $!";

That won't blow up until the close, and it will blow up with a SIGPIPE.
To catch it, you could use this:

    $SIG{PIPE} = 'IGNORE';
    open(FH, "|bogus")  or die "can't fork: $!";
    print FH "bang\n"   or die "can't write: $!";
    close FH            or die "can't close: status=$?";

=head2 Filehandles

Both the main process and any child processes it forks share the same
STDIN, STDOUT, and STDERR filehandles.  If both processes try to access
them at once, strange things can happen.  You may also want to close
or reopen the filehandles for the child.  You can get around this by
opening your pipe with open(), but on some systems this means that the
child process cannot outlive the parent.

=head2 Background Processes

You can run a command in the background with:

    system("cmd &");

The command's STDOUT and STDERR (and possibly STDIN, depending on your
shell) will be the same as the parent's.  You won't need to catch
SIGCHLD because of the double-fork taking place (see below for more
details).
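
For readers who want to see the double-fork spelled out without a
shell, here is a rough sketch.  The helper name C<spawn_detached> is an
invention for this example, and C<$^X> (the running perl) stands in for
whatever command you want to launch.

    use POSIX ();

    # Hypothetical helper: start @cmd so that the caller never has to
    # reap it.  The short-lived first child is reaped here; the
    # grandchild is orphaned and inherited by the system.
    sub spawn_detached {
        my @cmd = @_;
        defined(my $pid = fork) or die "cannot fork: $!";
        if ($pid == 0) {                       # first child
            defined(my $grandkid = fork) or POSIX::_exit(1);
            POSIX::_exit(0) if $grandkid;      # first child exits at once
            exec { $cmd[0] } @cmd or POSIX::_exit(127);
        }
        waitpid($pid, 0);                      # reap the first child
        return $? == 0;
    }

    spawn_detached($^X, '-e', 'sleep 1') or warn "could not start command";

Using C<POSIX::_exit> in the children avoids flushing the parent's
buffered output a second time.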

=head2 Complete Dissociation of Child from Parent

In some cases (starting server processes, for instance) you'll want to
completely dissociate the child process from the parent.  This is
often called daemonization.  A well behaved daemon will also chdir()
to the root directory (so it doesn't prevent unmounting the filesystem
containing the directory from which it was launched) and redirect its
standard file descriptors from and to F</dev/null> (so that random
output doesn't wind up on the user's terminal).

    use POSIX 'setsid';

    sub daemonize {
        chdir '/'               or die "Can't chdir to /: $!";
        open STDIN, '/dev/null' or die "Can't read /dev/null: $!";
        open STDOUT, '>/dev/null'
                                or die "Can't write to /dev/null: $!";
        defined(my $pid = fork) or die "Can't fork: $!";
        exit if $pid;
        setsid                  or die "Can't start a new session: $!";
        open STDERR, '>&STDOUT' or die "Can't dup stdout: $!";
    }

The fork() has to come before the setsid() to ensure that you aren't a
process group leader (the setsid() will fail if you are).  If your
system doesn't have the setsid() function, open F</dev/tty> and use the
C<TIOCNOTTY> ioctl() on it instead.  See L<tty(4)> for details.

Non-Unix users should check their Your_OS::Process module for other
solutions.

=head2 Safe Pipe Opens

Another interesting approach to IPC is making your single program go
multiprocess and communicate between (or even amongst) yourselves.  The
open() function will accept a file argument of either C<"-|"> or C<"|-">
to do a very interesting thing: it forks a child connected to the
filehandle you've opened.  The child is running the same program as the
parent.  This is useful for safely opening a file when running under an
assumed UID or GID, for example.  If you open a pipe I<to> minus, you can
write to the filehandle you opened and your kid will find it in his
STDIN.  If you open a pipe I<from> minus, you can read from the filehandle
you opened whatever your kid writes to his STDOUT.

    use English '-no_match_vars';
    my $sleep_count = 0;

    do {
        $pid = open(KID_TO_WRITE, "|-");
        unless (defined $pid) {
            warn "cannot fork: $!";
            die "bailing out" if $sleep_count++ > 6;
            sleep 10;
        }
    } until defined $pid;

    if ($pid) {  # parent
        print KID_TO_WRITE @some_data;
        close(KID_TO_WRITE) || warn "kid exited $?";
    } else {     # child
        ($EUID, $EGID) = ($UID, $GID); # suid progs only
        open (FILE, "> /safe/file")
            || die "can't open /safe/file: $!";
        while (<STDIN>) {
            print FILE; # child's STDIN is parent's KID
        }
        exit;  # don't forget this
    }

Another common use for this construct is when you need to execute
something without the shell's interference.  With system(), it's
straightforward, but you can't use a pipe open or backticks safely.
That's because there's no way to stop the shell from getting its hands on
your arguments.  Instead, use lower-level control to call exec() directly.

Here's a safe backtick or pipe open for read:

    # add error processing as above
    $pid = open(KID_TO_READ, "-|");

    if ($pid) {   # parent
        while (<KID_TO_READ>) {
            # do something interesting
        }
        close(KID_TO_READ) || warn "kid exited $?";

    } else {      # child
        ($EUID, $EGID) = ($UID, $GID); # suid only
        exec($program, @options, @args)
            || die "can't exec program: $!";
        # NOTREACHED
    }

And here's a safe pipe open for writing:

    # add error processing as above
    $pid = open(KID_TO_WRITE, "|-");
    $SIG{PIPE} = sub { die "whoops, $program pipe broke" };

    if ($pid) {  # parent
        for (@data) {
            print KID_TO_WRITE;
        }
        close(KID_TO_WRITE) || warn "kid exited $?";

    } else {     # child
        ($EUID, $EGID) = ($UID, $GID);
        exec($program, @options, @args)
            || die "can't exec program: $!";
        # NOTREACHED
    }

Since Perl 5.8.0, you can also use the list form of C<open> for pipes:
the syntax

    open KID_PS, "-|", "ps", "aux" or die $!;

forks the ps(1) command (without spawning a shell, as there are more than
three arguments to open()), and reads its standard output via the
C<KID_PS> filehandle.  The corresponding syntax to write to command
pipes (with C<"|-"> in place of C<"-|">) is also implemented.
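
To see that no shell is involved, consider this small sketch.  The
helper name C<capture_lines> is hypothetical, and C<$^X> (the running
perl) is used as the child only so the example does not depend on any
particular external command.

    # Hypothetical helper: run a command without a shell and return its
    # output lines.  Pass a program plus at least one argument so that
    # open() receives more than three arguments.
    sub capture_lines {
        my @cmd = @_;
        open(my $kid, "-|", @cmd) or die "can't run $cmd[0]: $!";
        my @lines = <$kid>;
        close($kid) or warn "kid exited $?";
        return @lines;
    }

    # The semicolon and dollar sign reach the child untouched, because
    # no shell ever sees them.
    print capture_lines($^X, "-e", 'print "$ARGV[0]\n"', 'two words; $HOME');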

Note that these operations are full Unix forks, which means they may not be
correctly implemented on alien systems.  Additionally, these are not true
multithreading.  If you'd like to learn more about threading, see the
F<modules> file mentioned below in the SEE ALSO section.

=head2 Bidirectional Communication with Another Process

While this works reasonably well for unidirectional communication, what
about bidirectional communication?  The obvious thing you'd like to do
doesn't actually work:

    open(PROG_FOR_READING_AND_WRITING, "| some program |")

and if you forget to use the C<use warnings> pragma or the B<-w> flag,
then you'll miss out entirely on the diagnostic message:

    Can't do bidirectional pipe at -e line 1.

If you really want to, you can use the standard open2() library function
to catch both ends.  There's also an open3() for tridirectional I/O so you
can also catch your child's STDERR, but doing so would then require an
awkward select() loop and wouldn't allow you to use normal Perl input
operations.

If you look at its source, you'll see that open2() uses low-level
primitives like Unix pipe() and exec() calls to create all the connections.
While it might have been slightly more efficient by using socketpair(), it
would have then been even less portable than it already is.  The open2()
and open3() functions are unlikely to work anywhere except on a Unix
system or some other one purporting to be POSIX compliant.

Here's an example of using open2():

    use FileHandle;
    use IPC::Open2;
    $pid = open2(*Reader, *Writer, "cat -u -n" );
    print Writer "stuff\n";
    $got = <Reader>;

The problem with this is that Unix buffering is really going to
ruin your day.  Even though your C<Writer> filehandle is auto-flushed,
and the process on the other end will get your data in a timely manner,
you can't usually do anything to force it to give it back to you
in a similarly quick fashion.  In this case, we could, because we
gave I<cat> a B<-u> flag to make it unbuffered.  But very few Unix
commands are designed to operate over pipes, so this seldom works
unless you yourself wrote the program on the other end of the
double-ended pipe.
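
When the other program is a filter that can be given all of its input
before you need any output, the buffering problem can be sidestepped by
closing the write side first.  The sketch below assumes exactly that
situation; C<$^X> and the one-line sorting program are stand-ins for a
real filter.

    use IPC::Open2;

    # $^X is the running perl; the one-liner stands in for any filter
    # that reads all of its input before answering.
    my @filter = ($^X, "-e", "print sort <STDIN>");

    my $pid = open2(my $reader, my $writer, @filter);
    print $writer "pear\n", "apple\n";
    close $writer;                # send EOF so the filter can finish
    my @sorted = <$reader>;       # safe to read now
    waitpid($pid, 0);             # open2 does not reap the child for us
    print @sorted;

This does not help with interactive programs, and a filter that starts
writing large amounts of output before it has read everything could
still deadlock.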

A solution to this is the nonstandard F<Comm.pl> library.  It uses
pseudo-ttys to make your program behave more reasonably:

    require 'Comm.pl';
    $ph = open_proc('cat -n');
    for (1..10) {
        print $ph "a line\n";
        print "got back ", scalar <$ph>;
    }

This way you don't have to have control over the source code of the
program you're using.  The F<Comm> library also has expect()
and interact() functions.  Find the library (and we hope its
successor F<IPC::Chat>) at your nearest CPAN archive as detailed
in the SEE ALSO section below.

The newer Expect.pm module from CPAN also addresses this kind of thing.
This module requires two other modules from CPAN: IO::Pty and IO::Stty.
It sets up a pseudo-terminal to interact with programs that insist on
talking to the terminal device driver.  If your system is
amongst those supported, this may be your best bet.

=head2 Bidirectional Communication with Yourself

If you want, you may make low-level pipe() and fork() calls
to stitch this together by hand.  This example only
talks to itself, but you could reopen the appropriate
handles to STDIN and STDOUT and call other processes.

    #!/usr/bin/perl -w
    # pipe1 - bidirectional communication using two pipe pairs
    #         designed for the socketpair-challenged
    use IO::Handle;  # thousands of lines just for autoflush :-(
    pipe(PARENT_RDR, CHILD_WTR);    # XXX: failure?
    pipe(CHILD_RDR,  PARENT_WTR);   # XXX: failure?
    CHILD_WTR->autoflush(1);
    PARENT_WTR->autoflush(1);

    if ($pid = fork) {
        close PARENT_RDR; close PARENT_WTR;
        print CHILD_WTR "Parent Pid $$ is sending this\n";
        chomp($line = <CHILD_RDR>);
        print "Parent Pid $$ just read this: `$line'\n";
        close CHILD_RDR; close CHILD_WTR;
        waitpid($pid,0);
    } else {
        die "cannot fork: $!" unless defined $pid;
        close CHILD_RDR; close CHILD_WTR;
        chomp($line = <PARENT_RDR>);
        print "Child Pid $$ just read this: `$line'\n";
        print PARENT_WTR "Child Pid $$ is sending this\n";
        close PARENT_RDR; close PARENT_WTR;
        exit;
    }
730
a11adca0 731But you don't actually have to make two pipe calls. If you
5a964f20 732have the socketpair() system call, it will do this all for you.
733
734 #!/usr/bin/perl -w
735 # pipe2 - bidirectional communication using socketpair
736 # "the best ones always go both ways"
737
738 use Socket;
739 use IO::Handle; # thousands of lines just for autoflush :-(
740 # We say AF_UNIX because although *_LOCAL is the
741 # POSIX 1003.1g form of the constant, many machines
742 # still don't have it.
743 socketpair(CHILD, PARENT, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
744 or die "socketpair: $!";
745
746 CHILD->autoflush(1);
747 PARENT->autoflush(1);
748
749 if ($pid = fork) {
750 close PARENT;
751 print CHILD "Parent Pid $$ is sending this\n";
752 chomp($line = <CHILD>);
753 print "Parent Pid $$ just read this: `$line'\n";
754 close CHILD;
755 waitpid($pid,0);
756 } else {
757 die "cannot fork: $!" unless defined $pid;
758 close CHILD;
759 chomp($line = <PARENT>);
760 print "Child Pid $$ just read this: `$line'\n";
761 print PARENT "Child Pid $$ is sending this\n";
762 close PARENT;
763 exit;
764 }
765
4633a7c4 766=head1 Sockets: Client/Server Communication
a0d0e21e 767
6a3992aa 768While not limited to Unix-derived operating systems (e.g., WinSock on PCs
4633a7c4 769provides socket support, as do some VMS libraries), you may not have
184e9718 770sockets on your system, in which case this section probably isn't going to do
6a3992aa 771you much good. With sockets, you can do both virtual circuits (i.e., TCP
772streams) and datagrams (i.e., UDP packets). You may be able to do even more
4633a7c4 773depending on your system.
774
775The Perl function calls for dealing with sockets have the same names as
776the corresponding system calls in C, but their arguments tend to differ
777for two reasons: first, Perl filehandles work differently than C file
778descriptors. Second, Perl already knows the length of its strings, so you
779don't need to pass that information.
a0d0e21e 780
4633a7c4 781One of the major problems with old socket code in Perl was that it used
782hard-coded values for some of the constants, which severely hurt
783portability. If you ever see code that does anything like explicitly
784setting C<$AF_INET = 2>, you know you're in for big trouble: An
785immeasurably superior approach is to use the C<Socket> module, which more
786reliably grants access to various constants and functions you'll need.
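As a minimal sketch of what the C<Socket> module provides, the following packs an Internet address and port into a socket address and unpacks it again, using only names that C<Socket> exports by default. The helper name C<roundtrip_addr>, the address, and the port are made up for illustration.

```perl
use strict;
use warnings;
use Socket;     # supplies AF_INET, inet_aton(), sockaddr_in(), and friends

# Pack a dotted-quad address and a port into a socket address, unpack it
# again, and return the result as an "address:port" string.
# (roundtrip_addr is a hypothetical helper, not part of Socket.)
sub roundtrip_addr {
    my ($dotted, $port) = @_;
    my $iaddr  = inet_aton($dotted) or die "bad address: $dotted";
    my $packed = sockaddr_in($port, $iaddr);            # scalar context packs
    my ($gotport, $gotaddr) = sockaddr_in($packed);     # list context unpacks
    return inet_ntoa($gotaddr) . ":$gotport";
}

print roundtrip_addr('127.0.0.1', 2345), "\n";
print "AF_INET is ", AF_INET, " on this system\n";
```

No numeric constant appears anywhere in the program, so whatever value C<AF_INET> has on the local system is the one that gets used.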
a0d0e21e 787
68dc0745 788If you're not writing a server/client for an existing protocol like
789NNTP or SMTP, you should give some thought to how your server will
790know when the client has finished talking, and vice-versa. Most
791protocols are based on one-line messages and responses (so one party
4a6725af 792knows the other has finished when a "\n" is received) or multi-line
68dc0745 793messages and responses that end with a period on an empty line
794("\n.\n" terminates a message/response).
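The second convention can be sketched as follows. This is only an illustration under the assumption that the peer ends each message with a line holding a single period; C<read_dot_message> is a hypothetical helper name, and an in-memory filehandle stands in for the socket.

```perl
use strict;
use warnings;

# Read one multi-line message from $fh.  The message ends at a line that
# contains only a period.  Returns a reference to the body lines (with
# their terminators removed), or undef if EOF arrives first.
sub read_dot_message {
    my ($fh) = @_;
    my @body;
    while (defined(my $line = <$fh>)) {
        $line =~ s/\015?\012\z//;        # drop "\015\012" or a lone "\012"
        return \@body if $line eq '.';   # terminator seen: message complete
        push @body, $line;
    }
    return;                              # peer hung up mid-message
}

# An in-memory filehandle stands in for the socket here.
my $wire = "first line\015\012second line\015\012.\015\012";
open(my $fh, '<', \$wire) or die "can't open in-memory handle: $!";
my $msg = read_dot_message($fh);
print scalar(@$msg), " lines received\n";
```

Returning undef at EOF lets the caller tell a complete but empty message apart from a connection that was dropped before the terminator arrived.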
795
5a964f20 796=head2 Internet Line Terminators
797
798The Internet line terminator is "\015\012". Under ASCII variants of
799Unix, that could usually be written as "\r\n", but under other systems,
800"\r\n" might at times be "\015\015\012", "\012\012\015", or something
801completely different. The standards specify writing "\015\012" to be
802conformant (be strict in what you provide), but they also recommend
803accepting a lone "\012" on input (but be lenient in what you require).
804We haven't always been very good about that in the code in this manpage,
805but unless you're on a Mac, you'll probably be ok.
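A minimal sketch of "strict in what you provide, lenient in what you require" might look like this. The helper names C<net_line> and C<strip_eol> are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

# Strict on output: always terminate with "\015\012".
sub net_line {
    my ($text) = @_;
    return $text . "\015\012";
}

# Lenient on input: accept "\015\012" or a lone "\012".
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\015?\012\z//;
    return $line;
}

print length(net_line("HELO")), " bytes go out on the wire\n";
print strip_eol("250 ok\015\012"), "\n";
print strip_eol("250 ok\012"), "\n";
```

Spelling the terminator in octal, rather than as C<"\r\n">, keeps the bytes on the wire the same no matter what the local platform thinks a newline is.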
806
4633a7c4 807=head2 Internet TCP Clients and Servers
a0d0e21e 808
4633a7c4 809Use Internet-domain sockets when you want to do client-server
810communication that might extend to machines outside of your own system.
811
812Here's a sample TCP client using Internet-domain sockets:
813
814 #!/usr/bin/perl -w
4633a7c4 815 use strict;
816 use Socket;
817 my ($remote,$port, $iaddr, $paddr, $proto, $line);
818
819 $remote = shift || 'localhost';
820 $port = shift || 2345; # random port
821 if ($port =~ /\D/) { $port = getservbyname($port, 'tcp') }
822 die "No port" unless $port;
823 $iaddr = inet_aton($remote) || die "no host: $remote";
824 $paddr = sockaddr_in($port, $iaddr);
825
826 $proto = getprotobyname('tcp');
827 socket(SOCK, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
828 connect(SOCK, $paddr) || die "connect: $!";
54310121 829 while (defined($line = <SOCK>)) {
4633a7c4 830 print $line;
54310121 831 }
4633a7c4 832
833 close (SOCK) || die "close: $!";
834 exit;
835
And here's a corresponding server to go along with it. We'll
leave the address as INADDR_ANY so that the kernel can choose
the appropriate interface on multihomed hosts. If you want to sit
on a particular interface (like the external side of a gateway
or firewall machine), you should fill this in with your real address
instead.
842
843 #!/usr/bin/perl -Tw
c07a80fd 844 use strict;
845 BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
846 use Socket;
847 use Carp;
5865a7df 848 my $EOL = "\015\012";
c07a80fd 849
54310121 850 sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }
c07a80fd 851
852 my $port = shift || 2345;
853 my $proto = getprotobyname('tcp');
51ee6500 854
5865a7df 855 ($port) = $port =~ /^(\d+)$/ or die "invalid port";
6a3992aa 856
c07a80fd 857 socket(Server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
54310121 858 setsockopt(Server, SOL_SOCKET, SO_REUSEADDR,
c07a80fd 859 pack("l", 1)) || die "setsockopt: $!";
860 bind(Server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
861 listen(Server,SOMAXCONN) || die "listen: $!";
862
863 logmsg "server started on port $port";
864
865 my $paddr;
866
867 $SIG{CHLD} = \&REAPER;
868
869 for ( ; $paddr = accept(Client,Server); close Client) {
870 my($port,$iaddr) = sockaddr_in($paddr);
871 my $name = gethostbyaddr($iaddr,AF_INET);
872
54310121 873 logmsg "connection from $name [",
874 inet_ntoa($iaddr), "]
c07a80fd 875 at port $port";
876
54310121 877 print Client "Hello there, $name, it's now ",
5a964f20 878 scalar localtime, $EOL;
54310121 879 }
c07a80fd 880
And here's a multitasking version. It's multitasking in that,
like most typical servers, it spawns (forks) a child server to
handle the client request so that the master server can quickly
go back to service a new client.
4633a7c4 885
886 #!/usr/bin/perl -Tw
4633a7c4 887 use strict;
888 BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
a0d0e21e 889 use Socket;
4633a7c4 890 use Carp;
5865a7df 891 my $EOL = "\015\012";
a0d0e21e 892
4633a7c4 893 sub spawn; # forward declaration
54310121 894 sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }
a0d0e21e 895
4633a7c4 896 my $port = shift || 2345;
897 my $proto = getprotobyname('tcp');
51ee6500 898
5865a7df 899 ($port) = $port =~ /^(\d+)$/ or die "invalid port";
54310121 900
c07a80fd 901 socket(Server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
54310121 902 setsockopt(Server, SOL_SOCKET, SO_REUSEADDR,
c07a80fd 903 pack("l", 1)) || die "setsockopt: $!";
904 bind(Server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
905 listen(Server,SOMAXCONN) || die "listen: $!";
a0d0e21e 906
4633a7c4 907 logmsg "server started on port $port";
a0d0e21e 908
4633a7c4 909 my $waitedpid = 0;
910 my $paddr;
a0d0e21e 911
816229cf 912 use POSIX ":sys_wait_h";
54310121 913 sub REAPER {
816229cf 914 my $child;
915 while (($waitedpid = waitpid(-1,WNOHANG)) > 0) {
916 logmsg "reaped $waitedpid" . ($? ? " with exit $?" : '');
917 }
6a3992aa 918 $SIG{CHLD} = \&REAPER; # loathe sysV
4633a7c4 919 }
920
921 $SIG{CHLD} = \&REAPER;
922
54310121 923 for ( $waitedpid = 0;
924 ($paddr = accept(Client,Server)) || $waitedpid;
925 $waitedpid = 0, close Client)
4633a7c4 926 {
6a3992aa 927 next if $waitedpid and not $paddr;
4633a7c4 928 my($port,$iaddr) = sockaddr_in($paddr);
929 my $name = gethostbyaddr($iaddr,AF_INET);
930
54310121 931 logmsg "connection from $name [",
932 inet_ntoa($iaddr), "]
4633a7c4 933 at port $port";
a0d0e21e 934
54310121 935 spawn sub {
b921b357 936 $|=1;
5a964f20 937 print "Hello there, $name, it's now ", scalar localtime, $EOL;
938 exec '/usr/games/fortune' # XXX: `wrong' line terminators
4633a7c4 939 or confess "can't exec fortune: $!";
940 };
a0d0e21e 941
54310121 942 }
a0d0e21e 943
4633a7c4 944 sub spawn {
945 my $coderef = shift;
a0d0e21e 946
54310121 947 unless (@_ == 0 && $coderef && ref($coderef) eq 'CODE') {
4633a7c4 948 confess "usage: spawn CODEREF";
a0d0e21e 949 }
4633a7c4 950
951 my $pid;
952 if (!defined($pid = fork)) {
953 logmsg "cannot fork: $!";
954 return;
955 } elsif ($pid) {
956 logmsg "begat $pid";
6a3992aa 957 return; # I'm the parent
4633a7c4 958 }
6a3992aa 959 # else I'm the child -- go spawn
4633a7c4 960
c07a80fd 961 open(STDIN, "<&Client") || die "can't dup client to stdin";
962 open(STDOUT, ">&Client") || die "can't dup client to stdout";
4633a7c4 963 ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
964 exit &$coderef();
54310121 965 }
4633a7c4 966
This server takes the trouble to clone off a child version via fork() for
each incoming request. That way it can handle many requests at once,
which you might not always want. Even if you don't fork(), the listen()
call will still queue up to SOMAXCONN pending connections. Forking servers have to be
particularly careful about cleaning up their dead children (called
"zombies" in Unix parlance), because otherwise you'll quickly fill up your
process table.
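The reaping loop can also be tried on its own, outside of any signal handler. The following is a minimal sketch, assuming a system with fork(); C<reap_children> is a hypothetical helper name, and the polling loop is there only to keep the demonstration short.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";    # for WNOHANG

# Collect every child that has already exited, without blocking.
# Returns a flat list of (pid, exit status) pairs.
sub reap_children {
    my @reaped;
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        push @reaped, $pid, $? >> 8;
    }
    return @reaped;
}

defined(my $kid = fork) or die "cannot fork: $!";
POSIX::_exit(3) unless $kid;            # child: leave at once, skipping END blocks

my @done;
for (1 .. 100) {                        # parent: poll for roughly five seconds
    @done = reap_children();
    last if @done;
    select(undef, undef, undef, 0.05);  # sleep 50 milliseconds
}
print "child $done[0] exited with status $done[1]\n" if @done;
```

Because waitpid() is called with WNOHANG in a loop, a single call picks up however many children have exited, which matters when several exit before the parent gets around to looking.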
974
We suggest that you use the B<-T> flag to use taint checking (see L<perlsec>)
even if you aren't running setuid or setgid. This is always a good idea
for servers and other programs run on behalf of someone else (like CGI
scripts), because it lessens the chances that people from the outside will
be able to compromise your system.
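Under B<-T>, data from outside the program has to pass through a regex capture before it can be used in anything risky. A minimal sketch of that check for a port number follows; C<untaint_port> is a hypothetical helper name, and the 1..65535 range check is an extra assumption layered on top of the pattern.

```perl
use strict;
use warnings;

# Return an untainted port number, or nothing if $raw is not a plain
# decimal number between 1 and 65535.  Text captured by a regex is
# untainted, so only data that passed the pattern gets through.
sub untaint_port {
    my ($raw) = @_;
    return unless defined $raw && $raw =~ /\A([0-9]{1,5})\z/;
    my $port = $1;
    return unless $port >= 1 && $port <= 65535;
    return $port;
}

for my $candidate ('2345', '80; rm -rf /', '99999') {
    my $port = untaint_port($candidate);
    my $verdict = defined($port) ? "accepted $port" : "rejected '$candidate'";
    print "$verdict\n";
}
```

The pattern is anchored at both ends, so a string that merely starts with digits is rejected rather than silently truncated.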
980
981Let's look at another TCP client. This one connects to the TCP "time"
982service on a number of different machines and shows how far their clocks
983differ from the system on which it's being run:
984
    #!/usr/bin/perl -w
    use strict;
    use Socket;

    my $SECS_of_70_YEARS = 2208988800;
    sub ctime { scalar localtime(shift) }

    my $iaddr = gethostbyname('localhost');
    my $proto = getprotobyname('tcp');
    my $port = getservbyname('time', 'tcp');
    my $paddr = sockaddr_in(0, $iaddr);
    my($host);

    $| = 1;
    printf "%-24s %8s %s\n", "localhost", 0, ctime(time());

    foreach $host (@ARGV) {
        printf "%-24s ", $host;
        my $hisiaddr = inet_aton($host) || die "unknown host";
        my $hispaddr = sockaddr_in($port, $hisiaddr);
        socket(SOCKET, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
        connect(SOCKET, $hispaddr) || die "connect: $!";
        my $rtime = '    ';
        read(SOCKET, $rtime, 4);
        close(SOCKET);
        my $histime = unpack("N", $rtime) - $SECS_of_70_YEARS;
        printf "%8d %s\n", $histime - time, ctime($histime);
    }
1013
4633a7c4 1014=head2 Unix-Domain TCP Clients and Servers
1015
a2eb9003 1016That's fine for Internet-domain clients and servers, but what about local
4633a7c4 1017communications? While you can use the same setup, sometimes you don't
1018want to. Unix-domain sockets are local to the current host, and are often
54310121 1019used internally to implement pipes. Unlike Internet domain sockets, Unix
4633a7c4 1020domain sockets can show up in the file system with an ls(1) listing.
1021
5a964f20 1022 % ls -l /dev/log
4633a7c4 1023 srw-rw-rw- 1 root 0 Oct 31 07:23 /dev/log
a0d0e21e 1024
4633a7c4 1025You can test for these with Perl's B<-S> file test:
1026
1027 unless ( -S '/dev/log' ) {
3ba19564 1028 die "something's wicked with the log system";
54310121 1029 }
4633a7c4 1030
1031Here's a sample Unix-domain client:
1032
1033 #!/usr/bin/perl -w
4633a7c4 1034 use Socket;
1035 use strict;
1036 my ($rendezvous, $line);
1037
2359510d 1038 $rendezvous = shift || 'catsock';
4633a7c4 1039 socket(SOCK, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
9607fc9c 1040 connect(SOCK, sockaddr_un($rendezvous)) || die "connect: $!";
54310121 1041 while (defined($line = <SOCK>)) {
4633a7c4 1042 print $line;
54310121 1043 }
4633a7c4 1044 exit;
1045
5a964f20 1046And here's a corresponding server. You don't have to worry about silly
1047network terminators here because Unix domain sockets are guaranteed
1048to be on the localhost, and thus everything works right.
4633a7c4 1049
1050 #!/usr/bin/perl -Tw
4633a7c4 1051 use strict;
1052 use Socket;
1053 use Carp;
1054
1055 BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
5865a7df 1056 sub spawn; # forward declaration
5a964f20 1057 sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }
4633a7c4 1058
2359510d 1059 my $NAME = 'catsock';
4633a7c4 1060 my $uaddr = sockaddr_un($NAME);
1061 my $proto = getprotobyname('tcp');
1062
c07a80fd 1063 socket(Server,PF_UNIX,SOCK_STREAM,0) || die "socket: $!";
4633a7c4 1064 unlink($NAME);
c07a80fd 1065 bind (Server, $uaddr) || die "bind: $!";
1066 listen(Server,SOMAXCONN) || die "listen: $!";
4633a7c4 1067
1068 logmsg "server started on $NAME";
1069
5a964f20 1070 my $waitedpid;
1071
816229cf 1072 use POSIX ":sys_wait_h";
5a964f20 1073 sub REAPER {
816229cf 1074 my $child;
1075 while (($waitedpid = waitpid(-1,WNOHANG)) > 0) {
1076 logmsg "reaped $waitedpid" . ($? ? " with exit $?" : '');
1077 }
5a964f20 1078 $SIG{CHLD} = \&REAPER; # loathe sysV
5a964f20 1079 }
1080
4633a7c4 1081 $SIG{CHLD} = \&REAPER;
1082
5a964f20 1083
54310121 1084 for ( $waitedpid = 0;
1085 accept(Client,Server) || $waitedpid;
1086 $waitedpid = 0, close Client)
4633a7c4 1087 {
1088 next if $waitedpid;
1089 logmsg "connection on $NAME";
54310121 1090 spawn sub {
4633a7c4 1091 print "Hello there, it's now ", scalar localtime, "\n";
1092 exec '/usr/games/fortune' or die "can't exec fortune: $!";
1093 };
54310121 1094 }
4633a7c4 1095
5865a7df 1096 sub spawn {
1097 my $coderef = shift;
1098
1099 unless (@_ == 0 && $coderef && ref($coderef) eq 'CODE') {
1100 confess "usage: spawn CODEREF";
1101 }
1102
1103 my $pid;
1104 if (!defined($pid = fork)) {
1105 logmsg "cannot fork: $!";
1106 return;
1107 } elsif ($pid) {
1108 logmsg "begat $pid";
1109 return; # I'm the parent
1110 }
1111 # else I'm the child -- go spawn
1112
1113 open(STDIN, "<&Client") || die "can't dup client to stdin";
1114 open(STDOUT, ">&Client") || die "can't dup client to stdout";
1115 ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
1116 exit &$coderef();
1117 }
1118
As you see, it's remarkably similar to the Internet domain TCP server, so
much so, in fact, that several functions--spawn(), logmsg(), and
REAPER()--are exactly the same as in the other server.
1123
1124So why would you ever want to use a Unix domain socket instead of a
1125simpler named pipe? Because a named pipe doesn't give you sessions. You
1126can't tell one process's data from another's. With socket programming,
1127you get a separate session for each client: that's why accept() takes two
1128arguments.
1129
1130For example, let's say that you have a long running database server daemon
1131that you want folks from the World Wide Web to be able to access, but only
1132if they go through a CGI interface. You'd have a small, simple CGI
1133program that does whatever checks and logging you feel like, and then acts
1134as a Unix-domain client and connects to your private server.
1135
7b05b7e3 1136=head1 TCP Clients with IO::Socket
1137
1138For those preferring a higher-level interface to socket programming, the
1139IO::Socket module provides an object-oriented approach. IO::Socket is
1140included as part of the standard Perl distribution as of the 5.004
1141release. If you're running an earlier version of Perl, just fetch
106325ad 1142IO::Socket from CPAN, where you'll also find modules providing easy
7b05b7e3 1143interfaces to the following systems: DNS, FTP, Ident (RFC 931), NIS and
1144NISPlus, NNTP, Ping, POP3, SMTP, SNMP, SSLeay, Telnet, and Time--just
1145to name a few.
1146
1147=head2 A Simple Client
1148
1149Here's a client that creates a TCP connection to the "daytime"
1150service at port 13 of the host name "localhost" and prints out everything
1151that the server there cares to provide.
1152
1153 #!/usr/bin/perl -w
1154 use IO::Socket;
1155 $remote = IO::Socket::INET->new(
1156 Proto => "tcp",
1157 PeerAddr => "localhost",
1158 PeerPort => "daytime(13)",
1159 )
1160 or die "cannot connect to daytime port at localhost";
1161 while ( <$remote> ) { print }
1162
1163When you run this program, you should get something back that
1164looks like this:
1165
1166 Wed May 14 08:40:46 MDT 1997
1167
1168Here are what those parameters to the C<new> constructor mean:
1169
13a2d996 1170=over 4
7b05b7e3 1171
1172=item C<Proto>
1173
This is which protocol to use. In this case, the socket handle returned
will be connected to a TCP socket, because we want a stream-oriented
connection, that is, one that acts pretty much like a plain old file.
Not all sockets are of this type. For example, the UDP protocol
can be used to make a datagram socket, used for message-passing.
1179
1180=item C<PeerAddr>
1181
1182This is the name or Internet address of the remote host the server is
1183running on. We could have specified a longer name like C<"www.perl.com">,
1184or an address like C<"204.148.40.9">. For demonstration purposes, we've
1185used the special hostname C<"localhost">, which should always mean the
1186current machine you're running on. The corresponding Internet address
1187for localhost is C<"127.1">, if you'd rather use that.
1188
1189=item C<PeerPort>
1190
This is the service name or port number we'd like to connect to.
We could have gotten away with using just C<"daytime"> on systems with a
well-configured system services file (F</etc/services> under Unix),
but just in case, we've specified the
port number (13) in parentheses. Using just the number would also have
worked, but constant numbers make careful programmers nervous.
1197
1198=back
1199
1200Notice how the return value from the C<new> constructor is used as
1201a filehandle in the C<while> loop? That's what's called an indirect
1202filehandle, a scalar variable containing a filehandle. You can use
1203it the same way you would a normal filehandle. For example, you
1204can read one line from it this way:
1205
1206 $line = <$handle>;
1207
all remaining lines from it this way:
1209
1210 @lines = <$handle>;
1211
1212and send a line of data to it this way:
1213
1214 print $handle "some data\n";
1215
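The three operations above can be tried without a network connection. This is a minimal sketch in which in-memory filehandles stand in for the socket; C<first_and_rest> is a hypothetical helper name.

```perl
use strict;
use warnings;

# Read the first line and then all remaining lines from any handle.
sub first_and_rest {
    my ($handle) = @_;
    my $line  = <$handle>;      # scalar context: the next line only
    my @lines = <$handle>;      # list context: everything that is left
    return ($line, @lines);
}

# In-memory filehandles stand in for the socket here.
my $incoming = "one\ntwo\nthree\n";
open(my $in, '<', \$incoming) or die "can't open in-memory handle: $!";
my ($first, @rest) = first_and_rest($in);
print "first: $first";
print "rest:  $_" for @rest;

my $outgoing = '';
open(my $out, '>', \$outgoing) or die "can't open in-memory handle: $!";
print $out "some data\n";       # no comma between the handle and the data
close($out);
print "sent:  $outgoing";
```

The same subroutine works unchanged when handed the object returned by C<< IO::Socket::INET->new >>, since that object is used like any other indirect filehandle.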
1216=head2 A Webget Client
1217
1218Here's a simple client that takes a remote host to fetch a document
1219from, and then a list of documents to get from that host. This is a
1220more interesting client than the previous one because it first sends
1221something to the server before fetching the server's response.
1222
1223 #!/usr/bin/perl -w
1224 use IO::Socket;
1225 unless (@ARGV > 1) { die "usage: $0 host document ..." }
1226 $host = shift(@ARGV);
5a964f20 1227 $EOL = "\015\012";
1228 $BLANK = $EOL x 2;
7b05b7e3 1229 foreach $document ( @ARGV ) {
1230 $remote = IO::Socket::INET->new( Proto => "tcp",
1231 PeerAddr => $host,
1232 PeerPort => "http(80)",
1233 );
1234 unless ($remote) { die "cannot connect to http daemon on $host" }
1235 $remote->autoflush(1);
5a964f20 1236 print $remote "GET $document HTTP/1.0" . $BLANK;
7b05b7e3 1237 while ( <$remote> ) { print }
1238 close $remote;
1239 }
1240
The web server handling the "http" service is assumed to be at
its standard port, number 80. If the web server you're trying to
connect to is at a different port (like 1080 or 8080), you should specify
it as the named-parameter pair, C<< PeerPort => 8080 >>. The C<autoflush>
method is used on the socket because otherwise the system would buffer
up the output we sent it. (If you're on a Mac, you'll also need to
change every C<"\n"> in your code that sends data over the network to
be a C<"\015\012"> instead.)
1249
1250Connecting to the server is only the first part of the process: once you
1251have the connection, you have to use the server's language. Each server
1252on the network has its own little command language that it expects as
1253input. The string that we send to the server starting with "GET" is in
1254HTTP syntax. In this case, we simply request each specified document.
1255Yes, we really are making a new connection for each document, even though
1256it's the same host. That's the way you always used to have to speak HTTP.
1257Recent versions of web browsers may request that the remote server leave
1258the connection open a little while, but the server doesn't have to honor
1259such a request.
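To make the "server's language" concrete, here is a minimal sketch of both halves of that exchange: building the request line that webget sends, and picking apart the first line of the reply. The helper names C<http_request> and C<parse_status_line> are hypothetical, and the pattern covers only the simple status-line shape seen in this document.

```perl
use strict;
use warnings;

# Build the request for one document: a GET line followed by a blank line.
sub http_request {
    my ($document) = @_;
    my $EOL = "\015\012";
    return "GET $document HTTP/1.0" . $EOL x 2;
}

# Split a status line such as "HTTP/1.1 404 File Not Found" into
# (version, code, reason).  Returns an empty list if it does not match.
sub parse_status_line {
    my ($line) = @_;
    $line =~ s/\015?\012\z//;
    my ($version, $code, $reason) =
        $line =~ m{\AHTTP/([0-9]+\.[0-9]+) ([0-9]{3})(?: (.*))?\z}
        or return;
    return ($version, $code, defined($reason) ? $reason : '');
}

my ($version, $code, $reason) =
    parse_status_line("HTTP/1.1 404 File Not Found\015\012");
print "server speaks HTTP/$version and answered $code ($reason)\n";
```

A real client would go on to read the header lines up to the first empty line before treating the rest as the document body.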
1260
1261Here's an example of running that program, which we'll call I<webget>:
1262
5a964f20 1263 % webget www.perl.com /guanaco.html
7b05b7e3 1264 HTTP/1.1 404 File Not Found
1265 Date: Thu, 08 May 1997 18:02:32 GMT
1266 Server: Apache/1.2b6
1267 Connection: close
1268 Content-type: text/html
1269
1270 <HEAD><TITLE>404 File Not Found</TITLE></HEAD>
1271 <BODY><H1>File Not Found</H1>
1272 The requested URL /guanaco.html was not found on this server.<P>
1273 </BODY>
1274
1275Ok, so that's not very interesting, because it didn't find that
1276particular document. But a long response wouldn't have fit on this page.
1277
1278For a more fully-featured version of this program, you should look to
1279the I<lwp-request> program included with the LWP modules from CPAN.
1280
1281=head2 Interactive Client with IO::Socket
1282
1283Well, that's all fine if you want to send one command and get one answer,
1284but what about setting up something fully interactive, somewhat like
1285the way I<telnet> works? That way you can type a line, get the answer,
1286type a line, get the answer, etc.
1287
This client is more complicated than the two we've done so far, but if
you're on a system that supports the powerful C<fork> call, the solution
isn't that rough. Once you've made the connection to whatever service
you'd like to chat with, call C<fork> to clone your process. Each of
these two identical processes has a very simple job to do: the parent
copies everything from the socket to standard output, while the child
simultaneously copies everything from standard input to the socket.
To accomplish the same thing using just one process would be I<much>
harder, because it's easier to code two processes to do one thing than it
is to code one process to do two things. (This keep-it-simple principle
is one of the cornerstones of the Unix philosophy, and of good software
engineering as well, which is probably why it's spread to other systems.)
7b05b7e3 1300
1301Here's the code:
1302
1303 #!/usr/bin/perl -w
1304 use strict;
1305 use IO::Socket;
1306 my ($host, $port, $kidpid, $handle, $line);
1307
1308 unless (@ARGV == 2) { die "usage: $0 host port" }
1309 ($host, $port) = @ARGV;
1310
1311 # create a tcp connection to the specified host and port
1312 $handle = IO::Socket::INET->new(Proto => "tcp",
1313 PeerAddr => $host,
1314 PeerPort => $port)
1315 or die "can't connect to port $port on $host: $!";
1316
1317 $handle->autoflush(1); # so output gets there right away
1318 print STDERR "[Connected to $host:$port]\n";
1319
1320 # split the program into two processes, identical twins
1321 die "can't fork: $!" unless defined($kidpid = fork());
1322
1323 # the if{} block runs only in the parent process
1324 if ($kidpid) {
1325 # copy the socket to standard output
1326 while (defined ($line = <$handle>)) {
1327 print STDOUT $line;
1328 }
1329 kill("TERM", $kidpid); # send SIGTERM to child
1330 }
1331 # the else{} block runs only in the child process
1332 else {
1333 # copy standard input to the socket
1334 while (defined ($line = <STDIN>)) {
1335 print $handle $line;
1336 }
1337 }
1338
The C<kill> function in the parent's C<if> block is there to send a
signal to our child process (currently running in the C<else> block)
as soon as the remote server has closed its end of the connection.
1342
If the remote server sends data a byte at a time, and you need that
data immediately without waiting for a newline (which might not happen),
you may wish to replace the C<while> loop in the parent with the
following:
1347
1348 my $byte;
1349 while (sysread($handle, $byte, 1) == 1) {
1350 print STDOUT $byte;
1351 }
1352
1353Making a system call for each byte you want to read is not very efficient
1354(to put it mildly) but is the simplest to explain and works reasonably
1355well.
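If the per-byte system calls ever become a bottleneck, one possible alternative (a sketch, not part of the original client) is to ask C<sysread> for a larger chunk. On a pipe or socket it returns as soon as some data is available, so output that lacks a trailing newline still shows up promptly. The helper name C<copy_stream> and the 4096-byte default are assumptions made for this illustration, and a pipe stands in for the socket.

```perl
use strict;
use warnings;

# Copy everything from $in to $out, reading up to $chunk bytes per call.
# Returns the number of bytes copied.
sub copy_stream {
    my ($in, $out, $chunk) = @_;
    $chunk ||= 4096;
    my $total = 0;
    while (1) {
        my $got = sysread($in, my $buf, $chunk);
        die "sysread: $!" unless defined $got;
        last if $got == 0;                  # end of file
        my $off = 0;
        while ($off < $got) {               # syswrite may write less than asked
            my $wrote = syswrite($out, $buf, $got - $off, $off);
            die "syswrite: $!" unless defined $wrote;
            $off += $wrote;
        }
        $total += $got;
    }
    return $total;
}

# A pipe stands in for the socket; note the prompt without a newline.
pipe(my $rd, my $wr) or die "pipe: $!";
syswrite($wr, "Command? ");
close($wr);
my $copied = copy_stream($rd, \*STDOUT);
syswrite(STDOUT, "\n[$copied bytes copied]\n");
```

The sketch sticks to C<sysread> and C<syswrite> throughout, because mixing them with buffered calls such as C<print> on the same handle can reorder the output.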
1356
1357=head1 TCP Servers with IO::Socket
1358
As always, setting up a server is a little bit more involved than running a client.
The model is that the server creates a special kind of socket that
does nothing but listen on a particular port for incoming connections.
It does this by calling the C<< IO::Socket::INET->new() >> method with
slightly different arguments than the client did.
1364
13a2d996 1365=over 4
7b05b7e3 1366
1367=item Proto
1368
1369This is which protocol to use. Like our clients, we'll
1370still specify C<"tcp"> here.
1371
1372=item LocalPort
1373
We specify a local
port in the C<LocalPort> argument, which we didn't do for the client.
This is the service name or port number for which you want to be the
server. (Under Unix, ports under 1024 are restricted to the
superuser.) In our sample, we'll use port 9000, but you can use
any port that's not currently in use on your system. If you try
to use one already in use, you'll get an "Address already in use"
message. Under Unix, the C<netstat -a> command will show
which services currently have servers.
1383
1384=item Listen
1385
1386The C<Listen> parameter is set to the maximum number of
1387pending connections we can accept until we turn away incoming clients.
1388Think of it as a call-waiting queue for your telephone.
1389The low-level Socket module has a special symbol for the system maximum, which
1390is SOMAXCONN.
1391
1392=item Reuse
1393
The C<Reuse> parameter is needed so that we can restart our server
manually without waiting a few minutes to allow system buffers to
clear out.
1397
1398=back
1399
1400Once the generic server socket has been created using the parameters
1401listed above, the server then waits for a new client to connect
d1be9408 1402to it. The server blocks in the C<accept> method, which eventually accepts a
1403bidirectional connection from the remote client. (Make sure to autoflush
7b05b7e3 1404this handle to circumvent buffering.)
1405
1406To add to user-friendliness, our server prompts the user for commands.
1407Most servers don't do this. Because of the prompt without a newline,
1408you'll have to use the C<sysread> variant of the interactive client above.
1409
1410This server accepts one of five different commands, sending output
1411back to the client. Note that unlike most network servers, this one
1412only handles one incoming client at a time. Multithreaded servers are
f83494b9 1413covered in Chapter 6 of the Camel.
7b05b7e3 1414
Here's the code.
1416
1417 #!/usr/bin/perl -w
1418 use IO::Socket;
1419 use Net::hostent; # for OO version of gethostbyaddr
1420
1421 $PORT = 9000; # pick something not in use
1422
1423 $server = IO::Socket::INET->new( Proto => 'tcp',
1424 LocalPort => $PORT,
1425 Listen => SOMAXCONN,
1426 Reuse => 1);
1427
1428 die "can't setup server" unless $server;
1429 print "[Server $0 accepting clients]\n";
1430
1431 while ($client = $server->accept()) {
1432 $client->autoflush(1);
1433 print $client "Welcome to $0; type help for command list.\n";
1434 $hostinfo = gethostbyaddr($client->peeraddr);
78fc38e1 1435 printf "[Connect from %s]\n", $hostinfo ? $hostinfo->name : $client->peerhost;
7b05b7e3 1436 print $client "Command? ";
1437 while ( <$client>) {
1438 next unless /\S/; # blank line
1439 if (/quit|exit/i) { last; }
1440 elsif (/date|time/i) { printf $client "%s\n", scalar localtime; }
1441 elsif (/who/i ) { print $client `who 2>&1`; }
1442 elsif (/cookie/i ) { print $client `/usr/games/fortune 2>&1`; }
1443 elsif (/motd/i ) { print $client `cat /etc/motd 2>&1`; }
1444 else {
1445 print $client "Commands: quit date who cookie motd\n";
1446 }
1447 } continue {
1448 print $client "Command? ";
1449 }
1450 close $client;
1451 }
1452
1453=head1 UDP: Message Passing
4633a7c4 1454
1455Another kind of client-server setup is one that uses not connections, but
1456messages. UDP communications involve much lower overhead but also provide
1457less reliability, as there are no promises that messages will arrive at
1458all, let alone in order and unmangled. Still, UDP offers some advantages
1459over TCP, including being able to "broadcast" or "multicast" to a whole
1460bunch of destination hosts at once (usually on your local subnet). If you
1461find yourself overly concerned about reliability and start building checks
6a3992aa 1462into your message system, then you probably should use just TCP to start
4633a7c4 1463with.
1464
90034919 1465Note that UDP datagrams are I<not> a bytestream and should not be treated
1466as such. This makes using I/O mechanisms with internal buffering
1467like stdio (i.e. print() and friends) especially cumbersome. Use syswrite(),
1468or better send(), like in the example below.
1469
4633a7c4 1470Here's a UDP program similar to the sample Internet TCP client given
7b05b7e3 1471earlier. However, instead of checking one host at a time, the UDP version
4633a7c4 1472will check many of them asynchronously by simulating a multicast and then
1473using select() to do a timed-out wait for I/O. To do something similar
1474with TCP, you'd have to use a different socket handle for each host.
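The timed-out wait with select() can be tried in isolation. This is a minimal sketch that uses a pipe instead of a UDP socket so that it runs without a network; C<wait_readable> is a hypothetical helper name and the quarter-second timeout is arbitrary.

```perl
use strict;
use warnings;

# Wait up to $timeout seconds for $fh to become readable.
# Returns true if there is something to read, false on timeout or error.
sub wait_readable {
    my ($fh, $timeout) = @_;
    my $rin = '';
    vec($rin, fileno($fh), 1) = 1;      # set the bit for this descriptor
    my $nfound = select(my $rout = $rin, undef, undef, $timeout);
    return $nfound > 0;
}

pipe(my $rd, my $wr) or die "pipe: $!";

my $before = wait_readable($rd, 0.25);  # nothing has been written yet
syswrite($wr, "x");
my $after  = wait_readable($rd, 0.25);  # one byte is now waiting

print "before write: ", ($before ? "readable" : "timed out"), "\n";
print "after write:  ", ($after  ? "readable" : "timed out"), "\n";
```

select() modifies its bit-vector arguments in place, which is why a copy of C<$rin> is passed rather than C<$rin> itself.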
1475
1476 #!/usr/bin/perl -w
1477 use strict;
4633a7c4 1478 use Socket;
1479 use Sys::Hostname;
1480
54310121 1481 my ( $count, $hisiaddr, $hispaddr, $histime,
1482 $host, $iaddr, $paddr, $port, $proto,
4633a7c4 1483 $rin, $rout, $rtime, $SECS_of_70_YEARS);
1484
1485 $SECS_of_70_YEARS = 2208988800;
1486
1487 $iaddr = gethostbyname(hostname());
1488 $proto = getprotobyname('udp');
1489 $port = getservbyname('time', 'udp');
1490 $paddr = sockaddr_in(0, $iaddr); # 0 means let kernel pick
1491
1492 socket(SOCKET, PF_INET, SOCK_DGRAM, $proto) || die "socket: $!";
1493 bind(SOCKET, $paddr) || die "bind: $!";
1494
1495 $| = 1;
1496 printf "%-12s %8s %s\n", "localhost", 0, scalar localtime time;
1497 $count = 0;
1498 for $host (@ARGV) {
1499 $count++;
1500 $hisiaddr = inet_aton($host) || die "unknown host";
1501 $hispaddr = sockaddr_in($port, $hisiaddr);
1502 defined(send(SOCKET, 0, 0, $hispaddr)) || die "send $host: $!";
1503 }
1504
1505 $rin = '';
1506 vec($rin, fileno(SOCKET), 1) = 1;
1507
1508 # timeout after 10.0 seconds
1509 while ($count && select($rout = $rin, undef, undef, 10.0)) {
1510 $rtime = '';
1511 ($hispaddr = recv(SOCKET, $rtime, 4, 0)) || die "recv: $!";
1512 ($port, $hisiaddr) = sockaddr_in($hispaddr);
1513 $host = gethostbyaddr($hisiaddr, AF_INET);
1514 $histime = unpack("N", $rtime) - $SECS_of_70_YEARS ;
1515 printf "%-12s ", $host;
1516 printf "%8d %s\n", $histime - time, scalar localtime($histime);
1517 $count--;
1518 }
1519
Note that this example does not include any retries and may consequently
fail to contact a reachable host. The most prominent reason for this
is congestion of the queues on the sending host if the number of
hosts to contact is sufficiently large.
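One way to add retries is to wrap each attempt in a small loop. The following is only a sketch of that shape: C<with_retries> is a hypothetical helper name, and the callback is a stand-in for "send a datagram, wait with select(), then recv()" rather than real network code.

```perl
use strict;
use warnings;

# Run $attempt up to $tries times and return its first defined result,
# or nothing if every attempt came back empty.
sub with_retries {
    my ($tries, $attempt) = @_;
    for my $n (1 .. $tries) {
        my $result = $attempt->($n);
        return $result if defined $result;
    }
    return;
}

# Stand-in for a real exchange: the first two attempts pretend that no
# reply arrived before the timeout.
my $reply = with_retries(3, sub {
    my ($n) = @_;
    return $n < 3 ? undef : "reply on attempt $n";
});
my $summary = defined($reply) ? $reply : "no answer";
print "$summary\n";
```

With UDP a retry can produce duplicate replies, so a real client would also need to ignore answers from hosts it has already heard from.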
90034919 1524
=head1 SysV IPC

While System V IPC isn't as widely used as sockets, it still has some
interesting uses.  You can't, however, effectively use SysV IPC or
Berkeley mmap() to share a Perl variable amongst several processes.
That's because Perl may reallocate the variable's string buffer when
you don't expect it to.

Here's a small example showing shared memory usage.

    use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRWXU);

    $size = 2000;
    $id = shmget(IPC_PRIVATE, $size, S_IRWXU);
    defined($id) || die "shmget: $!";    # an id of 0 is valid
    print "shm id $id\n";

    $message = "Message #1";
    shmwrite($id, $message, 0, 60) || die "shmwrite: $!";
    print "wrote: '$message'\n";
    shmread($id, $buff, 0, 60) || die "shmread: $!";
    print "read : '$buff'\n";

    # the buffer of shmread is zero-character end-padded.
    substr($buff, index($buff, "\0")) = '';
    print "un" unless $buff eq $message;
    print "swell\n";

    print "deleting shm $id\n";
    shmctl($id, IPC_RMID, 0) || die "shmctl: $!";

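Because the segment holds raw bytes rather than a live Perl variable,
structured data has to be copied in and out explicitly.  The following
sketch shows one possible approach, assuming a length-prefixed record so
the reader need not search for the zero padding.  The helper names
C<shm_store> and C<shm_fetch> are made up for this illustration.

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRUSR S_IWUSR);

my $size = 256;
my $id   = shmget(IPC_PRIVATE, $size, S_IRUSR | S_IWUSR);
defined($id) || die "shmget: $!";

# Hypothetical helper: store a 4-byte length followed by the payload.
sub shm_store {
    my ($id, $data) = @_;
    my $record = pack("N/a*", $data);
    shmwrite($id, $record, 0, length $record) || die "shmwrite: $!";
}

# Hypothetical helper: read the segment and return just the payload.
sub shm_fetch {
    my ($id, $size) = @_;
    my $raw = '';
    shmread($id, $raw, 0, $size) || die "shmread: $!";
    return unpack("N/a*", $raw);
}

shm_store($id, join(",", 3, 1, 4));
my @values = split /,/, shm_fetch($id, $size);
print "@values\n";                      # prints "3 1 4"

shmctl($id, IPC_RMID, 0) || die "shmctl: $!";
```

Each process works on its own copy of the data; access from several
writers would still need to be coordinated, for instance with a
semaphore.
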
Here's an example of a semaphore:

    use IPC::SysV qw(IPC_CREAT);

    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 10, 0666 | IPC_CREAT);
    defined($id) || die "semget: $!";
    print "sem id $id\n";

Put this code in a separate file to be run in more than one process.
Call the file F<take>:

    # look up the existing semaphore set

    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 0, 0);
    defined($id) || die "semget: $!";

    $semnum = 0;
    $semflag = 0;

    # 'take' semaphore
    # wait for semaphore to be zero
    $semop = 0;
    $opstring1 = pack("s!s!s!", $semnum, $semop, $semflag);

    # Increment the semaphore count
    $semop = 1;
    $opstring2 = pack("s!s!s!", $semnum, $semop, $semflag);
    $opstring = $opstring1 . $opstring2;

    semop($id, $opstring) || die "semop: $!";

Put this code in a separate file to be run in more than one process.
Call this file F<give>:

    # 'give' the semaphore
    # run this in the original process and you will see
    # that the second process continues

    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 0, 0);
    defined($id) || die "semget: $!";

    $semnum = 0;
    $semflag = 0;

    # Decrement the semaphore count
    $semop = -1;
    $opstring = pack("s!s!s!", $semnum, $semop, $semflag);

    semop($id, $opstring) || die "semop: $!";

The SysV IPC code above was written long ago, and it's definitely
clunky looking.  For a more modern look, see the IPC::SysV module and
its object-oriented companions IPC::Msg and IPC::Semaphore, which have
been included with Perl since version 5.005.

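As a rough illustration of that more modern look, the F<take> and
F<give> steps can be written with IPC::Semaphore.  This is a
single-process sketch using a private semaphore set rather than the
shared key 1234, so the second "take" is attempted with C<IPC_NOWAIT>
to show that it would otherwise block.

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_NOWAIT S_IRUSR S_IWUSR);
use IPC::Semaphore;

# A private set holding one semaphore, starting at zero ("free").
my $sem = IPC::Semaphore->new(IPC_PRIVATE, 1,
                              S_IRUSR | S_IWUSR | IPC_CREAT)
    || die "semget: $!";
$sem->setval(0, 0) || die "setval: $!";

# "take": wait for zero, then increment, as one atomic operation list.
# Each triple is (semaphore number, operation, flags).
$sem->op(0, 0, 0,   0, 1, 0) || die "take: $!";
my $after_take = $sem->getval(0);
print "after take: $after_take\n";      # prints "after take: 1"

# A second take would block; with IPC_NOWAIT it fails immediately.
my $second = $sem->op(0, 0, IPC_NOWAIT,   0, 1, 0);
print "second take refused\n" unless $second;

# "give": decrement back to zero.
$sem->op(0, -1, 0) || die "give: $!";
my $after_give = $sem->getval(0);
print "after give: $after_give\n";      # prints "after give: 0"

$sem->remove || die "remove: $!";
```
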
A small example demonstrating SysV message queues:

    use IPC::SysV qw(IPC_PRIVATE IPC_RMID IPC_CREAT S_IRWXU);

    my $id = msgget(IPC_PRIVATE, IPC_CREAT | S_IRWXU);

    my $sent = "message";
    my $type_sent = 1234;
    my $rcvd;
    my $type_rcvd;

    if (defined $id) {
        if (msgsnd($id, pack("l! a*", $type_sent, $sent), 0)) {
            if (msgrcv($id, $rcvd, 60, 0, 0)) {
                ($type_rcvd, $rcvd) = unpack("l! a*", $rcvd);
                if ($rcvd eq $sent) {
                    print "okay\n";
                } else {
                    print "not okay\n";
                }
            } else {
                die "# msgrcv failed: $!\n";
            }
        } else {
            die "# msgsnd failed: $!\n";
        }
        msgctl($id, IPC_RMID, 0) || die "# msgctl failed: $!\n";
    } else {
        die "# msgget failed: $!\n";
    }

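The same exchange can be sketched with the IPC::Msg wrapper, which does
the C<pack("l! a*", ...)> step internally.  This is a minimal
illustration rather than a drop-in replacement. It uses a private queue
and sends and receives within one process.

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT S_IRUSR S_IWUSR);
use IPC::Msg;

my $queue = IPC::Msg->new(IPC_PRIVATE, S_IRUSR | S_IWUSR | IPC_CREAT)
    || die "msgget: $!";

# snd() takes the message type and the payload separately.
$queue->snd(1234, "message") || die "msgsnd: $!";

# rcv() stores the payload in the buffer and returns the message type.
# A requested type of 0 means "first message of any type".
my $buf;
my $type = $queue->rcv($buf, 60, 0);
defined($type) || die "msgrcv: $!";
print "type $type: $buf\n";             # prints "type 1234: message"

$queue->remove || die "msgctl: $!";
```
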
=head1 NOTES

Most of these routines quietly but politely return C<undef> when they
fail instead of causing your program to die right then and there due to
an uncaught exception.  (Actually, some of the new I<Socket> conversion
functions croak() on bad arguments.)  It is therefore essential to
check return values from these functions.  Always begin your socket
programs this way for optimal success, and don't forget to add the B<-T>
taint-checking flag to the #! line for servers:

    #!/usr/bin/perl -Tw
    use strict;
    use sigtrap;
    use Socket;

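The two failure styles described above look like this in practice.
The sketch uses a deliberately bogus protocol name and a deliberately
malformed address. Both are placeholders chosen to force a failure,
not values a real program would use.

```perl
use strict;
use warnings;
use Socket;

# Failure reported through the return value: nothing dies, so the
# caller has to look.
my $proto = getprotobyname('no-such-protocol');
print "lookup failed\n" unless defined $proto;

# Failure reported by croak(): inet_ntoa() wants a packed 4-byte
# address, so a 3-byte string raises an exception.  Trap it with eval.
my $text = eval { inet_ntoa("abc") };
my $err  = $@;
print "inet_ntoa croaked: $err" unless defined $text;
```
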
=head1 BUGS

All these routines create system-specific portability problems.  As noted
elsewhere, Perl is at the mercy of your C libraries for much of its system
behaviour.  It's probably safest to assume broken SysV semantics for
signals and to stick with simple TCP and UDP socket operations; e.g., don't
try to pass open file descriptors over a local UDP datagram socket if you
want your code to stand a chance of being portable.

=head1 AUTHOR

Tom Christiansen, with occasional vestiges of Larry Wall's original
version and suggestions from the Perl Porters.

=head1 SEE ALSO

There's a lot more to networking than this, but this should get you
started.

For intrepid programmers, the indispensable textbook is I<Unix
Network Programming, 2nd Edition, Volume 1> by W. Richard Stevens
(published by Prentice-Hall).  Note that most books on networking
address the subject from the perspective of a C programmer; translation
to Perl is left as an exercise for the reader.

The IO::Socket(3) manpage describes the object library, and the Socket(3)
manpage describes the low-level interface to sockets.  Besides the obvious
functions in L<perlfunc>, you should also check out the F<modules> file
at your nearest CPAN site.  (See L<perlmodlib> or best yet, the F<Perl
FAQ> for a description of what CPAN is and where to get it.)

Section 5 of the F<modules> file is devoted to "Networking, Device Control
(modems), and Interprocess Communication", and contains numerous unbundled
modules for networking, Chat and Expect operations, CGI programming, DCE,
FTP, IPC, NNTP, Proxy, Ptty, RPC, SNMP, SMTP, Telnet, Threads, and
ToolTalk, to name just a few.