Re: [perl #18872] File::Basename example misleading
[p5sagit/p5-mst-13.2.git] / pod / perlipc.pod
CommitLineData
a0d0e21e 1=head1 NAME
2
184e9718 3perlipc - Perl interprocess communication (signals, fifos, pipes, safe subprocesses, sockets, and semaphores)
a0d0e21e 4
5=head1 DESCRIPTION
6
4633a7c4 7The basic IPC facilities of Perl are built out of the good old Unix
8signals, named pipes, pipe opens, the Berkeley socket routines, and SysV
9IPC calls. Each is used in slightly different situations.
10
11=head1 Signals
12
490f90af 13Perl uses a simple signal handling model: the %SIG hash contains names
14or references of user-installed signal handlers. These handlers will
15be called with an argument which is the name of the signal that
16triggered it. A signal may be generated intentionally from a
17particular keyboard sequence like control-C or control-Z, sent to you
18from another process, or triggered automatically by the kernel when
19special events transpire, like a child process exiting, your process
20running out of stack space, or hitting file size limit.
4633a7c4 21
a11adca0 22For example, to trap an interrupt signal, set up a handler like this:
4633a7c4 23
24 sub catch_zap {
25 my $signame = shift;
26 $shucks++;
27 die "Somebody sent me a SIG$signame";
54310121 28 }
4633a7c4 29 $SIG{INT} = 'catch_zap'; # could fail in modules
30 $SIG{INT} = \&catch_zap; # best strategy
31
490f90af 32Prior to Perl 5.7.3 it was necessary to do as little as you possibly
33could in your handler; notice how all we do is set a global variable
34and then raise an exception. That's because on most systems,
35libraries are not re-entrant; particularly, memory allocation and I/O
36routines are not. That meant that doing nearly I<anything> in your
37handler could in theory trigger a memory fault and subsequent core
38dump - see L<Deferred Signals> below.
a11adca0 39
4633a7c4 40The names of the signals are the ones listed out by C<kill -l> on your
41system, or you can retrieve them from the Config module. Set up an
42@signame list indexed by number to get the name and a %signo table
43indexed by name to get the number:
44
45 use Config;
46 defined $Config{sig_name} || die "No sigs?";
47 foreach $name (split(' ', $Config{sig_name})) {
48 $signo{$name} = $i;
49 $signame[$i] = $name;
50 $i++;
54310121 51 }
4633a7c4 52
6a3992aa 53So to check whether signal 17 and SIGALRM were the same, do just this:
4633a7c4 54
55 print "signal #17 = $signame[17]\n";
54310121 56 if ($signo{ALRM}) {
4633a7c4 57 print "SIGALRM is $signo{ALRM}\n";
54310121 58 }
4633a7c4 59
60You may also choose to assign the strings C<'IGNORE'> or C<'DEFAULT'> as
61the handler, in which case Perl will try to discard the signal or do the
f648820c 62default thing.
63
19799a22 64On most Unix platforms, the C<CHLD> (sometimes also known as C<CLD>) signal
f648820c 65has special behavior with respect to a value of C<'IGNORE'>.
66Setting C<$SIG{CHLD}> to C<'IGNORE'> on such a platform has the effect of
67not creating zombie processes when the parent process fails to C<wait()>
68on its child processes (i.e. child processes are automatically reaped).
69Calling C<wait()> with C<$SIG{CHLD}> set to C<'IGNORE'> usually returns
70C<-1> on such platforms.
71
72Some signals can be neither trapped nor ignored, such as
4633a7c4 73the KILL and STOP (but not the TSTP) signals. One strategy for
74temporarily ignoring signals is to use a local() statement, which will be
75automatically restored once your block is exited. (Remember that local()
76values are "inherited" by functions called from within that block.)
77
78 sub precious {
79 local $SIG{INT} = 'IGNORE';
80 &more_functions;
54310121 81 }
4633a7c4 82 sub more_functions {
83 # interrupts still ignored, for now...
54310121 84 }
4633a7c4 85
86Sending a signal to a negative process ID means that you send the signal
fb73857a 87to the entire Unix process-group. This code sends a hang-up signal to all
88processes in the current process group (and sets $SIG{HUP} to IGNORE so
89it doesn't kill itself):
4633a7c4 90
91 {
92 local $SIG{HUP} = 'IGNORE';
93 kill HUP => -$$;
94 # snazzy writing of: kill('HUP', -$$)
95 }
a0d0e21e 96
4633a7c4 97Another interesting signal to send is signal number zero. This doesn't
98actually affect another process, but instead checks whether it's alive
54310121 99or has changed its UID.
a0d0e21e 100
4633a7c4 101 unless (kill 0 => $kid_pid) {
102 warn "something wicked happened to $kid_pid";
54310121 103 }
a0d0e21e 104
4633a7c4 105You might also want to employ anonymous functions for simple signal
106handlers:
a0d0e21e 107
4633a7c4 108 $SIG{INT} = sub { die "\nOutta here!\n" };
a0d0e21e 109
4633a7c4 110But that will be problematic for the more complicated handlers that need
54310121 111to reinstall themselves. Because Perl's signal mechanism is currently
184e9718 112based on the signal(3) function from the C library, you may sometimes be so
4633a7c4 113misfortunate as to run on systems where that function is "broken", that
114is, it behaves in the old unreliable SysV way rather than the newer, more
115reasonable BSD and POSIX fashion. So you'll see defensive people writing
116signal handlers like this:
a0d0e21e 117
54310121 118 sub REAPER {
4633a7c4 119 $waitedpid = wait;
6a3992aa 120 # loathe sysV: it makes us not only reinstate
121 # the handler, but place it after the wait
54310121 122 $SIG{CHLD} = \&REAPER;
4633a7c4 123 }
124 $SIG{CHLD} = \&REAPER;
125 # now do something that forks...
126
816229cf 127or better still:
4633a7c4 128
6a3992aa 129 use POSIX ":sys_wait_h";
54310121 130 sub REAPER {
4633a7c4 131 my $child;
816229cf 132 # If a second child dies while in the signal handler caused by the
133 # first death, we won't get another signal. So must loop here else
134 # we will leave the unreaped child as a zombie. And the next time
135 # two children die we get another zombie. And so on.
1450d070 136 while (($child = waitpid(-1,WNOHANG)) > 0) {
4633a7c4 137 $Kid_Status{$child} = $?;
54310121 138 }
6a3992aa 139 $SIG{CHLD} = \&REAPER; # still loathe sysV
4633a7c4 140 }
141 $SIG{CHLD} = \&REAPER;
142 # do something that forks...
143
144Signal handling is also used for timeouts in Unix, While safely
145protected within an C<eval{}> block, you set a signal handler to trap
146alarm signals and then schedule to have one delivered to you in some
147number of seconds. Then try your blocking operation, clearing the alarm
148when it's done but not before you've exited your C<eval{}> block. If it
149goes off, you'll use die() to jump out of the block, much as you might
150using longjmp() or throw() in other languages.
151
152Here's an example:
153
54310121 154 eval {
4633a7c4 155 local $SIG{ALRM} = sub { die "alarm clock restart" };
54310121 156 alarm 10;
4633a7c4 157 flock(FH, 2); # blocking write lock
54310121 158 alarm 0;
4633a7c4 159 };
160 if ($@ and $@ !~ /alarm clock restart/) { die }
161
8a4f6ac2 162If the operation being timed out is system() or qx(), this technique
163is liable to generate zombies. If this matters to you, you'll
164need to do your own fork() and exec(), and kill the errant child process.
165
4633a7c4 166For more complex signal handling, you might see the standard POSIX
167module. Lamentably, this is almost entirely undocumented, but
168the F<t/lib/posix.t> file from the Perl source distribution has some
169examples in it.
170
28494392 171=head2 Handling the SIGHUP Signal in Daemons
172
173A process that usually starts when the system boots and shuts down
174when the system is shut down is called a daemon (Disk And Execution
175MONitor). If a daemon process has a configuration file which is
176modified after the process has been started, there should be a way to
177tell that process to re-read its configuration file, without stopping
178the process. Many daemons provide this mechanism using the C<SIGHUP>
179signal handler. When you want to tell the daemon to re-read the file
180you simply send it the C<SIGHUP> signal.
181
3031ea75 182Not all platforms automatically reinstall their (native) signal
183handlers after a signal delivery. This means that the handler works
184only the first time the signal is sent. The solution to this problem
185is to use C<POSIX> signal handlers if available, their behaviour
186is well-defined.
28494392 187
188The following example implements a simple daemon, which restarts
189itself every time the C<SIGHUP> signal is received. The actual code is
190located in the subroutine C<code()>, which simply prints some debug
191info to show that it works and should be replaced with the real code.
192
193 #!/usr/bin/perl -w
d6fd60d6 194
28494392 195 use POSIX ();
196 use FindBin ();
197 use File::Basename ();
198 use File::Spec::Functions;
d6fd60d6 199
28494392 200 $|=1;
d6fd60d6 201
28494392 202 # make the daemon cross-platform, so exec always calls the script
203 # itself with the right path, no matter how the script was invoked.
204 my $script = File::Basename::basename($0);
205 my $SELF = catfile $FindBin::Bin, $script;
d6fd60d6 206
28494392 207 # POSIX unmasks the sigprocmask properly
208 my $sigset = POSIX::SigSet->new();
209 my $action = POSIX::SigAction->new('sigHUP_handler',
210 $sigset,
211 &POSIX::SA_NODEFER);
212 POSIX::sigaction(&POSIX::SIGHUP, $action);
d6fd60d6 213
28494392 214 sub sigHUP_handler {
215 print "got SIGHUP\n";
216 exec($SELF, @ARGV) or die "Couldn't restart: $!\n";
217 }
d6fd60d6 218
28494392 219 code();
d6fd60d6 220
28494392 221 sub code {
222 print "PID: $$\n";
223 print "ARGV: @ARGV\n";
224 my $c = 0;
225 while (++$c) {
226 sleep 2;
227 print "$c\n";
228 }
229 }
230 __END__
231
232
4633a7c4 233=head1 Named Pipes
234
235A named pipe (often referred to as a FIFO) is an old Unix IPC
236mechanism for processes communicating on the same machine. It works
54310121 237just like a regular, connected anonymous pipes, except that the
4633a7c4 238processes rendezvous using a filename and don't have to be related.
239
240To create a named pipe, use the Unix command mknod(1) or on some
241systems, mkfifo(1). These may not be in your normal path.
242
243 # system return val is backwards, so && not ||
244 #
245 $ENV{PATH} .= ":/etc:/usr/etc";
54310121 246 if ( system('mknod', $path, 'p')
4633a7c4 247 && system('mkfifo', $path) )
248 {
5a964f20 249 die "mk{nod,fifo} $path failed";
54310121 250 }
4633a7c4 251
252
253A fifo is convenient when you want to connect a process to an unrelated
254one. When you open a fifo, the program will block until there's something
54310121 255on the other end.
4633a7c4 256
257For example, let's say you'd like to have your F<.signature> file be a
258named pipe that has a Perl program on the other end. Now every time any
6a3992aa 259program (like a mailer, news reader, finger program, etc.) tries to read
4633a7c4 260from that file, the reading program will block and your program will
6a3992aa 261supply the new signature. We'll use the pipe-checking file test B<-p>
4633a7c4 262to find out whether anyone (or anything) has accidentally removed our fifo.
263
264 chdir; # go home
265 $FIFO = '.signature';
266 $ENV{PATH} .= ":/etc:/usr/games";
267
268 while (1) {
269 unless (-p $FIFO) {
270 unlink $FIFO;
54310121 271 system('mknod', $FIFO, 'p')
4633a7c4 272 && die "can't mknod $FIFO: $!";
54310121 273 }
4633a7c4 274
275 # next line blocks until there's a reader
276 open (FIFO, "> $FIFO") || die "can't write $FIFO: $!";
277 print FIFO "John Smith (smith\@host.org)\n", `fortune -s`;
278 close FIFO;
6a3992aa 279 sleep 2; # to avoid dup signals
4633a7c4 280 }
a0d0e21e 281
4ffa73a3 282=head2 Deferred Signals (Safe signals)
5a964f20 283
490f90af 284In Perls before Perl 5.7.3 by installing Perl code to deal with
285signals, you were exposing yourself to danger from two things. First,
286few system library functions are re-entrant. If the signal interrupts
287while Perl is executing one function (like malloc(3) or printf(3)),
288and your signal handler then calls the same function again, you could
289get unpredictable behavior--often, a core dump. Second, Perl isn't
290itself re-entrant at the lowest levels. If the signal interrupts Perl
291while Perl is changing its own internal data structures, similarly
292unpredictable behaviour may result.
5a964f20 293
a11adca0 294There were two things you could do, knowing this: be paranoid or be
295pragmatic. The paranoid approach was to do as little as possible in your
5a964f20 296signal handler. Set an existing integer variable that already has a
297value, and return. This doesn't help you if you're in a slow system call,
298which will just restart. That means you have to C<die> to longjump(3) out
299of the handler. Even this is a little cavalier for the true paranoiac,
300who avoids C<die> in a handler because the system I<is> out to get you.
a11adca0 301The pragmatic approach was to say ``I know the risks, but prefer the
302convenience'', and to do anything you wanted in your signal handler,
303and be prepared to clean up core dumps now and again.
304
490f90af 305In Perl 5.7.3 and later to avoid these problems signals are
306"deferred"-- that is when the signal is delivered to the process by
307the system (to the C code that implements Perl) a flag is set, and the
308handler returns immediately. Then at strategic "safe" points in the
309Perl interpreter (e.g. when it is about to execute a new opcode) the
310flags are checked and the Perl level handler from %SIG is
311executed. The "deferred" scheme allows much more flexibility in the
312coding of signal handler as we know Perl interpreter is in a safe
313state, and that we are not in a system library function when the
314handler is called. However the implementation does differ from
315previous Perls in the following ways:
5a964f20 316
a11adca0 317=over 4
5a964f20 318
a11adca0 319=item Long running opcodes
320
490f90af 321As Perl interpreter only looks at the signal flags when it about to
322execute a new opcode if a signal arrives during a long running opcode
323(e.g. a regular expression operation on a very large string) then
324signal will not be seen until operation completes.
a11adca0 325
326=item Interrupting IO
327
490f90af 328When a signal is delivered (e.g. INT control-C) the operating system
329breaks into IO operations like C<read> (used to implement Perls
330E<lt>E<gt> operator). On older Perls the handler was called
331immediately (and as C<read> is not "unsafe" this worked well). With
332the "deferred" scheme the handler is not called immediately, and if
333Perl is using system's C<stdio> library that library may re-start the
334C<read> without returning to Perl and giving it a chance to call the
335%SIG handler. If this happens on your system the solution is to use
336C<:perlio> layer to do IO - at least on those handles which you want
337to be able to break into with signals. (The C<:perlio> layer checks
338the signal flags and calls %SIG handlers before resuming IO operation.)
339
340Note that the default in Perl 5.7.3 and later is to automatically use
341the C<:perlio> layer.
a11adca0 342
343=item Signals as "faults"
344
490f90af 345Certain signals e.g. SEGV, ILL, BUS are generated as a result of
346virtual memory or other "faults". These are normally fatal and there
347is little a Perl-level handler can do with them. (In particular the
348old signal scheme was particularly unsafe in such cases.) However if
349a %SIG handler is set the new scheme simply sets a flag and returns as
350described above. This may cause the operating system to try the
351offending machine instruction again and - as nothing has changed - it
352will generate the signal again. The result of this is a rather odd
353"loop". In future Perl's signal mechanism may be changed to avoid this
354- perhaps by simply disallowing %SIG handlers on signals of that
355type. Until then the work-round is not to set a %SIG handler on those
356signals. (Which signals they are is operating system dependant.)
a11adca0 357
358=item Signals triggered by operating system state
359
490f90af 360On some operating systems certain signal handlers are supposed to "do
361something" before returning. One example can be CHLD or CLD which
362indicates a child process has completed. On some operating systems the
363signal handler is expected to C<wait> for the completed child
364process. On such systems the deferred signal scheme will not work for
365those signals (it does not do the C<wait>). Again the failure will
366look like a loop as the operating system will re-issue the signal as
367there are un-waited-for completed child processes.
a11adca0 368
818c4caa 369=back
a0d0e21e 370
4ffa73a3 371If you want the old signal behaviour back regardless of possible
372memory corruption, set the environment variable C<PERL_SIGNALS> to
373C<"unsafe">.
374
4633a7c4 375=head1 Using open() for IPC
376
490f90af 377Perl's basic open() statement can also be used for unidirectional
378interprocess communication by either appending or prepending a pipe
379symbol to the second argument to open(). Here's how to start
380something up in a child process you intend to write to:
4633a7c4 381
54310121 382 open(SPOOLER, "| cat -v | lpr -h 2>/dev/null")
4633a7c4 383 || die "can't fork: $!";
384 local $SIG{PIPE} = sub { die "spooler pipe broke" };
385 print SPOOLER "stuff\n";
386 close SPOOLER || die "bad spool: $! $?";
387
388And here's how to start up a child process you intend to read from:
389
390 open(STATUS, "netstat -an 2>&1 |")
391 || die "can't fork: $!";
392 while (<STATUS>) {
393 next if /^(tcp|udp)/;
394 print;
54310121 395 }
a2eb9003 396 close STATUS || die "bad netstat: $! $?";
4633a7c4 397
398If one can be sure that a particular program is a Perl script that is
399expecting filenames in @ARGV, the clever programmer can write something
400like this:
401
5a964f20 402 % program f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile
4633a7c4 403
404and irrespective of which shell it's called from, the Perl program will
405read from the file F<f1>, the process F<cmd1>, standard input (F<tmpfile>
406in this case), the F<f2> file, the F<cmd2> command, and finally the F<f3>
407file. Pretty nifty, eh?
408
54310121 409You might notice that you could use backticks for much the
4633a7c4 410same effect as opening a pipe for reading:
411
412 print grep { !/^(tcp|udp)/ } `netstat -an 2>&1`;
413 die "bad netstat" if $?;
414
415While this is true on the surface, it's much more efficient to process the
416file one line or record at a time because then you don't have to read the
19799a22 417whole thing into memory at once. It also gives you finer control of the
4633a7c4 418whole process, letting you to kill off the child process early if you'd
419like.
420
421Be careful to check both the open() and the close() return values. If
422you're I<writing> to a pipe, you should also trap SIGPIPE. Otherwise,
423think of what happens when you start up a pipe to a command that doesn't
424exist: the open() will in all likelihood succeed (it only reflects the
425fork()'s success), but then your output will fail--spectacularly. Perl
426can't know whether the command worked because your command is actually
427running in a separate process whose exec() might have failed. Therefore,
6a3992aa 428while readers of bogus commands return just a quick end of file, writers
4633a7c4 429to bogus command will trigger a signal they'd better be prepared to
430handle. Consider:
431
5a964f20 432 open(FH, "|bogus") or die "can't fork: $!";
433 print FH "bang\n" or die "can't write: $!";
434 close FH or die "can't close: $!";
435
436That won't blow up until the close, and it will blow up with a SIGPIPE.
437To catch it, you could use this:
438
439 $SIG{PIPE} = 'IGNORE';
440 open(FH, "|bogus") or die "can't fork: $!";
441 print FH "bang\n" or die "can't write: $!";
442 close FH or die "can't close: status=$?";
4633a7c4 443
68dc0745 444=head2 Filehandles
445
5a964f20 446Both the main process and any child processes it forks share the same
447STDIN, STDOUT, and STDERR filehandles. If both processes try to access
45bc9206 448them at once, strange things can happen. You may also want to close
5a964f20 449or reopen the filehandles for the child. You can get around this by
450opening your pipe with open(), but on some systems this means that the
451child process cannot outlive the parent.
68dc0745 452
453=head2 Background Processes
454
455You can run a command in the background with:
456
7b05b7e3 457 system("cmd &");
68dc0745 458
459The command's STDOUT and STDERR (and possibly STDIN, depending on your
460shell) will be the same as the parent's. You won't need to catch
461SIGCHLD because of the double-fork taking place (see below for more
462details).
463
464=head2 Complete Dissociation of Child from Parent
465
466In some cases (starting server processes, for instance) you'll want to
893af57a 467completely dissociate the child process from the parent. This is
468often called daemonization. A well behaved daemon will also chdir()
469to the root directory (so it doesn't prevent unmounting the filesystem
470containing the directory from which it was launched) and redirect its
471standard file descriptors from and to F</dev/null> (so that random
472output doesn't wind up on the user's terminal).
473
474 use POSIX 'setsid';
475
476 sub daemonize {
477 chdir '/' or die "Can't chdir to /: $!";
478 open STDIN, '/dev/null' or die "Can't read /dev/null: $!";
479 open STDOUT, '>/dev/null'
480 or die "Can't write to /dev/null: $!";
481 defined(my $pid = fork) or die "Can't fork: $!";
482 exit if $pid;
483 setsid or die "Can't start a new session: $!";
484 open STDERR, '>&STDOUT' or die "Can't dup stdout: $!";
485 }
5a964f20 486
893af57a 487The fork() has to come before the setsid() to ensure that you aren't a
488process group leader (the setsid() will fail if you are). If your
489system doesn't have the setsid() function, open F</dev/tty> and use the
490C<TIOCNOTTY> ioctl() on it instead. See L<tty(4)> for details.
5a964f20 491
893af57a 492Non-Unix users should check their Your_OS::Process module for other
493solutions.
68dc0745 494
4633a7c4 495=head2 Safe Pipe Opens
496
497Another interesting approach to IPC is making your single program go
498multiprocess and communicate between (or even amongst) yourselves. The
499open() function will accept a file argument of either C<"-|"> or C<"|-">
500to do a very interesting thing: it forks a child connected to the
501filehandle you've opened. The child is running the same program as the
502parent. This is useful for safely opening a file when running under an
503assumed UID or GID, for example. If you open a pipe I<to> minus, you can
504write to the filehandle you opened and your kid will find it in his
505STDIN. If you open a pipe I<from> minus, you can read from the filehandle
506you opened whatever your kid writes to his STDOUT.
507
a1ce9542 508 use English '-no_match_vars';
4633a7c4 509 my $sleep_count = 0;
510
54310121 511 do {
c07a80fd 512 $pid = open(KID_TO_WRITE, "|-");
4633a7c4 513 unless (defined $pid) {
514 warn "cannot fork: $!";
515 die "bailing out" if $sleep_count++ > 6;
516 sleep 10;
54310121 517 }
4633a7c4 518 } until defined $pid;
519
520 if ($pid) { # parent
c07a80fd 521 print KID_TO_WRITE @some_data;
522 close(KID_TO_WRITE) || warn "kid exited $?";
4633a7c4 523 } else { # child
524 ($EUID, $EGID) = ($UID, $GID); # suid progs only
54310121 525 open (FILE, "> /safe/file")
4633a7c4 526 || die "can't open /safe/file: $!";
527 while (<STDIN>) {
528 print FILE; # child's STDIN is parent's KID
54310121 529 }
4633a7c4 530 exit; # don't forget this
54310121 531 }
4633a7c4 532
533Another common use for this construct is when you need to execute
534something without the shell's interference. With system(), it's
54310121 535straightforward, but you can't use a pipe open or backticks safely.
4633a7c4 536That's because there's no way to stop the shell from getting its hands on
537your arguments. Instead, use lower-level control to call exec() directly.
538
54310121 539Here's a safe backtick or pipe open for read:
4633a7c4 540
541 # add error processing as above
c07a80fd 542 $pid = open(KID_TO_READ, "-|");
4633a7c4 543
544 if ($pid) { # parent
c07a80fd 545 while (<KID_TO_READ>) {
4633a7c4 546 # do something interesting
54310121 547 }
c07a80fd 548 close(KID_TO_READ) || warn "kid exited $?";
4633a7c4 549
550 } else { # child
551 ($EUID, $EGID) = ($UID, $GID); # suid only
552 exec($program, @options, @args)
553 || die "can't exec program: $!";
554 # NOTREACHED
54310121 555 }
4633a7c4 556
557
558And here's a safe pipe open for writing:
559
560 # add error processing as above
c07a80fd 561 $pid = open(KID_TO_WRITE, "|-");
76c0e0db 562 $SIG{PIPE} = sub { die "whoops, $program pipe broke" };
4633a7c4 563
564 if ($pid) { # parent
565 for (@data) {
c07a80fd 566 print KID_TO_WRITE;
54310121 567 }
c07a80fd 568 close(KID_TO_WRITE) || warn "kid exited $?";
4633a7c4 569
570 } else { # child
571 ($EUID, $EGID) = ($UID, $GID);
572 exec($program, @options, @args)
573 || die "can't exec program: $!";
574 # NOTREACHED
54310121 575 }
4633a7c4 576
307eac13 577Since Perl 5.8.0, you can also use the list form of C<open> for pipes :
578the syntax
579
580 open KID_PS, "-|", "ps", "aux" or die $!;
581
582forks the ps(1) command (without spawning a shell, as there are more than
583three arguments to open()), and reads its standard output via the
ca585e4d 584C<KID_PS> filehandle. The corresponding syntax to read from command
585pipes (with C<"|-"> in place of C<"-|">) is also implemented.
307eac13 586
4633a7c4 587Note that these operations are full Unix forks, which means they may not be
588correctly implemented on alien systems. Additionally, these are not true
54310121 589multithreading. If you'd like to learn more about threading, see the
184e9718 590F<modules> file mentioned below in the SEE ALSO section.
4633a7c4 591
7b05b7e3 592=head2 Bidirectional Communication with Another Process
4633a7c4 593
594While this works reasonably well for unidirectional communication, what
595about bidirectional communication? The obvious thing you'd like to do
596doesn't actually work:
597
c07a80fd 598 open(PROG_FOR_READING_AND_WRITING, "| some program |")
4633a7c4 599
9f1b1f2d 600and if you forget to use the C<use warnings> pragma or the B<-w> flag,
601then you'll miss out entirely on the diagnostic message:
4633a7c4 602
603 Can't do bidirectional pipe at -e line 1.
604
605If you really want to, you can use the standard open2() library function
7b05b7e3 606to catch both ends. There's also an open3() for tridirectional I/O so you
4633a7c4 607can also catch your child's STDERR, but doing so would then require an
608awkward select() loop and wouldn't allow you to use normal Perl input
609operations.
610
611If you look at its source, you'll see that open2() uses low-level
5a964f20 612primitives like Unix pipe() and exec() calls to create all the connections.
4633a7c4 613While it might have been slightly more efficient by using socketpair(), it
614would have then been even less portable than it already is. The open2()
615and open3() functions are unlikely to work anywhere except on a Unix
616system or some other one purporting to be POSIX compliant.
617
618Here's an example of using open2():
619
620 use FileHandle;
621 use IPC::Open2;
5a964f20 622 $pid = open2(*Reader, *Writer, "cat -u -n" );
4633a7c4 623 print Writer "stuff\n";
624 $got = <Reader>;
625
6a3992aa 626The problem with this is that Unix buffering is really going to
627ruin your day. Even though your C<Writer> filehandle is auto-flushed,
4633a7c4 628and the process on the other end will get your data in a timely manner,
6a3992aa 629you can't usually do anything to force it to give it back to you
54310121 630in a similarly quick fashion. In this case, we could, because we
4633a7c4 631gave I<cat> a B<-u> flag to make it unbuffered. But very few Unix
632commands are designed to operate over pipes, so this seldom works
54310121 633unless you yourself wrote the program on the other end of the
4633a7c4 634double-ended pipe.
635
54310121 636A solution to this is the nonstandard F<Comm.pl> library. It uses
4633a7c4 637pseudo-ttys to make your program behave more reasonably:
638
639 require 'Comm.pl';
640 $ph = open_proc('cat -n');
641 for (1..10) {
642 print $ph "a line\n";
643 print "got back ", scalar <$ph>;
644 }
a0d0e21e 645
4633a7c4 646This way you don't have to have control over the source code of the
54310121 647program you're using. The F<Comm> library also has expect()
648and interact() functions. Find the library (and we hope its
4633a7c4 649successor F<IPC::Chat>) at your nearest CPAN archive as detailed
184e9718 650in the SEE ALSO section below.
a0d0e21e 651
c8db1d39 652The newer Expect.pm module from CPAN also addresses this kind of thing.
653This module requires two other modules from CPAN: IO::Pty and IO::Stty.
654It sets up a pseudo-terminal to interact with programs that insist on
a11adca0 655using talking to the terminal device driver. If your system is
c8db1d39 656amongst those supported, this may be your best bet.
657
5a964f20 658=head2 Bidirectional Communication with Yourself
659
660If you want, you may make low-level pipe() and fork()
661to stitch this together by hand. This example only
662talks to itself, but you could reopen the appropriate
663handles to STDIN and STDOUT and call other processes.
664
665 #!/usr/bin/perl -w
666 # pipe1 - bidirectional communication using two pipe pairs
667 # designed for the socketpair-challenged
668 use IO::Handle; # thousands of lines just for autoflush :-(
669 pipe(PARENT_RDR, CHILD_WTR); # XXX: failure?
670 pipe(CHILD_RDR, PARENT_WTR); # XXX: failure?
671 CHILD_WTR->autoflush(1);
672 PARENT_WTR->autoflush(1);
673
674 if ($pid = fork) {
675 close PARENT_RDR; close PARENT_WTR;
676 print CHILD_WTR "Parent Pid $$ is sending this\n";
677 chomp($line = <CHILD_RDR>);
678 print "Parent Pid $$ just read this: `$line'\n";
679 close CHILD_RDR; close CHILD_WTR;
680 waitpid($pid,0);
681 } else {
682 die "cannot fork: $!" unless defined $pid;
683 close CHILD_RDR; close CHILD_WTR;
684 chomp($line = <PARENT_RDR>);
685 print "Child Pid $$ just read this: `$line'\n";
686 print PARENT_WTR "Child Pid $$ is sending this\n";
687 close PARENT_RDR; close PARENT_WTR;
688 exit;
689 }
690
a11adca0 691But you don't actually have to make two pipe calls. If you
5a964f20 692have the socketpair() system call, it will do this all for you.
693
694 #!/usr/bin/perl -w
695 # pipe2 - bidirectional communication using socketpair
696 # "the best ones always go both ways"
697
698 use Socket;
699 use IO::Handle; # thousands of lines just for autoflush :-(
700 # We say AF_UNIX because although *_LOCAL is the
701 # POSIX 1003.1g form of the constant, many machines
702 # still don't have it.
703 socketpair(CHILD, PARENT, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
704 or die "socketpair: $!";
705
706 CHILD->autoflush(1);
707 PARENT->autoflush(1);
708
709 if ($pid = fork) {
710 close PARENT;
711 print CHILD "Parent Pid $$ is sending this\n";
712 chomp($line = <CHILD>);
713 print "Parent Pid $$ just read this: `$line'\n";
714 close CHILD;
715 waitpid($pid,0);
716 } else {
717 die "cannot fork: $!" unless defined $pid;
718 close CHILD;
719 chomp($line = <PARENT>);
720 print "Child Pid $$ just read this: `$line'\n";
721 print PARENT "Child Pid $$ is sending this\n";
722 close PARENT;
723 exit;
724 }
725
4633a7c4 726=head1 Sockets: Client/Server Communication
a0d0e21e 727
6a3992aa 728While not limited to Unix-derived operating systems (e.g., WinSock on PCs
4633a7c4 729provides socket support, as do some VMS libraries), you may not have
184e9718 730sockets on your system, in which case this section probably isn't going to do
6a3992aa 731you much good. With sockets, you can do both virtual circuits (i.e., TCP
732streams) and datagrams (i.e., UDP packets). You may be able to do even more
4633a7c4 733depending on your system.
734
735The Perl function calls for dealing with sockets have the same names as
736the corresponding system calls in C, but their arguments tend to differ
737for two reasons: first, Perl filehandles work differently than C file
738descriptors. Second, Perl already knows the length of its strings, so you
739don't need to pass that information.
a0d0e21e 740
4633a7c4 741One of the major problems with old socket code in Perl was that it used
742hard-coded values for some of the constants, which severely hurt
743portability. If you ever see code that does anything like explicitly
744setting C<$AF_INET = 2>, you know you're in for big trouble: An
745immeasurably superior approach is to use the C<Socket> module, which more
746reliably grants access to various constants and functions you'll need.
a0d0e21e 747
68dc0745 748If you're not writing a server/client for an existing protocol like
749NNTP or SMTP, you should give some thought to how your server will
750know when the client has finished talking, and vice-versa. Most
751protocols are based on one-line messages and responses (so one party
4a6725af 752knows the other has finished when a "\n" is received) or multi-line
68dc0745 753messages and responses that end with a period on an empty line
754("\n.\n" terminates a message/response).
755
5a964f20 756=head2 Internet Line Terminators
757
758The Internet line terminator is "\015\012". Under ASCII variants of
759Unix, that could usually be written as "\r\n", but under other systems,
760"\r\n" might at times be "\015\015\012", "\012\012\015", or something
761completely different. The standards specify writing "\015\012" to be
762conformant (be strict in what you provide), but they also recommend
763accepting a lone "\012" on input (but be lenient in what you require).
764We haven't always been very good about that in the code in this manpage,
765but unless you're on a Mac, you'll probably be ok.
766
4633a7c4 767=head2 Internet TCP Clients and Servers
a0d0e21e 768
4633a7c4 769Use Internet-domain sockets when you want to do client-server
770communication that might extend to machines outside of your own system.
771
772Here's a sample TCP client using Internet-domain sockets:
773
774 #!/usr/bin/perl -w
4633a7c4 775 use strict;
776 use Socket;
777 my ($remote,$port, $iaddr, $paddr, $proto, $line);
778
779 $remote = shift || 'localhost';
780 $port = shift || 2345; # random port
781 if ($port =~ /\D/) { $port = getservbyname($port, 'tcp') }
782 die "No port" unless $port;
783 $iaddr = inet_aton($remote) || die "no host: $remote";
784 $paddr = sockaddr_in($port, $iaddr);
785
786 $proto = getprotobyname('tcp');
787 socket(SOCK, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
788 connect(SOCK, $paddr) || die "connect: $!";
54310121 789 while (defined($line = <SOCK>)) {
4633a7c4 790 print $line;
54310121 791 }
4633a7c4 792
793 close (SOCK) || die "close: $!";
794 exit;
795
796And here's a corresponding server to go along with it. We'll
797leave the address as INADDR_ANY so that the kernel can choose
54310121 798the appropriate interface on multihomed hosts. If you want sit
c07a80fd 799on a particular interface (like the external side of a gateway
800or firewall machine), you should fill this in with your real address
801instead.
802
803 #!/usr/bin/perl -Tw
c07a80fd 804 use strict;
805 BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
806 use Socket;
807 use Carp;
5865a7df 808 my $EOL = "\015\012";
c07a80fd 809
54310121 810 sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }
c07a80fd 811
812 my $port = shift || 2345;
813 my $proto = getprotobyname('tcp');
51ee6500 814
5865a7df 815 ($port) = $port =~ /^(\d+)$/ or die "invalid port";
6a3992aa 816
c07a80fd 817 socket(Server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
54310121 818 setsockopt(Server, SOL_SOCKET, SO_REUSEADDR,
c07a80fd 819 pack("l", 1)) || die "setsockopt: $!";
820 bind(Server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
821 listen(Server,SOMAXCONN) || die "listen: $!";
822
823 logmsg "server started on port $port";
824
825 my $paddr;
826
827 $SIG{CHLD} = \&REAPER;
828
829 for ( ; $paddr = accept(Client,Server); close Client) {
830 my($port,$iaddr) = sockaddr_in($paddr);
831 my $name = gethostbyaddr($iaddr,AF_INET);
832
54310121 833 logmsg "connection from $name [",
834 inet_ntoa($iaddr), "]
c07a80fd 835 at port $port";
836
54310121 837 print Client "Hello there, $name, it's now ",
5a964f20 838 scalar localtime, $EOL;
54310121 839 }
c07a80fd 840
54310121 841And here's a multithreaded version. It's multithreaded in that
842like most typical servers, it spawns (forks) a slave server to
c07a80fd 843handle the client request so that the master server can quickly
844go back to service a new client.
4633a7c4 845
846 #!/usr/bin/perl -Tw
4633a7c4 847 use strict;
848 BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
a0d0e21e 849 use Socket;
4633a7c4 850 use Carp;
5865a7df 851 my $EOL = "\015\012";
a0d0e21e 852
4633a7c4 853 sub spawn; # forward declaration
54310121 854 sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }
a0d0e21e 855
4633a7c4 856 my $port = shift || 2345;
857 my $proto = getprotobyname('tcp');
51ee6500 858
5865a7df 859 ($port) = $port =~ /^(\d+)$/ or die "invalid port";
54310121 860
c07a80fd 861 socket(Server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
54310121 862 setsockopt(Server, SOL_SOCKET, SO_REUSEADDR,
c07a80fd 863 pack("l", 1)) || die "setsockopt: $!";
864 bind(Server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
865 listen(Server,SOMAXCONN) || die "listen: $!";
a0d0e21e 866
4633a7c4 867 logmsg "server started on port $port";
a0d0e21e 868
4633a7c4 869 my $waitedpid = 0;
870 my $paddr;
a0d0e21e 871
816229cf 872 use POSIX ":sys_wait_h";
54310121 873 sub REAPER {
816229cf 874 my $child;
875 while (($waitedpid = waitpid(-1,WNOHANG)) > 0) {
876 logmsg "reaped $waitedpid" . ($? ? " with exit $?" : '');
877 }
6a3992aa 878 $SIG{CHLD} = \&REAPER; # loathe sysV
4633a7c4 879 }
880
881 $SIG{CHLD} = \&REAPER;
882
54310121 883 for ( $waitedpid = 0;
884 ($paddr = accept(Client,Server)) || $waitedpid;
885 $waitedpid = 0, close Client)
4633a7c4 886 {
6a3992aa 887 next if $waitedpid and not $paddr;
4633a7c4 888 my($port,$iaddr) = sockaddr_in($paddr);
889 my $name = gethostbyaddr($iaddr,AF_INET);
890
54310121 891 logmsg "connection from $name [",
892 inet_ntoa($iaddr), "]
4633a7c4 893 at port $port";
a0d0e21e 894
54310121 895 spawn sub {
b921b357 896 $|=1;
5a964f20 897 print "Hello there, $name, it's now ", scalar localtime, $EOL;
898 exec '/usr/games/fortune' # XXX: `wrong' line terminators
4633a7c4 899 or confess "can't exec fortune: $!";
900 };
a0d0e21e 901
54310121 902 }
a0d0e21e 903
4633a7c4 904 sub spawn {
905 my $coderef = shift;
a0d0e21e 906
54310121 907 unless (@_ == 0 && $coderef && ref($coderef) eq 'CODE') {
4633a7c4 908 confess "usage: spawn CODEREF";
a0d0e21e 909 }
4633a7c4 910
911 my $pid;
912 if (!defined($pid = fork)) {
913 logmsg "cannot fork: $!";
914 return;
915 } elsif ($pid) {
916 logmsg "begat $pid";
6a3992aa 917 return; # I'm the parent
4633a7c4 918 }
6a3992aa 919 # else I'm the child -- go spawn
4633a7c4 920
c07a80fd 921 open(STDIN, "<&Client") || die "can't dup client to stdin";
922 open(STDOUT, ">&Client") || die "can't dup client to stdout";
4633a7c4 923 ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
924 exit &$coderef();
54310121 925 }
4633a7c4 926
927This server takes the trouble to clone off a child version via fork() for
928each incoming request. That way it can handle many requests at once,
929which you might not always want. Even if you don't fork(), the listen()
930will allow that many pending connections. Forking servers have to be
931particularly careful about cleaning up their dead children (called
932"zombies" in Unix parlance), because otherwise you'll quickly fill up your
933process table.
934
935We suggest that you use the B<-T> flag to use taint checking (see L<perlsec>)
936even if we aren't running setuid or setgid. This is always a good idea
937for servers and other programs run on behalf of someone else (like CGI
938scripts), because it lessens the chances that people from the outside will
939be able to compromise your system.
940
941Let's look at another TCP client. This one connects to the TCP "time"
942service on a number of different machines and shows how far their clocks
943differ from the system on which it's being run:
944
945 #!/usr/bin/perl -w
4633a7c4 946 use strict;
947 use Socket;
948
949 my $SECS_of_70_YEARS = 2208988800;
54310121 950 sub ctime { scalar localtime(shift) }
4633a7c4 951
54310121 952 my $iaddr = gethostbyname('localhost');
953 my $proto = getprotobyname('tcp');
954 my $port = getservbyname('time', 'tcp');
4633a7c4 955 my $paddr = sockaddr_in(0, $iaddr);
956 my($host);
957
958 $| = 1;
959 printf "%-24s %8s %s\n", "localhost", 0, ctime(time());
960
961 foreach $host (@ARGV) {
962 printf "%-24s ", $host;
963 my $hisiaddr = inet_aton($host) || die "unknown host";
964 my $hispaddr = sockaddr_in($port, $hisiaddr);
965 socket(SOCKET, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
966 connect(SOCKET, $hispaddr) || die "bind: $!";
967 my $rtime = ' ';
968 read(SOCKET, $rtime, 4);
969 close(SOCKET);
970 my $histime = unpack("N", $rtime) - $SECS_of_70_YEARS ;
971 printf "%8d %s\n", $histime - time, ctime($histime);
a0d0e21e 972 }
973
4633a7c4 974=head2 Unix-Domain TCP Clients and Servers
975
a2eb9003 976That's fine for Internet-domain clients and servers, but what about local
4633a7c4 977communications? While you can use the same setup, sometimes you don't
978want to. Unix-domain sockets are local to the current host, and are often
54310121 979used internally to implement pipes. Unlike Internet domain sockets, Unix
4633a7c4 980domain sockets can show up in the file system with an ls(1) listing.
981
5a964f20 982 % ls -l /dev/log
4633a7c4 983 srw-rw-rw- 1 root 0 Oct 31 07:23 /dev/log
a0d0e21e 984
4633a7c4 985You can test for these with Perl's B<-S> file test:
986
987 unless ( -S '/dev/log' ) {
3ba19564 988 die "something's wicked with the log system";
54310121 989 }
4633a7c4 990
991Here's a sample Unix-domain client:
992
993 #!/usr/bin/perl -w
4633a7c4 994 use Socket;
995 use strict;
996 my ($rendezvous, $line);
997
998 $rendezvous = shift || '/tmp/catsock';
999 socket(SOCK, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
9607fc9c 1000 connect(SOCK, sockaddr_un($rendezvous)) || die "connect: $!";
54310121 1001 while (defined($line = <SOCK>)) {
4633a7c4 1002 print $line;
54310121 1003 }
4633a7c4 1004 exit;
1005
5a964f20 1006And here's a corresponding server. You don't have to worry about silly
1007network terminators here because Unix domain sockets are guaranteed
1008to be on the localhost, and thus everything works right.
4633a7c4 1009
1010 #!/usr/bin/perl -Tw
4633a7c4 1011 use strict;
1012 use Socket;
1013 use Carp;
1014
1015 BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
5865a7df 1016 sub spawn; # forward declaration
5a964f20 1017 sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }
4633a7c4 1018
1019 my $NAME = '/tmp/catsock';
1020 my $uaddr = sockaddr_un($NAME);
1021 my $proto = getprotobyname('tcp');
1022
c07a80fd 1023 socket(Server,PF_UNIX,SOCK_STREAM,0) || die "socket: $!";
4633a7c4 1024 unlink($NAME);
c07a80fd 1025 bind (Server, $uaddr) || die "bind: $!";
1026 listen(Server,SOMAXCONN) || die "listen: $!";
4633a7c4 1027
1028 logmsg "server started on $NAME";
1029
5a964f20 1030 my $waitedpid;
1031
816229cf 1032 use POSIX ":sys_wait_h";
5a964f20 1033 sub REAPER {
816229cf 1034 my $child;
1035 while (($waitedpid = waitpid(-1,WNOHANG)) > 0) {
1036 logmsg "reaped $waitedpid" . ($? ? " with exit $?" : '');
1037 }
5a964f20 1038 $SIG{CHLD} = \&REAPER; # loathe sysV
5a964f20 1039 }
1040
4633a7c4 1041 $SIG{CHLD} = \&REAPER;
1042
5a964f20 1043
54310121 1044 for ( $waitedpid = 0;
1045 accept(Client,Server) || $waitedpid;
1046 $waitedpid = 0, close Client)
4633a7c4 1047 {
1048 next if $waitedpid;
1049 logmsg "connection on $NAME";
54310121 1050 spawn sub {
4633a7c4 1051 print "Hello there, it's now ", scalar localtime, "\n";
1052 exec '/usr/games/fortune' or die "can't exec fortune: $!";
1053 };
54310121 1054 }
4633a7c4 1055
5865a7df 1056 sub spawn {
1057 my $coderef = shift;
1058
1059 unless (@_ == 0 && $coderef && ref($coderef) eq 'CODE') {
1060 confess "usage: spawn CODEREF";
1061 }
1062
1063 my $pid;
1064 if (!defined($pid = fork)) {
1065 logmsg "cannot fork: $!";
1066 return;
1067 } elsif ($pid) {
1068 logmsg "begat $pid";
1069 return; # I'm the parent
1070 }
1071 # else I'm the child -- go spawn
1072
1073 open(STDIN, "<&Client") || die "can't dup client to stdin";
1074 open(STDOUT, ">&Client") || die "can't dup client to stdout";
1075 ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
1076 exit &$coderef();
1077 }
1078
4633a7c4 1079As you see, it's remarkably similar to the Internet domain TCP server, so
1080much so, in fact, that we've omitted several duplicate functions--spawn(),
1081logmsg(), ctime(), and REAPER()--which are exactly the same as in the
1082other server.
1083
1084So why would you ever want to use a Unix domain socket instead of a
1085simpler named pipe? Because a named pipe doesn't give you sessions. You
1086can't tell one process's data from another's. With socket programming,
1087you get a separate session for each client: that's why accept() takes two
1088arguments.
1089
1090For example, let's say that you have a long running database server daemon
1091that you want folks from the World Wide Web to be able to access, but only
1092if they go through a CGI interface. You'd have a small, simple CGI
1093program that does whatever checks and logging you feel like, and then acts
1094as a Unix-domain client and connects to your private server.
1095
7b05b7e3 1096=head1 TCP Clients with IO::Socket
1097
1098For those preferring a higher-level interface to socket programming, the
1099IO::Socket module provides an object-oriented approach. IO::Socket is
1100included as part of the standard Perl distribution as of the 5.004
1101release. If you're running an earlier version of Perl, just fetch
106325ad 1102IO::Socket from CPAN, where you'll also find modules providing easy
7b05b7e3 1103interfaces to the following systems: DNS, FTP, Ident (RFC 931), NIS and
1104NISPlus, NNTP, Ping, POP3, SMTP, SNMP, SSLeay, Telnet, and Time--just
1105to name a few.
1106
1107=head2 A Simple Client
1108
1109Here's a client that creates a TCP connection to the "daytime"
1110service at port 13 of the host name "localhost" and prints out everything
1111that the server there cares to provide.
1112
1113 #!/usr/bin/perl -w
1114 use IO::Socket;
1115 $remote = IO::Socket::INET->new(
1116 Proto => "tcp",
1117 PeerAddr => "localhost",
1118 PeerPort => "daytime(13)",
1119 )
1120 or die "cannot connect to daytime port at localhost";
1121 while ( <$remote> ) { print }
1122
1123When you run this program, you should get something back that
1124looks like this:
1125
1126 Wed May 14 08:40:46 MDT 1997
1127
1128Here are what those parameters to the C<new> constructor mean:
1129
13a2d996 1130=over 4
7b05b7e3 1131
1132=item C<Proto>
1133
1134This is which protocol to use. In this case, the socket handle returned
1135will be connected to a TCP socket, because we want a stream-oriented
1136connection, that is, one that acts pretty much like a plain old file.
1137Not all sockets are this of this type. For example, the UDP protocol
1138can be used to make a datagram socket, used for message-passing.
1139
1140=item C<PeerAddr>
1141
1142This is the name or Internet address of the remote host the server is
1143running on. We could have specified a longer name like C<"www.perl.com">,
1144or an address like C<"204.148.40.9">. For demonstration purposes, we've
1145used the special hostname C<"localhost">, which should always mean the
1146current machine you're running on. The corresponding Internet address
1147for localhost is C<"127.1">, if you'd rather use that.
1148
1149=item C<PeerPort>
1150
1151This is the service name or port number we'd like to connect to.
1152We could have gotten away with using just C<"daytime"> on systems with a
1153well-configured system services file,[FOOTNOTE: The system services file
1154is in I</etc/services> under Unix] but just in case, we've specified the
1155port number (13) in parentheses. Using just the number would also have
1156worked, but constant numbers make careful programmers nervous.
1157
1158=back
1159
1160Notice how the return value from the C<new> constructor is used as
1161a filehandle in the C<while> loop? That's what's called an indirect
1162filehandle, a scalar variable containing a filehandle. You can use
1163it the same way you would a normal filehandle. For example, you
1164can read one line from it this way:
1165
1166 $line = <$handle>;
1167
1168all remaining lines from is this way:
1169
1170 @lines = <$handle>;
1171
1172and send a line of data to it this way:
1173
1174 print $handle "some data\n";
1175
1176=head2 A Webget Client
1177
1178Here's a simple client that takes a remote host to fetch a document
1179from, and then a list of documents to get from that host. This is a
1180more interesting client than the previous one because it first sends
1181something to the server before fetching the server's response.
1182
1183 #!/usr/bin/perl -w
1184 use IO::Socket;
1185 unless (@ARGV > 1) { die "usage: $0 host document ..." }
1186 $host = shift(@ARGV);
5a964f20 1187 $EOL = "\015\012";
1188 $BLANK = $EOL x 2;
7b05b7e3 1189 foreach $document ( @ARGV ) {
1190 $remote = IO::Socket::INET->new( Proto => "tcp",
1191 PeerAddr => $host,
1192 PeerPort => "http(80)",
1193 );
1194 unless ($remote) { die "cannot connect to http daemon on $host" }
1195 $remote->autoflush(1);
5a964f20 1196 print $remote "GET $document HTTP/1.0" . $BLANK;
7b05b7e3 1197 while ( <$remote> ) { print }
1198 close $remote;
1199 }
1200
1201The web server handing the "http" service, which is assumed to be at
4375e838 1202its standard port, number 80. If the web server you're trying to
7b05b7e3 1203connect to is at a different port (like 1080 or 8080), you should specify
c47ff5f1 1204as the named-parameter pair, C<< PeerPort => 8080 >>. The C<autoflush>
7b05b7e3 1205method is used on the socket because otherwise the system would buffer
1206up the output we sent it. (If you're on a Mac, you'll also need to
1207change every C<"\n"> in your code that sends data over the network to
1208be a C<"\015\012"> instead.)
1209
1210Connecting to the server is only the first part of the process: once you
1211have the connection, you have to use the server's language. Each server
1212on the network has its own little command language that it expects as
1213input. The string that we send to the server starting with "GET" is in
1214HTTP syntax. In this case, we simply request each specified document.
1215Yes, we really are making a new connection for each document, even though
1216it's the same host. That's the way you always used to have to speak HTTP.
1217Recent versions of web browsers may request that the remote server leave
1218the connection open a little while, but the server doesn't have to honor
1219such a request.
1220
1221Here's an example of running that program, which we'll call I<webget>:
1222
5a964f20 1223 % webget www.perl.com /guanaco.html
7b05b7e3 1224 HTTP/1.1 404 File Not Found
1225 Date: Thu, 08 May 1997 18:02:32 GMT
1226 Server: Apache/1.2b6
1227 Connection: close
1228 Content-type: text/html
1229
1230 <HEAD><TITLE>404 File Not Found</TITLE></HEAD>
1231 <BODY><H1>File Not Found</H1>
1232 The requested URL /guanaco.html was not found on this server.<P>
1233 </BODY>
1234
1235Ok, so that's not very interesting, because it didn't find that
1236particular document. But a long response wouldn't have fit on this page.
1237
1238For a more fully-featured version of this program, you should look to
1239the I<lwp-request> program included with the LWP modules from CPAN.
1240
1241=head2 Interactive Client with IO::Socket
1242
1243Well, that's all fine if you want to send one command and get one answer,
1244but what about setting up something fully interactive, somewhat like
1245the way I<telnet> works? That way you can type a line, get the answer,
1246type a line, get the answer, etc.
1247
1248This client is more complicated than the two we've done so far, but if
1249you're on a system that supports the powerful C<fork> call, the solution
1250isn't that rough. Once you've made the connection to whatever service
1251you'd like to chat with, call C<fork> to clone your process. Each of
1252these two identical process has a very simple job to do: the parent
1253copies everything from the socket to standard output, while the child
1254simultaneously copies everything from standard input to the socket.
1255To accomplish the same thing using just one process would be I<much>
1256harder, because it's easier to code two processes to do one thing than it
1257is to code one process to do two things. (This keep-it-simple principle
5a964f20 1258a cornerstones of the Unix philosophy, and good software engineering as
1259well, which is probably why it's spread to other systems.)
7b05b7e3 1260
1261Here's the code:
1262
1263 #!/usr/bin/perl -w
1264 use strict;
1265 use IO::Socket;
1266 my ($host, $port, $kidpid, $handle, $line);
1267
1268 unless (@ARGV == 2) { die "usage: $0 host port" }
1269 ($host, $port) = @ARGV;
1270
1271 # create a tcp connection to the specified host and port
1272 $handle = IO::Socket::INET->new(Proto => "tcp",
1273 PeerAddr => $host,
1274 PeerPort => $port)
1275 or die "can't connect to port $port on $host: $!";
1276
1277 $handle->autoflush(1); # so output gets there right away
1278 print STDERR "[Connected to $host:$port]\n";
1279
1280 # split the program into two processes, identical twins
1281 die "can't fork: $!" unless defined($kidpid = fork());
1282
1283 # the if{} block runs only in the parent process
1284 if ($kidpid) {
1285 # copy the socket to standard output
1286 while (defined ($line = <$handle>)) {
1287 print STDOUT $line;
1288 }
1289 kill("TERM", $kidpid); # send SIGTERM to child
1290 }
1291 # the else{} block runs only in the child process
1292 else {
1293 # copy standard input to the socket
1294 while (defined ($line = <STDIN>)) {
1295 print $handle $line;
1296 }
1297 }
1298
1299The C<kill> function in the parent's C<if> block is there to send a
1300signal to our child process (current running in the C<else> block)
1301as soon as the remote server has closed its end of the connection.
1302
7b05b7e3 1303If the remote server sends data a byte at time, and you need that
1304data immediately without waiting for a newline (which might not happen),
1305you may wish to replace the C<while> loop in the parent with the
1306following:
1307
1308 my $byte;
1309 while (sysread($handle, $byte, 1) == 1) {
1310 print STDOUT $byte;
1311 }
1312
1313Making a system call for each byte you want to read is not very efficient
1314(to put it mildly) but is the simplest to explain and works reasonably
1315well.
1316
1317=head1 TCP Servers with IO::Socket
1318
5a964f20 1319As always, setting up a server is little bit more involved than running a client.
7b05b7e3 1320The model is that the server creates a special kind of socket that
1321does nothing but listen on a particular port for incoming connections.
c47ff5f1 1322It does this by calling the C<< IO::Socket::INET->new() >> method with
7b05b7e3 1323slightly different arguments than the client did.
1324
13a2d996 1325=over 4
7b05b7e3 1326
1327=item Proto
1328
1329This is which protocol to use. Like our clients, we'll
1330still specify C<"tcp"> here.
1331
1332=item LocalPort
1333
1334We specify a local
1335port in the C<LocalPort> argument, which we didn't do for the client.
1336This is service name or port number for which you want to be the
1337server. (Under Unix, ports under 1024 are restricted to the
1338superuser.) In our sample, we'll use port 9000, but you can use
1339any port that's not currently in use on your system. If you try
1340to use one already in used, you'll get an "Address already in use"
19799a22 1341message. Under Unix, the C<netstat -a> command will show
7b05b7e3 1342which services current have servers.
1343
1344=item Listen
1345
1346The C<Listen> parameter is set to the maximum number of
1347pending connections we can accept until we turn away incoming clients.
1348Think of it as a call-waiting queue for your telephone.
1349The low-level Socket module has a special symbol for the system maximum, which
1350is SOMAXCONN.
1351
1352=item Reuse
1353
1354The C<Reuse> parameter is needed so that we restart our server
1355manually without waiting a few minutes to allow system buffers to
1356clear out.
1357
1358=back
1359
1360Once the generic server socket has been created using the parameters
1361listed above, the server then waits for a new client to connect
d1be9408 1362to it. The server blocks in the C<accept> method, which eventually accepts a
1363bidirectional connection from the remote client. (Make sure to autoflush
7b05b7e3 1364this handle to circumvent buffering.)
1365
1366To add to user-friendliness, our server prompts the user for commands.
1367Most servers don't do this. Because of the prompt without a newline,
1368you'll have to use the C<sysread> variant of the interactive client above.
1369
1370This server accepts one of five different commands, sending output
1371back to the client. Note that unlike most network servers, this one
1372only handles one incoming client at a time. Multithreaded servers are
f83494b9 1373covered in Chapter 6 of the Camel.
7b05b7e3 1374
1375Here's the code. We'll
1376
1377 #!/usr/bin/perl -w
1378 use IO::Socket;
1379 use Net::hostent; # for OO version of gethostbyaddr
1380
1381 $PORT = 9000; # pick something not in use
1382
1383 $server = IO::Socket::INET->new( Proto => 'tcp',
1384 LocalPort => $PORT,
1385 Listen => SOMAXCONN,
1386 Reuse => 1);
1387
1388 die "can't setup server" unless $server;
1389 print "[Server $0 accepting clients]\n";
1390
1391 while ($client = $server->accept()) {
1392 $client->autoflush(1);
1393 print $client "Welcome to $0; type help for command list.\n";
1394 $hostinfo = gethostbyaddr($client->peeraddr);
78fc38e1 1395 printf "[Connect from %s]\n", $hostinfo ? $hostinfo->name : $client->peerhost;
7b05b7e3 1396 print $client "Command? ";
1397 while ( <$client>) {
1398 next unless /\S/; # blank line
1399 if (/quit|exit/i) { last; }
1400 elsif (/date|time/i) { printf $client "%s\n", scalar localtime; }
1401 elsif (/who/i ) { print $client `who 2>&1`; }
1402 elsif (/cookie/i ) { print $client `/usr/games/fortune 2>&1`; }
1403 elsif (/motd/i ) { print $client `cat /etc/motd 2>&1`; }
1404 else {
1405 print $client "Commands: quit date who cookie motd\n";
1406 }
1407 } continue {
1408 print $client "Command? ";
1409 }
1410 close $client;
1411 }
1412
1413=head1 UDP: Message Passing
4633a7c4 1414
1415Another kind of client-server setup is one that uses not connections, but
1416messages. UDP communications involve much lower overhead but also provide
1417less reliability, as there are no promises that messages will arrive at
1418all, let alone in order and unmangled. Still, UDP offers some advantages
1419over TCP, including being able to "broadcast" or "multicast" to a whole
1420bunch of destination hosts at once (usually on your local subnet). If you
1421find yourself overly concerned about reliability and start building checks
6a3992aa 1422into your message system, then you probably should use just TCP to start
4633a7c4 1423with.
1424
90034919 1425Note that UDP datagrams are I<not> a bytestream and should not be treated
1426as such. This makes using I/O mechanisms with internal buffering
1427like stdio (i.e. print() and friends) especially cumbersome. Use syswrite(),
1428or better send(), like in the example below.
1429
4633a7c4 1430Here's a UDP program similar to the sample Internet TCP client given
7b05b7e3 1431earlier. However, instead of checking one host at a time, the UDP version
4633a7c4 1432will check many of them asynchronously by simulating a multicast and then
1433using select() to do a timed-out wait for I/O. To do something similar
1434with TCP, you'd have to use a different socket handle for each host.
1435
1436 #!/usr/bin/perl -w
1437 use strict;
4633a7c4 1438 use Socket;
1439 use Sys::Hostname;
1440
54310121 1441 my ( $count, $hisiaddr, $hispaddr, $histime,
1442 $host, $iaddr, $paddr, $port, $proto,
4633a7c4 1443 $rin, $rout, $rtime, $SECS_of_70_YEARS);
1444
1445 $SECS_of_70_YEARS = 2208988800;
1446
1447 $iaddr = gethostbyname(hostname());
1448 $proto = getprotobyname('udp');
1449 $port = getservbyname('time', 'udp');
1450 $paddr = sockaddr_in(0, $iaddr); # 0 means let kernel pick
1451
1452 socket(SOCKET, PF_INET, SOCK_DGRAM, $proto) || die "socket: $!";
1453 bind(SOCKET, $paddr) || die "bind: $!";
1454
1455 $| = 1;
1456 printf "%-12s %8s %s\n", "localhost", 0, scalar localtime time;
1457 $count = 0;
1458 for $host (@ARGV) {
1459 $count++;
1460 $hisiaddr = inet_aton($host) || die "unknown host";
1461 $hispaddr = sockaddr_in($port, $hisiaddr);
1462 defined(send(SOCKET, 0, 0, $hispaddr)) || die "send $host: $!";
1463 }
1464
1465 $rin = '';
1466 vec($rin, fileno(SOCKET), 1) = 1;
1467
1468 # timeout after 10.0 seconds
1469 while ($count && select($rout = $rin, undef, undef, 10.0)) {
1470 $rtime = '';
1471 ($hispaddr = recv(SOCKET, $rtime, 4, 0)) || die "recv: $!";
1472 ($port, $hisiaddr) = sockaddr_in($hispaddr);
1473 $host = gethostbyaddr($hisiaddr, AF_INET);
1474 $histime = unpack("N", $rtime) - $SECS_of_70_YEARS ;
1475 printf "%-12s ", $host;
1476 printf "%8d %s\n", $histime - time, scalar localtime($histime);
1477 $count--;
1478 }
1479
90034919 1480Note that this example does not include any retries and may consequently
1481fail to contact a reachable host. The most prominent reason for this
1482is congestion of the queues on the sending host if the number of
a31a806a 1483list of hosts to contact is sufficiently large.
90034919 1484
4633a7c4 1485=head1 SysV IPC
1486
1487While System V IPC isn't so widely used as sockets, it still has some
1488interesting uses. You can't, however, effectively use SysV IPC or
1489Berkeley mmap() to have shared memory so as to share a variable amongst
1490several processes. That's because Perl would reallocate your string when
1491you weren't wanting it to.
1492
54310121 1493Here's a small example showing shared memory usage.
a0d0e21e 1494
41d6edb2 1495 use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRWXU);
0ade1984 1496
a0d0e21e 1497 $size = 2000;
41d6edb2 1498 $id = shmget(IPC_PRIVATE, $size, S_IRWXU) || die "$!";
1499 print "shm key $id\n";
a0d0e21e 1500
1501 $message = "Message #1";
41d6edb2 1502 shmwrite($id, $message, 0, 60) || die "$!";
0ade1984 1503 print "wrote: '$message'\n";
41d6edb2 1504 shmread($id, $buff, 0, 60) || die "$!";
0ade1984 1505 print "read : '$buff'\n";
a0d0e21e 1506
0ade1984 1507 # the buffer of shmread is zero-character end-padded.
1508 substr($buff, index($buff, "\0")) = '';
1509 print "un" unless $buff eq $message;
1510 print "swell\n";
a0d0e21e 1511
41d6edb2 1512 print "deleting shm $id\n";
1513 shmctl($id, IPC_RMID, 0) || die "$!";
a0d0e21e 1514
1515Here's an example of a semaphore:
1516
0ade1984 1517 use IPC::SysV qw(IPC_CREAT);
1518
a0d0e21e 1519 $IPC_KEY = 1234;
41d6edb2 1520 $id = semget($IPC_KEY, 10, 0666 | IPC_CREAT ) || die "$!";
1521 print "shm key $id\n";
a0d0e21e 1522
a2eb9003 1523Put this code in a separate file to be run in more than one process.
a0d0e21e 1524Call the file F<take>:
1525
1526 # create a semaphore
1527
1528 $IPC_KEY = 1234;
41d6edb2 1529 $id = semget($IPC_KEY, 0 , 0 );
1530 die if !defined($id);
a0d0e21e 1531
1532 $semnum = 0;
1533 $semflag = 0;
1534
1535 # 'take' semaphore
1536 # wait for semaphore to be zero
1537 $semop = 0;
41d6edb2 1538 $opstring1 = pack("s!s!s!", $semnum, $semop, $semflag);
a0d0e21e 1539
1540 # Increment the semaphore count
1541 $semop = 1;
41d6edb2 1542 $opstring2 = pack("s!s!s!", $semnum, $semop, $semflag);
a0d0e21e 1543 $opstring = $opstring1 . $opstring2;
1544
41d6edb2 1545 semop($id,$opstring) || die "$!";
a0d0e21e 1546
a2eb9003 1547Put this code in a separate file to be run in more than one process.
a0d0e21e 1548Call this file F<give>:
1549
4633a7c4 1550 # 'give' the semaphore
a0d0e21e 1551 # run this in the original process and you will see
1552 # that the second process continues
1553
1554 $IPC_KEY = 1234;
41d6edb2 1555 $id = semget($IPC_KEY, 0, 0);
1556 die if !defined($id);
a0d0e21e 1557
1558 $semnum = 0;
1559 $semflag = 0;
1560
1561 # Decrement the semaphore count
1562 $semop = -1;
41d6edb2 1563 $opstring = pack("s!s!s!", $semnum, $semop, $semflag);
a0d0e21e 1564
41d6edb2 1565 semop($id,$opstring) || die "$!";
a0d0e21e 1566
7b05b7e3 1567The SysV IPC code above was written long ago, and it's definitely
0ade1984 1568clunky looking. For a more modern look, see the IPC::SysV module
1569which is included with Perl starting from Perl 5.005.
4633a7c4 1570
41d6edb2 1571A small example demonstrating SysV message queues:
1572
1573 use IPC::SysV qw(IPC_PRIVATE IPC_RMID IPC_CREAT S_IRWXU);
1574
1575 my $id = msgget(IPC_PRIVATE, IPC_CREAT | S_IRWXU);
1576
1577 my $sent = "message";
1578 my $type = 1234;
1579 my $rcvd;
1580 my $type_rcvd;
1581
1582 if (defined $id) {
1583 if (msgsnd($id, pack("l! a*", $type_sent, $sent), 0)) {
1584 if (msgrcv($id, $rcvd, 60, 0, 0)) {
1585 ($type_rcvd, $rcvd) = unpack("l! a*", $rcvd);
1586 if ($rcvd eq $sent) {
1587 print "okay\n";
1588 } else {
1589 print "not okay\n";
1590 }
1591 } else {
1592 die "# msgrcv failed\n";
1593 }
1594 } else {
1595 die "# msgsnd failed\n";
1596 }
1597 msgctl($id, IPC_RMID, 0) || die "# msgctl failed: $!\n";
1598 } else {
1599 die "# msgget failed\n";
1600 }
1601
4633a7c4 1602=head1 NOTES
1603
5a964f20 1604Most of these routines quietly but politely return C<undef> when they
1605fail instead of causing your program to die right then and there due to
1606an uncaught exception. (Actually, some of the new I<Socket> conversion
1607functions croak() on bad arguments.) It is therefore essential to
1608check return values from these functions. Always begin your socket
1609programs this way for optimal success, and don't forget to add B<-T>
1610taint checking flag to the #! line for servers:
4633a7c4 1611
5a964f20 1612 #!/usr/bin/perl -Tw
4633a7c4 1613 use strict;
1614 use sigtrap;
1615 use Socket;
1616
1617=head1 BUGS
1618
1619All these routines create system-specific portability problems. As noted
1620elsewhere, Perl is at the mercy of your C libraries for much of its system
1621behaviour. It's probably safest to assume broken SysV semantics for
6a3992aa 1622signals and to stick with simple TCP and UDP socket operations; e.g., don't
a2eb9003 1623try to pass open file descriptors over a local UDP datagram socket if you
4633a7c4 1624want your code to stand a chance of being portable.
1625
5a964f20 1626As mentioned in the signals section, because few vendors provide C
1627libraries that are safely re-entrant, the prudent programmer will do
1628little else within a handler beyond setting a numeric variable that
1629already exists; or, if locked into a slow (restarting) system call,
1630using die() to raise an exception and longjmp(3) out. In fact, even
1631these may in some cases cause a core dump. It's probably best to avoid
a11adca0 1632signals except where they are absolutely inevitable. This
5a964f20 1633will be addressed in a future release of Perl.
4633a7c4 1634
1635=head1 AUTHOR
1636
1637Tom Christiansen, with occasional vestiges of Larry Wall's original
7b05b7e3 1638version and suggestions from the Perl Porters.
4633a7c4 1639
1640=head1 SEE ALSO
1641
7b05b7e3 1642There's a lot more to networking than this, but this should get you
1643started.
1644
c04e1326 1645For intrepid programmers, the indispensable textbook is I<Unix
1646Network Programming, 2nd Edition, Volume 1> by W. Richard Stevens
1647(published by Prentice-Hall). Note that most books on networking
1648address the subject from the perspective of a C programmer; translation
1649to Perl is left as an exercise for the reader.
7b05b7e3 1650
1651The IO::Socket(3) manpage describes the object library, and the Socket(3)
1652manpage describes the low-level interface to sockets. Besides the obvious
1653functions in L<perlfunc>, you should also check out the F<modules> file
1654at your nearest CPAN site. (See L<perlmodlib> or best yet, the F<Perl
1655FAQ> for a description of what CPAN is and where to get it.)
1656
4633a7c4 1657Section 5 of the F<modules> file is devoted to "Networking, Device Control
6a3992aa 1658(modems), and Interprocess Communication", and contains numerous unbundled
4633a7c4 1659modules numerous networking modules, Chat and Expect operations, CGI
1660programming, DCE, FTP, IPC, NNTP, Proxy, Ptty, RPC, SNMP, SMTP, Telnet,
1661Threads, and ToolTalk--just to name a few.