pod/perlipc.pod

   1 =head1 NAME
   2
   3 perlipc - Perl interprocess communication (signals, fifos, pipes, safe subprocesses, sockets, and semaphores)
   4
   5 =head1 DESCRIPTION
   6
   7 The basic IPC facilities of Perl are built out of the good old Unix
   8 signals, named pipes, pipe opens, the Berkeley socket routines, and SysV
   9 IPC calls.  Each is used in slightly different situations.
  10
  11 =head1 Signals
  12
  13 Perl uses a simple signal handling model: the %SIG hash contains names
  14 or references of user-installed signal handlers.  Each handler is
  15 called with one argument, the name of the signal that triggered
  16 it.  A signal may be generated intentionally from a
  17 particular keyboard sequence like control-C or control-Z, sent to you
  18 from another process, or triggered automatically by the kernel when
  19 special events transpire, like a child process exiting, your process
  20 running out of stack space, or hitting a file size limit.
  21
  22 For example, to trap an interrupt signal, set up a handler like this:
  23
  24     sub catch_zap {
  25         my $signame = shift;
  26         $shucks++;
  27         die "Somebody sent me a SIG$signame";
  28     }
  29     $SIG{INT} = 'catch_zap';  # could fail in modules
  30     $SIG{INT} = \&catch_zap;  # best strategy
  31
  32 Prior to Perl 5.7.3 it was necessary to do as little as you possibly
  33 could in your handler; notice how all we do is set a global variable
  34 and then raise an exception.  That's because on most systems,
  35 libraries are not re-entrant; particularly, memory allocation and I/O
  36 routines are not.  That meant that doing nearly I<anything> in your
  37 handler could in theory trigger a memory fault and subsequent core
  38 dump - see L</Deferred Signals (Safe Signals)> below.
  39
  40 The names of the signals are the ones listed out by C<kill -l> on your
  41 system, or you can retrieve them from the Config module.  Set up an
  42 @signame list indexed by number to get the name and a %signo table
  43 indexed by name to get the number:
  44
  45     use Config;
  46     defined $Config{sig_name} || die "No sigs?";
  47     foreach $name (split(' ', $Config{sig_name})) {
  48         $signo{$name} = $i;
  49         $signame[$i] = $name;
  50         $i++;
  51     }
  52
  53 So to check whether signal 17 and SIGALRM were the same, do just this:
  54
  55     print "signal #17 = $signame[17]\n";
  56     if ($signo{ALRM}) {
  57         print "SIGALRM is $signo{ALRM}\n";
  58     }
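
The same tables can be built under C<use strict>.  The following is a
minimal sketch, not the canonical idiom: it assumes your Config also
provides C<sig_num> (the signal numbers in the same order as C<sig_name>)
and keeps the first name seen for each number, so that aliases listed
later do not overwrite it.

```perl
use strict;
use warnings;
use Config;

# Build both lookup tables from the Config module, pairing each name
# with the number at the same position in $Config{sig_num}.
defined $Config{sig_name} or die "No sigs?";
my @names = split ' ', $Config{sig_name};
my @nums  = split ' ', $Config{sig_num};

my (%signo, @signame);
for my $i (0 .. $#names) {
    $signo{ $names[$i] } = $nums[$i];
    # Keep the first name seen for a number so aliases do not replace it.
    $signame[ $nums[$i] ] = $names[$i]
        unless defined $signame[ $nums[$i] ];
}

print "SIGINT is $signo{INT}\n" if exists $signo{INT};
```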
  59
  60 You may also choose to assign the strings C<'IGNORE'> or C<'DEFAULT'> as
  61 the handler, in which case Perl will try to discard the signal or do the
  62 default thing.
  63
  64 On most Unix platforms, the C<CHLD> (sometimes also known as C<CLD>) signal
  65 has special behavior with respect to a value of C<'IGNORE'>.
  66 Setting C<$SIG{CHLD}> to C<'IGNORE'> on such a platform has the effect of
  67 not creating zombie processes when the parent process fails to C<wait()>
  68 on its child processes (i.e. child processes are automatically reaped).
  69 Calling C<wait()> with C<$SIG{CHLD}> set to C<'IGNORE'> usually returns
  70 C<-1> on such platforms.
  71
  72 Some signals can be neither trapped nor ignored, such as
  73 the KILL and STOP (but not the TSTP) signals.  One strategy for
  74 temporarily ignoring signals is to use a local() statement; the previous
  75 handler is automatically restored once your block is exited.  (Remember that
  76 local() values are "inherited" by functions called from within that block.)
  77
  78     sub precious {
  79         local $SIG{INT} = 'IGNORE';
  80         &more_functions;
  81     }
  82     sub more_functions {
  83         # interrupts still ignored, for now...
  84     }
  85
  86 Sending a signal to a negative process ID means that you send the signal
  87 to the entire Unix process-group.  This code sends a hang-up signal to all
  88 processes in the current process group (and sets $SIG{HUP} to IGNORE so
  89 it doesn't kill itself):
  90
  91     {
  92         local $SIG{HUP} = 'IGNORE';
  93         kill HUP => -$$;
  94         # snazzy writing of: kill('HUP', -$$)
  95     }
  96
  97 Another interesting signal to send is signal number zero.  This doesn't
  98 actually affect a child process, but instead checks whether it's alive
  99 or has changed its UID.
 100
 101     unless (kill 0 => $kid_pid) {
 102         warn "something wicked happened to $kid_pid";
 103     }
 104
 105 When directed at a process whose UID is not identical to that
 106 of the sending process, signal number zero may fail because
 107 you lack permission to send the signal, even though the process is alive.
 108 You may be able to determine the cause of failure using C<%!>.
 109
 110     unless (kill 0 => $pid or $!{EPERM}) {
 111         warn "$pid looks dead";
 112     }
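
The two checks above can be wrapped in one function.  This is only a
sketch; C<process_alive()> is a hypothetical name, not part of Perl or
of any module.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $pid exists, even when we may not signal it.
sub process_alive {
    my ($pid) = @_;
    return 1 if kill 0 => $pid;   # signal 0 could be sent: process exists
    return $!{EPERM} ? 1 : 0;     # EPERM: it exists but belongs to another user
}

print "I am alive\n" if process_alive($$);
```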
 113
 114 You might also want to employ anonymous functions for simple signal
 115 handlers:
 116
 117     $SIG{INT} = sub { die "\nOutta here!\n" };
 118
 119 But that will be problematic for the more complicated handlers that need
 120 to reinstall themselves.  Because Perl's signal mechanism is currently
 121 based on the signal(3) function from the C library, you may sometimes be so
 122 misfortunate as to run on systems where that function is "broken", that
 123 is, it behaves in the old unreliable SysV way rather than the newer, more
 124 reasonable BSD and POSIX fashion.  So you'll see defensive people writing
 125 signal handlers like this:
 126
 127     sub REAPER {
 128         $waitedpid = wait;
 129         # loathe sysV: it makes us not only reinstate
 130         # the handler, but place it after the wait
 131         $SIG{CHLD} = \&REAPER;
 132     }
 133     $SIG{CHLD} = \&REAPER;
 134     # now do something that forks...
 135
 136 or better still:
 137
 138     use POSIX ":sys_wait_h";
 139     sub REAPER {
 140         my $child;
 141         # If a second child dies while in the signal handler caused by the
 142         # first death, we won't get another signal. So must loop here else
 143         # we will leave the unreaped child as a zombie. And the next time
 144         # two children die we get another zombie. And so on.
 145         while (($child = waitpid(-1,WNOHANG)) > 0) {
 146             $Kid_Status{$child} = $?;
 147         }
 148         $SIG{CHLD} = \&REAPER;  # still loathe sysV
 149     }
 150     $SIG{CHLD} = \&REAPER;
 151     # do something that forks...
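
To see the looping handler at work, here is a small self-contained
sketch that forks three children and lets the handler collect their
statuses.  The exit codes, the retry limit and the C<%Kid_Status>
layout are choices made for this illustration only.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";

my %Kid_Status;     # pid => raw wait status, filled in by the handler

sub REAPER {
    local ($!, $?);             # don't disturb the interrupted code
    # One CHLD signal may stand for several exited children, so loop.
    while ((my $child = waitpid(-1, WNOHANG)) > 0) {
        $Kid_Status{$child} = $?;
    }
    $SIG{CHLD} = \&REAPER;      # only matters on old SysV-style systems
}
$SIG{CHLD} = \&REAPER;

my @pids;
for my $code (0 .. 2) {
    defined(my $pid = fork) or die "cannot fork: $!";
    exit $code unless $pid;     # child: exit at once with a known status
    push @pids, $pid;
}

# Wait (at most about ten seconds) until the handler has seen all three.
my $tries = 0;
sleep 1 while keys(%Kid_Status) < 3 and ++$tries < 10;

printf "child %d exited with status %d\n", $_, $Kid_Status{$_} >> 8
    for @pids;
```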
 152
 153 Signal handling is also used for timeouts in Unix.  While safely
 154 protected within an C<eval{}> block, you set a signal handler to trap
 155 alarm signals and then schedule to have one delivered to you in some
 156 number of seconds.  Then try your blocking operation, clearing the alarm
 157 when it's done but not before you've exited your C<eval{}> block.  If it
 158 goes off, you'll use die() to jump out of the block, much as you might
 159 using longjmp() or throw() in other languages.
 160
 161 Here's an example:
 162
 163     eval {
 164         local $SIG{ALRM} = sub { die "alarm clock restart" };
 165         alarm 10;
 166         flock(FH, 2);   # blocking write lock
 167         alarm 0;
 168     };
 169     if ($@ and $@ !~ /alarm clock restart/) { die }
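
The same pattern can be packaged as a reusable function.  This is a
sketch under stated assumptions: C<with_timeout()> is a hypothetical
helper, and the trailing newline in the C<die> message is there only so
the message can be compared exactly.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, giving up after $seconds.
# Returns 1 if $code finished, 0 if the alarm fired; other errors propagate.
sub with_timeout {
    my ($seconds, $code) = @_;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "alarm clock restart\n" };
        alarm $seconds;
        $code->();
        alarm 0;
        1;
    };
    alarm 0;    # make sure no alarm is left pending if $code died early
    return 1 if $ok;
    die $@ unless $@ eq "alarm clock restart\n";
    return 0;
}

my $finished = with_timeout(1, sub { sleep 5 });
print $finished ? "finished\n" : "timed out\n";
```

Errors other than the timeout are rethrown, so the caller can still
tell a failed operation from a slow one.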
 170
 171 If the operation being timed out is system() or qx(), this technique
 172 is liable to generate zombies.  If this matters to you, you'll
 173 need to do your own fork() and exec(), and kill the errant child process.
 174
 175 For more complex signal handling, you might see the standard POSIX
 176 module.  Lamentably, this is almost entirely undocumented, but
 177 the F<t/lib/posix.t> file from the Perl source distribution has some
 178 examples in it.
 179
 180 =head2 Handling the SIGHUP Signal in Daemons
 181
 182 A process that usually starts when the system boots and shuts down
 183 when the system is shut down is called a daemon (sometimes expanded,
 184 as a backronym, to Disk And Execution MONitor). If a daemon's configuration
 185 file is modified after the process has been started, there should be a way to
 186 tell that process to re-read its configuration file, without stopping
 187 the process. Many daemons provide this mechanism using the C<SIGHUP>
 188 signal handler. When you want to tell the daemon to re-read the file
 189 you simply send it the C<SIGHUP> signal.
 190
 191 Not all platforms automatically reinstall their (native) signal
 192 handlers after a signal delivery.  This means that the handler works
 193 only the first time the signal is sent. The solution to this problem
 194 is to use C<POSIX> signal handlers if available; their behaviour
 195 is well-defined.
 196
 197 The following example implements a simple daemon, which restarts
 198 itself every time the C<SIGHUP> signal is received. The actual code is
 199 located in the subroutine C<code()>, which simply prints some debug
 200 info to show that it works and should be replaced with the real code.
 201
 202   #!/usr/bin/perl -w
 203
 204   use POSIX ();
 205   use FindBin ();
 206   use File::Basename ();
 207   use File::Spec::Functions;
 208
 209   $|=1;
 210
 211   # make the daemon cross-platform, so exec always calls the script
 212   # itself with the right path, no matter how the script was invoked.
 213   my $script = File::Basename::basename($0);
 214   my $SELF = catfile $FindBin::Bin, $script;
 215
 216   # POSIX unmasks the sigprocmask properly
 217   my $sigset = POSIX::SigSet->new();
 218   my $action = POSIX::SigAction->new('sigHUP_handler',
 219                                      $sigset,
 220                                      &POSIX::SA_NODEFER);
 221   POSIX::sigaction(&POSIX::SIGHUP, $action);
 222
 223   sub sigHUP_handler {
 224       print "got SIGHUP\n";
 225       exec($SELF, @ARGV) or die "Couldn't restart: $!\n";
 226   }
 227
 228   code();
 229
 230   sub code {
 231       print "PID: $$\n";
 232       print "ARGV: @ARGV\n";
 233       my $c = 0;
 234       while (++$c) {
 235           sleep 2;
 236           print "$c\n";
 237       }
 238   }
 239   __END__
 240
 241
 242 =head1 Named Pipes
 243
 244 A named pipe (often referred to as a FIFO) is an old Unix IPC
 245 mechanism for processes communicating on the same machine.  It works
 246 just like regular, connected anonymous pipes, except that the
 247 processes rendezvous using a filename and don't have to be related.
 248
 249 To create a named pipe, use the Unix command mknod(1) or on some
 250 systems, mkfifo(1).  These may not be in your normal path.
 251
 252     # system return val is backwards, so && not ||
 253     #
 254     $ENV{PATH} .= ":/etc:/usr/etc";
 255     if  (      system('mknod',  $path, 'p')
 256             && system('mkfifo', $path) )
 257     {
 258         die "mk{nod,fifo} $path failed";
 259     }
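
If the POSIX module is available, its C<mkfifo()> function creates the
pipe without running an external command.  A minimal sketch, assuming a
scratch directory made with File::Temp; the file name and the 0700 mode
are arbitrary choices for the example.

```perl
use strict;
use warnings;
use POSIX qw(mkfifo);
use File::Temp qw(tempdir);

# Hypothetical location: a scratch directory that is removed on exit.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/example.fifo";

# mkfifo() returns true on success, so || works here (unlike system()).
mkfifo($path, 0700) || die "mkfifo $path failed: $!";

print "$path is a named pipe\n" if -p $path;
```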
 260
 261
 262 A fifo is convenient when you want to connect a process to an unrelated
 263 one.  When you open a fifo, the program will block until there's something
 264 on the other end.
 265
 266 For example, let's say you'd like to have your F<.signature> file be a
 267 named pipe that has a Perl program on the other end.  Now every time any
 268 program (like a mailer, news reader, finger program, etc.) tries to read
 269 from that file, the reading program will block and your program will
 270 supply the new signature.  We'll use the pipe-checking file test B<-p>
 271 to find out whether anyone (or anything) has accidentally removed our fifo.
 272
 273     chdir; # go home
 274     $FIFO = '.signature';
 275     $ENV{PATH} .= ":/etc:/usr/games";
 276
 277     while (1) {
 278         unless (-p $FIFO) {
 279             unlink $FIFO;
 280             system('mknod', $FIFO, 'p')
 281                 && die "can't mknod $FIFO: $!";
 282         }
 283
 284         # next line blocks until there's a reader
 285         open (FIFO, "> $FIFO") || die "can't write $FIFO: $!";
 286         print FIFO "John Smith (smith\@host.org)\n", `fortune -s`;
 287         close FIFO;
 288         sleep 2;    # to avoid dup signals
 289     }
 290
 291 =head2 Deferred Signals (Safe Signals)
 292
 293 In Perls before Perl 5.7.3, by installing Perl code to deal with
 294 signals, you were exposing yourself to danger from two things.  First,
 295 few system library functions are re-entrant.  If the signal interrupts
 296 while Perl is executing one function (like malloc(3) or printf(3)),
 297 and your signal handler then calls the same function again, you could
 298 get unpredictable behavior--often, a core dump.  Second, Perl isn't
 299 itself re-entrant at the lowest levels.  If the signal interrupts Perl
 300 while Perl is changing its own internal data structures, similarly
 301 unpredictable behaviour may result.
 302
 303 There were two things you could do, knowing this: be paranoid or be
 304 pragmatic.  The paranoid approach was to do as little as possible in your
 305 signal handler.  Set an existing integer variable that already has a
 306 value, and return.  This doesn't help you if you're in a slow system call,
 307 which will just restart.  That means you have to C<die> to longjmp(3) out
 308 of the handler.  Even this is a little cavalier for the true paranoiac,
 309 who avoids C<die> in a handler because the system I<is> out to get you.
 310 The pragmatic approach was to say "I know the risks, but prefer the
 311 convenience", and to do anything you wanted in your signal handler,
 312 and be prepared to clean up core dumps now and again.
 313
 314 In Perl 5.7.3 and later, to avoid these problems, signals are
 315 "deferred": when the signal is delivered to the process by
 316 the system (to the C code that implements Perl), a flag is set and the
 317 handler returns immediately.  Then at strategic "safe" points in the
 318 Perl interpreter (e.g. when it is about to execute a new opcode) the
 319 flags are checked and the Perl-level handler from %SIG is
 320 executed.  The "deferred" scheme allows much more flexibility in the
 321 coding of signal handlers, as we know the Perl interpreter is in a safe
 322 state and that we are not in a system library function when the
 323 handler is called.  However, the implementation does differ from
 324 previous Perls in the following ways:
 325
 326 =over 4
 327
 328 =item Long running opcodes
 329
 330 As the Perl interpreter only looks at the signal flags when it is about
 331 to execute a new opcode, a signal that arrives during a long-running opcode
 332 (e.g. a regular expression operation on a very large string) will
 333 not be seen until the operation completes.
 334
 335 =item Interrupting IO
 336
 337 When a signal is delivered (e.g. INT from control-C) the operating system
 338 breaks into IO operations like C<read> (used to implement Perl's
 339 E<lt>E<gt> operator). On older Perls the handler was called
 340 immediately (and as C<read> is not "unsafe" this worked well). With
 341 the "deferred" scheme the handler is not called immediately, and if
 342 Perl is using the system's C<stdio> library that library may restart the
 343 C<read> without returning to Perl and giving it a chance to call the
 344 %SIG handler. If this happens on your system the solution is to use the
 345 C<:perlio> layer to do IO, at least on those handles which you want
 346 to be able to break into with signals. (The C<:perlio> layer checks
 347 the signal flags and calls %SIG handlers before resuming the IO operation.)
 348
 349 Note that the default in Perl 5.7.3 and later is to automatically use
 350 the C<:perlio> layer.
 351
 352 Note that some networking library functions like gethostbyname() are
 353 known to have their own implementations of timeouts which may conflict
 354 with your timeouts.  If you are having problems with such functions,
 355 you can try using the POSIX sigaction() function, which bypasses the
 356 Perl safe signals (note that this means subjecting yourself to
 357 possible memory corruption, as described above).  Instead of setting
 358 C<$SIG{ALRM}>:
 359
 360    local $SIG{ALRM} = sub { die "alarm" };
 361
 362 try something like the following:
 363
 364     use POSIX qw(SIGALRM);
 365     POSIX::sigaction(SIGALRM,
 366                      POSIX::SigAction->new(sub { die "alarm" }))
 367           or die "Error setting SIGALRM handler: $!\n";
 368
 369 =item Restartable system calls
 370
 371 On systems that supported it, older versions of Perl used the
 372 SA_RESTART flag when installing %SIG handlers.  This meant that
 373 restartable system calls would continue rather than returning when
 374 a signal arrived.  In order to deliver deferred signals promptly,
 375 Perl 5.7.3 and later do I<not> use SA_RESTART.  Consequently,
 376 restartable system calls can fail (with $! set to C<EINTR>) in places
 377 where they previously would have succeeded.
 378
 379 Note that the default C<:perlio> layer will retry C<read>, C<write>
 380 and C<close> as described above and that interrupted C<wait> and
 381 C<waitpid> calls will always be retried.
 382
 383 =item Signals as "faults"
 384
 385 Certain signals, e.g. SEGV, ILL, and BUS, are generated as a result of
 386 virtual memory or other "faults". These are normally fatal and there
 387 is little a Perl-level handler can do with them. (The
 388 old signal scheme was particularly unsafe in such cases.)  However, if
 389 a %SIG handler is set, the new scheme simply sets a flag and returns as
 390 described above. This may cause the operating system to try the
 391 offending machine instruction again and, as nothing has changed, it
 392 will generate the signal again. The result of this is a rather odd
 393 "loop". In future Perl's signal mechanism may be changed to avoid this,
 394 perhaps by simply disallowing %SIG handlers on signals of that
 395 type. Until then the workaround is not to set a %SIG handler on those
 396 signals. (Which signals they are is operating system dependent.)
 397
 398 =item Signals triggered by operating system state
 399
 400 On some operating systems certain signal handlers are supposed to "do
 401 something" before returning. One example can be CHLD or CLD which
 402 indicates a child process has completed. On some operating systems the
 403 signal handler is expected to C<wait> for the completed child
 404 process. On such systems the deferred signal scheme will not work for
 405 those signals (it does not do the C<wait>). Again the failure will
 406 look like a loop as the operating system will re-issue the signal as
 407 there are un-waited-for completed child processes.
 408
 409 =back
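
The deferred scheme can be observed with a handler that does ordinary
Perl work such as growing an array, which allocates memory.  The sketch
below assumes a Unix-like system that has SIGUSR1; the short polling
loop is a precaution added for this example, not something the scheme
requires.

```perl
use strict;
use warnings;

my @log;
$SIG{USR1} = sub { push @log, "handler ran for SIG$_[0]" };

# Send ourselves a signal.  The C-level handler only records it; the
# Perl-level handler above runs at the next safe point.
kill USR1 => $$;

# Wait briefly (up to about two seconds) in case delivery is delayed.
my $tries = 0;
select(undef, undef, undef, 0.05) until @log or ++$tries > 40;

print "$_\n" for @log;
```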
 410
 411 If you want the old signal behaviour back regardless of possible
 412 memory corruption, set the environment variable C<PERL_SIGNALS> to
 413 C<"unsafe"> (a new feature since Perl 5.8.1).
 414
 415 =head1 Using open() for IPC
 416
 417 Perl's basic open() statement can also be used for unidirectional
 418 interprocess communication by either appending or prepending a pipe
 419 symbol to the second argument to open().  Here's how to start
 420 something up in a child process you intend to write to:
 421
 422     open(SPOOLER, "| cat -v | lpr -h 2>/dev/null")
 423                     || die "can't fork: $!";
 424     local $SIG{PIPE} = sub { die "spooler pipe broke" };
 425     print SPOOLER "stuff\n";
 426     close SPOOLER || die "bad spool: $! $?";
 427
 428 And here's how to start up a child process you intend to read from:
 429
 430     open(STATUS, "netstat -an 2>&1 |")
 431                     || die "can't fork: $!";
 432     while (<STATUS>) {
 433         next if /^(tcp|udp)/;
 434         print;
 435     }
 436     close STATUS || die "bad netstat: $! $?";
 437
 438 If one can be sure that a particular program is a Perl script that is
 439 expecting filenames in @ARGV, the clever programmer can write something
 440 like this:
 441
 442     % program f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile
 443
 444 and irrespective of which shell it's called from, the Perl program will
 445 read from the file F<f1>, the process F<cmd1>, standard input (F<tmpfile>
 446 in this case), the F<f2> file, the F<cmd2> command, and finally the F<f3>
 447 file.  Pretty nifty, eh?
 448
 449 You might notice that you could use backticks for much the
 450 same effect as opening a pipe for reading:
 451
 452     print grep { !/^(tcp|udp)/ } `netstat -an 2>&1`;
 453     die "bad netstat" if $?;
 454
 455 While this is true on the surface, it's much more efficient to process the
 456 file one line or record at a time because then you don't have to read the
 457 whole thing into memory at once.  It also gives you finer control of the
 458 whole process, letting you kill off the child process early if you'd
 459 like.
 460
 461 Be careful to check both the open() and the close() return values.  If
 462 you're I<writing> to a pipe, you should also trap SIGPIPE.  Otherwise,
 463 think of what happens when you start up a pipe to a command that doesn't
 464 exist: the open() will in all likelihood succeed (it only reflects the
 465 fork()'s success), but then your output will fail--spectacularly.  Perl
 466 can't know whether the command worked because your command is actually
 467 running in a separate process whose exec() might have failed.  Therefore,
 468 while readers of bogus commands return just a quick end of file, writers
 469 to bogus commands will trigger a signal they'd better be prepared to
 470 handle.  Consider:
 471
 472     open(FH, "|bogus")  or die "can't fork: $!";
 473     print FH "bang\n"   or die "can't write: $!";
 474     close FH            or die "can't close: $!";
 475
 476 That won't blow up until the close, and it will blow up with a SIGPIPE.
 477 To catch it, you could use this:
 478
 479     $SIG{PIPE} = 'IGNORE';
 480     open(FH, "|bogus")  or die "can't fork: $!";
 481     print FH "bang\n"   or die "can't write: $!";
 482     close FH            or die "can't close: status=$?";
 483
 484 =head2 Filehandles
 485
 486 Both the main process and any child processes it forks share the same
 487 STDIN, STDOUT, and STDERR filehandles.  If both processes try to access
 488 them at once, strange things can happen.  You may also want to close
 489 or reopen the filehandles for the child.  You can get around this by
 490 opening your pipe with open(), but on some systems this means that the
 491 child process cannot outlive the parent.
 492
 493 =head2 Background Processes
 494
 495 You can run a command in the background with:
 496
 497     system("cmd &");
 498
 499 The command's STDOUT and STDERR (and possibly STDIN, depending on your
 500 shell) will be the same as the parent's.  You won't need to catch
 501 SIGCHLD because of the double-fork taking place (see below for more
 502 details).
 503
 504 =head2 Complete Dissociation of Child from Parent
 505
 506 In some cases (starting server processes, for instance) you'll want to
 507 completely dissociate the child process from the parent.  This is
 508 often called daemonization.  A well-behaved daemon will also chdir()
 509 to the root directory (so it doesn't prevent unmounting the filesystem
 510 containing the directory from which it was launched) and redirect its
 511 standard file descriptors from and to F</dev/null> (so that random
 512 output doesn't wind up on the user's terminal).
 513
 514     use POSIX 'setsid';
 515
 516     sub daemonize {
 517         chdir '/'               or die "Can't chdir to /: $!";
 518         open STDIN, '/dev/null' or die "Can't read /dev/null: $!";
 519         open STDOUT, '>/dev/null'
 520                                 or die "Can't write to /dev/null: $!";
 521         defined(my $pid = fork) or die "Can't fork: $!";
 522         exit if $pid;
 523         setsid                  or die "Can't start a new session: $!";
 524         open STDERR, '>&STDOUT' or die "Can't dup stdout: $!";
 525     }
 526
 527 The fork() has to come before the setsid() to ensure that you aren't a
 528 process group leader (the setsid() will fail if you are).  If your
 529 system doesn't have the setsid() function, open F</dev/tty> and use the
 530 C<TIOCNOTTY> ioctl() on it instead.  See L<tty(4)> for details.
 531
 532 Non-Unix users should check their Your_OS::Process module for other
 533 solutions.
 534
 535 =head2 Safe Pipe Opens
 536
 537 Another interesting approach to IPC is making your single program go
 538 multiprocess and communicate between (or even amongst) yourselves.  The
 539 open() function will accept a file argument of either C<"-|"> or C<"|-">
 540 to do a very interesting thing: it forks a child connected to the
 541 filehandle you've opened.  The child is running the same program as the
 542 parent.  This is useful for safely opening a file when running under an
 543 assumed UID or GID, for example.  If you open a pipe I<to> minus, you can
 544 write to the filehandle you opened and your kid will find it in its
 545 STDIN.  If you open a pipe I<from> minus, you can read from the filehandle
 546 you opened whatever your kid writes to its STDOUT.
 547
 548     use English '-no_match_vars';
 549     my $sleep_count = 0;
 550
 551     do {
 552         $pid = open(KID_TO_WRITE, "|-");
 553         unless (defined $pid) {
 554             warn "cannot fork: $!";
 555             die "bailing out" if $sleep_count++ > 6;
 556             sleep 10;
 557         }
 558     } until defined $pid;
 559
 560     if ($pid) {  # parent
 561         print KID_TO_WRITE @some_data;
 562         close(KID_TO_WRITE) || warn "kid exited $?";
 563     } else {     # child
 564         ($EUID, $EGID) = ($UID, $GID); # suid progs only
 565         open (FILE, "> /safe/file")
 566             || die "can't open /safe/file: $!";
 567         while (<STDIN>) {
 568             print FILE; # child's STDIN is parent's KID
 569         }
 570         exit;  # don't forget this
 571     }
 572
 573 Another common use for this construct is when you need to execute
 574 something without the shell's interference.  With system(), it's
 575 straightforward, but you can't use a pipe open or backticks safely.
 576 That's because there's no way to stop the shell from getting its hands on
 577 your arguments.   Instead, use lower-level control to call exec() directly.
 578
 579 Here's a safe backtick or pipe open for read:
 580
 581     # add error processing as above
 582     $pid = open(KID_TO_READ, "-|");
 583
 584     if ($pid) {   # parent
 585         while (<KID_TO_READ>) {
 586             # do something interesting
 587         }
 588         close(KID_TO_READ) || warn "kid exited $?";
 589
 590     } else {      # child
 591         ($EUID, $EGID) = ($UID, $GID); # suid only
 592         exec($program, @options, @args)
 593             || die "can't exec program: $!";
 594         # NOTREACHED
 595     }
 596
 597
 598 And here's a safe pipe open for writing:
 599
 600     # add error processing as above
 601     $pid = open(KID_TO_WRITE, "|-");
 602     $SIG{PIPE} = sub { die "whoops, $program pipe broke" };
 603
 604     if ($pid) {  # parent
 605         for (@data) {
 606             print KID_TO_WRITE;
 607         }
 608         close(KID_TO_WRITE) || warn "kid exited $?";
 609
 610     } else {     # child
 611         ($EUID, $EGID) = ($UID, $GID);
 612         exec($program, @options, @args)
 613             || die "can't exec program: $!";
 614         # NOTREACHED
 615     }
 616
 617 Since Perl 5.8.0, you can also use the list form of C<open> for pipes:
 618 the syntax
 619
 620     open KID_PS, "-|", "ps", "aux" or die $!;
 621
 622 forks the ps(1) command (without spawning a shell, as there are more than
 623 three arguments to open()), and reads its standard output via the
 624 C<KID_PS> filehandle.  The corresponding syntax to write to command
 625 pipes (with C<"|-"> in place of C<"-|">) is also implemented.
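
The following sketch shows the list form with a lexical filehandle.  It
runs the current perl (C<$^X>) as a stand-in for any command, and the
two arguments are chosen only to show that characters a shell would
treat specially are passed through untouched.

```perl
use strict;
use warnings;

# $^X is the path of the running perl; it stands in for any command here.
my @cmd = ($^X, '-e', 'print "arg: $_\n" for @ARGV', 'a b', '*.c');

# List form: no shell is involved, so 'a b' and '*.c' arrive unchanged.
open(my $kid, '-|', @cmd) or die "can't fork: $!";
my @lines = <$kid>;
close($kid) or warn "kid exited $?";

print @lines;
```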
 626
 627 Note that these operations are full Unix forks, which means they may not be
 628 correctly implemented on alien systems.  Additionally, these are not true
 629 multithreading.  If you'd like to learn more about threading, see the
 630 F<modules> file mentioned below in the SEE ALSO section.
 631
 632 =head2 Bidirectional Communication with Another Process
 633
 634 While this works reasonably well for unidirectional communication, what
 635 about bidirectional communication?  The obvious thing you'd like to do
 636 doesn't actually work:
 637
 638     open(PROG_FOR_READING_AND_WRITING, "| some program |")
 639
 640 and if you forget to use the C<use warnings> pragma or the B<-w> flag,
 641 then you'll miss out entirely on the diagnostic message:
 642
 643     Can't do bidirectional pipe at -e line 1.
 644
 645 If you really want to, you can use the standard open2() library function
 646 to catch both ends.  There's also an open3() for tridirectional I/O so you
 647 can also catch your child's STDERR, but doing so would then require an
 648 awkward select() loop and wouldn't allow you to use normal Perl input
 649 operations.
 650
 651 If you look at its source, you'll see that open2() uses low-level
 652 primitives like Unix pipe() and exec() calls to create all the connections.
 653 While it might have been slightly more efficient by using socketpair(), it
 654 would have then been even less portable than it already is.  The open2()
 655 and open3() functions are unlikely to work anywhere except on a Unix
 656 system or some other one purporting to be POSIX compliant.
 657
 658 Here's an example of using open2():
 659
 660     use FileHandle;
 661     use IPC::Open2;
 662     $pid = open2(*Reader, *Writer, "cat -u -n" );
 663     print Writer "stuff\n";
 664     $got = <Reader>;
 665
 666 The problem with this is that Unix buffering is really going to
 667 ruin your day.  Even though your C<Writer> filehandle is auto-flushed,
 668 and the process on the other end will get your data in a timely manner,
 669 you can't usually do anything to force it to give it back to you
 670 in a similarly quick fashion.  In this case, we could, because we
 671 gave I<cat> a B<-u> flag to make it unbuffered.  But very few Unix
 672 commands are designed to operate over pipes, so this seldom works
 673 unless you yourself wrote the program on the other end of the
 674 double-ended pipe.
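
When the exchange does not need to be interactive, one way to sidestep
the buffering problem is to write everything, close the writing end,
and only then read.  The sketch below uses the current perl (C<$^X>) as
a stand-in for a filter program and is only suitable for small amounts
of data; with large inputs both processes could block on full pipes.

```perl
use strict;
use warnings;
use IPC::Open2;

# The child is another perl that upper-cases its input; it stands in for
# any filter that may buffer its output until it sees end of file.
my $pid = open2(my $reader, my $writer, $^X, '-pe', '$_ = uc');

print $writer "stuff\n", "more stuff\n";
close $writer;              # child sees EOF and flushes its output

my @got = <$reader>;        # safe to read now: the child will finish
close $reader;
waitpid $pid, 0;            # open2 does not reap the child for us

print @got;
```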
 675
 676 A solution to this is the nonstandard F<Comm.pl> library.  It uses
 677 pseudo-ttys to make your program behave more reasonably:
 678
 679     require 'Comm.pl';
 680     $ph = open_proc('cat -n');
 681     for (1..10) {
 682         print $ph "a line\n";
 683         print "got back ", scalar <$ph>;
 684     }
 685
 686 This way you don't have to have control over the source code of the
 687 program you're using.  The F<Comm> library also has expect()
 688 and interact() functions.  Find the library (and we hope its
 689 successor F<IPC::Chat>) at your nearest CPAN archive as detailed
 690 in the SEE ALSO section below.
 691
 692 The newer Expect.pm module from CPAN also addresses this kind of thing.
 693 This module requires two other modules from CPAN: IO::Pty and IO::Stty.
 694 It sets up a pseudo-terminal to interact with programs that insist on
 695 talking to the terminal device driver.  If your system is
 696 amongst those supported, this may be your best bet.
 697
 698 =head2 Bidirectional Communication with Yourself
 699
 700 If you want, you may make low-level pipe() and fork() calls
 701 to stitch this together by hand.  This example only
 702 talks to itself, but you could reopen the appropriate
 703 handles to STDIN and STDOUT and call other processes.
 704
 705     #!/usr/bin/perl -w
 706     # pipe1 - bidirectional communication using two pipe pairs
 707     #         designed for the socketpair-challenged
 708     use IO::Handle;     # thousands of lines just for autoflush :-(
 709     pipe(PARENT_RDR, CHILD_WTR);                # XXX: failure?
 710     pipe(CHILD_RDR,  PARENT_WTR);               # XXX: failure?
 711     CHILD_WTR->autoflush(1);
 712     PARENT_WTR->autoflush(1);
 713
 714     if ($pid = fork) {
 715         close PARENT_RDR; close PARENT_WTR;
 716         print CHILD_WTR "Parent Pid $$ is sending this\n";
 717         chomp($line = <CHILD_RDR>);
 718         print "Parent Pid $$ just read this: `$line'\n";
 719         close CHILD_RDR; close CHILD_WTR;
 720         waitpid($pid,0);
 721     } else {
 722         die "cannot fork: $!" unless defined $pid;
 723         close CHILD_RDR; close CHILD_WTR;
 724         chomp($line = <PARENT_RDR>);
 725         print "Child Pid $$ just read this: `$line'\n";
 726         print PARENT_WTR "Child Pid $$ is sending this\n";
 727         close PARENT_RDR; close PARENT_WTR;
 728         exit;
 729     }
 730
 731 But you don't actually have to make two pipe calls.  If you
 732 have the socketpair() system call, it will do this all for you.
 733
 734     #!/usr/bin/perl -w
 735     # pipe2 - bidirectional communication using socketpair
 736     #   "the best ones always go both ways"
 737
 738     use Socket;
 739     use IO::Handle;     # thousands of lines just for autoflush :-(
 740     # We say AF_UNIX because although *_LOCAL is the
 741     # POSIX 1003.1g form of the constant, many machines
 742     # still don't have it.
 743     socketpair(CHILD, PARENT, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
 744                                 or  die "socketpair: $!";
 745
 746     CHILD->autoflush(1);
 747     PARENT->autoflush(1);
 748
 749     if ($pid = fork) {
 750         close PARENT;
 751         print CHILD "Parent Pid $$ is sending this\n";
 752         chomp($line = <CHILD>);
 753         print "Parent Pid $$ just read this: `$line'\n";
 754         close CHILD;
 755         waitpid($pid,0);
 756     } else {
 757         die "cannot fork: $!" unless defined $pid;
 758         close CHILD;
 759         chomp($line = <PARENT>);
 760         print "Child Pid $$ just read this: `$line'\n";
 761         print PARENT "Child Pid $$ is sending this\n";
 762         close PARENT;
 763         exit;
 764     }
 765
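The same exchange can be wrapped in a subroutine that uses lexical
filehandles and returns the reply.  Here's a minimal sketch, assuming
your system has socketpair(); the ask_child() helper and its
upper-casing child are made up for illustration and come from no module.

```perl
    use strict;
    use warnings;
    use Socket;
    use IO::Handle;

    # Hypothetical helper: send one line to a forked child over a
    # socketpair and return the child's upper-cased reply.
    sub ask_child {
        my ($msg) = @_;
        socketpair(my $child, my $parent, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
            or die "socketpair: $!";
        $child->autoflush(1);
        $parent->autoflush(1);

        my $pid = fork;
        die "cannot fork: $!" unless defined $pid;
        if ($pid == 0) {                # child: talks on the PARENT end
            close $child;
            my $line = <$parent>;
            print {$parent} uc($line);
            close $parent;
            exit 0;
        }
        close $parent;                  # parent: talks on the CHILD end
        print {$child} "$msg\n";
        chomp(my $reply = <$child>);
        close $child;
        waitpid($pid, 0);
        return $reply;
    }
```

Each side closes the end it doesn't use, just as in the pipe2 program.
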
 766 =head1 Sockets: Client/Server Communication
 767
 768 While not limited to Unix-derived operating systems (e.g., WinSock on PCs
 769 provides socket support, as do some VMS libraries), you may not have
 770 sockets on your system, in which case this section probably isn't going to do
 771 you much good.  With sockets, you can do both virtual circuits (i.e., TCP
 772 streams) and datagrams (i.e., UDP packets).  You may be able to do even more
 773 depending on your system.
 774
 775 The Perl function calls for dealing with sockets have the same names as
 776 the corresponding system calls in C, but their arguments tend to differ
 777 for two reasons: first, Perl filehandles work differently than C file
 778 descriptors.  Second, Perl already knows the length of its strings, so you
 779 don't need to pass that information.
 780
 781 One of the major problems with old socket code in Perl was that it used
 782 hard-coded values for some of the constants, which severely hurt
 783 portability.  If you ever see code that does anything like explicitly
setting C<$AF_INET = 2>, you know you're in for big trouble.  An
 785 immeasurably superior approach is to use the C<Socket> module, which more
 786 reliably grants access to various constants and functions you'll need.
 787
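As a small illustration, here's a sketch that builds and then takes
apart an Internet socket address using only names exported by
C<Socket>, with no hard-coded constants.  The describe_addr() helper
is hypothetical.

```perl
    use strict;
    use warnings;
    use Socket qw(inet_aton inet_ntoa pack_sockaddr_in unpack_sockaddr_in);

    # Hypothetical helper: round-trip a host and port through the
    # packed sockaddr_in form that socket calls such as connect() expect.
    sub describe_addr {
        my ($host, $port) = @_;
        my $iaddr = inet_aton($host) or die "no host: $host";
        my $paddr = pack_sockaddr_in($port, $iaddr);
        my ($p, $ip) = unpack_sockaddr_in($paddr);
        return inet_ntoa($ip) . ":$p";
    }
```

Calling C<describe_addr('127.0.0.1', 2345)> should give back the
string C<127.0.0.1:2345>.
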
 788 If you're not writing a server/client for an existing protocol like
 789 NNTP or SMTP, you should give some thought to how your server will
 790 know when the client has finished talking, and vice-versa.  Most
 791 protocols are based on one-line messages and responses (so one party
 792 knows the other has finished when a "\n" is received) or multi-line
 793 messages and responses that end with a period on an empty line
 794 ("\n.\n" terminates a message/response).
 795
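Here's a minimal sketch of the second convention: reading lines until
one consisting of a lone period.  The read_dot_message() name is an
assumption for illustration, and the sketch ignores the "dot-stuffing"
that real protocols such as SMTP use to let a message contain a line
that starts with a period.

```perl
    use strict;
    use warnings;

    # Hypothetical helper: return the lines of one message, stopping at
    # a line that holds only ".".  Accepts "\015\012" or a lone "\012".
    sub read_dot_message {
        my ($fh) = @_;
        my @lines;
        while (defined(my $line = <$fh>)) {
            $line =~ s/\015?\012\z//;
            return @lines if $line eq '.';
            push @lines, $line;
        }
        die "connection closed before end-of-message marker\n";
    }
```

The handle can be a socket or any other filehandle that yields lines.
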
 796 =head2 Internet Line Terminators
 797
 798 The Internet line terminator is "\015\012".  Under ASCII variants of
 799 Unix, that could usually be written as "\r\n", but under other systems,
 800 "\r\n" might at times be "\015\015\012", "\012\012\015", or something
 801 completely different.  The standards specify writing "\015\012" to be
 802 conformant (be strict in what you provide), but they also recommend
accepting a lone "\012" on input (be lenient in what you require).
 804 We haven't always been very good about that in the code in this manpage,
 805 but unless you're on a Mac, you'll probably be ok.
 806
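One way to follow that advice is to keep the terminator in a single
place.  This is only a sketch; the to_wire() and from_wire() names are
made up here.

```perl
    use strict;
    use warnings;

    my $CRLF = "\015\012";

    # Strict on output: every line gets the Internet terminator.
    sub to_wire {
        return join '', map { "$_$CRLF" } @_;
    }

    # Lenient on input: strip "\015\012" or a lone "\012".
    sub from_wire {
        my ($line) = @_;
        $line =~ s/\015?\012\z//;
        return $line;
    }
```
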
 807 =head2 Internet TCP Clients and Servers
 808
 809 Use Internet-domain sockets when you want to do client-server
 810 communication that might extend to machines outside of your own system.
 811
 812 Here's a sample TCP client using Internet-domain sockets:
 813
 814     #!/usr/bin/perl -w
 815     use strict;
 816     use Socket;
 817     my ($remote,$port, $iaddr, $paddr, $proto, $line);
 818
 819     $remote  = shift || 'localhost';
 820     $port    = shift || 2345;  # random port
 821     if ($port =~ /\D/) { $port = getservbyname($port, 'tcp') }
 822     die "No port" unless $port;
 823     $iaddr   = inet_aton($remote)               || die "no host: $remote";
 824     $paddr   = sockaddr_in($port, $iaddr);
 825
 826     $proto   = getprotobyname('tcp');
 827     socket(SOCK, PF_INET, SOCK_STREAM, $proto)  || die "socket: $!";
 828     connect(SOCK, $paddr)    || die "connect: $!";
 829     while (defined($line = <SOCK>)) {
 830         print $line;
 831     }
 832
 833     close (SOCK)            || die "close: $!";
 834     exit;
 835
 836 And here's a corresponding server to go along with it.  We'll
 837 leave the address as INADDR_ANY so that the kernel can choose
the appropriate interface on multihomed hosts.  If you want to sit
 839 on a particular interface (like the external side of a gateway
 840 or firewall machine), you should fill this in with your real address
 841 instead.
 842
 843     #!/usr/bin/perl -Tw
 844     use strict;
 845     BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
 846     use Socket;
 847     use Carp;
 848     my $EOL = "\015\012";
 849
 850     sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }
 851
 852     my $port = shift || 2345;
 853     my $proto = getprotobyname('tcp');
 854
 855     ($port) = $port =~ /^(\d+)$/                        or die "invalid port";
 856
 857     socket(Server, PF_INET, SOCK_STREAM, $proto)        || die "socket: $!";
 858     setsockopt(Server, SOL_SOCKET, SO_REUSEADDR,
 859                                         pack("l", 1))   || die "setsockopt: $!";
 860     bind(Server, sockaddr_in($port, INADDR_ANY))        || die "bind: $!";
 861     listen(Server,SOMAXCONN)                            || die "listen: $!";
 862
 863     logmsg "server started on port $port";
 864
 865     my $paddr;
 866
 869     for ( ; $paddr = accept(Client,Server); close Client) {
 870         my($port,$iaddr) = sockaddr_in($paddr);
 871         my $name = gethostbyaddr($iaddr,AF_INET);
 872
 873         logmsg "connection from $name [",
                inet_ntoa($iaddr), "] at port $port";
 876
 877         print Client "Hello there, $name, it's now ",
 878                         scalar localtime, $EOL;
 879     }
 880
And here's a multitasking version.  It's multitasked in that,
like most typical servers, it spawns (forks) a child server to
handle the client request so that the master server can quickly
go back to service a new client.
 885
 886     #!/usr/bin/perl -Tw
 887     use strict;
 888     BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
 889     use Socket;
 890     use Carp;
 891     my $EOL = "\015\012";
 892
 893     sub spawn;  # forward declaration
 894     sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }
 895
 896     my $port = shift || 2345;
 897     my $proto = getprotobyname('tcp');
 898
 899     ($port) = $port =~ /^(\d+)$/                        or die "invalid port";
 900
 901     socket(Server, PF_INET, SOCK_STREAM, $proto)        || die "socket: $!";
 902     setsockopt(Server, SOL_SOCKET, SO_REUSEADDR,
 903                                         pack("l", 1))   || die "setsockopt: $!";
 904     bind(Server, sockaddr_in($port, INADDR_ANY))        || die "bind: $!";
 905     listen(Server,SOMAXCONN)                            || die "listen: $!";
 906
 907     logmsg "server started on port $port";
 908
 909     my $waitedpid = 0;
 910     my $paddr;
 911
 912     use POSIX ":sys_wait_h";
 913     sub REAPER {
 914         my $child;
 915         while (($waitedpid = waitpid(-1,WNOHANG)) > 0) {
 916             logmsg "reaped $waitedpid" . ($? ? " with exit $?" : '');
 917         }
 918         $SIG{CHLD} = \&REAPER;  # loathe sysV
 919     }
 920
 921     $SIG{CHLD} = \&REAPER;
 922
 923     for ( $waitedpid = 0;
 924           ($paddr = accept(Client,Server)) || $waitedpid;
 925           $waitedpid = 0, close Client)
 926     {
 927         next if $waitedpid and not $paddr;
 928         my($port,$iaddr) = sockaddr_in($paddr);
 929         my $name = gethostbyaddr($iaddr,AF_INET);
 930
 931         logmsg "connection from $name [",
                inet_ntoa($iaddr), "] at port $port";
 934
 935         spawn sub {
 936             $|=1;
 937             print "Hello there, $name, it's now ", scalar localtime, $EOL;
 938             exec '/usr/games/fortune'           # XXX: `wrong' line terminators
 939                 or confess "can't exec fortune: $!";
 940         };
 941
 942     }
 943
 944     sub spawn {
 945         my $coderef = shift;
 946
 947         unless (@_ == 0 && $coderef && ref($coderef) eq 'CODE') {
 948             confess "usage: spawn CODEREF";
 949         }
 950
 951         my $pid;
 952         if (!defined($pid = fork)) {
 953             logmsg "cannot fork: $!";
 954             return;
 955         } elsif ($pid) {
 956             logmsg "begat $pid";
 957             return; # I'm the parent
 958         }
 959         # else I'm the child -- go spawn
 960
 961         open(STDIN,  "<&Client")   || die "can't dup client to stdin";
 962         open(STDOUT, ">&Client")   || die "can't dup client to stdout";
 963         ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
 964         exit &$coderef();
 965     }
 966
 967 This server takes the trouble to clone off a child version via fork() for
 968 each incoming request.  That way it can handle many requests at once,
 969 which you might not always want.  Even if you don't fork(), the listen()
 970 will allow that many pending connections.  Forking servers have to be
 971 particularly careful about cleaning up their dead children (called
 972 "zombies" in Unix parlance), because otherwise you'll quickly fill up your
 973 process table.
 974
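The reaping itself can be kept separate from the signal handler.
Here's a sketch of a non-blocking reaper; the reap_children() name and
its return value (a list of pid and exit-status pairs) are choices
made for this example only.

```perl
    use strict;
    use warnings;
    use POSIX ":sys_wait_h";

    # Collect every child that has already exited, without blocking.
    # Returns a list of (pid => exit status) pairs.
    sub reap_children {
        my %status;
        while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
            $status{$pid} = $? >> 8;
        }
        return %status;
    }
```

You might call it from a CHLD handler, or at the top of the accept
loop, so that no child stays a zombie for long.
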
 975 We suggest that you use the B<-T> flag to use taint checking (see L<perlsec>)
even if you aren't running setuid or setgid.  This is always a good idea
 977 for servers and other programs run on behalf of someone else (like CGI
 978 scripts), because it lessens the chances that people from the outside will
 979 be able to compromise your system.
 980
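Under B<-T>, command-line arguments are tainted, which is why the
servers above pass the port through a pattern match before using it.
Here's that step as a sketch on its own; the untaint_port() helper and
its range check are additions for illustration.

```perl
    use strict;
    use warnings;

    # Hypothetical helper: accept only an all-digit port in 1..65535.
    # The captured value is what taint mode treats as clean.
    sub untaint_port {
        my ($arg) = @_;
        my ($port) = $arg =~ /^(\d{1,5})$/
            or die "invalid port: $arg\n";
        die "port out of range: $port\n" if $port < 1 || $port > 65535;
        return $port;
    }
```
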
 981 Let's look at another TCP client.  This one connects to the TCP "time"
 982 service on a number of different machines and shows how far their clocks
 983 differ from the system on which it's being run:
 984
 985     #!/usr/bin/perl  -w
 986     use strict;
 987     use Socket;
 988
 989     my $SECS_of_70_YEARS = 2208988800;
 990     sub ctime { scalar localtime(shift) }
 991
 992     my $iaddr = gethostbyname('localhost');
 993     my $proto = getprotobyname('tcp');
 994     my $port = getservbyname('time', 'tcp');
 995     my $paddr = sockaddr_in(0, $iaddr);
 996     my($host);
 997
 998     $| = 1;
 999     printf "%-24s %8s %s\n",  "localhost", 0, ctime(time());
1000
1001     foreach $host (@ARGV) {
1002         printf "%-24s ", $host;
1003         my $hisiaddr = inet_aton($host)     || die "unknown host";
1004         my $hispaddr = sockaddr_in($port, $hisiaddr);
1005         socket(SOCKET, PF_INET, SOCK_STREAM, $proto)   || die "socket: $!";
        connect(SOCKET, $hispaddr)          || die "connect: $!";
1007         my $rtime = '    ';
1008         read(SOCKET, $rtime, 4);
1009         close(SOCKET);
1010         my $histime = unpack("N", $rtime) - $SECS_of_70_YEARS ;
1011         printf "%8d %s\n", $histime - time, ctime($histime);
1012     }
1013
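The arithmetic in that client deserves a word.  The "time" service
sends a 32-bit integer in network byte order counting seconds since
the start of 1900, whereas Perl's time() counts from the start of 1970;
the difference is the 2208988800 seconds stored in $SECS_of_70_YEARS.
Here's the conversion as a sketch; the rfc868_to_epoch() name is made
up for this example.

```perl
    use strict;
    use warnings;

    my $SECS_of_70_YEARS = 2208988800;

    # Hypothetical helper: turn the 4 raw bytes read from the "time"
    # service into seconds since the Unix epoch.
    sub rfc868_to_epoch {
        my ($raw) = @_;
        die "expected 4 bytes, got " . length($raw) . "\n"
            unless length($raw) == 4;
        return unpack("N", $raw) - $SECS_of_70_YEARS;
    }
```
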
1014 =head2 Unix-Domain TCP Clients and Servers
1015
1016 That's fine for Internet-domain clients and servers, but what about local
1017 communications?  While you can use the same setup, sometimes you don't
1018 want to.  Unix-domain sockets are local to the current host, and are often
1019 used internally to implement pipes.  Unlike Internet domain sockets, Unix
1020 domain sockets can show up in the file system with an ls(1) listing.
1021
1022     % ls -l /dev/log
1023     srw-rw-rw-  1 root            0 Oct 31 07:23 /dev/log
1024
1025 You can test for these with Perl's B<-S> file test:
1026
1027     unless ( -S '/dev/log' ) {
1028         die "something's wicked with the log system";
1029     }
1030
1031 Here's a sample Unix-domain client:
1032
1033     #!/usr/bin/perl -w
1034     use Socket;
1035     use strict;
1036     my ($rendezvous, $line);
1037
1038     $rendezvous = shift || 'catsock';
1039     socket(SOCK, PF_UNIX, SOCK_STREAM, 0)       || die "socket: $!";
1040     connect(SOCK, sockaddr_un($rendezvous))     || die "connect: $!";
1041     while (defined($line = <SOCK>)) {
1042         print $line;
1043     }
1044     exit;
1045
1046 And here's a corresponding server.  You don't have to worry about silly
1047 network terminators here because Unix domain sockets are guaranteed
1048 to be on the localhost, and thus everything works right.
1049
1050     #!/usr/bin/perl -Tw
1051     use strict;
1052     use Socket;
1053     use Carp;
1054
1055     BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
1056     sub spawn;  # forward declaration
1057     sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }
1058
1059     my $NAME = 'catsock';
1060     my $uaddr = sockaddr_un($NAME);
1062
1063     socket(Server,PF_UNIX,SOCK_STREAM,0)        || die "socket: $!";
1064     unlink($NAME);
1065     bind  (Server, $uaddr)                      || die "bind: $!";
1066     listen(Server,SOMAXCONN)                    || die "listen: $!";
1067
1068     logmsg "server started on $NAME";
1069
1070     my $waitedpid;
1071
1072     use POSIX ":sys_wait_h";
1073     sub REAPER {
1074         my $child;
1075         while (($waitedpid = waitpid(-1,WNOHANG)) > 0) {
1076             logmsg "reaped $waitedpid" . ($? ? " with exit $?" : '');
1077         }
1078         $SIG{CHLD} = \&REAPER;  # loathe sysV
1079     }
1080
1081     $SIG{CHLD} = \&REAPER;
1082
1083
1084     for ( $waitedpid = 0;
1085           accept(Client,Server) || $waitedpid;
1086           $waitedpid = 0, close Client)
1087     {
1088         next if $waitedpid;
1089         logmsg "connection on $NAME";
1090         spawn sub {
1091             print "Hello there, it's now ", scalar localtime, "\n";
1092             exec '/usr/games/fortune' or die "can't exec fortune: $!";
1093         };
1094     }
1095
1096     sub spawn {
1097         my $coderef = shift;
1098
1099         unless (@_ == 0 && $coderef && ref($coderef) eq 'CODE') {
1100             confess "usage: spawn CODEREF";
1101         }
1102
1103         my $pid;
1104         if (!defined($pid = fork)) {
1105             logmsg "cannot fork: $!";
1106             return;
1107         } elsif ($pid) {
1108             logmsg "begat $pid";
1109             return; # I'm the parent
1110         }
1111         # else I'm the child -- go spawn
1112
1113         open(STDIN,  "<&Client")   || die "can't dup client to stdin";
1114         open(STDOUT, ">&Client")   || die "can't dup client to stdout";
1115         ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
1116         exit &$coderef();
1117     }
1118
As you see, it's remarkably similar to the Internet domain TCP server, so
much so, in fact, that the spawn(), logmsg(), and REAPER() functions are
exactly the same as in the other server.
1123
1124 So why would you ever want to use a Unix domain socket instead of a
1125 simpler named pipe?  Because a named pipe doesn't give you sessions.  You
1126 can't tell one process's data from another's.  With socket programming,
1127 you get a separate session for each client: that's why accept() takes two
1128 arguments.
1129
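To see those separate sessions, here's a single-process sketch that
uses the IO::Socket::UNIX module rather than the low-level calls shown
above.  The demo_sessions() name, the temporary directory, and the
socket filename are all assumptions made for this example.

```perl
    use strict;
    use warnings;
    use IO::Socket::UNIX;
    use File::Temp qw(tempdir);

    # Hypothetical demo: two clients connect to one listening socket,
    # and each accept() hands back a handle private to one client.
    sub demo_sessions {
        my $dir    = tempdir(CLEANUP => 1);
        my $path   = "$dir/demo.sock";
        my $server = IO::Socket::UNIX->new(Local => $path, Listen => 5)
            or die "listen on $path: $!";

        my @clients;
        for my $n (1 .. 2) {
            my $c = IO::Socket::UNIX->new(Peer => $path)
                or die "connect to $path: $!";
            $c->autoflush(1);
            print {$c} "client $n\n";
            push @clients, $c;
        }

        my @seen;
        for (1 .. 2) {
            my $session = $server->accept or die "accept: $!";
            chomp(my $line = <$session>);
            push @seen, $line;
            close $session;
        }
        my @sorted = sort @seen;
        return @sorted;
    }
```

Each line arrives on its own session handle, so the server never has
to guess which client wrote it.
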
1130 For example, let's say that you have a long running database server daemon
1131 that you want folks from the World Wide Web to be able to access, but only
1132 if they go through a CGI interface.  You'd have a small, simple CGI
1133 program that does whatever checks and logging you feel like, and then acts
1134 as a Unix-domain client and connects to your private server.
1135
1136 =head1 TCP Clients with IO::Socket
1137
1138 For those preferring a higher-level interface to socket programming, the
1139 IO::Socket module provides an object-oriented approach.  IO::Socket is
1140 included as part of the standard Perl distribution as of the 5.004
1141 release.  If you're running an earlier version of Perl, just fetch
1142 IO::Socket from CPAN, where you'll also find modules providing easy
1143 interfaces to the following systems: DNS, FTP, Ident (RFC 931), NIS and
1144 NISPlus, NNTP, Ping, POP3, SMTP, SNMP, SSLeay, Telnet, and Time--just
1145 to name a few.
1146
1147 =head2 A Simple Client
1148
1149 Here's a client that creates a TCP connection to the "daytime"
1150 service at port 13 of the host name "localhost" and prints out everything
1151 that the server there cares to provide.
1152
1153     #!/usr/bin/perl -w
1154     use IO::Socket;
1155     $remote = IO::Socket::INET->new(
1156                         Proto    => "tcp",
1157                         PeerAddr => "localhost",
1158                         PeerPort => "daytime(13)",
1159                     )
1160                   or die "cannot connect to daytime port at localhost";
1161     while ( <$remote> ) { print }
1162
1163 When you run this program, you should get something back that
1164 looks like this:
1165
1166     Wed May 14 08:40:46 MDT 1997
1167
1168 Here are what those parameters to the C<new> constructor mean:
1169
1170 =over 4
1171
1172 =item C<Proto>
1173
1174 This is which protocol to use.  In this case, the socket handle returned
1175 will be connected to a TCP socket, because we want a stream-oriented
1176 connection, that is, one that acts pretty much like a plain old file.
Not all sockets are of this type.  For example, the UDP protocol
1178 can be used to make a datagram socket, used for message-passing.
1179
1180 =item C<PeerAddr>
1181
1182 This is the name or Internet address of the remote host the server is
1183 running on.  We could have specified a longer name like C<"www.perl.com">,
1184 or an address like C<"204.148.40.9">.  For demonstration purposes, we've
1185 used the special hostname C<"localhost">, which should always mean the
1186 current machine you're running on.  The corresponding Internet address
for localhost is C<"127.0.0.1">, if you'd rather use that.
1188
1189 =item C<PeerPort>
1190
1191 This is the service name or port number we'd like to connect to.
1192 We could have gotten away with using just C<"daytime"> on systems with a
well-configured system services file (F</etc/services> under Unix),
but just in case, we've specified the
1195 port number (13) in parentheses.  Using just the number would also have
1196 worked, but constant numbers make careful programmers nervous.
1197
1198 =back
1199
1200 Notice how the return value from the C<new> constructor is used as
1201 a filehandle in the C<while> loop?  That's what's called an indirect
1202 filehandle, a scalar variable containing a filehandle.  You can use
1203 it the same way you would a normal filehandle.  For example, you
1204 can read one line from it this way:
1205
1206     $line = <$handle>;
1207
all remaining lines from it this way:
1209
1210     @lines = <$handle>;
1211
1212 and send a line of data to it this way:
1213
1214     print $handle "some data\n";
1215
1216 =head2 A Webget Client
1217
1218 Here's a simple client that takes a remote host to fetch a document
1219 from, and then a list of documents to get from that host.  This is a
1220 more interesting client than the previous one because it first sends
1221 something to the server before fetching the server's response.
1222
1223     #!/usr/bin/perl -w
1224     use IO::Socket;
1225     unless (@ARGV > 1) { die "usage: $0 host document ..." }
1226     $host = shift(@ARGV);
1227     $EOL = "\015\012";
1228     $BLANK = $EOL x 2;
1229     foreach $document ( @ARGV ) {
1230         $remote = IO::Socket::INET->new( Proto     => "tcp",
1231                                          PeerAddr  => $host,
1232                                          PeerPort  => "http(80)",
1233                                         );
1234         unless ($remote) { die "cannot connect to http daemon on $host" }
1235         $remote->autoflush(1);
1236         print $remote "GET $document HTTP/1.0" . $BLANK;
1237         while ( <$remote> ) { print }
1238         close $remote;
1239     }
1240
The web server handling the "http" service is assumed to be at
its standard port, number 80.  If the web server you're trying to
connect to is at a different port (like 1080 or 8080), you should specify it
as the named-parameter pair, C<< PeerPort => 8080 >>.  The C<autoflush>
1245 method is used on the socket because otherwise the system would buffer
1246 up the output we sent it.  (If you're on a Mac, you'll also need to
1247 change every C<"\n"> in your code that sends data over the network to
1248 be a C<"\015\012"> instead.)
1249
1250 Connecting to the server is only the first part of the process: once you
1251 have the connection, you have to use the server's language.  Each server
1252 on the network has its own little command language that it expects as
1253 input.  The string that we send to the server starting with "GET" is in
1254 HTTP syntax.  In this case, we simply request each specified document.
1255 Yes, we really are making a new connection for each document, even though
1256 it's the same host.  That's the way you always used to have to speak HTTP.
1257 Recent versions of web browsers may request that the remote server leave
1258 the connection open a little while, but the server doesn't have to honor
1259 such a request.
1260
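The two ends of that conversation are easy to show in isolation.
Here's a sketch that builds the request the webget program sends and
picks apart the first line of a reply; both helper names are made up
for this example, and the parsing is deliberately minimal.

```perl
    use strict;
    use warnings;

    my $EOL = "\015\012";

    # Build the same request line webget sends, followed by the
    # blank line that ends the request.
    sub http_get_request {
        my ($document) = @_;
        return "GET $document HTTP/1.0" . ($EOL x 2);
    }

    # Split a status line such as "HTTP/1.1 404 File Not Found" into
    # (version, code, reason).  Returns an empty list if it doesn't match.
    sub parse_status_line {
        my ($line) = @_;
        return $line =~ m{^HTTP/(\d+\.\d+)\s+(\d{3})\s*(.*?)\015?\012?\z};
    }
```
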
1261 Here's an example of running that program, which we'll call I<webget>:
1262
1263     % webget www.perl.com /guanaco.html
1264     HTTP/1.1 404 File Not Found
1265     Date: Thu, 08 May 1997 18:02:32 GMT
1266     Server: Apache/1.2b6
1267     Connection: close
1268     Content-type: text/html
1269
1270     <HEAD><TITLE>404 File Not Found</TITLE></HEAD>
1271     <BODY><H1>File Not Found</H1>
1272     The requested URL /guanaco.html was not found on this server.<P>
1273     </BODY>
1274
1275 Ok, so that's not very interesting, because it didn't find that
1276 particular document.  But a long response wouldn't have fit on this page.
1277
1278 For a more fully-featured version of this program, you should look to
1279 the I<lwp-request> program included with the LWP modules from CPAN.
1280
1281 =head2 Interactive Client with IO::Socket
1282
1283 Well, that's all fine if you want to send one command and get one answer,
1284 but what about setting up something fully interactive, somewhat like
1285 the way I<telnet> works?  That way you can type a line, get the answer,
1286 type a line, get the answer, etc.
1287
1288 This client is more complicated than the two we've done so far, but if
1289 you're on a system that supports the powerful C<fork> call, the solution
1290 isn't that rough.  Once you've made the connection to whatever service
1291 you'd like to chat with, call C<fork> to clone your process.  Each of
these two identical processes has a very simple job to do: the parent
1293 copies everything from the socket to standard output, while the child
1294 simultaneously copies everything from standard input to the socket.
1295 To accomplish the same thing using just one process would be I<much>
1296 harder, because it's easier to code two processes to do one thing than it
1297 is to code one process to do two things.  (This keep-it-simple principle
is one of the cornerstones of the Unix philosophy, and good software engineering as
1299 well, which is probably why it's spread to other systems.)
1300
1301 Here's the code:
1302
1303     #!/usr/bin/perl -w
1304     use strict;
1305     use IO::Socket;
1306     my ($host, $port, $kidpid, $handle, $line);
1307
1308     unless (@ARGV == 2) { die "usage: $0 host port" }
1309     ($host, $port) = @ARGV;
1310
1311     # create a tcp connection to the specified host and port
1312     $handle = IO::Socket::INET->new(Proto     => "tcp",
1313                                     PeerAddr  => $host,
1314                                     PeerPort  => $port)
1315            or die "can't connect to port $port on $host: $!";
1316
1317     $handle->autoflush(1);              # so output gets there right away
1318     print STDERR "[Connected to $host:$port]\n";
1319
1320     # split the program into two processes, identical twins
1321     die "can't fork: $!" unless defined($kidpid = fork());
1322
1323     # the if{} block runs only in the parent process
1324     if ($kidpid) {
1325         # copy the socket to standard output
1326         while (defined ($line = <$handle>)) {
1327             print STDOUT $line;
1328         }
1329         kill("TERM", $kidpid);                  # send SIGTERM to child
1330     }
1331     # the else{} block runs only in the child process
1332     else {
1333         # copy standard input to the socket
1334         while (defined ($line = <STDIN>)) {
1335             print $handle $line;
1336         }
1337     }
1338
1339 The C<kill> function in the parent's C<if> block is there to send a
signal to our child process (currently running in the C<else> block)
1341 as soon as the remote server has closed its end of the connection.
1342
If the remote server sends data a byte at a time, and you need that
1344 data immediately without waiting for a newline (which might not happen),
1345 you may wish to replace the C<while> loop in the parent with the
1346 following:
1347
1348     my $byte;
1349     while (sysread($handle, $byte, 1) == 1) {
1350         print STDOUT $byte;
1351     }
1352
1353 Making a system call for each byte you want to read is not very efficient
1354 (to put it mildly) but is the simplest to explain and works reasonably
1355 well.
1356
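A middle road is to ask sysread() for a larger chunk: it returns as
soon as some data is available rather than waiting for the full
amount, so you still see partial lines promptly.  Here's a sketch; the
pump_chunks() name, the 4096-byte size, and the callback interface are
all choices made for this example.

```perl
    use strict;
    use warnings;

    # Hypothetical helper: read whatever is available, up to 4096
    # bytes at a time, and hand each chunk to a callback until
    # end-of-file.  Returns the number of bytes copied.
    sub pump_chunks {
        my ($in, $sink) = @_;
        my $total = 0;
        while (1) {
            my $got = sysread($in, my $buf, 4096);
            die "sysread: $!" unless defined $got;
            last if $got == 0;          # end of file
            $sink->($buf);
            $total += $got;
        }
        return $total;
    }
```

In the interactive client, the callback would simply print each chunk
to STDOUT with C<$|> set so that the output isn't held back.
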
1357 =head1 TCP Servers with IO::Socket
1358
As always, setting up a server is a little bit more involved than running a client.
1360 The model is that the server creates a special kind of socket that
1361 does nothing but listen on a particular port for incoming connections.
1362 It does this by calling the C<< IO::Socket::INET->new() >> method with
1363 slightly different arguments than the client did.
1364
1365 =over 4
1366
1367 =item Proto
1368
1369 This is which protocol to use.  Like our clients, we'll
1370 still specify C<"tcp"> here.
1371
1372 =item LocalPort
1373
1374 We specify a local
1375 port in the C<LocalPort> argument, which we didn't do for the client.
This is the service name or port number for which you want to be the
server. (Under Unix, ports under 1024 are restricted to the
superuser.)  In our sample, we'll use port 9000, but you can use
any port that's not currently in use on your system.  If you try
to use one already in use, you'll get an "Address already in use"
message.  Under Unix, the C<netstat -a> command will show
which services currently have servers.
1383
1384 =item Listen
1385
1386 The C<Listen> parameter is set to the maximum number of
1387 pending connections we can accept until we turn away incoming clients.
1388 Think of it as a call-waiting queue for your telephone.
1389 The low-level Socket module has a special symbol for the system maximum, which
1390 is SOMAXCONN.
1391
1392 =item Reuse
1393
The C<Reuse> parameter is needed so that we can restart our server
1395 manually without waiting a few minutes to allow system buffers to
1396 clear out.
1397
1398 =back
1399
1400 Once the generic server socket has been created using the parameters
1401 listed above, the server then waits for a new client to connect
1402 to it.  The server blocks in the C<accept> method, which eventually accepts a
1403 bidirectional connection from the remote client.  (Make sure to autoflush
1404 this handle to circumvent buffering.)
1405
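Here's a sketch of that listen-and-accept sequence in a single process
over the loopback interface.  It assumes that a LocalPort of 0 lets
the kernel choose a free port, which is then read back with sockport();
the loopback_greeting() name is made up for this example.

```perl
    use strict;
    use warnings;
    use IO::Socket::INET;

    # Hypothetical demo: listen on a kernel-chosen loopback port,
    # connect to it, accept the connection, and send one greeting.
    sub loopback_greeting {
        my $server = IO::Socket::INET->new(
            Proto     => 'tcp',
            LocalAddr => '127.0.0.1',
            LocalPort => 0,             # assumption: kernel picks a port
            Listen    => 5,
            ReuseAddr => 1,
        ) or die "can't set up server: $@";

        my $client = IO::Socket::INET->new(
            Proto    => 'tcp',
            PeerAddr => '127.0.0.1',
            PeerPort => $server->sockport,
        ) or die "can't connect: $@";

        my $session = $server->accept or die "accept: $!";
        $session->autoflush(1);
        print {$session} "Hello from port ", $server->sockport, "\015\012";
        close $session;

        my $line = <$client>;
        $line =~ s/\015?\012\z//;
        return $line;
    }
```

The connection waits in the listen queue until accept() picks it up,
which is why one process can play both roles here.
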
1406 To add to user-friendliness, our server prompts the user for commands.
1407 Most servers don't do this.  Because of the prompt without a newline,
1408 you'll have to use the C<sysread> variant of the interactive client above.
1409
1410 This server accepts one of five different commands, sending output
1411 back to the client.  Note that unlike most network servers, this one
1412 only handles one incoming client at a time.  Multithreaded servers are
1413 covered in Chapter 6 of the Camel.
1414
Here's the code:
1416
1417  #!/usr/bin/perl -w
1418  use IO::Socket;
1419  use Net::hostent;              # for OO version of gethostbyaddr
1420
1421  $PORT = 9000;                  # pick something not in use
1422
1423  $server = IO::Socket::INET->new( Proto     => 'tcp',
1424                                   LocalPort => $PORT,
1425                                   Listen    => SOMAXCONN,
1426                                   Reuse     => 1);
1427
1428  die "can't setup server" unless $server;
1429  print "[Server $0 accepting clients]\n";
1430
1431  while ($client = $server->accept()) {
1432    $client->autoflush(1);
1433    print $client "Welcome to $0; type help for command list.\n";
1434    $hostinfo = gethostbyaddr($client->peeraddr);
1435    printf "[Connect from %s]\n", $hostinfo ? $hostinfo->name : $client->peerhost;
1436    print $client "Command? ";
1437    while ( <$client>) {
1438      next unless /\S/;       # blank line
1439      if    (/quit|exit/i)    { last;                                     }
1440      elsif (/date|time/i)    { printf $client "%s\n", scalar localtime;  }
1441      elsif (/who/i )         { print  $client `who 2>&1`;                }
1442      elsif (/cookie/i )      { print  $client `/usr/games/fortune 2>&1`; }
1443      elsif (/motd/i )        { print  $client `cat /etc/motd 2>&1`;      }
1444      else {
1445        print $client "Commands: quit date who cookie motd\n";
1446      }
1447    } continue {
1448       print $client "Command? ";
1449    }
1450    close $client;
1451  }
1452
1453 =head1 UDP: Message Passing
1454
1455 Another kind of client-server setup is one that uses not connections, but
1456 messages.  UDP communications involve much lower overhead but also provide
1457 less reliability, as there are no promises that messages will arrive at
1458 all, let alone in order and unmangled.  Still, UDP offers some advantages
1459 over TCP, including being able to "broadcast" or "multicast" to a whole
1460 bunch of destination hosts at once (usually on your local subnet).  If you
1461 find yourself overly concerned about reliability and start building checks
1462 into your message system, then you probably should use just TCP to start
1463 with.
1464
Note that UDP datagrams are I<not> a bytestream and should not be treated
as such.  This makes I/O mechanisms with internal buffering, such as stdio
(i.e. print() and friends), especially cumbersome to use.  Use syswrite(),
or better send(), as in the example below.
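
As a minimal sketch of what "not a bytestream" means in practice, the
following sends two datagrams over the loopback interface and reads them
back one recv() at a time.  Each recv() returns exactly one message, never
a concatenation of several.  The use of IO::Socket::INET, the 127.0.0.1
address, and the two-second timeout are assumptions made for illustration,
and even on loopback delivery is not formally guaranteed.

    use strict;
    use warnings;
    use IO::Socket::INET;
    use IO::Select;

    # Receiver on an ephemeral loopback port (0 lets the kernel pick).
    my $rx = IO::Socket::INET->new(
        Proto     => 'udp',
        LocalAddr => '127.0.0.1',
        LocalPort => 0,
    ) or die "receiver: $@";

    # Sender "connected" to the receiver, so send() needs no address.
    my $tx = IO::Socket::INET->new(
        Proto    => 'udp',
        PeerAddr => '127.0.0.1',
        PeerPort => $rx->sockport,
    ) or die "sender: $@";

    for my $msg ("first", "second message") {
        defined($tx->send($msg)) or die "send: $!";
    }

    # Wait at most two seconds for each datagram instead of blocking.
    my $sel = IO::Select->new($rx);
    my @got;
    while (@got < 2 && $sel->can_read(2)) {
        my $buf;
        defined($rx->recv($buf, 1024)) or die "recv: $!";
        push @got, $buf;        # one datagram per recv()
    }
    print "datagram: '$_'\n" for @got;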
1469
1470 Here's a UDP program similar to the sample Internet TCP client given
1471 earlier.  However, instead of checking one host at a time, the UDP version
1472 will check many of them asynchronously by simulating a multicast and then
1473 using select() to do a timed-out wait for I/O.  To do something similar
1474 with TCP, you'd have to use a different socket handle for each host.
1475
1476     #!/usr/bin/perl -w
1477     use strict;
1478     use Socket;
1479     use Sys::Hostname;
1480
1481     my ( $count, $hisiaddr, $hispaddr, $histime,
1482          $host, $iaddr, $paddr, $port, $proto,
1483          $rin, $rout, $rtime, $SECS_of_70_YEARS);
1484
1485     $SECS_of_70_YEARS      = 2208988800;
1486
1487     $iaddr = gethostbyname(hostname());
1488     $proto = getprotobyname('udp');
1489     $port = getservbyname('time', 'udp');
1490     $paddr = sockaddr_in(0, $iaddr); # 0 means let kernel pick
1491
1492     socket(SOCKET, PF_INET, SOCK_DGRAM, $proto)   || die "socket: $!";
1493     bind(SOCKET, $paddr)                          || die "bind: $!";
1494
1495     $| = 1;
1496     printf "%-12s %8s %s\n",  "localhost", 0, scalar localtime time;
1497     $count = 0;
1498     for $host (@ARGV) {
1499         $count++;
1500         $hisiaddr = inet_aton($host)    || die "unknown host";
1501         $hispaddr = sockaddr_in($port, $hisiaddr);
1502         defined(send(SOCKET, 0, 0, $hispaddr))    || die "send $host: $!";
1503     }
1504
1505     $rin = '';
1506     vec($rin, fileno(SOCKET), 1) = 1;
1507
1508     # timeout after 10.0 seconds
1509     while ($count && select($rout = $rin, undef, undef, 10.0)) {
1510         $rtime = '';
1511         ($hispaddr = recv(SOCKET, $rtime, 4, 0))        || die "recv: $!";
1512         ($port, $hisiaddr) = sockaddr_in($hispaddr);
1513         $host = gethostbyaddr($hisiaddr, AF_INET);
1514         $histime = unpack("N", $rtime) - $SECS_of_70_YEARS ;
1515         printf "%-12s ", $host;
1516         printf "%8d %s\n", $histime - time, scalar localtime($histime);
1517         $count--;
1518     }
1519
Note that this example does not include any retries and may consequently
fail to contact a reachable host.  The most prominent reason for this
is congestion of the queues on the sending host if the list of hosts
to contact is sufficiently large.
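
The time arithmetic in the program above can be checked in isolation.
The "time" service (RFC 868) answers with a 32-bit big-endian count of
seconds since 1900, which is why the program unpacks the reply with "N"
and subtracts $SECS_of_70_YEARS.  The sketch below builds such a reply by
hand instead of reading it from the network; the sample value is made up
for illustration.

    use strict;
    use warnings;

    my $SECS_of_70_YEARS = 2208988800;   # 1900-01-01 to 1970-01-01

    # A hand-built reply: one day after the Unix epoch, in network order.
    my $rtime = pack("N", $SECS_of_70_YEARS + 86_400);

    # Same conversion as in the UDP client above.
    my $histime = unpack("N", $rtime) - $SECS_of_70_YEARS;
    print "$histime\n";                      # prints 86400
    print scalar(gmtime $histime), "\n";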
1524
1525 =head1 SysV IPC
1526
While System V IPC isn't as widely used as sockets, it still has some
interesting uses.  However, you cannot effectively use SysV IPC or
Berkeley mmap() to share a Perl variable among several processes,
because Perl would reallocate your string when you did not want it to.
1532
1533 Here's a small example showing shared memory usage.
1534
    use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRWXU);

    $size = 2000;
    $id = shmget(IPC_PRIVATE, $size, S_IRWXU);
    defined($id)                    || die "shmget: $!";  # 0 is a valid id
    print "shm id $id\n";

    $message = "Message #1";
    shmwrite($id, $message, 0, 60)  || die "shmwrite: $!";
    print "wrote: '$message'\n";
    shmread($id, $buff, 0, 60)      || die "shmread: $!";
    print "read : '$buff'\n";

    # the buffer of shmread is zero-character end-padded.
    substr($buff, index($buff, "\0")) = '';
    print "un" unless $buff eq $message;
    print "swell\n";

    print "deleting shm $id\n";
    shmctl($id, IPC_RMID, 0)        || die "shmctl: $!";
1554
1555 Here's an example of a semaphore:
1556
    use IPC::SysV qw(IPC_CREAT);

    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 10, 0666 | IPC_CREAT);
    defined($id) || die "semget: $!";
    print "sem id $id\n";
1562
1563 Put this code in a separate file to be run in more than one process.
1564 Call the file F<take>:
1565
    # look up the semaphore set created above

    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 0, 0);
    defined($id) || die "semget: $!";

    $semnum  = 0;
    $semflag = 0;

    # 'take' semaphore
    # wait for semaphore to be zero
    $semop = 0;
    $opstring1 = pack("s!s!s!", $semnum, $semop, $semflag);

    # Increment the semaphore count
    $semop = 1;
    $opstring2 = pack("s!s!s!", $semnum, $semop, $semflag);
    $opstring  = $opstring1 . $opstring2;

    semop($id, $opstring) || die "semop: $!";
1586
1587 Put this code in a separate file to be run in more than one process.
1588 Call this file F<give>:
1589
1590     # 'give' the semaphore
1591     # run this in the original process and you will see
1592     # that the second process continues
1593
1594     $IPC_KEY = 1234;
1595     $id = semget($IPC_KEY, 0, 0);
1596     die if !defined($id);
1597
1598     $semnum = 0;
1599     $semflag = 0;
1600
1601     # Decrement the semaphore count
1602     $semop = -1;
1603     $opstring = pack("s!s!s!", $semnum, $semop, $semflag);
1604
1605     semop($id,$opstring) || die "$!";
1606
1607 The SysV IPC code above was written long ago, and it's definitely
1608 clunky looking.  For a more modern look, see the IPC::SysV module
1609 which is included with Perl starting from Perl 5.005.
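
As a rough sketch of that more modern look, the take and give operations
above might be written with the IPC::Semaphore class that ships alongside
IPC::SysV.  Here both steps run in one process against a private
semaphore set, purely for illustration; the variable names are made up,
and a real program would share a key between processes as the F<take>
and F<give> files do.

    use strict;
    use warnings;
    use IPC::SysV qw(IPC_PRIVATE IPC_CREAT S_IRUSR S_IWUSR);
    use IPC::Semaphore;

    # A private set holding a single semaphore.
    my $sem = IPC::Semaphore->new(IPC_PRIVATE, 1,
                                  S_IRUSR | S_IWUSR | IPC_CREAT)
        or die "semget: $!";
    $sem->setval(0, 0) or die "setval: $!";   # start from a known value

    # 'take': wait for zero, then increment (the same two operations
    # that the take file packs by hand)
    $sem->op(0, 0, 0,
             0, 1, 0) or die "take: $!";
    my $after_take = $sem->getval(0);

    # 'give': decrement the semaphore count
    $sem->op(0, -1, 0) or die "give: $!";
    my $after_give = $sem->getval(0);

    printf "after take: %d, after give: %d\n", $after_take, $after_give;
    $sem->remove or die "remove: $!";

Each triple passed to op() is a semaphore number, an operation, and a
flag, so the hand-written pack("s!s!s!", ...) calls are no longer needed.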
1610
1611 A small example demonstrating SysV message queues:
1612
1613     use IPC::SysV qw(IPC_PRIVATE IPC_RMID IPC_CREAT S_IRWXU);
1614
1615     my $id = msgget(IPC_PRIVATE, IPC_CREAT | S_IRWXU);
1616
1617     my $sent = "message";
1618     my $type_sent = 1234;
1619     my $rcvd;
1620     my $type_rcvd;
1621
1622     if (defined $id) {
1623         if (msgsnd($id, pack("l! a*", $type_sent, $sent), 0)) {
1624             if (msgrcv($id, $rcvd, 60, 0, 0)) {
1625                 ($type_rcvd, $rcvd) = unpack("l! a*", $rcvd);
1626                 if ($rcvd eq $sent) {
1627                     print "okay\n";
1628                 } else {
1629                     print "not okay\n";
1630                 }
1631             } else {
1632                 die "# msgrcv failed\n";
1633             }
1634         } else {
1635             die "# msgsnd failed\n";
1636         }
1637         msgctl($id, IPC_RMID, 0) || die "# msgctl failed: $!\n";
1638     } else {
1639         die "# msgget failed\n";
1640     }
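
The same round trip can be sketched with the object interface in
IPC::Msg, which hides the pack() and unpack() of the message type.  This
is an illustrative variant under the assumption that IPC::Msg is
installed alongside IPC::SysV; the variable names are invented here.

    use strict;
    use warnings;
    use IPC::SysV qw(IPC_PRIVATE S_IRUSR S_IWUSR);
    use IPC::Msg;

    my $queue = IPC::Msg->new(IPC_PRIVATE, S_IRUSR | S_IWUSR)
        or die "msgget: $!";

    my $sent      = "message";
    my $type_sent = 1234;

    $queue->snd($type_sent, $sent) or die "msgsnd: $!";

    # rcv() fills the buffer and returns the type of the message read.
    my $rcvd;
    my $type_rcvd = $queue->rcv($rcvd, 60);
    defined($type_rcvd) or die "msgrcv: $!";

    my $verdict = ($rcvd eq $sent && $type_rcvd == $type_sent)
                ? "okay" : "not okay";
    print "$verdict\n";

    $queue->remove or die "msgctl: $!";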
1641
1642 =head1 NOTES
1643
Most of these routines quietly but politely return C<undef> when they
fail instead of causing your program to die right then and there due to
an uncaught exception.  (Actually, some of the new I<Socket> conversion
functions croak() on bad arguments.)  It is therefore essential to
check return values from these functions.  Always begin your socket
programs this way for optimal success, and don't forget to add the B<-T>
taint-checking flag to the #! line for servers:
1651
1652     #!/usr/bin/perl -Tw
1653     use strict;
1654     use sigtrap;
1655     use Socket;
1656
1657 =head1 BUGS
1658
1659 All these routines create system-specific portability problems.  As noted
1660 elsewhere, Perl is at the mercy of your C libraries for much of its system
1661 behaviour.  It's probably safest to assume broken SysV semantics for
1662 signals and to stick with simple TCP and UDP socket operations; e.g., don't
1663 try to pass open file descriptors over a local UDP datagram socket if you
1664 want your code to stand a chance of being portable.
1665
1666 =head1 AUTHOR
1667
1668 Tom Christiansen, with occasional vestiges of Larry Wall's original
1669 version and suggestions from the Perl Porters.
1670
1671 =head1 SEE ALSO
1672
1673 There's a lot more to networking than this, but this should get you
1674 started.
1675
1676 For intrepid programmers, the indispensable textbook is I<Unix
1677 Network Programming, 2nd Edition, Volume 1> by W. Richard Stevens
1678 (published by Prentice-Hall).  Note that most books on networking
1679 address the subject from the perspective of a C programmer; translation
1680 to Perl is left as an exercise for the reader.
1681
The IO::Socket(3) manpage describes the object library, and the Socket(3)
manpage describes the low-level interface to sockets.  Besides the obvious
functions in L<perlfunc>, you should also check out the F<modules> file
at your nearest CPAN site.  (See L<perlmodlib> or, better yet, the F<Perl
FAQ> for a description of what CPAN is and where to get it.)
1687
Section 5 of the F<modules> file is devoted to "Networking, Device Control
(modems), and Interprocess Communication", and contains numerous unbundled
modules for networking, Chat and Expect operations, CGI
programming, DCE, FTP, IPC, NNTP, Proxy, Ptty, RPC, SNMP, SMTP, Telnet,
Threads, and ToolTalk--just to name a few.