pod/perlipc.pod

   1 =head1 NAME
   2
   3 perlipc - Perl interprocess communication (signals, fifos, pipes, safe subprocesses, sockets, and semaphores)
   4
   5 =head1 DESCRIPTION
   6
   7 The basic IPC facilities of Perl are built out of the good old Unix
   8 signals, named pipes, pipe opens, the Berkeley socket routines, and SysV
   9 IPC calls.  Each is used in slightly different situations.
  10
  11 =head1 Signals
  12
  13 Perl uses a simple signal handling model: the %SIG hash contains names
  14 or references of user-installed signal handlers.  Each handler is
  15 called with one argument: the name of the signal that triggered
  16 it.  A signal may be generated intentionally from a
  17 particular keyboard sequence like control-C or control-Z, sent to you
  18 from another process, or triggered automatically by the kernel when
  19 special events transpire, like a child process exiting, your process
  20 running out of stack space, or hitting a file size limit.
  21
  22 For example, to trap an interrupt signal, set up a handler like this:
  23
  24     sub catch_zap {
  25         my $signame = shift;
  26         $shucks++;
  27         die "Somebody sent me a SIG$signame";
  28     }
  29     $SIG{INT} = 'catch_zap';  # could fail in modules
  30     $SIG{INT} = \&catch_zap;  # best strategy
  31
  32 Prior to Perl 5.7.3 it was necessary to do as little as you possibly
  33 could in your handler; notice how all we do is set a global variable
  34 and then raise an exception.  That's because on most systems,
  35 libraries are not re-entrant; particularly, memory allocation and I/O
  36 routines are not.  That meant that doing nearly I<anything> in your
  37 handler could in theory trigger a memory fault and subsequent core
  38 dump - see L</Deferred Signals (Safe Signals)> below.
  39
  40 The names of the signals are the ones listed out by C<kill -l> on your
  41 system, or you can retrieve them from the Config module.  Set up an
  42 @signame list indexed by number to get the name and a %signo table
  43 indexed by name to get the number:
  44
  45     use Config;
  46     defined $Config{sig_name} || die "No sigs?";
  47     my $i = 0;    # start at 0 so the first entry is not undef
  48     foreach my $name (split(' ', $Config{sig_name})) {
  49         $signo{$name} = $i;
  50         $signame[$i++] = $name;
  51     }
  52
  53 So to check whether signal 17 and SIGALRM were the same, do just this:
  54
  55     print "signal #17 = $signame[17]\n";
  56     if ($signo{ALRM}) {
  57         print "SIGALRM is $signo{ALRM}\n";
  58     }
  59
  60 You may also choose to assign the strings C<'IGNORE'> or C<'DEFAULT'> as
  61 the handler, in which case Perl will try to discard the signal or do the
  62 default thing.
  63
  64 On most Unix platforms, the C<CHLD> (sometimes also known as C<CLD>) signal
  65 has special behavior with respect to a value of C<'IGNORE'>.
  66 Setting C<$SIG{CHLD}> to C<'IGNORE'> on such a platform has the effect of
  67 not creating zombie processes when the parent process fails to C<wait()>
  68 on its child processes (i.e. child processes are automatically reaped).
  69 Calling C<wait()> with C<$SIG{CHLD}> set to C<'IGNORE'> usually returns
  70 C<-1> on such platforms.
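
As a rough illustration of the automatic reaping described above, the following sketch forks one child with C<CHLD> ignored and then calls C<wait()>.  It assumes a platform (such as Linux) that implements this behaviour; on others the result may differ.

```perl
use strict;
use warnings;

# Sketch, assuming a platform where ignoring CHLD makes the kernel
# reap exited children automatically.
$SIG{CHLD} = 'IGNORE';

defined(my $pid = fork) or die "cannot fork: $!";
if ($pid == 0) {
    exit 0;                 # child: exit straight away
}

# parent: wait() blocks until no children remain, then reports failure
my $reaped = wait;
print "wait returned $reaped\n";
```

On such platforms the value printed is usually C<-1>, because there is no zombie left to collect.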
  71
  72 Some signals can be neither trapped nor ignored, such as
  73 the KILL and STOP (but not the TSTP) signals.  One strategy for
  74 temporarily ignoring signals is to assign the handler with local(); the
  75 old value is automatically restored once your block is exited.  (Remember
  76 that local() values are "inherited" by functions called from within that block.)
  77
  78     sub precious {
  79         local $SIG{INT} = 'IGNORE';
  80         &more_functions;
  81     }
  82     sub more_functions {
  83         # interrupts still ignored, for now...
  84     }
  85
  86 Sending a signal to a negative process ID means that you send the signal
  87 to the entire Unix process-group.  This code sends a hang-up signal to all
  88 processes in the current process group (and sets $SIG{HUP} to IGNORE so
  89 it doesn't kill itself):
  90
  91     {
  92         local $SIG{HUP} = 'IGNORE';
  93         kill HUP => -$$;
  94         # snazzy writing of: kill('HUP', -$$)
  95     }
  96
  97 Another interesting signal to send is signal number zero.  This doesn't
  98 actually affect a child process, but instead checks whether it's alive
  99 or has changed its UID.
 100
 101     unless (kill 0 => $kid_pid) {
 102         warn "something wicked happened to $kid_pid";
 103     }
 104
 105 When directed at a process whose UID is not identical to that
 106 of the sending process, signal number zero may fail because
 107 you lack permission to send the signal, even though the process is alive.
 108 You may be able to determine the cause of failure using C<%!>.
 109
 110     unless (kill 0 => $pid or $!{EPERM}) {
 111         warn "$pid looks dead";
 112     }
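
The two checks above can be folded into one small helper.  This is only a sketch; the name C<process_exists> is made up for illustration and is not part of Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: true if a process with this PID exists,
# even when we are not allowed to signal it.
sub process_exists {
    my ($pid) = @_;
    return 1 if kill 0 => $pid;    # signal 0 accepted: process is there
    return $!{EPERM} ? 1 : 0;      # EPERM: alive but not ours; else gone
}

print "this process exists: ", process_exists($$), "\n";
```

A PID may be reused by the system after a process exits, so a true result only says that I<some> process currently has that ID.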
 113
 114 You might also want to employ anonymous functions for simple signal
 115 handlers:
 116
 117     $SIG{INT} = sub { die "\nOutta here!\n" };
 118
 119 But that will be problematic for the more complicated handlers that need
 120 to reinstall themselves.  Because Perl's signal mechanism is currently
 121 based on the signal(3) function from the C library, you may sometimes be so
 122 misfortunate as to run on systems where that function is "broken", that
 123 is, it behaves in the old unreliable SysV way rather than the newer, more
 124 reasonable BSD and POSIX fashion.  So you'll see defensive people writing
 125 signal handlers like this:
 126
 127     sub REAPER {
 128         $waitedpid = wait;
 129         # loathe sysV: it makes us not only reinstate
 130         # the handler, but place it after the wait
 131         $SIG{CHLD} = \&REAPER;
 132     }
 133     $SIG{CHLD} = \&REAPER;
 134     # now do something that forks...
 135
 136 or better still:
 137
 138     use POSIX ":sys_wait_h";
 139     sub REAPER {
 140         my $child;
 141         # If a second child dies while in the signal handler caused by the
 142         # first death, we won't get another signal. So must loop here else
 143         # we will leave the unreaped child as a zombie. And the next time
 144         # two children die we get another zombie. And so on.
 145         while (($child = waitpid(-1,WNOHANG)) > 0) {
 146             $Kid_Status{$child} = $?;
 147         }
 148         $SIG{CHLD} = \&REAPER;  # still loathe sysV
 149     }
 150     $SIG{CHLD} = \&REAPER;
 151     # do something that forks...
 152
 153 Signal handling is also used for timeouts in Unix.  While safely
 154 protected within an C<eval{}> block, you set a signal handler to trap
 155 alarm signals and then schedule to have one delivered to you in some
 156 number of seconds.  Then try your blocking operation, clearing the alarm
 157 when it's done, while you are still inside your C<eval{}> block.  If it
 158 goes off, you'll use die() to jump out of the block, much as you might
 159 using longjmp() or throw() in other languages.
 160
 161 Here's an example:
 162
 163     eval {
 164         local $SIG{ALRM} = sub { die "alarm clock restart" };
 165         alarm 10;
 166         flock(FH, 2);   # blocking write lock
 167         alarm 0;
 168     };
 169     if ($@ and $@ !~ /alarm clock restart/) { die }
 170
 171 If the operation being timed out is system() or qx(), this technique
 172 is liable to generate zombies.  If this matters to you, you'll
 173 need to do your own fork() and exec(), and kill the errant child process.
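
The C<eval>/C<alarm> idiom can be wrapped in a reusable routine.  The sketch below is one possible shape; C<with_timeout> is a hypothetical name, and a four-argument select() stands in for the blocking operation.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, giving up after $seconds.
# Returns (1, @results) on success and (0) on timeout;
# any other exception is rethrown.
sub with_timeout {
    my ($seconds, $code) = @_;
    my @result;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "alarm clock restart\n" };
        alarm $seconds;
        @result = $code->();
        alarm 0;                   # done in time: cancel the alarm
        1;
    };
    alarm 0;                       # also cancel if $code died for another reason
    return (1, @result) if $ok;
    return (0) if $@ eq "alarm clock restart\n";
    die $@;                        # not our timeout: propagate
}

# A five second "operation" with a one second budget.
my ($done) = with_timeout(1, sub { select(undef, undef, undef, 5) });
print $done ? "finished\n" : "timed out\n";
```

The caveat about system() and qx() applies to this wrapper as well: timing out does not stop a child process that the code had started.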
 174
 175 For more complex signal handling, you might see the standard POSIX
 176 module.  Lamentably, this is almost entirely undocumented, but
 177 the F<t/lib/posix.t> file from the Perl source distribution has some
 178 examples in it.
 179
 180 =head2 Handling the SIGHUP Signal in Daemons
 181
 182 A process that usually starts when the system boots and shuts down
 183 when the system is shut down is called a daemon (Disk And Execution
 184 MONitor). If a daemon process has a configuration file which is
 185 modified after the process has been started, there should be a way to
 186 tell that process to re-read its configuration file, without stopping
 187 the process. Many daemons provide this mechanism using the C<SIGHUP>
 188 signal handler. When you want to tell the daemon to re-read the file
 189 you simply send it the C<SIGHUP> signal.
 190
 191 Not all platforms automatically reinstall their (native) signal
 192 handlers after a signal delivery.  This means that the handler works
 193 only the first time the signal is sent. The solution to this problem
 194 is to use C<POSIX> signal handlers if available; their behaviour
 195 is well-defined.
 196
 197 The following example implements a simple daemon, which restarts
 198 itself every time the C<SIGHUP> signal is received. The actual code is
 199 located in the subroutine C<code()>, which simply prints some debug
 200 info to show that it works and should be replaced with the real code.
 201
 202   #!/usr/bin/perl -w
 203
 204   use POSIX ();
 205   use FindBin ();
 206   use File::Basename ();
 207   use File::Spec::Functions;
 208
 209   $|=1;
 210
 211   # make the daemon cross-platform, so exec always calls the script
 212   # itself with the right path, no matter how the script was invoked.
 213   my $script = File::Basename::basename($0);
 214   my $SELF = catfile $FindBin::Bin, $script;
 215
 216   # POSIX unmasks the sigprocmask properly
 217   my $sigset = POSIX::SigSet->new();
 218   my $action = POSIX::SigAction->new('sigHUP_handler',
 219                                      $sigset,
 220                                      &POSIX::SA_NODEFER);
 221   POSIX::sigaction(&POSIX::SIGHUP, $action);
 222
 223   sub sigHUP_handler {
 224       print "got SIGHUP\n";
 225       exec($SELF, @ARGV) or die "Couldn't restart: $!\n";
 226   }
 227
 228   code();
 229
 230   sub code {
 231       print "PID: $$\n";
 232       print "ARGV: @ARGV\n";
 233       my $c = 0;
 234       while (++$c) {
 235           sleep 2;
 236           print "$c\n";
 237       }
 238   }
 239   __END__
 240
 241
 242 =head1 Named Pipes
 243
 244 A named pipe (often referred to as a FIFO) is an old Unix IPC
 245 mechanism for processes communicating on the same machine.  It works
 246 just like regular, connected anonymous pipes, except that the
 247 processes rendezvous using a filename and don't have to be related.
 248
 249 To create a named pipe, use the Unix command mknod(1) or on some
 250 systems, mkfifo(1).  These may not be in your normal path.
 251
 252     # system return val is backwards, so && not ||
 253     #
 254     $ENV{PATH} .= ":/etc:/usr/etc";
 255     if  (      system('mknod',  $path, 'p')
 256             && system('mkfifo', $path) )
 257     {
 258         die "mk{nod,fifo} $path failed";
 259     }
 260
 261
 262 A fifo is convenient when you want to connect a process to an unrelated
 263 one.  When you open a fifo, the program will block until there's something
 264 on the other end.
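
Where the POSIX module is available, its mkfifo() function can create the fifo without running an external command.  The following sketch shows that, together with the blocking rendezvous on open; the temporary directory and the file name F<demo.fifo> are invented for the example.

```perl
use strict;
use warnings;
use POSIX qw(mkfifo);
use File::Temp qw(tempdir);

# Sketch: create a fifo with POSIX::mkfifo instead of mknod(1).
my $dir  = tempdir(CLEANUP => 1);
my $fifo = "$dir/demo.fifo";              # made-up name
mkfifo($fifo, 0600) or die "can't mkfifo $fifo: $!";

defined(my $pid = fork) or die "cannot fork: $!";
if ($pid == 0) {                          # child: the writer
    open(my $out, '>', $fifo) or die "can't write $fifo: $!";
    print $out "hello from $$\n";
    close($out);
    POSIX::_exit(0);                      # leave without running cleanup code
}

# parent: this open blocks until the child opens the other end
open(my $in, '<', $fifo) or die "can't read $fifo: $!";
my $line = <$in>;
close($in);
waitpid($pid, 0);
print "read: $line";
```

The two processes here happen to be related, but only the path name connects them, so the writer could just as well be a separate program.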
 265
 266 For example, let's say you'd like to have your F<.signature> file be a
 267 named pipe that has a Perl program on the other end.  Now every time any
 268 program (like a mailer, news reader, finger program, etc.) tries to read
 269 from that file, the reading program will block and your program will
 270 supply the new signature.  We'll use the pipe-checking file test B<-p>
 271 to find out whether anyone (or anything) has accidentally removed our fifo.
 272
 273     chdir; # go home
 274     $FIFO = '.signature';
 275     $ENV{PATH} .= ":/etc:/usr/games";
 276
 277     while (1) {
 278         unless (-p $FIFO) {
 279             unlink $FIFO;
 280             system('mknod', $FIFO, 'p')
 281                 && die "can't mknod $FIFO: $!";
 282         }
 283
 284         # next line blocks until there's a reader
 285         open (FIFO, "> $FIFO") || die "can't write $FIFO: $!";
 286         print FIFO "John Smith (smith\@host.org)\n", `fortune -s`;
 287         close FIFO;
 288         sleep 2;    # to avoid dup signals
 289     }
 290
 291 =head2 Deferred Signals (Safe Signals)
 292
 293 In Perls before Perl 5.7.3, by installing Perl code to deal with
 294 signals, you were exposing yourself to danger from two things.  First,
 295 few system library functions are re-entrant.  If the signal interrupts
 296 while Perl is executing one function (like malloc(3) or printf(3)),
 297 and your signal handler then calls the same function again, you could
 298 get unpredictable behavior--often, a core dump.  Second, Perl isn't
 299 itself re-entrant at the lowest levels.  If the signal interrupts Perl
 300 while Perl is changing its own internal data structures, similarly
 301 unpredictable behaviour may result.
 302
 303 There were two things you could do, knowing this: be paranoid or be
 304 pragmatic.  The paranoid approach was to do as little as possible in your
 305 signal handler.  Set an existing integer variable that already has a
 306 value, and return.  This doesn't help you if you're in a slow system call,
 307 which will just restart.  That means you have to C<die> to longjmp(3) out
 308 of the handler.  Even this is a little cavalier for the true paranoiac,
 309 who avoids C<die> in a handler because the system I<is> out to get you.
 310 The pragmatic approach was to say "I know the risks, but prefer the
 311 convenience", and to do anything you wanted in your signal handler,
 312 and be prepared to clean up core dumps now and again.
 313
 314 In Perl 5.7.3 and later, to avoid these problems, signals are
 315 "deferred": when the signal is delivered to the process by
 316 the system (to the C code that implements Perl), a flag is set and the
 317 handler returns immediately.  Then at strategic "safe" points in the
 318 Perl interpreter (e.g. when it is about to execute a new opcode) the
 319 flags are checked and the Perl-level handler from %SIG is
 320 executed.  The "deferred" scheme allows much more flexibility in the
 321 coding of signal handlers, as we know the Perl interpreter is in a safe
 322 state, and that we are not in a system library function, when the
 323 handler is called.  However, the implementation does differ from
 324 previous Perls in the following ways:
 325
 326 =over 4
 327
 328 =item Long running opcodes
 329
 330 As the Perl interpreter only looks at the signal flags when it is about
 331 to execute a new opcode, a signal that arrives during a long-running
 332 opcode (e.g. a regular expression operation on a very large string)
 333 will not be seen until the operation completes.
 334
 335 =item Interrupting IO
 336
 337 When a signal is delivered (e.g. INT from control-C) the operating system
 338 breaks into IO operations like C<read> (used to implement Perl's
 339 E<lt>E<gt> operator).  On older Perls the handler was called
 340 immediately (and as C<read> is not "unsafe" this worked well).  With
 341 the "deferred" scheme the handler is not called immediately, and if
 342 Perl is using the system's C<stdio> library, that library may restart the
 343 C<read> without returning to Perl and giving it a chance to call the
 344 %SIG handler.  If this happens on your system, the solution is to use the
 345 C<:perlio> layer to do IO, at least on those handles which you want
 346 to be able to break into with signals.  (The C<:perlio> layer checks
 347 the signal flags and calls %SIG handlers before resuming the IO operation.)
 348
 349 Note that the default in Perl 5.7.3 and later is to automatically use
 350 the C<:perlio> layer.
 351
 352 Note that some networking library functions like gethostbyname() are
 353 known to have their own implementations of timeouts which may conflict
 354 with your timeouts.  If you are having problems with such functions,
 355 you can try using the POSIX sigaction() function, which bypasses the
 356 Perl safe signals (note that this means subjecting yourself to
 357 possible memory corruption, as described above).  Instead of setting
 358 C<$SIG{ALRM}> try something like the following:
 359
 360     use POSIX;
 361     sigaction SIGALRM, new POSIX::SigAction sub { die "alarm\n" }
 362         or die "Error setting SIGALRM handler: $!\n";
 363
 364 =item Restartable system calls
 365
 366 On systems that supported it, older versions of Perl used the
 367 SA_RESTART flag when installing %SIG handlers.  This meant that
 368 restartable system calls would continue rather than returning when
 369 a signal arrived.  In order to deliver deferred signals promptly,
 370 Perl 5.7.3 and later do I<not> use SA_RESTART.  Consequently,
 371 restartable system calls can fail (with $! set to C<EINTR>) in places
 372 where they previously would have succeeded.
 373
 374 Note that the default C<:perlio> layer will retry C<read>, C<write>
 375 and C<close> as described above and that interrupted C<wait> and
 376 C<waitpid> calls will always be retried.
 377
 378 =item Signals as "faults"
 379
 380 Certain signals, e.g. SEGV, ILL, and BUS, are generated as a result of
 381 virtual memory or other "faults".  These are normally fatal and there
 382 is little a Perl-level handler can do with them.  (The old signal
 383 scheme was particularly unsafe in such cases.)  However, if
 384 a %SIG handler is set, the new scheme simply sets a flag and returns as
 385 described above.  This may cause the operating system to try the
 386 offending machine instruction again and, as nothing has changed, it
 387 will generate the signal again.  The result of this is a rather odd
 388 "loop".  In future, Perl's signal mechanism may be changed to avoid this,
 389 perhaps by simply disallowing %SIG handlers on signals of that
 390 type.  Until then the workaround is not to set a %SIG handler on those
 391 signals.  (Which signals they are is operating system dependent.)
 392
 393 =item Signals triggered by operating system state
 394
 395 On some operating systems certain signal handlers are supposed to "do
 396 something" before returning. One example can be CHLD or CLD which
 397 indicates a child process has completed. On some operating systems the
 398 signal handler is expected to C<wait> for the completed child
 399 process. On such systems the deferred signal scheme will not work for
 400 those signals (it does not do the C<wait>). Again the failure will
 401 look like a loop as the operating system will re-issue the signal as
 402 there are un-waited-for completed child processes.
 403
 404 =back
 405
 406 If you want the old signal behaviour back regardless of possible
 407 memory corruption, set the environment variable C<PERL_SIGNALS> to
 408 C<"unsafe"> (a new feature since Perl 5.8.1).
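
Code that calls sysread() directly bypasses the C<:perlio> retry logic, so it may see the C<EINTR> failures described under "Restartable system calls".  One common response is to retry the call.  The sketch below assumes a Unix-like system with a C<USR1> signal; C<sysread_retry> is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: sysread() that retries when a signal interrupts it.
sub sysread_retry {
    my ($fh, $len) = @_;
    while (1) {
        my $n = sysread($fh, my $buf, $len);
        return $buf if defined $n;        # data, or '' at end of file
        next if $!{EINTR};                # interrupted by a signal: try again
        die "read failed: $!";
    }
}

my $got_usr1 = 0;
$SIG{USR1} = sub { $got_usr1++ };

pipe(my $rd, my $wr) or die "pipe: $!";
defined(my $pid = fork) or die "cannot fork: $!";
if ($pid == 0) {                          # child
    close $rd;
    select(undef, undef, undef, 0.3);     # let the parent block in its read
    kill USR1 => getppid();               # interrupt the parent
    select(undef, undef, undef, 0.3);
    syswrite($wr, "late data\n");         # then finally send something
    exit 0;
}

close $wr;
my $data = sysread_retry($rd, 1024);
waitpid($pid, 0);
print "signals seen: $got_usr1, data: $data";
```

The Perl-level handler runs between the failed read and the retry, so the signal is not lost; the loop only hides the C<EINTR> from the caller.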
 409
 410 =head1 Using open() for IPC
 411
 412 Perl's basic open() statement can also be used for unidirectional
 413 interprocess communication by either appending or prepending a pipe
 414 symbol to the second argument to open().  Here's how to start
 415 something up in a child process you intend to write to:
 416
 417     open(SPOOLER, "| cat -v | lpr -h 2>/dev/null")
 418                     || die "can't fork: $!";
 419     local $SIG{PIPE} = sub { die "spooler pipe broke" };
 420     print SPOOLER "stuff\n";
 421     close(SPOOLER) || die "bad spool: $! $?";
 422
 423 And here's how to start up a child process you intend to read from:
 424
 425     open(STATUS, "netstat -an 2>&1 |")
 426                     || die "can't fork: $!";
 427     while (<STATUS>) {
 428         next if /^(tcp|udp)/;
 429         print;
 430     }
 431     close(STATUS) || die "bad netstat: $! $?";
 432
 433 If one can be sure that a particular program is a Perl script that is
 434 expecting filenames in @ARGV, the clever programmer can write something
 435 like this:
 436
 437     % program f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile
 438
 439 and irrespective of which shell it's called from, the Perl program will
 440 read from the file F<f1>, the process F<cmd1>, standard input (F<tmpfile>
 441 in this case), the F<f2> file, the F<cmd2> command, and finally the F<f3>
 442 file.  Pretty nifty, eh?
 443
 444 You might notice that you could use backticks for much the
 445 same effect as opening a pipe for reading:
 446
 447     print grep { !/^(tcp|udp)/ } `netstat -an 2>&1`;
 448     die "bad netstat" if $?;
 449
 450 While this is true on the surface, it's much more efficient to process the
 451 file one line or record at a time because then you don't have to read the
 452 whole thing into memory at once.  It also gives you finer control of the
 453 whole process, letting you kill off the child process early if you'd
 454 like.
 455
 456 Be careful to check both the open() and the close() return values.  If
 457 you're I<writing> to a pipe, you should also trap SIGPIPE.  Otherwise,
 458 think of what happens when you start up a pipe to a command that doesn't
 459 exist: the open() will in all likelihood succeed (it only reflects the
 460 fork()'s success), but then your output will fail--spectacularly.  Perl
 461 can't know whether the command worked because your command is actually
 462 running in a separate process whose exec() might have failed.  Therefore,
 463 while readers of bogus commands return just a quick end of file, writers
 464 to bogus commands will trigger a signal they'd better be prepared to
 465 handle.  Consider:
 466
 467     open(FH, "|bogus")  or die "can't fork: $!";
 468     print FH "bang\n"   or die "can't write: $!";
 469     close FH            or die "can't close: $!";
 470
 471 That won't blow up until the close, and it will blow up with a SIGPIPE.
 472 To catch it, you could use this:
 473
 474     $SIG{PIPE} = 'IGNORE';
 475     open(FH, "|bogus")  or die "can't fork: $!";
 476     print FH "bang\n"   or die "can't write: $!";
 477     close FH            or die "can't close: status=$?";
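
With C<SIGPIPE> ignored, the failure shows up as an ordinary error, with C<$!> set to C<EPIPE>.  The following sketch demonstrates this without any external command, by writing to a pipe whose reading end has already been closed.

```perl
use strict;
use warnings;

# Sketch: with SIGPIPE ignored, writing to a pipe that has no
# reader fails with EPIPE instead of killing the process.
$SIG{PIPE} = 'IGNORE';

pipe(my $rd, my $wr) or die "pipe: $!";
close $rd;                                # nobody will ever read

my $written = syswrite($wr, "bang\n");
my $broken  = !defined($written) && $!{EPIPE} ? 1 : 0;
print $broken ? "write failed: broken pipe\n" : "write succeeded\n";
```

syswrite() is used here so that the error appears at the write itself; with a buffered print, it may not surface until the handle is flushed or closed.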
 478
 479 =head2 Filehandles
 480
 481 Both the main process and any child processes it forks share the same
 482 STDIN, STDOUT, and STDERR filehandles.  If both processes try to access
 483 them at once, strange things can happen.  You may also want to close
 484 or reopen the filehandles for the child.  You can get around this by
 485 opening your pipe with open(), but on some systems this means that the
 486 child process cannot outlive the parent.
 487
 488 =head2 Background Processes
 489
 490 You can run a command in the background with:
 491
 492     system("cmd &");
 493
 494 The command's STDOUT and STDERR (and possibly STDIN, depending on your
 495 shell) will be the same as the parent's.  You won't need to catch
 496 SIGCHLD because of the double-fork taking place (see below for more
 497 details).
 498
 499 =head2 Complete Dissociation of Child from Parent
 500
 501 In some cases (starting server processes, for instance) you'll want to
 502 completely dissociate the child process from the parent.  This is
 503 often called daemonization.  A well behaved daemon will also chdir()
 504 to the root directory (so it doesn't prevent unmounting the filesystem
 505 containing the directory from which it was launched) and redirect its
 506 standard file descriptors from and to F</dev/null> (so that random
 507 output doesn't wind up on the user's terminal).
 508
 509     use POSIX 'setsid';
 510
 511     sub daemonize {
 512         chdir '/'               or die "Can't chdir to /: $!";
 513         open STDIN, '/dev/null' or die "Can't read /dev/null: $!";
 514         open STDOUT, '>/dev/null'
 515                                 or die "Can't write to /dev/null: $!";
 516         defined(my $pid = fork) or die "Can't fork: $!";
 517         exit if $pid;
 518         setsid                  or die "Can't start a new session: $!";
 519         open STDERR, '>&STDOUT' or die "Can't dup stdout: $!";
 520     }
 521
 522 The fork() has to come before the setsid() to ensure that you aren't a
 523 process group leader (the setsid() will fail if you are).  If your
 524 system doesn't have the setsid() function, open F</dev/tty> and use the
 525 C<TIOCNOTTY> ioctl() on it instead.  See L<tty(4)> for details.
 526
 527 Non-Unix users should check their Your_OS::Process module for other
 528 solutions.
 529
 530 =head2 Safe Pipe Opens
 531
 532 Another interesting approach to IPC is making your single program go
 533 multiprocess and communicate between (or even amongst) yourselves.  The
 534 open() function will accept a file argument of either C<"-|"> or C<"|-">
 535 to do a very interesting thing: it forks a child connected to the
 536 filehandle you've opened.  The child is running the same program as the
 537 parent.  This is useful for safely opening a file when running under an
 538 assumed UID or GID, for example.  If you open a pipe I<to> minus, you can
 539 write to the filehandle you opened and your kid will find it in his
 540 STDIN.  If you open a pipe I<from> minus, you can read from the filehandle
 541 you opened whatever your kid writes to his STDOUT.
 542
 543     use English '-no_match_vars';
 544     my $sleep_count = 0;
 545
 546     do {
 547         $pid = open(KID_TO_WRITE, "|-");
 548         unless (defined $pid) {
 549             warn "cannot fork: $!";
 550             die "bailing out" if $sleep_count++ > 6;
 551             sleep 10;
 552         }
 553     } until defined $pid;
 554
 555     if ($pid) {  # parent
 556         print KID_TO_WRITE @some_data;
 557         close(KID_TO_WRITE) || warn "kid exited $?";
 558     } else {     # child
 559         ($EUID, $EGID) = ($UID, $GID); # suid progs only
 560         open (FILE, "> /safe/file")
 561             || die "can't open /safe/file: $!";
 562         while (<STDIN>) {
 563             print FILE; # child's STDIN is parent's KID
 564         }
 565         exit;  # don't forget this
 566     }
 567
 568 Another common use for this construct is when you need to execute
 569 something without the shell's interference.  With system(), it's
 570 straightforward, but you can't use a pipe open or backticks safely.
 571 That's because there's no way to stop the shell from getting its hands on
 572 your arguments.  Instead, use lower-level control to call exec() directly.
 573
 574 Here's a safe backtick or pipe open for read:
 575
 576     # add error processing as above
 577     $pid = open(KID_TO_READ, "-|");
 578
 579     if ($pid) {   # parent
 580         while (<KID_TO_READ>) {
 581             # do something interesting
 582         }
 583         close(KID_TO_READ) || warn "kid exited $?";
 584
 585     } else {      # child
 586         ($EUID, $EGID) = ($UID, $GID); # suid only
 587         exec($program, @options, @args)
 588             || die "can't exec program: $!";
 589         # NOTREACHED
 590     }
 591
 592
 593 And here's a safe pipe open for writing:
 594
 595     # add error processing as above
 596     $pid = open(KID_TO_WRITE, "|-");
 597     $SIG{PIPE} = sub { die "whoops, $program pipe broke" };
 598
 599     if ($pid) {  # parent
 600         for (@data) {
 601             print KID_TO_WRITE;
 602         }
 603         close(KID_TO_WRITE) || warn "kid exited $?";
 604
 605     } else {     # child
 606         ($EUID, $EGID) = ($UID, $GID);
 607         exec($program, @options, @args)
 608             || die "can't exec program: $!";
 609         # NOTREACHED
 610     }
 611
 612 Since Perl 5.8.0, you can also use the list form of C<open> for pipes:
 613 the syntax
 614
 615     open KID_PS, "-|", "ps", "aux" or die $!;
 616
 617 forks the ps(1) command (without spawning a shell, as there are more than
 618 three arguments to open()), and reads its standard output via the
 619 C<KID_PS> filehandle.  The corresponding syntax to write to command
 620 pipes (with C<"|-"> in place of C<"-|">) is also implemented.
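
To see that no shell is involved in the list form, the following sketch passes arguments containing shell metacharacters to a child and prints what the child received.  It uses C<$^X> (the running perl) as the child program only so that the example does not depend on any particular external command.

```perl
use strict;
use warnings;

# Sketch: list-form pipe open.  The arguments reach the child verbatim,
# because no shell ever sees them.
my @args = ('a b', '; echo oops', '$HOME');
open(my $kid, '-|', $^X, '-e', 'print join("|", @ARGV)', @args)
    or die "can't fork: $!";
my $output = do { local $/; <$kid> };     # slurp everything the child prints
close($kid) or warn "kid exited $?";
print "$output\n";
```

The child reports its three arguments unchanged, as C<a b|; echo oops|$HOME>.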
 621
 622 Note that these operations are full Unix forks, which means they may not be
 623 correctly implemented on alien systems.  Additionally, this is not true
 624 multithreading.  If you'd like to learn more about threading, see the
 625 F<modules> file mentioned below in the SEE ALSO section.
 626
 627 =head2 Bidirectional Communication with Another Process
 628
 629 While this works reasonably well for unidirectional communication, what
 630 about bidirectional communication?  The obvious thing you'd like to do
 631 doesn't actually work:
 632
 633     open(PROG_FOR_READING_AND_WRITING, "| some program |")
 634
 635 and if you forget to use the C<use warnings> pragma or the B<-w> flag,
 636 then you'll miss out entirely on the diagnostic message:
 637
 638     Can't do bidirectional pipe at -e line 1.
 639
 640 If you really want to, you can use the standard open2() library function
 641 to catch both ends.  There's also an open3() for tridirectional I/O so you
 642 can also catch your child's STDERR, but doing so would then require an
 643 awkward select() loop and wouldn't allow you to use normal Perl input
 644 operations.
 645
 646 If you look at its source, you'll see that open2() uses low-level
 647 primitives like Unix pipe() and exec() calls to create all the connections.
 648 While it might have been slightly more efficient by using socketpair(), it
 649 would have then been even less portable than it already is.  The open2()
 650 and open3() functions are unlikely to work anywhere except on a Unix
 651 system or some other one purporting to be POSIX compliant.
 652
 653 Here's an example of using open2():
 654
 655     use FileHandle;
 656     use IPC::Open2;
 657     $pid = open2(*Reader, *Writer, "cat -u -n" );
 658     print Writer "stuff\n";
 659     $got = <Reader>;
 660
 661 The problem with this is that Unix buffering is really going to
 662 ruin your day.  Even though your C<Writer> filehandle is auto-flushed,
 663 and the process on the other end will get your data in a timely manner,
 664 you can't usually do anything to force it to give it back to you
 665 in a similarly quick fashion.  In this case, we could, because we
 666 gave I<cat> a B<-u> flag to make it unbuffered.  But very few Unix
 667 commands are designed to operate over pipes, so this seldom works
 668 unless you yourself wrote the program on the other end of the
 669 double-ended pipe.
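
When you do control the program on the other end, having it flush after every line is enough.  In this sketch the child is a one-line Perl filter, invented for the example, that sets C<$|> and upper-cases its input; it is started through C<$^X> and the list form of open2(), so no shell is involved.

```perl
use strict;
use warnings;
use IPC::Open2;

# Sketch: the child turns on autoflush ($| = 1), so each reply comes
# back as soon as it is produced.
my $filter = '$| = 1; while (<STDIN>) { print uc }';
my $pid = open2(my $reader, my $writer, $^X, '-e', $filter);

print $writer "stuff\n";
my $got = <$reader>;

close $writer;          # end of input ends the child's loop
close $reader;
waitpid($pid, 0);       # open2() does not reap the child for us
print "got: $got";
```

Without the C<$| = 1> in the child, the read would typically hang: the child's reply would sit in its output buffer while both sides wait for each other.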
 670
 671 A solution to this is the nonstandard F<Comm.pl> library.  It uses
 672 pseudo-ttys to make your program behave more reasonably:
 673
 674     require 'Comm.pl';
 675     $ph = open_proc('cat -n');
 676     for (1..10) {
 677         print $ph "a line\n";
 678         print "got back ", scalar <$ph>;
 679     }
 680
 681 This way you don't have to have control over the source code of the
 682 program you're using.  The F<Comm> library also has expect()
 683 and interact() functions.  Find the library (and we hope its
 684 successor F<IPC::Chat>) at your nearest CPAN archive as detailed
 685 in the SEE ALSO section below.
 686
 687 The newer Expect.pm module from CPAN also addresses this kind of thing.
 688 This module requires two other modules from CPAN: IO::Pty and IO::Stty.
 689 It sets up a pseudo-terminal to interact with programs that insist on
 690 using talking to the terminal device driver.  If your system is
 691 amongst those supported, this may be your best bet.
 692
 693 =head2 Bidirectional Communication with Yourself
 694
 695 If you want, you may make low-level pipe() and fork() calls
 696 to stitch this together by hand.  This example only
 697 talks to itself, but you could reopen the appropriate
 698 handles to STDIN and STDOUT and call other processes.
 699
 700     #!/usr/bin/perl -w
 701     # pipe1 - bidirectional communication using two pipe pairs
 702     #         designed for the socketpair-challenged
 703     use IO::Handle;     # thousands of lines just for autoflush :-(
 704     pipe(PARENT_RDR, CHILD_WTR)  or die "pipe: $!";
 705     pipe(CHILD_RDR,  PARENT_WTR) or die "pipe: $!";
 706     CHILD_WTR->autoflush(1);
 707     PARENT_WTR->autoflush(1);
 708
 709     if ($pid = fork) {
 710         close PARENT_RDR; close PARENT_WTR;
 711         print CHILD_WTR "Parent Pid $$ is sending this\n";
 712         chomp($line = <CHILD_RDR>);
 713         print "Parent Pid $$ just read this: `$line'\n";
 714         close CHILD_RDR; close CHILD_WTR;
 715         waitpid($pid,0);
 716     } else {
 717         die "cannot fork: $!" unless defined $pid;
 718         close CHILD_RDR; close CHILD_WTR;
 719         chomp($line = <PARENT_RDR>);
 720         print "Child Pid $$ just read this: `$line'\n";
 721         print PARENT_WTR "Child Pid $$ is sending this\n";
 722         close PARENT_RDR; close PARENT_WTR;
 723         exit;
 724     }
 725
 726 But you don't actually have to make two pipe calls.  If you
 727 have the socketpair() system call, it will do this all for you.
 728
 729     #!/usr/bin/perl -w
 730     # pipe2 - bidirectional communication using socketpair
 731     #   "the best ones always go both ways"
 732
 733     use Socket;
 734     use IO::Handle;     # thousands of lines just for autoflush :-(
 735     # We say AF_UNIX because although *_LOCAL is the
 736     # POSIX 1003.1g form of the constant, many machines
 737     # still don't have it.
 738     socketpair(CHILD, PARENT, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
 739                                 or  die "socketpair: $!";
 740
 741     CHILD->autoflush(1);
 742     PARENT->autoflush(1);
 743
 744     if ($pid = fork) {
 745         close PARENT;
 746         print CHILD "Parent Pid $$ is sending this\n";
 747         chomp($line = <CHILD>);
 748         print "Parent Pid $$ just read this: `$line'\n";
 749         close CHILD;
 750         waitpid($pid,0);
 751     } else {
 752         die "cannot fork: $!" unless defined $pid;
 753         close CHILD;
 754         chomp($line = <PARENT>);
 755         print "Child Pid $$ just read this: `$line'\n";
 756         print PARENT "Child Pid $$ is sending this\n";
 757         close PARENT;
 758         exit;
 759     }
 760
 761 =head1 Sockets: Client/Server Communication
 762
Sockets are not limited to Unix-derived operating systems (e.g., WinSock on PCs
provides socket support, as do some VMS libraries), but you may not have
them on your system, in which case this section probably isn't going to do
you much good.  With sockets, you can do both virtual circuits (i.e., TCP
streams) and datagrams (i.e., UDP packets).  You may be able to do even more
depending on your system.
 769
 770 The Perl function calls for dealing with sockets have the same names as
 771 the corresponding system calls in C, but their arguments tend to differ
 772 for two reasons: first, Perl filehandles work differently than C file
 773 descriptors.  Second, Perl already knows the length of its strings, so you
 774 don't need to pass that information.
 775
One of the major problems with old socket code in Perl was that it used
hard-coded values for some of the constants, which severely hurt
portability.  If you ever see code that does anything like explicitly
setting C<$AF_INET = 2>, you know you're in for big trouble.  An
immeasurably superior approach is to use the C<Socket> module, which more
reliably grants access to various constants and functions you'll need.
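
As a small illustration of that advice, here is a minimal sketch that asks
the C<Socket> module for a few constants instead of hard-coding them.  The
helper name socket_constants() is made up for this example; the numeric
values it reports depend on your platform.

```perl
use strict;
use warnings;
use Socket qw(AF_INET PF_INET SOCK_STREAM SOCK_DGRAM);

# Hypothetical helper: return this platform's values for a few
# socket constants, as supplied by the Socket module.
sub socket_constants {
    return (
        AF_INET     => AF_INET,
        PF_INET     => PF_INET,
        SOCK_STREAM => SOCK_STREAM,
        SOCK_DGRAM  => SOCK_DGRAM,
    );
}

my %const = socket_constants();
printf "%-12s = %d\n", $_, $const{$_} for sort keys %const;
```

Because the numbers come from the system's own headers by way of the
module, the same script should keep working on a platform where they differ.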
 782
 783 If you're not writing a server/client for an existing protocol like
 784 NNTP or SMTP, you should give some thought to how your server will
 785 know when the client has finished talking, and vice-versa.  Most
 786 protocols are based on one-line messages and responses (so one party
 787 knows the other has finished when a "\n" is received) or multi-line
 788 messages and responses that end with a period on an empty line
 789 ("\n.\n" terminates a message/response).
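
The multi-line convention can be sketched as follows.  This is only an
illustration under stated assumptions: read_dot_message() is a hypothetical
helper, it reads from an in-memory filehandle rather than a real socket, and
it does not handle the "dot-stuffing" that protocols such as SMTP use for
data lines that themselves begin with a period.

```perl
use strict;
use warnings;

# Hypothetical helper: read lines from $fh until a line holding
# only "." is seen, and return a reference to the lines before it
# (terminators stripped).  Returns undef if EOF arrives first.
sub read_dot_message {
    my ($fh) = @_;
    my @lines;
    while (defined(my $line = <$fh>)) {
        $line =~ s/\015?\012\z//;       # accept CRLF or a lone LF
        return \@lines if $line eq '.';
        push @lines, $line;
    }
    return undef;                       # message was cut short
}

# Demonstrate on an in-memory filehandle instead of a real socket.
my $wire = "first line\015\012second line\015\012.\015\012";
open(my $fh, '<', \$wire) or die "cannot open in-memory handle: $!";
my $msg = read_dot_message($fh);
print "got ", scalar(@$msg), " lines\n";
```

With a real connection you would pass the socket handle instead of $fh.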
 790
 791 =head2 Internet Line Terminators
 792
The Internet line terminator is "\015\012".  Under ASCII variants of
Unix, that could usually be written as "\r\n", but under other systems,
"\r\n" might at times be "\015\015\012", "\012\012\015", or something
completely different.  The standards specify writing "\015\012" to be
conformant (be strict in what you provide), but they also recommend
accepting a lone "\012" on input (be lenient in what you require).
We haven't always been very good about that in the code in this manpage,
but unless you're on a Mac, you'll probably be ok.
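
One way to follow both halves of that advice is sketched below.  The helper
names to_wire() and from_wire() are hypothetical, chosen only for this
example.

```perl
use strict;
use warnings;

my $CRLF = "\015\012";

# Hypothetical helpers illustrating "strict out, lenient in".
sub to_wire {                   # always emit the full terminator
    my ($text) = @_;
    return $text . $CRLF;
}

sub from_wire {                 # accept CRLF or a lone LF
    my ($line) = @_;
    $line =~ s/\015?\012\z//;
    return $line;
}

print from_wire(to_wire("HELO example")), "\n";
```

Spelling the terminator with octal escapes sidesteps the question of what
"\r" and "\n" mean on the local system.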
 801
 802 =head2 Internet TCP Clients and Servers
 803
 804 Use Internet-domain sockets when you want to do client-server
 805 communication that might extend to machines outside of your own system.
 806
 807 Here's a sample TCP client using Internet-domain sockets:
 808
 809     #!/usr/bin/perl -w
 810     use strict;
 811     use Socket;
 812     my ($remote,$port, $iaddr, $paddr, $proto, $line);
 813
 814     $remote  = shift || 'localhost';
 815     $port    = shift || 2345;  # random port
 816     if ($port =~ /\D/) { $port = getservbyname($port, 'tcp') }
 817     die "No port" unless $port;
 818     $iaddr   = inet_aton($remote)               || die "no host: $remote";
 819     $paddr   = sockaddr_in($port, $iaddr);
 820
 821     $proto   = getprotobyname('tcp');
 822     socket(SOCK, PF_INET, SOCK_STREAM, $proto)  || die "socket: $!";
 823     connect(SOCK, $paddr)    || die "connect: $!";
 824     while (defined($line = <SOCK>)) {
 825         print $line;
 826     }
 827
 828     close (SOCK)            || die "close: $!";
 829     exit;
 830
And here's a corresponding server to go along with it.  We'll
leave the address as INADDR_ANY so that the kernel can choose
the appropriate interface on multihomed hosts.  If you want to sit
on a particular interface (like the external side of a gateway
or firewall machine), you should fill this in with your real address
instead.
 837
 838     #!/usr/bin/perl -Tw
 839     use strict;
 840     BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
 841     use Socket;
 842     use Carp;
 843     my $EOL = "\015\012";
 844
 845     sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }
 846
 847     my $port = shift || 2345;
 848     my $proto = getprotobyname('tcp');
 849
 850     ($port) = $port =~ /^(\d+)$/                        or die "invalid port";
 851
 852     socket(Server, PF_INET, SOCK_STREAM, $proto)        || die "socket: $!";
 853     setsockopt(Server, SOL_SOCKET, SO_REUSEADDR,
 854                                         pack("l", 1))   || die "setsockopt: $!";
 855     bind(Server, sockaddr_in($port, INADDR_ANY))        || die "bind: $!";
 856     listen(Server,SOMAXCONN)                            || die "listen: $!";
 857
 858     logmsg "server started on port $port";
 859
 860     my $paddr;
 861
    # No SIGCHLD handler is installed here: this server never forks.
 863
 864     for ( ; $paddr = accept(Client,Server); close Client) {
 865         my($port,$iaddr) = sockaddr_in($paddr);
 866         my $name = gethostbyaddr($iaddr,AF_INET);
 867
 868         logmsg "connection from $name [",
 869                 inet_ntoa($iaddr), "]
 870                 at port $port";
 871
 872         print Client "Hello there, $name, it's now ",
 873                         scalar localtime, $EOL;
 874     }
 875
And here's a multitasking version.  It's multitasked in that,
like most typical servers, it spawns (forks) a child server to
handle the client request so that the master server can quickly
go back to service a new client.
 880
 881     #!/usr/bin/perl -Tw
 882     use strict;
 883     BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
 884     use Socket;
 885     use Carp;
 886     my $EOL = "\015\012";
 887
 888     sub spawn;  # forward declaration
 889     sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }
 890
 891     my $port = shift || 2345;
 892     my $proto = getprotobyname('tcp');
 893
 894     ($port) = $port =~ /^(\d+)$/                        or die "invalid port";
 895
 896     socket(Server, PF_INET, SOCK_STREAM, $proto)        || die "socket: $!";
 897     setsockopt(Server, SOL_SOCKET, SO_REUSEADDR,
 898                                         pack("l", 1))   || die "setsockopt: $!";
 899     bind(Server, sockaddr_in($port, INADDR_ANY))        || die "bind: $!";
 900     listen(Server,SOMAXCONN)                            || die "listen: $!";
 901
 902     logmsg "server started on port $port";
 903
 904     my $waitedpid = 0;
 905     my $paddr;
 906
 907     use POSIX ":sys_wait_h";
 908     sub REAPER {
 909         my $child;
 910         while (($waitedpid = waitpid(-1,WNOHANG)) > 0) {
 911             logmsg "reaped $waitedpid" . ($? ? " with exit $?" : '');
 912         }
 913         $SIG{CHLD} = \&REAPER;  # loathe sysV
 914     }
 915
 916     $SIG{CHLD} = \&REAPER;
 917
 918     for ( $waitedpid = 0;
 919           ($paddr = accept(Client,Server)) || $waitedpid;
 920           $waitedpid = 0, close Client)
 921     {
 922         next if $waitedpid and not $paddr;
 923         my($port,$iaddr) = sockaddr_in($paddr);
 924         my $name = gethostbyaddr($iaddr,AF_INET);
 925
 926         logmsg "connection from $name [",
 927                 inet_ntoa($iaddr), "]
 928                 at port $port";
 929
 930         spawn sub {
 931             $|=1;
 932             print "Hello there, $name, it's now ", scalar localtime, $EOL;
 933             exec '/usr/games/fortune'           # XXX: `wrong' line terminators
 934                 or confess "can't exec fortune: $!";
 935         };
 936
 937     }
 938
 939     sub spawn {
 940         my $coderef = shift;
 941
 942         unless (@_ == 0 && $coderef && ref($coderef) eq 'CODE') {
 943             confess "usage: spawn CODEREF";
 944         }
 945
 946         my $pid;
 947         if (!defined($pid = fork)) {
 948             logmsg "cannot fork: $!";
 949             return;
 950         } elsif ($pid) {
 951             logmsg "begat $pid";
 952             return; # I'm the parent
 953         }
 954         # else I'm the child -- go spawn
 955
 956         open(STDIN,  "<&Client")   || die "can't dup client to stdin";
 957         open(STDOUT, ">&Client")   || die "can't dup client to stdout";
 958         ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
 959         exit &$coderef();
 960     }
 961
This server takes the trouble to clone off a child version via fork() for
each incoming request.  That way it can handle many requests at once,
which you might not always want.  Even if you don't fork(), the listen()
call will still let up to SOMAXCONN connections wait in the queue.  Forking
servers have to be particularly careful about cleaning up their dead
children (called "zombies" in Unix parlance), because otherwise you'll
quickly fill up your process table.
 969
We suggest that you use the B<-T> flag to enable taint checking (see L<perlsec>)
even if you aren't running setuid or setgid.  This is always a good idea
for servers and other programs run on behalf of someone else (like CGI
scripts), because it lessens the chances that people from the outside will
be able to compromise your system.
 975
 976 Let's look at another TCP client.  This one connects to the TCP "time"
 977 service on a number of different machines and shows how far their clocks
 978 differ from the system on which it's being run:
 979
 980     #!/usr/bin/perl  -w
 981     use strict;
 982     use Socket;
 983
 984     my $SECS_of_70_YEARS = 2208988800;
 985     sub ctime { scalar localtime(shift) }
 986
 987     my $iaddr = gethostbyname('localhost');
 988     my $proto = getprotobyname('tcp');
 989     my $port = getservbyname('time', 'tcp');
 990     my $paddr = sockaddr_in(0, $iaddr);
 991     my($host);
 992
 993     $| = 1;
 994     printf "%-24s %8s %s\n",  "localhost", 0, ctime(time());
 995
 996     foreach $host (@ARGV) {
 997         printf "%-24s ", $host;
 998         my $hisiaddr = inet_aton($host)     || die "unknown host";
 999         my $hispaddr = sockaddr_in($port, $hisiaddr);
1000         socket(SOCKET, PF_INET, SOCK_STREAM, $proto)   || die "socket: $!";
        connect(SOCKET, $hispaddr)          || die "connect: $!";
1002         my $rtime = '    ';
1003         read(SOCKET, $rtime, 4);
1004         close(SOCKET);
1005         my $histime = unpack("N", $rtime) - $SECS_of_70_YEARS ;
1006         printf "%8d %s\n", $histime - time, ctime($histime);
1007     }
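
The arithmetic in that client may deserve a word.  The "time" service
(RFC 868) answers with a 32-bit big-endian count of seconds since
1900-01-01 00:00 GMT, whereas Perl's time() counts from 1970-01-01, so the
constant is the number of seconds between those two dates.  Here is a small
sketch of just that conversion; rfc868_to_epoch() is a hypothetical helper
name, and the reply is fabricated locally rather than read from a server.

```perl
use strict;
use warnings;

my $SECS_of_70_YEARS = 2208988800;   # 1900-01-01 to 1970-01-01, in seconds

# Hypothetical helper: convert the 4 bytes sent by a "time" server
# into Unix epoch seconds.  Returns undef on a short read.
sub rfc868_to_epoch {
    my ($packed) = @_;
    return undef unless defined $packed && length($packed) == 4;
    return unpack("N", $packed) - $SECS_of_70_YEARS;
}

# Made-up reply that stands for the Unix epoch itself.
my $reply = pack("N", $SECS_of_70_YEARS);
print rfc868_to_epoch($reply), "\n";     # prints 0
```

Checking the length before unpacking guards against a connection that
closes before all four bytes arrive.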
1008
1009 =head2 Unix-Domain TCP Clients and Servers
1010
1011 That's fine for Internet-domain clients and servers, but what about local
1012 communications?  While you can use the same setup, sometimes you don't
1013 want to.  Unix-domain sockets are local to the current host, and are often
1014 used internally to implement pipes.  Unlike Internet domain sockets, Unix
1015 domain sockets can show up in the file system with an ls(1) listing.
1016
1017     % ls -l /dev/log
1018     srw-rw-rw-  1 root            0 Oct 31 07:23 /dev/log
1019
1020 You can test for these with Perl's B<-S> file test:
1021
1022     unless ( -S '/dev/log' ) {
1023         die "something's wicked with the log system";
1024     }
1025
1026 Here's a sample Unix-domain client:
1027
1028     #!/usr/bin/perl -w
1029     use Socket;
1030     use strict;
1031     my ($rendezvous, $line);
1032
1033     $rendezvous = shift || 'catsock';
1034     socket(SOCK, PF_UNIX, SOCK_STREAM, 0)       || die "socket: $!";
1035     connect(SOCK, sockaddr_un($rendezvous))     || die "connect: $!";
1036     while (defined($line = <SOCK>)) {
1037         print $line;
1038     }
1039     exit;
1040
1041 And here's a corresponding server.  You don't have to worry about silly
1042 network terminators here because Unix domain sockets are guaranteed
1043 to be on the localhost, and thus everything works right.
1044
1045     #!/usr/bin/perl -Tw
1046     use strict;
1047     use Socket;
1048     use Carp;
1049
1050     BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
1051     sub spawn;  # forward declaration
1052     sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }
1053
1054     my $NAME = 'catsock';
1055     my $uaddr = sockaddr_un($NAME);
1056     my $proto = getprotobyname('tcp');
1057
1058     socket(Server,PF_UNIX,SOCK_STREAM,0)        || die "socket: $!";
1059     unlink($NAME);
1060     bind  (Server, $uaddr)                      || die "bind: $!";
1061     listen(Server,SOMAXCONN)                    || die "listen: $!";
1062
1063     logmsg "server started on $NAME";
1064
1065     my $waitedpid;
1066
1067     use POSIX ":sys_wait_h";
1068     sub REAPER {
1069         my $child;
1070         while (($waitedpid = waitpid(-1,WNOHANG)) > 0) {
1071             logmsg "reaped $waitedpid" . ($? ? " with exit $?" : '');
1072         }
1073         $SIG{CHLD} = \&REAPER;  # loathe sysV
1074     }
1075
1076     $SIG{CHLD} = \&REAPER;
1077
1078
1079     for ( $waitedpid = 0;
1080           accept(Client,Server) || $waitedpid;
1081           $waitedpid = 0, close Client)
1082     {
1083         next if $waitedpid;
1084         logmsg "connection on $NAME";
1085         spawn sub {
1086             print "Hello there, it's now ", scalar localtime, "\n";
1087             exec '/usr/games/fortune' or die "can't exec fortune: $!";
1088         };
1089     }
1090
1091     sub spawn {
1092         my $coderef = shift;
1093
1094         unless (@_ == 0 && $coderef && ref($coderef) eq 'CODE') {
1095             confess "usage: spawn CODEREF";
1096         }
1097
1098         my $pid;
1099         if (!defined($pid = fork)) {
1100             logmsg "cannot fork: $!";
1101             return;
1102         } elsif ($pid) {
1103             logmsg "begat $pid";
1104             return; # I'm the parent
1105         }
1106         # else I'm the child -- go spawn
1107
1108         open(STDIN,  "<&Client")   || die "can't dup client to stdin";
1109         open(STDOUT, ">&Client")   || die "can't dup client to stdout";
1110         ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
1111         exit &$coderef();
1112     }
1113
As you see, it's remarkably similar to the Internet domain TCP server, so
much so, in fact, that the spawn(), logmsg(), and REAPER() functions are
exactly the same as in the other server.
1118
1119 So why would you ever want to use a Unix domain socket instead of a
1120 simpler named pipe?  Because a named pipe doesn't give you sessions.  You
1121 can't tell one process's data from another's.  With socket programming,
1122 you get a separate session for each client: that's why accept() takes two
1123 arguments.
1124
1125 For example, let's say that you have a long running database server daemon
1126 that you want folks from the World Wide Web to be able to access, but only
1127 if they go through a CGI interface.  You'd have a small, simple CGI
1128 program that does whatever checks and logging you feel like, and then acts
1129 as a Unix-domain client and connects to your private server.
1130
1131 =head1 TCP Clients with IO::Socket
1132
1133 For those preferring a higher-level interface to socket programming, the
1134 IO::Socket module provides an object-oriented approach.  IO::Socket is
1135 included as part of the standard Perl distribution as of the 5.004
1136 release.  If you're running an earlier version of Perl, just fetch
1137 IO::Socket from CPAN, where you'll also find modules providing easy
1138 interfaces to the following systems: DNS, FTP, Ident (RFC 931), NIS and
1139 NISPlus, NNTP, Ping, POP3, SMTP, SNMP, SSLeay, Telnet, and Time--just
1140 to name a few.
1141
1142 =head2 A Simple Client
1143
1144 Here's a client that creates a TCP connection to the "daytime"
1145 service at port 13 of the host name "localhost" and prints out everything
1146 that the server there cares to provide.
1147
1148     #!/usr/bin/perl -w
1149     use IO::Socket;
1150     $remote = IO::Socket::INET->new(
1151                         Proto    => "tcp",
1152                         PeerAddr => "localhost",
1153                         PeerPort => "daytime(13)",
1154                     )
1155                   or die "cannot connect to daytime port at localhost";
1156     while ( <$remote> ) { print }
1157
1158 When you run this program, you should get something back that
1159 looks like this:
1160
1161     Wed May 14 08:40:46 MDT 1997
1162
1163 Here are what those parameters to the C<new> constructor mean:
1164
1165 =over 4
1166
1167 =item C<Proto>
1168
This is which protocol to use.  In this case, the socket handle returned
will be connected to a TCP socket, because we want a stream-oriented
connection, that is, one that acts pretty much like a plain old file.
Not all sockets are of this type.  For example, the UDP protocol
can be used to make a datagram socket, used for message-passing.
1174
1175 =item C<PeerAddr>
1176
This is the name or Internet address of the remote host the server is
running on.  We could have specified a longer name like C<"www.perl.com">,
or an address like C<"204.148.40.9">.  For demonstration purposes, we've
used the special hostname C<"localhost">, which should always mean the
current machine you're running on.  The corresponding Internet address
for localhost is C<"127.0.0.1">, if you'd rather use that.
1183
1184 =item C<PeerPort>
1185
This is the service name or port number we'd like to connect to.
We could have gotten away with using just C<"daytime"> on systems with a
well-configured system services file (F</etc/services> under Unix),
but just in case, we've specified the
port number (13) in parentheses.  Using just the number would also have
worked, but constant numbers make careful programmers nervous.
1192
1193 =back
1194
1195 Notice how the return value from the C<new> constructor is used as
1196 a filehandle in the C<while> loop?  That's what's called an indirect
1197 filehandle, a scalar variable containing a filehandle.  You can use
1198 it the same way you would a normal filehandle.  For example, you
1199 can read one line from it this way:
1200
1201     $line = <$handle>;
1202
all remaining lines from it this way:
1204
1205     @lines = <$handle>;
1206
1207 and send a line of data to it this way:
1208
1209     print $handle "some data\n";
1210
1211 =head2 A Webget Client
1212
1213 Here's a simple client that takes a remote host to fetch a document
1214 from, and then a list of documents to get from that host.  This is a
1215 more interesting client than the previous one because it first sends
1216 something to the server before fetching the server's response.
1217
1218     #!/usr/bin/perl -w
1219     use IO::Socket;
1220     unless (@ARGV > 1) { die "usage: $0 host document ..." }
1221     $host = shift(@ARGV);
1222     $EOL = "\015\012";
1223     $BLANK = $EOL x 2;
1224     foreach $document ( @ARGV ) {
1225         $remote = IO::Socket::INET->new( Proto     => "tcp",
1226                                          PeerAddr  => $host,
1227                                          PeerPort  => "http(80)",
1228                                         );
1229         unless ($remote) { die "cannot connect to http daemon on $host" }
1230         $remote->autoflush(1);
1231         print $remote "GET $document HTTP/1.0" . $BLANK;
1232         while ( <$remote> ) { print }
1233         close $remote;
1234     }
1235
The web server handling the "http" service is assumed to be at
its standard port, number 80.  If the web server you're trying to
connect to is at a different port (like 1080 or 8080), you should specify
it as the named-parameter pair, C<< PeerPort => 8080 >>.  The C<autoflush>
method is used on the socket because otherwise the system would buffer
up the output we sent it.  (If you're on a Mac, you'll also need to
change every C<"\n"> in your code that sends data over the network to
be a C<"\015\012"> instead.)
1244
1245 Connecting to the server is only the first part of the process: once you
1246 have the connection, you have to use the server's language.  Each server
1247 on the network has its own little command language that it expects as
1248 input.  The string that we send to the server starting with "GET" is in
1249 HTTP syntax.  In this case, we simply request each specified document.
1250 Yes, we really are making a new connection for each document, even though
1251 it's the same host.  That's the way you always used to have to speak HTTP.
1252 Recent versions of web browsers may request that the remote server leave
1253 the connection open a little while, but the server doesn't have to honor
1254 such a request.
1255
1256 Here's an example of running that program, which we'll call I<webget>:
1257
1258     % webget www.perl.com /guanaco.html
1259     HTTP/1.1 404 File Not Found
1260     Date: Thu, 08 May 1997 18:02:32 GMT
1261     Server: Apache/1.2b6
1262     Connection: close
1263     Content-type: text/html
1264
1265     <HEAD><TITLE>404 File Not Found</TITLE></HEAD>
1266     <BODY><H1>File Not Found</H1>
1267     The requested URL /guanaco.html was not found on this server.<P>
1268     </BODY>
1269
1270 Ok, so that's not very interesting, because it didn't find that
1271 particular document.  But a long response wouldn't have fit on this page.
1272
1273 For a more fully-featured version of this program, you should look to
1274 the I<lwp-request> program included with the LWP modules from CPAN.
1275
1276 =head2 Interactive Client with IO::Socket
1277
1278 Well, that's all fine if you want to send one command and get one answer,
1279 but what about setting up something fully interactive, somewhat like
1280 the way I<telnet> works?  That way you can type a line, get the answer,
1281 type a line, get the answer, etc.
1282
This client is more complicated than the two we've done so far, but if
you're on a system that supports the powerful C<fork> call, the solution
isn't that rough.  Once you've made the connection to whatever service
you'd like to chat with, call C<fork> to clone your process.  Each of
these two identical processes has a very simple job to do: the parent
copies everything from the socket to standard output, while the child
simultaneously copies everything from standard input to the socket.
To accomplish the same thing using just one process would be I<much>
harder, because it's easier to code two processes to do one thing than it
is to code one process to do two things.  (This keep-it-simple principle
is one of the cornerstones of the Unix philosophy, and of good software
engineering as well, which is probably why it's spread to other systems.)
1295
1296 Here's the code:
1297
1298     #!/usr/bin/perl -w
1299     use strict;
1300     use IO::Socket;
1301     my ($host, $port, $kidpid, $handle, $line);
1302
1303     unless (@ARGV == 2) { die "usage: $0 host port" }
1304     ($host, $port) = @ARGV;
1305
1306     # create a tcp connection to the specified host and port
1307     $handle = IO::Socket::INET->new(Proto     => "tcp",
1308                                     PeerAddr  => $host,
1309                                     PeerPort  => $port)
1310            or die "can't connect to port $port on $host: $!";
1311
1312     $handle->autoflush(1);              # so output gets there right away
1313     print STDERR "[Connected to $host:$port]\n";
1314
1315     # split the program into two processes, identical twins
1316     die "can't fork: $!" unless defined($kidpid = fork());
1317
1318     # the if{} block runs only in the parent process
1319     if ($kidpid) {
1320         # copy the socket to standard output
1321         while (defined ($line = <$handle>)) {
1322             print STDOUT $line;
1323         }
1324         kill("TERM", $kidpid);                  # send SIGTERM to child
1325     }
1326     # the else{} block runs only in the child process
1327     else {
1328         # copy standard input to the socket
1329         while (defined ($line = <STDIN>)) {
1330             print $handle $line;
1331         }
1332     }
1333
The C<kill> function in the parent's C<if> block is there to send a
signal to our child process (currently running in the C<else> block)
as soon as the remote server has closed its end of the connection.
1337
If the remote server sends data a byte at a time, and you need that
data immediately without waiting for a newline (which might not happen),
you may wish to replace the C<while> loop in the parent with the
following:
1342
1343     my $byte;
1344     while (sysread($handle, $byte, 1) == 1) {
1345         print STDOUT $byte;
1346     }
1347
1348 Making a system call for each byte you want to read is not very efficient
1349 (to put it mildly) but is the simplest to explain and works reasonably
1350 well.
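
If the per-byte overhead matters, one possible alternative is to ask
sysread() for a larger chunk.  It returns as soon as any data is available,
so a prompt without a newline should still show up promptly.  The sketch
below makes some assumptions: copy_stream() is a hypothetical helper, the
4096-byte buffer size is arbitrary, and an interrupted system call is
simply treated as fatal.

```perl
use strict;
use warnings;

# Hypothetical helper: copy everything from $in to $out using
# unbuffered reads and writes.  Returns the number of bytes copied.
sub copy_stream {
    my ($in, $out) = @_;
    my $total = 0;
    while (1) {
        my $got = sysread($in, my $buf, 4096);
        die "sysread: $!" unless defined $got;
        last if $got == 0;                      # EOF
        my $off = 0;
        while ($off < $got) {                   # cope with short writes
            my $put = syswrite($out, $buf, $got - $off, $off);
            die "syswrite: $!" unless defined $put;
            $off += $put;
        }
        $total += $got;
    }
    return $total;
}
```

In the interactive client, the parent's loop would then become
C<copy_stream($handle, \*STDOUT)>.  Since syswrite() bypasses Perl's
buffering, avoid mixing it with print() on the same handle.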
1351
1352 =head1 TCP Servers with IO::Socket
1353
As always, setting up a server is a little bit more involved than running a client.
The model is that the server creates a special kind of socket that
does nothing but listen on a particular port for incoming connections.
It does this by calling the C<< IO::Socket::INET->new() >> method with
slightly different arguments than the client did.
1359
1360 =over 4
1361
1362 =item Proto
1363
1364 This is which protocol to use.  Like our clients, we'll
1365 still specify C<"tcp"> here.
1366
1367 =item LocalPort
1368
We specify a local
port in the C<LocalPort> argument, which we didn't do for the client.
This is the service name or port number for which you want to be the
server. (Under Unix, ports under 1024 are restricted to the
superuser.)  In our sample, we'll use port 9000, but you can use
any port that's not currently in use on your system.  If you try
to use one already in use, you'll get an "Address already in use"
message.  Under Unix, the C<netstat -a> command will show
which services currently have servers.
1378
1379 =item Listen
1380
1381 The C<Listen> parameter is set to the maximum number of
1382 pending connections we can accept until we turn away incoming clients.
1383 Think of it as a call-waiting queue for your telephone.
1384 The low-level Socket module has a special symbol for the system maximum, which
1385 is SOMAXCONN.
1386
1387 =item Reuse
1388
The C<Reuse> parameter is needed so that we can restart our server
manually without waiting a few minutes to allow system buffers to
clear out.
1392
1393 =back
1394
1395 Once the generic server socket has been created using the parameters
1396 listed above, the server then waits for a new client to connect
1397 to it.  The server blocks in the C<accept> method, which eventually accepts a
1398 bidirectional connection from the remote client.  (Make sure to autoflush
1399 this handle to circumvent buffering.)
1400
1401 To add to user-friendliness, our server prompts the user for commands.
1402 Most servers don't do this.  Because of the prompt without a newline,
1403 you'll have to use the C<sysread> variant of the interactive client above.
1404
1405 This server accepts one of five different commands, sending output
1406 back to the client.  Note that unlike most network servers, this one
1407 only handles one incoming client at a time.  Multithreaded servers are
1408 covered in Chapter 6 of the Camel.
1409
Here's the code.
1411
1412  #!/usr/bin/perl -w
1413  use IO::Socket;
1414  use Net::hostent;              # for OO version of gethostbyaddr
1415
1416  $PORT = 9000;                  # pick something not in use
1417
1418  $server = IO::Socket::INET->new( Proto     => 'tcp',
1419                                   LocalPort => $PORT,
1420                                   Listen    => SOMAXCONN,
1421                                   Reuse     => 1);
1422
1423  die "can't setup server" unless $server;
1424  print "[Server $0 accepting clients]\n";
1425
1426  while ($client = $server->accept()) {
1427    $client->autoflush(1);
1428    print $client "Welcome to $0; type help for command list.\n";
1429    $hostinfo = gethostbyaddr($client->peeraddr);
1430    printf "[Connect from %s]\n", $hostinfo ? $hostinfo->name : $client->peerhost;
1431    print $client "Command? ";
1432    while ( <$client>) {
1433      next unless /\S/;       # blank line
1434      if    (/quit|exit/i)    { last;                                     }
1435      elsif (/date|time/i)    { printf $client "%s\n", scalar localtime;  }
1436      elsif (/who/i )         { print  $client `who 2>&1`;                }
1437      elsif (/cookie/i )      { print  $client `/usr/games/fortune 2>&1`; }
1438      elsif (/motd/i )        { print  $client `cat /etc/motd 2>&1`;      }
1439      else {
1440        print $client "Commands: quit date who cookie motd\n";
1441      }
1442    } continue {
1443       print $client "Command? ";
1444    }
1445    close $client;
1446  }
1447
1448 =head1 UDP: Message Passing
1449
1450 Another kind of client-server setup is one that uses not connections, but
1451 messages.  UDP communications involve much lower overhead but also provide
1452 less reliability, as there are no promises that messages will arrive at
1453 all, let alone in order and unmangled.  Still, UDP offers some advantages
1454 over TCP, including being able to "broadcast" or "multicast" to a whole
1455 bunch of destination hosts at once (usually on your local subnet).  If you
1456 find yourself overly concerned about reliability and start building checks
1457 into your message system, then you probably should use just TCP to start
1458 with.
1459
1460 Note that UDP datagrams are I<not> a bytestream and should not be treated
1461 as such. This makes using I/O mechanisms with internal buffering
1462 like stdio (i.e. print() and friends) especially cumbersome. Use syswrite(),
1463 or better send(), like in the example below.
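
To see what "not a bytestream" means in practice, consider this sketch.
As a stand-in for UDP it uses a pair of Unix-domain datagram sockets made
with socketpair(), which is an assumption made so that the example needs no
network; roundtrip() is a hypothetical helper.  Each send() produces one
datagram, and each recv() hands back exactly one of them.

```perl
use strict;
use warnings;
use Socket;

# Stand-in for UDP: a connected pair of Unix-domain datagram sockets.
socketpair(my $left, my $right, AF_UNIX, SOCK_DGRAM, PF_UNSPEC)
    or die "socketpair: $!";

# Hypothetical helper: send each message as its own datagram, then
# read them back with one recv() per message.
sub roundtrip {
    my ($out, $in, @messages) = @_;
    for my $msg (@messages) {
        defined(send($out, $msg, 0)) or die "send: $!";
    }
    my @received;
    for (@messages) {
        defined(recv($in, my $buf, 1024, 0)) or die "recv: $!";
        push @received, $buf;
    }
    return @received;
}

my @got = roundtrip($left, $right, "one", "two", "three");
print scalar(@got), " datagrams: @got\n";
```

A local socketpair does not lose or reorder messages the way a real
network can, so treat this as a demonstration of message boundaries only.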

Here's a UDP program similar to the sample Internet TCP client given
earlier.  However, instead of checking one host at a time, the UDP version
will check many of them asynchronously by simulating a multicast and then
using select() to do a timed-out wait for I/O.  To do something similar
with TCP, you'd have to use a different socket handle for each host.

    #!/usr/bin/perl -w
    use strict;
    use Socket;
    use Sys::Hostname;

    my ( $count, $hisiaddr, $hispaddr, $histime,
         $host, $iaddr, $paddr, $port, $proto,
         $rin, $rout, $rtime, $SECS_of_70_YEARS);

    $SECS_of_70_YEARS      = 2208988800;

    $iaddr = gethostbyname(hostname());
    $proto = getprotobyname('udp');
    $port = getservbyname('time', 'udp');
    $paddr = sockaddr_in(0, $iaddr); # 0 means let kernel pick

    socket(SOCKET, PF_INET, SOCK_DGRAM, $proto)   || die "socket: $!";
    bind(SOCKET, $paddr)                          || die "bind: $!";

    $| = 1;
    printf "%-12s %8s %s\n",  "localhost", 0, scalar localtime time;
    $count = 0;
    for $host (@ARGV) {
        $count++;
        $hisiaddr = inet_aton($host)    || die "unknown host";
        $hispaddr = sockaddr_in($port, $hisiaddr);
        defined(send(SOCKET, 0, 0, $hispaddr))    || die "send $host: $!";
    }

    $rin = '';
    vec($rin, fileno(SOCKET), 1) = 1;

    # timeout after 10.0 seconds
    while ($count && select($rout = $rin, undef, undef, 10.0)) {
        $rtime = '';
        ($hispaddr = recv(SOCKET, $rtime, 4, 0))        || die "recv: $!";
        ($port, $hisiaddr) = sockaddr_in($hispaddr);
        $host = gethostbyaddr($hisiaddr, AF_INET);
        $histime = unpack("N", $rtime) - $SECS_of_70_YEARS ;
        printf "%-12s ", $host;
        printf "%8d %s\n", $histime - time, scalar localtime($histime);
        $count--;
    }

Note that this example does not include any retries and may consequently
fail to contact a reachable host.  The most prominent reason for this
is congestion of the queues on the sending host if the number of
hosts to contact is sufficiently large.
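
One way to add retries, sketched here for a single destination rather than
the whole host list, is to resend the request whenever select() times out.
The helper name query_with_retries() and its parameters are assumptions made
for this illustration, not part of the program above.  So that the sketch
runs without a remote time server, the demonstration socket sends to its own
loopback address and reads its own request back as the "reply".

```perl
use strict;
use warnings;
use Socket;

# Hypothetical helper: send $request to $dest up to $tries times, waiting
# $timeout seconds for a reply after each send.  Returns the reply, or an
# empty list if every attempt timed out.
sub query_with_retries {
    my ($sock, $dest, $request, $tries, $timeout) = @_;
    for my $attempt (1 .. $tries) {
        defined(send($sock, $request, 0, $dest)) || die "send: $!";

        my $rin = '';
        vec($rin, fileno($sock), 1) = 1;
        # Nothing readable yet: fall through to the next attempt.
        next unless select(my $rout = $rin, undef, undef, $timeout) > 0;

        my $reply = '';
        defined(recv($sock, $reply, 1024, 0))    || die "recv: $!";
        return $reply;
    }
    return;
}

# Demonstration: a loopback socket that talks to itself.
socket(my $sock, PF_INET, SOCK_DGRAM, getprotobyname('udp')) || die "socket: $!";
bind($sock, pack_sockaddr_in(0, INADDR_LOOPBACK))            || die "bind: $!";
my $self_addr = getsockname($sock)                           || die "getsockname: $!";

my $reply = query_with_retries($sock, $self_addr, "ping", 3, 1.0);
print defined $reply ? "got '$reply'\n" : "no reply\n";
```

A late answer to an earlier attempt looks the same as an answer to the
latest one, so a protocol that cares about the difference needs some kind of
request identifier in the message itself.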

=head1 SysV IPC

While System V IPC isn't as widely used as sockets, it still has some
interesting uses.  You can't, however, effectively use SysV IPC or
Berkeley mmap() to have shared memory so as to share a variable amongst
several processes.  That's because Perl would reallocate your string when
you weren't wanting it to.

Here's a small example showing shared memory usage.

    use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRWXU);

    $size = 2000;
    $id = shmget(IPC_PRIVATE, $size, S_IRWXU);
    defined($id) || die "shmget: $!";   # an id of 0 is valid
    print "shm key $id\n";

    $message = "Message #1";
    shmwrite($id, $message, 0, 60) || die "$!";
    print "wrote: '$message'\n";
    shmread($id, $buff, 0, 60) || die "$!";
    print "read : '$buff'\n";

    # the buffer of shmread is zero-character end-padded.
    substr($buff, index($buff, "\0")) = '';
    print "un" unless $buff eq $message;
    print "swell\n";

    print "deleting shm $id\n";
    shmctl($id, IPC_RMID, 0) || die "$!";
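
Since a Perl variable itself cannot live in the segment, one workable
pattern is to keep a fixed-width packed value there and copy it in and out
with shmread() and shmwrite().  The sketch below, which assumes a system
with SysV shared memory enabled, stores a 32-bit counter that way.  The
helper name shm_counter_demo() is hypothetical, and the code does no
locking, so real concurrent use would need a semaphore around the
read-modify-write step.

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRWXU);

# Hypothetical helper: store a 32-bit counter at offset 0 of a private
# segment, increment it via read/unpack/pack/write, and return the result.
sub shm_counter_demo {
    my ($start) = @_;
    my $width = length pack("N", 0);    # fixed record width: 4 bytes

    my $id = shmget(IPC_PRIVATE, $width, S_IRWXU);
    defined($id) || die "shmget: $!";

    shmwrite($id, pack("N", $start), 0, $width) || die "shmwrite: $!";

    # Read-modify-write: copy out, change the Perl copy, copy back in.
    my $raw = '';
    shmread($id, $raw, 0, $width)               || die "shmread: $!";
    my $value = unpack("N", $raw) + 1;
    shmwrite($id, pack("N", $value), 0, $width) || die "shmwrite: $!";

    shmread($id, $raw, 0, $width)               || die "shmread: $!";
    shmctl($id, IPC_RMID, 0)                    || die "shmctl: $!";
    return unpack("N", $raw);
}

print shm_counter_demo(41), "\n";    # prints 42
```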

Here's an example of a semaphore:

    use IPC::SysV qw(IPC_CREAT);

    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 10, 0666 | IPC_CREAT );
    defined($id) || die "semget: $!";
    print "sem id $id\n";

Put this code in a separate file to be run in more than one process.
Call the file F<take>:

    # look up the semaphore set created above

    $IPC_KEY = 1234;
    $id = semget($IPC_KEY,  0 , 0 );
    defined($id) || die "semget: $!";

    $semnum = 0;
    $semflag = 0;

    # 'take' semaphore
    # wait for semaphore to be zero
    $semop = 0;
    $opstring1 = pack("s!s!s!", $semnum, $semop, $semflag);

    # Increment the semaphore count
    $semop = 1;
    $opstring2 = pack("s!s!s!", $semnum, $semop,  $semflag);
    $opstring = $opstring1 . $opstring2;

    semop($id,$opstring) || die "$!";

Put this code in a separate file to be run in more than one process.
Call this file F<give>:

    # 'give' the semaphore
    # run this in the original process and you will see
    # that the second process continues

    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 0, 0);
    defined($id) || die "semget: $!";

    $semnum = 0;
    $semflag = 0;

    # Decrement the semaphore count
    $semop = -1;
    $opstring = pack("s!s!s!", $semnum, $semop, $semflag);

    semop($id,$opstring) || die "$!";

The SysV IPC code above was written long ago, and it's definitely
clunky looking.  For a more modern look, see the IPC::SysV module
which is included with Perl starting from Perl 5.005.
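
A minimal sketch of that more modern look, assuming the IPC::Semaphore
class that ships in the same distribution as IPC::SysV, performs the same
take and give steps in a single process.  The helper name semaphore_demo()
is hypothetical, and the sketch uses a private semaphore set rather than the
shared key 1234 so that it cleans up after itself.

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT S_IRWXU);
use IPC::Semaphore;

# Hypothetical helper: take and then give one semaphore, returning its
# value after each step.
sub semaphore_demo {
    my $sem = IPC::Semaphore->new(IPC_PRIVATE, 1, S_IRWXU | IPC_CREAT)
        or die "semget: $!";

    # Start from a known value instead of relying on the system default.
    $sem->setval(0, 0) or die "setval: $!";

    # 'take': wait for zero, then increment.  Both operations are passed
    # to one semop() call, as (semnum, op, flags) triples.
    $sem->op(0, 0, 0,
             0, 1, 0) or die "take: $!";
    my $held = $sem->getval(0);

    # 'give': decrement the count again.
    $sem->op(0, -1, 0) or die "give: $!";
    my $released = $sem->getval(0);

    $sem->remove;
    return ($held, $released);
}

my ($held, $released) = semaphore_demo();
print "after take: $held, after give: $released\n";
```

The object hides the pack("s!s!s!", ...) step: op() takes the same
triples as plain numbers and builds the operation string itself.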

A small example demonstrating SysV message queues:

    use IPC::SysV qw(IPC_PRIVATE IPC_RMID IPC_CREAT S_IRWXU);

    my $id = msgget(IPC_PRIVATE, IPC_CREAT | S_IRWXU);

    my $sent = "message";
    my $type_sent = 1234;
    my $rcvd;
    my $type_rcvd;

    if (defined $id) {
        if (msgsnd($id, pack("l! a*", $type_sent, $sent), 0)) {
            if (msgrcv($id, $rcvd, 60, 0, 0)) {
                ($type_rcvd, $rcvd) = unpack("l! a*", $rcvd);
                if ($rcvd eq $sent) {
                    print "okay\n";
                } else {
                    print "not okay\n";
                }
            } else {
                die "# msgrcv failed\n";
            }
        } else {
            die "# msgsnd failed\n";
        }
        msgctl($id, IPC_RMID, 0) || die "# msgctl failed: $!\n";
    } else {
        die "# msgget failed\n";
    }
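
The same exchange can be sketched with the IPC::Msg class, which is
distributed alongside IPC::SysV and does the "l! a*" packing and unpacking
internally.  The helper name msg_roundtrip() and the 256-byte receive limit
are assumptions made for this illustration.

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT S_IRWXU);
use IPC::Msg;

# Hypothetical helper: send one typed message through a private queue and
# return the type and text that come back.
sub msg_roundtrip {
    my ($type_sent, $sent) = @_;

    my $queue = IPC::Msg->new(IPC_PRIVATE, IPC_CREAT | S_IRWXU)
        or die "msgget: $!";

    $queue->snd($type_sent, $sent) or die "msgsnd: $!";

    # Type 0 asks for the first message on the queue, whatever its type.
    my $rcvd = '';
    my $type_rcvd = $queue->rcv($rcvd, 256, 0);

    $queue->remove;    # remove the queue even if rcv() failed
    defined($type_rcvd) or die "msgrcv failed";
    return ($type_rcvd, $rcvd);
}

my ($type, $text) = msg_roundtrip(1234, "message");
print "type $type: $text\n";
```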

=head1 NOTES

Most of these routines quietly but politely return C<undef> when they
fail instead of causing your program to die right then and there due to
an uncaught exception.  (Actually, some of the new I<Socket> conversion
functions croak() on bad arguments.)  It is therefore essential to
check return values from these functions.  Always begin your socket
programs this way for optimal success, and don't forget to add the B<-T>
taint-checking flag to the #! line for servers:

    #!/usr/bin/perl -Tw
    use strict;
    use sigtrap;
    use Socket;

=head1 BUGS

All these routines create system-specific portability problems.  As noted
elsewhere, Perl is at the mercy of your C libraries for much of its system
behaviour.  It's probably safest to assume broken SysV semantics for
signals and to stick with simple TCP and UDP socket operations; e.g., don't
try to pass open file descriptors over a local UDP datagram socket if you
want your code to stand a chance of being portable.

As mentioned in the signals section, because few vendors provide C
libraries that are safely re-entrant, the prudent programmer will do
little else within a handler beyond setting a numeric variable that
already exists; or, if locked into a slow (restarting) system call,
using die() to raise an exception and longjmp(3) out.  In fact, even
these may in some cases cause a core dump.  It's probably best to avoid
signals except where they are absolutely inevitable.  Perl 5.7.3 and
later defer signal delivery to make handlers safer; see
L</Deferred Signals (Safe Signals)>.
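
The "set a numeric variable that already exists" advice can be sketched as
follows, assuming a Unix-like system that provides SIGUSR1.  The names
count_usr1() and signal_self_once() are hypothetical.  The handler only
increments a counter; the waiting and reporting happen outside it.

```perl
use strict;
use warnings;

# The counter exists before the handler is installed, so the handler
# never has to create a new variable.
my $usr1_count = 0;

sub count_usr1 { $usr1_count++ }

# Hypothetical helper: send ourselves one SIGUSR1 and wait briefly for the
# handler to run.  Returns how many signals were counted during the call.
sub signal_self_once {
    my $before = $usr1_count;
    local $SIG{USR1} = \&count_usr1;

    kill('USR1', $$) || die "kill: $!";

    # Poll in short sleeps for up to about two seconds.
    for (1 .. 40) {
        last if $usr1_count > $before;
        select(undef, undef, undef, 0.05);
    }
    return $usr1_count - $before;
}

print "handled ", signal_self_once(), " signal(s)\n";
```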

=head1 AUTHOR

Tom Christiansen, with occasional vestiges of Larry Wall's original
version and suggestions from the Perl Porters.

=head1 SEE ALSO

There's a lot more to networking than this, but this should get you
started.

For intrepid programmers, the indispensable textbook is I<Unix
Network Programming, 2nd Edition, Volume 1> by W. Richard Stevens
(published by Prentice-Hall).  Note that most books on networking
address the subject from the perspective of a C programmer; translation
to Perl is left as an exercise for the reader.

The IO::Socket(3) manpage describes the object library, and the Socket(3)
manpage describes the low-level interface to sockets.  Besides the obvious
functions in L<perlfunc>, you should also check out the F<modules> file
at your nearest CPAN site.  (See L<perlmodlib> or, best yet, the F<Perl
FAQ> for a description of what CPAN is and where to get it.)

Section 5 of the F<modules> file is devoted to "Networking, Device Control
(modems), and Interprocess Communication", and contains numerous unbundled
modules for networking, Chat and Expect operations, CGI
programming, DCE, FTP, IPC, NNTP, Proxy, Ptty, RPC, SNMP, SMTP, Telnet,
Threads, and ToolTalk--just to name a few.