source: trunk/third/perl/pod/perlipc.pod @ 20075

Revision 20075, 61.3 KB checked in by zacheiss, 21 years ago (diff)
This commit was generated by cvs2svn to compensate for changes in r20074, which included commits to RCS files with non-trunk default branches.
Line 
1=head1 NAME
2
3perlipc - Perl interprocess communication (signals, fifos, pipes, safe subprocesses, sockets, and semaphores)
4
5=head1 DESCRIPTION
6
7The basic IPC facilities of Perl are built out of the good old Unix
8signals, named pipes, pipe opens, the Berkeley socket routines, and SysV
9IPC calls.  Each is used in slightly different situations.
10
11=head1 Signals
12
13Perl uses a simple signal handling model: the %SIG hash contains names
14or references of user-installed signal handlers.  These handlers will
15be called with an argument which is the name of the signal that
16triggered it.  A signal may be generated intentionally from a
17particular keyboard sequence like control-C or control-Z, sent to you
18from another process, or triggered automatically by the kernel when
19special events transpire, like a child process exiting, your process
20running out of stack space, or hitting file size limit.
21
22For example, to trap an interrupt signal, set up a handler like this:
23
24    sub catch_zap {
25        my $signame = shift;
26        $shucks++;
27        die "Somebody sent me a SIG$signame";
28    }
29    $SIG{INT} = 'catch_zap';  # could fail in modules
30    $SIG{INT} = \&catch_zap;  # best strategy
31
32Prior to Perl 5.7.3 it was necessary to do as little as you possibly
33could in your handler; notice how all we do is set a global variable
34and then raise an exception.  That's because on most systems,
35libraries are not re-entrant; particularly, memory allocation and I/O
36routines are not.  That meant that doing nearly I<anything> in your
37handler could in theory trigger a memory fault and subsequent core
38dump - see L</Deferred Signals (Safe Signals)> below.
39
40The names of the signals are the ones listed out by C<kill -l> on your
41system, or you can retrieve them from the Config module.  Set up an
42@signame list indexed by number to get the name and a %signo table
43indexed by name to get the number:
44
45    use Config;
46    defined $Config{sig_name} || die "No sigs?";
47    foreach $name (split(' ', $Config{sig_name})) {
48        $signo{$name} = $i;
49        $signame[$i] = $name;
50        $i++;
51    }
52
53So to check whether signal 17 and SIGALRM were the same, do just this:
54
55    print "signal #17 = $signame[17]\n";
56    if ($signo{ALRM}) {
57        print "SIGALRM is $signo{ALRM}\n";
58    }
59
60You may also choose to assign the strings C<'IGNORE'> or C<'DEFAULT'> as
61the handler, in which case Perl will try to discard the signal or do the
62default thing.
63
64On most Unix platforms, the C<CHLD> (sometimes also known as C<CLD>) signal
65has special behavior with respect to a value of C<'IGNORE'>.
66Setting C<$SIG{CHLD}> to C<'IGNORE'> on such a platform has the effect of
67not creating zombie processes when the parent process fails to C<wait()>
68on its child processes (i.e. child processes are automatically reaped).
69Calling C<wait()> with C<$SIG{CHLD}> set to C<'IGNORE'> usually returns
70C<-1> on such platforms.
71
72Some signals can be neither trapped nor ignored, such as
73the KILL and STOP (but not the TSTP) signals.  One strategy for
74temporarily ignoring signals is to use a local() statement, which will be
75automatically restored once your block is exited.  (Remember that local()
76values are "inherited" by functions called from within that block.)
77
78    sub precious {
79        local $SIG{INT} = 'IGNORE';
80        &more_functions;
81    }
82    sub more_functions {
83        # interrupts still ignored, for now...
84    }
85
86Sending a signal to a negative process ID means that you send the signal
87to the entire Unix process-group.  This code sends a hang-up signal to all
88processes in the current process group (and sets $SIG{HUP} to IGNORE so
89it doesn't kill itself):
90
91    {
92        local $SIG{HUP} = 'IGNORE';
93        kill HUP => -$$;
94        # snazzy writing of: kill('HUP', -$$)
95    }
96
97Another interesting signal to send is signal number zero.  This doesn't
98actually affect a child process, but instead checks whether it's alive
99or has changed its UID.
100
101    unless (kill 0 => $kid_pid) {
102        warn "something wicked happened to $kid_pid";
103    }
104
105When directed at a process whose UID is not identical to that
106of the sending process, signal number zero may fail because
107you lack permission to send the signal, even though the process is alive.
108You may be able to determine the cause of failure using C<%!>.
109
110    unless (kill 0 => $pid or $!{EPERM}) {
111        warn "$pid looks dead";
112    }
113
114You might also want to employ anonymous functions for simple signal
115handlers:
116
117    $SIG{INT} = sub { die "\nOutta here!\n" };
118
119But that will be problematic for the more complicated handlers that need
120to reinstall themselves.  Because Perl's signal mechanism is currently
121based on the signal(3) function from the C library, you may sometimes be so
122misfortunate as to run on systems where that function is "broken", that
123is, it behaves in the old unreliable SysV way rather than the newer, more
124reasonable BSD and POSIX fashion.  So you'll see defensive people writing
125signal handlers like this:
126
127    sub REAPER {
128        $waitedpid = wait;
129        # loathe sysV: it makes us not only reinstate
130        # the handler, but place it after the wait
131        $SIG{CHLD} = \&REAPER;
132    }
133    $SIG{CHLD} = \&REAPER;
134    # now do something that forks...
135
136or better still:
137
138    use POSIX ":sys_wait_h";
139    sub REAPER {
140        my $child;
141        # If a second child dies while in the signal handler caused by the
142        # first death, we won't get another signal. So must loop here else
143        # we will leave the unreaped child as a zombie. And the next time
144        # two children die we get another zombie. And so on.
145        while (($child = waitpid(-1,WNOHANG)) > 0) {
146            $Kid_Status{$child} = $?;
147        }
148        $SIG{CHLD} = \&REAPER;  # still loathe sysV
149    }
150    $SIG{CHLD} = \&REAPER;
151    # do something that forks...
152
153Signal handling is also used for timeouts in Unix,   While safely
154protected within an C<eval{}> block, you set a signal handler to trap
155alarm signals and then schedule to have one delivered to you in some
156number of seconds.  Then try your blocking operation, clearing the alarm
157when it's done but not before you've exited your C<eval{}> block.  If it
158goes off, you'll use die() to jump out of the block, much as you might
159using longjmp() or throw() in other languages.
160
161Here's an example:
162
163    eval {
164        local $SIG{ALRM} = sub { die "alarm clock restart" };
165        alarm 10;
166        flock(FH, 2);   # blocking write lock
167        alarm 0;
168    };
169    if ($@ and $@ !~ /alarm clock restart/) { die }
170
171If the operation being timed out is system() or qx(), this technique
172is liable to generate zombies.    If this matters to you, you'll
173need to do your own fork() and exec(), and kill the errant child process.
174
175For more complex signal handling, you might see the standard POSIX
176module.  Lamentably, this is almost entirely undocumented, but
177the F<t/lib/posix.t> file from the Perl source distribution has some
178examples in it.
179
180=head2 Handling the SIGHUP Signal in Daemons
181
182A process that usually starts when the system boots and shuts down
183when the system is shut down is called a daemon (Disk And Execution
184MONitor). If a daemon process has a configuration file which is
185modified after the process has been started, there should be a way to
186tell that process to re-read its configuration file, without stopping
187the process. Many daemons provide this mechanism using the C<SIGHUP>
188signal handler. When you want to tell the daemon to re-read the file
189you simply send it the C<SIGHUP> signal.
190
191Not all platforms automatically reinstall their (native) signal
192handlers after a signal delivery.  This means that the handler works
193only the first time the signal is sent. The solution to this problem
194is to use C<POSIX> signal handlers if available, their behaviour
195is well-defined.
196
197The following example implements a simple daemon, which restarts
198itself every time the C<SIGHUP> signal is received. The actual code is
199located in the subroutine C<code()>, which simply prints some debug
200info to show that it works and should be replaced with the real code.
201
202  #!/usr/bin/perl -w
203
204  use POSIX ();
205  use FindBin ();
206  use File::Basename ();
207  use File::Spec::Functions;
208
209  $|=1;
210
211  # make the daemon cross-platform, so exec always calls the script
212  # itself with the right path, no matter how the script was invoked.
213  my $script = File::Basename::basename($0);
214  my $SELF = catfile $FindBin::Bin, $script;
215
216  # POSIX unmasks the sigprocmask properly
217  my $sigset = POSIX::SigSet->new();
218  my $action = POSIX::SigAction->new('sigHUP_handler',
219                                     $sigset,
220                                     &POSIX::SA_NODEFER);
221  POSIX::sigaction(&POSIX::SIGHUP, $action);
222
223  sub sigHUP_handler {
224      print "got SIGHUP\n";
225      exec($SELF, @ARGV) or die "Couldn't restart: $!\n";
226  }
227
228  code();
229
230  sub code {
231      print "PID: $$\n";
232      print "ARGV: @ARGV\n";
233      my $c = 0;
234      while (++$c) {
235          sleep 2;
236          print "$c\n";
237      }
238  }
239  __END__
240
241
242=head1 Named Pipes
243
244A named pipe (often referred to as a FIFO) is an old Unix IPC
245mechanism for processes communicating on the same machine.  It works
246just like a regular, connected anonymous pipes, except that the
247processes rendezvous using a filename and don't have to be related.
248
249To create a named pipe, use the Unix command mknod(1) or on some
250systems, mkfifo(1).  These may not be in your normal path.
251
252    # system return val is backwards, so && not ||
253    #
254    $ENV{PATH} .= ":/etc:/usr/etc";
255    if  (      system('mknod',  $path, 'p')
256            && system('mkfifo', $path) )
257    {
258        die "mk{nod,fifo} $path failed";
259    }
260
261
262A fifo is convenient when you want to connect a process to an unrelated
263one.  When you open a fifo, the program will block until there's something
264on the other end.
265
266For example, let's say you'd like to have your F<.signature> file be a
267named pipe that has a Perl program on the other end.  Now every time any
268program (like a mailer, news reader, finger program, etc.) tries to read
269from that file, the reading program will block and your program will
270supply the new signature.  We'll use the pipe-checking file test B<-p>
271to find out whether anyone (or anything) has accidentally removed our fifo.
272
273    chdir; # go home
274    $FIFO = '.signature';
275    $ENV{PATH} .= ":/etc:/usr/games";
276
277    while (1) {
278        unless (-p $FIFO) {
279            unlink $FIFO;
280            system('mknod', $FIFO, 'p')
281                && die "can't mknod $FIFO: $!";
282        }
283
284        # next line blocks until there's a reader
285        open (FIFO, "> $FIFO") || die "can't write $FIFO: $!";
286        print FIFO "John Smith (smith\@host.org)\n", `fortune -s`;
287        close FIFO;
288        sleep 2;    # to avoid dup signals
289    }
290
291=head2 Deferred Signals (Safe Signals)
292
293In Perls before Perl 5.7.3 by installing Perl code to deal with
294signals, you were exposing yourself to danger from two things.  First,
295few system library functions are re-entrant.  If the signal interrupts
296while Perl is executing one function (like malloc(3) or printf(3)),
297and your signal handler then calls the same function again, you could
298get unpredictable behavior--often, a core dump.  Second, Perl isn't
299itself re-entrant at the lowest levels.  If the signal interrupts Perl
300while Perl is changing its own internal data structures, similarly
301unpredictable behaviour may result.
302
303There were two things you could do, knowing this: be paranoid or be
304pragmatic.  The paranoid approach was to do as little as possible in your
305signal handler.  Set an existing integer variable that already has a
306value, and return.  This doesn't help you if you're in a slow system call,
307which will just restart.  That means you have to C<die> to longjump(3) out
308of the handler.  Even this is a little cavalier for the true paranoiac,
309who avoids C<die> in a handler because the system I<is> out to get you.
310The pragmatic approach was to say ``I know the risks, but prefer the
311convenience'', and to do anything you wanted in your signal handler,
312and be prepared to clean up core dumps now and again.
313
314In Perl 5.7.3 and later to avoid these problems signals are
315"deferred"-- that is when the signal is delivered to the process by
316the system (to the C code that implements Perl) a flag is set, and the
317handler returns immediately. Then at strategic "safe" points in the
318Perl interpreter (e.g. when it is about to execute a new opcode) the
319flags are checked and the Perl level handler from %SIG is
320executed. The "deferred" scheme allows much more flexibility in the
321coding of signal handler as we know Perl interpreter is in a safe
322state, and that we are not in a system library function when the
323handler is called.  However the implementation does differ from
324previous Perls in the following ways:
325
326=over 4
327
328=item Long running opcodes
329
330As Perl interpreter only looks at the signal flags when it about to
331execute a new opcode if a signal arrives during a long running opcode
332(e.g. a regular expression operation on a very large string) then
333signal will not be seen until operation completes.
334
335=item Interrupting IO
336
337When a signal is delivered (e.g. INT control-C) the operating system
338breaks into IO operations like C<read> (used to implement Perls
339E<lt>E<gt> operator). On older Perls the handler was called
340immediately (and as C<read> is not "unsafe" this worked well). With
341the "deferred" scheme the handler is not called immediately, and if
342Perl is using system's C<stdio> library that library may re-start the
343C<read> without returning to Perl and giving it a chance to call the
344%SIG handler. If this happens on your system the solution is to use
345C<:perlio> layer to do IO - at least on those handles which you want
346to be able to break into with signals. (The C<:perlio> layer checks
347the signal flags and calls %SIG handlers before resuming IO operation.)
348
349Note that the default in Perl 5.7.3 and later is to automatically use
350the C<:perlio> layer.
351
352Note that some networking library functions like gethostbyname() are
353known to have their own implementations of timeouts which may conflict
354with your timeouts.  If you are having problems with such functions,
355you can try using the POSIX sigaction() function, which bypasses the
356Perl safe signals (note that this means subjecting yourself to
357possible memory corruption, as described above).  Instead of setting
358C<$SIG{ALRM}> try something like the following:
359
360    use POSIX;
361    sigaction SIGALRM, new POSIX::SigAction sub { die "alarm\n" }
362        or die "Error setting SIGALRM handler: $!\n";
363
364=item Restartable system calls
365
366On systems that supported it, older versions of Perl used the
367SA_RESTART flag when installing %SIG handlers.  This meant that
368restartable system calls would continue rather than returning when
369a signal arrived.  In order to deliver deferred signals promptly,
370Perl 5.7.3 and later do I<not> use SA_RESTART.  Consequently,
371restartable system calls can fail (with $! set to C<EINTR>) in places
372where they previously would have succeeded.
373
374Note that the default C<:perlio> layer will retry C<read>, C<write>
375and C<close> as described above and that interrupted C<wait> and
376C<waitpid> calls will always be retried.
377
378=item Signals as "faults"
379
380Certain signals e.g. SEGV, ILL, BUS are generated as a result of
381virtual memory or other "faults". These are normally fatal and there
382is little a Perl-level handler can do with them. (In particular the
383old signal scheme was particularly unsafe in such cases.)  However if
384a %SIG handler is set the new scheme simply sets a flag and returns as
385described above. This may cause the operating system to try the
386offending machine instruction again and - as nothing has changed - it
387will generate the signal again. The result of this is a rather odd
388"loop". In future Perl's signal mechanism may be changed to avoid this
389- perhaps by simply disallowing %SIG handlers on signals of that
390type. Until then the work-round is not to set a %SIG handler on those
391signals. (Which signals they are is operating system dependant.)
392
393=item Signals triggered by operating system state
394
395On some operating systems certain signal handlers are supposed to "do
396something" before returning. One example can be CHLD or CLD which
397indicates a child process has completed. On some operating systems the
398signal handler is expected to C<wait> for the completed child
399process. On such systems the deferred signal scheme will not work for
400those signals (it does not do the C<wait>). Again the failure will
401look like a loop as the operating system will re-issue the signal as
402there are un-waited-for completed child processes.
403
404=back
405
406If you want the old signal behaviour back regardless of possible
407memory corruption, set the environment variable C<PERL_SIGNALS> to
408C<"unsafe"> (a new feature since Perl 5.8.1).
409
410=head1 Using open() for IPC
411
412Perl's basic open() statement can also be used for unidirectional
413interprocess communication by either appending or prepending a pipe
414symbol to the second argument to open().  Here's how to start
415something up in a child process you intend to write to:
416
417    open(SPOOLER, "| cat -v | lpr -h 2>/dev/null")
418                    || die "can't fork: $!";
419    local $SIG{PIPE} = sub { die "spooler pipe broke" };
420    print SPOOLER "stuff\n";
421    close SPOOLER || die "bad spool: $! $?";
422
423And here's how to start up a child process you intend to read from:
424
425    open(STATUS, "netstat -an 2>&1 |")
426                    || die "can't fork: $!";
427    while (<STATUS>) {
428        next if /^(tcp|udp)/;
429        print;
430    }
431    close STATUS || die "bad netstat: $! $?";
432
433If one can be sure that a particular program is a Perl script that is
434expecting filenames in @ARGV, the clever programmer can write something
435like this:
436
437    % program f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile
438
439and irrespective of which shell it's called from, the Perl program will
440read from the file F<f1>, the process F<cmd1>, standard input (F<tmpfile>
441in this case), the F<f2> file, the F<cmd2> command, and finally the F<f3>
442file.  Pretty nifty, eh?
443
444You might notice that you could use backticks for much the
445same effect as opening a pipe for reading:
446
447    print grep { !/^(tcp|udp)/ } `netstat -an 2>&1`;
448    die "bad netstat" if $?;
449
450While this is true on the surface, it's much more efficient to process the
451file one line or record at a time because then you don't have to read the
452whole thing into memory at once.  It also gives you finer control of the
453whole process, letting you to kill off the child process early if you'd
454like.
455
456Be careful to check both the open() and the close() return values.  If
457you're I<writing> to a pipe, you should also trap SIGPIPE.  Otherwise,
458think of what happens when you start up a pipe to a command that doesn't
459exist: the open() will in all likelihood succeed (it only reflects the
460fork()'s success), but then your output will fail--spectacularly.  Perl
461can't know whether the command worked because your command is actually
462running in a separate process whose exec() might have failed.  Therefore,
463while readers of bogus commands return just a quick end of file, writers
464to bogus command will trigger a signal they'd better be prepared to
465handle.  Consider:
466
467    open(FH, "|bogus")  or die "can't fork: $!";
468    print FH "bang\n"   or die "can't write: $!";
469    close FH            or die "can't close: $!";
470
471That won't blow up until the close, and it will blow up with a SIGPIPE.
472To catch it, you could use this:
473
474    $SIG{PIPE} = 'IGNORE';
475    open(FH, "|bogus")  or die "can't fork: $!";
476    print FH "bang\n"   or die "can't write: $!";
477    close FH            or die "can't close: status=$?";
478
479=head2 Filehandles
480
481Both the main process and any child processes it forks share the same
482STDIN, STDOUT, and STDERR filehandles.  If both processes try to access
483them at once, strange things can happen.  You may also want to close
484or reopen the filehandles for the child.  You can get around this by
485opening your pipe with open(), but on some systems this means that the
486child process cannot outlive the parent.
487
488=head2 Background Processes
489
490You can run a command in the background with:
491
492    system("cmd &");
493
494The command's STDOUT and STDERR (and possibly STDIN, depending on your
495shell) will be the same as the parent's.  You won't need to catch
496SIGCHLD because of the double-fork taking place (see below for more
497details).
498
499=head2 Complete Dissociation of Child from Parent
500
501In some cases (starting server processes, for instance) you'll want to
502completely dissociate the child process from the parent.  This is
503often called daemonization.  A well behaved daemon will also chdir()
504to the root directory (so it doesn't prevent unmounting the filesystem
505containing the directory from which it was launched) and redirect its
506standard file descriptors from and to F</dev/null> (so that random
507output doesn't wind up on the user's terminal).
508
509    use POSIX 'setsid';
510
511    sub daemonize {
512        chdir '/'               or die "Can't chdir to /: $!";
513        open STDIN, '/dev/null' or die "Can't read /dev/null: $!";
514        open STDOUT, '>/dev/null'
515                                or die "Can't write to /dev/null: $!";
516        defined(my $pid = fork) or die "Can't fork: $!";
517        exit if $pid;
518        setsid                  or die "Can't start a new session: $!";
519        open STDERR, '>&STDOUT' or die "Can't dup stdout: $!";
520    }
521
522The fork() has to come before the setsid() to ensure that you aren't a
523process group leader (the setsid() will fail if you are).  If your
524system doesn't have the setsid() function, open F</dev/tty> and use the
525C<TIOCNOTTY> ioctl() on it instead.  See L<tty(4)> for details.
526
527Non-Unix users should check their Your_OS::Process module for other
528solutions.
529
530=head2 Safe Pipe Opens
531
532Another interesting approach to IPC is making your single program go
533multiprocess and communicate between (or even amongst) yourselves.  The
534open() function will accept a file argument of either C<"-|"> or C<"|-">
535to do a very interesting thing: it forks a child connected to the
536filehandle you've opened.  The child is running the same program as the
537parent.  This is useful for safely opening a file when running under an
538assumed UID or GID, for example.  If you open a pipe I<to> minus, you can
539write to the filehandle you opened and your kid will find it in his
540STDIN.  If you open a pipe I<from> minus, you can read from the filehandle
541you opened whatever your kid writes to his STDOUT.
542
543    use English '-no_match_vars';
544    my $sleep_count = 0;
545
546    do {
547        $pid = open(KID_TO_WRITE, "|-");
548        unless (defined $pid) {
549            warn "cannot fork: $!";
550            die "bailing out" if $sleep_count++ > 6;
551            sleep 10;
552        }
553    } until defined $pid;
554
555    if ($pid) {  # parent
556        print KID_TO_WRITE @some_data;
557        close(KID_TO_WRITE) || warn "kid exited $?";
558    } else {     # child
559        ($EUID, $EGID) = ($UID, $GID); # suid progs only
560        open (FILE, "> /safe/file")
561            || die "can't open /safe/file: $!";
562        while (<STDIN>) {
563            print FILE; # child's STDIN is parent's KID
564        }
565        exit;  # don't forget this
566    }
567
568Another common use for this construct is when you need to execute
569something without the shell's interference.  With system(), it's
570straightforward, but you can't use a pipe open or backticks safely.
571That's because there's no way to stop the shell from getting its hands on
572your arguments.   Instead, use lower-level control to call exec() directly.
573
574Here's a safe backtick or pipe open for read:
575
576    # add error processing as above
577    $pid = open(KID_TO_READ, "-|");
578
579    if ($pid) {   # parent
580        while (<KID_TO_READ>) {
581            # do something interesting
582        }
583        close(KID_TO_READ) || warn "kid exited $?";
584
585    } else {      # child
586        ($EUID, $EGID) = ($UID, $GID); # suid only
587        exec($program, @options, @args)
588            || die "can't exec program: $!";
589        # NOTREACHED
590    }
591
592
593And here's a safe pipe open for writing:
594
595    # add error processing as above
596    $pid = open(KID_TO_WRITE, "|-");
597    $SIG{PIPE} = sub { die "whoops, $program pipe broke" };
598
599    if ($pid) {  # parent
600        for (@data) {
601            print KID_TO_WRITE;
602        }
603        close(KID_TO_WRITE) || warn "kid exited $?";
604
605    } else {     # child
606        ($EUID, $EGID) = ($UID, $GID);
607        exec($program, @options, @args)
608            || die "can't exec program: $!";
609        # NOTREACHED
610    }
611
612Since Perl 5.8.0, you can also use the list form of C<open> for pipes :
613the syntax
614
615    open KID_PS, "-|", "ps", "aux" or die $!;
616
617forks the ps(1) command (without spawning a shell, as there are more than
618three arguments to open()), and reads its standard output via the
619C<KID_PS> filehandle.  The corresponding syntax to read from command
620pipes (with C<"|-"> in place of C<"-|">) is also implemented.
621
622Note that these operations are full Unix forks, which means they may not be
623correctly implemented on alien systems.  Additionally, these are not true
624multithreading.  If you'd like to learn more about threading, see the
625F<modules> file mentioned below in the SEE ALSO section.
626
627=head2 Bidirectional Communication with Another Process
628
629While this works reasonably well for unidirectional communication, what
630about bidirectional communication?  The obvious thing you'd like to do
631doesn't actually work:
632
633    open(PROG_FOR_READING_AND_WRITING, "| some program |")
634
635and if you forget to use the C<use warnings> pragma or the B<-w> flag,
636then you'll miss out entirely on the diagnostic message:
637
638    Can't do bidirectional pipe at -e line 1.
639
640If you really want to, you can use the standard open2() library function
641to catch both ends.  There's also an open3() for tridirectional I/O so you
642can also catch your child's STDERR, but doing so would then require an
643awkward select() loop and wouldn't allow you to use normal Perl input
644operations.
645
646If you look at its source, you'll see that open2() uses low-level
647primitives like Unix pipe() and exec() calls to create all the connections.
648While it might have been slightly more efficient by using socketpair(), it
649would have then been even less portable than it already is.  The open2()
650and open3() functions are  unlikely to work anywhere except on a Unix
651system or some other one purporting to be POSIX compliant.
652
653Here's an example of using open2():
654
655    use FileHandle;
656    use IPC::Open2;
657    $pid = open2(*Reader, *Writer, "cat -u -n" );
658    print Writer "stuff\n";
659    $got = <Reader>;
660
661The problem with this is that Unix buffering is really going to
662ruin your day.  Even though your C<Writer> filehandle is auto-flushed,
663and the process on the other end will get your data in a timely manner,
664you can't usually do anything to force it to give it back to you
665in a similarly quick fashion.  In this case, we could, because we
666gave I<cat> a B<-u> flag to make it unbuffered.  But very few Unix
667commands are designed to operate over pipes, so this seldom works
668unless you yourself wrote the program on the other end of the
669double-ended pipe.
670
671A solution to this is the nonstandard F<Comm.pl> library.  It uses
672pseudo-ttys to make your program behave more reasonably:
673
674    require 'Comm.pl';
675    $ph = open_proc('cat -n');
676    for (1..10) {
677        print $ph "a line\n";
678        print "got back ", scalar <$ph>;
679    }
680
681This way you don't have to have control over the source code of the
682program you're using.  The F<Comm> library also has expect()
683and interact() functions.  Find the library (and we hope its
684successor F<IPC::Chat>) at your nearest CPAN archive as detailed
685in the SEE ALSO section below.
686
687The newer Expect.pm module from CPAN also addresses this kind of thing.
688This module requires two other modules from CPAN: IO::Pty and IO::Stty.
689It sets up a pseudo-terminal to interact with programs that insist on
690using talking to the terminal device driver.  If your system is
691amongst those supported, this may be your best bet.
692
693=head2 Bidirectional Communication with Yourself
694
695If you want, you may make low-level pipe() and fork()
696to stitch this together by hand.  This example only
697talks to itself, but you could reopen the appropriate
698handles to STDIN and STDOUT and call other processes.
699
700    #!/usr/bin/perl -w
701    # pipe1 - bidirectional communication using two pipe pairs
702    #         designed for the socketpair-challenged
703    use IO::Handle;     # thousands of lines just for autoflush :-(
704    pipe(PARENT_RDR, CHILD_WTR);                # XXX: failure?
705    pipe(CHILD_RDR,  PARENT_WTR);               # XXX: failure?
706    CHILD_WTR->autoflush(1);
707    PARENT_WTR->autoflush(1);
708
709    if ($pid = fork) {
710        close PARENT_RDR; close PARENT_WTR;
711        print CHILD_WTR "Parent Pid $$ is sending this\n";
712        chomp($line = <CHILD_RDR>);
713        print "Parent Pid $$ just read this: `$line'\n";
714        close CHILD_RDR; close CHILD_WTR;
715        waitpid($pid,0);
716    } else {
717        die "cannot fork: $!" unless defined $pid;
718        close CHILD_RDR; close CHILD_WTR;
719        chomp($line = <PARENT_RDR>);
720        print "Child Pid $$ just read this: `$line'\n";
721        print PARENT_WTR "Child Pid $$ is sending this\n";
722        close PARENT_RDR; close PARENT_WTR;
723        exit;
724    }
725
726But you don't actually have to make two pipe calls.  If you
727have the socketpair() system call, it will do this all for you.
728
729    #!/usr/bin/perl -w
730    # pipe2 - bidirectional communication using socketpair
731    #   "the best ones always go both ways"
732
733    use Socket;
734    use IO::Handle;     # thousands of lines just for autoflush :-(
735    # We say AF_UNIX because although *_LOCAL is the
736    # POSIX 1003.1g form of the constant, many machines
737    # still don't have it.
738    socketpair(CHILD, PARENT, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
739                                or  die "socketpair: $!";
740
741    CHILD->autoflush(1);
742    PARENT->autoflush(1);
743
744    if ($pid = fork) {
745        close PARENT;
746        print CHILD "Parent Pid $$ is sending this\n";
747        chomp($line = <CHILD>);
748        print "Parent Pid $$ just read this: `$line'\n";
749        close CHILD;
750        waitpid($pid,0);
751    } else {
752        die "cannot fork: $!" unless defined $pid;
753        close CHILD;
754        chomp($line = <PARENT>);
755        print "Child Pid $$ just read this: `$line'\n";
756        print PARENT "Child Pid $$ is sending this\n";
757        close PARENT;
758        exit;
759    }
760
761=head1 Sockets: Client/Server Communication
762
763While not limited to Unix-derived operating systems (e.g., WinSock on PCs
764provides socket support, as do some VMS libraries), you may not have
765sockets on your system, in which case this section probably isn't going to do
766you much good.  With sockets, you can do both virtual circuits (i.e., TCP
767streams) and datagrams (i.e., UDP packets).  You may be able to do even more
768depending on your system.
769
770The Perl function calls for dealing with sockets have the same names as
771the corresponding system calls in C, but their arguments tend to differ
772for two reasons: first, Perl filehandles work differently than C file
773descriptors.  Second, Perl already knows the length of its strings, so you
774don't need to pass that information.
775
776One of the major problems with old socket code in Perl was that it used
777hard-coded values for some of the constants, which severely hurt
778portability.  If you ever see code that does anything like explicitly
779setting C<$AF_INET = 2>, you know you're in for big trouble:  An
780immeasurably superior approach is to use the C<Socket> module, which more
781reliably grants access to various constants and functions you'll need.
782
783If you're not writing a server/client for an existing protocol like
784NNTP or SMTP, you should give some thought to how your server will
785know when the client has finished talking, and vice-versa.  Most
786protocols are based on one-line messages and responses (so one party
787knows the other has finished when a "\n" is received) or multi-line
788messages and responses that end with a period on an empty line
789("\n.\n" terminates a message/response).
790
791=head2 Internet Line Terminators
792
793The Internet line terminator is "\015\012".  Under ASCII variants of
794Unix, that could usually be written as "\r\n", but under other systems,
795"\r\n" might at times be "\015\015\012", "\012\012\015", or something
796completely different.  The standards specify writing "\015\012" to be
797conformant (be strict in what you provide), but they also recommend
798accepting a lone "\012" on input (but be lenient in what you require).
799We haven't always been very good about that in the code in this manpage,
800but unless you're on a Mac, you'll probably be ok.
801
802=head2 Internet TCP Clients and Servers
803
804Use Internet-domain sockets when you want to do client-server
805communication that might extend to machines outside of your own system.
806
807Here's a sample TCP client using Internet-domain sockets:
808
809    #!/usr/bin/perl -w
810    use strict;
811    use Socket;
812    my ($remote,$port, $iaddr, $paddr, $proto, $line);
813
814    $remote  = shift || 'localhost';
815    $port    = shift || 2345;  # random port
816    if ($port =~ /\D/) { $port = getservbyname($port, 'tcp') }
817    die "No port" unless $port;
818    $iaddr   = inet_aton($remote)               || die "no host: $remote";
819    $paddr   = sockaddr_in($port, $iaddr);
820
821    $proto   = getprotobyname('tcp');
822    socket(SOCK, PF_INET, SOCK_STREAM, $proto)  || die "socket: $!";
823    connect(SOCK, $paddr)    || die "connect: $!";
824    while (defined($line = <SOCK>)) {
825        print $line;
826    }
827
828    close (SOCK)            || die "close: $!";
829    exit;
830
831And here's a corresponding server to go along with it.  We'll
832leave the address as INADDR_ANY so that the kernel can choose
833the appropriate interface on multihomed hosts.  If you want sit
834on a particular interface (like the external side of a gateway
835or firewall machine), you should fill this in with your real address
836instead.
837
838    #!/usr/bin/perl -Tw
839    use strict;
840    BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
841    use Socket;
842    use Carp;
843    my $EOL = "\015\012";
844
845    sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }
846
847    my $port = shift || 2345;
848    my $proto = getprotobyname('tcp');
849
850    ($port) = $port =~ /^(\d+)$/                        or die "invalid port";
851
852    socket(Server, PF_INET, SOCK_STREAM, $proto)        || die "socket: $!";
853    setsockopt(Server, SOL_SOCKET, SO_REUSEADDR,
854                                        pack("l", 1))   || die "setsockopt: $!";
855    bind(Server, sockaddr_in($port, INADDR_ANY))        || die "bind: $!";
856    listen(Server,SOMAXCONN)                            || die "listen: $!";
857
858    logmsg "server started on port $port";
859
860    my $paddr;
861
862    $SIG{CHLD} = \&REAPER;
863
864    for ( ; $paddr = accept(Client,Server); close Client) {
865        my($port,$iaddr) = sockaddr_in($paddr);
866        my $name = gethostbyaddr($iaddr,AF_INET);
867
868        logmsg "connection from $name [",
869                inet_ntoa($iaddr), "]
870                at port $port";
871
872        print Client "Hello there, $name, it's now ",
873                        scalar localtime, $EOL;
874    }
875
876And here's a multithreaded version.  It's multithreaded in that
877like most typical servers, it spawns (forks) a slave server to
878handle the client request so that the master server can quickly
879go back to service a new client.
880
881    #!/usr/bin/perl -Tw
882    use strict;
883    BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
884    use Socket;
885    use Carp;
886    my $EOL = "\015\012";
887
888    sub spawn;  # forward declaration
889    sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }
890
891    my $port = shift || 2345;
892    my $proto = getprotobyname('tcp');
893
894    ($port) = $port =~ /^(\d+)$/                        or die "invalid port";
895
896    socket(Server, PF_INET, SOCK_STREAM, $proto)        || die "socket: $!";
897    setsockopt(Server, SOL_SOCKET, SO_REUSEADDR,
898                                        pack("l", 1))   || die "setsockopt: $!";
899    bind(Server, sockaddr_in($port, INADDR_ANY))        || die "bind: $!";
900    listen(Server,SOMAXCONN)                            || die "listen: $!";
901
902    logmsg "server started on port $port";
903
904    my $waitedpid = 0;
905    my $paddr;
906
907    use POSIX ":sys_wait_h";
908    sub REAPER {
909        my $child;
910        while (($waitedpid = waitpid(-1,WNOHANG)) > 0) {
911            logmsg "reaped $waitedpid" . ($? ? " with exit $?" : '');
912        }
913        $SIG{CHLD} = \&REAPER;  # loathe sysV
914    }
915
916    $SIG{CHLD} = \&REAPER;
917
918    for ( $waitedpid = 0;
919          ($paddr = accept(Client,Server)) || $waitedpid;
920          $waitedpid = 0, close Client)
921    {
922        next if $waitedpid and not $paddr;
923        my($port,$iaddr) = sockaddr_in($paddr);
924        my $name = gethostbyaddr($iaddr,AF_INET);
925
926        logmsg "connection from $name [",
927                inet_ntoa($iaddr), "]
928                at port $port";
929
930        spawn sub {
931            $|=1;
932            print "Hello there, $name, it's now ", scalar localtime, $EOL;
933            exec '/usr/games/fortune'           # XXX: `wrong' line terminators
934                or confess "can't exec fortune: $!";
935        };
936
937    }
938
939    sub spawn {
940        my $coderef = shift;
941
942        unless (@_ == 0 && $coderef && ref($coderef) eq 'CODE') {
943            confess "usage: spawn CODEREF";
944        }
945
946        my $pid;
947        if (!defined($pid = fork)) {
948            logmsg "cannot fork: $!";
949            return;
950        } elsif ($pid) {
951            logmsg "begat $pid";
952            return; # I'm the parent
953        }
954        # else I'm the child -- go spawn
955
956        open(STDIN,  "<&Client")   || die "can't dup client to stdin";
957        open(STDOUT, ">&Client")   || die "can't dup client to stdout";
958        ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
959        exit &$coderef();
960    }
961
962This server takes the trouble to clone off a child version via fork() for
963each incoming request.  That way it can handle many requests at once,
964which you might not always want.  Even if you don't fork(), the listen()
965will allow that many pending connections.  Forking servers have to be
966particularly careful about cleaning up their dead children (called
967"zombies" in Unix parlance), because otherwise you'll quickly fill up your
968process table.
969
970We suggest that you use the B<-T> flag to use taint checking (see L<perlsec>)
971even if we aren't running setuid or setgid.  This is always a good idea
972for servers and other programs run on behalf of someone else (like CGI
973scripts), because it lessens the chances that people from the outside will
974be able to compromise your system.
975
976Let's look at another TCP client.  This one connects to the TCP "time"
977service on a number of different machines and shows how far their clocks
978differ from the system on which it's being run:
979
980    #!/usr/bin/perl  -w
981    use strict;
982    use Socket;
983
984    my $SECS_of_70_YEARS = 2208988800;
985    sub ctime { scalar localtime(shift) }
986
987    my $iaddr = gethostbyname('localhost');
988    my $proto = getprotobyname('tcp');
989    my $port = getservbyname('time', 'tcp');
990    my $paddr = sockaddr_in(0, $iaddr);
991    my($host);
992
993    $| = 1;
994    printf "%-24s %8s %s\n",  "localhost", 0, ctime(time());
995
996    foreach $host (@ARGV) {
997        printf "%-24s ", $host;
998        my $hisiaddr = inet_aton($host)     || die "unknown host";
999        my $hispaddr = sockaddr_in($port, $hisiaddr);
1000        socket(SOCKET, PF_INET, SOCK_STREAM, $proto)   || die "socket: $!";
1001        connect(SOCKET, $hispaddr)          || die "bind: $!";
1002        my $rtime = '    ';
1003        read(SOCKET, $rtime, 4);
1004        close(SOCKET);
1005        my $histime = unpack("N", $rtime) - $SECS_of_70_YEARS ;
1006        printf "%8d %s\n", $histime - time, ctime($histime);
1007    }
1008
1009=head2 Unix-Domain TCP Clients and Servers
1010
1011That's fine for Internet-domain clients and servers, but what about local
1012communications?  While you can use the same setup, sometimes you don't
1013want to.  Unix-domain sockets are local to the current host, and are often
1014used internally to implement pipes.  Unlike Internet domain sockets, Unix
1015domain sockets can show up in the file system with an ls(1) listing.
1016
1017    % ls -l /dev/log
1018    srw-rw-rw-  1 root            0 Oct 31 07:23 /dev/log
1019
1020You can test for these with Perl's B<-S> file test:
1021
1022    unless ( -S '/dev/log' ) {
1023        die "something's wicked with the log system";
1024    }
1025
1026Here's a sample Unix-domain client:
1027
1028    #!/usr/bin/perl -w
1029    use Socket;
1030    use strict;
1031    my ($rendezvous, $line);
1032
1033    $rendezvous = shift || '/tmp/catsock';
1034    socket(SOCK, PF_UNIX, SOCK_STREAM, 0)       || die "socket: $!";
1035    connect(SOCK, sockaddr_un($rendezvous))     || die "connect: $!";
1036    while (defined($line = <SOCK>)) {
1037        print $line;
1038    }
1039    exit;
1040
1041And here's a corresponding server.  You don't have to worry about silly
1042network terminators here because Unix domain sockets are guaranteed
1043to be on the localhost, and thus everything works right.
1044
1045    #!/usr/bin/perl -Tw
1046    use strict;
1047    use Socket;
1048    use Carp;
1049
1050    BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
1051    sub spawn;  # forward declaration
1052    sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }
1053
1054    my $NAME = '/tmp/catsock';
1055    my $uaddr = sockaddr_un($NAME);
1056    my $proto = getprotobyname('tcp');
1057
1058    socket(Server,PF_UNIX,SOCK_STREAM,0)        || die "socket: $!";
1059    unlink($NAME);
1060    bind  (Server, $uaddr)                      || die "bind: $!";
1061    listen(Server,SOMAXCONN)                    || die "listen: $!";
1062
1063    logmsg "server started on $NAME";
1064
1065    my $waitedpid;
1066
1067    use POSIX ":sys_wait_h";
1068    sub REAPER {
1069        my $child;
1070        while (($waitedpid = waitpid(-1,WNOHANG)) > 0) {
1071            logmsg "reaped $waitedpid" . ($? ? " with exit $?" : '');
1072        }
1073        $SIG{CHLD} = \&REAPER;  # loathe sysV
1074    }
1075
1076    $SIG{CHLD} = \&REAPER;
1077
1078
1079    for ( $waitedpid = 0;
1080          accept(Client,Server) || $waitedpid;
1081          $waitedpid = 0, close Client)
1082    {
1083        next if $waitedpid;
1084        logmsg "connection on $NAME";
1085        spawn sub {
1086            print "Hello there, it's now ", scalar localtime, "\n";
1087            exec '/usr/games/fortune' or die "can't exec fortune: $!";
1088        };
1089    }
1090
1091    sub spawn {
1092        my $coderef = shift;
1093
1094        unless (@_ == 0 && $coderef && ref($coderef) eq 'CODE') {
1095            confess "usage: spawn CODEREF";
1096        }
1097
1098        my $pid;
1099        if (!defined($pid = fork)) {
1100            logmsg "cannot fork: $!";
1101            return;
1102        } elsif ($pid) {
1103            logmsg "begat $pid";
1104            return; # I'm the parent
1105        }
1106        # else I'm the child -- go spawn
1107
1108        open(STDIN,  "<&Client")   || die "can't dup client to stdin";
1109        open(STDOUT, ">&Client")   || die "can't dup client to stdout";
1110        ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
1111        exit &$coderef();
1112    }
1113
1114As you see, it's remarkably similar to the Internet domain TCP server, so
1115much so, in fact, that we've omitted several duplicate functions--spawn(),
1116logmsg(), ctime(), and REAPER()--which are exactly the same as in the
1117other server.
1118
1119So why would you ever want to use a Unix domain socket instead of a
1120simpler named pipe?  Because a named pipe doesn't give you sessions.  You
1121can't tell one process's data from another's.  With socket programming,
1122you get a separate session for each client: that's why accept() takes two
1123arguments.
1124
1125For example, let's say that you have a long running database server daemon
1126that you want folks from the World Wide Web to be able to access, but only
1127if they go through a CGI interface.  You'd have a small, simple CGI
1128program that does whatever checks and logging you feel like, and then acts
1129as a Unix-domain client and connects to your private server.
1130
1131=head1 TCP Clients with IO::Socket
1132
1133For those preferring a higher-level interface to socket programming, the
1134IO::Socket module provides an object-oriented approach.  IO::Socket is
1135included as part of the standard Perl distribution as of the 5.004
1136release.  If you're running an earlier version of Perl, just fetch
1137IO::Socket from CPAN, where you'll also find modules providing easy
1138interfaces to the following systems: DNS, FTP, Ident (RFC 931), NIS and
1139NISPlus, NNTP, Ping, POP3, SMTP, SNMP, SSLeay, Telnet, and Time--just
1140to name a few.
1141
1142=head2 A Simple Client
1143
1144Here's a client that creates a TCP connection to the "daytime"
1145service at port 13 of the host name "localhost" and prints out everything
1146that the server there cares to provide.
1147
1148    #!/usr/bin/perl -w
1149    use IO::Socket;
1150    $remote = IO::Socket::INET->new(
1151                        Proto    => "tcp",
1152                        PeerAddr => "localhost",
1153                        PeerPort => "daytime(13)",
1154                    )
1155                  or die "cannot connect to daytime port at localhost";
1156    while ( <$remote> ) { print }
1157
1158When you run this program, you should get something back that
1159looks like this:
1160
1161    Wed May 14 08:40:46 MDT 1997
1162
1163Here are what those parameters to the C<new> constructor mean:
1164
1165=over 4
1166
1167=item C<Proto>
1168
1169This is which protocol to use.  In this case, the socket handle returned
1170will be connected to a TCP socket, because we want a stream-oriented
1171connection, that is, one that acts pretty much like a plain old file.
1172Not all sockets are this of this type.  For example, the UDP protocol
1173can be used to make a datagram socket, used for message-passing.
1174
1175=item C<PeerAddr>
1176
1177This is the name or Internet address of the remote host the server is
1178running on.  We could have specified a longer name like C<"www.perl.com">,
1179or an address like C<"204.148.40.9">.  For demonstration purposes, we've
1180used the special hostname C<"localhost">, which should always mean the
1181current machine you're running on.  The corresponding Internet address
1182for localhost is C<"127.1">, if you'd rather use that.
1183
1184=item C<PeerPort>
1185
1186This is the service name or port number we'd like to connect to.
1187We could have gotten away with using just C<"daytime"> on systems with a
1188well-configured system services file,[FOOTNOTE: The system services file
1189is in I</etc/services> under Unix] but just in case, we've specified the
1190port number (13) in parentheses.  Using just the number would also have
1191worked, but constant numbers make careful programmers nervous.
1192
1193=back
1194
1195Notice how the return value from the C<new> constructor is used as
1196a filehandle in the C<while> loop?  That's what's called an indirect
1197filehandle, a scalar variable containing a filehandle.  You can use
1198it the same way you would a normal filehandle.  For example, you
1199can read one line from it this way:
1200
1201    $line = <$handle>;
1202
1203all remaining lines from is this way:
1204
1205    @lines = <$handle>;
1206
1207and send a line of data to it this way:
1208
1209    print $handle "some data\n";
1210
1211=head2 A Webget Client
1212
1213Here's a simple client that takes a remote host to fetch a document
1214from, and then a list of documents to get from that host.  This is a
1215more interesting client than the previous one because it first sends
1216something to the server before fetching the server's response.
1217
1218    #!/usr/bin/perl -w
1219    use IO::Socket;
1220    unless (@ARGV > 1) { die "usage: $0 host document ..." }
1221    $host = shift(@ARGV);
1222    $EOL = "\015\012";
1223    $BLANK = $EOL x 2;
1224    foreach $document ( @ARGV ) {
1225        $remote = IO::Socket::INET->new( Proto     => "tcp",
1226                                         PeerAddr  => $host,
1227                                         PeerPort  => "http(80)",
1228                                        );
1229        unless ($remote) { die "cannot connect to http daemon on $host" }
1230        $remote->autoflush(1);
1231        print $remote "GET $document HTTP/1.0" . $BLANK;
1232        while ( <$remote> ) { print }
1233        close $remote;
1234    }
1235
1236The web server handing the "http" service, which is assumed to be at
1237its standard port, number 80.  If the web server you're trying to
1238connect to is at a different port (like 1080 or 8080), you should specify
1239as the named-parameter pair, C<< PeerPort => 8080 >>.  The C<autoflush>
1240method is used on the socket because otherwise the system would buffer
1241up the output we sent it.  (If you're on a Mac, you'll also need to
1242change every C<"\n"> in your code that sends data over the network to
1243be a C<"\015\012"> instead.)
1244
1245Connecting to the server is only the first part of the process: once you
1246have the connection, you have to use the server's language.  Each server
1247on the network has its own little command language that it expects as
1248input.  The string that we send to the server starting with "GET" is in
1249HTTP syntax.  In this case, we simply request each specified document.
1250Yes, we really are making a new connection for each document, even though
1251it's the same host.  That's the way you always used to have to speak HTTP.
1252Recent versions of web browsers may request that the remote server leave
1253the connection open a little while, but the server doesn't have to honor
1254such a request.
1255
1256Here's an example of running that program, which we'll call I<webget>:
1257
1258    % webget www.perl.com /guanaco.html
1259    HTTP/1.1 404 File Not Found
1260    Date: Thu, 08 May 1997 18:02:32 GMT
1261    Server: Apache/1.2b6
1262    Connection: close
1263    Content-type: text/html
1264
1265    <HEAD><TITLE>404 File Not Found</TITLE></HEAD>
1266    <BODY><H1>File Not Found</H1>
1267    The requested URL /guanaco.html was not found on this server.<P>
1268    </BODY>
1269
1270Ok, so that's not very interesting, because it didn't find that
1271particular document.  But a long response wouldn't have fit on this page.
1272
1273For a more fully-featured version of this program, you should look to
1274the I<lwp-request> program included with the LWP modules from CPAN.
1275
1276=head2 Interactive Client with IO::Socket
1277
1278Well, that's all fine if you want to send one command and get one answer,
1279but what about setting up something fully interactive, somewhat like
1280the way I<telnet> works?  That way you can type a line, get the answer,
1281type a line, get the answer, etc.
1282
1283This client is more complicated than the two we've done so far, but if
1284you're on a system that supports the powerful C<fork> call, the solution
1285isn't that rough.  Once you've made the connection to whatever service
1286you'd like to chat with, call C<fork> to clone your process.  Each of
1287these two identical process has a very simple job to do: the parent
1288copies everything from the socket to standard output, while the child
1289simultaneously copies everything from standard input to the socket.
1290To accomplish the same thing using just one process would be I<much>
1291harder, because it's easier to code two processes to do one thing than it
1292is to code one process to do two things.  (This keep-it-simple principle
1293a cornerstones of the Unix philosophy, and good software engineering as
1294well, which is probably why it's spread to other systems.)
1295
1296Here's the code:
1297
1298    #!/usr/bin/perl -w
1299    use strict;
1300    use IO::Socket;
1301    my ($host, $port, $kidpid, $handle, $line);
1302
1303    unless (@ARGV == 2) { die "usage: $0 host port" }
1304    ($host, $port) = @ARGV;
1305
1306    # create a tcp connection to the specified host and port
1307    $handle = IO::Socket::INET->new(Proto     => "tcp",
1308                                    PeerAddr  => $host,
1309                                    PeerPort  => $port)
1310           or die "can't connect to port $port on $host: $!";
1311
1312    $handle->autoflush(1);              # so output gets there right away
1313    print STDERR "[Connected to $host:$port]\n";
1314
1315    # split the program into two processes, identical twins
1316    die "can't fork: $!" unless defined($kidpid = fork());
1317
1318    # the if{} block runs only in the parent process
1319    if ($kidpid) {
1320        # copy the socket to standard output
1321        while (defined ($line = <$handle>)) {
1322            print STDOUT $line;
1323        }
1324        kill("TERM", $kidpid);                  # send SIGTERM to child
1325    }
1326    # the else{} block runs only in the child process
1327    else {
1328        # copy standard input to the socket
1329        while (defined ($line = <STDIN>)) {
1330            print $handle $line;
1331        }
1332    }
1333
1334The C<kill> function in the parent's C<if> block is there to send a
1335signal to our child process (current running in the C<else> block)
1336as soon as the remote server has closed its end of the connection.
1337
1338If the remote server sends data a byte at time, and you need that
1339data immediately without waiting for a newline (which might not happen),
1340you may wish to replace the C<while> loop in the parent with the
1341following:
1342
1343    my $byte;
1344    while (sysread($handle, $byte, 1) == 1) {
1345        print STDOUT $byte;
1346    }
1347
1348Making a system call for each byte you want to read is not very efficient
1349(to put it mildly) but is the simplest to explain and works reasonably
1350well.
1351
1352=head1 TCP Servers with IO::Socket
1353
1354As always, setting up a server is little bit more involved than running a client.
1355The model is that the server creates a special kind of socket that
1356does nothing but listen on a particular port for incoming connections.
1357It does this by calling the C<< IO::Socket::INET->new() >> method with
1358slightly different arguments than the client did.
1359
1360=over 4
1361
1362=item Proto
1363
1364This is which protocol to use.  Like our clients, we'll
1365still specify C<"tcp"> here.
1366
1367=item LocalPort
1368
1369We specify a local
1370port in the C<LocalPort> argument, which we didn't do for the client.
1371This is service name or port number for which you want to be the
1372server. (Under Unix, ports under 1024 are restricted to the
1373superuser.)  In our sample, we'll use port 9000, but you can use
1374any port that's not currently in use on your system.  If you try
1375to use one already in used, you'll get an "Address already in use"
1376message.  Under Unix, the C<netstat -a> command will show
1377which services current have servers.
1378
1379=item Listen
1380
1381The C<Listen> parameter is set to the maximum number of
1382pending connections we can accept until we turn away incoming clients.
1383Think of it as a call-waiting queue for your telephone.
1384The low-level Socket module has a special symbol for the system maximum, which
1385is SOMAXCONN.
1386
1387=item Reuse
1388
1389The C<Reuse> parameter is needed so that we restart our server
1390manually without waiting a few minutes to allow system buffers to
1391clear out.
1392
1393=back
1394
1395Once the generic server socket has been created using the parameters
1396listed above, the server then waits for a new client to connect
1397to it.  The server blocks in the C<accept> method, which eventually accepts a
1398bidirectional connection from the remote client.  (Make sure to autoflush
1399this handle to circumvent buffering.)
1400
1401To add to user-friendliness, our server prompts the user for commands.
1402Most servers don't do this.  Because of the prompt without a newline,
1403you'll have to use the C<sysread> variant of the interactive client above.
1404
1405This server accepts one of five different commands, sending output
1406back to the client.  Note that unlike most network servers, this one
1407only handles one incoming client at a time.  Multithreaded servers are
1408covered in Chapter 6 of the Camel.
1409
1410Here's the code.  We'll
1411
1412 #!/usr/bin/perl -w
1413 use IO::Socket;
1414 use Net::hostent;              # for OO version of gethostbyaddr
1415
1416 $PORT = 9000;                  # pick something not in use
1417
1418 $server = IO::Socket::INET->new( Proto     => 'tcp',
1419                                  LocalPort => $PORT,
1420                                  Listen    => SOMAXCONN,
1421                                  Reuse     => 1);
1422
1423 die "can't setup server" unless $server;
1424 print "[Server $0 accepting clients]\n";
1425
1426 while ($client = $server->accept()) {
1427   $client->autoflush(1);
1428   print $client "Welcome to $0; type help for command list.\n";
1429   $hostinfo = gethostbyaddr($client->peeraddr);
1430   printf "[Connect from %s]\n", $hostinfo ? $hostinfo->name : $client->peerhost;
1431   print $client "Command? ";
1432   while ( <$client>) {
1433     next unless /\S/;       # blank line
1434     if    (/quit|exit/i)    { last;                                     }
1435     elsif (/date|time/i)    { printf $client "%s\n", scalar localtime;  }
1436     elsif (/who/i )         { print  $client `who 2>&1`;                }
1437     elsif (/cookie/i )      { print  $client `/usr/games/fortune 2>&1`; }
1438     elsif (/motd/i )        { print  $client `cat /etc/motd 2>&1`;      }
1439     else {
1440       print $client "Commands: quit date who cookie motd\n";
1441     }
1442   } continue {
1443      print $client "Command? ";
1444   }
1445   close $client;
1446 }
1447
1448=head1 UDP: Message Passing
1449
1450Another kind of client-server setup is one that uses not connections, but
1451messages.  UDP communications involve much lower overhead but also provide
1452less reliability, as there are no promises that messages will arrive at
1453all, let alone in order and unmangled.  Still, UDP offers some advantages
1454over TCP, including being able to "broadcast" or "multicast" to a whole
1455bunch of destination hosts at once (usually on your local subnet).  If you
1456find yourself overly concerned about reliability and start building checks
1457into your message system, then you probably should use just TCP to start
1458with.
1459
1460Note that UDP datagrams are I<not> a bytestream and should not be treated
1461as such. This makes using I/O mechanisms with internal buffering
1462like stdio (i.e. print() and friends) especially cumbersome. Use syswrite(),
1463or better send(), like in the example below.
1464
1465Here's a UDP program similar to the sample Internet TCP client given
1466earlier.  However, instead of checking one host at a time, the UDP version
1467will check many of them asynchronously by simulating a multicast and then
1468using select() to do a timed-out wait for I/O.  To do something similar
1469with TCP, you'd have to use a different socket handle for each host.
1470
1471    #!/usr/bin/perl -w
1472    use strict;
1473    use Socket;
1474    use Sys::Hostname;
1475
1476    my ( $count, $hisiaddr, $hispaddr, $histime,
1477         $host, $iaddr, $paddr, $port, $proto,
1478         $rin, $rout, $rtime, $SECS_of_70_YEARS);
1479
1480    $SECS_of_70_YEARS      = 2208988800;
1481
1482    $iaddr = gethostbyname(hostname());
1483    $proto = getprotobyname('udp');
1484    $port = getservbyname('time', 'udp');
1485    $paddr = sockaddr_in(0, $iaddr); # 0 means let kernel pick
1486
1487    socket(SOCKET, PF_INET, SOCK_DGRAM, $proto)   || die "socket: $!";
1488    bind(SOCKET, $paddr)                          || die "bind: $!";
1489
1490    $| = 1;
1491    printf "%-12s %8s %s\n",  "localhost", 0, scalar localtime time;
1492    $count = 0;
1493    for $host (@ARGV) {
1494        $count++;
1495        $hisiaddr = inet_aton($host)    || die "unknown host";
1496        $hispaddr = sockaddr_in($port, $hisiaddr);
1497        defined(send(SOCKET, 0, 0, $hispaddr))    || die "send $host: $!";
1498    }
1499
1500    $rin = '';
1501    vec($rin, fileno(SOCKET), 1) = 1;
1502
1503    # timeout after 10.0 seconds
1504    while ($count && select($rout = $rin, undef, undef, 10.0)) {
1505        $rtime = '';
1506        ($hispaddr = recv(SOCKET, $rtime, 4, 0))        || die "recv: $!";
1507        ($port, $hisiaddr) = sockaddr_in($hispaddr);
1508        $host = gethostbyaddr($hisiaddr, AF_INET);
1509        $histime = unpack("N", $rtime) - $SECS_of_70_YEARS ;
1510        printf "%-12s ", $host;
1511        printf "%8d %s\n", $histime - time, scalar localtime($histime);
1512        $count--;
1513    }
1514
1515Note that this example does not include any retries and may consequently
1516fail to contact a reachable host. The most prominent reason for this
1517is congestion of the queues on the sending host if the number of
1518list of hosts to contact is sufficiently large.
1519
1520=head1 SysV IPC
1521
1522While System V IPC isn't so widely used as sockets, it still has some
1523interesting uses.  You can't, however, effectively use SysV IPC or
1524Berkeley mmap() to have shared memory so as to share a variable amongst
1525several processes.  That's because Perl would reallocate your string when
1526you weren't wanting it to.
1527
1528Here's a small example showing shared memory usage.
1529
1530    use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRWXU);
1531
1532    $size = 2000;
1533    $id = shmget(IPC_PRIVATE, $size, S_IRWXU) || die "$!";
1534    print "shm key $id\n";
1535
1536    $message = "Message #1";
1537    shmwrite($id, $message, 0, 60) || die "$!";
1538    print "wrote: '$message'\n";
1539    shmread($id, $buff, 0, 60) || die "$!";
1540    print "read : '$buff'\n";
1541
1542    # the buffer of shmread is zero-character end-padded.
1543    substr($buff, index($buff, "\0")) = '';
1544    print "un" unless $buff eq $message;
1545    print "swell\n";
1546
1547    print "deleting shm $id\n";
1548    shmctl($id, IPC_RMID, 0) || die "$!";
1549
1550Here's an example of a semaphore:
1551
1552    use IPC::SysV qw(IPC_CREAT);
1553
1554    $IPC_KEY = 1234;
1555    $id = semget($IPC_KEY, 10, 0666 | IPC_CREAT ) || die "$!";
1556    print "shm key $id\n";
1557
1558Put this code in a separate file to be run in more than one process.
1559Call the file F<take>:
1560
1561    # create a semaphore
1562
1563    $IPC_KEY = 1234;
1564    $id = semget($IPC_KEY,  0 , 0 );
1565    die if !defined($id);
1566
1567    $semnum = 0;
1568    $semflag = 0;
1569
1570    # 'take' semaphore
1571    # wait for semaphore to be zero
1572    $semop = 0;
1573    $opstring1 = pack("s!s!s!", $semnum, $semop, $semflag);
1574
1575    # Increment the semaphore count
1576    $semop = 1;
1577    $opstring2 = pack("s!s!s!", $semnum, $semop,  $semflag);
1578    $opstring = $opstring1 . $opstring2;
1579
1580    semop($id,$opstring) || die "$!";
1581
1582Put this code in a separate file to be run in more than one process.
1583Call this file F<give>:
1584
1585    # 'give' the semaphore
1586    # run this in the original process and you will see
1587    # that the second process continues
1588
1589    $IPC_KEY = 1234;
1590    $id = semget($IPC_KEY, 0, 0);
1591    die if !defined($id);
1592
1593    $semnum = 0;
1594    $semflag = 0;
1595
1596    # Decrement the semaphore count
1597    $semop = -1;
1598    $opstring = pack("s!s!s!", $semnum, $semop, $semflag);
1599
1600    semop($id,$opstring) || die "$!";
1601
1602The SysV IPC code above was written long ago, and it's definitely
1603clunky looking.  For a more modern look, see the IPC::SysV module
1604which is included with Perl starting from Perl 5.005.
1605
1606A small example demonstrating SysV message queues:
1607
1608    use IPC::SysV qw(IPC_PRIVATE IPC_RMID IPC_CREAT S_IRWXU);
1609
1610    my $id = msgget(IPC_PRIVATE, IPC_CREAT | S_IRWXU);
1611
1612    my $sent = "message";
1613    my $type = 1234;
1614    my $rcvd;
1615    my $type_rcvd;
1616
1617    if (defined $id) {
1618        if (msgsnd($id, pack("l! a*", $type_sent, $sent), 0)) {
1619            if (msgrcv($id, $rcvd, 60, 0, 0)) {
1620                ($type_rcvd, $rcvd) = unpack("l! a*", $rcvd);
1621                if ($rcvd eq $sent) {
1622                    print "okay\n";
1623                } else {
1624                    print "not okay\n";
1625                }
1626            } else {
1627                die "# msgrcv failed\n";
1628            }
1629        } else {
1630            die "# msgsnd failed\n";
1631        }
1632        msgctl($id, IPC_RMID, 0) || die "# msgctl failed: $!\n";
1633    } else {
1634        die "# msgget failed\n";
1635    }
1636
1637=head1 NOTES
1638
1639Most of these routines quietly but politely return C<undef> when they
1640fail instead of causing your program to die right then and there due to
1641an uncaught exception.  (Actually, some of the new I<Socket> conversion
1642functions  croak() on bad arguments.)  It is therefore essential to
1643check return values from these functions.  Always begin your socket
1644programs this way for optimal success, and don't forget to add B<-T>
1645taint checking flag to the #! line for servers:
1646
1647    #!/usr/bin/perl -Tw
1648    use strict;
1649    use sigtrap;
1650    use Socket;
1651
1652=head1 BUGS
1653
1654All these routines create system-specific portability problems.  As noted
1655elsewhere, Perl is at the mercy of your C libraries for much of its system
1656behaviour.  It's probably safest to assume broken SysV semantics for
1657signals and to stick with simple TCP and UDP socket operations; e.g., don't
1658try to pass open file descriptors over a local UDP datagram socket if you
1659want your code to stand a chance of being portable.
1660
1661As mentioned in the signals section, because few vendors provide C
1662libraries that are safely re-entrant, the prudent programmer will do
1663little else within a handler beyond setting a numeric variable that
1664already exists; or, if locked into a slow (restarting) system call,
1665using die() to raise an exception and longjmp(3) out.  In fact, even
1666these may in some cases cause a core dump.  It's probably best to avoid
1667signals except where they are absolutely inevitable.  This
1668will be addressed in a future release of Perl.
1669
1670=head1 AUTHOR
1671
1672Tom Christiansen, with occasional vestiges of Larry Wall's original
1673version and suggestions from the Perl Porters.
1674
1675=head1 SEE ALSO
1676
1677There's a lot more to networking than this, but this should get you
1678started.
1679
1680For intrepid programmers, the indispensable textbook is I<Unix
1681Network Programming, 2nd Edition, Volume 1> by W. Richard Stevens
1682(published by Prentice-Hall).  Note that most books on networking
1683address the subject from the perspective of a C programmer; translation
1684to Perl is left as an exercise for the reader.
1685
1686The IO::Socket(3) manpage describes the object library, and the Socket(3)
1687manpage describes the low-level interface to sockets.  Besides the obvious
1688functions in L<perlfunc>, you should also check out the F<modules> file
1689at your nearest CPAN site.  (See L<perlmodlib> or best yet, the F<Perl
1690FAQ> for a description of what CPAN is and where to get it.)
1691
1692Section 5 of the F<modules> file is devoted to "Networking, Device Control
1693(modems), and Interprocess Communication", and contains numerous unbundled
1694modules numerous networking modules, Chat and Expect operations, CGI
1695programming, DCE, FTP, IPC, NNTP, Proxy, Ptty, RPC, SNMP, SMTP, Telnet,
1696Threads, and ToolTalk--just to name a few.
Note: See TracBrowser for help on using the repository browser.