source: trunk/third/perl/pod/perlfaq5.pod @ 17035

Revision 17035, 41.8 KB checked in by zacheiss, 23 years ago
This commit was generated by cvs2svn to compensate for changes in r17034, which included commits to RCS files with non-trunk default branches.
1=head1 NAME
2
3perlfaq5 - Files and Formats ($Revision: 1.1.1.3 $, $Date: 2002-02-07 21:12:22 $)
4
5=head1 DESCRIPTION
6
7This section deals with I/O and the "f" issues: filehandles, flushing,
8formats, and footers.
9
10=head2 How do I flush/unbuffer an output filehandle?  Why must I do this?
11
The C standard I/O library (stdio) normally buffers characters sent to
devices.  This is done for efficiency reasons so that there isn't a
system call for each byte.  Any time you use print() or write() in
Perl, you go through this buffering.  syswrite() circumvents stdio and
buffering.
17
In most stdio implementations, the type of output buffering and the size of
the buffer vary according to the type of device.  Disk files are block
buffered, often with a buffer size of more than 2k.  Pipes and sockets
are often buffered with a buffer size between 1/2 and 2k.  Serial devices
(e.g. modems, terminals) are normally line-buffered, and stdio sends
the entire line when it gets the newline.
24
25Perl does not support truly unbuffered output (except insofar as you can
26C<syswrite(OUT, $char, 1)>).  What it does instead support is "command
27buffering", in which a physical write is performed after every output
28command.  This isn't as hard on your system as unbuffering, but does
29get the output where you want it when you want it.
30
31If you expect characters to get to your device when you print them there,
32you'll want to autoflush its handle.
33Use select() and the C<$|> variable to control autoflushing
34(see L<perlvar/$|> and L<perlfunc/select>):
35
36    $old_fh = select(OUTPUT_HANDLE);
37    $| = 1;
38    select($old_fh);
39
40Or using the traditional idiom:
41
42    select((select(OUTPUT_HANDLE), $| = 1)[0]);
43
Or if you don't mind slowly loading several thousand lines of module code
just because you're afraid of the C<$|> variable:
46
47    use FileHandle;
48    open(DEV, "+</dev/tty");      # ceci n'est pas une pipe
49    DEV->autoflush(1);
50
51or the newer IO::* modules:
52
53    use IO::Handle;
54    open(DEV, ">/dev/printer");   # but is this?
55    DEV->autoflush(1);
56
57or even this:
58
59    use IO::Socket;               # this one is kinda a pipe?
60    $sock = IO::Socket::INET->new(PeerAddr => 'www.perl.com',
61                                  PeerPort => 'http(80)',
62                                  Proto    => 'tcp');
63    die "$!" unless $sock;
64
65    $sock->autoflush();
66    print $sock "GET / HTTP/1.0" . "\015\012" x 2;
67    $document = join('', <$sock>);
68    print "DOC IS: $document\n";
69
70Note the bizarrely hardcoded carriage return and newline in their octal
71equivalents.  This is the ONLY way (currently) to assure a proper flush
72on all platforms, including Macintosh.  That's the way things work in
73network programming: you really should specify the exact bit pattern
74on the network line terminator.  In practice, C<"\n\n"> often works,
75but this is not portable.
76
77See L<perlfaq9> for other examples of fetching URLs over the web.
78
79=head2 How do I change one line in a file/delete a line in a file/insert a line in the middle of a file/append to the beginning of a file?
80
81Those are operations of a text editor.  Perl is not a text editor.
82Perl is a programming language.  You have to decompose the problem into
83low-level calls to read, write, open, close, and seek.
84
85Although humans have an easy time thinking of a text file as being a
86sequence of lines that operates much like a stack of playing cards--or
87punch cards--computers usually see the text file as a sequence of bytes.
88In general, there's no direct way for Perl to seek to a particular line
89of a file, insert text into a file, or remove text from a file.
90
91(There are exceptions in special circumstances.  You can add or remove
92data at the very end of the file.  A sequence of bytes can be replaced
93with another sequence of the same length.  The C<$DB_RECNO> array
94bindings as documented in L<DB_File> also provide a direct way of
95modifying a file.  Files where all lines are the same length are also
96easy to alter.)
97
98The general solution is to create a temporary copy of the text file with
99the changes you want, then copy that over the original.  This assumes
100no locking.
101
102    $old = $file;
103    $new = "$file.tmp.$$";
104    $bak = "$file.orig";
105
106    open(OLD, "< $old")         or die "can't open $old: $!";
107    open(NEW, "> $new")         or die "can't open $new: $!";
108
109    # Correct typos, preserving case
110    while (<OLD>) {
111        s/\b(p)earl\b/${1}erl/i;
112        (print NEW $_)          or die "can't write to $new: $!";
113    }
114
115    close(OLD)                  or die "can't close $old: $!";
116    close(NEW)                  or die "can't close $new: $!";
117
118    rename($old, $bak)          or die "can't rename $old to $bak: $!";
119    rename($new, $old)          or die "can't rename $new to $old: $!";
120
121Perl can do this sort of thing for you automatically with the C<-i>
122command-line switch or the closely-related C<$^I> variable (see
123L<perlrun> for more details).  Note that
124C<-i> may require a suffix on some non-Unix systems; see the
125platform-specific documentation that came with your port.
126
    # Renumber a series of tests from the command line
    perl -pi -e 's/(^\s+test\s+)\d+/ $1 . ++$count /e' t/op/taint.t

    # from a script
    local($^I, @ARGV) = ('.orig', glob("*.c"));
    while (<>) {
        if ($. == 1) {
            print "This line should appear at the top of each file\n";
        }
        s/\b(p)earl\b/${1}erl/i;        # Correct typos, preserving case
        print;
        close ARGV if eof;              # Reset $.
    }
140
141If you need to seek to an arbitrary line of a file that changes
142infrequently, you could build up an index of byte positions of where
143the line ends are in the file.  If the file is large, an index of
144every tenth or hundredth line end would allow you to seek and read
145fairly efficiently.  If the file is sorted, try the look.pl library
146(part of the standard perl distribution).
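
The index idea above can be sketched as follows.  This is only an
illustration, assuming Perl 5.6 or later (for lexical filehandles and
three-argument open); the names C<build_line_index> and C<fetch_line>
are made up for this example and are not part of any library.  It
records the byte offset of every line start; for a large file you
could keep only every tenth or hundredth offset and read forward from
there.

```perl
use strict;
use warnings;

# Build an index of the byte offsets at which each line starts.
# $index->[$n] is the offset of line $n (counting from 0).
sub build_line_index {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "can't open $path: $!";
    my @index = (0);
    while (<$fh>) {
        push @index, tell($fh);
    }
    pop @index;     # the last offset is end-of-file, not a line start
    close($fh) or die "can't close $path: $!";
    return \@index;
}

# Fetch line $n (counting from 0) by seeking straight to its offset.
sub fetch_line {
    my ($path, $index, $n) = @_;
    return undef if $n < 0 || $n > $#$index;
    open(my $fh, '<', $path) or die "can't open $path: $!";
    seek($fh, $index->[$n], 0) or die "can't seek in $path: $!";
    my $line = <$fh>;
    close($fh);
    return $line;
}
```

The index is only valid as long as the file doesn't change, so rebuild
it whenever the file's modification time moves.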
147
148In the unique case of deleting lines at the end of a file, you
149can use tell() and truncate().  The following code snippet deletes
150the last line of a file without making a copy or reading the
151whole file into memory:
152
153        open (FH, "+< $file");
154        while ( <FH> ) { $addr = tell(FH) unless eof(FH) }
155        truncate(FH, $addr);
156
157Error checking is left as an exercise for the reader.
158
159=head2 How do I count the number of lines in a file?
160
161One fairly efficient way is to count newlines in the file. The
162following program uses a feature of tr///, as documented in L<perlop>.
163If your text file doesn't end with a newline, then it's not really a
164proper text file, so this may report one fewer line than you expect.
165
166    $lines = 0;
167    open(FILE, $filename) or die "Can't open `$filename': $!";
168    while (sysread FILE, $buffer, 4096) {
169        $lines += ($buffer =~ tr/\n//);
170    }
171    close FILE;
172
173This assumes no funny games with newline translations.
174
175=head2 How do I make a temporary file name?
176
177Use the C<new_tmpfile> class method from the IO::File module to get a
178filehandle opened for reading and writing.  Use it if you don't
179need to know the file's name:
180
181    use IO::File;
182    $fh = IO::File->new_tmpfile()
183        or die "Unable to make new temporary file: $!";
184
185If you do need to know the file's name, you can use the C<tmpnam>
186function from the POSIX module to get a filename that you then open
187yourself:
188
189
190    use Fcntl;
191    use POSIX qw(tmpnam);
192
193    # try new temporary filenames until we get one that didn't already
194    # exist;  the check should be unnecessary, but you can't be too careful
195    do { $name = tmpnam() }
196        until sysopen(FH, $name, O_RDWR|O_CREAT|O_EXCL);
197
198    # install atexit-style handler so that when we exit or die,
199    # we automatically delete this temporary file
200    END { unlink($name) or die "Couldn't unlink $name : $!" }
201
202    # now go on to use the file ...
203
204If you're committed to creating a temporary file by hand, use the
205process ID and/or the current time-value.  If you need to have many
206temporary files in one process, use a counter:
207
    BEGIN {
        use Fcntl;
        my $temp_dir = -d '/tmp' ? '/tmp' : $ENV{TMP} || $ENV{TEMP};
        my $base_name = sprintf("%s/%d-%d-0000", $temp_dir, $$, time());
        sub temp_file {
            local *FH;
            my $count = 0;
            until (defined(fileno(FH)) || $count++ > 100) {
                $base_name =~ s/-(\d+)$/"-" . (1 + $1)/e;
                sysopen(FH, $base_name, O_WRONLY|O_EXCL|O_CREAT);
            }
            if (defined(fileno(FH))) {
                return (*FH, $base_name);
            } else {
                return ();
            }
        }
    }
226
227=head2 How can I manipulate fixed-record-length files?
228
229The most efficient way is using pack() and unpack().  This is faster than
230using substr() when taking many, many strings.  It is slower for just a few.
231
232Here is a sample chunk of code to break up and put back together again
233some fixed-format input lines, in this case from the output of a normal,
234Berkeley-style ps:
235
236    # sample input line:
237    #   15158 p5  T      0:00 perl /home/tchrist/scripts/now-what
238    $PS_T = 'A6 A4 A7 A5 A*';
239    open(PS, "ps|");
240    print scalar <PS>;
241    while (<PS>) {
242        ($pid, $tt, $stat, $time, $command) = unpack($PS_T, $_);
243        for $var (qw!pid tt stat time command!) {
244            print "$var: <$$var>\n";
245        }
246        print 'line=', pack($PS_T, $pid, $tt, $stat, $time, $command),
247                "\n";
248    }
249
We've used C<$$var> in a way that is forbidden by C<use strict 'refs'>.
That is, we've promoted a string to a scalar variable reference using
symbolic references.  This is ok in small programs, but doesn't scale
well.  It also only works on global variables, not lexicals.
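
One way to avoid the symbolic references is to unpack each record
into a hash keyed by field name.  The following is a sketch rather
than a drop-in replacement; the sub name C<parse_ps_line> and the
C<@fields> list are invented for this example, and the template is
the same C<$PS_T> used above.

```perl
use strict;
use warnings;

my $PS_T   = 'A6 A4 A7 A5 A*';
my @fields = qw(pid tt stat time command);

# Unpack one fixed-width line into a hash keyed by field name.
sub parse_ps_line {
    my ($line) = @_;
    my %rec;
    @rec{@fields} = unpack($PS_T, $line);   # hash slice, no symbolic refs
    return \%rec;
}
```

A caller can then write C<< $rec->{pid} >> and so on, which works
under C<use strict> and with lexical variables.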
254
255=head2 How can I make a filehandle local to a subroutine?  How do I pass filehandles between subroutines?  How do I make an array of filehandles?
256
257The fastest, simplest, and most direct way is to localize the typeglob
258of the filehandle in question:
259
260    local *TmpHandle;
261
262Typeglobs are fast (especially compared with the alternatives) and
263reasonably easy to use, but they also have one subtle drawback.  If you
264had, for example, a function named TmpHandle(), or a variable named
265%TmpHandle, you just hid it from yourself.
266
267    sub findme {
268        local *HostFile;
269        open(HostFile, "</etc/hosts") or die "no /etc/hosts: $!";
270        local $_;               # <- VERY IMPORTANT
271        while (<HostFile>) {
272            print if /\b127\.(0\.0\.)?1\b/;
273        }
274        # *HostFile automatically closes/disappears here
275    }
276
277Here's how to use typeglobs in a loop to open and store a bunch of
278filehandles.  We'll use as values of the hash an ordered
279pair to make it easy to sort the hash in insertion order.
280
281    @names = qw(motd termcap passwd hosts);
282    my $i = 0;
283    foreach $filename (@names) {
284        local *FH;
285        open(FH, "/etc/$filename") || die "$filename: $!";
286        $file{$filename} = [ $i++, *FH ];
287    }
288
289    # Using the filehandles in the array
290    foreach $name (sort { $file{$a}[0] <=> $file{$b}[0] } keys %file) {
291        my $fh = $file{$name}[1];
292        my $line = <$fh>;
293        print "$name $. $line";
294    }
295
296For passing filehandles to functions, the easiest way is to
297preface them with a star, as in func(*STDIN). 
298See L<perlfaq7/"Passing Filehandles"> for details.
299
300If you want to create many anonymous handles, you should check out the
301Symbol, FileHandle, or IO::Handle (etc.) modules.  Here's the equivalent
302code with Symbol::gensym, which is reasonably light-weight:
303
304    foreach $filename (@names) {
305        use Symbol;
306        my $fh = gensym();
307        open($fh, "/etc/$filename") || die "open /etc/$filename: $!";
308        $file{$filename} = [ $i++, $fh ];
309    }
310
311Here's using the semi-object-oriented FileHandle module, which certainly
312isn't light-weight:
313
314    use FileHandle;
315
316    foreach $filename (@names) {
317        my $fh = FileHandle->new("/etc/$filename") or die "$filename: $!";
318        $file{$filename} = [ $i++, $fh ];
319    }
320
321Please understand that whether the filehandle happens to be a (probably
322localized) typeglob or an anonymous handle from one of the modules
323in no way affects the bizarre rules for managing indirect handles.
324See the next question.
325
326=head2 How can I use a filehandle indirectly?
327
328An indirect filehandle is using something other than a symbol
329in a place that a filehandle is expected.  Here are ways
330to get indirect filehandles:
331
332    $fh =   SOME_FH;       # bareword is strict-subs hostile
333    $fh =  "SOME_FH";      # strict-refs hostile; same package only
334    $fh =  *SOME_FH;       # typeglob
335    $fh = \*SOME_FH;       # ref to typeglob (bless-able)
336    $fh =  *SOME_FH{IO};   # blessed IO::Handle from *SOME_FH typeglob
337
338Or, you can use the C<new> method from the FileHandle or IO modules to
339create an anonymous filehandle, store that in a scalar variable,
340and use it as though it were a normal filehandle.
341
342    use FileHandle;
343    $fh = FileHandle->new();
344
345    use IO::Handle;                     # 5.004 or higher
346    $fh = IO::Handle->new();
347
Then use any of those as you would a normal filehandle.  Anywhere that
Perl is expecting a filehandle, an indirect filehandle may be used
instead. An indirect filehandle is just a scalar variable that contains
a filehandle.  Functions like C<print>, C<open>, C<seek>, or
the C<< <FH> >> diamond operator will accept either a real filehandle
or a scalar variable containing one:
354
    ($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR);
    print $ofh "Type it: ";
    $got = <$ifh>;
    print $efh "What was that: $got";
359
360If you're passing a filehandle to a function, you can write
361the function in two ways:
362
363    sub accept_fh {
364        my $fh = shift;
365        print $fh "Sending to indirect filehandle\n";
366    }
367
368Or it can localize a typeglob and use the filehandle directly:
369
370    sub accept_fh {
371        local *FH = shift;
372        print  FH "Sending to localized filehandle\n";
373    }
374
375Both styles work with either objects or typeglobs of real filehandles.
376(They might also work with strings under some circumstances, but this
377is risky.)
378
379    accept_fh(*STDOUT);
380    accept_fh($handle);
381
In the examples above, we assigned the filehandle to a scalar variable
before using it.  That is because only simple scalar variables, not
expressions or subscripts of hashes or arrays, can be used with
built-ins like C<print>, C<printf>, or the diamond operator.  Using
something other than a simple scalar variable as a filehandle is
illegal and won't even compile:
388
389    @fd = (*STDIN, *STDOUT, *STDERR);
390    print $fd[1] "Type it: ";                           # WRONG
391    $got = <$fd[0]>                                     # WRONG
392    print $fd[2] "What was that: $got";                 # WRONG
393
394With C<print> and C<printf>, you get around this by using a block and
395an expression where you would place the filehandle:
396
397    print  { $fd[1] } "funny stuff\n";
398    printf { $fd[1] } "Pity the poor %x.\n", 3_735_928_559;
399    # Pity the poor deadbeef.
400
401That block is a proper block like any other, so you can put more
402complicated code there.  This sends the message out to one of two places:
403
404    $ok = -x "/bin/cat";               
405    print { $ok ? $fd[1] : $fd[2] } "cat stat $ok\n";
406    print { $fd[ 1+ ($ok || 0) ]  } "cat stat $ok\n";           
407
This approach of treating C<print> and C<printf> like object method
calls doesn't work for the diamond operator.  That's because it's a
real operator, not just a function with a comma-less argument.  Assuming
you've been storing typeglobs in your structure as we did above, you
can use the built-in function named C<readline> to read a record just
as C<< <> >> does.  Given the initialization shown above for @fd, this
would work, but only because readline() requires a typeglob.  It doesn't
work with objects or strings, which might be a bug we haven't fixed yet.
416
417    $got = readline($fd[0]);
418
419Let it be noted that the flakiness of indirect filehandles is not
420related to whether they're strings, typeglobs, objects, or anything else.
421It's the syntax of the fundamental operators.  Playing the object
422game doesn't help you at all here.
423
424=head2 How can I set up a footer format to be used with write()?
425
426There's no builtin way to do this, but L<perlform> has a couple of
427techniques to make it possible for the intrepid hacker.
428
429=head2 How can I write() into a string?
430
431See L<perlform/"Accessing Formatting Internals"> for an swrite() function.
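
For orientation, here is a minimal sketch of such a function, built
on the C<formline> builtin and the C<$^A> format accumulator.  It is
an approximation for illustration only; consult L<perlform> for the
version the documentation actually recommends.

```perl
use strict;
use warnings;

# Format @args according to the picture line(s) in $picture and return
# the result as a string instead of writing it to a filehandle.
sub swrite {
    my ($picture, @args) = @_;
    $^A = "";                   # clear the format accumulator
    formline($picture, @args);  # formline appends its output to $^A
    my $result = $^A;
    $^A = "";                   # leave the accumulator clean for others
    return $result;
}

my $string = swrite('@<<<<< @>>>>>' . "\n", "left", "right");
```

The picture uses the same field syntax as an ordinary format
declaration.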
432
433=head2 How can I output my numbers with commas added?
434
435This one will do it for you:
436
437    sub commify {
438        local $_  = shift;
439        1 while s/^([-+]?\d+)(\d{3})/$1,$2/;
440        return $_;
441    }
442
443    $n = 23659019423.2331;
444    print "GOT: ", commify($n), "\n";
445
446    GOT: 23,659,019,423.2331
447
448You can't just:
449
450    s/^([-+]?\d+)(\d{3})/$1,$2/g;
451
452because you have to put the comma in and then recalculate your
453position.
454
455Alternatively, this code commifies all numbers in a line regardless of
456whether they have decimal portions, are preceded by + or -, or
457whatever:
458
    # from Andrew Johnson <ajohnson@gpu.srv.ualberta.ca>
    sub commify {
        my $input = shift;
        $input = reverse $input;
        $input =~ s<(\d\d\d)(?=\d)(?!\d*\.)><$1,>g;
        return scalar reverse $input;
    }
466
467=head2 How can I translate tildes (~) in a filename?
468
Use the <> (glob()) operator, documented in L<perlfunc>.  Older
versions of Perl require that you have a shell installed that groks
tildes.  Recent perl versions have this feature built in. The
File::KGlob module (available from CPAN) gives more portable glob
functionality.
474
475Within Perl, you may use this directly:
476
477        $filename =~ s{
478          ^ ~             # find a leading tilde
479          (               # save this in $1
480              [^/]        # a non-slash character
481                    *     # repeated 0 or more times (0 means me)
482          )
483        }{
484          $1
485              ? (getpwnam($1))[7]
486              : ( $ENV{HOME} || $ENV{LOGDIR} )
487        }ex;
488
489=head2 How come when I open a file read-write it wipes it out?
490
491Because you're using something like this, which truncates the file and
492I<then> gives you read-write access:
493
494    open(FH, "+> /path/name");          # WRONG (almost always)
495
496Whoops.  You should instead use this, which will fail if the file
497doesn't exist. 
498
499    open(FH, "+< /path/name");          # open for update
500
501Using ">" always clobbers or creates.  Using "<" never does
502either.  The "+" doesn't change this.
503
504Here are examples of many kinds of file opens.  Those using sysopen()
505all assume
506
507    use Fcntl;
508
509To open file for reading:
510
511    open(FH, "< $path")                                 || die $!;
512    sysopen(FH, $path, O_RDONLY)                        || die $!;
513
514To open file for writing, create new file if needed or else truncate old file:
515
516    open(FH, "> $path") || die $!;
517    sysopen(FH, $path, O_WRONLY|O_TRUNC|O_CREAT)        || die $!;
518    sysopen(FH, $path, O_WRONLY|O_TRUNC|O_CREAT, 0666)  || die $!;
519
520To open file for writing, create new file, file must not exist:
521
522    sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT)         || die $!;
523    sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT, 0666)   || die $!;
524
525To open file for appending, create if necessary:
526
527    open(FH, ">> $path") || die $!;
528    sysopen(FH, $path, O_WRONLY|O_APPEND|O_CREAT)       || die $!;
529    sysopen(FH, $path, O_WRONLY|O_APPEND|O_CREAT, 0666) || die $!;
530
531To open file for appending, file must exist:
532
533    sysopen(FH, $path, O_WRONLY|O_APPEND)               || die $!;
534
535To open file for update, file must exist:
536
537    open(FH, "+< $path")                                || die $!;
538    sysopen(FH, $path, O_RDWR)                          || die $!;
539
540To open file for update, create file if necessary:
541
542    sysopen(FH, $path, O_RDWR|O_CREAT)                  || die $!;
543    sysopen(FH, $path, O_RDWR|O_CREAT, 0666)            || die $!;
544
545To open file for update, file must not exist:
546
547    sysopen(FH, $path, O_RDWR|O_EXCL|O_CREAT)           || die $!;
548    sysopen(FH, $path, O_RDWR|O_EXCL|O_CREAT, 0666)     || die $!;
549
550To open a file without blocking, creating if necessary:
551
    sysopen(FH, "/tmp/somefile", O_WRONLY|O_NDELAY|O_CREAT)
            or die "can't open /tmp/somefile: $!";
554
555Be warned that neither creation nor deletion of files is guaranteed to
556be an atomic operation over NFS.  That is, two processes might both
557successfully create or unlink the same file!  Therefore O_EXCL
558isn't as exclusive as you might wish.
559
560See also the new L<perlopentut> if you have it (new for 5.6).
561
562=head2 Why do I sometimes get an "Argument list too long" when I use <*>?
563
564The C<< <> >> operator performs a globbing operation (see above).
565In Perl versions earlier than v5.6.0, the internal glob() operator forks
566csh(1) to do the actual glob expansion, but
567csh can't handle more than 127 items and so gives the error message
568C<Argument list too long>.  People who installed tcsh as csh won't
569have this problem, but their users may be surprised by it.
570
To get around this, either upgrade to Perl v5.6.0 or later, do the glob
yourself with readdir() and patterns, or use a module like File::KGlob,
one that doesn't use the shell to do globbing.
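
Doing the glob yourself with readdir() might look like the sketch
below.  The sub name C<glob_dir> is hypothetical, and the pattern is a
Perl regular expression rather than a shell wildcard, so C<*.c>
becomes C<qr/\.c\z/>.  It assumes Perl 5.6 or later for the lexical
directory handle.

```perl
use strict;
use warnings;

# Return the sorted names in $dir that match the regex $pattern,
# without involving a shell.  Like the shell's "*", this skips
# names that begin with a dot.
sub glob_dir {
    my ($dir, $pattern) = @_;
    opendir(my $dh, $dir) or die "can't opendir $dir: $!";
    my @names = sort grep { !/^\./ && /$pattern/ } readdir($dh);
    closedir($dh);
    return @names;
}

# Roughly what <*.c> would give in the current directory:
#   my @c_files = glob_dir('.', qr/\.c\z/);
```

Because nothing is handed to csh, the 127-item limit doesn't apply.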
574
575=head2 Is there a leak/bug in glob()?
576
577Due to the current implementation on some operating systems, when you
578use the glob() function or its angle-bracket alias in a scalar
579context, you may cause a memory leak and/or unpredictable behavior.  It's
580best therefore to use glob() only in list context.
581
582=head2 How can I open a file with a leading ">" or trailing blanks?
583
584Normally perl ignores trailing blanks in filenames, and interprets
585certain leading characters (or a trailing "|") to mean something
586special.  To avoid this, you might want to use a routine like the one below.
587It turns incomplete pathnames into explicit relative ones, and tacks a
588trailing null byte on the name to make perl leave it alone:
589
    sub safe_filename {
        local $_  = shift;
        s#^([^./])#./$1#;
        $_ .= "\0";
        return $_;
    }

    $badpath = "<<<something really wicked   ";
    $fn = safe_filename($badpath);
    open(FH, "> $fn") or die "couldn't open $badpath: $!";
600
601This assumes that you are using POSIX (portable operating systems
602interface) paths.  If you are on a closed, non-portable, proprietary
603system, you may have to adjust the C<"./"> above.
604
605It would be a lot clearer to use sysopen(), though:
606
607    use Fcntl;
608    $badpath = "<<<something really wicked   ";
609    sysopen (FH, $badpath, O_WRONLY | O_CREAT | O_TRUNC)
610        or die "can't open $badpath: $!";
611
612For more information, see also the new L<perlopentut> if you have it
613(new for 5.6).
614
615=head2 How can I reliably rename a file?
616
617Well, usually you just use Perl's rename() function.  That may not
618work everywhere, though, particularly when renaming files across file systems.
619Some sub-Unix systems have broken ports that corrupt the semantics of
620rename()--for example, WinNT does this right, but Win95 and Win98
621are broken.  (The last two parts are not surprising, but the first is. :-)
622
623If your operating system supports a proper mv(1) program or its moral
624equivalent, this works:
625
626    rename($old, $new) or system("mv", $old, $new);
627
It may be more compelling to use the File::Copy module instead.  You
just copy the file to the new name (checking return values),
then delete the old one.  This isn't really the same semantically as a
real rename(), though, which preserves metainformation like
permissions, timestamps, inode info, etc.

Newer versions of File::Copy export a move() function.
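
A short sketch of the move() approach follows, assuming a File::Copy
recent enough to export it.  The wrapper name C<move_file> is made up
for this example.

```perl
use strict;
use warnings;
use File::Copy qw(move);

# Rename $old to $new.  move() tries rename() first and falls back to
# copying the file and deleting the original if that fails.
sub move_file {
    my ($old, $new) = @_;
    move($old, $new) or die "can't move $old to $new: $!";
    return 1;
}
```

When the fallback copy is used, the same caveat about lost
metainformation applies.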
635
636=head2 How can I lock a file?
637
638Perl's builtin flock() function (see L<perlfunc> for details) will call
639flock(2) if that exists, fcntl(2) if it doesn't (on perl version 5.004 and
640later), and lockf(3) if neither of the two previous system calls exists.
641On some systems, it may even use a different form of native locking.
642Here are some gotchas with Perl's flock():
643
644=over 4
645
646=item 1
647
648Produces a fatal error if none of the three system calls (or their
649close equivalent) exists.
650
651=item 2
652
653lockf(3) does not provide shared locking, and requires that the
654filehandle be open for writing (or appending, or read/writing).
655
656=item 3
657
658Some versions of flock() can't lock files over a network (e.g. on NFS file
659systems), so you'd need to force the use of fcntl(2) when you build Perl.
660But even this is dubious at best.  See the flock entry of L<perlfunc>
661and the F<INSTALL> file in the source distribution for information on
662building Perl to do this.
663
664Two potentially non-obvious but traditional flock semantics are that
665it waits indefinitely until the lock is granted, and that its locks are
666I<merely advisory>.  Such discretionary locks are more flexible, but
667offer fewer guarantees.  This means that files locked with flock() may
668be modified by programs that do not also use flock().  Cars that stop
669for red lights get on well with each other, but not with cars that don't
670stop for red lights.  See the perlport manpage, your port's specific
671documentation, or your system-specific local manpages for details.  It's
672best to assume traditional behavior if you're writing portable programs.
673(If you're not, you should as always feel perfectly free to write
674for your own system's idiosyncrasies (sometimes called "features").
675Slavish adherence to portability concerns shouldn't get in the way of
676your getting your job done.)
677
678For more information on file locking, see also
679L<perlopentut/"File Locking"> if you have it (new for 5.6).
680
681=back
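
To make the above concrete, here is a small sketch of taking an
exclusive advisory lock before appending to a file.  The sub name
C<locked_append> is invented for this example, and it assumes Perl
5.6 or later and a platform where flock() works on the file system in
question.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Append $text to $path while holding an exclusive advisory lock.
sub locked_append {
    my ($path, $text) = @_;
    open(my $fh, '>>', $path) or die "can't open $path: $!";
    flock($fh, LOCK_EX)       or die "can't lock $path: $!";
    # Someone may have appended while we waited for the lock.
    seek($fh, 0, 2)           or die "can't seek $path: $!";
    print {$fh} $text         or die "can't write $path: $!";
    # Closing the handle flushes the data and releases the lock.
    close($fh)                or die "can't close $path: $!";
    return 1;
}
```

If you'd rather not wait indefinitely, pass C<LOCK_EX | LOCK_NB>
instead; flock() then returns false immediately when another process
holds the lock.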
682
683=head2 Why can't I just open(FH, ">file.lock")?
684
685A common bit of code B<NOT TO USE> is this:
686
687    sleep(3) while -e "file.lock";      # PLEASE DO NOT USE
688    open(LCK, "> file.lock");           # THIS BROKEN CODE
689
690This is a classic race condition: you take two steps to do something
691which must be done in one.  That's why computer hardware provides an
692atomic test-and-set instruction.   In theory, this "ought" to work:
693
    sysopen(FH, "file.lock", O_WRONLY|O_EXCL|O_CREAT)
                or die "can't open  file.lock: $!";
696
697except that lamentably, file creation (and deletion) is not atomic
698over NFS, so this won't work (at least, not every time) over the net.
699Various schemes involving link() have been suggested, but
700these tend to involve busy-wait, which is also subdesirable.
701
702=head2 I still don't get locking.  I just want to increment the number in the file.  How can I do this?
703
704Didn't anyone ever tell you web-page hit counters were useless?
705They don't count number of hits, they're a waste of time, and they serve
706only to stroke the writer's vanity.  It's better to pick a random number;
707they're more realistic.
708
709Anyway, this is what you can do if you can't help yourself.
710
711    use Fcntl qw(:DEFAULT :flock);
712    sysopen(FH, "numfile", O_RDWR|O_CREAT)       or die "can't open numfile: $!";
713    flock(FH, LOCK_EX)                           or die "can't flock numfile: $!";
714    $num = <FH> || 0;
715    seek(FH, 0, 0)                               or die "can't rewind numfile: $!";
716    truncate(FH, 0)                              or die "can't truncate numfile: $!";
717    (print FH $num+1, "\n")                      or die "can't write numfile: $!";
718    close FH                                     or die "can't close numfile: $!";
719
720Here's a much better web-page hit counter:
721
722    $hits = int( (time() - 850_000_000) / rand(1_000) );
723
724If the count doesn't impress your friends, then the code might.  :-)
725
726=head2 How do I randomly update a binary file?
727
728If you're just trying to patch a binary, in many cases something as
729simple as this works:
730
731    perl -i -pe 's{window manager}{window mangler}g' /usr/bin/emacs
732
733However, if you have fixed sized records, then you might do something more
734like this:
735
736    $RECSIZE = 220; # size of record, in bytes
737    $recno   = 37;  # which record to update
738    open(FH, "+<somewhere") || die "can't update somewhere: $!";
739    seek(FH, $recno * $RECSIZE, 0);
740    read(FH, $record, $RECSIZE) == $RECSIZE || die "can't read record $recno: $!";
741    # munge the record
742    seek(FH, -$RECSIZE, 1);
743    print FH $record;
744    close FH;
745
746Locking and error checking are left as an exercise for the reader.
747Don't forget them or you'll be quite sorry.
748
749=head2 How do I get a file's timestamp in perl?
750
If you want to retrieve the time at which the file was last read,
written, or had its meta-data (owner, etc) changed, you use the B<-A>,
B<-M>, or B<-C> filetest operations as documented in L<perlfunc>.  These
retrieve the age of the file (measured against the start-time of your
program) in days as a floating point number.  To retrieve the "raw"
time in seconds since the epoch, you would call the stat function,
then use localtime(), gmtime(), or POSIX::strftime() to convert this
into human-readable form.
759
760Here's an example:
761
762    $write_secs = (stat($file))[9];
763    printf "file %s updated at %s\n", $file,
764        scalar localtime($write_secs);
765
766If you prefer something more legible, use the File::stat module
767(part of the standard distribution in version 5.004 and later):
768
769    # error checking left as an exercise for reader.
770    use File::stat;
771    use Time::localtime;
772    $date_string = ctime(stat($file)->mtime);
773    print "file $file updated at $date_string\n";
774
775The POSIX::strftime() approach has the benefit of being,
776in theory, independent of the current locale.  See L<perllocale>
777for details.
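
Here is a sketch of the POSIX::strftime() approach.  The sub name
C<mtime_string> and the choice of format are this example's own; it
sticks to numeric conversions, which should not vary with the locale,
and reports the time in UTC.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Return a file's modification time as "YYYY-MM-DD HH:MM:SS" in UTC,
# or undef if the file can't be stat()ed.
sub mtime_string {
    my ($file) = @_;
    my @st = stat($file) or return;
    return strftime("%Y-%m-%d %H:%M:%S", gmtime($st[9]));
}
```

Swap gmtime() for localtime() if you want the local time zone.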
778
779=head2 How do I set a file's timestamp in perl?
780
781You use the utime() function documented in L<perlfunc/utime>.
782By way of example, here's a little program that copies the
783read and write times from its first argument to all the rest
784of them.
785
786    if (@ARGV < 2) {
787        die "usage: cptimes timestamp_file other_files ...\n";
788    }
789    $timestamp = shift;
790    ($atime, $mtime) = (stat($timestamp))[8,9];
791    utime $atime, $mtime, @ARGV;
792
793Error checking is, as usual, left as an exercise for the reader.
794
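For what it's worth, here is one way that error checking might go.
utime() returns the number of files it changed, so calling it on one
file at a time lets you report which ones failed.  The function name
C<copy_times> and its return convention are assumptions for the sake
of the example.

```perl
use strict;
use warnings;

# Hypothetical helper: copy access and modification times from
# $timestamp to each of @files; returns a list of failure messages.
sub copy_times {
    my ($timestamp, @files) = @_;

    my ($atime, $mtime) = (stat($timestamp))[8, 9];
    defined($mtime) or die "can't stat $timestamp: $!";

    my @failed;
    for my $file (@files) {
        # utime() returns how many files were successfully changed.
        utime($atime, $mtime, $file) == 1
            or push @failed, "$file: $!";
    }
    return @failed;
}
```
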
795Note that utime() currently doesn't work correctly with Win95/NT
796ports.  A bug has been reported.  Check it carefully before using
797utime() on those platforms.
798
799=head2 How do I print to more than one file at once?
800
801If you only have to do this once, you can do this:
802
803    for $fh (FH1, FH2, FH3) { print $fh "whatever\n" }
804
To connect one filehandle to several output filehandles, it's
easiest to use the tee(1) program if you have it, and let it take care
of the multiplexing:
808
809    open (FH, "| tee file1 file2 file3");
810
811Or even:
812
813    # make STDOUT go to three files, plus original STDOUT
814    open (STDOUT, "| tee file1 file2 file3") or die "Teeing off: $!\n";
815    print "whatever\n"                       or die "Writing: $!\n";
816    close(STDOUT)                            or die "Closing: $!\n";
817
818Otherwise you'll have to write your own multiplexing print
819function--or your own tee program--or use Tom Christiansen's,
820at http://www.perl.com/CPAN/authors/id/TOMC/scripts/tct.gz , which is
821written in Perl and offers much greater functionality
822than the stock version.
823
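A multiplexing print function need not be elaborate.  The sketch below
is only one possible shape for it; the name C<print_to_all> and the
choice to pass the handles as an array reference are assumptions, not
anything the tee(1) program or Perl itself prescribes.

```perl
use strict;
use warnings;

# Hypothetical helper: print @text to every handle in @$handles.
# Returns true only if every print succeeded.
sub print_to_all {
    my ($handles, @text) = @_;
    my $ok = 1;
    for my $fh (@$handles) {
        print {$fh} @text or $ok = 0;
    }
    return $ok;
}
```

You might call it as C<print_to_all([\*STDOUT, \*LOG], "whatever\n")>,
given a LOG handle you have already opened.
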
824=head2 How can I read in an entire file all at once?
825
826The customary Perl approach for processing all the lines in a file is to
827do so one line at a time:
828
829    open (INPUT, $file)         || die "can't open $file: $!";
830    while (<INPUT>) {
831        chomp;
832        # do something with $_
833    }
834    close(INPUT)                || die "can't close $file: $!";
835
836This is tremendously more efficient than reading the entire file into
837memory as an array of lines and then processing it one element at a time,
838which is often--if not almost always--the wrong approach.  Whenever
839you see someone do this:
840
841    @lines = <INPUT>;
842
you should think long and hard about why you need everything loaded
at once.  It's just not a scalable solution.  You might also find it
more fun to use the standard DB_File module's $DB_RECNO bindings,
which allow you to tie an array to a file so that accessing an element
of the array actually accesses the corresponding line in the file.
848
849On very rare occasion, you may have an algorithm that demands that
850the entire file be in memory at once as one scalar.  The simplest solution
851to that is
852
853    $var = `cat $file`;
854
855Being in scalar context, you get the whole thing.  In list context,
856you'd get a list of all the lines:
857
858    @lines = `cat $file`;
859
860This tiny but expedient solution is neat, clean, and portable to
861all systems on which decent tools have been installed.  For those
862who prefer not to use the toolbox, you can of course read the file
863manually, although this makes for more complicated code.
864
865    {
866        local(*INPUT, $/);
867        open (INPUT, $file)     || die "can't open $file: $!";
868        $var = <INPUT>;
869    }
870
871That temporarily undefs your record separator, and will automatically
872close the file at block exit.  If the file is already open, just use this:
873
874    $var = do { local $/; <INPUT> };
875
876=head2 How can I read in a file by paragraphs?
877
878Use the C<$/> variable (see L<perlvar> for details).  You can either
879set it to C<""> to eliminate empty paragraphs (C<"abc\n\n\n\ndef">,
880for instance, gets treated as two paragraphs and not three), or
881C<"\n\n"> to accept empty paragraphs.
882
883Note that a blank line must have no blanks in it.  Thus C<"fred\n
884\nstuff\n\n"> is one paragraph, but C<"fred\n\nstuff\n\n"> is two.
885
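To see the difference between the two settings, the following sketch
splits a string both ways.  It assumes a perl recent enough to open a
filehandle on a scalar reference (5.8 or later); the helper name
C<paragraphs> is invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: split $text into records using $separator as $/.
sub paragraphs {
    my ($text, $separator) = @_;
    local $/ = $separator;
    open(my $fh, '<', \$text) or die "can't open in-memory file: $!";
    my @paras = <$fh>;
    close($fh);
    return @paras;
}

my $text    = "abc\n\n\n\ndef";
my @squeeze = paragraphs($text, "");      # paragraph mode: 2 records
my @literal = paragraphs($text, "\n\n");  # literal separator: 3 records
```
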
886=head2 How can I read a single character from a file?  From the keyboard?
887
888You can use the builtin C<getc()> function for most filehandles, but
889it won't (easily) work on a terminal device.  For STDIN, either use
890the Term::ReadKey module from CPAN or use the sample code in
891L<perlfunc/getc>.
892
893If your system supports the portable operating system programming
894interface (POSIX), you can use the following code, which you'll note
895turns off echo processing as well.
896
897    #!/usr/bin/perl -w
898    use strict;
899    $| = 1;
900    for (1..4) {
901        my $got;
902        print "gimme: ";
903        $got = getone();
904        print "--> $got\n";
905    }
906    exit;
907
908    BEGIN {
909        use POSIX qw(:termios_h);
910
911        my ($term, $oterm, $echo, $noecho, $fd_stdin);
912
913        $fd_stdin = fileno(STDIN);
914
915        $term     = POSIX::Termios->new();
916        $term->getattr($fd_stdin);
917        $oterm     = $term->getlflag();
918
919        $echo     = ECHO | ECHOK | ICANON;
920        $noecho   = $oterm & ~$echo;
921
922        sub cbreak {
923            $term->setlflag($noecho);
924            $term->setcc(VTIME, 1);
925            $term->setattr($fd_stdin, TCSANOW);
926        }
927
928        sub cooked {
929            $term->setlflag($oterm);
930            $term->setcc(VTIME, 0);
931            $term->setattr($fd_stdin, TCSANOW);
932        }
933
934        sub getone {
935            my $key = '';
936            cbreak();
937            sysread(STDIN, $key, 1);
938            cooked();
939            return $key;
940        }
941
942    }
943
944    END { cooked() }
945
The Term::ReadKey module from CPAN may be easier to use.  Recent versions
also include support for non-portable systems.
948
949    use Term::ReadKey;
950    open(TTY, "</dev/tty");
951    print "Gimme a char: ";
952    ReadMode "raw";
953    $key = ReadKey 0, *TTY;
954    ReadMode "normal";
955    printf "\nYou said %s, char number %03d\n",
956        $key, ord $key;
957
958For legacy DOS systems, Dan Carson <dbc@tc.fluke.COM> reports the following:
959
960To put the PC in "raw" mode, use ioctl with some magic numbers gleaned
961from msdos.c (Perl source file) and Ralf Brown's interrupt list (comes
962across the net every so often):
963
964    $old_ioctl = ioctl(STDIN,0,0);     # Gets device info
965    $old_ioctl &= 0xff;
966    ioctl(STDIN,1,$old_ioctl | 32);    # Writes it back, setting bit 5
967
968Then to read a single character:
969
970    sysread(STDIN,$c,1);               # Read a single character
971
972And to put the PC back to "cooked" mode:
973
974    ioctl(STDIN,1,$old_ioctl);         # Sets it back to cooked mode.
975
976So now you have $c.  If C<ord($c) == 0>, you have a two byte code, which
977means you hit a special key.  Read another byte with C<sysread(STDIN,$c,1)>,
978and that value tells you what combination it was according to this
979table:
980
981    # PC 2-byte keycodes = ^@ + the following:
982
983    # HEX     KEYS
984    # ---     ----
985    # 0F      SHF TAB
986    # 10-19   ALT QWERTYUIOP
987    # 1E-26   ALT ASDFGHJKL
988    # 2C-32   ALT ZXCVBNM
989    # 3B-44   F1-F10
990    # 47-49   HOME,UP,PgUp
991    # 4B      LEFT
992    # 4D      RIGHT
993    # 4F-53   END,DOWN,PgDn,Ins,Del
994    # 54-5D   SHF F1-F10
995    # 5E-67   CTR F1-F10
996    # 68-71   ALT F1-F10
997    # 73-77   CTR LEFT,RIGHT,END,PgDn,HOME
998    # 78-83   ALT 1234567890-=
999    # 84      CTR PgUp
1000
1001This is all trial and error I did a long time ago; I hope I'm reading the
1002file that worked...
1003
1004=head2 How can I tell whether there's a character waiting on a filehandle?
1005
1006The very first thing you should do is look into getting the Term::ReadKey
1007extension from CPAN.  As we mentioned earlier, it now even has limited
1008support for non-portable (read: not open systems, closed, proprietary,
1009not POSIX, not Unix, etc) systems.
1010
1011You should also check out the Frequently Asked Questions list in
1012comp.unix.* for things like this: the answer is essentially the same.
1013It's very system dependent.  Here's one solution that works on BSD
1014systems:
1015
1016    sub key_ready {
1017        my($rin, $nfd);
1018        vec($rin, fileno(STDIN), 1) = 1;
1019        return $nfd = select($rin,undef,undef,0);
1020    }
1021
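If building the bit vector by hand seems fiddly, the standard
IO::Select module wraps the same select() call.  This is a sketch
under the same caveats about system dependence; the name
C<handle_ready> is an assumption, and it takes the handle as an
argument rather than hard-coding STDIN.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical helper: true if $fh has input waiting right now.
sub handle_ready {
    my ($fh) = @_;
    my $sel   = IO::Select->new($fh);
    my @ready = $sel->can_read(0);   # timeout of 0 means poll, don't block
    return scalar @ready;
}
```
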
1022If you want to find out how many characters are waiting, there's
1023also the FIONREAD ioctl call to be looked at.  The I<h2ph> tool that
1024comes with Perl tries to convert C include files to Perl code, which
1025can be C<require>d.  FIONREAD ends up defined as a function in the
1026I<sys/ioctl.ph> file:
1027
1028    require 'sys/ioctl.ph';
1029
1030    $size = pack("L", 0);
1031    ioctl(FH, FIONREAD(), $size)    or die "Couldn't call ioctl: $!\n";
1032    $size = unpack("L", $size);
1033
1034If I<h2ph> wasn't installed or doesn't work for you, you can
1035I<grep> the include files by hand:
1036
1037    % grep FIONREAD /usr/include/*/*
1038    /usr/include/asm/ioctls.h:#define FIONREAD      0x541B
1039
1040Or write a small C program using the editor of champions:
1041
    % cat > fionread.c
    #include <stdio.h>
    #include <sys/ioctl.h>
    int main(void) {
        printf("%#08x\n", (unsigned int) FIONREAD);
        return 0;
    }
    ^D
    % cc -o fionread fionread.c
    % ./fionread
    0x4004667f

1052And then hard-code it, leaving porting as an exercise to your successor.
1053
1054    $FIONREAD = 0x4004667f;         # XXX: opsys dependent
1055
1056    $size = pack("L", 0);
1057    ioctl(FH, $FIONREAD, $size)     or die "Couldn't call ioctl: $!\n";
1058    $size = unpack("L", $size);
1059
1060FIONREAD requires a filehandle connected to a stream, meaning that sockets,
1061pipes, and tty devices work, but I<not> files.
1062
1063=head2 How do I do a C<tail -f> in perl?
1064
1065First try
1066
1067    seek(GWFILE, 0, 1);
1068
1069The statement C<seek(GWFILE, 0, 1)> doesn't change the current position,
1070but it does clear the end-of-file condition on the handle, so that the
1071next <GWFILE> makes Perl try again to read something.
1072
1073If that doesn't work (it relies on features of your stdio implementation),
1074then you need something more like this:
1075
1076        for (;;) {
1077          for ($curpos = tell(GWFILE); <GWFILE>; $curpos = tell(GWFILE)) {
1078            # search for some stuff and put it into files
1079          }
1080          # sleep for a while
1081          seek(GWFILE, $curpos, 0);  # seek to where we had been
1082        }
1083
1084If this still doesn't work, look into the POSIX module.  POSIX defines
1085the clearerr() method, which can remove the end of file condition on a
1086filehandle.  The method: read until end of file, clearerr(), read some
1087more.  Lather, rinse, repeat.
1088
1089There's also a File::Tail module from CPAN.
1090
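Here is the first approach spelled out as a small function, as a sketch
only.  The name C<read_new_lines> is made up, and the code assumes that
the seek() trick described above works on your system.  It will also
happily return a partial last line if the writer hasn't finished it yet.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);

# Hypothetical helper: return whatever lines have been appended to the
# file behind $fh since the last call.
sub read_new_lines {
    my ($fh) = @_;
    seek($fh, 0, SEEK_CUR);   # no movement, but clears the EOF condition
    my @lines = <$fh>;
    return @lines;
}
```

A caller would typically loop forever, sleeping for a second or so
between calls.
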
1091=head2 How do I dup() a filehandle in Perl?
1092
1093If you check L<perlfunc/open>, you'll see that several of the ways
1094to call open() should do the trick.  For example:
1095
1096    open(LOG, ">>/tmp/logfile");
1097    open(STDERR, ">&LOG");
1098
1099Or even with a literal numeric descriptor:
1100
1101   $fd = $ENV{MHCONTEXTFD};
1102   open(MHCONTEXT, "<&=$fd");   # like fdopen(3S)
1103
Note that "<&STDIN" makes a copy, but "<&=STDIN" makes
an alias.  That means if you close an aliased handle, all
aliases become inaccessible.  This is not true with
a copied one.
1108
1109Error checking, as always, has been left as an exercise for the reader.
1110
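As a sketch of the copying case with the error check filled in, the
following uses the three-argument form of open() with a lexical
filehandle, which assumes perl 5.8 or later.  The helper name
C<dup_handle> is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return an independent write handle that is a
# dup of $fh.  Closing the copy leaves the original usable.
sub dup_handle {
    my ($fh) = @_;
    open(my $copy, '>&', $fh) or die "can't dup handle: $!";
    return $copy;
}
```

The copy has its own descriptor and its own buffer, but both handles
share one file position, so flush or close one before writing through
the other if the order of output matters.
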
1111=head2 How do I close a file descriptor by number?
1112
1113This should rarely be necessary, as the Perl close() function is to be
1114used for things that Perl opened itself, even if it was a dup of a
1115numeric descriptor as with MHCONTEXT above.  But if you really have
1116to, you may be able to do this:
1117
    require 'sys/syscall.ph';
    $rc = syscall(&SYS_close, $fd + 0);  # must force numeric
    die "can't sysclose $fd: $!" if $rc == -1;  # syscall returns -1 on failure
1121
1122Or, just use the fdopen(3S) feature of open():
1123
1124    {
1125        local *F;
1126        open F, "<&=$fd" or die "Cannot reopen fd=$fd: $!";
1127        close F;
1128    }
1129
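If the POSIX module is available, its close() function takes a
descriptor number directly, which avoids needing I<sys/syscall.ph>.
The wrapper below is a sketch; the name C<close_fd> is an assumption.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: close a raw descriptor number, dying on failure.
sub close_fd {
    my ($fd) = @_;
    my $rc = POSIX::close($fd + 0);   # returns undef on failure
    defined($rc) or die "can't close fd $fd: $!";
    return 1;
}
```

Don't use this on a descriptor that a Perl filehandle still owns, or
that handle's own close() will fail later.
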
=head2 Why can't I use "C:\temp\foo" in DOS paths?  Why doesn't `C:\temp\foo.exe` work?
1131
1132Whoops!  You just put a tab and a formfeed into that filename!
1133Remember that within double quoted strings ("like\this"), the
1134backslash is an escape character.  The full list of these is in
1135L<perlop/Quote and Quote-like Operators>.  Unsurprisingly, you don't
1136have a file called "c:(tab)emp(formfeed)oo" or
1137"c:(tab)emp(formfeed)oo.exe" on your legacy DOS filesystem.
1138
1139Either single-quote your strings, or (preferably) use forward slashes.
1140Since all DOS and Windows versions since something like MS-DOS 2.0 or so
1141have treated C</> and C<\> the same in a path, you might as well use the
1142one that doesn't clash with Perl--or the POSIX shell, ANSI C and C++,
1143awk, Tcl, Java, or Python, just to mention a few.  POSIX paths
1144are more portable, too.
1145
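The following lines show the difference side by side.  The variable
names are just for illustration.

```perl
use strict;
use warnings;

my $double  = "C:\temp\foo";    # wrong: \t is a tab, \f is a formfeed
my $single  = 'C:\temp\foo';    # single quotes keep the backslashes
my $escaped = "C:\\temp\\foo";  # or double each backslash
my $forward = "C:/temp/foo";    # or avoid the problem altogether
```

Here C<$single> and C<$escaped> hold the same eleven characters, while
C<$double> holds only nine, two of which are control characters.
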
1146=head2 Why doesn't glob("*.*") get all the files?
1147
1148Because even on non-Unix ports, Perl's glob function follows standard
1149Unix globbing semantics.  You'll need C<glob("*")> to get all (non-hidden)
1150files.  This makes glob() portable even to legacy systems.  Your
1151port may include proprietary globbing functions as well.  Check its
1152documentation for details.
1153
1154=head2 Why does Perl let me delete read-only files?  Why does C<-i> clobber protected files?  Isn't this a bug in Perl?
1155
1156This is elaborately and painstakingly described in the "Far More Than
1157You Ever Wanted To Know" in
1158http://www.perl.com/CPAN/doc/FMTEYEWTK/file-dir-perms .
1159
1160The executive summary: learn how your filesystem works.  The
1161permissions on a file say what can happen to the data in that file.
1162The permissions on a directory say what can happen to the list of
1163files in that directory.  If you delete a file, you're removing its
1164name from the directory (so the operation depends on the permissions
1165of the directory, not of the file).  If you try to write to the file,
1166the permissions of the file govern whether you're allowed to.
1167
1168=head2 How do I select a random line from a file?
1169
1170Here's an algorithm from the Camel Book:
1171
1172    srand;
1173    rand($.) < 1 && ($line = $_) while <>;
1174
1175This has a significant advantage in space over reading the whole
1176file in.  A simple proof by induction is available upon
1177request if you doubt the algorithm's correctness.
1178
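The induction runs roughly like this: line number n replaces the
current choice with probability 1/n, and any earlier line that was
held with probability 1/(n-1) survives with probability (n-1)/n, which
again makes 1/n.  The same algorithm written out as a function, with a
made-up name and an explicit counter instead of C<$.>, might look like
this:

```perl
use strict;
use warnings;

# Hypothetical helper: return one line of $fh chosen uniformly at
# random, reading the file once and keeping only one line in memory.
sub random_line {
    my ($fh) = @_;
    my ($line, $count) = (undef, 0);
    while (defined(my $candidate = <$fh>)) {
        $count++;
        # Keep the newest line with probability 1/$count.
        $line = $candidate if rand($count) < 1;
    }
    return $line;   # undef if the file was empty
}
```
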
1179=head2 Why do I get weird spaces when I print an array of lines?
1180
1181Saying
1182
1183    print "@lines\n";
1184
1185joins together the elements of C<@lines> with a space between them.
1186If C<@lines> were C<("little", "fluffy", "clouds")> then the above
1187statement would print
1188
1189    little fluffy clouds
1190
but if each element of C<@lines> was a line of text, ending in a newline
character C<("little\n", "fluffy\n", "clouds\n")> then it would print:
1193
1194    little
1195     fluffy
1196     clouds
1197
1198If your array contains lines, just print them:
1199
1200    print @lines;
1201
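The space comes from the C<$"> variable (see L<perlvar>), which holds
whatever is placed between array elements interpolated into a
double-quoted string.  A short illustration, with variable names chosen
only for the example:

```perl
use strict;
use warnings;

my @lines = ("little\n", "fluffy\n", "clouds\n");

my $interpolated = "@lines";          # elements joined with $" (a space)
my $joined       = join('', @lines);  # elements joined with nothing

my $quiet;
{
    local $" = '';                    # temporarily empty the separator
    $quiet = "@lines";
}
```
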
1202=head1 AUTHOR AND COPYRIGHT
1203
1204Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington.
1205All rights reserved.
1206
When included as an integrated part of the Standard Distribution
of Perl or of its documentation (printed or otherwise), this work is
covered under Perl's Artistic License.  For separate distributions of
all or part of this FAQ outside of that, see L<perlfaq>.
1211
1212Irrespective of its distribution, all code examples here are in the public
1213domain.  You are permitted and encouraged to use this code and any
1214derivatives thereof in your own programs for fun or for profit as you
1215see fit.  A simple comment in the code giving credit to the FAQ would
1216be courteous but is not required.