=item * Where can I get a list of Larry Wall witticisms?
-=item * How can I convince my sysadmin/supervisor/employees to use version (5/5.005/Perl instead of some other language)?
+=item * How can I convince my sysadmin/supervisor/employees to use (version 5/5.005/Perl) instead of some other language?
=back
=item * Where can I learn about linking C with Perl? [h2xs, xsubpp]
=item * I've read perlembed, perlguts, etc., but I can't embed perl in
-my C program, what am I doing wrong?
+my C program; what am I doing wrong?
=item * When I tried to run my script, I got this message. What does it
mean?
=item * I put a regular expression into $/ but it didn't work. What's wrong?
-=item * How do I substitute case insensitively on the LHS, but preserving case on the RHS?
+=item * How do I substitute case insensitively on the LHS while preserving case on the RHS?
=item * How can I make C<\w> match national character sets?
=item * How can I do an atexit() or setjmp()/longjmp()? (Exception handling)
-=item * Why doesn't my sockets program work under System V (Solaris)? What does the error message "Protocol not supported" mean?
+=item * Why doesn't my sockets program work under System V (Solaris)? What does the error message "Protocol not supported" mean?
=item * How can I call my system's unique C functions from Perl?
=over 4
-=item * My CGI script runs from the command line but not the browser. (500 Server Error)
+=item * My CGI script runs from the command line but not the browser. (500 Server Error)
=item * How can I get better error messages from a CGI program?
This document is posted regularly to comp.lang.perl.announce and
several other related newsgroups. It is available in a variety of
-formats from CPAN in the /CPAN/doc/FAQs/FAQ/ directory, or on the web
+formats from CPAN in the /CPAN/doc/FAQs/FAQ/ directory or on the web
at http://www.perl.com/perl/faq/ .
=head2 How to contribute to this document
=head2 Bundled Distributions
-When included as part of the Standard Version of Perl, or as part of
+When included as part of the Standard Version of Perl or as part of
its complete documentation whether printed or otherwise, this work
may be distributed only under the terms of Perl's Artistic License.
Any distribution of this file or derivatives thereof I<outside>
-of that package require that special arrangements be made with
+of that package requires that special arrangements be made with
copyright holder.
Irrespective of its distribution, all code examples in these files
=over 4
+=item 1/November/2000
+
+A few grammatical fixes and updates implemented by John Borwick.
+
=item 23/May/99
Extensive updates from the net in preparation for 5.6 release.
no longer maintained; its last patch (4.036) was in 1992, long ago and
far away. Sure, it's stable, but so is anything that's dead; in fact,
perl4 had been called a dead, flea-bitten camel carcass. The most recent
-production release is 5.005_03 (although 5.004_05 is still supported).
-The most cutting-edge development release is 5.005_57. Further references
+production release is 5.6 (although 5.005_03 is still supported).
+The most cutting-edge development release is 5.7. Further references
to the Perl language in this document refer to the production release
unless otherwise specified. There may be one or more official bug fixes
by the time you read this, and also perhaps some experimental versions
=head2 Is Perl difficult to learn?
-No, Perl is easy to start learning -- and easy to keep learning. It looks
+No, Perl is easy to start learning--and easy to keep learning. It looks
like most programming languages you're likely to have experience
with, so if you've ever written a C program, an awk script, a shell
-script, or even a BASIC program, you're already part way there.
+script, or even a BASIC program, you're already partway there.
Most tasks only require a small subset of the Perl language. One of
the guiding mottos for Perl development is "there's more than one way
=head2 When shouldn't I program in Perl?
-When your manager forbids it -- but do consider replacing them :-).
+When your manager forbids it--but do consider replacing them :-).
Actually, one good reason is when you already have an existing
application written in another language that's all done (and done
that Perl remains fundamentally a dynamically typed language, not
a statically typed one. You certainly won't be chastised if you don't
trust nuclear-plant or brain-surgery monitoring code to it. And Larry
-will sleep easier, too -- Wall Street programs not withstanding. :-)
+will sleep easier, too--Wall Street programs notwithstanding. :-)
=head2 What's the difference between "perl" and "Perl"?
what you give the actors. A program is what you give the audience."
Originally, a script was a canned sequence of normally interactive
-commands, that is, a chat script. Something like a UUCP or PPP chat
+commands--that is, a chat script. Something like a UUCP or PPP chat
script or an expect script fits the bill nicely, as do configuration
scripts run by a program at its start up, such F<.cshrc> or F<.ircrc>,
for example. Chat scripts were just drivers for existing programs,
not stand-alone programs in their own right.
A computer scientist will correctly explain that all programs are
-interpreted, and that the only question is at what level. But if you
+interpreted and that the only question is at what level. But if you
ask this question of someone who isn't a computer scientist, they might
tell you that a I<program> has been compiled to physical machine code
-once, and can then be run multiple times, whereas a I<script> must be
+once and can then be run multiple times, whereas a I<script> must be
translated by a program each time it's used.
Perl programs are (usually) neither strictly compiled nor strictly
http://x1.dejanews.com/dnquery.xp?QRY=*&DBS=2&ST=PS&defaultOp=AND&LNG=ALL&format=terse&showsort=date&maxhits=100&subjects=&groups=&authors=larry@*wall.org&fromdate=&todate=
-=head2 How can I convince my sysadmin/supervisor/employees to use version (5/5.005/Perl instead of some other language)?
+=head2 How can I convince my sysadmin/supervisor/employees to use (version 5/5.005/Perl) instead of some other language?
If your manager or employees are wary of unsupported software, or
software which doesn't officially ship with your operating system, you
simplicity, and power, then the typical manager/supervisor/employee
may be persuaded. Regarding using Perl in general, it's also
sometimes helpful to point out that delivery times may be reduced
-using Perl, as compared to other languages.
+using Perl compared to other languages.
If you have a project which has a bottleneck, especially in terms of
translation or testing, Perl almost certainly will provide a viable,
-and quick solution. In conjunction with any persuasion effort, you
+quick solution. In conjunction with any persuasion effort, you
should not fail to point out that Perl is used, quite extensively, and
with extremely reliable and valuable results, at many large computer
-software and/or hardware companies throughout the world. In fact,
-many Unix vendors now ship Perl by default, and support is usually
+software and hardware companies throughout the world. In fact,
+many Unix vendors now ship Perl by default. Support is usually
just a news-posting away, if you can't find the answer in the
I<comprehensive> documentation, including this FAQ.
number of modules and extensions which greatly reduce development time
for any given task. Also mention that the difference between version
4 and version 5 of Perl is like the difference between awk and C++.
-(Well, OK, maybe not quite that distinct, but you get the idea.) If you
+(Well, OK, maybe it's not quite that distinct, but you get the idea.) If you
want support and a reasonable guarantee that what you're developing
will continue to work in the future, then you have to run the supported
version. That probably means running the 5.005 release, although 5.004
approaches are doomed to failure.
One simple way to check that things are in the right place is to print out
-the hard-coded @INC which perl is looking for.
+the hard-coded @INC that perl looks through for libraries:
% perl -e 'print join("\n",@INC)'
-If this command lists any paths which don't exist on your system, then you
+If this command lists any paths that don't exist on your system, then you
may need to move the appropriate libraries to these locations, or create
symbolic links, aliases, or shortcuts appropriately. @INC is also printed as
part of the output of
installed as well: type C<man perl> if you're on a system resembling Unix.
This will lead you to other important man pages, including how to set your
$MANPATH. If you're not on a Unix system, access to the documentation
-will be different; for example, it might be only in HTML format. But all
+will be different; for example, documentation might only be in HTML format. All
proper Perl installations have fully-accessible documentation.
You might also try C<perldoc perl> in case your system doesn't
troff, html, and plain text. There's also a web page at
http://www.perl.com/perl/info/documentation.html that might help.
-Many good books have been written about Perl -- see the section below
+Many good books have been written about Perl--see the section below
for more details.
Tutorial documents are included in current or upcoming Perl releases
-include L<perltoot> for objects, L<perlopentut> for file opening
-semantics, L<perlreftut> for managing references, and L<perlxstut>
-for linking C and Perl together. There may be more by the
-time you read this. The following URLs might also be of
+and include L<perltoot> for objects or L<perlboot> for a beginner's
+approach to objects, L<perlopentut> for file opening semantics,
+L<perlreftut> for managing references, L<perlretut> for regular
+expressions, L<perlthrtut> for threads, L<perldebtut> for debugging,
+and L<perlxstut> for linking C and Perl together. There may be more
+by the time you read this. The following URLs might also be of
assistance:
http://language.perl.com/info/documentation.html
A number of books on Perl and/or CGI programming are available. A few of
these are good, some are OK, but many aren't worth your money. Tom
Christiansen maintains a list of these books, some with extensive
-reviews, at http://www.perl.com/perl/critiques/index.html.
+reviews, at http://www.perl.com/perl/critiques/index.html .
The incontestably definitive reference book on Perl, written by
the creator of Perl, is now (July 2000) in its third edition:
The companion volume to the Camel containing thousands
of real-world examples, mini-tutorials, and complete programs
-(first premiering at the 1998 Perl Conference), is:
+(first premiered at the 1998 Perl Conference), is:
The Perl Cookbook (the "Ram Book"):
by Tom Christiansen and Nathan Torkington,
http://perl.oreilly.com/cookbook/
If you're already a hard-core systems programmer, then the Camel Book
-might suffice for you to learn Perl from. But if you're not, check
-out:
+might suffice for you to learn Perl from. If you're not, check
+out
Learning Perl (the "Llama Book"):
by Randal Schwartz and Tom Christiansen
http://www.oreilly.com/catalog/lperl2/
Despite the picture at the URL above, the second edition of "Llama
-Book" really has a blue cover, and is updated for the 5.004 release
+Book" really has a blue cover and was updated for the 5.004 release
of Perl. Various foreign language editions are available, including
-I<Learning Perl on Win32 Systems> (the Gecko Book).
+I<Learning Perl on Win32 Systems> (the "Gecko Book").
If you're not an accidental programmer, but a more serious and possibly
even degreed computer scientist who doesn't need as much hand-holding as
The first and only periodical devoted to All Things Perl, I<The
Perl Journal> contains tutorials, demonstrations, case studies,
-announcements, contests, and much more. TPJ has columns on web
+announcements, contests, and much more. I<TPJ> has columns on web
development, databases, Win32 Perl, graphical programming, regular
expressions, and networking, and sponsors the Obfuscated Perl
Contest. It is published quarterly under the gentle hand of its
I<Performance Computing> (http://www.performance-computing.com/), and Usenix's
newsletter/magazine to its members, I<login:>, at http://www.usenix.org/.
Randal's Web Technique's columns are available on the web at
-http://www.stonehenge.com/merlyn/WebTechniques/.
+http://www.stonehenge.com/merlyn/WebTechniques/ .
=head2 Perl on the Net: FTP and WWW Access
-To get the best (and possibly cheapest) performance, pick a site from
+To get the best performance, pick a site from
the list below and use it to grab the complete list of mirror sites.
From there you can find the quickest site for you. Remember, the
following list is I<not> the complete list of CPAN mirrors
http://www.deja.com/dnquery.xp?QRY=&DBS=2&ST=PS&defaultOp=AND&LNG=ALL&format=terse&showsort=date&maxhits=25&subjects=&groups=*perl*&authors=&fromdate=&todate=
-You'll probably want to trim that down a bit, though.
+You might want to trim that down a bit, though.
You'll probably want more a sophisticated query and retrieval mechanism
than a file listing, preferably one that allows you to retrieve
=head2 Where can I buy a commercial version of Perl?
-In a real sense, Perl already I<is> commercial software: It has a license
+In a real sense, Perl already I<is> commercial software: it has a license
that you can grab and carefully read to your manager. It is distributed
in releases and comes in well-defined packages. There is a very large
user community and an extensive literature. The comp.lang.perl.*
purchase order from a company whom they can sue should anything go awry.
Or maybe they need very serious hand-holding and contractual obligations.
Shrink-wrapped CDs with Perl on them are available from several sources if
-that will help. For example, many Perl books carry a Perl distribution
-on them, as do the O'Reilly Perl Resource Kits (in both the Unix flavor
+that will help. For example, many Perl books include a distribution of Perl,
+as do the O'Reilly Perl Resource Kits (in both the Unix flavor
and in the proprietary Microsoft flavor); the free Unix distributions
also all come with Perl.
-Or you can purchase commercial incidence based support through the Perl
-Clinic. The following is a commercial from them:
+Alternatively, you can purchase commercial incident-based support
+through the Perl Clinic. The following is a commercial from them:
"The Perl Clinic is a commercial Perl support service operated by
ActiveState Tool Corp. and The Ingram Group. The operators have many
we will put our best effort into understanding your problem, providing an
explanation of the situation, and a recommendation on how to proceed."
-Contact The Perl Clinic at:
+Contact The Perl Clinic at
www.PerlClinic.com
=head2 How do I debug my Perl programs?
Have you tried C<use warnings> or used C<-w>? They enable warnings
-for dubious practices.
+to detect dubious practices.
Have you tried C<use strict>? It prevents you from using symbolic
references, makes you predeclare any subroutines that you call as bare
words, and (probably most importantly) forces you to predeclare your
-variables with C<my> or C<our> or C<use vars>.
+variables with C<my>, C<our>, or C<use vars>.
-Did you check the returns of each and every system call? The operating
-system (and thus Perl) tells you whether they worked or not, and if not
+Did you check the return values of each and every system call? The operating
+system (and thus Perl) tells you whether they worked, and if not
why.
open(FH, "> /etc/cantwrite")
or die "Couldn't write to /etc/cantwrite: $!\n";
Did you read L<perltrap>? It's full of gotchas for old and new Perl
-programmers, and even has sections for those of you who are upgrading
+programmers and even has sections for those of you who are upgrading
from languages like I<awk> and I<C>.
Have you tried the Perl debugger, described in L<perldebug>? You can
=head2 How do I profile my Perl programs?
-You should get the Devel::DProf module from CPAN, and also use
+You should get the Devel::DProf module from CPAN and also use
Benchmark.pm from the standard distribution. Benchmark lets you time
specific portions of your code, while Devel::DProf gives detailed
breakdowns of where your code spends its time.
map: 6 secs ( 4.97 usr 0.00 sys = 4.97 cpu)
Be aware that a good benchmark is very hard to write. It only tests the
-data you give it, and really proves little about differing complexities
+data you give it and proves little about the differing complexities
of contrasting algorithms.
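As a rough illustration of timing specific portions of code with Benchmark, here is a minimal sketch. The two subroutines and the sample data are hypothetical stand-ins, not taken from the FAQ; the point is only to show C<timethese> and C<cmpthese> from the standard Benchmark module, and the numbers it reports will differ on every machine.

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

# Hypothetical sample data: any list you care about would do.
my @data = (1 .. 1000);

# Two hypothetical ways of doubling every element.
sub double_with_map {
    my @out = map { $_ * 2 } @data;
    return scalar @out;
}

sub double_with_for {
    my @out;
    push @out, $_ * 2 for @data;
    return scalar @out;
}

# A negative count asks Benchmark to run each sub for at least
# that many CPU seconds; 'none' suppresses the per-test report.
my $results = timethese(-1, {
    map     => \&double_with_map,
    foreach => \&double_with_for,
}, 'none');

# Print a comparison chart of the collected timings.
cmpthese($results);
```

Keep in mind the caveat above: this compares the two subs only on this one data set.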
=head2 How do I cross-reference my Perl programs?
Of course, if you simply follow the guidelines in L<perlstyle>, you
shouldn't need to reformat. The habit of formatting your code as you
write it will help prevent bugs. Your editor can and should help you
-with this. The perl-mode for emacs can provide a remarkable amount of
-help with most (but not all) code, and even less programmable editors
-can provide significant assistance. Tom swears by the following
-settings in vi and its clones:
+with this. The perl-mode or newer cperl-mode for emacs can provide
+remarkable amounts of help with most (but not all) code, and even less
+programmable editors can provide significant assistance. Tom swears
+by the following settings in vi and its clones:
set ai sw=4
map! ^O {^M}^[O^T
Now put that in your F<.exrc> file (replacing the caret characters
with control characters) and away you go. In insert mode, ^T is
-for indenting, ^D is for undenting, and ^O is for blockdenting --
+for indenting, ^D is for undenting, and ^O is for blockdenting--
as it were. If you haven't used the last one, you're missing
a lot. A more complete example, with comments, can be found at
http://www.perl.com/CPAN-local/authors/id/TOMC/scripts/toms.exrc.gz
=head2 Is there an IDE or Windows Perl Editor?
-If you're on Unix, you already have an IDE -- Unix itself. This powerful
+If you're on Unix, you already have an IDE--Unix itself. This powerful
IDE derives from its interoperability, flexibility, and configurability.
If you really want to get a feel for Unix-qua-IDE, the best thing to do
is to find some high-powered programmer whose native language is Unix.
functional, powerful, and elegant. You will be absolutely astonished
at the speed and ease exhibited by the native speaker of Unix in his
home territory. The art and skill of a virtuoso can only be seen to be
-believed. That is the path to mastery -- all these cobbled little IDEs
+believed. That is the path to mastery--all these cobbled little IDEs
are expensive toys designed to sell a flashy demo using cheap tricks,
and being optimized for immediate but shallow understanding rather than
enduring use, are but a dim palimpsest of real tools.
In short, you just have to learn the toolbox. However, if you're not
on Unix, then your vendor probably didn't bother to provide you with
a proper toolbox on the so-called complete system that you forked out
-your hard-earned cash on.
+your hard-earned cash for.
-PerlBuilder (XXX URL to follow) is an integrated development environment
-for Windows that supports Perl development. Perl programs are just plain
-text, though, so you could download emacs for Windows (???) or a vi clone
-(vim) which runs on for win32 (http://www.cs.vu.nl/%7Etmgil/vi.html).
-If you're transferring Windows files to Unix, be sure to transfer in
-ASCII mode so the ends of lines are appropriately mangled.
+PerlBuilder (http://www.solutionsoft.com/perl.htm) is an integrated
+development environment for Windows that supports Perl development.
+Perl programs are just plain text, though, so you could download emacs
+for Windows (http://www.gnu.org/software/emacs/windows/ntemacs.html)
+or a vi clone (vim) that runs on Win32
+(http://www.cs.vu.nl/%7Etmgil/vi.html). If you're transferring
+Windows files to Unix, be sure to transfer them in ASCII mode so the ends
+of lines are appropriately mangled.
=head2 Where can I get Perl macros for vi?
For a complete version of Tom Christiansen's vi configuration file,
-see http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/toms.exrc.gz,
-the standard benchmark file for vi emulators. This runs best with nvi,
+see http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/toms.exrc.gz ,
+the standard benchmark file for vi emulators. The file runs best with nvi,
the current version of vi out of Berkeley, which incidentally can be built
-with an embedded Perl interpreter -- see http://www.perl.com/CPAN/src/misc.
+with an embedded Perl interpreter--see http://www.perl.com/CPAN/src/misc .
=head2 Where can I get perl-mode for emacs?
to the Athena Widget set. Both are available from CPAN. See the
directory http://www.perl.com/CPAN/modules/by-category/08_User_Interfaces/
-Invaluable for Perl/Tk programming are: the Perl/Tk FAQ at
+Invaluable for Perl/Tk programming are the Perl/Tk FAQ at
http://w4.lns.cornell.edu/%7Epvhp/ptk/ptkTOC.html , the Perl/Tk Reference
Guide available at
http://www.perl.com/CPAN-local/authors/Stephen_O_Lidie/ , and the
=head2 What is undump?
-See the next questions.
+See the next question on ``How can I make my Perl program run faster?''
=head2 How can I make my Perl program run faster?
AutoSplit and AutoLoader modules in the standard distribution for
that. Or you could locate the bottleneck and think about writing just
that part in C, the way we used to take bottlenecks in C code and
-write them in assembler. Similar to rewriting in C is the use of
-modules that have critical sections written in C (for instance, the
+write them in assembler. A similar approach is to use
+modules that have critical sections written in C (for instance, the
PDL module from CPAN).
In some cases, it may be worth it to use the backend compiler to
In some cases, using substr() or vec() to simulate arrays can be
highly beneficial. For example, an array of a thousand booleans will
take at least 20,000 bytes of space, but it can be turned into one
-125-byte bit vector for a considerable memory savings. The standard
+125-byte bit vector--a considerable memory savings. The standard
Tie::SubstrHash module can also help for certain types of data
structure. If you're working with specialist data structures
(matrices, for instance) modules that implement these in C may use
won't. In general, try it yourself and see.
However, judicious use of my() on your variables will help make sure
-that they go out of scope so that Perl can free up their storage for
+that they go out of scope so that Perl can free up that space for
use in other parts of your program. A global variable, of course, never
goes out of scope, so you can't get its space automatically reclaimed,
although undef()ing and/or delete()ing it will achieve the same effect.
See http://www.perl.com/CPAN/modules/by-category/15_World_Wide_Web_HTML_HTTP_CGI/ .
A non-free, commercial product, ``The Velocity Engine for Perl'',
-(http://www.binevolve.com/ or http://www.binevolve.com/velocigen/) might
-also be worth looking at. It will allow you to increase the performance
-of your Perl programs, up to 25 times faster than normal CGI Perl by
-running in persistent Perl mode, or 4 to 5 times faster without any
-modification to your existing CGI programs. Fully functional evaluation
-copies are available from the web site.
+(http://www.binevolve.com/ or http://www.binevolve.com/velocigen/ )
+might also be worth looking at. It will allow you to increase the
+performance of your Perl programs, running programs up to 25 times
+faster than normal CGI Perl when running in persistent Perl mode or 4
+to 5 times faster without any modification to your existing CGI
+programs. Fully functional evaluation copies are available from the
+web site.
=head2 How can I hide the source for my Perl program?
First of all, however, you I<can't> take away read permission, because
the source code has to be readable in order to be compiled and
interpreted. (That doesn't mean that a CGI script's source is
-readable by people on the web, though, only by people with access to
-the filesystem) So you have to leave the permissions at the socially
+readable by people on the web, though--only by people with access to
+the filesystem.) So you have to leave the permissions at the socially
friendly 0755 level.
Some people regard this as a security problem. If your program does
-insecure things, and relies on people not knowing how to exploit those
+insecure things and relies on people not knowing how to exploit those
insecurities, it is not secure. It is often possible for someone to
determine the insecure things and exploit them without viewing the
source. Security through obscurity, the name for hiding your bugs
might still be able to de-compile it. You can try using the native-code
compiler described below, but crackers might be able to disassemble it.
These pose varying degrees of difficulty to people wanting to get at
-your code, but none can definitively conceal it (this is true of every
+your code, but none can definitively conceal it (true of every
language, not just Perl).
If you're concerned about people profiting from your code, then the
Merely compiling into C does not in and of itself guarantee that your
code will run very much faster. That's because except for lucky cases
where a lot of native type inferencing is possible, the normal Perl
-run time system is still present and so your program will take just as
+run-time system is still present and so your program will take just as
long to run and be just as big. Most programs save little more than
compilation time, leaving execution no more than 10-30% faster. A few
-rare programs actually benefit significantly (like several times
+rare programs actually benefit significantly (even running several times
faster), but this takes some tweaking of your code.
You'll probably be astonished to learn that the current version of the
size!
In general, the compiler will do nothing to make a Perl program smaller,
-faster, more portable, or more secure. In fact, it will usually hurt
-all of those. The executable will be bigger, your VM system may take
+faster, more portable, or more secure. In fact, it can make your
+situation worse. The executable will be bigger, your VM system may take
longer to load the whole thing, the binary is fragile and hard to fix,
and compilation never stopped software piracy in the form of crackers,
viruses, or bootleggers. The real advantage of the compiler is merely
=head2 How can I compile Perl into Java?
-You can't. Not yet, anyway. You can integrate Java and Perl with the
+You can integrate Java and Perl with the
Perl Resource Kit from O'Reilly and Associates. See
-http://www.oreilly.com/catalog/prkunix/ for more information.
-The Java interface will be supported in the core 5.6 release
-of Perl.
+http://www.oreilly.com/catalog/prkunix/ .
+
+Perl 5.6 comes with Java Perl Lingo, or JPL. JPL, still in
+development, allows Perl code to be called from Java. See jpl/README
+in the Perl source tree.
=head2 How can I get C<#!perl> to work on [MS-DOS,NT,...]?
as the first line in C<*.cmd> file (C<-S> due to a bug in cmd.exe's
`extproc' handling). For DOS one should first invent a corresponding
-batch file, and codify it in C<ALTERNATIVE_SHEBANG> (see the
+batch file and codify it in C<ALTERNATIVE_SHEBANG> (see the
F<INSTALL> file in the source distribution for more information).
The Win95/NT installation, when using the ActiveState port of Perl,
# VMS
perl -e "print ""Hello world\n"""
-The problem is that none of this is reliable: it depends on the
+The problem is that none of these examples are reliable: they depend on the
command interpreter. Under Unix, the first two often work. Under DOS,
-it's entirely possible neither works. If 4DOS was the command shell,
+it's entirely possible that neither works. If 4DOS was the command shell,
you'd probably have better luck like this:
perl -e "print <Ctrl-x>"Hello world\n<Ctrl-x>""
CGI Security FAQ
http://www.go2net.com/people/paulp/cgi-security/safe-cgi.txt
-
=head2 Where can I learn about object-oriented Perl programming?
-A good place to start is L<perltoot>, and you can use L<perlobj> and
-L<perlbot> for reference. Perltoot didn't come out until the 5.004
-release, but you can get a copy (in pod, html, or postscript) from
-http://www.perl.com/CPAN/doc/FMTEYEWTK/ .
+A good place to start is L<perltoot>, and you can use L<perlobj>,
+L<perlboot>, and L<perlbot> for reference. Perltoot didn't come out
+until the 5.004 release; you can get a copy (in pod, html, or
+postscript) from http://www.perl.com/CPAN/doc/FMTEYEWTK/ .
=head2 Where can I learn about linking C with Perl? [h2xs, xsubpp]
solved their problems.
=head2 I've read perlembed, perlguts, etc., but I can't embed perl in
-my C program, what am I doing wrong?
+my C program; what am I doing wrong?
Download the ExtUtils::Embed kit from CPAN and run `make test'. If
the tests pass, read the pods again and again and again. If they
=head1 DESCRIPTION
-The section of the FAQ answers question related to the manipulation
+This section of the FAQ answers questions related to the manipulation
of data as numbers, dates, strings, arrays, hashes, and miscellaneous
data issues.
=head2 Why am I getting long decimals (eg, 19.9499999999999) instead of the numbers I should be getting (eg, 19.95)?
The infinite set that a mathematician thinks of as the real numbers can
-only be approximate on a computer, since the computer only has a finite
+only be approximated on a computer, since the computer only has a finite
number of bits to store an infinite number of, um, numbers.
Internally, your computer represents floating-point numbers in binary.
Floating-point numbers read in from a file or appearing as literals
in your program are converted from their decimal floating-point
-representation (eg, 19.95) to the internal binary representation.
+representation (eg, 19.95) to an internal binary representation.
However, 19.95 can't be precisely represented as a binary
floating-point number, just like 1/3 can't be exactly represented as a
When a floating-point number gets printed, the binary floating-point
representation is converted back to decimal. These decimal numbers
are displayed in either the format you specify with printf(), or the
-current output format for numbers (see L<perlvar/"$#"> if you use
+current output format for numbers. (See L<perlvar/"$#"> if you use
print. C<$#> has a different default value in Perl5 than it did in
Perl4. Changing C<$#> yourself is deprecated.)
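The conversion described above can be observed directly. The following is a small sketch, assuming the usual double-precision floating-point format; the variable names are illustrative only. Asking C<sprintf> for more digits than the default output format shows reveals the stored binary approximation, while rounding to two places recovers the number you meant.

```perl
use strict;
use warnings;

my $price = 19.95;

# Twenty decimal places expose the value actually stored,
# which is only close to 19.95.
my $raw = sprintf("%.20f", $price);

# Rounding to two decimal places gives back the intended figure.
my $rounded = sprintf("%.2f", $price);

print "stored:  $raw\n";
print "rounded: $rounded\n";
```

The exact digits printed for the stored value depend on your platform's floating-point format, so treat them as illustrative.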
$ceil = ceil(3.5); # 4
$floor = floor(3.5); # 3
-In 5.000 to 5.003 Perls, trigonometry was done in the Math::Complex
+In 5.000 to 5.003 perls, trigonometry was done in the Math::Complex
module. With 5.004, the Math::Trig module (part of the standard Perl
distribution) implements the trigonometric functions. Internally it
uses the Math::Complex module and some functions can break out from
Computers are good at being predictable and bad at being random
(despite appearances caused by bugs in your programs :-).
-http://www.perl.com/CPAN/doc/FMTEYEWTK/random, courtesy of Tom
-Phoenix, talks more about this.. John von Neumann said, ``Anyone who
+http://www.perl.com/CPAN/doc/FMTEYEWTK/random , courtesy of Tom
+Phoenix, talks more about this. John von Neumann said, ``Anyone who
attempts to generate random numbers by deterministic means is, of
course, living in a state of sin.''
available from CPAN.)
Before you immerse yourself too deeply in this, be sure to verify that it
-is the I<Julian> Day you really want. Are they really just interested in
+is the I<Julian> Day you really want. Are you really just interested in
a way of getting serial days so that they can do date arithmetic? If you
are interested in performing date arithmetic, this can be done using
either Date::Manip or Date::Calc, without converting to Julian Day first.
It depends just what you mean by ``escape''. URL escapes are dealt
with in L<perlfaq9>. Shell escapes with the backslash (C<\>)
-character are removed with:
+character are removed with
s/\\(.)/$1/g;
substr($a, 0, 3) = "Tom";
Although those with a pattern matching kind of thought process will
-likely prefer:
+likely prefer
$a =~ s/^.../Tom/;
=head2 How can I count the number of occurrences of a substring within a string?
-There are a number of ways, with varying efficiency: If you want a
+There are a number of ways, with varying efficiency. If you want a
count of a certain single character (X) within a string, you can use the
C<tr///> function like so:
$line =~ s/\b(\w)/\U$1/g;
This has the strange effect of turning "C<don't do it>" into "C<Don'T
-Do It>". Sometimes you might want this, instead (Suggested by brian d.
-foy):
+Do It>". Sometimes you might want this. Other times you might need a
+more thorough solution (suggested by brian d. foy):
$string =~ s/ (
(^\w) #at the beginning of the line
use Text::ParseWords;
@new = quotewords(",", 0, $text);
-There's also a Text::CSV module on CPAN.
+There's also a Text::CSV (Comma-Separated Values) module on CPAN.
=head2 How do I strip blank space from the beginning/end of a string?
-Although the simplest approach would seem to be:
+Although the simplest approach would seem to be
$string =~ s/^\s*(.*?)\s*$/$1/;
-Not only is this unnecessarily slow and destructive, it also fails with
+not only is this unnecessarily slow and destructive, it also fails with
embedded newlines. It is much faster to do this operation in two steps:
$string =~ s/^\s+//;
=head2 How do I find the soundex value of a string?
Use the standard Text::Soundex module distributed with Perl.
-But before you do so, you may want to determine whether `soundex' is in
+Before you do so, you may want to determine whether `soundex' is in
fact what you think it is. Knuth's soundex algorithm compresses words
into a small space, and so it does not necessarily distinguish between
two words which you might want to appear separately. For example, the
=head2 What's wrong with always quoting "$vars"?
-The problem is that those double-quotes force stringification,
-coercing numbers and references into strings, even when you
-don't want them to be. Think of it this way: double-quote
+The problem is that those double-quotes force stringification--
+coercing numbers and references into strings--even when you
+don't want them to be strings. Think of it this way: double-quote
expansion is used to produce new strings. If you already
have a string, why do you need more?
A nice general-purpose fixer-upper function for indented here documents
follows. It expects to be called with a here document as its argument.
It looks to see whether each line begins with a common substring, and
-if so, strips that off. Otherwise, it takes the amount of leading
-white space found on the first line and removes that much off each
+if so, strips that substring off. Otherwise, it takes the amount of leading
+whitespace found on the first line and removes that much off each
subsequent line.
sub fix {
local $_ = shift;
- my ($white, $leader); # common white space and common leading string
+ my ($white, $leader); # common whitespace and common leading string
if (/^\s*(?:([^\w\s]+)(\s*).*\n)(?:\s*\1\2?.*\n)+$/) {
($white, $leader) = ($2, quotemeta($1));
} else {
@@@ }
MAIN_INTERPRETER_LOOP
-Or with a fixed amount of leading white space, with remaining
+Or with a fixed amount of leading whitespace, with remaining
indentation correctly preserved:
$poem = fix<<EVER_ON_AND_ON;
context, you initialize arrays with lists, and you foreach() across
a list. C<@> variables are arrays, anonymous arrays are arrays, arrays
in scalar context behave like the number of elements in them, subroutines
-access their arguments through the array C<@_>, push/pop/shift only work
+access their arguments through the array C<@_>, and push/pop/shift only work
on arrays.
As a side note, there's no such thing as a list in scalar context.
=head2 What is the difference between $array[1] and @array[1]?
-The former is a scalar value, the latter an array slice, which makes
+The former is a scalar value; the latter an array slice, making
it a list with one (scalar) value. You should use $ when you want a
scalar value (most of the time) and @ when you want a list with one
scalar value in it (very, very rarely; nearly never, in fact).
Please do not use
- $is_there = grep $_ eq $whatever, @array;
+ ($is_there) = grep $_ eq $whatever, @array;
or worse yet
- $is_there = grep /$whatever/, @array;
+ ($is_there) = grep /$whatever/, @array;
These are slow (checks every element even if the first matches),
inefficient (same reason), and potentially buggy (what if there are
}
Note that this is the I<symmetric difference>, that is, all elements in
-either A or in B, but not in both. Think of it as an xor operation.
+either A or in B but not in both. Think of it as an xor operation.
=head2 How do I test whether two arrays or hashes are equal?
}
print "\n";
-You could grow the list this way:
+You could add to the list this way:
my ($head, $tail);
$tail = append($head, 1); # grow a new head
fisher_yates_shuffle( \@array ); # permutes @array in place
You've probably seen shuffling algorithms that work using splice,
-randomly picking another element to swap the current element with:
+randomly picking another element to swap the current element with
srand;
@new = ();
}
@sorted = @data[ sort { $idx[$a] cmp $idx[$b] } 0 .. $#idx ];
-Which could also be written this way, using a trick
+which could also be written this way, using a trick
that's come to be known as the Schwartzian Transform:
@sorted = map { $_->[0] }
Even if the table doesn't double, there's no telling whether your new
entry will be inserted before or after the current iterator position.
-Either treasure up your changes and make them after the iterator finishes,
+Either treasure up your changes and make them after the iterator finishes
or use keys to fetch all the old keys at once, and iterate over the list
of keys.
$num_keys = scalar keys %hash;
-In void context, the keys() function just resets the iterator, which is
+The keys() function also resets the iterator, which in void context is
faster for tied hashes than would be iterating through the whole
hash, one key-value pair at a time.
} keys %hash; # and by value
Here we'll do a reverse numeric sort by value, and if two keys are
-identical, sort by length of key, and if that fails, by straight ASCII
-comparison of the keys (well, possibly modified by your locale -- see
+identical, sort by length of key, or if that fails, by straight ASCII
+comparison of the keys (well, possibly modified by your locale--see
L<perllocale>).
@keys = sort {
=head2 How do I flush/unbuffer an output filehandle? Why must I do this?
The C standard I/O library (stdio) normally buffers characters sent to
-devices. This is done for efficiency reasons, so that there isn't a
+devices. This is done for efficiency reasons so that there isn't a
system call for each byte. Any time you use print() or write() in
Perl, you go though this buffering. syswrite() circumvents stdio and
buffering.
low-level calls to read, write, open, close, and seek.
Although humans have an easy time thinking of a text file as being a
-sequence of lines that operates much like a stack of playing cards -- or
-punch cards -- computers usually see the text file as a sequence of bytes.
+sequence of lines that operates much like a stack of playing cards--or
+punch cards--computers usually see the text file as a sequence of bytes.
In general, there's no direct way for Perl to seek to a particular line
of a file, insert text into a file, or remove text from a file.
-(There are exceptions in special circumstances. You can add or remove at
-the very end of the file. Another is replacing a sequence of bytes with
-another sequence of the same length. Another is using the C<$DB_RECNO>
-array bindings as documented in L<DB_File>. Yet another is manipulating
-files with all lines the same length.)
+(There are exceptions in special circumstances. You can add or remove
+data at the very end of the file. A sequence of bytes can be replaced
+with another sequence of the same length. The C<$DB_RECNO> array
+bindings as documented in L<DB_File> also provide a direct way of
+modifying a file. Files where all lines are the same length are also
+easy to alter.)
The general solution is to create a temporary copy of the text file with
the changes you want, then copy that over the original. This assumes
=head2 How do I make a temporary file name?
Use the C<new_tmpfile> class method from the IO::File module to get a
-filehandle opened for reading and writing. Use this if you don't
-need to know the file's name.
+filehandle opened for reading and writing. Use it if you don't
+need to know the file's name:
use IO::File;
$fh = IO::File->new_tmpfile()
or die "Unable to make new temporary file: $!";
-Or you can use the C<tmpnam> function from the POSIX module to get a
-filename that you then open yourself. Use this if you do need to know
-the file's name.
+If you do need to know the file's name, you can use the C<tmpnam>
+function from the POSIX module to get a filename that you then open
+yourself:
+
use Fcntl;
use POSIX qw(tmpnam);
# now go on to use the file ...
-If you're committed to doing this by hand, use the process ID and/or
-the current time-value. If you need to have many temporary files in
-one process, use a counter:
+If you're committed to creating a temporary file by hand, use the
+process ID and/or the current time-value. If you need to have many
+temporary files in one process, use a counter:
BEGIN {
use Fcntl;
# *HostFile automatically closes/disappears here
}
-Here's how to use this in a loop to open and store a bunch of
+Here's how to use typeglobs in a loop to open and store a bunch of
filehandles. We'll use as values of the hash an ordered
pair to make it easy to sort the hash in insertion order.
$file{$filename} = [ $i++, $fh ];
}
-Or here using the semi-object-oriented FileHandle module, which certainly
+Here's an example using the semi-object-oriented FileHandle module, which certainly
isn't light-weight:
use FileHandle;
}
Please understand that whether the filehandle happens to be a (probably
-localized) typeglob or an anonymous handle from one of the modules,
+localized) typeglob or an anonymous handle from one of the modules
in no way affects the bizarre rules for managing indirect handles.
See the next question.
An indirect filehandle is using something other than a symbol
in a place that a filehandle is expected. Here are ways
-to get those:
+to get indirect filehandles:
$fh = SOME_FH; # bareword is strict-subs hostile
$fh = "SOME_FH"; # strict-refs hostile; same package only
$fh = \*SOME_FH; # ref to typeglob (bless-able)
$fh = *SOME_FH{IO}; # blessed IO::Handle from *SOME_FH typeglob
-Or to use the C<new> method from the FileHandle or IO modules to
+Or, you can use the C<new> method from the FileHandle or IO modules to
create an anonymous filehandle, store that in a scalar variable,
and use it as though it were a normal filehandle.
accept_fh($handle);
In the examples above, we assigned the filehandle to a scalar variable
-before using it. That is because only simple scalar variables,
-not expressions or subscripts into hashes or arrays, can be used with
-built-ins like C<print>, C<printf>, or the diamond operator. These are
+before using it. That is because only simple scalar variables, not
+expressions or subscripts of hashes or arrays, can be used with
+built-ins like C<print>, C<printf>, or the diamond operator. Using
+something other than a simple scalar variable as a filehandle is
illegal and won't even compile:
@fd = (*STDIN, *STDOUT, *STDERR);
because you have to put the comma in and then recalculate your
position.
-Alternatively, this commifies all numbers in a line regardless of
+Alternatively, this code commifies all numbers in a line regardless of
whether they have decimal portions, are preceded by + or -, or
whatever:
Use the <> (glob()) operator, documented in L<perlfunc>. This
requires that you have a shell installed that groks tildes, meaning
-csh or tcsh or (some versions of) ksh, and thus may have portability
+csh or tcsh or (some versions of) ksh, and thus your code may have portability
problems. The Glob::KGlob module (available from CPAN) gives more
portable glob functionality.
Be warned that neither creation nor deletion of files is guaranteed to
be an atomic operation over NFS. That is, two processes might both
-successful create or unlink the same file! Therefore O_EXCL
-isn't so exclusive as you might wish.
+successfully create or unlink the same file! Therefore O_EXCL
+isn't as exclusive as you might wish.
See also the new L<perlopentut> if you have it (new for 5.6).
Due to the current implementation on some operating systems, when you
use the glob() function or its angle-bracket alias in a scalar
-context, you may cause a leak and/or unpredictable behavior. It's
+context, you may cause a memory leak and/or unpredictable behavior. It's
best therefore to use glob() only in list context.
=head2 How can I open a file with a leading ">" or trailing blanks?
Normally perl ignores trailing blanks in filenames, and interprets
certain leading characters (or a trailing "|") to mean something
-special. To avoid this, you might want to use a routine like this.
-It makes incomplete pathnames into explicit relative ones, and tacks a
+special. To avoid this, you might want to use a routine like the one below.
+It turns incomplete pathnames into explicit relative ones, and tacks a
trailing null byte on the name to make perl leave it alone:
sub safe_filename {
use Fcntl;
$badpath = "<<<something really wicked ";
- open (FH, $badpath, O_WRONLY | O_CREAT | O_TRUNC)
+ sysopen (FH, $badpath, O_WRONLY | O_CREAT | O_TRUNC)
or die "can't open $badpath: $!";
For more information, see also the new L<perlopentut> if you have it
=head2 How can I reliably rename a file?
-Well, usually you just use Perl's rename() function. But that may not
-work everywhere, in particular, renaming files across file systems.
+Well, usually you just use Perl's rename() function. That may not
+work everywhere, though, particularly when renaming files across file systems.
Some sub-Unix systems have broken ports that corrupt the semantics of
-rename() -- for example, WinNT does this right, but Win95 and Win98
+rename()--for example, WinNT does this right, but Win95 and Win98
are broken. (The last two parts are not surprising, but the first is. :-)
If your operating system supports a proper mv(1) program or its moral
It may be more compelling to use the File::Copy module instead. You
just copy to the new file to the new name (checking return values),
-then delete the old one. This isn't really the same semantics as a
+then delete the old one. This isn't really the same semantically as a
real rename(), though, which preserves metainformation like
permissions, timestamps, inode info, etc.
-The newer version of File::Copy exports a move() function.
+Newer versions of File::Copy export a move() function.
=head2 How can I lock a file?
Some versions of flock() can't lock files over a network (e.g. on NFS file
systems), so you'd need to force the use of fcntl(2) when you build Perl.
-But even this is dubious at best. See the flock entry of L<perlfunc>,
+But even this is dubious at best. See the flock entry of L<perlfunc>
and the F<INSTALL> file in the source distribution for information on
building Perl to do this.
Two potentially non-obvious but traditional flock semantics are that
-it waits indefinitely until the lock is granted, and that its locks
+it waits indefinitely until the lock is granted, and that its locks are
I<merely advisory>. Such discretionary locks are more flexible, but
offer fewer guarantees. This means that files locked with flock() may
be modified by programs that do not also use flock(). Cars that stop
stop for red lights. See the perlport manpage, your port's specific
documentation, or your system-specific local manpages for details. It's
best to assume traditional behavior if you're writing portable programs.
-(But if you're not, you should as always feel perfectly free to write
+(If you're not, you should as always feel perfectly free to write
for your own system's idiosyncrasies (sometimes called "features").
Slavish adherence to portability concerns shouldn't get in the way of
your getting your job done.)
Didn't anyone ever tell you web-page hit counters were useless?
They don't count number of hits, they're a waste of time, and they serve
-only to stroke the writer's vanity. Better to pick a random number.
-It's more realistic.
+only to stroke the writer's vanity. It's better to pick a random number;
+it's more realistic.
Anyway, this is what you can do if you can't help yourself.
seek(FH, 0, 0) or die "can't rewind numfile: $!";
truncate(FH, 0) or die "can't truncate numfile: $!";
(print FH $num+1, "\n") or die "can't write numfile: $!";
- # Perl as of 5.004 automatically flushes before unlocking
- flock(FH, LOCK_UN) or die "can't flock numfile: $!";
close FH or die "can't close numfile: $!";
Here's a much better web-page hit counter:
close FH;
Locking and error checking are left as an exercise for the reader.
-Don't forget them, or you'll be quite sorry.
+Don't forget them or you'll be quite sorry.
=head2 How do I get a file's timestamp in perl?
Note that utime() currently doesn't work correctly with Win95/NT
ports. A bug has been reported. Check it carefully before using
-it on those platforms.
+utime() on those platforms.
=head2 How do I print to more than one file at once?
close(STDOUT) or die "Closing: $!\n";
Otherwise you'll have to write your own multiplexing print
-function -- or your own tee program -- or use Tom Christiansen's,
-at http://www.perl.com/CPAN/authors/id/TOMC/scripts/tct.gz, which is
+function--or your own tee program--or use Tom Christiansen's,
+at http://www.perl.com/CPAN/authors/id/TOMC/scripts/tct.gz , which is
written in Perl and offers much greater functionality
than the stock version.
This is tremendously more efficient than reading the entire file into
memory as an array of lines and then processing it one element at a time,
-which is often -- if not almost always -- the wrong approach. Whenever
+which is often--if not almost always--the wrong approach. Whenever
you see someone do this:
@lines = <INPUT>;
-You should think long and hard about why you need everything loaded
+you should think long and hard about why you need everything loaded
at once. It's just not a scalable solution. You might also find it
more fun to use the standard DB_File module's $DB_RECNO bindings,
which allow you to tie an array to a file so that accessing an element
On very rare occasion, you may have an algorithm that demands that
the entire file be in memory at once as one scalar. The simplest solution
-to that is:
+to that is
$var = `cat $file`;
You can use the builtin C<getc()> function for most filehandles, but
it won't (easily) work on a terminal device. For STDIN, either use
-the Term::ReadKey module from CPAN, or use the sample code in
+the Term::ReadKey module from CPAN or use the sample code in
L<perlfunc/getc>.
If your system supports the portable operating system programming
END { cooked() }
-The Term::ReadKey module from CPAN may be easier to use. Recent version
+The Term::ReadKey module from CPAN may be easier to use. Recent versions
include also support for non-portable systems as well.
use Term::ReadKey;
# 78-83 ALT 1234567890-=
# 84 CTR PgUp
-This is all trial and error I did a long time ago, I hope I'm reading the
-file that worked.
+This is all trial and error I did a long time ago; I hope I'm reading the
+file that worked...
=head2 How can I tell whether there's a character waiting on a filehandle?
ioctl(FH, $FIONREAD, $size) or die "Couldn't call ioctl: $!\n";
$size = unpack("L", $size);
-FIONREAD requires a filehandle connected to a stream, meaning sockets,
+FIONREAD requires a filehandle connected to a stream, meaning that sockets,
pipes, and tty devices work, but I<not> files.
=head2 How do I do a C<tail -f> in perl?
This should rarely be necessary, as the Perl close() function is to be
used for things that Perl opened itself, even if it was a dup of a
-numeric descriptor, as with MHCONTEXT above. But if you really have
+numeric descriptor as with MHCONTEXT above. But if you really have
to, you may be able to do this:
require 'sys/syscall.ph';
$rc = syscall(&SYS_close, $fd + 0); # must force numeric
die "can't sysclose $fd: $!" unless $rc == -1;
-Or just use the fdopen(3S) feature of open():
+Or, just use the fdopen(3S) feature of open():
{
local *F;
Either single-quote your strings, or (preferably) use forward slashes.
Since all DOS and Windows versions since something like MS-DOS 2.0 or so
have treated C</> and C<\> the same in a path, you might as well use the
-one that doesn't clash with Perl -- or the POSIX shell, ANSI C and C++,
+one that doesn't clash with Perl--or the POSIX shell, ANSI C and C++,
awk, Tcl, Java, or Python, just to mention a few. POSIX paths
are more portable, too.
This has a significant advantage in space over reading the whole
file in. A simple proof by induction is available upon
-request if you doubt its correctness.
+request if you doubt the algorithm's correctness.
=head2 Why do I get weird spaces when I print an array of lines?
joins together the elements of C<@lines> with a space between them.
If C<@lines> were C<("little", "fluffy", "clouds")> then the above
-statement would print:
+statement would print
little fluffy clouds
littered with answers involving regular expressions. For example,
decoding a URL and checking whether something is a number are handled
with regular expressions, but those answers are found elsewhere in
-this document (in the section on Data and the Networking one on
-networking, to be precise).
+this document (in L<perlfaq9>: ``How do I decode or create those %-encodings
+on the web'' and L<perlfaq4>: ``How do I determine whether a scalar is
+a number/whole/integer/float'', to be precise).
=head2 How can I hope to use regular expressions without creating illegible and unmaintainable code?
$file->waitfor('/second line\n/');
print $file->getline;
-=head2 How do I substitute case insensitively on the LHS, but preserving case on the RHS?
+=head2 How do I substitute case insensitively on the LHS while preserving case on the RHS?
Here's a lovely Perlish solution by Larry Rosler. It exploits
properties of bitwise xor on ASCII strings.
=head2 What is C</o> really for?
Using a variable in a regular expression match forces a re-evaluation
-(and perhaps recompilation) each time through. The C</o> modifier
-locks in the regex the first time it's used. This always happens in a
-constant regular expression, and in fact, the pattern was compiled
-into the internal format at the same time your entire program was.
+(and perhaps recompilation) each time the regular expression is
+encountered. The C</o> modifier locks in the regex the first time
+it's used. This always happens in a constant regular expression, and
+in fact, the pattern was compiled into the internal format at the same
+time your entire program was.
Use of C</o> is irrelevant unless variable interpolation is used in
the pattern, and if so, the regex engine will neither know nor care
=head2 Can I use Perl regular expressions to match balanced text?
Although Perl regular expressions are more powerful than "mathematical"
-regular expressions, because they feature conveniences like backreferences
-(C<\1> and its ilk), they still aren't powerful enough -- with
+regular expressions because they feature conveniences like backreferences
+(C<\1> and its ilk), they still aren't powerful enough--with
the possible exception of bizarre and experimental features in the
development-track releases of Perl. You still need to use non-regex
techniques to parse balanced text, such as the text enclosed between
or C<(> and C<)> can be found in
http://www.perl.com/CPAN/authors/id/TOMC/scripts/pull_quotes.gz .
-The C::Scan module from CPAN contains such subs for internal usage,
+The C::Scan module from CPAN contains such subs for internal use,
but they are undocumented.
=head2 What does it mean that regexes are greedy? How can I get around it?
print "$count $line";
}
-If you want these output in a sorted order, see the section on Hashes.
+If you want these output in a sorted order, see L<perlfaq4>: ``How do I
+sort a hash (optionally by value instead of key)?''.
=head2 How can I do approximate matching?
=head2 Why don't word-boundary searches with C<\b> work for me?
-Two common misconceptions are that C<\b> is a synonym for C<\s+>, and
+Two common misconceptions are that C<\b> is a synonym for C<\s+> and
that it's the edge between whitespace characters and non-whitespace
characters. Neither is correct. C<\b> is the place between a C<\w>
character and a C<\W> character (that is, C<\b> is the edge of a
=head2 Why does using $&, $`, or $' slow my program down?
-Because once Perl sees that you need one of these variables anywhere in
-the program, it has to provide them on each and every pattern match.
+Once Perl sees that you need one of these variables anywhere in
+the program, it provides them on each and every pattern match.
The same mechanism that handles these provides for the use of $1, $2,
etc., so you pay the same price for each regex that contains capturing
-parentheses. But if you never use $&, etc., in your script, then regexes
+parentheses. If you never use $&, etc., in your script, then regexes
I<without> capturing parentheses won't be penalized. So avoid $&, $',
and $` if you can, but if you can't, once you've used them at all, use
them at will because you've already paid the price. Remember that some
}
}
-But then you lose the vertical alignment of the regular expressions.
+but then you lose the vertical alignment of the regular expressions.
=head2 Are Perl regexes DFAs or NFAs? Are they POSIX compliant?
chomp($pattern = <STDIN>);
if ($line =~ /$pattern/) { }
-Or, since you have no guarantee that your user entered
+Alternatively, since you have no guarantee that your user entered
a valid regular expression, trap the exception this way:
if (eval { $line =~ /$pattern/ }) { }
-But if all you really want to search for a string, not a pattern,
+If all you really want is to search for a string, not a pattern,
then you should either use the index() function, which is made for
string searching, or if you can't be disabused of using a pattern
match on a non-pattern, then be sure to use C<\Q>...C<\E>, documented
* for all types of that symbol name. In version 4 you used them like
pointers, but in modern perls you can just use references.
-A couple of others that you're likely to encounter that aren't
-really type specifiers are:
+There are a couple of other symbols that you're likely to encounter that aren't
+really type specifiers:
<> are used for inputting a record from a filehandle.
\ takes a reference to something.
Note that <FILE> is I<neither> the type specifier for files
nor the name of the handle. It is the C<< <> >> operator applied
-to the handle FILE. It reads one line (well, record - see
+to the handle FILE. It reads one line (well, record--see
L<perlvar/$/>) from the handle FILE in scalar context, or I<all> lines
in list context. When performing open, close, or any other operation
-besides C<< <> >> on files, or even talking about the handle, do
+besides C<< <> >> on files, or even when talking about the handle, do
I<not> use the brackets. These are correct: C<eof(FH)>, C<seek(FH, 0,
2)> and "copying from STDIN to FILE".
=head2 What's an extension?
-A way of calling compiled C code from Perl. Reading L<perlxstut>
-is a good place to learn more about extensions.
+An extension is a way of calling compiled C code from Perl. Reading
+L<perlxstut> is a good place to learn more about extensions.
=head2 Why do Perl operators have different precedence than C operators?
Actually, they don't. All C operators that Perl copies have the same
precedence in Perl as they do in C. The problem is with operators that C
doesn't have, especially functions that give a list context to everything
-on their right, eg print, chmod, exec, and so on. Such functions are
+on their right, eg. print, chmod, exec, and so on. Such functions are
called "list operators" and appear as such in the precedence table in
L<perlop>.
}
This is not C<-w> clean, however. There is no C<-w> clean way to
-detect taintedness - take this as a hint that you should untaint
+detect taintedness--take this as a hint that you should untaint
all possibly-tainted data.
=head2 What's a closure?
Closures make sense in any programming language where you can have the
return value of a function be itself a function, as you can in Perl.
Note that some languages provide anonymous functions but are not
-capable of providing proper closures; the Python language, for
+capable of providing proper closures: the Python language, for
example. For more information on closures, check out any textbook on
functional programming. Scheme is a language that not only supports
but encourages closures.
objects. See L<perlsub/"Pass by Reference"> for this particular
question, and L<perlref> for information on references.
+See ``Passing Regexes'', below, for information on passing regular
+expressions.
+
=over 4
=item Passing Variables and Functions
-Regular variables and functions are quite easy: just pass in a
+Regular variables and functions are quite easy to pass: just pass in a
reference to an existing or anonymous variable or function:
func( \$some_scalar );
=item Passing Filehandles
To pass filehandles to subroutines, use the C<*FH> or C<\*FH> notations.
-These are "typeglobs" - see L<perldata/"Typeglobs and Filehandles">
+These are "typeglobs"--see L<perldata/"Typeglobs and Filehandles">
and especially L<perlsub/"Pass by Reference"> for more information.
Here's an excerpt:
}
}
-Or you can use a closure to bundle up the object and its method call
-and arguments:
+Or, you can use a closure to bundle up the object, its
+method call, and arguments:
my $whatnot = sub { $some_obj->obfuscate(@args) };
func($whatnot);
that was initialized at compile time.
To declare a file-private variable, you'll still use a my(), putting
-it at the outer scope level at the top of the file. Assume this is in
-file Pax.pm:
+the declaration at the outer scope level at the top of the file.
+Assume this is in file Pax.pm:
package Pax;
my $started = scalar(localtime(time()));
=head2 What's the difference between dynamic and lexical (static) scoping? Between local() and my()?
-C<local($x)> saves away the old value of the global variable C<$x>,
-and assigns a new value for the duration of the subroutine, I<which is
+C<local($x)> saves away the old value of the global variable C<$x>
+and assigns a new value for the duration of the subroutine I<which is
visible in other functions called from that subroutine>. This is done
at run-time, so is called dynamic scoping. local() always affects global
variables, also called package variables or dynamic variables.
C<my($x)> creates a new variable that is only visible in the current
-subroutine. This is done at compile-time, so is called lexical or
+subroutine. This is done at compile-time, so it is called lexical or
static scoping. my() always affects private variables, also called
lexical variables or (improperly) static(ly scoped) variables.
=head2 What's the difference between calling a function as &foo and foo()?
When you call a function as C<&foo>, you allow that function access to
-your current @_ values, and you by-pass prototypes. That means that
-the function doesn't get an empty @_, it gets yours! While not
+your current @_ values, and you bypass prototypes.
+The function doesn't get an empty @_--it gets yours! While not
strictly speaking a bug (it's documented that way in L<perlsub>), it
would be hard to consider this a feature in most cases.
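
To make the @_ sharing concrete, here is a small sketch using two hypothetical subs, count_args() and outer(). It assumes nothing beyond core Perl and simply counts the arguments the callee can see under each calling style:

```perl
use strict;
use warnings;

sub count_args { return scalar @_ }

sub outer {
    my $with_amp    = &count_args;    # no parens: callee sees outer's @_
    my $with_parens = count_args();   # callee gets a fresh, empty @_
    return ($with_amp, $with_parens);
}

my ($amp, $parens) = outer("a", "b", "c");
print "&count_args saw $amp args; count_args() saw $parens\n";
```

Note that the sharing only happens for the bare C<&count_args> form; writing C<&count_args()> with explicit parentheses passes an empty list.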
For example, let's say you wanted to test which of many answers you were
given, but in a case-insensitive way that also allows abbreviations.
You can use the following technique if the strings all start with
-different characters, or if you want to arrange the matches so that
+different characters or if you want to arrange the matches so that
one takes precedence over another, as C<"SEND"> has precedence over
C<"STOP"> here:
Some possible reasons: your inheritance is getting confused, you've
misspelled the method name, or the object is of the wrong type. Check
-out L<perltoot> for details on these. You may also use C<print
-ref($object)> to find out the class C<$object> was blessed into.
+out L<perltoot> for details about any of the above cases. You may
+also use C<print ref($object)> to find out the class C<$object> was
+blessed into.
Another possible reason for problems is because you've used the
indirect object syntax (eg, C<find Guru "Samy">) on a class name
before Perl has seen that such a package exists. It's wisest to make
sure your packages are all defined before you start using them, which
will be taken care of if you use the C<use> statement instead of
-C<require>. If not, make sure to use arrow notation (eg,
+C<require>.  If not, make sure to use arrow notation (e.g.,
C<< Guru->find("Samy") >>) instead. Object notation is explained in
L<perlobj>.
my $packname = __PACKAGE__;
-But if you're a method and you want to print an error message
+But, if you're a method and you want to print an error message
that includes the kind of object you were called on (which is
not necessarily the same as the one in which you were compiled):
This works I<sometimes>, but it is a very bad idea for two reasons.
-The first reason is that they I<only work on global variables>.
-That means above that if $fred is a lexical variable created with my(),
-that the code won't work at all: you'll accidentally access the global
-and skip right over the private lexical altogether. Global variables
-are bad because they can easily collide accidentally and in general make
-for non-scalable and confusing code.
+The first reason is that this technique I<only works on global
+variables>. That means that if $fred is a lexical variable created
+with my() in the above example, the code wouldn't work at all: you'd
+accidentally access the global and skip right over the private lexical
+altogether. Global variables are bad because they can easily collide
+accidentally and in general make for non-scalable and confusing code.
Symbolic references are forbidden under the C<use strict> pragma.
They are not true references and consequently are not reference counted
or garbage collected.
The other reason why using a variable to hold the name of another
-variable a bad idea is that the question often stems from a lack of
+variable is a bad idea is that the question often stems from a lack of
understanding of Perl data structures, particularly hashes. By using
symbolic references, you are just using the package's symbol-table hash
(like C<%main::>) instead of a user-defined hash. The solution is to
$str = 'this has a $fred and $barney in it';
$str =~ s/(\$\w+)/$1/eeg; # need double eval
-Instead, it would be better to keep a hash around like %USER_VARS and have
+it would be better to keep a hash around like %USER_VARS and have
variable references actually refer to entries in that hash:
$str =~ s/\$(\w+)/$USER_VARS{$1}/g; # no /e here at all
$str = 'this has a %fred% and %barney% in it';
$str =~ s/%(\w+)%/$USER_VARS{$1}/g; # no /e here at all
-Another reason that folks sometimes think they want a variable to contain
-the name of a variable is because they don't know how to build proper
-data structures using hashes. For example, let's say they wanted two
-hashes in their program: %fred and %barney, and to use another scalar
-variable to refer to those by name.
+Another reason that folks sometimes think they want a variable to
+contain the name of a variable is because they don't know how to build
+proper data structures using hashes. For example, let's say they
+wanted two hashes in their program: %fred and %barney, and that they
+wanted to use another scalar variable to refer to those by name.
$name = "fred";
$$name{WIFE} = "wilma"; # set %fred
So, sometimes you might want to use symbolic references to directly
manipulate the symbol table. This doesn't matter for formats, handles, and
-subroutines, because they are always global -- you can't use my() on them.
-But for scalars, arrays, and hashes -- and usually for subroutines --
-you probably want to use hard references only.
+subroutines, because they are always global--you can't use my() on them.
+For scalars, arrays, and hashes, though--and usually for subroutines--
+you probably only want to use hard references.
=head1 AUTHOR AND COPYRIGHT
encouraged to use this code in your own programs for fun
or for profit as you see fit. A simple comment in the code giving
credit would be courteous but is not required.
+
=head1 DESCRIPTION
This section of the Perl FAQ covers questions involving operating
-system interaction. This involves interprocess communication (IPC),
+system interaction. Topics include interprocess communication (IPC),
control over the user-interface (keyboard, screen and pointing
devices), and most anything else not related to data manipulation.
$key = ReadKey(0);
ReadMode('normal');
-However, that requires that you have a working C compiler and can use it
-to build and install a CPAN module. Here's a solution using
-the standard POSIX module, which is already on your systems (assuming
-your system supports POSIX).
+However, using the code requires that you have a working C compiler
+and can use it to build and install a CPAN module. Here's a solution
+using the standard POSIX module, which is already on your system
+(assuming your system supports POSIX).
use HotKey;
$key = readkey();
(This question has nothing to do with the web. See a different
FAQ for that.)
-There's an example of this in L<perlfunc/crypt>). First, you put
-the terminal into "no echo" mode, then just read the password
-normally. You may do this with an old-style ioctl() function, POSIX
-terminal control (see L<POSIX>, and Chapter 7 of the Camel), or a call
+There's an example of this in L<perlfunc/crypt>.  First, you put the
+terminal into "no echo" mode, then just read the password normally.
+You may do this with an old-style ioctl() function, POSIX terminal
+control (see L<POSIX> and Chapter 7 of the Camel, 2nd ed.), or a call
to the B<stty> program, with varying degrees of portability.
You can also do this for most systems using the Term::ReadKey module
This depends on which operating system your program is running on. In
the case of Unix, the serial ports will be accessible through files in
-/dev; on other systems, the devices names will doubtless differ.
+/dev; on other systems, device names will doubtless differ.
Several problem areas common to all device interaction are the
-following
+following:
=over 4
=item lockfiles
Your system may use lockfiles to control multiple access. Make sure
-you follow the correct protocol. Unpredictable behaviour can result
+you follow the correct protocol. Unpredictable behavior can result
from multiple processes reading from one device.
=item open mode
print DEV "atv1\012"; # wrong, for some devices
print DEV "atv1\015"; # right, for some devices
-Even though with normal text files, a "\n" will do the trick, there is
+Even though with normal text files a "\n" will do the trick, there is
still no unified scheme for terminating a line that is portable
between Unix, DOS/Win, and Macintosh, except to terminate I<ALL> line
ends with "\015\012", and strip what you don't need from the output.
If you expect characters to get to your device when you print() them,
you'll want to autoflush that filehandle. You can use select()
and the C<$|> variable to control autoflushing (see L<perlvar/$|>
-and L<perlfunc/select>):
+and L<perlfunc/select>, or L<perlfaq5>, ``How do I flush/unbuffer an
+output filehandle? Why must I do this?''):
$oldh = select(DEV);
$| = 1;
You spend lots and lots of money on dedicated hardware, but this is
bound to get you talked about.
-Seriously, you can't if they are Unix password files - the Unix
+Seriously, you can't if they are Unix password files--the Unix
password system employs one-way encryption. It's more like hashing than
encryption. The best you can check is whether something else hashes to
the same string. You can't turn a hash back into the original string.
sometimes avoid this by using syswrite() instead of print().
Unless you're exceedingly careful, the only safe things to do inside a
-signal handler are: set a variable and exit. And in the first case,
+signal handler are (1) set a variable and (2) exit. In the first case,
you should only set a variable in such a way that malloc() is not
called (eg, by setting a variable that already has a value).
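
A minimal sketch of the "set a variable" approach, assuming a Unix-like system where a process can signal itself with kill(). The variable name C<$got_int> is hypothetical; the point is that the handler only overwrites a scalar that already has a value and leaves the real work to the main program:

```perl
use strict;
use warnings;

my $got_int = 0;                    # already has a value before the handler runs

$SIG{INT} = sub { $got_int = 1 };   # the handler does nothing else

kill 'INT', $$;                     # send ourselves SIGINT to exercise the handler
select(undef, undef, undef, 0.1)    # brief pause in case delivery is delayed
    unless $got_int;

# The main program, not the handler, reacts to the flag.
print "interrupted, cleaning up\n" if $got_int;
```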
you're in a "slow" call, such as <FH>, read(), connect(), or
wait(), that the only way to terminate them is by "longjumping" out;
that is, by raising an exception. See the time-out handler for a
-blocking flock() in L<perlipc/"Signals"> or chapter 6 of the Camel.
+blocking flock() in L<perlipc/"Signals"> or chapter 6 of the Camel, 2nd ed.
=head2 How do I modify the shadow password file on a Unix system?
-If perl was installed correctly, and your shadow library was written
+If perl was installed correctly and your shadow library was written
properly, the getpw*() functions described in L<perlfunc> should in
theory provide (read-only) access to entries in the shadow password
file. To change the file, make a new shadow password file (the format
-varies from system to system - see L<passwd(5)> for specifics) and use
+varies from system to system--see L<passwd(5)> for specifics) and use
pwd_mkdb(8) to install it (see L<pwd_mkdb(8)> for more details).
=head2 How do I set the time and date?
close(STDOUT) || die "stdout close failed: $!";
}
-The END block isn't called when untrapped signals kill the program, though, so if
-you use END blocks you should also use
+The END block isn't called when untrapped signals kill the program,
+though, so if you use END blocks you should also use
use sigtrap qw(die normal-signals);
Perl's exception-handling mechanism is its eval() operator. You can
use eval() as setjmp and die() as longjmp. For details of this, see
the section on signals, especially the time-out handler for a blocking
-flock() in L<perlipc/"Signals"> and chapter 6 of the Camel.
+flock() in L<perlipc/"Signals"> and chapter 6 of the Camel, 2nd ed.
If exception handling is all you're interested in, try the
exceptions.pl library (part of the standard perl distribution).
If you want the atexit() syntax (and an rmexit() as well), try the
AtExit module available from CPAN.
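
The eval()/die() pairing mentioned above can be sketched as follows. The sub risky() is a hypothetical stand-in for any code that might fail; this is an illustration under that assumption, not code taken from perlipc:

```perl
use strict;
use warnings;

sub risky {
    my ($n) = @_;
    die "negative input\n" if $n < 0;   # "longjmp" out to the nearest eval
    return sqrt($n);
}

my $result = eval { risky(-1) };        # "setjmp": execution resumes here
my $error  = $@;                        # copy $@ before anything can clobber it

if (!defined $result) {
    chomp $error;
    print "caught: $error\n";
}
```

Copying C<$@> into a lexical right after the eval is a defensive habit, since later code may reset it.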
-=head2 Why doesn't my sockets program work under System V (Solaris)? What does the error message "Protocol not supported" mean?
+=head2 Why doesn't my sockets program work under System V (Solaris)? What does the error message "Protocol not supported" mean?
Some Sys-V based systems, notably Solaris 2.X, redefined some of the
standard socket constants. Since these were constant across all
=head2 How can I call my system's unique C functions from Perl?
-In most cases, you write an external module to do it - see the answer
+In most cases, you write an external module to do it--see the answer
to "Where can I learn about linking C with Perl? [h2xs, xsubpp]".
However, if the function is a system call, and your system supports
syscall(), you can use the syscall function (documented in
L<perlfunc>).
Remember to check the modules that came with your distribution, and
-CPAN as well - someone may already have written a module to do it.
+CPAN as well--someone may already have written a module to do it.
=head2 Where do I get the include files to do ioctl() or syscall()?
open (PIPE, "cmd |"); # using open()
With system(), both STDOUT and STDERR will go the same place as the
-script's versions of these, unless the command redirects them.
+script's STDOUT and STDERR, unless the system() command redirects them.
Backticks and open() read B<only> the STDOUT of your command.
With any of these, you can change file descriptors before the call:
piped open() contains shell metacharacters, perl fork()s, then exec()s
a shell to decode the metacharacters and eventually run the desired
program. Now when you call wait(), you only learn whether or not the
-I<shell> could be successfully started. Best to avoid shell
+I<shell> could be successfully started.  It's best to avoid shell
metacharacters.
On systems that follow the spawn() paradigm, open() I<might> do what
`cat /etc/termcap`;
You haven't assigned the output anywhere, so it just wastes memory
-(for a little while). Plus you forgot to check C<$?> to see whether
-the program even ran correctly. Even if you wrote
+(for a little while).  You also forgot to check C<$?> to see whether
+the program even ran correctly.  Even if you wrote
print `cat /etc/termcap`;
-In most cases, this could and probably should be written as
+this code could and probably should be written as
system("cat /etc/termcap") == 0
or die "cat program failed!";
-Which will get the output quickly (as it is generated, instead of only
+which will get the output quickly (as it is generated, instead of only
at the end) and also check the return value.
system() also provides direct control over whether shell wildcard
=head2 Why can't my script read from STDIN after I gave it EOF (^D on Unix, ^Z on MS-DOS)?
-Because some stdio's set error and eof flags that need clearing. The
+Some stdio's set error and eof flags that need clearing. The
POSIX module defines clearerr() that you can use. That is the
technically correct way to do it. Here are some less reliable
workarounds:
=item Unix
-In the strictest sense, it can't be done -- the script executes as a
+In the strictest sense, it can't be done--the script executes as a
different process from the shell it was started from. Changes to a
-process are not reflected in its parent, only in its own children
+process are not reflected in its parent--only in any children
created after the change. There is shell magic that may allow you to
fake it by eval()ing the script's output in your shell; check out the
comp.unix.questions FAQ for details.
=head2 How do I close a process's filehandle without waiting for it to complete?
Assuming your system supports such things, just send an appropriate signal
-to the process (see L<perlfunc/"kill">. It's common to first send a TERM
+to the process (see L<perlfunc/"kill">). It's common to first send a TERM
signal, wait a little bit, and then send a KILL signal to finish it off.
=head2 How do I fork a daemon process?
sysopen(FH, "/tmp/somefile", O_WRONLY|O_NDELAY|O_CREAT, 0644)
or die "can't open /tmp/somefile: $!";
-
-
-
=head2 How do I install a module from CPAN?
The easiest way is to have a module also named CPAN do it for you.
get a new F<perl> binary with your extension linked in.
See L<ExtUtils::MakeMaker> for more details on building extensions.
-See also the next question.
+See also the next question, ``What's the difference between require
+and use?''.
=head2 What's the difference between require and use?
Perl offers several different ways to include code from one file into
another. Here are the deltas between the various inclusion constructs:
- 1) do $file is like eval `cat $file`, except the former:
+ 1) do $file is like eval `cat $file`, except the former
1.1: searches @INC and updates %INC.
1.2: bequeaths an *unrelated* lexical scope on the eval'ed code.
- 2) require $file is like do $file, except the former:
+ 2) require $file is like do $file, except the former
2.1: checks for redundant loading, skipping already loaded files.
2.2: raises an exception on failure to find, compile, or execute $file.
- 3) require Module is like require "Module.pm", except the former:
+ 3) require Module is like require "Module.pm", except the former
3.1: translates each "::" into your system's directory separator.
3.2: primes the parser to disambiguate class Module as an indirect object.
- 4) use Module is like require Module, except the former:
+ 4) use Module is like require Module, except the former
4.1: loads the module at compile time, not run-time.
4.2: imports symbols and semantics from that package to the current one.
use lib '/u/mydir/perl';
-This is almost the same as:
+This is almost the same as
BEGIN {
unshift(@INC, '/u/mydir/perl');
This section deals with questions related to networking, the internet,
and a few on the web.
-=head2 My CGI script runs from the command line but not the browser. (500 Server Error)
+=head2 My CGI script runs from the command line but not the browser. (500 Server Error)
If you can demonstrate that you've read the following FAQs and that
your problem isn't something simple that can be easily answered, you'll
Many folks attempt a simple-minded regular expression approach, like
C<< s/<.*?>//g >>, but that fails in many cases because the tags
may continue over line breaks, they may contain quoted angle-brackets,
-or HTML comment may be present. Plus folks forget to convert
-entities, like C<&lt;> for example.
+or HTML comments may be present.  Plus, folks forget to convert
+entities--like C<&lt;> for example.
Here's one "simple-minded" approach, that works for most files:
EOF
-To be correct to the spec, each of those virtual newlines should really be
-physical C<"\015\012"> sequences by the time you hit the client browser.
-Except for NPH scripts, though, that local newline should get translated
-by your server into standard form, so you shouldn't have a problem
-here, even if you are stuck on MacOS. Everybody else probably won't
-even notice.
+To be correct to the spec, each of those virtual newlines should
+really be physical C<"\015\012"> sequences by the time your message is
+received by the client browser. Except for NPH scripts, though, that
+local newline should get translated by your server into standard form,
+so you shouldn't have a problem here, even if you are stuck on MacOS.
+Everybody else probably won't even notice.
=head2 How do I put a password on my web pages?
=head2 How do I make sure users can't enter values into a form that cause my CGI script to do bad things?
Read the CGI security FAQ, at
-http://www-genome.wi.mit.edu/WWW/faqs/www-security-faq.html, and the
+http://www-genome.wi.mit.edu/WWW/faqs/www-security-faq.html , and the
Perl/CGI FAQ at
-http://www.perl.com/CPAN/doc/FAQs/cgi/perl-cgi-faq.html.
+http://www.perl.com/CPAN/doc/FAQs/cgi/perl-cgi-faq.html .
In brief: use tainting (see L<perlsec>), which makes sure that data
from outside your script (eg, CGI parameters) are never used in
=head2 How do I return the user's mail address?
-On systems that support getpwuid, the $< variable and the
+On systems that support getpwuid, the $< variable, and the
Sys::Hostname module (which is part of the standard perl distribution),
you can probably try using something like this:
While you could use the Mail::Folder module from CPAN (part of the
MailFolder package) or the Mail::Internet module from CPAN (also part
-of the MailTools package), often a module is overkill, though. Here's a
+of the MailTools package), often a module is overkill. Here's a
mail sorter.
#!/usr/bin/perl
=head2 How do I fetch a news article or the active newsgroups?
Use the Net::NNTP or News::NNTPClient modules, both available from CPAN.
-This can make tasks like fetching the newsgroup list as simple as:
+This can make tasks like fetching the newsgroup list as simple as
perl -MNews::NNTPClient
-e 'print News::NNTPClient->new->list("newsgroups")'
=head2 How can I do RPC in Perl?
-A DCE::RPC module is being developed (but is not yet available), and
+A DCE::RPC module is being developed (but is not yet available) and
will be released as part of the DCE-Perl package (available from
CPAN). The rpcgen suite, available from CPAN/authors/id/JAKE/, is
an RPC stub generator and includes an RPC::ONC module.