3 perl - Practical Extraction and Report Language
7 For ease of access, the Perl manual has been split up into a number
10 perl Perl overview (this section)
11 perldata Perl data structures
13 perlop Perl operators and precedence
14 perlre Perl regular expressions
15 perlrun Perl execution and options
16 perlfunc Perl builtin functions
17 perlvar Perl predefined variables
18 perlsub Perl subroutines
20 perlref Perl references and nested data structures
22 perlbot Perl OO tricks and examples
23 perldebug Perl debugging
24 perldiag Perl diagnostic messages
26 perlipc Perl interprocess communication
28 perltrap Perl traps for the unwary
29 perlstyle Perl style guide
30 perlapi Perl application programming interface
31 perlguts Perl internal functions for those doing extensions
32 perlcall Perl calling conventions from C
33 perlovl Perl overloading semantics
34 perlembed Perl ways to embed perl in your C or C++ app
35 perlpod Perl plain old documentation
36 perlbook Perl book information
38 (If you're intending to read these straight through for the first time,
39 the suggested order will tend to reduce the number of forward references.)
41 Additional documentation for perl modules is available in
42 the F</usr/local/lib/perl5/man/man3> directory. You can view this
43 with a man(1) program by including the following in the
44 appropriate start-up files. (You may have to adjust the path to
45 match $Config{'man3dir'}.)
47 .profile (for sh, bash or ksh users):
48 MANPATH=$MANPATH:/usr/local/lib/perl5/man; export MANPATH
51 .login (for csh or tcsh users):
52 setenv MANPATH $MANPATH:/usr/local/lib/perl5/man
54 If that doesn't work for some reason, you can still use the
55 supplied perldoc script to view module information.
57 If something strange has gone wrong with your program and you're not
58 sure where you should look for help, try the B<-w> switch first. It
59 will often point out exactly where the trouble is.
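As a minimal sketch of the kind of mistake B<-w> catches, the following turns warnings on from inside the program through the C<$^W> variable (the flag that B<-w> sets) and collects the diagnostics instead of printing them straight to STDERR. The helper name C<collect_warnings> is made up for this illustration.

```perl
# Hypothetical helper: run a block of code with warnings enabled
# (as under -w) and return the warning messages it produced.
sub collect_warnings {
    my ($code) = @_;
    my @messages;
    local $^W = 1;                                    # the flag that -w sets
    local $SIG{__WARN__} = sub { push @messages, $_[0] };
    $code->();
    return @messages;
}

my @got = collect_warnings(sub {
    my $total;                  # declared but never given a value
    my $sum = $total + 1;       # warns: use of an uninitialized value
});

print "warning: $_" for @got;
```

Running an ordinary script as C<perl -w script> gives the same diagnostics without any helper.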
63 Perl is an interpreted language optimized for scanning arbitrary
64 text files, extracting information from those text files, and printing
65 reports based on that information. It's also a good language for many
66 system management tasks. The language is intended to be practical
67 (easy to use, efficient, complete) rather than beautiful (tiny,
68 elegant, minimal). It combines (in the author's opinion, anyway) some
69 of the best features of C, B<sed>, B<awk>, and B<sh>, so people
70 familiar with those languages should have little difficulty with it.
71 (Language historians will also note some vestiges of B<csh>, Pascal,
72 and even BASIC-PLUS.) Expression syntax corresponds quite closely to C
73 expression syntax. Unlike most Unix utilities, Perl does not
74 arbitrarily limit the size of your data--if you've got the memory,
75 Perl can slurp in your whole file as a single string. Recursion is
76 of unlimited depth. And the hash tables used by associative arrays
77 grow as necessary to prevent degraded performance. Perl uses
78 sophisticated pattern matching techniques to scan large amounts of data
79 very quickly. Although optimized for scanning text, Perl can also
80 deal with binary data, and can make dbm files look like associative
81 arrays (where dbm is available). Setuid Perl scripts are safer than
82 C programs through a dataflow tracing mechanism which prevents many
83 stupid security holes. If you have a problem that would ordinarily use
84 B<sed> or B<awk> or B<sh>, but it exceeds their capabilities or must
85 run a little faster, and you don't want to write the silly thing in C,
86 then Perl may be for you. There are also translators to turn your
87 B<sed> and B<awk> scripts into Perl scripts.
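The extract-and-report style described above can be sketched in a few lines. The sample input and the names C<$log>, C<%logins>, and C<$report> are made up for illustration; a real program would read a file on disk rather than an in-memory string.

```perl
# Made-up sample input; in practice this would be a real log file.
my $log = <<'END';
alice login
bob login
alice logout
alice login
END

# Slurp the whole input as a single string by undefining the
# input record separator for the duration of the read.
open(my $fh, '<', \$log) or die "cannot open in-memory file: $!";
my $text = do { local $/; <$fh> };
close($fh);

# Extract: count "login" events per user in an associative array.
my %logins;
while ($text =~ /^(\w+) login$/mg) {
    $logins{$1}++;
}

# Report: one line per user, sorted by name.
my $report = '';
for my $user (sort keys %logins) {
    $report .= sprintf("%-8s %d\n", $user, $logins{$user});
}
print $report;
```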
89 But wait, there's more...
91 Perl version 5 is nearly a complete rewrite, and provides
92 the following additional benefits:
96 =item * Many usability enhancements
98 It is now possible to write much more readable Perl code (even within
99 regular expressions). Formerly cryptic variable names can be replaced
100 by mnemonic identifiers. Error messages are more informative, and the
101 optional warnings will catch many of the mistakes a novice might make.
102 This cannot be stressed enough. Whenever you get mysterious behavior,
103 try the B<-w> switch!!! Whenever you don't get mysterious behavior,
104 try using B<-w> anyway.
106 =item * Simplified grammar
108 The new yacc grammar is one half the size of the old one. Many of the
109 arbitrary grammar rules have been regularized. The number of reserved
110 words has been cut by 2/3. Despite this, nearly all old Perl scripts
111 will continue to work unchanged.
113 =item * Lexical scoping
115 Perl variables may now be declared within a lexical scope, like "auto"
116 variables in C. Not only is this more efficient, but it contributes
117 to better privacy for "programming in the large".
119 =item * Arbitrarily nested data structures
121 Any scalar value, including any array element, may now contain a
122 reference to any other variable or subroutine. You can easily create
123 anonymous variables and subroutines. Perl manages your reference counts for you.
126 =item * Modularity and reusability
128 The Perl library is now defined in terms of modules which can be easily
129 shared among various packages. A package may choose to import all or a
130 portion of a module's published interface. Pragmas (that is, compiler
131 directives) are defined and used by the same mechanism.
133 =item * Object-oriented programming
135 A package can function as a class. Dynamic multiple inheritance and
136 virtual methods are supported in a straightforward manner and with very
137 little new syntax. Filehandles may now be treated as objects.
139 =item * Embeddable and Extensible
141 Perl may now be embedded easily in your C or C++ application, and can
142 either call or be called by your routines through a documented
143 interface. The XS preprocessor is provided to make it easy to glue
144 your C or C++ routines into Perl. Dynamic loading of modules is supported.
147 =item * POSIX compliant
149 A major new module is the POSIX module, which provides access to all
150 available POSIX routines and definitions, via object classes where appropriate.
153 =item * Package constructors and destructors
155 The new BEGIN and END blocks provide means to capture control as
156 a package is being compiled, and after the program exits. As a
157 degenerate case they work just like awk's BEGIN and END when you
158 use the B<-p> or B<-n> switches.
160 =item * Multiple simultaneous DBM implementations
162 A Perl program may now access DBM, NDBM, SDBM, GDBM, and Berkeley DB
163 files from the same script simultaneously. In fact, the old dbmopen
164 interface has been generalized to allow any variable to be tied
165 to an object class which defines its access methods.
167 =item * Subroutine definitions may now be autoloaded
169 In fact, the AUTOLOAD mechanism also allows you to define any arbitrary
170 semantics for undefined subroutine calls. It's not just for autoloading.
172 =item * Regular expression enhancements
174 You can now specify non-greedy quantifiers. You can now do grouping
175 without creating a backreference. You can now write regular expressions
176 with embedded whitespace and comments for readability. A consistent
177 extensibility mechanism has been added that is upwardly compatible with
178 all old regular expressions.
182 Ok, that's I<definitely> enough hype.
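A few of the features listed above can be sketched together: lexical scoping, nested data structures, anonymous subroutines, a package used as a class, and the regular expression enhancements. All names here (C<Counter>, C<%langs>, and so on) are invented for the example, and this is only one of several ways to write each piece.

```perl
# Lexical scoping: $count is visible only inside this block.
my @seen;
{
    my $count = 0;
    push @seen, ++$count for 1 .. 3;
}

# Nested data: a hash whose values are references to anonymous arrays.
my %langs = (
    scripting => [ 'sed', 'awk', 'sh' ],
    compiled  => [ 'C', 'Pascal' ],
);
push @{ $langs{compiled} }, 'C++';
my $second = $langs{scripting}[1];

# An anonymous subroutine held in a scalar.
my $double = sub { return 2 * $_[0] };

# A package functioning as a class (the name Counter is made up).
package Counter;
sub new  { my ($class) = @_; return bless { n => 0 }, $class }
sub incr { my ($self)  = @_; return ++$self->{n} }
package main;

my $counter = Counter->new;
$counter->incr for 1 .. 3;

# Regular expressions: a non-greedy quantifier, then grouping
# without a backreference inside a pattern laid out with /x.
my $html = '<b>bold</b> and <i>italic</i>';
my ($first_tag) = $html =~ /<(.+?)>/;
my ($word) = 'foobar' =~ / (?:foo|baz)   # group, but do not capture
                           (\w+)         # the only capture: the rest
                         /x;

print "$second $first_tag $word\n";
```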
=item HOME
190 Used if chdir has no argument.
=item LOGDIR
194 Used if chdir has no argument and HOME is not set.
=item PATH
198 Used in executing subprocesses, and in finding the script if B<-S> is used.
=item PERL5LIB
203 A colon-separated list of directories in which to look for Perl library
204 files before looking in the standard library and the current
205 directory. If PERL5LIB is not defined, PERLLIB is used.
=item PERL5DB
209 The command used to get the debugger code. If unset, uses
211 BEGIN { require 'perl5db.pl' }
=item PERLLIB
215 A colon-separated list of directories in which to look for Perl library
216 files before looking in the standard library and the current
217 directory. If PERL5LIB is defined, PERLLIB is not used.
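The precedence between PERL5LIB and PERLLIB described above can be mirrored in a short sketch. Perl applies this rule itself at startup; the helper C<extra_lib_dirs> below is hypothetical and only restates the rule so it can be tried on sample data. It assumes Unix-style colon separators.

```perl
# Hypothetical helper: return the extra library directories implied
# by an environment, preferring PERL5LIB and falling back to PERLLIB.
# It takes a hash reference so it can be tried without touching %ENV.
sub extra_lib_dirs {
    my ($env) = @_;
    my $list = defined $env->{PERL5LIB} ? $env->{PERL5LIB}
             : defined $env->{PERLLIB}  ? $env->{PERLLIB}
             :                            '';
    return grep { length } split /:/, $list;
}

# What the current environment asks for, followed by the search
# path that perl actually ended up with.
print "requested: $_\n" for extra_lib_dirs(\%ENV);
print "searching: $_\n" for @INC;
```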
222 Apart from these, Perl uses no other environment variables, except
223 to make them available to the script being executed, and to child
224 processes. However, scripts running setuid would do well to execute
225 the following lines before doing anything else, just to keep people honest:
228 $ENV{'PATH'} = '/bin:/usr/bin'; # or whatever you need
229 $ENV{'SHELL'} = '/bin/sh' if defined $ENV{'SHELL'};
230 $ENV{'IFS'} = '' if defined $ENV{'IFS'};
234 Larry Wall <F<lwall@netlabs.com>>, with the help of oodles of other folks.
238 "/tmp/perl-e$$" temporary file for -e commands
239 "@INC" locations of perl 5 libraries
243 a2p awk to perl translator
244 s2p sed to perl translator
248 The B<-w> switch produces some lovely diagnostics.
250 See L<perldiag> for explanations of all Perl's diagnostics.
252 Compilation errors will tell you the line number of the error, with an
253 indication of the next token or token type that was to be examined.
254 (In the case of a script passed to Perl via B<-e> switches, each
255 B<-e> is counted as one line.)
257 Setuid scripts have additional constraints that can produce error
258 messages such as "Insecure dependency". See L<perlsec>.
260 Did we mention that you should definitely consider using the B<-w> switch?
265 The B<-w> switch is not mandatory.
267 Perl is at the mercy of your machine's definitions of various
268 operations such as type casting, atof() and sprintf().
270 If your stdio requires a seek or eof between reads and writes on a
271 particular stream, so does Perl. (This doesn't apply to sysread() and syswrite().)
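A minimal sketch of the portable habit this implies: call seek() whenever a filehandle opened for both reading and writing switches direction. The scratch file comes from the standard File::Temp module, and the variable names are made up for illustration.

```perl
use Fcntl qw(SEEK_SET SEEK_CUR);
use File::Temp qw(tempfile);

# A scratch file opened for both reading and writing.
my ($fh, $path) = tempfile(UNLINK => 1);
print $fh "hello world\n";

# Switching from writing to reading: seek first.
seek($fh, 0, SEEK_SET) or die "cannot seek in $path: $!";
my $head;
read($fh, $head, 5);

# Switching from reading to writing: seek again. An offset of 0 from
# SEEK_CUR keeps the position but satisfies the stdio rule.
seek($fh, 0, SEEK_CUR) or die "cannot seek in $path: $!";
print $fh "_";                  # overwrites the space after "hello"

# And once more before reading the result back.
seek($fh, 0, SEEK_SET) or die "cannot seek in $path: $!";
my $line = <$fh>;
close($fh);

print $line;
```

Whether the seeks are strictly required depends on the I/O layer underneath, but they cost little where they are not.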
274 While none of the built-in data types have any arbitrary size limits
275 (apart from memory size), there are still a few arbitrary limits: a
276 given identifier may not be longer than 255 characters, and no
277 component of your PATH may be longer than 255 if you use B<-S>. A regular
278 expression may not compile to more than 32767 bytes internally.
280 Perl actually stands for Pathologically Eclectic Rubbish Lister, but
281 don't tell anyone I said that.
285 The Perl motto is "There's more than one way to do it." Divining
286 how many more is left as an exercise to the reader.
288 The three principal virtues of a programmer are Laziness,
289 Impatience, and Hubris. See the Camel Book for why.