=head2 Predefined Names
-The following names have special meaning to Perl. Most of the
+The following names have special meaning to Perl. Most
punctuation names have reasonable mnemonics, or analogues in one of
-the shells. Nevertheless, if you wish to use the long variable names,
+the shells. Nevertheless, if you wish to use long variable names,
you just need to say
use English;
at the top of your program. This will alias all the short names to the
-long names in the current package. Some of them even have medium names,
+long names in the current package. Some even have medium names,
generally borrowed from B<awk>.
+Due to an unfortunate accident of Perl's implementation, "C<use English>"
+imposes a considerable performance penalty on all regular expression
+matches in a program, regardless of whether they occur in the scope of
+"C<use English>". For that reason, saying "C<use English>" in
+libraries is strongly discouraged. See the Devel::SawAmpersand module
+documentation from CPAN
+(http://www.perl.com/CPAN/modules/by-module/Devel/Devel-SawAmpersand-0.10.readme)
+for more information.
+
To go a step further, those variables that depend on the currently
-selected filehandle may instead be set by calling an object method on
-the FileHandle object. (Summary lines below for this contain the word
-HANDLE.) First you must say
+selected filehandle may instead (and preferably) be set by calling an
+object method on the FileHandle object. (Summary lines below for this
+contain the word HANDLE.) First you must say
use FileHandle;
method HANDLE EXPR
-or
+or more safely,
HANDLE->method(EXPR)
you try to assign to this variable, either directly or indirectly through
a reference, you'll raise a run-time exception.
+The following list is ordered by scalar variables first, then the
+arrays, then the hashes (except $^M was added in the wrong place).
+This is somewhat obscured by the fact that %ENV and %SIG are listed as
+$ENV{expr} and $SIG{expr}.
+
+
=over 8
=item $ARG
=over 8
-=item $E<lt>I<digit>E<gt>
+=item $E<lt>I<digits>E<gt>
Contains the subpattern from the corresponding set of parentheses in
the last pattern matched, not counting patterns matched in nested
-blocks that have been exited already. (Mnemonic: like \digit.)
+blocks that have been exited already. (Mnemonic: like \digits.)
These variables are all read-only.
=item $MATCH
any matches hidden within a BLOCK or eval() enclosed by the current
BLOCK). (Mnemonic: like & in some editors.) This variable is read-only.
+The use of this variable anywhere in a program imposes a considerable
+performance penalty on all regular expression matches. See the
+Devel::SawAmpersand module from CPAN for more information.
+
=item $PREMATCH
=item $`
enclosed by the current BLOCK). (Mnemonic: C<`> often precedes a quoted
string.) This variable is read-only.
+The use of this variable anywhere in a program imposes a considerable
+performance penalty on all regular expression matches. See the
+Devel::SawAmpersand module from CPAN for more information.
+
=item $POSTMATCH
=item $'
This variable is read-only.
+The use of this variable anywhere in a program imposes a considerable
+performance penalty on all regular expression matches. See the
+Devel::SawAmpersand module from CPAN for more information.
+
=item $LAST_PAREN_MATCH
=item $+
(Mnemonic: be positive and forward looking.)
This variable is read-only.
+=item @+
+
+$+[0] is the offset of the end of the last successful match.
+C<$+[>I<n>C<]> is the offset of the end of the substring matched by
+the I<n>-th subpattern, or undef if the subpattern did not match.
+
+Thus after a match against $_, $& coincides with C<substr $_, $-[0],
+$+[0] - $-[0]>. Similarly, C<$>I<n> coincides with C<substr $_, $-[>I<n>C<],
+$+[>I<n>C<] - $-[>I<n>C<]> if C<$-[>I<n>C<]> is defined, and $+ coincides with
+C<substr $_, $-[$#-], $+[$#-] - $-[$#-]>. One can use C<$#+> to find the number
+of subgroups in the last successful match. Note the difference with
+C<$#->, which is the last I<matched> subgroup. Compare with L<"@-">.
+
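+As a rough illustration (the sample string and pattern are made up for
+this sketch), the offsets in C<@-> and C<@+> can be fed to C<substr> to
+recover the matched text without touching C<$&>:
+
+    $_ = "hello world";
+    if (/(wor)(ld)/) {
+        # Text of the whole match, rebuilt from its start and end offsets.
+        my $whole = substr $_, $-[0], $+[0] - $-[0];    # "world"
+        # Text of the first subpattern, the same string as $1.
+        my $first = substr $_, $-[1], $+[1] - $-[1];    # "wor"
+        print "whole match: $whole, first group: $first\n";
+    }
+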
=item $MULTILINE_MATCHING
=item $*
-Set to 1 to do multiline matching within a string, 0 to tell Perl
+Set to 1 to do multi-line matching within a string, 0 to tell Perl
that it can assume that strings contain a single line, for the purpose
of optimizing pattern matches. Pattern matches on strings containing
multiple newlines can produce confusing results when "C<$*>" is 0. Default
influences the interpretation of only "C<^>" and "C<$>". A literal newline can
be searched for even when C<$* == 0>.
-Use of "C<$*>" is deprecated in modern perls.
+Use of "C<$*>" is deprecated in modern Perls, supplanted by
+the C</s> and C</m> modifiers on pattern matching.
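+
+A small sketch of the modifiers in use (the sample text is invented
+for illustration):
+
+    my $text = "first\nsecond\n";
+    # /m lets ^ match at the start of every line, not just the string.
+    my @words = $text =~ /^(\w+)/mg;       # ("first", "second")
+    # /s lets . match a newline, so one match can span lines.
+    my ($span) = $text =~ /(f.*d)/s;       # "first\nsecond"
+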
=item input_line_number HANDLE EXPR
=item $.
The current input line number for the last file handle from
-which you read (or performed a C<seek> or C<tell> on). An
+which you read (or performed a C<seek> or C<tell> on). The value
+may be different from the actual physical line number in the file,
+depending on what notion of "line" is in effect--see L<$/> on how
+to affect that. An
explicit close on a filehandle resets the line number. Because
"C<E<lt>E<gt>>" never does an explicit close, line numbers increase
across ARGV files (but see examples under eof()). Localizing C<$.> has
=item $/
-The input record separator, newline by default. Works like B<awk>'s RS
+The input record separator, newline by default. This is used to
+influence Perl's idea of what a "line" is. Works like B<awk>'s RS
variable, including treating empty lines as delimiters if set to the
null string. (Note: An empty line cannot contain any spaces or tabs.)
-You may set it to a multicharacter string to match a multicharacter
+You may set it to a multi-character string to match a multi-character
delimiter, or to C<undef> to read to end of file. Note that setting it
to C<"\n\n"> means something slightly different than setting it to
C<"">, if the file contains consecutive empty lines. Setting it to
character belongs to the next paragraph, even if it's a newline.
(Mnemonic: / is used to delimit line boundaries when quoting poetry.)
- undef $/;
- $_ = <FH>; # whole file now here
+ undef $/; # enable "slurp" mode
+ $_ = <FH>; # whole file now here
s/\n[ \t]+/ /g;
Remember: the value of $/ is a string, not a regexp. AWK has to be
better for something :-)
+Setting $/ to a reference to an integer, scalar containing an integer, or
+scalar that's convertible to an integer will attempt to read records
+instead of lines, with the maximum record size being the referenced
+integer. So this:
+
+ $/ = \32768; # or \"32768", or \$var_containing_32768
+ open(FILE, $myfile);
+ $_ = <FILE>;
+
+will read a record of no more than 32768 bytes from FILE. If you're not
+reading from a record-oriented file (or your OS doesn't have
+record-oriented files), then you'll likely get a full chunk of data with
+every read. If a record is larger than the record size you've set, you'll
+get the record back in pieces.
+
+On VMS, record reads are done with the equivalent of C<sysread>, so it's
+best not to mix record and non-record reads on the same file. (This is
+likely not a problem, as any file you'd want to read in record mode is
+probably usable in line mode.) Non-VMS systems perform normal I/O, so
+it's safe to mix record and non-record reads of a file.
+
+Also see L<$.>.
+
=item autoflush HANDLE EXPR
=item $OUTPUT_AUTOFLUSH
The number of lines left on the page of the currently selected output
channel. (Mnemonic: lines_on_page - lines_printed.)
+=item @-
+
+$-[0] is the offset of the start of the last successful match.
+C<$-[>I<n>C<]> is the offset of the start of the substring matched by
+the I<n>-th subpattern, or undef if the subpattern did not match.
+
+Thus after a match against $_, $& coincides with C<substr $_, $-[0],
+$+[0] - $-[0]>. Similarly, C<$>I<n> coincides with C<substr $_, $-[>I<n>C<],
+$+[>I<n>C<] - $-[>I<n>C<]> if C<$-[>I<n>C<]> is defined, and $+ coincides with
+C<substr $_, $-[$#-], $+[$#-] - $-[$#-]>. One can use C<$#-> to find the last
+matched subgroup in the last successful match. Note the difference with
+C<$#+>, which is the number of subgroups in the regular expression. Compare
+with L<"@+">.
+
=item format_name HANDLE EXPR
=item $FORMAT_NAME
=item $?
The status returned by the last pipe close, backtick (C<``>) command,
-or system() operator. Note that this is the status word returned by
-the wait() system call (or else is made up to look like it). Thus,
-the exit value of the subprocess is actually (C<$? E<gt>E<gt> 8>), and
-C<$? & 255> gives which signal, if any, the process died from, and
-whether there was a core dump. (Mnemonic: similar to B<sh> and
-B<ksh>.)
+or system() operator. Note that this is the status word returned by the
+wait() system call (or else is made up to look like it). Thus, the exit
+value of the subprocess is actually (C<$? E<gt>E<gt> 8>), and C<$? & 127>
+gives which signal, if any, the process died from, and C<$? & 128> reports
+whether there was a core dump. (Mnemonic: similar to B<sh> and B<ksh>.)
+
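+For instance, the pieces of the status word can be pulled apart as
+follows. This is only a sketch: the child here is the running perl
+binary (C<$^X>) asked to exit with status 3, chosen because it needs no
+external program.
+
+    system($^X, "-e", "exit 3");
+    my $exit_value  = $? >> 8;     # 3 for this child
+    my $signal_num  = $? & 127;    # 0: the child was not killed by a signal
+    my $dumped_core = $? & 128;    # 0: no core dump
+    print "exit=$exit_value signal=$signal_num core=$dumped_core\n";
+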
+Additionally, if the C<h_errno> variable is supported in C, its value
+is returned via $? if any of the C<gethost*()> functions fail.
Note that if you have installed a signal handler for C<SIGCHLD>, the
value of C<$?> will usually be wrong outside that handler.
actual VMS exit status, instead of the default emulation of POSIX
status.
+Also see L<Error Indicators>.
+
=item $OS_ERROR
=item $ERRNO
If used in a numeric context, yields the current value of errno, with
all the usual caveats. (This means that you shouldn't depend on the
-value of "C<$!>" to be anything in particular unless you've gotten a
+value of C<$!> to be anything in particular unless you've gotten a
specific error return indicating a system error.) If used in a string
context, yields the corresponding system error string. You can assign
-to "C<$!>" to set I<errno> if, for instance, you want "C<$!>" to return the
+to C<$!> to set I<errno> if, for instance, you want C<"$!"> to return the
string for error I<n>, or you want to set the exit value for the die()
operator. (Mnemonic: What just went bang?)
+Also see L<Error Indicators>.
+
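+A sketch of the two contexts (the path is hypothetical and assumed not
+to exist):
+
+    unless (open(FH, "/no/such/dir/no-such-file")) {
+        my $errno   = $! + 0;    # numeric context: the errno value
+        my $message = "$!";      # string context: the system error string
+        print "open failed: $message (errno $errno)\n";
+    }
+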
=item $EXTENDED_OS_ERROR
=item $^E
-More specific information about the last system error than that provided by
-C<$!>, if available. (If not, it's just C<$!> again, except under OS/2.)
-At the moment, this differs from C<$!> under only VMS and OS/2, where it
-provides the VMS status value from the last system error, and OS/2 error
-code of the last call to OS/2 API which was not directed via CRT. The
-caveats mentioned in the description of C<$!> apply here, too.
-(Mnemonic: Extra error explanation.)
+Error information specific to the current operating system. At
+the moment, this differs from C<$!> under only VMS, OS/2, and Win32
+(and for MacPerl). On all other platforms, C<$^E> is always just
+the same as C<$!>.
+
+Under VMS, C<$^E> provides the VMS status value from the last
+system error. This is more specific information about the last
+system error than that provided by C<$!>. This is particularly
+important when C<$!> is set to B<EVMSERR>.
-Note that under OS/2 C<$!> and C<$^E> do not track each other, so if an
-OS/2-specific call is performed, you may need to check both.
+Under OS/2, C<$^E> is set to the error code of the last call to
+OS/2 API either via CRT, or directly from perl.
+
+Under Win32, C<$^E> always returns the last error information
+reported by the Win32 call C<GetLastError()> which describes
+the last error from within the Win32 API. Most Win32-specific
+code will report errors via C<$^E>. ANSI C and UNIX-like calls
+set C<errno> and so most portable Perl code will report errors
+via C<$!>.
+
+Caveats mentioned in the description of C<$!> generally apply to
+C<$^E>, also. (Mnemonic: Extra error explanation.)
+
+Also see L<Error Indicators>.
=item $EVAL_ERROR
however, set up a routine to process warnings by setting C<$SIG{__WARN__}>
as described below.
+Also see L<Error Indicators>.
+
=item $PROCESS_ID
=item $PID
$< = $>; # set real to effective uid
($<,$>) = ($>,$<); # swap real and effective uid
-(Mnemonic: it's the uid you went I<TO>, if you're running setuid.) Note:
-"C<$E<lt>>" and "C<$E<gt>>" can be swapped on only machines supporting setreuid().
+(Mnemonic: it's the uid you went I<TO>, if you're running setuid.)
+Note: "C<$E<lt>>" and "C<$E<gt>>" can be swapped only on machines
+supporting setreuid().
=item $REAL_GROUP_ID
membership in multiple groups simultaneously, gives a space separated
list of groups you are in. The first number is the one returned by
getgid(), and the subsequent ones by getgroups(), one of which may be
-the same as the first number. (Mnemonic: parentheses are used to I<GROUP>
-things. The real gid is the group you I<LEFT>, if you're running setgid.)
+the same as the first number.
+
+However, a value assigned to "C<$(>" must be a single number used to
+set the real gid. So the value given by "C<$(>" should I<not> be assigned
+back to "C<$(>" without being forced numeric, such as by adding zero.
+
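+One way to force the value numeric, shown as a sketch (actually
+assigning the result back may need privileges or platform support that
+this example does not assume):
+
+    my @gids     = split ' ', $(;    # every group listed in $(
+    my $real_gid = $gids[0] + 0;     # first number only: the real gid
+    # $( = $real_gid;                # safe form of assigning back
+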
+(Mnemonic: parentheses are used to I<GROUP> things. The real gid is the
+group you I<LEFT>, if you're running setgid.)
=item $EFFECTIVE_GROUP_ID
supports membership in multiple groups simultaneously, gives a space
separated list of groups you are in. The first number is the one
returned by getegid(), and the subsequent ones by getgroups(), one of
-which may be the same as the first number. (Mnemonic: parentheses are
-used to I<GROUP> things. The effective gid is the group that's I<RIGHT> for
-you, if you're running setgid.)
+which may be the same as the first number.
+
+Similarly, a value assigned to "C<$)>" must also be a space-separated
+list of numbers. The first number is used to set the effective gid, and
+the rest (if any) are passed to setgroups(). To get the effect of an
+empty list for setgroups(), just repeat the new effective gid; that is,
+to force an effective gid of 5 and an effectively empty setgroups()
+list, say C< $) = "5 5" >.
+
+(Mnemonic: parentheses are used to I<GROUP> things. The effective gid
+is the group that's I<RIGHT> for you, if you're running setgid.)
Note: "C<$E<lt>>", "C<$E<gt>>", "C<$(>" and "C<$)>" can be set only on
machines that support the corresponding I<set[re][ug]id()> routine. "C<$(>"
-and "C<$)>" can be swapped on only machines supporting setregid(). Because
-Perl doesn't currently use initgroups(), you can't set your group vector to
-multiple groups.
+and "C<$)>" can be swapped only on machines supporting setregid().
=item $PROGRAM_NAME
See also the documentation of C<use VERSION> and C<require VERSION>
for a convenient way to fail if the Perl interpreter is too old.
+=item $COMPILING
+
+=item $^C
+
+The current value of the flag associated with the B<-c> switch. Mainly
+of use with B<-MO=...> to allow code to alter its behaviour when being compiled.
+(For example, to AUTOLOAD at compile time rather than use normal,
+deferred loading.) Setting C<$^C = 1> is similar to calling C<B::minus_c>.
+
=item $DEBUGGING
=item $^D
preserved even if the open() fails. (Ordinary file descriptors are
closed before the open() is attempted.) Note that the close-on-exec
status of a file descriptor will be decided according to the value of
-C<$^F> at the time of the open, not the time of the exec.
+C<$^F> when the open() or pipe() was called, not the time of the exec().
=item $^H
-The current set of syntax checks enabled by C<use strict>. See the
-documentation of C<strict> for more details.
+The current set of syntax checks enabled by C<use strict> and other block
+scoped compiler hints. See the documentation of C<strict> for more details.
=item $INPLACE_EDIT
The current value of the inplace-edit extension. Use C<undef> to disable
inplace editing. (Mnemonic: value of B<-i> switch.)
+=item $^M
+
+By default, running out of memory is not trappable. However, if
+compiled for this, Perl may use the contents of C<$^M> as an emergency
+pool after die()ing with this message. Suppose that your Perl were
+compiled with -DPERL_EMERGENCY_SBRK and used Perl's malloc. Then
+
+ $^M = 'a' x (1<<16);
+
+would allocate a 64K buffer for use when in emergency. See the F<INSTALL>
+file for information on how to enable this option. As a disincentive to
+casual use of this advanced feature, there is no L<English> long name for
+this variable.
+
=item $OSNAME
=item $^O
=item $^P
-The internal flag that the debugger clears so that it doesn't debug
-itself. You could conceivably disable debugging yourself by clearing
-it.
+The internal variable for debugging support. Different bits mean the
+following (subject to change):
+
+=over 6
+
+=item 0x01
+
+Debug subroutine enter/exit.
+
+=item 0x02
+
+Line-by-line debugging.
+
+=item 0x04
+
+Switch off optimizations.
+
+=item 0x08
+
+Preserve more data for future interactive inspections.
+
+=item 0x10
+
+Keep info about source lines on which a subroutine is defined.
+
+=item 0x20
+
+Start with single-step on.
+
+=back
+
+Note that some bits may be relevant at compile-time only, some at
+run-time only. This is a new mechanism and the details may change.
+
+=item $^R
+
+The result of evaluation of the last successful L<perlre/C<(?{ code })>>
+regular expression assertion. (Excluding those used as switches.) May
+be written to.
+
+=item $^S
+
+Current state of the interpreter. Undefined if parsing of the current
+module/eval is not finished (may happen in $SIG{__DIE__} and
+$SIG{__WARN__} handlers). True if inside an eval, otherwise false.
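+
+A sketch of how a C<$SIG{__DIE__}> handler might branch on C<$^S> (the
+C<@seen> array is purely illustrative):
+
+    my @seen;
+    local $SIG{__DIE__} = sub {
+        if    (!defined $^S) { push @seen, "parsing" }   # still compiling
+        elsif ($^S)          { push @seen, "eval" }      # an eval will trap it
+        else                 { push @seen, "fatal" }     # about to terminate
+    };
+    eval { die "trapped\n" };
+    print "handler saw: @seen\n";    # the die above happened inside eval
+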
=item $BASETIME
use lib '/mypath/libdir/';
use SomeMod;
+=item @_
+
+Within a subroutine the array @_ contains the parameters passed to that
+subroutine. See L<perlsub>.
+
=item %INC
The hash %INC contains entries for each filename that has
The C<require> command uses this array to determine whether a given file
has already been included.
+=item %ENV
+
=item $ENV{expr}
The hash %ENV contains your current environment. Setting a
value in C<ENV> changes the environment for child processes.
+=item %SIG
+
=item $SIG{expr}
The hash %SIG is used to set signal handlers for various
signals. Example:
sub handler { # 1st argument is signal name
- local($sig) = @_;
+ my($sig) = @_;
print "Caught a SIG$sig--shutting down\n";
close(LOG);
exit(0);
}
- $SIG{'INT'} = 'handler';
- $SIG{'QUIT'} = 'handler';
+ $SIG{'INT'} = \&handler;
+ $SIG{'QUIT'} = \&handler;
...
$SIG{'INT'} = 'DEFAULT'; # restore default action
$SIG{'QUIT'} = 'IGNORE'; # ignore SIGQUIT
+Using a value of C<'IGNORE'> usually has the effect of ignoring the
+signal, except for the C<CHLD> signal. See L<perlipc> for more about
+this special case.
+
The %SIG array contains values for only the signals actually set within
the Perl script. Here are some other examples:
- $SIG{PIPE} = Plumber; # SCARY!!
- $SIG{"PIPE"} = "Plumber"; # just fine, assumes main::Plumber
+ $SIG{"PIPE"} = Plumber; # SCARY!!
+ $SIG{"PIPE"} = "Plumber"; # assumes main::Plumber (not recommended)
$SIG{"PIPE"} = \&Plumber; # just fine; assume current Plumber
$SIG{"PIPE"} = Plumber(); # oops, what did Plumber() return??
processing continues as it would have in the absence of the hook,
unless the hook routine itself exits via a C<goto>, a loop exit, or a die().
The C<__DIE__> handler is explicitly disabled during the call, so that you
-can die from a C<__DIE__> handler. Similarly for C<__WARN__>. See
-L<perlfunc/die>, L<perlfunc/warn> and L<perlfunc/eval>.
+can die from a C<__DIE__> handler. Similarly for C<__WARN__>.
-=item $^M
+Note that the C<$SIG{__DIE__}> hook is called even inside eval()ed
+blocks/strings. See L<perlfunc/die> and L<perlvar/$^S> for how to
+circumvent this.
-By default, running out of memory it is not trappable. However, if
-compiled for this, Perl may use the contents of C<$^M> as an emergency
-pool after die()ing with this message. Suppose that your Perl were
-compiled with -DEMERGENCY_SBRK and used Perl's malloc. Then
+Note that C<__DIE__>/C<__WARN__> handlers are very special in one
+respect: they may be called to report (probable) errors found by the
+parser. In such a case the parser may be in an inconsistent state, so
+any attempt to evaluate Perl code from such a handler will probably
+result in a segfault. This means that calls which result/may-result
+in parsing Perl should be used with extreme caution, like this:
- $^M = 'a' x (1<<16);
+ require Carp if defined $^S;
+ Carp::confess("Something wrong") if defined &Carp::confess;
+ die "Something wrong, but could not load Carp to give backtrace...
+ To see backtrace try starting Perl with -MCarp switch";
-would allocate a 64K buffer for use when in emergency. See the F<INSTALL>
-file for information on how to enable this option. As a disincentive to
-casual use of this advanced feature, there is no L<English> long name for
-this variable.
+Here the first line will load Carp I<unless> it is the parser who
+called the handler. The second line will print a backtrace and die if
+Carp was available. The third line will be executed only if Carp was
+not available.
+
+See L<perlfunc/die>, L<perlfunc/warn> and L<perlfunc/eval> for
+additional info.
=back
+
+=head2 Error Indicators
+
+The variables L<$@>, L<$!>, L<$^E>, and L<$?> contain information about
+different types of error conditions that may appear during execution of
+a Perl script. The variables are shown ordered by the "distance" between
+the subsystem which reported the error and the Perl process, and
+correspond to errors detected by the Perl interpreter, C library,
+operating system, or an external program, respectively.
+
+To illustrate the differences between these variables, consider the
+following Perl expression:
+
+ eval '
+ open PIPE, "/cdrom/install |";
+ @res = <PIPE>;
+ close PIPE or die "bad pipe: $?, $!";
+ ';
+
+After execution of this statement all 4 variables may have been set.
+
+$@ is set if the string to be C<eval>-ed did not compile (this may happen if
+C<open> or C<close> were imported with bad prototypes), or if Perl
+code executed during evaluation die()d (either implicitly, say,
+if C<open> was imported from module L<Fatal>, or the C<die> after
+C<close> was triggered). In these cases the value of $@ is the compile
+error, or C<Fatal> error (which will interpolate C<$!>!), or the argument
+to C<die> (which will interpolate C<$!> and C<$?>!).
+
+When the above expression is executed, open(), C<E<lt>PIPEE<gt>>, and C<close>
+are translated to C run-time library calls. $! is set if one of these
+calls fails. The value is a symbolic indicator chosen by the C run-time
+library, say C<No such file or directory>.
+
+On some systems the above C library calls are further translated
+to calls to the kernel. The kernel may have set a more verbose error
+indicator than one of the handful of standard C errors. In such cases $^E
+contains this verbose error indicator, which may be, say, C<CDROM tray not
+closed>. On systems where C library calls are identical to system calls
+$^E is a duplicate of $!.
+
+Finally, $? may be set to a non-C<0> value if the external program
+C</cdrom/install> fails. The upper bits of the particular value may reflect
+specific error conditions encountered by this program (this is
+program-dependent), while the lower bits reflect the mode of failure
+(segfault, completion, etc.). Note that in contrast to $@, $!, and $^E,
+which are set only if an error condition is detected, the variable $? is
+set on each C<wait> or pipe C<close>, overwriting the old value.
+
+For more details, see the individual descriptions at L<$@>, L<$!>, L<$^E>,
+and L<$?>.