arranged by category. Some functions appear in more
than one place.
-=over
+=over 4
=item Functions for SCALARs or strings
-O File is owned by real uid.
-e File exists.
- -z File has zero size.
- -s File has nonzero size (returns size).
+ -z File has zero size (is empty).
+ -s File has nonzero size (returns size in bytes).
-f File is a plain file.
-d File is a directory.
Example:
while (<>) {
- chop;
+ chomp;
next unless -f $_; # ignore specials
#...
}
# ...
}
+If VARIABLE is a hash, it chomps the hash's values, but not its keys.
+
You can actually chomp anything that's an lvalue, including an assignment:
chomp($cwd = `pwd`);
=item chop
Chops off the last character of a string and returns the character
-chopped. It's used primarily to remove the newline from the end of an
-input record, but is much more efficient than C<s/\n//> because it neither
+chopped. It is much more efficient than C<s/.$//s> because it neither
scans nor copies the string. If VARIABLE is omitted, chops C<$_>.
-Example:
-
- while (<>) {
- chop; # avoid \n on last field
- @array = split(/:/);
- #...
- }
+If VARIABLE is a hash, it chops the hash's values, but not its keys.
-You can actually chop anything that's an lvalue, including an assignment:
-
- chop($cwd = `pwd`);
- chop($answer = <STDIN>);
+You can actually chop anything that's an lvalue, including an assignment.
If you chop a list, each element is chopped. Only the value of the
last C<chop> is returned.
You may also use C<defined(&func)> to check whether subroutine C<&func>
has ever been defined. The return value is unaffected by any forward
-declarations of C<&foo>.
+declarations of C<&foo>. Note that a subroutine which is not defined
+may still be callable: its package may have an C<AUTOLOAD> method that
+makes it spring into existence the first time that it is called -- see
+L<perlsub>.
Use of C<defined> on aggregates (hashes and arrays) is deprecated. It
used to report whether memory for that aggregate has ever been
When called in list context, returns a 2-element list consisting of the
key and value for the next element of a hash, so that you can iterate over
-it. When called in scalar context, returns the key for only the "next"
+it. When called in scalar context, returns only the key for the next
element in the hash.
Entries are returned in an apparently random order. The actual random
C<keys>, and C<values> function calls in the program; it can be reset by
reading all the elements from the hash, or by evaluating C<keys HASH> or
C<values HASH>. If you add or delete elements of a hash while you're
-iterating over it, you may get entries skipped or duplicated, so don't.
+iterating over it, you may get entries skipped or duplicated, so
+don't. Exception: It is always safe to delete the item most recently
+returned by C<each()>, which means that the following code will work:
+
+ while (($key, $value) = each %hash) {
+ print $key, "\n";
+ delete $hash{$key}; # This is safe
+ }
The following prints out your environment like the printenv(1) program,
only in a different order:
Given an expression that specifies the name of a subroutine,
returns true if the specified subroutine has ever been declared, even
if it is undefined. Mentioning a subroutine name for exists or defined
-does not count as declaring it.
+does not count as declaring it. Note that a subroutine which does not
+exist may still be callable: its package may have an C<AUTOLOAD>
+method that makes it spring into existence the first time that it is
+called -- see L<perlsub>.
print "Exists\n" if exists &subroutine;
print "Defined\n" if defined &subroutine;
=item local EXPR
You really probably want to be using C<my> instead, because C<local> isn't
-what most people think of as "local". See L<perlsub/"Private Variables
-via my()"> for details.
+what most people think of as "local". See
+L<perlsub/"Private Variables via my()"> for details.
A local modifies the listed variables to be local to the enclosing
block, file, or eval. If more than one value is listed, the list must
most cases. See also L</grep> for an array composed of those items of
the original list for which the BLOCK or EXPR evaluates to true.
+C<{> starts both hash references and blocks, so C<map { ...> could be either
+the start of map BLOCK LIST or map EXPR, LIST. Because perl doesn't look
+ahead for the closing C<}> it has to take a guess at which its dealing with
+based what it finds just after the C<{>. Usually it gets it right, but if it
+doesn't it won't realize something is wrong until it gets to the C<}> and
+encounters the missing (or unexpected) comma. The syntax error will be
+reported close to the C<}> but you'll need to change something near the C<{>
+such as using a unary C<+> to give perl some help:
+
+ %hash = map { "\L$_", 1 } @array # perl guesses EXPR. wrong
+ %hash = map { +"\L$_", 1 } @array # perl guesses BLOCK. right
+ %hash = map { ("\L$_", 1) } @array # this also works
+ %hash = map { lc($_), 1 } @array # as does this.
+ %hash = map +( lc($_), 1 ), @array # this is EXPR and works!
+
+ %hash = map ( lc($_), 1 ), @array # evaluates to (1, @array)
+
+or to force an anon hash constructor use C<+{>
+
+ @hashes = map +{ lc($_), 1 }, @array # EXPR, so needs , at end
+
+and you get list of anonymous hashes each with only 1 entry.
+
=item mkdir FILENAME,MASK
=item mkdir FILENAME
kept private (mail files, for instance). The perlfunc(1) entry on
C<umask> discusses the choice of MASK in more detail.
+Note that according to the POSIX 1003.1-1996 the FILENAME may have any
+number of trailing slashes. Some operating and filesystems do not get
+this right, so Perl automatically removes all trailing slashes to keep
+everyone happy.
+
=item msgctl ID,CMD,ARG
Calls the System V IPC function msgctl(2). You'll probably have to say
$file =~ s#^(\s)#./$1#;
open(FOO, "< $file\0");
-(this may not work on some bizzare filesystems). One should
+(this may not work on some bizarre filesystems). One should
conscientiously choose between the I<magic> and 3-arguments form
of open():
sysopen(HANDLE, $path, O_RDWR|O_CREAT|O_EXCL)
or die "sysopen $path: $!";
$oldfh = select(HANDLE); $| = 1; select($oldfh);
- print HANDLE "stuff $$\n");
+ print HANDLE "stuff $$\n";
seek(HANDLE, 0, 0);
print "File contains: ", <HANDLE>;
0x12 0x34 0x56 0x78 # big-endian
0x78 0x56 0x34 0x12 # little-endian
-Basically, the Intel, Alpha, and VAX CPUs are little-endian, while
-everybody else, for example Motorola m68k/88k, PPC, Sparc, HP PA,
-Power, and Cray are big-endian. MIPS can be either: Digital used it
-in little-endian mode; SGI uses it in big-endian mode.
+Basically, the Intel and VAX CPUs are little-endian, while everybody
+else, for example Motorola m68k/88k, PPC, Sparc, HP PA, Power, and
+Cray are big-endian. Alpha and MIPS can be either: Digital/Compaq
+used/uses them in little-endian mode; SGI/Cray uses them in big-endian mode.
The names `big-endian' and `little-endian' are comic references to
the classic "Gulliver's Travels" (via the paper "On Holy Wars and a
=item read FILEHANDLE,SCALAR,LENGTH
Attempts to read LENGTH bytes of data into variable SCALAR from the
-specified FILEHANDLE. Returns the number of bytes actually read,
-C<0> at end of file, or undef if there was an error. SCALAR will be grown
-or shrunk to the length actually read. An OFFSET may be specified to
-place the read data at some other place than the beginning of the
-string. This call is actually implemented in terms of stdio's fread(3)
-call. To get a true read(2) system call, see C<sysread>.
+specified FILEHANDLE. Returns the number of bytes actually read, C<0>
+at end of file, or undef if there was an error. SCALAR will be grown
+or shrunk to the length actually read. If SCALAR needs growing, the
+new bytes will be zero bytes. An OFFSET may be specified to place
+the read data into some other place in SCALAR than the beginning.
+The call is actually implemented in terms of stdio's fread(3) call.
+To get a true read(2) system call, see C<sysread>.
=item readdir DIRHANDLE
it, or else see L</select> above. The Time::HiRes module from CPAN
may also help.
-See also the POSIX module's C<sigpause> function.
+See also the POSIX module's C<pause> function.
=item socket SOCKET,DOMAIN,TYPE,PROTOCOL
If you're using strict, you I<must not> declare $a
and $b as lexicals. They are package globals. That means
if you're in the C<main> package and type
-
+
@articles = sort {$b <=> $a} @files;
-
+
then C<$a> and C<$b> are C<$main::a> and C<$main::b> (or C<$::a> and C<$::b>),
but if you're in the C<FooPack> package, it's the same as typing
produces the output 'h:i:t:h:e:r:e'.
+Empty leading (or trailing) fields are produced when there positive width
+matches at the beginning (or end) of the string; a zero-width match at the
+beginning (or end) of the string does not produce an empty field. For
+example:
+
+ print join(':', split(/(?=\w)/, 'hi there!'));
+
+produces the output 'h:i :t:h:e:r:e!'.
+
The LIMIT parameter can be used to split a line partially
($login, $passwd, $remainder) = split(/:/, $_, 3);
open(PASSWD, '/etc/passwd');
while (<PASSWD>) {
- ($login, $passwd, $uid, $gid,
+ chomp;
+ ($login, $passwd, $uid, $gid,
$gcos, $home, $shell) = split(/:/);
#...
}
-(Note that $shell above will still have a newline on it. See L</chop>,
-L</chomp>, and L</join>.)
=item sprintf FORMAT, LIST
%O a synonym for %lo
%F a synonym for %f
-Conversions to scientific notation by C<%e>, C<%E>, C<%g> and C<%G>
-always have a two-digit exponent unless the modulus of the exponent is
-greater than 99.
+Note that the number of exponent digits in the scientific notation by
+C<%e>, C<%E>, C<%g> and C<%G> for numbers with the modulus of the
+exponent less than 100 is system-dependent: it may be three or less
+(zero-padded as necessary). In other words, 1.23 times ten to the
+99th may be either "1.23e99" or "1.23e099".
Perl permits the following universally-known flags between the C<%>
and the conversion letter:
h interpret integer as C type "short" or "unsigned short"
If no flags, interpret integer as C type "int" or "unsigned"
+Perl supports parameter ordering, in other words, fetching the
+parameters in some explicitly specified "random" ordering as opposed
+to the default implicit sequential ordering. The syntax is, instead
+of the C<%> and C<*>, to use C<%>I<digits>C<$> and C<*>I<digits>C<$>,
+where the I<digits> is the wanted index, from one upwards. For example:
+
+ printf "%2\$d %1\$d\n", 12, 34; # will print "34 12\n"
+ printf "%*2\$d\n", 12, 3; # will print " 12\n"
+
+Note that using the reordering syntax does not interfere with the usual
+implicit sequential fetching of the parameters:
+
+ printf "%2\$d %d\n", 12, 34; # will print "34 12\n"
+ printf "%2\$d %d %d\n", 12, 34; # will print "34 12 34\n"
+ printf "%3\$d %d %d\n", 12, 34, 56; # will print "56 12 34\n"
+ printf "%2\$*3\$d %d\n", 12, 34, 3; # will print " 34 12\n"
+ printf "%*3\$2\$d %d\n", 12, 34, 3; # will print " 34 12\n"
+
There are also two Perl-specific flags:
- V interpret integer as Perl's standard integer type
- v interpret string as a vector of integers, output as
- numbers separated either by dots, or by an arbitrary
- string received from the argument list when the flag
- is preceded by C<*>
+ V interpret integer as Perl's standard integer type
+ v interpret string as a vector of integers, output as
+ numbers separated either by dots, or by an arbitrary
+ string received from the argument list when the flag
+ is preceded by C<*>
Where a number would appear in the flags, an asterisk (C<*>) may be
used instead, in which case Perl uses the next item in the parameter
=item tell
-Returns the current position for FILEHANDLE. FILEHANDLE may be an
-expression whose value gives the name of the actual filehandle. If
-FILEHANDLE is omitted, assumes the file last read.
+Returns the current position for FILEHANDLE, or -1 on error. FILEHANDLE
+may be an expression whose value gives the name of the actual filehandle.
+If FILEHANDLE is omitted, assumes the file last read.
+
+The return value of tell() for the standard streams like the STDIN
+depends on the operating system: it may return -1 or something else.
+tell() on pipes, fifos, and sockets usually returns -1.
There is no C<systell> function. Use C<sysseek(FH, 0, 1)> for that.
FIRSTKEY this
NEXTKEY this, lastkey
DESTROY this
+ UNTIE this
A class implementing an ordinary array should have the following methods:
SPLICE this, offset, length, LIST
EXTEND this, count
DESTROY this
+ UNTIE this
A class implementing a file handle should have the following methods:
WRITE this, scalar, length, offset
PRINT this, LIST
PRINTF this, format, LIST
+ BINMODE this
+ EOF this
+ FILENO this
+ SEEK this, position, whence
+ TELL this
+ OPEN this, mode, LIST
CLOSE this
DESTROY this
+ UNTIE this
A class implementing a scalar should have the following methods:
FETCH this,
STORE this, value
DESTROY this
+ UNTIE this
Not all methods indicated above need be implemented. See L<perltie>,
L<Tie::Hash>, L<Tie::Array>, L<Tie::Scalar>, and L<Tie::Handle>.
If no C<unimport> method can be found the call fails with a fatal error.
-See L<perlmod> for a list of standard modules and pragmas. See L<perlrun>
+See L<perlmodlib> for a list of standard modules and pragmas. See L<perlrun>
for the C<-M> and C<-m> command-line options to perl that give C<use>
functionality from the command-line.
If BITS is 16 or more, bytes of the input string are grouped into chunks
of size BITS/8, and each group is converted to a number as with
-pack()/unpack() with big-endian formats C<n>/C<N> (and analoguously
+pack()/unpack() with big-endian formats C<n>/C<N> (and analogously
for BITS==64). See L<"pack"> for details.
If bits is 4 or less, the string is broken into bytes, then the bits
vec($image, $max_x * $x + $y, 8) = 3;
-If the selected element is off the end of the string, the value 0 is
-returned. If an element off the end of the string is written to,
-Perl will first extend the string with sufficiently many zero bytes.
+If the selected element is outside the string, the value 0 is returned.
+If an element off the end of the string is written to, Perl will first
+extend the string with sufficiently many zero bytes. It is an error
+to try to write off the beginning of the string (i.e. negative OFFSET).
+
+The string should not contain any character with the value > 255 (which
+can only happen if you're using UTF8 encoding). If it does, it will be
+treated as something which is not UTF8 encoded. When the C<vec> was
+assigned to, other parts of your program will also no longer consider the
+string to be UTF8 encoded. In other words, if you do have such characters
+in your string, vec() will operate on the actual byte string, and not the
+conceptual character string.
Strings created with C<vec> can also be manipulated with the logical
operators C<|>, C<&>, C<^>, and C<~>. These operators will assume a bit