. This may be
+subject to change.
+
=item pack TEMPLATE,LIST
Takes a LIST of values and converts it into a string using the rules
@@ -3161,34 +3272,14 @@ of values, as follows:
h A hex string (low nybble first).
H A hex string (high nybble first).
- c A signed char value.
+ c A signed char (8-bit) value.
C An unsigned char value. Only does bytes. See U for Unicode.
- s A signed short value.
+ s A signed short (16-bit) value.
S An unsigned short value.
- (This 'short' is _exactly_ 16 bits, which may differ from
- what a local C compiler calls 'short'. If you want
- native-length shorts, use the '!' suffix.)
-
- i A signed integer value.
- I An unsigned integer value.
- (This 'integer' is _at_least_ 32 bits wide. Its exact
- size depends on what a local C compiler calls 'int',
- and may even be larger than the 'long' described in
- the next item.)
- l A signed long value.
+ l A signed long (32-bit) value.
L An unsigned long value.
- (This 'long' is _exactly_ 32 bits, which may differ from
- what a local C compiler calls 'long'. If you want
- native-length longs, use the '!' suffix.)
-
- n An unsigned short in "network" (big-endian) order.
- N An unsigned long in "network" (big-endian) order.
- v An unsigned short in "VAX" (little-endian) order.
- V An unsigned long in "VAX" (little-endian) order.
- (These 'shorts' and 'longs' are _exactly_ 16 bits and
- _exactly_ 32 bits, respectively.)
q A signed quad (64-bit) value.
Q An unsigned quad value.
@@ -3196,14 +3287,23 @@ of values, as follows:
integer values _and_ if Perl has been compiled to support those.
Causes a fatal error otherwise.)
- j A signed integer value (a Perl internal integer, IV).
- J An unsigned integer value (a Perl internal unsigned integer, UV).
+ i A signed integer value.
+ I An unsigned integer value.
+ (This 'integer' is _at_least_ 32 bits wide. Its exact
+ size depends on what a local C compiler calls 'int'.)
+
+ n An unsigned short (16-bit) in "network" (big-endian) order.
+ N An unsigned long (32-bit) in "network" (big-endian) order.
+ v An unsigned short (16-bit) in "VAX" (little-endian) order.
+ V An unsigned long (32-bit) in "VAX" (little-endian) order.
+
+ j A Perl internal signed integer value (IV).
+ J A Perl internal unsigned integer value (UV).
f A single-precision float in the native format.
d A double-precision float in the native format.
- F A floating point value in the native native format
- (a Perl internal floating point value, NV).
+ F A Perl internal floating point value (NV) in the native format.
D A long double-precision float in the native format.
(Long doubles are available only if your system supports long
double values _and_ if Perl has been compiled to support those.
@@ -3223,9 +3323,27 @@ of values, as follows:
x A null byte.
X Back up a byte.
- @ Null fill to absolute position.
+ @ Null fill to absolute position, counted from the start of
+ the innermost ()-group.
( Start of a ()-group.
+Some letters in the TEMPLATE may optionally be followed by one or
+more of these modifiers (the second column lists the letters for
+which the modifier is valid):
+
+ ! sSlLiI Forces native (short, long, int) sizes instead
+ of fixed (16-/32-bit) sizes.
+
+ xX Make x and X act as alignment commands.
+
+ nNvV Treat integers as signed instead of unsigned.
+
+ > sSiIlLqQ Force big-endian byte-order on the type.
+ jJfFdDpP (The "big end" touches the construct.)
+
+ < sSiIlLqQ Force little-endian byte-order on the type.
+ jJfFdDpP (The "little end" touches the construct.)
+
The following rules apply:
=over 8
@@ -3330,6 +3448,11 @@ The C type packs a pointer to a structure of the size indicated by the
length. A NULL pointer is created if the corresponding value for C<p> or
C<P> is C<undef>, similarly for unpack().
+If your system has a strange pointer size (i.e. a pointer is neither as
+big as an int nor as big as a long), it may not be possible to pack or
+unpack pointers in big- or little-endian byte order. Attempting to do
+so will result in a fatal error.
+
=item *
The C</> template character allows packing and unpacking of strings where
@@ -3341,9 +3464,11 @@ how the length value is packed. The ones likely to be of most use are
integer-packing ones like C<n> (for Java strings), C<w> (for ASN.1 or
SNMP) and C<N> (for Sun XDR).
-The I<string-item> must, at present, be C<"A*">, C<"a*"> or C<"Z*">.
-For C<unpack> the length of the string is obtained from the I<length-item>,
-but if you put in the '*' it will be ignored.
+For C<pack>, the I<string-item> must, at present, be C<"A*">, C<"a*"> or
+C<"Z*">. For C<unpack> the length of the string is obtained from the
+I<length-item>, but if you put in the '*' it will be ignored. For all other
+codes, C<unpack> applies the length value to the next item, which must not
+have a repeat count.
unpack 'C/a', "\04Gurusamy"; gives 'Guru'
unpack 'a3/A* A*', '007 Bond J '; gives (' Bond','J')
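As a minimal sketch of the length-prefix template described above, the following round-trips a string through C<n/a*> (a 16-bit big-endian count followed by the string). The variable names are illustrative only and do not come from the patch.

```perl
use strict;
use warnings;

# Pack a string preceded by its length as a 16-bit big-endian count.
my $packed = pack('n/a*', 'hello');

# unpack reads the count first, then that many bytes of string.
my ($string) = unpack('n/a*', $packed);

# A single-byte count in front of a longer buffer: only 4 bytes are taken.
my ($guru) = unpack('C/a', "\04Gurusamy");

print "$string $guru\n";
```

The '*' after the string item is what lets pack() take the whole string; on unpack() the count comes from the length item instead.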
@@ -3359,7 +3484,7 @@ which Perl does not regard as legal in numeric strings.
=item *
The integer types C<s>, C<S>, C<l>, and C<L> may be
-immediately followed by a C<!> suffix to signify native shorts or
+followed by a C<!> modifier to signify native shorts or
longs--as you can see from above for example a bare C<l> does mean
exactly 32 bits, the native C<long> (as seen by the local C compiler)
may be larger. This is an issue mainly in 64-bit platforms. You can
@@ -3381,7 +3506,7 @@ L:
print $Config{longsize}, "\n";
print $Config{longlongsize}, "\n";
-(The C<$Config{longlongsize}> will be undefine if your system does
+(The C<$Config{longlongsize}> will be undefined if your system does
not support long longs.)
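The difference between fixed and native sizes can be checked directly. This is a small sketch, assuming only the standard Config module; the variable names are made up for the example.

```perl
use strict;
use warnings;
use Config;

# Without '!', 's' and 'l' are fixed at 16 and 32 bits.
my $fixed_short = length pack('s', 0);
my $fixed_long  = length pack('l', 0);

# With '!', the sizes follow what the local C compiler uses.
my $native_short = length pack('s!', 0);
my $native_long  = length pack('l!', 0);

print "fixed:  short=$fixed_short long=$fixed_long\n";
print "native: short=$native_short long=$native_long\n";
print "Config: short=$Config{shortsize} long=$Config{longsize}\n";
```

On many 64-bit Unix-like platforms the native long is likely to be reported as 8 bytes, while the fixed C<l> stays at 4.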
=item *
@@ -3425,12 +3550,39 @@ via L:
Byteorders C<'1234'> and C<'12345678'> are little-endian, C<'4321'>
and C<'87654321'> are big-endian.
-If you want portable packed integers use the formats C<n>, C<N>,
-C<v>, and C<V>, their byte endianness and size are known.
+If you want portable packed integers you can either use the formats
+C<n>, C<N>, C<v>, and C<V>, or you can use the C<< > >> and C<< < >>
+modifiers. These modifiers are only available as of perl 5.8.5.
See also L.
=item *
+All integer and floating point formats as well as C<p> and C<P> may
+be followed by the C<< > >> or C<< < >> modifiers to force big- or
+little-endian byte-order, respectively. This is especially useful,
+since C<n>, C<N>, C<v> and C<V> don't cover signed integers, 64-bit
+integers and floating point values. However, there are some things
+to keep in mind.
+
+Exchanging signed integers between different platforms only works
+if all platforms store them in the same format. Most platforms store
+signed integers in two's complement, so usually this is not an issue.
+
+The C<< > >> or C<< < >> modifiers can only be used on floating point
+formats on big- or little-endian machines. Otherwise, attempting to
+do so will result in a fatal error.
+
+Forcing big- or little-endian byte-order on floating point values for
+data exchange can only work if all platforms are using the same
+binary representation (e.g. IEEE floating point format). Even if all
+platforms are using IEEE, there may be subtle differences. Being able
+to use C<< > >> or C<< < >> on floating point values can be very useful,
+but also very dangerous if you don't know exactly what you're doing.
+It is definitely not a general way to portably store floating point
+values.
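A short sketch of the byte-order modifiers on integer formats, assuming a perl recent enough to support them. It also shows the signed case that the network-order formats do not cover. The variable names are illustrative.

```perl
use strict;
use warnings;

# Unsigned values: the modifier form matches the network-order formats.
my $network = pack('nN',   42, 4711);
my $forced  = pack('S>L>', 42, 4711);

# Signed 32-bit values in both byte orders (two's complement assumed).
my $big    = pack('l>', -2);
my $little = pack('l<', -2);

# Unpacking with the same modifier restores the signed value.
my ($value) = unpack('l>', $big);

printf "%s\n", join ' ', map { sprintf '%02x', ord } split //, $forced;
print "$value\n";
```

The signed results are only meaningful for exchange if every platform involved stores signed integers the same way, as noted above.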
+
+=item *
+
Real numbers (floats and doubles) are in the native machine format only;
due to the multiplicity of floating formats around, and the lack of a
standard "network" representation, no facility for interchange has been
@@ -3439,19 +3591,23 @@ may not be readable on another - even if both use IEEE floating point
arithmetic (as the endian-ness of the memory representation is not part
of the IEEE spec). See also L.
-Note that Perl uses doubles internally for all numeric calculation, and
-converting from double into float and thence back to double again will
-lose precision (i.e., C<unpack("f", pack("f", $foo))> will not in general
-equal $foo).
+If you know exactly what you're doing, you can use the C<< > >> or C<< < >>
+modifiers to force big- or little-endian byte-order on floating point values.
+
+Note that Perl uses doubles (or long doubles, if configured) internally for
+all numeric calculation, and converting from double into float and thence back
+to double again will lose precision (i.e., C<unpack("f", pack("f", $foo))>
+will not in general equal $foo).
=item *
-If the pattern begins with a C<U>, the resulting string will be treated
-as Unicode-encoded. You can force UTF8 encoding on in a string with an
-initial C<U0>, and the bytes that follow will be interpreted as Unicode
-characters. If you don't want this to happen, you can begin your pattern
-with C<C0> (or anything else) to force Perl not to UTF8 encode your
-string, and then follow this with a C<U*> somewhere in your pattern.
+If the pattern begins with a C<U>, the resulting string will be
+treated as UTF-8-encoded Unicode. You can force UTF-8 encoding on in a
+string with an initial C<U0>, and the bytes that follow will be
+interpreted as Unicode characters. If you don't want this to happen,
+you can begin your pattern with C<C0> (or anything else) to force Perl
+not to UTF-8 encode your string, and then follow this with a C<U*>
+somewhere in your pattern.
=item *
@@ -3464,8 +3620,14 @@ sequences of bytes.
=item *
A ()-group is a sub-TEMPLATE enclosed in parentheses. A group may
-take a repeat count, both as postfix, and via the C</> template
-character.
+take a repeat count, both as postfix, and for unpack() also via the C</>
+template character. Within each repetition of a group, positioning with
+C<@> starts again at 0. Therefore, the result of
+
+ pack( '@1A((@2A)@3A)', 'a', 'b', 'c' )
+
+is the string "\0a\0\0bc".
+
=item *
@@ -3481,7 +3643,17 @@ both result in no-ops.
=item *
+C<n>, C<N>, C<v> and C<V> accept the C<!> modifier. In this case they
+will represent signed 16-/32-bit integers in big-/little-endian order.
+This is only portable if all platforms sharing the packed data use the
+same binary representation for signed integers (e.g. all platforms are
+using two's complement representation).
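The effect of the C<!> modifier on the network-order and VAX-order formats can be sketched as follows, assuming a perl that supports this modifier and a two's complement platform. The variable names are illustrative.

```perl
use strict;
use warnings;

# Signed 16-bit value in big-endian and little-endian order.
my $big    = pack('n!', -2);
my $little = pack('v!', -2);

# The same two bytes read back as signed and as unsigned.
my ($signed)   = unpack('n!', $big);
my ($unsigned) = unpack('n',  $big);

print "signed=$signed unsigned=$unsigned\n";
```

Without the modifier the bytes are interpreted as an unsigned 16-bit value, which is why the two unpack() calls disagree.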
+
+=item *
+
A comment in a TEMPLATE starts with C<#> and goes to the end of line.
+White space may be used to separate pack codes from each other, but
+modifiers and a repeat count must follow immediately.
=item *
@@ -3541,6 +3713,13 @@ Examples:
# short 12, zero fill to position 4, long 34
# $foo eq $bar
+ $foo = pack('nN', 42, 4711);
+ # pack big-endian 16- and 32-bit unsigned integers
+ $foo = pack('S>L>', 42, 4711);
+ # exactly the same
+ $foo = pack('s.
Returns a random fractional number greater than or equal to C<0> and less
than the value of EXPR. (EXPR should be positive.) If EXPR is
-omitted, or a C<0>, the value C<1> is used. Automatically calls C<srand>
-unless C<srand> has already been called. See also C<srand>.
+omitted, the value C<1> is used. Currently EXPR with the value C<0> is
+also special-cased as C<1> - this has not been documented before perl 5.8.0
+and is subject to change in future versions of perl. Automatically calls
+C<srand> unless C<srand> has already been called. See also C<srand>.
Apply C<int()> to the value returned by C<rand()> if you want random
integers instead of random fractional numbers. For example,
@@ -3735,19 +3916,28 @@ with the wrong number of RANDBITS.)
Attempts to read LENGTH I<characters> of data into variable SCALAR
from the specified FILEHANDLE. Returns the number of characters
-actually read, C<0> at end of file, or undef if there was an error.
-SCALAR will be grown or shrunk to the length actually read. If SCALAR
-needs growing, the new bytes will be zero bytes. An OFFSET may be
-specified to place the read data into some other place in SCALAR than
-the beginning. The call is actually implemented in terms of either
-Perl's or system's fread() call. To get a true read(2) system call,
-see C<sysread>.
+actually read, C<0> at end of file, or undef if there was an error (in
+the latter case C<$!> is also set). SCALAR will be grown or shrunk
+so that the last character actually read is the last character of the
+scalar after the read.
+
+An OFFSET may be specified to place the read data at some place in the
+string other than the beginning. A negative OFFSET specifies
+placement at that many characters counting backwards from the end of
+the string. A positive OFFSET greater than the length of SCALAR
+results in the string being padded to the required size with C<"\0">
+bytes before the result of the read is appended.
+
+The call is actually implemented in terms of either Perl's or system's
+fread() call. To get a true read(2) system call, see C<sysread>.
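The OFFSET behaviour described above can be sketched with an in-memory filehandle. The data and variable names are made up for the example; the sketch assumes a perl that can open a filehandle on a scalar reference.

```perl
use strict;
use warnings;

my $data = 'abcdef';
open(my $fh, '<', \$data) or die "cannot open in-memory handle: $!";

my $buf = 'XY';

# OFFSET 5 is past the end of the 2-character buffer, so the gap is
# padded with "\0" before the 3 characters read are appended.
my $first = read($fh, $buf, 3, 5);
my $after_first = $buf;

# OFFSET -1 places the data one character before the end, so the last
# character of the buffer is overwritten.
my $second = read($fh, $buf, 2, -1);

close($fh);
printf "%d %d %d\n", $first, $second, length $buf;
```

In both calls the buffer ends with the last character actually read, matching the grow-or-shrink rule.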
Note the I<characters>: depending on the status of the filehandle,
either (8-bit) bytes or characters are read. By default all
filehandles operate on bytes, but for example if the filehandle has
-been opened with the C<:utf8> discipline (see L