> modifiers can also be used on C<()>-groups,
+in which case they force a certain byte-order on all components of
+that group, including subgroups.
+
The following rules apply:
=over 8
@@ -3365,6 +3483,11 @@ The C type packs a pointer to a structure of the size indicated by the
length. A NULL pointer is created if the corresponding value for C<p>
or C<P> is C<undef>, similarly for unpack().
+If your system has a strange pointer size (i.e. a pointer is neither as
+big as an int nor as big as a long), it may not be possible to pack or
+unpack pointers in big- or little-endian byte order. Attempting to do
+so will result in a fatal error.
+
=item *
The C</> template character allows packing and unpacking of strings where
@@ -3376,9 +3499,11 @@ how the length value is packed. The ones likely to be of most use are
integer-packing ones like C<n> (for Java strings), C<w> (for ASN.1 or
SNMP) and C<N> (for Sun XDR).
-The I<string-item> must, at present, be C<"A*">, C<"a*"> or C<"Z*">.
-For C<unpack> the length of the string is obtained from the I<length-item>,
-but if you put in the '*' it will be ignored.
+For C<pack>, the I<string-item> must, at present, be C<"A*">, C<"a*"> or
+C<"Z*">. For C<unpack> the length of the string is obtained from the
+I<length-item>, but if you put in the '*' it will be ignored. For all other
+codes, C<unpack> applies the length value to the next item, which must not
+have a repeat count.
unpack 'C/a', "\04Gurusamy"; gives 'Guru'
unpack 'a3/A* A*', '007 Bond  J '; gives (' Bond','J')
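As a rough illustration of the length-prefixed C</> templates described in this hunk, the following sketch unpacks and packs a few counted strings. The variable names are invented for the example; the templates themselves are the ones used in the surrounding documentation.

```perl
use strict;
use warnings;

# Unpack: a one-byte count (C) followed by that many characters (a).
my ($name) = unpack('C/a', "\04Gurusamy");

# Unpack: the count is itself a 3-character decimal string (a3), so the
# next 7 characters are taken as an A field (trailing spaces stripped).
my @fields = unpack('a3/A A*', '007 Bond  J ');

# Pack: 'n/a*' writes a 16-bit big-endian length before the string,
# 'w/a' writes the length as a BER compressed integer.
my $packed = pack('n/a* w/a', 'hello,', 'world');

print "$name\n";
print join('|', @fields), "\n";
```

Note that the second example depends on the two spaces after "Bond": the count of 7 covers C<' Bond  '>, leaving C<'J '> for the final C<A*>.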
@@ -3394,7 +3519,7 @@ which Perl does not regard as legal in numeric strings.
=item *
The integer types C<s>, C<S>, C<l>, and C<L> may be
-immediately followed by a C<!> suffix to signify native shorts or
+followed by a C<!> modifier to signify native shorts or
longs--as you can see from above for example a bare C<l> does mean
exactly 32 bits, the native C<long> (as seen by the local C compiler)
may be larger. This is an issue mainly in 64-bit platforms. You can
@@ -3416,7 +3541,7 @@ L:
print $Config{longsize}, "\n";
print $Config{longlongsize}, "\n";
-(The C<$Config{longlongsize}> will be undefine if your system does
+(The C<$Config{longlongsize}> will be undefined if your system does
not support long longs.)
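To make the difference between fixed-size and native-size integer codes concrete, here is a small sketch that compares the packed widths with the values reported by C<%Config>. It assumes nothing beyond the core C<Config> module; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Config;

# Width in bytes of the fixed-size and native variants on this build.
my $fixed_long   = length pack('l',  0);   # a bare "l" is exactly 32 bits
my $native_long  = length pack('l!', 0);   # sizeof(long) of the local C compiler
my $native_short = length pack('s!', 0);   # sizeof(short) of the local C compiler

printf "l=%d l!=%d (longsize=%d) s!=%d (shortsize=%d)\n",
    $fixed_long, $native_long, $Config{longsize},
    $native_short, $Config{shortsize};
```

On a typical 64-bit Unix build the native long is wider than the fixed 32-bit C<l>, which is exactly the portability concern the text raises.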
=item *
@@ -3460,12 +3585,45 @@ via L:
Byteorders C<'1234'> and C<'12345678'> are little-endian, C<'4321'>
and C<'87654321'> are big-endian.
-If you want portable packed integers use the formats C<n>, C<N>,
-C<v>, and C<V>, their byte endianness and size are known.
+If you want portable packed integers you can either use the formats
+C<n>, C<N>, C<v>, and C<V>, or you can use the C<E<gt>> and C<E<lt>>
+modifiers. These modifiers are only available as of perl 5.8.5.
See also L<perlport>.
=item *
+All integer and floating point formats as well as C<p> and C<P> and
+C<()>-groups may be followed by the C<E<gt>> or C<E<lt>> modifiers
+to force big- or little-endian byte-order, respectively.
+This is especially useful, since C<n>, C<N>, C<v> and C<V> don't cover
+signed integers, 64-bit integers and floating point values. However,
+there are some things to keep in mind.
+
+Exchanging signed integers between different platforms only works
+if all platforms store them in the same format. Most platforms store
+signed integers in two's complement, so usually this is not an issue.
+
+The C<E<gt>> or C<E<lt>> modifiers can only be used on floating point
+formats on big- or little-endian machines. Otherwise, attempting to
+do so will result in a fatal error.
+
+Forcing big- or little-endian byte-order on floating point values for
+data exchange can only work if all platforms are using the same
+binary representation (e.g. IEEE floating point format). Even if all
+platforms are using IEEE, there may be subtle differences. Being able
+to use C<E<gt>> or C<E<lt>> on floating point values can be very useful,
+but also very dangerous if you don't know exactly what you're doing.
+It is definitely not a general way to portably store floating point
+values.
+
+When using C<E<gt>> or C<E<lt>> on an C<()>-group, this will affect
+all types inside the group that accept the byte-order modifiers,
+including all subgroups. It will silently be ignored for all other
+types. You are not allowed to override the byte-order within a group
+that already has a byte-order modifier suffix.
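The byte-order modifiers described above can be sketched as follows, assuming a perl that already includes them. The example checks that the modifier forms agree with the classic network-order codes for unsigned values, and that a modifier on a group is equivalent to putting it on each member. All variable names are made up for the illustration.

```perl
use strict;
use warnings;

# Unsigned 16- and 32-bit integers, big-endian, written two ways.
my $with_modifiers = pack('S>L>', 42, 4711);
my $with_network   = pack('nN',   42, 4711);

# Signed values, little-endian, and the round trip back.
my $le_signed = pack('s<l<', -42, 4711);
my @back      = unpack('s<l<', $le_signed);

# A modifier on a ()-group applies to every component inside it.
my $grouped = pack('(sl)>', -42, 4711);
my $flat    = pack('s>l>',  -42, 4711);

print unpack('H*', $with_modifiers), "\n";
print "@back\n";
```

Signed round trips like this are only meaningful across machines that share the same signed integer representation, as the text cautions.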
+
+=item *
+
Real numbers (floats and doubles) are in the native machine format only;
due to the multiplicity of floating formats around, and the lack of a
standard "network" representation, no facility for interchange has been
@@ -3474,19 +3632,23 @@ may not be readable on another - even if both use IEEE floating point
arithmetic (as the endian-ness of the memory representation is not part
of the IEEE spec). See also L<perlport>.
-Note that Perl uses doubles internally for all numeric calculation, and
-converting from double into float and thence back to double again will
-lose precision (i.e., C<unpack("f", pack("f", $foo))> will not in general
-equal $foo).
+If you know exactly what you're doing, you can use the C<E<gt>> or C<E<lt>>
+modifiers to force big- or little-endian byte-order on floating point values.
+
+Note that Perl uses doubles (or long doubles, if configured) internally for
+all numeric calculation, and converting from double into float and thence back
+to double again will lose precision (i.e., C<unpack("f", pack("f", $foo))>
+will not in general equal $foo).
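The precision loss mentioned here is easy to observe. The sketch below assumes the platform's C<float> is narrower than Perl's internal numeric type (true for the usual 32-bit IEEE float); C<0.1> is chosen only because it has no exact binary representation.

```perl
use strict;
use warnings;

my $foo        = 0.1;
my $round_trip = unpack('f', pack('f', $foo));

# The value survives only to single precision.
printf "original:   %.17g\n", $foo;
printf "round trip: %.17g\n", $round_trip;
print  "equal? ", ($round_trip == $foo ? "yes" : "no"), "\n";
```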
=item *
-If the pattern begins with a C<U>, the resulting string will be treated
-as Unicode-encoded. You can force UTF8 encoding on in a string with an
-initial C<U0>, and the bytes that follow will be interpreted as Unicode
-characters. If you don't want this to happen, you can begin your pattern
-with C<C0> (or anything else) to force Perl not to UTF8 encode your
-string, and then follow this with a C<U*> somewhere in your pattern.
+If the pattern begins with a C<U>, the resulting string will be
+treated as UTF-8-encoded Unicode. You can force UTF-8 encoding on in a
+string with an initial C<U0>, and the bytes that follow will be
+interpreted as Unicode characters. If you don't want this to happen,
+you can begin your pattern with C<C0> (or anything else) to force Perl
+not to UTF-8 encode your string, and then follow this with a C<U*>
+somewhere in your pattern.
=item *
@@ -3499,8 +3661,14 @@ sequences of bytes.
=item *
A ()-group is a sub-TEMPLATE enclosed in parentheses. A group may
-take a repeat count, both as postfix, and via the C</> template
-character.
+take a repeat count, both as postfix, and for unpack() also via the C</>
+template character. Within each repetition of a group, positioning with
+C<@> starts again at 0. Therefore, the result of
+
+ pack( '@1A((@2A)@3A)', 'a', 'b', 'c' )
+
+is the string "\0a\0\0bc".
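The positioning rule can be checked directly. This sketch simply runs the template quoted above and dumps the result in hex so the NUL padding is visible; C<$packed> is an illustrative name.

```perl
use strict;
use warnings;

# @1 pads to offset 1, then 'a'. The outer group starts at offset 2;
# inside the inner group @2 is counted from the inner group's start,
# and @3 is counted from the outer group's start.
my $packed = pack('@1A((@2A)@3A)', 'a', 'b', 'c');

print unpack('H*', $packed), "\n";
```

Because the outer group begins at absolute offset 2 and C<b> ends at absolute offset 5, the C<@3> inside the outer group needs no further padding before C<c>.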
+
=item *
@@ -3516,7 +3684,17 @@ both result in no-ops.
=item *
+C<n>, C<N>, C<v> and C<V> accept the C<!> modifier. In this case they
+will represent signed 16-/32-bit integers in big-/little-endian order.
+This is only portable if all platforms sharing the packed data use the
+same binary representation for signed integers (e.g. all platforms are
+using two's complement representation).
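A short sketch of the signed variants, assuming a perl that supports the C<!> modifier on these codes and a two's complement machine. The byte strings are written out by hand so the byte order is visible; the variable names are hypothetical.

```perl
use strict;
use warnings;

# Signed 16-bit, big-endian ("network" order).
my $be_bytes    = pack('n!', -42);
my ($signed)    = unpack('n!', "\xFF\xD6");
my ($unsigned)  = unpack('n',  "\xFF\xD6");   # same bytes, read as unsigned

# Signed 16-bit, little-endian ("VAX" order).
my $le_bytes    = pack('v!', -42);

# Signed 32-bit, big-endian.
my ($signed32)  = unpack('N!', "\xFF\xFF\xFF\xFE");

print "$signed $unsigned $signed32\n";
```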
+
+=item *
+
A comment in a TEMPLATE starts with C<#> and goes to the end of line.
+White space may be used to separate pack codes from each other, but
+modifiers and a repeat count must follow immediately.
=item *
@@ -3576,6 +3754,15 @@ Examples:
# short 12, zero fill to position 4, long 34
# $foo eq $bar
+ $foo = pack('nN', 42, 4711);
+ # pack big-endian 16- and 32-bit unsigned integers
+ $foo = pack('S>L>', 42, 4711);
+ # exactly the same
+ $foo = pack('s of data into variable SCALAR
from the specified FILEHANDLE. Returns the number of characters
actually read, C<0> at end of file, or undef if there was an error (in
-the latter case C<$!> is also set). SCALAR will be grown or shrunk to
-the length actually read. If SCALAR needs growing, the new bytes will
-be zero bytes. An OFFSET may be specified to place the read data into
-some other place in SCALAR than the beginning. The call is actually
-implemented in terms of either Perl's or system's fread() call. To
-get a true read(2) system call, see C<sysread>.
+the latter case C<$!> is also set). SCALAR will be grown or shrunk
+so that the last character actually read is the last character of the
+scalar after the read.
+
+An OFFSET may be specified to place the read data at some place in the
+string other than the beginning. A negative OFFSET specifies
+placement at that many characters counting backwards from the end of
+the string. A positive OFFSET greater than the length of SCALAR
+results in the string being padded to the required size with C<"\0">
+bytes before the result of the read is appended.
+
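The OFFSET behaviour described in this hunk can be sketched with an in-memory filehandle, which keeps the example self-contained. The data and variable names are invented for the illustration, and the handle is read in the default byte mode.

```perl
use strict;
use warnings;

my $data = "abcdef";
open(my $fh, '<', \$data) or die "cannot open in-memory handle: $!";

# Positive OFFSET past the end of the buffer: the gap is padded with "\0"
# before the data that was read is appended.
my $padded = "xy";
my $got1 = read($fh, $padded, 3, 5);    # reads "abc"

# Negative OFFSET: placement is counted backwards from the end of the
# buffer, and the buffer then ends with the last character read.
my $tail = "hello";
my $got2 = read($fh, $tail, 2, -2);     # reads "de"

close($fh);

print unpack('H*', $padded), "\n";
print "$tail\n";
```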
+The call is actually implemented in terms of either Perl's or system's
+fread() call. To get a true read(2) system call, see C<sysread>.
Note the I<characters>: depending on the status of the filehandle,
either (8-bit) bytes or characters are read. By default all
filehandles operate on bytes, but for example if the filehandle has
been opened with the C<:utf8> I/O layer (see L