From: Jarkko Hietaniemi
Date: Tue, 27 May 2003 06:30:54 +0000 (+0000)
Subject: For now reword the sysread/syswrite description to
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=3874323d73ef08e6639270ea07e834aec3e379f1;p=p5sagit%2Fp5-mst-13.2.git

For now reword the sysread/syswrite description to
stress the fact that by default everything is still bytes.

p4raw-id: //depot/perl@19626
---

diff --git a/pod/perlfunc.pod b/pod/perlfunc.pod
index 2a3533b..7af8918 100644
--- a/pod/perlfunc.pod
+++ b/pod/perlfunc.pod
@@ -498,7 +498,9 @@ pending buffered output data (and perhaps pending input data) on the
 handle. An exception to this is the C<:encoding> layer that
 changes the default character encoding of the handle, see L.
 The C<:encoding> layer sometimes needs to be called in
-mid-stream, and it doesn't flush the stream.
+mid-stream, and it doesn't flush the stream. The C<:encoding>
+also implicitly pushes on top of itself the C<:utf8> layer because
+internally Perl will operate on UTF-8 encoded Unicode characters.
 
 The operating system, device drivers, C libraries, and Perl run-time
 system all work together to let the programmer treat a single
@@ -5582,21 +5584,15 @@ See L for a kinder, gentler explanation of opening files.
 
 =item sysread FILEHANDLE,SCALAR,LENGTH
 
-Attempts to read LENGTH I of data into variable SCALAR
-from the specified FILEHANDLE, using the system call read(2). It
-bypasses buffered IO, so mixing this with other kinds of reads,
-C, C, C, C, or C can cause confusion
-because stdio usually buffers data. Returns the number of characters
-actually read, C<0> at end of file, or undef if there was an error (in
-the latter case C<$!> is also set). SCALAR will be grown or shrunk so
-that the last byte actually read is the last byte of the scalar after
-the read.
-
-Note the I: depending on the status of the filehandle,
-either (8-bit) bytes or characters are read. By default all
-filehandles operate on bytes, but for example if the filehandle has
-been opened with the C<:utf8> I/O layer (see L, and the C
-pragma, L), the I/O will operate on characters, not bytes.
+Attempts to read LENGTH bytes of data into variable SCALAR from the
+specified FILEHANDLE, using the system call read(2). It bypasses
+buffered IO, so mixing this with other kinds of reads, C,
+C, C, C, or C can cause confusion because the
+perlio or stdio layers usually buffers data. Returns the number of
+bytes actually read, C<0> at end of file, or undef if there was an
+error (in the latter case C<$!> is also set). SCALAR will be grown or
+shrunk so that the last byte actually read is the last byte of the
+scalar after the read.
 
 An OFFSET may be specified to place the read data at some place in the
 string other than the beginning. A negative OFFSET specifies
@@ -5609,9 +5605,15 @@ There is no syseof() function, which is ok, since eof() doesn't work
 very well on device files (like ttys) anyway. Use sysread() and check
 for a return value for 0 to decide whether you're done.
 
+Note that if the filehandle has been marked as C<:utf8> Unicode
+characters are read instead of bytes (the LENGTH, OFFSET, and the
+return value of sysread() are in Unicode characters).
+The C<:encoding(...)> layer implicitly introduces the C<:utf8> layer.
+See L, L, and the C pragma, L.
+
 =item sysseek FILEHANDLE,POSITION,WHENCE
 
-Sets FILEHANDLE's system position I using the system call
+Sets FILEHANDLE's system position in bytes using the system call
 lseek(2). FILEHANDLE may be an expression whose value gives the name
 of the filehandle. The values for WHENCE are C<0> to set the new
 position to POSITION, C<1> to set the it to the current position plus
@@ -5623,7 +5625,7 @@ on characters (for example by using the C<:utf8> I/O layer), tell()
 will return byte offsets, not character offsets (because implementing
 that would render sysseek() very slow).
 
-sysseek() bypasses normal buffered io, so mixing this with reads (other
+sysseek() bypasses normal buffered IO, so mixing this with reads (other
 than C, for example >< or read()) C, C,
 C, C, or C may cause confusion.
 
@@ -5702,27 +5704,27 @@ See L and L for details.
 
 =item syswrite FILEHANDLE,SCALAR
 
-Attempts to write LENGTH characters of data from variable SCALAR to
-the specified FILEHANDLE, using the system call write(2). If LENGTH
-is not specified, writes whole SCALAR. It bypasses buffered IO, so
+Attempts to write LENGTH bytes of data from variable SCALAR to the
+specified FILEHANDLE, using the system call write(2). If LENGTH is
+not specified, writes whole SCALAR. It bypasses buffered IO, so
 mixing this with reads (other than C, C, C,
-C, C, or C may cause confusion because stdio usually
-buffers data. Returns the number of characters actually written, or
-C if there was an error (in this case the errno variable C<$!>
-is also set). If the LENGTH is greater than the available data in the
-SCALAR after the OFFSET, only as much data as is available will be
-written.
+C, C, or C may cause confusion because the perlio and
+stdio layers usually buffers data. Returns the number of bytes
+actually written, or C if there was an error (in this case the
+errno variable C<$!> is also set). If the LENGTH is greater than the
+available data in the SCALAR after the OFFSET, only as much data as is
+available will be written.
 
 An OFFSET may be specified to write the data from some part of the
 string other than the beginning. A negative OFFSET specifies writing
 that many characters counting backwards from the end of the string.
 In the case the SCALAR is empty you can use OFFSET but only zero offset.
 
-Note the I: depending on the status of the filehandle,
-either (8-bit) bytes or characters are written. By default all
-filehandles operate on bytes, but for example if the filehandle has
-been opened with the C<:utf8> I/O layer (see L, and the open
-pragma, L), the I/O will operate on characters, not bytes.
+Note that if the filehandle has been marked as C<:utf8>,
+Unicode characters are written instead of bytes (the LENGTH, OFFSET,
+and the return value of syswrite() are in Unicode characters).
+The C<:encoding(...)> layer implicitly introduces the C<:utf8> layer.
+See L, L, and the C pragma, L.
 
 =item tell FILEHANDLE
 