X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperlfunc.pod;h=c42257516b74e025a233d547b0d8de0a7cd9a347;hb=83272a45226e83bd136d713158e9b44ace2dbc8d;hp=3d6f30192eb9a3b289c188c80a40b46d88cee869;hpb=67408caed70eed949797d2c8d6a752d0b53c070b;p=p5sagit%2Fp5-mst-13.2.git diff --git a/pod/perlfunc.pod b/pod/perlfunc.pod index 3d6f301..c422575 100644 --- a/pod/perlfunc.pod +++ b/pod/perlfunc.pod @@ -445,39 +445,59 @@ does. Returns true if it succeeded, false otherwise. NAME should be a packed address of the appropriate type for the socket. See the examples in L. -=item binmode FILEHANDLE, DISCIPLINE +=item binmode FILEHANDLE, LAYER =item binmode FILEHANDLE -Arranges for FILEHANDLE to be read or written in "binary" or "text" mode -on systems where the run-time libraries distinguish between binary and -text files. If FILEHANDLE is an expression, the value is taken as the -name of the filehandle. - -DISCIPLINE can be either of C<:raw> for binary mode or C<:crlf> for -"text" mode. If the DISCIPLINE is omitted, it defaults to C<:raw>. -Returns true on success, C on failure. To mark FILEHANDLE as -UTF-8, use C<:utf8>, and to mark it as bytes, use C<:bytes>. - -The C<:raw> are C<:clrf>, and any other directives of the form -C<:...>, are called I/O I. The C pragma can be -used to establish default I/O disciplines. See L. +Arranges for FILEHANDLE to be read or written in "binary" or "text" +mode on systems where the run-time libraries distinguish between +binary and text files. If FILEHANDLE is an expression, the value is +taken as the name of the filehandle. Returns true on success, +C on failure. + +If LAYER is omitted or specified as C<:raw> the filehandle is made +suitable for passing binary data. This includes turning off possible CRLF +translation and marking it as bytes (as opposed to Unicode characters). 
+Note that, despite what may be implied in I<"Programming Perl"> +(the Camel) or elsewhere, C<:raw> is I<not> simply the inverse of C<:crlf> +-- other layers which would affect the binary nature of the stream are +I<also> disabled. See L, L and the discussion about the +PERLIO environment variable. + +I + +On some systems (in general, DOS and Windows-based systems) binmode() +is necessary when you're not working with a text file. For the sake +of portability it is a good idea to always use it when appropriate, +and to never use it when it isn't appropriate. + +In other words: regardless of platform, use binmode() on binary files +(for example, images). + +If LAYER is present it is a single string, but may contain +multiple directives. The directives alter the behaviour of the +file handle. When LAYER is present, using binmode on a text +file makes sense. + +To mark FILEHANDLE as UTF-8, use C<:utf8>. + +The C<:bytes>, C<:crlf>, and C<:utf8>, and any other directives of the +form C<:...>, are called I/O I<layers>. The C pragma can be used to +establish default I/O layers. See L. In general, binmode() should be called after open() but before any I/O -is done on the filehandle. Calling binmode() will flush any possibly -pending buffered input or output data on the handle. The only -exception to this is the C<:encoding> discipline that changes -the default character encoding of the handle, see L. -The C<:encoding> discipline sometimes needs to be called in +is done on the filehandle. Calling binmode() will normally flush any +pending buffered output data (and perhaps pending input data) on the +handle. An exception to this is the C<:encoding> layer that +changes the default character encoding of the handle, see L. +The C<:encoding> layer sometimes needs to be called in mid-stream, and it doesn't flush the stream. -On some systems binmode() is necessary when you're not working with a -text file. 
For the sake of portability it is a good idea to always use -it when appropriate, and to never use it when it isn't appropriate. - -In other words: Regardless of platform, use binmode() on binary -files, and do not use binmode() on text files. - The operating system, device drivers, C libraries, and Perl run-time system all work together to let the programmer treat a single character (C<\n>) as the line terminator, irrespective of the external @@ -489,14 +509,13 @@ one character. Mac OS, all variants of Unix, and Stream_LF files on VMS use a single character to end each line in the external representation of text (even though that single character is CARRIAGE RETURN on Mac OS and LINE FEED -on Unix and most VMS files). Consequently binmode() has no effect on -these operating systems. In other systems like OS/2, DOS and the various -flavors of MS-Windows your program sees a C<\n> as a simple C<\cJ>, but -what's stored in text files are the two characters C<\cM\cJ>. That means -that, if you don't use binmode() on these systems, C<\cM\cJ> sequences on -disk will be converted to C<\n> on input, and any C<\n> in your program -will be converted back to C<\cM\cJ> on output. This is what you want for -text files, but it can be disastrous for binary files. +on Unix and most VMS files). In other systems like OS/2, DOS and the +various flavors of MS-Windows your program sees a C<\n> as a simple C<\cJ>, +but what's stored in text files are the two characters C<\cM\cJ>. That +means that, if you don't use binmode() on these systems, C<\cM\cJ> +sequences on disk will be converted to C<\n> on input, and any C<\n> in +your program will be converted back to C<\cM\cJ> on output. This is what +you want for text files, but it can be disastrous for binary files. Another consequence of using binmode() (on some systems) is that special end-of-file markers will be seen as part of the data stream. 
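A minimal sketch of the binmode() advice in the hunk above, not part of the patch itself. It assumes a scratch file created with File::Temp and a made-up four-byte payload (both are illustrative choices). It calls binmode() right after open() and before any I/O, then checks that bytes containing CR and LF survive a write/read round trip unchanged.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical payload: four bytes, including CR, LF and a high-bit byte.
my $payload = "\x0D\x0A\xFF\x00";

# Scratch file for the demonstration; removed automatically at exit.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh or die "close: $!";

# Writing: binmode() comes after open() and before the first print.
open(my $out, '>', $path) or die "open for write: $!";
binmode($out) or die "binmode: $!";
print {$out} $payload;
close $out or die "close: $!";

# Reading: same order, binmode() before the first read.
open(my $in, '<', $path) or die "open for read: $!";
binmode($in) or die "binmode: $!";
my $copy = do { local $/; <$in> };   # slurp the whole file
close $in or die "close: $!";

my $size = -s $path;
print "bytes on disk: $size\n";
print $copy eq $payload ? "round trip ok\n" : "round trip changed the data\n";
```

On Unix the binmode() calls should make no difference to the result. On platforms that store C<\n> as C<\cM\cJ>, leaving them out would be expected to change the byte count on disk, which is the failure mode the text warns about for binary files.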
@@ -638,6 +657,13 @@ You can actually chomp anything that's an lvalue, including an assignment: If you chomp a list, each element is chomped, and the total number of characters removed is returned. +Note that parentheses are necessary when you're chomping anything +that is not a simple variable. This is because C<chomp $cwd = `pwd`;> +is interpreted as C<(chomp $cwd) = `pwd`;>, rather than as +C<chomp( $cwd = `pwd` )> which you might expect. Similarly, +C<chomp $a, $b> is interpreted as C<chomp($a), $b> rather than +as C<chomp($a, $b)>. + =item chop VARIABLE =item chop( LIST ) @@ -657,6 +683,8 @@ last C is returned. Note that C returns the last character. To return all but the last character, use C. +See also L. + =item chown LIST Changes the owner (and group) of a list of files. The first two @@ -2025,11 +2053,13 @@ Returns the socket option requested, or undef if there is an error. =item glob -Returns the value of EXPR with filename expansions such as the -standard Unix shell F would do. This is the internal function -implementing the C<< <*.c> >> operator, but you can use it directly. -If EXPR is omitted, C<$_> is used. The C<< <*.c> >> operator is -discussed in more detail in L. +In list context, returns a (possibly empty) list of filename expansions on +the value of EXPR such as the standard Unix shell F would do. In +scalar context, glob iterates through such filename expansions, returning +undef when the list is exhausted. This is the internal function +implementing the C<< <*.c> >> operator, but you can use it directly. If +EXPR is omitted, C<$_> is used. The C<< <*.c> >> operator is discussed in +more detail in L. Beginning with v5.6.0, this operator is implemented using the standard C extension. See L for details. @@ -2469,14 +2499,13 @@ and the month of the year, may not necessarily be three characters wide. =item lock THING -This function places an advisory lock on a variable, subroutine, -or referenced object contained in I until the lock goes out -of scope. 
+This function places an advisory lock on a shared variable, or referenced +object contained in I until the lock goes out of scope. lock() is a "weak keyword" : this means that if you've defined a function by this name (before any calls to it), that function will be called -instead. (However, if you've said C, lock() is always a -keyword.) See L. +instead. (However, if you've said C, lock() is always a +keyword.) See L. =item log EXPR @@ -2625,12 +2654,22 @@ and C documentation. =item my EXPR -=item my EXPR : ATTRIBUTES +=item my TYPE EXPR + +=item my EXPR : ATTRS + +=item my TYPE EXPR : ATTRS A C declares the listed variables to be local (lexically) to the -enclosing block, file, or C. If -more than one value is listed, the list must be placed in parentheses. See -L for details. +enclosing block, file, or C. If more than one value is listed, +the list must be placed in parentheses. + +The exact semantics and interface of TYPE and ATTRS are still +evolving. TYPE is currently bound to the use of C pragma, +and attributes are handled using the C pragma, or starting +from Perl 5.8.0 also via the C module. See +L for details, and L, +L, and L. =item next LABEL @@ -2771,14 +2810,17 @@ meaning. In the 2-arguments (and 1-argument) form opening C<'-'> opens STDIN and opening C<< '>-' >> opens STDOUT. -You may use the three-argument form of open to specify -I that affect how the input and output -are processed: see L and L. For example +You may use the three-argument form of open to specify IO "layers" +(sometimes also referred to as "disciplines") to be applied to the handle +that affect how the input and output are processed (see L and +L for more details). For example open(FH, "<:utf8", "file") will open the UTF-8 encoded file containing Unicode characters, -see L. +see L. (Note that if layers are specified in the +three-arg form then default layers set by the C pragma are +ignored.) Open returns nonzero upon success, the undefined value otherwise. 
If the C involved a pipe, the return value happens to be the pid of @@ -2788,15 +2830,10 @@ If you're running Perl on a system that distinguishes between text files and binary files, then you should check out L for tips for dealing with this. The key distinction between systems that need C and those that don't is their text file formats. Systems -like Unix, MacOS, and Plan9, which delimit lines with a single +like Unix, Mac OS, and Plan 9, which delimit lines with a single character, and which encode that character in C as C<"\n">, do not need C. The rest need it. -In the three argument form MODE may also contain a list of IO "layers" -(see L and L for more details) to be applied to the -handle. This can be used to achieve the effect of C as well -as more complex behaviours. - When opening a file, it's usually a bad idea to continue normal execution if the request failed, so C is frequently used in connection with C. Even if C won't do what you want (say, in a CGI script, @@ -2954,7 +2991,9 @@ The following triples are more or less equivalent: open(FOO, '-|', "cat", '-n', $file); The last example in each block shows the pipe as "list form", which is -not yet supported on all platforms. +not yet supported on all platforms. A good rule of thumb is that if +your platform has true C<fork()> (in other words, if your platform is +UNIX) you can use the list form. See L for more examples of this. @@ -3055,7 +3094,11 @@ See L and L for more about Unicode. =item our EXPR -=item our EXPR : ATTRIBUTES +=item our TYPE EXPR + +=item our EXPR : ATTRS + +=item our TYPE EXPR : ATTRS An C declares the listed variables to be valid globals within the enclosing block, file, or C. That is, it has the same @@ -3096,22 +3139,29 @@ package, Perl will emit warnings if you have asked for them. our $bar; # emits warning An C declaration may also have a list of attributes associated -with it. B: This is an experimental feature that may be -changed or removed in future releases of Perl. 
It should not be -relied upon. - -The only currently recognized attribute is C which indicates -that a single copy of the global is to be used by all interpreters -should the program happen to be running in a multi-interpreter -environment. (The default behaviour would be for each interpreter to -have its own copy of the global.) In such an environment, this -attribute also has the effect of making the global readonly. -Examples: +with it. + +The exact semantics and interface of TYPE and ATTRS are still +evolving. TYPE is currently bound to the use of C pragma, +and attributes are handled using the C pragma, or starting +from Perl 5.8.0 also via the C module. See +L for details, and L, +L, and L. + +The only currently recognized C attribute is C which +indicates that a single copy of the global is to be used by all +interpreters should the program happen to be running in a +multi-interpreter environment. (The default behaviour would be for +each interpreter to have its own copy of the global.) Examples: our @EXPORT : unique = qw(foo); our %EXPORT_TAGS : unique = (bar => [qw(aa bb cc)]); our $VERSION : unique = "1.00"; +Note that this attribute also has the effect of making the global +readonly when the first new interpreter is cloned (for example, +when the first new thread is created). + Multi-interpreter environments can come to being either through the fork() emulation on Windows platforms, or by embedding perl in a multi-threaded application. The C attribute does nothing in @@ -3722,7 +3772,7 @@ see C. Note the I: depending on the status of the filehandle, either (8-bit) bytes or characters are read. By default all filehandles operate on bytes, but for example if the filehandle has -been opened with the C<:utf8> discipline (see L, and the C +been opened with the C<:utf8> I/O layer (see L, and the C pragma, L), the I/O will operate on characters, not bytes. =item readdir DIRHANDLE @@ -3794,7 +3844,7 @@ See L for examples. 
Note the I: depending on the status of the socket, either (8-bit) bytes or characters are received. By default all sockets operate on bytes, but for example if the socket has been changed using -binmode() to operate with the C<:utf8> discipline (see the C +binmode() to operate with the C<:utf8> I/O layer (see the C pragma, L), the I/O will operate on characters, not bytes. =item redo LABEL @@ -4142,7 +4192,7 @@ otherwise. Note the I: even if the filehandle has been set to operate on characters (for example by using the C<:utf8> open -discipline), tell() will return byte offsets, not character offsets +layer), tell() will return byte offsets, not character offsets (because implementing that would render seek() and tell() rather slow). If you want to position file for C or C, don't use @@ -4314,7 +4364,7 @@ L for examples. Note the I: depending on the status of the socket, either (8-bit) bytes or characters are sent. By default all sockets operate on bytes, but for example if the socket has been changed using -binmode() to operate with the C<:utf8> discipline (see L, or +binmode() to operate with the C<:utf8> I/O layer (see L, or the C pragma, L), the I/O will operate on characters, not bytes. @@ -4478,16 +4528,18 @@ sockets but not socketpair. =item sort LIST -Sorts the LIST and returns the sorted list value. If SUBNAME or BLOCK -is omitted, Cs in standard string comparison order. If SUBNAME is -specified, it gives the name of a subroutine that returns an integer -less than, equal to, or greater than C<0>, depending on how the elements -of the list are to be ordered. (The C<< <=> >> and C -operators are extremely useful in such routines.) SUBNAME may be a -scalar variable name (unsubscripted), in which case the value provides -the name of (or a reference to) the actual subroutine to use. In place -of a SUBNAME, you can provide a BLOCK as an anonymous, in-line sort -subroutine. +In list context, this sorts the LIST and returns the sorted list value. 
+In scalar context, the behaviour of C<sort()> is undefined. + +If SUBNAME or BLOCK is omitted, C<sort>s in standard string comparison +order. If SUBNAME is specified, it gives the name of a subroutine +that returns an integer less than, equal to, or greater than C<0>, +depending on how the elements of the list are to be ordered. (The C<< +<=> >> and C<cmp> operators are extremely useful in such routines.) +SUBNAME may be a scalar variable name (unsubscripted), in which case +the value provides the name of (or a reference to) the actual +subroutine to use. In place of a SUBNAME, you can provide a BLOCK as +an anonymous, in-line sort subroutine. If the subroutine's prototype is C<($$)>, the elements to be compared are passed by reference in C<@_>, as for a normal subroutine. This is @@ -5192,17 +5244,22 @@ out the names of those files that contain a match: print $file, "\n"; } -=item sub BLOCK +=item sub NAME BLOCK -=item sub NAME +=item sub NAME (PROTO) BLOCK -=item sub NAME BLOCK +=item sub NAME : ATTRS BLOCK + +=item sub NAME (PROTO) : ATTRS BLOCK + +This is a subroutine definition, not a real function I<per se>. +Without a BLOCK it's just a forward declaration. Without a NAME, +it's an anonymous function declaration, and does actually return +a value: the CODE ref of the closure you just created. -This is subroutine definition, not a real function I. With just a -NAME (and possibly prototypes or attributes), it's just a forward declaration. -Without a NAME, it's an anonymous function declaration, and does actually -return a value: the CODE ref of the closure you just created. See L -and L for details. +See L and L for details about subroutines and +references, and L and L for more +information about attributes. =item substr EXPR,OFFSET,LENGTH,REPLACEMENT @@ -5353,7 +5410,7 @@ last byte of the scalar after the read. Note the I: depending on the status of the filehandle, either (8-bit) bytes or characters are read. 
By default all filehandles operate on bytes, but for example if the filehandle has -been opened with the C<:utf8> discipline (see L, and the C +been opened with the C<:utf8> I/O layer (see L, and the C pragma, L), the I/O will operate on characters, not bytes. An OFFSET may be specified to place the read data at some place in the @@ -5377,7 +5434,7 @@ POSITION, and C<2> to set it to EOF plus POSITION (typically negative). Note the I: even if the filehandle has been set to operate -on characters (for example by using the C<:utf8> discipline), tell() +on characters (for example by using the C<:utf8> I/O layer), tell() will return byte offsets, not character offsets (because implementing that would render sysseek() very slow). @@ -5478,7 +5535,7 @@ In the case the SCALAR is empty you can use OFFSET but only zero offset. Note the I: depending on the status of the filehandle, either (8-bit) bytes or characters are written. By default all filehandles operate on bytes, but for example if the filehandle has -been opened with the C<:utf8> discipline (see L, and the open +been opened with the C<:utf8> I/O layer (see L, and the open pragma, L), the I/O will operate on characters, not bytes. =item tell FILEHANDLE @@ -5492,7 +5549,7 @@ last read. Note the I: even if the filehandle has been set to operate on characters (for example by using the C<:utf8> open -discipline), tell() will return byte offsets, not character offsets +layer), tell() will return byte offsets, not character offsets (because that would render seek() and tell() rather slow). The return value of tell() for the standard streams like the STDIN @@ -5614,7 +5671,7 @@ package. =item time Returns the number of non-leap seconds since whatever time the system -considers to be the epoch (that's 00:00:00, January 1, 1904 for MacOS, +considers to be the epoch (that's 00:00:00, January 1, 1904 for Mac OS, and 00:00:00 UTC, January 1, 1970 for most other systems). Suitable for feeding to C and C. 
@@ -5810,6 +5867,7 @@ See L for more examples and notes. =item untie VARIABLE Breaks the binding between a variable and a package. (See C.) +Has no effect if the variable is not tied. =item unshift ARRAY,LIST