X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperlfunc.pod;h=101d10e9fb23705d343b45ea0a52502d3f4c5259;hb=1651fc447620d3610b694c35696c13530282f981;hp=1004837802ee0c3d2fdbd61811c57dcf10c55ed1;hpb=ee6b43cc19efb39ed8a2fdad01d701e59dbdd946;p=p5sagit%2Fp5-mst-13.2.git diff --git a/pod/perlfunc.pod b/pod/perlfunc.pod index 1004837..101d10e 100644 --- a/pod/perlfunc.pod +++ b/pod/perlfunc.pod @@ -100,7 +100,7 @@ than one place. X X X C, C, C, C, C, C, C, C, -C, C, C, C, C, C, C, +C, C, C, C, C, C, C, C, C, C, C, C, C, C =item Regular expressions and pattern matching @@ -122,7 +122,7 @@ C, C, C, C, C =item Functions for list data X -C, C, C, C, C, C, C +C, C, C, C, C, C, C =item Functions for real %HASHes X @@ -180,7 +180,7 @@ C, C, C, C, C X X X C, C, C, C, C, C, C, -C, C, C, C, C, C, +C, C, C, C, C, C, C, C, C =item Keywords related to perl modules @@ -233,7 +233,7 @@ X C, C, C, C, C, C, C, C, C, C, C, C, C, C, -C, C, C, C, C, C, C, C, C, +C, C, C, C, C, C, C, C, C, C, C, C, C*, C, C, C, C, C, C, C, C @@ -296,8 +296,7 @@ and tests the associated file to see if something is true about it. If the argument is omitted, tests C<$_>, except for C<-t>, which tests STDIN. Unless otherwise documented, it returns C<1> for true and C<''> for false, or the undefined value if the file doesn't exist. Despite the funny -names, precedence is the same as any other named unary operator, and -the argument may be parenthesized like any other unary operator. The +names, precedence is the same as any other named unary operator. The operator may be any of: -r File is readable by effective uid/gid. @@ -526,8 +525,8 @@ If LAYER is omitted or specified as C<:raw> the filehandle is made suitable for passing binary data. This includes turning off possible CRLF translation and marking it as bytes (as opposed to Unicode characters). 
Note that, despite what may be implied in I<"Programming Perl"> (the -Camel) or elsewhere, C<:raw> is I the simply inverse of C<:crlf> --- other layers which would affect binary nature of the stream are +Camel) or elsewhere, C<:raw> is I simply the inverse of C<:crlf> +-- other layers which would affect the binary nature of the stream are I disabled. See L, L and the discussion about the PERLIO environment variable. @@ -542,7 +541,10 @@ functionality has moved from "discipline" to "layer". All documentation of this version of Perl therefore refers to "layers" rather than to "disciplines". Now back to the regularly scheduled documentation...> -To mark FILEHANDLE as UTF-8, use C<:utf8>. +To mark FILEHANDLE as UTF-8, use C<:utf8> or C<:encoding(utf8)>. +C<:utf8> just marks the data as UTF-8 without further checking, +while C<:encoding(utf8)> checks the data for actually being valid +UTF-8. More details can be found in L. In general, binmode() should be called after open() but before any I/O is done on the filehandle. Calling binmode() will normally flush any @@ -756,10 +758,6 @@ You can actually chomp anything that's an lvalue, including an assignment: If you chomp a list, each element is chomped, and the total number of characters removed is returned. -If the C pragma is in scope then the lengths returned are -calculated from the length of C<$/> in Unicode characters, which is not -always the same as the length of C<$/> in the native encoding. - Note that parentheses are necessary when you're chomping anything that is not a simple variable. This is because C is interpreted as C<(chomp $cwd) = `pwd`;>, rather than as @@ -836,9 +834,7 @@ X X X X Returns the character represented by that NUMBER in the character set. For example, C is C<"A"> in either ASCII or Unicode, and -chr(0x263a) is a Unicode smiley face. Note that characters from 128 -to 255 (inclusive) are by default not encoded in UTF-8 Unicode for -backward compatibility reasons (but see L). 
+chr(0x263a) is a Unicode smiley face. Negative values give the Unicode replacement character (chr(0xfffd)), except under the L pragma, where low eight bits of the value @@ -848,10 +844,10 @@ If NUMBER is omitted, uses C<$_>. For the reverse, use L. -Note that under the C pragma the NUMBER is masked to -the low eight bits. +Note that characters from 128 to 255 (inclusive) are by default +internally not encoded as UTF-8 for backward compatibility reasons. -See L and L for more about Unicode. +See L for more about Unicode. =item chroot FILENAME X X @@ -870,10 +866,11 @@ X =item close -Closes the file or pipe associated with the file handle, returning -true only if IO buffers are successfully flushed and closes the system -file descriptor. Closes the currently selected filehandle if the -argument is omitted. +Closes the file or pipe associated with the file handle, flushes the IO +buffers, and closes the system file descriptor. Returns true if those +operations have succeeded and if no error was reported by any PerlIO +layer. Closes the currently selected filehandle if the argument is +omitted. You don't have to close FILEHANDLE if you are immediately going to do another C on it, because C will close it for you. (See @@ -1288,13 +1285,17 @@ trapped within an eval(), $@ contains the reference. This behavior permits a more elaborate exception handling implementation using objects that maintain arbitrary state about the nature of the exception. Such a scheme is sometimes preferable to matching particular string values of $@ using -regular expressions. Here's an example: +regular expressions. Because $@ is a global variable, and eval() may be +used within object implementations, care must be taken that analyzing the +error object doesn't replace the reference in the global variable. The +easiest solution is to make a local copy of the reference before doing +other manipulations. Here's an example: use Scalar::Util 'blessed'; eval { ... 
; die Some::Module::Exception->new( FOO => "bar" ) }; - if ($@) { - if (blessed($@) && $@->isa("Some::Module::Exception")) { + if (my $ev_err = $@) { + if (blessed($ev_err) && $ev_err->isa("Some::Module::Exception")) { # handle Some::Module::Exception } else { @@ -1404,20 +1405,11 @@ B: Any files opened at the time of the dump will I be open any more when the program is reincarnated, with possible resulting confusion on the part of Perl. -This function is now largely obsolete, partly because it's very -hard to convert a core file into an executable, and because the -real compiler backends for generating portable bytecode and compilable -C code have superseded it. That's why you should now invoke it as -C, if you don't want to be warned against a possible +This function is now largely obsolete, mostly because it's very hard to +convert a core file into an executable. That's why you should now invoke +it as C, if you don't want to be warned against a possible typo. -If you're looking to use L to speed up your program, consider -generating bytecode or native C code as described in L. If -you're just trying to accelerate a CGI script, consider using the -C extension to B, or the CPAN module, CGI::Fast. -You might also consider autoloading or selfloading, which at least -make your program I to run faster. - =item each HASH X X @@ -2582,8 +2574,8 @@ If SIGNAL is zero, no signal is sent to the process, but the kill(2) system call will check whether it's possible to send a signal to it (that means, to be brief, that the process is owned by the same user, or we are the super-user). This is a useful way to check that a child process is -alive and hasn't changed its UID. See L for notes on the -portability of this construct. +alive (even if only as a zombie) and hasn't changed its UID. See +L for notes on the portability of this construct. Unlike in the shell, if SIGNAL is negative, it kills process groups instead of processes. 
(On System V, a negative I @@ -2656,7 +2648,11 @@ For that, use C and C respectively. Note the I: if the EXPR is in Unicode, you will get the number of characters, not the number of bytes. To get the length -in bytes, use C, see L. +of the internal string in bytes, use C, see +L. Note that the internal encoding is variable, and the number +of bytes usually meaningless. To get the number of bytes that the +string would have when encoded as UTF-8, use +C. =item link OLDFILE,NEWFILE X @@ -2822,13 +2818,13 @@ more elements in the returned value. translates a list of numbers to the corresponding characters. And - %hash = map { getkey($_) => $_ } @array; + %hash = map { get_a_key_for($_) => $_ } @array; is just a funny way to write %hash = (); - foreach $_ (@array) { - $hash{getkey($_)} = $_; + foreach (@array) { + $hash{get_a_key_for($_)} = $_; } Note that C<$_> is an alias to the list value, so it can be used to @@ -2839,8 +2835,8 @@ most cases. See also L for an array composed of those items of the original list for which the BLOCK or EXPR evaluates to true. If C<$_> is lexical in the scope where the C appears (because it has -been declared with C) then, in addition to being locally aliased to -the list elements, C<$_> keeps being lexical inside the block; i.e. it +been declared with C), then, in addition to being locally aliased to +the list elements, C<$_> keeps being lexical inside the block; that is, it can't be seen from the outside, avoiding any potential side-effects. C<{> starts both hash references and blocks, so C could be either @@ -2860,7 +2856,7 @@ such as using a unary C<+> to give perl some help: %hash = map ( lc($_), 1 ), @array # evaluates to (1, @array) -or to force an anon hash constructor use C<+{> +or to force an anon hash constructor use C<+{>: @hashes = map +{ lc($_), 1 }, @array # EXPR, so needs , at end @@ -2993,6 +2989,8 @@ X =item no Module +=item no VERSION + See the C function, of which C is the opposite. 
=item oct EXPR @@ -3105,7 +3103,7 @@ You may use the three-argument form of open to specify IO "layers" that affect how the input and output are processed (see L and L for more details). For example - open(FH, "<:utf8", "file") + open(FH, "<:encoding(UTF-8)", "file") will open the UTF-8 encoded file containing Unicode characters, see L. Note that if layers are specified in the @@ -3411,7 +3409,7 @@ or Unicode) value of the first character of EXPR. If EXPR is omitted, uses C<$_>. For the reverse, see L. -See L and L for more about Unicode. +See L for more about Unicode. =item our EXPR X X @@ -3507,8 +3505,7 @@ of values, as follows: H A hex string (high nybble first). c A signed char (8-bit) value. - C An unsigned C char (octet) even under Unicode. Should normally not - be used. See U and W instead. + C An unsigned char (octet) value. W An unsigned char value (can be greater than 255). s A signed short (16-bit) value. @@ -3549,8 +3546,8 @@ of values, as follows: P A pointer to a structure (fixed-length string). u A uuencoded string. - U A Unicode character number. Encodes to UTF-8 internally - (or UTF-EBCDIC in EBCDIC platforms). + U A Unicode character number. Encodes to a character in character mode + and UTF-8 (or UTF-EBCDIC in EBCDIC platforms) in byte mode. w A BER compressed integer (not an ASN.1 BER, see perlpacktut for details). Its bytes represent an unsigned integer in base 128, @@ -4127,9 +4124,10 @@ X Equivalent to C, except that C<$\> (the output record separator) is not appended. The first argument of the list will be interpreted as the C format. See C -for an explanation of the format argument. If C is in effect, -the character used for the decimal point in formatted real numbers is -affected by the LC_NUMERIC locale. See L. +for an explanation of the format argument. If C is in effect, +and POSIX::setlocale() has been called, the character used for the decimal +separator in formatted floating point numbers is affected by the LC_NUMERIC +locale. 
See L and L. Don't fall into the trap of using a C when a simple C would do. The C is more efficient and less @@ -4150,7 +4148,7 @@ like a Perl function. Otherwise, the string describing the equivalent prototype is returned. =item push ARRAY,LIST -X, X +X X Treats ARRAY as a stack, and pushes the values of LIST onto the end of ARRAY. The length of ARRAY increases by the length of @@ -4259,14 +4257,17 @@ C there, it would have been testing the wrong file. closedir DIR; =item readline EXPR + +=item readline X X X -Reads from the filehandle whose typeglob is contained in EXPR. In scalar -context, each call reads and returns the next line, until end-of-file is -reached, whereupon the subsequent call returns undef. In list context, -reads until end-of-file is reached and returns a list of lines. Note that -the notion of "line" used here is however you may have defined it -with C<$/> or C<$INPUT_RECORD_SEPARATOR>). See L. +Reads from the filehandle whose typeglob is contained in EXPR (or from +*ARGV if EXPR is not provided). In scalar context, each call reads and +returns the next line, until end-of-file is reached, whereupon the +subsequent call returns undef. In list context, reads until end-of-file +is reached and returns a list of lines. Note that the notion of "line" +used here is however you may have defined it with C<$/> or +C<$INPUT_RECORD_SEPARATOR>). See L. When C<$/> is set to C, when readline() is in scalar context (i.e. file slurp mode), and when an empty file is read, it @@ -4305,6 +4306,8 @@ error, returns the undefined value and sets C<$!> (errno). If EXPR is omitted, uses C<$_>. =item readpipe EXPR + +=item readpipe X EXPR is executed as a system command. @@ -4315,6 +4318,7 @@ multi-line) string. In list context, returns a list of lines This is the internal function implementing the C operator, but you can use it directly. The C operator is discussed in more detail in L. +If EXPR is omitted, uses C<$_>. 
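The readpipe behaviour described above can be sketched as follows. This is a minimal illustration, not part of the patch: it assumes a POSIX-like shell with an C<echo>-style command on the path, and the variable names are made up for the example.

```perl
use strict;
use warnings;

# Assumption: a POSIX-like shell providing `echo` is available.

# Scalar context: the collected output comes back as a single string.
my $whole = readpipe('echo hello');

# List context: one element per output line, split according to $/.
my @lines = readpipe('echo a; echo b');

# No EXPR: the command to run is taken from $_.
my $from_default;
for ('echo from-default') {
    $from_default = readpipe;
}

print "scalar: $whole";
print "list has ", scalar(@lines), " lines\n";
print "default: $from_default";
```

Each returned line keeps its trailing newline, just as with the backtick operator, so chomp the results if the newline is unwanted.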
=item recv SOCKET,SCALAR,LENGTH,FLAGS X @@ -4408,6 +4412,14 @@ name is returned instead. You can think of C as a C operator. print "r is not a reference at all.\n"; } +The return value C indicates a reference to an lvalue that is not +a variable. You get this from taking the reference of function calls like +C or C. C is returned if the reference points +to a L. + +The result C indicates that the argument is a regular expression +resulting from C. + See also L. =item rename OLDNAME,NEWNAME @@ -4453,8 +4465,9 @@ version should be used instead. Otherwise, C demands that a library file be included if it hasn't already been included. The file is included via the do-FILE -mechanism, which is essentially just a variety of C. Has -semantics similar to the following subroutine: +mechanism, which is essentially just a variety of C with the +caveat that lexical variables in the invoking script will be invisible +to the included code. Has semantics similar to the following subroutine: sub require { my ($filename) = @_; @@ -4533,22 +4546,16 @@ Subroutine references are the simplest case. When the inclusion system walks through @INC and encounters a subroutine, this subroutine gets called with two parameters, the first being a reference to itself, and the second the name of the file to be included (e.g. "F"). The -subroutine should return nothing, or a list of up to 4 values in the +subroutine should return nothing, or a list of up to three values in the following order: =over =item 1 -A reference to a scalar, containing any initial source code to prepend to -the file or generator output. - - -=item 2 - A filehandle, from which the file will be read. -=item 3 +=item 2 A reference to a subroutine. If there is no filehandle (previous item), then this subroutine is expected to generate one line of source code per @@ -4558,7 +4565,7 @@ called to act a simple source filter, with the line as read in C<$_>. Again, return 1 for each valid line, and 0 after all lines have been returned. 
-=item 4 +=item 3 Optional state for the subroutine. The state is passed in as C<$_[1]>. A reference to the subroutine itself is passed in as C<$_[0]>. @@ -4717,11 +4724,8 @@ X =item say Just like C, but implicitly appends a newline. -C is simply an abbreviation for C, -and C works just like C. - -That means that a call to say() appends any output record separator -I the added newline. +C is simply an abbreviation for C<{ local $\ = "\n"; print +LIST }>. This keyword is only available when the "say" feature is enabled: see L. @@ -4816,8 +4820,8 @@ X