X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperldata.pod;h=3e2482e7847047e7a244efc69823bf6e89f3a385;hb=bbd5c0f5ad81733b079008f34cd05cd9aef7d917;hp=e3361e4dad790bfab9d9af445cd8045f9ca09141;hpb=c47ff5f1a1ef5d0daccf1724400a446cd8e93573;p=p5sagit%2Fp5-mst-13.2.git diff --git a/pod/perldata.pod b/pod/perldata.pod index e3361e4..3e2482e 100644 --- a/pod/perldata.pod +++ b/pod/perldata.pod @@ -87,10 +87,11 @@ that returns a reference to the appropriate type. For a description of this, see L. Names that start with a digit may contain only more digits. Names -that do not start with a letter, underscore, or digit are limited to -one character, e.g., C<$%> or C<$$>. (Most of these one character names -have a predefined significance to Perl. For instance, C<$$> is the -current process id.) +that do not start with a letter, underscore, digit or a caret (i.e. +a control character) are limited to one character, e.g., C<$%> or +C<$$>. (Most of these one character names have a predefined +significance to Perl. For instance, C<$$> is the current process +id.) =head2 Context @@ -129,7 +130,8 @@ assignment to an array or hash evaluates the righthand side in list context. Assignment to a list (or slice, which is just a list anyway) also evaluates the righthand side in list context. -When you use Perl's B<-w> command-line option, you may see warnings +When you use the C<use warnings> pragma or Perl's B<-w> command-line +option, you may see warnings about useless uses of constants or functions in "void context". Void context just means the value has been discarded, such as a statement containing only C<"fred";> or C. It still @@ -208,9 +210,9 @@ with a regular expression (as documented in L). unless /^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/; The length of an array is a scalar value. You may find the length -of array @days by evaluating C<$#days>, as in B.
Technically -speaking, this isn't the length of the array; it's the subscript -of the last element, since there is ordinarily a 0th element. +of array @days by evaluating C<$#days>, as in B. However, this +isn't the length of the array; it's the subscript of the last element, +which is a different value since there is ordinarily a 0th element. Assigning to C<$#days> actually changes the length of the array. Shortening an array this way destroys intervening values. Lengthening an array that was previously shortened does not recover values @@ -258,7 +260,7 @@ of sixteen buckets has been touched, and presumably contains all 10,000 of your items. This isn't supposed to happen. You can preallocate space for a hash by assigning to the keys() function. -This rounds up the allocated bucked to the next power of two: +This rounds up the allocated buckets to the next power of two: keys(%users) = 1000; # allocate 1024 buckets @@ -270,11 +272,17 @@ integer formats: 12345 12345.67 .23E-10 # a very small number - 4_294_967_296 # underline for legibility + 3.14_15_92 # a very important number + 4_294_967_296 # underscore for legibility 0xff # hex + 0xdead_beef # more hex 0377 # octal 0b011011 # binary - v102.111.111 # string (made of characters "f", "o", "o") + +You are allowed to use underscores (underbars) in numeric literals +between digits for legibility. You could, for example, group binary +digits by threes (as for a Unix-style mode argument such as 0b110_100_100) +or by fours (to represent nibbles, as in 0b1010_0110) or in other groups. String literals are usually delimited by either single or double quotes. They work much like quotes in the standard Unix shells: @@ -282,7 +290,7 @@ double-quoted string literals are subject to backslash and variable substitution; single-quoted strings are not (except for C<\'> and C<\\>). The usual C-style backslash rules apply for making characters such as newline, tab, etc., as well as some more exotic -forms. See L for a list. +forms. 
See L for a list. Hexadecimal, octal, or binary, representations in string literals (e.g. '0xff') are not automatically converted to their integer @@ -303,7 +311,8 @@ price is $Z<>100." print "The price is $Price.\n"; # interpreted As in some shells, you can enclose the variable name in braces to -disambiguate it from following alphanumerics. You must also do +disambiguate it from following alphanumerics (and underscores). +You must also do this when interpolating a variable into a string to separate the variable name from a following double-colon or an apostrophe, since these would be otherwise treated as a package separator: @@ -325,15 +334,23 @@ anything more complicated in the subscript will be interpreted as an expression. A literal of the form C<v1.20.300.4000> is parsed as a string composed -of characters with the specified ordinals. This provides an alternative, -more readable way to construct strings, rather than use the somewhat less -readable interpolation form C<"\x{1}\x{14}\x{12c}\x{fa0}">. This is useful -for representing Unicode strings, and for comparing version "numbers" -using the string comparison operators, C<cmp>, C<gt>, C<lt> etc. -If there are two or more dots in the literal, the leading C<v> may be -omitted. Such literals are accepted by both C<require> and C<use> for +of characters with the specified ordinals. This form, known as +v-strings, provides an alternative, more readable way to construct +strings, rather than use the somewhat less readable interpolation form +C<"\x{1}\x{14}\x{12c}\x{fa0}">. This is useful for representing +Unicode strings, and for comparing version "numbers" using the string +comparison operators, C<cmp>, C<gt>, C<lt> etc. If there are two or +more dots in the literal, the leading C<v> may be omitted. + + print v9786; # prints UTF-8 encoded SMILEY, "\x{263a}" + print v102.111.111; # prints "foo" + print 102.111.111; # same + +Such literals are accepted by both C<require> and C<use> for doing a version check.
The C<$^V> special variable also contains the running Perl interpreter's version in this form. See L. +Note that using the v-strings for IPv4 addresses is not portable unless +you also use the inet_aton()/inet_ntoa() routines of the Socket package. The special literals __FILE__, __LINE__, and __PACKAGE__ represent the current filename, line number, and package name at that @@ -366,7 +383,8 @@ A word that has no other interpretation in the grammar will be treated as if it were a quoted string. These are known as "barewords". As with filehandles and labels, a bareword that consists entirely of lowercase letters risks conflict with future reserved -words, and if you use the B<-w> switch, Perl will warn you about any +words, and if you use the C<use warnings> pragma or the B<-w> switch, +Perl will warn you about any such words. Some people may wish to outlaw barewords entirely. If you say @@ -397,62 +415,9 @@ and is almost always right. If it does guess wrong, or if you're just plain paranoid, you can force the correct interpretation with curly braces as above. -A line-oriented form of quoting is based on the shell "here-document" -syntax. Following a C<< << >> you specify a string to terminate -the quoted material, and all lines following the current line down to -the terminating string are the value of the item. The terminating -string may be either an identifier (a word), or some quoted text. If -quoted, the type of quotes you use determines the treatment of the -text, just as in regular quoting. An unquoted identifier works like -double quotes. There must be no space between the C<< << >> and -the identifier. (If you put a space it will be treated as a null -identifier, which is valid, and matches the first empty line.) The -terminating string must appear by itself (unquoted and with no -surrounding whitespace) on the terminating line. - - print < in the section on +L.
=head2 List value constructors @@ -476,26 +441,26 @@ Note that the value of an actual array in scalar context is the length of the array; the following assigns the value 3 to $foo: @foo = ('cc', '-E', $bar); - $foo = @foo; # $foo gets 3 + $foo = @foo; # $foo gets 3 You may have an optional comma before the closing parenthesis of a list literal, so that you can say: @foo = ( - 1, - 2, - 3, + 1, + 2, + 3, ); To use a here-document to assign an array, one line per element, you might use an approach like this: @sauces = < is a +concatenation of two lists, C<1,> and C<3>, the first of which ends +with that optional comma. C<1,,3> is C<(1,),(3)> is C<1,3> (And +similarly for C<1,,,3> is C<(1,),(,),3> is C<1,3> and so on.) Not that +we'd advise you to use this obfuscation. + A list value may also be subscripted like a normal array. You must put the list in parentheses to avoid ambiguity. For example: @@ -547,14 +521,34 @@ function: List assignment in scalar context returns the number of elements produced by the expression on the right side of the assignment: - $x = (($foo,$bar) = (3,2,1)); # set $x to 3, not 2 - $x = (($foo,$bar) = f()); # set $x to f()'s return count + $x = (($foo,$bar) = (3,2,1)); # set $x to 3, not 2 + $x = (($foo,$bar) = f()); # set $x to f()'s return count This is handy when you want to do a list assignment in a Boolean context, because most list functions return a null list when finished, which when assigned produces a 0, which is interpreted as FALSE. -The final element may be an array or a hash: +It's also the source of a useful idiom for executing a function or +performing an operation in list context and then counting the number of +return values, by assigning to an empty list and then using that +assignment in scalar context. For example, this code: + + $count = () = $string =~ /\d+/g; + +will place into $count the number of digit groups found in $string. 
+This happens because the pattern match is in list context (since it +is being assigned to the empty list), and will therefore return a list +of all matching parts of the string. The list assignment in scalar +context will translate that into the number of elements (here, the +number of times the pattern matched) and assign that to $count. Note +that simply using + + $count = $string =~ /\d+/g; + +would not have worked, since a pattern match in scalar context will +only return true or false, rather than a count of matches. + +The final element of a list assignment may be an array or a hash: ($a, $b, @rest) = split; my($a, $b, %rest) = @_; @@ -583,23 +577,23 @@ interpreted as a string--if it's a bareword that would be a legal identifier. This makes it nice for initializing hashes: %map = ( - red => 0x00f, - blue => 0x0f0, - green => 0xf00, + red => 0x00f, + blue => 0x0f0, + green => 0xf00, ); or for initializing hash references to be used as records: $rec = { - witch => 'Mable the Merciless', - cat => 'Fluffy the Ferocious', - date => '10/31/1776', + witch => 'Mable the Merciless', + cat => 'Fluffy the Ferocious', + date => '10/31/1776', }; or for using call-by-named-parameter to complicated functions: $field = $query->radio_group( - name => 'group_name', + name => 'group_name', values => ['eenie','meenie','minie'], default => 'meenie', linebreak => 'true', @@ -615,33 +609,33 @@ of how to arrange for an output ordering. A common way to access an array or a hash is one scalar element at a time. You can also subscript a list to get a single element from it. 
- $whoami = $ENV{"USER"}; # one element from the hash - $parent = $ISA[0]; # one element from the array - $dir = (getpwnam("daemon"))[7]; # likewise, but with list + $whoami = $ENV{"USER"}; # one element from the hash + $parent = $ISA[0]; # one element from the array + $dir = (getpwnam("daemon"))[7]; # likewise, but with list A slice accesses several elements of a list, an array, or a hash simultaneously using a list of subscripts. It's more convenient than writing out the individual elements as a list of separate scalar values. - ($him, $her) = @folks[0,-1]; # array slice - @them = @folks[0 .. 3]; # array slice - ($who, $home) = @ENV{"USER", "HOME"}; # hash slice - ($uid, $dir) = (getpwnam("daemon"))[2,7]; # list slice + ($him, $her) = @folks[0,-1]; # array slice + @them = @folks[0 .. 3]; # array slice + ($who, $home) = @ENV{"USER", "HOME"}; # hash slice + ($uid, $dir) = (getpwnam("daemon"))[2,7]; # list slice Since you can assign to a list of variables, you can also assign to an array or hash slice. @days[3..5] = qw/Wed Thu Fri/; @colors{'red','blue','green'} - = (0xff0000, 0x0000ff, 0x00ff00); + = (0xff0000, 0x0000ff, 0x00ff00); @folks[0, -1] = @folks[-1, 0]; The previous assignments are exactly equivalent to ($days[3], $days[4], $days[5]) = qw/Wed Thu Fri/; ($colors{'red'}, $colors{'blue'}, $colors{'green'}) - = (0xff0000, 0x0000ff, 0x00ff00); + = (0xff0000, 0x0000ff, 0x00ff00); ($folks[0], $folks[-1]) = ($folks[-1], $folks[0]); Since changing a slice changes the original array or hash that it's @@ -651,9 +645,9 @@ values of the array or hash. foreach (@array[ 4 .. 10 ]) { s/peter/paul/ } foreach (@hash{keys %hash}) { - s/^\s+//; # trim leading whitespace - s/\s+$//; # trim trailing whitespace - s/(\w+)/\u\L$1/g; # "titlecase" words + s/^\s+//; # trim leading whitespace + s/\s+$//; # trim trailing whitespace + s/(\w+)/\u\L$1/g; # "titlecase" words } A slice of an empty list is still an empty list.
Thus: @@ -671,7 +665,7 @@ This makes it easy to write loops that terminate when a null list is returned: while ( ($home, $user) = (getpwent)[7,0]) { - printf "%-8s %s\n", $user, $home; + printf "%-8s %s\n", $user, $home; } As noted earlier in this document, the scalar sense of list assignment @@ -729,10 +723,10 @@ operator. These last until their block is exited, but may be passed back. For example: sub newopen { - my $path = shift; - local *FH; # not my! - open (FH, $path) or return undef; - return *FH; + my $path = shift; + local *FH; # not my! + open (FH, $path) or return undef; + return *FH; } $fh = newopen('/etc/passwd'); @@ -743,6 +737,28 @@ C<*HANDLE{IO}> only works if HANDLE has already been used as a handle. In other words, C<*FH> must be used to create new symbol table entries; C<*foo{THING}> cannot. When in doubt, use C<*FH>. +All functions that are capable of creating filehandles (open(), +opendir(), pipe(), socketpair(), sysopen(), socket(), and accept()) +automatically create an anonymous filehandle if the handle passed to +them is an uninitialized scalar variable. This allows the constructs +such as C and C to be used to +create filehandles that will conveniently be closed automatically when +the scope ends, provided there are no other references to them. This +largely eliminates the need for typeglobs when opening filehandles +that must be passed around, as in the following example: + + sub myopen { + open my $fh, "@_" + or die "Can't open '@_': $!"; + return $fh; + } + + { + my $f = myopen("; + # $f implicitly closed here + } + Another way to create anonymous filehandles is with the Symbol module or with the IO::Handle module and its ilk. These modules have the advantage of not hiding different types of the same name