X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperlfunc.pod;h=2f342900a1de184bb6a806d3b58e37c716490747;hb=16fe6d5906f6eff9da00cb861a7054a440d1f6eb;hp=5de9dc794771568ff79a21b69d9d2bac71ce7a35;hpb=5a211162cd360449f2dbfb7ca9231c025909353f;p=p5sagit%2Fp5-mst-13.2.git diff --git a/pod/perlfunc.pod b/pod/perlfunc.pod index 5de9dc7..2f34290 100644 --- a/pod/perlfunc.pod +++ b/pod/perlfunc.pod @@ -335,7 +335,7 @@ following a minus are interpreted as file tests. The C<-T> and C<-B> switches work as follows. The first block or so of the file is examined for odd characters such as strange control codes or -characters with the high bit set. If too many strange characters (E30%) +characters with the high bit set. If too many strange characters (>30%) are found, it's a C<-B> file, otherwise it's a C<-T> file. Also, any file containing null in the first block is considered a binary file. If C<-T> or C<-B> is used on a filehandle, the current stdio buffer is examined @@ -439,47 +439,63 @@ does. Returns true if it succeeded, false otherwise. NAME should be a packed address of the appropriate type for the socket. See the examples in L. +=item binmode FILEHANDLE, DISCIPLINE + =item binmode FILEHANDLE -Arranges for FILEHANDLE to be read or written in "binary" mode on -systems whose run-time libraries force the programmer to guess -between binary and text files. If FILEHANDLE is an expression, the -value is taken as the name of the filehandle. binmode() should be -called after the C but before any I/O is done on the filehandle. -The only way to reset binary mode on a filehandle is to reopen the -file. +Arranges for FILEHANDLE to be read or written in "binary" or "text" mode +on systems where the run-time libraries distinguish between binary and +text files. If FILEHANDLE is an expression, the value is taken as the +name of the filehandle. DISCIPLINE can be either of C<":raw"> for +binary mode or C<":crlf"> for "text" mode. 
If the DISCIPLINE is +omitted, it defaults to C<":raw">. + +binmode() should be called after open() but before any I/O is done on +the filehandle. + +On many systems binmode() currently has no effect, but in future, it +will be extended to support user-defined input and output disciplines. +On some systems binmode() is necessary when you're not working with a +text file. For the sake of portability it is a good idea to always use +it when appropriate, and to never use it when it isn't appropriate. + +In other words: Regardless of platform, use binmode() on binary +files, and do not use binmode() on text files. + +The C pragma can be used to establish default disciplines. +See L. The operating system, device drivers, C libraries, and Perl run-time -system all conspire to let the programmer conveniently treat a -simple, one-byte C<\n> as the line terminator, irrespective of its -external representation. On Unix and its brethren, the native file -representation exactly matches the internal representation, making -everyone's lives unbelievably simpler. Consequently, L -has no effect under Unix, Plan9, or Mac OS, all of which use C<\n> -to end each line. (Unix and Plan9 think C<\n> means C<\cJ> and -C<\r> means C<\cM>, whereas the Mac goes the other way--it uses -C<\cM> for c<\n> and C<\cJ> to mean C<\r>. But that's ok, because -it's only one byte, and the internal and external representations -match.) - -In legacy systems like MS-DOS and its embellishments, your program -sees a C<\n> as a simple C<\cJ> (just as in Unix), but oddly enough, -that's not what's physically stored on disk. What's worse, these -systems refuse to help you with this; it's up to you to remember -what to do. And you mustn't go applying binmode() with wild abandon, -either, because if your system does care about binmode(), then using -it when you shouldn't is just as perilous as failing to use it when -you should. 
- -That means that on any version of Microsoft WinXX that you might -care to name (or not), binmode() causes C<\cM\cJ> sequences on disk -to be converted to C<\n> when read into your program, and causes -any C<\n> in your program to be converted back to C<\cM\cJ> on -output to disk. This sad discrepancy leads to no end of -problems in not just the readline operator, but also when using -seek(), tell(), and read() calls. See L for other painful -details. See the C<$/> and C<$\> variables in L for how -to manually set your input and output line-termination sequences. +system all work together to let the programmer treat a single +character (C<\n>) as the line terminator, irrespective of the external +representation. On many operating systems, the native text file +representation matches the internal representation, but on some +platforms the external representation of C<\n> is made up of more than +one character. + +Mac OS and all variants of Unix use a single character to end each line +in the external representation of text (even though that single +character is not necessarily the same across these platforms). +Consequently binmode() has no effect on these operating systems. In +other systems like VMS, MS-DOS and the various flavors of MS-Windows +your program sees a C<\n> as a simple C<\cJ>, but what's stored in text +files are the two characters C<\cM\cJ>. That means that, if you don't +use binmode() on these systems, C<\cM\cJ> sequences on disk will be +converted to C<\n> on input, and any C<\n> in your program will be +converted back to C<\cM\cJ> on output. This is what you want for text +files, but it can be disastrous for binary files. + +Another consequence of using binmode() (on some systems) is that +special end-of-file markers will be seen as part of the data stream. +For systems from the Microsoft family this means that if your binary +data contains C<\cZ>, the I/O subsystem will regard it as the end of +the file, unless you use binmode(). 
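The binary-mode behaviour described above can be sketched as a small round trip. This is an illustration only: the payload and the variable names are hypothetical, and File::Temp is used merely to obtain a scratch file. On Unix-like systems the binmode() calls have no effect, so the round trip succeeds either way; on systems that translate C<\cM\cJ> or treat C<\cZ> as end of file, they are what keeps the bytes intact.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical payload: bytes that text-mode I/O may alter on some systems.
my $payload = "A\cM\cJB\cZC\cJ";

# Write the payload in binary mode, right after opening and before any I/O.
my ($out, $path) = tempfile(UNLINK => 1);
binmode($out);
print {$out} $payload;
close($out) or die "close failed: $!";

# Read it back, again switching to binary mode before the first read.
open(my $in, '<', $path) or die "cannot open $path: $!";
binmode($in);
my $read_back = do { local $/; <$in> };    # slurp the whole file
close($in);

print "round trip ", ($read_back eq $payload ? "ok" : "FAILED"), "\n";
```

Calling binmode() on both the writing and the reading handle follows the advice in the text: use it on binary files regardless of platform.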
+ +binmode() is not only important for readline() and print() operations, +but also when using read(), seek(), sysread(), syswrite() and tell() +(see L for more details). See the C<$/> and C<$\> variables +in L for how to manually set your input and output +line-termination sequences. =item bless REF,CLASSNAME @@ -517,7 +533,7 @@ print a stack trace. The value of EXPR indicates how many call frames to go back before the current one. ($package, $filename, $line, $subroutine, $hasargs, - $wantarray, $evaltext, $is_require, $hints) = caller($i); + $wantarray, $evaltext, $is_require, $hints, $bitmask) = caller($i); Here $subroutine may be C<(eval)> if the frame is not a subroutine call, but an C. In such a case additional elements $evaltext and @@ -526,9 +542,9 @@ C or C statement, $evaltext contains the text of the C statement. In particular, for a C statement, $filename is C<(eval)>, but $evaltext is undefined. (Note also that each C statement creates a C frame inside an C) -frame. C<$hints> contains pragmatic hints that the caller was -compiled with. The C<$hints> value is subject to change between versions -of Perl, and is not meant for external use. +frame. C<$hints> and C<$bitmask> contain pragmatic hints that the caller +was compiled with. The C<$hints> and C<$bitmask> values are subject to +change between versions of Perl, and are not meant for external use. Furthermore, when called from within the DB package, caller returns more detailed information: it sets the list variable C<@DB::args> to be the @@ -537,7 +553,7 @@ arguments with which the subroutine was invoked. Be aware that the optimizer might have optimized call frames away before C had a chance to get the information. That means that C might not return information about the call frame you expect it to do, for -C 1>. In particular, C<@DB::args> might have information from the +C<< N > 1 >>. In particular, C<@DB::args> might have information from the previous time C was called. 
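As a rough illustration of the list returned by caller, the sketch below reads only the first four elements. The subroutine names C<whoami> and C<outer> are hypothetical; the C<$hints> and C<$bitmask> elements are deliberately ignored, since the text says they are not meant for external use.

```perl
use strict;
use warnings;

# Hypothetical helper: describe the frame that called us.
sub whoami {
    # caller(0) describes the current subroutine call: who called it,
    # from which file and line, and under what fully qualified name.
    my ($package, $filename, $line, $subroutine) = caller(0);
    return "$subroutine called from $package at line $line";
}

sub outer { return whoami() }

my $report = outer();
print "$report\n";
```

Asking for frame 0 keeps the example away from the caveat about optimized-away frames, which applies to deeper frames.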
=item chdir EXPR @@ -995,8 +1011,8 @@ lookup: Outside an C, prints the value of LIST to C and exits with the current value of C<$!> (errno). If C<$!> is C<0>, -exits with the value of C<($? EE 8)> (backtick `command` -status). If C<($? EE 8)> is C<0>, exits with C<255>. Inside +exits with the value of C<<< ($? >> 8) >>> (backtick `command` +status). If C<<< ($? >> 8) >>> is C<0>, exits with C<255>. Inside an C the error message is stuffed into C<$@> and the C is terminated with the undefined value. This makes C the way to raise an exception. @@ -1210,12 +1226,12 @@ as terminals may lose the end-of-file condition if you do. An C without an argument uses the last file read. Using C with empty parentheses is very different. It refers to the pseudo file formed from the files listed on the command line and accessed via the -CE> operator. Since CE> isn't explicitly opened, -as a normal filehandle is, an C before CE> has been +C<< <> >> operator. Since C<< <> >> isn't explicitly opened, +as a normal filehandle is, an C before C<< <> >> has been used will cause C<@ARGV> to be examined to determine if input is available. -In a CE)> loop, C or C can be used to +In a C<< while (<>) >> loop, C or C can be used to detect the end of each file, C will only detect the end of the last file. Examples: @@ -1458,7 +1474,7 @@ operation is a hash or array key lookup or subroutine name: Although the deepest nested array or hash will not spring into existence just because its existence was tested, any intervening ones will. -Thus C<$ref-E{"A"}> and C<$ref-E{"A"}-E{"B"}> will spring +Thus C<< $ref->{"A"} >> and C<< $ref->{"A"}->{"B"} >> will spring into existence due to the existence test for the $key element above. This happens anywhere the arrow operator is used, including even: @@ -1470,8 +1486,8 @@ This surprising autovivification in what does not at first--or even second--glance appear to be an lvalue context may be fixed in a future release. 
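The autovivification described above can be observed directly. The following is a minimal sketch with hypothetical variable names; it also shows one common way to avoid the side effect, namely testing each level before descending into it.

```perl
use strict;
use warnings;

my $ref = {};

# Testing a deeply nested key springs the intervening hashes into existence.
my $found = exists $ref->{"A"}->{"B"}->{"key"};

# "key" itself was not created, but both intermediate levels now exist.
my $level_a_exists = exists $ref->{"A"};
my $level_b_exists = exists $ref->{"A"}->{"B"};

# Checking one level at a time leaves the structure untouched.
my $safe = {};
my $safe_found = exists($safe->{"A"})
    && exists($safe->{"A"}->{"B"})
    && exists($safe->{"A"}->{"B"}->{"key"});

print "intermediate levels created: ",
    ($level_a_exists && $level_b_exists ? "yes" : "no"), "\n";
print "keys in \$safe: ", scalar(keys %$safe), "\n";
```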
-See L for specifics on how exists() acts when -used on a pseudo-hash. +See L for specifics +on how exists() acts when used on a pseudo-hash. Use of a subroutine call, rather than a subroutine name, as an argument to exists() is an error. @@ -1926,10 +1942,13 @@ Returns the socket option requested, or undef if there is an error. Returns the value of EXPR with filename expansions such as the standard Unix shell F would do. This is the internal function -implementing the C*.cE> operator, but you can use it directly. -If EXPR is omitted, C<$_> is used. The C*.cE> operator is +implementing the C<< <*.c> >> operator, but you can use it directly. +If EXPR is omitted, C<$_> is used. The C<< <*.c> >> operator is discussed in more detail in L. +Beginning with v5.6.0, this operator is implemented using the standard +C extension. See L for details. + =item gmtime EXPR Converts a time as returned by the time function to a 9-element list @@ -2099,7 +2118,7 @@ Implements the ioctl(2) function. You'll probably first have to say to get the correct function definitions. If F doesn't exist or doesn't have the correct definitions you'll have to roll your -own, based on your C header files such as Fsys/ioctl.hE>. +own, based on your C header files such as F<< >>. (There is a Perl script called B that comes with the Perl kit that may help you in this, but it's nontrivial.) SCALAR will be read and/or written depending on the FUNCTION--a pointer to the string value of SCALAR @@ -2452,10 +2471,10 @@ and C documentation. =item msgsnd ID,MSG,FLAGS Calls the System V IPC function msgsnd to send the message MSG to the -message queue ID. MSG must begin with the long integer message type, -which may be created with C. Returns true if -successful, or false if there is an error. See also C -and C documentation. +message queue ID. MSG must begin with the native long integer message +type, which may be created with C. Returns true if +successful, or false if there is an error. 
See also C and +C documentation. =item msgrcv ID,VAR,SIZE,TYPE,FLAGS @@ -2528,7 +2547,7 @@ to be converted into a file mode, for example. (Although perl will automatically convert strings into numbers as needed, this automatic conversion assumes base 10.) -=item open FILEHANDLE,MODE,EXPR +=item open FILEHANDLE,MODE,LIST =item open FILEHANDLE,EXPR @@ -2543,24 +2562,24 @@ for this purpose; so if you're using C, specify EXPR in your call to open.) See L for a kinder, gentler explanation of opening files. -If MODE is C<'E'> or nothing, the file is opened for input. -If MODE is C<'E'>, the file is truncated and opened for -output, being created if necessary. If MODE is C<'EE'>, +If MODE is C<< '<' >> or nothing, the file is opened for input. +If MODE is C<< '>' >>, the file is truncated and opened for +output, being created if necessary. If MODE is C<<< '>>' >>>, the file is opened for appending, again being created if necessary. -You can put a C<'+'> in front of the C<'E'> or C<'E'> to indicate that -you want both read and write access to the file; thus C<'+E'> is almost -always preferred for read/write updates--the C<'+E'> mode would clobber the +You can put a C<'+'> in front of the C<< '>' >> or C<< '<' >> to indicate that +you want both read and write access to the file; thus C<< '+<' >> is almost +always preferred for read/write updates--the C<< '+>' >> mode would clobber the file first. You can't usually use either read-write mode for updating textfiles, since they have variable length records. See the B<-i> switch in L for a better approach. The file is created with permissions of C<0666> modified by the process' C value. -These various prefixes correspond to the fopen(3) modes of C<'r'>, C<'r+'>, C<'w'>, -C<'w+'>, C<'a'>, and C<'a+'>. +These various prefixes correspond to the fopen(3) modes of C<'r'>, C<'r+'>, +C<'w'>, C<'w+'>, C<'a'>, and C<'a+'>. 
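The mode prefixes can be tried out with the 3-argument form of open. This is a sketch under stated assumptions: the scratch directory comes from File::Temp, and the file name and handle names are hypothetical. The comments name the fopen(3) mode each prefix corresponds to.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/modes.txt";    # hypothetical scratch file

# '>' creates or truncates the file, like fopen(3) mode "w".
open(my $write_fh, '>', $path) or die "cannot write $path: $!";
print {$write_fh} "first\n";
close($write_fh) or die "close failed: $!";

# '>>' appends to the file, like mode "a".
open(my $append_fh, '>>', $path) or die "cannot append to $path: $!";
print {$append_fh} "second\n";
close($append_fh) or die "close failed: $!";

# '+<' gives read/write access without truncating, like mode "r+".
open(my $update_fh, '+<', $path) or die "cannot update $path: $!";
my @lines = <$update_fh>;
close($update_fh);

# '+>' also gives read/write access but clobbers the file first, like "w+".
open(my $clobber_fh, '+>', $path) or die "cannot clobber $path: $!";
my @after_clobber = <$clobber_fh>;
close($clobber_fh);

print "lines before clobber: ", scalar(@lines), "\n";
print "lines after clobber:  ", scalar(@after_clobber), "\n";
```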
In the 2-arguments (and 1-argument) form of the call the mode and filename should be concatenated (in this order), possibly separated by -spaces. It is possible to omit the mode if the mode is C<'E'>. +spaces. It is possible to omit the mode if the mode is C<< '<' >>. If the filename begins with C<'|'>, the filename is interpreted as a command to which output is to be piped, and if the filename ends with a @@ -2580,7 +2599,7 @@ that pipes both in I out, but see L, L, and L for alternatives.) In the 2-arguments (and 1-argument) form opening C<'-'> opens STDIN -and opening C<'E-'> opens STDOUT. +and opening C<< '>-' >> opens STDOUT. Open returns nonzero upon success, the undefined value otherwise. If the C @@ -2652,10 +2671,10 @@ Examples: } You may also, in the Bourne shell tradition, specify an EXPR beginning -with C<'E&'>, in which case the rest of the string is interpreted as the +with C<< '>&' >>, in which case the rest of the string is interpreted as the name of a filehandle (or file descriptor, if numeric) to be -duped and opened. You may use C<&> after C>, CE>, -C>, C<+E>, C<+EE>, and C<+E>. The +duped and opened. You may use C<&> after C<< > >>, C<<< >> >>>, +C<< < >>, C<< +> >>, C<<< +>> >>>, and C<< +< >>. The mode you specify should match the mode of the original filehandle. (Duping a filehandle does not take into account any existing contents of stdio buffers.) Duping file handles is not yet supported for 3-argument @@ -2686,7 +2705,7 @@ STDERR: print STDOUT "stdout 2\n"; print STDERR "stderr 2\n"; -If you specify C<'E&=N'>, where C is a number, then Perl will do an +If you specify C<< '<&=N' >>, where C is a number, then Perl will do an equivalent of C's C of that file descriptor; this is more parsimonious of file descriptors. For example: @@ -3082,10 +3101,10 @@ are inherently non-portable between processors and operating systems because they obey the native byteorder and endianness. 
For example a 4-byte integer 0x12345678 (305419896 decimal) would be ordered natively (arranged in and handled by the CPU registers) into bytes as - + 0x12 0x34 0x56 0x78 # big-endian 0x78 0x56 0x34 0x12 # little-endian - + Basically, the Intel, Alpha, and VAX CPUs are little-endian, while everybody else, for example Motorola m68k/88k, PPC, Sparc, HP PA, Power, and Cray are big-endian. MIPS can be either: Digital used it @@ -3095,12 +3114,12 @@ The names `big-endian' and `little-endian' are comic references to the classic "Gulliver's Travels" (via the paper "On Holy Wars and a Plea for Peace" by Danny Cohen, USC/ISI IEN 137, April 1, 1980) and the egg-eating habits of the Lilliputians. - + Some systems may have even weirder byte orders such as - + 0x56 0x78 0x12 0x34 0x34 0x12 0x78 0x56 - + You can see your system's preference with print join(" ", map { sprintf "%#02x", $_ } @@ -3422,8 +3441,8 @@ When C<$/> is set to C, when readline() is in scalar context (i.e. file slurp mode), and when an empty file is read, it returns C<''> the first time, followed by C subsequently. -This is the internal function implementing the CEXPRE> -operator, but you can use it directly. The CEXPRE> +This is the internal function implementing the C<< >> +operator, but you can use it directly. The C<< >> operator is discussed in more detail in L. $line = ; @@ -3549,9 +3568,18 @@ rename(2) manpage or equivalent system documentation for details. =item require Demands some semantics specified by EXPR, or by C<$_> if EXPR is not -supplied. If a version number or tuple is specified, or if EXPR is -numeric, demands that the current version of Perl -(C<$^V> or C<$]> or $PERL_VERSION) be equal or greater than EXPR. +supplied. + +If a VERSION is specified as a literal of the form v5.6.1, +demands that the current version of Perl (C<$^V> or $PERL_VERSION) be +at least as recent as that version, at run time. 
(For compatibility +with older versions of Perl, a numeric argument will also be interpreted +as VERSION.) Compare with L, which can do a similar check at +compile time. + + require v5.6.1; # run time version check + require 5.6.1; # ditto + require 5.005_03; # float version allowed for compatibility Otherwise, demands that a library file be included if it hasn't already been included. The file is included via the do-FILE mechanism, which is @@ -3750,7 +3778,7 @@ This is also useful for applications emulating C. Once you hit EOF on your read, and then sleep for a while, you might have to stick in a seek() to reset things. The C doesn't change the current position, but it I clear the end-of-file condition on the handle, so that the -next CFILEE> makes Perl try again to read something. We hope. +next C<< >> makes Perl try again to read something. We hope. If that doesn't work (some stdios are particularly cantankerous), then you may need something more like this: @@ -3844,7 +3872,7 @@ You can effect a sleep of 250 milliseconds this way: select(undef, undef, undef, 0.25); B: One should not attempt to mix buffered I/O (like C -or EFHE) with C, except as permitted by POSIX, and even then only on POSIX systems. You have to use C instead. =item semctl ID,SEMNUM,CMD,ARG @@ -3855,9 +3883,11 @@ Calls the System V IPC function C. You'll probably have to say first to get the correct constant definitions. If CMD is IPC_STAT or GETALL, then ARG must be a variable which will hold the returned -semid_ds structure or semaphore value array. Returns like C: the -undefined value for error, "C<0 but true>" for zero, or the actual return -value otherwise. See also C and C documentation. +semid_ds structure or semaphore value array. Returns like C: +the undefined value for error, "C<0 but true>" for zero, or the actual +return value otherwise. The ARG must consist of a vector of native +short integers, which may be created with C. +See also C and C documentation. 
=item semget KEY,NSEMS,FLAGS @@ -4053,7 +4083,7 @@ Sorts the LIST and returns the sorted list value. If SUBNAME or BLOCK is omitted, Cs in standard string comparison order. If SUBNAME is specified, it gives the name of a subroutine that returns an integer less than, equal to, or greater than C<0>, depending on how the elements -of the list are to be ordered. (The C=E> and C +of the list are to be ordered. (The C<< <=> >> and C operators are extremely useful in such routines.) SUBNAME may be a scalar variable name (unsubscripted), in which case the value provides the name of (or a reference to) the actual subroutine to use. In place @@ -4149,7 +4179,7 @@ Examples: || $a->[2] cmp $b->[2] } map { [$_, /=(\d+)/, uc($_)] } @old; - + # using a prototype allows you to use any comparison subroutine # as a sort subroutine (including other package's subroutines) package other; @@ -4337,10 +4367,6 @@ In addition, Perl permits the following widely-supported conversions: %n special: *stores* the number of characters output so far into the next variable in the parameter list -And the following Perl-specific conversion: - - %v a string, output as a tuple of integers ("Perl" is 80.101.114.108) - Finally, for backward (and we do mean "backward") compatibility, Perl permits these unnecessary but widely-supported conversions: @@ -4366,9 +4392,13 @@ and the conversion letter: h interpret integer as C type "short" or "unsigned short" If no flags, interpret integer as C type "int" or "unsigned" -There is also one Perl-specific flag: +There are also two Perl-specific flags: V interpret integer as Perl's standard integer type + v interpret string as a vector of integers, output as + numbers separated either by dots, or by an arbitrary + string received from the argument list when the flag + is preceded by C<*> Where a number would appear in the flags, an asterisk (C<*>) may be used instead, in which case Perl uses the next item in the parameter @@ -4376,6 +4406,13 @@ list as the given 
number (that is, as the field width or precision). If a field width obtained through C<*> is negative, it has the same effect as the C<-> flag: left-justification. +The C flag is useful for displaying ordinal values of characters +in arbitrary strings: + + printf "version is v%vd\n", $^V; # Perl's version + printf "address is %*vX\n", ":", $addr; # IPv6 address + printf "bits are %*vb\n", " ", $bits; # random bitstring + If C is in effect, the character used for the decimal point in formatted real numbers is affected by the LC_NUMERIC locale. See L. @@ -4397,7 +4434,7 @@ For example You can find out whether your Perl supports quads via L: use Config; - ($Config{use64bits} eq 'define' || $Config{longsize} == 8) && + ($Config{use64bitint} eq 'define' || $Config{longsize} == 8) && print "quads\n"; If Perl understands "long doubles" (this requires that the platform @@ -4439,7 +4476,7 @@ the F device) or based on the current time and process ID, among other things. In versions of Perl prior to 5.004 the default seed was just the current C, which can do a +similar check at run time. - use 5.005_03; # version number - use v5.6.0; # version tuple + use v5.6.1; # compile time version check + use 5.6.1; # ditto + use 5.005_03; # float version allowed for compatibility This is often useful if you need to check the current Perl version before Cing library modules that have changed in incompatible ways from @@ -5271,9 +5315,12 @@ That is exactly equivalent to If the VERSION argument is present between Module and LIST, then the C will call the VERSION method in class Module with the given version as an argument. The default VERSION method, inherited from -the Universal class, croaks if the given version is larger than the -value of the variable C<$Module::VERSION>. (Note that there is not a -comma after VERSION!) +the UNIVERSAL class, croaks if the given version is larger than the +value of the variable C<$Module::VERSION>. 
+ +Again, there is a distinction between omitting LIST (C called +with no arguments) and an explicit empty LIST C<()> (C not +called). Note that there is no comma after VERSION! Because this is a wide-open interface, pragmas (compiler directives) are also implemented this way. Currently implemented pragmas are: