X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperlport.pod;h=c838264f3a23139e1444463c28a1b586eb4a6fcb;hb=9296fdfa58e48c98d4f73d3f3c7220275ccf66e3;hp=9a76d240c0244b5c71c6cd28a58a73dcd08b68b9;hpb=3fd80bd61943d0f83ebe1ce9c42b05023a1b7a18;p=p5sagit%2Fp5-mst-13.2.git diff --git a/pod/perlport.pod b/pod/perlport.pod index 9a76d24..c838264 100644 --- a/pod/perlport.pod +++ b/pod/perlport.pod @@ -67,9 +67,9 @@ The important thing is to decide where the code will run and to be deliberate in your decision. The material below is separated into three main sections: main issues of -portability (L<"ISSUES">, platform-specific issues (L<"PLATFORMS">, and +portability (L<"ISSUES">), platform-specific issues (L<"PLATFORMS">), and built-in perl functions that behave differently on various ports -(L<"FUNCTION IMPLEMENTATIONS">. +(L<"FUNCTION IMPLEMENTATIONS">). This information should not be considered complete; it includes possibly transient information about idiosyncrasies of some of the ports, almost @@ -107,8 +107,8 @@ newlines: You can get away with this on Unix and Mac OS (they have a single character end-of-line), but the same program will break under DOSish perls because you're only chop()ing half the end-of-line. Instead, -chomp() should be used to trim newlines. The Dunce::Files module can -help audit your code for misuses of chop(). +chomp() should be used to trim newlines. The L<Dunce::Files> module +can help audit your code for misuses of chop(). When dealing with binary files (or text files in binary mode) be sure to explicitly set $/ to the appropriate value for your file format @@ -188,12 +188,12 @@ The Unix column assumes that you are not accessing a serial line "\n", and "\n" on output becomes CRLF. These are just the most common definitions of C<\n> and C<\r> in Perl. -There may well be others.
For example, on an EBCDIC implementation such -as z/OS or OS/400 the above material is similar to "Unix" but the code -numbers change: +There may well be others. For example, on an EBCDIC implementation +such as z/OS (OS/390) or OS/400 (using the ILE, the PASE is ASCII-based) +the above material is similar to "Unix" but the code numbers change: - LF eq \025 eq \x15 eq chr(21) eq CP-1047 21 - LF eq \045 eq \x25 eq \cU eq chr(37) eq CP-0037 37 + LF eq \025 eq \x15 eq \cU eq chr(21) eq CP-1047 21 + LF eq \045 eq \x25 eq chr(37) eq CP-0037 37 CR eq \015 eq \x0D eq \cM eq chr(13) eq CP-1047 13 CR eq \015 eq \x0D eq \cM eq chr(13) eq CP-0037 13 @@ -224,6 +224,10 @@ them in big-endian mode. To avoid this problem in network (socket) connections use the C<pack> and C<unpack> formats C<n> and C<N>, the "network" orders. These are guaranteed to be portable. +As of perl 5.9.2, you can also use the C<E<gt>> and C<E<lt>> modifiers +to force big- or little-endian byte-order. This is useful if you want +to store signed integers or 64-bit integers, for example. + You can explore the endianness of your platform by unpacking a data structure packed in native format such as: @@ -404,10 +408,12 @@ interaction. A program requiring a command line interface might not work everywhere. This is probably for the user of the program to deal with, so don't stay up late worrying about it. -Some platforms can't delete or rename files held open by the system. -Remember to C<close> files when you are done with them. Don't -C<unlink> or C<rename> an open file. Don't C<tie> or C<open> a -file already tied or opened; C<untie> or C<close> it first. +Some platforms can't delete or rename files held open by the system, +this limitation may also apply to changing filesystem metainformation +like file permissions or owners. Remember to C<close> files when you +are done with them. Don't C<unlink> or C<rename> an open file. Don't +C<tie> or C<open> a file already tied or opened; C<untie> or C<close> +it first. Don't open the same file more than once at a time for writing, as some operating systems put mandatory locks on such files.
@@ -446,7 +452,12 @@ C instead. Don't count on per-program environment variables, or per-program current directories. -Don't count on specific values of C<$!>. +Don't count on specific values of C<$!>, neither numeric nor +especially the strings values-- users may switch their locales causing +error messages to be translated into their languages. If you can +trust a POSIXish environment, you can portably use the symbols defined +by the Errno module, like ENOENT. And don't trust on the values of C<$!> +at all except immediately after a failed system call. =head2 Command names versus file pathnames @@ -469,17 +480,55 @@ file name. To convert $^X to a file pathname, taking account of the requirements of the various operating system possibilities, say: + use Config; $thisperl = $^X; if ($^O ne 'VMS') {$thisperl .= $Config{_exe} unless $thisperl =~ m/$Config{_exe}$/i;} To convert $Config{perlpath} to a file pathname, say: + use Config; $thisperl = $Config{perlpath}; if ($^O ne 'VMS') {$thisperl .= $Config{_exe} unless $thisperl =~ m/$Config{_exe}$/i;} +=head2 Networking + +Don't assume that you can reach the public Internet. + +Don't assume that there is only one way to get through firewalls +to the public Internet. + +Don't assume that you can reach outside world through any other port +than 80, or some web proxy. ftp is blocked by many firewalls. + +Don't assume that you can send email by connecting to the local SMTP port. + +Don't assume that you can reach yourself or any node by the name +'localhost'. The same goes for '127.0.0.1'. You will have to try both. + +Don't assume that the host has only one network card, or that it +can't bind to many virtual IP addresses. + +Don't assume a particular network device name. + +Don't assume a particular set of ioctl()s will work. + +Don't assume that you can ping hosts and get replies. + +Don't assume that any particular port (service) will respond. 
+ +Don't assume that Sys::Hostname (or any other API or command) +returns either a fully qualified hostname or a non-qualified hostname: +it all depends on how the system had been configured. Also remember +things like DHCP and NAT-- the hostname you get back might not be very +useful. + +All the above "don't":s may look daunting, and they are -- but the key +is to degrade gracefully if one cannot reach the particular network +service one wants. Croaking or hanging do not look very professional. + =head2 Interprocess Communication (IPC) In general, don't directly access the system in code meant to be @@ -564,16 +613,24 @@ work with any DBM module. See L for more details. The system's notion of time of day and calendar date is controlled in widely different ways. Don't assume the timezone is stored in C<$ENV{TZ}>, and even if it is, don't assume that you can control the timezone through -that variable. +that variable. Don't assume anything about the three-letter timezone +abbreviations (for example that MST would be the Mountain Standard Time, +it's been known to stand for Moscow Standard Time). If you need to +use timezones, express them in some unambiguous format like the +exact number of minutes offset from UTC, or the POSIX timezone +format. Don't assume that the epoch starts at 00:00:00, January 1, 1970, -because that is OS- and implementation-specific. It is better to store a date -in an unambiguous representation. The ISO-8601 standard defines -"YYYY-MM-DD" as the date format. A text representation (like "1987-12-18") -can be easily converted into an OS-specific value using a module like -Date::Parse. An array of values, such as those returned by -C, can be converted to an OS-specific representation using -Time::Local. +because that is OS- and implementation-specific. It is better to +store a date in an unambiguous representation. 
The ISO 8601 standard +defines YYYY-MM-DD as the date format, or YYYY-MM-DDTHH-MM-SS +(that's a literal "T" separating the date from the time). +Please do use the ISO 8601 instead of making us to guess what +date 02/03/04 might be. ISO 8601 even sorts nicely as-is. +A text representation (like "1987-12-18") can be easily converted +into an OS-specific value using a module like Date::Parse. +An array of values, such as those returned by C, can be +converted to an OS-specific representation using Time::Local. When calculating specific times, such as for tests in time or date modules, it may be appropriate to calculate an offset for the epoch. @@ -585,6 +642,9 @@ The value for C<$offset> in Unix will be C<0>, but in Mac OS will be some large number. C<$offset> can then be added to a Unix time value to get what should be the proper value on any system. +On Windows (at least), you shouldn't pass a negative value to C or +C. + =head2 Character sets and character encoding Assume very little about character sets. @@ -598,9 +658,9 @@ Do not assume that the alphabetic characters are encoded contiguously Do not assume anything about the ordering of the characters. The lowercase letters may come before or after the uppercase letters; -the lowercase and uppercase may be interlaced so that both `a' and `A' -come before `b'; the accented and other international characters may -be interlaced so that E comes before `b'. +the lowercase and uppercase may be interlaced so that both "a" and "A" +come before "b"; the accented and other international characters may +be interlaced so that E comes before "b". =head2 Internationalisation @@ -611,6 +671,25 @@ or at least more convenient and native-friendly for non-English users. The system affects character sets and encoding, and date and time formatting--amongst other things. +If you really want to be international, you should consider Unicode. +See L and L for more information. 
+ +If you want to use non-ASCII bytes (outside the bytes 0x00..0x7f) in +the "source code" of your code, to be portable you have to be explicit +about what bytes they are. Someone might for example be using your +code under a UTF-8 locale, in which case random native bytes might be +illegal ("Malformed UTF-8 ...") This means that for example embedding +ISO 8859-1 bytes beyond 0x7f into your strings might cause trouble +later. If the bytes are native 8-bit bytes, you can use the C<bytes> pragma. If the bytes are in a string (regular expression being a +curious string), you can often also use the C<\xHH> notation instead +of embedding the bytes as-is. If they are in some particular legacy +encoding (ether single-byte or something more complicated), you can +use the C<encoding> pragma. (If you want to write your code in UTF-8, +you can use either the C<utf8> pragma, or the C<encoding> pragma.) +The C<bytes> and C<utf8> pragmata are available since Perl 5.6.0, and +the C<encoding> pragma since Perl 5.8.0. + =head2 System Resources If your code is destined for systems with severely constrained (or @@ -672,12 +751,14 @@ Be careful in the tests you supply with your module or programs. Module code may be fully portable, but its tests might not be. This often happens when tests spawn off other processes or call external programs to aid in the testing, or when (as noted above) the tests -assume certain things about the filesystem and paths. Be careful -not to depend on a specific output style for errors, such as when -checking C<$!> after a system call. Some platforms expect a certain -output format, and perl on those platforms may have been adjusted -accordingly. Most specifically, don't anchor a regex when testing -an error value. +assume certain things about the filesystem and paths. Be careful not +to depend on a specific output style for errors, such as when checking +C<$!> after a failed system call.
Using C<$!> for anything else than +displaying it as output is doubtful (though see the Errno module for +testing reasonably portably for error value). Some platforms expect +a certain output format, and Perl on those platforms may have been +adjusted accordingly. Most specifically, don't anchor a regex when +testing an error value. =head1 CPAN Testers @@ -691,11 +772,17 @@ problems in their code that crop up because of lack of testing on other platforms; two, to provide users with information about whether a given module works on a given platform. +Also see: + =over 4 -=item Mailing list: cpan-testers@perl.org +=item * + +Mailing list: cpan-testers@perl.org + +=item * -=item Testing results: http://testers.cpan.org/ +Testing results: http://testers.cpan.org/ =back @@ -818,10 +905,11 @@ DOSish perls are as follows: Windows NT MSWin32 MSWin32-x86 2 4 xx Windows NT MSWin32 MSWin32-ALPHA 2 4 xx Windows NT MSWin32 MSWin32-ppc 2 4 xx - Windows 2000 MSWin32 MSWin32-x86 2 5 xx - Windows XP MSWin32 MSWin32-x86 2 ? + Windows 2000 MSWin32 MSWin32-x86 2 5 00 + Windows XP MSWin32 MSWin32-x86 2 5 01 + Windows 2003 MSWin32 MSWin32-x86 2 5 02 Windows CE MSWin32 ? 3 - Cygwin cygwin ? + Cygwin cygwin cygwin The various MSWin32 Perl's can distinguish the OS they are running on via the value of the fifth element of the list returned from @@ -960,6 +1048,10 @@ The MacPerl Pages, http://www.macperl.com/ . The MacPerl mailing lists, http://lists.perl.org/ . +=item * + +MPW, ftp://ftp.apple.com/developer/Tool_Chest/Core_Mac_OS_Tools/ + =back =head2 VMS @@ -1027,7 +1119,7 @@ native formats. What C<\n> represents depends on the type of file opened. It usually represents C<\012> but it could also be C<\015>, C<\012>, C<\015\012>, -C<\000>, C<\040>, or nothing depending on the file organiztion and +C<\000>, C<\040>, or nothing depending on the file organization and record format. 
The VMS::Stdio module provides access to the special fopen() requirements of files with unusual attributes on VMS. @@ -1136,7 +1228,9 @@ Character Code Set ID 0037 for OS/400 and either 1047 or POSIX-BC for S/390 systems). On the mainframe perl currently works under the "Unix system services for OS/390" (formerly known as OpenEdition), VM/ESA OpenEdition, or the BS200 POSIX-BC system (BS2000 is supported in perl 5.6 and greater). -See L for details. +See L for details. Note that for OS/400 there is also a port of +Perl 5.8.1/5.9.0 or later to the PASE which is ASCII-based (as opposed to +ILE which is EBCDIC-based), see L. As of R2.5 of USS for OS/390 and Version 2.3 of VM/ESA these Unix sub-systems do not support the C<#!> shebang trick for script invocation. @@ -1207,8 +1301,6 @@ Also see: =item * -* - L, F, F, F, L. @@ -1218,7 +1310,7 @@ The perl-mvs@perl.org list is for discussion of porting issues as well as general usage issues for all EBCDIC Perls. Send a message body of "subscribe perl-mvs" to majordomo@perl.org. -=item * +=item * AS/400 Perl information at http://as400.rochester.ibm.com/ @@ -1406,10 +1498,6 @@ L for a full description of available variables. =over 8 -=item -X FILEHANDLE - -=item -X EXPR - =item -X C<-r>, C<-w>, and C<-x> have a limited meaning only; directories @@ -1448,13 +1536,18 @@ suffixes. C<-S> is meaningless. (Win32) C<-x> (or C<-X>) determine if a file has an executable file type. (S) -=item alarm SECONDS +=item atan2 -=item alarm +Due to issues with various CPUs, math libraries, compilers, and standards, +results for C may vary depending on any combination of the above. +Perl attempts to conform to the Open Group/IEEE standards for the results +returned from C, but cannot force the issue if the system Perl is +run on does not allow it. (Tru64, HP-UX 10.20) -Not implemented. (Win32) +The current version of the standards for C is available at +L. -=item binmode FILEHANDLE +=item binmode Meaningless. 
(S, S) @@ -1465,7 +1558,7 @@ filehandle may be closed, or pointer may be in a different position. The value returned by C may be affected after the call, and the filehandle may be flushed. (Win32) -=item chmod LIST +=item chmod Only limited meaning. Disabling/enabling write permission is mapped to locking/unlocking the file. (S) @@ -1480,7 +1573,7 @@ Access permissions are mapped onto VOS access-control list changes. (VOS) The actual permissions set depend on the value of the C in the SYSTEM environment settings. (Cygwin) -=item chown LIST +=item chown Not implemented. (S, Win32, S, S) @@ -1488,26 +1581,24 @@ Does nothing, but won't fail. (Win32) A little funky, because VOS's notion of ownership is a little funky (VOS). -=item chroot FILENAME - =item chroot Not implemented. (S, Win32, VMS, S, S, VOS, VM/ESA) -=item crypt PLAINTEXT,SALT +=item crypt May not be available if library or source was not provided when building perl. (Win32) -=item dbmclose HASH +=item dbmclose Not implemented. (VMS, S, VOS) -=item dbmopen HASH,DBNAME,MODE +=item dbmopen Not implemented. (VMS, S, VOS) -=item dump LABEL +=item dump Not useful. (S, S) @@ -1515,7 +1606,7 @@ Not implemented. (Win32) Invokes VMS debugger. (VMS) -=item exec LIST +=item exec Not implemented. (S) @@ -1524,8 +1615,6 @@ Implemented via Spawn. (VM/ESA) Does not automatically flush output handles on some platforms. (SunOS, Solaris, HP-UX) -=item exit EXPR - =item exit Emulates UNIX exit() (which considers C to indicate an error) by @@ -1535,11 +1624,11 @@ function, C is also mapped to an exit status of SS$_NORMAL (C<1>); this mapping cannot be overridden. Any other argument to exit() is used directly as Perl's exit status. (VMS) -=item fcntl FILEHANDLE,FUNCTION,SCALAR +=item fcntl Not implemented. (Win32, VMS) -=item flock FILEHANDLE,OPERATION +=item flock Not implemented (S, VMS, S, VOS). @@ -1558,7 +1647,7 @@ Does not automatically flush output handles on some platforms. Not implemented. 
(S, S) -=item getpgrp PID +=item getpgrp Not implemented. (S, Win32, VMS, S) @@ -1566,43 +1655,43 @@ Not implemented. (S, Win32, VMS, S) Not implemented. (S, Win32, S) -=item getpriority WHICH,WHO +=item getpriority Not implemented. (S, Win32, VMS, S, VOS, VM/ESA) -=item getpwnam NAME +=item getpwnam Not implemented. (S, Win32) Not useful. (S) -=item getgrnam NAME +=item getgrnam Not implemented. (S, Win32, VMS, S) -=item getnetbyname NAME +=item getnetbyname Not implemented. (S, Win32, S) -=item getpwuid UID +=item getpwuid Not implemented. (S, Win32) Not useful. (S) -=item getgrgid GID +=item getgrgid Not implemented. (S, Win32, VMS, S) -=item getnetbyaddr ADDR,ADDRTYPE +=item getnetbyaddr Not implemented. (S, Win32, S) -=item getprotobynumber NUMBER +=item getprotobynumber Not implemented. (S) -=item getservbyport PORT,PROTO +=item getservbyport Not implemented. (S) @@ -1614,6 +1703,11 @@ Not implemented. (S, Win32, VM/ESA) Not implemented. (S, Win32, VMS, VM/ESA) +=item gethostbyname + +C does not work everywhere: you may have +to use C. (S, S) + =item gethostent Not implemented. (S, Win32) @@ -1630,19 +1724,19 @@ Not implemented. (S, Win32, S) Not implemented. (Win32, S) -=item sethostent STAYOPEN +=item sethostent Not implemented. (S, Win32, S, S) -=item setnetent STAYOPEN +=item setnetent Not implemented. (S, Win32, S, S) -=item setprotoent STAYOPEN +=item setprotoent Not implemented. (S, Win32, S, S) -=item setservent STAYOPEN +=item setservent Not implemented. (S, Win32, S) @@ -1674,13 +1768,15 @@ Not implemented. (S, Win32) Not implemented. (S) -=item glob EXPR - =item glob This operator is implemented via the File::Glob extension on most platforms. See L for portability information. +=item gmtime + +Same portability caveats as L. + =item ioctl FILEHANDLE,FUNCTION,SCALAR Not implemented. (VMS) @@ -1690,7 +1786,7 @@ in the Winsock API does. (Win32) Available only for socket handles. 
(S) -=item kill SIGNAL, LIST +=item kill C is implemented for the sake of taint checking; use with other signals is unimplemented. (S) @@ -1704,7 +1800,7 @@ and makes it exit immediately with exit status $sig. As in Unix, if $sig is 0 and the specified process exists, it returns true without actually terminating it. (Win32) -=item link OLDFILE,NEWFILE +=item link Not implemented. (S, MPE/iX, VMS, S) @@ -1714,9 +1810,12 @@ Link count not updated because hard links are not quite that hard Hard links are implemented on Win32 (Windows NT and Windows 2000) under NTFS only. -=item lstat FILEHANDLE +=item localtime -=item lstat EXPR +Because Perl currently relies on the native standard C localtime() +function, it is only safe to use times between 0 and (2**31)-1. Times +outside this range may result in unexpected behavior depending on your +operating system's implementation of localtime(). =item lstat @@ -1724,19 +1823,17 @@ Not implemented. (VMS, S) Return values (especially for device and inode) may be bogus. (Win32) -=item msgctl ID,CMD,ARG +=item msgctl -=item msgget KEY,FLAGS +=item msgget -=item msgsnd ID,MSG,FLAGS +=item msgsnd -=item msgrcv ID,VAR,SIZE,TYPE,FLAGS +=item msgrcv Not implemented. (S, Win32, VMS, S, S, VOS) -=item open FILEHANDLE,EXPR - -=item open FILEHANDLE +=item open The C<|> variants are supported only if ToolServer is installed. (S) @@ -1746,17 +1843,19 @@ open to C<|-> and C<-|> are unsupported. (S, Win32, S) Opening a process does not automatically flush output handles on some platforms. (SunOS, Solaris, HP-UX) -=item pipe READHANDLE,WRITEHANDLE +=item pipe Very limited functionality. (MiNT) -=item readlink EXPR - =item readlink Not implemented. (Win32, VMS, S) -=item select RBITS,WBITS,EBITS,TIMEOUT +=item rename + +Can't move directories between directories on different logical volumes. (Win32) + +=item select Only implemented on sockets. (Win32, VMS) @@ -1764,11 +1863,11 @@ Only reliable on sockets. (S) Note that the C