X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperlport.pod;h=c1a5483add6795cb520dcce8ed01020d9811360a;hb=2b5ab1e742ea1b1374dcea7f6f90ef5c5cf29914;hp=8568c2515ae5cd6d2003492fdcd020322ffb7b9d;hpb=0a47030adea6675ff2e866534b32d11b2531fe9e;p=p5sagit%2Fp5-mst-13.2.git diff --git a/pod/perlport.pod b/pod/perlport.pod index 8568c25..c1a5483 100644 --- a/pod/perlport.pod +++ b/pod/perlport.pod @@ -84,7 +84,7 @@ should be considered a perpetual work in progress =head2 Newlines -In most operating systems, lines in files are separated with newlines. +In most operating systems, lines in files are terminated by newlines. Just what is used as a newline may vary from OS to OS. Unix traditionally uses C<\012>, one kind of Windows I/O uses C<\015\012>, and S uses C<\015>. @@ -148,8 +148,41 @@ And this example is actually better than the previous one even for Unix platforms, because now any C<\015>'s (C<\cM>'s) are stripped out (and there was much rejoicing). +An important thing to remember is that functions that return data +should translate newlines when appropriate. Often one line of code +will suffice: -=head2 Files + $data =~ s/\015?\012/\n/g; + return $data; + + +=head2 Numbers endianness and Width + +Different CPUs store integers and floating point numbers in different +orders (called I) and widths (32-bit and 64-bit being the +most common). This affects your programs if they attempt to transfer +numbers in binary format from a CPU architecture to another over some +channel: either 'live' via network connections or storing the numbers +to secondary storage such as a disk file. + +Conflicting storage orders make utter mess out of the numbers: if a +little-endian host (Intel, Alpha) stores 0x12345678 (305419896 in +decimal), a big-endian host (Motorola, MIPS, Sparc, PA) reads it as +0x78563412 (2018915346 in decimal). 
To avoid this problem in network +(socket) connections use the C<pack()> and C<unpack()> formats C<"n"> +and C<"N">, the "network" orders; they are guaranteed to be portable. + +Different widths can cause truncation even between platforms of equal +endianness: the platform of shorter width loses the upper parts of the +number. There is no good solution for this problem except to avoid +transferring or storing raw binary numbers. + +One can circumvent both of these problems in two ways: either +transfer and store numbers always in text format, instead of raw +binary, or consider using modules like C (included in +the standard distribution as of Perl 5.005) and C. + +=head2 Files and Filesystems Most platforms these days structure files in a hierarchical fashion. So, it is reasonably safe to assume that any platform supports the @@ -157,13 +190,32 @@ notion of a "path" to uniquely identify a file on the system. Just how that path is actually written, differs. While they are similar, file path specifications differ between Unix, -Windows, S, OS/2, VMS, S and probably others. Unix, for -example, is one of the few OSes that has the idea of a root directory. -S uses C<:> as a path separator instead of C. VMS, Windows, -and OS/2 can work similarly to Unix with C as path separator, or in -their own idiosyncratic ways. C perl can emulate Unix filenames -with C as path separator, or go native and use C<.> for path separator -and C<:> to signal filing systems and disc names. +Windows, S, OS/2, VMS, VOS, S and probably others. +Unix, for example, is one of the few OSes that has the idea of a single +root directory. + +VMS, Windows, and OS/2 can work similarly to Unix with C as path +separator, or in their own idiosyncratic ways (such as having several +root directories and various "unrooted" device files such as NIL: and +LPT:). + +S uses C<:> as a path separator instead of C. + +The filesystem may support neither hard links (C) nor +symbolic links (C, C, C). 
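As a minimal sketch of the byte-order point made in the numbers section of this patch (an illustration, not part of the patch itself), the following Perl packs the example value 0x12345678 with the "N" format and shows what a reader would see if it wrongly decoded the same bytes as little-endian. The "V" (little-endian) format is used here only to simulate that mismatch, and the variable names are illustrative.

```perl
use strict;
use warnings;

my $value = 0x12345678;                 # 305419896 in decimal

# "N" packs a 32-bit unsigned integer in big-endian ("network") order,
# regardless of the byte order of the host CPU.
my $wire = pack('N', $value);
my $hex  = unpack('H*', $wire);         # the four bytes as hex digits

# Decoding with the same format recovers the value on any host.
my ($round_trip) = unpack('N', $wire);

# Decoding the same bytes as little-endian ("V") simulates the
# mismatch described in the text.
my ($misread) = unpack('V', $wire);

printf "%s %d %d\n", $hex, $round_trip, $misread;
```

The misread value matches the 0x78563412 figure quoted in the text, which is why both ends of a connection should agree on "n" or "N" rather than on a native format.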
+ +The filesystem may support neither access timestamp nor change +timestamp (meaning that about the only portable timestamp is the +modification timestamp), nor one second granularity of any timestamps +(e.g. the FAT filesystem limits the time granularity to two seconds). + +VOS perl can emulate Unix filenames with C as path separator. The +native pathname characters greater-than, less-than, number-sign, and +percent-sign are always accepted. + +C perl can emulate Unix filenames with C as path +separator, or go native and use C<.> for path separator and C<:> to +signal filing systems and disc names. As with the newline problem above, there are modules that can help. The C modules provide methods to do the Right Thing on whatever @@ -191,13 +243,21 @@ Also of use is C, from the standard distribution, which splits a pathname into pieces (base filename, full path to directory, and file suffix). -Remember not to count on the existence of system-specific files, like -F. If code does need to rely on such a file, include a -description of the file and its format in the code's documentation, and -make it easy for the user to override the default location of the file. +Even when on a single platform (if you can call UNIX a single platform), +remember not to count on the existence or the contents of +system-specific files or directories, like F, +F, F, or even F. For +example, F may exist but it may not contain the encrypted +passwords because the system is using some form of enhanced security -- +or it may not contain all the accounts because the system is using NIS. +If code does need to rely on such a file, include a description of the +file and its format in the code's documentation, and make it easy for +the user to override the default location of the file. + +Don't assume a text file will end with a newline. 
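The advice about a missing final newline can be sketched as follows. This is an illustrative example only: the helper name read_lines is hypothetical, and the in-memory filehandle (available in Perl 5.8 and later, newer than the Perl this patch targets) merely stands in for a real file.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines of a filehandle without their
# terminators, whether or not the last line ends with a newline.
sub read_lines {
    my ($fh) = @_;
    my @lines;
    while (defined(my $line = <$fh>)) {
        chomp $line;    # removes a trailing newline only if one is present
        push @lines, $line;
    }
    return @lines;
}

my $text = "first\nsecond";    # note: no newline after the last line
open(my $fh, '<', \$text) or die "cannot open in-memory file: $!";
my @lines = read_lines($fh);
close($fh);

print scalar(@lines), " lines, last is '$lines[-1]'\n";
```

Because chomp is a no-op when no terminator is present, the same loop handles files that do end with a newline without dropping or corrupting the last line.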
Do not have two files of the same name with different case, like -F and , as many platforms have case-insensitive +F and F, as many platforms have case-insensitive filenames. Also, try not to have non-word characters (except for C<.>) in the names, and keep them to the 8.3 convention, for maximum portability. @@ -207,11 +267,17 @@ Likewise, if using C, try to keep the split functions to make it so the resulting files have a unique (case-insensitively) first 8 characters. -Don't assume C> won't be the first character of a filename. Always -use C> explicitly to open a file for reading: +There certainly can be whitespace in filenames. Many systems (DOS, +VMS) cannot have more than one C<"."> in their filenames. + +Don't assume C> won't be the first character of a filename. +Always use C> explicitly to open a file for reading. open(FILE, "<$existing_file") or die $!; +Actually, though, if filenames might use strange characters, it is +safest to open it with C instead of C, which is magic. + =head2 System Interaction @@ -241,6 +307,8 @@ C instead. Don't count on per-program environment variables, or per-program current directories. +Don't count on specific values of C<$!>. + =head2 Interprocess Communication (IPC) @@ -274,6 +342,9 @@ The rule of thumb for portable code is: Do it all in portable Perl, or use a module (that may internally implement it with platform-specific code, but expose a common interface). +The UNIX System V IPC (C) is not available +even in all UNIX platforms. + =head2 External Subroutines (XS) @@ -315,12 +386,37 @@ widely different ways. Don't assume the timezone is stored in C<$ENV{TZ}>, and even if it is, don't assume that you can control the timezone through that variable. -Don't assume that the epoch starts at January 1, 1970, because that is -OS-specific. Better to store a date in an unambiguous representation. -A text representation (like C<1 Jan 1970>) can be easily converted into an -OS-specific value using a module like C. 
An array of values, -such as those returned by C, can be converted to an OS-specific -representation using C. +Don't assume that the epoch starts at 00:00:00, January 1, 1970, +because that is OS-specific. Better to store a date in an unambiguous +representation. The ISO 8601 standard defines YYYY-MM-DD as the date +format. A text representation (like C<1 Jan 1970>) can be easily +converted into an OS-specific value using a module like +C. An array of values, such as those returned by +C, can be converted to an OS-specific representation using +C. + + +=head2 Character sets and character encoding + +Assume very little about character sets. Do not assume anything about +the numerical values (C, C) of characters. Do not +assume that the alphabetic characters are encoded contiguously (in +numerical sense). Do not assume anything about the ordering of the +characters. The lowercase letters may come before or after the +uppercase letters, the lowercase and uppercase may be interlaced so +that both 'a' and 'A' come before the 'b', the accented and other +international characters may be interlaced so that E comes +before the 'b'. + + +=head2 Internationalisation + +If you may assume POSIX (a rather large assumption, that in practice +means UNIX), you may read more about the POSIX locale system from +L. The locale system at least attempts to make things a +little bit more portable, or at least more convenient and +native-friendly for non-English users. The system affects character +sets and encoding, and date and time formatting, among other things. 
=head2 System Resources @@ -406,15 +502,18 @@ Unix flavors: uname $^O $Config{'archname'} ------------------------------------------- - AIX aix - FreeBSD freebsd - Linux linux - HP-UX hpux - OSF1 dec_osf + AIX aix aix + FreeBSD freebsd freebsd-i386 + Linux linux i386-linux + HP-UX hpux PA-RISC1.1 + IRIX irix irix + OSF1 dec_osf alpha-dec_osf SunOS solaris sun4-solaris SunOS solaris i86pc-solaris - SunOS4 sunos + SunOS4 sunos sun4-sunos +Note that because the C<$Config{'archname'}> may depend on the hardware +architecture it may vary quite a lot, much more than the C<$^O>. =head2 DOS and Derivatives @@ -478,7 +577,8 @@ Also see: =item The djgpp environment for DOS, C =item The EMX environment for DOS, OS/2, etc. C, -C +C or +C =item Build instructions for Win32, L. @@ -509,7 +609,7 @@ limited to 31 characters, and may include any character except C<:>, which is reserved as a path separator. Instead of C, see C and C in the -C module. +C module, or C and C. In the MacPerl application, you can't run a program from the command line; programs that expect C<@ARGV> to be populated can be edited with something @@ -544,10 +644,9 @@ the application or MPW tool version is running, check: $is_ppc = $MacPerl::Architecture eq 'MacPPC'; $is_68k = $MacPerl::Architecture eq 'Mac68K'; -S, to be based on NeXT's OpenStep OS, will be able to run -MacPerl natively (in the Blue Box, and even in the Yellow Box, once some -changes to the toolbox calls are made), but Unix perl will also run -natively. +S, to be based on NeXT's OpenStep OS, will (in theory) be able +to run MacPerl natively, but Unix perl will also run natively under the +built-in Unix environment. Also see: @@ -658,18 +757,84 @@ Put words C in message body. =back +=head2 VOS + +Perl on VOS is discussed in F in the perl distribution. 
+Note that perl on VOS can accept either VOS- or Unix-style file +specifications as in either of the following: + + $ perl -ne "print if /perl_setup/i" >system>notices + $ perl -ne "print if /perl_setup/i" /system/notices + +or even a mixture of both as in: + + $ perl -ne "print if /perl_setup/i" >system/notices + +Note that even though VOS allows the slash character to appear in object +names, because the VOS port of Perl interprets it as a pathname +delimiting character, VOS files, directories, or links whose names +contain a slash character cannot be processed. Such files must be +renamed before they can be processed by Perl. + +The following C functions are unimplemented on VOS, and any attempt by +Perl to use them will result in a fatal error message and an immediate +exit from Perl: dup, do_aspawn, do_spawn, fork, waitpid. Once these +functions become available in the VOS POSIX.1 implementation, you can +either recompile and rebind Perl, or you can download a newer port from +ftp.stratus.com. + +The value of C<$^O> on VOS is "VOS". To determine the architecture that +you are running on without resorting to loading all of C<%Config> you +can examine the content of the C<@INC> array like so: + + if (grep(/VOS/, @INC)) { + print "I'm on a Stratus box!\n"; + } else { + print "I'm not on a Stratus box!\n"; + die; + } + + if (grep(/860/, @INC)) { + print "This box is a Stratus XA/R!\n"; + } elsif (grep(/7100/, @INC)) { + print "This box is a Stratus HP 7100 or 8000!\n"; + } elsif (grep(/8000/, @INC)) { + print "This box is a Stratus HP 8000!\n"; + } else { + print "This box is a Stratus 68K...\n"; + } + +Also see: + +=over 4 + +=item L + +=item VOS mailing list + +There is no specific mailing list for Perl on VOS. You can post +comments to the comp.sys.stratus newsgroup, or subscribe to the general +Stratus mailing list. Send a letter with "Subscribe Info-Stratus" in +the message body to majordomo@list.stratagy.com. 
+ +=item VOS Perl on the web at C + +=back + + =head2 EBCDIC Platforms Recent versions of Perl have been ported to platforms such as OS/400 on -AS/400 minicomputers as well as OS/390 for IBM Mainframes. Such computers -use EBCDIC character sets internally (usually Character Code Set ID 00819 -for OS/400 and IBM-1047 for OS/390). Note that on the mainframe perl -currently works under the "Unix system services for OS/390" (formerly -known as OpenEdition). +AS/400 minicomputers as well as OS/390 & VM/ESA for IBM Mainframes. Such +computers use EBCDIC character sets internally (usually Character Code +Set ID 00819 for OS/400 and IBM-1047 for OS/390 & VM/ESA). Note that on +the mainframe perl currently works under the "Unix system services +for OS/390" (formerly known as OpenEdition) and VM/ESA OpenEdition. -As of R2.5 of USS for OS/390 that Unix sub-system did not support the -C<#!> shebang trick for script invocation. Hence, on OS/390 perl scripts -can executed with a header similar to the following simple script: +As of R2.5 of USS for OS/390 and Version 2.3 of VM/ESA these Unix +sub-systems do not support the C<#!> shebang trick for script invocation. +Hence, on OS/390 and VM/ESA perl scripts can be executed with a header +similar to the following simple script: : # use perl eval 'exec /usr/local/bin/perl -S $0 ${1+"$@"}' @@ -683,16 +848,18 @@ an effect on what happens with some perl functions (such as C, C, C, C, C, C, C, C), as well as bit-fiddling with ASCII constants using operators like C<^>, C<&> and C<|>, not to mention dealing with socket interfaces to ASCII computers -(see L<"NEWLINES">). +(see L). Fortunately, most web servers for the mainframe will correctly translate the C<\n> in the following statement to its ASCII equivalent (note that -C<\r> is the same under both Unix and OS/390): +C<\r> is the same under both Unix and OS/390 & VM/ESA): print "Content-type: text/html\r\n\r\n"; The value of C<$^O> on OS/390 is "os390". 
+The value of C<$^O> on VM/ESA is "vmesa". + Some simple tricks for determining if you are running on an EBCDIC platform could include any of the following (perhaps all): @@ -765,7 +932,7 @@ C contains a single item list. The filesystem will also expand system variables in filenames if enclosed in angle brackets, so CSystem$DirE.Modules> would look for the file S>. The obvious implication of this is -that BE> and should +that BE>> and should be protected when C is used for input. Because C<.> was in use as a directory separator and filenames could not @@ -944,9 +1111,11 @@ bits are meaningless. (Win32) Only good for changing "owner" and "other" read-write access. (S) +Access permissions are mapped onto VOS access-control list changes. (VOS) + =item chown LIST -Not implemented. (S, Win32, Plan9, S) +Not implemented. (S, Win32, Plan9, S, VOS) Does nothing, but won't fail. (Win32) @@ -954,20 +1123,22 @@ Does nothing, but won't fail. (Win32) =item chroot -Not implemented. (S, Win32, VMS, Plan9, S) +Not implemented. (S, Win32, VMS, Plan9, S, VOS, VM/ESA) =item crypt PLAINTEXT,SALT May not be available if library or source was not provided when building perl. (Win32) +Not implemented. (VOS) + =item dbmclose HASH -Not implemented. (VMS, Plan9) +Not implemented. (VMS, Plan9, VOS) =item dbmopen HASH,DBNAME,MODE -Not implemented. (VMS, Plan9) +Not implemented. (VMS, Plan9, VOS) =item dump LABEL @@ -981,19 +1152,21 @@ Invokes VMS debugger. (VMS) Not implemented. (S) +Implemented via Spawn. (VM/ESA) + =item fcntl FILEHANDLE,FUNCTION,SCALAR Not implemented. (Win32, VMS) =item flock FILEHANDLE,OPERATION -Not implemented (S, VMS, S). +Not implemented (S, VMS, S, VOS). Available only on Windows NT (not on Windows 95). (Win32) =item fork -Not implemented. (S, Win32, AmigaOS, S) +Not implemented. (S, Win32, AmigaOS, S, VOS, VM/ESA) =item getlogin @@ -1001,7 +1174,7 @@ Not implemented. (S, S) =item getpgrp PID -Not implemented. (S, Win32, VMS, S) +Not implemented. 
(S, Win32, VMS, S, VOS) =item getppid @@ -1009,7 +1182,7 @@ Not implemented. (S, Win32, VMS, S) =item getpriority WHICH,WHO -Not implemented. (S, Win32, VMS, S) +Not implemented. (S, Win32, VMS, S, VOS, VM/ESA) =item getpwnam NAME @@ -1049,11 +1222,11 @@ Not implemented. (S) =item getpwent -Not implemented. (S, Win32) +Not implemented. (S, Win32, VM/ESA) =item getgrent -Not implemented. (S, Win32, VMS) +Not implemented. (S, Win32, VMS, VM/ESA) =item gethostent @@ -1097,11 +1270,11 @@ Not implemented. (Plan9, Win32, S) =item endpwent -Not implemented. (S, Win32) +Not implemented. (S, Win32, VM/ESA) =item endgrent -Not implemented. (S, Win32, VMS, S) +Not implemented. (S, Win32, VMS, S, VM/ESA) =item endhostent @@ -1160,6 +1333,9 @@ method of spawning a process. (Win32) Not implemented. (S, Win32, VMS, S) +Link count not updated because hard links are not quite that hard +(They are sort of half-way between hard and soft links). (AmigaOS) + =item lstat FILEHANDLE =item lstat EXPR @@ -1178,7 +1354,7 @@ Return values may be bogus. (Win32) =item msgrcv ID,VAR,SIZE,TYPE,FLAGS -Not implemented. (S, Win32, VMS, Plan9, S) +Not implemented. (S, Win32, VMS, Plan9, S, VOS) =item open FILEHANDLE,EXPR @@ -1193,6 +1369,8 @@ open to C<|-> and C<-|> are unsupported. (S, Win32, S) Not implemented. (S) +Very limited functionality. (MiNT) + =item readlink EXPR =item readlink @@ -1211,15 +1389,15 @@ Only reliable on sockets. (S) =item semop KEY,OPSTRING -Not implemented. (S, Win32, VMS, S) +Not implemented. (S, Win32, VMS, S, VOS) =item setpgrp PID,PGRP -Not implemented. (S, Win32, VMS, S) +Not implemented. (S, Win32, VMS, S, VOS) =item setpriority WHICH,WHO,PRIORITY -Not implemented. (S, Win32, VMS, S) +Not implemented. (S, Win32, VMS, S, VOS) =item setsockopt SOCKET,LEVEL,OPTNAME,OPTVAL @@ -1233,11 +1411,11 @@ Not implemented. (S, Plan9) =item shmwrite ID,STRING,POS,SIZE -Not implemented. (S, Win32, VMS, S) +Not implemented. 
(S, Win32, VMS, S, VOS) =item socketpair SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL -Not implemented. (S, Win32, VMS, S) +Not implemented. (S, Win32, VMS, S, VOS, VM/ESA) =item stat FILEHANDLE @@ -1261,13 +1439,14 @@ Not implemented. (Win32, VMS, S) =item syscall LIST -Not implemented. (S, Win32, VMS, S) +Not implemented. (S, Win32, VMS, S, VOS, VM/ESA) =item sysopen FILEHANDLE,FILENAME,MODE,PERMS The traditional "0", "1", and "2" MODEs are implemented with different -numeric values on some systems. The flags exported by C should -work everywhere though. (S, OS/390) +numeric values on some systems. The flags exported by C +(O_RDONLY, O_WRONLY, O_RDWR) should work everywhere though. (S, OS/390, VM/ESA) =item system LIST @@ -1289,6 +1468,11 @@ the child program uses a compatible version of the emulation library. I will call the native command line direct and no such emulation of a child Unix program will exists. Mileage B vary. (S) +Far from being POSIX compliant. Because there may be no underlying +/bin/sh tries to work around the problem by forking and execing the +first token in its argument string. Handles basic redirection +("E" or "E") on its own behalf. (MiNT) + =item times Only the first entry returned is nonzero. (S) @@ -1305,23 +1489,37 @@ Not useful. (S) Not implemented. (VMS) +Truncation to zero-length only. (VOS) + +If a FILEHANDLE is supplied, it must be writable and opened in append +mode (i.e., use C>filename')> +or C. If a filename is supplied, it +should not be held open elsewhere. (Win32) + =item umask EXPR =item umask Returns undef where unavailable, as of version 5.005. +C works but the correct permissions are only set when the file +is finally close()d. (AmigaOS) + =item utime LIST Only the modification time is updated. (S, VMS, S) -May not behave as expected. (Win32) +May not behave as expected. Behavior depends on the C runtime +library's implementation of utime(), and the filesystem being +used. 
The FAT filesystem typically does not support an "access +time" field, and it may limit timestamps to a granularity of +two seconds. (Win32) =item wait =item waitpid PID,FLAGS -Not implemented. (S) +Not implemented. (S, VOS) Can only be applied to process handles returned for processes spawned using C. (Win32) @@ -1334,19 +1532,43 @@ Not useful. (S) =over 4 -=item 1.33, 06 August 1998 +=item v1.39, 11 February, 1999 + +Changes from Jarkko and EMX URL fixes from Michael Schwern. Additional +note about newlines added. + +=item v1.38, 31 December 1998 + +More changes from Jarkko. + +=item v1.37, 19 December 1998 + +More minor changes. Merge two separate version 1.35 documents. + +=item v1.36, 9 September 1998 + +Updated for Stratus VOS. Also known as version 1.35. + +=item v1.35, 13 August 1998 + +Integrate more minor changes, plus addition of new sections under +L<"ISSUES">: L<"Numbers endianness and Width">, +L<"Character sets and character encoding">, +L<"Internationalisation">. + +=item v1.33, 06 August 1998 Integrate more minor changes. -=item 1.32, 05 August 1998 +=item v1.32, 05 August 1998 Integrate more minor changes. -=item 1.30, 03 August 1998 +=item v1.30, 03 August 1998 Major update for RISC OS, other minor changes. -=item 1.23, 10 July 1998 +=item v1.23, 10 July 1998 First public release with perl5.005. @@ -1355,33 +1577,37 @@ First public release with perl5.005. =head1 AUTHORS / CONTRIBUTORS Abigail E<lt>abigail@fnx.comE<gt>, -Charles Bailey E<lt>bailey@genetics.upenn.eduE<gt>, +Charles Bailey E<lt>bailey@newman.upenn.eduE<gt>, Graham Barr E<lt>gbarr@pobox.comE<gt>, Tom Christiansen E<lt>tchrist@perl.comE<gt>, Nicholas Clark E<lt>Nicholas.Clark@liverpool.ac.ukE<gt>, Andy Dougherty E<lt>doughera@lafcol.lafayette.eduE<gt>, Dominic Dunlop E<lt>domo@vo.luE<gt>, +Neale Ferguson E<lt>neale@mailbox.tabnsw.com.auE<gt>, +Paul Green E<lt>Paul_Green@stratus.comE<gt>, M.J.T. Guy E<lt>mjtg@cus.cam.ac.ukE<gt>, +Jarkko Hietaniemi E<lt>jhi@iki.fiE<gt>, Luther Huffman E<lt>lutherh@stratcom.comE<gt>, Nick Ing-Simmons E<lt>nick@ni-s.u-net.comE<gt>, -Andreas J. 
Koenig E<lt>koenig@kulturbox.deE<gt>, +Andreas J. KE<ouml>nig E<lt>koenig@kulturbox.deE<gt>, +Markus Laker E<lt>mlaker@contax.co.ukE<gt>, Andrew M. Langmead E<lt>aml@world.std.comE<gt>, Paul Moore E<lt>Paul.Moore@uk.origin-it.comE<gt>, Chris Nandor E<lt>pudge@pobox.comE<gt>, -Matthias Neercher E<lt>neeri@iis.ee.ethz.chE<gt>, +Matthias Neeracher E<lt>neeri@iis.ee.ethz.chE<gt>, Gary Ng E<lt>71564.1743@CompuServe.COME<gt>, Tom Phoenix E<lt>rootbeer@teleport.comE<gt>, Peter Prymmer E<lt>pvhp@forte.comE<gt>, -Hugo van der Sanden E<lt>h.sanden@elsevier.nlE<gt>, +Hugo van der Sanden E<lt>hv@crypt0.demon.co.ukE<gt>, Gurusamy Sarathy E<lt>gsar@umich.eduE<gt>, Paul J. Schinder E<lt>schinder@pobox.comE<gt>, +Michael G Schwern E<lt>schwern@pobox.comE<gt>, Dan Sugalski E<lt>sugalskd@ous.eduE<gt>, Nathan Torkington E<lt>gnat@frii.comE<gt>. -This document is maintained by Chris Nandor. +This document is maintained by Chris Nandor +E<lt>pudge@pobox.comE<gt>. =head1 VERSION -Version 1.33, last modified 06 August 1998. - - +Version 1.39, last modified 11 February 1999