Unix does the same thing on ttys in canonical mode. C<\015\012>
is commonly referred to as CRLF.
-A common cause of unportable programs is the misuse of chop() to trim
-newlines:
-
- # XXX UNPORTABLE!
- while(<FILE>) {
- chop;
- @array = split(/:/);
- #...
- }
-
-You can get away with this on Unix and Mac OS (they have a single
-character end-of-line), but the same program will break under DOSish
-perls because you're only chop()ing half the end-of-line. Instead,
-chomp() should be used to trim newlines. The L<Dunce::Files> module
-can help audit your code for misuses of chop().
+To trim trailing newlines from text lines use chomp(). With default
+settings that function looks for a trailing C<\n> character and thus
+trims in a portable way.
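As a minimal sketch of this advice, assuming colon-separated text records, the following trims the line terminator with chomp() before splitting. The helper name C<fields_of> is hypothetical and used only for illustration.

```perl
use strict;
use warnings;

# Split one colon-separated record into fields after trimming the
# line terminator. chomp() removes a trailing $/ (default "\n") and
# leaves the string untouched when no terminator is present.
sub fields_of {
    my ($line) = @_;
    chomp $line;
    return split /:/, $line;
}

my @fields = fields_of("root:x:0\n");
print join('|', @fields), "\n";    # prints "root|x|0"
```

Unlike chop(), chomp() does not remove the last character of a line that has no terminator, so the final line of a file without a trailing newline is left intact.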
When dealing with binary files (or text files in binary mode) be sure
to explicitly set $/ to the appropriate value for your file format
and even on systems where it might be tolerated, some utilities
might become confused by such whitespace.
-Many systems (DOS, VMS) cannot have more than one C<.> in their filenames.
+Many systems (DOS, VMS ODS-2) cannot have more than one C<.> in their
+filenames.
Don't assume C<< > >> won't be the first character of a filename.
Always use C<< < >> explicitly to open a file for reading, or even
VMS the C<%ENV> table is much more than a per-process key-value string
table.
+On VMS, some entries in the %ENV hash are dynamically created when
+their key is used on a read if they did not previously exist. The
+values for C<$ENV{HOME}>, C<$ENV{TERM}>, and C<$ENV{USER}> are known
+to be dynamically generated. The specific names that are dynamically
+generated may vary with the version of the C library on VMS, and more
+may exist than are documented.
+
+On VMS by default, changes to the %ENV hash are persistent after the process
+exits. This can cause unintended issues.
+
Don't count on signals or C<%SIG> for anything.
Don't count on filename globbing. Use C<opendir>, C<readdir>, and
some large number. C<$offset> can then be added to a Unix time value
to get what should be the proper value on any system.
-On Windows (at least), you shouldn't pass a negative value to C<gmtime> or
-C<localtime>.
-
=head2 Character sets and character encoding
Assume very little about character sets.
Do not assume anything about the ordering of the characters.
The lowercase letters may come before or after the uppercase letters;
-the lowercase and uppercase may be interlaced so that both `a' and `A'
-come before `b'; the accented and other international characters may
-be interlaced so that E<auml> comes before `b'.
+the lowercase and uppercase may be interlaced so that both "a" and "A"
+come before "b"; the accented and other international characters may
+be interlaced so that E<auml> comes before "b".
=head2 Internationalisation
later. If the bytes are native 8-bit bytes, you can use the C<bytes>
pragma. If the bytes are in a string (regular expression being a
curious string), you can often also use the C<\xHH> notation instead
-of embedding the bytes as-is. If they are in some particular legacy
-encoding (ether single-byte or something more complicated), you can
-use the C<encoding> pragma. (If you want to write your code in UTF-8,
-you can use either the C<utf8> pragma, or the C<encoding> pragma.)
-The C<bytes> and C<utf8> pragmata are available since Perl 5.6.0, and
-the C<encoding> pragma since Perl 5.8.0.
+of embedding the bytes as-is. (If you want to write your code in UTF-8,
+you can use the C<utf8>.) The C<bytes> and C<utf8> pragmata are
+available since Perl 5.6.0.
=head2 System Resources
=head2 VMS
Perl on VMS is discussed in L<perlvms> in the perl distribution.
+
+The official name of VMS as of this writing is OpenVMS.
+
Perl on VMS can accept either VMS- or Unix-style file
specifications as in either of the following:
Do take care with C<$ ASSIGN/nolog/user SYS$COMMAND: SYS$INPUT> if your
perl-in-DCL script expects to do things like C<< $read = <STDIN>; >>.
-Filenames are in the format "name.extension;version". The maximum
-length for filenames is 39 characters, and the maximum length for
+The VMS operating system has two filesystems, known as ODS-2 and ODS-5.
+
+For ODS-2, filenames are in the format "name.extension;version". The
+maximum length for filenames is 39 characters, and the maximum length for
extensions is also 39 characters. Version is a number from 1 to
32767. Valid characters are C</[A-Z0-9$_-]/>.
-VMS's RMS filesystem is case-insensitive and does not preserve case.
-C<readdir> returns lowercased filenames, but specifying a file for
-opening remains case-insensitive. Files without extensions have a
-trailing period on them, so doing a C<readdir> with a file named F<A.;5>
-will return F<a.> (though that file could be opened with
+The ODS-2 filesystem is case-insensitive and does not preserve case.
+Perl simulates this by converting all filenames to lowercase internally.
+
+For ODS-5, filenames may have almost any character in them and can include
+Unicode characters. Characters that could be misinterpreted by the DCL
+shell or file parsing utilities need to be prefixed with the C<^>
+character, or replaced with hexadecimal characters prefixed with the
+C<^> character. Such prefixing is only needed when the pathnames are
+in VMS format in applications. Programs that can accept the UNIX format
+of pathnames do not need the escape characters. The maximum length for
+filenames is 255 characters. The ODS-5 file system can handle both
+a case-preserved and a case-sensitive mode.
+
+ODS-5 is only available on OpenVMS for 64-bit platforms.
+
+Support for the extended file specifications is being done as optional
+settings to preserve backward compatibility with Perl scripts that
+assume the previous VMS limitations.
+
+In general routines on VMS that get a UNIX format file specification
+should return it in a UNIX format, and when they get a VMS format
+specification they should return a VMS format unless they are documented
+to do a conversion.
+
+For routines that generate and return a file specification, VMS allows
+a setting in the C library on which Perl is built to control whether
+the specification is returned in VMS format or in UNIX format.
+
+With the ODS-2 file system, there is not much difference in syntax of
+filenames without paths for VMS or UNIX. With the extended character
+set available with ODS-5 there can be a significant difference.
+
+Because of this, existing Perl scripts written for VMS were sometimes
+treating VMS and UNIX filenames interchangeably. Without the extended
+character set enabled, this behavior will mostly be maintained for
+backwards compatibility.
+
+When extended characters are enabled with ODS-5, the handling of
+UNIX formatted file specifications follows that of a UNIX system.
+
+VMS file specifications without extensions have a trailing dot. An
+equivalent UNIX file specification should not show the trailing dot.
+
+The result of all of this is that, for portable scripts on VMS, you
+cannot depend on Perl to present filenames in lowercase or to treat
+them as case sensitive, and filenames may be returned in either
+UNIX or VMS format.
+
+And if a routine returns a file specification, unless it is intended to
+convert it, it should return it in the same format as it found it.
+
+C<readdir> by default has traditionally returned lowercased filenames.
+When the ODS-5 support is enabled, it will return the exact case of the
+filename on the disk.
+
+Files without extensions have a trailing period on them, so doing a
+C<readdir> in the default mode with a file named F<A.;5> will
+return F<a.> (though that file could be opened with
C<open(FH, 'A')>).
+With support for extended file specifications and if C<opendir> was
+given a UNIX format directory, a file named F<A.;5> will return F<a>,
+optionally in the exact case on the disk. When C<opendir> is given
+a VMS format directory, then C<readdir> should return F<a.>,
+again optionally in the exact case.
+
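One way a portable script might cope with these differences is to normalize directory entries before comparing them. The sketch below is only an illustration: the helper name C<normalize_entry> is hypothetical, and folding case is an assumption that is only appropriate when the volume is not being used in a case-sensitive mode.

```perl
use strict;
use warnings;

# Reduce a name returned by readdir to a form that can be compared
# across the modes described above: drop the trailing dot that VMS
# format adds to extensionless files, and fold to lowercase.
# ASSUMPTION: case folding is acceptable for the files in question.
sub normalize_entry {
    my ($name) = @_;
    return $name if $name eq '.' or $name eq '..';   # leave directory links alone
    $name =~ s/\.\z//;                               # "a." becomes "a"
    return lc $name;
}

for my $entry ('a.', 'A', 'README.TXT') {
    print normalize_entry($entry), "\n";
}
```

With this normalization, F<a.>, F<a> and F<A> all compare equal, whichever format and case C<readdir> happened to return.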
RMS had an eight level limit on directory depths from any rooted logical
-(allowing 16 levels overall) prior to VMS 7.2. Hence
-C<PERL_ROOT:[LIB.2.3.4.5.6.7.8]> is a valid directory specification but
-C<PERL_ROOT:[LIB.2.3.4.5.6.7.8.9]> is not. F<Makefile.PL> authors might
-have to take this into account, but at least they can refer to the former
-as C</PERL_ROOT/lib/2/3/4/5/6/7/8/>.
+(allowing 16 levels overall) prior to VMS 7.2, and even with versions of
+VMS on VAX up through 7.3. Hence C<PERL_ROOT:[LIB.2.3.4.5.6.7.8]> is a
+valid directory specification but C<PERL_ROOT:[LIB.2.3.4.5.6.7.8.9]> is
+not. F<Makefile.PL> authors might have to take this into account, but at
+least they can refer to the former as C</PERL_ROOT/lib/2/3/4/5/6/7/8/>.
+
+Pumpkings and module integrators can easily see whether files with too many
+directory levels have snuck into the core by running the following in the
+top-level source directory:
+
+ $ perl -ne "$_=~s/\s+.*//; print if scalar(split /\//) > 8;" < MANIFEST
+
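The same check can be written as a small script, which may be easier to adapt than the one-liner. This is a sketch under stated assumptions: the helper name C<too_deep> is hypothetical, and the sample entries are made up for illustration rather than taken from a real MANIFEST.

```perl
use strict;
use warnings;

# Return the MANIFEST entries whose path has more than $max components.
# $max defaults to 8, the same threshold the one-liner uses.
sub too_deep {
    my ($lines, $max) = @_;
    $max = 8 unless defined $max;
    my @deep;
    for my $line (@$lines) {
        (my $path = $line) =~ s/\s+.*//s;   # drop the description column and newline
        next unless length $path;
        my @parts = split m{/}, $path;
        push @deep, $path if @parts > $max;
    }
    return @deep;
}

# Illustrative input only; a real run would read the lines of MANIFEST.
my @sample = (
    "lib/a/b/c/d/e/f/g/Deep.pm\tA made-up deep entry\n",
    "lib/Shallow.pm\tA made-up shallow entry\n",
);
print "$_\n" for too_deep(\@sample);
```

To check a real tree, the lines could be read from the F<MANIFEST> file in the top-level source directory and passed in as an array reference.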
The VMS::Filespec module, which gets installed as part of the build
process on VMS, is a pure Perl module that can easily be installed on
non-VMS platforms and can be helpful for conversions to and from RMS
-native formats.
+native formats. It is also now the only way that you should check to
+see if VMS is in a case sensitive mode.
What C<\n> represents depends on the type of file opened. It usually
represents C<\012> but it could also be C<\015>, C<\012>, C<\015\012>,
TCP/IP stacks are optional on VMS, so socket routines might not be
implemented. UDP sockets may not be supported.
+The TCP/IP library support for all current versions of VMS is dynamically
+loaded if present, so even if the routines are configured, they may
+return a status indicating that they are not implemented.
+
The value of C<$^O> on OpenVMS is "VMS". To determine the architecture
that you are running on without resorting to loading all of C<%Config>
you can examine the content of the C<@INC> array like so:
} elsif (grep(/VMS_VAX/, @INC)) {
print "I'm on VAX!\n";
+ } elsif (grep(/VMS_IA64/, @INC)) {
+ print "I'm on IA64!\n";
+
} else {
print "I'm not so sure about where $^O is...\n";
}
+In general, the only significant differences should be between Perl
+running on VMS_VAX and Perl running on one of the 64-bit OpenVMS platforms.
+
On VMS, perl determines the UTC offset from the C<SYS$TIMEZONE_DIFFERENTIAL>
logical name. Although the VMS epoch began at 17-NOV-1858 00:00:00.00,
calls to C<localtime> are adjusted to count offsets from
=item *
-vmsperl list, majordomo@perl.org
-
-(Put the words C<subscribe vmsperl> in message body.)
+vmsperl list, vmsperl-subscribe@perl.org
=item *
delimiting character, VOS files, directories, or links whose names
contain a slash character cannot be processed. Such files must be
renamed before they can be processed by Perl. Note that VOS limits
-file names to 32 or fewer characters.
+file names to 32 or fewer characters; file names cannot start with a
+C<-> character or contain any character matching C<< tr/ !%&'()*+;<>?// >>.
The value of C<$^O> on VOS is "VOS". To determine the architecture that
you are running on without resorting to loading all of C<%Config> you
and applications are executable, and there are no uid/gid
considerations. C<-o> is not supported. (S<Mac OS>)
+C<-w> only inspects the read-only file attribute (FILE_ATTRIBUTE_READONLY),
+which determines whether the directory can be deleted, not whether it can
+be written to. Directories always have read and write access unless denied
+by discretionary access control lists (DACLs). (S<Win32>)
+
C<-r>, C<-w>, C<-x>, and C<-o> tell whether the file is accessible,
which may not reflect UIC-based file protections. (VMS)
Not useful. (S<Mac OS>, S<RISC OS>)
-Not implemented. (Win32)
+Not supported. (Cygwin, Win32)
Invokes VMS debugger. (VMS)
with the pragma C<use vmsish 'exit'>. As with the CRTL's exit()
function, C<exit 0> is also mapped to an exit status of SS$_NORMAL
(C<1>); this mapping cannot be overridden. Any other argument to exit()
-is used directly as Perl's exit status. (VMS)
+is used directly as Perl's exit status. On VMS, unless the future
+POSIX_EXIT mode is enabled, the exit code should always be a valid
+VMS exit code and not a generic number. When the POSIX_EXIT mode is
+enabled, a generic number will be encoded in a manner compatible with
+the C library _POSIX_EXIT macro so that it can be decoded by other
+programs, particularly ones written in C, like the GNV package. (VMS)
=item fcntl
-Not implemented. (Win32, VMS)
+Not implemented. (Win32)
+Some functions are available, depending on the version of VMS. (VMS)
=item flock
This operator is implemented via the File::Glob extension on most
platforms. See L<File::Glob> for portability information.
+=item gmtime
+
+gmtime() has a range of about 2 billion years before and after 1970.
+
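As a small illustration of using gmtime() with times on either side of the epoch, the sketch below formats a Unix time value as a UTC date. The helper name C<iso_utc> is hypothetical, and the example assumes a perl recent enough to accept negative values on the platform in use.

```perl
use strict;
use warnings;

# Format a Unix time value as a YYYY-MM-DD date in UTC.
# gmtime() returns the year as an offset from 1900 and the month
# as 0..11, so both are adjusted before formatting.
sub iso_utc {
    my ($time) = @_;
    my @t = gmtime($time);
    return sprintf "%04d-%02d-%02d", $t[5] + 1900, $t[4] + 1, $t[3];
}

print iso_utc(0), "\n";         # prints "1970-01-01"
print iso_utc(-86_400), "\n";   # one day before the epoch
```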
=item ioctl FILEHANDLE,FUNCTION,SCALAR
Not implemented. (VMS)
$sig is 0 and the specified process exists, it returns true without
actually terminating it. (Win32)
+C<kill(-9, $pid)> will terminate the process specified by $pid and
+recursively all child processes owned by it. This is different from
+the Unix semantics, where the signal will be delivered to all
+processes in the same process group as the process specified by
+$pid. (Win32)
+
+Not supported for a process identification number of 0 or for
+negative numbers. (VMS)
+
=item link
-Not implemented. (S<Mac OS>, MPE/iX, VMS, S<RISC OS>)
+Not implemented. (S<Mac OS>, MPE/iX, S<RISC OS>)
Link count not updated because hard links are not quite that hard
(They are sort of half-way between hard and soft links). (AmigaOS)
-Hard links are implemented on Win32 (Windows NT and Windows 2000)
-under NTFS only.
+Hard links are implemented on Win32 under NTFS only. They are
+natively supported on Windows 2000 and later. On Windows NT they
+are implemented using the Windows POSIX subsystem support and the
+Perl process will need Administrator or Backup Operator privileges
+to create hard links.
+
+Available on 64 bit OpenVMS 8.2 and later. (VMS)
+
+=item localtime
+
+localtime() has the same range as L<gmtime>, but because time zone
+rules change, its accuracy for historical and future times may degrade,
+though usually by no more than an hour.
=item lstat
-Not implemented. (VMS, S<RISC OS>)
+Not implemented. (S<RISC OS>)
Return values (especially for device and inode) may be bogus. (Win32)
=item socketpair
-Not implemented. (Win32, VMS, S<RISC OS>, VOS, VM/ESA)
+Not implemented. (S<RISC OS>, VOS, VM/ESA)
+
+Available on 64 bit OpenVMS 8.2 and later. (VMS)
=item stat
some versions of cygwin when doing a stat("foo") and if not finding it
may then attempt to stat("foo.exe") (Cygwin)
+On Win32 stat() needs to open the file to determine the link count
+and update attributes that may have been changed through hard links.
+Setting ${^WIN32_SLOPPY_STAT} to a true value speeds up stat() by
+not performing this operation. (Win32)
+
=item symlink
-Not implemented. (Win32, VMS, S<RISC OS>)
+Not implemented. (Win32, S<RISC OS>)
+
+Implemented on 64 bit VMS 8.3. VMS requires the symbolic link to be in Unix
+syntax if it is intended to resolve to a valid path.
=item syscall
The return value is POSIX-like (shifted up by 8 bits), which only allows
room for a made-up value derived from the severity bits of the native
32-bit condition code (unless overridden by C<use vmsish 'status'>).
+If the native condition code is one that has a POSIX value encoded, the
+POSIX value will be decoded to extract the expected exit value.
For more details see L<perlvms/$?>. (VMS)
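A minimal sketch of reading the POSIX-like return value, assuming the child is another perl started through C<$^X>. The helper name C<exit_value_of> is hypothetical. The exit value is recovered by shifting C<$?> down by 8 bits.

```perl
use strict;
use warnings;

# Run a command in list form (no shell involved) and return its
# POSIX-like exit value, or undef if the command could not be started.
sub exit_value_of {
    my @cmd = @_;
    my $rc = system(@cmd);
    return undef if $rc == -1;   # the command could not be run at all
    return $? >> 8;              # exit value lives in the high byte
}

my $value = exit_value_of($^X, '-e', 'exit 3');
print defined $value
    ? "child exit value: $value\n"
    : "could not run child\n";
```

On VMS the value obtained this way is derived as described above, so a portable script should treat anything other than 0 as failure rather than rely on a specific number.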
=item times
Michael G Schwern <schwern@pobox.com>,
Dan Sugalski <dan@sidhe.org>,
Nathan Torkington <gnat@frii.com>.
-
+John Malmberg <wb8tyw@qsl.net>