From: Rafael Garcia-Suarez
Date: Sun, 7 Jul 2002 20:31:37 +0000 (+0000)
Subject: Replace the word "discipline" by "layer" almost everywhere,
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=fae2c0fbfb26247eb616ab310ef74b1f4084ba68;p=p5sagit%2Fp5-mst-13.2.git

Replace the word "discipline" by "layer" almost everywhere,
by Elizabeth Mattijsen.

p4raw-id: //depot/perl@17410
---

diff --git a/MANIFEST b/MANIFEST
index 883dcb0..d34bb41 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -1350,7 +1350,7 @@ lib/NEXT/t/actual.t NEXT
 lib/NEXT/t/actuns.t NEXT
 lib/NEXT/t/next.t NEXT
 lib/NEXT/t/unseen.t NEXT
-lib/open.pm Pragma to specify default I/O disciplines
+lib/open.pm Pragma to specify default I/O layers
 lib/open.t See if the open pragma works
 lib/open2.pl Open a two-ended pipe (uses IPC::Open2)
 lib/open3.pl Open a three-ended pipe (uses IPC::Open3)
diff --git a/lib/open.pm b/lib/open.pm
index f7e594b..2dc1d21 100644
--- a/lib/open.pm
+++ b/lib/open.pm
@@ -165,12 +165,9 @@ Perl is configured to use PerlIO as its IO system (which is now the default). The C pragma serves as one of the interfaces to declare default
-"layers" for all I/O.
-
-The C pragma is used to declare one or more default layers for
-I/O operations. Any open(), readpipe() (aka qx//) and similar
-operators found within the lexical scope of this pragma will use the
-declared defaults.
+"layers" (also known as "disciplines") for all I/O. Any open(),
+readpipe() (aka qx//) and similar operators found within the lexical
+scope of this pragma will use the declared defaults.
 With the C subpragma you can declare the default layers of input streams, and with the C subpragma you can declare
diff --git a/pod/perldelta.pod b/pod/perldelta.pod
index b4c008a..24a8938 100644
--- a/pod/perldelta.pod
+++ b/pod/perldelta.pod
@@ -238,13 +238,14 @@ source code level, this shouldn't be that drastic a change.
=item * Previous versions of perl and some readings of some sections of Camel III -implied that C<:raw> "discipline" was the inverse of C<:crlf>. +implied that the C<:raw> "discipline" was the inverse of C<:crlf>. Turning off "clrfness" is no longer enough to make a stream truly -binary. So the PerlIO C<:raw> discipline is now formally defined as being +binary. So the PerlIO C<:raw> layer (or "discipline", to use the +Camel book's older terminology) is now formally defined as being equivalent to binmode(FH) - which is in turn defined as doing whatever is necessary to pass each byte as-is without any translation. In particular binmode(FH) - and hence C<:raw> - will now turn off both CRLF -and UTF-8 translation and remove other "layers" (e.g. :encoding()) which +and UTF-8 translation and remove other layers (e.g. :encoding()) which would modify byte stream. =item * @@ -368,7 +369,7 @@ for more information about UTF-8. If your environment variables (LC_ALL, LC_CTYPE, LANG, LANGUAGE) look like you want to use UTF-8 (any of the the variables match C), -your STDIN, STDOUT, STDERR handles and the default open discipline +your STDIN, STDOUT, STDERR handles and the default open layer (see L) are marked as UTF-8. (This feature, like other new features that combine Unicode and I/O, work only if you are using PerlIO, but that's is the default.) @@ -1004,7 +1005,7 @@ See L. =item * -C is a new pragma for setting the default I/O disciplines +C is a new pragma for setting the default I/O layers for open(). =item * diff --git a/pod/perlfunc.pod b/pod/perlfunc.pod index d8e7902..c422575 100644 --- a/pod/perlfunc.pod +++ b/pod/perlfunc.pod @@ -445,7 +445,7 @@ does. Returns true if it succeeded, false otherwise. NAME should be a packed address of the appropriate type for the socket. See the examples in L. -=item binmode FILEHANDLE, DISCIPLINE +=item binmode FILEHANDLE, LAYER =item binmode FILEHANDLE @@ -455,15 +455,22 @@ binary and text files. 
If FILEHANDLE is an expression, the value is taken as the name of the filehandle. Returns true on success, C on failure. -If DISCIPLINE is omitted or specified as C<:raw> the filehandle is made +If LAYER is omitted or specified as C<:raw> the filehandle is made suitable for passing binary data. This includes turning off possible CRLF translation and marking it as bytes (as opposed to Unicode characters). Note that as desipite what may be implied in I<"Programming Perl"> (the Camel) or elsewhere C<:raw> is I the simply inverse of C<:crlf> -- other disciplines which would affect binary nature of the stream are +-- other layers which would affect binary nature of the stream are I disabled. See L, L and the discussion about the PERLIO environment variable. +I + On some systems (in general, DOS and Windows-based systems) binmode() is necessary when you're not working with a text file. For the sake of portability it is a good idea to always use it when appropriate, @@ -472,26 +479,23 @@ and to never use it when it isn't appropriate. In other words: regardless of platform, use binmode() on binary files (like for example images). -If DISCIPLINE is present it is a single string, but may contain +If LAYER is present it is a single string, but may contain multiple directives. The directives alter the behaviour of the -file handle. When DISCIPLINE is present using binmode on text +file handle. When LAYER is present using binmode on text file makes sense. To mark FILEHANDLE as UTF-8, use C<:utf8>. The C<:bytes>, C<:crlf>, and C<:utf8>, and any other directives of the -form C<:...>, are called I/O I. The normal implementation -of disciplines in Perl 5.8 and later is in terms of I. See -L. (There is typically a one-to-one correspondence between -layers and disiplines.) The C pragma can be used to establish -default I/O disciplines. See L. +form C<:...>, are called I/O I. The C pragma can be used to +establish default I/O layers. See L. 
In general, binmode() should be called after open() but before any I/O is done on the filehandle. Calling binmode() will normally flush any pending buffered output data (and perhaps pending input data) on the -handle. An exception to this is the C<:encoding> discipline that +handle. An exception to this is the C<:encoding> layer that changes the default character encoding of the handle, see L. -The C<:encoding> discipline sometimes needs to be called in +The C<:encoding> layer sometimes needs to be called in mid-stream, and it doesn't flush the stream. The operating system, device drivers, C libraries, and Perl run-time @@ -2806,16 +2810,16 @@ meaning. In the 2-arguments (and 1-argument) form opening C<'-'> opens STDIN and opening C<< '>-' >> opens STDOUT. -You may use the three-argument form of open to specify -I or IO "layers" to be applied to the handle that affect how the input and output -are processed: (see L and L for more details). -For example +You may use the three-argument form of open to specify IO "layers" +(sometimes also referred to as "disciplines") to be applied to the handle +that affect how the input and output are processed (see L and +L for more details). For example open(FH, "<:utf8", "file") will open the UTF-8 encoded file containing Unicode characters, -see L. (Note that if disciplines are specified in the -three-arg form then default disciplines set by the C pragma are +see L. (Note that if layers are specified in the +three-arg form then default layers set by the C pragma are ignored.) Open returns nonzero upon success, the undefined value otherwise. If @@ -3768,7 +3772,7 @@ see C. Note the I: depending on the status of the filehandle, either (8-bit) bytes or characters are read. By default all filehandles operate on bytes, but for example if the filehandle has -been opened with the C<:utf8> discipline (see L, and the C +been opened with the C<:utf8> I/O layer (see L, and the C pragma, L), the I/O will operate on characters, not bytes. 
=item readdir DIRHANDLE @@ -3840,7 +3844,7 @@ See L for examples. Note the I: depending on the status of the socket, either (8-bit) bytes or characters are received. By default all sockets operate on bytes, but for example if the socket has been changed using -binmode() to operate with the C<:utf8> discipline (see the C +binmode() to operate with the C<:utf8> I/O layer (see the C pragma, L), the I/O will operate on characters, not bytes. =item redo LABEL @@ -4188,7 +4192,7 @@ otherwise. Note the I: even if the filehandle has been set to operate on characters (for example by using the C<:utf8> open -discipline), tell() will return byte offsets, not character offsets +layer), tell() will return byte offsets, not character offsets (because implementing that would render seek() and tell() rather slow). If you want to position file for C or C, don't use @@ -4360,7 +4364,7 @@ L for examples. Note the I: depending on the status of the socket, either (8-bit) bytes or characters are sent. By default all sockets operate on bytes, but for example if the socket has been changed using -binmode() to operate with the C<:utf8> discipline (see L, or +binmode() to operate with the C<:utf8> I/O layer (see L, or the C pragma, L), the I/O will operate on characters, not bytes. @@ -5406,7 +5410,7 @@ last byte of the scalar after the read. Note the I: depending on the status of the filehandle, either (8-bit) bytes or characters are read. By default all filehandles operate on bytes, but for example if the filehandle has -been opened with the C<:utf8> discipline (see L, and the C +been opened with the C<:utf8> I/O layer (see L, and the C pragma, L), the I/O will operate on characters, not bytes. An OFFSET may be specified to place the read data at some place in the @@ -5430,7 +5434,7 @@ POSITION, and C<2> to set it to EOF plus POSITION (typically negative). 
Note the I: even if the filehandle has been set to operate -on characters (for example by using the C<:utf8> discipline), tell() +on characters (for example by using the C<:utf8> I/O layer), tell() will return byte offsets, not character offsets (because implementing that would render sysseek() very slow). @@ -5531,7 +5535,7 @@ In the case the SCALAR is empty you can use OFFSET but only zero offset. Note the I: depending on the status of the filehandle, either (8-bit) bytes or characters are written. By default all filehandles operate on bytes, but for example if the filehandle has -been opened with the C<:utf8> discipline (see L, and the open +been opened with the C<:utf8> I/O layer (see L, and the open pragma, L), the I/O will operate on characters, not bytes. =item tell FILEHANDLE @@ -5545,7 +5549,7 @@ last read. Note the I: even if the filehandle has been set to operate on characters (for example by using the C<:utf8> open -discipline), tell() will return byte offsets, not character offsets +layer), tell() will return byte offsets, not character offsets (because that would render seek() and tell() rather slow). The return value of tell() for the standard streams like the STDIN diff --git a/pod/perlpodspec.pod b/pod/perlpodspec.pod index ab20799..7387258 100644 --- a/pod/perlpodspec.pod +++ b/pod/perlpodspec.pod @@ -611,11 +611,11 @@ is sufficient to establish this file's encoding. =for comment If/WHEN some brave soul makes these heuristics into a generic - text-file class (or file discipline?), we can presumably delete + text-file class (or PerlIO layer?), we can presumably delete mention of these icky details from this file, and can instead - tell people to just use appropriate class/discipline. + tell people to just use appropriate class/layer. Auto-recognition of newline sequences would be another desirable - feature of such a class/discipline. + feature of such a class/layer. HINT HINT HINT. 
=for comment diff --git a/pod/perlrun.pod b/pod/perlrun.pod index 3890cfc..4f9afdf 100644 --- a/pod/perlrun.pod +++ b/pod/perlrun.pod @@ -917,13 +917,13 @@ are disabled. Arranges for all accesses go straight to the lowest buffered layer provided by the configration. That is it strips off any layers above that layer. -In Perl 5.6 and some books the C<:raw> layer (also called a discipline) -is documented as the inverse of the C<:crlf> layer. That is no longer -the case - other layers which would alter binary nature of the -stream are also disabled. If you want UNIX line endings on a platform -that normally does CRLF translation, but still want UTF-8 or encoding -defaults the appropriate thing to do is to add C<:perlio> to PERLIO -environment variable. +In Perl 5.6 and some books the C<:raw> layer (previously sometimes also +referred to as a "discipline") is documented as the inverse of the +C<:crlf> layer. That is no longer the case - other layers which would +alter binary nature of the stream are also disabled. If you want UNIX +line endings on a platform that normally does CRLF translation, but still +want UTF-8 or encoding defaults the appropriate thing to do is to add +C<:perlio> to PERLIO environment variable. =item :stdio diff --git a/pod/perlunicode.pod b/pod/perlunicode.pod index 0aec6fe..8489702 100644 --- a/pod/perlunicode.pod +++ b/pod/perlunicode.pod @@ -12,7 +12,7 @@ from cover to cover, Perl does support many Unicode features. =over 4 -=item Input and Output Disciplines +=item Input and Output Layers Perl knows when a filehandle uses Perl's internal Unicode encodings (UTF-8, or UTF-EBCDIC if in EBCDIC) if the filehandle is opened with @@ -87,7 +87,7 @@ Unless explicitly stated, Perl operators use character semantics for Unicode data and byte semantics for non-Unicode data. The decision to use character semantics is made transparently. 
If input data comes from a Unicode source--for example, if a character -encoding discipline is added to a filehandle or a literal Unicode +encoding layer is added to a filehandle or a literal Unicode string constant appears in a program--character semantics apply. Otherwise, byte semantics are in effect. The C pragma should be used to force byte semantics on Unicode data. diff --git a/pod/perluniintro.pod b/pod/perluniintro.pod index cc11dde..870926e 100644 --- a/pod/perluniintro.pod +++ b/pod/perluniintro.pod @@ -150,7 +150,7 @@ character set. Otherwise, it uses UTF-8. A user of Perl does not normally need to know nor care how Perl happens to encode its internal strings, but it becomes relevant when -outputting Unicode strings to a stream without a discipline--one with +outputting Unicode strings to a stream without a PerlIO layer -- one with the "default" encoding. In such a case, the raw bytes used internally (the native character set or UTF-8, as appropriate for each string) will be used, and a "Wide character" warning will be issued if those @@ -165,7 +165,7 @@ as a warning: Wide character in print at ... -To output UTF-8, use the C<:utf8> output discipline. Prepending +To output UTF-8, use the C<:utf8> output layer. Prepending binmode(STDOUT, ":utf8"); @@ -328,7 +328,7 @@ and on already open streams, use C: binmode(STDOUT, ":encoding(shift_jis)"); The matching of encoding names is loose: case does not matter, and -many encodings have several aliases. Note that C<:utf8> discipline +many encodings have several aliases. Note that the C<:utf8> layer must always be specified exactly like that; it is I subject to the loose matching of encoding names. @@ -340,7 +340,7 @@ module. Reading in a file that you know happens to be encoded in one of the Unicode or legacy encodings does not magically turn the data into Unicode in Perl's eyes. 
To do that, specify the appropriate -discipline when opening files +layer when opening files open(my $fh,'<:utf8', 'anything'); my $line_of_unicode = <$fh>; @@ -348,10 +348,10 @@ discipline when opening files open(my $fh,'<:encoding(Big5)', 'anything'); my $line_of_unicode = <$fh>; -The I/O disciplines can also be specified more flexibly with +The I/O layers can also be specified more flexibly with the C pragma. See L, or look at the following example. - use open ':utf8'; # input and output default discipline will be UTF-8 + use open ':utf8'; # input and output default layer will be UTF-8 open X, ">file"; print X chr(0x100), "\n"; close X; @@ -359,7 +359,7 @@ the C pragma. See L, or look at the following example. printf "%#x\n", ord(); # this should print 0x100 close Y; -With the C pragma you can use the C<:locale> discipline +With the C pragma you can use the C<:locale> layer $ENV{LC_ALL} = $ENV{LANG} = 'ru_RU.KOI8-R'; # the :locale will probe the locale environment variables like LC_ALL @@ -371,7 +371,7 @@ With the C pragma you can use the C<:locale> discipline printf "%#x\n", ord(), "\n"; # this should print 0xc1 close I; -or you can also use the C<':encoding(...)'> discipline +or you can also use the C<':encoding(...)'> layer open(my $epic,'<:encoding(iso-8859-7)','iliad.greek'); my $line_of_unicode = <$epic>; @@ -381,8 +381,8 @@ converts data from the specified encoding when it is read in from the stream. The result is always Unicode. The L pragma affects all the C calls after the pragma by -setting default disciplines. If you want to affect only certain -streams, use explicit disciplines directly in the C call. +setting default layers. If you want to affect only certain +streams, use explicit layers directly in the C call. You can switch encodings on an already opened stream by using C; see L. @@ -392,7 +392,7 @@ C and C, only with the C pragma. The C<:utf8> and C<:encoding(...)> methods do work with all of C, C, and the C pragma. 
-Similarly, you may use these I/O disciplines on output streams to +Similarly, you may use these I/O layers on output streams to automatically convert Unicode to the specified encoding when it is written to the stream. For example, the following snippet copies the contents of the file "text.jis" (encoded as ISO-2022-JP, aka JIS) to @@ -415,7 +415,7 @@ C and C operate on byte counts, as do C and C. Notice that because of the default behaviour of not doing any -conversion upon input if there is no default discipline, +conversion upon input if there is no default layer, it is easy to mistakenly write code that keeps on expanding a file by repeatedly encoding the data: @@ -484,7 +484,7 @@ Peeking At Perl's Internal Encoding Normal users of Perl should never care how Perl encodes any particular Unicode string (because the normal ways to get at the contents of a string with Unicode--via input and output--should always be via -explicitly-defined I/O disciplines). But if you must, there are two +explicitly-defined I/O layers). But if you must, there are two ways of looking behind the scenes. One way of peeking inside the internal encoding of Unicode characters diff --git a/pod/perlvar.pod b/pod/perlvar.pod index 100361b..0672c4e 100644 --- a/pod/perlvar.pod +++ b/pod/perlvar.pod @@ -1012,8 +1012,8 @@ between the variants. =item ${^OPEN} An internal variable used by PerlIO. A string in two parts, separated -by a C<\0> byte, the first part is the input disciplines, the second -part is the output disciplines. +by a C<\0> byte, the first part describes the input layers, the second +part describes the output layers. =item $PERLDB
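
The behaviour that the renamed documentation describes can be tried out with a short script. The following is a minimal sketch and not part of this commit: it assumes a perl built with PerlIO, and the scratch file and all variable names are hypothetical. It writes one wide character through the :utf8 layer named in a three-argument open(), reads it back as characters, and then reads the same file as bytes after calling binmode() with no LAYER argument.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file, used only for this illustration.
my ($tmp, $path) = tempfile(UNLINK => 1);
close $tmp or die "close: $!";

# Three-argument open() with an explicit layer: characters are
# encoded as UTF-8 on their way out to the file.
open(my $out, '>:utf8', $path) or die "open for write: $!";
print {$out} chr(0x100), "\n";
close $out or die "close: $!";

# Reading through the same layer yields characters, not bytes.
open(my $in, '<:utf8', $path) or die "open for read: $!";
my $line       = <$in>;
my $first_char = ord($line);     # code point of the first character
my $char_count = length($line);  # counted in characters
my $byte_pos   = tell($in);      # tell() reports a byte offset
close $in or die "close: $!";

# binmode() without a LAYER behaves like :raw, so the same file is
# now read as plain bytes.
open(my $raw, '<', $path) or die "open for read: $!";
binmode($raw) or die "binmode: $!";
my $bytes = do { local $/; <$raw> };
close $raw or die "close: $!";
my $byte_count = length($bytes);

printf "first char %#x, %d characters, %d bytes, tell() gave %d\n",
    $first_char, $char_count, $byte_count, $byte_pos;
```

The character count is smaller than the byte count because U+0100 takes two bytes in UTF-8. The exact byte count and tell() value depend on whether the platform applies CRLF translation by default, so the sketch prints them rather than assuming them.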