Commit | Line | Data |
1141d9f8 |
1 | package PerlIO; |
2 | |
92a3e63c |
3 | our $VERSION = '1.02'; |
8de1277c |
4 | |
1141d9f8 |
5 | # Map layer name to package that defines it |
c1a61b17 |
6 | our %alias; |
1141d9f8 |
7 | |
8 | sub import |
9 | { |
10 | my $class = shift; |
11 | while (@_) |
12 | { |
13 | my $layer = shift; |
14 | if (exists $alias{$layer}) |
15 | { |
16 | $layer = $alias{$layer} |
17 | } |
18 | else |
19 | { |
20 | $layer = "${class}::$layer"; |
21 | } |
22 | eval "require $layer"; |
23 | warn $@ if $@; |
24 | } |
25 | } |
26 | |
39f7a870 |
27 | sub F_UTF8 () { 0x8000 } |
28 | |
1141d9f8 |
29 | 1; |
30 | __END__ |
b3d30bf7 |
31 | |
32 | =head1 NAME |
33 | |
7d3b96bb |
34 | PerlIO - On demand loader for PerlIO layers and root of PerlIO::* name space |
b3d30bf7 |
35 | |
36 | =head1 SYNOPSIS |
37 | |
01e6739c |
38 | open($fh,"<:crlf", "my.txt"); # portably open a text file for reading |
1cbfc93d |
39 | |
40 | open($fh,"<","his.jpg"); # portably open a binary file for reading |
41 | binmode($fh); |
7d3b96bb |
42 | |
43 | Shell: |
44 | PERLIO=perlio perl .... |
b3d30bf7 |
45 | |
46 | =head1 DESCRIPTION |
47 | |
ec28694c |
48 | When an undefined layer 'foo' is encountered in an C<open> or |
49 | C<binmode> layer specification then C code performs the equivalent of: |
b3d30bf7 |
50 | |
51 | use PerlIO 'foo'; |
52 | |
53 | The perl code in PerlIO.pm then attempts to locate a layer by doing |
54 | |
55 | require PerlIO::foo; |
56 | |
47bfe92f |
57 | Otherwise the C<PerlIO> package is a place holder for additional |
58 | PerlIO related functions. |
b3d30bf7 |
59 | |
7d3b96bb |
60 | The following layers are currently defined: |
b3d30bf7 |
61 | |
7d3b96bb |
62 | =over 4 |
63 | |
64 | =item unix |
65 | |
66 | Low level layer which calls C<read>, C<write> and C<lseek> etc. |
67 | |
68 | =item stdio |
69 | |
47bfe92f |
70 | Layer which calls C<fread>, C<fwrite> and C<fseek>/C<ftell> etc. Note |
71 | that as this is "real" stdio it will ignore any layers beneath it and |
7d3b96bb |
72 | go straight to the operating system via the C library as usual. |
73 | |
74 | =item perlio |
75 | |
47bfe92f |
76 | This is a re-implementation of "stdio-like" buffering written as a |
77 | PerlIO "layer". As such it will call whatever layer is below it for |
78 | its operations. |
7d3b96bb |
79 | |
80 | =item crlf |
81 | |
47bfe92f |
82 | A layer which does CRLF to "\n" translation distinguishing "text" and |
83 | "binary" files in the manner of MS-DOS and similar operating systems. |
0226bbdb |
84 | (It currently does I<not> mimic MS-DOS in treating Control-Z |
85 | as an end-of-file marker.) |
7d3b96bb |
86 | |
87 | =item utf8 |
88 | |
47bfe92f |
89 | Declares that the stream accepts perl's internal encoding of |
90 | characters. (Which really is UTF-8 on ASCII machines, but is |
91 | UTF-EBCDIC on EBCDIC machines.) This allows any character perl can |
92 | represent to be read from or written to the stream. The UTF-X encoding |
93 | is chosen to render simple text parts (i.e. non-accented letters, |
94 | digits and common punctuation) human readable in the encoded file. |
95 | |
96 | Here is how to write your native data out using UTF-8 (or UTF-EBCDIC) |
97 | and then read it back in. |
98 | |
99 | open(F, ">:utf8", "data.utf"); |
100 | print F $out; |
101 | close(F); |
102 | |
103 | open(F, "<:utf8", "data.utf"); |
104 | $in = <F>; |
105 | close(F); |
7d3b96bb |
106 | |
c1a61b17 |
107 | =item bytes |
108 | |
109 | This is the inverse of the C<:utf8> layer. It turns off the flag |
110 | on the layer below so that data read from it is considered to |
111 | be "octets" i.e. characters in range 0..255 only. Likewise |
112 | on output perl will warn if a "wide" character is written |
113 | to such a stream. |
114 | |
7d3b96bb |
115 | =item raw |
116 | |
0226bbdb |
117 | The C<:raw> layer is I<defined> as being identical to calling |
18aba96f |
118 | C<binmode($fh)> - the stream is made suitable for passing binary data |
119 | i.e. each byte is passed as-is. The stream will still be |
120 | buffered. Unlike in the earlier versions of Perl C<:raw> is I<not> |
121 | just the inverse of C<:crlf> - other layers which would affect the |
122 | binary nature of the stream are also removed or disabled. |
1cbfc93d |
123 | |
0226bbdb |
124 | The implementation of C<:raw> is as a pseudo-layer which when "pushed" |
125 | pops itself and then any layers which do not declare themselves as suitable |
126 | for binary data. (Undoing :utf8 and :crlf is implemented by clearing |
39f7a870 |
127 | flags rather than popping layers but that is an implementation detail.) |
01e6739c |
128 | |
0226bbdb |
129 | As a consequence of the fact that C<:raw> normally pops layers |
39f7a870 |
130 | it usually only makes sense to have it as the only or first element in |
131 | a layer specification. When used as the first element it provides |
0226bbdb |
132 | a known base on which to build e.g. |
7d3b96bb |
133 | |
0226bbdb |
134 | open($fh,":raw:utf8",...) |
7d3b96bb |
135 | |
0226bbdb |
136 | will construct a "binary" stream, but then enable UTF-8 translation. |
b3d30bf7 |
137 | |
4ec2216f |
138 | =item pop |
139 | |
140 | A pseudo layer that removes the top-most layer. Gives perl code |
141 | a way to manipulate the layer stack. Should be considered |
142 | as experimental. Note that C<:pop> only works on real layers |
143 | and will not undo the effects of pseudo layers like C<:utf8>. |
144 | An example of a possible use might be: |
145 | |
146 | open($fh,...) |
147 | ... |
148 | binmode($fh,":encoding(...)"); # next chunk is encoded |
149 | ... |
150 | binmode($fh,":pop"); # back to un-encoded |
151 | |
152 | A more elegant (and safer) interface is needed. |
153 | |
7d3b96bb |
154 | =back |
155 | |
39f7a870 |
156 | =head2 Custom Layers |
157 | |
158 | It is possible to write custom layers in addition to the above builtin |
159 | ones, both in C/XS and Perl. Two such layers (and one example written |
160 | in Perl using the latter) come with the Perl distribution. |
161 | |
162 | =over 4 |
163 | |
164 | =item :encoding |
165 | |
166 | Use C<:encoding(ENCODING)> either in open() or binmode() to install |
167 | a layer that transparently does character set and encoding transformations, |
e76300d6 |
168 | for example from Shift-JIS to Unicode. Note that under C<stdio> |
169 | an C<:encoding> also enables C<:utf8>. See L<PerlIO::encoding> |
170 | for more information. |
39f7a870 |
171 | |
172 | =item :via |
173 | |
174 | Use C<:via(MODULE)> either in open() or binmode() to install a layer |
175 | that does whatever transformation (for example compression / |
176 | decompression, encryption / decryption) to the filehandle. |
177 | See L<PerlIO::via> for more information. |
178 | |
179 | =back |
180 | |
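As a rough sketch of what a C<:via> layer written in Perl can look like, the example below defines a hypothetical C<PerlIO::via::Upper> package (the name and its upper-casing behaviour are invented for illustration, not shipped with Perl). It assumes the C<PUSHED>, C<FILL> and C<WRITE> callbacks described in L<PerlIO::via>, and writes through the layer into a temporary file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical layer, for illustration only: upper-cases all data.
package PerlIO::via::Upper;

# Called when the layer is pushed; must return an object (or -1 on failure).
sub PUSHED {
    my ($class, $mode, $fh) = @_;
    my $state = '';
    return bless \$state, $class;
}

# Called to read more data from the layer below.
sub FILL {
    my ($obj, $fh) = @_;
    my $line = <$fh>;
    return defined $line ? uc($line) : undef;
}

# Called to write data to the layer below; returns bytes consumed.
sub WRITE {
    my ($obj, $buf, $fh) = @_;
    print $fh uc($buf) or return -1;
    return length($buf);
}

package main;

my ($tmp, $path) = tempfile(UNLINK => 1);
close $tmp;

# Write through the custom layer.
open(my $out, ">:via(Upper)", $path) or die "open $path: $!";
print $out "hello, layers\n";
close $out or die "close $path: $!";

# Read the file back without the layer to see what was stored.
open(my $in, "<", $path) or die "open $path: $!";
my $stored = <$in>;
close $in;
print $stored;
```

The layer name given to C<:via(...)> is looked up under the C<PerlIO::via::> prefix first, so C<:via(Upper)> finds the package defined above.
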
01e6739c |
181 | =head2 Alternatives to raw |
182 | |
0226bbdb |
183 | To get a binary stream an alternate method is to use: |
01e6739c |
184 | |
0226bbdb |
185 | open($fh,"whatever"); |
01e6739c |
186 | binmode($fh); |
187 | |
0226bbdb |
188 | This has the advantage of being backward compatible with how such things have |
01e6739c |
189 | had to be coded on some platforms for years. |
01e6739c |
190 | |
191 | To get an un-buffered stream specify an unbuffered layer (e.g. C<:unix>) |
0226bbdb |
192 | in the open call: |
01e6739c |
193 | |
194 | open($fh,"<:unix",$path); |
195 | |
7d3b96bb |
196 | =head2 Defaults and how to override them |
197 | |
ec28694c |
198 | If the platform is MS-DOS like and normally does CRLF to "\n" |
199 | translation for text files then the default layers are: |
7d3b96bb |
200 | |
201 | unix crlf |
202 | |
47bfe92f |
203 | (The low level "unix" layer may be replaced by a platform specific low |
204 | level layer.) |
7d3b96bb |
205 | |
47bfe92f |
206 | Otherwise if C<Configure> found out how to do "fast" IO using the system's |
046e4a6a |
207 | stdio, then the default layers are: |
7d3b96bb |
208 | |
209 | unix stdio |
210 | |
211 | Otherwise the default layers are: |
212 | |
213 | unix perlio |
214 | |
215 | These defaults may change once perlio has been better tested and tuned. |
216 | |
47bfe92f |
217 | The default can be overridden by setting the environment variable |
39f7a870 |
218 | PERLIO to a space separated list of layers (C<unix> or platform low |
219 | level layer is always pushed first). |
47bfe92f |
220 | |
7d3b96bb |
221 | This can be used to see the effect of/bugs in the various layers e.g. |
222 | |
223 | cd .../perl/t |
224 | PERLIO=stdio ./perl harness |
225 | PERLIO=perlio ./perl harness |
226 | |
3b0db4f9 |
227 | For the various values of PERLIO see L<perlrun/PERLIO>. |
228 | |
39f7a870 |
229 | =head2 Querying the layers of filehandles |
230 | |
231 | The following returns the B<names> of the PerlIO layers on a filehandle. |
232 | |
9d569fce |
233 | my @layers = PerlIO::get_layers($fh); # Or FH, *FH, "FH". |
39f7a870 |
234 | |
235 | The layers are returned in the order an open() or binmode() call would |
f0fd62e2 |
236 | use them. Note that the "default stack" depends on the operating |
237 | system and on the perl version. |
79d9a4d7 |
238 | |
79d9a4d7 |
239 | The following table summarizes the default layers on UNIX-like and |
240 | DOS-like platforms and depending on the setting of the C<$ENV{PERLIO}>: |
241 | |
f0fd62e2 |
242 | PERLIO      UNIX-like                 DOS-like |
79d9a4d7 |
243 | |
f0fd62e2 |
244 | unset / ""  unix perlio / stdio [1]   unix crlf |
245 | stdio       unix perlio / stdio [1]   stdio |
246 | perlio      unix perlio               unix perlio |
247 | mmap        unix mmap                 unix mmap |
39f7a870 |
248 | |
f0fd62e2 |
249 | # [1] "stdio" if Configure found out how to do "fast stdio" (depends |
250 | # on the stdio implementation) and in Perl 5.8, otherwise "unix perlio" |
046e4a6a |
251 | |
39f7a870 |
252 | By default the layers from the input side of the filehandle are |
253 | returned; to get the output side, use the optional C<output> argument: |
254 | |
2ae85e59 |
255 | my @layers = PerlIO::get_layers($fh, output => 1); |
39f7a870 |
256 | |
257 | (Usually the layers are identical on either side of a filehandle but |
2ae85e59 |
258 | for example with sockets there may be differences, or if you have |
259 | been using the C<open> pragma.) |
39f7a870 |
260 | |
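A minimal sketch of querying both sides of a filehandle, assuming a scratch file created with C<File::Temp> (the temporary file and variable names are illustrative only). The exact layer names printed depend on the platform and the perl build, so no particular output is implied.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create an empty scratch file to open; removed automatically at exit.
my ($tmp, $path) = tempfile(UNLINK => 1);
close $tmp;

open(my $fh, "<:utf8", $path) or die "open $path: $!";

# Layer names for the input side (the default) and the output side.
my @in  = PerlIO::get_layers($fh);
my @out = PerlIO::get_layers($fh, output => 1);
close $fh;

print "input:  @in\n";
print "output: @out\n";
```

Because C<:utf8> is a flag rather than a real layer, it shows up as a trailing C<utf8> entry in the returned list.
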
92a3e63c |
261 | There is no set_layers(), nor does get_layers() return a tied array |
262 | mirroring the stack, or anything fancy like that. This is not |
263 | accidental or unintentional. The PerlIO layer stack is a bit more |
264 | complicated than just a stack (see for example the behaviour of C<:raw>). |
265 | You are supposed to use open() and binmode() to manipulate the stack. |
266 | |
39f7a870 |
267 | B<Implementation details follow, please close your eyes.> |
268 | |
269 | The arguments to layers are by default returned in parentheses after |
270 | the name of the layer, and certain layers (like C<utf8>) are not real |
271 | layers but instead flags on real layers: to get all of these returned |
272 | separately use the optional C<details> argument: |
273 | |
2ae85e59 |
274 | my @layer_and_args_and_flags = PerlIO::get_layers($fh, details => 1); |
39f7a870 |
275 | |
276 | The result will be up to three times the number of layers: |
277 | the first element will be a name, the second element the arguments |
278 | (unspecified arguments will be C<undef>), the third element the flags, |
279 | the fourth element a name again, and so forth. |
280 | |
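One way to consume the flattened list is to take it apart three elements at a time. The sketch below assumes a scratch file from C<File::Temp> and copies the C<F_UTF8> value (0x8000) from the module source; the C<@rows> structure is purely illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Same value as PerlIO::F_UTF8 in PerlIO.pm.
use constant F_UTF8 => 0x8000;

# Create an empty scratch file to open; removed automatically at exit.
my ($tmp, $path) = tempfile(UNLINK => 1);
close $tmp;

open(my $fh, "<:utf8", $path) or die "open $path: $!";
my @details = PerlIO::get_layers($fh, details => 1);
close $fh;

# Regroup the flat list into (name, arguments, flags) triples.
my @rows;
while (my ($name, $args, $flags) = splice(@details, 0, 3)) {
    push @rows, { name => $name, args => $args, flags => $flags };
}

for my $row (@rows) {
    printf "%-8s args=%-6s flags=0x%08x utf8=%s\n",
        defined $row->{name} ? $row->{name} : 'undef',
        defined $row->{args} ? $row->{args} : 'undef',
        $row->{flags},
        ($row->{flags} & F_UTF8) ? 'yes' : 'no';
}
```

With C<details>, C<utf8> is not reported as a separate entry; it appears only as a bit in the flags of a real layer.
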
281 | B<You may open your eyes now.> |
282 | |
7d3b96bb |
283 | =head1 AUTHOR |
284 | |
285 | Nick Ing-Simmons E<lt>nick@ing-simmons.netE<gt> |
286 | |
287 | =head1 SEE ALSO |
288 | |
39f7a870 |
289 | L<perlfunc/"binmode">, L<perlfunc/"open">, L<perlunicode>, L<perliol>, |
290 | L<Encode> |
7d3b96bb |
291 | |
292 | =cut |
b3d30bf7 |
293 | |