Commit | Line | Data |
50b80e25 |
1 | =head1 NAME |
2 | |
3 | perliol - C API for Perl's implementation of IO in Layers. |
4 | |
5 | =head1 SYNOPSIS |
6 | |
7 | /* Defining a layer ... */ |
8 | #include <perliol.h> |
9 | |
50b80e25 |
10 | =head1 DESCRIPTION |
11 | |
9d799145 |
12 | This document describes the behavior and implementation of the PerlIO |
13 | abstraction described in L<perlapio> when C<USE_PERLIO> is defined (and |
14 | C<USE_SFIO> is not). |
50b80e25 |
15 | |
16 | =head2 History and Background |
17 | |
9d799145 |
18 | The PerlIO abstraction was introduced in perl5.003_02 but languished as |
19 | just an abstraction until perl5.7.0. However, during that time a number |
d1be9408 |
20 | of perl extensions switched to using it, so the API is mostly fixed to |
9d799145 |
21 | maintain (source) compatibility. |
50b80e25 |
22 | |
9d799145 |
23 | The aim of the implementation is to provide the PerlIO API in a flexible |
24 | and platform neutral manner. It is also a trial of an "Object Oriented |
25 | C, with vtables" approach which may be applied to perl6. |
50b80e25 |
26 | |
27 | =head2 Layers vs Disciplines |
28 | |
9d799145 |
29 | Initial discussion of the ability to modify IO streams' behaviour used |
30 | the term "discipline" for the entities which were added. This came (I |
31 | believe) from the use of the term in "sfio", which in turn borrowed it |
32 | from "line disciplines" on Unix terminals. However, this document (and |
33 | the C code) uses the term "layer". |
34 | |
1d11c889 |
35 | This is, I hope, a natural term given the implementation, and should |
36 | avoid connotations that are inherent in earlier uses of "discipline" |
37 | for things which are rather different. |
50b80e25 |
38 | |
39 | =head2 Data Structures |
40 | |
41 | The basic data structure is a PerlIOl: |
42 | |
43 | typedef struct _PerlIO PerlIOl; |
44 | typedef struct _PerlIO_funcs PerlIO_funcs; |
45 | typedef PerlIOl *PerlIO; |
46 | |
47 | struct _PerlIO |
48 | { |
49 | PerlIOl * next; /* Lower layer */ |
50 | PerlIO_funcs * tab; /* Functions for this layer */ |
51 | IV flags; /* Various flags for state */ |
52 | }; |
53 | |
1d11c889 |
54 | A C<PerlIOl *> is a pointer to the struct, and the I<application> |
55 | level C<PerlIO *> is a pointer to a C<PerlIOl *> - i.e. a pointer |
56 | to a pointer to the struct. This allows the application level C<PerlIO *> |
57 | to remain constant while the actual C<PerlIOl *> underneath |
58 | changes. (Compare perl's C<SV *> which remains constant while its |
59 | C<sv_any> field changes as the scalar's type changes.) An IO stream is |
60 | then in general represented as a pointer to this linked-list of |
61 | "layers". |
50b80e25 |
62 | |
9d799145 |
63 | It should be noted that because of the double indirection in a C<PerlIO *>, |
d4165bde |
64 | a C<< &(perlio->next) >> "is" a C<PerlIO *>, and so to some degree |
11e1c8f2 |
65 | at least one layer can use the "standard" API on the next layer down. |
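The double indirection can be illustrated with a small stand-alone sketch. It does not use the real headers: C<Layer> and C<Handle> are hypothetical stand-ins for C<PerlIOl> and C<PerlIO>, the C<name> field replaces the C<tab> pointer, and C<push_layer>, C<pop_layer> and C<top_name> are illustrative helpers rather than PerlIO functions. It only aims to show that the application-level pointer stays fixed while layers come and go, and that the address of a C<next> field can be used wherever a handle pointer is expected.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in types: same shape as the structs above, but simplified. */
typedef struct layer Layer;      /* plays the role of PerlIOl */
typedef Layer *Handle;           /* plays the role of PerlIO  */

struct layer {
    Layer      *next;            /* lower layer                        */
    const char *name;            /* stands in for the 'tab' pointer    */
};

/* Push 'l' on top of the stack that the handle 'f' points at.
   The value of 'f' itself never changes. */
void push_layer(Handle *f, Layer *l)
{
    l->next = *f;
    *f = l;
}

/* Pop the top layer and return it so the caller can release it. */
Layer *pop_layer(Handle *f)
{
    Layer *top = *f;
    if (top) {
        *f = top->next;
        top->next = NULL;
    }
    return top;
}

/* Name of the top layer as seen through a Handle pointer. */
const char *top_name(Handle *f)
{
    return (f && *f) ? (*f)->name : NULL;
}
```

With two layers pushed, C<top_name(f)> reports the upper layer, while C<top_name(&(*f)-E<gt>next)> reports the one below it through exactly the same interface.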
50b80e25 |
66 | |
67 | A "layer" is composed of two parts: |
68 | |
69 | =over 4 |
70 | |
210b36aa |
71 | =item 1. |
50b80e25 |
72 | |
210b36aa |
73 | The functions and attributes of the "layer class". |
74 | |
75 | =item 2. |
76 | |
77 | The per-instance data for a particular handle. |
50b80e25 |
78 | |
79 | =back |
80 | |
81 | =head2 Functions and Attributes |
82 | |
9d799145 |
83 | The functions and attributes are accessed via the "tab" (for table) |
84 | member of C<PerlIOl>. The functions (methods of the layer "class") are |
85 | fixed, and are defined by the C<PerlIO_funcs> type. They are broadly the |
86 | same as the public C<PerlIO_xxxxx> functions: |
50b80e25 |
87 | |
b76cc8ba |
88 | struct _PerlIO_funcs |
89 | { |
90 | char * name; |
91 | Size_t size; |
92 | IV kind; |
d4165bde |
93 | IV (*Pushed)(pTHX_ PerlIO *f,const char *mode,SV *arg); |
94 | IV (*Popped)(pTHX_ PerlIO *f); |
b76cc8ba |
95 | PerlIO * (*Open)(pTHX_ PerlIO_funcs *tab, |
96 | AV *layers, IV n, |
97 | const char *mode, |
98 | int fd, int imode, int perm, |
99 | PerlIO *old, |
100 | int narg, SV **args); |
d4165bde |
101 | SV * (*Getarg)(pTHX_ PerlIO *f, CLONE_PARAMS *param, int flags); |
102 | IV (*Fileno)(pTHX_ PerlIO *f); |
103 | PerlIO * (*Dup)(pTHX_ PerlIO *f, PerlIO *o, CLONE_PARAMS *param, int flags); |
b76cc8ba |
104 | /* Unix-like functions - cf sfio line disciplines */ |
d4165bde |
105 | SSize_t (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count); |
106 | SSize_t (*Unread)(pTHX_ PerlIO *f, const void *vbuf, Size_t count); |
107 | SSize_t (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count); |
108 | IV (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence); |
109 | Off_t (*Tell)(pTHX_ PerlIO *f); |
110 | IV (*Close)(pTHX_ PerlIO *f); |
b76cc8ba |
111 | /* Stdio-like buffered IO functions */ |
d4165bde |
112 | IV (*Flush)(pTHX_ PerlIO *f); |
113 | IV (*Fill)(pTHX_ PerlIO *f); |
114 | IV (*Eof)(pTHX_ PerlIO *f); |
115 | IV (*Error)(pTHX_ PerlIO *f); |
116 | void (*Clearerr)(pTHX_ PerlIO *f); |
117 | void (*Setlinebuf)(pTHX_ PerlIO *f); |
b76cc8ba |
118 | /* Perl's snooping functions */ |
d4165bde |
119 | STDCHAR * (*Get_base)(pTHX_ PerlIO *f); |
120 | Size_t (*Get_bufsiz)(pTHX_ PerlIO *f); |
121 | STDCHAR * (*Get_ptr)(pTHX_ PerlIO *f); |
122 | SSize_t (*Get_cnt)(pTHX_ PerlIO *f); |
123 | void (*Set_ptrcnt)(pTHX_ PerlIO *f,STDCHAR *ptr,SSize_t cnt); |
b76cc8ba |
124 | }; |
125 | |
9d799145 |
126 | The first few members of the struct give a "name" for the layer, the |
127 | size to C<malloc> for the per-instance data, and some flags which are |
128 | attributes of the class as a whole (such as whether it is a buffering |
129 | layer), then follow the functions which fall into four basic groups: |
50b80e25 |
130 | |
131 | =over 4 |
132 | |
aa500c9e |
133 | =item 1. |
50b80e25 |
134 | |
aa500c9e |
135 | Opening and setup functions |
50b80e25 |
136 | |
aa500c9e |
137 | =item 2. |
50b80e25 |
138 | |
aa500c9e |
139 | Basic IO operations |
140 | |
141 | =item 3. |
142 | |
143 | Stdio class buffering options. |
144 | |
145 | =item 4. |
146 | |
147 | Functions to support Perl's traditional "fast" access to the buffer. |
50b80e25 |
148 | |
149 | =back |
150 | |
1d11c889 |
151 | A layer does not have to implement all the functions, but the whole |
152 | table has to be present. Unimplemented slots can be NULL (which will |
153 | result in an error when called) or can be filled in with stubs to |
154 | "inherit" behaviour from a "base class". This "inheritance" is fixed |
155 | for all instances of the layer, but as the layer chooses which stubs |
156 | to populate the table, limited "multiple inheritance" is possible. |
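The table-with-stubs idea might be sketched as follows. This is a cut-down model, not the real C<PerlIO_funcs>: the C<LayerFuncs> type has only two slots, and every name in it (C<base_fileno>, C<base_eof>, C<fd_fileno>, C<layer_fileno>, the three tables) is hypothetical. It shows one layer supplying its own method, one "inheriting" a base stub that asks the layer below, and one leaving a slot NULL so that the dispatcher reports an error.

```c
#include <assert.h>
#include <stddef.h>

typedef struct layer Layer;

/* Cut-down stand-in for PerlIO_funcs: two method slots only. */
typedef struct {
    const char *name;
    long (*Fileno)(Layer *l);
    int  (*Eof)(Layer *l);
} LayerFuncs;

struct layer {
    Layer            *next;    /* lower layer                          */
    const LayerFuncs *tab;     /* functions for this layer             */
    long              fd;      /* only meaningful for the bottom layer */
    int               at_eof;
};

/* "Base class" stubs that any layer may place in its table. */
long base_fileno(Layer *l)     /* just ask the next layer down */
{
    return (l->next && l->next->tab->Fileno)
        ? l->next->tab->Fileno(l->next) : -1;
}
int base_eof(Layer *l) { return l->at_eof; }

/* The bottom layer supplies its own Fileno. */
long fd_fileno(Layer *l) { return l->fd; }

const LayerFuncs fd_funcs  = { "fd",  fd_fileno,   base_eof };
const LayerFuncs buf_funcs = { "buf", base_fileno, base_eof };
const LayerFuncs odd_funcs = { "odd", NULL,        base_eof };

/* Dispatcher: a NULL slot is reported as an error (-1). */
long layer_fileno(Layer *l)
{
    if (!l || !l->tab->Fileno)
        return -1;
    return l->tab->Fileno(l);
}
```

Because each table is filled in once, at compile time, the choice of stubs is the same for every instance of a layer, which is the fixed "inheritance" described above.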
50b80e25 |
157 | |
158 | =head2 Per-instance Data |
159 | |
1d11c889 |
160 | The per-instance data are held in memory beyond the basic PerlIOl |
161 | struct, by making a PerlIOl the first member of the layer's struct |
162 | thus: |
50b80e25 |
163 | |
164 | typedef struct |
165 | { |
166 | struct _PerlIO base; /* Base "class" info */ |
167 | STDCHAR * buf; /* Start of buffer */ |
168 | STDCHAR * end; /* End of valid part of buffer */ |
169 | STDCHAR * ptr; /* Current position in buffer */ |
170 | Off_t posn; /* Offset of buf into the file */ |
171 | Size_t bufsiz; /* Real size of buffer */ |
172 | IV oneword; /* Emergency buffer */ |
173 | } PerlIOBuf; |
174 | |
1d11c889 |
175 | In this way (as for perl's scalars) a pointer to a PerlIOBuf can be |
176 | treated as a pointer to a PerlIOl. |
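A minimal sketch of this layout trick, using stand-in types rather than the real headers, is given here. C<BaseLayer> and C<BufLayer> are hypothetical simplifications of C<PerlIOl> and C<PerlIOBuf>; C<layer_alloc> and C<as_buf> are illustrative helpers, and the C<size> field merely mimics the role of the C<size> member of the function table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct base_layer {
    struct base_layer *next;   /* lower layer                            */
    size_t             size;   /* bytes allocated for the whole instance */
    long               flags;
} BaseLayer;

typedef struct {
    BaseLayer base;            /* must be the first member */
    char     *buf;
    size_t    bufsiz;
} BufLayer;

/* Allocate 'size' zeroed bytes and treat the front as a BaseLayer,
   roughly what has to happen when a layer instance is created. */
BaseLayer *layer_alloc(size_t size)
{
    BaseLayer *l;
    if (size < sizeof(BaseLayer))
        return NULL;
    l = calloc(1, size);
    if (l)
        l->size = size;
    return l;
}

/* Downcast: valid only because 'base' sits at offset zero. */
BufLayer *as_buf(BaseLayer *l)
{
    return (BufLayer *)l;
}
```

Generic code only ever touches the C<BaseLayer> part; the layer's own methods cast back to the full struct to reach the per-instance fields.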
50b80e25 |
177 | |
178 | =head2 Layers in action. |
179 | |
180 | table perlio unix |
181 | | | |
182 | +-----------+ +----------+ +--------+ |
183 | PerlIO ->| |--->| next |--->| NULL | |
184 | +-----------+ +----------+ +--------+ |
185 | | | | buffer | | fd | |
186 | +-----------+ | | +--------+ |
187 | | | +----------+ |
188 | |
189 | |
190 | The above attempts to show how the layer scheme works in a simple case. |
9d799145 |
191 | The application's C<PerlIO *> points to an entry in the table(s) |
192 | representing open (allocated) handles. For example the first three slots |
193 | in the table correspond to C<stdin>, C<stdout> and C<stderr>. The table |
194 | in turn points to the current "top" layer for the handle - in this case |
195 | an instance of the generic buffering layer "perlio". That layer in turn |
196 | points to the next layer down - in this case the low-level "unix" layer. |
50b80e25 |
197 | |
9d799145 |
198 | The above is roughly equivalent to a "stdio" buffered stream, but with |
199 | much more flexibility: |
50b80e25 |
200 | |
201 | =over 4 |
202 | |
203 | =item * |
204 | |
9d799145 |
205 | If Unix level C<read>/C<write>/C<lseek> is not appropriate for (say) |
206 | sockets then the "unix" layer can be replaced (at open time or even |
207 | dynamically) with a "socket" layer. |
50b80e25 |
208 | |
209 | =item * |
210 | |
1d11c889 |
211 | Different handles can have different buffering schemes. The "top" |
212 | layer could be the "mmap" layer if reading disk files was quicker |
213 | using C<mmap> than C<read>. An "unbuffered" stream can be implemented |
214 | simply by not having a buffer layer. |
50b80e25 |
215 | |
216 | =item * |
217 | |
218 | Extra layers can be inserted to process the data as it flows through. |
9d799145 |
219 | This was the driving need for including the scheme in perl 5.7.0+ - we |
d1be9408 |
220 | needed a mechanism to allow data to be translated between perl's |
9d799145 |
221 | internal encoding (conceptually at least Unicode as UTF-8), and the |
222 | "native" format used by the system. This is provided by the |
223 | ":encoding(xxxx)" layer which typically sits above the buffering layer. |
50b80e25 |
224 | |
225 | =item * |
226 | |
1d11c889 |
227 | A layer can be added that does "\n" to CRLF translation. This layer |
228 | can be used on any platform, not just those that normally do such |
229 | things. |
50b80e25 |
230 | |
231 | =back |
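The kind of transformation a translating layer performs as data flows through it can be sketched in isolation. The two functions below are hypothetical helpers, not the code of the "crlf" layer: C<lf_to_crlf> expands "\n" to CR,LF on the way out and C<crlf_to_lf> collapses CR,LF to "\n" on the way in. The sketch assumes the whole chunk is available at once; a real layer would also have to cope with a CR at the very end of one buffer whose LF arrives with the next fill.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy 'len' bytes from src to dst, expanding each '\n' to "\r\n".
   Returns the number of bytes written, or (size_t)-1 if dst is too small. */
size_t lf_to_crlf(const char *src, size_t len, char *dst, size_t cap)
{
    size_t i, out = 0;
    for (i = 0; i < len; i++) {
        size_t need = (src[i] == '\n') ? 2 : 1;
        if (cap - out < need)
            return (size_t)-1;
        if (src[i] == '\n')
            dst[out++] = '\r';
        dst[out++] = src[i];
    }
    return out;
}

/* Collapse each "\r\n" pair to '\n' in place and return the new length.
   A CR that is not followed by LF is kept unchanged. */
size_t crlf_to_lf(char *buf, size_t len)
{
    size_t i, out = 0;
    for (i = 0; i < len; i++) {
        if (buf[i] == '\r' && i + 1 < len && buf[i + 1] == '\n')
            continue;          /* drop the CR; the LF is copied next time round */
        buf[out++] = buf[i];
    }
    return out;
}
```

Nothing here depends on the platform, which is why such a layer can be pushed on any system and not only on those whose native text files use CR,LF.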
232 | |
233 | =head2 Per-instance flag bits |
234 | |
1d11c889 |
235 | The generic flag bits are a hybrid of C<O_XXXXX> style flags deduced |
236 | from the mode string passed to C<PerlIO_open()>, and state bits for |
237 | typical buffer layers. |
50b80e25 |
238 | |
9d799145 |
239 | =over 4 |
50b80e25 |
240 | |
241 | =item PERLIO_F_EOF |
242 | |
243 | End of file. |
244 | |
245 | =item PERLIO_F_CANWRITE |
246 | |
3039a93d |
247 | Writes are permitted, i.e. opened as "w" or "r+" or "a", etc. |
50b80e25 |
248 | |
249 | =item PERLIO_F_CANREAD |
250 | |
3039a93d |
251 | Reads are permitted i.e. opened "r" or "w+" (or even "a+" - ick). |
50b80e25 |
252 | |
253 | =item PERLIO_F_ERROR |
254 | |
d4165bde |
255 | An error has occurred (for C<PerlIO_error()>). |
50b80e25 |
256 | |
257 | =item PERLIO_F_TRUNCATE |
258 | |
259 | Truncate file suggested by open mode. |
260 | |
261 | =item PERLIO_F_APPEND |
262 | |
263 | All writes should be appends. |
264 | |
265 | =item PERLIO_F_CRLF |
266 | |
11e1c8f2 |
267 | Layer is performing Win32-like "\n" mapped to CR,LF for output and CR,LF |
268 | mapped to "\n" for input. Normally the provided "crlf" layer is the only |
269 | layer that need bother about this. C<PerlIO_binmode()> will mess with this |
9d799145 |
270 | flag rather than add/remove layers if the C<PERLIO_K_CANCRLF> bit is set |
271 | for the layer's class. |
50b80e25 |
272 | |
273 | =item PERLIO_F_UTF8 |
274 | |
3039a93d |
275 | Data written to this layer should be UTF-8 encoded; data provided |
50b80e25 |
276 | by this layer should be considered UTF-8 encoded. Can be set on any layer |
277 | by ":utf8" dummy layer. Also set on ":encoding" layer. |
278 | |
279 | =item PERLIO_F_UNBUF |
280 | |
281 | Layer is unbuffered - i.e. write to next layer down should occur for |
282 | each write to this layer. |
283 | |
284 | =item PERLIO_F_WRBUF |
285 | |
286 | The buffer for this layer currently holds data written to it but not sent |
287 | to next layer. |
288 | |
289 | =item PERLIO_F_RDBUF |
290 | |
291 | The buffer for this layer currently holds unconsumed data read from |
292 | layer below. |
293 | |
294 | =item PERLIO_F_LINEBUF |
295 | |
9d799145 |
296 | Layer is line buffered. Write data should be passed to next layer down |
297 | whenever a "\n" is seen. Any data beyond the "\n" should then be |
298 | processed. |
50b80e25 |
299 | |
300 | =item PERLIO_F_TEMP |
301 | |
9d799145 |
302 | File has been C<unlink()>ed, or should be deleted on C<close()>. |
50b80e25 |
303 | |
304 | =item PERLIO_F_OPEN |
305 | |
306 | Handle is open. |
307 | |
308 | =item PERLIO_F_FASTGETS |
309 | |
9d799145 |
310 | This instance of this layer supports the "fast C<gets>" interface. |
311 | Normally set based on C<PERLIO_K_FASTGETS> for the class and by the |
d1be9408 |
312 | existence of the function(s) in the table. However a class that |
50b80e25 |
313 | normally provides that interface may need to avoid it on a |
314 | particular instance. The "pending" layer needs to do this when |
d1be9408 |
315 | it is pushed above a layer which does not support the interface. |
9d799145 |
316 | (Perl's C<sv_gets()> does not expect the stream's fast C<gets> behaviour |
50b80e25 |
317 | to change during one "get".) |
318 | |
319 | =back |
320 | |
321 | =head2 Methods in Detail |
322 | |
323 | =over 4 |
324 | |
5cb3728c |
325 | =item name |
326 | |
327 | char * name; |
d4165bde |
328 | |
329 | The name of the layer whose open() method Perl should invoke on |
330 | open(). For example if the layer is called APR, you will call: |
331 | |
332 | open $fh, ">:APR", ... |
333 | |
334 | and Perl knows that it has to invoke the PerlIOAPR_open() method |
335 | implemented by the APR layer. |
336 | |
5cb3728c |
337 | =item size |
338 | |
339 | Size_t size; |
d4165bde |
340 | |
341 | The size of the per-instance data structure, e.g.: |
342 | |
343 | sizeof(PerlIOAPR) |
344 | |
5cb3728c |
345 | =item kind |
346 | |
347 | IV kind; |
d4165bde |
348 | |
349 | XXX: explain all the available flags here |
350 | |
351 | =over 4 |
352 | |
353 | =item * PERLIO_K_BUFFERED |
354 | |
355 | =item * PERLIO_K_CANCRLF |
356 | |
357 | =item * PERLIO_K_FASTGETS |
358 | |
359 | =item * PERLIO_K_MULTIARG |
360 | |
361 | Used when the layer's open() accepts more arguments than usual. The |
362 | extra arguments should not come before the C<MODE> argument. When this |
363 | flag is used it's up to the layer to validate the args. |
364 | |
365 | =item * PERLIO_K_RAW |
366 | |
367 | =back |
368 | |
5cb3728c |
369 | =item Pushed |
370 | |
371 | IV (*Pushed)(pTHX_ PerlIO *f,const char *mode, SV *arg); |
50b80e25 |
372 | |
1d11c889 |
373 | The only absolutely mandatory method. Called when the layer is pushed |
374 | onto the stack. The C<mode> argument may be NULL if this occurs |
375 | post-open. The C<arg> will be non-C<NULL> if an argument string was |
376 | passed. In most cases this should call C<PerlIOBase_pushed()> to |
377 | convert C<mode> into the appropriate C<PERLIO_F_XXXXX> flags in |
378 | addition to any actions the layer itself takes. If a layer is not |
379 | expecting an argument it need neither save the one passed to it, nor |
380 | provide C<Getarg()> (it could perhaps C<Perl_warn> that the argument |
381 | was un-expected). |
50b80e25 |
382 | |
d4165bde |
383 | Returns 0 on success. On failure returns -1 and should set errno. |
384 | |
5cb3728c |
385 | =item Popped |
386 | |
387 | IV (*Popped)(pTHX_ PerlIO *f); |
50b80e25 |
388 | |
1d11c889 |
389 | Called when the layer is popped from the stack. A layer will normally |
390 | be popped after C<Close()> is called. But a layer can be popped |
391 | without being closed if the program is dynamically managing layers on |
392 | the stream. In such cases C<Popped()> should free any resources |
393 | (buffers, translation tables, ...) not held directly in the layer's |
394 | struct. It should also C<Unread()> any unconsumed data that has been |
395 | read and buffered from the layer below back to that layer, so that it |
396 | can be re-provided to what ever is now above. |
b76cc8ba |
397 | |
d4165bde |
398 | Returns 0 on both success and failure. |
399 | |
5cb3728c |
400 | =item Open |
401 | |
402 | PerlIO * (*Open)(...); |
b76cc8ba |
403 | |
1d11c889 |
404 | The C<Open()> method has lots of arguments because it combines the |
405 | functions of perl's C<open>, C<PerlIO_open>, perl's C<sysopen>, |
406 | C<PerlIO_fdopen> and C<PerlIO_reopen>. The full prototype is as |
407 | follows: |
b76cc8ba |
408 | |
409 | PerlIO * (*Open)(pTHX_ PerlIO_funcs *tab, |
410 | AV *layers, IV n, |
411 | const char *mode, |
412 | int fd, int imode, int perm, |
413 | PerlIO *old, |
414 | int narg, SV **args); |
415 | |
1d11c889 |
416 | Open should (perhaps indirectly) call C<PerlIO_allocate()> to allocate |
417 | a slot in the table and associate it with the layers information for |
418 | the opened file, by calling C<PerlIO_push>. The I<layers> AV is an |
419 | array of all the layers destined for the C<PerlIO *>, and any |
420 | arguments passed to them, I<n> is the index into that array of the |
421 | layer being called. The macro C<PerlIOArg> will return a (possibly |
422 | C<NULL>) SV * for the argument passed to the layer. |
423 | |
424 | The I<mode> string is an "C<fopen()>-like" string which would match |
425 | the regular expression C</^[I#]?[rwa]\+?[bt]?$/>. |
426 | |
427 | The C<'I'> prefix is used during creation of C<stdin>..C<stderr> via |
428 | special C<PerlIO_fdopen> calls; the C<'#'> prefix means that this is |
429 | C<sysopen> and that I<imode> and I<perm> should be passed to |
430 | C<PerlLIO_open3>; C<'r'> means B<r>ead, C<'w'> means B<w>rite and |
431 | C<'a'> means B<a>ppend. The C<'+'> suffix means that both reading and |
432 | writing/appending are permitted. The C<'b'> suffix means file should |
433 | be binary, and C<'t'> means it is text. (Binary/Text should be ignored |
434 | by almost all layers and binary IO done, with PerlIO. The C<:crlf> |
435 | layer should be pushed to handle the distinction.) |
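As a rough illustration of how such a mode string might be turned into flag bits, consider the sketch below. It is not the code used by C<PerlIOBase_pushed()>: the function C<mode_to_flags> is hypothetical, and the C<F_*> values are placeholders standing in for the real C<PERLIO_F_XXXXX> constants defined in F<perliol.h>. The C<'I'> and C<'#'> prefixes and the C<'b'>/C<'t'> suffix are accepted but otherwise ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder bit values; the real PERLIO_F_* constants differ. */
enum {
    F_CANREAD  = 1 << 0,
    F_CANWRITE = 1 << 1,
    F_TRUNCATE = 1 << 2,
    F_APPEND   = 1 << 3
};

/* Parse a mode of the form [I#]?[rwa]\+?[bt]? and return the
   corresponding flag bits, or -1 if the string does not match. */
long mode_to_flags(const char *mode)
{
    long flags;

    if (!mode)
        return -1;
    if (*mode == 'I' || *mode == '#')      /* optional prefix */
        mode++;
    switch (*mode++) {
    case 'r': flags = F_CANREAD;               break;
    case 'w': flags = F_CANWRITE | F_TRUNCATE; break;
    case 'a': flags = F_CANWRITE | F_APPEND;   break;
    default:  return -1;
    }
    if (*mode == '+') {                    /* read and write both allowed */
        flags |= F_CANREAD | F_CANWRITE;
        mode++;
    }
    if (*mode == 'b' || *mode == 't')      /* binary/text: ignored here */
        mode++;
    return *mode == '\0' ? flags : -1;
}
```

For example, C<"w+"> yields read, write and truncate, while C<"rw"> is rejected because only one of C<r>, C<w> or C<a> may appear.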
436 | |
437 | If I<old> is not C<NULL> then this is a C<PerlIO_reopen>. Perl itself |
438 | does not use this (yet?) and semantics are a little vague. |
439 | |
440 | If I<fd> is not negative then it is the numeric file descriptor I<fd>, |
441 | which will be open in a manner compatible with the supplied mode |
442 | string; the call is thus equivalent to C<PerlIO_fdopen>. In this case |
443 | I<narg> will be zero. |
444 | |
445 | If I<narg> is greater than zero then it gives the number of arguments |
446 | passed to C<open>, otherwise it will be 1 if for example |
447 | C<PerlIO_open> was called. In simple cases SvPV_nolen(*args) is the |
448 | pathname to open. |
449 | |
450 | Having said all that, translation-only layers do not need to provide |
451 | C<Open()> at all, but rather leave the opening to a lower level layer |
452 | and wait to be "pushed". If a layer does provide C<Open()> it should |
453 | normally call the C<Open()> method of next layer down (if any) and |
454 | then push itself on top if that succeeds. |
b76cc8ba |
455 | |
d4165bde |
456 | Returns C<NULL> on failure. |
457 | |
5cb3728c |
458 | =item Getarg |
459 | |
460 | SV * (*Getarg)(pTHX_ PerlIO *f, |
461 | CLONE_PARAMS *param, int flags); |
b76cc8ba |
462 | |
d4165bde |
463 | Optional. If present should return an SV * representing the string |
464 | argument passed to the layer when it was |
465 | pushed, e.g. ":encoding(ascii)" would return an SvPV with value |
466 | "ascii". (I<param> and I<flags> arguments can be ignored in most |
467 | cases) |
b76cc8ba |
468 | |
5cb3728c |
469 | =item Fileno |
470 | |
471 | IV (*Fileno)(pTHX_ PerlIO *f); |
b76cc8ba |
472 | |
d1be9408 |
473 | Returns the Unix/Posix numeric file descriptor for the handle. Normally |
b76cc8ba |
474 | C<PerlIOBase_fileno()> (which just asks next layer down) will suffice |
475 | for this. |
50b80e25 |
476 | |
d4165bde |
477 | Returns -1 if the layer cannot provide such a file descriptor, or in |
478 | the case of an error. |
479 | |
480 | XXX: two possible results end up in -1, one is an error the other is |
481 | not. |
482 | |
5cb3728c |
483 | =item Dup |
484 | |
485 | PerlIO * (*Dup)(pTHX_ PerlIO *f, PerlIO *o, |
486 | CLONE_PARAMS *param, int flags); |
d4165bde |
487 | |
488 | XXX: not documented |
489 | |
490 | Similar to C<Open>, returns PerlIO* on success, C<NULL> on failure. |
491 | |
5cb3728c |
492 | =item Read |
493 | |
494 | SSize_t (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count); |
d4165bde |
495 | |
496 | Basic read operation. |
50b80e25 |
497 | |
d4165bde |
498 | Typically will call C<Fill> and manipulate pointers (possibly via the |
499 | API). C<PerlIOBuf_read()> may be suitable for derived classes which |
500 | provide "fast gets" methods. |
50b80e25 |
501 | |
d4165bde |
502 | Returns actual bytes read, or -1 on an error. |
503 | |
5cb3728c |
504 | =item Unread |
505 | |
506 | SSize_t (*Unread)(pTHX_ PerlIO *f, |
507 | const void *vbuf, Size_t count); |
50b80e25 |
508 | |
9d799145 |
509 | A superset of stdio's C<ungetc()>. Should arrange for future reads to |
510 | see the bytes in C<vbuf>. If there is no obviously better implementation |
511 | then C<PerlIOBase_unread()> provides the function by pushing a "fake" |
512 | "pending" layer above the calling layer. |
50b80e25 |
513 | |
d4165bde |
514 | Returns the number of unread chars. |
515 | |
5cb3728c |
516 | =item Write |
517 | |
518 | SSize_t (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count); |
50b80e25 |
519 | |
d4165bde |
520 | Basic write operation. |
50b80e25 |
521 | |
d4165bde |
522 | Returns bytes written or -1 on an error. |
523 | |
5cb3728c |
524 | =item Seek |
525 | |
526 | IV (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence); |
50b80e25 |
527 | |
1d11c889 |
528 | Position the file pointer. Should normally call its own C<Flush> |
529 | method and then the C<Seek> method of next layer down. |
50b80e25 |
530 | |
d4165bde |
531 | Returns 0 on success, -1 on failure. |
532 | |
5cb3728c |
533 | =item Tell |
534 | |
535 | Off_t (*Tell)(pTHX_ PerlIO *f); |
50b80e25 |
536 | |
9d799145 |
537 | Return the file pointer. May be based on the layer's cached concept of |
538 | position to avoid overhead. |
50b80e25 |
539 | |
d4165bde |
540 | Returns -1 on failure to get the file pointer. |
541 | |
5cb3728c |
542 | =item Close |
543 | |
544 | IV (*Close)(pTHX_ PerlIO *f); |
50b80e25 |
545 | |
9d799145 |
546 | Close the stream. Should normally call C<PerlIOBase_close()> to flush |
547 | itself and close layers below, and then deallocate any data structures |
548 | (buffers, translation tables, ...) not held directly in the data |
549 | structure. |
50b80e25 |
550 | |
d4165bde |
551 | Returns 0 on success, -1 on failure. |
552 | |
5cb3728c |
553 | =item Flush |
554 | |
555 | IV (*Flush)(pTHX_ PerlIO *f); |
50b80e25 |
556 | |
9d799145 |
557 | Should make stream's state consistent with layers below. That is, any |
558 | buffered write data should be written, and file position of lower layers |
d1be9408 |
559 | adjusted for data read from below but not actually consumed. |
b76cc8ba |
560 | (Should perhaps C<Unread()> such data to the lower layer.) |
50b80e25 |
561 | |
d4165bde |
562 | Returns 0 on success, -1 on failure. |
563 | |
5cb3728c |
564 | =item Fill |
565 | |
566 | IV (*Fill)(pTHX_ PerlIO *f); |
d4165bde |
567 | |
568 | The buffer for this layer should be filled (for read) from layer |
569 | below. When you "subclass" PerlIOBuf layer, you want to use its |
570 | I<_read> method and to supply your own fill method, which fills the |
571 | PerlIOBuf's buffer. |
50b80e25 |
572 | |
d4165bde |
573 | Returns 0 on success, -1 on failure. |
50b80e25 |
574 | |
5cb3728c |
575 | =item Eof |
576 | |
577 | IV (*Eof)(pTHX_ PerlIO *f); |
50b80e25 |
578 | |
9d799145 |
579 | Return end-of-file indicator. C<PerlIOBase_eof()> is normally sufficient. |
50b80e25 |
580 | |
d4165bde |
581 | Returns 0 on end-of-file, 1 if not end-of-file, -1 on error. |
582 | |
5cb3728c |
583 | =item Error |
584 | |
585 | IV (*Error)(pTHX_ PerlIO *f); |
50b80e25 |
586 | |
9d799145 |
587 | Return error indicator. C<PerlIOBase_error()> is normally sufficient. |
50b80e25 |
588 | |
d4165bde |
589 | Returns 1 if there is an error (usually when C<PERLIO_F_ERROR> is set), |
590 | 0 otherwise. |
591 | |
5cb3728c |
592 | =item Clearerr |
593 | |
594 | void (*Clearerr)(pTHX_ PerlIO *f); |
50b80e25 |
595 | |
9d799145 |
596 | Clear end-of-file and error indicators. Should call C<PerlIOBase_clearerr()> |
597 | to set the C<PERLIO_F_XXXXX> flags, which may suffice. |
50b80e25 |
598 | |
5cb3728c |
599 | =item Setlinebuf |
600 | |
601 | void (*Setlinebuf)(pTHX_ PerlIO *f); |
50b80e25 |
602 | |
b76cc8ba |
603 | Mark the stream as line buffered. C<PerlIOBase_setlinebuf()> sets the |
604 | C<PERLIO_F_LINEBUF> flag and is normally sufficient. |
50b80e25 |
605 | |
5cb3728c |
606 | =item Get_base |
607 | |
608 | STDCHAR * (*Get_base)(pTHX_ PerlIO *f); |
50b80e25 |
609 | |
610 | Allocate (if not already done so) the read buffer for this layer and |
d4165bde |
611 | return pointer to it. Return NULL on failure. |
50b80e25 |
612 | |
5cb3728c |
613 | =item Get_bufsiz |
614 | |
615 | Size_t (*Get_bufsiz)(pTHX_ PerlIO *f); |
50b80e25 |
616 | |
9d799145 |
617 | Return the number of bytes that last C<Fill()> put in the buffer. |
50b80e25 |
618 | |
5cb3728c |
619 | =item Get_ptr |
620 | |
621 | STDCHAR * (*Get_ptr)(pTHX_ PerlIO *f); |
50b80e25 |
622 | |
3039a93d |
623 | Return the current read pointer relative to this layer's buffer. |
50b80e25 |
624 | |
5cb3728c |
625 | =item Get_cnt |
626 | |
627 | SSize_t (*Get_cnt)(pTHX_ PerlIO *f); |
50b80e25 |
628 | |
629 | Return the number of bytes left to be read in the current buffer. |
630 | |
5cb3728c |
631 | =item Set_ptrcnt |
632 | |
633 | void (*Set_ptrcnt)(pTHX_ PerlIO *f, |
634 | STDCHAR *ptr, SSize_t cnt); |
50b80e25 |
635 | |
636 | Adjust the read pointer and count of bytes to match C<ptr> and/or C<cnt>. |
637 | The application (or layer above) must ensure they are consistent. |
638 | (Checking is allowed by the paranoid.) |
639 | |
640 | =back |
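The way a caller might use the four "snooping" methods together can be sketched with a stand-in buffer type. None of this is the real implementation: C<Buf>, C<get_ptr>, C<get_cnt>, C<set_ptrcnt> and C<fast_gets> are hypothetical names that merely mirror the roles of C<Get_ptr>, C<Get_cnt> and C<Set_ptrcnt>. The sketch takes one line (or whatever is left) straight out of the buffer and then tells the layer how much was consumed; a real caller would invoke C<Fill()> when the count drops to zero, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    char *buf;   /* start of buffer           */
    char *ptr;   /* current read position     */
    char *end;   /* end of valid buffered data */
} Buf;

char *get_ptr(Buf *b) { return b->ptr; }

ptrdiff_t get_cnt(Buf *b) { return b->end - b->ptr; }

void set_ptrcnt(Buf *b, char *ptr, ptrdiff_t cnt)
{
    /* The "paranoid" consistency check: ptr and cnt must agree. */
    assert(ptr >= b->buf && ptr + cnt == b->end);
    b->ptr = ptr;
}

/* Copy one line (including its '\n'), or the remaining bytes if there is
   no newline, into 'out'. At most 'cap' bytes are taken. Returns the
   number of bytes consumed from the buffer. */
size_t fast_gets(Buf *b, char *out, size_t cap)
{
    char  *p    = get_ptr(b);
    size_t cnt  = (size_t)get_cnt(b);
    char  *nl   = memchr(p, '\n', cnt);
    size_t take = nl ? (size_t)(nl - p) + 1 : cnt;

    if (take > cap)
        take = cap;
    memcpy(out, p, take);
    set_ptrcnt(b, p + take, (ptrdiff_t)(cnt - take));
    return take;
}
```

The point of the interface is that the caller scans the layer's own buffer for the line terminator, so no per-character function call is needed.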
641 | |
642 | |
643 | =head2 Core Layers |
644 | |
645 | The file C<perlio.c> provides the following layers: |
646 | |
647 | =over 4 |
648 | |
649 | =item "unix" |
650 | |
9d799145 |
651 | A basic non-buffered layer which calls Unix/POSIX C<read()>, C<write()>, |
652 | C<lseek()>, C<close()>. No buffering. Even on platforms that distinguish |
653 | between O_TEXT and O_BINARY this layer is always O_BINARY. |
50b80e25 |
654 | |
655 | =item "perlio" |
656 | |
9d799145 |
657 | A very complete generic buffering layer which provides the whole of |
658 | PerlIO API. It is also intended to be used as a "base class" for other |
1d11c889 |
659 | layers. (For example its C<Read()> method is implemented in terms of |
660 | the C<Get_cnt()>/C<Get_ptr()>/C<Set_ptrcnt()> methods). |
50b80e25 |
661 | |
9d799145 |
662 | "perlio" over "unix" provides a complete replacement for stdio as seen |
663 | via PerlIO API. This is the default for USE_PERLIO on systems whose stdio |
1d11c889 |
664 | does not permit perl's "fast gets" access and which do not |
665 | distinguish between C<O_TEXT> and C<O_BINARY>. |
50b80e25 |
666 | |
667 | =item "stdio" |
668 | |
9d799145 |
669 | A layer which provides the PerlIO API via the layer scheme, but |
670 | implements it by calling system's stdio. This is (currently) the default |
671 | on systems whose stdio provides sufficient access to allow perl's "fast gets" |
672 | access and which do not distinguish between C<O_TEXT> and C<O_BINARY>. |
50b80e25 |
673 | |
674 | =item "crlf" |
675 | |
9d799145 |
676 | A layer derived using "perlio" as a base class. It provides Win32-like |
677 | "\n" to CR,LF translation. Can either be applied above "perlio" or serve |
678 | as the buffer layer itself. "crlf" over "unix" is the default if system |
679 | distinguishes between C<O_TEXT> and C<O_BINARY> opens. (At some point |
680 | "unix" will be replaced by a "native" Win32 IO layer on that platform, |
681 | as Win32's read/write layer has various drawbacks.) The "crlf" layer is |
682 | a reasonable model for a layer which transforms data in some way. |
50b80e25 |
683 | |
684 | =item "mmap" |
685 | |
9d799145 |
686 | If Configure detects C<mmap()> functions this layer is provided (with |
687 | "perlio" as a "base") which does "read" operations by mmap()ing the |
688 | file. Performance improvement is marginal on modern systems, so it is |
689 | mainly there as a proof of concept. It is likely to be unbundled from |
690 | the core at some point. The "mmap" layer is a reasonable model for a |
691 | minimalist "derived" layer. |
50b80e25 |
692 | |
693 | =item "pending" |
694 | |
9d799145 |
695 | An "internal" derivative of "perlio" which can be used to provide |
1d11c889 |
696 | Unread() function for layers which have no buffer or cannot be |
697 | bothered. (Basically this layer's C<Fill()> pops itself off the stack |
698 | and so resumes reading from layer below.) |
50b80e25 |
699 | |
700 | =item "raw" |
701 | |
9d799145 |
702 | A dummy layer which never exists on the layer stack. Instead when |
703 | "pushed" it actually pops the stack(!), removing itself, and any other |
704 | layers until it reaches a layer with the class C<PERLIO_K_RAW> bit set. |
50b80e25 |
705 | |
706 | =item "utf8" |
707 | |
9d799145 |
708 | Another dummy layer. When pushed it pops itself and sets the |
1d11c889 |
709 | C<PERLIO_F_UTF8> flag on the layer which was (and now is once more) |
710 | the top of the stack. |
50b80e25 |
711 | |
712 | =back |
713 | |
9d799145 |
714 | In addition F<perlio.c> also provides a number of C<PerlIOBase_xxxx()> |
715 | functions which are intended to be used in the table slots of classes |
716 | which do not need to do anything special for a particular method. |
50b80e25 |
717 | |
718 | =head2 Extension Layers |
719 | |
1d11c889 |
720 | Layers can be made available by extension modules. When an unknown layer |
721 | is encountered the PerlIO code will perform the equivalent of: |
b76cc8ba |
722 | |
723 | use PerlIO 'layer'; |
724 | |
1d11c889 |
725 | Where I<layer> is the unknown layer. F<PerlIO.pm> will then attempt to: |
b76cc8ba |
726 | |
727 | require PerlIO::layer; |
728 | |
1d11c889 |
729 | If after that process the layer is still not defined then the C<open> |
730 | will fail. |
b76cc8ba |
731 | |
732 | The following extension layers are bundled with perl: |
50b80e25 |
733 | |
734 | =over 4 |
735 | |
b76cc8ba |
736 | =item ":encoding" |
50b80e25 |
737 | |
738 | use Encoding; |
739 | |
1d11c889 |
740 | makes this layer available, although F<PerlIO.pm> "knows" where to |
741 | find it. It is an example of a layer which takes an argument as it is |
742 | called thus: |
50b80e25 |
743 | |
744 | open($fh,"<:encoding(iso-8859-7)",$pathname) |
745 | |
b76cc8ba |
746 | =item ":Scalar" |
747 | |
748 | Provides support for |
749 | |
750 | open($fh,"...",\$scalar) |
50b80e25 |
751 | |
1d11c889 |
752 | When a handle is so opened, then reads get bytes from the string value |
753 | of I<$scalar>, and writes change the value. In both cases the position |
754 | in I<$scalar> starts as zero but can be altered via C<seek>, and |
755 | determined via C<tell>. |
b76cc8ba |
756 | |
757 | =item ":Object" or ":Perl" |
758 | |
1d11c889 |
759 | May be provided to allow layers to be implemented as perl code - |
760 | implementation is being investigated. |
b76cc8ba |
761 | |
762 | =back |
50b80e25 |
763 | |
d4165bde |
764 | =head1 TODO |
765 | |
766 | Things that need to be done to improve this document. |
767 | |
768 | =over |
769 | |
770 | =item * |
771 | |
772 | Explain how to make a valid fh without going through open() (i.e. apply |
773 | a layer). For example if the file is not opened through perl, but we |
774 | want to get back a fh, like it was opened by Perl. |
775 | |
776 | How does PerlIO_apply_layera fit in, where are its docs, was it made public? |
777 | |
778 | Currently the example could be something like this: |
779 | |
780 | PerlIO *foo_to_PerlIO(pTHX_ char *mode, ...) |
781 | { |
782 | /* mode is "w", "r", etc. (passed in as a parameter) */ |
783 | const char *layers = ":APR"; /* the layer name */ |
784 | PerlIO *f = PerlIO_allocate(aTHX); |
785 | if (!f) { |
786 | return NULL; |
787 | } |
788 | |
789 | PerlIO_apply_layers(aTHX_ f, mode, layers); |
790 | |
791 | if (f) { |
792 | PerlIOAPR *st = PerlIOSelf(f, PerlIOAPR); |
793 | /* fill in the st struct, as in _open() */ |
794 | st->file = file; |
795 | PerlIOBase(f)->flags |= PERLIO_F_OPEN; |
796 | |
797 | return f; |
798 | } |
799 | return NULL; |
800 | } |
801 | |
802 | =item * |
803 | |
804 | fix/add the documentation in places marked as XXX. |
805 | |
806 | =item * |
807 | |
808 | The handling of errors by the layer is not specified. e.g. when $! |
809 | should be set explicitly, when the error handling should be just |
810 | delegated to the top layer. |
811 | |
812 | Probably give some hints on using SETERRNO() or pointers to where they |
813 | can be found. |
814 | |
815 | =item * |
816 | |
817 | I think it would help to give some concrete examples to make it easier |
818 | to understand the API. Of course I agree that the API has to be |
819 | concise, but since there is no second document that is more of a |
820 | guide, I think that it'd make it easier to start with the doc which is |
821 | an API, but has examples in it in places where things are unclear, to |
822 | a person who is not a PerlIO guru (yet). |
823 | |
824 | =back |
825 | |
50b80e25 |
826 | =cut |
827 | |
828 | |
829 | |