50b80e25 1=head1 NAME
2
3perliol - C API for Perl's implementation of IO in Layers.
4
5=head1 SYNOPSIS
6
7 /* Defining a layer ... */
8 #include <perliol.h>
9
50b80e25 10=head1 DESCRIPTION
11
9d799145 12This document describes the behavior and implementation of the PerlIO
13abstraction described in L<perlapio> when C<USE_PERLIO> is defined (and
14C<USE_SFIO> is not).
50b80e25 15
16=head2 History and Background
17
The PerlIO abstraction was introduced in perl5.003_02 but languished as
just an abstraction until perl5.7.0. However, during that time a number
of perl extensions switched to using it, so the API is mostly fixed to
maintain (source) compatibility.
50b80e25 22
9d799145 23The aim of the implementation is to provide the PerlIO API in a flexible
24and platform neutral manner. It is also a trial of an "Object Oriented
25C, with vtables" approach which may be applied to perl6.
50b80e25 26
cc83745d 27=head2 Basic Structure
28
PerlIO is a stack of layers.
30
The low levels of the stack work with the low-level operating system
calls (file descriptors in C), getting bytes in and out; the higher
layers of the stack buffer, filter, and otherwise manipulate the I/O.
The terms I<above> and I<below> refer to the relative
positioning of the stack layers.
36
37A layer contains a "vtable", the table of I/O operations (at C level
38a table of function pointers), and status flags. The functions in the
39vtable implement operations like "open", "read", and "write".
40
41When I/O, for example "read", is requested, the request goes from Perl
42first down the stack using "read" functions of each layer, then at the
43bottom the input is requested from the operating system services, then
44the result is returned up the stack, finally being interpreted as Perl
45data.
46
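The path of a "read" request described above can be sketched in plain C.
This is a simplified model, not the real F<perliol.h> interface: the
C<demo_*> names, the two toy layers ("source" standing in for the
operating system, "upcase" standing in for a filtering layer) and the
reduced one-entry function table are all assumptions made for
illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model: NOT the real perliol.h types. */
typedef struct demo_layer demo_layer;

typedef struct {
    const char *name;
    /* Returns the number of bytes placed in buf (0 at end of input). */
    size_t (*read)(demo_layer *self, char *buf, size_t count);
} demo_funcs;

struct demo_layer {
    demo_layer       *next;  /* layer below, NULL at the bottom */
    const demo_funcs *tab;   /* "vtable" for this layer */
    const char       *src;   /* bottom layer only: pretend OS data */
    size_t            pos;   /* bottom layer only: read offset */
};

/* Bottom layer: stands in for the operating system call. */
static size_t source_read(demo_layer *self, char *buf, size_t count)
{
    size_t left = strlen(self->src) - self->pos;
    size_t n = count < left ? count : left;
    memcpy(buf, self->src + self->pos, n);
    self->pos += n;
    return n;
}

/* Upper layer: asks the layer below, then filters what comes back up. */
static size_t upcase_read(demo_layer *self, char *buf, size_t count)
{
    size_t i, n;
    assert(self->next != NULL);   /* a filter needs a layer below it */
    n = self->next->tab->read(self->next, buf, count);
    for (i = 0; i < n; i++)
        buf[i] = (char)toupper((unsigned char)buf[i]);
    return n;
}

static const demo_funcs source_tab = { "source", source_read };
static const demo_funcs upcase_tab = { "upcase", upcase_read };

/* Build a two-layer stack over data and read through its top layer. */
size_t demo_stack_read(const char *data, char *buf, size_t count)
{
    demo_layer bottom = { NULL, &source_tab, data, 0 };
    demo_layer top    = { &bottom, &upcase_tab, NULL, 0 };
    return top.tab->read(&top, buf, count);
}
```

Calling C<demo_stack_read("hello", buf, sizeof buf)> sends the request
down through "upcase" to "source", and the bytes are transformed on the
way back up.
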
47When you do an open() and specify extra PerlIO layers to be deployed,
48the layers you specify are "pushed" on top of the already existing
49default stack. What exact layers are in this default stack depends on
50a lot of things: your operating system, Perl version, Perl compile
51time configuration, and Perl runtime configuration. See L<PerlIO>,
52L<perlrun/PERLIO>, and L<open> for more information.
53
54binmode() operates similarly to open(): by default the specified
55layers are pushed on top of the existing stack.
56
57However, note that even as the specified layers are "pushed on top"
58for open() and binmode(), this doesn't mean that the effects are
59limited to the "top": PerlIO layers can be very 'active' and inspect
60and affect layers also deeper in the stack. As an example there
61is a layer called "raw" which repeatedly "pops" layers until
62it reaches the first layer that has declared itself capable of
63handling binary data. The "pushed" layers are processed in left-to-right
64order.
65
66sysopen() operates (unsurprisingly) at a lower level in the stack than
67open(). For example in UNIX or UNIX-like systems sysopen() operates
68directly at the level of file descriptors: in the terms of PerlIO
69layers, it uses only the "unix" layer, which is a rather thin wrapper
70on top of the UNIX file descriptors.
71
50b80e25 72=head2 Layers vs Disciplines
73
Initial discussion of the ability to modify an IO stream's behaviour used
75the term "discipline" for the entities which were added. This came (I
76believe) from the use of the term in "sfio", which in turn borrowed it
77from "line disciplines" on Unix terminals. However, this document (and
78the C code) uses the term "layer".
79
1d11c889 80This is, I hope, a natural term given the implementation, and should
81avoid connotations that are inherent in earlier uses of "discipline"
82for things which are rather different.
50b80e25 83
84=head2 Data Structures
85
86The basic data structure is a PerlIOl:
87
 typedef struct _PerlIO PerlIOl;
 typedef struct _PerlIO_funcs PerlIO_funcs;
 typedef PerlIOl *PerlIO;

 struct _PerlIO
 {
  PerlIOl *      next;   /* Lower layer */
  PerlIO_funcs * tab;    /* Functions for this layer */
  IV             flags;  /* Various flags for state */
 };
98
1d11c889 99A C<PerlIOl *> is a pointer to the struct, and the I<application>
100level C<PerlIO *> is a pointer to a C<PerlIOl *> - i.e. a pointer
101to a pointer to the struct. This allows the application level C<PerlIO *>
102to remain constant while the actual C<PerlIOl *> underneath
103changes. (Compare perl's C<SV *> which remains constant while its
104C<sv_any> field changes as the scalar's type changes.) An IO stream is
105then in general represented as a pointer to this linked-list of
106"layers".
50b80e25 107
It should be noted that because of the double indirection in a C<PerlIO *>,
a C<< &(perlio->next) >> "is" a C<PerlIO *>, and so, to some degree
at least, one layer can use the "standard" API on the next layer down.
50b80e25 111
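A minimal sketch of the double indirection follows. The C<demo_l> and
C<demo_io> types are hypothetical stand-ins for C<PerlIOl> and C<PerlIO>,
and the push/pop helpers are illustrations only, not the real
C<PerlIO_push> and C<PerlIO_pop>. The point is that the handle the
application holds never changes while layers come and go, and that the
address of a layer's C<next> member has the same type as a handle.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for PerlIOl and PerlIO; not the real types. */
typedef struct demo_l demo_l;
struct demo_l {
    demo_l     *next;   /* lower layer */
    const char *name;   /* stands in for the function table */
};
typedef demo_l *demo_io;   /* a handle is a demo_io *, like PerlIO * */

/* Link a new layer on top of whatever the handle currently refers to. */
void demo_push(demo_io *f, demo_l *layer)
{
    layer->next = *f;
    *f = layer;
}

/* Unlink the top layer; the handle itself is not modified. */
void demo_pop(demo_io *f)
{
    assert(*f != NULL);
    *f = (*f)->next;
}

/* &layer->next has type demo_l **, i.e. demo_io *, so it can be handed
 * to the same functions as an application-level handle. */
demo_io *demo_below(demo_io *f)
{
    return &(*f)->next;
}
```

With a table slot C<slot> and a handle C<f = &slot>, pushing "unix" and
then "perlio" leaves C<f> unchanged while C<*f> moves from one layer to
the other; C<demo_below(f)> is then a handle onto the "unix" layer.
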
112A "layer" is composed of two parts:
113
114=over 4
115
210b36aa 116=item 1.
50b80e25 117
210b36aa 118The functions and attributes of the "layer class".
119
120=item 2.
121
122The per-instance data for a particular handle.
50b80e25 123
124=back
125
126=head2 Functions and Attributes
127
9d799145 128The functions and attributes are accessed via the "tab" (for table)
129member of C<PerlIOl>. The functions (methods of the layer "class") are
130fixed, and are defined by the C<PerlIO_funcs> type. They are broadly the
131same as the public C<PerlIO_xxxxx> functions:
50b80e25 132
 struct _PerlIO_funcs
 {
  Size_t     fsize;
  char *     name;
  Size_t     size;
  IV         kind;
  IV         (*Pushed)(pTHX_ PerlIO *f, const char *mode, SV *arg,
                       PerlIO_funcs *tab);
  IV         (*Popped)(pTHX_ PerlIO *f);
  PerlIO *   (*Open)(pTHX_ PerlIO_funcs *tab,
                     AV *layers, IV n,
                     const char *mode,
                     int fd, int imode, int perm,
                     PerlIO *old,
                     int narg, SV **args);
  IV         (*Binmode)(pTHX_ PerlIO *f);
  SV *       (*Getarg)(pTHX_ PerlIO *f, CLONE_PARAMS *param, int flags);
  IV         (*Fileno)(pTHX_ PerlIO *f);
  PerlIO *   (*Dup)(pTHX_ PerlIO *f, PerlIO *o,
                    CLONE_PARAMS *param, int flags);
  /* Unix-like functions - cf sfio line disciplines */
  SSize_t    (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count);
  SSize_t    (*Unread)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
  SSize_t    (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
  IV         (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence);
  Off_t      (*Tell)(pTHX_ PerlIO *f);
  IV         (*Close)(pTHX_ PerlIO *f);
  /* Stdio-like buffered IO functions */
  IV         (*Flush)(pTHX_ PerlIO *f);
  IV         (*Fill)(pTHX_ PerlIO *f);
  IV         (*Eof)(pTHX_ PerlIO *f);
  IV         (*Error)(pTHX_ PerlIO *f);
  void       (*Clearerr)(pTHX_ PerlIO *f);
  void       (*Setlinebuf)(pTHX_ PerlIO *f);
  /* Perl's snooping functions */
  STDCHAR *  (*Get_base)(pTHX_ PerlIO *f);
  Size_t     (*Get_bufsiz)(pTHX_ PerlIO *f);
  STDCHAR *  (*Get_ptr)(pTHX_ PerlIO *f);
  SSize_t    (*Get_cnt)(pTHX_ PerlIO *f);
  void       (*Set_ptrcnt)(pTHX_ PerlIO *f, STDCHAR *ptr, SSize_t cnt);
 };
172
The first few members of the struct give a function table size for
compatibility checking, a "name" for the layer, the size to C<malloc> for
the per-instance data, and some flags which are attributes of the class as
a whole (such as whether it is a buffering layer). Then follow the
functions, which fall into four basic groups:
50b80e25 177
178=over 4
179
aa500c9e 180=item 1.
50b80e25 181
aa500c9e 182Opening and setup functions
50b80e25 183
aa500c9e 184=item 2.
50b80e25 185
aa500c9e 186Basic IO operations
187
188=item 3.
189
190Stdio class buffering options.
191
192=item 4.
193
194Functions to support Perl's traditional "fast" access to the buffer.
50b80e25 195
196=back
197
1d11c889 198A layer does not have to implement all the functions, but the whole
199table has to be present. Unimplemented slots can be NULL (which will
200result in an error when called) or can be filled in with stubs to
201"inherit" behaviour from a "base class". This "inheritance" is fixed
202for all instances of the layer, but as the layer chooses which stubs
203to populate the table, limited "multiple inheritance" is possible.
50b80e25 204
205=head2 Per-instance Data
206
1d11c889 207The per-instance data are held in memory beyond the basic PerlIOl
208struct, by making a PerlIOl the first member of the layer's struct
209thus:
50b80e25 210
 typedef struct
 {
  struct _PerlIO base;       /* Base "class" info */
  STDCHAR *      buf;        /* Start of buffer */
  STDCHAR *      end;        /* End of valid part of buffer */
  STDCHAR *      ptr;        /* Current position in buffer */
  Off_t          posn;       /* Offset of buf into the file */
  Size_t         bufsiz;     /* Real size of buffer */
  IV             oneword;    /* Emergency buffer */
 } PerlIOBuf;
221
1d11c889 222In this way (as for perl's scalars) a pointer to a PerlIOBuf can be
223treated as a pointer to a PerlIOl.
50b80e25 224
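The "base struct as first member" idiom can be shown with a miniature
example. The C<demo_base> and C<demo_buf> types below are hypothetical,
cut-down analogues of C<PerlIOl> and C<PerlIOBuf>; only the layout trick
is the same. Generic code works on the base part, and layer-specific
code casts the same pointer back to its own struct.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of PerlIOl. */
typedef struct demo_base {
    struct demo_base *next;
    long              flags;
} demo_base;

/* Hypothetical miniature of PerlIOBuf. */
typedef struct {
    demo_base base;   /* must be the first member */
    char     *buf;    /* start of buffer */
    char     *ptr;    /* current position in buffer */
    char     *end;    /* end of valid part of buffer */
} demo_buf;

/* Generic code sees only the base part of any layer. */
void demo_set_flag(demo_base *l, long bit)
{
    l->flags |= bit;
}

/* Layer-specific code recovers its own struct from the base pointer
 * and reports how many bytes are left to read in the buffer. */
ptrdiff_t demo_buf_cnt(demo_base *l)
{
    demo_buf *b = (demo_buf *)l;   /* valid because base is at offset 0 */
    assert(offsetof(demo_buf, base) == 0);
    return b->end - b->ptr;
}
```

This is why the C<size> member of the function table has to be at least
the size of the base struct: the allocation covers the base and the
layer's own members in one block.
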
225=head2 Layers in action.
226
              table           perlio          unix
          |           |
          +-----------+    +----------+    +--------+
 PerlIO ->|           |--->|  next    |--->|  NULL  |
          +-----------+    +----------+    +--------+
          |           |    |  buffer  |    |   fd   |
          +-----------+    |          |    +--------+
          |           |    +----------+
235
236
237The above attempts to show how the layer scheme works in a simple case.
The application's C<PerlIO *> points to an entry in the table(s)
representing open (allocated) handles. For example the first three slots
in the table correspond to C<stdin>, C<stdout> and C<stderr>. The table
in turn points to the current "top" layer for the handle - in this case
an instance of the generic buffering layer "perlio". That layer in turn
points to the next layer down - in this case the low-level "unix" layer.
50b80e25 244
9d799145 245The above is roughly equivalent to a "stdio" buffered stream, but with
246much more flexibility:
50b80e25 247
248=over 4
249
250=item *
251
9d799145 252If Unix level C<read>/C<write>/C<lseek> is not appropriate for (say)
253sockets then the "unix" layer can be replaced (at open time or even
254dynamically) with a "socket" layer.
50b80e25 255
256=item *
257
1d11c889 258Different handles can have different buffering schemes. The "top"
259layer could be the "mmap" layer if reading disk files was quicker
260using C<mmap> than C<read>. An "unbuffered" stream can be implemented
261simply by not having a buffer layer.
50b80e25 262
263=item *
264
265Extra layers can be inserted to process the data as it flows through.
9d799145 266This was the driving need for including the scheme in perl 5.7.0+ - we
d1be9408 267needed a mechanism to allow data to be translated between perl's
9d799145 268internal encoding (conceptually at least Unicode as UTF-8), and the
269"native" format used by the system. This is provided by the
270":encoding(xxxx)" layer which typically sits above the buffering layer.
50b80e25 271
272=item *
273
1d11c889 274A layer can be added that does "\n" to CRLF translation. This layer
275can be used on any platform, not just those that normally do such
276things.
50b80e25 277
278=back
279
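As a rough illustration of the last point, the output half of a "\n" to
CRLF translation might look like the following. This is a stand-alone
sketch, not the code of the real "crlf" layer; the function name
C<demo_crlf_out> and its buffer-based interface are assumptions made for
the example.

```c
#include <assert.h>
#include <stddef.h>

/* Copy in[0..n) to out, writing CR LF for each "\n".
 * Returns the number of bytes produced, or (size_t)-1 if the output
 * buffer (capacity cap) is too small. */
size_t demo_crlf_out(const char *in, size_t n, char *out, size_t cap)
{
    size_t i, o = 0;
    assert(in != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        if (in[i] == '\n') {
            if (o + 2 > cap)
                return (size_t)-1;
            out[o++] = '\r';
            out[o++] = '\n';
        } else {
            if (o + 1 > cap)
                return (size_t)-1;
            out[o++] = in[i];
        }
    }
    return o;
}
```

A layer built around such a routine does not depend on the platform's
own conventions, which is what allows it to be used anywhere.
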
280=head2 Per-instance flag bits
281
1d11c889 282The generic flag bits are a hybrid of C<O_XXXXX> style flags deduced
283from the mode string passed to C<PerlIO_open()>, and state bits for
284typical buffer layers.
50b80e25 285
9d799145 286=over 4
50b80e25 287
288=item PERLIO_F_EOF
289
290End of file.
291
292=item PERLIO_F_CANWRITE
293
3039a93d 294Writes are permitted, i.e. opened as "w" or "r+" or "a", etc.
50b80e25 295
296=item PERLIO_F_CANREAD
297
Reads are permitted, i.e. opened "r" or "w+" (or even "a+" - ick).
50b80e25 299
300=item PERLIO_F_ERROR
301
d4165bde 302An error has occurred (for C<PerlIO_error()>).
50b80e25 303
304=item PERLIO_F_TRUNCATE
305
306Truncate file suggested by open mode.
307
308=item PERLIO_F_APPEND
309
310All writes should be appends.
311
312=item PERLIO_F_CRLF
313
11e1c8f2 314Layer is performing Win32-like "\n" mapped to CR,LF for output and CR,LF
315mapped to "\n" for input. Normally the provided "crlf" layer is the only
316layer that need bother about this. C<PerlIO_binmode()> will mess with this
flag rather than add/remove layers if the C<PERLIO_K_CANCRLF> bit is set
for the layer's class.
50b80e25 319
320=item PERLIO_F_UTF8
321
3039a93d 322Data written to this layer should be UTF-8 encoded; data provided
by this layer should be considered UTF-8 encoded. Can be set on any layer
by the ":utf8" dummy layer. Also set on the ":encoding" layer.
325
326=item PERLIO_F_UNBUF
327
328Layer is unbuffered - i.e. write to next layer down should occur for
329each write to this layer.
330
331=item PERLIO_F_WRBUF
332
333The buffer for this layer currently holds data written to it but not sent
334to next layer.
335
336=item PERLIO_F_RDBUF
337
338The buffer for this layer currently holds unconsumed data read from
339layer below.
340
341=item PERLIO_F_LINEBUF
342
9d799145 343Layer is line buffered. Write data should be passed to next layer down
344whenever a "\n" is seen. Any data beyond the "\n" should then be
345processed.
50b80e25 346
347=item PERLIO_F_TEMP
348
9d799145 349File has been C<unlink()>ed, or should be deleted on C<close()>.
50b80e25 350
351=item PERLIO_F_OPEN
352
353Handle is open.
354
355=item PERLIO_F_FASTGETS
356
9d799145 357This instance of this layer supports the "fast C<gets>" interface.
358Normally set based on C<PERLIO_K_FASTGETS> for the class and by the
d1be9408 359existence of the function(s) in the table. However a class that
50b80e25 360normally provides that interface may need to avoid it on a
361particular instance. The "pending" layer needs to do this when
d1be9408 362it is pushed above a layer which does not support the interface.
(Perl's C<sv_gets()> does not expect the stream's fast C<gets> behaviour
50b80e25 364to change during one "get".)
365
366=back
367
368=head2 Methods in Detail
369
370=over 4
371
e2d9456f 372=item fsize
2dc2558e 373
374 Size_t fsize;
375
a489db4d 376Size of the function table. This is compared against the value PerlIO
377code "knows" as a compatibility check. Future versions I<may> be able
378to tolerate layers compiled against an old version of the headers.
2dc2558e 379
5cb3728c 380=item name
381
382 char * name;
d4165bde 383
384The name of the layer whose open() method Perl should invoke on
385open(). For example if the layer is called APR, you will call:
386
387 open $fh, ">:APR", ...
388
389and Perl knows that it has to invoke the PerlIOAPR_open() method
390implemented by the APR layer.
391
5cb3728c 392=item size
393
394 Size_t size;
d4165bde 395
396The size of the per-instance data structure, e.g.:
397
398 sizeof(PerlIOAPR)
399
If this field is zero then C<PerlIO_pushed> does not malloc anything
and assumes the layer's Pushed function will do any required layer stack
manipulation - used to avoid malloc/free overhead for dummy layers.
If the field is non-zero it must be at least the size of C<PerlIOl>;
C<PerlIO_pushed> will allocate memory for the layer's data structures
and link the new layer onto the stream's stack. (If the layer's Pushed
method returns an error indication the layer is popped again.)
407
5cb3728c 408=item kind
409
410 IV kind;
d4165bde 411
d4165bde 412=over 4
413
414=item * PERLIO_K_BUFFERED
415
86e05cf2 416The layer is buffered.
417
418=item * PERLIO_K_RAW
419
420The layer is acceptable to have in a binmode(FH) stack - i.e. it does not
421(or will configure itself not to) transform bytes passing through it.
422
d4165bde 423=item * PERLIO_K_CANCRLF
424
86e05cf2 425Layer can translate between "\n" and CRLF line ends.
426
d4165bde 427=item * PERLIO_K_FASTGETS
428
86e05cf2 429Layer allows buffer snooping.
430
d4165bde 431=item * PERLIO_K_MULTIARG
432
Used when the layer's open() accepts more arguments than usual. The
extra arguments should not come before the C<MODE> argument. When this
flag is used it's up to the layer to validate the args.
436
d4165bde 437=back
438
5cb3728c 439=item Pushed
440
 IV (*Pushed)(pTHX_ PerlIO *f, const char *mode, SV *arg,
              PerlIO_funcs *tab);
50b80e25 442
1d11c889 443The only absolutely mandatory method. Called when the layer is pushed
444onto the stack. The C<mode> argument may be NULL if this occurs
445post-open. The C<arg> will be non-C<NULL> if an argument string was
446passed. In most cases this should call C<PerlIOBase_pushed()> to
447convert C<mode> into the appropriate C<PERLIO_F_XXXXX> flags in
448addition to any actions the layer itself takes. If a layer is not
449expecting an argument it need neither save the one passed to it, nor
450provide C<Getarg()> (it could perhaps C<Perl_warn> that the argument
was unexpected).
50b80e25 452
d4165bde 453Returns 0 on success. On failure returns -1 and should set errno.
454
5cb3728c 455=item Popped
456
457 IV (*Popped)(pTHX_ PerlIO *f);
50b80e25 458
1d11c889 459Called when the layer is popped from the stack. A layer will normally
460be popped after C<Close()> is called. But a layer can be popped
461without being closed if the program is dynamically managing layers on
462the stream. In such cases C<Popped()> should free any resources
463(buffers, translation tables, ...) not held directly in the layer's
464struct. It should also C<Unread()> any unconsumed data that has been
465read and buffered from the layer below back to that layer, so that it
466can be re-provided to what ever is now above.
b76cc8ba 467
Returns 0 on both success and failure. If C<Popped()> returns I<true> then
469I<perlio.c> assumes that either the layer has popped itself, or the
470layer is super special and needs to be retained for other reasons.
471In most cases it should return I<false>.
d4165bde 472
5cb3728c 473=item Open
474
475 PerlIO * (*Open)(...);
b76cc8ba 476
1d11c889 477The C<Open()> method has lots of arguments because it combines the
478functions of perl's C<open>, C<PerlIO_open>, perl's C<sysopen>,
479C<PerlIO_fdopen> and C<PerlIO_reopen>. The full prototype is as
480follows:
b76cc8ba 481
482 PerlIO * (*Open)(pTHX_ PerlIO_funcs *tab,
483 AV *layers, IV n,
484 const char *mode,
485 int fd, int imode, int perm,
486 PerlIO *old,
487 int narg, SV **args);
488
1d11c889 489Open should (perhaps indirectly) call C<PerlIO_allocate()> to allocate
490a slot in the table and associate it with the layers information for
491the opened file, by calling C<PerlIO_push>. The I<layers> AV is an
492array of all the layers destined for the C<PerlIO *>, and any
493arguments passed to them, I<n> is the index into that array of the
494layer being called. The macro C<PerlIOArg> will return a (possibly
495C<NULL>) SV * for the argument passed to the layer.
496
497The I<mode> string is an "C<fopen()>-like" string which would match
498the regular expression C</^[I#]?[rwa]\+?[bt]?$/>.
499
500The C<'I'> prefix is used during creation of C<stdin>..C<stderr> via
501special C<PerlIO_fdopen> calls; the C<'#'> prefix means that this is
502C<sysopen> and that I<imode> and I<perm> should be passed to
503C<PerlLIO_open3>; C<'r'> means B<r>ead, C<'w'> means B<w>rite and
504C<'a'> means B<a>ppend. The C<'+'> suffix means that both reading and
a489db4d 505writing/appending are permitted. The C<'b'> suffix means file should
506be binary, and C<'t'> means it is text. (Almost all layers should do
507the IO in binary mode, and ignore the b/t bits. The C<:crlf> layer
508should be pushed to handle the distinction.)
1d11c889 509
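The mode string grammar above is small enough to check by hand. The
following sketch accepts exactly the strings matched by
C</^[I#]?[rwa]\+?[bt]?$/> and derives read/write/truncate/append bits
in the spirit of the C<PERLIO_F_XXXXX> flags. The C<DEMO_*> constants
and the function name are hypothetical; they are not the real flag
values, nor the code used by C<PerlIOBase_pushed()>.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values; the real PERLIO_F_* constants differ. */
enum {
    DEMO_CANREAD  = 1,
    DEMO_CANWRITE = 2,
    DEMO_TRUNCATE = 4,
    DEMO_APPEND   = 8
};

/* Returns a bitmask of DEMO_* flags, or -1 if mode does not have the
 * form [I#]?[rwa]\+?[bt]? */
int demo_mode_flags(const char *mode)
{
    int flags;
    assert(mode != NULL);
    if (*mode == 'I' || *mode == '#')   /* optional prefix */
        mode++;
    switch (*mode++) {                  /* mandatory r, w or a */
    case 'r': flags = DEMO_CANREAD;                  break;
    case 'w': flags = DEMO_CANWRITE | DEMO_TRUNCATE; break;
    case 'a': flags = DEMO_CANWRITE | DEMO_APPEND;   break;
    default:  return -1;
    }
    if (*mode == '+') {                 /* both directions permitted */
        flags |= DEMO_CANREAD | DEMO_CANWRITE;
        mode++;
    }
    if (*mode == 'b' || *mode == 't')   /* accepted, then ignored */
        mode++;
    return *mode == '\0' ? flags : -1;
}
```

For example C<"w+"> yields read, write and truncate, while C<"#a">
yields write and append; the C<'I'> and C<'#'> prefixes only select how
the file is opened and do not affect these bits in this sketch.
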
510If I<old> is not C<NULL> then this is a C<PerlIO_reopen>. Perl itself
511does not use this (yet?) and semantics are a little vague.
512
If I<fd> is not negative then it is the numeric file descriptor I<fd>,
which will be open in a manner compatible with the supplied mode
string; the call is thus equivalent to C<PerlIO_fdopen>. In this case
I<narg> will be zero.

If I<narg> is greater than zero then it gives the number of arguments
passed to C<open>, otherwise it will be 1 if for example
C<PerlIO_open> was called. In simple cases C<SvPV_nolen(*args)> is the
pathname to open.
522
Having said all that, translation-only layers do not need to provide
524C<Open()> at all, but rather leave the opening to a lower level layer
525and wait to be "pushed". If a layer does provide C<Open()> it should
526normally call the C<Open()> method of next layer down (if any) and
527then push itself on top if that succeeds.
b76cc8ba 528
If C<PerlIO_push> was performed and open has failed, the layer must
C<PerlIO_pop> itself; otherwise it will remain on the stack
and may cause serious problems.
532
d4165bde 533Returns C<NULL> on failure.
534
86e05cf2 535=item Binmode
536
537 IV (*Binmode)(pTHX_ PerlIO *f);
538
539Optional. Used when C<:raw> layer is pushed (explicitly or as a result
540of binmode(FH)). If not present layer will be popped. If present
541should configure layer as binary (or pop itself) and return 0.
542If it returns -1 for error C<binmode> will fail with layer
543still on the stack.
544
5cb3728c 545=item Getarg
546
547 SV * (*Getarg)(pTHX_ PerlIO *f,
548 CLONE_PARAMS *param, int flags);
b76cc8ba 549
d4165bde 550Optional. If present should return an SV * representing the string
551argument passed to the layer when it was
552pushed. e.g. ":encoding(ascii)" would return an SvPV with value
553"ascii". (I<param> and I<flags> arguments can be ignored in most
554cases)
b76cc8ba 555
5cb3728c 556=item Fileno
557
558 IV (*Fileno)(pTHX_ PerlIO *f);
b76cc8ba 559
d1be9408 560Returns the Unix/Posix numeric file descriptor for the handle. Normally
b76cc8ba 561C<PerlIOBase_fileno()> (which just asks next layer down) will suffice
562for this.
50b80e25 563
a489db4d 564Returns -1 on error, which is considered to include the case where the
565layer cannot provide such a file descriptor.
d4165bde 566
5cb3728c 567=item Dup
568
569 PerlIO * (*Dup)(pTHX_ PerlIO *f, PerlIO *o,
570 CLONE_PARAMS *param, int flags);
d4165bde 571
2dc2558e 572XXX: Needs more docs.
573
a489db4d 574Used as part of the "clone" process when a thread is spawned (in which
575case param will be non-NULL) and when a stream is being duplicated via
576'&' in the C<open>.
d4165bde 577
578Similar to C<Open>, returns PerlIO* on success, C<NULL> on failure.
579
5cb3728c 580=item Read
581
582 SSize_t (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count);
d4165bde 583
584Basic read operation.
50b80e25 585
d4165bde 586Typically will call C<Fill> and manipulate pointers (possibly via the
587API). C<PerlIOBuf_read()> may be suitable for derived classes which
588provide "fast gets" methods.
50b80e25 589
d4165bde 590Returns actual bytes read, or -1 on an error.
591
5cb3728c 592=item Unread
593
594 SSize_t (*Unread)(pTHX_ PerlIO *f,
595 const void *vbuf, Size_t count);
50b80e25 596
9d799145 597A superset of stdio's C<ungetc()>. Should arrange for future reads to
598see the bytes in C<vbuf>. If there is no obviously better implementation
599then C<PerlIOBase_unread()> provides the function by pushing a "fake"
600"pending" layer above the calling layer.
50b80e25 601
d4165bde 602Returns the number of unread chars.
603
5cb3728c 604=item Write
605
 SSize_t (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
50b80e25 607
d4165bde 608Basic write operation.
50b80e25 609
d4165bde 610Returns bytes written or -1 on an error.
611
5cb3728c 612=item Seek
613
614 IV (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence);
50b80e25 615
1d11c889 616Position the file pointer. Should normally call its own C<Flush>
617method and then the C<Seek> method of next layer down.
50b80e25 618
d4165bde 619Returns 0 on success, -1 on failure.
620
5cb3728c 621=item Tell
622
623 Off_t (*Tell)(pTHX_ PerlIO *f);
50b80e25 624
Return the file pointer. May be based on the layer's cached concept of
626position to avoid overhead.
50b80e25 627
d4165bde 628Returns -1 on failure to get the file pointer.
629
5cb3728c 630=item Close
631
632 IV (*Close)(pTHX_ PerlIO *f);
50b80e25 633
9d799145 634Close the stream. Should normally call C<PerlIOBase_close()> to flush
635itself and close layers below, and then deallocate any data structures
636(buffers, translation tables, ...) not held directly in the data
637structure.
50b80e25 638
d4165bde 639Returns 0 on success, -1 on failure.
640
5cb3728c 641=item Flush
642
643 IV (*Flush)(pTHX_ PerlIO *f);
50b80e25 644
9d799145 645Should make stream's state consistent with layers below. That is, any
646buffered write data should be written, and file position of lower layers
d1be9408 647adjusted for data read from below but not actually consumed.
b76cc8ba 648(Should perhaps C<Unread()> such data to the lower layer.)
50b80e25 649
d4165bde 650Returns 0 on success, -1 on failure.
651
5cb3728c 652=item Fill
653
654 IV (*Fill)(pTHX_ PerlIO *f);
d4165bde 655
656The buffer for this layer should be filled (for read) from layer
657below. When you "subclass" PerlIOBuf layer, you want to use its
658I<_read> method and to supply your own fill method, which fills the
659PerlIOBuf's buffer.
50b80e25 660
d4165bde 661Returns 0 on success, -1 on failure.
50b80e25 662
5cb3728c 663=item Eof
664
665 IV (*Eof)(pTHX_ PerlIO *f);
50b80e25 666
9d799145 667Return end-of-file indicator. C<PerlIOBase_eof()> is normally sufficient.
50b80e25 668
d4165bde 669Returns 0 on end-of-file, 1 if not end-of-file, -1 on error.
670
5cb3728c 671=item Error
672
673 IV (*Error)(pTHX_ PerlIO *f);
50b80e25 674
9d799145 675Return error indicator. C<PerlIOBase_error()> is normally sufficient.
50b80e25 676
Returns 1 if there is an error (usually when C<PERLIO_F_ERROR> is set),
0 otherwise.
679
5cb3728c 680=item Clearerr
681
682 void (*Clearerr)(pTHX_ PerlIO *f);
50b80e25 683
9d799145 684Clear end-of-file and error indicators. Should call C<PerlIOBase_clearerr()>
685to set the C<PERLIO_F_XXXXX> flags, which may suffice.
50b80e25 686
5cb3728c 687=item Setlinebuf
688
689 void (*Setlinebuf)(pTHX_ PerlIO *f);
50b80e25 690
b76cc8ba 691Mark the stream as line buffered. C<PerlIOBase_setlinebuf()> sets the
692PERLIO_F_LINEBUF flag and is normally sufficient.
50b80e25 693
5cb3728c 694=item Get_base
695
696 STDCHAR * (*Get_base)(pTHX_ PerlIO *f);
50b80e25 697
698Allocate (if not already done so) the read buffer for this layer and
d4165bde 699return pointer to it. Return NULL on failure.
50b80e25 700
5cb3728c 701=item Get_bufsiz
702
703 Size_t (*Get_bufsiz)(pTHX_ PerlIO *f);
50b80e25 704
9d799145 705Return the number of bytes that last C<Fill()> put in the buffer.
50b80e25 706
5cb3728c 707=item Get_ptr
708
709 STDCHAR * (*Get_ptr)(pTHX_ PerlIO *f);
50b80e25 710
3039a93d 711Return the current read pointer relative to this layer's buffer.
50b80e25 712
5cb3728c 713=item Get_cnt
714
715 SSize_t (*Get_cnt)(pTHX_ PerlIO *f);
50b80e25 716
717Return the number of bytes left to be read in the current buffer.
718
5cb3728c 719=item Set_ptrcnt
720
721 void (*Set_ptrcnt)(pTHX_ PerlIO *f,
722 STDCHAR *ptr, SSize_t cnt);
50b80e25 723
724Adjust the read pointer and count of bytes to match C<ptr> and/or C<cnt>.
725The application (or layer above) must ensure they are consistent.
726(Checking is allowed by the paranoid.)
727
728=back
729
210e727c 730=head2 Implementing PerlIO Layers
731
2535a4f7 732If you find the implementation document unclear or not sufficient,
733look at the existing perlio layer implementations, which include:
734
735=over
736
737=item * C implementations
738
eae154c7 739The F<perlio.c> and F<perliol.h> in the Perl core implement the
740"unix", "perlio", "stdio", "crlf", "utf8", "byte", "raw", "pending"
741layers, and also the "mmap" and "win32" layers if applicable.
(The "win32" layer is currently unfinished and unused; to see what is used
instead in Win32, see L<PerlIO/"Querying the layers of filehandles">.)
744
2535a4f7 745PerlIO::encoding, PerlIO::scalar, PerlIO::via in the Perl core.
746
747PerlIO::gzip and APR::PerlIO (mod_perl 2.0) on CPAN.
748
749=item * Perl implementations
750
751PerlIO::via::QuotedPrint in the Perl core and PerlIO::via::* on CPAN.
752
753=back
754
210e727c 755If you are creating a PerlIO layer, you may want to be lazy, in other
756words, implement only the methods that interest you. The other methods
757you can either replace with the "blank" methods
758
759 PerlIOBase_noop_ok
760 PerlIOBase_noop_fail
761
762(which do nothing, and return zero and -1, respectively) or for
763certain methods you may assume a default behaviour by using a NULL
61bdadae 764method. The Open method looks for help in the 'parent' layer.
765The following table summarizes the behaviour:
210e727c 766
 method      behaviour with NULL

 Clearerr    PerlIOBase_clearerr
 Close       PerlIOBase_close
 Dup         PerlIOBase_dup
 Eof         PerlIOBase_eof
 Error       PerlIOBase_error
 Fileno      PerlIOBase_fileno
 Fill        FAILURE
 Flush       SUCCESS
 Getarg      SUCCESS
 Get_base    FAILURE
 Get_bufsiz  FAILURE
 Get_cnt     FAILURE
 Get_ptr     FAILURE
 Open        INHERITED
 Popped      SUCCESS
 Pushed      SUCCESS
 Read        PerlIOBase_read
 Seek        FAILURE
 Set_cnt     FAILURE
 Set_ptrcnt  FAILURE
 Setlinebuf  PerlIOBase_setlinebuf
 Tell        FAILURE
 Unread      PerlIOBase_unread
 Write       FAILURE

 FAILURE     Set errno (to EINVAL in UNIXish, to LIB$_INVARG in VMS) and
             return -1 (for numeric return values) or NULL (for pointers)
 INHERITED   Inherited from the layer below
 SUCCESS     Return 0 (for numeric return values) or a pointer
798
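The effect of the table can be pictured as a dispatcher that checks each
slot before calling it. The sketch below models three of the rows
(Flush gives SUCCESS, Fill gives FAILURE, Error falls back to a base
function) using hypothetical C<demo_*> types; it illustrates the
behaviour described above and is not the dispatch code in F<perlio.c>.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical cut-down handle and function table. */
typedef struct demo_h demo_h;
typedef struct {
    int (*flush)(demo_h *f);
    int (*fill)(demo_h *f);
    int (*error)(demo_h *f);
} demo_tab;

struct demo_h {
    const demo_tab *tab;
    int             has_error;   /* stands in for an error flag bit */
};

/* Stand-in for a PerlIOBase_error-style default. */
int demo_base_error(demo_h *f)
{
    return f->has_error ? 1 : 0;
}

/* NULL Flush slot: SUCCESS, i.e. return 0. */
int demo_flush(demo_h *f)
{
    assert(f != NULL && f->tab != NULL);
    return f->tab->flush ? f->tab->flush(f) : 0;
}

/* NULL Fill slot: FAILURE, i.e. set errno and return -1. */
int demo_fill(demo_h *f)
{
    if (f->tab->fill)
        return f->tab->fill(f);
    errno = EINVAL;
    return -1;
}

/* NULL Error slot: use the base implementation. */
int demo_error(demo_h *f)
{
    return f->tab->error ? f->tab->error(f) : demo_base_error(f);
}
```

A layer whose table is entirely NULL therefore still answers every call
in a defined way, which is what makes the "lazy" approach workable.
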
50b80e25 799=head2 Core Layers
800
801The file C<perlio.c> provides the following layers:
802
803=over 4
804
805=item "unix"
806
9d799145 807A basic non-buffered layer which calls Unix/POSIX C<read()>, C<write()>,
808C<lseek()>, C<close()>. No buffering. Even on platforms that distinguish
809between O_TEXT and O_BINARY this layer is always O_BINARY.
50b80e25 810
811=item "perlio"
812
9d799145 813A very complete generic buffering layer which provides the whole of
814PerlIO API. It is also intended to be used as a "base class" for other
1d11c889 815layers. (For example its C<Read()> method is implemented in terms of
816the C<Get_cnt()>/C<Get_ptr()>/C<Set_ptrcnt()> methods).
50b80e25 817
"perlio" over "unix" provides a complete replacement for stdio as seen
via the PerlIO API. This is the default for USE_PERLIO on systems whose
stdio does not permit perl's "fast gets" access and which do not
distinguish between C<O_TEXT> and C<O_BINARY>.
50b80e25 822
823=item "stdio"
824
A layer which provides the PerlIO API via the layer scheme, but
implements it by calling the system's stdio. This is (currently) the default
if the system's stdio provides sufficient access to allow perl's "fast gets"
access and the system does not distinguish between C<O_TEXT> and C<O_BINARY>.
50b80e25 829
830=item "crlf"
831
9d799145 832A layer derived using "perlio" as a base class. It provides Win32-like
833"\n" to CR,LF translation. Can either be applied above "perlio" or serve
834as the buffer layer itself. "crlf" over "unix" is the default if system
835distinguishes between C<O_TEXT> and C<O_BINARY> opens. (At some point
836"unix" will be replaced by a "native" Win32 IO layer on that platform,
837as Win32's read/write layer has various drawbacks.) The "crlf" layer is
838a reasonable model for a layer which transforms data in some way.

=item "mmap"

If Configure detects C<mmap()> functions this layer is provided (with
"perlio" as a "base") which does "read" operations by mmap()ing the
file. Performance improvement is marginal on modern systems, so it is
mainly there as a proof of concept. It is likely to be unbundled from
the core at some point. The "mmap" layer is a reasonable model for a
minimalist "derived" layer.

=item "pending"

An "internal" derivative of "perlio" which can be used to provide an
C<Unread()> function for layers which have no buffer or cannot be
bothered. (Basically this layer's C<Fill()> pops itself off the stack
and so resumes reading from the layer below.)

=item "raw"

A dummy layer which never exists on the layer stack. Instead, when
"pushed" it actually pops the stack, removing itself, and then calls the
Binmode function table entry on all the layers in the stack. Normally
this (via PerlIOBase_binmode) removes any layers which do not have the
C<PERLIO_K_RAW> bit set. Layers can modify that behaviour by defining
their own Binmode entry.
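
The effect can be observed from Perl with C<PerlIO::get_layers>. The
sketch below is only an illustration: it assumes a scratch file and
uses the bundled ":encoding" layer as an example of a layer that is
removed when "raw" is pushed.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file so the example is self-contained (an assumption).
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "hello\n";
close $tmp or die "close: $!";

# Open with a transforming layer on top of the default stack.
open(my $fh, "<:encoding(UTF-8)", $path) or die "open: $!";
my @before = PerlIO::get_layers($fh);

# Pushing ":raw" leaves no "raw" entry behind; it strips the layers
# that are not suitable for binary data.
binmode($fh, ":raw") or die "binmode: $!";
my @after = PerlIO::get_layers($fh);
close $fh;

print "before: @before\n";
print "after:  @after\n";
```

The exact layer names printed depend on the platform and build, so the
sketch only compares the stack before and after the C<binmode> call.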

=item "utf8"

Another dummy layer. When pushed it pops itself and sets the
C<PERLIO_F_UTF8> flag on the layer which was (and now is once more)
the top of the stack.
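
Seen from Perl, the flag changes how the bytes read are interpreted. A
minimal sketch, assuming a scratch file holding the two-byte UTF-8
encoding of U+00E9:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file containing the UTF-8 bytes for U+00E9 (an assumption).
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode($tmp, ":raw");
print {$tmp} "\xC3\xA9\n";
close $tmp or die "close: $!";

# Read once as plain bytes ...
open(my $bytes_fh, "<:raw", $path) or die "open: $!";
my $as_bytes = <$bytes_fh>;
close $bytes_fh;
chomp $as_bytes;

# ... and once with the UTF8 flag set on the top layer.
open(my $chars_fh, "<:utf8", $path) or die "open: $!";
my @layers   = PerlIO::get_layers($chars_fh);
my $as_chars = <$chars_fh>;
close $chars_fh;
chomp $as_chars;

printf "bytes: %d, characters: %d\n", length($as_bytes), length($as_chars);
print "layers: @layers\n";
```

C<PerlIO::get_layers> reports the flag as a trailing "utf8" entry even
though no separate layer is left on the stack.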

=back

In addition F<perlio.c> also provides a number of C<PerlIOBase_xxxx()>
functions which are intended to be used in the table slots of classes
which do not need to do anything special for a particular method.

=head2 Extension Layers

Layers can be made available by extension modules. When an unknown layer
is encountered the PerlIO code will perform the equivalent of:

    use PerlIO 'layer';

where I<layer> is the unknown layer. F<PerlIO.pm> will then attempt to:

    require PerlIO::layer;

If after that process the layer is still not defined then the C<open>
will fail.
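
The failure case can be sketched as follows. The layer name
"nosuchlayer" is deliberately made up for this example, and the scratch
file is likewise an assumption.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file; it exists, so only the layer lookup can fail.
my ($tmp, $path) = tempfile(UNLINK => 1);
close $tmp or die "close: $!";

my $ok;
{
    # Silence the warning perl issues for an unknown layer.
    no warnings 'layer';
    $ok = open(my $fh, "<:nosuchlayer", $path);
}

print $ok ? "open succeeded\n" : "open failed\n";
```

Because no C<PerlIO::nosuchlayer> module can be loaded, the layer stays
undefined and C<open> returns false.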

The following extension layers are bundled with perl:

=over 4

=item ":encoding"

    use Encoding;

makes this layer available, although F<PerlIO.pm> "knows" where to
find it. It is an example of a layer which takes an argument, as it is
called thus:

    open( $fh, "<:encoding(iso-8859-7)", $pathname );
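
A slightly fuller sketch of the same call, assuming a scratch file that
holds three ISO-8859-7 bytes (Greek small alpha, beta and gamma):

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file with raw ISO-8859-7 bytes (an assumption of this sketch).
my ($tmp, $pathname) = tempfile(UNLINK => 1);
binmode($tmp, ":raw");
print {$tmp} "\xE1\xE2\xE3\n";
close $tmp or die "close: $!";

# The argument in parentheses is handed to the layer when it is pushed.
open(my $fh, "<:encoding(iso-8859-7)", $pathname) or die "open: $!";
my $line = <$fh>;
close $fh;
chomp $line;

# Print the decoded code points rather than the characters themselves.
print join(" ", map { sprintf "U+%04X", ord } split //, $line), "\n";
```

The layer decodes each input byte into the corresponding Unicode
character, so the code points printed fall in the Greek block.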

=item ":scalar"

Provides support for reading data from and writing data to a scalar.

    open( $fh, "+<:scalar", \$scalar );

When a handle is so opened, then reads get bytes from the string value
of I<$scalar>, and writes change the value. In both cases the position
in I<$scalar> starts as zero but can be altered via C<seek>, and
determined via C<tell>.

Please note that this layer is implied when calling open() thus:

    open( $fh, "+<", \$scalar );
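
The read, C<seek>, C<tell> and write behaviour described above can be
sketched like this (the variable names and the sample string are
assumptions of the example):

```perl
use strict;
use warnings;

my $scalar = "hello world";

# The ":scalar" layer is implied by passing a scalar reference.
open(my $fh, "+<", \$scalar) or die "open: $!";

# Reads take bytes from the string, starting at position zero.
read($fh, my $buf, 5);
my $pos = tell($fh);

# Move past the space, then overwrite the rest of the string.
seek($fh, 6, 0) or die "seek: $!";
print {$fh} "WORLD";
close $fh or die "close: $!";

print "read:   $buf\n";
print "tell:   $pos\n";
print "scalar: $scalar\n";
```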

=item ":via"

Provided to allow layers to be implemented as Perl code. For instance:

    use PerlIO::via::StripHTML;
    open( my $fh, "<:via(StripHTML)", "index.html" );

See L<PerlIO::via> for details.
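
To give an idea of what such a layer looks like, here is a minimal
sketch of a read-only layer written in Perl. The class
C<PerlIO::via::Upper> is hypothetical and exists only in this example;
C<PUSHED> and C<FILL> are the method names described in L<PerlIO::via>.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical layer class, invented for this sketch.
package PerlIO::via::Upper;

# Called when the layer is pushed; returns the layer object.
sub PUSHED {
    my ($class, $mode, $fh) = @_;
    my $state = '';
    return bless \$state, $class;
}

# Called to refill the read buffer; $fh is the layer below.
sub FILL {
    my ($self, $fh) = @_;
    my $line = <$fh>;
    return defined $line ? uc $line : undef;
}

package main;

# Scratch input file (an assumption of this sketch).
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "hello\n";
close $tmp or die "close: $!";

open(my $in, "<:via(Upper)", $path) or die "open: $!";
my $line = <$in>;
close $in;

print $line;
```

Each time the buffer needs refilling, C<FILL> reads a line from the
layer below and hands back an upper-cased copy.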

=back

=head1 TODO

Things that need to be done to improve this document.

=over

=item *

Explain how to make a valid fh without going through open() (i.e. apply
a layer). For example if the file is not opened through perl, but we
want to get back a fh, as if it had been opened by Perl.

How does PerlIO_apply_layera fit in, where are its docs, and was it
made public?

Currently the example could be something like this:

    PerlIO *foo_to_PerlIO(pTHX_ char *mode, ...)
    {
        /* mode is "w", "r", etc. */
        const char *layers = ":APR";    /* the layer name */
        PerlIO *f = PerlIO_allocate(aTHX);
        if (!f) {
            return NULL;
        }

        /* PerlIO_apply_layers() returns 0 on success */
        if (PerlIO_apply_layers(aTHX_ f, mode, layers) == 0) {
            PerlIOAPR *st = PerlIOSelf(f, PerlIOAPR);
            /* fill in the st struct, as in _open(); "file" is
             * assumed to come from the elided arguments */
            st->file = file;
            PerlIOBase(f)->flags |= PERLIO_F_OPEN;

            return f;
        }
        return NULL;
    }

=item *

Fix/add the documentation in places marked as XXX.

=item *

The handling of errors by the layer is not specified, e.g. when $!
should be set explicitly, and when the error handling should just be
delegated to the top layer.

Probably give some hints on using SETERRNO() or pointers to where they
can be found.

=item *

I think it would help to give some concrete examples to make it easier
to understand the API. Of course I agree that the API has to be
concise, but since there is no second document that is more of a
guide, I think that it'd make it easier to start with the doc which is
an API, but has examples in it in places where things are unclear, to
a person who is not a PerlIO guru (yet).

=back

=cut