X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperliol.pod;h=26ad305fb037f0a9a9777d19f9f7df61a5b798ab;hb=8269e00da02a2e0f107fbb8b4a78f0c4058f3587;hp=1c346e0965ec1dcb426ee23fc7a32ad5fe309411;hpb=5cb3728cfe288ad05e8d10c8176f72378da2238f;p=p5sagit%2Fp5-mst-13.2.git

diff --git a/pod/perliol.pod b/pod/perliol.pod
index 1c346e0..26ad305 100644
--- a/pod/perliol.pod
+++ b/pod/perliol.pod
@@ -24,6 +24,57 @@ The aim of the implementation is to provide the PerlIO API in a flexible
 and platform neutral manner. It is also a trial of an "Object Oriented
 C, with vtables" approach which may be applied to perl6.
 
+=head2 Basic Structure
+
+PerlIO is a stack of layers.
+
+The low levels of the stack work with the low-level operating system
+calls (file descriptors in C) getting bytes in and out, the higher
+layers of the stack buffer, filter, and otherwise manipulate the I/O,
+and return characters (or bytes) to Perl. Terms I and I
+are used to refer to the relative positioning of the stack layers.
+
+A layer contains a "vtable", the table of I/O operations (at C level
+a table of function pointers), and status flags. The functions in the
+vtable implement operations like "open", "read", and "write".
+
+When I/O, for example "read", is requested, the request goes from Perl
+first down the stack using "read" functions of each layer, then at the
+bottom the input is requested from the operating system services, then
+the result is returned up the stack, finally being interpreted as Perl
+data.
+
+The requests do not necessarily go always all the way down to the
+operating system: that's where PerlIO buffering comes into play.
+
+When you do an open() and specify extra PerlIO layers to be deployed,
+the layers you specify are "pushed" on top of the already existing
+default stack. One way to see it is that "operating system is
+on the left" and "Perl is on the right".
+
+What exact layers are in this default stack depends on a lot of
+things: your operating system, Perl version, Perl compile time
+configuration, and Perl runtime configuration. See L,
+L, and L for more information.
+
+binmode() operates similarly to open(): by default the specified
+layers are pushed on top of the existing stack.
+
+However, note that even as the specified layers are "pushed on top"
+for open() and binmode(), this doesn't mean that the effects are
+limited to the "top": PerlIO layers can be very 'active' and inspect
+and affect layers also deeper in the stack. As an example there
+is a layer called "raw" which repeatedly "pops" layers until
+it reaches the first layer that has declared itself capable of
+handling binary data. The "pushed" layers are processed in left-to-right
+order.
+
+sysopen() operates (unsurprisingly) at a lower level in the stack than
+open(). For example in UNIX or UNIX-like systems sysopen() operates
+directly at the level of file descriptors: in the terms of PerlIO
+layers, it uses only the "unix" layer, which is a rather thin wrapper
+on top of the UNIX file descriptors.
+
 =head2 Layers vs Disciplines
 
 Initial discussion of the ability to modify IO streams behaviour used
@@ -87,10 +138,11 @@ same as the public C functions:
 
   struct _PerlIO_funcs
   {
+   Size_t     fsize;
    char *     name;
    Size_t     size;
    IV         kind;
-   IV         (*Pushed)(pTHX_ PerlIO *f,const char *mode,SV *arg);
+   IV         (*Pushed)(pTHX_ PerlIO *f,const char *mode,SV *arg, PerlIO_funcs *tab);
    IV         (*Popped)(pTHX_ PerlIO *f);
    PerlIO *   (*Open)(pTHX_ PerlIO_funcs *tab,
                       AV *layers, IV n,
@@ -98,6 +150,7 @@ same as the public C functions:
                       int fd, int imode, int perm,
                       PerlIO *old,
                       int narg, SV **args);
+   IV         (*Binmode)(pTHX_ PerlIO *f);
    SV *       (*Getarg)(pTHX_ PerlIO *f, CLONE_PARAMS *param, int flags)
    IV         (*Fileno)(pTHX_ PerlIO *f);
    PerlIO *   (*Dup)(pTHX_ PerlIO *f, PerlIO *o, CLONE_PARAMS *param, int flags)
@@ -123,9 +176,9 @@ same as the public C functions:
    void       (*Set_ptrcnt)(pTHX_ PerlIO *f,STDCHAR *ptr,SSize_t cnt);
   };
 
-The first few members of the struct give a "name" for the layer, the
-size to C for the per-instance data, and some flags which are
-attributes of the class as whole (such as whether it is a buffering
+The first few members of the struct give a function table size for
+compatibility check "name" for the layer, the size to C for the per-instance data,
+and some flags which are attributes of the class as whole (such as whether it is a buffering
 layer), then follow the functions which fall into four basic groups:
 
 =over 4
@@ -322,6 +375,14 @@ to change during one "get".)
 
 =over 4
 
+=item fsize
+
+ Size_t fsize;
+
+Size of the function table. This is compared against the value PerlIO
+code "knows" as a compatibility check. Future versions I be able
+to tolerate layers compiled against an old version of the headers.
+
 =item name
 
  char * name;
@@ -342,28 +403,43 @@ The size of the per-instance data structure, e.g.:
 
  sizeof(PerlIOAPR)
 
+If this field is zero then C does not malloc anything
+and assumes layer's Pushed function will do any required layer stack
+manipulation - used to avoid malloc/free overhead for dummy layers.
+If the field is non-zero it must be at least the size of C,
+C will allocate memory for the layer's data structures
+and link new layer onto the stream's stack. (If the layer's Pushed
+method returns an error indication the layer is popped again.)
+
 =item kind
 
  IV kind;
 
- XXX: explain all the available flags here
-
 =over 4
 
 =item * PERLIO_K_BUFFERED
 
+The layer is buffered.
+
+=item * PERLIO_K_RAW
+
+The layer is acceptable to have in a binmode(FH) stack - i.e. it does not
+(or will configure itself not to) transform bytes passing through it.
+
 =item * PERLIO_K_CANCRLF
 
+Layer can translate between "\n" and CRLF line ends.
+
 =item * PERLIO_K_FASTGETS
 
+Layer allows buffer snooping.
+
 =item * PERLIO_K_MULTIARG
 
 Used when the layer's open() accepts more arguments than usual. The
 extra arguments should come not before the C argument. When this
 flag is used it's up to the layer to validate the args.
 
-=item * PERLIO_K_RAW
-
 =back
 
 =item Pushed
@@ -395,7 +471,10 @@ struct. It should also C any unconsumed data that has been
 read and buffered from the layer below back to that layer, so that it
 can be re-provided to what ever is now above.
 
-Returns 0 on success and failure.
+Returns 0 on success and failure. If C returns I then
+I assumes that either the layer has popped itself, or the
+layer is super special and needs to be retained for other reasons.
+In most cases it should return I.
 
 =item Open
 
@@ -429,10 +508,10 @@ special C calls; the C<'#'> prefix means that this is C
 and that I and I should be passed to C; C<'r'>
 means B<r>ead, C<'w'> means B<w>rite and C<'a'> means
 B<a>ppend. The C<'+'> suffix means that both reading and
-writing/appending are permitted. The C<'b'> suffix means file should
-be binary, and C<'t'> means it is text. (Binary/Text should be ignored
-by almost all layers and binary IO done, with PerlIO. The C<:crlf>
-layer should be pushed to handle the distinction.)
+writing/appending are permitted. The C<'b'> suffix means file should
+be binary, and C<'t'> means it is text. (Almost all layers should do
+the IO in binary mode, and ignore the b/t bits. The C<:crlf> layer
+should be pushed to handle the distinction.)
 
 If I is not C then this is a C. Perl itself
 does not use this (yet?) and semantics are a little vague.
@@ -453,8 +532,22 @@ and wait to be "pushed".
 If a layer does provide C it should normally call the C method
 of next layer down (if any) and then push itself on top if that succeeds.
 
+If C was performed and open has failed, it must
+C itself, since if it's not, the layer won't be removed
+and may cause bad problems.
+
 Returns C on failure.
 
+=item Binmode
+
+ IV (*Binmode)(pTHX_ PerlIO *f);
+
+Optional. Used when C<:raw> layer is pushed (explicitly or as a result
+of binmode(FH)). If not present layer will be popped. If present
+should configure layer as binary (or pop itself) and return 0.
+If it returns -1 for error C will fail with layer
+still on the stack.
+
 =item Getarg
 
  SV * (*Getarg)(pTHX_ PerlIO *f,
@@ -466,6 +559,10 @@ pushed. e.g. ":encoding(ascii)" would return an SvPV with value "ascii".
 (I and I arguments can be ignored in most
 cases)
 
+C uses C to retrieve the argument originally passed to
+C, so you must implement this function if your layer has an
+extra argument to C and will ever be Ced.
+
 =item Fileno
 
  IV (*Fileno)(pTHX_ PerlIO *f);
@@ -474,18 +571,19 @@ Returns the Unix/Posix numeric file descriptor for the handle. Normally
 C (which just asks next layer down) will suffice
 for this.
 
-Returns -1 if the layer cannot provide such a file descriptor, or in
-the case of the error.
-
-XXX: two possible results end up in -1, one is an error the other is
-not.
+Returns -1 on error, which is considered to include the case where the
+layer cannot provide such a file descriptor.
 
 =item Dup
 
  PerlIO * (*Dup)(pTHX_ PerlIO *f, PerlIO *o,
                  CLONE_PARAMS *param, int flags);
 
-XXX: not documented
+XXX: Needs more docs.
+
+Used as part of the "clone" process when a thread is spawned (in which
+case param will be non-NULL) and when a stream is being duplicated via
+'&' in the C.
 
 Similar to C, returns PerlIO* on success, C on failure.
 
@@ -639,6 +737,110 @@ The application (or layer above) must ensure they are consistent.
 
 =back
 
+=head2 Utilities
+
+To ask for the next layer down use PerlIONext(PerlIO *f).
+
+To check that a PerlIO* is valid use PerlIOValid(PerlIO *f). (All
+this does is really just to check that the pointer is non-NULL and
+that the pointer behind that is non-NULL.)
+
+PerlIOBase(PerlIO *f) returns the "Base" pointer, or in other words,
+the C pointer.
+
+PerlIOSelf(PerlIO* f, type) return the PerlIOBase cast to a type.
+
+Perl_PerlIO_or_Base(PerlIO* f, callback, base, failure, args) either
+calls the I from the functions of the layer I (just by
+the name of the IO function, like "Read") with the I, or if
+there is no such callback, calls the I version of the callback
+with the same args, or if the f is invalid, set errno to EBADF and
+return I.
+
+Perl_PerlIO_or_fail(PerlIO* f, callback, failure, args) either calls
+the I of the functions of the layer I with the I,
+or if there is no such callback, set errno to EINVAL. Or if the f is
+invalid, set errno to EBADF and return I.
+
+Perl_PerlIO_or_Base_void(PerlIO* f, callback, base, args) either calls
+the I of the functions of the layer I with the I,
+or if there is no such callback, calls the I version of the
+callback with the same args, or if the f is invalid, set errno to
+EBADF.
+
+Perl_PerlIO_or_fail_void(PerlIO* f, callback, args) either calls the
+I of the functions of the layer I with the I, or if
+there is no such callback, set errno to EINVAL. Or if the f is
+invalid, set errno to EBADF.
+
+=head2 Implementing PerlIO Layers
+
+If you find the implementation document unclear or not sufficient,
+look at the existing PerlIO layer implementations, which include:
+
+=over
+
+=item * C implementations
+
+The F and F in the Perl core implement the
+"unix", "perlio", "stdio", "crlf", "utf8", "byte", "raw", "pending"
+layers, and also the "mmap" and "win32" layers if applicable.
+(The "win32" is currently unfinished and unused, to see what is used
+instead in Win32, see L .)
+
+PerlIO::encoding, PerlIO::scalar, PerlIO::via in the Perl core.
+
+PerlIO::gzip and APR::PerlIO (mod_perl 2.0) on CPAN.
+
+=item * Perl implementations
+
+PerlIO::via::QuotedPrint in the Perl core and PerlIO::via::* on CPAN.
+
+=back
+
+If you are creating a PerlIO layer, you may want to be lazy, in other
+words, implement only the methods that interest you. The other methods
+you can either replace with the "blank" methods
+
+ PerlIOBase_noop_ok
+ PerlIOBase_noop_fail
+
+(which do nothing, and return zero and -1, respectively) or for
+certain methods you may assume a default behaviour by using a NULL
+method. The Open method looks for help in the 'parent' layer.
+The following table summarizes the behaviour:
+
+ method         behaviour with NULL
+
+ Clearerr       PerlIOBase_clearerr
+ Close          PerlIOBase_close
+ Dup            PerlIOBase_dup
+ Eof            PerlIOBase_eof
+ Error          PerlIOBase_error
+ Fileno         PerlIOBase_fileno
+ Fill           FAILURE
+ Flush          SUCCESS
+ Getarg         SUCCESS
+ Get_base       FAILURE
+ Get_bufsiz     FAILURE
+ Get_cnt        FAILURE
+ Get_ptr        FAILURE
+ Open           INHERITED
+ Popped         SUCCESS
+ Pushed         SUCCESS
+ Read           PerlIOBase_read
+ Seek           FAILURE
+ Set_cnt        FAILURE
+ Set_ptrcnt     FAILURE
+ Setlinebuf     PerlIOBase_setlinebuf
+ Tell           FAILURE
+ Unread         PerlIOBase_unread
+ Write          FAILURE
+
+ FAILURE        Set errno (to EINVAL in UNIXish, to LIB$_INVARG in VMS) and
+                return -1 (for numeric return values) or NULL (for pointers)
+ INHERITED      Inherited from the layer below
+ SUCCESS        Return 0 (for numeric return values) or a pointer
 
 =head2 Core Layers
 
@@ -700,8 +902,11 @@ and so resumes reading from layer below.)
 =item "raw"
 
 A dummy layer which never exists on the layer stack. Instead when
-"pushed" it actually pops the stack(!), removing itself, and any other
-layers until it reaches a layer with the class C bit set.
+"pushed" it actually pops the stack removing itself, it then calls
+Binmode function table entry on all the layers in the stack - normally
+this (via PerlIOBase_binmode) removes any layers which do not have
+C bit set. Layers can modify that behaviour by defining
+their own Binmode entry.
 
 =item "utf8"
 
@@ -741,23 +946,31 @@ makes this layer available, although F "knows" where to find it.
 It is an example of a layer which takes an argument as it is called
 thus:
 
- open($fh,"<:encoding(iso-8859-7)",$pathname)
+ open( $fh, "<:encoding(iso-8859-7)", $pathname );
 
-=item ":Scalar"
+=item ":scalar"
 
-Provides support for
+Provides support for reading data from and writing data to a scalar.
 
- open($fh,"...",\$scalar)
+ open( $fh, "+<:scalar", \$scalar );
 
 When a handle is so opened, then reads get bytes from the string value
 of I<$scalar>, and writes change the value. In both cases the position in
 I<$scalar> starts as zero but can be altered via C, and determined via
 C.
 
-=item ":Object" or ":Perl"
+Please note that this layer is implied when calling open() thus:
+
+ open( $fh, "+<", \$scalar );
 
-May be provided to allow layers to be implemented as perl code -
-implementation is being investigated.
+=item ":via"
+
+Provided to allow layers to be implemented as Perl code. For instance:
+
+ use PerlIO::via::StripHTML;
+ open( my $fh, "<:via(StripHTML)", "index.html" );
+
+See L for details.
 
 =back
 
@@ -824,6 +1037,3 @@ a person who is not a PerlIO guru (yet).
 =back
 
 =cut
-
-
-
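
The layer stack, the function-table size check, and the "call the layer's callback, else fall back" dispatch that this patch documents can be illustrated with a small toy model. This is only a minimal sketch in plain C and not PerlIO code: every `toy_*` name, the two-entry function table, the in-memory bottom layer, and the choice to make the base read simply ask the next layer down are assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stddef.h>

/* Toy model only: every toy_* name is hypothetical, not the PerlIO API. */
typedef struct toy_layer toy_layer;

typedef struct toy_funcs {
    size_t fsize;      /* size of this table, used as a compatibility check */
    const char *name;
    long (*Read)(toy_layer *l, char *buf, size_t count);  /* NULL: use base */
    long (*Tell)(toy_layer *l);                           /* NULL: fail */
} toy_funcs;

struct toy_layer {
    toy_layer *next;        /* next layer down the stack */
    const toy_funcs *tab;   /* the layer's "vtable" */
    const char *data;       /* bytes held by the bottom layer only */
    size_t pos, len;
};

long toy_read(toy_layer *l, char *buf, size_t count);

/* Base behaviour in this toy: hand the request to the next layer down. */
static long toy_base_read(toy_layer *l, char *buf, size_t count)
{
    return toy_read(l->next, buf, count);
}

/* Dispatch in the style described for Perl_PerlIO_or_Base. */
long toy_read(toy_layer *l, char *buf, size_t count)
{
    if (l == NULL || l->tab == NULL) { errno = EBADF; return -1; }
    if (l->tab->Read != NULL) return l->tab->Read(l, buf, count);
    return toy_base_read(l, buf, count);
}

/* Dispatch in the style described for Perl_PerlIO_or_fail. */
long toy_tell(toy_layer *l)
{
    if (l == NULL || l->tab == NULL) { errno = EBADF; return -1; }
    if (l->tab->Tell != NULL) return l->tab->Tell(l);
    errno = EINVAL;
    return -1;
}

/* Bottom layer: stands in for the operating system, serving bytes from memory. */
static long mem_read(toy_layer *l, char *buf, size_t count)
{
    size_t n = l->len - l->pos;
    if (n > count) n = count;
    for (size_t i = 0; i < n; i++) buf[i] = l->data[l->pos + i];
    l->pos += n;
    return (long)n;
}

/* Filter layer: reads from below, then transforms bytes on the way back up. */
static long upper_read(toy_layer *l, char *buf, size_t count)
{
    long n = toy_read(l->next, buf, count);
    for (long i = 0; i < n; i++) buf[i] = (char)toupper((unsigned char)buf[i]);
    return n;
}

const toy_funcs toy_mem      = { sizeof(toy_funcs), "mem", mem_read, NULL };
const toy_funcs toy_passthru = { sizeof(toy_funcs), "passthru", NULL, NULL };
const toy_funcs toy_upper    = { sizeof(toy_funcs), "upper", upper_read, NULL };

/* Push a layer on top of the stack, refusing tables of an unexpected size. */
int toy_push(toy_layer **stack, toy_layer *l, const toy_funcs *tab)
{
    assert(stack != NULL && l != NULL && tab != NULL);
    if (tab->fsize != sizeof(toy_funcs)) { errno = EINVAL; return -1; }
    l->tab = tab;
    l->next = *stack;
    *stack = l;
    return 0;
}
```

Pushing `toy_mem`, then `toy_passthru`, then `toy_upper` gives a three-layer stack in which a read enters at the top, passes through the layer with a NULL `Read` via the base fallback, and is satisfied at the bottom. A table whose `fsize` differs from `sizeof(toy_funcs)` is rejected by `toy_push`, and `toy_tell` on any of these layers fails with `EINVAL` because none of them supplies a `Tell` entry.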