From: Malcolm Beattie
Date: Tue, 21 Jul 1998 18:13:16 +0000 (+0100)
Subject: Compiler docs for 5.005
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=1a52ab62ecd17035bea67b718b7daed8ad18e673;p=p5sagit%2Fp5-mst-13.2.git

Compiler docs for 5.005
Message-Id: <199807211713.SAA20735@sable.ox.ac.uk>

p4raw-id: //depot/perl@1617
---

diff --git a/ext/B/B.pm b/ext/B/B.pm
index dcf7809..d5137d4 100644
--- a/ext/B/B.pm
+++ b/ext/B/B.pm
@@ -1,6 +1,6 @@
 # B.pm
 #
-# Copyright (c) 1996, 1997 Malcolm Beattie
+# Copyright (c) 1996, 1997, 1998 Malcolm Beattie
 #
 # You may distribute under the terms of either the GNU General Public
 # License or the Artistic License, as specified in the README file.
@@ -283,7 +283,540 @@ B - The Perl Compiler
 
 =head1 DESCRIPTION
 
-See F.
+The C<B> module supplies classes which allow a Perl program to delve
+into its own innards. It is the module used to implement the
+"backends" of the Perl compiler. Usage of the compiler does not
+require knowledge of this module: see the F<O> module for the
+user-visible part. The C<B> module is of use to those who want to
+write new compiler backends. This documentation assumes that the
+reader knows a fair amount about perl's internals including such
+things as SVs, OPs and the internal symbol table and syntax tree
+of a program.
+
+=head1 OVERVIEW OF CLASSES
+
+The C structures used by Perl's internals to hold SV and OP
+information (PVIV, AV, HV, ..., OP, SVOP, UNOP, ...) are modelled on a
+class hierarchy and the C<B> module gives access to them via a true
+object hierarchy. Structure fields which point to other objects
+(whether types of SV or types of OP) are represented by the C<B>
+module as Perl objects of the appropriate class. The bulk of the C<B>
+module is the methods for accessing fields of these structures. Note
+that all access is read-only: you cannot modify the internals by
+using this module.
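As a rough illustration of the object view described above (a sketch that is not part of the commit): assuming a perl whose B module exports svref_2object and class, both documented further down in this patch, a scalar holding an integer should come back as a B::IV object whose fields are read through accessor methods. The variable names are purely illustrative.

```perl
use strict;
use warnings;
use B qw(svref_2object class);

# Illustrative variable; any scalar holding an integer would do.
my $n = 42;

# svref_2object takes a reference and returns a B::* object that
# gives a read-only view of the underlying SV.
my $sv = svref_2object(\$n);

my $kind  = class($sv);   # class name with the leading "B::" removed
my $value = $sv->IV;      # integer value stored in the SV

print "class: $kind\n";
print "value: $value\n";
print "refcnt: ", $sv->REFCNT, "\n";
```

Nothing here writes to the scalar: the accessors only report what the interpreter already holds, which matches the read-only note above.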
+
+=head2 SV-RELATED CLASSES
+
+B::IV, B::NV, B::RV, B::PV, B::PVIV, B::PVNV, B::PVMG, B::BM, B::PVLV,
+B::AV, B::HV, B::CV, B::GV, B::FM, B::IO. These classes correspond in
+the obvious way to the underlying C structures of similar names. The
+inheritance hierarchy mimics the underlying C "inheritance". Access
+methods correspond to the underlying C macros for field access,
+usually with the leading "class indication" prefix removed (Sv, Av,
+Hv, ...). The leading prefix is only left in cases where its removal
+would cause a clash in method name. For example, C<GvREFCNT> stays
+as-is since its abbreviation would clash with the "superclass" method
+C<REFCNT> (corresponding to the C function C<SvREFCNT>).
+
+=head2 B::SV METHODS
+
+=over 4
+
+=item REFCNT
+
+=item FLAGS
+
+=back
+
+=head2 B::IV METHODS
+
+=over 4
+
+=item IV
+
+=item IVX
+
+=item needs64bits
+
+=item packiv
+
+=back
+
+=head2 B::NV METHODS
+
+=over 4
+
+=item NV
+
+=item NVX
+
+=back
+
+=head2 B::RV METHODS
+
+=over 4
+
+=item RV
+
+=back
+
+=head2 B::PV METHODS
+
+=over 4
+
+=item PV
+
+=back
+
+=head2 B::PVMG METHODS
+
+=over 4
+
+=item MAGIC
+
+=item SvSTASH
+
+=back
+
+=head2 B::MAGIC METHODS
+
+=over 4
+
+=item MOREMAGIC
+
+=item PRIVATE
+
+=item TYPE
+
+=item FLAGS
+
+=item OBJ
+
+=item PTR
+
+=back
+
+=head2 B::PVLV METHODS
+
+=over 4
+
+=item TARGOFF
+
+=item TARGLEN
+
+=item TYPE
+
+=item TARG
+
+=back
+
+=head2 B::BM METHODS
+
+=over 4
+
+=item USEFUL
+
+=item PREVIOUS
+
+=item RARE
+
+=item TABLE
+
+=back
+
+=head2 B::GV METHODS
+
+=over 4
+
+=item NAME
+
+=item STASH
+
+=item SV
+
+=item IO
+
+=item FORM
+
+=item AV
+
+=item HV
+
+=item EGV
+
+=item CV
+
+=item CVGEN
+
+=item LINE
+
+=item FILEGV
+
+=item GvREFCNT
+
+=item FLAGS
+
+=back
+
+=head2 B::IO METHODS
+
+=over 4
+
+=item LINES
+
+=item PAGE
+
+=item PAGE_LEN
+
+=item LINES_LEFT
+
+=item TOP_NAME
+
+=item TOP_GV
+
+=item FMT_NAME
+
+=item FMT_GV
+
+=item BOTTOM_NAME
+
+=item BOTTOM_GV
+
+=item SUBPROCESS
+
+=item IoTYPE
+
+=item IoFLAGS
+
+=back
+
+=head2 B::AV METHODS
+
+=over 4
+
+=item FILL
+
+=item MAX
+
+=item OFF
+
+=item ARRAY
+
+=item AvFLAGS
+
+=back
+
+=head2 B::CV METHODS
+
+=over 4
+
+=item STASH
+
+=item START
+
+=item ROOT
+
+=item GV
+
+=item FILEGV
+
+=item DEPTH
+
+=item PADLIST
+
+=item OUTSIDE
+
+=item XSUB
+
+=item XSUBANY
+
+=back
+
+=head2 B::HV METHODS
+
+=over 4
+
+=item FILL
+
+=item MAX
+
+=item KEYS
+
+=item RITER
+
+=item NAME
+
+=item PMROOT
+
+=item ARRAY
+
+=back
+
+=head2 OP-RELATED CLASSES
+
+B::OP, B::UNOP, B::BINOP, B::LOGOP, B::CONDOP, B::LISTOP, B::PMOP,
+B::SVOP, B::GVOP, B::PVOP, B::CVOP, B::LOOP, B::COP.
+These classes correspond in
+the obvious way to the underlying C structures of similar names. The
+inheritance hierarchy mimics the underlying C "inheritance". Access
+methods correspond to the underlying C structure field names, with the
+leading "class indication" prefix removed (op_).
+
+=head2 B::OP METHODS
+
+=over 4
+
+=item next
+
+=item sibling
+
+=item ppaddr
+
+This returns the function name as a string (e.g. pp_add, pp_rv2av).
+
+=item desc
+
+This returns the op description from the global C op_desc array
+(e.g. "addition" "array deref").
+
+=item targ
+
+=item type
+
+=item seq
+
+=item flags
+
+=item private
+
+=back
+
+=head2 B::UNOP METHOD
+
+=over 4
+
+=item first
+
+=back
+
+=head2 B::BINOP METHOD
+
+=over 4
+
+=item last
+
+=back
+
+=head2 B::LOGOP METHOD
+
+=over 4
+
+=item other
+
+=back
+
+=head2 B::CONDOP METHODS
+
+=over 4
+
+=item true
+
+=item false
+
+=back
+
+=head2 B::LISTOP METHOD
+
+=over 4
+
+=item children
+
+=back
+
+=head2 B::PMOP METHODS
+
+=over 4
+
+=item pmreplroot
+
+=item pmreplstart
+
+=item pmnext
+
+=item pmregexp
+
+=item pmflags
+
+=item pmpermflags
+
+=item precomp
+
+=back
+
+=head2 B::SVOP METHOD
+
+=over 4
+
+=item sv
+
+=back
+
+=head2 B::GVOP METHOD
+
+=over 4
+
+=item gv
+
+=back
+
+=head2 B::PVOP METHOD
+
+=over 4
+
+=item pv
+
+=back
+
+=head2 B::LOOP METHODS
+
+=over 4
+
+=item redoop
+
+=item nextop
+
+=item lastop
+
+=back
+
+=head2 B::COP METHODS
+
+=over 4
+
+=item label
+
+=item stash
+
+=item filegv
+
+=item cop_seq
+
+=item arybase
+
+=item line
+
+=back
+
+=head1 FUNCTIONS EXPORTED BY C<B>
+
+The C<B> module exports a variety of functions: some are simple
+utility functions, others provide a Perl program with a way to
+get an initial "handle" on an internal object.
+
+=over 4
+
+=item main_cv
+
+Return the (faked) CV corresponding to the main part of the Perl
+program.
+
+=item main_root
+
+Returns the root op (i.e. an object in the appropriate B::OP-derived
+class) of the main part of the Perl program.
+
+=item main_start
+
+Returns the starting op of the main part of the Perl program.
+
+=item comppadlist
+
+Returns the AV object (i.e. in class B::AV) of the global comppadlist.
+
+=item sv_undef
+
+Returns the SV object corresponding to the C variable C<sv_undef>.
+
+=item sv_yes
+
+Returns the SV object corresponding to the C variable C<sv_yes>.
+
+=item sv_no
+
+Returns the SV object corresponding to the C variable C<sv_no>.
+
+=item walkoptree(OP, METHOD)
+
+Does a tree-walk of the syntax tree based at OP and calls METHOD on
+each op it visits. Each node is visited before its children. If
+C<walkoptree_debug> (q.v.) has been called to turn debugging on then
+the method C<walkoptree_debug> is called on each op before METHOD is
+called.
+
+=item walkoptree_debug(DEBUG)
+
+Returns the current debugging flag for C<walkoptree>. If the optional
+DEBUG argument is non-zero, it sets the debugging flag to that. See
+the description of C<walkoptree> above for what the debugging flag
+does.
+
+=item walksymtable(SYMREF, METHOD, RECURSE)
+
+Walk the symbol table starting at SYMREF and call METHOD on each
+symbol visited. When the walk reaches package symbols "Foo::" it
+invokes RECURSE and only recurses into the package if that sub
+returns true.
+
+=item svref_2object(SV)
+
+Takes any Perl variable and turns it into an object in the
+appropriate B::OP-derived or B::SV-derived class. Apart from functions
+such as C<main_root>, this is the primary way to get an initial
+"handle" on an internal perl data structure which can then be followed
+with the other access methods.
+
+=item ppname(OPNUM)
+
+Return the PP function name (e.g. "pp_add") of op number OPNUM.
+
+=item hash(STR)
+
+Returns a string in the form "0x..." representing the value of the
+internal hash function used by perl on string STR.
+
+=item cast_I32(I)
+
+Casts I to the internal I32 type used by that perl.
+
+
+=item minus_c
+
+Does the equivalent of the C<-c> command-line option. Obviously, this
+is only useful in a BEGIN block or else the flag is set too late.
+
+
+=item cstring(STR)
+
+Returns a double-quote-surrounded escaped version of STR which can
+be used as a string in C source code.
+
+=item class(OBJ)
+
+Returns the class of an object without the part of the classname
+preceding the first "::". This is used to turn "B::UNOP" into
+"UNOP" for example.
+
+=item threadsv_names
+
+In a perl compiled for threads, this returns a list of the special
+per-thread threadsv variables.
+
+=item byteload_fh(FILEHANDLE)
+
+Load the contents of FILEHANDLE as bytecode. See documentation for
+the B<Bytecode> module in F<B::Backend> for how to generate bytecode.
+
+=back
 
 =head1 AUTHOR
 
diff --git a/ext/B/B/Bytecode.pm b/ext/B/B/Bytecode.pm
index 6c882b2..60b93a5 100644
--- a/ext/B/B/Bytecode.pm
+++ b/ext/B/B/Bytecode.pm
@@ -785,11 +785,121 @@ B::Bytecode - Perl compiler's bytecode backend
 
 =head1 SYNOPSIS
 
-    perl -MO=Bytecode[,SUBROUTINE] foo.pl
+    perl -MO=Bytecode[,OPTIONS] foo.pl
 
 =head1 DESCRIPTION
 
-See F.
+This compiler backend takes Perl source and generates a
+platform-independent bytecode encapsulating code to load the
+internal structures perl uses to run your program. When the
+generated bytecode is loaded in, your program is ready to run,
+reducing the time which perl would have taken to load and parse
+your program into its internal semi-compiled form. That means that
+compiling with this backend will not help improve the runtime
+execution speed of your program but may improve the start-up time.
+Depending on the environment in which your program runs this may
+or may not be a help.
+
+The resulting bytecode can be run with a special byteperl executable
+or (for non-main programs) be loaded via the C<byteload_fh> function
+in the F<B> module.
+
+=head1 OPTIONS
+
+If there are any non-option arguments, they are taken to be names of
+objects to be saved (probably doesn't work properly yet). Without
+extra arguments, it saves the main program.
+
+=over 4
+
+=item B<-ofilename>
+
+Output to filename instead of STDOUT.
+
+=item B<-->
+
+Force end of options.
+
+=item B<-f>
+
+Force optimisations on or off one at a time. Each can be preceded
+by B<no-> to turn the option off (e.g. B<-fno-compress-nullops>).
+
+=item B<-fcompress-nullops>
+
+Only fills in the necessary fields of ops which have
+been optimised away by perl's internal compiler.
+
+=item B<-fomit-sequence-numbers>
+
+Leaves out code to fill in the op_seq field of all ops
+which is only used by perl's internal compiler.
+
+=item B<-fbypass-nullops>
+
+If op->op_next ever points to a NULLOP, replaces the op_next field
+with the first non-NULLOP in the path of execution.
+
+=item B<-fstrip-syntax-tree>
+
+Leaves out code to fill in the pointers which link the internal syntax
+tree together. They're not needed at run-time but leaving them out
+will make it impossible to recompile or disassemble the resulting
+program. It will also stop C<goto label> statements from working.
+
+=item B<-On>
+
+Optimisation level (n = 0, 1, 2, ...). B<-O> means B<-O1>.
+B<-O1> sets B<-fcompress-nullops> B<-fomit-sequence-numbers>.
+B<-O6> adds B<-fstrip-syntax-tree>.
+
+=item B<-D>
+
+Debug options (concatenated or separate flags like C<perl -D>).
+
+=item B<-Do>
+
+Prints each OP as it's processed.
+
+=item B<-Db>
+
+Print debugging information about bytecompiler progress.
+
+=item B<-Da>
+
+Tells the (bytecode) assembler to include source assembler lines
+in its output as bytecode comments.
+
+=item B<-DC>
+
+Prints each CV taken from the final symbol tree walk.
+
+=item B<-S>
+
+Output (bytecode) assembler source rather than piping it
+through the assembler and outputting bytecode.
+
+=item B<-m>
+
+Compile as a module rather than a standalone program. Currently this
+just means that the bytecodes for initialising C<main_start>,
+C<main_root> and C<curpad> are omitted.
+
+=back
+
+=head1 EXAMPLES
+
+    perl -MO=Bytecode,-O6,-ofoo.plc foo.pl
+
+    perl -MO=Bytecode,-S foo.pl > foo.S
+    assemble foo.S > foo.plc
+    byteperl foo.plc
+
+    perl -MO=Bytecode,-m,-oFoo.pmc Foo.pm
+
+=head1 BUGS
+
+Plenty. Current status: experimental.
 
 =head1 AUTHOR
 
diff --git a/ext/B/B/C.pm b/ext/B/B/C.pm
index 0669109..e9e6fa2 100644
--- a/ext/B/B/C.pm
+++ b/ext/B/B/C.pm
@@ -1,6 +1,6 @@
 # C.pm
 #
-# Copyright (c) 1996, 1997 Malcolm Beattie
+# Copyright (c) 1996, 1997, 1998 Malcolm Beattie
 #
 # You may distribute under the terms of either the GNU General Public
 # License or the Artistic License, as specified in the README file.
@@ -1212,7 +1212,105 @@ B::C - Perl compiler's C backend
 
 =head1 DESCRIPTION
 
-See F.
+This compiler backend takes Perl source and generates C source code
+corresponding to the internal structures that perl uses to run
+your program. When the generated C source is compiled and run, it
+cuts out the time which perl would have taken to load and parse
+your program into its internal semi-compiled form. That means that
+compiling with this backend will not help improve the runtime
+execution speed of your program but may improve the start-up time.
+Depending on the environment in which your program runs this may be
+either a help or a hindrance.
+
+=head1 OPTIONS
+
+If there are any non-option arguments, they are taken to be
+names of objects to be saved (probably doesn't work properly yet).
+Without extra arguments, it saves the main program.
+
+=over 4
+
+=item B<-ofilename>
+
+Output to filename instead of STDOUT
+
+=item B<-v>
+
+Verbose compilation (currently gives a few compilation statistics).
+
+=item B<-->
+
+Force end of options
+
+=item B<-uPackname>
+
+Force apparently unused subs from package Packname to be compiled.
+This allows programs to use eval "foo()" even when sub foo is never
+seen to be used at compile time. The down side is that any subs which
+really are never used also have code generated. This option is
+necessary, for example, if you have a signal handler foo which you
+initialise with C<$SIG{BAR} = "foo">. A better fix, though, is just
+to change it to C<$SIG{BAR} = \&foo>. You can have multiple B<-u>
+options. The compiler tries to figure out which packages may possibly
+have subs in which need compiling but the current version doesn't do
+it very well. In particular, it is confused by nested packages (i.e.
+of the form C<A::B>) where package C<A> does not contain any subs.
+
+=item B<-D>
+
+Debug options (concatenated or separate flags like C<perl -D>).
+
+=item B<-Do>
+
+OPs, prints each OP as it's processed
+
+=item B<-Dc>
+
+COPs, prints COPs as processed (incl. file & line num)
+
+=item B<-DA>
+
+prints AV information on saving
+
+=item B<-DC>
+
+prints CV information on saving
+
+=item B<-DM>
+
+prints MAGIC information on saving
+
+=item B<-f>
+
+Force optimisations on or off one at a time.
+
+=item B<-fcog>
+
+Copy-on-grow: PVs declared and initialised statically.
+
+=item B<-fno-cog>
+
+No copy-on-grow.
+
+=item B<-On>
+
+Optimisation level (n = 0, 1, 2, ...). B<-O> means B<-O1>. Currently,
+B<-O1> and higher set B<-fcog>.
+
+=head1 EXAMPLES
+
+    perl -MO=C,-ofoo.c foo.pl
+    perl cc_harness -o foo foo.c
+
+Note that C<cc_harness> lives in the C<B> subdirectory of your perl
+library directory. The utility called C<perlcc> may also be used to
+help make use of this compiler.
+
+    perl -MO=C,-v,-DcA bar.pl > /dev/null
+
+=head1 BUGS
+
+Plenty. Current status: experimental.
 
 =head1 AUTHOR
 
diff --git a/ext/B/B/CC.pm b/ext/B/B/CC.pm
index 32c3033..573dbd6 100644
--- a/ext/B/B/CC.pm
+++ b/ext/B/B/CC.pm
@@ -1,6 +1,6 @@
 # CC.pm
 #
-# Copyright (c) 1996, 1997 Malcolm Beattie
+# Copyright (c) 1996, 1997, 1998 Malcolm Beattie
 #
 # You may distribute under the terms of either the GNU General Public
 # License or the Artistic License, as specified in the README file.
@@ -1539,7 +1539,193 @@ B::CC - Perl compiler's optimized C translation backend
 
 =head1 DESCRIPTION
 
-See F.
+This compiler backend takes Perl source and generates C source code
+corresponding to the flow of your program. In other words, this
+backend is somewhat a "real" compiler in the sense that many people
+think about compilers. Note however that, currently, it is a very
+poor compiler in that although it generates (mostly, or at least
+sometimes) correct code, it performs relatively few optimisations.
+This will change as the compiler develops. The result is that
+running an executable compiled with this backend may start up more
+quickly than running the original Perl program (a feature shared
+by the B<C> compiler backend--see F<B::C>) and may also execute
+slightly faster. This is by no means a good optimising compiler--yet.
+
+=head1 OPTIONS
+
+If there are any non-option arguments, they are taken to be
+names of objects to be saved (probably doesn't work properly yet).
+Without extra arguments, it saves the main program.
+
+=over 4
+
+=item B<-ofilename>
+
+Output to filename instead of STDOUT
+
+=item B<-v>
+
+Verbose compilation (currently gives a few compilation statistics).
+
+=item B<-->
+
+Force end of options
+
+=item B<-uPackname>
+
+Force apparently unused subs from package Packname to be compiled.
+This allows programs to use eval "foo()" even when sub foo is never
+seen to be used at compile time. The down side is that any subs which
+really are never used also have code generated. This option is
+necessary, for example, if you have a signal handler foo which you
+initialise with C<$SIG{BAR} = "foo">. A better fix, though, is just
+to change it to C<$SIG{BAR} = \&foo>. You can have multiple B<-u>
+options. The compiler tries to figure out which packages may possibly
+have subs in which need compiling but the current version doesn't do
+it very well. In particular, it is confused by nested packages (i.e.
+of the form C<A::B>) where package C<A> does not contain any subs.
+
+=item B<-mModulename>
+
+Instead of generating source for a runnable executable, generate
+source for an XSUB module. The boot_Modulename function (which
+DynaLoader can look for) does the appropriate initialisation and runs
+the main part of the Perl source that is being compiled.
+
+
+=item B<-D>
+
+Debug options (concatenated or separate flags like C<perl -D>).
+
+=item B<-Dr>
+
+Writes debugging output to STDERR just as it's about to write to the
+program's runtime (otherwise writes debugging info as comments in
+its C output).
+
+=item B<-DO>
+
+Outputs each OP as it's compiled
+
+=item B<-Ds>
+
+Outputs the contents of the shadow stack at each OP
+
+=item B<-Dp>
+
+Outputs the contents of the shadow pad of lexicals as it's loaded for
+each sub or the main program.
+
+=item B<-Dq>
+
+Outputs the name of each fake PP function in the queue as it's about
+to process it.
+
+=item B<-Dl>
+
+Output the filename and line number of each original line of Perl
+code as it's processed (C<pp_nextstate>).
+
+=item B<-Dt>
+
+Outputs timing information of compilation stages.
+
+=item B<-f>
+
+Force optimisations on or off one at a time.
+
+=item B<-ffreetmps-each-bblock>
+
+Delays FREETMPS from the end of each statement to the end of each
+basic block.
+
+=item B<-ffreetmps-each-loop>
+
+Delays FREETMPS from the end of each statement to the end of the group
+of basic blocks forming a loop. At most one of the freetmps-each-*
+options can be used.
+
+=item B<-fomit-taint>
+
+Omits generating code for handling perl's tainting mechanism.
+
+=item B<-On>
+
+Optimisation level (n = 0, 1, 2, ...). B<-O> means B<-O1>.
+Currently, B<-O1> sets B<-ffreetmps-each-bblock> and B<-O2>
+sets B<-ffreetmps-each-loop>.
+
+=back
+
+=head1 EXAMPLES
+
+    perl -MO=CC,-O2,-ofoo.c foo.pl
+    perl cc_harness -o foo foo.c
+
+Note that C<cc_harness> lives in the C<B> subdirectory of your perl
+library directory. The utility called C<perlcc> may also be used to
+help make use of this compiler.
+
+    perl -MO=CC,-mFoo,-oFoo.c Foo.pm
+    perl cc_harness -shared -c -o Foo.so Foo.c
+
+=head1 BUGS
+
+Plenty. Current status: experimental.
+
+=head1 DIFFERENCES
+
+These aren't really bugs but they are constructs which are heavily
+tied to perl's compile-and-go implementation and with which this
+compiler backend cannot cope.
+
+=head2 Loops
+
+Standard perl calculates the target of "next", "last", and "redo"
+at run-time. The compiler calculates the targets at compile-time.
+For example, the program
+
+    sub skip_on_odd { next NUMBER if $_[0] % 2 }
+    NUMBER: for ($i = 0; $i < 5; $i++) {
+        skip_on_odd($i);
+        print $i;
+    }
+
+produces the output
+
+    024
+
+with standard perl but gives a compile-time error with the compiler.
+
+=head2 Context of ".."
+
+The context (scalar or array) of the ".." operator determines whether
+it behaves as a range or a flip/flop. Standard perl delays until
+runtime the decision of which context it is in but the compiler needs
+to know the context at compile-time. For example,
+
+    @a = (4,6,1,0,0,1);
+    sub range { (shift @a)..(shift @a) }
+    print range();
+    while (@a) { print scalar(range()) }
+
+generates the output
+
+    456123E0
+
+with standard Perl but gives a compile-time error with compiled Perl.
+
+=head2 Arithmetic
+
+Compiled Perl programs use native C arithmetic much more frequently
+than standard perl. Operations on large numbers or on boundary
+cases may produce different behaviour.
+
+=head2 Deprecated features
+
+Features of standard perl such as C<$[> which have been deprecated
+in standard perl since Perl5 was released have not been implemented
+in the compiler.
 
 =head1 AUTHOR
 
diff --git a/ext/B/O.pm b/ext/B/O.pm
index 3b0f054..ad391a3 100644
--- a/ext/B/O.pm
+++ b/ext/B/O.pm
@@ -31,7 +31,52 @@ O - Generic interface to Perl Compiler backends
 
 =head1 DESCRIPTION
 
-See F.
+This is the module that is used as a frontend to the Perl Compiler.
+
+=head1 CONVENTIONS
+
+Most compiler backends use the following conventions: OPTIONS
+consists of a comma-separated list of words (no white-space).
+The C<-v> option usually puts the backend into verbose mode.
+The C<-ofile> option generates output to B<file> instead of
+stdout. The C<-D> option followed by various letters turns on
+various internal debugging flags. See the documentation for the
+desired backend (named C<B::Backend> for the example above) to
+find out about that backend.
+
+=head1 IMPLEMENTATION
+
+This section is only necessary for those who want to write a
+compiler backend module that can be used via this module.
+
+The command-line mentioned in the SYNOPSIS section corresponds to
+the Perl code
+
+    use O ("Backend", OPTIONS);
+
+The C<import> function which that calls loads in the appropriate
+C<B::Backend> module and calls the C<compile> function in that
+package, passing it OPTIONS. That function is expected to return
+a sub reference which we'll call CALLBACK. Next, the "compile-only"
+flag is switched on (equivalent to the command-line option C<-c>)
+and an END block is registered which calls CALLBACK. Thus the main
+Perl program mentioned on the command-line is read in, parsed and
+compiled into internal syntax tree form. Since the C<-c> flag is
+set, the program does not start running (excepting BEGIN blocks of
+course) but the CALLBACK function registered by the compiler
+backend is called.
+
+In summary, a compiler backend module should be called "B::Foo"
+for some foo and live in the appropriate directory for that name.
+It should define a function called C<compile>. When the user types
+
+    perl -MO=Foo,OPTIONS foo.pl
+
+that function is called and is passed those OPTIONS (split on
+commas). It should return a sub ref to the main compilation function.
+After the user's program is loaded and parsed, that returned sub ref
+is invoked which can then go ahead and do the compilation, usually by
+making use of the C<B> module's functionality.
 
 =head1 AUTHOR
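
The backend protocol described in the O.pm IMPLEMENTATION section can be sketched roughly as follows. This is an illustration only and not part of the commit: the package name B::Foo, the method name foo_collect and the handling of -v are hypothetical choices, and the callback is invoked by hand at run time rather than by O.pm, which assumes main_root is still available once the program has been compiled.

```perl
use strict;
use warnings;

package B::Foo;    # hypothetical backend name, following the "B::Foo" convention

use B qw(main_root walkoptree ppname);

my @seen;          # PP function names collected during the walk

# walkoptree calls the named method on every op it visits, so the
# method is installed in B::OP, the base class of all op objects.
sub B::OP::foo_collect {
    my ($op) = @_;
    push @seen, ppname($op->type);
}

# Entry point that O.pm looks for: it receives the comma-split OPTIONS
# and returns the sub reference that does the real work (CALLBACK).
sub compile {
    my @options = @_;
    my $verbose = grep { $_ eq '-v' } @options;
    return sub {
        @seen = ();
        walkoptree(main_root(), 'foo_collect');
        print scalar(@seen), " ops visited\n" if $verbose;
        return @seen;
    };
}

package main;

# Stand-in for what O.pm arranges once the program has been parsed;
# here the callback is simply called directly.
my $callback = B::Foo::compile('-v');
my @ops      = $callback->();
```

Saved as B/Foo.pm somewhere on @INC (without the trailing package main part), the same compile function would be the one reached by perl -MO=Foo,-v foo.pl, as described above.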