From: Jon Orwant
Date: Sat, 1 Feb 1997 23:34:59 +0000 (-0500)
Subject: new (Feb 1) perlembed.pod
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=8a7dc658e6602067382c308b2131d135e4063624;p=p5sagit%2Fp5-mst-13.2.git

new (Feb 1) perlembed.pod

private-msgid: <9702012334.AA15747@fahrenheit-451.media.mit.edu>
---

diff --git a/pod/perlembed.pod b/pod/perlembed.pod
index 2a9ce58..a8bedcc 100644
--- a/pod/perlembed.pod
+++ b/pod/perlembed.pod
@@ -20,8 +20,8 @@ Read about back-quotes and about C and C in L.
 
 =item B
 
-Read about C and C in L and L and C
-and C in L and L, L.
+Read about L and L and L
+and L.
 
 =item B
 
@@ -55,12 +55,17 @@ L
 
 L
 
-This documentation is UNIX specific.
+This documentation is Unix specific; if you have information about how
+to embed Perl on other platforms, please send e-mail to
+orwant@tpj.com.
 
 =head2 Compiling your C program
 
-Every C program that uses Perl must link in the I.
+If you have trouble compiling the scripts in this documentation,
+you're not alone. The cardinal rule: COMPILE THE PROGRAMS IN EXACTLY
+THE SAME WAY THAT YOUR PERL WAS COMPILED. (Sorry for yelling.)
+Also, every C program that uses Perl must link in the I.
 What's that, you ask? Perl is itself written in C; the perl library is
 the collection of compiled C programs that were used to create your
 perl executable (I or equivalent). (Corollary: you
@@ -69,13 +74,14 @@ your machine, or installed properly--that's why you shouldn't blithely
 copy Perl executables from machine to machine without also copying the
 I directory.)
 
-Your C program will--usually--allocate, "run", and deallocate a
-I object, which is defined in the perl library.
+When you use Perl from C, your C program will--usually--allocate,
+"run", and deallocate a I object, which is defined by
+the perl library.
 
 If your copy of Perl is recent enough to contain this documentation
 (version 5.002 or later), then the perl library (and I and
-I, which you'll also need) will
-reside in a directory resembling this:
+I, which you'll also need) will reside in a directory
+that looks like this:
 
     /usr/local/lib/perl5/your_architecture_here/CORE
 
@@ -91,42 +97,64 @@ Execute this statement for a hint about where to find CORE:
 
     perl -MConfig -e 'print $Config{archlib}'
 
-Here's how you might compile the example in the next section,
-L,
-on a DEC Alpha running the OSF operating system:
+Here's how you'd compile the example in the next section,
+L, on my Linux box:
 
-    % cc -o interp interp.c -L/usr/local/lib/perl5/alpha-dec_osf/CORE
-    -I/usr/local/lib/perl5/alpha-dec_osf/CORE -lperl -lm
+    % gcc -O2 -Dbool=char -DHAS_BOOL -I/usr/local/include
+    -I/usr/local/lib/perl5/i586-linux/5.003/CORE
+    -L/usr/local/lib/perl5/i586-linux/5.003/CORE
+    -o interp interp.c -lperl -lm
 
-You'll have to choose the appropriate compiler (I, I, et al.) and
-library directory (I) for your machine. If your
-compiler complains that certain functions are undefined, or that it
-can't locate I<-lperl>, then you need to change the path following the
--L. If it complains that it can't find I or I, you need
-to change the path following the -I.
+(That's all one line.) On my DEC Alpha running 5.00305, the incantation
+is a bit different:
+
+    % cc -O2 -Olimit 2900 -DSTANDARD_C -I/usr/local/include
+    -I/usr/local/lib/perl5/alpha-dec_osf/5.00305/CORE
+    -L/usr/local/lib/perl5/alpha-dec_osf/5.00305/CORE -L/usr/local/lib
+    -D__LANGUAGE_C__ -D_NO_PROTO -o interp interp.c -lperl -lm
+
+How can you figure out what to add? Assuming your Perl is post-5.001,
+execute a C command and pay special attention to the "cc" and
+"ccflags" information.
+
+You'll have to choose the appropriate compiler (I, I, et al.) for
+your machine: C will tell you what
+to use.
+
+You'll also have to choose the appropriate library directory
+(I) for your machine. If your compiler complains
+that certain functions are undefined, or that it can't locate
+I<-lperl>, then you need to change the path following the C<-L>. If it
+complains that it can't find I and I, you need to
+change the path following the C<-I>.
 
 You may have to add extra libraries as well. Which ones?
 Perhaps those printed by
 
     perl -MConfig -e 'print $Config{libs}'
 
-We strongly recommend you use the B module to determine
-all of this information for you:
+Provided your perl binary was properly configured and installed the
+B module will determine all of this information for
+you:
 
     % cc -o interp interp.c `perl -MExtUtils::Embed -e ccopts -e ldopts`
+If the B module isn't part of your Perl distribution,
+you can retrieve it from
+http://www.perl.com/perl/CPAN/modules/by-module/ExtUtils::Embed. (If
+this documentation came from your Perl distribution, then you're
+running 5.004 or better and you already have it.)
 
-If the B module is not part of your perl kit's
-distribution you can retrieve it from:
-http://www.perl.com/cgi-bin/cpan_mod?module=ExtUtils::Embed.
-
+The B kit on CPAN also contains all source code for
+the examples in this document, tests, additional examples and other
+information you may find useful.
 
 =head2 Adding a Perl interpreter to your C program
 
 In a sense, perl (the C program) is a good example of embedding Perl
 (the language), so I'll demonstrate embedding with I,
-from the source distribution. Here's a bastardized, non-portable version of
-I containing the essentials of embedding:
+from the source distribution. Here's a bastardized, non-portable
+version of I containing the essentials of embedding:
 
     #include /* from the Perl distribution */
     #include /* from the Perl distribution */
@@ -143,11 +171,9 @@ I containing the essentials of embedding:
         perl_free(my_perl);
     }
 
-Note that we do not use the C pointer here or in any of the
-following examples.
-Normally handed to C as its final argument,
-we hand it a B instead, in which case the current environment
-is used.
+Notice that we don't use the C pointer. Normally handed to
+C as its final argument, C here is replaced by
+C, which means that the current environment will be used.
 
 Now compile this program (I'll call it I) into an executable:
 
@@ -221,33 +247,34 @@ Simple enough. Now compile and run:
     818284590
 
 yielding the number of seconds that elapsed between January 1, 1970
-(the beginning of the UNIX epoch), and the moment I began writing this
+(the beginning of the Unix epoch), and the moment I began writing this
 sentence.
 
-Note that in this particular case we are not required to call I,
-however, in general it's considered good practice to ensure proper
-initialization of library code including execution of all object C
-methods and package C blocks.
+In this particular case we don't have to call I, but in
+general it's considered good practice to ensure proper initialization
+of library code, including execution of all object C methods
+and package C blocks.
 
-If you want to pass some arguments to the Perl subroutine, you may add
-strings to the C terminated C list passed to I.
-In order to pass arguments of another data type and/or examine return values
-of the subroutine you'll need to manipulate the
-Perl stack, demonstrated in the last section of this document:
-L
+If you want to pass arguments to the Perl subroutine, you can add
+strings to the C-terminated C list passed to
+I. For other data types, or to examine return values,
+you'll need to manipulate the Perl stack. That's demonstrated in the
+last section of this document: L.
 
 =head2 Evaluating a Perl statement from your C program
 
-One way to evaluate pieces of Perl code is to use L.
-We have wrapped this function with our own I function, which
-converts a command string to an SV, passing this and the L
-flag to L.
+One way to evaluate pieces of Perl code is to use
+L. We've wrapped this inside our own
+I function, which converts a command string to an SV,
+passing this and the L flag to
+L.
 
 Arguably, this is the only routine you'll ever need to execute
 snippets of Perl code from within your C program. Your string can be
-as long as you wish; it can contain multiple statements; it can
-include L, L and L to
-include external Perl files.
+as long as you wish; it can contain multiple statements; it can employ
+L, L and L to include
+external Perl files.
 
 Our I lets us evaluate individual Perl strings, and then
 extract variables for coercion into C types. The following program,
@@ -309,29 +336,30 @@ substitutions: I, I, and I.
 
     char match(char *string, char *pattern);
 
-Given a string and a pattern (e.g., "m/clasp/" or "/\b\w*\b/", which in
-your program might be represented as C<"/\\b\\w*\\b/">),
+Given a string and a pattern (e.g., C or C, which
+in your C program might appear as "/\\b\\w*\\b/"),
 match() returns 1 if the string matches the pattern and 0 otherwise.
 
-
     int substitute(char *string[], char *pattern);
 
-Given a pointer to a string and an "=~" operation (e.g., "s/bob/robert/g" or
-"tr[A-Z][a-z]"), modifies the string according to the operation,
-returning the number of substitutions made.
+Given a pointer to a string and an C<=~> operation (e.g.,
+C or C), substitute() modifies the string
+according to the operation, returning the number of substitutions
+made.
 
     int matches(char *string, char *pattern, char **matches[]);
 
 Given a string, a pattern, and a pointer to an empty array of strings,
-evaluates C<$string =~ $pattern> in an array context, and fills in
-I with the array elements (allocating memory as it does so),
-returning the number of matches found.
+matches() evaluates C<$string =~ $pattern> in an array context, and
+fills in I with the array elements (allocating memory as it
+does so), returning the number of matches found.
 
 Here's a sample program, I, that uses all three (long lines have
 been wrapped here):
 
     #include
     #include
+
     static PerlInterpreter *my_perl;
     I32 perl_eval(char *string)
     {
@@ -457,22 +485,22 @@ been wrapped here):
 
 which produces the output (again, long lines have been wrapped here)
 
-    perl_match: Text contains the word 'quarter'.
+    match: Text contains the word 'quarter'.
 
-    perl_match: Text doesn't contain the word 'eighth'.
+    match: Text doesn't contain the word 'eighth'.
 
-    perl_matches: m/(wi..)/g found 2 matches...
+    matches: m/(wi..)/g found 2 matches...
     match: will
     match: with
 
-    perl_substitute: s/[aeiou]//gi...139 substitutions made.
+    substitute: s/[aeiou]//gi...139 substitutions made.
     Now text is: Whn h s t cnvnnc str nd th bll cms t sm mnt lk 76 cnts,
     Mynrd s wr tht thr s smthng h *shld* d, smthng tht wll nbl hm t gt
     bck qrtr, bt h hs n d *wht*. H fmbls thrgh hs rd sqzy chngprs nd
     gvs th by thr xtr pnns wth hs dllr, hpng tht h mght lck nt th crrct
     mnt. Th by gvs hm bck tw f hs wn pnns nd thn th bg shny qrtr tht s
     hs prz. -RCHH
 
-    perl_substitute: s/Perl/C...No substitution made.
+    substitute: s/Perl/C...No substitution made.
 
 =head2 Fiddling with the Perl stack from your C program
 
@@ -561,42 +589,44 @@ Compile and run:
 
 =head2 Maintaining a persistent interpreter
 
-When developing interactive, potentially long-running applications, it's
-a good idea to maintain a persistent interpreter rather than allocating
-and constructing a new interpreter multiple times. The major gain here is
-speed, avoiding the penalty of Perl start-up time. However, a persistent
-interpreter will require you to be more cautious in your use of namespace
-and variable scoping. In previous examples we've been using global variables
-in the default package B. We knew exactly what code would be run,
-making it safe to assume we'd avoid any variable collision or outrageous
-symbol table growth.
-
-Let's say your application is a server, which must run perl code from an
-arbitrary file during each transaction. Your server has no way of knowing
-what code is inside anyone of these files.
-If the file was pulled in by B, compiled into a newly
-constructed interpreter, then cleaned out with B after the
-the transaction, you'd be shielded from most namespace troubles.
-
-One way to avoid namespace collisions in this scenerio, is to translate the
-file name into a valid Perl package name, which is most likely to be unique,
-then compile the code into that package using L.
-In the example below, each file will only be compiled once, unless it is
-updated on disk.
-Optionally, the application may choose to clean out the symbol table
-associated with the file after we are done with it. We'll call the subroutine
-B which lives in the file B, with
-L, passing the filename and boolean cleanup/cache
+When developing interactive and/or potentially long-running
+applications, it's a good idea to maintain a persistent interpreter
+rather than allocating and constructing a new interpreter multiple
+times. The major reason is speed: since Perl will only be loaded into
+memory once.
+
+However, you have to be more cautious with namespace and variable
+scoping when using a persistent interpreter. In previous examples
+we've been using global variables in the default package C. We
+knew exactly what code would be run, and assumed we could avoid
+variable collisions and outrageous symbol table growth.
+
+Let's say your application is a server that will occasionally run Perl
+code from some arbitrary file. Your server has no way of knowing what
+code it's going to run. Very dangerous.
+
+If the file is pulled in by C, compiled into a newly
+constructed interpreter, and subsequently cleaned out with
+C afterwards, you're shielded from most namespace
+troubles.
+
+One way to avoid namespace collisions in this scenario is to translate
+the filename into a guaranteed-unique package name, and then compile
+the code into that package using L. In the example
+below, each file will only be compiled once. Or, the application
+might choose to clean out the symbol table associated with the file
+after it's no longer needed. Using L, We'll
+call the subroutine C which lives in the
+file C and pass the filename and boolean cleanup/cache
 flag as arguments.
 
-Note that the process will continue to grow for each file that is compiled,
-and each file it pulls in via L, L or
-L. In addition, there maybe Bed subroutines and
-other conditions that cause Perl's symbol table to grow. You may wish to
-add logic which keeps track of process size or restarts itself after n number
-of requests to ensure memory consumption is kept to a minimum. You also need
-to consider the importance of variable scoping with L to futher
-reduce symbol table growth.
+Note that the process will continue to grow for each file that it
+uses. In addition, there might be Ced subroutines and other
+conditions that cause Perl's symbol table to grow. You might want to
+add some logic that keeps track of the process size, or restarts
+itself after a certain number of requests, to ensure that memory
+consumption is minimized. You'll also want to scope your variables
+with L whenever possible.
 
     package Embed::Persistent;
@@ -605,8 +635,6 @@ reduce symbol table growth.
     use strict;
     use vars '%Cache';
 
-    #use Devel::Symdump ();
-
     sub valid_package_name {
         my($string) = @_;
         $string =~ s/([^A-Za-z0-9\/])/sprintf("_%2x",unpack("C",$1))/eg;
@@ -729,7 +757,7 @@ reduce symbol table growth.
 
 Now compile:
 
-    % cc -o persistent persistent.c `perl -MExtUtils::Embed -e ldopts`
+    % cc -o persistent persistent.c `perl -MExtUtils::Embed -e ccopts -e ldopts`
 
 Here's a example script file:
 
@@ -753,67 +781,50 @@ Now run:
 
 =head2 Maintaining multiple interpreter instances
 
-The previous examples have gone through several steps to startup, use and
-shutdown an embedded Perl interpreter. Certain applications may require
-more than one instance of an interpreter to be created during the lifespan
-of a single process. Such an application may take different approaches in
-it's use of interpreter objects. For example, a particular transaction may
-want to create an interpreter instance, then release any resources associated
-with the object once the transaction is completed. When a single process
-does this once, resources are released upon exit of the program and the next
-time it starts, the interpreter's global state is fresh.
-
-In the same process, the program must take care to ensure that these
-actions take place before constructing a new interpreter. By default, the
-global variable C is set to C<0> since extra cleaning
-is not needed when a program constructs a single interpreter, such as the
-perl executable itself in C or some such.
-
-You can tell Perl to make everything squeeky clean by setting
-C to C<1>.
+Some rare applications will need to create more than one interpreter
+during a session. Such an application might sporadically decide to
+release any resources associated with the interpreter.
+
+The program must take care to ensure that this takes place I
+the next interpreter is constructed. By default, the global variable
+C is set to C<0>, since extra cleaning isn't
+needed when a program has only one interpreter.
+
+Setting C to C<1> makes everything squeaky clean:
+
+    perl_destruct_level = 1;
 
-    perl_destruct_level = 1; /* perl global variable */
     while(1) {
         ...
         /* reset global variables here with perl_destruct_level = 1 */
-        perl_contruct(my_perl);
+        perl_construct(my_perl);
         ...
         /* clean and reset _everything_ during perl_destruct */
-        perl_destruct(my_perl); /* ah, nice and fresh */
+        perl_destruct(my_perl);
         perl_free(my_perl);
         ...
         /* let's go do it again! */
     }
 
-Now, when I is called, the interpreter's syntax parsetree
-and symbol tables are cleaned out, along with reseting global variables.
-
-So, we've seen how to startup and shutdown an interpreter more than once
-in the same process, but there was only one instance in existance at any
-one time. Hmm, wonder if we can have more than one interpreter instance
-running at the _same_ time?
-Indeed this is possible, however when you build Perl, you must compile with
-C<-DMULTIPLICITY>.
+When I is called, the interpreter's syntax parse tree
+and symbol tables are cleaned up, and global variables are reset.
 
-It's a little tricky for the Perl runtime to handle multiple interpreters,
-introducing some overhead that most programs with a single interpreter don't
-get burdened with. When you compile with C<-DMULTIPLICITY>, by default,
-C is set to C<1> for each interpreter.
+Now suppose we have more than one interpreter instance running at the
+same time. This is feasible, but only if you used the
+C<-DMULTIPLICITY> flag when building Perl. By default, that sets
+C to C<1>.
 
 Let's give it a try:
 
     #include
-    #include
-
+    #include
     /* we're going to embed two interpreters */
     /* we're going to embed two interpreters */
-
     #define SAY_HELLO "-e", "print qq(Hi, I'm $^X\n)"
-
     int main(int argc, char **argv, char **env)
     {
         PerlInterpreter
@@ -917,7 +928,7 @@ Once you have this code, slap it into the second argument of I:
 
 Then compile:
 
-    % cc -o interp interp.c `perl -MExtUtils::Embed -e ldopts`
+    % cc -o interp interp.c `perl -MExtUtils::Embed -e ccopts -e ldopts`
 
     % interp
     use Socket;
@@ -927,11 +938,10 @@ Then compile:
 
 B can also automate writing the I glue code.
 
-    % perl -MExtUtils::Embed -e xsinit -o perlxsi.c
+    % perl -MExtUtils::Embed -e xsinit -- -o perlxsi.c
     % cc -c perlxsi.c `perl -MExtUtils::Embed -e ccopts`
     % cc -c interp.c `perl -MExtUtils::Embed -e ccopts`
-    % cc -o interp perlxsi.o interp.o \
-    `perl -MExtUtils::Embed -e ccdlflags -e ldopts`
+    % cc -o interp perlxsi.o interp.o `perl -MExtUtils::Embed -e ldopts`
 
 Consult L and L for more details.
 
@@ -945,14 +955,28 @@ each from the other, combine them as you wish.
 
 =head1 AUTHOR
 
-Jon Orwant F<E<lt>orwant@media.mit.eduE<gt>>,
-co-authored by Doug MacEachern F<E<lt>dougm@osf.orgE<gt>>,
-with contributions from
-Tim Bunce, Tom Christiansen, Dov Grobgeld, and Ilya
-Zakharevich.
+Jon Orwant and F<E<lt>orwant@media.mit.eduE<gt>> and Doug MacEachern
+F<E<lt>dougm@osf.orgE<gt>>, with small contributions from Tim Bunce,
+Tom Christiansen, Hallvard Furuseth, Dov Grobgeld, and Ilya Zakharevich.
+
+Check out Doug's article on embedding in Volume 1, Issue 4 of The Perl
+Journal. Info about TPJ is available from http://tpj.com.
 
-June 17, 1996
+February 1, 1997
 
-Some of this material is excerpted from my book: I,
-Waite Group Press, 1996 (ISBN 1-57169-064-6) and appears
+Some of this material is excerpted from Jon Orwant's book: I, Waite Group Press, 1996 (ISBN 1-57169-064-6) and appears
 courtesy of Waite Group Press.
+
+=head1 COPYRIGHT
+
+Copyright (C) 1995, 1996, 1997 Doug MacEachern and Jon Orwant. All
+Rights Reserved.
+
+Although destined for release with the standard Perl distribution,
+this document is not public domain, nor is any of Perl and its
+documentation. Permission is granted to freely distribute verbatim
+copies of this document provided that no modifications outside of
+formatting be made, and that this notice remain intact. You are
+permitted and encouraged to use its code and derivatives thereof in
+your own source code for fun or for profit as you see fit.