pod/perlembed.pod

   1 =head1 NAME
   2
   3 perlembed - how to embed perl in your C program
   4
   5 =head1 DESCRIPTION
   6
   7 =head2 PREAMBLE
   8
   9 Do you want to:
  10
  11 =over 5
  12
  13 =item B<Use C from Perl?>
  14
  15 Read L<perlcall> and L<perlxs>.
  16
  17 =item B<Use a Unix program from Perl?>
  18
  19 Read about back-quotes and about C<system> and C<exec> in L<perlfunc>.
  20
  21 =item B<Use Perl from Perl?>
  22
  23 Read about L<perlfunc/do> and L<perlfunc/eval> and L<perlfunc/require>
  24 and L<perlfunc/use>.
  25
  26 =item B<Use C from C?>
  27
  28 Rethink your design.
  29
  30 =item B<Use Perl from C?>
  31
  32 Read on...
  33
  34 =back
  35
  36 =head2 ROADMAP
  37
  38 L<Compiling your C program>
  39
  40 There's one example in each of the eight sections:
  41
  42 L<Adding a Perl interpreter to your C program>
  43
  44 L<Calling a Perl subroutine from your C program>
  45
  46 L<Evaluating a Perl statement from your C program>
  47
  48 L<Performing Perl pattern matches and substitutions from your C program>
  49
  50 L<Fiddling with the Perl stack from your C program>
  51
  52 L<Maintaining a persistent interpreter>
  53
  54 L<Maintaining multiple interpreter instances>
  55
  56 L<Using Perl modules, which themselves use C libraries, from your C program>
  57
  58 This documentation is Unix specific; if you have information about how
  59 to embed Perl on other platforms, please send e-mail to <F<orwant@tpj.com>>.
  60
  61 =head2 Compiling your C program
  62
  63 If you have trouble compiling the programs in this documentation,
  64 you're not alone.  The cardinal rule: COMPILE THE PROGRAMS IN EXACTLY
  65 THE SAME WAY THAT YOUR PERL WAS COMPILED.  (Sorry for yelling.)
  66
  67 Also, every C program that uses Perl must link in the I<perl library>.
  68 What's that, you ask?  Perl is itself written in C; the perl library
  69 is the collection of compiled C programs that were used to create your
  70 perl executable (I</usr/bin/perl> or equivalent).  (Corollary: you
  71 can't use Perl from your C program unless Perl has been compiled on
  72 your machine, or installed properly--that's why you shouldn't blithely
  73 copy Perl executables from machine to machine without also copying the
  74 I<lib> directory.)
  75
  76 When you use Perl from C, your C program will--usually--allocate,
  77 "run", and deallocate a I<PerlInterpreter> object, which is defined by
  78 the perl library.
  79
  80 If your copy of Perl is recent enough to contain this documentation
  81 (version 5.002 or later), then the perl library (and I<EXTERN.h> and
  82 I<perl.h>, which you'll also need) will reside in a directory
  83 that looks like this:
  84
  85     /usr/local/lib/perl5/your_architecture_here/CORE
  86
  87 or perhaps just
  88
  89     /usr/local/lib/perl5/CORE
  90
  91 or maybe something like
  92
  93     /usr/opt/perl5/CORE
  94
  95 Execute this statement for a hint about where to find CORE:
  96
  97     perl -MConfig -e 'print $Config{archlib}'
  98
  99 Here's how you'd compile the example in the next section,
 100 L<Adding a Perl interpreter to your C program>, on my Linux box:
 101
 102     % gcc -O2 -Dbool=char -DHAS_BOOL -I/usr/local/include
 103     -I/usr/local/lib/perl5/i586-linux/5.003/CORE
 104     -L/usr/local/lib/perl5/i586-linux/5.003/CORE
 105     -o interp interp.c -lperl -lm
 106
 107 (That's all one line.)  On my DEC Alpha running 5.003_05, the incantation
 108 is a bit different:
 109
 110     % cc -O2 -Olimit 2900 -DSTANDARD_C -I/usr/local/include
 111     -I/usr/local/lib/perl5/alpha-dec_osf/5.00305/CORE
 112     -L/usr/local/lib/perl5/alpha-dec_osf/5.00305/CORE -L/usr/local/lib
 113     -D__LANGUAGE_C__ -D_NO_PROTO -o interp interp.c -lperl -lm
 114
 115 How can you figure out what to add?  Assuming your Perl is post-5.001,
 116 execute a C<perl -V> command and pay special attention to the "cc" and
 117 "ccflags" information.
 118
 119 You'll have to choose the appropriate compiler (I<cc>, I<gcc>, et al.) for
 120 your machine: C<perl -MConfig -e 'print $Config{cc}'> will tell you what
 121 to use.
 122
 123 You'll also have to choose the appropriate library directory
 124 (I</usr/local/lib/...>) for your machine.  If your compiler complains
 125 that certain functions are undefined, or that it can't locate
 126 I<-lperl>, then you need to change the path following the C<-L>.  If it
 127 complains that it can't find I<EXTERN.h> and I<perl.h>, you need to
 128 change the path following the C<-I>.
 129
 130 You may have to add extra libraries as well.  Which ones?
 131 Perhaps those printed by
 132
 133    perl -MConfig -e 'print $Config{libs}'
 134
 135 Provided your perl binary was properly configured and installed, the
 136 B<ExtUtils::Embed> module will determine all of this information for
 137 you:
 138
 139    % cc -o interp interp.c `perl -MExtUtils::Embed -e ccopts -e ldopts`
 140
 141 If the B<ExtUtils::Embed> module isn't part of your Perl distribution,
 142 you can retrieve it from
 143 http://www.perl.com/perl/CPAN/modules/by-module/ExtUtils::Embed.  (If
 144 this documentation came from your Perl distribution, then you're
 145 running 5.004 or better and you already have it.)
 146
 147 The B<ExtUtils::Embed> kit on CPAN also contains all source code for
 148 the examples in this document, tests, additional examples and other
 149 information you may find useful.
 150
 151 =head2 Adding a Perl interpreter to your C program
 152
 153 In a sense, perl (the C program) is a good example of embedding Perl
 154 (the language), so I'll demonstrate embedding with I<miniperlmain.c>,
 155 from the source distribution.  Here's a bastardized, nonportable
 156 version of I<miniperlmain.c> containing the essentials of embedding:
 157
 158     #include <EXTERN.h>               /* from the Perl distribution     */
 159     #include <perl.h>                 /* from the Perl distribution     */
 160
 161     static PerlInterpreter *my_perl;  /***    The Perl interpreter    ***/
 162
 163     int main(int argc, char **argv, char **env)
 164     {
 165         my_perl = perl_alloc();
 166         perl_construct(my_perl);
 167         perl_parse(my_perl, NULL, argc, argv, (char **)NULL);
 168         perl_run(my_perl);
 169         perl_destruct(my_perl);
 170         perl_free(my_perl);
 171     }
 172
 173 Notice that we don't use the C<env> pointer.  Normally handed to
 174 C<perl_parse> as its final argument, C<env> here is replaced by
 175 C<NULL>, which means that the current environment will be used.
 176
 177 Now compile this program (I'll call it I<interp.c>) into an executable:
 178
 179     % cc -o interp interp.c `perl -MExtUtils::Embed -e ccopts -e ldopts`
 180
 181 After a successful compilation, you'll be able to use I<interp> just
 182 like perl itself:
 183
 184     % interp
 185     print "Pretty Good Perl \n";
 186     print "10890 - 9801 is ", 10890 - 9801;
 187     <CTRL-D>
 188     Pretty Good Perl
 189     10890 - 9801 is 1089
 190
 191 or
 192
 193     % interp -e 'printf("%x", 3735928559)'
 194     deadbeef
 195
 196 You can also read and execute Perl statements from a file while in the
 197 midst of your C program, by placing the filename in I<argv[1]> before
 198 calling I<perl_parse()>, which is the call that reads and compiles the script.
 199
 200 =head2 Calling a Perl subroutine from your C program
 201
 202 To call individual Perl subroutines, you can use any of the B<perl_call_*>
 203 functions documented in the L<perlcall> manpage.
 204 In this example we'll use I<perl_call_argv>.
 205
 206 That's shown below, in a program I'll call I<showtime.c>.
 207
 208     #include <EXTERN.h>
 209     #include <perl.h>
 210
 211     static PerlInterpreter *my_perl;
 212
 213     int main(int argc, char **argv, char **env)
 214     {
 215         char *args[] = { NULL };
 216         my_perl = perl_alloc();
 217         perl_construct(my_perl);
 218
 219         perl_parse(my_perl, NULL, argc, argv, NULL);
 220
 221         /*** skipping perl_run() ***/
 222
 223         perl_call_argv("showtime", G_DISCARD | G_NOARGS, args);
 224
 225         perl_destruct(my_perl);
 226         perl_free(my_perl);
 227     }
 228
 229 where I<showtime> is a Perl subroutine that takes no arguments (that's the
 230 I<G_NOARGS>) and for which I'll ignore the return value (that's the
 231 I<G_DISCARD>).  Those flags, and others, are discussed in L<perlcall>.
 232
 233 I'll define the I<showtime> subroutine in a file called I<showtime.pl>:
 234
 235     print "I shan't be printed.";
 236
 237     sub showtime {
 238         print time;
 239     }
 240
 241 Simple enough.  Now compile and run:
 242
 243     % cc -o showtime showtime.c `perl -MExtUtils::Embed -e ccopts -e ldopts`
 244
 245     % showtime showtime.pl
 246     818284590
 247
 248 yielding the number of seconds that elapsed between January 1, 1970
 249 (the beginning of the Unix epoch), and the moment I began writing this
 250 sentence.
 251
 252 In this particular case we don't have to call I<perl_run>, but in
 253 general calling it is considered good practice: it ensures proper
 254 initialization of library code, including execution of all object
 255 C<DESTROY> methods and package C<END {}> blocks.
 256
 257 If you want to pass arguments to the Perl subroutine, you can add
 258 strings to the C<NULL>-terminated C<args> list passed to
 259 I<perl_call_argv>.  For other data types, or to examine return values,
 260 you'll need to manipulate the Perl stack.  That's demonstrated in a
 261 later section of this document: L<Fiddling with the Perl stack from
 262 your C program>.
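
The C<NULL> terminator is what lets the callee find the end of that
list.  Here is a minimal sketch in plain C (no Perl headers required)
that walks such a list.  The helper I<count_args()> and the two sample
arrays are my own illustrations, not part of the Perl API.

```c
#include <stddef.h>

/* count_args() is a hypothetical helper, not part of the Perl API.
 * It walks a NULL-terminated list of strings the way any consumer of
 * such a list must: keep going until the NULL sentinel turns up. */
int count_args(char **args)
{
    int n = 0;

    while (args[n] != NULL)
        n++;
    return n;
}

/* Lists shaped like the one handed to perl_call_argv(): */
char *no_args[]  = { NULL };            /* no arguments at all      */
char *two_args[] = { "3", "4", NULL };  /* two string arguments     */
```

Here I<count_args(no_args)> is 0 and I<count_args(two_args)> is 2.
Leave out the terminator and the walk runs off the end of the array.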
 263
 264 =head2 Evaluating a Perl statement from your C program
 265
 266 One way to evaluate pieces of Perl code is to use
 267 L<perlguts/perl_eval_sv()>.  We've wrapped this inside our own
 268 I<perl_eval()> function, which converts a command string to an SV,
 269 passing this and the L<perlcall/G_DISCARD> flag to
 270 L<perlguts/perl_eval_sv()>.
 271
 272 Arguably, this is the only routine you'll ever need to execute
 273 snippets of Perl code from within your C program.  Your string can be
 274 as long as you wish; it can contain multiple statements; it can employ
 275 L<perlfunc/use>, L<perlfunc/require> and L<perlfunc/do> to include
 276 external Perl files.
 277
 278 Our I<perl_eval()> lets us evaluate individual Perl strings, and then
 279 extract variables for coercion into C types.  The following program,
 280 I<string.c>, executes three Perl strings, extracting an C<int> from
 281 the first, a C<float> from the second, and a C<char *> from the third.
 282
 283    #include <EXTERN.h>
 284    #include <perl.h>
 285
 286    static PerlInterpreter *my_perl;
 287
 288    I32 perl_eval(char *string)
 289    {
 290      return perl_eval_sv(newSVpv(string,0), G_DISCARD);
 291    }
 292
 293    int main (int argc, char **argv, char **env)
 294    {
 295      char *embedding[] = { "", "-e", "0" };
 296      STRLEN length;
 297
 298      my_perl = perl_alloc();
 299      perl_construct( my_perl );
 300
 301      perl_parse(my_perl, NULL, 3, embedding, NULL);
 302      perl_run(my_perl);
 303                                        /** Treat $a as an integer **/
 304      perl_eval("$a = 3; $a **= 2");
 305      printf("a = %d\n", SvIV(perl_get_sv("a", FALSE)));
 306
 307                                        /** Treat $a as a float **/
 308      perl_eval("$a = 3.14; $a **= 2");
 309      printf("a = %f\n", SvNV(perl_get_sv("a", FALSE)));
 310
 311                                        /** Treat $a as a string **/
 312      perl_eval("$a = 'rekcaH lreP rehtonA tsuJ'; $a = reverse($a); ");
 313      printf("a = %s\n", SvPV(perl_get_sv("a", FALSE), length));
 314
 315      perl_destruct(my_perl);
 316      perl_free(my_perl);
 317    }
 318
 319 All of those strange functions with I<sv> in their names help convert Perl scalars to C types.  They're described in L<perlguts>.
 320
 321 If you compile and run I<string.c>, you'll see the results of using
 322 I<SvIV()> to create an C<int>, I<SvNV()> to create a C<float>, and
 323 I<SvPV()> to create a string:
 324
 325    a = 9
 326    a = 9.859600
 327    a = Just Another Perl Hacker
 328
 329 In the example above, we've created a global variable to temporarily
 330 store the computed value of our eval'd expression.  It is also
 331 possible and in most cases a better strategy to fetch the return value
 332 from I<perl_eval_sv()> instead.  Example:
 333
 334    SV *perl_eval(char *string, int croak_on_error)
 335    {
 336        dSP;
 337        SV *sv = newSVpv(string,0);
 338
 339        PUSHMARK(sp);
 340        perl_eval_sv(sv, G_SCALAR);
 341        SvREFCNT_dec(sv);
 342
 343        SPAGAIN;
 344        sv = POPs;
 345        PUTBACK;
 346
 347        if (croak_on_error && SvTRUE(GvSV(errgv)))
 348              croak(SvPV(GvSV(errgv),na));
 349
 350        return sv;
 351    }
 352    ...
 353    SV *val = perl_eval("reverse 'rekcaH lreP rehtonA tsuJ'", TRUE);
 354    printf("%s\n", SvPV(val,na));
 355    ...
 356
 357 This way, we avoid namespace pollution by not creating global
 358 variables and we've simplified our code as well.
 359
 360 =head2 Performing Perl pattern matches and substitutions from your C program
 361
 362 Our I<perl_eval()> lets us evaluate strings of Perl code, so we can
 363 define some functions that use it to "specialize" in matches and
 364 substitutions: I<match()>, I<substitute()>, and I<matches()>.
 365
 366    char match(char *string, char *pattern);
 367
 368 Given a string and a pattern (e.g., C<m/clasp/> or C</\b\w*\b/>, which
 369 in your C program might appear as "/\\b\\w*\\b/"), match()
 370 returns 1 if the string matches the pattern and 0 otherwise.
 371
 372    int substitute(char *string[], char *pattern);
 373
 374 Given a pointer to a string and an C<=~> operation (e.g.,
 375 C<s/bob/robert/g> or C<tr[A-Z][a-z]>), substitute() modifies the string
 376 according to the operation, returning the number of substitutions
 377 made.
 378
 379    int matches(char *string, char *pattern, char **matches[]);
 380
 381 Given a string, a pattern, and a pointer to an empty array of strings,
 382 matches() evaluates C<$string =~ $pattern> in an array context, and
 383 fills in I<matches> with the array elements (allocating memory as it
 384 does so), returning the number of matches found.
 385
 386 Here's a sample program, I<match.c>, that uses all three (long lines have
 387 been wrapped here):
 388
 389    #include <EXTERN.h>
 390    #include <perl.h>
 391
 392    static PerlInterpreter *my_perl;
 393    I32 perl_eval(char *string)
 394    {
 395       return perl_eval_sv(newSVpv(string,0), G_DISCARD);
 396    }
 397    /** match(string, pattern)
 398    **
 399    ** Used for matches in a scalar context.
 400    **
 401    ** Returns 1 if the match was successful; 0 otherwise.
 402    **/
 403    char match(char *string, char *pattern)
 404    {
 405      char *command;
 406      command = malloc(sizeof(char) * strlen(string) + strlen(pattern) + 37);
 407      sprintf(command, "$string = '%s'; $return = $string =~ %s",
 408                       string, pattern);
 409      perl_eval(command);
 410      free(command);
 411      return SvIV(perl_get_sv("return", FALSE));
 412    }
 413    /** substitute(string, pattern)
 414    **
 415    ** Used for =~ operations that modify their left-hand side (s/// and tr///)
 416    **
 417    ** Returns the number of successful matches, and
 418    ** modifies the input string if there were any.
 419    **/
 420    int substitute(char *string[], char *pattern)
 421    {
 422      char *command;
 423      STRLEN length;
 424      command = malloc(sizeof(char) * strlen(*string) + strlen(pattern) + 35);
 425      sprintf(command, "$string = '%s'; $ret = ($string =~ %s)",
 426                       *string, pattern);
 427      perl_eval(command);
 428      free(command);
 429      *string = SvPV(perl_get_sv("string", FALSE), length);
 430      return SvIV(perl_get_sv("ret", FALSE));
 431    }
 432    /** matches(string, pattern, matches)
 433    **
 434    ** Used for matches in an array context.
 435    **
 436    ** Returns the number of matches,
 437    ** and fills in **matches with the matching substrings (allocates memory!)
 438    **/
 439    int matches(char *string, char *pattern, char **match_list[])
 440    {
 441      char *command;
 442      SV *current_match;
 443      AV *array;
 444      I32 num_matches;
 445      STRLEN length;
 446      int i;
 447      command = malloc(sizeof(char) * strlen(string) + strlen(pattern) + 38);
 448      sprintf(command, "$string = '%s'; @array = ($string =~ %s)",
 449                       string, pattern);
 450      perl_eval(command);
 451      free(command);
 452      array = perl_get_av("array", FALSE);
 453      num_matches = av_len(array) + 1; /** assume $[ is 0 **/
 454      *match_list = (char **) malloc(sizeof(char *) * num_matches);
 455      for (i = 0; i < num_matches; i++) {
 456        current_match = av_shift(array);
 457        (*match_list)[i] = SvPV(current_match, length);
 458      }
 459      return num_matches;
 460    }
 461    int main (int argc, char **argv, char **env)
 462    {
 463      char *embedding[] = { "", "-e", "0" };
 464      char *text, **match_list;
 465      int num_matches, i;
 466      int j;
 467      my_perl = perl_alloc();
 468      perl_construct( my_perl );
 469      perl_parse(my_perl, NULL, 3, embedding, NULL);
 470      perl_run(my_perl);
 471
 472      text = (char *) malloc(sizeof(char) * 486); /** A long string follows! **/
 473      sprintf(text, "%s", "When he is at a convenience store and the bill \
 474      comes to some amount like 76 cents, Maynard is aware that there is \
 475      something he *should* do, something that will enable him to get back \
 476      a quarter, but he has no idea *what*.  He fumbles through his red \
 477      squeezey changepurse and gives the boy three extra pennies with his \
 478      dollar, hoping that he might luck into the correct amount.  The boy \
 479      gives him back two of his own pennies and then the big shiny quarter \
 480      that is his prize. -RICHH");
 481      if (match(text, "m/quarter/")) /** Does text contain 'quarter'? **/
 482        printf("match: Text contains the word 'quarter'.\n\n");
 483      else
 484        printf("match: Text doesn't contain the word 'quarter'.\n\n");
 485      if (match(text, "m/eighth/")) /** Does text contain 'eighth'? **/
 486        printf("match: Text contains the word 'eighth'.\n\n");
 487      else
 488        printf("match: Text doesn't contain the word 'eighth'.\n\n");
 489      /** Match all occurrences of /wi../ **/
 490      num_matches = matches(text, "m/(wi..)/g", &match_list);
 491      printf("matches: m/(wi..)/g found %d matches...\n", num_matches);
 492      for (i = 0; i < num_matches; i++)
 493        printf("match: %s\n", match_list[i]);
 494      printf("\n");
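 495      /** The strings in match_list live inside Perl's own scalars **/
 496      /** (SvPV only hands back a pointer), so we free just the     **/
 497      /** array of pointers that matches() malloc()ed.              **/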
 498      free(match_list);
 499      /** Remove all vowels from text **/
 500      num_matches = substitute(&text, "s/[aeiou]//gi");
 501      if (num_matches) {
 502        printf("substitute: s/[aeiou]//gi...%d substitutions made.\n",
 503               num_matches);
 504        printf("Now text is: %s\n\n", text);
 505      }
 506      /** Attempt a substitution **/
 507      if (!substitute(&text, "s/Perl/C/")) {
 508        printf("substitute: s/Perl/C...No substitution made.\n\n");
 509      }
 510      /** text now points into Perl's $string; it is not ours to free() **/
 511      perl_destruct(my_perl);
 512      perl_free(my_perl);
 513    }
 514
 515 which produces the output (again, long lines have been wrapped here)
 516
 517    match: Text contains the word 'quarter'.
 518
 519    match: Text doesn't contain the word 'eighth'.
 520
 521    matches: m/(wi..)/g found 2 matches...
 522    match: will
 523    match: with
 524
 525    substitute: s/[aeiou]//gi...139 substitutions made.
 526    Now text is: Whn h s t  cnvnnc str nd th bll cms t sm mnt lk 76 cnts,
 527    Mynrd s wr tht thr s smthng h *shld* d, smthng tht wll nbl hm t gt bck
 528    qrtr, bt h hs n d *wht*.  H fmbls thrgh hs rd sqzy chngprs nd gvs th by
 529    thr xtr pnns wth hs dllr, hpng tht h mght lck nt th crrct mnt.  Th by gvs
 530    hm bck tw f hs wn pnns nd thn th bg shny qrtr tht s hs prz. -RCHH
 531
 532    substitute: s/Perl/C...No substitution made.
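
One caveat about the sample program: I<match()>, I<substitute()> and
I<matches()> paste the text between single quotes in a string of Perl
code, so a text containing an apostrophe or a backslash would yield
broken (or unintended) Perl.  Here is a hedged sketch, in plain C, of
one way around that.  I<perl_single_quote()> and
I<build_match_command()> are hypothetical helpers of my own, not
functions from the perl library.  The second one also asks
I<snprintf()> to measure the command instead of counting characters by
hand.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Same command layout that match() uses in match.c. */
#define MATCH_FMT "$string = '%s'; $return = $string =~ %s"

/* perl_single_quote() is a hypothetical helper, not a Perl API call.
 * It returns a malloc()ed copy of text in which every backslash and
 * single quote is preceded by a backslash, so the result can sit
 * between '...' in a string of Perl code.  Returns NULL if malloc()
 * fails; the caller free()s the result. */
char *perl_single_quote(const char *text)
{
    size_t i, j = 0;
    /* Worst case: every character needs a backslash, plus the NUL. */
    char *quoted = malloc(2 * strlen(text) + 1);

    if (quoted == NULL)
        return NULL;
    for (i = 0; text[i] != '\0'; i++) {
        if (text[i] == '\\' || text[i] == '\'')
            quoted[j++] = '\\';
        quoted[j++] = text[i];
    }
    quoted[j] = '\0';
    return quoted;
}

/* build_match_command() is likewise hypothetical.  It assembles the
 * command string that match() builds, but quotes the text first and
 * lets snprintf() report how large the buffer has to be. */
char *build_match_command(const char *text, const char *pattern)
{
    char *quoted = perl_single_quote(text);
    char *command = NULL;
    int needed;

    if (quoted == NULL)
        return NULL;
    /* With a NULL buffer of size 0, snprintf() only measures. */
    needed = snprintf(NULL, 0, MATCH_FMT, quoted, pattern);
    if (needed >= 0 && (command = malloc((size_t)needed + 1)) != NULL)
        snprintf(command, (size_t)needed + 1, MATCH_FMT, quoted, pattern);
    free(quoted);
    return command;
}
```

The string returned by I<build_match_command()> could be handed to
I<perl_eval()> in place of the command that I<match()> assembles, and
should be free()d afterwards.  The pattern is passed through untouched,
because it is meant to be Perl code.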
 533
 534 =head2 Fiddling with the Perl stack from your C program
 535
 536 When trying to explain stacks, most computer science textbooks mumble
 537 something about spring-loaded columns of cafeteria plates: the last
 538 thing you pushed on the stack is the first thing you pop off.  That'll
 539 do for our purposes: your C program will push some arguments onto "the Perl
 540 stack", shut its eyes while some magic happens, and then pop the
 541 results--the return value of your Perl subroutine--off the stack.
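
To make the cafeteria plates concrete, here is a toy model in plain C.
It is only a sketch: Perl's real stack holds scalars and is driven by
the macros described in L<perlcall>, and every name below
(I<toy_push()>, I<toy_pushmark()>, I<toy_expo()> and so on) is made up
for this illustration.  The caller marks the stack, pushes two
arguments, lets a pretend subroutine do its magic, and pops the single
result.

```c
#include <assert.h>

#define TOY_STACK_MAX 16

static long toy_stack[TOY_STACK_MAX];
static int  toy_top  = 0;   /* number of items currently on the stack */
static int  toy_mark = 0;   /* where the current argument list begins */

static void toy_push(long value)
{
    assert(toy_top < TOY_STACK_MAX);
    toy_stack[toy_top++] = value;
}

static long toy_pop(void)
{
    assert(toy_top > 0);
    return toy_stack[--toy_top];
}

/* Remember where the arguments start (loosely, what PUSHMARK is for). */
static void toy_pushmark(void)
{
    toy_mark = toy_top;
}

/* The "subroutine": the last argument pushed is the first one popped.
 * It consumes its two arguments and leaves one return value behind. */
static void toy_expo(void)
{
    long exponent = toy_pop();
    long base     = toy_pop();
    long result   = 1;

    assert(toy_top == toy_mark);   /* exactly the marked args are gone */
    while (exponent-- > 0)
        result *= base;
    toy_push(result);
}

/* The "caller": push the arguments, call, pop the result. */
long toy_power(long base, long exponent)
{
    toy_pushmark();
    toy_push(base);
    toy_push(exponent);
    toy_expo();
    return toy_pop();
}
```

Calling I<toy_power(3, 4)> leaves the stack as empty as it found it and
returns 81.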
 542
 543 First you'll need to know how to convert between C types and Perl
 544 types, with newSViv() and sv_setnv() and newAV() and all their
 545 friends.  They're described in L<perlguts>.
 546
 547 Then you'll need to know how to manipulate the Perl stack.  That's
 548 described in L<perlcall>.
 549
 550 Once you've understood those, embedding Perl in C is easy.
 551
 552 Because C has no builtin function for integer exponentiation, let's
 553 make Perl's ** operator available to it (this is less useful than it
 554 sounds, because Perl implements ** with C's I<pow()> function).  First
 555 I'll create a stub exponentiation function in I<power.pl>:
 556
 557     sub expo {
 558         my ($a, $b) = @_;
 559         return $a ** $b;
 560     }
 561
 562 Now I'll create a C program, I<power.c>, with a function
 563 I<PerlPower()> that contains all the perlguts necessary to push the
 564 two arguments into I<expo()> and to pop the return value out.  Take a
 565 deep breath...
 566
 567     #include <EXTERN.h>
 568     #include <perl.h>
 569
 570     static PerlInterpreter *my_perl;
 571
 572     static void
 573     PerlPower(int a, int b)
 574     {
 575       dSP;                            /* initialize stack pointer      */
 576       ENTER;                          /* everything created after here */
 577       SAVETMPS;                       /* ...is a temporary variable.   */
 578       PUSHMARK(sp);                   /* remember the stack pointer    */
 579       XPUSHs(sv_2mortal(newSViv(a))); /* push the base onto the stack  */
 580       XPUSHs(sv_2mortal(newSViv(b))); /* push the exponent onto stack  */
 581       PUTBACK;                      /* make local stack pointer global */
 582       perl_call_pv("expo", G_SCALAR); /* call the function             */
 583       SPAGAIN;                        /* refresh stack pointer         */
 584                                     /* pop the return value from stack */
 585       printf ("%d to the %dth power is %d.\n", a, b, POPi);
 586       PUTBACK;
 587       FREETMPS;                       /* free that return value        */
 588       LEAVE;                       /* ...and the XPUSHed "mortal" args.*/
 589     }
 590
 591     int main (int argc, char **argv, char **env)
 592     {
 593       char *my_argv[2];
 594
 595       my_perl = perl_alloc();
 596       perl_construct( my_perl );
 597
 598       my_argv[0] = "";                /* stands in for the program name */
 599       my_argv[1] = "power.pl";        /* the script to parse            */
 600
 601       perl_parse(my_perl, NULL, 2, my_argv, NULL);
 602       perl_run(my_perl);
 603
 604       PerlPower(3, 4);                      /*** Compute 3 ** 4 ***/
 605
 606       perl_destruct(my_perl);
 607       perl_free(my_perl);
 608     }
 609
 610
 611
 612 Compile and run:
 613
 614     % cc -o power power.c `perl -MExtUtils::Embed -e ccopts -e ldopts`
 615
 616     % power
 617     3 to the 4th power is 81.
 618
 619 =head2 Maintaining a persistent interpreter
 620
 621 When developing interactive and/or potentially long-running
 622 applications, it's a good idea to maintain a persistent interpreter
 623 rather than allocating and constructing a new interpreter multiple
 624 times.  The major reason is speed, since Perl will be loaded into
 625 memory only once.
 626
 627 However, you have to be more cautious with namespace and variable
 628 scoping when using a persistent interpreter.  In previous examples
 629 we've been using global variables in the default package C<main>.  We
 630 knew exactly what code would be run, and assumed we could avoid
 631 variable collisions and outrageous symbol table growth.
 632
 633 Let's say your application is a server that will occasionally run Perl
 634 code from some arbitrary file.  Your server has no way of knowing what
 635 code it's going to run.  Very dangerous.
 636
 637 If the file is pulled in by C<perl_parse()>, compiled into a newly
 638 constructed interpreter, and subsequently cleaned out with
 639 C<perl_destruct()>, you're shielded from most namespace
 640 troubles.
 641
 642 One way to avoid namespace collisions in this scenario is to translate
 643 the filename into a guaranteed-unique package name, and then compile
 644 the code into that package using L<perlfunc/eval>.  In the example
 645 below, each file will only be compiled once.  Or, the application
 646 might choose to clean out the symbol table associated with the file
 647 after it's no longer needed.  Using L<perlcall/perl_call_argv>, we'll
 648 call the subroutine C<Embed::Persistent::eval_file> which lives in the
 649 file C<persistent.pl> and pass the filename and boolean cleanup/cache
 650 flag as arguments.
 651
 652 Note that the process will continue to grow for each file that it
 653 uses.  In addition, there might be C<AUTOLOAD>ed subroutines and other
 654 conditions that cause Perl's symbol table to grow.  You might want to
 655 add some logic that keeps track of the process size, or restarts
 656 itself after a certain number of requests, to ensure that memory
 657 consumption is minimized.  You'll also want to scope your variables
 658 with L<perlfunc/my> whenever possible.
 659
 660
 661  package Embed::Persistent;
 662  #persistent.pl
 663
 664  use strict;
 665  use vars '%Cache';
 666
 667  sub valid_package_name {
 668      my($string) = @_;
 669      $string =~ s/([^A-Za-z0-9\/])/sprintf("_%2x",unpack("C",$1))/eg;
 670      # second pass only for words starting with a digit
 671      $string =~ s|/(\d)|sprintf("/_%2x",unpack("C",$1))|eg;
 672
 673      # Dress it up as a real package name
 674      $string =~ s|/|::|g;
 675      return "Embed" . $string;
 676  }
 677
 678  #borrowed from Safe.pm
 679  sub delete_package {
 680      my $pkg = shift;
 681      my ($stem, $leaf);
 682
 683      no strict 'refs';
 684      $pkg = "main::$pkg\::";    # expand to full symbol table name
 685      ($stem, $leaf) = $pkg =~ m/(.*::)(\w+::)$/;
 686
 687      my $stem_symtab = *{$stem}{HASH};
 688
 689      delete $stem_symtab->{$leaf};
 690  }
 691
 692  sub eval_file {
 693      my($filename, $delete) = @_;
 694      my $package = valid_package_name($filename);
 695      my $mtime = -M $filename;
 696      if(defined $Cache{$package}{mtime}
 697         &&
 698         $Cache{$package}{mtime} <= $mtime)
 699      {
 700         # we have compiled this subroutine already,
 701         # it has not been updated on disk, nothing left to do
 702         print STDERR "already compiled $package->handler\n";
 703      }
 704      else {
 705         local *FH;
 706         open FH, $filename or die "open '$filename' $!";
 707         local($/) = undef;
 708         my $sub = <FH>;
 709         close FH;
 710
 711         #wrap the code into a subroutine inside our unique package
 712         my $eval = qq{package $package; sub handler { $sub; }};
 713         {
 714             # hide our variables within this block
 715             my($filename,$mtime,$package,$sub);
 716             eval $eval;
 717         }
 718         die $@ if $@;
 719
 720         #cache it unless we're cleaning out each time
 721         $Cache{$package}{mtime} = $mtime unless $delete;
 722      }
 723
 724      eval {$package->handler;};
 725      die $@ if $@;
 726
 727      delete_package($package) if $delete;
 728
 729      #take a look if you want
 730      #print Devel::Symdump->rnew($package)->as_string, $/;
 731  }
 732
 733  1;
 734
 735  __END__
 736
 737  /* persistent.c */
 738  #include <EXTERN.h>
 739  #include <perl.h>
 740
 741  /* "1" = clean out filename's symbol table after each request, "0" = don't */
 742  #ifndef DO_CLEAN
 743  #define DO_CLEAN "0"   /* a string, because it is passed along in args[] */
 744  #endif
 745
 746  static PerlInterpreter *perl = NULL;
 747
 748  int
 749  main(int argc, char **argv, char **env)
 750  {
 751      char *embedding[] = { "", "persistent.pl" };
 752      char *args[] = { "", DO_CLEAN, NULL };
 753      char filename [1024];
 754      int exitstatus = 0;
 755
 756      if((perl = perl_alloc()) == NULL) {
 757         fprintf(stderr, "no memory!");
 758         exit(1);
 759      }
 760      perl_construct(perl);
 761
 762      exitstatus = perl_parse(perl, NULL, 2, embedding, NULL);
 763
 764      if(!exitstatus) {
 765         exitstatus = perl_run(perl);
 766
 767         while(printf("Enter file name: ") && fgets(filename, sizeof filename, stdin)) {
 768             filename[strcspn(filename, "\n")] = '\0';   /* drop the newline */
 769             /* call the subroutine, passing it the filename as an argument */
 770             args[0] = filename;
 771             perl_call_argv("Embed::Persistent::eval_file",
 772                            G_DISCARD | G_EVAL, args);
 773
 774             /* check $@ */
 775             if(SvTRUE(GvSV(errgv)))
 776                 fprintf(stderr, "eval error: %s\n", SvPV(GvSV(errgv),na));
 777         }
 778      }
 779
 780      perl_destruct_level = 0;
 781      perl_destruct(perl);
 782      perl_free(perl);
 783      exit(exitstatus);
 784  }
 785
 786 Now compile:
 787
 788  % cc -o persistent persistent.c `perl -MExtUtils::Embed -e ccopts -e ldopts`
 789
 790 Here's an example script file:
 791
 792  #test.pl
 793  my $string = "hello";
 794  foo($string);
 795
 796  sub foo {
 797      print "foo says: @_\n";
 798  }
 799
 800 Now run:
 801
 802  % persistent
 803  Enter file name: test.pl
 804  foo says: hello
 805  Enter file name: test.pl
 806  already compiled Embedtest_2epl->handler
 807  foo says: hello
 808  Enter file name: ^C
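
If you'd like to predict, from the C side, which package a given file
will be compiled into, the mapping done by I<valid_package_name()> can
be mirrored in plain C.  This is only a sketch under the assumption
that filenames are plain byte strings; I<mangle_package_name()> and its
two small helpers are hypothetical names of my own and appear nowhere
in I<persistent.pl> or the perl library.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* ASCII-only tests, matching the character classes in the Perl regexes. */
static int is_ascii_digit(unsigned char c)
{
    return c >= '0' && c <= '9';
}

static int is_ascii_alnum(unsigned char c)
{
    return is_ascii_digit(c)
        || (c >= 'A' && c <= 'Z')
        || (c >= 'a' && c <= 'z');
}

/* mangle_package_name() is a hypothetical C mirror of the Perl sub
 * valid_package_name().  It returns a malloc()ed string (or NULL if
 * malloc() fails) that the caller must free(). */
char *mangle_package_name(const char *filename)
{
    size_t len = strlen(filename), i;
    /* "Embed", at most three output bytes per input byte, and the NUL. */
    char *out = malloc(5 + 3 * len + 1);
    char *p = out;

    if (out == NULL)
        return NULL;
    p += sprintf(p, "Embed");
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)filename[i];

        if (c == '/') {
            p += sprintf(p, "::");            /* s|/|::|g */
            /* The "second pass": a digit right after a slash is encoded. */
            if (is_ascii_digit((unsigned char)filename[i + 1])) {
                i++;
                p += sprintf(p, "_%2x", (unsigned)(unsigned char)filename[i]);
            }
        } else if (is_ascii_alnum(c)) {
            *p++ = (char)c;                   /* kept as it is */
        } else {
            p += sprintf(p, "_%2x", (unsigned)c);   /* '.' becomes "_2e" */
        }
    }
    *p = '\0';
    return out;
}
```

With this sketch, I</tmp/test.pl> maps to C<Embed::tmp::test_2epl>, and
a path component that starts with a digit, as in I</9lives>, maps to
C<Embed::_39lives>.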
 809
 810 =head2 Maintaining multiple interpreter instances
 811
 812 Some rare applications will need to create more than one interpreter
 813 during a session.  Such an application might sporadically decide to
 814 release any resources associated with the interpreter.
 815
 816 The program must take care to ensure that this takes place I<before>
 817 the next interpreter is constructed.  By default, the global variable
 818 C<perl_destruct_level> is set to C<0>, since extra cleaning isn't
 819 needed when a program has only one interpreter.
 820
 821 Setting C<perl_destruct_level> to C<1> makes everything squeaky clean:
 822
 823  perl_destruct_level = 1;
 824
 825  while(1) {
 826      ...
 827      /* reset global variables here with perl_destruct_level = 1 */
 828      perl_construct(my_perl);
 829      ...
 830      /* clean and reset _everything_ during perl_destruct */
 831      perl_destruct(my_perl);
 832      perl_free(my_perl);
 833      ...
 834      /* let's go do it again! */
 835  }
 836
 837 When I<perl_destruct()> is called, the interpreter's syntax parse tree
 838 and symbol tables are cleaned up, and global variables are reset.
 839
 840 Now suppose we have more than one interpreter instance running at the
 841 same time.  This is feasible, but only if you used the
 842 C<-DMULTIPLICITY> flag when building Perl.  By default, that sets
 843 C<perl_destruct_level> to C<1>.
 844
 845 Let's give it a try:
 846
 847
 #include <EXTERN.h>
 #include <perl.h>

 /* we're going to embed two interpreters */

 #define SAY_HELLO "-e", "print qq(Hi, I'm $^X\n)"

 int main(int argc, char **argv, char **env)
 {
     PerlInterpreter
         *one_perl = perl_alloc(),
         *two_perl = perl_alloc();
     char *one_args[] = { "one_perl", SAY_HELLO };
     char *two_args[] = { "two_perl", SAY_HELLO };

     perl_construct(one_perl);
     perl_construct(two_perl);

     perl_parse(one_perl, NULL, 3, one_args, (char **)NULL);
     perl_parse(two_perl, NULL, 3, two_args, (char **)NULL);

     perl_run(one_perl);
     perl_run(two_perl);

     perl_destruct(one_perl);
     perl_destruct(two_perl);

     perl_free(one_perl);
     perl_free(two_perl);

     return 0;
 }

Compile as usual:

 % cc -o multiplicity multiplicity.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

Run it, Run it:

 % multiplicity
 Hi, I'm one_perl
 Hi, I'm two_perl

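The C<3> passed to I<perl_parse()> above is simply the number of
elements in each argument array: the interpreter name plus the two
strings that C<SAY_HELLO> expands to.  Here is a small sketch, assuming
you'd rather let the compiler count than hard-code the number.  The
names C<ARRAY_LEN>, C<one_argc>, and C<two_argc> are invented for this
illustration.

```c
/* Same macro as in multiplicity.c: it expands to TWO array elements. */
#define SAY_HELLO "-e", "print qq(Hi, I'm $^X\n)"

/* Hypothetical helper macro: element count of a true array (not a pointer). */
#define ARRAY_LEN(a) ((int)(sizeof(a) / sizeof((a)[0])))

static char *one_args[] = { "one_perl", SAY_HELLO };
static char *two_args[] = { "two_perl", SAY_HELLO };

/* Counts computed at compile time; usable as the argc given to perl_parse(). */
static const int one_argc = ARRAY_LEN(one_args);
static const int two_argc = ARRAY_LEN(two_args);
```

You could then write
C<perl_parse(one_perl, NULL, one_argc, one_args, (char **)NULL)>, and
the count stays right if you add more switches to the array.
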
=head2 Using Perl modules, which themselves use C libraries, from your C program

If you've played with the examples above and tried to embed a script
that I<use()>s a Perl module (such as I<Socket>) which itself uses a C
or C++ library, this probably happened:

 Can't load module Socket, dynamic loading not available in this perl.
  (You may need to build a new perl executable which either supports
  dynamic loading or has the Socket module statically linked into it.)

What's wrong?

Your interpreter doesn't know how to communicate with these extensions
on its own.  A little glue will help.  Up until now you've been
calling I<perl_parse()>, handing it NULL for the second argument:

 perl_parse(my_perl, NULL, argc, my_argv, NULL);

That's where the glue code can be inserted to create the initial
contact between Perl and linked C/C++ routines.  Let's take a look at
some pieces of I<perlmain.c> to see how Perl does this:

 #ifdef __cplusplus
 #  define EXTERN_C extern "C"
 #else
 #  define EXTERN_C extern
 #endif

 static void xs_init _((void));

 EXTERN_C void boot_DynaLoader _((CV* cv));
 EXTERN_C void boot_Socket _((CV* cv));

 EXTERN_C void
 xs_init()
 {
     char *file = __FILE__;
     /* DynaLoader is a special case */
     newXS("DynaLoader::boot_DynaLoader", boot_DynaLoader, file);
     newXS("Socket::bootstrap", boot_Socket, file);
 }

Simply put: for each extension linked with your Perl executable
(determined during its initial configuration on your computer or when
adding a new extension), a Perl subroutine is created to incorporate
the extension's routines.  Normally, that subroutine is named
I<Module::bootstrap()> and is invoked when you say I<use Module>.  In
turn, this hooks into an XSUB, I<boot_Module>, which creates a Perl
counterpart for each of the extension's XSUBs.  Don't worry about this
part; leave that to the I<xsubpp> and extension authors.  If your
extension is dynamically loaded, DynaLoader creates
I<Module::bootstrap()> for you on the fly.  In fact, if you have a
working DynaLoader then there is rarely any need to link in any other
extensions statically.

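If the registration step feels abstract, the following toy model may
help.  It is plain C with no Perl in it: a table that binds a
Perl-visible name to a C function, filled in once at startup, much as
I<xs_init()> does with I<newXS()>.  Every identifier starting with
C<toy_> is invented for this sketch, and the real interpreter keeps
such bindings in its symbol tables rather than in a fixed array.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-ins, NOT the Perl API: a boot routine type and a name table. */
typedef void (*toy_boot_fn)(void);

struct toy_entry {
    const char *perl_name;   /* e.g. "Socket::bootstrap" */
    toy_boot_fn boot;        /* C routine to run for that name */
};

#define TOY_MAX 16
static struct toy_entry toy_table[TOY_MAX];
static size_t toy_count = 0;

/* Toy analogue of newXS(): bind a name to a C function.
   Returns 0 on success, -1 if the table is full. */
static int toy_new_xs(const char *perl_name, toy_boot_fn boot)
{
    if (toy_count >= TOY_MAX)
        return -1;
    toy_table[toy_count].perl_name = perl_name;
    toy_table[toy_count].boot = boot;
    toy_count++;
    return 0;
}

/* Toy analogue of invoking a registered name.
   Returns 0 if the name was found and called, -1 otherwise. */
static int toy_call(const char *perl_name)
{
    size_t i;
    for (i = 0; i < toy_count; i++) {
        if (strcmp(toy_table[i].perl_name, perl_name) == 0) {
            toy_table[i].boot();
            return 0;
        }
    }
    return -1;
}

/* Stand-in for an extension's boot routine; it only records that it ran. */
static int toy_socket_booted = 0;
static void toy_boot_socket(void) { toy_socket_booted = 1; }

/* Toy analogue of xs_init(): register the statically linked extensions. */
static void toy_xs_init(void)
{
    toy_new_xs("Socket::bootstrap", toy_boot_socket);
}
```

After C<toy_xs_init()> has run, C<toy_call("Socket::bootstrap")> finds
the entry and runs the boot routine, while a name that was never
registered returns -1.  That second case is the toy equivalent of the
"Can't load module" error you get when NULL is passed instead of
I<xs_init>.
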
Once you have this code, slap it into the second argument of I<perl_parse()>:

 perl_parse(my_perl, xs_init, argc, my_argv, NULL);

Then compile:

 % cc -o interp interp.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

 % interp
   use Socket;
   use SomeDynamicallyLoadedModule;

   print "Now I can use extensions!\n"

B<ExtUtils::Embed> can also automate writing the I<xs_init> glue code.

 % perl -MExtUtils::Embed -e xsinit -- -o perlxsi.c
 % cc -c perlxsi.c `perl -MExtUtils::Embed -e ccopts`
 % cc -c interp.c  `perl -MExtUtils::Embed -e ccopts`
 % cc -o interp perlxsi.o interp.o `perl -MExtUtils::Embed -e ldopts`

Consult L<perlxs> and L<perlguts> for more details.

=head1 MORAL

You can sometimes I<write faster code> in C, but
you can always I<write code faster> in Perl.  Because you can use
each from the other, combine them as you wish.

=head1 AUTHOR

Jon Orwant <F<orwant@tpj.com>> and Doug MacEachern <F<dougm@osf.org>>,
with small contributions from Tim Bunce, Tom Christiansen, Hallvard Furuseth,
Dov Grobgeld, and Ilya Zakharevich.

Check out Doug's article on embedding in Volume 1, Issue 4 of The Perl
Journal.  Info about TPJ is available from http://tpj.com.

February 1, 1997

Some of this material is excerpted from Jon Orwant's book: I<Perl 5
Interactive>, Waite Group Press, 1996 (ISBN 1-57169-064-6) and appears
courtesy of Waite Group Press.

=head1 COPYRIGHT

Copyright (C) 1995, 1996, 1997 Doug MacEachern and Jon Orwant.  All
Rights Reserved.

Although destined for release with the standard Perl distribution,
this document is not public domain, nor is any of Perl and its
documentation.  Permission is granted to freely distribute verbatim
copies of this document provided that no modifications outside of
formatting be made, and that this notice remain intact.  You are
permitted and encouraged to use its code and derivatives thereof in
your own source code for fun or for profit as you see fit.