pod/perlguts.pod

   1 =head1 NAME
   2
   3 perlguts - Perl's Internal Functions
   4
   5 =head1 DESCRIPTION
   6
   7 This document attempts to describe some of the internal functions of the
   8 Perl executable.  It is far from complete and probably contains many errors.
   9 Please refer any questions or comments to the author below.
  10
  11 =head1 Datatypes
  12
  13 Perl has three typedefs that handle Perl's three main data types:
  14
  15     SV  Scalar Value
  16     AV  Array Value
  17     HV  Hash Value
  18
  19 Each typedef has specific routines that manipulate the various data types.
  20
  21 =head2 What is an "IV"?
  22
  23 Perl uses a special typedef IV which is large enough to hold either an
  24 integer or a pointer.
  25
  26 Perl also uses a special typedef I32 which will always be a 32-bit integer.
  27
  28 =head2 Working with SV's
  29
  30 An SV can be created and loaded with one command.  There are four types of
  31 values that can be loaded: an integer value (IV), a double (NV), a string
  32 (PV), and another scalar (SV).
  33
  34 The four routines are:
  35
  36     SV*  newSViv(IV);
  37     SV*  newSVnv(double);
  38     SV*  newSVpv(char*, int);
  39     SV*  newSVsv(SV*);
  40
  41 To change the value of an I<already-existing> scalar, there are five routines:
  42
  43     void  sv_setiv(SV*, IV);
  44     void  sv_setnv(SV*, double);
  45     void  sv_setpvn(SV*, char*, int);
  46     void  sv_setpv(SV*, char*);
  47     void  sv_setsv(SV*, SV*);
  48
  49 Notice that you can choose to specify the length of the string to be
  50 assigned by using C<sv_setpvn>, or allow Perl to calculate the length by
  51 using C<sv_setpv>.  Be warned, though, that C<sv_setpv> determines the
  52 string's length by using C<strlen>, which depends on the string terminating
  53 with a NUL character.
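
The difference between the two routines can be modelled outside Perl.  The
sketch below is not Perl's implementation: C<toy_str>, C<toy_setpvn> and
C<toy_setpv> are hypothetical names invented for illustration, and the only
point is that a length taken from C<strlen> stops at the first NUL.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted string, standing in for the string part of an SV. */
typedef struct {
    char  *buf;
    size_t len;
} toy_str;

/* Like sv_setpvn: the caller supplies the length, so embedded NULs survive. */
static void toy_setpvn(toy_str *s, const char *src, size_t len)
{
    char *copy = malloc(len + 1);

    if (copy == NULL)
        abort();
    memcpy(copy, src, len);
    copy[len] = '\0';           /* keep a trailing NUL for C callers */
    free(s->buf);
    s->buf = copy;
    s->len = len;
}

/* Like sv_setpv: the length comes from strlen, so the copy ends at the
   first NUL found in src. */
static void toy_setpv(toy_str *s, const char *src)
{
    toy_setpvn(s, src, strlen(src));
}
```

Given the five-byte input C<"ab\0cd">, C<toy_setpvn> with a length of 5 keeps
all five bytes, while C<toy_setpv> keeps only C<ab>.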
  54
  55 To access the actual value that an SV points to, you can use the macros:
  56
  57     SvIV(SV*)
  58     SvNV(SV*)
  59     SvPV(SV*, STRLEN len)
  60
  61 which will automatically coerce the actual scalar type into an IV, double,
  62 or string.
  63
  64 In the C<SvPV> macro, the length of the string returned is placed into the
  65 variable C<len> (this is a macro, so you do I<not> use C<&len>).  If you do not
  66 care what the length of the data is, use the global variable C<na>.  Remember,
  67 however, that Perl allows arbitrary strings of data that may both contain
  68 NUL's and not be terminated by a NUL.
  69
  70 If you simply want to know if the scalar value is TRUE, you can use:
  71
  72     SvTRUE(SV*)
  73
  74 Although Perl will automatically grow strings for you, if you need to force
  75 Perl to allocate more memory for your SV, you can use the macro
  76
  77     SvGROW(SV*, STRLEN newlen)
  78
  79 which will determine if more memory needs to be allocated.  If so, it will
  80 call the function C<sv_grow>.  Note that C<SvGROW> can only increase, not
  81 decrease, the allocated memory of an SV.
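
The grow-only behaviour can be pictured with a small stand-alone model.  This
is a sketch under assumptions, not the code behind C<SvGROW> or C<sv_grow>:
C<toy_buf> and C<toy_grow> are hypothetical names, and C<realloc> stands in
for whatever allocator Perl uses.

```c
#include <stdlib.h>

/* Hypothetical buffer with a recorded allocation size. */
typedef struct {
    char  *buf;
    size_t alloc;
} toy_buf;

/* Reallocate only when newlen exceeds the current allocation; a smaller
   request leaves the buffer untouched, so the allocation never shrinks. */
static char *toy_grow(toy_buf *b, size_t newlen)
{
    if (newlen > b->alloc) {
        char *p = realloc(b->buf, newlen);

        if (p == NULL)
            abort();
        b->buf   = p;
        b->alloc = newlen;
    }
    return b->buf;
}
```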
  82
  83 If you have an SV and want to know what kind of data Perl thinks is stored
  84 in it, you can use the following macros to check the type of SV you have.
  85
  86     SvIOK(SV*)
  87     SvNOK(SV*)
  88     SvPOK(SV*)
  89
  90 You can get and set the current length of the string stored in an SV with
  91 the following macros:
  92
  93     SvCUR(SV*)
  94     SvCUR_set(SV*, I32 val)
  95
  96 But note that these are valid only if C<SvPOK()> is true.
  97
  98 If you know the name of a scalar variable, you can get a pointer to its SV
  99 by using the following:
 100
 101     SV*  perl_get_sv("varname", FALSE);
 102
 103 This returns NULL if the variable does not exist.
 104
 105 If you want to know if this variable (or any other SV) is actually defined,
 106 you can call:
 107
 108     SvOK(SV*)
 109
 110 The scalar C<undef> value is stored in an SV instance called C<sv_undef>.  Its
 111 address can be used whenever an C<SV*> is needed.
 112
 113 There are also the two values C<sv_yes> and C<sv_no>, which contain Boolean
 114 TRUE and FALSE values, respectively.  Like C<sv_undef>, their addresses can
 115 be used whenever an C<SV*> is needed.
 116
 117 Do not be fooled into thinking that C<(SV *) 0> is the same as C<&sv_undef>.
 118 Take this code:
 119
 120     SV* sv = (SV*) 0;
 121     if (I-am-to-return-a-real-value) {
 122             sv = sv_2mortal(newSViv(42));
 123     }
 124     sv_setsv(ST(0), sv);
 125
 126 This code tries to return a new SV (which contains the value 42) if it should
 127 return a real value, or undef otherwise.  Instead it has returned a null
 128 pointer which, somewhere down the line, will cause a segmentation violation,
 129 or just weird results.  Change the zero to C<&sv_undef> in the first line and
 130 all will be well.
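
The same pattern can be shown with a toy model that compiles without Perl.
Everything here is hypothetical (C<toy_sv>, C<toy_undef>, C<toy_setsv>,
C<toy_return>); the sketch only illustrates why the starting value must be
the address of a real "undefined" object and not a null pointer.

```c
/* Hypothetical miniature scalar: a defined flag plus an integer payload. */
typedef struct {
    int  defined;
    long iv;
} toy_sv;

/* Stand-in for sv_undef: a real object that represents "no value". */
static toy_sv toy_undef = { 0, 0 };

/* Stand-in for sv_setsv: copies src into dst.  It dereferences src, so
   passing a null pointer here would be undefined behaviour. */
static void toy_setsv(toy_sv *dst, const toy_sv *src)
{
    dst->defined = src->defined;
    dst->iv      = src->iv;
}

/* Fill *out with 42 when want_value is true, and with undef otherwise. */
static void toy_return(toy_sv *out, int want_value)
{
    toy_sv        real = { 1, 42 };
    const toy_sv *sv   = &toy_undef;    /* start from undef, not from 0 */

    if (want_value)
        sv = &real;
    toy_setsv(out, sv);
}
```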
 131
 132 To free an SV that you've created, call C<SvREFCNT_dec(SV*)>.  Normally this
 133 call is not necessary.  See the section on B<Mortality>.
 134
 135 =head2 Private and Public Values
 136
 137 Recall that the usual method of determining the type of scalar you have is
 138 to use the C<Sv[INP]OK> macros.  Since a scalar can be both a number and a
 139 string, these macros will usually return TRUE, and calling the C<Sv[INP]V>
 140 macros will do the appropriate conversion of string to integer/double or
 141 integer/double to string.
 142
 143 If you I<really> need to know if you have an integer, double, or string
 144 pointer in an SV, you can use the following three macros instead:
 145
 146     SvIOKp(SV*)
 147     SvNOKp(SV*)
 148     SvPOKp(SV*)
 149
 150 These will tell you if you truly have an integer, double, or string pointer
 151 stored in your SV.
 152
 153 In general, though, it's best to just use the C<Sv[INP]V> macros.
 154
 155 =head2 Working with AV's
 156
 157 There are two ways to create and load an AV.  The first method just creates
 158 an empty AV:
 159
 160     AV*  newAV();
 161
 162 The second method both creates the AV and initially populates it with SV's:
 163
 164     AV*  av_make(I32 num, SV **ptr);
 165
 166 The second argument points to an array containing C<num> C<SV*>'s.
 167
 168 Once the AV has been created, the following operations are possible on AV's:
 169
 170     void  av_push(AV*, SV*);
 171     SV*   av_pop(AV*);
 172     SV*   av_shift(AV*);
 173     void  av_unshift(AV*, I32 num);
 174
 175 These should be familiar operations, with the exception of C<av_unshift>.
 176 This routine adds C<num> elements at the front of the array with the C<undef>
 177 value.  You must then use C<av_store> (described below) to assign values
 178 to these new elements.
 179
 180 Here are some other functions:
 181
 182     I32   av_len(AV*); /* Returns highest index in array (-1 if empty) */
 183
 184     SV**  av_fetch(AV*, I32 key, I32 lval);
 185             /* Fetches value at key offset; a non-zero lval means
 186                the fetch is part of a store operation */
 187     SV**  av_store(AV*, I32 key, SV* val);
 188             /* Stores val at offset key */
 189
 190     void  av_clear(AV*);
 191             /* Clear out all elements, but leave the array */
 192     void  av_undef(AV*);
 193             /* Undefines the array, removing all elements */
 194
 195 If you know the name of an array variable, you can get a pointer to its AV
 196 by using the following:
 197
 198     AV*  perl_get_av("varname", FALSE);
 199
 200 This returns NULL if the variable does not exist.
 201
 202 =head2 Working with HV's
 203
 204 To create an HV, you use the following routine:
 205
 206     HV*  newHV();
 207
 208 Once the HV has been created, the following operations are possible on HV's:
 209
 210     SV**  hv_store(HV*, char* key, U32 klen, SV* val, U32 hash);
 211     SV**  hv_fetch(HV*, char* key, U32 klen, I32 lval);
 212
 213 The C<klen> parameter is the length of the key being passed in.  The C<val>
 214 argument contains the SV pointer to the scalar being stored, and C<hash> is
 215 the pre-computed hash value (zero if you want C<hv_store> to calculate it
 216 for you).  The C<lval> parameter indicates whether this fetch is actually a
 217 part of a store operation.
 218
 219 Remember that C<hv_store> and C<hv_fetch> return C<SV**>'s and not just
 220 C<SV*>.  In order to access the scalar value, you must first dereference
 221 the return value.  However, you should check to make sure that the return
 222 value is not NULL before dereferencing it.
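
The check-then-dereference pattern looks like this in a stand-alone model.
The table type and the functions C<toy_fetch> and C<toy_get_or> are
hypothetical and use a linear search purely for brevity; only the shape of
the return value (a pointer to a slot, or NULL) mirrors the text above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a key and a pointer-valued slot. */
typedef struct {
    const char *key;
    int        *val;
} toy_entry;

/* Return the address of the value slot for key, or NULL if the key is
   absent.  The extra level of indirection mirrors an SV** result. */
static int **toy_fetch(toy_entry *tab, size_t n, const char *key)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(tab[i].key, key) == 0)
            return &tab[i].val;
    return NULL;
}

/* Check the slot pointer before dereferencing it; fall back otherwise. */
static int toy_get_or(toy_entry *tab, size_t n, const char *key, int fallback)
{
    int **slot = toy_fetch(tab, n, key);

    return (slot != NULL && *slot != NULL) ? **slot : fallback;
}
```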
 223
 224 These two functions check whether a hash table entry exists and delete an entry, respectively:
 225
 226     bool  hv_exists(HV*, char* key, U32 klen);
 227     SV*   hv_delete(HV*, char* key, U32 klen);
 228
 229 And more miscellaneous functions:
 230
 231     void   hv_clear(HV*);
 232             /* Clears all entries in hash table */
 233     void   hv_undef(HV*);
 234             /* Undefines the hash table */
 235
 236     I32    hv_iterinit(HV*);
 237             /* Prepares starting point to traverse hash table */
 238     HE*    hv_iternext(HV*);
 239             /* Get the next entry, and return a pointer to a
 240                structure that has both the key and value */
 241     char*  hv_iterkey(HE* entry, I32* retlen);
 242             /* Get the key from an HE structure and also return
 243                the length of the key string */
 244     SV*    hv_iterval(HV*, HE* entry);
 245             /* Return a SV pointer to the value of the HE
 246                structure */
 247
 248 If you know the name of a hash variable, you can get a pointer to its HV
 249 by using the following:
 250
 251     HV*  perl_get_hv("varname", FALSE);
 252
 253 This returns NULL if the variable does not exist.
 254
 255 The hash algorithm, for those who are interested, is:
 256
 257     i = klen;
 258     hash = 0;
 259     s = key;
 260     while (i--)
 261         hash = hash * 33 + *s++;
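
Wrapped in a function, the loop above might look like the following sketch.
The name C<toy_hash> is hypothetical, and the declarations are assumptions
since the fragment does not show them: C<uint32_t> stands in for C<U32>, and
the key bytes are read as unsigned.

```c
#include <stddef.h>
#include <stdint.h>

/* The "times 33 plus next byte" loop from the text.  Arithmetic is
   unsigned, so long keys wrap around modulo 2**32. */
static uint32_t toy_hash(const char *key, size_t klen)
{
    const unsigned char *s = (const unsigned char *)key;
    uint32_t hash = 0;
    size_t   i    = klen;

    while (i--)
        hash = hash * 33 + *s++;
    return hash;
}
```

For the two-byte ASCII key C<ab> this gives 97 * 33 + 98 = 3299.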
 262
 263 =head2 References
 264
 265 References are a special type of scalar that points to other scalar types
 266 (including references).  To treat an AV or HV as a scalar, simply cast
 267 the C<AV*> or C<HV*> to an C<SV*>.
 268
 269 To create a reference, use the following command:
 270
 271     SV*  newRV((SV*) pointer);
 272
 273 Once you have a reference, you can dereference it with the following macro,
 274 which returns an C<SV*>:
 275
 276     SvRV(SV*)
 277
 278 then call the appropriate routines, casting the returned C<SV*> to either an
 279 C<AV*> or C<HV*>.
 280
 281 To determine, after dereferencing a reference, if you still have a reference,
 282 you can use the following macro:
 283
 284     SvROK(SV*)
 285
 286 =head1 XSUB's and the Argument Stack
 287
 288 The XSUB mechanism is a simple way for Perl programs to access C subroutines.
 289 An XSUB routine will have a stack that contains the arguments from the Perl
 290 program, and a way to map from the Perl data structures to a C equivalent.
 291
 292 The stack arguments are accessible through the C<ST(n)> macro, which returns
 293 the C<n>'th stack argument.  Argument 0 is the first argument passed in the
 294 Perl subroutine call.  These arguments are C<SV*>, and can be used anywhere
 295 an C<SV*> is used.
 296
 297 Most of the time, output from the C routine can be handled through use of
 298 the RETVAL and OUTPUT directives.  However, there are some cases where the
 299 argument stack is not already long enough to handle all the return values.
 300 An example is the POSIX tzname() call, which takes no arguments, but returns
 301 two, the local timezone's standard and summer time abbreviations.
 302
 303 To handle this situation, the PPCODE directive is used and the stack is
 304 extended using the macro:
 305
 306     EXTEND(sp, num);
 307
 308 where C<sp> is the stack pointer, and C<num> is the number of elements the
 309 stack should be extended by.
 310
 311 Now that there is room on the stack, values can be pushed on it using the
 312 macros to push IV's, doubles, strings, and SV pointers respectively:
 313
 314     PUSHi(IV)
 315     PUSHn(double)
 316     PUSHp(char*, I32)
 317     PUSHs(SV*)
 318
 319 Now, in the Perl program calling C<tzname>, the two values will be assigned
 320 as in:
 321
 322     ($standard_abbrev, $summer_abbrev) = POSIX::tzname;
 323
 324 An alternate (and possibly simpler) method of pushing values on the stack is
 325 to use the macros:
 326
 327     XPUSHi(IV)
 328     XPUSHn(double)
 329     XPUSHp(char*, I32)
 330     XPUSHs(SV*)
 331
 332 These macros automatically adjust the stack for you, if needed.
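
The relationship between the two families of macros can be modelled with a
small growable stack.  This is only a sketch: C<toy_stack>, C<toy_extend>,
C<toy_push> and C<toy_xpush> are hypothetical names, the stack holds plain
C<long> values, and nothing here reflects how Perl's argument stack is
actually laid out.

```c
#include <stdlib.h>

/* Hypothetical stack of integers with a used count and a capacity. */
typedef struct {
    long  *base;
    size_t used;
    size_t cap;
} toy_stack;

/* Like EXTEND: make sure there is room for at least n more elements. */
static void toy_extend(toy_stack *st, size_t n)
{
    if (st->cap - st->used < n) {
        size_t newcap = st->used + n;
        long  *p      = realloc(st->base, newcap * sizeof *p);

        if (p == NULL)
            abort();
        st->base = p;
        st->cap  = newcap;
    }
}

/* Like PUSHi: no capacity check, the caller must have extended first. */
static void toy_push(toy_stack *st, long v)
{
    st->base[st->used++] = v;
}

/* Like XPUSHi: extend by one if needed, then push. */
static void toy_xpush(toy_stack *st, long v)
{
    toy_extend(st, 1);
    toy_push(st, v);
}
```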
 333
 334 =head1 Mortality
 335
 336 In Perl, values are normally "immortal" -- that is, they are not freed unless
 337 this is done explicitly (via the Perl C<undef> call or other routines in Perl
 338 itself).
 339
 340 In the above example with C<tzname>, we needed to create two new SV's, one for
 341 each string, to push onto the argument stack.  However, we don't want
 342 these new SV's to stick around forever because they will eventually be
 343 copied into the SV's that hold the two scalar variables.
 344
 345 An SV (or AV or HV) that is "mortal" acts in all ways as a normal "immortal"
 346 SV, AV, or HV, but is only valid in the "current context".  When the Perl
 347 interpreter leaves the current context, the mortal SV, AV, or HV is
 348 automatically freed.  Generally the "current context" means a single
 349 Perl statement.
 350
 351 To create a mortal variable, use the functions:
 352
 353     SV*  sv_newmortal()
 354     SV*  sv_2mortal(SV*)
 355     SV*  sv_mortalcopy(SV*)
 356
 357 The first call creates a mortal SV, the second converts an existing SV to
 358 a mortal SV, and the third creates a mortal copy of an existing SV.
 359
 360 The mortal routines are not just for SV's -- AV's and HV's can be made mortal
 361 by passing their address (and casting them to C<SV*>) to the C<sv_2mortal> or
 362 C<sv_mortalcopy> routines.
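
One way to picture mortality is as a list of values scheduled for release
when the current context ends.  The model below is a sketch under that
assumption and is not taken from the Perl source: C<mortal_sv>, C<toy_new>,
C<toy_refcnt_dec>, C<toy_2mortal>, C<toy_free_tmps> and the fixed-size
C<toy_tmps> list are all hypothetical.

```c
#include <stdlib.h>

/* Hypothetical reference-counted scalar. */
typedef struct {
    int  refcnt;
    long iv;
} mortal_sv;

#define TOY_MAX_TMPS 16

static mortal_sv *toy_tmps[TOY_MAX_TMPS];  /* values awaiting release */
static int        toy_tmps_n = 0;
static int        toy_freed  = 0;          /* counts frees, for illustration */

/* Create an "immortal" value with one reference. */
static mortal_sv *toy_new(long iv)
{
    mortal_sv *sv = malloc(sizeof *sv);

    if (sv == NULL)
        abort();
    sv->refcnt = 1;
    sv->iv     = iv;
    return sv;
}

/* Drop one reference and free the value when none remain. */
static void toy_refcnt_dec(mortal_sv *sv)
{
    if (--sv->refcnt == 0) {
        free(sv);
        toy_freed++;
    }
}

/* Mark an existing value as mortal by queueing it for a later release. */
static mortal_sv *toy_2mortal(mortal_sv *sv)
{
    if (toy_tmps_n == TOY_MAX_TMPS)
        abort();
    toy_tmps[toy_tmps_n++] = sv;
    return sv;
}

/* Leave the "current context": release every queued value. */
static void toy_free_tmps(void)
{
    while (toy_tmps_n > 0)
        toy_refcnt_dec(toy_tmps[--toy_tmps_n]);
}
```

In this model a value made with C<toy_2mortal(toy_new(1))> is freed by the
next C<toy_free_tmps> call, while a value from C<toy_new> alone survives
until C<toy_refcnt_dec> is called on it explicitly.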
 363
 364 =head1 Creating New Variables
 365
 366 To create a new Perl variable, which can be accessed from your Perl script,
 367 use the following routines, depending on the variable type.
 368
 369     SV*  perl_get_sv("varname", TRUE);
 370     AV*  perl_get_av("varname", TRUE);
 371     HV*  perl_get_hv("varname", TRUE);
 372
 373 Notice the use of TRUE as the second parameter.  The new variable can now
 374 be set, using the routines appropriate to the data type.
 375
 376 =head1 Stashes and Objects
 377
 378 A stash is a hash table (associative array) that contains all of the
 379 different objects that are contained within a package.  Each key of the
 380 hash table is a symbol name (shared by all the different types of
 381 objects that have the same name), and each value in the hash table is
 382 called a GV (for Glob Value).  The GV in turn contains references to
 383 the various objects of that name, including (but not limited to) the
 384 following:
 385
 386     Scalar Value
 387     Array Value
 388     Hash Value
 389     File Handle
 390     Directory Handle
 391     Format
 392     Subroutine
 393
 394 Perl stores various stashes in a GV structure (for global variable) but
 395 represents them with an HV structure.
 396
 397 To get the HV pointer for a particular package, use the function:
 398
 399     HV*  gv_stashpv(char* name, I32 create)
 400     HV*  gv_stashsv(SV*, I32 create)
 401
 402 The first function takes a literal string; the second uses the string stored
 403 in the SV.
 404
 405 The name that C<gv_stash*v> wants is the name of the package whose symbol table
 406 you want.  The default package is called C<main>.  If you have multiply nested
 407 packages, it is legal to pass their names to C<gv_stash*v>, separated by
 408 C<::> as in the Perl language itself.
 409
 410 Alternately, if you have an SV that is a blessed reference, you can find
 411 out the stash pointer by using:
 412
 413     HV*  SvSTASH(SvRV(SV*));
 414
 415 then use the following to get the package name itself:
 416
 417     char*  HvNAME(HV* stash);
 418
 419 If you need to return a blessed value to your Perl script, you can use the
 420 following function:
 421
 422     SV*  sv_bless(SV*, HV* stash)
 423
 424 where the first argument, an C<SV*>, must be a reference, and the second
 425 argument is a stash.  The returned C<SV*> can now be used in the same way
 426 as any other SV.
 427
 428 =head1 Magic
 429
 430 [This section under construction]
 431
 432 =head1 Double-Typed SV's
 433
 434 Scalar variables normally contain only one type of value: an integer,
 435 double, pointer, or reference.  Perl will automatically convert the
 436 actual scalar data from the stored type into the requested type.
 437
 438 Some scalar variables contain more than one type of scalar data.  For
 439 example, the variable C<$!> contains either the numeric value of C<errno>
 440 or its string equivalent from C<sys_errlist[]>.
 441
 442 To force multiple data values into an SV, you must do two things: use the
 443 C<sv_set*v> routines to add the additional scalar type, then set a flag
 444 so that Perl will believe it contains more than one type of data.  The
 445 four macros to set the flags are:
 446
 447         SvIOK_on
 448         SvNOK_on
 449         SvPOK_on
 450         SvROK_on
 451
 452 The particular macro you must use depends on which C<sv_set*v> routine
 453 you called first.  This is because every C<sv_set*v> routine turns on
 454 only the bit for the particular type of data being set, and turns off
 455 all the rest.
 456
 457 For example, to create a new Perl variable called "dberror" that contains
 458 both the numeric and descriptive string error values, you could use the
 459 following code:
 460
 461     extern int  dberror;
 462     extern char *dberror_list[];
 463
 464     SV* sv = perl_get_sv("dberror", TRUE);
 465     sv_setiv(sv, (IV) dberror);
 466     sv_setpv(sv, dberror_list[dberror]);
 467     SvIOK_on(sv);
 468
 469 If the order of C<sv_setiv> and C<sv_setpv> had been reversed, then the
 470 macro C<SvPOK_on> would need to be called instead of C<SvIOK_on>.
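
The flag behaviour described above can be sketched with a toy structure.
The names C<flag_sv>, C<toy_setiv>, C<toy_setpv>, C<toy_IOK_on> and the flag
values are hypothetical; the sketch assumes only what the text states, namely
that each set routine turns on its own bit and turns off all the others.

```c
#define TOY_IOK 0x1u    /* integer slot is valid */
#define TOY_POK 0x2u    /* string slot is valid */

/* Hypothetical scalar with separate integer and string slots. */
typedef struct {
    unsigned    flags;
    long        iv;
    const char *pv;
} flag_sv;

/* Store an integer: only the integer flag remains set afterwards. */
static void toy_setiv(flag_sv *sv, long iv)
{
    sv->iv    = iv;
    sv->flags = TOY_IOK;
}

/* Store a string: only the string flag remains set afterwards.  The
   integer slot itself is left untouched. */
static void toy_setpv(flag_sv *sv, const char *pv)
{
    sv->pv    = pv;
    sv->flags = TOY_POK;
}

/* Turn the integer flag back on without disturbing the other flags. */
#define toy_IOK_on(sv) ((sv)->flags |= TOY_IOK)
```

Calling C<toy_setiv>, then C<toy_setpv>, then C<toy_IOK_on> leaves both flags
set and both slots holding their values.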
 471
 472 =head1 Calling Perl Routines from within C Programs
 473
 474 There are four routines that can be used to call a Perl subroutine from
 475 within a C program.  These four are:
 476
 477     I32  perl_call_sv(SV*, I32);
 478     I32  perl_call_pv(char*, I32);
 479     I32  perl_call_method(char*, I32);
 480     I32  perl_call_argv(char*, I32, register char**);
 481
 482 The routine most often used should be C<perl_call_sv>.  The C<SV*> argument
 483 contains either the name of the Perl subroutine to be called, or a reference
 484 to the subroutine.  The second argument tells the appropriate routine what,
 485 if any, variables are being returned by the Perl subroutine.
 486
 487 All four routines return the number of values that the subroutine returned
 488 on the Perl stack.
 489
 490 When using these four routines, the programmer must manipulate the Perl stack.
 491 The macros and functions used for this include the following:
 492
 493     dSP
 494     PUSHMARK()
 495     PUTBACK
 496     SPAGAIN
 497     ENTER
 498     SAVETMPS
 499     FREETMPS
 500     LEAVE
 501     XPUSH*()
 502
 503 For more information, consult L<perlcall>.
 504
 505 =head1 Memory Allocation
 506
 507 [This section under construction]
 508
 509 =head1 AUTHOR
 510
 511 Jeff Okamoto <okamoto@corp.hp.com>
 512
 513 With lots of help and suggestions from Dean Roehrich, Malcolm Beattie,
 514 Andreas Koenig, Paul Hudson, Ilya Zakharevich, Paul Marquess, and Neil
 515 Bowers.
 516
 517 =head1 DATE
 518
 519 Version 12: 1994/10/16
 520
 521