=head1 DESCRIPTION
This is not the perldebug(1) manpage, which tells you how to use
-the debugger. This manpage describes low-level details ranging
-between difficult and impossible for anyone who isn't incredibly
-intimate with Perl's guts to understand. Caveat lector.
+the debugger. This manpage describes low-level details concerning
+the debugger's internals, which range from difficult to impossible
+to understand for anyone who isn't incredibly intimate with Perl's guts.
+Caveat lector.
=head1 Debugger Internals
F<INSTALL> podpage in the Perl source tree.
For example, whenever you call Perl's built-in C<caller> function
-from the package DB, the arguments that the corresponding stack
-frame was called with are copied to the the @DB::args array. The
-general mechanisms is enabled by calling Perl with the B<-d> switch, the
-following additional features are enabled (cf. L<perlvar/$^P>):
+from the package C<DB>, the arguments that the corresponding stack
+frame was called with are copied to the C<@DB::args> array. These
+mechanisms are enabled by calling Perl with the B<-d> switch.
+Specifically, the following additional features are enabled
+(cf. L<perlvar/$^P>):
-=over
+=over 4
=item *
=item *
-The array C<@{"_<$filename"}> holds the lines of $filename for all
-files compiled by Perl. The same for C<eval>ed strings that contain
-subroutines, or which are currently being executed. The $filename
-for C<eval>ed strings looks like C<(eval 34)>. Code assertions
-in regexes look like C<(re_eval 19)>.
+Each array C<@{"_<$filename"}> holds the lines of $filename for a
+file compiled by Perl. The same is also true for C<eval>ed strings
+that contain subroutines, or which are currently being executed.
+The $filename for C<eval>ed strings looks like C<(eval 34)>.
+Code assertions in regexes look like C<(re_eval 19)>.
+
+Values in this array are magical in numeric context: they compare
+equal to zero only if the line is not breakable.
=item *
-The hash C<%{"_<$filename"}> contains breakpoints and actions keyed
+Each hash C<%{"_<$filename"}> contains breakpoints and actions keyed
by line number. Individual entries (as opposed to the whole hash)
are settable. Perl only cares about Boolean true here, although
the values used by F<perl5db.pl> have the form
-C<"$break_condition\0$action">. Values in this hash are magical
-in numeric context: they are zeros if the line is not breakable.
+C<"$break_condition\0$action">.
The same holds for evaluated strings that contain subroutines, or
which are currently being executed. The $filename for C<eval>ed strings
=item *
-The scalar C<${"_<$filename"}> contains C<"_<$filename">. This is
+Each scalar C<${"_<$filename"}> contains C<"_<$filename">. This is
also the case for evaluated strings that contain subroutines, or
which are currently being executed. The $filename for C<eval>ed
strings looks like C<(eval 34)> or C<(re_eval 19)>.
=item *
When the execution of your program reaches a point that can hold a
-breakpoint, the C<DB::DB()> subroutine is called any of the variables
-$DB::trace, $DB::single, or $DB::signal is true. These variables
+breakpoint, the C<DB::DB()> subroutine is called if any of the variables
+C<$DB::trace>, C<$DB::single>, or C<$DB::signal> is true. These variables
are not C<local>izable. This feature is disabled when executing
inside C<DB::DB()>, including functions called from it
unless C<< $^D & (1<<30) >> is true.
When execution of the program reaches a subroutine call, a call to
C<&DB::sub>(I<args>) is made instead, with C<$DB::sub> holding the
-name of the called subroutine. This doesn't happen if the subroutine
+name of the called subroutine. (This doesn't happen if the subroutine
was compiled in the C<DB> package.)
=back
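As a rough illustration of the per-file arrays described in the list above, the following sketch runs a throwaway script under a tiny custom debugger and reports which of its lines are breakable. It assumes a Unix-like system; the temporary file names, the C<$seen> flag and the C<breakable:> output format are all made up for this example and are not part of F<perl5db.pl>.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical target script: only lines 1 and 4 hold statements.
my ($sfh, $script) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$sfh} <<'END_SCRIPT';
my $x = 1;

# just a comment
print $x + 1, "\n";
END_SCRIPT
close $sfh or die "close: $!";

# Hypothetical debugger file, compiled into package DB.
my ($dfh, $dbfile) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$dfh} <<'END_DB';
package DB;
sub DB {
    return if $seen++;                  # report once, on the first statement
    my (undef, $file) = caller;
    for my $n (1 .. $#{"main::_<$file"}) {
        # Non-zero in numeric context means the line is breakable.
        print "breakable: $n\n" if ${"main::_<$file"}[$n] != 0;
    }
}
1;
END_DB
close $dfh or die "close: $!";

# Load our debugger instead of perl5db.pl, then run the script under -d.
local $ENV{PERL5DB} = qq{BEGIN { require "$dbfile" }};
open(my $ph, '-|', $^X, '-d', $script) or die "cannot run $^X: $!";
my @out = <$ph>;
close $ph;
print @out;
```

The debugger code is loaded from a file through C<require> rather than placed directly in C<PERL5DB>, simply to keep the environment variable on a single line.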
Note that if C<&DB::sub> needs external data for it to work, no
-subroutine call is possible until this is done. For the standard
-debugger, the C<$DB::deep> variable (how many levels of recursion
-deep into the debugger you can go before a mandatory break) gives
-an example of such a dependency.
+subroutine call is possible without it. As an example, the standard
+debugger's C<&DB::sub> depends on the C<$DB::deep> variable
+(it defines how many levels of recursion deep into the debugger you can go
+before a mandatory break). If C<$DB::deep> is not defined, subroutine
+calls are not possible, even though C<&DB::sub> exists.
=head2 Writing Your Own Debugger
-The minimal working debugger consists of one line
+=head3 Environment Variables
+
+The C<PERL5DB> environment variable can be used to define a debugger.
+For example, the minimal "working" debugger (it actually doesn't do anything)
+consists of one line:
sub DB::DB {}
-which is quite handy as contents of C<PERL5DB> environment
-variable:
+It can easily be defined like this:
$ PERL5DB="sub DB::DB {}" perl -d your-script
-Another brief debugger, slightly more useful, could be created
+Another brief debugger, slightly more useful, can be created
with only the line:
sub DB::DB {print ++$i; scalar <STDIN>}
-This debugger would print the sequential number of encountered
-statement, and would wait for you to hit a newline before continuing.
+This debugger prints a number which increments for each statement
+encountered and waits for you to hit a newline before continuing
+to the next statement.
-The following debugger is quite functional:
+The following debugger is actually useful:
{
package DB;
sub sub {print ++$i, " $sub\n"; &$sub}
}
-It prints the sequential number of subroutine call and the name of the
-called subroutine. Note that C<&DB::sub> should be compiled into the
-package C<DB>.
+It prints the sequence number of each subroutine call and the name of the
+called subroutine. Note that C<&DB::sub> is being compiled into the
+package C<DB> through the use of the C<package> directive.
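Building on the one-line debuggers above, here is a sketch of a statement tracer that combines C<DB::DB> with the C<@{"_<$filename"}> arrays to echo each source line before it runs. It assumes a Unix-like system; the temporary files and the C<file:line: text> output format are choices made for this example only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical script to be traced.
my ($sfh, $script) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$sfh} <<'END_SCRIPT';
my $x = 1;

# just a comment
print $x + 1, "\n";
END_SCRIPT
close $sfh or die "close: $!";

# Hypothetical tracer: print the text of each statement's line.
my ($dfh, $dbfile) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$dfh} <<'END_DB';
package DB;
sub DB {
    my ($pkg, $file, $line) = caller;   # location of the pending statement
    print "$file:$line: ", ${"main::_<$file"}[$line];
}
1;
END_DB
close $dfh or die "close: $!";

# Load the tracer via PERL5DB and capture everything the child prints.
local $ENV{PERL5DB} = qq{BEGIN { require "$dbfile" }};
open(my $ph, '-|', $^X, '-d', $script) or die "cannot run $^X: $!";
my @out = <$ph>;
close $ph;
print @out;
```

Because the tracer is compiled into package C<DB>, its own statements are not themselves traced.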
-At the start, the debugger reads your rc file (F<./.perldb> or
-F<~/.perldb> under Unix), which can set important options. This file may
-define a subroutine C<&afterinit> to be executed after the debugger is
-initialized.
+When it starts, the debugger reads your rc file (F<./.perldb> or
+F<~/.perldb> under Unix), which can set important options.
+(A subroutine (C<&afterinit>) can be defined here as well; it is executed
+after the debugger completes its own initialization.)
After the rc file is read, the debugger reads the PERLDB_OPTS
-environment variable and parses this as the remainder of a C<O ...>
-line as one might enter at the debugger prompt.
+environment variable and uses it to set debugger options. The
+contents of this variable are treated as if they were the argument
+of an C<o ...> debugger command (q.v. in L<perldebug/Options>).
+
+=head3 Debugger internal variables
+
+In addition to the file and subroutine-related variables mentioned above,
+the debugger also maintains various magical internal variables.
+
+=over 4
+
+=item *
+
+C<@DB::dbline> is an alias for C<@{"::_<current_file"}>, which
+holds the lines of the currently-selected file (compiled by Perl), either
+explicitly chosen with the debugger's C<f> command, or implicitly by flow
+of execution.
+
+Values in this array are magical in numeric context: they compare
+equal to zero only if the line is not breakable.
-The debugger also maintains magical internal variables, such as
-C<@DB::dbline>, C<%DB::dbline>, which are aliases for
-C<@{"::_<current_file"}> C<%{"::_<current_file"}>. Here C<current_file>
-is the currently selected file, either explicitly chosen with the
+=item *
+
+C<%DB::dbline> is an alias for C<%{"::_<current_file"}>, which
+contains breakpoints and actions keyed by line number in
+the currently-selected file, either explicitly chosen with the
debugger's C<f> command, or implicitly by flow of execution.
-Some functions are provided to simplify customization. See
-L<perldebug/"Options"> for description of options parsed by
-C<DB::parse_options(string)>. The function C<DB::dump_trace(skip[,
-count])> skips the specified number of frames and returns a list
-containing information about the calling frames (all of them, if
-C<count> is missing). Each entry is reference to a a hash with
-keys C<context> (either C<.>, C<$>, or C<@>), C<sub> (subroutine
+As previously noted, individual entries (as opposed to the whole hash)
+are settable. Perl only cares about Boolean true here, although
+the values used by F<perl5db.pl> have the form
+C<"$break_condition\0$action">.
+
+=back
+
+=head3 Debugger customization functions
+
+Some functions are provided to simplify customization.
+
+=over 4
+
+=item *
+
+C<DB::parse_options(string)> parses debugger options; see
+L<perldebug/Options> for a description of options recognized.
+
+=item *
+
+C<DB::dump_trace(skip[,count])> skips the specified number of frames
+and returns a list containing information about the calling frames (all
+of them, if C<count> is missing). Each entry is a reference to a hash
+with keys C<context> (either C<.>, C<$>, or C<@>), C<sub> (subroutine
name, or info about C<eval>), C<args> (C<undef> or a reference to
an array), C<file>, and C<line>.
-The function C<DB::print_trace(FH, skip[, count[, short]])> prints
+=item *
+
+C<DB::print_trace(FH, skip[, count[, short]])> prints
formatted info about caller frames. The last two functions may be
convenient as arguments to C<< < >>, C<< << >> commands.
+=back
+
Note that any variables and functions that are not documented in
this manpages (or in L<perldebug>) are considered for internal
use only, and as such are subject to change without notice.
main::bar((eval 170):2):
42
-with this one, once the C<O>ption C<frame=2> has been set:
+with this one, once the C<o>ption C<frame=2> has been set:
- DB<4> O f=2
+ DB<4> o f=2
frame = '2'
DB<5> t print foo() * bar()
3: foo() * bar()
The debugging output at compile time looks like this:
- compiling RE `[bc]d(ef*g)+h[ij]k$'
- size 43 first at 1
- 1: ANYOF(11)
- 11: EXACT <d>(13)
- 13: CURLYX {1,32767}(27)
- 15: OPEN1(17)
- 17: EXACT <e>(19)
- 19: STAR(22)
- 20: EXACT <f>(0)
- 22: EXACT <g>(24)
- 24: CLOSE1(26)
- 26: WHILEM(0)
- 27: NOTHING(28)
- 28: EXACT <h>(30)
- 30: ANYOF(40)
- 40: EXACT <k>(42)
- 42: EOL(43)
- 43: END(0)
- anchored `de' at 1 floating `gh' at 3..2147483647 (checking floating)
- stclass `ANYOF' minlen 7
+ Compiling REx `[bc]d(ef*g)+h[ij]k$'
+ size 45 Got 364 bytes for offset annotations.
+ first at 1
+ rarest char g at 0
+ rarest char d at 0
+ 1: ANYOF[bc](12)
+ 12: EXACT <d>(14)
+ 14: CURLYX[0] {1,32767}(28)
+ 16: OPEN1(18)
+ 18: EXACT <e>(20)
+ 20: STAR(23)
+ 21: EXACT <f>(0)
+ 23: EXACT <g>(25)
+ 25: CLOSE1(27)
+ 27: WHILEM[1/1](0)
+ 28: NOTHING(29)
+ 29: EXACT <h>(31)
+ 31: ANYOF[ij](42)
+ 42: EXACT <k>(44)
+ 44: EOL(45)
+ 45: END(0)
+ anchored `de' at 1 floating `gh' at 3..2147483647 (checking floating)
+ stclass `ANYOF[bc]' minlen 7
+ Offsets: [45]
+ 1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]
+ 0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]
+ 11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]
+ 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]
+ Omitting $` $& $' support.
The first line shows the pre-compiled form of the regex. The second
shows the size of the compiled form (in arbitrary units, usually
-4-byte words) and the label I<id> of the first node that does a
-match.
+4-byte words) and the total number of bytes allocated for the
+offset/length table, usually 4+C<size>*8. The next line shows the
+label I<id> of the first node that does a match.
+
+The
-The last line (split into two lines above) contains optimizer
+ anchored `de' at 1 floating `gh' at 3..2147483647 (checking floating)
+ stclass `ANYOF[bc]' minlen 7
+
+line (split into two lines above) contains optimizer
information. In the example shown, the optimizer found that the match
should contain a substring C<de> at offset 1, plus substring C<gh>
at some offset between 3 and infinity. Moreover, when checking for
these substrings (to abandon impossible matches quickly), Perl will check
for the substring C<gh> before checking for the substring C<de>. The
optimizer may also use the knowledge that the match starts (at the
-C<first> I<id>) with a character class, and the match cannot be
-shorter than 7 chars.
+C<first> I<id>) with a character class, and no string
+shorter than 7 characters can possibly match.
-The fields of interest which may appear in the last line are
+The fields of interest which may appear in this line are
-=over
+=over 4
=item C<anchored> I<STRING> C<at> I<POS>
=item C<isall>
-Means that the optimizer info is all that the regular
+Means that the optimizer information is all that the regular
expression contains, and thus one does not need to enter the regex engine at
all.
If a substring is known to match at end-of-line only, it may be
followed by C<$>, as in C<floating `k'$>.
-The optimizer-specific info is used to avoid entering (a slow) regex
-engine on strings that will not definitely match. If C<isall> flag
+The optimizer-specific information is used to avoid entering (a slow) regex
+engine on strings that will definitely not match. If the C<isall> flag
is set, a call to the regex engine may be avoided even when the optimizer
found an appropriate place for the match.
-The rest of the output contains the list of I<nodes> of the compiled
+Above the optimizer section is the list of I<nodes> of the compiled
form of the regex. Each line has format
C< >I<id>: I<TYPE> I<OPTIONAL-INFO> (I<next-id>)
# To simplify debugging output, we mark it as if it were a node
OPTIMIZED off Placeholder for dump.
+=for unprinted-credits
+Next section M-J. Dominus (mjd-perl-patch+@plover.com) 20010421
+
+Following the optimizer information is a dump of the offset/length
+table, here split across several lines:
+
+ Offsets: [45]
+ 1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]
+ 0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]
+ 11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]
+ 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]
+
+The first line here indicates that the offset/length table contains 45
+entries. Each entry is a pair of integers, denoted by C<offset[length]>.
+Entries are numbered starting with 1, so entry #1 here is C<1[4]> and
+entry #12 is C<5[1]>. C<1[4]> indicates that the node labeled C<1:>
+(the C<1: ANYOF[bc]>) begins at character position 1 in the
+pre-compiled form of the regex, and has a length of 4 characters.
+C<5[1]> in position 12
+indicates that the node labeled C<12:>
+(the C<< 12: EXACT <d> >>) begins at character position 5 in the
+pre-compiled form of the regex, and has a length of 1 character.
+C<12[1]> in position 14
+indicates that the node labeled C<14:>
+(the C<< 14: CURLYX[0] {1,32767} >>) begins at character position 12 in the
+pre-compiled form of the regex, and has a length of 1 character---that
+is, it corresponds to the C<+> symbol in the precompiled regex.
+
+C<0[0]> items indicate that there is no corresponding node.
+
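The correspondence between table entries and regex text described above can be checked mechanically. The sketch below pastes in the C<Offsets> dump from this example and, for every entry with a non-zero length, extracts the matching slice of the pre-compiled regex. The variable names are illustrative only.

```perl
use strict;
use warnings;

# Pre-compiled regex and offset/length dump, copied from the example above.
my $regex = '[bc]d(ef*g)+h[ij]k$';
my $dump  = <<'END_DUMP';
1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]
0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]
11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]
0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]
END_DUMP

# Each entry is offset[length]; entry N (counting from 1) belongs to node N.
my @entries = map { /^(\d+)\[(\d+)\]$/ ? [ $1, $2 ] : () } split ' ', $dump;

my %node_text;
for my $i (0 .. $#entries) {
    my ($offset, $length) = @{ $entries[$i] };
    next unless $length;                # skip 0[0] and zero-length entries
    # Offsets are 1-based character positions; substr() is 0-based.
    $node_text{ $i + 1 } = substr($regex, $offset - 1, $length);
}

# Table size reported on the "size" line: 4 + size * 8 bytes.
my $table_bytes = 4 + @entries * 8;

printf "%2d: %s\n", $_, $node_text{$_}
    for sort { $a <=> $b } keys %node_text;
print "entries: ", scalar(@entries), ", bytes: $table_bytes\n";
```

Entries with a zero length, such as C<12[0]> and C<20[0]>, are skipped here because they do not span any text in the regex.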
=head2 Run-time output
First of all, when doing a match, one may get no run-time output even
result are quite a bit worse on 64-bit architectures). If a variable
is accessed in two of three different ways (which require an integer,
a float, or a string), the memory footprint may increase yet another
-20 bytes. A sloppy malloc(3) implementation can make inflate these
+20 bytes. A sloppy malloc(3) implementation can inflate these
numbers dramatically.
On the opposite end of the scale, a declaration like
Total sbrk(): 215040/47:145. Odd ends: pad+heads+chain+tail: 0+2192+0+6144.
It is possible to ask for such a statistic at arbitrary points in
-your execution using the mstats() function out of the standard
+your execution using the mstat() function out of the standard
Devel::Peek module.
Here is some explanation of that format:
-=over
+=over 4
=item C<buckets SMALLEST(APPROX)..GREATEST(APPROX)>
4 12 24 48 80
With non-C<DEBUGGING> perl, the buckets starting from C<128> have
-a 4-byte overhead, and thus a 8192-long bucket may take up to
+a 4-byte overhead, and thus an 8192-long bucket may take up to
8188-byte allocations.
=item C<Total sbrk(): SBRKed/SBRKs:CONTINUOUS>
Here are explanations for other I<Id>s above:
-=over
+=over 4
=item C<717>
If warn() string starts with
-=over
+=over 4
=item C<!!!>
L<perlrun>
L<re>,
and
-L<Devel::Dprof>.
+L<Devel::DProf>.