X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperldebguts.pod;h=efc979861f4d0848d286323ca4dd900e3ebcebf0;hb=50de6d7ea6627af63fe7cc0e7a62e105a73a0565;hp=45c33c7ec437133247cefef19c1c0ad8e6c4ede7;hpb=ee8c7f5465f003860e2347a2946abacac39bd9b9;p=p5sagit%2Fp5-mst-13.2.git

diff --git a/pod/perldebguts.pod b/pod/perldebguts.pod
index 45c33c7..efc9798 100644
--- a/pod/perldebguts.pod
+++ b/pod/perldebguts.pod
@@ -23,7 +23,7 @@ frame was called with are copied to the @DB::args array. The general
 mechanisms is enabled by calling Perl with the B<-d> switch, the
 following additional features are enabled (cf. L):
 
-=over
+=over 4
 
 =item *
 
@@ -32,20 +32,22 @@ Perl inserts the contents of C<$ENV{PERL5DB}> (or C holds the lines of $filename for all
-files compiled by Perl. The same for C<eval>ed strings that contain
+Each array C<@{"_<$filename"}> holds the lines of $filename for a
+file compiled by Perl. The same for C<eval>ed strings that contain
 subroutines, or which are currently being executed. The $filename
 for C<eval>ed strings looks like C<(eval 34)>. Code assertions
-in regexes look like C<(re_eval 19)>.
+in regexes look like C<(re_eval 19)>.
+
+Values in this array are magical in numeric context: they compare
+equal to zero only if the line is not breakable.
 
 =item *
 
-The hash C<%{"_<$filename"}> contains breakpoints and actions keyed
+Each hash C<%{"_<$filename"}> contains breakpoints and actions keyed
 by line number. Individual entries (as opposed to the whole hash)
 are settable. Perl only cares about Boolean true here, although
 the values used by F have the form
-C<"$break_condition\0$action">. Values in this hash are magical
-in numeric context: they are zeros if the line is not breakable.
+C<"$break_condition\0$action">.
 
 The same holds for evaluated strings that contain subroutines, or
 which are currently being executed. The $filename for C<eval>ed strings
@@ -53,7 +55,7 @@ looks like C<(eval 34)> or C<(re_eval 19)>.
 
 =item *
 
-The scalar C<${"_<$filename"}> contains C<"_<$filename">. This is
+Each scalar C<${"_<$filename"}> contains C<"_<$filename">. This is
 also the case for evaluated strings that contain subroutines, or
 which are currently being executed. The $filename for C<eval>ed strings
 looks like C<(eval 34)> or C<(re_eval 19)>.
@@ -362,45 +364,60 @@ compile time and run time. It is not lexically scoped.
 
 The debugging output at compile time looks like this:
 
-  compiling RE `[bc]d(ef*g)+h[ij]k$'
-  size 43 first at 1
-     1: ANYOF(11)
-    11: EXACT <d>(13)
-    13: CURLYX {1,32767}(27)
-    15: OPEN1(17)
-    17: EXACT <e>(19)
-    19: STAR(22)
-    20: EXACT <f>(0)
-    22: EXACT <g>(24)
-    24: CLOSE1(26)
-    26: WHILEM(0)
-    27: NOTHING(28)
-    28: EXACT <h>(30)
-    30: ANYOF(40)
-    40: EXACT <k>(42)
-    42: EOL(43)
-    43: END(0)
-  anchored `de' at 1 floating `gh' at 3..2147483647 (checking floating)
-  stclass `ANYOF' minlen 7
+  Compiling REx `[bc]d(ef*g)+h[ij]k$'
+  size 45 Got 364 bytes for offset annotations.
+  first at 1
+  rarest char g at 0
+  rarest char d at 0
+     1: ANYOF[bc](12)
+    12: EXACT <d>(14)
+    14: CURLYX[0] {1,32767}(28)
+    16: OPEN1(18)
+    18: EXACT <e>(20)
+    20: STAR(23)
+    21: EXACT <f>(0)
+    23: EXACT <g>(25)
+    25: CLOSE1(27)
+    27: WHILEM[1/1](0)
+    28: NOTHING(29)
+    29: EXACT <h>(31)
+    31: ANYOF[ij](42)
+    42: EXACT <k>(44)
+    44: EOL(45)
+    45: END(0)
+  anchored `de' at 1 floating `gh' at 3..2147483647 (checking floating)
+  stclass `ANYOF[bc]' minlen 7
+  Offsets: [45]
+  1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]
+  0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]
+  11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]
+  0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]
+  Omitting $` $& $' support.
 
 The first line shows the pre-compiled form of the regex. The
 second shows the size of the compiled form (in arbitrary units, usually
-4-byte words) and the label I of the first node that does a
-match.
+4-byte words) and the total number of bytes allocated for the
+offset/length table, usually 4+C<size>*8.
The next line shows the
+label I of the first node that does a match.
+
+The
 
-The last line (split into two lines above) contains optimizer
+  anchored `de' at 1 floating `gh' at 3..2147483647 (checking floating)
+  stclass `ANYOF[bc]' minlen 7
+
+line (split into two lines above) contains optimizer
 information. In the example shown, the optimizer found that the match
 should contain a substring C<de> at offset 1, plus substring C<gh>
 at some offset between 3 and infinity. Moreover, when checking for
 these substrings (to abandon impossible matches quickly), Perl will check
 for the substring C<gh> before checking for the substring C<de>. The
 optimizer may also use the knowledge that the match starts (at the
-C I) with a character class, and the match cannot be
-shorter than 7 chars.
+C I) with a character class, and no string
+shorter than 7 characters can possibly match.
 
-The fields of interest which may appear in the last line are
+The fields of interest which may appear in this line are
 
-=over
+=over 4
 
 =item C I C I
 
@@ -426,7 +443,7 @@ Don't scan for the found substrings.
 
 =item C
 
-Means that the optimizer info is all that the regular
+Means that the optimizer information is all that the regular
 expression contains, and thus one does not need to enter the regex
 engine at all.
 
@@ -457,12 +474,12 @@ being C, C, or C. See the table below.
 
 If a substring is known to match at end-of-line only, it may be
 followed by C<$>, as in C.
 
-The optimizer-specific info is used to avoid entering (a slow) regex
-engine on strings that will not definitely match. If C flag
+The optimizer-specific information is used to avoid entering (a slow) regex
+engine on strings that will not definitely match. If the C flag
 is set, a call to the regex engine may be avoided even when the
 optimizer found an appropriate place for the match.
 
-The rest of the output contains the list of I of the compiled
+Above the optimizer section is the list of I of the compiled
 form of the regex.
 Each line has format
 
 C< >I: I I (I)
 
@@ -581,6 +598,36 @@ Here are the possible types, with short descriptions:
 # To simplify debugging output, we mark it as if it were a node
 OPTIMIZED off Placeholder for dump.
 
+=for unprinted-credits
+Next section M-J. Dominus (mjd-perl-patch+@plover.com) 20010421
+
+Following the optimizer information is a dump of the offset/length
+table, here split across several lines:
+
+  Offsets: [45]
+  1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]
+  0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]
+  11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]
+  0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]
+
+The first line here indicates that the offset/length table contains 45
+entries. Each entry is a pair of integers, denoted by C.
+Entries are numbered starting with 1, so entry #1 here is C<1[4]> and
+entry #12 is C<5[1]>. C<1[4]> indicates that the node labeled C<1:>
+(the C<1: ANYOF[bc]>) begins at character position 1 in the
+pre-compiled form of the regex, and has a length of 4 characters.
+C<5[1]> in position 12
+indicates that the node labeled C<12:>
+(the C<< 12: EXACT <d> >>) begins at character position 5 in the
+pre-compiled form of the regex, and has a length of 1 character.
+C<12[1]> in position 14
+indicates that the node labeled C<14:>
+(the C<< 14: CURLYX[0] {1,32767} >>) begins at character position 12 in the
+pre-compiled form of the regex, and has a length of 1 character---that
+is, it corresponds to the C<+> symbol in the precompiled regex.
+
+C<0[0]> items indicate that there is no corresponding node.
+
 =head2 Run-time output
 
 First of all, when doing a match, one may get no run-time output even
@@ -639,7 +686,7 @@ than 32 bytes (all these examples assume 32-bit architectures, the
 result are quite a bit worse on 64-bit architectures).
 If a variable
 is accessed in two of three different ways (which require an integer,
 a float, or a string), the memory footprint may increase yet another
-20 bytes. A sloppy malloc(3) implementation can make inflate these
+20 bytes. A sloppy malloc(3) implementation can inflate these
 numbers dramatically.
 
 On the opposite end of the scale, a declaration like
@@ -686,12 +733,12 @@ the following example:
   Total sbrk(): 215040/47:145. Odd ends: pad+heads+chain+tail: 0+2192+0+6144.
 
 It is possible to ask for such a statistic at arbitrary points in
-your execution using the mstats() function out of the standard
+your execution using the mstat() function out of the standard
 Devel::Peek module.
 
 Here is some explanation of that format:
 
-=over
+=over 4
 
 =item C
 
@@ -838,7 +885,7 @@ per glob - for glob name, and glob stringification magic.
 
 Here are explanations for other Is above:
 
-=over
+=over 4
 
 =item C<717>
 
@@ -892,7 +939,7 @@ these categories.
 
 If warn() string starts with
 
-=over
+=over 4
 
 =item C