2 # package-less .pm file allows 'use OptreeCheck';
3 # otherwise, it's like "require './test.pl'"
7 OptreeCheck - check optrees
11 OptreeCheck supports regression testing of perl's parser, optimizer,
12 bytecode generator, via a single function: checkOptree(%args).
14 checkOptree(name => "your title here",
15 bcopts => '-exec', # $opt or \@opts, passed to BC::compile
16 code => sub {my $a}, # must be CODE ref
17 # prog => 'sort @a', # run in subprocess, aka -MO=Concise
18 # skip => 1, # skips test
19 # todo => 'excuse', # anticipated failures
20 # fail => 1 # fails (by redirecting result)
21 # debug => 1, # turns on regex debug for match test !!
22 # retry => 1 # retry with debug on test failure
23 expect => <<'EOT_EOT', expect_nt => <<'EONT_EONT');
24 # 1 <;> nextstate(main 45 optree.t:23) v
25 # 2 <0> padsv[$a:45,46] M/LVINTRO
26 # 3 <1> leavesub[1 ref] K/REFC,1
28 # 1 <;> nextstate(main 45 optree.t:23) v
29 # 2 <0> padsv[$a:45,46] M/LVINTRO
30 # 3 <1> leavesub[1 ref] K/REFC,1
33 =head1 checkOptree(%in) Overview
35 Runs code or prog through B::Concise, and captures its rendering.
37 Calls mkCheckRex() to produce a regex which will match the expected
38 rendering, and fail when it doesn't match.
40 Also calls like($out,/$regex/,$name), and thereby plugs into the test.pl
43 =head1 checkOptree(%Args) API
45 Accepts %Args, with following requirements and actions:
47 expect and expect_nt required, not empty, not whitespace. Its a fatal
48 error, because false positives are BAD.
50 Either code or prog must be present.
52 prog is some source code, and is passed through via runperl, to B::Concise
53 like this: (bcopts are fixed up for cmdline)
55 './perl -w -MO=Concise,$bcopts_massaged -e $src'
57 code is a subref, or $src, like above. If it's not a subref, it's
58 treated like source, and wrapped as a subroutine, and passed to
59 B::Concise::compile():
61 $subref = eval "sub{$src}";
63 I suppose I should also explain these more, but..
65 # prog => 'sort @a', # run in subprocess, aka -MO=Concise
66 # skip => 1, # skips test
67 # todo => 'excuse', # anticipated failures
68 # fail => 1 # fails (by redirecting result)
69 # debug => 1, # turns on regex debug for match test !!
70 # retry => 1 # retry with debug on test failure
72 =head1 Usage Philosophy
74 2 platforms --> 2 reftexts: You want an accurate test, independent of
75 which platform youre on. This is obvious in retrospect, but ..
77 I started this with 1 reftext, and tried to use it to construct regexs
78 for both platforms. This is extra complexity, trying to build a
79 single regex for both cases makes the regex more complicated, and
80 harder to get 'right'.
82 Having 2 references also allows various 'tests', really explorations
83 currently. At the very least, having 2 samples side by side allows
84 inspection and aids understanding of optrees.
86 Cross-testing (expect_nt on threaded, expect on non-threaded) exposes
87 differences in B::Concise output, so mkCheckRex has code to do some
88 cross-test manipulations. This area needs more work.
92 One consequence of a single-function API is difficulty controlling
93 test-mode. Ive chosen for now to use a package hash, %gOpts, to store
94 test-state. These properties alter checkOptree() function, either
95 short-circuiting to selftest, or running a loop that runs the testcase
96 2^N times, varying conditions each time. (current N is 2 only).
98 So Test-mode is controlled with cmdline args, also called options below.
99 Run with 'help' to see the test-state, and how to change it.
103 This argument invokes runSelftest(), which tests a regex against the
104 reference renderings that they're made from. Failure of a regex match
105 its 'mold' is a strong indicator that mkCheckRex is buggy.
107 That said, selftest mode currently runs a cross-test too, they're not
108 completely orthogonal yet. See below.
110 =head2 testmode=cross
112 Cross-testing is purposely creating a T-NT mismatch, looking at the
113 fallout, and tweaking the regex to deal with it. Thus tests lead to
114 'provably' complete understanding of the differences.
116 The tweaking appears contrary to the 2-refs philosophy, but the tweaks
117 will be made in conversion-specific code, which (will) handles T->NT
118 and NT->T separately. The tweaking is incomplete.
120 A reasonable 1st step is to add tags to indicate when TonNT or NTonT
121 is known to fail. This needs an option to force failure, so the
122 test.pl reporting mechanics show results to aid the user.
124 =head2 testmode=native
126 This is normal mode. Other valid values are: native, cross, both.
128 =head2 checkOptree Notes
130 Accepts test code, renders its optree using B::Concise, and matches that
131 rendering against a regex built from one of 2 reference-renderings %in data.
133 The regex is built by mkCheckRex(\%in), which scrubs %in data to
134 remove match-irrelevancies, such as (args) and [args]. For example,
135 it strips leading '# ', making it easy to cut-paste new tests into
136 your test-file, run it, and cut-paste actual results into place. You
137 then retest and reedit until all 'errors' are gone. (now make sure you
138 haven't 'enshrined' a bug).
140 name: The test name. May be augmented by a label, which is built from
141 important params, and which helps keep names in sync with whats being
148 use B::Concise qw(walk_output);
150 $Data::Dumper::Sortkeys = 1;
153 $SIG{__WARN__} = sub {
155 $err =~ m/Subroutine re::(un)?install redefined/ and return;
159 # but wait - more skullduggery !
160 sub OptreeCheck::import { &getCmdLine; } # process @ARGV
162 # %gOpts params comprise a global test-state. Initial values here are
163 # HELP strings, they MUST BE REPLACED by runtime values before use, as
164 # is done by getCmdLine(), via import
166 our %gOpts = # values are replaced at runtime !!
168 # scalar values are help string
169 rextract => 'writes src-code todo same Optree matching',
170 vbasic => 'prints $str and $rex',
171 retry => 'retry failures after turning on re debug',
172 retrydbg => 'retry failures after turning on re debug',
173 selftest => 'self-tests mkCheckRex vs the reference rendering',
174 selfdbg => 'redo failing selftests with re debug',
175 xtest => 'extended thread/non-thread testing',
176 fail => 'force all test to fail, print to stdout',
177 dump => 'dump cmdline arg prcessing',
178 rexpedant => 'try tighter regex, still buggy',
179 help => 0, # 1 ends in die
181 # array values are one-of selections, with 1st value as default
182 # tbc: 1st value is help, 2nd is default
183 testmode => [qw/ native cross both /],
187 our $threaded = 1 if $Config::Config{usethreads};
188 our $platform = ($threaded) ? "threaded" : "plain";
189 our $thrstat = ($threaded) ? "threaded" : "nonthreaded";
191 our ($MatchRetry,$MatchRetryDebug); # let mylike be generic
193 *MatchRetry = \$gOpts{retry}; # but alias it into %gOpts
194 *MatchRetryDebug = \$gOpts{retrydbg}; # but alias it into %gOpts
197 both => [ 'expect', 'expect_nt'],
198 native => [ ($threaded) ? 'expect' : 'expect_nt'],
199 cross => [ !($threaded) ? 'expect' : 'expect_nt'],
200 expect => [ 'expect' ],
201 expect_nt => [ 'expect_nt' ],
204 our %msgs # announce cross-testing.
207 'expect_nt-threaded' => " (Non-threaded-ref on Threaded-build)",
208 'expect-nonthreaded' => " (Threaded-ref on Non-threaded-build)",
209 # native - nothing to say
210 'expect_nt-nonthreaded' => '',
211 'expect-threaded' => '',
215 sub getCmdLine { # import assistant
217 print(qq{\n$0 accepts args to update these state-vars:
218 turn on a flag by typing its name,
219 select a value from list by typing name=val.\n },
221 if grep /help/, @ARGV;
223 # replace values for each key !! MUST MARK UP %gOpts
224 foreach my $opt (keys %gOpts) {
226 # scan ARGV for known params
227 if (ref $gOpts{$opt} eq 'ARRAY') {
229 # $opt is a One-Of construct
230 # replace with valid selection from the list
232 # uhh this WORKS. but it's inscrutable
233 # grep s/$opt=(\w+)/grep {$_ eq $1} @ARGV and $gOpts{$opt}=$1/e, @ARGV;
235 if (grep s/$opt=(\w+)/$tval=$1/e, @ARGV) {
236 # check val before accepting
237 my @allowed = @{$gOpts{$opt}};
238 if (grep { $_ eq $tval } @allowed) {
239 $gOpts{$opt} = $tval;
241 else {die "invalid value: '$tval' for $opt\n"}
244 # take 1st val as default
245 $gOpts{$opt} = ${$gOpts{$opt}}[0]
246 if ref $gOpts{$opt} eq 'ARRAY';
248 else { # handle scalars
250 # if 'opt' is present, true
251 $gOpts{$opt} = (grep /$opt/, @ARGV) ? 1 : 0;
253 # override with 'foo' if 'opt=foo' appears
254 grep s/$opt=(.*)/$gOpts{$opt}=$1/e, @ARGV;
257 print("$0 heres current state:\n", Dumper \%gOpts)
258 if $gOpts{help} or $gOpts{dump};
260 exit if $gOpts{help};
263 ##################################
268 my ($in, $res) = (\%in,0); # set up privates.
270 print "checkOptree args: ",Dumper \%in if $in{dump};
272 skip($in{name}, 1) if $in{skip};
273 return runSelftest(\%in) if $gOpts{selftest};
275 my $rendering = getRendering(\%in); # get the actual output
276 fail("FORCED: $in{name}:\n$rendering") if $gOpts{fail}; # silly ?
278 # Test rendering against ..
279 foreach $want (@{$modes{$gOpts{testmode}}}) {
281 my $rex = mkCheckRex(\%in,$want);
282 my $cross = $msgs{"$want-$thrstat"};
284 # bad is anticipated failure on cross testing ONLY
285 my $bad = (0 or ( $cross && $in{crossfail})
286 or (!$cross && $in{fail})
289 # couldn't bear to pass \%in to likeyn
290 $res = mylike ( # custom test mode stuff
292 $in{retry} || $gOpts{retry},
293 $in{debug} || $gOpts{retrydbg}
295 # remaining is std API
296 $rendering, qr/$rex/ms, "$cross $in{name}")
298 printhelp(\%in, $rendering, $rex);
308 # may help get/keep test output consistent
310 $in->{label} = join(',', map {"$_=>$in->{$_}"}
311 qw( bcopts name prog code ));
315 # generate a set of test-cases from the options
318 foreach $want (@{$modes{$gOpts{testmode}}}) {
327 # tests the test-cases offered (expect, expect_nt)
328 # needs Unification with above.
331 foreach $want (@{$modes{$gOpts{testmode}}}) {}
333 for my $provenance (qw/ expect expect_nt /) {
334 next unless $in->{$provenance};
335 my ($rex,$gospel) = mkCheckRex($in, $provenance);
336 return unless $gospel;
338 my $cross = $msgs{"$provenance-$thrstat"};
339 my $bad = (0 or ( $cross && $in->{crossfail})
340 or (!$cross && $in->{fail})
342 # couldn't bear to pass \%in to likeyn
343 $res = mylike ( [ !$bad,
344 $in->{retry} || $gOpts{retry},
345 $in->{debug} || $gOpts{retrydbg}
347 $rendering, qr/$rex/ms, "$cross $in{name}")
355 # note dependence on unlike()
356 my ($control) = shift;
357 my ($yes,$retry,$debug) = @$control; # or dies
358 my ($got, $expected, $name, @mess) = @_; # pass thru mostly
360 die "unintended usage, expecting Regex". Dumper \@_
361 unless ref $_[1] eq 'Regexp';
363 # same as A ^ B, but B has side effects
364 my $ok = ( (!$yes and unlike($got, $expected, $name, @mess))
365 or ($yes and like($got, $expected, $name, @mess)));
367 if (not $ok and $retry) {
368 # redo, perhaps with use re debug
369 eval "use re 'debug'" if $debug;
370 $ok = (!$yes and unlike($got, $expected, "(RETRY) $name", @mess)
371 or $yes and like($got, $expected, "(RETRY) $name", @mess));
380 die "getRendering: code or prog is required\n"
381 unless $in->{code} or $in->{prog};
383 my @opts = get_bcopts($in);
384 my $rendering = ''; # suppress "Use of uninitialized value in open"
387 $rendering = runperl( switches => ['-w',join(',',"-MO=Concise",@opts)],
388 prog => $in->{prog}, stderr => 1,
391 my $code = $in->{code};
392 unless (ref $code eq 'CODE') {
393 # treat as source, and wrap
394 $code = eval "sub { $code }";
395 die "$@ evaling code 'sub { $in->{code} }'\n"
396 unless ref $code eq 'CODE';
398 # set walk-output b4 compiling, which writes 'announce' line
399 walk_output(\$rendering);
401 fail("forced failure: stdout follows");
402 walk_output(\*STDOUT);
404 my $opwalker = B::Concise::compile(@opts, $code);
405 die "bad BC::compile retval" unless ref $opwalker eq 'CODE';
407 B::Concise::reset_sequence();
414 # collect concise passthru-options if any
418 @opts = (ref $in->{bcopts} eq 'ARRAY')
419 ? @{$in->{bcopts}} : ($in->{bcopts});
424 # needless complexity due to 'too much info' from B::Concise v.60
425 my $announce = 'B::Concise::compile\(CODE\(0x[0-9a-f]+\)\)';;
428 # converts expected text into Regexp which should match against
429 # unaltered version. also adjusts threaded => non-threaded
430 my ($in, $want) = @_;
431 eval "no re 'debug'";
433 my $str = $in->{expect} || $in->{expect_nt}; # standard bias
434 $str = $in->{$want} if $want; # stated pref
436 die "no reftext found for $want: $in->{name}" unless $str;
437 #fail("rex-str is empty, won't allow false positives") unless $str;
439 $str =~ s/^\# //mg; # ease cut-paste testcase authoring
440 my $reftxt = $str; # extra return val !!
442 unless ($gOpts{rexpedant}) {
443 # convert all (args) and [args] to temporary '____'
444 $str =~ s/(\(.*?\))/____/msg;
445 $str =~ s/(\[.*?\])/____/msg;
447 # escape remaining metachars. manual \Q (doesnt escape '+')
448 $str =~ s/([\[\]()*.\$\@\#])/\\$1/msg;
449 #$str =~ s/([*.\$\@\#])/\\$1/msg;
451 # now replace '____' with something that matches both.
452 # bracing style agnosticism is important here, it makes many
453 # threaded / non-threaded diffs irrelevant
454 $str =~ s/____/(\\[.*?\\]|\\(.*?\\))/msg; # capture in case..
456 # no mysterious failures in debugger
457 $str =~ s/(?:next|db)state/(?:next|db)state/msg;
460 # precise/pedantic way - only wildcard nextate, leavesub
462 # escape some literals
463 $str =~ s/([*.\$\@\#])/\\$1/msg;
465 # nextstate. replace args, and work under debugger
466 $str =~ s/(?:next|db)state\(.*?\)/(?:next|db)state\\(.*?\\)/msg;
468 # leavesub refcount changes, dont care
469 $str =~ s/leavesub\[.*?\]/leavesub[.*?]/msg;
471 # wildcard-ify all [contents]
472 $str =~ s/\[.*?\]/[.*?]/msg; # add capture ?
474 # make [] literal now, keeping .* for contents
475 $str =~ s/([\[\]])/\\$1/msg;
477 # threaded <--> non-threaded transforms ??
479 if (not $Config::Config{usethreads}) {
480 # written for T->NT transform
481 # $str =~ s/<\\#>/<\\\$>/msg; # GV on pad, a threads thing ?
482 $str =~ s/PADOP/SVOP/msg; # fix terse output diffs
484 croak "no reftext found for $want: $in->{name}"
485 unless $str =~ /\w+/; # fail unless a real test
487 # $str = '.*' if 1; # sanity test
488 # $str .= 'FAIL' if 1; # sanity test
491 $str =~ s/\t/ +/msg; # not \s+
493 eval "use re 'debug'" if $debug;
497 return ($qr, $reftxt) if wantarray;
502 my ($in, $rendering, $rex) = @_;
503 print "<$rendering>\nVS\n<$reftext>\n" if $gOpts{vbasic};
505 # save this output to afile, edit out 'ok's and 1..N
506 # then perl -d afile, and add re 'debug' to suit.
507 print("\$str = q{$rendering};\n".
508 "\$rex = qr{$reftext};\n".
509 "print \"\$str =~ m{\$rex}ms \";\n".
510 "\$str =~ m{\$rex}ms or print \"doh\\n\";\n\n")
511 if $in{rextract} or $gOpts{rextract};
520 mkCheckRex receives the full testcase object, and constructs a regex.
521 1st, it selects a reftxt from either the expect or expect_nt items.
523 Once selected, reftext massaged & convert into a Regex that accepts
524 'good' concise renderings, with appropriate input variations, but is
525 otherwize as strict as possible. For example, it should *not* match
526 when opcode flags change, or when optimizations convert an op to an
529 =head2 match criteria
531 Opcode arguments (text within braces) are disregarded for matching
532 purposes. This loses some info in 'add[t5]', but greatly simplifys
533 matching 'nextstate(main 22 (eval 10):1)'. Besides, we are testing
534 for regressions, not for complete accuracy.
536 The regex is unanchored, allowing success on simple expectations, such
537 as one with a single 'print' opcode.
539 =head2 complicating factors
541 Note that %in may seem overly complicated, but it's needed to allow
542 mkCheckRex to better support selftest,
544 The emerging complexity is that mkCheckRex must choose which refdata
545 to use as a template for the regex being constructed. This feels like
546 selection mechanics being duplicated.
548 =head1 FEATURES, BUGS, ENHANCEMENTS
550 Hey, they're the same thing now, modulo heisen-phase-shifting, and the
551 probe used to observe them.
555 Test cases were recently doubled, by adding a 2nd ref-data property;
556 expect and expect_nt carry renderings taken from threaded and
557 non-threaded builds. This addition has several benefits:
559 1. native reference data allows closer matching by regex.
560 2. samples can be eyeballed to grok t-nt differences.
561 3. data can help to validate mkCheckRex() operation.
562 4. can develop code to smooth t-nt differences.
563 5. can test with both native and cross+converted rexes
567 Tests should specify both 'expect' and 'expect_nt', making the
568 distinction now will allow a range of behaviors, in escalating
569 thoroughness. This variable is called provenance, indicating where
570 the reftext came from.
572 build_only: tests which dont have the reference-sample of the
573 right provenance will be skipped. NO GOOD.
575 prefer_expect: This is implied standard, as all tests done thus far
576 started here. One way t->nt conversions is done, based upon Config.
578 activetest: do cross-testing when test-case has both, ie also test
579 'expect_nt' references on threaded builds. This is aggressive, and is
580 intended to seek out t<->nt differences. if mkCheckRex knows
581 provenance and Config, it can do 2 way t<->nt conversions.
583 activemapping: This builds upon activetest by controlling whether
584 t<->nt conversions are done, and allows simpler verification that each
585 conversion step is indeed necessary.
587 pedantic: this fails if tests dont have both, whereas above doesn't care.