ext/B/t/OptreeCheck.pm

   1 # OptreeCheck.pm
   2 # package-less .pm file allows 'use OptreeCheck';
   3 # otherwise, it's like "require './test.pl'"
   4
   5 =head1 NAME
   6
   7 OptreeCheck - check optrees
   8
   9 =head1 SYNOPSIS
  10
  11 OptreeCheck supports regression testing of perl's parser, optimizer,
  12 bytecode generator, via a single function: checkOptree(%args).
  13
  14  checkOptree(name   => "your title here",
  15              bcopts => '-exec', # $opt or \@opts, passed to BC::compile
  16              code   => sub {my $a},     # must be CODE ref
  17              # prog   => 'sort @a',     # run in subprocess, aka -MO=Concise
  18              # skip => 1,               # skips test
  19              # todo => 'excuse',        # anticipated failures
  20              # fail => 1                # fails (by redirecting result)
  21              # debug => 1,              # turns on regex debug for match test !!
  22              # retry => 1               # retry with debug on test failure
  23              expect => <<'EOT_EOT', expect_nt => <<'EONT_EONT');
  24  # 1  <;> nextstate(main 45 optree.t:23) v
  25  # 2  <0> padsv[$a:45,46] M/LVINTRO
  26  # 3  <1> leavesub[1 ref] K/REFC,1
  27  EOT_EOT
  28  # 1  <;> nextstate(main 45 optree.t:23) v
  29  # 2  <0> padsv[$a:45,46] M/LVINTRO
  30  # 3  <1> leavesub[1 ref] K/REFC,1
  31  EONT_EONT
  32
  33 =head1 checkOptree(%in) Overview
  34
  35 Runs code or prog through B::Concise, and captures its rendering.
  36
  37 Calls mkCheckRex() to produce a regex which will match the expected
  38 rendering, and fail when it doesn't match.
  39
  40 Also calls like($out,/$regex/,$name), and thereby plugs into the test.pl
  41 framework.
  42
  43 =head1 checkOptree(%Args) API
  44
  45 Accepts %Args, with following requirements and actions:
  46
  47 expect and expect_nt required, not empty, not whitespace.  Its a fatal
  48 error, because false positives are BAD.
  49
  50 Either code or prog must be present.
  51
  52 prog is some source code, and is passed through via runperl, to B::Concise
  53 like this: (bcopts are fixed up for cmdline)
  54
  55     './perl -w -MO=Concise,$bcopts_massaged -e $src'
  56
  57 code is a subref, or $src, like above.  If it's not a subref, it's
  58 treated like source, and wrapped as a subroutine, and passed to
  59 B::Concise::compile():
  60
  61     $subref = eval "sub{$src}";
  62
  63 I suppose I should also explain these more, but..
  64
  65     # prog   => 'sort @a',      # run in subprocess, aka -MO=Concise
  66     # skip => 1,                # skips test
  67     # todo => 'excuse', # anticipated failures
  68     # fail => 1         # fails (by redirecting result)
  69     # debug => 1,               # turns on regex debug for match test !!
  70     # retry => 1                # retry with debug on test failure
  71
  72 =head1 Usage Philosophy
  73
  74 2 platforms --> 2 reftexts: You want an accurate test, independent of
  75 which platform youre on.  This is obvious in retrospect, but ..
  76
  77 I started this with 1 reftext, and tried to use it to construct regexs
  78 for both platforms.  This is extra complexity, trying to build a
  79 single regex for both cases makes the regex more complicated, and
  80 harder to get 'right'.
  81
  82 Having 2 references also allows various 'tests', really explorations
  83 currently.  At the very least, having 2 samples side by side allows
  84 inspection and aids understanding of optrees.
  85
  86 Cross-testing (expect_nt on threaded, expect on non-threaded) exposes
  87 differences in B::Concise output, so mkCheckRex has code to do some
  88 cross-test manipulations.  This area needs more work.
  89
  90 =head1 Test Modes
  91
  92 One consequence of a single-function API is difficulty controlling
  93 test-mode.  Ive chosen for now to use a package hash, %gOpts, to store
  94 test-state.  These properties alter checkOptree() function, either
  95 short-circuiting to selftest, or running a loop that runs the testcase
  96 2^N times, varying conditions each time.  (current N is 2 only).
  97
  98 So Test-mode is controlled with cmdline args, also called options below.
  99 Run with 'help' to see the test-state, and how to change it.
 100
 101 =head2  selftest
 102
 103 This argument invokes runSelftest(), which tests a regex against the
 104 reference renderings that they're made from.  Failure of a regex match
 105 its 'mold' is a strong indicator that mkCheckRex is buggy.
 106
 107 That said, selftest mode currently runs a cross-test too, they're not
 108 completely orthogonal yet.  See below.
 109
 110 =head2 testmode=cross
 111
 112 Cross-testing is purposely creating a T-NT mismatch, looking at the
 113 fallout, and tweaking the regex to deal with it.  Thus tests lead to
 114 'provably' complete understanding of the differences.
 115
 116 The tweaking appears contrary to the 2-refs philosophy, but the tweaks
 117 will be made in conversion-specific code, which (will) handles T->NT
 118 and NT->T separately.  The tweaking is incomplete.
 119
 120 A reasonable 1st step is to add tags to indicate when TonNT or NTonT
 121 is known to fail.  This needs an option to force failure, so the
 122 test.pl reporting mechanics show results to aid the user.
 123
 124 =head2 testmode=native
 125
 126 This is normal mode.  Other valid values are: native, cross, both.
 127
 128 =head2 checkOptree Notes
 129
 130 Accepts test code, renders its optree using B::Concise, and matches that
 131 rendering against a regex built from one of 2 reference-renderings %in data.
 132
 133 The regex is built by mkCheckRex(\%in), which scrubs %in data to
 134 remove match-irrelevancies, such as (args) and [args].  For example,
 135 it strips leading '# ', making it easy to cut-paste new tests into
 136 your test-file, run it, and cut-paste actual results into place.  You
 137 then retest and reedit until all 'errors' are gone.  (now make sure you
 138 haven't 'enshrined' a bug).
 139
 140 name: The test name.  May be augmented by a label, which is built from
 141 important params, and which helps keep names in sync with whats being
 142 tested.
 143
 144 =cut
 145
 146 use Config;
 147 use Carp;
 148 use B::Concise qw(walk_output);
 149 use Data::Dumper;
 150 $Data::Dumper::Sortkeys = 1;
 151
 152 BEGIN {
 153     $SIG{__WARN__} = sub {
 154         my $err = shift;
 155         $err =~ m/Subroutine re::(un)?install redefined/ and return;
 156     };
 157 }
 158
 159 # but wait - more skullduggery !
 160 sub OptreeCheck::import {  &getCmdLine; }       # process @ARGV
 161
 162 # %gOpts params comprise a global test-state.  Initial values here are
 163 # HELP strings, they MUST BE REPLACED by runtime values before use, as
 164 # is done by getCmdLine(), via import
 165
 166 our %gOpts =    # values are replaced at runtime !!
 167     (
 168      # scalar values are help string
 169      rextract   => 'writes src-code todo same Optree matching',
 170      vbasic     => 'prints $str and $rex',
 171      retry      => 'retry failures after turning on re debug',
 172      retrydbg   => 'retry failures after turning on re debug',
 173      selftest   => 'self-tests mkCheckRex vs the reference rendering',
 174      selfdbg    => 'redo failing selftests with re debug',
 175      xtest      => 'extended thread/non-thread testing',
 176      fail       => 'force all test to fail, print to stdout',
 177      dump       => 'dump cmdline arg prcessing',
 178      rexpedant  => 'try tighter regex, still buggy',
 179      help       => 0,   # 1 ends in die
 180
 181      # array values are one-of selections, with 1st value as default
 182      #   tbc: 1st value is help, 2nd is default
 183      testmode => [qw/ native cross both /],
 184     );
 185
 186
 187 our $threaded = 1 if $Config::Config{usethreads};
 188 our $platform = ($threaded) ? "threaded" : "plain";
 189 our $thrstat = ($threaded)  ? "threaded" : "nonthreaded";
 190
 191 our ($MatchRetry,$MatchRetryDebug);     # let mylike be generic
 192 # test.pl-ish hack
 193 *MatchRetry = \$gOpts{retry};           # but alias it into %gOpts
 194 *MatchRetryDebug = \$gOpts{retrydbg};   # but alias it into %gOpts
 195
 196 our %modes = (
 197               both      => [ 'expect', 'expect_nt'],
 198               native    => [ ($threaded) ? 'expect' : 'expect_nt'],
 199               cross     => [ !($threaded) ? 'expect' : 'expect_nt'],
 200               expect    => [ 'expect' ],
 201               expect_nt => [ 'expect_nt' ],
 202         );
 203
 204 our %msgs # announce cross-testing.
 205     = (
 206        # cross-platform
 207        'expect_nt-threaded' => " (Non-threaded-ref on Threaded-build)",
 208        'expect-nonthreaded' => " (Threaded-ref on Non-threaded-build)",
 209        # native - nothing to say
 210        'expect_nt-nonthreaded'  => '',
 211        'expect-threaded'        => '',
 212        );
 213
 214 #######
 215 sub getCmdLine {        # import assistant
 216     # offer help
 217     print(qq{\n$0 accepts args to update these state-vars:
 218              turn on a flag by typing its name,
 219              select a value from list by typing name=val.\n    },
 220           Dumper \%gOpts)
 221         if grep /help/, @ARGV;
 222
 223     # replace values for each key !! MUST MARK UP %gOpts
 224     foreach my $opt (keys %gOpts) {
 225
 226         # scan ARGV for known params
 227         if (ref $gOpts{$opt} eq 'ARRAY') {
 228
 229             # $opt is a One-Of construct
 230             # replace with valid selection from the list
 231
 232             # uhh this WORKS. but it's inscrutable
 233             # grep s/$opt=(\w+)/grep {$_ eq $1} @ARGV and $gOpts{$opt}=$1/e, @ARGV;
 234             my $tval;  # temp
 235             if (grep s/$opt=(\w+)/$tval=$1/e, @ARGV) {
 236                 # check val before accepting
 237                 my @allowed = @{$gOpts{$opt}};
 238                 if (grep { $_ eq $tval } @allowed) {
 239                     $gOpts{$opt} = $tval;
 240                 }
 241                 else {die "invalid value: '$tval' for $opt\n"}
 242             }
 243
 244             # take 1st val as default
 245             $gOpts{$opt} = ${$gOpts{$opt}}[0]
 246                 if ref $gOpts{$opt} eq 'ARRAY';
 247         }
 248         else { # handle scalars
 249
 250             # if 'opt' is present, true
 251             $gOpts{$opt} = (grep /$opt/, @ARGV) ? 1 : 0;
 252
 253             # override with 'foo' if 'opt=foo' appears
 254             grep s/$opt=(.*)/$gOpts{$opt}=$1/e, @ARGV;
 255         }
 256     }
 257     print("$0 heres current state:\n", Dumper \%gOpts)
 258         if $gOpts{help} or $gOpts{dump};
 259
 260     exit if $gOpts{help};
 261 }
 262
 263 ##################################
 264 # API
 265
 266 sub checkOptree {
 267     my %in = @_;
 268     my ($in, $res) = (\%in,0);   # set up privates.
 269
 270     print "checkOptree args: ",Dumper \%in if $in{dump};
 271     SKIP: {
 272         skip($in{name}, 1) if $in{skip};
 273         return runSelftest(\%in) if $gOpts{selftest};
 274
 275         my $rendering = getRendering(\%in);     # get the actual output
 276         fail("FORCED: $in{name}:\n$rendering") if $gOpts{fail}; # silly ?
 277
 278         # Test rendering against ..
 279         foreach $want (@{$modes{$gOpts{testmode}}}) {
 280
 281             my $rex = mkCheckRex(\%in,$want);
 282             my $cross = $msgs{"$want-$thrstat"};
 283
 284             # bad is anticipated failure on cross testing ONLY
 285             my $bad = (0 or ( $cross && $in{crossfail})
 286                          or (!$cross && $in{fail})
 287                          or 0);
 288
 289             # couldn't bear to pass \%in to likeyn
 290             $res = mylike ( # custom test mode stuff
 291                 [ !$bad,
 292                 $in{retry} || $gOpts{retry},
 293                 $in{debug} || $gOpts{retrydbg}
 294                 ],
 295                 # remaining is std API
 296                 $rendering, qr/$rex/ms, "$cross $in{name}")
 297             || 0;
 298             printhelp(\%in, $rendering, $rex);
 299         }
 300     }
 301     $res;
 302 }
 303
 304 #################
 305 # helpers
 306
 307 sub label {
 308     # may help get/keep test output consistent
 309     my ($in) = @_;
 310     $in->{label} = join(',', map {"$_=>$in->{$_}"}
 311                         qw( bcopts name prog code ));
 312 }
 313
 314 sub testCombo {
 315     # generate a set of test-cases from the options
 316     my $in = @_;
 317     my @cases;
 318     foreach $want (@{$modes{$gOpts{testmode}}}) {
 319
 320         push @cases, [ %in,
 321                       ];
 322     }
 323     return @cases;
 324 }
 325
 326 sub runSelftest {
 327     # tests the test-cases offered (expect, expect_nt)
 328     # needs Unification with above.
 329     my ($in) = @_;
 330     my $ok;
 331     foreach $want (@{$modes{$gOpts{testmode}}}) {}
 332
 333     for my $provenance (qw/ expect expect_nt /) {
 334         next unless $in->{$provenance};
 335         my ($rex,$gospel) = mkCheckRex($in, $provenance);
 336         return unless $gospel;
 337
 338         my $cross = $msgs{"$provenance-$thrstat"};
 339         my $bad = (0 or ( $cross && $in->{crossfail})
 340                    or   (!$cross && $in->{fail})
 341                    or 0);
 342             # couldn't bear to pass \%in to likeyn
 343             $res = mylike ( [ !$bad,
 344                               $in->{retry} || $gOpts{retry},
 345                               $in->{debug} || $gOpts{retrydbg}
 346                               ],
 347                             $rendering, qr/$rex/ms, "$cross $in{name}")
 348                 || 0;
 349     }
 350     $ok;
 351 }
 352
 353 # use re;
 354 sub mylike {
 355     # note dependence on unlike()
 356     my ($control) = shift;
 357     my ($yes,$retry,$debug) = @$control; # or dies
 358     my ($got, $expected, $name, @mess) = @_; # pass thru mostly
 359
 360     die "unintended usage, expecting Regex". Dumper \@_
 361         unless ref $_[1] eq 'Regexp';
 362
 363     # same as A ^ B, but B has side effects
 364     my $ok = ( (!$yes   and unlike($got, $expected, $name, @mess))
 365                or ($yes and   like($got, $expected, $name, @mess)));
 366
 367     if (not $ok and $retry) {
 368         # redo, perhaps with use re debug
 369         eval "use re 'debug'" if $debug;
 370         $ok = (!$yes   and unlike($got, $expected, "(RETRY) $name", @mess)
 371                or $yes and   like($got, $expected, "(RETRY) $name", @mess));
 372
 373         no re 'debug';
 374     }
 375     return $ok;
 376 }
 377
 378 sub getRendering {
 379     my ($in) = @_;
 380     die "getRendering: code or prog is required\n"
 381         unless $in->{code} or $in->{prog};
 382
 383     my @opts = get_bcopts($in);
 384     my $rendering = ''; # suppress "Use of uninitialized value in open"
 385
 386     if ($in->{prog}) {
 387         $rendering = runperl( switches => ['-w',join(',',"-MO=Concise",@opts)],
 388                               prog => $in->{prog}, stderr => 1,
 389                               ); #verbose => 1);
 390     } else {
 391         my $code = $in->{code};
 392         unless (ref $code eq 'CODE') {
 393             # treat as source, and wrap
 394             $code = eval "sub { $code }";
 395             die "$@ evaling code 'sub { $in->{code} }'\n"
 396                 unless ref $code eq 'CODE';
 397         }
 398         # set walk-output b4 compiling, which writes 'announce' line
 399         walk_output(\$rendering);
 400         if ($in->{fail}) {
 401             fail("forced failure: stdout follows");
 402             walk_output(\*STDOUT);
 403         }
 404         my $opwalker = B::Concise::compile(@opts, $code);
 405         die "bad BC::compile retval" unless ref $opwalker eq 'CODE';
 406
 407       B::Concise::reset_sequence();
 408         $opwalker->();
 409     }
 410     return $rendering;
 411 }
 412
 413 sub get_bcopts {
 414     # collect concise passthru-options if any
 415     my ($in) = shift;
 416     my @opts = ();
 417     if ($in->{bcopts}) {
 418         @opts = (ref $in->{bcopts} eq 'ARRAY')
 419             ? @{$in->{bcopts}} : ($in->{bcopts});
 420     }
 421     return @opts;
 422 }
 423
 424 # needless complexity due to 'too much info' from B::Concise v.60
 425 my $announce = 'B::Concise::compile\(CODE\(0x[0-9a-f]+\)\)';;
 426
 427 sub mkCheckRex {
 428     # converts expected text into Regexp which should match against
 429     # unaltered version.  also adjusts threaded => non-threaded
 430     my ($in, $want) = @_;
 431     eval "no re 'debug'";
 432
 433     my $str = $in->{expect} || $in->{expect_nt};        # standard bias
 434     $str = $in->{$want} if $want;                       # stated pref
 435
 436     die "no reftext found for $want: $in->{name}" unless $str;
 437     #fail("rex-str is empty, won't allow false positives") unless $str;
 438
 439     $str =~ s/^\# //mg;         # ease cut-paste testcase authoring
 440     my $reftxt = $str;          # extra return val !!
 441
 442     unless ($gOpts{rexpedant}) {
 443         # convert all (args) and [args] to temporary '____'
 444         $str =~ s/(\(.*?\))/____/msg;
 445         $str =~ s/(\[.*?\])/____/msg;
 446
 447         # escape remaining metachars. manual \Q (doesnt escape '+')
 448         $str =~ s/([\[\]()*.\$\@\#])/\\$1/msg;
 449         #$str =~ s/([*.\$\@\#])/\\$1/msg;
 450
 451         # now replace '____' with something that matches both.
 452         #  bracing style agnosticism is important here, it makes many
 453         #  threaded / non-threaded diffs irrelevant
 454         $str =~ s/____/(\\[.*?\\]|\\(.*?\\))/msg; # capture in case..
 455
 456         # no mysterious failures in debugger
 457         $str =~ s/(?:next|db)state/(?:next|db)state/msg;
 458     }
 459     else {
 460         # precise/pedantic way - only wildcard nextate, leavesub
 461
 462         # escape some literals
 463         $str =~ s/([*.\$\@\#])/\\$1/msg;
 464
 465         # nextstate. replace args, and work under debugger
 466         $str =~ s/(?:next|db)state\(.*?\)/(?:next|db)state\\(.*?\\)/msg;
 467
 468         # leavesub refcount changes, dont care
 469         $str =~ s/leavesub\[.*?\]/leavesub[.*?]/msg;
 470
 471         # wildcard-ify all [contents]
 472         $str =~ s/\[.*?\]/[.*?]/msg;    # add capture ?
 473
 474         # make [] literal now, keeping .* for contents
 475         $str =~ s/([\[\]])/\\$1/msg;
 476     }
 477     # threaded <--> non-threaded transforms ??
 478
 479     if (not $Config::Config{usethreads}) {
 480         # written for T->NT transform
 481         # $str =~ s/<\\#>/<\\\$>/msg;   # GV on pad, a threads thing ?
 482         $str =~ s/PADOP/SVOP/msg;       # fix terse output diffs
 483     }
 484     croak "no reftext found for $want: $in->{name}"
 485         unless $str =~ /\w+/; # fail unless a real test
 486
 487     # $str = '.*'       if 1;   # sanity test
 488     # $str .= 'FAIL'    if 1;   # sanity test
 489
 490     # tabs fixup
 491     $str =~ s/\t/ +/msg; # not \s+
 492
 493     eval "use re 'debug'" if $debug;
 494     my $qr = qr/$str/;
 495     no re 'debug';
 496
 497     return ($qr, $reftxt) if wantarray;
 498     return $qr;
 499 }
 500
 501 sub printhelp {
 502     my ($in, $rendering, $rex) = @_;
 503     print "<$rendering>\nVS\n<$reftext>\n" if $gOpts{vbasic};
 504
 505     # save this output to afile, edit out 'ok's and 1..N
 506     # then perl -d afile, and add re 'debug' to suit.
 507     print("\$str = q{$rendering};\n".
 508           "\$rex = qr{$reftext};\n".
 509           "print \"\$str =~ m{\$rex}ms \";\n".
 510           "\$str =~ m{\$rex}ms or print \"doh\\n\";\n\n")
 511         if $in{rextract} or $gOpts{rextract};
 512 }
 513
 514 1;
 515
 516 __END__
 517
 518 =head1 mkCheckRex
 519
 520 mkCheckRex receives the full testcase object, and constructs a regex.
 521 1st, it selects a reftxt from either the expect or expect_nt items.
 522
 523 Once selected, reftext massaged & convert into a Regex that accepts
 524 'good' concise renderings, with appropriate input variations, but is
 525 otherwize as strict as possible.  For example, it should *not* match
 526 when opcode flags change, or when optimizations convert an op to an
 527 ex-op.
 528
 529 =head2 match criteria
 530
 531 Opcode arguments (text within braces) are disregarded for matching
 532 purposes.  This loses some info in 'add[t5]', but greatly simplifys
 533 matching 'nextstate(main 22 (eval 10):1)'.  Besides, we are testing
 534 for regressions, not for complete accuracy.
 535
 536 The regex is unanchored, allowing success on simple expectations, such
 537 as one with a single 'print' opcode.
 538
 539 =head2 complicating factors
 540
 541 Note that %in may seem overly complicated, but it's needed to allow
 542 mkCheckRex to better support selftest,
 543
 544 The emerging complexity is that mkCheckRex must choose which refdata
 545 to use as a template for the regex being constructed.  This feels like
 546 selection mechanics being duplicated.
 547
 548 =head1 FEATURES, BUGS, ENHANCEMENTS
 549
 550 Hey, they're the same thing now, modulo heisen-phase-shifting, and the
 551 probe used to observe them.
 552
 553 =head1 Test Data
 554
 555 Test cases were recently doubled, by adding a 2nd ref-data property;
 556 expect and expect_nt carry renderings taken from threaded and
 557 non-threaded builds.  This addition has several benefits:
 558
 559  1. native reference data allows closer matching by regex.
 560  2. samples can be eyeballed to grok t-nt differences.
 561  3. data can help to validate mkCheckRex() operation.
 562  4. can develop code to smooth t-nt differences.
 563  5. can test with both native and cross+converted rexes
 564
 565 Enhancements:
 566
 567 Tests should specify both 'expect' and 'expect_nt', making the
 568 distinction now will allow a range of behaviors, in escalating
 569 thoroughness.  This variable is called provenance, indicating where
 570 the reftext came from.
 571
 572 build_only: tests which dont have the reference-sample of the
 573 right provenance will be skipped. NO GOOD.
 574
 575 prefer_expect: This is implied standard, as all tests done thus far
 576 started here.  One way t->nt conversions is done, based upon Config.
 577
 578 activetest: do cross-testing when test-case has both, ie also test
 579 'expect_nt' references on threaded builds.  This is aggressive, and is
 580 intended to seek out t<->nt differences.  if mkCheckRex knows
 581 provenance and Config, it can do 2 way t<->nt conversions.
 582
 583 activemapping: This builds upon activetest by controlling whether
 584 t<->nt conversions are done, and allows simpler verification that each
 585 conversion step is indeed necessary.
 586
 587 pedantic: this fails if tests dont have both, whereas above doesn't care.
 588
 589 =cut