[p5sagit/p5-mst-13.2.git] / ext / B / t / OptreeCheck.pm

# OptreeCheck.pm
# package-less .pm file allows 'use OptreeCheck';
# otherwise, it's like "require './test.pl'"

=head1 NAME

OptreeCheck - check optrees

=head1 SYNOPSIS

OptreeCheck supports regression testing of perl's parser, optimizer,
bytecode generator, via a single function: checkOptree(%args).

 checkOptree(name   => "your title here",
	     bcopts => '-exec',	# $opt or \@opts, passed to BC::compile
	     code   => sub {my $a},	# must be CODE ref
	     # prog   => 'sort @a',	# run in subprocess, aka -MO=Concise
	     # skip => 1,		# skips test
	     # todo => 'excuse',	# anticipated failures
	     # fail => 1		# fails (by redirecting result)
	     # debug => 1,		# turns on regex debug for match test !!
	     # retry => 1		# retry with debug on test failure
	     expect => <<'EOT_EOT', expect_nt => <<'EONT_EONT');
 # 1  <;> nextstate(main 45 optree.t:23) v
 # 2  <0> padsv[$a:45,46] M/LVINTRO
 # 3  <1> leavesub[1 ref] K/REFC,1
 EOT_EOT
 # 1  <;> nextstate(main 45 optree.t:23) v
 # 2  <0> padsv[$a:45,46] M/LVINTRO
 # 3  <1> leavesub[1 ref] K/REFC,1
 EONT_EONT

=head1 checkOptree(%in) Overview

Runs code or prog through B::Concise, and captures its rendering.

Calls mkCheckRex() to produce a regex which will match the expected
rendering, and fail when it doesn't match.

Also calls like($out,/$regex/,$name), and thereby plugs into the test.pl
framework.

=head1 checkOptree(%Args) API

Accepts %Args, with following requirements and actions:

expect and expect_nt required, not empty, not whitespace.  Its a fatal
error, because false positives are BAD.

Either code or prog must be present.

prog is some source code, and is passed through via runperl, to B::Concise
like this: (bcopts are fixed up for cmdline)

    './perl -w -MO=Concise,$bcopts_massaged -e $src'

code is a subref, or $src, like above.  If it's not a subref, it's
treated like source, and wrapped as a subroutine, and passed to
B::Concise::compile():

    $subref = eval "sub{$src}";

I suppose I should also explain these more, but..

    # prog   => 'sort @a',	# run in subprocess, aka -MO=Concise
    # skip => 1,		# skips test
    # todo => 'excuse',	# anticipated failures
    # fail => 1		# fails (by redirecting result)
    # debug => 1,		# turns on regex debug for match test !!
    # retry => 1		# retry with debug on test failure

=head1 Usage Philosophy

2 platforms --> 2 reftexts: You want an accurate test, independent of
which platform youre on.  This is obvious in retrospect, but ..

I started this with 1 reftext, and tried to use it to construct regexs
for both platforms.  This is extra complexity, trying to build a
single regex for both cases makes the regex more complicated, and
harder to get 'right'.

Having 2 references also allows various 'tests', really explorations
currently.  At the very least, having 2 samples side by side allows
inspection and aids understanding of optrees.

Cross-testing (expect_nt on threaded, expect on non-threaded) exposes
differences in B::Concise output, so mkCheckRex has code to do some
cross-test manipulations.  This area needs more work.

=head1 Test Modes

One consequence of a single-function API is difficulty controlling
test-mode.  Ive chosen for now to use a package hash, %gOpts, to store
test-state.  These properties alter checkOptree() function, either
short-circuiting to selftest, or running a loop that runs the testcase
2^N times, varying conditions each time.  (current N is 2 only).

So Test-mode is controlled with cmdline args, also called options below.
Run with 'help' to see the test-state, and how to change it.

=head2  selftest

This argument invokes runSelftest(), which tests a regex against the
reference renderings that they're made from.  Failure of a regex match
its 'mold' is a strong indicator that mkCheckRex is buggy.

That said, selftest mode currently runs a cross-test too, they're not
completely orthogonal yet.  See below.

=head2 testmode=cross

Cross-testing is purposely creating a T-NT mismatch, looking at the
fallout, and tweaking the regex to deal with it.  Thus tests lead to
'provably' complete understanding of the differences.

The tweaking appears contrary to the 2-refs philosophy, but the tweaks
will be made in conversion-specific code, which (will) handles T->NT
and NT->T separately.  The tweaking is incomplete.

A reasonable 1st step is to add tags to indicate when TonNT or NTonT
is known to fail.  This needs an option to force failure, so the
test.pl reporting mechanics show results to aid the user.

=head2 testmode=native

This is normal mode.  Other valid values are: native, cross, both.

=head2 checkOptree Notes

Accepts test code, renders its optree using B::Concise, and matches that
rendering against a regex built from one of 2 reference-renderings %in data.

The regex is built by mkCheckRex(\%in), which scrubs %in data to
remove match-irrelevancies, such as (args) and [args].  For example,
it strips leading '# ', making it easy to cut-paste new tests into
your test-file, run it, and cut-paste actual results into place.  You
then retest and reedit until all 'errors' are gone.  (now make sure you
haven't 'enshrined' a bug).

name: The test name.  May be augmented by a label, which is built from
important params, and which helps keep names in sync with whats being
tested.

=cut

use Config;
use Carp;
use B::Concise qw(walk_output);
use Data::Dumper;
$Data::Dumper::Sortkeys = 1;

BEGIN {
    $SIG{__WARN__} = sub {
	my $err = shift;
	$err =~ m/Subroutine re::(un)?install redefined/ and return;
    };
}

# but wait - more skullduggery !
sub OptreeCheck::import {  &getCmdLine; }	# process @ARGV

# %gOpts params comprise a global test-state.  Initial values here are
# HELP strings, they MUST BE REPLACED by runtime values before use, as
# is done by getCmdLine(), via import

our %gOpts = 	# values are replaced at runtime !!
    (
     # scalar values are help string
     rextract	=> 'writes src-code todo same Optree matching',
     vbasic	=> 'prints $str and $rex',
     retry	=> 'retry failures after turning on re debug',
     retrydbg	=> 'retry failures after turning on re debug',
     selftest	=> 'self-tests mkCheckRex vs the reference rendering',
     selfdbg	=> 'redo failing selftests with re debug',
     xtest	=> 'extended thread/non-thread testing',
     fail	=> 'force all test to fail, print to stdout',
     dump	=> 'dump cmdline arg prcessing',
     rexpedant	=> 'try tighter regex, still buggy',
     help	=> 0,	# 1 ends in die

     # array values are one-of selections, with 1st value as default
     #   tbc: 1st value is help, 2nd is default
     testmode => [qw/ native cross both /],
    );


our $threaded = 1 if $Config::Config{usethreads};
our $platform = ($threaded) ? "threaded" : "plain";
our $thrstat = ($threaded)  ? "threaded" : "nonthreaded";

our ($MatchRetry,$MatchRetryDebug);	# let mylike be generic
# test.pl-ish hack
*MatchRetry = \$gOpts{retry};		# but alias it into %gOpts
*MatchRetryDebug = \$gOpts{retrydbg};	# but alias it into %gOpts

our %modes = (
	      both	=> [ 'expect', 'expect_nt'],
	      native	=> [ ($threaded) ? 'expect' : 'expect_nt'],
	      cross	=> [ !($threaded) ? 'expect' : 'expect_nt'],
	      expect	=> [ 'expect' ],
	      expect_nt	=> [ 'expect_nt' ],
	);

our %msgs # announce cross-testing.
    = (
       # cross-platform
       'expect_nt-threaded' => " (Non-threaded-ref on Threaded-build)",
       'expect-nonthreaded' => " (Threaded-ref on Non-threaded-build)",
       # native - nothing to say
       'expect_nt-nonthreaded'	=> '',
       'expect-threaded'	=> '',
       );

#######
sub getCmdLine {	# import assistant
    # offer help
    print(qq{\n$0 accepts args to update these state-vars:
	     turn on a flag by typing its name,
	     select a value from list by typing name=val.\n    },
	  Dumper \%gOpts)
	if grep /help/, @ARGV;

    # replace values for each key !! MUST MARK UP %gOpts
    foreach my $opt (keys %gOpts) {

	# scan ARGV for known params
	if (ref $gOpts{$opt} eq 'ARRAY') {

	    # $opt is a One-Of construct
	    # replace with valid selection from the list

	    # uhh this WORKS. but it's inscrutable
	    # grep s/$opt=(\w+)/grep {$_ eq $1} @ARGV and $gOpts{$opt}=$1/e, @ARGV;
	    my $tval;  # temp
	    if (grep s/$opt=(\w+)/$tval=$1/e, @ARGV) {
		# check val before accepting
		my @allowed = @{$gOpts{$opt}};
		if (grep { $_ eq $tval } @allowed) {
		    $gOpts{$opt} = $tval;
		}
		else {die "invalid value: '$tval' for $opt\n"}
	    }

	    # take 1st val as default
	    $gOpts{$opt} = ${$gOpts{$opt}}[0]
		if ref $gOpts{$opt} eq 'ARRAY';
        }
        else { # handle scalars

	    # if 'opt' is present, true
	    $gOpts{$opt} = (grep /$opt/, @ARGV) ? 1 : 0;

	    # override with 'foo' if 'opt=foo' appears
	    grep s/$opt=(.*)/$gOpts{$opt}=$1/e, @ARGV;
	}
    }
    print("$0 heres current state:\n", Dumper \%gOpts)
	if $gOpts{help} or $gOpts{dump};

    exit if $gOpts{help};
}

##################################
# API

sub checkOptree {
    my %in = @_;
    my ($in, $res) = (\%in,0);	 # set up privates.

    print "checkOptree args: ",Dumper \%in if $in{dump};
    SKIP: {
	skip($in{name}, 1) if $in{skip};
	return runSelftest(\%in) if $gOpts{selftest};

	my $rendering = getRendering(\%in);	# get the actual output
	fail("FORCED: $in{name}:\n$rendering") if $gOpts{fail}; # silly ?

	# Test rendering against ..
	foreach $want (@{$modes{$gOpts{testmode}}}) {

	    my $rex = mkCheckRex(\%in,$want);
	    my $cross = $msgs{"$want-$thrstat"};

	    # bad is anticipated failure on cross testing ONLY
	    my $bad = (0 or ( $cross && $in{crossfail})
			 or (!$cross && $in{fail})
			 or 0);

	    # couldn't bear to pass \%in to likeyn
	    $res = mylike ( # custom test mode stuff
		[ !$bad,
		$in{retry} || $gOpts{retry},
		$in{debug} || $gOpts{retrydbg}
		],
		# remaining is std API
		$rendering, qr/$rex/ms, "$cross $in{name}")
	    || 0;
	    printhelp(\%in, $rendering, $rex);
	}
    }
    $res;
}

#################
# helpers

sub label {
    # may help get/keep test output consistent
    my ($in) = @_;
    $in->{label} = join(',', map {"$_=>$in->{$_}"}
			qw( bcopts name prog code ));
}

sub testCombo {
    # generate a set of test-cases from the options
    my $in = @_;
    my @cases;
    foreach $want (@{$modes{$gOpts{testmode}}}) {

	push @cases, [ %in,
		      ];
    }
    return @cases;
}

sub runSelftest {
    # tests the test-cases offered (expect, expect_nt)
    # needs Unification with above.
    my ($in) = @_;
    my $ok;
    foreach $want (@{$modes{$gOpts{testmode}}}) {}

    for my $provenance (qw/ expect expect_nt /) {
	next unless $in->{$provenance};
	my ($rex,$gospel) = mkCheckRex($in, $provenance);
	return unless $gospel;

	my $cross = $msgs{"$provenance-$thrstat"};
	my $bad = (0 or ( $cross && $in->{crossfail})
		   or   (!$cross && $in->{fail})
		   or 0);
	    # couldn't bear to pass \%in to likeyn
	    $res = mylike ( [ !$bad,
			      $in->{retry} || $gOpts{retry},
			      $in->{debug} || $gOpts{retrydbg}
			      ],
			    $rendering, qr/$rex/ms, "$cross $in{name}")
		|| 0;
    }
    $ok;
}

# use re;
sub mylike {
    # note dependence on unlike()
    my ($control) = shift;
    my ($yes,$retry,$debug) = @$control; # or dies
    my ($got, $expected, $name, @mess) = @_; # pass thru mostly

    die "unintended usage, expecting Regex". Dumper \@_
	unless ref $_[1] eq 'Regexp';

    # same as A ^ B, but B has side effects
    my $ok = ( (!$yes   and unlike($got, $expected, $name, @mess))
	       or ($yes and   like($got, $expected, $name, @mess)));

    if (not $ok and $retry) {
	# redo, perhaps with use re debug
	eval "use re 'debug'" if $debug;
	$ok = (!$yes   and unlike($got, $expected, "(RETRY) $name", @mess)
	       or $yes and   like($got, $expected, "(RETRY) $name", @mess));

	no re 'debug';
    }
    return $ok;
}

sub getRendering {
    my ($in) = @_;
    die "getRendering: code or prog is required\n"
	unless $in->{code} or $in->{prog};

    my @opts = get_bcopts($in);
    my $rendering = ''; # suppress "Use of uninitialized value in open"

    if ($in->{prog}) {
	$rendering = runperl( switches => ['-w',join(',',"-MO=Concise",@opts)],
			      prog => $in->{prog}, stderr => 1,
			      ); #verbose => 1);
    } else {
	my $code = $in->{code};
	unless (ref $code eq 'CODE') {
	    # treat as source, and wrap
	    $code = eval "sub { $code }";
	    die "$@ evaling code 'sub { $in->{code} }'\n"
		unless ref $code eq 'CODE';
	}
	# set walk-output b4 compiling, which writes 'announce' line
	walk_output(\$rendering);
	if ($in->{fail}) {
	    fail("forced failure: stdout follows");
	    walk_output(\*STDOUT);
	}
	my $opwalker = B::Concise::compile(@opts, $code);
	die "bad BC::compile retval" unless ref $opwalker eq 'CODE';

      B::Concise::reset_sequence();
	$opwalker->();
    }
    return $rendering;
}

sub get_bcopts {
    # collect concise passthru-options if any
    my ($in) = shift;
    my @opts = ();
    if ($in->{bcopts}) {
	@opts = (ref $in->{bcopts} eq 'ARRAY')
	    ? @{$in->{bcopts}} : ($in->{bcopts});
    }
    return @opts;
}

# needless complexity due to 'too much info' from B::Concise v.60
my $announce = 'B::Concise::compile\(CODE\(0x[0-9a-f]+\)\)';;

sub mkCheckRex {
    # converts expected text into Regexp which should match against
    # unaltered version.  also adjusts threaded => non-threaded
    my ($in, $want) = @_;
    eval "no re 'debug'";

    my $str = $in->{expect} || $in->{expect_nt};	# standard bias
    $str = $in->{$want} if $want;			# stated pref

    die "no reftext found for $want: $in->{name}" unless $str;
    #fail("rex-str is empty, won't allow false positives") unless $str;

    $str =~ s/^\# //mg;		# ease cut-paste testcase authoring
    my $reftxt = $str;		# extra return val !!

    unless ($gOpts{rexpedant}) {
	# convert all (args) and [args] to temporary '____'
	$str =~ s/(\(.*?\))/____/msg;
	$str =~ s/(\[.*?\])/____/msg;

	# escape remaining metachars. manual \Q (doesnt escape '+')
	$str =~ s/([\[\]()*.\$\@\#])/\\$1/msg;
	#$str =~ s/([*.\$\@\#])/\\$1/msg;

	# now replace '____' with something that matches both.
	#  bracing style agnosticism is important here, it makes many
	#  threaded / non-threaded diffs irrelevant
	$str =~ s/____/(\\[.*?\\]|\\(.*?\\))/msg; # capture in case..

	# no mysterious failures in debugger
	$str =~ s/(?:next|db)state/(?:next|db)state/msg;
    }
    else {
	# precise/pedantic way - only wildcard nextate, leavesub

	# escape some literals
	$str =~ s/([*.\$\@\#])/\\$1/msg;

	# nextstate. replace args, and work under debugger
	$str =~ s/(?:next|db)state\(.*?\)/(?:next|db)state\\(.*?\\)/msg;

	# leavesub refcount changes, dont care
	$str =~ s/leavesub\[.*?\]/leavesub[.*?]/msg;

	# wildcard-ify all [contents]
	$str =~ s/\[.*?\]/[.*?]/msg;	# add capture ?

	# make [] literal now, keeping .* for contents
	$str =~ s/([\[\]])/\\$1/msg;
    }
    # threaded <--> non-threaded transforms ??

    if (not $Config::Config{usethreads}) {
	# written for T->NT transform
	# $str =~ s/<\\#>/<\\\$>/msg;	# GV on pad, a threads thing ?
	$str =~ s/PADOP/SVOP/msg;	# fix terse output diffs
    }
    croak "no reftext found for $want: $in->{name}"
	unless $str =~ /\w+/; # fail unless a real test

    # $str = '.*'	if 1;	# sanity test
    # $str .= 'FAIL'	if 1;	# sanity test

    # tabs fixup
    $str =~ s/\t/ +/msg; # not \s+

    eval "use re 'debug'" if $debug;
    my $qr = qr/$str/;
    no re 'debug';

    return ($qr, $reftxt) if wantarray;
    return $qr;
}

sub printhelp {
    my ($in, $rendering, $rex) = @_;
    print "<$rendering>\nVS\n<$reftext>\n" if $gOpts{vbasic};

    # save this output to afile, edit out 'ok's and 1..N
    # then perl -d afile, and add re 'debug' to suit.
    print("\$str = q{$rendering};\n".
	  "\$rex = qr{$reftext};\n".
	  "print \"\$str =~ m{\$rex}ms \";\n".
	  "\$str =~ m{\$rex}ms or print \"doh\\n\";\n\n")
	if $in{rextract} or $gOpts{rextract};
}

1;

__END__

=head1 mkCheckRex

mkCheckRex receives the full testcase object, and constructs a regex.
1st, it selects a reftxt from either the expect or expect_nt items.

Once selected, reftext massaged & convert into a Regex that accepts
'good' concise renderings, with appropriate input variations, but is
otherwize as strict as possible.  For example, it should *not* match
when opcode flags change, or when optimizations convert an op to an
ex-op.

=head2 match criteria

Opcode arguments (text within braces) are disregarded for matching
purposes.  This loses some info in 'add[t5]', but greatly simplifys
matching 'nextstate(main 22 (eval 10):1)'.  Besides, we are testing
for regressions, not for complete accuracy.

The regex is unanchored, allowing success on simple expectations, such
as one with a single 'print' opcode.

=head2 complicating factors

Note that %in may seem overly complicated, but it's needed to allow
mkCheckRex to better support selftest,

The emerging complexity is that mkCheckRex must choose which refdata
to use as a template for the regex being constructed.  This feels like
selection mechanics being duplicated.

=head1 FEATURES, BUGS, ENHANCEMENTS

Hey, they're the same thing now, modulo heisen-phase-shifting, and the
probe used to observe them.

=head1 Test Data

Test cases were recently doubled, by adding a 2nd ref-data property;
expect and expect_nt carry renderings taken from threaded and
non-threaded builds.  This addition has several benefits:

 1. native reference data allows closer matching by regex.
 2. samples can be eyeballed to grok t-nt differences.
 3. data can help to validate mkCheckRex() operation.
 4. can develop code to smooth t-nt differences.
 5. can test with both native and cross+converted rexes

Enhancements:

Tests should specify both 'expect' and 'expect_nt', making the
distinction now will allow a range of behaviors, in escalating
thoroughness.  This variable is called provenance, indicating where
the reftext came from.

build_only: tests which dont have the reference-sample of the
right provenance will be skipped. NO GOOD.

prefer_expect: This is implied standard, as all tests done thus far
started here.  One way t->nt conversions is done, based upon Config.

activetest: do cross-testing when test-case has both, ie also test
'expect_nt' references on threaded builds.  This is aggressive, and is
intended to seek out t<->nt differences.  if mkCheckRex knows
provenance and Config, it can do 2 way t<->nt conversions.

activemapping: This builds upon activetest by controlling whether
t<->nt conversions are done, and allows simpler verification that each
conversion step is indeed necessary.

pedantic: this fails if tests dont have both, whereas above doesn't care.

=cut
Commit	Line	Data
724aa791	1	# OptreeCheck.pm
	2	# package-less .pm file allows 'use OptreeCheck';
	3	# otherwise, it's like "require './test.pl'"
	4
	5	=head1 NAME
	6
	7	OptreeCheck - check optrees
	8
	9	=head1 SYNOPSIS
	10
	11	OptreeCheck supports regression testing of perl's parser, optimizer,
	12	bytecode generator, via a single function: checkOptree(%args).
	13
	14	checkOptree(name => "your title here",
	15	bcopts => '-exec', # $opt or \@opts, passed to BC::compile
	16	code => sub {my $a}, # must be CODE ref
	17	# prog => 'sort @a', # run in subprocess, aka -MO=Concise
	18	# skip => 1, # skips test
	19	# todo => 'excuse', # anticipated failures
	20	# fail => 1 # fails (by redirecting result)
	21	# debug => 1, # turns on regex debug for match test !!
	22	# retry => 1 # retry with debug on test failure
	23	expect => <<'EOT_EOT', expect_nt => <<'EONT_EONT');
	24	# 1 <;> nextstate(main 45 optree.t:23) v
	25	# 2 <0> padsv[$a:45,46] M/LVINTRO
	26	# 3 <1> leavesub[1 ref] K/REFC,1
	27	EOT_EOT
	28	# 1 <;> nextstate(main 45 optree.t:23) v
	29	# 2 <0> padsv[$a:45,46] M/LVINTRO
	30	# 3 <1> leavesub[1 ref] K/REFC,1
	31	EONT_EONT
	32
	33	=head1 checkOptree(%in) Overview
	34
	35	Runs code or prog through B::Concise, and captures its rendering.
	36
	37	Calls mkCheckRex() to produce a regex which will match the expected
	38	rendering, and fail when it doesn't match.
	39
	40	Also calls like($out,/$regex/,$name), and thereby plugs into the test.pl
	41	framework.
	42
	43	=head1 checkOptree(%Args) API
	44
	45	Accepts %Args, with following requirements and actions:
	46
	47	expect and expect_nt required, not empty, not whitespace. Its a fatal
	48	error, because false positives are BAD.
	49
	50	Either code or prog must be present.
	51
	52	prog is some source code, and is passed through via runperl, to B::Concise
	53	like this: (bcopts are fixed up for cmdline)
	54
	55	'./perl -w -MO=Concise,$bcopts_massaged -e $src'
	56
	57	code is a subref, or $src, like above. If it's not a subref, it's
	58	treated like source, and wrapped as a subroutine, and passed to
	59	B::Concise::compile():
	60
	61	$subref = eval "sub{$src}";
	62
	63	I suppose I should also explain these more, but..
	64
65	# prog => 'sort @a', # run in subprocess, aka -MO=Concise
66	# skip => 1, # skips test
67	# todo => 'excuse', # anticipated failures
68	# fail => 1 # fails (by redirecting result)
69	# debug => 1, # turns on regex debug for match test !!
70	# retry => 1 # retry with debug on test failure
71
72	=head1 Usage Philosophy
73
74	2 platforms --> 2 reftexts: You want an accurate test, independent of
75	which platform youre on. This is obvious in retrospect, but ..
76
77	I started this with 1 reftext, and tried to use it to construct regexs
78	for both platforms. This is extra complexity, trying to build a
79	single regex for both cases makes the regex more complicated, and
80	harder to get 'right'.
81
82	Having 2 references also allows various 'tests', really explorations
83	currently. At the very least, having 2 samples side by side allows
84	inspection and aids understanding of optrees.
85
86	Cross-testing (expect_nt on threaded, expect on non-threaded) exposes
87	differences in B::Concise output, so mkCheckRex has code to do some
88	cross-test manipulations. This area needs more work.
89
90	=head1 Test Modes
91
92	One consequence of a single-function API is difficulty controlling
93	test-mode. Ive chosen for now to use a package hash, %gOpts, to store
94	test-state. These properties alter checkOptree() function, either
95	short-circuiting to selftest, or running a loop that runs the testcase
96	2^N times, varying conditions each time. (current N is 2 only).
97
98	So Test-mode is controlled with cmdline args, also called options below.
99	Run with 'help' to see the test-state, and how to change it.
100
101	=head2 selftest
102
103	This argument invokes runSelftest(), which tests a regex against the
104	reference renderings that they're made from. Failure of a regex match
105	its 'mold' is a strong indicator that mkCheckRex is buggy.
106
107	That said, selftest mode currently runs a cross-test too, they're not
108	completely orthogonal yet. See below.
109
110	=head2 testmode=cross
111
112	Cross-testing is purposely creating a T-NT mismatch, looking at the
113	fallout, and tweaking the regex to deal with it. Thus tests lead to
114	'provably' complete understanding of the differences.
115
116	The tweaking appears contrary to the 2-refs philosophy, but the tweaks
117	will be made in conversion-specific code, which (will) handles T->NT
118	and NT->T separately. The tweaking is incomplete.
119
120	A reasonable 1st step is to add tags to indicate when TonNT or NTonT
121	is known to fail. This needs an option to force failure, so the
122	test.pl reporting mechanics show results to aid the user.
123
124	=head2 testmode=native
125
126	This is normal mode. Other valid values are: native, cross, both.
127
128	=head2 checkOptree Notes
129
130	Accepts test code, renders its optree using B::Concise, and matches that
131	rendering against a regex built from one of 2 reference-renderings %in data.
132
133	The regex is built by mkCheckRex(\%in), which scrubs %in data to
134	remove match-irrelevancies, such as (args) and [args]. For example,
135	it strips leading '# ', making it easy to cut-paste new tests into
136	your test-file, run it, and cut-paste actual results into place. You
137	then retest and reedit until all 'errors' are gone. (now make sure you
138	haven't 'enshrined' a bug).
139
140	name: The test name. May be augmented by a label, which is built from
141	important params, and which helps keep names in sync with whats being
142	tested.
143
144	=cut
145
146	use Config;
147	use Carp;
148	use B::Concise qw(walk_output);
149	use Data::Dumper;
150	$Data::Dumper::Sortkeys = 1;
151
152	BEGIN {
153	$SIG{__WARN__} = sub {
154	my $err = shift;
155	$err =~ m/Subroutine re::(un)?install redefined/ and return;
156	};
157	}
158
159	# but wait - more skullduggery !
160	sub OptreeCheck::import { &getCmdLine; } # process @ARGV
161
162	# %gOpts params comprise a global test-state. Initial values here are
163	# HELP strings, they MUST BE REPLACED by runtime values before use, as
164	# is done by getCmdLine(), via import
165
166	our %gOpts = # values are replaced at runtime !!
167	(
168	# scalar values are help string
169	rextract => 'writes src-code todo same Optree matching',
170	vbasic => 'prints $str and $rex',
171	retry => 'retry failures after turning on re debug',
172	retrydbg => 'retry failures after turning on re debug',
173	selftest => 'self-tests mkCheckRex vs the reference rendering',
174	selfdbg => 'redo failing selftests with re debug',
175	xtest => 'extended thread/non-thread testing',
176	fail => 'force all test to fail, print to stdout',
177	dump => 'dump cmdline arg prcessing',
178	rexpedant => 'try tighter regex, still buggy',
179	help => 0, # 1 ends in die
180
181	# array values are one-of selections, with 1st value as default
182	# tbc: 1st value is help, 2nd is default
183	testmode => [qw/ native cross both /],
184	);
185
186
187	our $threaded = 1 if $Config::Config{usethreads};
188	our $platform = ($threaded) ? "threaded" : "plain";
189	our $thrstat = ($threaded) ? "threaded" : "nonthreaded";
190
191	our ($MatchRetry,$MatchRetryDebug); # let mylike be generic
192	# test.pl-ish hack
193	*MatchRetry = \$gOpts{retry}; # but alias it into %gOpts
194	*MatchRetryDebug = \$gOpts{retrydbg}; # but alias it into %gOpts
195
196	our %modes = (
197	both => [ 'expect', 'expect_nt'],
198	native => [ ($threaded) ? 'expect' : 'expect_nt'],
199	cross => [ !($threaded) ? 'expect' : 'expect_nt'],
200	expect => [ 'expect' ],
201	expect_nt => [ 'expect_nt' ],
202	);
203
204	our %msgs # announce cross-testing.
205	= (
206	# cross-platform
207	'expect_nt-threaded' => " (Non-threaded-ref on Threaded-build)",
208	'expect-nonthreaded' => " (Threaded-ref on Non-threaded-build)",
209	# native - nothing to say
210	'expect_nt-nonthreaded' => '',
211	'expect-threaded' => '',
212	);
213
214	#######
215	sub getCmdLine { # import assistant
216	# offer help
217	print(qq{\n$0 accepts args to update these state-vars:
218	turn on a flag by typing its name,
219	select a value from list by typing name=val.\n },
220	Dumper \%gOpts)
221	if grep /help/, @ARGV;
222
223	# replace values for each key !! MUST MARK UP %gOpts
224	foreach my $opt (keys %gOpts) {
225
226	# scan ARGV for known params
227	if (ref $gOpts{$opt} eq 'ARRAY') {
228
229	# $opt is a One-Of construct
230	# replace with valid selection from the list
231
232	# uhh this WORKS. but it's inscrutable
233	# grep s/$opt=(\w+)/grep {$_ eq $1} @ARGV and $gOpts{$opt}=$1/e, @ARGV;
234	my $tval; # temp
235	if (grep s/$opt=(\w+)/$tval=$1/e, @ARGV) {
236	# check val before accepting
237	my @allowed = @{$gOpts{$opt}};
238	if (grep { $_ eq $tval } @allowed) {
239	$gOpts{$opt} = $tval;
240	}
241	else {die "invalid value: '$tval' for $opt\n"}
242	}
243
244	# take 1st val as default
245	$gOpts{$opt} = ${$gOpts{$opt}}[0]
246	if ref $gOpts{$opt} eq 'ARRAY';
247	}
248	else { # handle scalars
249
250	# if 'opt' is present, true
251	$gOpts{$opt} = (grep /$opt/, @ARGV) ? 1 : 0;
252
253	# override with 'foo' if 'opt=foo' appears
254	grep s/$opt=(.*)/$gOpts{$opt}=$1/e, @ARGV;
255	}
256	}
257	print("$0 heres current state:\n", Dumper \%gOpts)
258	if $gOpts{help} or $gOpts{dump};
259
260	exit if $gOpts{help};
261	}
262
263	##################################
264	# API
265
266	sub checkOptree {
267	my %in = @_;
268	my ($in, $res) = (\%in,0); # set up privates.
269
270	print "checkOptree args: ",Dumper \%in if $in{dump};
271	SKIP: {
272	skip($in{name}, 1) if $in{skip};
273	return runSelftest(\%in) if $gOpts{selftest};
274
275	my $rendering = getRendering(\%in); # get the actual output
276	fail("FORCED: $in{name}:\n$rendering") if $gOpts{fail}; # silly ?
277
278	# Test rendering against ..
279	foreach $want (@{$modes{$gOpts{testmode}}}) {
280
281	my $rex = mkCheckRex(\%in,$want);
282	my $cross = $msgs{"$want-$thrstat"};
283
284	# bad is anticipated failure on cross testing ONLY
285	my $bad = (0 or ( $cross && $in{crossfail})
286	or (!$cross && $in{fail})
287	or 0);
288
289	# couldn't bear to pass \%in to likeyn
290	$res = mylike ( # custom test mode stuff
291	[ !$bad,
292	$in{retry} \|\| $gOpts{retry},
293	$in{debug} \|\| $gOpts{retrydbg}
294	],
295	# remaining is std API
296	$rendering, qr/$rex/ms, "$cross $in{name}")
297	\|\| 0;
298	printhelp(\%in, $rendering, $rex);
299	}
300	}
301	$res;
302	}
303
304	#################
305	# helpers
306
307	sub label {
308	# may help get/keep test output consistent
309	my ($in) = @_;
310	$in->{label} = join(',', map {"$_=>$in->{$_}"}
311	qw( bcopts name prog code ));
312	}
313
314	sub testCombo {
315	# generate a set of test-cases from the options
316	my $in = @_;
317	my @cases;
318	foreach $want (@{$modes{$gOpts{testmode}}}) {
319
320	push @cases, [ %in,
321	];
322	}
323	return @cases;
324	}
325
326	sub runSelftest {
327	# tests the test-cases offered (expect, expect_nt)
328	# needs Unification with above.
329	my ($in) = @_;
330	my $ok;
331	foreach $want (@{$modes{$gOpts{testmode}}}) {}
332
333	for my $provenance (qw/ expect expect_nt /) {
334	next unless $in->{$provenance};
335	my ($rex,$gospel) = mkCheckRex($in, $provenance);
336	return unless $gospel;
337
338	my $cross = $msgs{"$provenance-$thrstat"};
339	my $bad = (0 or ( $cross && $in->{crossfail})
340	or (!$cross && $in->{fail})
341	or 0);
342	# couldn't bear to pass \%in to likeyn
343	$res = mylike ( [ !$bad,
344	$in->{retry} \|\| $gOpts{retry},
345	$in->{debug} \|\| $gOpts{retrydbg}
346	],
347	$rendering, qr/$rex/ms, "$cross $in{name}")
348	\|\| 0;
349	}
350	$ok;
351	}
352
353	# use re;
354	sub mylike {
355	# note dependence on unlike()
356	my ($control) = shift;
357	my ($yes,$retry,$debug) = @$control; # or dies
358	my ($got, $expected, $name, @mess) = @_; # pass thru mostly
359
360	die "unintended usage, expecting Regex". Dumper \@_
361	unless ref $_[1] eq 'Regexp';
362
363	# same as A ^ B, but B has side effects
364	my $ok = ( (!$yes and unlike($got, $expected, $name, @mess))
365	or ($yes and like($got, $expected, $name, @mess)));
366
367	if (not $ok and $retry) {
368	# redo, perhaps with use re debug
369	eval "use re 'debug'" if $debug;
370	$ok = (!$yes and unlike($got, $expected, "(RETRY) $name", @mess)
371	or $yes and like($got, $expected, "(RETRY) $name", @mess));
372
373	no re 'debug';
374	}
375	return $ok;
376	}
377
378	sub getRendering {
379	my ($in) = @_;
380	die "getRendering: code or prog is required\n"
381	unless $in->{code} or $in->{prog};
382
383	my @opts = get_bcopts($in);
384	my $rendering = ''; # suppress "Use of uninitialized value in open"
385
386	if ($in->{prog}) {
387	$rendering = runperl( switches => ['-w',join(',',"-MO=Concise",@opts)],
388	prog => $in->{prog}, stderr => 1,
389	); #verbose => 1);
390	} else {
391	my $code = $in->{code};
392	unless (ref $code eq 'CODE') {
393	# treat as source, and wrap
394	$code = eval "sub { $code }";
395	die "$@ evaling code 'sub { $in->{code} }'\n"
396	unless ref $code eq 'CODE';
397	}
398	# set walk-output b4 compiling, which writes 'announce' line
399	walk_output(\$rendering);
400	if ($in->{fail}) {
401	fail("forced failure: stdout follows");
402	walk_output(\*STDOUT);
403	}
404	my $opwalker = B::Concise::compile(@opts, $code);
405	die "bad BC::compile retval" unless ref $opwalker eq 'CODE';
406
407	B::Concise::reset_sequence();
408	$opwalker->();
409	}
410	return $rendering;
411	}
412
413	sub get_bcopts {
414	# collect concise passthru-options if any
415	my ($in) = shift;
416	my @opts = ();
417	if ($in->{bcopts}) {
418	@opts = (ref $in->{bcopts} eq 'ARRAY')
419	? @{$in->{bcopts}} : ($in->{bcopts});
420	}
421	return @opts;
422	}
423
424	# needless complexity due to 'too much info' from B::Concise v.60
425	my $announce = 'B::Concise::compile\(CODE\(0x[0-9a-f]+\)\)';;
426
427	sub mkCheckRex {
428	# converts expected text into Regexp which should match against
429	# unaltered version. also adjusts threaded => non-threaded
430	my ($in, $want) = @_;
431	eval "no re 'debug'";
432
433	my $str = $in->{expect} \|\| $in->{expect_nt}; # standard bias
434	$str = $in->{$want} if $want; # stated pref
435
436	die "no reftext found for $want: $in->{name}" unless $str;
437	#fail("rex-str is empty, won't allow false positives") unless $str;
438
439	$str =~ s/^\# //mg; # ease cut-paste testcase authoring
440	my $reftxt = $str; # extra return val !!
441
442	unless ($gOpts{rexpedant}) {
443	# convert all (args) and [args] to temporary '____'
444	$str =~ s/(\(.*?\))/____/msg;
445	$str =~ s/(\[.*?\])/____/msg;
446
447	# escape remaining metachars. manual \Q (doesnt escape '+')
448	$str =~ s/([\[\]()*.\$\@\#])/\\$1/msg;
449	#$str =~ s/([*.\$\@\#])/\\$1/msg;
450
451	# now replace '____' with something that matches both.
452	# bracing style agnosticism is important here, it makes many
453	# threaded / non-threaded diffs irrelevant
454	$str =~ s/____/(\\[.?\\]\|\\(.?\\))/msg; # capture in case..
455
456	# no mysterious failures in debugger
457	$str =~ s/(?:next\|db)state/(?:next\|db)state/msg;
458	}
459	else {
460	# precise/pedantic way - only wildcard nextate, leavesub
461
462	# escape some literals
463	$str =~ s/([*.\$\@\#])/\\$1/msg;
464
465	# nextstate. replace args, and work under debugger
466	$str =~ s/(?:next\|db)state\(.?\)/(?:next\|db)state\\(.?\\)/msg;
467
468	# leavesub refcount changes, dont care
469	$str =~ s/leavesub\[.?\]/leavesub[.?]/msg;
470
471	# wildcard-ify all [contents]
472	$str =~ s/\[.?\]/[.?]/msg; # add capture ?
473
474	# make [] literal now, keeping .* for contents
475	$str =~ s/([\[\]])/\\$1/msg;
476	}
477	# threaded <--> non-threaded transforms ??
478
479	if (not $Config::Config{usethreads}) {
480	# written for T->NT transform
481	# $str =~ s/<\\#>/<\\\$>/msg; # GV on pad, a threads thing ?
482	$str =~ s/PADOP/SVOP/msg; # fix terse output diffs
483	}
484	croak "no reftext found for $want: $in->{name}"
485	unless $str =~ /\w+/; # fail unless a real test
486
487	# $str = '.*' if 1; # sanity test
488	# $str .= 'FAIL' if 1; # sanity test
489
490	# tabs fixup
491	$str =~ s/\t/ +/msg; # not \s+
492
493	eval "use re 'debug'" if $debug;
494	my $qr = qr/$str/;
495	no re 'debug';
496
497	return ($qr, $reftxt) if wantarray;
498	return $qr;
499	}
500
501	sub printhelp {
502	my ($in, $rendering, $rex) = @_;
503	print "<$rendering>\nVS\n<$reftext>\n" if $gOpts{vbasic};
504
505	# save this output to afile, edit out 'ok's and 1..N
506	# then perl -d afile, and add re 'debug' to suit.
507	print("\$str = q{$rendering};\n".
508	"\$rex = qr{$reftext};\n".
509	"print \"\$str =~ m{\$rex}ms \";\n".
510	"\$str =~ m{\$rex}ms or print \"doh\\n\";\n\n")
511	if $in{rextract} or $gOpts{rextract};
512	}
513
514	1;
515
516	__END__
517
518	=head1 mkCheckRex
519
520	mkCheckRex receives the full testcase object, and constructs a regex.
521	1st, it selects a reftxt from either the expect or expect_nt items.
522
523	Once selected, reftext massaged & convert into a Regex that accepts
524	'good' concise renderings, with appropriate input variations, but is
525	otherwize as strict as possible. For example, it should not match
526	when opcode flags change, or when optimizations convert an op to an
527	ex-op.
528
529	=head2 match criteria
530
531	Opcode arguments (text within braces) are disregarded for matching
532	purposes. This loses some info in 'add[t5]', but greatly simplifys
533	matching 'nextstate(main 22 (eval 10):1)'. Besides, we are testing
534	for regressions, not for complete accuracy.
535
536	The regex is unanchored, allowing success on simple expectations, such
537	as one with a single 'print' opcode.
538
539	=head2 complicating factors
540
541	Note that %in may seem overly complicated, but it's needed to allow
542	mkCheckRex to better support selftest,
543
544	The emerging complexity is that mkCheckRex must choose which refdata
545	to use as a template for the regex being constructed. This feels like
546	selection mechanics being duplicated.
547
548	=head1 FEATURES, BUGS, ENHANCEMENTS
549
550	Hey, they're the same thing now, modulo heisen-phase-shifting, and the
551	probe used to observe them.
552
553	=head1 Test Data
554
555	Test cases were recently doubled, by adding a 2nd ref-data property;
556	expect and expect_nt carry renderings taken from threaded and
557	non-threaded builds. This addition has several benefits:
558
559	1. native reference data allows closer matching by regex.
560	2. samples can be eyeballed to grok t-nt differences.
561	3. data can help to validate mkCheckRex() operation.
562	4. can develop code to smooth t-nt differences.
563	5. can test with both native and cross+converted rexes
564
565	Enhancements:
566
567	Tests should specify both 'expect' and 'expect_nt', making the
568	distinction now will allow a range of behaviors, in escalating
569	thoroughness. This variable is called provenance, indicating where
570	the reftext came from.
571
572	build_only: tests which dont have the reference-sample of the
573	right provenance will be skipped. NO GOOD.
574
575	prefer_expect: This is implied standard, as all tests done thus far
576	started here. One way t->nt conversions is done, based upon Config.
577
578	activetest: do cross-testing when test-case has both, ie also test
579	'expect_nt' references on threaded builds. This is aggressive, and is
580	intended to seek out t<->nt differences. if mkCheckRex knows
581	provenance and Config, it can do 2 way t<->nt conversions.
582
583	activemapping: This builds upon activetest by controlling whether
584	t<->nt conversions are done, and allows simpler verification that each
585	conversion step is indeed necessary.
586
587	pedantic: this fails if tests dont have both, whereas above doesn't care.
588
589	=cut