=head1 NAME

Test::Harness::Beyond - Beyond make test

=head1 Beyond make test

Test::Harness is responsible for running test scripts, analysing
their output and reporting success or failure. When I type
F<make test> (or F<./Build test>) for a module, Test::Harness is usually
used to run the tests (not all modules use Test::Harness but the
majority do).

To start exploring some of the features of Test::Harness I need to
switch from F<make test> to the F<prove> command (which ships with
Test::Harness). For the following examples I'll also need a recent
version of Test::Harness installed; 3.14 is current as I write.

For the examples I'm going to assume that we're working with a
'normal' Perl module distribution. Specifically I'll assume that
typing F<make> or F<./Build> causes the built, ready-to-install module
code to be available below ./blib/lib and ./blib/arch and that
there's a directory called 't' that contains our tests. Test::Harness
isn't hardwired to that configuration but it saves me from explaining
which files live where for each example.

Back to F<prove>; like F<make test> it runs a test suite - but it
provides far more control over which tests are executed, in what
order and how their results are reported. Typically F<make test>
runs all the test scripts below the 't' directory. To do the same
thing with prove I type:

  prove -rb t

The switches here are -r to recurse into any directories below 't'
and -b which adds ./blib/lib and ./blib/arch to Perl's include path
so that the tests can find the code they will be testing. If I'm
testing a module of which an earlier version is already installed
I need to be careful about the include path to make sure I'm not
running my tests against the installed version rather than the new
one that I'm working on.

Unlike F<make test>, typing F<prove> doesn't automatically rebuild
my module. If I forget to run F<make> before F<prove> I will be
testing against older versions of those files - which inevitably
leads to confusion. I either get into the habit of typing

  make && prove -rb t

or - if I have no XS code that needs to be built - I use the modules
below F<lib> instead:

  prove -Ilib -r t

So far I've shown you nothing that F<make test> doesn't do. Let's
fix that.

=head2 Saved State

If I have failing tests in a test suite that consists of more than
a handful of scripts and takes more than a few seconds to run it
rapidly becomes tedious to run the whole test suite repeatedly as
I track down the problems.

I can tell prove just to run the tests that are failing like this:

  prove -b t/this_fails.t t/so_does_this.t

That speeds things up but I have to make a note of which tests are
failing and make sure that I run those tests. Instead I can use
prove's --state switch and have it keep track of failing tests for
me. First I do a complete run of the test suite and tell prove to
save the results:

  prove -rb --state=save t

That stores a machine readable summary of the test run in a file
called '.prove' in the current directory. If I have failures I can
then run just the failing scripts like this:

  prove -b --state=failed

I can also tell prove to save the results again so that it updates
its idea of which tests failed:

  prove -b --state=failed,save

As soon as one of my failing tests passes it will be removed from
the list of failed tests. Eventually I fix them all and prove can
find no failing tests to run:

  Files=0, Tests=0, 0 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)
  Result: NOTESTS

As I work on a particular part of my module it's most likely that
the tests that cover that code will fail. I'd like to run the whole
test suite but have it prioritize these 'hot' tests. I can tell
prove to do this:

  prove -rb --state=hot,save t

All the tests will run but those that failed most recently will be
run first. If no tests have failed since I started saving state all
tests will run in their normal order. This combines full test
coverage with early notification of failures.

The --state switch supports a number of options; for example to run
failed tests first, followed by all remaining tests ordered by the
timestamps of the test scripts - and save the results - I can use:

  prove -rb --state=failed,new,save t

See the prove documentation (type C<prove --man>) for the full list
of state options.

When I tell prove to save state it writes a file called '.prove'
('_prove' on Windows) in the current directory. It's a YAML document
so it's quite easy to write tools of your own that work on the saved
test state - but the format isn't officially documented so it might
change without (much) warning in the future.

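As a sketch of such a tool, here's how I might pull the failing
scripts out of that document with the core CPAN::Meta::YAML module.
The structure below - a top-level 'tests' hash keyed by script name
with a 'last_result' field - is a hypothetical stand-in for the real,
undocumented format, which may well differ between versions:

```perl
use strict;
use warnings;
use CPAN::Meta::YAML;    # in core since Perl 5.14

# Hypothetical '.prove'-style contents; the real format is
# undocumented and may not look like this.
my $yaml_text = <<'END_YAML';
---
generation: 3
tests:
  t/first.t:
    last_result: 0
  t/second.t:
    last_result: 1
END_YAML

# read_string returns an object holding one entry per YAML document.
my $doc = CPAN::Meta::YAML->read_string($yaml_text)->[0];

# Collect the scripts whose last recorded result was a failure.
my @failed = grep { $doc->{tests}{$_}{last_result} }
    keys %{ $doc->{tests} };
print "failed: @failed\n";
```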
=head2 Parallel Testing

If my tests take too long to run I may be able to speed them up by
running multiple test scripts in parallel. This is particularly
effective if the tests are I/O bound or if I have multiple CPU
cores. I tell prove to run my tests in parallel like this:

  prove -rb -j 9 t

The -j switch enables parallel testing; the number that follows it
is the maximum number of tests to run in parallel. Sometimes tests
that pass when run sequentially will fail when run in parallel. For
example if two different test scripts use the same temporary file
or attempt to listen on the same socket I'll have problems running
them in parallel. If I see unexpected failures I need to check my
tests to work out which of them are trampling on the same resource
and rename temporary files or add locks as appropriate.

To get the most performance benefit I want to have the test scripts
that take the longest to run start first - otherwise I'll be waiting
for the one test that takes nearly a minute to complete after all
the others are done. I can use the --state switch to run the tests
in slowest to fastest order:

  prove -rb -j 9 --state=slow,save t

=head2 Non-Perl Tests

The Test Anything Protocol (http://testanything.org/) isn't just
for Perl. Just about any language can be used to write tests that
output TAP. There are TAP based testing libraries for C, C++, PHP,
Python and many others. If I can't find a TAP library for my language
of choice it's easy to generate valid TAP. It looks like this:

  1..3
  ok 1 - init OK
  ok 2 - opened file
  not ok 3 - appended to file

The first line is the plan - it specifies the number of tests I'm
going to run so that it's easy to check that the test script didn't
exit before running all the expected tests. The following lines are
the test results - 'ok' for pass, 'not ok' for fail. Each test has
a number and, optionally, a description. And that's it. Any language
that can produce output like that on STDOUT can be used to write
tests.

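In fact the stream above can be produced by a few lines of plain
Perl with no testing library at all - a sketch to show how little
machinery TAP really needs:

```perl
use strict;
use warnings;

# Each entry: [ pass?, description ] - matching the TAP shown above.
my @results = (
    [ 1, 'init OK' ],
    [ 1, 'opened file' ],
    [ 0, 'appended to file' ],
);

# The plan line, then one result line per test.
my $tap = '1..' . scalar(@results) . "\n";
my $n = 0;
for my $r (@results) {
    my ( $pass, $desc ) = @$r;
    $tap .= sprintf "%s %d - %s\n",
        ( $pass ? 'ok' : 'not ok' ), ++$n, $desc;
}
print $tap;
```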
Recently I've been rekindling a two-decades-old interest in Forth.
Evidently I have a masochistic streak that even Perl can't satisfy.
I want to write tests in Forth and run them using prove (you can
find my gforth TAP experiments at
https://svn.hexten.net/andy/Forth/Testing/). I can use the --exec
switch to tell prove to run the tests using gforth like this:

  prove -r --exec gforth t

Alternately, if the language used to write my tests allows a shebang
line I can use that to specify the interpreter. Here's a test written
in PHP:

  #!/usr/bin/php
  <?php
  print "1..2\n";
  print "ok 1\n";
  print "not ok 2\n";
  ?>

If I save that as t/phptest.t the shebang line will ensure that it
runs correctly along with all my other tests.

=head2 Mixing it up

Subtle interdependencies between test programs can mask problems -
for example an earlier test may neglect to remove a temporary file
that affects the behaviour of a later test. To find this kind of
problem I use the --shuffle and --reverse options to run my tests
in random or reversed order.

=head2 Rolling My Own

If I need a feature that prove doesn't provide I can easily write my own.

Typically you'll want to change how TAP gets I<input> into and I<output>
from the parser. L<App::Prove> supports arbitrary plugins, and
L<TAP::Harness> supports custom I<formatters> and I<source handlers>
that you can load using either L<prove> or L<Module::Build>; there
are many examples to base mine on. For more details see L<App::Prove>,
L<TAP::Parser::SourceHandler>, and L<TAP::Formatter::Base>.

If writing a plugin is not enough, you can write your own test
harness; one of the motives for the 3.00 rewrite of Test::Harness
was to make it easier to subclass and extend.

The Test::Harness module is a compatibility wrapper around TAP::Harness.
For new applications I should use TAP::Harness directly. As we'll
see, prove uses TAP::Harness.

When I run prove it processes its arguments, figures out which test
scripts to run and then passes control to TAP::Harness to run the
tests, parse, analyse and present the results. By subclassing
TAP::Harness I can customise many aspects of the test run.

I want to log my test results in a database so I can track them
over time. To do this I override the summary method in TAP::Harness.
I start with a simple prototype that dumps the results as a YAML
document:

  package My::TAP::Harness;

  use base 'TAP::Harness';
  use YAML;

  sub summary {
      my ( $self, $aggregate ) = @_;
      print Dump($aggregate);
      $self->SUPER::summary($aggregate);
  }

  1;

I need to tell prove to use My::TAP::Harness. If My::TAP::Harness
is on Perl's @INC include path I can type:

  prove --harness=My::TAP::Harness -rb t

If I don't have My::TAP::Harness installed on @INC I need to provide
the correct path to perl when I run prove:

  perl -Ilib `which prove` --harness=My::TAP::Harness -rb t

I can incorporate these options into my own version of prove. It's
pretty simple. Most of the work of prove is handled by App::Prove.
The important code in prove is just:

  use App::Prove;

  my $app = App::Prove->new;
  $app->process_args(@ARGV);
  exit( $app->run ? 0 : 1 );

If I write a subclass of App::Prove I can customise any aspect of
the test runner while inheriting all of prove's behaviour. Here's
F<myprove>:

  #!/usr/bin/env perl

  use lib qw( lib );    # Add ./lib to @INC
  use App::Prove;

  my $app = App::Prove->new;

  # Use custom TAP::Harness subclass
  $app->harness('My::TAP::Harness');

  $app->process_args(@ARGV);
  exit( $app->run ? 0 : 1 );

Now I can run my tests like this:

  ./myprove -rb t

=head2 Deeper Customisation

Now that I know how to subclass and replace TAP::Harness I can
replace any other part of the harness. To do that I need to know
which classes are responsible for which functionality. Here's a
brief guided tour; the default class for each component is shown
in parentheses. Normally any replacements I write will be subclasses
of these default classes.

When I run my tests TAP::Harness creates a scheduler
(TAP::Parser::Scheduler) to work out the running order for the
tests, an aggregator (TAP::Parser::Aggregator) to collect and analyse
the test results and a formatter (TAP::Formatter::Console) to display
those results.

If I'm running my tests in parallel there may also be a multiplexer
(TAP::Parser::Multiplexer) - the component that allows multiple
tests to run simultaneously.

Once it has created those helpers TAP::Harness starts running the
tests. For each test it creates a new parser (TAP::Parser) which
is responsible for running the test script and parsing its output.

To replace any of these components I call one of these harness
methods with the name of the replacement class:

  aggregator_class
  formatter_class
  multiplexer_class
  parser_class
  scheduler_class

For example, to replace the aggregator I would:

  $harness->aggregator_class( 'My::Aggregator' );

Alternately I can supply the names of my substitute classes to the
TAP::Harness constructor:

  my $harness = TAP::Harness->new(
      { aggregator_class => 'My::Aggregator' }
  );

If I need to reach even deeper into the internals of the harness I
can replace the classes that TAP::Parser uses to execute test scripts
and tokenise their output. Before running a test script TAP::Parser
creates a grammar (TAP::Parser::Grammar) to decode the raw TAP into
tokens, a result factory (TAP::Parser::ResultFactory) to turn the
decoded TAP results into objects and, depending on whether it's
running a test script or reading TAP from a file, a scalar or an
array, a source (TAP::Parser::Source) or an iterator
(TAP::Parser::IteratorFactory).

Each of these objects may be replaced by calling one of these parser
methods:

  source_class
  perl_source_class
  grammar_class
  iterator_factory_class
  result_factory_class

=head2 Callbacks

As an alternative to subclassing the components I need to change I
can attach callbacks to the default classes. TAP::Harness exposes
these callbacks:

  parser_args      Tweak the parameters used to create the parser
  made_parser      Just made a new parser
  before_runtests  About to run tests
  after_runtests   Have run all tests
  after_test       Have run an individual test script

TAP::Parser also supports callbacks: bailout, comment, plan, test,
unknown, version and yaml are called for the corresponding TAP
result types; ALL is called for all results, ELSE is called for all
results for which a named callback is not installed and EOF is
called once at the end of each TAP stream.

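Here's a sketch of those parser-level callbacks in action, counting
passes and failures. Rather than running a script it feeds the
parser a TAP string directly via the parser's 'tap' option:

```perl
use strict;
use warnings;
use TAP::Parser;    # ships with Test::Harness

my ( $pass, $fail ) = ( 0, 0 );
my $parser = TAP::Parser->new(
    {   # Parse a literal TAP stream instead of running a script.
        tap       => "1..2\nok 1 - first\nnot ok 2 - second\n",
        callbacks => {
            # Called once per test result.
            test => sub {
                my ($result) = @_;
                $result->is_ok ? $pass++ : $fail++;
            },
            # Called once at the end of the stream.
            EOF => sub { print "pass=$pass fail=$fail\n" },
        },
    }
);
$parser->run;    # consume the whole stream, firing the callbacks
```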
To install a callback I pass the name of the callback and a subroutine
reference to TAP::Harness or TAP::Parser's callback method:

  $harness->callback( after_test => sub {
      my ( $script, $desc, $parser ) = @_;
  } );

I can also pass callbacks to the constructor:

  my $harness = TAP::Harness->new({
      callbacks => {
          after_test => sub {
              my ( $script, $desc, $parser ) = @_;
              # Do something interesting here
          }
      }
  });

When it comes to altering the behaviour of the test harness there's
more than one way to do it. Which way is best depends on my
requirements. In general, if I only want to observe test execution
without changing the harness' behaviour (for example to log test
results to a database) I choose callbacks. If I want to make the
harness behave differently, subclassing gives me more control.

=head2 Parsing TAP

Perhaps I don't need a complete test harness. If I already have a
TAP test log that I need to parse, all I need is TAP::Parser and the
various classes it depends upon. Here's the code I need to run a
test and parse its TAP output:

  use TAP::Parser;

  my $parser = TAP::Parser->new( { source => 't/simple.t' } );
  while ( my $result = $parser->next ) {
      print $result->as_string, "\n";
  }

Alternately I can pass an open filehandle as source and have the
parser read from that rather than attempting to run a test script:

  open my $tap, '<', 'tests.tap'
      or die "Can't read TAP transcript ($!)\n";
  my $parser = TAP::Parser->new( { source => $tap } );
  while ( my $result = $parser->next ) {
      print $result->as_string, "\n";
  }

This approach is useful if I need to convert my TAP based test
results into some other representation. See TAP::Convert::TET
(http://search.cpan.org/dist/TAP-Convert-TET/) for an example of
this approach.

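As a minimal sketch of that kind of conversion, the loop below
reduces a TAP stream (supplied here as a string via the parser's
'tap' option) to a simple pass/fail summary that I could then
re-serialise in whatever representation I choose:

```perl
use strict;
use warnings;
use TAP::Parser;    # ships with Test::Harness

# Parse a literal TAP stream rather than running a script.
my $parser = TAP::Parser->new(
    { tap => "1..3\nok 1\nok 2\nnot ok 3 - broken\n" } );

my %summary = ( pass => 0, fail => 0 );
while ( my $result = $parser->next ) {
    next unless $result->is_test;    # skip plan, comments, etc.
    $result->is_ok ? $summary{pass}++ : $summary{fail}++;
}
print "pass=$summary{pass} fail=$summary{fail}\n";
```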
=head2 Getting Support

The Test::Harness developers hang out on the tapx-dev mailing
list[1]. For discussion of general, language independent TAP issues
there's the tap-l[2] list. Finally there's a wiki dedicated to the
Test Anything Protocol[3]. Contributions to the wiki, patches and
suggestions are all welcome.

=for comment
The URLs in [1] and [2] point to 404 pages. What are currently the
correct URLs?

[1] L<http://www.hexten.net/mailman/listinfo/tapx-dev>

[2] L<http://testanything.org/mailman/listinfo/tap-l>

[3] L<http://testanything.org/>