Testing is crucial. While many different kinds and levels of testing exist, there's good library support only for unit tests (the Python unittest package and its moral equivalents in other languages). However, unit testing does not cover every kind of testing we may want to do - for example, whole-program tests and integration tests. This is where we usually end up with a custom "test runner" script.

Having written my share of such custom test runners, I've recently gravitated towards a very convenient approach which I want to share here. In short, I'm actually using Python's unittest, combined with the dynamic nature of the language, to run all kinds of tests.

Let's assume my tests are some sort of data files which have to be fed to a program. The output of the program is compared to some "expected results" file, or maybe is encoded in the data file itself in some way. The details are immaterial; seasoned programmers run into such testing rigs all the time. They commonly come up when the program under test is a data-transformation mechanism of some sort (compiler, encryptor, encoder, compressor, translator, etc.)

So you write a "test runner". A script that looks at some directory tree, finds all the "test files" there, runs each through the transformation, compares, reports, etc. I'm sure all these test runners share a lot of common infrastructure - I know that mine do.

Why not employ Python's existing "test runner" capabilities to do the same?

Here's a very short code snippet that can serve as a template to achieve this:

import unittest

class TestsContainer(unittest.TestCase):
    longMessage = True

def make_test_function(description, a, b):
    def test(self):
        self.assertEqual(a, b, description)
    return test

if __name__ == '__main__':
    testsmap = {
        'foo': [1, 1],
        'bar': [1, 2],
        'baz': [5, 5]}

    for name, params in testsmap.items():
        test_func = make_test_function(name, params[0], params[1])
        setattr(TestsContainer, 'test_{0}'.format(name), test_func)

    unittest.main()

What happens here:

  1. The test class TestsContainer will contain dynamically generated test methods.
  2. make_test_function creates a test function (a method, to be precise) that compares its inputs. This is just a trivial template - it could do anything, or there can be multiple such "makers" for multiple purposes.
  3. The loop creates test functions from the data description in testsmap and attaches them to the test class.

Keep in mind that this is a very basic example. I hope it's obvious that testsmap could really be test files found on disk, or whatever else. The main idea here is the dynamic creation of test methods.
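To make that concrete, here is a minimal sketch of driving the same machinery from files on disk. The testdata directory layout, the .in/.expected naming convention and the run_program_under_test helper are all hypothetical placeholders for whatever your project actually uses; the point is only that the data feeding the test-making function can come from anywhere:

import glob
import os
import unittest

class FileTestsContainer(unittest.TestCase):
    longMessage = True

def run_program_under_test(input_path):
    # Placeholder: run the program being tested on input_path and
    # return its output as a string.
    raise NotImplementedError

def make_file_test(input_path, expected_path):
    def test(self):
        with open(expected_path) as f:
            expected = f.read()
        self.assertEqual(run_program_under_test(input_path), expected, input_path)
    return test

if __name__ == '__main__':
    # Assume every testdata/NAME.in has a matching testdata/NAME.expected.
    for input_path in glob.glob(os.path.join('testdata', '*.in')):
        name = os.path.splitext(os.path.basename(input_path))[0]
        expected_path = os.path.splitext(input_path)[0] + '.expected'
        setattr(FileTestsContainer, 'test_' + name,
                make_file_test(input_path, expected_path))

    unittest.main()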

So what do we gain from this, you may ask? Quite a lot. unittest is powerful - armed to the teeth with useful tools for testing. You can now invoke tests from the command line, control verbosity, control "fast fail" behavior, easily filter which tests to run and which not to run, use all kinds of assertion methods for readability and reporting (why write your own smart list comparison assertions?). Moreover, you can build on top of any number of third-party tools for working with unittest results - HTML/XML reporting, logging, automatic CI integration, and so on. The possibilities are endless.
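To give a flavor of what you get for free, here are a few invocations of the snippet above (saved as dynamic_test_methods.py, as in the transcript further down) that exercise some of these built-in capabilities:

$ python dynamic_test_methods.py -v                       # verbose listing of each test
$ python dynamic_test_methods.py --failfast               # stop at the first failure
$ python dynamic_test_methods.py TestsContainer.test_foo  # run just one generated test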

One interesting variation on this theme is aiming the dynamic generation at a different testing "layer". unittest organizes tests into "test cases" (classes), each containing any number of "tests" (methods). In the code above, we generate a bunch of tests into a single test case. Here's a sample invocation to see this in action:

$ python dynamic_test_methods.py -v
test_bar (__main__.TestsContainer) ... FAIL
test_baz (__main__.TestsContainer) ... ok
test_foo (__main__.TestsContainer) ... ok

======================================================================
FAIL: test_bar (__main__.TestsContainer)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "dynamic_test_methods.py", line 8, in test
    self.assertEqual(a, b, description)
AssertionError: 1 != 2 : bar

----------------------------------------------------------------------
Ran 3 tests in 0.001s

FAILED (failures=1)

As you can see, all data pairs in testsmap are translated into distinctly named test methods within the single test case TestsContainer.

Very easily, we can cut this a different way, by generating a whole test case for each data item:

import unittest

class DynamicClassBase(unittest.TestCase):
    longMessage = True

def make_test_function(description, a, b):
    def test(self):
        self.assertEqual(a, b, description)
    return test

if __name__ == '__main__':
    testsmap = {
        'foo': [1, 1],
        'bar': [1, 2],
        'baz': [5, 5]}

    for name, params in testsmap.items():
        test_func = make_test_function(name, params[0], params[1])
        klassname = 'Test_{0}'.format(name)
        globals()[klassname] = type(klassname,
                                   (DynamicClassBase,),
                                   {'test_gen_{0}'.format(name): test_func})

    unittest.main()

Most of the code here remains the same. The difference is in the lines within the loop: now, instead of dynamically creating test methods and attaching them to the test case, we create whole test cases - one per data item, each with a single test method. All test cases derive from DynamicClassBase and hence from unittest.TestCase, so they will be auto-discovered by the unittest machinery. Now an execution will look like this:

$ python dynamic_test_classes.py -v
test_gen_bar (__main__.Test_bar) ... FAIL
test_gen_baz (__main__.Test_baz) ... ok
test_gen_foo (__main__.Test_foo) ... ok

======================================================================
FAIL: test_gen_bar (__main__.Test_bar)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "dynamic_test_classes.py", line 8, in test
    self.assertEqual(a, b, description)
AssertionError: 1 != 2 : bar

----------------------------------------------------------------------
Ran 3 tests in 0.000s

FAILED (failures=1)

Why would you want to generate whole test cases dynamically rather than just single tests? It all depends on your specific needs, really. In general, test cases are better isolated and share less state than tests within a single test case. Moreover, you may have a huge number of tests and want to use tools that shard your tests for parallel execution - in this case you almost certainly need separate test cases.
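For example, with the per-item test cases generated above, individual data items can be addressed directly on the command line (and, by the same token, by external sharding tools that split the run by test case name):

$ python dynamic_test_classes.py Test_foo -v
$ python dynamic_test_classes.py Test_bar Test_baz -v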

I've used this technique in a number of projects over the past couple of years and found it very useful; more than once, I replaced a whole complex test runner program with about 20-30 lines of code using this technique, and gained access to many more capabilities for free.

Python's built-in test discovery, reporting and running facilities are very powerful. Coupled with third-party tools they can be even more powerful. Leveraging all this power for any kind of testing, and not just unit testing, is possible with very little code, due to Python's dynamism. I hope you find it useful too.