Mutation Testing

or who is going to test your tests ?

1st Developers@CERN Forum

Created by Sebastian Witowski

Do you write code ?

Do you test your code ?

Do you test your tests ?

Do you test the tests for your tests ?

Testing tests ?

Richard Lipton, Fault Diagnosis of Computer Programs, 1971

How does mutation testing work ?

Step 1. Change your code in a small way:

a + b ---> a * b
a + 1 ---> a + 2
a + b ---> a + c
a + 1 ---> a + 1
a + 1
(if a == 1 and b > 1) ---> (if a == 1 or b > 1)

These changes resemble small programming errors.
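
A minimal sketch of how a tool might apply one such change automatically, assuming Python's standard `ast` module. The `SwapAddForMult` class and the `add` function are hypothetical names made up for this illustration, not taken from any mutation testing library.

```python
import ast

SOURCE = """
def add(a, b):
    return a + b
"""


class SwapAddForMult(ast.NodeTransformer):
    """Hypothetical mutation operator: turns every `+` into `*`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # mutate nested expressions first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node


# Build the mutant by rewriting the syntax tree of the original source.
mutant_tree = ast.fix_missing_locations(SwapAddForMult().visit(ast.parse(SOURCE)))

# Compile the original and the mutant into separate namespaces.
original_ns, mutant_ns = {}, {}
exec(compile(ast.parse(SOURCE), "<original>", "exec"), original_ns)
exec(compile(mutant_tree, "<mutant>", "exec"), mutant_ns)

print(original_ns["add"](3, 4))  # 7, original computes 3 + 4
print(mutant_ns["add"](3, 4))    # 12, mutant computes 3 * 4
```

Working on the syntax tree rather than on raw text keeps every mutant syntactically valid, which is one plausible reason to prefer it over search-and-replace.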

Step 2. Run your tests

After that, you run your tests and see what happens. If the tests start failing, they have reached the modified code, and there are two possibilities:
- either they died with some error message, because they didn't expect this modification in the code. For example, your function was expecting 2 parameters but suddenly got only one, and that kills your test. This situation is called "weak mutation testing"; it means that your tests are good and that you will notice if something changes.
- or your tests detect the change properly and inform you about it with an assertion failure, which means that your tests are even better. This situation is called "strong mutation testing". It is more powerful, because the tests are actually catching the possible problems, but it is also more difficult to achieve.

Step 3. Get the mutation score

Mutation score = number of mutants killed / number of mutants created
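
As a quick sketch, the score is a plain ratio. The function name and the counts below are made up for the example.

```python
def mutation_score(killed, created):
    """Fraction of created mutants that the test suite killed."""
    if created == 0:
        raise ValueError("no mutants were created")
    return killed / created


# Hypothetical run: 10 mutants created, 8 of them killed by the tests.
print(mutation_score(killed=8, created=10))  # 0.8
```

A score of 1.0 would mean that every mutant was killed; the two mutants that got away in this made-up run are the ones to look at.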

Step 4. Profit

Example time

def multiply(a, b):
    return a * b

from unittest import TestCase

class CalculatorTest(TestCase):
    def test_multiply(self):
        self.assertEqual(multiply(2, 2), 4)

self.assertEqual(multiply(2, 2), 4)

↓

self.assertEqual(multiply(3, 3), 9)
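
Why does the input matter? A small sketch, with hand-written mutants standing in for what a tool would generate (the `make_multiply` and `survivors` helpers are hypothetical):

```python
import operator


def make_multiply(op):
    """Build a variant of multiply() that uses `op` instead of `*`."""
    def multiply(a, b):
        return op(a, b)
    return multiply


# Three hand-written mutants of `return a * b`.
mutants = {
    "a + b": make_multiply(operator.add),
    "a ** b": make_multiply(operator.pow),
    "a - b": make_multiply(operator.sub),
}


def survivors(a, b, expected):
    """Names of the mutants that still return `expected` for (a, b)."""
    return [name for name, func in mutants.items() if func(a, b) == expected]


print(survivors(2, 2, 4))  # ['a + b', 'a ** b']
print(survivors(3, 3, 9))  # []
```

With (2, 2), addition and exponentiation give the same answer as multiplication, so those mutants survive. With (3, 3), all three are killed.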

Equivalent Mutant Problem

index = 0
while True:
    do_stuff()
    index = index + 1
    if index == 10:
        break

if index == 10:
vs.
if index >= 10:
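
A sketch that makes the problem visible: both versions of the condition run the loop body the same number of times, so no test can tell them apart. The `count_iterations` helper is hypothetical, and a counter stands in for `do_stuff()`.

```python
def count_iterations(should_stop):
    """Run the loop from the slide and count how often the body executes."""
    index = 0
    calls = 0
    while True:
        calls += 1  # stands in for do_stuff()
        index = index + 1
        if should_stop(index):
            break
    return calls


original = count_iterations(lambda index: index == 10)
mutant = count_iterations(lambda index: index >= 10)
print(original, mutant)  # 10 10
```

Since `index` grows by exactly 1 from 0, it always hits 10 before it could exceed it, so the mutant can never be killed and would count against the mutation score unless it is excluded.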

Wrap up

The good parts

They detect problems with your tests
They discover dead code
They are automatic
How else would you test your tests ?
(Semi-)Automatic tool for testing ? I'm in !

What could be the advantages of automatic mutation testing tools ?
They can help you detect problems with your tests.
But they can also help you detect problems with your code, such as dead code.
They are automatic. You won't write the mutation tests by hand, because it would take too much time.
You will either use an existing library or, at the very least, write your own.
And let's face it - how else would you measure the quality of your tests ?
We have test coverage statistics for the code, but we don't have anything like that for the tests.
So, if there was a good tool that I could plug into my continuous integration cycle and that would report problems with my tests, then I'm totally sold.
Studies show that software developers spend up to 50% of their time just on testing. I would love to have a tool that gives me automatic feedback on my tests.

The not so good parts

Mutation testing is slow:
(TIME = ALL MUTANTS x ALL TESTS)
Handful of libraries
Equivalent Mutant Problem
Writing complex mutant tests is difficult

And that brings us to the "not so good parts".
First of all - mutation testing is slow. If you think that your whole test suite is slow because it runs for 2 minutes, then imagine that proper mutation testing requires running all your tests for every mutant. There are some studies on how to speed up this process, for example with selective mutation or mutant sampling, but they all boil down to running fewer tests, and that might overlook some bugs.
There are not many good libraries for mutation testing.
This is often caused by the aforementioned Equivalent Mutant Problem, which is very difficult to solve.
Also, writing complex mutations, beyond simple operator changes or variable replacements, is basically impossible.
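
To put rough numbers on the cost, here is a back-of-the-envelope sketch. The count of 500 mutants and the sample size of 50 are assumptions made up for the example; only the 2-minute suite comes from the talk.

```python
import random


def estimated_runtime(num_mutants, suite_seconds):
    """TIME = ALL MUTANTS x ALL TESTS, with the suite time as the unit."""
    return num_mutants * suite_seconds


mutants = [f"mutant_{i}" for i in range(500)]  # hypothetical mutant list
suite_seconds = 120                            # the 2-minute test suite

# Full run: every mutant against the whole suite.
full = estimated_runtime(len(mutants), suite_seconds)

# Mutant sampling: only a random 10% of the mutants are executed.
sample = random.Random(0).sample(mutants, k=50)
sampled = estimated_runtime(len(sample), suite_seconds)

print(full)     # 60000 seconds, almost 17 hours
print(sampled)  # 6000 seconds, 1 hour 40 minutes
```

The sampled run is ten times faster, but a weak spot in the tests goes unnoticed whenever the mutant that would expose it is not in the sample.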

Mutation testing libraries

Mutant - Ruby (last updated September 2015)
VisualMutator - C# (last updated September 2015)
Pitest - Java (last updated August 2015)
Humbug - PHP (last updated May 2015)
MuCheck - Haskell (last updated January 2015)
MutPy - Python3 (last updated January 2014)
Mutator - commercial solution for Java, Ruby, JavaScript and PHP

Example

MutPy (requires Python3)

calculator.py

def multiply(a, b):
    return a * b

test_calculator.py

from unittest import TestCase
from calculator import multiply

class CalculatorTest(TestCase):
    def test_multiply(self):
        self.assertEqual(multiply(2, 2), 4)

self.assertEqual(multiply(2, 2), 4)

↓

self.assertEqual(multiply(3, 3), 9)

The future ?

Thank you !

Any questions ?

Happy ~~coding~~ testing !

This presentation is available on GitHub, so you can see the slides on GitHub Pages.