Is it a unit test or an integration test?

We didn’t want to spend time agonizing over whether a test should be considered a unit test or an integration test.  We wanted an easy way to decide, such that the resulting division of tests would meet our needs. Here are some guidelines I wrote up to that end a couple of years ago, as a companion article to the Why not just integration tests article.


So you’re writing a code-based test, and first off you need to decide whether it should be considered a unit test or an integration test so that you know how to name the test class. How do you decide?

1. We should consider it a unit test if…

There are two things we want to be true about a thing we call a unit test [1]:

  • It runs fast
  • It is geared toward error localization

1.1. It Runs Fast

To consider a test a unit test, it needs to not take very long to run.

Why do we say that?

Speed rationale

The reason we don’t want unit tests to take a long time is that we want to be able to run them often during development. We want to be able to code a little, run our tests, code a little more, run our tests… If our tests are too slow, it will break the rhythm, and in practice we will just end up not running our tests very often.

Speed guidelines

Ok, so what kind of speed are we looking for?

Unit Test Speed Guidelines

  • Ideally a unit test class will run in under: 0.1 second
  • We should aim for a unit test to run in: < 1 second
  • Many of our unit tests may run in: 1-3 seconds
  • Some of our unit tests may take as long as: ~5 seconds

If a test tends to take closer to 10 seconds to run, it needs to go in the integration test bucket.
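
As a rough illustration, the speed bands above could be written down as a small helper that times a test and says which band it lands in. This is only a sketch in Java (the language of the later examples here). The names SpeedGuideline, Band, classify, and time are made up for illustration, and treating the 5 to 10 second range as a judgment call is an assumption: the guidelines only say that tests taking closer to 10 seconds belong in the integration bucket.

```java
import java.time.Duration;

/** Hypothetical helper that maps a measured run time onto the speed bands. */
final class SpeedGuideline {
    enum Band { IDEAL, TARGET, COMMON, TOLERABLE, JUDGMENT_CALL, INTEGRATION }

    static Band classify(Duration runTime) {
        long ms = runTime.toMillis();
        if (ms < 100) return Band.IDEAL;             // under 0.1 second
        if (ms < 1_000) return Band.TARGET;          // under 1 second
        if (ms <= 3_000) return Band.COMMON;         // 1-3 seconds
        if (ms <= 5_000) return Band.TOLERABLE;      // up to about 5 seconds
        if (ms < 10_000) return Band.JUDGMENT_CALL;  // assumption: the guidelines leave this range open
        return Band.INTEGRATION;                     // around 10 seconds or more
    }

    /** Times a single run of the given test body. */
    static Duration time(Runnable testBody) {
        long start = System.nanoTime();
        testBody.run();
        return Duration.ofNanos(System.nanoTime() - start);
    }
}
```

For example, SpeedGuideline.classify(SpeedGuideline.time(someTest)) would report the band for one run of someTest. A slow first run or a busy machine can still distort a single measurement.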

What about a slow first run and quick later runs?

Some tests have to do some setup that may take, say, 15 seconds the first time the test is run in a namespace, but then subsequent runs take less than one second. That is ok: the slow first run is not likely to be a barrier to people running the test often, since subsequent runs are fast.

My PC is slow, or I was running a sandbox update, and…

There may be “artificial” conditions that make your test run’s duration approach 10 seconds, when in a more normal situation the test run duration would be much shorter. Use your judgment in these situations, keeping in mind that the main thing is to keep our body of unit tests fast enough to run often.

1.2. It Is Geared Toward Error Localization

To consider a test a unit test, it must also be “geared toward error localization”.

Error Localization
When a failing test points you to the errant code that is responsible for the failure.

Why do we want error localization?

One industry guru puts it this way:

As tests get further from what they test, it is harder to determine what a test failure means. Often it takes considerable work to pinpoint the source of a test failure. You have to look at the test inputs, look at the failure, and determine where along the path from inputs to outputs the failure occurred. Yes, we have to do that for unit tests also, but often the work is trivial. [2]

So even if a test is fast, if the main assertion(s) of the test methods seem to be lost among all the setup and tear-down code, it probably belongs in the integration test bucket.

2. Otherwise It’s an Integration Test…

If your test is too slow to fall within the unit test speed guidelines, or if it is geared more toward broad coverage, then it belongs in the integration test bucket.

3. …Unless It’s One of These Other Kinds of Tests

Manual test helpers

If your test can’t do everything automatically, but instead performs setup to make it quicker to perform manual testing, it’s a manual test helper, and it belongs in the test.manual hierarchy.

System integrity tests

If your test tests delivered data rather than code (such as making sure that each delivered field has a table specified or that each delivered R-model relationship is owned by serial number 8500), it is a system integrity test, and it belongs in the test.sysinteg hierarchy.

4. Other Issues When Writing Tests

TODO: Answer other questions about tests, such as:

  • Is it ok to change system settings in a unit test? in an integration test? (yes, but the test should leave them to the way it found them)
  • How strict should we be about a test cleaning up records, globals, log messages, etc. it creates? (Clean up whatever you can – accounts with payments on them are a known issue…)
  • Sub-issue of the above: should we recommend that tests use transactions to undo complex things like account creation? What level of endorsement would we give: Is it preferred, acceptable where needed, or discouraged?
  • Is a test class responsible for making sure two instances of itself don’t run concurrently, if it’s not thread-safe? Or should we assume that only one process at a time will be running tests in a namespace?
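
For the first question above, the "leave them the way it found them" rule can be sketched as a save-and-restore wrapper. This is a hedged illustration only: a JVM system property stands in for whatever a system setting is in your environment, and SettingGuard and withSystemProperty are hypothetical names.

```java
import java.util.function.Supplier;

/** Hypothetical helper: change a setting for the duration of a test body, then put it back. */
final class SettingGuard {
    static <T> T withSystemProperty(String key, String value, Supplier<T> body) {
        String previous = System.getProperty(key);   // null means the setting was not set at all
        System.setProperty(key, value);
        try {
            return body.get();
        } finally {
            // Restore even if the body throws, so a failing test does not leak the change
            if (previous == null) {
                System.clearProperty(key);
            } else {
                System.setProperty(key, previous);
            }
        }
    }
}
```

The try/finally is the important part: the restore runs whether the test body passes, fails, or throws.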

Notes

  1. We’re explaining what we will consider to be a unit test for our purposes, versus what we will consider to be an integration test. We’re not trying to address the question of what really is a unit test versus what really is an integration test. For instance, Michael Feathers says that if it writes to the database it’s not a unit test, but for our purposes if it writes to the database and it’s still fast enough, it can go in the unit test bucket.
  2. Michael Feathers, in Working Effectively with Legacy Code, p. 12.

— DanielMeyer – 06 Dec 2007

Why not just integration tests?

The following is a mildly modified version of a post I made to our internal TWiki web a couple of years ago, under the title Why Not Just Integration Tests?


Do we really need actual UNIT tests? Why aren’t integration tests enough?

1. What’s an integration test?

An integration test:

  • Tends to test more than one source file at a time
  • May depend on other subsystems (e.g., may write to the database, need the workflow monitor to be running, perform an actual import, do an actual release load, need ECP set up between your computer and another, need a real dialer configured, etc.)
  • Creates the data it needs for its test scenario (tables, imports, workflows, clients, accounts…) [1]

2. Advantages of Integration Tests Over Unit Tests

There are certain things that integration tests do better than unit tests:

  • You don’t have to break as many dependencies to get the code under test. Instead, you take the code you’re changing together with all its dependencies
  • They provide basic test coverage for large areas of the system, which is useful when great expanses of the codebase are not under test.
  • They test how code components work together. This is something that is not in unit tests’ job description, and it’s an important item.

Integration tests are kind of like the Marines. They go in first and give some cover to the rest of the troops. But you can’t win a full-scale war with just the Marines – after they’ve established a beachhead, it’s time to send in the rest of the troops.

3. Disadvantages of Integration Tests

Integration tests do have disadvantages as well:

  • They are harder to read and maintain. Because integration tests generally need to perform setup for the tested code’s dependencies, the code of integration tests tends to be thicker and harder to read. It’s easy to get lost in what is setup and what is the main test. And it can take more careful analysis to be sure the test itself doesn’t have a logic error in it.
  • Their code coverage is low. Even if your integration test covers several scenarios, getting anywhere near complete code coverage is usually somewhere between tediously difficult and impossible. Running a whole scenario is just too coarse of a tool to get that kind of coverage. (As a sidenote, this is also one reason manual testing is not enough.)
  • They tend to be slow-running (30 seconds to half an hour) [2].
  • They take longer to write. In the short term, when testing legacy code, integration tests are still quicker to write than unit tests, since changing the production code to be unit testable takes time and effort. But once you break the dependencies of a class or routine for unit testing, future unit tests will no longer pay that cost, and integration tests will be the slower ones to write (probably by a large margin).

4. Conclusion

Integration tests are important and won’t be going away, but to get to the next level we need to be able to unit test individual classes [3].

Glossary

Dependency breaking
Changing the production code to make it so that you can test one method at a time without having to have the workflow monitor running, doing an actual release load, etc.
Test case
A test class that tests a production class or routine. Note that the term test case refers to the whole test class, not just one method on the test class.
Code coverage
How many lines of the source file(s) being tested are exercised by unit tests or integration tests, expressed as a percentage. For example, 100% code coverage means that every executable line was executed by the tests. The lower this percent is, the more code we’re shipping that is never exercised by tests.

For Further Reading

The Trouble With Too Many Functional Tests – http://www.javaranch.com/unit-testing/too-functional.jsp

Footnotes

  1. Instead of having each integration test create and set up its own data, another approach is to have certain things already set up that the test environment can rely on. That has quite a bit of appeal – it would make the setup required by each integration test smaller (and they would run faster because of that); they would also be more readable and maintainable. The reason we haven’t gone with that alternative up to this point is that it’s really hard to know what side effects the production code your integration test runs may have on the system. It seemed safest to have each test be responsible for setting up what it needs. Perhaps we’ll revisit this decision in the future.
  2. 30 seconds may seem pretty quick for an integration test; but as we move toward the discipline of running existing tests often, and as that body of existing tests continues to grow, we’ll want the unit tests to be quick – like under half a second per test case. This will enable a continuous build server (for example) to run all unit tests after each commit to get near-instant feedback on whether we broke anything. The build server would still run integration tests, but because they tend to be long-running they may only be able to run once or twice a day.
  3. And routines… though that’s a bit harder, since our routines often aren’t really units of functionality – they’re more like a bag-o-labels… there’s certainly work to be done!

— DanielMeyer – 20 Oct 2007

The phantom mock

A couple of days ago, a co-worker came by with an interesting problem.  His unit test class was failing, with a mock saying that it got more calls to a certain method than it had been set up to expect.

The odd thing was, the test class we were looking at didn’t declare any mocks!

A Clue

The failing test class passed when run by itself, failing only when run with the rest of the module’s unit tests…

The Problem

The problem turned out to be that a test class that ran earlier in the test run was injecting a mock into a singleton and not cleaning up after itself.  The singleton was later used by the failing test, which unwittingly made use of the mocked guts of the singleton, resulting in the unmet expectations.

A Solution

The offending test just needed to reset the guts of the singleton back to its normal, non-mocked value in its @After or @AfterClass method.
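
Here is a minimal, framework-free sketch of the problem and the fix. All names (RateService, setDelegateForTest, resetDelegate, PhantomMockDemo) are hypothetical, and a plain lambda stands in for the mock. In a real JUnit 4 test class, the injection would typically happen in an @Before method and the reset in an @After or @AfterClass method.

```java
import java.util.function.Supplier;

/** Hypothetical singleton whose "guts" (a delegate) can be swapped out by tests. */
final class RateService {
    private static final Supplier<Integer> REAL = () -> 42;  // stands in for production behaviour
    private static Supplier<Integer> delegate = REAL;

    private RateService() {}

    static int currentRate() { return delegate.get(); }

    /** Test-only hook: inject a stand-in for the real guts. */
    static void setDelegateForTest(Supplier<Integer> stub) { delegate = stub; }

    /** Test-only hook: restore the normal, non-mocked guts. */
    static void resetDelegate() { delegate = REAL; }
}

final class PhantomMockDemo {
    /** Mimics the earlier test class: it injects a stub and may or may not clean up. */
    static int earlierTest(boolean cleanUp) {
        RateService.setDelegateForTest(() -> -1);       // roughly what the setup did
        try {
            return RateService.currentRate();
        } finally {
            if (cleanUp) RateService.resetDelegate();   // what the teardown should do
        }
    }

    public static void main(String[] args) {
        earlierTest(false);
        System.out.println(RateService.currentRate());  // prints -1: the stub leaked into later tests

        RateService.resetDelegate();
        earlierTest(true);
        System.out.println(RateService.currentRate());  // prints 42: later tests see the real guts
    }
}
```

The leak is invisible when a later test class runs alone, because nothing has replaced the delegate yet. That matches the clue that the failure only showed up in the full module run.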

Musings

It seems to me it would be even better to implement the singleton using a factory, to avoid such side-effects popping up in the future (see StartupSvcFactory: why go to the bother? for some discussion about factories and singletons).  True, there’s the extra conceptual overhead of one more (factory) class and two more interfaces — but I think the resulting cleanness of the test code makes it worth it, at least if you ever need to mock what the singleton provides (as we did in this example).
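
One possible shape for the factory idea is sketched below. It is an assumption about what such a design might look like, not the design from the StartupSvcFactory article, and every name here (RateProvider, RateProviderFactory, DefaultRateProviderFactory, PriceCalculator) is hypothetical. The point is that the code under test receives a factory, so a test can hand in a stand-in without touching any shared state.

```java
/** What the singleton provides (first interface). */
interface RateProvider {
    int currentRate();
}

/** How callers obtain it (second interface). */
interface RateProviderFactory {
    RateProvider get();
}

/** Production factory: always hands out the same shared instance. */
final class DefaultRateProviderFactory implements RateProviderFactory {
    private static final RateProvider INSTANCE = () -> 42;  // stands in for the real implementation

    @Override
    public RateProvider get() {
        return INSTANCE;
    }
}

/** Code under test: depends on the factory rather than on a global singleton. */
final class PriceCalculator {
    private final RateProviderFactory factory;

    PriceCalculator(RateProviderFactory factory) {
        this.factory = factory;
    }

    int price(int units) {
        return units * factory.get().currentRate();
    }
}
```

A test could then write new PriceCalculator(() -> () -> 2) to supply a stub provider for just that one object. Nothing needs resetting afterwards, because no shared state was changed.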

Writing a parameterized JUnit test

JUnit 4 supports parameterized tests.  There are a few things that confuse me about how you set up your test class for it, though.  Let’s see what it would look like if what we wanted was to run the same test class, using a different Map<String,String> of settings each time.

The Normal Things

There are some elements of setup for using the parameterized test that didn’t confuse me.  Let’s briefly list them:

  • The test class needs to be decorated with the @RunWith(Parameterized.class) annotation
  • You need a public static data-providing method, decorated with the @Parameters annotation
  • Each test method is decorated with @Test (as usual)

Things I Found Confusing

The Data-Providing Method’s Return Type

The data-providing @Parameters method has to return a Collection<> of arrays.  Now, ever since at least Principles of Programming I & II in college, I’ve had trouble remembering which subscript is which when you have a multidimensional array.  So when I saw this helpful example, my brain got stuck on line 1 of what the @Parameters method was returning:

  return Arrays.asList(new Object[][] {
   {"22101", true },
   {"221x1", false },
   {"22101-5150", true },
   {"221015150", false }});

I couldn’t think how this would translate to my maps I wanted to run with.

The Special Constructor

When you’re using the JUnit 4 parameterized test, your test class needs to have a constructor that takes one set of the parameters and stores them to fields in the test class for use during that run.  But I couldn’t figure out — should my constructor expect a Map<String, String>[]?  A Map<String, String>?

Figuring it Out

It helped me to understand a few different facets:

Each Element of the Collection is a set of Parameters

The reason the @Parameters method must return a collection of arrays is that each array holds the parameters that are needed for one test scenario.  So if my test class needed a Map, an int, and a boolean, the @Parameters method would return a collection of three-element arrays, each array containing the parameters for one configuration of the test class.  This leads to the next facet…

The Test Class Constructor Should Accept One Parameter for Each Element of the Array

The JUnit parameterized test mechanism expects to instantiate the test class by calling a constructor that has the same number of arguments as there are elements in the current parameters array*.  If my test class FooTest needed a Map, an int, and a boolean each time, the constructor might look something like this:

    public FooTest(Map<String, String> map, int value, boolean flag) {
        //...
    }

…and my @Parameters method would need to return a Collection of Arrays of Object (“of Object” since for a given array, the three elements would be of different types).

*I haven’t read or tested to see if the JUnit mechanism supports a collection of jagged arrays of configuration parameters such that (for instance) sometimes the test class might be instantiated using the two-arg constructor, other times using its three-arg constructor…
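
To make the mapping from array elements to constructor arguments concrete, here is a small reflection sketch that needs no JUnit at all. It is not JUnit's actual code, just an illustration of the idea under that assumption. FooTest here is a stripped-down stand-in, and ParameterizedSketch, data, and instantiateAll are hypothetical names.

```java
import java.lang.reflect.Constructor;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Stand-in for the parameterized test class (no JUnit annotations in this sketch). */
class FooTest {
    final Map<String, String> map;
    final int value;
    final boolean flag;

    public FooTest(Map<String, String> map, int value, boolean flag) {
        this.map = map;
        this.value = value;
        this.flag = flag;
    }
}

final class ParameterizedSketch {
    /** Plays the role of the @Parameters method: one Object[] per scenario. */
    static Collection<Object[]> data() {
        return Arrays.asList(new Object[][] {
                { Collections.singletonMap("Name", "Bill"), 1, true },
                { Collections.singletonMap("Name", "Sam"), 2, false }
        });
    }

    /** Creates one test-class instance per array, spreading its elements over the constructor arguments. */
    static List<FooTest> instantiateAll() {
        List<FooTest> instances = new ArrayList<>();
        try {
            Constructor<FooTest> ctor =
                    FooTest.class.getConstructor(Map.class, int.class, boolean.class);
            for (Object[] params : data()) {
                instances.add(ctor.newInstance(params));  // element i becomes argument i
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("constructor did not match the parameter arrays", e);
        }
        return instances;
    }
}
```

Seen this way, the outer collection answers "how many scenarios?" and each inner array answers "which constructor arguments for this scenario?".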

Store the Parameters in Private Fields in your Test Class

What you’d normally do is store the parameters you get constructed with to private fields, for use by the @Test methods:

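import java.util.Map;

import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;

@RunWith(Parameterized.class)
public class FooTest {
    // Parameters for the current scenario, set once by the constructor
    private final Map<String, String> map;
    private final int value;
    private final boolean flag;

    public FooTest(Map<String, String> map, int value, boolean flag) {
        this.map = map;
        this.value = value;
        this.flag = flag;
    }

    //...
}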

The Application to My Case

I think I was more confused because I only needed one parameter — a Map<String, String> — so it wasn’t apparent to me why the Collection of Object Arrays was needed.

So the “hard parts” of my test class end up looking something like this:


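import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;
import org.junit.runners.Parameterized.Parameters;

@RunWith(Parameterized.class)
public class FooTest {

    private final Map<String, String> map;

    // One single-element array per scenario: each array wraps one map
    @Parameters
    public static Collection<Object[]> configs() {
        Map<String, String> map1 = new HashMap<String, String>();
        map1.put("Name", "Bill");
        map1.put("Favourite Color", "Blue");

        Map<String, String> map2 = new HashMap<String, String>();
        map2.put("Name", "Sam");
        map2.put("Favourite Color", "Plaid");

        return Arrays.asList(new Object[][] {
                { map1 },
                { map2 }
        });
    }

    public FooTest(Map<String, String> map) {
        this.map = map;
    }

    //...
}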

(Maybe nobody else needed that explanation, but it helped me!  :)

Why we wanted a JNDI server for integration testing

Why is having a JNDI server for our integration testing environment so helpful?  Well, just for our minimal XA integration test example, there were (at one point) fourteen Spring beans:

If we had to use a non-JNDI strategy for integration testing outside the app server, seven of these beans would have to be swapped out for test-only doubles — but we want our configuration for integration testing to match as closely as possible our production configuration so that we’re testing the same thing we’re deploying!

Abstract test

Suppose you have an interface, and implementer classes A and B.  You’ve written tests for class A, and when you start writing tests for class B you notice that you need to test a lot of the same things you tested for class A.  You don’t want to have all that test duplication in ATest and BTest.  Furthermore, if we made a new implementer class C, we don’t want to have to duplicate all those tests yet again.

The Abstract Test pattern, documented by Eric George, is an elegant way to solve this problem by putting such tests for functional compliance to an interface in an “abstract test” that can be run against arbitrary implementers.  It tests that an implementer behaves as the interface expects.  The extra cool thing is that the abstract test can be run in the future against currently-unknown implementers.

The Abstract Test pattern also applies to classes extending an abstract base class.

Since the original article is already gone and only available in web archives, I’ll also summarize how it works:

Abstract Test Class

  1. You create an abstract class that extends TestCase*
  2. In your setUp method, you call an abstract factory method, declared on your abstract class, that returns a reference to an object of the interface or abstract base class type.
  3. Create test methods (marked final)

Concrete Test Class(es)

For each concrete class implementing the interface or extending the base class:

  1. Create a test class that extends the abstract test class
  2. Override the abstract factory method to return a new instance of the concrete class being tested
  3. Write test methods testing functionality specific to this concrete implementation

Pretty neat!
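
The steps above can be sketched without any test framework. This is a hedged illustration rather than the code from the original article: java.util.Deque plays the interface, ArrayDeque and LinkedList play the implementers, and the class names, the check helper, and runAll (which stands in for a test runner) are all made up. In the JUnit 3 version the abstract class would extend TestCase and the runner would call setUp and the test methods itself.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedList;

/** Abstract test: behaviour every Deque implementer is expected to share. */
abstract class AbstractDequeTest {
    private Deque<String> deque;

    /** Abstract factory method: each concrete test class supplies its implementer. */
    protected abstract Deque<String> createDeque();

    protected void setUp() {
        deque = createDeque();
    }

    // Marked final so concrete test classes cannot weaken the shared checks
    public final void testNewDequeIsEmpty() {
        check(deque.isEmpty(), "a new deque should be empty");
    }

    public final void testPushThenPopReturnsSameElement() {
        deque.push("a");
        check("a".equals(deque.pop()), "pop should return the element just pushed");
    }

    /** Stands in for a test runner: fresh setUp before each shared test method. */
    final void runAll() {
        setUp();
        testNewDequeIsEmpty();
        setUp();
        testPushThenPopReturnsSameElement();
    }

    static void check(boolean condition, String message) {
        if (!condition) throw new AssertionError(message);
    }
}

final class ArrayDequeTest extends AbstractDequeTest {
    @Override
    protected Deque<String> createDeque() {
        return new ArrayDeque<>();
    }
}

final class LinkedListTest extends AbstractDequeTest {
    @Override
    protected Deque<String> createDeque() {
        return new LinkedList<>();
    }

    /** Implementation-specific test: LinkedList accepts null elements. */
    public void testAcceptsNullElements() {
        Deque<String> deque = createDeque();
        deque.add(null);
        check(deque.size() == 1, "LinkedList should accept a null element");
    }
}
```

Calling new ArrayDequeTest().runAll() and new LinkedListTest().runAll() runs the same shared checks against both implementers. A test class for a future implementer only needs to override createDeque.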

*I think you could use a different approach using JUnit 4, but for purposes of description I’m staying with the article’s JUnit 3 method