A Philosophy of Testing 3: The Problem of Unit Tests
In the previous article, I argued that Scala is a different language, and its testing requirements are different. Let’s examine a key difference: why old-fashioned unit-testing is mostly not a good idea for Scala services.
(Please remember that this series is focused on testing services specifically. Libraries have different considerations, and are mostly out of scope for this series.)
A Little History
In the Beginning (that is, pre-2000), automated testing mostly wasn’t a thing. A few companies did it, and everyone agreed that it would be a lovely idea if only it were feasible, but most folks didn’t take it seriously. I’m not sure that I encountered any serious automated testing in my career up to 2001.
Then came Extreme Programming, the forerunner of what is now known as Agile Development. The book Extreme Programming Explained was a sort of manifesto of how to make software engineering work better (I highly recommend reading it, especially the landmark first edition), and one of its principles was continuous integration. And since CI basically makes no sense without deep test automation, people finally woke up and started doing a lot of that.
But the field at the time was very Java-centric (if you look at software engineering books of the time, most of them talk almost exclusively about Java), and as I described in the last article, testing for Java has some specific considerations. Nulls are common, nearly every object is mutable, and programming is largely about coordinating zillions of little state machines.
In that world, unit testing is critical, and folks tend to spend much of their effort there, testing each individual class to death.
But again, Scala Is Different.
Unit Testing in Scala Services is Mostly a Waste of Time
Most individual Scala classes do not require deep unit testing in and of themselves. They shouldn’t usually be mutable, so you don’t have as much of the state-machine problem. They should be strongly typed, and generally null-free, so most illegal-argument tests are unnecessary. (Indeed, many of them wouldn’t even be compilable.)
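To make that concrete, here is a minimal sketch of how types stand in for illegal-argument tests. The domain (UserId, Plan, User, displayName) is entirely hypothetical, invented for this illustration, and avoiding null is a convention here rather than something the compiler enforces by default.

```scala
object TypedArguments {
  // Hypothetical domain types, invented for this illustration.
  final case class UserId(value: Long)

  sealed trait Plan
  object Plan {
    case object Free extends Plan
    case object Paid extends Plan
  }

  // Absence is modelled with Option rather than null (by convention),
  // so every caller is forced to handle the "no nickname" case.
  final case class User(id: UserId, nickname: Option[String], plan: Plan)

  def displayName(user: User): String =
    user.nickname.getOrElse(s"user-${user.id.value}")

  // The illegal-argument cases a Java-style suite would cover do not compile:
  //   displayName("alice")                      // a String is not a User
  //   User(UserId(1L), Some("al"), "premium")   // a String is not a Plan
}
```

The only behavior left to test in displayName is the fallback, and a higher-level test will usually exercise that anyway.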
It’s worth noting the “most” there — there are certainly exceptions. If you are building a complex data structure, even an immutable one, then that will need serious tests (usually ScalaCheck-centric tests). If you have a complex algorithm that isn’t trivially obvious, that certainly deserves its own tests.
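As a rough illustration of what a property-style test for such an exception looks like, here is a sketch that uses only the standard library. It is not ScalaCheck's API: holdsForRandomInputs is a hand-rolled stand-in for the generators and shrinking a real property-testing library would provide, and insertSorted is a hypothetical unit under test.

```scala
import scala.util.Random

object PropertySketch {
  // Hypothetical unit under test: insert a value into an already-sorted list.
  def insertSorted(x: Int, sorted: List[Int]): List[Int] = {
    val (smaller, rest) = sorted.span(_ < x)
    smaller ::: x :: rest
  }

  // Hand-rolled stand-in for a property check: evaluate the property
  // against many randomly generated (value, sorted list) pairs.
  def holdsForRandomInputs(runs: Int, seed: Long)(property: (Int, List[Int]) => Boolean): Boolean = {
    val rng = new Random(seed)
    (1 to runs).forall { _ =>
      val xs = List.fill(rng.nextInt(20))(rng.nextInt(100)).sorted
      property(rng.nextInt(100), xs)
    }
  }

  // Property: the result is still sorted and exactly one element longer.
  val staysSorted: Boolean = holdsForRandomInputs(runs = 200, seed = 42L) { (x, xs) =>
    val result = insertSorted(x, xs)
    result == result.sorted && result.length == xs.length + 1
  }
}
```

The point is the shape of the test: it states an invariant over many inputs instead of enumerating a handful of hand-picked cases.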
But those are exceptions, and this gets back to the key point of this series: test mindfully. Always be asking yourself, is this test useful? In particular, is it plausible that this unit of code could break during maintenance, in a way that wouldn’t immediately be caught by a test at a higher level? If the answer is “no”, then probably don’t bother writing a test for just that unit.
And here’s a key point: in services, the answer tends to be “no”, because most code in services is plumbing. It’s not deep complex data structures, and it’s not complex algorithms — it’s plugging together APIs and data stores, and doing mild transformations on them. It’s mostly about pipelines of many pieces connected together. Testing individual units doesn’t provide you with much value in most cases — the whole pipeline is what you need to test.
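Here is a small sketch of what that kind of plumbing tends to look like. All the names (RawOrder, Invoice, fetchOrders, toInvoice, render, invoiceFor) are hypothetical, and an in-memory Map stands in for a real data store.

```scala
object PipelineSketch {
  // Hypothetical types and stages, invented for this illustration.
  final case class RawOrder(customer: String, amountCents: Long)
  final case class Invoice(customer: String, totalCents: Long)

  // Stage 1: read from a data store (a Map stands in for the real thing).
  def fetchOrders(store: Map[String, List[RawOrder]], customer: String): List[RawOrder] =
    store.getOrElse(customer, Nil)

  // Stage 2: a mild transformation.
  def toInvoice(customer: String, orders: List[RawOrder]): Invoice =
    Invoice(customer, orders.map(_.amountCents).sum)

  // Stage 3: render the response.
  def render(invoice: Invoice): String =
    s"${invoice.customer} owes ${invoice.totalCents} cents"

  // The whole pipeline: this is the behavior a caller actually observes.
  def invoiceFor(store: Map[String, List[RawOrder]], customer: String): String =
    render(toInvoice(customer, fetchOrders(store, customer)))
}
```

Each stage is nearly too simple to break on its own. A single check on invoiceFor exercises all three and also catches mistakes in how they are wired together, which is where plumbing bugs tend to live.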
The Problem of Mocks
Let’s drive this home more: in Scala services, an excessive focus on unit tests is actively bad for your project. Why? Well, let’s think about what a unit test looks like in a plumbing-centric environment.
Unit-testing typically means testing just this little class in isolation. Problem is, plumbing-oriented code isn’t built to operate in isolation — it usually needs to talk to various other components. And that leads you to Mocks.
Let’s be precise about terminology here. There are a lot of different kinds of “test doubles” — for a good taxonomy, I recommend this article from Martin Fowler. He defines mocks as “objects pre-programmed with expectations which form a specification of the calls they are expected to receive”.
So say you are using ScalaMock to test plumbing class A.
(Tangent: please, please, just don’t use Mockito for Scala — it is too weakly-typed and null-friendly, and will cause you misery in the long run. I have lost literally months of effort trying to maintain brittle Mockito-based test harnesses. But I digress…)
Class A depends on additional plumbing: let’s call those classes B, C and D. Typically, in each test, you will wind up defining mocks for each of those, saying precisely what to return for each invocation.
First, let’s state the obvious: that is a lot of boilerplate. Unit tests for plumbing tend to involve more mock definition than actual test code. It’s a lot of effort — is it providing value?
Worse, it tends to be brittle. Say I tweak the semantics of class C. I fix the tests for C, and that’s great — but now the tests for A, and often a bunch of other classes, are suddenly breaking.
That’s because mocking tends to smear the contracts between your classes all over your test base. When you change C, you often need to change the tests for the classes that call C. Worse, you sometimes wind up with indirect failures, because E calls F, which calls G, which calls C, and your change to the semantics of C now requires you to fix the unit test for E. This leads to a lot of head-scratching and confusion.
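A stripped-down sketch shows the smearing. This deliberately avoids any real mocking library: MockC is a hand-rolled mock in Fowler's sense, and A, C, lookup and greet are hypothetical names invented for the example.

```scala
object MockSmear {
  // Hypothetical plumbing: A depends on C.
  trait C { def lookup(key: String): Option[String] }

  final class A(c: C) {
    def greet(key: String): String =
      c.lookup(key).map(name => s"Hello, $name").getOrElse("Hello, stranger")
  }

  // Hand-rolled mock: pre-programmed with the calls it expects and what
  // to return for each, failing loudly on anything else.
  final class MockC(expected: Map[String, Option[String]]) extends C {
    var calls: List[String] = Nil
    def lookup(key: String): Option[String] = {
      calls = calls :+ key
      expected.getOrElse(key, throw new AssertionError(s"unexpected call: lookup($key)"))
    }
  }

  // The test for A restates C's contract: which key is passed,
  // what comes back, and how many calls are made.
  def testA(): Boolean = {
    val mock = new MockC(Map("id-1" -> Some("Ann")))
    val a = new A(mock)
    a.greet("id-1") == "Hello, Ann" && mock.calls == List("id-1")
  }
}
```

Notice that testA contains a private copy of C's contract. If lookup starts normalizing keys, or returns something richer than an Option, this test has to change along with the mock setup of every other class that calls C, even though A's own logic is untouched.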
The result is that these unit tests for plumbing components require excessive maintenance effort. And let’s please remember, making it easier to maintain your code is the entire point here. You want enough tests to be able to code with courage — to enhance and maintain the code and know whether or not you’ve broken anything. That’s worth a good deal of maintenance effort, but adding lots of unnecessary, hard-to-maintain unit tests is counter-productive.
So to put it bluntly:
Unit tests for plumbing classes are generally a net negative.
It’s a bad habit, adapted from a very different language, and it’s worth breaking.
The Problem of Coverage
Just to add a kicker: excessive unit tests get in the way of proper code coverage.
I’ll be arguing, later in this series, that you should be writing comprehensive scenario tests, and that those should be providing 100% coverage of your code. Your CI harness should be checking that, making you fix any coverage failures, and you should be immediately deleting dead code.
The problem is, if you have a unit test, that counts as “coverage”. The automated tools will think this code is properly tested, but that’s a lie: plumbing code isn’t really tested unless it has been tested in context. You can add code that passes its unit test but breaks in the real world, and because the unit test already executed those lines, the coverage tool will never flag that they were not tested realistically. That’s bad.
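A contrived sketch of how that happens: the Clock trait, yearOf, and both clock implementations below are hypothetical, and the "real" clock is just a stand-in for whatever the production component actually returns.

```scala
import scala.util.Try

object CoverageSketch {
  // Hypothetical dependency of the plumbing code.
  trait Clock { def today(): String }

  // Plumbing under test: silently assumes a "YYYY-MM-DD" date string.
  def yearOf(clock: Clock): Int = clock.today().take(4).toInt

  // The unit test's double returns the format the test author assumed.
  val stubClock: Clock = new Clock { def today(): String = "2024-05-01" }

  // A stand-in for the real component, which uses a different format.
  val realClock: Clock = new Clock { def today(): String = "01/05/2024" }

  // The unit test passes, and every line of yearOf now counts as covered.
  val unitTestPasses: Boolean = yearOf(stubClock) == 2024

  // In context, the same line throws a NumberFormatException.
  val worksInContext: Boolean = Try(yearOf(realClock)).isSuccess
}
```

A coverage report would show yearOf as fully covered in this situation, because it only measures which lines ran, not whether they ran against realistic collaborators.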
Unit tests for plumbing classes provide a false sense of security.
The above is pretty strong stuff, and I’m sure that some folks will disagree. That’s fine, but please think about it seriously. I’ve found that an excessive reliance on mock-centric unit tests generally does more harm than good when working on Scala services. And it’s unnecessary: I pretty much never use mocks for my Scala tests, and I’ve never missed them. There are better alternatives.