A Philosophy of Testing 5: The Case for 100%

Mark "Justin" Waks
8 min read · Nov 16, 2021

Back to Part 4.

So far, I’ve argued that Scala services should favor scenario tests over old-fashioned unit tests in most cases. Let’s move on to the next topic: how much do you need to test?

If at all possible, your Scala code should have automated test coverage, using scoverage or a similar tool. You should get to 100% coverage, and keep it there.

There’s a lot to unpack here, so let’s break it down.

What is “Coverage”?

Not everyone is familiar with coverage tools, so terminology first: for purposes of this series, “coverage” refers to automated tools, generally integrated into sbt and invoked at build time, that check how comprehensively your tests “cover” your code. (scoverage and its variations are popular ways to do this in Scala.)
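If you haven’t used these tools before: with the sbt-scoverage plugin installed, a typical run looks something like this from the command line:

sbt clean coverage test coverageReport

The coverage command turns on instrumentation, the tests then run as usual, and coverageReport generates the output showing what was and wasn’t hit.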

These tools typically let you specify what percentage of your code (statements, in scoverage’s case) must be covered by tests in order for the build to pass. That percentage, expressed in sbt as something like this:

lazy val rootProject = (project in file("."))
  .settings(coverageMinimumStmtTotal := 100.0)
  .settings(coverageFailOnMinimum := true)

is what we’re focusing on today.

Why 100%?

This topic is a frequent source of arguments, and some folks get their backs up about the concept of requiring 100% coverage — it’s very common to hear that something like 80% is fine and much more realistic.

I’d like you to consider, though, that any number other than 100% is arbitrary. Basically, it’s meaningless — you’re saying, “I want you to make sure that some code is tested, but I don’t really care which code.” That violates our “test mindfully” maxim: you shouldn’t be that arbitrary about it.

“100%” Doesn’t Really Mean 100%

That’s where we need to start getting into nuance, though. I would strongly recommend that you tell sbt to check 100% of your code. But that doesn’t actually mean you need to be checking every line of code.

The thing is, scoverage and tools like it generally provide ways to make specific exceptions. In the case of scoverage, this is done with pragma comments, like this:

if (thingsAreGoingAsExpected) {
  doTheHappyPath()
} else {
  // I believe this code path is impossible to hit,
  // because...
  // $COVERAGE-OFF$
  reportAnError()
  // $COVERAGE-ON$
}

That $COVERAGE-OFF$ / $COVERAGE-ON$ pair tells scoverage to exempt the code in between from the coverage requirement, and not to count it in the report. This is totally okay, provided you are disciplined about it. It's very different from setting sbt to 80% and allowing some random 20% of the code to be untested; instead, you are saying that there are specific reasons why these specific lines of code are exempt.

When and How to Exempt Code

There are several situations in which it is legit and normal to turn coverage off.

Untestable Scala

First and simplest, any final val lines like this are fundamentally untestable:

final val myConstant = "Hello world"

final vals are inlined by the compiler, so that line of code simply won't ever be hit at runtime. (It's arguably a scoverage bug that it doesn't exempt them automatically.) So blocks of these should always have coverage turned off.
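In practice, that looks something like this (the constants here are made up for illustration):

object Constants {
  // $COVERAGE-OFF$
  // Compile-time constants: the compiler inlines these at each use
  // site, so the definitions themselves are never executed.
  final val ServiceName = "user-service"
  final val MaxRetries = 3
  // $COVERAGE-ON$
}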

Another variation is the case we had above:

if (thingsAreGoingAsExpected) {
  doTheHappyPath()
} else {
  // I believe this code path is impossible to hit,
  // because...
  // $COVERAGE-OFF$
  reportAnError()
  // $COVERAGE-ON$
}

Note the comment. While you should try to avoid it, sometimes you may wind up in a situation where your types have steered you into writing a clause that you sincerely believe can’t ever be reached. (This generally means your types could be improved, but that isn’t always feasible or convenient.)

If so, acknowledge it and move on. The important bit here is that “because” comment — you should state precisely why you think it’s impossible to get here, so that future code maintainers understand what’s going on.

In general, aside from that universal final val problem mentioned above, you should always have an explanatory comment attached when you turn off coverage, both to provide insight for future maintainers and to keep you honest — it's harder to just be lazy when you have to explain yourself in the code.

Not Tested Yet

The most common situation is where you have written some code that is mostly tested, but which has an edge case that is both uncommon and annoyingly hard to test. In theory, you should of course test everything before merging that code; in practice, the pressures of real-world deadlines often don’t permit that.

That’s fine, provided that you are honest about it. What we’re talking about here is test debt, which is a common form of tech debt. It happens — we wish it didn’t, but it totally does. The important thing is that it be tracked properly.

So say that you are tracking your project on a JIRA board named FOO — you should open a ticket describing the missing tests, and reference it in the code:

if (thingsAreGoingAsExpected) {
  doTheHappyPath()
} else {
  // TODO (FOO-123): come back and test this
  // $COVERAGE-OFF$
  reportAnError()
  // $COVERAGE-ON$
}

Again, this isn’t arbitrary, and it isn’t permanent: you are acknowledging the test debt, and have a plan to come back and address it.

(Yes, folks sometimes just sweep test debt under the rug and never get to it. Please try not to do that: it’s a trap that will eventually come back and bite you. I’m speaking mainly to managers here: testing is really, really important, and it’s on you to make sure your engineers have time to deal with it thoroughly.)

Putting all that together:

It is fine and normal to exempt specific bits of code from test coverage, provided you are disciplined about it.

100% Isn’t Enough

Don’t fool yourself into believing that, just because you have 100% coverage, your code is necessarily bug-free: it absolutely does not prove that. You still need to make sure that you have a test for each likely use case. In practice, any time you make an enhancement, that should be represented somewhere in the tests.

Also, every time you encounter a bug that wasn’t caught by the existing tests, you should add a test for it. Not necessarily a separate test, mind — sometimes this can be as little as one line in an existing scenario test. But regression-testing your bugs somehow is a crucial discipline, and will help you flesh out your test suite and make it more realistic.
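To make that concrete, here is a minimal sketch (assuming ScalaTest; the service logic, the bug, and the ticket number are all hypothetical):

import org.scalatest.flatspec.AnyFlatSpec
import org.scalatest.matchers.should.Matchers

class UserServiceSpec extends AnyFlatSpec with Matchers {

  // Stand-in for the real service under test.
  def createUser(displayName: String): Either[String, String] =
    if (displayName.trim.isEmpty) Left("displayName must be non-empty")
    else Right(displayName)

  // Regression check for hypothetical bug FOO-456: blank display
  // names used to be silently accepted. One added assertion in an
  // existing scenario test is often all it takes.
  "createUser" should "reject a blank display name" in {
    createUser("   ") shouldBe Left("displayName must be non-empty")
  }
}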

100% isn’t enough in and of itself. But that 100% at least confirms that you are putting some effort into testing all aspects of the code. It will not catch every bug, but it will catch a lot of them.

So I strongly recommend thinking of 100% code coverage (with specific exceptions as appropriate) as the bare minimum of testing you should be doing. Beyond that, think about plausible use cases, and make sure you are handling all of them. As always, test mindfully.

100% test coverage should be your baseline, not your final goal.

Death to Dead Code!

Finally — your 100% test coverage will often reveal (especially when you are initially getting to that 100%) that you have a bunch of code that isn’t tested because it really, truly isn’t being used. The top-level entry points of your service simply don’t lead there.

It’s very tempting to shrug, wrap that code in $COVERAGE-OFF$, and call it a day. I would usually recommend otherwise, though. Unless you have a specific plan to bring that code into use (with a ticket assigned, and a comment on that code referring to the ticket), you should just delete it. That might be a single if-else clause, a function, or even an entire package — regardless of scale, get rid of it.

Code that isn’t being used, and isn’t being tested, often turns into a trap — it can accumulate debt and bugs especially fast, falling out of sync with the assumptions of the rest of the system. It’s hard to kill your darlings (as writers often put it), but this is editing: if it’s not a net positive, it’s probably a net negative.

And remember: you’re using a change-control system. (Right?) The code isn’t gone forever, and you can always fetch it out of the history if you need to.

A bit of perspective: junior programmers tend to celebrate the number of lines of code they have added. Senior programmers tend to celebrate the number they’ve managed to delete while keeping the application working. Add the fact that deleting dead code makes full coverage easier to reach, and you shouldn’t be afraid of removing unnecessary code.

Getting There From Here

All of the above is talking mainly about the ideal end-state. But say that you have a big code base already, and haven’t done a lot of testing yet. You install scoverage — and it tells you that your coverage is 3%. It’s very easy to get depressed about that, and just write the whole thing off.

Don’t despair, though, and don’t make the best the enemy of the good. 100% is the goal, but more scenario tests are generally better than fewer, and you have to start somewhere. Just be prepared for full testing to be a long-term background task that will take a month (or a year), and start making gradual progress.

I recommend installing scoverage into sbt, and starting to use it in your build system, as soon as possible. Give it a run, find out what your current coverage is, and clamp the minimum to that as an initial baseline. That’s your starting point.
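Concretely, that initial setup might look something like this (the plugin version and the 37% figure are placeholders; use the current release and whatever number scoverage actually reports for you):

// In project/plugins.sbt:
addSbtPlugin("org.scoverage" % "sbt-scoverage" % "2.0.9")

// In build.sbt: pin the minimum at today's number, so coverage
// can only ratchet upward from here.
coverageMinimumStmtTotal := 37.0
coverageFailOnMinimum := true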

Over time, as you enhance and maintain the system, take the opportunity to keep adding more scenario tests. It’s often the case that a small number of good tests can cover a surprisingly large amount of the code, especially if they are strategically chosen and hit a variety of major use cases. Keep checking your actual coverage, and keep raising that minimum in your build.sbt file.

Finally, when you are getting within striking range — say, 80% or so — start really tackling the problem in a more disciplined way. Do a research spike to look at the coverage output, and see what’s missing. Tackle the low-hanging fruit quickly. For the rest, start wrapping it in $COVERAGE-OFF$ pragmas and opening test-debt tickets for it. Push to that satisfying 100% goal, and celebrate when you get there.

After that, you can get into a more comfortable maintenance mode: keep the minimum pinned at 100%, and gradually pay down your test debt as a background task. The payoff is quiet, in the form of bugs that you barely even notice because you catch them before they ever make it into a PR. But your system will be more stable and solid, your engineers will be able to code with more confidence that they aren’t breaking things, the need for manual QA will go steadily down, and you’ll be in a better place for future improvements.

In other words:

Don’t panic if your test coverage isn’t great — focus on making it a little better every week, and you’ll get there.

Summary

I hope you’ll seriously consider the above, and think about applying it to your own applications. 100% coverage is the only philosophically honest level to set; when you couple that with disciplined exceptions, it’s entirely practical. It helps you set a minimum level for your tests, which will help you build reliable systems.

Next time: The Quest for Determinism
