How to write a good bug report

It’s something that can’t be overstated in software development: a well-written bug report can make an engineer’s life so much easier. A poorly-written one can make your day a living hell.

Something that I find myself bringing to almost every client is a sense of what a well-written bug report looks like. This is largely carried over from my Launchpad days: we were a distributed team spread around the world, and we didn’t always have the luxury of high-bandwidth comms with bug reporters. Instead, we relied on well-crafted bug reports to give us unambiguous problem statements, and — where possible — unambiguous guidance for fixing those problems.

Bad bug reports are expensive

Okay, this isn’t true for all bad bug reports. “Bad,” like many things, is a spectrum. For example, these are both “bad” bug reports:

I can’t type some letters on my keyboard

and

Keyboard sucks

But they’re different degrees of “bad”.

Regardless of the level of badness, though, a bug report that’s not well-written, or doesn’t contain information that was available to the reporter when they were writing the bug, will cost you in developer time when someone starts to investigate the thing.

This can range from time spent combing through logs to going back and forth with the reporter about what they mean by a specific phrase (arguing over the definition of “doesn’t work” is a classic: one user’s “doesn’t work” is a developer’s “works as intended,” but that’s not always obvious at the get-go).

What does a good bug report look like?

There are some simple rules that you can use to guide yourself if you want to write better bug reports. I’m not going to provide a template — because templates can themselves be an anti-pattern for this kind of thing — just some principles that I’ve learned over the years that you may find useful to your project or your team.

Note too that I’m using a completely made up example bug here; the details of that bug aren’t relevant, so please don’t come telling me that I’ve gotten something horribly wrong. It’s a pseudobug.

1. State the actual problem in the title

“The login button should take me to the login form” is not a good title for the bug. It doesn’t tell the reader what the problem actually is. “The login button should take me to the login form” could actually mean:

  • The login button doesn’t do anything
  • The login button takes me to a login page on a different website
  • The login button automatically logged me in when it shouldn’t
  • The login button ate my cat

State the actual problem, as pithily as possible, in the bug title. This may require some eliding of the details, e.g.

That says what the problem is without diving into all the details of the failure modes which cause the bug to surface.

2. Have an unambiguous problem statement in the body of the bug

This is your opportunity to explain the problem to the reader in a manner that doesn’t leave them wondering about any of the details. Sometimes, this is just going to be “See title,” but more often than not it’s going to be more expansive.

More than just guiding the reader through how to reproduce the bug, you’re also giving any QA engineers a place to start when designing regression tests for the issue — and as we well know, making QA’s life easier makes the dev team’s life easier overall.

Following our example from the previous section, we could have:

Clicking the login button elicits no response from the application when the Authentication Service is:

- Shut down
- Running but in an error state
- Restarting

If possible, you want to include a guide to reproducing the issue at this point. This should be stated as a numbered list, so that the reader can follow it step-by-step:

To reproduce this issue:
1. Run the application and authentication services in Docker
2. Check that the login button takes you to the login form, then go back to the landing page
3. Shut down the authentication service
4. Refresh the landing page and click the login button. Note that nothing happens: no alerts, no errors in the console, nothing. It doesn’t even behave like a button anymore.

3. If possible, state where you think the problem is coming from, in as much detail as you can

Sometimes you’re not going to be able to answer why there’s a particular problem. It’s totally fair to leave this blank. But if you do know what might be the source of the problem, you might be able to help the reader out somewhat.

Continuing our example bug, here’s how you could phrase this:

Back when we split the authentication service from the application service, we added a fallback that would alert the user to authentication being unavailable. I seem to remember that was part of the JS on the landing page. Could that be misfiring and triggering this issue?

Or perhaps you know exactly the cause of the problem, in which case you might say:

There’s a check in the landing page JS to ensure that the authentication service is up (see landing_page.js:100). It's meant to grey out the login button and provide an alert when you try to log in, but it looks like it's not doing so for some reason. I suggest starting any debugging there.

4. If there’s a suggested fix, state it as a series of unambiguous tasks

It’s really important that you don’t feel like you have to state a suggested fix if you don’t know what one looks like. After all, “fix the thing” can be a really deep and involved process, and a guide to fixing the problem can actually be really misleading and, in the end, unhelpful.

But let’s assume that you know the problem well enough — maybe you’ve done some investigation here whilst writing the bug, or maybe you wrote the original code and it’s not that much work to explain the fix. Give the person fixing the bug a series of simple steps, with supplementary detail where necessary:

To fix this issue, you’ll need to:
1. Add a front-end test to prove that an alert is shown and the login button is greyed out when the authentication service responds with an error code. This test should fail at the moment; if it doesn’t, ping me on Slack and I’ll help you dive into it.
2. Update the code in landing_page.js:100-200 so that when an error response (as defined above) is returned by the authentication service, the disableLoginWithWarning() function is called, rather than disableLogin(), as is called right now.
3. Re-run the test you created in step 1. It should now pass.
4. Remove disableLogin(). It's old code that should have been removed a while back; its being in the codebase is just causing more problems.
5. Run the front-end tests. If any of them fail due to the lack of disableLogin(), you can either (if the changes are minor) fix them as part of this work or (if the changes are significant) file tickets for each of them, tag them with a // XXX comment in the code (always referencing the bug number), and then restore disableLogin() for the time being.
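
To make the first two steps concrete, here is a minimal sketch of what the test and the fix might look like for the pseudobug. Only disableLoginWithWarning() is named in the bug; its body, the fake button object, enableLogin(), isAuthError(), updateLoginButton(), and the rule that "error" means a missing response or an HTTP status of 400 or above are all assumptions made for illustration, not the real landing_page.js.

```javascript
// Hypothetical stand-in for the login button; the real one would be a DOM element.
function makeFakeButton() {
  return { disabled: false, title: "" };
}

// Named in the bug report; this body is an assumed implementation.
function disableLoginWithWarning(button) {
  button.disabled = true; // grey out the button
  button.title = "Authentication is not available at this time."; // hover warning
}

// Hypothetical helper for the healthy case.
function enableLogin(button) {
  button.disabled = false;
  button.title = "";
}

// Assumption: an "error response" is no response at all, or an HTTP status >= 400.
function isAuthError(response) {
  return response === null || response.status >= 400;
}

// Step 2: error responses go to disableLoginWithWarning() rather than disableLogin().
function updateLoginButton(button, response) {
  if (isAuthError(response)) {
    disableLoginWithWarning(button);
  } else {
    enableLogin(button);
  }
  return button;
}

// Step 1: the test that fails before the fix and passes after it.
function testLoginButtonDisabledOnAuthError() {
  const button = updateLoginButton(makeFakeButton(), { status: 503 });
  if (!button.disabled) throw new Error("login button is not greyed out");
  if (button.title === "") throw new Error("no warning is shown");
}

testLoginButtonDisabledOnAuthError();
```

Keeping the decision in one small function means the test can run without a browser, which is one reason a fix guide might point at a specific function rather than at a page.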

5. Always include an unambiguous win condition for the bug

Just like a set of acceptance criteria on a ticket to develop a new feature, a “we will know we’re done when” section on the bug (or as @sil would call it, an “Unambiguous Win Condition”) is essential for making sure that a bug gets fixed properly.

UWCs need to be stated plainly, using imperative language, and need to avoid the word “should” wherever possible, because its meaning is unclear (if something “should” do something, is it okay if it doesn’t or not?).

So:

We will know this bug is fixed when:
- The authentication service being in an error state or unavailable causes the login button on the landing page to be greyed out, with a warning displayed when the user hovers their cursor over the login button stating that authentication is not available at this time.
- The login button on the landing page takes the user to the login screen if the authentication service is running without error, and no warning is shown when the user hovers their cursor over the login button.
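
If you want a mechanical nudge away from “should”, the rule is simple enough to check in a few lines. This is only a sketch: findAmbiguousConditions() is a hypothetical helper, not part of any bug tracker, and the first sample condition is deliberately badly worded to show what gets flagged.

```javascript
// Hypothetical helper: returns the win conditions that contain the word "should".
function findAmbiguousConditions(conditions) {
  // Whole-word, case-insensitive match, so "shoulder" is not flagged.
  const should = /\bshould\b/i;
  return conditions.filter((condition) => should.test(condition));
}

const conditions = [
  "The login button should be greyed out when authentication is down.",
  "The login button on the landing page takes the user to the login screen.",
];

// Logs the conditions that need rewording (here, only the first one).
console.log(findAmbiguousConditions(conditions));
```

A check like this only catches one word, of course; it does not tell you whether the condition is actually unambiguous.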

In summary

Bad bug reports cost time and therefore money. Good bug reports are grease on the wheels of the development process, and can make developers’ lives so much easier (imagine being able to fix a bug without ever having to speak to anyone until it’s done!).

The rules here aren’t a panacea; there are definitely still going to be plenty of times when you’ll need to have a conversation about some corner case or side-effect of the fix. But hopefully they’ll set you on the right track, and maybe they’ll make your development process, and your software, that bit better too.

Originally published at https://gmb.dev on April 13, 2021.

Photographer and writer. Portfolio: http://gmb.photo || Street photography: http://evening.camera.