Lessons from Software Engineering at Google: Part 7 - Automated Testing

This is the seventh article in a series where we cover the book Software Engineering at Google by Titus Winters, Tom Manshreck, and Hyrum Wright. 📕 We will go over various aspects of software engineering as a process, including the importance of communication, iteration and continuous learning, well-thought-out documentation, robust testing, and more.

Today we cover automated testing, in all its shapes and sizes. We won't be delving into individual types of tests - such as unit tests, integration tests, and end-to-end tests - but we will outline benefits and best practices that apply across the board and help us write more effective tests, regardless of their scope. Let's dive in!

Benefits of testing

As usual, before we jump into the how, it's worth outlining the why. The book describes the following benefits of writing automated tests. 👇

  • Less debugging. Tested code has fewer costly defects and prevents annoying debugging sessions throughout the lifetime of the project. With well-written tests, it's your test infrastructure's job to catch issues before they reach production.

  • Increased confidence in changes. All software changes. Teams with good tests can review and accept changes to their project with confidence because all important behaviors of their project are continuously verified. Such projects encourage refactoring.

  • Improved documentation. Clear, focused tests that exercise one behavior at a time function as executable documentation. If you want to know what the code does in a particular case, look at the test for that case.

  • Simpler reviews. A code reviewer spends less effort verifying that code works as expected if the code review includes thorough tests that demonstrate code correctness, edge cases, and error conditions.

  • Thoughtful design. The act of writing tests also improves the design of your systems. As the first clients of your code, tests can tell you a lot about your design choices. If new code is difficult to test, it is often because the code being tested has too many responsibilities or difficult-to-manage dependencies. Well-designed code should be modular, avoiding tight coupling and focusing on specific responsibilities.

  • Fast, high-quality releases. With a healthy automated test suite, teams can release new versions of their applications with confidence. At scale, this would not be possible without automated testing.

As with any other type of code, there are better and worse ways of writing tests. Let's now discuss how to achieve all of these benefits and ensure our tests are high quality and give us confidence.

Good tests are explicit

Tests should be complete and concise: a test's body should contain all of the information needed to understand it without containing any irrelevant or distracting information. There are two rules you can follow that should help with achieving that.

  • Include everything that's required for the test to run. All tests should strive to be hermetic. A test should contain all of the information necessary to set up, execute, and tear down its environment.

  • Don't make any assumptions. Tests should assume as little as possible about the outside environment, such as the order in which the tests are run.

What also makes tests explicit is a name that summarizes the behavior the test covers. A good name describes both the actions that are being taken on a system and the expected outcome. 💡
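
To illustrate, here is a minimal sketch in Python (the apply_discount function is hypothetical, defined inline so the tests stay hermetic) of tests whose names state both the action taken and the expected outcome:

```python
import unittest


def apply_discount(price, percent):
    """Hypothetical function under test, defined inline to keep the example self-contained."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price * (1 - percent / 100)


class ApplyDiscountTest(unittest.TestCase):
    # Each name describes the action and the expected outcome.
    def test_full_discount_reduces_price_to_zero(self):
        self.assertEqual(apply_discount(price=50.0, percent=100), 0.0)

    def test_discount_above_100_percent_raises_value_error(self):
        with self.assertRaises(ValueError):
            apply_discount(price=50.0, percent=150)


if __name__ == "__main__":
    unittest.main()
```

Note that each test sets up its own inputs and makes no assumptions about the order in which the tests run.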

The book even goes as far as to say that unclear test code is worse than unclear production code. The reason for that is quite simple. With production code, you can often determine its purpose by looking at what calls it and what breaks when it's removed. With a test, you might never understand its purpose, since removing it will have no effect. In test code, stick to straight-line code over clever logic, and consider tolerating some duplication when it makes the test more descriptive and meaningful.

Avoid implementation details

By far the most important way to avoid brittle tests is to invoke the system being tested in the same way its users would. The more closely a test resembles how users use your system, the more confidence it gives you. 📈

You achieve that by minimizing the reliance on implementation details in tests. Make calls against the system's public API rather than its internals. Moreover, testing the state of the system provides more confidence and produces less brittle tests than testing its interactions. A good indicator that you've done it right is when a change that refactors code while preserving existing behavior requires no changes to existing tests.
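
To make this concrete, here is a Python sketch (the Counter class is hypothetical): the first test asserts on state observable through the public API and survives refactoring, while the second reaches into an implementation detail and breaks the moment the internal attribute is renamed:

```python
import unittest


class Counter:
    """Hypothetical class under test."""

    def __init__(self):
        self._count = 0  # private implementation detail

    def increment(self):
        self._count += 1

    def value(self):
        return self._count


class CounterTest(unittest.TestCase):
    def test_two_increments_yield_value_of_two(self):
        counter = Counter()
        counter.increment()
        counter.increment()
        # Good: asserts on state observable through the public API.
        self.assertEqual(counter.value(), 2)

    def test_brittle_version_do_not_do_this(self):
        counter = Counter()
        counter.increment()
        # Brittle: reaches into a private attribute. Renaming _count
        # in a behavior-preserving refactoring breaks this test.
        self.assertEqual(counter._count, 1)
```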

Reliability is the prerequisite for confidence

Creating and maintaining a healthy test suite takes real effort. As a codebase grows, so too will the test suite. It will begin to face challenges like instability and slowness. A failure to address these problems will cripple a test suite. 🐢

Tests derive their value from the trust engineers place in them. If testing becomes a productivity sink, engineers will lose trust and begin to find workarounds. A bad test suite can be worse than no test suite at all. Teams that prioritize fixing a broken test within minutes of a failure are able to keep confidence high and failure isolation fast, and therefore derive more value from their tests.

In some cases, you can limit the impact of flaky tests by automatically rerunning them when they fail. This trades CPU cycles for engineering time. At low levels of flakiness, the trade-off makes sense. Just keep in mind that it only delays the need to address the underlying issue - the proverbial kicking of the can down the road. 🥫
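
As an example of that trade-off, pytest's third-party pytest-rerunfailures plugin can retry failing tests automatically - a sketch, assuming the plugin is installed:

```python
import random

import pytest

# Requires the pytest-rerunfailures plugin: pip install pytest-rerunfailures
# Retry every failing test:  pytest --reruns 3 --reruns-delay 1
# Or mark only the known-flaky tests, as below.


@pytest.mark.flaky(reruns=3, reruns_delay=1)
def test_known_flaky_timing_dependent_behavior():
    # Hypothetical stand-in for a timing- or network-dependent check
    # that occasionally fails; the marker reruns it up to 3 times.
    assert random.random() < 0.9
```

Every rerun burns extra CPU and masks a real reliability defect, so budget time to fix the flakiness itself.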

A word on code coverage

The fact that tests pass isn't a sufficient indicator that your system is well-tested. You also have to know whether you've covered the system's functionality. A common way to measure this is code coverage, but code coverage isn't a complete metric either. Instead of relying on it, think about the behaviors that are tested. Code coverage can provide some insight into untested code, but it is not a substitute for thinking critically about how well your system is tested. 💭
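
A short Python sketch of that blind spot (the clamp function is hypothetical): a single test executes every line of clamp - 100% line coverage - while the boundary behaviors remain completely unverified:

```python
import unittest


def clamp(value, low, high):
    """Hypothetical function under test: confine value to [low, high]."""
    return max(low, min(value, high))


class ClampTest(unittest.TestCase):
    def test_value_inside_range_is_unchanged(self):
        # This single test executes every line of clamp - 100% line
        # coverage - yet the behaviors at the boundaries (value below
        # low, value above high) are never verified.
        self.assertEqual(clamp(5, low=0, high=10), 5)

    # Behavior-driven additions that coverage alone would never demand:
    def test_value_below_low_is_raised_to_low(self):
        self.assertEqual(clamp(-3, low=0, high=10), 0)

    def test_value_above_high_is_lowered_to_high(self):
        self.assertEqual(clamp(42, low=0, high=10), 10)
```

Thinking in behaviors, not lines, is what demands the two boundary tests.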

Writing tests is only the first step

After you have written tests, you need to run them. Frequently. As obvious as it sounds, it's key to have the right infrastructure to run your tests efficiently and to automate the process so that change requests and deploys are preceded by the relevant tests. You only get confidence that your system works by actually running the tests.

As we mentioned before, your systems will evolve and so should your test suite. A good test suite contains a blend of different test sizes and scopes that are appropriate to the local architectural and organizational realities. 🏗️

Conclusion

That's it for today. The more and faster you want to change your systems, the more you need a fast way to test them - automated testing is the only way you can achieve that at scale. Here's a short summary of things we went through:

  • automated testing decreases the time spent on debugging and preparing releases

  • testing gives you confidence that your system works; it improves software design and documentation, and makes the code review process simpler

  • good tests are complete and concise - they should include everything that's required for the test to run, without unnecessary details or assumptions

  • tests shouldn't rely on implementation details

  • you get confidence from your tests by running them frequently and ensuring they give replicable results

Next, we will cover software maintenance. Technical debt, like any other debt, is a double-edged sword - it can be an effective tool as long as it is treated with care. See you! 👋

If you liked the article or you have a question, feel free to reach out to me on Twitter, or add a comment below!
