Lessons from Software Engineering at Google: Part 10 - Continuous Integration

This is the tenth and last article in a series where we cover the book Software Engineering at Google by Titus Winters, Tom Manshreck, and Hyrum Wright. 📕 The series goes over various aspects of software engineering as a process, including the importance of communication, iteration and continuous learning, well-thought-out documentation, robust testing, and much more.

Today we cover continuous integration and delivery. These are systems and processes that define how members of engineering teams bring their work together, how the software is built, tested, and finally, delivered to your users. Let's dive in!

Shift left

The fundamental goal of continuous integration systems is to catch problematic changes as early as possible. As with most of the things we've discussed in this series, this becomes virtually impossible to do manually as projects grow. CI systems become progressively more necessary as your codebase ages and grows in scale. 📈

Furthermore, finding problems earlier in the developer workflow usually reduces costs. Bugs caught by static analysis and code review before they are committed are much cheaper to fix than bugs that make it to production. Here's where the general rule related to CI systems comes into play - Shift Left. ⏪

Shift left: enable faster, more data-driven decision-making earlier on all changes through CI and continuous deployment.

The purpose of testing is to gather information about problems in your systems. Having this information earlier in the workflow shortens iteration cycles, which means fewer bugs make it through and features ship at a higher quality.

Minimise human decisions

There are certain things humans excel at and there are things humans are inherently bad at. Consistently enforcing rules at scale arguably falls into the latter category. 🙃

One decision you need to make with every new change is which tests should be run against it. This decision should be made consistently and repeatably. Because of that, the book makes the case that it should never be up to individual engineers. Instead, a CI system decides which tests to use, and when. That way we always follow explicit rules, and we can reach the desired balance between deployment confidence and speed of development.
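To make the idea concrete, here is a minimal sketch of rule-based test selection. It assumes a made-up reverse-dependency graph and a made-up mapping from build targets to tests (REVERSE_DEPS, TESTS and all target names are hypothetical, not taken from the book). The rule it applies is simple: run every test that belongs to a changed target or to anything that transitively depends on it.

```python
from collections import deque

# Hypothetical reverse-dependency graph: target -> targets that depend on it.
REVERSE_DEPS = {
    "//lib/auth": ["//services/login", "//services/billing"],
    "//services/login": ["//app/web"],
    "//services/billing": ["//app/web"],
    "//app/web": [],
}

# Hypothetical mapping from a build target to its test targets.
TESTS = {
    "//lib/auth": ["//lib/auth:auth_test"],
    "//services/login": ["//services/login:login_test"],
    "//services/billing": ["//services/billing:billing_test"],
    "//app/web": ["//app/web:e2e_test"],
}


def affected_targets(changed):
    """Return the changed targets plus everything that transitively depends on them."""
    seen = set(changed)
    queue = deque(changed)
    while queue:
        target = queue.popleft()
        for dependent in REVERSE_DEPS.get(target, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


def select_tests(changed):
    """Pick tests by an explicit rule: every test of every affected target."""
    return sorted(
        test
        for target in affected_targets(changed)
        for test in TESTS.get(target, [])
    )


print(select_tests(["//services/login"]))
# ['//app/web:e2e_test', '//services/login:login_test']
```

Because the selection is a pure function of the change, two engineers touching the same target always get the same set of tests, which is the consistency the book is after.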

The book also suggests that CI should optimise for quicker, more reliable tests on presubmit and leave slower, less deterministic tests for post-submit. That way we can keep a reasonable pace of development while making sure we don't break things when releasing. 🧘‍♂️
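One way to picture this split is as a small rule over per-suite statistics. The sketch below is only an illustration: the SuiteStats fields, the suite names, and both thresholds are assumptions made up for this example, not numbers from the book.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuiteStats:
    name: str
    avg_seconds: float  # average runtime over recent runs
    flake_rate: float   # fraction of recent runs that failed, then passed on retry


# Illustrative thresholds (assumptions, tune them for your own project).
MAX_PRESUBMIT_SECONDS = 60.0
MAX_PRESUBMIT_FLAKE_RATE = 0.01


def stage_for(suite):
    """Fast and reliable suites gate the change; everything else runs after submit."""
    fast = suite.avg_seconds <= MAX_PRESUBMIT_SECONDS
    reliable = suite.flake_rate <= MAX_PRESUBMIT_FLAKE_RATE
    return "presubmit" if fast and reliable else "post-submit"


suites = [
    SuiteStats("unit_tests", avg_seconds=12.0, flake_rate=0.0),
    SuiteStats("integration_tests", avg_seconds=45.0, flake_rate=0.05),
    SuiteStats("e2e_browser_tests", avg_seconds=900.0, flake_rate=0.002),
]

for suite in suites:
    print(suite.name, "->", stage_for(suite))
# unit_tests -> presubmit
# integration_tests -> post-submit
# e2e_browser_tests -> post-submit
```

Note that a suite has to be both fast and reliable to block a change. A quick but flaky suite would slow developers down just as much as a slow one, so it is moved to post-submit as well.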

Ship often, ship fast

The book makes an interesting observation about how the speed of delivery impacts the safety and confidence in changes being released.

Faster is safer: ship early and often, and in small batches to reduce the risk of each release and to minimize time to market.

There are a few important steps you might want to take to ensure fast and effective releases. 👇

  • Optimise for team velocity. Velocity is a team sport. The optimal workflow for a large team that develops code collaboratively requires modularity of architecture and near-continuous integration.

  • Evaluate changes in isolation. The only way to be sure what broke is to isolate changes. A typical way to achieve that is to guard every new feature behind a flag, so that problems can be isolated early.

  • Make reality your benchmark. Use a staged rollout to address device diversity and the breadth of the user base. Release qualification in a synthetic environment that isn't similar to the production environment can lead to late surprises.

  • Ship only what gets used. Monitor the cost and value of any feature in the wild to know whether it's still relevant and delivering sufficient user value.
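Flag guarding and staged rollouts fit together nicely, and a short sketch may help. The code below is one possible approach, not the mechanism described in the book: it hashes a flag name and a user ID into a stable bucket between 0 and 99 and enables the feature for users whose bucket is below the current rollout percentage. The function names, the "new_checkout" flag, and the checkout example are all hypothetical.

```python
import hashlib


def rollout_bucket(flag_name, user_id):
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    key = f"{flag_name}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_name, user_id, rollout_percent):
    """True if this user falls inside the current rollout percentage."""
    return rollout_bucket(flag_name, user_id) < rollout_percent


def checkout(user_id, rollout_percent):
    # The new code path is guarded by the flag; the old path stays as a fallback.
    if is_enabled("new_checkout", user_id, rollout_percent):
        return "new checkout flow"
    return "old checkout flow"


# Staged rollout: widen the audience step by step while watching real metrics.
users = [f"user-{i}" for i in range(1000)]
for percent in (1, 10, 50, 100):
    enabled = sum(is_enabled("new_checkout", user, percent) for user in users)
    print(f"{percent}% rollout -> {enabled} of {len(users)} users on the new flow")
```

Because the bucket depends only on the flag name and the user ID, a user who got the new flow at 10% keeps it at 50%, and setting the percentage back to 0 switches the feature off for everyone without a new release.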

Conclusion

That's it for today. CI systems become necessary for growing teams and codebases, making it possible to efficiently and safely integrate work and deliver your applications. Here's a short summary of things we went through:

  • CI systems become more necessary as your codebase grows in scale

  • Shift left: enable faster and data-driven decision-making earlier on all changes

  • A CI system decides what tests to use, and when

  • Faster is safer: ship early, often, and in small batches

  • Optimise for team velocity, evaluate changes in isolation, make reality your benchmark, ship only what gets used

Congratulations! 🥳 We've just reached the end of the series where we covered lessons learned from the book Software Engineering at Google. We've touched on a lot of aspects of software engineering as a process, but the book still covers a much wider array of topics. I hope you found this series useful and that you learned something that you will use in your work in the future. 🚀

If you liked the article or you have a question, feel free to reach out to me on Twitter or add a comment below!

Further reading and references