Writing

Amol Kapoor

Back to table of contents.

On Code Review

Written January, 2023. Published May, 2023.
Code Review

In Theory of Programming I describe how the core of the programmer's job is building mental models, not writing code. I conclude with the following:

…making code more legible is the most important thing we can do to be
more efficient. Consider how long it takes for a programmer to really
fully grok someone else's code. A long time! In fact, I'd bet that
understanding code is the most time consuming thing a programmer
spends time on.

As in any system optimization, we aim to target the area with the
biggest bottleneck to achieve the most improvement with the least
work. Same here. We can maximize our efficiency as a system by
making the model-update process as low-cost as possible. 
      

This reframing has led me to some pretty radical (?) opinions on the purpose of code review.

I think there are three reasons why an organization might institute a code review process:

  • to check the code for bugs, architectural/design decisions, and stylistic consistency;
  • as a teaching/learning opportunity, whereby developers can share knowledge of libraries and design patterns;
  • as a means of ensuring that more than one person on the team understands the mental model of the code being written.

I want to focus on the last of these, because I think it is simultaneously the most important and the least recognized aspect of code review. Having a shared understanding of a code base is obviously useful – as mentioned earlier, shared understanding directly leads to more parallelism, fewer bottlenecks, and less single-points-of-failure. By forcing other people to read through a PR, there is some minimal guarantee that at least one other person can speak to the code being written.

But remember, programming is about mental models, not code per se. Reviewing code is only useful insofar as the code reveals the underlying model. If, during a review, the reviewer is unable to ascertain the decision making process behind the code, the review is essentially meaningless and we cannot unlock any of the key benefits of the review. Hopefully obvious.

More controversially, if a reviewer is only able to understand the code by reading through all of it, the code is not useful – the programmer has done an insufficient job conveying the model, and is as a result totally, inextricably tied to the code itself.

In a world where the model is everything, review cannot just be about the correctness of the code. It must also be about how well the code – and all additional documentation, comments, etc. – communicates intent. In many ways, this directly parallels the role of review in academia. Researchers are evaluated on their ideas as well as their communication; and bad writing, incoherent structure, or shitty figure design can lead to an otherwise great paper being (rightfully!) rejected.

When reviewing code, then, it is less important to try and read every single detail. Frankly, it is rare that a code review will catch nitty details like potentially corrupted state, and we shouldn't optimize for that. Instead, reviewers should evaluate based on the following questions:

  • Did you have to read every line to understand what is going on, or were comments and descriptions sufficient?
  • Is it obvious what this code is doing from APIs, function signatures, and comments alone?
  • Can a talented new hire who has never seen the codebase before easily blackbox how this code works?
  • How much analysis has to be done before you can feel safe changing the code without violating Chesterton's Fence?

and so on.

When code is well documented and easily understandable – in other words, when the programmer's internal model is externalized to the entire team – anyone can use it, reason about it, change it. We get access to way more parallelization, have way fewer blockers. The entire team benefits, and the exponential corkscrews kick off in earnest.

All this to say, reviewers should be unafraid (even eager!) to ask for inline plain-English explanations. Making the programmer externalize their mental models is perhaps the most important thing we reviewers can do.