Automating code review

At Mews, we are big fans of code review. Not only does it prevent bugs, it also serves as a natural place for information sharing. For newcomers, it's the easiest way to quickly get a picture of what's currently being worked on and how other developers solve their problems. We encourage people to ask if they don't understand something, because even such questions are feedback for the author: "The code is not clear, I don't understand what it does."

At the moment, we use a two-level code review process. The first review is done by a peer within the team, and if they approve the pull request, it goes to the second level, to someone more senior. By the way, we use GitHub project boards to manage all pull requests, and it works pretty well. Both the author and the reviewers are assignees of the pull request, so a single developer gets a clear overview of their workload by filtering the board with themselves as assignee.

However, the more time someone spends doing code review, the more often they find themselves commenting on the same kind of issue over and over again. And we're not talking about code style; for that we already use StyleCop and other rule sets. We're talking about subtle issues that might be OK in a generic project, but that cause problems in the system you're currently working on or in the environment where it is deployed.

We want reviewers to focus on the important stuff instead of losing their concentration pointing out something they have already pointed out 50 times in the past. So, as programmers, we decided to solve it by writing a program that does the code review for us. 👨‍💻 On the backend, where we use .NET, we take advantage of custom code analyzers: whenever we encounter an anti-pattern that appears regularly in pull requests, we write an analyzer for it. If the analyzer encounters the pattern, it triggers a compile-time error.

That may sound a bit complex to do, but in essence, an analyzer is just a function that performs pattern matching on a syntax tree; the rest is handled by the surrounding framework. It takes an hour or so to create a new analyzer, which is a big win considering that it can detect bugs in your existing codebase and prevent them from reappearing in the future. You don't have to start with anything sophisticated. In our case, we started by implementing analyzers that ban functions from the .NET Framework that have unintended consequences. Implementing such analyzers is trivial; consider the following two examples.
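To give a feel for how little code is involved, here is a minimal sketch of what such an analyzer can look like, built on Roslyn's DiagnosticAnalyzer API (it lives in a separate project referencing the Microsoft.CodeAnalysis.CSharp package; the diagnostic id and messages below are made up for illustration):

```csharp
using System.Collections.Immutable;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Diagnostics;

[DiagnosticAnalyzer(LanguageNames.CSharp)]
public sealed class BannedMemberAnalyzer : DiagnosticAnalyzer
{
    // Illustrative descriptor; the id and wording are not Mews' real ones.
    private static readonly DiagnosticDescriptor Rule = new DiagnosticDescriptor(
        id: "DEMO001",
        title: "Banned member access",
        messageFormat: "Do not use '{0}', it behaves differently across environments",
        category: "Usage",
        defaultSeverity: DiagnosticSeverity.Error,
        isEnabledByDefault: true);

    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics =>
        ImmutableArray.Create(Rule);

    public override void Initialize(AnalysisContext context)
    {
        context.ConfigureGeneratedCodeAnalysis(GeneratedCodeAnalysisFlags.None);
        context.EnableConcurrentExecution();

        // The "pattern matching on a syntax tree" part: Roslyn calls us back
        // for every member access expression in the compilation.
        context.RegisterSyntaxNodeAction(AnalyzeMemberAccess, SyntaxKind.SimpleMemberAccessExpression);
    }

    private static void AnalyzeMemberAccess(SyntaxNodeAnalysisContext context)
    {
        var access = (MemberAccessExpressionSyntax)context.Node;
        var symbol = context.SemanticModel.GetSymbolInfo(access).Symbol;

        // Flag System.DateTime.Now; resolving the symbol via the semantic
        // model means aliases and usings cannot hide the banned member.
        if (symbol is IPropertySymbol property &&
            property.Name == "Now" &&
            property.ContainingType?.ToDisplayString() == "System.DateTime")
        {
            context.ReportDiagnostic(
                Diagnostic.Create(Rule, access.GetLocation(), property.ToDisplayString()));
        }
    }
}
```

Once such an analyzer is packaged and referenced by a project, every occurrence of the banned member fails the build, with no reviewer involved.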

Banning DateTime.Now

Our system runs in the cloud in multiple geographical regions, which means that the return value of DateTime.Now depends on the timezone of the region where the code physically runs. Using Now brings in non-determinism which we want to avoid, so whenever it is used, our analyzer triggers a compile-time error, and the developer has to use DateTime.UtcNow instead and convert it explicitly to a particular timezone.
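The replacement the analyzer pushes developers towards looks roughly like this (the timezone is illustrative; "Europe/Prague" is an IANA id, resolvable on Linux and on newer .NET versions on Windows as well):

```csharp
using System;

class Program
{
    static void Main()
    {
        // Banned: DateTime.Now depends on the timezone of the machine it runs on.
        // Allowed: capture the unambiguous UTC instant first...
        DateTime utcNow = DateTime.UtcNow;

        // ...and convert explicitly to whichever timezone is actually relevant,
        // e.g. the timezone of the hotel.
        TimeZoneInfo prague = TimeZoneInfo.FindSystemTimeZoneById("Europe/Prague");
        DateTime pragueNow = TimeZoneInfo.ConvertTimeFromUtc(utcNow, prague);

        Console.WriteLine($"UTC:    {utcNow:o}");
        Console.WriteLine($"Prague: {pragueNow:o}");
    }
}
```

The UTC instant is the same wherever the code runs; only the explicit conversion step is culture- and location-specific.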

Banning OrderBy(string)

Mews is used by hotels all over the world, and even such a basic thing as the ordering of letters in the alphabet depends on culture. E.g. in Czech, "ch" is a single letter that sorts after "h", whereas in English "ch" is two letters and sorts under "c". Therefore, using e.g. OrderBy(c => c.Name) is an anti-pattern, because it does not take culture into account explicitly. It implicitly uses the culture of the current thread, but that's a shared global variable, which we don't like either 😄. The analyzer forces the developer to always specify a culture-aware comparer when ordering by strings.
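What the analyzer asks for instead of a bare OrderBy(c => c.Name) is an explicit comparer, for example (the cultures and sample words here are illustrative, and the output relies on the runtime's collation data for Czech):

```csharp
using System;
using System.Globalization;
using System.Linq;

class Program
{
    static void Main()
    {
        var names = new[] { "Chata", "Cibule", "Hrad" };

        // Banned: names.OrderBy(n => n) silently uses the current thread's culture.
        // Allowed: state the culture explicitly via a comparer.
        var czech = StringComparer.Create(new CultureInfo("cs-CZ"), ignoreCase: false);
        var english = StringComparer.Create(new CultureInfo("en-US"), ignoreCase: false);

        // Czech collation treats "ch" as a single letter sorting after "h".
        Console.WriteLine(string.Join(", ", names.OrderBy(n => n, czech)));
        // → Cibule, Hrad, Chata

        // English collation sorts "ch" under "c".
        Console.WriteLine(string.Join(", ", names.OrderBy(n => n, english)));
        // → Chata, Cibule, Hrad
    }
}
```

The same list, two cultures, two different orders; which is exactly why the choice should be visible in the code rather than inherited from the thread.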

Conclusion

We are also fans of functional programming, which means that we use a lot of strong typing and try to turn runtime errors into compile-time errors. So in general, we ban a lot of type-unsafe constructs that are available in the .NET libraries or even in the C# language itself. On the other end of the spectrum, we use analyzers for much more basic checks, e.g. to ensure consistency in comments or in code style (on top of StyleCop).

Last but not least, not all people are used to getting a lot of feedback and critique on their code. Newcomers are sometimes surprised by how thorough we are, and we can sense that they're not entirely comfortable getting so many comments from their peers. That of course improves over time, but we understand that it can be a bit discouraging in the beginning. However, if the feedback is given by the compiler at the time of writing the code, it suddenly feels much more objective and acceptable. It doesn't feel like someone is picking on them just because they are new. And it also keeps the reviewers happy, because they don't have to act as machines for catching the same problems every day.

Learn more about how we do things at Mews on our GitHub and Twitter.