A little over a year ago, I joined Cloud Foundry to work on Loggregator, Cloud Foundry's application logging component. Its core concern is best-effort log delivery without pushing back on upstream writers. Loggregator is written entirely in Go.
After spending more than a thousand hours working with Go in a non-trivial code base, I still admire the language and enjoy using it. Nonetheless, our team struggled with a number of problems, many of which seem unique to Go. What follows is a list of the most salient problems.
Cloud Foundry was an early adopter of Go at a time when few people knew what idiomatic Go looked like or knew how to structure a large project. As a result, a year ago Loggregator suffered from a haphazard organization which made understanding the code difficult, let alone identifying dead code paths or places for possible refactoring. There seemed to be a tendency to extract tiny packages first instead of waiting for a shared concern to emerge from the code and only then extracting a package. There were many examples of stuttering between package names and types. Worst of all, there was little reusable code in the project.
Given the code's state of organization, Peter Bourgon's advice on how to organize Go code has been invaluable, as is the rest of his material on best practices. Likewise, the Go blog's post on package names provides many helpful guiding principles. For especially large projects, the distinction between cmd and pkg has become a best practice. See, for example, Delve. More recently, there is the excellent Style guideline for Go packages.
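As a rough sketch of that distinction, this is the kind of layout I have in mind. The project and package names are hypothetical and are not taken from Loggregator or Delve.

```
myproject/
    cmd/
        myproject/
            main.go      (one small main package per binary)
    pkg/
        ingress/         (library code, imported by cmd)
        egress/
```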
When just starting a project, the pkg package seems unnecessary. I prefer to begin with a cmd directory and with whatever Go files at the top level, as if the project were a library. As the project grows, I like to identify packages, which start out as peers of the cmd package. When the time seems right, it is easy to move those various peers of cmd into a pkg directory.
Small main functions
Just as a poorly organized project results in a ball of mud, a careless approach to a main function can result in needless complexity. Compare two versions of the same main function: before and after. One version is over 400 lines. The other is about 40 lines. That's an order of magnitude. One will be easy to change. The other will not be. Delve is exemplary in its clean and focused main function.
A main function should be a particular invocation of library code. That means collecting any input necessary for the process and then passing that input to library code. This style of main functions is more likely to result in testable and composable code.
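To make the idea concrete, here is a minimal sketch of a main function that only collects input and hands it to library code. The config fields, the flag names, and the run function are all hypothetical; in a real project run would live in its own package rather than next to main.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"os"
)

// config holds every input the process needs. The fields are hypothetical.
type config struct {
	addr    string
	verbose bool
}

// parseFlags turns command-line arguments into a config. It takes the
// arguments as a parameter so that it can be exercised without a real process.
func parseFlags(args []string) (config, error) {
	fs := flag.NewFlagSet("app", flag.ContinueOnError)
	var cfg config
	fs.StringVar(&cfg.addr, "addr", "localhost:8080", "address to listen on")
	fs.BoolVar(&cfg.verbose, "verbose", false, "enable verbose output")
	if err := fs.Parse(args); err != nil {
		return config{}, err
	}
	return cfg, nil
}

// run stands in for the library code. It knows nothing about flags or os.Args.
func run(cfg config) (string, error) {
	if cfg.addr == "" {
		return "", errors.New("addr must not be empty")
	}
	return fmt.Sprintf("listening on %s (verbose=%t)", cfg.addr, cfg.verbose), nil
}

// main is one particular invocation of the library code: gather input, call run.
func main() {
	cfg, err := parseFlags(os.Args[1:])
	if err != nil {
		// The flag package has already reported the problem on stderr.
		os.Exit(2)
	}
	msg, err := run(cfg)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(msg)
}
```

Because parseFlags and run take plain values, both can be tested directly, and a second binary could reuse run with a different source of configuration.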
Dependency management has been a perennial topic in the Go community. Loggregator has used git submodules to vendor dependencies. The approach works, but it's also cumbersome. Spending some time with Rust has reminded me how sorely Go needs an officially supported dependency management tool as part of the Go toolchain. The work on dep is encouraging.
Keeping Go Meta Linter Happy
Without running Go Meta Linter regularly, all sorts of mistakes will creep into a code base. In particular, I have discovered the value of Package Driven Development, i.e., writing code that looks good when running godoc some-package, a practice which shares a history with conventions in the Python community. The documentation for a package should be easy to understand, it should be intention-revealing, and it should be meaningful.
On numerous occasions over the course of the year, I lamented the lack of documentation for Loggregator internals, which made the code even slower to understand. Fortunately, our team has come to share the view that documentation is important and has been gradually working to ensure all files within the project pass the Go Meta Linter.
Writing Performant Code starts with measuring
Go is capable of fast performance. It is tempting to prematurely optimize code with the idea that a particular design is "faster." In fact, until you have measured current performance and determined that current performance is inadequate, "faster" is a totally meaningless word.
Such a statement is hardly controversial, and yet I have worked with numerous well-intentioned individuals who immediately reach for sophisticated designs on the dubious grounds of their being "faster." Fortunately, there is a strong interest in the discipline of writing high-performance code. See, for example, Dave Cheney's High Performance Go Workshop or Damian Gryski's in-progress book on Go Performance.
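Measuring does not have to be elaborate. The sketch below uses testing.Benchmark from the standard library to compare two ways of joining strings. The two functions and the input size are hypothetical examples of mine, not code from Loggregator, and the numbers will differ from machine to machine.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sink keeps the results alive so the work is not optimized away.
var sink string

// concatNaive builds a string with repeated +=.
func concatNaive(parts []string) string {
	var s string
	for _, p := range parts {
		s += p
	}
	return s
}

// concatBuilder builds the same string with strings.Builder.
func concatBuilder(parts []string) string {
	var b strings.Builder
	for _, p := range parts {
		b.WriteString(p)
	}
	return b.String()
}

// measure runs fn under the standard benchmark harness and returns the result.
func measure(fn func([]string) string, parts []string) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			sink = fn(parts)
		}
	})
}

func main() {
	// Init registers the testing flags when running outside of "go test".
	testing.Init()

	parts := make([]string, 1000)
	for i := range parts {
		parts[i] = "log line "
	}
	fmt.Println("naive:  ", measure(concatNaive, parts))
	fmt.Println("builder:", measure(concatBuilder, parts))
}
```

In a real project the same functions would more commonly be written as BenchmarkXxx functions in a _test.go file and run with go test -bench. Either way, the point is to have numbers in hand before arguing that one design is "faster."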
Having a shared nomenclature for testing
There seems to be a consensus that writing well-tested code is important. What is lacking, though, is a clear understanding of the differences between test doubles, mocks, spies, stubs, and fakes. Uncle Bob has explained what each of these terms means in his The Little Mocker. On Loggregator, we had a mix of all these terms and they were rarely used correctly. It may seem pedantic to insist on using these terms correctly, but then again, software engineering leaves little room for ambiguity, so why would we not use the same standards for our choice of words? In my view, a shared nomenclature (what has elsewhere been called a "ubiquitous language") is the first and most important thing for a team of engineers.
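To illustrate the distinctions, here is a small sketch of a stub, a spy, a mock, and a fake, all of which are test doubles, as I read the definitions in The Little Mocker. The LogWriter interface and the Forward function are hypothetical and exist only to give the doubles something to stand in for.

```go
package main

import (
	"errors"
	"fmt"
)

// LogWriter is a hypothetical dependency of the code under test.
type LogWriter interface {
	Write(msg string) error
}

// Forward is the hypothetical code under test. It sends every message to w
// and reports how many writes failed.
func Forward(w LogWriter, msgs []string) (dropped int) {
	for _, m := range msgs {
		if err := w.Write(m); err != nil {
			dropped++
		}
	}
	return dropped
}

// stubWriter is a stub: it returns a canned answer and records nothing.
type stubWriter struct{ err error }

func (s stubWriter) Write(string) error { return s.err }

// spyWriter is a spy: a stub that also remembers how it was called,
// so the test can inspect the calls afterwards.
type spyWriter struct {
	err   error
	calls []string
}

func (s *spyWriter) Write(msg string) error {
	s.calls = append(s.calls, msg)
	return s.err
}

// mockWriter is a mock: a spy that knows which calls to expect
// and performs the verification itself.
type mockWriter struct {
	spyWriter
	want []string
}

func (m *mockWriter) Verify() error {
	if len(m.calls) != len(m.want) {
		return fmt.Errorf("got %d calls, want %d", len(m.calls), len(m.want))
	}
	for i := range m.want {
		if m.calls[i] != m.want[i] {
			return fmt.Errorf("call %d: got %q, want %q", i, m.calls[i], m.want[i])
		}
	}
	return nil
}

// fakeWriter is a fake: a simplified but working implementation,
// here a bounded in-memory buffer that rejects writes once it is full.
type fakeWriter struct {
	capacity int
	buf      []string
}

func (f *fakeWriter) Write(msg string) error {
	if len(f.buf) >= f.capacity {
		return errors.New("buffer full")
	}
	f.buf = append(f.buf, msg)
	return nil
}

func main() {
	msgs := []string{"a", "b", "c"}

	fmt.Println(Forward(stubWriter{err: errors.New("down")}, msgs)) // 3

	spy := &spyWriter{}
	Forward(spy, msgs)
	fmt.Println(len(spy.calls)) // 3

	mock := &mockWriter{want: msgs}
	Forward(mock, msgs)
	fmt.Println(mock.Verify()) // <nil>

	fmt.Println(Forward(&fakeWriter{capacity: 2}, msgs)) // 1
}
```

Naming each type after what it actually is costs nothing, and it tells the next reader whether the test depends on canned answers, on recorded calls, or on real behavior.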
Finally, both for people new to Go and people experienced with Go, I continue to find immense value in the following posts.
Thanks to Jason Keene for reading through this post and pointing out GoDoc's relationship to Python.