Why you should use a Go module proxy
After the introduction of Go modules, I thought everything I need to know was finalized. I quickly realized this wasn't the case. Recently, people started to advocate using a Go module proxy. After researching the pros and cons, I've concluded that this is one of the most important changes in recent years. But why is that the case? What makes Go module proxy so special?
In Go modules, if you add a new dependency or build your Go module on a machine with clean cache, it'll (go get
) download all the dependencies based on go.mod
and will cache it for further operations. You can bypass the cache (and with that downloading dependencies) by using a vendor/
folder and constructing go use this folder with the -mod=vendor
flag.
But both of these approaches are not perfect and we can do better.
Problems of (not) using the vendor/ folder
Here are some disadvantages if you use the vendor/
folder:
vendor/
folder is not used by default anymore for thego
command (in module-aware mode). If you don't append the-mod=vendor
flag, it'll not respect it. This leads often to frustrations and leads to other hacky solutions to support older Go versions (see: Using Go modules with vendor support on Travis CI)vendor/
folder, especially for big monorepos, takes a lot of space. This increases the amount of time spent cloning a repo. Even if you think that cloning is done only once, this is mostly not true. CI/CD systems often clone the repository for every trigger (e.g Pull Requests). So in the long term, this leads to longer build times and affects everyone on the team.- Vendoring a new dependency often leads to changes that are difficult to review. And most of the times you have to bundle your dependencies with your actual business logic, which makes it hard to go over the changes.
What if you skip the vendor/
folder? This also doesn't help because now you're dealing with these issues:
go
will attempt to download the dependencies from the source repositories. But there is always a risk that any dependency might disappear in the future (remember the left-pad saga).- The VCS might be down (e.g github.com). In this case, you won't be able to build your project anymore.
- Some companies don't want to have any outgoing connections outside their internal network. Removing the
vendor/
folder is therefore no-go for them. - Suppose a dependency is published as
v1.3.0
andgo get
fetches it and caches it locally. In the meantime, the owner of the dependency can compromise the repo by pushing malicious content with the same tag. If your Go module is rebuilt on a machine with a clean cache, it'll now use the compromised package. To protect against this, you need to store thego.sum
file alongside thego.mod
file. - Some of the dependencies use a different VCS than
git
and therefore depend on other tools such ashg
(Mercurial),bzr
(Bazaar) orsvn
(Subversion). Not all these tools are installed on your host (or in your Dockerfile), which often leads to frustrations. go get
needs to fetch the source code of each dependency listed ingo.mod
to resolve transitive dependencies (it needs theirgo.mod
file). This slows down the whole build process significantly as it means it has to download (e.ggit clone
) each repository just to fetch a single file.
How can we improve this situation?
Advantages of using a Go module proxy
By default the go
command downloads modules from VCS's directly. The GOPROXY
environment variable allows further control over the download source. The environment variable configures the go
command to use a Go module proxy.
By setting the GOPROXY
environment variable to a Go module proxy, you can overcome all of the disadvantages listed above:
- The Go module proxy is by default caching and storing all the dependencies forever (in immutable storage). This means you don't have to use any
vendor/
folder anymore. - Getting rid of the
vendor/
folder means your projects won't take space in your repository. - Because the dependencies are stored in immutable storage, even if a dependency disappears from the internet, you're protected against it.
- It's not possible to override or delete a Go module once it's stored in the Go proxy. This protects you against actors who might inject malicious code with the same version.
- You don't require any VSC tools anymore to download the dependencies because the dependencies are served over HTTP (Go proxy uses HTTP under the hood).
- It's significantly faster to download and build your Go module because Go proxy serves the source code (
.zip
archive) andgo.mod
independently over HTTP. This causes the downloads to take less time and faster (due to less overhead) compared to fetching from a VCS. Resolving dependencies is also faster because thego.mod
can be fetched independently (whereas before it had to fetch the whole repository). The Go team tested it andthey saw a 3x speedup on fast networks and 6x on slow networks! - You can easily run your own Go proxy, which gives you more control over the stability of your build pipeline and protects against the rare cases when the VCS is down.
As you see, using a Go module proxy is a win for everyone. But how do we use it? What if you don't want to maintain your own Go module proxy? Let us look into many alternative options.
How to use a Go module proxy
To start using a Go module proxy, we need to set the GOPROXY
environment variable to a compatible Go module proxy. There are multiple ways:
1.) If GOPROXY
is unset, empty or set to direct
then go get
will use a direct connection to the VCS (e.g github.com):
GOPROXY=""
GOPROXY=direct
It can be also set to off
, which means no network use is allowed.
GOPROXY=off
2.) You can start using a public Go proxy. One of your options is to use the Go proxy from the Go team (which is run by Google). More information can be found here: https://proxy.golang.org/
To start using it, all you have is to set the environment variable:
GOPROXY=https://proxy.golang.org
Other public proxies are:
GOPROXY=https://goproxy.io
GOPROXY=https://goproxy.cn # proxy.golang.org is blocked in China, this proxy is not
3.) You can run several open source implementations and host it yourself. Some of these are:
Athens
: https://github.com/gomods/athensgoproxy
: https://github.com/goproxy/goproxyTHUMBAI
: https://thumbai.app/
You need to maintain these yourself. It's up to you if you want to serve it over the public internet or on your internal network.
4.) You can buy a commercial offering:
Artifactory
: https://jfrog.com/artifactory/
5.) You can pass a file:///
URL. Because a Go module proxy is a web server that responds to GET requests (with no query parameters), a folder in any filesystem can be also used to serve as a Go module proxy.
Upcoming Go v1.13 changes
There will be some changes regarding Go proxy in the Go v1.13 version which I think should be highlighted:
- the
GOPROXY
environment variable may now set to comma-separated list. It'll try the first proxy before falling back to the next path. - The default value of
GOPROXY
will behttps://proxy.golang.org,direct
. Anything after thedirect
token is ignored. This also means thatgo get
will now by default useGOPROXY
. If you don't want to use Go proxy at all, you need to set it tooff
. - A new
GOPRIVATE
environment variable is introduced, that contains a comma-separated list of glob patterns. This can be used to bypass theGOPROXY
proxy for certain paths, especially private modules in your company (e.g:GOPRIVATE=*.internal.company.com
).
All of these changes indicate how the Go module proxy is a central and important part of Go modules.
Verdict
Using GOPROXY
both via public and private networks has a lot of advantages. It's a great feature that works seamlessly with the go
command. Giving the fact that it has so many advantages (secure, fast, storage efficient) it would be wise to embrace it quickly for your projects or in your organization. Also, with Go v1.13, it'll be enabled by default, which is another welcoming step improving the state of dependency management in Go.
No spam, no sharing to third party. Only you and me.