cmd/go: go run and go tool are slower than directly executing cached binary #71733

Closed
dottedmag opened this issue Feb 13, 2025 · 14 comments
Labels
GoCommand cmd/go NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance

@dottedmag
Contributor

dottedmag commented Feb 13, 2025

Go version

go version go1.24.0 darwin/arm64

Output of go env in your module/workspace:

AR='ar'
CC='clang'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='clang++'
GCCGO='gccgo'
GO111MODULE=''
GOARCH='arm64'
GOARM64='v8.0'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/Users/dottedmag/Library/Caches/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/Users/dottedmag/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/95/vzwcv5yd32x369z0c9t4bfr00000gn/T/go-build3662448907=/tmp/go-build -gno-record-gcc-switches -fno-common'
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMOD='/Users/dottedmag/tmp/gr-go-run/go.mod'
GOMODCACHE='/Users/dottedmag/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/dottedmag/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/Users/dottedmag/go/pkg/mod/golang.org/[email protected]'
GOSUMDB='sum.golang.org'
GOTELEMETRY='off'
GOTELEMETRYDIR='/Users/dottedmag/Library/Application Support/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/Users/dottedmag/go/pkg/mod/golang.org/[email protected]/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.24.0'
GOWORK=''
PKG_CONFIG='pkg-config'

What did you do?

I have tried running go run and go tool with trivial programs, expecting them to exit nearly instantly.

Instead I'm seeing run times of ~50ms on Mac M1 Max with warm caches.

Here's a repository with a reproducer: https://github.com/dottedmag/gr-go-run

Run go test -bench=. in that repository.

The basis for comparison is a tool I wrote some time ago, before link caching was merged into Go. It uses a cache key computation algorithm that is fairly close to the original one (borrowing some code directly from Go), and it still outperforms go run: 5ms versus 50ms.

What did you see happen?

% go test -bench=.
goos: darwin
goarch: arm64
pkg: gr-go-run
cpu: Apple M1 Max
BenchmarkGr-10       	    237	  4570610 ns/op
BenchmarkGoRun-10    	     22	 51377985 ns/op
PASS
ok  	gr-go-run	3.038s

What did you expect to see?

I expected go run and go tool to be at least on par with an external tool that does not hook into the compilation process, now that linker outputs are cached.
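
For readers who want to reproduce this kind of comparison without the linked repository, here is a minimal sketch of the measurement. It is not the benchmark from the reproducer: the `./hello` package, the prebuilt `./hello.bin` binary, and the `meanDuration` helper are all hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"time"
)

// meanDuration calls run n times and returns the mean wall-clock time per call.
func meanDuration(n int, run func() error) (time.Duration, error) {
	if n <= 0 {
		return 0, fmt.Errorf("n must be positive, got %d", n)
	}
	start := time.Now()
	for i := 0; i < n; i++ {
		if err := run(); err != nil {
			return 0, err
		}
	}
	return time.Since(start) / time.Duration(n), nil
}

func main() {
	// Assumed layout (hypothetical): ./hello is a trivial main package and
	// ./hello.bin is the same package prebuilt with
	// "go build -o hello.bin ./hello".
	cases := []struct {
		name string
		run  func() error
	}{
		{"go run", func() error { return exec.Command("go", "run", "./hello").Run() }},
		{"direct exec", func() error { return exec.Command("./hello.bin").Run() }},
	}
	for _, c := range cases {
		d, err := meanDuration(20, c.run)
		if err != nil {
			fmt.Fprintln(os.Stderr, c.name, "failed:", err)
			continue
		}
		fmt.Printf("%-12s %v per invocation\n", c.name, d)
	}
}
```

The first go run call may include a real build, so warm the caches once before reading the numbers.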

@seankhliao
Member

is this a corporate mac with endpoint security software running on it?

@dottedmag
Contributor Author

dottedmag commented Feb 13, 2025

Nope, no enterprise-y shenanigans.

Apple's Gatekeeper is enabled, of course, but it should affect all the solutions equally.

@dmitshur added the Performance, NeedsInvestigation, and GoCommand (cmd/go) labels Feb 14, 2025
@dmitshur added this to the Backlog milestone Feb 14, 2025
@dmitshur
Contributor

CC @matloob, @samthanawalla.

@seankhliao changed the title from "cmd/go: go run and go tool are still quite slow (50ms on Mac M1 Max with warm caches)" to "cmd/go: go run and go tool are slower than directly executing cached binary" Feb 22, 2025
@iwahbe

iwahbe commented Mar 12, 2025

I'm seeing this as well.

@matloob
Contributor

matloob commented Mar 13, 2025

The caching support for go run and go tool caches the output of the link step in the build. We still need to run all the actions in the build graph (looking up the cached output using the action id). We also still need to do module and package loading.
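
To make that cost concrete, here is a toy model of a build graph. It is only a sketch and not the go command's implementation: the `action` type, the `build` function, and the way IDs are derived (hashing an action's inputs together with its dependencies' IDs) are simplifying assumptions. It shows that even when every output is already cached, each action is still visited and its ID recomputed.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// action is one node of a toy build graph, such as compiling a package
// or linking a binary. This type is hypothetical, not cmd/go's.
type action struct {
	name   string
	inputs string // stands in for sources, flags, toolchain version
	deps   []*action
}

type stats struct{ visited, ran int }

// build walks every action reachable from root. For each one it derives an
// ID from the action's own inputs and its dependencies' IDs, then looks the
// ID up in cache. Only cache misses count as "ran".
func build(cache map[[32]byte]bool, root *action) stats {
	var s stats
	ids := map[*action][32]byte{}
	var visit func(a *action) [32]byte
	visit = func(a *action) [32]byte {
		if id, ok := ids[a]; ok {
			return id
		}
		h := sha256.New()
		fmt.Fprintf(h, "%s\x00%s\x00", a.name, a.inputs)
		for _, d := range a.deps {
			id := visit(d)
			h.Write(id[:])
		}
		var id [32]byte
		copy(id[:], h.Sum(nil))
		ids[a] = id
		s.visited++
		if !cache[id] {
			s.ran++
			cache[id] = true
		}
		return id
	}
	visit(root)
	return s
}

func main() {
	rt := &action{name: "compile runtime", inputs: "runtime sources"}
	lib := &action{name: "compile lib", inputs: "lib sources", deps: []*action{rt}}
	app := &action{name: "compile main", inputs: "main sources", deps: []*action{lib, rt}}
	link := &action{name: "link", inputs: "linker flags", deps: []*action{app}}

	cache := map[[32]byte]bool{}
	cold := build(cache, link)
	warm := build(cache, link)
	fmt.Printf("cold: visited=%d ran=%d\n", cold.visited, cold.ran) // cold: visited=4 ran=4
	fmt.Printf("warm: visited=%d ran=%d\n", warm.visited, warm.ran) // warm: visited=4 ran=0
}
```

In this model the warm run does no compiling or linking, yet it still hashes and looks up all four actions. The real go command also performs module and package loading before it can construct the graph at all.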

I'm sure there's a lot we can do to make the go command faster, and I would love to see it become faster, but this level of performance is pretty much what we expect.

I would definitely like to hear about use cases that are adversely impacted by this.

We would also definitely welcome changes that improve the performance of the go command without impacting its complexity.

@dottedmag
Contributor Author

> I would definitely like to hear about use cases that are adversely impacted by this.

With fast caching, Go could be used to write wrappers around CLI tools. 5ms is acceptable, but 50ms is already perceptible for interactive invocation, and if it is a wrapper around an often-called tool (e.g. a wrapper for a compiler), these milliseconds add up quickly.

@iwahbe

iwahbe commented Mar 14, 2025

My use case is around build tooling: wrapping a utility for integrating Go into Makefiles. I need to run the wrapped utility ~30 times per make invocation, so speed is critical (and developers wait in real time). Each invocation runs for ~20ms, and a >100ms overhead is unacceptable.

@dottedmag I've tried to work around this problem by fetching the tool path with go tool -n ${YOUR_TOOL}, and then invoking the binary directly. If you have somewhere to save state, that worked pretty well for me. It hit #72824 in CI though...

@dottedmag
Contributor Author

@iwahbe

> I've tried to work around this problem by fetching the tool path with go tool -n ${YOUR_TOOL}, and then invoking the binary directly.

The trickiest part is cache revalidation, as usual. I'm striving for a replacement for "go run" — a tool that can be used without thinking about stale caches.

With gr (see above) it's already much faster than running go list. I haven't spent much time optimizing it though, so I guess there's still a lot of performance left on the table. Now that I think about it, I have an idea how to get it down to a small number of syscalls and a follow-up execve() with little logic in between. This still isn't free, but I guess I could try to cut it down to under 1ms.

I understand that in your use case you may be reasonably sure that the source code of your wrapper does not change under you as you're building things. In mine it's a source of frustration when I or somebody else on the team changes branches and then, 30 minutes later, figures out the tool was stale.

@matloob
Contributor

matloob commented Mar 17, 2025

Thanks for your responses. I think for these use cases, we'd be happy to accept performance increase CLs that don't increase the complexity of the go command.

@iwahbe I'm curious why doing a go build -o <temporary location> <pkgpath> to build the binary to a temporary location, and then executing that binary wouldn't work for your use case. That would be a better way to get the latest cached copy of the tool.
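
As a rough sketch of that suggestion, the program below builds a package once into a temporary directory and then executes the resulting binary repeatedly, so the go command's overhead is paid only once. The `buildArgs` and `buildTool` helpers, the command-line layout, and the run count of 30 (borrowed from the Makefile use case mentioned earlier in the thread) are illustrative assumptions.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// buildArgs returns the go command arguments that compile pkg to out.
func buildArgs(out, pkg string) []string {
	return []string{"build", "-o", out, pkg}
}

// buildTool compiles pkg into dir and returns the path of the binary.
func buildTool(dir, pkg string) (string, error) {
	out := filepath.Join(dir, "tool")
	cmd := exec.Command("go", buildArgs(out, pkg)...)
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		return "", fmt.Errorf("go build %s: %w", pkg, err)
	}
	return out, nil
}

func main() {
	// Hypothetical usage: runmany PKGPATH [args...]
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: runmany PKGPATH [args...]")
		return
	}
	dir, err := os.MkdirTemp("", "tool-*")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer os.RemoveAll(dir)

	bin, err := buildTool(dir, os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	const runs = 30 // illustrative: one build, many direct executions
	for i := 0; i < runs; i++ {
		cmd := exec.Command(bin, os.Args[2:]...)
		cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
		if err := cmd.Run(); err != nil {
			fmt.Fprintln(os.Stderr, "run failed:", err)
			return
		}
	}
}
```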

@iwahbe

iwahbe commented Mar 17, 2025

I use GOBIN=$(shell pwd)/bin/${HELPMAKEGO_VERSION} go install github.com/iwahbe/helpmakego@${HELPMAKEGO_VERSION} right now, but that requires that I manually manage versions. It works, but go tool would have been simpler if it was fast enough.

I don't want a global install, and running go build -o bin/helpmakego github.com/iwahbe/helpmakego requires updating go.mod, which go mod tidy then reverts:

go build -o bin/v0.1.0/helpmakego github.com/iwahbe/helpmakego
no required module provides package github.com/iwahbe/helpmakego; to add it:
        go get github.com/iwahbe/helpmakego

@matloob
Contributor

matloob commented Mar 18, 2025

@iwahbe For the go build -o case, have you already added your tool to the module using go get -tool github.com/iwahbe/helpmakego before running go build -o? If you have a tool directive in your go.mod, go mod tidy won't remove the requirement.

@iwahbe

iwahbe commented Mar 19, 2025

Running go get -tool github.com/iwahbe/helpmakego and then go build -o ... works.

Maybe the error message for go build -o bin/v0.1.0/helpmakego github.com/iwahbe/helpmakego should mention -tool when the package path isn't inside the current module.

> That would be a better way to get the latest cached copy of the tool.

If that is true, I'm not sure why go tool allows adding external tools. I assumed that go tool allows external tools in order to provide a correct and easy-to-use way to run cached tools.

@matloob
Contributor

matloob commented Mar 19, 2025

> > That would be a better way to get the latest cached copy of the tool.
>
> If that is true, I'm not sure why go tool allows adding external tools. I assumed that go tool allows external tools in order to provide a correct and easy-to-use way to run cached tools.

Sorry, I should have clarified: I was comparing go build -o with go tool -n as ways to get a binary path that you can execute directly. That's for the case where you need to optimize the performance of a tool that is called many times. go tool itself is the best option to build and run the tool.

@matloob
Contributor

matloob commented Mar 24, 2025

I'm going to close this issue for now. If you have suggestions for specific work that can be done on the go command that can improve performance without adding complexity, please file an issue and we can discuss the specific proposals.

@matloob closed this as not planned Mar 24, 2025