
execgen: large dependency tree causes lots of recompilation under bazel #77234

@irfansharif

Describe the problem

Bazel builds can sometimes be slower than the corresponding make builds for the same set of changed files. I tried the following experiment:

  1. Ran make buildshort and dev build short on a clean checkout (to prime both caches).
  2. Applied this random (buggy) diff to one of our base Go packages:
$ git diff
diff --git i/pkg/roachpb/api.go w/pkg/roachpb/api.go
index f4d0b4c8aa..5b42b71539 100644
--- i/pkg/roachpb/api.go
+++ w/pkg/roachpb/api.go
@@ -143,7 +143,7 @@ func IsTransactional(args Request) bool {
 // IsLocking returns true if the request acquires locks when used within
 // a transaction.
 func IsLocking(args Request) bool {
-       return (args.flags() & isLocking) != 0
+       return (isLocking) != 0
 }
  3. Re-ran make buildshort and dev build short, timing each run:
$ time make buildshort
[...]
________________________________________________________
Executed in   70.72 secs    fish           external
   usr time  174.42 secs   40.00 micros  174.42 secs
   sys time   24.85 secs  537.00 micros   24.85 secs

$ dev build short -- --profile=experiment.tar.gz
[...]
INFO: Elapsed time: 97.947s, Critical Path: 91.80s
INFO: 312 processes: 23 remote cache hit, 1 internal, 288 darwin-sandbox.
INFO: Build completed successfully, 312 total actions

Looking at the bazel profile (chrome://tracing), I observed that we re-compiled execgen, which forced recompilation of every package target that depends on execgen output. Why doesn't make also rebuild execgen, re-run it, and recompile all dependents? Looking at our Makefile, we only re-compile bin/execgen when proto files change, and we don't (re-)generate .eg.go files unless explicitly running make generate. In short: we've not fully specified execgen's dependencies in Make, and builds are faster because of it. Bazel builds, by contrast, are fully specified and will re-compile execgen and re-generate its output whenever any of execgen's Go dependencies change. That's more "correct", but slower.
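
To make the difference concrete, here is a minimal sketch of the invalidation logic, not how either Bazel or Make is actually implemented. The target names and edges are hypothetical: "generator" stands in for the execgen binary, "generated" for its .eg.go output, and "base" for a low-level package like roachpb. Dropping one tracked edge models the under-specified Makefile.

```go
package main

import (
	"fmt"
	"sort"
)

// deps maps each target to its direct dependencies. Every name and edge here
// is hypothetical; the graph only mirrors the shape described in the issue.
var deps = map[string][]string{
	"base":       {},
	"util":       {"base"},
	"generator":  {"util"},              // stands in for the execgen binary
	"generated":  {"generator"},         // stands in for the .eg.go output
	"consumer_a": {"generated"},         // compiles generated code
	"consumer_b": {"generated", "base"}, // also uses base directly
	"unrelated":  {},
}

// invalidated returns, in sorted order, the changed target plus all of its
// transitive dependents, i.e. everything a build must redo if every edge in g
// is tracked.
func invalidated(g map[string][]string, changed string) []string {
	// Reverse the edges: dependency -> direct dependents.
	rdeps := make(map[string][]string)
	for target, ds := range g {
		for _, d := range ds {
			rdeps[d] = append(rdeps[d], target)
		}
	}
	// Breadth-first walk over reverse edges.
	seen := map[string]bool{changed: true}
	queue := []string{changed}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		for _, dependent := range rdeps[cur] {
			if !seen[dependent] {
				seen[dependent] = true
				queue = append(queue, dependent)
			}
		}
	}
	out := make([]string, 0, len(seen))
	for t := range seen {
		out = append(out, t)
	}
	sort.Strings(out)
	return out
}

// withoutEdge returns a copy of g with the from -> to dependency removed,
// modeling a build system that does not track that edge.
func withoutEdge(g map[string][]string, from, to string) map[string][]string {
	out := make(map[string][]string, len(g))
	for target, ds := range g {
		kept := make([]string, 0, len(ds))
		for _, d := range ds {
			if target == from && d == to {
				continue
			}
			kept = append(kept, d)
		}
		out[target] = kept
	}
	return out
}

func main() {
	fmt.Println("fully specified:", invalidated(deps, "base"))
	fmt.Println("under-specified:", invalidated(withoutEdge(deps, "generator", "util"), "base"))
}
```

With every edge tracked, a change to base invalidates six targets, including the generator, its output, and both consumers. With the generator's dependency untracked, only base, util, and consumer_b (which uses base directly) are rebuilt, at the cost of possibly stale generated code.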

To Reproduce

See above. To see the dependency chain between execgen and roachpb:

$ bazel query "allpaths(//pkg/sql/colexec/execgen/cmd/execgen, //pkg/roachpb)"

To see the full set of dependencies execgen has (any of which, if changed, would cause slower builds than Make):

$ bazel query "deps(//pkg/sql/colexec/execgen/cmd/execgen)" --output package | grep -v '^@' | grep -v '^build/bazelutil'
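
The package list from that query can be long, so a rough breakdown by directory may help show where execgen's dependency weight sits. The following is a hypothetical helper, not an existing tool in the repository: it reads the query's --output package lines from stdin, applies the same two filters as the grep pipeline, and counts packages by their first two path components (for example pkg/sql).

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"sort"
	"strings"
)

// summarize counts first-party packages by their first two path components,
// skipping external repositories ("@...") and build/bazelutil, the same
// entries the grep pipeline filters out.
func summarize(lines []string) map[string]int {
	counts := make(map[string]int)
	for _, line := range lines {
		pkg := strings.TrimSpace(line)
		if pkg == "" || strings.HasPrefix(pkg, "@") || strings.HasPrefix(pkg, "build/bazelutil") {
			continue
		}
		parts := strings.SplitN(pkg, "/", 3)
		key := parts[0]
		if len(parts) > 1 {
			key = parts[0] + "/" + parts[1]
		}
		counts[key]++
	}
	return counts
}

func main() {
	// Read one package path per line from stdin.
	var lines []string
	sc := bufio.NewScanner(os.Stdin)
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	if err := sc.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "read error:", err)
		os.Exit(1)
	}

	counts := summarize(lines)
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	// Largest groups first; ties broken alphabetically for stable output.
	sort.Slice(keys, func(i, j int) bool {
		if counts[keys[i]] != counts[keys[j]] {
			return counts[keys[i]] > counts[keys[j]]
		}
		return keys[i] < keys[j]
	})
	for _, k := range keys {
		fmt.Printf("%4d %s\n", counts[k], k)
	}
}
```

Assuming the file is saved as summarize.go (an arbitrary name), pipe the unfiltered query output into go run summarize.go. The directories with the highest counts are natural first candidates when looking for dependencies to trim.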

Expected behavior

I'm filing this issue mainly so we can later point people to it as one possible reason for the observed slowdown. We should revisit the transitive dependencies execgen has, given its position in our build system; trimming any of them would help considerably.

Epic CRDB-8036

Jira issue: CRDB-13494

Labels: A-build-system, C-bug, T-dev-inf, T-sql-queries
