Start moving dependencies into their own Bazel externals.#1682
Start moving dependencies into their own Bazel externals.#1682htuch merged 1 commit intoenvoyproxy:masterfrom
Conversation
fb8a6cb to
aea0338
Compare
|
@jmillikin-stripe I'm not a huge fan of this change for all the reasons we have already discussed:
From my perspective, the situation we have today, though maybe "ugly", actually works quite well. My preference here would be to somehow allow both the old and new behavior with the old behavior being the default, but will defer to @htuch. |
The new BUILD files are equivalent to, and replace, the existing BUILD rules in For upstreams that are willing to use Bazel directly (example: gperftools/gperftools#800), I'm planning to migrate them first and then have Envoy consume their configs.
Would you be willing to tell a bit more about how the pre-built dependencies work? I believe this work can be done in a way that involves zero changes to your existing tools, but I don't fully understand how you're replacing dependencies in the existing |
We essentially do exactly what CI currently does which is to pre-build all the dependencies using native build systems, and then map them in. For build speed reasons CI will need to keep doing this, and I see no reason to change our internal build system either, as it works fine. |
bazel/external/lightstep.BUILD
Outdated
There was a problem hiding this comment.
Is there no way to get this into a separate file, i.e. a build a recipe? It's easier to read than being inlined here, where it's a mix of Bazel macros and shell.
There was a problem hiding this comment.
Not with new_http_archive, but using a smarter repository definer should let us put in additional files during the "fetch" step. I'll try to get that figured out.
There was a problem hiding this comment.
I would caution against adding repository rules that perform build, or you might hit the issue that led to the single massive recursive make repository rule in the current implementation. Repository rules execute sequentially, so you don't get any cross dependency build parallelism. This slowed down the initial fetch significantly, e.g. from a couple of minutes to tens of minutes.
There was a problem hiding this comment.
I was thinking of using ctx.symlink() to make a shell script available within the repository, similar to how the Skylark new_http_archive works. I definitely want to avoid doing any build actions within the repository rule implementation.
There was a problem hiding this comment.
Well, that's what we do today https://github.com/envoyproxy/envoy/blob/master/bazel/repositories.bzl#L12. I'm starting to wonder how the new approach (if we're moving to repository rules for everything) is moving away from the existing approach, other than decomposing the monolith recursive make, which provide parallelism via job control, to a bunch of repository rules that will be executed sequentially with no job control.
There was a problem hiding this comment.
The future repository rules should be very fast (just unpacking a tarball, maybe writing a shell blob). I want all the real build work to happen within standard BUILD rules, so they can be parallelized.
There was a problem hiding this comment.
That could work. I'm still a little unsure on how this will work though with output file determination, since output files may only be known following the build process (even headers might be generated), so figuring out the output artifacts is going to require a post-build step.
bazel/external/lightstep.BUILD
Outdated
There was a problem hiding this comment.
This is one of the reasons we opted for using a repository_rule instead of genrule. You have to manually enumerate and maintain the complete list of outputs in genrules. For simple cases it's OK, but when you have a large set of headers, which is the case in some libs, it becomes really big.
There was a problem hiding this comment.
ACK. I think it should be possible to avoid huge manual enumerations by using glob(), since it's uncommon for large numbers of output files to use arbitrary names. For example, a library that installed ~dozens of proto headers could use glob("**/*.proto") to derive the output names.
There was a problem hiding this comment.
I don't think genrules support globs on outputs FYI, at least last time I checked. Maybe this has changed?
There was a problem hiding this comment.
See https://github.com/jmillikin/bazel-packages/blob/master/libevent/libevent.bzl for an example. The glob is on input files, and used to calculate the output file names using normal Python expressions.
bazel/external/lightstep.BUILD
Outdated
There was a problem hiding this comment.
Please use caps for shell vars like PROTOC, CONFIGURE here
There was a problem hiding this comment.
CAPITAL_CASE variables are reserved for the OS or the shell. I'm using the Google Shell Style conventions here, which encourage lower_case names for variables defined by the user: https://google.github.io/styleguide/shell.xml#Naming_Conventions
There was a problem hiding this comment.
Within Envoy, I think we have a slight variation where all variables are upper case. Where there is disagreement between Google style guide and Envoy, we tend to prefer consistency (unless you want to update this everywhere). Thanks.
There was a problem hiding this comment.
I'd be fine updating this everywhere (might as well), if you're OK with the extra PRs and general commit churn.
bazel/repositories.bzl
Outdated
There was a problem hiding this comment.
Why is existing_rule_keys needed? Please add a comment if it's necessary
There was a problem hiding this comment.
It lets users override the dependencies simply by defining their own, instead of having to adjust the params of envoy_dependencies. For example, I could have a WORKSPACE that uses a custom protobuf_bzl, and then this code would use it automatically.
I'll add a comment.
There was a problem hiding this comment.
It seems crazy to have two mechanisms for this (this and the skip_targets). I don't feel that strongly on which is better, but we should just have one, even if we need to deprecate skip_targets.
There was a problem hiding this comment.
I would prefer skip_targets, since there some optional dependencies that can be skipped without overriding, e.g. lightstep.
There was a problem hiding this comment.
There was a problem hiding this comment.
@lizan can we get rid of skip_targets based on this? Would really prefer just one approach here.
There was a problem hiding this comment.
ah you're right, I'm fine with that.
|
@jmillikin-stripe I think most of this is a reasonable step, but I'd like to check on one thing. If you do One of the advantages of the recursive make implementation previously was we had a single make server providing job control of the configure/make steps, which allowed effective parallel execution of non-Bazel bits (but not parallel Bazel/non-Bazel). I'm hoping we don't have a regression here (it's possible your change makes it faster as well). On the prebuilts and CI, one of the advantages of the build recipes, which I hope we can preserve, is that they could be executed ahead-of-time in the CI image build, independent of Bazel, as well as under the Bazel I think we will still have the recursive make FWIW, I'm down with these changes as long as we don't go to a world in which Envoy relies on |
bazel/repositories.bzl
Outdated
There was a problem hiding this comment.
Does this work when the repo is consumed by other git_repository or local_repository? Make sure this works in envoy-filter-example, or istio/proxy.
We had to use build_file_content to embed them into .bzl files, just want to double check if things changed.
There was a problem hiding this comment.
build_file takes a label, resolved relative to the current repository (bazelbuild/bazel#682). This will work when building Envoy as part of a larger workspace.
I believe older versions of Bazel (< 4.something?) used plain paths for build_file. If we need to support those versions, I can change to build_file_content. But I'd prefer not to, because it's ugly.
There was a problem hiding this comment.
No I don't need older version support, Good to know it works now.
Testing with command: On my machine,
Agreed, we don't want to re-create the build systems of everything we depend on. |
|
I've made an attempt at the CI prebuilt dependency stuff. It runs |
|
@jmillikin-stripe I discussed with @htuch and I'm fine with this approach if you want to see it through. (I think there will be multiple PRs needed to get CI passing but not sure). |
htuch
left a comment
There was a problem hiding this comment.
Overall, I actually like the approach. It has similarities with an earlier approach in our Bazel history which was largely genrule based, but does a better job at encapsulating some of the build/fetch setup behind a repository_rule.
I feel we should retain the existing repository_rule for some time, until we are comfortable that new dependencies can be added cheaply without Envoy developers becoming Bazel experts. The advantage of the previous approach is that you only had to write a single shell script and you were done.
You should also update docs at https://github.com/envoyproxy/envoy/blob/master/bazel/EXTERNAL_DEPS.md. For now, I would provide developers the option of using either new or old approach.
bazel/external/BUILD
Outdated
There was a problem hiding this comment.
Do you even need empty.cc? I was under the impression that cc_library targets could be used to aggregate libraries without providing their own sources.
There was a problem hiding this comment.
See the comment. An empty .cc file is needed to make Bazel trigger a build. A cc_library that depends only on .a files will report "nothing to do" if you try to build it on the command line.
bazel/external/lightstep.BUILD
Outdated
There was a problem hiding this comment.
This is quite neat, but also non-obvious to the reader. Can you add a comment on each of these sites where you reference GENRULE_REPOSITORY_COMMAND and explain where it is coming from?
There was a problem hiding this comment.
Changed to use a function that takes a label. This should be a bit more self-describing.
bazel/external/lightstep.BUILD
Outdated
There was a problem hiding this comment.
Please add a comment on why this is needed, i.e. that you need the protobuf library headers/libraries to feed into GENRULE_REPOSITORY_COMMAND below.
bazel/external/lightstep.sh
Outdated
There was a problem hiding this comment.
Please either switch back to upper case or put in a PR first to adjust the entire repository to a lower case convention for internal variables. You can always re-lower case it later if you want to delay that repository conversion PR.
There was a problem hiding this comment.
Changed to upper-case. I don't want to barrier this PR on some giant shell blob cleanup.
bazel/external/lightstep.sh
Outdated
There was a problem hiding this comment.
Maybe have the extensions of these .genrule.cmd or something, so that it can't be confused with real .sh.
bazel/repositories.bzl
Outdated
There was a problem hiding this comment.
@lizan can we get rid of skip_targets based on this? Would really prefer just one approach here.
bazel/genrule_repository.bzl
Outdated
There was a problem hiding this comment.
Prefer 4 spaces here. We've been a bit inconsistent, but the idea is that this Skylark, not Python and we're trying to be consistent across BUILD files and .bzl.
There was a problem hiding this comment.
Done. Also ran buildifier.
bazel/genrule_repository.bzl
Outdated
There was a problem hiding this comment.
This approach is great, I like the idea of getting the build under genrule (for build parallelism) while still doing the setup heavy lifting under repository_rule. It doesn't solve the problem of having to explicitly enumerate output files in the genrule; so far, in this PR it's not an issue, but it's something I think we should consider as we bring in more dependencies with more complex output file lists.
There was a problem hiding this comment.
I'm good with this approach, providing we maintain the bind support so that folks who really want prebuilts (e.g. that don't come from our build process), are capable of arbitrary rebinding in their WORKSPACE.
There was a problem hiding this comment.
Rather than using bind(), we should encourage users to override deps using Bazel repositories. That's why the existing_rule_keys stuff is used -- a user can define a different @protobuf_bzl repository in their own WORKSPACE.
See comments on https://docs.bazel.build/versions/master/be/workspace.html#bind and discussion on bazelbuild/bazel#1952 ("Consider removing bind()") for more background on discouraging use of bind().
There was a problem hiding this comment.
As long as we can support remapping of the external repos via envoy_exernal_dep_path, we're OK with this at Google. Internally, as you know, we don't have any @ repos, so we need to be able to take each of the individual repository/dependency labels and remap them to a global path in our monorepo.
There was a problem hiding this comment.
What is envoy_exernal_dep_path? Is that the same as the path param to envoy_dependencies?
How do you currently handle the bind() names? I assume there's something set up to avoid polluting the namespace with things like //external:zlib -- do those get transparently mapped to //third_party/ somehow?
There was a problem hiding this comment.
https://github.com/envoyproxy/envoy/blob/master/bazel/envoy_build_system.bzl#L71. Internally, we rewrite the entire envoy_build_system.bzl to match Google's monorepo and build requirements. That function gets turned into a dictionary lookup, that maps from identifiers such as ares to something in //third_party.
There was a problem hiding this comment.
Ah, that should be fine then. This PR does make a small change to envoy_external_dep_path to support non-ambiguous dep names, but that's merely a change in the identifiers. And since none of them are changing in this PR, you don't have to do anything.
There was a problem hiding this comment.
What's the Bazel story for aliasing without bind()? Suppose I have a repository under @foo, I import @bar, and it expects @foo to exists as @baz. How do I tell Bazel to find @baz as @foo?
There was a problem hiding this comment.
Have you looked into the size of the Docker image? We're now doing 3x the build of external deps. We briefly went from 1 -> 2 and that created almost 50% growth. Please provide stats in the review and I think @mattklein123 might want to review this for sanity.
There was a problem hiding this comment.
On master the image is 2.34GB, on this branch the image is 2.17GB. Bazel is fairly good at avoiding copies of files that aren't different between compilation modes.
I'm also preparing another PR to trim the image by reducing Debian deps with apt-get --no-install-recommends, and have filed a Bazel bug to allow use of the headless JDK because we don't need X11 (bazelbuild/bazel#3761).
There was a problem hiding this comment.
Thanks, reducing image size is a nice improvement.
6aac661 to
6f39f17
Compare
|
Generally sounds good. Will be offline until next Mon, so will pick the
review up then.
…On Wed, Sep 20, 2017 at 5:32 PM jmillikin-stripe ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In bazel/external/lightstep.sh
<#1682 (comment)>:
> @@ -0,0 +1,20 @@
+# This is a Bazel genrule script. It acts like Bash, but has some extra
+# substitutions for locating dependencies.
+#
+# https://docs.bazel.build/versions/master/be/general.html#genrule
+
+configure="$${PWD}/$(location :configure)"
Changed to upper-case. I don't want to barrier this PR on some giant shell
blob cleanup.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1682 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AKaLv31fcynmfofI4h-1MZ_1Ox0lF8GRks5skYRigaJpZM4PZq5I>
.
|
Docs updated. I added two sections, one for external Bazel-based (like protobuf) and one for external non-Bazel (like lightstep). |
htuch
left a comment
There was a problem hiding this comment.
Looks great. A few doc requests and other comments then I think we can ship.
bazel/EXTERNAL_DEPS.md
Outdated
There was a problem hiding this comment.
This is pretty complicated for the average Envoy developer who isn't a Bazel expert. A build recipe was easy to understand, it's just a shell script. Can you provide some simple explanations above imagining the developer has never heard of a genrule? Even a link to Bazel genrule stuff would help, but we don't want folks to have to become Bazel ninjas to add deps.
There was a problem hiding this comment.
Made a few tweaks, does this help? I added a link to the genrule docs, and phrased it as a "bash command with Bazel extensions" which should hopefully be less intimidating to devs unfamiliar with Bazel.
bazel/EXTERNAL_DEPS.md
Outdated
There was a problem hiding this comment.
Can you document how developers deal with external dependencies that depend on other external deps? It was a lot more direct before (we just has essentially a global prefix path).
There was a problem hiding this comment.
I'm not sure how to document this, because it'll be different for each dep -- some use Autotools and thus need flags to ./configure, others use CMake and need ... whatever CMake does. I added a link to the Lightstep build command to use as an example.
bazel/repositories.bzl
Outdated
bazel/repositories.bzl
Outdated
There was a problem hiding this comment.
What about repository_locations.bzl? This was explicitly added to allow site-local redirects of deps.
There was a problem hiding this comment.
I would prefer to have one way to override dependencies, if possible. This implementation allows deps to be overridden by adding things to WORKSPACE, which lets us avoid making local changes to the Envoy source files.
|
Sorry for the delay, end of Q3 was busy. There's currently some merge conflicts with other changes. Would it be OK for me to rebase this to current master, or would you prefer a merge-to-branch? |
|
Yeah, do a full rebase, that's fine! |
3285858 to
4aa93d6
Compare
4aa93d6 to
9a5513e
Compare
|
OK, rebased to current head. The image builds but I haven't done a full build in it yet. Will do more poking on Monday. |
|
@jmillikin-stripe can you get the builds to pass? I think this PR is in good shape to merge, we should try to make this happen. |
|
/subscribe |
d16c142 to
8f6434a
Compare
56421e2 to
2d5e194
Compare
|
Good catch. I'm not sure how |
|
Peanut gallery question: Why are we hitting UBSAN issues when changing the build system? Why aren't we hitting this in the current build? |
The previous build didn't enable UBSAN in any of the separately-compiled dependencies. The new build has UBSAN in protobuf (because built with Bazel) and Lightstep (because it has to link its test binaries against Bazel-built protobuf). |
|
@jmillikin-stripe got it, thanks. |
2d5e194 to
fab290d
Compare
I've added a |
a384e60 to
8a9b19b
Compare
The eventual goal here is to have each dependency in its own @thing, so Bazel can download and build them in parallel. This should also simplify the build, because some dependencies (protobuf, boringssl) have their own Bazel configs. Reason these particular four deps were chosen as a starting point: * `fmtlib` and `spdlog` are header-only libraries used by every `envoy_cc_library` target. Moving them out of the deps blob is trivial, and decouples most of the small libraries from the other huge deps. * `protobuf` has its own BUILD file, which we should use. * `lightstep` depends on `protobuf`, and therefore must also be moved out. This was relatively straightforward, by copying the existing "build recipe" shell script into a genrule. I'm hopeful that we can eventually move these dependencies into https://github.com/bazelbuild/bazel-packages when that project begins accepting contributions. Signed-off-by: John Millikin <jmillikin@stripe.com>
8a9b19b to
d76f5c3
Compare
|
All tests pass with patched protobuf library. Should be safe to merge once this last CI run finishes up. |
|
Thank you all for your help in getting this marathon PR merged! |
| cd /bazel-prebuilt | ||
| for BAZEL_MODE in opt dbg fastbuild; do | ||
| bazel ${BAZEL_OPTIONS} build -c "${BAZEL_MODE}" //bazel/external:all_external | ||
| done |
There was a problem hiding this comment.
@jmillikin-stripe we should make sure we switch the main Envoy build (and CI) to the new image that results from this change. Can you do a PR to bump https://github.com/envoyproxy/envoy/blob/master/ci/envoy_build_sha.sh? Thanks.
Part of a fix for non-root invocation of do_ci.sh for regression introduced in envoyproxy#1682. Signed-off-by: Harvey Tuch <htuch@google.com>
Part of a fix for non-root invocation of do_ci.sh for regression introduced in #1682. Signed-off-by: Harvey Tuch <htuch@google.com>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org> Signed-off-by: JP Simard <jp@jpsim.com>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org> Signed-off-by: JP Simard <jp@jpsim.com>
The eventual goal here is to have each dependency in its own @thing, so
Bazel can download and build them in parallel. This should also simplify
the build, because some dependencies (protobuf, boringssl) have their own
Bazel configs.
Reason these particular four deps were chosen as a starting point:
fmtlibandspdlogare header-only libraries used by everyenvoy_cc_librarytarget. Moving them out of the deps blob is trivial,and decouples most of the small libraries from the other huge deps.
protobufhas its own BUILD file, which we should use.lightstepdepends onprotobuf, and therefore must also be movedout. This was relatively straightforward, by copying the existing
"build recipe" shell script into a genrule.
I'm hopeful that we can eventually move these dependencies into
https://github.com/bazelbuild/bazel-packages when that project begins
accepting contributions.
Signed-off-by: John Millikin jmillikin@stripe.com