Bazel 6 Remote JDK Toolchains Incorrect Configuration #17085
Description
Description of the bug:
After upgrading to Bazel 6.0.0, I get toolchain resolution and build failures around the java_library/java_binary rules. My environment is cross-compile heavy and has remote executors on a number of platforms (Linux, Darwin on both x86 and arm64, Windows), and Bazel itself may be invoked on any of those platforms to build for any other.
The issues occur when, due to peculiarities in execution platform selection order, the exec platform chosen for Java is distinct from the target platform. For instance, the target platform is macos_arm64, but the exec platform is @local_config_platform//:host on Linux, or a remote Linux platform. Here is the toolchain_resolution_debug output from a cross compile:
INFO: ToolchainResolution: Type @bazel_tools//tools/jdk:runtime_toolchain_type: target platform //:macos_arm: execution //:macos_arm: Selected toolchain @remotejdk11_macos_aarch64//:jdk
INFO: ToolchainResolution: Type @bazel_tools//tools/jdk:runtime_toolchain_type: target platform //:macos_arm: execution //:macos: Selected toolchain @remotejdk11_macos_aarch64//:jdk
INFO: ToolchainResolution: Type @bazel_tools//tools/jdk:runtime_toolchain_type: target platform //:macos_arm: execution //:linux: Selected toolchain @remotejdk11_macos_aarch64//:jdk
INFO: ToolchainResolution: Type @bazel_tools//tools/jdk:runtime_toolchain_type: target platform //:macos_arm: execution //test:downloader: Selected toolchain @remotejdk11_macos_aarch64//:jdk
INFO: ToolchainResolution: Type @bazel_tools//tools/jdk:runtime_toolchain_type: target platform //:macos_arm: execution @local_config_platform//:host: Selected toolchain @remotejdk11_macos_aarch64//:jdk
A number of these selections are wrong: remotejdk11_macos_aarch64 will not run on the host platform (which is Linux), nor on the //:linux remote execution platform, nor on the //test:downloader platform, which specifies no OS constraints to match at all. If Bazel happens to choose any configuration other than the first, where the target and exec platforms are identical, it will try to execute a Darwin arm64 binary on Linux, with predictable fireworks. In the extreme case where the target platform is not among the exec platforms at all, this is guaranteed to fail.
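For readers trying to follow the log, the platforms involved might be declared roughly as below. This is a sketch reconstructed from the labels in the debug output, not the actual BUILD files: the constraint values for //:macos and //:linux (x86_64) are assumptions, and only //test:downloader is known from the description to carry no OS constraint.

```starlark
# BUILD.bazel at the workspace root (hypothetical reconstruction)

# Target platform used in the log above.
platform(
    name = "macos_arm",
    constraint_values = [
        "@platforms//os:macos",
        "@platforms//cpu:aarch64",
    ],
)

# Assumed to be an x86_64 macOS executor.
platform(
    name = "macos",
    constraint_values = [
        "@platforms//os:macos",
        "@platforms//cpu:x86_64",
    ],
)

# Assumed to be an x86_64 Linux remote executor.
platform(
    name = "linux",
    constraint_values = [
        "@platforms//os:linux",
        "@platforms//cpu:x86_64",
    ],
)

# test/BUILD.bazel (hypothetical reconstruction)

# An execution platform with no OS or CPU constraints at all.
platform(
    name = "downloader",
)
```

With definitions along these lines, an invocation such as `bazel build //... --platforms=//:macos_arm --extra_execution_platforms=//:macos_arm,//:macos,//:linux,//test:downloader --toolchain_resolution_debug='@bazel_tools//tools/jdk:runtime_toolchain_type'` should produce one resolution line per execution platform, with @local_config_platform//:host considered last. The exact flags used for the log above are not stated, so treat this command as an assumption.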
This also fails when targeting a platform that has no JDK, such as Android. In Bazel 5.1 it worked just fine to have a java_library depended on by an android_library; now toolchain resolution simply fails if you set --platforms to a platform with @platforms//os:android, because there is no OpenJDK for Android (and even if there were, my remote exec environment has no Android runners).
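The Android case can be pictured with a fragment like the one below. It is an illustrative sketch only, not the contents of the linked reproduction repository: the target names, source files, and the android_arm64 platform are all hypothetical, and an Android SDK is assumed to be configured in the WORKSPACE.

```starlark
# BUILD.bazel (hypothetical sketch of the Android scenario)

# Plain Java code shared with the Android build.
java_library(
    name = "core",
    srcs = ["Core.java"],
)

# Android code that depends on the plain java_library.
android_library(
    name = "app_lib",
    srcs = ["App.java"],
    deps = [":core"],
)

# A target platform with the Android OS constraint. No remote JDK
# declares itself compatible with this, so resolving the Java runtime
# toolchain for it has no candidate to select.
platform(
    name = "android_arm64",
    constraint_values = [
        "@platforms//os:android",
        "@platforms//cpu:aarch64",
    ],
)
```

Building `:app_lib` with `--platforms=//:android_arm64` would then correspond to the setup described above.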
It seems that d5559c1 changed the definition of the java toolchains to specify only target_compatible_with, not exec_compatible_with, even though the jdk itself is platform-specific. I'm not sure I understand the benefits of that change, but at a minimum it seems like exec_compatible_with should also be specified for those toolchains.
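To make the suggestion concrete, a toolchain declaration carrying both constraint lists could look like the sketch below. This is not the definition that Bazel generates for its remote JDKs; the package and target name are hypothetical, and whether constraining the exec side is the right fix for runtime_toolchain_type is exactly the open question of this report.

```starlark
# toolchains/jdk/BUILD.bazel (hypothetical package and name)

toolchain(
    name = "remotejdk11_macos_aarch64_runtime",
    # Proposed addition: only select this JDK for actions that execute
    # on a macOS arm64 machine.
    exec_compatible_with = [
        "@platforms//os:macos",
        "@platforms//cpu:aarch64",
    ],
    # Constraint of the kind the report says is currently the only one set.
    target_compatible_with = [
        "@platforms//os:macos",
        "@platforms//cpu:aarch64",
    ],
    toolchain = "@remotejdk11_macos_aarch64//:jdk",
    toolchain_type = "@bazel_tools//tools/jdk:runtime_toolchain_type",
)
```

A declaration like this would still need to be registered, for example with `register_toolchains("//toolchains/jdk:all")` in the WORKSPACE file, before toolchain resolution considers it.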
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
See https://github.com/jlaxson/bazel_java_toolchain_bug for a simple reduction involving Android platforms.
Which operating system are you running Bazel on?
macOS, Linux
What is the output of bazel info release?
release 6.0.0
If bazel info release returns development version or (@non-git), tell us how you built Bazel.
No response
What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?
No response
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
No response