Conversation
4e16a08 to
46dd952
Compare
bolinfest
added a commit
that referenced
this pull request
May 1, 2026
## Status This is the Bazel PR-CI cross-compilation follow-up to #20485. It is intentionally split from the Cargo/cargo-xwin release-build PoC so #20485 can stay as the historical release-build exploration. The unrelated async-utils test cleanup has been moved to #20686, so this PR is focused on the Windows Bazel CI path. The intended tradeoff is now explicit in `.github/workflows/bazel.yml`: pull requests get the fast Windows cross-compiled Bazel test leg, while post-merge pushes to `main` run both that fast cross leg and a fully native Windows Bazel test leg. The native main-only job keeps full V8/code-mode coverage and gets a 40-minute timeout because it is less latency-sensitive than PR CI. All other Bazel jobs remain at 30 minutes. ## Why Windows Bazel PR CI currently does the expensive part of the build on Windows. A native Windows Bazel test job on `main` completed in about 28m12s, leaving very little headroom under the 30-minute job timeout and making Windows the slowest PR signal. #20485 showed that Windows cross-compilation can be materially faster for Cargo release builds, but PR CI needs Bazel because Bazel owns our test sharding, flaky-test retries, and integration-test layout. This PR applies the same high-level shape we already use for macOS Bazel CI: compile with remote Linux execution, then run platform-specific tests on the platform runner. The compromise is deliberately signal-aware: code-mode/V8 changes are rare enough that PR CI can accept losing the direct V8/code-mode smoke-test signal temporarily, while `main` still runs the native Windows job post-merge to catch that class of regression. A follow-up PR should investigate making the cross-built Windows gnullvm V8 archive pass the direct V8/code-mode tests so this tradeoff can eventually go away. ## What Changed - Adds a `ci-windows-cross` Bazel config that targets `x86_64-pc-windows-gnullvm`, uses Linux RBE for build actions, and keeps `TestRunner` actions local on the Windows runner. - Adds explicit Windows platform definitions for `windows_x86_64_gnullvm`, `windows_x86_64_msvc`, and a bridge toolchain that lets gnullvm test targets execute under the Windows MSVC host platform. - Updates the Windows Bazel PR test leg to opt into the cross-compile path via `--windows-cross-compile` and `--remote-download-toplevel`. - Adds a `test-windows-native-main` job that runs only for `push` events on `refs/heads/main`, uses the native Windows Bazel path, includes V8/code-mode smoke tests, and has `timeout-minutes: 40`. - Keeps fork/community PRs without `BUILDBUDDY_API_KEY` on the previous local Windows MSVC-host fallback, including `--host_platform=//:local_windows_msvc` and `--jobs=8`. - Preserves the existing integration-test shape on non-gnullvm platforms, while generating Windows-cross wrapper targets only for `windows_gnullvm`. - Resolves `CARGO_BIN_EXE_*` values from runfiles at test runtime, avoiding hard-coded Cargo paths and duplicate test runfiles. - Extends the V8 Bazel patches enough for the `x86_64-pc-windows-gnullvm` target and Linux remote execution path. - Makes the Windows sandbox test cwd derive from `INSTA_WORKSPACE_ROOT` at runtime when Bazel provides it, because cross-compiled binaries may contain Linux compile-time paths. - Keeps the direct V8/code-mode unit smoke tests out of the Windows cross PR path for now while native Windows CI continues to cover them post-merge. ## Command Shape The fast Windows PR test leg invokes the normal Bazel CI wrapper like this: ```shell ./.github/scripts/run-bazel-ci.sh \ --print-failed-action-summary \ --print-failed-test-logs \ --windows-cross-compile \ --remote-download-toplevel \ -- \ test \ --test_tag_filters=-argument-comment-lint \ --test_verbose_timeout_warnings \ --build_metadata=COMMIT_SHA=${GITHUB_SHA} \ -- \ //... \ -//third_party/v8:all \ -//codex-rs/code-mode:code-mode-unit-tests \ -//codex-rs/v8-poc:v8-poc-unit-tests ``` With the BuildBuddy secret available on Windows, the wrapper selects `--config=ci-windows-cross` and appends the important Windows-cross overrides after rc expansion: ```shell --host_platform=//:rbe --shell_executable=/bin/bash --action_env=PATH=/usr/bin:/bin --host_action_env=PATH=/usr/bin:/bin --test_env=PATH=${CODEX_BAZEL_WINDOWS_PATH} ``` The native post-merge Windows job intentionally omits `--windows-cross-compile` and does not exclude the V8/code-mode unit targets: ```shell ./.github/scripts/run-bazel-ci.sh \ --print-failed-action-summary \ --print-failed-test-logs \ -- \ test \ --test_tag_filters=-argument-comment-lint \ --test_verbose_timeout_warnings \ --build_metadata=COMMIT_SHA=${GITHUB_SHA} \ --build_metadata=TAG_windows_native_main=true \ -- \ //... \ -//third_party/v8:all ``` ## Research Notes The existing macOS Bazel CI config already uses the model we want here: build actions run remotely with `--strategy=remote`, but `TestRunner` actions execute on the macOS runner. This PR mirrors that pattern for Windows with `--strategy=TestRunner=local`. The important Bazel detail is that `rules_rs` is already targeting `x86_64-pc-windows-gnullvm` for Windows Bazel PR tests. This PR changes where the build actions execute; it does not switch the Bazel PR test target to Cargo, `cargo-nextest`, or the MSVC release target. Cargo release builds differ from this Bazel path for V8: the normal Windows Cargo release target is MSVC, and `rusty_v8` publishes prebuilt Windows MSVC `.lib.gz` archives. The Bazel PR path targets `windows-gnullvm`; `rusty_v8` does not publish a prebuilt Windows GNU/gnullvm archive, so this PR builds that archive in-tree. That Linux-RBE-built gnullvm archive currently crashes in direct V8/code-mode smoke tests, which is why the workflow keeps native Windows coverage on `main`. The less obvious Bazel detail is test wrapper selection. Bazel chooses the Windows test wrapper (`tw.exe`) from the test action execution platform, not merely from the Rust target triple. The outer `workspace_root_test` therefore declares the default test toolchain and uses the bridge toolchain above so the test action executes on Windows while its inner Rust binary is built for gnullvm. The V8 investigation exposed a Windows-client gotcha: even when an action execution platform is Linux RBE, Bazel can still derive the genrule shell path from the Windows client. That produced remote commands trying to run `C:\Program Files\Git\usr\bin\bash.exe` on Linux workers. The wrapper now passes `--shell_executable=/bin/bash` with `--host_platform=//:rbe` for the Windows cross path. The same Windows-client/Linux-RBE boundary also affected `third_party/v8:binding_cc`: a multiline genrule command can carry CRLF line endings into Linux remote bash, which failed as `$'\r'`. That genrule now keeps the `sed` command on one physical shell line while using an explicit Starlark join so the shell arguments stay readable. ## Verification Local checks included: ```shell bash -n .github/scripts/run-bazel-ci.sh bash -n workspace_root_test_launcher.sh.tpl ruby -e "require %q{yaml}; YAML.load_file(%q{.github/workflows/bazel.yml}); puts %q{ok}" RUNNER_OS=Linux ./scripts/list-bazel-clippy-targets.sh RUNNER_OS=Windows ./scripts/list-bazel-clippy-targets.sh RUNNER_OS=Linux ./tools/argument-comment-lint/list-bazel-targets.sh RUNNER_OS=Windows ./tools/argument-comment-lint/list-bazel-targets.sh ``` The Linux clippy and argument-comment target lists contain zero `*-windows-cross-bin` labels, while the Windows lists still include 47 Windows-cross internal test binaries. CI evidence: - Baseline native Windows Bazel test on `main`: success in about 28m12s, https://github.com/openai/codex/actions/runs/25206257208/job/73907325959 - Green Windows-cross Bazel run on the split PR before adding the main-only native leg: Windows test 9m16s, Windows release verify 5m10s, Windows clippy 4m43s, https://github.com/openai/codex/actions/runs/25231890068 - The latest SHA adds the explicit PR-vs-main tradeoff in `bazel.yml`; CI is rerunning on that focused diff. ## Follow-Up A subsequent PR should investigate making a cross-built Windows binary work with V8/code-mode enabled. Likely options are either making the Linux-RBE-built `windows-gnullvm` V8 archive correct at runtime, or evaluating whether a Bazel MSVC target/toolchain can reuse the same prebuilt MSVC `rusty_v8` archive shape that Cargo release builds already use.
Oreoxp
pushed a commit
to Oreoxp/codex-cli
that referenced
this pull request
May 7, 2026
## Status This is the Bazel PR-CI cross-compilation follow-up to openai#20485. It is intentionally split from the Cargo/cargo-xwin release-build PoC so openai#20485 can stay as the historical release-build exploration. The unrelated async-utils test cleanup has been moved to openai#20686, so this PR is focused on the Windows Bazel CI path. The intended tradeoff is now explicit in `.github/workflows/bazel.yml`: pull requests get the fast Windows cross-compiled Bazel test leg, while post-merge pushes to `main` run both that fast cross leg and a fully native Windows Bazel test leg. The native main-only job keeps full V8/code-mode coverage and gets a 40-minute timeout because it is less latency-sensitive than PR CI. All other Bazel jobs remain at 30 minutes. ## Why Windows Bazel PR CI currently does the expensive part of the build on Windows. A native Windows Bazel test job on `main` completed in about 28m12s, leaving very little headroom under the 30-minute job timeout and making Windows the slowest PR signal. openai#20485 showed that Windows cross-compilation can be materially faster for Cargo release builds, but PR CI needs Bazel because Bazel owns our test sharding, flaky-test retries, and integration-test layout. This PR applies the same high-level shape we already use for macOS Bazel CI: compile with remote Linux execution, then run platform-specific tests on the platform runner. The compromise is deliberately signal-aware: code-mode/V8 changes are rare enough that PR CI can accept losing the direct V8/code-mode smoke-test signal temporarily, while `main` still runs the native Windows job post-merge to catch that class of regression. A follow-up PR should investigate making the cross-built Windows gnullvm V8 archive pass the direct V8/code-mode tests so this tradeoff can eventually go away. ## What Changed - Adds a `ci-windows-cross` Bazel config that targets `x86_64-pc-windows-gnullvm`, uses Linux RBE for build actions, and keeps `TestRunner` actions local on the Windows runner. - Adds explicit Windows platform definitions for `windows_x86_64_gnullvm`, `windows_x86_64_msvc`, and a bridge toolchain that lets gnullvm test targets execute under the Windows MSVC host platform. - Updates the Windows Bazel PR test leg to opt into the cross-compile path via `--windows-cross-compile` and `--remote-download-toplevel`. - Adds a `test-windows-native-main` job that runs only for `push` events on `refs/heads/main`, uses the native Windows Bazel path, includes V8/code-mode smoke tests, and has `timeout-minutes: 40`. - Keeps fork/community PRs without `BUILDBUDDY_API_KEY` on the previous local Windows MSVC-host fallback, including `--host_platform=//:local_windows_msvc` and `--jobs=8`. - Preserves the existing integration-test shape on non-gnullvm platforms, while generating Windows-cross wrapper targets only for `windows_gnullvm`. - Resolves `CARGO_BIN_EXE_*` values from runfiles at test runtime, avoiding hard-coded Cargo paths and duplicate test runfiles. - Extends the V8 Bazel patches enough for the `x86_64-pc-windows-gnullvm` target and Linux remote execution path. - Makes the Windows sandbox test cwd derive from `INSTA_WORKSPACE_ROOT` at runtime when Bazel provides it, because cross-compiled binaries may contain Linux compile-time paths. - Keeps the direct V8/code-mode unit smoke tests out of the Windows cross PR path for now while native Windows CI continues to cover them post-merge. ## Command Shape The fast Windows PR test leg invokes the normal Bazel CI wrapper like this: ```shell ./.github/scripts/run-bazel-ci.sh \ --print-failed-action-summary \ --print-failed-test-logs \ --windows-cross-compile \ --remote-download-toplevel \ -- \ test \ --test_tag_filters=-argument-comment-lint \ --test_verbose_timeout_warnings \ --build_metadata=COMMIT_SHA=${GITHUB_SHA} \ -- \ //... \ -//third_party/v8:all \ -//codex-rs/code-mode:code-mode-unit-tests \ -//codex-rs/v8-poc:v8-poc-unit-tests ``` With the BuildBuddy secret available on Windows, the wrapper selects `--config=ci-windows-cross` and appends the important Windows-cross overrides after rc expansion: ```shell --host_platform=//:rbe --shell_executable=/bin/bash --action_env=PATH=/usr/bin:/bin --host_action_env=PATH=/usr/bin:/bin --test_env=PATH=${CODEX_BAZEL_WINDOWS_PATH} ``` The native post-merge Windows job intentionally omits `--windows-cross-compile` and does not exclude the V8/code-mode unit targets: ```shell ./.github/scripts/run-bazel-ci.sh \ --print-failed-action-summary \ --print-failed-test-logs \ -- \ test \ --test_tag_filters=-argument-comment-lint \ --test_verbose_timeout_warnings \ --build_metadata=COMMIT_SHA=${GITHUB_SHA} \ --build_metadata=TAG_windows_native_main=true \ -- \ //... \ -//third_party/v8:all ``` ## Research Notes The existing macOS Bazel CI config already uses the model we want here: build actions run remotely with `--strategy=remote`, but `TestRunner` actions execute on the macOS runner. This PR mirrors that pattern for Windows with `--strategy=TestRunner=local`. The important Bazel detail is that `rules_rs` is already targeting `x86_64-pc-windows-gnullvm` for Windows Bazel PR tests. This PR changes where the build actions execute; it does not switch the Bazel PR test target to Cargo, `cargo-nextest`, or the MSVC release target. Cargo release builds differ from this Bazel path for V8: the normal Windows Cargo release target is MSVC, and `rusty_v8` publishes prebuilt Windows MSVC `.lib.gz` archives. The Bazel PR path targets `windows-gnullvm`; `rusty_v8` does not publish a prebuilt Windows GNU/gnullvm archive, so this PR builds that archive in-tree. That Linux-RBE-built gnullvm archive currently crashes in direct V8/code-mode smoke tests, which is why the workflow keeps native Windows coverage on `main`. The less obvious Bazel detail is test wrapper selection. Bazel chooses the Windows test wrapper (`tw.exe`) from the test action execution platform, not merely from the Rust target triple. The outer `workspace_root_test` therefore declares the default test toolchain and uses the bridge toolchain above so the test action executes on Windows while its inner Rust binary is built for gnullvm. The V8 investigation exposed a Windows-client gotcha: even when an action execution platform is Linux RBE, Bazel can still derive the genrule shell path from the Windows client. That produced remote commands trying to run `C:\Program Files\Git\usr\bin\bash.exe` on Linux workers. The wrapper now passes `--shell_executable=/bin/bash` with `--host_platform=//:rbe` for the Windows cross path. The same Windows-client/Linux-RBE boundary also affected `third_party/v8:binding_cc`: a multiline genrule command can carry CRLF line endings into Linux remote bash, which failed as `$'\r'`. That genrule now keeps the `sed` command on one physical shell line while using an explicit Starlark join so the shell arguments stay readable. ## Verification Local checks included: ```shell bash -n .github/scripts/run-bazel-ci.sh bash -n workspace_root_test_launcher.sh.tpl ruby -e "require %q{yaml}; YAML.load_file(%q{.github/workflows/bazel.yml}); puts %q{ok}" RUNNER_OS=Linux ./scripts/list-bazel-clippy-targets.sh RUNNER_OS=Windows ./scripts/list-bazel-clippy-targets.sh RUNNER_OS=Linux ./tools/argument-comment-lint/list-bazel-targets.sh RUNNER_OS=Windows ./tools/argument-comment-lint/list-bazel-targets.sh ``` The Linux clippy and argument-comment target lists contain zero `*-windows-cross-bin` labels, while the Windows lists still include 47 Windows-cross internal test binaries. CI evidence: - Baseline native Windows Bazel test on `main`: success in about 28m12s, https://github.com/openai/codex/actions/runs/25206257208/job/73907325959 - Green Windows-cross Bazel run on the split PR before adding the main-only native leg: Windows test 9m16s, Windows release verify 5m10s, Windows clippy 4m43s, https://github.com/openai/codex/actions/runs/25231890068 - The latest SHA adds the explicit PR-vs-main tradeoff in `bazel.yml`; CI is rerunning on that focused diff. ## Follow-Up A subsequent PR should investigate making a cross-built Windows binary work with V8/code-mode enabled. Likely options are either making the Linux-RBE-built `windows-gnullvm` V8 archive correct at runtime, or evaluating whether a Bazel MSVC target/toolchain can reuse the same prebuilt MSVC `rusty_v8` archive shape that Cargo release builds already use.
Contributor
|
Closing this pull request because it has had no updates for more than 14 days. If you plan to continue working on it, feel free to reopen or open a new PR. |
AIALRA-0
pushed a commit
to AIALRA-0/codex-turn-engine
that referenced
this pull request
Jun 10, 2026
## Status This is the Bazel PR-CI cross-compilation follow-up to openai#20485. It is intentionally split from the Cargo/cargo-xwin release-build PoC so openai#20485 can stay as the historical release-build exploration. The unrelated async-utils test cleanup has been moved to openai#20686, so this PR is focused on the Windows Bazel CI path. The intended tradeoff is now explicit in `.github/workflows/bazel.yml`: pull requests get the fast Windows cross-compiled Bazel test leg, while post-merge pushes to `main` run both that fast cross leg and a fully native Windows Bazel test leg. The native main-only job keeps full V8/code-mode coverage and gets a 40-minute timeout because it is less latency-sensitive than PR CI. All other Bazel jobs remain at 30 minutes. ## Why Windows Bazel PR CI currently does the expensive part of the build on Windows. A native Windows Bazel test job on `main` completed in about 28m12s, leaving very little headroom under the 30-minute job timeout and making Windows the slowest PR signal. openai#20485 showed that Windows cross-compilation can be materially faster for Cargo release builds, but PR CI needs Bazel because Bazel owns our test sharding, flaky-test retries, and integration-test layout. This PR applies the same high-level shape we already use for macOS Bazel CI: compile with remote Linux execution, then run platform-specific tests on the platform runner. The compromise is deliberately signal-aware: code-mode/V8 changes are rare enough that PR CI can accept losing the direct V8/code-mode smoke-test signal temporarily, while `main` still runs the native Windows job post-merge to catch that class of regression. A follow-up PR should investigate making the cross-built Windows gnullvm V8 archive pass the direct V8/code-mode tests so this tradeoff can eventually go away. ## What Changed - Adds a `ci-windows-cross` Bazel config that targets `x86_64-pc-windows-gnullvm`, uses Linux RBE for build actions, and keeps `TestRunner` actions local on the Windows runner. - Adds explicit Windows platform definitions for `windows_x86_64_gnullvm`, `windows_x86_64_msvc`, and a bridge toolchain that lets gnullvm test targets execute under the Windows MSVC host platform. - Updates the Windows Bazel PR test leg to opt into the cross-compile path via `--windows-cross-compile` and `--remote-download-toplevel`. - Adds a `test-windows-native-main` job that runs only for `push` events on `refs/heads/main`, uses the native Windows Bazel path, includes V8/code-mode smoke tests, and has `timeout-minutes: 40`. - Keeps fork/community PRs without `BUILDBUDDY_API_KEY` on the previous local Windows MSVC-host fallback, including `--host_platform=//:local_windows_msvc` and `--jobs=8`. - Preserves the existing integration-test shape on non-gnullvm platforms, while generating Windows-cross wrapper targets only for `windows_gnullvm`. - Resolves `CARGO_BIN_EXE_*` values from runfiles at test runtime, avoiding hard-coded Cargo paths and duplicate test runfiles. - Extends the V8 Bazel patches enough for the `x86_64-pc-windows-gnullvm` target and Linux remote execution path. - Makes the Windows sandbox test cwd derive from `INSTA_WORKSPACE_ROOT` at runtime when Bazel provides it, because cross-compiled binaries may contain Linux compile-time paths. - Keeps the direct V8/code-mode unit smoke tests out of the Windows cross PR path for now while native Windows CI continues to cover them post-merge. ## Command Shape The fast Windows PR test leg invokes the normal Bazel CI wrapper like this: ```shell ./.github/scripts/run-bazel-ci.sh \ --print-failed-action-summary \ --print-failed-test-logs \ --windows-cross-compile \ --remote-download-toplevel \ -- \ test \ --test_tag_filters=-argument-comment-lint \ --test_verbose_timeout_warnings \ --build_metadata=COMMIT_SHA=${GITHUB_SHA} \ -- \ //... \ -//third_party/v8:all \ -//codex-rs/code-mode:code-mode-unit-tests \ -//codex-rs/v8-poc:v8-poc-unit-tests ``` With the BuildBuddy secret available on Windows, the wrapper selects `--config=ci-windows-cross` and appends the important Windows-cross overrides after rc expansion: ```shell --host_platform=//:rbe --shell_executable=/bin/bash --action_env=PATH=/usr/bin:/bin --host_action_env=PATH=/usr/bin:/bin --test_env=PATH=${CODEX_BAZEL_WINDOWS_PATH} ``` The native post-merge Windows job intentionally omits `--windows-cross-compile` and does not exclude the V8/code-mode unit targets: ```shell ./.github/scripts/run-bazel-ci.sh \ --print-failed-action-summary \ --print-failed-test-logs \ -- \ test \ --test_tag_filters=-argument-comment-lint \ --test_verbose_timeout_warnings \ --build_metadata=COMMIT_SHA=${GITHUB_SHA} \ --build_metadata=TAG_windows_native_main=true \ -- \ //... \ -//third_party/v8:all ``` ## Research Notes The existing macOS Bazel CI config already uses the model we want here: build actions run remotely with `--strategy=remote`, but `TestRunner` actions execute on the macOS runner. This PR mirrors that pattern for Windows with `--strategy=TestRunner=local`. The important Bazel detail is that `rules_rs` is already targeting `x86_64-pc-windows-gnullvm` for Windows Bazel PR tests. This PR changes where the build actions execute; it does not switch the Bazel PR test target to Cargo, `cargo-nextest`, or the MSVC release target. Cargo release builds differ from this Bazel path for V8: the normal Windows Cargo release target is MSVC, and `rusty_v8` publishes prebuilt Windows MSVC `.lib.gz` archives. The Bazel PR path targets `windows-gnullvm`; `rusty_v8` does not publish a prebuilt Windows GNU/gnullvm archive, so this PR builds that archive in-tree. That Linux-RBE-built gnullvm archive currently crashes in direct V8/code-mode smoke tests, which is why the workflow keeps native Windows coverage on `main`. The less obvious Bazel detail is test wrapper selection. Bazel chooses the Windows test wrapper (`tw.exe`) from the test action execution platform, not merely from the Rust target triple. The outer `workspace_root_test` therefore declares the default test toolchain and uses the bridge toolchain above so the test action executes on Windows while its inner Rust binary is built for gnullvm. The V8 investigation exposed a Windows-client gotcha: even when an action execution platform is Linux RBE, Bazel can still derive the genrule shell path from the Windows client. That produced remote commands trying to run `C:\Program Files\Git\usr\bin\bash.exe` on Linux workers. The wrapper now passes `--shell_executable=/bin/bash` with `--host_platform=//:rbe` for the Windows cross path. The same Windows-client/Linux-RBE boundary also affected `third_party/v8:binding_cc`: a multiline genrule command can carry CRLF line endings into Linux remote bash, which failed as `$'\r'`. That genrule now keeps the `sed` command on one physical shell line while using an explicit Starlark join so the shell arguments stay readable. ## Verification Local checks included: ```shell bash -n .github/scripts/run-bazel-ci.sh bash -n workspace_root_test_launcher.sh.tpl ruby -e "require %q{yaml}; YAML.load_file(%q{.github/workflows/bazel.yml}); puts %q{ok}" RUNNER_OS=Linux ./scripts/list-bazel-clippy-targets.sh RUNNER_OS=Windows ./scripts/list-bazel-clippy-targets.sh RUNNER_OS=Linux ./tools/argument-comment-lint/list-bazel-targets.sh RUNNER_OS=Windows ./tools/argument-comment-lint/list-bazel-targets.sh ``` The Linux clippy and argument-comment target lists contain zero `*-windows-cross-bin` labels, while the Windows lists still include 47 Windows-cross internal test binaries. CI evidence: - Baseline native Windows Bazel test on `main`: success in about 28m12s, https://github.com/openai/codex/actions/runs/25206257208/job/73907325959 - Green Windows-cross Bazel run on the split PR before adding the main-only native leg: Windows test 9m16s, Windows release verify 5m10s, Windows clippy 4m43s, https://github.com/openai/codex/actions/runs/25231890068 - The latest SHA adds the explicit PR-vs-main tradeoff in `bazel.yml`; CI is rerunning on that focused diff. ## Follow-Up A subsequent PR should investigate making a cross-built Windows binary work with V8/code-mode enabled. Likely options are either making the Linux-RBE-built `windows-gnullvm` V8 archive correct at runtime, or evaluating whether a Bazel MSVC target/toolchain can reuse the same prebuilt MSVC `rusty_v8` archive shape that Cargo release builds already use.
AIALRA-0
pushed a commit
to AIALRA-0/codex-turn-engine
that referenced
this pull request
Jun 10, 2026
## Status This is the Bazel PR-CI cross-compilation follow-up to openai#20485. It is intentionally split from the Cargo/cargo-xwin release-build PoC so openai#20485 can stay as the historical release-build exploration. The unrelated async-utils test cleanup has been moved to openai#20686, so this PR is focused on the Windows Bazel CI path. The intended tradeoff is now explicit in `.github/workflows/bazel.yml`: pull requests get the fast Windows cross-compiled Bazel test leg, while post-merge pushes to `main` run both that fast cross leg and a fully native Windows Bazel test leg. The native main-only job keeps full V8/code-mode coverage and gets a 40-minute timeout because it is less latency-sensitive than PR CI. All other Bazel jobs remain at 30 minutes. ## Why Windows Bazel PR CI currently does the expensive part of the build on Windows. A native Windows Bazel test job on `main` completed in about 28m12s, leaving very little headroom under the 30-minute job timeout and making Windows the slowest PR signal. openai#20485 showed that Windows cross-compilation can be materially faster for Cargo release builds, but PR CI needs Bazel because Bazel owns our test sharding, flaky-test retries, and integration-test layout. This PR applies the same high-level shape we already use for macOS Bazel CI: compile with remote Linux execution, then run platform-specific tests on the platform runner. The compromise is deliberately signal-aware: code-mode/V8 changes are rare enough that PR CI can accept losing the direct V8/code-mode smoke-test signal temporarily, while `main` still runs the native Windows job post-merge to catch that class of regression. A follow-up PR should investigate making the cross-built Windows gnullvm V8 archive pass the direct V8/code-mode tests so this tradeoff can eventually go away. ## What Changed - Adds a `ci-windows-cross` Bazel config that targets `x86_64-pc-windows-gnullvm`, uses Linux RBE for build actions, and keeps `TestRunner` actions local on the Windows runner. - Adds explicit Windows platform definitions for `windows_x86_64_gnullvm`, `windows_x86_64_msvc`, and a bridge toolchain that lets gnullvm test targets execute under the Windows MSVC host platform. - Updates the Windows Bazel PR test leg to opt into the cross-compile path via `--windows-cross-compile` and `--remote-download-toplevel`. - Adds a `test-windows-native-main` job that runs only for `push` events on `refs/heads/main`, uses the native Windows Bazel path, includes V8/code-mode smoke tests, and has `timeout-minutes: 40`. - Keeps fork/community PRs without `BUILDBUDDY_API_KEY` on the previous local Windows MSVC-host fallback, including `--host_platform=//:local_windows_msvc` and `--jobs=8`. - Preserves the existing integration-test shape on non-gnullvm platforms, while generating Windows-cross wrapper targets only for `windows_gnullvm`. - Resolves `CARGO_BIN_EXE_*` values from runfiles at test runtime, avoiding hard-coded Cargo paths and duplicate test runfiles. - Extends the V8 Bazel patches enough for the `x86_64-pc-windows-gnullvm` target and Linux remote execution path. - Makes the Windows sandbox test cwd derive from `INSTA_WORKSPACE_ROOT` at runtime when Bazel provides it, because cross-compiled binaries may contain Linux compile-time paths. - Keeps the direct V8/code-mode unit smoke tests out of the Windows cross PR path for now while native Windows CI continues to cover them post-merge. ## Command Shape The fast Windows PR test leg invokes the normal Bazel CI wrapper like this: ```shell ./.github/scripts/run-bazel-ci.sh \ --print-failed-action-summary \ --print-failed-test-logs \ --windows-cross-compile \ --remote-download-toplevel \ -- \ test \ --test_tag_filters=-argument-comment-lint \ --test_verbose_timeout_warnings \ --build_metadata=COMMIT_SHA=${GITHUB_SHA} \ -- \ //... \ -//third_party/v8:all \ -//codex-rs/code-mode:code-mode-unit-tests \ -//codex-rs/v8-poc:v8-poc-unit-tests ``` With the BuildBuddy secret available on Windows, the wrapper selects `--config=ci-windows-cross` and appends the important Windows-cross overrides after rc expansion: ```shell --host_platform=//:rbe --shell_executable=/bin/bash --action_env=PATH=/usr/bin:/bin --host_action_env=PATH=/usr/bin:/bin --test_env=PATH=${CODEX_BAZEL_WINDOWS_PATH} ``` The native post-merge Windows job intentionally omits `--windows-cross-compile` and does not exclude the V8/code-mode unit targets: ```shell ./.github/scripts/run-bazel-ci.sh \ --print-failed-action-summary \ --print-failed-test-logs \ -- \ test \ --test_tag_filters=-argument-comment-lint \ --test_verbose_timeout_warnings \ --build_metadata=COMMIT_SHA=${GITHUB_SHA} \ --build_metadata=TAG_windows_native_main=true \ -- \ //... \ -//third_party/v8:all ``` ## Research Notes The existing macOS Bazel CI config already uses the model we want here: build actions run remotely with `--strategy=remote`, but `TestRunner` actions execute on the macOS runner. This PR mirrors that pattern for Windows with `--strategy=TestRunner=local`. The important Bazel detail is that `rules_rs` is already targeting `x86_64-pc-windows-gnullvm` for Windows Bazel PR tests. This PR changes where the build actions execute; it does not switch the Bazel PR test target to Cargo, `cargo-nextest`, or the MSVC release target. Cargo release builds differ from this Bazel path for V8: the normal Windows Cargo release target is MSVC, and `rusty_v8` publishes prebuilt Windows MSVC `.lib.gz` archives. The Bazel PR path targets `windows-gnullvm`; `rusty_v8` does not publish a prebuilt Windows GNU/gnullvm archive, so this PR builds that archive in-tree. That Linux-RBE-built gnullvm archive currently crashes in direct V8/code-mode smoke tests, which is why the workflow keeps native Windows coverage on `main`. The less obvious Bazel detail is test wrapper selection. Bazel chooses the Windows test wrapper (`tw.exe`) from the test action execution platform, not merely from the Rust target triple. The outer `workspace_root_test` therefore declares the default test toolchain and uses the bridge toolchain above so the test action executes on Windows while its inner Rust binary is built for gnullvm. The V8 investigation exposed a Windows-client gotcha: even when an action execution platform is Linux RBE, Bazel can still derive the genrule shell path from the Windows client. That produced remote commands trying to run `C:\Program Files\Git\usr\bin\bash.exe` on Linux workers. The wrapper now passes `--shell_executable=/bin/bash` with `--host_platform=//:rbe` for the Windows cross path. The same Windows-client/Linux-RBE boundary also affected `third_party/v8:binding_cc`: a multiline genrule command can carry CRLF line endings into Linux remote bash, which failed as `$'\r'`. That genrule now keeps the `sed` command on one physical shell line while using an explicit Starlark join so the shell arguments stay readable. ## Verification Local checks included: ```shell bash -n .github/scripts/run-bazel-ci.sh bash -n workspace_root_test_launcher.sh.tpl ruby -e "require %q{yaml}; YAML.load_file(%q{.github/workflows/bazel.yml}); puts %q{ok}" RUNNER_OS=Linux ./scripts/list-bazel-clippy-targets.sh RUNNER_OS=Windows ./scripts/list-bazel-clippy-targets.sh RUNNER_OS=Linux ./tools/argument-comment-lint/list-bazel-targets.sh RUNNER_OS=Windows ./tools/argument-comment-lint/list-bazel-targets.sh ``` The Linux clippy and argument-comment target lists contain zero `*-windows-cross-bin` labels, while the Windows lists still include 47 Windows-cross internal test binaries. CI evidence: - Baseline native Windows Bazel test on `main`: success in about 28m12s, https://github.com/openai/codex/actions/runs/25206257208/job/73907325959 - Green Windows-cross Bazel run on the split PR before adding the main-only native leg: Windows test 9m16s, Windows release verify 5m10s, Windows clippy 4m43s, https://github.com/openai/codex/actions/runs/25231890068 - The latest SHA adds the explicit PR-vs-main tradeoff in `bazel.yml`; CI is rerunning on that focused diff. ## Follow-Up A subsequent PR should investigate making a cross-built Windows binary work with V8/code-mode enabled. Likely options are either making the Linux-RBE-built `windows-gnullvm` V8 archive correct at runtime, or evaluating whether a Bazel MSVC target/toolchain can reuse the same prebuilt MSVC `rusty_v8` archive shape that Cargo release builds already use.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Why
Windows release builds are currently expensive enough that the Windows jobs can dominate release latency. This PR is a proof of concept for building Windows release binaries from Linux with Cargo, using
cargo-xwinto targetx86_64-pc-windows-msvc.The goal is to validate whether Cargo-based cross compilation can build the Windows release binaries we care about before changing the real signed release workflow. This PoC intentionally does not include signing logic.
What Changed
Adds
.github/workflows/rust-release-windows-cross-compile-poc.yml, a minimal workflow that:ubuntu-24.04x86_64-pc-windows-msvcRust targetcargo-xwincodex-responses-api-proxycodex-app-servercodexPE32+executablesToolchain Confidence
This PoC is still a Cargo-based release build: the workflow runs
cargo xwin build --release --target x86_64-pc-windows-msvc. It is not a Bazel build, and it is not changing the Rust target ABI away from the normal Windows MSVC target.The important distinction from a native Windows
cargo build --release --target x86_64-pc-windows-msvcis the host toolchain around Cargo:link.exe, an installed Visual Studio toolchain, and the installed Windows SDKcargo-xwinusesxwinto provide the Microsoft CRT and Windows SDK headers/libs on non-Windows hostslld-link;cargo-xwindefaults C/C++ cross-compilation toclang-clThis is why the PoC verifies the produced files are normal Windows
PE32+executables forx86_64-pc-windows-msvc.For linker maturity, this is not an obscure linker path. LLVM's Windows
llddocumentation sayslld-linkaccepts almost alllink.execommand-line options and is used to link production Windows builds of large real-world binaries such as Firefox and Chromium:https://lld.llvm.org/windows_support.html
That does not mean Firefox or Chromium use
cargo-xwin; the relevant point is that the linker family used by this cross-compilation path is already used for major production Windows binaries.Relevant tool docs:
cargo-xwin: https://github.com/rust-cross/cargo-xwinxwin: https://github.com/Jake-Shadle/xwinlld-link: https://lld.llvm.org/windows_support.htmlBefore replacing the signed release path, we should still smoke-test the cross-built binaries on Windows and compare any relevant startup/runtime benchmarks. The main risks are compatibility or behavioral differences from
lld-link,clang-cl, or SDK/CRT version differences, not a known systematic Rust runtime performance penalty from cross-compilation itself.Workflow Commands
I triggered the successful thin-LTO workflow by pushing updates to the draft PR branch:
That used the workflow's
pull_requesttrigger.I triggered the fat-LTO experiment with:
The build commands executed by the workflow are:
For both successful runs,
WINDOWS_TARGETwasx86_64-pc-windows-msvc.Verification
Successful thin-LTO workflow run:
https://github.com/openai/codex/actions/runs/25185982350
Successful thin-LTO job:
https://github.com/openai/codex/actions/runs/25185982350/job/73843273846
Per-binary build step timing from the successful thin-LTO run:
codex-responses-api-proxy: 1m59scodex-app-server: 23m01scodex: 16m03sSuccessful fat-LTO workflow run:
https://github.com/openai/codex/actions/runs/25190714066
Successful fat-LTO job:
https://github.com/openai/codex/actions/runs/25190714066/job/73859747064
Per-binary build step timing from the successful fat-LTO run:
codex-responses-api-proxy: 1m59scodex-app-server: 25m01scodex: 23m58sThese are the durations of each sequential
cargo xwin buildstep. Later build steps benefit from dependencies and artifacts produced by earlier steps in the same job.The workflow verified these output files:
target/x86_64-pc-windows-msvc/release/codex-responses-api-proxy.exetarget/x86_64-pc-windows-msvc/release/codex-app-server.exetarget/x86_64-pc-windows-msvc/release/codex.exeDownloading The Binaries
From the GitHub web UI:
windows-cross-compile-poc-x86_64-pc-windows-msvc-primary-app-server.GitHub downloads the artifact as a
.zip. You must be logged into GitHub and have access to the repo.From the CLI:
Each artifact contains:
codex.execodex-app-server.execodex-responses-api-proxy.exeBoth artifacts expire on 2026-07-29.