Skip to content

[core] Optimize GCS build time#50365

Merged
jjyao merged 5 commits intoray-project:masterfrom
dentiny:hjiang/split-gcs-targets
Feb 11, 2025
Merged

[core] Optimize GCS build time#50365
jjyao merged 5 commits intoray-project:masterfrom
dentiny:hjiang/split-gcs-targets

Conversation

@dentiny
Copy link
Copy Markdown
Contributor

@dentiny dentiny commented Feb 9, 2025

This PR improves build and link time for gcs_server_lib from 7m49.451s to 6m55.810s on my devbox (>10% improvement).

In this PR, I tried to do a few things:

  • Split the giant bazel target into smaller pieces, which better leverages bazel's parallelism
  • Leverage forward declaration if possible, so next time header change won't trigger rebuild for unnecessary dependents
    • This is also the benefit for smaller bazel targets
  • Place implementation into impl file rather than header file
    • Similar to above, also useful to reduce re-compilation
  • Remove unnecessary header and dependencies
  • Split dependencies into smaller granularity, apply to a lot of targets under ray/util and ray/common folder

The build time is still intolerably long, eg, a state_util target takes >120 seconds...
I've worked on large C++ binaries, which exceed linker 4GiB linker limit, which have to leverage some hacky ways to work around; but the compilation/link time is still under 15 minutes.

Toolchain setup wise, build file auto-generation and IWYU might be helpful;
but I personally feel it's more easy and important to reduce the interdependency between targets.

Benchmark command for reference:

  • bazel clean
    • cleanup local bazel cache before benchmark
  • sudo sync; sudo sh -c "echo 3 > /proc/sys/vm/drop_caches"
    • deprecate filesystem page cache, which I found affects performance by >10%
  • bazel build //:gcs_server_lib

Signed-off-by: dentiny <dentinyhao@gmail.com>
BUILD.bazel Outdated
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There're still quite a few source files under the target, which is hard to separate due to inter-dependency

@dentiny dentiny added the go add ONLY when ready to merge, run all tests label Feb 9, 2025
Signed-off-by: dentiny <dentinyhao@gmail.com>
@dentiny dentiny force-pushed the hjiang/split-gcs-targets branch from 0d7d6b3 to 59cfe29 Compare February 9, 2025 12:58
@edoakes edoakes requested a review from israbbani February 9, 2025 16:50
@edoakes
Copy link
Copy Markdown
Collaborator

edoakes commented Feb 9, 2025

@israbbani and @dayshah to review please

@dayshah
Copy link
Copy Markdown
Contributor

dayshah commented Feb 9, 2025

Some CPP tests are failing to build @dentiny. Agree we should do IWYU and even have it in CI. Also can you kick off the windows/macos builds in premerge as well after, to make sure those don't break.

@jjyao jjyao merged commit 5632a4b into ray-project:master Feb 11, 2025
2 checks passed
@israbbani
Copy link
Copy Markdown
Contributor

Thanks for doing this!

xsuler pushed a commit to antgroup/ant-ray that referenced this pull request Mar 4, 2025
Signed-off-by: dentiny <dentinyhao@gmail.com>
xsuler pushed a commit to antgroup/ant-ray that referenced this pull request Mar 4, 2025
Signed-off-by: dentiny <dentinyhao@gmail.com>
park12sj pushed a commit to park12sj/ray that referenced this pull request Mar 18, 2025
Signed-off-by: dentiny <dentinyhao@gmail.com>
edoakes pushed a commit that referenced this pull request Mar 31, 2025
Followup PR for #50365

The next step is to
(1) split interdependency between node manager and resource manager
(2) split targets depending on node manager and resource manager

Signed-off-by: dentiny <dentinyhao@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

community-backlog go add ONLY when ready to merge, run all tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants