GH-39707: [Java] Enable local build cache for Maven/Java build #39708

Merged
lidavidm merged 7 commits into apache:main from clayburn:develocity
Mar 22, 2024

@clayburn
Contributor

@clayburn clayburn commented Jan 19, 2024

Rationale for this change

This change has two main benefits:

Enabling local build caching

Enabling local build cache can speed up builds that occur on the same machine by skipping the execution of certain deterministic goals, such as Java compilation, when no change has occurred.

In the future, https://ge.apache.org will support remote build caching as well so that results can be shared between builds (e.g. ephemeral CI build agents).

Enabling build scans

This change enables the publishing of build scans of the Apache Arrow project to the Develocity instance at ge.apache.org, hosted by the Apache Software Foundation and run in partnership between the ASF and Gradle. This Develocity instance has all features and extensions enabled and is freely available for use by the Apache Arrow project and all other Apache projects. Currently, Maven-built projects such as Pulsar, IoTDB, Ozone, and others are using this instance.

On this Develocity instance, Apache Arrow will have access not only to all of the published build scans but also to aggregate data features such as:

  • Dashboards to view all historical build scans, along with performance trends over time
  • Build failure analytics for enhanced investigation and diagnosis of build failures
  • Test failure analytics to better understand trends and causes around slow, failing, and flaky tests

What changes are included in this PR?

  • Adds the Develocity Maven Extension to enable local caching
  • Publishes build scans to ge.apache.org from CI builds and authenticated local builds
  • Adds the Common Custom User Data Maven Extension to enhance build scans with more metadata

Are these changes tested?

No (these changes are for the build and do not affect the code)

Are there any user-facing changes?

No

@github-actions

⚠️ GitHub issue #39707 has been automatically assigned in GitHub to PR creator.

@github-actions github-actions bot added the awaiting review label Jan 19, 2024
@danepitkin
Member

Thank you @clayburn! This is an exciting improvement. For local build execution, would we have to change anything in our developer docs [1] on how to build Arrow Java in order to take advantage of this?

[1] https://arrow.apache.org/docs/developers/java/building.html#building-java-modules

@assignUser
Member

@clayburn I have approved the CI runs. I assume that only builds on main will be able to upload data due to the use of secrets? For similar things I have used a two-step process in the past, where the unprivileged workflow creates the data/artifacts and a second privileged workflow that doesn't run any user code uploads the data. Is this something Gradle could support? Or maybe data for each PR build would be a bit much ^^

@clayburn
Contributor Author

> Thank you @clayburn! This is an exciting improvement. For local build execution, would we have to change anything in our developer docs [1] on how to build Arrow Java in order to take advantage of this?
>
> [1] https://arrow.apache.org/docs/developers/java/building.html#building-java-modules

@danepitkin - For caching, there is no action needed. The one stipulation that should be noted is that the local cache is machine local, so with some of your containerized builds, it would be local to the container itself without any special handling. The local cache by default is located at $HOME/.m2/.gradle-enterprise/build-cache.
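
To make the location concrete, here is a minimal shell sketch (not part of this PR) that reports whether the default local cache directory named above exists and how large it is. The function names `build_cache_dir` and `report_build_cache` are hypothetical helpers; the path is the default quoted in this comment and would differ if the extension were configured otherwise.

```shell
# Default local build cache location quoted in this thread.
# Assumption: the extension has not been configured to use another directory.
build_cache_dir() {
  printf '%s\n' "${HOME}/.m2/.gradle-enterprise/build-cache"
}

# Print the cache size, or a note when no build has populated the cache yet.
report_build_cache() {
  dir="$(build_cache_dir)"
  if [ -d "$dir" ]; then
    du -sh "$dir"
  else
    echo "no local build cache at $dir"
  fi
}
```

Calling `report_build_cache` before and after a build should show the directory being populated. In a containerized build, the same path inside the container would need to be mounted or otherwise persisted for the entries to survive between runs.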

For build scans, you won't see any changes unless you explicitly authenticate to https://ge.apache.org. Instructions to do so are here. All ASF committers can authenticate to ge.apache.org with LDAP credentials. Contributors cannot authenticate, so they won't see anything regarding build scans. The access key is either stored in ~/.m2 or pulled from an env var, so again, the containerized builds may not authenticate easily without intention from the user.

> @clayburn I have approved the CI runs, I assume that only builds on main will be able to upload data due to the use of secrets? For similar things I have used a two step process in the past where the unprivileged workflow creates the data/artifacts and a second privileged workflow that doesn't run any user code would upload the data. Is this something gradle could support? Or maybe data for each PR build would be a bit much ^^

Build scans will only be produced for builds that run from the apache/arrow repo. In other words, not from forks. Forked builds will still build just fine, but will not produce a scan. We are working on methods to handle this in the way you described, performing the upload of the scan in a "trusted" environment, but this is not yet ready.

The caching functionality will work without the secret, although the benefits may not be apparent in your CI builds since they are presumably running on ephemeral agents and the cache is machine local. This is where the remote cache will help more, which we hope to enable in the future.

@pitrou
Member

pitrou commented Jan 22, 2024

> The local cache by default is located at $HOME/.m2/.gradle-enterprise/build-cache.

Can it be directed to another directory using an environment variable? In C++ we direct ccache data to a custom directory which is then persisted as Docker volumes. This is then used together with GHA caching to shorten CI times.

@clayburn
Contributor Author

> The local cache by default is located at $HOME/.m2/.gradle-enterprise/build-cache.
>
> Can it be directed to another directory using an environment variable? In C++ we direct ccache data to a custom directory which is then persisted as Docker volumes. This is then used together with GHA caching to shorten CI times.

I think this is plausible. I'll give it a shot and update back here.

@clayburn clayburn requested a review from lidavidm as a code owner January 24, 2024 18:01
@clayburn
Contributor Author

@pitrou - Looking through the workflow, you are already caching the right directory. It caches all of $HOME/.m2, which includes the default build cache directory. The cache key there, which is constructed from all files under java/**, may be overly specific for the build cache, but the restore-keys configuration should handle it. We will see once you publish build scans and generate that data.
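
The interplay between a content-derived key and restore-keys can be sketched in shell. This only illustrates the idea and is not how GitHub Actions computes its file hashes or selects caches: `tree_key` and `pick_cache` are hypothetical helpers, `cksum` stands in for the real hash, and the stored keys are assumed to be listed newest first.

```shell
# Hypothetical stand-in for a content-derived cache key:
# checksum every file under a directory, then checksum the list of checksums.
tree_key() {
  (
    cd "$1" || exit 1
    # Sort for a stable order so the key depends only on names and contents.
    find . -type f | LC_ALL=C sort | while IFS= read -r f; do
      cksum "$f"
    done
  ) | cksum | cut -d ' ' -f 1
}

# Return the exact key if it is stored; otherwise fall back to the first
# stored key that starts with the given prefix (the restore-keys idea).
# Usage: pick_cache WANTED_KEY PREFIX STORED_KEY...
pick_cache() {
  want="$1"
  prefix="$2"
  shift 2
  for k in "$@"; do
    if [ "$k" = "$want" ]; then
      echo "$k"
      return 0
    fi
  done
  for k in "$@"; do
    case "$k" in
      "$prefix"*) echo "$k"; return 0 ;;
    esac
  done
  return 1
}
```

With a key derived from every file under java/**, almost any commit produces a new exact key, so the prefix fallback is what keeps a previously saved $HOME/.m2 (and the build cache inside it) available to the next run.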

I've pushed two changes to further optimize this:

  • One that enables the local cache. It also adds the clean lifecycle to the CI invocations of Maven, as the build cache is in read-only mode when clean isn't used.
  • One that defines some normalizations so that, for example, a change to a properties file that exists in a runtime classpath won't cause a cache miss.

Member

Please add newlines to EOF

@github-actions github-actions bot added awaiting committer review and removed awaiting review labels Jan 25, 2024
Member

I suppose build caching means that the 'clean' here isn't actually a clean build?

Contributor Author

Adding clean simply runs the clean lifecycle before running the other requested lifecycles. We need this to safely write cache entries, so that we know which goal invocation produced which files.

So this will clean the workspace, but it may restore cache entries to the workspace during execution rather than running the goals. It just depends on your definition of "clean" here. -DrerunGoals can be used to explicitly rerun goals in cases where that is desired.
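
The two behaviours can be compared with a small shell sketch. `cached_build` and `forced_build` are hypothetical wrapper names and `MVN` is an assumed override for the Maven executable; `-DrerunGoals` is the property mentioned above, and `install` is used only because it is the phase discussed elsewhere in this thread.

```shell
# MVN may point at a wrapper script; it defaults to mvn on PATH (assumption).
MVN="${MVN:-mvn}"

# Cleans the workspace first, then lets cacheable goals be restored from the
# local build cache instead of being executed again.
cached_build() {
  "$MVN" clean install "$@"
}

# Same build, but asks the extension to execute the goals rather than
# restoring their outputs from the cache.
forced_build() {
  "$MVN" clean install -DrerunGoals "$@"
}
```

Running `cached_build` twice without source changes should show the second run restoring outputs from the cache, while `forced_build` is the option when a build that actually executes every goal is wanted.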

@lidavidm
Member

@github-actions crossbow submit java

@github-actions github-actions bot added awaiting review, awaiting merge and removed awaiting committer review labels Jan 25, 2024
@github-actions

Revision: aa0ffadb0e87ad24b464b6d43cbb6ac1028ecb2f

Submitted crossbow builds: ursacomputing/crossbow @ actions-f432146f72

Task Status
java-jars GitHub Actions
verify-rc-source-java-linux-almalinux-8-amd64 GitHub Actions
verify-rc-source-java-linux-conda-latest-amd64 GitHub Actions
verify-rc-source-java-linux-ubuntu-20.04-amd64 GitHub Actions
verify-rc-source-java-linux-ubuntu-22.04-amd64 GitHub Actions
verify-rc-source-java-macos-amd64 GitHub Actions

@github-actions github-actions bot added awaiting review, awaiting committer review and removed awaiting review, awaiting merge labels Jan 25, 2024
@davisusanibar
Contributor

> For caching, there is no action needed. The one stipulation that should be noted is that the local cache is machine local

@clayburn If you could help me with doubts regarding local development, I would appreciate it:

  1. I am running $ mvn clean install and seeing that /.m2/.gradle-enterprise is populating with information. If I run $ mvn clean install again and haven't made any changes to my Java module projects, should it use the last cache?

  2. Is there a list of tasks that will be fetched from the cache before building it again (e.g. compile,...)?

I would like to identify/map what the new advantages will be for developers on their local machines.

Thank you in advance for your support.

Member

@danepitkin danepitkin left a comment

Thanks for the contribution!

@clayburn
Contributor Author

@lidavidm Done ✅

@clayburn
Contributor Author

> For caching, there is no action needed. The one stipulation that should be noted is that the local cache is machine local
>
> @clayburn If you could help me with doubts regarding local development, I would appreciate it:
>
>   1. I am running $ mvn clean install and seeing that /.m2/.gradle-enterprise is populating with information. If I run $ mvn clean install again and haven't made any changes to my Java module projects, should it use the last cache?
>   2. Is there a list of tasks that will be fetched from the cache before building it again (e.g. compile,...)?
>
> I would like to identify/map what the new advantages will be for developers on their local machines.
>
> Thank you in advance for your support.

@davisusanibar - So sorry for losing track of this:

  1. Yes, such a build will use the cache. It also isn't necessarily even the "last" cache (i.e. the last build). It could be the case that you use cache entries from older builds as well. This can help, for example, when switching back and forth between feature branches.

  2. We document the default cacheable goals here. It is possible to add goals if needed. We can explore this in the future to see if any in your project make sense to add.

@danepitkin
Member

@github-actions crossbow submit -g java

@github-actions

Revision: 988562f

Submitted crossbow builds: ursacomputing/crossbow @ actions-be6617b977

Task Status
java-jars GitHub Actions
verify-rc-source-java-linux-almalinux-8-amd64 GitHub Actions
verify-rc-source-java-linux-conda-latest-amd64 GitHub Actions
verify-rc-source-java-linux-ubuntu-20.04-amd64 GitHub Actions
verify-rc-source-java-linux-ubuntu-22.04-amd64 GitHub Actions
verify-rc-source-java-macos-amd64 GitHub Actions

  ARCHERY_DOCKER_PASSWORD: ${{ secrets.DOCKERHUB_TOKEN }}
- run: archery docker run ${{ matrix.image }}
+ GRADLE_ENTERPRISE_ACCESS_KEY: ${{ secrets.GE_ACCESS_TOKEN }}
+ run: archery docker run ${{ matrix.image }} -e "GRADLE_ENTERPRISE_ACCESS_KEY=$GRADLE_ENTERPRISE_ACCESS_KEY" -e CI=true
Member

Could you use archery docker run -e ... ${{ matrix.image }} style like we did in other places?

Suggested change
- run: archery docker run ${{ matrix.image }} -e "GRADLE_ENTERPRISE_ACCESS_KEY=$GRADLE_ENTERPRISE_ACCESS_KEY" -e CI=true
+ run: |
+   archery docker run \
+     -e CI=true \
+     -e "GRADLE_ENTERPRISE_ACCESS_KEY=$GRADLE_ENTERPRISE_ACCESS_KEY" \
+     ${{ matrix.image }}

  ARCHERY_DOCKER_PASSWORD: ${{ secrets.DOCKERHUB_TOKEN }}
- run: archery docker run conda-python-java-integration
+ GRADLE_ENTERPRISE_ACCESS_KEY: ${{ secrets.GE_ACCESS_TOKEN }}
+ run: archery docker run conda-python-java-integration -e "GRADLE_ENTERPRISE_ACCESS_KEY=$GRADLE_ENTERPRISE_ACCESS_KEY" -e CI=true
Member

ditto.

@github-actions github-actions bot added awaiting changes and removed awaiting committer review labels Mar 21, 2024
@github-actions github-actions bot added awaiting change review and removed awaiting changes labels Mar 21, 2024
@lidavidm
Member

@danepitkin @vibhatha A potential CI improvement might be to have a post-run step that captures and uploads dump files like these so we can get to the bottom of this instability.

Error:  Caused by: org.apache.maven.surefire.booter.SurefireBooterForkException: The forked VM terminated without properly saying goodbye. VM crash or System.exit called?
Error:  Command was cmd.exe /X /C "C:\hostedtoolcache\windows\Java_Temurin-Hotspot_jdk\11.0.22-7\x64\bin\java --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED -jar C:\Users\runneradmin\AppData\Local\Temp\surefire832983470661055534\surefirebooter-20240322025403947_28.jar C:\Users\runneradmin\AppData\Local\Temp\surefire832983470661055534 2024-03-22T02-53-37_914-jvmRun2 surefire-20240322025403947_24tmp surefire_6-20240322025403947_26tmp"
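
One possible shape for such a post-run step, as a sketch only: copy Surefire dump files out of the build tree into a single directory that a later CI step could upload. `collect_dumps` is a hypothetical helper, the `*.dump` and `*.dumpstream` patterns assume Surefire's usual dump file naming, and the destination is assumed to lie outside the source tree and the source path to have no trailing slash.

```shell
# Copy Surefire dump files from a build tree into one directory,
# preserving their relative paths so the originating module stays visible.
# Usage: collect_dumps SOURCE_TREE DESTINATION
collect_dumps() {
  src="$1"
  dest="$2"
  mkdir -p "$dest"
  find "$src" -type f \( -name '*.dump' -o -name '*.dumpstream' \) |
    while IFS= read -r f; do
      rel="${f#"$src"/}"                  # path relative to the source tree
      mkdir -p "$dest/$(dirname "$rel")"
      cp "$f" "$dest/$rel"
    done
}
```

A workflow could call this after the Maven step regardless of its outcome and then hand the destination directory to whatever artifact upload mechanism the CI system provides.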

@github-actions github-actions bot added awaiting merge and removed awaiting change review labels Mar 22, 2024
@lidavidm
Member

Verification failure is unrelated. However, another improvement might be to run mvn -B in our verification script; I don't think we need to see the entire download progress in logs and it makes scanning the log much harder...

@lidavidm lidavidm merged commit 6e54b7b into apache:main Mar 22, 2024
@lidavidm lidavidm removed the awaiting merge label Mar 22, 2024
@conbench-apache-arrow

After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 6e54b7b.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 3 possible false positives for unstable benchmarks that are known to sometimes produce them.

@danepitkin
Member

Hey @clayburn, one of our systems used for benchmarking is hitting OOM after merging this PR: #40775

By any chance are you familiar with the stack trace containing gradle and apache.commons.exec?

@danepitkin
Member

Hmm, I might've found the culprit. DEBUG messages are now enabled in the build, causing a huge amount of logging.

@lidavidm
Member

Do we want to revert, or do you think disabling the log is good enough?

@danepitkin
Member

I think disabling DEBUG logs should be good enough. I'll try it out.

pribor pushed a commit to GlobalWebIndex/arrow that referenced this pull request Oct 24, 2025
…pache#39708)

### Rationale for this change

This change has two main benefits:

#### Enabling local build caching

Enabling local build cache can speed up builds that occur on the same machine by skipping the execution of certain deterministic goals, such as Java compilation, when no change has occurred. 

In the future, [ge.apache.org](https://ge.apache.org) will support remote build caching as well so that results can be shared between builds (e.g. ephemeral CI build agents).

#### Enabling build scans

This change enables the publishing of build scans of the Apache Arrow project to the Develocity instance at [ge.apache.org](https://ge.apache.org/), hosted by the Apache Software Foundation and run in partnership between the ASF and Gradle. This Develocity instance has all features and extensions enabled and is freely available for use by the Apache Arrow project and all other Apache projects. Currently, Maven-built projects such as [Pulsar](https://ge.apache.org/scans?search.buildToolType=maven&search.rootProjectNames=*Pulsar*&search.timeZoneId=America%2FChicago), [IoTDB](https://ge.apache.org/scans?search.buildToolType=maven&search.rootProjectNames=*iotdb*&search.timeZoneId=America%2FChicago), [Ozone](https://ge.apache.org/scans?search.buildToolType=maven&search.rootProjectNames=*ozone*&search.timeZoneId=America%2FChicago), and others are using this instance.

On this Develocity instance, Apache Arrow will have access not only to all of the published build scans but other aggregate data features such as:

- Dashboards to view all historical build scans, along with performance trends over time
- Build failure analytics for enhanced investigation and diagnosis of build failures
- Test failure analytics to better understand trends and causes around slow, failing, and flaky tests

### What changes are included in this PR?

- Adds the [Develocity Maven Extension](https://docs.gradle.com/enterprise/maven-extension/) to enable local caching
- Publishes build scans to [ge.apache.org](https://ge.apache.org) from CI builds and authenticated local builds
- Adds the [Common Custom User Data Maven Extension](https://github.com/gradle/common-custom-user-data-maven-extension) to enhance build scans with more metadata

### Are these changes tested?

No (these changes are for the build and do not affect the code)

### Are there any user-facing changes?

No
* Closes: apache#39707

Authored-by: Clay Johnson <cjohnson@gradle.com>
Signed-off-by: David Li <li.davidm96@gmail.com>