
[VL] Adding cache for m2 repo#11469

Closed
zhouyuan wants to merge 1 commit into apache:main from zhouyuan:wip_m2_cache

@zhouyuan
Member

What changes are proposed in this pull request?

This patch adds caching of the Maven .m2 repository after the TPC test.

How was this patch tested?

Passes GitHub Actions (GHA) CI.

Signed-off-by: Yuan <yuanzhou@apache.org>
@zhouyuan
Member Author

@baibaichen The Java caching can work this way. However, the cache is too big for now and may trigger eviction of the CPP cache.
[image]

@FelixYBW
Contributor

We could build an image that includes the cache daily, then pull the image and copy .m2 from it. Is that feasible?
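
A rough sketch of the pull-and-copy step, assuming the daily image stores the local Maven repository at /root/.m2. The image name, the path inside the image, and the DRY_RUN switch are all hypothetical; by default the script only prints the docker commands it would run.

```shell
#!/bin/sh
# Sketch: restore the Maven repository from a prebuilt cache image.
set -eu

# Hypothetical image name and in-image path; adjust to the real daily image.
IMAGE="${IMAGE:-example.invalid/gluten-m2-cache:latest}"
M2_IN_IMAGE="${M2_IN_IMAGE:-/root/.m2}"
# DRY_RUN=1 (default in this sketch) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

restore_m2_from_image() {
  dest="$1"
  run mkdir -p "$dest"
  run docker pull "$IMAGE"
  # A container is created (never started) only so that docker cp can read from it.
  if [ "$DRY_RUN" = "1" ]; then
    cid="<container-id>"
    echo "+ docker create $IMAGE true"
  else
    cid="$(docker create "$IMAGE" true)"
  fi
  # The trailing "/." copies the directory contents into $dest.
  run docker cp "$cid:$M2_IN_IMAGE/." "$dest"
  run docker rm "$cid"
}

restore_m2_from_image "${M2_DEST:-${HOME:-/root}/.m2}"
```

Running it with DRY_RUN=0 on a CI runner would perform the actual pull and copy. Whether this is cheaper than restoring from the Actions cache depends on the image size and registry bandwidth, which this sketch does not measure.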

@baibaichen
Contributor

How is the cache generated? Is there a shared disk that we mount for each test?

@FelixYBW
Contributor

> How is the cache generated? Is there a shared disk that we mount for each test?

See https://github.com/apache/incubator-gluten/actions/caches. You can't mount it, but you can copy it. The total cache size is limited.

@baibaichen
Contributor

> > How is the cache generated? Is there a shared disk that we mount for each test?
>
> https://github.com/apache/incubator-gluten/actions/caches. You can't mount it. You can copy it. The whole cache size is limited

I see. I agree with @FelixYBW's proposal. We can set up a daily or weekly job to generate the .m2 cache.

Note: Since Java JAR files are independent of the OS and Java version, we only need to maintain a single, unified .m2 cache.

To populate this cache comprehensively, we can execute the following commands. Theoretically, this sequence should download all necessary dependencies for our various build profiles into the cache:

build/mvn -P java-17,spark-4.0,scala-2.13,backends-velox,hadoop-3.3,spark-ut -Piceberg,iceberg-test,delta,paimon clean test-compile
build/mvn -P java-17,spark-3.5,backends-velox,hadoop-3.3,spark-ut -Piceberg,iceberg-test,delta,paimon clean test-compile
build/mvn -P java-17,spark-3.4,backends-velox,hadoop-3.3,spark-ut -Piceberg,iceberg-test,delta,paimon clean test-compile
build/mvn -P java-17,spark-3.3,backends-velox,hadoop-3.3,spark-ut -Piceberg,delta,paimon clean test-compile
build/mvn -P java-17,spark-4.1,scala-2.13,backends-velox,hadoop-3.3,spark-ut clean test-compile
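
The command list above could be wrapped in a small script for a scheduled job. This is only a sketch: the profile sets are copied from the commands above, while the script layout and the DRY_RUN switch are assumptions. By default it only prints the commands; set DRY_RUN=0 to actually run them from the repository root.

```shell
#!/bin/sh
# Sketch: warm one shared ~/.m2 by test-compiling every profile set in turn.
set -eu

# DRY_RUN=1 (default in this sketch) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# One line per build; flags copied from the command list above.
profile_sets() {
  cat <<'EOF'
-P java-17,spark-4.0,scala-2.13,backends-velox,hadoop-3.3,spark-ut -Piceberg,iceberg-test,delta,paimon
-P java-17,spark-3.5,backends-velox,hadoop-3.3,spark-ut -Piceberg,iceberg-test,delta,paimon
-P java-17,spark-3.4,backends-velox,hadoop-3.3,spark-ut -Piceberg,iceberg-test,delta,paimon
-P java-17,spark-3.3,backends-velox,hadoop-3.3,spark-ut -Piceberg,delta,paimon
-P java-17,spark-4.1,scala-2.13,backends-velox,hadoop-3.3,spark-ut
EOF
}

warm_m2_cache() {
  profile_sets | while IFS= read -r profiles; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "build/mvn $profiles clean test-compile"
    else
      # $profiles is left unquoted on purpose so each flag becomes its own argument.
      # stdin is redirected so mvn cannot consume the remaining profile lines.
      build/mvn $profiles clean test-compile </dev/null
    fi
  done
}

warm_m2_cache
```

Because of `set -e`, the first failing build stops the loop, so a partially populated cache would not be mistaken for a complete one.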

Regarding .ccache, do we currently update it only when upgrading Velox?

@zhouyuan
Member Author

zhouyuan commented Jan 27, 2026

Filed an issue to track this.

zhouyuan closed this on Jan 27, 2026

3 participants