Skip to content

Conversation

@m3dwards
Copy link
Contributor

@m3dwards m3dwards commented Jun 19, 2024

Fixes: #30193 (comment)

Cache wasn't being saved when run on GHA because the default behaviour of the CI script was to store cache items in a docker volume. This works on Cirrus CI as the volumes are shared but it does not work on Github Actions in which each run is ephemeral.

Kept the default behaviour the same so hopefully this continues to work for the Cirrus CI jobs.

@DrahtBot
Copy link
Contributor

DrahtBot commented Jun 19, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK maflcko, hebasto

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

@m3dwards m3dwards force-pushed the 240519-ci-gha-no-volume branch 3 times, most recently from 48e1a1f to 5bf7a08 Compare June 20, 2024 12:46
@m3dwards m3dwards force-pushed the 240519-ci-gha-no-volume branch from 5bf7a08 to cae08eb Compare June 20, 2024 12:52
@m3dwards
Copy link
Contributor Author

Setting to draft while I fix the job, will test on my own fork and re-open when ready for review.

@m3dwards m3dwards marked this pull request as draft June 20, 2024 12:57
@m3dwards m3dwards force-pushed the 240519-ci-gha-no-volume branch from cae08eb to 5611197 Compare June 20, 2024 13:26
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was made as a step and not in the env: section of the job as ${{ runner.temp }} is only available in the run steps.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RUNNER_TEMP was used over ${{ github.workspace }} (as is used in the MacOS job) because the CI container mounts that as readonly which means it can't be used to write cache items to.

Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm.

Maybe wait for the CI to pass, then address the nits to test the created cache?

@kevkevinpal
Copy link
Contributor

Is there a reason why we don't want to just have cirrus behave in the same way we'd do it on GHA, seems like GHA cannot store to shared cache but cirrus can act the same was as GHA. Just thinking maybe theres a way to reduce complexity in ci/test/02_run_container.sh and have more shared logic between the two ci's

if I were to set DANGER_CI_ON_HOST_CACHE_FOLDERS = 1 on the cirrus pipeline would the ci fail?

@maflcko
Copy link
Member

maflcko commented Jun 20, 2024

Is there a reason why we don't want to just have cirrus behave in the same way we'd do it on GHA, seems like GHA cannot store to shared cache but cirrus can act the same was as GHA. Just thinking maybe theres a way to reduce complexity in ci/test/02_run_container.sh and have more shared logic between the two ci's

Yes, it is possible to do the same on cirrus. However, for local runs it can not be done, because DANGER_CI_ON_HOST_CACHE_FOLDERS is dangerous and bricks your system. So I'd say to leave it as-is.

@maflcko
Copy link
Member

maflcko commented Jun 20, 2024

Maybe wait for the CI to pass, then address the nits to test the created cache?

Oh, I guess only master creates the cache?

        if: github.event_name != 'pull_request' && steps.ccache-cache.outputs.cache-hit != 'true'

So I guess this can be merged after addressing the nits.

DANGER_CI_ON_HOST_CACHE_FOLDERS if set will mount caches in directories on the host rather than in docker volumes. Supports saving and restoring caches on Github Actions.
@m3dwards m3dwards force-pushed the 240519-ci-gha-no-volume branch from 5611197 to 4ecbbd9 Compare June 20, 2024 16:42
@m3dwards
Copy link
Contributor Author

Oh, I guess only master creates the cache?

Branches on forks work, see cache restore here: https://github.com/m3dwards/bitcoin/actions/runs/9601151359/job/26479048839

@m3dwards m3dwards marked this pull request as ready for review June 20, 2024 16:45
@m3dwards
Copy link
Contributor Author

m3dwards commented Jun 20, 2024

Nits should be addressed @maflcko

@m3dwards
Copy link
Contributor Author

This job has a ccache size limit of 100mb, the MacOS job has a ccache limit of 400m. @hebasto is there something specific to MacOS for why the ccache limit is higher there?

@hebasto
Copy link
Member

hebasto commented Jun 20, 2024

This job has a ccache size limit of 100mb, the MacOS job has a ccache limit of 400m. @hebasto is there something specific to MacOS for why the ccache limit is higher there?

When I adjusted ccache size limits, I picked a size that allowed to populate the cleared cache with no cleanups in the ccache --show-stats report. I'm not aware of any peculiarities that make macOS caches bigger than others.

@hebasto
Copy link
Member

hebasto commented Jun 20, 2024

Maybe wait for the CI to pass, then address the nits to test the created cache?

Oh, I guess only master creates the cache?

        if: github.event_name != 'pull_request' && steps.ccache-cache.outputs.cache-hit != 'true'

So I guess this can be merged after addressing the nits.

Maybe comment that line out in a temporary "NO MERGE" commit?

@maflcko
Copy link
Member

maflcko commented Jun 21, 2024

utACK 4ecbbd9

@hebasto
Copy link
Member

hebasto commented Jun 21, 2024

Maybe wait for the CI to pass, then address the nits to test the created cache?

Oh, I guess only master creates the cache?

        if: github.event_name != 'pull_request' && steps.ccache-cache.outputs.cache-hit != 'true'

So I guess this can be merged after addressing the nits.

Maybe comment that line out in a temporary "NO MERGE" commit?

I've pushed a modified branch to my personal account. Here is a CI job: https://github.com/hebasto/bitcoin/actions/runs/9611511964/job/26510228294.

@m3dwards
Copy link
Contributor Author

I've pushed a modified branch to my personal account.

Me also a few comments up :)

@hebasto
Copy link
Member

hebasto commented Jun 21, 2024

I've pushed a modified branch to my personal account.

Me also a few comments up :)

Oh sorry, I missed it.

@hebasto
Copy link
Member

hebasto commented Jun 21, 2024

I've pushed a modified branch to my personal account.

Me also a few comments up :)

Could you please rerun that job one more time to ensure that ccache --show-stats will print nearly 100% hit rate?

@m3dwards
Copy link
Contributor Author

Although I can see the cache get restored the job is no faster so I'm not sure it's working.

Could you please rerun that job one more time to ensure that ccache --show-stats will print nearly 100% hit rate?

Will do this

@m3dwards
Copy link
Contributor Author

Re-running but the cache hit rate at the moment is low:

ccache version 4.9.1
Cacheable calls:   770 / 782 (98.47%)
  Hits:             59 / 770 ( 7.66%)
    Direct:          7 /  59 (11.86%)
    Preprocessed:   52 /  59 (88.14%)
  Misses:          711 / 770 (92.34%)
Uncacheable calls:  12 / 782 ( 1.53%)
Local storage:
  Cache size (GB): 0.1 / 0.1 (102.4%)
  Cleanups:        604
  Hits:             59 / 770 ( 7.66%)
  Misses:          711 / 770 (92.34%)

@hebasto
Copy link
Member

hebasto commented Jun 21, 2024

Re-running but the cache hit rate at the moment is low:

The recent successive "ASan + LSan + UBSan + integer, no depends, USDT" job for the master branch on the Cirrus CI was https://cirrus-ci.com/task/6204664989876224. The merged PR did not touch C++ code at all. However, the hit rate is far from 100%:

+ bash -c 'ccache --version | head -n 1 && ccache --show-stats'
ccache version 4.9.1
Cacheable calls:   769 / 781 (98.46%)
  Hits:            260 / 769 (33.81%)
    Direct:         75 / 260 (28.85%)
    Preprocessed:  185 / 260 (71.15%)
  Misses:          509 / 769 (66.19%)
Uncacheable calls:  12 / 781 ( 1.54%)
Local storage:
  Cache size (GB): 0.2 / 0.2 (101.7%)
  Cleanups:        454
  Hits:            260 / 769 (33.81%)
  Misses:          509 / 769 (66.19%)

I wonder, is it ccache compatibility issue with some sanitizer instrumentations?

@maflcko
Copy link
Member

maflcko commented Jun 21, 2024

The Cirrus cache is shared with pull requests, so it can not be used to compare it here. You'd have to run it locally on a fresh VM, twice.

@m3dwards
Copy link
Contributor Author

Re-run results on GHA:

ccache version 4.9.1
Cacheable calls:   770 / 782 (98.47%)
  Hits:             62 / 770 ( 8.05%)
    Direct:          9 /  62 (14.52%)
    Preprocessed:   53 /  62 (85.48%)
  Misses:          708 / 770 (91.95%)
Uncacheable calls:  12 / 782 ( 1.53%)
Local storage:
  Cache size (GB): 0.1 / 0.1 (102.4%)
  Cleanups:        616
  Hits:             62 / 770 ( 8.05%)
  Misses:          708 / 770 (91.95%)

I will increase the size and see if it helps

@hebasto
Copy link
Member

hebasto commented Jun 21, 2024

I will increase the size and see if it helps

Yes. Especially, considering too many cleanups.

@hebasto
Copy link
Member

hebasto commented Jun 21, 2024

I will increase the size and see if it helps

From my tests it follows that the ccache cache size is about 227 MB. So setting CCACHE_MAXSIZE=250MB or CCACHE_MAXSIZE=300MB should work.

@maflcko
Copy link
Member

maflcko commented Jun 21, 2024

utACK da205dd

Copy link
Member

@hebasto hebasto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK da205dd.

@m3dwards
Copy link
Contributor Author

m3dwards commented Jun 21, 2024

Looking much better with bigger cache size:

ccache version 4.9.1
Cacheable calls:   770 / 782 (98.47%)
  Hits:            769 / 770 (99.87%)
    Direct:        769 / 769 (100.0%)
    Preprocessed:    0 / 769 ( 0.00%)
  Misses:            1 / 770 ( 0.13%)
Uncacheable calls:  12 / 782 ( 1.53%)
Local storage:
  Cache size (GB): 0.2 / 0.2 (96.47%)
  Hits:            769 / 770 (99.87%)
  Misses:            1 / 770 ( 0.13%)

ASan job has gone from 79 minutes down to 46 minutes with cache.

@fanquake fanquake merged commit cf44adf into bitcoin:master Jun 24, 2024
@m3dwards m3dwards deleted the 240519-ci-gha-no-volume branch June 24, 2024 20:52
@bitcoin bitcoin locked and limited conversation to collaborators Jun 24, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants