Skip to content

Cache multiple paths and add glob pattern support#212

Merged
joshmgross merged 18 commits intoactions:masterfrom
ethanis:ethanis/cache-multiple-paths
Mar 20, 2020
Merged

Cache multiple paths and add glob pattern support#212
joshmgross merged 18 commits intoactions:masterfrom
ethanis:ethanis/cache-multiple-paths

Conversation

@ethanis
Copy link
Copy Markdown
Contributor

@ethanis ethanis commented Mar 12, 2020

This adds the ability to specify multiple items to cache in a single run (#16, #180). Additionally, the items to cache can be located with glob patterns. Paths/patterns are specified as new-line delimited input to the path parameter.

Because this adds multiple directories to the tarball that is cached, each path is made relative to the GITHUB_WORKSPACE env variable. If paths are outside of this directory, the tarball can contain a backtraced path (e.g. ../foo/bar). Because of this the -P flag is used with tar when unpacking the cache.

I have a couple of open questions:

  1. If no paths are found (e.g. a glob pattern found no files/directories), should the action fail intentionally?
  2. Because the tarball entries are now relative to GITHUB_WORKSPACE older cache entries will not be compatible with these changes. Is there a preferred way to add a salt value to the cache key?

Some examples:

    - name: Restore multiple relative paths
      uses: actions/cache@v3
      with:
        path: |
          node_modules
          dist
        key: ${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
    - name: Restore multiple absolute paths
      uses: actions/cache@v3
      with:
        path: |
          ~/foo/bar
          /users/mona/cache-me
        key: ${{ runner.os }}-${{ hashFiles('key.txt') }}
    - name: Restore from glob patterns
      uses: actions/cache@v3
      with:
        path: |
          **/*.ts
          **/node_modules
        key: ${{ runner.os }}-${{ hashFiles('key.txt') }}

ethanis and others added 7 commits March 4, 2020 16:02
remove known failing test

Quote tar paths

Add salt to test cache

Try reading input files from manifest

bump salt

Run test on macos

more testing

Run caching tests on 3 platforms

Run tests on self-hosted

Apparently cant reference hosted runners by name

Bump salt

wait for some time after save

more timing out

smarter waiting

Cache in tmp dir that won't be deleted

Use child_process instead of actions/exec

Revert tempDir hack

bump salt

more logging

More console logging

Use filepath to with cacheHttpClient

Test cache restoration

Revert temp dir hack

debug logging

clean up cache.yml testing

Bump salt

change debug output

build actions
@dhadka
Copy link
Copy Markdown
Contributor

dhadka commented Mar 12, 2020

Thanks for working on this!

We should look at using the version field in the server API. The idea is you compute a hash of any important settings used by the cache action, which in this case is the path, so when the path changes the version changes. Otherwise, if someone were to say add a new line to the path but forgot to change the key, you would get the old cache entry that does not include the new path.

@ethanis
Copy link
Copy Markdown
Contributor Author

ethanis commented Mar 12, 2020

The version field sounds like a perfect fit - is it exposed as a url param here?

Also, what should the initial value be? 1.0?

const httpClient = createHttpClient();
const resource = `cache?keys=${encodeURIComponent(keys.join(","))}`;
const version = getCacheVersion();
const resource = `cache?version=${version}`;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dhadka I took a stab at using a version to represent a cache. I'm not entirely sure how the endpoints expose this parameter, so I'm happy to revise as needed 😄

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will still need to provide the keys here. It should be cache?keys=key1,key2,...,keyn&version=....

I also don't think we need to include Key and RestoreKeys in the version. With the version, we only match caches with the exact (key, version) tuple. So having the keys included in the version is redundant.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it - updated to not include the keys in the version and include the keys as url params.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dhadka Why do we not need keys here?

I would think you could have an exact match on version but a restore key partial match since version only takes into account the action version and the input paths

@dhadka dhadka requested a review from joshmgross March 16, 2020 20:52
@smorimoto
Copy link
Copy Markdown
Contributor

This is great! Thank you @ethanis!

@smorimoto
Copy link
Copy Markdown
Contributor

Can you change to pass the path as Collections? It will be more yaml-like.

@dhadka
Copy link
Copy Markdown
Contributor

dhadka commented Mar 17, 2020

@imbsky Lists are currently not supported, see actions/toolkit#184. Ideally, once that is fixed, we would update this to support collections (as well as remain backwards compatible with multiline strings)

@smorimoto
Copy link
Copy Markdown
Contributor

I see. Well, I don't think there is a big problem so this looks good.

Copy link
Copy Markdown
Contributor

@joshmgross joshmgross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this!

As far as I can tell there are no breaking changes, but we may need to push this as a v2 update to the action to be safe.

@ethanis
Copy link
Copy Markdown
Contributor Author

ethanis commented Mar 19, 2020

@joshmgross is there anything else that needs to be changed in this PR?

@boredland
Copy link
Copy Markdown

this finally makes this actions usable without thousands of unncecessary lines when in a yarn monorepo.

Copy link
Copy Markdown
Contributor

@dhadka dhadka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM from the server side.

@joshmgross
Copy link
Copy Markdown
Contributor

@joshmgross is there anything else that needs to be changed in this PR?

I'll take a look later today, but should be all good

@chrispat
Copy link
Copy Markdown
Contributor

I would like to have one example added as part of this change that uses multiple paths. @boredland perhaps you have one in mind?

@boredland
Copy link
Copy Markdown

boredland commented Mar 20, 2020

@chrispat I think the lerna example would be helpful:

- name: restore lerna
     uses: actions/cache@v3
      with:
        path: |
          node_modules
           */*/node_modules
        key: ${{ runner.os }}-${{ hashFiles('yarn.lock') }}

@ethanis
Copy link
Copy Markdown
Contributor Author

ethanis commented Mar 20, 2020

@boredland do lerna projects have multiple yarn.lock files? If so, should the key should be based on all of them?

- name: restore lerna
  uses: actions/cache@v3
   with:
     path: |
        **/node_modules
     key: ${{ runner.os }}-${{ hashFiles('**/yarn.lock') }}

@smorimoto
Copy link
Copy Markdown
Contributor

smorimoto commented Mar 20, 2020

If I remember correctly, Lerna and yarn workspace have only one lock file at root.

@ethanis
Copy link
Copy Markdown
Contributor Author

ethanis commented Mar 20, 2020

@boredland, @chrispat I added an example for lerna here

Copy link
Copy Markdown
Contributor

@joshmgross joshmgross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this contribution!

The current plan is for this to go in v2

Co-Authored-By: Josh Gross <joshmgross@github.com>
@gcourtemanche
Copy link
Copy Markdown

Thanks for the great work! I'm using Yarn and Lerna, looking forward for v2.

@imbsky Lerna uses yarn.lock at root but also in every package.

@ethanis Is it possible to add in the doc that the v2 is not released yet. I spent some time trying to understand why the example wasn't working.

@joshmgross joshmgross mentioned this pull request Apr 2, 2020
@joshmgross
Copy link
Copy Markdown
Contributor

Thanks @gcourtemanche! Fixing that here: #242

@Bnaya
Copy link
Copy Markdown

Bnaya commented Apr 25, 2020

Dose is support ignore path somehow ?

      - name: restore yarn workspaces cache
        uses: actions/cache@master
        with:
          path: |
            node_modules
            */*/node_modules
           !packages/notyou/
          key: ${{ runner.os }}-${{ matrix.node }}-${{ hashFiles('**/yarn.lock') }}

@joshmgross
Copy link
Copy Markdown
Contributor

Dose is support ignore path somehow ?

      - name: restore yarn workspaces cache
        uses: actions/cache@master
        with:
          path: |
            node_modules
            */*/node_modules
           !packages/notyou/
          key: ${{ runner.os }}-${{ matrix.node }}-${{ hashFiles('**/yarn.lock') }}

Ignoring paths should be supported. The paths are evaluated by the @actions/glob package, see https://github.com/actions/toolkit/tree/master/packages/glob#exclude-patterns

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants