Skip to content

[security] Fix exposed push tokens through gh workflow_run#379

Merged
mishig25 merged 10 commits into
mainfrom
fix-token-issue
Jun 7, 2023
Merged

[security] Fix exposed push tokens through gh workflow_run#379
mishig25 merged 10 commits into
mainfrom
fix-token-issue

Conversation

@mishig25

@mishig25 mishig25 commented Jun 2, 2023

Copy link
Copy Markdown
Contributor

Pretty much implements https://securitylab.github.com/research/github-actions-preventing-pwn-requests/

tldr:

  1. one workflow (without access to repo secrets) builds docs
  2. another workflow (with access to repo secrets through wokflow_run) uploads/pushes the docs to the hub

todos (in order):

@mishig25 mishig25 force-pushed the fix-token-issue branch 4 times, most recently from a215366 to dba002e Compare June 5, 2023 08:28
@mishig25 mishig25 changed the title wip [security] Fix exposed push tokens through gh workflow_run Jun 5, 2023
@mishig25 mishig25 requested review from LysandreJik, coyotte508 and sgugger and removed request for coyotte508 June 5, 2023 10:00
@sgugger

sgugger commented Jun 5, 2023

Copy link
Copy Markdown
Contributor

I am very uneducated in secuitry issues but what prevents the malicious user to rewrite those new workflows and get access to the secret token?

@mishig25

mishig25 commented Jun 5, 2023

Copy link
Copy Markdown
Contributor Author

from: https://securitylab.github.com/research/github-actions-preventing-pwn-requests/

a new trigger workflow_run was introduced to enable scenarios that require building the untrusted code and also need write permissions to update the PR with e.g. code coverage results or other test results. To do this in a secure manner, the untrusted code must be handled via the pull_request trigger so that it is isolated in an unprivileged environment. The workflow processing the PR should then store any results like code coverage or failed/passed tests in artifacts and exit. The following workflow then starts on workflow_run where it is granted write permission to the target repository and access to repository secrets, so that it can download the artifacts and make any necessary modifications to the repository or interact with third party services that require repository secrets (e.g. API tokens).

if i understand it correctly, the excerpt above is saying that a malicious user might modify new workflows trying to get access to the secret token, BUT github will only run version of those workflows (specifically, workflows that are triggerred by workflow_run event) that are on main branch of a repository. Therefore, as long as we don't merge a PR that modifies those workflows, malicious user can't access those secrets

@coyotte508 coyotte508 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks mostly good to me.

I think that using GH webhooks to a HF space may be better in the long run, that way no need to add the secret to each repo, which is really annoying (especially since we're going to rotate them all).

The only concern is that currently someone malicious could overwrite the docs of another PR from the same repo.

Also, we can remove the delete_doc_comment and delete_pr_documentation workflows probably: PR docs are automatically deleted after 30 days anyway.

Comment thread .github/workflows/delete_pr_documentation.yml Outdated
Comment thread .github/workflows/delete_pr_documentation.yml Outdated
@coyotte508

Copy link
Copy Markdown
Member

I am very uneducated in secuitry issues but what prevents the malicious user to rewrite those new workflows and get access to the secret token?

Secrets are never passed to workflows run from forks.

The workflow that uploads the docs here: https://github.com/huggingface/accelerate-wip/blob/main/.github/workflows/upload_pr_documentation.yml

It's only run from the main branch, and https://github.com/huggingface/doc-builder/pull/379/files#diff-f05826b801b9407ec985196b30fc45b111e09b5d98ee8670333493868e2b8dad it only downloads & reuploads an artifact

@mishig25

mishig25 commented Jun 5, 2023

Copy link
Copy Markdown
Contributor Author

that way no need to add the secret to each repo, which is really annoying (especially since we're going to rotate them all).

I think you can create organization-level secret. Which should solve this issue, no ?

PR docs are automatically deleted after 30 days anyway.

I didn't know it. Could you point me to a resource for conformation? If so, indeed delete_pr_documentation workflow is unnecessary

@coyotte508

Copy link
Copy Markdown
Member

I think you can create organization-level secret. Which should solve this issue, no ?

It would help a lot yes :) A few repos (transformers.js, ...) would still need but it's a lot better

PR docs are automatically deleted after 30 days anyway.

I didn't know it. Could you point me to a resource for conformation? If so, indeed delete_pr_documentation workflow is unnecessary

#353

@sgugger sgugger left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation!

Definitely worth a try (as long as it's tested with PRs opened from forks, which is the main issue we are trying to solve :-) ).

Comment thread .github/workflows/upload_pr_documentation.yml Outdated

@LysandreJik LysandreJik left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you all think it's safe, then this sounds good to me!

@mishig25 mishig25 merged commit 0c16b6c into main Jun 7, 2023
@mishig25 mishig25 deleted the fix-token-issue branch June 7, 2023 15:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants