
docs: add automated contributions policy to CONTRIBUTING.md#3831

Merged
ianna merged 14 commits into main from ianna/automated-contributions-policy
Feb 5, 2026

Conversation

@ianna (Member) commented Jan 27, 2026

Added a section on automated contributions policy to clarify the use of AI tools in contributions. Updated maintainers list.

@ianna (Member Author) left a comment

As @pfackeldey suggested, the text is taken from here.

I would also suggest using the following labels to mark the PRs:

[image]

@TaiSakuma - please suggest the other two categories, as you've mentioned in your document. Thanks

@github-actions

The documentation preview is ready to be viewed at http://preview.awkward-array.org.s3-website.us-east-1.amazonaws.com/PR3831

@jpivarski (Member) left a comment

This is a reasonable statement on AI because of the prohibition against "fully-automated tools" and the need to be able to "explain changes upon request."

At the UChicago DSI, we're using AI for increasingly large parts of code generation and review because it does the routine parts well and just needs to be directed at a high level. The biggest challenge is communicating what "directed at a high level" means to those who don't have a strong programming background, such as students. To avoid banalities like "make sure you're in control," which someone is only likely to understand if they're doing it anyway, I've proposed some concrete measures, like making sure that at least half of all AI interactions are information-seeking (e.g. "what does this function do?"), rather than purely generating (e.g. "write a function that does..."), and requiring them to do clean-up phases (e.g. "simplify this module as much as possible and improve the names") after everything works, because the worst code is generated by asking AI to keep messing with the same code block until it runs. And of course, all linters are in full effect as guardrails: Claude or Codex can be told to keep running them until all the issues are fixed, and that almost always leads to improvement.

Also for projects without students, we use it quite extensively. What that means for our own pull requests is that we must tell reviewers what has already been checked by AI and what remains for human judgement (e.g. "is this the right approach?"). I never leave functions without type hints or docstrings anymore because there's essentially zero cost to letting AI fill in that information (with mypy as a correctness linter). So it's important for PR reviewers to not spend their time checking these things and I, as a PR author, need to tell them that. Reviews have become a lot more intentional, requiring a statement of scope from the PR author, because intention is the main thing AI code generators/checkers lack.

@henryiii (Member)

I wonder if adding "unsolicited" would help. In my mind, this is more like a code of conduct, which is really only used when there's an action that needs to be taken and the CoC explains why that action is justified. For example, if I wanted to assign "at" copilot to an issue, and it generated a PR to solve it, that would be in violation of this policy. Since I'm a maintainer, and assuming someone noticed that I triggered it, I assume that this policy would not be enforced, but adding "unsolicited" would make it clear that this is not a violation at all. Also, pre-commit and Dependabot open computer-generated PRs; I wouldn't want those to be in violation of the policy.

I think the idea here is to just protect against an AI generated PR that takes time to review that we didn't ask for.

@TaiSakuma (Member)

@TaiSakuma - please suggest the other two categories, as you've mentioned in your document. Thanks

@ianna - which two categories are you referring to?

@TaiSakuma (Member)

I have a question about the labels autonomous, ai-assisted, and ai-generated. Who is going to put these labels? Are submitters supposed to put the labels themselves, or are reviewers to determine them based on the PR contents?

@ianna (Member Author) commented Jan 28, 2026

I have a question about the labels autonomous, ai-assisted, and ai-generated. Who is going to put these labels? Are submitters supposed to put the labels themselves, or are reviewers to determine them based on the PR contents?

I would propose that the PR contributor does it.

@ianna (Member Author) commented Jan 28, 2026

@TaiSakuma - please suggest the other two categories, as you've mentioned in your document. Thanks

@ianna - which two categories are you referring to?

I was wondering if those labels (or a different set) would work for the PR triage, e.g. its five outcomes.

@ianna (Member Author) commented Jan 28, 2026

I have a question about the labels autonomous, ai-assisted, and ai-generated. Who is going to put these labels? Are submitters supposed to put the labels themselves, or are reviewers to determine them based on the PR contents?

I would propose that the PR contributor does it.

I guess in this case we would need a label for no AI as well…

@ikrommyd (Collaborator) left a comment

I support these guidelines.

I would also probably add something along the lines of what JAX says here: https://docs.jax.dev/en/latest/contributing.html#can-i-contribute-ai-generated-code
Basically, that all contributions (AI included) should follow the Scikit-HEP code of conduct that we have, and that we can be pickier in review about AI PRs when we know there is very little human involvement. I also really like the loose rule of thumb: "If the team needs to spend more time reviewing a contribution than the contributor spends generating it, then the contribution is probably not helpful to the project."

Finally, regarding the labels: on one hand, having a label to distinguish between those PR types is good. On the other hand, I don't think the contributor can add labels, can they? If they can, that's good, but it also gives them the ability to lie about it. If only we can change the labels, that can create some friction if they disagree with how we label a PR.
I generally would like a way to easily classify AI PRs, though, just like the NumPy team wanted, in case they need to be reverted at some point. I'm just a bit skeptical about how the labeling will work. I'm fine with whatever you decide, though.

@ianna (Member Author) commented Jan 28, 2026

I support these guidelines.

I would also probably add something along the lines of what JAX says here: https://docs.jax.dev/en/latest/contributing.html#can-i-contribute-ai-generated-code Basically, that all contributions (AI included) should follow the Scikit-HEP code of conduct that we have, and that we can be pickier in review about AI PRs when we know there is very little human involvement. I also really like the loose rule of thumb: "If the team needs to spend more time reviewing a contribution than the contributor spends generating it, then the contribution is probably not helpful to the project."

Finally, regarding the labels: on one hand, having a label to distinguish between those PR types is good. On the other hand, I don't think the contributor can add labels, can they? If they can, that's good, but it also gives them the ability to lie about it. If only we can change the labels, that can create some friction if they disagree with how we label a PR. I generally would like a way to easily classify AI PRs, though, just like the NumPy team wanted, in case they need to be reverted at some point. I'm just a bit skeptical about how the labeling will work. I'm fine with whatever you decide, though.

Thanks. All good points! I suggest we go with this as-is and update it as soon as Scikit-HEP agrees on a final policy.

@pfackeldey (Collaborator)

Hi @ianna, thanks for putting this up; I'm generally happy with this guideline, and also thanks for updating the maintainer list.
The coding agent landscape is changing so rapidly that we will likely have to adapt this guideline from time to time, but that's a price we have to pay in this rapidly evolving field.

I think @henryiii has a good point and I think we should add the "unsolicited" word. I can imagine that we do want to ask AI in some cases explicitly for e.g. review help/summary in addition to our 'human' review.

We could also add a PR template where people are prompted to disclose LLM usage, if any; not sure what you think about this?

The labeling is interesting from a metrics point of view: in 1-2 years we can see how PR contributions have shifted toward potentially more AI-driven contributions. Labels allow us to filter PRs and use GitHub's REST API to create some interesting plots in the future; there may be a chance to learn something about how software engineering with AI shifts for our field.
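As a rough sketch of that metrics idea: merged PRs and their labels can be pulled from GitHub's REST API and tallied. The label names here ("ai-generated", "ai-assisted", "no-ai") are assumptions, not an agreed set; adjust them to whatever the project finally adopts.

```python
# Sketch: tally PRs by AI-disclosure label via GitHub's REST API.
# Label names are hypothetical placeholders for the set under discussion.
import json
from collections import Counter
from urllib.request import Request, urlopen

PULLS_URL = "https://api.github.com/repos/scikit-hep/awkward/pulls"


def fetch_prs(state="closed", per_page=100, token=None):
    """Fetch one page of PRs; each entry carries its labels in the payload."""
    req = Request(f"{PULLS_URL}?state={state}&per_page={per_page}")
    req.add_header("Accept", "application/vnd.github+json")
    if token:
        req.add_header("Authorization", f"Bearer {token}")
    with urlopen(req) as resp:
        return json.load(resp)


def count_disclosure_labels(prs, labels=("ai-generated", "ai-assisted", "no-ai")):
    """Count how many PRs carry each AI-disclosure label."""
    counts = Counter({name: 0 for name in labels})
    for pr in prs:
        for label in pr.get("labels", []):
            if label["name"] in labels:
                counts[label["name"]] += 1
    return dict(counts)
```

The counting step is deliberately separate from the fetch so it can be run (and tested) on saved API payloads without network access.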

@ikrommyd (Collaborator)

We could also add a PR template where people are prompted to disclose LLM usage, if any; not sure what you think about this?

I like that too
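A minimal sketch of what such a disclosure template could look like (the file path and checkbox wording here are assumptions, not an agreed proposal):

```markdown
<!-- .github/PULL_REQUEST_TEMPLATE.md (hypothetical) -->
## Description

<!-- Describe your changes. -->

## AI-assistance disclosure

- [ ] No AI tools were used in this contribution.
- [ ] AI tools assisted (e.g. completions, review suggestions); I have
      reviewed all changes and can explain them upon request.
- [ ] Substantial parts are AI-generated; I have tested them and can
      explain them upon request.
```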

@ianna (Member Author) commented Jan 28, 2026

Hi @ianna, thanks for putting this up; I'm generally happy with this guideline, and also thanks for updating the maintainer list. The coding agent landscape is changing so rapidly that we will likely have to adapt this guideline from time to time, but that's a price we have to pay in this rapidly evolving field.

I think @henryiii has a good point and I think we should add the "unsolicited" word. I can imagine that we do want to ask AI in some cases explicitly for e.g. review help/summary in addition to our 'human' review.

We could also add a PR template where people are prompted to disclose LLM usage, if any; not sure what you think about this?

The labeling is interesting from a metrics point of view: in 1-2 years we can see how PR contributions have shifted toward potentially more AI-driven contributions. Labels allow us to filter PRs and use GitHub's REST API to create some interesting plots in the future; there may be a chance to learn something about how software engineering with AI shifts for our field.

Good points! BTW, since we are in the EU, we need to add something like "This documentation was refined with AI assistance" to prepare for the 2026 regulations. I haven't found the US regulations yet; @henryiii, perhaps you have?

@TaiSakuma (Member)

@TaiSakuma - please suggest the other two categories, as you've mentioned in your document. Thanks

@ianna - which two categories are you referring to?

I was wondering if those labels (or a different set) would work for the PR triage, e.g. its five outcomes.

I intended these five outcomes to be for the AI triage to categorize PRs into, so as to save human reviewers time. If people spend enough time on a PR to put it into a category, they've just reviewed the PR. Their time isn't saved.

I can experiment with the AI triage on a different branch or a fork. If that works, we can roll it out to the main branch of the main repo.

@ianna (Member Author) commented Jan 28, 2026

@TaiSakuma - please suggest the other two categories, as you've mentioned in your document. Thanks

@ianna - which two categories are you referring to?

I was wondering if those labels (or a different set) would work for the PR triage, e.g. its five outcomes.

I intended these five outcomes to be for the AI triage to categorize PRs into, so as to save human reviewers time. If people spend enough time on a PR to put it into a category, they've just reviewed the PR. Their time isn't saved.

I can experiment with the AI triage on a different branch or a fork. If that works, we can roll it out to the main branch of the main repo.

Excellent idea! Thanks @TaiSakuma

@henryiii (Member)

For an example of a maintainer-requested AI PR, see scikit-hep/boost-histogram#1076 (requested in scikit-hep/boost-histogram#1074).

@henryiii (Member)

By the way, I'm also not fond of this:

Contributing to awkward requires human judgment, contextual understanding, and familiarity with awkward’s structure and goals. It is not suitable for automatic processing by AI tools.

That includes a baked-in assumption that AI tools are not smart enough to work on awkward, which will almost certainly be wrong in the future; I suspect it's wrong today. I think we should avoid statements that either become false soon, or might even be false today. I expect this is a statement from 1-2 years ago in scikit-learn, when it might have been true.

@TaiSakuma (Member) left a comment

This text was taken from another project, as stated in #3831 (review), and appears to have been written in 2024 or earlier. I think it is outdated in January 2026. If we put this in place for now, I think we should revise it soon, within a few months.

The text discusses closing PRs in this "Automated contributions policy" section, but that topic isn't specific to this section. I think you can write about closing PRs more generally in a different section. I think that, in principle, you can close or ignore any PR without providing a reason. Maybe you can state that you would normally review every PR as long as resources permit, but that this may not always be possible, and that you may have to stop reviewing and close a PR when sufficient resources aren't available.

Instead of, or in addition to, discussing when you close PRs, you can state when you merge them. For example, you can state that you will only merge PRs that you have tested, understand, believe would be valuable to the project, and can maintain in the future.

Co-authored-by: Henry Schreiner <henry.fredrick.schreiner@cern.ch>
@ikrommyd (Collaborator) left a comment

Apart from my one OCD suggestion, this looks good!

Co-authored-by: Iason Krommydas <iason.krom@gmail.com>
@pfackeldey (Collaborator) left a comment

I like it a lot now 👍

@jpivarski (Member) left a comment

It's good! I was a little confused by the parenthetical phrase. If my interpretation is wrong, don't take my suggestion as-is.

Co-authored-by: Jim Pivarski <jpivarski@users.noreply.github.com>
@ianna (Member Author) commented Feb 2, 2026

@lgray and @agoose77 - please take a look and sign if you’re on board. It would be great to get all admins and maintainers on the same page before we move forward. Thanks!

@ianna (Member Author) commented Feb 3, 2026

@lgray and @agoose77 - please take a look and sign if you’re on board. It would be great to get all admins and maintainers on the same page before we move forward. Thanks!

@lgray and @agoose77 - ping

@lgray (Collaborator) left a comment

After revision, the new text is reasonable and acceptable.

@agoose77 (Collaborator) commented Feb 3, 2026

LGTM!

@ianna ianna merged commit 4da20e6 into main Feb 5, 2026
16 checks passed
@ianna ianna deleted the ianna/automated-contributions-policy branch February 5, 2026 15:23
johanneskoester added a commit to snakemake/snakemake that referenced this pull request Mar 13, 2026
…#4051)

* Add an AI-assisted contributions policy taken mostly from Awkward
Array's (https://github.com/scikit-hep/awkward/), which was based on
Scikit-learn's Automated Contributions Policy.
* Add AI-assistance disclosure checkboxes to pull request template.
* c.f.
   - scikit-hep/awkward#3831
   - scikit-learn/scikit-learn#32566

Note that the Awkward Array language is more pro-AI-usage while the
Scikit-learn language is more neutral.

### Context

This was discussed in the AI section of the [2026 Snakemake
Hackathon](https://indico.cern.ch/event/1574891/) at TUM ([GitHub
project board](https://github.com/orgs/snakemake/projects/8)).


### QC

* [N/A] The PR contains a test case for the changes or the changes are
already covered by an existing test case.
* [x] The documentation (`docs/`) is updated to reflect the changes or
this is not necessary (e.g. if the change does neither modify the
language nor the behavior or functionalities of Snakemake).


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Documentation**
* Added an "AI-assisted contributions" subsection to the contribution
guide covering attribution, limits on automation, reviewer expectations,
and when to disclose significant AI assistance.
* Updated the pull request template to include an AI-assistance
disclosure section and checklist to ensure contributors declare use of
AI tools during submission and review.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Johannes Köster <johannes.koester@uni-due.de>