DOC add paragraph on "AI usage disclosure" to Automated Contributions Policy and PR Template #32566
Conversation
Thanks @AnneBeyer , looks good! Just not sure how best to put the 'select one of the following' part.
Co-authored-by: Lucy Liu <jliu176@gmail.com>
doc/developers/contributing.rst
> We agree that AI can be a useful development assistant, but the use of any kind of
> AI assistance has to be disclosed in the PR description. Not doing so is not only
> rude to the human maintainers, it also makes it difficult to determine how much
> scrutiny needs to be applied to the contribution.
I'd make this shorter and more of a statement without opinion (e.g. is or isn't AI useful).
For example "If you use AI tools please state so in your Pull Request description."
Yes, I can make it shorter, but should this also become more of a "please do so", or keep the stricter tone? Or should we even be more explicit on the consequences, like in the paragraph above on fully-automatic submissions ("Maintainers reserve the right, at their sole discretion, to close such submissions")?
Not sure. I like being polite.
I also like to try and keep it short, because no one reads anything, and the longer the text, the fewer people read it :-/
An AI contributions policy that is quite good: https://github.com/zulip/zulip/blob/main/CONTRIBUTING.md#ai-use-policy-and-guidelines - maybe there is something we can borrow/copy from them
I added an updated version that
@betatim: I like the Zulip AI policy section and integrated some parts, but the whole thing kind of contradicts our keeping-it-short discussion here.
lucyleeow
left a comment
Some nits only but looks good to me, thank you for the changes!
@StefanieSenger originally added both these sections, I think. I would like to wait for her opinion as she has more background/context here.
I like the Zulip guide, e.g., this is nice:
It is long though, and I am not sure we want to spend the time agreeing on the nitty-gritty details of our AI use policy (or maintaining it as AI use and tool performance develop). It would be nice if we could all just point to one source for this. Edit: FYI, re-started the failing CI 🤞
> <!--
> If AI tools were involved in creating this PR, please disclose their usage here and make
> sure that you adhere to our Automated Contributions Policy:
> https://scikit-learn.org/dev/developers/contributing.html#automated-contributions-policy
> -->
Referring to an earlier version where you had a list of AI assistance that was used. stdlib uses a similar list (https://github.com/stdlib-js/stdlib/blob/develop/.github/PULL_REQUEST_TEMPLATE.md#ai-assistance) with checkboxes. Maybe a checkbox to state whether they used AI makes it more explicit/difficult to ignore. e.g., this is what a stdlib PR looks like:
I like the checkbox approach.
For reference, here is what I had initially (minus the correct formatting):
Please select one of the following:
- No AI assistance was used in the creation of this PR.
- I used AI assistance in the creation of this PR (specifically <ADD TOOLS/DETAILS HERE>), but I confirm that I checked and understood all changes and can explain them on request.
- This PR was created by an AI Agent.
The stdlib template goes on with a disclaimer section similar to what I added in the second option:
Disclosure
If you answered "yes" to using AI assistance, please provide a short disclosure indicating how you used AI assistance. This helps reviewers determine how much scrutiny to apply when reviewing your contribution. Example disclosures: "This PR was written primarily by Claude Code." or "I consulted ChatGPT to understand the codebase, but the proposed changes were fully authored manually by myself.".
{{TODO: add disclosure if applicable}}
I'm not sure what the best way is here to avoid adding too much burden on the contributor and maintainer side, while still having some easy way of detecting non-compliant PRs.
Maybe we can start by copying just the upper part of the stdlib checklists and see how good Agents are at adapting to that?
Jumping in from the side with a comment 😅: I think adding this to the PR template is the wrong approach. I can really relate to the argument that we should not add additional burden / bureaucracy for people who want to contribute, nor for reviewers.
My hope in adding "disclose AI use" to our policy would be to make it easier to tell off people who use AI in an irresponsible, harmful way, but I really don't care whether someone has used a bit of AI or none at all, and I don't want maintainers to be in the position of investigating what people claim to be doing compared to what they are actually doing.
In my opinion, informing people in contributing.rst that they have to disclose AI use in the PR is enough. Hardly anybody will do that, and that's fine. We don't want to discuss with AI spammers whether they have pushed AI code they have not reviewed at all or whether they have only used AI as a helper.
If they fail to disclose their AI usage, we can then easily tell people who irresponsibly open gen-AI PRs that they did two things wrong: 1. hardly or not at all supervising their gen-AI PRs, and 2. not telling us about it.
That puts us in a position where we don't need to point people to the Automated Contributions Policy, because we expect them to be informed. This in turn makes it easier to deal with those cases. (I am speaking of the nastiest 3% of contributions; the rest of the contributors would be untouched by this, no matter whether they use AI as a helper or not.)
Yeah good thoughts.
> Hardly anybody will do that and that's fine.
I think this is the situation that's not ideal; I have not seen anyone offer this info in a PR. What do you think of a simple yes/no checkbox for use of AI? It sort of forces a response, as it makes it 'standard'. It would be nice to know before reviewing, and it would hopefully make people less reluctant to share: AI is very commonly used for research/understanding (and even from a data-collection perspective it may be interesting).
I see the value in having a clear signal from PR authors for reviewers. Sorry that I didn't reflect on that and only expressed concerns in my last message. I didn't mean it to sound dismissive, @lucyleeow and @AnneBeyer.
> What do you think of a simple checkbox yes/no for use of AI?
I like it. Though posed like this, I am sure most people would have to check "yes". (I certainly always would, since I constantly chat back and forth with LLMs to have them explain the coding world to me.) What we mean is whether they used AI as a coding assistant, I think.
What do you think of adding a simple checkbox "[ ] AI assistance used for coding?" without a "yes"/"no" option, which you only check if you have used AI for coding?
We could try it to collect some data on how people use it and if it is useful for reviewers and adjust later if it doesn't prove helpful.
I think that, so far, we have not asked people to disclose their AI usage (adding this to the guidelines is also part of this PR), so I'm not too surprised people don't do it yet.
I think we can go with a trial-and-error approach here. Adding the section heading to the template is a first step towards making it more obvious that this disclosure is expected. Adding a checklist could actually also make it less effort for both sides: we might not get information as detailed as with a free-text field (like which kind of tools people used), but people might be more likely to actually set a check mark than to fill in text. So we could go with something like this (and observe what AI Agents make of it for a while):
I used AI assistance for (please check all that apply):
- [ ] Code generation (e.g., when writing an implementation or fixing a bug)
- [ ] Test/benchmark generation
- [ ] Documentation (including examples)
- [ ] Research and understanding
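Put together, the proposed disclosure section might look something like the following sketch of a PR template fragment. Note that the exact wording, heading level, and placement in `.github/PULL_REQUEST_TEMPLATE.md` are illustrative assumptions based on this discussion, not necessarily the merged version:

```markdown
<!--
If AI tools were involved in creating this PR, please disclose their usage here and make
sure that you adhere to our Automated Contributions Policy:
https://scikit-learn.org/dev/developers/contributing.html#automated-contributions-policy
-->

#### AI usage disclosure

I used AI assistance for (please check all that apply):

- [ ] Code generation (e.g., when writing an implementation or fixing a bug)
- [ ] Test/benchmark generation
- [ ] Documentation (including examples)
- [ ] Research and understanding
```

The `- [ ]` syntax renders as interactive task-list checkboxes on GitHub, so contributors can tick items directly in the PR description without editing markdown.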
What do you think? @lucyleeow @StefanieSenger
I'm fine with this, especially since it's important to others on the team.
I do still have concerns about adding an extra task for contributors while collecting information that may not be reliable, but we can try it and adjust the PR template later if it turns out not to be helpful, or once we feel we've learned enough from it.
I'm +1 for this. I think a checkbox is easy enough to do, so I don't think it will be a burden in that sense.
But, I can understand that it would be a 'burden' because some people may feel like we would 'value' their contribution less if they say they used AI. As you said, we can always iterate.
Co-authored-by: Tim Head <betatim@gmail.com>
betatim
left a comment
I like it. Let's use it and see what happens.
I think the open discussion from Lucy has converged, but I'll let the participants declare that themselves. So won't merge yet.
Would you like to have another look, @lucyleeow, @StefanieSenger?
lucyleeow
left a comment
Sorry 2 nits and then I will merge!
Enabling auto merge, hopefully no CI timeouts 🤞
… Policy and PR Template (scikit-learn#32566) Co-authored-by: Lucy Liu <jliu176@gmail.com> Co-authored-by: Tim Head <betatim@gmail.com>
…#4051)
* Add an AI-assisted contributions policy taken mostly from Awkward Array's (https://github.com/scikit-hep/awkward/), which was based on scikit-learn's Automated Contributions Policy.
* Add AI-assistance disclosure checkboxes to the pull request template.
* c.f. scikit-hep/awkward#3831, scikit-learn/scikit-learn#32566. Note that the Awkward Array language is more pro-AI-usage while the scikit-learn language is more neutral.

Context: this was discussed in the AI section of the [2026 Snakemake Hackathon](https://indico.cern.ch/event/1574891/) at TUM ([GitHub project board](https://github.com/orgs/snakemake/projects/8)).

QC:
* [N/A] The PR contains a test case for the changes, or the changes are already covered by an existing test case.
* [x] The documentation (`docs/`) is updated to reflect the changes, or this is not necessary (e.g. if the change modifies neither the language nor the behavior or functionalities of Snakemake).

Co-authored-by: Johannes Köster <johannes.koester@uni-due.de>
Reference Issues/PRs
First draft towards extending the Automated Contributions Policy for PRs to require a disclosure of AI usage, as discussed towards the end of #31679
What does this implement/fix? Explain your changes.
Adds a paragraph on the required disclosure of AI usage to the Automated Contributions Policy and extends the PR template with a corresponding selection.
AI usage disclosure
(Though it could be useful to play around with different formulations/AI suggestions in this case...)
Any other comments?
Any comments/suggestions on the wording are welcome!