Adds CodeClippy dataset [WIP] #2666

Closed
arampacha wants to merge 4 commits into huggingface:main from arampacha:code-clippy

@arampacha

CodeClippy is an open-source code dataset scraped from GitHub during the Flax/JAX community week:
https://the-eye.eu/public/AI/training_data/code_clippy_data/

@albertvillanova albertvillanova added the dataset contribution Contribution to a dataset script label Sep 23, 2022
@albertvillanova
Member

Thanks for your contribution, @arampacha. Are you still interested in adding this dataset?

We are removing the dataset scripts from this GitHub repo and moving them to the Hugging Face Hub: https://huggingface.co/datasets

We would suggest you create this dataset there. Please feel free to tell us if you need any help.

@oaguy1

oaguy1 commented Jul 26, 2023

Sorry to resurrect a dead issue, but any chance the dataset will make it to HuggingFace? I would love to use it to finetune Llama 2 and HF makes this a breeze. Also happy to submit a PR prepping it for HF if that is needed.
