
Add tutorial for no-code dataset upload #2925

Merged
stevhliu merged 11 commits into huggingface:master from stevhliu:master
Sep 27, 2021

Conversation

@stevhliu
Member

This PR adds a tutorial for uploading a dataset to the Hub. It uses the Hub UI to upload the dataset, and it introduces the online tagging tool for creating tags and the Dataset card template for getting a head start on the dataset card. The tutorial should make it easier for beginners to upload a dataset without using the terminal or knowing Git.

@stevhliu added the documentation label on Sep 15, 2021
@lhoestq
Member

lhoestq commented Sep 17, 2021

Cool, love it! :)

Feel free to add a paragraph saying how to load the dataset:

from datasets import load_dataset

dataset = load_dataset("stevhliu/demo")

# or, to assign each CSV file to its own split
data_files = {"train": "train.csv", "test": "test.csv"}
dataset = load_dataset("stevhliu/demo", data_files=data_files)
print(dataset["train"][0])

@lhoestq
Member

lhoestq commented Sep 21, 2021

Perfect, feel free to mark this PR ready for review :)

cc @albertvillanova do you have any comments? You can check the tutorial here:
https://47389-250213286-gh.circle-artifacts.com/0/docs/_build/html/no_code_upload.html

Maybe we can just add a list of supported file types:

  • csv
  • json
  • json lines
  • text
  • parquet
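
As a rough illustration of how the file types listed above could be loaded locally, here is a minimal sketch. It assumes each type maps to a packaged `datasets` builder of the same name, with JSON Lines handled by the `json` builder. The helper names `builder_for` and `load_local` and the extension mapping are hypothetical and not part of the tutorial or the library.

```python
from pathlib import Path

# Assumed mapping from file extension to the name of a packaged `datasets` builder.
# JSON Lines files are read by the same "json" builder as regular JSON files.
BUILDER_BY_EXTENSION = {
    ".csv": "csv",
    ".json": "json",
    ".jsonl": "json",
    ".txt": "text",
    ".parquet": "parquet",
}


def builder_for(path: str) -> str:
    """Return the builder name for a data file (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix not in BUILDER_BY_EXTENSION:
        raise ValueError(f"Unsupported file type: {path!r}")
    return BUILDER_BY_EXTENSION[suffix]


def load_local(path: str):
    """Load a single local file; requires the `datasets` package to be installed."""
    # Imported lazily so that `builder_for` can be used without `datasets`.
    from datasets import load_dataset

    return load_dataset(builder_for(path), data_files=path)
```

For example, `load_local("train.csv")` would call `load_dataset("csv", data_files="train.csv")`, while a file with an unlisted extension raises a `ValueError` instead of guessing.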

@stevhliu marked this pull request as ready for review on September 22, 2021, 17:26
@lhoestq
Member

lhoestq commented Sep 23, 2021

I just added a mention of the login for private datasets. Don't hesitate to edit or comment.

Otherwise I think it's all good, feel free to merge it @stevhliu if you don't have other changes to make :)
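
For readers wondering what loading a private dataset might look like after logging in (for example with `huggingface-cli login`), here is a hedged sketch. It assumes that `load_dataset` accepts `use_auth_token=True` in older `datasets` releases and `token=True` from version 2.14 onward; check the documentation of your installed version before relying on this. The helper names `auth_kwargs` and `load_private` are hypothetical.

```python
def auth_kwargs(datasets_version: str) -> dict:
    """Pick the authentication keyword for a given `datasets` version.

    Assumption: `token` replaced `use_auth_token` starting with version 2.14.
    """
    major, minor = (int(part) for part in datasets_version.split(".")[:2])
    if (major, minor) >= (2, 14):
        return {"token": True}
    return {"use_auth_token": True}


def load_private(repo_id: str):
    """Load a private Hub dataset using the locally stored login token."""
    # Imported lazily so that `auth_kwargs` can be used without `datasets`.
    import datasets

    return datasets.load_dataset(repo_id, **auth_kwargs(datasets.__version__))
```

With this sketch, `load_private("stevhliu/demo")` would reuse the token saved at login rather than requiring it to be passed explicitly.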

Member

@albertvillanova left a comment

Awesome! This tutorial is very useful and simple. Thanks! 🤗

Only a minor comment below.

@stevhliu merged commit 332004e into huggingface:master on Sep 27, 2021