Skip to content

Update new datasets layout.#16

Merged
Narsil merged 1 commit into
mainfrom
update_ds
Aug 17, 2023
Merged

Update new datasets layout.#16
Narsil merged 1 commit into
mainfrom
update_ds

Conversation

@Narsil

@Narsil Narsil commented Aug 17, 2023

Copy link
Copy Markdown
Contributor

No description provided.

@Narsil Narsil requested review from McPatate and lhoestq August 17, 2023 07:33
@Narsil Narsil merged commit 6b125ef into main Aug 17, 2023
@McPatate McPatate deleted the update_ds branch August 17, 2023 10:08
Comment thread src/api/sync.rs
);
let downloaded_path = api
.repo(repo)
.download("wikitext-103-v1/wikitext-test.parquet")

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure you don't just want to fix the version to a sha to avoid having to change this regularly?

@lhoestq lhoestq Aug 17, 2023

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI we don't plan to do any other breaking changes™  in the path names ;)

You can keep it like this

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to know about those changes.

Still early but I think mnist shouldn't receive that many changes aside from big revamps like this :)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I genuinely don't understand why you're not just using a sha for this, what's the point of having something that could potentially change in your test suite?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point is to make the actual request I'm interested about, this is the exact code used in the candle book.

I'm interested in making sure the answer from that specific call doesn't vary. I don't care (and don't want to ) pin a revision in a demo/tutorial.

Those integration sort of tests are by nature a bit flaky as they do depend on the exterior world (if the hub is down, or the dataset teams modifies the repo..). Having them few and far between is essential.
But this particular one I don't expect a lot of changes, and I'll attribute the recent changes to "bad timing". And the actual changes seem for the better as the structure is now more readable for instance.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants