MAINT: improve organization of dataset fetch functions (refactoring) #785
Separates each dataset fetching function into its own file for better organization and maintainability.
This is almost ready for review. I just have one question: should I update the blog post about Datasets and Seed Prompts? After the changes in this PR it will no longer be up to date, specifically the following paragraph: 2025_02_11.md#loading-datasets-with-seed-prompts. I'll definitely update the User guide for Datasets; I'm just wondering whether I should also modify the blog post 😄
Awesome! We usually don't update blog posts substantially, but this is easy enough of a fix that I'm inclined to make the change. CC @eugeniavkim I would replace
with
romanlutz left a comment
Thank you! This is perfect!
I see the checks are failing. I've run I'll try to fix the problem tomorrow 😃
There might be a naming collision, since fetch_examples is both the file name and the function name. But that's just a guess.
That's right, renaming did the trick! Thank you so much!
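For readers wondering how a file and a function sharing the name fetch_examples can collide, here is a minimal sketch of one way it can happen in Python. It is not confirmed to be the exact cause of the failing checks in this PR. The package name `demo_datasets` and the helper names are hypothetical and exist only for this illustration. The sketch builds a throwaway package whose `__init__.py` re-exports a function named after its own submodule, then reports what the package attribute ends up pointing at.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Create a hypothetical package where a submodule and a function share a name."""
    pkg = root / "demo_datasets"
    pkg.mkdir()
    (pkg / "fetch_examples.py").write_text(
        "def fetch_examples():\n    return ['example']\n"
    )
    # Re-exporting the function rebinds the package attribute `fetch_examples`,
    # which the import system had just set to the submodule.
    (pkg / "__init__.py").write_text("from .fetch_examples import fetch_examples\n")


def describe_collision() -> tuple:
    """Return the type names of the package attribute and the sys.modules entry."""
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_package(Path(tmp))
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            pkg = importlib.import_module("demo_datasets")
            attr_kind = type(pkg.fetch_examples).__name__
            module_kind = type(sys.modules["demo_datasets.fetch_examples"]).__name__
            return attr_kind, module_kind
        finally:
            # Clean up so the demo leaves no trace in the interpreter state.
            sys.path.remove(tmp)
            for name in [n for n in sys.modules if n.startswith("demo_datasets")]:
                del sys.modules[name]


if __name__ == "__main__":
    print(describe_collision())
```

In this sketch the package attribute resolves to the function while the submodule is still registered in `sys.modules`, so anything that walks attributes from the package expecting to reach the module (some tooling and patch targets do this) can be surprised. Giving the file and the function different names avoids the ambiguity.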
Fantastic, @paulinek13!!! Thanks once again for a great contribution.
Description
Related issue: #775
This PR is about refactoring the dataset fetching functions to improve their organization and maintainability as the codebase grows and new datasets are introduced.
🛠️ The main changes:
- Split fetch_example_datasets.py into separate files (similar to how converters are handled)
- Tests (tests/unit/datasets)
- __init__.py and docs (api.rst)

✏️ Other modifications:
- fetch_babelscape_alert_dataset and fetch_librAI_do_not_answer_dataset
- fetch_example_datasets.py to fetch_examples.py
- .pre-commit-config.yaml
- /doc files: doc/code/datasets/0_dataset.md, doc/code/datasets/2_fetch_dataset.ipynb, doc/code/datasets/2_fetch_dataset.py

Close #775