
Autoeval config #4234

Merged May 5, 2022
lhoestq merged 13 commits into huggingface:master from nazneenrajani:autoeval
Conversation

@nazneenrajani

Added autoeval config to imdb as pilot

@nazneenrajani nazneenrajani requested review from lewtun and lhoestq April 27, 2022 05:32
@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Apr 27, 2022

The documentation is not available anymore as the PR was closed or merged.

@lewtun
Member

@lewtun lewtun left a comment Apr 27, 2022

Thanks for kicking off the metadata additions for evaluation @NRajani 🚀 ! I've left a few small nits, but otherwise this LGTM!

metrics:
  - type: accuracy
  - type: f1
    name: f1_macro

Perhaps it's better to use a "pretty name" here (and the other F1 metrics)?

Suggested change:
-    name: f1_macro
+    name: F1 macro
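For context, a full `train-eval-index` entry with a pretty metric name might look like the sketch below. The top-level field names follow the spec discussed later in this thread, but the values and the exact `splits` sub-keys are illustrative guesses, not necessarily the YAML that was merged for imdb:

```yaml
train-eval-index:
- config: plain_text
  task: text-classification
  task_id: binary_classification
  splits:
    train: train
    eval: test
  col_mapping:
    text: text
    label: target
  metrics:
  - type: accuracy
    name: Accuracy
  - type: f1
    name: F1 macro
```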

@lewtun
Member

lewtun commented Apr 27, 2022

The tests are failing due to the changed metadata:

got an unexpected keyword argument 'train-eval-index'

I think you can fix this by updating the DatasetMetadata class and implementing an appropriate validate_train_eval_index() function.

@lhoestq we are working with an arbitrary set of tags for the autoeval config. See https://github.com/huggingface/autonlp-backend/issues/414
I still need to add a validator function for the tests to pass, but our tag set is not well-defined like the ones under https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources. What's a workaround for this?

@lewtun
Member

lewtun commented May 3, 2022

On the question of validating the train-eval-index metadata, I think the simplest approach would be to validate that the required fields exist and not worry about their values (which are open-ended).

For me, the required fields include:

  • config
  • task
  • task_id
  • splits (train / validation / eval)
  • col_mapping
  • metrics (checking that each one has type, name)

Here I'm using the spec defined in https://github.com/huggingface/autonlp-backend/issues/414 as a guide.

WDYT @lhoestq ?
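The required-fields check proposed above could be sketched roughly as follows. This is a hypothetical illustration of the idea (validate presence, ignore open-ended values), not the validator that was actually merged in this PR:

```python
# Hypothetical sketch of the required-fields validation proposed above;
# not the actual code merged in this PR.
REQUIRED_FIELDS = {"config", "task", "task_id", "splits", "col_mapping", "metrics"}


def validate_train_eval_index(train_eval_index):
    """Check that each train-eval-index entry carries the required fields."""
    if not isinstance(train_eval_index, list):
        raise TypeError("train-eval-index must be a list")
    for entry in train_eval_index:
        missing = REQUIRED_FIELDS - entry.keys()
        if missing:
            raise ValueError(f"train-eval-index entry is missing fields: {sorted(missing)}")
        # Values are open-ended, but each metric needs at least `type` and `name`.
        for metric in entry["metrics"]:
            if not {"type", "name"} <= metric.keys():
                raise ValueError("each metric must define 'type' and 'name'")
```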

@lhoestq
Member

lhoestq commented May 4, 2022

Makes sense ! Currently the metadata type validator doesn't support subfields - let me open a PR to add it

lhoestq and others added 2 commits May 4, 2022 15:51
- support YAML keys with dashes
- add train-eval-index validation
@lhoestq
Member

lhoestq commented May 4, 2022

I ended up improving the metadata validation in this PR x)

In particular:

  • I added support for YAML keys with dashes instead of underscores for train-eval-index
  • I added train-eval-index validation via validate_train_eval_index. It does nothing fancy; it just checks that the field is a list if it exists in the YAML, but feel free to improve it.

Let me know if it sounds good to you ! I think we can improve validate_train_eval_index in another PR
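The minimal behavior described here (only check that the field, when present, is a list) amounts to something like the sketch below. This is a hypothetical paraphrase, not the exact implementation from the PR:

```python
# Minimal sketch of the behavior described above: only verify that
# `train-eval-index`, when present in the YAML, is a list.
# Hypothetical code, not the exact implementation from the PR.
def validate_train_eval_index(train_eval_index):
    if train_eval_index is not None and not isinstance(train_eval_index, list):
        raise TypeError(
            f"train-eval-index must be a list, got {type(train_eval_index).__name__}"
        )
```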

@lewtun lewtun left a comment

Thanks for adding a special validator for train-eval-index @lhoestq ! I think this approach looks great 🤗

@lhoestq
Member

lhoestq commented May 4, 2022

Come on windows... I didn't do anything advanced...

Anyway, will try to fix this when I get back home x)

@lewtun
Member

lewtun commented May 4, 2022

> Come on windows... I didn't do anything advanced...
>
> Anyway, will try to fix this when I get back home x)

Hehe, thanks!

@nazneenrajani
Author

Thanks, @lhoestq this is great!

@lhoestq
Member

lhoestq commented May 5, 2022

Did I just fix it for windows and now it fails on linux ? xD

@lewtun
Member

lewtun commented May 5, 2022

> Did I just fix it for windows and now it fails on linux ? xD

Looks like the Heisenberg uncertainty principle is at play here - you cannot simultaneously have unit tests passing in both Linux and Windows 😅

@lhoestq
Member

lhoestq commented May 5, 2022

The worst is that the tests pass locally both on my windows and my linux x)

@lhoestq
Member

lhoestq commented May 5, 2022

Ok fixed it, the issue came from Python 3.6, which doesn't return the expected `__origin__` for `Dict` and `List` types
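For background on that incompatibility: on Python 3.7+ the runtime `__origin__` of a typing generic is the plain builtin container, whereas on 3.6 it was the typing alias itself, which breaks naive identity checks. A version-tolerant lookup could look like this hypothetical helper (an illustration of the pitfall, not the PR's actual fix):

```python
from typing import Dict, List

# On Python 3.7+, Dict[str, int].__origin__ is the builtin `dict`;
# on 3.6 it was `typing.Dict`, so checks like `tp.__origin__ is dict`
# silently fail there. Hypothetical version-tolerant helper:
def origin_of(tp):
    origin = getattr(tp, "__origin__", None)
    # Map 3.6-style typing aliases back to the builtins they wrap.
    if origin is Dict:
        return dict
    if origin is List:
        return list
    return origin
```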

@lhoestq lhoestq left a comment

Alright thanks for adding the first Autoeval config ! :D

@lhoestq lhoestq merged commit 6af556b into huggingface:master May 5, 2022
@lewtun
Member

lewtun commented May 5, 2022

> Alright thanks for adding the first Autoeval config ! :D

Woohoo! Thank you so much 🤗

@fxmarty
Contributor

fxmarty commented May 6, 2022

This is cool!
