Remove GroupMetricResult by riedgar-ms · Pull Request #287 · fairlearn/fairlearn

riedgar-ms · 2020-02-05T14:27:02Z

Sample PR to remove the GroupMetricResult type and replace it with a Bunch. This does not yet fix the notebooks.

Signed-off-by: Richard Edgar <riedgar@microsoft.com>

adrinjalali

Glad to see this tested :)

fairlearn/metrics/_group_metric_set.py

adrinjalali · 2020-02-05T15:33:01Z

fairlearn/metrics/_metrics_engine.py

        result.overall = metric_function(y_a, y_p, **kwargs)

    groups = np.unique(group_membership)
+    result.by_group = dict()


Bunch is a dict, and it seems the only thing we're putting in it here is another dict inside by_group. Doesn't it then make sense to put those results directly into result instead of under by_group?

That would be a problem if someone decides to have an 'overall' entry in their group_membership vector. Not sure why someone would, but that's either something else to check, or a charming pitfall for some user.

But we can check for that, can't we?

Yes - as I said it's something else to check. I just think that it would be a bit odd to restrict the keys a user could use in their data. Especially as they most likely won't notice that bit of the documentation, but will instead end up with a sudden exception.

I had the same thought as @adrinjalali. I think we should try to make the result shallow. Let's say that the two groups are "male" and "female". Then we could use keys like:
"group_male", "group_female", "overall", "min", "max", "range", "range_ratio"
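
The flat layout proposed above can be sketched as follows. This is a minimal illustration, not the code in the PR: the helper name `flatten_group_results` and the sample values are assumptions made up for this example. It shows how a `group_` prefix keeps a group that happens to be called 'overall' from overwriting the overall value.

```python
def flatten_group_results(overall, by_group):
    """Build a flat result dict (hypothetical helper, for illustration only).

    Each group key gets a 'group_' prefix so that it cannot collide with
    reserved keys such as 'overall'.
    """
    result = {"overall": overall}
    for group, value in by_group.items():
        result["group_{0}".format(group)] = value
    return result


# A group literally named 'overall' ends up under 'group_overall'.
flat = flatten_group_results(
    0.8, {"male": 0.75, "female": 0.85, "overall": 0.5}
)
print(sorted(flat))
```

A prefix does not remove every collision: group labels whose string forms match, such as the integer 1 and the string "1", would both map to "group_1" in this sketch.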

I've updated this again. Please take a look at test_metrics_engine.py and let me know if you like the new syntax. I would like to have this agreed before going through and changing all of the other tests.

@MiroDudik @adrinjalali

I was discussing this with @rihorn2 at lunch, and I like this even less. Going to this level of change will require changes to the dashboard too. I do not see the benefits.

it'd be nice if people who are not there in your office could see the result of your discussions in a better way :)

The result of the discussion: what is the benefit of doing this ('this' being any of this change)?

the hope was to better support the use cases outlined in: https://github.com/fairlearn/fairlearn/issues/257#issuecomment-580811906
and also remove frictions that come about when you introduce new data types:
https://github.com/fairlearn/fairlearn/issues/257#issuecomment-581414697

adrinjalali

thanks @riedgar-ms

adrinjalali · 2020-02-10T16:44:53Z

fairlearn/metrics/_metrics_engine.py

+_OVERALL = 'overall'
+_BY_GROUP_FORMAT = 'group_{0}'
+_MIN = 'min'
+_MAX = 'max'
+_RANGE = 'range'
+_RANGE_RATIO = 'range_ratio'
+_ARGMIN = 'argmin'
+_ARGMAX = 'argmax'


I personally find this less readable than having those literals directly in the code. Especially when they're used pretty much only once.

adrinjalali · 2020-02-10T16:51:05Z

fairlearn/metrics/_metrics_engine.py

+        result[_RANGE] = result_range
+        result[_RANGE_RATIO] = range_ratio


how often are these two used? If not often, the user could compute them from the other values.

I would actually be happy to see all of these extra fields eliminated. Since there's no constraint on changing the other members, they could easily be inconsistent. That's why I changed GroupMetricResult to computing them dynamically.
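
The consistency concern raised here can be illustrated with a small sketch. The class below is a hypothetical stand-in, not the actual GroupMetricResult; it only shows the general idea of deriving the statistics on access so that they cannot drift from the per-group values.

```python
class GroupSummary:
    """Hypothetical stand-in: derived statistics are computed on access."""

    def __init__(self, overall, by_group):
        self.overall = overall
        self.by_group = dict(by_group)

    @property
    def minimum(self):
        return min(self.by_group.values())

    @property
    def maximum(self):
        return max(self.by_group.values())

    @property
    def range(self):
        # Always recomputed from the current per-group values.
        return self.maximum - self.minimum


summary = GroupSummary(-16, {0: -10, 1: -6})
range_before = summary.range

# Changing a per-group value is reflected immediately in the statistics.
summary.by_group[1] = -2
range_after = summary.range
```

With stored fields in a plain dict, the same update to one group would leave 'max' and 'range' stale unless the caller recomputed them by hand.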

adrinjalali · 2020-02-10T16:52:08Z

fairlearn/metrics/_metrics_engine.py

+    # Compute all the statistics, taking care not to stomp on our dictionary
+    try:
+        minimum = min(result.values())
+        argmin = [k for k, v in result.items() if v == minimum]


use numpy.argmin and numpy.argmax instead?

I don't believe those return the keys of a dict, but indices into an array?

right, sorry, here min(result, key=result.get) would work I believe.

That appears to return only one of the minima.
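
The difference between the two approaches shows up when several groups tie for the minimum. A short sketch with made-up values:

```python
result = {"a": 0.25, "b": 0.5, "c": 0.25}

# min() with a key function returns a single key: the first one
# encountered among the tied minima.
single = min(result, key=result.get)

# The list comprehension from the diff keeps every key that attains
# the minimum value.
minimum = min(result.values())
all_minima = [k for k, v in result.items() if v == minimum]
```

Here `single` is one key while `all_minima` is a list, so the two forms also differ in return type, which matters to any caller that consumes 'argmin'.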

adrinjalali · 2020-02-10T16:54:18Z

fairlearn/metrics/_metrics_engine.py

+        result[_RANGE] = result_range
+        result[_RANGE_RATIO] = range_ratio
+    except ValueError:
+        # Nothing to do if the result type is not amenable to 'min' etc.


when does this happen?

When the metric is something like a confusion matrix, for which min() is not defined.

I guess in that case, it should also be tested?

The test_matrix_metric test perhaps?
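
A minimal sketch of the situation being discussed, assuming the per-group values are NumPy arrays (as a confusion matrix would be). The helper name `try_minimum` is made up for this example. Comparing two arrays yields an array of booleans, and the built-in min() cannot reduce that to a single truth value, so it raises ValueError.

```python
import numpy as np


def try_minimum(values):
    """Return min(values), or None when the values cannot be ordered.

    Hypothetical helper mirroring the try/except in the diff above.
    """
    try:
        return min(values)
    except ValueError:
        # Raised when comparing multi-element arrays element-wise.
        return None


scalar_results = [0.25, 0.5]
matrix_results = [
    np.array([[1, 2], [3, 4]]),
    np.array([[0, 1], [2, 3]]),
]

scalar_min = try_minimum(scalar_results)
matrix_min = try_minimum(matrix_results)
```

Note that the exception only occurs once a comparison actually happens, so a test for this path needs at least two groups.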

adrinjalali · 2020-02-10T16:55:16Z

test/unit/metrics/test_group_metric_set.py

-        assert exception_context.value.args[0] == expected
-

 class TestConsistencyCheck:


curious, why are the tests in classes?

Can't remember offhand why these ones are.

they may be remnants of nose or some other testing tool. pytest doesn't need them.

We've always been pytest, but haven't been consistent in whether we use test classes or not. It might be down to the fact that the _metrics_engine module had two routines, so I had a class for each of them. That would be a remnant from before we reorganised the namespaces.
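
For reference, pytest collects both styles mentioned in this thread: module-level functions named `test_*` and `test_*` methods on classes named `Test*`. The sketch below uses a made-up helper, `compute_range`, purely to have something to assert on.

```python
def compute_range(by_group):
    """Hypothetical helper: spread between largest and smallest value."""
    values = list(by_group.values())
    return max(values) - min(values)


# Class-based style: useful mainly for grouping related tests.
class TestComputeRange:
    def test_two_groups(self):
        assert compute_range({0: -10, 1: -6}) == 4


# Plain-function style: no class is required by pytest.
def test_two_groups():
    assert compute_range({0: -10, 1: -6}) == 4
```

Both forms run the same assertion, so the choice is one of organisation rather than capability.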

adrinjalali · 2020-02-10T16:57:34Z

test/unit/metrics/test_metrics_engine.py

+        assert result['overall'] == -16
+        assert result['group_0'] == -10
+        assert result['group_1'] == -6
+        assert result['min'] == -10
+        assert result['max'] == -6
+        assert result['range'] == 4
+        assert np.isnan(result['range_ratio'])


if the result is a Bunch, you could still access them as result.min etc.

I decided against using that.

also, for some of these you could keep the old key name, to keep it more backward compatible.

I decided against using that.

care to elaborate why?

Thinking it over, I wasn't keen on a dependency on undocumented behaviour.

But we talked about it and the conclusion was to document it. That's why I've created scikit-learn/scikit-learn#16404

Also, if it's supposed to not be used, you could use a dict instead of a Bunch.

I am using a dict.
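
To make the distinction in this thread concrete, here is a minimal pure-Python stand-in for a dict that also allows attribute access. It is not scikit-learn's Bunch implementation; the class name `AttrDict` is an assumption for this sketch.

```python
class AttrDict(dict):
    """Hypothetical stand-in: a dict whose keys can be read as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)


result = AttrDict(min=-10, max=-6)
by_attribute = result.min
by_key = result["min"]

# Keys that share a name with a dict method are shadowed by that method
# under attribute access, but remain reachable by key.
shadowed = AttrDict(items=3)
items_is_method = callable(shadowed.items)
items_by_key = shadowed["items"]
```

In this stand-in, key access is the only form that works for every key, which is one practical argument for documenting exactly which access style callers may rely on.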

riedgar-ms · 2020-02-11T21:26:02Z

Per discussions on #257, this is not needed right now.

riedgar-ms added 4 commits February 5, 2020 08:20

Remove the class and its direct tests

f04b668

Signed-off-by: Richard Edgar <riedgar@microsoft.com>

We require bunch

463c91c

Signed-off-by: Richard Edgar <riedgar@microsoft.com>

Update some of the tests

0208489

Signed-off-by: Richard Edgar <riedgar@microsoft.com>

Convert the GMS tests too

3c5db18

Signed-off-by: Richard Edgar <riedgar@microsoft.com>

riedgar-ms requested review from MiroDudik and romanlutz February 5, 2020 14:27

adrinjalali reviewed Feb 5, 2020

Switch to sklearn's Bunch

c37b90a

Signed-off-by: Richard Edgar <riedgar@microsoft.com>

adrinjalali mentioned this pull request Feb 5, 2020

utils.Bunch's documentation. scikit-learn/scikit-learn#16390

Closed

romanlutz approved these changes Feb 7, 2020

riedgar-ms added 3 commits February 10, 2020 11:09

Flattening the dictionary as requested

d998723

Not all required test changes in place yet Signed-off-by: Richard Edgar <riedgar@microsoft.com>

Finish converting one batch of tests

372e617

Signed-off-by: Richard Edgar <riedgar@microsoft.com>

Finish test conversions (for this file)

d38ed99

Signed-off-by: Richard Edgar <riedgar@microsoft.com>

adrinjalali reviewed Feb 10, 2020

riedgar-ms closed this Feb 11, 2020

riedgar-ms deleted the riedgar-ms/remove-group-metric-result branch March 19, 2020 16:54
