[MRG + 1] FIX raise an error message when n_groups > number of groups (#7681)#7683
[MRG + 1] FIX raise an error message when n_groups > number of groups (#7681)#7683amueller merged 3 commits intoscikit-learn:masterfrom
Conversation
sklearn/model_selection/_split.py
Outdated
| unique_groups = np.unique(groups) | ||
| if self.n_groups > len(unique_groups): | ||
| raise ValueError("Cannot have n_groups=%d greater than the number " | ||
| "of unique groups: %d." |
There was a problem hiding this comment.
i find it odd that you choose to use = to indicate the variable value in for n_groups and : for the number of unique groups; maybe keep it consistent?
There was a problem hiding this comment.
I copied the format from this other error in the same file, should I change it?
There was a problem hiding this comment.
If all the file use =, you can keep =.
There was a problem hiding this comment.
I think it's okay as n_groups is a parameter and unique groups is the information we are trying to include in the error message...
|
Thanks for the PR @polmauri |
sklearn/model_selection/_split.py
Outdated
| unique_groups = np.unique(groups) | ||
| if self.n_groups > len(unique_groups): | ||
| raise ValueError("Cannot have n_groups=%d greater than the number " | ||
| "of unique groups: %d." |
There was a problem hiding this comment.
I think it's okay as n_groups is a parameter and unique groups is the information we are trying to include in the error message...
| list(LeavePGroupsOut(n_groups=5).split(X, y, groups)) | ||
|
|
||
| assert_raises(ValueError, leave_one_group_out_zero_groups) | ||
| assert_raises(ValueError, leave_p_groups_out_greater_than_unique_groups) |
There was a problem hiding this comment.
Instead of defining functions you could pass the parameters to assert_raise_message / assert_raises and test these like
assert_raise_message(ValueError, "<snippet of the error message">,
LeaveOneGroupOut().split, X, y)
| assert_equal(3, LeaveOneGroupOut().get_n_splits(X, y, groups)) | ||
|
|
||
|
|
||
| def test_leave_group_out_split_errors(): |
There was a problem hiding this comment.
--> test_leave_one_p_group_out_error_on_fewer_number_of_groups
|
Thanks @polmauri. |
| X = y = groups = np.ones(10) | ||
| list(LeavePGroupsOut(n_groups=5).split(X, y, groups)) | ||
|
|
||
| assert_raises(ValueError, leave_one_group_out_zero_groups) |
There was a problem hiding this comment.
Could you add the ValueError message raised by leave_one_group_out_zero_groups to be sure it is the right message.
| list(LeavePGroupsOut(n_groups=5).split(X, y, groups)) | ||
|
|
||
| assert_raises(ValueError, leave_one_group_out_zero_groups) | ||
| assert_raises(ValueError, leave_p_groups_out_greater_than_unique_groups) |
There was a problem hiding this comment.
Could you add the ValueError message raised by leave_one_group_out_zero_groups to be sure it is the right message.
|
Changed:
|
…cikit-learn#7681) This change addresses issue scikit-learn#7681: - Raise ValueError when n_groups > actual number of unique groups in LeaveOneGroupOut and LeavePGroupsOut. - Add unit test.
- Check error message with `assert_raise_message` - Pass parameters to `assert_raise_message` instead of defining functions
sklearn/model_selection/_split.py
Outdated
| # We make a copy of groups to avoid side-effects during iteration | ||
| groups = np.array(groups, copy=True) | ||
| unique_groups = np.unique(groups) | ||
| if len(unique_groups) == 0: |
sklearn/model_selection/_split.py
Outdated
| groups = np.array(groups, copy=True) | ||
| unique_groups = np.unique(groups) | ||
| if len(unique_groups) == 0: | ||
| raise ValueError("Cannot have zero groups") |
There was a problem hiding this comment.
"The groups parameter contains fewer than 2 unique groups (%s). LeaveOneGroupOut expects at least 2."
% (unique_groups)
sklearn/model_selection/_split.py
Outdated
| raise ValueError("The groups parameter should not be None") | ||
| groups = np.array(groups, copy=True) | ||
| unique_groups = np.unique(groups) | ||
| if self.n_groups > len(unique_groups): |
sklearn/model_selection/_split.py
Outdated
| groups = np.array(groups, copy=True) | ||
| unique_groups = np.unique(groups) | ||
| if self.n_groups > len(unique_groups): | ||
| raise ValueError("Cannot have n_groups=%d greater than the number " |
There was a problem hiding this comment.
"The groups parameter contains fewer than (or equal to) n_groups (%d) numbers of unique groups"
" (%s). LeavePGroupsOut expects that atleast n_groups + 1 (%d) unique groups be present"
% (n_groups, unique_groups, n_groups + 1)|
@polmauri are you still working on this? |
|
Changes requested done. |
|
Thanks for addressing all the comments. Looks good to me. |
|
thanks :) |
…scikit-learn#7681) (scikit-learn#7683) * FIX raise an error message when n_groups > actual number of groups (scikit-learn#7681) This change addresses issue scikit-learn#7681: - Raise ValueError when n_groups > actual number of unique groups in LeaveOneGroupOut and LeavePGroupsOut. - Add unit test. * Make requested changes - Check error message with `assert_raise_message` - Pass parameters to `assert_raise_message` instead of defining functions * Update condition and exception message
…scikit-learn#7681) (scikit-learn#7683) * FIX raise an error message when n_groups > actual number of groups (scikit-learn#7681) This change addresses issue scikit-learn#7681: - Raise ValueError when n_groups > actual number of unique groups in LeaveOneGroupOut and LeavePGroupsOut. - Add unit test. * Make requested changes - Check error message with `assert_raise_message` - Pass parameters to `assert_raise_message` instead of defining functions * Update condition and exception message
…scikit-learn#7681) (scikit-learn#7683) * FIX raise an error message when n_groups > actual number of groups (scikit-learn#7681) This change addresses issue scikit-learn#7681: - Raise ValueError when n_groups > actual number of unique groups in LeaveOneGroupOut and LeavePGroupsOut. - Add unit test. * Make requested changes - Check error message with `assert_raise_message` - Pass parameters to `assert_raise_message` instead of defining functions * Update condition and exception message
…scikit-learn#7681) (scikit-learn#7683) * FIX raise an error message when n_groups > actual number of groups (scikit-learn#7681) This change addresses issue scikit-learn#7681: - Raise ValueError when n_groups > actual number of unique groups in LeaveOneGroupOut and LeavePGroupsOut. - Add unit test. * Make requested changes - Check error message with `assert_raise_message` - Pass parameters to `assert_raise_message` instead of defining functions * Update condition and exception message
…scikit-learn#7681) (scikit-learn#7683) * FIX raise an error message when n_groups > actual number of groups (scikit-learn#7681) This change addresses issue scikit-learn#7681: - Raise ValueError when n_groups > actual number of unique groups in LeaveOneGroupOut and LeavePGroupsOut. - Add unit test. * Make requested changes - Check error message with `assert_raise_message` - Pass parameters to `assert_raise_message` instead of defining functions * Update condition and exception message
…scikit-learn#7681) (scikit-learn#7683) * FIX raise an error message when n_groups > actual number of groups (scikit-learn#7681) This change addresses issue scikit-learn#7681: - Raise ValueError when n_groups > actual number of unique groups in LeaveOneGroupOut and LeavePGroupsOut. - Add unit test. * Make requested changes - Check error message with `assert_raise_message` - Pass parameters to `assert_raise_message` instead of defining functions * Update condition and exception message
This change fixes issue #7681:
ValueErrorwhenn_groups> actual number of unique groups inLeaveOneGroupOutandLeavePGroupsOut.