Fix predict method for multiclass multioutput ensemble models #12834

Merged
jnothman merged 4 commits into scikit-learn:master from elsander:multioutput-predict-string on Jan 2, 2019

@elsander
Contributor

Reference Issues/PRs

Fixes #12831.

What does this implement/fix? Explain your changes.

This PR fixes a bug where the predict method failed for multiclass multioutput ensemble models if any of the dependent variables were strings. The underlying issue was that the predict output was preallocated with np.zeros, which creates a float array and therefore raises an error when string predictions are inserted. I replaced that call with a dtype-agnostic call to np.empty using dtype='object'.
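The failure mode described above can be reproduced with plain NumPy. The following is a minimal sketch, not the scikit-learn code itself: `classes`, `best_idx`, and `fill` are hypothetical stand-ins for the per-output class labels, the predicted class indices, and the loop that writes predictions into the preallocated array.

```python
import numpy as np

# Hypothetical stand-ins: class labels per output and the predicted
# class index for each of 3 samples (not taken from scikit-learn).
classes = [np.array(["cat", "dog"]), np.array(["red", "green", "blue"])]
best_idx = [np.array([0, 1, 1]), np.array([2, 0, 1])]
n_samples, n_outputs = 3, len(classes)


def fill(predictions):
    """Write the predicted label of every output into its column."""
    for k in range(n_outputs):
        predictions[:, k] = classes[k].take(best_idx[k], axis=0)
    return predictions


# np.zeros defaults to float64, so inserting strings raises ValueError.
try:
    fill(np.zeros((n_samples, n_outputs)))
    zeros_failed = False
except ValueError:
    zeros_failed = True

# An object array accepts labels of any type.
predictions = fill(np.empty((n_samples, n_outputs), dtype=object))
```

With an object array the columns may hold strings, integers, or a mix, at the cost of losing a specific NumPy dtype on the result.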

Any other comments?

  n_samples = proba[0].shape[0]
- predictions = np.zeros((n_samples, self.n_outputs_))
+ predictions = np.empty((n_samples, self.n_outputs_),
+                        dtype='object')
Member

Wouldn't it be better to have dtype=self.classes_.dtype or something similar?
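One way to read this suggestion, sketched with plain NumPy under stated assumptions: for a multioutput classifier `classes_` is a list of arrays rather than a single array, so the dtype has to be derived from its entries. The names `classes` and `best_idx` are hypothetical, and promoting the dtypes with `np.result_type` is an illustrative choice here, not necessarily what the PR ended up doing.

```python
import numpy as np

# Hypothetical per-output class labels and predicted class indices.
classes = [np.array(["cat", "dog"]), np.array(["red", "green", "blue"])]
best_idx = [np.array([0, 1, 1]), np.array([1, 0, 2])]
n_samples, n_outputs = 3, len(classes)

# Promote across outputs so a wider string dtype is not truncated
# ('<U3' and '<U5' promote to '<U5').
class_dtype = np.result_type(*[c.dtype for c in classes])

predictions = np.empty((n_samples, n_outputs), dtype=class_dtype)
for k in range(n_outputs):
    predictions[:, k] = classes[k].take(best_idx[k], axis=0)
```

Taking only the first entry's dtype would also work when all outputs share one dtype, but with fixed-width strings of different lengths it could silently truncate labels.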


with np.errstate(divide="ignore"):
    proba = est.predict_proba(X_test)
    assert_equal(len(proba), 2)
Member

With the adoption of pytest, we are phasing out use of test helpers assert_equal, assert_true, etc. Please use bare assert statements, e.g. assert x == y, assert not x, etc.
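As a rough illustration of the requested style, the check can be written with bare assert statements. This is a sketch only: `check_proba_shape` and the sample `proba` list are hypothetical and do not come from the scikit-learn test suite.

```python
import numpy as np


def check_proba_shape(proba, n_outputs, n_samples):
    """Hypothetical helper: one probability array per output."""
    # Bare asserts replace assert_equal(len(proba), n_outputs) and friends.
    assert len(proba) == n_outputs
    for p in proba:
        assert p.shape[0] == n_samples


# Stand-in for est.predict_proba(X_test) on a two-output problem.
proba = [np.full((4, 2), 0.5), np.full((4, 3), 1 / 3)]
check_proba_shape(proba, n_outputs=2, n_samples=4)
```

pytest rewrites bare assert statements so that a failure still reports both sides of the comparison, which is why the helper functions are no longer needed.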

@elsander
Contributor Author

Sorry for the delay! I committed a couple of changes to address the code review comments.

Member

@adrinjalali left a comment

Thanks @elsander, LGTM!

Member

@jnothman left a comment

LGTM!

Please add an entry to the change log at doc/whats_new/v0.21.rst. Like the other entries there, please reference this pull request with :issue: and credit yourself (and other contributors, if applicable) with :user:.

@jnothman jnothman merged commit 6581b0d into scikit-learn:master Jan 2, 2019
rth pushed a commit to rth/scikit-learn that referenced this pull request Jan 3, 2019
adrinjalali pushed a commit to adrinjalali/scikit-learn that referenced this pull request Jan 7, 2019
xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019
koenvandevelde pushed a commit to koenvandevelde/scikit-learn that referenced this pull request Jul 12, 2019
Development

Successfully merging this pull request may close these issues.

predict fails for multioutput ensemble models with non-numeric DVs

3 participants