[MRG+1] Fix/pipeline param error msg by okz12 · Pull Request #13536 · scikit-learn/scikit-learn

okz12 · 2019-03-28T08:05:48Z

Reference Issues/PRs

What does this implement/fix? Explain your changes.

The error message raised when sample_weight is passed to a pipeline is unclear. This PR adds a check that raises a TypeError when the step name is not found in fit_params_steps.

Any other comments?

Is the wording of the error clear enough?
This will require a test case once the approach is finalised.

jnothman

Please add a test to test_pipeline.py. I'm pretty sure this does not fix the issue.

okz12 · 2019-03-29T07:34:52Z

I added this test case that catches the error:

def test_pipeline_param_error():
    clf = Pipeline(memory=None, steps=[('wrong_step_name', LogisticRegression())])
    assert_raise_message(TypeError,
                         "Use naming convention step__parameter.",
                         clf.fit, [[0], [0]], [0, 1],
                         logisticregression__sample_weight=[1, 1]
                         )

jnothman · 2019-03-29T07:45:23Z

The problem I'm concerned with is pipe.fit(X, y, sample_weight=weight)

NicolasHug · 2019-03-29T11:24:14Z

@okz12 as described in the original issue #13534 (comment), the problem we are trying to address here is that if a user passes sample_weight to a pipeline as-is, e.g.

pipe.fit([[0], [0]], [0, 1], sample_weight=[1, 1])

they end up with an error message that is not clear at all (Not enough values to unpack...). We want this error message to become something along the lines of

Passing sample_weight directly to a pipeline isn't supported. You can however pass 
sample_weight to specific steps of your pipeline, e.g. 
`pipe.fit(X, y, logisticregression__sample_weight=sample_weight)`.
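For context, the unclear message is what Python itself produces when a two-target unpacking receives a single value. A minimal sketch, assuming the pipeline splits each fit keyword on `__` into a step name and a parameter name (the helper `split_fit_param` is hypothetical and is not scikit-learn code):

```python
def split_fit_param(name):
    # Hypothetical helper: split "step__param" into its two parts.
    # A name without "__" yields a single-element list, so the
    # two-target unpacking below fails with a ValueError.
    step, param = name.split("__", 1)
    return step, param


# A correctly prefixed name splits into (step, parameter).
print(split_fit_param("logisticregression__sample_weight"))

try:
    split_fit_param("sample_weight")
except ValueError as exc:
    # The message talks about unpacking, not about the pipeline.
    print(exc)
```

The exception says nothing about step names or prefixes, which is why the thread asks for an explicit check and a dedicated message.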

okz12 · 2019-03-30T18:05:09Z

I've added the error message and an accompanying test. The check looks for two underscores ('__') in the parameter name, so it also catches parameters other than sample_weight. I made the error generic, referring to parameters rather than to sample_weight in particular, but kept the sample_weight example in the message. Let me know if this should be changed.
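The check described here could look roughly like the sketch below. It is an illustration under assumptions, not the code in sklearn/pipeline.py: `route_fit_params` is a hypothetical helper, and it uses the ValueError and the message wording that the thread settles on later.

```python
def route_fit_params(step_names, **fit_params):
    # Hypothetical helper, not the scikit-learn implementation.
    # Builds one dict of fit parameters per pipeline step.
    routed = {name: {} for name in step_names}
    for pname, pval in fit_params.items():
        # Reject names without the "step__parameter" separator up front,
        # so the caller gets an explanatory message.
        if "__" not in pname:
            raise ValueError(
                "Pipeline.fit does not accept the {} parameter. "
                "You can pass parameters to specific steps of your "
                "pipeline using the stepname__parameter format, e.g. "
                "`Pipeline.fit(X, y, logisticregression__sample_weight"
                "=sample_weight)`.".format(pname)
            )
        step, param = pname.split("__", 1)
        # In this sketch an unknown step name raises KeyError here.
        routed[step][param] = pval
    return routed


params = route_fit_params(
    ["logisticregression"], logisticregression__sample_weight=[1, 1]
)
print(params)
```

Because the check tests only for the separator, any un-prefixed keyword is rejected, not just sample_weight, which matches the generic wording chosen for the message.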

NicolasHug

Minor comments but LGTM otherwise

NicolasHug · 2019-03-30T18:43:03Z

Also you might want to change the title to [MRG]

jnothman · 2019-03-31T22:46:19Z

Not sure if we need to worry about changing a ValueError into a TypeError.

jnothman · 2019-03-31T22:46:55Z

Or actually, whether we must be consistent and also raise a TypeError in the case of any other parameter names (including __)

okz12 · 2019-04-04T20:21:50Z

I found examples in gaussian_process/kernels.py and base.py where it splits on '__' and raises a ValueError if the parameter name is invalid. Not sure if I should change pipeline.py to ValueError or the other two to TypeError.

NicolasHug · 2019-04-04T20:47:34Z

Hm, I guess the most consistent way would be to raise a ValueError here instead of a TypeError?

We raise ValueError when invalid parameters are passed to the estimators, and that's what this check is about.

jnothman · 2019-04-04T23:15:21Z

Okay let's be backwards compatible and keep with ValueError.

okz12 · 2019-04-06T01:22:16Z

Changing to ValueError breaks the test for the fix in #13472. The error message picked up is from my fix rather than from the fix in #13472, so the message no longer matches and the test case fails. I'm not entirely sure of the implications of this. Is it acceptable that this fix captures both errors? If so, I can update the test case for gradient boosting.

Test Case for #13472:

def test_gradient_boosting_with_init_pipeline():
    # Check that the init estimator can be a pipeline (see issue #13466)

    X, y = make_regression(random_state=0)
    init = make_pipeline(LinearRegression())
    gb = GradientBoostingRegressor(init=init)
    gb.fit(X, y)  # pipeline without sample_weight works fine

    with pytest.raises(
            ValueError,
            match='The initial estimator Pipeline does not support sample '
                  'weights'):
        gb.fit(X, y, sample_weight=np.ones(X.shape[0]))

Output Error:

E           AssertionError: Pattern 'The initial estimator Pipeline does not support sample weights' not found in 'Pipeline.fit does not accept the sample_weight parameter. You can pass parameters to specific steps of your pipeline using the stepname__parameter format, e.g. `Pipeline.fit(X, y, logisticregression__sample_weight=sample_weight)`.'

sklearn/ensemble/tests/test_gradient_boosting.py:1396: AssertionError

jnothman · 2019-04-06T11:48:27Z

Yes, IMO, change the gradient boosting check to conform to this.

jnothman · 2019-04-06T11:50:14Z

Don't modify the gradient boosting test. Instead, modify the gradient boosting code so that it checks for the right pipeline failure error message here:

scikit-learn/sklearn/ensemble/gradient_boosting.py

Line 1492 in 24df999

if 'not enough values to unpack' in str(e): # pipeline
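The pattern behind that line is to catch the pipeline's ValueError, recognise it by its message, and re-raise a clearer one. A rough sketch of how the check could be updated, assuming the new pipeline message quoted earlier in the thread (`FakePipeline` and `fit_init` are hypothetical stand-ins, not the gradient boosting code):

```python
class FakePipeline:
    # Stand-in for a pipeline whose fit rejects a bare sample_weight.
    def fit(self, X, y, **fit_params):
        for pname in fit_params:
            if "__" not in pname:
                raise ValueError(
                    "Pipeline.fit does not accept the {} parameter."
                    .format(pname))
        return self


def fit_init(init, X, y, sample_weight):
    # Hypothetical wrapper: translate the pipeline's error into the
    # message the gradient boosting test expects; re-raise anything else.
    try:
        return init.fit(X, y, sample_weight=sample_weight)
    except ValueError as exc:
        if "does not accept the sample_weight parameter" in str(exc):
            raise ValueError(
                "The initial estimator {} does not support sample "
                "weights.".format(init.__class__.__name__)) from exc
        raise


try:
    fit_init(FakePipeline(), [[0], [0]], [0, 1], sample_weight=[1, 1])
except ValueError as exc:
    print(exc)
```

Matching on message text couples the caller to the exact wording, which is why changing the pipeline message required changing this check as well.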

jnothman

Lgtm, thanks @okz12

NicolasHug · 2019-04-09T10:54:37Z

Merged, thanks @okz12 !

okz12 added 2 commits March 28, 2019 07:37

Added error check and message for pipeline parameter if not found

95b6d9f

Using input argument rather than separated param name

a5d5b0f

jnothman reviewed Mar 28, 2019

okz12 added 4 commits March 28, 2019 22:45

Separated line for flake8 compliance

908f830

Added testcase for pipeline parameter error message

0f60b34

Added line break for flake8 compliance

3616ad4

Added line break for flake8 compliance

e04c4c4

okz12 added 6 commits March 29, 2019 07:59

Revert "Added line break for flake8 compliance"

0b031a5

This reverts commit e04c4c4.

Revert "Added line break for flake8 compliance"

6809f54

This reverts commit 3616ad4.

Revert "Added testcase for pipeline parameter error message"

4ac93f6

This reverts commit 0f60b34.

Revert "Separated line for flake8 compliance"

4bc32c5

This reverts commit 908f830.

Revert "Using input argument rather than separated param name"

7a8a2cf

This reverts commit a5d5b0f.

Revert "Added error check and message for pipeline parameter if not found"

c150f29

This reverts commit 95b6d9f.

okz12 added 4 commits March 30, 2019 17:35

Added test for sample_weight error message

97e1888

Added if clause to raise exception if parameter is passed to pipeline

fa976e7

Passes flake8/pyflakes compliance

6fb6e36

Fixes flake8 compliance

c2b944b

NicolasHug approved these changes Mar 30, 2019

NicolasHug mentioned this pull request Mar 30, 2019

Improved error message for passing of sample_weight to pipeline #13548

Closed

okz12 added 2 commits March 31, 2019 09:51

Modified error message: added more details, removed comma, space

5e57570

Added test for sample_weight parameter error message

9b80875

jnothman reviewed Mar 31, 2019

okz12 added 2 commits March 31, 2019 10:49

Changed message to include parameter passed in as argument

4b6e965

Changed pipeline parameter test to check for parameter name in exception description

397b9e9

okz12 changed the title ~~[WIP] Fix/pipeline param error msg~~ [MRG] Fix/pipeline param error msg Mar 31, 2019

NicolasHug reviewed Mar 31, 2019

Updated error message to be matched in testcase

6ae71fb

jnothman approved these changes Mar 31, 2019

okz12 changed the title ~~[MRG] Fix/pipeline param error msg~~ [MRG+1] Fix/pipeline param error msg Apr 1, 2019

okz12 added 2 commits April 5, 2019 23:38

Changed TypeError to ValueError for backward compatability

447a524

flake8 compliance

edbe0c1

fit function checks for correct error message

92bde96

jnothman approved these changes Apr 9, 2019

NicolasHug merged commit 0e54f44 into scikit-learn:master Apr 9, 2019

jeremiedbb pushed a commit to jeremiedbb/scikit-learn that referenced this pull request Apr 25, 2019

Improve pipeline parameter error msg (scikit-learn#13536)

d0b7441

xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019

Improve pipeline parameter error msg (scikit-learn#13536)

c79e34b

xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019

Revert "Improve pipeline parameter error msg (scikit-learn#13536)"

976412a

This reverts commit c79e34b.

xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019

Revert "Improve pipeline parameter error msg (scikit-learn#13536)"

30325d8

This reverts commit c79e34b.

koenvandevelde pushed a commit to koenvandevelde/scikit-learn that referenced this pull request Jul 12, 2019

Improve pipeline parameter error msg (scikit-learn#13536)

05a6e5f
