Skip to content

FIX Ridge.coef_ return a single dimension array when target type is not continuous-multiple#19746

Merged
adrinjalali merged 11 commits intoscikit-learn:mainfrom
MaxwellLZH:fix/ridge-coef-shape
Oct 29, 2024
Merged

FIX Ridge.coef_ return a single dimension array when target type is not continuous-multiple#19746
adrinjalali merged 11 commits intoscikit-learn:mainfrom
MaxwellLZH:fix/ridge-coef-shape

Conversation

@MaxwellLZH
Copy link
Copy Markdown
Contributor

@MaxwellLZH MaxwellLZH commented Mar 22, 2021

Reference Issues/PRs

Fixes #19693 following the discussion from @thomasjpfan
Closes #20604

What does this implement/fix? Explain your changes.

Both Ridge.coef_ and RidgeCV.coef_ now returns a single dimension array when the result of sklearn.utils.multiclass.type_of_target(y) is continuous.

@MaxwellLZH MaxwellLZH changed the title FIX Ridge.coef_ return a single dimension array when target type is n… FIX Ridge.coef_ return a single dimension array when target type is not continuous-multiple Mar 22, 2021
@thomasjpfan
Copy link
Copy Markdown
Member

I am unsure which solution would be the good one, because both solutions would be backward incompatible.

@MaxwellLZH
Copy link
Copy Markdown
Contributor Author

I am unsure which solution would be the good one, because both solutions would be backward incompatible.

Maybe we could leave the code as it is, just add some documentation about this behavior in the documentation of Ridge and RidgeCV?

@glemaitre
Copy link
Copy Markdown
Member

I added the following tests: #20604

Ridge, RidgeCV and LinearRegression are the three estimators for which the tests are failing.
I would be inclined to not be backward compatible and consider it as a bug fix indeed.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 15, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 34d42e1. Link to the linter CI: here

@adrinjalali
Copy link
Copy Markdown
Member

@glemaitre @thomasjpfan I've updated the PR, added Guillaume's tests, and added a changelog. This should be good to go.

splits = cv.split(X_tiled, y_tiled, groups=indices)
predictions = cross_val_predict(ridge_reg, X_tiled, y_tiled, cv=splits)
if predictions.shape != y_tiled.shape:
predictions = predictions.reshape(y_tiled.shape)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit strange the cross_val_predict does not have the same shape as y[indices].

I'm guessing it's because Ridge.predict will now always return an ndarray of (n_samples,). If Ridge was trained on a y of shape (n_samples, 1), should Ridge.predict return a prediction of shape (n_samples, 1)?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have in other places where things fail if the shape is (n_samples, 1). I'm more in favor of having a consistent output shape for single outputs. We probably want cross_val_predict to become more consistent with the rest of the library.

@lorentzenchr lorentzenchr added this to the 1.6 milestone Sep 10, 2024
Copy link
Copy Markdown
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
@adrinjalali adrinjalali enabled auto-merge (squash) October 29, 2024 10:59
Copy link
Copy Markdown
Member

@lorentzenchr lorentzenchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MaxwellLZH Could you correct the changelog in a tiny new PR?

@@ -0,0 +1,3 @@
- In :class:`linear_model.Ridge` and :class:`linear_model.RidgeCV`, after `fit`,
the `coef_` attribute is now of shape `(n_samples,)` like other linear models.
Copy link
Copy Markdown
Member

@lorentzenchr lorentzenchr Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
the `coef_` attribute is now of shape `(n_samples,)` like other linear models.
the `coef_` attribute is now of shape `(n_features,)` like other linear models.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Ridge and Lasso return different shaped .coef_ attributes

5 participants