[WIP] RidgeGCV with sample weights is broken #4490
eickenberg wants to merge 3 commits into scikit-learn:master from
Conversation
Should we add a common test that sample weights correspond to duplicated data points?

Good point! How global can this be made? (I am not sure what exceptions there are. Right now it looks to me that estimators finding an exact optimum generally qualify for this.)
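The suggestion above — that an integer sample weight should behave like duplicating that data point — can be sketched as a plain-NumPy check against the closed-form ridge solution. This is my own sketch, not code from this PR; `ridge_solution` is a hypothetical helper:

```python
import numpy as np

def ridge_solution(X, y, alpha, sample_weight=None):
    """Closed-form ridge solution (no intercept). Weights enter as a
    diagonal matrix S in the normal equations (X' S X + alpha I) w = X' S y."""
    n_features = X.shape[1]
    if sample_weight is None:
        sample_weight = np.ones(X.shape[0])
    Xs = X * sample_weight[:, np.newaxis]  # rows of X scaled by their weight
    A = Xs.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(A, Xs.T @ y)

rng = np.random.RandomState(0)
X = rng.randn(10, 3)
y = rng.randn(10)

# Weighting the first sample by 3 ...
w = np.ones(10)
w[0] = 3
coef_weighted = ridge_solution(X, y, alpha=1.0, sample_weight=w)

# ... should give the same coefficients as repeating it twice more.
X_dup = np.vstack([X, X[:1], X[:1]])
y_dup = np.concatenate([y, y[:1], y[:1]])
coef_dup = ridge_solution(X_dup, y_dup, alpha=1.0)

assert np.allclose(coef_weighted, coef_dup)
```

Any estimator that finds an exact optimum of a per-sample loss should satisfy this invariance, which is what makes a common test plausible.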
Couldn't this use sklearn.datasets.make_regression?
Indeed, sorry. I copied this in from an ages-old PR where I didn't know about that ;)
Will update.
On Thursday, April 2, 2015, Vlad Niculae notifications@github.com wrote:
In sklearn/linear_model/tests/test_ridge.py
#4490 (comment)
:
- test = slice(n_train, None)
- X = rng.randn(n_samples, n_features)
- W = rng.randn(n_features, n_targets)
- Y_clean = X.dot(W)
- if noise_levels is None:
- noise_levels = rng.randn(n_targets) ** 2
- noise_levels = np.atleast_1d(noise_levels) * np.ones(n_targets)
- noise = rng.randn(*Y_clean.shape) * noise_levels * Y_clean.std(0)
- Y = Y_clean + noise
- return X, Y, W, train, test
+def test_ridge_gcv_with_sample_weights():
+
- n_samples, n_features, n_targets = 20, 5, 7
- X, Y, W, _, _ = make_noisy_forward_data(n_samples, n_features, n_targets)
Couldn't this use sklearn.datasets.make_regression?
https://github.com/scikit-learn/scikit-learn/pull/4490/files#r27685564.
We have such invariance testing of sample weights in metrics (also scale invariance of weights, and that

I'm no expert here, but this seems like something that would be good to have. Maybe not for the RC, but for the release?

Was thinking of sprinting on adding some more of these. Is that too late?
On Wednesday, October 14, 2015, Andreas Mueller notifications@github.com

Some more tests? I guess only the RC will be before the sprint, so we can still cherry-pick bugfixes from the sprint into the release.
5e6c2ea to d3a44e7
Have rebased this. Will attempt a common test for a certain number of classifiers / regressors concerning the effect of
The comment should be at the top and start with #.
@agramfort do you want to review this?

The AppVeyor failure looks real:
Please use a more classical formatting for the args. It's confusing this way. For instance:

def make_noisy_forward_data(n_samples=100, n_features=200, n_targets=10,
                            train_frac=.8, noise_levels=None,
                            random_state=42):
    ...
Rebase must have gone wrong. I had removed this in favor of a make_regression.
@eickenberg are you intending to complete this?

Fixed by #13350.
What the title says.
The way it is right now, the sample weights weight the eigenspaces of the Gram matrix, which doesn't seem sensible.