Fix "GaussianProcessRegressor doesn't work with multidemensional output when normalize_y=True" #18300
Closed
MathieuBertin wants to merge 2 commits into scikit-learn:main from
…ut when normalize_y=True" Added a dimension to y_std and y_cov if model fitted with multidimensional target
Corrected: comment line too long
Contributor
Interesting, does this mean that we don't have a test for a multi-dimensional output with normalize_y=True?
Author
Yes, this is correct. I added the test in my code (file test_gpr.py, function test_y_normalized_multioutput).
Member
This has been addressed in another PR.
Reference Issues/PRs
Fixes #18065
What does this implement/fix? Explain your changes.
The current GaussianProcessRegressor doesn't work with multidimensional output when normalize_y=True, and either y_std or y_cov is queried.
When the target is normalized, y_std depends on the standard deviation of the target.
Hence it was necessary to add a dimension to y_std if the target is multidimensional, to allow a different set of values for each of the dimensions of the target.
The same problem arose for y_cov, and was fixed similarly by adding a dimension when the target is multidimensional.
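The reshaping described above can be sketched with plain NumPy. This is not the code from this PR: the helper name `unnormalize_uncertainty` and its arguments are hypothetical, and the sketch assumes that the model fitted on normalized targets yields one predictive standard deviation per sample, while each target dimension has its own training standard deviation.

```python
import numpy as np


def unnormalize_uncertainty(y_std_norm, y_cov_norm, y_train_std):
    """Rescale predictive uncertainty from normalized to original target units.

    Hypothetical helper for illustration only.
    y_std_norm:  shape (n_samples,), std in normalized target space
    y_cov_norm:  shape (n_samples, n_samples), covariance in normalized space
    y_train_std: shape (n_targets,), per-target std of the training targets
    """
    y_train_std = np.atleast_1d(y_train_std)
    # Add a trailing target axis so that broadcasting applies a separate
    # scale to every target dimension.
    y_std = y_std_norm[:, np.newaxis] * y_train_std
    # Covariance scales with the variance, hence the squared std.
    y_cov = y_cov_norm[:, :, np.newaxis] * y_train_std**2
    return y_std, y_cov


y_std_norm = np.array([0.5, 1.0, 2.0])
y_cov_norm = np.diag(y_std_norm**2)
y_train_std = np.array([1.0, 10.0])  # two target dimensions

y_std, y_cov = unnormalize_uncertainty(y_std_norm, y_cov_norm, y_train_std)
print(y_std.shape)  # (3, 2)
print(y_cov.shape)  # (3, 3, 2)
```

Without the extra axis, multiplying an array of shape (n_samples,) by one of shape (n_targets,) either fails to broadcast or, when the two sizes happen to match, silently multiplies element-wise.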
Any other comments?
My first contribution here. Sorry in advance if I misunderstood any of the guidelines.
I am not sure what should happen when the target is multidimensional but not normalized (normalize_y=False).
In the submitted code, a dimension is added to y_std and y_cov if the target is multidimensional, regardless of the value of normalize_y.
The drawback is that if normalize_y=False, values of y_std and y_cov will be identical for all values of this new dimension, which seems redundant.
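The redundancy can be seen in a small NumPy sketch. It assumes, for illustration only, that the unnormalized case behaves like a scale of 1 for every target dimension; the variable names are hypothetical and not taken from the PR.

```python
import numpy as np

# Predictive std for three samples, as returned by the underlying model.
y_std_norm = np.array([0.5, 1.0, 2.0])

# Assumption: without normalization every target has a scale of 1.
unit_scale = np.ones(2)

# Adding the target axis simply copies the same values into each column.
y_std = y_std_norm[:, np.newaxis] * unit_scale

columns_identical = bool(np.all(y_std == y_std[:, [0]]))
print(columns_identical)  # True
```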