Allow the creation of a pipeline for non-parametric functions like TSNE#22162
Allow the creation of a pipeline for non-parametric functions like TSNE#22162cathelineparisot wants to merge 2 commits intoscikit-learn:mainfrom
Conversation
35d4cff to
4046912
Compare
4046912 to
0a6edc9
Compare
There was a problem hiding this comment.
Thank you for the PR @cathelineparisot ! I think we still need an answer to #16714 (review) before we can continue. I currently agree with the comment:
Pipeline is very much designed to solve needs around train/test setups, so I don't want to weaken its usability for that use-case.
A user can spend a long time fitting a pipeline with a TNSE in the middle which works because TSNE has fit_transform. pipe.predict would fail because a step in the pipeline does not define transform. This is a poor user experience because they can spend a long time time in fitting but only finds out at the end that they can not use it for prediction.
Thank you for your feedback. I understand what you mean. In this case, setting a warning might already be a solution (as it was done in #16714). It is also necessary to see if creating a Pipeline with a TNSE is useful in other cases or only to make a prediction. I don't have the knowledge for that. |
|
Please update the title to better describe what the PR does. (Same comment for your other #22165 PR) |
I changed the title of my two pull requests. I hope it suits you. I apologize for the bad titles. |
|
I am closing this PR in favor of #16714 since that PR contains tests and has more discussion around the topic. Thank you for looking into this issue! |
Reference Issues/PRs
Fixes #16710. See also #16714.
What does this implement/fix? Explain your changes.
Calling a pipeline for a non-parametric function like TSNE() returns an error because the transform() function is missing. However, for non-parametric functions, there is no transform() method because there is no projection or mapping, yet we can use dimensionality reduction.
So, we must be able to create a pipeline even if the transform() method does not exist. This PR allows it
Any other comments?
However, I have the impression that the constructor of the current library no longer calls "validate_step()" as I added it in the "init". What is the problem? It was this function that was creating the error and I have modified this function. I am waiting for your feedback.