Conversation
Implemented the normalized stress value from Borchmann's stalled PR: scikit-learn#10168 With these changes, passing normalize=True returns a meaningful stress value between 0-1. The current returned stress value is basically useless. normalize is set to False by default.
Hi @rotheconrad! Thanks for your contribution. I see you've mentioned #10168, but I wanted to add that there are two other open PRs (#12285 and #13042) that would also address this issue. I would recommend checking with the author of #13042 to verify the status of that PR, as there was some in-progress discussion regarding implementation details that could be relevant here. |
|
I am not really familiar with this technique. I read quickly the original reference: https://link.springer.com/content/pdf/10.1007/BF02289565.pdf. In the previous PR, @jnothman raised the question if it was necessary to perform the normalization at each step of the SMACOF algorithm (I think that's why @amueller tag "benchmark needed"). From what I can read in p. 8-9, the raw stress does not have scale invariance property. Therefore, I think that one would need to play with the |
glemaitre
left a comment
There was a problem hiding this comment.
We will need a test.
I think that the test in https://github.com/scikit-learn/scikit-learn/pull/10168/files#diff-4170fd64d149a193329356d7f55e78d1R64 was a good start
| Pass an int for reproducible results across multiple function calls. | ||
| See :term: `Glossary <random_state>`. | ||
|
|
||
| normalize : boolean, optional, default: False |
There was a problem hiding this comment.
| normalize : boolean, optional, default: False | |
| normalize : bool, default=False |
| stress = np.sqrt(stress / | ||
| ((disparities.ravel() ** 2).sum() / 2)) |
There was a problem hiding this comment.
| stress = np.sqrt(stress / | |
| ((disparities.ravel() ** 2).sum() / 2)) | |
| stress = np.sqrt( | |
| stress / ((disparities.ravel() ** 2).sum() / 2) | |
| ) |
| return_n_iter : bool, default=False | ||
| Whether or not to return the number of iterations. | ||
|
|
||
| normalize : boolean, optional, default: False |
There was a problem hiding this comment.
| normalize : boolean, optional, default: False | |
| normalize : bool, default=False |
| Pre-computed dissimilarities are passed directly to ``fit`` and | ||
| ``fit_transform``. | ||
|
|
||
| normalize : boolean, optional, default: False |
There was a problem hiding this comment.
| normalize : boolean, optional, default: False | |
| normalize : bool, default=False |
|
Closing as superseded by #22562. |
Implemented the normalized stress value from Borchmann's stalled PR: #10168
With these changes, passing normalize=True returns a meaningful stress value between 0-1. The current returned stress value is basically useless. normalize is set to False by default.
Reference Issues/PRs
What does this implement/fix? Explain your changes.
Any other comments?