[MRG] Feature: calculate normed stress (Stress-1) in sklearn.manifold.MDS #10168
Borchmann wants to merge 3 commits into scikit-learn:main
jnothman left a comment:
Thanks for the PR. Without much erudition around MDS, I'll review engineering...
You can't change a public interface without backwards compatibility. So you'll have to add a return_normed_stress attribute to smacof (or use a parameter to switch between normed and raw stress).
You also should test that the attribute on MDS is correctly set in the test file.
@jnothman: Thanks for the review. That's right, I adjusted the PR to follow your suggestions. Now Stress-1 is used and returned instead of raw stress when the normalize parameter is set to True (False by default). A test for the new feature was also added (it relies on the property that normed stress is expected to be the same when the input matrix is multiplied by some factor "k").
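A minimal sketch of the behaviour described in this comment, not the PR's actual code: a hypothetical `stress` helper keeps raw stress as the default and switches to Stress-1 only when `normalize=True`, and the final lines mirror the idea behind the proposed test. It assumes Kruskal's normalization by the sum of squared embedded distances, uses made-up data, and scales the embedding by hand instead of refitting MDS.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distance for every unordered pair (i < j) of points."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def stress(distances, disparities, normalize=False):
    """Hypothetical helper: raw stress by default, Stress-1 if normalize=True."""
    raw = sum((d - dh) ** 2 for d, dh in zip(distances, disparities))
    if not normalize:
        return raw  # previous behaviour stays the default
    # Assumed normalization: sum of squared embedded distances (Kruskal, 1964).
    return math.sqrt(raw / sum(d ** 2 for d in distances))


# Made-up 2-D embedding of four points and made-up target disparities,
# one value per pair in the order produced by combinations().
embedding = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
disparities = [1.1, 1.3, 0.9, 1.0, 1.5, 1.2]

# Scale both the configuration and the disparities by the same factor k.
k = 10.0
scaled_embedding = [(k * x, k * y) for x, y in embedding]
scaled_disparities = [k * d for d in disparities]

distances = pairwise_distances(embedding)
scaled_distances = pairwise_distances(scaled_embedding)

raw_a = stress(distances, disparities)
raw_b = stress(scaled_distances, scaled_disparities)
s1_a = stress(distances, disparities, normalize=True)
s1_b = stress(scaled_distances, scaled_disparities, normalize=True)

print(f"raw stress: {raw_a:.4f} vs {raw_b:.4f}")  # grows by a factor of k**2
print(f"Stress-1:   {s1_a:.4f} vs {s1_b:.4f}")    # unchanged by the scaling
```

Because the flag defaults to False, existing callers keep getting the raw value, which is the backwards-compatibility point raised in the review.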
glemaitre left a comment:
I would add an entry in what's new as well
    disparities and the distances for all constrained points).
    The final value of the stress. By default, sum of squared distance
    of the disparities and the distances for all constrained points.
    If normalize is set to True, returns Stress-1 (according to
You could add a Reference section and reference to this part.
    # Normed stress should be the same for
    # values multiplied by some factor "k"
    assert_array_almost_equal(stress1, stress2, decimal=2)
Could you use allclose instead of almost_equal?
    normalize : boolean, optional, default: False
        Whether use and return normed stress value (Stress-1) instead of raw
        stress calculated by default.
@glemaitre, I can fulfill the last request if @Borchmann is not active.
Is there still something blocking this PR?
@mattmilten I'm a bit busy with other projects now, go ahead if you would like to fix this.
Any news on this feature? The current stress value is basically useless. It would be really nice for users to have access to the normalized stress value from the main scikit-learn package install without modifying anything.
Implemented the normalized stress value from Borchmann's stalled PR: scikit-learn#10168. With these changes, passing normalize=True returns a meaningful stress value between 0 and 1. The currently returned stress value is basically useless. normalize is set to False by default.
Same here... I'm trying to use MDS and check the Kruskal stress for my current work, and I thought it was already implemented after seeing the 3-year-old Stack Overflow topic: stress-attribute-sklearn-manifold-mds-python
Closing as superseded by #22562.
This change introduces an additional parameter to sklearn.manifold.MDS, namely normalize (default False), that can be used to compute and return Stress-1 instead of the raw stress value. The already implemented stress_ attribute contains raw stress, defined as:
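In a common notation (the symbols are assumptions, not taken from the PR), with $d_{ij}(X)$ the distance between points $i$ and $j$ in the embedding $X$ and $\hat{d}_{ij}$ the corresponding disparity, the sum of squared differences described in the docstring can be written as:

$$
\sigma_r(X) = \sum_{i<j} \left( d_{ij}(X) - \hat{d}_{ij} \right)^2
$$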
The raw stress value itself is not very informative, and a high value does not necessarily indicate a bad fit. A better way of communicating reliability is to calculate a normed stress, e.g. with Stress-1 as implemented in the current PR:
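Following Kruskal's (1964) definition, and again assuming the notation $d_{ij}(X)$ for embedded distances and $\hat{d}_{ij}$ for disparities (symbols not taken from the PR), Stress-1 divides the sum of squared differences by the sum of squared embedded distances and takes the square root. Whether the PR's code uses exactly this denominator is not stated in the description.

$$
\text{Stress-1} = \sqrt{\frac{\sum_{i<j} \left( d_{ij}(X) - \hat{d}_{ij} \right)^2}{\sum_{i<j} d_{ij}(X)^2}}
$$

Scaling all distances and disparities by a common factor multiplies the numerator and the denominator by the same amount, so the ratio does not depend on the scale of the input.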
According to Kruskal (1964, p. 3): value 0 indicates "perfect" fit, 0.025 excellent, 0.05 good, 0.1 fair, and 0.2 poor.
For more information cf. Kruskal (1964, p. 8-9) and Borg (2005, p. 41-43).