ENH Change the default n_init and eps for MDS #31117
ogrisel merged 37 commits into scikit-learn:main
@antoinebaker IMHO it would be great to have this in 1.7 so that the deprecation cycle rolls out by 1.9... Would be amazing if you could take a look. Thanks!

Will try! But not sure we can make it to the 1.7 release, which is coming very soon :)
antoinebaker left a comment
Thanks a lot @dkobak for the follow-up PR!
The stopping criterion makes much more sense now. The plot_mds.py example is improved with the tutorial-like presentation.
Even if our goal is not to reproduce the R code exactly, could you provide a comparison of the R/sklearn results on some examples, to see whether the stress values found are in the same ballpark? I guess comparing the embeddings is more difficult because of rotational invariance.
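One way around the rotational invariance when comparing the embeddings directly would be an orthogonal Procrustes alignment. A minimal sketch (the arrays and names here are illustrative, not from the PR):

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(0)
X_a = rng.standard_normal((10, 2))           # e.g. centered sklearn embedding
rot = np.array([[0.0, -1.0], [1.0, 0.0]])    # a 90-degree rotation
X_b = X_a @ rot                              # e.g. R embedding, a rotated copy

# Best orthogonal map aligning X_b onto X_a, then the residual misfit
R, _ = orthogonal_procrustes(X_b, X_a)
residual = np.linalg.norm(X_b @ R - X_a)
```

For a pure rotation/reflection of the same configuration, the residual is (numerically) zero; for two independently fitted embeddings it measures the disagreement left after removing the rotational ambiguity.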
Below are a few suggestions to ease maintenance.
@antoinebaker Thanks for the review, I added all your suggestions.

This makes sense, however I am not an R user and so cannot do this very easily... If you think this is important, I can try to set it up.
Okay, I am trying to set up the comparison. I'm taking the data from here: https://cran.r-project.org/web/packages/smacof/vignettes/mdsnutshell.html

```python
import numpy as np

S = np.array([[0.  , 0.86, 0.42, 0.42, 0.18, 0.06, 0.07, 0.04, 0.02, 0.07, 0.09, 0.12, 0.13, 0.16],
              [0.86, 0.  , 0.5 , 0.44, 0.22, 0.09, 0.07, 0.07, 0.02, 0.04, 0.07, 0.11, 0.13, 0.14],
              [0.42, 0.5 , 0.  , 0.81, 0.47, 0.17, 0.1 , 0.08, 0.02, 0.01, 0.02, 0.01, 0.05, 0.03],
              [0.42, 0.44, 0.81, 0.  , 0.54, 0.25, 0.1 , 0.09, 0.02, 0.01, 0.  , 0.01, 0.02, 0.04],
              [0.18, 0.22, 0.47, 0.54, 0.  , 0.61, 0.31, 0.26, 0.07, 0.02, 0.02, 0.01, 0.02, 0.  ],
              [0.06, 0.09, 0.17, 0.25, 0.61, 0.  , 0.62, 0.45, 0.14, 0.08, 0.02, 0.02, 0.02, 0.01],
              [0.07, 0.07, 0.1 , 0.1 , 0.31, 0.62, 0.  , 0.73, 0.22, 0.14, 0.05, 0.02, 0.02, 0.  ],
              [0.04, 0.07, 0.08, 0.09, 0.26, 0.45, 0.73, 0.  , 0.33, 0.19, 0.04, 0.03, 0.02, 0.02],
              [0.02, 0.02, 0.02, 0.02, 0.07, 0.14, 0.22, 0.33, 0.  , 0.58, 0.37, 0.27, 0.2 , 0.23],
              [0.07, 0.04, 0.01, 0.01, 0.02, 0.08, 0.14, 0.19, 0.58, 0.  , 0.74, 0.5 , 0.41, 0.28],
              [0.09, 0.07, 0.02, 0.  , 0.02, 0.02, 0.05, 0.04, 0.37, 0.74, 0.  , 0.76, 0.62, 0.55],
              [0.12, 0.11, 0.01, 0.01, 0.01, 0.02, 0.02, 0.03, 0.27, 0.5 , 0.76, 0.  , 0.85, 0.68],
              [0.13, 0.13, 0.05, 0.02, 0.02, 0.02, 0.02, 0.02, 0.2 , 0.41, 0.62, 0.85, 0.  , 0.76],
              [0.16, 0.14, 0.03, 0.04, 0.  , 0.01, 0.  , 0.02, 0.23, 0.28, 0.55, 0.68, 0.76, 0.  ]])
D = 1 - S
np.fill_diagonal(D, 0)
```

I'm running R here: https://webr.r-wasm.org/latest/

```r
library(smacof)
D = 1 - ekman
mds(D)
mds(D, type="ordinal")
```

Results in Python (this branch):

```python
from sklearn.manifold import MDS

np.random.seed(42)
MDS(dissimilarity="precomputed", metric=True, normalized_stress=True, n_init=1).fit(D).stress_            # 0.137
MDS(dissimilarity="precomputed", metric=True, normalized_stress=True, n_init=1, eps=1e-6).fit(D).stress_  # 0.132
MDS(dissimilarity="precomputed", metric=False, normalized_stress=True, n_init=1).fit(D).stress_           # 0.372
MDS(dissimilarity="precomputed", metric=False, normalized_stress=True, n_init=1, eps=1e-6).fit(D).stress_ # 0.031
```

I think it fits well enough! Note: I cannot get down to 0.023 stress-1 with the Python implementation, and I am not sure why, but this is out of scope of this PR. For reference, here is the R code: https://github.com/cran/smacof/blob/master/R/smacofSym.R
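For reference, the stress-1 values being compared are Kruskal's normalized stress. A standalone sketch of the standard formula (my own helper, not the sklearn or smacof code):

```python
import numpy as np

def stress_1(dhat, d):
    """Kruskal's stress-1: sqrt(sum((dhat - d)**2) / sum(d**2)),
    where dhat are (transformed) dissimilarities and d are the
    corresponding embedding distances."""
    dhat = np.asarray(dhat, dtype=float).ravel()
    d = np.asarray(d, dtype=float).ravel()
    return np.sqrt(((dhat - d) ** 2).sum() / (d ** 2).sum())
```

A perfect fit gives 0; the values above (0.13-0.37) indicate how far the embedding distances are from the dissimilarities, relative to the overall scale of the distances.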
@antoinebaker I fixed the issue in my comparison code (had to add …).
Thanks for the comparison examples! To share and reproduce, could you provide them as gists for the Python and R code? The Python one can be a notebook, the R code just a plain script with the results commented as you have done. Another possibility is to have a small public repository with the two code snippets.
This PR is already a much-needed improvement for MDS, so +1 to merge it as is.
Here is the gist, I put the R code into the comments on top: https://gist.github.com/dkobak/3daf73495b3da3b23dfbfd6b0b441030
Any chance to get it into 1.7? I am not sure what the timeline for that is.
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Thanks @ogrisel. I have edited the PR to make the change of …
ogrisel left a comment
Thanks very much for the quick follow-up. Here is a final comment but otherwise LGTM.
antoinebaker left a comment
A few nitpicks (mainly removing superfluous `eps=1e-6`, as it is the new default), otherwise LGTM!
Co-authored-by: antoinebaker <antoinebaker@users.noreply.github.com>
I enabled auto-merge so that the PR will be merged if CI is green after this commit. @dkobak instead of committing individual review suggestions, you can press the "Add suggestion to batch" button in the diff view of the PR and then do a single commit for a group of suggestions, to avoid unnecessary CI usage.
Thanks. I noticed that the what's-new entries had wrong markup formatting (mea culpa) and also found many other existing what's-new entries for 1.7 that have wrong markup and render incorrectly. I fixed all of them in #31272. |
This is a follow-up to #30514 and has been discussed in there to some extent. It fixes two issues:
1. The current default in MDS is `n_init=4`, which runs MDS four times. Other sklearn classes that offer this functionality use `n_init=1` by default, e.g. `sklearn.mixture.GaussianMixture`. This appears much more sensible to me, so I am changing the default to `n_init=1`.
2. The convergence criterion was really strange and unmotivated, and the default `eps` led to really bad underconvergence on some simple datasets. I am changing it to a convergence criterion that (i) roughly follows the R implementation, (ii) makes sense for both metric and non-metric MDS, and (iii) is not affected by any rescaling of the input matrix `X`. The new convergence criterion is `((old_stress - stress) / ((distances.ravel() ** 2).sum() / 2)) < eps`, with the default `eps=1e-6` as in the R implementation.

Apart from that, I fixed the formula for the "normalized stress" aka "stress-1" (as discussed in the previous PR), and added several tests.
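The new stopping rule can be written as a small helper. This is a sketch with assumed variable names, not the actual sklearn internals: the stress decrease, normalized by half the sum of squared embedding distances, must fall below `eps`.

```python
import numpy as np

def has_converged(old_stress, stress, distances, eps=1e-6):
    """New SMACOF stopping rule: normalized stress decrease below eps.

    distances: array of pairwise distances between embedding points.
    The normalization makes the criterion invariant to rescaling of
    the input, since stress and squared distances scale together.
    """
    norm = (np.asarray(distances, dtype=float).ravel() ** 2).sum() / 2
    return (old_stress - stress) / norm < eps
```

Because both the numerator and the denominator are quadratic in the embedding scale, multiplying the input dissimilarities by a constant leaves the criterion unchanged, which the old absolute-`eps` criterion did not guarantee.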
I implemented FutureWarnings until v1.9 and corresponding tests.
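The deprecation mechanics presumably follow the usual sklearn pattern of a "warn" sentinel default. A hypothetical illustration of that pattern (not the actual PR code; `resolve_n_init` is a made-up name):

```python
import warnings

def resolve_n_init(n_init="warn"):
    # Hypothetical sketch: warn when the user relies on the old default,
    # but keep the old behavior until the new default takes effect.
    if n_init == "warn":
        warnings.warn(
            "The default value of `n_init` will change from 4 to 1 in 1.9. "
            "Set n_init explicitly to silence this warning.",
            FutureWarning,
        )
        n_init = 4  # old default, kept during the deprecation period
    return n_init
```

Users who pass `n_init` explicitly see no warning, which is what the corresponding tests would check.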
Here is the result of running this code with the new default parameters on a small subset of the Digits dataset.
Note that both embeddings converge within ~200 iterations, and that non-metric MDS has lower normalized stress than metric MDS, as expected.
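A minimal version of that experiment might look like this (the exact subset size, the random seed, and making the new defaults explicit are my assumptions):

```python
from sklearn.datasets import load_digits
from sklearn.manifold import MDS

X, _ = load_digits(return_X_y=True)
X = X[:100]  # small subset of Digits, as in the description above

# Metric and non-metric MDS with the new defaults passed explicitly
mds_metric = MDS(n_components=2, metric=True, n_init=1, eps=1e-6, random_state=0)
emb_metric = mds_metric.fit_transform(X)

mds_nonmetric = MDS(n_components=2, metric=False, n_init=1, eps=1e-6, random_state=0)
emb_nonmetric = mds_nonmetric.fit_transform(X)
```

With these settings both runs iterate until the normalized stress decrease is tiny, instead of stopping after a handful of iterations as with the old `eps`.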
Running this with the current defaults (removing `n_init=1, eps=1e-6` from the MDS calls) produces awful results, as the convergence criterion triggers way too early. Almost the same thing happens on `main`. So in my opinion the current `eps` value is dysfunctional, especially for non-metric MDS, and the current `n_init` value is a waste of computations.