Skip to content

[MRG+1] MAINT: Refactor and speed up silhoutte_score (samples)#6151

Merged
GaelVaroquaux merged 1 commit intoscikit-learn:masterfrom
MechCoder:silhouette_score_refactor
Jan 17, 2016
Merged

[MRG+1] MAINT: Refactor and speed up silhoutte_score (samples)#6151
GaelVaroquaux merged 1 commit intoscikit-learn:masterfrom
MechCoder:silhouette_score_refactor

Conversation

@MechCoder
Copy link
Copy Markdown
Member

A refactor of metrics.silhouette_score. Able to get at least 2 times speedup in most cases.

 n_samples=10000, n_labels=6
 In this branch: 13.4s, in master:23.2s
 n_samples=10000, n_labels=4
 In this branch: 14.4s, in master:23.9s

 n_samples=1000, n_labels=6
 In this branch: 115ms, in master:287ms
 n_samples=1000, n_labels=4
 In this branch: 110ms, in master:249ms

 n_samples=100, n_labels=6
 In this branch: 1.87ms, in master:7.13ms
 n_samples=100, n_labels=4
 In this branch: 1.54ms, in master:5.84ms

Also fixed a bug related to non-encoded labels.

@MechCoder MechCoder force-pushed the silhouette_score_refactor branch from ff2f719 to 76b1b3d Compare January 10, 2016 07:33
@GaelVaroquaux
Copy link
Copy Markdown
Member

I think that this code would greatly benefit from a few comments, and maybe calling 'A' and 'B' with more explicite names. Right now, it is very hard to follow. Harder than the code it replaces.

@MechCoder
Copy link
Copy Markdown
Member Author

@GaelVaroquaux I have added comments and replaced a few variable names. Can you tell me if it still harder to follow?

@MechCoder MechCoder force-pushed the silhouette_score_refactor branch from ec3ce2f to 7bb62bd Compare January 10, 2016 15:51
@MechCoder
Copy link
Copy Markdown
Member Author

In general, I think the tests can be made more stronger but I'll leave that for another PR

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is n_samples, not n_labels

@MechCoder
Copy link
Copy Markdown
Member Author

@TomDLT looks ok now?

@MechCoder
Copy link
Copy Markdown
Member Author

@agramfort if you have the time :-)

@TomDLT
Copy link
Copy Markdown
Member

TomDLT commented Jan 13, 2016

LGTM, you can squash

@TomDLT TomDLT changed the title [MRG] MAINT: Refactor and speed up silhoutte_score (samples) [MRG+1] MAINT: Refactor and speed up silhoutte_score (samples) Jan 13, 2016
@MechCoder MechCoder force-pushed the silhouette_score_refactor branch from f91abff to 11f9b08 Compare January 13, 2016 14:07
@amueller
Copy link
Copy Markdown
Member

needs rebase ;)

@amueller
Copy link
Copy Markdown
Member

how does it scale to very large n_labels?

@MechCoder
Copy link
Copy Markdown
Member Author

It scales pretty well ! :-) (best of 3)

For n_samples=10000, n_labels=100
In this branch: 12.3s, In master: 28s
For n_samples=1000, n_labels=100
In this branch: 277ms, In master: 992ms

Do I haz your +1 as well?

@MechCoder MechCoder force-pushed the silhouette_score_refactor branch from 11f9b08 to c7ce0ab Compare January 15, 2016 21:18
@MechCoder
Copy link
Copy Markdown
Member Author

rebased

@GaelVaroquaux
Copy link
Copy Markdown
Member

👍 on my side. This is a good rewrite. It is clear and fast. I am merging this!

GaelVaroquaux added a commit that referenced this pull request Jan 17, 2016
[MRG+1] MAINT: Refactor and speed up silhoutte_score (samples)
@GaelVaroquaux GaelVaroquaux merged commit 769127c into scikit-learn:master Jan 17, 2016
@MechCoder MechCoder deleted the silhouette_score_refactor branch January 17, 2016 15:03
@MechCoder
Copy link
Copy Markdown
Member Author

thanks for the reviews!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants