Skip to content

[MRG] Fix regression in silhouette_score for clusters of size 1#6089

Closed
Sentient07 wants to merge 1 commit intoscikit-learn:masterfrom
Sentient07:issue-5988
Closed

[MRG] Fix regression in silhouette_score for clusters of size 1#6089
Sentient07 wants to merge 1 commit intoscikit-learn:masterfrom
Sentient07:issue-5988

Conversation

@Sentient07
Copy link
Copy Markdown
Contributor

fix #5988

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could also revert that deleted comment Addressed

@raghavrv
Copy link
Copy Markdown
Member

raghavrv commented Jan 4, 2016

This looks good to me (apart from the minor nitpick)! Wait for reviews from @MechCoder or @amueller

(PS: Could you also squash the two commits? ;) )

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PEP8 ([2,4] --> [2, 4])

@MechCoder MechCoder changed the title Reverted the change, added regression test [MRG] Fix regression in silhouette_score for clusters of size 1 Jan 7, 2016
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is not a regression test. It will pass in master.

Can you remove this and add a new test testing something like this

distances = np.random.RandomState(0).rand(5)
A = _intra_cluster_distance(distance, np.array([0, 1, 1, 1, 1]), 0)
assert_true(np.isnan(A))

@MechCoder
Copy link
Copy Markdown
Member

Would be great if you could hold on till we clarify that the issue is actually an issue.

@amueller
Copy link
Copy Markdown
Member

it is according to the definition on wikipedia, right? (at least when I last thought about that ;)

@Sentient07
Copy link
Copy Markdown
Contributor Author

There's a merge conflict, which I realized just now. Should i resolve it or wait and do it when it's ready to be merged ? @MechCoder @amueller @rvraghav93

@raghavrv
Copy link
Copy Markdown
Member

Resolving merge conflicts as soon as you encounter one saves a lot of headache later on, (say if you happen to accumulate more merge conflicts, it gets harder to fix them).

@Sentient07
Copy link
Copy Markdown
Contributor Author

@rvraghav93 there are some changes that has been made in the master, which i don't understand why it's been made. Like
from sklearn.utils.testing import assert_false, assert_almost_equal
has been changed to

from sklearn.utils.testing import assert_false
from sklearn.utils.testing import assert_almost_equal

I'd go ahead with those changes ?
Edit1: have pinged you in the commit that made this change

@GaelVaroquaux
Copy link
Copy Markdown
Member

You should follow the changes in master.

@Sentient07
Copy link
Copy Markdown
Contributor Author

@rvraghav93 sorry for the delay,fixed merge conflicts. Shall i amend my commits ?
@GaelVaroquaux Yeah, now it's in agreement with master

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new line at the end

reverted the comment

Resolved merge conflicts
@MechCoder
Copy link
Copy Markdown
Member

Thinking about it again, I would suggest we just raise an error for this case

@jnothman
Copy link
Copy Markdown
Member

I have justified at f0f174b#commitcomment-18600086 why I think setting a sample score to 0 for a sample clustered alone is reasonable. I do not think an error is appropriate.

@amueller
Copy link
Copy Markdown
Member

fixed in #7438

@amueller amueller closed this Oct 25, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

regression in silhouette score

6 participants