Skip to content

Add Clauset-Newman-Moore modularity-max community detection#2871

Merged
dschult merged 7 commits intonetworkx:masterfrom
elplatt:feature-modularity_max
Feb 28, 2018
Merged

Add Clauset-Newman-Moore modularity-max community detection#2871
dschult merged 7 commits intonetworkx:masterfrom
elplatt:feature-modularity_max

Conversation

@elplatt
Copy link
Copy Markdown
Contributor

@elplatt elplatt commented Feb 7, 2018

* Adds modularity_max module to networkx.algorithms.community package
* Adds greedy_modularity_communities() function with CNM implementation
* Adds test using Zachary karate club network
* Add networkx.utils.mapped_queue used by the above implementation
* Add tests for networkx.utils.mapped_queue

* Adds modularity_max module to networkx.algorithms.community package

* Adds greedy_modularity_communities() function

* Adds test using Zachary karate club network.
* Replace greedy_modularity_communities() with CNM implementation.
* Add networkx.utils.mapped_queue used by the above implementation.
* Add tests for networkx.utils.mapped_queue.
* Removes call to modularity() left over from debugging. Results in a significant speed-up.
@dschult
Copy link
Copy Markdown
Member

dschult commented Feb 18, 2018

This looks pretty good! Thanks!
Can you get it to PEP8 standards? (you can install a pep8 style checker using pip install pep8. probably it only will need you to follow the 80 character limit on each line.

Can you say something about the need for the locally written and maintained queue utility?
Could it be replaced by Python's heapq? If so, that could remove maintenance issues for us later.

@elplatt
Copy link
Copy Markdown
Contributor Author

elplatt commented Feb 18, 2018

I'll work on pep8.

The heapq module is insufficient for this implementation because it doesn't support removing or updating elements (aside from removing the root). There is a third-party HeapDict package (BSD 3-clause license) that looks like it could work. That package stores elements and priorities as separate values, which isn't needed for this algorithm and would add both time and space overhead. Would it be worth doing some comparisons using that implementation instead? I'm not sure how desirable it is to add a dependency.

* Update modularity_max module and tests for pep8.

* Update mapped_queue module and tests for pep8.
@dschult
Copy link
Copy Markdown
Member

dschult commented Feb 19, 2018

Thanks for that description. I would prefer not to add a dependency. How often do you have to remove a non-root element? I'll look at the code some more...

@elplatt
Copy link
Copy Markdown
Contributor Author

elplatt commented Feb 19, 2018

The algorithm maintains a matrix of how much the modularity would increase by merging each pair of communities. There's a priority queue for each row, and one more for the max of each row. After each merge, a row/column is deleted and the modularity deltas are updated for all neighbors of the merged communities, which is where heapq falls short.

@dschult
Copy link
Copy Markdown
Member

dschult commented Feb 23, 2018

OK.. That makes sense to me.

Could you add to the doc file doc/reference/algorithms/community.rst to make these functions appear in the documentation reference? I think it is then ready to go. If that's too much, let me know and I'll do it.

@dschult dschult merged commit b7bafdb into networkx:master Feb 28, 2018
@dschult dschult added this to the networkx-2.2 milestone Feb 28, 2018
Loveuse pushed a commit to Loveuse/networkx that referenced this pull request Mar 9, 2018
…#2871)

* Add greedy modularity maximization community detection.

* Adds modularity_max module to networkx.algorithms.community package

* Adds greedy_modularity_communities() function

* Adds test using Zachary karate club network.

* Add Clauset-Newman-Moore community detection.

* Replace greedy_modularity_communities() with CNM implementation.
* Add networkx.utils.mapped_queue used by the above implementation.
* Add tests for networkx.utils.mapped_queue.

* Add tests for naive modularity maximization.

* Add TestNaive class.

* Remove redundant modularity calculation.

* Removes call to modularity() left over from debugging. Results in a significant speed-up.

* Comply with pep8.

* Update modularity_max module and tests for pep8.

* Update mapped_queue module and tests for pep8.

* Fix import of MappedQueue.

* Add documentation for modularity_max module.
# Edward L. Platt <ed@elplatt.com>
#
# TODO:
# - Alter equations for weighted case
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@elplatt Was this TODO done, or is there still something to do about this?

@elplatt
Copy link
Copy Markdown
Contributor Author

elplatt commented Mar 30, 2019 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

3 participants