Add Clauset-Newman-Moore modularity-max community detection#2871
Add Clauset-Newman-Moore modularity-max community detection#2871dschult merged 7 commits intonetworkx:masterfrom
Conversation
elplatt
commented
Feb 7, 2018
* Adds modularity_max module to networkx.algorithms.community package * Adds greedy_modularity_communities() function * Adds test using Zachary karate club network.
* Replace greedy_modularity_communities() with CNM implementation. * Add networkx.utils.mapped_queue used by the above implementation. * Add tests for networkx.utils.mapped_queue.
* Add TestNaive class.
* Removes call to modularity() left over from debugging. Results in a significant speed-up.
|
This looks pretty good! Thanks! Can you say something about the need for the locally written and maintained queue utility? |
|
I'll work on pep8. The heapq module is insufficient for this implementation because it doesn't support removing or updating elements (aside from removing the root). There is a third-party HeapDict package (BSD 3-clause license) that looks like it could work. That package stores elements and priorities as separate values, which isn't needed for this algorithm and would add both time and space overhead. Would it be worth doing some comparisons using that implementation instead? I'm not sure how desirable it is to add a dependency. |
* Update modularity_max module and tests for pep8. * Update mapped_queue module and tests for pep8.
|
Thanks for that description. I would prefer not to add a dependency. How often do you have to remove a non-root element? I'll look at the code some more... |
|
The algorithm maintains a matrix of how much the modularity would increase by merging each pair of communities. There's a priority queue for each row, and one more for the max of each row. After each merge, a row/column is deleted and the modularity deltas are updated for all neighbors of the merged communities, which is where heapq falls short. |
|
OK.. That makes sense to me. Could you add to the doc file |
…#2871) * Add greedy modularity maximization community detection. * Adds modularity_max module to networkx.algorithms.community package * Adds greedy_modularity_communities() function * Adds test using Zachary karate club network. * Add Clauset-Newman-Moore community detection. * Replace greedy_modularity_communities() with CNM implementation. * Add networkx.utils.mapped_queue used by the above implementation. * Add tests for networkx.utils.mapped_queue. * Add tests for naive modularity maximization. * Add TestNaive class. * Remove redundant modularity calculation. * Removes call to modularity() left over from debugging. Results in a significant speed-up. * Comply with pep8. * Update modularity_max module and tests for pep8. * Update mapped_queue module and tests for pep8. * Fix import of MappedQueue. * Add documentation for modularity_max module.
| # Edward L. Platt <ed@elplatt.com> | ||
| # | ||
| # TODO: | ||
| # - Alter equations for weighted case |
There was a problem hiding this comment.
@elplatt Was this TODO done, or is there still something to do about this?
|
I've figured out the appropriate equations for all combinations of
(un)weighted and (un)directed, including self loops, but they're not
implemented yet. We also need some known results to write test cases
against.
Equations are here: https://github.com/elplatt/Paper-CNM-Modularity
…On Sat, Mar 30, 2019 at 11:03 AM Mridul Seth ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In networkx/algorithms/community/modularity_max.py
<#2871 (comment)>:
> @@ -0,0 +1,282 @@
+# modularity_max.py - functions for finding communities based on modularity
+#
+# Copyright 2018 Edward L. Platt
+#
+# This file is part of NetworkX
+#
+# NetworkX is distributed under a BSD license; see LICENSE.txt for more
+# information.
+#
+# Authors:
+# Edward L. Platt ***@***.***>
+#
+# TODO:
+# - Alter equations for weighted case
@elplatt <https://github.com/elplatt> Was this TODO done, or is there
still something to do about this?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#2871 (review)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAS0WPGGNJ4mRFt7XSqTWvD6WS2Gm8tQks5vb3ysgaJpZM4R8JBJ>
.
--
Edward L. Platt
PhD Candidate, University of Michigan School of Information
he/him | https://elplatt.com | @elplatt | @elplatt@social.coop
Tips for stopping email overload:
https://hbr.org/2012/02/stop-email-overload-1
|