Skip to content

Add spectral bipartition community finding and greedy bipartition using node swaps#8347

Merged
rossbar merged 13 commits intonetworkx:mainfrom
dschult:spectral_and_greedy_bipartition
Dec 4, 2025
Merged

Add spectral bipartition community finding and greedy bipartition using node swaps#8347
rossbar merged 13 commits intonetworkx:mainfrom
dschult:spectral_and_greedy_bipartition

Conversation

@dschult
Copy link
Copy Markdown
Member

@dschult dschult commented Nov 14, 2025

Add spectral community finding and greedy bipartition using node swaps. These functions are part of PR #1733 which is a reworking on PR #764 which is a collection of work from Ben Edward's GSoC project in 2011. Many other functions from that project have made their way into NetworkX. And now these bipartition functions join them. A bipartition is a splitting of a network into two communities. The idea here is to choose communities that maximize "modularity" (a measure of the strength of communities in the network based on counting edges within a community and edges between communities).

This PR adds spectral_modularity_bipartition and greedy_node_swap_bipartition for modularity based bipartition.
Adds tests and connections to docs.

It also moves some files:

  • Move kernighan_lin.py and tests to bipartitions.py and tests/test_bipartitions.py.
  • Merge spectral_modularity_bipartition() and greedy_node_swap_bipartition() with kernighan_lin() in bipartitions.py.
  • Adds tests and docs.

From the original text:
"Adds modularity measure for communities."
Sat Sep 8 19:24:56 2012 +0200

Co-authored-by: Moritz Emanuel Beber <m.beber@jacobs-university.de>
Co-authored-by: Jeffrey Finkelstein <jeffrey.finkelstein@gmail.com>

move kernighan_lin.py and tests to bipartitions.py and tests/test_bipartitions.py
merge spectral with kernighan_lin and call it bipartitions
merge new functions into bipartitions, tests and docs

Adds modularity measure for communities.
Sat Sep 8 19:24:56 2012 +0200

Co-authored-by: Moritz Emanuel Beber <m.beber@jacobs-university.de>
Co-authored-by: Jeffrey Finkelstein <jeffrey.finkelstein@gmail.com>
]

for i in range(max_iter):
costs = list(_kernighan_lin_sweep(edges, side))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any value on having the lin_sweep being an iterator if we just convert it to list? we could make the method return the list. We could even move the the min_cost check inside and signal empty list of swaps as the condition to break.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there advantages to returning a list over yielding and putting the list constructor in the main function? Similarly, would moving min inside the helper function be beneficial in some way?

Copy link
Copy Markdown
Contributor

@amcandio amcandio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing the feedback. This looks good to me!

Copy link
Copy Markdown
Member Author

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm doing a review of this because it's not my code even though I refactored for this PR. And it's been long enough since my refactor for me to have "regained my innocence" and look with ~fresh eyes. I'll make these changes soon.

@rossbar
Copy link
Copy Markdown
Contributor

rossbar commented Nov 20, 2025

I will review too, but I'm holding off until after release since this modifies a submodule name... I don't think that's a big deal (or at least it's not a blocker for me), but I wouldn't want to introduce a technically backward-incompatible change during an RC!

…-fold

For K_L_sweep:
  improve names
  replace heap equality check with comparing sides.
  reduce inputs to 1 on helper function.
For K-L_bisection:
  func remove indexing,
  use "seed" only when partition is not provided,
  split nodes in A, B early and set side based on that.
For greedy_node_swap:
  rename C to C_best_so_far and related
  remove one extra deepcopy from inside the loop
@dschult dschult added this to the 3.7 milestone Nov 21, 2025
Copy link
Copy Markdown
Contributor

@rossbar rossbar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took the liberty of pushing up some minor docstring/test nits but otherwise this LGTM!

I approve as-is but did leave a comment re: whether we want to do keyword-only args for the new spectral bipartition. I have no strong feeling about it and it's not a blocker for me - I leave it to @dschult !



@nx.utils.not_implemented_for("multigraph")
def greedy_node_swap_bipartition(G, C_init=None, max_iter=10):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we enforce keyword-only args here? My sense is yes, though then it might make sense to have a slightly more descriptive kwarg name for C_init, like init_communities or init_partition. OTOH - the C_ var name matches better with the local variable names inside the function. WDYT? Alternatively it'd be fine to leave as-is!

Suggested change
def greedy_node_swap_bipartition(G, C_init=None, max_iter=10):
def greedy_node_swap_bipartition(G, *, C_init=None, max_iter=10):

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes -- I think keyword-only args here would be good.

And you are right that C_init should be renamed.
Maybe something like initial_guess or initial_bipartition but they're kind of long.
We could go with guess, or start_from or swap_from. :)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I like precise, even if it's a bit long, so initial_bipartition is the one that stands out to me :). No blockers though, I'd be happy with whatever argument name (including the current one)!

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added kw-only args. And renamed C_init as init_split in this latest push. The rhyming and nice word "split" for "bipartition" were attractive. I also changed some doc wording. And I changed the varnames that used to use C to denote communities so they now use split, e.g. Cmax is now max_split and Cnext is next_split. See what you think.

Copy link
Copy Markdown
Contributor

@rossbar rossbar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added kw-only args. And renamed C_init as init_split in this latest push. The rhyming and nice word "split" for "bipartition" were attractive. I also changed some doc wording. And I changed the varnames that used to use C to denote communities so they now use split, e.g. Cmax is now max_split and Cnext is next_split. See what you think.

These names all make sense to me, and everything else looks good! Thanks @dschult and @amcandio !

@rossbar rossbar merged commit 3dacd1b into networkx:main Dec 4, 2025
52 checks passed
@dschult dschult modified the milestones: 3.7, 3.6 Dec 4, 2025
@dschult dschult deleted the spectral_and_greedy_bipartition branch December 4, 2025 21:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

3 participants