feat: Add Maximum Bipartite Matching Algorithm #740

BryanCruz · 2025-02-17T15:36:00Z

Add Maximum Bipartite Matching Algorithm

Solves the Maximum Matching Problem for a Bipartite graph by reducing it to a Maximum Flow problem and using existent algorithm for Maximum Flow.

References:

Creating the Max Flow Instance

In the algorithm I manually copy all nodes and edges from original graph to create a network flow instance.
I believe there are faster ways to do this and optimize the algorithm, but I don't know how to do so, so I'm accepting some help on this.

Further Optimization

This algorithm can be further optimized when Dinic's Algorithm PR is accepted, because Dinic's is much faster than Ford Fulkerson method for finding maximum flows in unit networks.

I executed the bench tests with Ford Fulkerson and with the Dinic's algorithm, and this were the results:

using ford_fulkerson to find max flow:

    test maximum_bipartite_matching_specific_100  ... bench:      32,223.41 ns/iter (+/- 5,470.65)
    test maximum_bipartite_matching_specific_1000 ... bench:  27,260,136.10 ns/iter (+/- 5,475,181.00)

using dinics to find max flow:

test maximum_bipartite_matching_specific_100  ... bench:      21,775.31 ns/iter (+/- 1,547.98)
test maximum_bipartite_matching_specific_1000 ... bench:   3,098,250.00 ns/iter (+/- 650,500.50)

Current implementation benchmarks for the same bipartite instances:

test maximum_bipartite_matching_generic_100   ... bench:      40,027.83 ns/iter (+/- 1,367.46)
test maximum_bipartite_matching_generic_1000  ... bench:  37,609,410.00 ns/iter (+/- 1,382,172.00)

RaoulLuque

Hey, first off, thanks for the PR :) I left some comments but other than that, it looks good to me 👍 Maybe check if a rebase is necessary? (In particular instead of the merge, I am not sure which is preferred by petgraph)

I'm not a maintainer, so I don't have the final say for a merge, but will try to make someone else have a look at it too !

src/algo/matching.rs

tests/matching.rs

benches/matching.rs

starovoid · 2025-05-15T13:45:52Z

It's interesting to see benchmarks against the existing maximum_matching algorithm.

src/algo/matching.rs

RaoulLuque · 2025-05-17T18:56:55Z

It's interesting to see benchmarks against the existing maximum_matching algorithm.

Here are the benchmarks from my machine. The generic implementation is the existing one and the specific one is the one from this PR. The numbers indicate a variable which indicates 3/2 times the node count (1000 -> 666 nodes in the graph):

maximum_bipartite_matching_1000_generic_implementation  ... bench:  63,293,792.10 ns/iter (+/- 1,772,475.82)
maximum_bipartite_matching_1000_specific_implementation ... bench:  34,926,283.70 ns/iter (+/- 1,797,226.46)
maximum_bipartite_matching_100_generic_implementation   ... bench:      43,887.51 ns/iter (+/- 2,730.11)
maximum_bipartite_matching_100_specific_implementation  ... bench:      45,081.94 ns/iter (+/- 3,020.67)
maximum_bipartite_matching_30_generic_implementation    ... bench:       2,998.50 ns/iter (+/- 125.20)
maximum_bipartite_matching_30_specific_implementation   ... bench:       2,983.48 ns/iter (+/- 171.43)

As can be seen, the difference seems to be noticeable on bigger graphs but negligible on smaller graphs. Note that I also added a new 30 benchmark to test if the generic implementation outperforms the specific one on small graphs.

However, the difference increases when instead of Ford-Fulkerson, Dinics from this PR is used (I was unable to find out, why the other results change so drastically as well, since the only thing I change is the Max-Flow algorithm used in the maximum bipartite matching algorithm - I reran both benchmarks multiple times with the same results):

maximum_bipartite_matching_1000_generic_implementation  ... bench:  72,127,685.20 ns/iter (+/- 2,652,352.21)
maximum_bipartite_matching_1000_specific_implementation ... bench:   2,561,473.48 ns/iter (+/- 85,881.61)
maximum_bipartite_matching_100_generic_implementation   ... bench:      18,356.30 ns/iter (+/- 750.69)
maximum_bipartite_matching_100_specific_implementation  ... bench:      18,056.75 ns/iter (+/- 905.22)
maximum_bipartite_matching_30_generic_implementation    ... bench:       2,971.91 ns/iter (+/- 130.73)
maximum_bipartite_matching_30_specific_implementation   ... bench:       2,908.20 ns/iter (+/- 135.21)

starovoid · 2025-05-17T19:20:45Z

However, the difference increases when instead of Ford-Fulkerson, Dinics from this PR is used (I was unable to find out, why the other results change so drastically as well, since the only thing I change is the Max-Flow algorithm used in the maximum bipartite matching algorithm

The results are similar to the truth, but also impressive! Dinic's algorithm is theoretically indeed faster that current max-flow implementation without shortest path searching, and this benchmark confirms this. But to be sure, I would like to take a look at the benchmark code.

Thank you for the good work done, I would like to include the benchmark code in the repository after the PR merge.

RaoulLuque · 2025-05-17T19:22:37Z

Regarding:

In the algorithm I manually copy all nodes and edges from original graph to create a network flow instance.
I believe there are faster ways to do this and optimize the algorithm, but I don't know how to do so, so I'm accepting some help on this.

I tried out some things, but was also unable to find a good approach. I thought about copying the entire graph and deleting all edges, but that added new problems, where the newly created graph was of type G which one would have to impose more trait bounds on. So maybe, @starovoid, you have an idea, and otherwise we just leave it at that.

RaoulLuque · 2025-05-17T19:28:26Z

The results are similar to the truth, but also impressive! Dinic's algorithm is theoretically indeed faster that current max-flow implementation without shortest path searching, and this benchmark confirms this. But to be sure, I would like to take a look at the benchmark code.

Thank you for the good work done, I would like to include the benchmark code in the repository after the PR merge.

I see. What do you mean exactly by take a look at the benchmark code? The code of the benchmark is included in this PR and the linked Dinics PR in benches/matching.rs.

I just copied the benchmark and used the generic maximum matching algorithm instead of the bipartite specific one:

#[bench]
fn maximum_bipartite_matching_1000_generic_implementation(bench: &mut Bencher) {
    let (g, partition_a, partition_b) = generate_bipartite(1_000);
    let partition_a_ids: Vec<_> = partition_a
        .iter()
        .map(|&id| NodeIndexable::from_index(&g, id as usize))
        .collect();
    let partition_b_ids: Vec<_> = partition_b
        .iter()
        .map(|&id| NodeIndexable::from_index(&g, id as usize))
        .collect();
    bench.iter(|| maximum_matching(&g));
}

Note that I also did not remove the partition a and b collection code since I was not sure if it counts towards the benchmark time. Probably not because it is not part of the bench closure, but I wanted to make sure

src/algo/matching.rs

starovoid · 2025-05-17T19:45:12Z

So maybe, @starovoid, you have an idea, and otherwise we just leave it at that.

The test from #739 on my PC showed the opposite result in the max-flow task, so the first step is to dive into dinic, write more benchmarks and understand this phenomenon.

First attemt

test dinics_bench ... bench:                    16,392.67 ns/iter (+/- 230.41)
test ford_fulkerson_bench ... bench:       9,859.98 ns/iter (+/- 152.55)

Second attempt

test dinics_bench ... bench:      15,856.29 ns/iter (+/- 157.41)
test ford_fulkerson_bench ... bench:       9,486.44 ns/iter (+/- 211.69)

If the dinic implementation turns out to be really faster, the first task will be to adopt it. I will pre-schedule both of these algorithms for version 0.8.4.

If it turns out to solve all the issues quickly, then I will include it in version 0.8.3, but at the moment there are many other PRs that are hungry for attention. For example, we need to choose between #590 and bridges from #738.

RaoulLuque · 2025-05-17T20:02:21Z

The test from #739 on my PC showed the opposite result in the max-flow task, so the first step is to dive into dinic, write more benchmarks and understand this phenomenon.

Okay, interesting. I did not actually run those benchmarks yet, good catch. I will then briefly look into those as well.

BryanCruz · 2025-07-01T03:06:33Z

So maybe, @starovoid, you have an idea, and otherwise we just leave it at that.

The test from #739 on my PC showed the opposite result in the max-flow task, so the first step is to dive into dinic, write more benchmarks and understand this phenomenon.

First attemt
test dinics_bench ... bench:                    16,392.67 ns/iter (+/- 230.41)
test ford_fulkerson_bench ... bench:       9,859.98 ns/iter (+/- 152.55)
Second attempt
test dinics_bench ... bench:      15,856.29 ns/iter (+/- 157.41)
test ford_fulkerson_bench ... bench:       9,486.44 ns/iter (+/- 211.69)
If the dinic implementation turns out to be really faster, the first task will be to adopt it. I will pre-schedule both of these algorithms for version 0.8.4.

If it turns out to solve all the issues quickly, then I will include it in version 0.8.3, but at the moment there are many other PRs that are hungry for attention. For example, we need to choose between #590 and bridges from #738.

@starovoid about the Dinic's implementation and why the benchmarks in it were slower than Ford-Fulkersons, I added some commits and comments in #739 . Also, updated its description.

BryanCruz · 2025-07-01T03:28:22Z

However, the difference increases when instead of Ford-Fulkerson, Dinics from #739 is used (I was unable to find out, why the other results change so drastically as well, since the only thing I change is the Max-Flow algorithm used in the maximum bipartite matching algorithm - I reran both benchmarks multiple times with the same results):

@RaoulLuque see Analysis > Special Cases from Dinics' Wikipedia article:

BryanCruz · 2025-07-04T02:43:31Z

added benchmarks for current maximum_matching with the same bipartite instances used in new algorithm

RaoulLuque · 2025-07-06T12:14:26Z

Thanks for the great work 🌟 I'll have a look latest in the course of next week (does not mean someone else shouldn't / can't have a look in the meantime ^^) :)

In the meantime you could check if the comments on the other PR about node_bound instead of node_count apply here somewhere as well 👍

RaoulLuque · 2025-07-18T08:16:02Z

Hey, just wanted to let you know that I did not forget about reviewing, but currently I have exam phase at Uni and it was a bit more busy than I expected lately so I was not able to review your PR until now :o The phase will end on 24th however, and after that I should have enough time to review the PR so we can get the improvements and your work merged :)

RaoulLuque

Alright, got to the review a bit earlier than expected I guess :) Again, thanks a lot for your great work 🌞 Had only a few nitpicks 👍
(Probably you know this, but if you agree with suggested changes, you can just approve them here in GitHub :) )

RaoulLuque · 2025-07-20T19:47:58Z

benches/matching.rs

+/// This function produces an undirected bipartite graph with two partitions:
+/// * `partition_1`: nodes with indices divisible by 6 (i.e., `0, 6, 12, ...`);
+/// * `partition_2`: nodes with odd indices (i.e., `1, 3, 5, ...`).


I think the docstring for the helper function is a great idea ! 🫶 However, technically I think the resulting graph has node_count_upper_bound many nodes. More precisely, the even indices not divisible by 6 will just be nodes without any edges adjacent to them. Not 100% sure, but this should be what happens when one calls from_edges. So maybe specify that as well. That is, that the resulting graph will have some node without any edges 👍
I would however suggest an alternative approach to generate the graph (with only the vertices with indices divisible by 6 or odd), by only actually adding the vertices that are needed (with indices divisible by 6 or odd) and returning a HashMap with your graph which maps the indices to the NodeIndexes in the graph.
(I am suggesting the latter since we are using these graphs to benchmark and thus they probably shouldn't contain unnecessary nodes since these might bias towards one algorithm or another if one algorithm does not handle "stale" nodes well).

RaoulLuque · 2025-07-20T19:52:02Z

benches/matching.rs

+/// An edge is created between a node `a` in `partition_1` and a node `b` in `partition_b`
+/// if `a` index is lower than `j` index.


Suggested change

/// An edge is created between a node `a` in `partition_1` and a node `b` in `partition_b`

/// if `a` index is lower than `j` index.

/// An edge is created between a node with index `i` in `partition_1` and a node with index `j` in `partition_2`

/// if `i` is smaller than or equal to `j`.

RaoulLuque · 2025-07-20T20:09:20Z

src/algo/matching.rs

+
+fn source_and_target_from_partitions<G>(


Suggested change

fn source_and_target_from_partitions<G>(

/// Returns (endpoint_a, endpoint_b) such that endpoint_a is in `partition_a` and endpoint_b in `partition_b`.

///

/// Panics if the source node is in neither partition.

fn source_and_target_from_partitions<G>(

tests/matching.rs

See also #740 and #847 # Add Dinic's Maximum Flow Algorithm Computes the maximum flow of a weighted directed graph. Implementation inspired by existing [Ford Fulkerson](https://github.com/petgraph/petgraph/blob/9f778ec11b218ef6d1dfaaa51467b24c8c939fca/src/algo/maximum_flow/ford_fulkerson.rs) in Petgraph. Also, creates a `maximum_flow` submodule inside `algo` module to group flow network algorithms. References: * https://en.wikipedia.org/wiki/Dinic%27s_algorithm * https://cp-algorithms.com/graph/dinic.html I also added some more benches that build graphs where Dinics strongly outperforms Ford Fulkerson, as the graph built in the existent bench in the library favored Ford Fulkerson method. * Dinics ``` > cargo bench --bench dinics running 5 tests test dinics_bench ... bench: 52,870.19 ns/iter (+/- 5,405.82) test dinics_bench_dense_middle ... bench: 680,920.00 ns/iter (+/- 120,617.64) test dinics_bench_dense_middle_varying_weights ... bench: 1,527,980.00 ns/iter (+/- 281,240.50) test dinics_bench_many_edges ... bench: 335,548.12 ns/iter (+/- 59,469.62) test dinics_bench_wide ... bench: 163,606.25 ns/iter (+/- 4,729.17) test result: ok. 0 passed; 0 failed; 0 ignored; 5 measured; 0 filtered out; finished in 18.47s ``` * Ford Fulkerson ``` > cargo bench --bench ford_fulkerson running 5 tests test ford_fulkerson_bench ... bench: 15,360.68 ns/iter (+/- 3,189.80) test ford_fulkerson_bench_dense_middle ... bench: 1,129,970.00 ns/iter (+/- 194,473.25) test ford_fulkerson_bench_dense_middle_varying_weights ... bench: 14,432,700.00 ns/iter (+/- 1,127,474.00) test ford_fulkerson_bench_many_edges ... bench: 153,336.25 ns/iter (+/- 20,860.00) test ford_fulkerson_bench_wide ... bench: 3,733,340.00 ns/iter (+/- 957,471.00) test result: ok. 0 passed; 0 failed; 0 ignored; 5 measured; 0 filtered out; finished in 15.93s``` ``` --------- Co-authored-by: Raoul Luqué <125205120+RaoulLuque@users.noreply.github.com> Co-authored-by: Paul Buehne <64769435+cactusdualcore@users.noreply.github.com> Co-authored-by: cactusdualcore <cactusdualcore@noreply.codeberg.org>

See also petgraph#740 and petgraph#847 # Add Dinic's Maximum Flow Algorithm Computes the maximum flow of a weighted directed graph. Implementation inspired by existing [Ford Fulkerson](https://github.com/petgraph/petgraph/blob/9f778ec11b218ef6d1dfaaa51467b24c8c939fca/src/algo/maximum_flow/ford_fulkerson.rs) in Petgraph. Also, creates a `maximum_flow` submodule inside `algo` module to group flow network algorithms. References: * https://en.wikipedia.org/wiki/Dinic%27s_algorithm * https://cp-algorithms.com/graph/dinic.html I also added some more benches that build graphs where Dinics strongly outperforms Ford Fulkerson, as the graph built in the existent bench in the library favored Ford Fulkerson method. * Dinics ``` > cargo bench --bench dinics running 5 tests test dinics_bench ... bench: 52,870.19 ns/iter (+/- 5,405.82) test dinics_bench_dense_middle ... bench: 680,920.00 ns/iter (+/- 120,617.64) test dinics_bench_dense_middle_varying_weights ... bench: 1,527,980.00 ns/iter (+/- 281,240.50) test dinics_bench_many_edges ... bench: 335,548.12 ns/iter (+/- 59,469.62) test dinics_bench_wide ... bench: 163,606.25 ns/iter (+/- 4,729.17) test result: ok. 0 passed; 0 failed; 0 ignored; 5 measured; 0 filtered out; finished in 18.47s ``` * Ford Fulkerson ``` > cargo bench --bench ford_fulkerson running 5 tests test ford_fulkerson_bench ... bench: 15,360.68 ns/iter (+/- 3,189.80) test ford_fulkerson_bench_dense_middle ... bench: 1,129,970.00 ns/iter (+/- 194,473.25) test ford_fulkerson_bench_dense_middle_varying_weights ... bench: 14,432,700.00 ns/iter (+/- 1,127,474.00) test ford_fulkerson_bench_many_edges ... bench: 153,336.25 ns/iter (+/- 20,860.00) test ford_fulkerson_bench_wide ... bench: 3,733,340.00 ns/iter (+/- 957,471.00) test result: ok. 0 passed; 0 failed; 0 ignored; 5 measured; 0 filtered out; finished in 15.93s``` ``` --------- Co-authored-by: Raoul Luqué <125205120+RaoulLuque@users.noreply.github.com> Co-authored-by: Paul Buehne <64769435+cactusdualcore@users.noreply.github.com> Co-authored-by: cactusdualcore <cactusdualcore@noreply.codeberg.org>

BryanCruz added 5 commits February 17, 2025 12:21

Bipartite matching (#17)

14c9170

fix

d748c8d

clippy

7947889

clippy tests

a6eb349

fix bench tests

dd5f293

BryanCruz marked this pull request as ready for review February 17, 2025 15:57

BryanCruz added 2 commits February 17, 2025 13:19

simplify traits

9c9b32f

Merge branch 'master-original' into add-bipartite-matching

374679a

BryanCruz changed the title ~~Add Maximum Bipartite Matching Algorithm~~ feat: Add Maximum Bipartite Matching Algorithm Apr 13, 2025

RaoulLuque reviewed May 15, 2025

View reviewed changes

src/algo/matching.rs Outdated Show resolved Hide resolved

src/algo/matching.rs Outdated Show resolved Hide resolved

tests/matching.rs Show resolved Hide resolved

benches/matching.rs Outdated Show resolved Hide resolved

starovoid self-requested a review May 15, 2025 13:19

RaoulLuque reviewed May 15, 2025

View reviewed changes

src/algo/matching.rs Outdated Show resolved Hide resolved

RaoulLuque reviewed May 17, 2025

View reviewed changes

src/algo/matching.rs Outdated Show resolved Hide resolved

RaoulLuque mentioned this pull request May 17, 2025

feat: Add Dinic's Maximum Flow Algorithm #739

Merged

Merge branch 'master-original' into add-bipartite-matching

7f28a7d

BryanCruz force-pushed the add-bipartite-matching branch from 2062b6b to 374679a Compare July 1, 2025 03:14

reword doc

b030efe

BryanCruz added 4 commits July 3, 2025 22:14

make new indexes stable

d4ab8e0

add docs in petgraph format

7672241

add test for stable graph

7da6337

improve benches

a756deb

BryanCruz added 3 commits July 3, 2025 23:29

fix

ad4fe01

more tests

6dc5092

typo

f1c1f14

RaoulLuque reviewed Jul 20, 2025

View reviewed changes

RaoulLuque added C-new-algorithm Category: A feature request for a new graph algorithm S-waiting-on-author Status: Awaiting some action by the PR/Issue author A-crate Area: Petgraph crate functionality labels Jul 20, 2025

add bipartite odd_path test

a3a58ba

		/// An edge is created between a node `a` in `partition_1` and a node `b` in `partition_b`
		/// if `a` index is lower than `j` index.

feat: Add Maximum Bipartite Matching Algorithm #740

Are you sure you want to change the base?

feat: Add Maximum Bipartite Matching Algorithm #740

Uh oh!

Conversation

BryanCruz commented Feb 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Add Maximum Bipartite Matching Algorithm

Creating the Max Flow Instance

Further Optimization

using ford_fulkerson to find max flow:

using dinics to find max flow:

Current implementation benchmarks for the same bipartite instances:

Uh oh!

RaoulLuque left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

starovoid commented May 15, 2025

Uh oh!

Uh oh!

RaoulLuque commented May 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

starovoid commented May 17, 2025

Uh oh!

RaoulLuque commented May 17, 2025

Uh oh!

RaoulLuque commented May 17, 2025

Uh oh!

Uh oh!

starovoid commented May 17, 2025

Uh oh!

RaoulLuque commented May 17, 2025

Uh oh!

BryanCruz commented Jul 1, 2025

Uh oh!

BryanCruz commented Jul 1, 2025

Uh oh!

BryanCruz commented Jul 4, 2025

Uh oh!

RaoulLuque commented Jul 6, 2025

Uh oh!

RaoulLuque commented Jul 18, 2025

Uh oh!

RaoulLuque left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

RaoulLuque Jul 20, 2025

Choose a reason for hiding this comment

Uh oh!

RaoulLuque Jul 20, 2025

Choose a reason for hiding this comment

Uh oh!

RaoulLuque Jul 20, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

BryanCruz commented Feb 17, 2025 •

edited

Loading

RaoulLuque left a comment •

edited

Loading

RaoulLuque commented May 17, 2025 •

edited

Loading

RaoulLuque left a comment •

edited

Loading