Skip to content

Replace linear_assignment with scipy linear_sum_assignment #4490

Merged
shyuep merged 3 commits intomaterialsproject:masterfrom
DanielYang59:replace-linear-assignment-scipy
Sep 3, 2025
Merged

Replace linear_assignment with scipy linear_sum_assignment #4490
shyuep merged 3 commits intomaterialsproject:masterfrom
DanielYang59:replace-linear-assignment-scipy

Conversation

@DanielYang59
Copy link
Contributor

@DanielYang59 DanielYang59 commented Sep 1, 2025

Summary

Benchmark

WSL2 Ubuntu 24.04 (AMD64)

Master branch:

     n | avg_time (s) |  std_dev (s)
------------------------------------
   100 |     0.000104 |     0.000018
   500 |     0.004302 |     0.002333
  1000 |     0.018374 |     0.006702
  5000 |     1.049615 |     0.327349

Current branch:

     n | avg_time (s) |  std_dev (s)
------------------------------------
   100 |     0.000121 |     0.000030
   500 |     0.005082 |     0.002957
  1000 |     0.018487 |     0.006484
  5000 |     1.062805 |     0.332552

script:

import time
import numpy as np

from pymatgen.optimization.linear_assignment import LinearAssignment


def benchmark(
    sizes: list[int],
    n_repeats: int = 3,
    rng_seed: int = 42,
) -> None:
    rng = np.random.default_rng(rng_seed)

    results = []

    for n in sizes:
        times = []
        for _ in range(n_repeats):
            costs = rng.random((n, n))
            start = time.perf_counter()
            lap = LinearAssignment(costs)
            end = time.perf_counter()
            times.append(end - start)

            # sanity check
            if lap.solution is None or len(lap.solution) != n:
                raise RuntimeError(f"Invalid solution for n={n}")

        avg_time = np.mean(times)
        std_time = np.std(times)
        results.append((n, avg_time, std_time))

    print(f"{'n':>6} | {'avg_time (s)':>12} | {'std_dev (s)':>12}")
    print("-" * 36)
    for n, avg, std in results:
        print(f"{n:6d} | {avg:12.6f} | {std:12.6f}")


if __name__ == "__main__":
    benchmark(sizes=[100, 500, 1000, 5000], n_repeats=5)

@DanielYang59 DanielYang59 force-pushed the replace-linear-assignment-scipy branch from 8c6f336 to 0997fb9 Compare September 1, 2025 22:03
@DanielYang59 DanielYang59 marked this pull request as ready for review September 2, 2025 20:17
@shyuep
Copy link
Member

shyuep commented Sep 2, 2025

Can you confirm that the scipy version actually is faster? The numbers don't seem to bear that out?

@DanielYang59
Copy link
Contributor Author

DanielYang59 commented Sep 2, 2025

Can you confirm that the scipy version actually is faster? The numbers don't seem to bear that out?

As far as I could tell our version performs almost identically as the scipy version (tested both on WSL2 ubuntu 24.04 AMD64 and my MacOS ARM64 machines), this replacement was not really intended for speed but freeing pymatgen from maintaining linear_assignment now that it's available from scipy?

@shyuep shyuep merged commit b622498 into materialsproject:master Sep 3, 2025
44 checks passed
@DanielYang59 DanielYang59 deleted the replace-linear-assignment-scipy branch September 3, 2025 07:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants