TST Extend tests for scipy.sparse.*array in sklearn/utils/tests/test_shortest_path.py #27502

Closed
KartikeyBartwal wants to merge 2 commits into scikit-learn:main from KartikeyBartwal:test_shortest_path_

Conversation

@KartikeyBartwal

Reference Issues/PRs

Towards #27090

What does this implement/fix? Explain your changes.

The functions in this file don't use scipy.sparse.csr_matrix at all, so there doesn't appear to be any need to add CSR_CONTAINERS parameterization.

However, a small optimization of the floyd_warshall_slow function has been done. The improvements:

  1. Improved efficiency: NumPy's optimized functions np.outer and np.minimum perform the matrix operations more efficiently; taking advantage of NumPy's vectorized operations makes the computation faster.

  2. A simpler implementation

  3. Consistent handling of non-zero entries
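The vectorized update described above can be sketched as follows. This is a minimal sketch, not the actual patch: the function name is mine, and the assumption that the input is a square 0/1 connectivity matrix is inferred from the clamp-to-1 step (note that with that clamp the function computes reachability rather than distances):

```python
import numpy as np

def floyd_warshall_slow_vectorized(graph):
    # Hypothetical sketch of the proposed change; `graph` is assumed to be
    # a square 0/1 connectivity matrix.
    graph = np.asarray(graph, dtype=float).copy()
    n = graph.shape[0]
    for k in range(n):
        # Any (i, j) pair connected through vertex k gets a non-zero entry
        graph += np.outer(graph[:, k], graph[k, :])
    # Clamp all non-zero entries back to 1
    np.minimum(graph, 1, out=graph)
    return graph

# Chain 0 - 1 - 2: vertex 2 becomes reachable from vertex 0
chain = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
closure = floyd_warshall_slow_vectorized(chain)
```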

Any other comments?

Please let me know if these changes help

@github-actions

github-actions bot commented Sep 30, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 228a1d3. Link to the linter CI: here

Member

@jjerphan jjerphan left a comment

Hi @KartikeyBartwal,

Thank you for this contribution.

As you mentioned, nothing needs to be done for this file regarding sparse containers.

This LGTM.

Member

This is indeed nicer. 💯

Could you perform a small benchmark?

Author

A benchmark?

Author

It would mean a lot if you could merge my changes. It's Hacktoberfest after all 😊

Member

I was thinking of a comparison between the current implementation and yours to verify that your changes make the code slightly faster, but that might be overkill.

We require two approvals for PRs to be merged. :)

Author

Okay! Let's do a comparative analysis 🙌🏻

Author

I'll share the results

Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
# Add connections that pass through vertex k via an outer product
graph += np.outer(graph[:, k], graph[k, :])
# Clamp all non-zero entries back to 1
np.minimum(graph, 1, out=graph)
Member

Could you remove the comments?

@@ -9,21 +9,14 @@
def floyd_warshall_slow(graph, directed=False):
Member

@glemaitre glemaitre Oct 2, 2023

The original code is really the pseudo-code of the Floyd–Warshall algorithm (and non-optimal, for testing purposes only): https://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm

let dist be a |V| × |V| array of minimum distances initialized to ∞ (infinity)
for each edge (u, v) do
    dist[u][v] ← w(u, v)  // The weight of the edge (u, v)
for each vertex v do
    dist[v][v] ← 0
for k from 1 to |V|
    for i from 1 to |V|
        for j from 1 to |V|
            if dist[i][j] > dist[i][k] + dist[k][j] 
                dist[i][j] ← dist[i][k] + dist[k][j]
            end if

With the outer product, it is much more difficult to make the link with the pseudo-code.

I would be -1 on this change.
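For reference, the pseudo-code above translates almost line-for-line into Python. This is a sketch under assumptions of my own (the function name, a dense weight matrix as input, and the convention that a zero entry means "no edge"):

```python
import numpy as np

def floyd_warshall_reference(weights):
    # Direct transcription of the pseudo-code, kept deliberately
    # non-optimal so it stays readable as a test oracle
    n = weights.shape[0]
    # let dist be a |V| x |V| array of minimum distances initialized to inf
    dist = np.full((n, n), np.inf)
    # for each edge (u, v): dist[u][v] <- w(u, v)
    u, v = np.nonzero(weights)
    dist[u, v] = weights[u, v]
    # for each vertex v: dist[v][v] <- 0
    np.fill_diagonal(dist, 0.0)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i, j] > dist[i, k] + dist[k, j]:
                    dist[i, j] = dist[i, k] + dist[k, j]
    return dist

# Directed chain 0 -> 1 (weight 1) -> 2 (weight 2)
weights = np.array([[0., 1., 0.], [0., 0., 2.], [0., 0., 0.]])
dist = floyd_warshall_reference(weights)
```

The triple loop maps one-to-one onto the `for k / for i / for j` structure of the pseudo-code, which is exactly the readability the review comment above is defending.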

@glemaitre
Member

Closing this PR since there are no other changes to be done regarding the sparse arrays/matrices.

@glemaitre glemaitre closed this Oct 31, 2023