TST Extend tests for scipy.sparse.*array in sklearn/utils/tests/test_shortest_path.py #27502
TST Extend tests for scipy.sparse.*array in sklearn/utils/tests/test_shortest_path.py #27502KartikeyBartwal wants to merge 2 commits intoscikit-learn:mainfrom
scipy.sparse.*array in sklearn/utils/tests/test_shortest_path.py #27502Conversation
jjerphan
left a comment
There was a problem hiding this comment.
Hi @KartikeyBartwal,
Thank you for this contribution.
As you mentioned, nothing needs to be done for this file regarding sparse containers.
This LGTM.
There was a problem hiding this comment.
This is indeed nicer. 💯
Could you perform a small benchmark?
There was a problem hiding this comment.
It would mean a lot if you could could merge my changes. Its HacktoberFest after all 😊
There was a problem hiding this comment.
I was thinking of a comparison between the current implementation and yours to verify that your changes make the code slightly faster, but that might be an overkill.
We require two approvals for PRs to be merged. :)
There was a problem hiding this comment.
Okay! Lets do a comparative analysis 🙌🏻
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
| # Compute the sum of the intermediate distances efficiently | ||
| graph += np.outer(graph[:, k], graph[k, :]) | ||
| # Convert distances to 1 for non-zero entries | ||
| np.minimum(x1=graph, x2=1, out=graph) |
There was a problem hiding this comment.
Could you remove the comments.
| @@ -9,21 +9,14 @@ | |||
| def floyd_warshall_slow(graph, directed=False): | |||
There was a problem hiding this comment.
The original code is really the pseudo-code of the Floyd Warshall algorithm (and non-optimal for testing purpose only): https://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm
let dist be a |V| × |V| array of minimum distances initialized to ∞ (infinity)
for each edge (u, v) do
dist[u][v] ← w(u, v) // The weight of the edge (u, v)
for each vertex v do
dist[v][v] ← 0
for k from 1 to |V|
for i from 1 to |V|
for j from 1 to |V|
if dist[i][j] > dist[i][k] + dist[k][j]
dist[i][j] ← dist[i][k] + dist[k][j]
end if
If the outer product, this is much more difficult to make the link with the pseudo-code.
I would be -1 on this change.
|
Closing this PR since there is not other changes to be done regarding the sparse arrays/matrices |
Reference Issues/PRs
Towards #27090
What does this implement/fix? Explain your changes.
This file's function don't use scipy.sparse.csr_matrix at all, so there doesn't appear to be any need to add CSR_CONTAINERS parameterization.
However, a little optimization in floyd_warshell_slow function has been done. The improvements:
Improved efficiency: Use of NumPy's optimized functions like np.outer and np.minimum to perform matrix operations more efficiently. Due to taking advantage of Numpy's vectorized operations its faster in terms of computation.
More simplified implementation
Consistent handling of Non-Zero entries
Any other comments?
Please let me know if these changes help