-
Notifications
You must be signed in to change notification settings - Fork 242
introduces parallelGraphBuilder #1178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
introduces parallelGraphBuilder #1178
Conversation
32e9c31 to
9b0793c
Compare
angriman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR, I left some comments. Since we already have a GraphBuilder class which also supports edge weights, why don't we refactor that class to run in parallel instead of creating a new one?
| void addEdgeParallel(index a, index b); | ||
|
|
||
| void addHalfEdgeParallel(Unsafe, index source, index destination); | ||
|
|
||
| void addHalfInEdgeParallel(Unsafe, index source, index destination); | ||
|
|
||
| void addHalfOutEdgeParallel(Unsafe, index source, index destination); | ||
|
|
||
| Graph toGraphParallel(); | ||
|
|
||
| void addEdgesToGraphParallel(Unsafe, Graph &G); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we can drop the Parallel suffix from all these functions, the class is already called ParallelGraphBuilder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comments, i've refactored the parallel graph building into the existing graph builder. Therefore, the suffix might be useful to distinguish between sequential and parallel functions, what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After taking a more in-depth look at the code, I would also recommend refactoring the existing code. There is already a parallel builder, which, according to the contributor of the code, is a lot slower than the new concept.
As already indicated, the old implementation supports weights, which adds a layer of complexity due to additional conditionals (even though branch prediction will likely cause some remedy here). Also, the old implementation, when calling completeGraph (this corresponds to toGraphParallel in the new implementation) does shrinkToFit() and reset() as a final step. Would something like this also be necessary with the new refactored implementation? Does this have performance implications? What happens if the new parallel build is completed twice?
I would suggest doing the following:
- Implement support for weighted edges for the new parallel builder
- Benchmark the old vs. the new implementation
- If the new implementation is still faster, refactor old
toGraphParallel()to use the newtoGraphParallel.
For the refactoring step, I also left some other comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fabratu I agree with your points. About 3., assuming that the parallel implementation is faster than the sequential, why don't we just implement a single toGraph function which runs in parallel, and drop toGraphSequential and toGraphParallel? I don't see the point of maintaining two different functions that achieve the same goal.
|
|
||
| void ParallelGraphBuilder::addHalfEdgeParallel(Unsafe, index a, index b) { | ||
| assert(a != b); | ||
| outEdgesPerThread[omp_get_thread_num()][a % omp_get_max_threads()].emplace_back(HalfEdge{a, b}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you use emplace this should be enough, same for the other functions below:
| outEdgesPerThread[omp_get_thread_num()][a % omp_get_max_threads()].emplace_back(HalfEdge{a, b}); | |
| outEdgesPerThread[omp_get_thread_num()][a % omp_get_max_threads()].emplace_back(a, b); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the current implementation of ´HalfEdge` does not have such a constuctor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea to add it then. Improves readability.
3ae8e61 to
6d75c6b
Compare
| * @param parallel If set to @c true, the graph will be build in parallel. | ||
| */ | ||
| GraphBuilder(count n = 0, bool weighted = false, bool directed = false); | ||
| GraphBuilder(count n = 0, bool weighted = false, bool directed = false, bool parallel = false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the performance is increasing (as well if executed on one thread) and If the result is the same (and it should be maybe except for the order of nodes in the adjacency list) for both variants, then parallel can be set to true.
|
Please also add tests for the new functions + update Python bindings/tests (if |
networkit/cpp/graph/GraphBuilder.cpp
Outdated
| G.setEdgeCount(unsafe, numberOfHalfEdges / 2); | ||
| } | ||
|
|
||
| Graph GraphBuilder::toGraphParallel() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This corresponds to toCompleteGraph in the former implementation. Only use one calling function for graph creation during refactoring.
networkit/cpp/graph/GraphBuilder.cpp
Outdated
| outEdgesPerThread[omp_get_thread_num()][a % omp_get_max_threads()].emplace_back(HalfEdge{a, b}); | ||
| } | ||
|
|
||
| void GraphBuilder::addEdgesToGraphParallel(Unsafe, Graph &G) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function only gives valid results, if addHalf/Out/In/EdgeParallel was used for adding edges. Either make sure to check this accordingly or better change the public API during refactoring.
networkit/cpp/graph/GraphBuilder.cpp
Outdated
|
|
||
| // Parallel functions | ||
|
|
||
| void GraphBuilder::addEdgeParallel(node a, node b) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment at function addEdgesInParallel: If the new concept is faster (also for one thread), then drop the previous addHalfEdge function (and others) in favor of the new one(s). If for the sequential case, nothing is gained, a wrapper/template is needed to distinguish between sequential and parallel insertion.
df989ab to
e85c9d3
Compare
923996a to
3419a30
Compare
3419a30 to
4086639
Compare
part of re-upload from #1165