-
Notifications
You must be signed in to change notification settings - Fork 242
Fixes to TopologicalSort #1241
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes to TopologicalSort #1241
Conversation
e75abb5 to
83ee59e
Compare
|
Hi, thank you for your first contribution to NetworKit! In general the PR looks good. However due to the scope of the project (supporting large scale analysis) avoid using |
|
If there's a better solution that avoids out of bounds accesses, I'm all for it. But I think the tradeoff of a small performance impact due to bounds checks vs. accidental memory corruption should be cleary in the favour of bounds checks. With memory corruption, the best scenario is a segmentation fault, while the worst is subtle breakage and wrong calculations in completely unrelated code. I did look at |
The problem in this case is, that the overhead is not small in general for
Yes, your impression is correct. It gives the highest node id. There also exists the function
The rationale concerning performance also applies here. In principal, adding a bool variable to the Graph class is cheap. However keeping track of it is not necessarily cheap. You would for example have to keep track of the number of id holes in order to avoid |
Noted, I was under the impression that the impact wouldn't be this big. I should have benchmarked both versions of the algorithm to see the difference.
Nice, Considering that, I'm thinking of the following. Since the caller might have calculated a mapping already, it would still make sense to allow the caller to pass their own mapping, hence my original decision to make it a parameter. However, by having this parameter, I believe that we have the responsibility of checking the caller's mapping for correctness, or at least that mapped node ids fall in [0,n). |
|
Yeah, I understand the part about not liking it. We occasionally implement an additional parameter, which enables safety checks (like In general your impression is again correct, most functions per default don't check for sanity (if it is too costly). As a result, the maintenance and correct preparation of data is relayed to the user. |
55da39d to
aceb956
Compare
fabratu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to remove the static class member and use const for the node map. Added some code suggestion to make this work. The only significant change is in mapNode, since const does not support operator[] for unordered_map.
…ck for node id upper bound
7c2557b to
b918efe
Compare
|
I approved the PR. Thanks again for your contribution! |
Fixed issue #1224 by adding an optional argument to for a continuous node id mapping, i.e. mapping of current node ids to [0,n).
Replaced raw accesses into vectors by the index operator with calls to
.at()which performs bounds checks. This fixes an unreported bug which causes undefined behaviour and specifically segmentation faults when node ids are not in the [0,n) interval. The new behaviour is to throw an exception with a descriptive message instead.If a node id mapping is passed to the algorithm, it maps all node ids through the mapping (and performs bounds checks as in above paragraph). If node ids are not mapped to [0,n), the bounds checks will cause the algorithm to throw an exception with an appropriate message.
Also refactored tests for TopologicalSort and added new tests for bounds checks and node id mappings.