In-class initialize DenseMapBase members.#177168
Conversation
I have seen a number of false positive defects in Coverity regarding uninitialized instances of some of these members. I've decided to in-class initialize these to hopefully silence Coverity.
|
@llvm/pr-subscribers-llvm-adt Author: Matt Davis (enferex) ChangesI've seen a number of false positive defects in Coverity regarding uninitialized instances of some of these members. I've decided to in-class initialize 3 scalar members of Since the initialization is performed through I do see some of the insn size percentages jump around but that might be noise. I had run an earlier compile time tracker on the same data, and I think the timing results were a bit different, the most recent run information is below. Compile time tracker is showing the following output:
clang build: Full diff: https://github.com/llvm/llvm-project/pull/177168.diff 1 Files Affected:
diff --git a/llvm/include/llvm/ADT/DenseMap.h b/llvm/include/llvm/ADT/DenseMap.h
index fe8868619730e..d967a231804f6 100644
--- a/llvm/include/llvm/ADT/DenseMap.h
+++ b/llvm/include/llvm/ADT/DenseMap.h
@@ -751,10 +751,10 @@ class DenseMap : public DenseMapBase<DenseMap<KeyT, ValueT, KeyInfoT, BucketT>,
// simplicity of referring to them.
using BaseT = DenseMapBase<DenseMap, KeyT, ValueT, KeyInfoT, BucketT>;
- BucketT *Buckets;
- unsigned NumEntries;
- unsigned NumTombstones;
- unsigned NumBuckets;
+ BucketT *Buckets = nullptr;
+ unsigned NumEntries = 0;
+ unsigned NumTombstones = 0;
+ unsigned NumBuckets = 0;
explicit DenseMap(unsigned NumBuckets, typename BaseT::ExactBucketCount) {
this->initWithExactBucketCount(NumBuckets);
|
kuhar
left a comment
There was a problem hiding this comment.
I don't necessarily think we should overfit towards external tools, but the change itself is reasonable.
|
@kazutakahirata would you mind reviewing this, since you've been pretty active in this file. Thanks! |
|
@tgymnich Would you be able to take a scan over this, I'd like another pair of eye on it please. |
|
This also does feel to me like kind of an overfit. Is there any output from the tool, maybe an example, that would explain why we need to make this change. |
@tgymnich, thanks for taking a look. I understand the point regarding overfitting, I agree we're jumping though some hoops to pacify static analysis. I think it's helpful to make the initialization for these members more explicit, given the initialization isn't obvious. As an example of one seemingly false positive case, is in LICM.cpp at the return point of llvm-project/llvm/lib/Transforms/Scalar/LICM.cpp Line 1677 in 7deea9d I'm seeing a Coverity "defect" around this location: |
|
ping @tgymnich for any additional feedback. Let me know if you'd be okay with us landing this or think we should hold off. |
|
Sorry for the delay. I am not strictly against this change, but from my side I will only approve true positives.
Should we even care about this being initialized in the destructor? |
Thanks, I understand your point @tgymnich. I don't think the members are initialized in the destructor; rather, values from these members are being used in the dtor. I think making the initialization in |
|
Some idle thoughts: I don't like making changes just to satisfy some insufficiently smart checking tool, and it's nice to leave some things uninitialized if the initialization is bogus anyway (like if the initialized value were ever used it'd be a bug) - so msan can catch them. But equally if it becomes too hard to reason about, it'd be better for humans. Probably comes down to that "are these values actually the real values we'd want to use, or is any use of them just hiding a bug/likely to cause a different bug" Looks like they are the real values that get initialized with anyway if the bucket count requested is zero -so they aren't just bogus values that would be problematic if they were actually used. So that's good/not bad. But equally - if the tools can't figure out what's going on here, I'm not sure we want to keep propping them up - should disable whatever checking produces the false positives. Do you have a standalone reduced example of what the tool is failing to understand? |
I will say I don't find this argument persuasive. I think with all the hubbub around memory safety, there is consensus in the industry that "uninitialized-by-default" was a mistake, and most post C++ AOT compiled languages prove, using local static analysis, not global, that everything is initialized appropriately. The exception to this rule is objects with embedded storage, like SmallVector storage, variants, or optional. It seems like a bad tradeoff to make static analyzers work harder just so that expensive dynamic analyses can be more precise. Besides, it's DenseMap, we know this thing has test coverage. If accommodating limited static analysis tools isn't motivating, I'd argue that this PR is more readable. If we had a static analysis tool to ensure that every constructor has a member initializer for every field and a wand to magically do that cleanup, I'd wave it, assuming llvm-compile-time-tracker shows no regression. |
This was the exact reasoning that made me approve this PR -- it makes it much more obvious to everyone (humans and tools) that all class members are initialized, even if not strictly necessary. |
|
Thanks for the feedback dwblakie, rnk, kuhar. To answer @dwblaikie's earlier ask:
I don't have a smaller repro handy, but listed one (of many) false positive's I've seen in my earlier comment above. Given DenseMap is used all over the place, I'm finding many similar cases that all seem to boil down to DenseMapBase. |
shrug Fair - I don't have especially strong feelings here. I'd still rather like to see a reduced/isolated example of the bug in the static analysis to have a better understanding of what class of problems are we going to work around if we sign up for having the codebase be clean despite this bug. |
Hi @dwblaikie , I think it's the CRTP form that is confusing the static analysis. Here's a claude + some of my hand reduction on the interesting bits of DenseMap.h. I've added two comments indicating the false positives. https://gist.github.com/enferex/1bb4083189e4ab3588fe391a51f6ad3d Edit: I'm looking into reporting this FP to the static analysis vendor. (done) |
|
Any way to link to the vendor's ticket so we can track removing these if the issue's addressed? Anyway, seems like other folks are comfortable enough with this change on readability/safety grounds that I won't stand in the way of it at this point. |
|
Thanks @dwblaikie. I don't think there's a way we can track the vendor ticket publicly. It looks to be guarded by a login. Blackduck Support Ticket 03684877. |
I've seen a number of false positive defects in Coverity regarding uninitialized instances of some of these members. I've decided to in-class initialize 3 scalar members of `DenseMapBase` to hopefully silence Coverity. Since the initialization is performed through `initWithExactBucketCount` I think it makes sense to just zero init them. I assume the original intent was to avoid some initialization overhead; however, that seems negligible from my last compile-time tracker results (If I'm reading the results correctly). I do see some of the insn size percentages jump around but that might be noise. I had run an earlier compile time tracker on the same data, and I think the timing results were a bit different, the most recent run information is below. Compile time tracker is showing the following output: Metric | Old | New -- | -- | -- instructions:u | 35718151M | 35718325M (+0.00%) wall-time | 614.01s | 613.82s (-0.03%) size-file | 132278KiB | 132278KiB (+0.00%) size-file (stage1) | 151065KiB | 151065KiB (-0.00%) clang build: Metric Old New [instructions:u](https://llvm-compile-time-tracker.com/compare_clang.php?from=f97f53e1d925fe702c23ad3a120ad3ab40b4eb51&to=d925643ace32566ea76a60d0642180d077f476f5&stat=instructions%3Au) 35718151M 35718325M (+0.00%) [wall-time](https://llvm-compile-time-tracker.com/compare_clang.php?from=f97f53e1d925fe702c23ad3a120ad3ab40b4eb51&to=d925643ace32566ea76a60d0642180d077f476f5&stat=wall-time) 614.01s 613.82s (-0.03%) [size-file](https://llvm-compile-time-tracker.com/compare_clang.php?from=f97f53e1d925fe702c23ad3a120ad3ab40b4eb51&to=d925643ace32566ea76a60d0642180d077f476f5&stat=size-file) 132278KiB 132278KiB (+0.00%) size-file (stage1) 151065KiB 151065KiB (-0.00%)` (cherry picked from commit 2d53aab)
I've seen a number of false positive defects in Coverity regarding uninitialized instances of some of these members. I've decided to in-class initialize 3 scalar members of `DenseMapBase` to hopefully silence Coverity. Since the initialization is performed through `initWithExactBucketCount` I think it makes sense to just zero init them. I assume the original intent was to avoid some initialization overhead; however, that seems negligible from my last compile-time tracker results (If I'm reading the results correctly). I do see some of the insn size percentages jump around but that might be noise. I had run an earlier compile time tracker on the same data, and I think the timing results were a bit different, the most recent run information is below. Compile time tracker is showing the following output: Metric | Old | New -- | -- | -- instructions:u | 35718151M | 35718325M (+0.00%) wall-time | 614.01s | 613.82s (-0.03%) size-file | 132278KiB | 132278KiB (+0.00%) size-file (stage1) | 151065KiB | 151065KiB (-0.00%) clang build: Metric Old New [instructions:u](https://llvm-compile-time-tracker.com/compare_clang.php?from=f97f53e1d925fe702c23ad3a120ad3ab40b4eb51&to=d925643ace32566ea76a60d0642180d077f476f5&stat=instructions%3Au) 35718151M 35718325M (+0.00%) [wall-time](https://llvm-compile-time-tracker.com/compare_clang.php?from=f97f53e1d925fe702c23ad3a120ad3ab40b4eb51&to=d925643ace32566ea76a60d0642180d077f476f5&stat=wall-time) 614.01s 613.82s (-0.03%) [size-file](https://llvm-compile-time-tracker.com/compare_clang.php?from=f97f53e1d925fe702c23ad3a120ad3ab40b4eb51&to=d925643ace32566ea76a60d0642180d077f476f5&stat=size-file) 132278KiB 132278KiB (+0.00%) size-file (stage1) 151065KiB 151065KiB (-0.00%)` (cherry picked from commit 2d53aab) (cherry picked from commit 7cd4442)
I've seen a number of false positive defects in Coverity regarding uninitialized instances of some of these members. I've decided to in-class initialize 3 scalar members of
DenseMapBaseto hopefully silence Coverity.Since the initialization is performed through
initWithExactBucketCountI think it makes sense to just zero init them. I assume the original intent was to avoid some initialization overhead; however, that seems negligible from my last compile-time tracker results (If I'm reading the results correctly).I do see some of the insn size percentages jump around but that might be noise. I had run an earlier compile time tracker on the same data, and I think the timing results were a bit different, the most recent run information is below.
Compile time tracker is showing the following output:
clang build:
Metric Old New
instructions:u 35718151M 35718325M (+0.00%)
wall-time 614.01s 613.82s (-0.03%)
size-file 132278KiB 132278KiB (+0.00%)
size-file (stage1) 151065KiB 151065KiB (-0.00%)`