Conversation
|
@lmaisons can you please follow the instructions for signing your commits and amend your commit with signature ? also, you will need to use |
f5873bc to
002365e
Compare
|
@Bronek Sorry, not used to the flow here yet. Formatted and signed as required. |
Bronek
left a comment
There was a problem hiding this comment.
Nice, just pls. add minor changes.
The current implementation of rngfill is prone to false warnings from GCC about array bounds violations. Looking at the code, the implementation naively manipulates both the bytes count and the buffer pointer directly to ensure the trailing memcpy doesn't overrun the buffer. As expressed, there is a data dependency on both fields between loop iterations. Now, ideally, an optimizing compiler would realize that these dependencies were unnecessary and end up restructuring its intermediate representation into a functionally equivalent form with them abscent. However, the point at which this occurs may be disjoint from when warning analyises are performed, potentially rendering them more difficult to determine precisely. In addition, it may also consume a portion of the budget the optimizer has allocated to attempting to improve a translation unit's performance. Given this is a function template which requires context-sensitive instantiation, this code, in the author's experience, would be more prone than most to being inlined, with a decrease in optimization budget corresponding to the effort the optimizer has already expended, having already optimized one or more calling functions. Thus, the scope for impacting the the ultimate quality of the code generated is elevated. For this change, we rearrange things so that the location and contents of each memcpy can be computed independently, relying on a simple loop iteration counter as the only changing input between iterations.
002365e to
c72e18f
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## develop #5563 +/- ##
=========================================
- Coverage 78.4% 78.4% -0.0%
=========================================
Files 816 816
Lines 71711 71713 +2
Branches 8580 8597 +17
=========================================
- Hits 56242 56224 -18
- Misses 15469 15489 +20
🚀 New features to boost your workflow:
|
| template <class Generator> | ||
| void | ||
| rngfill(void* buffer, std::size_t bytes, Generator& g) | ||
| rngfill(void* const buffer, std::size_t const bytes, Generator& g) |
There was a problem hiding this comment.
nit: const in a context of 'std::size_t const bytes' is not really necessary.
There was a problem hiding this comment.
I think it's helpful for the optimizer.
There was a problem hiding this comment.
I don't think it matters in terms of optimization. Compiler is perfectly capable of seeing if the argument changes in the function.
It's more of debatable thing where some people believe it improves readability, others think it's meaningless (see abseil code style for example: https://abseil.io/tips/109)
There was a problem hiding this comment.
It's true that the optimizer cannot rely on const annotations for correctness. I included it here to communicate to a reader the intent of not updating the field's value in-place in order to provide as much clarity as possible about any mutable data dependencies that arise across basic blocks / loop iterations.
|
@lmaisons do you think this is ready to merge ? |
Yes. |
The current implementation of rngfill is prone to false warnings from GCC about array bounds violations. Looking at the code, the implementation naively manipulates both the bytes count and the buffer pointer directly to ensure the trailing memcpy doesn't overrun the buffer. As expressed, there is a data dependency on both fields between loop iterations.
Now, ideally, an optimizing compiler would realize that these dependencies were unnecessary and end up restructuring its intermediate representation into a functionally equivalent form with them absent. However, this is not guaranteed, and even if it does occur, the point at which this happens may well be disjoint from when warning analyses are performed, potentially rendering them more difficult to determine precisely.
In addition, it may also consume a portion of the budget the optimizer has allocated to attempting to improve a translation unit's performance. Given this is a function template which requires context-sensitive instantiation, this code, in the author's experience, would be more prone than most to being inlined, with a decrease in optimization budget corresponding to the effort the optimizer has already expended, having already optimized one or more calling functions. Thus, the scope for impacting the the ultimate quality of the code generated is elevated.
For this change, we rearrange things so that the location and contents of each memcpy can be computed independently, relying on a simple loop iteration counter as the only changing input between iterations.