Conversation
|
Part 3, but the branch-name is is (jk :D , that mismatch was already in the previous PR) |
|
f501dc1 fixes divergence of the Since the bug is not obvious from the commit and there might be a better way to fix this, let me put this up for discussion. In
In Have there been issues regarding dependencies between tape parts previously? Are there mechanisms established that address such issues? If someone wants to give it a try, I added a build option for the tape type in 183c3ca. You can switch to the index tape by calling meson with |
| if get_option('codi-tape') == 'JacobianIndex' | ||
| codi_rev_args += '-DCODI_INDEX_TAPE' |
|
Parallel preaccumulations must not share any variables. In particular, they must not have any common inputs and outputs. I suspect (judging from the code) that the preaccumulations treated in 781092a violate this pattern, and that this is the reason why excluding them makes several hybrid parallel AD tests pass. Other preaccumulations might be affected as well. The correctness of preaccumulations inside numerics objects, for example, depends on the values of pointers like |
| END_SU2_OMP_FOR | ||
|
|
||
| AD::EndNoSharedReading(); | ||
|
|
There was a problem hiding this comment.
Some of these shared reading optimizations depend on TimeStep being passive. Does this always hold true?
There was a problem hiding this comment.
It's a fair-enough assumption, I added some more over the Primitive loops and removed some over smaller loops where the performance benefit might not justify the increased maintenance.
pcarruscag
left a comment
There was a problem hiding this comment.
So the only thing left is to decide whether hybrid AD compiles by default if -Dwith-omp=true or if a special flag is added?
You mean a flag like |
|
The omp simd directive is used in the reverse mode for passive linear algebra things, not a huge deal. |
|
I see, you can't drop |
|
We don't really lose vectorization... the compiler will still generate it eventually, it will just be a bit less consistent across compilers and what they include in -O2 or -O3. |
|
Have a look at a9466bb for a suggestion on how to disable OpDiLib within the scope of the existing build options (casually tested). To clarify the AD implications: If you use OpDiLib, you will have atomic adjoints where no shared reading optimizations are applied, even with If we add this, reverse AD + OpenMP - OpDiLib should also print an error for Since I am not 100% sure about it yet, could you clarify again what it is that you want to achieve in the end, and why? |
|
Right, let's get this moving so we can have 7.2.0 |
Proposed Changes
RealReverseIndex.FixIdentify issues regarding index reuse and partial tape evaluations.Related Work
continues #1284
PR Checklist