[WIP] Multizone adjoints for turbomachinery#2446
[WIP] Multizone adjoints for turbomachinery#2446joshkellyjak wants to merge 80 commits intodevelopfrom
Conversation
| AD::EndPreacc(); | ||
|
|
||
| /*--- Max is not differentiable, so we not register them for preacc. ---*/ |
There was a problem hiding this comment.
What is the difference between ending the preaccumulation before or after?
| } | ||
| } | ||
| } else if (Multizone_Problem && DiscreteAdjoint) { | ||
| SU2_MPI::Error(string("OUTPUT_WRT_FREQ cannot be specified for this solver " |
There was a problem hiding this comment.
| SU2_MPI::Error(string("OUTPUT_WRT_FREQ cannot be specified for this solver " | |
| SU2_MPI::Error("OUTPUT_WRT_FREQ cannot be specified for this solver " |
| "(writing of restart and sensitivity files not possible for multizone discrete adjoint during runtime yet).\n" | ||
| "Please remove this option from the config file, output files will be written when solver finalizes.\n"), CURRENT_FUNCTION); |
There was a problem hiding this comment.
Hmmm I'm pretty sure it is possible
There was a problem hiding this comment.
Yes and no, writing the files is fine, but the continuation of the multi-zone discrete adjoint solver is erratic. My guess is that some indices aren't cleared properly before re-recording the tape. (Writing adjoint solution variables to file only, without re-recording at all and without sensitivities, might give us some hints.)
The debug mode could eventually pin down where exactly things go wrong.
| BPressure = config->GetPressureOut_BC(); | ||
| Temperature = config->GetTotalTemperatureIn_BC(); | ||
|
|
||
| if (!reset){ | ||
| AD::RegisterInput(BPressure); | ||
| AD::RegisterInput(Temperature); | ||
| } | ||
|
|
||
| BPressure = config->GetPressureOut_BC(); | ||
| Temperature = config->GetTotalTemperatureIn_BC(); | ||
|
|
There was a problem hiding this comment.
Why are these wrong but not the others that follow the same pattern?
There was a problem hiding this comment.
This threw a tagging error, and if I remember correctly it's because in the other sections the variables it accesses are directly used in the solver, whereas in the Giles implementation they are not, creating a mismatch in tags.
There was a problem hiding this comment.
Can we fix it in a way that keeps a clear pattern for doing this type of thing?
There was a problem hiding this comment.
I can give it a go
There was a problem hiding this comment.
I think this actually removes a potential tag tool error message (= dependency error message in this case), but it breaks the functionality, since e.g. BPressure = config->GetPressureOut_BC(); deletes the index of BPressure, as the config value appears as a constant.
I'd suggest to fix this, for now, by tracking the config class value instead of the local copy (?)
…ding of objective function calculation
… to run in quasi-MZ approach. (NEEDS CLEANING)
… MZ turbo testcase
…n recording wrt mesh coordinates. The tape debug mode code is currently also reworked in #2721. The changes in this commit could already be useful for the current turbo discrete adjoint development.
…m/su2code/SU2 into feature_mz_turbomachinery_adjoint
| const auto nTotalDonors = (nSpanDonor + 1) * size; // Number of donor spans across all ranks | ||
| const auto nSpanDonorVars = (nSpanDonor + 1) * nMixingVars; // Number of variables to be transferred on each rank | ||
| vector<short> buffDonorMarker(nTotalDonors); | ||
| vector<su2double> buffDonorVar(static_cast<unsigned long>(nTotalDonors) * nMixingVars); // Total number of variables to be transferred on all ranks |
| buffDonorVar.data(), nSpanDonorVars, MPI_DOUBLE, | ||
| SU2_MPI::GetComm()); | ||
|
|
||
| for (auto iSize = 0; iSize < size; iSize++){ |
There was a problem hiding this comment.
| for (auto iSize = 0; iSize < size; iSize++){ | |
| for (auto iSize = 0u; iSize < size; iSize++){ |
There was a problem hiding this comment.
size is signed so iSize has to be also
| target_solution->SetExtAveragePressure(Marker_Target, iSpan, Target_Variable[1]); | ||
| target_solution->SetExtAverageTurboVelocity(Marker_Target, iSpan, 0, Target_Variable[2]); | ||
| target_solution->SetExtAverageTurboVelocity(Marker_Target, iSpan, 1, Target_Variable[3]); | ||
| unsigned short iVar, iDonorSpan; |
There was a problem hiding this comment.
| unsigned short iVar, iDonorSpan; | |
| if(nDim == 3){ | ||
| target_solution->SetExtAverageTurboVelocity(Marker_Target, iSpan, 2, Target_Variable[4]); | ||
| } | ||
| iDonorSpan = target_solution->GetnMixingStates(Marker_Target, Span_Target); |
There was a problem hiding this comment.
| iDonorSpan = target_solution->GetnMixingStates(Marker_Target, Span_Target); | |
| unsigned short iDonorSpan = target_solution->GetnMixingStates(Marker_Target, Span_Target); |
| target_solution->SetExtAverageNu(Marker_Target, iSpan, Target_Variable[5]); | ||
| target_solution->SetExtAverageKine(Marker_Target, iSpan, Target_Variable[6]); | ||
| target_solution->SetExtAverageOmega(Marker_Target, iSpan, Target_Variable[7]); | ||
| for (iVar = 0; iVar < nMixingVars; iVar++) { |
There was a problem hiding this comment.
| for (iVar = 0; iVar < nMixingVars; iVar++) { | |
| for (unsigned short iVar = 0; iVar < nMixingVars; iVar++) { |
| } | ||
|
|
||
| void CFlowCompOutput::SetTurboPerformance_Output(std::shared_ptr<CTurboOutput> TurboPerf, | ||
| void CFlowCompOutput::SetTurboPerformance_Output(std::vector<std::shared_ptr<CTurboOutput>> TurboBladePerfs, |
…m/su2code/SU2 into feature_mz_turbomachinery_adjoint
oleburghardt
left a comment
There was a problem hiding this comment.
@pcarruscag @joshkellyjak I'd suggest to solve problems in RegisterVariables in a different branch.
(The problem with the fix that I suggest is that, I think, some config class values are copied into the solver during initialization steps of the solver which might not be part of the recording. So it's also not a good solution, I'm afraid.)
| BPressure = config->GetPressureOut_BC(); | ||
| Temperature = config->GetTotalTemperatureIn_BC(); | ||
|
|
||
| if (!reset){ | ||
| AD::RegisterInput(BPressure); | ||
| AD::RegisterInput(Temperature); | ||
| } | ||
|
|
||
| BPressure = config->GetPressureOut_BC(); | ||
| Temperature = config->GetTotalTemperatureIn_BC(); | ||
|
|
There was a problem hiding this comment.
I think this actually removes a potential tag tool error message (= dependency error message in this case), but it breaks the functionality, since e.g. BPressure = config->GetPressureOut_BC(); deletes the index of BPressure, as the config value appears as a constant.
I'd suggest to fix this, for now, by tracking the config class value instead of the local copy (?)
Sure thing, this sounds reasonable from my perspective as it extends beyond the turbo world. For now it is maybe best to disable the tape tests in the regression workflow in that case? |
|
I reran the APU turbocharger test case as the residuals looked a little strange on the regression workflow, given I have validated this compared to experimental data I would argue that this updating the values here and adding new solution data to the testcase repo is justified. |
| } | ||
| #ifdef HAVE_MPI | ||
| /*--- Gather data. ---*/ | ||
| const size_t nTotalDonors = (nSpanDonor + 1) * size; // Number of donor spans across all ranks |
Check failure
Code scanning / CodeQL
Multiplication result converted to larger type High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI about 9 hours ago
In general, to fix this kind of problem you must ensure that at least one operand of the multiplication is explicitly cast to the larger integer type before the multiplication, so the operation is carried out in that larger type and cannot overflow a smaller intermediate type.
Here, the problematic expression is on line 94:
const size_t nTotalDonors = (nSpanDonor + 1) * size; // Number of donor spans across all ranksAssuming nSpanDonor and size are of (possibly signed) integer type smaller than or equal to int, the best fix is to cast one operand to size_t before the multiplication. This promotes the entire expression to size_t arithmetic and avoids intermediate int overflow without changing semantics for in‑range values.
The minimal, non‑intrusive change is to rewrite line 94 as:
const size_t nTotalDonors = static_cast<size_t>(nSpanDonor + 1) * static_cast<size_t>(size);or, more compactly (and still safe):
const size_t nTotalDonors = static_cast<size_t>(nSpanDonor + 1) * size;Since size is already used in other multiplications with size_t (e.g., indexing with static_cast<size_t>(iSize) * nSpanDonorVars), promoting the (nSpanDonor + 1) side is sufficient. No new headers or helper functions are needed.
Similarly, for consistency and to avoid the same class of issue, we should also ensure that other multiplications that feed into size_t or unsigned long are performed in a wide type. In the snippet, line 95:
const size_t nSpanDonorVars = (nSpanDonor + 1) * nMixingVars;also multiplies two small integer types (likely unsigned short/int) and stores the result in size_t. It should be updated to ensure the multiplication is done in size_t:
const size_t nSpanDonorVars = static_cast<size_t>(nSpanDonor + 1) * static_cast<size_t>(nMixingVars);No additional methods, imports, or definitions are required; we only add static_cast<size_t> in the existing lines.
| @@ -91,8 +91,8 @@ | ||
| } | ||
| #ifdef HAVE_MPI | ||
| /*--- Gather data. ---*/ | ||
| const size_t nTotalDonors = (nSpanDonor + 1) * size; // Number of donor spans across all ranks | ||
| const size_t nSpanDonorVars = (nSpanDonor + 1) * nMixingVars; // Number of variables to be transferred on each rank | ||
| const size_t nTotalDonors = static_cast<size_t>(nSpanDonor + 1) * static_cast<size_t>(size); // Number of donor spans across all ranks | ||
| const size_t nSpanDonorVars = static_cast<size_t>(nSpanDonor + 1) * static_cast<size_t>(nMixingVars); // Number of variables to be transferred on each rank | ||
| su2vector<short> buffDonorMarker(nTotalDonors); | ||
| su2activevector buffDonorVar(static_cast<unsigned long>(nTotalDonors) * nMixingVars); // Total number of variables to be transferred on all ranks | ||
|
|
| #ifdef HAVE_MPI | ||
| /*--- Gather data. ---*/ | ||
| const size_t nTotalDonors = (nSpanDonor + 1) * size; // Number of donor spans across all ranks | ||
| const size_t nSpanDonorVars = (nSpanDonor + 1) * nMixingVars; // Number of variables to be transferred on each rank |
Check failure
Code scanning / CodeQL
Multiplication result converted to larger type High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI about 9 hours ago
In general, to fix this class of problems you must ensure that at least one operand of the multiplication is explicitly converted to the larger integer type before the multiplication occurs. That forces the entire arithmetic operation to be performed in the wider type, preventing overflow that would otherwise happen in the narrower type.
In this specific case, nSpanDonorVars is declared as const size_t, but the RHS (nSpanDonor + 1) * nMixingVars is computed using the usual integer promotions (likely int). To fix it without changing functionality, we cast one operand to size_t, e.g. static_cast<size_t>(nSpanDonor + 1) or static_cast<size_t>(nMixingVars), so that the multiplication is performed in size_t. This keeps the logical meaning identical for all values that already fit in int, but makes the code safe and well‑defined for larger values.
Concretely, in SU2_CFD/src/interfaces/cfd/CMixingPlaneInterface.cpp, at the line where nSpanDonorVars is defined (line 95 in your snippet), change:
const size_t nSpanDonorVars = (nSpanDonor + 1) * nMixingVars;to:
const size_t nSpanDonorVars = static_cast<size_t>(nSpanDonor + 1) * nMixingVars;This does not require any new includes or other structural changes. The rest of the code that uses nSpanDonorVars (e.g., in the Allgather call and indexing into buffDonorVar) already expects a size_t, so no further modifications are necessary.
| @@ -92,7 +92,7 @@ | ||
| #ifdef HAVE_MPI | ||
| /*--- Gather data. ---*/ | ||
| const size_t nTotalDonors = (nSpanDonor + 1) * size; // Number of donor spans across all ranks | ||
| const size_t nSpanDonorVars = (nSpanDonor + 1) * nMixingVars; // Number of variables to be transferred on each rank | ||
| const size_t nSpanDonorVars = static_cast<size_t>(nSpanDonor + 1) * nMixingVars; // Number of variables to be transferred on each rank | ||
| su2vector<short> buffDonorMarker(nTotalDonors); | ||
| su2activevector buffDonorVar(static_cast<unsigned long>(nTotalDonors) * nMixingVars); // Total number of variables to be transferred on all ranks | ||
|
|
Proposed Changes
This is a cleaned up PR of the fixes needed for multizone adjoints for turbomachinery from the previous PR of @oleburghardt and I's work.
This hopefully is useable, so if you would like to test and report please feel free to contact me on Slack.
TODO:
Related Work
Now closed PR #2317
PR Checklist
Put an X by all that apply. You can fill this out after submitting the PR. If you have any questions, don't hesitate to ask! We want to help. These are a guide for you to know what the reviewers will be looking for in your contribution.
pre-commit run --allto format old commits.