Skip to content

Conversation

@Myoldmopar
Copy link
Member

Pull request overview

So the RelWithDebugInfo build is awesome. Great runtime but still provides test coverage and catches some things. Unfortunately it does not by default catch array bounds or infinity checks. I added the flag that Julien had put in there to enable these checks. It revealed a couple dozen failures. Almost all of them were fixed with a tariff index off by one change. The only failing tests I see locally are:

566 - integration.ShopWithPVandLiIonBattery (Failed)
752 - integration.WCE_Diffuse_Shade (Failed)
1169 - EnergyPlusFixture.EconomicTariff_LEEDtariff_with_Custom_Meter (Subprocess aborted)

integration.ShopWithPVandLiIonBattery

The PV/Battery file is almost certainly an infinity. The SSC uses infinity values as sentinels, so we need to handle this specially. I've looked into it in the past, but I don't recall hitting anything definitive. @jmarrec your thoughts would be highly appreciated.

integration.WCE_Diffuse_Shade

The WCE is hitting an array index. The slatAngIdxHi variable is -1, and the worker function that assigns these variables doesn't appear to be called for this setup. I started to dive in, but am also open to suggestion on why this might be happening. Here is the block where the array bound occurs. It is on the first Interp call, but happens before Interp is called. It happens while trying to set the onstruction.effShadeBlindEmi[surfShade.blind.slatAngIdxHi] variable, and fails the lookup

            if (ANY_INTERIOR_SHADE_BLIND(state.dataSurface->SurfWinShadingFlag(SurfNum))) {
                auto &surfShade = state.dataSurface->surfShades(SurfNum);
                Real64 EffShBlEmiss = surfShade.effShadeEmi;
                Real64 EffGlEmiss = surfShade.effGlassEmi;
                surfShade.effShadeEmi = Interp(construction.effShadeBlindEmi[surfShade.blind.slatAngIdxLo],
                                               construction.effShadeBlindEmi[surfShade.blind.slatAngIdxHi],
                                               surfShade.blind.slatAngInterpFac);
                surfShade.effGlassEmi = Interp(construction.effGlassEmi[surfShade.blind.slatAngIdxLo],
                                               construction.effGlassEmi[surfShade.blind.slatAngIdxHi],
                                               surfShade.blind.slatAngInterpFac);
                state.dataSurface->SurfWinEffInsSurfTemp(SurfNum) =
                    (EffShBlEmiss * SurfInsideTemp + EffGlEmiss * (state.dataWindowManager->thetas[2 * totSolidLayers - 3] - Constant::Kelvin)) /
                    (EffShBlEmiss + EffGlEmiss);
            }

EnergyPlusFixture.EconomicTariff_LEEDtariff_with_Custom_Meter

This unit test is failing at one of the format protections that @jmarrec added. Here's a snippet of the backtrace from gdb, see line :

#7  std::array<std::basic_string_view<char, std::char_traits<char> >, 5ul>::operator[] (__n=<optimized out>, this=<optimized out>) at /usr/include/c++/13/array:208
#8  EnergyPlus::EconomicTariff::LEEDtariffReporting (state=...) at /eplus/repos/6eplus/src/EnergyPlus/EconomicTariff.cc:3961
#9  0x000055555654f66d in EnergyPlusFixture_EconomicTariff_LEEDtariff_with_Custom_Meter_Test::TestBody (this=0x55555af20900) at /eplus/repos/6eplus/tst/EnergyPlus/unit/EconomicTariff.unit.cc:1751
#10 0x0000555558c03646 in testing::Test::Run (this=0x55555af20900) at /eplus/repos/6eplus/third_party/gtest/googletest/src/gtest.cc:2682
// SNIP //

(gdb) frame 8
#8  EnergyPlus::EconomicTariff::LEEDtariffReporting (state=...) at /eplus/repos/6eplus/src/EnergyPlus/EconomicTariff.cc:3961
warning: Source file is more recent than executable.
3961	                   (gasDemWindowUnits == DemandWindow::Invalid) ? "" : demandWindowStrings[(int)gasDemWindowUnits]));

(gdb) list
3956	            state,
3957	            s_orp->pdchLeedEtsDemUnt,
3958	            "Natural Gas",
3959	            format("{}{}",
3960	                   convDemandStrings[(int)gasUnits],
3961	                   (gasDemWindowUnits == DemandWindow::Invalid) ? "" : demandWindowStrings[(int)gasDemWindowUnits]));
3962	        OutputReportPredefined::PreDefTableEntry(
3963	            state,
3964	            s_orp->pdchLeedEtsDemUnt,
3965	            "Other",
(gdb) 

I'll chip away at these three and get this merged in. This will have a notable impact on test runtime, but it's worth it.

FYI @mjwitte @amirroth @jmarrec

@amirroth
Copy link
Collaborator

The WCE one is from the Blind PR. I can probably figure that one out in a clean way. I can also try to figure out the tariff one. I’ll look in the next day or two.

@amirroth
Copy link
Collaborator

I can see the WCE error but it will take me a few days to figure out how to fix it.

@Myoldmopar Myoldmopar added the Defect Includes code to repair a defect in EnergyPlus label May 21, 2025
@Myoldmopar
Copy link
Member Author

I'm gonna go out on a limb and suggest this was a fluke failure. Since the other two builds on that same machine were happy.... ignoring.

@Myoldmopar
Copy link
Member Author

Tariff one is gone. The WCE one is interesting. If I roll the code back to 24.1, I see that that block isn't ever hit. I was investigating thinking "why isn't the variable initialized", but the real question appears to be "why is that block being hit"?

@amirroth
Copy link
Collaborator

Because in the new code everything is an interpolation.

@github-actions
Copy link

⚠️ Regressions detected on macos-14 for commit 917807e

Regression Summary
  • Table Big Diffs: 3

@github-actions
Copy link

⚠️ Regressions detected on macos-14 for commit b1d96b6

Regression Summary
  • Table Big Diffs: 3

@Myoldmopar
Copy link
Member Author

So this is throwing some table diffs. Looking at them locally, there are multiple places where the monthly diff is off on the last month. I think this is correct. In the current code it was accessing invalid stuff at that last index. Here's the commit that caused it: 153d412

Specifically, before:

        for (int jPeriod = 1; jPeriod <= countPeriod; ++jPeriod) {
            for (int kMonth = 1; kMonth <= MaxNumMonths; ++kMonth) {
                monthVal(kMonth) += tariff.gatherEnergy(kMonth, jPeriod);
            }
        }

and after:

      for (int kMonth = 1; kMonth <= NumMonths; ++kMonth) {
            for (int jPeriod = 1; jPeriod <= (int)Period::Num; ++jPeriod) {
                monthVal(kMonth) += tariff.gatherEnergy(kMonth)[jPeriod];
            }
        }

The jPeriod lookup has changed to square bracket c++ vector lookup, but the counter variable is still 1-based. In this branch, it simply does jPeriod-1, which gets the seg fault fixed, but with diffs. I think it's good to go.

@amirroth or @mjwitte if you see anything I'm missing, I am happy to continue to push on this, but I think this PR is now done. CI is all clean (after a reboot), this PR turns on array bounds checking on Decent, and also fixes the failures that were encountered in doing that.

I'll merge it in a while if I don't hear anything.

@amirroth
Copy link
Collaborator

Hey @Myoldmopar, I don't know if you saw my comment inline on the code but the correct fix for the jPeriod period problem is not [jPeriod-1], it's to make the loop go from 0 to < Period::Num. That's the idiom for 0-based arrays. I know it's a small thing, but it's important I think.

@jmarrec
Copy link
Contributor

jmarrec commented May 26, 2025

integration.ShopWithPVandLiIonBattery

The PV/Battery file is almost certainly an infinity. The SSC uses infinity values as sentinels, so we need to handle this specially. I've looked into it in the past, but I don't recall hitting anything definitive. @jmarrec your thoughts would be highly appreciated.

$ lldb $ep_build2/Products/energyplus -- -D -r -w /home/julien/Software/Others/EnergyPlus2/weather/USA_OK_Oklahoma.City-Will.Rogers.World.AP.723530_TMY3.epw /home/julien/Software/Others/EnergyPlus2/testfiles/ShopWithPVandLiIonBattery.idf 

[...]

Starting Simulation at 01/21 for OKLAHOMA_CITY_OK_USA HEATING 99.6% CONDITIONS
Initializing New Environment Parameters
Warming up {1}
Process 1445281 stopped
* thread #1, name = 'energyplus', stop reason = signal SIGFPE: floating point invalid operation
    frame #0: 0x00007ffff052446b libenergyplusapi.so.25.2.0`lifetime_cycle_t::rainflow_compareRanges(this=0x000055555610dfc0) at lib_battery_lifetime_calendar_cycle.cpp:162:9
   159 	        // the capacity percent cannot increase
   160 	        double dq =
   161 	                bilinear(state->average_range, state->n_cycles) - bilinear(state->average_range, state->n_cycles + 1);
-> 162 	        if (dq > 0)
   163 	            state->cycle->q_relative_cycle -= dq;
   164 	
   165 	        if (state->cycle->q_relative_cycle < 0)
(lldb) p dq
(double) NaN
* thread #1, name = 'energyplus', stop reason = breakpoint 1.1
    frame #0: 0x00007ffff05243d2 libenergyplusapi.so.25.2.0`lifetime_cycle_t::rainflow_compareRanges(this=0x000055555610dfc0) at lib_battery_lifetime_calendar_cycle.cpp:161:48
   158 	
   159 	        // the capacity percent cannot increase
   160 	        double dq =
-> 161 	                bilinear(state->average_range, state->n_cycles) - bilinear(state->average_range, state->n_cycles + 1);
   162 	        if (dq > 0)
   163 	            state->cycle->q_relative_cycle -= dq;
   164 	
(lldb) s

Process 1447389 stopped
* thread #1, name = 'energyplus', stop reason = breakpoint 2.1
    frame #0: 0x00007ffff05248f8 libenergyplusapi.so.25.2.0`lifetime_cycle_t::bilinear(this=0x000055555610dfc0, DOD=38.868803822401645, cycle_number=1) at lib_battery_lifetime_calendar_cycle.cpp:215:25
   212 	    Then interpolate C_, C+ to get C at the DOD of interest

[...]
   325 	        // just have one row, single level interpolation
   326 	    else {
-> 327 	        C = util::linterp_col(params->cal_cyc->cycling_matrix, 1, cycle_number, 2);
   328 	    }
   329 	
   330 	    return C;
(lldb) s
Process 1447389 stopped
* thread #1, name = 'energyplus', stop reason = step in
    frame #0: 0x00007ffff0562296 libenergyplusapi.so.25.2.0`util::linterp_col(mat=0x000055555610e1a0, ixcol=1, xval=1, iycol=2) at lib_util.cpp:1131:22
   1128	{
   1129		// NOTE:  must assume values in ixcol are in increasing sorted order!!
   1130	
-> 1131		size_t n = mat.nrows();
   1132	
   1133		if (n == 1 && ixcol == 0 && iycol == 0)
   1134		    return mat.at(0);
(lldb) n
Process 1447389 stopped
* thread #1, name = 'energyplus', stop reason = step over
    frame #0: 0x00007ffff05622a6 libenergyplusapi.so.25.2.0`util::linterp_col(mat=0x000055555610e1a0, ixcol=1, xval=1, iycol=2) at lib_util.cpp:1133:2
   1130	
   1131		size_t n = mat.nrows();
   1132	
-> 1133		if (n == 1 && ixcol == 0 && iycol == 0)
   1134		    return mat.at(0);
   1135	
   1136		// basic checks
(lldb) n
Process 1447389 stopped
* thread #1, name = 'energyplus', stop reason = step over
    frame #0: 0x00007ffff05622d4 libenergyplusapi.so.25.2.0`util::linterp_col(mat=0x000055555610e1a0, ixcol=1, xval=1, iycol=2) at lib_util.cpp:1137:25
   1134		    return mat.at(0);
   1135	
   1136		// basic checks
-> 1137		if ( ixcol >= mat.ncols()
   1138			|| iycol >= mat.ncols()
   1139			|| n < 2 )
   1140			return std::numeric_limits<double>::quiet_NaN();
(lldb) n
Process 1447389 stopped
* thread #1, name = 'energyplus', stop reason = step over
    frame #0: 0x00007ffff05622e0 libenergyplusapi.so.25.2.0`util::linterp_col(mat=0x000055555610e1a0, ixcol=1, xval=1, iycol=2) at lib_util.cpp:1139:3
   1136		// basic checks
   1137		if ( ixcol >= mat.ncols()
   1138			|| iycol >= mat.ncols()
-> 1139			|| n < 2 )
   1140			return std::numeric_limits<double>::quiet_NaN();
   1141	
   1142	
(lldb) p n
(size_t) 1

@jmarrec
Copy link
Contributor

jmarrec commented May 26, 2025

@Myoldmopar Why am not seeing the ShopWithPVandLiIonBattery failure on IntegrationCoverage-RelWithDebInfo here?

Edit: nevermind, I see you did a new FORCE_CONTAINER_CHECKS_GCC_OR_CLANG

@jmarrec
Copy link
Contributor

jmarrec commented May 26, 2025

@Myoldmopar and @amirroth : quite fascinating. I've tried launching a release on my fork. All builds are good, EXCEPT the mac x86_64, which built fine, but it failed to run the PythonPluginCustomSchedule.idf

        ./$TGZ_DIR/energyplus -w ./$TGZ_DIR/WeatherData/USA_IL_Chicago-OHare.Intl.AP.725300_TMY3.epw -d out ./$TGZ_DIR/ExampleFiles/PythonPluginCustomSchedule.idf

https://github.com/jmarrec/EnergyPlus/actions/runs/15186401562/job/42711866243#step:14:786

I am grabbing that tar.gz on running a throway github actions runners macos-13 and attaching via tmate(ssh), and I can see it failing by attaching lldb (I logically don't have debug symbols, but I can get the routine where it fails)

Process 13055 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x1007c9000)
    frame #0: 0x0000000102f78768 libenergyplusapi.25.2.0.dylib`EnergyPlus::EconomicTariff::ComputeTariff(EnergyPlus::EnergyPlusData&) + 632
libenergyplusapi.25.2.0.dylib`EnergyPlus::EconomicTariff::ComputeTariff:
->  0x102f78768 <+632>: addsd  (%rcx), %xmm0
    0x102f7876c <+636>: movsd  %xmm0, (%rax)
    0x102f78770 <+640>: addq   $0x8, %rax
    0x102f78774 <+644>: addq   $0x28, %rcx

Locally I can actually reproduce the issue with a debug build on Ubuntu

Process 1505848 resuming
/usr/include/c++/13/array:202: constexpr std::array<_Tp, _Nm>::value_type& std::array<_Tp, _Nm>::operator[](size_type) [with _Tp = double; long unsigned int _Nm = 5; reference = double&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
Process 1505848 stopped
* thread #1, name = 'energyplus', stop reason = signal SIGABRT
    frame #0: 0x00007ffff109eb2c libc.so.6`__GI___pthread_kill [inlined] __pthread_kill_implementation(no_tid=0, signo=6, threadid=<unavailable>) at pthread_kill.c:44:76
(lldb) bt
error: libenergyplusapi.so.25.2.0 0x0041cbd6: DW_TAG_member '_M_local_buf' refers to type 0x000000000047d2c7 which extends beyond the bounds of 0x0041cbcc
* thread #1, name = 'energyplus', stop reason = signal SIGABRT
  * frame #0: 0x00007ffff109eb2c libc.so.6`__GI___pthread_kill [inlined] __pthread_kill_implementation(no_tid=0, signo=6, threadid=<unavailable>) at pthread_kill.c:44:76
    frame #1: 0x00007ffff109eae8 libc.so.6`__GI___pthread_kill [inlined] __pthread_kill_internal(signo=6, threadid=<unavailable>) at pthread_kill.c:78:10
    frame #2: 0x00007ffff109eae8 libc.so.6`__GI___pthread_kill(threadid=<unavailable>, signo=6) at pthread_kill.c:89:10
    frame #3: 0x00007ffff104527e libc.so.6`__GI_raise(sig=6) at raise.c:26:13
    frame #4: 0x00007ffff10288ff libc.so.6`__GI_abort at abort.c:79:7
    frame #5: 0x00007ffff14df90d libstdc++.so.6`std::__glibcxx_assert_fail(file=<unavailable>, line=<unavailable>, function=<unavailable>, condition=<unavailable>) at assert_fail.cc:41:10
    frame #6: 0x00007ffff29371dd libenergyplusapi.so.25.2.0`std::array<double, 5ul>::operator[](this=0x00005555587c4340, __n=5) at array:202:2
    frame #7: 0x00007ffff3728f11 libenergyplusapi.so.25.2.0`EnergyPlus::EconomicTariff::setNativeVariables(state=0x00007fffffffa800) at EconomicTariff.cc:3667:72
    frame #8: 0x00007ffff37200ef libenergyplusapi.so.25.2.0`EnergyPlus::EconomicTariff::ComputeTariff(state=0x00007fffffffa800) at EconomicTariff.cc:2687:23
    frame #9: 0x00007ffff2d545dd libenergyplusapi.so.25.2.0`EnergyPlus::SimulationManager::ManageSimulation(state=0x00007fffffffa800) at SimulationManager.cc:581:38
    frame #10: 0x00007ffff2450c54 libenergyplusapi.so.25.2.0`RunEnergyPlus(state=0x00007fffffffa800, filepath="") at EnergyPlusPgm.cc:423:56
    frame #11: 0x00007ffff244fc7a libenergyplusapi.so.25.2.0`EnergyPlusPgm(args=size=6, filepath="") at EnergyPlusPgm.cc:242:25
    frame #12: 0x0000555555580bf6 energyplus`main(argc=6, argv=0x00007fffffffbf28) at main.cc:60:25
    frame #13: 0x00007ffff102a1ca libc.so.6`__libc_start_call_main(main=(energyplus`main at main.cc:54:1), argc=6, argv=0x00007fffffffbf28) at libc_start_call_main.h:58:16
    frame #14: 0x00007ffff102a28b libc.so.6`__libc_start_main_impl(main=(energyplus`main at main.cc:54:1), argc=6, argv=0x00007fffffffbf28, init=<unavailable>, fini=<unavailable>, rtld_fini=<unavailable>, stack_end=0x00007fffffffbf18) at libc-start.c:360:3
    frame #15: 0x0000555555580a45 energyplus`_start + 37
(lldb) f 7
frame #7: 0x00007ffff3728f11 libenergyplusapi.so.25.2.0`EnergyPlus::EconomicTariff::setNativeVariables(state=0x00007fffffffa800) at EconomicTariff.cc:3667:72
   3664	        monthVal = 0.0;
   3665	        for (int kMonth = 1; kMonth <= NumMonths; ++kMonth) {
   3666	            for (int jPeriod = 1; jPeriod <= (int)Period::Num; ++jPeriod) {
-> 3667	                monthVal(kMonth) += tariff.gatherEnergy(kMonth)[jPeriod];
   3668	            }
   3669	        }
   3670	        s_econ->econVar(tariff.natives[(int)Native::TotalEnergy]).values = monthVal;
(lldb) p jPeriod
(int) 5

I'm honestly quite puzzled as to why mac x86-64 is the only one that catches that.

@jmarrec
Copy link
Contributor

jmarrec commented May 26, 2025

I find this change hard to grok. A 1-based fortran array of 0-based std::array, what can go wrong :)

image

Comment on lines 3666 to 3667
for (int jPeriod = 0; jPeriod < (int)Period::Num; ++jPeriod) {
monthVal(kMonth) += tariff.gatherEnergy(kMonth)[jPeriod];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amirroth 's recommendation.

Mine would be to stop using an index based loop, so we'd never run into this problem to begin with.

STL accumulate:

for (int kMonth = 1; kMonth <= NumMonths; ++kMonth) {
    const auto& energy = tariff.gatherEnergy(kMonth);
    monthVal(kMonth) = std::accumulate(energy.begin(), energy.end(), 0.0);
}

Range-based for loop.

for (int kMonth = 1; kMonth <= NumMonths; ++kMonth) {
    for (const auto& v: tariff.gatherEnergy(kMonth)) {
        monthVal(kMonth) += v;
    }
}

Comment on lines 3673 to 3676
for (int kMonth = 1; kMonth <= NumMonths; ++kMonth) {
for (int jPeriod = 1; jPeriod <= (int)Period::Num; ++jPeriod) {
if (tariff.gatherDemand(kMonth)[jPeriod - 1] > monthVal(kMonth)) {
monthVal(kMonth) = tariff.gatherDemand(kMonth)[jPeriod - 1];
for (int jPeriod = 0; jPeriod < (int)Period::Num; ++jPeriod) {
if (tariff.gatherDemand(kMonth)[jPeriod] > monthVal(kMonth)) {
monthVal(kMonth) = tariff.gatherDemand(kMonth)[jPeriod];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same stuff here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're traversing twice on monthVal to set the demand, which is unnecessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

diff --git a/src/EnergyPlus/EconomicTariff.cc b/src/EnergyPlus/EconomicTariff.cc
index 09c45d7966..c13173e469 100644
--- a/src/EnergyPlus/EconomicTariff.cc
+++ b/src/EnergyPlus/EconomicTariff.cc
@@ -3663,27 +3663,19 @@ void setNativeVariables(EnergyPlusData &state)
         // nativeTotalEnergy
         monthVal = 0.0;
         for (int kMonth = 1; kMonth <= NumMonths; ++kMonth) {
-            for (int jPeriod = 0; jPeriod < (int)Period::Num; ++jPeriod) {
-                monthVal(kMonth) += tariff.gatherEnergy(kMonth)[jPeriod];
-            }
+            const auto &energy = tariff.gatherEnergy(kMonth);
+            monthVal(kMonth) = std::accumulate(energy.begin(), energy.end(), 0.0);
         }
         s_econ->econVar(tariff.natives[(int)Native::TotalEnergy]).values = monthVal;
+
         // nativeTotalDemand
         monthVal = -bigNumber;
         for (int kMonth = 1; kMonth <= NumMonths; ++kMonth) {
-            for (int jPeriod = 0; jPeriod < (int)Period::Num; ++jPeriod) {
-                if (tariff.gatherDemand(kMonth)[jPeriod] > monthVal(kMonth)) {
-                    monthVal(kMonth) = tariff.gatherDemand(kMonth)[jPeriod];
-                }
-            }
-        }
-        // if no maximum was set just set to zero
-        for (int kMonth = 1; kMonth <= NumMonths; ++kMonth) {
-            if (monthVal(kMonth) == -bigNumber) {
-                monthVal(kMonth) = 0.0;
-            }
+            const auto &demand = tariff.gatherDemand(kMonth);
+            monthVal(kMonth) = *std::max_element(demand.begin(), demand.end());
         }
         s_econ->econVar(tariff.natives[(int)Native::TotalDemand]).values = monthVal;
+
         for (int kMonth = 1; kMonth <= NumMonths; ++kMonth) {
             // nativePeakEnergy
             s_econ->econVar(tariff.natives[(int)Native::PeakEnergy]).values(kMonth) = tariff.gatherEnergy(kMonth)[(int)Period::Peak];

@github-actions
Copy link

⚠️ Regressions detected on macos-14 for commit 4e01661

Regression Summary
  • Table Big Diffs: 3

@Myoldmopar
Copy link
Member Author

@jmarrec I agree that using either the accumulate or range based loop would eliminate this possible problem from popping back up. But also, this looks complete and ready to merge. I think let's just get this in to clean up develop for debug tests and continue on other PRs.

Any objections? If not, this is going to the merge queue.

@Myoldmopar
Copy link
Member Author

OK, going in with this one. It will be great to get these working, but it will be notable to see the CI runtime impact. Expect slightly longer delays waiting on results. In the meantime I'll work on getting a supplementary machine to add to the corral.

@Myoldmopar Myoldmopar merged commit 6483d7a into develop May 27, 2025
9 checks passed
@Myoldmopar Myoldmopar deleted the FixDebugTests branch May 27, 2025 17:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Defect Includes code to repair a defect in EnergyPlus

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants