Skip to content

Fix Stats Collector to not divide by zero#10492

Merged
Pearl1594 merged 2 commits intoapache:4.20from
scclouds:fix-stats-collector-arithmatic-exception
Mar 10, 2025
Merged

Fix Stats Collector to not divide by zero#10492
Pearl1594 merged 2 commits intoapache:4.20from
scclouds:fix-stats-collector-arithmatic-exception

Conversation

@lucas-a-martins
Copy link
Copy Markdown
Contributor

Description

Currently, the Stats Collector can throw an arithmetic exception when trying to divide by zero if interval = 0.

This PR fixes this issue by setting the loadHistory.add value to zero when the interval variable is equal to zero.

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • build/CI
  • test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Screenshots (if appropriate):

How Has This Been Tested?

In debug mode, I tested by setting the interval value to zero. Before the changes, an arithmetic exception was thrown. After the changes, I got no exception and could check that the value added to loadHistory was equal to 0.

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 3, 2025

Codecov Report

Attention: Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.

Project coverage is 16.00%. Comparing base (b92fd17) to head (eb52f72).
Report is 22 commits behind head on 4.20.

Files with missing lines Patch % Lines
...src/main/java/com/cloud/server/StatsCollector.java 0.00% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               4.20   #10492      +/-   ##
============================================
+ Coverage     15.98%   16.00%   +0.01%     
- Complexity    13086    13105      +19     
============================================
  Files          5650     5651       +1     
  Lines        495756   495841      +85     
  Branches      60018    60045      +27     
============================================
+ Hits          79261    79366     +105     
+ Misses       407641   407614      -27     
- Partials       8854     8861       +7     
Flag Coverage Δ
uitests 4.00% <ø> (-0.01%) ⬇️
unittests 16.84% <0.00%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@Pearl1594
Copy link
Copy Markdown
Contributor

@lucas-a-martins Could you please rebase this off 4.20 - I believe this could be an issue in 4.20 as well.

Copy link
Copy Markdown
Contributor

@Pearl1594 Pearl1594 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code lgtm.

@Pearl1594
Copy link
Copy Markdown
Contributor

@blueorangutan package

@blueorangutan
Copy link
Copy Markdown

@Pearl1594 a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan
Copy link
Copy Markdown

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 12642

@yadvr yadvr added this to the 4.20.1 milestone Mar 4, 2025
@Pearl1594
Copy link
Copy Markdown
Contributor

After having a conversation with @DaanHoogland on this offline, I understand that reporting a 0 could imply that there were no queries made at all at a specific time, would may not be entirely correct. Having a -1 would indicate an erroneous stats collection for that iteration.

@lucas-a-martins
Copy link
Copy Markdown
Contributor Author

@DaanHoogland, @Pearl1594

I applied your suggestions by changing the value to -1 and fixed the maxsize bug. Can you guys take a look?

cc @rohityadavcloud

@DaanHoogland
Copy link
Copy Markdown
Contributor

@blueorangutan package

@blueorangutan
Copy link
Copy Markdown

@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan
Copy link
Copy Markdown

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 12682

@DaanHoogland
Copy link
Copy Markdown
Contributor

@blueorangutan test keepEnv

@blueorangutan
Copy link
Copy Markdown

@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan
Copy link
Copy Markdown

[SF] Trillian test result (tid-12608)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 61178 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr10492-t12608-kvm-ol8.zip
Smoke tests completed. 140 look OK, 1 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
ContextSuite context=TestSharedNetworkWithConfigDrive>:setup Error 1521.39 test_network.py

Copy link
Copy Markdown
Contributor

@Pearl1594 Pearl1594 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code lgtm

@DaanHoogland
Copy link
Copy Markdown
Contributor

the refresh fix works. I have no idea how to force the NPE situation (apart from changing the code/core in a debugger. @lucas-a-martins have you seen this happen without debugger?

@lucas-a-martins
Copy link
Copy Markdown
Contributor Author

the refresh fix works. I have no idea how to force the NPE situation (apart from changing the code/core in a debugger. @lucas-a-martins have you seen this happen without debugger?

@DaanHoogland only one time and it wasn't in my environment. I don't know how to force it too

@DaanHoogland
Copy link
Copy Markdown
Contributor

@Pearl1594 , I am fine with merging this, in spite of the corner caaase not being verified. Normal functionality is still working....

@Pearl1594 Pearl1594 merged commit 54c1f92 into apache:4.20 Mar 10, 2025
25 of 26 checks passed
DaanHoogland added a commit that referenced this pull request Mar 12, 2025
* 4.20:
  Fix Stats Collector to not divide by zero (#10492)
  linstor: try to delete -rst resource before snapshot backup (#10443)
@Pearl1594 Pearl1594 moved this to Done in ACS 4.20.1 Mar 17, 2025
dhslove pushed a commit to ablecloud-team/ablestack-cloud that referenced this pull request Jun 19, 2025
* Set loadHistory value to zero when interval is zero to not throw arithmatic exception

* Change loadHistory value to -1 and fix maxsize bug

---------

Co-authored-by: Lucas Martins <lucas.martins@scclouds.com.br>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

5 participants