1938 - Add current-hour aggregation to fix db lag #2586

Merged
crivetimihai merged 1 commit into main from 1938-admin-metrics-rollup
Jan 31, 2026

Conversation

Collaborator

@gcgoncalves gcgoncalves commented Jan 30, 2026

🐛 Bug-fix PR

📌 Summary

Fixes: #1938

This commit addresses an issue where admin metrics were empty during benchmark tests shorter than one hour because they relied on hourly rollup jobs. The metrics query service is updated to use a three-source aggregation strategy:

  1. Historical rollups (for data older than the retention period).
  2. Raw metrics for completed hours within the retention period.
  3. Raw metrics from the current, incomplete hour.

This ensures that metrics are always up-to-date, even before the hourly rollup job runs, providing immediate visibility and preventing expensive raw table scans during short-lived tests.
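
To make the three-source split concrete, here is a minimal sketch in plain Python. It is not the code from this PR: `Totals`, `window_bounds`, and `aggregate_three_sources` are hypothetical names, the data lives in in-memory lists rather than database tables, and aligning the retention cutoff to an hour boundary is an assumption made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Totals:
    """Hypothetical aggregate: number of calls and summed response time."""
    count: int = 0
    total_ms: float = 0.0

    def __add__(self, other: "Totals") -> "Totals":
        return Totals(self.count + other.count, self.total_ms + other.total_ms)


def window_bounds(now: datetime, retention_hours: int) -> tuple[datetime, datetime]:
    """Return (retention_cutoff, current_hour_start), both hour-aligned (assumption)."""
    current_hour_start = now.replace(minute=0, second=0, microsecond=0)
    retention_cutoff = current_hour_start - timedelta(hours=retention_hours)
    return retention_cutoff, current_hour_start


def aggregate_three_sources(rollups, raw_rows, now, retention_hours):
    """Merge the three sources without counting any time range twice.

    rollups:  iterable of (bucket_start, count, total_ms), one per rolled-up hour
    raw_rows: iterable of (timestamp, response_ms), one per recorded call
    """
    cutoff, hour_start = window_bounds(now, retention_hours)
    historical = completed = current = Totals()

    # Source 1: rollups strictly older than the retention cutoff.
    for bucket_start, count, total_ms in rollups:
        if bucket_start < cutoff:
            historical = historical + Totals(count, total_ms)

    for ts, response_ms in raw_rows:
        if cutoff <= ts < hour_start:
            # Source 2: raw rows for completed hours inside retention.
            completed = completed + Totals(1, response_ms)
        elif ts >= hour_start:
            # Source 3: raw rows from the current, incomplete hour.
            current = current + Totals(1, response_ms)

    return historical + completed + current
```

Because the three ranges are disjoint (before the cutoff, from the cutoff up to the current hour, and the current hour onward), a run that started only minutes ago still contributes through the third source even though no rollup exists for it yet.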

🔁 Reproduction Steps

  1. Start the application
  2. Start the load tests (make load-tests-ui)
  3. Wait a few minutes
  4. Query /admin/metrics and /api/metrics

🐞 Root Cause

When no rollups were found, the query fell back to scanning the raw metrics in the DB, further stressing it.

💡 Fix Description

Add aggregation for the current, incomplete hour and cache the data, so that load tests shorter than one hour display results.

🧪 Verification

| Check | Command | Status |
|---|---|---|
| Lint suite | make lint | |
| Unit tests | make test | |
| Coverage ≥ 90 % | make coverage | |
| Manual regression no longer fails | steps / screenshots | |

✅ Checklist

  • Code formatted (make black isort pre-commit)
  • No secrets/credentials committed

@gcgoncalves gcgoncalves force-pushed the 1938-admin-metrics-rollup branch 2 times, most recently from a5cf46f to 8b810dc Compare January 30, 2026 15:42
@gcgoncalves gcgoncalves marked this pull request as ready for review January 30, 2026 16:00
@crivetimihai crivetimihai self-assigned this Jan 31, 2026
Fixes: #1938

This commit addresses an issue where admin metrics were empty during
benchmark tests shorter than one hour because they relied on hourly
rollup jobs. The metrics query service is updated to use a three-source
aggregation:

1. Historical rollups (for data older than the retention period)
2. Raw metrics for completed hours within the retention period
3. Raw metrics from the current, incomplete hour

This ensures that metrics are always up-to-date, even before the hourly
rollup job runs, providing immediate visibility and preventing expensive
raw table scans during short-lived tests.

Test improvements:
- Fix flaky test at hour boundary (race condition)
- Remove unused patch import
- Add tests for three-source merge behavior

Signed-off-by: Gabriel Costa <gabrielcg@proton.me>
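
The commit message does not say how the hour-boundary flakiness was fixed. One common way to avoid this kind of race, sketched here as an assumption, is to pass the clock value in explicitly so the test never reads the wall clock twice across an hour change. `current_hour_start` and `is_in_current_hour` are hypothetical helpers, not functions from this repository.

```python
from datetime import datetime, timedelta, timezone


def current_hour_start(now: datetime) -> datetime:
    """Truncate a timestamp to the start of its hour."""
    return now.replace(minute=0, second=0, microsecond=0)


def is_in_current_hour(ts: datetime, now: datetime) -> bool:
    """True if ts falls in the incomplete hour that contains now."""
    return ts >= current_hour_start(now)


def test_hour_boundary_is_deterministic():
    # A fixed "now" at the very end of an hour: the worst case for a test
    # that would otherwise call datetime.now() more than once.
    now = datetime(2026, 1, 30, 15, 59, 59, 999999, tzinfo=timezone.utc)
    boundary = current_hour_start(now)

    assert boundary == datetime(2026, 1, 30, 15, 0, tzinfo=timezone.utc)
    # The boundary itself belongs to the current hour...
    assert is_in_current_hour(boundary, now)
    # ...and one microsecond earlier belongs to the previous, completed hour.
    assert not is_in_current_hour(boundary - timedelta(microseconds=1), now)
```

With the clock injected, the test result no longer depends on when the suite happens to run.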
@crivetimihai crivetimihai force-pushed the 1938-admin-metrics-rollup branch from 8b810dc to d37edc5 Compare January 31, 2026 21:35
@crivetimihai crivetimihai added this to the Release 1.0.0-RC1 milestone Jan 31, 2026
@crivetimihai crivetimihai merged commit 4ff7d7b into main Jan 31, 2026
51 checks passed
@crivetimihai crivetimihai deleted the 1938-admin-metrics-rollup branch January 31, 2026 21:57
hughhennelly pushed a commit to hughhennelly/mcp-context-forge that referenced this pull request Feb 8, 2026
Signed-off-by: hughhennnelly <hughhennelly06@gmail.com>
kcostell06 pushed a commit to kcostell06/mcp-context-forge that referenced this pull request Feb 24, 2026

Development

Successfully merging this pull request may close these issues.

[PERFORMANCE]: Admin metrics rollups empty during benchmark window (raw scans only)

2 participants