Skip to content

[Alerting v2] [Rule authoring] Enforce max values and validation to rule form duration fields#258526

Merged
yiannisnikolopoulos merged 13 commits intoelastic:alerting_v2from
yiannisnikolopoulos:enforce-max-values-to-rule-form-duration-fields
Mar 26, 2026
Merged

[Alerting v2] [Rule authoring] Enforce max values and validation to rule form duration fields#258526
yiannisnikolopoulos merged 13 commits intoelastic:alerting_v2from
yiannisnikolopoulos:enforce-max-values-to-rule-form-duration-fields

Conversation

@yiannisnikolopoulos
Copy link
Copy Markdown
Contributor

@yiannisnikolopoulos yiannisnikolopoulos commented Mar 19, 2026

Summary

Adds maximum duration constraints to all duration fields in the alerting v2 rule form, closing a validation gap where a direct API call could submit arbitrarily large values that would pass server-side validation undetected. Also enforces a minimum schedule interval and warns when the lookback window is shorter than the execution interval.

Problem: Duration fields (schedule.every, schedule.lookback, state_transition.pending_timeframe, state_transition.recovering_timeframe) were only validated for format correctness (e.g. "5m", "1h"). There was no upper or lower bound, so a malformed or malicious API payload could submit values like "9999d" or "1ms" and they would be accepted by both the client and server.

Solution:

Server-side (@kbn/alerting-v2-schemas):

  • Introduce MAX_DURATION = '365d' and MIN_SCHEDULE_INTERVAL = '1m' constants in constants.ts
  • Add validateMaxDuration(value, max), validateMinDuration(value, min), and parseDurationToMs(value) to validation.ts, converting duration strings to milliseconds for comparison
  • Update durationSchema in common.ts to enforce MAX_DURATION on every duration field automatically
  • Add scheduleEverySchema in rule_data_schema.ts to additionally enforce MIN_SCHEDULE_INTERVAL on schedule.every

Client-side (alerting-v2-rule-form):

  • ScheduleField validates both min (1m) and max (365d) on schedule.every
  • LookbackWindowField validates max (365d) on schedule.lookback and shows a plain-text help warning when the lookback is shorter than the schedule interval
  • StateTransitionTimeframeField validates max (365d) on pending_timeframe and recovering_timeframe
  • All client validation reuses the same functions from @kbn/alerting-v2-schemas, keeping client and server in sync

The value 365d was chosen to be intentionally generous: no legitimate rule would need a schedule or timeframe exceeding one year. It is easier to relax a constraint later than to introduce one after the fact (which would be a breaking change for rules already storing larger values).

The decision for these changes has been captured in this doc.

Test plan

  • Unit tests pass for all modified files (128 tests across schema and validation suites)
  • Dedicated validation.test.ts suite covering parseDurationToMs, validateMaxDuration, and validateMinDuration including cross-unit cases (55w exceeds 365d, 500m is within limit)
  • schedule.every rejects values below 1m (e.g. 30s, 59s) and above 365d in both createRuleDataSchema and updateRuleDataSchema
  • schedule.lookback, pending_timeframe, and recovering_timeframe reject values exceeding 365d
  • Manually verify: enter 30s in Schedule field and confirm min error appears
  • Manually verify: enter 9999d in Schedule / Lookback / Pending timeframe / Recovering timeframe fields and confirm max error appears
  • Manually verify: set Lookback window shorter than Schedule interval and confirm warning text appears below the field

@yiannisnikolopoulos yiannisnikolopoulos self-assigned this Mar 19, 2026
@github-actions github-actions bot added the author:actionable-obs PRs authored by the actionable obs team label Mar 19, 2026
@yiannisnikolopoulos yiannisnikolopoulos marked this pull request as ready for review March 19, 2026 14:12
@yiannisnikolopoulos yiannisnikolopoulos requested a review from a team as a code owner March 19, 2026 14:12
Copy link
Copy Markdown
Member

@jasonrhodes jasonrhodes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few testing questions, non-blocking, max value handling LGTM in general.

Comment on lines +21 to +25
function parseDurationToMs(value: string): number {
const match = DURATION_RE.exec(value);
if (!match) return NaN;
return parseInt(match[1], 10) * DURATION_UNIT_TO_MS[match[2]];
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want a few unit tests for this function or only test it via the others?

Copy link
Copy Markdown
Contributor

@adcoelho adcoelho Mar 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for unit tests for all validation functions.

expect(result.success).toBe(true);
});

it('rejects schedule.every exceeding 365d', () => {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these are unit tests and relatively cheap, could we make sure we also test something like "55w" fails, "500m" succeeds ... to make sure the date logic is being fully parsed and not just comparing 366 > 365 etc?

@yiannisnikolopoulos yiannisnikolopoulos requested a review from a team as a code owner March 20, 2026 15:44
@yiannisnikolopoulos yiannisnikolopoulos changed the title [Alerting v2] [Rule authoring] Enforce max values to rule form duration fields [Alerting v2] [Rule authoring] Enforce max values and validation to rule form duration fields Mar 20, 2026
Copy link
Copy Markdown
Member

@jasonrhodes jasonrhodes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a question about the parseDuration methods

let showIntervalWarning = false;
if (field.value && scheduleEvery) {
try {
showIntervalWarning = parseDuration(field.value) < parseDuration(scheduleEvery);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This parseDuration isn't the one that's created in this PR and used in the validation, right? Looks like the new one allows ms and w units but the existing one doesn't. Is this going to cause sneaky bugs?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@macroscopeapp
Copy link
Copy Markdown
Contributor

macroscopeapp bot commented Mar 26, 2026

Approvability

Verdict: Needs human review

All changed files are owned by @elastic/rna-project-team but the author is not a member. Additionally, an open review comment raises a substantive concern about potential inconsistency between two different parseDuration implementations that could cause bugs.

You can customize Macroscope's approvability policy. Learn more.

@elasticmachine
Copy link
Copy Markdown
Contributor

elasticmachine commented Mar 26, 2026

💔 Build Failed

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #42 / Cloud Security Posture GET /internal/cloud_security_posture/stats KSPM Compliance Dashboard Stats API should return KSPM benchmarks V2
  • [job] [logs] affected Scout: [ platform / alerting_v2 ] plugin / local-stateful-classic - Episode lifecycle for alert rules - should track multiple groups independently
  • [job] [logs] affected Scout: [ platform / alerting_v2 ] plugin / local-stateful-classic - Episode lifecycle for alert rules - should track multiple groups independently
  • [job] [logs] affected Scout: [ platform / alerting_v2 ] plugin / local-stateful-classic - Episode lifecycle for alert rules - should transition active -> recovering -> inactive when source data stops breaching
  • [job] [logs] affected Scout: [ platform / alerting_v2 ] plugin / local-stateful-classic - Episode lifecycle for alert rules - should transition active -> recovering -> inactive when source data stops breaching
  • [job] [logs] affected Scout: [ platform / alerting_v2 ] plugin / local-stateful-classic - Episode lifecycle for alert rules - should transition through pending -> active when source data keeps breaching
  • [job] [logs] affected Scout: [ platform / alerting_v2 ] plugin / local-stateful-classic - Episode lifecycle for alert rules - should transition through pending -> active when source data keeps breaching
  • [job] [logs] affected Scout: [ platform / alerting_v2 ] plugin / local-stateful-classic - Episode lifecycle for alert rules - should use pending_count threshold before transitioning to active
  • [job] [logs] affected Scout: [ platform / alerting_v2 ] plugin / local-stateful-classic - Episode lifecycle for alert rules - should use pending_count threshold before transitioning to active

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
@kbn/alerting-v2-schemas 63 69 +6

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
alertingVTwo 333.0KB 334.6KB +1.6KB
Unknown metric groups

API count

id before after diff
@kbn/alerting-v2-schemas 74 84 +10

History

cc @yiannisnikolopoulos

@yiannisnikolopoulos yiannisnikolopoulos merged commit 7c92fc4 into elastic:alerting_v2 Mar 26, 2026
14 of 15 checks passed
jasonrhodes added a commit to jasonrhodes/kibana that referenced this pull request Mar 26, 2026
elastic#258526 introduced a 1m minimum for schedule.every which broke the
Scout episode_lifecycle tests that use a 5s interval. Lower the floor
to 5s so existing tests (and short-interval rules) remain valid.

Removes one redundant "cross-unit" test case that no longer applies at
the 5s boundary.

Made-with: Cursor
darnautov pushed a commit that referenced this pull request Mar 26, 2026
## Summary

Fixes a regression introduced by #258526, which added a
`MIN_SCHEDULE_INTERVAL = '1m'` server-side validation on
`schedule.every`. This broke the Scout `episode_lifecycle` tests that
create rules with a `5s` schedule interval — the API now rejects those
requests, causing the "should track multiple groups independently" test
(and others in the suite) to fail.

**Fix:** Lower `MIN_SCHEDULE_INTERVAL` from `'1m'` to `'5s'` in
`constants.ts`. Update corresponding unit tests to reflect the new
boundary.

**Test removal:** Removes one `createRuleDataSchema` test case ("rejects
schedule.every below 1m via cross-unit (59s)") that was meant to
validate cross-unit comparison but wasn't doing that for 1m so was an
unnecessary test before and after these PR changes. (if the min value
had been 2m, you could test failures at 1m and at 95s, and that's
cross-unit testing ... but when the value is 1m and the lowest unit we
allow is seconds, there's no cross-unit test possible).

A follow-up issue will be created to discuss whether `5s` is the right
long-term minimum or if this should be configurable per environment.

## Test plan

- [x] `rule_data_schema.test.ts` — 95 tests pass
- [x] `validation.test.ts` — 48 tests pass

Made with [Cursor](https://cursor.com)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

author:actionable-obs PRs authored by the actionable obs team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants