
[SigEvents][Evals] Update GCS bucket for SigEvents and add README with instructions for new evals #256314

Merged
viduni94 merged 12 commits into elastic:main from viduni94:update-gcs-bucket-for-sigevents on Mar 6, 2026

Conversation

@viduni94
Contributor

@viduni94 viduni94 commented Mar 5, 2026

Closes https://github.com/elastic/streams-program/issues/928

Summary

Restructures the GCS storage layout for Significant Events snapshot datasets and improves the capture script ergonomics.

New GCS bucket and folder structure

Migrates from the shared obs-ai-datasets bucket to a dedicated significant-events-datasets bucket with a new path hierarchy that groups datasets under a run ID instead of the other way around:

# Before (obs-ai-datasets)
dataset-a/run-id-1/snapshot
dataset-b/run-id-1/snapshot

# After (significant-events-datasets)
run-id-1/dataset-a/snapshot
run-id-1/dataset-b/snapshot

This makes each run ID an "atomic unit" - a single run captures all datasets at a point in time, making it simpler to browse, compare, and clean up old runs.
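
As a rough illustration of why the run-ID-first layout is easier to browse and clean up, the sketch below builds both path shapes and selects everything belonging to one run by prefix. The helper names (`legacyPath`, `runFirstPath`, `objectsForRun`) are hypothetical and do not come from the PR; only the path shapes are taken from the description above.

```typescript
// Hypothetical helpers for illustration; these names are not part of the PR.
const legacyPath = (dataset: string, runId: string): string =>
  `${dataset}/${runId}/snapshot`;

const runFirstPath = (runId: string, dataset: string): string =>
  `${runId}/${dataset}/snapshot`;

// With the run-ID-first layout, every object captured by one run shares a prefix,
// so selecting (or deleting) a whole run is a single prefix match.
const objectsForRun = (paths: string[], runId: string): string[] =>
  paths.filter((p) => p.startsWith(`${runId}/`));

const paths: string[] = [
  runFirstPath('run-id-1', 'dataset-a'),
  runFirstPath('run-id-1', 'dataset-b'),
  runFirstPath('run-id-2', 'dataset-a'),
];

const before: string = legacyPath('dataset-a', 'run-id-1');
const run1: string[] = objectsForRun(paths, 'run-id-1');
console.log(before, run1);
```

Under the old layout the same selection would need one lookup per dataset folder, since the run ID sat one level below each dataset name.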

OTel Demo snapshot capture script improvements

  • Auto-starts minikube if not already running (with --cpus=4 --memory=8g), removing a common manual prerequisite step

Documentation

  • Adds a comprehensive README for the significant events evaluation suite covering prerequisites, running evaluations (all/specific datasets/specs), CLI options, environment variables, collected metrics (deterministic, LLM-as-a-judge, trace-based), and guidance for adding new datasets with reproducible capture scripts
  • Updates the snapshot capture script README to reflect the new bucket name and path structure

Checklist

  • Unit or functional tests were updated or added to match the most common scenarios
  • The PR description includes the appropriate Release Notes section, and the correct release_note:* label is applied per the guidelines
  • Review the backport guidelines and apply applicable backport:* labels.

Summary by CodeRabbit

Release Notes

  • Documentation

    • Added comprehensive documentation for Significant Events evaluations (prereqs, running evaluations, metrics, and how to add datasets/specs).
    • Updated snapshot capture docs with new GCS base path and naming conventions and revised setup steps.
  • New Features

    • Added automatic Minikube verification/startup to streamline snapshot capture.
  • Tests

    • Updated tests to reflect new snapshot path composition.

@viduni94 viduni94 self-assigned this Mar 5, 2026
@viduni94 viduni94 added the following labels on Mar 5, 2026:
  • release_note:skip (Skip the PR/issue when compiling release notes)
  • backport:skip (This PR does not require backporting)
  • Feature:SigEvents (Significant events feature, related to streams and rules/alerts (RnA))
  • models:eis/anthropic-claude-4.6-opus (Run LLM evals against model: eis/anthropic-claude-4.6-opus)
  • models:eis/google-gemini-3.0-flash (Run LLM evals against model: eis/google-gemini-3.0-flash)
  • models:eis/openai-gpt-oss-120b (Run LLM evals against model: eis/openai-gpt-oss-120b)
  • deprecated:models:judge:eis/google-gemini-3.0-pro (DEPRECATED - model no longer available)
  • Team:SigEvents (Project team working on Significant Events)
  • (deprecated) evals:streams-sigevents (This label is deprecated. Use `evals:significant-events` to run the Significant Events eval suite.)
@viduni94 viduni94 marked this pull request as ready for review March 5, 2026 19:45
@viduni94 viduni94 requested review from a team as code owners March 5, 2026 19:45
@viduni94 viduni94 marked this pull request as draft March 5, 2026 19:47
@viduni94 viduni94 marked this pull request as ready for review March 5, 2026 19:54
Member

@jasonrhodes jasonrhodes left a comment


LGTM just had a question about one small try/catch thing

@coderabbitai
Contributor

coderabbitai bot commented Mar 6, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 429c274f-db47-41e4-afc9-46e739a5e9e1

📥 Commits

Reviewing files that changed from the base of the PR and between 66039ac and 35404e7.

📒 Files selected for processing (9)
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/README.md
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/README.md
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/capture_otel_demo_snapshots.ts
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/lib/constants.ts
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/lib/gcs.ts
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/lib/otel_demo.ts
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/load_features_from_snapshot.test.ts
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/snapshot_run_config.test.ts
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/snapshot_run_config.ts
🚧 Files skipped from review as they are similar to previous changes (5)
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/snapshot_run_config.ts
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/snapshot_run_config.test.ts
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/lib/constants.ts
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/lib/otel_demo.ts
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/load_features_from_snapshot.test.ts

📝 Walkthrough

Walkthrough

Reorganizes GCS snapshot paths to a run-id-first structure, updates related docs and constants, adds a Minikube verification/start step to snapshot capture, and adjusts tests and path-resolution logic to match the new storage layout.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Documentation | x-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/README.md, x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/README.md | Added a new Significant Events eval README and updated snapshot README to use significant-events-datasets bucket and new base path conventions. |
| Constants | x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/lib/constants.ts | Changed GCS_BUCKET from obs-ai-datasets to significant-events-datasets, removed GCS_BUCKET_FOLDER, and updated OTEL_DEMO_GCS_BASE_PATH_PREFIX to the new format. |
| GCS path logic | x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/lib/gcs.ts | Reworked base-path composition to ${runId}/${dataset} (defaults dataset to OTEL_DEMO_NAMESPACE), removed use of GCS_BUCKET_FOLDER; signatures unchanged but semantics differ, so review upload/registration callers. |
| Minikube lifecycle | x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/lib/otel_demo.ts | Added ensureMinikube(log: ToolingLog): Promise<void> and constants for minikube resources; function checks status, throws on missing install, or starts Minikube with configured resources. |
| Snapshot capture orchestration | x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/capture_otel_demo_snapshots.ts | Inserted ensureMinikube call (and logging) into the capture flow before repository registration and downstream processing. |
| Path resolution and tests | x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/snapshot_run_config.ts, x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/snapshot_run_config.test.ts, x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/load_features_from_snapshot.test.ts | Swapped path composition to return ${SIGEVENTS_SNAPSHOT_RUN}/${gcs.basePathPrefix}; updated tests to expect run-id-first ordering. Review tests and any code consuming resolved paths. |

Sequence Diagram(s)

sequenceDiagram
  participant Runner as Runner (capture script)
  participant Ensure as ensureMinikube
  participant Minikube as Minikube
  participant Repo as GCS Repository / Registry
  participant GCS as GCS Bucket

  Runner->>Ensure: call ensureMinikube(log)
  Ensure->>Minikube: check status
  alt Minikube running
    Minikube-->>Ensure: status Running
    Ensure-->>Runner: return OK
  else Minikube not installed (ENOENT)
    Minikube-->>Ensure: ENOENT error
    Ensure-->>Runner: throw error (install missing)
  else Minikube not running
    Minikube-->>Ensure: not running
    Ensure->>Minikube: start with --cpus/--memory
    Minikube-->>Ensure: started
    Ensure-->>Runner: return OK
  end

  Runner->>Repo: register repository (basePath: <run-id>/<dataset>)
  Repo-->>Runner: repo registered
  Runner->>GCS: create/upload snapshot to `significant-events-datasets` at `<run-id>/<dataset>/...`
  GCS-->>Runner: upload complete
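
The Minikube branch of the sequence above could be sketched roughly as follows. This is not the PR's implementation of `ensureMinikube(log: ToolingLog)`: the `CommandRunner` and `Logger` types are assumptions introduced so the control flow can be exercised without minikube installed, and matching on the word `Running` in the `minikube status` output is a simplification.

```typescript
// Sketch only. CommandRunner and Logger are assumed stand-ins, not types from the PR.
type CommandRunner = (cmd: string, args: string[]) => Promise<string>;
interface Logger {
  info(msg: string): void;
}

const MINIKUBE_CPUS = 4; // from the PR description: --cpus=4
const MINIKUBE_MEMORY = '8g'; // from the PR description: --memory=8g

// A spawn failure with code ENOENT means the binary was not found on PATH.
const isNotInstalled = (err: unknown): boolean =>
  typeof err === 'object' && err !== null && (err as { code?: string }).code === 'ENOENT';

async function ensureMinikube(run: CommandRunner, log: Logger): Promise<void> {
  try {
    const status = await run('minikube', ['status']);
    if (status.includes('Running')) {
      log.info('minikube is already running');
      return;
    }
  } catch (err) {
    if (isNotInstalled(err)) {
      throw new Error('minikube is not installed or not on PATH');
    }
    // Any other failure of `minikube status` is treated here as "not running".
  }
  log.info('Starting minikube...');
  await run('minikube', ['start', `--cpus=${MINIKUBE_CPUS}`, `--memory=${MINIKUBE_MEMORY}`]);
}
```

Injecting the runner keeps the three branches (running, not installed, not running) testable with a stub; a real caller would pass a thin wrapper around a process-spawning utility.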

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main changes: updating the GCS bucket for SigEvents and adding README documentation with eval instructions. |

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.



@coderabbitai
Contributor

coderabbitai bot commented Mar 6, 2026

📝 Walkthrough

Walkthrough

This PR updates the GCS (Google Cloud Storage) bucket path structure for Significant Events snapshots, changing from "obs-ai-datasets" to "significant-events-datasets" and restructuring the path format from nested directories to a run-ID-first approach. It introduces Minikube initialization checks before snapshot capture and adds comprehensive documentation for the evaluation suite.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| GCS Bucket Configuration | x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/lib/constants.ts, scripts/significant_events_snapshots/README.md | Updated GCS bucket from 'obs-ai-datasets' to 'significant-events-datasets'; removed GCS_BUCKET_FOLDER constant; simplified OTEL_DEMO_GCS_BASE_PATH_PREFIX to use namespace only. README reflects new bucket and path structure. |
| Path Generation Logic | scripts/significant_events_snapshots/lib/gcs.ts, src/data_generators/snapshot_run_config.ts | Refactored generateGcsBasePath to return runId/dataset format instead of nested folder structure. Updated resolveBasePath to prepend SIGEVENTS_SNAPSHOT_RUN before basePathPrefix, inverting the previous directory hierarchy. |
| Path Resolution Tests | src/data_generators/snapshot_run_config.test.ts, src/data_generators/load_features_from_snapshot.test.ts | Updated test assertions to match new path ordering: runId/dataset and snapshot run at top level of bucket. |
| Minikube Initialization | scripts/significant_events_snapshots/lib/otel_demo.ts, scripts/significant_events_snapshots/capture_otel_demo_snapshots.ts | Added new ensureMinikube function that checks and starts Minikube with configured CPU and memory before snapshot capture. Integrated ensureMinikube into snapshot capture workflow. |
| Documentation | evals/significant_events/README.md | Added comprehensive README documenting evaluation suites, prerequisites, execution workflows, collected metrics, and procedures for adding new datasets and evaluation specifications. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly describes the two main changes: updating the GCS bucket for SigEvents and adding a README with instructions for new evals, which directly correspond to the significant changes in the pull request. |


Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


Contributor

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/README.md (1)

23-30: Consider documenting the minikube auto-start behavior.

The PR introduces automatic minikube initialization (with --cpus=4 --memory=8g) if minikube is not already running. Consider adding this information to the Prerequisites section so users are aware of:

  • The automatic startup behavior
  • The resource requirements (4 CPUs, 8GB memory)

This would help users understand what the script will do and ensure their system meets the resource requirements.

📝 Suggested addition to Prerequisites
 ### Prerequisites
 
-- **minikube** + **kubectl**
+- **minikube** + **kubectl** (the script will auto-start minikube with `--cpus=4 --memory=8g` if not already running)
 - Local **Elasticsearch** running with access to GCS credentials: `yarn es snapshot --license trial --secure-files gcs.client.default.credentials_file=/path/to/creds.json`
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/README.md`
around lines 23 - 30, Update the Prerequisites section to document that the
scripts automatically initialize minikube if it is not running: state that the
script will run minikube start with the flags --cpus=4 and --memory=8g (i.e.,
auto-start behavior and resource requirements), so users know the script may
allocate 4 CPUs and 8GB RAM when invoking minikube; add this note near the
existing minikube bullet and include a short warning to ensure their system can
provide those resources.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: cae0d644-b11f-4bdb-8638-647655ddb2bd

📥 Commits

Reviewing files that changed from the base of the PR and between b090600 and b65d12f.

📒 Files selected for processing (9)
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/README.md
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/README.md
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/capture_otel_demo_snapshots.ts
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/lib/constants.ts
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/lib/gcs.ts
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/lib/otel_demo.ts
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/load_features_from_snapshot.test.ts
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/snapshot_run_config.test.ts
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/snapshot_run_config.ts

Contributor

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/snapshot_run_config.ts (1)

22-23: Consider adding an explicit return type for the exported function.

As per coding guidelines, exported functions should have explicit return types.

Suggested change
-export const resolveBasePath = (gcs: GcsConfig) =>
+export const resolveBasePath = (gcs: GcsConfig): string =>
   `${SIGEVENTS_SNAPSHOT_RUN}/${gcs.basePathPrefix}`;
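
For context, a self-contained version of the suggested signature might look like the sketch below. The `GcsConfig` shape and the `SIGEVENTS_SNAPSHOT_RUN` value are placeholders assumed for illustration; only the function body and the `basePathPrefix` field name come from the suggestion above.

```typescript
// Assumed shape: only basePathPrefix is confirmed by the review comment.
interface GcsConfig {
  basePathPrefix: string;
}

// Placeholder value for illustration; the real constant is defined in the suite.
const SIGEVENTS_SNAPSHOT_RUN = 'run-id-1';

// `export` is omitted so the snippet runs as a standalone script.
// The explicit `: string` documents the contract at the declaration site.
const resolveBasePath = (gcs: GcsConfig): string =>
  `${SIGEVENTS_SNAPSHOT_RUN}/${gcs.basePathPrefix}`;

const basePath: string = resolveBasePath({ basePathPrefix: 'dataset-a' });
console.log(basePath);
```

With the annotation in place, an accidental change to the return expression (for example returning an object) would fail type checking at the function itself instead of at each call site.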
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/snapshot_run_config.ts`
around lines 22 - 23, The exported function resolveBasePath currently has an
implicit return type; update its signature to include an explicit return type
(string) so it follows the export typing guideline. Locate the resolveBasePath
declaration and change it to have a typed return (e.g., resolveBasePath = (gcs:
GcsConfig): string => ...), keeping the same implementation that uses
SIGEVENTS_SNAPSHOT_RUN and gcs.basePathPrefix.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: d5da344f-cc1d-40dd-803c-4aa610a4210f

📥 Commits

Reviewing files that changed from the base of the PR and between b090600 and 66039ac.

📒 Files selected for processing (9)
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/README.md
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/README.md
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/capture_otel_demo_snapshots.ts
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/lib/constants.ts
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/lib/gcs.ts
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/lib/otel_demo.ts
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/load_features_from_snapshot.test.ts
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/snapshot_run_config.test.ts
  • x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/snapshot_run_config.ts

@viduni94 viduni94 removed the following labels on Mar 6, 2026:
  • models:eis/anthropic-claude-4.6-opus (Run LLM evals against model: eis/anthropic-claude-4.6-opus)
  • models:eis/google-gemini-3.0-flash (Run LLM evals against model: eis/google-gemini-3.0-flash)
  • models:eis/openai-gpt-oss-120b (Run LLM evals against model: eis/openai-gpt-oss-120b)
  • deprecated:models:judge:eis/google-gemini-3.0-pro (DEPRECATED - model no longer available)
  • (deprecated) evals:streams-sigevents (This label is deprecated. Use `evals:significant-events` to run the Significant Events eval suite.)
@viduni94 viduni94 enabled auto-merge (squash) March 6, 2026 18:40
@viduni94 viduni94 merged commit cd2ff36 into elastic:main Mar 6, 2026
18 checks passed
@elasticmachine
Contributor

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] Jest Tests #1 / Mappings editor: date range datatype should append custom format to default formats

Metrics [docs]

✅ unchanged

History

cc @viduni94

kapral18 pushed a commit to kapral18/kibana that referenced this pull request Mar 9, 2026
…h instructions for new evals (elastic#256314)

Closes elastic/streams-program#928

## Summary

Restructures the GCS storage layout for Significant Events snapshot
datasets and improves the capture script ergonomics.

### New GCS bucket and folder structure

Migrates from the shared `obs-ai-datasets` bucket to a dedicated
`significant-events-datasets` bucket with a new path hierarchy that
groups datasets under a run ID instead of the other way around:
```
# Before (obs-ai-datasets)
dataset-a/run-id-1/snapshot
dataset-b/run-id-1/snapshot

# After (significant-events-datasets)
run-id-1/dataset-a/snapshot
run-id-1/dataset-b/snapshot
```

This makes each run ID an "atomic unit" - a single run captures all
datasets at a point in time, making it simpler to browse, compare, and
clean up old runs.

### OTel Demo snapshot capture script improvements

- Auto-starts `minikube` if not already running (with `--cpus=4
--memory=8g`), removing a common manual prerequisite step

### Documentation

- Adds a comprehensive README for the significant events evaluation
suite covering prerequisites, running evaluations (all/specific
datasets/specs), CLI options, environment variables, collected metrics
(deterministic, LLM-as-a-judge, trace-based), and guidance for adding
new datasets with reproducible capture scripts
- Updates the snapshot capture script README to reflect the new bucket
name and path structure

### Checklist

- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [x] The PR description includes the appropriate Release Notes section,
and the correct `release_note:*` label is applied per the
[guidelines](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
- [x] Review the [backport
guidelines](https://docs.google.com/document/d/1VyN5k91e5OVumlc0Gb9RPa3h1ewuPE705nRtioPiTvY/edit?usp=sharing)
and apply applicable `backport:*` labels.



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

* **Documentation**
* Added comprehensive documentation for Significant Events evaluations
(prereqs, running evaluations, metrics, and how to add datasets/specs).
* Updated snapshot capture docs with new GCS base path and naming
conventions and revised setup steps.

* **New Features**
* Added automatic Minikube verification/startup to streamline snapshot
capture.

* **Tests**
  * Updated tests to reflect new snapshot path composition.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
qn895 pushed a commit to qn895/kibana that referenced this pull request Mar 11, 2026
…h instructions for new evals (elastic#256314)


Labels

  • backport:skip (This PR does not require backporting)
  • Feature:SigEvents (Significant events feature, related to streams and rules/alerts (RnA))
  • release_note:skip (Skip the PR/issue when compiling release notes)
  • Team:SigEvents (Project team working on Significant Events)
  • v9.4.0


5 participants