[SigEvents][Evals] Update GCS bucket for SigEvents and add README with instructions for new evals #256314
jasonrhodes left a comment:
LGTM, just had a question about one small try/catch thing.
📝 Walkthrough
Reorganizes GCS snapshot paths to a run-id-first structure, updates related docs and constants, adds a Minikube verification/start step to snapshot capture, and adjusts tests and path-resolution logic to match the new storage layout.
Sequence Diagram(s)
sequenceDiagram
participant Runner as Runner (capture script)
participant Ensure as ensureMinikube
participant Minikube as Minikube
participant Repo as GCS Repository / Registry
participant GCS as GCS Bucket
Runner->>Ensure: call ensureMinikube(log)
Ensure->>Minikube: check status
alt Minikube running
Minikube-->>Ensure: status Running
Ensure-->>Runner: return OK
else Minikube not installed (ENOENT)
Minikube-->>Ensure: ENOENT error
Ensure-->>Runner: throw error (install missing)
else Minikube not running
Minikube-->>Ensure: not running
Ensure->>Minikube: start with --cpus/--memory
Minikube-->>Ensure: started
Ensure-->>Runner: return OK
end
Runner->>Repo: register repository (basePath: <run-id>/<dataset>)
Repo-->>Runner: repo registered
Runner->>GCS: create/upload snapshot to `significant-events-datasets` at `<run-id>/<dataset>/...`
GCS-->>Runner: upload complete
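The Minikube branch of the diagram above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual `ensureMinikube`: the injected `run` callback, the `Logger` shape, and the use of `minikube status --format={{.Host}}` are assumptions made so the sketch is self-contained. Only the `--cpus=4 --memory=8g` start flags and the three outcomes (running, ENOENT, not running) come from the PR.

```typescript
// Hypothetical sketch of the diagram's Minikube branch; names are illustrative.
// `run` returns stdout and throws on failure (assumed contract).
type RunCommand = (command: string, args: string[]) => string;

interface Logger {
  info(message: string): void;
}

// Resource flags as stated in the PR description.
const MINIKUBE_START_ARGS = ['start', '--cpus=4', '--memory=8g'];

function ensureMinikube(log: Logger, run: RunCommand): void {
  try {
    // Ask minikube for the host state only.
    const stdout = run('minikube', ['status', '--format={{.Host}}']);
    if (stdout.trim() === 'Running') {
      log.info('minikube is already running');
      return;
    }
  } catch (error) {
    // ENOENT means the binary itself could not be found.
    if ((error as { code?: string }).code === 'ENOENT') {
      throw new Error('minikube is not installed or not on PATH');
    }
    // Any other failure is treated as "not running" and falls through.
  }

  log.info('minikube is not running, starting it');
  run('minikube', MINIKUBE_START_ARGS);
}
```

Injecting `run` keeps the sketch testable without a local cluster. In a real script it would probably wrap a process-spawning call such as Node's `execFileSync`.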
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
📝 Walkthrough
This PR moves Significant Events snapshots from the "obs-ai-datasets" GCS (Google Cloud Storage) bucket to "significant-events-datasets" and restructures the path format from dataset-first nesting to a run-ID-first layout. It also adds a Minikube check (with automatic start) before snapshot capture and adds comprehensive documentation for the evaluation suite.
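To make the layout change concrete, here is a small sketch that maps a dataset-first path onto the run-ID-first form. The helper names are hypothetical and exist only for illustration. The PR does not say that existing objects were migrated this way, only that the path hierarchy changed.

```typescript
// Hypothetical helpers illustrating the two layouts; not taken from the PR.
const legacySnapshotPath = (dataset: string, runId: string): string => `${dataset}/${runId}`;
const runFirstSnapshotPath = (runId: string, dataset: string): string => `${runId}/${dataset}`;

// Swaps the first two segments: "<dataset>/<run-id>/..." becomes "<run-id>/<dataset>/...".
function toRunFirstPath(legacyPath: string): string {
  const [dataset, runId, ...rest] = legacyPath.split('/');
  if (!dataset || !runId) {
    throw new Error(`Expected "<dataset>/<run-id>[/...]", got "${legacyPath}"`);
  }
  return [runId, dataset, ...rest].join('/');
}

console.log(toRunFirstPath('dataset-a/run-id-1/snapshot')); // run-id-1/dataset-a/snapshot
```

With the run ID as the leading segment, everything captured in one run shares a single prefix, which is what makes a run easy to browse or delete as a unit.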
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
🧹 Nitpick comments (1)
x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/README.md (1)
23-30: Consider documenting the minikube auto-start behavior.
The PR introduces automatic minikube initialization (with `--cpus=4 --memory=8g`) if minikube is not already running. Consider adding this information to the Prerequisites section so users are aware of:
- The automatic startup behavior
- The resource requirements (4 CPUs, 8GB memory)
This would help users understand what the script will do and ensure their system meets the resource requirements.
📝 Suggested addition to Prerequisites
 ### Prerequisites
-- **minikube** + **kubectl**
+- **minikube** + **kubectl** (the script will auto-start minikube with `--cpus=4 --memory=8g` if not already running)
 - Local **Elasticsearch** running with access to GCS credentials: `yarn es snapshot --license trial --secure-files gcs.client.default.credentials_file=/path/to/creds.json`
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/README.md` around lines 23 - 30, Update the Prerequisites section to document that the scripts automatically initialize minikube if it is not running: state that the script will run minikube start with the flags --cpus=4 and --memory=8g (i.e., auto-start behavior and resource requirements), so users know the script may allocate 4 CPUs and 8GB RAM when invoking minikube; add this note near the existing minikube bullet and include a short warning to ensure their system can provide those resources.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: cae0d644-b11f-4bdb-8638-647655ddb2bd
📒 Files selected for processing (9)
- x-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/README.md
- x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/README.md
- x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/capture_otel_demo_snapshots.ts
- x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/lib/constants.ts
- x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/lib/gcs.ts
- x-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/lib/otel_demo.ts
- x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/load_features_from_snapshot.test.ts
- x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/snapshot_run_config.test.ts
- x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/snapshot_run_config.ts
🧹 Nitpick comments (1)
x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/snapshot_run_config.ts (1)
22-23: Consider adding an explicit return type for the exported function.
As per coding guidelines, exported functions should have explicit return types.
Suggested change
-export const resolveBasePath = (gcs: GcsConfig) =>
+export const resolveBasePath = (gcs: GcsConfig): string =>
   `${SIGEVENTS_SNAPSHOT_RUN}/${gcs.basePathPrefix}`;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/snapshot_run_config.ts` around lines 22 - 23, The exported function resolveBasePath currently has an implicit return type; update its signature to include an explicit return type (string) so it follows the export typing guideline. Locate the resolveBasePath declaration and change it to have a typed return (e.g., resolveBasePath = (gcs: GcsConfig): string => ...), keeping the same implementation that uses SIGEVENTS_SNAPSHOT_RUN and gcs.basePathPrefix.
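For reference, a self-contained sketch of the suggested signature is shown below. The `GcsConfig` shape (including the `bucket` field) and the value of `SIGEVENTS_SNAPSHOT_RUN` are placeholders assumed for illustration. Only the function body and the `basePathPrefix` property are taken from the review comment.

```typescript
// Assumed shape; the real GcsConfig in the package may carry different fields.
interface GcsConfig {
  bucket: string; // hypothetical field, included only to make the example realistic
  basePathPrefix: string;
}

// Placeholder value; the actual run ID constant is defined in the PR's source.
const SIGEVENTS_SNAPSHOT_RUN = 'run-id-1';

// Explicit `: string` return type, as the review comment suggests for exported functions.
export const resolveBasePath = (gcs: GcsConfig): string =>
  `${SIGEVENTS_SNAPSHOT_RUN}/${gcs.basePathPrefix}`;

const exampleConfig: GcsConfig = {
  bucket: 'significant-events-datasets',
  basePathPrefix: 'dataset-a',
};

console.log(resolveBasePath(exampleConfig)); // run-id-1/dataset-a
```

The annotation does not change behavior here, since the template literal already infers `string`. It mainly guards the exported contract against accidental changes to the body.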
💛 Build succeeded, but was flaky
cc @viduni94
…h instructions for new evals (elastic#256314)

Closes elastic/streams-program#928

## Summary

Restructures the GCS storage layout for Significant Events snapshot datasets and improves the capture script ergonomics.

### New GCS bucket and folder structure

Migrates from the shared `obs-ai-datasets` bucket to a dedicated `significant-events-datasets` bucket with a new path hierarchy that groups datasets under a run ID instead of the other way around:

```
# Before (obs-ai-datasets)
dataset-a/run-id-1/snapshot
dataset-b/run-id-1/snapshot

# After (significant-events-datasets)
run-id-1/dataset-a/snapshot
run-id-1/dataset-b/snapshot
```

This makes each run ID an "atomic unit" - a single run captures all datasets at a point in time, making it simpler to browse, compare, and clean up old runs.

### OTel Demo snapshot capture script improvements

- Auto-starts `minikube` if not already running (with `--cpus=4 --memory=8g`), removing a common manual prerequisite step

### Documentation

- Adds a comprehensive README for the significant events evaluation suite covering prerequisites, running evaluations (all/specific datasets/specs), CLI options, environment variables, collected metrics (deterministic, LLM-as-a-judge, trace-based), and guidance for adding new datasets with reproducible capture scripts
- Updates the snapshot capture script README to reflect the new bucket name and path structure

### Checklist

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
- [x] The PR description includes the appropriate Release Notes section, and the correct `release_note:*` label is applied per the [guidelines](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
- [x] Review the [backport guidelines](https://docs.google.com/document/d/1VyN5k91e5OVumlc0Gb9RPa3h1ewuPE705nRtioPiTvY/edit?usp=sharing) and apply applicable `backport:*` labels.

## Summary by CodeRabbit

## Release Notes

* **Documentation**
  * Added comprehensive documentation for Significant Events evaluations (prereqs, running evaluations, metrics, and how to add datasets/specs).
  * Updated snapshot capture docs with new GCS base path and naming conventions and revised setup steps.
* **New Features**
  * Added automatic Minikube verification/startup to streamline snapshot capture.
* **Tests**
  * Updated tests to reflect new snapshot path composition.
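Closes https://github.com/elastic/streams-program/issues/928
Summary
Restructures the GCS storage layout for Significant Events snapshot datasets and improves the capture script ergonomics.
New GCS bucket and folder structure
Migrates from the shared `obs-ai-datasets` bucket to a dedicated `significant-events-datasets` bucket with a new path hierarchy that groups datasets under a run ID instead of the other way around:
    # Before (obs-ai-datasets)
    dataset-a/run-id-1/snapshot
    dataset-b/run-id-1/snapshot

    # After (significant-events-datasets)
    run-id-1/dataset-a/snapshot
    run-id-1/dataset-b/snapshot
This makes each run ID an "atomic unit" - a single run captures all datasets at a point in time, making it simpler to browse, compare, and clean up old runs.
OTel Demo snapshot capture script improvements
- Auto-starts `minikube` if not already running (with `--cpus=4 --memory=8g`), removing a common manual prerequisite step
Documentation
- Adds a comprehensive README for the significant events evaluation suite covering prerequisites, running evaluations (all/specific datasets/specs), CLI options, environment variables, collected metrics (deterministic, LLM-as-a-judge, trace-based), and guidance for adding new datasets with reproducible capture scripts
- Updates the snapshot capture script README to reflect the new bucket name and path structure
Checklist
- Unit or functional tests were updated or added to match the most common scenarios
- The PR description includes the appropriate Release Notes section, and the correct `release_note:*` label is applied per the guidelines
- Review the backport guidelines and apply applicable `backport:*` labels.
Summary by CodeRabbit
Release Notes
Documentation
- Added comprehensive documentation for Significant Events evaluations (prereqs, running evaluations, metrics, and how to add datasets/specs).
- Updated snapshot capture docs with new GCS base path and naming conventions and revised setup steps.
New Features
- Added automatic Minikube verification/startup to streamline snapshot capture.
Tests
- Updated tests to reflect new snapshot path composition.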