refactor(bench): _framework decoupling Phase 1 + 2 — capability flags#2800
Greptile Summary
This PR replaces two hardcoded
Confidence Score: 5/5
Safe to merge: the refactor is strictly additive, backward compatibility is preserved bit-for-bit, and all previous review concerns have been addressed. The capability-flag layer is well-scoped: frozen Pydantic model prevents runtime mutation,
No files require special attention.
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["BenchmarkConfig.lint()"] --> B["_capabilities_for_lint(benchmark)"]
B --> C["capabilities_for(name) — registry.py"]
C --> D{Factory is a class?}
D -- Yes --> E["factory.capabilities (ClassVar read, no __init__)"]
D -- No --> F["factory().capabilities (closure fallback)"]
E --> G["AdapterCapabilities"]
F --> G
C -- KeyError / ImportError --> H["AdapterCapabilities() — all-False default"]
B --> G
G --> I{agent_variant != 'default'?}
I -- supports_agent_variant=False --> J["Error: adapter must declare supports_agent_variant=True"]
I -- supports_agent_variant=True --> K["✓ Accepted"]
G --> L{predictor_variant == 'structured'?}
L -- supports_predictor_variant=False --> M["Error: adapter must declare supports_predictor_variant=True"]
L -- supports_predictor_variant=True --> N["Check OpenAI LLM constraint"]
Reviews (3): Last reviewed commit: "fixed issues"
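The resolution order in the flowchart can be sketched roughly as follows. This is a minimal, dependency-free illustration, not the code in `registry.py`: `AdapterCapabilities` is a frozen dataclass standing in for the PR's pydantic model, and `_REGISTRY`, `PlainAdapter`, and `_missing_extra` are hypothetical names invented for the example.

```python
import inspect
from dataclasses import dataclass
from typing import Any, Callable, ClassVar, Dict


@dataclass(frozen=True)
class AdapterCapabilities:
    """Stand-in for the PR's frozen pydantic model; every flag defaults to False."""

    supports_agent_variant: bool = False
    supports_predictor_variant: bool = False


class CloudOpsBenchAdapter:
    # Declared on the class, so it can be read without calling __init__.
    capabilities: ClassVar[AdapterCapabilities] = AdapterCapabilities(
        supports_agent_variant=True, supports_predictor_variant=True
    )


class PlainAdapter:
    """Hypothetical adapter that opts in to nothing."""

    capabilities: ClassVar[AdapterCapabilities] = AdapterCapabilities()


def _missing_extra() -> Any:
    # Hypothetical factory simulating a lazily imported adapter whose
    # optional dependency is not installed.
    raise ImportError("optional adapter dependency not installed")


# Hypothetical registry: benchmark name -> factory (a class or a closure).
_REGISTRY: Dict[str, Callable[[], Any]] = {
    "cloudopsbench": CloudOpsBenchAdapter,
    "plain": lambda: PlainAdapter(),
    "broken": _missing_extra,
}


def capabilities_for(name: str) -> AdapterCapabilities:
    """Return the adapter's flags, falling back to the all-False default."""
    try:
        factory = _REGISTRY[name]
        if inspect.isclass(factory):
            # Class factory: read the ClassVar, no instantiation needed.
            return factory.capabilities
        # Closure factory: the only way to reach the flags is to call it.
        return factory().capabilities
    except (KeyError, ImportError):
        # Unknown or unimportable adapter: lock everything down.
        return AdapterCapabilities()
```

With this shape, a typo such as `capabilities_for("cloudopsbnch")` yields the all-False default instead of raising, which is what lets the lint step report one clear error per gated knob.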
@YauhenBichel has entered the contributor hall of fame. Merged.

Fixes #2074
Describe the changes you have made in this PR -
First two of five planned phases to decouple `_framework/` from
CloudOpsBench-specific assumptions. Today the framework hardcodes
`if config.benchmark != "cloudopsbench"` in multiple places, so any new
adapter (OpenRCA, ToolCallBench) needs framework changes to opt in to
features it actually supports. This PR replaces name-based dispatch
with a capability flag layer.
Phase 1: `AdapterCapabilities` model + ABC integration
- `AdapterCapabilities` pydantic model in `_framework/adapter_base.py`. Frozen, `extra="forbid"`. Two flags to start: `supports_agent_variant`, `supports_predictor_variant`. Default all-False so a new adapter is locked down to the minimum surface until it opts in deliberately.
- `BenchmarkAdapter` ABC gets `capabilities: ClassVar[AdapterCapabilities]` with all-False default.
- `CloudOpsBenchAdapter` declares `capabilities = AdapterCapabilities(supports_agent_variant=True, supports_predictor_variant=True)`, which is exactly what the previous hardcoded check let through.
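A rough sketch of the Phase 1 shape, for orientation only. To stay dependency-free it uses a frozen dataclass where the PR uses a frozen pydantic model with `extra="forbid"`; the dataclass rejects unknown keyword arguments with `TypeError` and rejects mutation with `FrozenInstanceError`, which approximates the same guarantees. The `name()` method and `OpenRCAAdapter` class are hypothetical and exist only to make the example complete.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import ClassVar


@dataclass(frozen=True)
class AdapterCapabilities:
    """Dataclass stand-in for the PR's pydantic model (frozen, no extra fields)."""

    supports_agent_variant: bool = False
    supports_predictor_variant: bool = False


class BenchmarkAdapter(ABC):
    # All-False by default: a new adapter gets the minimum surface
    # until it opts in deliberately.
    capabilities: ClassVar[AdapterCapabilities] = AdapterCapabilities()

    @abstractmethod
    def name(self) -> str:
        """Hypothetical abstract method, present only so the ABC is abstract."""


class CloudOpsBenchAdapter(BenchmarkAdapter):
    # Mirrors what the old name-based check allowed.
    capabilities = AdapterCapabilities(
        supports_agent_variant=True, supports_predictor_variant=True
    )

    def name(self) -> str:
        return "cloudopsbench"


class OpenRCAAdapter(BenchmarkAdapter):
    """Hypothetical new adapter: declares nothing, inherits the all-False default."""

    def name(self) -> str:
        return "openrca"
```

Because the flags live on the class, callers can inspect `CloudOpsBenchAdapter.capabilities` without constructing an adapter, and a misspelled flag name fails when the class body is evaluated rather than during a run.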
Phase 2: Config validation goes capability-aware
- `capabilities_for(name)` helper in `_framework/registry.py`: returns the registered adapter's flags without forcing the caller to instantiate one.
- `_framework/config.py` replaces both `if benchmark != "cloudopsbench"` guards with `if not adapter_caps.supports_<feature>`. Unknown adapter → all-False default → guard refuses the gated knob (a typo in `config.benchmark` surfaces with a clear error for each gated knob, plus the underlying unknown-benchmark error at runner build time).
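The capability-aware guards might look roughly like this. It is a simplified sketch under several assumptions: `BenchmarkConfig` is reduced to three fields, `lint()` is assumed to return a list of error strings, `_KNOWN` is a hypothetical lookup table standing in for the real registry, and the variant value `"custom"` is invented for the example. The trigger conditions (`agent_variant != "default"`, `predictor_variant == "structured"`) follow the review flowchart; the OpenAI LLM constraint is omitted.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AdapterCapabilities:
    """Dataclass stand-in for the PR's pydantic model."""

    supports_agent_variant: bool = False
    supports_predictor_variant: bool = False


# Hypothetical lookup table standing in for the adapter registry.
_KNOWN: Dict[str, AdapterCapabilities] = {
    "cloudopsbench": AdapterCapabilities(
        supports_agent_variant=True, supports_predictor_variant=True
    ),
}


def capabilities_for(name: str) -> AdapterCapabilities:
    # Unknown names fall back to the all-False default.
    return _KNOWN.get(name, AdapterCapabilities())


@dataclass
class BenchmarkConfig:
    benchmark: str
    agent_variant: str = "default"
    predictor_variant: str = "default"

    def lint(self) -> List[str]:
        """Collect one error per gated knob the adapter has not opted in to."""
        caps = capabilities_for(self.benchmark)
        errors: List[str] = []
        if self.agent_variant != "default" and not caps.supports_agent_variant:
            errors.append(
                f"agent_variant={self.agent_variant!r}: adapter "
                f"{self.benchmark!r} must declare supports_agent_variant=True"
            )
        if (
            self.predictor_variant == "structured"
            and not caps.supports_predictor_variant
        ):
            errors.append(
                f"predictor_variant='structured': adapter "
                f"{self.benchmark!r} must declare supports_predictor_variant=True"
            )
        return errors
```

The guards never compare against a benchmark name, so a future adapter enables a knob by flipping its own flag rather than by editing the framework.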
Backward compat
- `*.yml` configs in the repo continue to validate cleanly: the capability declaration on the adapter matches the previous hardcoded behavior bit-for-bit.
- `_framework/adapters.py` re-exports `AdapterCapabilities` and `capabilities_for` so external imports use the same path as before.
- `extra="forbid"` catches capability-name typos at adapter declaration time, not at run time.