Feature Request: Typed Config-Runtime Contract
Supersedes: This FR has been expanded from the original "Typed Plugin Hook Protocol" scope. Hook calls are one sub-pattern; this FR now covers the full contract gap across config→runtime, state→path, and interface→caller boundaries.
Value Score: 44/60 (73%) — HIGH VALUE
| Dimension | Score | Rationale |
|---|---|---|
| Impact Breadth | 9/10 | 17/100 open bugs (17%) share this root cause; it is the #1 recurring pattern |
| Impact Severity | 7/10 | Silent failures, hard to debug, user sees wrong behavior with no error |
| Fix Leverage | 9/10 | One structural fix prevents entire bug class; individual ~30-line fixes don't |
| Implementation Feasibility | 5/10 | Touches core modules, but can be rolled out incrementally |
| Upstream Receptiveness | 6/10 | Architectural FRs are harder to accept, but 17-bug evidence is compelling |
| Proof of Concept | 8/10 | 5 fixes already demonstrate the pattern; get_custom_provider_model_field() is a working micro-example |
No existing FR covers this scope. Related:
Problem
Hermes has no contract layer between config/state producers and consumers. When a new field is added to config.yaml, a hook signature, or a state object, there is no mechanism to ensure every downstream consumer picks it up. The result: silent failures that only surface when users hit the un-updated code path.
This is not a theory — it is the #1 recurring bug pattern in the open issue tracker.
Evidence: 17 bugs, one root cause, four sub-patterns
We analyzed 100 open bug issues and found 17 that follow the exact same structural defect. They break into four sub-patterns:
Sub-pattern 1: Config field declared, no consumer (4 bugs)
New config.yaml key added and documented, but the bridge code that maps it to a runtime variable/env-var was never updated.
| Bug | Symptom |
|---|---|
| #28046 ✅ | custom_providers[].models.*.max_tokens ignored, always defaults to 4096 |
| #28863 | terminal.docker_extra_args silently dropped: missing from _terminal_env_map |
| #28651 | web_tools hardcodes provider list, ignores configured providers |
| #28034 | /model --global doesn't persist when using the visual model picker |
Sub-pattern 2: Path A works, path B broken (5 bugs)
Feature works correctly on one execution path (e.g., startup) but is broken on another (e.g., /model switch, gateway restart, fallback activation). The logic was copy-pasted or reimplemented instead of shared.
| Bug | Symptom |
|---|---|
| #28753 | TUI doesn't propagate fallback_model/fallback_providers (gateway path works) |
| #28825 | OpenAI-compatible API doesn't honour tools param (Anthropic path works) |
| #28746 | session:end event not emitted from idle-expiry/auto-reset path (normal close works) |
| #28637 | Per-model token usage lost during /model switch (init path records it) |
| #28023 | Credential pool strategy not honoured on fallback (startup path reads it) |
Sub-pattern 3: Interface expanded, caller missed (3 bugs)
A hook signature, callback, or protocol method was extended with new parameters, but not all call sites were updated.
| Bug | Symptom |
|---|---|
| #28961 ✅ | pre_tool_call hook missing session_id/tool_call_id in 2 of 3 call sites |
| #28296 ✅ | OpenViking missing on_session_switch(): interface method added, implementation missed |
| #28662 | hermes cron list crashes on MCP-created jobs: schedule field type assumption differs |
Sub-pattern 4: State added, not propagated (3 bugs)
A new field was added to a state dict/object, but not all paths that serialize/deserialize/transform that state preserve the new field.
| Bug | Symptom |
|---|---|
| #28841 ✅ | Message timestamp lost during fork/compress/branch; always DB write-time |
| #28632 | Gateway restart leaves launchd service unloaded: stop path cleans up, restart doesn't |
| #28489 | Gateway persists invalid /model status override and keeps reusing it |
(✅ = already fixed, listed here as evidence the pattern is real and fixable)
Also related (same structural pattern, different domain)
| Bug | Domain | Same root cause |
|---|---|---|
| #28598 ✅ | Display | build_tool_preview() hardcoded if-elif chain: new tool → forgotten entry → #28621 |
| #28663 ✅ | Gateway | Exec quick commands blocked during drain: one path checked, other didn't |
Why individual fixes don't scale
Each of the 5 bugs we fixed required ~30 lines of targeted code. But fixing #28046 (max_tokens) did nothing to prevent #28863 (docker_extra_args) — they're the same structural defect manifesting in different config keys. Every new config field or code path is a ticking time bomb until someone reports it.
The pattern will keep recurring as long as producer→consumer bindings remain implicit.
Proposed Solution
A lightweight contract layer that makes bindings declarative and verifiable — not a big-bang rewrite, but an incremental rollout:
Phase 1: Config Field Registry + Startup Validator
Define a registry mapping config keys to their expected consumers:
CONFIG_CONTRACTS = {
    "terminal.docker_extra_args": {
        "env_var": "TERMINAL_DOCKER_EXTRA_ARGS",
        "consumer": "gateway.run._terminal_env_map",
        "type": list[str],
    },
    "custom_providers.*.models.*.max_tokens": {
        "consumer": "agent.agent_init",
        "fallback": 4096,
        "type": int,
    },
}
On startup, validate every declared field has at least one active consumer. Emit warnings (not errors) for:
- Orphan fields (declared in config, no consumer registered)
- Stale bindings (consumer references a field that no longer exists)
Zero behavior change — purely additive observability.
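The startup check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not existing Hermes code: `validate_contracts`, `SCHEMA_FIELDS`, `CONTRACTS`, and the field `terminal.old_flag` are hypothetical names, and wildcard keys such as `custom_providers.*.models.*.max_tokens` are not handled here.

```python
import warnings

# Hypothetical inputs for illustration only. In practice the field set would
# come from the config schema and the contracts from the registry above.
SCHEMA_FIELDS = {"terminal.docker_extra_args", "terminal.gpu_layers"}
CONTRACTS = {
    "terminal.docker_extra_args": {"consumer": "gateway.run._terminal_env_map"},
    "terminal.old_flag": {"consumer": "gateway.run._terminal_env_map"},
}


def validate_contracts(schema_fields, contracts):
    """Warn about orphan fields and stale bindings; never raise."""
    # Orphan: the field exists in the config schema but no consumer is registered.
    orphans = sorted(
        field for field in schema_fields
        if not contracts.get(field, {}).get("consumer")
    )
    # Stale: a binding refers to a field that is no longer in the schema.
    stale = sorted(key for key in contracts if key not in schema_fields)
    for field in orphans:
        warnings.warn(f"orphan config field (no consumer): {field}")
    for key in stale:
        warnings.warn(f"stale binding (field no longer exists): {key}")
    return orphans, stale
```

With the sample data, `terminal.gpu_layers` is reported as an orphan and `terminal.old_flag` as a stale binding. Because the function only warns and returns, wiring it into startup would not change runtime behavior.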
Phase 2: Typed Hook Payloads
Replace ad-hoc kwargs in plugin hook invocations with typed payloads:
from dataclasses import dataclass

@dataclass
class PreToolCallPayload:
    session_id: str
    tool_call_id: str
    tool_name: str
    tool_input: dict
    # Future fields added here automatically propagate

# Single entry point constructs the payload; all call sites get every field
This is the original scope from this FR (#28961, #25204, #7344). A Protocol-based approach ensures any new field added to the payload is structurally visible to all consumers.
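A sketch of what the "single entry point" could look like, assuming hooks are plain callables that take the payload. The function name `fire_pre_tool_call` and the `Hook` alias are hypothetical and not part of the current plugin API.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class PreToolCallPayload:
    session_id: str
    tool_call_id: str
    tool_name: str
    tool_input: dict[str, Any]


# Hypothetical hook type: each plugin hook receives the whole payload.
Hook = Callable[[PreToolCallPayload], None]


def fire_pre_tool_call(
    hooks: list[Hook],
    *,
    session_id: str,
    tool_call_id: str,
    tool_name: str,
    tool_input: dict[str, Any],
) -> PreToolCallPayload:
    """Build the payload once and pass the same object to every hook."""
    payload = PreToolCallPayload(
        session_id=session_id,
        tool_call_id=tool_call_id,
        tool_name=tool_name,
        tool_input=tool_input,
    )
    for hook in hooks:
        hook(payload)
    return payload
```

Since every argument is keyword-only and required, a call site that omits `session_id` or `tool_call_id` raises `TypeError` at the call instead of silently delivering a partial payload, which is the failure mode behind #28961.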
Phase 3: Path Parity Tests
A test helper that verifies: if path A (startup) reads/configures field X, then path B (/model switch, gateway restart, fallback) must also read X.
def assert_config_path_parity(field: str, paths: list[str]):
    """Fail CI if any declared path doesn't consume the field."""
This turns "path divergence" bugs into CI failures before they reach users.
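One possible body for the helper, assuming each execution path declares (or records through an instrumented config accessor) the set of fields it reads. The `PATH_READS` table, its path names, and the optional `reads` parameter are hypothetical additions for illustration.

```python
from typing import Optional

# Hypothetical declaration of which config fields each execution path reads.
PATH_READS: dict[str, set[str]] = {
    "startup": {"fallback_model", "credential_pool_strategy"},
    "model_switch": {"fallback_model"},
}


def assert_config_path_parity(
    field: str,
    paths: list[str],
    reads: Optional[dict[str, set[str]]] = None,
) -> None:
    """Raise AssertionError if any listed path does not consume the field."""
    table = PATH_READS if reads is None else reads
    # A path that is missing from the table counts as reading nothing.
    missing = [path for path in paths if field not in table.get(path, set())]
    if missing:
        raise AssertionError(
            f"{field!r} is not consumed by: {', '.join(missing)}"
        )
```

With the sample table, checking `fallback_model` across both paths passes, while checking `credential_pool_strategy` fails and names `model_switch` as the path that dropped it.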
What this prevents
| Future scenario | Without contract | With contract |
|---|---|---|
| New config field terminal.gpu_layers added | Developer forgets to add to env_map → silent drop until user reports | Registry shows orphan field at startup → caught immediately |
| New hook param retry_count added to pre_tool_call | Works in sequential path, forgotten in concurrent path | Typed payload means all paths get all fields |
| Model switch adds new state field | Init preserves it, switch loses it | Path parity test fails in CI |
| New tool added to plugin | build_tool_preview() shows generic output (#28598) | Declarative preview registered at tool definition time (#28621) |
Existing proof of concept
The generic get_custom_provider_model_field() function introduced in PR #28988 (fixing #28046) is a working micro-example of this approach: instead of separate get_custom_provider_context_length() and get_custom_provider_max_tokens() functions with duplicated logic, we extracted a single generic lookup that any new per-model config field can use without writing new bridge code.
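To illustrate the idea of a single generic lookup, here is a sketch that is not the code from PR #28988. The function name `lookup_model_field`, its signature, and the assumed config shape (a list of provider dicts with a `name` key and a `models` mapping) are all assumptions.

```python
from typing import Any


def lookup_model_field(
    custom_providers: list[dict[str, Any]],
    provider: str,
    model: str,
    field: str,
    default: Any = None,
) -> Any:
    """Return custom_providers[provider].models[model][field], else default."""
    for entry in custom_providers:
        if entry.get("name") != provider:
            continue
        # Tolerate providers with a missing or empty models section.
        model_cfg = (entry.get("models") or {}).get(model) or {}
        if field in model_cfg:
            return model_cfg[field]
    return default


# Hypothetical config data, shaped like a parsed config.yaml section.
providers = [
    {"name": "local", "models": {"llama": {"max_tokens": 8192}}},
]
max_tokens = lookup_model_field(providers, "local", "llama", "max_tokens", 4096)
context_length = lookup_model_field(providers, "local", "llama", "context_length")
```

A new per-model field needs no new accessor in this shape: the caller passes the field name and a default, so there is no per-field bridge code to forget.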
Fixes (as evidence): #28046, #28961, #28841, #28663, #28296
Would prevent: #28863, #28662, #28034, #28753, #28651, #28489, #28023, #28746, #28637, #28825, #28632, #28055
Related: #27342 (complementary), #28621 (same pattern, different domain)