refactor(T2): default-flip empty-content recovery via HERMES_LEGACY_EMPTY_RECOVERY (refs #143)#147
Merged
Merged
Conversation
…MPTY_RECOVERY (refs #143) T2 of the #143 thin-client refactor scope. Devagentic-side cascade (NousResearch#324) + runaway detector (NousResearch#345-348) + exec-terminus (NousResearch#349-354) now cover the empty-content recovery layer with full intent / role / dispatch-trace context. Hermes-side recovery layered on top caused 3×4 dispatch stacking (#118) + mode confusion + invisible swallow points (#133 debug funnel). This patch default-flips: the four hermes-side recovery paths are short-circuited unless ``HERMES_LEGACY_EMPTY_RECOVERY`` is set to a truthy value. Legacy users opt-in to keep the pre-T2 behavior. ## Gated paths (all skipped when env unset) | Path | Source | Loc | |---|---|---| | ``_finish_wants_tools`` synthetic recovery | PR #108 / #99 | line ~3618 | | ``_post_tool_empty_retried`` nudge | pre-existing NousResearch#9400-class | line ~3749 | | ``_structural_empty`` synthetic recovery | PR #69 / #67 | line ~3880 | | 3-retry empty-content loop | pre-#67 | line ~3927 | When env unset (default), empty responses fall through to either the fallback-chain provider switch (if configured) or the clean ``(empty)`` terminal with ``_empty_terminal_sentinel=True``. ## Preserved through T2 (verified by test_t2_legacy_empty_recovery_gate.py) - **PR #119 cascade_exhausted short-circuit** — hermes correctly deferring to devagentic's sentinel; NOT recovery. Source-level test asserts the ``if _cascade_err:`` block is NOT prefixed by ``_legacy_recovery_on``. - **PR #122/#125 raw tool_calls fallback** — pre-recovery wire parsing in transports/chat_completions.py. Source-level test asserts the helper name doesn't appear in conversation_loop (lives elsewhere; untouched). - **PR #131/#136/#138/#141 diagnostics** — env-gated via ``HERMES_DIAG_RAW_CAPTURE`` (#140), independent of this env. ## Operator deploy Default behavior changes: empty responses surface cleanly (no synthetic re-prompt). To preserve pre-T2 behavior: ```bash export HERMES_LEGACY_EMPTY_RECOVERY=1 ``` The legacy escape hatch is intended as a temporary safety net while operators validate the thin-client architecture. Once the devagentic-side cascade is universally deployed + observed to cover all empty-content cases, the legacy gate can be removed in a follow-up (the env var stays as a no-op for backward compat). ## Tests - 23 new tests in ``tests/agent/test_t2_legacy_empty_recovery_gate.py``: resolver default-false / empty-false / 6 truthy / 6 falsy-or- unknown; source-level gate assertions for each of the four recovery branches; preserved-path assertions for cascade_exhausted + raw tool_calls fallback location; default-off + opt-in resolver round-trip. - 163 total green across affected suites (T2 + existing empty- terminal mirror tests + finish_reason_tools_recovery + cascade_exhausted + internal_marker_stripping + tool_call_type_default + tool_use_enforcement + T3 + T1 persona + doctor persona probe). ## Composition Per #143 sequencing: - T3 / PR #145 (clarify default-out) — merged ✓ - T1 / PR #146 (persona default-flip) — merged ✓ - **T2 / this PR (empty-content recovery removal, env-gated)** - Later: T4-T6 (tool list / iteration cap / summary fallback) The legacy escape hatch design choice (vs. full removal) lets operators roll back per-deployment if a previously-recovered edge case surfaces in the field. Diagnostic from #140 (HERMES_DIAG_RAW_CAPTURE) remains the observability tool to spot any uncovered case.
2 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
T2 of #143 thin-client refactor. Default-flips the four hermes-side empty-content recovery paths to OFF in one ship; legacy users re-enable via
HERMES_LEGACY_EMPTY_RECOVERY=1.Devagentic-side cascade (NousResearch#324) + runaway detector (NousResearch#345-348) + exec-terminus (NousResearch#349-354) now own this layer. Hermes-side recovery on top caused the #118 dispatch stacking + #133 invisible swallow points. Default-flip removes the conflict; opt-in env preserves legacy behavior for safety.
Gated paths (4):
_finish_wants_toolssynthetic (PR #108),_post_tool_empty_retriednudge,_structural_emptysynthetic (PR #69), 3-retry loop.Preserved (verified by source-level tests): PR #119 cascade_exhausted short-circuit (hermes deferring to devagentic; NOT recovery), PR #122/#125 raw tool_calls fallback in normalize_response, PR #131/#136/#138/#141 diagnostics (#140-gated).
When env unset: empty responses fall through to fallback-chain or clean
(empty)terminal. 23 new T2 tests + 163 total green across recovery / diagnostics / T1+T3 suites.