Fix silent success for non-deliverable Bedrock Telegram turns#82905
Conversation
|
Codex review: needs maintainer review before merge. Summary Reproducibility: yes. source-reproducible: current main can mark a no-delivery terminal attempt as trajectory success, and the retry allowlist omits Real behavior proof Next step before merge Security Review detailsBest possible solution: Land a maintainer-reviewed version that keeps the retry and trajectory-status logic centralized, preserves non-text delivery evidence, and retains the focused tests plus recorded live Telegram/Bedrock proof. Do we have a high-confidence way to reproduce the issue? Yes, source-reproducible: current main can mark a no-delivery terminal attempt as trajectory success, and the retry allowlist omits Is this the best way to solve the issue? Yes, the proposed direction is a narrow maintainable fix: classify terminal delivery status before trajectory recording and extend the existing non-visible retry guard to Bedrock Converse. The remaining step is maintainer review/landing, not an automated repair. What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against ee10fe17f063. |
edc853b to
12d2bc6
Compare
…aw#82905) * fix: handle non-deliverable terminal turns * chore: add changelog for non-deliverable turns * fix: align telegram message cache types
…aw#82905) * fix: handle non-deliverable terminal turns * chore: add changelog for non-deliverable turns * fix: align telegram message cache types
…aw#82905) * fix: handle non-deliverable terminal turns * chore: add changelog for non-deliverable turns * fix: align telegram message cache types
…aw#82905) * fix: handle non-deliverable terminal turns * chore: add changelog for non-deliverable turns * fix: align telegram message cache types
…aw#82905) * fix: handle non-deliverable terminal turns * chore: add changelog for non-deliverable turns * fix: align telegram message cache types
…aw#82905) * fix: handle non-deliverable terminal turns * chore: add changelog for non-deliverable turns * fix: align telegram message cache types
…aw#82905) * fix: handle non-deliverable terminal turns * chore: add changelog for non-deliverable turns * fix: align telegram message cache types
…aw#82905) * fix: handle non-deliverable terminal turns * chore: add changelog for non-deliverable turns * fix: align telegram message cache types
…aw#82905) * fix: handle non-deliverable terminal turns * chore: add changelog for non-deliverable turns * fix: align telegram message cache types
…aw#82905) * fix: handle non-deliverable terminal turns * chore: add changelog for non-deliverable turns * fix: align telegram message cache types
…aw#82905) * fix: handle non-deliverable terminal turns * chore: add changelog for non-deliverable turns * fix: align telegram message cache types
…aw#82905) * fix: handle non-deliverable terminal turns * chore: add changelog for non-deliverable turns * fix: align telegram message cache types
Summary
Verification
.agents/skills/codex-review/scripts/codex-review --mode local(clean; focused Vitest coverage passed: 7 files, 309 tests)node scripts/run-vitest.mjs src/agents/pi-embedded-runner/run/attempt-trajectory-status.test.ts src/agents/pi-embedded-runner/run.incomplete-turn.test.ts src/trajectory/export.test.ts src/trajectory/metadata.test.ts src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.subscribeembeddedpisession.test.ts(passed: 8 files, 312 tests)node scripts/run-tsgo.mjs -p tsconfig.extensions.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/extensions-pr82905.tsbuildinfonode scripts/run-bundled-extension-oxlint.mjsnode scripts/check-extension-package-tsc-boundary.mjs --mode=compilerun_943fee6e6d01/cbx_9bbc8d8db38c: original Telegram + Bedrock repro observed a Telegram reply after the patchrun_ca257471b04a/cbx_1081c2d5346a:pnpm check:changedpassed after Node 22/Corepack setupba1b76ccd933a6f379e5d64d98fbef5b8f02c28b: all checks green/skipped exceptReal behavior proofReal behavior proof
Behavior addressed: A Bedrock reasoning/empty terminal turn for Telegram could complete with no Telegram reply while trajectory artifacts still reported success.
Real environment tested: AWS Crabbox direct provider
cbx_9bbc8d8db38c, Telegram live bot group via Convextelegramcredential, Bedrock modelamazon-bedrock/openai.gpt-oss-120b-1:0.Exact steps or command run after this patch: Installed dependencies, built the repo, started the QA live Telegram gateway, sent the original prompt from #82394, and waited for the SUT Telegram bot reply.
Evidence after fix: Crabbox run
run_943fee6e6d01; sent Telegram message8958; observed SUT reply message8959.Observed result after fix:
replyObserved: true,observedSutMessageCount: 1, command exit0.What was not tested: Full repository test suite; the changed gate, focused tests, and PR CI were run instead.
Fixes #82394