Skip to content

fix(line): stop silent push failure on lowercased recipients (#81628)#81704

Merged
steipete merged 3 commits into
openclaw:mainfrom
edenfunf:fix/line-recipient-recovery-81628
May 15, 2026
Merged

fix(line): stop silent push failure on lowercased recipients (#81628)#81704
steipete merged 3 commits into
openclaw:mainfrom
edenfunf:fix/line-recipient-recovery-81628

Conversation

@edenfunf

@edenfunf edenfunf commented May 14, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes #81628. LINE push silently fails for long-running tasks because the cron-tool's session-key fallback produces a lowercased recipient that LINE rejects with HTTP 400, and the LINE plugin has no early-rejection guard, so the failure burns five retries inside delivery-recovery before the entry is moved to failed/. Operators see "delivered" on the dashboard while the LINE group receives nothing.

  • Problem: inferDeliveryFromSessionKey in src/agents/tools/cron-tool.ts rebuilds delivery.to from the session-key fragment when currentDeliveryContext is unavailable (e.g. queue replay, post-reply-token push). buildAgentPeerSessionKey lowercases the peer id for canonical routing, so the rebuilt to is lowercased (cabcdef... instead of Cabcdef...). LINE chat ids are case-sensitive — push returns HTTP 400 "The property, 'to', in the request body is invalid".
  • Why it matters: Every LINE account / agent that handles tasks longer than the reply-token TTL (~60s) in groups is affected. The dashboard reports the message as delivered. End users in LINE groups never see long-task results.
  • What changed: (1) cron-tool refuses the session-key fallback when channel === "line", so the missing target surfaces explicitly. (2) LINE plugin's normalizeTarget rejects 33+ char recipients that do not begin with a capital C/U/R; the error message matches delivery-queue-recovery's PERMANENT_ERROR_PATTERNS, so a recovery replay that ever encounters a lowercased value moves the entry to failed/ on the first attempt instead of retrying 5x.
  • What did NOT change: Telegram/Matrix/Discord/Slack/WhatsApp cron-tool paths (telegram-specific accountId/threadId logic preserved). Short fixtures like "U123", "line:user:1" continue to pass the LINE plugin (length-based gate). Session-key canonical lowercasing is intentional and untouched.

Root cause

Two layers each contribute to the silent failure:

  1. inferDeliveryFromSessionKey in src/agents/tools/cron-tool.ts treats the lowercased session-key peer fragment as authoritative when currentDeliveryContext and explicit delivery are both missing. The Matrix-equivalent of this bug is mitigated for live runs by always supplying currentDeliveryContext (see the existing prefers current delivery context over lowercased session-key targets case in cron-tool.test.ts), but cron-fire and delivery-recovery paths can run without a currentDeliveryContext. For LINE, even a lowercased session-key fragment cannot recover the original case, so the only safe action is to refuse the fallback.
  2. normalizeTarget in extensions/line/src/send.ts strips the line:group: / line:user: / line:room: prefixes but does not validate the resulting id's case. A lowercased value flows through to client.pushMessage, LINE returns HTTP 400, and delivery-recovery does not classify that error as permanent — so the entry retries five times before moving to failed/. The error message added in this PR matches the existing /recipient is not a valid/i pattern in PERMANENT_ERROR_PATTERNS, so a single attempt is enough to flag the failure to operators.

Changes

  • src/agents/tools/cron-tool.ts: return null from inferDeliveryFromSessionKey when channel === "line".
  • src/agents/tools/cron-tool.test.ts: two new repros — group-shape and per-account-channel-peer DM-shape. Both use the real buildAgentPeerSessionKey helper so the canonical-lowercasing step that triggers the bug runs in the same process as the fix.
  • extensions/line/src/send.ts: normalizeTarget throws Recipient is not a valid LINE id ... when the post-strip value is at least 33 chars and does not begin with a capital C/U/R.
  • extensions/line/src/send.test.ts: two new cases — reject lowercased 33-char recipient; accept case-exact recipient.

Real behavior proof

  • Behavior or issue addressed: silent LINE push failure for long-running group tasks ([Bug]: [core] LINE delivery-recovery sends wrong recipient (lowercased / session-key prefix kept), causing silent push failures #81628). Queued entry in ~/.openclaw/delivery-queue/failed/*.json shows to lowercased; LINE returns HTTP 400 The property, 'to', in the request body is invalid; dashboard reports delivered; LINE group receives nothing.

  • Real environment tested: Windows 11 (Node 22.16.0). The @openclaw/line plugin was loaded from source via node_modules/.bin/tsx (no separate build step) and exercised directly through its public runtime-api barrel. The script lives at .artifacts/verify-81628/verify.ts (gitignored). A placeholder string was used for the channel access token; the lowercased recipient is rejected at the outbound guard before any HTTP call to LINE, and the case-exact recipient is verified to pass the guard and only fail later at the network layer with HTTP 401 (a real bot token would proceed past that point).

  • Exact steps or command run after this patch:

node_modules/.bin/tsx .artifacts/verify-81628/verify.ts

The script source (excerpt):

import { pushMessagesLine } from "../../extensions/line/runtime-api.js";
import { isPermanentDeliveryError } from "../../src/infra/outbound/delivery-queue-recovery.js";

await pushMessagesLine(
  "cabcdef0123456789abcdef0123456789",
  [{ type: "text", text: "this never reaches LINE API" }],
  { cfg, accountId: "default" },
);
  • Evidence after fix: terminal capture / console output of the live runtime invocation (stderr+stdout, copied verbatim):
[verify-81628] target = cabcdef0123456789abcdef0123456789 (the shape produced when a session-key fragment is used as the recipient)
[verify-81628] invoking @openclaw/line pushMessagesLine ...
[verify-81628] LINE outbound boundary rejected the lowercased recipient.
[verify-81628] thrown message: Recipient is not a valid LINE id (case-sensitive; expected leading capital C/U/R): cabc…
[verify-81628] matches delivery-recovery PERMANENT_ERROR_PATTERNS: true
[verify-81628] => delivery-recovery would move this entry to ~/.openclaw/delivery-queue/failed/ on the first attempt instead of retrying 5 times silently.

[verify-81628] case-exact target = Cabcdef0123456789abcdef0123456789
[verify-81628] same invocation with the case-exact recipient should pass the outbound guard and only fail later at the real HTTP layer (fake token).
[verify-81628] case-exact recipient passed the outbound guard; failure originated past it: 401 - Unauthorized

[verify-81628] verification complete.
  • Observed result after fix: the actual @openclaw/line plugin code rejects the lowercased recipient at the outbound boundary with the new permanent-error message, before any HTTP request to LINE is issued. The classifier isPermanentDeliveryError from src/infra/outbound/delivery-queue-recovery.ts returns true for that message, so a queued delivery carrying this recipient would be moved to ~/.openclaw/delivery-queue/failed/ on the first attempt rather than retrying five times silently. The case-exact recipient is not affected — it passes the new guard and only fails downstream at the live HTTP layer (HTTP 401 here because the placeholder token is rejected by LINE; a real bot token would proceed to the real send).

  • What was not tested: end-to-end traffic against the live LINE platform during a >60s task in a real group. The verification above exercises the exact case-loss boundary inside the real runtime plugin code; the cron-tool half of the fix prevents the upstream producer from emitting a lowercased recipient in the first place, and the plugin-side guard above demonstrates the safety net for any other code path that ever surfaces such a value.

Supplemental verification

In addition to the live-runtime capture in the Real behavior proof section, the following supplemental coverage was added under src/agents/tools/cron-tool.test.ts and extensions/line/src/send.test.ts:

  • cron-tool side: two new cases — group-shape session-key recipient (agent:main:line:group:cabcdef...) and per-account-channel-peer DM-shape session-key recipient (agent:main:line:primary:direct:uabcdef...). Both call buildAgentPeerSessionKey directly so the canonical-lowercasing step that triggers the bug runs in the same process as the assertion. Without the fix both fail with AssertionError: expected 'cabcdef0123456789abcdef0123456789' to be undefined.
  • LINE plugin side: two new cases — reject lowercased 33-char recipient via pushMessagesLine; accept case-exact recipient (the case-exact case proves the length-based gate does not false-positive on real LINE ids).

Static checks on the four touched files (tsgo -p tsconfig.core.json, tsgo -p tsconfig.extensions.json, oxlint) all report zero diagnostics.

User-visible behavior change

Before this PR, a cron job scheduled by a LINE-bound agent without currentDeliveryContext (delivery-recovery path) silently failed for long-task replies (HTTP 400, 5 retries, failed/). After this PR, the same path now either:

  1. Succeeds, because cron fire-time resolution falls back to session.lastTo — which is case-preserved from ctxPayload.OriginatingTo in the LINE inbound builder (resolveLastToRaw in src/auto-reply/reply/session-delivery.ts returns originatingToRaw || toRaw || persistedLastTo); or
  2. Fails loudly at the first attempt with Recipient is not a valid LINE id ..., so the entry moves to failed/ immediately and the operator sees the failure in the log instead of being told "delivered".

No change for non-LINE channels.

Security impact

  • New permissions / capabilities: No
  • Secrets / tokens handling changed: No
  • New / changed network calls: No (one pre-flight rejection avoids an unnecessary call to LINE)
  • Command / tool execution surface changed: No
  • Data access scope changed: No

Compatibility / migration

  • Backward compatible: Yes
  • Config / env changes: No
  • Migration needed: No
  • Cron jobs persisted with bad to before this fix: forward-only fix. Such jobs either (a) succeed at fire-time via the session.lastTo fallback path, or (b) fail loudly via the new plugin guard. No data migration required.

Regression coverage

  • Coverage level: Unit
  • Target files: src/agents/tools/cron-tool.test.ts (cron-tool side), extensions/line/src/send.test.ts (plugin side)
  • Scenarios locked in:
    • cron-tool must not rebuild a LINE recipient from a lowercased session-key fragment, for both group and per-account-channel-peer DM session-key shapes
    • LINE plugin must reject a 33-char value without a leading capital C/U/R at the outbound boundary, with an error message matching delivery-queue-recovery's PERMANENT_ERROR_PATTERNS
  • Why this is the smallest reliable guardrail: both checks are at the exact case-loss boundary identified by source review. The cron case uses the real buildAgentPeerSessionKey helper so the canonical lowercasing happens in the same process; the LINE plugin case exercises the real pushMessagesLine entry point that every outbound LINE push goes through.

Followups (not in this PR)

  • dmScope: "per-peer" produces a session key without a channel marker (agent:main:direct:uxxx...). The cron-tool guard does not fire for that shape because the channel cannot be inferred from the session-key alone; the LINE plugin safety net still catches it at the outbound boundary as a permanent error, but the upstream-source fix would require a different signal than the channel-name check.
  • The LINE plugin guard intentionally uses a length-based gate (length >= 33) so short fixtures ("U123", "line:user:1") still flow through. Strict shape validation (^[CUR][0-9a-f]{32}$) would require updating 20+ existing fixtures and is left as a separate cleanup.

@openclaw-barnacle openclaw-barnacle Bot added channel: line Channel integration: line agents Agent runtime and tooling size: S triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 14, 2026
@clawsweeper

clawsweeper Bot commented May 14, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs maintainer review before merge.

Summary
The branch adds a LINE-specific cron session-key fallback guard, a LINE send-time recipient validation guard, focused regression tests, and a changelog entry.

Reproducibility: yes. for source-level reproduction: current main lowercases LINE peer IDs in session keys, and cron-tool can copy that lowercased fragment into delivery.to when currentDeliveryContext is unavailable. The linked issue also provides sanitized failed-queue payloads, retry logs, and direct LINE API evidence.

Real behavior proof
Sufficient (terminal): The PR body includes after-fix terminal output from the real @openclaw/line runtime-api path showing lowercased recipients rejected before HTTP and case-exact recipients reaching the HTTP layer.

Next step before merge
No repair job is needed; the open PR already contains the focused fix, tests, supplied real behavior proof, and no blocking review findings.

Security
Cleared: The diff adds local validation, tests, and a changelog entry without new dependencies, permissions, secret handling, lifecycle hooks, or code-execution paths.

Review details

Best possible solution:

Land this focused PR after the usual maintainer merge checks, then close the linked bug and supersede the alternative open fix at #81670 if this branch is chosen.

Do we have a high-confidence way to reproduce the issue?

Yes for source-level reproduction: current main lowercases LINE peer IDs in session keys, and cron-tool can copy that lowercased fragment into delivery.to when currentDeliveryContext is unavailable. The linked issue also provides sanitized failed-queue payloads, retry logs, and direct LINE API evidence.

Is this the best way to solve the issue?

Yes; refusing unrecoverable LINE session-key fallback and adding a LINE-owned permanent-error guard is the narrowest maintainable fix. Stricter full LINE id validation can remain separate because it would broaden fixture and compatibility churn.

What I checked:

  • Current main cron fallback can copy session-key peer fragments into delivery.to: On current main, inferDeliveryFromSessionKey parses the peerId from the session-key rest and returns a CronDelivery with to: peerId when no delivery context provided. (src/agents/tools/cron-tool.ts:571, 65dd71d42dee)
  • Current main session keys lowercase peer IDs: buildAgentPeerSessionKey normalizes direct and grouped peer IDs through normalizeLowercaseStringOrEmpty before embedding them in the agent session key, which loses LINE C/U/R casing. (src/routing/session-key.ts:192, 65dd71d42dee)
  • Current main LINE send normalization strips prefixes but does not reject lowercased LINE-shaped IDs: normalizeTarget strips line:group, line:room, line:user, and line prefixes, then returns the normalized string without validating the case-sensitive LINE id prefix. (extensions/line/src/send.ts:61, 65dd71d42dee)
  • Permanent delivery classifier already matches the PR error message: delivery-queue-recovery treats messages matching /recipient is not a valid/i as permanent, so the PR’s new LINE guard message is compatible with first-attempt failure classification. (src/infra/outbound/delivery-queue-recovery.ts:66, 65dd71d42dee)
  • PR diff targets the implicated cron and LINE paths: The PR diff adds a LINE/channelless-LINE null return before cron delivery inference, adds a LINE normalizeTarget guard for 33+ character recipients without capital C/U/R, and adds focused cron-tool and LINE send tests. (src/agents/tools/cron-tool.ts:575, dbd289948f30)
  • Related canonical issue remains open and is linked by this PR: The linked report describes the same LINE lowercased recipient failure with sanitized failed-queue payloads, retry logs, and direct LINE API 400/200 comparison; the PR is the active implementation candidate for that issue.

Likely related people:

  • steipete: Current-main blame for the cron fallback, LINE send normalization, session-key helper, and delivery-recovery permanent-error list points to Peter Steinberger, and commit 05eda57 touched the LINE send/outbound lifecycle files. (role: recent area contributor; confidence: high; commits: 2ea0c6c9293b, 05eda57b3c72; files: src/agents/tools/cron-tool.ts, extensions/line/src/send.ts, src/routing/session-key.ts)
  • Kaspre: Commit 7eefb26 updated session-key remapping surfaces adjacent to the lowercased session-key side of the reported bug. (role: recent session-key contributor; confidence: medium; commits: 7eefb26bc8d8; files: src/routing/session-key.ts, src/sessions/session-key-utils.ts)
  • Masato Hoshino: Commit 9449e54 added LINE outbound media support and touched the LINE send/outbound boundary that this PR guards. (role: LINE outbound feature contributor; confidence: medium; commits: 9449e54f4ffc; files: extensions/line/src/send.ts, extensions/line/src/outbound.ts, extensions/line/src/channel.sendPayload.test.ts)

Remaining risk / open question:

  • The read-only review did not run the new tests or a full live LINE >60s task; it relies on source inspection, the linked issue evidence, and the PR body’s terminal proof.
  • The PR body still describes per-peer LINE scope as a follow-up even though the latest diff adds a per-peer guard and test; that is a documentation freshness issue rather than a code blocker.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 65dd71d42dee.

@openclaw-barnacle openclaw-barnacle Bot added triage: mock-only-proof Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI. proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. triage: mock-only-proof Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI. labels May 14, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 14, 2026
edenfunf and others added 3 commits May 15, 2026 16:50
LINE chat ids are case-sensitive (push requires capital C/U/R) but the
session key holds the peer id lowercased for canonical routing. When
cron-tool runs without currentDeliveryContext (delivery-recovery, queue
replay after reply-token expiry), inferDeliveryFromSessionKey was
lifting the lowercased fragment straight into delivery.to, producing a
value LINE rejects with HTTP 400 — the job retried five times silently
and the dashboard reported "delivered" while the LINE group received
nothing.

Refuse the session-key fallback for channel === "line" so the missing
target surfaces explicitly instead of scheduling an undeliverable job.
…claw#81628)

Defense-in-depth safety net for openclaw#81628: even with the cron-tool fix in
place, any other code path that ever produces a 33-char LINE-shaped
recipient missing its leading capital (C/U/R) would otherwise hit the
LINE API and return HTTP 400 with no permanent-error signal, causing
delivery-recovery to retry five times before moving the entry to
failed/.

normalizeTarget now throws "Recipient is not a valid LINE id ..." when
the post-strip value looks like a LINE id but the case was lost. The
message matches the existing /recipient is not a valid/i pattern in
delivery-queue-recovery's PERMANENT_ERROR_PATTERNS, so recovery moves
the entry to failed/ on the first attempt instead of silently retrying.

Short fixtures (length < 33) are left alone so existing tests using
"U123", "line:user:1", etc. keep working.
@steipete steipete force-pushed the fix/line-recipient-recovery-81628 branch from b1e79cf to dbd2899 Compare May 15, 2026 16:01
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 15, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 15, 2026
@steipete steipete merged commit 8338412 into openclaw:main May 15, 2026
91 of 96 checks passed
@steipete

Copy link
Copy Markdown
Contributor

Landed, thanks @edenfunf.

Proof:

  • Codex review: /Users/steipete/Projects/agent-scripts/skills/codex-review/scripts/codex-review --mode branch, clean after the per-peer LINE session fix.
  • Focused test: node scripts/run-vitest.mjs src/agents/tools/cron-tool.test.ts extensions/line/src/send.test.ts, 3 files / 180 tests passed.
  • Local formatting sanity: node_modules/.bin/oxfmt --write --no-error-on-unmatched-pattern on the touched files, plus git diff --check.
  • GitHub: Real behavior proof passed on dbd2899.

Merged as 8338412.

Follow-up refactor target: stop reconstructing delivery recipients from lossy canonical session keys and carry/persist explicit delivery context through recovery paths instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling channel: line Channel integration: line proof: sufficient ClawSweeper judged the real behavior proof convincing. proof: supplied External PR includes structured after-fix real behavior proof. size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: [core] LINE delivery-recovery sends wrong recipient (lowercased / session-key prefix kept), causing silent push failures

2 participants