Skip to content

fix(matrix): preserve case in group/channel peer IDs for session keys#57337

Closed
dlardo wants to merge 1 commit into
openclaw:mainfrom
dlardo:fix/matrix-room-id-case-preservation
Closed

fix(matrix): preserve case in group/channel peer IDs for session keys#57337
dlardo wants to merge 1 commit into
openclaw:mainfrom
dlardo:fix/matrix-room-id-case-preservation

Conversation

@dlardo

@dlardo dlardo commented Mar 29, 2026

Copy link
Copy Markdown
Contributor

Summary

Matrix room IDs (e.g. !IEjZDNPucuFvKLrAQC:server) are case-sensitive opaque identifiers per the Matrix spec. The buildAgentPeerSessionKey() function was calling .toLowerCase() on group/channel peer IDs, which destroyed the original case.

This caused a critical failure path:

  1. Session key is built with lowercased room ID
  2. Delivery queue entry stores the lowercased room ID as "to" target
  3. On gateway restart, delivery-recovery retries the send using the lowercased room ID
  4. Homeserver returns 403 (M_FORBIDDEN: User not in room)
  5. The 403 crashes the Matrix sync loop permanently
  6. The poisoned delivery persists across restarts, causing a crash loop

Changes

  • src/routing/session-key.ts: Remove .toLowerCase() from the group/channel peer ID path in buildAgentPeerSessionKey(). Direct/DM peer IDs still lowercase (Matrix user IDs are case-insensitive).
  • src/routing/session-key.continuity.test.ts: Add tests verifying mixed-case room IDs are preserved in session keys.

Testing

  • All 59 existing + new session-key and delivery-recovery tests pass
  • No existing tests relied on lowercased group/channel peer IDs

Breaking change note

Existing Matrix group/channel session keys were lowercased. After this change, new sessions will use the original-case room ID, creating new session keys. Existing lowercased sessions will become orphaned. This is acceptable since group chat sessions are ephemeral and regularly compacted/reset.

Fixes #57321
Related: #19278, PR #31023

@greptile-apps

greptile-apps Bot commented Mar 29, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR fixes a critical crash-loop bug in the Matrix gateway by removing the .toLowerCase() call applied to group/channel peer IDs in buildAgentPeerSessionKey. Because Matrix room IDs are case-sensitive opaque identifiers, the previous lowercasing caused delivery-recovery retries to use an invalid room ID, receiving a 403 from the homeserver and permanently crashing the sync loop.

  • src/routing/session-key.ts: Correctly removes .toLowerCase() from the non-direct peer ID path; the direct/DM path retains lowercasing (Matrix user IDs are case-insensitive).
  • src/routing/session-key.continuity.test.ts: Two new regression tests confirm channel and group session keys preserve original case.
  • Inconsistency introduced: The sibling buildGroupHistoryKey function (line 230) still calls .toLowerCase() on peerId. Now that session keys preserve case but history keys do not, any code that correlates the two for the same Matrix room will produce mismatched keys — likely causing history-lookup failures for group channels.

Confidence Score: 4/5

Safe to merge after addressing the residual .toLowerCase() call in buildGroupHistoryKey, which creates a case-mismatch between session keys and history keys for Matrix rooms.

The core fix is correct and well-tested. However, the sibling buildGroupHistoryKey function still lowercases peerId, which will produce a key inconsistency with the now-case-preserving session keys for the same Matrix rooms. This is a present defect on the changed path that should be resolved before merging.

src/routing/session-key.ts — line 230 in buildGroupHistoryKey still calls .toLowerCase() on peerId.

Important Files Changed

Filename Overview
src/routing/session-key.ts Removes .toLowerCase() from the non-direct peer ID path in buildAgentPeerSessionKey, correctly preserving Matrix room ID case for session keys. However, the sibling buildGroupHistoryKey function still lowercases its peerId, creating a case inconsistency between session keys and history keys for the same Matrix room.
src/routing/session-key.continuity.test.ts Adds two tests verifying that mixed-case Matrix room IDs are preserved verbatim in both channel and group session keys. Tests are correct and cover the exact regression scenario described in the PR.

Comments Outside Diff (1)

  1. src/routing/session-key.ts, line 230 (link)

    P1 buildGroupHistoryKey still lowercases peerId

    buildGroupHistoryKey at line 230 still calls .toLowerCase() on peerId, applying the same case-mangling to Matrix room IDs that this PR fixes in buildAgentPeerSessionKey.

    This creates an immediate inconsistency: after this fix, the session key for a Matrix group room like !IEjZDNPucuFvKLrAQC:server will use the original-case ID (correct), but any history key built for the same room via buildGroupHistoryKey will still be lowercased. Any code that joins or correlates session keys with history keys for Matrix rooms will therefore fail to match across the two.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: src/routing/session-key.ts
    Line: 230
    
    Comment:
    **`buildGroupHistoryKey` still lowercases `peerId`**
    
    `buildGroupHistoryKey` at line 230 still calls `.toLowerCase()` on `peerId`, applying the same case-mangling to Matrix room IDs that this PR fixes in `buildAgentPeerSessionKey`.
    
    This creates an immediate inconsistency: after this fix, the session key for a Matrix group room like `!IEjZDNPucuFvKLrAQC:server` will use the original-case ID (correct), but any history key built for the same room via `buildGroupHistoryKey` will still be lowercased. Any code that joins or correlates session keys with history keys for Matrix rooms will therefore fail to match across the two.
    
    
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/routing/session-key.ts
Line: 230

Comment:
**`buildGroupHistoryKey` still lowercases `peerId`**

`buildGroupHistoryKey` at line 230 still calls `.toLowerCase()` on `peerId`, applying the same case-mangling to Matrix room IDs that this PR fixes in `buildAgentPeerSessionKey`.

This creates an immediate inconsistency: after this fix, the session key for a Matrix group room like `!IEjZDNPucuFvKLrAQC:server` will use the original-case ID (correct), but any history key built for the same room via `buildGroupHistoryKey` will still be lowercased. Any code that joins or correlates session keys with history keys for Matrix rooms will therefore fail to match across the two.

```suggestion
  const peerId = params.peerId.trim() || "unknown";
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix(matrix): preserve case in group/chan..." | Re-trigger Greptile

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fa46acf21d

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/routing/session-key.ts Outdated
@@ -169,7 +169,7 @@ export function buildAgentPeerSessionKey(params: {
});
}
const channel = (params.channel ?? "").trim().toLowerCase() || "unknown";
const peerId = ((params.peerId ?? "").trim() || "unknown").toLowerCase();
const peerId = (params.peerId ?? "").trim() || "unknown";

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Scope case-preserving IDs to Matrix peers only

Removing .toLowerCase() here changes canonicalization for all non-direct peers, not just Matrix room IDs. That means channels with case-insensitive identifiers (for example Slack C123/G123) can now generate different session keys for the same conversation based on input casing, while other flows still canonicalize via lowercasing (see toAgentStoreSessionKey in src/routing/session-key.ts). The result is split/orphaned session state and inconsistent lookups/recovery when a key is rebuilt with different case. Please gate case-preservation to Matrix room/group peers and keep prior lowercase normalization for other providers.

Useful? React with 👍 / 👎.

@dlardo dlardo force-pushed the fix/matrix-room-id-case-preservation branch 3 times, most recently from 7253a17 to 45b8d96 Compare March 29, 2026 23:44
Matrix room IDs are case-sensitive opaque identifiers per the spec.
The previous toLowerCase() on group/channel peer IDs caused delivery-
recovery to use a lowercased room ID that Synapse rejects with 403,
permanently crashing the Matrix sync loop on gateway restart.

Fixes openclaw#57321
Related: openclaw#19278
@dlardo dlardo force-pushed the fix/matrix-room-id-case-preservation branch from 45b8d96 to 9eea474 Compare March 29, 2026 23:49

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9eea474cb8

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

// Other providers (Slack, Discord, etc.) use case-insensitive IDs and
// must continue to be lowercased for consistent session key lookups.
const rawPeerId = (params.peerId ?? "").trim() || "unknown";
const peerId = channel === "matrix" ? rawPeerId : rawPeerId.toLowerCase();

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Keep Matrix session-key casing consistent across store reads/writes

Preserving case in buildAgentPeerSessionKey for Matrix group/channel peers introduces a key-format split with existing session-store paths that still lowercase keys (src/channels/session.ts normalizes sessionKey to lowercase before recordSessionMetaFromInbound, and src/config/sessions/store.ts resolves/deletes case variants into the lowercase key), while session init still does an exact lookup via sessionStore[sessionKey] (src/auto-reply/reply/session.ts). For Matrix rooms whose IDs contain uppercase characters, this causes the mixed-case key produced here to diverge from the persisted lowercase key, so continuity lookups miss and the conversation can repeatedly start fresh/new-session instead of reusing prior context.

Useful? React with 👍 / 👎.

@vincentkoc

Copy link
Copy Markdown
Member

Thank you for the fix attempt here.

I am not merging this as-is.

Why:

  • preserving Matrix room-id case in buildAgentPeerSessionKey and buildGroupHistoryKey is directionally right
  • but the session-store stack still canonicalizes keys to lowercase on read/write (src/channels/session.ts, src/config/sessions/store.ts, and related callers)
  • that means this PR would preserve mixed-case IDs in one builder while persistence and continuity paths still collapse them, which can split session identity instead of fixing it

So this needs a broader Matrix-safe session-key strategy, not just a local builder change.

Thank you for pushing on a real problem and for adding the continuity tests. The linked issue remains valid: #57321

@vincentkoc vincentkoc self-assigned this Mar 30, 2026
@dlardo

dlardo commented Mar 30, 2026

Copy link
Copy Markdown
Contributor Author

Closing in favor of a more targeted approach. As @vincentkoc correctly noted, fixing case normalization in the session key builders without addressing the store/persistence layer would create split session identity.

New PR incoming with a narrower fix: treating the Matrix "User not in room" 403 as a permanent delivery error so delivery-recovery moves the poisoned entry to failed/ immediately instead of crash-looping the sync. This prevents the catastrophic failure while the broader case-normalization strategy is worked out separately.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Matrix] Lowercased room ID in delivery-recovery causes sync loop crash and permanent message loss

2 participants