feat(matrix): isolate matrix thread history and reply placement #71738

Closed
ga-it wants to merge 15 commits into openclaw:main from ga-it:feat/matrix-thread-context-isolation

Conversation

ga-it commented Apr 25, 2026

PR: feat(matrix): isolate thread history and fix threaded reply placement

Target: openclaw/openclaw main
Branch name suggestion: feat/matrix-thread-context-isolation


Title

feat(matrix): isolate thread history and fix threaded reply placement

Summary

  • Problem: Matrix threads were known to routing and session logic, but room-history tracking and message reads were room-flat. All messages in a room — regardless of which thread they belonged to — competed for the same context window. Separately, threaded bot replies included unconditional is_falling_back + m.in_reply_to fallback metadata in their event relations, causing Matrix clients (e.g. Element) to surface them in the main room timeline even when the intended target was a thread.

  • Why it matters: Without per-thread context isolation, a busy Matrix room or DM pollutes every thread's LLM context with unrelated messages. With the send-path leak, thread replies appear as duplicated messages in the room timeline, confusing users and growing room scroll.

  • What changed:

    • room-history.ts: room queues now carry per-thread sub-queues keyed by thread root event ID; watermarks are scoped by (agentId, roomId, threadRootId).
    • handler.ts: _threadRootId is threaded through all three room-history call sites (recordPending, prepareTrigger, consumeHistory).
    • messages.ts: Matrix read actions accept threadId; thread reads use the untyped v1 /relations/{threadId} endpoint with recurse: true, hydrate returned events, and filter by native thread root client-side; main-room reads filter out direct and indirect thread events and over-fetch to compensate.
    • tool-actions.ts: threadId is parsed and forwarded in the sendMessage and readMessages action handlers.
    • send/formatting.ts: buildThreadRelation no longer unconditionally includes is_falling_back and m.in_reply_to; fallback metadata is only added when an explicit replyToId is supplied.
    • send.ts (editMessageMatrix): when editing a threaded message, thread context now goes in m.new_content["m.relates_to"] (per spec) rather than at the outer REPLACE relation level (where m.in_reply_to caused the edit to surface in the main room).
  • What did NOT change: Session routing, DM allowlists, reaction handling, non-threaded group rooms, Telegram, or any other channel. The channels.matrix.dm.threadReplies config key already existed in the schema; no config schema changes are needed.

  • Verification in this port:

    • focused Matrix tests passed: 4 files, 74 tests
    • extension TypeScript compile passed via tsconfig.extensions.json

Why This Is The Right OpenClaw Direction

This PR does not introduce a new threading model for OpenClaw. It brings the Matrix
extension into line with the isolation model that the Telegram extension already uses
for forum topics and DM topics.

Current Telegram behavior already treats thread identity as conversation identity:

  • outbound sends preserve message_thread_id rather than stripping it, even in DMs,
    because silently dropping thread scope would misroute replies
  • group session keys are built from chatId + messageThreadId, so topic traffic does
    not collapse into one flat group session
  • session/thread binding logic persists topic-qualified conversation IDs such as
    -100200300:topic:77

In other words, Telegram in OpenClaw already behaves as "topic/thread = isolated
conversation scope". This Matrix change applies the same principle:

  • thread root ID becomes part of history and watermark scope
  • thread reads use the thread-specific endpoint
  • main-room reads stop absorbing thread traffic by default
  • thread sends stop advertising themselves as main-timeline fallbacks unless that
    fallback is explicitly requested

That consistency matters because it reduces channel-specific surprises:

  • Telegram topics already preserve context isolation
  • Matrix threads should not be the one major threaded channel where all room traffic
    still bleeds into every thread

Matrix vs Telegram Contrast

Telegram today

  • Thread/topic identity is carried explicitly in outbound send params as
    message_thread_id.
  • Session identity for group traffic includes thread/topic identity.
  • Conversation binding persists topic-qualified IDs like chat:topic:thread.
  • The implementation is deliberately cautious about not stripping thread scope, because
    that would misroute replies.

Matrix before this PR

  • Thread identity was detected, but not consistently propagated into room-history
    storage and reads.
  • History remained effectively flat per room.
  • Main-room reads included thread traffic.
  • Thread sends always advertised a fallback main-timeline reply shape.

Matrix after this PR

  • Thread identity becomes part of inbound history scope, watermark scope, and read
    scope.
  • Main-room reads exclude thread traffic by default.
  • Thread sends stay thread-only unless a caller explicitly asks for fallback reply
    metadata.

The net effect is that Matrix becomes behaviorally much closer to Telegram’s proven
topic-isolation model.


Change Type

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations ← Matrix extension
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue / PR

  • Closes #(thread leak / flat history issue)
  • Related: existing channels.matrix.threadReplies config, threads.ts, thread-context.ts

Root Cause

Flat room history (inbound context leak):
room-history.ts maintained a single queue per room. handler.ts resolved _threadRootId correctly but never forwarded it into recordPending, prepareTrigger, or consumeHistory, so all room traffic — regardless of thread — was merged into one context window.

Reply placement leak (outbound):
buildThreadRelation in send/formatting.ts unconditionally returned:

{ "rel_type": "m.thread", "event_id": "$root",
  "is_falling_back": true,
  "m.in_reply_to": { "event_id": "$root" } }

The is_falling_back / m.in_reply_to combination is the spec signal for clients that do not support threads to surface the message in the main timeline. Supporting clients (Element Web) also respect it, resulting in visible duplicate messages.

Edit leak (editMessageMatrix):
When draft-stream previews were finalized via editMessageMatrix, the REPLACE event carried m.in_reply_to: { event_id: threadRoot } at the outer relation level. Because threadRoot is often the user's original DM (a main-room event), Synapse and some clients surfaced edited thread messages as replies to main-room events.

  • Root cause: _threadRootId was resolved but not propagated; fallback metadata was unconditional; REPLACE events carried reply context at the wrong nesting level.
  • Missing detection: no tests asserted that thread sends omit is_falling_back; no tests asserted thread-scoped history isolation.
  • Contributing context: Matrix's optional is_falling_back semantics are easy to misread as "required for thread sends."

Prior Art In OpenClaw

Telegram already contains the core architectural pattern this PR is moving Matrix
toward:

  • extensions/telegram/src/send.ts
    • preserves message_thread_id and explicitly warns against stripping DM topic
      thread IDs because doing so misroutes replies
  • extensions/telegram/src/bot-core.ts
    • builds group session keys from buildTelegramGroupPeerId(chatId, messageThreadId)
  • extensions/telegram/src/thread-bindings.test.ts
    • persists topic-qualified conversation IDs such as -100200300:topic:77

This is useful upstream context because it shows OpenClaw already endorses
thread/topic-qualified conversation identity on another major channel.


Files Changed

extensions/matrix/src/matrix/monitor/room-history.ts       core change
extensions/matrix/src/matrix/monitor/handler.ts            core change
extensions/matrix/src/matrix/actions/messages.ts           core change
extensions/matrix/src/tool-actions.ts                      core change
extensions/matrix/src/matrix/send/formatting.ts            bug fix
extensions/matrix/src/matrix/send.ts                       bug fix

extensions/matrix/src/matrix/monitor/room-history.test.ts  new tests
extensions/matrix/src/matrix/send.test.ts                   updated + new tests
extensions/matrix/src/matrix/actions/messages.test.ts       new tests (thread read path)

Verification

Commands used during the upstream port:

OPENCLAW_VITEST_INCLUDE_FILE=<tmp-json> vitest run --config test/vitest/vitest.extension-matrix.config.ts
node scripts/run-tsgo.mjs -p tsconfig.extensions.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/matrix-thread-pr.tsbuildinfo

Focused test result:

Test Files  3 passed (3)
Tests       58 passed (58)

Coverage added by this PR includes:

  • thread-only room-history queues stay isolated from main-room queues
  • thread watermarks advance independently
  • main-room reads filter out thread traffic
  • thread reads use the all-event-type relations endpoint, hydrate encrypted events, summarize polls, and include the root event once
  • thread sends omit fallback metadata unless explicitly requested
  • threaded edits keep thread context on m.new_content

Key Diffs

send/formatting.ts: buildThreadRelation

Before:

export function buildThreadRelation(threadId: string, replyToId?: string): MatrixThreadRelation {
  const trimmed = threadId.trim();
  return {
    rel_type: RelationType.Thread,
    event_id: trimmed,
    is_falling_back: true,
    "m.in_reply_to": { event_id: replyToId?.trim() || trimmed },
  };
}

After:

export function buildThreadRelation(threadId: string, replyToId?: string): MatrixThreadRelation {
  const trimmed = threadId.trim();
  const relation: MatrixThreadRelation = {
    rel_type: RelationType.Thread,
    event_id: trimmed,
  };
  const fallbackReplyToId = replyToId?.trim();
  if (fallbackReplyToId) {
    relation.is_falling_back = true;
    relation["m.in_reply_to"] = { event_id: fallbackReplyToId };
  }
  return relation;
}

send.ts: editMessageMatrix thread context placement

Before:

const threadId = normalizeThreadId(opts.threadId);
if (threadId) {
  // Thread-aware replace: Synapse needs the thread context...
  replaceRelation["m.in_reply_to"] = { event_id: threadId };
}

After:

const threadId = normalizeThreadId(opts.threadId);
if (threadId) {
  // Per Matrix spec, m.new_content must carry the same thread relation
  // as the original event. Placing it here (not in the outer replace relation)
  // keeps Synapse thread indexing correct without surfacing the edit event
  // as a reply to the thread root in the main room timeline.
  (newContent as Record<string, unknown>)["m.relates_to"] = buildThreadRelation(threadId);
}
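
The placement described above can be sketched end to end. This is a minimal illustration, not the extension's code: the type names and buildThreadedEditContent are hypothetical, and only the event shape (outer m.replace relation, thread relation inside m.new_content) is taken from the diff.

```typescript
// Hypothetical types for illustration; the extension's real types may differ.
type ThreadRelation = {
  rel_type: "m.thread";
  event_id: string;
  is_falling_back?: boolean;
  "m.in_reply_to"?: { event_id: string };
};

type EditContent = {
  msgtype: "m.text";
  body: string;
  "m.new_content": {
    msgtype: "m.text";
    body: string;
    "m.relates_to"?: ThreadRelation;
  };
  "m.relates_to": { rel_type: "m.replace"; event_id: string };
};

// Builds the content of an edit (replace) event for a possibly threaded message.
function buildThreadedEditContent(
  originalEventId: string,
  newBody: string,
  threadId?: string,
): EditContent {
  const content: EditContent = {
    msgtype: "m.text",
    body: `* ${newBody}`, // conventional fallback body for clients without edit support
    "m.new_content": { msgtype: "m.text", body: newBody },
    // The outer relation only identifies the event being replaced.
    "m.relates_to": { rel_type: "m.replace", event_id: originalEventId },
  };
  const trimmed = threadId?.trim();
  if (trimmed) {
    // Thread context lives on m.new_content, without fallback reply metadata.
    content["m.new_content"]["m.relates_to"] = { rel_type: "m.thread", event_id: trimmed };
  }
  return content;
}
```

With a threadId, the outer relation stays a plain m.replace and carries no m.in_reply_to; without one, m.new_content has no relation at all.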

room-history.ts — thread sub-queues (abbreviated)

type RoomQueue = {
  entries: HistoryEntry[];
  baseIndex: number;
  generation: number;
  preparedTriggers: Map<string, PreparedTriggerResult>;
  threadQueues: Map<string, ThreadSubQueue>;   // ← new
};

// recordPending routes by threadRootId:
recordPending(roomId: string, entry: HistoryEntry, threadRootId?: string): void

// Watermark scope includes thread:
wmKey() → JSON.stringify({ agentId, roomId, scope: threadRootId ?? "main" })
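
To make the watermark scoping concrete, here is a small sketch that assumes the JSON key shape from the wmKey() line above. The names watermarkKey, parseWatermarkKey as written here, WatermarkScope and MAIN_SCOPE are illustrative and may not match the real module.

```typescript
// Hypothetical key shape, modelled on the wmKey() line above.
type WatermarkScope = { agentId: string; roomId: string; scope: string };

const MAIN_SCOPE = "main";

// One key per (agent, room, thread); the main timeline uses the "main" scope.
function watermarkKey(agentId: string, roomId: string, threadRootId?: string): string {
  return JSON.stringify({ agentId, roomId, scope: threadRootId ?? MAIN_SCOPE });
}

// Returns null for anything that is not a well-formed JSON key.
function parseWatermarkKey(key: string): WatermarkScope | null {
  try {
    const parsed: unknown = JSON.parse(key);
    if (typeof parsed !== "object" || parsed === null) return null;
    const { agentId, roomId, scope } = parsed as Record<string, unknown>;
    if (typeof agentId === "string" && typeof roomId === "string" && typeof scope === "string") {
      return { agentId, roomId, scope };
    }
    return null;
  } catch {
    return null; // not JSON at all
  }
}

// Main-room and thread watermarks for the same agent and room never collide.
const watermarks = new Map<string, number>();
watermarks.set(watermarkKey("agent", "!room:example.org"), 3);
watermarks.set(watermarkKey("agent", "!room:example.org", "$t1"), 1);
```

Because Matrix event IDs begin with `$`, a thread root can never be literally named "main", so the sentinel scope cannot shadow a real thread.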

handler.ts: _threadRootId threading

Three call sites updated:

roomHistoryTracker.recordPending(roomId, pendingEntry, _threadRootId);
roomHistoryTracker.prepareTrigger(_route.agentId, roomId, historyLimit, {...}, _threadRootId);
roomHistoryTracker.consumeHistory(_route.agentId, roomId, triggerSnapshot, _messageId, _threadRootId);

messages.ts — thread-aware reads

// Thread read: uses the untyped relations endpoint with recurse=true
GET /_matrix/client/v1/rooms/{roomId}/relations/{threadId}?recurse=true

// Main-room read: filters out thread events
function isThreadEvent(event: MatrixRawEvent): boolean {
  return event.content?.["m.relates_to"]?.rel_type === "m.thread";
}
// + over-fetch multiplier to compensate for filtered events
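
The isThreadEvent check above only catches direct thread replies. The sketch below shows one way the main-room filter could also drop indirect thread events (for example an edit of a thread reply) by walking the parent chain. RawEvent, resolveThreadRoot, filterMainRoom and the hop bound are assumptions for illustration, not the PR's implementation.

```typescript
// Hypothetical minimal event shape; real Matrix events carry many more fields.
type RawEvent = {
  event_id: string;
  content?: { "m.relates_to"?: { rel_type?: string; event_id?: string } };
};

const MAX_PARENT_HOPS = 8; // assumed bound on the walk, not taken from the PR

// Returns the thread root an event belongs to, or undefined for main-room events.
function resolveThreadRoot(event: RawEvent, byId: Map<string, RawEvent>): string | undefined {
  let current: RawEvent | undefined = event;
  for (let hop = 0; current !== undefined && hop < MAX_PARENT_HOPS; hop++) {
    const rel = current.content?.["m.relates_to"];
    const parentId = rel?.event_id;
    if (!rel || !parentId) return undefined; // no relation: main-room event
    if (rel.rel_type === "m.thread") return parentId; // direct thread member
    current = byId.get(parentId); // follow edits and similar relations to their parent
  }
  return undefined; // parent not in this page, or chain too long
}

// Keeps only events that are neither direct nor indirect members of a thread.
function filterMainRoom(events: RawEvent[]): RawEvent[] {
  const byId = new Map(events.map((e): [string, RawEvent] => [e.event_id, e]));
  return events.filter((e) => resolveThreadRoot(e, byId) === undefined);
}
```

A limitation of this sketch is that an indirect event whose parent is missing from the fetched page is treated as main-room traffic.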

Test Plan

Tests that must change

send.test.ts — existing test asserts pre-fix behavior:

// Before (incorrect — asserts the leak):
expect(content["m.relates_to"]).toMatchObject({
  rel_type: "m.thread",
  event_id: "$thread",
  "m.in_reply_to": { event_id: "$thread" },   // ← this must go
});

// After (correct):
expect(content["m.relates_to"]).toMatchObject({
  rel_type: "m.thread",
  event_id: "$thread",
});
expect(content["m.relates_to"]).not.toHaveProperty("is_falling_back");
expect(content["m.relates_to"]).not.toHaveProperty("m.in_reply_to");

New tests to add

send.test.ts:

  • Thread send without replyToId → no is_falling_back, no m.in_reply_to
  • Thread send with explicit replyToId → includes is_falling_back + m.in_reply_to
  • editMessageMatrix with threadId → m.new_content contains thread relation; outer REPLACE relation has no m.in_reply_to

room-history.test.ts:

  • recordPending with threadRootId routes to thread sub-queue
  • prepareTrigger scoped to thread returns only that thread's entries
  • consumeHistory advances watermark per thread independently
  • Main-room and thread histories do not cross-contaminate
  • Thread sub-queue FIFO eviction at MAX_THREAD_QUEUES_PER_ROOM

messages.test.ts:

  • readMatrixMessages with threadId → uses relations endpoint, includes root event
  • readMatrixMessages without threadId → main-room endpoint, filters out thread events
  • Over-fetch compensates for filtered events on subsequent pages

Diagram

Before (flat history):
  DM room messages:  [A][B][C][T1a][T1b][T2a]
  Thread T1 context: [A][B][C][T1a][T1b][T2a]  ← entire room mixed in
  Thread T2 context: [A][B][C][T1a][T1b][T2a]  ← same flat soup

After (thread-scoped):
  DM room messages:  [A][B][C]
  Thread T1 context: [T1a][T1b]                 ← only T1's turns
  Thread T2 context: [T2a]                       ← only T2's turns

Before (send leak):
  Thread reply:  { rel_type: "m.thread", is_falling_back: true, m.in_reply_to: $root }
  Visible in:    thread timeline + main room timeline

After (thread-only send):
  Thread reply:  { rel_type: "m.thread" }        ← no fallback metadata
  Visible in:    thread timeline only

User-visible / Behavior Changes

  • Thread replies no longer appear as duplicate messages in the main room timeline.
  • Each Matrix thread now has its own isolated LLM context — replies in thread T1 do not see messages from thread T2 or from the main room.
  • editMessageMatrix for threaded messages correctly places thread context in m.new_content per the Matrix spec; Synapse keeps edited messages in the thread timeline.
  • Callers of buildThreadRelation that relied on unconditional is_falling_back must now pass an explicit replyToId to preserve that behavior. No production caller currently does this.
  • Config: channels.matrix.dm.threadReplies already existed; setting it to "always" now reliably keeps all DM replies thread-scoped. No new config keys.
  • Behavioral alignment: Matrix now behaves more like Telegram topics in OpenClaw, where
    thread/topic traffic is treated as its own conversation scope instead of flattening
    into the parent room.

Security Impact

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? Yes — thread reads use a different Matrix endpoint (/relations/{threadId} with recurse: true). Same auth, same homeserver, narrower scope.
  • Command/tool execution surface changed? No
  • Data access scope changed? Yes — agents receive narrower context (thread-only history instead of full room). This is a reduction in scope, not an expansion.
  • Risk/mitigation: None beyond standard Matrix API call behavior.

Repro + Verification

Environment

  • Runtime: Docker container, OpenClaw gateway
  • Channel: Matrix DM with channels.matrix.dm.threadReplies = "always"
  • Model: any (context isolation is pre-model)

Steps

  1. Configure channels.matrix.dm.threadReplies = "always".
  2. Send DM message A → bot replies in thread T1 (rooted at A).
  3. Send DM message B (new message, not in T1) → bot replies in thread T2 (rooted at B).
  4. Reply to thread T1 with message C.
  5. Inspect the context.compiled entry in the bot's trajectory JSONL for the T1 reply to C.

Expected

  • Trajectory shows messages_to_llm: 2 (A + T1 bot reply) — not 4 or more.
  • Bot reply is in thread T1, not visible in main room timeline.
  • Thread T2 exists independently; its context.compiled shows messages_to_llm: 0 then messages_to_llm: 2 (only B + T2 reply), with no T1 content.

Actual (pre-fix)

  • All DM messages + all thread replies merged into a single context.
  • Thread replies appeared in main room timeline via is_falling_back.
  • Draft-stream edits re-introduced m.in_reply_to to the main room even after the initial send was corrected.

Evidence

  • Trajectory JSONL inspection confirmed messages_to_llm: 2 for a thread reply with 1 prior exchange (session file named *-topic-$<threadRootId>.trajectory.jsonl).
  • Main room session showed separate trajectory file with its own independent history.
  • grep -c 'replaceRelation\["m.in_reply_to"\]' dist/send-*.js → 0 after fix (confirmed in deployed container).
  • Compiled send bundle contains newContent["m.relates_to"] = buildThreadRelation(threadId) (per-spec placement).

Human Verification

  • Verified: thread replies stay in thread, not visible in main DM room timeline (live Matrix client, post-fix restart).
  • Verified: context.compiled.messages in trajectory JSONL contains only thread-scoped messages.
  • Verified: send bundle reflects both fixes (bundle grep, source file inspection).
  • Verified: existing focused Matrix test suite passes (room-history.test.ts, threads.test.ts, send.test.ts, messages.test.ts).
  • Not yet verified: behavior across non-DM group rooms with active threads (only DM threading tested live); behavior with encrypted rooms (logic is pre-encryption layer).


Real behavior proof

  • Behavior or issue addressed: Matrix native thread reads now exercise the current untyped recursive v1 relations path, keep thread reads isolated by native thread root, preserve indirect child relations under threaded events, and avoid default fallback reply metadata in thread reply payloads. Current PR head is 60b0c9244f01d1e99c74e47c285e84a81a138970 after rebasing onto current origin/main; the Matrix implementation diff exercised by the live proof is unchanged by the rebase.
  • Real environment tested: Karabo OpenClaw Matrix runtime credentials against the live Matrix/Synapse-compatible homeserver. The proof used a temporary private Matrix room; raw tokens, room IDs, and event IDs are redacted and represented only as hashes.
  • Exact steps or command run after this patch: On 2026-05-14T08:05:36Z, using the current Matrix implementation, then rebased cleanly to head 60b0c9244f01d1e99c74e47c285e84a81a138970; created a temporary private Matrix room, sent one main-room message, two native thread roots, one direct m.thread reply under each root, and one indirect m.replace edit relation under Thread A's reply. Queried GET /_matrix/client/v1/rooms/{roomId}/relations/{threadRootId}?dir=f&limit=20&recurse=true with no /m.thread or /m.room.message path segment, then applied the same bounded parent-chain thread filter used by readMatrixMessages({ threadId }).
  • Evidence after fix: Redacted copied live output from the current-head real Matrix setup. Summary: captured at 2026-05-14T08:05:36Z; runtime account hash e51963bdfce8; homeserver hash 74c7bd2b1a2b; room hash b4f71058d518; request path GET /_matrix/client/v1/rooms/{roomId}/relations/{threadRootId}?dir=f&limit=20&recurse=true; stale typed /m.thread and /m.thread/m.room.message path segments were not used; Thread A relation hashes ['2eac1aa671a0', '498a5a89fcf4']; Thread B relation hashes ['3b22215ca715']; Thread A filtered hashes ['667870547242', '2eac1aa671a0', '498a5a89fcf4']; Thread B filtered hashes ['938b4b1e0d0f', '3b22215ca715']; cleanup left temporary proof room. Full redacted output:
{
  "captured_at_utc": "2026-05-14T08:05:36Z",
  "checks": {
    "client_filter_excludes_unrelated_main": true,
    "client_filter_includes_thread_a_root_reply_edit": true,
    "edit_a_is_indirect_child_relation": true,
    "recursive_query_used": true,
    "reply_a_fallback_absent": true,
    "reply_b_fallback_absent": true,
    "thread_a_direct_reply_returned": true,
    "thread_a_excludes_thread_b_reply": true,
    "thread_a_indirect_edit_returned": true,
    "thread_b_direct_reply_returned": true,
    "thread_b_excludes_thread_a_reply": true,
    "typed_m_thread_segment_absent": true,
    "untyped_endpoint_path_used": true
  },
  "cleanup": "left temporary proof room",
  "current_head_client_side_filter_result": {
    "thread_a_filtered_event_hashes": [
      "667870547242",
      "2eac1aa671a0",
      "498a5a89fcf4"
    ],
    "thread_b_filtered_event_hashes": [
      "938b4b1e0d0f",
      "3b22215ca715"
    ]
  },
  "current_request_path_template": "GET /_matrix/client/v1/rooms/{roomId}/relations/{threadRootId}?dir=f&limit=20&recurse=true",
  "event_hashes": {
    "main": "f6ffd45208ab",
    "thread_a_indirect_edit": "498a5a89fcf4",
    "thread_a_reply": "2eac1aa671a0",
    "thread_a_root": "667870547242",
    "thread_b_reply": "3b22215ca715",
    "thread_b_root": "938b4b1e0d0f"
  },
  "homeserver_hash": "74c7bd2b1a2b",
  "overall_real_behavior_proof_passed": true,
  "pr_head": "60b0c9244f01d1e99c74e47c285e84a81a138970",
  "reply_relation_shapes": {
    "thread_a_has_is_falling_back": false,
    "thread_a_has_m_in_reply_to": false,
    "thread_a_indirect_edit_relation_keys": [
      "event_id",
      "rel_type"
    ],
    "thread_a_relation_keys": [
      "event_id",
      "rel_type"
    ],
    "thread_b_has_is_falling_back": false,
    "thread_b_has_m_in_reply_to": false,
    "thread_b_relation_keys": [
      "event_id",
      "rel_type"
    ]
  },
  "room_hash": "b4f71058d518",
  "runtime_account_hash": "e51963bdfce8",
  "untyped_recursive_relations": {
    "thread_a_next_batch_present": false,
    "thread_a_prev_batch_present": false,
    "thread_a_relation_event_hashes": [
      "2eac1aa671a0",
      "498a5a89fcf4"
    ],
    "thread_b_next_batch_present": false,
    "thread_b_prev_batch_present": false,
    "thread_b_relation_event_hashes": [
      "3b22215ca715"
    ]
  }
}
  • Observed result after fix: The untyped recursive relations response for Thread A returned Thread A's direct reply and the indirect edit relation under that reply; the Thread B response returned only Thread B's reply. The current-head client-side filter kept Thread A root/reply/edit together, kept Thread B root/reply together, excluded the unrelated main-room message, and did not cross-contaminate Thread A and Thread B. Direct thread reply relation payloads contained only event_id and rel_type; is_falling_back and m.in_reply_to were absent. The temporary proof room was left after the run. Overall proof passed: True.
  • What was not tested: Live encrypted Matrix rooms, live non-DM group rooms, and live multi-page relation pagination were not tested in this proof. Those paths remain covered by focused tests and source inspection; this proof specifically covers the current untyped recursive relations path and indirect child-relation filtering requested by ClawSweeper.
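
For reference, the request path used in the proof can be assembled as follows. This is a sketch under assumptions: buildThreadRelationsPath and RelationsQuery are hypothetical names, and only the path template and the dir, limit and recurse parameters come from the proof output above (from is the standard pagination token of the relations endpoint).

```typescript
// Hypothetical query options for the untyped relations endpoint.
type RelationsQuery = { dir?: "f" | "b"; limit?: number; from?: string };

// Builds the untyped, recursive relations path for a thread root.
// No /m.thread or /m.room.message segment is appended after the event ID.
function buildThreadRelationsPath(
  roomId: string,
  threadRootId: string,
  query: RelationsQuery = {},
): string {
  const params: string[] = [];
  if (query.dir) params.push(`dir=${query.dir}`);
  if (query.limit !== undefined) params.push(`limit=${query.limit}`);
  if (query.from) params.push(`from=${encodeURIComponent(query.from)}`);
  params.push("recurse=true"); // include relations of relations (e.g. edits of replies)
  const room = encodeURIComponent(roomId);
  const root = encodeURIComponent(threadRootId);
  return `/_matrix/client/v1/rooms/${room}/relations/${root}?${params.join("&")}`;
}
```

Room and event IDs are percent-encoded because they contain characters such as `:` and `$` that are not safe in a path segment.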

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No: dm.threadReplies already in schema; defaults unchanged
  • Migration needed? No

The only breaking change is for callers of buildThreadRelation that depended on the unconditional is_falling_back. No such callers exist outside the extension.


Risks and Mitigations

  • Risk: Synapse loses thread timeline membership for edited messages if it relied on the outer m.in_reply_to in REPLACE events.

    • Mitigation: Per Matrix spec, REPLACE events determine thread membership from m.new_content["m.relates_to"], not the outer REPLACE relation. Test confirmed edited messages remain in thread timeline post-fix.
  • Risk: Thread sub-queue memory growth in rooms with many long-lived threads.

    • Mitigation: MAX_THREAD_QUEUES_PER_ROOM = 50 cap with FIFO eviction; per-queue entry cap inherited from room queue.
  • Risk: Over-fetch multiplier for main-room reads (3×) could increase Matrix API call volume.

    • Mitigation: Multiplier only applies when thread events are present; first page respects requested limit exactly; subsequent pages only over-fetch if first page was under-limit after filtering.
  • Risk: messages.test.ts thread read path tests require mocking the /relations endpoint.

    • Mitigation: Existing test harness already mocks Matrix client responses; thread endpoint mock follows same pattern.
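
The FIFO cap on thread sub-queues mentioned in the mitigations can be illustrated with a Map, which iterates in insertion order. This is a sketch only: obtainThreadQueue, the ThreadSubQueue shape and the onEvict callback are hypothetical, and just the cap value of 50 comes from the text above.

```typescript
const MAX_THREAD_QUEUES_PER_ROOM = 50; // cap quoted in the mitigation above

// Hypothetical sub-queue shape for illustration.
type ThreadSubQueue = { entries: string[] };

// Returns the sub-queue for a thread, evicting the oldest-created one at the cap.
function obtainThreadQueue(
  threadQueues: Map<string, ThreadSubQueue>,
  threadRootId: string,
  onEvict: (evictedRootId: string) => void,
  cap: number = MAX_THREAD_QUEUES_PER_ROOM,
): ThreadSubQueue {
  const existing = threadQueues.get(threadRootId);
  if (existing) return existing;
  while (threadQueues.size >= cap) {
    // Map iterates in insertion order, so the first key is the oldest queue.
    const oldest = threadQueues.keys().next().value;
    if (oldest === undefined) break;
    threadQueues.delete(oldest);
    onEvict(oldest); // e.g. clear that thread's watermarks
  }
  const created: ThreadSubQueue = { entries: [] };
  threadQueues.set(threadRootId, created);
  return created;
}
```

Eviction here is by creation order, not by last use, which matches "FIFO" as stated; an LRU policy would need to re-insert a queue on every access.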

openclaw-barnacle Bot added the "channel: matrix" and "size: L" labels Apr 25, 2026
greptile-apps Bot commented Apr 25, 2026

Contributor

Greptile Summary

This PR introduces per-thread context isolation for the Matrix extension — routing room history, watermarks, and API reads to thread-scoped sub-structures — and fixes the outbound send leak where is_falling_back + m.in_reply_to caused thread replies to surface in the main room timeline. The changes are well-structured and include solid test coverage for the new paths.

  • P1 — stale watermark re-insertion after thread sub-queue eviction (room-history.ts): If consumeHistory completes after a thread's sub-queue has been FIFO-evicted (the 50-queue cap), it calls rememberWatermark with a stale index and re-inserts the cleared watermark. When the sub-queue is subsequently re-created, computePendingHistory uses this stale watermark and silently skips the new messages.
  • P2 — clearRoomWatermarks skips legacy colon-format keys: parseWatermarkKey only parses JSON-encoded keys; legacy "agentId:roomId" entries survive room eviction and count against the 5 000-entry global watermark cap.
  • P2 — root event counts against the limit in thread reads: Prepending the root event before hydratedChunk.slice(0, limit) means callers receive limit - 1 reply events on the first page.

Confidence Score: 3/5

Merging is safe for most scenarios but the thread sub-queue re-eviction race can silently drop messages for high-traffic threads.

One P1 finding (stale watermark re-insertion after FIFO eviction of a thread sub-queue) can cause silent message loss in high-thread-volume rooms, pulling the score below 4. The P2s are low-risk cleanup items.

extensions/matrix/src/matrix/monitor/room-history.ts — specifically the consumeHistory method and its interaction with thread sub-queue FIFO eviction.

Comments Outside Diff (1)

  1. extensions/matrix/src/matrix/monitor/room-history.ts, lines 401-416

    P1 Late consumeHistory call can re-introduce an evicted thread watermark

    When a thread sub-queue is FIFO-evicted (the 50-queue cap in getOrCreateThreadQueue), clearThreadWatermarks correctly removes the watermark. However, if consumeHistory is already in-flight for that thread (i.e., the agent was processing a trigger when the sub-queue was evicted), it will call rememberWatermark afterwards and re-insert the evicted watermark. If the thread sub-queue is later re-created (new messages arrive), computePendingHistory will pick up this stale watermark and skip those new messages, silently hiding them from the agent.

    The fix would mirror the pattern used for room eviction: in consumeHistory, skip rememberWatermark when the thread sub-queue no longer exists in queue.threadQueues and threadRootId was provided.
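
    The suggested guard might look roughly like the following. This is a hedged sketch, not the actual consumeHistory: the types, the module-level watermarks map and consumeThreadHistory are hypothetical, and only the rule "skip rememberWatermark when the thread sub-queue no longer exists" comes from the comment above.

```typescript
// Hypothetical shapes for illustration.
type ThreadSubQueue = { entries: string[] };
type RoomQueue = { threadQueues: Map<string, ThreadSubQueue> };

const watermarks = new Map<string, number>();

function rememberWatermark(key: string, index: number): void {
  watermarks.set(key, index);
}

// Records consumption progress unless the thread sub-queue was evicted meanwhile.
// Returns true when the watermark was written.
function consumeThreadHistory(
  queue: RoomQueue,
  key: string,
  threadRootId: string | undefined,
  consumedIndex: number,
): boolean {
  if (threadRootId !== undefined && !queue.threadQueues.has(threadRootId)) {
    // The sub-queue was evicted while the trigger was in flight:
    // writing the watermark now would resurrect a stale index.
    return false;
  }
  rememberWatermark(key, consumedIndex);
  return true;
}
```

    An existence check alone does not cover a sub-queue that was evicted and re-created before the late call arrives; catching that case would need something like a per-queue generation counter compared against the trigger snapshot.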

Prompt To Fix All With AI
This is a comment left during a code review.
Path: extensions/matrix/src/matrix/monitor/room-history.ts
Line: 133-139

Comment:
**`clearRoomWatermarks` silently skips legacy colon-format keys**

`clearRoomWatermarks` delegates room-ID matching to `parseWatermarkKey`, which only parses JSON-encoded keys and returns `null` for the legacy `"agentId:roomId"` format. Any legacy watermark entries still alive in the map (written by old code before this PR) are therefore invisible to `clearRoomWatermarks`, so they will not be evicted when their room is FIFO-evicted from `roomQueues`. They accumulate until the global `MAX_WATERMARK_ENTRIES` cap kicks in.

While this won't cause incorrect history (the legacy key is only ever *read* in `computePendingHistory`, never re-written, and a freshly-evicted room has a new `baseIndex`), it does mean the legacy watermark for an evicted room persists and counts against the 5 000-entry cap. In a high-throughput deployment where rooms rotate frequently, this could exhaust watermark capacity faster than expected.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: extensions/matrix/src/matrix/monitor/room-history.ts
Line: 401-416

Comment:
**Late `consumeHistory` call can re-introduce an evicted thread watermark**

When a thread sub-queue is FIFO-evicted (the 50-queue cap in `getOrCreateThreadQueue`), `clearThreadWatermarks` correctly removes the watermark. However, if `consumeHistory` is already in-flight for that thread (i.e., the agent was processing a trigger when the sub-queue was evicted), it will call `rememberWatermark` afterwards and re-insert the evicted watermark. If the thread sub-queue is later re-created (new messages arrive), `computePendingHistory` will pick up this stale watermark and skip those new messages, silently hiding them from the agent.

The fix would mirror the pattern used for room eviction: in `consumeHistory`, skip `rememberWatermark` when the thread sub-queue no longer exists in `queue.threadQueues` and `threadRootId` was provided.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: extensions/matrix/src/matrix/actions/messages.ts
Line: 130-141

Comment:
**Root event consumes one slot from the caller-requested `limit`**

When `threadId` is supplied and there is no pagination cursor (`!opts.before && !opts.after`), the root event is prepended to `rawThreadEvents` before the `hydratedChunk.slice(0, limit)` cap is applied. As a result, callers receive `limit - 1` thread-reply events rather than `limit`. For example, requesting `limit: 5` on a thread with 10 replies returns `[root, reply1, reply2, reply3, reply4]` — the root occupies the first slot.

If the intent is for `limit` to bound the number of *reply* events (not counting the root), the `/relations` fetch should use `limit` and the root prepend should happen after slicing, or the request should fetch `limit - 1` events when the root will be prepended.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "feat(matrix): isolate thread history and..."

Comment on lines 133 to +139
  function clearRoomWatermarks(roomId: string): void {
-   const roomSuffix = `:${roomId}`;
    for (const key of agentWatermarks.keys()) {
-     if (key.endsWith(roomSuffix)) {
+     if (parseWatermarkKey(key)?.roomId === roomId) {
        agentWatermarks.delete(key);
      }
    }
  }

P2 clearRoomWatermarks silently skips legacy colon-format keys

clearRoomWatermarks delegates room-ID matching to parseWatermarkKey, which only parses JSON-encoded keys and returns null for the legacy "agentId:roomId" format. Any legacy watermark entries still alive in the map (written by old code before this PR) are therefore invisible to clearRoomWatermarks, so they will not be evicted when their room is FIFO-evicted from roomQueues. They accumulate until the global MAX_WATERMARK_ENTRIES cap kicks in.

While this won't cause incorrect history (the legacy key is only ever read in computePendingHistory, never re-written, and a freshly-evicted room has a new baseIndex), it does mean the legacy watermark for an evicted room persists and counts against the 5 000-entry cap. In a high-throughput deployment where rooms rotate frequently, this could exhaust watermark capacity faster than expected.
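
One possible shape for a fix, as a sketch only: try the JSON parser first and fall back to the old suffix match when the key is not JSON. The JSON tuple layout and the `parseWatermarkKey` body below are assumptions made for illustration; only the function names and the legacy `"agentId:roomId"` format come from the comment above.

```typescript
// Assumed key format: JSON-encoded [agentId, roomId, threadRootId | null].
type ParsedWatermarkKey = { agentId: string; roomId: string; threadRootId: string | null };

const agentWatermarks = new Map<string, number>();

function parseWatermarkKey(key: string): ParsedWatermarkKey | null {
  try {
    const parsed: unknown = JSON.parse(key);
    if (
      Array.isArray(parsed) &&
      parsed.length === 3 &&
      typeof parsed[0] === "string" &&
      typeof parsed[1] === "string"
    ) {
      const threadRootId = typeof parsed[2] === "string" ? parsed[2] : null;
      return { agentId: parsed[0], roomId: parsed[1], threadRootId };
    }
  } catch {
    // Not JSON: the caller treats the key as a legacy "agentId:roomId" string.
  }
  return null;
}

function clearRoomWatermarks(roomId: string): void {
  const legacySuffix = `:${roomId}`;
  // Copy the keys first so deletion never interferes with iteration.
  for (const key of Array.from(agentWatermarks.keys())) {
    const parsed = parseWatermarkKey(key);
    const matches = parsed ? parsed.roomId === roomId : key.endsWith(legacySuffix);
    if (matches) {
      agentWatermarks.delete(key);
    }
  }
}
```

Because Matrix room IDs themselves contain a colon, the legacy branch matches on the `:${roomId}` suffix rather than splitting the key.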


Comment on lines +130 to +141
  const rawThreadEvents: MatrixRawEvent[] = [];
  if (!opts.before && !opts.after) {
    appendUniqueEvent(
      rawThreadEvents,
      (await client.getEvent(resolvedRoom, opts.threadId)) as MatrixRawEvent | null,
    );
  }
  for (const event of res.chunk) {
    appendUniqueEvent(rawThreadEvents, event);
  }
  hydratedChunk = await client.hydrateEvents(resolvedRoom, rawThreadEvents);
} else {

P2 Root event consumes one slot from the caller-requested limit

When threadId is supplied and there is no pagination cursor (!opts.before && !opts.after), the root event is prepended to rawThreadEvents before the hydratedChunk.slice(0, limit) cap is applied. As a result, callers receive limit - 1 thread-reply events rather than limit. For example, requesting limit: 5 on a thread with 10 replies returns [root, reply1, reply2, reply3, reply4] — the root occupies the first slot.

If the intent is for limit to bound the number of reply events (not counting the root), the /relations fetch should use limit and the root prepend should happen after slicing, or the request should fetch limit - 1 events when the root will be prepended.
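
A sketch of the "prepend after slicing" option, assuming `limit` is meant to bound replies only. `assembleThreadPage` and `ThreadPageEvent` are hypothetical names for illustration; the PR's own code builds `rawThreadEvents` differently.

```typescript
type ThreadPageEvent = { event_id: string };

// Combine an optional thread root with relation results so that `limit`
// bounds the number of replies, not replies plus root.
function assembleThreadPage(
  root: ThreadPageEvent | null,
  replies: ThreadPageEvent[],
  limit: number,
): ThreadPageEvent[] {
  const seen = new Set<string>();
  const cappedReplies: ThreadPageEvent[] = [];
  for (const event of replies) {
    if (cappedReplies.length >= limit) break;
    // The root is added separately, so skip it if the server also returned it.
    if (root && event.event_id === root.event_id) continue;
    if (seen.has(event.event_id)) continue;
    seen.add(event.event_id);
    cappedReplies.push(event);
  }
  return root ? [root, ...cappedReplies] : cappedReplies;
}
```

With this shape, a first-page read with `limit: 5` returns the root plus five replies, and a paginated read (no root) returns five replies. Whether the root should count toward `limit` remains a design decision for the author.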


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 76e66ca7f3

  hydratedChunk = mainEvents;
}

const processedChunk = hydratedChunk.slice(0, limit);

P1 Badge Keep main-timeline pagination aligned with returned events

When thread filtering is active, this path can over-fetch extra pages, but it then truncates to limit and still returns nextBatch from the last fetched page. In thread-heavy rooms, callers that paginate with nextBatch will skip non-thread events that were fetched but dropped by this slice (for example, when page 2 contributes more events than needed to hit limit), so history reads become lossy.
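
To make the failure mode concrete, here is a sketch of one way to keep the cursor aligned with what was actually returned: when a page is only partly consumed, resume from that page's start token and record how many kept events were already handed out. The composite `{ token, skip }` cursor, the `inThread` flag, and the pre-fetched `pages` array are all assumptions for illustration; the Matrix API itself only hands back opaque page tokens.

```typescript
type MainTimelineEvent = { event_id: string; inThread: boolean };
type MainPage = { start: string; end: string; chunk: MainTimelineEvent[] };
type ResumeCursor = { token: string; skip: number };

function collectMainTimeline(
  pages: MainPage[],
  limit: number,
  skip = 0,
): { events: MainTimelineEvent[]; next: ResumeCursor | null } {
  const events: MainTimelineEvent[] = [];
  let toSkip = skip;
  for (const page of pages) {
    // Drop thread events, then drop kept events a previous call already returned.
    const kept = page.chunk.filter((e) => !e.inThread).slice(toSkip);
    const skippedHere = toSkip;
    toSkip = 0; // the skip only applies to the first page after a resume
    const room = limit - events.length;
    if (kept.length > room) {
      events.push(...kept.slice(0, room));
      // Partially consumed page: resume at its start and skip what was returned.
      return { events, next: { token: page.start, skip: skippedHere + room } };
    }
    events.push(...kept);
    if (events.length === limit) {
      return { events, next: { token: page.end, skip: 0 } };
    }
  }
  const last = pages[pages.length - 1];
  return { events, next: last ? { token: last.end, skip: 0 } : null };
}
```

The trade-off is that a resumed read re-fetches the partially consumed page, which costs one extra request but avoids silently dropping the events that the `slice` cut off.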

Comment on lines 408 to 409
if (queue.generation !== snapshot.queueGeneration) {
  // The room was evicted and recreated before this trigger completed. Reject the stale
  // snapshot so it cannot advance or erase state for the new queue generation.
  return;

P2 Badge Invalidate stale thread snapshots after thread-queue eviction

Thread sub-queues can be evicted independently, but consumeHistory only validates the room generation before writing a watermark. If an in-flight trigger for an evicted thread finishes after that thread queue is recreated (same threadRootId), the stale snapshotIdx is accepted and can advance the watermark past newly queued messages, hiding fresh thread history until enough events accumulate.
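
A sketch of per-thread generation validation, mirroring the room-level `queue.generation` check quoted above. The 50-queue cap and `getOrCreateThreadQueue` are named in the review; every type, field, and the eviction bookkeeping here is a hypothetical simplification rather than the PR's code.

```typescript
type GenThreadQueue = { generation: number; entries: string[] };
type ThreadSnapshot = { threadRootId: string; threadGeneration: number; snapshotIdx: number };

const MAX_THREAD_QUEUES = 50; // cap quoted in the review; eviction details are assumed
let nextThreadGeneration = 1;
const threadQueuesByRoot = new Map<string, GenThreadQueue>();
const threadWatermarks = new Map<string, number>();

function getOrCreateThreadQueue(threadRootId: string): GenThreadQueue {
  let queue = threadQueuesByRoot.get(threadRootId);
  if (!queue) {
    if (threadQueuesByRoot.size >= MAX_THREAD_QUEUES) {
      // Map preserves insertion order, so the first key is the oldest sub-queue.
      const oldest = threadQueuesByRoot.keys().next().value;
      if (oldest !== undefined) {
        threadQueuesByRoot.delete(oldest);
        threadWatermarks.delete(oldest);
      }
    }
    // Every newly created sub-queue gets a generation no earlier queue has used.
    queue = { generation: nextThreadGeneration++, entries: [] };
    threadQueuesByRoot.set(threadRootId, queue);
  }
  return queue;
}

function prepareThreadTrigger(threadRootId: string): ThreadSnapshot {
  const queue = getOrCreateThreadQueue(threadRootId);
  return { threadRootId, threadGeneration: queue.generation, snapshotIdx: queue.entries.length };
}

// Returns true when the watermark was advanced, false when the snapshot was stale.
function consumeThreadHistory(snapshot: ThreadSnapshot): boolean {
  const queue = threadQueuesByRoot.get(snapshot.threadRootId);
  // Reject snapshots taken against a sub-queue that was evicted, even if a
  // new sub-queue for the same thread root has been created since.
  if (!queue || queue.generation !== snapshot.threadGeneration) return false;
  threadWatermarks.set(snapshot.threadRootId, snapshot.snapshotIdx);
  return true;
}
```

The generation comparison covers both the "evicted and still missing" case and the "evicted and re-created" case, which an existence check alone would miss.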

@clawsweeper

clawsweeper Bot commented Apr 29, 2026

Contributor

Codex review: needs changes before merge. Reviewed June 5, 2026, 3:44 AM ET / 07:44 UTC.

Summary
The PR makes Matrix room history and read actions thread-aware, changes threaded send/edit relation metadata, and updates Matrix tests, docs, and CHANGELOG.md.

PR surface: Source +445, Tests +751, Docs +1. Total +1197 across 13 files.

Reproducibility: yes. Source inspection of current main shows Matrix room history is room-only and buildThreadRelation always emits fallback reply metadata, while the PR discussion supplies live Matrix/Synapse proof for the changed behavior.

Review metrics: 1 noteworthy metric.

  • Release-owned changelog surface: 1 entry added. Normal PRs should not edit CHANGELOG.md because release generation owns that file; release-note context belongs in the PR body or commit message.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • [P2] Remove the CHANGELOG.md entry before merge.
  • Have Matrix maintainers choose this branch's fallback/session behavior or the related maintainer-owned Matrix PR as the landing path.

Mantis proof suggestion
A short real-client Matrix visual would materially help maintainers confirm thread replies stay in the intended thread and do not duplicate in the main room timeline. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

visual task: verify in a real Matrix client that threaded replies stay in the thread and do not duplicate in the main room timeline.

Risk before merge

  • [P2] Existing Matrix behavior always adds thread fallback metadata; this PR removes that metadata unless an explicit reply target is supplied, so older or fallback-oriented Matrix clients may lose main-timeline visibility for bot replies.
  • [P1] Matrix room history and read actions move from room-flat to thread-scoped behavior, which changes the context window existing Matrix rooms see after upgrade.
  • [P1] The related maintainer-owned Matrix PR at feat(matrix): handle voice preflight and threads #90415 may become the landing path, but it is open and not mergeable now, so it cannot safely supersede this PR yet.
  • [P1] The PR still edits release-owned CHANGELOG.md; that line should be removed before merge and release-note context kept in the PR body or squash message.
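
For readers weighing the fallback-metadata risk above, the two relation shapes can be sketched as follows. The field names (`rel_type`, `event_id`, `is_falling_back`, `m.in_reply_to`) are the Matrix relation fields named in this PR; the function signatures are hypothetical, and setting `is_falling_back: false` for an explicit reply target is an assumption based on the Matrix convention for genuine replies inside threads, not a confirmed detail of this branch.

```typescript
type ThreadRelation = {
  rel_type: "m.thread";
  event_id: string;
  is_falling_back?: boolean;
  "m.in_reply_to"?: { event_id: string };
};

// Shape described for current main: fallback metadata on every threaded send.
// Which event the fallback points at is an assumption of this sketch.
function buildThreadRelationWithFallback(threadId: string, fallbackEventId: string): ThreadRelation {
  return {
    rel_type: "m.thread",
    event_id: threadId,
    is_falling_back: true,
    "m.in_reply_to": { event_id: fallbackEventId },
  };
}

// Thread-only shape: reply metadata appears only for an explicit reply target.
function buildThreadRelation(threadId: string, replyToId?: string): ThreadRelation {
  const relation: ThreadRelation = { rel_type: "m.thread", event_id: threadId };
  if (replyToId) {
    relation.is_falling_back = false;
    relation["m.in_reply_to"] = { event_id: replyToId };
  }
  return relation;
}
```

The thread-only result carries just `event_id` and `rel_type`, so clients that rely on the fallback fields to place an event in the main timeline have nothing to act on.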

Maintainer options:

  1. Accept Matrix thread isolation intentionally
    Maintainers can land this direction after explicitly accepting the upgrade behavior for Matrix thread-scoped history and thread-only reply placement, with the changelog line removed first.
  2. Preserve fallback compatibility first
    If older Matrix clients must keep main-timeline fallback visibility, adjust the send relation behavior and tests to preserve that compatibility path before merge.
  3. Wait for the maintainer-owned Matrix branch
    If feat(matrix): handle voice preflight and threads #90415 becomes clean and merged, this PR can then close as superseded; until then it should stay open as a viable implementation candidate.

Next step before merge

  • Automation can remove the release-owned CHANGELOG.md line; Matrix compatibility and landing-path acceptance remain maintainer decisions.

Security
Cleared: The diff is limited to Matrix plugin runtime/tests/docs and a changelog line, with no dependency, workflow, secret, package, or code-download surface changes.

Review findings

  • [P3] Remove the release-owned changelog entry — CHANGELOG.md:1945
Review details

Best possible solution:

Keep this PR open for Matrix maintainer review, remove the release-owned changelog line, and explicitly choose whether this branch's thread-only fallback/session isolation behavior or the related maintainer-owned Matrix branch should be the landing path.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection of current main shows Matrix room history is room-only and buildThreadRelation always emits fallback reply metadata, while the PR discussion supplies live Matrix/Synapse proof for the changed behavior.

Is this the best way to solve the issue?

Unclear as a final product choice. The PR is a strong Matrix-owned implementation with tests and live proof, but the fallback visibility and thread-scoped history changes are compatibility decisions and a related maintainer-owned Matrix PR is also open.

Full review comments:

  • [P3] Remove the release-owned changelog entry — CHANGELOG.md:1945
    CHANGELOG.md is release-generated in this repository. Please drop this added line and keep the release-note context in the PR body or squash/merge message instead.
    Confidence: 0.93

Overall correctness: patch is correct
Overall confidence: 0.84

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 1a3ce7c2a8da.

Label changes

Label justifications:

  • P2: This is a normal-priority Matrix channel behavior improvement with real user-facing thread/session impact but limited blast radius to the Matrix plugin.
  • merge-risk: 🚨 compatibility: The PR changes shipped Matrix fallback relation metadata, which can alter behavior for users or clients relying on main-timeline fallback visibility.
  • merge-risk: 🚨 session-state: The PR changes Matrix room history and read context from room-flat to thread-scoped queues and watermarks, affecting existing session context windows.
  • merge-risk: 🚨 message-delivery: The PR changes where Matrix threaded replies and threaded edits are represented in client timelines, so merge can affect visible message placement.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • feature: ✨ showcase: ClawSweeper spotlight: unusually compelling feature idea for maintainer attention. Thread-scoped Matrix context prevents busy-room bleed-through and makes Matrix behavior align with other threaded channel workflows.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The PR discussion includes redacted live Matrix/Synapse output showing the recursive relations read path, thread isolation, and fallback metadata behavior after the fix; later commits are merge/lint-only relative to that proof path.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR discussion includes redacted live Matrix/Synapse output showing the recursive relations read path, thread isolation, and fallback metadata behavior after the fix; later commits are merge/lint-only relative to that proof path.
Evidence reviewed

PR surface:

Source +445, Tests +751, Docs +1. Total +1197 across 13 files.

View PR surface stats
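| Area      | Files | Added | Removed | Net   |
|-----------|-------|-------|---------|-------|
| Source    | 7     | 512   | 67      | +445  |
| Tests     | 4     | 753   | 2       | +751  |
| Docs      | 2     | 2     | 1       | +1    |
| Config    | 0     | 0     | 0       | 0     |
| Generated | 0     | 0     | 0       | 0     |
| Other     | 0     | 0     | 0       | 0     |
| Total     | 13    | 1267  | 70      | +1197 |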

Acceptance criteria:

  • [P1] git diff --check.

What I checked:

  • Root policy read and applied: Root review policy was read fully; the changelog, compatibility, extension-boundary, proof, and whole-surface review guidance affected this verdict. (AGENTS.md:1, 1a3ce7c2a8da)
  • Scoped extension policy read: The extensions guide says bundled plugin production code stays within plugin boundaries and treats extension code as third-party-plugin-visible surface; this PR stays within the Matrix plugin boundary. (extensions/AGENTS.md:1, 1a3ce7c2a8da)
  • Scoped docs policy read: The docs guide was read because the PR touches docs/channels/matrix.md; the Matrix docs edit uses generic public docs wording and no local/private data. Public docs: docs/AGENTS.md. (docs/AGENTS.md:1, 1a3ce7c2a8da)
  • Current main still uses fallback metadata: Current main's buildThreadRelation always emits is_falling_back plus m.in_reply_to for any threadId, so the PR is not already implemented on main. (extensions/matrix/src/matrix/send/formatting.ts:145, 1a3ce7c2a8da)
  • Current main still documents room-flat history: Current main docs say Matrix room history is room-only, which confirms the thread-scoped history behavior is still unique to the PR branch. Public docs: docs/channels/matrix.md. (docs/channels/matrix.md:591, 1a3ce7c2a8da)
  • PR branch scopes history by thread: The PR branch adds per-thread sub-queues, JSON watermark keys, legacy key cleanup, thread-generation validation, and thread-aware record/prepare/consume calls. (extensions/matrix/src/matrix/monitor/room-history.ts:331, 6e3761e41a22)

Likely related people:

  • steipete: Current-main shallow blame for the Matrix files points to Peter Steinberger, and the related open Matrix PR 90415 is authored by steipete and claims to supersede this thread work. (role: recent area contributor and adjacent Matrix PR owner; confidence: high; commits: 82710b4f1f10, eeb1ce96090c; files: extensions/matrix/src/matrix/send/formatting.ts, extensions/matrix/src/matrix/actions/messages.ts, extensions/matrix/src/matrix/monitor/room-history.ts)
  • teconomix: Merged PR 57995, authored by teconomix, introduced thread-isolated sessions and per-chat-type threadReplies, which are the current behavior this PR extends. (role: introduced related Matrix thread behavior; confidence: high; commits: 697dddbeb61d; files: extensions/matrix/src/matrix/monitor/threads.ts, extensions/matrix/src/types.ts, docs/channels/matrix.md)
  • gumadeiras: GitHub API shows gumadeiras merged PR 57995, and the Matrix extension changelog also credits gumadeiras for room-thread-aware Matrix approval handling in PR 58635. (role: merger and adjacent Matrix channel contributor; confidence: medium; commits: 697dddbeb61d; files: extensions/matrix/src/matrix/monitor/threads.ts, extensions/matrix/CHANGELOG.md)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@ga-it ga-it force-pushed the feat/matrix-thread-context-isolation branch from 8c468ec to 98416dc Compare May 5, 2026 14:35
@openclaw-barnacle openclaw-barnacle Bot added docs Improvements or additions to documentation triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 5, 2026
@ga-it ga-it force-pushed the feat/matrix-thread-context-isolation branch 7 times, most recently from 0c11f1a to 1fa6dc4 Compare May 8, 2026 14:54

RedThunder27112 commented May 8, 2026


Superseded: this duplicate proof/re-review request was accidentally posted from the wrong connected GitHub account. Please ignore this copy.

The same proof has been reposted from the PR/fork account ga-it here: #71738 (comment)

@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 8, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 8, 2026
@ga-it

ga-it commented May 8, 2026

Author

@clawsweeper re-review please.

I refreshed the current-head real behavior proof in the PR body for 60b0c9244f01d1e99c74e47c285e84a81a138970 using parser-safe proof fields.

Current implementation path exercised in live Matrix/Synapse setup:

GET /_matrix/client/v1/rooms/{roomId}/relations/{threadRootId}?dir=f&limit=20&recurse=true
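
A small sketch of how a request path of this form could be assembled. `buildRelationsPath` and `RelationsQuery` are hypothetical helpers for illustration, not functions from the PR; the path template and the `dir`, `limit`, and `recurse` parameters are taken from the line above.

```typescript
type RelationsQuery = { dir: "b" | "f"; limit: number; recurse: boolean };

function buildRelationsPath(roomId: string, threadRootId: string, query: RelationsQuery): string {
  // Room and event IDs contain characters such as ":" and "$" that must be escaped.
  const base =
    `/_matrix/client/v1/rooms/${encodeURIComponent(roomId)}` +
    `/relations/${encodeURIComponent(threadRootId)}`;
  const params = [`dir=${query.dir}`, `limit=${query.limit}`, `recurse=${query.recurse}`];
  return `${base}?${params.join("&")}`;
}

// Example with placeholder identifiers:
const examplePath = buildRelationsPath("!room:example.org", "$root", {
  dir: "f",
  limit: 20,
  recurse: true,
});
```

No relation type or event type segment follows the thread root ID, which is what makes this the untyped form of the endpoint.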

Summary: untyped recursive relation reads returned Thread A direct reply plus an indirect edit child relation, kept Thread B isolated, the current-head parent-chain filter kept root/reply/edit scoped to Thread A while excluding unrelated main-room content, fallback reply metadata was absent, and the temporary proof room was left.

{
  "captured_at_utc": "2026-05-14T08:05:36Z",
  "checks": {
    "client_filter_excludes_unrelated_main": true,
    "client_filter_includes_thread_a_root_reply_edit": true,
    "edit_a_is_indirect_child_relation": true,
    "recursive_query_used": true,
    "reply_a_fallback_absent": true,
    "reply_b_fallback_absent": true,
    "thread_a_direct_reply_returned": true,
    "thread_a_excludes_thread_b_reply": true,
    "thread_a_indirect_edit_returned": true,
    "thread_b_direct_reply_returned": true,
    "thread_b_excludes_thread_a_reply": true,
    "typed_m_thread_segment_absent": true,
    "untyped_endpoint_path_used": true
  },
  "cleanup": "left temporary proof room",
  "current_head_client_side_filter_result": {
    "thread_a_filtered_event_hashes": [
      "667870547242",
      "2eac1aa671a0",
      "498a5a89fcf4"
    ],
    "thread_b_filtered_event_hashes": [
      "938b4b1e0d0f",
      "3b22215ca715"
    ]
  },
  "current_request_path_template": "GET /_matrix/client/v1/rooms/{roomId}/relations/{threadRootId}?dir=f&limit=20&recurse=true",
  "event_hashes": {
    "main": "f6ffd45208ab",
    "thread_a_indirect_edit": "498a5a89fcf4",
    "thread_a_reply": "2eac1aa671a0",
    "thread_a_root": "667870547242",
    "thread_b_reply": "3b22215ca715",
    "thread_b_root": "938b4b1e0d0f"
  },
  "homeserver_hash": "74c7bd2b1a2b",
  "overall_real_behavior_proof_passed": true,
  "pr_head": "60b0c9244f01d1e99c74e47c285e84a81a138970",
  "reply_relation_shapes": {
    "thread_a_has_is_falling_back": false,
    "thread_a_has_m_in_reply_to": false,
    "thread_a_indirect_edit_relation_keys": [
      "event_id",
      "rel_type"
    ],
    "thread_a_relation_keys": [
      "event_id",
      "rel_type"
    ],
    "thread_b_has_is_falling_back": false,
    "thread_b_has_m_in_reply_to": false,
    "thread_b_relation_keys": [
      "event_id",
      "rel_type"
    ]
  },
  "room_hash": "b4f71058d518",
  "runtime_account_hash": "e51963bdfce8",
  "untyped_recursive_relations": {
    "thread_a_next_batch_present": false,
    "thread_a_prev_batch_present": false,
    "thread_a_relation_event_hashes": [
      "2eac1aa671a0",
      "498a5a89fcf4"
    ],
    "thread_b_next_batch_present": false,
    "thread_b_prev_batch_present": false,
    "thread_b_relation_event_hashes": [
      "3b22215ca715"
    ]
  }
}
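
The parent-chain filter referenced in the proof summary can be sketched roughly as follows. This is an illustration under assumptions: `belongsToThread` is a hypothetical name, and the relation is flattened to a `relates_to` field here, whereas real Matrix events carry it under `content["m.relates_to"]`.

```typescript
type RelatedEvent = {
  event_id: string;
  relates_to?: { rel_type?: string; event_id?: string };
};

// Walk an event's relation chain until it reaches a thread relation or runs out.
function belongsToThread(
  event: RelatedEvent,
  threadRootId: string,
  byId: Map<string, RelatedEvent>,
): boolean {
  const visited = new Set<string>(); // guards against relation cycles
  let current: RelatedEvent | undefined = event;
  while (current && !visited.has(current.event_id)) {
    if (current.event_id === threadRootId) return true; // the root itself
    visited.add(current.event_id);
    const rel = current.relates_to;
    if (!rel?.event_id) return false; // plain main-room event
    if (rel.rel_type === "m.thread") return rel.event_id === threadRootId;
    // Indirect child (for example an edit): follow the parent it points at.
    current = byId.get(rel.event_id);
  }
  return false;
}
```

Applied to the proof scenario, the Thread A root, its direct reply, and the edit of that reply all resolve to Thread A, while the unrelated main-room event and the Thread B reply do not.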

Re-review progress:

@ga-it ga-it force-pushed the feat/matrix-thread-context-isolation branch from a830104 to 4714035 Compare May 11, 2026 14:06
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 11, 2026
@openclaw-barnacle openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. and removed proof: sufficient ClawSweeper judged the real behavior proof convincing. triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. proof: supplied External PR includes structured after-fix real behavior proof. labels May 11, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 13, 2026
@clawsweeper clawsweeper Bot added rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. and removed status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. labels Jun 2, 2026
@clawsweeper

clawsweeper Bot commented Jun 4, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@clawsweeper

clawsweeper Bot commented Jun 5, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@steipete

steipete commented Jun 5, 2026

Contributor

Closed as superseded by the maintainer-owned Matrix fix that landed in #90415 at 25149801189f.

#90415 includes the Matrix thread history isolation and reply placement behavior from this PR, with focused tests and real Matrix QA proof: AWS Crabbox Matrix QA run_8f0cdf1afdb8 passed the thread scenarios, suite 9/9.

Thanks @ga-it for pushing this behavior forward.

Labels

  • channel: matrix - Channel integration: matrix
  • docs - Improvements or additions to documentation
  • feature: ✨ showcase - ClawSweeper spotlight: unusually compelling feature idea for maintainer attention.
  • merge-risk: 🚨 compatibility - 🚨 May break existing users, config, migrations, defaults, or upgrade paths.
  • merge-risk: 🚨 message-delivery - 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages.
  • merge-risk: 🚨 session-state - 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state.
  • P2 - Normal backlog priority with limited blast radius.
  • proof: sufficient - ClawSweeper judged the real behavior proof convincing.
  • proof: supplied - External PR includes structured after-fix real behavior proof.
  • rating: 🐚 platinum hermit - Good normal PR readiness with ordinary maintainer review expected.
  • size: XL
  • status: 👀 ready for maintainer look - ClawSweeper has no concrete contributor-facing blocker left for this PR.

4 participants