Skip to content

fix(msteams): prefer freshest personal conversation for proactive DM sends#54702

Merged
steipete merged 3 commits intoopenclaw:mainfrom
gumclaw:fix-msteams-stale-dm-resolution
Mar 28, 2026
Merged

fix(msteams): prefer freshest personal conversation for proactive DM sends#54702
steipete merged 3 commits intoopenclaw:mainfrom
gumclaw:fix-msteams-stale-dm-resolution

Conversation

@gumclaw
Copy link
Copy Markdown
Contributor

@gumclaw gumclaw commented Mar 25, 2026

Summary

  • Problem: MS Teams proactive sends to user:<aadObjectId> could resolve to a stale personal DM conversation reference when the same user had multiple stored references.
  • Why it matters: replies could be generated successfully but then fail outbound with HTTP 403 because the send path targeted an old Bot Framework conversation ID.
  • What changed: findByUserId now considers all matching references, prefers personal conversations, and then selects the freshest lastSeenAt; added a regression test covering multiple references for one user.
  • What did NOT change (scope boundary): no changes to Teams webhook ingestion, proactive send transport, or conversation reference storage format beyond typing lastSeenAt on the stored reference.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #
  • Related #
  • This PR fixes a bug or regression

Root Cause / Regression History (if applicable)

  • Root cause: the MS Teams FS conversation store returned the first stored user match instead of selecting the newest valid DM reference.
  • Missing detection / guardrail: there was no regression test for multiple conversation references sharing the same AAD object ID.
  • Prior context (git blame, prior PR, issue, or refactor if known): Unknown.
  • Why this regressed now: Teams rotated/recreated a personal DM conversation reference for the same user, leaving both old and new references in the store.
  • If unknown, what was ruled out: ruled out inbound processing failure and ruled out a Teams-wide outage; the failing path was specifically proactive outbound resolution for the DM user target.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: extensions/msteams/src/conversation-store-fs.test.ts
  • Scenario the test should lock in: when a single AAD user has multiple stored references, the store should pick the freshest personal conversation instead of the first inserted one.
  • Why this is the smallest reliable guardrail: the bug lives entirely in store-side lookup/selection logic.
  • Existing test that already covers this (if any): None.
  • If no new test is added, why not: N/A.

User-visible / Behavior Changes

  • MS Teams proactive replies to DM users now prefer the latest personal conversation reference when multiple references exist for the same user.

Security Impact (required)

  • New permissions/capabilities? (Yes/No) No
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) No
  • Data access scope changed? (Yes/No) No
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: macOS 15 / Darwin 25.3.0 (arm64)
  • Runtime/container: local OpenClaw dev repo
  • Model/provider: N/A
  • Integration/channel (if any): MS Teams
  • Relevant config (redacted): stored conversation references containing two personal DMs and one group/chat reference for the same AAD user

Steps

  1. Store an older personal conversation reference for a user.
  2. Store another reference for the same AAD user, including a newer personal DM.
  3. Resolve findByUserId for that user.

Expected

  • The newest personal conversation reference is returned.

Actual

  • Before this change, the first inserted matching reference was returned, which could be stale.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: reproduced a real MS Teams DM delivery failure caused by stale conversation resolution; hot-patched the installed runtime; confirmed replies started arriving; added repo-level regression test; confirmed targeted test passes and pnpm build passes.
  • Edge cases checked: multiple references for the same AAD user across personal and group chat conversation types; newest personal reference wins.
  • What you did not verify: full repo-wide pnpm check and pnpm test green, because origin/main currently has unrelated TUI type errors in src/tui/components/* for getEditorKeybindings.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps: N/A

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly: revert this PR or restore the previous findByUserId selection behavior.
  • Files/config to restore: extensions/msteams/src/conversation-store-fs.ts, extensions/msteams/src/conversation-store.ts
  • Known bad symptoms reviewers should watch for: proactive DM sends choosing an unexpected conversation reference for users with multiple stored references.

Risks and Mitigations

  • Risk: preferring personal and newest lastSeenAt could change resolution behavior for edge cases where a non-personal reference was intentionally preferred.
    • Mitigation: this only affects user:<id> lookup, which is used for person-targeted proactive sends; the new regression test locks the expected DM-preference behavior.

@openclaw-barnacle openclaw-barnacle Bot added channel: msteams Channel integration: msteams size: S labels Mar 25, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Mar 25, 2026

Greptile Summary

This PR fixes a real production bug where MS Teams proactive DM sends could target a stale Bot Framework conversation ID when a user had multiple stored conversation references. The FS store's findByUserId now gathers all matches, prefers personal conversation types, and picks the freshest lastSeenAt — which is the correct selection policy for person-targeted proactive messaging. A focused regression test locks in the new behaviour.

Key observations:

  • Memory store not updated (conversation-store-memory.ts)findByUserId still uses the original first-match logic. The two store implementations are now behaviorally inconsistent. If the memory store is ever used on a real proactive-send path (or in integration tests that drive that path), it will reproduce the same regression. The fix should be mirrored here.
  • Redundant type intersectionConversationStoreData.conversations is typed Record<string, StoredConversationReference & { lastSeenAt?: string }>, but lastSeenAt is now part of StoredConversationReference itself after this PR, making the intersection dead weight.
  • Test timing — The regression test uses a hard-coded 10 ms setTimeout to produce distinct timestamps. This can be flaky on loaded CI runners; a deterministic clock injection or explicit lastSeenAt seeding would make the test more reliable.
  • The FS store sort logic itself is sound: personal entries bubble to the top, then ties are broken by timestamp descending, with missing timestamps treated as epoch 0 (preserving legacy entries at the bottom).

Confidence Score: 4/5

  • Safe to merge with one concrete follow-up: mirror the same fix into the memory store to keep both implementations consistent.
  • The FS store fix is correct and the regression test covers the primary bug scenario. The only substantive issue is that conversation-store-memory.ts was not updated with the same selection logic, leaving a behavioral inconsistency between the two store implementations. This does not affect the immediate production fix (the FS store is the real store) but is a latent bug worth resolving before it bites again.
  • extensions/msteams/src/conversation-store-memory.ts — findByUserId still has the pre-fix first-match logic.

Comments Outside Diff (2)

  1. extensions/msteams/src/conversation-store-memory.ts, line 31-44 (link)

    P2 Memory store has the same stale-reference bug

    findByUserId in the memory store still uses the old first-match logic, returning the first entry that matches by aadObjectId or user.id without any preference for personal conversations or lastSeenAt recency. The FS store received the fix, but the memory store was left behind.

    If createMSTeamsConversationStoreMemory is ever wired into a real proactive-send path (e.g. in tests that drive the full send stack, or a future in-process runtime), it will reproduce exactly the regression this PR is fixing. The two implementations should behave identically for the same inputs.

    // suggested replacement for the findByUserId method:
    findByUserId: async (id) => {
      const target = id.trim();
      if (!target) {
        return null;
      }
      const matches: MSTeamsConversationStoreEntry[] = [];
      for (const [conversationId, reference] of map.entries()) {
        if (reference.user?.aadObjectId === target || reference.user?.id === target) {
          matches.push({ conversationId, reference });
        }
      }
      if (matches.length === 0) {
        return null;
      }
      matches.sort((a, b) => {
        const aPersonal = a.reference.conversation?.conversationType?.toLowerCase() === "personal" ? 1 : 0;
        const bPersonal = b.reference.conversation?.conversationType?.toLowerCase() === "personal" ? 1 : 0;
        if (aPersonal !== bPersonal) return bPersonal - aPersonal;
        const aTs = parseTimestamp(a.reference.lastSeenAt) ?? 0;
        const bTs = parseTimestamp(b.reference.lastSeenAt) ?? 0;
        return bTs - aTs;
      });
      return matches[0] ?? null;
    },
    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: extensions/msteams/src/conversation-store-memory.ts
    Line: 31-44
    
    Comment:
    **Memory store has the same stale-reference bug**
    
    `findByUserId` in the memory store still uses the old first-match logic, returning the first entry that matches by `aadObjectId` or `user.id` without any preference for `personal` conversations or `lastSeenAt` recency. The FS store received the fix, but the memory store was left behind.
    
    If `createMSTeamsConversationStoreMemory` is ever wired into a real proactive-send path (e.g. in tests that drive the full send stack, or a future in-process runtime), it will reproduce exactly the regression this PR is fixing. The two implementations should behave identically for the same inputs.
    
    ```ts
    // suggested replacement for the findByUserId method:
    findByUserId: async (id) => {
      const target = id.trim();
      if (!target) {
        return null;
      }
      const matches: MSTeamsConversationStoreEntry[] = [];
      for (const [conversationId, reference] of map.entries()) {
        if (reference.user?.aadObjectId === target || reference.user?.id === target) {
          matches.push({ conversationId, reference });
        }
      }
      if (matches.length === 0) {
        return null;
      }
      matches.sort((a, b) => {
        const aPersonal = a.reference.conversation?.conversationType?.toLowerCase() === "personal" ? 1 : 0;
        const bPersonal = b.reference.conversation?.conversationType?.toLowerCase() === "personal" ? 1 : 0;
        if (aPersonal !== bPersonal) return bPersonal - aPersonal;
        const aTs = parseTimestamp(a.reference.lastSeenAt) ?? 0;
        const bTs = parseTimestamp(b.reference.lastSeenAt) ?? 0;
        return bTs - aTs;
      });
      return matches[0] ?? null;
    },
    ```
    
    How can I resolve this? If you propose a fix, please make it concise.
  2. extensions/msteams/src/conversation-store-fs.ts, line 11 (link)

    P2 Redundant lastSeenAt in intersection type

    lastSeenAt was added to StoredConversationReference in this PR, so the & { lastSeenAt?: string } intersection on line 11 is now redundant — ConversationStoreData.conversations is already typed as Record<string, StoredConversationReference> which includes the field. Not a runtime bug, but the duplication can mislead future readers into thinking lastSeenAt lives outside the base type.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: extensions/msteams/src/conversation-store-fs.ts
    Line: 11
    
    Comment:
    **Redundant `lastSeenAt` in intersection type**
    
    `lastSeenAt` was added to `StoredConversationReference` in this PR, so the `& { lastSeenAt?: string }` intersection on line 11 is now redundant — `ConversationStoreData.conversations` is already typed as `Record<string, StoredConversationReference>` which includes the field. Not a runtime bug, but the duplication can mislead future readers into thinking `lastSeenAt` lives outside the base type.
    
    
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: extensions/msteams/src/conversation-store-memory.ts
Line: 31-44

Comment:
**Memory store has the same stale-reference bug**

`findByUserId` in the memory store still uses the old first-match logic, returning the first entry that matches by `aadObjectId` or `user.id` without any preference for `personal` conversations or `lastSeenAt` recency. The FS store received the fix, but the memory store was left behind.

If `createMSTeamsConversationStoreMemory` is ever wired into a real proactive-send path (e.g. in tests that drive the full send stack, or a future in-process runtime), it will reproduce exactly the regression this PR is fixing. The two implementations should behave identically for the same inputs.

```ts
// suggested replacement for the findByUserId method:
findByUserId: async (id) => {
  const target = id.trim();
  if (!target) {
    return null;
  }
  const matches: MSTeamsConversationStoreEntry[] = [];
  for (const [conversationId, reference] of map.entries()) {
    if (reference.user?.aadObjectId === target || reference.user?.id === target) {
      matches.push({ conversationId, reference });
    }
  }
  if (matches.length === 0) {
    return null;
  }
  matches.sort((a, b) => {
    const aPersonal = a.reference.conversation?.conversationType?.toLowerCase() === "personal" ? 1 : 0;
    const bPersonal = b.reference.conversation?.conversationType?.toLowerCase() === "personal" ? 1 : 0;
    if (aPersonal !== bPersonal) return bPersonal - aPersonal;
    const aTs = parseTimestamp(a.reference.lastSeenAt) ?? 0;
    const bTs = parseTimestamp(b.reference.lastSeenAt) ?? 0;
    return bTs - aTs;
  });
  return matches[0] ?? null;
},
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: extensions/msteams/src/conversation-store-fs.ts
Line: 11

Comment:
**Redundant `lastSeenAt` in intersection type**

`lastSeenAt` was added to `StoredConversationReference` in this PR, so the `& { lastSeenAt?: string }` intersection on line 11 is now redundant — `ConversationStoreData.conversations` is already typed as `Record<string, StoredConversationReference>` which includes the field. Not a runtime bug, but the duplication can mislead future readers into thinking `lastSeenAt` lives outside the base type.

```suggestion
  conversations: Record<string, StoredConversationReference>;
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: extensions/msteams/src/conversation-store-fs.test.ts
Line: 148

Comment:
**`setTimeout` timing creates a flaky ordering dependency**

The test relies on a 10 ms `setTimeout` to guarantee that `a:new-personal` gets a strictly later `lastSeenAt` than `a:old-personal`. Under CI load or on systems with coarse timer resolution this gap can collapse to 0 ms, causing both entries to receive the same ISO string and making the sort non-deterministic.

A more reliable approach is to bypass the real clock altogether. For example, inject a synthetic clock or directly call `upsert` with an explicit `lastSeenAt` on the reference (if the store honours an incoming `lastSeenAt`). Alternatively, the `upsert` signature could accept an optional `now` override for testability.

At minimum, consider bumping the delay to something more generous (e.g. 50–100 ms) or adding an assertion that the two stored timestamps are actually different before checking the result.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "msteams: prefer freshest personal conver..." | Re-trigger Greptile

user: { id: "group-user", aadObjectId: "shared-aad" },
});

await new Promise((resolve) => setTimeout(resolve, 10));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 setTimeout timing creates a flaky ordering dependency

The test relies on a 10 ms setTimeout to guarantee that a:new-personal gets a strictly later lastSeenAt than a:old-personal. Under CI load or on systems with coarse timer resolution this gap can collapse to 0 ms, causing both entries to receive the same ISO string and making the sort non-deterministic.

A more reliable approach is to bypass the real clock altogether. For example, inject a synthetic clock or directly call upsert with an explicit lastSeenAt on the reference (if the store honours an incoming lastSeenAt). Alternatively, the upsert signature could accept an optional now override for testability.

At minimum, consider bumping the delay to something more generous (e.g. 50–100 ms) or adding an assertion that the two stored timestamps are actually different before checking the result.

Prompt To Fix With AI
This is a comment left during a code review.
Path: extensions/msteams/src/conversation-store-fs.test.ts
Line: 148

Comment:
**`setTimeout` timing creates a flaky ordering dependency**

The test relies on a 10 ms `setTimeout` to guarantee that `a:new-personal` gets a strictly later `lastSeenAt` than `a:old-personal`. Under CI load or on systems with coarse timer resolution this gap can collapse to 0 ms, causing both entries to receive the same ISO string and making the sort non-deterministic.

A more reliable approach is to bypass the real clock altogether. For example, inject a synthetic clock or directly call `upsert` with an explicit `lastSeenAt` on the reference (if the store honours an incoming `lastSeenAt`). Alternatively, the `upsert` signature could accept an optional `now` override for testability.

At minimum, consider bumping the delay to something more generous (e.g. 50–100 ms) or adding an assertion that the two stored timestamps are actually different before checking the result.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I pushed a follow-up that does three things:

  1. mirrors the same freshest-personal-selection logic into conversation-store-memory.ts so both store implementations behave the same
  2. removes the redundant & { lastSeenAt?: string } typing in the FS store now that lastSeenAt lives on StoredConversationReference
  3. removes the timer-based ordering from the test by making the regression coverage deterministic instead of relying on setTimeout

Focused test still passes after the follow-up:

  • pnpm test -- extensions/msteams/src/conversation-store-fs.test.ts

Thanks.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5e6f25e6d8

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +143 to +144
(parseTimestamp(b.reference.lastSeenAt) ?? 0) -
(parseTimestamp(a.reference.lastSeenAt) ?? 0)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Avoid ranking by mutable lastSeenAt writes

findByUserId now orders matches by lastSeenAt, but that timestamp is updated on every upsert, including proactive send-side cache updates (send-context.ts calls store.upsert(...) when caching graphChatId). As a result, an older/stale personal conversation can be made “freshest” by a non-inbound write and keep winning this sort, so user:<aad> sends may continue targeting a stale Bot Framework conversation ID and still fail with 403 despite a newer valid DM reference existing.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. You are right that send-side cache writes can currently interfere with the freshness ranking if they touch the same record. I started patching this by preserving/carrying forward the stored lastSeenAt on cache-only updates instead of letting those writes redefine freshness, but I want to tighten the regression coverage before I mark this thread resolved.

So for now: acknowledged, and I am treating this as a real follow-up to address on the PR rather than waving it away. Thanks.

@steipete
Copy link
Copy Markdown
Contributor

Was Opus here? This is a very localized fix, I get sth better in.

Screenshot 2026-03-28 at 01 36 09

@steipete steipete force-pushed the fix-msteams-stale-dm-resolution branch from c00d20f to 4c6f312 Compare March 28, 2026 02:04
@steipete steipete merged commit 2926c25 into openclaw:main Mar 28, 2026
9 checks passed
@steipete
Copy link
Copy Markdown
Contributor

Landed via temp rebase onto main.

  • Gate: pnpm lint ✅, pnpm build ✅, pnpm test ❌ (unrelated pre-existing failures in src/cli/skills-cli.formatting.test.ts, src/plugins/cli.browser-plugin.integration.test.ts, and src/plugin-sdk/runtime-api-guardrails.test.ts)
  • Land commit: 4c6f312
  • Merge commit: 2926c25

Thanks @gumclaw!

johnkhagler pushed a commit to johnkhagler/openclaw that referenced this pull request Mar 29, 2026
alexcode-cc pushed a commit to alexcode-cc/clawdbot that referenced this pull request Mar 30, 2026
alexjiang1 pushed a commit to alexjiang1/openclaw that referenced this pull request Mar 31, 2026
lovewanwan pushed a commit to lovewanwan/openclaw that referenced this pull request Apr 28, 2026
ogt-redknie pushed a commit to ogt-redknie/OPENX that referenced this pull request May 2, 2026
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

channel: msteams Channel integration: msteams size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants