fix(msteams): prefer freshest personal conversation for proactive DM sends by gumclaw · Pull Request #54702 · openclaw/openclaw

gumclaw · 2026-03-25T21:08:26Z

Summary

Problem: MS Teams proactive sends to user:<aadObjectId> could resolve to a stale personal DM conversation reference when the same user had multiple stored references.
Why it matters: replies could be generated successfully but then fail outbound with HTTP 403 because the send path targeted an old Bot Framework conversation ID.
What changed: findByUserId now considers all matching references, prefers personal conversations, and then selects the freshest lastSeenAt; added a regression test covering multiple references for one user.
What did NOT change (scope boundary): no changes to Teams webhook ingestion, proactive send transport, or conversation reference storage format beyond typing lastSeenAt on the stored reference.

Change Type (select all)

Scope (select all touched areas)

Linked Issue/PR

Closes #
Related #
This PR fixes a bug or regression

Root Cause / Regression History (if applicable)

Root cause: the MS Teams FS conversation store returned the first stored user match instead of selecting the newest valid DM reference.
Missing detection / guardrail: there was no regression test for multiple conversation references sharing the same AAD object ID.
Prior context (git blame, prior PR, issue, or refactor if known): Unknown.
Why this regressed now: Teams rotated/recreated a personal DM conversation reference for the same user, leaving both old and new references in the store.
If unknown, what was ruled out: ruled out inbound processing failure and ruled out a Teams-wide outage; the failing path was specifically proactive outbound resolution for the DM user target.

Regression Test Plan (if applicable)

Coverage level that should have caught this:
- Unit test
- Seam / integration test
- End-to-end test
- Existing coverage already sufficient
Target test or file: extensions/msteams/src/conversation-store-fs.test.ts
Scenario the test should lock in: when a single AAD user has multiple stored references, the store should pick the freshest personal conversation instead of the first inserted one.
Why this is the smallest reliable guardrail: the bug lives entirely in store-side lookup/selection logic.
Existing test that already covers this (if any): None.
If no new test is added, why not: N/A.

User-visible / Behavior Changes

MS Teams proactive replies to DM users now prefer the latest personal conversation reference when multiple references exist for the same user.

Security Impact (required)

New permissions/capabilities? (Yes/No) No
Secrets/tokens handling changed? (Yes/No) No
New/changed network calls? (Yes/No) No
Command/tool execution surface changed? (Yes/No) No
Data access scope changed? (Yes/No) No
If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

OS: macOS 15 / Darwin 25.3.0 (arm64)
Runtime/container: local OpenClaw dev repo
Model/provider: N/A
Integration/channel (if any): MS Teams
Relevant config (redacted): stored conversation references containing two personal DMs and one group/chat reference for the same AAD user

Steps

Store an older personal conversation reference for a user.
Store another reference for the same AAD user, including a newer personal DM.
Resolve findByUserId for that user.

Expected

The newest personal conversation reference is returned.

Actual

Before this change, the first inserted matching reference was returned, which could be stale.

Evidence

Attach at least one:

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

Verified scenarios: reproduced a real MS Teams DM delivery failure caused by stale conversation resolution; hot-patched the installed runtime; confirmed replies started arriving; added repo-level regression test; confirmed targeted test passes and pnpm build passes.
Edge cases checked: multiple references for the same AAD user across personal and group chat conversation types; newest personal reference wins.
What you did not verify: full repo-wide pnpm check and pnpm test green, because origin/main currently has unrelated TUI type errors in src/tui/components/* for getEditorKeybindings.

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

Backward compatible? (Yes/No) Yes
Config/env changes? (Yes/No) No
Migration needed? (Yes/No) No
If yes, exact upgrade steps: N/A

Failure Recovery (if this breaks)

How to disable/revert this change quickly: revert this PR or restore the previous findByUserId selection behavior.
Files/config to restore: extensions/msteams/src/conversation-store-fs.ts, extensions/msteams/src/conversation-store.ts
Known bad symptoms reviewers should watch for: proactive DM sends choosing an unexpected conversation reference for users with multiple stored references.

Risks and Mitigations

Risk: preferring personal and newest lastSeenAt could change resolution behavior for edge cases where a non-personal reference was intentionally preferred.
- Mitigation: this only affects user:<id> lookup, which is used for person-targeted proactive sends; the new regression test locks the expected DM-preference behavior.

greptile-apps · 2026-03-25T21:10:39Z

Greptile Summary

This PR fixes a real production bug where MS Teams proactive DM sends could target a stale Bot Framework conversation ID when a user had multiple stored conversation references. The FS store's findByUserId now gathers all matches, prefers personal conversation types, and picks the freshest lastSeenAt — which is the correct selection policy for person-targeted proactive messaging. A focused regression test locks in the new behaviour.

Key observations:

Memory store not updated (conversation-store-memory.ts) — findByUserId still uses the original first-match logic. The two store implementations are now behaviorally inconsistent. If the memory store is ever used on a real proactive-send path (or in integration tests that drive that path), it will reproduce the same regression. The fix should be mirrored here.
Redundant type intersection — ConversationStoreData.conversations is typed Record<string, StoredConversationReference & { lastSeenAt?: string }>, but lastSeenAt is now part of StoredConversationReference itself after this PR, making the intersection dead weight.
Test timing — The regression test uses a hard-coded 10 ms setTimeout to produce distinct timestamps. This can be flaky on loaded CI runners; a deterministic clock injection or explicit lastSeenAt seeding would make the test more reliable.
The FS store sort logic itself is sound: personal entries bubble to the top, then ties are broken by timestamp descending, with missing timestamps treated as epoch 0 (preserving legacy entries at the bottom).

Confidence Score: 4/5

Safe to merge with one concrete follow-up: mirror the same fix into the memory store to keep both implementations consistent.
The FS store fix is correct and the regression test covers the primary bug scenario. The only substantive issue is that conversation-store-memory.ts was not updated with the same selection logic, leaving a behavioral inconsistency between the two store implementations. This does not affect the immediate production fix (the FS store is the real store) but is a latent bug worth resolving before it bites again.
extensions/msteams/src/conversation-store-memory.ts — findByUserId still has the pre-fix first-match logic.

Comments Outside Diff (2)

extensions/msteams/src/conversation-store-memory.ts, line 31-44 (link)

Memory store has the same stale-reference bug

findByUserId in the memory store still uses the old first-match logic, returning the first entry that matches by aadObjectId or user.id without any preference for personal conversations or lastSeenAt recency. The FS store received the fix, but the memory store was left behind.

If createMSTeamsConversationStoreMemory is ever wired into a real proactive-send path (e.g. in tests that drive the full send stack, or a future in-process runtime), it will reproduce exactly the regression this PR is fixing. The two implementations should behave identically for the same inputs.

// suggested replacement for the findByUserId method:
findByUserId: async (id) => {
  const target = id.trim();
  if (!target) {
    return null;
  }
  const matches: MSTeamsConversationStoreEntry[] = [];
  for (const [conversationId, reference] of map.entries()) {
    if (reference.user?.aadObjectId === target || reference.user?.id === target) {
      matches.push({ conversationId, reference });
    }
  }
  if (matches.length === 0) {
    return null;
  }
  matches.sort((a, b) => {
    const aPersonal = a.reference.conversation?.conversationType?.toLowerCase() === "personal" ? 1 : 0;
    const bPersonal = b.reference.conversation?.conversationType?.toLowerCase() === "personal" ? 1 : 0;
    if (aPersonal !== bPersonal) return bPersonal - aPersonal;
    const aTs = parseTimestamp(a.reference.lastSeenAt) ?? 0;
    const bTs = parseTimestamp(b.reference.lastSeenAt) ?? 0;
    return bTs - aTs;
  });
  return matches[0] ?? null;
},

Prompt To Fix With AI

This is a comment left during a code review.
Path: extensions/msteams/src/conversation-store-memory.ts
Line: 31-44

Comment:
**Memory store has the same stale-reference bug**

`findByUserId` in the memory store still uses the old first-match logic, returning the first entry that matches by `aadObjectId` or `user.id` without any preference for `personal` conversations or `lastSeenAt` recency. The FS store received the fix, but the memory store was left behind.

If `createMSTeamsConversationStoreMemory` is ever wired into a real proactive-send path (e.g. in tests that drive the full send stack, or a future in-process runtime), it will reproduce exactly the regression this PR is fixing. The two implementations should behave identically for the same inputs.

```ts
// suggested replacement for the findByUserId method:
findByUserId: async (id) => {
  const target = id.trim();
  if (!target) {
    return null;
  }
  const matches: MSTeamsConversationStoreEntry[] = [];
  for (const [conversationId, reference] of map.entries()) {
    if (reference.user?.aadObjectId === target || reference.user?.id === target) {
      matches.push({ conversationId, reference });
    }
  }
  if (matches.length === 0) {
    return null;
  }
  matches.sort((a, b) => {
    const aPersonal = a.reference.conversation?.conversationType?.toLowerCase() === "personal" ? 1 : 0;
    const bPersonal = b.reference.conversation?.conversationType?.toLowerCase() === "personal" ? 1 : 0;
    if (aPersonal !== bPersonal) return bPersonal - aPersonal;
    const aTs = parseTimestamp(a.reference.lastSeenAt) ?? 0;
    const bTs = parseTimestamp(b.reference.lastSeenAt) ?? 0;
    return bTs - aTs;
  });
  return matches[0] ?? null;
},
```

How can I resolve this? If you propose a fix, please make it concise.

extensions/msteams/src/conversation-store-fs.ts, line 11 (link)

Redundant lastSeenAt in intersection type

lastSeenAt was added to StoredConversationReference in this PR, so the & { lastSeenAt?: string } intersection on line 11 is now redundant — ConversationStoreData.conversations is already typed as Record<string, StoredConversationReference> which includes the field. Not a runtime bug, but the duplication can mislead future readers into thinking lastSeenAt lives outside the base type.

Prompt To Fix With AI

This is a comment left during a code review.
Path: extensions/msteams/src/conversation-store-fs.ts
Line: 11

Comment:
**Redundant `lastSeenAt` in intersection type**

`lastSeenAt` was added to `StoredConversationReference` in this PR, so the `& { lastSeenAt?: string }` intersection on line 11 is now redundant — `ConversationStoreData.conversations` is already typed as `Record<string, StoredConversationReference>` which includes the field. Not a runtime bug, but the duplication can mislead future readers into thinking `lastSeenAt` lives outside the base type.



How can I resolve this? If you propose a fix, please make it concise.

Prompt To Fix All With AI

This is a comment left during a code review.
Path: extensions/msteams/src/conversation-store-memory.ts
Line: 31-44

Comment:
**Memory store has the same stale-reference bug**

`findByUserId` in the memory store still uses the old first-match logic, returning the first entry that matches by `aadObjectId` or `user.id` without any preference for `personal` conversations or `lastSeenAt` recency. The FS store received the fix, but the memory store was left behind.

If `createMSTeamsConversationStoreMemory` is ever wired into a real proactive-send path (e.g. in tests that drive the full send stack, or a future in-process runtime), it will reproduce exactly the regression this PR is fixing. The two implementations should behave identically for the same inputs.

```ts
// suggested replacement for the findByUserId method:
findByUserId: async (id) => {
  const target = id.trim();
  if (!target) {
    return null;
  }
  const matches: MSTeamsConversationStoreEntry[] = [];
  for (const [conversationId, reference] of map.entries()) {
    if (reference.user?.aadObjectId === target || reference.user?.id === target) {
      matches.push({ conversationId, reference });
    }
  }
  if (matches.length === 0) {
    return null;
  }
  matches.sort((a, b) => {
    const aPersonal = a.reference.conversation?.conversationType?.toLowerCase() === "personal" ? 1 : 0;
    const bPersonal = b.reference.conversation?.conversationType?.toLowerCase() === "personal" ? 1 : 0;
    if (aPersonal !== bPersonal) return bPersonal - aPersonal;
    const aTs = parseTimestamp(a.reference.lastSeenAt) ?? 0;
    const bTs = parseTimestamp(b.reference.lastSeenAt) ?? 0;
    return bTs - aTs;
  });
  return matches[0] ?? null;
},
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: extensions/msteams/src/conversation-store-fs.ts
Line: 11

Comment:
**Redundant `lastSeenAt` in intersection type**

`lastSeenAt` was added to `StoredConversationReference` in this PR, so the `& { lastSeenAt?: string }` intersection on line 11 is now redundant — `ConversationStoreData.conversations` is already typed as `Record<string, StoredConversationReference>` which includes the field. Not a runtime bug, but the duplication can mislead future readers into thinking `lastSeenAt` lives outside the base type.

```suggestion
  conversations: Record<string, StoredConversationReference>;
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: extensions/msteams/src/conversation-store-fs.test.ts
Line: 148

Comment:
**`setTimeout` timing creates a flaky ordering dependency**

The test relies on a 10 ms `setTimeout` to guarantee that `a:new-personal` gets a strictly later `lastSeenAt` than `a:old-personal`. Under CI load or on systems with coarse timer resolution this gap can collapse to 0 ms, causing both entries to receive the same ISO string and making the sort non-deterministic.

A more reliable approach is to bypass the real clock altogether. For example, inject a synthetic clock or directly call `upsert` with an explicit `lastSeenAt` on the reference (if the store honours an incoming `lastSeenAt`). Alternatively, the `upsert` signature could accept an optional `now` override for testability.

At minimum, consider bumping the delay to something more generous (e.g. 50–100 ms) or adding an assertion that the two stored timestamps are actually different before checking the result.

How can I resolve this? If you propose a fix, please make it concise.

_{Reviews (1): Last reviewed commit: "msteams: prefer freshest personal conver..." | Re-trigger Greptile}

greptile-apps · 2026-03-25T21:10:49Z

+      user: { id: "group-user", aadObjectId: "shared-aad" },
+    });
+
+    await new Promise((resolve) => setTimeout(resolve, 10));


setTimeout timing creates a flaky ordering dependency

The test relies on a 10 ms setTimeout to guarantee that a:new-personal gets a strictly later lastSeenAt than a:old-personal. Under CI load or on systems with coarse timer resolution this gap can collapse to 0 ms, causing both entries to receive the same ISO string and making the sort non-deterministic.

A more reliable approach is to bypass the real clock altogether. For example, inject a synthetic clock or directly call upsert with an explicit lastSeenAt on the reference (if the store honours an incoming lastSeenAt). Alternatively, the upsert signature could accept an optional now override for testability.

At minimum, consider bumping the delay to something more generous (e.g. 50–100 ms) or adding an assertion that the two stored timestamps are actually different before checking the result.

Prompt To Fix With AI

This is a comment left during a code review. Path: extensions/msteams/src/conversation-store-fs.test.ts Line: 148 Comment: **`setTimeout` timing creates a flaky ordering dependency** The test relies on a 10 ms `setTimeout` to guarantee that `a:new-personal` gets a strictly later `lastSeenAt` than `a:old-personal`. Under CI load or on systems with coarse timer resolution this gap can collapse to 0 ms, causing both entries to receive the same ISO string and making the sort non-deterministic. A more reliable approach is to bypass the real clock altogether. For example, inject a synthetic clock or directly call `upsert` with an explicit `lastSeenAt` on the reference (if the store honours an incoming `lastSeenAt`). Alternatively, the `upsert` signature could accept an optional `now` override for testability. At minimum, consider bumping the delay to something more generous (e.g. 50–100 ms) or adding an assertion that the two stored timestamps are actually different before checking the result. How can I resolve this? If you propose a fix, please make it concise.

Good catch. I pushed a follow-up that does three things:

mirrors the same freshest-personal-selection logic into conversation-store-memory.ts so both store implementations behave the same

removes the redundant & { lastSeenAt?: string } typing in the FS store now that lastSeenAt lives on StoredConversationReference

removes the timer-based ordering from the test by making the regression coverage deterministic instead of relying on setTimeout

Focused test still passes after the follow-up:

pnpm test -- extensions/msteams/src/conversation-store-fs.test.ts

Thanks.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5e6f25e6d8

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-03-25T21:12:41Z

+        (parseTimestamp(b.reference.lastSeenAt) ?? 0) -
+        (parseTimestamp(a.reference.lastSeenAt) ?? 0)


Avoid ranking by mutable lastSeenAt writes

findByUserId now orders matches by lastSeenAt, but that timestamp is updated on every upsert, including proactive send-side cache updates (send-context.ts calls store.upsert(...) when caching graphChatId). As a result, an older/stale personal conversation can be made “freshest” by a non-inbound write and keep winning this sort, so user:<aad> sends may continue targeting a stale Bot Framework conversation ID and still fail with 403 despite a newer valid DM reference existing.

Useful? React with 👍 / 👎.

Good catch. You are right that send-side cache writes can currently interfere with the freshness ranking if they touch the same record. I started patching this by preserving/carrying forward the stored lastSeenAt on cache-only updates instead of letting those writes redefine freshness, but I want to tighten the regression coverage before I mark this thread resolved.

So for now: acknowledged, and I am treating this as a real follow-up to address on the PR rather than waving it away. Thanks.

steipete · 2026-03-28T02:03:18Z

Was Opus here? This is a very localized fix, I get sth better in.

…claw

…claw)

steipete · 2026-03-28T02:05:19Z

Landed via temp rebase onto main.

Gate: pnpm lint ✅, pnpm build ✅, pnpm test ❌ (unrelated pre-existing failures in src/cli/skills-cli.formatting.test.ts, src/plugins/cli.browser-plugin.integration.test.ts, and src/plugin-sdk/runtime-api-guardrails.test.ts)
Land commit: 4c6f312
Merge commit: 2926c25

Thanks @gumclaw!

…claw

…claw)

…claw

…claw)

…claw

…claw)

…claw

…claw)

…claw

…claw)

…claw

…claw)

openclaw-barnacle Bot added channel: msteams Channel integration: msteams size: S labels Mar 25, 2026

greptile-apps Bot reviewed Mar 25, 2026

View reviewed changes

chatgpt-codex-connector Bot reviewed Mar 25, 2026

View reviewed changes

BradGroux mentioned this pull request Mar 26, 2026

WORKING: All Microsoft Issues and PRs #49126

Closed

gumclaw added 2 commits March 28, 2026 01:56

msteams: prefer freshest personal conversation reference

eb21ed0

msteams: align memory store user resolution

bd54490

fix: prefer freshest Teams DM reference (openclaw#54702) (thanks @gum…

4c6f312

…claw)

steipete force-pushed the fix-msteams-stale-dm-resolution branch from c00d20f to 4c6f312 Compare March 28, 2026 02:04

steipete merged commit 2926c25 into openclaw:main Mar 28, 2026
9 checks passed

openclaw-barnacle Bot added size: M and removed size: S labels Mar 28, 2026

johnkhagler pushed a commit to johnkhagler/openclaw that referenced this pull request Mar 29, 2026

fix: prefer freshest Teams DM reference (openclaw#54702) (thanks @gum…

d66c24a

…claw)

alexcode-cc pushed a commit to alexcode-cc/clawdbot that referenced this pull request Mar 30, 2026

fix: prefer freshest Teams DM reference (openclaw#54702) (thanks @gum…

6d1ae82

…claw)

alexjiang1 pushed a commit to alexjiang1/openclaw that referenced this pull request Mar 31, 2026

fix: prefer freshest Teams DM reference (openclaw#54702) (thanks @gum…

821a236

…claw)

lovewanwan pushed a commit to lovewanwan/openclaw that referenced this pull request Apr 28, 2026

fix: prefer freshest Teams DM reference (openclaw#54702) (thanks @gum…

dfa05c2

…claw)

ogt-redknie pushed a commit to ogt-redknie/OPENX that referenced this pull request May 2, 2026

fix: prefer freshest Teams DM reference (openclaw#54702) (thanks @gum…

eda5da7

…claw)

github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026

fix: prefer freshest Teams DM reference (openclaw#54702) (thanks @gum…

8eb2c01

…claw)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(msteams): prefer freshest personal conversation for proactive DM sends#54702

fix(msteams): prefer freshest personal conversation for proactive DM sends#54702
steipete merged 3 commits intoopenclaw:mainfrom
gumclaw:fix-msteams-stale-dm-resolution

gumclaw commented Mar 25, 2026

Uh oh!

greptile-apps Bot commented Mar 25, 2026 •

edited

Loading

Comments Outside Diff (2)

Uh oh!

greptile-apps Bot Mar 25, 2026

Uh oh!

gumclaw Mar 25, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Mar 25, 2026

Uh oh!

gumclaw Mar 25, 2026

Uh oh!

steipete commented Mar 28, 2026

Uh oh!

Uh oh!

steipete commented Mar 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		(parseTimestamp(b.reference.lastSeenAt) ?? 0) -
		(parseTimestamp(a.reference.lastSeenAt) ?? 0)

Uh oh!

Conversation

gumclaw commented Mar 25, 2026

Summary

Change Type (select all)

Scope (select all touched areas)

Linked Issue/PR

Root Cause / Regression History (if applicable)

Regression Test Plan (if applicable)

User-visible / Behavior Changes

Security Impact (required)

Repro + Verification

Environment

Steps

Expected

Actual

Evidence

Human Verification (required)

Review Conversations

Compatibility / Migration

Failure Recovery (if this breaks)

Risks and Mitigations

Uh oh!

greptile-apps Bot commented Mar 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 4/5

Comments Outside Diff (2)

Uh oh!

greptile-apps Bot Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

gumclaw Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

gumclaw Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

steipete commented Mar 28, 2026

Uh oh!

Uh oh!

steipete commented Mar 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

greptile-apps Bot commented Mar 25, 2026 •

edited

Loading