Skip to content

fix(reply): normalize Windows media paths for dedupe#44228

Open
aniruddhaadak80 wants to merge 2 commits intoopenclaw:mainfrom
aniruddhaadak80:fix/windows-reply-media-dedupe
Open

fix(reply): normalize Windows media paths for dedupe#44228
aniruddhaadak80 wants to merge 2 commits intoopenclaw:mainfrom
aniruddhaadak80:fix/windows-reply-media-dedupe

Conversation

@aniruddhaadak80
Copy link
Copy Markdown

Summary

  • Problem: reply-media dedupe compared Windows drive-letter local paths and file:///C:/... URLs as different strings.
  • Why it matters: replies can resend media the messaging tool already delivered on Windows hosts.
  • What changed: normalized Windows drive-letter file URLs and local paths into the same dedupe form, and added a regression test that covers both slash styles.
  • What did NOT change (scope boundary): non-Windows media normalization and the rest of reply suppression logic are unchanged.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

  • Windows reply dedupe now treats file:///C:/..., C:\..., and C:/... media references as the same local file when deciding whether to strip duplicates.

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)

Repro + Verification

Environment

  • OS: Windows
  • Runtime/container: local Node/pnpm workspace
  • Model/provider: N/A
  • Integration/channel (if any): auto-reply / messaging tool dedupe path
  • Relevant config (redacted): N/A

Steps

  1. Send media via a messaging tool on Windows so the sent record contains a local drive-letter path.
  2. Produce a reply payload that references the same file through a file:///C:/... URL.
  3. Run reply-media dedupe.

Expected

  • The duplicate media reference should be stripped.

Actual

  • The new regression test passes and the payload media is removed as expected.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios: ran pnpm exec vitest run src/auto-reply/reply/reply-payloads.test.ts.
  • Additional gate: ran pnpm build:plugin-sdk:dts.
  • What I did not verify: additional UNC-path semantics remain out of scope for this small fix.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)

Failure Recovery (if this breaks)

  • Revert the reply-payload normalization change and test.
  • Known bad symptoms reviewers should watch for: unexpected dedupe behavior for unusual Windows path formats outside drive-letter paths.

Copilot AI review requested due to automatic review settings March 12, 2026 16:24
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes Windows-specific reply media dedupe mismatches by normalizing drive-letter paths so that local paths (C:\... / C:/...) and file:///C:/... URLs compare equivalently during dedupe filtering.

Changes:

  • Normalize Windows absolute drive-letter local paths by canonicalizing separators before dedupe comparison.
  • Normalize file:///C:/... URLs into the same canonical Windows local-path form during dedupe comparison.
  • Add a regression test covering file:///C:/... vs both C:\... and C:/....

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
src/auto-reply/reply/reply-payloads.ts Adds Windows drive-letter normalization in filterMessagingToolMediaDuplicates so equivalent local path / file URL forms dedupe correctly.
src/auto-reply/reply/reply-payloads.test.ts Adds regression coverage for Windows drive-letter file:///C:/... URLs deduping against local path variants.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Mar 12, 2026

Greptile Summary

This PR fixes a Windows-specific media deduplication bug where file:///C:/… URLs and local drive-letter paths (C:\…, C:/…) were compared as different strings, causing replies to resend already-delivered media. The fix adds a WINDOWS_ABSOLUTE_PATH_RE helper and extends normalizeMediaForDedupe to canonicalize all three forms to a backslash-separated path before Set lookup.

Key observations:

  • The core normalization logic is correct: file:///C:/tmp/photo%20one.jpgC:\tmp\photo one.jpg, C:/tmp/photo one.jpgC:\tmp\photo one.jpg, C:\tmp\photo one.jpg → unchanged. All three forms land in the same Set bucket.
  • Drive letter case is not normalized. file:///c:/tmp/file.jpg (lowercase c) normalizes to c:\tmp\file.jpg while a stored record of C:\tmp\file.jpg (uppercase C) normalizes to C:\tmp\file.jpg — these do not match. Since Windows paths are case-insensitive, this is a real deduplication gap that the current test suite does not cover (all test variants use uppercase C:).
  • Scope is appropriately narrow; UNC paths and non-Windows normalization are unchanged.

Confidence Score: 3/5

  • The fix is correct for same-case drive letters but will silently miss deduplication when drive letter casing differs between the stored record and the reply payload.
  • The normalization logic works correctly for the exact cases tested (uppercase C: on both sides, slash-style variations, URL percent-encoding). However, Windows file paths are case-insensitive and drive letters can legitimately appear as lowercase in one record and uppercase in another. The omission of drive-letter case folding means the fix is incomplete: a file:///c:/… payload (lowercase) will not dedupe against a stored C:\… record (uppercase), leaving the original resend bug alive in that scenario. Confidence is further reduced because no test exercises the case-mismatch path.
  • src/auto-reply/reply/reply-payloads.ts lines 117–124 (normalization logic) and the corresponding test in src/auto-reply/reply/reply-payloads.test.ts (missing lowercase drive-letter coverage)
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/auto-reply/reply/reply-payloads.ts
Line: 117-124

Comment:
**Drive letter case not normalized**

The normalization converts all variants to backslash form (e.g. `C:\path`) but does not lowercase the drive letter. This means `file:///c:/tmp/photo.jpg` (lowercase `c`) normalizes to `c:\tmp\photo.jpg`, which will **not** match a `sentMediaUrls` entry of `C:\tmp\photo.jpg` (uppercase `C`), even though on Windows these refer to the same file.

Windows paths are case-insensitive, so a sent record stored with one casing and a reply payload referencing the other casing will silently fail to dedupe.

Consider folding the drive letter to a consistent case before returning:

```typescript
// non-file:// branch
return WINDOWS_ABSOLUTE_PATH_RE.test(trimmed)
  ? trimmed[0].toLowerCase() + trimmed.slice(1).replaceAll("/", "\\")
  : trimmed;
```

```typescript
// file: branch
if (/^\/[a-zA-Z]:\//.test(decodedPath)) {
  const win = decodedPath.slice(1).replaceAll("/", "\\");
  return win[0].toLowerCase() + win.slice(1);
}
```

(The corresponding test in `reply-payloads.test.ts` covers only uppercase `C:` on both sides, so the case-mismatch scenario is currently untested.)

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: 4e0ca4c

Comment on lines +117 to +124
return WINDOWS_ABSOLUTE_PATH_RE.test(trimmed) ? trimmed.replaceAll("/", "\\") : trimmed;
}
try {
const parsed = new URL(trimmed);
if (parsed.protocol === "file:") {
return decodeURIComponent(parsed.pathname || "");
const decodedPath = decodeURIComponent(parsed.pathname || "");
if (/^\/[a-zA-Z]:\//.test(decodedPath)) {
return decodedPath.slice(1).replaceAll("/", "\\");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive letter case not normalized

The normalization converts all variants to backslash form (e.g. C:\path) but does not lowercase the drive letter. This means file:///c:/tmp/photo.jpg (lowercase c) normalizes to c:\tmp\photo.jpg, which will not match a sentMediaUrls entry of C:\tmp\photo.jpg (uppercase C), even though on Windows these refer to the same file.

Windows paths are case-insensitive, so a sent record stored with one casing and a reply payload referencing the other casing will silently fail to dedupe.

Consider folding the drive letter to a consistent case before returning:

// non-file:// branch
return WINDOWS_ABSOLUTE_PATH_RE.test(trimmed)
  ? trimmed[0].toLowerCase() + trimmed.slice(1).replaceAll("/", "\\")
  : trimmed;
// file: branch
if (/^\/[a-zA-Z]:\//.test(decodedPath)) {
  const win = decodedPath.slice(1).replaceAll("/", "\\");
  return win[0].toLowerCase() + win.slice(1);
}

(The corresponding test in reply-payloads.test.ts covers only uppercase C: on both sides, so the case-mismatch scenario is currently untested.)

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/auto-reply/reply/reply-payloads.ts
Line: 117-124

Comment:
**Drive letter case not normalized**

The normalization converts all variants to backslash form (e.g. `C:\path`) but does not lowercase the drive letter. This means `file:///c:/tmp/photo.jpg` (lowercase `c`) normalizes to `c:\tmp\photo.jpg`, which will **not** match a `sentMediaUrls` entry of `C:\tmp\photo.jpg` (uppercase `C`), even though on Windows these refer to the same file.

Windows paths are case-insensitive, so a sent record stored with one casing and a reply payload referencing the other casing will silently fail to dedupe.

Consider folding the drive letter to a consistent case before returning:

```typescript
// non-file:// branch
return WINDOWS_ABSOLUTE_PATH_RE.test(trimmed)
  ? trimmed[0].toLowerCase() + trimmed.slice(1).replaceAll("/", "\\")
  : trimmed;
```

```typescript
// file: branch
if (/^\/[a-zA-Z]:\//.test(decodedPath)) {
  const win = decodedPath.slice(1).replaceAll("/", "\\");
  return win[0].toLowerCase() + win.slice(1);
}
```

(The corresponding test in `reply-payloads.test.ts` covers only uppercase `C:` on both sides, so the case-mismatch scenario is currently untested.)

How can I resolve this? If you propose a fix, please make it concise.

@katoue

This comment was marked as spam.

@johnsonshi
Copy link
Copy Markdown
Contributor

The Windows media-dedupe fix itself looks good, and the regression test is solid.

  • In src/auto-reply/reply/reply-payloads.ts, the drive-letter normalization looks correct for the reported Windows case (file:///C:/... vs C:\\... / C:/...).
  • The new test in src/auto-reply/reply/reply-payloads.test.ts covers that behavior well.

That said, this PR also includes unrelated changes in:

  • apps/macos/Sources/OpenClawProtocol/GatewayModels.swift
  • apps/shared/OpenClawKit/Sources/OpenClawProtocol/GatewayModels.swift
  • src/config/schema.labels.ts

Those changes don’t appear tied to the reply-media dedupe fix and aren’t reflected in the PR title/summary. The branch history also shows they came in as a separate commit (fix(config): add push labels and sync protocol models), which reinforces that they should probably be split out.

I’d suggest trimming this PR to just the reply-media dedupe fix plus the regression test, or moving the config/protocol-model changes into a separate PR so this one stays tightly scoped.

@openclaw-barnacle
Copy link
Copy Markdown

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@openclaw-barnacle openclaw-barnacle Bot added the stale Marked as stale due to inactivity label Apr 28, 2026
@clawsweeper
Copy link
Copy Markdown
Contributor

clawsweeper Bot commented Apr 28, 2026

Codex review: needs changes before merge.

Summary
The PR normalizes Windows drive-letter file URLs and local paths for reply-media dedupe, adds a regression test, and also includes push config label plus Swift protocol model changes.

Reproducibility: yes. A high-confidence source-level reproduction is clear: a sent media URL such as C:\tmp\photo.jpg does not match reply media references such as file:///C:/tmp/photo.jpg or file:///c:/tmp/photo.jpg on current main because the helper returns different strings.

Next step before merge
This is a narrow valid bug with clear files, mechanical review findings, and a focused replacement or branch-repair path.

Security
Cleared: The diff does not add dependencies, CI execution, secrets handling, network calls, lifecycle scripts, downloaded artifacts, or broader permissions.

Review findings

  • [P2] Normalize Windows drive-letter casing — src/auto-reply/reply/reply-payloads.ts:117-124
  • [P3] Add the required changelog entry — src/auto-reply/reply/reply-payloads.ts:117
  • [P3] Drop unrelated push/config hunks — src/config/schema.labels.ts:80-84
Review details

Best possible solution:

Land a focused current-main fix in src/auto-reply/reply/reply-payloads-dedupe.ts that canonicalizes Windows drive-letter file URLs and local paths, including drive-letter case, with targeted tests and a single changelog entry.

Do we have a high-confidence way to reproduce the issue?

Yes. A high-confidence source-level reproduction is clear: a sent media URL such as C:\tmp\photo.jpg does not match reply media references such as file:///C:/tmp/photo.jpg or file:///c:/tmp/photo.jpg on current main because the helper returns different strings.

Is this the best way to solve the issue?

No. The PR targets the right behavior, but it misses Windows drive-letter case folding, is against the older helper shape, and includes unrelated stale push/protocol changes.

Full review comments:

  • [P2] Normalize Windows drive-letter casing — src/auto-reply/reply/reply-payloads.ts:117-124
    The new normalization only changes separators, so file:///c:/tmp/photo.jpg still canonicalizes differently from a sent C:\tmp\photo.jpg. Windows drive letters are case-insensitive, so this leaves a valid duplicate-media resend path unfixed; fold the drive letter to a stable case and add a mismatch regression test.
    Confidence: 0.9
  • [P3] Add the required changelog entry — src/auto-reply/reply/reply-payloads.ts:117
    This is a user-facing bug fix, but the PR does not update CHANGELOG.md. OpenClaw policy requires a single-line changelog entry for user-facing fixes before merge.
    Confidence: 0.84
  • [P3] Drop unrelated push/config hunks — src/config/schema.labels.ts:80-84
    This reply-media dedupe PR also changes push config labels and Swift PushTestResult transport models. Those changes are unrelated to the reported bug and are already present on current main, so keeping them here makes the branch stale and harder to merge safely.
    Confidence: 0.78

Overall correctness: patch is incorrect
Overall confidence: 0.88

Acceptance criteria:

  • pnpm test src/auto-reply/reply/reply-payloads.test.ts
  • pnpm exec oxfmt --check --threads=1 src/auto-reply/reply/reply-payloads-dedupe.ts src/auto-reply/reply/reply-payloads.test.ts CHANGELOG.md
  • git diff --check
  • pnpm check:changed

What I checked:

Likely related people:

  • Peter Steinberger: Local blame/log for the current split dedupe helper and tests attributes the available checkout boundary to this maintainer; useful for routing, not blame. (role: current-main maintainer; confidence: medium; commits: b37fba7c07a7; files: src/auto-reply/reply/reply-payloads-dedupe.ts, src/auto-reply/reply/reply-payloads.test.ts)
  • johnsonshi: Already reviewed this PR and identified the same unrelated push/protocol hunks that need to be split from the reply-media fix. (role: reviewer / scope reviewer; confidence: medium; files: src/auto-reply/reply/reply-payloads.ts, src/config/schema.labels.ts, apps/macos/Sources/OpenClawProtocol/GatewayModels.swift)
  • shuofengzhang: Opened related reply: normalize Windows file paths for media dedupe #37425/Reply media dedupe should normalize Windows local paths and file:// URLs #37426 with the same Windows path normalization goal, including drive-letter case differences, which helps preserve the original bug context. (role: related prior implementation author; confidence: low; files: src/auto-reply/reply/reply-payloads.ts, src/auto-reply/reply/reply-payloads.test.ts)

Remaining risk / open question:

  • No executable test run was performed in this read-only review; the reproduction is source-level.

Codex review notes: model gpt-5.5, reasoning high; reviewed against a90be474f441.

@openclaw-barnacle openclaw-barnacle Bot removed the stale Marked as stale due to inactivity label Apr 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants