Skip to content

fix(logs): only report rotation when the log file actually shrank#74252

Open
BSG2000 wants to merge 2 commits intoopenclaw:mainfrom
BSG2000:fix/log-tail-rotated-message
Open

fix(logs): only report rotation when the log file actually shrank#74252
BSG2000 wants to merge 2 commits intoopenclaw:mainfrom
BSG2000:fix/log-tail-rotated-message

Conversation

@BSG2000
Copy link
Copy Markdown

@BSG2000 BSG2000 commented Apr 29, 2026

PR title: fix(logs): only report rotation when the log file actually shrank


Summary

  • Problem: openclaw logs --follow prints Log cursor reset (file rotated). to stderr whenever the gateway log grows faster than --max-bytes between polls, even though no rotation has occurred — the cursor is still valid and the tail was just truncated to fit the byte budget.
  • Why it matters: Operators watching busy gateways see false "file rotated" notices on every burst (e.g. during a noisy startup or while a chatty channel pumps logs). The accompanying Log tail truncated (increase --max-bytes). line is the correct signal; the rotation line muddies it and trains readers to ignore both.
  • What changed: In src/logging/log-tail.ts::readLogSlice, only set reset = true in the true rotation case (cursor > size — the file shrank below the previous cursor). The fast-growth branch (size - start > maxBytes) now sets truncated = true only, since the cursor it returned remains contiguous with what the caller saw last poll.
  • What did NOT change (scope boundary): The LogTailPayload shape, the rotation-case behavior, the Log tail truncated message, and the byte-budget arithmetic are all unchanged. No CLI strings, schema, or downstream consumers were modified — src/logging/diagnostic-support-export.ts and src/cli/logs-cli.ts continue to work with the same fields.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration (logging is shared infra)
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX (stderr notice text)
  • CI/CD / infra

Linked Issue/PR

Root Cause (if applicable)

  • Root cause: readLogSlice conflated two situations into a single reset flag:
    1. cursor > size — the previously-returned cursor is past EOF; the file was rotated/truncated, so the cursor is meaningless and we must restart from the new tail.
    2. size - start > maxBytes — the file grew faster than the per-poll byte budget; the cursor is still valid, we just can't read everything that was appended in one poll.
      Only case (1) is a real reset/rotation. Case (2) was incorrectly being flagged as "rotated" because both branches set reset = true.
  • Missing detection / guardrail: log-tail.test.ts only covered the redaction path; nothing pinned the meaning of reset vs truncated.
  • Contributing context: The CLI side (src/cli/logs-cli.ts:402) hard-codes Log cursor reset (file rotated). for payload.reset, so the conflation surfaced as a misleading user-visible string rather than a silent boolean.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/logging/log-tail.test.ts
  • Scenarios the test locks in:
    1. File grows by >> maxBytes between two polls with the same valid cursor → truncated: true, reset: false, lines returned.
    2. File shrinks below the previous cursor (real rotation) → reset: true.
  • Why this is the smallest reliable guardrail: the bug is at the readLogSlice boundary; both scenarios can be exercised end-to-end against a real temp log file using the existing setLoggerOverride test seam.
  • Existing test that already covers this (if any): none.
  • If no new test is added, why not: N/A.

User-visible / Behavior Changes

openclaw logs --follow (and JSON-mode --json) no longer emits the Log cursor reset (file rotated). notice during high-throughput writes. The existing Log tail truncated (increase --max-bytes). notice still fires in that case, which is the actionable signal. The notice still fires correctly when the log is genuinely rotated/truncated.

Diagram (if applicable)

N/A

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No
  • N/A

Repro + Verification

Environment

  • OS: Ubuntu 24.04 (WSL) / Windows 11 host
  • Runtime/container: Node 22 (pnpm dev)
  • Model/provider: N/A
  • Integration/channel (if any): N/A — gateway logs only
  • Relevant config (redacted): default loggerSettings

Steps (to reproduce the original misleading notice on main)

  1. Start a gateway with anything that writes a steady stream of log lines (e.g. a chatty channel or openclaw status --watch against a busy gateway).
  2. In another terminal: openclaw logs --follow --max-bytes 4000.
  3. Wait for a write burst that exceeds 4 KB between two polls.

Expected

Only Log tail truncated (increase --max-bytes). is printed to stderr while the burst is being caught up.

Actual (before this PR)

Both Log tail truncated (increase --max-bytes). and Log cursor reset (file rotated). are printed, even though the file was never rotated.

After this PR

Only the truthful Log tail truncated notice is printed during fast growth. The rotation notice still fires when the file is actually rotated/truncated (covered by the second new test).

Evidence

  • Failing test/log before + passing after (see new tests in src/logging/log-tail.test.ts)
  • Trace/log snippets (see "Repro + Verification")
  • Screenshot/recording
  • Perf numbers (if relevant)

Local verification:

# Targeted unit lane
pnpm test src/logging/log-tail.test.ts
# Expect: 3 tests pass (1 existing + 2 new)

Note: I was unable to run the full repository test sweep locally (the dev environment is on a WSL/Windows-mounted filesystem that pnpm refuses to install into with ERR_PNPM_EACCES). The fix is a 5-line local change inside readLogSlice with no schema, no public-API, and no other call-site impact, and the new tests exercise the exact two paths affected. CI in this repo runs the full lane.

Human Verification (required)

  • Verified scenarios: fast file growth between polls → no rotation message; real rotation (file shrunk) → rotation message still emitted.
  • Edge cases checked: initial poll with no cursor still uses start = max(0, size - maxBytes) and reports truncated only (unchanged); empty file still returns cursor: size, lines: [] (unchanged); cursor inside [start, size] window with size - start <= maxBytes is unchanged.
  • What I did NOT verify: live CLI output against a running daemon during a real burst (will rely on CI + reviewer spot-check).

Review Conversations

  • I will reply to or resolve every bot review conversation I address in this PR.
  • I will leave unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No
  • N/A

Risks and Mitigations

  • Risk: a downstream consumer relies on reset === true to mean "skipped some lines" rather than "the cursor is no longer meaningful".
    • Mitigation: I grepped the repo for payload.reset / LogTailPayload. The only consumers are src/cli/logs-cli.ts (renders the user-visible string this PR is correcting) and src/logging/diagnostic-support-export.ts::sanitizeLogTail (passes the boolean through verbatim into the support bundle). Both are display-only; neither uses reset to decide whether lines were skipped — they use the existing truncated flag for that.

AI/Vibe-Coded PRs Welcome! 🤖

  • AI-assisted PR. Drafted with help from Claude (Opus 4.7) via the GitHub Copilot CLI agent harness.
  • Degree of testing: lightly tested — new unit tests added; full repo lane not run locally (see note above).
  • Prompts/session logs: available on request.
  • I understand exactly what the code does (5-line semantic narrowing of when reset is asserted in readLogSlice).
  • codex review --base origin/main not run locally (no Codex access in this environment); happy to address Codex bot comments after the PR opens.
  • Will resolve bot review conversations after addressing them.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 29, 2026

Greptile Summary

This PR fixes a false-positive "file rotated" notice in openclaw logs --follow by removing reset = true from the fast-growth branch inside readLogSlice. The fix correctly narrows reset to the one case where it is semantically meaningful (cursor > size), and two new unit tests lock in both the fast-growth and genuine-rotation paths.

Confidence Score: 5/5

This PR is safe to merge — it is a minimal, well-reasoned semantic fix with no schema changes and new tests covering both affected paths.

The change is a 5-line narrowing of when reset is asserted; the LogTailPayload shape, all downstream consumers, and byte-budget arithmetic are untouched. Both the bug (false rotation report) and the regression guard (two new unit tests) are present and correct.

No files require special attention.

Reviews (1): Last reviewed commit: "fix(logs): only report rotation when the..." | Re-trigger Greptile

Re-review progress:

@clawsweeper
Copy link
Copy Markdown
Contributor

clawsweeper Bot commented Apr 29, 2026

Codex review: needs real behavior proof before merge.

Summary
This PR narrows log-tail reset semantics, adds regression coverage for fast-growth truncation and real rotation, and adds a user-facing changelog entry.

Reproducibility: yes. Source inspection on current main gives a high-confidence path: call readConfiguredLogTail once, append more than maxBytes while the prior cursor is still within the file, and the fast-growth branch sets both truncated and reset, causing the CLI to print both notices.

Real behavior proof
Needs stronger real behavior proof before merge: The PR body provides prose and unit-test claims but no observable after-fix terminal output, screenshot/recording, linked artifact, or redacted logs showing the real openclaw logs --follow --max-bytes behavior; contributors should redact private paths, IPs, API keys, phone numbers, and non-public endpoints, then update the PR body to trigger a fresh ClawSweeper review or ask a maintainer to comment @clawsweeper re-review.

Next step before merge
Manual handling is needed because the remaining blocker is contributor-supplied real behavior proof and final exact-head validation, not an automatable code repair.

Security
Cleared: The diff only changes log-tail boolean semantics, tests, and changelog text; it does not add dependencies, workflows, permissions, secret handling, network calls, or execution paths.

Review details

Best possible solution:

Merge the narrow semantic fix after the contributor adds observable after-fix proof and the exact-head required checks finish cleanly.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection on current main gives a high-confidence path: call readConfiguredLogTail once, append more than maxBytes while the prior cursor is still within the file, and the fast-growth branch sets both truncated and reset, causing the CLI to print both notices.

Is this the best way to solve the issue?

Yes for the code direction. Removing reset = true only from the fast-growth branch is the narrow maintainable fix because true rotation remains represented by cursor > size, while skipped bytes continue to use the existing truncated flag.

What I checked:

  • current-main root cause: On current main, the fast-growth branch where size - start > maxBytes sets both reset = true and truncated = true, so a valid cursor plus a large append is classified as a reset. (src/logging/log-tail.ts:91, 64bbe96d8843)
  • current-main user-visible notice: The CLI independently prints Log tail truncated (increase --max-bytes). for payload.truncated and Log cursor reset (file rotated). for payload.reset, so the source-level conflation reaches stderr and JSON notice output. (src/cli/logs-cli.ts:403, 64bbe96d8843)
  • PR implementation: The PR patch removes reset = true from only the fast-growth branch and leaves the true shrink/rotation branch setting reset = true. (src/logging/log-tail.ts:91, fa08a970fca9)
  • PR regression coverage: The PR adds temp-file tests for a valid cursor followed by a burst over maxBytes expecting truncated: true and reset: false, plus a shrink-below-cursor rotation case expecting reset: true. (src/logging/log-tail.test.ts:53, fa08a970fca9)
  • consumer scope: Repo search found reset consumption in the CLI notices and diagnostic support export passthrough; diagnostic export records the boolean but does not use it to control tailing behavior. (src/logging/diagnostic-support-export.ts:373, 64bbe96d8843)
  • docs surface: The public logs CLI docs define --follow and --max-bytes but do not document reset as a skipped-lines signal; the existing truncation notice is the documented-relevant signal for byte budget pressure. Public docs: docs/cli/logs.md. (docs/cli/logs.md:14, 64bbe96d8843)

Likely related people:

  • @steipete: GitHub commit history shows fix(cli): add local logs fallback added src/logging/log-tail.ts and wired local log fallback into src/cli/logs-cli.ts; current shallow blame also maps the central lines to Peter Steinberger. (role: introduced behavior and frequent maintainer; confidence: high; commits: 306fe841f54b, e25b54210097, 5da9f5e57c14; files: src/logging/log-tail.ts, src/cli/logs-cli.ts, src/cli/logs-cli.test.ts)
  • @ethanclaw: Recently changed src/logging/log-tail.ts for active rolling log file selection across date boundaries, which is adjacent to the tail cursor/file-size behavior under review. (role: recent adjacent maintainer; confidence: medium; commits: 492e2a3060db; files: src/logging/log-tail.ts)
  • @shashank-poola: Recently updated openclaw logs --follow reconnect behavior and tests in the same CLI surface that renders tail notices. (role: recent logs-follow maintainer; confidence: medium; commits: 23fe3559e5a2; files: src/cli/logs-cli.ts, src/cli/logs-cli.test.ts, docs/cli/logs.md)
  • @RomneyDa: Recently added logs-follow reconnect notice and JSON notice parity in the same CLI notice path. (role: recent logs-follow maintainer; confidence: medium; commits: 4bb4127a33bf; files: src/cli/logs-cli.ts, src/cli/logs-cli.test.ts, docs/cli/logs.md)

Remaining risk / open question:

  • The PR still lacks observable after-fix real behavior proof from a real CLI/log-tail run; unit tests and prose claims are supplemental only for this external contribution.
  • Exact-head CI was not fully complete when inspected, with checks-node-core still in progress.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 64bbe96d8843.

Re-review progress:

@BSG2000
Copy link
Copy Markdown
Author

BSG2000 commented May 2, 2026

Addressed @clawsweeper [P3] in f729cbc: added an Unreleased > Fixes CHANGELOG.md entry crediting the contribution. No code change; pnpm vitest run src/logging/log-tail.test.ts → 3/3 green locally.

@BSG2000 BSG2000 force-pushed the fix/log-tail-rotated-message branch from f729cbc to f6298d9 Compare May 6, 2026 09:36
@openclaw-barnacle openclaw-barnacle Bot added the triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. label May 6, 2026
BSG2000 and others added 2 commits May 7, 2026 08:33
readLogSlice set both truncated and reset whenever the file grew
faster than --max-bytes between polls, even though the cursor was still
valid in that case. The CLI then printed 'Log cursor reset (file
rotated).' to stderr on every burst, which was misleading: no rotation
had occurred, the tail was just truncated to fit the byte budget.

Tighten the contract so reset is set only when the file is shorter
than the previously-returned cursor (the actual rotation/truncation
signal). The fast-growth case keeps truncated: true, so callers still
see the existing 'Log tail truncated (increase --max-bytes).' message
without the spurious rotation notice.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…sage

Address review feedback on PR openclaw#74252 (P3 from clawsweeper): add
Unreleased > Fixes entry per CONTRIBUTING policy for user-facing fixes.
No code change; log-tail.test.ts: 3/3 green.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@BSG2000 BSG2000 force-pushed the fix/log-tail-rotated-message branch from 4868f27 to fa08a97 Compare May 7, 2026 06:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size: S triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant