refactor: consolidate log files and clean up obsolete artifacts #1597

Merged: TheLastCicada merged 3 commits into v2-rc2 from refactor/prune-log-files on Apr 23, 2026

@TheLastCicada TheLastCicada commented Apr 21, 2026

Summary

Follow-up to #1592. Reduces the on-disk logging footprint from 8 transports per logger version (16 total, counting v1 and v2) down to 3 rotated files, eliminates unbounded files, adds retention limits, and auto-cleans legacy files from hosts upgrading across CADT versions.

Context for why there were so many files:

Inspecting a real deployment's ~/.chia/mainnet/cadt/v1/logs/:

| File | Config | Problem |
|---|---|---|
| error.log | level: error, no rotation | Unbounded: 19 MB and growing; all content also in other files |
| combined.log | logger level, no rotation | Unbounded: 27 MB and growing; pure duplicate of application-*.log |
| application-%DATE%.log | logger level, daily rotate, 20 MB | Subset of debug-%DATE%.log |
| debug-%DATE%.log | debug level, daily rotate, 20 MB | Superset of everything above; no maxFiles, so rotations accumulate forever |
| exceptions.log | no rotation | Duplicate of exceptions-%DATE%.log |
| exceptions-%DATE%.log | daily rotate, 20 MB | no maxFiles |
| rejections.log | no rotation | Duplicate of rejections-%DATE%.log |
| rejections-%DATE%.log | daily rotate, 20 MB | no maxFiles |

Winston 3's per-transport level is independent of the logger's level, so debug-%DATE%.log with level: 'debug' captured verbose/debug regardless of APP.LOG_LEVEL.
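The level interaction described above can be illustrated with a small model. This is a simplified sketch, not Winston's source: `transportAccepts` is a hypothetical helper, and the numeric ordering assumes Winston's default npm-style levels.

```javascript
// Simplified model of level gating; NOT Winston's actual implementation.
// Lower number = higher severity (npm-style ordering, assumed here).
const LEVELS = { error: 0, warn: 1, info: 2, http: 3, verbose: 4, debug: 5, silly: 6 };

// A transport that sets its own `level` is gated by that level alone;
// a transport without one falls back to the logger's level.
function transportAccepts(entryLevel, transportLevel, loggerLevel) {
  const effective = transportLevel ?? loggerLevel;
  return LEVELS[effective] >= LEVELS[entryLevel];
}

// Levels that reach a `level: 'debug'` file transport while the logger is at 'info'.
const reachesDebugFile = Object.keys(LEVELS).filter((lvl) =>
  transportAccepts(lvl, 'debug', 'info'),
);
console.log(reachesDebugFile);
```

Under this model, everything from error through debug reaches the file even with the logger at info, and only silly is filtered out.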

Changes

New transport set (per logger version)

transports: [
  new DailyRotateFile({
    filename: `${logDir}/application-%DATE%.log`,
    datePattern: 'YYYY-MM-DD',
    level: 'debug',                         // captures error..debug stream
    zippedArchive: true,
    maxSize: '20m',
    maxFiles: '30d',                        // NEW: bounded retention
    utc: true,
    format: format.combine(format.json()),
  }),
],
exceptionHandlers: [
  new DailyRotateFile({ /* exceptions-%DATE%.log, 20m, 30d, zipped */ }),
],
rejectionHandlers: [
  new DailyRotateFile({ /* rejections-%DATE%.log, 20m, 30d, zipped */ }),
],

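Disk usage is now bounded rather than unbounded: about 600 MB per logger version for the application log when each day stays within a single 20 MB file (30 × 20 MB, less once rotated files are gzipped). Days that exceed maxSize rotate into additional files, so busy hosts can exceed that estimate, but rotated files older than 30 days are removed.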
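The retention arithmetic behind that estimate, as a rough sketch. `retentionBoundMb` and its parameter names are hypothetical, and the figure ignores gzip savings, so treat it as an upper estimate for one transport rather than a measured value.

```javascript
// Upper-bound size estimate (in MB) for one rotated transport.
// Illustrative helper only; ignores compression of archived files.
function retentionBoundMb({ retentionDays, maxFileMb, filesPerDay = 1 }) {
  return retentionDays * filesPerDay * maxFileMb;
}

// One 20 MB file per day, kept for 30 days.
console.log(retentionBoundMb({ retentionDays: 30, maxFileMb: 20 })); // 600

// A busy host that rotates three times a day on size.
console.log(retentionBoundMb({ retentionDays: 30, maxFileMb: 20, filesPerDay: 3 })); // 1800
```

The second call shows why the 600 MB figure assumes at most one size-based rotation per day.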

Dropped transports

  • error.log: subset of application-%DATE%.log; filter via jq 'select(.level=="error")' or journalctl -p err
  • combined.log: pure duplicate
  • debug-%DATE%.log: merged into application-%DATE%.log at debug level
  • exceptions.log and rejections.log: unbounded duplicates of the rotated siblings
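For tooling that used to read error.log directly, the same view can be recovered from the JSON-lines application log. A minimal sketch, assuming one JSON object per line with a `level` field (as in the smoke-test output); `entriesAtLevel` is a hypothetical helper, not part of this PR.

```javascript
const fs = require('node:fs');

// Return the parsed entries at a given level from a JSON-lines log file.
function entriesAtLevel(file, level) {
  return fs
    .readFileSync(file, 'utf8')
    .split('\n')
    .filter((line) => line.trim() !== '')
    .flatMap((line) => {
      try {
        const entry = JSON.parse(line);
        return entry.level === level ? [entry] : [];
      } catch {
        // Skip lines that are not usable JSON objects (e.g. a partially written line).
        return [];
      }
    });
}
```

Calling `entriesAtLevel(pathToApplicationLog, 'error')` returns what error.log used to hold for that day. Rotated files that have already been gzipped would need to be decompressed first.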

Startup cleanup of legacy files

New cleanupObsoleteLogFiles(logDir) runs at logger creation and removes files superseded by rotated equivalents:

  • error.log, combined.log, exceptions.log, rejections.log (the four unbounded files)
  • debug-%DATE%.log, debug-%DATE%.log.N, debug-%DATE%.log.gz (matched via /^debug-.*\.log(\.\d+)?(\.gz)?$/)

Behavior guarantees:

  • Best-effort: missing files, missing directories, and per-file unlink errors never throw; they're collected and logged as a warning via the new logger once it's constructed.
  • Idempotent: second run on a clean directory removes 0 files.
  • Narrow: files whose names loosely resemble debug files but don't match the regex (debug-backup.tar, debug.txt, other.log) are preserved.
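A minimal sketch of how a cleanup with these guarantees could look. This is not the code in the diff: the `{ removed, errors }` return shape and the constant names are assumptions made for illustration, and only the file list and the regex come from the description above.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// The four unbounded files, plus the rotated debug family.
const UNBOUNDED_FILES = new Set(['error.log', 'combined.log', 'exceptions.log', 'rejections.log']);
const ROTATED_DEBUG = /^debug-.*\.log(\.\d+)?(\.gz)?$/;

// Sketch only: reports what happened instead of throwing, so the caller
// can log the result once the logger has been constructed.
function cleanupObsoleteLogFiles(logDir) {
  const removed = [];
  const errors = [];
  let names;
  try {
    names = fs.readdirSync(logDir);
  } catch (err) {
    // A missing directory simply means there is nothing to clean up.
    if (err.code !== 'ENOENT') errors.push({ file: logDir, message: err.message });
    return { removed, errors };
  }
  for (const name of names.sort()) {
    // Narrow: anything that is not an exact name or a regex match is kept.
    if (!UNBOUNDED_FILES.has(name) && !ROTATED_DEBUG.test(name)) continue;
    try {
      fs.unlinkSync(path.join(logDir, name));
      removed.push(name);
    } catch (err) {
      // A file that vanished between readdir and unlink counts as already clean.
      if (err.code !== 'ENOENT') errors.push({ file: name, message: err.message });
    }
  }
  return { removed, errors };
}
```

Returning the collected errors rather than logging inside the function sidesteps the ordering problem: the cleanup runs before the logger exists, and the caller emits the warning afterwards.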

Adjacent concerns this PR deliberately does NOT address, to keep the diff narrow:

  • Making maxFiles configurable via APP.LOG_RETENTION_DAYS — fine with 30 as a default; can add a knob later if operators ask
  • Removing `console.trace(error)` calls from other controllers — separate cleanup
  • Updating docker/k8s volume sizing recommendations — followup docs PR if desired

Verification

Ran two dedicated smoke tests against winston@3.19.0 + winston-daily-rotate-file@5.0.0.

Cleanup logic (seeded with the user's actual production filenames plus adversarial non-log filenames):

== removed (9) ==
combined.log
debug-2026-04-13.log
debug-2026-04-16.log
debug-2026-04-16.log.1
debug-2026-04-16.log.gz
debug-2026-04-20.log
error.log
exceptions.log
rejections.log

== kept ==
application-2026-04-13.log, application-2026-04-20.log,
exceptions-2026-04-20.log, rejections-2026-04-20.log,
debug-backup.tar, debug.txt, other.log

second-run removed: 0 errors: 0
missing-dir removed: 0 errors: 0

New application transport with logger level info and transport level debug, with one message emitted at each of six levels (error, warn, info, verbose, debug, silly):

=== application-2026-04-21.log ===
{"level":"error",  "message":"TEST_ERROR",   ...}
{"level":"warn",   "message":"TEST_WARN",    ...}
{"level":"info",   "message":"TEST_INFO",    ...}
{"level":"verbose","message":"TEST_VERBOSE", ...}
{"level":"debug",  "message":"TEST_DEBUG",   ...}

silly is correctly excluded (transport level is debug; if operators want silly, they can set APP.LOG_LEVEL=silly and it will reach the journalctl Console transport from #1592 without cluttering disk).

Impact on existing deployments

  • Upgrading hosts lose five classes of legacy files on first start (the four unbounded files plus any existing debug-*.log* files). All content in them is either duplicated elsewhere on disk or available in journalctl.
  • Disk usage drops. In the investigated deployment, the unbounded files had already reached 46 MB combined and would continue growing; the rotated debug files had accumulated 8 days × up to 3 files/day × up to 20 MB.
  • Operational commands change slightly: grep targets that hit error.log or combined.log need to move to application-*.log or to journalctl.
  • Nothing in this PR changes what Winston emits to journalctl; that stream is owned by #1592 (fix(API): capture staging commit errors in logs and stream all logs to journalctl).

Test plan

  • npm run test passes locally
  • Start CADT against an existing ~/.chia/mainnet/cadt/v1/logs/ that contains the old files; verify:
    • The five legacy file classes are removed on startup
    • New log lines land in application-%DATE%.log at all levels error..debug
    • A structured info entry reporting the cleanup count and filenames is present
  • Start CADT against a fresh / empty log directory; verify no startup warnings about missing files
  • Simulate a read-only log directory (or make one file immutable with chattr +i); verify the logger still starts and emits a structured warning listing the files it couldn't remove
  • Confirm maxFiles: '30d' behavior by advancing the system clock or manually creating older dated files and verifying they're eventually purged

Note

Medium Risk
Moderate operational risk: changes logging destinations/retention and deletes legacy log files on startup, which could affect troubleshooting workflows if misconfigured or if cleanup targets unexpected files.

Overview
Consolidates per-version file logging into a single rotated application-%DATE%.log (capturing debug and above) plus rotated exceptions-%DATE%.log and rejections-%DATE%.log, and adds bounded retention via maxFiles (30d) and shared size limits.

Removes the unrotated error.log/combined.log/exceptions.log/rejections.log transports and adds a best-effort startup cleanup (cleanupObsoleteLogFiles) that deletes those legacy files and old debug-*.log* artifacts, logging what was removed or failed to remove.

Reviewed by Cursor Bugbot for commit 462f805.

Trim the logger from 8 file transports per version to 3:

  - application-%DATE%.log: single daily-rotated JSON file at debug
    level, capturing the full error/warn/info/verbose/debug stream
    (the previous combined.log, application-%DATE%.log, and
    debug-%DATE%.log all wrote overlapping subsets of this)
  - exceptions-%DATE%.log: unhandled exceptions, daily-rotated
  - rejections-%DATE%.log: unhandled promise rejections, daily-rotated

Removed transports:

  - error.log (unbounded; its data was already in combined.log /
    application-%DATE%.log / debug-%DATE%.log)
  - combined.log (unbounded; pure duplicate of application-%DATE%.log)
  - debug-%DATE%.log (superseded by application-%DATE%.log at debug
    level)
  - exceptions.log and rejections.log (unbounded duplicates of the
    rotated siblings)

All remaining rotated transports now carry maxFiles: '30d', capping
disk usage at roughly 600 MB per logger version instead of growing
without bound.

To avoid leaving old files sitting around forever on hosts upgrading
across CADT versions, startup now deletes the superseded files from
the log directory:

  error.log, combined.log, exceptions.log, rejections.log,
  and debug-%DATE%.log{,.N,.gz}

Cleanup is best-effort; missing files are ignored and per-file errors
are logged rather than thrown, so a cleanup problem can never take
down startup. Journalctl / pm2 / docker continue to be the primary
real-time log consumers.
@TheLastCicada TheLastCicada changed the base branch from develop to v2-rc2 April 23, 2026 17:04
@TheLastCicada TheLastCicada merged commit 8faec6d into v2-rc2 Apr 23, 2026
36 of 37 checks passed
@TheLastCicada TheLastCicada deleted the refactor/prune-log-files branch April 23, 2026 17:04