@jkowalski jkowalski commented Sep 15, 2025

This is a breaking change for users who might be using Kopia as a library.

Log Format

{"t":"<timestamp-rfc-3339-microseconds>", "span:T1":"V1", "span:T2":"V2", "n":"<source>", "m":"<message>", /*parameters*/}

Where each record is associated with one or more spans that describe its scope:

  • "span:client": "<hash-of-username@hostname>"
  • "span:repo": "<random>" - random identifier of a repository connection (from repo.Open)
  • "span:maintenance": "<random>" - random identifier of a maintenance session
  • "span:upload": "<hash-of-username@host:/path>" - uniquely identifies upload session of a given directory
  • "span:checkpoint": "<random>" - encapsulates each checkpoint operation during Upload
  • "span:server-session": "<random>" - single client connection to the server
  • "span:flush": "<random>" - encapsulates each Flush session
  • "span:maintenance": "<random>" - encapsulates each maintenance operation
  • "span:loadIndex": "<random>" - encapsulates index loading operation
  • "span:emr": "<random>" - encapsulates epoch manager refresh
  • "span:writePack": "<pack-blob-ID>" - encapsulates pack blob preparation and writing

(plus additional minor spans for various phases of the maintenance).
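As a rough illustration of how a consumer might read this format, the following Go sketch decodes one record and collects its "span:*" fields. The helper name spansOf and the sample record values are made up for illustration; they are not part of Kopia.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// spansOf decodes one JSON log line and returns its "span:*" fields,
// keyed by span type with the "span:" prefix removed.
// Hypothetical helper, not a Kopia API.
func spansOf(line string) (map[string]string, error) {
	var rec map[string]any
	if err := json.Unmarshal([]byte(line), &rec); err != nil {
		return nil, fmt.Errorf("decoding log record: %w", err)
	}

	spans := map[string]string{}
	for k, v := range rec {
		name, ok := strings.CutPrefix(k, "span:")
		if !ok {
			continue // not a span field ("t", "n", "m" or a parameter)
		}
		if s, ok := v.(string); ok {
			spans[name] = s
		}
	}
	return spans, nil
}

func main() {
	// Made-up record that follows the documented shape.
	line := `{"t":"2025-09-15T10:00:00.000000Z","span:repo":"r1","span:upload":"u1","n":"uploader","m":"example"}`

	spans, err := spansOf(line)
	if err != nil {
		panic(err)
	}

	// Sort span names so the output order is stable.
	names := make([]string, 0, len(spans))
	for n := range spans {
		names = append(names, n)
	}
	sort.Strings(names)
	for _, n := range names {
		fmt.Printf("%s=%s\n", n, spans[n])
	}
}
```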

Notable points:

  • Uses an internal zero-allocation JSON writer to reduce memory usage.
  • Renamed --disable-internal-log to --disable-repository-log (controls saving blobs to the repository)
  • Added --disable-content-log (controls writing of content-log files)
  • All storage operations are also logged in a structured way and associated with the corresponding spans.
  • All content IDs are logged in a truncated format (the first N bytes are usually enough to be unique) to improve compressibility of logs (blob IDs are frequently repeated, but content IDs usually appear just once).
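To make the zero-allocation idea concrete, here is a minimal Go sketch of a writer that appends JSON directly into a reusable byte buffer. It is not the writer from this PR: the lineWriter type and its methods are hypothetical, and it assumes input strings are valid UTF-8. Once the buffer has grown to fit a typical record, building further records should need no new heap allocations.

```go
package main

import (
	"fmt"
	"strconv"
)

// lineWriter builds one JSON object at a time in a reusable buffer.
// Hypothetical type for illustration only.
type lineWriter struct {
	buf   []byte
	first bool
}

// begin resets the buffer (keeping its capacity) and opens the object.
func (w *lineWriter) begin() {
	w.buf = append(w.buf[:0], '{')
	w.first = true
}

// key writes a comma separator if needed, then the quoted key and a colon.
func (w *lineWriter) key(k string) {
	if !w.first {
		w.buf = append(w.buf, ',')
	}
	w.first = false
	w.appendString(k)
	w.buf = append(w.buf, ':')
}

// appendString writes s as a JSON string, escaping quotes, backslashes
// and control characters. Other bytes are copied unchanged.
func (w *lineWriter) appendString(s string) {
	const hexDigits = "0123456789abcdef"
	w.buf = append(w.buf, '"')
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case c == '"' || c == '\\':
			w.buf = append(w.buf, '\\', c)
		case c < 0x20:
			w.buf = append(w.buf, '\\', 'u', '0', '0', hexDigits[c>>4], hexDigits[c&0xf])
		default:
			w.buf = append(w.buf, c)
		}
	}
	w.buf = append(w.buf, '"')
}

func (w *lineWriter) stringField(k, v string) {
	w.key(k)
	w.appendString(v)
}

func (w *lineWriter) intField(k string, v int64) {
	w.key(k)
	w.buf = strconv.AppendInt(w.buf, v, 10)
}

// end closes the object; the returned slice is only valid until begin.
func (w *lineWriter) end() []byte {
	w.buf = append(w.buf, '}')
	return w.buf
}

func main() {
	w := &lineWriter{buf: make([]byte, 0, 256)}
	w.begin()
	w.stringField("m", "wrote pack")
	w.intField("size", 1234)
	fmt.Println(string(w.end()))
}
```

The trade-off in a design like this is that the caller must finish using the returned bytes before starting the next record, since the buffer is reused.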

This format should make it possible to recreate the journey of any single content throughout pack blobs, indexes and compaction events.

codecov bot commented Sep 15, 2025

Codecov Report

❌ Patch coverage is 88.60870% with 131 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.52%. Comparing base (cb455c6) to head (3aed1b3).
⚠️ Report is 669 commits behind head on master.

Files with missing lines                          Patch %   Lines
repo/content/indexblob/index_blob_encryption.go   64.86%    12 Missing and 1 partial ⚠️
repo/content/content_prefetch.go                  0.00%     12 Missing ⚠️
internal/server/server.go                         45.00%    11 Missing ⚠️
repo/content/indexblob/index_blob_manager_v0.go   83.87%    10 Missing ⚠️
repo/content/committed_read_manager.go            66.66%    9 Missing ⚠️
repo/maintenance/content_rewrite.go               74.28%    9 Missing ⚠️
internal/server/grpc_session.go                   77.41%    7 Missing ⚠️
repo/maintenance/blob_retain.go                   58.82%    7 Missing ⚠️
repo/content/committed_content_index.go           53.84%    6 Missing ⚠️
repo/maintenance/blob_gc.go                       78.57%    6 Missing ⚠️
... and 15 more
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4822      +/-   ##
==========================================
+ Coverage   75.86%   76.52%   +0.65%     
==========================================
  Files         470      536      +66     
  Lines       37301    41167    +3866     
==========================================
+ Hits        28299    31502    +3203     
- Misses       7071     7617     +546     
- Partials     1931     2048     +117     

@jkowalski jkowalski marked this pull request as ready for review September 16, 2025 14:37
Copilot AI review requested due to automatic review settings September 16, 2025 14:37
Copilot AI left a comment

Pull Request Overview

This pull request implements a complete rewrite of content logs to use JSON format with zero-allocation JSON building for improved memory efficiency. The change replaces the previous structured logging system with a custom JSON writer that avoids memory allocations during log operations.

Key changes include:

  • Implementation of a zero-allocation JSON writer for content logging
  • Replacement of zap-based logging with custom structured JSON logging
  • Addition of strongly-typed parameter system for log entries
  • Integration of content log writer throughout the storage and content management layers
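A strongly-typed parameter system of this kind could look roughly like the Go sketch below, where each parameter type knows how to append its own "key":value pair to a buffer. The Param interface and the Int64Param and BoolParam types are hypothetical names, not the types introduced by this PR, and the sketch assumes keys are plain ASCII identifiers that need no escaping.

```go
package main

import (
	"fmt"
	"strconv"
)

// Param is a hypothetical typed log parameter that appends
// its "key":value pair to buf and returns the extended buffer.
type Param interface {
	AppendTo(buf []byte) []byte
}

// Int64Param and BoolParam are illustrative parameter types.
type Int64Param struct {
	Key   string
	Value int64
}

type BoolParam struct {
	Key   string
	Value bool
}

// appendKey writes "key": assuming the key needs no JSON escaping.
func appendKey(buf []byte, key string) []byte {
	buf = append(buf, '"')
	buf = append(buf, key...)
	return append(buf, '"', ':')
}

func (p Int64Param) AppendTo(buf []byte) []byte {
	return strconv.AppendInt(appendKey(buf, p.Key), p.Value, 10)
}

func (p BoolParam) AppendTo(buf []byte) []byte {
	return strconv.AppendBool(appendKey(buf, p.Key), p.Value)
}

// appendParams writes the parameters as a single JSON object.
func appendParams(buf []byte, params ...Param) []byte {
	buf = append(buf, '{')
	for i, p := range params {
		if i > 0 {
			buf = append(buf, ',')
		}
		buf = p.AppendTo(buf)
	}
	return append(buf, '}')
}

func main() {
	out := appendParams(nil,
		Int64Param{Key: "size", Value: 42},
		BoolParam{Key: "deleted", Value: true},
	)
	fmt.Println(string(out))
}
```

The point of typing each parameter is that a value can be serialized without reflection or intermediate maps; whether a given implementation avoids allocations entirely depends on details this sketch does not try to reproduce.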

Reviewed Changes

Copilot reviewed 45 out of 45 changed files in this pull request and generated 3 comments.

File                   Description
internal/contentlog/   New package implementing zero-allocation JSON writer and structured logging
internal/blobparam/    New package providing blob-specific logging parameters
repo/content/          Updated to use new JSON-based content logging system
repo/blob/logging/     Modified to support both traditional and JSON logging outputs
internal/repodiag/     Simplified log manager to work with new JSON-based system
repo/open.go           Updated repository opening to pass content log writer
cli/                   Updated to handle content log writer configuration

@julio-lopez julio-lopez left a comment

A question and a bunch of nits

@julio-lopez

@jkowalski please provide a high-level description of the design of the new approach.

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@jkowalski jkowalski changed the title from "feat(general): rewrote content logs to always be JSON-based" to "feat(general): rewrote content logs to always be JSON-based and reorganized log structure" Sep 27, 2025

This is a breaking change to users who consume Kopia as a library.

Notable points:

- Used internal zero-allocation JSON writer for reduced memory
  usage.

- renamed --disable-internal-log to --disable-repository-log

- Maintenance also logs to repository now

- Cleaned up and reorganized a number of noisy logs.
Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@jkowalski jkowalski requested a review from Copilot September 27, 2025 23:41
Copilot AI left a comment

Pull Request Overview

Copilot reviewed 72 out of 72 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (3)

snapshot/snapshotgc/gc.go:1

  • The function toCountAndBytesString is removed but was being used in multiple places. The removal suggests the callers were updated to call Approximate() directly, which reduces code duplication and improves maintainability.
// Package snapshotgc implements garbage collection of contents that are no longer referenced through snapshots.

internal/contentlog/contentlog_json_writer.go:1

  • The variable hex is used without being declared or assigned. The code appears to be missing the line that assigns the result of strconv.AppendInt to hex.
// Package contentlog provides a JSON writer that can write JSON to a buffer

@jkowalski jkowalski merged commit 0f7253e into kopia:master Sep 28, 2025
28 checks passed