-
Notifications
You must be signed in to change notification settings - Fork 594
feat(general): rewrote content logs to always be JSON-based and reorganized log structure #4822
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
0637401 to
5411427
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #4822 +/- ##
==========================================
+ Coverage 75.86% 76.52% +0.65%
==========================================
Files 470 536 +66
Lines 37301 41167 +3866
==========================================
+ Hits 28299 31502 +3203
- Misses 7071 7617 +546
- Partials 1931 2048 +117 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This pull request implements a complete rewrite of content logs to use JSON format with zero-allocation JSON building for improved memory efficiency. The change replaces the previous structured logging system with a custom JSON writer that avoids memory allocations during log operations.
Key changes include:
- Implementation of a zero-allocation JSON writer for content logging
- Replacement of zap-based logging with custom structured JSON logging
- Addition of strongly-typed parameter system for log entries
- Integration of content log writer throughout the storage and content management layers
Reviewed Changes
Copilot reviewed 45 out of 45 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/contentlog/ | New package implementing zero-allocation JSON writer and structured logging |
| internal/blobparam/ | New package providing blob-specific logging parameters |
| repo/content/ | Updated to use new JSON-based content logging system |
| repo/blob/logging/ | Modified to support both traditional and JSON logging outputs |
| internal/repodiag/ | Simplified log manager to work with new JSON-based system |
| repo/open.go | Updated repository opening to pass content log writer |
| cli/ | Updated to handle content log writer configuration |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
julio-lopez
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A question and a bunch of nits
e1d3116 to
f336bab
Compare
|
@jkowalski please provide high-level description of the design of the new approach. |
f336bab to
6e116c6
Compare
6e116c6 to
48dd126
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
This is a breaking change to users who consume Kopia as a library. Notable points: - Used internal exactly-no-allocations JSON writer for reduced memory usage. - renamed --disable-internal-log to --disable-repository-log - Maintenance also logs to repository now - Cleaned up and reorganized a number of noisy logs.
48dd126 to
93beda0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 72 out of 72 changed files in this pull request and generated no new comments.
Comments suppressed due to low confidence (3)
snapshot/snapshotgc/gc.go:1
- The function
toCountAndBytesStringis removed but was being used in multiple places. The removal suggests the callers were updated to callApproximate()directly, which reduces code duplication and improves maintainability.
// Package snapshotgc implements garbage collection of contents that are no longer referenced through snapshots.
internal/contentlog/contentlog_json_writer.go:1
- The variable
hexis used without being declared or assigned. The code appears to be missing the line that assigns the result ofstrconv.AppendInttohex.
// Package contentlog provides a JSON writer that can write JSON to a buffer
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
This is a breaking change to users who might be using Kopia as a library.
Log Format
{"t":"<timestamp-rfc-3389-microseconds>", "span:T1":"V1", "span:T2":"V2", "n":"<source>", "m":"<message>", /*parameters*/}Where each record is associated with one or more spans that describe its scope:
"span:client": "<hash-of-username@hostname>""span:repo": "<random>"- random identifier of a repository connection (fromrepo.Open)"span:maintenance": "<random>"- random identifier of a maintenance session"span:upload": "<hash-of-username@host:/path>"- uniquely identifies upload session of a given directory"span:checkpoint": "<random>"- encapsulates each checkpoint operation during Upload"span:server-session": "<random>"-single client connection to the server"span:flush": "<random>"- encapsulates each Flush session"span:maintenance": "<random>"- encapsulates each maintenance operation"span:loadIndex" : "<random>"- encapsulates index loading operation"span:emr" : "<random>"- encapsulates epoch manager refresh"span:writePack": "<pack-blob-ID>"- encapsulates pack blob preparation and writing(plus additional minor spans for various phases of the maintenance).
Notable points:
--disable-internal-logto--disable-repository-log(controls saving blobs to repository)--disable-content-log(controls writing ofcontent-logfiles)This format should make it possible to recreate the journey of any single content throughout pack blobs, indexes and compaction events.