Skip to content

Conversation

@Lyndon-Li
Copy link
Contributor

@Lyndon-Li Lyndon-Li commented Oct 23, 2025

Maintenance is critical for healthy of the repository.
On the other hand, Maintenance is complex, because it runs multiple sub tasks each may generate different results according to the maintenance policy. The results may include deleting/combining/adding data/metadata to the repository.

It is worthy to add more observability for these tasks for below reasons:

  1. It is helpful for troubleshooting. Any data change to the repository is critical, the observability info helps to understand what happened during the maintenance and why that happened
  2. It is helpful for users to understand/predict the repo's behavior. The repo data may be stored in a public cloud for which costs are sensitive to scale/duration of data stored. On the other hand, repository has its own policy to manage the data, so the data is not deleted until it is safe enough according to the policy. The observability info helps users to understand how much data is in-use, how much data is out of use and when it is deleted

There will be a serial of PRs to add observability info for each sub task.

The current PR add the stats info for CleanupMarkers sub task.

@Lyndon-Li Lyndon-Li force-pushed the clean-up-markers-stats branch 2 times, most recently from 7dc6b32 to 3be87e1 Compare October 23, 2025 10:51
@Lyndon-Li Lyndon-Li force-pushed the clean-up-markers-stats branch from 3be87e1 to d6efee9 Compare October 23, 2025 11:14
@Lyndon-Li Lyndon-Li marked this pull request as ready for review October 23, 2025 11:30
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds observability statistics for the CleanupMarkers maintenance task to track the number of epoch markers and deletion watermarks removed during cleanup operations.

Key Changes:

  • Introduced CleanupMarkersStats struct to capture deletion counts for epoch markers and watermarks
  • Modified CleanupMarkers function to return statistics instead of just error
  • Added helper functions BuildExtra and BuildFromExtra for serializing/deserializing stats

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
repo/maintenancestats/types.go Implements CleanupMarkersStats type with serialization, summary generation, and builder functions
repo/maintenancestats/types_test.go Adds comprehensive test coverage for stats serialization/deserialization
internal/epoch/epoch_manager.go Updates CleanupMarkers and helper functions to track and return deletion counts
internal/epoch/epoch_manager_test.go Updates tests to verify returned statistics
repo/maintenance/maintenance_run.go Captures and propagates stats from CleanupMarkers

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines 71 to 72
DeletedEpochMarkerBlobs int `json:"deletedEpochMarkerBlobs"`
DeletedDeletionWaterMarkBlobs int `json:"deletedDeletionWaterMarkBlobs"`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
DeletedEpochMarkerBlobs int `json:"deletedEpochMarkerBlobs"`
DeletedDeletionWaterMarkBlobs int `json:"deletedDeletionWaterMarkBlobs"`
DeletedEpochMarkerBlobCount int `json:"deletedEpochMarkerBlobCount"`
DeletedWatermarkBlobCount int `json:"deletedWatermarkBlobCount"`

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DeletedWatermarkBlobCount was not renamed correctly

@julio-lopez
Copy link
Collaborator

@codecov
Copy link

codecov bot commented Oct 24, 2025

Codecov Report

❌ Patch coverage is 80.43478% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.04%. Comparing base (cb455c6) to head (a1ad8ab).
⚠️ Report is 703 commits behind head on master.

Files with missing lines Patch % Lines
repo/maintenancestats/stats_cleanup_markers.go 44.44% 5 Missing ⚠️
internal/epoch/epoch_manager.go 82.60% 3 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4900      +/-   ##
==========================================
+ Coverage   75.86%   78.04%   +2.17%     
==========================================
  Files         470      538      +68     
  Lines       37301    31235    -6066     
==========================================
- Hits        28299    24376    -3923     
+ Misses       7071     4818    -2253     
- Partials     1931     2041     +110     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@Lyndon-Li Lyndon-Li force-pushed the clean-up-markers-stats branch from 2b4a140 to 449932a Compare October 24, 2025 05:34
@Lyndon-Li Lyndon-Li requested a review from julio-lopez October 24, 2025 05:37
@Lyndon-Li Lyndon-Li force-pushed the clean-up-markers-stats branch from 449932a to c5d74ac Compare October 24, 2025 08:46
@Lyndon-Li Lyndon-Li requested a review from julio-lopez October 24, 2025 09:14
// CleanupMarkersStats are the stats for cleaning up markers.
type CleanupMarkersStats struct {
DeletedEpochMarkerBlobCount int `json:"deletedEpochMarkerBlobCount"`
DeletedDeletionWaterMarkBlobCount int `json:"deletedDeletionWaterMarkBlobCount"`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a prior comment, this can be shortened:

Suggested change
DeletedDeletionWaterMarkBlobCount int `json:"deletedDeletionWaterMarkBlobCount"`
DeletedWaterMarkBlobCount int `json:"deletedWaterMarkBlobCount"`

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's merge this PR as is. A followup PR will address the naming of these fields.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@julio-lopez julio-lopez merged commit e3e4e09 into kopia:master Oct 24, 2025
23 checks passed
julio-lopez added a commit that referenced this pull request Oct 24, 2025
Fixes field names in CleanupMarkersStats persistent struct.
Also renames variables for consistency.

- Followup fix for: #4900 add stats to maintenance run - CleanupMarkers 
- Ref: #4848
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants