Skip to content

Conversation

@Lyndon-Li
Copy link
Contributor

Maintenance is critical for healthy of the repository.
On the other hand, Maintenance is complex, because it runs multiple sub tasks each may generate different results according to the maintenance policy. The results may include deleting/combining/adding data/metadata to the repository.

It is worthy to add more observability for these tasks for below reasons:

  1. It is helpful for troubleshooting. Any data change to the repository is critical, the observability info helps to understand what happened during the maintenance and why that happened
  2. It is helpful for users to understand/predict the repo's behavior. The repo data may be stored in a public cloud for which costs are sensitive to scale/duration of data stored. On the other hand, repository has its own policy to manage the data, so the data is not deleted until it is safe enough according to the policy. The observability info helps users to understand how much data is in-use, how much data is out of use and when it is deleted

There will be a serial of PRs to add observability info for each sub task.

The current PR adds the interfaces, structures and workflows to save, retrieve and display the statistics info for maintenance tasks.

@Lyndon-Li Lyndon-Li force-pushed the increase-maintenance-observability-stats branch from 7dd92ba to f6f6841 Compare October 20, 2025 11:20
Copy link
Collaborator

@julio-lopez julio-lopez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rename fields to Extra and Data.
Add marshaling unit tests

@julio-lopez
Copy link
Collaborator

As soon as feedback is addressed, CI can run and this PR can be merged afterwards

@Lyndon-Li Lyndon-Li force-pushed the increase-maintenance-observability-stats branch 2 times, most recently from 7f0ba55 to 6cff24e Compare October 22, 2025 05:42
@Lyndon-Li Lyndon-Li force-pushed the increase-maintenance-observability-stats branch from 6cff24e to b83d95d Compare October 22, 2025 05:44
@Lyndon-Li
Copy link
Contributor Author

@julio-lopez
I have addressed all the comments.

However, for "Add marshaling unit tests", I cannot add UT for the current PR.
The testing code are more like end to end function test instead of UT. So for example, for TestMaintenanceInfoSimple:

  • I cannot control the the retrieved maintenance info, so cannot make sure the test covers every path of the changed code
  • It is in a separate package cli_test, I cannot make a UT to test a private function getMessageFromRun in cli package

Which way do you prefer? And can you give some advice?

Comment on lines +17 to +22
type Summarizer interface {
Summary() string
}

// Kind defines the methods for detecting kind of a maintenance statistics.
type Kind interface {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No change requested, only a comment: it seems that these did not need to be separated, what's the rationale for separating them?
Anyway, we can merge as is and roll the ball forward.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems that these did not need to be separated, what's the rationale for separating them?

I just feel that they don't need to be from the same implementation --- the task functions who generate the statistics structs don't need to know what Summary mean, this Summarizer should not appear in these functions' signature (return value).

@codecov
Copy link

codecov bot commented Oct 23, 2025

Codecov Report

❌ Patch coverage is 54.76190% with 38 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.73%. Comparing base (cb455c6) to head (24378f3).
⚠️ Report is 690 commits behind head on master.

Files with missing lines Patch % Lines
repo/maintenancestats/types.go 0.00% 11 Missing ⚠️
cli/command_maintenance_info.go 50.00% 10 Missing ⚠️
repo/maintenance/content_index_to_pack_check.go 10.00% 9 Missing ⚠️
repo/maintenance/maintenance_schedule.go 53.84% 6 Missing ⚠️
repo/maintenance/maintenance_run.go 92.30% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4890      +/-   ##
==========================================
+ Coverage   75.86%   76.73%   +0.86%     
==========================================
  Files         470      536      +66     
  Lines       37301    40981    +3680     
==========================================
+ Hits        28299    31446    +3147     
- Misses       7071     7489     +418     
- Partials     1931     2046     +115     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@julio-lopez
Copy link
Collaborator

julio-lopez commented Oct 23, 2025

However, for "Add marshaling unit tests", I cannot add UT for the current PR. The testing code are more like end to end function test instead of UT. So for example, for TestMaintenanceInfoSimple:

  • I cannot control the the retrieved maintenance info, so cannot make sure the test covers every path of the changed code
  • It is in a separate package cli_test, I cannot make a UT to test a private function getMessageFromRun in cli package

Which way do you prefer? And can you give some advice?

The suggestion is to have simple tests that ensure the struct marshaling/unmarshaling "round-trips", that is:

  • instantiate the struct;
  • marshal it to JSON, it should not error out;
  • then unmarshal the resulting JSON back into a separate instance of the struct;
  • ensure that the values in the unmarshaled struct match the expected values.

These tests are simple, they DO NOT require a complex setup such to run maintenance, forcing it to create a schedule, etc.

return succeed
}

extraStr := ""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
extraStr := ""
extraStr := ""

}

// BuildFromExtra builds maintenance statistics from an Extra and returns a Summarizer.
func BuildFromExtra(extra Extra) (Summarizer, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func BuildFromExtra(extra Extra) (Summarizer, error) {
func BuildFromExtra(_ Extra) (Summarizer, error) {

func BuildExtra(stats Kind) (Extra, error) {
bytes, err := json.Marshal(stats)
if err != nil {
return Extra{}, errors.Wrapf(err, "error marshalling stats %v", stats)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return Extra{}, errors.Wrapf(err, "error marshalling stats %v", stats)
return Extra{}, errors.Wrapf(err, "error marshaling stats %v", stats)

@julio-lopez julio-lopez merged commit 657db97 into kopia:master Oct 23, 2025
20 of 23 checks passed
@julio-lopez
Copy link
Collaborator

julio-lopez commented Oct 23, 2025

Merging as is, I'll take care of the linter issues in a separate PR.

@julio-lopez
Copy link
Collaborator

For the PRs to follow, please run the linter and other CI jobs in your computer, or in your repository so you catch any test and linter errors early. This reduces the number of back-and-forth and overall speeds up the PR review process and thus, merging the changes.

@Lyndon-Li
Copy link
Contributor Author

However, for "Add marshaling unit tests", I cannot add UT for the current PR. The testing code are more like end to end function test instead of UT. So for example, for TestMaintenanceInfoSimple:

  • I cannot control the the retrieved maintenance info, so cannot make sure the test covers every path of the changed code
  • It is in a separate package cli_test, I cannot make a UT to test a private function getMessageFromRun in cli package

Which way do you prefer? And can you give some advice?

The suggestion is to have simple tests that ensure the struct marshaling/unmarshaling "round-trips", that is:

  • instantiate the struct;
  • marshal it to JSON, it should not error out;
  • then unmarshal the resulting JSON back into a separate instance of the struct;
  • ensure that the values in the unmarshaled struct match the expected values.

These tests are simple, they DO NOT require a complex setup such to run maintenance, forcing it to create a schedule, etc.

Sure, so it is more like a UT. Will add it in the next PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants