feat: backup usage stats by henryz78 · Pull Request #1632 · looplj/axonhub

henryz78 · 2026-05-10T05:19:25Z

Summary

include usage statistics in manual backup, restore, and auto-backup flows
add includeUsageStats to GraphQL inputs/settings and the backup settings UI
export usage requests and usage logs with project/channel/API-key reference metadata for restore
strip Ent edges from usage-stat backup JSON to avoid API key leakage and duplicated nested request data
only include API key values when IncludeAPIKeys is explicitly enabled; avoid restoring API key links from raw numeric IDs
avoid logging raw API key values when restore references cannot be resolved
restore usage stats with cached FK resolution, safe optional-reference handling, and actionable missing-reference warnings
deduplicate restored usage requests by stable fingerprints, including re-restore cases where API key values were omitted from the backup
reduce re-restore lookup amplification by deduplicating timestamp probes before querying existing requests
use each usage log’s own APIKeyID when adding optional API-key restore metadata
default auto backup to include usage stats, with backward-compatible settings parsing

Testing

git diff --check
Not run: Go tests / gofmt, because go and gofmt are not available in this local environment

gemini-code-assist

Code Review

This pull request adds support for backing up and restoring usage statistics, including usage requests and logs. The implementation includes UI toggles for manual and automatic backups, GraphQL schema updates, and backend logic for batched data processing and entity remapping. Feedback suggests optimizing memory usage by batching API key lookups during backup and refining the deduplication query during restore to handle potential performance issues with high-concurrency timestamps.

gemini-code-assist · 2026-05-10T05:25:31Z

+		apiKeys, err := svc.db.APIKey.Query().
+			Select(apikey.FieldID, apikey.FieldKey).
+			All(ctx)
+		if err != nil {
+			return nil, err
+		}
+
+		for _, ak := range apiKeys {
+			apiKeyKeys[ak.ID] = ak.Key
+		}
+	}


Querying all API keys at once to build a lookup map might lead to high memory consumption if the system has a very large number of API keys. Consider using a more memory-efficient approach or batching this query if the number of keys is expected to be significant.
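A minimal sketch of the batching idea, independent of Ent: key values are looked up in fixed-size chunks for a given set of IDs rather than by loading every key. `keyFetcher`, `lookupKeysInChunks`, and the sample data are hypothetical names for illustration, not code from this PR; `fetch` stands in for whatever database query returns keys for a list of IDs.

```go
package main

import "fmt"

// keyFetcher is a hypothetical stand-in for a database query that returns
// the key values for the given API key IDs.
type keyFetcher func(ids []int) (map[int]string, error)

// lookupKeysInChunks resolves key values for ids in chunks of at most
// chunkSize, so no single query has to load every key at once.
func lookupKeysInChunks(ids []int, chunkSize int, fetch keyFetcher) (map[int]string, error) {
	if chunkSize <= 0 {
		return nil, fmt.Errorf("chunkSize must be positive, got %d", chunkSize)
	}
	result := make(map[int]string, len(ids))
	for start := 0; start < len(ids); start += chunkSize {
		end := min(start+chunkSize, len(ids))
		found, err := fetch(ids[start:end])
		if err != nil {
			return nil, err
		}
		for id, key := range found {
			result[id] = key
		}
	}
	return result, nil
}

func main() {
	// In-memory data used only for this demonstration.
	store := map[int]string{1: "key-a", 2: "key-b", 3: "key-c"}
	calls := 0
	fetch := func(ids []int) (map[int]string, error) {
		calls++
		out := make(map[int]string)
		for _, id := range ids {
			if k, ok := store[id]; ok {
				out[id] = k
			}
		}
		return out, nil
	}
	keys, err := lookupKeysInChunks([]int{1, 2, 3}, 2, fetch)
	fmt.Println(keys, calls, err)
}
```

The result map still grows with the number of distinct IDs passed in, so peak memory is only bounded if the caller passes the IDs referenced by one batch of usage requests at a time.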

gemini-code-assist · 2026-05-10T05:25:31Z

+	for start := 0; start < len(createdAt); start += usageBackupBatchSize {
+		end := min(start+usageBackupBatchSize, len(createdAt))
+		requests, err := db.Request.Query().
+			Where(request.CreatedAtIn(createdAt[start:end]...)).
+			WithProject().
+			WithChannel().
+			WithAPIKey().
+			All(ctx)
+		if err != nil {
+			return nil, err
+		}
+
+		for _, req := range requests {
+			addExistingUsageRequest(lookup, req)
+		}
+	}


Querying existing requests by CreatedAt in batches of 500 timestamps might still return a large number of rows if many requests share the exact same timestamp (e.g., in high-concurrency scenarios). While CreatedAt usually has high precision, consider if additional filters (like ProjectID) could be added to the query to further narrow down the results and improve performance during restore.
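One possible shape for the narrower lookup, sketched without Ent: deduplicate the (project, timestamp) pairs first, then group the remaining timestamps by project so each query can combine a project filter with a timestamp filter. `usageProbe`, `probeKey`, `groupProbes`, and `at` are hypothetical names, and the sketch assumes each backup record carries a resolved project ID.

```go
package main

import (
	"fmt"
	"time"
)

// usageProbe is a hypothetical lookup request: one backup record's project
// and creation time.
type usageProbe struct {
	ProjectID int
	CreatedAt time.Time
}

// probeKey is the comparable form of a probe. UnixNano is used instead of
// time.Time so that equal instants compare equal regardless of location.
type probeKey struct {
	projectID int
	unixNano  int64
}

// groupProbes drops duplicate (project, instant) pairs and groups the
// remaining timestamps by project ID, preserving first-seen order.
func groupProbes(probes []usageProbe) map[int][]time.Time {
	seen := make(map[probeKey]struct{}, len(probes))
	grouped := make(map[int][]time.Time)
	for _, p := range probes {
		key := probeKey{projectID: p.ProjectID, unixNano: p.CreatedAt.UnixNano()}
		if _, dup := seen[key]; dup {
			continue
		}
		seen[key] = struct{}{}
		grouped[p.ProjectID] = append(grouped[p.ProjectID], p.CreatedAt)
	}
	return grouped
}

// at builds a UTC timestamp from Unix seconds (demo helper).
func at(sec int64) time.Time {
	return time.Unix(sec, 0).UTC()
}

func main() {
	probes := []usageProbe{
		{ProjectID: 1, CreatedAt: at(100)},
		{ProjectID: 1, CreatedAt: at(100)}, // duplicate probe, dropped
		{ProjectID: 1, CreatedAt: at(200)},
		{ProjectID: 2, CreatedAt: at(100)},
	}
	for projectID, stamps := range groupProbes(probes) {
		fmt.Println(projectID, len(stamps))
	}
}
```

Each per-project slice could then be queried in batches, in the same way as the timestamp batches in the diff above.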

greptile-apps · 2026-05-10T05:26:31Z

Greptile Summary

This PR adds usage statistics (requests and logs) to the manual backup, restore, and auto-backup flows, including a new includeUsageStats option wired through GraphQL, the business layer, and the UI. The implementation handles FK resolution by name with ID fallback, deduplicates restored records via both ID and stable fingerprint, redacts credentials from backup output unless explicitly opted in, and uses a backward-compatible *bool pointer trick for the auto-backup settings JSON field.

Backup: batched cursor pagination (500 at a time) exports usage requests with project/channel/credential metadata; custom MarshalJSON strips Ent edge structs to prevent nested data leakage.
Restore: usageRestoreResolver caches project, channel, and credential lookups; existingUsageRequests pre-fetches by ID and timestamp to detect duplicates before inserting; usage logs are bulk-inserted in batches after deduplication against existing and in-session records.
Settings: autoBackupSettingsJSON intermediate struct with *bool for IncludeUsageStats preserves backward compatibility; defaults to false for auto-backup and true for manual backup/restore.
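The `*bool` technique mentioned in the Settings item can be illustrated roughly as follows. This is a sketch under assumptions, not the code in internal/server/biz/system.go: the struct fields other than IncludeUsageStats, the JSON field names, and `parseAutoBackupSettings` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AutoBackupSettings is a simplified, hypothetical version of the settings
// used by the rest of the application.
type AutoBackupSettings struct {
	Enabled           bool
	IncludeUsageStats bool
}

// autoBackupSettingsJSON mirrors the stored JSON. The pointer distinguishes
// "field absent" (nil) from an explicit false written by a newer version.
type autoBackupSettingsJSON struct {
	Enabled           bool  `json:"enabled"`
	IncludeUsageStats *bool `json:"includeUsageStats,omitempty"`
}

// defaultIncludeUsageStats is applied when older stored settings lack the field.
const defaultIncludeUsageStats = false

// parseAutoBackupSettings decodes stored settings and fills in the default
// for IncludeUsageStats when the field is missing.
func parseAutoBackupSettings(data []byte) (AutoBackupSettings, error) {
	var raw autoBackupSettingsJSON
	if err := json.Unmarshal(data, &raw); err != nil {
		return AutoBackupSettings{}, err
	}
	settings := AutoBackupSettings{
		Enabled:           raw.Enabled,
		IncludeUsageStats: defaultIncludeUsageStats,
	}
	if raw.IncludeUsageStats != nil {
		settings.IncludeUsageStats = *raw.IncludeUsageStats
	}
	return settings, nil
}

func main() {
	oldFormat, err := parseAutoBackupSettings([]byte(`{"enabled":true}`))
	fmt.Println(oldFormat, err)
	newFormat, err := parseAutoBackupSettings([]byte(`{"enabled":true,"includeUsageStats":true}`))
	fmt.Println(newFormat, err)
}
```

Keeping the default in one constant means it can change without touching the parsing logic, and settings stored before the field existed still decode cleanly.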

Confidence Score: 5/5

Safe to merge. The backup and restore logic is well-structured with batched queries, proper dedup, and credential redaction.

The change introduces significant new code for backup and restore of usage data, but the core paths are covered by round-trip tests, deduplication logic is thorough (both ID and fingerprint-based), and credential handling is correctly guarded. The only finding is a silently-ignored omitzero JSON tag that has no runtime impact since the relevant timestamp fields are always populated for real records.

internal/server/backup/types.go has the misleading omitzero tag; internal/server/backup/restore.go carries the bulk of the new logic and would benefit from an eye on the transaction scope for large datasets.

Important Files Changed

Filename	Overview
internal/server/backup/restore.go	Adds full restore pipeline for usage requests and logs with FK resolution, deduplication by ID and fingerprint, batched DB queries, and safe optional-reference handling.
internal/server/backup/types.go	Introduces BackupUsageRequest/BackupUsageLog with custom MarshalJSON to strip Ent edges; uses non-standard omitzero tag (silently ignored by encoding/json) on time.Time fields.
internal/server/backup/backup_ops.go	Adds batched backup of usage requests and logs; redaction of sensitive key values works correctly via empty map fallback; switches to compact json.Marshal when usage stats are included.
internal/server/biz/system.go	Uses intermediate autoBackupSettingsJSON with *bool for IncludeUsageStats to safely handle backward-compatible JSON parsing of existing stored settings.
internal/server/backup/restore_test.go	Adds a round-trip restore test for usage stats covering token counts, cost, project/channel/API-key linkage.
internal/server/backup/backup_test.go	Adds backup test covering usage stats inclusion, credential redaction by default, and optional credential inclusion; verifies no Ent edge data leaks into JSON output.
frontend/src/features/system/components/backup-settings.tsx	Adds includeUsageStats toggle to manual backup, restore, and auto-backup forms; defaults to true for manual ops and false for auto-backup.
internal/server/gql/backup.graphql	Adds includeUsageStats to BackupOptionsInput, RestoreOptionsInput (default true), and AutoBackupSettings/UpdateAutoBackupSettingsInput.

Sequence Diagram

sequenceDiagram
    participant UI as BackupSettings UI
    participant GQL as GraphQL Resolver
    participant BackupSvc as backup.BackupService
    participant DB as Database

    Note over UI,DB: Backup Flow
    UI->>GQL: "backup(opts: {includeUsageStats: true})"
    GQL->>BackupSvc: BackupWithoutAuth(ctx, opts)
    BackupSvc->>DB: Request.Query (batched, cursor ID)
    DB-->>BackupSvc: []ent.Request (with Project, Channel edges)
    BackupSvc->>DB: UsageLog.Query (batched, cursor ID)
    DB-->>BackupSvc: []ent.UsageLog (with Project, Channel edges)
    BackupSvc-->>GQL: JSON (MarshalJSON strips edges, redacts keys)
    GQL-->>UI: backup file

    Note over UI,DB: Restore Flow
    UI->>GQL: "restore(data, opts: {includeUsageStats: true})"
    GQL->>BackupSvc: Restore(ctx, data, opts)
    BackupSvc->>DB: Load all projects, channels, keys (resolver cache)
    BackupSvc->>DB: existingUsageRequests (by ID + createdAt batches)
    DB-->>BackupSvc: existing requests for dedup
    loop For each backup request
        BackupSvc->>BackupSvc: resolve project/channel/key IDs
        BackupSvc->>BackupSvc: check byID + byFingerprint dedup
        BackupSvc->>DB: Request.Create (if new)
    end
    loop For each backup log (batched 500)
        BackupSvc->>BackupSvc: resolve requestIDMap + dedup checks
        BackupSvc->>DB: UsageLog.CreateBulk
    end
    BackupSvc-->>GQL: nil (success)

Reviews (4): Last reviewed commit: "Fix usage backup lint formatting"

greptile-apps · 2026-05-10T05:26:35Z

+		if includeAPIKeyValues {
+			query.WithAPIKey()
+		}


Discarded WithAPIKey() return value

query.WithAPIKey() is called without capturing its return value. Ent's With* methods use a pointer receiver (_q *RequestQuery) and mutate the query in-place, so the eager load is registered correctly — but the pattern is non-idiomatic and easy to misread as a no-op. It's also fragile: if a future Ent refactor changes With* to return a new query object rather than self, this silently stops loading API key edges and req.Edges.APIKey would always be nil for callers expecting key values.
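The more defensive pattern is to reassign the builder. The sketch below uses a hypothetical, hand-written `requestQuery` type rather than Ent's generated code, only to show that reassigning is correct whether a With* method mutates its receiver or returns a new builder.

```go
package main

import "fmt"

// requestQuery is a hypothetical, simplified stand-in for a generated query
// builder whose With* methods mutate the receiver and return it for chaining.
type requestQuery struct {
	eagerLoads []string
}

func (q *requestQuery) WithProject() *requestQuery {
	q.eagerLoads = append(q.eagerLoads, "project")
	return q
}

func (q *requestQuery) WithChannel() *requestQuery {
	q.eagerLoads = append(q.eagerLoads, "channel")
	return q
}

func (q *requestQuery) WithAPIKey() *requestQuery {
	q.eagerLoads = append(q.eagerLoads, "api_key")
	return q
}

// buildQuery registers the optional API key eager load by reassigning the
// builder, so the result does not depend on in-place mutation.
func buildQuery(includeAPIKeyValues bool) *requestQuery {
	query := (&requestQuery{}).WithProject().WithChannel()
	if includeAPIKeyValues {
		query = query.WithAPIKey()
	}
	return query
}

func main() {
	fmt.Println(buildQuery(false).eagerLoads)
	fmt.Println(buildQuery(true).eagerLoads)
}
```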

greptile-apps · 2026-05-10T05:26:36Z

+func (svc *BackupService) backupUsageRequests(ctx context.Context, includeAPIKeyValues bool) ([]*BackupUsageRequest, error) {
+	var usageRequestDataList []*BackupUsageRequest
+	lastID := 0
+
+	for {
+		query := svc.db.Request.Query().
+			Where(request.IDGT(lastID)).
+			Order(ent.Asc(request.FieldID)).
+			Limit(usageBackupBatchSize).
+			WithProject().
+			WithChannel()
+		if includeAPIKeyValues {
+			query.WithAPIKey()
+		}
+
+		usageRequests, err := query.All(ctx)
+		if err != nil {
+			return nil, err
+		}
+
+		if len(usageRequests) == 0 {
+			break
+		}
+
+		for _, req := range usageRequests {
+			usageRequestDataList = append(usageRequestDataList, backupUsageRequest(req, includeAPIKeyValues))
+			lastID = req.ID
+		}
+
+		if len(usageRequests) < usageBackupBatchSize {
+			break
+		}
+	}
+
+	return usageRequestDataList, nil


Full dataset accumulated in memory before serialization

backupUsageRequests and backupUsageLogs each batch-query from the DB (500 at a time) but accumulate every record into an in-memory slice before returning. For large deployments with millions of requests, both slices — plus the final json.Marshal(backupData) call that holds another representation — will coexist in heap. A streaming JSON approach (e.g. writing records as they are fetched and encoding directly to an io.Writer) would bound peak memory to roughly one batch rather than the full dataset.
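A rough sketch of the streaming approach described above, assuming the records can be written as a standalone JSON array. `usageRecord`, `batchFetcher`, `streamUsageRecords`, `sliceFetcher`, and `renderUsageRecords` are hypothetical names; `fetch` stands in for the cursor-paginated query, and the in-memory fetcher exists only to make the example runnable.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
)

// usageRecord is a hypothetical, trimmed-down backup record.
type usageRecord struct {
	ID     int   `json:"id"`
	Tokens int64 `json:"tokens"`
}

// batchFetcher stands in for a cursor query: it returns up to limit records
// with ID greater than afterID, ordered by ID.
type batchFetcher func(afterID, limit int) ([]usageRecord, error)

// streamUsageRecords writes all records as one JSON array while holding only
// a single batch in memory at a time.
func streamUsageRecords(w io.Writer, batchSize int, fetch batchFetcher) error {
	if batchSize <= 0 {
		return errors.New("batchSize must be positive")
	}
	if _, err := io.WriteString(w, "["); err != nil {
		return err
	}
	lastID, first := 0, true
	for {
		batch, err := fetch(lastID, batchSize)
		if err != nil {
			return err
		}
		for _, rec := range batch {
			if !first {
				if _, err := io.WriteString(w, ","); err != nil {
					return err
				}
			}
			first = false
			data, err := json.Marshal(rec)
			if err != nil {
				return err
			}
			if _, err := w.Write(data); err != nil {
				return err
			}
			lastID = rec.ID
		}
		if len(batch) < batchSize {
			break
		}
	}
	_, err := io.WriteString(w, "]")
	return err
}

// sliceFetcher serves batches from an in-memory slice sorted by ID (demo only).
func sliceFetcher(all []usageRecord) batchFetcher {
	return func(afterID, limit int) ([]usageRecord, error) {
		var out []usageRecord
		for _, rec := range all {
			if rec.ID > afterID && len(out) < limit {
				out = append(out, rec)
			}
		}
		return out, nil
	}
}

// renderUsageRecords streams into a buffer and returns the JSON text (demo only).
func renderUsageRecords(all []usageRecord, batchSize int) (string, error) {
	var buf bytes.Buffer
	err := streamUsageRecords(&buf, batchSize, sliceFetcher(all))
	return buf.String(), err
}

func main() {
	records := []usageRecord{{ID: 1, Tokens: 10}, {ID: 2, Tokens: 20}, {ID: 3, Tokens: 30}}
	out, err := renderUsageRecords(records, 2)
	fmt.Println(out, err)
}
```

In a real backup the writer would be the output file or HTTP response rather than a buffer; the buffer here is only for demonstration and would defeat the memory saving.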

greptile-apps · 2026-05-10T05:26:37Z

+			log.Warn(ctx, "API key not found for restoring usage request, restoring with null API key",
+				log.Int("request_id", oldID),
+			)
+		}
+
+		if existing, ok := existingRequests.byID[oldID]; ok {
+			if sameUsageRequest(existing, reqData, projectID, channelID, apiKeyID) {
+				idMap[oldID] = existing.ID
+				continue
+			}
+		}
+		if existing, ok := existingRequests.byFingerprint[usageRequestBackupFingerprint(reqData)]; ok {
+			idMap[oldID] = existing.ID
+			continue
+		}
+


sameUsageRequest compares ChannelID directly without accounting for failed resolution

When resolveChannelID cannot resolve the channel (returns ok=false), channelID is 0. The comparison existing.ChannelID == channelID then only matches an existing request whose DB channel is already null. If a request from the first restore was written with a valid channel ID (because the channel existed then but was later deleted), this check returns false, the fingerprint check also fails (channel name differs), and the re-restore inserts a duplicate record. This is an edge case only triggered when a referenced channel disappears between the original restore and a re-restore.

looplj · 2026-05-10T10:13:40Z

The usage table may be too large; backing it up every day probably isn't appropriate.

henryz78 · 2026-05-10T17:38:41Z

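Good point. Usage stats can get very large, so I've changed auto backup to exclude usage stats by default, which avoids generating oversized backup files every day.

The option is still available for users to enable manually, though, because some users do need to back up usage statistics: for example when migrating an instance, for disaster recovery, or to keep historical request volume, token usage, cost statistics, and usage logs, so that the statistics pages don't lose all their data after a restore.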

looplj · 2026-05-11T13:18:34Z

CI failed.

henryz78 · 2026-05-12T01:16:41Z

Sorry, I missed an import. It's fixed in the latest commit; waiting for workflow approval to re-run the checks.

henryz78 · 2026-05-12T03:11:03Z

The only remaining CI failure was a lint formatting issue. I've fixed it and pushed, so it just needs your approval to re-run the workflow.

HenryZ-0302 added 12 commits May 9, 2026 18:59

Include usage stats in backups

9de93bf

Address usage stats backup review feedback

67143fd

Avoid leaking API keys in usage backups

5f7bc50

Tighten usage stats restore safeguards

f573e89

Clarify usage backup API key restore behavior

49c4479

Remove unused usage restore metadata

cab9100

Reduce API key restore warning noise

e6d85cc

Deduplicate restored usage requests by fingerprint

9b8f7b6

Align usage backup restore metadata

a7d5dcf

Fix usage restore dedup without API keys

9ca6c7b

Clarify usage restore API key warnings

952aef2

Avoid logging usage restore credentials

048f65b

gemini-code-assist Bot reviewed May 10, 2026


greptile-apps Bot reviewed May 10, 2026


Default auto backup usage stats off

e8af3aa

looplj changed the title from "Fix backup usage stats" to "feat: backup usage stats" May 11, 2026

Fix usage restore request import

79a0356

Fix usage backup lint formatting

dd5188d

looplj merged commit 19d1101 into looplj:unstable May 12, 2026
4 checks passed

looplj merged 15 commits into looplj:unstable from henryz78:fix-backup-usage-stats
