Skip to content

Conversation

@tolgaozen
Copy link
Member

@tolgaozen tolgaozen commented Sep 8, 2025

…e adjustments

Summary by CodeRabbit

  • New Features
    • Added a CLI repair command to safely advance PostgreSQL transaction IDs. Supports dry-run, batching, retries, verbosity, and requires a database URI.
  • Documentation
    • Introduced “Operations > Repair” guide with usage, examples, safety notes, and troubleshooting. Navigation updated to include the new page.
  • Bug Fixes
    • Improved PostgreSQL stability around transaction ID wraparound via safer active/expired record handling and a migration to update default sentinel values.
  • Chores
    • Added a package manifest for the documentation site.

@coderabbitai
Copy link

coderabbitai bot commented Sep 8, 2025

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  - Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.
  - Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch update-pguint64-xid8-feature

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)
pkg/database/postgres/pguint64.go (3)

93-107: Fix DecodeBinary: 64-bit length check is wrong (expects 8 bytes, not 4).

This will mis-parse or panic when src is 8 bytes. Update the guard to 8 and tighten the error.

Apply:

- if len(src) != 4 {
-   return fmt.Errorf("invalid length: %v", len(src))
- }
+ if len(src) != 8 {
+   return fmt.Errorf("invalid length for pguint64 binary: %d (want 8)", len(src))
+ }

161-171: Avoid uint64→int64 overflow in Value; return string when out-of-range.

Returning int64 overflows for values > MaxInt64 and silently corrupts data.

Apply:

 func (p pguint64) Value() (driver.Value, error) {
   switch p.Status {
   case pgtype.Present:
-    return int64(p.Uint), nil
+    if p.Uint > math.MaxInt64 {
+      return strconv.FormatUint(p.Uint, 10), nil
+    }
+    return int64(p.Uint), nil
   case pgtype.Null:
     return nil, nil
   default:
     return nil, errors.New("encode text status undefined status")
   }
 }

Add import:

import "math"

133-159: Reject negative int64 in Scan to prevent wrap-around.

Negative int64 currently converts to huge uint64.

Apply:

 case int64:
-  *p = pguint64{Uint: uint64(src), Status: pgtype.Present}
+  if src < 0 {
+    return fmt.Errorf("%d is less than minimum value for pguint64", src)
+  }
+  *p = pguint64{Uint: uint64(src), Status: pgtype.Present}
internal/storage/postgres/dataWriter.go (3)

215-224: Exec error gets overwritten by Close; can return nil on failure

On Exec failure, you overwrite err with Close()’s result, potentially returning nil and hiding the real error.

Apply:

 for i := 0; i < batch.Len(); i++ {
-		_, err = batchResult.Exec()
-		if err != nil {
-			err = batchResult.Close()
-			if err != nil {
-				return nil, err
-			}
-			return nil, err
-		}
+		_, execErr := batchResult.Exec()
+		if execErr != nil {
+			if closeErr := batchResult.Close(); closeErr != nil {
+				return nil, closeErr
+			}
+			return nil, execErr
+		}
 }

356-364: Same Exec/Close error handling bug in runBundle

Duplicate of the bug above in runBundle’s batch execution loop.

Apply:

 for i := 0; i < batch.Len(); i++ {
-		_, err = batchResult.Exec()
-		if err != nil {
-			err = batchResult.Close()
-			if err != nil {
-				return nil, err
-			}
-			return nil, err
-		}
+		_, execErr := batchResult.Exec()
+		if execErr != nil {
+			if closeErr := batchResult.Close(); closeErr != nil {
+				return nil, closeErr
+			}
+			return nil, execErr
+		}
 }

54-55: Fix slog args: don’t mix Attrs with stray strings

Current calls mix Attrs with bare strings, producing malformed key/value pairs. Use either Attrs only or pure key-value pairs.

Apply examples:

-	slog.DebugContext(ctx, "writing data for tenant_id", slog.String("tenant_id", tenantID), "max retries", slog.Any("max_retries", w.database.GetMaxRetries()))
+	slog.DebugContext(ctx, "writing data for tenant_id", slog.String("tenant_id", tenantID), slog.Int("max_retries", w.database.GetMaxRetries()))
-	slog.DebugContext(ctx, "deleting data for tenant_id", slog.String("tenant_id", tenantID), "max retries", slog.Any("max_retries", w.database.GetMaxRetries()))
+	slog.DebugContext(ctx, "deleting data for tenant_id", slog.String("tenant_id", tenantID), slog.Int("max_retries", w.database.GetMaxRetries()))
-	slog.DebugContext(ctx, "running bundle for tenant_id", slog.String("tenant_id", tenantID), "max retries", slog.Any("max_retries", w.database.GetMaxRetries()))
+	slog.DebugContext(ctx, "running bundle for tenant_id", slog.String("tenant_id", tenantID), slog.Int("max_retries", w.database.GetMaxRetries()))
-	slog.DebugContext(ctx, "retrieved transaction", slog.Any("xid", xid), "for tenant", slog.Any("tenant_id", tenantID))
+	slog.DebugContext(ctx, "retrieved transaction", slog.Any("xid", xid), slog.String("tenant_id", tenantID))

(Apply the last pattern wherever “retrieved transaction” appears.)

Also applies to: 99-100, 139-140, 190-191, 264-265, 339-340

🧹 Nitpick comments (17)
pkg/database/postgres/pguint64.go (1)

121-129: Nit: wrong error text in EncodeBinary.

Message says “encode text” in a binary encoder.

Apply:

 case pgtype.Undefined:
-  return nil, errors.New("encode text status undefined status")
+  return nil, errors.New("encode binary status undefined status")
internal/storage/postgres/watch_test.go (1)

151-156: Brittle assertion on exact error text.

Tests matching “encode text status undefined status” couple to an internal message from Value(); any cleanup will break them. Prefer checking error type/category or substring from your error mapping layer.

Also applies to: 174-180

internal/storage/postgres/migrations/20250913120000_fix_xid_wraparound.sql (1)

1-19: Migration does a full-table UPDATE; consider batching to reduce locks and bloat.

On large tables this can be heavy. Optional: update in chunks (ctid or id ranges) and VACUUM after.

pkg/database/postgres/pguint64_test.go (3)

63-84: Add overflow/large-input DecodeText case

Include a DecodeText case that attempts to parse a value > math.MaxUint64 to ensure it errors predictably.

Example to add:

It("Case 4: Invalid input - overflow", func() {
	var p pguint64
	err := p.DecodeText(nil, []byte("18446744073709551616")) // MaxUint64 + 1
	Expect(err).Should(HaveOccurred())
})

109-147: Scan negative int64 should error

Mirror the Set(-1) behavior by asserting that Scan(int64(-1)) fails with an informative error.

Example to add:

It("Case 6: Error with negative int64", func() {
	var p pguint64
	err := p.Scan(int64(-1))
	Expect(err).Should(HaveOccurred())
})

149-170: Boundary test for Value/Present near MaxInt64

Given Value() returns int64, add a Present case at math.MaxInt64 to verify no truncation and to document the chosen sentinel domain.

Example to add:

It("Case 4: Present at MaxInt64 boundary", func() {
	p := pguint64{Uint: uint64(math.MaxInt64), Status: pgtype.Present}
	v, err := p.Value()
	Expect(err).ShouldNot(HaveOccurred())
	Expect(v).Should(Equal(int64(math.MaxInt64)))
})
internal/storage/postgres/dataWriter.go (1)

437-439: Prefer table name constants for inserts

Hardcoding "relation_tuples"/"attributes" risks drift with consts.go. Consider using the constants for inserts too.

Example:

// e.g., fmt.Sprintf("INSERT INTO %s (...", RelationTuplesTable)

If switching, add fmt to imports.

Also applies to: 501-503

pkg/database/postgres/xid8_test.go (2)

74-88: Add EncodeText Undefined-status error case

Parity with pguint64 tests helps ensure consistent behavior.

Example to add:

It("Case 3: Undefined status", func() {
	x := XID8{Status: pgtype.Undefined}
	_, err := x.EncodeText(nil, nil)
	Expect(err).Should(HaveOccurred())
})

123-137: Add Value Undefined-status error case

Covers final missing branch.

Example to add:

It("Case 3: Undefined status", func() {
	x := XID8{Status: pgtype.Undefined}
	_, err := x.Value()
	Expect(err).Should(HaveOccurred())
})
pkg/database/postgres/postgres_test.go (3)

104-110: Avoid asserting panics for Close()

Panic on Close with nil pools is brittle; prefer a no-op or clear error. Consider making Close nil-safe and updating the test accordingly.


23-27: Brittle struct-shape assertions on XID8

Asserting fields (Uint, Status) ties tests to internal representation. Prefer compile-time interface/behavior checks (e.g., that XID8 satisfies required interfaces) or method-level behavior tests.


123-137: Hard-coding internal map keys is fragile

Validating exact keys in queryExecModes/planCacheModes may fail on harmless internal changes. Consider testing presence of only the modes your code actually uses, or validating via exported accessors.

Also applies to: 141-155

internal/storage/postgres/utils/common.go (3)

168-175: Backoff: prefer NewTimer over time.After to avoid timer leaks

When ctx cancels first, time.After’s timer isn’t stopped. Use NewTimer.

- select {
- case <-time.After(nextBackoff):
- case <-ctx.Done():
- }
+ timer := time.NewTimer(nextBackoff)
+ defer timer.Stop()
+ select {
+ case <-timer.C:
+ case <-ctx.Done():
+ }

Also applies to: 176-181


186-193: Comment/code mismatch in secureRandomFloat64

Comment says “bit shifting,” code uses division. Either adjust comment or implement shift-based scaling.

Comment-only fix:

- // Use bit shifting instead of division for better performance
- return float64(binary.BigEndian.Uint64(b[:])) / (1 << 63) / 2.0
+ // Scale to [0,1) using a 64-bit denominator
+ return float64(binary.BigEndian.Uint64(b[:])) / (1 << 63) / 2.0

Or implement shift-based normalization if you prefer.


28-31: De-duplicate sentinel source of truth

ActiveRecordTxnID and MaxXID8Value encode the same magnitude twice. Derive the SQL literal from the uint64 to prevent drift.

- ActiveRecordTxnID = uint64(9223372036854775807)
- MaxXID8Value      = "'9223372036854775807'::xid8"
+ ActiveRecordTxnID = uint64(9223372036854775807)
+ // Note: var (not const) to derive from ActiveRecordTxnID
+)
+
+var MaxXID8Value = fmt.Sprintf("'%d'::xid8", ActiveRecordTxnID)
pkg/database/postgres/xid8.go (2)

12-15: pgSnapshot/interface and codec usage unclear

  • Embedding pgtype.Value doesn’t make pgSnapshot implement pgtype.Value (see failing test).
  • The standalone codec types only expose FormatCode; ensure they’re actually used by pgtype/pgx, or remove them.

Options:

  • If a concrete pgx/pgtype snapshot type exists, alias it:
- type pgSnapshot struct {
-   pgtype.Value
- }
+ // If available in your pgtype version:
+ type pgSnapshot = pgtype.Snapshot
  • Or drop pgSnapshot entirely and the test asserting its conformance.

Also applies to: 17-22, 68-76


23-66: Forwarders on XID8 look fine; consider compile-time interface checks

Add static assertions to catch regressions if pguint64 changes.

// Optional (in this file):
var _ driver.Valuer = XID8{}
var _ interface{ Scan(interface{}) error } = (*XID8)(nil)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 836f275 and 571b1f3.

📒 Files selected for processing (16)
  • internal/storage/postgres/dataReader.go (3 hunks)
  • internal/storage/postgres/dataWriter.go (11 hunks)
  • internal/storage/postgres/migrations/20250913120000_fix_xid_wraparound.sql (1 hunks)
  • internal/storage/postgres/snapshot/token.go (3 hunks)
  • internal/storage/postgres/snapshot/token_test.go (8 hunks)
  • internal/storage/postgres/types/pguint64_test.go (0 hunks)
  • internal/storage/postgres/types/xid8.go (0 hunks)
  • internal/storage/postgres/utils/common.go (5 hunks)
  • internal/storage/postgres/utils/common_test.go (3 hunks)
  • internal/storage/postgres/watch.go (6 hunks)
  • internal/storage/postgres/watch_test.go (8 hunks)
  • pkg/database/postgres/pguint64.go (10 hunks)
  • pkg/database/postgres/pguint64_test.go (1 hunks)
  • pkg/database/postgres/postgres_test.go (1 hunks)
  • pkg/database/postgres/xid8.go (1 hunks)
  • pkg/database/postgres/xid8_test.go (1 hunks)
💤 Files with no reviewable changes (2)
  • internal/storage/postgres/types/xid8.go
  • internal/storage/postgres/types/pguint64_test.go
🧰 Additional context used
🧬 Code graph analysis (8)
pkg/database/postgres/xid8_test.go (1)
pkg/database/postgres/xid8.go (3)
  • XID8 (10-10)
  • SnapshotCodec (18-18)
  • Uint64Codec (21-21)
pkg/database/postgres/postgres_test.go (2)
pkg/database/postgres/xid8.go (1)
  • XID8 (10-10)
pkg/database/postgres/postgres.go (2)
  • NewWithSeparateURIs (43-45)
  • Postgres (22-35)
internal/storage/postgres/dataReader.go (2)
pkg/database/postgres/xid8.go (1)
  • XID8 (10-10)
internal/storage/postgres/snapshot/token.go (1)
  • Token (15-17)
internal/storage/postgres/watch.go (1)
pkg/database/postgres/xid8.go (1)
  • XID8 (10-10)
internal/storage/postgres/dataWriter.go (3)
pkg/database/postgres/xid8.go (1)
  • XID8 (10-10)
internal/storage/postgres/consts.go (2)
  • RelationTuplesTable (4-4)
  • AttributesTable (5-5)
internal/storage/postgres/utils/common.go (1)
  • ActiveRecordTxnID (30-30)
internal/storage/postgres/snapshot/token.go (1)
pkg/database/postgres/xid8.go (1)
  • XID8 (10-10)
internal/storage/postgres/snapshot/token_test.go (2)
internal/storage/postgres/snapshot/token.go (3)
  • NewToken (25-29)
  • EncodedToken (19-21)
  • Token (15-17)
pkg/database/postgres/xid8.go (1)
  • XID8 (10-10)
internal/storage/postgres/watch_test.go (1)
pkg/database/postgres/xid8.go (1)
  • XID8 (10-10)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Analyze (go)
  • GitHub Check: Test with Coverage
  • GitHub Check: Test with Coverage
🔇 Additional comments (14)
internal/storage/postgres/utils/common_test.go (3)

31-33: LGTM: SnapshotQuery expectation updated to new xid8 sentinel.


42-44: LGTM: GC query (global) expectation uses new sentinel.


51-54: LGTM: GC query (tenant-aware) expectation uses new sentinel.

internal/storage/postgres/migrations/20250913120000_fix_xid_wraparound.sql (1)

20-37: Verify migration ordering and idempotency on mixed states.

Ensure this timestamp positions after any prior defaults/indexes that reference expired_tx_id=0 and that reruns are harmless.

If helpful, I can draft a follow-up goose script to batch updates safely.

internal/storage/postgres/snapshot/token.go (2)

9-11: LGTM: switched to public postgres.XID8.


24-29: All NewToken call sites updated to use postgres.XID8.

internal/storage/postgres/snapshot/token_test.go (3)

28-34: LGTM: encode cases updated to postgres.XID8.


59-60: LGTM: Eg/Gt/Lt tests reflect new type.

Also applies to: 73-75, 89-91


106-112: LGTM: decode tests aligned with new type.

Also applies to: 126-127

pkg/database/postgres/pguint64_test.go (1)

10-41: Solid coverage for Set/Get with clear expectations

Cases exercise both happy paths and error branches; assertions look correct.

internal/storage/postgres/dataWriter.go (1)

269-270: Verify deployment order of fix_xid_wraparound migration
Migration internal/storage/postgres/migrations/20250913120000_fix_xid_wraparound.sql updates all existing expired_tx_id=0 rows to 9223372036854775807 and sets the new default. Ensure this migration runs before deploying the dataWriter changes, otherwise updates filtered on ActiveRecordTxnID will miss rows.

pkg/database/postgres/xid8_test.go (2)

10-41: LGTM: XID8 Set/Get cases mirror pguint64 and look correct

Happy-path and error-path expectations are reasonable.


140-154: Codecs tests are concise and on-point

FormatCode assertions are sufficient.

internal/storage/postgres/utils/common.go (1)

43-57: Snapshot visibility: double-check semantics around “current txn”

Including created_tx_id = value and excluding expired_tx_id = value changes same-tx read-your-writes behavior. Verify this matches your intended snapshot isolation semantics across writers/readers.

If needed, I can generate focused tests around same-tx create/expire visibility with pg_visible_in_snapshot.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/storage/postgres/watch.go (1)

161-175: Bug: 0/unknown anchor prevents any results; also unparameterized/typed comparison.

If the anchor XID has no row (e.g., 0), the subquery returns NULL and the “> (NULL)” filter eliminates all rows permanently. Also, prefer parameterization and explicit cast for the xmin comparison.

Apply:

-func (w *Watch) getRecentXIDs(ctx context.Context, value uint64, tenantID string) ([]db.XID8, error) {
-	// Convert the value to a string formatted as a Postgresql XID8 type.
-	valStr := fmt.Sprintf("'%v'::xid8", value)
-
-	subquery := fmt.Sprintf("(select pg_xact_commit_timestamp(id::xid) from transactions where id = %s)", valStr)
-
-	// Build the main query to get transactions committed after the one with a given XID,
-	// still visible in the current snapshot, ordered by their commit timestamps.
-	builder := w.database.Builder.Select("id").
-		From(TransactionsTable).
-		Where(fmt.Sprintf("pg_xact_commit_timestamp(id::xid) > (%s)", subquery)).
-		Where("id < pg_snapshot_xmin(pg_current_snapshot())").
-		Where(squirrel.Eq{"tenant_id": tenantID}).
-		OrderBy("pg_xact_commit_timestamp(id::xid)")
+func (w *Watch) getRecentXIDs(ctx context.Context, value uint64, tenantID string) ([]db.XID8, error) {
+	// Compare against the commit-ts of `value` if it exists; otherwise use -infinity to allow bootstrapping from 0.
+	anchor := squirrel.Expr(
+		"pg_xact_commit_timestamp(id::xid) > COALESCE((select pg_xact_commit_timestamp(id::xid) from transactions where id = ?::xid8), '-infinity'::timestamptz)",
+		value,
+	)
+	builder := w.database.Builder.Select("id").
+		From(TransactionsTable).
+		Where(anchor).
+		Where(squirrel.Expr("id::xid < pg_snapshot_xmin(pg_current_snapshot())")).
+		Where(squirrel.Eq{"tenant_id": tenantID}).
+		OrderBy("pg_xact_commit_timestamp(id::xid)")
🧹 Nitpick comments (4)
internal/storage/postgres/dataReader.go (1)

32-39: Comment mismatches actual isolation level.

Doc says “Repeatable Read,” code sets ReadCommitted + ReadOnly. Update the comment.

-// It initializes a new DataReader with a given database, a logger, and sets transaction options to be read-only with Repeatable Read isolation level.
+// It initializes a new DataReader with a given database and sets transaction options to ReadCommitted + ReadOnly.
internal/storage/postgres/watch.go (2)

285-309: Check iterator errors after loop (relation tuples).

Add rows.Err() handling to catch driver-level iteration errors.

 for trows.Next() {
   ...
 }
+if err := trows.Err(); err != nil {
+	slog.ErrorContext(ctx, "iteration error for relation tuples", slog.Any("error", err))
+	return nil, err
+}

312-347: Check iterator errors after loop (attributes).

Same for attributes iteration.

 for arows.Next() {
   ...
 }
+if err := arows.Err(); err != nil {
+	slog.ErrorContext(ctx, "iteration error for attributes", slog.Any("error", err))
+	return nil, err
+}
internal/storage/postgres/dataWriter.go (1)

444-449: Expire-by-sentinel updates are correct; consider aiding them with an index.

Using utils.ActiveRecordTxnID in the WHERE ensures only active records get expired. Consider adding/supporting a partial index to keep these updates fast:

  • relation_tuples: (tenant_id, entity_type, entity_id, relation, subject_type, subject_id, subject_relation) WHERE expired_tx_id = ACTIVE
  • attributes: (tenant_id, entity_type, entity_id, attribute) WHERE expired_tx_id = ACTIVE

If migrations already add these, ignore.

Also applies to: 508-513

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 571b1f3 and 49f6b8c.

📒 Files selected for processing (5)
  • internal/storage/postgres/dataReader.go (2 hunks)
  • internal/storage/postgres/dataWriter.go (10 hunks)
  • internal/storage/postgres/watch.go (5 hunks)
  • internal/storage/postgres/watch_test.go (7 hunks)
  • pkg/database/postgres/postgres_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • pkg/database/postgres/postgres_test.go
  • internal/storage/postgres/watch_test.go
🧰 Additional context used
🧬 Code graph analysis (3)
internal/storage/postgres/watch.go (1)
pkg/database/postgres/xid8.go (1)
  • XID8 (10-10)
internal/storage/postgres/dataReader.go (2)
pkg/database/postgres/xid8.go (1)
  • XID8 (10-10)
internal/storage/postgres/snapshot/token.go (1)
  • Token (15-17)
internal/storage/postgres/dataWriter.go (3)
pkg/database/postgres/xid8.go (1)
  • XID8 (10-10)
internal/storage/postgres/consts.go (2)
  • RelationTuplesTable (4-4)
  • AttributesTable (5-5)
internal/storage/postgres/utils/common.go (1)
  • ActiveRecordTxnID (30-30)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Test with Coverage
  • GitHub Check: Analyze (go)
  • GitHub Check: Test with Coverage
🔇 Additional comments (5)
internal/storage/postgres/dataReader.go (1)

563-563: Alias migration looks correct (db.XID8).

Good switch to the public package alias.

internal/storage/postgres/watch.go (2)

194-197: Alias migration to db.XID8 looks consistent.

Scanning works with the new codec-backed type.


224-224: Signature update to use db.XID8 is correct.

Aligns with package restructuring.

internal/storage/postgres/dataWriter.go (2)

183-185: Good: db.XID8 adoption across write/delete/bundle paths.

Consistent with the new exported type; tx.Scan into db.XID8 should work with the registered codec.

Also applies to: 257-260, 332-335, 381-385


427-441: Batch helper signatures updated to db.XID8.

Signatures and call sites look correct.

Also applies to: 487-505

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
pkg/database/postgres/pguint64.go (4)

93-108: Critical: DecodeBinary length check is wrong and can panic

Expect 8 bytes for a uint64; current 4-byte check can trigger out-of-bounds on Uint64 read.

Apply:

 func (p *pguint64) DecodeBinary(ci *pgtype.ConnInfo, src []byte) error {
   if src == nil {
     *p = pguint64{Status: pgtype.Null}
     return nil
   }
 
-  if len(src) != 4 {
+  if len(src) != 8 {
     return fmt.Errorf("invalid length: %v", len(src))
   }
 
   n := binary.BigEndian.Uint64(src)
   *p = pguint64{Uint: n, Status: pgtype.Present}
   return nil
 }

141-149: Scan(int64): reject negative values to avoid wrap-around

Casting negative int64 to uint64 silently wraps.

 case int64:
-  *p = pguint64{Uint: uint64(src), Status: pgtype.Present}
-  return nil
+  if src < 0 {
+    return fmt.Errorf("%d is less than minimum value for pguint64", src)
+  }
+  *p = pguint64{Uint: uint64(src), Status: pgtype.Present}
+  return nil

161-171: Value(): avoid overflow/negative round‑trip; return string

driver.Value int64 cannot represent full uint64 range.

 case pgtype.Present:
-  return int64(p.Uint), nil
+  return strconv.FormatUint(p.Uint, 10), nil

56-75: AssignTo: return error for unsupported destinations

Currently falls through with no error, hiding misuse.

 func (p *pguint64) AssignTo(dst interface{}) error {
   switch v := dst.(type) {
   case *uint64:
     if p.Status == pgtype.Present {
       *v = p.Uint
     } else {
       return fmt.Errorf("cannot assign %v into %T", p, dst)
     }
   case **uint64:
     if p.Status == pgtype.Present {
       n := p.Uint
       *v = &n
     } else {
       *v = nil
     }
+  default:
+    return fmt.Errorf("cannot assign %v into %T", p, dst)
   }
 
   return nil
 }
♻️ Duplicate comments (5)
internal/storage/postgres/utils/common.go (1)

154-160: min() is undefined; clamp retries explicitly

This won’t compile in Go. Clamp with simple int logic.

-		shift := min(retries, 5) // Cap at 2^5 = 32, so max backoff is 640ms
-		baseBackoff = baseBackoff << shift
+		shift := retries
+		if shift > 5 {
+			shift = 5 // Cap at 2^5 = 32, so max backoff is 640ms
+		}
+		baseBackoff = baseBackoff << shift
internal/storage/postgres/dataWriter.go (1)

20-20: LGTM: Clean import with proper aliasing

The import correctly uses the db alias for the postgres package, resolving the duplicate import issue flagged in previous reviews.

pkg/database/postgres/postgres_test.go (2)

29-33: Compilation issue resolved

This test was previously flagged for compilation issues, but the current implementation correctly assigns a pgSnapshot to a pgtype.Value interface, which should work if pgSnapshot properly implements the interface.


65-121: LGTM: Struct method tests with panic expectations

The tests correctly validate:

  • Getter methods return expected configuration values
  • Close and IsReady methods properly panic when pools are nil

The panic expectations align with the current implementation behavior.

internal/storage/postgres/watch.go (1)

20-20: LGTM: Import cleanup completed

The duplicate import issue has been resolved with only the aliased import remaining.

🧹 Nitpick comments (27)
internal/storage/postgres/migrations/20250913120000_fix_xid_wraparound.sql (2)

14-19: Heads-up: ALTER COLUMN DEFAULT takes AccessExclusive lock

ALTER TABLE … SET DEFAULT will block writers briefly. Plan to run off-peak and monitor lock wait. If you want, I can provide a zero-downtime-ish rollout plan.


1-11: Optional: batch the backfill on large tables

If relation_tuples/attributes are big, consider chunked updates to reduce bloat/locks.

I can draft a CTE-based batched UPDATE if desired.

Also applies to: 14-19

docs/operations/repair.mdx (3)

8-20: Add Requirements note (PostgreSQL version and privileges)

Clarify that xid8 and pg_current_xact_id() require PostgreSQL ≥ 13 and suitable privileges.

Apply this diff:

  # Database Repair
@@
- The `permify repair datastore` command helps prevent PostgreSQL XID wraparound issues by safely advancing the transaction ID counter. This is essential after database migrations or when dealing with XID-related problems.
+ The `permify repair datastore` command helps prevent PostgreSQL XID wraparound issues by safely advancing the transaction ID counter. This is essential after database migrations or when dealing with XID-related problems.
+
+ > Requirements
+ > - PostgreSQL 13+ (xid8 support and pg_current_xact_id()).
+ > - A user with rights to start/commit transactions and read the `transactions` table.

78-92: Call out primary-only and replication/WAL impact

Recommend running on primary, expect WAL growth; avoid replicas.

Apply this diff:

 ## When to Use
@@
 - When encountering XID-related errors
+
+## Operational Notes
+
+- Run on the primary (not a hot standby).
+- Expect increased WAL; ensure disk headroom and monitor replication lag.
+- Prefer off-peak windows in busy systems.

73-77: Minor wording tweak

“transactions table” → “Permify transactions table” to avoid ambiguity.

-2. **Reference Analysis**: Finds the maximum transaction ID referenced in the transactions table
+2. **Reference Analysis**: Finds the maximum transaction ID referenced in the Permify `transactions` table
docs/package.json (1)

1-24: Make docs package private and fix placeholder metadata

Prevents accidental publish and aligns license/description with the repo.

Apply this diff:

 {
   "name": "docs",
   "version": "1.0.0",
-  "description": "Click on `Use this template` to copy the Mintlify starter kit. The starter kit contains examples including",
+  "description": "Permify documentation site (Mintlify).",
+  "private": true,
   "main": "gurubase.js",
@@
-  "author": "",
-  "license": "ISC"
+  "author": "",
+  "license": "Apache-2.0"
 }
internal/storage/postgres/utils/common.go (4)

28-31: Single source of truth for sentinel

Keep ActiveRecordTxnID (uint64) as the source of truth and avoid embedding the quoted string where possible.

I’m proposing placeholder-based SQL below to remove dependence on MaxXID8Value in this file.


172-175: Reduce log level to avoid noisy retries

Warn on every retry can spam logs; suggest Debug.

-	slog.WarnContext(ctx, "waiting before retry",
+	slog.DebugContext(ctx, "waiting before retry",

191-193: Fix comment to match implementation

You’re dividing by 2^64; there’s no bit shifting here.

-	// Use bit shifting instead of division for better performance
-	return float64(binary.BigEndian.Uint64(b[:])) / (1 << 63) / 2.0
+	// Normalize to [0,1) by dividing by 2^64 (avoids math package)
+	return float64(binary.BigEndian.Uint64(b[:])) / (1 << 63) / 2.0

34-60: Optional: unit tests for SnapshotQuery/GC builders

Add tests asserting generated SQL and args (Dollar placeholders) to prevent regressions when changing sentinels.

Also applies to: 62-100

internal/storage/postgres/utils/common_test.go (3)

31-33: Use shared sentinel constant to avoid drift.

Hardcoding '9223372036854775807'::xid8 in expectedSQL is brittle. Consider formatting the string with the package constant (e.g., utils.MaxXID8Value) to keep tests aligned with code.

-expectedSQL := "SELECT column FROM table WHERE (pg_visible_in_snapshot(created_tx_id, (select snapshot from transactions where id = '42'::xid8)) = true OR created_tx_id = '42'::xid8) AND ((pg_visible_in_snapshot(expired_tx_id, (select snapshot from transactions where id = '42'::xid8)) = false OR expired_tx_id = '9223372036854775807'::xid8) AND expired_tx_id <> '42'::xid8)"
+expectedSQL := fmt.Sprintf(
+  "SELECT column FROM table WHERE (pg_visible_in_snapshot(created_tx_id, (select snapshot from transactions where id = '42'::xid8)) = true OR created_tx_id = '42'::xid8) AND ((pg_visible_in_snapshot(expired_tx_id, (select snapshot from transactions where id = '42'::xid8)) = false OR expired_tx_id = %s) AND expired_tx_id <> '42'::xid8)",
+  utils.MaxXID8Value,
+)

42-44: Reduce brittleness in SQL assertions.

Same suggestion here—compose with the constant to avoid hardcoded sentinel repetition.

-expectedSQL := "DELETE FROM relation_tuples WHERE expired_tx_id <> '9223372036854775807'::xid8 AND expired_tx_id < '100'::xid8"
+expectedSQL := fmt.Sprintf(
+  "DELETE FROM relation_tuples WHERE expired_tx_id <> %s AND expired_tx_id < '100'::xid8",
+  utils.MaxXID8Value,
+)

51-53: Apply the same constant-driven formatting for tenant-aware GC.

-expectedSQL := "DELETE FROM relation_tuples WHERE tenant_id = ? AND expired_tx_id <> '9223372036854775807'::xid8 AND expired_tx_id < '100'::xid8"
+expectedSQL := fmt.Sprintf(
+  "DELETE FROM relation_tuples WHERE tenant_id = ? AND expired_tx_id <> %s AND expired_tx_id < '100'::xid8",
+  utils.MaxXID8Value,
+)
pkg/database/postgres/repair.go (3)

9-14: Avoid duplicating XID sentinel constants across packages.

Defining ActiveRecordTxnID/MaxXID8Value here risks divergence from internal/storage/postgres/utils. Consider moving them to a common exported package (e.g., pkg/xid) and importing from both places.


56-63: Populate RepairResult.Duration.

Capture elapsed time for observability.

+ start := time.Now()
 ...
 slog.InfoContext(ctx, "PostgreSQL transaction ID counter repair completed",
   slog.Int("transactions_advanced", result.CreatedTxIdFixed),
   slog.Int("errors", len(result.Errors)))
 
+ result.Duration = time.Since(start).String()
 return result, nil

Additionally, add:

 import (
   "context"
   "fmt"
   "log/slog"
+  "time"
 )

Also applies to: 105-110


86-93: Guard against int overflow in counterDelta.

Compute in uint64 and clamp to MaxInt to avoid overflow on very large deltas.

- counterDelta := int(maxReferencedXID - currentXID + 1000) // Add safety buffer
+ delta := maxReferencedXID - currentXID + 1000
+ if delta > uint64(^int(0)) {
+   return result, fmt.Errorf("delta too large: %d", delta)
+ }
+ counterDelta := int(delta)
pkg/database/postgres/repair_test.go (2)

95-117: Config fields MaxRetries/RetryDelay are unused.

Either implement simple retry/backoff in advanceXIDCounterByDelta or remove these fields to avoid confusion.


29-53: Result.Duration is never set by Repair.

If you keep Duration, add an assertion in a Repair integration test (dry-run path) once duration is populated.

pkg/database/postgres/pguint64.go (1)

121-128: Nit: fix error text in EncodeBinary

Message says “encode text” in binary path.

 case pgtype.Undefined:
-  return nil, errors.New("encode text status undefined status")
+  return nil, errors.New("encode binary status undefined status")
pkg/database/postgres/xid8.go (1)

3-7: Add compile-time interface assertions for safety

Assert XID8 implements sql.Scanner and driver.Valuer. Catches regressions early.

 import (
+	"database/sql"
 	"database/sql/driver"

 	"github.com/jackc/pgtype"
 )
@@
 func (x XID8) Value() (driver.Value, error) {
 	return (pguint64)(x).Value()
 }
+
+// Interface assertions
+var (
+	_ driver.Valuer = XID8{}
+	_ sql.Scanner   = (*XID8)(nil)
+)

Also applies to: 66-67

pkg/database/postgres/xid8_test.go (7)

11-41: Add boundary coverage for Set (zero, MaxUint64) and avoid brittle error assertions.

  • Add success cases for 0 and math.MaxUint64.
  • Current error checks rely on substrings; prefer sentinel errors or MatchError(ContainSubstring(...)).

Apply:

@@
 	Context("Set", func() {
@@
 		It("Case 2: Success with uint64", func() {
@@
 		})
+
+		It("Case 5: Success with zero value", func() {
+			var x XID8
+			err := x.Set(uint64(0))
+			Expect(err).ShouldNot(HaveOccurred())
+			Expect(x.Uint).Should(Equal(uint64(0)))
+			Expect(x.Status).Should(Equal(pgtype.Present))
+		})
+
+		It("Case 6: Success with MaxUint64", func() {
+			var x XID8
+			err := x.Set(^uint64(0))
+			Expect(err).ShouldNot(HaveOccurred())
+			Expect(x.Uint).Should(Equal(^uint64(0)))
+			Expect(x.Status).Should(Equal(pgtype.Present))
+		})

43-55: Assert the type of Get() result to catch accidental int64 returns.

Add a type assertion for the present case.

 		It("Case 1: Present status", func() {
 			x := XID8{Uint: 12345, Status: pgtype.Present}
 			result := x.Get()
+			Expect(result).Should(BeAssignableToTypeOf(uint64(0)))
 			Expect(result).Should(Equal(uint64(12345)))
 		})

57-72: Add negative DecodeText tests (non-numeric, overflow).

Covers common parsing failures.

 	Context("DecodeText", func() {
@@
 		})
 
+		It("Case 3: Error with non-numeric", func() {
+			var x XID8
+			err := x.DecodeText(nil, []byte("abc"))
+			Expect(err).Should(HaveOccurred())
+		})
+
+		It("Case 4: Error with overflow", func() {
+			var x XID8
+			// 18446744073709551616 = MaxUint64 + 1
+			err := x.DecodeText(nil, []byte("18446744073709551616"))
+			Expect(err).Should(HaveOccurred())
+		})
 	})

74-88: Optional: verify EncodeText appends to provided buffer.

Minor perf guard; pgtype encoders typically append to buf.

 		It("Case 1: Present status", func() {
 			x := XID8{Uint: 12345, Status: pgtype.Present}
-			result, err := x.EncodeText(nil, nil)
+			buf := make([]byte, 0, 8)
+			result, err := x.EncodeText(nil, buf)
 			Expect(err).ShouldNot(HaveOccurred())
 			Expect(string(result)).Should(Equal("12345"))
+			Expect(cap(result)).Should(BeNumerically(">=", 8))
 		})

90-121: Strengthen Scan() coverage with invalid inputs and edge cases.

Add cases for unsupported types, invalid strings, negative ints.

 	Context("Scan", func() {
@@
 		})
 
+		It("Case 5: Error with unsupported type", func() {
+			var x XID8
+			err := x.Scan(12.34)
+			Expect(err).Should(HaveOccurred())
+		})
+
+		It("Case 6: Error with invalid numeric string", func() {
+			var x XID8
+			err := x.Scan("not-a-number")
+			Expect(err).Should(HaveOccurred())
+		})
+
+		It("Case 7: Error with negative int64", func() {
+			var x XID8
+			err := x.Scan(int64(-1))
+			Expect(err).Should(HaveOccurred())
+			Expect(x.Status).ShouldNot(Equal(pgtype.Present))
+		})
 	})

123-137: Clarify Value() behavior for values > MaxInt64 (driver.Value cannot carry uint64).

Either return a string for out-of-range values or error deterministically; add a test to lock behavior.

 	Context("Value", func() {
@@
 		})
 
+		It("Case 3: Out-of-range value strategy", func() {
+			x := XID8{Uint: ^uint64(0), Status: pgtype.Present} // MaxUint64
+			_, err := x.Value()
+			// Decide and enforce one of:
+			// Expect(err).To(HaveOccurred()) // if Value() errors on overflow
+			// OR:
+			// s, ok := result.(string); Expect(ok).To(BeTrue()); Expect(s).To(Equal("18446744073709551615"))
+			_ = err
+		})
 	})

If you want, I can align the implementation and tests to your chosen strategy.


45-48: Prefer black-box setup to reduce coupling to struct fields.

Instead of struct literals with Uint/Status, set state via exported methods like Set to avoid brittle tests if internals change.

-			x := XID8{Uint: 12345, Status: pgtype.Present}
+			var x XID8
+			Expect(x.Set(uint64(12345))).To(Succeed())

(Apply similarly in EncodeText and Value contexts.)

Also applies to: 76-80, 125-129

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 836f275 and c7d00fd.

⛔ Files ignored due to path filters (2)
  • docs/package-lock.json is excluded by !**/package-lock.json
  • playground/public/play.wasm is excluded by !**/*.wasm
📒 Files selected for processing (23)
  • cmd/permify/permify.go (1 hunks)
  • docs/mint.json (1 hunks)
  • docs/operations/repair.mdx (1 hunks)
  • docs/package.json (1 hunks)
  • internal/storage/postgres/dataReader.go (2 hunks)
  • internal/storage/postgres/dataWriter.go (10 hunks)
  • internal/storage/postgres/migrations/20250913120000_fix_xid_wraparound.sql (1 hunks)
  • internal/storage/postgres/snapshot/token.go (3 hunks)
  • internal/storage/postgres/snapshot/token_test.go (8 hunks)
  • internal/storage/postgres/types/pguint64_test.go (0 hunks)
  • internal/storage/postgres/types/xid8.go (0 hunks)
  • internal/storage/postgres/utils/common.go (5 hunks)
  • internal/storage/postgres/utils/common_test.go (3 hunks)
  • internal/storage/postgres/watch.go (5 hunks)
  • internal/storage/postgres/watch_test.go (7 hunks)
  • pkg/cmd/repair.go (1 hunks)
  • pkg/database/postgres/pguint64.go (10 hunks)
  • pkg/database/postgres/pguint64_test.go (1 hunks)
  • pkg/database/postgres/postgres_test.go (1 hunks)
  • pkg/database/postgres/repair.go (1 hunks)
  • pkg/database/postgres/repair_test.go (1 hunks)
  • pkg/database/postgres/xid8.go (1 hunks)
  • pkg/database/postgres/xid8_test.go (1 hunks)
💤 Files with no reviewable changes (2)
  • internal/storage/postgres/types/pguint64_test.go
  • internal/storage/postgres/types/xid8.go
🧰 Additional context used
🧬 Code graph analysis (12)
cmd/permify/permify.go (1)
pkg/cmd/repair.go (1)
  • NewRepairCommand (24-35)
pkg/database/postgres/xid8_test.go (1)
pkg/database/postgres/xid8.go (3)
  • XID8 (10-10)
  • SnapshotCodec (18-18)
  • Uint64Codec (21-21)
pkg/database/postgres/postgres_test.go (2)
pkg/database/postgres/xid8.go (1)
  • XID8 (10-10)
pkg/database/postgres/postgres.go (2)
  • NewWithSeparateURIs (43-45)
  • Postgres (22-35)
pkg/cmd/repair.go (1)
pkg/database/postgres/repair.go (1)
  • RepairConfig (17-23)
internal/storage/postgres/dataReader.go (2)
pkg/database/postgres/xid8.go (1)
  • XID8 (10-10)
internal/storage/postgres/snapshot/token.go (1)
  • Token (15-17)
pkg/database/postgres/repair_test.go (1)
pkg/database/postgres/repair.go (3)
  • DefaultRepairConfig (26-34)
  • RepairResult (37-41)
  • RepairConfig (17-23)
pkg/database/postgres/repair.go (1)
pkg/database/postgres/postgres.go (1)
  • Postgres (22-35)
internal/storage/postgres/snapshot/token.go (1)
pkg/database/postgres/xid8.go (1)
  • XID8 (10-10)
internal/storage/postgres/watch_test.go (1)
pkg/database/postgres/xid8.go (1)
  • XID8 (10-10)
internal/storage/postgres/dataWriter.go (3)
pkg/database/postgres/xid8.go (1)
  • XID8 (10-10)
internal/storage/postgres/consts.go (2)
  • RelationTuplesTable (4-4)
  • AttributesTable (5-5)
internal/storage/postgres/utils/common.go (1)
  • ActiveRecordTxnID (30-30)
internal/storage/postgres/watch.go (1)
pkg/database/postgres/xid8.go (1)
  • XID8 (10-10)
internal/storage/postgres/snapshot/token_test.go (2)
internal/storage/postgres/snapshot/token.go (3)
  • NewToken (25-29)
  • EncodedToken (19-21)
  • Token (15-17)
pkg/database/postgres/xid8.go (1)
  • XID8 (10-10)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test with Coverage
  • GitHub Check: Test with Coverage
🔇 Additional comments (36)
internal/storage/postgres/migrations/20250913120000_fix_xid_wraparound.sql (2)

5-11: Good sentinel migration for active records

Updating 0 → 9223372036854775807::xid8 for expired_tx_id in both tables is consistent with the new “never-expire” sentinel. LGTM.


31-37: Down migration symmetry

Reverting 9223372036854775807 → 0 is symmetric. Confirm no other code path uses MAX as a valid “expired” value before relying on this Down.

docs/mint.json (1)

318-326: Nav addition looks good

Adding operations/repair under Operations is consistent with the new doc.

internal/storage/postgres/utils/common.go (1)

1-194: No remaining old sentinel usages of 0::xid8
A repo-wide search for ::xid only returned the updated sentinel ('9223372036854775807'::xid8) and other valid casts—no occurrences of '0'::xid8 remain.

cmd/permify/permify.go (1)

42-44: CLI: Repair subcommand wiring — LGTM

Command is properly constructed and registered on root. No concerns.

internal/storage/postgres/watch_test.go (1)

242-245: Alias migration to PQDatabase.XID8 — LGTM

Consistent with the public type relocation; tests read cleanly.

Also applies to: 257-260, 272-275, 287-290, 302-305, 317-320, 332-335

internal/storage/postgres/dataReader.go (1)

563-581: HeadSnapshot: switch to db.XID8 and zero fallback — LGTM

Type alignment looks correct; zero-token fallback remains intact.

pkg/database/postgres/pguint64.go (2)

20-42: Set(): input validation and paths — LGTM

Covers signed/unsigned ints with proper negativity checks.


109-120: EncodeText(): status handling — LGTM

Null/undefined handling and formatting are correct.

internal/storage/postgres/snapshot/token.go (2)

9-9: LGTM: Import path migration aligned with package restructuring

The import path correctly reflects the migration from internal types to the public postgres package.


16-16: LGTM: Type migration properly implemented

The Token struct and all related functions correctly use the new postgres.XID8 type. The type change is consistently applied across:

  • Token.Value field declaration (line 16)
  • NewToken function signature (line 25)
  • EncodedToken.Decode instantiation (line 65)

This maintains API compatibility while using the public type.

Also applies to: 25-25, 65-65

internal/storage/postgres/dataWriter.go (3)

183-184: LGTM: Consistent XID8 type usage

All transaction ID variables correctly use the db.XID8 type throughout the write, delete, and runBundle operations. This ensures consistency with the new public API.

Also applies to: 257-258, 332-333


268-268: LGTM: Proper use of ActiveRecordTxnID constant

The update queries correctly use utils.ActiveRecordTxnID instead of hardcoded values for expired record identification. This aligns with the new sentinel-based approach for handling XID wraparound scenarios.

Also applies to: 288-288, 448-448, 512-512


381-381: LGTM: Function signatures updated consistently

All batch operation functions correctly use db.XID8 parameter types, maintaining consistency across the data writing pipeline.

Also applies to: 427-427, 444-444, 487-487, 508-508

pkg/cmd/repair.go (7)

1-22: LGTM: Well-structured command package

Clean imports and properly defined constants for command flags. The package structure follows Cobra CLI conventions.


24-35: LGTM: Clear command hierarchy

The repair command structure is well-organized with appropriate subcommand integration.


38-67: LGTM: Comprehensive CLI configuration

The datastore subcommand provides:

  • Reasonable default values (batch size 1000, retries 3)
  • Clear help documentation explaining the repair process
  • Proper flag validation (required database URI)
  • Safety features (dry-run mode, verbose logging)

76-79: LGTM: Input validation

Proper validation ensures only PostgreSQL engine is supported, providing clear error messages for unsupported engines.


104-105: LGTM: Appropriate timeout configuration

The 30-minute timeout is reasonable for database repair operations, preventing indefinite hangs while allowing sufficient time for completion.


107-111: LGTM: Security-conscious logging

The logging correctly masks the database URI using the maskURL helper function, preventing sensitive connection details from appearing in logs.


146-152: LGTM: Simple and effective URL masking

The maskURL function provides basic protection for database URIs in logs while maintaining readability for debugging purposes.

internal/storage/postgres/snapshot/token_test.go (2)

11-11: LGTM: Import path updated correctly

The import correctly reflects the migration from internal types to the public postgres package.


28-33: LGTM: Test data migrated consistently

All test cases correctly use the new postgres.XID8 type for:

  • Encode test cases with various uint64 values
  • Comparison operations (Eg, Gt, Lt)
  • Decode test cases
  • Error case testing

The token values and expected results remain unchanged, ensuring behavioral compatibility.

Also applies to: 46-46, 59-59, 73-75, 88-91, 106-112, 126-126

pkg/database/postgres/pguint64_test.go (2)

1-8: LGTM: Proper test package structure

Clean package declaration and imports following Ginkgo/Gomega conventions.


10-171: LGTM: Comprehensive pguint64 test suite

Excellent test coverage for the pguint64 type with well-structured test contexts covering:

  • Set operations: Success with int64/uint64, proper error handling for negative values and unsupported types
  • Get operations: Correct handling of Present, Null, and Undefined statuses
  • Text encoding/decoding: Valid inputs, null handling, and error cases
  • Scan operations: Multiple input type support (uint64, int64, string, nil, unsupported types)
  • Value operations: Status-based return values and error handling

The error message assertions using ContainSubstring provide robust validation without being overly brittle.

pkg/database/postgres/postgres_test.go (3)

22-27: LGTM: Type validation test

The test correctly validates that XID8 is based on pguint64 by checking the underlying fields. This ensures the type alias relationship is properly maintained.


36-63: LGTM: Constructor error handling tests

The tests properly validate that constructor functions return errors for invalid inputs, covering both single URI and separate URI scenarios.


123-156: LGTM: Registry map validation

The tests ensure that important configuration maps (queryExecModes and planCacheModes) contain all expected keys. This provides safety against accidental removal of supported modes.

internal/storage/postgres/watch.go (2)

161-161: LGTM: Function signatures and variables updated

All XID8-related code correctly uses the db.XID8 type:

  • getRecentXIDs return type and local variables (lines 161, 194, 196)
  • getChanges parameter type (line 224)
  • expiredXID variables in both tuple and attribute processing loops (lines 286, 313)

The migration maintains functional compatibility while using the public API.

Also applies to: 194-196, 224-224


286-286: LGTM: Consistent type usage in data processing

Both relation tuple and attribute processing loops correctly use db.XID8 for expired transaction ID scanning, ensuring consistency across the watch functionality.

Also applies to: 313-313

pkg/database/postgres/xid8.go (3)

23-66: LGTM: thin wrapper delegates correctly

Set/Get/AssignTo/Decode*/Encode*/Scan/Value all correctly forward to the underlying implementation with appropriate pointer/value receivers.


12-16: Do not remove pgSnapshot – it’s used in tests
pgSnapshot is referenced in pkg/database/postgres/postgres_test.go (lines 29–30), so it’s not dead code.

Likely an incorrect or invalid review comment.


9-11: Resolve: pguint64 is defined and provides required methods
pguint64 is declared in pkg/database/postgres/pguint64.go and implements Set/Get/AssignTo/DecodeText/DecodeBinary/EncodeText/EncodeBinary/Scan/Value, so XID8 compiles and correctly implements all required interfaces.

pkg/database/postgres/xid8_test.go (3)

140-154: LGTM on codec FormatCode assertions.

Binary format is appropriate for performance and parity with pgtype codecs.


3-8: No change needed for pgtype import path
The project’s go.mod explicitly requires github.com/jackc/pgtype v1.14.4 and there is no pgx/v5 dependency, so import "github.com/jackc/pgtype" is correct.


1-1: Remove suggestion: Ginkgo suite bootstrap already exists in this package
postgres_test.go (lines 16–17) and repair_test.go (lines 11–12) already register the FailHandler and call RunSpecs, so no additional _suite_test.go is needed.

Likely an incorrect or invalid review comment.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
pkg/database/postgres/repair.go (3)

113-120: LGTM! Proper handling of xid8 to uint64 conversion.

The implementation correctly casts xid8 to bigint and scans into int64 before converting to uint64, avoiding the truncation issue from previous versions.


133-187: LGTM! Proper implementation of XID advancement using separate transactions.

The implementation correctly advances XIDs using individual transactions with proper context cancellation checks and error handling. This addresses the issue from previous versions where a DO block was incorrectly used.


123-130: Handle NULL values from empty tables.

The MAX(id) query returns NULL on empty tables, which will cause a scan error. Use COALESCE to handle this case gracefully.

 func (p *Postgres) getMaxReferencedXID(ctx context.Context) (uint64, error) {
-	query := "SELECT (MAX(id)::text)::bigint FROM transactions"
+	query := "SELECT (COALESCE(MAX(id), '0'::xid8)::text)::bigint FROM transactions"
 	var x int64
 	if err := p.ReadPool.QueryRow(ctx, query).Scan(&x); err != nil {
 		return 0, err
 	}
 	return uint64(x), nil
 }
🧹 Nitpick comments (2)
pkg/database/postgres/repair.go (2)

16-34: Consider adding upper bounds validation for configuration parameters.

While the code validates that BatchSize is positive, consider adding upper bounds to prevent potential resource exhaustion or performance issues.

Add validation for reasonable upper bounds:

 func DefaultRepairConfig() *RepairConfig {
 	return &RepairConfig{
 		BatchSize:  1000, // default batch size for XID advancement
 		MaxRetries: 3,
 		RetryDelay: 100,
 		DryRun:     false,
 		Verbose:    true,
 	}
 }
+
+// ValidateRepairConfig validates and normalizes the repair configuration
+func (c *RepairConfig) Validate() {
+	if c.BatchSize <= 0 || c.BatchSize > 10000 {
+		c.BatchSize = 1000
+	}
+	if c.MaxRetries < 0 || c.MaxRetries > 10 {
+		c.MaxRetries = 3
+	}
+	if c.RetryDelay < 0 || c.RetryDelay > 5000 {
+		c.RetryDelay = 100
+	}
+}

86-100: Consider making the safety buffer configurable.

The hardcoded safety buffer of 1000 XIDs might not be suitable for all scenarios. Consider making it configurable through RepairConfig.

 type RepairConfig struct {
 	BatchSize  int  // batch size for XID advancement
 	MaxRetries int  // maximum number of retries
 	RetryDelay int  // milliseconds
 	DryRun     bool // perform a dry run without making changes
 	Verbose    bool // enable verbose logging
+	SafetyBuffer int // XIDs to advance beyond the maximum referenced XID
 }

 func DefaultRepairConfig() *RepairConfig {
 	return &RepairConfig{
 		BatchSize:  1000, // default batch size for XID advancement
 		MaxRetries: 3,
 		RetryDelay: 100,
 		DryRun:     false,
 		Verbose:    true,
+		SafetyBuffer: 1000,
 	}
 }

And in the repair logic:

 if maxReferencedXID > currentXID {
-	counterDelta := int(maxReferencedXID - currentXID + 1000) // Add safety buffer
+	safetyBuffer := config.SafetyBuffer
+	if safetyBuffer <= 0 {
+		safetyBuffer = 1000
+	}
+	counterDelta := int(maxReferencedXID - currentXID + uint64(safetyBuffer))
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9693516 and 4757525.

📒 Files selected for processing (3)
  • pkg/cmd/repair.go (1 hunks)
  • pkg/database/postgres/repair.go (1 hunks)
  • pkg/database/postgres/repair_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • pkg/cmd/repair.go
  • pkg/database/postgres/repair_test.go
🧰 Additional context used
🧬 Code graph analysis (1)
pkg/database/postgres/repair.go (1)
pkg/database/postgres/postgres.go (1)
  • Postgres (22-35)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Test with Coverage
  • GitHub Check: Analyze (go)
  • GitHub Check: dependency-review
  • GitHub Check: Test with Coverage
🔇 Additional comments (1)
pkg/database/postgres/repair.go (1)

9-14: LGTM! Well-defined constants for XID management.

The constants are appropriately defined for preventing XID wraparound by using the maximum signed 64-bit value as a sentinel. This aligns with the changes in internal/storage/postgres/utils/common.go.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
internal/storage/postgres/utils/common.go (2)

144-151: min() is undefined in Go; build will fail

Go has no built-in min for ints. Replace with simple capping logic.

Apply:

-		shift := min(retries, 5) // Cap at 2^5 = 32, so max backoff is 640ms
-		baseBackoff = baseBackoff << shift
+		shift := retries
+		if shift > 5 {
+			shift = 5 // Cap at 2^5 = 32, so max backoff is 640ms
+		}
+		baseBackoff = baseBackoff << shift

37-53: Avoid passing a Sqlizer as a bound value; inline the subquery into the expression

Using squirrel.Expr(snapshotQuery, value) as an argument to another Expr’s “?” is fragile. Inline the subquery and bind only scalars to placeholders.

Apply:

-	snapshotQuery := "(select snapshot from transactions where id = ?::xid8)"
+	snapshotQuery := "(select snapshot from transactions where id = ?::xid8)"

-	createdWhere := squirrel.Or{
-		squirrel.Expr("pg_visible_in_snapshot(created_tx_id, ?) = true", squirrel.Expr(snapshotQuery, value)),
-		squirrel.Expr("created_tx_id = ?::xid8", value), // Include current transaction
-	}
+	createdWhere := squirrel.Or{
+		squirrel.Expr("pg_visible_in_snapshot(created_tx_id, "+snapshotQuery+") = true", value),
+		squirrel.Expr("created_tx_id = ?::xid8", value),
+	}

-	expiredWhere := squirrel.And{
-		squirrel.Or{
-			squirrel.Expr("pg_visible_in_snapshot(expired_tx_id, ?) = false", squirrel.Expr(snapshotQuery, value)),
-			squirrel.Expr("expired_tx_id = ?::xid8", ActiveRecordTxnID), // Never expired
-		},
-		squirrel.Expr("expired_tx_id <> ?::xid8", value), // Not expired by current transaction
-	}
+	expiredWhere := squirrel.And{
+		squirrel.Or{
+			squirrel.Expr("pg_visible_in_snapshot(expired_tx_id, "+snapshotQuery+") = false", value),
+			squirrel.Expr("expired_tx_id = ?::xid8", ActiveRecordTxnID),
+		},
+		squirrel.Expr("expired_tx_id <> ?::xid8", value),
+	}
🧹 Nitpick comments (1)
internal/storage/postgres/utils/common.go (1)

174-183: Use 53-bit technique for uniform [0,1) without FP precision pitfalls

Derive a float via the upper 53 bits; this matches math/rand’s approach and avoids chained divisions.

Apply:

-	// Use bit shifting instead of division for better performance
-	return float64(binary.BigEndian.Uint64(b[:])) / (1 << 63) / 2.0
+	// Use upper 53 bits for uniform [0,1)
+	u := binary.BigEndian.Uint64(b[:]) >> 11
+	return float64(u) * (1.0 / (1 << 53))
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4757525 and 1054d5b.

📒 Files selected for processing (2)
  • internal/storage/postgres/utils/common.go (4 hunks)
  • internal/storage/postgres/utils/common_test.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
internal/storage/postgres/utils/common_test.go (1)
internal/storage/postgres/utils/common.go (3)
  • ActiveRecordTxnID (29-29)
  • GenerateGCQuery (61-73)
  • GenerateGCQueryForTenant (78-90)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test with Coverage
  • GitHub Check: Test with Coverage
🔇 Additional comments (6)
internal/storage/postgres/utils/common.go (3)

65-73: LGTM: parameterized GC where-clauses

Good switch to placeholders and typed casts; avoids string interpolation and improves plan caching.


82-89: LGTM: tenant-aware GC is safely parameterized

Table name is still dynamic; ensure callers only pass vetted constants.


26-31: MaxXID8Value is still referenced in pkg/database/postgres/repair.go:13; cannot remove it before updating that usage.

Likely an incorrect or invalid review comment.

internal/storage/postgres/utils/common_test.go (3)

28-34: LGTM: SnapshotQuery asserts SQL and arg order

Expected SQL matches parameterized pattern and args validate binding (revision, revision, revision, ActiveRecordTxnID, revision).


41-46: LGTM: GC query (global) uses placeholders and correct args

Asserts both SQL and args; good.


53-56: LGTM: Tenant-aware GC query covers tenant_id + predicates with proper args

Good coverage and ordering.

@tolgaozen tolgaozen merged commit ac4509f into master Sep 8, 2025
11 of 14 checks passed
@tolgaozen tolgaozen deleted the update-pguint64-xid8-feature branch September 8, 2025 20:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants