Skip to content

sql: GC during Index Backfills can generate incorrect index #64014

@dt

Description

@dt

When backfilling an index, we read some rows from the table, generate index
entries for those rows, then write those entries. We write those entries with
AddSSTable, which can add keys at any timestamp.

Consider the following timeline:

  • T_before: a row X is written to the table.
  • T0: CREATE INDEX runs, index steps to DELETE_AND_WRITE_ONLY, back-filler waits for one version.
  • T1: Back-filler can now assume all SQL writes will update entries. Will read-and-write entries at T1. Starts back-filling.
  • T2: SQL deletes X, writes tombstone for X to index.
  • T3: Back-filler scans (at T1) and sees X, adds entry for it to its buffer.
  • T5: Back-filler flushes its buffer, writes SST with record for X@T1 under the tombstone.

In this case, the final index is correct, since the SQL write at T2 dropped a
tombstone so that even though the back-filler then slid the key under it.

However, now let's consider the same example, but say the table has a TTL of 2.
Picking up the exact same timeline above from T2, we have an extra event at T4:

  • T2: SQL deletes X, writes tombstone for X to index.
  • T3: Back-filler scans (at T1 which is still at/above TTL=2) and sees X, adds entry for it to its buffer.
  • T4: GC can GC up to T2, sees the tombstone and record under it can go.
  • T5: Back-filler flushes its buffer, writes SST with record for X@T1, which is the only entry for X, as GC has already removed the tombstone.

The final index now contains X@T1 even though X was removed at T2, because GC removed the tombstone at T2 before we added X@T1.

In general, the count() based validation should catch this. However, I think in (exceptionally rare) cases, if the right number of rows are shadowed due to unlucky uniqueness violations to exactly cancel out the number of rows incorrectly not shadowed by tombstones collected too early, it is possible we could end up making an incorrect public and subsequently returning incorrect query results.

Metadata

Metadata

Assignees

Labels

C-bugCode not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.S-0-visible-logical-errorDatabase stores inconsistent data in some cases, or queries return invalid results silently.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions