Skip to content

sql: SHOW RANGES broken by coalesced ranges #93617

@knz

Description

@knz

Describe the problem

The output of SHOW RANGES assumes that a range maps to at most one db, table, index.

Since recently, ranges coalesce across tables and even databases. This means that the db/table/index name fields are now incorrect; they merely describe the data at the beginning of the range but do not include data/tables that's in the middle or at the end.

This has hilarious effects:

> create table db1; create table db1.t as select 1 as x; 
> create table db2; create table db2.t as select 1 as x; 
> show ranges from database d1;
  table_name | start_key | end_key | range_id |       range_size_mb        | lease_holder | lease_holder_locality | replicas |    replica_localities
-------------+-----------+---------+----------+----------------------------+--------------+-----------------------+----------+---------------------------
  t          | /108      | NULL    |       56 | 0.000068000000000000000000 |            1 | region=us-east1,az=b  | {1}      | {"region=us-east1,az=b"}

> show ranges from database db2;

SHOW RANGES 0

oops!
(the problem is because the database_name field in the range is computed on the start key, and there's no separate range with db2 as its database_name)

Additionally, the start_key/end_key values are often broken, because they don't fit within the keyspace of the selected index (in show ranges from table / from index).

Expected behavior

Fundamentally the assumption that "1 range <-> at most one object" is broken so we can't repair this with the current schema of the show ranges output, nor the underlying crdb_internal.ranges. Both need to change.

We have two alternatives:

  • keep the guarantee "1 row per range" (range_id remains unique), but then include a list of designated objects elsewhere on the row, e.g. via a new column with type ARRAY.
  • keep the guarantee "at least 1 row per object" (db/table/index names remain unique), but then introduce synthetic range rows, 1 per SQL object included in the range, with synthetic start/end keys that map to just the object selected, and the range_id repeated across rows.

Given how we're using this facility in tests, it looks like both are desirable!

So what can happen is likely that we'll split the vtable in two: one with the former behavior and one with the latter.

Jira issue: CRDB-22439

Epic CRDB-22701

Metadata

Metadata

Assignees

Labels

A-kv-observabilityA-sql-vtablesVirtual tables - pg_catalog, information_schema etcC-bugCode not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.T-kvKV Team

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions