sql: implement immutable tables that automatically use follower reads

## Background

It's very common in real-world database schemas to have _reference tables_ - infrequently-updated tables, often relatively small, that are used as foreign key tables in applications. Some examples:
  
- a table of zipcodes. zipcodes are very infrequently deleted or added
- the `items` table from `tpcc`.  items in tpcc are never updated
- a table of historical closing prices in a financial database. this table is updated once a day, but not during the day

These kinds of tables present some issues for CockroachDB. In CockroachDB, doing a read from a table requires going to the table's leaseholder, to make sure that the reads are consistent. In an application that plays to CockroachDB's strengths that has many, globally distributed datacenters, reference tables like this will cause poor performance in several situations:

- inserts into tables that have foreign keys to a reference table will be at least as slow as a read to the leaseholder of the reference table, which might live across the globe
- joins from tables to a reference table will require going to the leaseholder, which might live across the globe

This may be unacceptable for customers who desire low-latency reads against reference tables and are willing to sacrifice write performance against those tables.

We can support this in CockroachDB by providing users the ability to mark a table as immutable, and using that information to provide fast, transactional reads to any replica.

## Proposed solution for CockroachDB: Immutable tables

Immutable tables are tables that cannot be modified. Tables can be set as immutable, and reset back to mutable, by performing a schema change with an ordinary `ALTER TABLE` command.

When a table is marked as immutable, it's possible to drastically improve read performance from it by altering the read semantics. There are (at least) two ways of doing this:

1. Using inconsistent reads against the table. ~That's safe because we know that the table is immutable, implying that all of its replicas are up-to-date.~
2. Using follower reads (#16593) against the table, with a read timestamp somewhere in the past, divorced from the active transaction timestamp. That's safe because follower reads guarantee that follower replicas are up-to-date as of some interval of time in the past.

The first solution is harder to implement, because of the semantics of leaving the immutable state - we'd have to invent a way to wait until data has fully replicated to all replicas. It also might be unsafe altogether - see @nvanbenschoten's comment on this issue.

The second solution is easier, and should be possible as soon as follower reads are available, using the schema change infrastructure and an intermediate state similar to the intermediate WRITE_ONLY state used for index backfills. The rest of this issue proposes to use the second solution.

### Details on immutable/mutable states and state change process

Table descriptors will be extended with a new field, `mutability`, that has three possible values:
- `mutable`
  - Reads from a `mutable` table behave as normal.
  - Writes to a `mutable` table are permitted.
- `immutable-must-read-transactional`
  - Reads from an `immutable-must-read-transactional` table behave as normal.
  - Writes to an `immutable-must-read-transactional` table are not permitted. 
- `immutable`
  - Reads from an `immutable` table are performed at a historical timestamp, `t` seconds in the past (where `t` is defined by the follower read safe past interval), and get served by a follower read
  - Writes to an `immutable` table are not permitted

By default, `mutability` is set to `mutable`. To transition a table from `mutable` to `immutable`, a user issues an `ALTER TABLE t SET IMMUTABLE` command (or similar - syntax to be decided later), which performs the following actions:

1. Set table descriptor to `immutable-must-read-transactional` and wait until the cluster converges on the new descriptor.
2. Wait `t` seconds to ensure that, once transitioned to `immutable`, all reads from `t` seconds ago occur before the last possible write to the reference table.
3. Set table descriptor to `immutable`.

To transition a table from `immutable` to `mutable`, a user issues an `ALTER TABLE t SET MUTABLE` command, which performs the following actions:

1. Set table descriptor to `immutable-must-read-transactional` and wait until the cluster converges on the new descriptor.
2. Set table descriptor to `mutable`. (note we don't have to wait `t` seconds here, since we're moving in a "safe" direction, causing all future reads to be ordinary transactional reads.)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql: implement immutable tables that automatically use follower reads #26301

Background

Proposed solution for CockroachDB: Immutable tables

Details on immutable/mutable states and state change process

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

sql: implement immutable tables that automatically use follower reads #26301

Description

Background

Proposed solution for CockroachDB: Immutable tables

Details on immutable/mutable states and state change process

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions