kv: txn giving up on refresh span collection causes closed ts to kick it out

Found by user:

1. txn starts
2. txn performs enough operations to exceed max_refresh_span_bytes, so refresh span collection stops
3. txn lasts for more than 30s
4. closed ts "catches up", doesn't find refresh spans and "kicks the txn out" (pushes it, and the client receives an error)
5. the error is not the usual retry error because it is not caused by contention, but the error _message_ does not clarify what is happening
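
The sequence above can be illustrated with a toy model. This is only a sketch under stated assumptions: `refreshTracker`, `record`, `onPush` and the 16-byte budget are hypothetical names and values made up for illustration, not CockroachDB's actual client code. It shows why a txn that has stopped collecting refresh spans has no way to recover once its timestamp is pushed.

```go
package main

import (
	"errors"
	"fmt"
)

// refreshTracker is a hypothetical, simplified stand-in for the bookkeeping
// of a txn's refresh spans. It is NOT CockroachDB's implementation.
type refreshTracker struct {
	maxBytes    int      // plays the role of max_refresh_span_bytes
	usedBytes   int      // bytes used by the spans collected so far
	spans       []string // spans collected so far
	invalidated bool     // true once the budget was exceeded and collection stopped
}

// record adds a span unless collection has already been given up.
// Once the budget is exceeded, all spans are dropped for good.
func (tr *refreshTracker) record(span string) {
	if tr.invalidated {
		return
	}
	if tr.usedBytes+len(span) > tr.maxBytes {
		tr.spans = nil
		tr.usedBytes = 0
		tr.invalidated = true
		return
	}
	tr.spans = append(tr.spans, span)
	tr.usedBytes += len(span)
}

var errCannotRefresh = errors.New("txn was pushed but has no refresh spans; it must restart")

// onPush models the txn's timestamp being pushed (for example by the closed
// ts catching up). Without refresh spans there is nothing to re-validate.
func (tr *refreshTracker) onPush() error {
	if tr.invalidated {
		return errCannotRefresh
	}
	return nil // a real system would re-validate the collected spans here
}

func main() {
	tr := &refreshTracker{maxBytes: 16} // tiny budget, for illustration only
	for i := 0; i < 10; i++ {
		tr.record(fmt.Sprintf("key-%03d", i))
	}
	fmt.Println("collection stopped:", tr.invalidated)
	fmt.Println("push result:", tr.onPush())
}
```

Note that nothing in this model involves contention: the error is purely a consequence of the byte budget plus the push, which is why the usual retry error wording is misleading here.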

There are three separate issues here:

- we want a larger default for max_refresh_span_bytes so that the scenario becomes less likely. This is predicated on better memory tracking in KV, which is a separate work item (planned for 20.1, see the work @tbg has started in #44341). I think this is orthogonal and should be kept out of scope here.

- when the scenario happens, we want the error message to be clearer about what needs to happen: either decrease the duration of the txn, or decrease its number of refresh spans (fewer reads/writes), or increase max_refresh_span_bytes, or increase the closed ts delay

- or we could avoid the situation entirely: make the closed ts lag behind a long-running txn if it has disabled refresh span collection.

cc @ajwerner @tbg for triage.

Jira issue: CRDB-5215

kv: txn giving up on refresh span collection causes closed ts to kick it out #44645

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

kv: txn giving up on refresh span collection causes closed ts to kick it out #44645

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions