A txn performs so many operations that its refresh spans exceed max_refresh_span_bytes, and refresh span collection stops.
The txn lasts for more than 30s.
The closed ts catches up, finds no refresh spans, and "kicks the txn out": the txn is pushed and the client receives an error.
The error is not the usual retry error, because it is not caused by contention, but the error message does not explain what is happening.
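The sequence above can be illustrated with a toy model. This is only a sketch under stated assumptions, not CockroachDB's code: `RefreshTracker`, `record_read`, and `on_push` are hypothetical names, span size is approximated by key length, and the model assumes that going over budget discards the spans collected so far. It shows why a txn that has stopped collecting refresh spans has nothing to refresh with once it is pushed.

```python
from dataclasses import dataclass, field


@dataclass
class RefreshTracker:
    """Toy model of budget-limited refresh span collection (hypothetical)."""

    max_refresh_span_bytes: int
    spans: list = field(default_factory=list)
    used_bytes: int = 0
    collecting: bool = True

    def record_read(self, start_key: str, end_key: str) -> None:
        """Remember a read span unless that would exceed the byte budget."""
        if not self.collecting:
            return
        size = len(start_key) + len(end_key)  # simplified byte accounting
        if self.used_bytes + size > self.max_refresh_span_bytes:
            # Over budget: assume the spans are dropped and collection
            # stays off for the rest of the txn.
            self.spans.clear()
            self.used_bytes = 0
            self.collecting = False
            return
        self.spans.append((start_key, end_key))
        self.used_bytes += size

    def on_push(self) -> str:
        """Outcome when the txn's timestamp is pushed."""
        if self.collecting:
            # The txn can try to re-validate its reads at the new timestamp.
            return "can_attempt_refresh"
        # Nothing to re-validate against, so the client gets an error.
        return "client_error"


tracker = RefreshTracker(max_refresh_span_bytes=8)
tracker.record_read("a", "b")            # 2 bytes, within budget
tracker.record_read("user/1", "user/9")  # 12 more bytes, over budget
print(tracker.on_push())
```

In this model the second read pushes the tracker over its budget, so a later push can only end in a client error, even though no other txn conflicted with it.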
There are three separate issues here:
We want a larger default for max_refresh_span_bytes so that the scenario becomes less likely. This is predicated on better memory tracking in KV, a separate work item (planned for 20.1; see the work @tbg has started in #44341, "[dnm] kv: expose (and use) byte batch response size limit"). I think this is orthogonal and should be kept out of scope here.
When the scenario happens, we want the error message to be clearer about what needs to happen: decrease the duration of the txn, decrease its number of refresh spans (fewer reads/writes), increase max_refresh_span_bytes, or increase the closed ts delay.
Or could we avoid the situation entirely? Make the closed ts lag behind the long-running txn if it has disabled refresh span collection.
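As a sketch of the second item, a clearer message could name the cause and list the four remedies. The function name, its parameters, and the wording are all hypothetical; this is not an existing CockroachDB error, only an illustration of the information the message might carry.

```python
def describe_push_without_refresh_spans(
    txn_age_s: float, max_refresh_span_bytes: int
) -> str:
    """Build a hypothetical, more explanatory error message."""
    # The four remedies listed in the issue text.
    remedies = [
        "decrease the duration of the txn",
        "decrease its number of refresh spans (fewer reads/writes)",
        "increase max_refresh_span_bytes",
        "increase the closed ts delay",
    ]
    return (
        f"txn was pushed after {txn_age_s:.0f}s and cannot be refreshed: "
        "refresh span collection stopped after exceeding "
        f"max_refresh_span_bytes ({max_refresh_span_bytes} bytes). "
        "This is not caused by contention. To avoid it: "
        + "; ".join(remedies)
        + "."
    )


print(describe_push_without_refresh_spans(45.0, 256000))
```

Stating explicitly that contention is not the cause matters here, since a client that sees a generic retry error would otherwise have no reason to look at txn duration or the refresh span budget.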
The scenario above was found by a user.
cc @ajwerner @tbg for triage.
Jira issue: CRDB-5215