server, sql: surface session txnCount, recent txn fingerprints, active time#82352
server, sql: surface session txnCount, recent txn fingerprints, active time#82352craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
fadce86 to
60160c8
Compare
rafiss
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @rafiss and @xinhaoz)
pkg/sql/logictest/testdata/logic_test/crdb_internal line 371 at r1 (raw file):
SELECT * FROM crdb_internal.node_sessions WHERE node_id < 0 ---- node_id session_id user_name client_address application_name active_queries last_active_query num_txns_executed session_start oldest_query_start kv_txn alloc_bytes max_alloc_bytes status session_end
i feel like this might be less disruptive if the new column were added to the end, so the column ordering doesn't change as much. (would affect any users who are doing select * queries)
matthewtodd
left a comment
There was a problem hiding this comment.
mod a few questions and @rafiss's suggestion.
Reviewed 6 of 19 files at r1.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @matthewtodd, @rafiss, and @xinhaoz)
pkg/sql/conn_executor.go line 2904 at r2 (raw file):
ex.extraTxnState.txnCounter++ // Session is considred active when executing a transaction.
nit: typo in "considred"
pkg/sql/txn_fingerprint_id_cache.go line 77 at r2 (raw file):
} // Add adds a TxnFingerprintID to the cache, truncating the cache to the
nit: looks like this sentence got truncated?
pkg/sql/txn_fingerprint_id_cache.go line 92 at r2 (raw file):
// GetAllTxnFingerprintIDs returns a slice of all TxnFingerprintIDs in the cache. // The cache may be truncated if the capacity was updated to a smaller size.
nit: I wonder if it would be sufficient to just let adding to the cache handle eviction, WDYT?
pkg/util/timeutil/stopwatch.go line 90 at r2 (raw file):
w.mu.Lock() defer w.mu.Unlock() return w.mu.started, w.mu.startedAt
nit: Would it feel more idiomatic to reverse the order of these return parameters, like the v, ok when getting from a map?
xinhaoz
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @matthewtodd and @rafiss)
pkg/sql/conn_executor.go line 2904 at r2 (raw file):
Previously, matthewtodd (Matthew Todd) wrote…
nit: typo in "considred"
Fixed.
pkg/sql/txn_fingerprint_id_cache.go line 77 at r2 (raw file):
Previously, matthewtodd (Matthew Todd) wrote…
nit: looks like this sentence got truncated?
Fixed.
pkg/sql/txn_fingerprint_id_cache.go line 92 at r2 (raw file):
Previously, matthewtodd (Matthew Todd) wrote…
nit: I wonder if it would be sufficient to just let adding to the cache handle eviction, WDYT?
Yeah, the rate at which txn fingerprints are added in a real workload might be sufficient for ensuring the cache eventually gets resized appropriately. But in the off chance where none are inserted between a change in capacity and a retrieval of fingerprint ids, are we okay with the number of returned ids being greater than the cache's newly set capacity? I'm also not sure how often sessions become idle for extended periods of time in practice.
pkg/util/timeutil/stopwatch.go line 90 at r2 (raw file):
Previously, matthewtodd (Matthew Todd) wrote…
nit: Would it feel more idiomatic to reverse the order of these return parameters, like the
v, okwhen getting from a map?
Ah good suggestions! Done. I also changed the name of this function to more accurately reflect what it returns.
pkg/sql/logictest/testdata/logic_test/crdb_internal line 371 at r1 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
i feel like this might be less disruptive if the new column were added to the end, so the column ordering doesn't change as much. (would affect any users who are doing
select *queries)
Done.
abarganier
left a comment
There was a problem hiding this comment.
Not much to add on top of what others said aside from a lingering merge conflict artifact. otherwise!
Reviewable status:
complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @abarganier, @matthewtodd, @rafiss, and @xinhaoz)
docs/generated/settings/settings-for-tenants.txt line 276 at r3 (raw file):
sql.ttl.default_select_batch_size integer 500 default amount of rows to select in a single query during a TTL job sql.ttl.job.enabled boolean true whether the TTL job is enabled <<<<<<< HEAD
nit: Lingering artifacts from merge conflict?
Code quote:
<<<<<<< HEAD
=======
sql.ttl.range_batch_size integer 100 amount of ranges to fetch at a time for a table during the TTL job
sql.txn_fingerprint_id_cache.capacity integer 100 the maximum number of txn fingerprint IDs stored
>>>>>>> 4dd1f55533 (server, sql: surface session txnCount, txn fingerprints, active time)Partially addresses cockroachdb#74257. Previously, the status server did not provide session details such as total number of transactions executed, transaction fingerprint IDs, and total active time. This change adds the aforementioned session details to the `serverpb.Session` struct. To track recently executed transaction fingerprint IDs, a FIFO cache `TxnFingerprintIDCache` is introduced with its corresponding cluster setting `TxnFingerprintIDBufferCapacity` to control the capacity. The default capacity is set at 100 fingerprints. The total number of transactions executed is filled using the existing `txnCounter` from the `extraTxnState` in `connExecutor`. The total active time is calculated by introducing a `timeutil.StopWatch` to the connection executor, which is started and stopped when a transaction is started and finished respectively. Release note (api change): the `serverpb.Session` struct now has three new fields: number of transactions executed, transaction fingerprint IDs, and total active time.
|
Thanks everyone for the reviews! |
|
Build failed (retrying...): |
|
Build failed (retrying...): |
|
Build failed (retrying...): |
|
Build succeeded: |
Finishing up Gerardo's PR, original review here: #80717
Partially addresses #74257.
Previously, the status server did not provide session details such as
total number of transactions executed, transaction fingerprint
IDs, and total active time. This change adds the aforementioned session
details to the
serverpb.Sessionstruct.To track recently executed transaction fingerprint IDs, a FIFO cache
TxnFingerprintIDCacheis introduced with its corresponding clustersetting
TxnFingerprintIDBufferCapacityto control the capacity. Thedefault capacity is set at 100 fingerprints.
The total number of transactions executed is filled using the existing
txnCounterfrom theextraTxnStateinconnExecutor. The total activetime is calculated by introducing a
timeutil.StopWatchto the connectionexecutor, which is started and stopped when a transaction is started and
finished respectively.
Release note (api change): the
serverpb.Sessionstruct now has threenew fields: number of transactions executed, transaction fingerprint
IDs, and total active time.