sql: generate crdb_internal.cluster_locks using worker and memory accounting#80216

Closed
AlexTalks wants to merge 4 commits into cockroachdb:master from AlexTalks:cluster_locks_worker
Conversation


@AlexTalks AlexTalks commented Apr 20, 2022

sql: convert crdb_internal.cluster_locks virtual table to use worker

This change updates the generator for the crdb_internal.cluster_locks
virtual table so that it can be wrapped in a worker goroutine by calling
setupGenerator. Previously the virtual table generator did not run in a
separate goroutine; wrapping it as a worker allows for cancellation
during table generation, eliminating potentially unnecessary RPCs. This
follows the pattern set by other virtual tables within crdb_internal.

sql: account for memory used in generating crdb_internal.cluster_locks

This change adds memory accounting for the memory used in generating the
virtual table crdb_internal.cluster_locks. Using the BoundAccount from
the planner's EvalContext, it accounts for the memory used by the
QueryLocksResponse returned from the RPC fanout needed to generate the
table. If the allocated memory budget has been reached, the call to
acct.Grow() will be denied and return an error that stops row
generation, ensuring that generating this table cannot cause memory
exhaustion.

Release justification: Minor improvements.

Depends on #77876, #80422, #79623

@AlexTalks AlexTalks added the do-not-merge bors won't merge a PR with this label. label Apr 20, 2022
@AlexTalks AlexTalks requested review from a team as code owners April 20, 2022 01:56
@AlexTalks AlexTalks requested review from a team April 20, 2022 01:56
@cockroach-teamcity

This change is Reviewable

@otan otan removed the request for review from a team April 20, 2022 03:54
@irfansharif irfansharif marked this pull request as draft April 20, 2022 11:14
@AlexTalks AlexTalks force-pushed the cluster_locks_worker branch 2 times, most recently from eaef3d2 to 9cf01f1 on April 21, 2022 19:53
@AlexTalks AlexTalks changed the title sql: convert crdb_internal.cluster_locks virtual table to use worker sql: generate crdb_internal.cluster_locks using worker and memory accounting Apr 21, 2022
@AlexTalks AlexTalks marked this pull request as ready for review April 21, 2022 19:59
@AlexTalks AlexTalks removed the do-not-merge bors won't merge a PR with this label. label Apr 21, 2022

@nvb nvb left a comment


Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks and @nvanbenschoten)


pkg/sql/crdb_internal.go, line 6069 at r5 (raw file):

		resumeSpan = resp.ResumeSpan

		if err = acct.Grow(ctx, int64(resp.Size())); err != nil {

We're adding to our memory account when we get a response and freeing that before the next request. I have two questions about this:

  1. Should we be requesting bytes from the budget before sending the request to avoid ever exceeding our memory budget? Maybe requesting b.Header.TargetBytes ahead of time?
  2. Does this leave the memory account with some accounted bytes after the last request is sent? Is that ok?

@AlexTalks AlexTalks force-pushed the cluster_locks_worker branch from 9cf01f1 to 3f40ff6 on April 25, 2022 21:18

@AlexTalks AlexTalks left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Azhng and @nvanbenschoten)


pkg/sql/crdb_internal.go line 6069 at r5 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

We're adding to our memory account when we get a response and freeing that before the next request. I have two questions about this:

  1. Should we be requesting bytes from the budget before sending the request to avoid ever exceeding our memory budget? Maybe requesting b.Header.TargetBytes ahead of time?
  2. Does this leave the memory account with some accounted bytes after the last request is sent? Is that ok?

Perhaps @Azhng can answer this a bit better than me, but as I understand it:

  1. It seems like requesting budget with acct.Grow(..) is generally done after a request is made and we have the response object, at least here in crdb_internal.go. If we should do this differently - e.g. allocate TargetBytes aka 10MB prior to each request, and then shrink it down to the actual response size, definitely let me know.
  2. I don't believe so because of the defer acct.Close().


@Azhng Azhng left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks and @nvanbenschoten)


pkg/sql/crdb_internal.go line 6069 at r5 (raw file):

Previously, AlexTalks (Alex Sarkesian) wrote…

Perhaps @Azhng can answer this a bit better than me, but as I understand it:

  1. It seems like requesting budget with acct.Grow(..) is generally done after a request is made and we have the response object, at least here in crdb_internal.go. If we should do this differently - e.g. allocate TargetBytes aka 10MB prior to each request, and then shrink it down to the actual response size, definitely let me know.
  2. I don't believe so because of the defer acct.Close().

Currently, SQL Stats's memory accounting within crdb_internal.go is quite problematic for a few reasons:

  1. SQL Stats's node-level API is not paginated, and it's difficult to implement pagination on it since it's essentially a hash aggregation. (Hence SQL Stats quite aggressively pushes data into the system table at a 10-minute interval; when reading from the table, DistSQL has its own internal memory accounting.)
  2. Without pagination, the RPC client doesn't have a good way to communicate to the server the maximum number of bytes it would accept in the RPC response.

With that being said, the memory accounting done within crdb_internal.go for SQL Stats is there so that it's "better than nothing".

Since KV RPC requests have the ability to paginate, and also the ability to specify maximum target bytes in the RPC request header, I think we should perform memory accounting before sending the RPC request. IIRC, similar memory accounting is implemented at the SQL<->KV boundary.

Previously, virtual schema tables could be defined with either a
`populate` function, which eagerly loads all rows, or a `generator`
function, which lazily loads each row when called (possibly running in a
worker goroutine); virtual indexes, in turn, could define their own
`populate` functions. There was, however, a subtle lack of support for
using virtual indexes with virtual tables that used a `generator`: in
several cases the virtual index constraint logic would fall back to a
(possibly undefined) `populate` function. This change fixes the nil
pointer exception that could occur when using virtual indexes with a
`generator`-based table, and validates that the virtual index is
supported prior to use.

Release note: None

@AlexTalks AlexTalks left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Azhng and @nvanbenschoten)


pkg/sql/crdb_internal.go line 6069 at r5 (raw file):

Previously, Azhng (Archer Zhang) wrote…

Currently, SQL Stats's memory accounting within crdb_internal.go is quite problematic for a few reasons:

  1. SQL Stats's node-level API is not paginated, and it's difficult to implement pagination on it since it's essentially a hash aggregation. (Hence SQL Stats quite aggressively pushes data into the system table at a 10-minute interval; when reading from the table, DistSQL has its own internal memory accounting.)
  2. Without pagination, the RPC client doesn't have a good way to communicate to the server the maximum number of bytes it would accept in the RPC response.

With that being said, the memory accounting done within crdb_internal.go for SQL Stats is there so that it's "better than nothing".

Since KV RPC requests have the ability to paginate, and also the ability to specify maximum target bytes in the RPC request header, I think we should perform memory accounting before sending the RPC request. IIRC, similar memory accounting is implemented at the SQL<->KV boundary.

Discussed this with @Azhng on Slack a bit more - we agreed that the goal here is to account for memory used in generation (i.e. RPC responses and other temp buffers), rather than for the actual row data, which is likely better left to other parts of SQL to track.

One thing we discussed is requesting a budget of TargetBytes (i.e. 10MB) at the start of generation and simply leaving it as is (since we only hold onto one page of data at a time). However, on second thought, this might be overkill, especially given that the "paged" response is per table descriptor. It seems like other usages of memory accounting do in fact perform a sort of "trailing" accounting when you don't have expectations about the response size, so hopefully that is sufficient for tracking. If we need to request the budget up front, however, we can do that.

This change adds virtual indexes on the `table_id`, `database_name`, and
`table_name` columns of the `crdb_internal.cluster_locks` virtual table,
so that when queried, the `kv.Batch`es with `QueryLocksRequest`s can be
constrained to query only specific tables or databases. This allows the
requests to be much more limited, rather than needing to request all of
the ranges that comprise the key spans of all tables accessible by the
user.

Release note (sql change): Improved query performance for
`crdb_internal.cluster_locks` when issued with constraints in the WHERE
clause on `table_id`, `database_name`, or `table_name` columns.
This change updates the generator for the `crdb_internal.cluster_locks`
virtual table so that it can be wrapped in a worker goroutine by calling
`setupGenerator`. Previously the virtual table generator did not run in
a separate goroutine; wrapping it as a worker allows for cancellation
during table generation, eliminating potentially unnecessary RPCs. This
follows the pattern set by other virtual tables within `crdb_internal`.

Release note: None

Release justification: Minor improvement.
This change adds memory accounting for the memory used in generating the
virtual table `crdb_internal.cluster_locks`. Using the `BoundAccount`
from the planner's `EvalContext`, it accounts for the memory used by the
`QueryLocksResponse` returned from the RPC fanout needed to generate the
table. If the allocated memory budget has been reached, the call to
`acct.Grow()` will be denied and return an error that stops row
generation, ensuring that generating this table cannot cause memory
exhaustion.

Release note: None
@AlexTalks AlexTalks force-pushed the cluster_locks_worker branch from 3f40ff6 to c2860c3 on April 26, 2022 01:59
@AlexTalks AlexTalks closed this Oct 20, 2022