sql: generate crdb_internal.cluster_locks using worker and memory accounting#80216

Closed
AlexTalks wants to merge 4 commits into cockroachdb:master from AlexTalks:cluster_locks_worker
Conversation


@AlexTalks AlexTalks commented Apr 20, 2022

sql: convert crdb_internal.cluster_locks virtual table to use worker

This change updates the generator for the crdb_internal.cluster_locks
virtual table so that it can be wrapped in a worker goroutine by calling
setupGenerator. Previously the virtual table generator did not run in a
separate goroutine; wrapping it as a worker allows for cancellation
during table generation, eliminating potentially unnecessary RPCs. This
follows the pattern set by other virtual tables within crdb_internal.

sql: account for memory used in generating crdb_internal.cluster_locks

This change adds memory accounting for the memory used in generating the
virtual table crdb_internal.cluster_locks. Using the BoundAccount from
the planner's EvalContext, it accounts for the memory used by the
QueryLocksResponse returned from the RPC fanout needed to generate the
table. If the allocated memory budget has been reached, the call to
acct.Grow() will be denied and return an error that stops row
generation, ensuring that generating this table cannot cause memory
exhaustion.

Release justification: Minor improvements.

Depends on #77876, #80422, #79623

@AlexTalks AlexTalks added the do-not-merge bors won't merge a PR with this label. label Apr 20, 2022
@AlexTalks AlexTalks requested review from a team as code owners April 20, 2022 01:56
@AlexTalks AlexTalks requested review from a team April 20, 2022 01:56
@cockroach-teamcity

This change is Reviewable

@otan otan removed the request for review from a team April 20, 2022 03:54
@irfansharif irfansharif marked this pull request as draft April 20, 2022 11:14
@AlexTalks AlexTalks force-pushed the cluster_locks_worker branch 2 times, most recently from eaef3d2 to 9cf01f1 on April 21, 2022 19:53
@AlexTalks AlexTalks changed the title sql: convert crdb_internal.cluster_locks virtual table to use worker sql: generate crdb_internal.cluster_locks using worker and memory accounting Apr 21, 2022
@AlexTalks AlexTalks marked this pull request as ready for review April 21, 2022 19:59
@AlexTalks AlexTalks removed the do-not-merge bors won't merge a PR with this label. label Apr 21, 2022

@nvb nvb left a comment


Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks and @nvanbenschoten)


pkg/sql/crdb_internal.go, line 6069 at r5 (raw file):

		resumeSpan = resp.ResumeSpan

		if err = acct.Grow(ctx, int64(resp.Size())); err != nil {

We're adding to our memory account when we get a response and freeing that before the next request. I have two questions about this:

  1. Should we be requesting bytes from the budget before sending the request to avoid ever exceeding our memory budget? Maybe requesting b.Header.TargetBytes ahead of time?
  2. Does this leave the memory account with some accounted bytes after the last request is sent? Is that ok?

@AlexTalks AlexTalks force-pushed the cluster_locks_worker branch from 9cf01f1 to 3f40ff6 on April 25, 2022 21:18

@AlexTalks AlexTalks left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Azhng and @nvanbenschoten)


pkg/sql/crdb_internal.go line 6069 at r5 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

We're adding to our memory account when we get a response and freeing that before the next request. I have two questions about this:

  1. Should we be requesting bytes from the budget before sending the request to avoid ever exceeding our memory budget? Maybe requesting b.Header.TargetBytes ahead of time?
  2. Does this leave the memory account with some accounted bytes after the last request is sent? Is that ok?

Perhaps @Azhng can answer this a bit better than me, but as I understand it:

  1. It seems like requesting budget with acct.Grow(..) is generally done after a request is made and we have the response object, at least here in crdb_internal.go. If we should do this differently - e.g. allocate TargetBytes aka 10MB prior to each request, and then shrink it down to the actual response size, definitely let me know.
  2. I don't believe so because of the defer acct.Close().


@Azhng Azhng left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks and @nvanbenschoten)


pkg/sql/crdb_internal.go line 6069 at r5 (raw file):

Previously, AlexTalks (Alex Sarkesian) wrote…

Perhaps @Azhng can answer this a bit better than me, but as I understand it:

  1. It seems like requesting budget with acct.Grow(..) is generally done after a request is made and we have the response object, at least here in crdb_internal.go. If we should do this differently - e.g. allocate TargetBytes aka 10MB prior to each request, and then shrink it down to the actual response size, definitely let me know.
  2. I don't believe so because of the defer acct.Close().

Currently, SQL Stats's memory accounting within crdb_internal.go is quite problematic for a few reasons:

  1. SQL Stats's node-level API is not paginated, and it's difficult to implement pagination on it since it's essentially a hash aggregation. (Hence SQL Stats quite aggressively pushes data into the system table at a 10-minute interval; when reading from the table, DistSQL has its own internal memory accounting.)
  2. Without pagination, the RPC client doesn't have a good way to communicate to the server the maximum number of bytes it would accept in the RPC response.

With that being said, the memory accounting done within crdb_internal.go for SQL Stats is there so that it's "better than nothing".

Since KV RPC requests have the ability to paginate, and also the ability to specify maximum target bytes in the RPC request header, I think we should perform memory accounting before sending the RPC request. IIRC, similar memory accounting is implemented at the SQL<->KV boundary.

Previously, virtual schema tables could be defined with either a
`populate` function, which eagerly loads all rows, or a `generator`
function, which lazily loads each row when called (possibly running in a
worker goroutine); virtual indexes, in turn, could define their own
`populate` functions. There was, however, a subtle lack of support for
using virtual indexes with virtual tables that used a `generator`: in
several cases the virtual index constraint logic would fall back to a
(possibly undefined) `populate` function. This change fixes the nil
pointer exception that could occur when using virtual indexes with a
`generator`-based table, and validates that the virtual index is
supported prior to use.

Release note: None

@AlexTalks AlexTalks left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Azhng and @nvanbenschoten)


pkg/sql/crdb_internal.go line 6069 at r5 (raw file):

Previously, Azhng (Archer Zhang) wrote…

Currently, SQL Stats's memory accounting within crdb_internal.go is quite problematic for a few reasons:

  1. SQL Stats's node-level API is not paginated, and it's difficult to implement pagination on it since it's essentially a hash aggregation. (Hence SQL Stats quite aggressively pushes data into the system table at a 10-minute interval; when reading from the table, DistSQL has its own internal memory accounting.)
  2. Without pagination, the RPC client doesn't have a good way to communicate to the server the maximum number of bytes it would accept in the RPC response.

With that being said, the memory accounting done within crdb_internal.go for SQL Stats is there so that it's "better than nothing".

Since KV RPC requests have the ability to paginate, and also the ability to specify maximum target bytes in the RPC request header, I think we should perform memory accounting before sending the RPC request. IIRC, similar memory accounting is implemented at the SQL<->KV boundary.

Discussed this with @Azhng on Slack a bit more - we agreed that the goal here is to account for memory used in generation (i.e. RPC responses and other temp buffers), rather than for the actual row data, which is likely better left to other parts of SQL to track.

One thing we discussed is requesting a budget of TargetBytes (i.e. 10MB) at the start of generation and simply leaving it as is (since we only hold onto one page of data at a time). However, on second thought, this might be overkill, especially given that the "paged" response is per table descriptor. It seems like other usages of memory accounting do in fact perform a sort of "trailing" accounting when you don't have expectations about the response size, so hopefully that is sufficient for tracking. If we need to request the budget up front, however, we can do that.

This change adds virtual indexes on the `table_id`, `database_name`, and
`table_name` columns of the `crdb_internal.cluster_locks` virtual table,
so that when queried, the `kv.Batch`es with `QueryLocksRequest`s can be
constrained to query only specific tables or databases. This allows the
requests to be much more limited, rather than needing to request all of
the ranges that comprise the key spans of all tables accessible by the
user.

Release note (sql change): Improved query performance for
`crdb_internal.cluster_locks` when issued with constraints in the WHERE
clause on `table_id`, `database_name`, or `table_name` columns.
This change updates the generator for the `crdb_internal.cluster_locks`
virtual table so that it can be wrapped in a worker goroutine by calling
`setupGenerator`. Previously the virtual table generator did not run in
a separate goroutine; wrapping it as a worker allows for cancellation
during table generation, eliminating potentially unnecessary RPCs. This
follows the pattern set by other virtual tables within `crdb_internal`.

Release note: None

Release justification: Minor improvement.
This change adds memory accounting for the memory used in generating the
virtual table `crdb_internal.cluster_locks`. Using the `BoundAccount`
from the planner's `EvalContext`, it accounts for the memory used by the
`QueryLocksResponse` returned from the RPC fanout needed to generate the
table. If the allocated memory budget has been reached, the call to
`acct.Grow()` will be denied and return an error that stops row
generation, ensuring that generating this table cannot cause memory
exhaustion.

Release note: None
@AlexTalks AlexTalks force-pushed the cluster_locks_worker branch from 3f40ff6 to c2860c3 on April 26, 2022 01:59
@AlexTalks AlexTalks closed this Oct 20, 2022