builtins: add pg_total_relation_size builtin #59604
jordanlewis wants to merge 2 commits into cockroachdb:master from
Conversation
# crdb_internal.ranges in the logic test context.
# Make sure also that a non-existent table id returns null.
query II
SELECT pg_table_size('size_tester'::regclass), pg_table_size(387438)
is there no way to test that this works outside of logic tests?
I added a test, thanks for the push!
c7e9038 to 73d1d9a
I rewrote the function to avoid depending on the virtual table, which has to fetch all range, table, and node descriptors, and to avoid getting the range stats for every range in a separate RPC. Now, it fetches just the range descriptors it needs and does one big RPC to get all of their stats at once. I also added a test.
Actually, this might be subtly wrong or confusing. So I guess this function as written would be closer to
Am I missing something -- why is it not pg_total_relation_size?
Hmm, yeah, sounds about right
c3ebc79 to 91a87c9
PTAL @tbg, I know you weighed in on #20712 with a comment,
but given we haven't had anything here in the last 3 years and have no plans to improve it, do any objections remain? I now see that we have yet another way of calculating this via
otan left a comment
Reviewed 1 of 2 files at r1, 1 of 3 files at r3, 3 of 3 files at r4.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @otan)
pkg/sql/crdb_internal.go, line 2444 at r4 (raw file):
end_key, end_pretty, table_id,
nit: indentation misaligned
@jordanlewis I can only warn, not make the call. Anything that fans out to the whole cluster is dangerous when called frequently. Is this going to be called frequently? I don't know. Probably not, but it's not my call to make. An individual call shouldn't be too bad, since this only reads from the leaseholders.

Orthogonally, is this the right implementation? It looks like people want this to reflect the disk usage, not the number of logical bytes in the cluster, which is what you're returning now. By implementing it as such, we're just going to switch users from asking for this to be implemented to asking why it returns seemingly bogus numbers, especially seeing how the UI perhaps does not return the same thing (I haven't checked what "table size" means there). If there has been a discussion and it was resolved that it should do what this PR proposes, I haven't seen it either here or on the linked issues.
I don't have much context on this PR, but just inquiring: if the pg_table_size stuff is still under debate, should we cherry-pick the first commit so that #59601 is fixed?
Yes, I think so. I'll take care of that today.
This function tends to get called by metadata-scraping admin consoles, Prometheus exporters, and other things that will presumably call it once per table in the whole database. So maybe implementing it this way is not the best idea.
I think, based on the text you wrote below this question, the answer is no. I didn't realize that the MVCC stats didn't reflect the physical disk usage of the range. I'll figure out another way to do this, after I merge the first commit to fix #59601.
6e27449 to 602905e
1708a26 to e7a7ac6
I took a look at redoing this today. I ended up having the builtin function call into the admin server's TableStatsRequest API, the same as what the DB Console does. This will prevent any confusion where the numbers don't align, something that I agree would be deeply confusing for everyone. For now, though, this builtin will still fan out. It's not ideal, but I think that ultimately it's better to provide this information in both SQL and the DB Console, even if it's inefficient. I think it would be good to collect these statistics during, say, table statistics collection, and store them in a new table or in a new way in the existing table statistics system table, but that's going to be a pretty significant chunk of work.
knz left a comment
Jordan, I'm sorry but this is not going to fly. Our multi-tenant architecture is going to be the way moving forward, even for dedicated / on-prem users, and the SQL pods don't serve the admin UI / status server.
What you really need here is access to the (s *adminServer) statsForSpan() method from SQL. But this method has no reason to be on adminServer; it doesn't even use adminServer specifics. It could just as well be implemented on (*Server) or even (*sql.Server). If you were to move it to the latter, you could access it safely from a SQL builtin, and the changes to server/admin.go to get there would be minimal: from adminServer to sql.Server is just s.server.sqlServer.
Reviewed 2 of 11 files at r6.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt, @jordanlewis, and @otan)
(Alternatively, you can move it to a method of
@tbg can you check whether my proposal to integrate
29eed06 to ff432d3
I embarked on an exciting plumbing adventure and moved the stats-for-span implementation into the SQL server. However, we still need to call into the status server, once per KV node that has a relevant leaseholder, to fetch the span stats underneath... is that acceptable? In the end, I think we actually could have called into a remote
... and crdb_internal.ranges_no_leases

This commit adds schema_name to crdb_internal.ranges and crdb_internal.ranges_no_leases to ensure that it's possible to disambiguate between ranges that are contained by a table with the same name in two different user-defined schemas. In addition, it also adds the table_id column, which allows unambiguous lookups of ranges for a given table id. This will also enable making a virtual index on the table_id column later, which should be a nice win for some introspection commands.

Release note (sql change): add the schema_name and table_id columns to the crdb_internal.ranges and crdb_internal.ranges_no_leases virtual tables.
ff432d3 to 5b867ab
That is not the plan, no.
(Which also means that our current implementation of SQL session handling, which is currently routed through these things, is incorrect. See related issue #60584.)
It seems to me that we need an API like this:
Again,
This topic was discussed in this week's KV team meeting.
So in summary, we have strong uncertainty and a lack of leadership/vision about how to expose KV-level (or storage-level) information through SQL in a multi-tenant deployment. I am currently working with PM to figure out who owns the architectural vision here. Will report when we know more.
Just a note of intent - I want to come back to this one now that we've gotten further in our MT architecture.
Any chance this can be revisited, @jordanlewis? We're finding a need for this and wanted to see if there are any updates.
Closes #20712.
Depends on #59865.
This builtin function returns the number of bytes in a table. For
CockroachDB, this means summing up the number of bytes in the ranges of
a table.
Notably, we do not multiply the number of bytes in the table's ranges by
the number of replicas - this is left up to the user to do if desired.
Release note (sql change): add the pg_total_relation_size builtin for estimating
the amount of data in a table.