sql: disallow the use of pg_catalog tables with no cur db defined#23045
sql: disallow the use of pg_catalog tables with no cur db defined#23045knz merged 1 commit intocockroachdb:masterfrom
pg_catalog tables with no cur db defined#23045Conversation
782510f to
b80a8cb
Compare
b80a8cb to
35e8c5e
Compare
35e8c5e to
f826503
Compare
|
Rebased on top of #23051 to ensure the logic underlying the test that this PR touches is correct. PTAL |
The contents of the various `pg_catalog` tables are very ill-defined when there is no current database set. (For context, in PostgreSQL it is not possible for a client to have no "current database" set. There is always a current database in a pg session.) This ill definition surfaces in the shape of mildly confusing situations, like duplicate schema/table names in `pg_tables` when there there are two tables with the same name in different databases (because `pg_catalog` tables do not contain a catalog column, unlike `information_schema` tables), to blatantly problematic output (`pg_proc` wants a `pronamespace` OID column, which suggests separate entries for each database/schema, but all the built-in functions should only be listed once) and outright bugs (`::regproc` and `::regclass` conversions don't resolve properly with no current database set). In order to restore sanity, this patch outright disallows using any of the `pg_catalog` vtables with no current databaset set. This should not create incompatibility with pg-specific tools anyways, since these should have been designed to always specify a current db in the connection string, and would have been bound to not-work with no current database set anyway. The other vtables (`crdb_internal` and `information_schema`) do not have these problems are are thus still allowed when there is no current database set. Release note (sql change): the `pg_catalog` virtual tables, as well as the special casts `::regproc` and `::regclass`, can only be queried by clients that properly set a current database (`SET database = ...` or `USE`).
f826503 to
5ca5c5a
Compare
|
Instead of this approach, did you give any more thought to removing the "no cur database" state entirely? I'm worried that this change is going to make our new database/schema distinction even more unclear. If we do decide to keep this behavior, should we do the same thing for |
Yes. This would be a feature change and not suitable for the 2.0 release, yet there is a clear problem in 2.0 to be solved.
How so? Please be concrete here. I see only upsides and no downsides myself.
Yes, both |
nvb
left a comment
There was a problem hiding this comment.
I see only upsides and no downsides myself.
The major downside I see is that there will now be a discrepancy between the output of SHOW SCHEMAS with and without a database set. This may be unclear to anyone exploring the object hierarchy of Cockroach and I suspect that we'll end up with issues like "why is pg_catalog missing!!".
EDIT: I now see that SHOW SCHEMAS does not work when a database is not specified. This helps relieve my previous concern. However, it does introduce a new issue where some schemas can be queried outside of the context of a database but it's not clear which ones they are. These schemas are also not properly reflected in information_schema.schemata. This is a problem with the "no current db" state itself, so I won't push it here. I also agree that removing this state should not be in 2.0, but do we at least have an issue to remove it in 2.1?
| } | ||
| } else { | ||
| if !e.validWithNoDatabaseContext { | ||
| return nil, errNoDatabase |
There was a problem hiding this comment.
We should add PG codes to all four errors defined at the top of descriptor.go.
Sounds like a bug to me. Will have a look :) |
|
Ok indeed The fact that querying from |
|
Filed the discussion about removing the "no current db" situation as #23145. |
|
Ok merging this now -- I'm willing to revisit this area as we try more pg tools and discover there's more to be done. |
Fixes #23028.
The contents of the various
pg_catalogtables are very ill-definedwhen there is no current database set.
(For context, in PostgreSQL it is not possible for a client to have no
"current database" set. There is always a current database in a pg
session.)
This ill definition surfaces in the shape of mildly confusing
situations, like duplicate schema/table names in
pg_tableswhenthere there are two tables with the same name in different databases
(because
pg_catalogtables do not contain a catalog column, unlikeinformation_schematables), to blatantly problematic output(
pg_procwants apronamespaceOID column, which suggests separateentries for each database/schema, but all the built-in functions
should only be listed once) and outright bugs (
::regprocand::regclassconversions don't resolve properly with no currentdatabase set).
In order to restore sanity, this patch outright disallows using any of
the
pg_catalogvtables with no current databaset set. This shouldnot create incompatibility with pg-specific tools anyways, since these
should have been designed to always specify a current db in the
connection string, and would have been bound to not-work with no
current database set anyway.
The other vtables (
crdb_internalandinformation_schema) do nothave these problems are are thus still allowed when there is no
current database set.
Release note (sql change): the
pg_catalogvirtual tables, as well asthe special casts
::regprocand::regclass, can only be queried byclients that properly set a current database (
SET database = ...orUSE).cc @rytaft