backport-2.1: fix the handling of SQL types#29006
Merged
craig[bot] merged 29 commits intocockroachdb:release-2.1from Aug 24, 2018
Merged
backport-2.1: fix the handling of SQL types#29006craig[bot] merged 29 commits intocockroachdb:release-2.1from
craig[bot] merged 29 commits intocockroachdb:release-2.1from
Conversation
Member
BramGruneir
approved these changes
Aug 23, 2018
Member
BramGruneir
left a comment
There was a problem hiding this comment.
This is a ton to backport. I've reviewed the individual commits, but I have no idea how to review this.
I think the commits are good, just a question of if anyone else has any objections.
Reviewable status:
complete! 0 of 0 LGTMs obtained
Contributor
Author
|
For reference there was zero merge conflict. So if you're good with the base PRs I'd say there's nothing special to see here. |
BramGruneir
reviewed
Aug 23, 2018
Member
BramGruneir
left a comment
There was a problem hiding this comment.
SGTM
Reviewable status:
complete! 0 of 0 LGTMs obtained
257112b to
7e9633b
Compare
Contributor
Author
|
bors r+ |
craig bot
pushed a commit
that referenced
this pull request
Aug 23, 2018
29006: backport-2.1: fix the handling of SQL types r=knz a=knz Backport 1/1 commits from #28937. Backport 1/1 commits from #28942. Backport 17/17 commits from #28943. Backport 1/1 commits from #28944. Backport 1/1 commits from #28945. Backport 1/1 commits from #28949. Backport 2/2 commits from #28950. Backport 1/1 commits from #28951 (#28959). Backport 1/1 commits from #28952 (#28959). Backport 1/1 commits from #28955 (#28959). Backport 1/1 commits from #28959. Backport 1/1 commits from #28959. Backport 1/1 commits from #28690. cc @cockroachdb/release Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
Contributor
Build failed |
The functions to retrieve the sizes of value encodings have not been used in a long time. Remove them. Release note: None
The `ColumnType` struct is not trivially small. Also calling a by-pointer method on a copy may cause it to escape on the heap. Avoid making copies until necessary. Release note: None
These will be used to break up table.go in smaller pieces. Release note: None
Release note: None
Release note: None
Release note: None
…estutils Release note: None
Release note: None
Release note: None
Release note: None
Release note: None
Release note: None
Since it's super small and used only in one place. Release note: None
Release note: None
to EncodeDatumKeyAscending and EncodeDatumsKeyAscending, respectively. Also, add comments and move them to column_type_encoding.go. Release note: None
This was not well defined; the UNION code that required it really has particular semantics and it is thus best defined close to the implementation of UNION. Release note: None
Release note: None
Release note: None
It is unneeded; we already have a perfectly useful `TypeName` map in package `oid` which does the same. Release note: None
Prior to this patch, CockroachDB incorrectly placed the "input syntax" of each SQL type in the column `data_type` of `information_schema.columns`. The input syntax is the one reported in SHOW COLUMNS, SHOW CREATE TABLE and other places, and is suitable to reproduce the exact type of at able. In contrast, `information_schema.columns.data_type` is constrained by compatibility with third party tools and PostgreSQL clients. It must report the name of the type like PostgreSQL does, which in turn is constrained by the SQL standard. A text column must be reported as "text" not "string"; a decimal column as "numeric" not "decimal", a float8 column as "double precision" not "float8", and so on. By reporting the wrong string in that column CockroachDB is confusing ORMs, which subsequently decide that the current on-disk type is not the one expected by the app and then initiate a schema change (ALTER COLUMN SET TYPE). This patch corrects this incompatibility by introducing logic that produces the proper information schema names for column types. This is expected to reduce ORM complaints about insufficient support for ALTER COLUMN SET TYPE (but will be tested/evaluated separately). Release note (bug fix): CockroachDB now populates the `data_type` column of `information_schema.columns` like PostgreSQL for compatibility with 3rd party tools and ORMs.
There are major problems in `cockroach dump` described separately in cockroachdb#28948. Part of this is `cockroach dump` trying to reverse-engineering a datum type from the announced column type in `information_schema.columns`, by parsing the type name. Prior to this patch, this parsing was incomplete (it went out of sync with the SQL parser over time, naturally as it was bound to do). This entire approach is flawed, but this is no time to redesign `cockroach dump`. Instead, this patch re-syncs the dump-specific type parser with the general SQL parser. Release note (bug fix): `cockroach dump` is now again better able to operate across multiple CockroachDB versions.
The distinction between "TEXT" and "STRING" inside CockroachDB is inconsequential. Remove it. The data type is already properly reported in pg_catalog and information_schema as "text". Release note: None
Release note: None
This patch corrects two long-standing bugs in CockroachDB which were confusing both 3rd party tools/ORMs and users: - the SQL standard type called "CHAR" is supposed to have a default maximum width of 1 character. Prior to this patch, CockroachDB did not support this and instead set no default maximum width on CHAR. This would cause CockroachDB to fail validating the width of values inserted in tables, fail to constrain the width of values during casts to CHAR, and report incorrect information in introspection tables. - PostgresSQL's dialect distinguishes the table column types TEXT, CHAR, VARCHAR and a special thing called `"char"` (this is a legacy PostgreSQL type which is, unfortunately, still used by some pg_catalog tables). Prior to this patch, CockroachDB would map all of these types into the same column type "STRING". In turn this caused them to show up as "text" in introspection. While this was not incorrect from the perspective of value handling (all these types are, indeed, handled with about the same value semantics in both PostgreSQL and CockroachDB), ORMs do pay attention to this distinction and become confused if they see a different type in introspection (e.g. "text") than what the application model requires (e.g. "char"). The result of this confusion was to trigger a schema change to adapt the type, which in turn fails due to missing features in ALTER COLUMN TYPE. This patch corrects both problems by giving the appropriate default width to CHAR and preserving the distinction between the various string types in column descriptors and thus introspection. Additionally, a width constraint check on collated string columns was missing. This is now fixed too. Tables and columns created prior to this patch are unaffected. Release note (bug fix): CockroachDB now distinguishes "CHAR" and "VARCHAR" as mandated by the SQL standard and PostgreSQL compatibility. When a width is not specified (e.g. `CHAR(3)`), the maximum width of `VARCHAR` remains unconstrained whereas the maximum width for `CHAR` is now 1 character. Release note (bug fix): CockroachDB now properly checks the width of strings inserted in a collated string column with a specified width. Release note (sql change): CockroachDB now preserves the distinction between different column types for string values like in PostgreSQL, for compatibility with 3rd party tools and ORMs.
The distinction between the type names "JSON" and "JSONB" is inconsequential. The types are already reported properly in introspection using the right alias. This patch removes this distinction in code. Release note: None
Prior to this patch the canonical name for the timestamptz type in CockroachDB was "TIMESTAMP WITH TIME ZONE" with spaces and all glorious 24 characters. This is unfortunate because this is a pretty common type, and is moreover used to disambiguate the return type of `now()` and thus will be emitted pretty often in distsql physical plans. Having such a long type name is unnecessary and indeed adds 50%+ storage, communication and parsing overhead for every occurrence of this type in distsql plans. This patch shortens the canonical CockroachDB name to "TIMESTAMPTZ". The representation of the type in introspection tables (pg_catalog, information_schema) is unaffected. Release note: None
It was not needed. This simplifies. Release note: None
Prior to this patch, CockroachDB maintained an unnecessary distinction between "INT" and "INTEGER", between "BIGINT" and "INT8", etc. This distinction is unnecessary but also costly, as we were paying the price of a "name" attribute in coltypes.TInt, with a string comparison and hash table lookup on every use of the type. What really matters is that the type shows up properly in introspection; this has already been ensured by various OID-to-pgcatalog mappings and the recently introduced `InformationSchemaTypeName()`. Any distinction beyond that is unnecessary and can be dropped from the implementation. Release note: None
7e9633b to
4f27ce0
Compare
Contributor
Author
|
Merge skew with an opt backport, rebased. bors r+ |
craig bot
pushed a commit
that referenced
this pull request
Aug 24, 2018
29006: backport-2.1: fix the handling of SQL types r=knz a=knz Backport 1/1 commits from #28937. Backport 1/1 commits from #28942. Backport 17/17 commits from #28943. Backport 1/1 commits from #28944. Backport 1/1 commits from #28945. Backport 1/1 commits from #28949. Backport 2/2 commits from #28950. Backport 1/1 commits from #28951 (#28959). Backport 1/1 commits from #28952 (#28959). Backport 1/1 commits from #28955 (#28959). Backport 1/1 commits from #28959. Backport 1/1 commits from #28959. Backport 1/1 commits from #28690. cc @cockroachdb/release Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
Contributor
Build succeeded |
This was referenced Sep 4, 2018
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Backport 1/1 commits from #28937.
Backport 1/1 commits from #28942.
Backport 17/17 commits from #28943.
Backport 1/1 commits from #28944.
Backport 1/1 commits from #28945.
Backport 1/1 commits from #28949.
Backport 2/2 commits from #28950.
Backport 1/1 commits from #28951 (#28959).
Backport 1/1 commits from #28952 (#28959).
Backport 1/1 commits from #28955 (#28959).
Backport 1/1 commits from #28959.
Backport 1/1 commits from #28959.
Backport 1/1 commits from #28690.
cc @cockroachdb/release