Skip to content

sql/types: fix type serialization in mixed version state#152633

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
rafiss:fix-type-serialization-bug
Aug 29, 2025
Merged

sql/types: fix type serialization in mixed version state#152633
craig[bot] merged 1 commit intocockroachdb:masterfrom
rafiss:fix-type-serialization-bug

Conversation

@rafiss
Copy link
Copy Markdown
Collaborator

@rafiss rafiss commented Aug 27, 2025

In 2fad508 we removed the downgradeType function and changed upgradeType to only overwrite the OID if it was not already set. However, this causes issues in mixed-version clusters, since the 25.3 nodes will always unconditionally overwrite the OID based on the value of the VisibleType field.

To account for this, this patch updates the hardcoded types to have VisibleType set, so that nodes on the newer version will serialize the type with this field set (previously downgradeType would have done that work). With the field set, upgradeType does the correct thing in the mixed version state.

informs #152629
fixes #152419
Release note: None

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@rafiss rafiss force-pushed the fix-type-serialization-bug branch 3 times, most recently from 57cebf4 to fb76c63 Compare August 28, 2025 16:13
@rafiss rafiss marked this pull request as ready for review August 28, 2025 16:43
@rafiss rafiss requested review from a team and fqazi August 28, 2025 16:43
Copy link
Copy Markdown
Collaborator

@fqazi fqazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

@fqazi reviewed 8 of 8 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @rafiss)

@rafiss rafiss force-pushed the fix-type-serialization-bug branch from fb76c63 to 12d89cd Compare August 28, 2025 18:45
@rafiss rafiss requested a review from a team as a code owner August 28, 2025 18:45
@rafiss rafiss requested review from mw5h and removed request for a team August 28, 2025 18:45
@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Aug 28, 2025

It looks like your PR touches SQL parser code but doesn't add or edit parser tests. Please make sure you add or edit parser tests if you edit the parser.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@rafiss rafiss force-pushed the fix-type-serialization-bug branch from 12d89cd to 2d5938c Compare August 28, 2025 19:03
Copy link
Copy Markdown
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuzefovich reviewed 8 of 8 files at r1, 6 of 6 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mw5h)


pkg/sql/types/types_jsonpb.go line 66 at r2 (raw file):

		t.InternalType.VisibleType = visibleQCHAR
	case oid.T_varbit:
		t.InternalType.VisibleType = visibleVARBIT

Do we need to handle visibleBIT?


pkg/sql/types/types.go line 335 at r2 (raw file):

		// TODO(rafi): Once compatibility with 25.3 is no longer needed, remove
		// VisibleType.
		VisibleType: visibleDOUBLE,

Should this be visibleREAL?


pkg/sql/logictest/testdata/logic_test/mixed_version_char line 4 at r2 (raw file):


# This test verifies that the "char" type is handled correctly in a mixed
# version cluster. The reason its being tested is because dueing 25.4

nit: s/its/it's/ and s/dueing/during/.

Copy link
Copy Markdown
Collaborator Author

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mw5h and @yuzefovich)


pkg/sql/types/types.go line 335 at r2 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Should this be visibleREAL?

yes, nice catch. luckily i'm pretty sure that VisibleType was only relevant for StringFamily due to how upgradeType works, but I included it here for good measure


pkg/sql/types/types_jsonpb.go line 66 at r2 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Do we need to handle visibleBIT?

no, it doesn't seem to ever get referenced in the downgradeType function which was removed in 2fad508.

@rafiss rafiss force-pushed the fix-type-serialization-bug branch 2 times, most recently from 856833c to 2db31f6 Compare August 29, 2025 02:58
In 2fad508 we removed the downgradeType
function and changed upgradeType to only overwrite the OID if it was not
already set. However, this causes issues in mixed-version clusters,
since the 25.3 nodes will always unconditionally overwrite the OID based
on the value of the VisibleType field.

To account for this, we update the hardcoded types to have VisibleType
set, so that nodes on the newer version will serialize the type with
this field set (previously downgradeType would have done that work). With
the field set, upgradeType does the correct thing in the mixed version
state.

Release note: None
@rafiss rafiss force-pushed the fix-type-serialization-bug branch from 2db31f6 to 17a8a60 Compare August 29, 2025 04:31
@rafiss
Copy link
Copy Markdown
Collaborator Author

rafiss commented Aug 29, 2025

tftr!

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Aug 29, 2025

@craig craig bot merged commit 921d818 into cockroachdb:master Aug 29, 2025
32 of 33 checks passed
@rafiss rafiss deleted the fix-type-serialization-bug branch August 30, 2025 17:06
spilchen added a commit to spilchen/cockroach that referenced this pull request Sep 29, 2025
… v25.3

This patch reintroduces the downgradeType function to ensure serialization
compatibility for the NAME type in mixed-version clusters with v25.3/v25.2
nodes.

In older versions, NAME types were encoded using the legacy Family=name (value
11). In newer releases, NAME is represented as a string family type with
Oid=T_name. However, without downgrade logic, these modern descriptors may be
misinterpreted by v25.3 nodes as Oid=T_text.

This issue became more apparent after PR cockroachdb#152633, which reintroduced
VisibleType for mixed-version compatibility. However, since NAME has no
distinct VisibleType, it wasn't covered by that logic.

This patch resolves the compatibility gap by downgrading NAME types to use the
legacy family code (Family=name) when marshaling descriptors. It also
normalizes empty string locales to nil, preventing NAME from being
misinterpreted as a CollatedStringFamily during upgrade.

This logic is intended to be temporary. Once v25.4 becomes the minimum
supported version, a descriptor rewrite should be introduced to upgrade all
remaining Family=name descriptors to StringFamily with Oid=T_name. After that
migration is complete, downgradeType can be safely removed.

Fixes cockroachdb#153322

Release note: None
spilchen added a commit to spilchen/cockroach that referenced this pull request Sep 29, 2025
…clusters

This patch restores the `downgradeType` hook to maintain serialization
compatibility for the NAME type in mixed-version clusters (v25.3 and 25.4).

In previous releases, a NAME type could be misinterpreted by older nodes.
Specifically, a v25.3 node deserializing a NAME encoded by a v25.4 node would
mistakenly treat it as text (`oid.T_text`). This was due to the old
`upgradeType` logic overwriting the OID when no `VisibleType` was set, causing
`T_name` to degrade to `T_text`. PR cockroachdb#148694 addressed this issue for v25.4 by
changing `upgradeType` (to avoid clobbering a provided OID) and removing the
`downgradeType` function.

However, the follow-up fix for mixed-version serialization (PR cockroachdb#152633) relied
on the `InternalType.VisibleType` field to tag string types for older nodes.
Because the NAME type doesn’t use a `VisibleType` (it’s represented as a
STRING family with a special OID), that change didn’t cover NAME. As a result,
NAME remained broken in a 25.3/25.4 mixed cluster.

This commit explicitly handles NAME during serialization. When sending a NAME
type to the wire, we downgrade it to the legacy encoding: we set its type
family to `Family=name` (legacy enum value 11) so that v25.3 nodes recognize
it as a NAME. Additionally, if the locale is an empty string, we set it to
`nil` to ensure older versions don’t misclassify the NAME as a collated string.
Together, these steps preserve the correct type interpretation on older nodes.

Once all nodes are on v25.4, this compatibility shim can be removed. At that
point, the improved upgrade logic in v25.4 (which no longer overrides a valid
OID) means we won’t need to downgrade types for backward compatibility. We may
also perform a one time descriptor migration to update any NAME descriptors to
the modern format after the cluster upgrade. In short, `downgradeType` for NAME
is a temporary measure to bridge the v25.3 gap.

Fixes cockroachdb#153322

Release note: None
spilchen added a commit to spilchen/cockroach that referenced this pull request Sep 29, 2025
…clusters

This patch restores the `downgradeType` hook to maintain serialization
compatibility for the NAME type in mixed-version clusters (v25.3 and 25.4).

In previous releases, a NAME type could be misinterpreted by older nodes.
Specifically, a v25.3 node deserializing a NAME encoded by a v25.4 node would
mistakenly treat it as text (`oid.T_text`). This was due to the old
`upgradeType` logic overwriting the OID when no `VisibleType` was set, causing
`T_name` to degrade to `T_text`. PR cockroachdb#148694 addressed this issue for v25.4 by
changing `upgradeType` (to avoid clobbering a provided OID) and removing the
`downgradeType` function.

However, the follow-up fix for mixed-version serialization (PR cockroachdb#152633) relied
on the `InternalType.VisibleType` field to tag string types for older nodes.
Because the NAME type doesn’t use a `VisibleType` (it’s represented as a
STRING family with a special OID), that change didn’t cover NAME. As a result,
NAME remained broken in a 25.3/25.4 mixed cluster.

This commit explicitly handles NAME during serialization. When sending a NAME
type to the wire, we downgrade it to the legacy encoding: we set its type
family to `Family=name` (legacy enum value 11) so that v25.3 nodes recognize
it as a NAME. Additionally, if the locale is an empty string, we set it to
`nil` to ensure older versions don’t misclassify the NAME as a collated string.
Together, these steps preserve the correct type interpretation on older nodes.

Once all nodes are on v25.4, this compatibility shim can be removed. At that
point, the improved upgrade logic in v25.4 (which no longer overrides a valid
OID) means we won’t need to downgrade types for backward compatibility. We may
also perform a one time descriptor migration to update any NAME descriptors to
the modern format after the cluster upgrade. In short, `downgradeType` for NAME
is a temporary measure to bridge the v25.3 gap.

Fixes cockroachdb#153322

Release note: None
spilchen added a commit to spilchen/cockroach that referenced this pull request Sep 29, 2025
…clusters

This patch restores the `downgradeType` hook to maintain serialization
compatibility for the NAME type in mixed-version clusters (v25.3 and 25.4).

In previous releases, a NAME type could be misinterpreted by older nodes.
Specifically, a v25.3 node deserializing a NAME encoded by a v25.4 node would
mistakenly treat it as text (`oid.T_text`). This was due to the old
`upgradeType` logic overwriting the OID when no `VisibleType` was set, causing
`T_name` to degrade to `T_text`. PR cockroachdb#148694 addressed this issue for v25.4 by
changing `upgradeType` (to avoid clobbering a provided OID) and removing the
`downgradeType` function.

However, the follow-up fix for mixed-version serialization (PR cockroachdb#152633) relied
on the `InternalType.VisibleType` field to tag string types for older nodes.
Because the NAME type doesn’t use a `VisibleType` (it’s represented as a
STRING family with a special OID), that change didn’t cover NAME. As a
result, NAME remained broken in a 25.3/25.4 mixed cluster.

This commit explicitly handles NAME during serialization. When sending a NAME
type to the wire, we downgrade it to the legacy encoding: we set its type
family to `Family=name` (legacy enum value 11) so that v25.3 nodes recognize
it as a NAME. Additionally, if the locale is an empty string, we set it to
`nil` to ensure older versions don’t misclassify the NAME as a collated
string. Together, these steps preserve the correct type interpretation
on older nodes.

Once all nodes are on v25.4, this compatibility shim can be removed. At that
point, the improved upgrade logic in v25.4 (which no longer overrides a valid
OID) means we won’t need to downgrade types for backward compatibility. We may
also perform a one time descriptor migration to update any NAME descriptors to
the modern format after the cluster upgrade. In short, `downgradeType` for NAME
is a temporary measure to bridge the v25.3 gap.

Fixes cockroachdb#153322

Release note: None
craig bot pushed a commit that referenced this pull request Sep 29, 2025
154164: roachtest: add github issue posting dry run mode r=williamchoe3,herkolategan a=DarrylWong

We already have the functionality in datadriven tests to render github issue post requests as clickable links. This change extends that functionality to roachtests in the form of a --dry-run-issue-posting flag. This allows for viewing what the output of a failed roachtest would look like even if github posting is disabled.

Fixes: none
Release note: none
Epic: none

154266: sqlproxyccl: pass in ctx when transferring connections r=jeffswenson a=davidwding

Previously, the span used when transferring connections was reused from
the forwarder. However, transferring connections has async components,
so it's possible that the forwarder calls `Finish` on the span before
the connection transfer is done using the span. This leads to a panic:
`panic: use of Span after Finish`. To address this, this patch passes a
context in when transferring a connection, instead of using the
forwarder's context.

Fixes: #153569
Release note: None

154363: sql/types: reintroduce downgradeType for NAME compatibility in mixed clusters r=spilchen a=spilchen

This patch restores the `downgradeType` hook to maintain serialization compatibility for the NAME type in mixed-version clusters (v25.3 and 25.4).

In previous releases, a NAME type could be misinterpreted by older nodes. Specifically, a v25.3 node deserializing a NAME encoded by a v25.4 node would mistakenly treat it as text (`oid.T_text`). This was due to the old `upgradeType` logic overwriting the OID when no `VisibleType` was set, causing `T_name` to degrade to `T_text`. PR #148694 addressed this issue for v25.4 by changing `upgradeType` (to avoid clobbering a provided OID) and removing the `downgradeType` function.

However, the follow-up fix for mixed-version serialization (PR #152633) relied on the `InternalType.VisibleType` field to tag string types for older nodes. Because the NAME type doesn’t use a `VisibleType` (it’s represented as a STRING family with a special OID), that change didn’t cover NAME. As a result, NAME remained broken in a 25.3/25.4 mixed cluster.

This commit explicitly handles NAME during serialization. When sending a NAME type to the wire, we downgrade it to the legacy encoding: we set its type family to `Family=name` (legacy enum value 11) so that v25.3 nodes recognize it as a NAME. Additionally, if the locale is an empty string, we set it to `nil` to ensure older versions don’t misclassify the NAME as a collated string. Together, these steps preserve the correct type interpretation on older nodes.

Once all nodes are on v25.4, this compatibility shim can be removed. At that point, the improved upgrade logic in v25.4 (which no longer overrides a valid OID) means we won’t need to downgrade types for backward compatibility. We may also perform a one time descriptor migration to update any NAME descriptors to the modern format after the cluster upgrade. In short, `downgradeType` for NAME is a temporary measure to bridge the v25.3 gap.

Fixes #153322

Release note: None

154369: roachest: ensure reader tenant accepts connections before restart r=jeffswenson a=msbutler

Previously, the roachtest would restart the tenant before the tenant could accept connections, which means the tenant could still be bootstrapping. Restarting the tenant service during bootstrap could lead to sadness, as outlined in #154356.

Informs #154356
Informs #154311
Informs #153131

Release note: none

Co-authored-by: DarrylWong <darryl@cockroachlabs.com>
Co-authored-by: David Ding <ding@cockroachlabs.com>
Co-authored-by: Matt Spilchen <matt.spilchen@cockroachlabs.com>
Co-authored-by: Michael Butler <butler@cockroachlabs.com>
spilchen added a commit to spilchen/cockroach that referenced this pull request Sep 29, 2025
…clusters

This patch restores the `downgradeType` hook to maintain serialization
compatibility for the NAME type in mixed-version clusters (v25.3 and 25.4).

In previous releases, a NAME type could be misinterpreted by older nodes.
Specifically, a v25.3 node deserializing a NAME encoded by a v25.4 node would
mistakenly treat it as text (`oid.T_text`). This was due to the old
`upgradeType` logic overwriting the OID when no `VisibleType` was set, causing
`T_name` to degrade to `T_text`. PR cockroachdb#148694 addressed this issue for v25.4 by
changing `upgradeType` (to avoid clobbering a provided OID) and removing the
`downgradeType` function.

However, the follow-up fix for mixed-version serialization (PR cockroachdb#152633) relied
on the `InternalType.VisibleType` field to tag string types for older nodes.
Because the NAME type doesn’t use a `VisibleType` (it’s represented as a
STRING family with a special OID), that change didn’t cover NAME. As a
result, NAME remained broken in a 25.3/25.4 mixed cluster.

This commit explicitly handles NAME during serialization. When sending a NAME
type to the wire, we downgrade it to the legacy encoding: we set its type
family to `Family=name` (legacy enum value 11) so that v25.3 nodes recognize
it as a NAME. Additionally, if the locale is an empty string, we set it to
`nil` to ensure older versions don’t misclassify the NAME as a collated
string. Together, these steps preserve the correct type interpretation
on older nodes.

Once all nodes are on v25.4, this compatibility shim can be removed. At that
point, the improved upgrade logic in v25.4 (which no longer overrides a valid
OID) means we won’t need to downgrade types for backward compatibility. We may
also perform a one time descriptor migration to update any NAME descriptors to
the modern format after the cluster upgrade. In short, `downgradeType` for NAME
is a temporary measure to bridge the v25.3 gap.

Fixes cockroachdb#153322

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

roachtest: schemachange/mixed-versions failed [value too long for type STRING(1)]

4 participants