Skip to content

types: remove downgradeType function to fix data race#148694

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
rafiss:remove-downgradetype
Jun 27, 2025
Merged

types: remove downgradeType function to fix data race#148694
craig[bot] merged 1 commit intocockroachdb:masterfrom
rafiss:remove-downgradetype

Conversation

@rafiss
Copy link
Copy Markdown
Collaborator

@rafiss rafiss commented Jun 23, 2025

This function was only needed for mixed-version clusters that had nodes
running on 19.1, which did not know about the new format for encoding
types.

There's no need to persist the old format anywhere now. We will need to
keep upgradeType until we no longer need compatibility with the older
type format. As part of this change, upgradeType was also modified so it
does not attempt to set the OID if it has already been set. This is
because the OID only will have been set if the type is in the new
format.

fixes #148278
fixes #148277
fixes #148276

Release note: None

@rafiss rafiss requested review from a team, fqazi and mgartner June 23, 2025 17:17
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@rafiss rafiss force-pushed the remove-downgradetype branch 2 times, most recently from 75e9faa to 76ad6ba Compare June 25, 2025 20:24
@rafiss rafiss added the backport-25.3.x Flags PRs that need to be backported to 25.3 label Jun 25, 2025
@rafiss rafiss changed the title types: remove downgradeType function types: remove downgradeType function to fix data race Jun 25, 2025
@rafiss rafiss force-pushed the remove-downgradetype branch from 76ad6ba to a17c94f Compare June 25, 2025 22:07
Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Can we remove VisibleType now, or do we need to wait?

Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @fqazi)


pkg/sql/types/types.go line 2454 at r1 (raw file):

//
// This can be removed once we can ensure that all descriptors have been
// rewritten with the newer format. Note that the migration in first_upgrade.go

So we need another mechanism than just waiting until a future version to ensure this? Any ideas what kind of mechanism that could be?

@rafiss rafiss force-pushed the remove-downgradetype branch from a17c94f to 4d9be50 Compare June 26, 2025 16:10
Copy link
Copy Markdown
Collaborator Author

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to wait on that. since all persisted types.T are in the "old" format, that means VisibleType is being stored on disk in ~all clusters. we still need the logic in upgradeType that looks at VisibleType and uses it to set the Oid field if it's currently unset.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @fqazi and @mgartner)


pkg/sql/types/types.go line 2454 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

So we need another mechanism than just waiting until a future version to ensure this? Any ideas what kind of mechanism that could be?

one way is to explicitly write an upgrade migration that unconditionally reads and rewrites every descriptor without performing any modifications to them.

Copy link
Copy Markdown
Collaborator

@fqazi fqazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Do we need a follow on issues for an upgrade that will rewrite all descriptors?

Reviewed 2 of 3 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner)

Copy link
Copy Markdown
Collaborator Author

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner)


pkg/sql/types/types.go line 2454 at r1 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

one way is to explicitly write an upgrade migration that unconditionally reads and rewrites every descriptor without performing any modifications to them.

another way to achieve that is to comment/delete this condition so that all descriptors are added to descsToUpdate:

changes := desc.GetPostDeserializationChanges()
if !changes.HasChanges() || (changes.Len() == 1 && changes.Contains(catalog.SetModTimeToMVCCTimestamp)) {
return nil
}
descsToUpdate.Add(desc.GetID())

we'd probably do that just for one release, and then add it back in for the next one.

@rafiss rafiss force-pushed the remove-downgradetype branch from 4d9be50 to bdd5275 Compare June 26, 2025 19:43
This function was only needed for mixed-version clusters that had nodes
running on 19.1, which did not know about the new format for encoding
types.

There's no need to persist the old format anywhere now. We will need to
keep upgradeType until we no longer need compatibility with the older
type format. As part of this change, upgradeType was also modified so it
does not attempt to set the OID if it has already been set. This is
because the OID only will have been set if the type is in the new
format.

Release note: None
@rafiss rafiss force-pushed the remove-downgradetype branch from bdd5275 to 2fad508 Compare June 26, 2025 20:36
@rafiss
Copy link
Copy Markdown
Collaborator Author

rafiss commented Jun 27, 2025

bors r+

tftr!

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jun 27, 2025

@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Jun 27, 2025

Based on the specified backports for this PR, I applied new labels to the following linked issue(s). Please adjust the labels as needed to match the branches actually affected by the issue(s), including adding any known older branches.


Issue #148278: branch-release-25.3.


Issue #148277: branch-release-25.3.


Issue #148276: branch-release-25.3.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Jun 27, 2025

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 2fad508 to blathers/backport-release-25.3-148694: POST https://api.github.com/repos/rafiss/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 25.3.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@rafiss
Copy link
Copy Markdown
Collaborator Author

rafiss commented Jun 27, 2025

I'll skip the backport and instead leave the test skipped under race on the 25.3 branch.

@rafiss rafiss removed the backport-25.3.x Flags PRs that need to be backported to 25.3 label Jun 27, 2025
@rafiss rafiss deleted the remove-downgradetype branch June 27, 2025 19:58
spilchen added a commit to spilchen/cockroach that referenced this pull request Sep 29, 2025
…clusters

This patch restores the `downgradeType` hook to maintain serialization
compatibility for the NAME type in mixed-version clusters (v25.3 and 25.4).

In previous releases, a NAME type could be misinterpreted by older nodes.
Specifically, a v25.3 node deserializing a NAME encoded by a v25.4 node would
mistakenly treat it as text (`oid.T_text`). This was due to the old
`upgradeType` logic overwriting the OID when no `VisibleType` was set, causing
`T_name` to degrade to `T_text`. PR cockroachdb#148694 addressed this issue for v25.4 by
changing `upgradeType` (to avoid clobbering a provided OID) and removing the
`downgradeType` function.

However, the follow-up fix for mixed-version serialization (PR cockroachdb#152633) relied
on the `InternalType.VisibleType` field to tag string types for older nodes.
Because the NAME type doesn’t use a `VisibleType` (it’s represented as a
STRING family with a special OID), that change didn’t cover NAME. As a result,
NAME remained broken in a 25.3/25.4 mixed cluster.

This commit explicitly handles NAME during serialization. When sending a NAME
type to the wire, we downgrade it to the legacy encoding: we set its type
family to `Family=name` (legacy enum value 11) so that v25.3 nodes recognize
it as a NAME. Additionally, if the locale is an empty string, we set it to
`nil` to ensure older versions don’t misclassify the NAME as a collated string.
Together, these steps preserve the correct type interpretation on older nodes.

Once all nodes are on v25.4, this compatibility shim can be removed. At that
point, the improved upgrade logic in v25.4 (which no longer overrides a valid
OID) means we won’t need to downgrade types for backward compatibility. We may
also perform a one time descriptor migration to update any NAME descriptors to
the modern format after the cluster upgrade. In short, `downgradeType` for NAME
is a temporary measure to bridge the v25.3 gap.

Fixes cockroachdb#153322

Release note: None
spilchen added a commit to spilchen/cockroach that referenced this pull request Sep 29, 2025
…clusters

This patch restores the `downgradeType` hook to maintain serialization
compatibility for the NAME type in mixed-version clusters (v25.3 and 25.4).

In previous releases, a NAME type could be misinterpreted by older nodes.
Specifically, a v25.3 node deserializing a NAME encoded by a v25.4 node would
mistakenly treat it as text (`oid.T_text`). This was due to the old
`upgradeType` logic overwriting the OID when no `VisibleType` was set, causing
`T_name` to degrade to `T_text`. PR cockroachdb#148694 addressed this issue for v25.4 by
changing `upgradeType` (to avoid clobbering a provided OID) and removing the
`downgradeType` function.

However, the follow-up fix for mixed-version serialization (PR cockroachdb#152633) relied
on the `InternalType.VisibleType` field to tag string types for older nodes.
Because the NAME type doesn’t use a `VisibleType` (it’s represented as a
STRING family with a special OID), that change didn’t cover NAME. As a result,
NAME remained broken in a 25.3/25.4 mixed cluster.

This commit explicitly handles NAME during serialization. When sending a NAME
type to the wire, we downgrade it to the legacy encoding: we set its type
family to `Family=name` (legacy enum value 11) so that v25.3 nodes recognize
it as a NAME. Additionally, if the locale is an empty string, we set it to
`nil` to ensure older versions don’t misclassify the NAME as a collated string.
Together, these steps preserve the correct type interpretation on older nodes.

Once all nodes are on v25.4, this compatibility shim can be removed. At that
point, the improved upgrade logic in v25.4 (which no longer overrides a valid
OID) means we won’t need to downgrade types for backward compatibility. We may
also perform a one time descriptor migration to update any NAME descriptors to
the modern format after the cluster upgrade. In short, `downgradeType` for NAME
is a temporary measure to bridge the v25.3 gap.

Fixes cockroachdb#153322

Release note: None
spilchen added a commit to spilchen/cockroach that referenced this pull request Sep 29, 2025
…clusters

This patch restores the `downgradeType` hook to maintain serialization
compatibility for the NAME type in mixed-version clusters (v25.3 and 25.4).

In previous releases, a NAME type could be misinterpreted by older nodes.
Specifically, a v25.3 node deserializing a NAME encoded by a v25.4 node would
mistakenly treat it as text (`oid.T_text`). This was due to the old
`upgradeType` logic overwriting the OID when no `VisibleType` was set, causing
`T_name` to degrade to `T_text`. PR cockroachdb#148694 addressed this issue for v25.4 by
changing `upgradeType` (to avoid clobbering a provided OID) and removing the
`downgradeType` function.

However, the follow-up fix for mixed-version serialization (PR cockroachdb#152633) relied
on the `InternalType.VisibleType` field to tag string types for older nodes.
Because the NAME type doesn’t use a `VisibleType` (it’s represented as a
STRING family with a special OID), that change didn’t cover NAME. As a
result, NAME remained broken in a 25.3/25.4 mixed cluster.

This commit explicitly handles NAME during serialization. When sending a NAME
type to the wire, we downgrade it to the legacy encoding: we set its type
family to `Family=name` (legacy enum value 11) so that v25.3 nodes recognize
it as a NAME. Additionally, if the locale is an empty string, we set it to
`nil` to ensure older versions don’t misclassify the NAME as a collated
string. Together, these steps preserve the correct type interpretation
on older nodes.

Once all nodes are on v25.4, this compatibility shim can be removed. At that
point, the improved upgrade logic in v25.4 (which no longer overrides a valid
OID) means we won’t need to downgrade types for backward compatibility. We may
also perform a one time descriptor migration to update any NAME descriptors to
the modern format after the cluster upgrade. In short, `downgradeType` for NAME
is a temporary measure to bridge the v25.3 gap.

Fixes cockroachdb#153322

Release note: None
craig bot pushed a commit that referenced this pull request Sep 29, 2025
154164: roachtest: add github issue posting dry run mode r=williamchoe3,herkolategan a=DarrylWong

We already have the functionality in datadriven tests to render github issue post requests as clickable links. This change extends that functionality to roachtests in the form of a --dry-run-issue-posting flag. This allows for viewing what the output of a failed roachtest would look like even if github posting is disabled.

Fixes: none
Release note: none
Epic: none

154266: sqlproxyccl: pass in ctx when transferring connections r=jeffswenson a=davidwding

Previously, the span used when transferring connections was reused from
the forwarder. However, transferring connections has async components,
so it's possible that the forwarder calls `Finish` on the span before
the connection transfer is done using the span. This leads to a panic:
`panic: use of Span after Finish`. To address this, this patch passes a
context in when transferring a connection, instead of using the
forwarder's context.

Fixes: #153569
Release note: None

154363: sql/types: reintroduce downgradeType for NAME compatibility in mixed clusters r=spilchen a=spilchen

This patch restores the `downgradeType` hook to maintain serialization compatibility for the NAME type in mixed-version clusters (v25.3 and 25.4).

In previous releases, a NAME type could be misinterpreted by older nodes. Specifically, a v25.3 node deserializing a NAME encoded by a v25.4 node would mistakenly treat it as text (`oid.T_text`). This was due to the old `upgradeType` logic overwriting the OID when no `VisibleType` was set, causing `T_name` to degrade to `T_text`. PR #148694 addressed this issue for v25.4 by changing `upgradeType` (to avoid clobbering a provided OID) and removing the `downgradeType` function.

However, the follow-up fix for mixed-version serialization (PR #152633) relied on the `InternalType.VisibleType` field to tag string types for older nodes. Because the NAME type doesn’t use a `VisibleType` (it’s represented as a STRING family with a special OID), that change didn’t cover NAME. As a result, NAME remained broken in a 25.3/25.4 mixed cluster.

This commit explicitly handles NAME during serialization. When sending a NAME type to the wire, we downgrade it to the legacy encoding: we set its type family to `Family=name` (legacy enum value 11) so that v25.3 nodes recognize it as a NAME. Additionally, if the locale is an empty string, we set it to `nil` to ensure older versions don’t misclassify the NAME as a collated string. Together, these steps preserve the correct type interpretation on older nodes.

Once all nodes are on v25.4, this compatibility shim can be removed. At that point, the improved upgrade logic in v25.4 (which no longer overrides a valid OID) means we won’t need to downgrade types for backward compatibility. We may also perform a one time descriptor migration to update any NAME descriptors to the modern format after the cluster upgrade. In short, `downgradeType` for NAME is a temporary measure to bridge the v25.3 gap.

Fixes #153322

Release note: None

154369: roachest: ensure reader tenant accepts connections before restart r=jeffswenson a=msbutler

Previously, the roachtest would restart the tenant before the tenant could accept connections, which means the tenant could still be bootstrapping. Restarting the tenant service during bootstrap could lead to sadness, as outlined in #154356.

Informs #154356
Informs #154311
Informs #153131

Release note: none

Co-authored-by: DarrylWong <darryl@cockroachlabs.com>
Co-authored-by: David Ding <ding@cockroachlabs.com>
Co-authored-by: Matt Spilchen <matt.spilchen@cockroachlabs.com>
Co-authored-by: Michael Butler <butler@cockroachlabs.com>
spilchen added a commit to spilchen/cockroach that referenced this pull request Sep 29, 2025
…clusters

This patch restores the `downgradeType` hook to maintain serialization
compatibility for the NAME type in mixed-version clusters (v25.3 and 25.4).

In previous releases, a NAME type could be misinterpreted by older nodes.
Specifically, a v25.3 node deserializing a NAME encoded by a v25.4 node would
mistakenly treat it as text (`oid.T_text`). This was due to the old
`upgradeType` logic overwriting the OID when no `VisibleType` was set, causing
`T_name` to degrade to `T_text`. PR cockroachdb#148694 addressed this issue for v25.4 by
changing `upgradeType` (to avoid clobbering a provided OID) and removing the
`downgradeType` function.

However, the follow-up fix for mixed-version serialization (PR cockroachdb#152633) relied
on the `InternalType.VisibleType` field to tag string types for older nodes.
Because the NAME type doesn’t use a `VisibleType` (it’s represented as a
STRING family with a special OID), that change didn’t cover NAME. As a
result, NAME remained broken in a 25.3/25.4 mixed cluster.

This commit explicitly handles NAME during serialization. When sending a NAME
type to the wire, we downgrade it to the legacy encoding: we set its type
family to `Family=name` (legacy enum value 11) so that v25.3 nodes recognize
it as a NAME. Additionally, if the locale is an empty string, we set it to
`nil` to ensure older versions don’t misclassify the NAME as a collated
string. Together, these steps preserve the correct type interpretation
on older nodes.

Once all nodes are on v25.4, this compatibility shim can be removed. At that
point, the improved upgrade logic in v25.4 (which no longer overrides a valid
OID) means we won’t need to downgrade types for backward compatibility. We may
also perform a one time descriptor migration to update any NAME descriptors to
the modern format after the cluster upgrade. In short, `downgradeType` for NAME
is a temporary measure to bridge the v25.3 gap.

Fixes cockroachdb#153322

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

4 participants