[Replicated] release-25.1: sql/schemachanger: support generated columns in ADD COLUMN #146
Merged
mohini-crl merged 273 commits into master from Mar 11, 2025
Fixes cockroachdb#133146

Release note (bug fix): A bug has been fixed that caused incorrect NOT NULL constraint violation errors on `UPSERT` and `INSERT .. ON CONFLICT .. DO UPDATE` statements when those statements updated an existing row and a subset of columns which did not include a `NOT NULL` column of the table. This bug has been present since at least version 20.1.0.
Previously, if the correct overloads were not found for sequence builtins it was possible for the server to panic. This could happen when rewriting a CREATE TABLE expression with an invalid sequence builtin call. To address this, this patch updates the sequence logic to return the error instead of panicking on it.

Fixes: cockroachdb#133399

Release note (bug fix): Address a panic inside CREATE TABLE AS if sequence builtin expressions had invalid function overloads.
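The general shape of "return the error instead of panicking" can be sketched as follows. This is an illustration only, not the code from the patch: the `overloads` table, `errNoOverload`, and `resolveOverload` are hypothetical names invented for this example, and the real sequence logic in CockroachDB is considerably more involved.

```go
package main

import (
	"errors"
	"fmt"
)

// overloads is a hypothetical lookup table for this sketch only:
// builtin name -> argument counts that have an overload.
var overloads = map[string][]int{
	"nextval": {1},
	"setval":  {2, 3},
}

// errNoOverload is a hypothetical sentinel error for this sketch.
var errNoOverload = errors.New("no matching overload")

// resolveOverload reports whether a builtin call has a matching overload.
// A missing overload is surfaced as an error to the caller rather than
// crashing the process with a panic.
func resolveOverload(name string, numArgs int) error {
	for _, n := range overloads[name] {
		if n == numArgs {
			return nil
		}
	}
	return fmt.Errorf("%s with %d argument(s): %w", name, numArgs, errNoOverload)
}

func main() {
	fmt.Println(resolveOverload("nextval", 1)) // valid call, nil error
	fmt.Println(resolveOverload("nextval", 3)) // invalid call, error returned
}
```

With this shape the caller can turn the failure into an ordinary statement error, which is the behavior the release note describes for CREATE TABLE AS.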
This commit fixes type schema corruption in the vectorized engine in an edge case. In particular, consider the following circumstances:
- during the physical planning, when creating a new stage of processors, we often reuse the same type slice (stored in `InputSyncSpec.ColumnTypes`) that we get from the previous stage. In other words, we might have memory aliasing, but only on the gateway node because the remote nodes get their specs deserialized and each has its own memory allocation.
- throughout the vectorized operator planning, as of 85fd4fb, for each newly projected vector we append the corresponding type to the type slice we have in scope. We also capture intermediate state of the type slice by some operators (e.g. `BatchSchemaSubsetEnforcer`).
- as expected, when appending a type to the slice, if there is enough capacity, we reuse it, meaning that we often append to the slice that came to us via `InputSyncSpec.ColumnTypes`.
- now, if we have two stages of processors that happened to share the same underlying type slice with some free capacity AND we needed to append vectors for each stage, then we might corrupt the type schema captured by an operator for the earlier stage when performing vectorized planning for the later stage.

The bug is effectively the one outlined in the comment deleted by 85fd4fb:
```
// As an example, consider the following scenario in the context of
// planFilterExpr method:
// 1. r.ColumnTypes={types.Bool} with len=1 and cap=4
// 2. planSelectionOperators adds another types.Int column, so
//    filterColumnTypes={types.Bool, types.Int} with len=2 and cap=4
//    Crucially, it uses exact same underlying array as r.ColumnTypes
//    uses.
// 3. we project out second column, so r.ColumnTypes={types.Bool}
// 4. later, we add another types.Float column, so
//    r.ColumnTypes={types.Bool, types.Float}, but there is enough
//    capacity in the array, so we simply overwrite the second slot
//    with the new type which corrupts filterColumnTypes to become
//    {types.Bool, types.Float}, and we can get into a runtime type
//    mismatch situation.
```

The only differences are:
- aliasing of the type slice occurs via the `InputSyncSpec.ColumnTypes` that is often used as the starting point for populating `NewColOperatorResult.ColumnTypes`, which is used throughout the vectorized operator planning
- columns are "projected out" by sharing the type schema between two stages of DistSQL processors.

This commit addresses this issue by capping the slice to its length right before we get into the vectorized planning. This will make it so that if we need to append a type, then we'll make a fresh allocation, and any possible memory aliasing with a different stage of processors will be gone.

I haven't quite figured out the exact conditions that are needed for this bug to occur, but my intuition says that it should be quite rare in practice (otherwise we'd have seen this much sooner given that the offending commit was merged more than a year ago and was backported to older branches).

Release note (bug fix): Previously, CockroachDB could encounter an internal error of the form `interface conversion: coldata.Column is` in an edge case and this is now fixed. The bug is present in versions 22.2.13+, 23.1.9+, 23.2+.
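The aliasing scenario and the capping fix described above can be reproduced in a few lines of plain Go. This is a minimal sketch under stated assumptions, not the code from the commit: strings stand in for `*types.T`, and `appendType`, `capToLen`, and `secondTypeAfterLaterAppend` are hypothetical helpers written for this illustration. The capping itself uses Go's full slice expression `s[:len(s):len(s)]`.

```go
package main

import "fmt"

// appendType stands in for the planner appending the type of a newly
// projected vector to the type slice in scope (hypothetical helper).
func appendType(typs []string, t string) []string {
	return append(typs, t)
}

// capToLen clamps the slice's capacity to its length using a full slice
// expression, so any later append must allocate a new backing array.
func capToLen(typs []string) []string {
	return typs[:len(typs):len(typs)]
}

// secondTypeAfterLaterAppend models two stages that start from the same
// type slice. The earlier stage appends Int and captures the result; the
// later stage then appends Float. It returns the second entry of the
// schema captured by the earlier stage.
func secondTypeAfterLaterAppend(capBeforePlanning bool) string {
	shared := make([]string, 1, 4) // len=1, cap=4, as in the quoted comment
	shared[0] = "Bool"
	stage1, stage2 := shared, shared
	if capBeforePlanning {
		stage1, stage2 = capToLen(stage1), capToLen(stage2)
	}
	captured := appendType(stage1, "Int") // earlier stage: {Bool, Int}
	_ = appendType(stage2, "Float")       // later stage appends its own type
	return captured[1]
}

func main() {
	fmt.Println(secondTypeAfterLaterAppend(false)) // Float: captured schema was overwritten
	fmt.Println(secondTypeAfterLaterAppend(true))  // Int: each append got a fresh allocation
}
```

Capping costs one extra allocation on the first append per stage, which seems a reasonable trade for removing the shared backing array between stages.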
This commit fixes type schema corruption in the vectorized engine in an edge case. In particular, consider the following circumstances: - during the physical planning, when creating a new stage of processors, we often reuse the same type slice (stored in `InputSyncSpec.ColumnTypes`) that we get from the previous stage. In other words, we might have memory aliasing, but only on the gateway node because the remote nodes get their specs deserialized and each has its own memory allocation. - throughout the vectorized operator planning, as of 85fd4fb, for each newly projected vector we append the corresponding type to the type slice we have in scope. We also capture intermediate state of the type slice by some operators (e.g. `BatchSchemaSubsetEnforcer`). - as expected, when appending a type to the slice, if there is enough capacity, we reuse it, meaning that we often append to the slice that came to us via `InputSyncSpec.ColumnTypes`. - now, if we have two stages of processors that happened to share the same underlying type slice with some free capacity AND we needed to append vectors for each stage, then we might corrupt the type schema captured by an operator for the earlier stage when performing vectorized planning for the later stage. The bug is effectively the same as the comment deleted by 85fd4fb outlined: ``` // As an example, consider the following scenario in the context of // planFilterExpr method: // 1. r.ColumnTypes={types.Bool} with len=1 and cap=4 // 2. planSelectionOperators adds another types.Int column, so // filterColumnTypes={types.Bool, types.Int} with len=2 and cap=4 // Crucially, it uses exact same underlying array as r.ColumnTypes // uses. // 3. we project out second column, so r.ColumnTypes={types.Bool} // 4. 
later, we add another types.Float column, so // r.ColumnTypes={types.Bool, types.Float}, but there is enough // capacity in the array, so we simply overwrite the second slot // with the new type which corrupts filterColumnTypes to become // {types.Bool, types.Float}, and we can get into a runtime type // mismatch situation. ``` The only differences are: - aliasing of the type slice occurs via the `InputSyncSpec.ColumnTypes` that is often used as the starting points for populating `NewColOperatorResult.ColumnTypes` which is used throughout the vectorized operator planning - columns are "projected out" by sharing the type schema between two stages of DistSQL processors. This commit addresses this issue by capping the slice to its length right before we get into the vectorized planning. This will make it so that if we need to append a type, then we'll make a fresh allocation, and any possible memory aliasing with a different stage of processors will be gone. I haven't quite figured out the exact conditions that are needed for this bug to occur, but my intuition says that it should be quite rare in practice (otherwise we'd have seen this much sooner given that the offending commit was merged more than a year ago and was backported to older branches). Release note (bug fix): Previously, CockroachDB could encounter an internal error of the form `interface conversion: coldata.Column is` in an edge case and this is now fixed. The bug is present in versions 22.2.13+, 23.1.9+, 23.2+.
This commit fixes type schema corruption in the vectorized engine in an edge case. In particular, consider the following circumstances: - during the physical planning, when creating a new stage of processors, we often reuse the same type slice (stored in `InputSyncSpec.ColumnTypes`) that we get from the previous stage. In other words, we might have memory aliasing, but only on the gateway node because the remote nodes get their specs deserialized and each has its own memory allocation. - throughout the vectorized operator planning, as of 85fd4fb, for each newly projected vector we append the corresponding type to the type slice we have in scope. We also capture intermediate state of the type slice by some operators (e.g. `BatchSchemaSubsetEnforcer`). - as expected, when appending a type to the slice, if there is enough capacity, we reuse it, meaning that we often append to the slice that came to us via `InputSyncSpec.ColumnTypes`. - now, if we have two stages of processors that happened to share the same underlying type slice with some free capacity AND we needed to append vectors for each stage, then we might corrupt the type schema captured by an operator for the earlier stage when performing vectorized planning for the later stage. The bug is effectively the same as the comment deleted by 85fd4fb outlined: ``` // As an example, consider the following scenario in the context of // planFilterExpr method: // 1. r.ColumnTypes={types.Bool} with len=1 and cap=4 // 2. planSelectionOperators adds another types.Int column, so // filterColumnTypes={types.Bool, types.Int} with len=2 and cap=4 // Crucially, it uses exact same underlying array as r.ColumnTypes // uses. // 3. we project out second column, so r.ColumnTypes={types.Bool} // 4. 
later, we add another types.Float column, so // r.ColumnTypes={types.Bool, types.Float}, but there is enough // capacity in the array, so we simply overwrite the second slot // with the new type which corrupts filterColumnTypes to become // {types.Bool, types.Float}, and we can get into a runtime type // mismatch situation. ``` The only differences are: - aliasing of the type slice occurs via the `InputSyncSpec.ColumnTypes` that is often used as the starting points for populating `NewColOperatorResult.ColumnTypes` which is used throughout the vectorized operator planning - columns are "projected out" by sharing the type schema between two stages of DistSQL processors. This commit addresses this issue by capping the slice to its length right before we get into the vectorized planning. This will make it so that if we need to append a type, then we'll make a fresh allocation, and any possible memory aliasing with a different stage of processors will be gone. I haven't quite figured out the exact conditions that are needed for this bug to occur, but my intuition says that it should be quite rare in practice (otherwise we'd have seen this much sooner given that the offending commit was merged more than a year ago and was backported to older branches). Release note (bug fix): Previously, CockroachDB could encounter an internal error of the form `interface conversion: coldata.Column is` in an edge case and this is now fixed. The bug is present in versions 22.2.13+, 23.1.9+, 23.2+.
This commit fixes type schema corruption in the vectorized engine in an edge case. In particular, consider the following circumstances: - during the physical planning, when creating a new stage of processors, we often reuse the same type slice (stored in `InputSyncSpec.ColumnTypes`) that we get from the previous stage. In other words, we might have memory aliasing, but only on the gateway node because the remote nodes get their specs deserialized and each has its own memory allocation. - throughout the vectorized operator planning, as of 85fd4fb, for each newly projected vector we append the corresponding type to the type slice we have in scope. We also capture intermediate state of the type slice by some operators (e.g. `BatchSchemaSubsetEnforcer`). - as expected, when appending a type to the slice, if there is enough capacity, we reuse it, meaning that we often append to the slice that came to us via `InputSyncSpec.ColumnTypes`. - now, if we have two stages of processors that happened to share the same underlying type slice with some free capacity AND we needed to append vectors for each stage, then we might corrupt the type schema captured by an operator for the earlier stage when performing vectorized planning for the later stage. The bug is effectively the same as the comment deleted by 85fd4fb outlined: ``` // As an example, consider the following scenario in the context of // planFilterExpr method: // 1. r.ColumnTypes={types.Bool} with len=1 and cap=4 // 2. planSelectionOperators adds another types.Int column, so // filterColumnTypes={types.Bool, types.Int} with len=2 and cap=4 // Crucially, it uses exact same underlying array as r.ColumnTypes // uses. // 3. we project out second column, so r.ColumnTypes={types.Bool} // 4. 
later, we add another types.Float column, so // r.ColumnTypes={types.Bool, types.Float}, but there is enough // capacity in the array, so we simply overwrite the second slot // with the new type which corrupts filterColumnTypes to become // {types.Bool, types.Float}, and we can get into a runtime type // mismatch situation. ``` The only differences are: - aliasing of the type slice occurs via the `InputSyncSpec.ColumnTypes` that is often used as the starting points for populating `NewColOperatorResult.ColumnTypes` which is used throughout the vectorized operator planning - columns are "projected out" by sharing the type schema between two stages of DistSQL processors. This commit addresses this issue by capping the slice to its length right before we get into the vectorized planning. This will make it so that if we need to append a type, then we'll make a fresh allocation, and any possible memory aliasing with a different stage of processors will be gone. I haven't quite figured out the exact conditions that are needed for this bug to occur, but my intuition says that it should be quite rare in practice (otherwise we'd have seen this much sooner given that the offending commit was merged more than a year ago and was backported to older branches). Release note (bug fix): Previously, CockroachDB could encounter an internal error of the form `interface conversion: coldata.Column is` in an edge case and this is now fixed. The bug is present in versions 22.2.13+, 23.1.9+, 23.2+.
This commit fixes type schema corruption in the vectorized engine in an edge case. In particular, consider the following circumstances: - during the physical planning, when creating a new stage of processors, we often reuse the same type slice (stored in `InputSyncSpec.ColumnTypes`) that we get from the previous stage. In other words, we might have memory aliasing, but only on the gateway node because the remote nodes get their specs deserialized and each has its own memory allocation. - throughout the vectorized operator planning, as of 85fd4fb, for each newly projected vector we append the corresponding type to the type slice we have in scope. We also capture intermediate state of the type slice by some operators (e.g. `BatchSchemaSubsetEnforcer`). - as expected, when appending a type to the slice, if there is enough capacity, we reuse it, meaning that we often append to the slice that came to us via `InputSyncSpec.ColumnTypes`. - now, if we have two stages of processors that happened to share the same underlying type slice with some free capacity AND we needed to append vectors for each stage, then we might corrupt the type schema captured by an operator for the earlier stage when performing vectorized planning for the later stage. The bug is effectively the same as the comment deleted by 85fd4fb outlined: ``` // As an example, consider the following scenario in the context of // planFilterExpr method: // 1. r.ColumnTypes={types.Bool} with len=1 and cap=4 // 2. planSelectionOperators adds another types.Int column, so // filterColumnTypes={types.Bool, types.Int} with len=2 and cap=4 // Crucially, it uses exact same underlying array as r.ColumnTypes // uses. // 3. we project out second column, so r.ColumnTypes={types.Bool} // 4. 
later, we add another types.Float column, so // r.ColumnTypes={types.Bool, types.Float}, but there is enough // capacity in the array, so we simply overwrite the second slot // with the new type which corrupts filterColumnTypes to become // {types.Bool, types.Float}, and we can get into a runtime type // mismatch situation. ``` The only differences are: - aliasing of the type slice occurs via the `InputSyncSpec.ColumnTypes` that is often used as the starting points for populating `NewColOperatorResult.ColumnTypes` which is used throughout the vectorized operator planning - columns are "projected out" by sharing the type schema between two stages of DistSQL processors. This commit addresses this issue by capping the slice to its length right before we get into the vectorized planning. This will make it so that if we need to append a type, then we'll make a fresh allocation, and any possible memory aliasing with a different stage of processors will be gone. I haven't quite figured out the exact conditions that are needed for this bug to occur, but my intuition says that it should be quite rare in practice (otherwise we'd have seen this much sooner given that the offending commit was merged more than a year ago and was backported to older branches). Release note (bug fix): Previously, CockroachDB could encounter an internal error of the form `interface conversion: coldata.Column is` in an edge case and this is now fixed. The bug is present in versions 22.2.13+, 23.1.9+, 23.2+.
This commit fixes type schema corruption in the vectorized engine in an edge case. In particular, consider the following circumstances: - during the physical planning, when creating a new stage of processors, we often reuse the same type slice (stored in `InputSyncSpec.ColumnTypes`) that we get from the previous stage. In other words, we might have memory aliasing, but only on the gateway node because the remote nodes get their specs deserialized and each has its own memory allocation. - throughout the vectorized operator planning, as of 85fd4fb, for each newly projected vector we append the corresponding type to the type slice we have in scope. We also capture intermediate state of the type slice by some operators (e.g. `BatchSchemaSubsetEnforcer`). - as expected, when appending a type to the slice, if there is enough capacity, we reuse it, meaning that we often append to the slice that came to us via `InputSyncSpec.ColumnTypes`. - now, if we have two stages of processors that happened to share the same underlying type slice with some free capacity AND we needed to append vectors for each stage, then we might corrupt the type schema captured by an operator for the earlier stage when performing vectorized planning for the later stage. The bug is effectively the same as the comment deleted by 85fd4fb outlined: ``` // As an example, consider the following scenario in the context of // planFilterExpr method: // 1. r.ColumnTypes={types.Bool} with len=1 and cap=4 // 2. planSelectionOperators adds another types.Int column, so // filterColumnTypes={types.Bool, types.Int} with len=2 and cap=4 // Crucially, it uses exact same underlying array as r.ColumnTypes // uses. // 3. we project out second column, so r.ColumnTypes={types.Bool} // 4. 
later, we add another types.Float column, so // r.ColumnTypes={types.Bool, types.Float}, but there is enough // capacity in the array, so we simply overwrite the second slot // with the new type which corrupts filterColumnTypes to become // {types.Bool, types.Float}, and we can get into a runtime type // mismatch situation. ``` The only differences are: - aliasing of the type slice occurs via the `InputSyncSpec.ColumnTypes` that is often used as the starting points for populating `NewColOperatorResult.ColumnTypes` which is used throughout the vectorized operator planning - columns are "projected out" by sharing the type schema between two stages of DistSQL processors. This commit addresses this issue by capping the slice to its length right before we get into the vectorized planning. This will make it so that if we need to append a type, then we'll make a fresh allocation, and any possible memory aliasing with a different stage of processors will be gone. I haven't quite figured out the exact conditions that are needed for this bug to occur, but my intuition says that it should be quite rare in practice (otherwise we'd have seen this much sooner given that the offending commit was merged more than a year ago and was backported to older branches). Release note (bug fix): Previously, CockroachDB could encounter an internal error of the form `interface conversion: coldata.Column is` in an edge case and this is now fixed. The bug is present in versions 22.2.13+, 23.1.9+, 23.2+.
This commit fixes type schema corruption in the vectorized engine in an edge case. In particular, consider the following circumstances: - during the physical planning, when creating a new stage of processors, we often reuse the same type slice (stored in `InputSyncSpec.ColumnTypes`) that we get from the previous stage. In other words, we might have memory aliasing, but only on the gateway node because the remote nodes get their specs deserialized and each has its own memory allocation. - throughout the vectorized operator planning, as of 85fd4fb, for each newly projected vector we append the corresponding type to the type slice we have in scope. We also capture intermediate state of the type slice by some operators (e.g. `BatchSchemaSubsetEnforcer`). - as expected, when appending a type to the slice, if there is enough capacity, we reuse it, meaning that we often append to the slice that came to us via `InputSyncSpec.ColumnTypes`. - now, if we have two stages of processors that happened to share the same underlying type slice with some free capacity AND we needed to append vectors for each stage, then we might corrupt the type schema captured by an operator for the earlier stage when performing vectorized planning for the later stage. The bug is effectively the same as the comment deleted by 85fd4fb outlined: ``` // As an example, consider the following scenario in the context of // planFilterExpr method: // 1. r.ColumnTypes={types.Bool} with len=1 and cap=4 // 2. planSelectionOperators adds another types.Int column, so // filterColumnTypes={types.Bool, types.Int} with len=2 and cap=4 // Crucially, it uses exact same underlying array as r.ColumnTypes // uses. // 3. we project out second column, so r.ColumnTypes={types.Bool} // 4. 
later, we add another types.Float column, so // r.ColumnTypes={types.Bool, types.Float}, but there is enough // capacity in the array, so we simply overwrite the second slot // with the new type which corrupts filterColumnTypes to become // {types.Bool, types.Float}, and we can get into a runtime type // mismatch situation. ``` The only differences are: - aliasing of the type slice occurs via the `InputSyncSpec.ColumnTypes` that is often used as the starting points for populating `NewColOperatorResult.ColumnTypes` which is used throughout the vectorized operator planning - columns are "projected out" by sharing the type schema between two stages of DistSQL processors. This commit addresses this issue by capping the slice to its length right before we get into the vectorized planning. This will make it so that if we need to append a type, then we'll make a fresh allocation, and any possible memory aliasing with a different stage of processors will be gone. I haven't quite figured out the exact conditions that are needed for this bug to occur, but my intuition says that it should be quite rare in practice (otherwise we'd have seen this much sooner given that the offending commit was merged more than a year ago and was backported to older branches). Release note (bug fix): Previously, CockroachDB could encounter an internal error of the form `interface conversion: coldata.Column is` in an edge case and this is now fixed. The bug is present in versions 22.2.13+, 23.1.9+, 23.2+.
This commit fixes type schema corruption in the vectorized engine in an edge case. In particular, consider the following circumstances: - during the physical planning, when creating a new stage of processors, we often reuse the same type slice (stored in `InputSyncSpec.ColumnTypes`) that we get from the previous stage. In other words, we might have memory aliasing, but only on the gateway node because the remote nodes get their specs deserialized and each has its own memory allocation. - throughout the vectorized operator planning, as of 85fd4fb, for each newly projected vector we append the corresponding type to the type slice we have in scope. We also capture intermediate state of the type slice by some operators (e.g. `BatchSchemaSubsetEnforcer`). - as expected, when appending a type to the slice, if there is enough capacity, we reuse it, meaning that we often append to the slice that came to us via `InputSyncSpec.ColumnTypes`. - now, if we have two stages of processors that happened to share the same underlying type slice with some free capacity AND we needed to append vectors for each stage, then we might corrupt the type schema captured by an operator for the earlier stage when performing vectorized planning for the later stage. The bug is effectively the same as the comment deleted by 85fd4fb outlined: ``` // As an example, consider the following scenario in the context of // planFilterExpr method: // 1. r.ColumnTypes={types.Bool} with len=1 and cap=4 // 2. planSelectionOperators adds another types.Int column, so // filterColumnTypes={types.Bool, types.Int} with len=2 and cap=4 // Crucially, it uses exact same underlying array as r.ColumnTypes // uses. // 3. we project out second column, so r.ColumnTypes={types.Bool} // 4. 
later, we add another types.Float column, so // r.ColumnTypes={types.Bool, types.Float}, but there is enough // capacity in the array, so we simply overwrite the second slot // with the new type which corrupts filterColumnTypes to become // {types.Bool, types.Float}, and we can get into a runtime type // mismatch situation. ``` The only differences are: - aliasing of the type slice occurs via the `InputSyncSpec.ColumnTypes` that is often used as the starting points for populating `NewColOperatorResult.ColumnTypes` which is used throughout the vectorized operator planning - columns are "projected out" by sharing the type schema between two stages of DistSQL processors. This commit addresses this issue by capping the slice to its length right before we get into the vectorized planning. This will make it so that if we need to append a type, then we'll make a fresh allocation, and any possible memory aliasing with a different stage of processors will be gone. I haven't quite figured out the exact conditions that are needed for this bug to occur, but my intuition says that it should be quite rare in practice (otherwise we'd have seen this much sooner given that the offending commit was merged more than a year ago and was backported to older branches). Release note (bug fix): Previously, CockroachDB could encounter an internal error of the form `interface conversion: coldata.Column is` in an edge case and this is now fixed. The bug is present in versions 22.2.13+, 23.1.9+, 23.2+.
This commit fixes type schema corruption in the vectorized engine in an edge case. In particular, consider the following circumstances: - during the physical planning, when creating a new stage of processors, we often reuse the same type slice (stored in `InputSyncSpec.ColumnTypes`) that we get from the previous stage. In other words, we might have memory aliasing, but only on the gateway node because the remote nodes get their specs deserialized and each has its own memory allocation. - throughout the vectorized operator planning, as of 85fd4fb, for each newly projected vector we append the corresponding type to the type slice we have in scope. We also capture intermediate state of the type slice by some operators (e.g. `BatchSchemaSubsetEnforcer`). - as expected, when appending a type to the slice, if there is enough capacity, we reuse it, meaning that we often append to the slice that came to us via `InputSyncSpec.ColumnTypes`. - now, if we have two stages of processors that happened to share the same underlying type slice with some free capacity AND we needed to append vectors for each stage, then we might corrupt the type schema captured by an operator for the earlier stage when performing vectorized planning for the later stage. The bug is effectively the same as the comment deleted by 85fd4fb outlined: ``` // As an example, consider the following scenario in the context of // planFilterExpr method: // 1. r.ColumnTypes={types.Bool} with len=1 and cap=4 // 2. planSelectionOperators adds another types.Int column, so // filterColumnTypes={types.Bool, types.Int} with len=2 and cap=4 // Crucially, it uses exact same underlying array as r.ColumnTypes // uses. // 3. we project out second column, so r.ColumnTypes={types.Bool} // 4. 
later, we add another types.Float column, so // r.ColumnTypes={types.Bool, types.Float}, but there is enough // capacity in the array, so we simply overwrite the second slot // with the new type which corrupts filterColumnTypes to become // {types.Bool, types.Float}, and we can get into a runtime type // mismatch situation. ``` The only differences are: - aliasing of the type slice occurs via the `InputSyncSpec.ColumnTypes` that is often used as the starting points for populating `NewColOperatorResult.ColumnTypes` which is used throughout the vectorized operator planning - columns are "projected out" by sharing the type schema between two stages of DistSQL processors. This commit addresses this issue by capping the slice to its length right before we get into the vectorized planning. This will make it so that if we need to append a type, then we'll make a fresh allocation, and any possible memory aliasing with a different stage of processors will be gone. I haven't quite figured out the exact conditions that are needed for this bug to occur, but my intuition says that it should be quite rare in practice (otherwise we'd have seen this much sooner given that the offending commit was merged more than a year ago and was backported to older branches). Release note (bug fix): Previously, CockroachDB could encounter an internal error of the form `interface conversion: coldata.Column is` in an edge case and this is now fixed. The bug is present in versions 22.2.13+, 23.1.9+, 23.2+.
Fixes cockroachdb#133015 Release note: None
Currently, CRDB log is configured, by roachtest, to log to a file to catch any logs written to it during a roachtest run. This is usually from a shared test util that uses the CRDB log. The file sink on the CRDB logger will log program arguments by default, but this can leak sensitive information. This PR introduces a log redirect that uses the CRDB log interceptor functionality instead of using a file sink. This way we can avoid logging the program arguments. Epic: None Release note: None
This replaces the initiation of the file sink based CRDB log with the new interceptor log redirect. It will log to a file in the artifacts directory. Epic: None Release note: None
…ort-release-23.1.29-rc-133241 release-23.1.29-rc: opt: relax max stack size in test for stack overflow
Fixes cockroachdb#133088. Release note: None Epic: None
When testing the license throttling behavior, it's helpful to easily override the telemetry ping timestamp. This timestamp will only be accepted if it's smaller than the current recorded timestamp, which reduces the chance that this ability can be used to extend the grace period. Epic: CRDB-40209 Release note: None
Instead of asking a user to "renew" we tell them to "add" a new license, which matches other language we've used. Epic: CRDB-40853 Release note: None
Prior to 24.3 release, we added a notification in the DB Console to alert customers to the licensing changes and give them time to prepare. Now that they'll be rolling out, the notice is removed since it's no longer in the future. Epic: CRDB-40853 Release note: None
Previously, the redaction logic for `Sensitive` settings in the diagnostics payload was conditional on the value of the `"server.redact_sensitive_settings.enabled"` cluster setting. This commit modifies the behavior of `RedactedValue` used to render modified cluster settings by the `diagnostics` package to always fully redact the values of string settings and any sensitive or non-reportable settings. Because the `MaskedSetting` struct is now in use by code in the `SHOW CLUSTER SETTING` code path, we no longer rely on it for redaction behavior of string settings. Note: This is a backport of a PR from `master` and this branch does not contain the concept of `sensitive` settings so only `non-reportable` ones are included. Resolves: CRDB-43457 Epic: None Release note (security update): all cluster settings that accept strings are now fully redacted when transmitted as part of our diagnostics telemetry. This payload includes a record of modified cluster settings and their values when they are not strings. Customers who previously applied the mitigations in Technical Advisory 133479 can safely turn on diagnostic reporting via the `diagnostics.reporting.enabled` cluster setting without leaking sensitive cluster settings values.
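The unconditional redaction rule described in this commit message can be sketched as follows. This is an illustration under stated assumptions, not the real `RedactedValue` implementation: the `setting` struct, its fields, the `redactedValue` function, the setting names, and the `<redacted>` marker are all hypothetical.

```go
package main

import "fmt"

// setting is a hypothetical, simplified view of a cluster setting.
type setting struct {
	name       string
	value      string
	isString   bool // the setting accepts a string value
	reportable bool // the setting may be included in telemetry
}

// redactedMarker is an assumed placeholder; the real marker may differ.
const redactedMarker = "<redacted>"

// redactedValue renders a setting for the diagnostics payload.
// String settings and non-reportable settings are always redacted,
// independent of any cluster setting that toggles redaction.
func redactedValue(s setting) string {
	if s.isString || !s.reportable {
		return redactedMarker
	}
	return s.value
}

func main() {
	settings := []setting{
		{name: "example.string_setting", value: "secret", isString: true, reportable: true},
		{name: "example.hidden_int", value: "7", isString: false, reportable: false},
		{name: "example.int_setting", value: "42", isString: false, reportable: true},
	}
	for _, s := range settings {
		fmt.Printf("%s=%s\n", s.name, redactedValue(s))
	}
}
```

Only the last setting keeps its value in the output, which matches the release note: values are reported only when they are not strings and the setting is reportable.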
…02326 [Replicated] release-23.1: Update pkg/testutils/release/cockroach_releases.yaml
…oach_releases.yaml"
…0-20250105102326 Revert "[Replicated] release-23.1: Update pkg/testutils/release/cockroach_releases.yaml"
…se/cockroach_releases.yaml""
…te-pr-135360-20250105102326 Revert "Revert "[Replicated] release-23.1: Update pkg/testutils/release/cockroach_releases.yaml""
…ls/release/cockroach_releases.yaml"""
…127-replicate-pr-135360-20250105102326 Revert "Revert "Revert "[Replicated] release-23.1: Update pkg/testutils/release/cockroach_releases.yaml"""
…/testutils/release/cockroach_releases.yaml""""
…130-revert-127-replicate-pr-135360-20250105102326 Revert "Revert "Revert "Revert "[Replicated] release-23.1: Update pkg/testutils/release/cockroach_releases.yaml""""
…date pkg/testutils/release/cockroach_releases.yaml"""""
…131-revert-130-revert-127-replicate-pr-135360-20250105102326 Revert "Revert "Revert "Revert "Revert "[Replicated] release-23.1: Update pkg/testutils/release/cockroach_releases.yaml"""""
…02335 [Replicated] release-23.1: pgwire,authccl: use pgx for TestAuthenticationAndHBARules
…02346 [Replicated] release-23.1: colexecerror: improve the catcher due to a recent regression
…33412 [Replicated] release-23.1: kvstreamer: fix pathological behavior in InOrder mode
…34015 [Replicated] release-23.1: kvstreamer: fix pathological behavior in InOrder mode
…35713 [Replicated] release-23.1: kvstreamer: fix pathological behavior in InOrder mode
…42940 [Replicated] release-23.1: kvstreamer: fix pathological behavior in InOrder mode
…44136 [Replicated] release-23.1: kvstreamer: fix pathological behavior in InOrder mode
…95913 [Replicated] release-23.1: kvstreamer: fix pathological behavior in InOrder mode
…33443 [Replicated] release-23.1: colexecerror: improve the catcher due to a recent regression
…33432 [Replicated] release-23.1: pgwire,authccl: use pgx for TestAuthenticationAndHBARules
Previously, the add column statement in the declarative schema changer only supported serial columns. This patch adds support for adding generated columns inside the declarative schema changer. Fixes: cockroachdb#128087 Release note: None
Reminder: it has been 3 weeks please merge or close your backport!
Replicated from original PR cockroachdb#139179
Original author: blathers-crl[bot]
Original creation date: 2025-01-15T20:57:57Z
Original reviewers: rafiss
Original description:
Backport 1/1 commits from cockroachdb#139135 on behalf of @fqazi.
/cc @cockroachdb/release
Previously, the add column statement in the declarative schema changer only supported serial columns. This patch adds support for adding generated columns inside the declarative schema changer.
Fixes: cockroachdb#128087
Release note: None
Release justification: merge functionality that was planned for 25.1