
rfc: Alter column type #24703

Merged
craig[bot] merged 1 commit into cockroachdb:master from bobvawter:alter-rfc
May 4, 2018

Conversation

@bobvawter
Contributor

@bobvawter bobvawter commented Apr 11, 2018

This RFC describes how we will support altering column types in an online
fashion.

Release note: None

@bobvawter bobvawter requested a review from a team as a code owner April 11, 2018 19:34
@cockroach-teamcity
Member

This change is Reviewable

@knz
Contributor

knz commented Apr 11, 2018

Some early feedback posted here with agreement from the author.

In random order of importance:

  • we usually hard wrap the text of rfcs.
  • evolutation -> evolution
  • you're missing a conversion class: "Assisted", which is even more general than what you now call "General". I suggest renaming "General" to "Cast-based" and add "Assisted" next to that. The new section "Assisted" is when an explicit conversion expression is specified with USING. I think this will be a pretty common case actually.
  • appyling -> applying (may I suggest enabling the spell check in your editor? no judgment here)
  • you will get questions about your deferred/completed approach with jobs. It seems orthogonal to this RFC: the mechanism could be equally well applied to any of our other asynchronous schema changes. If you want to argue it is necessary for "alter table set type", then I'd recommend starting this work separately, and complete it before you resume work on this feature.
  • Cloumns -> columns
  • I think that any value written to the old type after the change has been initiated, must also be done synchronously to the shadow column. We wouldn't know how to pick them up during the backfill otherwise. (Same as adding a new index.)
  • "A backfill process is run on each node" -> "A backfill process is distributed across all nodes that have data for the table, like we do for index backfills already"
  • " Once the backfill is complete, the column and indexes are upgraded to READ status. Any code that resolves a column or index name to an id should be prepared to look for a READ-mode shadow that has been activated. " - I have no idea what you're talking about here (neither what this is about nor why this is important). You should clarify, with examples.
  • there's a missing entry in failure modes: Writes that occur during the backfill (after the change is initiated, but before the new type becomes active) are issued by the client app under the assumption that the old type is active. This means that from those clients' perspective it is OK to write any value valid for the old type but some of these values may be invalid for the new type. You have to outline what happens in that case.
  • the final paragraph "Solving for the general case does have the potential to become a science project" doesn't make much sense to me. What are you trying to say? The project is complex, that's true, but so are other projects on CockroachDB. What are the things that require science which will not be completed when the work defined by the RFC is done?
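To make the suggested "Assisted" class concrete, here is a hypothetical sketch of PostgreSQL's `USING` form (table and column names invented for illustration):

```sql
-- An explicit conversion expression supplied by the user; the server
-- evaluates it against each existing row to derive the new value.
ALTER TABLE measurements
  ALTER COLUMN reading
  SET DATA TYPE DECIMAL
  USING trim(reading)::DECIMAL;
```

PostgreSQL accepts this form today, so supporting it would also help client compatibility.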

@couchand couchand added the do-not-merge label (bors won't merge a PR with this label) Apr 11, 2018
@couchand
Contributor

Pro-tips: Bors understands the do-not-merge label, and also you're gonna want to update the PR description before you merge.

@bobvawter
Contributor Author

@knz, thank you for your near-instant feedback. I have incorporated the changes that you have suggested.

@couchand Fixed.

@knz
Contributor

knz commented Apr 11, 2018

The RFC also needs to talk about interleaved tables and column families. How does the proposal work with those cases?


Reviewed 1 of 1 files at r1.
Review status: all files reviewed at latest revision, all discussions resolved.


Comments from Reviewable

Contributor

@madelynnblue madelynnblue left a comment


I think this is a great RFC. I really liked the discussion of the different classes and implementation phases.


## Indexed columns

Columns that are part of an index are supported, however columns that
Contributor


Does this (not changing a PK column) apply to all classes? Even trivial?

Given that a type-change process may take an arbitrary and unpredictable
amount of time to complete, we support an optional two-phase approach to
making the schema change visible to SQL clients. This allows the user
to choose exactly when a (breaking) change to the table schema will
Contributor


Can you define breaking change? It seems like it's anything not in the trivial class.


The `ALTER TABLE` command is first run with the `DEFERRED` option, which
begins the type-conversion in the background and returns an associated
job id. At any point in time, but generally after the job has
Contributor


What's the use of the job id? We don't return them anywhere else, and I might want to add them all at once instead of here so we can decide exactly when, why, and how we want to return job ids.

and indexes.
* A backfill process is distributed across all nodes in the cluster
which have data for the table. This backfill will populate the shadow
column and shadow indexes, inserting data with the source MVCC
Contributor


I don't think we can insert data with the source MVCC timestamp. If the timestamp is below the GC threshold then we certainly can't commit a txn at that timestamp. Also, that would require one txn per row, which is really slow. I think it's ok to use whatever the backfill chunk read time is, just like for column backfill.

when a writable shadow exists for their target column or index.
* Code that resolves a column or index name for reading should be
prepared to look for an activated shadow.
* It must be possible to insert bulk, backfilled data into the LSM such
Contributor


I'm not sure this is a thing we want to do because it would actually change history if someone was doing a read AS OF SYSTEM TIME.

@bobvawter
Contributor Author

@knz, as I understand them, both the interleaved tables features and column families are just used to compute the KV key that the data will actually be inserted at to improve data locality. Beyond ensuring that any shadow columns created by a type change are placed in the same column family as the source column, is there some specific problem you're thinking of? Or is this just a topic that should appear in any RFC related to data manipulation?


Review status: all files reviewed at latest revision, 5 unresolved discussions.


docs/RFCS/20180411_alter_column_type.md, line 97 at r1 (raw file):

Previously, mjibson (Matt Jibson) wrote…

Does this (not changing a PK column) apply to all classes? Even trivial?

My thinking is that we get non-PK columns working first to get all of the type-change machinery in place. Once it's possible to change the PK columns for a table, we can merge the two features together, since it seems like making any non-trivial change to a PK column type is effectively the same as changing the PK to a new (shadow) column.

We could certainly support trivial type changes for PK columns as part of this RFC, except that I think we'd need to handle foreign-key references. Let's say the user is doing STRING -> BYTES in table A and table B has an FK constraint. Does that STRING column in table B also need to have a cascaded type change to BYTES?
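The foreign-key scenario in question, sketched as a hypothetical schema (names invented):

```sql
CREATE TABLE a (id STRING PRIMARY KEY);
CREATE TABLE b (a_id STRING REFERENCES a (id));

-- If this succeeds, does b.a_id also need a cascaded change to BYTES?
ALTER TABLE a ALTER COLUMN id SET DATA TYPE BYTES;
```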


docs/RFCS/20180411_alter_column_type.md, line 105 at r1 (raw file):

Previously, mjibson (Matt Jibson) wrote…

Can you define breaking change? It seems like it's anything not in the trivial class.

Clarified the sentence.


docs/RFCS/20180411_alter_column_type.md, line 110 at r1 (raw file):

Previously, mjibson (Matt Jibson) wrote…

What's the use of the job id? We don't return them anywhere else, and I might want to add them all at once instead of here so we can decide exactly when, why, and how we want to return job ids.

That was left over from an earlier thought about how the two-phase commit would work. It's not actually needed and has been removed.


docs/RFCS/20180411_alter_column_type.md, line 203 at r1 (raw file):

Previously, mjibson (Matt Jibson) wrote…

I'm not sure this is a thing we want to do because it would actually change history if someone was doing a read AS OF SYSTEM TIME.

The scenario that I want to avoid looks like:

  • Start reading a range to perform the conversion.
  • The user issues a write to a row within that range.
  • Stomp on the user's write with the backfilled data.

It would be fine, I think, to retry that range if we detect a conflict; is there plumbing that already makes this possible?



@bobvawter
Contributor Author

Another concern I just thought of is range thrash: since the shadow column is going to duplicate the source column and we want to be able to have them both live in the KV layer at the same time, our encoded column families are going to get (temporarily) larger. This could cause extra range-splitting to happen, right? Is this going to be a problem?


Review status: 0 of 1 files reviewed at latest revision, 5 unresolved discussions.



@knz
Contributor

knz commented Apr 13, 2018

Regarding interleaved tables: the RFC should discuss whether changing the column type on the parent should be blocked (error) or accepted (and propagate automatically to the children tables).

@madelynnblue
Contributor

Review status: 0 of 1 files reviewed at latest revision, 3 unresolved discussions, some commit checks failed.


docs/RFCS/20180411_alter_column_type.md, line 97 at r1 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

My thinking is that we get non-PK columns working first to get all of the type-change machinery in place. Once it's possible to change the PK columns for a table, we can merge the two features together, since it seems like making any non-trivial change to a PK column type is effectively the same as changing the PK to a new (shadow) column.

We could certainly support trivial type changes for PK columns as part of this RFC, except that I think we'd need to handle foreign-key references. Let's say the user is doing STRING -> BYTES in table A and table B has an FK constraint. Does that STRING column in table B also need to have a cascaded type change to BYTES?

Good point. Sounds like we should stick with no PK and no FK for now.


docs/RFCS/20180411_alter_column_type.md, line 350 at r2 (raw file):

|                     | `TIMESTAMP`             |                     |
| `STRING COLLATE ab` |                         |                     |
|                     | `STRING COLLATE cd`     | Trivial             |

I thought the collation key affected the on-disk bytes. Does it not?



Contributor

@vivekmenezes vivekmenezes left a comment


Thanks for putting this together. It is very well thought through, especially for someone new to our system. Thank you!

updated schema, such as converting `STRING -> BYTES` or `STRING(10) ->
STRING(25)`. These enable a fast-path conversion by simply updating the
table descriptor.

Contributor


you can go ahead and implement these type conversions ^
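For reference, a hedged sketch of what those descriptor-only trivial conversions could look like (table and column names invented):

```sql
-- Neither statement rewrites any on-disk data; both are satisfiable by
-- updating the column type recorded in the table descriptor.
ALTER TABLE users ALTER COLUMN token SET DATA TYPE BYTES;       -- STRING -> BYTES
ALTER TABLE users ALTER COLUMN name  SET DATA TYPE STRING(25);  -- STRING(10) -> STRING(25)
```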

Validated conversions involve no conversion of on-disk data, but do
require that all existing data is validated against the proposed schema
change before completion. This would include conversions such as
`STRING(100) -> STRING(10)` or `BYTES -> STRING`. These have a cost
Contributor


Are you planning on running some sort of conversion for invalid data? What does postgres do?

run `COMPLETE` before the conversion job has finished; it is simply a
signal to indicate that the SQL consumers are prepared to handle the new
column type.

Contributor


good thinking about this. I'm not sure if we need to do it in a first implementation though.

retired. The shadow column will be anonymous and generally invisible to
the user until it is swapped in. Shadow columns will be created in the
same column family as the source column.

Contributor


nice summary!

which have data for the table. This backfill will populate the shadow
column and shadow indexes, inserting data with the source MVCC
timestamp. The backfilled sstables will be inserted into the LSM such
that any shadowed write will have priority.
Contributor


you can say here that backfilling this column is similar to how we backfill a computed column today

the target type, the `INSERT` should fail with an error message such as
`column user_id is being converted from STRING to UUID. The value
'foobar' cannot be converted to UUID.`

Contributor


The biggest complexity in this change is introducing the notion of a shadow column everywhere in the runtime code. It's quite similar to a computed column with the same name. I'd like us to try to hide the shadow column's name until it has been promoted, so that we don't have to make the assumption everywhere in the code that a duplicate column can exist; perhaps we should leave its name unspecified until the promotion happens.

Contributor


@justinj might have some suggestions here.

@bobvawter
Contributor Author

Review status: 0 of 1 files reviewed at latest revision, 10 unresolved discussions, some commit checks failed.


docs/RFCS/20180411_alter_column_type.md, line 97 at r1 (raw file):

Previously, mjibson (Matt Jibson) wrote…

Good point. Sounds like we should stick with no PK and no FK for now.

Added a paragraph in the rationale section summarizing this.


docs/RFCS/20180411_alter_column_type.md, line 183 at r1 (raw file):

Previously, mjibson (Matt Jibson) wrote…

I don't think we can insert data with the source MVCC timestamp. If the timestamp is below the GC threshold then we certainly can't commit a txn at that timestamp. Also, that would require one txn per row, which is really slow. I think it's ok to use whatever the backfill chunk read time is, just like for column backfill.

Clarified this section to describe the ideal condition, pending discussion with DT and core-team about how we can slide sstable updates "underneath" any pre-existing writes to a particular KV key.


docs/RFCS/20180411_alter_column_type.md, line 203 at r1 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

The scenario that I want to avoid looks like:

  • Start reading a range to perform the conversion.
  • The user issues a write to a row within that range.
  • Stomp on the user's write with the backfilled data.

It would be fine, I think, to retry that range if we detect a conflict; is there plumbing that already makes this possible?

Moved this into the unresolved-questions section.


docs/RFCS/20180411_alter_column_type.md, line 107 at r2 (raw file):

Previously, vivekmenezes wrote…

Are you planning on running some sort of conversion for invalid data? What does postgres do?

Added details.


docs/RFCS/20180411_alter_column_type.md, line 176 at r2 (raw file):

Previously, vivekmenezes wrote…

good thinking about this. I'm not sure if we need to do it in a first implementation though.

It's not going to be necessary for a trivial conversion, since those are "instantaneous", but we'll want a two-phase setup for validated changes.


docs/RFCS/20180411_alter_column_type.md, line 204 at r2 (raw file):

Previously, vivekmenezes wrote…

nice summary!

Thanks.


docs/RFCS/20180411_alter_column_type.md, line 268 at r2 (raw file):

Previously, vivekmenezes wrote…

@justinj might have some suggestions here.

Clarified, in the sentence where shadow columns are introduced, that they will be anonymous.



@bobvawter
Contributor Author

@mjibson, @knz I'd like to get to an lgtm to land this as an in-progress rfc and start laying down code. Thanks.


Review status: 0 of 1 files reviewed at latest revision, 10 unresolved discussions.



@bobvawter bobvawter changed the title from "wip: rfc: Alter column type" to "rfc: Alter column type" Apr 20, 2018
@bobvawter bobvawter removed the do-not-merge label (bors won't merge a PR with this label) Apr 20, 2018
@a-robinson
Contributor

Reviewed 1 of 1 files at r3.
Review status: all files reviewed at latest revision, 10 unresolved discussions.


docs/RFCS/20180411_alter_column_type.md, line 163 at r3 (raw file):

-- Wait for the job to finish, stop the application, swap the column type.
-- This command would block, returning any output from the conversion job.
-- Any reads that happen-after this command will see the new type, while any

What happens to reads that are submitted to cockroach while the COMPLETE is blocking? Do they block?


docs/RFCS/20180411_alter_column_type.md, line 189 at r3 (raw file):

Quoted 5 lines of code…
Columns that are the target of a primary or foreign key constraint will
not be supported under this RFC. It could be possible to support foreign
keys in the future by cascading the type change, but this would greatly
increase the amount of coordination required to effect the total schema
change.

This could also break views in certain circumstances. We either need to disallow changing the types of columns used in views, or do some hard thinking about how type changes affect them. @knz is the expert here.


docs/RFCS/20180411_alter_column_type.md, line 383 at r3 (raw file):

since a copy-and-rename operation can be performed or users can write
their own migration code. This is, however, a sub-optimal user
experience and will diminish compatibility with ORM or other tooling

Nit that you don't have to bother fixing, but saying that it "will diminish compatibility" is much weaker than saying that it does diminish compatibility and calling out at least one specific important ORM/tool that is hurt by it.



@madelynnblue
Contributor

LGTM modulo Alex's comments.


Review status: all files reviewed at latest revision, 13 unresolved discussions.



@knz
Contributor

knz commented Apr 23, 2018

Thanks for pinging me. Overall I'm happy about most of the details addressed but there's still one major point open. See below.


Review status: all files reviewed at latest revision, 13 unresolved discussions.


docs/RFCS/20180411_alter_column_type.md, line 144 at r3 (raw file):
On second read I'm afraid this DEFERRED/COMPLETE business won't do.
Remember, we're investing in this work also partly for compatibility with PostgreSQL clients. It is thus rather important that we invest sufficient time and attention in reusing PostgreSQL's SQL syntax/semantics to achieve this.

  1. pg's syntax for ALTER TABLE SET TYPE does not include DEFERRED/COMPLETE
  2. we want to introduce compatibility with existing clients/apps which already use ALTER TABLE SET TYPE. These do not know about DEFERRED/COMPLETE.

So where does this leave us?

  • I think there should be an alternative to the mechanisms discussed here which does not require these keywords.
  • I have made a first comment about this in here: rfc: Alter column type #24703 (comment)

about your deferred/completed approach with jobs. It seems orthogonal to this RFC: the mechanism could be equally well applied to any of our other asynchronous schema changes. If you want to argue it is necessary for "alter table set type", then I'd recommend starting this work separately, and complete it before you resume work on this feature.

I would like to see your reaction to this. It seems to me that you could issue the ALTER TABLE SET TYPE like any of our other async schema changes, ie. queue the mutation and process it upon txn commit (like CREATE INDEX, ALTER TABLE ADD COLUMN, etc). If you were to do that, then we get the compatibility with pg clients "for free".

The mechanism you are proposing here (DEFERRED/COMPLETE) could be considered as well as a CockroachDB-extension, and made available to any ALTER not just this one.


docs/RFCS/20180411_alter_column_type.md, line 189 at r3 (raw file):

Previously, a-robinson (Alex Robinson) wrote…
Columns that are the target of a primary or foreign key constraint will
not be supported under this RFC. It could be possible to support foreign
keys in the future by cascading the type change, but this would greatly
increase the amount of coordination required to effect the total schema
change.

This could also break views in certain circumstances. We either need to disallow changing the types of columns used in views, or do some hard thinking about how type changes affect them. @knz is the expert here.

Yes it's rather simple really. The RFC must make it clear that changing the type of a column depended on by a view should be simply disallowed, there is nothing else we can do.
The DependedOnBy field on the descriptor indicates which column IDs are "locked" in that way.


docs/RFCS/20180411_alter_column_type.md, line 291 at r3 (raw file):

cast, it will be marked as `(extension)`.

| From                | To                      | Notes               |

This table must be extended to include TIMESTAMPTZ and TIMETZ

I'd also like to make it clear in the Notes which conversion applies. It's not clear to me what it means when the notes column is empty.


docs/RFCS/20180411_alter_column_type.md, line 337 at r3 (raw file):

|                     | `INTERVAL`              |                     |
|                     | `DECIMAL`               |                     |
|                     | `FLOAT`                 |                     |

float is listed twice



@knz
Contributor

knz commented Apr 23, 2018

Reviewed 1 of 1 files at r3.
Review status: all files reviewed at latest revision, 16 unresolved discussions.



ALTER TABLE rpc_logs
ALTER COLUMN caller_ip
SET DATA TYPE INET
DEFERRED
Contributor


I think this command should block until the entire type conversion has happened (with backfill), preparing the table for the quick execution of COMPLETE below.

@vivekmenezes
Contributor

Review status: all files reviewed at latest revision, 13 unresolved discussions.


docs/RFCS/20180411_alter_column_type.md, line 144 at r3 (raw file):

Previously, knz (kena) wrote…

On second read I'm afraid this DEFERRED/COMPLETE business won't do.
Remember, we're investing in this work also partly for compatibility with PostgreSQL clients. It is thus rather important that we invest sufficient time and attention in reusing PostgreSQL's SQL syntax/semantics to achieve this.

  1. pg's syntax for ALTER TABLE SET TYPE does not include DEFERRED/COMPLETE
  2. we want to introduce compatibility with existing clients/apps which already use ALTER TABLE SET TYPE. These do not know about DEFERRED/COMPLETE.

So where does this leave us.

  • I think there should be an alternative to the mechanisms discussed here which does not require these keywords.
  • I have made a first comment about this in here: rfc: Alter column type #24703 (comment)

about your deferred/completed approach with jobs. It seems orthogonal to this RFC: the mechanism could be equally well applied to any of our other asynchronous schema changes. If you want to argue it is necessary for "alter table set type", then I'd recommend starting this work separately, and complete it before you resume work on this feature.

I would like to see your reaction to this. It seems to me that you could issue the ALTER TABLE SET TYPE like any of our other async schema changes, ie. queue the mutation and process it upon txn commit (like CREATE INDEX, ALTER TABLE ADD COLUMN, etc). If you were to do that, then we get the compatibility with pg clients "for free".

The mechanism you are proposing here (DEFERRED/COMPLETE) could be considered as well as a CockroachDB-extension, and made available to any ALTER not just this one.

One idea is to introduce the notion of BACKFILL. Without this keyword, cockroach behaves like pg and runs all the individual steps for the schema change. With the BACKFILL keyword, it completes all the steps until the backfill is done. The BACKFILL command can then be followed by the plain command, which is sped up because the backfill is already complete.
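A hypothetical sketch of that BACKFILL idea (the keyword and table names are invented for illustration; no such syntax exists yet):

```sql
-- Step 1: run every step up to and including the backfill.
ALTER TABLE rpc_logs ALTER COLUMN caller_ip SET DATA TYPE INET BACKFILL;

-- Step 2: the plain, pg-compatible command completes quickly because
-- the backfill has already been done.
ALTER TABLE rpc_logs ALTER COLUMN caller_ip SET DATA TYPE INET;
```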



@vivekmenezes
Contributor

:lgtm:


Review status: all files reviewed at latest revision, 13 unresolved discussions.



@bobvawter
Contributor Author

Review status: 0 of 1 files reviewed at latest revision, 11 unresolved discussions.


docs/RFCS/20180411_alter_column_type.md, line 144 at r3 (raw file):

Previously, vivekmenezes wrote…

This seems like something specific to this command, so I don't think it's a good idea to add something to something as general as BEGIN TRANSACTION. Just looking at the postgres docs, I can't find a keyword appropriate for what we're doing here. DEFERRED means something else altogether.

s/DEFERRED/SHADOW/g



@bobvawter
Contributor Author

Based on a conversation with @knz, @mjibson, and @dt we'll take a different approach when the user needs control over exactly when a type change will become visible to clients that cannot support both the old and new column types concurrently. The rough approach looks like the following:

  • Add a new computed column from the active, original column that performs the desired type conversion.
  • Wait for the computed column to backfill.
  • Atomically swap the source and computed column, remove the computed-ness, and optionally drop the original column.

A regular ALTER COLUMN TYPE command is more or less a wrapper around the above using a hidden column and performing the swap when the backfill is ready. This will give us the desired user experience of directed schema evolution and provide an independently-useful feature of being able to drop the computed nature of a column. We'll also make sure that the above workflow is highly visible in the documentation.
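A hypothetical sketch of that explicit workflow (names invented; the syntax for dropping the computed-ness was not settled in this discussion, and in practice the final three steps would need to happen as one atomic schema change):

```sql
-- 1. Add a computed column that applies the desired conversion.
ALTER TABLE rpc_logs
  ADD COLUMN caller_ip_inet INET AS (caller_ip::INET) STORED;

-- 2. After the backfill finishes, swap the columns:
--    drop the computed expression, remove the old column, and rename.
ALTER TABLE rpc_logs ALTER COLUMN caller_ip_inet DROP STORED;  -- illustrative syntax
ALTER TABLE rpc_logs DROP COLUMN caller_ip;
ALTER TABLE rpc_logs RENAME COLUMN caller_ip_inet TO caller_ip;
```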


Review status: 0 of 1 files reviewed at latest revision, 11 unresolved discussions, some commit checks failed.



@bobvawter
Contributor Author

PTAL: I've pushed a new version of the RFC that eliminates the explicit syntax for controlling the two-phase type change for one that builds on top of being able to add new computed columns.

* The hidden flag is removed from the shadow column.
* The source column is dropped.
* The shadow column is renamed.
* Any indexes on the original column are dropped.
Contributor


This will break any queries that specify the index by name in a query. I think that's ok though, and we can document it. I'm just thinking through this.

Contributor


Indeed. Also, indexes have a DependedOnBy field populated by views; this needs to be taken into account too.

Contributor Author


We should be able to rename the shadow indexes while we're mucking around in those data structures, right?

Contributor


Yes indeed. Renaming the shadow index(es) to the original index name(s) should do the trick.

Contributor

@knz knz left a comment


Thanks for all the improvements already.


## Views

Columns that are depended upon by a `CREATE VIEW` are out of scope for
Contributor


Clarify that the changes will ensure that a client receives an error if they attempt to change the type of a column depended on by a view.

Contributor


Also indexes as a whole can be depended on by something else (not just FKs).

Try this: "In general any time there is a structural dependency on a column or an index, the alter proposed here will bail out with an error. Supporting these cases is left to later work."

* The hidden flag is removed from the shadow column.
* The source column is dropped.
* The shadow column is renamed.
* Any indexes on the original column are dropped.
Contributor


Yes indeed. Renaming the shadow index(es) to the original index name(s) should do the trick.

the source column's family, with the expression provided by the user
via `USING expression` or inferred for cast-type conversions.
* If there are indexes that refer to the column, additional
shadow-indexes are created.
Contributor


Note for the implementation: be mindful that cockroachdb currently supports (for better or for worse) multiple indexes over the same set of columns, but with different names. Perhaps we shouldn't do that, but we can. There must be as many shadow indexes created as there were indexes originally, and the atomic rename described below must apply separately to all of them.

In the case where a column-type conversion (e.g. `STRING -> UUID`) is in
process and a user attempts to insert data that is not compatible with
the target type, the `INSERT` should fail with an error message such as
`column user_id is being converted from STRING to UUID. The value
Contributor

if you reuse the computed column mechanism, it may not be practical / elegant to change the error message. Please check what would currently happen with a computed column, and see if the error message is acceptable from a UX perspective. If it's not, then this RFC should outline how you plan to extend the code for error handling, and/or if it would be possible to use the same error messages for both regular computed columns and type substitutions as described here. I would prefer to avoid weird boolean flags throughout multiple layers of abstraction just to get the right error message.


## Job control

The type change will be modeled as a bulk-IO job, allowing it to be
Contributor

Make a reference here of how this will align with the handling of other DDL statements, and highlight that you plan to reuse the same code paths.

(Note that not all DDL statements are currently handled as jobs — although arguably they should be — so pick one that already is, as an example.)

The element type of an array can be converted, based on the conversion
rules above. That is, `STRING[]` could be converted to `DATE[]`.
Additionally, it is possible to convert any type to an array of a
convertible type, e.g. `STRING -> DATE[]`.
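
As a sketch of the element-wise case (hypothetical table; syntax per this proposal, with the conversion expression spelled out via `USING` for clarity):

```sql
-- Element-wise conversion of an array column: STRING[] -> DATE[].
ALTER TABLE events ALTER COLUMN days SET DATA TYPE DATE[] USING days::DATE[];
```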
Contributor

Can you expand on this? What is the expected semantics of this last conversion? What does pg do in that case? What is the use case?

In general I would be rather strongly against the idea of enabling an automatic (i.e. non-assisted) column type conversion that is not otherwise possible via a cast (using ::). This last example would be an example of that. I'd need a use case to be convinced otherwise.

Contributor

@vivekmenezes vivekmenezes left a comment

Looks good. I'd appreciate it if you provide an example of how a user will control a manual two-phase column type conversion. Thanks!

Once the type-change work has been completed, the new type becomes
effective and the old data can be retired. We allow only a single
column-type change to be in progress for any given table at a time;
multiple type changes will be queued and executed sequentially.
Contributor

you can add, "although this design doesn't preclude simultaneous execution of multiple type changes"

@bobvawter
Contributor Author

Review status: 0 of 1 files reviewed at latest revision, 18 unresolved discussions.


docs/RFCS/20180411_alter_column_type.md, line 208 at r5 (raw file):

Previously, knz (kena) wrote…

Yes indeed. renaming the shadow index(es) to be the original index name(s) should do the trick.

Done.


docs/RFCS/20180411_alter_column_type.md, line 83 at r6 (raw file):

Previously, vivekmenezes wrote…

you can add, "although this design doesn't preclude simultaneous execution of multiple type changes"

Done.


docs/RFCS/20180411_alter_column_type.md, line 167 at r6 (raw file):

Previously, knz (kena) wrote…

Also indexes as a whole can be depended on by something else (not just FKs).

Try this: "In general any time there is a structural dependency on a column or an index, the alter proposed here will bail out with an error. Supporting these cases is left to later work."

Reworked this section to talk more generally about not supporting structural dependencies other than indexes.


docs/RFCS/20180411_alter_column_type.md, line 192 at r6 (raw file):

Previously, knz (kena) wrote…

Note for the implementation: be mindful that cockroachdb currently supports (for better or for worse) multiple indexes over the same set of columns, but with different names. Perhaps we shouldn't do that, but we can. There must be as many shadow indexes created as there were indexes originally, and the atomic rename described below must apply separately to all of them.

Reworded to be more general.


docs/RFCS/20180411_alter_column_type.md, line 234 at r6 (raw file):

Previously, knz (kena) wrote…

if you reuse the computed column mechanism, it may not be practical / elegant to change the error message. Please check what would currently happen with a computed column, and see if the error message is acceptable from a UX perspective. If it's not, then this RFC should outline how you plan to extend the code for error handling, and/or if it would be possible to use the same error messages for both regular computed columns and type substitutions as described here. I would prefer to avoid weird boolean flags throughout multiple layers of abstraction just to get the right error message.

Understood. Updated this paragraph to discuss the alternative error message.


docs/RFCS/20180411_alter_column_type.md, line 239 at r6 (raw file):

Previously, knz (kena) wrote…

Make a reference here of how this will align with the handling of other DDL statements, and highlight that you plan to reuse the same code paths.

(Note that not all DDL statements are currently handled as jobs — although arguably they should — so pick one that is already as example)

Done.


docs/RFCS/20180411_alter_column_type.md, line 338 at r6 (raw file):

Previously, knz (kena) wrote…

can you expand on this. What is the expected semantics of this last conversion? What does pg does in that case? What is the use case?

In general I would be rather strongly against the idea of enabling an automatic (i.e. non-assisted) column type conversion that is not otherwise possible via a cast (using ::). This last example would be an example of that. I'd need a use case to be convinced otherwise.

Agreed. I have clarified that only T -> T[] can be performed implicitly. Any other array-involved conversion T -> S[] or T[] -> T requires an explicit USING expression.


Comments from Reviewable

@bobvawter
Contributor Author

PTAL: Added EXPLAIN(STEPS) to give DBAs a way to perform preparatory work so the actual ALTER ... COLUMN TYPE can be almost instantaneous.


Review status: 0 of 1 files reviewed at latest revision, 18 unresolved discussions.



@bobvawter
Contributor Author

bors r+

@knz
Contributor

knz commented May 2, 2018

Reviewed 1 of 1 files at r4.
Review status: 0 of 1 files reviewed at latest revision, 18 unresolved discussions, some commit checks pending.


docs/RFCS/20180411_alter_column_type.md, line 338 at r6 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

Agreed. I have clarified that only T -> T[] can be performed implicitly. Any other array-involved conversion T -> S[] or T[] -> T requires an explicit USING expression.

I must insist, the conversion T -> T[] is

  1. not clearly motivated here
  2. not possible via casts
  3. hard to justify implementation-wise: there's nothing that guarantees it will be cheap other than current implementation happenstance

I would encourage you to not implement this initially at all and instead wait until you have completed the rest of the work to decide whether it is indeed useful (and possibly leave that implementation to a follow-up project).

I could be convinced otherwise if I saw the business case, but the RFC is not presenting that yet.



@craig
Contributor

craig bot commented May 2, 2018

Build failed

craig bot pushed a commit that referenced this pull request May 2, 2018
24958: sql: Add support for parsing ALTER COLUMN TYPE r=bobvawter a=bobvawter

This change adds SQL parsing support for the proposed ALTER COLUMN TYPE syntax
in RFC #24703.  It still returns an unimplemented error message to the caller.

Release note: none

25242: sql: support binary format for DInterval r=justinj a=justinj

Fixes #24525.

Release note (sql change): The binary postgres wire format is now supported for
intervals.

Co-authored-by: Bob Vawter <bob@cockroachlabs.com>
Co-authored-by: Justin Jaffray <justin@cockroachlabs.com>
@bobvawter
Contributor Author

Review status: 0 of 1 files reviewed at latest revision, 18 unresolved discussions, some commit checks failed.


docs/RFCS/20180411_alter_column_type.md, line 338 at r6 (raw file):

Previously, knz (kena) wrote…

I must insist, the conversion T -> T[] is

  1. not clearly motivated here
  2. not possible via casts
  3. hard to justify implementation-wise: there's nothing that guarantees it will be cheap other than current implementation happenstance

I would encourage you to not implement this initially at all and instead wait until you have completed the rest of the work to decide whether it is indeed useful (and possibly leave that implementation to a follow-up project).

I could be convinced otherwise if I saw the business case, but the RFC is not presenting that yet.

T -> T[] and T[] -> T must be possible if we're implementing a general type-conversion system. I'm assuming that cast-type conversions will be implemented by substituting a default USING clause, which in this case would be USING ARRAY[column_name]. The implicit conversion is a user convenience and doesn't affect the rest of the type system. If it turns into an implementation problem, I'll reconsider allowing array conversions, but, for now, I remain unconvinced that it should be excluded.



@knz
Contributor

knz commented May 2, 2018

Reviewed 1 of 1 files at r7.
Review status: all files reviewed at latest revision, 18 unresolved discussions, some commit checks failed.


docs/RFCS/20180411_alter_column_type.md, line 338 at r6 (raw file):

T -> T[] and T[] -> T must be possible if we're implementing a general type-conversion system.

  1. Why?

We do not support conversion of UUID to/from FLOAT. We do not support conversion of INET to/from INTERVAL. The reason why we do not support them is that they are nonsensical.

There are plenty of nonsensical conversions.

I posit that converting T <-> T[] is also nonsensical. I am willing to change my mind, but you'd need to explain.

  2. What is a "general type-conversion system"?


@bobvawter
Contributor Author

Review status: all files reviewed at latest revision, 18 unresolved discussions, some commit checks failed.


docs/RFCS/20180411_alter_column_type.md, line 338 at r6 (raw file):

Previously, knz (kena) wrote…

T -> T[] and T[] -> T must be possible if we're implementing a general type-conversion system.

  1. Why?

We do not support conversion of UUID to/from FLOAT. We do not support conversion of INET to/from INTERVAL. The reason why we do not support them is that they are nonsensical.

There are plenty of nonsensical conversions.

I posit that converting T <-> T[] is also nonsensical. I am willing to change my mind, but you'd need to explain.

  2. What is a "general type-conversion system"?

You know the saying about how there's only three numbers that matter: 0, 1, ∞, and everything else is a special case? The way I see it, you should be able to have 0 of something (a never-created or dropped column), 1 of something (a column of type T) or an infinite number of something (a column of type T[]).

Right now, you can achieve the following conversions:

  • 0 -> 1 is ALTER TABLE ADD COLUMN t T
  • 0 -> ∞ is ALTER TABLE ADD COLUMN t T[]
  • 1, ∞ -> 0 is ALTER TABLE DROP COLUMN t

I want to solve for the missing conversion of 1 <--> ∞ by allowing the type of a column to be changed from T -> T[] or T[] -> T.



@knz
Contributor

knz commented May 2, 2018

Review status: all files reviewed at latest revision, 18 unresolved discussions, some commit checks failed.


docs/RFCS/20180411_alter_column_type.md, line 338 at r6 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

You know the saying about how there's only three numbers that matter: 0, 1, ∞, and everything else is a special case? The way I see it, you should be able to have 0 of something (a never-created or dropped column), 1 of something (a column of type T) or an infinite number of something (a column of type T[]).

Right now, you can achieve the following conversions:

  • 0 -> 1 is ALTER TABLE ADD COLUMN t T
  • 0 -> ∞ is ALTER TABLE ADD COLUMN t T[]
  • 1, ∞ -> 0 is ALTER TABLE DROP COLUMN t

I want to solve for the missing conversion of 1 <--> ∞ by allowing the type of a column to be changed from T -> T[] or T[] -> T.

  • T -> T[]: ALTER COLUMN x SET TYPE int[] USING ARRAY[x]
  • T[] -> T: ALTER COLUMN x SET TYPE int USING x[1]

What's missing?

In your opinion, why does PostgreSQL not allow writing 123::int::int[]?

In your opinion, why does PostgreSQL not allow writing array[1,2,3]::int?

Also what does T[] -> T mean when there's more than one element?



@bobvawter
Contributor Author

Review status: all files reviewed at latest revision, 18 unresolved discussions, some commit checks failed.


docs/RFCS/20180411_alter_column_type.md, line 338 at r6 (raw file):

T -> T[]: ALTER COLUMN x SET TYPE int[] USING ARRAY[x]
T[] -> T: ALTER COLUMN x SET TYPE int USING x[1]

Exactly. I'm not sure what we're discussing.



@knz
Contributor

knz commented May 2, 2018

Review status: all files reviewed at latest revision, 18 unresolved discussions, some commit checks failed.


docs/RFCS/20180411_alter_column_type.md, line 338 at r6 (raw file):
Look at the first sentence in the paragraph:

Any given type T may be implicitly converted to T[].

(emphasis mine)

If you were to rephrase this as "Any given type T may be explicitly converted to T[] with USING."

Then we'd be in agreement.



This RFC describes how we will support altering column types in an online
fashion.

Release note: None
@bobvawter
Contributor Author

@knz Updated doc so that all array-involved conversions require an explicit USING expression.
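
For the record, the resolved behavior can be sketched as follows (hypothetical table `t` with an INT-typed column `x`; these mirror the expressions from the review thread):

```sql
-- T -> T[]: wrap each value in a one-element array.
ALTER TABLE t ALTER COLUMN x SET DATA TYPE INT[] USING ARRAY[x];
-- T[] -> T: pick a representative element.
ALTER TABLE t ALTER COLUMN x SET DATA TYPE INT USING x[1];
```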

@bobvawter
Contributor Author

bors r+

craig bot pushed a commit that referenced this pull request May 4, 2018
24703: rfc: Alter column type r=bobvawter a=bobvawter

This RFC describes how we will support altering column types in an online
fashion.

Release note: None

Co-authored-by: Bob Vawter <bob@cockroachlabs.com>
@craig
Contributor

craig bot commented May 4, 2018

Build succeeded

@craig craig bot merged commit 97e26ec into cockroachdb:master May 4, 2018
@bobvawter bobvawter deleted the alter-rfc branch December 10, 2018 15:03
craig bot pushed a commit that referenced this pull request May 7, 2020
46893: RFCS: ALTER COLUMN TYPE for cast conversions r=jordanlewis,rohany a=RichardJCai

Created a new RFC for ALTER COLUMN TYPE that requires a cast for conversion. 

Thought I would create a new one instead of adding to #24703 since this one mainly focuses on ALTER COLUMN TYPE where a cast is required.

Release note: none
Release justification: Added an RFC - no code change.

48452: AUTHORS: Add juan@ to AUTHORS file r=JuanLeon1 a=JuanLeon1



48539: roachtest/schemachange: add MinVersion to mixed-version test r=spaskob a=spaskob

This tests the work done for 20.1 that turned schema changes into jobs and, in
addition, prevented making any new schema changes on a mixed cluster in
order to prevent bugs during upgrades.

Release note: none.

Co-authored-by: richardjcai <caioftherichard@gmail.com>
Co-authored-by: Juan Leon <github@artedo.com>
Co-authored-by: Spas Bojanov <pachob@gmail.com>