Skip to content

fix: Error if after the migration there are not null columns that are not part of the new schema#6282

Merged
kodiakhq[bot] merged 5 commits intocloudquery:mainfrom
erezrokah:fix/cleanup_not_null
Jan 4, 2023
Merged

fix: Error if after the migration there are not null columns that are not part of the new schema#6282
kodiakhq[bot] merged 5 commits intocloudquery:mainfrom
erezrokah:fix/cleanup_not_null

Conversation

@erezrokah
Copy link
Copy Markdown
Member

@erezrokah erezrokah commented Jan 3, 2023

Summary

Fixes #6248 (implements first second solution from the issue)

@erezrokah erezrokah force-pushed the fix/cleanup_not_null branch 3 times, most recently from 8fd313d to c6196fe Compare January 3, 2023 13:16
@vercel
Copy link
Copy Markdown

vercel bot commented Jan 3, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Updated
cloudquery-web 🔄 Building (Inspect) Jan 4, 2023 at 9:24AM (UTC)

@erezrokah erezrokah force-pushed the fix/cleanup_not_null branch 2 times, most recently from 127f755 to 1948bbc Compare January 3, 2023 13:31
@erezrokah erezrokah force-pushed the fix/cleanup_not_null branch 2 times, most recently from 2b46e23 to d112538 Compare January 3, 2023 13:45
@erezrokah erezrokah marked this pull request as ready for review January 3, 2023 13:46
@erezrokah erezrokah requested review from a team and bbernays and removed request for a team January 3, 2023 13:46
@erezrokah erezrokah force-pushed the fix/cleanup_not_null branch from d112538 to 91bd761 Compare January 3, 2023 13:53
@yevgenypats
Copy link
Copy Markdown
Contributor

I would be cautions in dropping things as a data-integration tool. users can (and do) create their own constraints, views and so on.

@erezrokah
Copy link
Copy Markdown
Member Author

I would be cautions in dropping things as a data-integration tool. users can (and do) create their own constraints, views and so on.

That's my concern too. We're dropping the constraint on columns that were primary keys before the migration, and are not after the migration.
Instead of dropping the constraint we can exit with an error guiding the user what to do.

The main issue is that the migration succeeds, but puts the DB in a state that any following sync fails. Exiting with an error should also work.

@yevgenypats
Copy link
Copy Markdown
Contributor

I would be cautions in dropping things as a data-integration tool. users can (and do) create their own constraints, views and so on.

That's my concern too. We're dropping the constraint on columns that were primary keys before the migration, and are not after the migration. Instead of dropping the constraint we can exit with an error guiding the user what to do.

The main issue is that the migration succeeds, but puts the DB in a state that any following sync fails. Exiting with an error should also work.

Yes error would be a better/safer solution.

@erezrokah erezrokah changed the title fix: Drop not null when a column is no longer a PK in the new schema feat: Error if after the migration there are not null columns that are not part of the new schema Jan 4, 2023
@erezrokah erezrokah changed the title feat: Error if after the migration there are not null columns that are not part of the new schema fix: Error if after the migration there are not null columns that are not part of the new schema Jan 4, 2023
@erezrokah erezrokah force-pushed the fix/cleanup_not_null branch from 91bd761 to b80ab93 Compare January 4, 2023 09:24
@erezrokah erezrokah force-pushed the fix/cleanup_not_null branch from b80ab93 to 14790b6 Compare January 4, 2023 09:25
@erezrokah
Copy link
Copy Markdown
Member Author

Yes error would be a better/safer solution.

Done in 14790b6

@disq
Copy link
Copy Markdown
Member

disq commented Jan 4, 2023

Could also generate drop not null statements for the user to run.

@erezrokah
Copy link
Copy Markdown
Member Author

Could also generate drop not null statements for the user to run.

Ah, I tried that at first. But the problem they only work after we drop the primary keys. And if we drop the primary keys we won't be able to detect the error the next time the migration runs 🙃

@disq
Copy link
Copy Markdown
Member

disq commented Jan 4, 2023

Ah, I tried that at first. But the problem they only work after we drop the primary keys. And if we drop the primary keys we won't be able to detect the error the next time the migration runs

Need an --allow-schema-change option maybe? Which will auto-fix things by dropping PKs and remove constraints (and drop unused columns...)

For the user message you could also suggest the drop pk commands as well as drop not nulls.

@erezrokah
Copy link
Copy Markdown
Member Author

For the user message you could also suggest the drop pk commands as well as drop not nulls.

I felt that would be too verbose, but let me try that

@erezrokah
Copy link
Copy Markdown
Member Author

For the user message you could also suggest the drop pk commands as well as drop not nulls.

But then again, it brings the question - do we tell the user to drop the column? Drop only the constraint? Rename it?
The use case here is a primary key rename. We can't detect the rename just by looking at the DB

@erezrokah erezrokah requested a review from yevgenypats January 4, 2023 11:43
@disq
Copy link
Copy Markdown
Member

disq commented Jan 4, 2023

We can't detect the rename just by looking at the DB

In the olden times we did: I remember writing some code to compare a schema.Table with an actual table and to report what's changed and to generate sql migrations for the user. It detected column renames by checking if the types were equal and the names were similar.

Here we'll just suggest the user to drop any unused column, including renames. We'll take care of the rest in the next migration (by adding what's missing). We could also tell the user what the next steps are (from our migrations POV, ie. what we're going to do on the next run if the columns are dropped), so they can decide for themselves if it's a rename and take necessary action? Too much? Maybe.

@erezrokah
Copy link
Copy Markdown
Member Author

In the olden times we did: I remember writing some code to compare a schema.Table with an actual table and to report what's changed and to generate sql migrations for the user. It detected column renames by checking if the types were equal and the names were similar.

Here we'll just suggest the user to drop any unused column, including renames. We'll take care of the rest in the next migration (by adding what's missing). We could also tell the user what the next steps are (from our migrations POV, ie. what we're going to do on the next run if the columns are dropped), so they can decide for themselves if it's a rename and take necessary action? Too much? Maybe.

I could go either way. I'd rather give a generic error that's 100% correct, than have something wrong in our heuristics and provide the wrong recommendation.

Maybe dropping the primary key and the not null constraints is the safest option to recommend. I'll give it a try

@erezrokah
Copy link
Copy Markdown
Member Author

image

OK, so I think 371be8e should work (see image)

@erezrokah erezrokah added the automerge Automatically merge once required checks pass label Jan 4, 2023
@kodiakhq kodiakhq bot merged commit c5a4bf5 into cloudquery:main Jan 4, 2023
yevgenypats added a commit that referenced this pull request Jan 6, 2023
…s that are not part of the new schema (#6282)"

This reverts commit c5a4bf5.
kodiakhq bot pushed a commit that referenced this pull request Jan 6, 2023
I think this should be given additional take, potentially not doing any drops as we should avoid those like fire especially when running those in distributed mode and so on.
kodiakhq bot pushed a commit that referenced this pull request Jan 6, 2023
🤖 I have created a release *beep* *boop*
---


## [2.0.2](plugins-destination-postgresql-v2.0.1...plugins-destination-postgresql-v2.0.2) (2023-01-06)


### Bug Fixes

* **deps:** Update module github.com/cloudquery/plugin-sdk to v1.18.0 ([#6339](#6339)) ([158365a](158365a))
* **deps:** Update module github.com/cloudquery/plugin-sdk to v1.20.0 ([#6376](#6376)) ([d6187ec](d6187ec))
* **deps:** Update module github.com/cloudquery/plugin-sdk to v1.21.0 ([#6382](#6382)) ([5baea40](5baea40))
* **deps:** Update plugin-sdk to v1.21.0 for destinations ([#6419](#6419)) ([f3b989f](f3b989f))
* Error if after the migration there are `not null` columns that are not part of the new schema ([#6282](#6282)) ([c5a4bf5](c5a4bf5))
* **pg:** Return more detailed pg errors ([#6421](#6421)) ([acb3e21](acb3e21))
* **postgresql:** Revert [#6282](#6282) ([#6434](#6434)) ([2cecacd](2cecacd))
* **tests:** Postgresql: Read only columns defined in the schema ([#6371](#6371)) ([491941a](491941a))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
erezrokah added a commit to erezrokah/cloudquery that referenced this pull request Jan 9, 2023
kodiakhq bot pushed a commit that referenced this pull request Jan 9, 2023
…re not part of the new schema (#6519)



#### Summary

Bring back #6282 as the revert was not needed (see the fix #6466).

Also I fixed the issue with the tests, they were flaky not failing. The reason is that the separator length is based on the `alter table` statement. Since I use a random table name, I need to pad the table index to make it consistent.

**Please note that without this fix, if we rename a primary key column the migration will succeed but the first sync will fail without a clear message on the reason of the failure**.
Original issue #6248


<!--
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

automerge Automatically merge once required checks pass

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug(postgres): When recreating primary key(s) constraint, not null is not removed from old PK(s)

5 participants