-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Support DROP SCHEMA statements #6096
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| (false, None, false) => Err(DataFusionError::Execution(format!( | ||
| "Schema '{name}' doesn't exist." | ||
| ))), | ||
| // no schema found but dont return error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible scenario?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This scenario would occur for a statement like DROP SCHEMA does_not_exist
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The is like DROP SCHEMA foo IF EXISTS
This PR I think is consistent with what postgres does in this case
postgres=# drop schema if exists foo;
NOTICE: schema "foo" does not exist, skipping
DROP SCHEMA
| ); | ||
| ctx.sql("DROP SCHEMA test_schema CASCADE").await?; | ||
|
|
||
| assert!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @jaylmiller and @viirya and @comphead for the reviews. I left some structure comments that I think are worth considering but I don't think they are required to merge this PR as:
- This PR is well tested
- Some of the comments are to clean up the code structure rather than anything fundamental
| } | ||
| } | ||
|
|
||
| #[derive(Debug, Clone, PartialEq, Eq, Hash, PartialOrd, Ord)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is somewhat strange / non standard to have code below a mod test -- maybe it would be cleaner in its own module (e.g. datafusion/common/src/schema_reference.rs) perhaps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure that sounds good to me
| /// | ||
| /// By default returns a "Not Implemented" error | ||
| fn deregister_schema(&self, name: &str) -> Result<Option<Arc<dyn SchemaProvider>>> { | ||
| // use variables to avoid unused variable warnings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative pattern is to name the variable with a prefix. Like_
fn deregister_schema(&self, _name: &str) -> Result<Option<Arc<dyn SchemaProvider>>> {
| (false, None, false) => Err(DataFusionError::Execution(format!( | ||
| "Schema '{name}' doesn't exist." | ||
| ))), | ||
| // no schema found but dont return error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The is like DROP SCHEMA foo IF EXISTS
This PR I think is consistent with what postgres does in this case
postgres=# drop schema if exists foo;
NOTICE: schema "foo" does not exist, skipping
DROP SCHEMA
| datafusion test xyz CREATE VIEW test.xyz AS SELECT * FROM abc | ||
|
|
||
| statement error DataFusion error: This feature is not implemented: Only `DROP TABLE/VIEW | ||
| statement error DataFusion error: Execution error: Cannot drop schema test because other tables depend on it: xyz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
| /// Attempts to find a schema and deregister it. Returns a tuple of the schema and a | ||
| /// flag indicating whether dereg was performed (e.g if schema is found but has tables | ||
| /// then `cascade` must be set) | ||
| fn find_and_deregister_schema<'a>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if different catalogs might want to implement the "is the schema empty" check differently 🤔
If so, it might make sense to put logic for "is the schema empty so can we drop it" in the CatalogProvider
Perhaps if CatalogProvider had a signature like:
fn deregister_schema(&self, name: &str, cascade: bool) -> Result<Option<Arc<dyn SchemaProvider>>> {
I think this code would get significantly simpler as well as the debug_asserts to reassert what had just been validated would be unnecessary.
That being said, we could also do such a change as a follow on PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm that is a good point. I was trying to keep the signature as similar as possible to the other Catalog's deregister methods. But you are right that it could make sense to do it like this. The code within the context provider is a bit complex at the moment
| } | ||
| } | ||
|
|
||
| /// helper for the following drop schema tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could use sqllogic tests https://github.com/apache/arrow-datafusion/blob/main/datafusion/core/tests/sqllogictests/ to write this more concisely
For example, perhaps we could extend:
| Ok(()) | ||
| } | ||
|
|
||
| #[tokio::test] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend putting these tests in https://github.com/apache/arrow-datafusion/blob/main/datafusion/core/tests/sqllogictests/test_files/ddl.slt rather than a rs test
|
@alamb Thanks for the thorough review! I will modify this PR accordingly in the next day or so. |
|
FYI I think #6144 will likely also cause conflicts with this PR |
@alamb No worries! I've actually been following along with your work there--thanks for CCing me. I was planning on updating this PR according to your review in the next few days anyways--so conflicts not a big deal. Thanks for the heads up. |
fa552de to
be19d7b
Compare
|
@alamb Thanks for the suggestions: managed to clean this PR up ALOT following them. I think this should be ready to review/merge, but let me know what you think! |
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks really nice -- thank you @jaylmiller
| } | ||
|
|
||
| /// Removes a schema from this catalog. Implementations of this method should return | ||
| /// errors if the schema exists but cannot be dropped. For example, in DataFusion's |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️ --nice documentation
| pub struct DropCatalogSchema { | ||
| /// The schema name | ||
| pub name: OwnedSchemaReference, | ||
| /// If the schema exists |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| /// If the schema exists | |
| /// If true, do not error if the schema does not exist |
Which issue does this PR close?
Closes #6027
Rationale for this change
Being able to drop schemas is generally just a good feature to have: exists in most databases.
What changes are included in this PR?
deregister_schematoCatalogProvidertrait (and an impl for it inMemoryCatalogProvider)DropCatalogSchemaLogicalPlan::DropCatalogSchemaAre these changes tested?
Unit tests and sql integration tests
Are there any user-facing changes?
this is a user-facing feature but everything is backwards compatible.