Conversation
force-pushed from 264726d to a910cf2
force-pushed from 633fc44 to 780208f
This commit allows soft deletion of a database using the `influxdb3 database delete <db_name>` command. The write buffer and last value cache are cleared as well.

closes: #25523
force-pushed from 8a265cc to 8f0c25a
- In the previous commit, deleting a database immediately triggered clearing the last cache and query buffer, but the same logic had to be repeated on restart to handle databases deleted before shutdown. This commit removes the immediate deletion (explicitly calling the necessary methods) and moves the logic into `apply_catalog_batch`, which already applies each `CatalogOp`; the cache and buffer are cleared in the `buffer_ops` method, which has hooks into the other places.

closes: #25523
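A minimal sketch of that shape, with hypothetical names standing in for the real influxdb3 types: the catalog applies every `CatalogOp` through one path, and a buffer-side hook reacts to the same ops, so live deletes and startup replay share the same logic.

```rust
// Sketch of routing all catalog mutations through one apply path, so that
// restart replay and live requests share the delete logic.
// Names are illustrative, not the actual influxdb3 code.
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum CatalogOp {
    CreateDatabase(String),
    DeleteDatabase(String),
}

#[derive(Default)]
struct Catalog {
    // name -> deleted flag (soft delete keeps the entry around)
    databases: HashMap<String, bool>,
}

#[derive(Default)]
struct WriteBuffer {
    cleared_for: Vec<String>,
}

impl Catalog {
    // Single entry point: both startup replay and live commands call this.
    fn apply_catalog_batch(&mut self, ops: &[CatalogOp]) {
        for op in ops {
            match op {
                CatalogOp::CreateDatabase(name) => {
                    self.databases.insert(name.clone(), false);
                }
                CatalogOp::DeleteDatabase(name) => {
                    // Soft delete: mark, don't remove.
                    if let Some(deleted) = self.databases.get_mut(name) {
                        *deleted = true;
                    }
                }
            }
        }
    }
}

impl WriteBuffer {
    // buffer_ops-style hook: reacts to the same ops to clear buffers/caches.
    fn buffer_ops(&mut self, ops: &[CatalogOp]) {
        for op in ops {
            if let CatalogOp::DeleteDatabase(name) = op {
                self.cleared_for.push(name.clone());
            }
        }
    }
}

fn main() {
    let mut catalog = Catalog::default();
    let mut buffer = WriteBuffer::default();
    let ops = vec![
        CatalogOp::CreateDatabase("db1".to_string()),
        CatalogOp::DeleteDatabase("db1".to_string()),
    ];
    catalog.apply_catalog_batch(&ops);
    buffer.buffer_ops(&ops);
    assert_eq!(catalog.databases.get("db1"), Some(&true));
    assert_eq!(buffer.cleared_for, vec!["db1".to_string()]);
    println!("soft delete applied and buffer cleared");
}
```

Because both code paths go through `apply_catalog_batch`, there is no second copy of the delete logic to keep in sync on restart.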
force-pushed from 8f0c25a to 2b729e4
hiltontj left a comment:
This mostly looks good. The only comment I think you need to address is adding the `deleted` field to the `DatabaseSnapshot` struct, to ensure the `deleted` field round-trips to and from JSON when the catalog is serialized. Once that is in and you fix the compiler warnings / insta snapshots, I will approve.
The other comments are up to you if you want to address them.
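For context, a minimal illustration of the round-trip concern (illustrative types, not the actual influxdb3_catalog code): if the intermediate snapshot struct omits `deleted`, the flag is silently lost when the catalog is serialized and reloaded, so the snapshot type must carry it too.

```rust
// Sketch: the schema type is converted through a snapshot type for
// (de)serialization, so every field that must survive persistence has to
// exist on the snapshot as well. Hypothetical types, not the real code.

struct DatabaseSchema {
    name: String,
    deleted: bool,
}

// The serialized form; in the real code the serde derives live on this type.
struct DatabaseSnapshot {
    name: String,
    deleted: bool, // without this field, `deleted` is lost on a round-trip
}

impl From<&DatabaseSchema> for DatabaseSnapshot {
    fn from(db: &DatabaseSchema) -> Self {
        Self { name: db.name.clone(), deleted: db.deleted }
    }
}

impl From<DatabaseSnapshot> for DatabaseSchema {
    fn from(snap: DatabaseSnapshot) -> Self {
        Self { name: snap.name, deleted: snap.deleted }
    }
}

fn main() {
    let db = DatabaseSchema { name: "db1".into(), deleted: true };
    // Round-trip through the snapshot type, as (de)serialization would.
    let restored = DatabaseSchema::from(DatabaseSnapshot::from(&db));
    assert!(restored.deleted, "deleted flag must survive the round-trip");
    println!("round-trip preserved deleted = {}", restored.deleted);
}
```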
influxdb3/tests/server/configure.rs (Outdated)
```rust
    .send()
    .await
    .expect("delete database call succeed");

assert_eq!(StatusCode::INTERNAL_SERVER_ERROR, resp.status());
```
Can we send back 404 NOT_FOUND in this case?
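A hedged sketch of the suggested mapping, with a plain enum and bare `u16` codes standing in for the server's real error and `StatusCode` types: a request naming a nonexistent database is a client error, so it should get 404 rather than a blanket 500.

```rust
// Illustrative only: map a "database not found" delete error to 404,
// reserving 500 for genuine server-side failures.

#[derive(Debug, PartialEq)]
enum DeleteError {
    DbNotFound,
    Internal,
}

fn status_code_for(err: &DeleteError) -> u16 {
    match err {
        DeleteError::DbNotFound => 404, // NOT_FOUND: caller named a missing db
        DeleteError::Internal => 500,   // INTERNAL_SERVER_ERROR: real failures
    }
}

fn main() {
    assert_eq!(status_code_for(&DeleteError::DbNotFound), 404);
    assert_eq!(status_code_for(&DeleteError::Internal), 500);
    println!("not-found maps to 404");
}
```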
```diff
 if let Some(db) = self.databases.get(&catalog_batch.database_id) {
-    let existing_table_count = db.tables.len();
-
-    if let Some(new_db) = db.new_if_updated_from_batch(catalog_batch)? {
-        let new_table_count = new_db.tables.len() - existing_table_count;
-        if table_count + new_table_count > Catalog::NUM_TABLES_LIMIT {
-            return Err(Error::TooManyTables);
-        }
-        let new_db = Arc::new(new_db);
-        self.databases.insert(new_db.id, Arc::clone(&new_db));
-        self.sequence = self.sequence.next();
-        self.updated = true;
-        self.db_map.insert(new_db.id, Arc::clone(&new_db.name));
+    if let Some(new_db) = DatabaseSchema::new_if_updated_from_batch(db, catalog_batch)? {
+        check_overall_table_count(db, &new_db, table_count)?;
+        self.upsert_db(new_db);
     }
 } else {
-    if self.databases.len() >= Catalog::NUM_DBS_LIMIT {
-        return Err(Error::TooManyDbs);
-    }
-
-    let new_db = DatabaseSchema::new_from_batch(catalog_batch)?;
-    if table_count + new_db.tables.len() > Catalog::NUM_TABLES_LIMIT {
-        return Err(Error::TooManyTables);
-    }
-
-    let new_db = Arc::new(new_db);
-    self.databases.insert(new_db.id, Arc::clone(&new_db));
-    self.sequence = self.sequence.next();
-    self.updated = true;
-    self.db_map.insert(new_db.id, Arc::clone(&new_db.name));
+    let new_db = self.check_db_count(catalog_batch, table_count)?;
+    self.upsert_db(new_db);
 }
```
IMHO the `upsert_db` method is nice here to DRY things up, but the logic captured in `check_overall_table_count` and `Self::check_db_count` is simple enough to keep inline. Furthermore, I don't think their names, as is, capture what they are doing very well.
Sure; I think `check_db_count` does a bit more: it creates the db and checks the table count again. I'll rejig it and see if you still think it should be inlined. Having said that, am I missing something in `check_overall_table_count`? I don't mind inlining, by the way; I just wanted to see if I've missed anything.
No, I don't think you missed anything; the logic is still good. I'm being pedantic, but I felt that moving the logic out to helpers that aren't otherwise re-used was overkill.
Yea, definitely. What I wanted to tease out was the table count check, as that gets used in both branches. I've kept the db check and creation inline now.
influxdb3_catalog/src/serialize.rs (Outdated)
```rust
// todo: check if it's right to default to false here,
// not sure where this is called
deleted: false,
```
Should be fine to add a `deleted` field on the `DatabaseSnapshot` struct, then base it off of that.
This bit wasn't too obvious when I saw the `From` impls. I see all the types have `..Snapshot` structs for serializing and deserializing; is that to avoid building a deserializer by hand? I guess with the partial (`..Snapshot`) structs you can use the derived `Deserialize` directly and then build the relevant fields for the target type manually, without a visitor impl. Is that the motivation?
Yeah, that about sums it up. The main motivation is the bi-directional maps that map from ID to name and vice versa. Those only need to be held in memory, because the information in the bi-directional map is contained in the main map, which maps the ID to the whole object and is serialized with the `SerdeVecMap`. There's no need to also serialize the bi-directional map and waste bytes.
Earlier on in the project, the motivation for the `_Snapshot` types was to capture the info that is contained in the arrow `Schema`, since we could not rely on its `Serialize`/`Deserialize` impls being stable.
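That design can be illustrated with a small sketch (hypothetical names, not the real catalog types): only the main ID-to-object map is persisted, and the ID-to-name map is rebuilt in memory when loading from a snapshot.

```rust
// Sketch of the motivation above: the id -> name map is derivable from the
// main id -> schema map, so only the latter is serialized; the snapshot
// conversion rebuilds the rest in memory. Illustrative names only.
use std::collections::HashMap;

struct DbSchema {
    name: String,
}

struct Catalog {
    // serialized (as a SerdeVecMap in the real code)
    databases: HashMap<u32, DbSchema>,
    // in-memory only; derived, never serialized
    db_map: HashMap<u32, String>,
}

// What actually hits disk: just the main map, as ordered pairs.
struct CatalogSnapshot {
    databases: Vec<(u32, String)>,
}

impl Catalog {
    fn from_snapshot(snap: CatalogSnapshot) -> Self {
        let databases: HashMap<u32, DbSchema> = snap
            .databases
            .into_iter()
            .map(|(id, name)| (id, DbSchema { name }))
            .collect();
        // Rebuild the bi-directional map instead of serializing it.
        let db_map = databases
            .iter()
            .map(|(id, db)| (*id, db.name.clone()))
            .collect();
        Self { databases, db_map }
    }
}

fn main() {
    let snap = CatalogSnapshot { databases: vec![(1, "db1".to_string())] };
    let catalog = Catalog::from_snapshot(snap);
    assert_eq!(catalog.db_map.get(&1).map(String::as_str), Some("db1"));
    assert!(catalog.databases.contains_key(&1));
    println!("db_map rebuilt from snapshot");
}
```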
Co-authored-by: Trevor Hilton <thilton@influxdata.com>
- `DatabaseSchema` serialization/deserialization is delegated to `DatabaseSnapshot`, so the `deleted` flag should be included in `DatabaseSnapshot` as well.
- insta test snapshots fixed

closes: #25523
force-pushed from d1a914d to f447bde
force-pushed from f447bde to 5fbf7b4
force-pushed from 5fbf7b4 to c7d62aa
This commit allows soft deletion of a database using `influxdb3 database delete <db_name>`. The write buffer and last value cache are cleared as well.

closes: #25523