fix: Automatic intermediate directory cleanup for file object store #26480

Merged
stuartcarnie merged 1 commit into main from sgc/object_store_file_cleanup on Jun 2, 2025

@stuartcarnie
Contributor

Removes empty intermediate directories when a key is removed from local file system object storage, which matches cloud-based providers.

Prior to this commit, when using the file object store, if the two keys (12.parquet, 9.json) are deleted from the following layout:

/Volumes/RAMDisk/influxdb3_pro/ingest/c/4c/537/cb4/12.parquet
/Volumes/RAMDisk/influxdb3_pro/ingest/c/4c/537/cb4/9.json
/Volumes/RAMDisk/influxdb3_pro/ingest/c/8b/417/4dd/14.parquet
/Volumes/RAMDisk/influxdb3_pro/ingest/c/8b/417/4dd/10.parquet

The on-disk structure retains the now-empty, orphaned intermediate directory /Volumes/RAMDisk/influxdb3_pro/ingest/c/4c/537/cb4:

/Volumes/RAMDisk/influxdb3_pro/ingest/c/4c/537/cb4
/Volumes/RAMDisk/influxdb3_pro/ingest/c/8b/417/4dd/14.parquet
/Volumes/RAMDisk/influxdb3_pro/ingest/c/8b/417/4dd/10.parquet

However, cloud-based object stores don't store the intermediate directories. With the additional flag in this PR, the empty intermediate directories are also removed, resulting in the following:

/Volumes/RAMDisk/influxdb3_pro/ingest/c/8b/417/4dd/14.parquet
/Volumes/RAMDisk/influxdb3_pro/ingest/c/8b/417/4dd/10.parquet
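
The behavior described above can be illustrated with a minimal, standard-library-only sketch. This is not the code from this PR, and `delete_and_prune` is a hypothetical helper name. It assumes the store has a known root directory and that pruning should never remove the root itself: after deleting a key's file, it walks up the parent directories and removes each one that has become empty.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper (not from this PR): deletes `file`, then removes every
/// parent directory that is left empty, stopping at `root` (never removed).
fn delete_and_prune(root: &Path, file: &Path) -> io::Result<()> {
    fs::remove_file(file)?;

    let mut current = file.parent();
    while let Some(dir) = current {
        // Never climb to, or above, the root of the store.
        if dir == root || !dir.starts_with(root) {
            break;
        }
        // Stop at the first directory that still holds other entries.
        if fs::read_dir(dir)?.next().is_some() {
            break;
        }
        fs::remove_dir(dir)?;
        current = dir.parent();
    }
    Ok(())
}
```

With the layout from the example, deleting 12.parquet leaves c/4c/537/cb4 in place because 9.json is still there; deleting 9.json afterwards removes cb4, 537, and 4c, while c survives because it still contains 8b. The emptiness check and the removal are separate steps, so a real implementation would also need to handle a concurrent writer creating a file in between.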

@stuartcarnie stuartcarnie requested a review from a team June 1, 2025 23:32
@stuartcarnie stuartcarnie merged commit 494847b into main Jun 2, 2025
12 checks passed
@stuartcarnie stuartcarnie deleted the sgc/object_store_file_cleanup branch June 2, 2025 23:31
mgattozzi added a commit that referenced this pull request Sep 8, 2025
* fix: Automatic intermediate directory cleanup for file object store (#26480)

Removes empty intermediate directories when a key is removed from
local file system object storage, which matches cloud-based providers.

* feat: add retention period to catalog (#26479)

* feat: add retention period to catalog

* fix: handle humantime parsing error properly

* refactor: use new iox_http_util types

---------

Co-authored-by: Michael Gattozzi <mgattozzi@influxdata.com>

* chore: address a couple post-merge PR comments (#26489)

* feat: add concurrency limit for WAL replay (#26483)

WAL replay currently loads _all_ WAL files concurrently running into
OOM. This commit adds a CLI parameter `--wal-replay-concurrency-limit`
that would allow the user to set a lower limit and run WAL replay again.

closes: #26481

---------

Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com>
Co-authored-by: Michael Gattozzi <mgattozzi@influxdata.com>
Co-authored-by: praveen-influx <pkumar@influxdata.com>