Add limit to DefaultFileStatisticsCache #19052

@alamb

Also, the listing file statistics cache does not seem to have any memory limit, unlike the metadata cache, for example.

Is that by design, or do you think we need to add a similar limit for this cache too?

Originally posted by @bharath-techie in #18971 (comment)

Basically the cache used in ListingTable comes from here:
https://github.com/apache/datafusion/blob/81512da2b0aaa474f6c4ba205b05eea7b3095176/datafusion/core/src/datasource/listing_table_factory.rs#L188-L187

This, somewhat non-obviously, sets a DefaultFileStatisticsCache here:
https://github.com/apache/datafusion/blob/9f725d9c7064813cda0de0f87d115354b68d76e6/datafusion/catalog-listing/src/table.rs#L260-L259

The DefaultFileStatisticsCache has no limit:
https://github.com/apache/datafusion/blob/7d8b8602ad1be2f61f6a8ebb253ace9d85304ea7/datafusion/execution/src/cache/cache_unit.rs#L41-L40
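
To make the request concrete, here is a minimal sketch of what a memory-limited statistics cache could look like. It is not the DataFusion implementation or its API: `BoundedStatsCache`, `FileStats`, and the `size_bytes` estimate are hypothetical names invented for illustration, and least-recently-used eviction is only one possible policy.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical stand-in for per-file statistics (not a DataFusion type).
#[derive(Debug, Clone, PartialEq)]
pub struct FileStats {
    pub num_rows: usize,
    /// Estimated memory footprint of this entry, supplied by the caller.
    pub size_bytes: usize,
}

/// Hypothetical cache that evicts least-recently-used entries once the
/// sum of `size_bytes` would exceed `limit_bytes`.
pub struct BoundedStatsCache {
    limit_bytes: usize,
    used_bytes: usize,
    entries: HashMap<String, FileStats>,
    /// Keys ordered by recency: front is least recently used.
    lru: VecDeque<String>,
}

impl BoundedStatsCache {
    pub fn new(limit_bytes: usize) -> Self {
        Self {
            limit_bytes,
            used_bytes: 0,
            entries: HashMap::new(),
            lru: VecDeque::new(),
        }
    }

    /// Look up an entry and mark it as most recently used.
    pub fn get(&mut self, path: &str) -> Option<&FileStats> {
        if self.entries.contains_key(path) {
            self.lru.retain(|k| k.as_str() != path);
            self.lru.push_back(path.to_string());
        }
        self.entries.get(path)
    }

    /// Insert an entry, evicting older entries until it fits.
    pub fn put(&mut self, path: &str, stats: FileStats) {
        // Drop any existing entry first so the byte accounting stays exact.
        self.remove(path);
        // An entry larger than the whole budget is never cached.
        if stats.size_bytes > self.limit_bytes {
            return;
        }
        while self.used_bytes + stats.size_bytes > self.limit_bytes {
            match self.lru.pop_front() {
                Some(oldest) => {
                    if let Some(old) = self.entries.remove(&oldest) {
                        self.used_bytes -= old.size_bytes;
                    }
                }
                None => break,
            }
        }
        self.used_bytes += stats.size_bytes;
        self.lru.push_back(path.to_string());
        self.entries.insert(path.to_string(), stats);
    }

    /// Remove an entry and release its share of the budget.
    pub fn remove(&mut self, path: &str) -> Option<FileStats> {
        let removed = self.entries.remove(path);
        if let Some(old) = &removed {
            self.used_bytes -= old.size_bytes;
            self.lru.retain(|k| k.as_str() != path);
        }
        removed
    }

    pub fn used_bytes(&self) -> usize {
        self.used_bytes
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

With a 100-byte limit and three 40-byte entries, inserting the third entry evicts whichever of the first two was used least recently, so `used_bytes()` never exceeds the limit. A real change would also need a way to estimate the size of each statistics entry and a configuration knob for the limit, both of which are left open here.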
