Skip to content

Conversation

@mildbyte
Copy link
Contributor

Make the metrics feature non-optional (there's a lot of code coming that exposes metrics and it'll get painful/pointless to maintain the ability to opt out of compiling metrics support at the cargo feature level

Add a special MemoryPool that wraps the default DataFusion MemoryPool (that setting defaults to GreedyMemoryPool, so construct that explicitly) and logs the total allocated/released bytes by each DataFusion query tree node.

Also replace all uses of RwLock<HashMap> to DashMap as @gruuya proposed.

Sample output:

# HELP seafowl_datafusion_memory_pool_reserved_bytes_current Current memory reserved in DataFusion's managed memory pool
# TYPE seafowl_datafusion_memory_pool_reserved_bytes_current gauge
seafowl_datafusion_memory_pool_reserved_bytes_current 0
# HELP seafowl_datafusion_memory_pool_freed_bytes_total Memory freed in DataFusion's managed memory pool
# TYPE seafowl_datafusion_memory_pool_freed_bytes_total counter
seafowl_datafusion_memory_pool_freed_bytes_total{consumer="RepartitionExec"} 2033677121
seafowl_datafusion_memory_pool_freed_bytes_total{consumer="NestedLoopJoinLoad"} 15072
seafowl_datafusion_memory_pool_freed_bytes_total{consumer="TopK"} 5951438
seafowl_datafusion_memory_pool_freed_bytes_total{consumer="GroupedHashAggregateStream"} 89032440
seafowl_datafusion_memory_pool_freed_bytes_total{consumer="CrossJoinExec"} 21184
seafowl_datafusion_memory_pool_freed_bytes_total{consumer="HashJoinStream"} 18346
seafowl_datafusion_memory_pool_freed_bytes_total{consumer="AggregateStream"} 3230
seafowl_datafusion_memory_pool_freed_bytes_total{consumer="SortPreservingMergeExec"} 2932891
seafowl_datafusion_memory_pool_freed_bytes_total{consumer="ExternalSorter"} 3268145
seafowl_datafusion_memory_pool_freed_bytes_total{consumer="HashJoinInput"} 112437164
seafowl_datafusion_memory_pool_freed_bytes_total{consumer="ExternalSorterMerge"} 120259084288

# HELP seafowl_datafusion_memory_pool_allocated_bytes_total Memory allocated in DataFusion's managed memory pool
# TYPE seafowl_datafusion_memory_pool_allocated_bytes_total counter
seafowl_datafusion_memory_pool_allocated_bytes_total{consumer="TopK",result="success"} 5951438
seafowl_datafusion_memory_pool_allocated_bytes_total{consumer="SortPreservingMergeExec",result="success"} 2932891
seafowl_datafusion_memory_pool_allocated_bytes_total{consumer="ExternalSorter",result="success"} 3268145
seafowl_datafusion_memory_pool_allocated_bytes_total{consumer="HashJoinStream",result="success"} 18346
seafowl_datafusion_memory_pool_allocated_bytes_total{consumer="ExternalSorterMerge",result="success"} 120259084288
seafowl_datafusion_memory_pool_allocated_bytes_total{consumer="HashJoinInput",result="success"} 112437164
seafowl_datafusion_memory_pool_allocated_bytes_total{consumer="NestedLoopJoinLoad",result="success"} 15072
seafowl_datafusion_memory_pool_allocated_bytes_total{consumer="AggregateStream",result="success"} 3230
seafowl_datafusion_memory_pool_allocated_bytes_total{consumer="GroupedHashAggregateStream",result="success"} 89032440
seafowl_datafusion_memory_pool_allocated_bytes_total{consumer="RepartitionExec",result="success"} 2033677121
seafowl_datafusion_memory_pool_allocated_bytes_total{consumer="CrossJoinExec",result="success"} 21184

Cargo.toml Outdated
tokio-graceful-shutdown = { version = "0.14" }
tonic = { version = "0.10.0", optional = true }
tower = { version = "0.4", optional = true }
tower = { version = "0.4" }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be just tower = "0.4"

@mildbyte mildbyte force-pushed the feature/memory-manager-observability branch from 213dbae to 5445458 Compare April 24, 2024 05:04
@mildbyte mildbyte merged commit 7e0a54e into main Apr 24, 2024
@mildbyte mildbyte deleted the feature/memory-manager-observability branch April 24, 2024 05:26

impl Metrics {
pub fn new() -> Self {
describe_gauge!(
Copy link
Collaborator

@SergeiPatiakin SergeiPatiakin Apr 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm late to the party, but should this be describe_counter rather than describe_gauge? Ditto in L24

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that's actually a good catch - these are used as counters below and not gauges, though for some reason they still show up in the metrics output as type counter.

$ curl localhost:9191/metrics | grep memo | grep TYPE
# TYPE seafowl_datafusion_memory_pool_allocated_bytes_total counter
# TYPE seafowl_datafusion_memory_pool_freed_bytes_total counter
# TYPE seafowl_datafusion_memory_pool_reserved_bytes_current gauge

Cowboy-pushed 2755bb8 to fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants