[GLUTEN-6960][VL] Limit Velox untracked global memory manager's usage #6988

Merged
zhztheplayer merged 7 commits into apache:main from zhztheplayer:wip-6960
Aug 26, 2024
zhztheplayer commented Aug 23, 2024

Limit the usage of Velox's global memory manager to 0.75 * Spark overhead memory by default. The overhead memory is calculated in the same way as in vanilla Spark.
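As a rough illustration of the default described above, the sketch below computes the overhead the way vanilla Spark documents it (an explicit spark.executor.memoryOverhead wins; otherwise executor memory times the overhead factor, with a 384 MiB floor) and then applies the 0.75 ratio. The function names are hypothetical and this is not the code from the PR; the actual Gluten implementation may round or resolve the settings differently.

```python
MIB = 1024 * 1024


def spark_executor_overhead_bytes(
    executor_memory_bytes,
    explicit_overhead_bytes=None,
    overhead_factor=0.10,           # Spark's documented default factor
    min_overhead_bytes=384 * MIB,   # Spark's documented default minimum
):
    """Approximate vanilla Spark's executor overhead (hypothetical helper)."""
    # An explicitly configured spark.executor.memoryOverhead takes priority.
    if explicit_overhead_bytes is not None:
        return explicit_overhead_bytes
    return max(int(executor_memory_bytes * overhead_factor), min_overhead_bytes)


def velox_global_limit_bytes(overhead_bytes, ratio=0.75):
    """Cap for the untracked global memory manager (ratio taken from the PR text)."""
    return int(overhead_bytes * ratio)


if __name__ == "__main__":
    overhead = spark_executor_overhead_bytes(8 * 1024 * MIB)
    print(overhead, velox_global_limit_bytes(overhead))
```

With a small executor the 384 MiB floor dominates, so the cap becomes 288 MiB; with an explicit 2 GiB overhead the cap becomes 1.5 GiB.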

Velox's global memory manager is used for global allocations that don't belong to a specific query or task. These allocations are not tracked by Spark's task memory manager.

After the patch, spark.memory.offHeap.size and spark.executor.memoryOverhead (probably together with spark.executor.memoryOverheadFactor / spark.executor.minMemoryOverhead on newer Spark versions) will work together for the Velox backend's memory management:

  1. spark.memory.offHeap.size limits the majority of the memory allocations that happen during Velox query execution, e.g., hash tables, sort buffers, shuffle buffers, etc.
  2. spark.executor.memoryOverhead limits the memory allocations that are hard to track with Spark's task memory manager, or that don't belong to a specific query, e.g., allocations that happen during spilling, or possibly the global cache size (unimplemented).
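To make the split between the two settings concrete, here is a minimal sketch that renders a set of Spark confs as spark-submit arguments. The sizes are made-up example values, `to_submit_args` is a hypothetical helper, and spark.memory.offHeap.enabled is included on the assumption that off-heap memory has to be switched on for the size setting to take effect.

```python
# Example values only; tune them for the actual workload.
example_conf = {
    # Tracked allocations: hash tables, sort buffers, shuffle buffers, etc.
    "spark.memory.offHeap.enabled": "true",
    "spark.memory.offHeap.size": "20g",
    # Untracked or query-independent allocations, e.g. during spilling.
    "spark.executor.memoryOverhead": "4g",
}


def to_submit_args(conf):
    """Render a conf dict as a flat list of spark-submit arguments (hypothetical helper)."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(example_conf)))
```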

Edit: As the change fails a couple of CI tests, this PR sets the internal Velox global pool size limit to Long.MaxValue by default, to bypass the check for the time being until the relevant Velox bugs are fixed.
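The effect of that default can be pictured with a toy capped pool: with a finite cap an oversized request is rejected, while a cap of Long.MaxValue (2^63 - 1) never triggers in practice. `CappedPool` is a hypothetical illustration, not Velox's or Gluten's actual memory manager API.

```python
JAVA_LONG_MAX = 2**63 - 1  # value of Long.MaxValue on the JVM
MIB = 1024 * 1024


class CappedPool:
    """Toy stand-in for a memory pool with a hard usage cap (hypothetical)."""

    def __init__(self, limit_bytes=JAVA_LONG_MAX):
        self.limit_bytes = limit_bytes
        self.used_bytes = 0

    def allocate(self, size_bytes):
        # Reject the request if it would push usage past the cap.
        if self.used_bytes + size_bytes > self.limit_bytes:
            raise MemoryError(
                f"requested {size_bytes} bytes, "
                f"{self.limit_bytes - self.used_bytes} bytes left under the cap"
            )
        self.used_bytes += size_bytes


if __name__ == "__main__":
    unlimited = CappedPool()          # behaves as if the check were disabled
    unlimited.allocate(300 * MIB)
    limited = CappedPool(288 * MIB)   # e.g. 0.75 * 384 MiB of overhead
    try:
        limited.allocate(300 * MIB)
    except MemoryError as err:
        print("rejected:", err)
```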

@github-actions github-actions bot added CORE works for Gluten Core VELOX labels Aug 23, 2024
@github-actions

#6960

@github-actions

Run Gluten Clickhouse CI


zhztheplayer commented Aug 26, 2024

The CH failure doesn't seem to be related. cc @zzcclp

https://opencicd.kyligence.com/job/gluten/job/gluten-ci/11749

@zhztheplayer

Run Gluten Clickhouse CI
