Columnar shuffle uses wrong memory allocator in unified memory mode #1438

@andygrove

Describe the bug

I am using unified memory management:

    --conf spark.memory.offHeap.enabled=true \
    --conf spark.memory.offHeap.size=2g \

I deliberately allocated a small amount of memory for testing purposes. However, the stack trace below shows that Comet is using CometTestShuffleMemoryAllocator, which should not happen in unified memory mode.

    Caused by: org.apache.spark.memory.SparkOutOfMemoryError: Unable to acquire 16777216 bytes of memory, got 2588672 bytes. Available: 2588672
        at org.apache.spark.shuffle.comet.CometTestShuffleMemoryAllocator.allocateMemoryBlock(CometTestShuffleMemoryAllocator.java:129)
        at org.apache.spark.shuffle.comet.CometTestShuffleMemoryAllocator.allocate(CometTestShuffleMemoryAllocator.java:116)
        at org.apache.spark.sql.comet.execution.shuffle.SpillWriter.initialCurrentPage(SpillWriter.java:158)
        at org.apache.spark.shuffle.sort.CometShuffleExternalSorter.insertRecord(CometShuffleExternalSorter.java:374)
        at org.apache.spark.sql.comet.execution.shuffle.CometUnsafeShuffleWriter.insertRecordIntoSorter(CometUnsafeShuffleWriter.java:278)
        at org.apache.spark.sql.comet.execution.shuffle.CometUnsafeShuffleWriter.write(CometUnsafeShuffleWriter.java:206)

The allocator is created by the following code in CometShuffleMemoryAllocatorTrait:

    boolean useUnifiedMemAllocator =
        (boolean)
            CometConf$.MODULE$.COMET_COLUMNAR_SHUFFLE_UNIFIED_MEMORY_ALLOCATOR_IN_TEST().get();

    if (!useUnifiedMemAllocator) {
      synchronized (CometShuffleMemoryAllocator.class) {
        if (INSTANCE == null) {
          // CometTestShuffleMemoryAllocator handles pages by itself so it can be a singleton.
          INSTANCE = new CometTestShuffleMemoryAllocator(conf, taskMemoryManager, pageSize);
        }
      }
      return INSTANCE;

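It seems that the wrong config is being checked here: the choice of allocator depends only on the test-only setting COMET_COLUMNAR_SHUFFLE_UNIFIED_MEMORY_ALLOCATOR_IN_TEST, so CometTestShuffleMemoryAllocator can be selected outside of tests.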
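To illustrate the kind of check I would expect, here is a minimal, self-contained sketch. It does not use the real Comet or Spark classes: `AllocatorSelection`, `AllocatorKind`, `select`, and both boolean parameters are hypothetical names made up for this example, and the logic is only one possible way to gate the test allocator, not necessarily how it should be fixed in Comet.

```java
public class AllocatorSelection {

    // Hypothetical stand-ins for the two allocator implementations.
    enum AllocatorKind { UNIFIED, TEST_ONLY }

    /**
     * Picks an allocator kind.
     *
     * @param isTesting        assumption: true only when running under a test harness
     * @param useUnifiedInTest assumption: mirrors the test-only config value read in the
     *                         snippet above
     */
    static AllocatorKind select(boolean isTesting, boolean useUnifiedInTest) {
        // The test-only allocator is eligible only when we are actually testing
        // AND the test has not asked for the unified allocator.
        if (isTesting && !useUnifiedInTest) {
            return AllocatorKind.TEST_ONLY;
        }
        // Every non-test run uses the unified allocator, regardless of the
        // value of the test-only flag.
        return AllocatorKind.UNIFIED;
    }

    public static void main(String[] args) {
        // Non-test job with the test-only flag left at false.
        System.out.println(select(false, false));
        // Test run that did not opt in to the unified allocator.
        System.out.println(select(true, false));
        // Test run that opted in to the unified allocator.
        System.out.println(select(true, true));
    }
}
```

With a check shaped like this, a non-test job such as the one in this report would never reach the test allocator, whatever the test-only flag is set to.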

Steps to reproduce

No response

Expected behavior

No response

Additional context

No response

Labels

bug (Something isn't working)
