Unbounded memory usage in tensordot #6916

@tomwhite

Description

What happened:

Following #6846, the memory usage of tensordot (and dot, which delegates to tensordot in Dask) is much higher than before, and grows with the size of the array.

What you expected to happen:

The memory usage should scale with the chunk size (as it did previously), not with the size of the array. This was achieved previously by avoiding concatenate=True in the call to blockwise from tensordot. As a general rule, concatenate=True should be avoided, since it causes these memory issues for very large inputs.
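The difference between the two strategies can be sketched in plain NumPy (a hypothetical illustration of the idea, not Dask's actual blockwise machinery): with concatenate=True, the chunks along the contracted axis are materialized into one large array before a single tensordot call, so peak memory scales with the array; with concatenate=False, each pair of blocks is contracted separately and the partial results are summed, so peak memory scales with a chunk.

```python
import numpy as np

# Contract a (4, 6) x with a (6, 5) y along the shared axis, with that
# axis split into three chunks of size 2.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 6))
y = rng.standard_normal((6, 5))

x_chunks = np.split(x, 3, axis=1)  # three (4, 2) blocks
y_chunks = np.split(y, 3, axis=0)  # three (2, 5) blocks

# concatenate=True style: rebuild the full contracted axis, then call
# tensordot once. The concatenated temporaries are array-sized.
full = np.tensordot(np.concatenate(x_chunks, axis=1),
                    np.concatenate(y_chunks, axis=0), axes=1)

# concatenate=False style: contract each pair of blocks and sum the
# partial products. Every temporary here is chunk-sized.
partial = sum(np.tensordot(xc, yc, axes=1)
              for xc, yc in zip(x_chunks, y_chunks))

assert np.allclose(full, partial)
```

Both paths give the same (4, 5) result; the per-block path is what lets a scheduler keep only a few chunk-sized pieces in memory at a time.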

Minimal Complete Verifiable Example:

The "Multiplication Only" part of this notebook, with X.T @ Y replaced by da.dot(X.T, Y), demonstrates the problem. The memory usage should be flat, not growing with the size of the array.

Anything else we need to know?:

#6874 is a related issue that aims to remove the same memory issue from matmul.

Environment:

  • Dask version: latest head (unreleased)
  • Python version: 3.7.6
  • Operating System: MacOS
  • Install method (conda, pip, source): source
