Flush lineage cache after a node failure #4958

@stephanie-wang

Description

Describe the problem

As of #4942, a task is flushed to the GCS by the raylet where the task was submitted, not by the raylet where the task is executed. This means that if a node dies before it can flush a task, it's possible that no one will ever flush that task. The unflushed entries are then never evicted, which can lead to a memory leak in the lineage cache.

The proposed solution is to have every node flush its local lineage cache whenever any node fails.
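A minimal sketch of the proposed behavior, to make the idea concrete. This is not the raylet's actual code: `FakeGcs`, `LineageCache`, `Raylet`, `flush_all`, and `on_node_removed` are all hypothetical names, and the flush is modeled as a synchronous write to an in-memory table, whereas a real flush would presumably be asynchronous.

```python
class FakeGcs:
    """Hypothetical in-memory stand-in for the GCS task table."""

    def __init__(self):
        self.tasks = {}

    def add_task(self, task_id, spec):
        self.tasks[task_id] = spec


class LineageCache:
    """Hypothetical cache of tasks that have not yet been flushed."""

    def __init__(self, gcs):
        self.gcs = gcs
        self.uncommitted = {}

    def add_task(self, task_id, spec):
        self.uncommitted[task_id] = spec

    def flush_all(self):
        # Write every cached task to the GCS and drop it from the cache,
        # so that no entry can linger waiting for another node to flush it.
        flushed = list(self.uncommitted)
        for task_id in flushed:
            self.gcs.add_task(task_id, self.uncommitted.pop(task_id))
        return flushed


class Raylet:
    """Hypothetical node manager holding one local lineage cache."""

    def __init__(self, node_id, gcs):
        self.node_id = node_id
        self.lineage_cache = LineageCache(gcs)

    def on_node_removed(self, dead_node_id):
        # Assumed callback, invoked on every surviving node when any
        # node is reported dead.
        if dead_node_id == self.node_id:
            return []
        return self.lineage_cache.flush_all()
```

With this shape, each surviving raylet empties its own cache as soon as it learns of a failure, regardless of which node was originally responsible for flushing a given task.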
