Examine the effects of MADV_RANDOM when MGLRU is enabled in Linux kernel #14408

@ChrisHegarty

With the relatively recent capability to call madvise in Lucene, we've started to use MADV_RANDOM in several places where it makes conceptual sense, e.g. for accessing vector data when navigating the graph. The memory access there is truly random, but we've seen several reports of performance regressions that appear to result from this.

Of particular concern is the interaction of MADV_RANDOM with Multi-Gen LRU (MGLRU) [1]. From my reading of the code, and someone please correct me if I'm wrong, the semantics of MADV_RANDOM have changed in the kernel with MGLRU, specifically after torvalds/linux@8788f67, and pages are now reclaimed more eagerly, even when there is no memory pressure.
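When triaging a report, it helps to know whether MGLRU is active on the affected host. The sketch below is one way to check from Java, assuming the sysfs file described in [1] (`/sys/kernel/mm/lru_gen/enabled`) reports a hexadecimal bitmask whose lowest bit is the main switch. The class and method names (`MglruProbe`, `mainSwitchOn`, `isEnabled`) are hypothetical and not part of Lucene.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: reports whether the MGLRU main switch looks enabled. */
final class MglruProbe {
  // Path taken from the kernel admin guide referenced as [1].
  static final Path ENABLED_FILE = Path.of("/sys/kernel/mm/lru_gen/enabled");

  /** Parses a value such as "0x0007" and tests bit 0x0001 (assumed to be the main switch). */
  static boolean mainSwitchOn(String raw) {
    try {
      return (Long.decode(raw.trim()) & 0x0001L) != 0;
    } catch (NumberFormatException e) {
      // Unexpected content: treat as "not enabled" rather than guessing.
      return false;
    }
  }

  /** Returns false when the file is missing or unreadable (non-Linux, or kernel built without MGLRU). */
  static boolean isEnabled() {
    try {
      return Files.isReadable(ENABLED_FILE) && mainSwitchOn(Files.readString(ENABLED_FILE));
    } catch (IOException e) {
      return false;
    }
  }
}
```

Calling `MglruProbe.isEnabled()` alongside the kernel version in a bug report would make it easier to correlate regressions with MGLRU.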

This Elasticsearch issue has more of the lower-level details: elastic/elasticsearch#124499.

This issue may also have some connection, #14281.

I opened this issue to facilitate a discussion and, hopefully, converge on a direction that mitigates the risk of performance regressions. For example, one possible mitigation would be to expose the ReadAdvice that will be used as part of the API, so that callers have more fine-grained control over whether or not to use MADV_RANDOM.
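To make the mitigation idea concrete, here is a minimal sketch of what caller-controlled advice could look like. It is purely illustrative and not a proposal for the actual Lucene API: `Advice` is a local stand-in for Lucene's ReadAdvice so the example compiles on its own, and `AccessPattern` and `AdvicePolicy` are hypothetical names.

```java
import java.util.EnumMap;
import java.util.Map;

/** Local stand-in for Lucene's ReadAdvice (assumption: declared here only for self-containment). */
enum Advice { NORMAL, RANDOM, SEQUENTIAL }

/** Hypothetical access patterns a caller might declare when opening a file. */
enum AccessPattern { VECTOR_GRAPH, POSTINGS_SCAN, DEFAULT }

/** Hypothetical policy: built-in defaults per access pattern, with caller overrides. */
final class AdvicePolicy {
  private final Map<AccessPattern, Advice> overrides = new EnumMap<>(AccessPattern.class);

  /** Default mapping: random advice only where access is conceptually random. */
  static Advice defaultFor(AccessPattern pattern) {
    switch (pattern) {
      case VECTOR_GRAPH:
        return Advice.RANDOM;
      case POSTINGS_SCAN:
        return Advice.SEQUENTIAL;
      default:
        return Advice.NORMAL;
    }
  }

  /** Lets the caller replace the default for one pattern, e.g. to avoid MADV_RANDOM. */
  AdvicePolicy withOverride(AccessPattern pattern, Advice advice) {
    overrides.put(pattern, advice);
    return this;
  }

  /** Returns the caller's override if present, otherwise the built-in default. */
  Advice resolve(AccessPattern pattern) {
    return overrides.getOrDefault(pattern, defaultFor(pattern));
  }
}
```

With something like this, a deployment that observes the regression could call `new AdvicePolicy().withOverride(AccessPattern.VECTOR_GRAPH, Advice.NORMAL)` and leave every other default untouched.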

[1] https://docs.kernel.org/admin-guide/mm/multigen_lru.html
