Examine the effects of MADV_RANDOM when MGLRU is enabled in the Linux kernel #14408

With the relatively recent capability to call `madvise` in Lucene, we've started to use `MADV_RANDOM` in several places where it makes conceptual sense, e.g. for accessing vector data when navigating the graph. The memory access there is truly random, but we've seen several reports of performance regressions that appear to result from this advice.

Of particular concern is the interaction of `MADV_RANDOM` with Multi-Gen LRU [1]. From my reading of the code (and someone please correct me), the semantics of `MADV_RANDOM` have changed in the kernel with MGLRU, so that pages are reclaimed more eagerly, even when there is no memory pressure. Specifically, this appears to be the case after https://github.com/torvalds/linux/commit/8788f6781486769d9598dcaedc3fe0eb12fc3e59.
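To tell whether a given host is affected at all, it helps to first check whether MGLRU is switched on. The following is a minimal sketch, assuming the sysfs file `/sys/kernel/mm/lru_gen/enabled` described in the kernel admin guide [1], which reports a hex bitmask whose lowest bit is the main switch. The class and method names (`MglruStatus`, `mainSwitchOn`, `isEnabled`) are made up for illustration and are not part of Lucene.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;

// Hypothetical helper, not a Lucene API.
final class MglruStatus {
  // Location taken from the kernel's multigen_lru admin guide.
  static final Path ENABLED_FILE = Path.of("/sys/kernel/mm/lru_gen/enabled");

  /** Parses the bitmask text (e.g. "0x0007") and reports whether bit 0, the main switch, is set. */
  static boolean mainSwitchOn(String raw) {
    String s = raw.trim().toLowerCase(Locale.ROOT);
    if (s.startsWith("0x")) {
      s = s.substring(2);
    }
    return (Integer.parseInt(s, 16) & 0x1) != 0;
  }

  /** Returns false when the file is missing or unreadable, e.g. on kernels built without MGLRU. */
  static boolean isEnabled() {
    try {
      return mainSwitchOn(Files.readString(ENABLED_FILE));
    } catch (IOException | NumberFormatException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("MGLRU main switch on: " + isEnabled());
  }
}
```

Treating an unreadable or unparsable file as "not enabled" is a simplification chosen for this sketch; a real diagnostic would probably want to report that case separately.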

This Elasticsearch issue has more of the lower-level details: https://github.com/elastic/elasticsearch/issues/124499.

This issue may also be related: https://github.com/apache/lucene/issues/14281.

I opened this issue to facilitate a discussion and hopefully converge on a direction for mitigating these performance regressions. For example, one possible mitigation would be to expose the `ReadAdvice` that will be used as part of the API, so that callers have more fine-grained control over whether or not to use `MADV_RANDOM`.
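As a rough illustration of what such caller control could look like, here is a self-contained sketch. Everything in it is hypothetical: the `Advice` enum stands in for Lucene's `ReadAdvice`, the `.vec` check is only a placeholder for "a file where the library would default to random access", and `resolve` is not an existing Lucene method. The idea is simply that a caller-supplied function gets the first say and the library default applies only when the caller has no opinion.

```java
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch of a caller-supplied override; none of these names are Lucene APIs.
final class ReadAdviceOverrideSketch {
  // Stand-in for the advice values that would be passed down to madvise.
  enum Advice { NORMAL, RANDOM, SEQUENTIAL }

  /** Placeholder library default: random access for vector data, normal otherwise. */
  static Advice defaultAdvice(String fileName) {
    return fileName.endsWith(".vec") ? Advice.RANDOM : Advice.NORMAL;
  }

  /** The caller's override wins when present; otherwise fall back to the default. */
  static Advice resolve(String fileName, Function<String, Optional<Advice>> override) {
    return override.apply(fileName).orElseGet(() -> defaultAdvice(fileName));
  }
}
```

A caller worried about MGLRU could then pass `name -> Optional.of(Advice.NORMAL)` to avoid `MADV_RANDOM` everywhere, or `name -> Optional.empty()` to keep the defaults.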

[1] https://docs.kernel.org/admin-guide/mm/multigen_lru.html




