[ML] Refactor AutodetectMemoryLimitIT to use memory limit constants and improve model size assertions #135526
Conversation
- Removed muted tests for `testManyDistinctOverFields` and `testTooManyByAndOverFields`.
- Introduced constants for memory limits in `AutodetectMemoryLimitIT`.
- Updated assertions to check effective model size against calculated limits.
Pinging @elastic/ml-core (Team:ML)
jan-elastic left a comment:

Generally looks good. Some small comments.
- Changed variable names from `memoryLimit` to `memoryLimitMb` for clarity.
- Updated memory limit assertions to reflect the new variable naming.
- Ensured consistency in memory limit usage across multiple test cases.
 */
public class AutodetectMemoryLimitIT extends MlNativeAutodetectIntegTestCase {

    private static final long PROCESS_OVERHEAD_BYTES = ByteSizeValue.ofMb(20).getBytes();
Where is this value coming from?
This value was estimated empirically. I used a small, simple anomaly detection job with a trivial model to establish a lower bound on the process memory usage.
Would it be helpful to add a comment explaining that? If there's a risk of that value changing at some point in the future, it would be good to know what needs to be done to recalculate and update it.
On second thought, checking whether the autodetect process has a specific memory overhead makes these tests unnecessarily brittle. I removed that check and only kept the check for hard_limit. In ml-cpp we have unit tests to ensure there are no memory leaks.
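The trade-off described above can be sketched in plain Java. All names below are illustrative, not taken from the PR: the point is that asserting a fixed byte overhead bakes an empirical measurement into the test, while asserting only the memory status is platform-independent.

```java
// Minimal sketch of the robustness trade-off (illustrative names only).
public class HardLimitCheckSketch {
    enum MemoryStatus { OK, SOFT_LIMIT, HARD_LIMIT }

    // Removed-style check: compares against an empirically estimated constant
    // (~20 MB of native process overhead) and breaks if that value ever shifts.
    static final long PROCESS_OVERHEAD_BYTES = 20L * 1024 * 1024;

    static boolean exceedsOverhead(long processBytes, long modelBytes) {
        return processBytes - modelBytes >= PROCESS_OVERHEAD_BYTES;
    }

    // Kept check: only asserts that the job actually hit its memory limit.
    static boolean hitHardLimit(MemoryStatus status) {
        return status == MemoryStatus.HARD_LIMIT;
    }

    public static void main(String[] args) {
        System.out.println(hitHardLimit(MemoryStatus.HARD_LIMIT)); // prints "true"
    }
}
```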
assertThat(modelSizeStats.getModelBytes(), lessThan(120500000L));
assertThat(modelSizeStats.getModelBytes(), greaterThan(70000000L));
assertThat(getEffectiveModelSize(modelSizeStats.getModelBytes()), lessThan(ByteSizeValue.ofMb(memoryLimitMb).getBytes() * 1.05));
assertThat(modelSizeStats.getMemoryStatus(), equalTo(ModelSizeStats.MemoryStatus.HARD_LIMIT));
The reporting of memory usage differs across platforms: on Linux and Windows we report actual process memory usage, while on macOS this information is not available, so we report estimated memory usage. This makes testing for a specific memory limit brittle. To improve robustness, I only check that the job is in the hard_limit state.
assertThat(modelSizeStats.getModelBytes(), lessThan(72000000L));
assertThat(modelSizeStats.getModelBytes(), greaterThan(24000000L));
assertThat(getEffectiveModelSize(modelSizeStats.getModelBytes()), lessThan(ByteSizeValue.ofMb(memoryLimitMb).getBytes() * 1.05));
assertThat(modelSizeStats.getMemoryStatus(), equalTo(ModelSizeStats.MemoryStatus.HARD_LIMIT));
Only check whether the job is in the hard_limit state, to increase test robustness across platforms.
assertThat(modelSizeStats.getModelBytes(), greaterThan(24000000L));

assertThat(getEffectiveModelSize(modelSizeStats.getModelBytes()), lessThan(ByteSizeValue.ofMb(memoryLimitMb).getBytes() * 1.05));
assertThat(
Only check for hard_limit to increase test robustness.
assertThat(modelSizeStats.getModelBytes(), lessThan(45000000L));
assertThat(modelSizeStats.getModelBytes(), greaterThan(25000000L));
assertThat(getEffectiveModelSize(modelSizeStats.getModelBytes()), lessThan(ByteSizeValue.ofMb(memoryLimitMb).getBytes() * 1.05));
assertThat(modelSizeStats.getMemoryStatus(), equalTo(ModelSizeStats.MemoryStatus.HARD_LIMIT));
Only check hard_limit for robustness.
Previously, the upper bound for model memory checks was set in absolute terms, which is hard to understand and brittle. I adjusted the assertions to ensure that memory usage does not exceed the memory limit by more than 5%. Additionally, on Linux we now report the process size (see #131981), which includes approximately 20 MB of native code overhead. I made handling this overhead more explicit.
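The arithmetic behind the new assertion can be sketched as follows. The helper name `getEffectiveModelSize` and the ~20 MB overhead figure come from the PR discussion; the standalone implementation below is an assumption for illustration and may differ from the actual test code:

```java
// Illustrative sketch of the relative memory-limit check (not the actual test code).
public class EffectiveModelSizeSketch {
    // On Linux the reported bytes include the whole process size, with
    // roughly 20 MB of native-code overhead (per the PR discussion).
    static final long NATIVE_OVERHEAD_BYTES = 20L * 1024 * 1024;

    // Subtract the overhead so the result is comparable to the model memory limit.
    static long getEffectiveModelSize(long reportedModelBytes) {
        return reportedModelBytes - NATIVE_OVERHEAD_BYTES;
    }

    // Relative upper bound: effective size must stay within 5% above the limit.
    static boolean withinLimit(long reportedModelBytes, long memoryLimitMb) {
        double allowedBytes = memoryLimitMb * 1024.0 * 1024.0 * 1.05;
        return getEffectiveModelSize(reportedModelBytes) < allowedBytes;
    }

    public static void main(String[] args) {
        // A 30 MB limit: a reported 50 MB process maps to a 30 MB effective
        // model size, which is below the 31.5 MB allowance.
        System.out.println(withinLimit(50L * 1024 * 1024, 30)); // prints "true"
    }
}
```

The relative bound survives changes to the configured limit automatically, whereas the old absolute byte counts had to be re-measured whenever the limit or the native process changed.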
More details: unmutes testManyDistinctOverFields and testTooManyByAndOverFields in AutodetectMemoryLimitIT.

Closes #132308
Closes #132310
Closes #132611