[ML] Use perAllocation and perDeployment memory usage in the model assignment planner #98874
Merged
valeriy42 merged 42 commits into elastic:main on Nov 6, 2023
Collaborator
Hi @valeriy42, I've created a changelog YAML for you.
Collaborator
Pinging @elastic/ml-core (Team:ML)
@valeriy42 this is failing CI because of this: You can fix that by using
Contributor
Author
@elasticmachine update branch
valeriy42 added a commit to elastic/eland that referenced this pull request on Nov 6, 2023
This PR adds the ability to estimate the per-deployment and per-allocation memory usage of NLP transformer models. It uses torch.profiler to log the peak memory usage during inference. Elasticsearch then uses this information to provision models with sufficient memory (elastic/elasticsearch#98874).
valeriy42 added a commit to valeriy42/elasticsearch that referenced this pull request on Nov 7, 2023
…signment planner (elastic#98874) Building upon elastic#98139, this PR extends the model assignment planning algorithms and the linear solver to use the extended memory fields. It also adds unit tests to verify the new behavior. I needed to adjust the old unit tests since we use the estimateMemoryUsage routine, which computes 2*memoryBytes + 240 MB as the memory requirement. Previously, the unit tests simply used the memoryBytes field value.
valeriy42 added a commit that referenced this pull request on Nov 7, 2023
…in the model assignment planner" (#101853) The original PR #98874 missed the memory overhead adjustment from #86416. As it caused some BWC test failures on the CI, I reverted it in #101834. This PR reintegrates the functionality and extends the BWC integration test with the memory constant depending on the version of the old cluster.
droberts195 added a commit to droberts195/elasticsearch that referenced this pull request on Dec 11, 2023
…model assignment planner" This reverts commit 31ca2f7. The functionality of elastic#98874 is being removed from 8.12 because it means that models which were working successfully on 2GB nodes in 8.11 will no longer fit on 2GB nodes. This will be frustrating for trial users. Before 8.13 we need to do a more thorough assessment of which models will and won't fit on 2GB nodes as a result of better memory estimation. It may be possible to tweak the memory usage estimation so that we require more memory than 8.11 but not so much more that our recommended trial models no longer fit onto 2GB nodes.
Building upon #98139, this PR extends the model assignment planning algorithms and the linear solver to use the extended memory fields. It also adds unit tests to verify the new behavior.
I needed to adjust the old unit tests since we use the `estimateMemoryUsage` routine, which computes `2*memoryBytes + 240 MB` as the memory requirement. Previously, the unit tests simply used the `memoryBytes` field value.
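
To make the arithmetic concrete, here is a minimal sketch of the legacy estimate described above. It is not the actual Elasticsearch implementation: the class name `LegacyMemoryEstimate` and the method `estimateBytes` are hypothetical, and the sketch assumes "240 MB" means 240 * 1024 * 1024 bytes.

```java
// Hypothetical illustration of the "2*memoryBytes + 240 MB" estimate.
// Names and the MB-to-bytes conversion are assumptions, not Elasticsearch code.
public final class LegacyMemoryEstimate {

    // Assumption: 1 MB is treated as 1024 * 1024 bytes.
    private static final long BYTES_PER_MB = 1024L * 1024L;

    // Fixed overhead added on top of the model-size term.
    public static final long OVERHEAD_BYTES = 240L * BYTES_PER_MB;

    private LegacyMemoryEstimate() {}

    // Returns 2 * memoryBytes + 240 MB, failing loudly on overflow.
    public static long estimateBytes(long memoryBytes) {
        if (memoryBytes < 0) {
            throw new IllegalArgumentException("memoryBytes must be non-negative");
        }
        return Math.addExact(Math.multiplyExact(2L, memoryBytes), OVERHEAD_BYTES);
    }

    public static void main(String[] args) {
        long modelBytes = 100L * BYTES_PER_MB;
        // A 100 MB model yields 2 * 100 + 240 = 440 MB.
        System.out.println(estimateBytes(modelBytes) / BYTES_PER_MB + " MB");
    }
}
```

Under these assumptions, a unit test that previously budgeted 100 MB for a model by using `memoryBytes` directly would need to budget 440 MB once the planner goes through the estimate, which is why the old test fixtures had to be adjusted.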