[NPU] Add support for compiled model release memory #35084
Merged
pereanub merged 3 commits into openvinotoolkit:master on Apr 3, 2026
Signed-off-by: Bogdan Pereanu <bogdan.pereanu@intel.com>
sbutnari approved these changes on Apr 1, 2026
Pull request overview
Adds NPU plugin support for CompiledModel::release_memory() by evicting Level Zero graph memory, plus a basic functional test that validates inference can run again after eviction.
Changes:
- Implement `CompiledModel::release_memory()` in the NPU plugin and route it to graph-level eviction.
- Add `evict_memory()` plumbing through `IGraph` → `Graph` → `ZeGraphExtWrappers` and call Level Zero `pfnEvict`.
- Add a functional behavior test that runs inference, calls `release_memory()`, then runs inference again.
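The call chain in the list above can be sketched with stand-in classes. This is a minimal illustration and not the plugin's code: every type with a `Sketch` suffix is hypothetical, the integer `run()` method only imitates inference, and the step that makes memory resident again on the next run is an assumption about driver behavior that the PR text does not confirm.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for ZeGraphExtWrappers. The real wrapper calls the
// Level Zero pfnEvict entry point; here we only record that eviction happened.
struct ZeWrapperSketch {
    bool resident = true;
    int evictions = 0;

    void evict_memory() {
        resident = false;
        ++evictions;
    }
};

// Hypothetical stand-in for IGraph: the default evict_memory() throws,
// mirroring the "currently throwing" default described in the review.
struct IGraphSketch {
    virtual ~IGraphSketch() = default;

    virtual void evict_memory() {
        throw std::runtime_error("evict_memory is not implemented for this graph");
    }

    virtual int run(int input) = 0;
};

// Hypothetical stand-in for Graph: forwards eviction to the wrapper.
class GraphSketch : public IGraphSketch {
public:
    explicit GraphSketch(std::shared_ptr<ZeWrapperSketch> wrapper) : wrapper_(std::move(wrapper)) {
        assert(wrapper_ != nullptr);
    }

    void evict_memory() override {
        wrapper_->evict_memory();
    }

    int run(int input) override {
        // Assumption: running again makes the graph memory resident again.
        wrapper_->resident = true;
        return input * 2;  // placeholder for real inference
    }

private:
    std::shared_ptr<ZeWrapperSketch> wrapper_;
};

// Hypothetical stand-in for the NPU CompiledModel.
class CompiledModelSketch {
public:
    explicit CompiledModelSketch(std::shared_ptr<IGraphSketch> graph) : graph_(std::move(graph)) {
        assert(graph_ != nullptr);
    }

    // Routes release_memory() to graph-level eviction.
    void release_memory() {
        graph_->evict_memory();
    }

    int infer(int input) {
        return graph_->run(input);
    }

private:
    std::shared_ptr<IGraphSketch> graph_;
};
```

Calling `infer`, then `release_memory`, then `infer` again on a `CompiledModelSketch` follows the same sequence as the functional test described in the last bullet. A graph type that does not override `evict_memory()` would throw from `release_memory()` in this sketch.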
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/plugins/intel_npu/tests/functional/behavior/ov_infer_request/compile_and_infer.hpp | Adds a regression-style test for release_memory() and a disabled memory-usage check test. |
| src/plugins/intel_npu/src/plugin/src/compiled_model.cpp | Implements CompiledModel::release_memory() by evicting graph memory. |
| src/plugins/intel_npu/src/plugin/include/compiled_model.hpp | Declares release_memory() override on the NPU compiled model. |
| src/plugins/intel_npu/src/compiler_adapter/src/ze_graph_ext_wrappers.cpp | Adds Level Zero graph eviction wrapper calling pfnEvict. |
| src/plugins/intel_npu/src/compiler_adapter/src/graph.cpp | Adds Graph::evict_memory() forwarding to the ZE wrapper. |
| src/plugins/intel_npu/src/compiler_adapter/include/ze_graph_ext_wrappers.hpp | Declares ZeGraphExtWrappers::evict_memory(...). |
| src/plugins/intel_npu/src/compiler_adapter/include/graph.hpp | Declares Graph::evict_memory() override. |
| src/plugins/intel_npu/src/common/src/igraph.cpp | Adds default IGraph::evict_memory() implementation (currently throwing). |
| src/plugins/intel_npu/src/common/include/intel_npu/common/igraph.hpp | Declares IGraph::evict_memory() in the common interface. |
Signed-off-by: Bogdan Pereanu <bogdan.pereanu@intel.com>
PatrikStepan approved these changes on Apr 2, 2026