@gundaarx
Description: Changes for Azure pipelines

@gundaarx gundaarx requested a review from suryasidd April 16, 2020 19:26
@suryasidd suryasidd requested a review from smkarlap April 17, 2020 18:30
@suryasidd suryasidd left a comment

LGTM

@smkarlap smkarlap merged commit d19f34c into openvino-ep-v2 Apr 17, 2020
@suryasidd suryasidd deleted the aravind/docker_file_changes branch May 26, 2020 19:18
MaajidKhan pushed a commit that referenced this pull request Mar 9, 2022
The ARM A55 micro-architecture (with dot product instructions), like the A53, is widely used for the little cores in big.LITTLE configurations. The A55 has narrower memory load/store hardware: a 128-bit load instruction blocks the pipeline for 2 whole cycles, during which no other instruction can execute. A 64-bit load instruction, by contrast, can be dual-issued with many other instructions.

This change adds a Symmetric QGEMM kernel for a55 micro-architecture, where we replace

ldr q4,[x1],#16

with

ldr d4,[x1],#8
ldr x11,[x1],#8
ins v4.d[1],x11

so that we can try to hide the memory load cycles behind computing cycles in the kernel.
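
As an illustration only (not part of the kernel), the following Python sketch models why the three-instruction sequence is a drop-in replacement for the single 128-bit load: both leave the same value in v4 and advance x1 by 16. The function names `load_q` and `load_split` are hypothetical, and the model assumes little-endian memory; it says nothing about timing.

```python
MASK64 = (1 << 64) - 1


def load_q(mem, x1):
    """Model `ldr q4,[x1],#16`: one 128-bit load, then x1 += 16."""
    q4 = int.from_bytes(mem[x1:x1 + 16], "little")
    return q4, x1 + 16


def load_split(mem, x1):
    """Model the split sequence; returns the final (v4, x1)."""
    # ldr d4,[x1],#8 : load the low 64 bits of v4 (upper half is zeroed), x1 += 8
    v4 = int.from_bytes(mem[x1:x1 + 8], "little")
    x1 += 8
    # ldr x11,[x1],#8 : load the next 64 bits into a general-purpose register, x1 += 8
    x11 = int.from_bytes(mem[x1:x1 + 8], "little")
    x1 += 8
    # ins v4.d[1],x11 : insert x11 into the upper 64-bit lane of v4
    v4 = (v4 & MASK64) | (x11 << 64)
    return v4, x1


mem = bytes(range(32))
assert load_q(mem, 0) == load_split(mem, 0)
```

The benefit described above comes from scheduling rather than from the result: each 64-bit load can share an issue cycle with another instruction, whereas the 128-bit load cannot.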

Co-authored-by: Chen Fu <fuchen@microsoft.com>
lavanyax pushed a commit that referenced this pull request Mar 29, 2022
The ARM A55 micro-architecture (with dot product instructions), like the A53, is widely used for the little cores in big.LITTLE configurations. The A55 has narrower memory load/store hardware: a 128-bit load instruction blocks the pipeline for 2 whole cycles, during which no other instruction can execute. A 64-bit load instruction, by contrast, can be dual-issued with many other instructions.

This change adds a Symmetric QGEMM kernel for a55 micro-architecture, where we replace

ldr q4,[x1],#16

with

ldr d4,[x1],#8
ldr x11,[x1],#8
ins v4.d[1],x11

so that we can try to hide the memory load cycles behind computing cycles in the kernel.

Co-authored-by: Chen Fu <fuchen@microsoft.com>
RyanMetcalfeInt8 pushed a commit to RyanMetcalfeInt8/onnxruntime that referenced this pull request Jul 25, 2025
* Introduce new capability

* in-progress implementation of get capability

* Enable epctx native binary execution

* Add epctx plugin test

* Update to new abi APIs

* Implement new version API

* Enable onnx model compilation

* Fix linux build

* use compile_model directly

* missing cmake changes to fix windows

* fixup! Fix linux build

* Only match by pci device for GPU.

* update tests check for ep device metadata

* remove abi entry points from legacy dll

4 participants