Conversation
|
IIRC, |
|
|
Abduragim will add fastGemm with the next iteration. |
|
@dkurt @fengyuentau I want to merge the PR. fastGemm will be integrated in the next one to simplify performance comparison. Do you have any concerns? |
fengyuentau
left a comment
Also, what is the time cost on CI for these tests? Is it tolerable (< 1000 ms, for example)?
|
|
On my old PC without AVX2: 17 tests from 1 test case ran. (15615 ms total) |
|
@Abdurrahheem Could you collect the perf results from detail pages and fill your table in the first comment? |
|
@fengyuentau I propose to rerun the benchmark locally and update the PR. CI runs perf tests with a single iteration and concurrently with other builds, so the numbers are not reliable. |
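To illustrate why a single CI iteration is noisy, here is a minimal sketch of a multi-iteration timing loop that reports mean ± standard deviation, the same form as the numbers in the PR table. It is not the OpenCV perf framework: `matmul` and `benchmark` are hypothetical helpers, and the pure-Python matrix product is only a stand-in for the "ij, jk -> ik" workload.

```python
import statistics
import time


def matmul(a, b):
    """Pure-Python stand-in for the "ij, jk -> ik" workload (hypothetical helper)."""
    cols_b = list(zip(*b))  # columns of b, so each output cell is a row-by-column dot product
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def benchmark(fn, *args, warmup=3, iterations=20):
    """Return (mean_ms, stdev_ms) over `iterations` timed calls, after `warmup` untimed calls."""
    for _ in range(warmup):
        fn(*args)
    samples_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn(*args)
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples_ms), statistics.stdev(samples_ms)


a = [[float(i + j) for j in range(30)] for i in range(20)]  # shape [20, 30]
b = [[float(i - j) for j in range(20)] for i in range(30)]  # shape [30, 20]
mean_ms, stdev_ms = benchmark(matmul, a, b)
print(f"ij, jk -> ik  [20, 30], [30, 20]: {mean_ms:.2f} ± {stdev_ms:.2f} ms")
```

Running the loop on an otherwise idle machine and comparing the spread against a run under load gives a rough sense of how much concurrent builds can distort a single sample.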
|
ARM64: ~3.5s
X64: ~1.8s
Win-X64: ~7.6s
I propose reducing the scale of the tests. |
|
@Abdurrahheem friendly reminder. |
I am currently only able to test locally on Ubuntu because I do not have access to the other platforms.
|
Updated the table with performance results. |
Einsum Layer Performance Test opencv#24445

## This PR adds performance tests for Einsum Layer.

See below results of performance test on different inputs

**Notation:**
- WX: windows10_x64
- MX: macos_x64
- MA: macos_arm64
- UX: ubuntu_x64
- UA: ubuntu_arm64

All data in ms (milliseconds).
Gemm is backend for matrix multiplication

---

Benchmarks:

| Equation                | Inputs Mat Dims                | UX (ms)      | UA (ms) | MX (ms) | MA (ms) | WX (ms) |
|-------------------------|--------------------------------|--------------|---------|---------|---------|---------|
| "ij, jk -> ik"          | [2, 3], [3, 2]                 | 0.04 ± 0.00  | -       | -       | -       | -       |
| "ij, jk -> ik"          | [20, 30], [30, 20]             | 0.08 ± 0.00  | -       | -       | -       | -       |
| "ij, jk -> ik"          | [113, 127], [127, 113]         | 2.41 ± 0.05  | -       | -       | -       | -       |
| "imkj, injs -> imnks"   | [1, 4, 7, 9], [1, 5, 9, 8]     | 0.11 ± 0.00  | -       | -       | -       | -       |
| "imkj, injs -> imnks"   | [1, 4, 70, 90], [1, 5, 90, 80] | 15.49 ± 0.46 | -       | -       | -       | -       |
| "imkj, injs -> imnks"   | [1, 4, 73, 91], [1, 5, 91, 57] | 11.53 ± 0.06 | -       | -       | -       | -       |
| "ij -> i"               | [30, 40]                       | 0.03 ± 0.00  | -       | -       | -       | -       |
| "ij -> i"               | [113, 374]                     | 0.13 ± 0.00  | -       | -       | -       | -       |
| "...ij -> ...i"         | [30, 40]                       | 0.03 ± 0.00  | -       | -       | -       | -       |
| "...ij -> ...i"         | [113, 374]                     | 0.13 ± 0.00  | -       | -       | -       | -       |
| "...ij, ...jk -> ...ik" | [40, 50], [50, 80]             | 0.37 ± 0.01  | -       | -       | -       | -       |
| "...ij, ...jk -> ...ik" | [47, 51], [51, 83]             | 0.43 ± 0.01  | -       | -       | -       | -       |

-----

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
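
For readers less familiar with the equation notation in the benchmark table, the following is a minimal sketch of how an explicit Einsum equation maps input shapes to an output shape. `einsum_output_shape` is a hypothetical helper written for illustration, not part of OpenCV, and it assumes equations without ellipsis (`...`), which it rejects with a `ValueError`.

```python
def einsum_output_shape(equation, shapes):
    """Compute the output shape of an explicit einsum equation (no "..." support)."""
    lhs, rhs = equation.replace(" ", "").split("->")
    terms = lhs.split(",")
    if len(terms) != len(shapes):
        raise ValueError("one shape per input term is required")
    sizes = {}  # index label -> dimension size
    for term, shape in zip(terms, shapes):
        if len(term) != len(shape):
            raise ValueError(f"term {term!r} does not match shape {shape}")
        for label, dim in zip(term, shape):
            # A label shared between inputs must have the same size everywhere.
            if sizes.setdefault(label, dim) != dim:
                raise ValueError(f"inconsistent size for index {label!r}")
    # Labels missing from the output (e.g. "j") are summed over.
    return [sizes[label] for label in rhs]


print(einsum_output_shape("ij, jk -> ik", [[113, 127], [127, 113]]))
print(einsum_output_shape("imkj, injs -> imnks", [[1, 4, 7, 9], [1, 5, 9, 8]]))
print(einsum_output_shape("ij -> i", [[30, 40]]))
```

In "imkj, injs -> imnks" the shared index `j` is contracted, so inputs of shape [1, 4, 7, 9] and [1, 5, 9, 8] give an output of shape [1, 4, 5, 7, 8].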