Add SIMD-enabled OpenBLAS package and benchmarks for Pyodide #5948
Description
🚀 Feature
Add a SIMD-enabled OpenBLAS build recipe and a benchmarking recipe that compares the SIMD and non-SIMD builds.
Motivation
The existing Pyodide build of OpenBLAS does not leverage WebAssembly SIMD extensions, which can significantly improve performance for matrix and linear algebra operations.
In #5855, we discussed adding SIMD verification tests. As a next step, I’ve prepared recipes that build OpenBLAS with SIMD enabled and benchmark its performance against the original scalar build.
This addition would help validate performance improvements and guide future SIMD optimization decisions across numerical packages (NumPy, SciPy, etc.).
Pitch
I propose to include two new recipes:
- `libopenblas-simd` - builds OpenBLAS with the `-msimd128` flag to enable WebAssembly SIMD.
- `test-openblas-simd` - runs `cblas_sdot` (vector dot product) and `cblas_sgemm` (matrix-matrix multiplication) benchmarks to compare the performance of the SIMD and non-SIMD builds.
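To make the comparison concrete, here is a minimal sketch of how the timings from the two builds could be turned into comparable numbers. It is not the code in the recipe: the helper names (`sdot_flops`, `sgemm_flops`, `best_time`, `gflops`, `speedup`) are hypothetical, the flop counts are the conventional approximations (2n for a dot product, 2mnk for a matrix product), and the pure-Python dot product at the bottom is only a stand-in for a real `cblas_sdot` call.

```python
import time
from typing import Callable


def sdot_flops(n: int) -> int:
    """Conventional flop count for a length-n dot product (n multiplies + n adds)."""
    return 2 * n


def sgemm_flops(m: int, n: int, k: int) -> int:
    """Conventional flop count for C = A @ B with A of shape (m, k) and B of shape (k, n)."""
    return 2 * m * n * k


def best_time(fn: Callable[[], object], repeats: int = 5) -> float:
    """Return the fastest wall-clock time in seconds over `repeats` calls of fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def gflops(flops: int, seconds: float) -> float:
    """Convert a flop count and a duration into GFLOP/s."""
    return flops / seconds / 1e9


def speedup(scalar_seconds: float, simd_seconds: float) -> float:
    """Ratio > 1 means the SIMD build was faster than the scalar build."""
    return scalar_seconds / simd_seconds


if __name__ == "__main__":
    # Stand-in workload: a pure-Python dot product, NOT a cblas_sdot call.
    n = 100_000
    x = [1.0] * n
    y = [2.0] * n
    seconds = best_time(lambda: sum(a * b for a, b in zip(x, y)))
    print(f"sdot stand-in, n={n}: {gflops(sdot_flops(n), seconds):.4f} GFLOP/s")
```

Taking the best of several runs rather than the mean reduces noise from warm-up and scheduling, which matters when the expected difference between builds is small.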
These recipes will help contributors quickly verify SIMD performance improvements without modifying the core build system.
Alternatives
The optimal long-term solution would be for OpenBLAS to provide native WebAssembly SIMD kernels. However, as discussed in `OpenBLAS #4023` (I'm not sure whether mentioning it directly is okay, so I'm leaving it in a code block), the current practical approach is to rely on compiler autovectorization with `-msimd128`, which is what this recipe does.
Additional context
The benchmark problem sizes were chosen arbitrarily, since I did not know which dimensions would be best for benchmarking. If you have any better ideas, please give me feedback!
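One possible way to make the sizes less arbitrary, offered only as a sketch, would be to sweep square `sgemm` sizes in powers of two and look at the working set of each, so the benchmark covers both small and large problems. The function names and the default 64 to 1024 range below are assumptions for illustration, not values used in the recipe.

```python
def sgemm_working_set_bytes(m: int, n: int, k: int, itemsize: int = 4) -> int:
    """Bytes touched by C = A @ B for float32 matrices: A (m x k), B (k x n), C (m x n)."""
    return itemsize * (m * k + k * n + m * n)


def square_sizes(min_n: int = 64, max_n: int = 1024) -> list:
    """Sizes from min_n up to max_n (inclusive), doubling at each step."""
    sizes = []
    n = min_n
    while n <= max_n:
        sizes.append(n)
        n *= 2
    return sizes


if __name__ == "__main__":
    for n in square_sizes():
        kib = sgemm_working_set_bytes(n, n, n) / 1024
        print(f"n={n:5d}  working set = {kib:10.1f} KiB")
```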
I additionally compared the -O2 and -O3 builds to evaluate potential performance differences and included the results in the benchmark.
As the next step, I plan to verify whether the current SIMD-built OpenBLAS is actually used in NumPy operations.