[CANN] weight format to nz for Ascend310P3#14407
Conversation
|
Please delete Trailing whitespace and rebase. |
|
Please paste MULMAT test case result and provide performance comparison data. |
|
Please remove Trailing whitespace. |
|
OP Test new_pool_for_device: device 0 use vmm pool |
* origin/master: (49 commits) ci : correct label refactor->refactoring (ggml-org#14832) CUDA: fix quantized KV cache + multiple sequences (ggml-org#14822) tests : add non-cont K,V FA tests memory : handle saving/loading null layers in recurrent memory (ggml-org#14675) ggml: fix loongarch quantize_row_q8_1 error (ggml-org#14827) CANN: weight format to NZ for Ascend310P3 (ggml-org#14407) CUDA: add fused rms norm (ggml-org#14800) ggml : model card yaml tab->2xspace (ggml-org#14819) vulkan: fix rms_norm_mul to handle broadcasting dim0 (ggml-org#14817) llama : add model type detection for rwkv7 7B&14B (ggml-org#14816) imatrix: add option to display importance score statistics for a given imatrix file (ggml-org#12718) Mtmd: add a way to select device for vision encoder (ggml-org#14236) cuda : implement bf16 cpy ops and enable bf16 cont (ggml-org#14763) opencl: remove unreachable `return` (ggml-org#14806) server : allow setting `--reverse-prompt` arg (ggml-org#14799) cuda: remove linking to cublasLt (ggml-org#14790) opencl: fix `im2col` when `KW!=KH` (ggml-org#14803) opencl: add conv2d kernel (ggml-org#14403) sycl: Fix im2col (ggml-org#14797) kleidiai: add support for get_rows (ggml-org#14676) ...
* weight format to nz for 310p * remove quant weight format to nz * clean code * fix * make the conditions for converting weights to NZ format consistent * clean code
* weight format to nz for 310p * remove quant weight format to nz * clean code * fix * make the conditions for converting weights to NZ format consistent * clean code
Why is this PR needed?
When loading mat mul weights on Ascend 310P, convert them into an Ascend-friendly format to improve performance
Performance Comparison
Model:Qwen2-7B-Instruct-FP16
with nz feature:
without nz feature: