Q4_2 quantization with rmse-optimized scale and quants by ikawrakow · Pull Request #1062 · ggml-org/llama.cpp

ikawrakow · 2023-04-19T15:15:55Z

For quantize-stats we get
q4_2: rmse 0.00159301, maxerr 0.17480469, 95pct<0.0030, median<0.0012

For 7B perplexity with BLAS enabled we get 6.2038 after 655 chunks.

Quantization is slow (~90 seconds on my Mac for 7B) because it is not multi-threaded, unlike in PR #896.

Green-Sky · 2023-04-19T18:11:34Z

ggml.c

    }
 }

+static inline int nearest_int(float fval) {


This `inline` does not do anything here. The `static` is all you need.

Hm, actually, after looking at cppreference, I am not sure that C and C++ behave the same here.
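
For context on the question above: in C99 and later, `static` gives the function internal linkage, and `inline` is then only an optimization hint. A plain `inline` without `static` is where C and C++ differ, because C additionally expects an external definition in some translation unit. The sketch below shows the `static inline` form. The body of `nearest_int` is not shown in the excerpt, so the name `nearest_int_sketch` and the rounding trick are assumptions for illustration only.

```c
#include <assert.h>
#include <string.h>

// Round to the nearest integer (ties to even) for |fval| < 2^22.
// Adding 1.5 * 2^23 moves the value into a range where a float has no
// fractional bits, so the rounded integer ends up in the mantissa.
static inline int nearest_int_sketch(float fval) {
    assert(fval > -4194304.0f && fval < 4194304.0f);
    float val = fval + 12582912.0f;    // 1.5 * 2^23
    int   bits;
    memcpy(&bits, &val, sizeof(bits)); // reinterpret the float's bit pattern
    return (bits & 0x007fffff) - 0x00400000;
}
```

This assumes a 32-bit `int`, IEEE 754 single precision and the default round-to-nearest mode. With `static`, each translation unit that includes the definition gets its own private copy, so no separate external definition is needed.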

ikawrakow requested a review from ggerganov April 19, 2023 15:15

ggerganov approved these changes Apr 19, 2023

ggml : satisfy the sanitizer builds

6d36a51

Not sure why this makes them fail

sw reviewed Apr 19, 2023

sw reviewed Apr 19, 2023

Iwan Kawrakow added 2 commits April 19, 2023 18:52

Better follow ggml conventions for function names

49beb2c

Fixed type as per reviewer comment

96d8443

Green-Sky reviewed Apr 19, 2023

ikawrakow merged commit f7d0509 into master Apr 19, 2023

ikawrakow deleted the quantize-q4-2-rmse branch April 19, 2023 18:20

ggerganov mentioned this pull request Apr 22, 2023

Use full range for q4_0 quantization #729

Merged

MarcioPais mentioned this pull request Apr 22, 2023

Investigate alternative approach for Q4 quantization #397

Closed

Bearsaerker mentioned this pull request Mar 12, 2025

Eval bug: Gemma 3 extremly slow prompt processing when using quantized kv cache. #12352

Closed

ikawrakow merged 4 commits into master from quantize-q4-2-rmse

4 participants