speculative : fix "ngram-map-k4v" name in logging #24253
Merged
This is a non-functional change. When using `--spec-type ngram-map-k4v`, the log messages at startup and at runtime say `ngram-map-k`. I added logic in the constructor of `common_speculative_impl_ngram_map_k` to pass the correct `COMMON_SPECULATIVE_TYPE_NGRAM_MAP_K4V` when `config.key_only` is `false`. After this change, the log messages use the correct name.
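The constructor logic described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual llama.cpp code: only `COMMON_SPECULATIVE_TYPE_NGRAM_MAP_K4V`, `config.key_only`, and the two log names come from this PR. The `COMMON_SPECULATIVE_TYPE_NGRAM_MAP_K` enumerator is assumed by analogy, and the struct layouts and helper functions are hypothetical stand-ins.

```cpp
#include <string>

// Assumed enum: only the K4V enumerator is named in this PR;
// the K enumerator is inferred by analogy.
enum common_speculative_type {
    COMMON_SPECULATIVE_TYPE_NGRAM_MAP_K,
    COMMON_SPECULATIVE_TYPE_NGRAM_MAP_K4V,
};

// Hypothetical config struct; only `key_only` is taken from the PR text.
struct ngram_map_config {
    bool key_only;
};

// Hypothetical helper: maps a type to the name printed in the logs.
static const char * speculative_type_name(common_speculative_type type) {
    switch (type) {
        case COMMON_SPECULATIVE_TYPE_NGRAM_MAP_K:   return "ngram-map-k";
        case COMMON_SPECULATIVE_TYPE_NGRAM_MAP_K4V: return "ngram-map-k4v";
    }
    return "unknown";
}

// Hypothetical helper: picks the type from the config instead of hard-coding K.
static common_speculative_type ngram_map_type(const ngram_map_config & config) {
    return config.key_only ? COMMON_SPECULATIVE_TYPE_NGRAM_MAP_K
                           : COMMON_SPECULATIVE_TYPE_NGRAM_MAP_K4V;
}

// Hypothetical base class that stores the type used for logging.
struct speculative_impl_base {
    common_speculative_type type;
    explicit speculative_impl_base(common_speculative_type t) : type(t) {}
};

// Sketch of the n-gram map implementation: the constructor forwards the
// type derived from the config, so key+value mode reports "ngram-map-k4v".
struct speculative_impl_ngram_map : speculative_impl_base {
    ngram_map_config config;

    explicit speculative_impl_ngram_map(const ngram_map_config & c)
        : speculative_impl_base(ngram_map_type(c)), config(c) {}

    std::string name() const { return speculative_type_name(type); }
};
```

In this sketch, constructing `speculative_impl_ngram_map` with `key_only = false` makes `name()` return `ngram-map-k4v`, while `key_only = true` still yields `ngram-map-k`. Only the reported name changes; the drafting behavior is untouched.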
Before:
./build/bin/llama-server [...] --spec-type ngram-map-k4v --spec-ngram-map-k4v-size-n 4 --spec-ngram-map-k4v-size-m 4 --spec-ngram-map-k4v-min-hits 1 --spec-draft-n-min 1 --spec-draft-n-max 4
I common_speculative_impl_ngram_map_k: adding speculative implementation 'ngram-map-k'
I common_speculative_impl_ngram_map_k: - size_key=4, size_value=4, key_only=0, min_hits=1
statistics ngram-map-k: #calls(b,g,a) = [...]

After:
./build/bin/llama-server [...] --spec-type ngram-map-k4v --spec-ngram-map-k4v-size-n 4 --spec-ngram-map-k4v-size-m 4 --spec-ngram-map-k4v-min-hits 1 --spec-draft-n-min 1 --spec-draft-n-max 4
I common_speculative_impl_ngram_map_k: adding speculative implementation 'ngram-map-k4v'
I common_speculative_impl_ngram_map_k: - size_key=4, size_value=4, key_only=0, min_hits=1
statistics ngram-map-k4v: #calls(b,g,a) = [...]

However, I'm not sure if this change has any other implications?
ngxson
approved these changes
Jun 9, 2026
pwilkin
approved these changes
Jun 10, 2026
Member
Uhh, sorry, I thought it was a simple change, then I read the further comments. If this is not OK I'll revert.
Member
I think it's ok
Requirements