server: log prompts to directory by jacekpoplawski · Pull Request #22031 · ggml-org/llama.cpp

jacekpoplawski · 2026-04-17T07:06:02Z

Overview

Log each prompt into a separate text file.

Additional information

There is a recurring class of issues related to prompt-processing cache behavior, for example:
#19394

Server logs are useful for observing cache-related metrics, but they are not enough when the goal is to compare the exact prompt contents between requests.

This small helper writes each prompt to a separate plain-text file, making it easy to inspect and compare prompts with tools like diff.

It helped me quickly identify that opencode reorders the system-reminder block, which changes the middle of the prompt and defeats llama.cpp prompt caching.

The change is intentionally minimal and does not even create a new directory. I initially considered logging more information, but this turned out to be enough to understand what the client is doing.
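For readers who want a feel for the idea, here is a minimal sketch of such a helper. It is not the code from this PR: the function name `log_prompt_to_dir` and its return convention are hypothetical, and only the atomic counter and the `0001.txt` naming are taken from the thread. As described above, the directory is assumed to exist already.

```cpp
#include <atomic>
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper (illustration only, not the PR's implementation).
// Writes `prompt` to <dir>/<NNNN>.txt and returns the path that was written,
// or an empty string if the file could not be created or written.
// The directory is NOT created here; it must already exist.
static std::string log_prompt_to_dir(const std::string & dir, const std::string & prompt) {
    // Process-wide counter, so concurrent requests get distinct numbers.
    static std::atomic<int> prompt_counter(0);
    const int id = ++prompt_counter;

    char name[32];
    std::snprintf(name, sizeof(name), "%04d.txt", id);

    const std::string path = dir + "/" + name;
    std::ofstream out(path, std::ios::binary);
    if (!out) {
        return "";
    }
    out.write(prompt.data(), static_cast<std::streamsize>(prompt.size()));
    out.close();
    return out ? path : "";
}
```

Because the counter starts at zero on every process start, a restart would reuse the same names, which is the concern raised later in the review.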

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: YES (initial research only)

0x62ash · 2026-04-17T14:20:05Z

This is a very useful feature. I used mitmproxy for debugging and it was a pain in a...

jacekpoplawski · 2026-04-18T15:48:23Z

I was also able to run it on Windows:

.\bin\Release\llama-server.exe -c 50000 -m J:\llm\models\Qwen3.6-35B-A3B-UD-Q4_K_M.gguf --log-prompts-dir logs

then connect Opencode and try to reproduce the unnecessary prompt reprocessing:

    Directory: C:\Users\jacek\git\llama.cpp\build_2026.04.18\logs


Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a----        18.04.2026     17:37           3869 0001.txt
-a----        18.04.2026     17:37          55640 0002.txt
-a----        18.04.2026     17:38          57557 0003.txt
-a----        18.04.2026     17:38          56958 0004.txt
-a----        18.04.2026     17:38          62223 0005.txt
-a----        18.04.2026     17:39          64588 0006.txt
-a----        18.04.2026     17:39          65036 0007.txt
-a----        18.04.2026     17:39          65908 0008.txt
-a----        18.04.2026     17:39          66766 0009.txt
-a----        18.04.2026     17:40          66711 0010.txt
-a----        18.04.2026     17:40          69417 0011.txt
-a----        18.04.2026     17:40          70280 0012.txt
-a----        18.04.2026     17:40          75751 0013.txt
-a----        18.04.2026     17:40          76858 0014.txt
-a----        18.04.2026     17:40          77048 0015.txt
-a----        18.04.2026     17:40          77943 0016.txt
-a----        18.04.2026     17:41          77874 0017.txt

then we can use fc to compare two prompts:

PS C:\Users\jacek\git\llama.cpp\build_2026.04.18\logs> fc.exe /N .\0016.txt .\0017.txt
Comparing files .\0016.txt and .\0017.TXT
***** .\0016.txt
 1184:  <|im_start|>user
 1185:  add one comment to it
 1186:  <system-reminder>
 1187:  Your operational mode has changed from plan to build.
 1188:  You are no longer in read-only mode.
 1189:  You are permitted to make file changes, run shell commands, and utilize your arsenal of tools as needed.
 1190:  </system-reminder><|im_end|>
 1191:  <|im_start|>assistant
 1192:  <think>
 1193:  The user wants me to add a comment to the `build_norm_gated` function. I should add a descriptive comment before the function.
 1194:  </think>
 1195:
 1196:  <tool_call>
***** .\0017.TXT
 1184:  <|im_start|>user
 1185:  add one comment to it<|im_end|>
 1186:  <|im_start|>assistant
 1187:  <tool_call>
*****

***** .\0016.txt
 1222:  <|im_start|>assistant
 1223:  <think>
***** .\0017.TXT
 1213:  <|im_start|>assistant
 1214:  Added.<|im_end|>
 1215:  <|im_start|>user
 1216:  thank you
 1217:  <system-reminder>
 1218:  Your operational mode has changed from plan to build.
 1219:  You are no longer in read-only mode.
 1220:  You are permitted to make file changes, run shell commands, and utilize your arsenal of tools as needed.
 1221:  </system-reminder><|im_end|>
 1222:  <|im_start|>assistant
 1223:  <think>
*****

In prompt 16, the system-reminder appears after "add one comment to it", while in prompt 17 it has moved to after "thank you". Because the two prompts diverge at that earlier point, a large number of tokens had to be processed again.

These logs are useful for detecting cases like this.
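The same check can be scripted. The sketch below is only an illustration with hypothetical helper names: it finds how many leading bytes two logged prompts share and which line the first difference falls on. The server reuses its cache on tokens rather than bytes, so the byte offset is a rough proxy for where reprocessing starts, not an exact figure.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Length in bytes of the longest common prefix of two prompts.
static size_t common_prefix_len(const std::string & a, const std::string & b) {
    const size_t n = a.size() < b.size() ? a.size() : b.size();
    size_t i = 0;
    while (i < n && a[i] == b[i]) {
        ++i;
    }
    return i;
}

// 1-based number of the line that contains byte offset `pos` in `s`.
static size_t line_at(const std::string & s, size_t pos) {
    size_t line = 1;
    for (size_t i = 0; i < pos && i < s.size(); ++i) {
        if (s[i] == '\n') {
            ++line;
        }
    }
    return line;
}
```

Loading two consecutive log files into strings and calling `common_prefix_len` gives a quick estimate of how much of the previous prompt could still be reused.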

0x62ash · 2026-04-20T20:51:11Z

Hello,

The log file name should follow a more robust format. Proper naming helps prevent overwriting after restarts and avoids conflicts during concurrent writes.

The simplest solution is to use a Unix timestamp combined with a UUID or request ID.

pwilkin · 2026-04-21T06:13:11Z

Agreed with @0x62ash: within the specified directory, each session should create its own timestamped subdirectory, and the dumps of the individual prompts should go inside it.
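A rough sketch of the layout proposed in the two comments above, for illustration only. The function names, the `-` separator, and the request-ID parameter are assumptions, not what was merged. Creating the session directory itself (for example with `std::filesystem::create_directories`) is left out to keep the example small.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical layout (not the merged implementation):
//   <base>/<session start, Unix seconds>/<Unix ms>-<request id>.txt

// One subdirectory per server start, named after the start time.
static std::string session_dir(const std::string & base, int64_t session_start_s) {
    return base + "/" + std::to_string(session_start_s);
}

// One file per prompt; the request id keeps names unique even if two
// prompts arrive within the same millisecond.
static std::string prompt_log_path(const std::string & base,
                                   int64_t             session_start_s,
                                   int64_t             unix_ms,
                                   const std::string & request_id) {
    return session_dir(base, session_start_s) + "/" +
           std::to_string(unix_ms) + "-" + request_id + ".txt";
}
```

With this shape a restart gets a fresh subdirectory, so earlier dumps are not overwritten, and concurrent requests cannot collide as long as request IDs are unique.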

jacekpoplawski · 2026-05-03T00:07:52Z

I switched from OpenCode to Pi Coding Agent, but I’m seeing a very similar issue.

When comparing two prompt logs produced by this PR, I can see that Pi removes the thoughts from the next prompt. As a result, the prompt content changes between requests and llama.cpp has to reprocess the changed suffix of the context, which can take minutes.

--- /home/jacek/logs/1777767070/0005.txt        2026-05-03 02:11:59.883259335 +0200
+++ /home/jacek/logs/1777767070/0006.txt        2026-05-03 02:12:43.399730238 +0200
@@ -127,16 +127,7 @@
 <|turn>user
 read the docs<turn|>
 <|turn>model
-<|channel>thought
-The user wants me to "read the docs". Looking at the project context, there are several documentation locations:
(...)
+<|turn>user
+run the tests<turn|>
+<|turn>model

aldehir · 2026-05-03T00:51:58Z

> When comparing two prompt logs produced by this PR, I can see that Pi removes the thoughts from the next prompt. As a result, the prompt content changes between requests and llama.cpp has to reprocess the changed suffix of the context, which can take minutes.

This is expected behavior for Gemma 4. The reasoning traces are only kept between tool calls and tool responses. A new user message will remove the traces from prior messages.

ggerganov

This is useful to have as available functionality. cc @ngxson

ngxson · 2026-06-05T14:47:31Z

+            static std::atomic<int> prompt_counter(0);
+            const int file_name = ++prompt_counter;


Maybe better to use a timestamp for the file name, e.g. ggml_time_ms()?

Assuming this feature is mostly used for debugging, the chance of a collision should be negligible.

With ggml_time_ms:

    Directory: C:\Users\jacek\git\llama.cpp\build\prompt-logs


Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a----        05.06.2026     18:47             58 000000006048.txt
-a----        05.06.2026     18:48            173 000000009115.txt
-a----        05.06.2026     18:48            356 000000010614.txt
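A small sketch of how names of this shape can be produced, for illustration only. The 12-digit zero padding is inferred from the listing above, and `prompt_file_name` is a hypothetical name. Its argument stands in for the value returned by `ggml_time_ms()`.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a millisecond timestamp as a 12-digit, zero-padded file name,
// matching the shape of the names in the listing (e.g. 000000006048.txt).
// `t_ms` is a stand-in for the value returned by ggml_time_ms().
static std::string prompt_file_name(int64_t t_ms) {
    char buf[40];
    std::snprintf(buf, sizeof(buf), "%012" PRId64 ".txt", t_ms);
    return buf;
}
```

Zero padding keeps alphabetical order equal to chronological order in a directory listing. Two prompts logged within the same millisecond would still map to the same name, which is the collision risk judged negligible above.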

pwilkin

Let's prioritize this, this would be of great help with the ton of various non-specific parser issue reports we're getting.

jacekpoplawski · 2026-06-05T16:34:50Z

> this would be of great help with the ton of various non-specific parser issue reports we're getting.

That was the idea. I will rebase onto master and change the file name as requested.

ngxson · 2026-06-05T16:44:13Z

+    add_opt(common_arg(
+        {"--log-prompts-dir"}, "PATH",
+        "Log prompts to directory",
+        [](common_params &params, const std::string & value) {
+            params.path_prompts_log_dir = value;
+        }
+    ));


Suggested change

-    add_opt(common_arg(
-        {"--log-prompts-dir"}, "PATH",
-        "Log prompts to directory",
-        [](common_params &params, const std::string & value) {
-            params.path_prompts_log_dir = value;
-        }
-    ));
+    add_opt(common_arg(
+        {"--log-prompts-dir"}, "PATH",
+        "Log prompts to directory (only used for debugging, default: disabled)",
+        [](common_params &params, const std::string & value) {
+            params.path_prompts_log_dir = value;
+        }
+    ).set_examples({LLAMA_EXAMPLE_SERVER, LLAMA_EXAMPLE_CLI}));

Add `--log-prompts-dir` to write each prompt to a separate text file in the specified directory.

pwilkin

@ngxson if it's fine by you let's merge this.

@ngxson

* server: log prompts to directory

  Add `--log-prompts-dir` to write each prompt to a separate text file in the specified directory.

* Apply suggestion from @ngxson

---------

Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
(cherry picked from commit 1e91256)

jacekpoplawski requested review from a team as code owners April 17, 2026 07:06

github-actions bot added the examples and server labels Apr 17, 2026

jacekpoplawski marked this pull request as draft April 17, 2026 07:06

jacekpoplawski force-pushed the prompt-logging branch from 999701f to eff56d7 Compare April 18, 2026 15:13

jacekpoplawski force-pushed the prompt-logging branch from eff56d7 to 10c4950 Compare April 18, 2026 15:52

jacekpoplawski marked this pull request as ready for review April 18, 2026 15:53

jacekpoplawski mentioned this pull request Apr 20, 2026

<system-reminder> keeps moving, causing unnecessary prompt processing in llama.cpp anomalyco/opencode#23595

Open

0x62ash mentioned this pull request Apr 20, 2026

Eval bug: Qwen3.5-122B-A10B-GGUF loses part of the context cache with erased invalidated context checkpoint #19977

Closed

jacekpoplawski force-pushed the prompt-logging branch from 10c4950 to cf0414e Compare May 3, 2026 00:24

jacekpoplawski mentioned this pull request May 23, 2026

server: fix checkpoints creation #22929

Merged

jacekpoplawski mentioned this pull request Jun 5, 2026

Eval bug: Qwen 3.6 27B forcing full prompt re-processing due to lack of cache data #22746

Open

ggerganov approved these changes Jun 5, 2026

ngxson reviewed Jun 5, 2026

pwilkin approved these changes Jun 5, 2026

ngxson reviewed Jun 5, 2026

server: log prompts to directory

d69585e

Add `--log-prompts-dir` to write each prompt to a separate text file in the specified directory.

jacekpoplawski force-pushed the prompt-logging branch from cf0414e to d69585e Compare June 5, 2026 16:53

pwilkin approved these changes Jun 9, 2026

ngxson reviewed Jun 9, 2026

Comment thread common/arg.cpp Outdated

Apply suggestion from @ngxson

8be6eee

ngxson approved these changes Jun 9, 2026

ngxson added the merge ready label Jun 9, 2026

pwilkin approved these changes Jun 9, 2026

pwilkin merged commit 1e91256 into ggml-org:master Jun 9, 2026
1 check passed
