Description
Check Existing Issues
- I have searched the existing issues and discussions.
- I am using the latest version of Open WebUI.
Installation Method
Docker
Open WebUI Version
v0.6.16
Ollama Version (if applicable)
No response
Operating System
Windows Sequoia
Browser (if applicable)
No response
Confirmation
- I have read and followed all instructions in README.md.
- I am using the latest version of both Open WebUI and Ollama.
- I have included the browser console logs.
- I have included the Docker container logs.
- I have provided every relevant configuration, setting, and environment variable used in my setup.
- I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
- I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
- Start with the initial platform/version/OS and dependencies used,
- Specify exact install/launch/configure commands,
- List URLs visited, user input (incl. example values/emails/passwords if needed),
- Describe all options and toggles enabled or changed,
- Include any files or environmental changes,
- Identify the expected and actual result at each stage,
- Ensure any reasonably skilled user can follow and hit the same issue.
Expected Behavior
Streaming output provides all streamed content and does not miss any parts
Actual Behavior
Streaming output occasionally misses a stream chunk (a few characters). This is often unnoticeable, and it is easy to assume it's a model issue or an inference-provider issue; however, I have verified that the issue occurs with multiple models from multiple inference providers.
Steps to Reproduce
- Work around issue #15848 ("debug logging of streamed responses fails with TypeError: not all arguments converted during string formatting") by monkeypatching backend/open_webui/utils/middleware.py (line 2042 at commit 2470da8), replacing
  log.debug("Error: ", e)
  with
  log.debug(f"Error: {e}")
  (this enables debug logging to print streaming errors properly)
- Run the latest image with:
  docker run -d --name openwebui -p 3000:8080 -e GLOBAL_LOG_LEVEL=debug -v /path/to/monkeypatched_middleware.py:/app/backend/open_webui/utils/middleware.py -v openwebui-data:/app/backend/data --restart unless-stopped ghcr.io/open-webui/open-webui:latest
- Configure any model, e.g. Cerebras qwen-3-235b-a22b or OpenAI gpt-4o-mini
- Run a prompt like 'print a bunch of stuff'
- Check the logs for errors like those indicated below
- If you don't see the error, rerun the prompt a few times, or try another prompt that outputs many tokens, and it will show up
NOTE: I believe this happens more frequently with faster-streaming models like OpenAI gpt-4o-mini or Cerebras qwen-3-235b-a22b.
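For reference, the TypeError that the monkeypatch works around can be reproduced in isolation. This is a minimal sketch assuming %-style message formatting as in Python's stdlib logging (the logger name and exception value are illustrative, not taken from Open WebUI's code):

```python
import logging

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger("demo")

e = ValueError("Unterminated string starting at: line 1 column 7 (char 6)")

# Buggy form: `e` is treated as a %-formatting argument, but the message
# has no placeholder, so formatting the record fails and the exception
# text never makes it into the log output.
# log.debug("Error: ", e)

# Fixed forms: interpolate the exception into the message.
log.debug("Error: %s", e)   # lazy %-style, idiomatic for logging
log.debug(f"Error: {e}")    # f-string, as used in the monkeypatch

# The underlying failure, reproduced directly:
try:
    "Error: " % (e,)
except TypeError as exc:
    print(exc)  # not all arguments converted during string formatting
```

With the buggy call, the decode errors shown in the logs below would have been silently lost, which is why the monkeypatch is a prerequisite for reproducing this report.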
Logs & Screenshots
2025-07-18 20:35:40.116 | DEBUG | open_webui.utils.middleware:stream_body_handler:2058 - Error: Unterminated string starting at: line 1 column 139 (char 138) - {}
2025-07-18 20:35:40.137 | DEBUG | open_webui.utils.middleware:stream_body_handler:2058 - Error: Unterminated string starting at: line 1 column 172 (char 171) - {}
2025-07-18 20:35:40.268 | DEBUG | open_webui.utils.middleware:stream_body_handler:2058 - Error: Expecting ':' delimiter: line 1 column 171 (char 170) - {}
2025-07-18 20:35:40.306 | DEBUG | open_webui.utils.middleware:stream_body_handler:2058 - Error: Unterminated string starting at: line 1 column 7 (char 6) - {}
2025-07-18 20:35:40.325 | DEBUG | open_webui.utils.middleware:stream_body_handler:2058 - Error: Unterminated string starting at: line 1 column 173 (char 172) - {}
Additional Information
I have investigated this at length. If you add some debugging after line 1786, you'll find that (1) every so often a valid JSON event is split across two line iterations, with the first containing part of the data including the beginning of the JSON string and the second containing the remainder, and (2) the lines do not contain line endings.
I dug around and found that the lines are iterated directly from r.content.
I don't know whether the correct solution is to buffer in Open WebUI's middleware.py where the lines are processed (which is awkward because the line endings are not present, so you can only key on something like }), or to do something lower-level to prevent SSE JSON lines from ever being fragmented in the first place.
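A middleware-side buffering approach could look like the following minimal sketch. This is hypothetical, not Open WebUI's actual code; iter_sse_events and the chunk format are illustrative. The idea is to accumulate fragments and only emit an event once the accumulated text parses as complete JSON, which sidesteps the missing line endings:

```python
import json


def iter_sse_events(chunks):
    """Reassemble possibly-fragmented `data: {...}` SSE payloads.

    Instead of assuming each received chunk is a complete event, keep a
    buffer and attempt a JSON parse; on failure (e.g. "Unterminated
    string"), keep the buffer and wait for the next fragment.
    """
    buf = ""
    for chunk in chunks:
        buf += chunk
        data = buf.removeprefix("data: ").strip()
        if not data:
            continue
        try:
            event = json.loads(data)
        except json.JSONDecodeError:
            # Incomplete fragment: hold onto it and read more.
            continue
        buf = ""
        yield event


# A complete event split across two reads, as observed in the logs:
events = list(iter_sse_events([
    'data: {"choices": [{"delta": {"content": "he',
    'llo"}}]}',
]))
print(events)  # [{'choices': [{'delta': {'content': 'hello'}}]}]
```

The trade-off is that a parse failure is treated as "incomplete" rather than "malformed", so truly corrupt payloads would need a size or timeout cutoff; a lower-level fix that preserves event boundaries before the middleware sees them would avoid that ambiguity.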