webui: introduce OpenAI-compatible model selector in JSON payload by ServeurpersoCom · Pull Request #16562 · ggml-org/llama.cpp

ServeurpersoCom · 2025-10-13T13:31:34Z

Introduce OpenAI-compatible model selector in JSON payload

This PR adds a minimal model selector to the WebUI sidebar, allowing users to pick an available model exposed through the /v1/models OpenAI-compatible endpoint.

The selector automatically fetches and lists models from the server, persists the selected model in local storage, and sends it in the JSON body of subsequent /v1/chat/completions requests. The selection logic mirrors OpenAI’s client behavior while remaining fully offline-compatible with local llama.cpp instances.

This enables direct interoperability with OpenAI-compatible clients and simplifies multi-model setups in the WebUI.
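
As a rough illustration of the flow described above (list models, persist the choice, send it in the request body), here is a minimal sketch. It is not the PR's code: the storage key, the interface names, and the helper names are assumptions made for this example, and storage is injected so the logic can run outside a browser.

```typescript
// Only the fields of the OpenAI-compatible /v1/models response used here.
interface ModelListResponse {
	data: Array<{ id: string }>;
}

interface ChatMessage {
	role: 'system' | 'user' | 'assistant';
	content: string;
}

interface ChatRequest {
	messages: ChatMessage[];
	stream: boolean;
	model?: string;
}

// Minimal storage contract; the browser's localStorage satisfies it structurally.
interface KeyValueStore {
	getItem(key: string): string | null;
	setItem(key: string, value: string): void;
}

// Hypothetical key name, chosen for this example only.
const SELECTED_MODEL_KEY = 'example.selectedModel';

// Extract the model ids that the selector would list.
function listModelIds(response: ModelListResponse): string[] {
	return response.data.map((entry) => entry.id);
}

// Remember the user's choice across page reloads.
function persistSelection(store: KeyValueStore, modelId: string): void {
	store.setItem(SELECTED_MODEL_KEY, modelId);
}

// Build the JSON body for POST /v1/chat/completions.
function buildChatRequest(store: KeyValueStore, messages: ChatMessage[]): ChatRequest {
	const request: ChatRequest = { messages, stream: true };
	const model = store.getItem(SELECTED_MODEL_KEY);
	// Assumption: with no stored selection the field is omitted
	// and the server falls back to its default model.
	if (model) request.model = model;
	return request;
}
```

Passing `window.localStorage` as the store would give the persisted behavior in a browser; omitting `model` when nothing is stored is a design choice of this sketch, not something the PR states.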

Restore OpenAI-Compatible model source of truth and unify metadata capture:

This change re-establishes a single, reliable source of truth for the active model,
fully aligned with the OpenAI-Compat API behavior.

It introduces a unified metadata flow that captures the model field from both
streaming and non-streaming responses, wiring a new onModel callback through ChatService.
The model name is now resolved directly from the API payload rather than relying on
server /props or UI assumptions.

ChatStore records and persists the resolved model for each assistant message during
streaming, ensuring consistency across the UI and database.
Type definitions for API and settings were also extended to include model metadata
and the onModel callback, completing the alignment with OpenAI-Compat semantics.
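
The unified capture described above could look roughly like the sketch below. It assumes the usual OpenAI-compatible shape, where both the full response and each streamed chunk carry a top-level `model` string; the function names and the "report once" policy are assumptions of this example, and only the `onModel` callback name comes from the description.

```typescript
// Both a full completion and a streamed chunk are treated as "something with a model field".
interface CompletionLike {
	model?: unknown;
}

// Return the trimmed model name, or undefined when the field is missing or empty.
function extractModel(payload: CompletionLike): string | undefined {
	if (typeof payload.model !== 'string') return undefined;
	const name = payload.model.trim();
	return name === '' ? undefined : name;
}

// Parse one SSE line of the form "data: {...}".
// Returns null for non-data lines, the [DONE] sentinel, and malformed JSON.
function parseSseLine(line: string): CompletionLike | null {
	if (!line.startsWith('data:')) return null;
	const body = line.slice('data:'.length).trim();
	if (body === '' || body === '[DONE]') return null;
	try {
		const parsed: unknown = JSON.parse(body);
		return typeof parsed === 'object' && parsed !== null ? (parsed as CompletionLike) : null;
	} catch {
		return null;
	}
}

// Streaming path: invoke onModel for the first chunk that names a model, then stop reporting.
function captureModelFromStream(lines: string[], onModel: (model: string) => void): void {
	let reported = false;
	for (const line of lines) {
		if (reported) break;
		const payload = parseSseLine(line);
		if (!payload) continue;
		const model = extractModel(payload);
		if (model !== undefined) {
			reported = true;
			onModel(model);
		}
	}
}

// Non-streaming path: the same extraction applied to the complete response body.
function captureModelFromResponse(body: CompletionLike, onModel: (model: string) => void): void {
	const model = extractModel(body);
	if (model !== undefined) onModel(model);
}
```

Routing both paths through one `extractModel` helper is what keeps a single source of truth in this sketch: the store only ever sees the value the API payload reported.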

Remaining '/props' usage audit in the WebUI:

A repository-wide search inside 'tools/server/webui' shows the remaining '/props' references are intentional because the WebUI still needs to bootstrap and validate server capabilities outside of chat responses:

'src/routes/+layout.svelte' and 'src/lib/stores/server.svelte.ts' fetch '/props' on application startup to populate the global server store with template, model alias, and capability metadata that never appears in chat completions.
'src/lib/components/app/server/ServerErrorSplash.svelte' and 'src/lib/components/app/chat/ChatScreen/ChatScreenWarning.svelte' surface fallback UI when '/props' is unreachable, ensuring the user understands cached data might be stale.
'src/lib/utils/api-key-validation.ts' validates API keys against '/props' so that the UI can warn about incompatible keys before issuing chat requests.
'src/lib/services/chat.ts' performs a last-resort fetch to '/props' when the streaming handshake fails, preserving compatibility with legacy servers that only expose model names via that endpoint.

ServeurpersoCom · 2025-10-13T13:34:53Z

TL;DR:
Adds a lightweight model selector for the WebUI using the /v1/models OpenAI-compatible endpoint.
Selected models are persisted locally and included in chat request payloads (model field).
Also unifies model metadata capture during streaming and non-streaming responses: the WebUI now uses a single source of truth for the active model across the stack.

ServeurpersoCom · 2025-10-13T13:37:40Z

@ngxson :) What do you think about this approach?

It aims to stay compatible with the current standalone llama-server, llama-swap, and future multi-model evolutions of llama-server.

It introduces a unified, KISS, OpenAI-compatible model selection path while keeping everything backward-compatible with existing setups.

A standalone llama-server on a Raspberry Pi 5:

I'll have to filter the model path here too (?)

ServeurpersoCom · 2025-10-13T15:04:47Z

@allozaur mind taking a look at those default Svelte arrows and the scrolling manager? I figured your Svelte wizardry might know the cleanest way to get rid of them 😄 I like things to be pixel-perfect, but it looks like this is built into the framework, and I’d rather not bypass Svelte just for that.

allozaur · 2025-10-13T15:06:15Z

Yep, will take a look at that and come back to u with an answer 😉

tools/server/webui/src/lib/components/app/chat/ChatSidebar/ChatSidebar.svelte

tools/server/webui/src/routes/+layout.svelte

tools/server/webui/src/lib/stores/models.svelte.ts

tools/server/webui/src/lib/stores/chat.svelte.ts

ServeurpersoCom · 2025-10-20T06:16:19Z

Extracted determineInitialSelection helper
Centralized localStorage key constant
Minor Toaster cleanup per review
All checks passing locally
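
For readers who have not opened the diff, a helper like the one mentioned above might be sketched as follows. Only the name `determineInitialSelection` comes from the comment; the signature, the fallback policy (prefer the persisted id if the server still lists it, otherwise the first available model), and the constant's name and value are assumptions of this example.

```typescript
// Hypothetical constant name and value; the comment only says the key was centralized.
const SELECTED_MODEL_STORAGE_KEY = 'example.selectedModel';

// Read the previously persisted model id through an injected getter
// (in a browser this could be (key) => localStorage.getItem(key)).
function readPersistedId(getItem: (key: string) => string | null): string | null {
	return getItem(SELECTED_MODEL_STORAGE_KEY);
}

// Pick the model to preselect when the list of available ids arrives.
function determineInitialSelection(
	availableIds: string[],
	persistedId: string | null
): string | null {
	// Keep the stored choice only if the server still exposes it.
	if (persistedId !== null && availableIds.indexOf(persistedId) !== -1) {
		return persistedId;
	}
	// Otherwise fall back to the first listed model, or nothing if the list is empty.
	const first = availableIds[0];
	return first !== undefined ? first : null;
}
```

Keeping the decision in a pure function makes it easy to unit-test without touching the DOM or storage.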

ServeurpersoCom · 2025-10-20T07:01:40Z

I think that placing the model selector in the Sidebar makes its UI a bit too heavy and bloated... A much better place for changing the model would be in Chat Form, like here:

Computer :

Smartphone :

That’s actually a great idea: moving the selector into the Chat Form feels way more natural now that I’ve tested it 😄
The layout is cleaner, lighter, and it fits perfectly in the conversation flow. Definitely the right spot!

1324a4d

allozaur · 2025-10-20T07:22:09Z

I will post a PR to improve the UI of this selector as now it's just taking too much space and looks a bit off 😜

ServeurpersoCom · 2025-10-20T07:31:55Z

Got it: you want to fine-tune the layout so the selector sits closer to the mic button, probably with a max width to keep long model names from stretching the form. We could even take it a step further and replace the full selector with a small model icon or dropdown button that pops the model list on click.

But we should still make sure the currently selected model (the one actually sent in the request) is clearly visible somewhere, since the /props display above only shows the model currently loaded on the llama-server, not the one chosen by the user through the selector. And on mobile there’s already very little screen space to work with, so keeping it minimal while still informative would be ideal. Actually, the small checkmark in the dropdown list might already be enough to show the active model, though in that case we’d probably need to fix the scrolling glitch in the framework so the menu behaves properly.

ServeurpersoCom · 2025-10-20T08:36:34Z

Like this?

(root|~/llama.cpp.pascal) git diff 0b9aaf8fe8fb36817c74f25e905aedaccfa9a825
diff --git a/tools/server/public/index.html.gz b/tools/server/public/index.html.gz
index c76f5778b..1897d50c8 100644
Binary files a/tools/server/public/index.html.gz and b/tools/server/public/index.html.gz differ
diff --git a/tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormActions.svelte b/tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormActions.svelte
index cbf0385a4..1bdb7f947 100644
--- a/tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormActions.svelte
+++ b/tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormActions.svelte
@@ -32,27 +32,29 @@
 <div class="flex w-full items-center gap-2 {className}">
        <ChatFormActionFileAttachments {disabled} {onFileUpload} />

-       <ChatFormModelSelector class="min-w-[140px] flex-1" />
+       <div class="ml-auto flex items-center gap-2">
+               <ChatFormModelSelector class="flex-shrink-0" />

-       {#if isLoading}
-               <Button
-                       type="button"
-                       onclick={onStop}
-                       class="h-8 w-8 bg-transparent p-0 hover:bg-destructive/20"
-               >
-                       <span class="sr-only">Stop</span>
-                       <Square class="h-8 w-8 fill-destructive stroke-destructive" />
-               </Button>
-       {:else}
-               <ChatFormActionRecord {disabled} {isLoading} {isRecording} {onMicClick} />
+               {#if isLoading}
+                       <Button
+                               type="button"
+                               onclick={onStop}
+                               class="h-8 w-8 bg-transparent p-0 hover:bg-destructive/20"
+                       >
+                               <span class="sr-only">Stop</span>
+                               <Square class="h-8 w-8 fill-destructive stroke-destructive" />
+                       </Button>
+               {:else}
+                       <ChatFormActionRecord {disabled} {isLoading} {isRecording} {onMicClick} />

-               <Button
-                       type="submit"
-                       disabled={!canSend || disabled || isLoading}
-                       class="h-8 w-8 rounded-full p-0"
-               >
-                       <span class="sr-only">Send</span>
-                       <ArrowUp class="h-12 w-12" />
-               </Button>
-       {/if}
+                       <Button
+                               type="submit"
+                               disabled={!canSend || disabled || isLoading}
+                               class="h-8 w-8 rounded-full p-0"
+                       >
+                               <span class="sr-only">Send</span>
+                               <ArrowUp class="h-12 w-12" />
+                       </Button>
+               {/if}
+       </div>
 </div>
diff --git a/tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormModelSelector.svelte b/tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormModelSelector.svelte
index ca48285da..d54147a5e 100644
--- a/tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormModelSelector.svelte
+++ b/tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormModelSelector.svelte
@@ -1,6 +1,6 @@
 <script lang="ts">
        import { onMount } from 'svelte';
-       import { Loader2 } from '@lucide/svelte';
+       import { Bot, Loader2 } from '@lucide/svelte';
        import * as Select from '$lib/components/ui/select';
        import { cn } from '$lib/components/ui/utils';
        import {
@@ -63,7 +63,7 @@
        }
 </script>

-<div class={cn('flex min-w-0 flex-col gap-1', className)}>
+<div class={cn('flex min-w-0 flex-col items-center gap-1', className)}>
        {#if loading && options.length === 0 && !isMounted}
                <div class="flex items-center gap-2 text-xs text-muted-foreground">
                        <Loader2 class="h-4 w-4 animate-spin" />
@@ -81,13 +81,16 @@
                        disabled={loading || updating}
                >
                        <Select.Trigger
-                               class="h-9 w-full min-w-[140px] justify-between rounded-full border border-input bg-background/80 px-3 text-left text-sm sm:min-w-[200px]"
+                               aria-label="Select model"
+                               title={selectedOption?.name || 'Select model'}
+                               class="flex !h-8 !w-8 items-center justify-center !rounded-full border border-input !bg-background/80 !p-0 text-muted-foreground transition-colors hover:!bg-background data-[state=open]:!bg-background [&>svg:last-child]:hidden"
                        >
-                               <span class="truncate font-medium">{selectedOption?.name || 'Select model'}</span>
-
                                {#if updating}
                                        <Loader2 class="h-4 w-4 animate-spin text-muted-foreground" />
+                               {:else}
+                                       <Bot class="h-4 w-4" />
                                {/if}
+                               <span class="sr-only">{selectedOption?.name || 'Select model'}</span>
                        </Select.Trigger>

                        <Select.Content class="z-[100000]">
@@ -105,6 +108,6 @@
        {/if}

        {#if error}
-               <p class="text-xs text-destructive">{error}</p>
+               <p class="text-center text-xs text-destructive">{error}</p>
        {/if}
 </div>
(root|~/llama.cpp.pascal)

allozaur · 2025-10-20T08:38:32Z

@ServeurpersoCom I was thinking more of something similar to what Claude has:

ServeurpersoCom · 2025-10-20T15:19:10Z

rebase master

ServeurpersoCom · 2025-10-20T16:34:31Z

ServeurpersoCom · 2025-10-20T18:14:29Z

No more Flowbite scroll bug on mobile:

I'll add a "developer" option with the model selector hidden by default.
But we can force it by default for all new users in lib/constants/settings-config.ts.
That's important so users can share decentralized AI services with friends, like I'm doing now,
otherwise it would be broken for anyone visiting the page for the first time!

ServeurpersoCom · 2025-10-20T18:46:20Z

Done. Now it’s compliant for all use cases @allozaur

ServeurpersoCom · 2025-10-20T19:13:40Z

PC:
https://github.com/user-attachments/assets/c7b55f8e-503d-4445-ab17-59ab4c75d35e

Mobile:
https://github.com/user-attachments/assets/f73c6e31-3246-4db0-b66a-75dba4d23b67

allozaur · 2025-10-20T23:09:53Z

@ServeurpersoCom

I've tested this on my end simply by running:

llama-server -hf ggml-org/Qwen2.5-Omni-7B-GGUF -c 0 --jinja --parallel 5

and

npm run dev

and I've spotted a few issues:

When I have the Enable model selector option unchecked, the default model should not be gpt-3.5-turbo but the one that I am actually running with llama-server.

On the main screen I see the currently loaded model:

But under the message I am seeing this in the model info:

Also, after enabling the model selector in Settings, I am seeing the full path to the model file instead of just the file name in the message model info.

Besides that, the selector value string is good, but when the name is long, the UI breaks:
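
The full-path issue reported above is typically handled by a small normalization step. The sketch below borrows the name `normalizeModelName` from the later commit message, which describes trimming the input and removing directory components; the exact signature, the handling of Windows-style separators, and the `null` return for empty input are assumptions of this example rather than the merged implementation.

```typescript
// Reduce a model identifier that may be a file path to its final path segment.
// Returns null when nothing usable remains (empty or separator-only input).
function normalizeModelName(raw: string): string | null {
	const trimmed = raw.trim();
	if (trimmed === '') return null;

	// Split on both POSIX and Windows separators, ignoring empty segments
	// such as those produced by leading or trailing slashes.
	const segments = trimmed
		.split(/[\\/]/)
		.map((segment) => segment.trim())
		.filter((segment) => segment !== '');

	const last = segments.pop();
	return last !== undefined ? last : null;
}
```

Names that contain no separator pass through unchanged, so aliases reported by the server are not affected by this step.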

ServeurpersoCom · 2025-10-21T04:29:06Z

1. When i have the `Enable model selector` option unchecked, the default model should not be `gpt-3.5-turbo` but the one that i am actually running with `llama-server`

Oops, that's exactly what the backend does: we need to display the captured chunks, and refactor the backend accordingly!

When the selector is disabled, it falls back to the active server model name from /props. When the model selector is enabled, the displayed model comes from the message metadata (the one explicitly selected and sent in the request).
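
The two display modes described above reduce to a small decision function. This is a minimal sketch under assumed names (`DisplayInputs`, `resolveDisplayedModel`, and the field names are hypothetical); it encodes only the rule stated in the comment and deliberately adds no extra fallback between the two sources.

```typescript
interface DisplayInputs {
	// Value of the "Enable model selector" setting.
	selectorEnabled: boolean;
	// Model name stored with the assistant message (captured from the API response).
	messageModel?: string | null;
	// Model name reported by the server's /props endpoint.
	serverModel?: string | null;
}

// Decide which model name to show under an assistant message.
function resolveDisplayedModel(inputs: DisplayInputs): string | null {
	if (inputs.selectorEnabled) {
		// OpenAI-compat mode: trust the per-message metadata.
		return inputs.messageModel ?? null;
	}
	// Legacy mode: show the model the server says it has loaded.
	return inputs.serverModel ?? null;
}
```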


- Replace inline portal and event listeners with proper Svelte bindings
- Introduce 'persisted' store helper for localStorage sync without runes
- Extract 'normalizeModelName' utils + Vitest coverage
- Simplify ChatFormModelSelector structure and cleanup logic

Replaced the persisted store helper's use of '$state/$effect' runes with a plain TS implementation to prevent orphaned effect runtime errors outside component context

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
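
To make the "plain TS implementation" point concrete, here is one way a storage-backed store can be written without `$state`/`$effect`. It is a sketch, not the helper from the PR: the `persisted` name is taken from the commit message, while the signature, the injected storage, and the JSON encoding are assumptions. The `subscribe` shape follows the Svelte store contract (call the subscriber immediately, return an unsubscribe function).

```typescript
// Minimal storage contract; the browser's localStorage satisfies it structurally.
interface KeyValueStore {
	getItem(key: string): string | null;
	setItem(key: string, value: string): void;
}

type Subscriber<T> = (value: T) => void;

interface PersistedStore<T> {
	get(): T;
	set(next: T): void;
	subscribe(fn: Subscriber<T>): () => void;
}

// Create a store whose value is mirrored to storage as JSON on every set().
function persisted<T>(storage: KeyValueStore, key: string, initial: T): PersistedStore<T> {
	let value = initial;

	// Hydrate from storage; corrupt JSON leaves the initial value in place.
	const raw = storage.getItem(key);
	if (raw !== null) {
		try {
			value = JSON.parse(raw) as T;
		} catch {
			value = initial;
		}
	}

	let subscribers: Array<Subscriber<T>> = [];

	return {
		get: () => value,
		set(next: T): void {
			value = next;
			storage.setItem(key, JSON.stringify(next));
			subscribers.forEach((fn) => fn(value));
		},
		subscribe(fn: Subscriber<T>): () => void {
			subscribers.push(fn);
			fn(value); // Emit the current value right away.
			return () => {
				subscribers = subscribers.filter((other) => other !== fn);
			};
		}
	};
}
```

Because nothing here depends on component lifecycle, the helper can be created at module scope, which is the situation where rune-based effects would otherwise be orphaned.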


allozaur

Also please make this small change ;)

tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte

…atMessageAssistant.svelte Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

allozaur

Okay, I think we are good to go :)

@allozaur

…ml-org#16562)

* webui: introduce OpenAI-compatible model selector in JSON payload
* webui: restore OpenAI-Compatible model source of truth and unify metadata capture
  This change re-establishes a single, reliable source of truth for the active model: fully aligned with the OpenAI-Compat API behavior
  It introduces a unified metadata flow that captures the model field from both streaming and non-streaming responses, wiring a new onModel callback through ChatService
  The model name is now resolved directly from the API payload rather than relying on server /props or UI assumptions
  ChatStore records and persists the resolved model for each assistant message during streaming, ensuring consistency across the UI and database
  Type definitions for API and settings were also extended to include model metadata and the onModel callback, completing the alignment with OpenAI-Compat semantics
* webui: address review feedback from allozaur
* webui: move model selector into ChatForm (idea by @allozaur)
* webui: make model selector more subtle and integrated into ChatForm
* webui: replaced the Flowbite selector with a native Svelte dropdown
* webui: add developer setting to toggle the chat model selector
* webui: address review feedback from allozaur
  Normalized streamed model names during chat updates by trimming input and removing directory components before saving or persisting them, so the conversation UI shows only the filename
  Forced model names within the chat form selector dropdown to render as a single-line, truncated entry with a tooltip revealing the full name
* webui: toggle displayed model source for legacy vs OpenAI-Compat modes
  When the selector is disabled, it falls back to the active server model name from /props
  When the model selector is enabled, the displayed model comes from the message metadata (the one explicitly selected and sent in the request)
* Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormActions.svelte
  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
* Update tools/server/webui/src/lib/constants/localstorage-keys.ts
  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
* Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormModelSelector.svelte
  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
* Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte
  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
* Update tools/server/webui/src/lib/services/chat.ts
  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
* Update tools/server/webui/src/lib/services/chat.ts
  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
* webui: refactor model selector and persistence helpers
  - Replace inline portal and event listeners with proper Svelte bindings
  - Introduce 'persisted' store helper for localStorage sync without runes
  - Extract 'normalizeModelName' utils + Vitest coverage
  - Simplify ChatFormModelSelector structure and cleanup logic
  Replaced the persisted store helper's use of '$state/$effect' runes with a plain TS implementation to prevent orphaned effect runtime errors outside component context
  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
* webui: document normalizeModelName usage with inline examples
* Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormModelSelector.svelte
  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
* Update tools/server/webui/src/lib/stores/models.svelte.ts
  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
* Update tools/server/webui/src/lib/stores/models.svelte.ts
  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
* webui: extract ModelOption type into dedicated models.d.ts
  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
* webui: refine ChatMessageAssistant displayedModel source logic
* webui: stabilize dropdown, simplify model extraction, and init assistant model field
* chore: update webui static build
* Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte
  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
* chore: npm format, update webui static build
* webui: align sidebar trigger position, remove z-index glitch
* chore: update webui build output

---------

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

@allozaur

* Add new webui from llama.cpp * Add new webui * feat: Improve mobile UI for Settings Dialog (ggml-org#16084) * feat: Improve mobile UI for Settings Dialog * chore: update webui build output * fix: Linting errors * chore: update webui build output # Conflicts: # examples/server/webui_llamacpp/src/lib/components/app/chat/ChatSettings/ChatSettingsFields.svelte # examples/server/webui_llamacpp/src/lib/components/app/chat/ChatSettings/ChatSettingsSection.svelte # tools/server/public/index.html.gz * webui : fix handling incomplete chunks (ggml-org#16107) * Always show message actions for mobile UI + improvements for user message sizing (ggml-org#16076) # Conflicts: # .gitignore # examples/server/webui_llamacpp/package.json # examples/server/webui_llamacpp/scripts/dev.sh # tools/server/webui/scripts/post-build.sh * webui: switch to hash-based routing (alternative of ggml-org#16079) (ggml-org#16157) * Switched web UI to hash-based routing * Added hash to missed goto function call * Removed outdated SPA handling code * Fixed broken sidebar home link # Conflicts: # examples/server/webui_llamacpp/src/routes/+layout.ts # tools/server/server.cpp * Allow viewing conversations even when llama server is down (ggml-org#16255) * webui: allow viewing conversations and sending messages even if llama-server is down - Cached llama.cpp server properties in browser localStorage on startup, persisting successful fetches and reloading them when refresh attempts fail so the chat UI continues to render while the backend is unavailable. - Cleared the stored server properties when resetting the store to prevent stale capability data after cache-backed operation. - Kept the original error-splash behavior when no cached props exist so fresh installs still surface a clear failure state instead of rendering stale data. 
* feat: Add UI for `props` endpoint unavailable + cleanup logic * webui: extend cached props fallback to offline errors Treat connection failures (refused, DNS, timeout, fetch) the same way as server 5xx so the warning banner shows up when cache is available, instead of falling back to a full error screen. * webui: Left the chat form enabled when a server warning is present so operators can keep sending messages e.g., to restart the backend over llama-swap, even while cached /props data is in use * chore: update webui build output --------- Co-authored-by: Pascal <admin@serveurperso.com> # Conflicts: # examples/server/webui_llamacpp/src/lib/components/app/chat/ChatScreen/ChatScreenWarning.svelte # examples/server/webui_llamacpp/src/lib/constants/localstorage-keys.ts * Enhance text file detection logic for file attachments (ggml-org#16199) * feat: Enhances text file detection logic * chore: Build static `webui` output * chore: update webui build output # Conflicts: # examples/server/webui_llamacpp/src/lib/constants/binary-detection.ts * Show message actions by default (ggml-org#16289) * fix: preserved zero values in chat settings inputs and textareas by switching to nullish coalescing for field values and default placeholders (ggml-org#16312) * Improve Mobile UI for dialogs and action dropdowns (ggml-org#16222) * fix: Always show conversation item actions * feat: Improve Alert Dialog and Dialog mobile UI * feat: Add settings reset to default confirmation * fix: Close Edit dialog on save * chore: update webui build output * webui: implement proper z-index system and scroll management - Add CSS variable for centralized z-index control - Fix dropdown positioning with Settings dialog conflicts - Prevent external scroll interference with proper event handling - Clean up hardcoded z-index values for maintainable architecture * webui: ensured the settings dialog enforces dynamic viewport height on mobile while retaining existing desktop sizing overrides * feat: Use `dvh` 
instead of computed px height for dialogs max height on mobile * chore: update webui build output * feat: Improve Settings fields UI * chore: update webui build output * chore: update webui build output --------- Co-authored-by: Pascal <admin@serveurperso.com> * Fix thinking blocks with quotes + add handling `[THINK]...[/THINK]` blocks (ggml-org#16326) * fix: prevent reasoning blocks with quotes from being truncated * chore: update webui build output * feat: Improve thinking content parsing * test: Adds ChatMessage component stories for different thinking blocks * chore: update webui build output * fix: ChatMessage story fix --------- Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * Chatapi ignore empty sampling (ggml-org#16330) * fix: skip empty sampling fields instead of coercing to 0 in chat API options * chore: update webui build output * webui: Remove running `llama-server` within WebUI `dev.sh` script (ggml-org#16363) * Add optional setting for showing "Model used:" information (ggml-org#16337) * feat: Add a setting to include model name used to generate the message * feat: UI improvements * feat: Save model info along with the database message entry creation * chore: Build webui static output * Improve code block color theming (ggml-org#16325) * feat: Improve code block theming * chore: update webui build output * chore: Update webui static build * Conversation action dialogs as singletons from Chat Sidebar + apply conditional rendering for Actions Dropdown for Chat Conversation Items (ggml-org#16369) * fix: Render Conversation action dialogs as singletons from Chat Sidebar level * chore: update webui build output * fix: Render Actions Dropdown conditionally only when user hovers conversation item + remove unused markup * chore: Update webui static build * fix: Always truncate conversation names * chore: Update webui static build * fix: track viewportHeight via window.innerHeight to avoid unwanted scrolling (ggml-org#16356) Use 
<svelte:window bind:innerHeight> instead of manual resize listener Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * webui : Fix messages payload sent to chat completions (ggml-org#16402) * fix: Include just the currently active message branches instead of all in chat completions request * chore: Build webui static output * chore: Formatting * chore: update webui build output * Capture model name only after first token (streaming) or completed request (ggml-org#16405) * feat: Capture model name only after first token (streaming) or completed request (non-streaming) * chore: update webui build output * chore: update webui build output * Fix missing messages on sibling navigation (ggml-org#16408) * fix: resolve message disappearing issue when navigating between regenerated siblings by using current leaf nodes instead of cached sibling IDs * chore: update webui build output * chore: update webui build output * webui : added download action (ggml-org#13552) (ggml-org#16282) * webui : added download action (ggml-org#13552) * webui : import and export (for all conversations) * webui : fixed download-format, import of one conversation * webui : add ExportedConversations type for chat import/export * feat: Update naming & order * chore: Linting * webui : Updated static build output --------- Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * refactor: centralize CoT parsing in backend for streaming mode (ggml-org#16394) * refactor: unify reasoning handling via backend reasoning_content, drop frontend tag parsing - Updated the chat message component to surface backend-supplied reasoning via message.thinking while showing the raw assistant content without inline tag scrubbing - Simplified chat streaming to append content chunks directly, stream reasoning into the message model, and persist any partial reasoning when generation stops - Refactored the chat service SSE handler to rely on server-provided reasoning_content, removing legacy 
<think> parsing logic - Refreshed Storybook data and streaming flows to populate the thinking field explicitly for static and streaming assistant messages * refactor: implement streaming-aware universal reasoning parser Remove the streaming mode limitation from --reasoning-format by refactoring try_parse_reasoning() to handle incremental parsing of <think> tags across all formats. - Rework try_parse_reasoning() to track whitespace, partial tags, and multiple reasoning segments, allowing proper separation of reasoning_content and content in streaming mode - Parse reasoning tags before tool call handling in content-only and Llama 3.x formats to ensure inline <think> blocks are captured correctly - Change default reasoning_format from 'auto' to 'deepseek' for consistent behavior - Add 'deepseek-legacy' option to preserve old inline behavior when needed - Update CLI help and documentation to reflect streaming support - Add parser tests for inline <think>...</think> segments The parser now continues processing content after </think> closes instead of stopping, enabling proper message.reasoning_content and message.content separation in both streaming and non-streaming modes. Fixes the issue where streaming responses would dump everything (including post-thinking content) into reasoning_content while leaving content empty. 
* refactor: address review feedback from allozaur - Passed the assistant message content directly to ChatMessageAssistant to drop the redundant derived state in the chat message component - Simplified chat streaming updates by removing unused partial-thinking handling and persisting partial responses straight from currentResponse - Refreshed the ChatMessage stories to cover standard and reasoning scenarios without the old THINK-tag parsing examples Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * refactor: restore forced reasoning prefix to pass test-chat ([chat] All tests passed) - store the exact sequence seen on input when 'thinking_forced_open' enforces a reasoning block - inject this prefix before the first accumulated segment in 'reasoning_content', then clear it to avoid duplication - repeat the capture on every new 'start_think' detection to properly handle partial/streaming flows * refactor: address review feedback from ngxson * debug: say goodbye to curl -N, hello one-click raw stream - adds a new checkbox in the WebUI to display raw LLM output without backend parsing or frontend Markdown rendering * Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessage.svelte Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * webui: add Storybook example for raw LLM output and scope reasoning format toggle per story - Added a Storybook example that showcases the chat message component in raw LLM output mode with the provided trace sample - Updated every ChatMessage story to toggle the disableReasoningFormat setting so the raw-output rendering remains scoped to its own example * npm run format * chat-parser: address review feedback from ngxson Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com> --------- Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com> # Conflicts: # common/arg.cpp # examples/server/webui_llamacpp/src/lib/utils/thinking.ts 
# tools/server/README.md * No markdown in cot (ggml-org#16483) * fix: let the model think in plaintext * chore: npm run format + npm run build * webui: updated the chat service to only include max_tokens in the req… (ggml-org#16489) * webui: updated the chat service to only include max_tokens in the request payload when the setting is explicitly provided, while still mapping explicit zero or null values to the infinite-token sentinel * chore: update webui build output * feat: render user content as markdown option (ggml-org#16358) * feat: render user content as markdown option - Add a persisted 'renderUserContentAsMarkdown' preference to the settings defaults and info metadata so the choice survives reloads like other options - Surface the new 'Render user content as Markdown' checkbox in the General section of the chat settings dialog, beneath the PDF toggle - Render user chat messages with 'MarkdownContent' when the new setting is enabled, matching assistant formatting while preserving the existing card styling otherwise - chore: update webui build output * chore: update webui build output * webui: remove client-side context pre-check and rely on backend for limits (ggml-org#16506) * fix: make SSE client robust to premature [DONE] in agentic proxy chains * webui: remove client-side context pre-check and rely on backend for limits Removed the client-side context window pre-check and now simply sends messages while keeping the dialog imports limited to core components, eliminating the maximum context alert path Simplified streaming and non-streaming chat error handling to surface a generic 'No response received from server' error whenever the backend returns no content Removed the obsolete maxContextError plumbing from the chat store so state management now focuses on the core message flow without special context-limit cases * webui: cosmetic rename of error messages * Update tools/server/webui/src/lib/stores/chat.svelte.ts Co-authored-by: Aleksander Grygier 
<aleksander.grygier@gmail.com> * Update tools/server/webui/src/lib/stores/chat.svelte.ts Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * Update tools/server/webui/src/lib/components/app/chat/ChatScreen/ChatScreen.svelte Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * Update tools/server/webui/src/lib/components/app/chat/ChatScreen/ChatScreen.svelte Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * chore: update webui build output --------- Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> # Conflicts: # examples/server/webui_llamacpp/src/lib/components/app/dialogs/ChatErrorDialog.svelte # examples/server/webui_llamacpp/src/lib/components/app/dialogs/MaximumContextAlertDialog.svelte # examples/server/webui_llamacpp/src/lib/services/context.ts * fix: add remark plugin to render raw HTML as literal text (ggml-org#16505) * fix: add remark plugin to render raw HTML as literal text Implemented a missing MDAST stage to neutralize raw HTML like major LLM WebUIs do ensuring consistent and safe Markdown rendering Introduced 'remarkLiteralHtml', a plugin that converts raw HTML nodes in the Markdown AST into plain-text equivalents while preserving indentation and line breaks. 
This ensures consistent rendering and prevents unintended HTML execution, without altering valid Markdown structure Kept 'remarkRehype' in the pipeline since it performs the required conversion from MDAST to HAST for KaTeX, syntax highlighting, and HTML serialization Refined the link-enhancement logic to skip unnecessary DOM rewrites, fixing a subtle bug where extra paragraphs were injected after the first line due to full innerHTML reconstruction, and ensuring links open in new tabs only when required Final pipeline: remarkGfm -> remarkMath -> remarkBreaks -> remarkLiteralHtml -> remarkRehype -> rehypeKatex -> rehypeHighlight -> rehypeStringify * fix: address review feedback from allozaur * chore: update webui build output # Conflicts: # examples/server/webui_llamacpp/src/lib/constants/literal-html.ts * Add server-driven parameter defaults and syncing (ggml-org#16515) # Conflicts: # examples/server/webui_llamacpp/src/lib/components/app/chat/ChatSettings/ParameterSourceIndicator.svelte # examples/server/webui_llamacpp/src/lib/constants/precision.ts # examples/server/webui_llamacpp/src/lib/services/parameter-sync.spec.ts # examples/server/webui_llamacpp/src/lib/services/parameter-sync.ts # examples/server/webui_llamacpp/src/lib/utils/config-helpers.ts # examples/server/webui_llamacpp/src/lib/utils/precision.ts * fix: added a normalization step for MathJax-style \[\] and  delimiters (ggml-org#16599) * fix: added a normalization step for MathJax-style \[\] and  delimiters So inline and block equations are converted before KaTeX rendering, enabling proper display of model-generated LaTeX in the WebUI * chore: update webui build output * webui: reorganize settings layout (ggml-org#16607) * webui: reorganize settings layout * chore: update webui build output * fix: remove unused variable * chore: update webui build output * Enable per-conversation loading states to allow having parallel conversations (ggml-org#16327) * feat: Per-conversation loading states and tracking 
streaming stats * chore: update webui build output * refactor: Chat state management Consolidates loading state management by using a global `isLoading` store synchronized with individual conversation states. This change ensures proper reactivity and avoids potential race conditions when updating the UI based on the loading status of different conversations. It also improves the accuracy of statistics displayed. Additionally, slots service methods are updated to use conversation IDs for per-conversation state management, avoiding global state pollution. * feat: Adds loading indicator to conversation items * chore: update webui build output * fix: Fix aborting chat streaming Improves the chat stream abortion process by ensuring that partial responses are saved before the abort signal is sent. This avoids a race condition where the onError callback could clear the streaming state before the partial response is saved. Additionally, the stream reading loop and callbacks are now checked for abort signals to prevent further processing after abortion. 
* refactor: Remove redundant comments * chore: build webui static output * refactor: Cleanup * chore: update webui build output * chore: update webui build output * fix: Conversation loading indicator for regenerating messages * chore: update webui static build * feat: Improve configuration * feat: Install `http-server` as dev dependency to not need to rely on `npx` in CI * Import/Export UX improvements (ggml-org#16619) * webui : added download action (ggml-org#13552) * webui : import and export (for all conversations) * webui : fixed download-format, import of one conversation * webui : add ExportedConversations type for chat import/export * feat: Update naming & order * chore: Linting * feat: Import/Export UX improvements * chore: update webui build output * feat: Update UI placement of Import/Export tab in Chat Settings Dialog * refactor: Cleanup chore: update webui build output * feat: Enable shift-click multiple conversation items selection * chore: update webui static build * chore: update webui static build --------- Co-authored-by: Sascha Rogmann <github@rogmann.org> # Conflicts: # examples/server/webui_llamacpp/src/lib/components/app/chat/ChatSettings/ConversationSelectionDialog.svelte # examples/server/webui_llamacpp/src/lib/components/app/chat/ChatSettings/ImportExportTab.svelte # examples/server/webui_llamacpp/src/lib/utils/conversation-utils.ts * Prevent premature submission on IME input (ggml-org#16673) * fix: Prevent premature submission on IME input * chore: update webui static build * refactor: Put IME completion checker in a helper function and add checking for `KeyboardEvent.eventKey === 229` * chore: update webui static build * chore: update webui static build * chore: update webui static build # Conflicts: # examples/server/webui_llamacpp/src/lib/utils/is-ime-composing.ts * Handle legacy 'context' attachments (ggml-org#16687) * webui: introduce OpenAI-compatible model selector in JSON payload (ggml-org#16562) * webui: introduce 
OpenAI-compatible model selector in JSON payload * webui: restore OpenAI-Compatible model source of truth and unify metadata capture This change re-establishes a single, reliable source of truth for the active model: fully aligned with the OpenAI-Compat API behavior It introduces a unified metadata flow that captures the model field from both streaming and non-streaming responses, wiring a new onModel callback through ChatService The model name is now resolved directly from the API payload rather than relying on server /props or UI assumptions ChatStore records and persists the resolved model for each assistant message during streaming, ensuring consistency across the UI and database Type definitions for API and settings were also extended to include model metadata and the onModel callback, completing the alignment with OpenAI-Compat semantics * webui: address review feedback from allozaur * webui: move model selector into ChatForm (idea by @allozaur) * webui: make model selector more subtle and integrated into ChatForm * webui: replaced the Flowbite selector with a native Svelte dropdown * webui: add developer setting to toggle the chat model selector * webui: address review feedback from allozaur Normalized streamed model names during chat updates by trimming input and removing directory components before saving or persisting them, so the conversation UI shows only the filename Forced model names within the chat form selector dropdown to render as a single-line, truncated entry with a tooltip revealing the full name * webui: toggle displayed model source for legacy vs OpenAI-Compat modes When the selector is disabled, it falls back to the active server model name from /props When the model selector is enabled, the displayed model comes from the message metadata (the one explicitly selected and sent in the request) * Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormActions.svelte Co-authored-by: Aleksander Grygier 
<aleksander.grygier@gmail.com> * Update tools/server/webui/src/lib/constants/localstorage-keys.ts Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormModelSelector.svelte Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * Update tools/server/webui/src/lib/services/chat.ts Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * Update tools/server/webui/src/lib/services/chat.ts Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * webui: refactor model selector and persistence helpers - Replace inline portal and event listeners with proper Svelte bindings - Introduce 'persisted' store helper for localStorage sync without runes - Extract 'normalizeModelName' utils + Vitest coverage - Simplify ChatFormModelSelector structure and cleanup logic Replaced the persisted store helper's use of '$state/$effect' runes with a plain TS implementation to prevent orphaned effect runtime errors outside component context Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * webui: document normalizeModelName usage with inline examples * Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormModelSelector.svelte Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * Update tools/server/webui/src/lib/stores/models.svelte.ts Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * Update tools/server/webui/src/lib/stores/models.svelte.ts Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * webui: extract ModelOption type into dedicated models.d.ts Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * webui: refine ChatMessageAssistant displayedModel source logic * webui: stabilize dropdown, simplify model extraction, 
and init assistant model field * chore: update webui static build * Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * chore: npm format, update webui static build * webui: align sidebar trigger position, remove z-index glitch * chore: update webui build output --------- Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> # Conflicts: # examples/server/webui_llamacpp/src/lib/components/app/chat/ChatForm/ChatFormModelSelector.svelte # examples/server/webui_llamacpp/src/lib/services/models.ts # examples/server/webui_llamacpp/src/lib/stores/models.svelte.ts # examples/server/webui_llamacpp/src/lib/stores/persisted.svelte.ts # examples/server/webui_llamacpp/src/lib/types/models.d.ts # examples/server/webui_llamacpp/src/lib/utils/model-names.test.ts # examples/server/webui_llamacpp/src/lib/utils/model-names.ts # examples/server/webui_llamacpp/src/lib/utils/portal-to-body.ts * webui: support q URL parameter (ggml-org#16728) * webui: support q URL parameter Fixes ggml-org#16722 I’ve checked that it works with Firefox’s AI tools * webui: apply suggestions from code review Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * chore: update webui static build --------- Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * build fix --------- Co-authored-by: firecoperana <firecoperana> Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> Co-authored-by: Quentin Bramas <quentin.bramas@gmail.com> Co-authored-by: Isaac McFadyen <isaac@imcf.me> Co-authored-by: Pascal <admin@serveurperso.com> Co-authored-by: Sascha Rogmann <59577610+srogmann@users.noreply.github.com> Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com> Co-authored-by: Sascha Rogmann <github@rogmann.org> Co-authored-by: Florian Badie <florianbadie@odrling.xyz>
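The `max_tokens` change described in the commit message above (ggml-org#16489) could look roughly like the sketch below. The names `buildPayload`, `ChatPayload`, and `INFINITE_TOKENS` are hypothetical, and the sentinel value `-1` is an assumption; this is not the WebUI's actual code.

```typescript
// Hypothetical sketch: names and the sentinel value are assumptions.
const INFINITE_TOKENS = -1; // assumed sentinel meaning "no token limit"

interface ChatPayload {
  messages: { role: string; content: string }[];
  stream: boolean;
  max_tokens?: number;
}

// Builds a request body; max_tokens is only present when a value was set.
function buildPayload(
  messages: ChatPayload["messages"],
  maxTokens?: number | null
): ChatPayload {
  const payload: ChatPayload = { messages, stream: true };
  if (maxTokens !== undefined) {
    // Explicit null or 0 maps to the sentinel; other values pass through.
    payload.max_tokens =
      maxTokens === null || maxTokens === 0 ? INFINITE_TOKENS : maxTokens;
  }
  return payload;
}
```

Leaving the field out entirely when nothing was set lets the server apply its own default instead of a client-side guess.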

@allozaur

…ml-org#16562)

* webui: introduce OpenAI-compatible model selector in JSON payload

* webui: restore OpenAI-Compatible model source of truth and unify metadata capture

  This change re-establishes a single, reliable source of truth for the active model: fully aligned with the OpenAI-Compat API behavior

  It introduces a unified metadata flow that captures the model field from both streaming and non-streaming responses, wiring a new onModel callback through ChatService

  The model name is now resolved directly from the API payload rather than relying on server /props or UI assumptions

  ChatStore records and persists the resolved model for each assistant message during streaming, ensuring consistency across the UI and database

  Type definitions for API and settings were also extended to include model metadata and the onModel callback, completing the alignment with OpenAI-Compat semantics

* webui: address review feedback from allozaur

* webui: move model selector into ChatForm (idea by @allozaur)

* webui: make model selector more subtle and integrated into ChatForm

* webui: replaced the Flowbite selector with a native Svelte dropdown

* webui: add developer setting to toggle the chat model selector

* webui: address review feedback from allozaur

  Normalized streamed model names during chat updates by trimming input and removing directory components before saving or persisting them, so the conversation UI shows only the filename

  Forced model names within the chat form selector dropdown to render as a single-line, truncated entry with a tooltip revealing the full name

* webui: toggle displayed model source for legacy vs OpenAI-Compat modes

  When the selector is disabled, it falls back to the active server model name from /props

  When the model selector is enabled, the displayed model comes from the message metadata (the one explicitly selected and sent in the request)

* Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormActions.svelte

  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* Update tools/server/webui/src/lib/constants/localstorage-keys.ts

  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormModelSelector.svelte

  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte

  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* Update tools/server/webui/src/lib/services/chat.ts

  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* Update tools/server/webui/src/lib/services/chat.ts

  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* webui: refactor model selector and persistence helpers

  - Replace inline portal and event listeners with proper Svelte bindings
  - Introduce 'persisted' store helper for localStorage sync without runes
  - Extract 'normalizeModelName' utils + Vitest coverage
  - Simplify ChatFormModelSelector structure and cleanup logic

  Replaced the persisted store helper's use of '$state/$effect' runes with a plain TS implementation to prevent orphaned effect runtime errors outside component context

  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* webui: document normalizeModelName usage with inline examples

* Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormModelSelector.svelte

  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* Update tools/server/webui/src/lib/stores/models.svelte.ts

  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* Update tools/server/webui/src/lib/stores/models.svelte.ts

  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* webui: extract ModelOption type into dedicated models.d.ts

  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* webui: refine ChatMessageAssistant displayedModel source logic

* webui: stabilize dropdown, simplify model extraction, and init assistant model field

* chore: update webui static build

* Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte

  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* chore: npm format, update webui static build

* webui: align sidebar trigger position, remove z-index glitch

* chore: update webui build output

---------

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
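The metadata flow described in this commit message (capturing `model` from streamed chunks and reporting it through an `onModel` callback) might be sketched as follows. The helper names `extractModel` and `processSseLines` are hypothetical. Only the top-level `model` string field and the `data: [DONE]` terminator come from the OpenAI-compatible streaming format; the PR's real `ChatService` code may differ.

```typescript
// Hypothetical sketch of capturing `model` from OpenAI-compatible payloads.
type OnModel = (model: string) => void;

interface CompletionLike {
  model?: unknown;
}

// Returns the trimmed model name from a response object or stream chunk.
function extractModel(payload: CompletionLike): string | undefined {
  const raw = payload.model;
  if (typeof raw !== "string") return undefined;
  const trimmed = raw.trim();
  return trimmed === "" ? undefined : trimmed;
}

// Walks SSE "data:" lines and calls onModel once, with the first model seen.
function processSseLines(lines: string[], onModel: OnModel): void {
  let reported = false;
  for (const line of lines) {
    if (!line.startsWith("data: ")) continue;
    const data = line.slice("data: ".length);
    if (data === "[DONE]") break;
    let chunk: CompletionLike;
    try {
      chunk = JSON.parse(data) as CompletionLike;
    } catch {
      continue; // ignore malformed lines in this sketch
    }
    const model = extractModel(chunk);
    if (model !== undefined && !reported) {
      reported = true;
      onModel(model);
    }
  }
}
```

For a non-streaming response, the same `extractModel` helper can be applied once to the parsed JSON body, which is what keeps the two paths unified.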

ServeurpersoCom requested a review from allozaur as a code owner October 13, 2025 13:31

github-actions bot added examples server labels Oct 13, 2025

allozaur requested a review from ngxson October 13, 2025 13:46

This was referenced Oct 13, 2025

Svelte webui model selector #16335

Closed

Feature Request: tool to list and delete cached models #16393

Open

Feature request: allow load/unload models on server #16487

Closed

ServeurpersoCom force-pushed the openai-model-selector branch from 6606ff7 to 03d383c Compare October 18, 2025 10:20

ServeurpersoCom mentioned this pull request Oct 18, 2025

webui: add OAI-Compat Harmony tool-call streaming visualization and persistence in chat UI #16618

Merged

allozaur reviewed Oct 19, 2025

ServeurpersoCom force-pushed the openai-model-selector branch 2 times, most recently from 45298f8 to 286ca88 Compare October 20, 2025 06:07

ServeurpersoCom force-pushed the openai-model-selector branch from 48ad6c5 to b56058a Compare October 20, 2025 15:03

ServeurpersoCom and others added 16 commits October 22, 2025 13:01

Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormActions.svelte

2a7c4c0

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

Update tools/server/webui/src/lib/constants/localstorage-keys.ts

f59cac2

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormModelSelector.svelte

938e68c

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte

da3c653

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

Update tools/server/webui/src/lib/services/chat.ts

0ad7601

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

Update tools/server/webui/src/lib/services/chat.ts

3a7ab7f

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

webui: document normalizeModelName usage with inline examples

2d5bcad
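A normalization helper of the kind this commit documents could look like the sketch below. It follows the behavior stated in the PR's commit messages (trim the input, drop directory components, keep only the filename). The function name is deliberately different from the real `normalizeModelName`, and handling of Windows-style backslashes is an assumption.

```typescript
// Hypothetical sketch, not the PR's actual normalizeModelName implementation.
function normalizeModelNameSketch(name: string): string {
  const trimmed = name.trim();
  if (trimmed === "") return "";
  // Split on "/" and "\" (the latter is an assumption) and keep the last
  // non-empty segment, so trailing separators do not yield an empty name.
  const parts = trimmed.split(/[\\/]/).filter((part) => part.length > 0);
  return parts.length > 0 ? parts[parts.length - 1] : trimmed;
}

// Example inputs of the kind a server might report:
console.log(normalizeModelNameSketch("  /models/qwen/model.gguf  "));
console.log(normalizeModelNameSketch("model.gguf"));
```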

Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormModelSelector.svelte

fd7866a

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

Update tools/server/webui/src/lib/stores/models.svelte.ts

8e54dd8

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

Update tools/server/webui/src/lib/stores/models.svelte.ts

ab922d9

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

webui: extract ModelOption type into dedicated models.d.ts

2173554

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
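To illustrate what a `ModelOption` type might carry, here is a sketch that maps an OpenAI-compatible `/v1/models` list response onto selector options. The `id` and `name` fields and the `toModelOptions` helper are hypothetical; the actual contents of `models.d.ts` are not shown on this page. Only the `{ data: [{ id }] }` response shape is taken from the OpenAI-compatible API.

```typescript
// Hypothetical shape; the real ModelOption in models.d.ts may differ.
interface ModelOption {
  id: string; // value sent as `model` in the request body
  name: string; // short label shown in the dropdown
}

interface ModelsListResponse {
  object?: string;
  data?: { id?: unknown }[];
}

// Converts a /v1/models response into de-duplicated selector options.
function toModelOptions(response: ModelsListResponse): ModelOption[] {
  const seen = new Set<string>();
  const options: ModelOption[] = [];
  for (const entry of response.data ?? []) {
    const id = entry.id;
    if (typeof id !== "string" || id === "" || seen.has(id)) continue;
    seen.add(id);
    // Use the last path segment as the label, falling back to the full id.
    options.push({ id, name: id.split("/").pop() || id });
  }
  return options;
}
```

In a browser, the input would typically come from `fetch` against the server's `/v1/models` endpoint followed by `response.json()`.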

webui: refine ChatMessageAssistant displayedModel source logic

e0dc324
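The displayed-model rule stated earlier in this PR (selector disabled: use the server model name from /props; selector enabled: use the model stored in the message metadata) can be sketched as a small pure function. The names `DisplayInputs` and `resolveDisplayedModel` are hypothetical, and the refinements made in this particular commit are not visible here, so treat this as an illustration of the stated rule only.

```typescript
// Hypothetical sketch of the displayed-model rule described in the PR.
interface DisplayInputs {
  selectorEnabled: boolean; // developer setting toggling the model selector
  messageModel?: string | null; // model recorded on the assistant message
  serverModel?: string | null; // model name reported by /props
}

function resolveDisplayedModel(inputs: DisplayInputs): string | null {
  if (inputs.selectorEnabled) {
    // OpenAI-compat mode: show what was recorded for this message.
    return inputs.messageModel?.trim() || null;
  }
  // Legacy mode: show the server's active model.
  return inputs.serverModel?.trim() || null;
}
```

Keeping the rule in one function makes the two modes easy to test without mounting a component.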

webui: stabilize dropdown, simplify model extraction, and init assistant model field

cef7776

chore: update webui static build

d13a292

ServeurpersoCom force-pushed the openai-model-selector branch from f3a6387 to d13a292 Compare October 22, 2025 11:03

allozaur requested changes Oct 22, 2025

tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte (outdated, resolved)

ServeurpersoCom and others added 4 commits October 22, 2025 13:19

Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte

d90cb36

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

chore: npm format, update webui static build

0cf3135

webui: align sidebar trigger position, remove z-index glitch

0a907ef

chore: update webui build output

114cd56

allozaur approved these changes Oct 22, 2025

allozaur merged commit 9b9201f into ggml-org:master Oct 22, 2025
14 checks passed

firecoperana mentioned this pull request Oct 26, 2025

Add --webui arg to launch llama.cpp new webui ikawrakow/ik_llama.cpp#786

Merged

Conversation

ServeurpersoCom commented Oct 13, 2025

Introduce OpenAI-compatible model selector in JSON payload

Restore OpenAI-Compatible model source of truth and unify metadata capture:

Remaining '/props' usage audit in the WebUI:

ServeurpersoCom commented Oct 13, 2025

ServeurpersoCom commented Oct 13, 2025 • edited

ServeurpersoCom commented Oct 13, 2025

allozaur commented Oct 13, 2025

ServeurpersoCom commented Oct 20, 2025

ServeurpersoCom commented Oct 20, 2025

allozaur commented Oct 20, 2025

ServeurpersoCom commented Oct 20, 2025 • edited

ServeurpersoCom commented Oct 20, 2025

allozaur commented Oct 20, 2025 • edited

ServeurpersoCom commented Oct 20, 2025

ServeurpersoCom commented Oct 20, 2025

ServeurpersoCom commented Oct 20, 2025 • edited

ServeurpersoCom commented Oct 20, 2025 • edited

ServeurpersoCom commented Oct 20, 2025

allozaur commented Oct 20, 2025 • edited

ServeurpersoCom commented Oct 21, 2025

allozaur left a comment

allozaur left a comment

2 participants
