
[Frontend] Add lora-adapter toggler / selection list for selected model #13

@marksverdhei

Summary

Add a LoRA adapter toggler/selection list in the webui that allows users to view, enable/disable, and adjust the scale of loaded LoRA adapters for the currently selected model.

This feature should only appear when LoRA adapters are available (loaded via CLI, present in the auto-discovery directory, or configured in the models preset INI). Only the CLI path works today; the other two depend on the backend gaps listed under "Backend gaps to consider". The feature must not disrupt or degrade any other frontend feature when no adapters are present.

Backend Context

Existing LoRA API (fully implemented)

The server already exposes two endpoints for runtime LoRA management, plus a per-request override:

  • GET /lora-adapters → returns list of loaded adapters with id, path, scale, task_name, prompt_prefix, and optional alora_invocation_string
    • Handler: tools/server/server-context.cpp:3929-3955
  • POST /lora-adapters → sets global adapter scales at runtime (array of {id, scale})
    • Handler: tools/server/server-context.cpp:3957-3989
  • Per-request lora field in /completion and /chat/completions — overrides global scales for individual requests
    • Parsed at: tools/server/server-task.cpp:328-335
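
As a rough illustration of how the webui could talk to these endpoints, here is a minimal TypeScript sketch. The field names come from the list above; which fields are optional, the `id` type, and every helper name (`toScaleUpdates`, `fetchLoraAdapters`, `applyLoraScales`, `HttpFn`) are assumptions made for this example, not existing webui code.

```typescript
// One entry of the GET /lora-adapters response, per the field list above.
// Optionality and the numeric id type are assumptions to verify against the server.
interface LoraAdapter {
  id: number;
  path: string;
  scale: number;
  task_name?: string;
  prompt_prefix?: string;
  alora_invocation_string?: string;
}

// One entry of the POST /lora-adapters body: an array of {id, scale}.
interface LoraScaleUpdate {
  id: number;
  scale: number;
}

// Minimal fetch-like type so the sketch does not depend on DOM typings (assumption).
type HttpResponse = { ok: boolean; status: number; json(): Promise<unknown> };
type HttpFn = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string }
) => Promise<HttpResponse>;

// Hypothetical helper: strip adapter records down to the POST payload.
function toScaleUpdates(adapters: LoraAdapter[]): LoraScaleUpdate[] {
  return adapters.map(({ id, scale }) => ({ id, scale }));
}

// Hypothetical helper: read the currently loaded adapters.
async function fetchLoraAdapters(http: HttpFn, baseUrl = ""): Promise<LoraAdapter[]> {
  const res = await http(`${baseUrl}/lora-adapters`);
  if (!res.ok) throw new Error(`GET /lora-adapters failed: ${res.status}`);
  return (await res.json()) as LoraAdapter[];
}

// Hypothetical helper: set the global adapter scales.
async function applyLoraScales(
  http: HttpFn,
  updates: LoraScaleUpdate[],
  baseUrl = ""
): Promise<void> {
  const res = await http(`${baseUrl}/lora-adapters`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(updates),
  });
  if (!res.ok) throw new Error(`POST /lora-adapters failed: ${res.status}`);
}
```

In a browser, the global `fetch` can be passed as the `http` argument. Injecting it keeps the helpers easy to stub in unit tests.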

How LoRA adapters are loaded

  • CLI: --lora FNAME or --lora-scaled FNAME:SCALE,... (common/arg.cpp:2473-2496)
  • --lora-init-without-apply loads adapters into memory without applying them (scale=0), enabling later activation via the POST endpoint (common/arg.cpp:3110-3115)
  • Data structure: common_adapter_lora_info in common/common.h:42-50 — holds path, scale, task_name, prompt_prefix, and native llama_adapter_lora *ptr

Auto-discovery / Router mode

  • Router mode activates when no model is specified on CLI (tools/server/server.cpp:128)
  • Models are auto-discovered from --models-dir directory by scanning for .gguf files (common/preset.cpp:382-445)
  • Models can also be configured via --models-preset INI file (common/preset.cpp:310-360)
  • LoRA adapters are NOT currently auto-discovered — they must be specified via CLI args
  • LoRA GGUF files are distinguishable from model files by their general.type = "adapter" metadata (src/llama-adapter.cpp:204)
  • The INI preset system (common/preset.cpp) does not currently support lora or lora-scaled keys

Key backend task types

  • SERVER_TASK_TYPE_GET_LORA and SERVER_TASK_TYPE_SET_LORA in tools/server/server-task.h:27-28
  • Result structures: server_task_result_get_lora (line 543) and server_task_result_apply_lora (line 554)

Frontend Context

Current state — no LoRA UI exists

  • The API type definition already includes a lora field in ApiLlamaCppServerProps.default_generation_settings.params (tools/server/webui/src/lib/types/api.d.ts:175,343), but it is not used anywhere in the UI
  • SettingsChatServiceOptions (tools/server/webui/src/lib/types/settings.d.ts:19-69) has no LoRA field
  • No component exists for LoRA management
  • The custom?: string field in settings could theoretically pass LoRA params, but there is no UI for it

Model selector architecture (reference for LoRA UI patterns)

  • Model store: tools/server/webui/src/lib/stores/models.svelte.ts — manages model list, selection, load/unload
  • Model service: tools/server/webui/src/lib/services/models.service.ts — API calls (/v1/models)
  • Model selector components: tools/server/webui/src/lib/components/app/models/
    • ModelsSelector.svelte — dropdown with grouped display (loaded, favourites, by org)
    • ModelsSelectorSheet.svelte — mobile/sheet variant
    • ModelsSelectorOption.svelte — individual model item
  • Chat parameter passing: tools/server/webui/src/lib/stores/chat.svelte.ts:734-745 — builds completion options (no LoRA field currently)
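
If the per-request route is chosen, the completion options built in chat.svelte.ts would need an extra `lora` entry. The sketch below shows one possible shape, assuming the request body accepts the same `{id, scale}` array as the POST endpoint. `LoraSelection`, `RequestOptions`, and `withPerRequestLora` are hypothetical names, not part of the existing store.

```typescript
// UI-side selection for one adapter (hypothetical shape).
interface LoraSelection {
  id: number;
  enabled: boolean;
  scale: number;
}

// Stand-in for the completion options object built by the chat store (assumption).
type RequestOptions = Record<string, unknown>;

// Return a copy of the options with a `lora` array attached.
// With no adapters, the request is left without a `lora` key so that
// servers without LoRA adapters see exactly the same payload as before.
function withPerRequestLora(
  options: RequestOptions,
  selection: LoraSelection[]
): RequestOptions {
  if (selection.length === 0) return { ...options };
  // Disabled adapters are sent explicitly with scale 0 rather than omitted.
  const lora = selection.map((s) => ({
    id: s.id,
    scale: s.enabled ? s.scale : 0,
  }));
  return { ...options, lora };
}
```

Sending every adapter explicitly, including disabled ones at scale 0, avoids relying on whatever default the server applies to adapters missing from the list.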

Implementation Considerations

Frontend requirements

  1. Conditional rendering: LoRA UI should only appear when GET /lora-adapters returns a non-empty list
  2. Adapter list with toggles: Show each adapter with name (derived from path), current scale, and enable/disable toggle
  3. Scale slider: Allow adjusting scale per adapter (from 0.0 upward, with values above 1.0 permitted)
  4. Global vs per-request: Decide whether to use POST /lora-adapters (global) or pass the lora field per request. Per-request is more flexible, but requests with different LoRA configurations are not batched together, which can reduce throughput
  5. Placement: Could be a collapsible section near the model selector, or a settings panel entry
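
Requirements 1 to 3 can be expressed as a few pure functions that a Svelte store or component could call. This is a sketch under assumptions: all names are hypothetical, the `.gguf` suffix stripping is a display choice, and the default slider value of 1.0 for adapters loaded at scale 0 is a guess rather than documented behavior.

```typescript
// Subset of the GET /lora-adapters entry that the UI needs.
interface AdapterRow {
  id: number;
  path: string;
  scale: number;
}

// Per-adapter UI state: a toggle plus a remembered slider value.
interface AdapterUiState {
  id: number;
  name: string;
  enabled: boolean;
  scale: number;
}

// Requirement 1: render the LoRA section only for a non-empty list.
function shouldShowLoraUi(rows: AdapterRow[]): boolean {
  return rows.length > 0;
}

// Requirement 2: derive a display name from the path (file name without .gguf).
function adapterDisplayName(path: string): string {
  const base = path.split(/[\\/]/).pop() || path;
  return base.replace(/\.gguf$/i, "");
}

// Map server rows to UI state. An adapter at scale 0 is shown as disabled,
// and its slider starts at 1.0 (assumed default) so enabling it has an effect.
function toUiState(rows: AdapterRow[]): AdapterUiState[] {
  return rows.map((r) => ({
    id: r.id,
    name: adapterDisplayName(r.path),
    enabled: r.scale !== 0,
    scale: r.scale !== 0 ? r.scale : 1.0,
  }));
}

// Requirement 3: the scale actually sent to the server.
// Toggling off sends 0 but keeps the slider value for later re-enabling.
function effectiveScale(state: AdapterUiState): number {
  return state.enabled ? state.scale : 0;
}
```

Keeping the toggle and the slider value separate means a user can switch an adapter off and back on without losing the scale they had chosen.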

Backend gaps to consider

  • No LoRA auto-discovery: load_from_models_dir() in common/preset.cpp does not scan for or categorize LoRA files. A --lora-dir flag or LoRA support in the INI preset would be needed for auto-discovery
  • GGUF metadata: LoRA files have general.type = "adapter" — this can be used to distinguish them during directory scanning
  • Router mode: In router mode, each model spawns a child process. LoRA adapters would need to be passed to the child process args (via preset/INI)

Testing

Existing test infrastructure can be reused:

  • Test LoRA adapter: https://huggingface.co/ggml-org/stories15M_MOE/resolve/main/moe_shakespeare15M.gguf (small Shakespeare-style LoRA for stories15M_MOE model)
  • Base model: stories15M_MOE preset already in test utils
  • Existing tests: tools/server/tests/unit/test_lora.py — tests scale toggling, per-request config, and parallel slots
  • LoRA conversion tools: convert_lora_to_gguf.py and tools/export-lora/ for creating test adapters

File Reference

| Area | File | Key Lines |
|---|---|---|
| LoRA API routes | tools/server/server.cpp | 200-201 |
| LoRA GET/POST handlers | tools/server/server-context.cpp | 3929-3989 |
| LoRA slot application | tools/server/server-context.cpp | 1064-1143 |
| LoRA utilities | tools/server/server-common.cpp | 91-155 |
| LoRA data structures | common/common.h | 42-50 |
| LoRA CLI args | common/arg.cpp | 2473-2496, 3110-3115 |
| Model auto-discovery | common/preset.cpp | 382-445 |
| INI preset parsing | common/preset.cpp | 310-360 |
| Router model management | tools/server/server-models.cpp | 242-375 |
| Frontend API types (lora field) | tools/server/webui/src/lib/types/api.d.ts | 175, 343 |
| Frontend settings types | tools/server/webui/src/lib/types/settings.d.ts | 19-69 |
| Frontend models store | tools/server/webui/src/lib/stores/models.svelte.ts | |
| Frontend chat params | tools/server/webui/src/lib/stores/chat.svelte.ts | 734-745 |
| Test: LoRA server tests | tools/server/tests/unit/test_lora.py | |
