Feature Request: multiple llama-server WebUI FRs

### Prerequisites

- [x] I am running the latest code. Mention the version if possible as well.
- [x] I carefully followed the [README.md](https://github.com/ggml-org/llama.cpp/blob/master/README.md).
- [x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- [x] I reviewed the [Discussions](https://github.com/ggml-org/llama.cpp/discussions), and have a new and useful enhancement to share.

### Feature Description

- [x] "Continue response" button (with an icon that looks something like this: [>>]), shown when inference is stopped
- [ ] WebUI offline caching (similar to how hexed.it does it, for example)
- [ ] (Somewhat related to the 2nd item) While the model is loading, instead of showing only the "Model is loading" text, load the homepage as usual. There is a phase where the WebUI "waits for the server" (I forgot what it is called); the server could send its "model is still loading" error response there, and the error handler that catches it would show "Loading the model" on the homepage and recheck every 5 seconds
- [ ] Multiple languages (assuming it is not too hard or costly in size to implement; I don't know whether this is already implemented, because I don't see a language toggle)
- [x] For mobile users, pressing Enter should not send the query; the user can press the submit button instead
- [ ] After inference finishes, run another inference to generate a short summary to use as the chat title
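
The "recheck every 5 seconds" idea from the third item could look roughly like the sketch below. It is only an illustration, not the WebUI's actual code: `classifyProbe`, `waitForModel`, `ProbeResult` and `ServerState` are hypothetical names, and it assumes the server answers with HTTP 503 while the model is still loading.

```typescript
// Hypothetical names throughout; only the 5-second interval comes from the request above.
type ServerState = "ready" | "loading" | "error";

interface ProbeResult {
  status: number; // HTTP status code returned by the server
}

// Assumption: 503 means "model is still loading", any 2xx means the server is ready.
function classifyProbe(result: ProbeResult): ServerState {
  if (result.status >= 200 && result.status < 300) return "ready";
  if (result.status === 503) return "loading";
  return "error";
}

// Polls until the server is no longer "loading". The homepage stays rendered
// and only reacts to the states reported through onState.
async function waitForModel(
  probe: () => Promise<ProbeResult>,
  onState: (state: ServerState) => void,
  intervalMs: number = 5000,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<ServerState> {
  for (;;) {
    const state = classifyProbe(await probe());
    onState(state); // e.g. show a "Loading the model" banner on the homepage
    if (state !== "loading") return state;
    await sleep(intervalMs); // wait before rechecking
  }
}
```

Here `probe` would wrap whatever request the WebUI already sends while it waits for the server, and `onState` would toggle the loading banner. Passing `sleep` as a parameter keeps the loop easy to test without real timers.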

### Motivation

- When the user stops inference, or the output limit (not the context limit) is reached, the user currently has to tell the AI to "continue", which is likely to break the response. A "continue response" button (maybe next to the "regenerate response" button) would instead continue the inference when pressed. I've seen this somewhere but forgot where; maybe DeepSeek has it?
- Faster loading times
- Cleaner UI for showing when the model is loading
- Not everyone understands English, and some AI frontends support multiple languages
- For mobile users, the "Shift" key on almost all virtual keyboards only switches between letters and symbols; it can't send a literal Shift key press together with Enter, for example
- Makes the chat title readable, but it could be slow to generate, especially if long texts are involved, so this should be an optional toggle. So far I have only seen this in ChatGPT
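
As a rough illustration of the Enter-key behaviour described above, the decision could be isolated in one small function. This is a sketch under assumptions, not the WebUI's actual handler: `resolveEnterAction`, `EnterKeyInput` and `EnterAction` are hypothetical names, and how `isMobile` gets detected is left open.

```typescript
// Hypothetical helper; the fields mirror the KeyboardEvent properties of the same
// names, plus an isMobile flag that the caller has to supply.
interface EnterKeyInput {
  key: string;
  shiftKey: boolean;
  isComposing: boolean; // true while an IME composition is in progress
  isMobile: boolean;
}

type EnterAction = "submit" | "newline" | "ignore";

function resolveEnterAction(input: EnterKeyInput): EnterAction {
  // Not an Enter press, or Enter is confirming IME input: leave it alone.
  if (input.key !== "Enter" || input.isComposing) return "ignore";
  // Virtual keyboards cannot send Shift+Enter, so on mobile Enter always
  // inserts a newline and sending is left to the submit button.
  if (input.isMobile) return "newline";
  // Desktop: Enter sends, Shift+Enter inserts a newline.
  return input.shiftKey ? "newline" : "submit";
}
```

A `keydown` handler would send the message only when the result is `"submit"`. One possible (assumed) source for `isMobile` is `window.matchMedia("(pointer: coarse)").matches`, though a user setting might be more reliable.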

### Possible Implementation

_No response_

Feature Request: multiple llama-server WebUI FRs #16839
