[Feature]: Manual context length entry on initial setup for custom endpoints #2007

### Problem or Use Case

I just walked through setup against a local Qwen3.5-9B (Q4_K_M GGUF) served by llama.cpp, configured with a context size of ~150k tokens (which leaves me a little headroom on my 4070 Ti Super).

The /v1/models endpoint doesn't expose a context size, so I believe I got fuzzy-matched to 32k, and I had to edit the context_length_cache.yaml file by hand to set the value I had actually configured.

### Proposed Solution

It would be nice if I could enter the context length directly during initial setup for a custom endpoint (maybe with local and remote endpoints as separate options, to allow a more nuanced flow). Even the probe for unknown models would have forced me into a bucket that doesn't match the local configuration I'm using.
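A rough sketch of the input handling such a setup step might need, so that values like `150000` or `150k` are both accepted. The function name `parse_context_length` is hypothetical and not taken from the codebase, and treating `k` as 1,000 (rather than 1,024) is an assumption:

```python
import re
from typing import Optional

# Assumption: "k" means 1,000 tokens and "m" means 1,000,000 tokens.
_SUFFIXES = {"": 1, "k": 1_000, "m": 1_000_000}


def parse_context_length(raw: str) -> Optional[int]:
    """Parse user input such as '150000', '150k' or '~150K' into a token count.

    Returns None for empty input, meaning "not provided, fall back to detection".
    Raises ValueError for input that is not a positive token count.
    """
    text = raw.strip().lower().replace(",", "").replace("_", "").lstrip("~")
    if not text:
        return None
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*([km]?)", text)
    if match is None:
        raise ValueError(f"not a context length: {raw!r}")
    tokens = int(float(match.group(1)) * _SUFFIXES[match.group(2)])
    if tokens <= 0:
        raise ValueError("context length must be positive")
    return tokens
```

Returning `None` for an empty answer would let the setup flow keep its current detection behavior when the user just presses Enter.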

### Alternatives Considered

One alternative is to do nothing, or to patch llama.cpp so that llama-server exposes the value we want, but the current behavior did make setup less ergonomic for me.

An environment variable would also work, but it raises the question of how it should interact with the existing cached context size. It would most likely act as an override, which makes it the more general solution, but then how does the value get persisted to the gateway? Overall this is less ergonomic than a simple setup option.
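To make the override question concrete, here is a minimal sketch of one possible precedence order: environment variable first, then the cached value, then whatever the probe returned. The variable name `CUSTOM_ENDPOINT_CONTEXT_LENGTH` and the function `resolve_context_length` are made up for illustration, and the cache is modeled as a plain mapping from model name to token count, which may not match the real file layout:

```python
import os
from typing import Mapping, Optional, Tuple

# Hypothetical variable name, used only for this illustration.
ENV_VAR = "CUSTOM_ENDPOINT_CONTEXT_LENGTH"


def resolve_context_length(
    model: str,
    cache: Mapping[str, int],
    probed: Optional[int] = None,
    env: Mapping[str, str] = os.environ,
) -> Tuple[Optional[int], str]:
    """Return (context_length, source), checking env, then cache, then probe."""
    override = env.get(ENV_VAR)
    if override is not None and override.strip():
        # The environment variable wins but is never written back to the cache.
        return int(override), "env"
    if model in cache:
        return cache[model], "cache"
    if probed is not None:
        return probed, "probe"
    return None, "unknown"
```

This sketch deliberately leaves the persistence question open: the override only applies to processes that see the variable, which is the ergonomic gap described above.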

### Feature Type

Configuration option

### Scope

Small (single file, < 50 lines)

### Contribution

- [x] I'd like to implement this myself and submit a PR
