Conversation
Co-authored-by: micr0-dev <26364458+micr0-dev@users.noreply.github.com>
Copilot
AI
changed the title
[WIP] Add option for persistent Ollama model serving
Add persistent Ollama model serving option with configurable keep-alive
Oct 2, 2025
micr0-dev
approved these changes
Oct 2, 2025
Owner
micr0-dev
left a comment
Holy shit, it actually did a decent job what the heck? And this is pretty clean too, honestly I thought we were gonna have to do this via ollama serve, but this is much cleaner. I like.
Problem
Ollama automatically unloads models from RAM after each request, causing significant response time delays as the model needs to be reloaded for every alt-text generation. This is particularly problematic for active Altbot instances handling frequent requests, where users experience noticeable latency.
Solution
This PR adds a configurable `keep_alive` parameter that controls how long Ollama models stay loaded in RAM. The feature leverages Ollama's built-in `--keepalive` flag to maintain models in memory between requests.
Changes
Configuration
New `ollama_keep_alive` config option with three modes:
- `-1`: persistent serving (model stays loaded indefinitely)
- `0`: immediate unload after each request
- A duration string such as `5m`, `30m`, or `1h`

Defaults to `5m` (5 minutes) if not specified.
Implementation
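As a rough sketch of how such a config value could be validated before being passed along to Ollama (the function name and behavior here are hypothetical, not the PR's actual code), the three modes map naturally onto a small normalizer:

```go
package main

import (
	"fmt"
	"time"
)

// normalizeKeepAlive validates an ollama_keep_alive config value.
// Hypothetical helper: "-1" means persistent, "0" means unload
// immediately, and anything else must parse as a Go duration
// string ("5m", "30m", "1h", ...). An empty value falls back to
// the documented default of "5m".
func normalizeKeepAlive(v string) (string, error) {
	switch v {
	case "":
		return "5m", nil // default when not specified
	case "-1", "0":
		return v, nil
	}
	if _, err := time.ParseDuration(v); err != nil {
		return "", fmt.Errorf("invalid keep-alive value %q: %w", v, err)
	}
	return v, nil
}

func main() {
	for _, v := range []string{"", "-1", "30m", "bogus"} {
		got, err := normalizeKeepAlive(v)
		fmt.Printf("%q -> %q (err: %v)\n", v, got, err)
	}
}
```

Rejecting bad values at startup, rather than passing them through to Ollama, surfaces configuration typos immediately instead of as silent unload behavior.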
- Extended `OllamaProvider` to include keep-alive state
- Requests pass the `--keepalive` parameter
- When set to `-1`, the model is pre-loaded during startup with user feedback
User Experience
Usage Example
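The example itself was not preserved in this page capture; the following is a hypothetical configuration snippet, assuming a TOML-style config file and the option name described above:

```toml
# Keep the Ollama model loaded in RAM indefinitely (persistent serving)
ollama_keep_alive = "-1"

# Or: unload after 30 minutes of inactivity
# ollama_keep_alive = "30m"

# Or: unload immediately after each request
# ollama_keep_alive = "0"
```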
Performance Impact
- `-1` (persistent): no reload delay between requests; the model stays in RAM
- `5m` (default): the model reloads only after 5 minutes of inactivity
- `0` (immediate): the model reloads on every request

Benefits
This feature is ideal for the main Altbot instance and other high-traffic deployments where response time is critical.
Resolves #40