Description
Check Existing Issues
- I have searched for any existing and/or related issues.
- I have searched for any existing and/or related discussions.
- I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!).
- I am using the latest version of Open WebUI.
Installation Method
Git Clone
Open WebUI Version
v0.6.41
Ollama Version (if applicable)
No response
Operating System
macOS Tahoe
Browser (if applicable)
Chrome
Confirmation
- I have read and followed all instructions in README.md.
- I am using the latest version of both Open WebUI and Ollama.
- I have included the browser console logs.
- I have included the Docker container logs.
- I have provided every relevant configuration, setting, and environment variable used in my setup.
- I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
- I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
- Start with the initial platform/version/OS and dependencies used,
- Specify exact install/launch/configure commands,
- List URLs visited, user input (incl. example values/emails/passwords if needed),
- Describe all options and toggles enabled or changed,
- Include any files or environmental changes,
- Identify the expected and actual result at each stage,
- Ensure any reasonably skilled user can follow and hit the same issue.
Expected Behavior
When using an external reranker (via RAG_RERANKING_MODEL_TRUST_REMOTE_CODE=true with an external reranking endpoint), the application should remain responsive while waiting for reranking API responses. Other users should be able to continue using the application normally.
Actual Behavior
The entire Open WebUI application freezes and becomes unresponsive while waiting for the external reranker HTTP response. All users experience a complete outage until the reranking request completes. In testing, this caused outages of 9-36+ seconds per reranking call.
Steps to Reproduce
- Deploy Open WebUI using Docker
- Configure an external reranker:
  - Set `RAG_RERANKING_MODEL` to use an external model
  - Set `RAG_RERANKING_MODEL_TRUST_REMOTE_CODE=true`
  - Configure `RAG_EXTERNAL_RERANKER_URL` to point to a reranking endpoint (e.g., a Bedrock Cohere rerank endpoint via an API gateway)
- Upload a document to a knowledge base
- In a new chat, attach the knowledge base and send a query that triggers RAG retrieval
- While the reranker is processing, attempt to access any other page in Open WebUI from a different browser tab or as a different user
- Observe that the entire application is frozen/unresponsive until the reranking completes
Logs & Screenshots
BEFORE FIX - Application freezes during reranking
Note the timestamps showing ~9 second gap between reranker call (01:13:23.547) and next log (01:13:32.582), and ~23 second gap between subsequent reranker calls. During these periods, the application was completely unresponsive:
```
2025-12-12 01:13:16.029 | INFO | uvicorn.protocols.http.httptools_impl:send:476 - 172.64.66.1:63824 - "GET /api/v1/chats/?page=1 HTTP/1.1" 200
2025-12-12 01:13:18.508 | INFO | uvicorn.protocols.http.httptools_impl:send:476 - 172.64.66.1:63824 - "GET /_app/version.json HTTP/1.1" 200
2025-12-12 01:13:22.968 | INFO | open_webui.retrieval.utils:query_collection_with_hybrid_search:484 - Starting hybrid search for 3 queries in 1 collections...
2025-12-12 01:13:23.547 | INFO | open_webui.retrieval.models.external:predict:41 - ExternalReranker:predict:model bedrock-cohere-rerank-english-v3.0
2025-12-12 01:13:23.547 | INFO | open_webui.retrieval.models.external:predict:42 - ExternalReranker:predict:query best line from Shakespeare

# ~9 SECOND GAP - APPLICATION FROZEN

2025-12-12 01:13:32.582 | INFO | open_webui.retrieval.utils:query_doc_with_hybrid_search:307 - query_doc_with_hybrid_search:result [...]
2025-12-12 01:13:32.642 | INFO | open_webui.retrieval.models.external:predict:41 - ExternalReranker:predict:model bedrock-cohere-rerank-english-v3.0
2025-12-12 01:13:32.642 | INFO | open_webui.retrieval.models.external:predict:42 - ExternalReranker:predict:query top Shakespeare quotes and lines

# ~23 SECOND GAP - APPLICATION FROZEN

2025-12-12 01:13:56.120 | INFO | open_webui.retrieval.models.external:predict:41 - ExternalReranker:predict:model bedrock-cohere-rerank-english-v3.0
2025-12-12 01:13:56.120 | INFO | open_webui.retrieval.models.external:predict:42 - ExternalReranker:predict:query most famous Shakespeare quotes
2025-12-12 01:13:59.054 | INFO | open_webui.retrieval.utils:query_doc_with_hybrid_search:307 - query_doc_with_hybrid_search:result [...]
2025-12-12 01:13:59.136 | INFO | open_webui.routers.openai:get_all_models:476 - get_all_models()
2025-12-12 01:13:59.147 | INFO | uvicorn.protocols.http.httptools_impl:send:476 - 172.64.66.1:60092 - "GET / HTTP/1.1" 304
```
AFTER FIX - Application remains responsive during reranking
Note how HTTP requests continue to be processed while reranking occurs in the background:
```
2025-12-12 01:25:02.994 | INFO | uvicorn.protocols.http.httptools_impl:send:476 - 172.64.66.1:32856 - "GET /api/v1/chats/?page=1 HTTP/1.1" 200
2025-12-12 01:25:03.030 | INFO | uvicorn.protocols.http.httptools_impl:send:476 - 172.64.66.1:51896 - "GET /api/v1/functions/ HTTP/1.1" 200
2025-12-12 01:25:03.030 | INFO | uvicorn.protocols.http.httptools_impl:send:476 - 172.64.66.1:53740 - "GET /api/v1/tools/ HTTP/1.1" 200
2025-12-12 01:25:03.041 | INFO | uvicorn.protocols.http.httptools_impl:send:476 - 172.64.66.1:33688 - "GET /api/v1/functions/ HTTP/1.1" 200
2025-12-12 01:25:03.258 | INFO | uvicorn.protocols.http.httptools_impl:send:476 - 172.64.66.1:35218 - "GET /api/version/updates HTTP/1.1" 200

# Reranking completes while other requests continue to be served
2025-12-12 01:25:04.064 | INFO | open_webui.retrieval.utils:query_doc_with_hybrid_search:307 - query_doc_with_hybrid_search:result [...]
2025-12-12 01:25:04.389 | INFO | open_webui.retrieval.utils:query_doc_with_hybrid_search:307 - query_doc_with_hybrid_search:result [...]

# Application continues responding normally
2025-12-12 01:25:04.847 | INFO | uvicorn.protocols.http.httptools_impl:send:476 - 172.64.66.1:35218 - "GET /api/config HTTP/1.1" 200
2025-12-12 01:25:04.860 | INFO | uvicorn.protocols.http.httptools_impl:send:476 - 172.64.66.1:35218 - "GET /api/v1/auths/ HTTP/1.1" 200
2025-12-12 01:25:04.874 | INFO | uvicorn.protocols.http.httptools_impl:send:476 - 172.64.66.1:33688 - "GET /api/config HTTP/1.1" 200
```
Additional Information
Root Cause Analysis
The `ExternalReranker.predict()` method in `open_webui/retrieval/models/external.py` makes a synchronous `requests.post()` call that is invoked directly from an async FastAPI endpoint without being offloaded to a thread pool. This blocks the Python event loop for the full duration of the HTTP request, so no other request can be served until the reranker responds.
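The starvation can be reproduced in isolation. The sketch below is not Open WebUI code: `time.sleep()` stands in for the synchronous reranker HTTP call, and a heartbeat coroutine plays the role of other users' requests. Run directly on the event loop, the sync call freezes the heartbeat for its full duration; offloaded via `asyncio.to_thread()`, it does not:

```python
import asyncio
import time


def slow_rerank(query: str) -> list[float]:
    """Stand-in for the synchronous requests.post() call to the reranker."""
    time.sleep(1.0)
    return [0.9, 0.1]


async def measure(offload: bool) -> float:
    """Longest gap between heartbeat ticks while a rerank call is in flight."""
    ticks: list[float] = []

    async def heartbeat():
        # Plays the role of other users' requests being served by the loop.
        for _ in range(15):
            ticks.append(time.monotonic())
            await asyncio.sleep(0.1)

    async def rerank_request():
        await asyncio.sleep(0.05)  # let the heartbeat start first
        if offload:
            # Runs the sync call in a worker thread; the loop stays free.
            await asyncio.to_thread(slow_rerank, "query")
        else:
            # Runs the sync call on the loop; everything else stalls.
            slow_rerank("query")

    await asyncio.gather(heartbeat(), rerank_request())
    return max(b - a for a, b in zip(ticks, ticks[1:]))


blocking_gap = asyncio.run(measure(offload=False))
offloaded_gap = asyncio.run(measure(offload=True))
print(f"blocking: {blocking_gap:.2f}s stall, offloaded: {offloaded_gap:.2f}s")
```

In the blocking case the heartbeat stalls for roughly the full one-second call; with `to_thread()` the gaps stay near the 0.1 s tick interval, mirroring the before/after logs above.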
Proposed Fix
Two changes are required:
1. Wrap the reranking call with asyncio.to_thread() in retrieval/utils.py:
```python
# Before (line ~1284)
if reranking:
    scores = self.reranking_function(query, documents)

# After
if reranking:
    scores = await asyncio.to_thread(self.reranking_function, query, documents)
```

2. Add a timeout to `requests.post()` in `retrieval/models/external.py`:

```python
# Before (line ~52)
r = requests.post(
    f"{self.url}",
    headers=headers,
    json=payload,
)

# After
r = requests.post(
    f"{self.url}",
    headers=headers,
    json=payload,
    timeout=60,
)
```

This approach is consistent with existing patterns in the codebase (e.g., `asyncio.to_thread()` is already used for embedding operations around line 790 of `retrieval/utils.py`).
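The value of the `timeout=` argument is easy to demonstrate outside Open WebUI. The sketch below is illustrative only: a throwaway local HTTP server stands in for an unresponsive reranker upstream, and `requests.post()` returns control to the caller once the timeout elapses instead of hanging indefinitely:

```python
import http.server
import threading
import time

import requests  # the same HTTP client the reranker code uses


class SlowHandler(http.server.BaseHTTPRequestHandler):
    def do_POST(self):
        time.sleep(3)  # never answers within the client's timeout

    def log_message(self, *args):  # keep the demo output quiet
        pass


# Throwaway local server standing in for an unresponsive reranker endpoint.
server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), SlowHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_address[1]}/v1/rerank"

start = time.monotonic()
try:
    # Without timeout=, this call would block for as long as the upstream
    # takes (potentially forever); with it, the caller regains control.
    requests.post(url, json={"query": "q", "documents": []}, timeout=0.5)
    timed_out = False
except requests.exceptions.Timeout:
    timed_out = True
elapsed = time.monotonic() - start

server.shutdown()
print(f"timed out: {timed_out} after {elapsed:.1f}s")
```

Note that `timeout=` bounds each connect/read, not the whole request, but it is enough to guarantee a hung upstream cannot stall a worker thread forever.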
Example .env
```
# Ollama URL for the backend to connect
# The path '/ollama' will be redirected to the specified backend URL
OLLAMA_BASE_URL='http://localhost:11434'

# OpenAI Configuration
OPENAI_API_BASE_URL='http://host.docker.internal:4000/v1'
OPENAI_API_KEY='xxx'

AUTOMATIC1111_BASE_URL='http://localhost:7860'

# Application Configuration
ENV='dev'
ENABLE_PERSISTENT_CONFIG=false
DEFAULT_MODELS='gpt-5'
TASK_MODEL_EXTERNAL='gpt-5-nano'
ENABLE_COMMUNITY_SHARING=false
ENABLE_API_KEY=true
ENABLE_OLLAMA_API=false
ENABLE_DIRECT_CONNECTIONS=false
WEBUI_URL='http://localhost:3000'
GLOBAL_LOG_LEVEL='INFO'
RESET_CONFIG_ON_START=true

# RAG Configuration
PDF_EXTRACT_IMAGES=true
ENABLE_RAG_HYBRID_SEARCH=true
RAG_TOP_K=5
RAG_EMBEDDING_ENGINE='openai'
RAG_EMBEDDING_MODEL='text-embedding-3-small'
RAG_OPENAI_API_BASE_URL='http://host.docker.internal:4000/v1'
RAG_OPENAI_API_KEY='xxx'
RAG_FILE_MAX_SIZE=9
RAG_EMBEDDING_BATCH_SIZE=2048
VECTOR_DB='pgvector'
ENABLE_AUTOCOMPLETE_GENERATION=true
RAG_RERANKING_ENGINE='external'
RAG_RERANKING_MODEL='bedrock-cohere-rerank-english-v3.0'
RAG_EXTERNAL_RERANKER_URL='http://custom-text-litellm:4000/v1/rerank'
RAG_EXTERNAL_RERANKER_API_KEY='xxx'

# Web Search Configuration
ENABLE_WEB_SEARCH=true
ENABLE_SEARCH_QUERY_GENERATION=true
WEB_SEARCH_ENGINE='searxng'
WEB_SEARCH_CONCURRENT_REQUESTS=3
WEB_SEARCH_RESULT_COUNT=1
SEARXNG_QUERY_URL='http://searxng:8080/search?q='

# Audio Configuration
AUDIO_STT_ENGINE='azure'
AUDIO_STT_AZURE_REGION='xxx'
AUDIO_STT_AZURE_LOCALES='en-US,en-GB'
AUDIO_TTS_ENGINE='azure'
AUDIO_TTS_OPENAI_API_BASE_URL='xxx'
AUDIO_TTS_AZURE_SPEECH_REGION='xxx'
AUDIO_TTS_AZURE_SPEECH_OUTPUT_FORMAT='ogg-48khz-16bit-mono-opus'
AUDIO_TTS_VOICE='en-AU-NatashaNeural'
WHISPER_MODEL_AUTO_UPDATE=true

# Database Configuration
DATABASE_URL='xxx'
PGVECTOR_DB_URL='xxx'

# Tool Server Connections (JSON configuration)
TOOL_SERVER_CONNECTIONS='[]'

# Authentication & OAuth Configuration
ENABLE_LOGIN_FORM=false
ENABLE_OAUTH_SIGNUP=true
DEFAULT_USER_ROLE='user'
JWT_EXPIRES_IN='12h'
OAUTH_MERGE_ACCOUNTS_BY_EMAIL=true
OPENID_PROVIDER_URL='xxx'
MICROSOFT_REDIRECT_URI='http://localhost:3000/oauth/microsoft/callback'
OAUTH_GROUP_CLAIM='groups'
ENABLE_OAUTH_GROUP_CREATION=true
ENABLE_OAUTH_GROUP_MANAGEMENT=true

# Session & Cookie Configuration
WEBUI_SESSION_COOKIE_SECURE=true
WEBUI_SESSION_COOKIE_SAME_SITE='lax'
WEBUI_AUTH_COOKIE_SAME_SITE='lax'

# OneDrive Integration
ENABLE_ONEDRIVE_INTEGRATION=true
ENABLE_ONEDRIVE_BUSINESS=true
ENABLE_ONEDRIVE_PERSONAL=false
ONEDRIVE_SHAREPOINT_URL='xxx'

# Content Extraction
CONTENT_EXTRACTION_ENGINE='tika'
TIKA_SERVER_URL='http://tika:9998'

# Image Generation
ENABLE_IMAGE_GENERATION=true
IMAGE_GENERATION_ENGINE='openai'
IMAGES_OPENAI_API_BASE_URL='http://litellm-image-models:4001/v1'
IMAGES_OPENAI_API_KEY='xxx'
IMAGE_GENERATION_MODEL='gpt-image-1'
IMAGE_SIZE='1536x1024'
IMAGE_STEPS=24

# Storage Configuration
STORAGE_PROVIDER='s3'
S3_REGION_NAME='xxx'
S3_BUCKET_NAME='xxx'

# For production, you should only need one host as
# fastapi serves the svelte-kit built frontend and backend from the same host and port.
# To test with CORS locally, you can set something like
# CORS_ALLOW_ORIGIN='http://localhost:5173;http://localhost:8080'
CORS_ALLOW_ORIGIN='*'

# For production you should set this to match the proxy configuration (127.0.0.1)
FORWARDED_ALLOW_IPS='*'

# DO NOT TRACK
SCARF_NO_ANALYTICS=true
DO_NOT_TRACK=true
ANONYMIZED_TELEMETRY=false

# DEV OWUI Secret Key
WEBUI_SECRET_KEY='xxx'

# AWS Secrets
AWS_ACCESS_KEY_ID='xxx'
AWS_SECRET_ACCESS_KEY='xxx'
AWS_SESSION_TOKEN='xxx'

# Azure Secrets
AZURE_API_KEY='xxx'
AZURE_API_IMAGE_KEY='xxx'
AUDIO_TTS_API_KEY='xxx'
AUDIO_STT_AZURE_API_KEY='xxx'

# Entra Secrets
MICROSOFT_CLIENT_ID='xxx'
MICROSOFT_CLIENT_SECRET='xxx'
ONEDRIVE_CLIENT_ID_BUSINESS='xxx'
MICROSOFT_CLIENT_TENANT_ID='xxx'
ONEDRIVE_SHAREPOINT_TENANT_ID='xxx'
```
Example docker compose
```yaml
services:
  postgres:
    image: postgres:17
    container_name: postgres
    volumes:
      - postgres-data:/var/lib/postgresql/data
    ports:
      - '5432:5432'
    environment:
      - POSTGRES_DB=${POSTGRES_DB}
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
    restart: unless-stopped

  pgvector:
    image: pgvector/pgvector:pg17
    container_name: pgvector
    volumes:
      - vectorpg-data:/var/lib/postgresql/data
    ports:
      - '5431:5432'
    environment:
      - POSTGRES_DB=${POSTGRES_DB}
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
    restart: unless-stopped

  open-webui:
    build:
      context: .
      dockerfile: Dockerfile
      args:
        REBUILD: true
    container_name: open-webui
    volumes:
      - open-webui:/app/backend/data
    ports:
      - '${OPEN_WEBUI_PORT-3000}:8080'
    environment:
      - WEBUI_NAME=${WEBUI_NAME}
      - ENV=${ENV}
      - ENABLE_PERSISTENT_CONFIG=${ENABLE_PERSISTENT_CONFIG}
      - OPENAI_API_BASE_URL=${OPENAI_API_BASE_URL}
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - PDF_EXTRACT_IMAGES=${PDF_EXTRACT_IMAGES}
      - ENABLE_RAG_HYBRID_SEARCH=${ENABLE_RAG_HYBRID_SEARCH}
      - RAG_TOP_K=${RAG_TOP_K}
      - RAG_EMBEDDING_ENGINE=${RAG_EMBEDDING_ENGINE}
      - RAG_EMBEDDING_MODEL=${RAG_EMBEDDING_MODEL}
      - RAG_OPENAI_API_BASE_URL=${RAG_OPENAI_API_BASE_URL}
      - RAG_OPENAI_API_KEY=${RAG_OPENAI_API_KEY}
      - ENABLE_WEB_SEARCH=${ENABLE_WEB_SEARCH}
      - ENABLE_SEARCH_QUERY_GENERATION=${ENABLE_SEARCH_QUERY_GENERATION}
      - WEB_SEARCH_ENGINE=${WEB_SEARCH_ENGINE}
      - WEB_SEARCH_CONCURRENT_REQUESTS=${WEB_SEARCH_CONCURRENT_REQUESTS}
      - RAG_FILE_MAX_SIZE=${RAG_FILE_MAX_SIZE}
      - AUDIO_STT_ENGINE=${AUDIO_STT_ENGINE}
      - AUDIO_STT_AZURE_REGION=${AUDIO_STT_AZURE_REGION}
      - AUDIO_STT_AZURE_LOCALES=${AUDIO_STT_AZURE_LOCALES}
      - AUDIO_TTS_ENGINE=${AUDIO_TTS_ENGINE}
      - AUDIO_TTS_OPENAI_API_BASE_URL=${AUDIO_TTS_OPENAI_API_BASE_URL}
      - AUDIO_TTS_AZURE_SPEECH_REGION=${AUDIO_TTS_AZURE_SPEECH_REGION}
      - AUDIO_TTS_AZURE_SPEECH_OUTPUT_FORMAT=${AUDIO_TTS_AZURE_SPEECH_OUTPUT_FORMAT}
      - AUDIO_TTS_VOICE=${AUDIO_TTS_VOICE}
      - WHISPER_MODEL_AUTO_UPDATE=${WHISPER_MODEL_AUTO_UPDATE}
      - GLOBAL_LOG_LEVEL=${GLOBAL_LOG_LEVEL}
      - RESET_CONFIG_ON_START=${RESET_CONFIG_ON_START}
      - VECTOR_DB=${VECTOR_DB}
      - ENABLE_AUTOCOMPLETE_GENERATION=${ENABLE_AUTOCOMPLETE_GENERATION}
      - RAG_EMBEDDING_BATCH_SIZE=${RAG_EMBEDDING_BATCH_SIZE}
      - DATABASE_URL=${DATABASE_URL}
      - PGVECTOR_DB_URL=${PGVECTOR_DB_URL}
      - AZURE_API_ENDPOINT_EU=${AZURE_API_ENDPOINT_EU}
      - TOOL_SERVER_CONNECTIONS=${TOOL_SERVER_CONNECTIONS}
      - DEFAULT_MODELS=${DEFAULT_MODELS}
      - TASK_MODEL_EXTERNAL=${TASK_MODEL_EXTERNAL}
      - ENABLE_COMMUNITY_SHARING=${ENABLE_COMMUNITY_SHARING}
      - ENABLE_API_KEY=${ENABLE_API_KEY}
      - ENABLE_OLLAMA_API=${ENABLE_OLLAMA_API}
      - ENABLE_DIRECT_CONNECTIONS=${ENABLE_DIRECT_CONNECTIONS}
      - WEBUI_URL=${WEBUI_URL}
      - ENABLE_LOGIN_FORM=${ENABLE_LOGIN_FORM}
      - ENABLE_OAUTH_SIGNUP=${ENABLE_OAUTH_SIGNUP}
      - DEFAULT_USER_ROLE=${DEFAULT_USER_ROLE}
      - JWT_EXPIRES_IN=${JWT_EXPIRES_IN}
      - OAUTH_MERGE_ACCOUNTS_BY_EMAIL=${OAUTH_MERGE_ACCOUNTS_BY_EMAIL}
      - OPENID_PROVIDER_URL=${OPENID_PROVIDER_URL}
      - MICROSOFT_REDIRECT_URI=${MICROSOFT_REDIRECT_URI}
      - OAUTH_GROUP_CLAIM=${OAUTH_GROUP_CLAIM}
      - ENABLE_OAUTH_GROUP_CREATION=${ENABLE_OAUTH_GROUP_CREATION}
      - ENABLE_OAUTH_GROUP_MANAGEMENT=${ENABLE_OAUTH_GROUP_MANAGEMENT}
      - CORS_ALLOW_ORIGIN=${CORS_ALLOW_ORIGIN}
      - WEBUI_SESSION_COOKIE_SECURE=${WEBUI_SESSION_COOKIE_SECURE}
      - WEBUI_SESSION_COOKIE_SAME_SITE=${WEBUI_SESSION_COOKIE_SAME_SITE}
      - WEBUI_AUTH_COOKIE_SAME_SITE=${WEBUI_AUTH_COOKIE_SAME_SITE}
      - ENABLE_ONEDRIVE_INTEGRATION=${ENABLE_ONEDRIVE_INTEGRATION}
      - ONEDRIVE_SHAREPOINT_URL=${ONEDRIVE_SHAREPOINT_URL}
      - CONTENT_EXTRACTION_ENGINE=${CONTENT_EXTRACTION_ENGINE}
      - TIKA_SERVER_URL=${TIKA_SERVER_URL}
      - SEARXNG_QUERY_URL=${SEARXNG_QUERY_URL}
      - WEB_SEARCH_RESULT_COUNT=${WEB_SEARCH_RESULT_COUNT}
      - ENABLE_IMAGE_GENERATION=${ENABLE_IMAGE_GENERATION}
      - IMAGE_GENERATION_ENGINE=${IMAGE_GENERATION_ENGINE}
      - IMAGES_OPENAI_API_BASE_URL=${IMAGES_OPENAI_API_BASE_URL}
      - IMAGES_OPENAI_API_KEY=${IMAGES_OPENAI_API_KEY}
      - IMAGE_GENERATION_MODEL=${IMAGE_GENERATION_MODEL}
      - IMAGE_SIZE=${IMAGE_SIZE}
      - IMAGE_STEPS=${IMAGE_STEPS}
      - STORAGE_PROVIDER=${STORAGE_PROVIDER}
      - S3_REGION_NAME=${S3_REGION_NAME}
      - S3_BUCKET_NAME=${S3_BUCKET_NAME}
      - WEBUI_SECRET_KEY=${WEBUI_SECRET_KEY}
      - MICROSOFT_CLIENT_ID=${MICROSOFT_CLIENT_ID}
      - MICROSOFT_CLIENT_SECRET=${MICROSOFT_CLIENT_SECRET}
      - MICROSOFT_CLIENT_TENANT_ID=${MICROSOFT_CLIENT_TENANT_ID}
      - ONEDRIVE_CLIENT_ID_BUSINESS=${ONEDRIVE_CLIENT_ID_BUSINESS}
      - ENABLE_ONEDRIVE_BUSINESS=${ENABLE_ONEDRIVE_BUSINESS}
      - ENABLE_ONEDRIVE_PERSONAL=${ENABLE_ONEDRIVE_PERSONAL}
      - ONEDRIVE_SHAREPOINT_TENANT_ID=${ONEDRIVE_SHAREPOINT_TENANT_ID}
      - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
      - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
      - AWS_SESSION_TOKEN=${AWS_SESSION_TOKEN}
      - AUDIO_TTS_API_KEY=${AUDIO_TTS_API_KEY}
      - AUDIO_STT_AZURE_API_KEY=${AUDIO_STT_AZURE_API_KEY}
      - DEFAULT_PROMPT_SUGGESTIONS=${DEFAULT_PROMPT_SUGGESTIONS}
      - RAG_RERANKING_ENGINE=${RAG_RERANKING_ENGINE}
      - RAG_RERANKING_MODEL=${RAG_RERANKING_MODEL}
      - RAG_EXTERNAL_RERANKER_URL=${RAG_EXTERNAL_RERANKER_URL}
      - RAG_EXTERNAL_RERANKER_API_KEY=${RAG_EXTERNAL_RERANKER_API_KEY}
    extra_hosts:
      - host.docker.internal:host-gateway
    restart: unless-stopped
    depends_on: [postgres, tika, pgvector]

  text-litellm:
    build:
      context: ../litellm-text-models
      dockerfile: Dockerfile
      args:
        REBUILD: true
    image: custom-text-litellm:latest
    container_name: custom-text-litellm
    ports:
      - '4000:4000'
    environment:
      - AZURE_API_ENDPOINT=${AZURE_API_ENDPOINT}
      - AZURE_API_KEY=${AZURE_API_KEY}
      - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
      - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
      - AWS_SESSION_TOKEN=${AWS_SESSION_TOKEN}
    restart: unless-stopped

  tika:
    image: apache/tika:latest-full
    container_name: tika
    ports:
      - '9998:9998'
    environment:
      - FORCE_REDEPLOY=true
    restart: unless-stopped

volumes:
  open-webui: {}
  postgres-data: {}
  vectorpg-data: {}
```