OpenAI-compatible server reliability improvements via HTTP/1.1 keep-alive config and error handling #14

Open
BingqingLyu wants to merge 4 commits into main from fork-pr-404-fix-openai-local-server-reliability

BingqingLyu commented Apr 27, 2026

Owner

Summary

This PR improves the reliability of Qwen Code when using local OpenAI-compatible servers by configuring HTTP/1.1 keep-alive on the HTTP agent and adding retry logic and clearer error reporting.

Problem

Sessions against local OpenAI-compatible servers (running on localhost) suffered from:

  • Intermittent connection failures during streaming
  • Premature connection closures
  • Missing retry logic for transient failures
  • Inadequate timeout handling for slower local inference

Solution

1. HTTP Agent Configuration

  • Implemented HTTP/1.1 keep-alive with 4-second timeout to prevent premature disconnections
  • Set connection pool to max 2 connections to avoid overwhelming local servers
  • Disabled HTTP pipelining (set to 0) for compatibility with simpler server implementations
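The three settings above could be expressed roughly as follows. This is a minimal sketch, not the PR's actual code: the constant and function names are hypothetical, and it assumes the agent is an undici `Agent`, whose options include `keepAliveTimeout`, `connections`, and `pipelining`. The module is loaded dynamically so that the caller can fall back to plain `fetch` when undici is missing.

```typescript
// Hypothetical constant names; the values are the ones stated in the PR description.
const KEEP_ALIVE_TIMEOUT_MS = 4_000; // 4-second keep-alive timeout
const MAX_CONNECTIONS = 2; // small pool so a local server is not overwhelmed
const PIPELINING = 0; // pipelining disabled

interface AgentOptions {
  keepAliveTimeout: number;
  connections: number;
  pipelining: number;
}

// Builds the options object that would be passed to the agent constructor.
function buildAgentOptions(): AgentOptions {
  return {
    keepAliveTimeout: KEEP_ALIVE_TIMEOUT_MS,
    connections: MAX_CONNECTIONS,
    pipelining: PIPELINING,
  };
}

// Returns an undici Agent when the module can be loaded, otherwise undefined,
// in which case the caller would use the standard fetch without a dispatcher.
async function createDispatcher(): Promise<unknown> {
  try {
    // Typed as string so TypeScript does not need undici's type declarations.
    const moduleName: string = 'undici';
    const undici = await import(moduleName);
    return new undici.Agent(buildAgentOptions());
  } catch {
    return undefined;
  }
}
```

One point worth verifying during review: undici's documentation describes `pipelining: 0` as disabling keep-alive connections, so how that value interacts with the 4-second keep-alive timeout may deserve a quick check.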

2. Retry Logic

  • Added exponential backoff: the delay starts at 1 second and doubles after each failed attempt, for up to 5 retries
  • Falls back to standard fetch when undici module is not available
  • Retries on HTTP status codes 429, 500, 502, 503 and ECONNRESET errors
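The retry behaviour described above can be sketched as follows. All names here (`backoffDelayMs`, `isRetryable`, `withRetry`) are hypothetical, and the error shape (`status` and `code` properties) is an assumption about how failures surface; the delays, retry count, and retryable conditions are taken from the list above.

```typescript
const INITIAL_DELAY_MS = 1_000; // first retry waits 1 second
const MAX_RETRIES = 5; // up to 5 retries after the initial attempt
const RETRYABLE_STATUS = new Set([429, 500, 502, 503]);

// Assumed error shape: HTTP failures carry `status`, socket failures carry `code`.
interface RetryableError {
  status?: number;
  code?: string;
}

// Delay before the given retry (0-based): 1s, 2s, 4s, 8s, 16s.
function backoffDelayMs(retry: number): number {
  return INITIAL_DELAY_MS * 2 ** retry;
}

function isRetryable(err: unknown): boolean {
  const e = (err ?? {}) as RetryableError;
  const retryableStatus = e.status !== undefined && RETRYABLE_STATUS.has(e.status);
  return retryableStatus || e.code === 'ECONNRESET';
}

// Runs `fn`, retrying transient failures with exponential backoff.
// `sleep` is injectable so the loop can be exercised without real waiting.
async function withRetry<T>(
  fn: () => Promise<T>,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let retry = 0; ; retry++) {
    try {
      return await fn();
    } catch (err) {
      // Give up on non-transient errors or once the retry budget is spent.
      if (retry >= MAX_RETRIES || !isRetryable(err)) throw err;
      await sleep(backoffDelayMs(retry));
    }
  }
}
```

With these values, a request that keeps failing is attempted six times in total and waits 31 seconds across all retries before the last error is rethrown.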

3. Configuration Improvements

  • Consolidated OpenAI configuration with proper hierarchy (CLI args > env vars > settings.json)
  • Added getBackendName() function to display actual backend name ("OpenAI API", "Qwen API", "Vertex AI", etc.) instead of undefined in error messages
  • Consolidated multiple debug console.log calls into single multi-line output
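The precedence rule (CLI args > env vars > settings.json) amounts to taking the first value that is set, checking sources in that order. The sketch below illustrates the idea; the `OpenAISettings` shape and the function names are hypothetical, while the environment variable names are the ones used in this PR's testing instructions. Treating an empty string as "not set" is an assumption.

```typescript
// Hypothetical shape for values coming from CLI args or settings.json.
interface OpenAISettings {
  apiKey?: string;
  baseUrl?: string;
  model?: string;
}

// Returns the first value that is defined and non-empty.
function firstSet(...values: Array<string | undefined>): string | undefined {
  return values.find((v) => v !== undefined && v !== '');
}

// Resolves each field independently: CLI first, then environment, then settings.json.
function resolveOpenAIConfig(
  cli: OpenAISettings,
  env: Record<string, string | undefined>,
  settings: OpenAISettings,
): OpenAISettings {
  return {
    apiKey: firstSet(cli.apiKey, env['OPENAI_API_KEY'], settings.apiKey),
    baseUrl: firstSet(cli.baseUrl, env['OPENAI_BASE_URL'], settings.baseUrl),
    model: firstSet(cli.model, env['OPENAI_MODEL'], settings.model),
  };
}
```

Because each field is resolved on its own, a model passed on the command line can be combined with a base URL from the environment and an API key from settings.json.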

Changes

  • Core improvements: HTTP agent configuration with 4s keep-alive timeout, exponential backoff retry, 10-minute timeout for local inference
  • Configuration: Centralized OpenAI config in config.ts, enforced CLI > env > settings.json precedence
  • Error handling: Error messages now show actual backend name (e.g., "OpenAI API", "Qwen API") instead of undefined in reportError calls
  • Documentation: Added local server configuration section to openai-auth.md with environment variable examples
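For the error-handling change, a name lookup along the following lines would avoid printing `undefined`. This is only an illustration: the auth-type keys, the fallback string, and the function names are hypothetical, and the real `getBackendName` signature is not shown in this PR description. Only the display names are taken from the text above.

```typescript
// Hypothetical auth-type keys mapped to the display names quoted in the PR.
const BACKEND_NAMES = new Map<string, string>([
  ['openai', 'OpenAI API'],
  ['qwen-oauth', 'Qwen API'],
  ['vertex-ai', 'Vertex AI'],
]);

// Returns a human-readable backend name, never undefined.
function getBackendNameSketch(authType: string | undefined): string {
  if (authType === undefined) return 'API'; // hypothetical generic fallback
  return BACKEND_NAMES.get(authType) ?? 'API';
}

// Example of building an error message with the resolved name.
function formatBackendError(authType: string | undefined, detail: string): string {
  return `Error when talking to ${getBackendNameSketch(authType)}: ${detail}`;
}
```

The point of the fallback is that an unknown or missing auth type still yields a readable message rather than the literal text `undefined`.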

Testing

To test with a local OpenAI-compatible server:

export OPENAI_API_KEY="any-value"
export OPENAI_BASE_URL="http://localhost:1234/v1"
export OPENAI_MODEL="your-model-name"
qwen
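To check streaming outside the CLI, a small script can send one streaming request and confirm that the stream ends cleanly. The sketch below assumes the local server implements the OpenAI-style `/chat/completions` endpoint with `stream: true` and terminates the server-sent-event stream with `data: [DONE]`; the function names and the prompt are hypothetical.

```typescript
// Extracts the `data:` payloads from server-sent-event text.
// Stops at the "[DONE]" sentinel used by OpenAI-style streaming responses.
function parseSseData(buffer: string): { payloads: string[]; done: boolean } {
  const payloads: string[] = [];
  for (const line of buffer.split(/\r?\n/)) {
    if (!line.startsWith('data:')) continue; // skip blank lines and comments
    const data = line.slice('data:'.length).trim();
    if (data === '[DONE]') return { payloads, done: true };
    payloads.push(data);
  }
  return { payloads, done: false };
}

// Sends one streaming chat completion and returns the number of chunks received.
// Throws if the request fails or the stream ends without the [DONE] sentinel.
async function checkStreaming(baseUrl: string, apiKey: string, model: string): Promise<number> {
  const res = await fetch(`${baseUrl}/chat/completions`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model,
      stream: true,
      messages: [{ role: 'user', content: 'Say hello.' }],
    }),
  });
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  // Buffers the whole stream, which is enough for a smoke test.
  const text = await res.text();
  const { payloads, done } = parseSseData(text);
  if (!done) throw new Error('stream ended without [DONE]');
  return payloads.length;
}
```

Calling `checkStreaming` with the same values as the exported variables above, several times in a row, gives a rough signal of whether connections are being dropped mid-stream.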

Compatibility

  • Fully backward compatible with cloud OpenAI API
  • No breaking changes to existing configurations
  • Graceful degradation when optional dependencies are missing

Fixes issues with local model inference reliability when using self-hosted models.

ahundt added 4 commits August 20, 2025 19:40
Previous behavior: Local OpenAI-compatible servers experienced intermittent connection failures and streaming issues due to missing HTTP/1.1 keep-alive configuration and inadequate timeout handling.

What changed:
- packages/core/src/core/openaiContentGenerator.ts: Add HTTP agent configuration with proper keep-alive settings, implement retry logic with exponential backoff, add constants for timeout and connection pooling
- packages/cli/src/config/config.ts: Consolidate OpenAI configuration functions, add setupOpenAIFromCliArgs and getEffectiveAuthType for proper config hierarchy (CLI > env > settings)
- packages/cli/src/validateNonInterActiveAuth.ts: Consolidate debug logging into multi-line statements, use getEffectiveAuthType for consistent auth resolution
- packages/cli/src/gemini.tsx: Fix import order, add error logging in debug mode, update imports after config consolidation
- packages/core/src/core/contentGenerator.ts: Add getBackendName function for human-readable API names in error messages
- packages/core/src/core/geminiChat.ts: Add getConfig() method to expose configuration for error reporting
- packages/core/src/core/turn.ts: Use getBackendName for better error messages
- packages/cli/src/nonInteractiveCli.ts: Apply OpenAI CLI arguments before auth initialization
- docs/cli/openai-auth.md: Document local server configuration options

Why: Local OpenAI-compatible servers often have different connection characteristics than cloud services, requiring specific HTTP agent configuration to maintain stable connections. The changes ensure reliable streaming and prevent premature connection closures.

Testable: Run with local OpenAI-compatible server using OPENAI_BASE_URL=http://localhost:1234/v1 and verify stable streaming without connection drops.
This was referenced Apr 28, 2026
BingqingLyu added the conflicting-group-1 (Conflicting PR group 1, review as a batch) and conflicting-pr (Shares at least one cross-PR dependency with other PRs) labels and removed the conflicting-group-1 label May 7, 2026
BingqingLyu commented May 7, 2026

Owner Author

Conflict Group 1

This PR shares modified functions with 20 other PR(s): #10, #112, #113, #114, #117, #17, #18, #20, #21, #22, #31, #36, #46, #6, #7, #71, #75, #86, #88, #94.

These PRs should be reviewed as a batch — merging one may affect the others.

| Function | File | Also modified by |
| --- | --- | --- |
| main | gemini.tsx | #17, #20, #94 |
| parseArguments | config.ts | #10, #112, #113, #114, #117, #17, #18, #21, #22, #31, #36, #46, #7, #86, #88 |
| resolveContentGeneratorConfigWithSources | contentGenerator.ts | #17 |
| runNonInteractive | nonInteractiveCli.ts | #114, #17, #6, #71, #75 |
| startInteractiveUI | gemini.tsx | #114, #117, #17, #31, #6, #88, #94 |
| validateNonInteractiveAuth | validateNonInterActiveAuth.ts | #10, #17 |

graph LR
    PR14["PR #14"]
    Fmain_4875["main<br>gemini.tsx"]
    PR14 -->|modifies| Fmain_4875
    PR17["PR #17"]
    PR17 -->|modifies| Fmain_4875
    PR20["PR #20"]
    PR20 -->|modifies| Fmain_4875
    PR94["PR #94"]
    PR94 -->|modifies| Fmain_4875
    FparseArguments_6977["parseArguments<br>config.ts"]
    PR14 -->|modifies| FparseArguments_6977
    PR10["PR #10"]
    PR10 -->|modifies| FparseArguments_6977
    PR112["PR #112"]
    PR112 -->|modifies| FparseArguments_6977
    PR113["PR #113"]
    PR113 -->|modifies| FparseArguments_6977
    PR114["PR #114"]
    PR114 -->|modifies| FparseArguments_6977
    PR117["PR #117"]
    PR117 -->|modifies| FparseArguments_6977
    PR17 -->|modifies| FparseArguments_6977
    PR18["PR #18"]
    PR18 -->|modifies| FparseArguments_6977
    PR21["PR #21"]
    PR21 -->|modifies| FparseArguments_6977
    PR22["PR #22"]
    PR22 -->|modifies| FparseArguments_6977
    PR31["PR #31"]
    PR31 -->|modifies| FparseArguments_6977
    PR36["PR #36"]
    PR36 -->|modifies| FparseArguments_6977
    PR46["PR #46"]
    PR46 -->|modifies| FparseArguments_6977
    PR7["PR #7"]
    PR7 -->|modifies| FparseArguments_6977
    PR86["PR #86"]
    PR86 -->|modifies| FparseArguments_6977
    PR88["PR #88"]
    PR88 -->|modifies| FparseArguments_6977
    FresolveContentGeneratorConfigWithSources_5386["resolveContentGeneratorConfigWithSources<br>contentGenerator.ts"]
    PR14 -->|modifies| FresolveContentGeneratorConfigWithSources_5386
    PR17 -->|modifies| FresolveContentGeneratorConfigWithSources_5386
    FrunNonInteractive_8467["runNonInteractive<br>nonInteractiveCli.ts"]
    PR14 -->|modifies| FrunNonInteractive_8467
    PR114 -->|modifies| FrunNonInteractive_8467
    PR17 -->|modifies| FrunNonInteractive_8467
    PR6["PR #6"]
    PR6 -->|modifies| FrunNonInteractive_8467
    PR71["PR #71"]
    PR71 -->|modifies| FrunNonInteractive_8467
    PR75["PR #75"]
    PR75 -->|modifies| FrunNonInteractive_8467
    FstartInteractiveUI_4875["startInteractiveUI<br>gemini.tsx"]
    PR14 -->|modifies| FstartInteractiveUI_4875
    PR114 -->|modifies| FstartInteractiveUI_4875
    PR117 -->|modifies| FstartInteractiveUI_4875
    PR17 -->|modifies| FstartInteractiveUI_4875
    PR31 -->|modifies| FstartInteractiveUI_4875
    PR6 -->|modifies| FstartInteractiveUI_4875
    PR88 -->|modifies| FstartInteractiveUI_4875
    PR94 -->|modifies| FstartInteractiveUI_4875
    FvalidateNonInteractiveAuth_1241["validateNonInteractiveAuth<br>validateNonInterActiveAuth.ts"]
    PR14 -->|modifies| FvalidateNonInteractiveAuth_1241
    PR10 -->|modifies| FvalidateNonInteractiveAuth_1241
    PR17 -->|modifies| FvalidateNonInteractiveAuth_1241

Posted by codegraph-ai conflict detection.
