OpenAI-compatible server reliability improvements via HTTP/1.1 keep-alive config and error handling by ahundt · Pull Request #404 · QwenLM/qwen-code

ahundt · 2025-08-20T23:49:49Z

Summary

This PR improves the reliability of Qwen Code when using local OpenAI-compatible servers by implementing proper HTTP/1.1 keep-alive configuration and robust error handling.

Problem

Local OpenAI-compatible servers (running on localhost) were experiencing:

Intermittent connection failures during streaming
Premature connection closures
Missing retry logic for transient failures
Inadequate timeout handling for slower local inference

Solution

1. HTTP Agent Configuration

Implemented HTTP/1.1 keep-alive with 4-second timeout to prevent premature disconnections
Set connection pool to max 2 connections to avoid overwhelming local servers
Disabled HTTP pipelining (set to 0) for compatibility with simpler server implementations
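As a rough illustration, the three settings above could be collected into one options object. This is a sketch, not the PR's actual code: the interface and constant names are hypothetical, and it assumes the values are passed to undici's `Agent` through its `keepAliveTimeout`, `connections`, and `pipelining` options.

```typescript
// Hypothetical names; only the three values come from the list above.
interface LocalServerAgentOptions {
  keepAliveTimeout: number; // milliseconds an idle socket is kept open for reuse
  connections: number; // maximum concurrent connections per origin
  pipelining: number; // maximum requests pipelined over one connection
}

const LOCAL_SERVER_AGENT_OPTIONS: LocalServerAgentOptions = {
  keepAliveTimeout: 4000, // 4-second keep-alive
  connections: 2, // small pool so a local server is not overwhelmed
  pipelining: 0, // pipelining disabled
};

// With undici installed, the options would be handed to its Agent, for example:
//   import { Agent } from 'undici';
//   const dispatcher = new Agent(LOCAL_SERVER_AGENT_OPTIONS);
```

How `pipelining: 0` interacts with keep-alive in undici is worth checking against its documentation before relying on both settings together.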

2. Retry Logic

Added exponential backoff starting at 1 second and doubling on each attempt, for up to 5 retries
Falls back to standard fetch when undici module is not available
Retries on HTTP status codes 429, 500, 502, 503 and ECONNRESET errors
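The retry policy described above can be sketched as follows. The helper names are hypothetical, and the sketch assumes that failed requests surface as errors carrying either an HTTP `status` or a Node-style `code`; the PR's real error shapes may differ.

```typescript
// Hypothetical helper names; the numbers mirror the list above.
const RETRYABLE_STATUS: readonly number[] = [429, 500, 502, 503];

/** Delay before retry number `attempt` (0-based): 1s, 2s, 4s, 8s, 16s. */
function backoffDelayMs(attempt: number, baseMs = 1000): number {
  return baseMs * 2 ** attempt;
}

/** Assumes errors expose an HTTP `status` or a Node-style `code`. */
function isRetryable(err: unknown): boolean {
  if (typeof err !== 'object' || err === null) return false;
  const { status, code } = err as { status?: unknown; code?: unknown };
  const retryableStatus = typeof status === 'number' && RETRYABLE_STATUS.includes(status);
  return retryableStatus || code === 'ECONNRESET';
}

async function withRetry<T>(
  op: () => Promise<T>,
  maxRetries = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await op();
    } catch (err) {
      // Give up on non-transient errors or once the retry budget is spent.
      if (attempt >= maxRetries || !isRetryable(err)) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

The `sleep` parameter is injectable so the waiting can be replaced in tests; in normal use the default timer-based delay applies.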

3. Configuration Improvements

Consolidated OpenAI configuration with proper hierarchy (CLI args > env vars > settings.json)
Added getBackendName() function to display actual backend name ("OpenAI API", "Qwen API", "Vertex AI", etc.) instead of undefined in error messages
Consolidated multiple debug console.log calls into single multi-line output
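The precedence rule (CLI args > env vars > settings.json) amounts to taking the first source that provides a value. A minimal sketch, assuming hypothetical field names (`apiKey`, `baseUrl`, `model`) and treating empty strings as unset; only the environment variable names and the ordering come from this description:

```typescript
// Hypothetical shapes; only the env var names and the precedence order come from the PR text.
interface OpenAISettings {
  apiKey?: string;
  baseUrl?: string;
  model?: string;
}

/** Returns the first candidate that is set and non-empty. */
function firstDefined(...candidates: Array<string | undefined>): string | undefined {
  return candidates.find((value) => value !== undefined && value !== '');
}

function resolveOpenAIConfig(
  cliArgs: OpenAISettings,
  env: Record<string, string | undefined>,
  settingsFile: OpenAISettings,
): OpenAISettings {
  return {
    apiKey: firstDefined(cliArgs.apiKey, env.OPENAI_API_KEY, settingsFile.apiKey),
    baseUrl: firstDefined(cliArgs.baseUrl, env.OPENAI_BASE_URL, settingsFile.baseUrl),
    model: firstDefined(cliArgs.model, env.OPENAI_MODEL, settingsFile.model),
  };
}

const resolvedConfig = resolveOpenAIConfig(
  { model: 'cli-model' },
  { OPENAI_BASE_URL: 'http://localhost:1234/v1', OPENAI_MODEL: 'env-model' },
  { apiKey: 'from-settings', baseUrl: 'https://example.invalid/v1' },
);
// model comes from the CLI, baseUrl from the environment, apiKey from settings.
```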

Changes

Core improvements: HTTP agent configuration with 4s keep-alive timeout, exponential backoff retry, 10-minute timeout for local inference
Configuration: Centralized OpenAI config in config.ts, enforced CLI > env > settings.json precedence
Error handling: Error messages now show actual backend name (e.g., "OpenAI API", "Qwen API") instead of undefined in reportError calls
Documentation: Added local server configuration section to openai-auth.md with environment variable examples
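To illustrate the error-handling change, a lookup like the one below would replace `undefined` with a readable backend name. The auth-type keys, the fallback string, and the `formatApiError` helper are all hypothetical; only the display names ("OpenAI API", "Qwen API", "Vertex AI") are taken from this description.

```typescript
// Hypothetical auth-type keys; the real AuthType values may differ.
const BACKEND_NAMES = new Map<string, string>([
  ['openai', 'OpenAI API'],
  ['qwen-oauth', 'Qwen API'],
  ['vertex-ai', 'Vertex AI'],
]);

/** Maps an auth type to a display name, never returning undefined. */
function getBackendName(authType: string | undefined): string {
  const name = authType !== undefined ? BACKEND_NAMES.get(authType) : undefined;
  return name ?? 'unknown backend'; // fallback text is an assumption
}

/** Hypothetical helper showing how the name would appear in an error message. */
function formatApiError(authType: string | undefined, message: string): string {
  return `Error when talking to ${getBackendName(authType)}: ${message}`;
}
```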

Testing

To test with a local OpenAI-compatible server:

export OPENAI_API_KEY="any-value"
export OPENAI_BASE_URL="http://localhost:1234/v1"
export OPENAI_MODEL="your-model-name"
qwen

Compatibility

Fully backward compatible with cloud OpenAI API
No breaking changes to existing configurations
Graceful degradation when optional dependencies are missing

Fixes issues with local model inference reliability when using self-hosted models.

Previous behavior: Local OpenAI-compatible servers experienced intermittent connection failures and streaming issues due to missing HTTP/1.1 keep-alive configuration and inadequate timeout handling.

What changed:

- packages/core/src/core/openaiContentGenerator.ts: Add HTTP agent configuration with proper keep-alive settings, implement retry logic with exponential backoff, add constants for timeout and connection pooling
- packages/cli/src/config/config.ts: Consolidate OpenAI configuration functions, add setupOpenAIFromCliArgs and getEffectiveAuthType for proper config hierarchy (CLI > env > settings)
- packages/cli/src/validateNonInterActiveAuth.ts: Consolidate debug logging into multi-line statements, use getEffectiveAuthType for consistent auth resolution
- packages/cli/src/gemini.tsx: Fix import order, add error logging in debug mode, update imports after config consolidation
- packages/core/src/core/contentGenerator.ts: Add getBackendName function for human-readable API names in error messages
- packages/core/src/core/geminiChat.ts: Add getConfig() method to expose configuration for error reporting
- packages/core/src/core/turn.ts: Use getBackendName for better error messages
- packages/cli/src/nonInteractiveCli.ts: Apply OpenAI CLI arguments before auth initialization
- docs/cli/openai-auth.md: Document local server configuration options

Why: Local OpenAI-compatible servers often have different connection characteristics than cloud services, requiring specific HTTP agent configuration to maintain stable connections. The changes ensure reliable streaming and prevent premature connection closures.

Testable: Run with local OpenAI-compatible server using OPENAI_BASE_URL=http://localhost:1234/v1 and verify stable streaming without connection drops.

wenshao · 2026-04-19T10:09:40Z

+
+    // Use config values if provided, otherwise use sensible defaults
+    const timeout = contentGeneratorConfig?.timeout || 600000;  // 10 minutes default
+    const maxRetries = contentGeneratorConfig?.maxRetries ?? 2; // 2 retries default


[Critical] fetchOptions is computed in the constructor before Agent is loaded. On the first OpenAIContentGenerator instance Agent is still undefined, so this client is created without the custom dispatcher and never picks up the new keep-alive / timeout settings that this PR is trying to add. Please load undici before new OpenAI(...), or recreate the client after Agent becomes available.

— gpt-5.4 via Qwen Code /review
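One possible shape for the fix requested above is an async factory that resolves the agent before any fetch options are built, so the first instance already carries the dispatcher. This is only a sketch: `ClientHolder`, `AgentCtor`, and `FakeAgent` are hypothetical names, and the loader is injected so the example runs without undici installed.

```typescript
// Hypothetical names throughout; not the PR's actual classes.
type AgentCtor = new (options: { keepAliveTimeout: number }) => object;

class ClientHolder {
  readonly fetchOptions: { dispatcher?: object };

  // Private constructor: instances only come from create(), after the agent lookup has finished.
  private constructor(fetchOptions: { dispatcher?: object }) {
    this.fetchOptions = fetchOptions;
  }

  static async create(loadAgent: () => Promise<AgentCtor>): Promise<ClientHolder> {
    let dispatcher: object | undefined;
    try {
      const Agent = await loadAgent();
      dispatcher = new Agent({ keepAliveTimeout: 4000 });
    } catch {
      // Loader failed (for example, module missing): leave dispatcher unset
      // so the default fetch behaviour applies.
    }
    return new ClientHolder(dispatcher ? { dispatcher } : {});
  }
}

// Stand-in for the Agent class that a dynamic import of undici would provide.
class FakeAgent {
  readonly options: { keepAliveTimeout: number };
  constructor(options: { keepAliveTimeout: number }) {
    this.options = options;
  }
}
```

Because construction is awaited, there is no window in which a client exists without the custom dispatcher, and the fallback path stays explicit.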

wenshao · 2026-04-19T10:09:40Z

+
+  if (!effectiveAuthType) {
+    return;
+  }


[Critical] initializeAuth() now swallows every auth initialization failure here. That changes the sandbox startup path too, because the surrounding try/catch no longer sees the error and cannot fail fast before launching. Broken or expired credentials will now slip past startup and fail later in a much harder-to-diagnose place. Please rethrow after logging, or return an explicit failure result that callers must handle.

— gpt-5.4 via Qwen Code /review
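The "rethrow after logging" option suggested above could look like the sketch below. The function name and its simplified synchronous signature are hypothetical, since the real `initializeAuth()` body is not shown in this thread.

```typescript
// Hypothetical, simplified signature; the real initializeAuth() is not shown here.
function initializeAuthStrict(
  refreshAuth: () => void,
  logError: (message: string) => void,
): void {
  try {
    refreshAuth();
  } catch (err) {
    const detail = err instanceof Error ? err.message : String(err);
    logError(`Auth initialization failed: ${detail}`);
    throw err; // rethrow so the caller's try/catch can still fail fast
  }
}
```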

wenshao · 2026-04-19T10:09:40Z

+  if (process.env.GEMINI_API_KEY) {
+    return AuthType.USE_GEMINI;
+  }
+  if (process.env.GOOGLE_GENAI_USE_VERTEXAI === 'true') {


[Critical] This new shared auth resolver no longer considers QWEN_OAUTH_TOKEN, even though other code paths still treat that env var as a valid signal for Qwen auth. As soon as validateNonInteractiveAuth() switches to this helper, existing token-based Qwen setups stop resolving an auth type in non-interactive/shared flows. Please preserve the previous env-based auth matrix here before making this the source of truth.

— gpt-5.4 via Qwen Code /review
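A resolver that keeps `QWEN_OAUTH_TOKEN` in the env-based matrix might be sketched as follows. The return values and the order of the checks are assumptions made for illustration; only the environment variable names appear in this PR thread.

```typescript
// Hypothetical return values and check order; only the env var names come from this thread.
type ResolvedAuth = 'USE_GEMINI' | 'USE_VERTEX_AI' | 'QWEN_OAUTH' | 'USE_OPENAI';

function resolveAuthTypeFromEnv(
  env: Record<string, string | undefined>,
): ResolvedAuth | undefined {
  if (env.GEMINI_API_KEY) return 'USE_GEMINI';
  if (env.GOOGLE_GENAI_USE_VERTEXAI === 'true') return 'USE_VERTEX_AI';
  if (env.QWEN_OAUTH_TOKEN) return 'QWEN_OAUTH'; // the signal the review says was dropped
  if (env.OPENAI_API_KEY) return 'USE_OPENAI';
  return undefined; // no env-based signal; callers decide what to do next
}
```

Passing the environment in as a parameter, rather than reading `process.env` directly, keeps the resolver easy to test against each supported setup.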


ahundt added 4 commits August 20, 2025 19:40

Merge upstream/main and resolve conflicts

087d7b3

fix: add missing telemetry imports after merge

0886753

fix: move undici dynamic import to async context with debug logging

89cd276

ahundt mentioned this pull request Aug 23, 2025

fix(core): add local AI server socket configuration to eliminate connection termination #429

Closed

wenshao requested changes Apr 19, 2026


This was referenced Apr 27, 2026

OpenAI-compatible server reliability improvements via HTTP/1.1 keep-alive config and error handling BingqingLyu/qwen-code#14

Open

fix(core): add local AI server socket configuration to eliminate connection termination BingqingLyu/qwen-code#17

Open

ahundt wants to merge 4 commits into QwenLM:main from ahundt:fix-openai-local-server-reliability
