OpenAI-compatible server reliability improvements via HTTP/1.1 keep-alive config and error handling #404

Open
ahundt wants to merge 4 commits into QwenLM:main from ahundt:fix-openai-local-server-reliability

@ahundt ahundt commented Aug 20, 2025

Summary

This PR improves the reliability of Qwen Code when using local OpenAI-compatible servers by implementing proper HTTP/1.1 keep-alive configuration and robust error handling.

Problem

Local OpenAI-compatible servers (running on localhost) were experiencing:

  • Intermittent connection failures during streaming
  • Premature connection closures
  • Missing retry logic for transient failures
  • Inadequate timeout handling for slower local inference

Solution

1. HTTP Agent Configuration

  • Implemented HTTP/1.1 keep-alive with 4-second timeout to prevent premature disconnections
  • Set connection pool to max 2 connections to avoid overwhelming local servers
  • Disabled HTTP pipelining (set to 0) for compatibility with simpler server implementations
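
The agent settings listed above could be expressed roughly as follows. This is a minimal sketch, not the PR's actual code: `LocalAgentOptions`, `LOCAL_AGENT_OPTIONS`, and `loadDispatcher` are hypothetical names, and only the numeric values come from the description. `keepAliveTimeout`, `connections`, and `pipelining` are real undici `Agent` options. undici's documentation appears to describe `pipelining: 0` as disabling keep-alive connections, so that value may be worth checking against the keep-alive goal.

```typescript
// Hypothetical names; only the numeric values are taken from the PR description.
interface LocalAgentOptions {
  keepAliveTimeout: number; // milliseconds an idle socket is kept open
  connections: number; // maximum concurrent connections per origin
  pipelining: number; // number of pipelined requests per connection
}

const LOCAL_AGENT_OPTIONS: LocalAgentOptions = {
  keepAliveTimeout: 4000,
  connections: 2,
  pipelining: 0,
};

// Builds an undici Agent when the module can be loaded; resolves to undefined
// otherwise, so the caller can fall back to the platform's standard fetch.
async function loadDispatcher(
  options: LocalAgentOptions = LOCAL_AGENT_OPTIONS,
): Promise<unknown> {
  try {
    const moduleName = 'undici'; // variable specifier keeps the dependency optional
    const undici = await import(moduleName);
    return new undici.Agent(options);
  } catch {
    return undefined;
  }
}
```

A caller would await `loadDispatcher()` once and pass the result, if any, as the `dispatcher` for its requests.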

2. Retry Logic

  • Added exponential backoff: the delay starts at 1 second and doubles after each failed attempt, for up to 5 retries
  • Falls back to standard fetch when undici module is not available
  • Retries on HTTP status codes 429, 500, 502, 503 and ECONNRESET errors
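
As an illustration of the retry policy described above, here is a minimal sketch assuming errors expose an HTTP `status` or a Node-style `code`. The names `isRetryable`, `backoffDelayMs`, and `withRetry` are hypothetical and not taken from the PR. The `sleep` parameter is injectable so the loop can be exercised without real delays.

```typescript
// Status codes the PR description lists as retryable.
const RETRYABLE_STATUS = new Set<number>([429, 500, 502, 503]);

interface RetryableError {
  status?: number; // HTTP status, when the failure came from a response
  code?: string; // Node-style error code, e.g. 'ECONNRESET'
}

function isRetryable(err: unknown): boolean {
  if (typeof err !== 'object' || err === null) return false;
  const e = err as RetryableError;
  const retryableStatus = e.status !== undefined && RETRYABLE_STATUS.has(e.status);
  return retryableStatus || e.code === 'ECONNRESET';
}

// Delay before retry number `retry` (0-based): 1s, 2s, 4s, 8s, 16s.
function backoffDelayMs(retry: number, baseMs = 1000): number {
  return baseMs * 2 ** retry;
}

async function withRetry<T>(
  operation: () => Promise<T>,
  maxRetries = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let retry = 0; ; retry++) {
    try {
      return await operation();
    } catch (err) {
      // Give up on non-retryable errors or once the retry budget is spent.
      if (retry >= maxRetries || !isRetryable(err)) throw err;
      await sleep(backoffDelayMs(retry));
    }
  }
}
```

With the defaults, one initial attempt is followed by at most five retries, so a persistently failing call waits 31 seconds in total before the last error is rethrown.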

3. Configuration Improvements

  • Consolidated OpenAI configuration with proper hierarchy (CLI args > env vars > settings.json)
  • Added getBackendName() function to display actual backend name ("OpenAI API", "Qwen API", "Vertex AI", etc.) instead of undefined in error messages
  • Consolidated multiple debug console.log calls into single multi-line output
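
The precedence rule (CLI args > env vars > settings.json) can be sketched as a per-field fallback chain. `OpenAISettings` and `resolveOpenAIConfig` are hypothetical names for illustration; the PR's actual helpers (`setupOpenAIFromCliArgs`, `getEffectiveAuthType`) may be structured differently.

```typescript
// Hypothetical shape: one optional value per configurable field.
interface OpenAISettings {
  apiKey?: string;
  baseUrl?: string;
  model?: string;
}

// For each field, take the first defined value in precedence order.
function resolveOpenAIConfig(
  cli: OpenAISettings,
  env: OpenAISettings,
  settings: OpenAISettings,
): OpenAISettings {
  const pick = (key: keyof OpenAISettings): string | undefined =>
    cli[key] ?? env[key] ?? settings[key];
  return { apiKey: pick('apiKey'), baseUrl: pick('baseUrl'), model: pick('model') };
}

// Example: each field resolves independently of the others.
const resolved = resolveOpenAIConfig(
  { model: 'cli-model' },
  { baseUrl: 'http://localhost:1234/v1', model: 'env-model' },
  { apiKey: 'from-settings', baseUrl: 'https://example.invalid/v1' },
);
// resolved.model comes from the CLI, baseUrl from the environment,
// and apiKey from settings.json.
```

Because `??` only skips `undefined` and `null`, an explicitly empty string from a higher-priority source would still win in this sketch; whether that is desirable is a design choice.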

Changes

  • Core improvements: HTTP agent configuration with 4s keep-alive timeout, exponential backoff retry, 10-minute timeout for local inference
  • Configuration: Centralized OpenAI config in config.ts, enforced CLI > env > settings.json precedence
  • Error handling: Error messages now show actual backend name (e.g., "OpenAI API", "Qwen API") instead of undefined in reportError calls
  • Documentation: Added local server configuration section to openai-auth.md with environment variable examples
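
A lookup with a fallback is one simple way to keep `undefined` out of error messages. The sketch below is a hypothetical stand-in for `getBackendName()`: the auth-type keys and the `'Unknown API'` fallback are assumptions, while the display names are the ones quoted in this description.

```typescript
// Keys are hypothetical identifiers; the display names come from the PR text.
const BACKEND_NAMES = new Map<string, string>([
  ['openai', 'OpenAI API'],
  ['qwen-oauth', 'Qwen API'],
  ['vertex-ai', 'Vertex AI'],
]);

// Always returns a printable name, even for a missing or unrecognized auth type.
function backendNameFor(authKind?: string): string {
  return BACKEND_NAMES.get(authKind ?? '') ?? 'Unknown API';
}

// Example message construction with an unset auth type.
const message = `Error when talking to ${backendNameFor(undefined)}`;
```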

Testing

To test with a local OpenAI-compatible server:

export OPENAI_API_KEY="any-value"
export OPENAI_BASE_URL="http://localhost:1234/v1"
export OPENAI_MODEL="your-model-name"
qwen

Compatibility

  • Fully backward compatible with cloud OpenAI API
  • No breaking changes to existing configurations
  • Graceful degradation when optional dependencies are missing

Fixes issues with local model inference reliability when using self-hosted models.

ahundt added 4 commits August 20, 2025 19:40
Previous behavior: Local OpenAI-compatible servers experienced intermittent connection failures and streaming issues due to missing HTTP/1.1 keep-alive configuration and inadequate timeout handling.

What changed:
- packages/core/src/core/openaiContentGenerator.ts: Add HTTP agent configuration with proper keep-alive settings, implement retry logic with exponential backoff, add constants for timeout and connection pooling
- packages/cli/src/config/config.ts: Consolidate OpenAI configuration functions, add setupOpenAIFromCliArgs and getEffectiveAuthType for proper config hierarchy (CLI > env > settings)
- packages/cli/src/validateNonInterActiveAuth.ts: Consolidate debug logging into multi-line statements, use getEffectiveAuthType for consistent auth resolution
- packages/cli/src/gemini.tsx: Fix import order, add error logging in debug mode, update imports after config consolidation
- packages/core/src/core/contentGenerator.ts: Add getBackendName function for human-readable API names in error messages
- packages/core/src/core/geminiChat.ts: Add getConfig() method to expose configuration for error reporting
- packages/core/src/core/turn.ts: Use getBackendName for better error messages
- packages/cli/src/nonInteractiveCli.ts: Apply OpenAI CLI arguments before auth initialization
- docs/cli/openai-auth.md: Document local server configuration options

Why: Local OpenAI-compatible servers often have different connection characteristics than cloud services, requiring specific HTTP agent configuration to maintain stable connections. The changes ensure reliable streaming and prevent premature connection closures.

Testable: Run with local OpenAI-compatible server using OPENAI_BASE_URL=http://localhost:1234/v1 and verify stable streaming without connection drops.

// Use config values if provided, otherwise use sensible defaults
const timeout = contentGeneratorConfig?.timeout || 600000; // 10 minutes default
const maxRetries = contentGeneratorConfig?.maxRetries ?? 2; // 2 retries default

Collaborator

[Critical] fetchOptions is computed in the constructor before Agent is loaded. On the first OpenAIContentGenerator instance Agent is still undefined, so this client is created without the custom dispatcher and never picks up the new keep-alive / timeout settings that this PR is trying to add. Please load undici before new OpenAI(...), or recreate the client after Agent becomes available.

— gpt-5.4 via Qwen Code /review
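
One way to address the ordering problem is an async factory that finishes loading the dispatcher before any client options are built. This is only a sketch under assumptions: `ClientOptionsSketch`, `buildClientOptions`, and `GeneratorSketch` are hypothetical, and the real class would pass the resulting options to `new OpenAI(...)`.

```typescript
// Hypothetical subset of the client options relevant to this comment.
interface ClientOptionsSketch {
  timeout: number;
  fetchOptions: { dispatcher?: unknown };
}

// Only attach a dispatcher when one was actually loaded.
function buildClientOptions(dispatcher: unknown, timeout = 600000): ClientOptionsSketch {
  return { timeout, fetchOptions: dispatcher === undefined ? {} : { dispatcher } };
}

class GeneratorSketch {
  readonly options: ClientOptionsSketch;

  // Private: instances can only be made through create(), after loading finishes.
  private constructor(options: ClientOptionsSketch) {
    this.options = options;
  }

  static async create(loadDispatcher: () => Promise<unknown>): Promise<GeneratorSketch> {
    const dispatcher = await loadDispatcher(); // resolved before options are computed
    return new GeneratorSketch(buildClientOptions(dispatcher));
  }
}
```

Because the constructor is private, the first instance cannot be created with a stale, dispatcher-less configuration.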


if (!effectiveAuthType) {
  return;
}

Collaborator

[Critical] initializeAuth() now swallows every auth initialization failure here. That changes the sandbox startup path too, because the surrounding try/catch no longer sees the error and cannot fail fast before launching. Broken or expired credentials will now slip past startup and fail later in a much harder-to-diagnose place. Please rethrow after logging, or return an explicit failure result that callers must handle.

— gpt-5.4 via Qwen Code /review
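
The two options suggested in this comment might look like the following. It is a minimal sketch with hypothetical names (`initializeAuthStrict`, `initializeAuthResult`, `toError`) and an injected `refreshAuth` callback standing in for the real auth call.

```typescript
type AuthInitResult = { ok: true } | { ok: false; error: Error };

// Normalizes anything thrown into an Error instance.
function toError(err: unknown): Error {
  return err instanceof Error ? err : new Error(String(err));
}

// Option 1: log, then rethrow so the caller's try/catch can still fail fast.
async function initializeAuthStrict(
  refreshAuth: () => Promise<void>,
  log: (message: string) => void,
): Promise<void> {
  try {
    await refreshAuth();
  } catch (err) {
    log(`Auth initialization failed: ${toError(err).message}`);
    throw err;
  }
}

// Option 2: return an explicit result that the caller has to inspect.
async function initializeAuthResult(
  refreshAuth: () => Promise<void>,
): Promise<AuthInitResult> {
  try {
    await refreshAuth();
    return { ok: true };
  } catch (err) {
    return { ok: false, error: toError(err) };
  }
}
```

Option 1 keeps existing callers working unchanged, while option 2 makes the failure path visible in the type signature.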

if (process.env.GEMINI_API_KEY) {
  return AuthType.USE_GEMINI;
}
if (process.env.GOOGLE_GENAI_USE_VERTEXAI === 'true') {

Collaborator

[Critical] This new shared auth resolver no longer considers QWEN_OAUTH_TOKEN, even though other code paths still treat that env var as a valid signal for Qwen auth. As soon as validateNonInteractiveAuth() switches to this helper, existing token-based Qwen setups stop resolving an auth type in non-interactive/shared flows. Please preserve the previous env-based auth matrix here before making this the source of truth.

— gpt-5.4 via Qwen Code /review
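
A resolver that keeps `QWEN_OAUTH_TOKEN` in the matrix could be sketched as below. The first two checks mirror the quoted snippet; the `OPENAI_API_KEY` check, the position of the `QWEN_OAUTH_TOKEN` check, and the `AuthKind` values are assumptions, since the PR's full resolver and `AuthType` enum are not shown here. The environment is passed in as a parameter to keep the function easy to test.

```typescript
// Hypothetical auth identifiers standing in for the project's AuthType enum.
type AuthKind = 'gemini-api-key' | 'vertex-ai' | 'openai' | 'qwen-oauth';

type Env = Record<string, string | undefined>;

function authKindFromEnv(env: Env): AuthKind | undefined {
  if (env['GEMINI_API_KEY']) return 'gemini-api-key';
  if (env['GOOGLE_GENAI_USE_VERTEXAI'] === 'true') return 'vertex-ai';
  if (env['OPENAI_API_KEY']) return 'openai'; // assumed check, not in the quoted snippet
  if (env['QWEN_OAUTH_TOKEN']) return 'qwen-oauth'; // the signal the comment says was dropped
  return undefined; // no env-based signal; the caller decides what to do
}
```

In the CLI this would presumably be called with `process.env`; the relative order of the checks is a policy decision that should match the previous behavior.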

DragonnZhang pushed a commit that referenced this pull request Apr 30, 2026
# v0.7.5 — Network Proxy, Webhooks & Community Fixes

---

## Features

- **Network proxy support** — configure HTTP/HTTPS proxies with bypass rules directly from App Settings. The proxy engine routes traffic through `undici` ProxyAgent instances, respects `NO_PROXY` rules, and configures both Node and Electron browser sessions. (46049d28, 0ed01c48, 81d1b1c3, 0da3f265)
- **Webhook actions for automations** — automations can now fire HTTP webhooks with configurable auth, form payloads, response capture, replay, and persistent retry with exponential backoff. (993c75ce, 588aab6c)
- **Gemini 3.1 Flash Lite** — added to Google AI Studio preferred defaults. Thanks to [@naishyadav](https://github.com/naishyadav) for the suggestion in [#357](craft-ai-agents/craft-agents-oss#357). (e36e9a22)
- **Dismiss working directory history items** — hover-visible X button on each recent directory entry to remove it from history. Thanks to [@jjjrmy](https://github.com/jjjrmy) ([#346](craft-ai-agents/craft-agents-oss#346)) and [@jonzhan](https://github.com/jonzhan) ([#391](craft-ai-agents/craft-agents-oss#391)) for requesting this. (3bd55d6d)

## Improvements

- **History truncation** — consolidated all history field truncation to a single `HISTORY_FIELD_MAX_LENGTH` constant instead of scattered hardcoded limits (8689a1d4)
- **Craft source docs** — added Collections section to guide with emphasis on the nested title + properties item format (0837afa0)

## Bug Fixes

- **MiniMax CN authentication** — removed incorrect `minimax-cn → minimax` alias, added lightweight direct HTTP test for Pi providers, and stripped MiniMax-prefix for CN API compatibility. Thanks to [@Kathie-yu](https://github.com/Kathie-yu) ([#396](craft-ai-agents/craft-agents-oss#396)) and [@RimuruW](https://github.com/RimuruW) ([#386](craft-ai-agents/craft-agents-oss#386)) for reporting. Fixes [#396](craft-ai-agents/craft-agents-oss#396). (612c0e7e)
- **Inline code in messages** — text between inline badges (sources, skills, files) now renders through the Markdown component, restoring inline code, bold, italic, and links. Thanks to [@linusrogge](https://github.com/linusrogge) for reporting [#378](craft-ai-agents/craft-agents-oss#378). Fixes [#378](craft-ai-agents/craft-agents-oss#378). (e7f88a38)
- **Zod/JSON Schema passthrough** — default Zod object schemas to `.passthrough()` to match JSON Schema semantics where `additionalProperties` defaults to `true`. Also preserved `additionalProperties` in MCP proxy tool schema round-trip. Fixes tools with loosely-typed schemas silently losing fields. (cf4b6ac1, 42c173cf)
- **Self-signed TLS certificates** — accept self-signed certificates for the configured remote server origin, fixing `ERR_CERT_AUTHORITY_INVALID` on `wss://` connections (7148ebec)
- **File attachment in thin client mode** — paperclip button now uses browser-native `FileReader` API instead of server-side `fs.readFile()`, fixing silent failures when client and server filesystems differ (680cd197)
- **URL linkification** — strip trailing markdown characters (`**`, etc.) from linkified URLs that were producing broken links (24385e78, 65b8f350)
- **Automation action badges** — sidebar now shows correct "Prompt" / "Webhook" badges based on actual action types instead of hardcoding "Prompt" (dc422573)
- **Packaged server path fallback** — added `dist/resources` fallback for builds where `extraResources` output layout differs (edc5f61f)
- **`$CRAFT_EVENT_DATA` missing labels** — all automation event payloads now include the session's current `labels` array, so webhooks and scripts can access label data. Fixes [#406](craft-ai-agents/craft-agents-oss#406). (0f98f090)
- **Multi-select non-adjacent sessions** — Cmd-click now always toggles selection (standard OS behavior); opening in a new panel moves to Cmd-Shift-click. Fixes [#404](craft-ai-agents/craft-agents-oss#404). (d4d7aff1)
- **@ mention autocomplete with spaces** — spaces are now allowed in file mention queries (e.g. `@app availability.md`). The menu auto-closes Slack-style when a space produces no matches. Thanks to [@alexzadeh](https://github.com/alexzadeh) for reporting [#398](craft-ai-agents/craft-agents-oss#398). Fixes [#398](craft-ai-agents/craft-agents-oss#398). (1d063177)

---

2 participants