fix(core): add local AI server socket configuration to eliminate connection termination#17
Open
BingqingLyu wants to merge 5 commits into
Previous behavior: Local OpenAI-compatible servers experienced intermittent connection failures and streaming issues due to missing HTTP/1.1 keep-alive configuration and inadequate timeout handling.

What changed:
- packages/core/src/core/openaiContentGenerator.ts: Add HTTP agent configuration with proper keep-alive settings, implement retry logic with exponential backoff, add constants for timeout and connection pooling
- packages/cli/src/config/config.ts: Consolidate OpenAI configuration functions, add setupOpenAIFromCliArgs and getEffectiveAuthType for proper config hierarchy (CLI > env > settings)
- packages/cli/src/validateNonInterActiveAuth.ts: Consolidate debug logging into multi-line statements, use getEffectiveAuthType for consistent auth resolution
- packages/cli/src/gemini.tsx: Fix import order, add error logging in debug mode, update imports after config consolidation
- packages/core/src/core/contentGenerator.ts: Add getBackendName function for human-readable API names in error messages
- packages/core/src/core/geminiChat.ts: Add getConfig() method to expose configuration for error reporting
- packages/core/src/core/turn.ts: Use getBackendName for better error messages
- packages/cli/src/nonInteractiveCli.ts: Apply OpenAI CLI arguments before auth initialization
- docs/cli/openai-auth.md: Document local server configuration options

Why: Local OpenAI-compatible servers often have different connection characteristics than cloud services, requiring specific HTTP agent configuration to maintain stable connections. The changes ensure reliable streaming and prevent premature connection closures.

Testable: Run with local OpenAI-compatible server using OPENAI_BASE_URL=http://localhost:1234/v1 and verify stable streaming without connection drops.
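The "retry logic with exponential backoff" mentioned in this commit could look roughly like the sketch below. It is an illustration only, not the code in openaiContentGenerator.ts: the names `withRetry`, `backoffDelayMs`, and `isRetryable`, and all three constants, are hypothetical and do not come from the PR.

```typescript
// Hypothetical constants; the PR does not state the actual values.
const MAX_RETRIES = 3;
const BASE_DELAY_MS = 250;
const MAX_DELAY_MS = 4_000;

// Delay before retry number `attempt` (0-based): doubles each time, capped.
function backoffDelayMs(attempt: number): number {
  return Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS);
}

// Runs `operation`, retrying on failure with exponential backoff.
// `isRetryable` lets the caller restrict retries to connection-level errors.
async function withRetry<T>(
  operation: () => Promise<T>,
  isRetryable: (err: unknown) => boolean = () => true,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      // Give up once the retry budget is spent or the error is not retryable.
      if (attempt >= MAX_RETRIES || !isRetryable(err)) throw err;
      await new Promise<void>((resolve) =>
        setTimeout(resolve, backoffDelayMs(attempt)),
      );
    }
  }
}
```

A caller would wrap the request that opens the stream, for example `withRetry(() => sendRequest(), isConnectionError)` where both arguments are the caller's own functions. Retrying after part of a streamed response has already been delivered needs extra care, which this sketch does not address.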
…ections

Previous behavior: Local AI servers (LM Studio, Ollama) would frequently terminate connections with socket hang up errors, causing unreliable communication and failed requests.

What changed:
- packages/core/src/utils/localAI.ts: Created comprehensive local AI utilities with essential socket configuration (setNoDelay, setKeepAlive, custom timeouts)
- packages/core/src/core/openaiContentGenerator.ts: Integrated local server detection and conditional application of socket optimizations
- packages/core/src/index.ts: Added exports for localAI utility functions
- scripts/test-qwen-lmstudio-integration.sh: Added integration test script for verifying local AI server connectivity

Why: Local AI servers require specific socket settings to maintain stable connections. Without proper configuration, Node.js default networking causes frequent "terminated" and "socket hang up" errors that make local AI servers unreliable. The implementation applies enhanced networking ONLY when local servers are detected (localhost, 127.0.0.1, private networks, common AI ports), preserving existing behavior for cloud APIs.

Files affected:
- packages/core/src/utils/localAI.ts: New utility with socket configuration, server detection, custom fetch implementation
- packages/core/src/core/openaiContentGenerator.ts: Added isLocalServerUrl detection and configureLocalAIClientOptions call
- packages/core/src/index.ts: Export localAI utilities for package consumers
- scripts/test-qwen-lmstudio-integration.sh: Test automation for verifying local server integration

Testable: Run "OPENAI_API_KEY=lmstudio OPENAI_BASE_URL=http://localhost:1234/v1 node packages/cli/dist/index.js -y -p 'test'" to verify local server configuration is applied and connections are stable.
This was referenced Apr 28, 2026
5 tasks
Conflict Group 1
This PR shares modified functions with 20 other PR(s): #10, #112, #113, #114, #117, #14, #18, #20, #21, #22, #31, #36, #46, #6, #7, #71, #75, #86, #88, #94. These PRs should be reviewed as a batch, since merging one may affect the others.
graph LR
PR17["PR #17"]
Fmain_4875["main<br>gemini.tsx"]
PR17 -->|modifies| Fmain_4875
PR14["PR #14"]
PR14 -->|modifies| Fmain_4875
PR20["PR #20"]
PR20 -->|modifies| Fmain_4875
PR94["PR #94"]
PR94 -->|modifies| Fmain_4875
FparseArguments_6977["parseArguments<br>config.ts"]
PR17 -->|modifies| FparseArguments_6977
PR10["PR #10"]
PR10 -->|modifies| FparseArguments_6977
PR112["PR #112"]
PR112 -->|modifies| FparseArguments_6977
PR113["PR #113"]
PR113 -->|modifies| FparseArguments_6977
PR114["PR #114"]
PR114 -->|modifies| FparseArguments_6977
PR117["PR #117"]
PR117 -->|modifies| FparseArguments_6977
PR14 -->|modifies| FparseArguments_6977
PR18["PR #18"]
PR18 -->|modifies| FparseArguments_6977
PR21["PR #21"]
PR21 -->|modifies| FparseArguments_6977
PR22["PR #22"]
PR22 -->|modifies| FparseArguments_6977
PR31["PR #31"]
PR31 -->|modifies| FparseArguments_6977
PR36["PR #36"]
PR36 -->|modifies| FparseArguments_6977
PR46["PR #46"]
PR46 -->|modifies| FparseArguments_6977
PR7["PR #7"]
PR7 -->|modifies| FparseArguments_6977
PR86["PR #86"]
PR86 -->|modifies| FparseArguments_6977
PR88["PR #88"]
PR88 -->|modifies| FparseArguments_6977
FresolveContentGeneratorConfigWithSources_5386["resolveContentGeneratorConfigWithSources<br>contentGenerator.ts"]
PR17 -->|modifies| FresolveContentGeneratorConfigWithSources_5386
PR14 -->|modifies| FresolveContentGeneratorConfigWithSources_5386
FrunNonInteractive_8467["runNonInteractive<br>nonInteractiveCli.ts"]
PR17 -->|modifies| FrunNonInteractive_8467
PR114 -->|modifies| FrunNonInteractive_8467
PR14 -->|modifies| FrunNonInteractive_8467
PR6["PR #6"]
PR6 -->|modifies| FrunNonInteractive_8467
PR71["PR #71"]
PR71 -->|modifies| FrunNonInteractive_8467
PR75["PR #75"]
PR75 -->|modifies| FrunNonInteractive_8467
FstartInteractiveUI_4875["startInteractiveUI<br>gemini.tsx"]
PR17 -->|modifies| FstartInteractiveUI_4875
PR114 -->|modifies| FstartInteractiveUI_4875
PR117 -->|modifies| FstartInteractiveUI_4875
PR14 -->|modifies| FstartInteractiveUI_4875
PR31 -->|modifies| FstartInteractiveUI_4875
PR6 -->|modifies| FstartInteractiveUI_4875
PR88 -->|modifies| FstartInteractiveUI_4875
PR94 -->|modifies| FstartInteractiveUI_4875
FvalidateNonInteractiveAuth_1241["validateNonInteractiveAuth<br>validateNonInterActiveAuth.ts"]
PR17 -->|modifies| FvalidateNonInteractiveAuth_1241
PR10 -->|modifies| FvalidateNonInteractiveAuth_1241
PR14 -->|modifies| FvalidateNonInteractiveAuth_1241
Posted by codegraph-ai conflict detection.
Previous behavior
Local AI servers (LM Studio, Ollama) frequently terminate connections with "socket hang up" and "terminated" errors during streaming responses, causing unreliable communication and failed requests.
What changed
Technical implementation
Essential socket configuration:
Local server detection: Detects localhost, 127.0.0.1, private networks (RFC 1918), .local/.localhost domains, Docker patterns (host.docker.internal), and common AI ports (1234, 5000, 7860, 8000, 11434)
Custom fetch implementation: Uses Node.js http/https modules directly with socket configuration applied via req.on('socket') event, bypassing undici limitations for local servers
Conditional application: Only applies socket configuration when isLocalServerUrl() returns true, preserving existing cloud API behavior
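The detection rules listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation in packages/core/src/utils/localAI.ts: only the name `isLocalServerUrl` and the listed hosts and ports come from the PR, while `isPrivateIPv4`, `COMMON_LOCAL_AI_PORTS`, and the order of the checks are hypothetical. The sketch also assumes that a matching port alone is enough to classify a URL as local, which the PR text suggests but does not spell out.

```typescript
// Ports named in the PR description as "common AI ports".
const COMMON_LOCAL_AI_PORTS = new Set(["1234", "5000", "7860", "8000", "11434"]);

// True for RFC 1918 ranges: 10/8, 172.16/12, 192.168/16.
function isPrivateIPv4(host: string): boolean {
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!match) return false;
  const octets = match.slice(1).map(Number);
  if (octets.some((octet) => octet > 255)) return false;
  const [a, b] = octets;
  return (
    a === 10 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// Hypothetical re-creation of the detection rules from the PR description.
function isLocalServerUrl(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // Not a parseable absolute URL.
  }
  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host === "127.0.0.1") return true;
  if (host === "host.docker.internal") return true;
  if (host.endsWith(".local") || host.endsWith(".localhost")) return true;
  if (isPrivateIPv4(host)) return true;
  // Assumption: a well-known local AI port is treated as sufficient evidence.
  return COMMON_LOCAL_AI_PORTS.has(url.port);
}
```

One consequence of the port rule as sketched here is that a remote host serving on port 8000 or 11434 would also be classified as local. Whether the PR combines the port check with a host check is not stated in the description.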
Why
Local AI servers require specific socket settings to maintain stable connections. Node.js default networking causes frequent termination errors with local servers that have different timeout and keep-alive requirements than cloud APIs.
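For illustration, the socket settings in question (setNoDelay, setKeepAlive, and a custom timeout, applied from the request's 'socket' event) might be wired up roughly as below. This is a sketch under assumptions, not the PR's code: the function names, the `ConfigurableSocket` interface, and both constants are hypothetical, and the values actually used in localAI.ts are not given in the description. It covers plain `http:` URLs only.

```typescript
import * as http from "node:http";

// Hypothetical values; the PR does not state the ones it uses.
const KEEP_ALIVE_INITIAL_DELAY_MS = 1_000;
const SOCKET_TIMEOUT_MS = 120_000;

// The subset of net.Socket methods this sketch relies on.
interface ConfigurableSocket {
  setNoDelay(noDelay?: boolean): unknown;
  setKeepAlive(enable?: boolean, initialDelay?: number): unknown;
  setTimeout(timeout: number): unknown;
}

// Applies the three socket options named in the PR description.
function configureLocalSocket(socket: ConfigurableSocket): void {
  socket.setNoDelay(true); // Disable Nagle's algorithm for small stream chunks.
  socket.setKeepAlive(true, KEEP_ALIVE_INITIAL_DELAY_MS); // TCP keep-alive probes.
  socket.setTimeout(SOCKET_TIMEOUT_MS); // Idle timeout; emits 'timeout' only.
}

// Creates a request whose socket is configured as soon as it is assigned.
function requestWithSocketConfig(
  url: string,
  options: http.RequestOptions = {},
): http.ClientRequest {
  const req = http.request(url, options);
  req.on("socket", (socket) => configureLocalSocket(socket));
  // An idle timeout does not close the socket by itself, so abort explicitly.
  req.on("timeout", () => req.destroy(new Error("local AI server timed out")));
  return req;
}
```

The caller is still responsible for attaching 'response' and 'error' handlers and for calling `req.end()`. An `https:` URL would need the equivalent call on the `node:https` module.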
Files affected
Testable
OPENAI_API_KEY=lmstudio OPENAI_BASE_URL=http://localhost:1234/v1 node packages/cli/dist/index.js -y -p "test"
[OpenAIContentGenerator.constructor] Configuring local AI client options for http://localhost:1234/v1
node packages/cli/dist/index.js --help (no environment variables)
./scripts/test-qwen-lmstudio-integration.sh
Compatibility
100% backward compatible: the changes are additive and conditional. Cloud APIs and existing functionality remain unchanged, and the enhancements apply only when a local server is detected.
Improves upon QwenLM#404