
fix(core): add local AI server socket configuration to eliminate connection termination #17

Open: BingqingLyu wants to merge 5 commits into main from fork-pr-429-fix-local-ai-socket-configuration

BingqingLyu (Owner) commented Apr 27, 2026

Previous behavior

Local AI servers (LM Studio, Ollama) frequently terminate connections with "socket hang up" and "terminated" errors during streaming responses, causing unreliable communication and failed requests.

What changed

  • packages/core/src/utils/localAI.ts: New utility (276 lines) with socket configuration functions
  • packages/core/src/core/openaiContentGenerator.ts: Apply configureLocalAIClientOptions() in constructor with conditional local server detection
  • packages/core/src/index.ts: Export localAI utility functions (isLocalServerUrl, configureLocalAIClientOptions, etc.)
  • scripts/test-qwen-lmstudio-integration.sh: Integration test script for verifying local AI server connectivity

Technical implementation

Essential socket configuration:

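socket.setNoDelay(true);         // Disable Nagle's algorithm so small writes are sent immediately
socket.setKeepAlive(true, 1000); // Enable TCP keep-alive; first probe after 1 s of idle time (initialDelay)
socket.setTimeout(60000);        // Emit 'timeout' after 60 s of socket inactivity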

Local server detection: Detects localhost, 127.0.0.1, private networks (RFC 1918), .local/.localhost domains, Docker patterns (host.docker.internal), and common AI ports (1234, 5000, 7860, 8000, 11434)
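
A minimal sketch of how detection along these lines could look. This is not the code in localAI.ts: the helper names looksLikeLocalServer and isPrivateIPv4 are hypothetical, and only the host patterns and ports listed above are taken from this description.

```typescript
// Hypothetical sketch; the real isLocalServerUrl in localAI.ts may differ.
const COMMON_AI_PORTS = new Set(["1234", "5000", "7860", "8000", "11434"]);

// True for loopback (127.0.0.0/8) and the RFC 1918 private IPv4 ranges.
function isPrivateIPv4(host: string): boolean {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 127 ||                         // loopback
    a === 10 ||                          // 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168)             // 192.168.0.0/16
  );
}

function looksLikeLocalServer(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // not a parseable URL
  }
  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host === "host.docker.internal") return true;
  if (host.endsWith(".local") || host.endsWith(".localhost")) return true;
  if (isPrivateIPv4(host)) return true;
  // Port heuristic: url.port is "" when the scheme's default port is used.
  return COMMON_AI_PORTS.has(url.port);
}
```

Note that the port check is a heuristic: under this sketch, a public host that happens to listen on one of these ports would also be classified as local.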

Custom fetch implementation: Uses Node.js http/https modules directly with socket configuration applied via req.on('socket') event, bypassing undici limitations for local servers
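
The mechanism described here, applying socket options from the request's 'socket' event, can be sketched roughly as follows. The function names applyLocalSocketOptions and getWithSocketOptions are hypothetical and this is not the PR's custom fetch; it only illustrates the Node.js calls involved, for plain HTTP.

```typescript
import * as http from "node:http";

// Minimal structural type so the helper can be exercised without a real socket.
interface TunableSocket {
  setNoDelay(noDelay?: boolean): unknown;
  setKeepAlive(enable?: boolean, initialDelay?: number): unknown;
  setTimeout(timeout: number): unknown;
}

// Applies the three settings listed under "Essential socket configuration".
function applyLocalSocketOptions(socket: TunableSocket): void {
  socket.setNoDelay(true);
  socket.setKeepAlive(true, 1000);
  socket.setTimeout(60000);
}

// Hypothetical helper: GET a URL over plain HTTP and return the body as text.
// An https:// URL would go through https.request in the same way.
function getWithSocketOptions(rawUrl: string): Promise<string> {
  return new Promise<string>((resolve, reject) => {
    const req = http.request(new URL(rawUrl), { method: "GET" }, (res) => {
      const chunks: Buffer[] = [];
      res.on("data", (chunk: Buffer) => chunks.push(chunk));
      res.on("end", () => resolve(Buffer.concat(chunks).toString("utf8")));
      res.on("error", reject);
    });
    // 'socket' fires once a socket is assigned to this request.
    req.on("socket", (socket) => applyLocalSocketOptions(socket));
    // setTimeout only emits 'timeout'; the request must be torn down explicitly.
    req.on("timeout", () => req.destroy(new Error("socket idle timeout")));
    req.on("error", reject);
    req.end();
  });
}
```

Calling getWithSocketOptions("http://localhost:1234/v1/models") would exercise this path against a running local server; the endpoint path is an assumption about the server, not something this PR specifies.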

Conditional application: Only applies socket configuration when isLocalServerUrl() returns true, preserving existing cloud API behavior
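
One way to picture the conditional step is a small wrapper that leaves cloud options untouched and swaps in a custom fetch only for local URLs. The names below (withLocalTransport, ClientOptions, FetchLike) are hypothetical, and the sketch assumes the client accepts a custom fetch option; the actual signature of configureLocalAIClientOptions() is not shown in this description.

```typescript
// Hypothetical shapes, for illustration only.
type FetchLike = (input: string, init?: object) => Promise<unknown>;

interface ClientOptions {
  baseURL: string;
  apiKey: string;
  fetch?: FetchLike;
}

// Returns the options unchanged for cloud URLs; for local URLs, returns a copy
// that routes requests through the socket-configured fetch.
function withLocalTransport(
  options: ClientOptions,
  isLocal: (url: string) => boolean,
  localFetch: FetchLike,
): ClientOptions {
  if (!isLocal(options.baseURL)) return options; // cloud path: untouched
  return { ...options, fetch: localFetch };
}
```

Returning the original object on the cloud path makes the "no behavior change" property easy to assert in a unit test.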

Why

Local AI servers require specific socket settings to maintain stable connections. Node.js default networking causes frequent termination errors with local servers that have different timeout and keep-alive requirements than cloud APIs.

Files affected

  • packages/core/src/utils/localAI.ts: Socket configuration, server detection, custom fetch implementation
  • packages/core/src/core/openaiContentGenerator.ts: Added isLocalServerUrl check and configureLocalAIClientOptions call in constructor
  • packages/core/src/index.ts: Export isLocalServerUrl, configureLocalAIClientOptions, getApiKeyForUrl, LOCAL_AI_DEFAULT_KEY
  • scripts/test-qwen-lmstudio-integration.sh: Automated test script using byobu/tmux for LM Studio integration testing

Testable

  1. Run with local server: OPENAI_API_KEY=lmstudio OPENAI_BASE_URL=http://localhost:1234/v1 node packages/cli/dist/index.js -y -p "test"
  2. Verify debug log shows: [OpenAIContentGenerator.constructor] Configuring local AI client options for http://localhost:1234/v1
  3. Confirm stable connections without "terminated" errors
  4. Test cloud APIs work unchanged: node packages/cli/dist/index.js --help (no environment variables)
  5. Run integration test: ./scripts/test-qwen-lmstudio-integration.sh

Compatibility

Backward compatible: the changes are additive and conditional. The socket configuration is applied only when isLocalServerUrl() detects a local server, so cloud API behavior and existing functionality remain unchanged.

Improves upon QwenLM#404

ahundt added 5 commits August 20, 2025 19:40
Previous behavior: Local OpenAI-compatible servers experienced intermittent connection failures and streaming issues due to missing HTTP/1.1 keep-alive configuration and inadequate timeout handling.

What changed:
- packages/core/src/core/openaiContentGenerator.ts: Add HTTP agent configuration with proper keep-alive settings, implement retry logic with exponential backoff, add constants for timeout and connection pooling
- packages/cli/src/config/config.ts: Consolidate OpenAI configuration functions, add setupOpenAIFromCliArgs and getEffectiveAuthType for proper config hierarchy (CLI > env > settings)
- packages/cli/src/validateNonInterActiveAuth.ts: Consolidate debug logging into multi-line statements, use getEffectiveAuthType for consistent auth resolution
- packages/cli/src/gemini.tsx: Fix import order, add error logging in debug mode, update imports after config consolidation
- packages/core/src/core/contentGenerator.ts: Add getBackendName function for human-readable API names in error messages
- packages/core/src/core/geminiChat.ts: Add getConfig() method to expose configuration for error reporting
- packages/core/src/core/turn.ts: Use getBackendName for better error messages
- packages/cli/src/nonInteractiveCli.ts: Apply OpenAI CLI arguments before auth initialization
- docs/cli/openai-auth.md: Document local server configuration options
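
As an illustration of the retry logic with exponential backoff mentioned in the first item above, here is a minimal sketch. The function names, the attempt count, and the delay values (500 ms base, 8 s cap) are assumptions chosen for the example, not the constants used in openaiContentGenerator.ts.

```typescript
// Delay before the next retry: base * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Runs `operation` up to maxAttempts times, sleeping between failed attempts.
// `sleep` is injectable so tests do not have to wait in real time.
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      // No sleep after the final failed attempt.
      if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt));
    }
  }
  throw lastError;
}
```

A production version would likely retry only on connection-level errors and add jitter; both are omitted here to keep the sketch short.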

Why: Local OpenAI-compatible servers often have different connection characteristics than cloud services, requiring specific HTTP agent configuration to maintain stable connections. The changes ensure reliable streaming and prevent premature connection closures.
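
For reference, an HTTP agent with keep-alive enabled can be created in Node.js roughly as below. The constant names and values are illustrative assumptions; the commit does not state the actual numbers it uses.

```typescript
import * as http from "node:http";

// Illustrative values only; not the constants from the commit.
const KEEP_ALIVE_INITIAL_DELAY_MS = 1000; // delay before TCP keep-alive probes start
const MAX_SOCKETS = 10;                   // cap on concurrent sockets per host
const SOCKET_TIMEOUT_MS = 60000;          // socket idle timeout

function createLocalAgent(): http.Agent {
  return new http.Agent({
    keepAlive: true, // reuse sockets across requests instead of closing them
    keepAliveMsecs: KEEP_ALIVE_INITIAL_DELAY_MS,
    maxSockets: MAX_SOCKETS,
    timeout: SOCKET_TIMEOUT_MS,
  });
}
```

Such an agent is passed to http.request() through its agent option; how the commit wires it into the OpenAI client is not described here.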

Testable: Run with local OpenAI-compatible server using OPENAI_BASE_URL=http://localhost:1234/v1 and verify stable streaming without connection drops.
…ections

Previous behavior: Local AI servers (LM Studio, Ollama) would frequently terminate connections with socket hang up errors, causing unreliable communication and failed requests.

What changed:
- packages/core/src/utils/localAI.ts: Created comprehensive local AI utilities with essential socket configuration (setNoDelay, setKeepAlive, custom timeouts)
- packages/core/src/core/openaiContentGenerator.ts: Integrated local server detection and conditional application of socket optimizations
- packages/core/src/index.ts: Added exports for localAI utility functions
- scripts/test-qwen-lmstudio-integration.sh: Added integration test script for verifying local AI server connectivity

Why: Local AI servers require specific socket settings to maintain stable connections. Without proper configuration, Node.js default networking causes frequent "terminated" and "socket hang up" errors that make local AI servers unreliable.

The implementation applies enhanced networking ONLY when local servers are detected (localhost, 127.0.0.1, private networks, common AI ports), preserving existing behavior for cloud APIs.

Files affected:
- packages/core/src/utils/localAI.ts: New utility with socket configuration, server detection, custom fetch implementation
- packages/core/src/core/openaiContentGenerator.ts: Added isLocalServerUrl detection and configureLocalAIClientOptions call
- packages/core/src/index.ts: Export localAI utilities for package consumers
- scripts/test-qwen-lmstudio-integration.sh: Test automation for verifying local server integration

Testable: Run "OPENAI_API_KEY=lmstudio OPENAI_BASE_URL=http://localhost:1234/v1 node packages/cli/dist/index.js -y -p 'test'" to verify local server configuration is applied and connections are stable.
BingqingLyu added conflicting-group-1, conflicting-group-1 (Conflicting PR group 1: review as a batch), and conflicting-pr (Shares at least one cross-PR dependency with other PRs) and removed conflicting-group-1 labels May 7, 2026
BingqingLyu (Owner, Author) commented May 7, 2026

Conflict Group 1

This PR shares modified functions with 20 other PR(s): #10, #112, #113, #114, #117, #14, #18, #20, #21, #22, #31, #36, #46, #6, #7, #71, #75, #86, #88, #94.

These PRs should be reviewed as a batch — merging one may affect the others.

| Function | File | Also modified by |
| --- | --- | --- |
| main | gemini.tsx | #14, #20, #94 |
| parseArguments | config.ts | #10, #112, #113, #114, #117, #14, #18, #21, #22, #31, #36, #46, #7, #86, #88 |
| resolveContentGeneratorConfigWithSources | contentGenerator.ts | #14 |
| runNonInteractive | nonInteractiveCli.ts | #114, #14, #6, #71, #75 |
| startInteractiveUI | gemini.tsx | #114, #117, #14, #31, #6, #88, #94 |
| validateNonInteractiveAuth | validateNonInterActiveAuth.ts | #10, #14 |

graph LR
    PR17["PR #17"]
    Fmain_4875["main<br>gemini.tsx"]
    PR17 -->|modifies| Fmain_4875
    PR14["PR #14"]
    PR14 -->|modifies| Fmain_4875
    PR20["PR #20"]
    PR20 -->|modifies| Fmain_4875
    PR94["PR #94"]
    PR94 -->|modifies| Fmain_4875
    FparseArguments_6977["parseArguments<br>config.ts"]
    PR17 -->|modifies| FparseArguments_6977
    PR10["PR #10"]
    PR10 -->|modifies| FparseArguments_6977
    PR112["PR #112"]
    PR112 -->|modifies| FparseArguments_6977
    PR113["PR #113"]
    PR113 -->|modifies| FparseArguments_6977
    PR114["PR #114"]
    PR114 -->|modifies| FparseArguments_6977
    PR117["PR #117"]
    PR117 -->|modifies| FparseArguments_6977
    PR14 -->|modifies| FparseArguments_6977
    PR18["PR #18"]
    PR18 -->|modifies| FparseArguments_6977
    PR21["PR #21"]
    PR21 -->|modifies| FparseArguments_6977
    PR22["PR #22"]
    PR22 -->|modifies| FparseArguments_6977
    PR31["PR #31"]
    PR31 -->|modifies| FparseArguments_6977
    PR36["PR #36"]
    PR36 -->|modifies| FparseArguments_6977
    PR46["PR #46"]
    PR46 -->|modifies| FparseArguments_6977
    PR7["PR #7"]
    PR7 -->|modifies| FparseArguments_6977
    PR86["PR #86"]
    PR86 -->|modifies| FparseArguments_6977
    PR88["PR #88"]
    PR88 -->|modifies| FparseArguments_6977
    FresolveContentGeneratorConfigWithSources_5386["resolveContentGeneratorConfigWithSources<br>contentGenerator.ts"]
    PR17 -->|modifies| FresolveContentGeneratorConfigWithSources_5386
    PR14 -->|modifies| FresolveContentGeneratorConfigWithSources_5386
    FrunNonInteractive_8467["runNonInteractive<br>nonInteractiveCli.ts"]
    PR17 -->|modifies| FrunNonInteractive_8467
    PR114 -->|modifies| FrunNonInteractive_8467
    PR14 -->|modifies| FrunNonInteractive_8467
    PR6["PR #6"]
    PR6 -->|modifies| FrunNonInteractive_8467
    PR71["PR #71"]
    PR71 -->|modifies| FrunNonInteractive_8467
    PR75["PR #75"]
    PR75 -->|modifies| FrunNonInteractive_8467
    FstartInteractiveUI_4875["startInteractiveUI<br>gemini.tsx"]
    PR17 -->|modifies| FstartInteractiveUI_4875
    PR114 -->|modifies| FstartInteractiveUI_4875
    PR117 -->|modifies| FstartInteractiveUI_4875
    PR14 -->|modifies| FstartInteractiveUI_4875
    PR31 -->|modifies| FstartInteractiveUI_4875
    PR6 -->|modifies| FstartInteractiveUI_4875
    PR88 -->|modifies| FstartInteractiveUI_4875
    PR94 -->|modifies| FstartInteractiveUI_4875
    FvalidateNonInteractiveAuth_1241["validateNonInteractiveAuth<br>validateNonInterActiveAuth.ts"]
    PR17 -->|modifies| FvalidateNonInteractiveAuth_1241
    PR10 -->|modifies| FvalidateNonInteractiveAuth_1241
    PR14 -->|modifies| FvalidateNonInteractiveAuth_1241

Posted by codegraph-ai conflict detection.


Labels

conflicting-group-1: Conflicting PR group 1, review as a batch
conflicting-pr: Shares at least one cross-PR dependency with other PRs


2 participants