fix: auto-reconnect connection pool before batch retry#1615
Closed
justemu wants to merge 4 commits into
Closed
Conversation
…user-provided models
Two fixes in gateway.ts:
1. Embed preflight (): When a user configures
,
the preflight incorrectly returns because
it checks without verifying whether the
model string actually carries a model id. Add guard
so that a non-empty model id after the colon in
satisfies the preflight.
2. Chat response normalization: AI SDK v6 returns reasoning-model output
(e.g. DeepSeek ) as typed blocks in
or as . The current code skips
these, producing empty text output when a reasoning model responds
without a separate text part. Handle both paths:
- In the array: treat as text
- In the flat fallback: check when is empty
DeepSeek v4-flash defaults to thinking mode, producing ~50% unnecessary output tokens for gbrain's non-reasoning use cases. The same risk exists for any OpenAI-compatible provider (Ollama, OpenRouter, Groq, Together, llama-server, etc.) that may add thinking support in the future. Rather than targeting only DeepSeek, apply the thinking disable to ALL openai-compatible providers (recipe.implementation === 'openai-compatible'). The @ai-sdk/openai-compatible provider spreads providerOptions[recipe.id] into the request body; providers that don't recognize 'thinking' silently ignore the parameter — 100% safe. Does not affect native providers (OpenAI o-series, Anthropic Claude, Google Gemini) which have different parameter names for reasoning control.
The batchRetry helper retries transient connection errors but never re-establishes the database connection between attempts. When the postgres.js pool drops (network blip, Supavisor circuit-breaker) or the module-level db.sql is nulled mid-cycle, every retry fails with the same 'No database connection' error and all rows in the batch are silently lost. Fix: 1. PostgresEngine.batchRetry(): call this.reconnect() before each retry attempt. reconnect() already handles both instance-level and module-level connection styles, has its own _reconnecting mutex to prevent races, and no-ops if the pool is healthy. 2. withRetry (retry.ts): await the onRetry callback so async reconnect operations complete before the inter-attempt delay starts. Non-async callbacks resolve immediately (backward- compatible interface change: void | Promise<void>). Fixes garrytan#1603
Owner
|
Thanks for this! The Postgres module-singleton / "connect() has not been called" ownership bug was fixed canonically in #1805 (atomic owner-only-disconnect), with follow-ups in #1608 and #1572 — all merged to master. Closing as already-resolved. If you still reproduce after upgrading, please reopen, and thank you for the contribution. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
The
batchRetryhelper retries transient connection errors but never re-establishes the database connection between attempts. When the postgres.js pool drops (network blip, Supavisor circuit-breaker) or the module-leveldb.sqlis nulled mid-cycle ("No database connection: connect() has not been called"), every retry fails with the same error and all rows in the batch are silently lost.Observed in the wild:
extract.links_fsreportsconnection blip, retrying (1/3)... (2/3)... (3/3)followed bybatch error (66 link rows lost): No database connection: connect() has not been called.Fix
Two minimal changes:
PostgresEngine.batchRetry(): callthis.reconnect()before each retry attempt.reconnect()already handles both instance-level and module-level connection styles, has its own_reconnectingmutex to prevent races, and no-ops if the pool is healthy.withRetry(retry.ts):awaittheonRetrycallback so async reconnect operations complete before the inter-attempt delay starts. Non-async callbacks resolve immediately (backward-compatible interface change:void | Promise<void>).Testing
bun run build— compiles cleanlybun test test/core/retry.test.ts test/retry-matcher.test.ts— all 52 tests passgbrain dream --phase extract --json— cycle completes cleanly on 365-page brainCloses #1603