fix: auto-reconnect connection pool before batch retry by justemu · Pull Request #1615 · garrytan/gbrain

justemu · 2026-05-29T04:26:48Z

Problem

The batchRetry helper retries transient connection errors but never re-establishes the database connection between attempts. When the postgres.js pool drops (network blip, Supavisor circuit-breaker) or the module-level db.sql is nulled mid-cycle ("No database connection: connect() has not been called"), every retry fails with the same error and all rows in the batch are silently lost.

Observed in the wild: extract.links_fs reports "connection blip, retrying (1/3)... (2/3)... (3/3)" followed by "batch error (66 link rows lost): No database connection: connect() has not been called".

Fix

Two minimal changes:

1. PostgresEngine.batchRetry(): call this.reconnect() before each retry attempt. reconnect() already handles both instance-level and module-level connection styles, has its own _reconnecting mutex to prevent races, and no-ops if the pool is healthy.
2. withRetry (retry.ts): await the onRetry callback so async reconnect operations complete before the inter-attempt delay starts. Non-async callbacks resolve immediately (backward-compatible interface change: void | Promise<void>).
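
The two changes can be illustrated with a minimal sketch. This is not the code from retry.ts or PostgresEngine: the option names (maxRetries, delayMs), the fixed delay, and the FakeEngine class are assumptions made for illustration, and the sketch retries on every error, whereas the real helper presumably retries only errors it classifies as transient. Only the idea of awaiting an onRetry callback typed as void | Promise<void>, and reconnecting inside it, comes from the description above.

```typescript
// Hypothetical sketch: only withRetry, onRetry and reconnect are names from the PR text.
type OnRetry = (attempt: number, err: unknown) => void | Promise<void>;

interface RetryOptions {
  maxRetries: number; // retries allowed after the first attempt (assumed name)
  delayMs: number;    // fixed inter-attempt delay (assumed; the real helper may back off)
  onRetry?: OnRetry;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function withRetry<T>(fn: () => Promise<T>, opts: RetryOptions): Promise<T> {
  let attempt = 0;
  for (;;) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= opts.maxRetries) throw err;
      attempt += 1;
      // Awaiting here lets an async callback (such as a reconnect) finish
      // before the delay starts; a synchronous callback resolves immediately.
      await opts.onRetry?.(attempt, err);
      await sleep(opts.delayMs);
    }
  }
}

// Stand-in for an engine whose connection has been dropped.
class FakeEngine {
  connected = false;
  reconnects = 0;

  async reconnect(): Promise<void> {
    if (this.connected) return; // no-op when the connection is healthy
    await sleep(1);             // simulate asynchronous reconnect work
    this.connected = true;
    this.reconnects += 1;
  }

  async insertBatch(rows: string[]): Promise<number> {
    if (!this.connected) {
      throw new Error("No database connection: connect() has not been called");
    }
    return rows.length;
  }
}

async function demo(): Promise<{ inserted: number; reconnects: number }> {
  const engine = new FakeEngine();
  const inserted = await withRetry(() => engine.insertBatch(["a", "b"]), {
    maxRetries: 3,
    delayMs: 1,
    onRetry: () => engine.reconnect(), // reconnect before the next attempt
  });
  return { inserted, reconnects: engine.reconnects };
}
```

If the callback were not awaited, the next attempt could start while the reconnect was still in flight and fail with the same error, which is the failure mode described under Problem.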

Testing

✅ bun run build — compiles cleanly
✅ bun test test/core/retry.test.ts test/retry-matcher.test.ts — all 52 tests pass
✅ gbrain dream --phase extract --json — cycle completes cleanly on 365-page brain

Closes #1603

…user-provided models

Two fixes in gateway.ts:

1. Embed preflight (): When a user configures , the preflight incorrectly returns because it checks without verifying whether the model string actually carries a model id. Add guard so that a non-empty model id after the colon in satisfies the preflight.
2. Chat response normalization: AI SDK v6 returns reasoning-model output (e.g. DeepSeek ) as typed blocks in or as . The current code skips these, producing empty text output when a reasoning model responds without a separate text part. Handle both paths:
   - In the array: treat as text
   - In the flat fallback: check when is empty

DeepSeek v4-flash defaults to thinking mode, producing ~50% unnecessary output tokens for gbrain's non-reasoning use cases. The same risk exists for any OpenAI-compatible provider (Ollama, OpenRouter, Groq, Together, llama-server, etc.) that may add thinking support in the future. Rather than targeting only DeepSeek, apply the thinking disable to ALL openai-compatible providers (recipe.implementation === 'openai-compatible'). The @ai-sdk/openai-compatible provider spreads providerOptions[recipe.id] into the request body; providers that don't recognize 'thinking' silently ignore the parameter — 100% safe. Does not affect native providers (OpenAI o-series, Anthropic Claude, Google Gemini) which have different parameter names for reasoning control.
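
A rough sketch of the gating described in this commit message follows. The Recipe interface, the "native" implementation value, the helper name thinkingOptionsFor, and the payload shape { type: "disabled" } are all assumptions for illustration; the message itself confirms only that the check is recipe.implementation === 'openai-compatible' and that a 'thinking' parameter is passed under providerOptions[recipe.id].

```typescript
// Hypothetical recipe shape: only `id` and `implementation` appear in the commit message.
interface Recipe {
  id: string;
  implementation: "openai-compatible" | "native"; // "native" is an assumed label
}

type ProviderOptions = Record<string, Record<string, unknown>>;

// Returns provider options that ask the provider to disable thinking,
// but only for OpenAI-compatible recipes. Native providers get nothing.
function thinkingOptionsFor(recipe: Recipe): ProviderOptions {
  if (recipe.implementation !== "openai-compatible") return {};
  // Assumed payload shape; the exact value sent by gbrain is not shown in the source.
  return { [recipe.id]: { thinking: { type: "disabled" } } };
}

const deepseek = thinkingOptionsFor({ id: "deepseek", implementation: "openai-compatible" });
const anthropic = thinkingOptionsFor({ id: "anthropic", implementation: "native" });
console.log(JSON.stringify(deepseek), JSON.stringify(anthropic));
```

Keying the options by recipe.id keeps the parameter scoped to the one provider in use, so recipes outside the openai-compatible group are left untouched.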

The batchRetry helper retries transient connection errors but never re-establishes the database connection between attempts. When the postgres.js pool drops (network blip, Supavisor circuit-breaker) or the module-level db.sql is nulled mid-cycle, every retry fails with the same 'No database connection' error and all rows in the batch are silently lost.

Fix:

1. PostgresEngine.batchRetry(): call this.reconnect() before each retry attempt. reconnect() already handles both instance-level and module-level connection styles, has its own _reconnecting mutex to prevent races, and no-ops if the pool is healthy.
2. withRetry (retry.ts): await the onRetry callback so async reconnect operations complete before the inter-attempt delay starts. Non-async callbacks resolve immediately (backward-compatible interface change: void | Promise<void>).

Fixes garrytan#1603

garrytan · 2026-06-08T02:39:07Z

Thanks for this! The Postgres module-singleton / "connect() has not been called" ownership bug was fixed canonically in #1805 (atomic owner-only-disconnect), with follow-ups in #1608 and #1572 — all merged to master. Closing as already-resolved. If you still reproduce after upgrading, please reopen, and thank you for the contribution.

justemu added 4 commits May 27, 2026 07:56

Merge: reasoning model output and embed preflight fix

559fc61

rayers mentioned this pull request Jun 3, 2026

fix(db): refcount module singleton to fix mid-cycle "connect() has not been called" (#1570) #1754

Closed

garrytan closed this Jun 8, 2026

justemu wants to merge 4 commits into garrytan:master from justemu:fix/batch-retry-auto-reconnect
