Summary
Fresh PGLite installs can initialize the schema with the legacy OpenAI 1536-dim embedding column while the runtime gateway defaults to ZeroEntropy zembed-1 at 1280 dimensions. The first write/embed then fails with a dimension mismatch.
Observed behavior
On a fresh-ish local PGLite brain with no pages:
gbrain v0.37.1.0
config.json:
{
  "engine": "pglite",
  "database_path": "/home/opc/.gbrain/brain.pglite"
}
gbrain doctor --json reports the effective embedding provider as ZeroEntropy:
embedding_provider: ok / skipped
Model: zeroentropyai:zembed-1
connection: Connected, 0 pages
But a first gbrain put personal/goals.md fails when embedding:
expected 1536 dimensions, not 1280
The brain is empty, so this is not an old-data migration problem. The schema itself appears to have been initialized with vector(1536), while the active embedding path is producing 1280-dim ZeroEntropy vectors.
Why this looks like a default mismatch
The current runtime gateway defaults appear to be ZeroEntropy 1280:
// src/core/ai/gateway.ts
const DEFAULT_EMBEDDING_MODEL = 'zeroentropyai:zembed-1';
const DEFAULT_EMBEDDING_DIMENSIONS = 1280;
But several schema/init paths still appear to have legacy OpenAI defaults:
// src/core/pglite-schema.ts
export function getPGLiteSchema(dims: number = 1536, model: string = 'text-embedding-3-large'): string
Similar legacy fallbacks also appear in the PGLite/Postgres initialization paths:
src/core/pglite-engine.ts: let dims = 1536
src/core/postgres-engine.ts: let dims = 1536
src/core/schema-embedded.ts: embedding vector(1536)
This creates a split-brain default:
schema default: 1536 / OpenAI-style
runtime gateway default: 1280 / ZeroEntropy
Expected behavior
A new PGLite brain should be internally consistent.
Either:
- The schema initializes to ZeroEntropy's default dimensions:
embedding_model = zeroentropyai:zembed-1
embedding_dimensions = 1280
content_chunks.embedding = vector(1280)
or:
- If the schema intentionally remains 1536 for compatibility, the effective runtime embedding model/dimensions should also resolve to 1536 until the user explicitly runs a provider switch.
Suggested fix
Centralize the embedding defaults used by:
- gateway runtime config
- PGLite schema creation
- Postgres schema creation
- config seed rows
- doctor/schema dimension checks
At minimum, gbrain init --pglite should seed embedding_model and embedding_dimensions consistently with the schema it creates.
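One way to centralize the defaults, as a minimal sketch: a single constant feeds both the schema DDL and the config seed rows, so they cannot drift apart. The names `EmbeddingDefaults`, `EMBEDDING_DEFAULTS`, `embeddingColumnDDL`, and `configSeedRows` are hypothetical and are not taken from the gbrain codebase. Only the model name and the 1280 value come from the gateway defaults quoted above.

```typescript
// Hypothetical shared module (name and location are assumptions, not gbrain code).
// In a real module these declarations would be exported and imported by the
// gateway, both engines, and doctor.
interface EmbeddingDefaults {
  readonly model: string;
  readonly dimensions: number;
}

// Values copied from the gateway defaults quoted in this issue.
const EMBEDDING_DEFAULTS: EmbeddingDefaults = {
  model: 'zeroentropyai:zembed-1',
  dimensions: 1280,
};

// Hypothetical helper: builds the embedding column definition from the shared
// defaults instead of a hard-coded 1536.
function embeddingColumnDDL(d: EmbeddingDefaults = EMBEDDING_DEFAULTS): string {
  if (!Number.isInteger(d.dimensions) || d.dimensions <= 0) {
    throw new Error(`invalid embedding dimensions: ${d.dimensions}`);
  }
  return `embedding vector(${d.dimensions})`;
}

// Hypothetical helper: config rows seeded at init time, derived from the same
// object that produced the column width.
function configSeedRows(
  d: EmbeddingDefaults = EMBEDDING_DEFAULTS,
): Array<[string, string]> {
  return [
    ['embedding_model', d.model],
    ['embedding_dimensions', String(d.dimensions)],
  ];
}
```

With this shape, an init path that wants the legacy 1536 width has to pass it explicitly, and the seeded config rows follow the same value.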
gbrain doctor should also catch this before the first embed/write by comparing:
actual schema vector width
configured/effective embedding_dimensions
configured/effective embedding_model
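The comparison above could look roughly like the following. This is a sketch only: `parseVectorWidth` and `checkEmbeddingDimensions` are hypothetical names, and it assumes doctor can already obtain the column type as a string such as `vector(1536)`. How that string is read from PGLite or Postgres is left out.

```typescript
// Result shape is an assumption; gbrain doctor's real report format may differ.
interface DimensionCheck {
  ok: boolean;
  message: string;
}

// Extracts N from a column type string of the form "vector(N)".
// Returns null when the string does not have that form.
function parseVectorWidth(columnType: string): number | null {
  const match = /^vector\((\d+)\)$/i.exec(columnType.trim());
  return match ? Number(match[1]) : null;
}

// Compares the actual schema width with the configured/effective dimensions
// and names the model in the message so the mismatch is actionable.
function checkEmbeddingDimensions(
  columnType: string,
  configuredDims: number,
  model: string,
): DimensionCheck {
  const width = parseVectorWidth(columnType);
  if (width === null) {
    return { ok: false, message: `could not read vector width from "${columnType}"` };
  }
  if (width !== configuredDims) {
    return {
      ok: false,
      message: `schema is vector(${width}) but ${model} is configured for ${configuredDims} dimensions`,
    };
  }
  return { ok: true, message: `schema width ${width} matches ${model}` };
}
```

With the values from this report, `checkEmbeddingDimensions('vector(1536)', 1280, 'zeroentropyai:zembed-1')` returns `ok: false`, which is the state doctor currently reports as healthy.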
Reproduction outline
gbrain init --pglite
# with no explicit --model / --embedding-model args
# configure ZEROENTROPY_API_KEY or otherwise let runtime resolve to zeroentropyai:zembed-1
gbrain put personal/goals.md < some-file.md
Actual:
expected 1536 dimensions, not 1280
Expected:
put succeeds, or doctor/init catches and fixes the mismatch before embedding
Impact
This breaks the first write for a new PGLite brain even when the brain has 0 pages, because the schema width and runtime embedding provider dimensions disagree.