sqlalchemy-paradedb

Simple, Elastic-quality search for Postgres

Website • Docs • Community • Blog • Changelog

sqlalchemy-paradedb

ParadeDB integration for SQLAlchemy: typed helpers for BM25 indexes, search predicates, scoring, snippets, facets, and migration ergonomics.

Requirements & Compatibility

Component	Supported
Python	3.10+
SQLAlchemy	2.0.32+
ParadeDB	0.22.0+
PostgreSQL	15+ (with ParadeDB extension)

Installation

uv add sqlalchemy-paradedb

For local development:

uv sync --extra test --extra dev

Quick Start

Prerequisites

Install pg_search in your Postgres database and connect SQLAlchemy to that database.

Create a BM25 Index

from sqlalchemy import Index
from paradedb.sqlalchemy import indexing

products_bm25_idx = Index(
    "products_bm25_idx",
    indexing.BM25Field(Product.id),
    indexing.BM25Field(
        Product.description,
        tokenizer=indexing.tokenize.unicode(lowercase=True),
    ),
    indexing.BM25Field(
        Product.category,
        tokenizer=indexing.tokenize.literal(),
    ),
    postgresql_using="bm25",
    postgresql_with={"key_field": "id"},
)

For JSON columns named metadata, use metadata_ as the ORM attribute name.

Query with ParadeDB Predicates

from sqlalchemy import select
from sqlalchemy.orm import Session
from paradedb.sqlalchemy import pdb, search

stmt = (
    select(Product.id, Product.description)
    .where(search.match_any(Product.description, "running", "shoes"))
    .order_by(pdb.score(Product.id).desc())
    .limit(10)
)

with Session(engine) as session:
    rows = session.execute(stmt).all()

Rows + Facets in a Single Query

from sqlalchemy import select
from sqlalchemy.orm import Session
from paradedb.sqlalchemy import facets, search

base = (
    select(Product.id, Product.description)
    .where(search.match_all(Product.description, "running"))
    .order_by(Product.id)
    .limit(10)
)

stmt = facets.with_rows(
    base,
    agg=facets.multi(
        facets.value_count(field="id"),
        facets.terms(field="category", size=10),
    ),
    key_field=Product.id,
)

with Session(engine) as session:
    rows = session.execute(stmt).all()
    facet_payload = facets.extract(rows)

Search Patterns

Fuzzy Matching

from paradedb.sqlalchemy import search

search.term(Product.description, "shose", distance=1)
search.match_any(Product.description, "wirless", distance=1, prefix=True)
search.term(Product.description, "rnnuing", distance=1, transpose_cost_one=True)

Use fuzzy options on term, match_any, or match_all; there is no separate search.fuzzy(...) helper.

Phrase Prefix and More-Like-This

from paradedb.sqlalchemy import search

search.phrase_prefix(Product.description, ["running", "sh"])
search.more_like_this(Product.id, document_id=1, fields=["description"])

Proximity Composition

from sqlalchemy import select
from paradedb.sqlalchemy import search

prox = search.prox_array("running").within(1, search.prox_regex("sho.*"), ordered=True)
stmt = select(Product.id).where(search.proximity(Product.description, prox))

Indexing and Tokenizers

Tokenizer config can be expressed as a structured mapping:

from sqlalchemy import Index
from paradedb.sqlalchemy import indexing

products_bm25_idx = Index(
    "products_bm25_idx",
    indexing.BM25Field(Product.id),
    indexing.BM25Field(
        Product.description,
        tokenizer=indexing.tokenize.from_config(
            {
                "tokenizer": "simple",
                "filters": ["lowercase", "stemmer"],
                "stemmer": "english",
                "alias": "description_simple",
            }
        ),
    ),
    indexing.BM25Field(
        Product.description,
        tokenizer=indexing.tokenize.from_config(
            {
                "tokenizer": "ngram",
                "args": [3, 8],
                "named_args": {"prefix_only": True},
                "alias": "description_ngram",
            }
        ),
    ),
    postgresql_using="bm25",
    postgresql_with={"key_field": "id"},
)

Validate that a field is indexed with the expected tokenizer:

from paradedb.sqlalchemy import indexing

indexing.assert_indexed(engine, Product.category, tokenizer="literal")

Inspect BM25 metadata for a mapped table:

from paradedb.sqlalchemy import indexing

meta = indexing.describe(engine, Product.__table__)

Alembic Operations

Import once in migration environment startup so Alembic registers ParadeDB operations:

import paradedb.sqlalchemy.alembic  # noqa: F401

Use custom operations in migrations:

op.create_bm25_index(
    "products_bm25_idx",
    "products",
    ["id", "description"],
    key_field="id",
    table_schema="public",
)
op.reindex_bm25("products_bm25_idx", concurrently=True, schema="public")
op.drop_bm25_index("products_bm25_idx", if_exists=True, schema="public")

op.reindex_bm25(..., concurrently=True) must run outside a transaction (autocommit block).

Diagnostics Helpers

paradedb.sqlalchemy.diagnostics exposes wrapper functions for ParadeDB diagnostics:

from paradedb.sqlalchemy import diagnostics

indexes = diagnostics.paradedb_indexes(engine)
segments = diagnostics.paradedb_index_segments(engine, "products_bm25_idx")
check = diagnostics.paradedb_verify_index(engine, "products_bm25_idx", sample_rate=0.1)
all_checks = diagnostics.paradedb_verify_all_indexes(engine, schema_pattern="public")

Common Errors

`with_rows requires ORDER BY`

from sqlalchemy import select
from paradedb.sqlalchemy import facets

# Missing order_by(...)
base = select(Product.id).limit(10)
facets.with_rows(base, agg=facets.value_count(field="id"), key_field=Product.id)

`with_rows requires LIMIT`

from sqlalchemy import select
from paradedb.sqlalchemy import facets

# Missing limit(...)
base = select(Product.id).order_by(Product.id)
facets.with_rows(base, agg=facets.value_count(field="id"), key_field=Product.id)

`with_rows requires a ParadeDB predicate`

from sqlalchemy import select
from paradedb.sqlalchemy import facets

# ensure_predicate=False disables automatic search.all(...) injection
facets.with_rows(
    select(Product.id).order_by(Product.id).limit(10),
    agg=facets.value_count(field="id"),
    key_field=Product.id,
    ensure_predicate=False,
)

`tokenizer config requires 'tokenizer'`

from paradedb.sqlalchemy import indexing

indexing.tokenize.from_config({"filters": ["lowercase"]})

Examples

Documentation

ParadeDB Official Docs: https://docs.paradedb.com
ParadeDB Website: https://paradedb.com

Contributing

See CONTRIBUTING.md for development setup, running tests, linting, and the PR workflow.

Support

If you're missing a feature or have found a bug, please open a GitHub Issue.

To get community support, you can:

Post a question in the ParadeDB Slack Community
Ask for help on our GitHub Discussions

If you need commercial support, please contact the ParadeDB team.

License

sqlalchemy-paradedb is licensed under the MIT License.

Name		Name	Last commit message	Last commit date
Latest commit History 63 Commits
.github		.github
examples		examples
paradedb		paradedb
scripts		scripts
tests		tests
.codespellignore		.codespellignore
.codespellrc		.codespellrc
.gitignore		.gitignore
.markdownlint.yaml		.markdownlint.yaml
.pre-commit-config.yaml		.pre-commit-config.yaml
.prettierrc		.prettierrc
AGENTS.md		AGENTS.md
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
api.json		api.json
apiignore.json		apiignore.json
codecov.yml		codecov.yml
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Website • Docs • Community • Blog • Changelog

sqlalchemy-paradedb

Requirements & Compatibility

Installation

Quick Start

Prerequisites

Create a BM25 Index

Query with ParadeDB Predicates

Rows + Facets in a Single Query

Search Patterns

Fuzzy Matching

Phrase Prefix and More-Like-This

Proximity Composition

Indexing and Tokenizers

Alembic Operations

Diagnostics Helpers

Common Errors

`with_rows requires ORDER BY`

`with_rows requires LIMIT`

`with_rows requires a ParadeDB predicate`

`tokenizer config requires 'tokenizer'`

Examples

Documentation

Contributing

Support

License

About

Uh oh!

Releases 2

Sponsor this project

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

Website • Docs • Community • Blog • Changelog

sqlalchemy-paradedb

Requirements & Compatibility

Installation

Quick Start

Prerequisites

Create a BM25 Index

Query with ParadeDB Predicates

Rows + Facets in a Single Query

Search Patterns

Fuzzy Matching

Phrase Prefix and More-Like-This

Proximity Composition

Indexing and Tokenizers

Alembic Operations

Diagnostics Helpers

Common Errors

with_rows requires ORDER BY

with_rows requires LIMIT

with_rows requires a ParadeDB predicate

tokenizer config requires 'tokenizer'

Examples

Documentation

Contributing

Support

License

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 2

Sponsor this project

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

`with_rows requires ORDER BY`

`with_rows requires LIMIT`

`with_rows requires a ParadeDB predicate`

`tokenizer config requires 'tokenizer'`