[Umbrella] SurrealDB: Architecture Decision & Known Challenges #372

# 🗄️ SurrealDB: Why We Use It & Known Challenges

This is a tracking issue that documents our database architecture decision and groups related issues. Unlike other umbrella issues focused on community work, this one explains **why we made this choice** and tracks challenges we're actively working on.

## Why SurrealDB?

Open Notebook chose SurrealDB for several strategic reasons:

### 1. Multi-model in One
SurrealDB combines document store, graph database, and relational features. This matters for Open Notebook because:
- **Sources** are documents with metadata
- **Notebooks ↔ Sources** relationships are graph-like
- **Embeddings** are handled by SurrealDB's built-in vector support, which we already use extensively for semantic search

### 2. Graph-First Future
We plan to significantly expand the use of graph relationships in the product. SurrealDB's native graph capabilities will enable:
- Complex knowledge connections between sources
- Cross-notebook relationships
- Semantic linking of concepts

### 3. AI-Oriented Database
SurrealDB is explicitly focused on AI use cases and actively evolving in this direction:
- Native vector embeddings (which we already use)
- ML-friendly query patterns
- Continuous improvements for AI workloads
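
To make the semantic search use case concrete, here is a minimal pure-Python sketch of the operation a vector-capable database performs for us: rank stored embeddings by cosine similarity to a query embedding. It is an illustration only, not SurrealDB's implementation and not Open Notebook code; the names `cosine_similarity` and `top_k` and the toy two-dimensional vectors are assumptions made for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], chunks: dict[str, list[float]], k: int = 3):
    """Return the k (chunk_id, score) pairs most similar to the query."""
    scored = [(cid, cosine_similarity(query, vec)) for cid, vec in chunks.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy embeddings; real ones have hundreds of dimensions.
    chunks = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    print(top_k([1.0, 0.0], chunks, k=2))
```

Having the database do this ranking natively is what lets us avoid running a separate vector store.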

### 4. Frontend-Accessible (like Firebase/Supabase)
SurrealDB can be accessed directly from the frontend, similar to Firebase or Supabase. This enables:
- Real-time subscriptions from the UI
- Simplified architecture
- Future possibilities for offline-first features

### 5. Simplified Infrastructure
Traditional stacks require multiple services:
```
Typical stack:          Open Notebook stack:
├── Postgres            └── SurrealDB (does it all)
├── Redis (cache)           ├── Data storage
├── Celery (jobs)           ├── Background jobs (surreal-commands)
└── Vector DB               ├── Vector embeddings
                            └── Real-time subscriptions
```

This means:
- **Easier self-hosting** — one database to manage
- **Simpler Docker setup** — fewer containers
- **Lower resource usage** — important for local/privacy-focused users

## Known Trade-offs

We're aware of these challenges:

| Challenge | Impact | Our Approach |
|-----------|--------|--------------|
| **Younger ecosystem** | Fewer tutorials, smaller community | We document more, contribute back |
| **Transaction conflicts** | Verbose error logs under concurrency | Mitigated with automatic retries (see below) |
| **Performance tuning** | Less established best practices | Profile and optimize as we go |
| **Enterprise readiness** | Questions about production scale | Monitor closely, have fallback plan |

### About Transaction Conflicts

This is a **known issue** that the SurrealDB team is actively addressing in upcoming releases. On our side:

- **Mitigated**: We use [Tenacity](https://github.com/jd/tenacity) for automatic retries
- **Current impact**: Mostly log verbosity rather than actual failures (confirmation is tracked in #373)
- **What we need**: Better log management to reduce noise

This is not a blocker — it's a managed inconvenience.
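
For readers unfamiliar with the retry approach, the sketch below shows the general idea using only the standard library: retry an operation on a conflict, with capped exponential backoff and jitter. It is not the project's code and not Tenacity's API. `TransactionConflictError`, `retry_on_conflict`, and all default values are hypothetical names and numbers chosen for illustration.

```python
import random
import time


class TransactionConflictError(Exception):
    """Hypothetical stand-in for a read/write conflict raised by a DB client."""


def retry_on_conflict(operation, max_attempts=5, base_delay=0.05,
                      max_delay=1.0, sleep=time.sleep):
    """Call operation(); on a conflict, wait and try again.

    The wait grows exponentially per attempt, is capped at max_delay, and is
    randomized (full jitter) so concurrent writers do not retry in lockstep.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransactionConflictError:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the real failure.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))


if __name__ == "__main__":
    attempts = {"n": 0}

    def flaky_write():
        # Fails twice with a conflict, then succeeds.
        attempts["n"] += 1
        if attempts["n"] < 3:
            raise TransactionConflictError("conflict")
        return "written"

    print(retry_on_conflict(flaky_write), "after", attempts["n"], "attempts")
```

Each failed attempt is typically logged before the retry, which is where the log noise comes from even when the operation eventually succeeds.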

## Current Decision

> **Stay with SurrealDB, work through the challenges.**

### Rationale

1. **Migration cost is high** — Rewriting data layer + losing graph features
2. **Problems are addressable** — Transaction conflicts have workarounds
3. **Unique value** — No other single DB gives us document + graph + jobs
4. **Aligned with our users** — Privacy-focused users prefer simpler infra

### When We'd Reconsider

- Transaction conflicts become unworkable despite optimizations
- Performance doesn't improve with query tuning
- Critical security issue without timely fix
- A clear alternative emerges with same benefits + maturity

## Issues Being Tracked

### Critical
- [ ] #362 — Transaction conflicts (read/write conflicts under load)

### Investigation
- [ ] #373 — Confirm transaction conflicts are log verbosity only, not real failures

### Performance
- [ ] #351 — Listing sources slow with 900+ items (query optimization needed)

### Tooling
- [ ] #186 — Command monitor for surreal-commands visibility

### Related
- [ ] #366 — Kubernetes deployment (includes SurrealDB StatefulSet)

## How We're Addressing These

### Transaction Conflicts (#362)
```
Current: Retry logic with Tenacity in surreal-commands
Planned:
- Confirm impact is log verbosity only (#373)
- Optimize concurrent write patterns
- Batch operations where possible
- Better log filtering
```
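
One possible shape for the "better log filtering" item is a standard `logging.Filter` that lets only every Nth conflict message through. This is a sketch under assumptions, not a planned implementation: the class name `ConflictNoiseFilter` is hypothetical, and the default marker string is a guess at the conflict message text that would need to be checked against real logs.

```python
import logging


class ConflictNoiseFilter(logging.Filter):
    """Pass the 1st, (N+1)th, (2N+1)th... conflict record; drop the rest.

    Records that do not contain the marker are never dropped.
    """

    def __init__(self, marker="read or write conflict", every=100):
        super().__init__()
        self.marker = marker.lower()  # Assumed message text; verify against real logs.
        self.every = every
        self.seen = 0  # Total conflict records observed so far.

    def filter(self, record: logging.LogRecord) -> bool:
        if self.marker not in record.getMessage().lower():
            return True
        self.seen += 1
        return (self.seen - 1) % self.every == 0


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.addFilter(ConflictNoiseFilter(every=3))
    logger = logging.getLogger("conflict_demo")
    logger.addHandler(handler)
    logger.propagate = False
    for i in range(7):
        logger.error("read or write conflict on attempt %d", i)
    logger.error("unrelated error")  # Always passes the filter.
```

Attaching the filter to a handler rather than a logger keeps it effective for records that propagate up from child loggers.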

### Performance (#351)
```
Current: Investigating query patterns
Planned:
- Add pagination to source listing
- Optimize indexes
- Consider caching hot paths
```
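
As a rough illustration of the pagination item, the helper below turns a page number into a SurrealQL query using the `LIMIT` and `START` clauses. The table name `source`, the `created` field, the sort order, and the function name `build_source_page_query` are assumptions for the example; the real schema and query may differ.

```python
def build_source_page_query(page: int, page_size: int = 50) -> str:
    """Build a SurrealQL query for one page of sources (pages start at 1)."""
    if not isinstance(page, int) or page < 1:
        raise ValueError("page must be an integer >= 1")
    if not isinstance(page_size, int) or not 1 <= page_size <= 500:
        raise ValueError("page_size must be an integer between 1 and 500")
    offset = (page - 1) * page_size
    # Only validated integers are interpolated, so no user text reaches the query.
    return (
        "SELECT * FROM source ORDER BY created DESC "
        f"LIMIT {page_size} START {offset};"
    )


if __name__ == "__main__":
    print(build_source_page_query(1))
    print(build_source_page_query(3, page_size=25))
```

Offset-based paging like this is the simplest option; it may still slow down on very deep pages, in which case paging on the sort key is worth profiling as an alternative.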

### Visibility (#186)
```
Current: Logs only
Planned:
- Command monitor UI in advanced settings
- Show queue depth, success/failure rates
```
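
To sketch what the monitor could compute, the example below derives queue depth and success/failure rates from a list of command records. The `CommandRecord` shape and the status values (`queued`, `running`, `completed`, `failed`) are hypothetical and not taken from surreal-commands.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CommandRecord:
    command_id: str
    status: str  # Hypothetical values: "queued", "running", "completed", "failed"


def summarize(records: list[CommandRecord]) -> dict:
    """Summarize queue depth and success/failure rates for a monitor view."""
    counts = Counter(r.status for r in records)
    finished = counts["completed"] + counts["failed"]
    return {
        "queue_depth": counts["queued"],
        "running": counts["running"],
        # Rates are None until at least one command has finished.
        "success_rate": counts["completed"] / finished if finished else None,
        "failure_rate": counts["failed"] / finished if finished else None,
    }


if __name__ == "__main__":
    statuses = ["queued", "queued", "running", "completed", "completed",
                "completed", "failed"]
    records = [CommandRecord(f"cmd:{i}", s) for i, s in enumerate(statuses)]
    print(summarize(records))
```

Rates are computed over finished commands only, so a long queue does not make the failure rate look artificially low.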

## Alternatives We Considered

| Option | What We'd Gain | What We'd Lose |
|--------|----------------|----------------|
| **PostgreSQL + pgvector** | Maturity, ecosystem, proven scale | Graph queries, simple jobs (need Celery) |
| **SQLite + LiteFS** | Ultimate simplicity, zero config | Scale, concurrency, graph features |
| **MongoDB + Redis + Celery** | Familiar stack, lots of tooling | Simplicity, our infra advantage |
| **Hybrid (Postgres + Neo4j)** | Mature relational storage plus a dedicated graph database | Complexity, ops burden |

## For Contributors

If you're experiencing SurrealDB issues:

1. **Check if it's known** — Look at linked issues above
2. **Provide details** — SurrealDB version, query patterns, data volume
3. **Share workarounds** — If you found one, others benefit

If you want to help:
- Performance profiling is welcome
- Query optimization PRs appreciated
- Documentation of patterns that work

### 🚀 Opportunity: Schema & Feature Optimization

**We started using SurrealDB before fully understanding everything it could do.**

This means there's significant opportunity to improve our current schema and leverage features we're not yet using. If you have SurrealDB expertise, you could help with:

- **Schema optimization** — Better table structures, indexes, relations
- **Query patterns** — More idiomatic SurrealQL
- **Feature adoption** — Using capabilities we haven't explored yet
- **Performance tuning** — Identifying bottlenecks and fixes

This is a great way to contribute if you know SurrealDB well — you'd be directly improving the database layer that powers everything.

### 💬 We Want Your Input

**Do you have experience with database infrastructure at scale?**

We'd love to hear from you. This is an open discussion — we're not married to any particular solution. If you have:

- Experience running SurrealDB in production
- Suggestions for alternative architectures
- Ideas for optimizations we haven't considered
- War stories from similar migrations
- **SurrealDB expertise** — schema design, query optimization, best practices

**Please comment on this issue.** We value practical experience over theoretical debates. Tell us what you've seen work (or fail) in real-world scenarios.

> 💡 **SurrealDB community**: We're actively sharing this in SurrealDB forums. If you landed here from there — welcome! We'd especially appreciate your database expertise.

## References

- [SurrealDB Docs](https://surrealdb.com/docs)
- [surreal-commands](https://github.com/lfnovo/surreal-commands) — Our job queue implementation
- [SurrealDB GitHub Issues](https://github.com/surrealdb/surrealdb/issues) — Upstream tracking

---

**Maintainer**: @lfnovo

_Last updated: 2026-01-01_
