coffy.nosql

  • Embedded NoSQL document store with a fluent, chainable query API
  • Supports nested fields, logical filters, aggregations, projections, and joins
  • Built for local usage with optional persistence; minimal setup, fast iteration

Scope: local, single-process, small datasets. Designed for clarity and fast iteration.



Quick Start

from coffy.nosql import db

users = db("users", path="data/users.json")
users.clear()  # start clean for this demo

users.add_many([
    {"id": 1, "name": "Neel", "email": "neel@a.com", "age": 30, "address": {"city": "Indy"}},
    {"id": 2, "name": "Bea",  "email": "bea@b.com",  "age": 25, "address": {"city": "Austin"}},
    {"id": 3, "name": "Carl", "email": "carl@c.com", "age": 40},
])

# Basic equality
q = users.where("name").eq("Neel")
print(q.first())
# -> {'id': 1, 'name': 'Neel', ...}

# Nested field access
q = users.where("address.city").eq("Austin")
print(q.count())
# -> 1

# Projection
print(q.run(fields=["id", "address.city"]).as_list())
# -> [{'id': 2, 'address.city': 'Austin'}]

Data Model & Persistence

  • A collection stores a list of documents (plain dicts).
  • Documents can have different fields.
  • Use path="file.json" for durable persistence; omitted or invalid path means in-memory only.
  • JSON on disk is pretty-printed and human-readable.

Example on disk:

[
  {"id": 1, "name": "Neel", "age": 30},
  {"id": 2, "name": "Bea", "age": 25}
]
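The on-disk layout above is a JSON array of objects. As a rough illustration only (not coffy's actual persistence code), the sketch below writes and reads that layout with the standard library. `save_docs` and `load_docs` are hypothetical helper names, and treating a missing file as an empty collection is an assumption.

```python
import json
import os
import tempfile

def save_docs(docs, path):
    """Write a list of dict documents as indented, human-readable JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(docs, f, indent=2)

def load_docs(path):
    """Read documents back; a missing file is treated as an empty collection (assumption)."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        return json.load(f)

docs = [
    {"id": 1, "name": "Neel", "age": 30},
    {"id": 2, "name": "Bea", "age": 25},
]
path = os.path.join(tempfile.mkdtemp(), "users.json")
save_docs(docs, path)
restored = load_docs(path)
print(restored == docs)  # True
```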

Start Here

CollectionManager

Constructor

CollectionManager(name: str, path: str | None = None)
  • name -- the collection name
  • path -- optional path to a JSON file for persistence; if None or :memory:, in-memory only

Insertion

add(document: dict) -> {"inserted": 1}
add_many(docs: list[dict]) -> {"inserted": N}

Examples

users.add({"id": 4, "name": "Drew"})
users.add_many([{"id": 5}, {"id": 6, "active": True}])

Query entrypoints

where(field: str) -> QueryBuilder      # starts a query
match_any(*builders) -> QueryBuilder   # OR across sub-queries
match_all(*builders) -> QueryBuilder   # AND across sub-queries
not_any(*builders)  -> QueryBuilder    # NOT( OR(sub-queries) )

Examples

# where + eq
users.where("name").eq("Neel").first()

# match_any
users.match_any(
    lambda q: q.where("age").gt(35),
    lambda q: q.where("name").eq("Bea")
).run().as_list()

# match_all
users.match_all(
    lambda q: q.where("age").gte(25),
    lambda q: q.where("age").lt(40)
).count()

# not_any
users.not_any(
    lambda q: q.where("name").eq("Neel"),
    lambda q: q.where("age").eq(40)
).run().as_list()
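To make the three groupings concrete, here is a plain-Python sketch of the boolean logic they describe (OR, AND, and NOT of an OR). It does not use coffy; `any_of`, `all_of`, and `none_of` are hypothetical stand-ins operating on ordinary predicates.

```python
def any_of(*preds):
    """True when at least one predicate accepts the document (OR)."""
    return lambda doc: any(p(doc) for p in preds)

def all_of(*preds):
    """True when every predicate accepts the document (AND)."""
    return lambda doc: all(p(doc) for p in preds)

def none_of(*preds):
    """True when no predicate accepts the document, i.e. NOT(OR(...))."""
    return lambda doc: not any(p(doc) for p in preds)

users = [
    {"id": 1, "name": "Neel", "age": 30},
    {"id": 2, "name": "Bea", "age": 25},
    {"id": 3, "name": "Carl", "age": 40},
]

def names(pred):
    return [d["name"] for d in users if pred(d)]

any_names = names(any_of(lambda d: d["age"] > 35, lambda d: d["name"] == "Bea"))
all_names = names(all_of(lambda d: d["age"] >= 25, lambda d: d["age"] < 40))
none_names = names(none_of(lambda d: d["name"] == "Neel", lambda d: d["age"] == 40))
print(any_names)   # ['Bea', 'Carl']
print(all_names)   # ['Neel', 'Bea']
print(none_names)  # ['Bea']
```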

Aggregations (collection-level helpers)

sum(field: str) -> number           # sum of numeric field values
avg(field: str) -> float            # average of numeric field values
min(field: str) -> number | None    # minimum value of numeric field
max(field: str) -> number | None    # maximum value of numeric field
count() -> int                      # count of documents in the collection
first() -> dict | None              # first document in the collection

Examples

users.sum("age")      # 95
users.avg("age")      # 31.66...
users.min("age")      # 25
users.max("age")      # 40
users.count()         # 3
users.first()         # first document in the collection
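For intuition, the arithmetic behind these helpers can be sketched in plain Python. Skipping missing and non-numeric values, and returning None for an empty input, are assumptions of this sketch rather than documented behavior. `numeric_values` is a hypothetical helper.

```python
def numeric_values(docs, field):
    """Collect int/float values of `field`, skipping missing and non-numeric ones."""
    values = []
    for doc in docs:
        value = doc.get(field)
        # bool is a subclass of int in Python, so exclude it explicitly
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            values.append(value)
    return values

docs = [
    {"id": 1, "age": 30},
    {"id": 2, "age": 25},
    {"id": 3, "age": 40},
    {"id": 4, "age": "unknown"},  # non-numeric: ignored in this sketch
    {"id": 5},                    # missing: ignored in this sketch
]

ages = numeric_values(docs, "age")
total = sum(ages)
average = total / len(ages) if ages else None
lowest = min(ages, default=None)
highest = max(ages, default=None)
print(total, round(average, 2), lowest, highest)  # 95 31.67 25 40
```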

Maintenance & IO

clear() -> {"cleared": N}   # clears the collection
export(path: str) -> None   # exports to JSON file
import_(path: str) -> None  # imports from JSON file
save(path: str) -> None     # saves to the specified path
all() -> list[dict]         # all documents in the collection
all_docs() -> list[dict]    # alias for all()

Examples

users.export("backup/users_export.json")
users.clear()
users.import_("backup/users_export.json")

Visualization

You can visualize your collections using the built-in view function.

view() -> None

Example

users.view()


QueryBuilder

You get a QueryBuilder from a collection via where, match_any, match_all, or not_any.

Field selection

where(field: str) -> QueryBuilder

Supports dot-notation for nested fields.

Examples

users.where("name").eq("Neel")
users.where("address.city").eq("Indy")
users.where("profile.stats.score").gte(9000)
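Dot-notation means the field string is split on "." and each segment is followed one level deeper. The following is a minimal sketch of that idea in plain Python, not the library's resolver. `get_nested` is a hypothetical name.

```python
def get_nested(doc, dotted, default=None):
    """Follow a dotted path such as 'address.city' through nested dicts."""
    current = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return default  # path breaks off: report the default instead of raising
        current = current[key]
    return current

doc = {"name": "Neel", "address": {"city": "Indy"}, "profile": {"stats": {"score": 9100}}}
print(get_nested(doc, "name"))                 # Neel
print(get_nested(doc, "address.city"))         # Indy
print(get_nested(doc, "profile.stats.score"))  # 9100
print(get_nested(doc, "address.zip"))          # None
```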

Comparison operators

eq(value)           # equality
ne(value)           # not equal
gt(value)           # numeric greater than
gte(value)          # numeric greater than or equal
lt(value)           # numeric less than
lte(value)          # numeric less than or equal
between(a, b)       # numeric range inclusive
in_(values: list)   # membership in a list
nin(values: list)   # not in a list
matches(regex: str) # regex on string value
exists()            # field is present and not null

Examples

# equality
users.where("name").eq("Neel").count()

# numeric ranges
users.where("age").gte(25).where("age").lt(40).run()

# membership
users.where("name").in_(["Neel", "Bea"]).run()

# regex
users.where("email").matches(r"@a\.com$").run()

# existence (nested ok)
users.where("address.city").exists().run()
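The range, membership, and regex operators can be pictured as simple predicates. The sketch below is plain Python, not coffy code. Using `re.search` (so the pattern may match anywhere in the string) is an assumption suggested by the `@a\.com$` example, and the function names are hypothetical.

```python
import re

def between(value, low, high):
    """Inclusive range check, mirroring the 'numeric range inclusive' note."""
    return low <= value <= high

def matches(value, pattern):
    """Regex test on strings only; re.search is assumed here."""
    return isinstance(value, str) and re.search(pattern, value) is not None

users = [
    {"name": "Neel", "email": "neel@a.com", "age": 30},
    {"name": "Bea", "email": "bea@b.com", "age": 25},
    {"name": "Carl", "email": "carl@c.com", "age": 40},
]

in_range = [u["name"] for u in users if between(u["age"], 25, 30)]
members = [u["name"] for u in users if u["name"] in ["Neel", "Bea"]]
non_members = [u["name"] for u in users if u["name"] not in ["Neel", "Bea"]]
a_com = [u["name"] for u in users if matches(u["email"], r"@a\.com$")]
print(in_range)     # ['Neel', 'Bea']
print(members)      # ['Neel', 'Bea']
print(non_members)  # ['Carl']
print(a_com)        # ['Neel']
```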

Logic grouping

Using these directly is not recommended; prefer the collection helpers match_any, match_all, and not_any, which wrap them.

_and(*builders)   # all sub-queries must match
_or(*builders)    # any sub-query matches
_not(*builders)   # negates the AND of each sub-query

Examples

# _and
q = users.where("age").gte(25)
q._and(lambda s: s.where("name").ne("Carl"))
q.run().as_list()

# _or with two branches
q = users._or(  # seed with no filters, then group
    lambda s: s.where("age").gt(35),
    lambda s: s.where("name").eq("Bea")
)

# Equivalent with collection helpers:
users.match_any(
    lambda s: s.where("age").gt(35),
    lambda s: s.where("name").eq("Bea")
).run().as_list()

# _not: exclude anyone under 30 (negated through the collection helper)
users.not_any(lambda s: s.where("age").lt(30)).run().as_list()

Execution

run(fields: list[str] | None = None) -> DocList # runs the query
count() -> int                                  # counts documents after filtering
first() -> dict | None                          # returns the first document after filtering
distinct(field: str) -> list[...]               # returns unique values for a field after filtering

run(fields=[...]) performs projection. Fields can be nested ("a.b.c"). Returned keys are the field names you requested.

Examples

users.where("age").gte(25).run(fields=["id", "name"]).as_list()
# -> [{'id': 1, 'name': 'Neel'}, {'id': 2, 'name': 'Bea'}, ...]

users.where("address.city").exists().run(fields=["id", "address.city"]).as_list()
# -> [{'id': 1, 'address.city': 'Indy'}, {'id': 2, 'address.city': 'Austin'}]

users.where("address.city").exists().distinct("address.city")
# -> unique cities among the matching documents, e.g. "Indy" and "Austin" for the Quick Start data
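Projection as described for run(fields=[...]) keeps only the requested fields and uses the requested (possibly dotted) names as output keys. Here is a plain-Python sketch of that behavior; omitting fields that a document lacks is an assumption of the sketch, and `get_nested` and `project` are hypothetical helpers.

```python
_MISSING = object()  # sentinel so that a stored None is still distinguishable

def get_nested(doc, dotted, default=None):
    """Follow a dotted path through nested dicts."""
    current = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current

def project(doc, fields):
    """Keep only the requested fields, keyed by the name that was requested."""
    out = {}
    for field in fields:
        value = get_nested(doc, field, _MISSING)
        if value is not _MISSING:
            out[field] = value
    return out

users = [
    {"id": 1, "name": "Neel", "address": {"city": "Indy"}},
    {"id": 2, "name": "Bea", "address": {"city": "Austin"}},
    {"id": 3, "name": "Carl"},
]
projected = [project(u, ["id", "address.city"]) for u in users]
print(projected)
# [{'id': 1, 'address.city': 'Indy'}, {'id': 2, 'address.city': 'Austin'}, {'id': 3}]
```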

Mutation

update(changes: dict) -> {"updated": N}     # updates matching documents with new fields
delete() -> {"deleted": N}                  # deletes matching documents
replace(new_doc: dict) -> {"replaced": N}   # replaces matching documents with new ones
remove_field(field: str) -> {"removed": N}  # removes a field from matching documents

Examples

# mark all under 30 as junior
users.where("age").lt(30).update({"rank": "junior"})

# delete by name
users.where("name").eq("Carl").delete()

# replace exact matches
users.where("id").eq(2).replace({"id": 2, "name": "Bea Updated"})

# remove a field
users.where("name").eq("Neel").remove_field("rank")
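The effect of these mutations on a list of dicts can be sketched in plain Python. This is an illustration of the described semantics, not coffy's implementation. In particular, counting only documents that actually had the field in `remove_field` is an assumption, and all function names are hypothetical.

```python
def update(docs, pred, changes):
    """Merge `changes` into every matching document."""
    count = 0
    for doc in docs:
        if pred(doc):
            doc.update(changes)
            count += 1
    return {"updated": count}

def delete(docs, pred):
    """Drop every matching document, modifying the list in place."""
    keep = [d for d in docs if not pred(d)]
    deleted = len(docs) - len(keep)
    docs[:] = keep
    return {"deleted": deleted}

def remove_field(docs, pred, field):
    """Remove `field` from matching documents that have it (assumed counting rule)."""
    count = 0
    for doc in docs:
        if pred(doc) and field in doc:
            del doc[field]
            count += 1
    return {"removed": count}

users = [
    {"id": 1, "name": "Neel", "age": 30},
    {"id": 2, "name": "Bea", "age": 25},
    {"id": 3, "name": "Carl", "age": 40},
]
r1 = update(users, lambda d: d["age"] < 30, {"rank": "junior"})
r2 = delete(users, lambda d: d["name"] == "Carl")
r3 = remove_field(users, lambda d: d["name"] == "Bea", "rank")
print(r1, r2, r3)  # {'updated': 1} {'deleted': 1} {'removed': 1}
```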

Aggregations (query-scoped)

These work after filtering:

sum(field)      # sum of numeric field values
avg(field)      # average of numeric field values
min(field)      # minimum value of numeric field
max(field)      # maximum value of numeric field

Examples

# average age for people with an email at a.com
users.where("email").matches("@a\\.com$").avg("age")

Lookup and Merge

lookup(foreign_collection_name, local_key, foreign_key, as_field, many=True) -> QueryBuilder  
merge(fn: callable) -> QueryBuilder
  • lookup runs the current query, matches each result to documents in another collection by key equality, and attaches the matched result(s) at as_field.
    • If many=False, attaches a single document or None (one-to-one).
    • If many=True, attaches a list of matching documents (one-to-many).
  • merge transforms each (possibly looked-up) document by merging in fields returned from fn(doc).

Example - One-to-one join

users = db("users")
orders = db("orders")

users.clear(); orders.clear()
users.add_many([
    {"id": 1, "name": "Neel"},
    {"id": 2, "name": "Bea"},
])
orders.add_many([
    {"order_id": 10, "user_id": 1, "total": 50},
    {"order_id": 11, "user_id": 1, "total": 75},
    {"order_id": 12, "user_id": 2, "total": 20}
])

# Manually build a one-to-one map of latest order
latest_by_user = {}
for o in orders.all_docs():
    latest_by_user[o["user_id"]] = o  # later orders overwrite earlier ones, leaving the latest per user
orders_latest = db("orders_latest")
orders_latest.clear()
orders_latest.add_many(list(latest_by_user.values()))

out = (
    users.where("id").in_([1, 2])
         .lookup("orders_latest", local_key="id", foreign_key="user_id", as_field="latest_order", many=False)
         .merge(lambda d: {"latest_total": (d.get("latest_order") or {}).get("total", 0)})  # tolerate None when no order matched
         .run()
         .as_list()
)

# Result:
# [
#   {'id': 1, 'name': 'Neel', 'latest_order': {...}, 'latest_total': 75},
#   {'id': 2, 'name': 'Bea',  'latest_order': {...}, 'latest_total': 20}
# ]

Example - One-to-many join

# Using full orders collection in a one-to-many join
out = (
    users.lookup("orders", local_key="id", foreign_key="user_id", as_field="orders", many=True)
         .merge(lambda u: {"total_spent": sum(o["total"] for o in u["orders"])})
         .run()
         .as_list()
)

# Result:
# [
#   {'id': 1, 'name': 'Neel', 'orders': [...], 'total_spent': 125},
#   {'id': 2, 'name': 'Bea',  'orders': [...], 'total_spent': 20}
# ]

Note: lookup defaults to one-to-many (many=True). Use many=False for one-to-one joins.
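The join semantics described above (key equality, a list for many=True, a single document or None for many=False, then a merge step) can be sketched in plain Python. This is not coffy's code. Picking the first match when many=False is an assumption, and both functions are hypothetical stand-ins that work on plain lists.

```python
def lookup(docs, foreign_docs, local_key, foreign_key, as_field, many=True):
    """Attach matching foreign documents to a copy of each local document."""
    out = []
    for doc in docs:
        found = [f for f in foreign_docs if f.get(foreign_key) == doc.get(local_key)]
        joined = dict(doc)  # shallow copy so the input is left untouched
        if many:
            joined[as_field] = found
        else:
            joined[as_field] = found[0] if found else None  # first match (assumption)
        out.append(joined)
    return out

def merge(docs, fn):
    """Merge the dict returned by fn(doc) into each document."""
    return [{**doc, **fn(doc)} for doc in docs]

users = [{"id": 1, "name": "Neel"}, {"id": 2, "name": "Bea"}, {"id": 3, "name": "Carl"}]
orders = [
    {"order_id": 10, "user_id": 1, "total": 50},
    {"order_id": 11, "user_id": 1, "total": 75},
    {"order_id": 12, "user_id": 2, "total": 20},
]

joined = lookup(users, orders, "id", "user_id", "orders", many=True)
spent = merge(joined, lambda u: {"total_spent": sum(o["total"] for o in u["orders"])})
print([(u["name"], u["total_spent"]) for u in spent])
# [('Neel', 125), ('Bea', 20), ('Carl', 0)]

single = lookup(users, orders, "id", "user_id", "order", many=False)
print(single[2]["order"])  # None
```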


Pagination

You can paginate query results using .limit(n) and .offset(m):

limit(n: int) -> QueryBuilder # Limits the number of results.
offset(m: int) -> QueryBuilder # Skips the first m results.

Examples

col.where("score").gte(50).offset(10).limit(5).run()
# Skips the first 10 matches and returns up to 5 documents (the 11th through 15th results).
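In list terms, offset and limit amount to a slice over the filtered results. Below is a small plain-Python sketch of that arithmetic; `paginate` is a hypothetical helper and not part of the library.

```python
def paginate(results, offset=0, limit=None):
    """Skip `offset` items, then keep at most `limit` items."""
    end = None if limit is None else offset + limit
    return results[offset:end]

scores = [{"id": i, "score": 50 + i} for i in range(30)]

page = paginate(scores, offset=10, limit=5)
print([d["id"] for d in page])  # [10, 11, 12, 13, 14]

tail = paginate(scores, offset=28, limit=5)
print([d["id"] for d in tail])  # [28, 29]
```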

Sorting

You can sort query results using .sort(field, reverse=False) -> QueryBuilder:

  • field: The field to sort by.
  • reverse: If True, sorts in descending order.

Examples

col.where("score").gte(50).sort("score", reverse=True).run()
# Returns documents with score >= 50, sorted by score descending.
col.where("age").lt(40).sort("age").run()
# Returns documents with age < 40, sorted by age ascending.
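The ordering that sort(field, reverse=...) describes corresponds to an ordinary keyed sort. The sketch below uses Python's built-in `sorted` on plain dicts. How the library orders documents that lack the sort field is not stated here, so the sample data avoids that case.

```python
docs = [
    {"name": "A", "score": 72},
    {"name": "B", "score": 91},
    {"name": "C", "score": 55},
]

# reverse=True gives descending order; the default is ascending
descending = sorted(docs, key=lambda d: d["score"], reverse=True)
ascending = sorted(docs, key=lambda d: d["score"])

print([d["name"] for d in descending])  # ['B', 'A', 'C']
print([d["name"] for d in ascending])   # ['C', 'A', 'B']
```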

DocList

A lightweight wrapper around a list of documents.

as_list() -> list[dict]
to_json(path: str) -> None
len(doclist) -> int
doclist[0]      # indexing
for d in doclist: ...
repr(doclist)   # pretty table-like output

Examples

res = users.where("age").gte(25).run(fields=["id", "name"])
print(len(res))             # -> 3
print(res[0]["name"])       # -> 'Neel'
print(res.as_list())        # -> [{'id': 1, 'name': 'Neel'}, ...]
res.to_json("out.json")
print(res)                  # pretty-printed rows

Error Handling

  • This engine intentionally avoids raising on missing fields; a comparison against a missing value simply does not match.
  • exists() checks presence, not truthiness.
  • Numeric comparisons only apply to numeric values; non-numeric values fail the predicate.
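These rules can be illustrated with guarded predicates that return False instead of raising. This is a plain-Python sketch of the stated behavior, not the engine's code. Treating None as "not present" in `exists` is an assumption, and the helper names are hypothetical.

```python
def is_number(value):
    """ints and floats count as numeric; bools are excluded."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def gt(doc, field, threshold):
    """Numeric 'greater than' that fails quietly on missing or non-numeric values."""
    value = doc.get(field)
    return is_number(value) and value > threshold

def exists(doc, field):
    """Presence check: falsy values such as 0 or '' still count as present."""
    return doc.get(field) is not None

docs = [{"age": 30}, {"age": "thirty"}, {}, {"age": 0}]

gt_results = [gt(d, "age", 18) for d in docs]
exists_results = [exists(d, "age") for d in docs]
print(gt_results)      # [True, False, False, False]
print(exists_results)  # [True, True, False, True]
```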

Example: end-to-end

from coffy.nosql import db

users = db("users", path="data/users.json")
users.clear()
users.add_many([
    {"id": 1, "name": "Neel", "age": 30, "address": {"city": "Indy"}},
    {"id": 2, "name": "Bea",  "age": 25, "address": {"city": "Austin"}},
    {"id": 3, "name": "Carl", "age": 40}
])

# People with address, projected
print(users.where("address.city").exists().run(fields=["id", "address.city"]).as_list())

# Age 25-39
print(users.where("age").gte(25).where("age").lt(40).run().as_list())

# NOT (age < 30 OR name == 'Carl')
print(users.not_any(
    lambda q: q.where("age").lt(30),
    lambda q: q.where("name").eq("Carl"),
).run().as_list())

# Mutations
users.where("name").eq("Neel").update({"role": "admin"})
users.where("name").eq("Carl").delete()

# Aggregates
print(users.sum("age"), users.avg("age"))

NoSQL CLI

coffy-nosql is a file-backed command line interface for working with coffy.nosql, an embedded JSON document store. It supports initializing collections, adding documents, running queries, performing aggregations, and clearing data, all through simple commands.



CLI Quick Start

Initialize a new collection and add a few documents:

# initialize collection file
coffy-nosql --collection users --path ./users.json init

# add one document
coffy-nosql --collection users --path ./users.json add '{"id":1,"name":"Neel","age":30}'

# add many documents
coffy-nosql --collection users --path ./users.json add-many '[{"id":2,"name":"Bea","age":25},{"id":3,"name":"Carl","age":40}]'

# query users older than 29
coffy-nosql --collection users --path ./users.json query --field age --op gt --value 29

Commands

init

Initialize a JSON file to back a collection.

coffy-nosql --collection NAME --path FILE.json init
  • Creates the file if it does not exist.
  • Ensures the directory structure is created.

add

Add a single document.

coffy-nosql --collection NAME --path FILE.json add DOC

DOC can be:

  • JSON string: {"id":1,"name":"Neel"}
  • File reference: @doc.json
  • Read from stdin: -

add-many

Add multiple documents in one call.

coffy-nosql --collection NAME --path FILE.json add-many DOCS

DOCS must be a JSON array:

  • JSON string: [{"id":1},{"id":2}]
  • File reference: @docs.json
  • Read from stdin: -
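The three input forms accepted by add and add-many (inline JSON, @file, and - for stdin) could be resolved along the following lines. This is a hedged sketch of the convention, not the CLI's actual parser. `read_json_arg` is a hypothetical helper.

```python
import io
import json
import os
import sys
import tempfile

def read_json_arg(arg, stdin=None):
    """Resolve a DOC/DOCS argument: '-' reads stdin, '@path' reads a file, else inline JSON."""
    if arg == "-":
        stream = stdin if stdin is not None else sys.stdin
        text = stream.read()
    elif arg.startswith("@"):
        with open(arg[1:], encoding="utf-8") as f:
            text = f.read()
    else:
        text = arg
    return json.loads(text)

# Inline JSON string
inline = read_json_arg('{"id":1,"name":"Neel"}')

# File reference
doc_path = os.path.join(tempfile.mkdtemp(), "docs.json")
with open(doc_path, "w", encoding="utf-8") as f:
    f.write('[{"id":1},{"id":2}]')
from_file = read_json_arg("@" + doc_path)

# Stdin, simulated here with an in-memory stream
from_stdin = read_json_arg("-", stdin=io.StringIO('{"id":3}'))

print(inline, from_file, from_stdin)
```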

query

Run simple queries on one field.

coffy-nosql --collection NAME --path FILE.json query --field FIELD --op OP [--value VAL]

Operators

  • eq, ne: equals, not equals
  • gt, gte, lt, lte: numeric comparisons
  • in, nin: value is in / not in a given array
  • exists: field presence
  • matches: regex match (Python style)

Options

  • --value: required for most operators, not allowed for exists
  • --fields: projection fields to return
  • --count: return only number of matches
  • --first: return only the first match
  • --out FILE.json: write results to file
  • --pretty: pretty-print JSON results (adds indentation)

agg

Run an aggregation across all documents.

coffy-nosql --collection NAME --path FILE.json agg {sum,avg,min,max,count} [--field FIELD]
  • sum, avg, min, max require --field
  • count counts all documents

clear

Remove all documents from a collection.

coffy-nosql --collection NAME --path FILE.json clear

Options

Global options (apply to all commands):

  • --collection NAME (required): Collection name
  • --path FILE.json (required): Path to JSON file backing the collection

CLI Examples

Initialize and add a document:

coffy-nosql --collection users --path ./users.json init
coffy-nosql --collection users --path ./users.json add '{"id":1,"name":"Alice","age":22}'

Add documents from a file:

coffy-nosql --collection users --path ./users.json add-many @bulk_users.json

Query for users aged ≥ 30:

coffy-nosql --collection users --path ./users.json query --field age --op gte --value 30

Get only the count:

coffy-nosql --collection users --path ./users.json query --field age --op gte --value 30 --count

Aggregate average age:

coffy-nosql --collection users --path ./users.json agg avg --field age

Clear all data:

coffy-nosql --collection users --path ./users.json clear

Exit Codes

  • 0 Success
  • 1 Failure (invalid arguments, parse errors, runtime errors)