TinyDB is a lightweight document-oriented database written for Python. In this guide, we will learn how to install TinyDB, create databases, insert and query data, and perform the other CRUD (Create, Read, Update, Delete) operations.
Introduction to TinyDB
TinyDB is written in pure Python with no external dependencies, so it can be imported into any Python script or package. By default it stores its data in a JSON file.
Some key capabilities offered by TinyDB include:
- Document oriented storage
- Intuitive API for CRUD operations
- Choice of JSON file or in-memory storage
- 100% Pythonic – no binaries needed
- Optional caching middleware to cut down on disk I/O
- Tables for data separation
- Simple querying based on key-values
- Plugin architecture for extensions
These features make TinyDB a great fit for offline apps, prototypes, single-user tools, configuration data, test cases and other scenarios where a serverless embedded data store is useful.
Over 75 million downloads have been recorded according to PyPI statistics, indicating widespread adoption:
+------------------+---------------+
| Month & Year | Downloads |
+------------------+---------------+
| 2023 January | 2,600,000+ |
+------------------+---------------+
Active maintenance and regular updates also inspire confidence in using this database.
Installation of TinyDB
TinyDB is available as a Python package that can be installed using pip:
pip install tinydb
Make sure you have Python 3.6 or above and pip installed for full compatibility. The latest version at the time of this writing is 4.0.0.
To check if TinyDB is installed correctly:
import tinydb
print(tinydb.__version__)
This should print the installed version number if everything is okay.
Creating a TinyDB Database
The TinyDB class represents the database itself. To create a new database:
from tinydb import TinyDB
db = TinyDB('data.json')
This will create a file called data.json where all the data will be stored as JSON documents.
We can check if the file was created properly:
import os
print(os.path.exists('data.json')) # True
By default the JSON file is created relative to the current working directory, which is often but not always the folder of your Python script. An absolute file path can also be specified.
Now let's look at the two primary ways of organizing data in TinyDB: Tables and Documents.
Tables for Data Separation
To store related data separately, TinyDB provides the concept of Tables. Tables help break the database into smaller logical chunks.
For example, consider a database of books and publishers. Instead of dumping everything into one default table, we can create a table per topic (all tables still share the same JSON file):
books_table = db.table('books')
publishers_table = db.table('publishers')
So the table() method creates a new table by taking the name as an argument. We can use these tables independently:
books_table.insert({'name': 'Data Science Book', 'price': 100})
publishers_table.insert({'name': 'Science Publishers', 'location': 'California'})
This keeps the data well organized into topics. Later while querying, we only search within specific tables based on our need.
Multiple tables become more useful as the database grows and starts to hold several kinds of records.
Documents for Data Storage
The fundamental units of data stored in TinyDB are Python dictionaries also called Documents.
So how do we insert documents into tables? The insert() method allows this:
books_table.insert({'name': 'Python 101', 'price': 50, 'qty': 5})
books_table.insert({'name': 'Machine Learning Book', 'price': 150, 'qty': 2})
And similarly we can insert publisher docs:
publishers_table.insert({'name': 'ABC Publications', 'location': 'New York'})
publishers_table.insert({'name': 'XYZ Publishing', 'location': 'San Francisco'})
So Python dicts can be directly fed into TinyDB. Retrieval is also in dict format making usage straightforward.
Document IDs
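Each doc is automatically assigned an integer ID: 1, 2, 3 and so on. An 'id' key inside the dict is stored as an ordinary field, so to choose the ID yourself, wrap the data in a Document and pass doc_id:
from tinydb.table import Document
books_table.insert(Document({'name': 'Java Book'}, doc_id=100))
Document IDs are useful for referencing records and for targeting updates or deletes directly.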
Querying Documents
For querying stored documents, TinyDB provides the Query class. Let's search for a specific book:
from tinydb import Query
Book = Query()
result = books_table.search(Book.name == 'Python 101')
print(result)
The search() method accepts a query and returns matching documents in a Python list.
We create a Book query builder object and use it to match fields, much like a WHERE clause in SQL. The usual comparisons (==, !=, <, <=, >, >=) can be used here.
Now let's look at more table search examples:
Numeric comparisons
cheap_books = books_table.search(Book.price <= 100)
Regex pattern matching
python_books = books_table.search(Book.name.matches('.*Python.*'))
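Multiple criteria
books_in_stock = books_table.search(Book.qty > 0)
expensive_books_in_stock = books_table.search((Book.qty > 0) & (Book.price > 100))
search() returns a plain Python list, so a second search() cannot be chained onto the result. Instead, conditions are combined with & (and), | (or) and ~ (not), each wrapped in parentheses.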
Sorting Results
TinyDB has no sort() clause. search() returns a plain Python list, so use the built-in sorted():
books_by_cost = sorted(books_table.search(Book.qty > 0), key=lambda doc: doc['price'])
This first filters by qty in stock, then sorts by the price field.
For reverse sorting, pass reverse=True:
books_by_cost = sorted(books_table.search(...), key=lambda doc: doc['price'], reverse=True) # Descending
And this keeps the costliest books first!
Limiting Results
TinyDB has no limit() method; slice the returned list instead:
few_books = books_table.search(...)[:2] # Limit 2
So TinyDB searching stays flexible while keeping a simple syntax.
Performance Without Indexes
By default every search scans all docs in the table, and the JSON storage rereads the file on each operation, so performance drops as the data grows.
TinyDB's core does not include an index feature, so there is no built-in way to skip the full scan. What it does offer is a per-table query cache and an in-memory storage.
Import:
from tinydb import TinyDB
from tinydb.storages import MemoryStorage
The query cache keeps the results of recent searches (10 queries by default) and is cleared whenever the table is written to. Its size is set when the table object is first created:
books_table = db.table('books', cache_size=30)
For data that does not need to persist, keep the whole database in memory:
db = TinyDB(storage=MemoryStorage)
Both options are best for read-heavy workloads with low frequency inserts/updates.
For search-heavy fields on large data sets, a database with real indexes such as SQLite is the better fit.
Updating Documents
To modify existing documents, use the update() method:
books_table.update({'price': 20}, Book.name == 'Python 101')
This drops the price of the "Python 101" book to $20.
We can also perform batch updates:
books_table.update({'qty': 0}) # Set all qty to zero
This sets qty to zero on every book record in one shot!
For updates by document ID, skip the query syntax:
books_table.update({'price': 0}, doc_ids=[100, 105])
So updates are quite versatile.
Deleting Documents
To delete one or more records, we can leverage the remove() method:
books_table.remove(Book.qty == 0) # Remove out of stock
This removes every document whose qty field is 0.
Again this works great with document IDs:
books_table.remove(doc_ids=[102, 103])
Where 102, 103 are IDs to target for deletion.
So in a few lines of Python code we can implement full CRUD functionality!
Now let's shift gears and cover some advanced functionality.
Advanced Usability Tips
Here are some pro tips for leveraging TinyDB better:
Batch Inserts
For inserting multiple documents in one go:
books_data = [
    {'name': 'Java Core', 'price': 20},
    {'name': 'JavaScript Advanced', 'price': 25}
]
books_table.insert_multiple(books_data)
Useful while migrating data or handling periodic imports.
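Custom Test Functions
The Query builder handles most day-to-day queries. TinyDB does not accept MongoDB-style dict queries, but for conditions the operators cannot express, test() takes any function that returns True or False:
books_table.search(Book.price.test(lambda price: price < 30) & (Book.qty >= 1))
This allows greater flexibility for complex queries.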
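Custom Doc IDs
TinyDB has no document_id option, and its automatic IDs are integers. A business key such as a SKU is best stored as a normal field and looked up with a query:
db = TinyDB(storage=MemoryStorage)
db.insert({'sku': 'B102', 'name': 'Core Java'})
book = db.get(Query().sku == 'B102')
So records can still be addressed by our own keys.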
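Caching Lookup Tables
For static lookup tables like currencies or units which change rarely, we can hold them in memory for faster lookups instead of file storage. Storage is chosen per database, not per table, so this takes a second, in-memory TinyDB instance:
lookup_db = TinyDB(storage=MemoryStorage)
currencies = lookup_db.table('currencies')
currencies.insert({'code': 'USD', 'symbol': '$'})
This avoids disk reads for tiny lookup tables, but the data must be reloaded on every start.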
Migrations
TinyDB has no built-in migration tool, but update() also accepts a function, which is enough for simple schema changes.
Step 1: Make code changes first
Step 2: Write a migration function
It receives one document at a time and modifies it in place
Step 3: Migrate data:
books_table.update(modify_books)
Without a query, update() applies modify_books to every document in the table.
So we get basic data migration capabilities.
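Concurrency Support
TinyDB is not thread-safe and does not lock its file, so concurrent writes from multiple threads can clobber data. A simple approach is to share one lock and hold it around every database call:
import threading
db_lock = threading.Lock()
with db_lock:
    db.insert({'name': 'Python 101'})
Multiple processes writing to the same file are not protected by this lock and are best avoided.
CachingMiddleware is sometimes mistaken for a concurrency feature, but it only reduces disk I/O by caching reads and buffering writes in memory:
from tinydb.storages import JSONStorage
from tinydb.middlewares import CachingMiddleware
db = TinyDB(..., storage=CachingMiddleware(JSONStorage))
Buffered writes are flushed when the database is closed, so call db.close() or use TinyDB as a context manager.
So locking is our responsibility, while caching only helps with speed.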
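Compression
TinyDB does not ship gzip compression, and JSONStorage has no compression option. Extra keyword arguments given to TinyDB are forwarded to json.dumps, so compact separators at least trim whitespace from the file:
from tinydb import TinyDB
db = TinyDB('data.json', separators=(',', ':'))
Real compression requires a custom storage class that subclasses tinydb.storages.Storage.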
Integrations
A key benefit of TinyDB is integrating it into external apps and interfaces with great ease:
Python GUI Apps
Easily hook it up into PyQt, Tkinter, Kivy and other UI apps to enable persistent storage for things like preferences, last state, recent files list etc.
CLI Tools
For command line tools that manage customer data, inventory, billing information etc. TinyDB fits right in for offline storage and lookups.
Web Applications
In services built with Flask, Django etc. TinyDB can provide simple storage for things like cached API responses, rate limiting counters etc. without needing a separate database server, provided only a single process and thread accesses the file.
Mobile Apps
In Python apps on Android, such as those built with Kivy, TinyDB can store configurations and other metadata directly on device storage without requiring a server database.
Testing & Prototyping
For test cases that need sample persistent data, or for standing in for a real database while prototyping, TinyDB works great.
Single User Tools
All kinds of single user productivity tools and utilities can leverage TinyDB to save state and store user data on the desktop itself ready for offline usage.
TinyDB vs Other Databases
TinyDB vs SQLite
Both embed data storage within the app itself avoiding a separate database server process.
Differences:
- TinyDB stores plain JSON documents and queries them from Python, while SQLite stores relational tables and offers full SQL
- SQLite offers complete ACID compliance while TinyDB has no transaction support
- SQLite extensions such as SpatiaLite and full-text search go well beyond TinyDB's basic querying
- Both are easy to reach from Python: the sqlite3 module ships with the standard library, and TinyDB is a pure-Python install with no compiled parts
So SQLite is the clear leader when advanced features are needed outside the JSON document model.
TinyDB vs MongoDB
Both are document-oriented NoSQL databases, but TinyDB is embedded in the application while MongoDB runs as a separate server.
Tradeoffs:
- MongoDB has much more functionality, such as sharding and replication, thanks to being a separate database server
- TinyDB has no indexing and only elementary query handling
- But integration is just importing a Python module vs installing, running and connecting to a MongoDB server
- Development may be faster with TinyDB because there are no network calls, and documents are plain Python dicts
Overall MongoDB offers richer document features if server install is acceptable.
Limitations of TinyDB
While super quick to get started with, be aware that complex apps can hit limitations:
- The default JSON storage rereads and rewrites the whole file on each operation, so performance degrades as the record count grows, unlike in-memory tables or a client-server architecture
- No relationships between tables so complex interlinked data is harder to model
- No SQL so advanced analytic queries involving grouping and joins cannot be constructed
- Not for multi-user apps: there is no locking, so access from several threads or processes can corrupt the file
- No ACID guarantees or transactions, so keep backups of anything important
So optimize your data model upfront and stress test for scale limits.
Conclusion
TinyDB makes managing embedded, file-based NoSQL databases easy in Python with zero dependencies. For basic local storage needs, it delivers simplicity while being good enough for low to medium data sizes.
The coding patterns feel natural to Python developers given the built-in dict/list support. Quick to learn and nice productivity boost over raw file handling!
If your apps don't demand multi-user access, relationships or complex querying, then TinyDB is a solid choice to slash development time.
It shines for use cases needing offline storage, config data, caching, data-driven scripts, tests etc. But use bigger alternatives like SQLite or MongoDB once complexity increases.
So enjoy rapid programming with this tiny but mighty database!


