-
-
Notifications
You must be signed in to change notification settings - Fork 813
Description
I had Claude Code for web run a profiler and got some useful tips on speeding up the table view (and other parts of Datasette too): https://gisthost.github.io/?bfa82bb7a4ba7c10bdb51e5406395362/index.html
The most interesting optimizations it found:
Executive Summary
After comprehensive benchmarking of the datasette table JSON endpoint with 100,000 rows, here are the recommended optimizations ranked by impact:
Rank Optimization Improvement Priority Effort 1 refresh_schemas()1s throttle17% HIGH Low 2 Skip remove_infinites()for non-float tables15% HIGH Medium 3 orjsonfor JSON serialization15% (1000 rows) MEDIUM Low 4 Thread pool warmup 36% first request LOW Low Combined Impact
Applying optimizations #1, #2, and #3 together could yield approximately 35-45% improvement for typical JSON API requests.
Detailed Results
1. refresh_schemas() Throttling ✅ HIGH PRIORITY
Problem:
refresh_schemas()queries the internal database on every request to check for schema changes.Solution: Add 1-second cooldown between schema checks.
Metric Before After Improvement refresh_schemas() call 0.39 ms 0.0004 ms 979x faster JSON endpoint 6.91 ms 5.74 ms 17% faster Implementation:
async def refresh_schemas(self): if time.time() - self._last_schema_refresh < 1.0: return # ... existing code self._last_schema_refresh = time.time()
2. remove_infinites() Skip ✅ HIGH PRIORITY
Problem:
remove_infinites()iterates every cell in every row looking forfloat('inf'), even when the table has no float columns.Solution: Check column types at schema load, skip for tables without REAL/FLOAT columns.
Metric Before After Improvement 1000 rows (isolated) 0.60 ms 0.0002 ms 3,178x faster JSON endpoint (1000 rows) 11.87 ms 10.08 ms 15% faster Implementation: Cache float column information per table during schema refresh.
3. orjson JSON Serialization ✅ MEDIUM PRIORITY
Problem: stdlib
json.dumps()is slow for large result sets.Solution: Use
orjsonas optional dependency with graceful fallback.
Rows json.dumps orjson Speedup 100 0.199 ms 0.027 ms 7.35x 1,000 1.902 ms 0.267 ms 7.12x 5,000 12.716 ms 1.712 ms 7.43x Endpoint impact: 5-15% improvement depending on result size.
Implementation:
try: import orjson JSON_DUMPS = lambda data, **kw: orjson.dumps(data).decode('utf-8') except ImportError: JSON_DUMPS = json.dumps
4. Thread Pool Warmup ⚪ LOW PRIORITY
Problem: First request is slow due to cold thread pool and connection establishment.
Finding:
- First request: 24.6 ms (cold) → 6.5 ms (warm)
- Steady state: 6.3 ms regardless
- Thread pool overhead: 0.08 ms per query (acceptable)
Current state: Already well-optimized with connection caching (4.5x faster than no caching).
Optional: Add startup warmup for production deployments.