Conversation
|
One design issue: how to pick neat unique names for database files in a file hierarchy? Here's what I have so far: Lines 231 to 237 in fe6f9e6 For these files: It made these names: Maybe this is good enough? Needs some tests. |
|
This can take a LONG time to run, and at the moment it's blocking and prevents Datasette from starting up. It would be much better if this ran in a thread, or an asyncio task. Probably have to be a thread because there's no easy |
|
Another problem: if any of the found databases use SpatiaLite then Datasette will fail to start at all. It should skip them instead. The |
|
I tried running the Makes sense - I had a thread that added an item to that dictionary right while the homepage was attempting to run this code: datasette/datasette/views/index.py Lines 24 to 27 in efa54b4 |
|
So I need to ensure the Mainly I need to ensure that it is locked during iterations over it, then unlocked at the end. Trickiest part is probably ensuring there is a test that proves this is working - I feel like I got lucky encountering that |
|
... or maybe I can cheat and wrap the access to |
|
... cheating like this seems to work: Python built-in operations are supposedly threadsafe, so in this case I can grab a copy of the list atomically (I think) and then safely iterate over it. Seems to work in my testing. Wish I could prove it with a unit test though. |
|
Interesting new problem: hitting Ctrl+C no longer terminates the problem provided that https://stackoverflow.com/questions/49992329/the-workers-in-threadpoolexecutor-is-not-really-daemon has clues. The workers are only meant to exit when their worker queues are empty. But... I want to run the worker every 10 seconds. How do I do that without having it loop forever and hence never quit? |
|
It think the fix is to use an old-fashioned |
|
I've figured out how to tell if a database is safe to open or not: select sql from sqlite_master where sql like 'CREATE VIRTUAL TABLE%';This returns the SQL definitions for virtual tables. The bit after Run this against a SpatiaLite database and you get the following: CREATE VIRTUAL TABLE SpatialIndex USING VirtualSpatialIndex()
CREATE VIRTUAL TABLE ElementaryGeometries USING VirtualElementary()Run it against an Apple Photos CREATE VIRTUAL TABLE RidList_VirtualReader using RidList_VirtualReaderModule
CREATE VIRTUAL TABLE Array_VirtualReader using Array_VirtualReaderModule
CREATE VIRTUAL TABLE LiGlobals_VirtualBufferReader using VirtualBufferReaderModule
CREATE VIRTUAL TABLE RKPlace_RTree using rtree (modelId,minLongitude,maxLongitude,minLatitude,maxLatitude)For a database with FTS4 you get: CREATE VIRTUAL TABLE "docs_fts" USING FTS4 (
[title], [content], content="docs"
)FTS5: CREATE VIRTUAL TABLE [FARA_All_Registrants_fts] USING FTS5 (
[Name], [Address_1], [Address_2],
content=[FARA_All_Registrants]
)So I can use this to figure out all of the |
| d.path for d in list(self.databases.values()) if d.path is not None | ||
| } | ||
| for dir in self.dirs: | ||
| for filepath in Path(dir).glob("**/*.db"): |
There was a problem hiding this comment.
Need to support .sqlite files here too
| str(filepath) | ||
| .replace("../", "") | ||
| .replace("/", "_") | ||
| .replace(".db", ""), |
There was a problem hiding this comment.
Also consider .sqlite - using pathlib .stem here would be better.
| .replace(".db", ""), | ||
| Database(self, str(filepath), is_mutable=True), | ||
| ) | ||
| time.sleep(10) |
There was a problem hiding this comment.
This should be configurable - 10 is an OK default, but cases like simonw/jupyterserverproxy-datasette-demo#2 where it's running in a Binder session would benefit from faster updates.
|
Here's the new utility function I should be using to verify database files that I find: datasette/datasette/utils/__init__.py Lines 773 to 787 in 6aa516d |
|
While running it against a nested directory with a TON of databases I kept seeing errors like this: And |
|
I also eventually get this error: |
|
I think I can tell what the current file limit is like so: So maybe I should have Datasette refuse to open more database files than that number minus 5 (to give me some spare room for opening CSS files etc). |
Refs #417.