Description
- Linux
- ghc 8.6
- cabal-install 3.2
- PostgreSQL backend
I have a very simple delete cascade function:
```haskell
deleteCascadeSlotNo :: MonadIO m => Word64 -> ReaderT SqlBackend m Bool
deleteCascadeSlotNo slotNo = do
  -- Look up the keys of all blocks at this slot (at most one in practice).
  keys <- selectKeysList [ BlockSlotNo ==. Just slotNo ] []
  -- Delete each block and, recursively, everything that references it.
  mapM_ deleteCascade keys
  pure $ not (null keys)
```
Uniqueness of slotNo is not enforced by the schema, but I know it is unique, and I have confirmed (by adding debug output) that the list of keys contains at most one element. However, the database has 25 tables (with more to come) with a relatively high level of linkage between the tables, in part to ensure proper delete/cascade behavior; a sketch of the kind of linkage involved follows.
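For illustration only, here is a minimal two-table sketch of the sort of inter-table reference deleteCascade has to follow; the entity and field names are hypothetical, not the actual schema, and mkDeleteCascade is what generates the DeleteCascade instances that deleteCascade walks:

```haskell
{-# LANGUAGE EmptyDataDecls, FlexibleContexts, GADTs,
             GeneralizedNewtypeDeriving, MultiParamTypeClasses,
             OverloadedStrings, QuasiQuotes, TemplateHaskell,
             TypeFamilies #-}
module Schema where

import Data.Word (Word64)
import Database.Persist.TH (mkDeleteCascade, mkPersist, persistLowerCase,
                            share, sqlSettings)

-- Hypothetical fragment: the real schema has 25 such tables, many of
-- them referencing each other, and deleteCascade has to visit each
-- referencing table for every deleted row.
share [mkPersist sqlSettings, mkDeleteCascade sqlSettings] [persistLowerCase|
Block
  slotNo Word64 Maybe
Tx
  block BlockId   -- each Tx row points at its Block row
|]
```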
Currently, when deleteCascadeSlotNo is used on the block table (over 5 million entries), it only needs to delete the most recently added row or two. However, for debugging/testing I may want to delete many more rows, and this is where we run into problems: for larger delete/cascade operations, the time taken increases dramatically.
| Rows to delete | Time taken |
|---|---|
| 1000 | 0m44s |
| 2000 | 1m25s |
| 4000 | 3m04s |
| 8000 | 7m14s |
| 16000 | 13m18s |
The time taken is pretty much linear in the number of rows to be deleted, but for large numbers of rows (say 1 million) it can still be slow (30 minutes or more).
The slowness seems to be due to the fact that the operation requires a lot of back-and-forth communication between Haskell/Persistent and PostgreSQL. The delete/cascade operation would be much quicker if more of the work were handed off to Postgres.
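A minimal sketch of that direction, assuming the foreign keys are (or could be) declared with ON DELETE CASCADE on the Postgres side, and that the generated table and column names are block and slot_no (both assumptions, not verified against the actual schema):

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Monad.IO.Class (MonadIO)
import Control.Monad.Trans.Reader (ReaderT)
import Data.Word (Word64)
import Database.Persist.Sql (PersistValue (PersistInt64), SqlBackend,
                             rawExecuteCount)

-- One statement, one round trip: Postgres follows its ON DELETE CASCADE
-- constraints itself instead of Persistent issuing per-row queries.
deleteCascadeSlotNoFast :: MonadIO m => Word64 -> ReaderT SqlBackend m Bool
deleteCascadeSlotNoFast slotNo = do
  n <- rawExecuteCount
         "DELETE FROM block WHERE slot_no = ?"
         [PersistInt64 (fromIntegral slotNo)]
  pure (n > 0)
```

The trade-off is that the cascading behavior then has to live in the database schema itself (declared at table creation or added via ALTER TABLE migrations) rather than in Haskell, but the per-row round trips disappear entirely.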