-
Notifications
You must be signed in to change notification settings - Fork 301
Add SQL Connection & Statement hooks #1341
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
parsonsmatt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, having thoroughly reviewed this code:
I like it a lot. Great new feature. Excellent implementation. But we can make it a bit better.
- Only expose data constructors and fields via the
.Internalmodules so we don't have to major-bump to add a field or change a field type. Expose accessors/modifiers/etc as appropriate. If you want to hand-this-off to me then I can take it home. - Don't change the type of
MkSqlBackendArgsand instead convert fromIORef (Map Text STatement)inmkSqlBackend. This allows us to release this as a minor version bump, instead of a major.
Thanks!!!
| setConnInsertManySql insertManySql' $ | ||
| maybe id setConnRepsertManySql (upsertFunction repsertManySql serverVersion) $ | ||
| mkSqlBackend MkSqlBackendArgs | ||
| modifyConnVault (Vault.insert underlyingConnectionKey conn) $ mkSqlBackend MkSqlBackendArgs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To get access to the underlying backend, use the RawPostgresql type
| data RawPostgresql backend = RawPostgresql |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't sufficient for instrumentation of existing codebases unfortunately. The value lets us wrap existing connections without requiring pervasive downstream changes.
| underlyingConnectionKey :: Vault.Key PG.Connection | ||
| underlyingConnectionKey = unsafePerformIO Vault.newKey | ||
| {-# NOINLINE underlyingConnectionKey #-} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't feel right to me. My understanding of this behavior is:
- We create this
Vault.Key PG.Connectionlazily, whenever the first time we call this function. Then the value is cached due to GHC's applicative-normal-form optimization. So, we generate exactly oneKey Connectionhere, and it is a global constant. - When we call
createBackendorgetSimpleConn(whichever one comes first) then we create theunderlyingConnectionKey. If we callgetSimpleConnfirst, then we'll getNothingin that function. If we callcreateBackendafterwards, then theKey Connalready exists, and we stuff theConninto theVault. - So, I think the problem with this, is that we are overwriting the underlying
Connwith every call tocreateBackend. Is this a problem?
withPostgresqlPool delegates to withSqlPool open' which itself calls return $ constructor (createBackend logFunc ver smap) conn. Which means that, on every new SqlBackend that is created, we end up overwriting the value here, which means that getSimpleConn is going to get the most recent PG.Conn that has been created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so this is probably not an issue. While we do generate a single Key Conn, this is used to index into the SqlBackend's internal Vault, not some globally available Vault. And, since there's only ever exactly one, then we can easily recover the PG.Connection for a given SqlBackend.
This is a bit gnarly but it shouldn't have any problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Points 1 and 2 are intentional. We need a stable key in order to be able to reference the connection from external instrumentation tooling. AFAICT, there's no way you could call getSimpleConn prior to constructing a backend via createBackend (at least without digging into internals), so you're guaranteed by virtue of the order of operations to have the key be initialized and able to return the underlying PostgreSQL connection. Worst case scenario, you can't prove that the connection is to a PostgreSQL database and don't do enhanced instrumentation.
Regarding 3, I don't see how we're overwriting the underlying call in createBackend, since it's just an immutable function that constructs a backend from its constituent parts?
| data SqlPoolHooks m backend = SqlPoolHooks | ||
| { alterBackend :: backend -> m backend | ||
| -- ^ Alter the backend prior to executing any actions with it. | ||
| , runBefore :: backend -> Maybe IsolationLevel -> m () | ||
| -- ^ Run this action immediately before the action is performed. | ||
| , runAfter :: backend -> Maybe IsolationLevel -> m () | ||
| -- ^ Run this action immediately after the action is completed. | ||
| , runOnException :: backend -> Maybe IsolationLevel -> UE.SomeException -> m () | ||
| -- ^ This action is performed when an exception is received. The | ||
| -- exception is provided as a convenience - it is rethrown once this | ||
| -- cleanup function is complete. | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is nice!
To make backwards compatibility easier, I'd want to hide this constructor/accessors in an .Internal module, and expose an API for setting/modifying/adding hooks. Eg:
module Database.Persist.SqlBackend.SqlPoolHooks
( SqlPoolHooks
, defaultSqlPoolHooks
, setAlterBackend
, modifyAlterBackend
, addAlterBackendHook
, addRunBeforeHook
, addRunAfterHook
-- etc..
)
where
module Database.Persist.SqlBackend.SqlPoolHooks.Internal where
data SqlPoolHooks m backend = SqlPoolHooks
{ alterBackend :: backend -> m backend
-- ^ Alter the backend prior to executing any actions with it.
, runBefore :: backend -> Maybe IsolationLevel -> m ()
-- ^ Run this action immediately before the action is performed.
, runAfter :: backend -> Maybe IsolationLevel -> m ()
-- ^ Run this action immediately after the action is completed.
, runOnException :: backend -> Maybe IsolationLevel -> UE.SomeException -> m ()
-- ^ This action is performed when an exception is received. The
-- exception is provided as a convenience - it is rethrown once this
-- cleanup function is complete.
}| SqlBackend | ||
| , mkSqlBackend | ||
| , MkSqlBackendArgs(..) | ||
| , SqlBackendHooks(..) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| , SqlBackendHooks(..) | |
| , SqlBackendHooks |
really would like to minimize additions to the public API that require breaking changes to modify (eg data constructor direct exposure)
| -- ^ This function generates the SQL and values necessary for | ||
| -- performing an insert against the database. | ||
| , connStmtMap :: IORef (Map Text Statement) | ||
| , connStmtMap :: StatementCache |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. We may be able to avoid making this a breaking change, if there's a way to go from IORef (Map Text Statement) to a StatementCache in the function that translates a MkSqlBackendArgs into a SqlBackend.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, yeah, this can definitely be a minor version bump! Which means there's no urgency to figure out the other breaking changes and try to bundle them together. I think this is the only breaking change in the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to mention that I'd like to ensure that alternative statement caches can be provided too. In particular, the end goal here is that we could for example provide counters of the statement cache to detect memory leaks (real situation that can occur for long-lived postgresql connections currently AFAICT). Or as another example, we might want to provide an LRU version of the statement cache that can evict. That's why this was introduced as a breaking change. That could technically be deferred til a later date.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be able to write setConnStatementCache :: StatementCache -> SqlBackend -> SqlBackend, without breaking this.
In hindsight, I wish I made this a Maybe field - or even omit it from the list entirely. There's an obvious default, though it does mean that mkSqlBackend :: MkSqlBackendArgs -> IO SqlBackend needs to be done to produce the cache instead.
So, for a 2.13 release, can we avoid breaking this?
Then, for 2.14, we can either remove this (and only use the setConnStatementCache) or make it a Maybe field, and also make mkSqlBackend in IO.
| module Database.Persist.SqlBackend.StatementCache | ||
| ( StatementCache | ||
| , StatementCacheKey | ||
| , mkCacheKeyFromQuery | ||
| , MkStatementCache(..) | ||
| , mkSimpleStatementCache | ||
| , mkStatementCache | ||
| ) where |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
blessed API 🙌🏻
| data MkStatementCache = MkStatementCache | ||
| { statementCacheLookup :: StatementCacheKey -> IO (Maybe Statement) | ||
| -- ^ Retrieve a statement from the cache, or return nothing if it is not found. | ||
| -- | ||
| -- @since 2.14.0 | ||
| , statementCacheInsert :: StatementCacheKey -> Statement -> IO () | ||
| -- ^ Put a new statement into the cache. An immediate lookup of | ||
| -- the statement MUST return the inserted statement for the given | ||
| -- cache key. Depending on the implementation, the statement cache MAY | ||
| -- choose to evict other statements from the cache within this function. | ||
| -- | ||
| -- @since 2.14.0 | ||
| , statementCacheClear :: IO () | ||
| -- ^ Remove all statements from the cache. Implementations of this | ||
| -- should be sure to call `stmtFinalize` on all statements removed | ||
| -- from the cache. | ||
| -- | ||
| -- @since 2.14.0 | ||
| , statementCacheSize :: IO Int | ||
| -- ^ Get the current size of the cache. | ||
| -- | ||
| -- @since 2.14.0 | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| data MkStatementCache = MkStatementCache | |
| { statementCacheLookup :: StatementCacheKey -> IO (Maybe Statement) | |
| -- ^ Retrieve a statement from the cache, or return nothing if it is not found. | |
| -- | |
| -- @since 2.14.0 | |
| , statementCacheInsert :: StatementCacheKey -> Statement -> IO () | |
| -- ^ Put a new statement into the cache. An immediate lookup of | |
| -- the statement MUST return the inserted statement for the given | |
| -- cache key. Depending on the implementation, the statement cache MAY | |
| -- choose to evict other statements from the cache within this function. | |
| -- | |
| -- @since 2.14.0 | |
| , statementCacheClear :: IO () | |
| -- ^ Remove all statements from the cache. Implementations of this | |
| -- should be sure to call `stmtFinalize` on all statements removed | |
| -- from the cache. | |
| -- | |
| -- @since 2.14.0 | |
| , statementCacheSize :: IO Int | |
| -- ^ Get the current size of the cache. | |
| -- | |
| -- @since 2.14.0 | |
| } | |
| data MkStatementCache = MkStatementCache | |
| { statementCacheLookup :: StatementCacheKey -> IO (Maybe Statement) | |
| -- ^ Retrieve a statement from the cache, or return nothing if it is not found. | |
| -- | |
| -- @since 2.14.0 | |
| , statementCacheInsert :: StatementCacheKey -> Statement -> IO () | |
| -- ^ Put a new statement into the cache. An immediate lookup of | |
| -- the statement MUST return the inserted statement for the given | |
| -- cache key. Depending on the implementation, the statement cache MAY | |
| -- choose to evict other statements from the cache within this function. | |
| -- | |
| -- @since 2.14.0 | |
| , statementCacheClear :: IO () | |
| -- ^ Remove all statements from the cache. Implementations of this | |
| -- should be sure to call `stmtFinalize` on all statements removed | |
| -- from the cache. | |
| -- | |
| -- @since 2.14.0 | |
| , statementCacheSize :: IO Int | |
| -- ^ Get the current size of the cache. | |
| -- | |
| -- @since 2.14.0 | |
| } |
codebase uses 4 space indent
| -- | Make a simple statement cache that will cache statements if they are not currently cached. | ||
| -- | ||
| -- @since 2.14.0 | ||
| mkSimpleStatementCache :: IO MkStatementCache | ||
| mkSimpleStatementCache = do | ||
| stmtMap <- newIORef Map.empty | ||
| pure $ MkStatementCache | ||
| { statementCacheLookup = \sql -> Map.lookup (cacheKey sql) <$> readIORef stmtMap | ||
| , statementCacheInsert = \sql stmt -> | ||
| modifyIORef' stmtMap (Map.insert (cacheKey sql) stmt) | ||
| , statementCacheClear = do | ||
| oldStatements <- atomicModifyIORef' stmtMap (\oldStatements -> (Map.empty, oldStatements)) | ||
| traverse_ stmtFinalize oldStatements | ||
| , statementCacheSize = Map.size <$> readIORef stmtMap | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only thing here that needs IO is the newIORef, so if we extract+purify,
mkSimpleStatementCache :: IORef (Map k v) -> MkStatementCache`meaning that we can go from IORef (Map Text Statement) to a StatementCache purely, and can therefore do it in mkSqlBackend.
Co-authored-by: Matt Parsons <parsonsmatt@gmail.com>
parsonsmatt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks great! thank you so much for the PR 😄
This PR builds on top of #1327
It adds support for hooking into various aspects of the lifecycle of a connection / statements. This functionality opens up support for tracing queries within the context of a larger chunk of code:

To see how this is being used, see the hs-opentelemetry project: https://github.com/iand675/hs-opentelemetry/blob/main/instrumentation/persistent/src/OpenTelemetry/Instrumentation/Persistent.hs
Before submitting your PR, check that you've:
@sincedeclarations to the Haddockstylish-haskellon any changed files..editorconfigfile for details)After submitting your PR:
(unreleased)on the Changelog