Add tracing for physical connection open#6091
Conversation
| _connectionLogger = dataSourceConfig.LoggingConfiguration.ConnectionLogger; | ||
| _commandLogger = dataSourceConfig.LoggingConfiguration.CommandLogger; | ||
|
|
||
| // Make sure we do not flow AsyncLocals like Activity.Current |
There was a problem hiding this comment.
👍
Though a part of me thinks we should maybe start the activity only after the data source has been set up - that really isn't part of physical connection open. What do you think?
| // At no point should we ever have an activity here | ||
| Debug.Assert(Activity.Current is null); | ||
| // Set current activity as the one from the command | ||
| // So child activities from physical open are bound to it |
There was a problem hiding this comment.
Good thinking here.
Though at some point we should have the conversation about whether we want to keep multiplexing, given that it's known to be non-scalable (e.g. and is no longer used in the TE benchmarks).
| LogMessages.OpeningPhysicalConnection(ConnectionLogger, Host, Port, Database, UserFacingConnectionString); | ||
| var startOpenTimestamp = Stopwatch.GetTimestamp(); | ||
|
|
||
| var activity = NpgsqlActivitySource.ConnectionOpen(this); |
There was a problem hiding this comment.
As written below, there may be a case for only starting the activity after the data source is created and bootstrapped. Otherwise, the first "physical connection" is longer because it contains e.g. type loading and stuff like that; at minimum that skews stuff like average time calculations for physical open.
If we really wanted to surface startup costs like type loading properly, we could have another activity for that - though that seems like a bit much.
There was a problem hiding this comment.
As written below, there may be a case for only starting the activity after the data source is created and bootstrapped. Otherwise, the first "physical connection" is longer because it contains e.g. type loading and stuff like that; at minimum that skews stuff like average time calculations for physical open.
I do not think anyone really calculates average time for any sort of traces (too much stuff to store), there are metrics specifically for stuff like this (for example, in Npgslq we have db.client.commands.duration). Beside, right now when we log physical open we pass the total time, including type loading.
If we really wanted to surface startup costs like type loading properly, we could have another activity for that - though that seems like a bit much.
This might actually be not a bad idea, since this will also allow us to trace NpgsqlConnection.ReloadTypes.
There was a problem hiding this comment.
Right, makes sense. If it's not much work/changes, I'd try to exclude type loading from the activity span you're introducing here (simply because it really shouldn't be part of it). We can consider introducing a new activity for it, though I don't think that's very important.
There was a problem hiding this comment.
Right, makes sense. If it's not much work/changes, I'd try to exclude type loading from the activity span you're introducing here (simply because it really shouldn't be part of it).
The main problem is that we do consider it a part of physical open. That is, if we get an exception during bootstrap or from DataSourceConnectionInitializer, we'll break the physical connection. In that case, should we pass that exception to activity? It would be really weird if we don't (since we'll throw the exception but the trace will be green, even though the connection is already closed). But then, that means that we can't close the activity until we're absolutely sure the connection is stable.
There was a problem hiding this comment.
Or to be more specific, we can specify Activity.Duration via Activity.SetEndTime, so while it's still be considered as active, the duration will remains as is, but IDK
There was a problem hiding this comment.
I do not think anyone really calculates average time for any sort of traces (too much stuff to store), there are metrics specifically for stuff like this (for example, in Npgslq we have db.client.commands.duration). Beside, right now when we log physical open we pass the total time, including type loading.
FWIW calculating aggregate metrics from tracing data is a pretty standard thing to do, and indeed we could also not include the type loading time when logging.
But good point about the exception. I still think that fundamentally, type loading really shouldn't be counted as part of physical connection open; physical connections get opened all the time, whereas type loading is a one-time startup query that we happen to need (unless it's disabled by the user). It seems to just weirdly skew things to look at it as part of connection open.
But I'm OK with setting this aside for now and moving on - we can always change this in the future if we want.
Updated [Npgsql](https://github.com/npgsql/npgsql) from 9.0.3 to 10.0.2. <details> <summary>Release notes</summary> _Sourced from [Npgsql's releases](https://github.com/npgsql/npgsql/releases)._ ## 10.0.2 v10.0.2 contains several minor bug fixes. [Milestone issues](https://github.com/npgsql/npgsql/milestone/135?closed=1) **Full Changelog**: npgsql/npgsql@v10.0.1...v10.0.2 ## 10.0.1 v10.0.1 contains several minor bug fixes. [Milestone issues](https://github.com/npgsql/npgsql/milestone/134?closed=1) **Full Changelog**: npgsql/npgsql@v10.0.0...v10.0.1 ## 10.0.0 See the [release notes](https://www.npgsql.org/doc/release-notes/10.0.html). The full list of changes is available [here](https://github.com/npgsql/npgsql/milestone/122?closed=1). ## What's Changed * STJ 9.0 alternative approach by @NinoFloris in npgsql/npgsql#5941 * Remove support for net6.0 by @roji in npgsql/npgsql#5947 * Some leftover cleanup for removing net6.0 by @roji in npgsql/npgsql#5949 * Map date/time to DateOnly/TimeOnly by default by @roji in npgsql/npgsql#5948 * Make the cidr<->IPNetwork mapping the default by @roji in npgsql/npgsql#5950 * Fix connecting with VerifyCA and VerifyFull by @vonzshik in npgsql/npgsql#5944 * Remove stopwatch allocations by @vonzshik in npgsql/npgsql#5977 * Bump actions/setup-dotnet from 4.1.0 to 4.2.0 by @dependabot[bot] in npgsql/npgsql#5983 * Use exception convenience methods by @bbowyersmyth in npgsql/npgsql#5982 * Bump actions/setup-dotnet from 4.2.0 to 4.3.0 by @dependabot[bot] in npgsql/npgsql#6007 * Add support for postgresql type names with dots by @dvas-hash in npgsql/npgsql#5971 * Send close_notify TLS alert on connection shutdown by @vonzshik in npgsql/npgsql#5995 * Remove DisplayClass struct creation in PgReader by @bbowyersmyth in npgsql/npgsql#6014 * Always dispose RemoteCertificate on SslStream by @vonzshik in npgsql/npgsql#6022 * Remove LongRunningConnection field from NpgsqlConnector by @vonzshik in npgsql/npgsql#6024 * Tighten SCRAM-SHA-256 SASL check by @vonzshik in npgsql/npgsql#6023 * Add SHA3 hash algorithms for SASL authentication by @vonzshik in npgsql/npgsql#6028 * Remove dotnet SDK version from CI (use global.json) by @roji in npgsql/npgsql#6037 * Add support for specifying allowed auth methods by @vonzshik in npgsql/npgsql#6036 * Migrate to SLNX by @roji in npgsql/npgsql#6053 * Switch to Ubuntu 24.04 in CI by @roji in npgsql/npgsql#6054 * Bump actions/setup-dotnet from 4.3.0 to 4.3.1 by @dependabot[bot] in npgsql/npgsql#6059 * Add basic testing for tracing by @vonzshik in npgsql/npgsql#6051 * parameter-collection Clone() should set correct collection instance by @mgravell in npgsql/npgsql#6066 * Fix brew on mac CI by @NinoFloris in npgsql/npgsql#6071 * Fix adding to hash lookup while renaming an unnamed parameter by @vonzshik in npgsql/npgsql#6073 * Update LICENSE date (2024 -> 2025) by @kurnakovv in npgsql/npgsql#6082 * Add tracing for physical connection open by @vonzshik in npgsql/npgsql#6091 * Start testing on .NET 9 by @vonzshik in npgsql/npgsql#5945 * Turn on <IsAotCompatible> by @roji in npgsql/npgsql#6097 * Reenable public API analyzer by @roji in npgsql/npgsql#6101 * Update Npgsql to .NET 9 by @vonzshik in npgsql/npgsql#6099 * Ignore system CA store if root certificate is provided by @vonzshik in npgsql/npgsql#6102 * Fix reading columns asynchronously via JsonNet plugin by @vonzshik in npgsql/npgsql#6109 * Fixes #6107 missed should buffer in biginteger numeric converter by @NinoFloris in npgsql/npgsql#6117 * Fix logging parameters with batches by @vonzshik in npgsql/npgsql#6079 * Implement GSSAPI session encryption by @vonzshik in npgsql/npgsql#6131 * feat: add support for PGAPPNAME to set application name by @michael-todorovic in npgsql/npgsql#6139 * Fix returning null from KerberosUsernameProvider.GetUsername with concurrent calls by @vonzshik in npgsql/npgsql#6137 * Add NpgsqlTsVector.Empty by @roji in npgsql/npgsql#6145 * Add assert to NpgsqlCommand.Transaction if it's completed by @vonzshik in npgsql/npgsql#6151 * Compare normalized type names by @0MG-DEN in npgsql/npgsql#6011 * Do CI testing for PG18 (beta) by @roji in npgsql/npgsql#6155 * Fix infinite consume on error with connection break by @vonzshik in npgsql/npgsql#6161 * Bump actions/checkout from 4 to 5 by @dependabot[bot] in npgsql/npgsql#6174 ... (truncated) ## 10.0.0-rc.1 ## 9.0.5 v9.0.5 contains several minor bug fixes. [Milestone issues](https://github.com/npgsql/npgsql/milestone/131?closed=1) **Full Changelog**: npgsql/npgsql@v9.0.4...v9.0.5 ## 9.0.4 v9.0.4 contains several minor bug fixes. [Milestone issues](https://github.com/npgsql/npgsql/milestone/127?closed=1) **Full Changelog**: npgsql/npgsql@v9.0.3...v9.0.4 Commits viewable in [compare view](npgsql/npgsql@v9.0.3...v10.0.2). </details> [](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Closes #4136