[Tests] Fully rework type mapping assertion helpers#6314
[Tests] Fully rework type mapping assertion helpers#6314NinoFloris merged 28 commits intonpgsql:mainfrom
Conversation
27da7fe to
ec83a23
Compare
ec83a23 to
9d974de
Compare
9d974de to
88c8c2c
Compare
There was a problem hiding this comment.
Pull request overview
This PR performs a comprehensive refactoring of the type mapping assertion helpers used throughout Npgsql's test suite. The main goal is to remove ambiguous "default" concepts and replace them with more explicit, precise testing parameters that better capture the nuances of type inference behavior.
Key Changes:
- Replaced
isDefault,isDefaultForReading, andisDefaultForWritingparameters with a clearerdataTypeInferenceparameter that explicitly states whether types are inferred exactly, through well-known mappings, or not at all - Introduced
ExpectedDbTypestruct to handle cases where different DbTypes are expected based on how the parameter is set (via DataTypeName vs value inference) - Renamed
pgTypeNametodataTypeNamethroughout for consistency - Added more comprehensive negative matching for write-side checks
- Removed several unused
usingdirectives forNpgsqlTypes
Reviewed changes
Copilot reviewed 35 out of 35 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| test/Npgsql.Tests/Support/TestBase.cs | Core refactoring of assertion helpers with new DataTypeInference and ExpectedDbType patterns |
| test/Npgsql.Tests/Types/*.cs | Updated all type test files to use new assertion helper signatures |
| test/Npgsql.PluginTests/*.cs | Updated plugin test files to use new assertion helper signatures |
| src/Npgsql/NpgsqlTypes/NpgsqlDbType.cs | Simplified ToDbType method by removing redundant Object mappings |
| test/Npgsql.Tests/CommandTests.cs | Minor update using UTF-8 string literals |
| test/Npgsql.Tests/LargeObjectTests.cs | Minor update using UTF-8 string literals |
Comments suppressed due to low confidence (1)
test/Npgsql.Tests/Types/FullTextSearchTests.cs:93
- Call to obsolete method Parse.
await AssertType(NpgsqlTsVector.Parse("'1'"), "'1'", "tsvector");
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 35 out of 35 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
NinoFloris
left a comment
There was a problem hiding this comment.
Highlighted some examples to explain the new DbType assertions.
roji
left a comment
There was a problem hiding this comment.
Quick general question before I do a more complete review.
| "1998-04-12 13:26:38.789", | ||
| "timestamp without time zone", | ||
| DbType.DateTime, | ||
| isDefaultForReading: false, |
There was a problem hiding this comment.
I have to say that I find the current form more immediately understandable and useful as a "spec/documentation" of Npgsql's behavior - this is clearly telling me that PG timestamp does not get read as LocalDateTime by default (i.e. one must do GetFieldValue<LocalDateTime>() to explicitly request it), whereas LocalDateTime does get written by default as PG timestamp (i.e. one doesn't need to set DataTypeName/NpgsqlDbType).
On the new side of the diff I now see dataTypeInference I'm not longer seeing that information. It could be just my way of thinking (I wrote the current system after all) but it seems like some information is getting lost here no?
Note that I definitely am for tightening assertions and checks, renaming things and other stuff you're doing in this PR - just not sure if this specific change to the actual AssertType API is necessary for that etc.
There was a problem hiding this comment.
I see, yes I think indeed that's a small difference in understanding. To me GetValue() is not necessarily a default (it also has to be called) or for that matter any less explicit than a generic method. That also fits with the reasoning around GetFieldValue<object> needing to do the same thing as GetValue(). But I agree that if you want something other than a boxed object instance of *some* type, you do have to know about that type and pass it. For those cases there is indeed more information requested of the caller, if that's what you mean.
whereas LocalDateTime does get written by default as PG timestamp (i.e. one doesn't need to set DataTypeName/NpgsqlDbType).
This part is mostly you inferring this from your existing experience, as we're not really passing the write side default argument at all. Though I agree that currently there is a nice (looking) 'default' parameter for writing and reading.
The most significant point of contention I have with “default” is on the write side though. There it really doesn't work well conceptually due to the multiple ways we try to attempt to find a mapping if we have no data type to go on (where we check whether we have an actual default mapping, but also attempt to fallback to the first clr type match if we don't have a default mapping). Which also means that if you just go by seeing whether a mapping does not have isDefault:true (in a type info resolver) that it does not mean it won't be seen as a 'default' for writing in the current tests. Unless you know exactly what all this entails you will just try some flags until the test passes, maybe having disabled half the checks (which is why I'm adding more negative matching too).
Regarding the isDefaultForReading name change (or just isDefault for AssertTypeRead) to isValueTypeDefaultFieldType. Given that we *always* align the type coming from GetValue() and FieldType (this was also true in the handler system) I elaborated the name to add the mention of its input valueType (typeof(T) basically), and the output it checks FieldType.
Now in the PR I mentioned that I wanted to get rid of the notion of default in the names of things, and on (re)reviewing this myself I can't say I have succeeded 😅, so the new name leaves a lot to be desired. I have pushed a change to rename this to valueTypeEqualsFieldType instead (also note that it's not used a lot either way, about 100 hits where actual tests pass this in).
I hope that valueTypeEqualsFieldType is something you (and @vonzshik) can get used to, but if you feel strongly about the name or the approach we should discuss it a bit.
The trend for this PR should be to have assertion argument names better describe which inputs are being checked against which outputs, and to drive down the number of (argument level) knobs. The fewer we can get away with the more we can confidently say we have a predictable and consistent mapping system, which also helps future code align better to those standards.
There was a problem hiding this comment.
So I think we're we might have a gap is around looking at the type mapping system from the inside or from the outside... I'm generally looking at these tests 100% from a user perspective - I generally like that tests encode the user contract as much as possible (where that makes sense). And if you look at our type mapping docs, you'll see the same "default" concept expressed in the two tables:
- On the read side (1st table), each PG type has a single .NET read type, e.g. PG
textgets read as .NETstringby the untyped APIs (e.g.GetValue()), but can also be read aschar[]when the typed API is used (GetFieldValue<T>()). - In my head (and currently in the docs - 2nd table) the same kinda logic works on the write side: each .NET type has a single PG type, which is what Npgsql writes unless you tell it to write something else via NpgsqlDbType/DbType/DataTypeName.
To me both of the above map pretty nicely to the concept of "default" - from a very user-oriented perspective - are you saying you think it's problematic? If so, would you explain it in another way to users in the docs, which would simplify things somehow? Or can you give a problematic scenario or two where describing things to users as "default" (either on the read or write side) does a disservice/is incorrect? Because if thinking about this as "default" works for explaining our behavior to users, it feels like it should work in the tests too (though I get that this might not always be 100% true).
Regarding the isDefaultForReading name change (or just isDefault for AssertTypeRead) to isValueTypeDefaultFieldType. Given that we always align the type coming from GetValue() and FieldType (this was also true in the handler system) I elaborated the name to add the mention of its input valueType (typeof(T) basically), and the output it checks FieldType.
Sure, I understand - in my head the alignment of GetValue() and FieldType just naturally flows from, well, having a default .NET type that's read for the given PG type (i.e. since PG text is by default read as string, GetValue() and FieldType always return string). So your proposed naming just seems to go more low-level, encompass two things and therefore be a bit less immediately understandable, where there's (for me) a simple high-level concept that already expressed this. But I can't say it bothers me that much, if it's important to you.
BTW I'm posting a long/serious answer here because it's an interesting discussion, but it's not like I have super-strong feelings about it either :) This is just test factoring after all.
| bool isDefault = true, | ||
| Func<T, T, bool>? comparer = null, | ||
| Type? fieldType = null) | ||
| public enum DataTypeInferenceKind |
There was a problem hiding this comment.
Maybe document these? I honestly am not sure what exactly they mean in the context of these tests and need to read the code to figure it out.
There was a problem hiding this comment.
Yes, will do. I'll also add a bit of rationale here about the what and why of dataTypeInference as the assertion argument.
There was a problem hiding this comment.
Took a shot at letting opus generate a rationale.
Please explain what dataTypeInference in AssertType does
Opus:
DataTypeInference controls how the test validates Npgsql's ability to infer the PostgreSQL data type from a CLR value.
What it tests:
-
Exact(ortrue): The CLR type uniquely maps to the expected PostgreSQL type. The test verifies that setting only theValueproperty (withoutNpgsqlDbTypeorDataTypeName) correctly infers the exact type. Both generic and non-generic parameters are tested. -
WellKnown: The CLR type maps to some PostgreSQL type (notUnknown), but not necessarily the expected one. Used when a type is recognized but ambiguous—the test accepts whatever type Npgsql infers rather than requiring an exact match. -
Unknown(orfalse): The CLR type cannot be used to infer any PostgreSQL type. The test expectsDataTypeNameto benullandNpgsqlDbTypeto beUnknown. Value-only parameters are excluded from the roundtrip test since they'd fail.
In practice:
The test always validates explicit type specification (via NpgsqlDbType, DataTypeName, DbType). The DataTypeInference parameter determines whether value-only scenarios are tested and what inference behavior is expected.
Could you explain why WellKnown exists over having just Unknown without negative matching?
Opus:
The key difference is what gets tested vs skipped:
With WellKnown (current approach):
- Value-only parameters are still included in the roundtrip test
- The test writes the value and verifies it round-trips correctly
- It just doesn't assert the specific inferred type (accepts whatever Npgsql infers)
With only Unknown (no negative matching):
- Value-only parameters would be skipped entirely from the test
- You lose coverage that the value can be written at all via inference
Example: Testing string → json
| Approach | What happens |
|---|---|
WellKnown |
Test writes string with value-only, verifies round-trip works (even though Npgsql infers text, not json) |
Unknown |
Test skips value-only entirely—no verification that string values work |
The WellKnown approach catches regressions where a CLR type that used to infer successfully suddenly fails to infer anything. With Unknown, you'd never notice because that scenario isn't tested.
There was a problem hiding this comment.
I'll add that this parameter just replaces isDataTypeInferredFromValue (it's a rename + the addition of the WellKnown to enable negative matching). However, as value only inference is really the only observable property for what we used to consider 'defaultness' it also meant isDefaultForWriting could be removed. Similarly to how valueTypeEqualsFieldType is all that is relevant for defaultness on the read side.
There was a problem hiding this comment.
- Went over DataTypeInference and I think I understand it much better now, thanks for the patience. I get the idea of switching to three values instead of two (as previously) - I think it makes sense.
- I guess part of the difficulty in my brain is that I see AssertType as asserting something about the mapping, i.e. about the relationship between the CLR type and the PG type. DataTypeInference.Nothing actually doesn't care about the PG type of the test at all; another way to say this, is that all tests for a given CLR type (assuming there are multiple mappings) must "needlessly" repeat DataTypeInference.Nothing, where maybe just having a single "not inferred at all" test would be slightly cleaner. On the other hand for the simpler case where there's a one-to-one mapping between CLR/PG types, DataTypeInference.Nothing is more economic (one test instead of two). So I'm fine with your proposal.
- Now that I understand DataTypeInference better, I don't think there's a particular need for the true/false implicit conversion... true is the default, so I wouldn't expect to actually ever see it explicitly in AssertType, and DataTypeInference.Nothing seems clear enough to me personally. I think that also means that DataTypeInference can be a simple enum rather than a struct?
- Maybe my brain likes Inferred/NotInferred/NothingInferred for explicitness rather than Match/Mismatch/Nothing but not very important (am OK with your current naming if you prefer that).
- Probably the xmldocs on DataTypeInference could make this easier for the uninitiated... Here's some notes that I wrote for myself while looking at this:
- DataTypeInference.Nothing: No data type is inferred from the CLR type. This is for CLR types that are statically unknown to Npgsql (plugin types: NodaTime/NTS, composite types, enums...), or where we specifically don't want to infer a PG type because there's no good option (e.g. uint can be mapped to oid/xid/cid, but we don't want any of these as a default/inferred data type)
- The docs currently say "matches (or differs) the values under test". First, we compare against the explicitly-provided data type argument passed to AssertType right? Because values makes this a bit confusing. Also should it be singular rather than plural values?
f8e0989 to
b172faf
Compare
b172faf to
2b49d7a
Compare
2b49d7a to
5d8d1e0
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 36 out of 36 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
a15c0eb to
86d6b42
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 35 out of 35 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
86d6b42 to
372b458
Compare
372b458 to
edcf6e1
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 35 out of 35 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
edcf6e1 to
6d766cc
Compare
6d766cc to
31ef76a
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 35 out of 35 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@NinoFloris I didn't look at the actual code but read the descriptions and just want to thank you because I think that this is very valuable work. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 35 out of 35 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Thanks, Brar. These are 'just tests' so it does quickly feel excessive to spend too much time on this. That said, I’m happy if it may prevent future bugs and helps steer contributors toward implementing new mappings correctly. |
This PR contains a few things:
A notable example of this is an edge-case test for DbType.VarNumeric, which maps to NpgsqlDbType.Numeric, while DbType.Decimal is the canonical mapping to Numeric. In this test we want to be able to set VarNumeric on DbType and read this value back out of the property (currently CheckInference contains a roundtripping quirk for this, but that will go after Add IDbTypeResolver to allow plugins to control how DbTypes are mapped #6267 is merged, when DbType gets its own storage). The other inference checks however still want to observe the canonical DbType.Decimal when inferring (e.g. from a decimal value, or when the dataTypeName is "decimal" or "numeric").
A more common example is where dataTypeName maps to DbType.Object while the value inferred DbType would be something more strongly typed, like DbType.String or Binary (bitstring or datetime tests are a good source for these).
Note, some of the tests have additional asserts included. In those cases I usually ran into them while trying to lock down the right assertion changes, and took the liberty to leave the asserts in as long as they provided value.