fix schema inference inside parameterized types#32705
fix schema inference inside parameterized types#32705reuvenlax merged 4 commits intoapache:masterfrom
Conversation
|
Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment |
2cff426 to
bc6fa83
Compare
|
R: @ahmedabu98 |
|
Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control. If you'd like to restart, comment |
ahmedabu98
left a comment
There was a problem hiding this comment.
Thanks, left one suggestion
| public static CoderRegistry createDefault(@Nullable SchemaRegistry schemaRegistry) { | ||
| return new CoderRegistry(schemaRegistry); | ||
| } |
There was a problem hiding this comment.
This might be a breaking change for users
Can we have the old createDefault() method as well and have it return new CoderRegistry(null)?
Would maintain existing use cases and limit the number of files in this PR
There was a problem hiding this comment.
Good point as this is a public method (even though it's probably not intended for use outside of core Beam). Added the old createDefault() back.
…ized types" This reverts commit c243491.
Previously Beam prioritized schemas over coders in inference, but did not inspect nested parameterized types for schemas. This led to some sharp edges for users - e.g. if Foo had a registered schema.
PCollection = readFoo();
Would infer the correct SchemaCoder for Foo. However
PCollection<Iterable> = readAllFoos();
Would not search for a schema, and instead take whatever Coder accepted Foo (possibly SerializableCoder). This led to a lot of confusion for users.
This PR ensures that the schema lookup continues while inspecting type parameters.
Note: this PR touches many files due to a new parameter added to CoderRegistry(), however the vast majority of those changes are trivial.