GEOMETRY Rework: Part 4 - Fixup Parquet Extension + Add Arrow Support #19476
Merged
Mytherin merged 7 commits into duckdb:main on Nov 7, 2025
Mytherin
approved these changes
Nov 5, 2025
paleolimbot
reviewed
Nov 6, 2025
// Otherwise, unrecognized encoding
throw NotImplementedException("Unsupported geometry encoding");
// TODO: Pass the actual target type here so we get the CRS information too
struct ArrowGeometry {
	static unique_ptr<ArrowType> GetType(const ArrowSchema &schema, const ArrowSchemaMetadata &schema_metadata) {
		// Validate extension metadata. This metadata also contains a CRS, which we drop
		// because the GEOMETRY type does not implement a CRS at the type level (yet).
Contributor
Boo! (Kidding, I know this is hard)
Comment on lines +36 to +38
statement ok
insert into t_all_types values
(1, 'POINT (1 2)'),
Contributor
A bunch of examples at https://github.com/apache/parquet-testing/blob/master/data/geospatial/geospatial.yaml as well if you ever get burnt out coming up with these (I frequently do 🙂 )
Collaborator
Thanks!
Mytherin
added a commit
that referenced
this pull request
Nov 20, 2025
…19848) This is a followup PR that builds on top of #19476. Please have a look at #19136 for the context behind this PR. I realized that the `Geometry::FromBinary`/`Geometry::ToBinary` helper functions need to be adjusted slightly so that they can be used to implement the cast functions provided in `duckdb-spatial`. These casts may move to core eventually, but for now this is required to integrate the spatial extension smoothly with the new geometry type.
This is a followup PR that builds on top of #19439. Please have a look at #19136 for the context behind this PR.
This PR fixes up the remaining issues in the parquet extension related to geometries. When reading geometry columns, we now push an expression column reader on top of the underlying blob column reader to perform the WKB parsing with `ST_GeomFromWKB`. `ST_GeomFromWKB` now actually checks that the input is valid WKB and also converts from big-endian WKB to little-endian if required. This can be optimized further, but it's good enough for now.

I've also added support for converting geometry columns to/from Arrow arrays with geoarrow extension metadata. This code is basically lifted straight from the spatial extension.
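To illustrate the kind of validation and big-endian to little-endian conversion described for the WKB parsing step, here is a minimal standalone sketch. It is not the code from this PR: `NormalizePointWKB` is a hypothetical helper, it only handles a 2D `POINT` (WKB type 1, 21 bytes), and it assumes the standard WKB layout of a one-byte order marker (0 = big-endian, 1 = little-endian) followed by a 32-bit type and two 64-bit doubles.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of DuckDB): validates a 2D WKB POINT and
// returns an equivalent little-endian encoding.
std::vector<uint8_t> NormalizePointWKB(const std::vector<uint8_t> &wkb) {
	constexpr std::size_t kPointSize = 1 + 4 + 8 + 8; // order + type + x + y
	if (wkb.size() != kPointSize) {
		throw std::invalid_argument("WKB point must be exactly 21 bytes");
	}
	const uint8_t order = wkb[0];
	if (order > 1) {
		throw std::invalid_argument("invalid WKB byte-order marker");
	}

	std::vector<uint8_t> out(wkb);
	if (order == 0) {
		// Big-endian input: reverse the bytes of each fixed-width field.
		std::reverse(out.begin() + 1, out.begin() + 5);   // uint32 geometry type
		std::reverse(out.begin() + 5, out.begin() + 13);  // double x
		std::reverse(out.begin() + 13, out.begin() + 21); // double y
		out[0] = 1; // mark the result as little-endian
	}

	// The type field is now stored least-significant byte first.
	const uint32_t type = uint32_t(out[1]) | (uint32_t(out[2]) << 8) |
	                      (uint32_t(out[3]) << 16) | (uint32_t(out[4]) << 24);
	if (type != 1) {
		throw std::invalid_argument("only WKB type 1 (POINT) is handled in this sketch");
	}
	return out;
}
```

A real parser would also have to walk nested geometries (line strings, polygons, collections) and the Z/M variants, swapping each count and coordinate as it goes. Reversing fixed-width fields on a copy keeps the sketch short and leaves already little-endian input unchanged.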