Implement Geometry UDTs #1375

Merged
rfecher merged 1 commit into locationtech:master from JWileczek:geom-udt
Aug 23, 2018

@JWileczek
Contributor

Current implementation of Geometry UDTs. The PR commits still need to be squashed, but I am currently trying to solve an issue that will require some more development.

All tests currently pass locally, and serialization and deserialization of the UDTs work in Java RDDs and DataFrames. The remaining issue is a frontend display problem in notebooks that use pyspark + pixiedust. We normally display a DataFrame with pixiedust's display function, i.e. display(df)
However, displaying with pixiedust fails the job because Spark first tries to convert the type to/from JSON before displaying it. After more digging, it turns out this requires a Python UDT in addition to the Java/Scala UDT. We should be able to use the Shapely library as the backing serialized object for each geometry type in Python. This change may take some additional time, because it is unclear to me how to work a Python package into the current build process and deploy it properly with jupyter + spark.
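
As a rough illustration of the round trip a Python-side UDT would need to provide, here is a minimal standard-library sketch. It is not the code in this PR. `Point` and `PointUDTSketch` are hypothetical names: `Point` stands in for a Shapely geometry, and `PointUDTSketch` only mirrors the `serialize`/`deserialize` method names of `pyspark.sql.types.UserDefinedType` without subclassing it. It assumes geometries are stored as WKB and handles only little-endian 2D points; in a real package, `shapely.wkb.dumps` and `shapely.wkb.loads` would presumably replace the hand-rolled packing.

```python
import struct
from dataclasses import dataclass

# WKB constants for the one case this sketch handles.
WKB_LITTLE_ENDIAN = 1  # byte-order flag
WKB_POINT = 1          # geometry type code for a 2D point


@dataclass(frozen=True)
class Point:
    """Hypothetical stand-in for a Shapely point geometry."""
    x: float
    y: float


class PointUDTSketch:
    """Hypothetical sketch of the serialize/deserialize pair of a Python UDT.

    A real PySpark UDT would subclass pyspark.sql.types.UserDefinedType;
    that is left out here so the example runs without Spark installed.
    """

    def serialize(self, obj: Point) -> bytes:
        # Byte-order flag, uint32 geometry type, then x and y as doubles.
        return struct.pack("<BIdd", WKB_LITTLE_ENDIAN, WKB_POINT, obj.x, obj.y)

    def deserialize(self, datum: bytes) -> Point:
        byte_order, geom_type, x, y = struct.unpack("<BIdd", datum)
        if byte_order != WKB_LITTLE_ENDIAN or geom_type != WKB_POINT:
            raise ValueError("this sketch only handles little-endian WKB points")
        return Point(x, y)


udt = PointUDTSketch()
blob = udt.serialize(Point(1.5, -2.0))
restored = udt.deserialize(blob)
```

The point of the sketch is only that the Python side must be able to turn the stored bytes back into a geometry object and vice versa; which binary format the Java/Scala UDT actually uses is an assumption here.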

Alternatively, we can use the built-in DataFrame display,
i.e. df.show()
which works and displays the geometries in their WKT string format, but that makes our use of pixiedust in the notebooks relatively pointless beyond serving as a job status tracker. Ideally, I would like the Python package + UDT as the long-term solution. I am going to attempt to get everything in place over the weekend.

@coveralls

coveralls commented Aug 17, 2018

Coverage Status

Coverage decreased (-5.7%) to 42.57% when pulling 693c655 on JWileczek:geom-udt into 2b0c3b9 on locationtech:master.

@JWileczek JWileczek force-pushed the geom-udt branch 2 times, most recently from 79749b1 to 3af012e August 23, 2018 15:40
@JWileczek
Contributor Author

Geometry UDTs now work in both Java Spark and pyspark. I did a rough Maven module + build for the Python package and integrated it with the existing bootstrap for jupyter + jupyterhub. The final location of the package for distribution, and how it is worked into the master build, still need to be solidified.

@JWileczek JWileczek requested a review from rfecher August 23, 2018 16:38
@rfecher rfecher merged commit fc77edd into locationtech:master Aug 23, 2018

3 participants