While implementing pickling for the extension type in PyGEOS (pygeos/pygeos#190), we are having a discussion about how faithful the roundtrip of pickle should be. And specifically for LinearRings, because that geometry type is not supported in WKB, which is the serialization format being used for pickling.
Currently, Shapely preserves LinearRings itself, but LinearRings that are part of GeometryCollections become LineStrings after pickle roundtrip:
In [1]: from shapely.geometry import LinearRing, GeometryCollection
In [2]: import pickle
In [3]: ring = LinearRing([(0,0), (1,1), (0,1), (0,0)])
In [4]: col = GeometryCollection([ring])
In [5]: print(ring)
LINEARRING (0 0, 1 1, 0 1, 0 0)
In [6]: print(pickle.loads(pickle.dumps(ring)))
LINEARRING (0 0, 1 1, 0 1, 0 0)
In [7]: print(col)
GEOMETRYCOLLECTION (LINEARRING (0 0, 1 1, 0 1, 0 0))
In [8]: print(pickle.loads(pickle.dumps(col)))
GEOMETRYCOLLECTION (LINESTRING (0 0, 1 1, 0 1, 0 0))
The reason for this is that a WKB roundtrip cannot preserve LinearRings, as they get represented as linestrings:
In [21]: from shapely import wkb
In [22]: print(wkb.loads(wkb.dumps(ring)))
LINESTRING (0 0, 1 1, 0 1, 0 0)
For actual LinearRings objects, Shapely overrides this by having a custom pickling function specifically for this class (which is not possible this way for GeometryCollection):
https://github.com/Toblerity/Shapely/blob/9a54c7eb4946b8eb420091bd53d8d3930b61ea5d/shapely/geometry/polygon.py#L80-L87
So the questions we are having:
- Would we be fine with documenting this as a "known limitation" of pickling that it doesn't preserve LinearRings? Or does Shapely want to keep this roundtrip faithful?
- In any case, I don't think we want to fix the LinearRing-in-GeometryCollection case?
The main motivation for this issue is that "just using WKB" without having to keep track of the original geometry type in some way is easier to implement the reducer on the extension type in C.
(Now I am writing this, I realize we could also implement the __reduce__ on the C level for the base class using just WKB, and if we want faithful roundtrip for LinearRings, we can still override this base __reduce__ in Python only for LinearRings (as is actually done now as well). That will mean that pickling an array of LinearRings will be slower than if only implemented in C, but since it is a special corner case anyway to have many linearrings, that might be worth it for preserving the correct roundtrip)
cc @caspervdw @brendan-ward
While implementing pickling for the extension type in PyGEOS (pygeos/pygeos#190), we are having a discussion about how faithful the roundtrip of pickle should be. And specifically for LinearRings, because that geometry type is not supported in WKB, which is the serialization format being used for pickling.
Currently, Shapely preserves LinearRings itself, but LinearRings that are part of GeometryCollections become LineStrings after pickle roundtrip:
The reason for this is that a WKB roundtrip cannot preserve LinearRings, as they get represented as linestrings:
For actual LinearRings objects, Shapely overrides this by having a custom pickling function specifically for this class (which is not possible this way for GeometryCollection):
https://github.com/Toblerity/Shapely/blob/9a54c7eb4946b8eb420091bd53d8d3930b61ea5d/shapely/geometry/polygon.py#L80-L87
So the questions we are having:
The main motivation for this issue is that "just using WKB" without having to keep track of the original geometry type in some way is easier to implement the reducer on the extension type in C.
(Now I am writing this, I realize we could also implement the
__reduce__on the C level for the base class using just WKB, and if we want faithful roundtrip for LinearRings, we can still override this base__reduce__in Python only for LinearRings (as is actually done now as well). That will mean that pickling an array of LinearRings will be slower than if only implemented in C, but since it is a special corner case anyway to have many linearrings, that might be worth it for preserving the correct roundtrip)cc @caspervdw @brendan-ward