Conversation
| char get_geom(GeometryObject *obj, GEOSGeometry **out) { | ||
| if (!PyObject_IsInstance((PyObject *) obj, (PyObject *) &GeometryType)) { | ||
| PyTypeObject *type = ((PyObject *)obj)->ob_type; | ||
| if ((type != &GeometryType) & (type->tp_base != &GeometryType)) { |
There was a problem hiding this comment.
Is this change actually needed?
Because I experimented with this before (adding the TPFLAGS_BASETYPE), and I seemed to remember the ufuncs worked on the subclasses without any additional change (at least in python, "isinstance" works for subclasses as well, but don't know if the semantics of the C API version is exactly the same).
There was a problem hiding this comment.
PyObject_IsInstance needs the GIL while the new typecheck doesn't (it operates on structs)
Looking further into the source, it seems that the "exact type match" (which we always have in current master) is catched very soon in the PyObject_IsInstance (here and here) so that's why we probably got away with it in the first place.
|
This now also creates the Points from the ufuncs. I didn't see any performance degradation (but more testing is needed still) |
|
Some better benchmarks: So, you can measure the difference, but it is only 0.033 |
|
This is really cool. Will try to test it a bit more myself as well the coming days! |
|
I also did some similar timings with my laptop tuned for performance checking with pyperf, and with the basic So I agree that, although measurable, this limited slowdown is certainly something that seems worth it. |
|
And as additional test, I quickly added a LineString subclass, to test something more "costly". Creating linestrings of each 100 coordinates pairs from a numpy array, I get a similar overhead per geometry (~67ns), but now this means only a relative performance hit of 1.3% |
|
Ok so let’s do this! I think we should copy as much of the shapely API as we can. Maybe 1 PR per type? |
|
I was experimenting a bit more (also in Shapely), and ran into the following issue: if we have an additional class in the hierarchy (which we will want, as a large amount of the properties and methods are shared accross all geometry types), you get errors. Eg dummy example: class BaseGeometry(Geometry):
@property
def type_id(self):
return get_type_id(self)
class Point(BaseGeometry):
@property
def x(self):
return get_x(self)
@property
def y(self):
return get_y(self)
class LineString(BaseGeometry):
@property
def num_points(self):
return get_num_points(self)
lib.registry[0] = Point
lib.registry[1] = LineStringThen the geometry class recognition does not work: I suppose some check with |
|
Found this in the cython generated code: so using this logic instead of |
I am not sure what is best, but I am currently experimenting with directly doing this in Shapely (https://github.com/Toblerity/Shapely/compare/master...jorisvandenbossche:geometry-subclasses?expand=1). |
|
Work is resuming in #182 |
This makes pygeos.Geometry subclassable and uses tp_base to quickly check if an object is a Geometry subclass.