Given a tree constructed with target geometries, we should be able to get a sorted array of the indexes and distances of the N nearest neighbors, within some maximum distance. Any geometries that intersect get a distance of 0 and compete for the nearest position (not sure how to sort that out yet).
Something like:
>> tree = pygeos.STRtree(pygeos.points(np.arange(10), np.arange(10)))
>> tree.nearest(pygeos.point(0.25, 0.25), neighbors=2, max_distance=1000)
[array([0, 1]), array([0.353553, 1.06066])]
Likewise, a bulk version of this (similar to #108) could return 3 arrays:
>> tree = pygeos.STRtree(pygeos.points(np.arange(10), np.arange(10)))
>> tree.nearest([pygeos.point(0.25, 0.25), pygeos.point(2,2)], neighbors=2, max_distance=1000)
[array([0, 0, 1, 1]), array([0, 1, 2, 1]), array([0.353553, 1.06066, 0, 1.4142])]
There is an existing GEOSSTRtree_nearest_generic in GEOS, but it doesn't do what we want here, so I think we should implement our own approach in C; it is focused on returning the nearest geometry instead of the index of the nearest geometry. GEOSSTRtree_nearest_generic takes a callback function that gets passed the index of the geometries in the tree, and the source geometry. While it is easy to calculate the distance between them, it is much less easy to get the GEOS geoemetry object from an index in the tree, without also having a clean way to pass in the tree or geometries into the callback function. Ultimately, we want the indexes and distances of more than just the nearest neighbor.
I think we can do the following:
- calculate the envelope of the source geometry
- expand this up to
max_distance as a new envelope
- query the tree using this envelope
- iterate over results from query and sort by distance from source geometry
- extract N nearest results and return arrays of indexes and distances
Given a tree constructed with target geometries, we should be able to get a sorted array of the indexes and distances of the N nearest neighbors, within some maximum distance. Any geometries that intersect get a distance of 0 and compete for the nearest position (not sure how to sort that out yet).
Something like:
Likewise, a bulk version of this (similar to #108) could return 3 arrays:
There is an existing
GEOSSTRtree_nearest_genericin GEOS, but it doesn't do what we want here, so I think we should implement our own approach in C; it is focused on returning the nearest geometry instead of the index of the nearest geometry.GEOSSTRtree_nearest_generictakes a callback function that gets passed the index of the geometries in the tree, and the source geometry. While it is easy to calculate the distance between them, it is much less easy to get the GEOS geoemetry object from an index in the tree, without also having a clean way to pass in the tree or geometries into the callback function. Ultimately, we want the indexes and distances of more than just the nearest neighbor.I think we can do the following:
max_distanceas a new envelope