Skip to content
This repository was archived by the owner on Jul 28, 2025. It is now read-only.
This repository was archived by the owner on Jul 28, 2025. It is now read-only.

[FEA]: Random Access Segment Iterators for Ranges #1025

@isVoid

Description

@isVoid

Is this a new feature, an improvement, or a change to existing functionality?

New Feature

How would you describe the priority of this feature request

Medium

Please provide a clear description of problem you would like to solve.

A segment iterator is, as the name suggests, an iterator to all segments of a geometry range. A random access iterator means the ith segment of the range can be retrieved in constant time.

Simple as it sound, the implementation has been held off till this stage. The biggest reason is that the iterators are skipping iterators that's data dependent. For example, a multilinestring array that has 2 linestrings and {3, 4} points each can have 2 and 3 segment for each linestring, and that the point connecting the 2nd and 3rd point is not valid. In plain code:

Multilinestring: 0 1 2
Linestring 1:    0 3 7
Points:          (0,0) (1,1) (2,2) (10,10) (11,11) (12,12) (13,13)
Segments:        (0,0) -> (1,1)  (1,1)->(2, 2) (10, 10)->(11,11) (12, 12)->(13, 13)

Skipping forward iterators are easy to create serially, but not so easy for a random access iterator, especially when they are data dependent.

This feature is important to create a perfectly load balanced kernel to compute the distances between two geometries. This computation kernel requires that when accessing the ith segment from the geometry, it's actually returning the right segment, not "an invalid segment" (e.g. (2,2) -> (10,10)) in the above example. This further makes sure that the thread is computing the correct segment pair after transforming the indices based on the all-pairs alignment.

Describe any alternatives you have considered

Currently, all existing st_distance kernels are not perfectly load balanced. We circumvent the requirement of this feature request by performing a serial loop in the kernel and skip when the iteration hits an "invalid segment". This is not ideal, and as #1002 shows, may lead to unexpected throughput drops if the input data is skewed.

Additional context

No response

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

Status

Done

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions