Skip to content
This repository was archived by the owner on Jul 28, 2025. It is now read-only.

Add geometry_generator factory for programmatic generation of geometry arrays#998

Merged
rapids-bot[bot] merged 49 commits intorapidsai:branch-23.04from
isVoid:benchmark/point_polygon_distance
Mar 22, 2023
Merged

Add geometry_generator factory for programmatic generation of geometry arrays#998
rapids-bot[bot] merged 49 commits intorapidsai:branch-23.04from
isVoid:benchmark/point_polygon_distance

Conversation

@isVoid
Copy link
Copy Markdown
Contributor

@isVoid isVoid commented Mar 17, 2023

Description

To allow easy benchmarking for future APIs, I created geometry_generator that programmatically generate a geoarrow compliant geometry arrays. The first feature landing is generate_multipolygon_array.

This PR also adds BaseFixtureWithParam, which allows user to parameterize the tests with value and ranges.

Contributes to #259

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@github-actions github-actions bot added cmake Related to CMake code or build configuration libcuspatial Relates to the cuSpatial C++ library labels Mar 17, 2023
@isVoid isVoid self-assigned this Mar 17, 2023
@isVoid isVoid marked this pull request as ready for review March 20, 2023 05:28
@isVoid isVoid requested review from a team as code owners March 20, 2023 05:28
@isVoid isVoid requested review from harrism and trxcllnt March 20, 2023 05:28
@isVoid isVoid changed the title [skip-ci] Add geometry_generator factory for programmatic generation of geometry arrays Add geometry_generator factory for programmatic generation of geometry arrays Mar 20, 2023
@isVoid isVoid requested a review from thomcom March 20, 2023 05:28
@isVoid isVoid added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change feature request New feature or request and removed improvement Improvement / enhancement to an existing function labels Mar 20, 2023
@isVoid isVoid mentioned this pull request Mar 20, 2023
3 tasks
Copy link
Copy Markdown
Member

@harrism harrism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments. Thanks for this utility!

@isVoid
Copy link
Copy Markdown
Contributor Author

isVoid commented Mar 22, 2023

/merge

@rapids-bot rapids-bot bot merged commit 515e0c4 into rapidsai:branch-23.04 Mar 22, 2023
rapids-bot bot pushed a commit that referenced this pull request May 23, 2023
This PR separates the `pairwise_point_polygon_distance` benchmark portion of PR #1002. While that PR is only left for nvtx3 experiments.

# Original PR description:

This PR adds pairwise point polygon distance benchmark. Depends on #998

Point-polygon distance performance can be affected by many factors, because the geometry is complex in nature. I benchmarked these questions:
1. How does the algorithm scales with simple multipolygons?
2. How does it scales with complex multipolygons?

## How does the algorithm scales with simple multipolygons?
The benchmark uses the most simple multipolygon, 3 sides per polygon, 0 hole and 1 polygon per multipolygon.

Float32
| Num   multipolygon | Throughput   (#multipolygons / s) |
| --- | --- |
| 1 | 28060.32971 |
| 100 | 2552687.469 |
| 10000 | 186044781 |
| 1000000 | 1047783101 |
| 100000000 | 929537385.2 | 

Float64
| Num   multipolygon | Throughput   (#multipolygons / s) |
| --- | --- |
| 1 | 28296.94817 |
| 100 | 2491541.218 |
| 10000 | 179379919.5 |
| 1000000 | 854678939.9 |
| 100000000 | 783364410.7 |

![image](https://user-images.githubusercontent.com/13521008/226502300-c3273d80-5f9f-4d53-b961-a24e64216e9b.png)

The chart shows that with simple polygons and simple multipoint (1 point per multipoint), the algorithm scales pretty nicely. Throughput is maxed out at near 1M pairs.

## How does the algorithm scales with complex multipolygons?

The benchmark uses a complex multipolygon, 100 edges per ring, 10 holes per polygon and 3 polygons per multipolygon.

float32
Num   multipolygon | Throughput   (#multipolygons / s)
-- | --
1000 | 158713.2377
10000 | 345694.2642
100000 | 382849.058

float64
Num   multipolygon | Throughput   (#multipolygons / s)
-- | --
1000 | 148727.1246
10000 | 353141.9758
100000 | 386007.3016

![image](https://user-images.githubusercontent.com/13521008/226502732-0d116db7-6257-4dec-b170-c42b30df9cea.png)

The algorithm reaches max throughput at near 10K pairs. About 100X lower than the simple multipolygon example.

Authors:
  - Michael Wang (https://github.com/isVoid)
  - Mark Harris (https://github.com/harrism)

Approvers:
  - Mark Harris (https://github.com/harrism)

URL: #1131
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

cmake Related to CMake code or build configuration feature request New feature or request libcuspatial Relates to the cuSpatial C++ library non-breaking Non-breaking change

Projects

Status: Todo

Development

Successfully merging this pull request may close these issues.

2 participants