Grid aggregations for Cartesian points and shapes #90160

@craigtaverner

Description

There is a meta-task for adding Cartesian support to aggregations, which includes two simpler aggregations: centroid and bounding box. Support for grids is expected to be more complex. There are three geo_grid aggregations, with differing levels of applicability to Cartesian data:

The geohex aggregation assumes a spherical model of the data and cannot be applied to Cartesian data at all. The other two, geohash and geotile, both assume a flat projection (Mercator or Web Mercator) and could be applied to Cartesian data but for one issue: they both assume range bounds applicable to latitudes and longitudes on the Earth. For example, with geotile the tile is described by a zoom level and an x and y coordinate, where 0,0 is the top-left tile of the Mercator projection (its corner being at longitude -180 and latitude approximately +85 degrees), and increasing values of x and y describe tiles further to the right and down from that corner.
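
To make the geotile addressing concrete, here is a minimal sketch of the commonly used Web Mercator tile formula. The function name `geotile` is just for illustration and this is not taken from the Elasticsearch code; it only shows why the scheme is tied to fixed longitude/latitude bounds.

```python
import math


def geotile(lon, lat, zoom):
    """Return the (x, y) tile containing lon/lat (degrees) at the given zoom.

    Tile 0,0 is the top-left tile; x grows to the right, y grows downwards.
    """
    n = 1 << zoom  # number of tiles along each axis at this zoom level
    # Longitude maps linearly onto the fixed range [-180, 180].
    x = int((lon + 180.0) / 360.0 * n)
    # Latitude goes through the Mercator projection, valid to roughly +/-85.05 degrees.
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the right or bottom edge fall in the last tile.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


if __name__ == "__main__":
    print(geotile(-180.0, 85.0511, 5))  # top-left corner of the projection
    print(geotile(0.0, 0.0, 1))
```

The constants 180, 360 and the Mercator latitude limit are baked into the mapping, which is exactly what has no equivalent for arbitrary Cartesian data.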

To get either geotile or geohash to work with Cartesian data, we would need to define a tile mapping scheme that sets, at the very least, the bounds and zoom range. There are various approaches to consider:

  • Use the existing ability to provide a geo_bounding_box as a filter to the aggregation to define the bounds.
    • Pros: no new keys need to be defined
    • Cons: we cannot compare the results of queries with different filter bounds, so this reduces the usefulness of the feature
  • Define the bounds using a new field in the geotile_grid aggregation, so that it is not a filter but a definition of the tile mapping
    • Pros: we can compare queries with different filter bounds, as long as they have the same tile mapping bounds
    • Cons: users could change the tile mapping and still expect results to align, which they will not. It is also inconvenient to have to specify something that seems to be a characteristic of the data (the domain bounds) in every query.
  • Define the bounds in the index mapping
    • Pros: since the bounds are likely meaningful in the data domain, this is a more domain-specific way of storing this information. It also means that data within the bounds is linearly comparable, while data outside the bounds is invalid.
    • Cons: this goes against the current implementation which treats cartesian data in a non-linear way (as does floating point encoding). It also requires knowing the domain bounds before importing data. Possibly that can be fixed with re-indexing.
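
All three options need the same underlying mapping from a Cartesian point to a tile; they differ only in where the bounds come from. A minimal sketch of such a mapping follows, assuming a linear grid with 0,0 at the top left as in geotile. The names `Bounds` and `cartesian_tile` are hypothetical and do not exist in Elasticsearch.

```python
from typing import NamedTuple, Optional, Tuple


class Bounds(NamedTuple):
    """Hypothetical domain bounds for a Cartesian tile mapping."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def cartesian_tile(x: float, y: float, zoom: int,
                   bounds: Bounds) -> Optional[Tuple[int, int]]:
    """Map a Cartesian point to a (tile_x, tile_y) pair, or None if out of bounds."""
    inside_x = bounds.min_x <= x <= bounds.max_x
    inside_y = bounds.min_y <= y <= bounds.max_y
    if not (inside_x and inside_y):
        return None  # outside the domain bounds: treated as invalid here
    n = 1 << zoom  # tiles per axis
    # Linear fractions across the domain; y is flipped so that 0 is the top row.
    fx = (x - bounds.min_x) / (bounds.max_x - bounds.min_x)
    fy = (bounds.max_y - y) / (bounds.max_y - bounds.min_y)
    # Clamp so that points exactly on the far edges fall in the last tile.
    return min(int(fx * n), n - 1), min(int(fy * n), n - 1)


if __name__ == "__main__":
    small = Bounds(0.0, 0.0, 100.0, 100.0)
    large = Bounds(0.0, 0.0, 200.0, 200.0)
    # The same point lands in different tiles under different mapping bounds.
    print(cartesian_tile(10.0, 90.0, 2, small))
    print(cartesian_tile(10.0, 90.0, 2, large))
```

The last two calls illustrate the comparability concern: tile coordinates are only meaningful relative to the bounds that produced them, so results from queries using different mapping bounds do not align.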
