Grid aggregations for Cartesian points and shapes #90160

### Description

There is a meta-task for adding [Cartesian support for aggregations](https://github.com/elastic/elasticsearch/issues/87804), which includes two simpler aggregations for [centroid](https://github.com/elastic/elasticsearch/issues/90156) and [bounding box](https://github.com/elastic/elasticsearch/issues/90157). Support for grids is expected to be more complex. There are three `geo_grid` aggregations, with differing levels of applicability to Cartesian data:

* [`geohash`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-geohashgrid-aggregation.html)
* [`geotile`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-geotilegrid-aggregation.html)
* [`geohex`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-geohexgrid-aggregation.html)

The `geohex` aggregation assumes a spherical model of the data and cannot be applied to Cartesian data at all. The other two, `geohash` and `geotile`, both assume a flat projection (Mercator or Web Mercator) and could be applied to Cartesian data but for one issue: they both assume range bounds applicable to latitudes and longitudes on the Earth. For example, with `geotile` a tile is described by a zoom level and an x and y coordinate, where `0,0` is the top-left tile of the Mercator projection (its corner being at longitude -180 and latitude approximately +85 degrees), and increasing values of x and y describe tiles further to the right and down from that corner.
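To make the hard-coded bounds concrete, here is a minimal sketch of the widely used Web Mercator ("slippy map") tile formula. The function name `geo_tile` and the constant `MAX_LAT` are illustrative names, not Elasticsearch APIs, and the actual `geotile` implementation may handle edge cases differently.

```python
import math

# Approximate latitude limit of the square Web Mercator map (assumption:
# the standard slippy-map value, not taken from the Elasticsearch source).
MAX_LAT = 85.0511287798066


def geo_tile(zoom, lon, lat):
    """Return the (x, y) tile containing lon/lat at the given zoom level.

    Tile 0,0 is the top-left tile; x grows to the right, y grows downwards.
    """
    tiles = 1 << zoom  # number of tiles along each axis at this zoom
    lat = max(-MAX_LAT, min(MAX_LAT, lat))  # latitudes beyond the limit are clamped

    # Longitude maps linearly onto [0, tiles).
    x = math.floor((lon + 180.0) / 360.0 * tiles)

    # Latitude goes through the Mercator projection before being scaled.
    merc = math.asinh(math.tan(math.radians(lat)))
    y = math.floor((1.0 - merc / math.pi) / 2.0 * tiles)

    # Clamp so that points exactly on the right or bottom edge stay in range.
    x = max(0, min(tiles - 1, x))
    y = max(0, min(tiles - 1, y))
    return x, y


if __name__ == "__main__":
    print(geo_tile(1, 0.0, 0.0))      # (1, 1)
    print(geo_tile(3, -180.0, 85.0))  # (0, 0)
```

The constants `180`, `360` and `MAX_LAT` are exactly the Earth-specific range bounds that have no natural counterpart for arbitrary Cartesian data.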

To get either `geotile` or `geohash` to work with Cartesian data we would need to define a tile mapping scheme that sets, at the very least, the bounds and the zoom range. There are various approaches to consider:

* Use the existing ability to provide a `geo_bounding_box` as a `filter` to the aggregation to define the bounds.
  * Pros: no new keys need to be defined
  * Cons: we cannot compare the results of queries with different filter bounds, so this reduces the usefulness of the feature
* Define the bounds using a new field on the `geotile_grid` aggregation, so that it is not a `filter` but a definition of the tile mapping
  * Pros: we can compare queries with different filter bounds, as long as they have the same tile mapping bounds
  * Cons: it is possible for users to change tile mapping and still expect results to align, which they will not. It is not convenient to have to specify something that seems to be a characteristic of the data (domain bounds) in every query.
* Define the bounds in the index mapping
  * Pros: since the bounds are likely meaningful in the data domain, this is a more domain-specific way of storing this information. It also means that data within the bounds is linearly comparable, while data outside the bounds is invalid.
  * Cons: this goes against the current implementation which treats cartesian data in a non-linear way (as does floating point encoding). It also requires knowing the domain bounds before importing data. Possibly that can be fixed with re-indexing.
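Whichever option supplies the bounds, the tile mapping itself could be a simple linear scaling. The sketch below is only an illustration of that idea: `TileBounds` and `cartesian_tile` are hypothetical names, and returning `None` for points outside the bounds is an assumed policy, not a decided behaviour.

```python
import math
from typing import NamedTuple, Optional, Tuple


class TileBounds(NamedTuple):
    """Hypothetical tile mapping bounds for a Cartesian field."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def cartesian_tile(zoom: int, x: float, y: float,
                   bounds: TileBounds) -> Optional[Tuple[int, int]]:
    """Map a Cartesian point to a (column, row) tile within the bounds.

    Tile 0,0 is the top-left tile, mirroring the `geotile` convention.
    Points outside the bounds return None (an assumed policy).
    """
    inside_x = bounds.min_x <= x <= bounds.max_x
    inside_y = bounds.min_y <= y <= bounds.max_y
    if not (inside_x and inside_y):
        return None

    tiles = 1 << zoom  # number of tiles along each axis at this zoom
    col = math.floor((x - bounds.min_x) / (bounds.max_x - bounds.min_x) * tiles)
    row = math.floor((bounds.max_y - y) / (bounds.max_y - bounds.min_y) * tiles)

    # Points exactly on the right or bottom edge belong to the last tile.
    return min(col, tiles - 1), min(row, tiles - 1)


if __name__ == "__main__":
    small = TileBounds(0.0, 0.0, 100.0, 100.0)
    large = TileBounds(0.0, 0.0, 200.0, 200.0)
    # The same point lands in different tiles under different tile mappings.
    print(cartesian_tile(2, 10.0, 90.0, small))  # (0, 0)
    print(cartesian_tile(2, 10.0, 90.0, large))  # (0, 2)
```

The usage example shows why results are only comparable between queries that share the same tile mapping bounds: changing the bounds moves the same point to a different tile.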
