Field caps api - report back if fields are single-valued or not. #80730

Closed
markharwood wants to merge 14 commits into elastic:main from markharwood:fix/58523_by_reporting

Conversation

@markharwood
Contributor

@markharwood markharwood commented Nov 15, 2021

Inspects index contents for each requested field to reveal whether all docs hold single values or not.
This is expected to be useful information for clients, e.g. helping Kibana understand whether it ever makes sense to AND different values from the same field when constructing drill-down queries.

Relates to #58523
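
As a rough illustration of the client-side use case described above, the sketch below shows how a client might pick the boolean operator for a drill-down filter once it knows whether a field is single-valued. The class `DrillDownFilters`, the method `combine`, and the query-string-style output are hypothetical and are not taken from this PR, Elasticsearch, or Kibana.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical client-side helper; not part of Elasticsearch or Kibana.
final class DrillDownFilters {

    /**
     * Joins the selected values of one field into a single filter expression.
     * Assumption: `singleValued` comes from a capability report like the one
     * proposed in this PR.
     */
    static String combine(String field, List<String> values, boolean singleValued) {
        // No doc in a single-valued field can hold two different values, so
        // AND-ing different values would match nothing; fall back to OR.
        String operator = singleValued ? " OR " : " AND ";
        return values.stream()
                .map(value -> field + ":\"" + value + "\"")
                .collect(Collectors.joining(operator));
    }

    public static void main(String[] args) {
        System.out.println(combine("host", List.of("a", "b"), true));
        System.out.println(combine("tags", List.of("x", "y"), false));
    }
}
```

Defaulting to AND for multi-valued fields is only one possible choice; a real client could just as well let the user pick the operator in that case.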

@markharwood markharwood added the >enhancement, WIP, and :Search Foundations/Mapping labels Nov 15, 2021
@markharwood markharwood self-assigned this Nov 15, 2021
@elasticmachine elasticmachine added the Team:Search label Nov 15, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@markharwood markharwood force-pushed the fix/58523_by_reporting branch from 6419956 to 072fe96 Compare November 16, 2021 17:44
@markharwood markharwood removed the WIP label Nov 18, 2021
@markharwood markharwood requested a review from jpountz November 19, 2021 09:32
@markharwood markharwood force-pushed the fix/58523_by_reporting branch 2 times, most recently from 263424e to bfe1b70 Compare November 23, 2021 09:42
@markharwood
Contributor Author

One consideration @cbuescher raised is that inspecting index stats in this way will have some inaccuracies when it comes to deleted docs. The worst-case scenario is that an index with a rogue multi-valued doc that was changed to be single-valued will still report the whole index as multi-valued until the old deleted doc is merged out.
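
To make the deleted-docs caveat concrete, here is a toy model. It assumes the check compares the number of docs that have the field with the total number of values, and that both counts include deleted docs until a merge. `ToyFieldStats` and all of its methods are hypothetical; this is not Lucene's or Elasticsearch's API, nor necessarily how this PR computes the result.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of index-level statistics; hypothetical, for illustration only.
final class ToyFieldStats {

    // Each entry is {number of values the doc holds for the field, deleted flag}.
    private final List<int[]> docs = new ArrayList<>();

    /** Indexes a doc holding `valueCount` values and returns its position. */
    int add(int valueCount) {
        docs.add(new int[] {valueCount, 0});
        return docs.size() - 1;
    }

    /** An update only marks the old doc as deleted and appends a new one. */
    int update(int docId, int newValueCount) {
        docs.get(docId)[1] = 1;
        return add(newValueCount);
    }

    /** Statistics cover every doc, deleted or not (the assumption stated above). */
    boolean looksSingleValued() {
        long docsWithField = 0;
        long totalValues = 0;
        for (int[] doc : docs) {
            if (doc[0] > 0) {
                docsWithField++;
                totalValues += doc[0];
            }
        }
        return docsWithField == totalValues;
    }

    /** A merge physically drops deleted docs, which changes positions. */
    void merge() {
        docs.removeIf(doc -> doc[1] == 1);
    }

    public static void main(String[] args) {
        ToyFieldStats stats = new ToyFieldStats();
        stats.add(1);
        int rogue = stats.add(2);                      // the rogue multi-valued doc
        stats.update(rogue, 1);                        // now single-valued
        System.out.println(stats.looksSingleValued()); // stale answer before the merge
        stats.merge();
        System.out.println(stats.looksSingleValued()); // corrected after the merge
    }
}
```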

@markharwood
Contributor Author

Performance feedback

@original-brownbear was kind enough to perform some benchmarking on a large installation (15k shards) and was not able to measure any performance difference on the field caps api between this PR and previous versions.

@markharwood markharwood force-pushed the fix/58523_by_reporting branch from bfe1b70 to 491fa09 Compare November 23, 2021 17:19
@jpountz
Contributor

jpountz commented Nov 24, 2021

> @original-brownbear was kind enough to perform some benchmarking on a large installation (15k shards) and was not able to measure any performance difference on the field caps api between this PR and previous versions.

Out of curiosity how many mapped fields per shard did the benchmark have, and did the index have actual data for each of these fields?

@original-brownbear
Contributor

> Out of curiosity how many mapped fields per shard did the benchmark have, and did the index have actual data for each of these fields?

These were Beats indices (a mix of Audit-, Metric-, etc.), so I'd say on average ~2k fields per index.

> index have actual data for each of these fields?

This was benchmarked using a variation of the Observability logging track. I do not believe that there was data present for all fields. Looking at the code again now that you mention it (sorry, I completely missed checking the number field mapper implementation here), whether or not we have data (and also how much of it) seems to be massively important here.
We don't have a benchmark ready to go for this kind of thing right now (and it's not entirely trivial to set up because of the sheer amount of data it would have to index), but maybe it's time to set that up to be safe here?

@markharwood markharwood force-pushed the fix/58523_by_reporting branch 4 times, most recently from 6fd5650 to 905f426 Compare November 29, 2021 14:59
@markharwood markharwood force-pushed the fix/58523_by_reporting branch from 905f426 to fde413e Compare December 6, 2021 10:20
@markharwood markharwood force-pushed the fix/58523_by_reporting branch from 9638fbf to 203a28d Compare December 15, 2021 11:19
@csoulios csoulios added v8.6.0 and removed v8.5.0 labels Sep 21, 2022
@kingherc kingherc added v8.7.0 and removed v8.6.0 labels Nov 16, 2022
@rjernst rjernst added v8.8.0 and removed v8.7.0 labels Feb 8, 2023
@gmarouli gmarouli added v8.9.0 and removed v8.8.0 labels Apr 26, 2023
@quux00 quux00 added v8.11.0 and removed v8.10.0 labels Aug 16, 2023
@mattc58 mattc58 added v8.12.0 and removed v8.11.0 labels Oct 4, 2023
@elasticsearchmachine elasticsearchmachine added the v8.16.0 and Team:Search Foundations labels and removed the v8.15.0 label Jul 4, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@elasticsearchmachine elasticsearchmachine removed the Team:Search label Jul 4, 2024
@cbuescher
Member

Closing here since in this form the PR is outdated.

@cbuescher cbuescher closed this Jul 17, 2024
@javanna javanna removed the v8.16.0 label Jul 18, 2024

Labels

>enhancement, :Search Foundations/Mapping, Team:Search Foundations
