Refactor how we report response size#143109
Conversation
|
Pinging @elastic/es-search-foundations (Team:Search Foundations) |
There was a problem hiding this comment.
Pull request overview
This PR refactors how response sizes are reported in query activity logging across Elasticsearch, ESQL, EQL, and SQL. The main change is to distinguish between the number of documents returned in a response (now called result_size) and the total number of matching documents (tracked as search.total_hits). Constants for query logging fields have been centralized in a new QueryLogging interface.
Changes:
- Created new
QueryLogginginterface with centralized constants for query-specific logging fields - Renamed the "hits" field to "result_size" to represent documents returned in response (not total matching documents)
- Added
search.total_hitsandsearch.total_hits_gtefields to SearchLogProducer for tracking total matching documents and whether it's an estimate - Updated
SearchLogContext.getHits()to return actual response size (array length) instead of total hits count
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| server/src/main/java/org/elasticsearch/common/logging/activity/QueryLogging.java | New interface defining constants for query logging fields (QUERY_FIELD_QUERY, QUERY_FIELD_HITS, QUERY_FIELD_INDICES) |
| server/src/main/java/org/elasticsearch/action/search/SearchLogProducer.java | Updated to use QueryLogging constants, renamed constant fields, and added logic to log total hits and GTE indicator |
| server/src/main/java/org/elasticsearch/action/search/SearchLogContext.java | Refactored getHits() to return response size and added getTotalHits() for total matching documents |
| server/src/internalClusterTest/java/org/elasticsearch/search/SearchLoggingIT.java | Updated tests to use new constants and added test coverage for total_hits fields |
| x-pack/plugin/sql/src/main/java/org/elasticsearch/xpack/sql/logging/SqlLogProducer.java | Updated to use QueryLogging constants instead of direct string concatenation |
| x-pack/plugin/sql/src/internalClusterTest/java/org/elasticsearch/xpack/sql/action/SqlLoggingIT.java | Updated test imports and assertions to use new QUERY_FIELD_HITS constant |
| x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/querylog/EsqlLogProducer.java | Updated to use QueryLogging constants instead of direct string concatenation |
| x-pack/plugin/esql/src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/EsqlQueryLoggingIT.java | Updated test imports and assertions to use new QUERY_FIELD_HITS constant |
| x-pack/plugin/eql/src/main/java/org/elasticsearch/xpack/eql/logging/EqlLogProducer.java | Updated to use QueryLogging constants for all query-related fields |
| x-pack/plugin/eql/src/internalClusterTest/java/org/elasticsearch/xpack/eql/action/EqlLoggingIT.java | Updated test imports and assertions to use new constants |
| x-pack/plugin/async-search/src/internalClusterTest/java/org/elasticsearch/xpack/search/AsyncSearchLoggingIT.java | Updated test imports and assertions to use new constants |
| test/framework/src/main/java/org/elasticsearch/test/ActivityLoggingUtils.java | Updated to use QUERY_FIELD_QUERY constant in test utility methods |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
server/src/main/java/org/elasticsearch/action/search/SearchLogProducer.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/common/logging/activity/QueryLogging.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/action/search/SearchLogProducer.java
Outdated
Show resolved
Hide resolved
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
server/src/main/java/org/elasticsearch/common/logging/activity/QueryLogging.java
Show resolved
Hide resolved
| public interface QueryLogging { | ||
| String ES_QUERY_FIELDS_PREFIX = ES_FIELDS_PREFIX + "querying."; | ||
| String QUERY_FIELD_QUERY = ES_QUERY_FIELDS_PREFIX + "query"; | ||
| String QUERY_FIELD_HITS = ES_QUERY_FIELDS_PREFIX + "result_size"; |
There was a problem hiding this comment.
Let's name this returned_result_count... as in we are counting (only the returned) results, not sizing the results. Also size is too overloaded and it could also mean the size (in bytes) of the response.
There was a problem hiding this comment.
Re returned_result_count, is returned_ redundant? I do not think we have any other results.
There was a problem hiding this comment.
NIT: we should probably replace HITS in the name of the field with RESULT/RESULT_COUNT
There was a problem hiding this comment.
is returned_ redundant? I do not think we have any other results.
we do, there will be total_result_count as well (i.e. hits.total.value for DSL and we'll get that number for other query languages in time as well)
There was a problem hiding this comment.
Yes you're right, "count" is better than "size" I'll change it.
| public static final String SEARCH_IS_SYSTEM_FIELD = ES_FIELDS_PREFIX + "search.is_system"; | ||
| public static final String QUERY_FIELD_HAS_AGGREGATIONS = ES_QUERY_FIELDS_PREFIX + "has_aggregations"; | ||
| public static final String QUERY_FIELD_IS_SYSTEM = ES_QUERY_FIELDS_PREFIX + "is_system"; | ||
| public static final String QUERY_FIELD_SEARCH_HITS = ES_QUERY_FIELDS_PREFIX + "search.total_hits"; |
There was a problem hiding this comment.
let's name this search.total_result_count
There was a problem hiding this comment.
Well here it's trickier, because total hits is not result count - in fact, none of the "total hits" documents may even appear in results. Its kinda hypothetical value saying "how many documents would be matched by this search, if other things didn't intervene", so using "result" here may be misleading. It's unique to the DSL search too, other modes don't have anything like that.
There was a problem hiding this comment.
Then you can name this search.matched_result_count
There was a problem hiding this comment.
I'd like to avoid mentioning "result" - maybe just "total_count"?
| public static final String QUERY_FIELD_HAS_AGGREGATIONS = ES_QUERY_FIELDS_PREFIX + "has_aggregations"; | ||
| public static final String QUERY_FIELD_IS_SYSTEM = ES_QUERY_FIELDS_PREFIX + "is_system"; | ||
| public static final String QUERY_FIELD_SEARCH_HITS = ES_QUERY_FIELDS_PREFIX + "search.total_hits"; | ||
| public static final String QUERY_FIELD_SEARCH_HITS_GTE = ES_QUERY_FIELDS_PREFIX + "search.total_hits_gte"; |
There was a problem hiding this comment.
It would be more useful to have the other boolean, i.e. search.total_result_count_exact (true if relation == eq)
The rationale is that
- if the count is exact (say 943), I prefer to see
count = 943andexact = trueinstead ofgte = false - if the count is not exact (say 10000), I prefer to see
count = 10000andexact = falseinstead ofgte = true
There was a problem hiding this comment.
The reason I didn't use "exact" is for most queries it's true, so if we use "gte" we could omit it for the default case. If we use "exact" we'd have to put the "true" flag here for pretty much 99% of queries, which looks a bit wasteful - usually it's better to mark the exceptions and otherwise assume the default, rather than mark the common case and assume the exception.
There was a problem hiding this comment.
Ok, I missed the fact that the field is only emitted when it's present. That's a valid point.
I would name it differently than gte though (which is again a technicality), something like partial, inexact or something to that extent would be better.
* Refactor how we report response size
hitsrenamed toresult_countsearch.total_countandsearch.total_count_partial