Update text_similarity_rank_retriever to support inference ID as an argument when reranking on chunks#137397
Conversation
|
@kderusso, this is a WIP, would love to hear your thoughts on the POC when you have a moment. |
kderusso
left a comment
There was a problem hiding this comment.
I performed a very high level review of the approach. Please update this PR with the suggested changes.
...e/src/main/java/org/elasticsearch/xpack/inference/rank/textsimilarity/ChunkScorerConfig.java
Outdated
Show resolved
Hide resolved
...rg/elasticsearch/xpack/inference/rank/textsimilarity/TextSimilarityRankRetrieverBuilder.java
Outdated
Show resolved
Hide resolved
...pack/inference/rank/textsimilarity/TextSimilarityRankFeaturePhaseRankCoordinatorContext.java
Outdated
Show resolved
Hide resolved
...pack/inference/rank/textsimilarity/TextSimilarityRankFeaturePhaseRankCoordinatorContext.java
Outdated
Show resolved
Hide resolved
...pack/inference/rank/textsimilarity/TextSimilarityRankFeaturePhaseRankCoordinatorContext.java
Outdated
Show resolved
Hide resolved
|
I am still unable to run the yaml tests locally, but opening up for early feedback! Work in progress. |
kderusso
left a comment
There was a problem hiding this comment.
High level functional review, still needs to be addressed.
...e/src/main/java/org/elasticsearch/xpack/inference/rank/textsimilarity/ChunkScorerConfig.java
Outdated
Show resolved
Hide resolved
...k/inference/rank/textsimilarity/TextSimilarityRerankingRankFeaturePhaseRankShardContext.java
Outdated
Show resolved
Hide resolved
.../test/java/org/elasticsearch/xpack/inference/rank/textsimilarity/ChunkScorerConfigTests.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Pull request overview
This PR enhances the text_similarity_rank_retriever to automatically determine optimal chunking settings based on the inference endpoint's window size when chunking settings are not explicitly provided. The retriever now queries the inference endpoint for its window size and uses it to configure chunking, while still allowing users to override these defaults with explicit settings.
Key changes:
- Automatic resolution of chunking settings from inference endpoint window size when not explicitly provided
- Support for partial chunking configuration where only
max_chunk_sizeis specified - Introduction of async query rewriting to fetch window size from inference endpoints
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| 70_text_similarity_rank_retriever.yml | Added integration tests covering auto-resolution, partial configuration, and explicit override scenarios |
| TextSimilarityRankFeaturePhaseRankCoordinatorContextTests.java | Added unit tests for chunking settings resolution logic and removed obsolete tests |
| ChunkScorerConfigTests.java | New test file covering serialization, deserialization, and chunking settings creation |
| TextSimilarityRerankingRankFeaturePhaseRankShardContext.java | Added validation to ensure chunking settings are resolved before shard execution |
| TextSimilarityRankRetrieverBuilder.java | Implemented async query rewriting to fetch window size and resolve chunking settings |
| TextSimilarityRankFeaturePhaseRankCoordinatorContext.java | Added chunking settings resolution logic and integrated window size fetching |
| ChunkScorerConfig.java | Modified to support null chunking settings and improved settings creation methods |
| InferenceFeatures.java | Added feature flag for the new chunking behavior |
| 137397.yaml | Added changelog entry for the enhancement |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
...e/src/main/java/org/elasticsearch/xpack/inference/rank/textsimilarity/ChunkScorerConfig.java
Show resolved
Hide resolved
...rg/elasticsearch/xpack/inference/rank/textsimilarity/TextSimilarityRankRetrieverBuilder.java
Outdated
Show resolved
Hide resolved
|
@kderusso Thanks for your patience! I have addressed the comments and also made sure the functionality is working as intended for different reranking models. Please let me know if there are any concerns or optimisation needed. |
kderusso
left a comment
There was a problem hiding this comment.
Looks good, thanks for all your hard work iterating!
Can you please also update https://github.com/elastic/elasticsearch/blob/main/docs/reference/elasticsearch/rest-apis/retrievers/text-similarity-reranker-retriever.md to say that we default to chunking settings that will fit into the model associated with inference_id's token window? (This can be done as a followup if you want).
...rc/yamlRestTest/resources/rest-api-spec/test/inference/70_text_similarity_rank_retriever.yml
Show resolved
Hide resolved
🔍 Preview links for changed docs |
ℹ️ Important: Docs version tagging👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version. We use applies_to tags to mark version-specific features and changes. Expand for a quick overviewWhen to use applies_to tags:✅ At the page level to indicate which products/deployments the content applies to (mandatory) What NOT to do:❌ Don't remove or replace information that applies to an older version 🤔 Need help?
|
ef75bea to
e6f41d3
Compare
Thanks @kderusso for approving. I have addressed the doc update. Please do have a look and let me know if there are suggestions or happy to go ahead with the merge. |
docs/reference/elasticsearch/rest-apis/retrievers/text-similarity-reranker-retriever.md
Outdated
Show resolved
Hide resolved
* upstream/main: Update text_similarity_rank_retriever to support inference ID as an argument when reranking on chunks (elastic#137397) Mute org.elasticsearch.test.rest.yaml.CssSearchYamlTestSuiteIT test {p0=search.retrievers/result-diversification/10_mmr_result_diversification_retriever/Test MMR result diversification multiple indexes} elastic#139826 Mute org.elasticsearch.index.mapper.SkipperSettingsTests testTSDBSkipperSettingDefaults elastic#139824 Unmute test fix elastic#129517 (elastic#139782) Add back support for deserializing old refresh token in test (elastic#139811) Add documentation for exponential_histogram field type (elastic#139684) make ES|QL sample CSV test looser (elastic#139814) Add `frozen_after` field to data stream lifecycle (elastic#139042) Quieten many `ERROR` logs to `WARN` (elastic#139799)
Adding support to automatically use the best chunking size based on the input 'inference_id' provided.
Tested these below scenarios