Include Pinned Retriever in 9.1 Documentation by mridula-s109 · Pull Request #129216 · elastic/elasticsearch

mridula-s109 · 2025-06-10T17:43:52Z

Summary of changes

Added details about the Pinned Retriever to retrievers.md under the 9.1 section.
Ensured that only the intended documentation file was modified, with no unrelated changes.

…129054) The test fails because operations contained the `_seq_no` field with different doc value types (with and without skippers). This is not allowed, since field types must be consistent in a Lucene index. The initial operations were generated without knowing that the index mode was set to logsdb or time_series, so they had no doc value skippers. However, when the operations were replayed via the following engine, they did have doc value skippers. The fix is to set `index.seq_no.index_options` to `points_and_doc_values`, so that the initial operations are indexed without doc value skippers. This test gains nothing from storing seqno with doc value skippers, so no test coverage is lost. Closes elastic#128541

This ensures we package an aggregation zip with all the artifacts we want to publish to Maven Central as part of a release. Running `zipAggregation` produces a zip file in the build/nmcp/zip folder. The content of this zip is meant to match the Maven artifacts we currently declare as DRA Maven artifacts.

Runs a sanity check after loading a block of values. Previously we did a quick check only if assertions were enabled. Now we do two quick checks all the time. Better still, we attach information about how a block was loaded when there's a problem. Relates to elastic#128959

This adds some testing tools for verifying vector recall and latency directly, without having to spin up an entire ES node and run a rally track. It's pretty barebones and takes inspiration from lucene-util, but I wanted access to our own formats and tooling to make our lives easier.

Here is an example config file. This will build the initial index, run queries at `num_candidates: 50`, then again at `num_candidates: 100` (without reindexing, and re-using the cached nearest neighbors).

```
[
  {
    "doc_vectors" : "path",
    "query_vectors" : "path",
    "num_docs" : 10000,
    "num_queries" : 10,
    "index_type" : "hnsw",
    "num_candidates" : 50,
    "k" : 10,
    "hnsw_m" : 16,
    "hnsw_ef_construction" : 200,
    "index_threads" : 4,
    "reindex" : true,
    "force_merge" : false,
    "vector_space" : "maximum_inner_product",
    "dimensions" : 768
  },
  {
    "doc_vectors" : "path",
    "query_vectors" : "path",
    "num_docs" : 10000,
    "num_queries" : 10,
    "index_type" : "hnsw",
    "num_candidates" : 100,
    "k" : 10,
    "hnsw_m" : 16,
    "hnsw_ef_construction" : 200,
    "vector_space" : "maximum_inner_product",
    "dimensions" : 768
  }
]
```

To execute:

```
./gradlew :qa:vector:checkVec --args="/Path/to/knn_tester_config.json"
```

Calling `./gradlew :qa:vector:checkVecHelp` gives some guidance on how to use it, and additionally provides a way to run it via java directly (useful to bypass gradlew guff).
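For readers unfamiliar with the metric, the following is a minimal Python sketch of how recall at `k` is commonly computed when comparing approximate results against exact nearest neighbors. It is not the code of the tool described above; the function name `recall_at_k` and its input layout (one list of document IDs per query) are assumptions made for illustration.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Average fraction of the exact top-k neighbors that the approximate search returned.

    approx_ids and exact_ids are hypothetical inputs: one list of doc IDs per query,
    ordered best-first. This mirrors the usual definition of recall@k, not the tool's code.
    """
    if not exact_ids or len(approx_ids) != len(exact_ids):
        raise ValueError("need the same, non-zero number of queries in both inputs")
    hits = 0
    for approx, exact in zip(approx_ids, exact_ids):
        # Count how many of the true top-k neighbors appear in the approximate top-k.
        hits += len(set(approx[:k]) & set(exact[:k]))
    return hits / (k * len(exact_ids))


if __name__ == "__main__":
    approx = [[1, 2, 3], [4, 5, 6]]
    exact = [[1, 2, 9], [4, 5, 6]]
    print(recall_at_k(approx, exact, k=3))
```

Under this definition, raising `num_candidates` would typically be expected to raise recall at the cost of latency, which is the trade-off the two config entries above are set up to compare.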

…27613) This PR introduces three new settings, `indices.merge.disk.check_interval`, `indices.merge.disk.watermark.high`, and `indices.merge.disk.watermark.high.max_headroom`, that control whether the thread pool merge executor starts executing new merges when disk space is getting low. The intent of this change is to avoid the situation where in-progress merges exhaust the available disk space on the node's local filesystem. To this end, the thread pool merge executor periodically monitors the available disk space, as well as the current disk space estimates required by all in-progress (currently running) merges on the node, and will NOT schedule any new merges if disk space is getting low. By default, the limit is 5% of the total disk space or 100 GB, whichever is smaller, which is the same as the disk allocation flood stage level.
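The default limit described above can be illustrated with a short Python sketch. This is only an arithmetic illustration under stated assumptions, not the Elasticsearch implementation: the function names are hypothetical, "100 GB" is assumed to mean 100 × 1024³ bytes, and subtracting the in-progress merge estimates from the available space is an assumption about how the two monitored quantities combine.

```python
GIB = 1024 ** 3  # assumption: "GB" is interpreted as 1024^3 bytes


def low_disk_headroom(total_bytes, high_percent=5, max_headroom_bytes=100 * GIB):
    """Space to keep free: 5% of the disk or 100 GB, whichever is smaller (the defaults in the text)."""
    return min(total_bytes * high_percent // 100, max_headroom_bytes)


def may_start_new_merge(total_bytes, available_bytes, in_progress_estimate_bytes):
    """Hypothetical check: new merges start only if space remains after reserving the
    estimates of running merges and the low-disk headroom."""
    budget = available_bytes - in_progress_estimate_bytes - low_disk_headroom(total_bytes)
    return budget > 0


if __name__ == "__main__":
    one_tib = 1024 ** 4
    # 5% of 1 TiB is about 51.2 GiB, which is below the 100 GiB cap.
    print(low_disk_headroom(one_tib))
    # On a 10 TiB disk, 5% would be 512 GiB, so the 100 GiB cap applies instead.
    print(low_disk_headroom(10 * one_tib) == 100 * GIB)
    print(may_start_new_merge(one_tib, 60 * GIB, 5 * GIB))
```

The "whichever is smaller" rule means that on large disks the absolute headroom, rather than the percentage, determines when new merges stop being scheduled.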

…tic#128735) This PR introduces a new `include_vectors` option to the `_source` retrieval context. When set to false, vectors are excluded from the returned `_source`. This is especially efficient when used with synthetic source, as it avoids loading vector fields entirely. By default, vectors remain included unless explicitly excluded.

…ic#129033)

* Add transport version for ML inference Mistral chat completion
* Add changelog for Mistral Chat Completion version fix
* Revert "Add changelog for Mistral Chat Completion version fix"

This reverts commit 7a57416.

* Google Vertex AI completion model, response entity and tests
* Fixed GoogleVertexAiServiceTest for Service configuration
* Changelog
* Removed downcasting and using `moveToFirstToken`
* Create GoogleVertexAiChatCompletionResponseHandler for streaming and non streaming responses
* Added unit tests
* PR feedback
* Removed googlevertexaicompletion model. Using just GoogleVertexAiChatCompletionModel for completion and chat completion
* Renamed uri -> nonStreamingUri. Added streamingUri and getters in GoogleVertexAiChatCompletionModel
* Moved rateLimitGroupHashing to subclasses of GoogleVertexAiModel
* Fixed rate limit hash of GoogleVertexAiRerankModel and refactored uri for GoogleVertexAiUnifiedChatCompletionRequest

---------

Co-authored-by: lhoet-google <lhoet@google.com>
Co-authored-by: Jonathan Buttner <56361221+jonathan-buttner@users.noreply.github.com>

elasticsearchmachine · 2025-06-10T17:47:33Z

Pinging @elastic/search-eng (Team:SearchOrg)

elasticsearchmachine · 2025-06-10T17:47:33Z

Pinging @elastic/search-relevance (Team:Search - Relevance)

docs/reference/elasticsearch/rest-apis/retrievers.md

leemthompo

Just a couple of metadata additions to clarify version availability :)

github-actions · 2025-06-17T09:35:39Z

🔍 Preview links for changed docs:

docs/reference/elasticsearch/rest-apis/retrievers.md

🔔 The preview site may take up to 3 minutes to finish building. These links will become live once it completes.

mridula-s109 · 2025-06-17T09:41:48Z

> Just a couple of metadata additions to clarify version availability :)

Thanks for your comments, leemthompo. I have addressed them.

kderusso

Please make one suggested edit then LGTM

docs/reference/elasticsearch/rest-apis/retrievers.md

mridula-s109 · 2025-06-17T12:42:02Z

> Please make one suggested edit then LGTM

Thanks for approving. I have made the suggested edit.

mridula-s109 and others added 30 commits June 6, 2025 11:37

propgating retrievers to inner retrievers

12fb2fa

test feature taken care of

81e99b6

Merge branch 'elastic:main' into main

05fb0ab

Small changes in concurrent multipart upload interfaces (elastic#128977)

605c035

Small changes in BlobContainer interface and wrapper. Relates ES-11815

Make PhaseCacheManagementTests project-aware (elastic#129047)

aec1688

The functionality in `PhaseCacheManagement` was already project-aware, but these tests were still using deprecated methods.

ES|QL: refactor generative tests (elastic#129028)

df3ef0d

Add a test of LOOKUP JOIN against a time series index (elastic#129007)

0eebc8c

Add a spec test of `LOOKUP JOIN` against a time series index.

Make ILM ClusterStateWaitStep project-aware (elastic#129042)

b1e15f0

This is part of an iterative process to make ILM project-aware.

Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT tes…

846b09a

…t {lookup-join.LookupJoinOnTimeSeriesIndex ASYNC} elastic#129078

Remove ClusterState param from ILM AsyncBranchingStep (elastic#12…

a97d582

…9076) The `ClusterState` parameter of the `asyncPredicate` is not used anywhere.

Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT tes…

763b502

…t {lookup-join.LookupJoinOnTimeSeriesIndex SYNC} elastic#129082

Mute org.elasticsearch.upgrades.UpgradeClusterClientYamlTestSuiteIT t…

8a660c8

…est {p0=upgraded_cluster/70_ilm/Test Lifecycle Still There And Indices Are Still Managed} elastic#129097

Mute org.elasticsearch.upgrades.UpgradeClusterClientYamlTestSuiteIT t…

aa16175

…est {p0=upgraded_cluster/90_ml_data_frame_analytics_crud/Get mixed cluster outlier_detection job} elastic#129098

Mute org.elasticsearch.packaging.test.DockerTests test081SymlinksAreF…

6e58b1e

…ollowedWithEnvironmentVariableFiles elastic#128867

Remove direct minScore propagation to inner retrievers

0776562

cleaned up skip

f145d26

Mute org.elasticsearch.index.engine.ThreadPoolMergeExecutorServiceDis…

d8b6897

…kSpaceTests testAvailableDiskSpaceMonitorWhenFileSystemStatErrors elastic#129149

Correct index path validation (elastic#129144)

eca383d

All we care about is if reindex is true or false. We shouldn't worry about force merge. Because if reindex is true, we will create the directory, if its false, we won't.

Mute org.elasticsearch.index.engine.ThreadPoolMergeExecutorServiceDis…

fb6ec9a

…kSpaceTests testUnavailableBudgetBlocksNewMergeTasksFromStartingExecution elastic#129148

Merge remote-tracking branch 'upstream/main'

0ef36a1

Merge remote-tracking branch 'upstream/main'

ece13d9

Merge remote-tracking branch 'upstream/main'

36cd91e

elasticsearchmachine added the Team:SearchOrg and Team:Search - Relevance labels and removed the Team:Search Relevance label Jun 10, 2025

mridula-s109 added the >non-issue label Jun 10, 2025

kderusso reviewed Jun 10, 2025

mridula-s109 added 3 commits June 11, 2025 15:06

Merge remote-tracking branch 'upstream' into update-pinned-docs

a536750

made the suggested changes

a93c6bf

Merge branch 'main' into update-pinned-docs

0461cc5

mridula-s109 requested a review from kderusso June 11, 2025 14:45

Merge branch 'main' into update-pinned-docs

ee7a777

kderusso reviewed Jun 11, 2025

Update retrievers.md

4de5087

mridula-s109 requested a review from leemthompo June 17, 2025 08:50

leemthompo reviewed Jun 17, 2025

Update docs/reference/elasticsearch/rest-apis/retrievers.md

de3f154

Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>

mridula-s109 and others added 2 commits June 17, 2025 10:35

Update docs/reference/elasticsearch/rest-apis/retrievers.md

3ee741c

Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>

Merge branch 'main' into update-pinned-docs

20a033c

mridula-s109 requested a review from kderusso June 17, 2025 09:44

kderusso approved these changes Jun 17, 2025

Update retrievers.md

4e2e4f2

mridula-s109 enabled auto-merge (squash) June 17, 2025 12:41

Merge branch 'main' into update-pinned-docs

4c6964b

mridula-s109 merged commit 12521fb into elastic:main Jun 17, 2025
8 of 9 checks passed

mridula-s109 deleted the update-pinned-docs branch June 17, 2025 12:55

shainaraskas mentioned this pull request Aug 11, 2025

9.1 docs backports for 8.19 features #132605

Merged
