[EIS] Implement chunked & batched inference for sparse text embeddings #129922

Merged: timgrein merged 2 commits into elastic:main from timgrein:eis-sparse-text-embeddings-chunking on Jun 24, 2025

Conversation

@timgrein (Contributor) commented Jun 24, 2025

The previous implementation of `doChunkedInfer` simply passed the request through to `doInfer`. This PR adds actual chunking and request batching support.
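As a rough illustration of what "chunked and batched" inference involves, here is a minimal sketch, not the code from this PR. The class `ChunkBatchingSketch`, the record `ChunkRef`, and the `infer` callback are all hypothetical names, and the batch size is an arbitrary parameter. The idea shown: flatten the chunks of every input into one list, send them upstream in fixed-size batches, and route each result back to the input it came from.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical illustration only; these names do not exist in the inference plugin. */
final class ChunkBatchingSketch {

    /** A chunk of text together with the index of the input it was cut from. */
    record ChunkRef(int inputIndex, String text) {}

    /** Flattens the per-input chunks and groups them into batches of at most maxBatchSize. */
    static List<List<ChunkRef>> toBatches(List<List<String>> chunksPerInput, int maxBatchSize) {
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException("maxBatchSize must be >= 1");
        }
        List<ChunkRef> flat = new ArrayList<>();
        for (int i = 0; i < chunksPerInput.size(); i++) {
            for (String chunk : chunksPerInput.get(i)) {
                flat.add(new ChunkRef(i, chunk));
            }
        }
        List<List<ChunkRef>> batches = new ArrayList<>();
        for (int start = 0; start < flat.size(); start += maxBatchSize) {
            int end = Math.min(flat.size(), start + maxBatchSize);
            batches.add(new ArrayList<>(flat.subList(start, end)));
        }
        return batches;
    }

    /**
     * Calls infer once per batch and regroups the results by original input.
     * Assumes infer returns exactly one result per text, in the same order.
     */
    static <R> List<List<R>> inferChunked(
        List<List<String>> chunksPerInput,
        int maxBatchSize,
        Function<List<String>, List<R>> infer
    ) {
        List<List<R>> resultsPerInput = new ArrayList<>();
        for (int i = 0; i < chunksPerInput.size(); i++) {
            resultsPerInput.add(new ArrayList<>());
        }
        for (List<ChunkRef> batch : toBatches(chunksPerInput, maxBatchSize)) {
            List<String> texts = batch.stream().map(ChunkRef::text).toList();
            List<R> results = infer.apply(texts); // one upstream request per batch
            for (int j = 0; j < batch.size(); j++) {
                resultsPerInput.get(batch.get(j).inputIndex()).add(results.get(j));
            }
        }
        return resultsPerInput;
    }
}
```

With two inputs split into three chunks and one chunk respectively and a batch size of 2, this sketch issues two requests instead of four, and a batch may mix chunks from different inputs. Requires Java 17 or later for records and `Stream.toList()`.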

We're using the same chunking settings as `.elser-2-elasticsearch` on the ML node:

{
    "inference_id": ".elser-v2-elastic",
    "task_type": "sparse_embedding",
    "service": "elastic",
    "service_settings": {
        "model_id": "elser-v2",
        "rate_limit": {
            "requests_per_minute": 1000
        }
    },
    "chunking_settings": {
        "strategy": "sentence",
        "max_chunk_size": 250,
        "sentence_overlap": 1
    }
},
{
    "inference_id": ".elser-2-elasticsearch",
    "task_type": "sparse_embedding",
    "service": "elasticsearch",
    "service_settings": {
        "num_threads": 1,
        "model_id": ".elser_model_2",
        "adaptive_allocations": {
            "enabled": true,
            "min_number_of_allocations": 0,
            "max_number_of_allocations": 32
        }
    },
    "chunking_settings": {
        "strategy": "sentence",
        "max_chunk_size": 250,
        "sentence_overlap": 1
    }
},
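To make the `chunking_settings` above more concrete, here is a minimal sketch of sentence-based chunking with overlap. It is not the chunker used by the inference plugin: `SentenceChunkingSketch` is a hypothetical class, the sentence splitter is a naive regex, and it assumes `max_chunk_size` is counted in whitespace-separated words and `sentence_overlap` in whole sentences.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the chunker used by the inference plugin. */
final class SentenceChunkingSketch {

    /** Naive sentence splitter: breaks after '.', '!' or '?' followed by whitespace. */
    static List<String> splitSentences(String text) {
        List<String> sentences = new ArrayList<>();
        for (String s : text.trim().split("(?<=[.!?])\\s+")) {
            if (!s.isEmpty()) {
                sentences.add(s);
            }
        }
        return sentences;
    }

    /** Counts whitespace-separated words (an assumed unit for max_chunk_size). */
    static int wordCount(String s) {
        String t = s.trim();
        return t.isEmpty() ? 0 : t.split("\\s+").length;
    }

    /**
     * Greedily packs sentences into chunks of at most maxChunkSize words.
     * Each new chunk starts with the last sentenceOverlap sentences of the previous one.
     * Simplifications: a single sentence longer than maxChunkSize is kept whole,
     * and carried-over sentences may push a chunk slightly over the limit.
     */
    static List<String> chunk(String text, int maxChunkSize, int sentenceOverlap) {
        if (maxChunkSize < 1 || sentenceOverlap < 0) {
            throw new IllegalArgumentException("maxChunkSize must be >= 1 and sentenceOverlap >= 0");
        }
        List<String> chunks = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int words = 0;
        for (String sentence : splitSentences(text)) {
            int n = wordCount(sentence);
            if (!current.isEmpty() && words + n > maxChunkSize) {
                chunks.add(String.join(" ", current));
                // Carry the trailing sentences over into the next chunk.
                int from = Math.max(0, current.size() - sentenceOverlap);
                current = new ArrayList<>(current.subList(from, current.size()));
                words = 0;
                for (String carried : current) {
                    words += wordCount(carried);
                }
            }
            current.add(sentence);
            words += n;
        }
        if (!current.isEmpty()) {
            chunks.add(String.join(" ", current));
        }
        return chunks;
    }
}
```

Under these assumptions, calling `chunk(text, 250, 1)` mirrors the settings shown above: chunks of up to roughly 250 words, with each chunk repeating the final sentence of the one before it so that context is not lost at chunk boundaries.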

@timgrein added the `>non-issue`, `:ml` (Machine learning), `Team:ML` (Meta label for the ML team), `auto-backport` (Automatically create backport pull requests when merged), and `v8.19.0` labels on Jun 24, 2025
@elasticsearchmachine (Collaborator) commented

Pinging @elastic/ml-core (Team:ML)

@timgrein merged commit 870d581 into elastic:main on Jun 24, 2025
33 checks passed
@elasticsearchmachine (Collaborator) commented

💔 Backport failed

| Status | Branch | Result |
|--------|--------|--------|
|        | 8.19   | Commit could not be cherry-picked due to conflicts |

You can use sqren/backport to backport manually by running `backport --upstream elastic/elasticsearch --pr 129922`

@timgrein (Contributor, Author) commented

💚 All backports created successfully

Status Branch Result
8.19

Questions?

Please refer to the Backport tool documentation

timgrein added a commit to timgrein/elasticsearch that referenced this pull request Jun 24, 2025
elastic#129922)

(cherry picked from commit 870d581)

# Conflicts:
#	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/elastic/ElasticInferenceServiceSparseEmbeddingsModelTests.java
mridula-s109 pushed a commit to mridula-s109/elasticsearch that referenced this pull request Jun 25, 2025
dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Jan 9, 2026
Removes unused method `translateToChunkedResults` from `ElasticInferenceService`.
The method stopped being used by elastic#129922.

Also removes an unused logger in `InferencePlugin`.
dimitris-athanasiou added a commit that referenced this pull request Jan 9, 2026
#140442)

Removes unused method `translateToChunkedResults` from `ElasticInferenceService`.
The method stopped being used by #129922.

Also removes an unused logger in `InferencePlugin`.
dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Jan 9, 2026
elastic#140442)

Removes unused method `translateToChunkedResults` from `ElasticInferenceService`.
The method stopped being used by elastic#129922.

Also removes an unused logger in `InferencePlugin`.

(cherry picked from commit f0d3d9a)
dimitris-athanasiou added a commit that referenced this pull request Jan 9, 2026
#140442) (#140455)

Removes unused method `translateToChunkedResults` from `ElasticInferenceService`.
The method stopped being used by #129922.

Also removes an unused logger in `InferencePlugin`.

(cherry picked from commit f0d3d9a)
jimczi pushed a commit to jimczi/elasticsearch that referenced this pull request Jan 12, 2026
elastic#140442)

Removes unused method `translateToChunkedResults` from `ElasticInferenceService`.
The method stopped being used by elastic#129922.

Also removes an unused logger in `InferencePlugin`.

Labels

`auto-backport` (Automatically create backport pull requests when merged), `backport pending`, `:ml` (Machine learning), `>non-issue`, `Team:ML` (Meta label for the ML team), `v8.19.0`, `v9.1.0`


3 participants