
EIS integration #111154

Merged
demjened merged 32 commits into elastic:main from timgrein:timgrein/inference-api-integrate-eis
Aug 9, 2024

Contributor

@timgrein timgrein commented Jul 22, 2024

This PR integrates EIS (Elastic Inference Service) with Elasticsearch behind a feature flag.

Useful ES commands:

  • Running Elasticsearch (inside of the ES root directory):
    • Run ES: ./gradlew run
    • Run ES in debug mode: ./gradlew run --debug-jvm (you can attach a debugger on the debug port, which will be logged)
    • Run ES and set EIS gateway URL via CLI: ./gradlew run -Dtests.es.xpack.inference.eis.gateway.url=http://localhost:8000
  • Running tests:
    • Run tests inside inference API plugin: ./gradlew ':x-pack:plugin:inference:test'
    • Run a specific test class inside inference API plugin: ./gradlew ':x-pack:plugin:inference:test' --tests "org.elasticsearch.xpack.inference.external.response.elastic.ElasticInferenceServiceSparseEmbeddingsResponseEntityTests" (specify --tests and package + class name)
    • Run a specific test method inside Inference API plugin: ./gradlew ':x-pack:plugin:inference:test' --tests "org.elasticsearch.xpack.inference.external.response.elastic.ElasticInferenceServiceSparseEmbeddingsResponseEntityTests.testSparseEmbeddingsResponse_SingleEmbeddingInData_NoMeta_NoTruncation" (specify --tests and package + class + method name)
  • Check style/formatting:
    • Check style in inference API plugin: ./gradlew ':x-pack:plugin:inference:checkstyleTest'
    • Apply spotless/formatting: ./gradlew spotlessApply
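
The test commands above share one shape: the Gradle task path of the inference plugin, followed by an optional --tests filter built from package + class (+ method) name. Below is a minimal Python sketch of composing that command line. The helper name gradle_test_command is hypothetical and not part of the Elasticsearch build; it only builds the argument list and does not run Gradle.

```python
from typing import List, Optional

# Gradle task path for the inference plugin's tests (taken from the commands above).
INFERENCE_TEST_TASK = ":x-pack:plugin:inference:test"


def gradle_test_command(
    test_class: Optional[str] = None,
    test_method: Optional[str] = None,
) -> List[str]:
    """Build the argv for running inference plugin tests.

    Hypothetical helper, for illustration only.
    test_class is the fully qualified class name (package + class).
    test_method is optional and only valid together with test_class.
    """
    if test_method and not test_class:
        raise ValueError("test_method requires test_class")

    command = ["./gradlew", INFERENCE_TEST_TASK]
    if test_class:
        # The --tests filter is "package.Class" or "package.Class.method".
        test_filter = f"{test_class}.{test_method}" if test_method else test_class
        command += ["--tests", test_filter]
    return command
```

The resulting list can be passed to subprocess.run(command, cwd=es_root, check=True), where es_root (an assumed variable) points at the ES root directory.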

Testing locally:

  • Start eis-model-server (or eis-gateway, if ELSERv2 endpoint is integrated) on port {PORT}
  • Start ES with a configured eis-gateway: ./gradlew run -Dtests.es.xpack.inference.eis.gateway.url=http://localhost:{PORT}
  • Create an EIS inference endpoint:
PUT {ES_HOST}/_inference/sparse_embedding/eis

{
    "service": "elastic",
    "service_settings": {
        "model_id": ".elser_model_2"
    }
}
  • Perform inference (single embedding in a list):
POST {ES_HOST}/_inference/sparse_embedding/eis

{
    "input": "A blue sky"
}
  • Perform inference (multiple embeddings):
POST {ES_HOST}/_inference/sparse_embedding/eis

{
    "input": [
        "Embed this text",
        "Embed this text, too",
        "This text should also be embedded"
    ]
}
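
For scripting the local test steps, the requests above could be built along these lines. This is a minimal sketch using only the Python standard library. The default host http://localhost:9200 and the function names are assumptions, not part of this PR; authentication is omitted, so add an Authorization header if your cluster requires one.

```python
import json
import urllib.request
from typing import List, Union

# Assumption: a local Elasticsearch started via ./gradlew run listens here.
ES_HOST = "http://localhost:9200"
# Path of the "eis" sparse embedding endpoint used in the steps above.
ENDPOINT_PATH = "/_inference/sparse_embedding/eis"


def build_request(method: str, body: dict, es_host: str = ES_HOST) -> urllib.request.Request:
    """Build (but do not send) a JSON request against the EIS inference endpoint."""
    return urllib.request.Request(
        url=es_host + ENDPOINT_PATH,
        data=json.dumps(body).encode("utf-8"),
        method=method,
        headers={"Content-Type": "application/json"},
    )


def create_endpoint_request(es_host: str = ES_HOST) -> urllib.request.Request:
    """PUT request that creates the inference endpoint for service "elastic"."""
    body = {"service": "elastic", "service_settings": {"model_id": ".elser_model_2"}}
    return build_request("PUT", body, es_host)


def inference_request(
    text: Union[str, List[str]], es_host: str = ES_HOST
) -> urllib.request.Request:
    """POST request that embeds a single string or a list of strings."""
    return build_request("POST", {"input": text}, es_host)


def send(request: urllib.request.Request) -> dict:
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

With ES and the EIS gateway running, send(create_endpoint_request()) followed by send(inference_request("A blue sky")) mirrors the first two requests above.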

Testing in serverless:

  • Enable the feature flag for your project or a whole environment by opening a PR in our corresponding gitops repo that sets jvmOptions: "-Des.eis_feature_flag_enabled=true"
  • Repeat the steps from "Testing locally"

TODOs:

  • Write tests for ElasticInferenceService
  • Write tests for ElasticInferenceServiceActionCreator
  • Write tests for ElasticInferenceServiceResponseHandler
  • Write tests for ElasticInferenceServiceSparseEmbeddingsRequest
  • Implement checkModelConfig in ElasticInferenceService
  • Implement doChunkedInfer in ElasticInferenceService
  • Implement truncation in ElasticInferenceServiceSparseEmbeddingsRequest
  • Add docs for ElasticInferenceServiceFeature
  • Handle error codes specified in inference Task API spec inside ElasticInferenceServiceResponseHandler
  • (There might be some additional smaller TODOs that I forgot to list here; I usually mark them with //TODO:)

When ready for review:

  • Mark ready for review
  • Add labels (this will ping the ML team for reviews):
    • >non-issue
    • :ml
    • Team:ML

Out of scope for this PR:

  • "Always-on" experience (performing inference without creating an endpoint for service elastic upfront)
  • Rate limiting
  • Auth/Secret Settings

@demjened demjened force-pushed the timgrein/inference-api-integrate-eis branch from 49a755c to 9eb637f August 6, 2024 19:26
@demjened demjened marked this pull request as ready for review August 6, 2024 21:10
@elasticsearchmachine elasticsearchmachine added the needs:triage label Aug 6, 2024
@demjened demjened added the :SearchOrg/Inference label Aug 6, 2024
@elasticsearchmachine elasticsearchmachine added the Team:SearchOrg and Team:Search - Inference labels and removed the needs:triage label Aug 6, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/search-inference-team (Team:Search - Inference)

@elasticsearchmachine
Collaborator

Pinging @elastic/ent-search-eng (Team:SearchOrg)

@demjened demjened requested a review from a team August 6, 2024 21:13
@demjened demjened changed the title from "[DRAFT] EIS integration" to "EIS integration" Aug 6, 2024
@elasticsearchmachine
Collaborator

Hi @timgrein, I've created a changelog YAML for you.

Member

@davidkyle davidkyle left a comment

LGTM

@demjened demjened force-pushed the timgrein/inference-api-integrate-eis branch from 74ddbcb to d3ef457 August 9, 2024 14:14
@demjened demjened merged commit 13cc380 into elastic:main Aug 9, 2024
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Aug 9, 2024
* upstream/main: (22 commits)
  Prune changelogs after 8.15.0 release
  Bump versions after 8.15.0 release
  EIS integration (elastic#111154)
  Skip LOOKUP/INLINESTATS cases unless on snapshot (elastic#111755)
  Always enforce strict role validation (elastic#111056)
  Mute org.elasticsearch.xpack.esql.analysis.VerifierTests testUnsupportedAndMultiTypedFields elastic#111753
  [ML] Force time shift integration test (elastic#111620)
  ESQL: Add tests for sort, where with unsupported type (elastic#111737)
  [ML] Force time shift documentation (elastic#111668)
  Fix remote cluster credential secure settings reload   (elastic#111535)
  ESQL: Fix for overzealous validation in case of invalid mapped fields (elastic#111475)
  Pass allow security manager flag in gradle test policy setup plugin (elastic#111725)
  Rename streamContent/Separator to bulkContent/Separator (elastic#111716)
  Mute org.elasticsearch.tdigest.ComparisonTests testSparseGaussianDistribution elastic#111721
  Remove 8.14 from branches.json
  Only emit product origin in deprecation log if present (elastic#111683)
  Forward port release notes for v8.15.0 (elastic#111714)
  [ES|QL] Combine Disjunctive CIDRMatch (elastic#111501)
  ESQL: Remove qualifier from attrs (elastic#110581)
  Force using the last centroid during merging (elastic#111644)
  ...

# Conflicts:
#	server/src/main/java/org/elasticsearch/TransportVersions.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceNamedWriteablesProvider.java
cbuescher pushed a commit to cbuescher/elasticsearch that referenced this pull request Sep 4, 2024
* WIP

* Add ElasticInferenceServiceTests TODOs

* Add ElasticInferenceServiceActionCreatorTests TODOs

* Add ElasticInferenceServiceResponseHandlerTests TODOs

* Add ElasticInferenceServiceSparseEmbeddingsRequestTests TODOs

* Add ElasticInferenceServiceSparseEmbeddingsModelTests TODOs

* spotless apply

* Fix conflicts

* Add EmptySecretSettingsTests

* Add named writeables to InferenceNamedWriteablesProvider

* Remove addressed todos

* Translate model to correct endpoint

* Remove addressed TODO

* Add docs to ElasticInferenceServiceFeature

* Implement and test truncation/request

* Add some EIS tests

* Support chunked inference

* Check model config

* Add more tests

* Add response handler

* Add more tests + HTTP 413 handling

* Fix some tests

* Spotless

* Fixes

* Switch back to original response structure

* Implement pass-through chunking

* Spotless

* Fix after rebase

* Spotless

* Log error upon failing to parse error response

* Remove TODOs

* Update docs/changelog/111154.yaml

---------

Co-authored-by: Adam Demjen <demjened@gmail.com>
davidkyle pushed a commit to davidkyle/elasticsearch that referenced this pull request Sep 5, 2024

Labels

>feature, :SearchOrg/Inference, Team:Search - Inference, Team:SearchOrg, v8.16.0

5 participants