Make DataType and DataFormat top-level enums #143312

Merged
Mikep86 merged 4 commits into elastic:main from Mikep86:multimodal_embedding-request-refactoring
Mar 2, 2026

@Mikep86 commented Feb 27, 2026

Refactors DataType and DataFormat to make them top-level enums. Also stores the set of supported formats in DataType, so that this logic is centralized in the enum.

These enums will be used in the embedding query vector builder I am currently working on and it is more appropriate to refer to them as top-level enums.
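
As a rough illustration of the shape described above, here is a minimal, self-contained sketch. The constant declarations mirror the excerpt under review; everything else is an assumption made for illustration, not the PR's actual code: the reading of the first constructor argument as a default format, the field and accessor names, and the `supports` helper.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

// Top-level enum of wire formats (the two constants that appear in the PR excerpt).
enum DataFormat {
    TEXT,
    BASE64
}

// Top-level enum of data types. Each type carries its own format information,
// so callers do not need a separate lookup table.
enum DataType {
    TEXT(DataFormat.TEXT, EnumSet.of(DataFormat.TEXT)),
    IMAGE(DataFormat.BASE64, EnumSet.of(DataFormat.BASE64));

    // Assumption: the first argument is the format used when none is specified.
    private final DataFormat defaultFormat;
    private final Set<DataFormat> supportedFormats;

    DataType(DataFormat defaultFormat, EnumSet<DataFormat> supportedFormats) {
        this.defaultFormat = defaultFormat;
        // Wrap the set so callers cannot modify the shared instance.
        this.supportedFormats = Collections.unmodifiableSet(supportedFormats);
    }

    // Hypothetical accessor names.
    DataFormat getDefaultFormat() {
        return defaultFormat;
    }

    Set<DataFormat> getSupportedFormats() {
        return supportedFormats;
    }

    // Hypothetical helper: true if this data type accepts the given format.
    boolean supports(DataFormat format) {
        return supportedFormats.contains(format);
    }
}
```

With this layout, a caller such as a query vector builder can ask `DataType.IMAGE.supports(DataFormat.BASE64)` directly instead of going through an enclosing request class.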

@Mikep86 added the >refactoring, :SearchOrg/Inference, and :Search Relevance/Search labels Feb 27, 2026
@elasticsearchmachine added the v9.4.0, Team:Search - Inference, and Team:Search Relevance labels Feb 27, 2026
@elasticsearchmachine
Collaborator

Pinging @elastic/search-inference-team (Team:Search - Inference)

@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

 */
public enum DataType {
    TEXT(DataFormat.TEXT, EnumSet.of(DataFormat.TEXT)),
    IMAGE(DataFormat.BASE64, EnumSet.of(DataFormat.BASE64));
Contributor

The jina-clip-v2 API on Jina's side also allows sending a link to an image, so the input doesn't need to be the image encoded as base64. It does in fact work when the format is specified as base64, as I don't think we do any validation or enforcement?

Should we add a new format, url, or add text as a valid format for the image data type? cc @DonalEvans
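
To make the question concrete, here is a hedged sketch of what enforcing the supported-format set might look like if a URL format were added. Nothing here is from the PR: the `URL` constant, the `resolveFormat` method, and the field names are all hypothetical, and the sketch does not describe current behavior.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

// URL is a hypothetical addition; the PR excerpt only shows TEXT and BASE64.
enum DataFormat {
    TEXT,
    BASE64,
    URL
}

enum DataType {
    TEXT(DataFormat.TEXT, EnumSet.of(DataFormat.TEXT)),
    // Hypothetical: images may be sent inline as base64 or referenced by link.
    IMAGE(DataFormat.BASE64, EnumSet.of(DataFormat.BASE64, DataFormat.URL));

    private final DataFormat defaultFormat;
    private final Set<DataFormat> supportedFormats;

    DataType(DataFormat defaultFormat, EnumSet<DataFormat> supportedFormats) {
        this.defaultFormat = defaultFormat;
        this.supportedFormats = Collections.unmodifiableSet(supportedFormats);
    }

    // Hypothetical validation step: falls back to the default when no format
    // is given, and rejects formats that this data type does not list.
    DataFormat resolveFormat(DataFormat requested) {
        if (requested == null) {
            return defaultFormat;
        }
        if (!supportedFormats.contains(requested)) {
            throw new IllegalArgumentException(
                "Format [" + requested + "] is not supported for data type [" + this + "]"
            );
        }
        return requested;
    }
}
```

Because the supported set lives in the enum, allowing links for images under this sketch would be a one-line change to the `IMAGE` constant, and the check would pick it up without further edits.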

Contributor Author

Let's save that work for a follow-up PR. Adding a new format goes beyond the scope of refactoring.

Contributor

@timgrein timgrein left a comment

Adding the url data format is probably out of scope for this PR; I just wanted to get the discussion started :)

@Mikep86 Mikep86 merged commit 8a30b1a into elastic:main Mar 2, 2026
35 checks passed
szybia added a commit to szybia/elasticsearch that referenced this pull request Mar 2, 2026
…cations

* upstream/main: (60 commits)
  Use batches for other bulk vector benchmarks (elastic#143167)
  Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {csv-spec:lookup-join.MvJoinKeyOnTheLookupIndexAfterStats} elastic#143388
  Mute org.elasticsearch.snapshots.ConcurrentSnapshotsIT testBackToBackQueuedDeletes elastic#143387
  [Inference API] Parse endpoint metadata from persisted endpoints (elastic#143081)
  Add cluster formation doc to DistributedArchitectureGuide (elastic#143318)
  Fix flattened root block loader null expectation (elastic#143238)
  Unmute ValueSourceReaderTypeConversionTests testLoadAll (elastic#143189)
  ESQL: Add split coalescing for many small files (elastic#143335)
  Unmute mixed-cluster spatial parse warning test (elastic#143186)
  Fix zero-size estimate in BytesRefBlock null test (elastic#143258)
  Make DataType and DataFormat top-level enums (elastic#143312)
  Add support for steps to change the target index name for later steps (elastic#142955)
  Set mayContainDuplicates flag to test deduplication (elastic#143375)
  ESQL: Fix Driver search load millis as nanos bug (elastic#143267)
  Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {csv-spec:lookup-join.LookupJoinWithMixPushableAndUnpushableFilters} elastic#143378
  ESQL: Forbid MV_EXPAND before full text functions (elastic#143249)
  ESQL: Fix unresolved name pattern (elastic#143210)
  Implement boxplot queryDSL aggregation for exponential_histograms (elastic#143026)
  Add prefetching to x64 bulk vector implementations (elastic#142387)
  Make large segment vector tests resilient to memory constraints (elastic#143366)
  ...
tballison pushed a commit to tballison/elasticsearch that referenced this pull request Mar 3, 2026
Labels

>refactoring :Search Relevance/Search Catch all for Search Relevance :SearchOrg/Inference Label for the Search Inference team Team:Search - Inference Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch v9.4.0

4 participants