[phrase match] Introduce ordered document #6486

Merged
coszio merged 5 commits into dev from introduce-ordered-document
Jun 2, 2025

coszio (Contributor) commented May 5, 2025

Builds on top of #6597

Introduces a new optional field in full-text inverted indices that stores the entire document as a sequence of token ids.

Until now, a Document meant the set of unique tokens that exist in a document. Storing the full token sequence means we need to rethink that concept.

In this PR:

  • Document is the list of tokens, in the order they appear in the text.
  • TokenSet is the set of unique tokens in the text.
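
To make the distinction concrete, here is a minimal sketch of the two concepts. It is not the code from this PR: the `u32` token id type, the `BTreeSet` backing, and the helper methods `to_token_set`, `contains_phrase`, and `contains_all` are all assumptions made for illustration.

```rust
use std::collections::BTreeSet;

/// Assumption: token ids are plain `u32` values.
type TokenId = u32;

/// Hypothetical ordered document: every token of the text,
/// in order of appearance, duplicates included.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Document(Vec<TokenId>);

/// Hypothetical token set: the unique tokens of the text,
/// with order and repetition discarded.
#[derive(Debug, Clone, PartialEq, Eq)]
struct TokenSet(BTreeSet<TokenId>);

impl Document {
    /// Collapse the ordered document into its unique tokens.
    fn to_token_set(&self) -> TokenSet {
        TokenSet(self.0.iter().copied().collect())
    }

    /// True if `phrase` occurs as a contiguous run of tokens.
    /// This check needs the order, so it cannot be answered by a TokenSet.
    fn contains_phrase(&self, phrase: &[TokenId]) -> bool {
        if phrase.is_empty() {
            // `windows(0)` would panic; an empty phrase matches trivially.
            return true;
        }
        self.0.windows(phrase.len()).any(|window| window == phrase)
    }
}

impl TokenSet {
    /// True if every given token is present, regardless of position.
    fn contains_all(&self, tokens: &[TokenId]) -> bool {
        tokens.iter().all(|token| self.0.contains(token))
    }
}
```

With the token ids `[7, 3, 7, 9]`, the token set holds three entries. A query for `[9, 7]` passes the set check, since both tokens are present, but fails the phrase check, since they never appear in that order. That gap is what the ordered representation is meant to close.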

The new field is not yet populated or used. That will be done in a later PR, along with handling persistence.
