[ML] Fix off-by-one error in ml_classic tokenizer end offset #50655

Merged
droberts195 merged 2 commits into elastic:master from droberts195:fix_ml_classic_off_by_one
Jan 7, 2020

Conversation

@droberts195

The end offset of a tokenizer is supposed to point one past the
end of the input, not to the end character of the input. The
ml_classic tokenizer was erroneously doing the latter.
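The convention described above can be sketched with a small standalone example. This is only an illustration, not the ml_classic tokenizer code: the class `OffsetSketch` and its methods are hypothetical names, and a plain whitespace splitter stands in for the real tokenization rules. It shows token offsets as half-open ranges `[start, end)`, with the final offset equal to the input length (one past the last character) and, for contrast, the off-by-one value that points at the last character itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of half-open offsets; not the actual ml_classic code.
final class OffsetSketch {

    /** A token whose offsets form a half-open range: [start, end). */
    static final class Token {
        final String text;
        final int start; // index of the first character
        final int end;   // index one past the last character

        Token(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    /** Splits on whitespace and records an exclusive end offset per token. */
    static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            // Skip any whitespace before the next token.
            while (i < input.length() && Character.isWhitespace(input.charAt(i))) {
                i++;
            }
            int start = i;
            // Consume non-whitespace characters; i ends one past the token.
            while (i < input.length() && !Character.isWhitespace(input.charAt(i))) {
                i++;
            }
            if (i > start) {
                tokens.add(new Token(input.substring(start, i), start, i));
            }
        }
        return tokens;
    }

    /** Expected final offset: one past the last character of the input. */
    static int finalOffset(String input) {
        return input.length();
    }

    /** The off-by-one variant: the index of the last character itself. */
    static int lastCharIndex(String input) {
        return input.length() - 1;
    }

    public static void main(String[] args) {
        String input = "disk full";
        for (Token t : tokenize(input)) {
            System.out.println(t.text + " [" + t.start + ", " + t.end + ")");
        }
        System.out.println("final offset: " + finalOffset(input));
        System.out.println("last char index: " + lastCharIndex(input));
    }
}
```

For the nine-character input `"disk full"`, the last token spans `[5, 9)` and the final offset is 9, whereas the index of the last character is 8.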

@elasticmachine
Collaborator

Pinging @elastic/ml-core (:ml)

@droberts195
Author

Jenkins run elasticsearch-ci/bwc

@droberts195
Author

Jenkins run elasticsearch-ci/default-distro

Contributor

@dimitris-athanasiou dimitris-athanasiou left a comment

LGTM

@droberts195 droberts195 merged commit 890577a into elastic:master Jan 7, 2020
@droberts195 droberts195 deleted the fix_ml_classic_off_by_one branch January 7, 2020 10:11
droberts195 added a commit that referenced this pull request Jan 7, 2020
SivagurunathanV pushed a commit to SivagurunathanV/elasticsearch that referenced this pull request Jan 23, 2020
…#50655)
4 participants