Skip redundant UTF8 to UTF16 conversion follow-ups

In PR #126492, we implemented an optimization to skip some redundant UTF8 to UTF16 conversions. 

There were several follow-ups to that PR, which are tracked here.

- [x] Support for escaped and/or non-ascii characters (#129169)
- [x] Support for match_only_text fields (#129371)
- [ ] Support for text fields. Change the field type to use `UTF8DecodingReader` for the indexed field.
- [ ] Support for wildcard fields. Adopt `UTF8DecodingReader` for the indexed field. There is also an unneeded UTF16 to UTF8 conversion for binary doc values.
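
For readers unfamiliar with the reader-based approach mentioned in the two open items, here is a minimal sketch of the general idea using only JDK classes. It is not the actual `UTF8DecodingReader` implementation; the class `Utf8ReaderSketch` and its helpers `utf8Reader` and `countChars` are hypothetical names introduced for illustration. The point is that a consumer accepting a `Reader` can pull characters from the raw UTF8 bytes in small chunks, so no `String` holding the whole value needs to be built first.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class Utf8ReaderSketch {

    // Hypothetical helper: exposes a slice of UTF8 bytes as a Reader.
    // Decoding happens incrementally as the consumer reads, instead of
    // converting the whole value to a String up front.
    static Reader utf8Reader(byte[] bytes, int offset, int length) {
        return new InputStreamReader(
            new ByteArrayInputStream(bytes, offset, length),
            StandardCharsets.UTF_8
        );
    }

    // Stand-in for a consumer such as a tokenizer: it only ever sees
    // a small char buffer, never the full decoded value.
    static int countChars(Reader reader) {
        char[] buffer = new char[256];
        int total = 0;
        try {
            int read;
            while ((read = reader.read(buffer, 0, buffer.length)) != -1) {
                total += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed input: a field value already held as UTF8 bytes.
        byte[] raw = "h\u00e9llo w\u00f6rld".getBytes(StandardCharsets.UTF_8);
        System.out.println(countChars(utf8Reader(raw, 0, raw.length)));
    }
}
```

The characters are still decoded to UTF16 inside the reader, but only a buffer at a time, which avoids allocating an intermediate `String` per field value.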

Optional follow-ups:
- [ ] Support for other xcontent types (~cbor~ #132542, smile, yaml)
- [ ] Remove `XContentParser#optimizedText()` and have `XContentParser#text()` return `XContentString` rather than `String`

Maybe not even possible:
- [ ] Support for running normalizers on UTF8-encoded data instead of needing to convert to UTF16 strings
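
To make the last item concrete, the following toy sketch shows what normalizing directly on UTF8 bytes could look like for the simplest possible case, lowercasing. It is an assumption-laden illustration, not a proposal for the real normalizer chain: `Utf8LowercaseSketch` and `lowercaseUtf8` are hypothetical names, and real normalizers can do far more than lowercase. ASCII input is handled byte by byte; anything else falls back to the usual conversion through a `String`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class Utf8LowercaseSketch {

    // Hypothetical normalizer: lowercases UTF8 bytes without decoding
    // when the input is pure ASCII, and falls back to String otherwise.
    static byte[] lowercaseUtf8(byte[] utf8) {
        byte[] out = new byte[utf8.length];
        for (int i = 0; i < utf8.length; i++) {
            byte b = utf8[i];
            if (b < 0) {
                // High bit set means a multi-byte sequence: give up on the
                // fast path and take the UTF16 round trip for correctness.
                return new String(utf8, StandardCharsets.UTF_8)
                    .toLowerCase(Locale.ROOT)
                    .getBytes(StandardCharsets.UTF_8);
            }
            // ASCII 'A'..'Z' map to 'a'..'z' by adding 32.
            out[i] = (b >= 'A' && b <= 'Z') ? (byte) (b + 32) : b;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] input = "Hello World".getBytes(StandardCharsets.UTF_8);
        System.out.println(new String(lowercaseUtf8(input), StandardCharsets.UTF_8));
    }
}
```

The fallback branch is where the difficulty flagged above shows up: once non-ASCII input or a more complex normalizer is involved, a byte-level path has to reproduce the behavior of the UTF16-based one exactly.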

Skip redundant UTF8 to UTF16 conversion follow-ups #129072

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Skip redundant UTF8 to UTF16 conversion follow-ups #129072

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions