Conversation
c4f4006 to
16ae35a
Compare
This comment was marked as resolved.
This comment was marked as resolved.
lib/segment/src/index/field_index/full_text_index/inverted_index/mod.rs
Outdated
Show resolved
Hide resolved
lib/segment/src/index/field_index/full_text_index/text_index.rs
Outdated
Show resolved
Hide resolved
8ca10ad to
3a99f45
Compare
3a99f45 to
4f3e5ce
Compare
generall
left a comment
There was a problem hiding this comment.
Benchmarks are controversial
| // Build the set intersection of point ids from the query tokens | ||
| let points_for_token = full_text_index | ||
| .filter_query(parsed_query.clone(), &hw_counter) | ||
| .collect::<AHashSet<_>>(); | ||
|
|
||
| Some(Box::new(move |point_id: PointOffsetType| { | ||
| full_text_index.check_match(&parsed_query, point_id, &hw_counter) | ||
| points_for_token.contains(&point_id) | ||
| })) |
There was a problem hiding this comment.
Naive question: if we can do this here, can't we just do the same inside full_text_index.check_match(..)?
1b0c81c to
fced9bb
Compare
|
The optimization is working fine when a lot of points are being tested in the matcher. Otherwise, the cost of of preparing the intersection up front could be more expensive than the savings. To make the matter worse, we actually build the
Creating a single But making this change is not trivial so I will close the current approach as it will never work as long as |
The initial goal of this PR was to fix an over measurement of
payload_index_io_readfor theTextindex.In practice, the final fix is a performance optimization decreasing the hardware counter as a side effect.
In my tests, the
payload_index_io_readvalue forTextmatching is almost 3x higher than what is reported by theread_bytesof the process.In case of high cardinality
Textmatching, the search is not driven directly by the payload index.Meaning a potentially large number of points need to be tested against the payload index to check if the condition matches.
The current approach in
devlazily recomputes the set of points which match the query tokens for each point repeatedly.This set is a constant for all points tested.
This inflates the number of
payload_index_io_readof ops artificially as the mmap slice is cached.The proposed fix is to pre-calculate the intersection of points for the query tokens to reuse it for each point test.
I was able to show locally that this change:
payload_index_io_readto be much closer toread_bytesThe trade-off is to use more memory once to avoid repeated IO.
The set of points kept in memory is the intersection of the tokens posting lists, therefore it should not grow out of proportion when adding tokens to the query.
Future work
Phrase matching does not leverage the optimization yet.
Hopefully we get to follow a similar patterns where positions are handled properly.