Optimise seeking by timestamp #22129

@Samreay

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

Right now it seems that seeking a reader or a consumer to a specific timestamp is an unoptimised process that can take many seconds, or even over a minute, for larger topics (data size of single GBs, tens of messages per second). According to a Slack comment from @lhotari, seeking by timestamp is not optimised, so I'm here to propose optimising it as a valuable feature.

Solution

Seeking currently works by message ID or by timestamp. I assume (though I could be wrong) that seeking by message ID is optimised. Without going into the implementation details properly, and just spitballing ideas: something like a binary search on the timestamp, or a tree map from timestamp to message ID (at any level of sparsity), might make seeking far faster.
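
To make the tree map idea a bit more concrete, here is a rough sketch of a sparse index from publish timestamp to a position in the log. Everything in it is hypothetical: `SparseTimestampIndex`, `onAppend`, `seekStart`, the `stride` parameter, and the use of a plain `long` as the position are illustration only and not part of Pulsar's API. It also assumes publish timestamps are non-decreasing in log order and that positions start at 0, which may not hold in practice.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sparse index from publish timestamp to log position (not Pulsar's API). */
class SparseTimestampIndex {
    // Sorted map: publish time (ms) -> position of an indexed message.
    private final TreeMap<Long, Long> index = new TreeMap<>();
    private final int stride;   // index one message out of every `stride`
    private long appended = 0;  // number of messages seen so far

    SparseTimestampIndex(int stride) {
        if (stride < 1) {
            throw new IllegalArgumentException("stride must be >= 1");
        }
        this.stride = stride;
    }

    /** Record an appended message; only every stride-th message is indexed. */
    void onAppend(long publishTimeMillis, long position) {
        if (appended % stride == 0) {
            // Keep the earliest indexed position for a given timestamp.
            index.putIfAbsent(publishTimeMillis, position);
        }
        appended++;
    }

    /**
     * Returns a position from which a forward scan is guaranteed to reach the
     * first message with publish time >= timestampMillis (assuming
     * non-decreasing timestamps). Uses the closest indexed entry strictly
     * before the target, so messages sharing the target timestamp are not skipped.
     */
    long seekStart(long timestampMillis) {
        Map.Entry<Long, Long> entry = index.lowerEntry(timestampMillis);
        return entry == null ? 0L : entry.getValue(); // 0 = assumed start of log
    }
}
```

The lookup is logarithmic in the number of indexed entries, and the scan that follows it only has to cover the gap up to the next indexed entry, so `stride` trades memory for scan length.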

Alternatives

No response

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
