Initially, we added only a doc-count check and a date-based check to the rollover API. We avoided a size-based check because the on-disk size can be wildly inaccurate while multiple merges are in progress on different shards of the index.
We could make a more reasonable estimate by converting segment sizes to a number of docs, which makes us less sensitive to merges and similar transient effects. Could we sample the average doc size on disk from existing docs and ultimately base the decision on doc count?
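
To make the idea concrete, here is a rough sketch of the conversion in Python. It is not the rollover API or any existing Elasticsearch code: `ShardSample`, `avg_doc_size_bytes`, `max_docs_for_size` and `should_roll_over` are hypothetical names, and it assumes we can get a doc count and store size per shard, sampled at a point where the size is reasonably stable.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ShardSample:
    """Hypothetical per-shard sample: doc count and on-disk size in bytes."""
    doc_count: int
    store_bytes: int


def avg_doc_size_bytes(samples: List[ShardSample]) -> Optional[float]:
    """Average on-disk bytes per doc across the sampled shards."""
    total_docs = sum(s.doc_count for s in samples)
    total_bytes = sum(s.store_bytes for s in samples)
    if total_docs == 0:
        return None  # nothing to sample from yet
    return total_bytes / total_docs


def max_docs_for_size(target_bytes: int, samples: List[ShardSample]) -> Optional[int]:
    """Translate a size target into an equivalent doc-count threshold."""
    avg = avg_doc_size_bytes(samples)
    if avg is None or avg <= 0:
        return None
    return int(target_bytes // avg)


def should_roll_over(current_docs: int, target_bytes: int,
                     samples: List[ShardSample]) -> bool:
    """Decide on doc count only; size enters via the sampled average."""
    limit = max_docs_for_size(target_bytes, samples)
    return limit is not None and current_docs >= limit
```

The point of the design is that the live check only compares doc counts, so a merge temporarily inflating the store size does not flip the decision. The estimate is only as good as the sample, though: if the average is taken while merges are running, it inherits the same inaccuracy.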