Skip to content

feat: file store merge metrics#26615

Merged
philjb merged 4 commits intomaster-1.xfrom
pjb/26614/file_store_merge_metrics
Jul 18, 2025
Merged

feat: file store merge metrics#26615
philjb merged 4 commits intomaster-1.xfrom
pjb/26614/file_store_merge_metrics

Conversation

@philjb
Copy link
Copy Markdown
Contributor

@philjb philjb commented Jul 16, 2025

This commit ports metrics around merging tsm blocks when executing a query. These will appear in EXPLAN ANALYZE results. The new information records the time spent merging blocks, the number of blocks merged, roughly the number of values merged into the first block of each ReadBlock call, and the number of times that single calls to ReadBlock have more than 4 block merges. The multiblock merge is sequential and might benefit from a tree merge algorithm. The latter stat helps identify if the engineering effort would be fruitful.

This commit ports metrics around merging tsm blocks when executing a
query. These will appear in EXPLAN ANALYZE results. The new information
records the time spent merging blocks, the number of blocks merged,
roughly the number of values merged into the first block of each
ReadBlock call, and the number of times that single calls to ReadBlock
have more than 4 block merges. The multiblock merge is sequential and
might benefit from a tree merge algorithm. The latter stat helps
identify if the engineering effort would be fruitful.

* closes #26614
@philjb
Copy link
Copy Markdown
Contributor Author

philjb commented Jul 16, 2025

example explain analyze
I created the dataset with watch -n 1 "inch -start-time 2025-07-16T20:01:00Z -f 5 -p 1000 -time 5h" - the fixed start and the time range create a mess of writing data, causing lots of overlapping blocks.

> explain analyze select * from m0 where tag0='value0';
EXPLAIN ANALYZE
---------------
.
└── select
    ├── execution_time: 9.49242475s
    ├── planning_time: 657.939125ms
    ├── total_time: 10.150363875s
    └── build_cursor
        ├── labels
        │   └── statement: SELECT tag0::tag, tag1::tag, tag2::tag, v0::float, v01::float, v02::float, v03::float, v04::float FROM stress.autogen
.m0 WHERE tag0::tag = 'value0'
        └── iterator_scanner
            ├── labels
            │   └── auxiliary_fields: tag0::tag, tag1::tag, tag2::tag, v0::float, v01::float, v02::float, v03::float, v04::float
            ├── create_iterator
            │   ├── labels
            │   │   ├── cond: tag0::tag = 'value0'
            │   │   ├── measurement: m0
            │   │   └── shard_id: 2
            │   ├── cursors_ref: 0
            │   ├── cursors_aux: 500
            │   ├── cursors_cond: 0
            │   ├── float_blocks_decoded: 67100
            │   ├── float_blocks_size_bytes: 127796049
            │   ├── float_blocks_merge_ns: 6260721829
            │   ├── float_blocks_merge_count: 66100
            │   ├── float_blocks_merge_over4_count: 500
            │   ├── float_blocks_merge_values_count: 59865875
            │   ├── integer_blocks_decoded: 0
            │   ├── integer_blocks_size_bytes: 0
            │   ├── integer_blocks_merge_ns: 0
            │   ├── integer_blocks_merge_count: 0
            │   ├── integer_blocks_merge_over4_count: 0
            │   ├── integer_blocks_merge_values_count: 0
            │   ├── unsigned_blocks_decoded: 0
            │   ├── unsigned_blocks_size_bytes: 0
            │   ├── unsigned_blocks_merge_ns: 0
            │   ├── unsigned_blocks_merge_count: 0
            │   ├── unsigned_blocks_merge_over4_count: 0
            │   ├── unsigned_blocks_merge_values_count: 0
            │   ├── string_blocks_decoded: 0
            │   ├── string_blocks_size_bytes: 0
            │   ├── string_blocks_merge_ns: 0
            │   ├── string_blocks_merge_count: 0
            │   ├── string_blocks_merge_over4_count: 0
            │   ├── string_blocks_merge_values_count: 0
            │   ├── boolean_blocks_decoded: 0
            │   ├── boolean_blocks_size_bytes: 0
            │   ├── boolean_blocks_merge_ns: 0
            │   ├── boolean_blocks_merge_count: 0
            │   ├── boolean_blocks_merge_over4_count: 0
            │   ├── boolean_blocks_merge_values_count: 0
            │   └── planning_time: 655.733666ms
            └── create_iterator
                ├── labels
                │   ├── cond: tag0::tag = 'value0'
                │   ├── measurement: m0
                │   └── shard_id: 3
                ├── cursors_ref: 0
                ├── cursors_aux: 500
                ├── cursors_cond: 0
                ├── float_blocks_decoded: 2500
                ├── float_blocks_size_bytes: 121500
                ├── float_blocks_merge_ns: 10895252
                ├── float_blocks_merge_count: 1500
                ├── float_blocks_merge_over4_count: 0
                ├── float_blocks_merge_values_count: 101665
                ├── integer_blocks_decoded: 0
                ├── integer_blocks_size_bytes: 0
                ├── integer_blocks_merge_ns: 0
                ├── integer_blocks_merge_count: 0
                ├── integer_blocks_merge_over4_count: 0
                ├── integer_blocks_merge_values_count: 0
                ├── unsigned_blocks_decoded: 0
                ├── unsigned_blocks_size_bytes: 0
                ├── unsigned_blocks_merge_ns: 0
                ├── unsigned_blocks_merge_count: 0
                ├── unsigned_blocks_merge_over4_count: 0
                ├── unsigned_blocks_merge_values_count: 0
                ├── string_blocks_decoded: 0
                ├── string_blocks_size_bytes: 0
                ├── string_blocks_merge_ns: 0
                ├── string_blocks_merge_count: 0
                ├── string_blocks_merge_over4_count: 0
                ├── string_blocks_merge_values_count: 0
                ├── boolean_blocks_decoded: 0
                ├── boolean_blocks_size_bytes: 0
                ├── boolean_blocks_merge_ns: 0
                ├── boolean_blocks_merge_count: 0
                ├── boolean_blocks_merge_over4_count: 0
                ├── boolean_blocks_merge_values_count: 0
                └── planning_time: 1.935625ms

@philjb philjb requested review from devanbenz and gwossum July 16, 2025 20:34
@gwossum
Copy link
Copy Markdown
Member

gwossum commented Jul 16, 2025

@philjb Overall, I love adding these metrics and look forward to having them in future versions. However, the new timers are in ns, which I don't love, especially since the other timers are in ms. I assume you're using metrics.Counter instead of metrics.Timer to capture cumulative time. The quick and dirty fix would be adding a metric.Timer.AddDuration method. You could also a new metrics.SummingTimer type that leaves off methods that set the timer instead of summing, but that's a lot more code.

@philjb
Copy link
Copy Markdown
Contributor Author

philjb commented Jul 17, 2025

@philjb Overall, I love adding these metrics and look forward to having them in future versions. However, the new timers are in ns, which I don't love, especially since the other timers are in ms. I assume you're using metrics.Counter instead of metrics.Timer to capture cumulative time. The quick and dirty fix would be adding a metric.Timer.AddDuration method. You could also a new metrics.SummingTimer type that leaves off methods that set the timer instead of summing, but that's a lot more code.

You mean in that execution_time for example is printed as a Duration with nicer units than nanoseconds? I think i might be able to handle that. lemme see. If you meant something else, let me know.

@gwossum
Copy link
Copy Markdown
Member

gwossum commented Jul 17, 2025

You mean in that execution_time for example is printed as a Duration with nicer units than nanoseconds? I think i might be able to handle that. lemme see. If you meant something else, let me know.

@philjb Yes, change float_blocks_merge_ns: 6260721829 to float_blocks_merge_duration: 6.260721829s (or similar).

@philjb
Copy link
Copy Markdown
Contributor Author

philjb commented Jul 17, 2025

Ok - switched to timers. The values print at the bottom now tho

            │   ├── string_blocks_merge_over4_count: 0
            │   ├── string_blocks_merge_values_count: 0
            │   ├── boolean_blocks_decoded: 0
            │   ├── boolean_blocks_size_bytes: 0
            │   ├── boolean_blocks_merge_count: 0
            │   ├── boolean_blocks_merge_over4_count: 0
            │   ├── boolean_blocks_merge_values_count: 0
            │   ├── planning_time: 511.67875ms
            │   ├── float_blocks_merge_duration: 5.236738091s
            │   ├── integer_blocks_merge_duration: 0s
            │   ├── unsigned_blocks_merge_duration: 0s
            │   ├── string_blocks_merge_duration: 0s
            │   └── boolean_blocks_merge_duration: 0s

@philjb philjb requested a review from gwossum July 17, 2025 18:35
Copy link
Copy Markdown
Member

@gwossum gwossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I'll create an issue to backport this to 1.12 once 1.12.1 is released.

@philjb philjb merged commit 4e8a3b3 into master-1.x Jul 18, 2025
9 checks passed
devanbenz pushed a commit that referenced this pull request Jul 25, 2025
* feat(1.x,file_store): port metrics for merge work

This commit ports metrics around merging tsm blocks when executing a
query. These will appear in EXPLAN ANALYZE results. The new information
records the time spent merging blocks, the number of blocks merged,
roughly the number of values merged into the first block of each
ReadBlock call, and the number of times that single calls to ReadBlock
have more than 4 block merges. The multiblock merge is sequential and
might benefit from a tree merge algorithm. The latter stat helps
identify if the engineering effort would be fruitful.

* closes #26614

* chore: switch to a timer for duration printing of times

* chore: rename method

* fix: avoid race and use new atomic primitive

(cherry picked from commit 4e8a3b3)
devanbenz pushed a commit that referenced this pull request Jul 25, 2025
* feat(1.x,file_store): port metrics for merge work

This commit ports metrics around merging tsm blocks when executing a
query. These will appear in EXPLAN ANALYZE results. The new information
records the time spent merging blocks, the number of blocks merged,
roughly the number of values merged into the first block of each
ReadBlock call, and the number of times that single calls to ReadBlock
have more than 4 block merges. The multiblock merge is sequential and
might benefit from a tree merge algorithm. The latter stat helps
identify if the engineering effort would be fruitful.

* closes #26614

* chore: switch to a timer for duration printing of times

* chore: rename method

* fix: avoid race and use new atomic primitive

(cherry picked from commit 4e8a3b3)
devanbenz pushed a commit that referenced this pull request Sep 24, 2025
* feat(1.x,file_store): port metrics for merge work

This commit ports metrics around merging tsm blocks when executing a
query. These will appear in EXPLAN ANALYZE results. The new information
records the time spent merging blocks, the number of blocks merged,
roughly the number of values merged into the first block of each
ReadBlock call, and the number of times that single calls to ReadBlock
have more than 4 block merges. The multiblock merge is sequential and
might benefit from a tree merge algorithm. The latter stat helps
identify if the engineering effort would be fruitful.

* closes #26614

* chore: switch to a timer for duration printing of times

* chore: rename method

* fix: avoid race and use new atomic primitive

(cherry picked from commit 4e8a3b3)
devanbenz added a commit that referenced this pull request Sep 25, 2025
Co-authored-by: Phil Bracikowski <13472206+philjb@users.noreply.github.com>
closes #26614
@philjb philjb added the 1.x label Dec 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants