Skip to content

feat(logging): Add startup logging for shard counts#25378

Merged
devanbenz merged 14 commits intomaster-1.xfrom
feat/476-startup-progress
Oct 16, 2024
Merged

feat(logging): Add startup logging for shard counts#25378
devanbenz merged 14 commits intomaster-1.xfrom
feat/476-startup-progress

Conversation

@devanbenz
Copy link
Copy Markdown

@devanbenz devanbenz commented Sep 20, 2024

This PR adds a check to see how many shards are remaining
vs how many shards are opened as well as the percentage.

closes https://github.com/influxdata/feature-requests/issues/476

@devanbenz devanbenz force-pushed the feat/476-startup-progress branch from 1ce74a8 to b4b96c7 Compare September 21, 2024 01:34
@devanbenz devanbenz changed the title feat(logging): WIP beginning building out the structure for startup feat(logging): Add startup logging for shard counts Sep 23, 2024
@devanbenz devanbenz marked this pull request as ready for review September 23, 2024 14:20
@devanbenz devanbenz requested a review from gwossum September 23, 2024 14:20
@devanbenz devanbenz self-assigned this Sep 23, 2024
@devanbenz devanbenz marked this pull request as draft September 23, 2024 16:41
@gwossum
Copy link
Copy Markdown
Member

gwossum commented Sep 23, 2024

@devanbenz @davidby-influx I have some hard numbers to put behind my claim that scanning all the directories and files and then scanning them again doesn't take much longer than scanning them once. On my 3.5 year old Influx laptop, there are ~3.5 million files in my home directory. Scanning all of them cold after dropping all the disk caches takes about 50 seconds. Scanning a second time with the cache hot only takes 4 seconds. So scanning them twice only adds 8% beyond scanning them one time.

@devanbenz devanbenz marked this pull request as ready for review September 23, 2024 22:11
@devanbenz
Copy link
Copy Markdown
Author

devanbenz commented Sep 23, 2024

@gwossum this is ready now, I was thinking I could set up a test suite entry for maybe 0 shards and then it hitting an error to show that the uint64 doesn't go below 0 per the checks I have in the interface but idk if I should take the time to wire that up seems like a lot of test boilerplate I need to create for it. I will also squash the commits and have a single descriptive commit later.

@devanbenz devanbenz requested a review from gwossum September 24, 2024 14:40
This PR adds a check to see how many shards are remaining
vs how many shards are opened as well as the percentage.

closes influxdata/feature-requests#476
@devanbenz devanbenz force-pushed the feat/476-startup-progress branch from 11f53c2 to 26de392 Compare September 25, 2024 15:28
This PR adds a check to see how many shards are remaining
vs how many shards are opened as well as the percentage.

closes influxdata/feature-requests#476
@devanbenz devanbenz requested a review from gwossum October 4, 2024 14:22
@devanbenz
Copy link
Copy Markdown
Author

@gwossum I've modified my store_test code to include the changes that you established to fix the additional shard problem. The scaffolding is fixed now.

@devanbenz devanbenz requested a review from gwossum October 11, 2024 18:50
tsdb/store.go Outdated
continue
}
// We use `rawShardCount` as a buffer size for channel creation below
rawShardCount := 0
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The handling of rawShardCount works with this code, but there might be a time bomb waiting to go off in the future. Because rawShardCount gets incremented for every shard directory each time walkShardsAndProcess is called, it will no longer is the shard count the second time walkShardsAndProcess gets called. Since rawShardCount is only used after the first call but before the second, it has the correct value. It might cause issues if it gets used for more stuff in the future, though. Fortunately this is an easy fix: Move the incrementing of rawShardCount into the lambda that calls s.startupProgressMetrics.AddShard().

One more thing to note: If there is no startupProgressMetrics object, then rawShardCount will be 0 when creating the channel. This isn't any worse than it is today, so I don't think it's necessary to do anything. However, it does make rawShardCount incorrect if used for another purpose in the future. You might want to add a comment, or call walkShardsAndProcess the first time even if there is no startupProgressMetrics object.

@devanbenz devanbenz requested a review from gwossum October 15, 2024 21:13
Copy link
Copy Markdown
Member

@gwossum gwossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@devanbenz devanbenz merged commit 3c87f52 into master-1.x Oct 16, 2024
@devanbenz devanbenz deleted the feat/476-startup-progress branch October 16, 2024 15:09
devanbenz added a commit that referenced this pull request Oct 31, 2024
* feat(tsdb): Adds shard opening progress checks to startup
This PR adds a check to see how many shards are remaining
vs how many shards are opened. This change displays the percent
completed too.

closes influxdata/feature-requests#476

(cherry picked from commit 3c87f52)

closes #25506
devanbenz added a commit that referenced this pull request Nov 1, 2024
* feat(logging): Add startup logging for shard counts (#25378)
This PR adds a check to see how many shards are remaining
vs how many shards are opened. This change displays the percent
completed too.

closes influxdata/feature-requests#476

(cherry picked from commit 3c87f52)

closes #25506
devanbenz added a commit that referenced this pull request Nov 1, 2024
* feat(logging): Add startup logging for shard counts (#25378)
This PR adds a check to see how many shards are remaining
vs how many shards are opened. This change displays the percent
completed too.

closes influxdata/feature-requests#476

(cherry picked from commit 3c87f52)

closes #25506

(cherry picked from commit 2ffb108)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants