Closed as not planned
Labels
P:bandwidth-optimization (Priority: Optimize bandwidth usage), e2e (Related to our end-to-end tests)
Description
Target audience
Operators and consensus developers.
Problem definition
Currently we rely on a small set of metrics, from a small set of setups, to determine how bandwidth is used.
We need to gather more data on how bandwidth is used in real-world scenarios.
The data gathered and the method to do so should be well understood by operators, should not disclose any information the validators wouldn't like to share, and should not burden operators.
Upsides
- Develop a fine-grained understanding of data/metadata traffic on the network in CometBFT.
- The approaches developed could later be expanded to collect other kinds of information, for example, storage usage.
Downsides
- Validators need to vet information before sharing with the team, which adds to their work.
- Code developed may end up being used only in test environments.
Tasks
### Goals (1w)
- [ ] Define the metrics to capture, and the metrics to improve, in order to determine bandwidth usage
- [ ] Collect real-world samples for various versions of CometBFT
- [x] Plot bandwidth usage at a given point in time for test setups (implemented as a directed weighted graph with metrics on the edges; collects data at the end of a run from a Prometheus node; implemented as part of #1085)
### Stretch goals
- [ ] Provide guidelines to validators for collecting and sharing bandwidth usage with the CometBFT team
### Long term goals
- [ ] Provide real-time visualization of bandwidth usage metrics
Definition of done
- Metrics to be captured are defined.
- Running an experiment in the e2e testbed, or over a single administrative domain in a real-world deployment, details bandwidth usage.
- Details come as a graph where a link between two nodes is weighted by the amount of information that transited between them.
- This information is broken down with respect to the reactors (mempool, consensus, evidence, etc.).
- If time allows, this information is updated in real time.
- Jointly with this, we collect samples from real-world deployments to gather additional information about bandwidth usage.
- To this end, we provide validators with clear guidelines to collect such information and share it with us.
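
To make the graph described above concrete, here is a minimal sketch of a directed weighted graph whose edges carry a per-reactor breakdown. It is not the code from #1085. The sample format `(sender, receiver, reactor, bytes_sent)`, the node names, and the function names are all assumptions made for illustration; in practice the samples would come from whatever metrics end up being defined.

```python
from collections import defaultdict

# Hypothetical samples: (sender, receiver, reactor, bytes_sent).
# Neither this tuple layout nor the node names come from CometBFT.
SAMPLES = [
    ("node0", "node1", "consensus", 1200),
    ("node0", "node1", "mempool", 300),
    ("node1", "node0", "consensus", 900),
    ("node0", "node2", "evidence", 50),
    ("node0", "node1", "consensus", 800),
]


def build_bandwidth_graph(samples):
    """Aggregate samples into {(sender, receiver): {reactor: total_bytes}}."""
    graph = defaultdict(lambda: defaultdict(int))
    for sender, receiver, reactor, nbytes in samples:
        graph[(sender, receiver)][reactor] += nbytes
    # Convert to plain dicts so the result is easy to compare and serialize.
    return {edge: dict(by_reactor) for edge, by_reactor in graph.items()}


def edge_weights(graph):
    """Total bytes per directed edge, summed over all reactors."""
    return {edge: sum(by_reactor.values()) for edge, by_reactor in graph.items()}


def to_dot(graph):
    """Render the graph in Graphviz DOT, labelling each edge with its breakdown."""
    lines = ["digraph bandwidth {"]
    for (sender, receiver), by_reactor in sorted(graph.items()):
        total = sum(by_reactor.values())
        breakdown = ", ".join(f"{r}={b}" for r, b in sorted(by_reactor.items()))
        lines.append(
            f'  "{sender}" -> "{receiver}" [label="{total} B ({breakdown})"];'
        )
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot(build_bandwidth_graph(SAMPLES)))
```

Keeping the per-reactor totals on each edge, rather than only the sum, means the same structure can answer both "how much traffic flows between two nodes" and "which reactor is responsible for it".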