As mentioned in the Protocol Weekly on 7/25, instrumentation should be added to all async calls.
Things to keep in mind (via Janis):
- We need to remove instrumentation from all long-running tasks (in our case, essentially everything that is called `T::run` or `T::run_until_stopped`).
- We also need to remove all events that run under these (so if you have a `loop { select!(...) }` construct, there cannot be events in it).
  a. The reason is that forever spans are completely useless on an observability platform: in the best case we will see them at the end of a run; in the worst case we will lose them completely because the gRPC payload is too big (the span contains too many events).
- While we want to have `#[instrument(skip_all, err)]`, we don't necessarily want the errors to be emitted at the ERROR level. ERROR is for service-stopping issues, while WARN is for issues that can be handled (for example, through a retry).
  a. For this example, that would be `#[instrument(skip_all, err(level = Level::WARN))]`.
### Tasks
- [x] bridge-withdrawer
- [x] composer
- [x] conductor
- [x] sequencer
- [x] sequencer-relayer
┆Issue Number: ENG-670