tools: cli for real-time debugging and monitoring #2383
Description
At Lyft we have a plethora of tools for monitoring Envoy that take advantage of the logging and stats output, which are aggregated from every host into a central cluster. Sometimes, though, you just want to see what's happening on a box right now. I hacked together a couple of command-line tools for monitoring Envoy in real time. curl and grep against the admin endpoints, combined with watch, is pretty useful at times, so I decided to make a dedicated tool. The other benefit of the tool is that it can diff successive counter values to give you a number per interval for stats like cluster.foo.upstream_rq_2xx.
Here's example output from my tool, similar to iostat, mpstat, vmstat, etc.
$ envoystat -p http.router.downstream 1
2018/01/16 envoy 8cf90bcb/1.6.0-dev/Modified/RELEASE live 354839 354839 0
08:12:44 PM cx_active' rq_active' rq_2xx rq_4xx rq_5xx rq_total
08:12:44 PM 420 40 266 2 0 315
08:12:45 PM 420 29 305 1 0 313
08:12:46 PM 421 35 274 3 0 314
08:12:47 PM 420 27 226 2 0 244
08:12:48 PM 421 21 221 2 0 241
08:12:49 PM 420 29 246 2 0 276
08:12:50 PM 420 35 284 4 0 314
08:12:51 PM 420 29 276 2 0 290
08:12:52 PM 420 28 242 7 0 275
08:12:53 PM 421 24 221 2 0 240
^C
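For anyone curious how the per-interval numbers can be produced, here is a minimal sketch of the idea, not the actual envoystat code. It assumes the admin `/stats` endpoint returns plain `name: value` lines and that the admin listener is at `127.0.0.1:9901` (an assumption; use whatever your deployment configures). The function names and the `gauges` parameter are made up for illustration. Because the text output does not say which stats are gauges, the caller has to name the ones that should be shown as-is rather than diffed.

```python
import time
import urllib.request

# Assumed admin address; change to match your Envoy admin listener.
ADMIN_URL = "http://127.0.0.1:9901/stats"


def parse_stats(text, prefix=""):
    """Parse 'name: value' lines from /stats text into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if not sep or not name.startswith(prefix):
            continue
        try:
            stats[name] = int(value)
        except ValueError:
            continue  # skip non-integer entries such as histogram summaries
    return stats


def diff_counters(prev, curr, gauges=()):
    """Return per-interval deltas; names listed in `gauges` keep their current value."""
    out = {}
    for name, value in curr.items():
        if name in gauges:
            out[name] = value
        elif name in prev:
            out[name] = value - prev[name]
    return out


def fetch_stats(url=ADMIN_URL, prefix=""):
    """Fetch and parse one sample from the admin endpoint."""
    with urllib.request.urlopen(url, timeout=2) as resp:
        return parse_stats(resp.read().decode("utf-8"), prefix)


def watch(prefix, interval=1.0, gauges=()):
    """Print one line of deltas per interval until interrupted."""
    prev = fetch_stats(prefix=prefix)
    while True:
        time.sleep(interval)
        curr = fetch_stats(prefix=prefix)
        print(time.strftime("%I:%M:%S %p"), diff_counters(prev, curr, gauges))
        prev = curr
```

Calling `watch("http.router.downstream", 1.0)` would loop until interrupted, printing a dict of deltas each second; column formatting like the output above is left out to keep the sketch short.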
I wrote it in Python in about 30 minutes, and it wouldn't take much to make it more generic and broadly useful for various deployments of Envoy. I'm also planning to add a mode that analyzes the local access log and outputs the top values for various fields (user agent, IP, etc.).
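The access log mode doesn't exist yet, but a rough sketch of what it could look like is below. It assumes the log uses Envoy's default access log format, where the double-quoted fields appear in the order request line, X-Forwarded-For, user agent, request ID, authority, upstream host. The `FIELDS` mapping and `top_values` are hypothetical names, and a custom log format would need different indexes.

```python
import re
from collections import Counter

# Matches each double-quoted field on a log line.
QUOTED = re.compile(r'"([^"]*)"')

# Index of each quoted field, ASSUMING Envoy's default access log format.
FIELDS = {
    "request": 0,
    "x_forwarded_for": 1,
    "user_agent": 2,
    "request_id": 3,
    "authority": 4,
    "upstream_host": 5,
}


def top_values(lines, field, n=10):
    """Return the n most common values of `field` as (value, count) pairs."""
    index = FIELDS[field]
    counts = Counter()
    for line in lines:
        quoted = QUOTED.findall(line)
        if len(quoted) > index:  # ignore lines that don't match the format
            counts[quoted[index]] += 1
    return counts.most_common(n)
```

Usage would be something like `top_values(open(path), "user_agent")`, where `path` points at wherever your deployment writes the access log.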
@mccv has also been working on some cli tools. Looking for some details from him and any other opinions before building mine out more.