gossip: log to net/trace events endpoint#8812
Conversation
|
can you use this contraption instead of the EventLog directly? Line 129 in 8d70a36 I was just talking to @RaduBerinde about figuring out something such that people wanting to log to such Review status: 0 of 2 files reviewed at latest revision, all discussions resolved, all commit checks successful. Comments from Reviewable |
|
Reviewed 2 of 2 files at r1. gossip/gossip.go, line 1035 [r1] (raw file):
This is a logic change not related to logging, right? This call wasn't there before. Perhaps a comment is warranted to explain. Comments from Reviewable |
|
Yeah - I plan to make EventLogging happen automatically through the Context when you log (just like tracing). That will be useful in a bunch of of other places. I will refactor this code then. In fact I welcome a simple usecase (the Review status: all files reviewed at latest revision, 1 unresolved discussion, all commit checks successful. Comments from Reviewable |
|
We need to Review status: all files reviewed at latest revision, 1 unresolved discussion, all commit checks successful. Comments from Reviewable |
|
I don't think we have to (no goroutine attached to it I think) but we should (Done). Review status: all files reviewed at latest revision, 1 unresolved discussion, all commit checks successful. gossip/gossip.go, line 1035 [r1] (raw file):
|
|
SGTM On Thu, Aug 25, 2016 at 11:33 AM, Tobias Schottdorf <
|
5df52a6 to
ea7bbea
Compare
This has been helpful when debugging a local cluster which did not connect and can't hurt for cockroachdb#7668 as well. Add two net/trace event logs to `Gossip`, one which is used in various places and one dedicated to the `bootstrap()` goroutine. Sample output: 18:05:41.068322 . 69 ... have clients: false, have sentinel: false 18:05:41.068326 . 4 ... no next bootstrap address; currently bootstrapping: [] 18:05:41.068370 . 45 ... sleeping 1s until bootstrap 18:05:42.071958 1.003588 ... idling until bootstrap required 18:05:43.400931 1.328973 ... detected stall; commencing bootstrap 18:05:43.400935 . 4 ... have clients: false, have sentinel: false 18:05:43.709402 .308467 ... sleeping 1s until bootstrap 18:05:44.711241 1.001839 ... idling until bootstrap required 18:05:41.066532 . 10 ... initial resolvers: [] 18:05:41.066538 . 6 ... now stalled 18:05:41.445657 .379118 ... NodeID set to 4 18:05:43.400535 1.954878 ... NodeDescriptor set to node_id:4 address:<network_field:"tcp" address_field:"macts.local:26257" > attrs:<> 18:05:43.400877 . 342 ... add bootstrap {tcp macts.local:26263} 18:05:43.400881 . 4 ... add bootstrap {tcp macts.local:26259} 18:05:43.400886 . 5 ... add bootstrap {tcp macts.local:26261} 18:05:43.400891 . 5 ... add resolver &{tcp macts.local:26263} 18:05:43.400895 . 4 ... add resolver &{tcp macts.local:26259} 18:05:43.400897 . 1 ... add resolver &{tcp macts.local:26261} 18:05:43.709373 .308476 ... starting new client to macts.local:26259 18:05:47.164797 3.455424 ... connected 18:06:05.788452 18.623655 ... add resolver &{tcp macts.local:26259} 18:06:05.788494 . 42 ... add bootstrap {tcp macts.local:26259} 18:06:05.788499 . 5 ... add resolver &{tcp macts.local:26261} 18:06:05.788500 . 1 ... add bootstrap {tcp macts.local:26261} 18:06:05.788502 . 2 ... add resolver &{tcp macts.local:26263} 18:06:05.788503 . 1 ... add bootstrap {tcp macts.local:26263} 18:06:05.788644 . 141 ... connected 18:06:47.170653 41.382009 ... gossip status (ok, 4 nodes) gossip client (1/3 cur/max conns) 1: macts.local:26259 (1m3s: infos 10/20 sent/received, bytes 5662B/2740B sent/received) gossip server (2/3 cur/max conns, infos 78/82 sent/received, bytes 18681B/13985B sent/received) 2: macts.local:26261 (1m0s) 3: macts.local:26263 (59s) 18:07:47.167660 59.997007 ... gossip status (ok, 4 nodes) gossip client (1/3 cur/max conns) 1: macts.local:26259 (2m3s: infos 19/25 sent/received, bytes 6940B/4012B sent/received) gossip server (2/3 cur/max conns, infos 107/95 sent/received, bytes 22819B/16298B sent/received) 3: macts.local:26263 (1m59s) 2: macts.local:26261 (2m0s) 18:08:47.171295 60.003635 ... gossip status (ok, 4 nodes) gossip client (1/3 cur/max conns) 1: macts.local:26259 (3m3s: infos 34/32 sent/received, bytes 9066B/5370B sent/received) gossip server (2/3 cur/max conns, infos 150/114 sent/received, bytes 28911B/19450B sent/received) 3: macts.local:26263 (2m59s) 2: macts.local:26261 (3m0s) 18:09:47.171775 60.000480 ... gossip status (ok, 4 nodes) gossip client (1/3 cur/max conns) 1: macts.local:26259 (4m3s: infos 38/41 sent/received, bytes 9851B/6846B sent/received) gossip server (2/3 cur/max conns, infos 177/133 sent/received, bytes 33077B/22433B sent/received) 3: macts.local:26263 (3m59s) 2: macts.local:26261 (4m0s) 18:10:47.172169 60.000394 ... gossip status (ok, 4 nodes) gossip client (1/3 cur/max conns) 1: macts.local:26259 (5m3s: infos 41/51 sent/received, bytes 10471B/8232B sent/received) gossip server (2/3 cur/max conns, infos 204/152 sent/received, bytes 37181B/25165B sent/received) 2: macts.local:26261 (5m0s) 3: macts.local:26263 (4m59s) 18:11:47.169247 59.997079 ... gossip status (ok, 4 nodes) gossip client (1/3 cur/max conns) 1: macts.local:26259 (6m3s: infos 47/59 sent/received, bytes 11385B/9388B sent/received) gossip server (2/3 cur/max conns, infos 231/169 sent/received, bytes 40951B/27495B sent/received) 3: macts.local:26263 (5m59s) 2: macts.local:26261 (6m0s)
This method must be called "regularly" to close Gossip's `Connected` channel when the first connection is made. It was previously called in the following locations: - AddInfo - handleResponse Neither of those seems particularly idiomatic (though those calls don't hurt). The idiomatic location (the one where Gossip state transitions from stalled to connected) is added here along with a TODO.
|
PTAL |
|
Review status: 0 of 3 files reviewed at latest revision, 1 unresolved discussion, all commit checks successful. Comments from Reviewable |
|
Review status: 0 of 3 files reviewed at latest revision, 4 unresolved discussions, all commit checks successful. gossip/gossip.go, line 218 [r2] (raw file):
shouldn't this be inside the gossip/gossip.go, line 872 [r2] (raw file):
do we really need a new EventLog for bootstrap? Can't you just print begin/end bootstrap events on the main EventLog? gossip/gossip.go, line 1087 [r3] (raw file):
Comments from Reviewable |
|
Reviewed 1 of 2 files at r2, 2 of 2 files at r3. gossip/gossip.go, line 218 [r2] (raw file):
|
|
Review status: all files reviewed at latest revision, 4 unresolved discussions, all commit checks successful. gossip/gossip.go, line 218 [r2] (raw file):
|
|
Review status: all files reviewed at latest revision, 4 unresolved discussions, all commit checks successful. gossip/gossip.go, line 872 [r2] (raw file):
|
This has been helpful when debugging a local cluster which did not connect
and can't hurt for #7668 as well.
Add two net/trace event logs to
Gossip, one which is used in various placesand one dedicated to the
bootstrap()goroutine.Example output
18:05:41.068322 . 69 ... have clients: false, have sentinel: false
18:05:41.068326 . 4 ... no next bootstrap address; currently bootstrapping: []
18:05:41.068370 . 45 ... sleeping 1s until bootstrap
18:05:42.071958 1.003588 ... idling until bootstrap required
18:05:43.400931 1.328973 ... detected stall; commencing bootstrap
18:05:43.400935 . 4 ... have clients: false, have sentinel: false
18:05:43.709402 .308467 ... sleeping 1s until bootstrap
18:05:44.711241 1.001839 ... idling until bootstrap required
18:05:41.066532 . 10 ... initial resolvers: []
18:05:41.066538 . 6 ... now stalled
18:05:41.445657 .379118 ... NodeID set to 4
18:05:43.400535 1.954878 ... NodeDescriptor set to node_id:4 address:<network_field:"tcp" address_field:"macts.local:26257" > attrs:<>
18:05:43.400877 . 342 ... add bootstrap {tcp macts.local:26263}
18:05:43.400881 . 4 ... add bootstrap {tcp macts.local:26259}
18:05:43.400886 . 5 ... add bootstrap {tcp macts.local:26261}
18:05:43.400891 . 5 ... add resolver &{tcp macts.local:26263}
18:05:43.400895 . 4 ... add resolver &{tcp macts.local:26259}
18:05:43.400897 . 1 ... add resolver &{tcp macts.local:26261}
18:05:43.709373 .308476 ... starting new client to macts.local:26259
18:05:47.164797 3.455424 ... connected
18:06:05.788452 18.623655 ... add resolver &{tcp macts.local:26259}
18:06:05.788494 . 42 ... add bootstrap {tcp macts.local:26259}
18:06:05.788499 . 5 ... add resolver &{tcp macts.local:26261}
18:06:05.788500 . 1 ... add bootstrap {tcp macts.local:26261}
18:06:05.788502 . 2 ... add resolver &{tcp macts.local:26263}
18:06:05.788503 . 1 ... add bootstrap {tcp macts.local:26263}
18:06:05.788644 . 141 ... connected
18:06:47.170653 41.382009 ... gossip status (ok, 4 nodes) gossip client (1/3 cur/max conns) 1: macts.local:26259 (1m3s: infos 10/20 sent/received, bytes 5662B/2740B sent/received) gossip server (2/3 cur/max conns, infos 78/82 sent/received, bytes 18681B/13985B sent/received) 2: macts.local:26261 (1m0s) 3: macts.local:26263 (59s)
18:07:47.167660 59.997007 ... gossip status (ok, 4 nodes) gossip client (1/3 cur/max conns) 1: macts.local:26259 (2m3s: infos 19/25 sent/received, bytes 6940B/4012B sent/received) gossip server (2/3 cur/max conns, infos 107/95 sent/received, bytes 22819B/16298B sent/received) 3: macts.local:26263 (1m59s) 2: macts.local:26261 (2m0s)
18:08:47.171295 60.003635 ... gossip status (ok, 4 nodes) gossip client (1/3 cur/max conns) 1: macts.local:26259 (3m3s: infos 34/32 sent/received, bytes 9066B/5370B sent/received) gossip server (2/3 cur/max conns, infos 150/114 sent/received, bytes 28911B/19450B sent/received) 3: macts.local:26263 (2m59s) 2: macts.local:26261 (3m0s)
18:09:47.171775 60.000480 ... gossip status (ok, 4 nodes) gossip client (1/3 cur/max conns) 1: macts.local:26259 (4m3s: infos 38/41 sent/received, bytes 9851B/6846B sent/received) gossip server (2/3 cur/max conns, infos 177/133 sent/received, bytes 33077B/22433B sent/received) 3: macts.local:26263 (3m59s) 2: macts.local:26261 (4m0s)
18:10:47.172169 60.000394 ... gossip status (ok, 4 nodes) gossip client (1/3 cur/max conns) 1: macts.local:26259 (5m3s: infos 41/51 sent/received, bytes 10471B/8232B sent/received) gossip server (2/3 cur/max conns, infos 204/152 sent/received, bytes 37181B/25165B sent/received) 2: macts.local:26261 (5m0s) 3: macts.local:26263 (4m59s)
18:11:47.169247 59.997079 ... gossip status (ok, 4 nodes) gossip client (1/3 cur/max conns) 1: macts.local:26259 (6m3s: infos 47/59 sent/received, bytes 11385B/9388B sent/received) gossip server (2/3 cur/max conns, infos 231/169 sent/received, bytes 40951B/27495B sent/received) 3: macts.local:26263 (5m59s) 2: macts.local:26261 (6m0s)
This change is