Critical dependency logging#4988
Conversation
…itical-dependency-logging
divagant-martian
left a comment
There was a problem hiding this comment.
this needs some changes to correctly obtain at least the gossipsub messages, there are also some open questions about other deps
| match *meta.level() { | ||
| tracing_core::Level::INFO => { | ||
| let _ = file_writer.write(visitor.message); | ||
| } | ||
| tracing_core::Level::WARN => { | ||
| let _ = file_writer.write(visitor.message); | ||
| } | ||
| tracing_core::Level::ERROR => { | ||
| let _ = file_writer.write(visitor.message); | ||
| } | ||
| _ => {} | ||
| } |
There was a problem hiding this comment.
why are you filtering out debug messages?
There was a problem hiding this comment.
no real reason, I'll add debug messages
There was a problem hiding this comment.
don't think you should be filtering here by level, since that can be controlled with the RUST_LOG flag
|
|
||
| lazy_static! { | ||
| pub static ref TRACING_LOGGING_DEPENDENCIES: Vec<String> = | ||
| vec!["libp2p".to_string(), "discv5".to_string()]; |
There was a problem hiding this comment.
note that none of the libp2p dependencies has a target/module path named libp2p. I hacked my changes to get metrics on debug to showcase some of the names we will actually get
dep_debug_total{target="libp2p_core"} 11
dep_debug_total{target="libp2p_dns"} 8
dep_debug_total{target="libp2p_gossipsub"} 273
dep_debug_total{target="libp2p_mplex"} 31
dep_debug_total{target="libp2p_quic"} 2
dep_debug_total{target="libp2p_swarm"} 15
dep_debug_total{target="libp2p_tcp"} 13
we probably need to decide which of these we want and what to do with enabled logs of deps not on the list @AgeManning
There was a problem hiding this comment.
I'm now capturing libp2p_gossipsub, but can make changes to include some, or all, of the libp2p dependencies based on what we decide
There was a problem hiding this comment.
Yeah I think just gossipsub is fine initially. We can maybe add some others in later if things get really spicy.
AgeManning
left a comment
There was a problem hiding this comment.
Awesome! This looks great! Thanks a lot!
Had a few comments, some are just questions :)
| let _ = NonBlockingFileWriter::clear_file(&path); | ||
| } | ||
|
|
||
| if let Err(e) = writeln!(file, "{}", message) { |
There was a problem hiding this comment.
I think it might be useful to tag on the time here.
Typically when things go wrong, they break at some specific time. It would be very useful to have a timestamp so we can cross-reference with metrics/lighthouse logs.
| } | ||
|
|
||
| fn clear_file(path: &PathBuf) -> std::io::Result<()> { | ||
| File::create(path)?; |
There was a problem hiding this comment.
Is it possible for us to maybe just nuke half the file?
Just worried about the case that we get a bug, but we have just recently removed all our logs and have no history.
lighthouse/src/main.rs
Outdated
| // Debugging output for discv5, libp2p and external crates. | ||
| if matches.is_present("env_log") { | ||
| Builder::from_env(Env::default()).init(); | ||
| eprintln!("The -l flag is DEPRECATED. Dependency logging will be on by default."); |
There was a problem hiding this comment.
We can probably just remove this conditional altogether.
I dont think we ever use logging to terminal outside of slog, so probably not worth having this exception here. I'm not sure anyone is really using this flag anyway, except for the devs
There was a problem hiding this comment.
ok talked to paul and he tells me you discussed this. We are then NOT removing the flag, simply the warning. Leaving here for the record
|
|
||
| pub fn create_tracing_layer(logfile_max_size: u64, base_tracing_log_path: PathBuf) { | ||
| let filter_layer = match tracing_subscriber::EnvFilter::try_from_default_env() | ||
| .or_else(|_| tracing_subscriber::EnvFilter::try_new("warn")) |
There was a problem hiding this comment.
Maybe we use debug by default?
There was a problem hiding this comment.
this one's for me, will change to debug
| } | ||
| }; | ||
|
|
||
| for message in receiver { |
There was a problem hiding this comment.
Maybe I'm missing something here.
Shouldn't we have a loop here somewhere. It seems like this thread just reads all the messages, writes them, then ends?
It also looks like there is one of these threads per dependency/file. I was expecting an await, or perhaps this tasks continually gets created, but I only saw one initialisation. I'm likely missing something.
There was a problem hiding this comment.
Shouldn't we have a loop here somewhere. It seems like this thread just reads all the messages, writes them, then ends?
As far as i understand it, the mpsc channel creates a sender and receiver. The receiver blocks until a message is available (as long as sender still exists).
https://doc.rust-lang.org/std/sync/mpsc/fn.channel.html
So that for message in receiver loop blocks until a message is received and should never end as long as the sender exists.
It also looks like there is one of these threads per dependency/file. I was expecting an await, or perhaps this tasks continually gets created, but I only saw one initialisation. I'm likely missing something.
There is one thread per dependency, the stream initialization per dependency happens here
Might be a bit overkill to spawn a new stream per dependency, I can change it to one stream for all dependencies if that makes more sense
There was a problem hiding this comment.
ah yeah all good. Sorry have been in tokio async land too long and forgot about std channels :p
common/logging/src/lib.rs
Outdated
| { | ||
| Ok(filter) => filter, | ||
| Err(e) => { | ||
| eprintln!("Failed to initialize dependency tracing {e}"); |
There was a problem hiding this comment.
I think at this point we don't have a useable slogger, so we are forced into prtinln! right?
There was a problem hiding this comment.
yep that's the reason
|
@divagant-martian has suggested some traits that might make this a bit nicer. There is this trait: https://docs.rs/tracing-subscriber/0.3.18/tracing_subscriber/fmt/writer/trait.MakeWriter.html which might help with logging fancy logs and timestamps etc. There is this one: https://docs.rs/tracing-appender/latest/tracing_appender/non_blocking/struct.NonBlocking.html And this one: https://docs.rs/tracing-appender/latest/tracing_appender/rolling/struct.RollingFileAppender.html Which might help with writing to a file with in a rolling fashion so we can maintain the size rather than clearing half periodically, etc. |
|
those suggestions seem to have simplified some things. I think this should be ready for another review |
|
discv5 logs libp2p logs |
AgeManning
left a comment
There was a problem hiding this comment.
Nice. This looks much cleaner.
We are doing some releases at the moment. So will get this guy in once we're accepting more code.
Issue Addressed
#4954
Proposed Changes
This PR proposes adding logs for the following dependencies
Each dependency has its own separate log file. Using the tracing appender library a non blocking file stream is created for each dependency. On each tracing event a timestamp, log level and log message are written to the respective log file. The files are rotated daily
Example discv5 logs
Example libp2p-gossipsub logs
Additional Info
This builds on top of changes made in #4979. It may be a bit difficult to review until that PR is merged
Most of my changes can be found in the following files
lighthouse/src/main.rslighthouse/common/logging/src/lib.rslighthouse/common/logging/src/tracing_logging_layer.rs