crimson/osd: add verbose DEBUG logs for OSD startup#67090
crimson/osd: add verbose DEBUG logs for OSD startup#67090shraddhaag merged 1 commit intoceph:mainfrom
Conversation
Matan-B
left a comment
There was a problem hiding this comment.
This should make osd boot more verbose on debug.
Added few suggestions. Could be a good opportunity to update the shorter functions to coroutines such as open_or_create_meta_coll, open_meta_coll or even OSD::start() (which is a bit longer)
src/crimson/osd/main.cc
Outdated
| // handled by S* must be blocked for alien threads (see AlienStore). | ||
| seastar::handle_signal(SIGHUP, [] {}); | ||
|
|
||
| DEBUG("initializing prometheus (if enabled)"); |
There was a problem hiding this comment.
nit, might be confusing, the log line within the if branch should be sufficient if is enabled
There was a problem hiding this comment.
main.cc has a static seastar::logger& logger() which could be removed now if there aren't any other user of it.
| crimson::osd::populate_config_from_mon().get(); | ||
| } | ||
| if (config.count("mkfs")) { | ||
| DEBUG("running mkfs"); |
src/crimson/osd/main.cc
Outdated
| logger().info("crimson startup completed"); | ||
| INFO("crimson startup completed"); | ||
|
|
||
| DEBUG("waiting for stop signal"); |
There was a problem hiding this comment.
This could be confusing, let's avoid this line. We are waiting as long as the OSD is alive. Might be better to let the last log be crimson startup completed (after completing).
src/crimson/osd/osd.cc
Outdated
| seastar::future<> OSD::open_meta_coll() | ||
| { | ||
| LOG_PREFIX(OSD::open_meta_coll); | ||
| DEBUG("verifing this is the primary core before registering metadata collection"); |
There was a problem hiding this comment.
In case the assert would fail we would get abort message with the assertion condition.
Otherwise, we know we passed it.
How about:
ceph_assert(seastar::this_shard_id() == PRIMARY_CORE);
DEBUG("opening collection");
return store.get_sharded_store().open_collection
src/crimson/osd/osd.cc
Outdated
| }).then([this] { | ||
| return log_client.set_fsid(monc->get_fsid()); | ||
| }).then([this] { | ||
| DEBUG("osd: start, starting boot"); |
There was a problem hiding this comment.
No need for the prefix as we already have this prefix LOG_PREFIX(OSD::start); in this funciton.
src/crimson/osd/main.cc
Outdated
| ret, cpp_strerror(-ret))); | ||
| } | ||
|
|
||
| DEBUG("installing signal handlers"); |
There was a problem hiding this comment.
We are setting SIGHUP to be ignored here
src/crimson/osd/main.cc
Outdated
| logger().info("passed objectstore is {}", local_conf().get_val<std::string>("osd_objectstore")); | ||
| INFO("passed objectstore is {}", local_conf().get_val<std::string>("osd_objectstore")); | ||
|
|
||
| DEBUG("constructing OSD instance"); |
There was a problem hiding this comment.
How about "creating OSD"? Or alternatively add log line to OSD::OSD
There was a problem hiding this comment.
Added log line to OSD::OSD instead, seemed cleaner.
There was a problem hiding this comment.
We can remove l.243 as OSD::OSD was adjusted
src/crimson/osd/main.cc
Outdated
| DEBUG("running mkfs"); | ||
| auto osd_uuid = local_conf().get_val<uuid_d>("osd_uuid"); | ||
| if (osd_uuid.is_zero()) { | ||
| DEBUG("generating random osd uuid"); |
There was a problem hiding this comment.
uuid not specified, generating random osd uuid
src/crimson/osd/main.cc
Outdated
| config["osdspec-affinity"].as<std::string>()).get(); | ||
| } | ||
| if (config.count("mkkey") || config.count("mkfs")) { | ||
| DEBUG("administrative mode complete, exiting"); |
There was a problem hiding this comment.
nit,
"Exiting, mkkey {}, mkfs {}", config.count("mkkey").....
da8365e to
78aa116
Compare
|
@Matan-B I've addressed all the comments, thank you for the detailed review! You can see the final logs in the description, these are from the vstart cluster with these changes build. As for converting some of these functions to coroutines, I'll handle that in a followup PR as I'd like to test that more thoroughly, and I want these changes to land in main sooner to unblock testing my cephadm changes for debugging (#66811). If you could take a final look at the PR that'd be great, thanks! |
| ); | ||
| } | ||
| if (config.count("trace")) { | ||
| INFO("enabling trace logging"); |
There was a problem hiding this comment.
changed all logs above this to INFO as debug level is not set till config is parsed.
src/crimson/osd/main.cc
Outdated
| logger().info("passed objectstore is {}", local_conf().get_val<std::string>("osd_objectstore")); | ||
| INFO("passed objectstore is {}", local_conf().get_val<std::string>("osd_objectstore")); | ||
|
|
||
| DEBUG("constructing OSD instance"); |
There was a problem hiding this comment.
We can remove l.243 as OSD::OSD was adjusted
78aa116 to
58280a0
Compare
This commit adds verbose logs for each step of OSD startup to aid in debugging OSD startup failures. Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>
58280a0 to
5473386
Compare
This commit adds verbose logs for each step of OSD startup to aid in debugging OSD startup failures.
Logs in a vstart cluster:
Checklist
Show available Jenkins commands
jenkins test classic perfJenkins Job | Jenkins Job Definitionjenkins test crimson perfJenkins Job | Jenkins Job Definitionjenkins test signedJenkins Job | Jenkins Job Definitionjenkins test make checkJenkins Job | Jenkins Job Definitionjenkins test make check arm64Jenkins Job | Jenkins Job Definitionjenkins test submodulesJenkins Job | Jenkins Job Definitionjenkins test dashboardJenkins Job | Jenkins Job Definitionjenkins test dashboard cephadmJenkins Job | Jenkins Job Definitionjenkins test apiJenkins Job | Jenkins Job Definitionjenkins test docsReadTheDocs | Github Workflow Definitionjenkins test ceph-volume allJenkins Jobs | Jenkins Jobs Definitionjenkins test windowsJenkins Job | Jenkins Job Definitionjenkins test rook e2eJenkins Job | Jenkins Job DefinitionYou must only issue one Jenkins command per-comment. Jenkins does not understand
comments with more than one command.