Skip to content

Add Client.log_on_scheduler method to help correlating cluster logs with client-side events.#7179

Closed
hendrikmakait wants to merge 2 commits intodask:mainfrom
hendrikmakait:log_on_scheduler
Closed

Add Client.log_on_scheduler method to help correlating cluster logs with client-side events.#7179
hendrikmakait wants to merge 2 commits intodask:mainfrom
hendrikmakait:log_on_scheduler

Conversation

@hendrikmakait
Copy link
Member

This PR adds a Client.log_on_scheduler method that logs the provided message in the scheduler's logs. I implemented a similar helper for coiled/benchmarks/pull/473 and figured this might be generally helpful to allow users to inject additional timestamped information into the cluster logs.

  • Tests added / passed
  • Passes pre-commit run --all-files

@hendrikmakait hendrikmakait self-assigned this Oct 24, 2022
@github-actions
Copy link
Contributor

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

       15 files  ±  0         15 suites  ±0   6h 19m 9s ⏱️ - 15m 42s
  3 155 tests +  2    3 064 ✔️ +  1    85 💤 ±0    6 +1 
23 344 runs  +16  22 410 ✔️ +16  913 💤  - 1  21 +1 

For more details on these failures, see this check.

Results for commit a7b1540. ± Comparison against base commit 8f25111.

Copy link
Member

@jacobtomlinson jacobtomlinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a reasonable change to me if you want to ensure that a log message from the client is inserted at a specific point in the scheduler log (obviously the comm may delay this a little).

I wonder if we want to think a bit more about how logs are aggregated and viewed from the client, scheduler and workers though. Right now cluster objects have the ability to grab logs and show them next to each other, but perhaps we need to go further and store logs in a more modern timestamped format that can be interlaced accurately down the line. This could either be both in a logging platform outside of Dask or within the widgets we have now.

@hendrikmakait
Copy link
Member Author

@jacobtomlinson, totally agreed. I feel like Client.log_on_scheduler is a good (reasonable?) addition for the setup we have at the moment, but we should think more deeply about improving our logging (and tracing) so that there might not even be a need for the method in the long run. #4762 also comes to mind here.

@jacobtomlinson
Copy link
Member

Yeah, I totally agree this is a reasonable addition, but it definitely feels like a workaround.

@gjoseph92
Copy link
Collaborator

@hendrikmakait seems this is out of date now, do we still want to do this?

@hendrikmakait
Copy link
Member Author

@hendrikmakait seems this is out of date now, do we still want to do this?

I can resolve the merge conflicts but given a lack of interest and the fact that this is more of a workaround than a long-term improvement I'm also fine with closing.

@crusaderky
Copy link
Collaborator

-1 from me too. I'd rather have a nice way of putting client, scheduler and worker logs side by side.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants