mgr/balancer: reduce info-level logs#66068
Conversation
7f2f80f to
feccf2d
Compare
| self.log.debug('Skipping root %s (pools %s), total pgs %d ' | ||
| '< minimum %d (%d per osd)', | ||
| root, pools, | ||
| best_pe.total_by_root[root][key], | ||
| min_pgs, min_pg_per_osd) |
There was a problem hiding this comment.
Can you fix this so only "debug" is modified and not the spacing? This will just help minimize the overall number of changes.
src/pybind/mgr/balancer/module.py
Outdated
| self.no_optimization_needed = True | ||
| self.log.debug('unable to balance reads.') | ||
| return -errno.EALREADY, msg | ||
| self.log.info('prepared {} read changes'.format(total_num_changes)) | ||
| if total_num_changes != 0: | ||
| self.log.info('prepared {} read changes'.format(total_num_changes)) | ||
| if total_num_changes == 0: | ||
| self.log.debug('prepared {} read changes'.format(total_num_changes)) | ||
| self.no_optimization_needed = True | ||
| return -errno.EALREADY, msg |
There was a problem hiding this comment.
From lines 1142-1151, I think it would be cleaner to do this:
if total_num_changes > 0:
self.log.info('prepared {} read changes'.format(total_num_changes))
else:
self.log.debug('prepared {} read changes'.format(total_num_changes))
self.no_optimization_needed = True
return -errno.EALREADY, msg
When I originally wrote this, I differentiated between a negative number of changes and 0 changes. But, since it should be impossible to ever hit negative changes (we should still log it though if it somehow happens), I think it is better to combine it with the case of 0 total changes. In both cases, we set no_optimization_needed=True. And the error message applies to both cases. It would be obvious by the "prepared X changes" log message which case we hit.
There was a problem hiding this comment.
This sounds good I will implement change.
src/pybind/mgr/balancer/module.py
Outdated
| if total_did != 0: | ||
| self.log.info('prepared %d/%d upmap changes' % (total_did, max_optimizations)) | ||
| if total_did == 0: | ||
| self.log.debug('prepared %d/%d upmap changes' % (total_did, max_optimizations)) | ||
| self.no_optimization_needed = True | ||
| return -errno.EALREADY, 'Unable to find further optimization, ' \ | ||
| 'or pool(s) pg_num is decreasing, ' \ | ||
| 'or distribution is already perfect' |
There was a problem hiding this comment.
| if total_did != 0: | |
| self.log.info('prepared %d/%d upmap changes' % (total_did, max_optimizations)) | |
| if total_did == 0: | |
| self.log.debug('prepared %d/%d upmap changes' % (total_did, max_optimizations)) | |
| self.no_optimization_needed = True | |
| return -errno.EALREADY, 'Unable to find further optimization, ' \ | |
| 'or pool(s) pg_num is decreasing, ' \ | |
| 'or distribution is already perfect' | |
| if total_did > 0: | |
| self.log.info('prepared %d/%d upmap changes' % (total_did, max_optimizations)) | |
| else: | |
| self.log.debug('prepared %d/%d upmap changes' % (total_did, max_optimizations)) | |
| self.no_optimization_needed = True | |
| return -errno.EALREADY, 'Unable to find further optimization, ' \ | |
| 'or pool(s) pg_num is decreasing, ' \ | |
| 'or distribution is already perfect' |
Same as with do_read_balancing. We only care about logging "info" when the changes are greater than 0. In the "else" case, we will log 0 and negative changes. Negative changes should not ever happen, but it's good to log it just in case.
There was a problem hiding this comment.
I will also implement this as apart of a new commit.
| if left == max_iterations: | ||
| self.log.info('Balancing root %s (pools %s) by %s' % | ||
| (root, pools, key)) | ||
| else: | ||
| self.log.debug('Balancing root %s (pools %s) by %s' % | ||
| (root, pools, key)) |
There was a problem hiding this comment.
Is the idea here to only log the "balancing root" message as "info" one time per root at the beginning of the while loop, and then do the rest at "debug"?
There was a problem hiding this comment.
Yes you are correct that is the idea for this change.
|
jenkins retest this please |
In order to reduce the overall log load at info level, logs that were deemed unnecessary for users was moved to debug level. This includes logs that repeat information and logs that track no actual balancing changes. Signed-off-by: Timothy Q Nguyen <timqn22@gmail.com>
Improved the quality of one of the info level logs by adding more information to it and removing a similar log in debug level. This is in an effort to provide maximum information at info level without clogging up the info level logs with multiple logs. I've also added a check for the "balancing root" logs which are repeated multiple times without adding new information. Now only the first of the repeating "balancing root" logs is logged at info level, the rest will be logged in debug level. Signed-off-by: Timothy Q Nguyen <timqn22@gmail.com>
Improved logic of log code by reducing redundant lines and combining cases. Signed-off-by: Timothy Q Nguyen <timqn22@gmail.com>
38e40c2 to
9ef4391
Compare
|
The branch needed to be rebased to fix the doc failure (unrelated to the changes in this PR). Here are the steps I took: |
|
We can add the "needs-qa" label once the CI checks pass. |
|
RADOS Approved: https://tracker.ceph.com/issues/73968#note-3 |
|
Can one of the admins verify this patch? |
9 similar comments
|
Can one of the admins verify this patch? |
|
Can one of the admins verify this patch? |
|
Can one of the admins verify this patch? |
|
Can one of the admins verify this patch? |
|
Can one of the admins verify this patch? |
|
Can one of the admins verify this patch? |
|
Can one of the admins verify this patch? |
|
Can one of the admins verify this patch? |
|
Can one of the admins verify this patch? |
|
Can one of the admins verify this patch? |
3 similar comments
|
Can one of the admins verify this patch? |
|
Can one of the admins verify this patch? |
|
Can one of the admins verify this patch? |
This change includes the changing of many info level logs in the Balancer module to debug level and if/else logic to further filter info level logs to only show for when balancing action is actually happening. This is in an effort to reduce the overall amount of logs the Balancer produces for clients.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2395894
Upstream Tracker: https://tracker.ceph.com/issues/73656
Signed-off-by: Timothy Q Nguyen timqn22@gmail.com