Skip to content

RGW: Add more information to the "req done" log message#15263

Closed
jharbott wants to merge 1 commit intoceph:masterfrom
cloudbau:wip-rgw-logging
Closed

RGW: Add more information to the "req done" log message#15263
jharbott wants to merge 1 commit intoceph:masterfrom
cloudbau:wip-rgw-logging

Conversation

@jharbott
Copy link

As discussed in [1], add some further information when logging a finished
request in rgw:

  • op_name
  • user_id
  • domain
  • request_uri
  • duration

In the end it would be nice to aggregate all this information together
with the additional data that civetweb is keeping, in particular
remote_addr and num_bytes_sent, but this seems useful as a first step.

[1] http://tracker.ceph.com/issues/19917

Signed-off-by: Jens Rosenboom j.rosenboom@x-ion.de

@mattbenjamin
Copy link
Contributor

@jrosenboom this info is a subset of what is in the ops log

@jharbott
Copy link
Author

Yes, but for a high-volume site, we cannot afford to have more detailed logs, still we need this information in order to analyse slow requests or other issues. With about 1k requests/s, we already have a couple of GB of logs per day with only the minimal log levels active.

Copy link
Contributor

@mattbenjamin mattbenjamin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we avoid declaring extra temp vars?


int http_ret = s->err.http_ret;
int op_ret = 0;
string op_name = "unknown";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to make this copy?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I guess one can setup some conditionals directly in the dout, but I'm not sure whether the added complexity will make up for added efficiency. The difficulty comes from the fact that this part of the code can also be reached via some error exits when things have gone wrong before, so I wanted to rather be on the safe side. Not sure whether the tests will handle all error situations so that one could be sure that there are no possibilities for core dumps introduced.

Feel free to propose an update, though.

handler->put_op(op);
rest->put_handler(handler);

string user_id = "nobody";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to make this copy?

ts = ceph_clock_now();
}

utime_t RGWRequest::get_elapsed_time() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Contributor

@mikulely mikulely left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jrosenboom for bringing this, I'm interested about the get_elapsed_time part, it matches the meaning of s3 bucket logging log field `Turn-Around Time' (http://docs.aws.amazon.com/AmazonS3/latest/dev/LogFormat.html) . I'd like to log this field in the rgw ops log later.

Link to rgw: s3 bucket logging/server access logging #14841


dout(1) << "====== req done req=" << hex << req << dec
<< " op status=" << op_ret
<< " op_name=" << op_name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would prefer:

<< " op_name=" << (op ? op->name() : "unknown")

to take consistent with existing usage.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

dout(1) << "====== req done req=" << hex << req << dec
<< " op status=" << op_ret
<< " op_name=" << op_name
<< " user_id=" << user_id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would prefer:

<< " user_id=" << s->user->user_id

We should user_id assigned during authentication phrase.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can still be bad, if we reach this via some of the early error gotos. I added another conditional similar to the one above.

Copy link
Contributor

@mikulely mikulely left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jrosenboom Please see my comment

dout(1) << "====== req done req=" << hex << req << dec
<< " op status=" << op_ret
<< " op_name=" << (op ? op->name() : "unknown")
<< " user_id=" << (s->user ? s->user->user_id.to_str() : "nobody")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we replace "nobody" with "" ? Because nobody is a valid uid, which can be created by user with radsgw-admin cmd.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

As discussed in [1], add some further information when logging a finished
request in rgw:

- op_name
- user_id
- domain
- request_uri
- request_params
- obj_size
- duration

With the duration being show here, we do not need the request start
being logged, move it to a higher level.

In the end it would be nice to aggregate all this information together
with the additional data that civetweb is keeping, in particular
remote_addr and num_bytes_sent, but this seems useful as a first step.

[1] http://tracker.ceph.com/issues/19917

Signed-off-by: Jens Rosenboom <j.rosenboom@x-ion.de>
@jharbott
Copy link
Author

Updated with two more fields being output and a bit of amending the log levels.

@smithfarm
Copy link
Contributor

Jenkins re-test this please

@stale
Copy link

stale bot commented Oct 18, 2018

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
If you are a maintainer or core committer, please follow-up on this issue to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@stale stale bot added the stale label Oct 18, 2018
@jharbott jharbott closed this Jan 11, 2019
@ushacow
Copy link

ushacow commented Jun 21, 2019

Hi! So Was it implemented? I'm searching info how to add a duration of request to civetweb log

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants