Request body capturing - use EnableBuffering instead of EnableRewind and make it async#608
Request body capturing - use EnableBuffering instead of EnableRewind and make it async#608gregkalapos merged 13 commits intoelastic:masterfrom gregkalapos:BodyCapturing30Fix
Conversation
src/Elastic.Apm.AspNetCore/DiagnosticListener/AspNetCoreDiagnosticListener.cs
Outdated
Show resolved
Hide resolved
|
I'll dig deeper. Maybe related to the version bump - Probably it won't change the logic in the C# code. Update: fixed, this is not related to NuGet, it was the async reading described here. |
|
Hi @gregkalapos In issue #604 you suggested to turn off request body capturing, but I didn't find this configuration in the documentation, sincerely I did not know that the Agent already capture this information, in the past when I asked about it, the answer was that the Agent didn't capture it. I think that was added this behavior, but nothing was modified in the documentation. Can you help me? there is some key to set in environment variables as ELASTIC_APM_CAPTURE_HEADERS ? |
Hi @xitaocrazy, Oh, you are right, our documentation is not uptodate on this - but we fixed this recently, the documentation on So, the current status is that it works on ASP.NET Core 2.0, 2.1, and 2.2, but does not work in 3.0, which should be fixed in this PR. The docs will be merged with the next release and then you'll find it also in the doc of the latest release. Here you can find a sample Sorry about causing confusion. |
Hi @gregkalapos Thank you for your attention, I will add this configuration here. |
This was needed, because body capturing is async and the AspNetCoreDiagnosticListener implements an interface that dictates `public void OnNext()`, which is not a good place to call an async method - also already broke in CI.
|
|
||
| var exception = kv.Value.GetType().GetTypeInfo().GetDeclaredProperty("exception").GetValue(kv.Value) as Exception; | ||
|
|
||
| var transaction = (Transaction)_agent.Tracer.CurrentTransaction; |
| { | ||
| // In case an error handler middleware is registered, the catch block above won't be executed, because the | ||
| // error handler handles all the exceptions - in this case, based on the response code and the config, we may capture the body here | ||
| if (transaction != null && context?.Response.StatusCode >= 300 && transaction.Context?.Request?.Body is string body |
There was a problem hiding this comment.
Is "not generating error when error handler middleware is registered" an issue in ASP.NET Core 2.0 + "agent before this PR's changes" as well? Is this change related to the purpose of this PR?
There was a problem hiding this comment.
Sorry, I don't get the first part of the question.
Is this change related to the purpose of this PR?
Good point, I appreciate we follow questioning these thing!
Yes, it is very much related. I had a rant about it here. Since we read the request body asynchronously we can't do it in the void AspNetCoreDiagnosticListener.OnNext(...) method anymore - so the code that reads the body in case an error happened and is handled by an error handling middleware is moved here from the AspNetCoreDiagnosticListener.OnNext method.
There was a problem hiding this comment.
I see. I have two points of concern:
- This effectively changes the definition of an error. For example, do we want to treat HTTP status code 304 (Not Modified) as an error?
- I assume
AspNetCoreDiagnosticListener.OnNextis called before we get to this point ofApmMiddleware.InvokeAsyncwhich means error event's context will not have request body. If we are going to align with other agents and send request body in error event's context even for non-sampled transaction wouldn't it be a problem?
Is there a reason you didn't take approach mentioned in the comment
Another option would be to mark the transaction here and not do the reading and have a check in
Generating error event detected in AspNetCoreDiagnosticListener can be deferred to this point of the flow thus solving both points above.
There was a problem hiding this comment.
I think handling context?.Response.StatusCode >= 300 as error was really not a good idea, I changed it to context?.Response.StatusCode >= 400.
To your points:
- After doing the change mentioned above, not anymore. What typically an error handling middleware does is that it handles the exception, sets the return code to some http 4xxx or 5xx and returns an HTML page.
- Right, that makes it more challenging, I think what you say in your last sentence is a good approach. Nevertheless, I would not worry about it now, this is not something that the PR changes, this behaviour remains the same.
Is there a reason you didn't take approach mentioned in the comment
I did not find a good way to do it - like, adding a flag to transaction and things like that does not sound very nice to me, and this approach looks cleaner.
Long term, when we align on the error context capturing behaviour we can revisit this, but for now I'd leave that out from this PR.
There was a problem hiding this comment.
I did not find a good way to do it - like, adding a flag to transaction and things like that does not sound very nice to me, and this approach looks cleaner.
I'm not sure what are the cons of adding a flag to transaction but please note that this change does break existing functionality - before this change error event captured in AspNetCoreDiagnosticListener.OnNext had request body (assuming configuration allowed it and transaction is sampled) but after this change it sometimes will and sometimes it won't. The tests pass because they check error event context after transaction is completed (and because we have this issue of not taking a snapshot of transaction context for error event) but in real life use it will be a data race between transaction thread and payload sender's thread - if error event is serialized to JSON before transaction completes serialized error event won't have request body.
I leave it to you to decide if the severity of this issue warrants additional work.
There was a problem hiding this comment.
Oh, I see what you mean, and yes that change is true indeed. But that code was kind of already ignoring this whole thing I feel (mainly because I think the error's context was not really thought through I feel).
We did basically this:
- Create an error and then set its context pointing to the transactions context
- Then, in a totally different call, set the body for the transaction's context, not thinking anything about the error's.
I'd just do another iteration on when and how we capture errors context - and remove the sharing of that instance, I think that's worth a whole new PR.
- Making sure we don't create transaction.context during body capturing if it was not created otherwsie - Remove code duplication - Reenable `transaction.IsContextCreated.Should().Be`
Remove unnecessary checks and introduced FillSampledTransactionContextRequest method.
Addressing #604 and making request body capturing async
EnableBuffering
Turns our that the
HttpRequest.EnableRewind()method is gone in ASP.NET Core 3.0. We need that to be able to read the request body.We pull the
HttpRequesttype through theMicrosoft.AspNetCore.Http.Abstractionspackage.Now,
HttpRequest.EnableRewind()is gone, but a new method,HttpRequest.EnableBuffering()was introduced and that is available since version2.1of theMicrosoft.AspNetCore.Http.Abstractionspackage.Until this point we were at
Microsoft.AspNetCore.Http.Abstractions2.0, so we need to update in order to use the newEnableBufferingwhich works both on2.1+and3.0.With that we'd drop support for ASP.NET Core 2.0 and we'd only support it from
2.1+. I think this is fine, since Microsoft also dropped support for it already (see here) -2.1is the latest supported version.Making request body capturing async
This was actually already a problem with
EnableRewind- reading the request body happened synchronously and that can lead to thread starvation. The new method (EnableBuffering) by default throws an exception if you try to read the body synchronously - sync reading can be enabled (AllowSynchronousIO), but rather than "workarounding" I thought this is a good time to solve the real problem.Therefore I also made reading of the request body async as it should be.
I know #577 did some changes in this area (although as I mentioned in that PR, I think the best would be to not change anything on the capturing itself, since that PR should only be focused on remote config - at least that would make our life easier). Anyway, whichever is merged earlier, I'm happy to help with merging and resolving conflicts - it won't be that bad.Update: #577 merged to
masterand I mergedmasterinto this PR, all conflicts resolved ✅