Skip to content

Request body capturing - use EnableBuffering instead of EnableRewind and make it async#608

Merged
gregkalapos merged 13 commits intoelastic:masterfrom
gregkalapos:BodyCapturing30Fix
Nov 20, 2019
Merged

Request body capturing - use EnableBuffering instead of EnableRewind and make it async#608
gregkalapos merged 13 commits intoelastic:masterfrom
gregkalapos:BodyCapturing30Fix

Conversation

@gregkalapos
Copy link
Copy Markdown
Contributor

@gregkalapos gregkalapos commented Nov 18, 2019

Addressing #604 and making request body capturing async

EnableBuffering

Turns our that the HttpRequest.EnableRewind() method is gone in ASP.NET Core 3.0. We need that to be able to read the request body.

We pull the HttpRequest type through the Microsoft.AspNetCore.Http.Abstractions package.

Now, HttpRequest.EnableRewind() is gone, but a new method, HttpRequest.EnableBuffering() was introduced and that is available since version 2.1 of the Microsoft.AspNetCore.Http.Abstractions package.

Until this point we were at Microsoft.AspNetCore.Http.Abstractions 2.0, so we need to update in order to use the new EnableBuffering which works both on 2.1+ and 3.0.

With that we'd drop support for ASP.NET Core 2.0 and we'd only support it from 2.1+. I think this is fine, since Microsoft also dropped support for it already (see here) - 2.1 is the latest supported version.

Making request body capturing async

This was actually already a problem with EnableRewind - reading the request body happened synchronously and that can lead to thread starvation. The new method (EnableBuffering) by default throws an exception if you try to read the body synchronously - sync reading can be enabled (AllowSynchronousIO), but rather than "workarounding" I thought this is a good time to solve the real problem.

Therefore I also made reading of the request body async as it should be.

I know #577 did some changes in this area (although as I mentioned in that PR, I think the best would be to not change anything on the capturing itself, since that PR should only be focused on remote config - at least that would make our life easier). Anyway, whichever is merged earlier, I'm happy to help with merging and resolving conflicts - it won't be that bad.
Update: #577 merged to master and I merged master into this PR, all conflicts resolved ✅

@gregkalapos gregkalapos changed the title Request body capturing - use EnableBuffering instread of EnableRewind Request body capturing - use EnableBuffering instread of EnableRewind and make it async Nov 18, 2019
@gregkalapos gregkalapos marked this pull request as ready for review November 18, 2019 15:05
@gregkalapos
Copy link
Copy Markdown
Contributor Author

gregkalapos commented Nov 18, 2019

We have some CI issues - when I build locally, everything is fine.

[2019-11-18T15:31:37.409Z]          Restore failed in 86.46 ms for C:\Users\jenkins\workspace\tnet_apm-agent-dotnet-mbp_PR-608\apm-agent-dotnet\test\Elastic.Apm.Tests\Elastic.Apm.Tests.csproj.
[2019-11-18T15:31:37.409Z]          Resolving conflicts for .NETCoreApp,Version=v2.1...
[2019-11-18T15:31:37.409Z]          System.ArgumentNullException: Value cannot be null.
[2019-11-18T15:31:37.409Z]          Parameter name: path
[2019-11-18T15:31:37.409Z]             at System.IO.Path.GetFullPath(String path)
[2019-11-18T15:31:37.409Z]             at NuGet.Commands.RestoreCommand.GetAssetsFilePath(LockFile lockFile)
[2019-11-18T15:31:37.409Z]             at NuGet.Commands.RestoreCommand.ExecuteAsync(CancellationToken token)
[2019-11-18T15:31:37.409Z]             at NuGet.Commands.RestoreRunner.ExecuteAsync(RestoreSummaryRequest summaryRequest, CancellationToken token)
[2019-11-18T15:31:37.409Z]             at NuGet.Commands.RestoreRunner.ExecuteAndCommitAsync(RestoreSummaryRequest summaryRequest, CancellationToken token)
[2019-11-18T15:31:37.409Z]             at NuGet.Commands.RestoreRunner.CompleteTaskAsync(List`1 restoreTasks)
[2019-11-18T15:31:37.409Z]             at NuGet.Commands.RestoreRunner.RunAsync(IEnumerable`1 restoreRequests, RestoreArgs restoreContext, CancellationToken token)
[2019-11-18T15:31:37.409Z]             at NuGet.Commands.RestoreRunner.RunAsync(RestoreArgs restoreContext, CancellationToken token)
[2019-11-18T15:31:37.409Z]             at NuGet.Build.Tasks.RestoreTask.ExecuteAsync(ILogger log)
[2019-11-18T15:31:37.409Z]        Done executing task "RestoreTask" -- FAILED.
[2019-11-18T15:31:37.409Z]      1>Done building target "Restore" in project "ElasticApmAgent.sln" -- FAILED.
[2019-11-18T15:31:37.409Z]      1>Done Building Project "C:\Users\jenkins\workspace\tnet_apm-agent-dotnet-mbp_PR-608\apm-agent-dotnet\ElasticApmAgent.sln" (Restore target(s)) -- FAILED.
[2019-11-18T15:31:37.409Z] 

I'll dig deeper. Maybe related to the version bump - Probably it won't change the logic in the C# code.

Update: fixed, this is not related to NuGet, it was the async reading described here.

@xitaocrazy
Copy link
Copy Markdown

Hi @gregkalapos

In issue #604 you suggested to turn off request body capturing, but I didn't find this configuration in the documentation, sincerely I did not know that the Agent already capture this information, in the past when I asked about it, the answer was that the Agent didn't capture it. I think that was added this behavior, but nothing was modified in the documentation. Can you help me? there is some key to set in environment variables as ELASTIC_APM_CAPTURE_HEADERS ?

@gregkalapos
Copy link
Copy Markdown
Contributor Author

Hi @gregkalapos

In issue #604 you suggested to turn off request body capturing, but I didn't find this configuration in the documentation, sincerely I did not know that the Agent already capture this information, in the past when I asked about it, the answer was that the Agent didn't capture it. I think that was added this behavior, but nothing was modified in the documentation. Can you help me? there is some key to set in environment variables as ELASTIC_APM_CAPTURE_HEADERS ?

Hi @xitaocrazy,

Oh, you are right, our documentation is not uptodate on this - but we fixed this recently, the documentation on master already has this, you can find it here.

So, the current status is that it works on ASP.NET Core 2.0, 2.1, and 2.2, but does not work in 3.0, which should be fixed in this PR. The docs will be merged with the next release and then you'll find it also in the doc of the latest release.

Here you can find a sample appsettings.json file to turn on the setting. By default it's turned on. Also keep in mind that this adds overhead.

Sorry about causing confusion.

@xitaocrazy
Copy link
Copy Markdown

Hi @gregkalapos
In issue #604 you suggested to turn off request body capturing, but I didn't find this configuration in the documentation, sincerely I did not know that the Agent already capture this information, in the past when I asked about it, the answer was that the Agent didn't capture it. I think that was added this behavior, but nothing was modified in the documentation. Can you help me? there is some key to set in environment variables as ELASTIC_APM_CAPTURE_HEADERS ?

Hi @xitaocrazy,

Oh, you are right, our documentation is not uptodate on this - but we fixed this recently, the documentation on master already has this, you can find it here.

So, the current status is that it works on ASP.NET Core 2.0, 2.1, and 2.2, but does not work in 3.0, which should be fixed in this PR. The docs will be merged with the next release and then you'll find it also in the doc of the latest release.

Here you can find a sample appsettings.json file to turn on the setting. By default it's turned on. Also keep in mind that this adds overhead.

Sorry about causing confusion.

Hi @gregkalapos

Thank you for your attention, I will add this configuration here.

This was needed, because body capturing is async and the AspNetCoreDiagnosticListener implements an interface that dictates `public void OnNext()`, which is not a good place to call an async method - also already broke in CI.

var exception = kv.Value.GetType().GetTypeInfo().GetDeclaredProperty("exception").GetValue(kv.Value) as Exception;

var transaction = (Transaction)_agent.Tracer.CurrentTransaction;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No more cast 💪

@gregkalapos gregkalapos changed the title Request body capturing - use EnableBuffering instread of EnableRewind and make it async Request body capturing - use EnableBuffering instead of EnableRewind and make it async Nov 19, 2019
{
// In case an error handler middleware is registered, the catch block above won't be executed, because the
// error handler handles all the exceptions - in this case, based on the response code and the config, we may capture the body here
if (transaction != null && context?.Response.StatusCode >= 300 && transaction.Context?.Request?.Body is string body
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is "not generating error when error handler middleware is registered" an issue in ASP.NET Core 2.0 + "agent before this PR's changes" as well? Is this change related to the purpose of this PR?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't get the first part of the question.

Is this change related to the purpose of this PR?

Good point, I appreciate we follow questioning these thing!
Yes, it is very much related. I had a rant about it here. Since we read the request body asynchronously we can't do it in the void AspNetCoreDiagnosticListener.OnNext(...) method anymore - so the code that reads the body in case an error happened and is handled by an error handling middleware is moved here from the AspNetCoreDiagnosticListener.OnNext method.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I have two points of concern:

  1. This effectively changes the definition of an error. For example, do we want to treat HTTP status code 304 (Not Modified) as an error?
  2. I assume AspNetCoreDiagnosticListener.OnNext is called before we get to this point of ApmMiddleware.InvokeAsync which means error event's context will not have request body. If we are going to align with other agents and send request body in error event's context even for non-sampled transaction wouldn't it be a problem?

Is there a reason you didn't take approach mentioned in the comment

Another option would be to mark the transaction here and not do the reading and have a check in

Generating error event detected in AspNetCoreDiagnosticListener can be deferred to this point of the flow thus solving both points above.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think handling context?.Response.StatusCode >= 300 as error was really not a good idea, I changed it to context?.Response.StatusCode >= 400.

To your points:

  1. After doing the change mentioned above, not anymore. What typically an error handling middleware does is that it handles the exception, sets the return code to some http 4xxx or 5xx and returns an HTML page.
  2. Right, that makes it more challenging, I think what you say in your last sentence is a good approach. Nevertheless, I would not worry about it now, this is not something that the PR changes, this behaviour remains the same.

Is there a reason you didn't take approach mentioned in the comment

I did not find a good way to do it - like, adding a flag to transaction and things like that does not sound very nice to me, and this approach looks cleaner.

Long term, when we align on the error context capturing behaviour we can revisit this, but for now I'd leave that out from this PR.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not find a good way to do it - like, adding a flag to transaction and things like that does not sound very nice to me, and this approach looks cleaner.

I'm not sure what are the cons of adding a flag to transaction but please note that this change does break existing functionality - before this change error event captured in AspNetCoreDiagnosticListener.OnNext had request body (assuming configuration allowed it and transaction is sampled) but after this change it sometimes will and sometimes it won't. The tests pass because they check error event context after transaction is completed (and because we have this issue of not taking a snapshot of transaction context for error event) but in real life use it will be a data race between transaction thread and payload sender's thread - if error event is serialized to JSON before transaction completes serialized error event won't have request body.
I leave it to you to decide if the severity of this issue warrants additional work.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see what you mean, and yes that change is true indeed. But that code was kind of already ignoring this whole thing I feel (mainly because I think the error's context was not really thought through I feel).

We did basically this:

  • Create an error and then set its context pointing to the transactions context
  • Then, in a totally different call, set the body for the transaction's context, not thinking anything about the error's.

I'd just do another iteration on when and how we capture errors context - and remove the sharing of that instance, I think that's worth a whole new PR.

- Making sure we don't create transaction.context during body capturing if it was not created otherwsie
- Remove code duplication
- Reenable `transaction.IsContextCreated.Should().Be`
Remove unnecessary checks and introduced FillSampledTransactionContextRequest method.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants