Skip to content

fix: stop calling end in createDupeReqFilterMiddleware.ts, so that dapps get correct response to 'seenRequests'#24672

Merged
danjm merged 17 commits intodevelopfrom
fix-provider-request-retry-bug
May 29, 2024
Merged

fix: stop calling end in createDupeReqFilterMiddleware.ts, so that dapps get correct response to 'seenRequests'#24672
danjm merged 17 commits intodevelopfrom
fix-provider-request-retry-bug

Conversation

@danjm
Copy link
Copy Markdown
Contributor

@danjm danjm commented May 21, 2024

Description

This PR fixes the following problem:

  1. Dapp sends a request to MetaMask shortly after the service worker starts up, and before the inpage provider receives the METAMASK_EXTENSION_CONNECT_CAN_RETRY message from the content-script (which happens after the metamask controller initializes and notifies all connections of a chainChanged event)
  2. The request hits the middleware created by createDupeReqFilterMiddleware.ts and its id is added to seenRequestIds
  3. Before the necessary controller responds to the request, the inpage provider now receives the METAMASK_EXTENSION_CONNECT_CAN_RETRY message from the content-script
  4. The provider now retries the request. This happens without the dapp doing anything; this is part of our MV3 retry logic to ensure requests don't get lost under service worker stoppage / start-up conditions. (The provider does this via the imported createStreamMiddleware https://github.com/MetaMask/json-rpc-middleware-stream/blob/main/src/createStreamMiddleware.ts#L128-L130)
  5. The new retry of the request hits the middleware created by createDupeReqFilterMiddleware. (The original request still has not been responded to.) The new retry of the request has the same id as the original, so the middleware hits the } else if (!seenRequestIds.add(req.id)) { condition and calls end()
  6. The original request, which was being awaited by the dapp, now is resolved but without a meaningful response (as the middleware just did an end() call without the request being handled in any way)

This problem was discovered by some e2e tests which became very flaky under MV3. Tests that involved the Simple Snap Keyring dapp would often send a wallet_requestSnaps request, and then not get the necessary response, due to the issue described above.

The solution to the problem presented in this PR is just to not call end when seeing a duplicate request, because the original request should be responded to eventually.

Open in GitHub Codespaces

Related issues

Part of the resolution to #21496

Manual testing steps

The bug is hard to repro manually. You could, perhaps, locally modify the createStreamMiddleware code to call sendToStream a second time for the same request after a short (e.g. 10 millisecond) timeout (perhaps here: https://github.com/MetaMask/json-rpc-middleware-stream/blob/main/src/createStreamMiddleware.ts#L50-L60). On develop that should result in the original request receiving an empty response but on this branch the original request should receive a successful response.

This branch will also contribute to MV3 e2e tests passing, but that will require some other PRs as well.

Pre-merge author checklist

  • I’ve followed MetaMask Coding Standards.
  • I've completed the PR template to the best of my ability
  • I’ve included tests if applicable
  • I’ve documented my code using JSDoc format if applicable
  • I’ve applied the right labels on the PR (see labeling guidelines). Not required for external contributors.

Pre-merge reviewer checklist

  • I've manually tested the PR (e.g. pull and build branch, run the app, test code being changed).
  • I confirm that this PR addresses all acceptance criteria described in the ticket it closes and includes the necessary testing evidence such as recordings and or screenshots.

@danjm danjm requested a review from a team as a code owner May 21, 2024 09:10
@github-actions
Copy link
Copy Markdown
Contributor

CLA Signature Action: All authors have signed the CLA. You may need to manually re-run the blocking PR check if it doesn't pass in a few minutes.

@danjm danjm added the team-extension-platform Extension Platform team label May 21, 2024
@metamaskbot metamaskbot added the INVALID-PR-TEMPLATE PR's body doesn't match template label May 21, 2024
@metamaskbot
Copy link
Copy Markdown
Collaborator

Builds ready [4a67825]
Page Load Metrics (1375 ± 632 ms)
PlatformPageMetricMin (ms)Max (ms)Average (ms)StandardDeviation (ms)MarginOfError (ms)
ChromeHomefirstPaint682081123316
domContentLoaded108320168
load55343713751316632
domInteractive108320168
Bundle size diffs [🚀 Bundle size reduced!]
  • background: -15 Bytes (-0.00%)
  • ui: 0 Bytes (0.00%)
  • common: 0 Bytes (0.00%)

@codecov
Copy link
Copy Markdown

codecov bot commented May 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 65.67%. Comparing base (31fc52f) to head (44ac712).
Report is 25 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop   #24672      +/-   ##
===========================================
- Coverage    65.68%   65.67%   -0.01%     
===========================================
  Files         1360     1360              
  Lines        54103    54113      +10     
  Branches     14064    14060       -4     
===========================================
+ Hits         35533    35536       +3     
- Misses       18570    18577       +7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Copy Markdown
Member

@rekmarks rekmarks May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This suddenly strikes me as a potentially good idea:

Suggested change
log.info(`RPC request with id "${req.id}" already seen.`);
log.debug(`Ignoring JSON-RPC request with already seen id "${req.id}".`);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More explicit more better, maybe.

Suggested change
return function filterDuplicateRequestMiddleware(req, _res, next) {
return function filterDuplicateRequestMiddleware(req, _res, next, _end) {

Copy link
Copy Markdown
Member

@rekmarks rekmarks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe I understand the problem, but I'm not sure if we can use this solution. While it will create the desired behavior, it will result in a perpetually hung promise for each duplicate request (see @metamask/json-rpc-engine here and here).

I think we need to set a bogus res.id and end() the request (I think ideally with an error, but it doesn't really matter) in order to hit one of these cases in createStreamMiddleware() on the dapp side. In this way, the response to the duplicate request will be ignored on the dapp side, while the response to the original request will be handled as normal.

Edit: I also discovered this bug in json-rpc-middleware-stream: MetaMask/core#4308

Copy link
Copy Markdown
Member

@rekmarks rekmarks May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per my review, here's what I think will work:

Suggested change
}
res.id = null; // Any id we're certain is bogus and collision-free per idRemapMiddleware will do
return end(new Error(`Received request with duplicate id "${req.id}".`));
}

@Gudahtt
Copy link
Copy Markdown
Member

Gudahtt commented May 22, 2024

Perhaps it would be simpler to move this "ignore duplicate requests" step to before we pass the request into the JSON-RPC engine. Then we'd have no hung promises, and we wouldn't need to mangle the request or artificially end

@danjm
Copy link
Copy Markdown
Contributor Author

danjm commented May 23, 2024

Perhaps it would be simpler to move this "ignore duplicate requests" step to before we pass the request into the JSON-RPC engine.

Do you mean we would do that from within the provider? I probably don't understand something here... the retryStuckRequests call happens in the stream middleware after the request has been passed to the JSON-RPC engine. If we move this "ignore duplicate requests" logic to before that, then couldn't duplicate requests make it to the extension background?

Or do you mean that we will handle these duplicate requests within the metamask background / controller, after the message has been received from the dapp, but before it is streamed into the engine?

@legobeat
Copy link
Copy Markdown
Contributor

legobeat commented May 23, 2024

I believe I understand the problem, but I'm not sure if we can use this solution. While it will create the desired behavior, it will result in a perpetually hung promise for each duplicate request (see @metamask/json-rpc-engine here and here).

@rekmarks In case it makes a difference, those links are to latest version of @metamask/json-rpc-engine 8.0.2, while metamask-extension is still using much older version json-rpc-engine 6.1.0.

@rekmarks
Copy link
Copy Markdown
Member

rekmarks commented May 23, 2024

Or do you mean that we will handle these duplicate requests within the metamask background / controller, after the message has been received from the dapp, but before it is streamed into the engine?

@danjm, to the best of my knowledge, it would have to be this. I agree with @Gudahtt that this would be in many ways a more elegant solution. It could essentially be a middleware in the form of a through / Transform stream that we tack on to the stream pipeline before the providerStream on the inbound side only, here.

danjm added 4 commits May 23, 2024 17:24
…hat previously 'seenRequests' always result in an eventual, meaningful, response to the dapp
…ream, and apply it to the stream pipeline, before the providerstream on the inbound side
@danjm danjm force-pushed the fix-provider-request-retry-bug branch from 4a67825 to 84e947f Compare May 24, 2024 11:16
@danjm
Copy link
Copy Markdown
Contributor Author

danjm commented May 24, 2024

I just did a force push, but I left the original 3 commits unchanged, I was just handling the rebase on latest develop

@danjm
Copy link
Copy Markdown
Contributor Author

danjm commented May 24, 2024

this would be in many ways a more elegant solution. It could essentially be a middleware in the form of a through / Transform stream that we tack on to the stream pipeline before the providerStream on the inbound side only,

@Gudahtt @rekmarks I have done this in the latest commit 84e947f

@danjm
Copy link
Copy Markdown
Contributor Author

danjm commented May 24, 2024

@metamaskbot update-policies

@metamaskbot
Copy link
Copy Markdown
Collaborator

Policies updated

@metamaskbot metamaskbot requested review from a team as code owners May 24, 2024 12:07
rekmarks
rekmarks previously approved these changes May 24, 2024
Copy link
Copy Markdown
Member

@rekmarks rekmarks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All my comments are nits. Nice tests, LGTM!

unknown,
void
> {
export default function createDupeReqFilterThroughStream() {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to @legobeat's ongoing work to update our stream dependencies, I think this is likely to be replaced with a Transform in the not-too-distant future:

Suggested change
export default function createDupeReqFilterThroughStream() {
export default function createDupeReqFilterStream() {

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 42ef9da

expect(output).toEqual(expectedOutputAfterExpiryTime);
});

it('does not expire single id after two minutes', () => {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TWO_MINUTES constant is perhaps a bit overkill; it ought to be enough with THREE_MINUTES - 1.

Suggested change
it('does not expire single id after two minutes', () => {
it('does not expire single id after less than three minutes', () => {

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 96f050e

danjm and others added 2 commits May 24, 2024 19:04
Co-authored-by: Erik Marks <25517051+rekmarks@users.noreply.github.com>
Co-authored-by: Erik Marks <25517051+rekmarks@users.noreply.github.com>
DDDDDanica pushed a commit that referenced this pull request May 25, 2024
## **Description**

The lavamoat policy file generation on develop is not accounting for the
offscreen document when the polcies are being generated. This is causing
MV3 tests to fail because `@trezor/connect-web`, which is only imported
in the offscreen, is not covered by the lavamoat policy, resulting in an
error being thrown in the offscreen document.

The fix is to set ENABLE_MV3=true for `lavamoat:webapp:auto` and
`lavamoat:webapp:auto:ci`, resulting in the offscreen document being
included in the files used policy generation. Policy files then also
need to be updated.

[![Open in GitHub
Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/MetaMask/metamask-extension/pull/24773?quickstart=1)

## **Manual testing steps**

Snaps MV3 tests should pass on
https://github.com/MetaMask/metamask-extension/tree/provider-retry-fix-plus-mv3-policy-fix,
which is a combination of this branch and the branch for
#24672, both of which
are needed to get snaps mv3 tests passing.

## **Pre-merge author checklist**

- [ ] I’ve followed [MetaMask Coding
Standards](https://github.com/MetaMask/metamask-extension/blob/develop/.github/guidelines/CODING_GUIDELINES.md).
- [ ] I've completed the PR template to the best of my ability
- [ ] I’ve included tests if applicable
- [ ] I’ve documented my code using [JSDoc](https://jsdoc.app/) format
if applicable
- [ ] I’ve applied the right labels on the PR (see [labeling
guidelines](https://github.com/MetaMask/metamask-extension/blob/develop/.github/guidelines/LABELING_GUIDELINES.md)).
Not required for external contributors.

## **Pre-merge reviewer checklist**

- [ ] I've manually tested the PR (e.g. pull and build branch, run the
app, test code being changed).
- [ ] I confirm that this PR addresses all acceptance criteria described
in the ticket it closes and includes the necessary testing evidence such
as recordings and or screenshots.

---------

Co-authored-by: MetaMask Bot <metamaskbot@users.noreply.github.com>
@danjm
Copy link
Copy Markdown
Contributor Author

danjm commented May 27, 2024

@metamaskbot update-policies

@metamaskbot
Copy link
Copy Markdown
Collaborator

Policies updated

@metamaskbot
Copy link
Copy Markdown
Collaborator

Builds ready [7ae9c9f]
Page Load Metrics (313 ± 337 ms)
PlatformPageMetricMin (ms)Max (ms)Average (ms)StandardDeviation (ms)MarginOfError (ms)
ChromeHomefirstPaint6912894157
domContentLoaded9231342
load572575313702337
domInteractive9231342
Bundle size diffs [🚨 Warning! Bundle size has increased!]
  • background: 91 Bytes (0.00%)
  • ui: 0 Bytes (0.00%)
  • common: 0 Bytes (0.00%)

Copy link
Copy Markdown
Member

@rekmarks rekmarks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could touch up some variable names since we stopped using through2.

@danjm danjm force-pushed the fix-provider-request-retry-bug branch from 2fae014 to d20349a Compare May 27, 2024 23:07
@metamaskbot
Copy link
Copy Markdown
Collaborator

Builds ready [d20349a]
Page Load Metrics (566 ± 465 ms)
PlatformPageMetricMin (ms)Max (ms)Average (ms)StandardDeviation (ms)MarginOfError (ms)
ChromeHomefirstPaint662121103316
domContentLoaded9451794
load542789566968465
domInteractive9451794
Bundle size diffs [🚨 Warning! Bundle size has increased!]
  • background: 120 Bytes (0.00%)
  • ui: 0 Bytes (0.00%)
  • common: 0 Bytes (0.00%)

return next();
};
const hasNoId = chunk.id === undefined;
const requestNotYetSeen = seenRequestIds.add(chunk.id);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, is it ok to add undefined in this set? cause chunk.id might effectively be undefined.

This logic differs a bit from the original implementation that was checking for req.id === undefined first.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is okay to add undefined. Whether or not undefined is in the set, hasNoId will be true for all undefined ids, and so the below logic will have the same result (regardless of the presence of undefined in the set).

@metamaskbot
Copy link
Copy Markdown
Collaborator

Builds ready [44ac712]
Page Load Metrics (778 ± 513 ms)
PlatformPageMetricMin (ms)Max (ms)Average (ms)StandardDeviation (ms)MarginOfError (ms)
ChromeHomefirstPaint661501022311
domContentLoaded96214115
load5526367781069513
domInteractive96214115
Bundle size diffs [🚨 Warning! Bundle size has increased!]
  • background: 120 Bytes (0.00%)
  • ui: 0 Bytes (0.00%)
  • common: 0 Bytes (0.00%)

Copy link
Copy Markdown
Member

@rekmarks rekmarks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@danjm danjm merged commit 3d861ca into develop May 29, 2024
@danjm danjm deleted the fix-provider-request-retry-bug branch May 29, 2024 03:03
@github-actions github-actions bot locked and limited conversation to collaborators May 29, 2024
@metamaskbot metamaskbot added release-11.18.0 release-11.16.6 Issue or pull request that will be included in release 11.16.6 and removed release-11.18.0 labels May 29, 2024
@metamaskbot
Copy link
Copy Markdown
Collaborator

Missing release label release-11.16.6 on PR. Adding release label release-11.16.6 on PR and removing other release labels(release-11.18.0), as PR was cherry-picked in branch 11.16.6.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

INVALID-PR-TEMPLATE PR's body doesn't match template release-11.16.6 Issue or pull request that will be included in release 11.16.6 team-extension-platform Extension Platform team

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

6 participants