fix(perps): optimize connection speed and consolidate architecture - manual cp into 7.57.0#21245
Merged
Merged
Conversation
…-7.57.0 (#20946) ## **Description** Second part of Perps connection optimizations building on PR #20918. This PR delivers performance improvements, reliability fixes, and architectural cleanup for the Perps connection system. **Performance (TAT-1576): Faster HyperLiquid Connection** Replaced blocking `getAccountState()` API calls with lightweight WebSocket health checks: - **Initial connection**: Removed blocking HTTP call + stale connection check - **Account switches**: Removed redundant API call after provider reinitialization - **New validation**: Added `provider.ping()` WebSocket health check (5s timeout) instead of full API calls Account data is now efficiently fetched via WebSocket subscriptions during preload, eliminating blocking HTTP requests and retry logic. **Reliability Improvements** Connection timeout protection (TAT-1576): - Added 30-second timeout for connection attempts with new `CONNECTION_TIMEOUT` error code - Prevents indefinite hanging on network issues - Clear user feedback when connection fails Background reconnection fix (TAT-1559): - Added 300ms stabilization delay when app returns from background to prevent race conditions - Eliminates "Perps is temporarily offline" errors on app wake-up **Architecture: Error Logging & Code Consolidation** - **Standardized error logging**: Added `getErrorContext()` helper and comprehensive Sentry logging to 20+ catch blocks with consistent context (feature, context, provider, network) - **Performance metrics refactoring**: Changed naming from `perps_x_y` to `perps.x.y` for hierarchical grouping and easier filtering in Sentry dashboards - **Removed code duplication**: Eliminated `Controller.reconnectWithNewContext()` - Manager calls `initializeProviders()` directly - **Fixed retry button**: Now uses `reconnectWithNewContext({ force: true })` for proper WebSocket disconnect - **Cleaner API**: Extracted `usePerpsConnection` hook to separate file, added `ReconnectOptions` interface - **Comprehensive docs**: Created `PERPS_CONNECTION_ARCHITECTURE.md` with Mermaid diagrams explaining connection lifecycle and race condition handling **Impact:** - ⚡ **Faster**: Connection and reconnection complete faster by eliminating blocking HTTP calls - 🛡️ **Reliable**: Background reconnection works consistently; timeout protection prevents hanging - 🐛 **Debuggable**: Standardized error logging with consistent context helps diagnose production issues in Sentry - 🧹 **Maintainable**: Cleaner code with reduced duplication and clearer separation of concerns ## **Changelog** CHANGELOG entry: null ## **Related issues** Depends on: #20918 Fixes: TAT-1576, TAT-1559 ## **Manual testing steps** ```gherkin Feature: Perps connection performance and reliability Scenario: faster initial connection Given Perps has never been connected in this session When user opens Perps tab Then connection establishes without blocking HTTP calls And loading time is noticeably faster than before And positions/orders appear without delay Scenario: faster account switching Given user is connected to Perps with Account A And user has positions open When user switches to Account B Then reconnection completes faster without getAccountState() call And Account B's positions load via WebSocket subscriptions And no Account A data leaks through Scenario: retry button performs full reconnection Given Perps fails to connect with error message When user taps "Retry Connection" button Then connection fully reinitializes including WebSocket disconnect And error is logged to Sentry with proper context Scenario: rapid account switches are serialized Given user is connected to Perps with Account A When user switches to Account B And immediately switches to Account C before B connection completes Then reconnections are serialized (no race conditions) And final connection uses Account C's address Scenario: app wake-up after inactivity (TAT-1559) Given app has been in background for extended period And user had Perps connected before backgrounding When user returns to app and opens Perps tab Then connection establishes after 300ms stabilization delay And "Perps is temporarily offline" error does not appear And positions/orders load successfully ``` ## **Screenshots/Recordings** ### **Before** - Initial connection: ~3 blocking operations (stale check + HTTP API call + provider init) - Account switches: Provider init + HTTP API call with retry logic - No timeout protection (could hang indefinitely) - Background reconnection failures on app wake-up ### **After** - Initial connection: 1 operation (provider init + WebSocket ping health check) - Account switches: 1 operation (provider init + WebSocket ping health check) - 30-second connection timeout with user feedback - Reliable background reconnection with 300ms stabilization delay - Account data fetched via WebSocket subscriptions during preload ## **Pre-merge author checklist** - [x] I've followed [MetaMask Contributor Docs](https://github.com/MetaMask/contributor-docs) and [MetaMask Mobile Coding Standards](https://github.com/MetaMask/metamask-mobile/blob/main/.github/guidelines/CODING_GUIDELINES.md). - [x] I've completed the PR template to the best of my ability - [x] I've included tests if applicable - [x] I've documented my code using [JSDoc](https://jsdoc.app/) format if applicable - [x] I've applied the right labels on the PR (see [labeling guidelines](https://github.com/MetaMask/metamask-mobile/blob/main/.github/guidelines/LABELING_GUIDELINES.md)). Not required for external contributors. ## **Pre-merge reviewer checklist** - [ ] I've manually tested the PR (e.g. pull and build branch, run the app, test code being changed). - [ ] I confirm that this PR addresses all acceptance criteria described in the ticket it closes and includes the necessary testing evidence such as recordings and or screenshots. ## **Notes for Reviewers** ### Key Changes **Performance (TAT-1576):** - `PerpsConnectionManager.ts`: Removed blocking `getAccountState()` HTTP calls and retry logic - Replaced with `provider.ping()` WebSocket health check (5s timeout) - Account data now fetched via `streamManager.account.prewarm()` during preload subscriptions **Reliability:** - Added connection timeout protection: 30-second limit with `CONNECTION_TIMEOUT` error code - Background reconnection delay: 300ms stabilization period when app returns from background (fixes TAT-1559) - All WebSocket operations now have timeout protection **Error Logging:** - New `getErrorContext()` helper for consistent Sentry context across 20+ catch blocks - Performance metric names refactored to `perps.{category}.{metric}` format for better Sentry dashboards - Example: `perps_websocket_connection` → `perps.websocket.connection_establishment` **Architecture:** - **NEW**: `hooks/usePerpsConnection.ts` - Extracted hook from provider - **NEW**: `docs/perps/PERPS_CONNECTION_ARCHITECTURE.md` - Architecture guide with Mermaid diagrams - **NEW**: `types/index.ts` - Added `ReconnectOptions` interface - **REMOVED**: `Controller.reconnectWithNewContext()` - Manager now calls `initializeProviders()` directly - **UPDATED**: Retry button now uses `reconnectWithNewContext({ force: true })` ### Testing Focus Areas 1. **Connection speed**: Initial Perps tab open should be noticeably faster without blocking HTTP calls 2. **Account switching**: Rapid switches (3+ accounts) should serialize correctly without data leaks 3. **Background reconnection**: App wake-up after extended background time should not show "temporarily offline" error 4. **Connection timeout**: Simulate network issues - should timeout after 30s with clear error message 5. **Retry button**: Should perform full reconnection with WebSocket disconnect <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Replaces blocking HTTP checks with a WS ping for faster connects, adds 30s connection timeout and 300ms foreground delay, consolidates error logging/metrics, and refactors connection APIs/hooks. > > - **Performance & Reliability**: > - Replace `getAccountState()` health checks with `provider.ping()` WS health check; remove stale-connection HTTP checks. > - Add 30s connection timeout (`CONNECTION_TIMEOUT`) and 300ms post-foreground delay; preload account data via subscriptions. > - **Architecture & APIs**: > - Remove `PerpsController.reconnectWithNewContext`; Manager now calls `initializeProviders()`; retry uses `reconnectWithNewContext({ force: true })`. > - Extract `usePerpsConnection` hook; add `ReconnectOptions`; switch imports from provider to hook. > - **Logging & Metrics**: > - Standardize Sentry logging with `ensureError()` and controller/provider `getErrorContext()`; update 20+ catch blocks. > - Rename metrics to hierarchical `perps.*` (e.g., `perps.websocket.*`, `perps.screen.*`). > - **Provider & Services**: > - Implement `ping(timeoutMs?)` in `HyperLiquidProvider`; add WS client typing; improve subscription error handling. > - Misc fixes in `PerpsStreamManager` and subscription services (error logging, cache/prewarm cleanup). > - **Config & i18n**: > - Update `PERPS_CONSTANTS` (timeouts, ping); add `connectionTimeout` i18n string. > - **Docs & Tests**: > - Add connection architecture docs; update/add extensive tests across hooks, controller, provider, services. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit db00eee. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Nicholas Smith <nick.smith@consensys.net>
Contributor
|
CLA Signature Action: All authors have signed the CLA. You may need to manually re-run the blocking PR check if it doesn't pass in a few minutes. |
| } | ||
| }, PERPS_CONSTANTS.RECONNECTION_DELAY_ANDROID_MS); | ||
| // Store timer to clean up if component unmounts | ||
| return () => clearTimeout(timer); |
Contributor
There was a problem hiding this comment.
Bug: App State Tracking Bug Causes Memory Leak
The return () => clearTimeout(timer); in handleAppStateChange causes an early exit, preventing lastAppState.current from updating and breaking app state tracking. Since event listeners ignore return values, the setTimeout timer is also never cleared, leading to a memory leak.
Contributor
Author
There was a problem hiding this comment.
@abretonc7s Let's follow up with this if it's an issue
|
Collaborator
|
No release label on PR. Adding release label release-7.57.0 on PR, as PR was added to branch 7.57.0 when release was cut. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.


…-7.57.0 (#20946)
Description
Second part of Perps connection optimizations building on PR #20918. This PR delivers performance improvements, reliability fixes, and architectural cleanup for the Perps connection system.
Performance (TAT-1576): Faster HyperLiquid Connection
Replaced blocking
getAccountState()API calls with lightweight WebSocket health checks:provider.ping()WebSocket health check (5s timeout) instead of full API callsAccount data is now efficiently fetched via WebSocket subscriptions during preload, eliminating blocking HTTP requests and retry logic.
Reliability Improvements
Connection timeout protection (TAT-1576):
CONNECTION_TIMEOUTerror codeBackground reconnection fix (TAT-1559):
Architecture: Error Logging & Code Consolidation
getErrorContext()helper and comprehensive Sentry logging to 20+ catch blocks with consistent context (feature, context, provider, network)perps_x_ytoperps.x.yfor hierarchical grouping and easier filtering in Sentry dashboardsController.reconnectWithNewContext()- Manager callsinitializeProviders()directlyreconnectWithNewContext({ force: true })for proper WebSocket disconnectusePerpsConnectionhook to separate file, addedReconnectOptionsinterfacePERPS_CONNECTION_ARCHITECTURE.mdwith Mermaid diagrams explaining connection lifecycle and race condition handlingImpact:
Changelog
CHANGELOG entry: null
Related issues
Depends on: #20918
Fixes: TAT-1576, TAT-1559
Manual testing steps
Screenshots/Recordings
Before
After
Pre-merge author checklist
Standards.
Pre-merge reviewer checklist
Notes for Reviewers
Key Changes
Performance (TAT-1576):
PerpsConnectionManager.ts: Removed blockinggetAccountState()HTTP calls and retry logicprovider.ping()WebSocket health check (5s timeout)streamManager.account.prewarm()during preload subscriptionsReliability:
CONNECTION_TIMEOUTerror codeError Logging:
getErrorContext()helper for consistent Sentry context across 20+ catch blocksperps.{category}.{metric}format for better Sentry dashboardsperps_websocket_connection→perps.websocket.connection_establishmentArchitecture:
hooks/usePerpsConnection.ts- Extracted hook from providerdocs/perps/PERPS_CONNECTION_ARCHITECTURE.md- Architecture guide with Mermaid diagramstypes/index.ts- AddedReconnectOptionsinterfaceController.reconnectWithNewContext()- Manager now callsinitializeProviders()directlyreconnectWithNewContext({ force: true })Testing Focus Areas
Note
Replaces blocking HTTP checks with WebSocket health ping, adds 30s connection timeout and foreground reconnection delay, standardizes error logging/metrics, and extracts connection hook with minor API changes and tests/docs updates.
getAccountState()checks withprovider.ping()WebSocket health check; remove stale-connection HTTP retries.CONNECTION_TIMEOUTerror code and i18n.ensureError()andgetErrorContext()across controllers/services.perps.{category}.{metric}.usePerpsConnectionhook; Provider now delegates retry toreconnectWithNewContext({ force: true }).PerpsController.reconnectWithNewContext; Manager callsinitializeProviders()directly.ping(timeoutMs?)toIPerpsProvider; implement inHyperLiquidProvider.ReconnectOptionsandFEATURE_NAMEinperpsConfig.../hooks/usePerpsConnection; updatePerpsLoadingSkeletonto pass{ force: true }.ping, open orders/positions hooks, controller fee discount and logging changes.perps-connection-architecture.md; update Sentry reference guidelines.Written by Cursor Bugbot for commit 3de4e33. This will update automatically on new commits. Configure here.
Description
Changelog
CHANGELOG entry:
Related issues
Fixes:
Manual testing steps
Screenshots/Recordings
Before
After
Pre-merge author checklist
Pre-merge reviewer checklist