This repository was archived by the owner on Nov 15, 2023. It is now read-only.

StorageCache lock is slowing down RPC requests #9312

@crystalin

Description


Scenario

In Frontier, we have eth_getLogs, which allows querying a large range of blocks (we allow 2000 per request) for their Ethereum logs.
In the Ethereum case, it uses the client storage to retrieve the blocks one by one (see https://github.com/paritytech/frontier/blob/master/client/rpc/src/overrides/schema_v1_override.rs#L59).
When multiple eth_getLogs requests come in together, they use up to 4 threads (or up to --rpc-http-threads).

The observation

(on a 32 cores computer)

  • No matter how many CPUs and rpc-http-threads, we are always limited to a certain total CPU load (2.5 cores in our case), so with 4 threads it would use 4 CPUs at ~60% utilization each, but with 10 it would use 10 CPUs at ~25% each.
  • Using parity-db boosted performance a bit, but there is still a bottleneck somewhere.
  • Changing the --state-cache-size value has no impact.

These logs show the behavior when requesting those blocks:
[screenshot: cache-logs]

The issue

When accessing state_at for a given block, Substrate relies on a StorageCache (https://github.com/paritytech/substrate/blob/master/client/db/src/lib.rs#L2311), which is guarded by an RW lock (https://github.com/paritytech/substrate/blob/master/client/db/src/storage_cache.rs#L540).
This lock creates a bottleneck for all the threads trying to access the storage.

This is confirmed by a perf analysis of the request:
[screenshot: perf-lock]

The solution

  • The option to remove the StorageCache for the node (for RPC nodes this makes sense), or to have the StorageCache only cover the last 10 blocks (the ones used most), would work.
    Disabling the StorageCache (by commenting out the fn storage(...) logic so it only does Ok(self.state.storage(key)?)) showed a great improvement in performance, using up to 90% of the CPU on each of the 4 RPC threads used (there are still some other bottlenecks preventing scaling to 100%, but I'll investigate those later).

If you have another good solution in mind, I'd be happy to discuss it.

Labels

I9-optimisation: An enhancement to provide better overall performance in terms of time-to-completion for a task.