
Log stuck lookups #5778

Merged
mergify[bot] merged 11 commits into sigp:unstable from dapplion:log-stuck-lookups
May 14, 2024

Conversation

@dapplion
Collaborator

Issue Addressed

Debugging sync issues in which a lookup gets stuck is very time-consuming. To locate which lookup is actually stuck, one needs to correlate events from a log indexer and deduce which lookup has not been dropped.

Even then, there's no clear visibility into the state of the lookup. One has to reconstruct the state from a sequence of events, which is possible but slow.

Debugging could be much faster if we knew two things:

  • Which lookup is stuck
  • What is its internal state

We could expose that data over the HTTP API, but it's a bit ugly since sync would have to regularly post this data to some global state. Using metrics is problematic, as labeling by block_root would increase the cardinality of the metrics too much. Dumping to debug logs, by contrast, is easy to implement and easy to consume.

Proposed Changes

  • Dump a short summary of all block lookups if they still exist 60 seconds after creation.
  • Add a metric to track lookups that are older than 60 seconds.
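
The two changes above can be illustrated with a minimal sketch. It is not the code from this PR: the `Lookup` struct, its fields, `STUCK_THRESHOLD`, and `stuck_lookup_summaries` are hypothetical names chosen for illustration, and the real sync code would emit the lines through the client's debug logger and set a gauge metric rather than return strings.

```rust
use std::time::{Duration, Instant};

/// Age after which a lookup counts as stuck (60 seconds, per the description above).
/// The constant name is an assumption made for this sketch.
const STUCK_THRESHOLD: Duration = Duration::from_secs(60);

/// Hypothetical, simplified stand-in for a block lookup.
struct Lookup {
    id: u32,
    block_root: String,
    /// Short human-readable summary of the lookup's internal state.
    state: String,
    created: Instant,
}

/// Returns one summary line per lookup older than `STUCK_THRESHOLD`.
/// The length of the returned vector is the value a "stuck lookups"
/// metric would be set to, without labeling by block_root.
fn stuck_lookup_summaries(lookups: &[Lookup], now: Instant) -> Vec<String> {
    lookups
        .iter()
        .filter_map(|lookup| {
            // Saturates to zero if `created` is somehow later than `now`.
            let age = now.saturating_duration_since(lookup.created);
            (age > STUCK_THRESHOLD).then(|| {
                format!(
                    "stuck lookup id={} block_root={} state={} age_secs={}",
                    lookup.id,
                    lookup.block_root,
                    lookup.state,
                    age.as_secs()
                )
            })
        })
        .collect()
}
```

Calling such a function on a periodic timer keeps the cost low: nothing is logged while lookups complete normally, and the metric stays a single unlabeled count.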

@realbigsean
Member

@mergify queue

@mergify

mergify bot commented May 14, 2024

queue

🛑 The pull request has been removed from the queue default

Details

The queue conditions cannot be satisfied due to failing checks.

You can take a look at Queue: Embarked in merge queue check runs for more details.

In case of a failure due to a flaky test, you should first retrigger the CI.
Then, re-embark the pull request into the merge queue by posting the comment
@mergifyio refresh on the pull request.

@realbigsean
Member

@mergify queue

@mergify

mergify bot commented May 14, 2024

queue

🛑 The pull request has been removed from the queue default

Details

The queue conditions cannot be satisfied due to failing checks.

You can take a look at Queue: Embarked in merge queue check runs for more details.

In case of a failure due to a flaky test, you should first retrigger the CI.
Then, re-embark the pull request into the merge queue by posting the comment
@mergifyio refresh on the pull request.

@dapplion
Collaborator Author

@Mergifyio refresh

@mergify

mergify bot commented May 14, 2024

refresh

✅ Pull request refreshed

@realbigsean
Member

@mergify queue

@mergify

mergify bot commented May 14, 2024

queue

✅ The pull request has been merged automatically

Details

The pull request has been merged automatically at 6f45ad4

@mergify mergify bot merged commit 6f45ad4 into sigp:unstable May 14, 2024
@dapplion dapplion deleted the log-stuck-lookups branch May 15, 2024 09:45
@dapplion dapplion mentioned this pull request May 22, 2024