Skip to content

Ensure correct ordering of heartbeats in resource manager#16787

Merged
tdcmeehan merged 1 commit into
prestodb:masterfrom
tdcmeehan:sequence
Sep 28, 2021
Merged

Ensure correct ordering of heartbeats in resource manager#16787
tdcmeehan merged 1 commit into
prestodb:masterfrom
tdcmeehan:sequence

Conversation

@tdcmeehan

Copy link
Copy Markdown
Contributor

Test plan - Unit tests

== NO RELEASE NOTE ==

@ajaygeorge

Copy link
Copy Markdown
Contributor

@tdcmeehan dumb question : It wasn't clear looking at the PR description why the order of heartbeats is important.
Say Node A sends heartbeats to Node B. Node A dies sometime and If Heartbeat 101 comes after Heartbeat 102, the worst case scenario is Node B thinks Node A is alive for a delta time it wasn't supposed to be alive because eventually Heartbeat 103 will not arrive. Is that the problem we are trying to solve.?

@swapsmagic swapsmagic left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tdcmeehan

Copy link
Copy Markdown
Contributor Author

@tdcmeehan dumb question : It wasn't clear looking at the PR description why the order of heartbeats is important. Say Node A sends heartbeats to Node B. Node A dies sometime and If Heartbeat 101 comes after Heartbeat 102, the worst case scenario is Node B thinks Node A is alive for a delta time it wasn't supposed to be alive because eventually Heartbeat 103 will not arrive. Is that the problem we are trying to solve.?

The exact issue is there are certain scenarios where the state has changed and it's not desirable to to revert to a previous revision of the state. This also enables us to send heartbeats more aggressively, for example on state transitions, and to lower the interval for in between events such as queueing and running states.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants