Skip to content

fix(types): Only require extension signature if extensions are enabled#3565

Merged
andynog merged 6 commits intocometbft:v0.38.xfrom
chillyvee:cv0.38.x.fix0voteext
Aug 9, 2024
Merged

fix(types): Only require extension signature if extensions are enabled#3565
andynog merged 6 commits intocometbft:v0.38.xfrom
chillyvee:cv0.38.x.fix0voteext

Conversation

@chillyvee
Copy link
Contributor

@chillyvee chillyvee commented Jul 26, 2024

Closes #3570

extEnabled is not set to false for nil precommit which causes SiteAndCheckVote() to panic

This is seen in seda chain with the following error:

 {"level":"error","module":"server","module":"consensus","err":"non-recoverable error when signing vote Vote{46:18BF721DE571 979860/00/SIGNED_MSG_TYPE_PRECOMMIT(Precommit) 4D40DA1E56E1 E7A643A433C1 000000000000 @ 2024-07-26T08:19:11.811634992Z}: extensions must be present IFF vote is a non-nil Precommit; extension signature: ","stack":"goroutine 14202 [running]:\nruntime/debug.Stack()\n\truntime/debug/stack.go:24 +0x5e\ngithub.com/cometbft/cometbft/consensus.(*State).receiveRoutine.func2()\n\tgithub.com/cometbft/cometbft@v0.38.6/consensus/state.go:801 +0x46\npanic({0x2aa4d60?, 0xc094b62020?})\n\truntime/panic.go:914 +0x21f\ngithub.com/cometbft/cometbft/consensus.(*State).signVote(0xc001612a80, 0x2, {0xc02845e5a0, 0x20, 0x20}, {0xe02c9c?, {0xc02845e640?, 0xc0b3a6d2ec?, 0x2da7880?}}, 0xc06a44c1e0)\n\tgithub.com/cometbft/cometbft@v0.38.6/consensus/state.go:2398 +0x61e\ngithub.com/cometbft/cometbft/consensus.(*State).signAddVote(0xc001612a80, 0x0?, {0xc02845e5a0, 0x20, 0x20}, {0x1?, {0xc02845e640?, 0xc0133c58e0?, 0x20?}}, 0xc06a44c1e0)\n\tgithub.com/cometbft/cometbft@v0.38.6/consensus/state.go:2449 +0x212\ngithub.com/cometbft/cometbft/consensus.(*State).enterPrecommit(0xc001612a80, 0xef394, 0x0)\n\tgithub.com/cometbft/cometbft@v0.38.6/consensus/state.go:1537 +0x1337\ngithub.com/cometbft/cometbft/consensus.(*State).addVote(0xc001612a80, 0xc0d5793450, {0xc08587f890, 0x28})\n\tgithub.com/cometbft/cometbft@v0.38.6/consensus/state.go:2306 +0x186f\ngithub.com/cometbft/cometbft/consensus.(*State).tryAddVote(0xc001612a80, 0xc0d5793450, {0xc08587f890?, 0xc0b3a6dc08?})\n\tgithub.com/cometbft/cometbft@v0.38.6/consensus/state.go:2066 +0x26\ngithub.com/cometbft/cometbft/consensus.(*State).handleMsg(0xc001612a80, {{0x3b33320, 0xc067341dd8}, {0xc08587f890, 0x28}})\n\tgithub.com/cometbft/cometbft@v0.38.6/consensus/state.go:929 +0x3ce\ngithub.com/cometbft/cometbft/consensus.(*State).receiveRoutine(0xc001612a80, 0x0)\n\tgithub.com/cometbft/cometbft@v0.38.6/consensus/state.go:836 +0x3d1\ncreated by github.com/cometbft/cometbft/consensus.(*State).OnStart in goroutine 325\n\tgithub.com/cometbft/cometbft@v0.38.6/consensus/state.go:398 +0x10c\n","time":"2024-07-26T09:19:16+01:00","message":"CONSENSUS FAILURE!!!"}

PR checklist

  • Tests written/updated
  • Changelog entry added in .changelog (we use unclog to manage our changelog)
  • Updated relevant documentation (docs/ or spec/) and code comments
  • Title follows the Conventional Commits spec

@chillyvee chillyvee requested a review from a team as a code owner July 26, 2024 09:26
@chillyvee chillyvee changed the title Cv0.38.x.fix0voteext fix(consensus) Disable VoteExtension for nil precommit Jul 26, 2024
@chillyvee chillyvee force-pushed the cv0.38.x.fix0voteext branch from d2f3f5e to 4dfd196 Compare July 26, 2024 09:40
@melekes
Copy link
Collaborator

melekes commented Jul 26, 2024

Could you please open an issue first? Are you using a remote signer like tmkms? If so, what version? There was a bug recently: #2711, which is very similar from the looks of it.

@chillyvee
Copy link
Contributor Author

Could you please open an issue first? Are you using a remote signer like tmkms? If so, what version? There was a bug recently: #2711, which is very similar from the looks of it.

Using Horcrux v3.3.0

Opened issue. #3570

@cason
Copy link

cason commented Jul 29, 2024

Also, can you check whether in main this situation is solved? We would also need a test unit with the problematic scenario.

@chillyvee
Copy link
Contributor Author

chillyvee commented Aug 6, 2024

The first proposed fix does not solve the problem fully in seda mainnet.

Additional debugging indciates the root cause is this line:
if extSignature == (!isPrecommit || isNil) {

When extensionsEnabled = false (correct, previously assumed incorrect)
And (!isPrecommit || isNil) => (!true || false) => false
Then false == false and the panic is thrown by accident

We are testing this proposed in seda mainnet.

Our proposed fix matches the new logic already in the main branch.
https://github.com/cometbft/cometbft/blob/main/types/vote.go#L436-L446

@chillyvee chillyvee force-pushed the cv0.38.x.fix0voteext branch from 0a9ce40 to c3d98ea Compare August 6, 2024 18:05
@melekes
Copy link
Collaborator

melekes commented Aug 7, 2024

@chillyvee please add a changelog entry 🙏

Copy link
Collaborator

@melekes melekes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@melekes melekes changed the title fix(consensus) Disable VoteExtension for nil precommit fix(types): ONLY require extension signature if extensions are enabled Aug 7, 2024
@melekes melekes linked an issue Aug 7, 2024 that may be closed by this pull request
@chillyvee
Copy link
Contributor Author

Changelog added. Seda mainnet node is stable for 16 hours.

Previous panics appeared within 8-72 hours.

types/vote.go Outdated

vote.ExtensionSignature = nil
if extensionsEnabled {
isNil := vote.BlockID.IsZero()
Copy link
Collaborator

@sergio-mena sergio-mena Aug 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fix is disabling various sanity checks we wouldn't like to disable:

  • Make sure that a prevote does not contain an extension signature
  • Make sure that nil precommit does not contain an extension signature

Therefore this fix needs to "undo" the XOR in the if into two conditions, and only include one in the if extensionsEnabled.

I will post a diff here to explain my point

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is how I believe the fix should look like:

diff --git a/types/vote.go b/types/vote.go
--- a/types/vote.go
+++ b/types/vote.go
@@ -430,13 +430,17 @@ func SignAndCheckVote(
 
 	isNil := vote.BlockID.IsZero()
 	extSignature := (len(v.ExtensionSignature) > 0)
-	if extSignature == (!isPrecommit || isNil) {
-		// Non-recoverable because the vote is malformed
+	if extSignature && (!isPrecommit || isNil) {
+		// Non-recoverable because the vote is malformed (wrongly contains an extension signature)
 		return false, &ErrVoteExtensionInvalid{ExtSignature: v.ExtensionSignature}
 	}
 
 	vote.ExtensionSignature = nil
 	if extensionsEnabled {
+		if !extSignature && isPrecommit && !isNil {
+			// Non-recoverable because the vote is malformed (extension signature missing for a precommit)
+			return false, &ErrVoteExtensionInvalid{ExtSignature: v.ExtensionSignature}
+		}
 		vote.ExtensionSignature = v.ExtensionSignature
 	}
 

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, this fix may only be buying you time. If the chain decides to start using vote extensions (in the SDK, it would be via a gov proposal)... and your remote signer still doesn't sign vote extensions, it will start failing again.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree.

Copy link
Contributor Author

@chillyvee chillyvee Aug 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our internal fix is actually "if extSignature && (!isPrecommit || isNil) " as you suggest and would be supportive of the format you suggest. We agree the previous code misses one of the checks. The new commit includes your suggestion.

The original incomplete fix matches the main branch which is missing:

+	if extSignature && (!isPrecommit || isNil) {
+		// Non-recoverable because the vote is malformed (wrongly contains an extension signature)
 		return false, &ErrVoteExtensionInvalid{ExtSignature: v.ExtensionSignature}
 	}

Does the main branch also need to check this condition?

Also Horcrux signer supports vote extensions. Appreciate the check.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the main branch also need to check this condition?

Thanks for pointing out. Taking care of it (#3642)

@cason cason changed the title fix(types): ONLY require extension signature if extensions are enabled fix(types): Only require extension signature if extensions are enabled Aug 7, 2024
@chillyvee
Copy link
Contributor Author

Updated to include condition checks as suggested by @sergio-mena

Copy link
Collaborator

@sergio-mena sergio-mena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks 🙏!

@cason
Copy link

cason commented Aug 8, 2024

Ensure your tools are up-to-date, re-run 'make proto-gen' and update this PR.

@sergio-mena
Copy link
Collaborator

On the last comment from @cason , I had something similar in #3649. @chillyvee you might need to do rm $GOPATH/bin/protoc-gen-gocosmos, before running make proto-gen

@chillyvee
Copy link
Contributor Author

chillyvee commented Aug 8, 2024

make proto-gen

Commit added with changed files.

Protobuf lint check appears to be failing.

Failure: decode buf.yaml: "version" is not set. Please add "version: v2"

Adding version: v1 to match other existing files

Failure: decode buf.yaml: build.roots cannot be set on version v1: [proto]

Trying version: v2

Failure: decode buf.yaml: invalid as version v2: could not unmarshal as YAML: yaml: unmarshal errors:
  line 2: field build not found in type bufconfig.externalBufYAMLFileV2

Does not work and other tags in this branch series doesn't need a version set in /buf.yaml

Reverting back to original buf.yaml.

Change proto/buf.yaml add version: v2

Failure: got a v1 buf.lock file for a v2 buf.yaml

make proto-gen

Failure: failed to get module config for directory "proto" listed in buf.work.yaml: proto/buf.yaml has an invalid "version: v2" set. Please add "version: v1". See https://docs.buf.build/faq for more details

Reverting back to original proto/buf.yaml.

make proto-gen needs version: v1
But github actions protobuf lint / lint (pull_request) needs version: v2?
This workflow is not in main

buf.yaml and proto/buf.yaml are both unchanged. Is there a reason protobuf lint should fail on buf.yaml?

Any ideas @sergio-mena ? $GOPATH/bin/protoc-gen-gocosmos does not exist

@chillyvee chillyvee force-pushed the cv0.38.x.fix0voteext branch 3 times, most recently from ac05675 to 008bab0 Compare August 8, 2024 19:45
github-merge-queue bot pushed a commit that referenced this pull request Aug 9, 2024
…n signing a vote (#3649)

Closes #3642

When vote extensions are disabled, we still need to (sanity-)check that:

* Prevotes don't contain an extension signature
* `nil` precommits don't contain an extension signature

This PR is re-enabling those checks, in alignment with #3565 in
`v0.38.x`

---

#### PR checklist

- ~[ ] Tests written/updated~
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- ~[ ] Updated relevant documentation (`docs/` or `spec/`) and code
comments~
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
github-merge-queue bot pushed a commit that referenced this pull request Aug 9, 2024
Thanks @sergio-mena for figuring it out
(#3565 (comment))

---

#### PR checklist

- [ ] Tests written/updated
- [ ] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [ ] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [ ] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
@sergio-mena
Copy link
Collaborator

@chillyvee I spent some time diagnosing. Can you please cherry-pick 3320ae3 to see if that fixes the problem?

@chillyvee chillyvee force-pushed the cv0.38.x.fix0voteext branch from 4f7e1f8 to 024da4b Compare August 9, 2024 11:51
@chillyvee
Copy link
Contributor Author

chillyvee commented Aug 9, 2024

@chillyvee I spent some time diagnosing. Can you please cherry-pick 3320ae3 to see if that fixes the problem?

Picked your commit

Also updated commit messages to (types) instead of the original (consensus)

@chillyvee chillyvee force-pushed the cv0.38.x.fix0voteext branch from 024da4b to 584515f Compare August 9, 2024 11:53
@chillyvee
Copy link
Contributor Author

Green check mark. It worked @sergio-mena

@andynog andynog merged commit c17d1f6 into cometbft:v0.38.x Aug 9, 2024
@chillyvee
Copy link
Contributor Author

@sergio-mena @cason @melekes - Please share your twitter handles so we can thank you in a tweet for all the help

@chillyvee
Copy link
Contributor Author

Is it possible to tag a release of v0.38 for this fix? The v0.38 branch will be live on cosmoshub gaiad in under a week.

#3664

@melekes
Copy link
Collaborator

melekes commented Aug 12, 2024

Is it possible to tag a release of v0.38 for this fix? The v0.38 branch will be live on cosmoshub gaiad in under a week.

#3664

Yes, we'll try to create a new release today.

@melekes
Copy link
Collaborator

melekes commented Aug 12, 2024

@sergio-mena @cason @melekes - Please share your twitter handles so we can thank you in a tweet for all the help

https://x.com/_melekes

@sergio-mena
Copy link
Collaborator

@chillyvee I'm inactive on X (@chav8 - last tweet in 2012), so if you'd like to refer to me, just say "Sergio Mena" or link to my GitHub profile, thanks.

mergify bot pushed a commit that referenced this pull request Jan 11, 2025
…n signing a vote (#3649)

Closes #3642

When vote extensions are disabled, we still need to (sanity-)check that:

* Prevotes don't contain an extension signature
* `nil` precommits don't contain an extension signature

This PR is re-enabling those checks, in alignment with #3565 in
`v0.38.x`

---

#### PR checklist

- ~[ ] Tests written/updated~
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- ~[ ] Updated relevant documentation (`docs/` or `spec/`) and code
comments~
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec

(cherry picked from commit a33b8ac)

# Conflicts:
#	.changelog/v1.0.0/bug-fixes/3642-re-enable-vote-extension-checks.md
#	go.mod
#	go.sum
mergify bot added a commit that referenced this pull request Jan 13, 2025
…n signing a vote (backport #3649) (#4789)

Closes #3642

When vote extensions are disabled, we still need to (sanity-)check that:

* Prevotes don't contain an extension signature
* `nil` precommits don't contain an extension signature

This PR is re-enabling those checks, in alignment with #3565 in
`v0.38.x`

---

#### PR checklist

- ~[ ] Tests written/updated~
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- ~[ ] Updated relevant documentation (`docs/` or `spec/`) and code
comments~
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
<hr>This is an automatic backport of pull request #3649 done by
[Mergify](https://mergify.com).

---------

Co-authored-by: Sergio Mena <sergio@informal.systems>
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

consensus: v0.38.6 forgets to disable vote extensions for nil precommit

5 participants