Skip to content

Fix overriding tx index of duplicated txs#8625

Merged
creachadair merged 2 commits intotendermint:v0.34.xfrom
yihuang:fix-duplicated-tx-index
Jun 30, 2022
Merged

Fix overriding tx index of duplicated txs#8625
creachadair merged 2 commits intotendermint:v0.34.xfrom
yihuang:fix-duplicated-tx-index

Conversation

@yihuang
Copy link

@yihuang yihuang commented May 26, 2022

Port the bug fix terra-money#76 to upstream. This is critical for ethermint json-rpc to work.

  • fix: prevent duplicate tx index if it succeeded before
  • fix: use CodeTypeOk instead of 0
  • fix: handle duplicate txs within the same block

Co-authored-by: jess jesse@soob.co

ref: #5281

@yihuang yihuang changed the title Fix indexing bug (#76) Fix overriding tx index of duplicated txs May 26, 2022
@cmwaters
Copy link
Contributor

Do we not want this in v0.35 and later versions?

@yihuang
Copy link
Author

yihuang commented May 27, 2022

Do we not want this in v0.35 and later versions?

I don't know if the issue exists in 0.35 mempool, since the mempool has been changed a lot, or if we can fix the root cause in newer version, then we don't need to change the tx indexer in later versions? after all this patch indeed adds some overheads.

@github-actions
Copy link

github-actions bot commented Jun 7, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale for use by stalebot label Jun 7, 2022
@yihuang
Copy link
Author

yihuang commented Jun 7, 2022

don't stale

@github-actions github-actions bot removed the stale for use by stalebot label Jun 8, 2022

// keep track of successful txs in this block in order to suppress latter ones being indexed.
// note that it deliberately does NOT cache unsuccessful ones
var successfulTxsInThisBlock = make(map[string]*abci.TxResult)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I understanding correctly the intent here? I think that for each candidate transaction t:

  • If we've already indexed a successful tx on this hash, we skip t regardless of its success or failure.
  • Otherwise, if we haven't yet indexed any transaction on this hash, we index t regardless of its success or failure.
  • Otherwise, if t is successful we index t (replacing a previous value, which by construction is not successful if it exists)

If that's right, it seems like we could maybe simplify this to avoid the need to re-fetch the whole record out of the index: Since we have all the transactions in the current height already in b.Ops, I think we could reuse the status that's already cached in the map if we always ensure the map is updated when we index either kind of tx (rather than only when it's successful).

For the tests with a MemDB this doesn't matter, but database use for indexing is already a pretty big cost for production nodes under load, so I'm hoping we can avoid the need to fetch/decode existing records to do the duplicate check.

That would change the check slightly, but I think should preserve the intended semantics (if my summary above is right).

Copy link
Author

@yihuang yihuang Jun 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The basic idea is like this:

if tx.success or not store.Indexed(tx.hash) or not store.Get(tx.hash).success {
  index tx
}

And successfulTxsInThisBlock is an optimization that in case there are duplicated tx in the current block, and the first one is successful, then we don't need to load the previous result from store. but in case the duplicated txs are in different blocks, we still have to check the store.

Do you mean to change the logic to this, so we only need to check existence of previous record without decoding it?

if tx.success or not store.Indexed(tx.hash) {
  index tx
}

With this, a failed record won't override the previous failed record.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so is the concern also for transactions with the same hash that might have come from previous heights? If that's so, then my suggestion won't work—I thought the logic here was only concerned with duplication within a given height (since the indexing is per-height).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, the duplicated txs could be included in different blocks.

Copy link

@creachadair creachadair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the remaining two topics we need to decide about are whether similar logic needs to be applied to the Postgres event sink too, and how to ensure we maintain parity in v0.35 and v0.36.

Re: Postgres. The PSQL sink already indexes transactions separately even if the same hash occurs multiple times. But it indexes all the occurrences (successful or otherwise), so the "latest" version of a given hash will always be the most recent, not the most-recent-successful.

Re: 35/36. I think we probably need to have this same logic in the v0.35 and v0.36 branches because otherwise chains that depend on this workaround will run into trouble when upgrading later. It's probably worth checking whether a forward-port is practical before we merge this into the release branch.

@yihuang
Copy link
Author

yihuang commented Jun 14, 2022

Re: Postgres. The PSQL sink already indexes transactions separately even if the same hash occurs multiple times. But it indexes all the occurrences (successful or otherwise), so the "latest" version of a given hash will always be the most recent, not the most-recent-successful.

to make it work for all indexers, we may be able to move the logic to outer level, pre-filter the txindex.Batch before calling the AddBatch function.

EDIT: I've reworked the PR to make it work on all indexers, and the code should be much cleaner now.

@yihuang yihuang force-pushed the fix-duplicated-tx-index branch from d5e2768 to 41f4f7d Compare June 14, 2022 03:25
@yihuang
Copy link
Author

yihuang commented Jun 14, 2022

Re: 35/36. I think we probably need to have this same logic in the v0.35 and v0.36 branches because otherwise chains that depend on this workaround will run into trouble when upgrading later. It's probably worth checking whether a forward-port is practical before we merge this into the release branch.

With the refactored version, it seems straightforward enough to port to later versions, do we plan to fix the root cause for the later versions, if that's the case, we may don't have to forward-port this patch.

@yihuang yihuang force-pushed the fix-duplicated-tx-index branch 2 times, most recently from a130349 to a4bee9b Compare June 14, 2022 03:37
@yihuang yihuang requested a review from creachadair June 14, 2022 07:08
@yihuang yihuang force-pushed the fix-duplicated-tx-index branch from a4bee9b to 0be97e0 Compare June 14, 2022 09:20
@tomtau
Copy link
Contributor

tomtau commented Jun 17, 2022

It's probably worth checking whether a forward-port is practical before we merge this into the release branch.

@creachadair do you mean the release branch of 0.34 or 0.35?

yihuang pushed a commit to yihuang/tendermint that referenced this pull request Jun 17, 2022
* fix: prevent duplicate tx index if it succeeded before

* fix: use CodeTypeOk instead of 0

* fix: handle duplicate txs within the same block

* [huangyi]: move the logic to outer layer, works on all txindexer types

Co-authored-by: jess <jesse@soob.co>

changelog

fix lint
@yihuang
Copy link
Author

yihuang commented Jun 17, 2022

it seems trivial to port to master: https://github.com/yihuang/tendermint/tree/fix-duplicated-tx-index-master

@creachadair
Copy link

It's probably worth checking whether a forward-port is practical before we merge this into the release branch.

@creachadair do you mean the release branch of 0.34 or 0.35?

Right now this PR is targeting the v0.34.x branch. I think we will also need it to have the same behaviour in v0.35.x and v0.36.x. Obviously those would need to be separate PRs, the key point being that the indexers work more-or-less the same way in the later releases too.

@creachadair
Copy link

With the refactored version, it seems straightforward enough to port to later versions, do we plan to fix the root cause for the later versions, if that's the case, we may don't have to forward-port this patch.

We don't have plans to make significant changes to the way indexing works in v0.35 or v0.36. For v0.37 we are likely to rework the way indexes are stored, and should be able to eliminate this problem. That said: The "root cause" in this case is a fundamental design choice in the KV indexer, so this workaround seems fine for now.

@yihuang
Copy link
Author

yihuang commented Jun 17, 2022

With the refactored version, it seems straightforward enough to port to later versions, do we plan to fix the root cause for the later versions, if that's the case, we may don't have to forward-port this patch.

We don't have plans to make significant changes to the way indexing works in v0.35 or v0.36. For v0.37 we are likely to rework the way indexes are stored, and should be able to eliminate this problem. That said: The "root cause" in this case is a fundamental design choice in the KV indexer, so this workaround seems fine for now.

by "root cause", I mean the issue in the mempool that get duplicated txs included in blocks in the first place: #5281

@yihuang
Copy link
Author

yihuang commented Jun 27, 2022

Hi, @creachadair do you think we can merge this now?

* fix: prevent duplicate tx index if it succeeded before

* fix: use CodeTypeOk instead of 0

* fix: handle duplicate txs within the same block

* [huangyi]: move the logic to outer layer, works on all txindexer types

Co-authored-by: jess <jesse@soob.co>

changelog

fix lint

more assertion

a little bit better to use result list rather than batch type
@yihuang yihuang force-pushed the fix-duplicated-tx-index branch from 6771215 to c8c4a9f Compare June 30, 2022 09:21
@creachadair creachadair merged commit 5c32cfa into tendermint:v0.34.x Jun 30, 2022
@yihuang yihuang deleted the fix-duplicated-tx-index branch June 30, 2022 15:07
@cmwaters
Copy link
Contributor

cmwaters commented Jul 4, 2022

The topic of mempool semantics is quite interesting and really requires a lot more discussion. Tendermint currently doesn't actually provide any guarantee against duplicated transactions and I'm not sure we ever will. Replay protection is up to the application. What we have is this local cache which is more of an optimization as opposed to any gaurantee. Perhaps we can actually set up a discussion in Github around this as I'd like to better understand how Tendermint can help application on that front

@yihuang
Copy link
Author

yihuang commented Jul 4, 2022

The topic of mempool semantics is quite interesting and really requires a lot more discussion. Tendermint currently doesn't actually provide any guarantee against duplicated transactions and I'm not sure we ever will. Replay protection is up to the application. What we have is this local cache which is more of an optimization as opposed to any gaurantee. Perhaps we can actually set up a discussion in Github around this as I'd like to better understand how Tendermint can help application on that front

this one is not about replay protection in the app side, but the tx indexer semantics facing duplicated txs, to prevent a failed tx result to override a success one in tx indexer.

Tendermint currently doesn't actually provide any guarantee against duplicated transactions and I'm not sure we ever will

Then we can forward port this PR to 0.35 and master I think.

@cmwaters
Copy link
Contributor

cmwaters commented Jul 5, 2022

this one is not about replay protection in the app side, but the tx indexer semantics facing duplicated txs, to prevent a failed tx result to override a success one in tx indexer.

Wouldn't implementing replay protection prevent duplicate txs from entering the indexer?

I agree though that we should probably forward port this to v0.35 and v0.36.

I'm also unsure whether we should force the same behaviour with postgres. There may be cases where people want to have multiple indexes of the same transaction. The other aspect is that we may want to eventually remove querying functionality out of Tendermint.

@yihuang
Copy link
Author

yihuang commented Jul 5, 2022

Wouldn't implementing replay protection prevent duplicate txs from entering the indexer?

If "replay protection" means return error in checkTx for duplicated tx, it won't prevent it from entering the indexer, we are using cosmos-sdk which does have replay protection, but it don't prevent the duplicated tx included in blocks, and all the txs included will enter the indexer.

I'm also unsure whether we should force the same behaviour with postgres.

The current implementation works for all indexers.

I agree though that we should probably forward port this to v0.35 and v0.36.

I'll open a PR for master branch then.

@cmwaters
Copy link
Contributor

cmwaters commented Jul 6, 2022

I'll open a PR for master branch then.

Thanks. Much appreciated!

If "replay protection" means return error in checkTx for duplicated tx, it won't prevent it from entering the indexer, we are using cosmos-sdk which does have replay protection, but it don't prevent the duplicated tx included in blocks, and all the txs included will enter the indexer.

I haven't looked into this as deeply as others and I think your fix solves the problem at hand. I am of the opinion, however, in the long term, that filtering txs that are passed into the indexer shouldn't be necessary. Ideally all transactions should be simply streamed straight out to the indexer. All necessary filtering should be done before the block gets proposed. This can be done in v0.36 with ABCI++. To me, all transactions that make it into a block should be indexed for future querying regardless of whether they were executed or not. The problems I see here stem more in the custom kv indexer which indexes on hashes and thus fails to index for duplicate transactions. I'm hoping in the future, more for other reasons than just this, that Tendermint will slowly deprecate the kv indexer in favour of Postgres and other industry-standard indexers.

I don't know if @creachadair has anything to add since he's probably thought about this more

yihuang pushed a commit to yihuang/tendermint that referenced this pull request Jul 6, 2022
…tendermint#8625)

Port the bug fix terra-money#76 to upstream. This is critical for ethermint json-rpc to work.

fix: prevent duplicate tx index if it succeeded before
fix: use CodeTypeOk instead of 0
fix: handle duplicate txs within the same block
Co-authored-by: jess jesse@soob.co

ref: tendermint#5281
@yihuang
Copy link
Author

yihuang commented Jul 6, 2022

I'll open a PR for master branch then.

Thanks. Much appreciated!

done: #8945

tychoish pushed a commit that referenced this pull request Jul 6, 2022
…#8625) (#8945)

Port the bug fix terra-money#76 to upstream. This is critical for ethermint json-rpc to work.

fix: prevent duplicate tx index if it succeeded before
fix: use CodeTypeOk instead of 0
fix: handle duplicate txs within the same block
Co-authored-by: jess jesse@soob.co

ref: #5281

Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
mergify bot pushed a commit that referenced this pull request Jul 6, 2022
…#8625) (#8945)

Port the bug fix terra-money#76 to upstream. This is critical for ethermint json-rpc to work.

fix: prevent duplicate tx index if it succeeded before
fix: use CodeTypeOk instead of 0
fix: handle duplicate txs within the same block
Co-authored-by: jess jesse@soob.co

ref: #5281

Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
(cherry picked from commit be6d74e)

# Conflicts:
#	CHANGELOG_PENDING.md
#	internal/state/indexer/indexer_service.go
mergify bot pushed a commit that referenced this pull request Jul 6, 2022
…#8625) (#8945)

Port the bug fix terra-money#76 to upstream. This is critical for ethermint json-rpc to work.

fix: prevent duplicate tx index if it succeeded before
fix: use CodeTypeOk instead of 0
fix: handle duplicate txs within the same block
Co-authored-by: jess jesse@soob.co

ref: #5281

Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
(cherry picked from commit be6d74e)
tychoish pushed a commit that referenced this pull request Jul 6, 2022
…#8625) (#8945) (#8951)

Port the bug fix terra-money#76 to upstream. This is critical for ethermint json-rpc to work.

fix: prevent duplicate tx index if it succeeded before
fix: use CodeTypeOk instead of 0
fix: handle duplicate txs within the same block
Co-authored-by: jess jesse@soob.co

ref: #5281

Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
(cherry picked from commit be6d74e)

Co-authored-by: yihuang <huang@crypto.com>
cmwaters pushed a commit that referenced this pull request Jul 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants