Fix overriding tx index of duplicated txs#8625
Fix overriding tx index of duplicated txs#8625creachadair merged 2 commits intotendermint:v0.34.xfrom
Conversation
|
Do we not want this in v0.35 and later versions? |
I don't know if the issue exists in 0.35 mempool, since the mempool has been changed a lot, or if we can fix the root cause in newer version, then we don't need to change the tx indexer in later versions? after all this patch indeed adds some overheads. |
9101462 to
1338c66
Compare
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
|
don't stale |
state/txindex/kv/kv.go
Outdated
|
|
||
| // keep track of successful txs in this block in order to suppress latter ones being indexed. | ||
| // note that it deliberately does NOT cache unsuccessful ones | ||
| var successfulTxsInThisBlock = make(map[string]*abci.TxResult) |
There was a problem hiding this comment.
Am I understanding correctly the intent here? I think that for each candidate transaction t:
- If we've already indexed a successful tx on this hash, we skip t regardless of its success or failure.
- Otherwise, if we haven't yet indexed any transaction on this hash, we index t regardless of its success or failure.
- Otherwise, if t is successful we index t (replacing a previous value, which by construction is not successful if it exists)
If that's right, it seems like we could maybe simplify this to avoid the need to re-fetch the whole record out of the index: Since we have all the transactions in the current height already in b.Ops, I think we could reuse the status that's already cached in the map if we always ensure the map is updated when we index either kind of tx (rather than only when it's successful).
For the tests with a MemDB this doesn't matter, but database use for indexing is already a pretty big cost for production nodes under load, so I'm hoping we can avoid the need to fetch/decode existing records to do the duplicate check.
That would change the check slightly, but I think should preserve the intended semantics (if my summary above is right).
There was a problem hiding this comment.
The basic idea is like this:
if tx.success or not store.Indexed(tx.hash) or not store.Get(tx.hash).success {
index tx
}And successfulTxsInThisBlock is an optimization that in case there are duplicated tx in the current block, and the first one is successful, then we don't need to load the previous result from store. but in case the duplicated txs are in different blocks, we still have to check the store.
Do you mean to change the logic to this, so we only need to check existence of previous record without decoding it?
if tx.success or not store.Indexed(tx.hash) {
index tx
}With this, a failed record won't override the previous failed record.
There was a problem hiding this comment.
Ah, so is the concern also for transactions with the same hash that might have come from previous heights? If that's so, then my suggestion won't work—I thought the logic here was only concerned with duplication within a given height (since the indexing is per-height).
There was a problem hiding this comment.
yeah, the duplicated txs could be included in different blocks.
creachadair
left a comment
There was a problem hiding this comment.
I think the remaining two topics we need to decide about are whether similar logic needs to be applied to the Postgres event sink too, and how to ensure we maintain parity in v0.35 and v0.36.
Re: Postgres. The PSQL sink already indexes transactions separately even if the same hash occurs multiple times. But it indexes all the occurrences (successful or otherwise), so the "latest" version of a given hash will always be the most recent, not the most-recent-successful.
Re: 35/36. I think we probably need to have this same logic in the v0.35 and v0.36 branches because otherwise chains that depend on this workaround will run into trouble when upgrading later. It's probably worth checking whether a forward-port is practical before we merge this into the release branch.
to make it work for all indexers, we may be able to move the logic to outer level, pre-filter the EDIT: I've reworked the PR to make it work on all indexers, and the code should be much cleaner now. |
d5e2768 to
41f4f7d
Compare
With the refactored version, it seems straightforward enough to port to later versions, do we plan to fix the root cause for the later versions, if that's the case, we may don't have to forward-port this patch. |
a130349 to
a4bee9b
Compare
a4bee9b to
0be97e0
Compare
@creachadair do you mean the release branch of 0.34 or 0.35? |
* fix: prevent duplicate tx index if it succeeded before * fix: use CodeTypeOk instead of 0 * fix: handle duplicate txs within the same block * [huangyi]: move the logic to outer layer, works on all txindexer types Co-authored-by: jess <jesse@soob.co> changelog fix lint
|
it seems trivial to port to master: https://github.com/yihuang/tendermint/tree/fix-duplicated-tx-index-master |
Right now this PR is targeting the v0.34.x branch. I think we will also need it to have the same behaviour in v0.35.x and v0.36.x. Obviously those would need to be separate PRs, the key point being that the indexers work more-or-less the same way in the later releases too. |
We don't have plans to make significant changes to the way indexing works in v0.35 or v0.36. For v0.37 we are likely to rework the way indexes are stored, and should be able to eliminate this problem. That said: The "root cause" in this case is a fundamental design choice in the KV indexer, so this workaround seems fine for now. |
by "root cause", I mean the issue in the mempool that get duplicated txs included in blocks in the first place: #5281 |
|
Hi, @creachadair do you think we can merge this now? |
* fix: prevent duplicate tx index if it succeeded before * fix: use CodeTypeOk instead of 0 * fix: handle duplicate txs within the same block * [huangyi]: move the logic to outer layer, works on all txindexer types Co-authored-by: jess <jesse@soob.co> changelog fix lint more assertion a little bit better to use result list rather than batch type
6771215 to
c8c4a9f
Compare
|
The topic of mempool semantics is quite interesting and really requires a lot more discussion. Tendermint currently doesn't actually provide any guarantee against duplicated transactions and I'm not sure we ever will. Replay protection is up to the application. What we have is this local cache which is more of an optimization as opposed to any gaurantee. Perhaps we can actually set up a discussion in Github around this as I'd like to better understand how Tendermint can help application on that front |
this one is not about replay protection in the app side, but the tx indexer semantics facing duplicated txs, to prevent a failed tx result to override a success one in tx indexer.
Then we can forward port this PR to 0.35 and master I think. |
Wouldn't implementing replay protection prevent duplicate txs from entering the indexer? I agree though that we should probably forward port this to v0.35 and v0.36. I'm also unsure whether we should force the same behaviour with postgres. There may be cases where people want to have multiple indexes of the same transaction. The other aspect is that we may want to eventually remove querying functionality out of Tendermint. |
If "replay protection" means return error in checkTx for duplicated tx, it won't prevent it from entering the indexer, we are using cosmos-sdk which does have replay protection, but it don't prevent the duplicated tx included in blocks, and all the txs included will enter the indexer.
The current implementation works for all indexers.
I'll open a PR for master branch then. |
Thanks. Much appreciated!
I haven't looked into this as deeply as others and I think your fix solves the problem at hand. I am of the opinion, however, in the long term, that filtering txs that are passed into the indexer shouldn't be necessary. Ideally all transactions should be simply streamed straight out to the indexer. All necessary filtering should be done before the block gets proposed. This can be done in v0.36 with ABCI++. To me, all transactions that make it into a block should be indexed for future querying regardless of whether they were executed or not. The problems I see here stem more in the custom kv indexer which indexes on hashes and thus fails to index for duplicate transactions. I'm hoping in the future, more for other reasons than just this, that Tendermint will slowly deprecate the kv indexer in favour of Postgres and other industry-standard indexers. I don't know if @creachadair has anything to add since he's probably thought about this more |
…tendermint#8625) Port the bug fix terra-money#76 to upstream. This is critical for ethermint json-rpc to work. fix: prevent duplicate tx index if it succeeded before fix: use CodeTypeOk instead of 0 fix: handle duplicate txs within the same block Co-authored-by: jess jesse@soob.co ref: tendermint#5281
done: #8945 |
…#8625) (#8945) Port the bug fix terra-money#76 to upstream. This is critical for ethermint json-rpc to work. fix: prevent duplicate tx index if it succeeded before fix: use CodeTypeOk instead of 0 fix: handle duplicate txs within the same block Co-authored-by: jess jesse@soob.co ref: #5281 Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
…#8625) (#8945) Port the bug fix terra-money#76 to upstream. This is critical for ethermint json-rpc to work. fix: prevent duplicate tx index if it succeeded before fix: use CodeTypeOk instead of 0 fix: handle duplicate txs within the same block Co-authored-by: jess jesse@soob.co ref: #5281 Co-authored-by: M. J. Fromberger <fromberger@interchain.io> (cherry picked from commit be6d74e) # Conflicts: # CHANGELOG_PENDING.md # internal/state/indexer/indexer_service.go
…#8625) (#8945) Port the bug fix terra-money#76 to upstream. This is critical for ethermint json-rpc to work. fix: prevent duplicate tx index if it succeeded before fix: use CodeTypeOk instead of 0 fix: handle duplicate txs within the same block Co-authored-by: jess jesse@soob.co ref: #5281 Co-authored-by: M. J. Fromberger <fromberger@interchain.io> (cherry picked from commit be6d74e)
…#8625) (#8945) (#8951) Port the bug fix terra-money#76 to upstream. This is critical for ethermint json-rpc to work. fix: prevent duplicate tx index if it succeeded before fix: use CodeTypeOk instead of 0 fix: handle duplicate txs within the same block Co-authored-by: jess jesse@soob.co ref: #5281 Co-authored-by: M. J. Fromberger <fromberger@interchain.io> (cherry picked from commit be6d74e) Co-authored-by: yihuang <huang@crypto.com>
Port the bug fix terra-money#76 to upstream. This is critical for ethermint json-rpc to work.
Co-authored-by: jess jesse@soob.co
ref: #5281