Modify unchecked diffs query#84
Conversation
| func (repository diffRepository) GetFirstDiffForBlockHeight(blockHeight int64) (types.PersistedDiff, error) { | ||
| var diff types.PersistedDiff | ||
| err := repository.db.Get(&diff, | ||
| `SELECT * FROM public.storage_diff WHERE checked IS false AND block_height = $1 LIMIT 1`, blockHeight) |
There was a problem hiding this comment.
wondering if it might make sense to say block_height >= $1 ORDER BY block_height ASC LIMIT 1 in case we don't have a diff at exactly that block height
rmulhol
left a comment
There was a problem hiding this comment.
Awesome work, and impressive turnaround! 😎
My main area of interest is minimizing the likelihood of GetFirstDiffForBlockHeight (or, less likely, GetFirstDiffForBlockHeight) returning an error (esp something like sql.ErrNoRows) so that we don't kill/restart the container unless things are really out of whack.
Other than that I'm curious how we can minimize the effort needed to tune the BlocksBackFromHead number up or down depending on observed results.
Overall, LGTM! 👍
| return getHeaderErr | ||
| } | ||
| blockNumber := mostRecentHeader.BlockNumber - BlocksBackFromHead | ||
| diff, getDiffErr := watcher.StorageDiffRepository.GetFirstDiffForBlockHeight(blockNumber) |
There was a problem hiding this comment.
Don't know if this makes any difference but maybe this function could just return the ID if that's all we want from the diff
There was a problem hiding this comment.
Good call - I had initially thought that this method could be useful in the future if it returned the whole diff, but we'll let future us deal with that, just in case it never is necessary.
| var ResultsLimit = 500 | ||
| var ( | ||
| ResultsLimit = 500 | ||
| BlocksBackFromHead = int64(500) |
There was a problem hiding this comment.
Thinking that we might want this number to be a good bit higher - at least the first time we run this, since it's possible we could be at a really bad point in terms of cycling through the unchecked (and un-transformable) diffs. I wonder if maybe instead of skipOldDiffs being a bool, it could be an int64 configurable via the CLI? Then we could easily bump the number up or down depending on the results we observe
There was a problem hiding this comment.
I really like the idea of passing in an int instead!
| minID := 0 | ||
| var minID int | ||
| if watcher.SkipOldDiffs { | ||
| mostRecentHeader, getHeaderErr := watcher.HeaderRepository.GetMostRecentHeader() |
There was a problem hiding this comment.
Maybe this could just return the block number?
Previously we were passing the id of the first diff for a given block to GetNewDiffs, but this method is looking at diffs with ids > than the given minID, so the diff with the passed minID isn't included, which is fine for normal circumstances, since it starts at 0.
|
|
||
| minID = int(diffID) | ||
| minID = int(diffID - 1) | ||
| } |
There was a problem hiding this comment.
Just pushed this change up for now... wondering if a cleaner approach would be to change the GetNewDiffs query to look at diffs where id >= instead of just >, but think I need to look at that again with fresh eyes tomorrow.
There was a problem hiding this comment.
Makes sense to prefer >= to > 👍
Not sure I follow why we need to set minID to diffID - 1 though?
There was a problem hiding this comment.
Cool, i'll make update to >=.
I needed to change the minID to diffID - 1 because when passing in a block to start looking at diffs, we would use the id from the first diff for that block. Then if we pass that id to GetNewDiffs it would look up diffs with an id > than the given id, so that diff would never be processed. So by making the minID one less than that diff's id, it would ensure that it would get processed. Does that make sense?
All that being said, changing GetNewDiffs to be inclusive for the minID seems a lot easier.
There was a problem hiding this comment.
Actually now that I look at it a bit closer, if we change GetNewDiffs to look at ids >= minID then with the current watcher implementation we'd end up rechecking the last diff on the collection each time... or we'd need to increment the minID to avoid that. 🤔 Really not sure which approach is clearer.
There was a problem hiding this comment.
just writing out the 2 options again to help myself understand the two approaches:
-
if we keep
GetNewDiffsas is withid > minIdthen we will need to pass in an id less than the diffId that we’re getting fromGetFirstDiffIDForBlockHeightotherwise that id won’t ever be included inGetNewDiffs, so that is why I was passing infirstDiffId -1 -
on the flip side, if we change
GetNewDiffsto doid >= minIdthen whenever we pass in an updated minID which is the last id from the set of diffs returned, then we’d end up returning that last diff fromGetNewDiffsagain. So it seems like we’d need to pass inlastDiffId +1as the new diff id
| err := repository.db.Get(&diffID, | ||
| `SELECT id FROM public.storage_diff WHERE checked IS false AND block_height >= $1 LIMIT 1`, blockHeight) | ||
| `SELECT id FROM public.storage_diff WHERE block_height >= $1 LIMIT 1`, blockHeight) | ||
|
|
There was a problem hiding this comment.
I figured that this method doesn't necessarily need to care if a diff is checked or not, since we're doing that anyway with GetNewDiffs. Also, was seeing some failures locally when execute caught up with all unchecked diffs, this method would fail because it didn't have an sql results.
This also got me thinking that we'll probably need to tweak this method and MostRecentHeaderBlockNumber for fresh deployed systems - again, will plan to look at that again tomorrow with fresh eyes.
There was a problem hiding this comment.
Nice catch! Makes me think maybe we want to handle sql.ErrNoRows upstream, since I imagine we could probably end up in this situation in other weird scenarios like if the interval is low and there aren't a lot of diffs being generated.
Also 👍 to thinking about freshly deployed systems. I wish it were as simple as updating docs to suggest not using this feature in a fresh deploy, but suppose that might not be feasible if we're defining the interval in the image. Maybe we could require the interval to be set via an env var that's passed to the image?
There was a problem hiding this comment.
Was just checking out the PR updating the image and that looks legit - if we default to -1 but enable config, then fresh deployed systems can just not use this feature?
There was a problem hiding this comment.
yeah, that's a good idea. also, i added a guard in the watcher so that if either the headers or the storage_diff table returns a sql.NoRowsErr we just set the minID to 0, so at least the process won't fall over
b03a3be to
5ed10bc
Compare
| RawDiff: fakeRawDiff, | ||
| ID: rand.Int63(), | ||
| Checked: true, | ||
| Checked: false, |
| Expect(insertErr).NotTo(HaveOccurred()) | ||
|
|
||
| blockBeforeDiffBlockHeight := int64(fakeRawDiff.BlockHeight - 1) | ||
| _, diffErr := repo.GetFirstDiffIDForBlockHeight(blockBeforeDiffBlockHeight) |
There was a problem hiding this comment.
Maybe worth indicating here what is returned? Assuming it's the same as when the diff is unchecked?
| var ResultsLimit = 500 | ||
| var ( | ||
| ResultsLimit = 500 | ||
| BlocksBackFromHead = int64(500) |
| func (repository HeaderRepository) GetMostRecentHeaderBlockNumber() (int64, error) { | ||
| var blockNumber int64 | ||
| err := repository.database.Get(&blockNumber, | ||
| `SELECT block_number FROM headers ORDER BY block_number DESC LIMIT 1`) |
There was a problem hiding this comment.
Any idea how this compares performance-wise with select max(block_number) from headers?
There was a problem hiding this comment.
good question! looks like SELECT block_number FROM headers ORDER BY block_number DESC LIMIT 1 may be slightly more performant?
explain SELECT block_number FROM headers ORDER BY block_number DESC LIMIT 1;
Limit (cost=0.43..0.47 rows=1 width=8)
-> Index Only Scan Backward using headers_block_number_eth_node_id_key on headers (cost=0.43..113428.77 rows=2925850 width=8)
=======================
explain select max(block_number) from headers;
Result (cost=0.47..0.48 rows=1 width=8)
InitPlan 1 (returns $0)
-> Limit (cost=0.43..0.47 rows=1 width=8)
-> Index Only Scan Backward using headers_block_number_eth_node_id_key on headers (cost=0.43..120743.40 rows=2925850 width=8)
Index Cond: (block_number IS NOT NULL)
| It("returns an error if it fails to get the most recent header", func() { | ||
| _, err := repo.GetMostRecentHeaderBlockNumber() | ||
| Expect(err).To(HaveOccurred()) | ||
| Expect(err).To(MatchError(sql.ErrNoRows)) |
There was a problem hiding this comment.
This seems like potentially an issue for freshly deployed systems as well, if execute starts before header sync. Wondering if we want to handle sql.ErrNoRows in the storage watched from both of these repository calls - seems like maybe we could delay + retry, set minID to 0, or something
6720333 to
d9f7c78
Compare
| minID = int(diffID - 1) | ||
| } | ||
|
|
||
| return minID, nil |
There was a problem hiding this comment.
So if DiffBlocksFromHeadOfChain is -1 you'll return for minID. Is that deliberate?
There was a problem hiding this comment.
Yes, the idea is that if DidffBlocksFromHeadOfChain is -1 then we'll want the minID to be 0. Golang sets the starting value of an int to 0 if you don't define it (line 98), but perhaps it would be clearer to be explicit and set var minID= 0 instead.
There was a problem hiding this comment.
I think it's the "-1" that's bothering me - it's a magic number that the execute.go file has to know about to start from the beginning.
Honestly it's not super important, just mildly confusing.
There was a problem hiding this comment.
That's super fair w/r/t the magic number! I've added a comment explaining a bit more, and may also pull the 1 out into a variable/constant to give a bit more context.
d9f7c78 to
88802b7
Compare
paytonrules
left a comment
There was a problem hiding this comment.
There's a lot of context I don't understand here yet, but LGTM.
88802b7 to
ecfa286
Compare
6db0f51 to
307df45
Compare
Proof of concept that this approach will work for not returning diffs from
GetNewDiffsif they're from blocks that are "old". Randomly chose the threshold of "old" to be from blocks that are greater than 500 back from the most recent header we have in the system, but totally happy to rethink this/do some more digging into a more appropriate threshold.executedockerfile to use this flag: Add DIFF_BLOCK_FROM_HEAD_OF_CHAIN to execute docker startup script vdb-mcd-transformers#184