(VDB-1315) Add command for backfilling Urn diffs#189
Conversation
rmulhol
commented
May 7, 2020
- insert diffs from every block with a frob for involved urn
aed69e6 to
a0ea218
Compare
paytonrules
left a comment
There was a problem hiding this comment.
This has been running in production for a while - so it's clearly working. However one thing I'd like to see would be some additions to the README explaining how to run these, and a hint on how to get started.
Nothing huge, just something like "we use Cobra for back-fills and command line scripts that run against VDB. Here's the command to run it. "
backfill/frob/backfiller.go
Outdated
|
|
||
| logrus.Infof("getting storage for %d grabs", len(frobs)) | ||
| for i, frob := range frobs { | ||
| err := shared.FetchAndPersistDartDinkDiffs(frob.Dart, frob.Dink, frob.UrnID, frob.HeaderID, backFiller.eventsRepository, backFiller.storageRepository, backFiller.blockChain) |
There was a problem hiding this comment.
Maybe consider a struct for those first four params? It would prevent accidentally switching Dink/Dart and UrnID/HeaderID.
There was a problem hiding this comment.
That's a great idea - I made this mistake earlier and manually rechecked these params approx ~1 million times 😂
backfill/grab/backfiller_test.go
Outdated
| Expect(errors.Is(err, fakes.FakeError)).To(BeTrue()) | ||
| }) | ||
|
|
||
| It("ignores grab if dink and dart are zero", func() { |
There was a problem hiding this comment.
Some of these tests are duplicated and could be moved to the Shared lib function that does most of the work.
| } | ||
| } | ||
| return nil | ||
| } |
There was a problem hiding this comment.
If you do another backfiller beyond these two, you may consider making each backfiller a strategy. Essentially add methods for things like "error message" and "log storage" and have one place that does the Get"Things" then loop through them logic. As opposed to copy/pasta.
| err := e.db.Select(&grabs, `SELECT header_id, urn_id, dink, dart | ||
| FROM maker.vat_grab | ||
| JOIN public.headers ON vat_grab.header_id = headers.id | ||
| WHERE headers.block_number >= $1`, startingBlock) |
There was a problem hiding this comment.
How come you don't order the Grabs but do the Frobs? I'm not sure it matters but it's odd that they are slightly different.
There was a problem hiding this comment.
Probably makes sense to order both 👍
The bigger the system gets and the more events we accumulate, I'm also starting to worry about unbounded results sets - i.e. how can we grab manageable chunks of events to backfill and proceed through from there
| Expect(err).NotTo(HaveOccurred()) | ||
| Expect(len(frobs)).To(Equal(2)) | ||
| Expect(frobs[0]).To(Equal(frobAtEarlierBlock)) | ||
| }) |
There was a problem hiding this comment.
What about the error case? Does that require shutting down the db or something?
| github.com/onsi/gomega v1.7.0 | ||
| github.com/pkg/errors v0.8.1 // indirect | ||
| github.com/sirupsen/logrus v1.2.0 | ||
| github.com/spf13/cobra v0.0.3 |
- insert diffs from every block with a frob for involved urn
- Don't use create_back_filled_diff since it's slow and we don't need to guard against redundant diffs (e.g. same value at new block) since we're only attempting inserts when we know there's a delta
- Also rename to Frob Backfiller since we're fetching storage based on that event and may fetch urn data from other events
- Don't load all urns up front since it's likely only some of them have frobs since the starting block - Refactor to pull out shared code between frob + grab backfilling
elizabethengelman
left a comment
There was a problem hiding this comment.
LGTM! ![]()
Just a few questions/comments - nothing merge block though!
README.md
Outdated
| This project contains one executable, `backfillUrns`. | ||
| This command enables you to get storage data for Urns by performing lookups at blocks where we detected events indicating an Urn state change. | ||
| To run it, you first need to `go build`. | ||
| After that, you can run something like `./vdb-mcd-transformers backfillUrns --starting-block 10000000` - which would backfill Urn storage since block 10,000,000. |
There was a problem hiding this comment.
Super small, but I wonder if it would be worth mentioning that this commands backfills until the head of the chain?
| return grabs, err | ||
| } | ||
|
|
||
| func (e eventsRepository) GetHeaderByID(id int) (core.Header, error) { |
There was a problem hiding this comment.
Wondering if this method would make sense in the HeaderRepository? 🤔
There was a problem hiding this comment.
Yeah definitely - can open up a PR over there to add it 👍
| return urn, err | ||
| } | ||
|
|
||
| func (repo storageRepository) InsertDiff(diff types.RawDiff) error { |
There was a problem hiding this comment.
Was trying to think of a way to reuse the CreateStorageDiff code here, but it looks like we'd need to rework that method to allow a from_backfill value to be passed in so that we could make sure these are set to true. So maybe it's best to keep this as is, since it's specific to backfilling.
There was a problem hiding this comment.
Will take a look - maybe we could add a new function in the storage repository, at least, for creating a back filled diff without using the DB function that protects against dupes
| urnErr := db.Get(&urnID, `INSERT INTO maker.urns (ilk_id, identifier) VALUES ($1, $2) RETURNING id`, ilkID, fakeUrn) | ||
| Expect(urnErr).NotTo(HaveOccurred()) | ||
|
|
||
| urns, err := repo.GetUrnByID(urnID) |
There was a problem hiding this comment.
super small, but this could probably be urn instead of urns
| if err != nil { | ||
| return fmt.Errorf("error checking if dart/dink are zero: %w", err) | ||
| } | ||
| if dartIsZero && dinkIsZero { |
There was a problem hiding this comment.
I wonder if it's worth it to add a comment here about why we're returning when dart and dink are both zero?
I've been thinking about zero-valued diffs for too long and did a double take because I was thinking these were diffs, and not frob/grab values, so its also super possible this isn't necessary 😜
| return storage.GetKeyForMapping(storage.IndexTwo, ilk) | ||
| } | ||
|
|
||
| func GetUrnArtKey(u repository.Urn) (common.Hash, error) { |
There was a problem hiding this comment.
I wonder if it's worth making getUrnArtKey and getUrnInkKey public so that we can reuse that code. If the way we get either of these keys changes, it would be nice to only have to change it in one place. Though I'm not sure how likely it is that that will ever change. 🤔
