Skip to content

(VDB-1315) Add command for backfilling Urn diffs#189

Merged
rmulhol merged 6 commits intostagingfrom
backfill-urns
May 19, 2020
Merged

(VDB-1315) Add command for backfilling Urn diffs#189
rmulhol merged 6 commits intostagingfrom
backfill-urns

Conversation

@rmulhol
Copy link
Copy Markdown
Contributor

@rmulhol rmulhol commented May 7, 2020

  • insert diffs from every block with a frob for involved urn

@rmulhol rmulhol force-pushed the backfill-urns branch 5 times, most recently from aed69e6 to a0ea218 Compare May 12, 2020 16:58
Copy link
Copy Markdown
Contributor

@paytonrules paytonrules left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been running in production for a while - so it's clearly working. However one thing I'd like to see would be some additions to the README explaining how to run these, and a hint on how to get started.

Nothing huge, just something like "we use Cobra for back-fills and command line scripts that run against VDB. Here's the command to run it. "


logrus.Infof("getting storage for %d grabs", len(frobs))
for i, frob := range frobs {
err := shared.FetchAndPersistDartDinkDiffs(frob.Dart, frob.Dink, frob.UrnID, frob.HeaderID, backFiller.eventsRepository, backFiller.storageRepository, backFiller.blockChain)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe consider a struct for those first four params? It would prevent accidentally switching Dink/Dart and UrnID/HeaderID.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great idea - I made this mistake earlier and manually rechecked these params approx ~1 million times 😂

Expect(errors.Is(err, fakes.FakeError)).To(BeTrue())
})

It("ignores grab if dink and dart are zero", func() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these tests are duplicated and could be moved to the Shared lib function that does most of the work.

}
}
return nil
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you do another backfiller beyond these two, you may consider making each backfiller a strategy. Essentially add methods for things like "error message" and "log storage" and have one place that does the Get"Things" then loop through them logic. As opposed to copy/pasta.

err := e.db.Select(&grabs, `SELECT header_id, urn_id, dink, dart
FROM maker.vat_grab
JOIN public.headers ON vat_grab.header_id = headers.id
WHERE headers.block_number >= $1`, startingBlock)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come you don't order the Grabs but do the Frobs? I'm not sure it matters but it's odd that they are slightly different.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably makes sense to order both 👍

The bigger the system gets and the more events we accumulate, I'm also starting to worry about unbounded results sets - i.e. how can we grab manageable chunks of events to backfill and proceed through from there

Expect(err).NotTo(HaveOccurred())
Expect(len(frobs)).To(Equal(2))
Expect(frobs[0]).To(Equal(frobAtEarlierBlock))
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the error case? Does that require shutting down the db or something?

github.com/onsi/gomega v1.7.0
github.com/pkg/errors v0.8.1 // indirect
github.com/sirupsen/logrus v1.2.0
github.com/spf13/cobra v0.0.3
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

Rob Mulholand added 5 commits May 15, 2020 15:53
- insert diffs from every block with a frob for involved urn
- Don't use create_back_filled_diff since it's slow and we don't need
  to guard against redundant diffs (e.g. same value at new block) since
  we're only attempting inserts when we know there's a delta
- Also rename to Frob Backfiller since we're fetching storage based on
  that event and may fetch urn data from other events
- Don't load all urns up front since it's likely only some of them have
  frobs since the starting block
- Refactor to pull out shared code between frob + grab backfilling
@rmulhol rmulhol changed the title Add command for backfilling Urn diffs (VDB-1315) Add command for backfilling Urn diffs May 18, 2020
Copy link
Copy Markdown
Contributor

@elizabethengelman elizabethengelman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! :shipit:

Just a few questions/comments - nothing merge block though!

README.md Outdated
This project contains one executable, `backfillUrns`.
This command enables you to get storage data for Urns by performing lookups at blocks where we detected events indicating an Urn state change.
To run it, you first need to `go build`.
After that, you can run something like `./vdb-mcd-transformers backfillUrns --starting-block 10000000` - which would backfill Urn storage since block 10,000,000.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super small, but I wonder if it would be worth mentioning that this commands backfills until the head of the chain?

return grabs, err
}

func (e eventsRepository) GetHeaderByID(id int) (core.Header, error) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if this method would make sense in the HeaderRepository? 🤔

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah definitely - can open up a PR over there to add it 👍

return urn, err
}

func (repo storageRepository) InsertDiff(diff types.RawDiff) error {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was trying to think of a way to reuse the CreateStorageDiff code here, but it looks like we'd need to rework that method to allow a from_backfill value to be passed in so that we could make sure these are set to true. So maybe it's best to keep this as is, since it's specific to backfilling.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will take a look - maybe we could add a new function in the storage repository, at least, for creating a back filled diff without using the DB function that protects against dupes

urnErr := db.Get(&urnID, `INSERT INTO maker.urns (ilk_id, identifier) VALUES ($1, $2) RETURNING id`, ilkID, fakeUrn)
Expect(urnErr).NotTo(HaveOccurred())

urns, err := repo.GetUrnByID(urnID)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super small, but this could probably be urn instead of urns

if err != nil {
return fmt.Errorf("error checking if dart/dink are zero: %w", err)
}
if dartIsZero && dinkIsZero {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it's worth it to add a comment here about why we're returning when dart and dink are both zero?

I've been thinking about zero-valued diffs for too long and did a double take because I was thinking these were diffs, and not frob/grab values, so its also super possible this isn't necessary 😜

return storage.GetKeyForMapping(storage.IndexTwo, ilk)
}

func GetUrnArtKey(u repository.Urn) (common.Hash, error) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it's worth making getUrnArtKey and getUrnInkKey public so that we can reuse that code. If the way we get either of these keys changes, it would be nice to only have to change it in one place. Though I'm not sure how likely it is that that will ever change. 🤔

@rmulhol rmulhol merged commit 279c712 into staging May 19, 2020
@rmulhol rmulhol deleted the backfill-urns branch May 19, 2020 22:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants