Skip to content

(VDB-1363) backfill events#212

Merged
rmulhol merged 5 commits intostagingfrom
vdb-1363-backfill-events
Jul 16, 2020
Merged

(VDB-1363) backfill events#212
rmulhol merged 5 commits intostagingfrom
vdb-1363-backfill-events

Conversation

@rmulhol
Copy link
Copy Markdown
Contributor

@rmulhol rmulhol commented Jun 17, 2020

No description provided.

@rmulhol rmulhol force-pushed the vdb-1363-backfill-events branch from ab2a5d2 to 9f4be4d Compare June 18, 2020 18:38
@rmulhol rmulhol force-pushed the vdb-1363-backfill-events branch 3 times, most recently from dcdd3d0 to be02733 Compare June 30, 2020 15:10
Copy link
Copy Markdown
Contributor

@elizabethengelman elizabethengelman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Just a few questions/small things.



### Build
`docker build -f dockerfiles/backfill_events/Dockerfile . -t backfill_eventslatest`
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the tag name is missing a colon: %s/backfill_eventslatest/backfill_events:latest

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch!


## Run
```
docker run -e DATABASE_NAME=vulcanize_public -e DATABASE_HOSTNAME=host.docker.internal -e DATABASE_PORT=5432 -e DATABASE_USER=user -e DATABASE_PASSWORD=pw -e CLIENT_IPCPATH=https://mainnet.infura.io/v3/token -e ENDING_BLOCK_NUMBER=1 -it backfill_event:latest
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super tiny, but the tag above in the build step is backfill_events instead of backfill_event

RUN make plugin PACKAGE=github.com/makerdao/vdb-mcd-transformers

RUN make plugin PACKAGE=github.com/makerdao/vdb-mcd-transformers \
TARGET_LOCATION=$GOPATH/src/github.com/makerdao/vdb-mcd-transformers/plugins/execute/transformerExporter.go \
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this TARGET_LOCATION be pointing at .../plugins/backfill/... instead of .../plugins/execute/...?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not at the moment, but this is definitely a bit confusing. plugins/backfill is used by backfill_events and has no storage transformers. I wonder if it would make sense to have new exporters for both plugins/backfill_events and a plugins/backfill_storage - since we probably don't need all of the transformers from execute when running backfill_storage. Open to other ideas

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Splitting backfill_events and backfill_storage makes sense to me!

@@ -0,0 +1,247 @@
[exporter]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it's possible to use the docker.toml file instead of creating a new one, even though docker.toml has config for contracts that we don't necessarily need for this command. 🤔

Copy link
Copy Markdown
Contributor Author

@rmulhol rmulhol Jul 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that could work, and may even be preferable for reduced complexity right now (since this config includes the lowest deployment block in docker.toml). The only thing that had me thinking we might want to go this route was the possibility that in the future we would only need to backfill events for transformers that had been deployed pretty recently (e.g. new collateral types). In that case, pointing at docker.toml would mean unnecessarily rechecking all earlier blocks for all watched events (since the starting block for backfilling is derived from the lowest deployment block in the config) - and this one would let us target only more recent blocks since the new collateral contracts were deployed. Think it's worth removing this and kicking the can down the road to a later story when we cross that bridge?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point that we'll likely want to backfill with a more recent block number for some transformers. 🤔 I'm good with leaving this as is, or kicking the decision down the road. 👍

package main

import (
deny "github.com/makerdao/vdb-mcd-transformers/transformers/events/auth/deny_initializer"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that these events are ones that we've explicitly identified as ones that need to be backfilled - is the idea to avoid backfilling all of the events, because it would likely be too time consuming?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, basically. I don't know if log fetching performance degrades when you query for more addresses/topics, but I think minimizing the result set we get back will help avoid attempted inserts to public.event_logs that are bound to fail on uniqueness constraint violations. Also, eventually I'm hoping that the set of transformers we're including here would only target very recently deployed contracts, so that there'd be less work to do scanning deployment blocks for contracts that were deployed a long time ago.


import (
deny "github.com/makerdao/vdb-mcd-transformers/transformers/events/auth/deny_initializer"
rely "github.com/makerdao/vdb-mcd-transformers/transformers/events/auth/rely_initializer"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I was trying to think if there could be a way to configure the backfill command so that we wouldn't have to redeploy if we discover that another event also needs to be backfilled. 🤔 It seems that we'll need to run backfills when new collaterals are deployed, so we'd need to update the config file with the new collateral contracts which would trigger a redeploy anyway, so it's probably not a big deal. 🤷

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, definitely a bit of a bummer to need to update this whenever we have something new that needs to be backfilled, but I couldn't really think of a way around that if we're in a situation where we truly need to check many blocks for a new event 🤔

I am hoping that we can run this container in (potentially several) disposable resources, so that the next time we update this config we could wipe out everything that's already here (since it's being taken care of in an active backfill process) and include only the new transformers requiring backfill (to spin up a new backfill process for that one).

@rmulhol rmulhol force-pushed the vdb-1363-backfill-events branch from be02733 to f9daf06 Compare July 15, 2020 16:39
@rmulhol rmulhol merged commit cea6060 into staging Jul 16, 2020
@rmulhol rmulhol deleted the vdb-1363-backfill-events branch July 16, 2020 14:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants