Skip to content

(VDB-955) Associate diffs with headers#24

Merged
rmulhol merged 2 commits intostagingfrom
vdb-955-reorg-safe-diffs
Nov 26, 2019
Merged

(VDB-955) Associate diffs with headers#24
rmulhol merged 2 commits intostagingfrom
vdb-955-reorg-safe-diffs

Conversation

@rmulhol
Copy link
Copy Markdown
Contributor

@rmulhol rmulhol commented Nov 22, 2019

  • Enables cascade delete on reorgs

Corresponding PR for sky-ecosystem/vulcanizedb#6

Diff is obnoxiously large because header_id needed to be added to the storage repository interface's Create function. Thought about adding the utils.StorageDiff so that we could just change that struct for similar stories in the future, but that wouldn't be able to replace the metadata (derived from the keys lookup) or value (derived from the decoder).

With that change required, I felt it made sense to remove the block_number + block_hash from the args (and parsed diff tables), since that data is already on the header (don't want to copy it in two places). Didn't want to remove it entirely from the utils.StorageDiffstruct yet since that struct is currently used for both parsing raw diffs (from the rpc/csv/etc) and passing to the keys lookup - but fixing that (using different structs for parsing raw vs keys lookup) is another change I'd be in favor or queueing up.

With the header_id added to the tables as a not null reference to the public.headers table, every component test needed a header inserted before diffs were created. The last level of changes in this PR was implementing a consistent approach for this across all of the component tests so that future cascading changes (should they occur) will at least be dealing with a more uniform suite.

Appreciate everyone's time bearing with me through this - not happy the diff is so large, but excited for the feature we get in return. Hoping not to have to deal with too many rebases 😅

@rmulhol rmulhol force-pushed the vdb-955-reorg-safe-diffs branch 2 times, most recently from 0750fe3 to f8520d1 Compare November 24, 2019 23:52
@rmulhol rmulhol changed the title [WIP] (VDB-955) Associate diffs with headers (VDB-955) Associate diffs with headers Nov 25, 2019
Copy link
Copy Markdown
Contributor

@gslaughl gslaughl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sheesh... nice job on this! Appreciate you taking the time to clean up tests on top of everything else 😃

FROM (SELECT DISTINCT ON (urn_id) urn_id,
block_timestamp
-- TODO: should we be using urn ink for created?
-- Can a CDP exist before collateral is locked?
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, seems worth investigating. I took a glance at the contracts and I'm not sure but it seems like urns are created via Vat.frob? And the logic around whether a CDP is safe is pretty complex, but it seems possible that an urn could be created with 0 ink, in which case we wouldn't see a diff. Might want to circle back on this in another story

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 VDB-1047

id SERIAL PRIMARY KEY,
-- TODO: remove hash?
-- Can we replace block number + hash with header_id?
-- Tricky because there might be multiple diffs contributing to a single row
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be fine to replace block_hash with a header_id since block_hash is just being used as a mechanism to get the header's timestamp. And a header_id seems more conducive to that kind of thing. We could do the same thing with block_number, but then again that seems like something we want to expose through the api, so it might make sense to just keep it in this table.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, going to leave this to the trigger-table-reorg-safety stories. I think it probably makes sense to remove hash and number and then fetch those things based on header_id at query time, so that we're not duplicating the data - but re-org safety and query performance are probably higher priorities that merit deeper consideration before making the change

})
})

It("returns ilk states without timestamps if the corresponding header hasn't been synced yet", func() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

Copy link
Copy Markdown
Contributor

@elizabethengelman elizabethengelman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

UNIQUE (header_id, ilk_id, dust)
);

CREATE INDEX vat_ilk_dust_block_number_index
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if for some of these the index to the header_id would be helpful 🤔

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense - will go ahead and add them, since we're joining so much on that column

LEFT JOIN public.headers ON vat_urn_ink.header_id = headers.id
ORDER BY urn_id, block_number ASC) earliest_blocks),
updated AS (SELECT DISTINCT ON (urn_id) urn_id, api.epoch_to_datetime(block_timestamp) AS datetime
FROM ((SELECT DISTINCT ON (urn_id) urn_id, block_hash
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to think through if there could be any side effects of selecting distinct on block_hash vs block_timestamp. 🤔 Seems like it'll be okay since both block_hash and block_timestamp should both be unique.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call! I think we can depend on block_timestamp being unique, but I also wonder if it might make sense to just select distinct on header_id 🤔

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait, think I misread - isn't this only selecting distinct on urn_id? And then it's just selecting the block_timestamp for each distinct urn_id so that it can use it in api.epoch_to_datetime

@rmulhol rmulhol force-pushed the vdb-955-reorg-safe-diffs branch from f8520d1 to 16bcd59 Compare November 26, 2019 17:56
Rob Mulholand added 2 commits November 26, 2019 12:19
- Enables cascade delete on reorgs
@rmulhol rmulhol force-pushed the vdb-955-reorg-safe-diffs branch from 16bcd59 to ca10c1b Compare November 26, 2019 18:19
@rmulhol rmulhol merged commit f2291c8 into staging Nov 26, 2019
@rmulhol rmulhol deleted the vdb-955-reorg-safe-diffs branch November 26, 2019 18:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants