Skip to content

(VDB-970) urn state history#38

Merged
gslaughl merged 1 commit intostagingfrom
vdb-970-urn-state-history
Dec 9, 2019
Merged

(VDB-970) urn state history#38
gslaughl merged 1 commit intostagingfrom
vdb-970-urn-state-history

Conversation

@gslaughl
Copy link
Copy Markdown
Contributor

@gslaughl gslaughl commented Dec 5, 2019

Creates a trigger-table exposing all historical states of all urns. Automatically responds to data coming in out of order as well as data being deleted.

Art int
Spot int
Rate int
func GetUrnSetupData() map[string]int {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of the changes in this commit are side effects of changing the UrnSetupData type to a plain map[string]int. My original intention was to make it easier to extract a shared test suite for historical_urn_state triggers, but I didn't end up extracting a shared test suite because the triggers ended up being more different than I expected. However, I left these changes in, partly out of laziness, and also because this more closely resembles how we're setting up ilk data in our tests, and I thought there might be some value in having a consistent approach.

var _ = Describe("Vat storage repository", func() {
var (
db *postgres.DB
db = test_config.NewTestDB(test_config.NewTestNode())
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting side-note-- once I added the 112th test to this suite, I started getting a "Too many connections" error from Postgres. I fixed it by moving the DB's instantiation out of BeforeEach, and that also seems to have halved the time it takes to run the suite.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! That seems like a change worth making everywhere, though not necessarily as part of this PR

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created VDB-1058 for that

LEFT JOIN api.historical_urn_state ON urns.identifier = urn_identifier AND ilks.identifier = ilk_identifier
WHERE urns.id = urn_id
SELECT api.epoch_to_datetime(MIN(block_timestamp))
FROM maker.vat_urn_ink
Copy link
Copy Markdown
Contributor Author

@gslaughl gslaughl Dec 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've been using the block timestamp of the first ink diff as the created value for urns, so I continued using that approach here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could there be any value in looking at the earliest frob instead, not to be dependent on state diffs?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds reasonable, but is there any reason to think frob events will be more reliable than ink diffs?

headerTwo = CreateHeaderWithHash(hashTwo.String(), rawTimestampTwo, blockTwo, db)
})

It("inserts time of first ink diff into created", func() {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason I gave the ink and art triggers each their own tests (instead of extracting a suite) is because ink diffs can affect the created value in the trigger-table, and art values can't.

Copy link
Copy Markdown
Contributor

@m0ar m0ar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is amazing work 👷‍♂️ 💞

How come it's so much smaller than the ilk counterpart? Just fewer fields? Or some other engineering feats that could also be applied there?

Also, I think we can both agree that these are a bit hard to test, and it's very hard to try to theorycraft all special circumstances that might occur... Feels like a clear-cut use case of a second order transformer in the future :D

Actually maybe we could write a sanity checking test using dai.js? Sync for a bit, select a couple of blocks. Compare results from direct dai.js calls with what we have?

ink NUMERIC DEFAULT NULL,
art NUMERIC DEFAULT NULL,
created TIMESTAMP DEFAULT NULL,
updated TIMESTAMP NOT NULL,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is created nullable, but updated not?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we're getting created from maker.vat_urn_ink, so if the first diff we parse happens to be an art, we'll have a row with an unknown created value.

LEFT JOIN api.historical_urn_state ON urns.identifier = urn_identifier AND ilks.identifier = ilk_identifier
WHERE urns.id = urn_id
SELECT api.epoch_to_datetime(MIN(block_timestamp))
FROM maker.vat_urn_ink
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could there be any value in looking at the earliest frob instead, not to be dependent on state diffs?



-- +goose StatementBegin
CREATE OR REPLACE FUNCTION maker.delete_redundant_urn_state(urn_id INTEGER, header_id INTEGER) RETURNS api.historical_urn_state
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand nothing... Can we smack a comment on this? 😅

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha yeah for sure 😅

It sounds like you got the gist of it, but it just deletes a historical_urn_state row if it's identical to the previous one. It's necessary because 1 row can have data from both an ink diff and an art diff, so deleting 1 kind of diff doesn't mean you can delete the associated row in this table. First we have to check that the row doesn't contain data from any other diffs, and only then can we delete it.

I agree that it's not very clear what's happening. I can at least add a comment and maybe find a clearer way to accomplish the same thing.

BEGIN
DELETE
FROM api.historical_urn_state
USING maker.urns LEFT JOIN maker.ilks ON urns.ilk_id = ilks.id
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain how USING works here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

USING is basically just a read-only JOIN for delete statements.

AND urns.id = urn_id
AND historical_urn_state.block_height = urn_state_block_number
AND (historical_urn_state.ink IS NULL OR historical_urn_state.ink = prev_state.ink)
AND (historical_urn_state.art IS NULL OR historical_urn_state.art = prev_state.art);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are all these basically saying if both rows are equal?

RETURNS api.ilk_state AS
$$
SELECT *
FROM api.get_ilk(state.ilk_identifier, state.block_height)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the complexity of this call? And the rest really...

Might be a trade-off to be done here, fast access to the "native" data, but having to do a second query to grab the ilk/frobs/bites for a certain block... Not sure actually.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But my point is that if these slow down the queries by a lot, maybe it's not worthwhile to include them even...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's a good point. I just included these computed columns because they're included in the original all_urn_states query, and I wanted to match that functionality. It's probably worth doing some performance analysis on these columns, bc just from looking at the get_ilk query it seems pretty complex.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, get_ilk is what usually locks the endpoint completely, so it's gonna nuke this query too :)

urnSetupData := helper.GetUrnSetupData()
urnMetadata := helper.GetUrnMetadata(helper.FakeIlk.Hex, fakeUrn)
helper.CreateUrn(urnSetupData, urnMetadata, vatRepo)
helper.CreateUrn(urnSetupData, headerOne.Id, urnMetadata, vatRepo)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you cba I think golint prefers headerOne.ID ;)

var _ = Describe("Vat storage repository", func() {
var (
db *postgres.DB
db = test_config.NewTestDB(test_config.NewTestNode())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created VDB-1058 for that

Copy link
Copy Markdown
Contributor Author

@gslaughl gslaughl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@m0ar yeah unfortunately the amount of code is just a product of the number of fields in the table. I agree with your point about maybe needing some higher-level integration tests here, maybe we should make a story for that? I can see some dai.js integration tests being useful in other places too, once we have an initial framework set up.

ink NUMERIC DEFAULT NULL,
art NUMERIC DEFAULT NULL,
created TIMESTAMP DEFAULT NULL,
updated TIMESTAMP NOT NULL,
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we're getting created from maker.vat_urn_ink, so if the first diff we parse happens to be an art, we'll have a row with an unknown created value.

LEFT JOIN api.historical_urn_state ON urns.identifier = urn_identifier AND ilks.identifier = ilk_identifier
WHERE urns.id = urn_id
SELECT api.epoch_to_datetime(MIN(block_timestamp))
FROM maker.vat_urn_ink
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds reasonable, but is there any reason to think frob events will be more reliable than ink diffs?



-- +goose StatementBegin
CREATE OR REPLACE FUNCTION maker.delete_redundant_urn_state(urn_id INTEGER, header_id INTEGER) RETURNS api.historical_urn_state
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha yeah for sure 😅

It sounds like you got the gist of it, but it just deletes a historical_urn_state row if it's identical to the previous one. It's necessary because 1 row can have data from both an ink diff and an art diff, so deleting 1 kind of diff doesn't mean you can delete the associated row in this table. First we have to check that the row doesn't contain data from any other diffs, and only then can we delete it.

I agree that it's not very clear what's happening. I can at least add a comment and maybe find a clearer way to accomplish the same thing.

BEGIN
DELETE
FROM api.historical_urn_state
USING maker.urns LEFT JOIN maker.ilks ON urns.ilk_id = ilks.id
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

USING is basically just a read-only JOIN for delete statements.

RETURNS api.ilk_state AS
$$
SELECT *
FROM api.get_ilk(state.ilk_identifier, state.block_height)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's a good point. I just included these computed columns because they're included in the original all_urn_states query, and I wanted to match that functionality. It's probably worth doing some performance analysis on these columns, bc just from looking at the get_ilk query it seems pretty complex.

Add computed columns

Add out-of-order diff triggers

Add delete triggers for re-org safety
@gslaughl gslaughl force-pushed the vdb-970-urn-state-history branch from 51ea3e7 to 9408a23 Compare December 9, 2019 17:56
@gslaughl gslaughl merged commit 9bfaaf2 into staging Dec 9, 2019
@gslaughl gslaughl deleted the vdb-970-urn-state-history branch December 9, 2019 18:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants