(VDB-970) urn state history by gslaughl · Pull Request #38 · sky-ecosystem/vdb-mcd-transformers

gslaughl · 2019-12-05T15:30:19Z

Creates a trigger-table exposing all historical states of all urns. Automatically responds to data coming in out of order as well as data being deleted.

gslaughl · 2019-12-05T15:44:30Z

transformers/component_tests/queries/test_helpers/test_helpers.go

-	Art    int
-	Spot   int
-	Rate   int
+func GetUrnSetupData() map[string]int {


A lot of the changes in this commit are side effects of changing the UrnSetupData type to a plain map[string]int. My original intention was to make it easier to extract a shared test suite for historical_urn_state triggers, but I didn't end up extracting a shared test suite because the triggers ended up being more different than I expected. However, I left these changes in, partly out of laziness, and also because this more closely resembles how we're setting up ilk data in our tests, and I thought there might be some value in having a consistent approach.
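For readers unfamiliar with the pattern, here is a minimal sketch of map-based setup data. Only the return type `map[string]int` comes from the diff above; the keys, the values, and the `withOverrides` helper are hypothetical and chosen purely for illustration.

```go
package main

import "fmt"

// getUrnSetupData returns a fresh map of default values keyed by field name.
// The keys and values are illustrative assumptions, not the PR's real data.
func getUrnSetupData() map[string]int {
	return map[string]int{"ink": 1, "art": 2, "spot": 3, "rate": 4}
}

// withOverrides (hypothetical helper) starts from the defaults and replaces
// only the fields a given test cares about.
func withOverrides(overrides map[string]int) map[string]int {
	data := getUrnSetupData()
	for field, value := range overrides {
		data[field] = value
	}
	return data
}

func main() {
	data := withOverrides(map[string]int{"art": 10})
	fmt.Println(data["ink"], data["art"])
}
```

Because each call builds a new map, one test's overrides cannot leak into another test's defaults.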

gslaughl · 2019-12-05T15:48:16Z

transformers/storage/vat/repository_test.go

 var _ = Describe("Vat storage repository", func() {
 	var (
-		db           *postgres.DB
+		db           = test_config.NewTestDB(test_config.NewTestNode())


Interesting side note: once I added the 112th test to this suite, I started getting a "Too many connections" error from Postgres. I fixed it by moving the DB's instantiation out of BeforeEach, and that also seems to have halved the time it takes to run the suite.

Nice! That seems like a change worth making everywhere, though not necessarily as part of this PR.

m0ar · 2019-12-09T13:12:59Z: Created VDB-1058 for that.
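The connection behaviour can be illustrated without Postgres or Ginkgo. The sketch below rests on an assumption: that each per-test instantiation held on to its connection for the rest of the run. `fakeServer`, `runSuite`, and the limit of 100 are all hypothetical stand-ins, not part of the repository.

```go
package main

import (
	"errors"
	"fmt"
)

var errTooMany = errors.New("too many connections")

// fakeServer stands in for Postgres and only counts open connections.
type fakeServer struct {
	open, max int
}

// connect simulates opening a connection that is never closed.
func (s *fakeServer) connect() error {
	if s.open >= s.max {
		return errTooMany
	}
	s.open++
	return nil
}

// runSuite runs n tests. With perTest set, every test opens its own
// connection (the BeforeEach style); otherwise one connection is opened
// up front and shared by all tests.
func runSuite(s *fakeServer, n int, perTest bool) error {
	if !perTest {
		if err := s.connect(); err != nil {
			return err
		}
	}
	for i := 0; i < n; i++ {
		if perTest {
			if err := s.connect(); err != nil {
				return fmt.Errorf("test %d: %w", i+1, err)
			}
		}
	}
	return nil
}

func main() {
	fmt.Println(runSuite(&fakeServer{max: 100}, 112, true))
	fmt.Println(runSuite(&fakeServer{max: 100}, 112, false))
}
```

Under these assumptions the per-test style fails once the limit is reached, while the shared style uses a single connection no matter how many tests run.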

gslaughl · 2019-12-05T15:52:02Z

db/migrations/20191202091203_create_historical_urn_state_table_and_triggers.sql

-         LEFT JOIN api.historical_urn_state ON urns.identifier = urn_identifier AND ilks.identifier = ilk_identifier
-WHERE urns.id = urn_id
+SELECT api.epoch_to_datetime(MIN(block_timestamp))
+FROM maker.vat_urn_ink


We've been using the block timestamp of the first ink diff as the created value for urns, so I continued using that approach here.

m0ar · 2019-12-09T12:55:35Z: Could there be any value in looking at the earliest frob instead, so that we don't depend on state diffs?

gslaughl · 2019-12-09T15:56:32Z: That sounds reasonable, but is there any reason to think frob events will be more reliable than ink diffs?

gslaughl · 2019-12-05T15:54:32Z

transformers/storage/vat/repository_test.go

 				headerTwo = CreateHeaderWithHash(hashTwo.String(), rawTimestampTwo, blockTwo, db)
 			})

+			It("inserts time of first ink diff into created", func() {


The main reason I gave the ink and art triggers each their own tests (instead of extracting a suite) is because ink diffs can affect the created value in the trigger-table, and art values can't.

m0ar

This is amazing work 👷‍♂️ 💞

How come it's so much smaller than the ilk counterpart? Just fewer fields? Or some other engineering feats that could also be applied there?

Also, I think we can both agree that these are a bit hard to test, and it's very hard to try to theorycraft all special circumstances that might occur... Feels like a clear-cut use case of a second order transformer in the future :D

Actually maybe we could write a sanity checking test using dai.js? Sync for a bit, select a couple of blocks. Compare results from direct dai.js calls with what we have?

m0ar · 2019-12-09T12:52:04Z

db/migrations/20191202091203_create_historical_urn_state_table_and_triggers.sql

+    ink            NUMERIC   DEFAULT NULL,
+    art            NUMERIC   DEFAULT NULL,
+    created        TIMESTAMP DEFAULT NULL,
+    updated        TIMESTAMP NOT NULL,


Why is created nullable, but updated not?

gslaughl · 2019-12-09T15:55:28Z: Because we're getting created from maker.vat_urn_ink, so if the first diff we parse happens to be an art diff, we'll have a row with an unknown created value.
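The nullable `created` value can be modelled in a few lines of Go. This is a sketch of the idea only (take the minimum block timestamp over ink diffs, and have no value when none has been seen), not the migration's SQL; the `diff` type and its field names are assumptions made for the example.

```go
package main

import (
	"fmt"
	"time"
)

// diff is a simplified storage diff; kind is "ink" or "art".
type diff struct {
	kind           string
	blockTimestamp int64 // Unix seconds
}

// created returns the time of the earliest ink diff, or nil when no ink
// diff has been seen yet, mirroring the nullable created column.
func created(diffs []diff) *time.Time {
	var earliest *int64
	for _, d := range diffs {
		if d.kind != "ink" {
			continue // art diffs never influence created
		}
		if earliest == nil || d.blockTimestamp < *earliest {
			ts := d.blockTimestamp
			earliest = &ts
		}
	}
	if earliest == nil {
		return nil
	}
	t := time.Unix(*earliest, 0).UTC()
	return &t
}

func main() {
	diffs := []diff{{"art", 1575500000}, {"ink", 1575560000}, {"ink", 1575550000}}
	if t := created(diffs); t != nil {
		fmt.Println(t.Format(time.RFC3339))
	}
	fmt.Println(created(diffs[:1]) == nil)
}
```

Taking the minimum rather than the first value seen is what keeps the result stable when diffs arrive out of order.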

m0ar · 2019-12-09T12:56:22Z

db/migrations/20191202091203_create_historical_urn_state_table_and_triggers.sql

+
+
+-- +goose StatementBegin
+CREATE OR REPLACE FUNCTION maker.delete_redundant_urn_state(urn_id INTEGER, header_id INTEGER) RETURNS api.historical_urn_state


I understand nothing... Can we smack a comment on this? 😅

gslaughl · 2019-12-09T16:01:31Z: Haha yeah for sure 😅

It sounds like you got the gist of it, but it just deletes a historical_urn_state row if it's identical to the previous one. It's necessary because 1 row can have data from both an ink diff and an art diff, so deleting 1 kind of diff doesn't mean you can delete the associated row in this table. First we have to check that the row doesn't contain data from any other diffs, and only then can we delete it.

I agree that it's not very clear what's happening. I can at least add a comment and maybe find a clearer way to accomplish the same thing.

m0ar · 2019-12-09T12:58:10Z

db/migrations/20191202091203_create_historical_urn_state_table_and_triggers.sql

+BEGIN
+    DELETE
+    FROM api.historical_urn_state
+        USING maker.urns LEFT JOIN maker.ilks ON urns.ilk_id = ilks.id


Can you explain how USING works here?

gslaughl · 2019-12-09T16:03:27Z: USING is basically just a read-only JOIN for delete statements.

m0ar · 2019-12-09T12:58:35Z

db/migrations/20191202091203_create_historical_urn_state_table_and_triggers.sql

+      AND urns.id = urn_id
+      AND historical_urn_state.block_height = urn_state_block_number
+      AND (historical_urn_state.ink IS NULL OR historical_urn_state.ink = prev_state.ink)
+      AND (historical_urn_state.art IS NULL OR historical_urn_state.art = prev_state.art);


Are all of these basically checking whether both rows are equal?
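As a reading aid, the last two conditions in the hunk above can be modelled in Go, with a nil pointer standing in for SQL NULL. This sketch covers only the `ink`/`art` comparison shown in the hunk, not the rest of `maker.delete_redundant_urn_state`; the `urnState` type and helper names are hypothetical.

```go
package main

import "fmt"

// urnState models one historical_urn_state row; nil means SQL NULL.
type urnState struct {
	blockHeight int
	ink, art    *int
}

func intPtr(v int) *int { return &v }

// sameOrNull mirrors `cur IS NULL OR cur = prev`. When prev is NULL the SQL
// comparison yields NULL, which does not satisfy WHERE, so this is false.
func sameOrNull(cur, prev *int) bool {
	if cur == nil {
		return true
	}
	return prev != nil && *cur == *prev
}

// isRedundant reports whether cur carries no information beyond prev,
// judged on ink and art only.
func isRedundant(cur, prev urnState) bool {
	return sameOrNull(cur.ink, prev.ink) && sameOrNull(cur.art, prev.art)
}

func main() {
	prev := urnState{blockHeight: 1, ink: intPtr(5), art: intPtr(7)}
	unchanged := urnState{blockHeight: 2, ink: intPtr(5)}
	changed := urnState{blockHeight: 2, ink: intPtr(6), art: intPtr(7)}
	fmt.Println(isRedundant(unchanged, prev), isRedundant(changed, prev))
}
```

So the conditions are slightly looser than strict equality: a NULL field on the current row also counts as matching.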

m0ar · 2019-12-09T13:08:51Z

db/migrations/20191203104431_create_historical_urn_state_computed_columns.sql

+    RETURNS api.ilk_state AS
+$$
+SELECT *
+FROM api.get_ilk(state.ilk_identifier, state.block_height)


What's the complexity of this call? And the rest really...

There might be a trade-off to make here: fast access to the "native" data, at the cost of a second query to grab the ilk/frobs/bites for a certain block... Not sure actually.

But my point is that if these slow down the queries by a lot, maybe it's not worthwhile to include them even...

gslaughl · 2019-12-09T16:20:00Z: Yeah that's a good point. I just included these computed columns because they're included in the original all_urn_states query, and I wanted to match that functionality. It's probably worth doing some performance analysis on these columns, because just from looking at the get_ilk query it seems pretty complex.

Yeah, get_ilk is what usually locks the endpoint completely, so it's gonna nuke this query too :)

m0ar · 2019-12-09T13:09:53Z

transformers/component_tests/queries/all_urn_states_query_test.go

+		urnSetupData := helper.GetUrnSetupData()
 		urnMetadata := helper.GetUrnMetadata(helper.FakeIlk.Hex, fakeUrn)
-		helper.CreateUrn(urnSetupData, urnMetadata, vatRepo)
+		helper.CreateUrn(urnSetupData, headerOne.Id, urnMetadata, vatRepo)


If you cba I think golint prefers headerOne.ID ;)

gslaughl

@m0ar yeah unfortunately the amount of code is just a product of the number of fields in the table. I agree with your point about maybe needing some higher-level integration tests here, maybe we should make a story for that? I can see some dai.js integration tests being useful in other places too, once we have an initial framework set up.

gslaughl force-pushed the vdb-970-urn-state-history branch from 5ebf4a4 to 51ea3e7 Compare December 5, 2019 16:09

gslaughl requested review from elizabethengelman, m0ar, rmulhol and yaoandrew December 5, 2019 16:14

m0ar approved these changes Dec 9, 2019

(VDB-970) Create historical_urn_state trigger table

9408a23

Add computed columns
Add out-of-order diff triggers
Add delete triggers for re-org safety

gslaughl force-pushed the vdb-970-urn-state-history branch from 51ea3e7 to 9408a23 Compare December 9, 2019 17:56

gslaughl merged commit 9bfaaf2 into staging Dec 9, 2019

gslaughl deleted the vdb-970-urn-state-history branch December 9, 2019 18:31


