Preliminary introduction of RowSet with supporting changes#427
Conversation
evan-forbes left a comment
Left a few minor comments and suggestions. Overall, I really like the changes so far in this PR. I think the changes relating to decoupling providing (ref #428) and refactoring PutBlock into PutData (ref #196) could both be isolated into their own PRs. Alternatively, I would also be ok with moving the RowSet-related changes into a different PR until we discuss further in #423 and the upcoming ADR that maps out the implementation changes required to propagate block data.
really looking forward to these already implemented changes, great work, and good hustle @Wondertan
```go
func ProvideData(
	ctx context.Context,
	dah *DataAvailabilityHeader,
	croute routing.ContentRouting,
	logger log.Logger) error {
```
This change is awesome, and is a natural extension of #375. I really really like it. As discussed in our sync call, I think we should decouple as many of the features of this PR as possible, and this can be cherry picked out. ref #428
```go
// compute roots
eds.ColumnRoots()
```
Why not also compute the row roots? Shouldn't that be an issue, considering that we only use the row roots to retrieve data?
If we look at the code, we will see that calling either ColumnRoots or RowRoots computes the roots for all rows and columns. Semantically, I should've used RowRoots instead, since we mostly rely on rows everywhere, but technically it does not matter.
```go
type DataAvailabilityHeader struct {
	// RowRoot_j = root((M_{j,1} || M_{j,2} || ... || M_{j,2k} ))
	RowsRoots NmtRoots `json:"row_roots"`
	// ColumnRoot_j = root((M_{1,j} || M_{2,j} || ... || M_{2k,j} ))
	ColumnRoots NmtRoots `json:"column_roots"`
	// cached result of Hash() not to be recomputed
	hash []byte
}
```
Even though moving the DataAvailabilityHeader outside of the types package breaks tendermint's pattern of keeping the core types inside that package, I think this makes a lot of sense.
Tendermint's pattern is monolithic, and we should steadily move away from it.
The changes under

> Updates to p2p/ipld API
> [...]
> Updates to types package

should be broken into separate PRs. We also need to update the ADR that deals with the ipld package.
> Start relying on RowSet updates p2p/ipld API in consensus/state
These are the only changes directly related to switching gossiping from tendermint chunks to shares, and they are only the most basic preparation. It's hard to get a full picture of the required changes from this work. Similar to what Evan suggested yesterday, I'd suggest enumerating the places this will impact (in an ADR / issue / Google doc). From looking at this PR, it seems like this could be weeks of work (but I might be wrong about this).
Edit:

> Decouple providing from saving into new ProvideData func
> Calculate extended square only once, finally!
This is really dope by the way 👌🏼
```go
func (b *Block) RowSet(ctx context.Context, adder format.NodeAdder) (*RowSet, error) {
	shares, dataLen := b.ComputeShares()
	eds, err := ipld.PutData(ctx, shares, adder)
```
Computing the rowset now also writes to IPFS? Is that intentional?
I think the least invasive approach would be:

1. compute the RowSet where needed, without touching storage yet
2. use the RowSet (like previously the PartSet) during consensus and in the other places
3. as a first step, only store the block where store.SaveBlock is called (during consensus this is during finalize commit)

Later we can reconsider 3. and optimize to store and provide the data earlier.
Codecov Report

```
@@            Coverage Diff             @@
##           master     #427      +/-   ##
==========================================
- Coverage   61.82%   61.69%   -0.14%
==========================================
  Files         262      266       +4
  Lines       22981    23072      +91
==========================================
+ Hits        14208    14234      +26
- Misses       7270     7329      +59
- Partials     1503     1509       +6
```
I left a few more very minor questions.
I also appreciate the more incremental approach taken with this PR, which I think pairs extremely well with #441
```go
{"Tampered Data", func(blk *Block) {
	blk.Data.Txs[0] = Tx("something else")
	blk.DataHash = nil // clear hash or change wont be noticed
}, true},
{"Tampered DataHash", func(blk *Block) {
	blk.DataHash = tmrand.Bytes(len(blk.DataHash))
}, true},
```
why were these removed? shouldn't they still fail?
Sooo, I already forgot what the issue was. Revisiting...
Because I removed the DataHash validation from Block.ValidateBasic. Previously, it checked that the DAHeader matches the Header:

- This does not actually give us anything. An attacker could still send us a marshaled Block with Header.DataHash matching Block.DAHeader, and to properly verify the Block we would need to recompute that all the data in the Block matches all the commitments. (I guess we must do this when we receive a serialized Block body, cc @liamsi)
- One of the TODOs mentions

  > Remove DAH from Block and its proto

  So removal of that validation is just a first step. The reason to remove the DAHeader from the body (or from the proto only) is that we need to recompute it anyway to verify integrity, so passing it around does not make sense.
Rebased on master
Here is a general suggestion for how to proceed with this: we have good evidence that propagating the data in fixed-size chunks works well and is fast. Why don't we stick to that paradigm and still split the rows into parts of equal size? That way, we are not changing a network parameter (the size of the gossiped chunks) without doing any experimentation / simulations up front. Instead, we switch to gossiping the (original) data serialized and arranged into rows, but chunk that up into equally sized parts. With that, we have exactly the same confidence as before but can still get rid of the PartSetHeader (because the 64kb chunks of rows can still be authenticated against the DAHeader instead of the PartSetHeader; you'd just need to send all proof nodes that you can't recompute locally). The only change necessary would be that all fields besides block.Data need to be part of the proposal (or gossiped separately too).
Let's close this for now. We won't continue this PR. At a later stage we might revisit this idea.
Background
Initially, I aimed to just close #423 with this branch, but due to the high coupling of the code, I added multiple supporting changes and improvements that are now separated from the block propagation logic itself.
Changes
- Updates to p2p/ipld API: p2p/ipld is now treated as a library (dependency direction inverted: p2p/ipld -> types ==> types -> p2p/ipld)
- Updates to types package
- Start relying on RowSet; updates p2p/ipld API in consensus/state

TODO
Some of those should be fixed within the PR, for some, we should file issues and discuss.
- p2p/ipld to da or other - TBD
- Naming in p2p/ipld: Share, Sample, Leaf, NamespacedShare, Data and other aliases are all about the same, but it is really confusing for any newcomer
- [...] p2p/ipld