Conversation

Converting to draft while we figure out the full extent of changes required.

Rendered version of the share section: https://github.com/lazyledger/lazyledger-specs/blob/adlerjohn-nmt_namespace_leaves/specs/data_structures.md#share
|
The changed sentence in the share section, old vs. new:

> **Old:** For shares **with a namespace ID above [`NAMESPACE_ID_MAX_RESERVED`](./consensus.md#constants)**, the first [`SHARE_RESERVED_BYTES`](./consensus.md#constants) bytes have no special meaning and are simply used to store data like all the other bytes in the share.
>
> **New:** For shares **with a namespace ID above [`NAMESPACE_ID_MAX_RESERVED`](./consensus.md#constants)**, the first [`SHARE_RESERVED_BYTES`](./consensus.md#constants) bytes after [`NAMESPACE_ID_BYTES`](./consensus.md#constants) have no special meaning and are simply used to store data like all the other bytes in the share.

So because of this `*`, the raw data can only be at most 247 = 256 - 1 - 8 bytes long, right?

Ah I see, the next section clarifies this: `SHARE_SIZE - NAMESPACE_ID_BYTES - SHARE_RESERVED_BYTES`.
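
For concreteness, a minimal Go sketch of that arithmetic (the Go constant names are made up for this sketch; the values are the constants from consensus.md):

```go
package main

import "fmt"

const (
	shareSize          = 256 // SHARE_SIZE
	namespaceIDBytes   = 8   // NAMESPACE_ID_BYTES
	shareReservedBytes = 1   // SHARE_RESERVED_BYTES
)

func main() {
	// Raw data capacity of a share whose namespace ID is reserved.
	fmt.Println(shareSize - namespaceIDBytes - shareReservedBytes) // prints 247
}
```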

Maybe the encoding is wrong in the implementation, but note that the index, if varint-encoded, could exceed one byte: celestiaorg/celestia-app#53 (comment)

Fixed to 1 byte in 5b010a7, as per celestiaorg/celestia-app#53 (comment).
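
For context, varints store 7 bits of payload per byte, so any start index of 128 or more no longer fits in a single byte. A quick check with Go's standard `encoding/binary` (illustrative only, not necessarily the encoding the implementation used):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

func main() {
	buf := make([]byte, binary.MaxVarintLen64)
	fmt.Println(binary.PutUvarint(buf, 127)) // 1: still fits in one byte
	fmt.Println(binary.PutUvarint(buf, 128)) // 2: overflows into a second byte
}
```

A fixed-width 1-byte index avoids this, and an index capped at 255 is enough to address any position in a 256-byte share.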

The quoted steps from the diff, with the old and new versions of the splitting step:

> 1. Compute the length of each serialized request, [serialize the length](#share), and prepend the serialized request with its serialized length.
>
> **Old:** Split up the length/request pairs into [`SHARE_SIZE`](./consensus.md#constants)`-`[`SHARE_RESERVED_BYTES`](./consensus.md#constants)-byte [shares](#share) and assign [the appropriate namespace ID](./consensus.md#reserved-namespace-ids). This data has a _reserved_ namespace ID, so the first [`SHARE_RESERVED_BYTES`](./consensus.md#constants) bytes for these shares must be [set specially](#share).
>
> **New:** Split up the length/request pairs into [`SHARE_SIZE`](./consensus.md#constants)`-`[`NAMESPACE_ID_BYTES`](./consensus.md#constants)`-`[`SHARE_RESERVED_BYTES`](./consensus.md#constants)-byte [shares](#share) and assign [the appropriate namespace ID](./consensus.md#reserved-namespace-ids). This data has a _reserved_ namespace ID, so the first [`NAMESPACE_ID_BYTES`](./consensus.md#constants)`+`[`SHARE_RESERVED_BYTES`](./consensus.md#constants) bytes for these shares must be [set specially](#share).
>
> 1. Concatenate the lists of shares in the order: transactions, intermediate state roots, evidence.

So the concatenation is what makes it necessary to use the `*` thingy (the start index above)?

It feels like the text is missing a step and does not match the diagram: the concatenation here can refer to what leads to

*(diagram from the spec omitted)*

because if you simply concatenate, tx3 would also carry a namespace. Other than this concatenation, the text only mentions splitting requests up into `SHARE_SIZE - 8 - 1`-byte shares.

Also, with the above description (step 1), it is not clear whether the length/request pair `len(tx3), tx3` should have an associated NID.

Uhh okay, I see why the text might be confusing. What it's supposed to say is that the namespace ID of all transactions is the same, so only shares carry a namespace ID, not individual transactions.

Clarified, I think, in 42d8b4f. Rendered: https://github.com/lazyledger/lazyledger-specs/blob/42d8b4fb92dfdf912d2843548cf95b461ac3bf3e/specs/data_structures.md#share
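
To make the clarified behavior concrete, here is a rough Go sketch of steps 1 and 2 above under the new layout. All names are hypothetical, the length prefix is assumed to be a uvarint, and the reserved byte is simply zeroed here rather than set to the start index the spec requires:

```go
// Illustrative only; not the actual implementation.
package sharesketch

import "encoding/binary"

const (
	shareSize          = 256
	namespaceIDBytes   = 8
	shareReservedBytes = 1
	dataBytesPerShare  = shareSize - namespaceIDBytes - shareReservedBytes // 247
)

// splitIntoShares length-prefixes each request, concatenates the pairs, and
// chunks the result into fixed-size shares. Note that the namespace ID is
// attached once per share, never per request.
func splitIntoShares(nid [namespaceIDBytes]byte, requests [][]byte) [][]byte {
	// Step 1: prepend each serialized request with its serialized length.
	var stream []byte
	for _, req := range requests {
		var lenBuf [binary.MaxVarintLen64]byte
		n := binary.PutUvarint(lenBuf[:], uint64(len(req)))
		stream = append(stream, lenBuf[:n]...)
		stream = append(stream, req...)
	}
	// Step 2: split into (SHARE_SIZE - NAMESPACE_ID_BYTES - SHARE_RESERVED_BYTES)-byte
	// chunks, each prefixed with the shared namespace ID and a reserved byte.
	var shares [][]byte
	for off := 0; off < len(stream); off += dataBytesPerShare {
		end := off + dataBytesPerShare
		if end > len(stream) {
			end = len(stream)
		}
		share := make([]byte, 0, shareSize)
		share = append(share, nid[:]...)
		// Reserved byte: per the spec it locates the start of the first
		// request in the share; zeroed here for brevity.
		share = append(share, 0)
		share = append(share, stream[off:end]...)
		for len(share) < shareSize {
			share = append(share, 0) // zero-pad the final share
		}
		shares = append(shares, share)
	}
	return shares
}
```

This also shows why `len(tx3), tx3` has no NID of its own: the pairs are flattened into one byte stream before the per-share namespace ID is attached.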

**liamsi** left a comment:
LGTM, thanks for the clarifications.

Fixes #145.
Note that the proposed fix in #145 is actually not complete, as it affects only the NMT and not the erasure coding. This PR changes the format of non-parity shares themselves to an 8-byte namespace ID plus 248 bytes of data, which is then erasure coded. This guarantees a power-of-two invariant on share size, which is needed for proper erasure coding.
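
As a quick sanity check of that invariant (hypothetical constant names; the values are the ones stated above):

```go
package main

import (
	"fmt"
	"math/bits"
)

const (
	namespaceIDBytes = 8
	shareDataBytes   = 248
	shareSize        = namespaceIDBytes + shareDataBytes // 256
)

func main() {
	// A power of two has exactly one bit set.
	fmt.Println(shareSize, bits.OnesCount(shareSize) == 1) // 256 true
}
```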