Conversation

Converting to draft while we figure out the full extent of changes required.

Rendered version of the share section: https://github.com/lazyledger/lazyledger-specs/blob/adlerjohn-nmt_namespace_leaves/specs/data_structures.md#share
|
The changed sentence in the share section, old vs. new:

> **Old:** For shares **with a namespace ID above [`NAMESPACE_ID_MAX_RESERVED`](./consensus.md#constants)**, the first [`SHARE_RESERVED_BYTES`](./consensus.md#constants) bytes have no special meaning and are simply used to store data like all the other bytes in the share.
>
> **New:** For shares **with a namespace ID above [`NAMESPACE_ID_MAX_RESERVED`](./consensus.md#constants)**, the first [`SHARE_RESERVED_BYTES`](./consensus.md#constants) bytes after [`NAMESPACE_ID_BYTES`](./consensus.md#constants) have no special meaning and are simply used to store data like all the other bytes in the share.

So because of this `*`, the raw data can only be at most 247 = 256 - 1 - 8 bytes long, right?

Ah I see, the next section clarifies this: `SHARE_SIZE - NAMESPACE_ID_BYTES - SHARE_RESERVED_BYTES`.
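
For concreteness, a minimal Go sketch of that arithmetic (the Go constant names are made up for this sketch; the values are the constants from consensus.md):

```go
package main

import "fmt"

const (
	shareSize          = 256 // SHARE_SIZE
	namespaceIDBytes   = 8   // NAMESPACE_ID_BYTES
	shareReservedBytes = 1   // SHARE_RESERVED_BYTES
)

func main() {
	// Raw data capacity of a share whose namespace ID is reserved.
	fmt.Println(shareSize - namespaceIDBytes - shareReservedBytes) // prints 247
}
```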

Maybe the encoding is wrong in the implementation, but note that the index, if varint-encoded, could exceed one byte: celestiaorg/celestia-app#53 (comment)

Fixed to 1 byte in 5b010a7, as per celestiaorg/celestia-app#53 (comment).
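
For context, varints store 7 bits of payload per byte, so any start index of 128 or more no longer fits in a single byte. A quick check with Go's standard `encoding/binary` (illustrative only, not necessarily the encoding the implementation used):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

func main() {
	buf := make([]byte, binary.MaxVarintLen64)
	fmt.Println(binary.PutUvarint(buf, 127)) // 1: still fits in one byte
	fmt.Println(binary.PutUvarint(buf, 128)) // 2: overflows into a second byte
}
```

A fixed-width 1-byte index avoids this, and an index capped at 255 is enough to address any position in a 256-byte share.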

The quoted steps from the diff, with the old and new versions of the splitting step:

> 1. Compute the length of each serialized request, [serialize the length](#share), and prepend the serialized request with its serialized length.
>
> **Old:** Split up the length/request pairs into [`SHARE_SIZE`](./consensus.md#constants)`-`[`SHARE_RESERVED_BYTES`](./consensus.md#constants)-byte [shares](#share) and assign [the appropriate namespace ID](./consensus.md#reserved-namespace-ids). This data has a _reserved_ namespace ID, so the first [`SHARE_RESERVED_BYTES`](./consensus.md#constants) bytes for these shares must be [set specially](#share).
>
> **New:** Split up the length/request pairs into [`SHARE_SIZE`](./consensus.md#constants)`-`[`NAMESPACE_ID_BYTES`](./consensus.md#constants)`-`[`SHARE_RESERVED_BYTES`](./consensus.md#constants)-byte [shares](#share) and assign [the appropriate namespace ID](./consensus.md#reserved-namespace-ids). This data has a _reserved_ namespace ID, so the first [`NAMESPACE_ID_BYTES`](./consensus.md#constants)`+`[`SHARE_RESERVED_BYTES`](./consensus.md#constants) bytes for these shares must be [set specially](#share).
>
> 1. Concatenate the lists of shares in the order: transactions, intermediate state roots, evidence.

So the concatenation is what makes it necessary to use the `*` thingy (the start index above)?

It feels like the text is missing a step and does not match the diagram: the concatenation here can refer to what leads to

*(diagram from the spec omitted)*

because if you simply concatenate, tx3 would also carry a namespace. Other than this concatenation, the text only mentions splitting requests up into `SHARE_SIZE - 8 - 1`-byte shares.

Also, with the above description (step 1), it is not clear whether the length/request pair `len(tx3), tx3` should have an associated NID.

Uhh okay, I see why the text might be confusing. What it's supposed to say is that the namespace ID of all transactions is the same, so only shares carry a namespace ID, not individual transactions.

Clarified, I think, in 42d8b4f. Rendered: https://github.com/lazyledger/lazyledger-specs/blob/42d8b4fb92dfdf912d2843548cf95b461ac3bf3e/specs/data_structures.md#share
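
To make the clarified behavior concrete, here is a rough Go sketch of steps 1 and 2 above under the new layout. All names are hypothetical, the length prefix is assumed to be a uvarint, and the reserved byte is simply zeroed here rather than set to the start index the spec requires:

```go
// Illustrative only; not the actual implementation.
package sharesketch

import "encoding/binary"

const (
	shareSize          = 256
	namespaceIDBytes   = 8
	shareReservedBytes = 1
	dataBytesPerShare  = shareSize - namespaceIDBytes - shareReservedBytes // 247
)

// splitIntoShares length-prefixes each request, concatenates the pairs, and
// chunks the result into fixed-size shares. Note that the namespace ID is
// attached once per share, never per request.
func splitIntoShares(nid [namespaceIDBytes]byte, requests [][]byte) [][]byte {
	// Step 1: prepend each serialized request with its serialized length.
	var stream []byte
	for _, req := range requests {
		var lenBuf [binary.MaxVarintLen64]byte
		n := binary.PutUvarint(lenBuf[:], uint64(len(req)))
		stream = append(stream, lenBuf[:n]...)
		stream = append(stream, req...)
	}
	// Step 2: split into (SHARE_SIZE - NAMESPACE_ID_BYTES - SHARE_RESERVED_BYTES)-byte
	// chunks, each prefixed with the shared namespace ID and a reserved byte.
	var shares [][]byte
	for off := 0; off < len(stream); off += dataBytesPerShare {
		end := off + dataBytesPerShare
		if end > len(stream) {
			end = len(stream)
		}
		share := make([]byte, 0, shareSize)
		share = append(share, nid[:]...)
		// Reserved byte: per the spec it locates the start of the first
		// request in the share; zeroed here for brevity.
		share = append(share, 0)
		share = append(share, stream[off:end]...)
		for len(share) < shareSize {
			share = append(share, 0) // zero-pad the final share
		}
		shares = append(shares, share)
	}
	return shares
}
```

This also shows why `len(tx3), tx3` has no NID of its own: the pairs are flattened into one byte stream before the per-share namespace ID is attached.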

**liamsi** left a comment:
LGTM, thanks for the clarifications.

Fixes #145.
Note that the proposed fix in #145 is actually not complete, as it affects only the NMT and not the erasure coding. This PR changes the format of non-parity shares themselves to an 8-byte namespace ID plus 248 bytes of data, which is then erasure coded. This guarantees a power-of-two invariant on share size, which is needed for proper erasure coding.
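
As a quick sanity check of that invariant (hypothetical constant names; the values are the ones stated above):

```go
package main

import (
	"fmt"
	"math/bits"
)

const (
	namespaceIDBytes = 8
	shareDataBytes   = 248
	shareSize        = namespaceIDBytes + shareDataBytes // 256
)

func main() {
	// A power of two has exactly one bit set.
	fmt.Println(shareSize, bits.OnesCount(shareSize) == 1) // 256 true
}
```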