This repository was archived by the owner on Aug 2, 2021. It is now read-only.

docs: add initial stream! protocol specification#1454

Merged
acud merged 8 commits into master from stream-spec on Jun 20, 2019

Conversation

@acud
Contributor

@acud acud commented Jun 11, 2019

This PR adds a spec for the stream! protocol.

The proposed design should bring more clarity to the implementation, remove unnecessary abstractions, and simplify both the server side and the client side.
The aim is to remove as much state management as possible. Ideally we would have a completely stateless server; this is, however, not fully possible when an offered/wanted roundtrip is configured.

human-readable version here: https://github.com/ethersphere/swarm/blob/stream-spec/docs/Stream-Protocol-Spec.md

Todo:

  • examples of the messages are still missing
  • diagram of message exchange
  • diagram of initial stream query (getting session indexes etc)
  • diagram of unbounded stream
  • diagram of bounded stream
  • define all StreamState codes and possible errors

### definition of stream
a protocol that facilitates data transmission between two swarm nodes, specifically targeting sequential data in the form of a sequence of chunks as defined by swarm. the protocol should cater for the following requirements:
- client should be able to request arbitrary ranges from the server
- client can be assumed to have some of the data already and therefore can opt in to selectively request chunks based on their hashes
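The two request shapes above can be sketched as message types. This is a minimal illustration only; all names and fields here are assumptions, not identifiers from the spec.

```go
// Sketch of the two request shapes described above. All names and
// field choices are illustrative assumptions, not taken from the spec.
package main

import "fmt"

// GetRange asks the server for chunks in a closed bin-index interval.
type GetRange struct {
	Stream    string // stream name, e.g. a syncing bin (assumed encoding)
	From      uint64 // first index, inclusive
	To        uint64 // last index, inclusive; 0 could signal unbounded
	Roundtrip bool   // whether to negotiate hashes before delivery
}

// WantedHashes is the optional roundtrip reply: the client flags which
// of the offered chunks it is missing and wants delivered.
type WantedHashes struct {
	Ruid uint64 // request id correlating with the server's offer
	Want []bool // Want[i] == true means "deliver offered hash i"
}

func main() {
	req := GetRange{Stream: "SYNC|6", From: 1, To: 1000, Roundtrip: true}
	fmt.Printf("request %s [%d,%d] roundtrip=%v\n", req.Stream, req.From, req.To, req.Roundtrip)
}
```

The first requirement maps onto `GetRange` alone; the second adds the `WantedHashes` roundtrip on top of it.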
Contributor

@nonsense commented Jun 14, 2019


aren't those two requirements actually the same requirement? if client can request arbitrary ranges from the server, then it basically means that the client can selectively request chunks?

Contributor Author


not really: bin index ranges can be requested without a roundtrip, for example, and that constitutes a valid request for a range of chunks.
for more granular control you have the option of a roundtrip

- client can be assumed to have some of the data already and therefore can opt in to selectively request chunks based on their hashes

As mentioned, the client is typically expected to already have some of the data in the stream. to mitigate duplicate data transmission, the stream protocol provides a configurable message roundtrip before batch delivery, which allows the downstream peer to selectively request only the chunks it does not store at the time of the request.
This comes, expectedly, at a certain price. Since delivery batches are pre-negotiated and do not rely on the mere benevolence of nodes, delivery batches are optimised for _urgency_ rather than for maximising batch utilisation (this would, however, be more apparent with unbounded streams).
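The selective-request step described above can be sketched as a small function: the client walks the offered hashes and flags only those it does not already store. Function name and signature are assumptions for illustration.

```go
// Minimal sketch of the offered/wanted selection step: flag only the
// offered chunks the client is missing. Names are assumptions.
package main

import "fmt"

// selectWanted returns a bit-per-hash want vector: true where the
// client does not store the offered chunk and wants it delivered.
func selectWanted(offered []string, have map[string]bool) []bool {
	want := make([]bool, len(offered))
	for i, h := range offered {
		want[i] = !have[h]
	}
	return want
}

func main() {
	offered := []string{"aa", "bb", "cc"}
	have := map[string]bool{"bb": true} // chunk bb is already stored locally
	fmt.Println(selectWanted(offered, have)) // only aa and cc are wanted
}
```

The price mentioned above is the extra roundtrip latency: delivery cannot start until this want vector has travelled back to the server.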
Contributor


I don't really understand that sentence. It is also vague - certain price? benevolence of nodes? Please give easy to understand and specific example of real-world scenarios to illustrate what you mean.

@acud
Contributor Author

acud commented Jun 16, 2019

I've actually had a few outstanding thoughts with this design:

  1. Retrieve requests are actually not covered within the scope of this document. Now comes the question: should we keep retrieve requests as part of the stream protocol, or are they a good refactoring candidate to be pulled out into a separate spec? the fact that streams are defined as streams of chunks with monotonically increasing indexes means that we look at streams through the identity of their bin indexes rather than through their hashes.
  2. Maybe we should consider pulling retrieve requests into a separate protocol; that would also make it easier to account for adaptive nodes. in addition, a paid symmetric counterpart to retrieve requests could be added to this new protocol to cover both cases of: (a) i want to pay someone to fetch something for me (b) i want to pay someone to push something onto the network for me.
  3. I am not 100% sure that handling deliveries at the individual chunk level is effective for hauling large amounts of data. I'd really like to have a special delivery message that can carry multiple chunks, at least for syncing (the devp2p max message size is 16mb; to me it seems rather wasteful not to make use of that frame size. we could deliver almost 4000 chunks in one go with this option). That being said, this should and can easily be benchmarked.
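The arithmetic behind point 3 can be checked back-of-the-envelope. Assuming 4 KiB chunks each prefixed by a 32-byte address (the exact per-chunk framing overhead is an assumption here):

```go
// Rough capacity check for a multi-chunk delivery message under the
// 16 MiB devp2p message size limit. The 32-byte per-chunk address
// overhead is an assumption for illustration.
package main

import "fmt"

const (
	maxFrame  = 16 * 1024 * 1024 // devp2p max message size
	chunkSize = 4096             // swarm chunk payload size
	addrSize  = 32               // assumed per-chunk address prefix
)

// chunksPerMessage returns how many chunks fit in one frame.
func chunksPerMessage() int {
	return maxFrame / (chunkSize + addrSize)
}

func main() {
	fmt.Println(chunksPerMessage()) // on the order of 4000 chunks per message
}
```

That lands close to the "almost 4000 chunks" figure in the comment; real framing overhead would shave a little more off.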

@acud
Contributor Author

acud commented Jun 16, 2019

also, @nonsense, i'm taking into account that different requirements will pop up while implementing this (they are already coming up), so i could PR them into this document while working on the new protocol implementation. in general, my expectation from this PR is to agree on a baseline of what should be done and to see that there are no reservations about the design

@acud acud mentioned this pull request Jun 17, 2019
22 tasks
- range is defined by client and should be strictly respected and followed by server
- all intervals specified in protocol messages are closed (inclusive)
- when roundtrip is configured - chunk deliveries can be handled concurrently (therefore their order is not guaranteed), but a server end-of-batch with topmost session index must be sent to signal the end of a batch
- when roundtrip is not configured - chunks are expected to be sent in order, one after the other
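The roundtrip case above implies a client-side completion rule: deliveries may arrive in any order, and the batch only counts as done once the server's end-of-batch message (carrying the topmost session index) has arrived and every wanted chunk has been delivered. A minimal sketch, with all names assumed:

```go
// Sketch of client-side batch completion for the roundtrip case:
// out-of-order deliveries plus an explicit end-of-batch signal.
// All names here are illustrative assumptions.
package main

import "fmt"

type batch struct {
	wanted  map[uint64]bool // indexes the client asked for in the roundtrip
	topmost uint64          // set by the end-of-batch message; 0 = pending
}

// deliver records an incoming chunk delivery; order does not matter.
func (b *batch) deliver(idx uint64) { delete(b.wanted, idx) }

// endOfBatch records the server's end-of-batch signal.
func (b *batch) endOfBatch(topmost uint64) { b.topmost = topmost }

// done reports whether the batch is complete.
func (b *batch) done() bool { return b.topmost != 0 && len(b.wanted) == 0 }

func main() {
	b := &batch{wanted: map[uint64]bool{1: true, 2: true, 3: true}}
	for _, idx := range []uint64{3, 1, 2} { // deliveries arrive out of order
		b.deliver(idx)
	}
	fmt.Println(b.done()) // false: end-of-batch not yet received
	b.endOfBatch(3)
	fmt.Println(b.done()) // true
}
```

Without the roundtrip, none of this bookkeeping is needed: chunks arrive strictly in order, so the client's highest seen index is the batch state.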
Contributor


lines 30 and 31 are not clear to me.

in the current impl of Swarm, we handle multiple messages concurrently, so depending on how we specify the ChunkDeliveryMsg, we will be handling many of those concurrently.

from a server perspective we always send chunks in order as we write messages to the TCP connection in order... i don't understand why we have to specify these things here.

- stream indexes are always > 0
- syncing is an implementation of the stream protocol
- the client is expected to manage all intervals, and therefore:
- the server is designed to be stateless, except for managing an offered/wanted roundtrip and the knowledge of whether a stream is bounded (e.g. the server knows that syncing streams are always unbounded from the localstore perspective - data can always enter the system; this is not the case for a live video stream, for example)
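The client-side interval management implied above can be sketched as follows: the client records which closed ranges it has completed and derives the next range to request, so the server needs to keep no such state. Names are assumptions for illustration.

```go
// Sketch of client-side interval bookkeeping: completed closed ranges
// are tracked locally and the next index to request is derived from
// them. Names and structure are assumptions.
package main

import "fmt"

type interval struct{ from, to uint64 } // closed: both ends inclusive

// nextFrom returns the first index at or after start that is not
// covered by the (sorted, non-overlapping) completed intervals.
func nextFrom(done []interval, start uint64) uint64 {
	next := start
	for _, iv := range done {
		if iv.from <= next && next <= iv.to {
			next = iv.to + 1
		}
	}
	return next
}

func main() {
	done := []interval{{1, 3}, {5, 6}} // ranges already synced
	fmt.Println(nextFrom(done, 1))     // 4: the first uncovered index
}
```

Because the intervals live entirely on the client, a reconnecting client can resume from its own records without the server remembering anything about it.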
Contributor


i suggest we format the exception cases here a bit better. there are two cases:

  • a specific GetRange flow -> 1. get range, 2. offered, 3. wanted, 4. deliver and batch done.
  • unbounded streams - the server knows that the client has requested an unbounded stream, so it does what we have up at line 32. for that to happen, the server keeps state on that request type until the connection is dead or the client says stop.

@nonsense
Contributor

@acud i think you've converged on something simpler than what we already have. i suggest we iterate on it again and make this document a bit more succinct - it has too much prose for what the protocol is about, in my opinion.

@acud acud merged commit 8afb316 into master Jun 20, 2019
@acud
Contributor Author

acud commented Jun 20, 2019

As discussed with @nonsense, we are merging this in a Draft state (see table at the top of the spec MD), so we can iterate over this through the new syncer PR without having to maintain two different PRs at the same time

@acud acud deleted the stream-spec branch June 20, 2019 10:22
@acud acud restored the stream-spec branch July 1, 2019 08:50
@acud acud deleted the stream-spec branch July 5, 2019 11:44