switch to event-sourcing based data model #7
Merged · fulminmaxi merged 5 commits into Joystream:personalisation on Jan 4, 2021
This PR resolves #1 by switching Orion's data model to an event-based one. To ensure decent performance, two approaches were used:
Event sourcing
Our implementation is loosely based on event sourcing. It doesn't follow the pattern strictly, as our use case is narrow and simple at this point. We save sequenced events to MongoDB, from which the aggregate (current state) can be rebuilt. For performance, we keep the full aggregate in memory, so most basic queries require no DB lookup at all. The aggregate is built on Orion launch by replaying all existing events. This approach should be quite extensible in the future; building on top of it to e.g. save channel follow/unfollow events should be fairly simple.
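To make the replay idea concrete, here is a minimal TypeScript sketch of the pattern. This is not Orion's actual code: the event shape, the `orion`/`events` collection names, and the `ViewsAggregate` class are illustrative assumptions.

```typescript
import { MongoClient } from 'mongodb'

// Illustrative event shape; Orion's real events may differ.
interface StoredEvent {
  type: 'VIDEO_VIEWED' | 'CHANNEL_FOLLOWED'
  videoId?: string
  channelId?: string
  timestamp: Date
}

// In-memory aggregate: current view count per video.
class ViewsAggregate {
  private viewsPerVideo = new Map<string, number>()

  applyEvent(event: StoredEvent): void {
    if (event.type === 'VIDEO_VIEWED' && event.videoId) {
      const current = this.viewsPerVideo.get(event.videoId) ?? 0
      this.viewsPerVideo.set(event.videoId, current + 1)
    }
  }

  videoViews(videoId: string): number {
    return this.viewsPerVideo.get(videoId) ?? 0
  }
}

// On launch, replay all persisted events in sequence to rebuild current state;
// afterwards, basic queries are served from memory with no DB lookup.
async function buildAggregate(mongoUri: string): Promise<ViewsAggregate> {
  const client = await MongoClient.connect(mongoUri)
  const events = client.db('orion').collection<StoredEvent>('events')
  const aggregate = new ViewsAggregate()
  for await (const event of events.find().sort({ timestamp: 1 })) {
    aggregate.applyEvent(event)
  }
  await client.close()
  return aggregate
}
```

Extending this to a new event type (e.g. channel follows) would then only require another case in `applyEvent` plus whatever read methods the aggregate should expose, which is what makes the approach easy to build on.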
Time-series data / size-based buckets
The most straightforward approach of one Mongo document per event could leave us with a huge number of documents very quickly. While there are reported cases of collections with billions of documents, that many can hurt performance. To alleviate this, size-based buckets were used: event buckets are created that (currently) hold up to 50,000 events as nested objects. This reduces the number of documents while still allowing fast reads of many events from a single document.
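A sketch of this bucketing with the Node MongoDB driver follows; the `eventBuckets` collection and the `events`/`size` field names are assumptions for illustration, not Orion's actual schema.

```typescript
import { Collection } from 'mongodb'

const BUCKET_CAPACITY = 50_000

// Append an event to a bucket that still has room; when every bucket is full
// (or none exists yet), the upsert creates a fresh one with this single event.
async function appendEvent(buckets: Collection, event: object): Promise<void> {
  await buckets.updateOne(
    { size: { $lt: BUCKET_CAPACITY } },
    {
      $push: { events: event }, // store the event as a nested object
      $inc: { size: 1 },        // track how full the bucket is
    },
    { upsert: true }
  )
}
```

With this layout, a single document fetch returns up to 50,000 events, which keeps the launch-time replay described above cheap compared to reading millions of one-event documents.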
Performance considerations
From what I gather, the current approach should handle foreseeable traffic without issues. It's not a perfect solution - it would probably break quite quickly under YouTube-like request volumes. However, I figure we will need a different solution for view counts anyway before we reach traffic levels this implementation can't handle.