Skip to content

Shared subscriptions with random strategy#628

Closed
swanandx wants to merge 5 commits intomainfrom
shared-subscriptions
Closed

Shared subscriptions with random strategy#628
swanandx wants to merge 5 commits intomainfrom
shared-subscriptions

Conversation

@swanandx
Copy link
Contributor

@swanandx swanandx commented May 30, 2023

This PR adds support for shared subscriptions in broker with sticky strategy ( i.e. keep sending to one client in group until failure ).
Other strategies like round-robin, sticky, hash aren't included yet, as they might need significant refactoring. ( please let me know if you have any workarounds for it! )

Type of change

New feature (non-breaking change which adds functionality)

Checklist:

  • Formatted with cargo fmt
  • Make an entry to CHANGELOG.md if its relevant of user of the library. If its not relevant mention why. ( TODO )

@swanandx swanandx mentioned this pull request May 30, 2023
19 tasks
@swanandx swanandx changed the title Shared subscriptions with sticky strategy Shared subscriptions with random strategy May 31, 2023
@stale stale bot added the stale Not moving forward; blocked label Jun 21, 2023
@Nickztar
Copy link
Contributor

So this is very interesting. I have a WIP solution to using other strategies, it does however completely refactor your solution. Do you mind taking a look?

I decided to move away from having a shared-cursor and instead storing the client_ids that are part of the shared group. It then picks clients based on round-robin (not really a perfect solution but working). It would be really easy to add other strategies in my solution aswell, such as sticky, random etc. I am not quite sure I fully understand how everything works in the router so if you see anything that looks weird that very well might be the case.

Here is my branch for this. https://github.com/Nickztar/rumqtt/tree/shared-subscription

@stale stale bot removed the stale Not moving forward; blocked label Jun 25, 2023
@swanandx
Copy link
Contributor Author

Hey @Nickztar , thanks for the contribution.

I just glanced over your solution, and have a question, is it really round-robin?
Because HashSer iter's ordering is arbitrary. It's tricky you know, previously i thought this PR was implementing sticky strategy 😅, later realised it's mostly sticky, but it can't be guaranteed. So choose to name it random haha.

I did previously ( when i was experimenting ) use HashSet and a "state" to indicate current one and which client to send next. But here we also have to consider cases where client can get disconnected without getting message, so are we able to send the same message to some other client in the group?

Will go over your solution ASAP. Thank you! 🚀

@Nickztar
Copy link
Contributor

Oh yea, that might be correct. Will have to keep testing then but this feels a lot more round-robin. Can also easily be changed to a ordered vec or something.

@swanandx
Copy link
Contributor Author

update: I went over the solution, and here are some observations/suggestions:

  • We can move the logic from logs to router ( that would be easier to reason about )
  • We can have a better way to maintain "state" of share/group ( probably by using Traits like 'Iter` ? ).
  • SharedCursor should be renamed ( as it isn't a cursor )
  • Setting status to FilterCaughtup to ignore a client is quite interesting approach, Kudos for that 🚀 ! ( Wish I could have thought about that haha )
  • As every client has DataRequest, lets say A & B are in a group, if we send a message to A, we will be updating the cursor present in DataRequest of A, but cursor present in B is outdated! When B will be scheduled, it will resend the message which was already sent to A. ( haven't tested it myself, plz correct me if I missed something )
  • Can we combine both solutions? My solutions for having the shared consensus, and yours for selectively scheduling?

Thank you for taking the initiative to add other strategies @Nickztar 💯

@Nickztar
Copy link
Contributor

Great thoughts!
Agree with all points! Mine is heavily WIP and just a proof of concept to test if it was possible so lets try improve this by using both strategies and it might be really good!

As for the DataRequest point, I could not get any duplicates to schedule. Haven't done any large tests but in my testing I got round-robin working.

The FilterCaughtUp feels like a hack, but does make sense in that we have "caught up" to what should be sent for that client.

@swanandx
Copy link
Contributor Author

As for the DataRequest point, I could not get any duplicates to schedule.

That's a good thing, now I gotta try it myself and figure out the why behind it 😆 .

The FilterCaughtUp feels like a hack

Trust me, you don't wanna know what kind of hacks I tried to get round-robin working ( ofc, they didn't work well haha ). But as you mentioned, setting status to "caught up" do make sense, so I won't call it a hack!

@Nickztar
Copy link
Contributor

As for why DataRequest works, cursor is updated before we actually determine that we arent the current client in a shared subscription. (alteast if my understanding of it is correct).

@swanandx
Copy link
Contributor Author

Ooh, I thought we were doing that check at the very beginning of function, before doing anything. My bad 😅

@swanandx
Copy link
Contributor Author

Thinking of which, do you think doing that check as first thing in function and using shared cursor will be better?

That way we will avoid multiple reads from datalogs, which looks expensive than using shared cursor.

@Nickztar
Copy link
Contributor

Oh that's smart. Shouldn't be too difficult, will try tomorrow!

As for improving the structure, did you have anything specific in mind for implementing traits and moving to router?

@swanandx
Copy link
Contributor Author

did you have anything specific in mind for implementing traits and moving to router

Nothing specific, will let you know if I came up with something.

@Nickztar
Copy link
Contributor

@swanandx Pushed some new changes to my branch. Your idea for shared cursor worked great, haven't measured the performance increase but feels a loot better. Also restructured some code, not sure how you would like it to be but moved the shared subscriptions part to it's own impl block to keep it a bit cleaner. Also implemented Iter for it, feels pretty nice.

@stale stale bot added the stale Not moving forward; blocked label Jul 18, 2023
@swanandx
Copy link
Contributor Author

closing this in favor of #668 🚀

@swanandx swanandx closed this Jul 30, 2023
@swanandx swanandx deleted the shared-subscriptions branch August 25, 2023 13:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

stale Not moving forward; blocked

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants