Expose AllInputClosed message as a Stop message#1026
Conversation
72e760d to
b72680b
Compare
phil-opp
left a comment
There was a problem hiding this comment.
I would prefer to use a new event type for this rather than reusing the Stop event.
| tracing::error!("{err:?}"); | ||
| Event::Error(err.wrap_err("internal error").to_string()) | ||
| } | ||
| NodeEvent::AllInputsClosed => Event::Stop, |
There was a problem hiding this comment.
Making AllInputsClosed the same as Stop means that nodes can no longer find out what triggered the stop. So they can no longer handle 'ctrl-c' stops in a different way (and e.g. do some additional cleanup).
There was a problem hiding this comment.
I think the issue is that it means to have additional code for all the nodes where most of the time it's probably going to do the same thing no?
I think also that it might be confusing that the stop event does not cover all stop event but only manual stop event.
There was a problem hiding this comment.
Nodes that don't care about the stop reason can just wait until the event stream is closed. The stop event currently signals a manual stop (ctrl-c or dora stop), so nodes that want to do something special in this case can react to it.
With merged event streams, the waiting for stream closure does no longer work, so we need some new way to handle this. But we shouldn't change the behavior for normal dora event streams.
There was a problem hiding this comment.
How about:
- we rename the Stop event to ManualStop (we keep a deprecated alias to avoid breakage)
- we forward the AllInputsClosed event
- we add an additional
Doneevent that is sent just before the event stream is closed (this might depend on the node config)
This way, nodes with merged event streams could listen for the Done event and break
There was a problem hiding this comment.
I'm sorry but I will not be able to remember all this logic.
I would very much prefer to have a stop event that encapsulates all logic.
There was a problem hiding this comment.
I think the original intention for the Stop event was that it corresponds to dora stop calls. So if there is no dora stop invoked, no Stop event occurs.
There was a problem hiding this comment.
I'm sorry, but I, myself, never knew that it was meant to be only for manual stop, and always expected it as a stop ( catch all ) messages.
I think that it is very hard without context to understand what an event done mean, as it could mean a lot of different things. Knowing that we will have Done and Stop which is very much confusing.
There was a problem hiding this comment.
I opened this PR exactly because I thought that stop -> break as implemented within openai-proxy-server did not work.
There was a problem hiding this comment.
I'm fine with any new name (e.g. Done, Finished, EventStreamClosed, EndOfInput, EndOfEvents), but changing the behavior of the existing Stop event seems like unnecessary breakage. If anyone was relying on this behavior, their code wouldn't work anymore after updating dora and they wouldn't even get any error or other signal.
We can also rename the Stop event if you find it too confusing, or even deprecate and remove it if it's not needed.
For example, consider a node that writes all incoming images to disk. The disk might be slow, so it uses an internal buffer. If the dataflow is stopped early, it assumes that you're no longer interested in the remaining images and clears the buffer without writing it out (for faster abort, to avoid getting killed after the grace duration).
After updating dora, this example node would always silently discard the last few images, even when no dora stop is used.
There was a problem hiding this comment.
I'm actually way more worried about the guy like me who would have believe that stop is a Catch all and that assuming that I can count on it to stop is enough.
Instead, I have to add conditions for every type of stoppin and remember additional keywords... I really don't see why I have to remember to deal with both Stop AND EndOfInput AND any future event to just be able to simply deal with stopping the dataflow.
I don't think people implement Stop handler in the expectation that it is only going to be used for Ctrl-C messages, and that they assume that any Stop message can happen and that if there is data that should be kept they don't have a process after the event loop to do so.
995d6b9 to
c64d664
Compare
|
Meeting summary: We didn't reach a clear consensus on this, but we all agreed that the new behavior should be as simple and intuitive as possible as possible. And that we need to provide clear documentation for the behavior. Arguments made:
|
New Proposal
@haixuanTao How does this sound to you? |
I'm ok with adding a string field in stop to be able to match the cause. I would largely prefer if we could reuse some field like value so that people don't have to learn a new terminology on matching stop. For source node, I think if the choice is between sending nothing and being empty from the beginning or having only one message which is a stop message. I would rather have only one message which is a stop message with cause no input for example as it's easier to debug. So I would not want to make it an exemption node. On the other note, if it is possible to not close the event stream for source node in the beginning that would be great but it does not fit this PR in my opinion. |
I'm not sure what you mean with reusing the value field? Which value field do you mean? Right now our event enum looks like this: https://docs.rs/dora-node-api/latest/dora_node_api/enum.Event.html I don't really care about the name of the field, but it should be something that is understandable without reading the docs in my opinion. Something like |
|
I have added a Cause enum with non_exhaustive pattern matching the root cause of the stopping. Yeah sorry wasn't super clear. I was talking about how to expose it in python and what i have done is I have put it in the id key, value so that people can query the key value on it. |
Ah, sounds good to me! |
a5e7f1a to
bb12249
Compare
bb12249 to
c79251b
Compare
c79251b to
d155ee7
Compare
This PR makes it possible for nodes to stop when all input is closed. This is necessary because when merging two different event stream and we want to stop when all event from dora finishes we can rely on the stop messages to stop.
It also removes the complexity of the event stream code.