-
Notifications
You must be signed in to change notification settings - Fork 16.3k
Add a return statement to enforce a single yield in certain triggers #31985
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Hussein Awala <hussein@awala.fr>
|
@eladkal I think we should merge this one before releasing the new providers wave |
|
@pankajastro @potiuk could you please review this PR? |
potiuk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. Thought about it and it makes perfect sense (maybe @andrewgodwin can also confirm it) - as I understand it now (but I do not have huge experience in how async works), the event main loop could - generally - cause the yield to happen several time. I think what is important is to see when the trigger gets actually removed from the trigger list being served by the Triggerer. How I understand it works (but again those are a little assumptions and also looking at the observation) that they are not removed eagely, when the trigger event is emitted, but when it is actually processed.
The second part (processing the TriggerEvent) must naturally happen outside of the main event loop (the event loop executed code must be very fast and it should never block on any kind of I/O, where I undersand what trigger event processing in Trigger, necessarily involves Database Operation. Then the running trigger is removed from the trigger loop (I believe) only after that DB operation. This means that if there are not many triggers in the loop, it might well happen than the loop might get re-executed and new events yielded.
Simply speaking there are no other mechanism to signal that the event has been already emitted.
I am not sure how recovery is implemented in case the event is emitted, but the processing does not manage to complete, but I assume this can only happen in case Triggerer crashes, and in this case the Trigger will be re-started as the database operation saving the state of the triggerer will not happen, so this case should also be handled properly and there is generally no need to continue the loop anyway.
So yeah. After a bit of consideration it seems that those returns are actually needed to prevent (not even that improbable) race conditions.
|
And agree @eladkal - this one should be merged before relasing new provider wave. |
@potiuk I agree. I have also opened and tested #31987, which can effectively prevent the problem on the triggerer side. However, even if we merge that PR, we still need the |
related: #31703
In #31703, we decided to remove certain return statements under the assumption that they were unnecessary after yielding an event. However, upon testing this strategy, it appears that in certain cases (specifically when the triggerer service is overloaded), we end up generating a significant number of events. As a result, we continue with unnecessary processing, which further burdens the triggerer.
To test that, I have created a new module
airflow/tests.py:And this small dag:
which I tested in breeze.
When I tested it without a
return, I got:and even more than 26k in other runs when the triggerer was overloaded.
However, when using a return statement, I consistently observed a single sent event. Moreover, I noticed that this approach consumes less memory as there are fewer events being added to the deque
events.^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rstor{issue_number}.significant.rst, in newsfragments.