Skip to content

Fixes #2239 user callbacks immediately#2277

Closed
Dunbaratu wants to merge 5 commits intoKSP-KOS:developfrom
Dunbaratu:fixes_2239_user_callbacks_immidiately
Closed

Fixes #2239 user callbacks immediately#2277
Dunbaratu wants to merge 5 commits intoKSP-KOS:developfrom
Dunbaratu:fixes_2239_user_callbacks_immidiately

Conversation

@Dunbaratu
Copy link
Member

@Dunbaratu Dunbaratu commented Apr 21, 2018

EDIT: Please look at PR #2296 and merge it instead of this. This PR has had extra work put on top of it, and that extra work has become PR #2296. I believe #2296 does a much better job of handling this issue. (Plus, the commits from this branch are also in the branch for #2296 so you should get this PR merged as a side-effect of merging #2296).

Fixes #2239 with the ability to make immediate user callbacks.

Now any callback created via UserDelegate.TriggerNextUpdate() is going to actually trigger next opcode instead (the method name has been edited to reflect this).

Please merge #2275 first - this branch was made on top of that branch, because they edited some of the same lines of code and would have had a merge conflict if I tried to implement them separately.

(Some of the diff's you see will go away if you merge 2275 first.)

    Made it able to schedule triggers next opcode.

    When Adding a trigger, the caller can now choose if that
    trigger should wait for the next KOSFixedUpdate, or if
    it should execute immediately.

    If Cpu.AddTrigger() is called with arg Immediate=False:

      The system works like before - the trigger will not
      get pushed onto the callstack until the start of
      the next KOSFixedUpdate.  If there are instructions
      remaining this Update, then the rest of the IPU's
      worth of instructions will happen first, before this
      trigger will fire off.

    If Cpu.AddTrigger() is called with arg Immediate=True:

      The trigger's call will get pushed onto the callstack
      just before the next Opcode would have been executed
      in the ContinueExecution() main loop, even if that's
      in the middle of a KOSFixedUpdate.  If we are not in the
      middle of a KOSFixedUpdate right now, then the effect
      ends up being the same as if Immediate=False, since the
      time the "next opcode would have been executed" is during
      the next KOSFixedUpdate anyway.

    From the point of view of C# code that's waiting for a
    return value from the user's callback, there's not really
    any difference between these two since it has to wait for
    the user's code to happen anyway before it can get the
    answer.

    Immediate=True should never be used in cases where a trigger
    gets re-inserted automatically on a timer, as that can
    make it recurse forever and never give time back to the mainline
    code.

The previous value for an ON (to detect the it has
changed) is still suing a global value determined
at runtime so it can't store more than one of them,
which means multiple ON's from the same line of code
don't work right.  WHENs are working though.
On triggers' previous values are now tracked separately
merely by taking advantage of the local scoping
that was made available several months ago for triggers.

We just weren't using it because we still had the old code
that tried using the Global instead of Local storage Opcodes
in the trigger's boilerplate code blocks that tracked the
previous value.
When Adding a trigger, the caller can now choose if that
trigger should wait for the next KOSFixedUpdate, or if
it should execute immediately.

If Cpu.AddTrigger() is called with arg Immediate=False:

  The system works like before - the trigger will not
  get pushed onto the callstack until the start of
  the next KOSFixedUpdate.  If there are instructions
  remaining this Update, then the rest of the IPU's
  worth of instructions will happen first, before this
  trigger will fire off.

If Cpu.AddTrigger() is called with arg Immediate=True:

  The trigger's call will get pushed onto the callstack
  just before the next Opcode would have been executed
  in the ContinueExecution() main loop, even if that's
  in the middle of a KOSFixedUpdate.  If we are not in the
  middle of a KOSFixedUpdate right now, then the effect
  ends up being the same as if Immediate=False, since the
  time the "next opcode would have been executed" is during
  the next KOSFixedUpdate anyway.

From the point of view of C# code that's waiting for a
return value from the user's callback, there's not really
any difference between these two since it has to wait for
the user's code to happen anyway before it can get the
answer.

Immediate=True should never be used in cases where a trigger
gets re-inserted automatically on a timer, as that can
make it recurse forever and never give time back to the mainline
code.
@Dunbaratu Dunbaratu added the bug Weird outcome is probably not what the mod programmer expected. label Apr 21, 2018
@Dunbaratu Dunbaratu requested a review from hvacengi April 21, 2018 05:51
@Dunbaratu Dunbaratu added the Not Ready Author is warning Reveiwer not to merge this PR yet. More edits are expected. label Apr 21, 2018
@Dunbaratu
Copy link
Member Author

Dunbaratu commented Apr 21, 2018

Added "Not Ready" because I just realized a problem on thinking about this code and I need to fix that problem first. It should be pretty easy to fix but I ran out of time tonight to address it yet.

The way it was before: There has been (for a while now) a sanity check in the system that would delay re-inserting the pending triggers until at least one opcode of mainline program got executed. That way recurring triggers like WHEN and ON wouldn't starve mainline code of all IPU if they last longer than one update - they won't re-insert themselves until at least some mainline code is reached again.

What has just changed in this PR: Now it's possible for callback triggers to start executing immediately within the main code, and it's possible (and should be totally legal) for a user to design things like GUI callback code where the majority of the program's time is spent inside GUI callback hooks rather than in the mainline flow. But the system described above won't distinguish this kind of "trigger" from the recurring kind of automatic timer "trigger". If you were in a GUI callback hook, and got interrupted by a WHEN trigger that wanted to be preserved, that WHEN trigger won't re-insert itself as long as you're still in the GUI callback hook. The system doesn't realize the GUI callback hook should totally count as the user's own normal code and should allow WHEN and ON to re-trigger. (In other words when it sees a WHEN check that wants to interrupt a GUI ONCLICK, it behaves the same as if it was a WHEN check being interrupted by another WHEN check and tells the interrupting WHEN check to hold off until the other triggers are done.).

For the sake of that check, the rule should be changed:

Old rule: "Recurring triggers (WHEN/ON) must be held off in the pending trigger list until other recurring triggers (WHEN/ON) are done, but once mainline code happens then they can be moved to the real trigger list.

New rule: "Recurring triggers (WHEN/ON) must be held off in the pending trigger list until other recurring triggers (WHEN/ON) are done, but once mainline code or any non-recurring user callback code happens then they can be moved to the real trigger list.

@ghost
Copy link

ghost commented Apr 25, 2018

Looking at the code after processing our conversation in #2239, this code now enables immediate triggers, but they are actually not executed immediately if the source happens to be inside trigger (which now also includes click-handlers), it only works for main program, right? It does not solve the code in the first post. The output won't change a bit, because if on set it:pressed to false. is inside local function click which is called as trigger/callback (not as main program), thus it won't call it:onToggle from inside me:onToggle but AFTER. (Both callbacks happen to be same function only with different bound parameters).

What is needed is to really distinguish recurring triggers - WHEN/ON/LOCK vs. non-recurring user callbacks ..... the naming should simply become triggers vs. callbacks (as my interrupts vs. events).
Two queues should be created and ProcessTriggers splitted (or taking parameter), while triggers should be implemented as previously (ProcessTriggers called at the start of FixedUpdate), but callbacks as in your current code (ProcessCallbacks called in ContinueExecution).

The result should be what I have described as weird here: #2239 (comment)

  • Callbacks scheduled from trigger (when/on/lock) won't be executed immediately (after set btn:pressed) but later after all triggers are processed.
  • Callbacks scheduled from main program or another callback will be executed immediately (next opcode).
  • Callbacks scheduled from GUI (external source) will be executed after all triggers.
  • Triggers can interrupt callbacks (as they behave like main program).
  • Callbacks cannot interrupt triggers (triggers run in special mode).
  • Other scripts (containing triggers - e.g. LOCK THROTTLE) can be executed from click-callbacks (that does not work in current version, because the callback is blocking all triggers).

Possible problems:

  • Callbacks can get interrupted by themselves even from external source (multiple fast clicks).
  • set button:pressed to true behaves differently in triggers (when/on/lock) than elsewhere ... but that is acceptable and can even be described as feature to prevent starvation by triggers.

@ghost
Copy link

ghost commented Apr 27, 2018

I could not stop thinking about this. If callbacks remain triggers and triggers cannot interrupt other triggers, then what is the point of immediate callbacks that are not immediate inside callbacks? That was specificaly my complaint, that set otherButton:pressed to false inside callback of one button won't call cleanup sequence (inside onToggle) of the other button. I was trying to create set of radio buttons inside a table which needs nested boxes and thus the built-in system (of radio buttons inside same box) cannot be used. Let me bring the code here:

local wnd is gui(0).
local bt1 is wnd:addButton("TEST").
local bt2 is wnd:addButton("OTHER").
set bt1:toggle to true.
set bt2:toggle to true.
local counter is 0.
local function click {
	parameter me.
	parameter it.
	parameter on.
	if on set it:pressed to false. // <---- HERE ----
	set counter to counter+1.
	print counter+"."+me:text+"="+on.
}
set bt1:onToggle to click@:bind(bt1,bt2).
set bt2:onToggle to click@:bind(bt2,bt1).
wnd:show().
wait until counter > 10.
wnd:dispose().

I don't think the current state of this PR changes the output (did not try it, but looked at the code).
I was trying to solve this myself by calling the callback from inside the opcode - that would be like another function call, consistent, no matter if you do it from main program, callback or trigger. But I was unable to execute it, my attempt broke the stack and I got stuck with this solution. Using current logic around triggers (queue) dodges the problem I encountered.

But there is second problem we seem to be trying to solve as well:
Callbacks are triggers and triggers block other triggers, so, you cannot run launch script from onClick, because the callback blocks lock throttle and such, thus the launch script does not work from inside callback. #2280

I have proposed three run levels (see Cpu.Section), so that triggers are like high priority interrupt, that can interrupt anything except another trigger, and callbacks would be like low priority interrupt, that can only interrupt main program, but not triggers or other callbacks. If that was implemented together with the direct call from inside the opcode (not after it - these become true callbacks, not interrupts, not triggers), then it would solve it all except for scripts with their own gui run from inside another gui (onClick). Triggers can still be protected, if that is desired. The very mechanism of AddTrigger(immediate: true) could check current run level and behave like AddTrigger(immediate: false) if executing trigger right now. The GUI-over-GUI problem could possibly be solved by the GUI as well, I will get to this later. Let me explain WinForms first:

static void Main() { ... Application.DoEvents(). } // while(GetMessage(m)) ProcessMessage(m);
void OnClick(...) {
    ImportantLogicThatShouldNotBeInterrupted(); // button.Enabled = false;
    while(StepOfLongWorkNotFinished()) Application.DoEvents();
    CleanupSequenceThatShouldNotBeInterrupted(); // button.Enabled = true;
}

You do not excpect callbacks (click-handlers) to be interrupted by other callback, unless you perform action that specifically allows this - either calling Application.DoEvents() which executes all pending callbacks, or you change property or call a method on some control like MyOtherButton.PerformClick().

I am trying to warn you, that lowering priority of callbacks to the level of main program will introduce its own problems! Imagine the callback being interrupted by itself by two fast clicks. You could somehow fix this, but then use two buttons that you click on both in short time... complete mess and total confusion.

Modern C# introduced async/await which effectively splits the function into parts, allowing the program to execute Application.DoEvents() in between the steps. Sounds like too much for kOS, but there still could be some mechanism to lower run level e.g. by something like runEx(scriptOrFunction), that would lower it to main and then elevate back to callback when the script/function finishes. Maybe run and runPath could behave like this by default. processCallbacks() could be another usefull built-in (like Application.DoEvents).

Lastly, every GUI could have its own queue of callbacks and pass them one-by-one to CPU (waiting for it to be finished before passing another). That would, btw, solve the only problem my proposal with run levels seem to have, except the complexity that may be hard to understand, but I hope that the behaviour is so intuitive that most of kerboscript programmers would not need to now how it works. It would require marking the callbacks with a group. Callbacks cannot interrupt other callbacks from same group (same GUI), maybe except for group 0 (special marker), these can interrupt other callbacks and can be interrupted by any callback. That would either have to be managed by the GUI (never passing more than one callback of same group), or by CPU - having table of pending callbacks from same group (Dictionary<Group,List<Callback>>). Same technique can be used if you decide to run callbacks at run level of main program (but I really don't like such solution because of the problems it can cause).

P.S.: The question is if e.g. vectorUpdater should be trigger or callback. You have no way of triggering it by code (no property that would fire it), so, it could be trigger (high priority). There could even be more run levels (or interrupt priorities). The groups will solve that too (if the updater is callback in group 0 or maybe even some other group for all vector-drawing). I know this is complicated, but it is trying to solve the second problem - scripts from handlers.

@ghost
Copy link

ghost commented Apr 27, 2018

OK, I will assume that WHEN/ON, LOCK THROTTLE/STEERING and VECUPDATER will use TriggerOnFutureUpdate and onClick and such will use TriggerOnNextOpcode.

That sounds like strictly immiediate callbacks, although you can still possibly choose not to execute them on Cpu.Section.Trigger if you choose to. Not so important now, I can live with either behaviour.

Also solving triggers vs. callbacks and running scripts from click-handlers (because callbacks run on Cpu.Section.Main), but has the problem with nested callbacks I was trying to warn you about:

set btn1:onToggle to {
    parameter on.
    if on {
        set btn2:pressed to false. // cleanup
        lock steering to ...
    //  imagine that you click on btn2 here
        lock throttle to ...
    }
    else { // the cleanup
        unlock throttle.
    //  imagine another callback here
        unlock steering. 
    }
}

What would be the result?!
You will get lock steering from btn2,
but lock throttle from btn1:

btn2 is pressed
btn1 gets pressed
btn1 sets btn2:pressed to false
btn2 unlocks throttle and steering (immediate callback - ok)
btn1 locks steering
btn2 gets pressed (nested callback invoked asynchronously!)
btn1 unlocks both throttle and steering (immediate callback - ok)
btn2 locks steering and throttle (continuation of nested callback)
btn1 locks throttle (original callback - and you have a problem)

It would require something like atomic::test_and_set (built-in function) to be able to trully solve this in a script. (local disableClicks is false ... if on { set other:pressed to false. if testAndSet(disableClicks) doClick(). }).

@Dunbaratu
Copy link
Member Author

I don't think the current state of this PR changes the output (did not try it, but looked at the code).

Running it, after I rewound the most recent commit (so I was being fair and using the PR in the state it was in when you wrote this), did this:

1.TEST=True   <- clicked on TEST caused this message
2.TEST=False    <---+-- clicked on OTHER, caused these two messages in this order
3.OTHER=True    <---'

The old output that you said was wrong was in this order when I did the same thing:

1.TEST=True
2.OTHER=True
3.TEST=False

It definitely changed the output and I remember knowing this because I actually did test your specific example before I made this PR in the first place. I tested this again just last night after reverting the most recent commit, to be sure I wasn't remembering it wrong.

You say:

calling the callback from inside the opcode - that would be like another function call, consistent
Then later:
If that was implemented together with the direct call from inside the opcode (not after it

User function calls and immediate triggers are already doing the same thing as each other in regards to this.

In a sense UserDelegate.TriggerOnNextOpcode() is just a proxy for what OpcodeCall().Execute() already does, but in a way designed to work regardless of what part of the C# code is doing it and which part of the Unity updating system is doing it (i.e, the FixedUpdate pass, the Update pass, or the GUI pass). It ensures the UserDelegate call will be put onto the stack and get jumped to at the very next point in time that it is at all possible. If an Opcode is currently Executing (If TriggerOnNextUpdate() was something nested inside an Opcode.Execute() in some way) then it ensures the subroutine call doesn't ruin the assumption that Opcodes execute atomically. The call happens just after the current opcode (Which is exactly what OpcodeCall does to call a user's function - set up the function entry point to be the next opcode after the OpcodeCall, then return from OpcodeCall.Execute() to let that call happen next.) On the other hand, if TriggerOnNextUpdate() is called from anywhere else, well then it doesn't matter that it has to wait for just before the next Opcode to cause the call, because the delegate can't execute until the CPU is executing opcodes again anyway.

@ghost
Copy link

ghost commented Apr 27, 2018

It boils down to whether you take the trigger out of the queue unconditionally (as you seem to do and my assumption was wrong) or conditionally (only if in Cpu.Section.Main - which was my wrong assumption because you did not like immediate callbacks inside trigger, or at least I got that impression).

Ok, immediate callbacks are solved, they are strictly immediate no matter what. Good.

What about the hazards created by making callbacks execute as Cpu.Section.Main?

@ghost
Copy link

ghost commented Apr 27, 2018

P.S.: If you want to revert this PR to the state where immediate callbacks work and do not change Cpu.Section, I can test and try to review it in that state, perfectly happy it solves #2239

#2280 can be addressed later in separate PR.

@Dunbaratu
Copy link
Member Author

I'm beginning to think the case of having a button launch a program isn't worth trying to support. It doesn't work now in the public release, so not supporting it doesn't actually break a previously working thing.

The problem isn't the cases you've brought up - they are cases I can handle with some slight tweaking to the exclusivity checks that already exist to prevent multiple copies of the same trigger. The problem is the case where somebody decides to have the same callback attached to more than one hook. ("When widgetA changes, call foo(), but also call foo() when widgetB or widgetC change.") In that case there's no good way to identify what the exclusivity of calling foo() should be, without reading the user's mind and seeing how they thought it should work. There's two equally valid interpretations, one that says "same callback, means don't let it interrupt itself", and another that says "triggered by three different causes, only make it exclusive if it was the same cause". Depending on how they wrote the code in the callback, that might work.... but we can't prove one way or the other how they wrote it.

It's also a problem when one hook has more than one callback (we have a few of these I was experimentally working on, where the hook is actually a list of callbacks so you can set up more than one notifyee of the same event if you like.) Again, the definition of what counts as exclusivity here requires reading the user's mind about their intentions.

While I really don't like the simplistic rule that no callback is allowed to interrupt another callback, it's the simplest way to deal with it that makes it possible to get this thing out the door in a reasonable time and I have other parts of the mod I want to move on to.

I'm not sure if I like the idea of doing it with runlevels, as it requires a bit of restructuring, but there's plenty of ways to get the same basic algorithm to happen.

@Dunbaratu
Copy link
Member Author

#2280 can be addressed later in separate PR.

At this point they're so intertwined that they're all touching the same code and it might be easier to just handle them as a single PR. Whatever is done for #2280 has to occur "on top of" what this PR does anyway, and may even slightly undo some of what this PR did and rebuild it a different way (thus invalidating any time spent reviewing this).

@ghost
Copy link

ghost commented Apr 27, 2018

It is totally up to you. Solving #2280 properly won't be easy task and I agree that it may not be worth the effort. I will test the version you mark as ready (or say it is ready for testing). I can, maybe, try to implement my ideas if you leave #2280 unsolved and move to other things. I will focus on the tester (scopes) otherwise, if I even find the time. You know, temperatures are rising, my house and garden are screaming for attention XD.

@ghost
Copy link

ghost commented Apr 27, 2018

The more I think about it, the more I lean towards changing run and maybe even wait which is another don't do inside triggers. Doing so in a trigger could simply lower the run level to Section.Main and maybe elevate it back to Section.Trigger afterwards (to preserve atomicity of the remaining code).

I have no idea how to actually implement it, but try to think about it for a while, how do you like it?
Kill two birds with one stone. (We say something like Kill two flies with one hit.)

It could even be used for cooperative threads:

lock steering to ...
lock throttle to ...
when true then {
   atomicStep1().
   wait 0. //<-- this will pass CPU to other triggers
   atomicStep2().
   wait until whatever(). // <-- again pass CPU
   ...
}

and of course

set launchButton:onClick to { run launch_script. }.

@Dunbaratu
Copy link
Member Author

The problem is that while I mentioned the case of doing run program as my specific example, I really meant that as a placeholder for any sort of operation where you expect to stay in that code a long time. A solution that starts making specific exceptions for some kinds of long running code, like run, but not for other kinds of long running code, is something that I think would be more confusing to explain in the documentation than just saying "don't". I'd rather support both long-lasting run program's and long-lasting do_my_function() situations, or neither.

I implored the team a while ago to enact this rule: The code changes and the user doc changes have to be wrapped up together in the same PR so merging them is an atomic operation where you aren't allowed to make the docs temporarily wrong because of a new change that isn't documented for the users yet.

This rule was accepted.

This rule had a second happy side effect that was not part of my original plan but I've noticed it being useful as an accidental benefit - programmers start thinking to themselves, "How the heck am I going to actually explain this to people?" up front before they finish a PR. That ends up discouraging ideas that are going to be confusing to the userbase.

And I think that making run program inside a callback get different rules than other long-lasting things in a callback is an example of one of those things that's messy to explain. As soon as I thought of writing those docs, I immediately "noped" out of the idea.

What might be simpler to explain is a simple command that just says "allow interruptions now" that you can execute from inside your callback if you like, similar to some of the things you've been talking about.

@ghost
Copy link

ghost commented Apr 28, 2018

There are multiple ways of tackling this, but I came to conclusion, that making run and wait special is the most understandable and justifyable thing. First, wait is exactly "allow interruptions now", especially wait 0, which means "wait until next update" and doing so in main program is "wait until next update, execute all triggers, as usual, then continue executing main program". Using wait inside main loop is even good way to prevent race conditions - being interrupted in between e.g. positionAt and velocityAt which should be atomic, I mean getting both should be atomic, but may not be, if you reach IPU. wait 0. pos = positionAt(...). vel = velocityAt(...). prevents getting them from different updates. if orbit:excentricity < 0 set dt to eta:apoapsis can even crash the program, becase the statement can be interrupted in between the check and getting the eta and you could have just changed the excentricity to hyperbolic. Reaching IPU should not be standard way of handling main vs. triggers, main should actually pass CPU by the wait and IPU should be seen as last-resort protection. So, the documentation can just state that wait allows other triggers to execute, even if the command itself is inside a trigger.

run is a bit trickier, I agree. Having something for functions would be good, I definitely agree and I have originally proposed runEx(scriptOrFunction). But run script just feels independent, I mean the script to be executed should be independent and not broken by placing the run inside a trigger.

Triggers run at the beginning of each update, interrupting main program either at wait command or when IPU is reached. Triggers are executed in a sequence, except when they contain wait or run. wait allows other triggers to run and run script always behaves like if it was part of main program, not to block triggers inside that script.

@Dunbaratu Dunbaratu mentioned this pull request Jun 26, 2018
@Dunbaratu
Copy link
Member Author

Dunbaratu commented Jun 26, 2018

I think PR #2296 completely overrides this PR. I'd strongly recommend looking at #2296 to merge, which should merge this as a side-effect. The commits in this branch are also in 2296.

@Dunbaratu
Copy link
Member Author

The reason for all the conflicts is because this problem was already fixed a different way in PR #2296, which touched much of the same code.
I'm going to close this as it became moot with 2296 being merged.

@Dunbaratu Dunbaratu added duplicate Will close because another PR or issue is the same. (please link to it when using this label) and removed Not Ready Author is warning Reveiwer not to merge this PR yet. More edits are expected. labels Nov 27, 2018
@Dunbaratu Dunbaratu closed this Nov 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Weird outcome is probably not what the mod programmer expected. duplicate Will close because another PR or issue is the same. (please link to it when using this label)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant