JIL Execution Cleanup and Speed Optimizations #12153
Conversation
    }

    [Test]
    [Category("Failure")]
??
            return item;
        }

        return null;
Why is the for loop better than the LINQ FirstOrDefault?
@pinzart Not sure; it may be a false positive from looking at the performance measurements. Regardless, this ends up on the hot path, so it is good to have it as fast as possible.
        return cachedNodes;
    }
This is trying to create an escape for the lookup: the most common lookup is for the default kInvalidPC pair. Maybe there is a better way to do that, or it should be documented.
#12895 should replace this PR. It is now split into two
Purpose
JIL execution in the VM is responsible for a small but important subset of node execution. While most nodes today are executed via the FFI (Foreign Function Interface) mechanisms, which invoke C# code, some basic utility methods, especially in the Math category, are written in native DesignScript. This type of function is handled by the JILFunctionEndPoint. While many of these enhancements target this execution path, they also yield net improvements to the general execution of all EndPoint types. This is especially true of the POP_Handler, DEP_Handler, SetupExecutive, and RestoreFromCall methods within the VM's Executive. For a specific test graph with a mixture of geometry and mathematical operations, these enhancements improve JIL execution by 75% (1.25s to .3s) but, more importantly, reduce overall UpdateGraph run time by 35% (17s to 13s). They also dramatically reduce memory allocation: for this test graph, the memory allocated during UpdateGraph went from 11.3 GB to 3.6 GB. In summary, this PR optimizes the execution of functions handled via the JIL endpoint but yields a net improvement for all node types.

Specifically, this PR does the following:
- Clean up the dependency on the CurrentStackFrame property of RuntimeMemory. This getter creates a new StackFrame object to reference a subset of items at a specific location in the VM's stack. The CurrentStackFrame property is used 99% of the time from the IsGlobalScope method. The issue is that IsGlobalScope can be called tens of millions of times during a graph execution run, and a new StackFrame object is created every time the CurrentStackFrame property is referenced. This optimization removes the need to allocate a temporary StackFrame object when the required data can be read directly from the stack. It accounts for the majority of the extra gigabytes of temporary allocation described above in the sample graph performance delta.
- Refactor JILFunctionEndPoint to include an Init() function to allow caching of the Interpreter. This mirrors a similar optimization that was applied to the FFIFunctionEndPoint (see this PR). The cached object can be reused via a ResetExecutive instead of reallocating a new Interpreter object each time. This is applicable during replication.
- Refactor RestoreFromCall to not allocate an empty list until after the required check of runtimeCore.Options.RunMode == InterpreterMode.Expression. The list was previously allocated on every call, even though the later Any() checks can be replaced with a null check.
- Remove the LINQ implementation from GetFirstDirtyGraphNodeFromPC. This method is also called often during GraphUpdate. This optimization replaces the LINQ implementation with one that allocates less memory and runs faster.
- Fast path for GetGraphNodesAtScope when asking for an invalid ClassIndex and ProcessIndex (i.e. -1). This is another function that can be called millions of times during an UpdateGraph run. Many calls routed through this function are looking for the same item in the graphNodeMap dictionary. This optimization creates a shortcut for the specific case of the invalid ClassIndex and ProcessIndex that avoids fetching the object from the dictionary. Note that, for this function, caching the previous lookup does not speed things up, as the method usage typically alternates between
- Refactor UpdateGraph to not allocate a temporary list of GraphNodes.

- [ ] Todo: discuss single test failure
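To make the LINQ removal and the invalid-index fast path described above more concrete, here is a minimal sketch. It is not the code from this PR: the GraphNode stand-in, its IsDirty and StartPC fields, the predicate, and the DirtyNodeLookup and ScopedNodeIndex classes are all hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the VM's graph node type (not the real class).
public sealed class GraphNode
{
    public bool IsDirty;
    public int StartPC;
}

public static class DirtyNodeLookup
{
    // LINQ version: the lambda captures pc, so each call typically
    // allocates a closure and delegate (and, on .NET Framework, a boxed enumerator).
    public static GraphNode FirstDirtyLinq(List<GraphNode> nodes, int pc)
    {
        return nodes.FirstOrDefault(n => n.IsDirty && n.StartPC >= pc);
    }

    // Loop version: same result, indexes the list directly and allocates nothing.
    public static GraphNode FirstDirtyLoop(List<GraphNode> nodes, int pc)
    {
        for (int i = 0; i < nodes.Count; i++)
        {
            GraphNode item = nodes[i];
            if (item.IsDirty && item.StartPC >= pc)
            {
                return item;
            }
        }

        return null;
    }
}

// Hypothetical index keyed by (classIndex, procIndex), with a shortcut
// for the most common key, the invalid pair (-1, -1).
public sealed class ScopedNodeIndex
{
    private const int InvalidIndex = -1;

    private readonly Dictionary<(int, int), List<GraphNode>> map =
        new Dictionary<(int, int), List<GraphNode>>();

    // Direct reference to the list stored under (-1, -1), if any.
    private List<GraphNode> invalidScopeNodes;

    public void Add(int classIndex, int procIndex, GraphNode node)
    {
        var key = (classIndex, procIndex);
        if (!map.TryGetValue(key, out List<GraphNode> nodes))
        {
            nodes = new List<GraphNode>();
            map[key] = nodes;
            if (classIndex == InvalidIndex && procIndex == InvalidIndex)
            {
                invalidScopeNodes = nodes;
            }
        }

        nodes.Add(node);
    }

    public List<GraphNode> GetNodesAtScope(int classIndex, int procIndex)
    {
        // Fast path: skip hashing and the dictionary probe for the common key.
        if (classIndex == InvalidIndex && procIndex == InvalidIndex)
        {
            return invalidScopeNodes;
        }

        map.TryGetValue((classIndex, procIndex), out List<GraphNode> found);
        return found;
    }
}
```

Both lookups return null when nothing matches, so callers can use a null check instead of Any(). Whether the loop or the fast path is measurably faster in a given build should be confirmed with a profiler rather than assumed from this sketch.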
Declarations
Check these if you believe they are true
*.resx files
Reviewers
TBD
FYIs
TBD