Skip to content

Lazily evaluate CallContext#1328

Closed
lyonbeckers wants to merge 8 commits intogodot-rust:masterfrom
lyonbeckers:lazy-callcontext-latest
Closed

Lazily evaluate CallContext#1328
lyonbeckers wants to merge 8 commits intogodot-rust:masterfrom
lyonbeckers:lazy-callcontext-latest

Conversation

@lyonbeckers
Copy link
Copy Markdown
Contributor

Was seeing some noticeable allocation time spent in rust_callable_call_fn in a project where I am pretty dependent on it. Turns out it was from converting GString to String on every invocation, which is mostly just for debugging anyway.

Saw the TODO to make call_ctx lazy evaluated and decided to take a swing at it since it gets at the root of the problem.

I think something a little more sophisticated than a closure could be used for the derive macro, since it expands to some repeated code, but I think this works as a drop-in replacement.

@Bromeon Bromeon added c: core Core components performance Performance problems and optimizations labels Sep 21, 2025
@GodotRust
Copy link
Copy Markdown

API docs are being generated and will be shortly available at: https://godot-rust.github.io/docs/gdext/pr-1328

@Bromeon
Copy link
Copy Markdown
Member

Bromeon commented Sep 23, 2025

Thanks for your contribution.

You started originally with "the to_string in custom callables is too slow", but now this turned into a +/- 400 LoC diff refactoring the whole CallContext mechanism. I know there was a TODO but that doesn't mean we shouldn't be careful about it 😉 What we need to consider is that the logic gets quite a bit more complex with lazy initialization, since you have to pass through closures and introduce new generic parameters in many places.

What is the end goal -- performance? If yes, could we maybe consider on having a cheap CallContext construction in Release mode, with less information?

Like I wrote on Discord:

Since this is a performance optimization, it would be good to know if the change actually improves things, and by how much. We have the #[bench] infrastructure to run isolated benchmarks: https://github.com/godot-rust/gdext/blob/master/itest/rust/src/benchmarks/mod.rs

It would be nice if you could set up a small benchmark for custom callable and compare it between master and your branch, so we see the impact of to_string(). They are very easy to use, you can look at some existing examples :slight_smile:

TLDR: I think it's good to cut down on useless string conversions, but we need to approach it with a clear scope 🙂

@lyonbeckers
Copy link
Copy Markdown
Contributor Author

Yeahhh hahah this got way out of hand. The irony is that my original take on this was to just wrap the name in a debug_assertions check, but I wanted to see if there was a solution that didn't have that limitation. I really thought I was on the right track, but I guess that's the thing about UB. On the plus side, I have a much deeper understanding of the library after this deep dive.

Long story short, I wound up right back at the debug_assertions limitation in a seemingly futile attempt to make it safe, but much more circuitously. I'm just going to close this and put up a PR that's like... Maybe 6 lines of code not counting a benchmark 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

c: core Core components performance Performance problems and optimizations

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants