Make Broadcast.flatten(bc).f more complier frendly. (better inferred and inlined)#43322
Make Broadcast.flatten(bc).f more complier frendly. (better inferred and inlined)#43322N5N3 merged 1 commit intoJuliaLang:masterfrom
Broadcast.flatten(bc).f more complier frendly. (better inferred and inlined)#43322Conversation
Broadcast.flatten for better inference.Broadcast.flatten(bc).f more complier frendly.
|
I only add the second example to test. As the first one's instability doesnt influence the return type. |
b19e849 to
cf82239
Compare
52a3132 to
992d2d4
Compare
This comment was marked as outdated.
This comment was marked as outdated.
Broadcast.flatten(bc).f more complier frendly.Broadcast.flatten(bc).f more complier frendly. (better inferred and inlined)
|
It turns out this problem could be overcomed much more easily, as nested flattened(args...) = bc.f(map(f -> f(args), makeargs)...) |
948f3ff to
0b02d79
Compare
|
Maybe @vchuravy can take a look at this? In general packages shouldn't use as much Broadcast.flatten as they do, but I think this will have a lot of knock-on effects because they do use it a lot, and the current lowering is pretty bad. |
vtjnash
left a comment
There was a problem hiding this comment.
This reminds me of #45789–was that the inspiration?
ah, this is a fun new way to make use of our compiler's constant prop. I feel a bit nervous that the compiler won't be up to the challenge though of figuring out the Pick{N} allocation, or that it should be a Val{N} instead (with the caller doing the indexing explicitly with the result rather than calling the result as a function). How well is that working though in your experience with this?
|
In fact this was inspired by the current broadcast implement itself. As for Since the flatten process is stable. I think our complier should be able to handle it well. |
|
No difference content or concept wise. Val just means you have to unwrap it yourself later and call getindex, rather than call it directly with the Tuple and have that defined to return the indexed value |
1. make `cat_nested` better inferred by switching to direct self-recursion. 2. `make_makeargs` now create a tuple of functions which take in the whole argument list and return the corresponding input for the broadcasted function.
A follow up attemp to fix #27988. (close #47493 close #50554)
Examples:
On master:
click for details
On this PR