Conversation
The `findall` fallback is quite slow when predicate is a small function compared with generating a logical index using `broadcast` and calling `findall` on it to compute integer indices. The gain is most visible when predicate is true for a large proportion of entries, but it's there even when all of them are false. The drawback of this approach is that it requires allocating a vector of `length(a)/8` bytes whatever the number of returned indices.
| # issorted fails for some element types so the method above has to be restricted | ||
| # to element with isless/< defined. | ||
| findall(pred::Fix2{typeof(in)}, x::Union{AbstractArray, Tuple}) = _findin(x, pred.x) | ||
| findall(pred::Fix2{typeof(in)}, x::AbstractArray) = _findin(x, pred.x) |
There was a problem hiding this comment.
I think it's required to avoid an ambiguity, since ::Fix2 is more specific than ::Function, but ::AbstractArray is more specific than ::Union{AbstractArray, Tuple}.
|
This is rather surprising to me — I suppose it's entirely due to the ability to sum logical arrays and So with that rationale, I'll give this a ✅. I'll give it a bit more time, but barring further comments let's merge it tomorrow. |
|
Thanks! |
|
This change broke UniqueVectors.jl. My current plan is to issue a new release to fix it, but I am wondering: could there be other fallout throughout the package ecosystem? Seems somewhat unlikely, but perhaps worth considering/checking. |
|
Usually before releasing a new version all package tests are run against it to detect any problems in advance, so I guess we'll find out at that point. |
|
Would we avoid the ambiguity if it's While we try to avoid ambiguity changes like this, it's really hard to avoid and is easy to fix. |
Do you mean if we were to replace the signature An alternative approach (and one which you might actually be implying) is to replace the three signatures of
Do you mean to fix it by adjusting julia before the release or to fix it in packages? I am on board either way. If there is no other fallout in the package ecosystem, then the above solution is almost certainly overkill -- the UniqueVectors package is probably the right place to fix it. |
|
Yes, I meant the latter suggestion on both counts. Sorry for the brevity! |
The `findall` fallback is quite slow when predicate is a small function compared with generating a logical index using `broadcast` and calling `findall` on it to compute integer indices. The gain is most visible when predicate is true for a large proportion of entries, but it's there even when all of them are false. The drawback of this approach is that it requires allocating a vector of `length(a)/8` bytes whatever the number of returned indices.
The
findallfallback is quite slow when predicate is a small function compared with generating a logical index usingbroadcastand callingfindallon it to compute integer indices. The gain is most visible when predicate is true for a large proportion of entries, but it's there even when all of them are false.The drawback of this approach is that it requires allocating a vector of
length(a)/8bytes whatever the number of returned indices.Some benchmarks using
>and==, for which the impact of usingbroacastshould be the most visible thanks to SIMD. Note the timings difference when changing the proportion oftrueentries.