While I definitely failed to read your initial example right on the first pass, I think the particular example chosen isn't ambiguous.
The `Returns` part of the docstring strictly implies that all inputs `b` with ndim >= 2 will be interpreted as the M,K case.
From there, my second-attempt instinct was to let K=1 and unsqueeze/squeeze, which was correct.
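For concreteness, here's a minimal NumPy sketch of that unsqueeze/squeeze trick (the specific matrix and vector are just made-up examples): force the M,K reading by giving `b` a trailing K=1 axis, then drop it afterwards.

```python
import numpy as np

a = np.array([[2.0, 1.0], [1.0, 3.0]])   # well-conditioned 2x2 system
b = np.array([5.0, 10.0])                # 1-D right-hand side

# Force the (M, K) interpretation with K = 1: unsqueeze, solve, squeeze.
x = np.linalg.solve(a, b[:, None]).squeeze(-1)

assert np.allclose(x, np.linalg.solve(a, b))   # same answer as the 1-D reading
assert np.allclose(a @ x, b)
```

Either path gives the same answer here; the unsqueeze/squeeze version just makes the intended interpretation explicit.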
·
In general, I do agree that tensor methods can be ill-formed.
Contractually, it would be nice if all ops had
* a single unbatched implementation with fixed ndims for all inputs,
* clear descriptions of what broadcasting/batching semantics are supported for the op.
Or, if you could remake everything from scratch, you could gate all batching behavior behind a vmap() wrapper that simply replaces each op with its batched variant, making the intended behavior of code obvious... but this gets yucky when you need to flatten/split a batch dim for other reasons.
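As a toy sketch of that contract (not how jax.vmap or torch.vmap actually work, they rewrite ops to batched variants rather than looping; `naive_vmap` here is a hypothetical name): the unbatched function has fixed ndims for all inputs, and batching only ever happens through the wrapper.

```python
import numpy as np

def naive_vmap(fn):
    # Hypothetical loop-based stand-in for a real vmap: apply an
    # unbatched function over a shared leading batch dimension.
    def batched(*args):
        return np.stack([fn(*xs) for xs in zip(*args)])
    return batched

# Single unbatched implementation with fixed ndims: (M, M), (M,) -> (M,)
def solve_one(a, b):
    return np.linalg.solve(a, b)

a = np.stack([np.eye(2), 2 * np.eye(2)])   # batch of two 2x2 systems
b = np.array([[1.0, 2.0], [2.0, 4.0]])     # batch of two rhs vectors

x = naive_vmap(solve_one)(a, b)            # shape (2, 2); no ambiguity about
                                           # which axes are batch vs. math
```

The point is that `solve_one` never has to guess what a 2-D `b` means; the batch axis exists only inside the wrapper.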
·
I don't think broadcasting can be ditched. As you describe, it's good in "simple" cases, and it's quite easy to see how ugly code would become in the no-broadcast world. I think most cases of bad ambiguity go away if your broadcasting system is repeat-only rather than also ndim-modifying.
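A small NumPy example of the ndim-modifying hazard (the arrays are arbitrary illustrations): silently promoting a (3,) array to (1, 3) turns an intended elementwise op into an outer product-style result, which repeat-only semantics would reject.

```python
import numpy as np

col = np.arange(3)[:, None]   # shape (3, 1)
row = np.arange(3.0)          # shape (3,)

# ndim-modifying broadcast: (3,) is promoted to (1, 3), then both sides
# repeat to (3, 3) -- an outer sum, possibly not what was meant.
outer = col + row
assert outer.shape == (3, 3)

# Repeat-only semantics would force you to add the axis explicitly:
elementwise = col + row[:, None]
assert elementwise.shape == (3, 1)
```

Under repeat-only rules the first addition would be an error, and the shape bug surfaces at the call site instead of three ops later.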
·
I agree that arrays-in-indices ("advanced" indexing) was a mistake. It's more confusing than a simple .take/.index_select. The n-d cases are even worse.
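To illustrate the confusion (values here are arbitrary): with advanced indexing, the result shape depends on how many index arrays appear and where, while np.take pins the gather axis explicitly.

```python
import numpy as np

x = np.arange(12).reshape(3, 4)
idx = np.array([2, 0])

# Advanced indexing: one index array gathers rows, two index arrays
# pair up elementwise -- same-looking syntax, very different semantics.
assert x[idx].shape == (2, 4)        # rows 2 and 0
assert x[idx, idx].shape == (2,)     # elements x[2, 2] and x[0, 0]

# .take names the axis and always gathers along it, nothing else.
assert np.array_equal(np.take(x, idx, axis=0), x[idx])
```

The `x[idx, idx]` case is where the n-d rules start biting: it's pointwise pairing, not a cross product, and nothing in the syntax warns you.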
·
I'm eagerly looking forward to the “better” NumPy :)
You mean the np.linalg.solve documentation? Yeah, I don't think it's actually ambiguous, just very hard to understand!