labels: shape [d_0, d_1, ..., d_{r-1}]
logits: shape [d_0, d_1, ..., d_{r-1}, num_classes]
Going hand in hand with #1020 and #1260, it would be nice for CrossEntropyLoss to have the same multi-dim behavior that softmax has now. This would be similar to https://www.tensorflow.org/api_docs/python/tf/nn/sparse_softmax_cross_entropy_with_logits
where the loss is evaluated along the last dimension of the logits, i.e. with the labels and logits shapes listed above.
At the moment, one has to reshape to 2D before the operation and back again afterwards.
Supporting this directly would not require a signature change and would not break current code that passes 2D input to the loss.
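For reference, the reshape workaround can be sketched roughly as below. This is only an illustration, not a proposed implementation: the helper name `cross_entropy_last_dim` is made up for this example, and it assumes a PyTorch version where `torch.nn.functional.cross_entropy` accepts `reduction="none"`.

```python
import torch
import torch.nn.functional as F


def cross_entropy_last_dim(logits, labels):
    """Per-element cross entropy over the last dimension of `logits`.

    Hypothetical helper: logits has shape [d_0, ..., d_{r-1}, num_classes],
    labels has shape [d_0, ..., d_{r-1}] and holds class indices.
    """
    num_classes = logits.shape[-1]
    # Flatten all leading dimensions so the loss sees ordinary 2D input.
    flat_logits = logits.reshape(-1, num_classes)
    flat_labels = labels.reshape(-1)
    # Keep one loss value per element instead of averaging.
    flat_loss = F.cross_entropy(flat_logits, flat_labels, reduction="none")
    # Restore the original leading dimensions.
    return flat_loss.reshape(labels.shape)


torch.manual_seed(0)
logits = torch.randn(2, 3, 5)          # [d_0, d_1, num_classes]
labels = torch.randint(0, 5, (2, 3))   # [d_0, d_1]
loss = cross_entropy_last_dim(logits, labels)
```

Here `loss` has the same shape as `labels`; calling `.mean()` on it would give the usual scalar loss.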
Any thoughts?