Introduce SoftmaxCrossentropy as a loss function #2573
Conversation
We have a guideline for adding new operators. Could you please take a look and make sure all requirements are met? Many thanks.
This PR depends on Liqun's PR (#2551), where AddQueriedFunctionBody was introduced; that is why some of the tests are failing.
| {{"X_Log"}, "Log", {"X_SM"}}, | ||
| {{"X_Mul"}, "Mul", {"labels", "X_Log"}}, | ||
| {{"X_Mul2"}, "Mul", {"weights", "X_Mul"}}, | ||
| {{"output"}, "ReduceMean", {"X_Mul2"}} |
ReduceMean:
Do we need to set axes here? Does the default value work?
No need to specify axes here.
If a reduction is specified, ReduceMean is computed over the whole output (axes=None, so the output is a scalar).
For more specific cases requiring computing the mean over specific axes, there is the possibility of using a subgraph with ONNX::SoftmaxCrossEntropy with 'none' reduction followed by ONNX::ReduceMean with the desired axes.
How does that sound?
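To illustrate the two options discussed above, here is a numpy sketch: the default reduction (no axes) collapses to a scalar, while per-axis means can be taken from the 'none'-reduction output. The loss values `L` are hypothetical.

```python
import numpy as np

# Hypothetical per-element losses, e.g. produced with reduction='none'.
L = np.arange(12, dtype=np.float32).reshape(3, 4)

# Default ReduceMean behaviour (no axes attribute): reduce over all axes.
scalar_mean = np.mean(L)          # shape () -> scalar output

# Mean over specific axes: keep reduction='none', then reduce the
# desired axes explicitly, e.g. a per-sample mean over the last axis.
per_sample = np.mean(L, axis=-1)  # shape (3,)
```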
"T",
{"tensor(float16)", "tensor(float)", "tensor(double)"},
"Constrain input and output types to float tensors.")
.AddQueriedFunctionBody([](FunctionBodyQueryContext& ctx) { // no weight, reduction is "none"
AddQueriedFunctionBody:
Generally speaking, I feel that this approach is not extensible.
In this op there are 6 combinations. What about other ops where the options are combinatorial?
Please review @houseroad @postrational
Finally, L is optionally reduced:
L = ReduceSum(L), if reduction = 'sum';
ReduceMean(L), if reduction = 'mean'; if "weight" is provided, output is averaged by sum of weights.
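The reduction rules quoted above can be sketched in numpy as follows. Here `L` holds the (already weighted, when applicable) per-element losses, and the function and parameter names are hypothetical, not the ONNX spec itself.

```python
import numpy as np

def reduce_loss(L, reduction="mean", weight=None):
    """Sketch of the optional reduction step (hypothetical names)."""
    if reduction == "sum":
        return np.sum(L)                   # L = ReduceSum(L)
    if reduction == "mean":
        if weight is None:
            return np.mean(L)              # plain mean
        return np.sum(L) / np.sum(weight)  # averaged by sum of weights
    return L                               # reduction = 'none': no reduction
```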
It may be useful to change "output is averaged by sum of weights" to "output is the weighted mean".
I just realized that the second output is missing in this spec. See https://aiinfra.visualstudio.com/Lotus/_git/onnxruntime/pullrequest/5592 for details.
@ebarsoum @postrational any more comments on this PR?
a5c9ada to 13892a9
* Add SoftmaxCrossEntropy
* Add tests
* Update tests
* Add more onnx and pb files
* Add Changelog
* Add Operators.md
* Add TestCoverage.md
* Add shape and inference propagation
* Remove extra space
* Make SoftmaxCrossEntropy a function
* hasInputShape check
* Add tests for shape and inference
* Compare input shapes
* Update tests and docs
* Convert tabs to spaces
* Add weights as an attribute and not input
* Update tests
* fix build issues
* uncomment addqueryfunction
* new tests files
* remove old test files
* Update SoftmaxEntropy function
* Update docs
* update docs
* update tests and test files
* Update shape and inference tests
* update test files
* Use SetContextDependentFunctionBodyBuilder
* fix flake8 errors
* tab to space
* propagate attribute reduction
* update docs
* debug
* Address reviewers comments
* update shape and inference tests
* move softmaxcrossetropyloss to math folder
* Update doc
* Fix 'module is not callable' error
* Ignore type signature
* Remove whitespace
* Add type annotations
* Add optional output log_prob
* Remove whitespace
* Update docs
This ONNX function calculates loss using the softmax and cross-entropy functions.
Inputs can be K-dimensional tensors of the same shape, where K >= 1.
The output depends on the reduction attribute: if reduction is 'none', the output has the same shape as the input; otherwise ('mean' or 'sum'), the output is a scalar.
Pytorch: https://pytorch.org/docs/stable/nn.html#crossentropyloss
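The shape behaviour described above can be sketched end to end in numpy. This is a minimal illustration, not the ONNX op or the PyTorch implementation; the function name is hypothetical and labels are assumed one-hot over the last axis.

```python
import numpy as np

def softmax_xent(scores, labels, reduction="mean"):
    """Minimal sketch: softmax over the last axis, cross entropy
    against one-hot labels, then the optional reduction."""
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))  # stable softmax
    probs = e / e.sum(axis=-1, keepdims=True)
    L = -np.sum(labels * np.log(probs), axis=-1)  # per-sample loss
    if reduction == "sum":
        return np.sum(L)       # scalar
    if reduction == "mean":
        return np.mean(L)      # scalar
    return L                   # 'none': same leading shape as input
```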