
Introduce SoftmaxCrossentropy as a loss function#2573

Merged
wschin merged 49 commits intoonnx:masterfrom
KsenijaS:crossentropy
Feb 21, 2020

Conversation

@KsenijaS
Contributor

@KsenijaS KsenijaS commented Jan 29, 2020

This ONNX function computes the loss by applying softmax followed by cross entropy.
Inputs can be K-dimensional tensors of the same shape, where K >= 1.
The output depends on the reduction attribute: if reduction is 'none', the output has the same shape as the input; otherwise it is reduced to a scalar.

Pytorch: https://pytorch.org/docs/stable/nn.html#crossentropyloss
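As a rough reference for the semantics described above, a minimal NumPy sketch of the loss is given below. This is an illustration only, not the ONNX function body; the helper name and the (N, C) input layout are assumptions.

```python
import numpy as np

def softmax_cross_entropy(scores, labels, reduction="mean"):
    """Illustrative sketch: scores of shape (N, C), integer labels of shape (N,)."""
    # Numerically stable log-softmax over the class axis.
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Negative log-likelihood of the true class for each sample.
    loss = -log_prob[np.arange(len(labels)), labels]
    if reduction == "mean":
        return loss.mean()
    if reduction == "sum":
        return loss.sum()
    return loss  # reduction == "none": per-sample losses, same leading shape as input
```

With reduction='none' the result keeps one loss value per sample; 'mean' and 'sum' collapse it to a scalar, matching the behavior described in the PR.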

@KsenijaS KsenijaS requested a review from a team as a code owner January 29, 2020 22:15
@wschin
Collaborator

wschin commented Jan 30, 2020

We have a guideline for adding new operators. Could you please take a look and make sure all requirements are met? Many thanks.

Comment thread onnx/backend/test/case/node/crossentropy.py Outdated
Comment thread onnx/defs/loss/defs.cc Outdated
@prasanthpul prasanthpul added the topic: operator Issues related to ONNX operators label Feb 1, 2020
@KsenijaS
Contributor Author

KsenijaS commented Feb 4, 2020

This PR depends on Liqun's PR (#2551), where AddQueriedFunctionBody was introduced; that's why some of the tests are failing.

@KsenijaS KsenijaS requested a review from a team as a code owner February 4, 2020 17:42
Comment thread onnx/defs/loss/defs.cc Outdated
Comment thread onnx/defs/loss/defs.cc Outdated
Comment thread onnx/defs/loss/defs.cc Outdated
Comment thread onnx/defs/loss/defs.cc Outdated
{{"X_Log"}, "Log", {"X_SM"}},
{{"X_Mul"}, "Mul", {"labels", "X_Log"}},
{{"X_Mul2"}, "Mul", {"weights", "X_Mul"}},
{{"output"}, "ReduceMean", {"X_Mul2"}}
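The quoted function-body fragment (the variant with a weight input and 'mean' reduction) can be mirrored step by step in NumPy. This sketch only reproduces the op chain as quoted — the fragment shown here omits the negation that full cross entropy needs — and all shapes and values are illustrative assumptions (labels taken as one-hot).

```python
import numpy as np

# Illustrative inputs: scores (N, C), one-hot labels (N, C), broadcastable weights.
scores = np.array([[1.0, 2.0], [3.0, 0.0]])
labels = np.array([[0.0, 1.0], [1.0, 0.0]])
weights = np.ones_like(scores)

e = np.exp(scores - scores.max(axis=1, keepdims=True))
X_SM = e / e.sum(axis=1, keepdims=True)   # {"X_SM"},   "Softmax"
X_Log = np.log(X_SM)                      # {"X_Log"},  "Log"
X_Mul = labels * X_Log                    # {"X_Mul"},  "Mul" with labels
X_Mul2 = weights * X_Mul                  # {"X_Mul2"}, "Mul" with weights
output = X_Mul2.mean()                    # {"output"}, "ReduceMean" (axes=None -> scalar)
```

Note that ReduceMean with no axes reduces over every element, which is exactly the point raised in the comment below about whether axes must be set.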
Contributor


ReduceMean

Do we need to set axes here? Does the default value work?

Contributor


No need to specify axes here.

If a reduction is specified, ReduceMean is computed over the whole output (axes=None, so the output is a scalar).
For cases that require computing the mean over specific axes, a subgraph can use ONNX::SoftmaxCrossEntropy with 'none' reduction followed by ONNX::ReduceMean with the desired axes.

How does that sound?
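The composition suggested here is easy to illustrate. In the sketch below, `per_elem` stands in for the output of SoftmaxCrossEntropy with reduction='none' on a hypothetical (N, d1, d2) problem; the shapes are assumptions for illustration only.

```python
import numpy as np

# 'per_elem' plays the role of the loss with reduction='none': one value per element.
rng = np.random.default_rng(0)
per_elem = rng.random((4, 3, 2))

# What the built-in 'mean' reduction gives: ReduceMean with axes=None -> scalar.
scalar_mean = per_elem.mean()

# The suggested alternative: follow the 'none'-reduction loss with a ReduceMean
# over only the desired axes, e.g. the spatial axes, keeping one value per sample.
per_sample = per_elem.mean(axis=(1, 2))
```

This is why the function body itself never needs axes: anything more specific can be expressed as reduction='none' plus an explicit ReduceMean.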

Comment thread onnx/defs/loss/defs.cc Outdated
"T",
{"tensor(float16)", "tensor(float)", "tensor(double)"},
"Constrain input and output types to float tensors.")
.AddQueriedFunctionBody([](FunctionBodyQueryContext& ctx) { // no weight, reduction is "none"
Contributor

@SherlockNoMad SherlockNoMad Feb 4, 2020


AddQueriedFunctionBody

Generally speaking, I feel that this approach is not extensible.
In this op there are already 6 combinations; what about other ops where the options are combinatorial?

@KsenijaS
Contributor Author

KsenijaS commented Feb 4, 2020

Please review @houseroad @postrational

Comment thread onnx/backend/test/case/node/crossentropy.py Outdated
Comment thread onnx/backend/test/case/node/crossentropy.py Outdated
Comment thread onnx/defs/math/defs.cc Outdated

Finally, L is optionally reduced:
L = ReduceSum(L), if reduction = 'sum';
L = ReduceMean(L), if reduction = 'mean'; if "weight" is provided, the output is the weighted mean (the sum of weighted losses divided by the sum of the applied weights).
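A small sketch of the reduction step under discussion, including the weighted-mean case the reviewer asks to rename. The helper name and the per-class weight layout are assumptions for illustration.

```python
import numpy as np

def reduce_loss(loss, labels, weight=None, reduction="mean"):
    """loss: per-sample losses (N,); labels: true classes (N,); weight: per-class (C,)."""
    if weight is not None:
        w = weight[labels]          # weight of each sample's true class
        loss = loss * w
    if reduction == "sum":
        return loss.sum()
    if reduction == "mean":
        # Weighted mean: divide by the sum of the weights that actually applied,
        # not by N -- this is the "output is weighted-mean" behavior.
        return loss.sum() / w.sum() if weight is not None else loss.mean()
    return loss                     # reduction == "none"
```

For example, losses [1, 2] with class weights [1, 3] and labels [0, 1] reduce to (1·1 + 2·3) / (1 + 3) rather than a plain average.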
Contributor


May be useful to change "output is averaged by sum of weights" to "output is weighted-mean".

Collaborator

@wschin wschin left a comment


:shipit:

@SherlockNoMad
Contributor

I just realized that the second output is missing in this spec.
The second optional output should be log_prob = LogSoftmax(input); this output will be used to speed up the backward computation.

See https://aiinfra.visualstudio.com/Lotus/_git/onnxruntime/pullrequest/5592 for details
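To illustrate why returning log_prob helps the backward pass, here is a hedged NumPy sketch: for a one-hot target, the gradient of the mean loss with respect to the scores is softmax(scores) minus the one-hot target, and softmax can be recovered cheaply as exp(log_prob) instead of recomputing it. Function names are illustrative, not the ONNX or onnxruntime API.

```python
import numpy as np

def softmax_cross_entropy_with_log_prob(scores, labels):
    """Forward pass returning both the mean loss and log_prob = LogSoftmax(scores)."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    loss = -log_prob[np.arange(len(labels)), labels].mean()
    return loss, log_prob  # second output is reused by the backward computation

def backward(log_prob, labels):
    """Gradient of the mean loss w.r.t. scores, using the saved log_prob."""
    grad = np.exp(log_prob)                      # softmax recovered without re-running it
    grad[np.arange(len(labels)), labels] -= 1.0  # subtract the one-hot target
    return grad / len(labels)                    # account for the 'mean' reduction
```

Saving log_prob in the forward pass means the backward kernel avoids a second softmax over the inputs, which is the speed-up motivating the extra output.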

@linkerzhang
Member

@ebarsoum @postrational any more comments on this PR?

@wschin wschin merged commit e91739f into onnx:master Feb 21, 2020
@chinhuang007 chinhuang007 added this to the 1.7 milestone Feb 27, 2020
@codemzs codemzs mentioned this pull request Apr 26, 2020
jcwchen pushed a commit to jcwchen/onnx that referenced this pull request Sep 23, 2020
* Add SoftmaxCrossEntropy

* Add tests

* Update tests

* Add more onnx and pb files

* Add Changelog

* Add Operators.md

* Add TestCoverage.md

* Add shape and inference propagation

* Remove extra space

* Make SoftmaxCrossEntropy a function

* hasInputShape check

* Add tests for shape and inference

* Compare input shapes

* Update tests and docs

* Convert tabs to spaces

* Add weights as an attribute and not input

* Update tests

* fix build issues

* uncomment addqueryfunction

* new tests files

* remove old test files

* Update SoftmaxEntropy function

* Update docs

* update  docs

* update tests and test files

* Update shape and inference tests

* update test files

* Use SetContextDependentFunctionBodyBuilder

* fix flake8 errors

* tab to space

* propagate attribute reduction

* update docs

* debug

* Address reviewers comments

* update shape and inference tests

* move softmaxcrossetropyloss to math folder

* Update doc

* Fix 'module is not callable' error

* Ignore type signature

* Remove whitespace

* Add type annotations

* Add optional output log_prob

* Remove whitespace

* Update docs

Labels

topic: operator Issues related to ONNX operators topic: training Issues related to ONNX training


8 participants