SoftmaxCrossEntropyLoss-12 forward and backward kernel implementation. #3237

Closed
codemzs wants to merge 5 commits into ort_training from softmaxcrossentropyloss

@codemzs codemzs commented Mar 17, 2020

Description: SoftmaxCrossEntropyLoss-12 forward and backward kernel implementation.

Motivation and Context

  • SoftmaxCrossEntropyLoss was introduced in opset-12; this PR adds its implementation.
  • Much of the implementation is taken from SparseSoftmaxCrossEntropyLoss, with minor modifications to make the code generic and conform to the opset-12 spec.
  • Deprecate SparseSoftmaxCrossEntropyLoss.

TODO:

  • Investigate large error in gradient calculation test.
  • Do we need a SoftmaxCrossEntropyLoss implementation for Float16 and double? @SherlockNoMad

@codemzs codemzs requested a review from a team as a code owner March 17, 2020 10:43
  for (int j = 0; j < d; j++) {
    int index = i * d + j;
-   d_logit_data[index] = (exp(log_prob_data[index]) - (label_sample == j)) * dY_scaled;
+   d_logit_data[index] = (exp(log_prob_data[index]) - ((int)(label_sample) == j)) * dY_scaled;

@codemzs codemzs Mar 17, 2020

int (start = 61, length = 3)

Try casting to int64 instead. #Closed
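
For readers following the thread, here is a minimal standalone sketch of the gradient expression in the hunk above, with the label compared as a 64-bit integer as suggested. It is not the ONNX Runtime kernel: the function name SoftmaxCrossEntropyGradSketch, the std::vector interface, and the choice of int64_t labels are assumptions made for illustration, and dY_scaled is assumed to already include any reduction scaling.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of this PR): computes the logit gradient
//   d_logit[i, j] = (exp(log_prob[i, j]) - [label[i] == j]) * dY_scaled
// log_prob  : n x d row-major log-softmax values
// label     : n class indices, assumed here to be int64_t
// dY_scaled : upstream gradient, assumed to be pre-scaled by the caller
std::vector<float> SoftmaxCrossEntropyGradSketch(const std::vector<float>& log_prob,
                                                 const std::vector<int64_t>& label,
                                                 int64_t n, int64_t d, float dY_scaled) {
  std::vector<float> d_logit(static_cast<std::size_t>(n * d));
  for (int64_t i = 0; i < n; ++i) {
    const int64_t label_sample = label[static_cast<std::size_t>(i)];
    for (int64_t j = 0; j < d; ++j) {
      const std::size_t index = static_cast<std::size_t>(i * d + j);
      // Compare in int64_t so the label and the class index share one type.
      const float one_hot = (label_sample == j) ? 1.0f : 0.0f;
      d_logit[index] = (std::exp(log_prob[index]) - one_hot) * dY_scaled;
    }
  }
  return d_logit;
}
```

Keeping both the loop index and the label in int64_t avoids a narrowing cast on the label; whether that matches the kernel's actual label type in this PR is not confirmed here.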

  }
  #endif
- TEST(GradientGraphBuilderTest, TrainingSession_BertToy) {
+ TEST(GradientGraphBuilderTest, DISABLED_TrainingSession_BertToy) {
Member Author

Will re-enable before check-in.

@SherlockNoMad SherlockNoMad added the "training" label (issues related to ONNX Runtime training; typically submitted using template) Mar 20, 2020
@codemzs codemzs closed this Apr 8, 2020