
Some of the loss functions probably don't work correctly #2444

@iamshnoo

Description


I was verifying some of the loss functions in mlpack before modifying them to introduce reduction functionality. Some of them probably aren't implemented correctly, or have misleading class names; I am listing these below. The goal of this issue is primarily to discuss any opinions on these points. After that, I would like to make the necessary changes so that everything works correctly.

  • L1 Loss
  • Cross Entropy Error
  • Mean Bias Error
  • Cosine Embedding Loss
  • Hinge Embedding Loss
  • KL Div Loss
  • Margin Ranking Loss
  • NLL Loss

For L1 loss, the forward function doesn't take the absolute difference between input and target. Also, the backward function returns the same output matrix regardless of the value of the reduction parameter (e.g. mean). I have created a demo Google Colab notebook to demonstrate the changes that need to be done.
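As a rough sketch of what I'd expect (mirroring `torch.nn.L1Loss`; the function names below are illustrative, not mlpack's API): the forward pass sums absolute differences, and the backward pass is the elementwise sign, scaled by 1/n only under the mean reduction.

```python
def l1_forward(input, target, reduction="mean"):
    # Forward: sum of |input - target|, averaged for reduction='mean'.
    diffs = [abs(x - y) for x, y in zip(input, target)]
    total = sum(diffs)
    return total / len(diffs) if reduction == "mean" else total

def l1_backward(input, target, reduction="mean"):
    # Backward: d/dx |x - y| = sign(x - y); divide by n for the mean
    # reduction, so the gradient actually depends on the reduction chosen.
    scale = 1.0 / len(input) if reduction == "mean" else 1.0
    return [scale * ((x > y) - (x < y)) for x, y in zip(input, target)]

x, y = [1.0, 2.0, 5.0], [2.0, 2.0, 3.0]
print(l1_forward(x, y, "sum"))    # 3.0
print(l1_forward(x, y, "mean"))   # 1.0
print(l1_backward(x, y, "sum"))   # [-1.0, 0.0, 1.0]
```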


Cross Entropy Error in mlpack is actually Binary Cross Entropy Loss. Also, it works only when the target is one-hot encoded, and it computes reduction='sum' by default, whereas PyTorch computes reduction='mean' by default. None of these assumptions is documented anywhere, in spite of the discussion in #1070, which was merged a long time back. I have verified these claims in a Google Colab notebook in case anyone would like to take a quick look at the issue.
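To make the sum-vs-mean discrepancy concrete, here is a small illustrative sketch (not mlpack's API) of binary cross-entropy with an explicit reduction parameter; the two defaults differ exactly by a factor of the element count:

```python
import math

def bce(pred, target, reduction="mean"):
    # Pointwise binary cross-entropy: -(t*log(p) + (1-t)*log(1-p)).
    terms = [-(t * math.log(p) + (1 - t) * math.log(1 - p))
             for p, t in zip(pred, target)]
    total = sum(terms)
    return total / len(terms) if reduction == "mean" else total

pred, target = [0.9, 0.2], [1.0, 0.0]   # binary / one-hot targets
# 'sum' (mlpack's current behaviour) is len(pred) times 'mean'
# (PyTorch's default):
assert abs(bce(pred, target, "sum") - 2 * bce(pred, target, "mean")) < 1e-12
```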


Mean Bias Error -> Google Colab notebook


For Cosine Embedding Loss, the current mlpack implementation triggers two compiler warnings (-Wreorder and -Wunused).

-Wreorder (the member declaration order doesn't match the constructor's initialization order):
bool similarity; double margin; bool takeMean; // order of member declaration
const double margin, const bool similarity, const bool takeMean // constructor parameter order

-Wunused (the variable is computed but never read):
const size_t batchSize = input.n_elem / cols; // in the Backward() method

Also, I have very little idea how to implement reduction='none' for this function, and will definitely need Kartik's help with it.
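For discussion purposes, here is a rough sketch of what reduction='none' could mean here, following PyTorch's CosineEmbeddingLoss formula (1 - cos(x1, x2) for y = 1, max(0, cos(x1, x2) - margin) for y = -1); the names are illustrative only:

```python
import math

def cosine_embedding(x1, x2, y, margin=0.0, reduction="mean"):
    # x1, x2 are lists of vectors; y holds labels in {1, -1}.
    losses = []
    for a, b, label in zip(x1, x2, y):
        dot = sum(p * q for p, q in zip(a, b))
        cos = dot / (math.sqrt(sum(p * p for p in a)) *
                     math.sqrt(sum(q * q for q in b)))
        losses.append(1 - cos if label == 1 else max(0.0, cos - margin))
    if reduction == "none":
        return losses            # elementwise losses, no aggregation
    total = sum(losses)
    return total / len(losses) if reduction == "mean" else total

x1 = [[1.0, 0.0], [0.0, 1.0]]
x2 = [[1.0, 0.0], [0.0, 1.0]]
print(cosine_embedding(x1, x2, [1, -1], reduction="none"))  # [0.0, 1.0]
```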


Dice Loss -> Probably no need to implement reduction for this function (will update later if needed)


Earth Mover Distance -> correct implementation (reduction implementation : Google Colab notebook)


Hinge Embedding Loss -> The current implementation has two issues. Primarily, the function is implemented incorrectly; a minor secondary design issue is that it accepts labels that are 0 or 1 instead of labels that are 1 or -1 (as in PyTorch and other frameworks).
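For reference, the PyTorch semantics I am comparing against (a sketch, not mlpack's API): the per-element loss is x for y = 1 and max(0, margin - x) for y = -1.

```python
def hinge_embedding(input, target, margin=1.0, reduction="mean"):
    # Labels are 1 or -1, as in torch.nn.HingeEmbeddingLoss.
    terms = [x if y == 1 else max(0.0, margin - x)
             for x, y in zip(input, target)]
    total = sum(terms)
    return total / len(terms) if reduction == "mean" else total

# y = 1 contributes x directly; y = -1 contributes max(0, margin - x):
print(hinge_embedding([0.5, 2.0], [1, -1], reduction="sum"))  # 0.5
```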
Google Colab notebook


KL Div Loss -> Current implementation has a few issues. I have standardised the implementation to match the PyTorch implementation in this Google Colab notebook along with correction of those issues.
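As a reference point for the standardised behaviour: torch.nn.KLDivLoss expects the input already as log-probabilities and computes target * (log(target) - input) pointwise. A sketch under that assumption (illustrative names, not mlpack's API):

```python
import math

def kl_div(log_input, target, reduction="mean"):
    # Pointwise: target * (log(target) - log_input); the term is zero
    # where target == 0 (by the usual 0 * log 0 = 0 convention).
    terms = [t * (math.log(t) - li) if t > 0 else 0.0
             for li, t in zip(log_input, target)]
    total = sum(terms)
    return total / len(terms) if reduction == "mean" else total

# Sanity check: KL divergence of a distribution against itself is zero.
p = [0.25, 0.75]
assert abs(kl_div([math.log(q) for q in p], p, "sum")) < 1e-12
```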


Huber Loss -> Correct implementation. Reduction facilities introduced in this Google Colab notebook


LogCosh Loss -> Correct implementation. Reduction facilities introduced in this Google Colab notebook


Margin Ranking Loss -> The Forward method was incorrect; I fixed it. The Backward method is probably also incorrect, but I don't know how to fix it under the current implementation's restrictions. See the Google Colab notebook here
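For context, the forward formula I fixed it against is PyTorch's: max(0, -y * (x1 - x2) + margin), with y in {1, -1}. A minimal sketch (illustrative names only):

```python
def margin_ranking(x1, x2, y, margin=0.0, reduction="mean"):
    # y = 1 means x1 should rank higher than x2; y = -1 means the opposite.
    terms = [max(0.0, -label * (a - b) + margin)
             for a, b, label in zip(x1, x2, y)]
    total = sum(terms)
    return total / len(terms) if reduction == "mean" else total

# Correctly ordered pair (y = 1, x1 > x2): no loss.
print(margin_ranking([1.0], [0.0], [1], reduction="sum"))   # 0.0
# Same pair with y = -1: penalised by the gap.
print(margin_ranking([1.0], [0.0], [-1], reduction="sum"))  # 1.0
```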


NLL Loss -> Google Colab notebook here. I am not entirely sure about this one, but I expect that for the same input and target matrices, PyTorch and mlpack should return the same values, which doesn't happen with the current implementation. (Maybe I am wrong about something?) I have updated the implementation details to match PyTorch's outputs and also added all the reduction facilities in the notebook above.
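The PyTorch semantics I matched against, sketched in plain Python (illustrative, not mlpack's API): each row of the input holds log-probabilities, the target holds a class index per row, and the loss picks the negated log-probability of the target class, averaged by default.

```python
def nll(log_probs, targets, reduction="mean"):
    # Pick -log_probs[row][class] for each row's target class.
    terms = [-row[c] for row, c in zip(log_probs, targets)]
    total = sum(terms)
    return total / len(terms) if reduction == "mean" else total

lp = [[-0.1, -2.0], [-1.5, -0.3]]
print(nll(lp, [0, 1], "sum"))   # ~0.4 (= 0.1 + 0.3)
```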

