
Some of the loss functions probably don't work correctly #2444

@iamshnoo

Description


I was verifying some of the loss functions in mlpack before modifying them to introduce reduction functionality. Some of them probably aren't implemented correctly, or have misleading class names; I am listing these below. The goal of this issue is primarily to discuss any opinions on these points. After that, I would like to make the necessary changes so that everything works correctly.

  • L1 Loss
  • Cross Entropy Error
  • Mean Bias Error
  • Cosine Embedding Loss
  • Hinge Embedding Loss
  • KL Div Loss
  • Margin Ranking Loss
  • NLL Loss

For L1 loss, the forward function doesn't take the absolute difference between input and target. Also, the backward function returns the same output matrix regardless of the value of the reduction parameter (e.g. mean). I have created a demo Google Colab notebook to demonstrate the changes that need to be done.
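As a rough sketch of what I'd expect (mirroring `torch.nn.L1Loss`; the function names below are illustrative, not mlpack's API): the forward pass sums absolute differences, and the backward pass is the elementwise sign, scaled by 1/n only under the mean reduction.

```python
def l1_forward(input, target, reduction="mean"):
    # Forward: sum of |input - target|, averaged for reduction='mean'.
    diffs = [abs(x - y) for x, y in zip(input, target)]
    total = sum(diffs)
    return total / len(diffs) if reduction == "mean" else total

def l1_backward(input, target, reduction="mean"):
    # Backward: d/dx |x - y| = sign(x - y); divide by n for the mean
    # reduction, so the gradient actually depends on the reduction chosen.
    scale = 1.0 / len(input) if reduction == "mean" else 1.0
    return [scale * ((x > y) - (x < y)) for x, y in zip(input, target)]

x, y = [1.0, 2.0, 5.0], [2.0, 2.0, 3.0]
print(l1_forward(x, y, "sum"))    # 3.0
print(l1_forward(x, y, "mean"))   # 1.0
print(l1_backward(x, y, "sum"))   # [-1.0, 0.0, 1.0]
```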


Cross Entropy Error in mlpack is actually Binary Cross Entropy Loss. Also, it works only when the target is one-hot encoded, and it computes reduction='sum' by default, whereas PyTorch computes reduction='mean' by default. None of these assumptions is documented anywhere, in spite of the discussion in #1070, which was merged a long time back. I have verified these claims in a Google Colab notebook in case anyone would like to take a quick look at the issue.
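To make the sum-vs-mean discrepancy concrete, here is a small illustrative sketch (not mlpack's API) of binary cross-entropy with an explicit reduction parameter; the two defaults differ exactly by a factor of the element count:

```python
import math

def bce(pred, target, reduction="mean"):
    # Pointwise binary cross-entropy: -(t*log(p) + (1-t)*log(1-p)).
    terms = [-(t * math.log(p) + (1 - t) * math.log(1 - p))
             for p, t in zip(pred, target)]
    total = sum(terms)
    return total / len(terms) if reduction == "mean" else total

pred, target = [0.9, 0.2], [1.0, 0.0]   # binary / one-hot targets
# 'sum' (mlpack's current behaviour) is len(pred) times 'mean'
# (PyTorch's default):
assert abs(bce(pred, target, "sum") - 2 * bce(pred, target, "mean")) < 1e-12
```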


Mean Bias Error -> Google Colab notebook


For Cosine Embedding Loss, the current mlpack implementation triggers two compiler warnings (-Wreorder and -Wunused).

-Wreorder (the member declaration order doesn't match the constructor's initialization order):
bool similarity; double margin; bool takeMean; // order of member declaration
const double margin, const bool similarity, const bool takeMean // constructor parameter order

-Wunused (the variable is computed but never read):
const size_t batchSize = input.n_elem / cols; // in the Backward() method

Also, I have very little idea how to implement reduction='none' for this function, and will definitely need Kartik's help with it.
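For discussion purposes, here is a rough sketch of what reduction='none' could mean here, following PyTorch's CosineEmbeddingLoss formula (1 - cos(x1, x2) for y = 1, max(0, cos(x1, x2) - margin) for y = -1); the names are illustrative only:

```python
import math

def cosine_embedding(x1, x2, y, margin=0.0, reduction="mean"):
    # x1, x2 are lists of vectors; y holds labels in {1, -1}.
    losses = []
    for a, b, label in zip(x1, x2, y):
        dot = sum(p * q for p, q in zip(a, b))
        cos = dot / (math.sqrt(sum(p * p for p in a)) *
                     math.sqrt(sum(q * q for q in b)))
        losses.append(1 - cos if label == 1 else max(0.0, cos - margin))
    if reduction == "none":
        return losses            # elementwise losses, no aggregation
    total = sum(losses)
    return total / len(losses) if reduction == "mean" else total

x1 = [[1.0, 0.0], [0.0, 1.0]]
x2 = [[1.0, 0.0], [0.0, 1.0]]
print(cosine_embedding(x1, x2, [1, -1], reduction="none"))  # [0.0, 1.0]
```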


Dice Loss -> Probably no need to implement reduction for this function (will update later if needed)


Earth Mover Distance -> correct implementation (reduction implementation : Google Colab notebook)


Hinge Embedding Loss -> The current implementation has two issues. Primarily, the function is implemented incorrectly; a minor secondary design issue is that it accepts labels that are 0 or 1 instead of labels that are 1 or -1 (as in PyTorch and other frameworks).
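For reference, the PyTorch semantics I am comparing against (a sketch, not mlpack's API): the per-element loss is x for y = 1 and max(0, margin - x) for y = -1.

```python
def hinge_embedding(input, target, margin=1.0, reduction="mean"):
    # Labels are 1 or -1, as in torch.nn.HingeEmbeddingLoss.
    terms = [x if y == 1 else max(0.0, margin - x)
             for x, y in zip(input, target)]
    total = sum(terms)
    return total / len(terms) if reduction == "mean" else total

# y = 1 contributes x directly; y = -1 contributes max(0, margin - x):
print(hinge_embedding([0.5, 2.0], [1, -1], reduction="sum"))  # 0.5
```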
Google Colab notebook


KL Div Loss -> Current implementation has a few issues. I have standardised the implementation to match the PyTorch implementation in this Google Colab notebook along with correction of those issues.
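As a reference point for the standardised behaviour: torch.nn.KLDivLoss expects the input already as log-probabilities and computes target * (log(target) - input) pointwise. A sketch under that assumption (illustrative names, not mlpack's API):

```python
import math

def kl_div(log_input, target, reduction="mean"):
    # Pointwise: target * (log(target) - log_input); the term is zero
    # where target == 0 (by the usual 0 * log 0 = 0 convention).
    terms = [t * (math.log(t) - li) if t > 0 else 0.0
             for li, t in zip(log_input, target)]
    total = sum(terms)
    return total / len(terms) if reduction == "mean" else total

# Sanity check: KL divergence of a distribution against itself is zero.
p = [0.25, 0.75]
assert abs(kl_div([math.log(q) for q in p], p, "sum")) < 1e-12
```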


Huber Loss -> Correct implementation. Reduction facilities introduced in this Google Colab notebook


LogCosh Loss -> Correct implementation. Reduction facilities introduced in this Google Colab notebook


Margin Ranking Loss -> The Forward method was incorrect; I fixed it. The Backward method is probably also incorrect, but I don't know how to fix it under the current implementation's restrictions. See the Google Colab notebook here
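For context, the forward formula I fixed it against is PyTorch's: max(0, -y * (x1 - x2) + margin), with y in {1, -1}. A minimal sketch (illustrative names only):

```python
def margin_ranking(x1, x2, y, margin=0.0, reduction="mean"):
    # y = 1 means x1 should rank higher than x2; y = -1 means the opposite.
    terms = [max(0.0, -label * (a - b) + margin)
             for a, b, label in zip(x1, x2, y)]
    total = sum(terms)
    return total / len(terms) if reduction == "mean" else total

# Correctly ordered pair (y = 1, x1 > x2): no loss.
print(margin_ranking([1.0], [0.0], [1], reduction="sum"))   # 0.0
# Same pair with y = -1: penalised by the gap.
print(margin_ranking([1.0], [0.0], [-1], reduction="sum"))  # 1.0
```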


NLL Loss -> Google Colab notebook here. I am not entirely sure about this one, but I expect that for the same input and target matrices, PyTorch and mlpack should return the same values, which doesn't happen with the current implementation. (Maybe I am wrong about something?) I have updated the implementation details to match PyTorch's outputs and also added all the reduction facilities in the notebook above.
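The PyTorch semantics I matched against, sketched in plain Python (illustrative, not mlpack's API): each row of the input holds log-probabilities, the target holds a class index per row, and the loss picks the negated log-probability of the target class, averaged by default.

```python
def nll(log_probs, targets, reduction="mean"):
    # Pick -log_probs[row][class] for each row's target class.
    terms = [-row[c] for row, c in zip(log_probs, targets)]
    total = sum(terms)
    return total / len(terms) if reduction == "mean" else total

lp = [[-0.1, -2.0], [-1.5, -0.3]]
print(nll(lp, [0, 1], "sum"))   # ~0.4 (= 0.1 + 0.3)
```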

