
[ENH] implement efficient _evaluate_by_index for forecast performance metrics #4304: implementation for GeometricMeanAbsoluteError (#6244)

Closed
KaustubhUp025 wants to merge 43 commits into sktime:main from KaustubhUp025:main

Conversation

@KaustubhUp025
Contributor

@KaustubhUp025 KaustubhUp025 commented Apr 1, 2024

Reference Issues/PRs

Towards #4304

What does this implement/fix? Explain your changes.

Implements an efficient `_evaluate_by_index` for performance metrics, taking reference from #4302

Does your contribution introduce a new dependency? If yes, which one?

No

What should a reviewer concentrate their feedback on?

Did you add any tests for the change?

No

Any other comments?

PR checklist

For all contributions
  • I've added myself to the list of contributors with any new badges I've earned :-)
    How to: add yourself to the all-contributors file in the sktime root directory (not the CONTRIBUTORS.md). Common badges: code - fixing a bug, or adding code logic. doc - writing or improving documentation or docstrings. bug - reporting or diagnosing a bug (get this plus code if you also fixed the bug in the PR). maintenance - CI, test framework, release.
    See here for full badge reference
  • Optionally, for added estimators: I've added myself, and possibly others, to the maintainers tag - do this if you want to become the owner or maintainer of an estimator you added.
    See here for further details on the algorithm maintainer role.
  • [ 👍] The PR title starts with either [ENH], [MNT], [DOC], or [BUG]. [BUG] - bugfix, [MNT] - CI, test framework, [ENH] - adding or improving code, [DOC] - writing or improving documentation or docstrings.
For new estimators
  • I've added the estimator to the API reference - in docs/source/api_reference/taskname.rst, follow the pattern.
  • I've added one or more illustrative usage examples to the docstring, in a pydocstyle compliant Examples section.
  • If the estimator relies on a soft dependency, I've set the python_dependencies tag and ensured dependency isolation, see the estimator dependencies guide.

Collaborator

@fkiraly fkiraly left a comment


Thanks!

I think there has been a mistake, you copied the mean absolute error (MAE)?
This metric is supposed to be GMAE, geometric mean absolute error.

Btw, it would also be nice if you provided the formula in the docstring, like in MeanAbsoluteError - that helps avoid math errors and makes the review easier.

@fkiraly fkiraly added enhancement Adding new functionality module:metrics&benchmarking metrics and benchmarking modules labels Apr 1, 2024
@KaustubhUp025
Contributor Author

Thank you @fkiraly, I will check it.

…MeanAbsoluteError with documentation as well
@KaustubhUp025
Contributor Author

@fkiraly I have added the changes that I missed earlier.
Really sorry, I didn't check my code before making the PR the first time.
Will definitely work on this.

@KaustubhUp025 KaustubhUp025 requested a review from fkiraly April 1, 2024 11:49
Collaborator

@fkiraly fkiraly left a comment


Thanks, this looks correct now!

Before we merge, can we discuss numerical stability?
If we implement the formula directly, we use np.prod which can lead to a numerical explosion.

Would it be better to use the exp/log representation instead?
That is, using that geometric averages are arithmetic averages of logarithms, then exp?

For discussion - we could test with a vector of length 100, with value 1000. My conjecture is that the current formula produces nans, while exp/log will be fine.
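For context, the exp/log representation referred to here is the standard identity for the geometric mean (stated here for reference, not quoted from the PR):

```latex
\mathrm{GMAE}
  = \left( \prod_{i=1}^{n} |y_i - \hat{y}_i| \right)^{1/n}
  = \exp\!\left( \frac{1}{n} \sum_{i=1}^{n} \log |y_i - \hat{y}_i| \right)
```

The product form can overflow or underflow float64 for long vectors of large or small errors, while the log-sum stays in a well-conditioned range.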

@KaustubhUp025
Contributor Author

@fkiraly Thank you, I will give it a check and will let you know.

@KaustubhUp025
Contributor Author

@fkiraly I checked your conjecture and you are right.
Using a vector of length 100, with value 1000,

and implementing the functions below to check it, I found this as my result.

import numpy as np

# Function to compute GMAE using the direct product approach
def gmae_direct(y_true, y_pred):
    errors = np.abs(y_true - y_pred)
    product_of_errors = np.prod(errors)
    gmae = np.power(product_of_errors, 1 / len(errors))
    return gmae

And :-

def gmae_log_exp(y_true, y_pred, epsilon=1e-10):
    errors = np.abs(y_true - y_pred)
    errors = np.maximum(errors, epsilon)  # ensure errors are strictly positive
    log_errors = np.log(errors)
    log_mean = np.mean(log_errors)
    gmae = np.exp(log_mean)
    return gmae

And so this was the result I got :-

GMAE using direct product approach: 0.0
GMAE using log/exp representation approach: 9.999999999999996e-11

@fkiraly
Collaborator

fkiraly commented Apr 1, 2024

hm, something is not right here - the result should be 1000

@KaustubhUp025
Contributor Author

KaustubhUp025 commented Apr 1, 2024

Okay, I will check it.
Is there any problem with the given function definition?

@fkiraly
Collaborator

fkiraly commented Apr 1, 2024

I see - I meant when the errors are a vector of 1000s. To create that situation, you could make y_true a vector of 1s, and y_pred a vector of 1001s.

@KaustubhUp025
Contributor Author

@fkiraly I gave the input as :-

test_length = 100
y_true = np.ones(test_length)
y_pred = np.ones(test_length) * 1001

and I got this as the result :-

GMAE using direct product approach: 1000.0000000000001

GMAE using log/exp representation approach: 999.9999999999989

@fkiraly
Collaborator

fkiraly commented Apr 1, 2024

hm, is direct more accurate? Surprising.

How about you make the errors smaller, e.g., 1e-10, and see what happens?

@KaustubhUp025
Contributor Author

@fkiraly So I used the input values you gave, and here is the output:

GMAE using direct product approach: 0.0
GMAE using log/exp representation approach: 1.0000000827403598e-10
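That behavior can be reproduced standalone (a minimal sketch, not the PR's code; with errors of 1e-10 and length 100, the direct product underflows float64):

```python
import numpy as np

# 100 identical absolute errors of 1e-10.
errors = np.full(100, 1e-10)

# Direct product: (1e-10)**100 = 1e-1000 underflows float64 to 0.0,
# so the n-th root of the product is also 0.0.
direct = np.power(np.prod(errors), 1 / len(errors))

# Log/exp representation: averaging logarithms stays well-conditioned.
stable = np.exp(np.mean(np.log(errors)))

print(direct)  # 0.0
print(stable)  # ~1e-10
```

Conversely, with 100 errors of 1000 the product is 1e300, which still fits in float64 - which is why the direct approach happened to look fine in the earlier test.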

@fkiraly
Collaborator

fkiraly commented Apr 1, 2024

yes, seems more stable for extreme values. Shall we go with the log/exp approach then?

@KaustubhUp025
Contributor Author

Yes I have written the code for that as well. I will just add it.

…MAE.

This modification ensures that the errors are strictly positive before taking their logarithm, improving the numerical stability of the calculation.
n = raw_values.shape[0]
gmae_values = []
for i in range(n):
    excluded_error = np.delete(raw_values, i, axis=0)  # exclude i-th error
Collaborator


working with logarithms might make this a bit safer numerically

Contributor Author


Actually, @fkiraly, I tried to rewrite the whole code correctly and created a new PR, #6461, in which I think I have added the things you asked for.


n = raw_values.shape[0]
gmae_values = []
for i in range(n):
Collaborator


can you vectorize this?

Contributor Author


While writing the code for PR #6461 I saw this as an error and vectorized it, since without vectorizing gmae_values it was difficult to multiply by the custom weights in the multioutput case.

Could you please check #6461? I think I should close this PR, as I have written the correct code in #6461, taking reference from this PR.
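For reference, the leave-one-out loop can be vectorized entirely in log-space (a sketch under the assumption that `raw_values` holds strictly positive absolute errors; the function name is illustrative, not the PR's actual code):

```python
import numpy as np

def gmae_jackknife_pseudovalues(raw_values):
    """Vectorized jackknife pseudovalues for the geometric mean absolute error.

    raw_values : 1D array of strictly positive absolute errors.
    """
    log_errors = np.log(raw_values)
    n = log_errors.shape[0]
    total = log_errors.sum()

    # GMAE over all n errors, and over every leave-one-out subset at once:
    # removing the i-th error just subtracts its logarithm from the total.
    gmae_full = np.exp(total / n)
    gmae_loo = np.exp((total - log_errors) / (n - 1))

    # Jackknife pseudovalue: n * full - (n - 1) * leave-one-out.
    return n * gmae_full - (n - 1) * gmae_loo
```

As a sanity check, for constant errors every leave-one-out GMAE equals the full GMAE, so each pseudovalue reduces to the error value itself.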

@fkiraly
Collaborator

fkiraly commented Jul 22, 2024

for discussion later in our mentoring meeting.

I made some corrections and simplifications to the code, but I think it's not done yet.

  • replaced what I thought was an overly verbose (and perhaps incorrect?) implementation in _evaluate with the scipy gmean method
  • removed inheritance for BaseForecastingErrorMetricFunc
  • replaced what looked like an incorrect implementation of the pseudovalues with a few lines in _evaluate_by_index

The implementation in _evaluate_by_index should be correct if I made no mistake, but it may not be numerically stable when dealing with values close to zero. One may have to put some additional thinking into how to tackle that corner of the parameter space.
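The scipy route mentioned above could look roughly like this (a sketch, not sktime's actual implementation; `scipy.stats.gmean` averages in log-space internally, so it sidesteps the `np.prod` over-/underflow discussed earlier):

```python
import numpy as np
from scipy.stats import gmean

y_true = np.array([1.0, 1.0, 1.0, 1.0])
y_pred = np.array([2.0, 3.0, 5.0, 9.0])
errors = np.abs(y_true - y_pred)  # [1, 2, 4, 8]

# One call replaces the verbose manual product-and-root computation.
print(gmean(errors))  # ~2.828 (= 64 ** 0.25)
```

Note that `gmean` alone does not solve the values-close-to-zero problem: a single zero error still sends the logarithm to -inf and the geometric mean to zero, which is the corner case flagged above.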

@fkiraly
Collaborator

fkiraly commented Jul 24, 2024

there are some failures due to a wrong type - it would be great if you could handle these, @KaustubhUp025. The formula should be correct, but a double check would also be appreciated - there might be sign errors or numerical issues.

@KaustubhUp025
Contributor Author

@fkiraly I misunderstood the workflow and created two different PRs for the same GMAE issue; the one I needed reviewed was #6461.
Should I close this PR, or what else should be done with it, since I left it without the improvements I implemented in #6461?

@KaustubhUp025 KaustubhUp025 closed this by deleting the head repository Sep 9, 2024
fkiraly added a commit that referenced this pull request Aug 6, 2025
…ometricMeanAbsoluteError` (#7095)

Towards #4304 

Implementation of efficient `_evaluate_by_index` performance metric for
`GeometricMeanAbsoluteError` by taking reference from #4302

Originally in #6244, which was deleted by @KaustubhUp025; this contains changes by both myself and @KaustubhUp025.
