Add Toto and TimesFM-2.5 examples #53
Conversation
```python
import datasets
import numpy as np
import pandas as pd
import timesfm
```
@rajatsen91, can you please let me know if this wrapper looks good to you? Based on google-research/timesfm@58e01ad and the latest model checkpoint on HF.
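For reference, a rough sketch of how the 2.5 checkpoint can be loaded and compiled with the API from the referenced commit; the class name, config fields, and values below are assumptions based on the upstream examples and may not match this PR's wrapper exactly:

```python
import numpy as np
import timesfm

# Assumed API from the timesfm 2.5 release; names may differ from the wrapper in this PR.
model = timesfm.TimesFM_2p5_200M_torch.from_pretrained("google/timesfm-2.5-200m-pytorch")

model.compile(
    timesfm.ForecastConfig(
        max_context=16_000,   # matches the context_length default discussed below
        max_horizon=256,
        normalize_inputs=True,
        fix_quantile_crossing=True,
    )
)

# Toy example: forecast 64 steps for two synthetic series.
point_forecast, quantile_forecast = model.forecast(
    horizon=64,
    inputs=[np.linspace(0.0, 1.0, 512), np.sin(np.linspace(0.0, 20.0, 300))],
)
```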
LGTM, thanks. Would it be possible to check the results with the context window for the eval also set to 2048 (to match the other models)? My feeling is that it won't change much.
Thanks @rajatsen91, I will merge this PR with the updated results for context_length=16000, the new inference code, and the updated leakage indicator for ['favorita_transactions_1W', 'm5_1W'].
I will then open another PR with the results for TimesFM-2.5 at context length 2048 and let you decide whether we should merge it.
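If it helps for that comparison, the truncation itself can be as simple as the sketch below (`truncate_context` is a hypothetical helper for illustration, not part of this PR's wrapper):

```python
import numpy as np

def truncate_context(series_list, context_length=2048):
    """Keep only the most recent `context_length` observations of each series."""
    return [np.asarray(s, dtype=np.float32)[-context_length:] for s in series_list]

# Example: a 5000-step series is clipped to its last 2048 values.
clipped = truncate_context([np.arange(5000)])
print(clipped[0].shape)  # (2048,)
```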
```python
model_name: str = "google/timesfm-2.5-200m-pytorch",
batch_size: int = 256,
context_length: int = 16_000,
per_core_batch_size: int = 64,
```
Larger per_core_batch_size values (e.g. 128) lead to OOM errors in combination with context_length=16_000 on an A10G GPU (24 GB VRAM), even on a dataset with a single time series. Should I keep per_core_batch_size=64 or reduce the context length?
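Another option would be to keep the long context and fall back to a smaller per-core batch size only when memory runs out. A minimal sketch, assuming a generic callable (`run_forecast` is a hypothetical stand-in for whatever call the wrapper makes internally, not an existing function):

```python
import torch

def forecast_with_backoff(run_forecast, per_core_batch_size=128, min_size=8):
    """Retry `run_forecast(per_core_batch_size)` with smaller batches on CUDA OOM.

    `run_forecast` is a hypothetical callable that runs one forecasting pass
    with the given per-core batch size; it is not part of this PR's wrapper.
    """
    while per_core_batch_size >= min_size:
        try:
            return run_forecast(per_core_batch_size)
        except torch.cuda.OutOfMemoryError:
            torch.cuda.empty_cache()  # release cached blocks before retrying
            per_core_batch_size //= 2
    raise RuntimeError("Forecast failed even at the minimum per-core batch size")
```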
With the current settings (per_core_batch_size=64, context_length=16_000), the median inference time on the full fev-bench for TimesFM-2.5 is 16.9s (down from 117s), and the win rate / skill score numbers remain unchanged.
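For anyone reproducing the timing, a median like this can be collected with a simple per-task timer. A minimal sketch (`tasks` and `predict_fn` are hypothetical placeholders for the fev-bench task list and the wrapper's predict call, not names from this repo):

```python
import time

import numpy as np

def predict_fn(task):
    """Hypothetical stand-in for running the TimesFM-2.5 wrapper on one task."""
    time.sleep(0.01)  # placeholder for the actual model call

tasks = range(10)  # placeholder for the fev-bench task list

times = []
for task in tasks:
    start = time.perf_counter()
    predict_fn(task)
    times.append(time.perf_counter() - start)

print(f"Median per-task inference time: {np.median(times):.2f}s")
```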
Issue #, if available:
Description of changes:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.