@shchur shchur commented Jan 13, 2026

Issue #, if available:

Problem: Calling task.evaluation_summary() creates a new EvaluationWindow object for each window. Each window then calls _prepare_dataset_dict(), which loads and splits all data from scratch. This is a bigger problem now that all datasets are stored in memory.

Solution:

  • Remove _dataset_dict caching from EvaluationWindow
  • Add return_past and return_future flags to past_future_split() to skip unnecessary slicing
  • get_ground_truth() now processes only the [id, timestamp, target] columns with return_past=False, so past data is never sliced
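
The idea behind the flags can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: the column-dict data layout, the `cutoff` and `horizon` parameters, and the function bodies are assumptions made for illustration. Only the names past_future_split, get_ground_truth, return_past, and return_future come from the description above.

```python
from typing import Optional

# Hypothetical data layout: column name -> list of values for one series.
Columns = dict[str, list]


def past_future_split(
    data: Columns,
    cutoff: int,
    horizon: int,
    return_past: bool = True,
    return_future: bool = True,
) -> tuple[Optional[Columns], Optional[Columns]]:
    """Split every column at `cutoff`, skipping any part that was not requested."""
    past = None
    future = None
    if return_past:
        # Only slice the history when the caller actually needs it.
        past = {name: values[:cutoff] for name, values in data.items()}
    if return_future:
        future = {
            name: values[cutoff : cutoff + horizon] for name, values in data.items()
        }
    return past, future


def get_ground_truth(data: Columns, cutoff: int, horizon: int) -> Columns:
    """Return the future part of the id, timestamp, and target columns only."""
    # Drop every other column before splitting so that it is never sliced.
    needed = {name: data[name] for name in ("id", "timestamp", "target")}
    _, future = past_future_split(needed, cutoff, horizon, return_past=False)
    return future


example = {
    "id": ["a"] * 5,
    "timestamp": [1, 2, 3, 4, 5],
    "target": [10.0, 11.0, 12.0, 13.0, 14.0],
    "covariate": [0, 1, 0, 1, 0],
}
ground_truth = get_ground_truth(example, cutoff=3, horizon=2)
```

In this sketch, get_ground_truth never touches the covariate column or the first three rows, which is the kind of work the PR aims to avoid when the split is recomputed for every window.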

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@shchur shchur requested a review from abdulfatir January 13, 2026 14:12
@shchur shchur merged commit 19fde89 into main Jan 14, 2026
5 checks passed
@shchur shchur deleted the no-split-caching branch January 14, 2026 15:21