Conversation

@shchur (Contributor) commented Jan 8, 2026

Issue #, if available:

Currently, operations based on Dataset.filter and Dataset.map are quite slow: just running the following code takes ~20 minutes and generates 10+ GB of intermediate files in ~/.cache/huggingface/datasets.

import fev

bench = fev.Benchmark.from_yaml(
    "https://raw.githubusercontent.com/autogluon/fev/refs/heads/main/benchmarks/fev_bench/tasks.yaml"
)
for task in bench.tasks:
    for window in task.iter_windows():
        window.get_input_data()

These are not the only bottlenecks; there are also slow map-based operations in the metrics, which I will address in a separate PR.

Description of changes:

  • Perform length-based filtering and past/future splits entirely in memory using pyarrow operations, without writing any intermediate results to disk. This yields a large speedup: iterating over all windows in fev-bench now takes ~4 minutes (down from 20+).
  • The main logic is inspired by the efficient slicing algorithm from TimeSeriesDataFrame in AutoGluon, which essentially performs df.groupby("item_id").nth(slice(start, end)) on flat numpy arrays (see the sketch after this list).
  • I validated that the values in the datasets are identical (np.allclose) by sampling 1/7th of all evaluation windows in fev-bench and comparing the values between the main and PR branches.
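
For illustration, here is a minimal sketch of that slicing idea (a hypothetical helper, not the PR's actual code): with per-item offsets into a flat values array, the groupby/nth operation reduces to pure index arithmetic.

import numpy as np

def slice_per_item(values, offsets, start, end):
    """Take rows [start:end) of every item from a flat values array.

    `offsets` has length n_items + 1; item i occupies values[offsets[i]:offsets[i+1]].
    Mimics df.groupby("item_id").nth(slice(start, end)) without pandas.
    """
    lengths = np.diff(offsets)
    # Clip the requested slice to each item's actual length
    new_lengths = np.minimum(end, lengths) - np.minimum(start, lengths)
    starts = offsets[:-1] + np.minimum(start, lengths)
    # Position of each item's block in the output array
    out_starts = np.concatenate([[0], np.cumsum(new_lengths)[:-1]])
    # One fancy-index array selecting the kept rows of all items at once
    idx = np.repeat(starts - out_starts, new_lengths) + np.arange(new_lengths.sum())
    return values[idx], np.concatenate([[0], np.cumsum(new_lengths)])

# Two items of lengths 3 and 2; keep the first two rows of each
values = np.array([10, 11, 12, 20, 21])
offsets = np.array([0, 3, 5])
print(slice_per_item(values, offsets, 0, 2))  # (array([10, 11, 20, 21]), array([0, 2, 4]))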

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@shchur requested a review from abdulfatir on January 8, 2026 at 17:29
@shchur force-pushed the fast-slicing-and-filtering branch from 2e7ae75 to 854fe10 on January 8, 2026 at 17:30
"""
# Flatten indices if dataset has been sorted/filtered, so row order in dataset
# matches the physical order in the underlying Arrow table
if getattr(dataset, "_indices", None) is not None:
Collaborator:
Has this property been fairly standard for a while?
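
For context, a minimal sketch of the behavior this snippet relies on (toy data, not from the PR): after filter/sort, a datasets.Dataset keeps an _indices mapping instead of rewriting the underlying Arrow table, and flatten_indices() materializes the rows back into physical order.

from datasets import Dataset

ds = Dataset.from_dict({"item_id": [2, 0, 1], "value": [20.0, 0.0, 10.0]})
ds = ds.sort("item_id")    # sets ds._indices; the underlying Arrow table is untouched
assert ds._indices is not None
ds = ds.flatten_indices()  # rewrites the rows so logical and physical order agree
assert ds._indices is None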

# pa/pc/np refer to pyarrow, pyarrow.compute, and numpy in the surrounding module
cutoff_scalar = pc.cast(pa.scalar(cutoff), timestamps_flat.type)
mask = pc.less_equal(timestamps_flat, cutoff_scalar)
# Prefix sums over the boolean mask count the matching rows before each offset,
# so the per-item counts reduce to a single vectorized difference
cumsum = np.concatenate([[0], np.cumsum(mask.to_numpy(zero_copy_only=False))])
return cumsum[offsets[1:]] - cumsum[offsets[:-1]]
Collaborator:
Just thinking out loud: will this also work when all timestamps are less than the cutoff?
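
For what it's worth, a quick toy check (hypothetical example, not from the PR) suggests it does: with every timestamp <= cutoff the mask is all True, and the prefix-sum difference degenerates to each item's full length; the opposite extreme (no timestamps <= cutoff) yields zeros.

from datetime import datetime

import numpy as np
import pyarrow as pa
import pyarrow.compute as pc

timestamps_flat = pa.array(
    np.array(["2020-01-01", "2020-01-02", "2020-01-01"], dtype="datetime64[ns]")
)
offsets = np.array([0, 2, 3])   # two items: rows [0, 2) and [2, 3)
cutoff = datetime(2021, 1, 1)   # later than every timestamp

cutoff_scalar = pc.cast(pa.scalar(cutoff), timestamps_flat.type)
mask = pc.less_equal(timestamps_flat, cutoff_scalar)
cumsum = np.concatenate([[0], np.cumsum(mask.to_numpy(zero_copy_only=False))])
print(cumsum[offsets[1:]] - cumsum[offsets[:-1]])  # [2 1] -> each item's full length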

@shchur merged commit 546f5cd into main on Jan 12, 2026
5 checks passed
@shchur deleted the fast-slicing-and-filtering branch on January 12, 2026 at 13:52