Improve block weighting with uniform and hat functions #147
markus583 merged 1 commit into segment-any-text:main from
Conversation
Hi! Thanks a lot for implementing this. Interesting idea, cool stuff! It intuitively makes sense, but I'm unsure whether it makes a practical difference. It would be interesting to test it on some benchmarks. For the time being, I'd be happy to add it as a feature and leave the default to `uniform`.
LGTM! I tried this idea a while ago (when I was working on the original WtP) and didn't see improvements on benchmarks, but maybe it helps on other model/benchmark combinations. I agree that it intuitively makes total sense. So let's add it and leave the default to `uniform` as you suggested, @markus583.
Thanks for the reviews @markus583 @bminixhofer! For your information, I created an inference-only version of wtpsplit called wtpsplit-lite, with minimal dependencies, to make it easier to integrate SaT into projects that only need inference. Thanks for your work!
Cool, thanks for letting us know, and keep up the good work! :)
This PR makes the current uniform weighting scheme explicit and adds an improved hat weighting scheme.
The rationale behind hat weighting is that predictions for tokens near the beginning or end of the block will be less accurate than predictions for tokens near the middle of the block, where the model has maximal context.
For instance, let's say we use `stride=128` and `block_size=256`, and compare the predictions for the token with index 128:

- uniform weighting computes `0.5 * first_block[128] + 0.5 * second_block[0]`;
- hat weighting computes (up to normalization) `1 * first_block[128] + 1/256 * second_block[0]`.

In this example, hat weighting is preferable because the first token of the second block is likely to be much less accurate than the middle token of the first block.
Anecdotally, I've also observed that hat weighting improves output quality on test data.
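To make the idea concrete, here is a minimal sketch of hat weighting for overlapping blocks. The function names (`hat_weights`, `combine_blocks`) and the exact weight formula are hypothetical illustrations of the idea, not the actual wtpsplit implementation; the PR's formula may differ in detail (e.g. the exact edge weight).

```python
import numpy as np


def hat_weights(block_size: int) -> np.ndarray:
    """Triangular "hat" weights: peaking at the block center, small at the edges.

    A floor of 1/block_size keeps edge weights nonzero so every token
    always receives some weight. Hypothetical sketch; the exact formula
    in the PR may differ.
    """
    center = (block_size - 1) / 2
    w = 1.0 - np.abs(np.arange(block_size) - center) / center
    return np.maximum(w, 1.0 / block_size)


def combine_blocks(logits_blocks, offsets, total_len, weights):
    """Weighted average of overlapping per-block predictions.

    Each block's logits are multiplied by the per-position weights,
    accumulated at the block's offset, and normalized by the total
    weight each token received.
    """
    acc = np.zeros(total_len)
    norm = np.zeros(total_len)
    for logits, off in zip(logits_blocks, offsets):
        acc[off:off + len(logits)] += weights[:len(logits)] * logits
        norm[off:off + len(logits)] += weights[:len(logits)]
    return acc / np.maximum(norm, 1e-12)
```

With uniform weights (`np.ones(block_size)`), `combine_blocks` reduces to the current averaging behavior; with `hat_weights`, tokens predicted near a block's center dominate tokens predicted near another block's edge, as in the `first_block[128]` vs. `second_block[0]` example above.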