fix: Add support for compressed files for tail package #5363

Merged
kalleep merged 15 commits into main from kalleep/loki-source-file-compression
Jan 28, 2026

Conversation

@kalleep kalleep commented Jan 27, 2026

Brief description of Pull Request

When decompression is configured for loki.source.file, we now use the same code internally as we do when it is not configured. This aligns the features between the two cases, for example BOM detection.

Pull Request Details

Before this PR we had two different implementations for tailing a file: tailer and decompressor. The latter was used any time decompression was configured.

Ever since my major refactors to the tail package I was certain that we could add support for compressed files and reuse the same implementation.

That is what I have done here, so most of the code is now shared. When compression is configured for tail.File, it will not wait if EOF is returned. Instead it checks whether there is any remaining data to flush and then returns EOF, which stops the tailer.
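
As an illustration of the EOF behaviour described above, here is a minimal sketch that is not taken from the PR: it reads a gzip stream line by line and treats EOF as terminal, emitting whatever is left before stopping. The names `readCompressedLines` and `gzipString` are hypothetical helpers for this example and are not part of the tail package.

```go
package main

import (
	"bufio"
	"bytes"
	"compress/gzip"
	"errors"
	"fmt"
	"io"
	"strings"
)

// gzipString is a helper for this example only: it returns s compressed with gzip.
func gzipString(s string) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write([]byte(s)) // writes to a bytes.Buffer do not fail
	zw.Close()
	return buf.Bytes()
}

// readCompressedLines (hypothetical) decompresses data and returns its lines.
// Unlike tailing a live file, EOF is not a reason to wait for more data:
// a compressed file is assumed to be complete, so EOF ends the read.
func readCompressedLines(data []byte) ([]string, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer zr.Close()

	br := bufio.NewReader(zr)
	var lines []string
	for {
		line, err := br.ReadString('\n')
		if errors.Is(err, io.EOF) {
			// Flush any remaining data that has no trailing newline, then stop.
			if line != "" {
				lines = append(lines, line)
			}
			return lines, nil
		}
		if err != nil {
			return lines, err
		}
		lines = append(lines, strings.TrimSuffix(line, "\n"))
	}
}

func main() {
	lines, err := readCompressedLines(gzipString("first\nsecond\nno trailing newline"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", lines)
}
```

A truncated archive surfaces as a non-EOF error from the gzip reader in this sketch, so it is reported rather than silently treated as the end of the file.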

One issue we have is that the previous implementation tracked position by line number, whereas this implementation tracks the offset into the uncompressed data. I'm not sure how we can handle that.
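
To make the offset-based position concrete, here is a rough sketch of resuming a gzip stream from an offset measured in uncompressed bytes. It is an assumption about how such a resume could work, not the code in this PR; `readFromOffset` and `gzipString` are hypothetical names. Because a gzip stream cannot seek, the sketch decompresses from the start and discards `offset` bytes.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// gzipString is a helper for this example only: it returns s compressed with gzip.
func gzipString(s string) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write([]byte(s)) // writes to a bytes.Buffer do not fail
	zw.Close()
	return buf.Bytes()
}

// readFromOffset (hypothetical) returns the uncompressed content that follows
// the first offset bytes of uncompressed data.
func readFromOffset(data []byte, offset int64) (string, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return "", err
	}
	defer zr.Close()

	// Skip the already-consumed part by decompressing and discarding it.
	// io.CopyN returns io.EOF if the stream is shorter than offset.
	if _, err := io.CopyN(io.Discard, zr, offset); err != nil {
		return "", err
	}

	rest, err := io.ReadAll(zr)
	return string(rest), err
}

func main() {
	data := gzipString("line1\nline2\nline3\n")

	// "line1\n" is 6 bytes, so offset 6 resumes at the second line.
	rest, err := readFromOffset(data, 6)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(rest)
}
```

Under this model, a stored value that was really a line count (say 2) would, if interpreted as a byte offset, point near the start of the data and cause most lines to be read again.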

In addition to decompression support, I fixed an issue that I noticed by adding flush to the reader. When we hit EOF we drain all remaining data, but if that data did not include a newline it was never consumed.
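
The flush fix can be pictured with a small line splitter that is fed chunks of data. This is only a sketch under assumed names (`lineBuffer`, `push`, `flush` are hypothetical and not the reader in the tail package): complete lines are emitted as they arrive, and `flush` hands back the trailing data that never received a newline.

```go
package main

import (
	"bytes"
	"fmt"
)

// lineBuffer is a hypothetical helper that splits incoming chunks into lines.
type lineBuffer struct {
	pending []byte // data received so far that is not yet terminated by '\n'
}

// push appends chunk and returns every complete line now available.
func (b *lineBuffer) push(chunk []byte) []string {
	b.pending = append(b.pending, chunk...)
	var lines []string
	for {
		i := bytes.IndexByte(b.pending, '\n')
		if i < 0 {
			return lines
		}
		lines = append(lines, string(b.pending[:i]))
		b.pending = b.pending[i+1:]
	}
}

// flush returns the remaining data as a final line and clears the buffer.
// The second return value is false when there is nothing left to flush.
func (b *lineBuffer) flush() (string, bool) {
	if len(b.pending) == 0 {
		return "", false
	}
	line := string(b.pending)
	b.pending = nil
	return line, true
}

func main() {
	var b lineBuffer
	fmt.Printf("%q\n", b.push([]byte("one\ntw")))
	fmt.Printf("%q\n", b.push([]byte("o\nthree")))

	// At EOF, without flush the trailing "three" would never be emitted.
	if last, ok := b.flush(); ok {
		fmt.Printf("%q\n", last)
	}
}
```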

Issue(s) fixed by this Pull Request

Notes to the Reviewer

If we consume a compressed file, we exit and remove the stored position. While Alloy is running this is fine, but if Alloy is restarted we will ingest the files again. This is true of both this implementation and the previous one, and we should fix it in another PR.

PR Checklist

  • [ ] Documentation added
  • [x] Tests updated
  • [ ] Config converters updated

@kalleep kalleep requested a review from a team as a code owner January 27, 2026 13:07
@kalleep kalleep changed the title refactor: reuse tailer for compressed files refactor: Reuse tailer for compressed files Jan 27, 2026

@ptodev ptodev left a comment

LGTM. I only added a few minor comments.

> One issue we have is that the previous implementation tracked position by line numbers but this implementation will track offset on uncompressed data

Is this going to be a problem for users who have a partially consumed archive? Is there a chance of logs being lost during Alloy upgrades?

> if we had data that did not include a newline it would never be consumed.

It'd be good to note this with a fix: in the changelog.

@ptodev ptodev self-assigned this Jan 27, 2026
kalleep and others added 3 commits January 27, 2026 15:29
Co-authored-by: Paulin Todev <paulin.todev@gmail.com>
Co-authored-by: Paulin Todev <paulin.todev@gmail.com>

kalleep commented Jan 27, 2026

> Is this going to be a problem for users who have a partially consumed archive? Is there a chance of logs being lost during Alloy upgrades?

It would be the other way around: we would most likely re-read lines that have already been consumed.


kalleep commented Jan 27, 2026

I want to add integration tests for compression usage. If it's okay, I will leave that to a follow-up.


kalleep commented Jan 27, 2026

> It'd be good to note this with a fix: in the changelog.

Yeah this is probably good because now we get the same support for BOM detection when reading compressed data.
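
For readers unfamiliar with BOM detection, a minimal sketch of the idea follows. It only handles the UTF-8 BOM and is an assumption about the general technique, not the detection code in the tail package; `skipUTF8BOM` and `readTextWithoutBOM` are hypothetical names. Because it works on any buffered reader, the same check applies whether the bytes come from a plain file or from a decompressed stream.

```go
package main

import (
	"bufio"
	"bytes"
	"errors"
	"fmt"
	"io"
)

// utf8BOM is the byte order mark that some tools write at the start of UTF-8 files.
var utf8BOM = []byte{0xEF, 0xBB, 0xBF}

// skipUTF8BOM (hypothetical) consumes a leading UTF-8 BOM if one is present
// and reports whether it found one.
func skipUTF8BOM(br *bufio.Reader) (bool, error) {
	head, err := br.Peek(len(utf8BOM))
	// Peek returns io.EOF together with a short slice when the input is
	// shorter than the BOM; that simply means there is no BOM.
	if err != nil && !errors.Is(err, io.EOF) {
		return false, err
	}
	if bytes.Equal(head, utf8BOM) {
		_, err := br.Discard(len(utf8BOM))
		return true, err
	}
	return false, nil
}

// readTextWithoutBOM (hypothetical) returns the content of data with any
// leading UTF-8 BOM removed, and whether a BOM was found.
func readTextWithoutBOM(data []byte) (string, bool, error) {
	br := bufio.NewReader(bytes.NewReader(data))
	found, err := skipUTF8BOM(br)
	if err != nil {
		return "", false, err
	}
	rest, err := io.ReadAll(br)
	return string(rest), found, err
}

func main() {
	text, found, err := readTextWithoutBOM([]byte("\xEF\xBB\xBFhello\n"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("bom=%v text=%q\n", found, text)
}
```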

@kalleep kalleep requested a review from ptodev January 28, 2026 13:05
@kalleep kalleep changed the title refactor: Reuse tailer for compressed files fix: reuse code when compression is configured for loki.source.file Jan 28, 2026
@kalleep kalleep changed the title fix: reuse code when compression is configured for loki.source.file fix: Add support for compressed files with tail packge Jan 28, 2026
@kalleep kalleep changed the title fix: Add support for compressed files with tail packge fix: Add support for compressed files for tail package Jan 28, 2026
@kalleep kalleep merged commit 2347c1b into main Jan 28, 2026
50 of 51 checks passed
@kalleep kalleep deleted the kalleep/loki-source-file-compression branch January 28, 2026 14:01
@grafana-alloybot grafana-alloybot bot mentioned this pull request Jan 28, 2026
@kalleep kalleep added the backport/v1.13 Backport to release/v1.13 label Feb 2, 2026
grafana-alloybot bot pushed a commit that referenced this pull request Feb 2, 2026
### Brief description of Pull Request
When `decompression` is configured for `loki.source.file` we now use the
same code internally as we do when it's not configured. This aligns the
features between them, for example BOM detection.

(cherry picked from commit 2347c1b)
kalleep added a commit that referenced this pull request Feb 2, 2026
)

## Backport of #5363

This PR backports #5363 to release/v1.13.

### Original PR Author
@kalleep

### PR Checklist

- [ ] Documentation added
- [x] Tests updated
- [ ] Config converters updated


---
*This backport was created automatically.*

Co-authored-by: Karl Persson <23356117+kalleep@users.noreply.github.com>
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 17, 2026

Labels

backport/v1.13 Backport to release/v1.13 frozen-due-to-age

2 participants