Skip to content

feat: loki.source.file file_match block#4743

Merged
kalleep merged 35 commits intomainfrom
kalleep/loki-source-file-match
Nov 6, 2025
Merged

feat: loki.source.file file_match block#4743
kalleep merged 35 commits intomainfrom
kalleep/loki-source-file-match

Conversation

@kalleep
Copy link
Contributor

@kalleep kalleep commented Oct 31, 2025

PR Description

This PR adds a new file_match block to loki.source.file that enables built-in file discovery using glob patterns. This allows users to discover files directly within loki.source.file without needing a separate local.file_match component.

If file_match block is not enabled component behaves like it did before.

If this block enabled local.file_match is no longer needed and we avoid the overhead of waiting for alloy graph updates before we can schedule new files to tail.

Which issue(s) this PR fixes

Notes to the Reviewer

I added "resolver" interface that will change depending on configuration. The static one behaves exactly like before, requiring absolute paths and the glob implementation will perform globing with doublestar.

The watcher will do nothing when file_match is not configured.

I also changed to use nop logger in test, logging in tests adds additional noise that makes it hard to see test output. Maybe we should be able to control logging for test with an env variable instead?

PR Checklist

  • CHANGELOG.md updated
  • Documentation added
  • Tests updated
  • Config converters updated

@kalleep kalleep requested review from a team and clayton-cornell as code owners October 31, 2025 09:57
@github-actions
Copy link
Contributor

💻 Deploy preview available (Kalleep/loki source file match):

@kalleep kalleep changed the title Kalleep/loki source file match faet: loki.source.file file_match block Oct 31, 2025
@kalleep kalleep requested review from a team, clayton-cornell and Copilot and removed request for a team and clayton-cornell October 31, 2025 10:03
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces built-in file discovery capabilities to loki.source.file component through a new file_match block. The feature allows glob pattern-based file discovery directly within the component as an alternative to using the separate local.file_match component.

Key changes:

  • Added file_match configuration block with glob pattern support using the doublestar library
  • Implemented a resolver abstraction (static vs. glob) to handle different path resolution strategies
  • Added a watcher mechanism to periodically sync filesystem state when file matching is enabled

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
internal/component/loki/source/file/watcher.go New watcher component for periodic filesystem synchronization
internal/component/loki/source/file/resolver.go New resolver abstraction with static and glob implementations
internal/component/loki/source/file/resolver_test.go Unit tests for resolver implementations
internal/component/loki/source/file/file.go Core component changes to integrate file matching and resolver system
internal/component/loki/source/file/file_test.go Test infrastructure updates to validate both file match modes
internal/component/loki/source/file/*_test.go Updated test utilities to use new logging API
docs/sources/reference/components/loki/loki.source.file.md Documentation for new file_match block with examples
docs/sources/reference/components/local/local.file_match.md Updated to recommend built-in file_match for simple cases
CHANGELOG.md Added feature entry

@github-actions
Copy link
Contributor

github-actions bot commented Oct 31, 2025

💻 Deploy preview available (faet: loki.source.file file_match block):

@kalleep kalleep requested a review from Copilot October 31, 2025 10:32
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

`local.file_match` discovers files on the local filesystem using glob patterns and the [doublestar][] library.

{{< admonition type="note" >}}
For simple use cases where you only need to discover files for a single `loki.source.file` component, consider using the built-in [`file_match`](../../loki/loki.source.file/#file_match) block in `loki.source.file` instead.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's also better option for very large numbers of files that are frequently changing?

Perhaps we should recommend to always use the built-in block and only use local.file_match for things that are not possible with the built-in block. That way we ensure people get maximum correctness & performance.

Should we try create a list of use-cases that are not possible with built-in block?

Copy link
Contributor Author

@kalleep kalleep Oct 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes there is examples with file globing using either builtin or local.file_match. There I added this header
### File globbing with file_match (recommended)

Do you think we should make it clearer here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we need to make it more clear when to use which option. And probably the logic should be to use the built-in file_match whenever you can and a separate component for cases that are not possible with built-in (we need to list a few of them)

- Better performance: File discovery is integrated directly into `loki.source.file`, eliminating the overhead of component-to-component communication
- Same functionality: Supports the same glob patterns and features, including `__path_exclude__` for excluding files

When `enabled` is set to `true`, you can use glob patterns (e.g., `/tmp/*.log` or `/var/log/**/*.log`) directly in the `targets` argument's `__path__` label.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when enabled is false?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we require absolute file paths, like this component behaved before this pr

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we write that in docs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed this in 61c3e8c

I listed the use cases I could think of for local.file_match

Comment on lines +114 to +115
The `file_match` block enables built-in file discovery directly within `loki.source.file`.
When enabled, the component discovers files on the local filesystem using glob patterns and the [doublestar][] library.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when someone provides file_match and also sends targets in using the previous pattern?

I think that should ideally be an error, unless there's a simple way to make both work.

Copy link
Contributor Author

@kalleep kalleep Oct 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will work but you get some overhead because now you are globing and syncing twice. Because all paths will match directly to one file and that is fine to pass to Glob

@kalleep kalleep changed the title faet: loki.source.file file_match block feat: loki.source.file file_match block Oct 31, 2025
@github-actions
Copy link
Contributor

github-actions bot commented Oct 31, 2025

💻 Deploy preview deleted (feat: loki.source.file file_match block).

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 36 out of 36 changed files in this pull request and generated 3 comments.

|---------------------|---------------------|-------------------------------------------------------------|---------|----------|
| `enabled` | `bool` | Enable file discovery using glob patterns. | `false` | no |
| `ignore_older_than` | `duration` | Ignores files with modification times before this duration. | `"0s"` | no |
| `sync_period` | `duration` | How often to sync filesystem and targets. | `"10s"` | no |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is that a good default? we have a chance to change it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the defaults are good, this is the same defaults that are using in promtail too.

@kalleep kalleep force-pushed the kalleep/loki-source-file-match branch from 0e6b45a to ba28b05 Compare November 3, 2025 07:53
@kalleep kalleep requested a review from thampiotr November 3, 2025 13:38
Copy link
Contributor

@thampiotr thampiotr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, excited for this to get merged! :)
Just a few comments so far.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a //go:build !race on top of this file.

  • Are we sure this test runs in CI? it's not obvious to me right now
  • Could we get rid of this and fix the race conditions, since we done some improvements recently?

c.resolver = newStaticResolver()
}

if newArgs.FileMatch.SyncPeriod != c.args.FileMatch.SyncPeriod && newArgs.FileMatch.SyncPeriod < 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if newArgs.FileMatch.SyncPeriod != c.args.FileMatch.SyncPeriod && newArgs.FileMatch.SyncPeriod < 0 {
if newArgs.FileMatch.SyncPeriod != c.args.FileMatch.SyncPeriod && newArgs.FileMatch.SyncPeriod > 0 {

Although I'm not sure if that can even happen? Because default is not 0, right?
maybe we should think what to do with c.watcher when we disable FileMatch separately.

Comment on lines +224 to +226
if !c.args.FileMatch.Enabled {
return
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if we disable FileMatch and then enable FileMatch, I think the goroutine will not be restarted... can we add a test for this and fix the issue?

We have two choices: always keep the goroutines and watcher ticker on and check for FileMatch enabled - a bit more expensive but a simpler code. Or we schedule them only when FileMatch is enabled - a bit more complex, but fewer resources used. I don't have a strong opinion, I would like this code to be easier to follow though...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! My intention was to keep it running regardless, so we should not return here.

And only because it's way more simple to do that then having to manage it in between Updates

`local.file_match` discovers files on the local filesystem using glob patterns and the [doublestar][] library.

{{< admonition type="note" >}}
Use the built-in [`file_match`](../../loki/loki.source.file/#file_match) block in `loki.source.file` whenever possible.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to reach out to k8s helm charts maintainers to change their defaults if they have them.

kalleep and others added 23 commits November 5, 2025 10:58
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Piotr <17101802+thampiotr@users.noreply.github.com>
Co-authored-by: Clayton Cornell <131809008+clayton-cornell@users.noreply.github.com>
Co-authored-by: Clayton Cornell <131809008+clayton-cornell@users.noreply.github.com>
Co-authored-by: Clayton Cornell <131809008+clayton-cornell@users.noreply.github.com>
Co-authored-by: Clayton Cornell <131809008+clayton-cornell@users.noreply.github.com>
Co-authored-by: Clayton Cornell <131809008+clayton-cornell@users.noreply.github.com>
@kalleep kalleep force-pushed the kalleep/loki-source-file-match branch from b56ee4b to 28814a9 Compare November 5, 2025 09:59
@kalleep kalleep merged commit 16e0e21 into main Nov 6, 2025
43 checks passed
@kalleep kalleep deleted the kalleep/loki-source-file-match branch November 6, 2025 09:26
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 7, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants