Skip to content

Reduce dependencies in Crawler#17653

Merged
urso merged 3 commits intoelastic:masterfrom
urso:crawler-use-runner-factory
Apr 15, 2020
Merged

Reduce dependencies in Crawler#17653
urso merged 3 commits intoelastic:masterfrom
urso:crawler-use-runner-factory

Conversation

@urso
Copy link
Copy Markdown

@urso urso commented Apr 9, 2020

  • Refactoring

What does this PR do?

The crawler creates active inputs for static configuration, starts
config file reloading, and starts the module loader.
With this change the crawler has no direct dependency (well, reduced) on
input.Input anymore, but will use the Runner interface, even for
statically configured inputs.
This also reduces dependencies, as most plumbing is already done by the
inputs.RunnerFactory and must not be duplicated by the crawler anymore.

The input.Runner used to compute a 'ID' by hashing the inputs
configuration. The ID was public, to be used by the crawler only.
Instead of having the input compute the ID, it is the crawler who will
compute the input ID now.

While reducing dependencies, I replaced logp.Info calls with an internal logp.Logger instance for logging purposes.

Note: the cfgfile RunnerList maintains its own set of IDs. The crawler
and RunnerList each used to use the ID to check for 'duplicate'
configurations, but because the IDs are not 'shred' duplication
detection is not across the Beat.
ID detection is actively used by input config file reloading and auto
discvovery only, in order to check if an input still needs to be
running, are shall shut down.

Why is it important?

Reduce dependencies and responsibilities in the crawler, while centering input configuration and running more around the RunnerFactory and Runner interfaces only. This changes makes it easier to integrate with alternative RunnerFactory implementations in the future.

Checklist

  • My code follows the style guidelines of this project
    - [ ] I have commented my code, particularly in hard-to-understand areas
    - [ ] I have made corresponding changes to the documentation
    - [ ] I have made corresponding change to the default configuration files
    - [ ] I have added tests that prove my fix is effective or that my feature works
    - [ ] I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

The crawler creates active inputs for static configuration, starts
config file reloading, and starts the module loader.
With this change the crawler has no direct dependency (well, reduced) on
input.Input anymore, but will use the `Runner` interface, even for
statically configured inputs.
This also reduces dependencies, as most plumbing is already done by the
inputs.RunnerFactory and must not be duplicated by the crawler anymore.

The input.Runner used to compute a 'ID' by hashing the inputs
configuration. The ID was public, to be used by the crawler only.
Instead of having the input compute the ID, it is the crawler who will
compute the input ID now.

Note: the cfgfile RunnerList maintains its own set of IDs. The crawler
and RunnerList each used to use the ID to check for 'duplicate'
configurations, but because the IDs are not 'shred' duplication
detection is not across the Beat.
ID detection is actively used by input config file reloading and auto
discvovery only, in order to check if an input still needs to be
running, are shall shut down.
@urso urso added review refactoring Filebeat Filebeat Team:Services (Deprecated) Label for the former Integrations-Services team labels Apr 9, 2020
@urso urso requested a review from kvch April 9, 2020 21:00
@elasticmachine
Copy link
Copy Markdown
Contributor

Pinging @elastic/integrations-services (Team:Services)

return fmt.Errorf("Error while initializing input: %+v", err)
}
if inputRunner, ok := runner.(*input.Runner); ok {
inputRunner.Once = c.once
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

input.New returns a partially initialized type. The once flag must only be set for static inputs, but the factory is used for static and dynamic inputs. As I expect the crawler to go away in the future I wanted keep the scope rather small for what I need for the integration of the new input API, I still keep this small piece of tech debt here for now.

@urso urso mentioned this pull request Apr 9, 2020
3 tasks
@urso urso added the needs_backport PR is waiting to be backported to other branches. label Apr 15, 2020
@urso urso merged commit 52fa265 into elastic:master Apr 15, 2020
@urso urso deleted the crawler-use-runner-factory branch April 15, 2020 13:26
@urso urso added v7.8.0 and removed needs_backport PR is waiting to be backported to other branches. labels Apr 15, 2020
urso pushed a commit to urso/beats that referenced this pull request Apr 15, 2020
* Reduce dependencies in Crawler

The crawler creates active inputs for static configuration, starts
config file reloading, and starts the module loader.
With this change the crawler has no direct dependency (well, reduced) on
input.Input anymore, but will use the `Runner` interface, even for
statically configured inputs.
This also reduces dependencies, as most plumbing is already done by the
inputs.RunnerFactory and must not be duplicated by the crawler anymore.

The input.Runner used to compute a 'ID' by hashing the inputs
configuration. The ID was public, to be used by the crawler only.
Instead of having the input compute the ID, it is the crawler who will
compute the input ID now.

Note: the cfgfile RunnerList maintains its own set of IDs. The crawler
and RunnerList each used to use the ID to check for 'duplicate'
configurations, but because the IDs are not 'shred' duplication
detection is not across the Beat.
ID detection is actively used by input config file reloading and auto
discvovery only, in order to check if an input still needs to be
running, are shall shut down.

* fix import formatting in crawler

* Update log message test looks for

(cherry picked from commit 52fa265)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Filebeat Filebeat refactoring review Team:Services (Deprecated) Label for the former Integrations-Services team v7.8.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants