Skip to content

Auditbeat: initial_scan action for new paths#7954

Merged
andrewkroh merged 1 commit intoelastic:masterfrom
cwurm:auditbeat_found_status
Aug 16, 2018
Merged

Auditbeat: initial_scan action for new paths#7954
andrewkroh merged 1 commit intoelastic:masterfrom
cwurm:auditbeat_found_status

Conversation

@cwurm
Copy link
Copy Markdown
Contributor

@cwurm cwurm commented Aug 13, 2018

Introduces a new found action that is used by the scanner on startup (when scan_at_start: true) to mark files found in configured paths that it has not seen before.

Addresses #7821

@cwurm
Copy link
Copy Markdown
Contributor Author

cwurm commented Aug 13, 2018

A bit more detail: The new logic follows Option 2 laid out in a comment in the original issue. Before the scanner is called, the Metricset checks which paths from the config are not contained in the db and are therefore assumed to be new. It passes this list to the scanner which allows it to correctly mark new dirs and files it sees as either found (when they are in a new path) or created (when they are in a previously known path).

Copy link
Copy Markdown
Member

@andrewkroh andrewkroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to take a deeper look at this today. Thanks for opening a PR!

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments are great. I just want to encourage you to use the godoc format (even for private methods) which is to start the sentence with the name of the method (e.g. findNewPaths finds the new paths...).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pointer, I fixed it here and will try to remember going forward.

Copy link
Copy Markdown
Contributor

@adriansr adriansr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one minor comment, otherwise LGTM.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the function's comment to reflect the meaning of this parameter and maybe give it a more descriptive name?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@adriansr
Copy link
Copy Markdown
Contributor

There's an issue with the CLA related to the e-mail address you used for this commits. I guess it can be fixed by associating this e-mail address with your Github account

@cwurm
Copy link
Copy Markdown
Contributor Author

cwurm commented Aug 14, 2018

There's an issue with the CLA related to the e-mail address you used for this commits. I guess it can be fixed by associating this e-mail address with your Github account

Thanks, fixed it.

Test failed in CI that didn't fail locally, I'm not sure why. Looking into it.

@cwurm
Copy link
Copy Markdown
Contributor Author

cwurm commented Aug 14, 2018

Test failed in CI that didn't fail locally, I'm not sure why. Looking into it.

Found it: When the test runs, there's a race about whether the changes are found by the scanner or by filesystem notification (both are started at the same time). The scanner usually wins, and the actions it sets is what the test was originally written for. The filesystem reports slightly different actions (e.g. created|updated instead of created|attributes_modified). I've changed the test to only test for the common actions (e.g. created).

It's a bit of a shame though that the reported actions are not deterministic, and maybe this is something that we can normalize in the future.

I've pushed a commit, let's hope the CI is good with it. It's a non-deterministic failure, so it's a bit tricky. I tested locally with -run 1000 to be sure.

Copy link
Copy Markdown
Member

@andrewkroh andrewkroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the solution you reached here. I left a few comments.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a double space between paths have. And it should have a period to end the sentence.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When using a map as a set data structure you can use map[string]struct{}. To me this is more clear because it's saying that you don't care about the value and that when you use map you will only be checking whether the key exists (rather than that the key exists and its value is true). Also it may save a few bytes.

m := map[string]struct{}{} // Init.

m[name] = struct{} // Add.

if _, exists := m[name]; exists {  // Contains.
  // Exists!
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh nice, I like that it takes 0 bytes - done.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like nice clear solution to the problem. 👍

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this flag indicates that the event was generated due to the initial scan? If so how about we call it initial_scan rather than found?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense - changed it.

@cwurm cwurm force-pushed the auditbeat_found_status branch from 9db5594 to 2cb655c Compare August 16, 2018 14:07
@cwurm cwurm changed the title Auditbeat: found action for new paths Auditbeat: initial_scan action for new paths Aug 16, 2018
@cwurm
Copy link
Copy Markdown
Contributor Author

cwurm commented Aug 16, 2018

Thanks @andrewkroh - all implemented. Travis is unhappy, but it doesn't look related.

Copy link
Copy Markdown
Member

@andrewkroh andrewkroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@cwurm cwurm force-pushed the auditbeat_found_status branch from 9a647ef to a1b5e8a Compare August 16, 2018 16:10
@andrewkroh
Copy link
Copy Markdown
Member

@cwurm A few notes on your commit message:

  • The first line should be a short summary.

  • Follow that by a blank line.

  • Then do your detailed description.

  • If you want the related issue to auto-close on merge (your choice) then use one of the keywords recognized by GH.

When `scan_at_start: true` the filesystem scanner will look through
all paths and record any new or modified files and directories it finds.
This commit introduces a new action `initial_scan` that the scanner uses
when it scans a path for the first time. Previously, the scanner would
report everything under a new path as `created`, making it impossible
to distinguish between files and directories that are truly new (e.g.
have been added during a restart of Auditbeat) and those that are merely
new because they are under a newly added path in the configuration.

Resolves elastic#7821
@cwurm cwurm force-pushed the auditbeat_found_status branch from a1b5e8a to de677c8 Compare August 16, 2018 16:56
@cwurm
Copy link
Copy Markdown
Contributor Author

cwurm commented Aug 16, 2018

Thanks @andrewkroh - I've changed it to follow the format and made it more descriptive. Let me know if anything is still missing.

@andrewkroh andrewkroh merged commit 95e53bc into elastic:master Aug 16, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants