Skip to content

Populate quesiton.type for dns events#88

Merged
jonathan-buttner merged 9 commits intoelastic:masterfrom
jonathan-buttner:parse-dns-fields
Oct 2, 2020
Merged

Populate quesiton.type for dns events#88
jonathan-buttner merged 9 commits intoelastic:masterfrom
jonathan-buttner:parse-dns-fields

Conversation

@jonathan-buttner
Copy link
Copy Markdown
Collaborator

@jonathan-buttner jonathan-buttner commented Sep 30, 2020

This PR adds a couple pipeline processors to parse the DNS network events message field. It uses the message field to populate dns.question.type and dns.question.name. I also switched the pipeline over to be a yml file instead of json so I could write a multiline script.

@ferullo @jrmolin once this goes in we'll need to coordinate on any changes to the DNS events message field. Also could you point me at the endgame sensor's question.type parsing code for number -> string representation. I just want to make sure I miss any.

I'm using this regex:

^DNS query is completed for the name .* type %{WORD:dns.question.Ext.type}

WORD is defined here: https://github.com/elastic/elasticsearch/blob/7.9/libs/grok/src/main/resources/patterns/grok-patterns#L14

Since it's possible the domain name could have invalid characters in the request I figured it's best to get everything and assume a comma ends the field. Let me know if you have better ideas though.

Reference:
#83

Working example

image

cc: @MikePaquette

@marshallmain
Copy link
Copy Markdown
Collaborator

Out of curiosity why do we need to populate these fields in the pipeline rather than have the endpoint populate the fields when it builds the rest of the event?

@jonathan-buttner
Copy link
Copy Markdown
Collaborator Author

Out of curiosity why do we need to populate these fields in the pipeline rather than have the endpoint populate the fields when it builds the rest of the event?

Yeah @jrmolin or @ferullo can probably answer better here. I believe certain fields are more difficult than others to parse on the endpoint side. I think most of the information was retrieved in the ETW events for the original sensor and they have to track multiple events to get the information. But since we're already putting the information in the message field, I'm not sure why we couldn't just populate the dns.question.type and dns.quesiton.name fields?

@ferullo
Copy link
Copy Markdown
Contributor

ferullo commented Sep 30, 2020

@jonathan-buttner if/when Endpoint begins to populate this data will the pipeline still overwrite with Endpoint sets or will this code be effectively disabled?

@jonathan-buttner
Copy link
Copy Markdown
Collaborator Author

@jonathan-buttner if/when Endpoint begins to populate this data will the pipeline still overwrite with Endpoint sets or will this code be effectively disabled?

The way this PR is right now, it'd overwrite what the endpoint puts in those fields. I can add a simple if to check if the fields are already present though and skip populating them.

@ferullo
Copy link
Copy Markdown
Contributor

ferullo commented Sep 30, 2020

I think that would be a good check to add. Marshall makes a good point about Endpoint doing this itself. The reality though is Endpoint won't get that done for 7.10 and functionally it doesn't matter if Endpoint or the pipeline does this until detection rules can be run in Endpoint. We should move this logic to Endpoint when that happens but if there's no reason to not merge this it seems like it'll give detection rules the data they need in the meantime.

Does that sound reasonable?

@marshallmain
Copy link
Copy Markdown
Collaborator

Once we add and ship this pipeline we can't really remove it from the package unless we can ensure that all endpoints that use it have been upgraded to populate the fields on the endpoint side, or we'd have to be comfortable breaking compatibility at some point between new package versions and old endpoints. So the risk is that long term we build up a bunch of these fixes applying subtly different modifications and end up with a complex pipeline if we're not careful about what we use it for.

I think the if check would be a good addition, and since we're only adding fields that don't exist in the document then this is a fine approach to make the dns fields available in 7.10. If we can stick to only adding fields that don't already exist with the pipeline and ensure that we don't modify existing fields then the pipeline shouldn't get too complex even if we add more in the future.

@jonathan-buttner
Copy link
Copy Markdown
Collaborator Author

Once we add and ship this pipeline we can't really remove it from the package unless we can ensure that all endpoints that use it have been upgraded to populate the fields on the endpoint side, or we'd have to be comfortable breaking compatibility at some point between new package versions and old endpoints. So the risk is that long term we build up a bunch of these fixes applying subtly different modifications and end up with a complex pipeline if we're not careful about what we use it for.

I think the if check would be a good addition, and since we're only adding fields that don't exist in the document then this is a fine approach to make the dns fields available in 7.10. If we can stick to only adding fields that don't already exist with the pipeline and ensure that we don't modify existing fields then the pipeline shouldn't get too complex even if we add more in the future.

Yeah that's a good point. I agree, hopefully we can stick to adding fields that don't exist in the pipelines and avoid modifying existing fields. If we end up having to make breaking changes we should wait until an 8.x release for that.

I'll add the check to see if the field already exists.

- "^DNS query is completed for the name %{DOMAIN_NAME:dns.question.name}, type %{WORD:dns.question.Ext.type}"
- script:
ignore_failure: true
if: "ctx.network.protocol == 'dns' && ctx.dns?.question?.Ext?.type != null && ctx.dns?.question?.type == null"
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If above grok doesn't run then this processor won't run either because it needs the dns.quesiton.Ext.type to be present.

- location
ignore_missing: true
- grok:
if: "ctx.network.protocol == 'dns' && ctx.dns?.question?.name == null"
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if dns.question.type exists but the name doesn't, name would still be filled in using grok here.

@jonathan-buttner
Copy link
Copy Markdown
Collaborator Author

I must have completely overlooked this but the endpoint is already populating dns.question.name 😅 . I'll switch this PR to just populate dns.question.type.

Doesn't look like linux or mac send dns events, but I assume once they do they can just populate the dns.question.name as well.

@jonathan-buttner jonathan-buttner changed the title Populate quesiton.type and question.name for dns events Populate quesiton.type for dns events Oct 1, 2020
@jonathan-buttner jonathan-buttner merged commit 81ed7d9 into elastic:master Oct 2, 2020
@jonathan-buttner jonathan-buttner deleted the parse-dns-fields branch October 2, 2020 15:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants