Skip to content

[AWSX] fix(logs fowarder): Skip header line on VPC flow logs parsing#1044

Merged
ge0Aja merged 2 commits intomasterfrom
georgi/vpc-flow-logs-skip
Jan 14, 2026
Merged

[AWSX] fix(logs fowarder): Skip header line on VPC flow logs parsing#1044
ge0Aja merged 2 commits intomasterfrom
georgi/vpc-flow-logs-skip

Conversation

@ge0Aja
Copy link
Contributor

@ge0Aja ge0Aja commented Jan 14, 2026

What does this PR do?

  • Skip header line on AWS VPC flow logs parsing

Motivation

See #1043

Testing Guidelines

Additional Notes

Types of changes

  • Bug fix
  • New feature
  • Breaking change
  • Misc (docs, refactoring, dependency upgrade, etc.)

Check all that apply

  • This PR's description is comprehensive
  • This PR contains breaking changes that are documented in the description
  • This PR introduces new APIs or parameters that are documented and unlikely to change in the foreseeable future
  • This PR impacts documentation, and it has been updated (or a ticket has been logged)
  • This PR's changes are covered by the automated tests
  • This PR collects user input/sensitive content into Datadog
  • This PR passes the integration tests (ask a Datadog member to run the tests)
  • This PR passes the unit tests
  • This PR passes the installation tests (ask a Datadog member to run the tests)

@ge0Aja ge0Aja requested a review from a team as a code owner January 14, 2026 09:44
@github-actions github-actions bot added the aws label Jan 14, 2026
@ViBiOh ViBiOh self-assigned this Jan 14, 2026


def is_vpc_flowlog(key):
return "vpcflowlogs" in key
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💭 thought: ‏ I'm not fan of this as it puts back some source identification in the forwarder and also some business logic.

While this check is the current method for S3-based VPC Flow Log, we also support CloudWatch logs for VPC Flow Logs (they seems to not face the issue). Is there we can implement this on backend side so that if we update the source identification we don't have two places to maintain it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is doable from the backend, but requires a change in the logic as we won't know which line comes first on the intake side. For that we need to perform a keyword check maybe ? For the file key, it's already included in the log payload so we could use that instead of doing the check here.

The only downside is that we're going to still send the log and filter it out in the backend.

Regarding the source identification I agree that this brings back some business logic, but it doesn't add any extra fields to the log itself. We're still going to keep some business logic eventually (same goes for cloudtrail), but I think it should be fine if we don't use it amending payload to the log itself.

I'm fine with either choices by the way, this one seemed more logical as we trim sending the log at the source and not filter it out on the back-end side.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You got me at the "we don't know the order on backend side" so Ok for this.

@ge0Aja ge0Aja merged commit 59a387b into master Jan 14, 2026
11 checks passed
@ge0Aja ge0Aja deleted the georgi/vpc-flow-logs-skip branch January 14, 2026 12:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants