[cloudwatch-input] dataQuery generation always returns the same dimension for a metric #10122

@maxmoehl

Description

Relevant telegraf.conf

# not specific to this config, since this is a bug in the code
[[inputs.cloudwatch]]
    region = "eu-central-1"
    access_key = "<redacted>"
    secret_key = "<redacted>"
    period = "1m"
    delay = "30m"
    interval = "5m"
    namespaces = ["AWS/NATGateway"]
    statistic_include = [ "sum" ]
    [[inputs.cloudwatch.metrics]]
        names = ["BytesInFromDestination"]
        [[inputs.cloudwatch.metrics.dimensions]]
            name = "NatGatewayId"
            value = "*"

[[outputs.file]]
      files = ["stdout"]
      data_format = "json"

System info

macOS 12.0.1; go version go1.17.2 darwin/amd64; telegraf 1.20.4 (and master)

Docker

No response

Steps to reproduce

  1. Deploy multiple NAT Gateways on AWS and use the above config
  2. Run telegraf --config telegraf.conf --input-filter cloudwatch

Expected behavior

I can see a separate value of the BytesInFromDestination metric for each NAT gateway

Actual behavior

The value of the BytesInFromDestination metric is the same for all NAT gateways

Additional info

I did some digging and found that this bug is in the getDataQueries function of the CloudWatch struct. It iterates over all filteredMetrics and takes the address of the loop variable metric to store it in the dataQueries map. However, since Go re-uses the same variable for every iteration of the loop, the pointer that is taken always points to the exact same memory location. As a result, once the for loop is done, the Metric field of every query contains the same pointer (and therefore the value of the last metric that was iterated).

The fix is easy and I will provide it as soon as I am done with this issue: the metric struct needs to be copied once per iteration to allocate new memory; after that, the address can be taken.

This is one of the places where this address-taking is done (line 480):

dataQueries[*metric.Namespace] = append(dataQueries[*metric.Namespace], types.MetricDataQuery{
	Id:    aws.String("average_" + id),
	Label: aws.String(snakeCase(*metric.MetricName + "_average")),
	MetricStat: &types.MetricStat{
		Metric: &metric,
		Period: aws.Int32(int32(time.Duration(c.Period).Seconds())),
		Stat:   aws.String(StatisticAverage),
	},
})
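
The aliasing can be reproduced in isolation. The following is a minimal sketch, not the plugin's code: Metric, Query, buildShared and buildCopied are hypothetical stand-ins for types.Metric, types.MetricDataQuery and the loop in getDataQueries. Since Go 1.22, range loops create a fresh variable per iteration, so the sketch declares the variable outside the loop to emulate the older behavior (as in go1.17.2) on any Go version.

```go
package main

import "fmt"

// Metric is a hypothetical stand-in for the AWS SDK's types.Metric.
type Metric struct {
	Name      string
	Dimension string
}

// Query is a hypothetical stand-in for types.MetricDataQuery.
type Query struct {
	Metric *Metric
}

// buildShared emulates the pre-Go 1.22 range loop: a single variable is
// reused by every iteration, so every query stores the same address.
func buildShared(metrics []Metric) []Query {
	var queries []Query
	var metric Metric // one variable, one address
	for i := range metrics {
		metric = metrics[i]
		queries = append(queries, Query{Metric: &metric})
	}
	return queries
}

// buildCopied copies each element into a new variable before taking its
// address, so every query points at its own Metric.
func buildCopied(metrics []Metric) []Query {
	var queries []Query
	for i := range metrics {
		m := metrics[i] // new variable per iteration
		queries = append(queries, Query{Metric: &m})
	}
	return queries
}

func main() {
	metrics := []Metric{
		{Name: "BytesInFromDestination", Dimension: "nat-a"},
		{Name: "BytesInFromDestination", Dimension: "nat-b"},
	}
	for _, q := range buildShared(metrics) {
		fmt.Println("shared:", q.Metric.Dimension)
	}
	for _, q := range buildCopied(metrics) {
		fmt.Println("copied:", q.Metric.Dimension)
	}
}
```

With the shared variable, both queries print nat-b, the last element written. With the per-iteration copy, they print nat-a and nat-b.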

The weird thing is that this would affect everyone using the plugin and having more than one dimension per metric. Why didn't this show up earlier?

Labels: area/aws, bug, platform/darwin
