GCP project metadata uses wrong fields #3250

@alstolten

Describe the bug

Hi team. According to the documentation, the cloud metadata for GCP (and for other cloud providers) includes the project name and ID:

        "project": {
          "description": "Project in which the monitored service is running.",
          "type": [
            "null",
            "object"
          ],
          "properties": {
            "id": {
              "description": "ID of the cloud project.",
              "type": [
                "null",
                "string"
              ],
              "maxLength": 1024
            },
            "name": {
              "description": "Name of the cloud project.",
              "type": [
                "null",
                "string"
              ],
              "maxLength": 1024
            }
          }
        }

For GCP, the code uses the output of http://metadata.google.internal/computeMetadata/v1/?recursive=true. The project metadata that this call returns is described here; the relevant parts are:

numeric-project-id
The numeric project ID (project number) of the instance, which is not the same as the project name that is visible in the Google Cloud console. This value is different from the project-id metadata entry value.

project-id
The project ID.
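
To see the two values side by side on a VM, each entry can also be requested on its own. The following is a minimal sketch, not the agent's code: it only builds the requests, the class and method names are hypothetical, and the per-key paths under project/ are an assumption taken from Google's metadata server documentation (the agent itself reads the recursive endpoint above). The metadata server expects the Metadata-Flavor: Google header on every request.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical helper, for illustration only.
final class GcpProjectMetadataRequests {
    // Assumed base path for project-level entries on the GCP metadata server.
    static final String BASE =
            "http://metadata.google.internal/computeMetadata/v1/project/";

    // Builds (but does not send) a GET request for one project metadata entry,
    // e.g. "project-id" or "numeric-project-id".
    static HttpRequest projectEntry(String key) {
        return HttpRequest.newBuilder(URI.create(BASE + key))
                .header("Metadata-Flavor", "Google") // required by the metadata server
                .timeout(Duration.ofSeconds(2))
                .GET()
                .build();
    }
}
```

On a GCP VM, a request built this way could be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()); the body for project-id and for numeric-project-id should then show the two distinct values discussed here.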

The code, however, uses numeric-project-id to set the project ID in the metadata fields:

Long numericProjectId = projectMap.get("numericProjectId") instanceof Long ? (Long) projectMap.get("numericProjectId") : null;

This results in a mismatch with how Google documents the three project identifiers:

  • Project name: A human-readable name for your project.
  • Project ID: A globally unique identifier for your project.
  • Project number: An automatically generated unique identifier for your project.

The APM metadata therefore contains the project number (numeric-project-id in the instance metadata) as cloud.project.id, whereas the correct behavior would be to use project-id. Furthermore, the metadata contains the actual project-id as cloud.project.name, which is also incorrect according to the definitions above.

Unfortunately, the instance metadata does not contain the project name; that property could be obtained by describing the project (see here).
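
The mapping this report asks for could be sketched as follows. This is an illustration under stated assumptions, not the agent's actual implementation: the GcpProjectMapping and CloudProject names are hypothetical, and the "projectId" key is assumed to be the counterpart of the "numericProjectId" key that the existing code reads from the parsed project map.

```java
import java.util.Map;

// Hypothetical sketch of the proposed mapping; not the agent's real classes.
final class GcpProjectMapping {

    // Minimal holder for the two cloud.project.* fields.
    static final class CloudProject {
        final String id;   // cloud.project.id
        final String name; // cloud.project.name

        CloudProject(String id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Maps the parsed "project" section of the metadata response.
    static CloudProject fromProjectMap(Map<String, Object> projectMap) {
        // Assumption: the project ID is exposed under the "projectId" key.
        Object projectId = projectMap.get("projectId");
        String id = projectId instanceof String ? (String) projectId : null;

        // The instance metadata has no human-readable project name,
        // so the name is left unset rather than filled with the project ID.
        // The project number ("numericProjectId") is deliberately not used.
        return new CloudProject(id, null);
    }
}
```

With a map holding numericProjectId = 123456789012 and projectId = "my-project" (both made-up values), this yields "my-project" as the ID and no name, which matches the definitions quoted from Google's documentation.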

As a result, APM indices contain a different value for cloud.project.id than elastic-agent indices do (see here for the definition of what they contain), which in turn prevents users from properly grouping data in Kibana.

The above is confirmed for the APM Java Agent. I have not yet looked at the other APM agents; the issue may be present there as well.

Steps to reproduce

Create a project in GCP with a distinct ID, name, and number. Run an APM Java Agent on a GCP VM. Run an elastic-agent on the same GCP VM. Notice that the cloud.project.id values differ in their respective Elasticsearch index documents.

Expected behavior

cloud.project.id should be the same in both sets of documents, i.e. the GCP project ID (project-id).
