
Spec: services dependencies granularity NoSQL#654

Merged
SylvainJuge merged 7 commits into elastic:main from SylvainJuge:destination-granularity-nosql on Jul 25, 2022

Conversation

@SylvainJuge
Member

@SylvainJuge SylvainJuge commented Jun 23, 2022

Implement #646 for NoSQL databases

Summary of changes

  • For Cassandra, define db.instance as the keyspace name (if available)
  • For MongoDB, define db.instance as the database name (if available)
  • For Elasticsearch, define db.instance as the cluster name (if available, using a heuristic)
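The mapping above could be sketched roughly as follows (an illustrative sketch only, not any agent's actual implementation; the dict-of-fields layout and function name are assumptions):

```python
# Sketch: how db.instance could feed the service.target.* fields of a
# NoSQL exit span, per the mapping above. Names are illustrative.
def nosql_service_target(db_type, db_instance=None):
    """Build service.target fields for a NoSQL exit span.

    db_instance is the Cassandra keyspace, MongoDB database, or
    Elasticsearch cluster name; it may be unknown (None).
    """
    target = {"service.target.type": db_type}
    if db_instance:
        # Only set the name when an instance name is available, rather
        # than falling back to high-cardinality host:port values.
        target["service.target.name"] = db_instance
    return target
```

For example, a span for a query on the `shop` MongoDB database would get both fields, while a Redis span would only get the type.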

Out of scope (for now)

  • Databases that do not have a database or cluster name (a cluster being just a set of hosts): using host:port in service.target.name would have high cardinality and provide limited value
  • CosmosDB (Azure), I guess from this that we can't easily capture the region name and set db.instance.
    • option 1: skip this from the spec for now and refine cloud-based services with a separate spec PR
    • option 2: use the "account name" for Azure services, which is easy to capture unlike the actual region name; this strategy would likely also be applied to other Azure services (so at least it would be consistent across all Azure services).
      EDIT: opened Specification for Azure service granularity #661 to deal with this separately

To be clarified/discussed


Already clarified/discussed

  • What can we do for databases that have neither a database nor a cluster name? Should we fall back to the host:port format for service.target.name? I would be in favor of keeping it as-is for now, as the added granularity would be of limited value (cardinality == number of hosts)
    • Redis
    • Memcached
  • What values of service.target.name should we pick for Elasticsearch spans?
    • cluster name seems to be a good candidate
    • host:port might have high cardinality with many ES nodes (would be similar to Redis and Memcached described above)
    • cluster name might not be accessible from application side depending on the client used.
    • if an extra call to ES is required, then agents should cache the value with the host:port as key.
    • the added granularity might provide limited value unless the application communicates with multiple clusters.
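The caching idea in the list above (host:port as key, cluster name as value) could look roughly like this minimal sketch; the `fetch` callable stands in for whatever extra Elasticsearch call an agent would make, and all names are assumptions:

```python
import threading

class ClusterNameCache:
    """Sketch: cache host:port -> cluster name to avoid repeated lookups."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable(host_port) performing the extra ES call
        self._cache = {}
        self._lock = threading.Lock()

    def get(self, host_port):
        with self._lock:
            if host_port in self._cache:
                return self._cache[host_port]
        # Fetch outside the lock so a slow network call does not block
        # other instrumented threads.
        name = self._fetch(host_port)
        with self._lock:
            # setdefault keeps the first value if another thread raced us.
            self._cache.setdefault(host_port, name)
            return self._cache[host_port]
```

With this shape, repeated spans against the same node pay for at most one extra lookup.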

  • Create PR as draft
  • Approval by at least one other agent
  • Mark as Ready for Review (automatically requests reviews from all agents and PM via CODEOWNERS)
    • Remove PM from reviewers if impact on product is negligible
    • Remove agents from reviewers if the change is not relevant for them
  • Merge after 2 business days passed without objections
    To auto-merge the PR, add /schedule YYYY-MM-DD to the PR description.

@ghost

ghost commented Jun 23, 2022

💚 Build Succeeded


Build stats

  • Start Time: 2022-07-25T04:46:00.812+0000

  • Duration: 3 min 49 sec

@SylvainJuge SylvainJuge changed the title Spec: Destination granularity nosql Spec: services dependencies granularity NoSQL Jun 24, 2022
@AlexanderWert
Member

What values of service.target.name should we pick for Elasticsearch spans?

  • host:port might have high cardinality with many ES nodes (would be similar to Redis and Memcached described above)

Is this really the case? Usually ES clients are not talking to individual ES nodes of a cluster but to the client nodes, right? So usually there will be only one host:port (from the client perspective) that also identifies the cluster. So to me it looks like a good candidate / fallback if we cannot retrieve the cluster name.

@AlexanderWert
Member

For CosmosDB, it seems we already specified the storage account name as the value for destination.service.resource, so we could use it for service.target.name as well (see https://github.com/elastic/apm/blob/main/specs/agents/tracing-instrumentation-azure.md#span-context-fields-3)

@felixbarny
Member

Usually ES clients are not talking to individual ES nodes of a cluster but the Client nodes, right?

IIRC, ES clients are talking to individual ES nodes. They discover all nodes in a cluster given an initial seed of configured nodes and round-robin requests across individual nodes. You can also configure them to only talk to one node that load-balances requests to other nodes. But this is not something that we can rely on, I think.

@AlexanderWert
Member

IIRC, ES clients are talking to individual ES nodes. They discover all nodes in a cluster given an initial seed of configured nodes and round-robin requests across individual nodes. You can also configure them to only talk to one node that load-balances requests to other nodes. But this is not something that we can rely on, I think.

Thanks Felix!

And getting the cluster name requires an additional ES request, I guess? Or is this info maybe available as metadata on the response?

@felixbarny
Member

Yeah, I suppose the only way to get the cluster name is to invoke the info/root endpoint and read the cluster_name property. Neither the response nor the response headers contain information about the cluster.

@SylvainJuge
Member Author

If we need an extra call to ES to get the cluster name, that would be similar to the JDBC metadata that relies on SQL queries for some drivers. In that case, agents should cache the value to avoid excessive overhead; we could simply use host:port as the cache key and the cluster name as the cached value. Maybe I should clarify the spec on that.

The calls to ES are performed on individual nodes (quite similar to Cassandra), and we can still distinguish the individual nodes by using the destination.service.address field. Also, if we were able to get the node name (or in the case of ES it's only the node ID), then we could simply use the service.node.name field.

@estolfo
Contributor

estolfo commented Jul 5, 2022

I caught up with the Elasticsearch clients team and asked them about the "product check" request they make at startup. Currently most clients call the global info API and could, in theory, cache the cluster name for the APM agents to use.
However, they said that they are going to move away from this implementation. It might still be worth raising the idea of Elasticsearch returning the cluster name as a response header; Cloud apparently does this today through the x-found-handling-cluster header.
The point is, before committing to making an extra request to ES from the APM agents, perhaps we should explore enabling the Elasticsearch clients to do that so we can both use the returned info.
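If the cluster name does arrive as a response header, picking it up is cheap. A minimal sketch (the function name is an assumption; the header name is the one Cloud reportedly sets today, per the comment above):

```python
def cluster_name_from_headers(headers):
    """Return the ES cluster name from response headers, if present.

    HTTP header names are case-insensitive, so normalize before lookup.
    """
    normalized = {k.lower(): v for k, v in headers.items()}
    return normalized.get("x-found-handling-cluster")
```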

@felixbarny
Member

The thing is, even if the client team exposes a method to return the cached cluster name, we'll still have to find a solution for older client versions. But we could have a layered model:

  • Call API in the client library to get cached cluster name
  • Use x-found-handling-cluster header value
  • Instrument _node/http calls and cache the result in the agent
  • If the cluster name is still unknown after a request, make a call to Elasticsearch and cache host:port -> cluster name mappings.
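The layered model above amounts to a first-match chain over increasingly expensive sources. A sketch under that assumption (the resolver callables are hypothetical stand-ins for the four steps; nothing here is an existing agent API):

```python
def resolve_cluster_name(connection, resolvers):
    """Try each resolver in order; return the first cluster name found.

    resolvers are callables(connection) -> cluster name or None, ordered
    from cheapest (client-library API, response header) to most expensive
    (an extra call to Elasticsearch, cached per host:port).
    """
    for resolve in resolvers:
        name = resolve(connection)
        if name:
            return name
    return None  # still unknown; the agent may retry on a later request
```

Usage would look like `resolve_cluster_name(conn, [from_client_api, from_cloud_header, from_nodes_http_cache, fetch_and_cache])`, with each step free to return None and defer to the next.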

@SylvainJuge
Member Author

The headers seem to be defined here for Cloud, and we could definitely use them to get the cluster name (and maybe the node name later on).

+1 on the layered approach, which will be required anyway for existing clients and ES versions, or for deployments outside of Cloud.
If we pick this strategy, I will first add the last 3 items to the spec, and only add the client API once it's available (which will likely require extra work with the clients team).

SylvainJuge and others added 3 commits July 7, 2022 09:03
Co-authored-by: Alexander Wert <AlexanderWert@users.noreply.github.com>
@AlexanderWert AlexanderWert marked this pull request as ready for review July 18, 2022 07:43
@AlexanderWert AlexanderWert requested review from a team as code owners July 18, 2022 07:43
@SylvainJuge
Member Author

The Azure part is now delegated to #661, so we should be good to merge this part of the spec. I'll merge it on Monday if there are no further comments or objections.

@SylvainJuge SylvainJuge merged commit cbc5848 into elastic:main Jul 25, 2022
@SylvainJuge SylvainJuge deleted the destination-granularity-nosql branch July 25, 2022 07:14