Spec: services dependencies granularity NoSQL #654
Conversation
Is this really the case? Usually ES clients are not talking to individual ES nodes of a cluster but the Client nodes, right? So usually there will be only one |
|
For CosmosDB, it seems like we already specified in the current value for |
IIRC, ES clients talk to individual ES nodes. They discover all nodes in a cluster from an initial seed of configured nodes and round-robin requests across the individual nodes. You can also configure them to talk to only one node, which load-balances requests to the other nodes. But I don't think this is something we can rely on. |
Thanks Felix! And getting the |
|
Yeah, I suppose the only way to get the cluster name is to invoke the info/root endpoint and get the |
|
If we need an extra call to ES to get the cluster name, that could be similar to the JDBC metadata that relies on SQL queries for some drivers. In this case, it should be something that agents cache to avoid excessive overhead; in that case we could simply use the The calls to ES are performed on individual nodes (quite similar to Cassandra), and we can still distinguish the individual nodes by using the |
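To illustrate the caching idea from the comment above, here is a minimal sketch, not an agent implementation. `ClusterNameCache` and the `fetch_info` callable are hypothetical names; `fetch_info` stands in for whatever the agent would use to call the ES info/root endpoint, and the sketch assumes the response body carries a `cluster_name` field.

```python
import threading
from typing import Callable, Dict, Optional


class ClusterNameCache:
    """Caches the cluster name per node address (hypothetical helper).

    `fetch_info` is an assumed callable that queries the info/root
    endpoint of the given node and returns the parsed JSON body.
    """

    def __init__(self, fetch_info: Callable[[str], dict]) -> None:
        self._fetch_info = fetch_info
        self._names: Dict[str, Optional[str]] = {}
        self._lock = threading.Lock()

    def get(self, address: str) -> Optional[str]:
        # Fast path: answer from the cache without any network call.
        with self._lock:
            if address in self._names:
                return self._names[address]
        try:
            name = self._fetch_info(address).get("cluster_name")
        except Exception:
            # Failures are cached as None too, so a broken node is not
            # queried again for every span.
            name = None
        with self._lock:
            self._names[address] = name
        return name
```

With this shape, the extra request happens at most once per node address in the common case, and later spans read the cached value.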
|
I caught up with the Elasticsearch clients team and asked them about the "product check" request that they do at startup. Currently most clients make a call to the global info API and could, in theory, cache the cluster name for the APM agents to use. |
|
The thing is, even if the client team exposes a method to return the cached cluster name, we'll still have to find a solution for older client versions. But we could have a layered model:
|
|
The headers seem to be defined here for cloud, and we could definitely use them to get the cluster name (and maybe the node name later on). +1 on the layered approach, which will be required anyway for existing clients and ES versions, or outside of cloud deployments. |
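A layered lookup along the lines discussed here could be sketched as follows. The three layers and their order are assumptions pieced together from this thread (a name cached by the client, a cloud response header, then the info endpoint), and the header name `x-found-handling-cluster` is likewise an assumption rather than something confirmed in this conversation. `resolve_cluster_name` and its parameters are hypothetical.

```python
from typing import Callable, Mapping, Optional

# Assumed name of the cloud response header carrying the cluster name.
CLOUD_CLUSTER_HEADER = "x-found-handling-cluster"


def resolve_cluster_name(
    client_cached_name: Optional[str],
    response_headers: Mapping[str, str],
    info_lookup: Callable[[], Optional[str]],
) -> Optional[str]:
    """Returns the first cluster name found, trying the cheapest source first."""
    # Layer 1: a name the ES client already cached (newer clients only).
    if client_cached_name:
        return client_cached_name
    # Layer 2: a response header, compared case-insensitively.
    for key, value in response_headers.items():
        if key.lower() == CLOUD_CLUSTER_HEADER and value:
            return value
    # Layer 3: fall back to the (ideally cached) info endpoint lookup.
    return info_lookup()
```

Returning `None` when every layer fails leaves it to the caller to decide what to report when no cluster name is available.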
Co-authored-by: Alexander Wert <AlexanderWert@users.noreply.github.com>
|
The Azure part is now delegated to #661, so we should be good to merge this part of the spec. I'll merge it on Monday if there are no further comments or objections. |
Implement #646 for NoSQL databases
Summary of changes
- db.instance to the Keyspace name (if available)
- db.instance to the Database name (if available)
- db.instance as the Cluster name (if available, with heuristic).
Out of scope (for now)
- host:port in service.target.name would have high cardinality and provide limited value
- db.instance.
EDIT: opened Specification for Azure service granularity #661 to deal with this separately
To be clarified/discussed
Already clarified/discussed
- host:port format for service.target.name? I would be in favor of keeping it as-is for now, as the added granularity would be of limited value (cardinality == number of hosts)
- service.target.name should we pick for Elasticsearch spans?
- host:port might have high cardinality with many ES nodes (would be similar to Redis and Memcached described above)
- host:port as key.
- CODEOWNERS)
To auto-merge the PR, add /schedule YYYY-MM-DD to the PR description.