Feature Request
Currently, when Telegraf writes metrics to an output plugin (e.g. Elasticsearch), it cannot use an ID field from the metric payload as the ID of the metric being pushed.
As a result, every metric pushed to ES is treated as a new document, because no ID is supplied with it.
Proposal:
Add a JSON parser option named json_id_key, similar to json_time_key.
Like the time key, it would be optional. If it is set, the ID is made available to output plugins to use and assign.
If it is not set, the current behaviour continues unchanged.
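To illustrate the proposal, here is a minimal sketch of how a parser might pull the ID out of a JSON payload. It is not taken from the Telegraf code base: the function name extractID, the sample key event_id and the choice to accept only string or numeric IDs are all assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// extractID is a hypothetical helper, not part of Telegraf.
// It looks up idKey (the value that json_id_key would hold) in a flat
// JSON object and returns it as a string. The boolean is false when
// idKey is empty, the key is missing, or the value is not a non-empty
// string or a number, in which case the caller keeps today's behaviour.
func extractID(payload []byte, idKey string) (string, bool, error) {
	if idKey == "" {
		// Option not configured: nothing to extract.
		return "", false, nil
	}
	var fields map[string]interface{}
	if err := json.Unmarshal(payload, &fields); err != nil {
		return "", false, err
	}
	switch v := fields[idKey].(type) {
	case string:
		return v, v != "", nil
	case float64:
		// encoding/json decodes every JSON number as float64.
		return strconv.FormatFloat(v, 'f', -1, 64), true, nil
	default:
		return "", false, nil
	}
}

func main() {
	payload := []byte(`{"event_id": "abc-123", "cpu": 0.42}`)
	id, ok, err := extractID(payload, "event_id")
	fmt.Println(id, ok, err)
}
```

Returning a separate boolean keeps the "no ID" case explicit, so an output plugin can fall back to its existing path without checking for an empty string.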
Use case:
One scenario where this is especially useful is saving metrics to Elasticsearch.
If an input plugin produces duplicate metrics, or returns the same metric multiple times, the ES output plugin currently sends each one without an identifier. ES then treats each one as a new document, generates a new ID and inserts it into its datastore, which results in duplicate metrics.
When a metric is sent with an ID, ES can perform the "UPSERT" correctly: it modifies the existing document with that ID, or creates one if it does not exist.
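As a rough sketch of what this means on the wire, the snippet below builds the action line of an Elasticsearch bulk request with and without an `_id`. With the `index` action, a request that carries an `_id` creates the document or replaces the existing one with that ID, while a request without one gets an auto-generated ID. The helper bulkActionLine and the index name telegraf are assumptions for illustration, and this is not how the Telegraf ES output plugin is actually implemented.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// bulkMeta mirrors the metadata object of an Elasticsearch bulk action.
// The _id field is omitted from the JSON when ID is empty.
type bulkMeta struct {
	Index string `json:"_index"`
	ID    string `json:"_id,omitempty"`
}

// bulkActionLine is a hypothetical helper that renders the action line
// preceding a document in a bulk request body.
func bulkActionLine(index, id string) (string, error) {
	action := map[string]bulkMeta{"index": {Index: index, ID: id}}
	b, err := json.Marshal(action)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// With an ID: repeated sends target the same document.
	withID, _ := bulkActionLine("telegraf", "abc-123")
	fmt.Println(withID)

	// Without an ID: ES assigns a fresh ID on every send.
	withoutID, _ := bulkActionLine("telegraf", "")
	fmt.Println(withoutID)
}
```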
Note - Output plugins would also need to be changed to use the ID when it is available. Whether to do so could be left to each individual plugin.
I have made the changes in my local Telegraf code and run the suggested steps to confirm they work. If this would be useful to others, let me know and I can raise a PR.
Thanks.