Log handling for external plugins #9487
Description
Feature Request
This feature request is about improving the handling of log/debug/error/info messages from external plugins in Telegraf.
All of my external "plugins" use the line protocol library to generate metrics. This makes it really easy to debug things by running the plugin directly (outside of telegraf) and just watching stdout. I generally use an instance of log.Logger to generate some output useful for debugging, and I have written a [small log writer](https://github.com/wz2b/telegraf-tempest/blob/main/internal/tclogger/tclogger.go) that the plugin installs when it starts. The only thing this does is split log messages by line and add a comment ("#") marker in front of each line. If you were to feed such a thing directly into influxdb those lines would get ignored as if they were comments.
In Telegraf's case what happens to them depends on where you direct them:
- If they appear on stdout then telegraf treats them like a line protocol comment and they are ignored
- If they appear on stderr then telegraf treats them as an E! level log
I would like the ability to handle these logs a little more flexibly. If a log message appears on stderr, I'd like some ability to convey that log message's stated severity and not assume they are all ERROR. If a comment appears on STDOUT then I think something should be done with that, not just have it thrown away.
I'm not yet sure what to do with these logs. Ideally they would go into some kind of log stream within telegraf, and you configure a log output. That log output could output them to some combination of stdout/stderr; add them to the existing telegraf log (which is nice because it has log rotation etc); or direct the messages to a log aggregation system like loki or logstash.
Proposal:
Short term:
- If telegraf receives a line that starts with a # from the plugin's stdout I would like that comment to appear on telegraf's stdout. That way, when I run telegraf from the command line, I can watch what it is doing along with what my plugin is doing.
- If telegraf receives a message on stderr, somehow try to determine a severity so they aren't all ERROR
Longer term, I haven't settled on what I think is the best way to handle this yet. I am leaning toward this:
- If a comment appears on stdout, have telegraf write it to its stdout so it appears on the console
- If an error message appears on Stderr, have telegraf try to parse it as a line protocol message, with an (optional) standard tag set to let users define their own severity
- Create a separate stream for these messages within telegraf so that you can direct them to a logging-specific output (or set of outputs) like logstash, loki, or some other log handling system, or if none defined just write them to telegraf's standard log handler (which has file rotation and all those other niceties).
Current behavior:
All log messages sent to stderr are treated by telegraf as error messages. All comments sent to stdout are ignored by telegraf. Things that don't look like line protocol generate a telegraf error message.
Use case:
Every one of my external plugins starts with:
log.Println("Chris's Plugin version 1.2.3 hello there!")
or something like that. With my log adapter that comes out on stdout with a # in front of it. It's still useful in that format - I can see what's happening when running standalone, and it doesn't disturb things when run from telegraf. But I lose the message, unless I send it to stderr, in which case it shows up as an ERROR, which it's not.
I have four external plugins under development right now, but because we haven't yet had this discussion they all handle things slightly differently. I intend to standardize my approach, and would like to do so in a way that's also useful to others. I also think that the ideal solution involves some changes to telegraf.
My vision is to create an "external telegraf plugin sdk" that's just a small library that external plugin developers can bring in. It would get them the line protocol library along with some boilerplate code everyone needs to use line-protocol, and it would also handle this logging question.
What makes most sense to me is to implement the [external telegraf logger](https://github.com/influxdata/telegraf/blob/master/logger/logger.go) (we would convert it to an interface) so that all external plugins could share it.
Long term, I really still would like to see an external plugin API that doesn't involve text messages sent over stdio. My vision on that is gRPC or something like it. When (if) we get to that point we would implement the same logger interface but over a gRPC stream rather than discerning stdout/stderr. However, prior to that, dealing with logs over stdout/stderr still seems like a good short-term improvement.
Thoughts?
Note: I started this conversation by saying "Go logging is the worst!" and someone (correctly) challenged me that if I want anybody to listen to me I better explain. :-) Thus, this feature request, to try to spark a discussion.