feat: add new Influx Line Protocol Parser#9685
feat: add new Influx Line Protocol Parser#9685akrantz01 wants to merge 15 commits intoinfluxdata:masterfrom
Conversation
|
I have an open feature request #9487 that requires the line protocol parser in telegraf to process and handle comments. After some discussion with @reimda I have been talked out of this idea but I am still looking at this new line parser and I don't see where it gracefully handles comments (even by just throwing them away). Did I miss it? |
|
@wz2b I'm not sure where exactly it handles comments, but it does handle them by just discarding them. |
7d99a8e to
bf40aff
Compare
srebhan
left a comment
There was a problem hiding this comment.
Hey @akrantz01 thanks for looking into this. I have some comments, the most prominent being
- My suggestion to use
errors.Is()when checking for a certain error type to handle wrapped errors. - A question regarding the removal of the
bytes_receivedstat.
Would be nice if you can take a look.
642db8c to
070d298
Compare
Looks like new artifacts were built from this PR. Get them here!Artifact URLs |
sspaink
left a comment
There was a problem hiding this comment.
lgtm, do you know if anyone has tested this? @akrantz01 were you able to test all of these changes?
|
I don't know if anyone else has tested it. I also haven't done any extensive testing. |
|
I ran this against a set of diverse real-world metrics data also with a variety of badly formatted lines. It worked quite well but ended up panicing. Unfortunately I do not have the exact line which caused this but maybe the trace will already be helpful. Working on identifying the culprit line as well. Trace: |
|
I haven't been able to pinpoint what could have caused the panic in the metrics stream. One simple and fast workaround would be to have better bounds checking on this array to ensure that the slice is valid and provide the necessary context in the error message to make identifying the problematic line easier. |
|
Found another panic, this one is within the line protocol parser itself so reported it here: influxdata/line-protocol#50 |
|
@oplehto, would you be willing to share any of the data you are using to cause these issues? |
It's a pretty high-volume real-time data stream with a variety of sources. Thus pinpointing the needle in the haystack is a bit challenging, which I why I was hoping for some more robust exception handling to help narrow it down. In general it would be nice to have bounds checking when referencing slices like this to better catch these type of "unknown unknowns" in live data. |
| if len(buffer) > maxErrorBufferSize { | ||
| startEllipsis := true | ||
| offset := e.Offset - e.LineOffset | ||
| offset := e.Column - 1 - lineStart |
There was a problem hiding this comment.
as pointed out in the comments, we need to verify that offset is not > len(buffer) and if it is return len(buffer)
|
Hi! I am going to take this work over from Alex. I have opened a new PR #9871 where I have pushed his existing work plus a fix for the buffer overflow and have Telegraf set to use the debug version of the Influx Line Protocol parser for help with testing. I am going to close this PR in favor of my own and hope @oplehto can use that to help debug the current panic. Thanks! |
Required for all PRs:
resolves #9474
Currently blocked by influxdata/line-protocol#43Replaces the Influx Line Protocol parser with the new, faster, and zero-allocation parser found here. The only downside to the new parser is that we cannot get the current buffer when using the stream parser variant, so the messages will be slightly less descriptive when an error is encountered.