feat(parsers.influx): New influx line protocol via feature flag#10749
feat(parsers.influx): New influx line protocol via feature flag#10749reimda merged 5 commits intoinfluxdata:masterfrom
Conversation
654ef28 to
ff0d893
Compare
This introduces a new parser option to allow users to choose between the upstream (newer, more memory efficient and faster) influx line protocol parser or the built-in, included influx line protocol.
5fb6ed2 to
08d78ec
Compare
|
@reimda would you be willing to give the last commit a quick review? The big changes are:
I am looking at changes to influxdb_v1_listener, but so far do not feel my changes are appropriate |
| } | ||
|
|
||
| if err != influx.EOF && err != nil { | ||
| if err != influx.EOF && err != influx_upstream.ErrEOF && err != nil { |
There was a problem hiding this comment.
It feels a little strange and maybe unsafe to handle errors from either parser in the same place without any type assertion on the error. I think in the logs we're also going to want to know which parser generated the error. Maybe handle the specific errors caused by each parser right after each Parse finishes and only handle common errors like the badRequest after the if/else?
There was a problem hiding this comment.
influx.EOF and influx.ErrEOF are the same thing, an errors.New("EOF"). We are only handling types of error in this if statement so I'm not sure I follow the concern about type assertions.
There was a problem hiding this comment.
Let's store errors.New("EOF) as a variable in this package and compare err to that variable here.
* Keep influx_upstream under influx * Add and update READMEs for influx parsers
08d78ec to
ce01646
Compare
| require.NoError(t, err) | ||
| require.NoError(t, resp.Body.Close()) | ||
| require.EqualValues(t, 204, resp.StatusCode) | ||
| for _, parser := range []string{"internal", "upstream"} { |
There was a problem hiding this comment.
I've been thinking about this pattern. It reuses the listener for both parsers and always runs in a specific order of parsers. It's best to test as close to real use as possible so this isn't ideal. I think it would be better to pull the initialization into the parser for loop to use a new listener for each parser.
Doing it this way has some testing usability drawbacks too. If there is a failure we won't know which parser was involved. Also it doesn't allow us to run all tests of just one of the parser types. Using golang subtests would fix both those problems. See https://go.dev/blog/subtests#table-driven-tests-using-subtests
Maybe these improvements aren't needed, but they were on my mind so I thought I'd share them with you.
There was a problem hiding this comment.
done - the reason I did not do this initially was since the expected input and outputs are the same and the parser is a run-time check it did not seem to make sense. We both agree that it is a best practice, however slightly not sure it is a perfect fit here, but made the change anyway.
There was a problem hiding this comment.
Looks good with the Run func.
Could we define the testCases slice in one place, maybe global, so it's not repeated in each function?
plugins/parsers/influx/README.md
Outdated
| ## Influx parser type to use. Users can choose between 'internal' and | ||
| ## 'upstream'. The internal parser is what Telegraf has historically used. | ||
| ## While the upstream parser involved a large re-write to make it more | ||
| ## memory efficient and performant. | ||
| ## influx_parser_version = "internal" |
There was a problem hiding this comment.
Let's use the same shorter text here too.
* update parser readme to be inline with listeners * global EOF error check * consolidate the test cases for both listener tests
|
Download PR build artifacts for linux_amd64.tar.gz, darwin_amd64.tar.gz, and windows_amd64.zip. 📦 Click here to get additional PR build artifactsArtifact URLs |
|
@oplehto Do you by chance have before/after performance numbers of using the new parser compared to the old one? |
Add the ability to use the upstream Influx Line Protocol parser with the new, zero-allocation with the existing internal parser. Users can choose to use the new 'upstream' parser with the
influx_parser_typeconfig option or with theparser_typeconfig option with theinfluxdb_v2_listener.Moves time
influx.TimeFuncto a common file for use by both parsers.Blocked by influxdata/line-protocol#50
Previous PR #9685
Resolves #9474
Authored-by: Alex Krantz alex@krantz.dev