fix(tsm1): fix condition check for optimization of array cursor#26691
fix(tsm1): fix condition check for optimization of array cursor#26691philjb merged 3 commits intomaster-1.xfrom
Conversation
The array ascending cursors do not check all the required conditions before doing an optimization to copy a full 1000 points of data from the tsm read buffer to the cursors response buffer. This can result in points from tsm files appearing in cursor output multiple times. The conditions the bug appears are rare as it requires cache data that is older (or overlapping) with tsm data. The data must be in the same shard and must match for the same key. * fixes #26690
| if c.tsm.pos < len(tvals.Timestamps) { | ||
| if pos == 0 && len(c.res.Timestamps) >= len(tvals.Timestamps) { | ||
| if pos == 0 && c.tsm.pos == 0 && len(c.res.Timestamps) >= len(tvals.Timestamps) { | ||
| // optimization: all points can be served from TSM data because |
There was a problem hiding this comment.
I'm not sure how much of an optimization this is anymore. I would expect (maybe) that go doesn't reslice (create a new slice header) if copy(dst[0:], src[0:]) is used, but i haven't looked yet.
There was a problem hiding this comment.
copy calls into runtime.memmove. No slice headers are made but setup is done to get the arguments to memmove to point to the correct place in each backing array. memmove itself handles overlap per the spec.
Looks like this optimization avoids some setup to memmove at the code of the extra conditionals. Probably worthwhile to remove the "optimization" to have less code to debug and maintain. A microbench mark might show some speed differences.
This commit adds a test that verifies the duplicate data bug fix. It will fail without the previous commit. See the test for details; in short it sets up the right cache and tsm data to expose the bug which was occuring when the cache had older data than the tsm files and then the cache was exhausted while there were still more tsm values to include in more Next() calls.
| kc := fs.KeyCursor(context.Background(), []byte("measurement,field=value#!~#value"), 0, true) | ||
| defer kc.Close() | ||
|
|
||
| cursor := newFloatArrayAscendingCursor() |
There was a problem hiding this comment.
the test only looks at float array cursors but all types are impacted. They are made from a template so i made the assumption testing one type also covers the other datatypes.
| // and incorrectly copies TSM data including the points from the previous Next() call. Additionally, cache data | ||
| // needs to be older than most of the TSM data. There needs to be enough tsm data younger than the cache data to completely | ||
| // fill the remaining space in the response buffer. | ||
| func TestAscendingCursorDuplicateDataBug(t *testing.T) { |
There was a problem hiding this comment.
AI helped me make this test more robust; but it also made it rather long with the setup and precondition verifications.
davidby-influx
left a comment
There was a problem hiding this comment.
This would be cleaner with test helpers from
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
There was a problem hiding this comment.
I'd recommend using modern test helpers, like require.NoError instead of Fatalf
The assert package is used for post conditions and the require package is used for pre-conditions in the test. The library provides clearer intent of conditions and better formatting of information (message and data) in the situation an assertion fails. It also makes it easier to read by reducing ifs but some lines are longer. This commit addresses PR requests.
* fix(tsm1): fix condition check for optimization of array cursor The array ascending cursors do not check all the required conditions before doing an optimization to copy a full 1000 points of data from the tsm read buffer to the cursors response buffer. This can result in points from tsm files appearing in cursor output multiple times. The conditions the bug appears are rare as it requires cache data that is older (or overlapping) with tsm data. The data must be in the same shard and must match for the same key. * fixes #26690 * chore(tsm1): add test that covers array cursor bug This commit adds a test that verifies the duplicate data bug fix. It will fail without the previous commit. See the test for details; in short it sets up the right cache and tsm data to expose the bug which was occuring when the cache had older data than the tsm files and then the cache was exhausted while there were still more tsm values to include in more Next() calls. * chore(test): switch to using the testify library in the new test The assert package is used for post conditions and the require package is used for pre-conditions in the test. The library provides clearer intent of conditions and better formatting of information (message and data) in the situation an assertion fails. It also makes it easier to read by reducing ifs but some lines are longer. This commit addresses PR requests. (cherry picked from commit 9d1e79d)
* fix(tsm1): fix condition check for optimization of array cursor The array ascending cursors do not check all the required conditions before doing an optimization to copy a full 1000 points of data from the tsm read buffer to the cursors response buffer. This can result in points from tsm files appearing in cursor output multiple times. The conditions the bug appears are rare as it requires cache data that is older (or overlapping) with tsm data. The data must be in the same shard and must match for the same key. * fixes #26690 * chore(tsm1): add test that covers array cursor bug This commit adds a test that verifies the duplicate data bug fix. It will fail without the previous commit. See the test for details; in short it sets up the right cache and tsm data to expose the bug which was occuring when the cache had older data than the tsm files and then the cache was exhausted while there were still more tsm values to include in more Next() calls. * chore(test): switch to using the testify library in the new test The assert package is used for post conditions and the require package is used for pre-conditions in the test. The library provides clearer intent of conditions and better formatting of information (message and data) in the situation an assertion fails. It also makes it easier to read by reducing ifs but some lines are longer. This commit addresses PR requests. (cherry picked from commit 9d1e79d)
Add test assertion missing from #26691 when it was ported to OSS master-1.x.
The array ascending cursors do not check all the required conditions before doing an optimization to copy a full 1000 points of data from the tsm read buffer to the cursors response buffer. This can result in points from tsm files appearing in cursor output multiple times.
The conditions the bug appears are rare as it requires cache data that is older (or overlapping) with tsm data. The data must be in the same shard and must match for the same key.