
[Java] ListVector.setNull doesn't update lastSet - performance issue #40796

@jarohen

Describe the enhancement requested

In ListVector, setNull doesn't update lastSet. This means that, if you set many null values in a row, the offset buffer is unnecessarily rewritten for the intervening values - e.g.:

  • If I call .setNull(3) with lastSet == 2, it sets the offset for index 3.
  • If I then call .setNull(4), .setNull(5) and .setNull(6), these re-set the offsets for 3 to 4, 3 to 5 and 3 to 6 respectively, because lastSet is still 2 - i.e. O(n²) offset writes for a run of n nulls.
  • With large and sparse enough vectors, this adds significant time to our internal benchmarks.

Naively, I'd put lastSet = index in setNull in the same way as startNewValue, but I'm aware there might be an implicit/nuanced reason for not doing so?
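
To make the cost concrete, here is a minimal toy model of the behaviour described above. It is not the actual ListVector code: the class OffsetFillModel, its fields, and the writesFor helper are hypothetical names for illustration, and it assumes that setNull back-fills offsets from lastSet + 1 up to index. It only counts offset writes with and without the proposed lastSet = index update.

```java
// Toy model only: NOT Arrow's ListVector. All names here are hypothetical.
final class OffsetFillModel {
    private final int[] offsets;               // stand-in for the offset buffer
    private int lastSet;                       // last index whose offset was filled
    private long offsetWrites;                 // number of offset slots written
    private final boolean updateLastSetOnNull; // true = proposed fix

    OffsetFillModel(int capacity, int initialLastSet, boolean updateLastSetOnNull) {
        this.offsets = new int[capacity + 1];
        this.lastSet = initialLastSet;
        this.updateLastSetOnNull = updateLastSetOnNull;
    }

    // Assumed behaviour: copy the previous offset forward for every
    // index between lastSet + 1 and index (inclusive).
    void setNull(int index) {
        for (int i = lastSet + 1; i <= index; i++) {
            offsets[i + 1] = offsets[i];
            offsetWrites++;
        }
        if (updateLastSetOnNull) {
            lastSet = index; // the change suggested in this issue
        }
    }

    long offsetWrites() {
        return offsetWrites;
    }

    // Sets indices firstNull..lastNull to null and returns the total offset writes.
    static long writesFor(int initialLastSet, int firstNull, int lastNull, boolean update) {
        OffsetFillModel model = new OffsetFillModel(lastNull + 1, initialLastSet, update);
        for (int i = firstNull; i <= lastNull; i++) {
            model.setNull(i);
        }
        return model.offsetWrites();
    }

    public static void main(String[] args) {
        // The example from the bullets: lastSet == 2, then setNull(3..6).
        System.out.println(writesFor(2, 3, 6, false)); // 1 + 2 + 3 + 4 = 10
        System.out.println(writesFor(2, 3, 6, true));  // 1 + 1 + 1 + 1 = 4
    }
}
```

In this model, a run of 1,000 consecutive nulls starting from an empty vector costs 500,500 offset writes without the update and 1,000 with it, which matches the quadratic vs. linear behaviour we see in our benchmarks.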

Cheers,

James

Component(s)

Java
