[Java] ListVector.setNull doesn't update lastSet - performance issue #40796

@jarohen

Describe the enhancement requested

In ListVector, setNull doesn't update lastSet. This means that, if you set many null values in a row, the offset buffer is unnecessarily rewritten for the intervening values on every call, e.g.:

  • If I call .setNull(3) with lastSet == 2, it sets offset 3.
  • If I then call .setNull(4), .setNull(5), .setNull(6), these re-set offsets 3->4, 3->5, and 3->6 respectively, i.e. O(n²) offset writes for n consecutive nulls.
  • With large and sparse enough vectors, this adds significant time to our internal benchmarks.

Naively, I'd put lastSet = index in setNull, in the same way as startNewValue does, but I'm aware there might be an implicit/nuanced reason for not doing so?
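
To make the cost concrete, here is a toy model of the offset bookkeeping. It is not the real ListVector: the class SetNullSketch, its fields, and the countWrites helper are all hypothetical. It only assumes that setNull back-fills offsets from lastSet + 1 through index, as described above. The updateLastSet flag toggles the proposed lastSet = index assignment so the two behaviours can be compared.

```java
/** Toy model of list-vector offset bookkeeping. NOT the real Arrow ListVector. */
final class SetNullSketch {
    private final int[] offsets;          // offsets[i + 1] is the end offset of element i
    private final boolean updateLastSet;  // true = apply the change proposed in this issue
    private int lastSet = -1;             // index of the last element whose offset is final
    private long offsetWrites = 0;        // counts every write to the offset buffer

    SetNullSketch(int capacity, boolean updateLastSet) {
        this.offsets = new int[capacity + 1];
        this.updateLastSet = updateLastSet;
    }

    void setNull(int index) {
        // Back-fill the end offset of every slot after lastSet, up to and including index.
        for (int i = lastSet + 1; i <= index; i++) {
            offsets[i + 1] = offsets[i];
            offsetWrites++;
        }
        if (updateLastSet) {
            lastSet = index; // assumption: mirrors what startNewValue is described as doing
        }
    }

    /** Calls setNull(0..n-1) in order and returns the number of offset writes. */
    static long countWrites(int n, boolean updateLastSet) {
        SetNullSketch vector = new SetNullSketch(n, updateLastSet);
        for (int i = 0; i < n; i++) {
            vector.setNull(i);
        }
        return vector.offsetWrites;
    }

    public static void main(String[] args) {
        int n = 1_000;
        System.out.println("without lastSet update: " + countWrites(n, false));
        System.out.println("with lastSet update:    " + countWrites(n, true));
    }
}
```

In this model, 1,000 consecutive nulls cost n(n+1)/2 = 500,500 offset writes without the update and 1,000 with it. The real method also handles buffer reallocation and the validity bit, which the sketch leaves out.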

Cheers,

James

Component(s)

Java
