Skip to content

Function #nearest! does not return the k-th nearest elements #2

@NIFR91

Description

@NIFR91

Hi,

I think i found a bug in the function nearest!, it does not return the correct answer, i generated the following test

  describe "#nearest" do
    it "should equal naive implementation" do 
      ndim = 2
      k = 3
      distance = ->(m : Array(Float64),n : Array(Float64)) do
        m.each_with_index.reduce(0) do |sum, (coord, index)|
          sum += (coord - n[index]) ** 2
          sum
        end
      end

      10.times do
        points = Array.new(10) do  Array.new(ndim) do rand(-10.0 .. 10.0) end end
        kd_tree = Kd::Tree(Float64).new(points)
        target = Array.new(ndim) do rand(-11.0 .. 11.0) end
        res = kd_tree.nearest(target,k)
        sorted = points.sort_by do |p| distance.call(p,target) end.reverse!
        (res - sorted[-k..]).should eq [] of Float64
      end
    end
  end

I belive the bug is when the current node point is replaced, it should be the farthest point and currently
it is the first point.

The simplest solution is to sort the points

def nearest!(...)
  # ...
  if nearest.size < n 
    nearest << curr.pivot 
  else 
    nearest.sort_by! do |b| distance(b, query) end
    dist_curr_query = distance(curr.pivot, query)
    ix = nearest.rindex { |b| dist_curr_query < distance(b, query) }
    nearest[ix] = curr.pivot if ix
  end 
  # ...
end

But for larger values of n the sorting and finding may have and impact on performance, an implementation with
a priority queue could have a positive impact on performance.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions