topk vs argtopk Performance #3596

@piercefreeman

I've run some tests comparing the performance of topk and argtopk, and there seems to be a significant difference even though the two appear to share most of their underlying code.

Sample script -

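import dask.array as da
from dask.diagnostics import ProgressBar

# Two random matrices whose product is a 100000 x 100000 array
a = da.random.normal(size=(100000,100),chunks=(1000,100))
b = da.random.normal(size=(100,100000),chunks=(100,10000))

c = a.dot(b)

# Swap the two lines below to time topk instead of argtopk
#top = c.topk(5, axis=-1)
top = c.argtopk(5, axis=-1)

with ProgressBar():
    da.compute(top)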

When the above is executed with topk, it completes in 1 min 40 sec. With argtopk, it reaches only 4% after 3 min 40 sec. I interrupted this test early because a similar experiment took nearly 2 hours to complete; if progress is anywhere near linear, this one would take over an hour (roughly 90 minutes) to finish.
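To make that estimate explicit, here is a minimal sketch of the extrapolation using only the timings reported above. The helper name `estimate_total_seconds` is made up for illustration, and it assumes progress is linear in time, which may not hold for a real dask graph.

```python
def estimate_total_seconds(elapsed_seconds: float, fraction_done: float) -> float:
    """Linearly extrapolate total runtime from elapsed time and progress."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_seconds / fraction_done


# Timings reported in this issue.
argtopk_elapsed = 3 * 60 + 40  # argtopk: 220 seconds to reach 4%
topk_total = 1 * 60 + 40       # topk: 100 seconds to finish

# Assumption: progress grows linearly with elapsed time.
argtopk_total = estimate_total_seconds(argtopk_elapsed, 0.04)

print(f"argtopk estimate: {argtopk_total / 60:.0f} min")
print(f"slowdown vs topk: {argtopk_total / topk_total:.0f}x")
```

Under that assumption, the argtopk run would be well over an order of magnitude slower than the topk run on the same input.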

Is this a known issue with argtopk, or is there anything we can do to optimize execution?
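For anyone who wants to compare the two calls without waiting on the full-size problem, here is a scaled-down sketch. The array sizes, chunk sizes, and the `timed_compute` helper are assumptions chosen so it finishes quickly; they are not the values from the script above, and the timings it prints will depend on the machine and dask version.

```python
import time

import numpy as np
import dask.array as da

# Scaled-down stand-ins for the arrays in the sample script (sizes are assumptions).
rng = np.random.default_rng(0)
a = da.from_array(rng.normal(size=(2000, 100)), chunks=(500, 100))
b = da.from_array(rng.normal(size=(100, 2000)), chunks=(100, 500))
c = a.dot(b)


def timed_compute(arr):
    """Compute a dask array and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = arr.compute()
    return result, time.perf_counter() - start


values, topk_seconds = timed_compute(c.topk(5, axis=-1))
indices, argtopk_seconds = timed_compute(c.argtopk(5, axis=-1))

# Sanity check: the indices from argtopk should select the values from topk.
dense = c.compute()
assert np.allclose(np.take_along_axis(dense, indices, axis=-1), values)

print(f"topk:    {topk_seconds:.2f}s")
print(f"argtopk: {argtopk_seconds:.2f}s")
```

Building the inputs with `da.from_array` from a seeded NumPy generator keeps the data identical across both runs, so any timing gap comes from the reduction itself rather than from the random inputs.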
