compute skew when concurrent queries #1743

@imay

When running a high-concurrency test against Doris, we found that one backend's CPU was completely consumed, while the other backends had 60% CPU idle.

The test query looks like "select day, sum(pv) from tbl where day = 20190904 group by day". After looking into the backend logs, I found that the second-phase aggregation of every query was processed on the exhausted backend.

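This is because Doris always assigns the same backend to a given aggregation/join shuffle slot. When a query's group-by cardinality is smaller than the number of backends, this causes compute skew: the few distinct keys hash to the same few slots in every query, so the same backends receive all of the second-phase work.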
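
To make the effect concrete, here is a minimal simulation sketch. It is not Doris code: the backend names, the modulo "hash", and the `rotated_slots` alternative are all assumptions made for illustration. It only shows how a fixed slot-to-backend order sends every single-key query to one backend, and how varying the order per query would spread the load.

```python
from collections import Counter

# Hypothetical backend names; a real cluster would supply its own list.
BACKENDS = ["be1", "be2", "be3", "be4"]


def fixed_slots(query_id, backends):
    """Every query sees the same slot -> backend order (the skewed case)."""
    return list(backends)


def rotated_slots(query_id, backends):
    """Illustrative alternative (an assumption): rotate the order by query id."""
    shift = query_id % len(backends)
    return backends[shift:] + backends[:shift]


def simulate(assign_slots, num_queries, group_key=20190904):
    """Count how many second-phase aggregations land on each backend."""
    load = Counter()
    for query_id in range(num_queries):
        slots = assign_slots(query_id, BACKENDS)
        # A single group-by value always maps to the same slot index.
        # Plain modulo stands in for the real shuffle hash here.
        slot_index = group_key % len(slots)
        load[slots[slot_index]] += 1
    return load


if __name__ == "__main__":
    print("fixed:  ", dict(simulate(fixed_slots, 1000)))
    print("rotated:", dict(simulate(rotated_slots, 1000)))
```

With the fixed order, all 1000 simulated queries land on `be1`; with the rotated order, each of the four backends receives 250.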
