
Improve the performance of new collation related functions/executors #5294

@windtalker

Enhancement

TiDB has enabled new collation by default since 6.0. However, we found that the performance of string columns drops significantly in TiFlash when new collation is enabled, including:

Benchmark

TPCH

  • data tpch-100
  • TiFlash x 1
  • TIFLASH REPLICA 1
Time is in seconds.

  • Original: rollback all PRs in #5294 from commit a0f9865
  • Improvement: (Original) / (Optimized) - 1.0
  • Improvement (No LTO): (Original + No LTO) / (Optimized) - 1.0, where LTO is Link Time Optimization

| Query | Original | Optimized | Improvement | Notes | Original + No LTO | Improvement (No LTO) |
|---|---|---|---|---|---|---|
| Q1 | 9.09 | 8.42 | 7.96% | AGG() by multi STR; COLLATION | 10 | 18.76% |
| Q2 | 2.45 | 2.38 | 2.94% | | 2.52 | 5.88% |
| Q3 | 5.6 | 5.47 | 2.38% | | 5.6 | 2.38% |
| Q4 | 6.14 | 6.07 | 1.15% | | 6.24 | 2.80% |
| Q5 | 13.52 | 13.05 | 3.60% | | 13.52 | 3.60% |
| Q6 | 1.98 | 1.98 | 0.00% | | 2.01 | 1.52% |
| Q7 | 6.34 | 6.14 | 3.26% | | 6.51 | 6.03% |
| Q8 | 8.69 | 8.36 | 3.95% | | 8.93 | 6.82% |
| Q9 | 38.42 | 38.49 | -0.18% | | 38.82 | 0.86% |
| Q10 | 6.95 | 6.61 | 5.14% | | 7.58 | 14.67% |
| Q11 | 1.64 | 1.58 | 3.80% | | 1.71 | 8.23% |
| Q12 | 4.4 | 4.26 | 3.29% | | 4.46 | 4.69% |
| Q13 | 8.42 | 7.82 | 7.67% | LIKE(); COLLATION | 8.42 | 7.67% |
| Q14 | 2.11 | 2.11 | 0.00% | | 2.21 | 4.74% |
| Q15 | 4.46 | 4.46 | 0.00% | | 4.73 | 6.05% |
| Q16 | 2.25 | 2.11 | 6.64% | LIKE(); COLLATION | 2.28 | 8.06% |
| Q17 | 13.32 | 12.78 | 4.23% | | 13.32 | 4.23% |
| Q18 | 18.09 | 17.41 | 3.91% | | 18.49 | 6.20% |
| Q19 | 5.54 | 4.66 | 18.88% | COLLATION | 5.6 | 20.17% |
| Q20 | 2.99 | 2.92 | 2.40% | | 3.02 | 3.42% |
| Q21 | 24.73 | 24.26 | 1.94% | | 25.4 | 4.70% |
| Q22 | 1.85 | 1.78 | 3.93% | | 1.91 | 7.30% |
| SUM | 188.98 | 183.12 | 3.20% | | 193.28 | 5.55% |
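
As a quick sanity check on how the Improvement column is derived, the formula from the table header, (Original) / (Optimized) - 1.0, can be sketched in a few lines. The function name `improvement` is a hypothetical helper introduced only for this illustration; the input values are copied from the TPCH results above.

```python
def improvement(original_s: float, optimized_s: float) -> float:
    """Relative improvement as defined in the table header:
    (Original) / (Optimized) - 1.0. Both inputs are times in seconds."""
    return original_s / optimized_s - 1.0


# Values copied from the TPCH results (Q19 and SUM rows).
q19 = improvement(5.54, 4.66)
total = improvement(188.98, 183.12)

print(f"Q19: {q19:.2%}")    # Q19: 18.88%
print(f"SUM: {total:.2%}")  # SUM: 3.20%
```

A negative result, as in Q9, means the optimized build was slower than the original for that query.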

ClickBench

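  • data ClickBench
  • TiFlash x 1
  • TIFLASH REPLICA 1

Time is in seconds.

  • Original: rollback all PRs in #5294 from commit a0f9865
  • Improvement: (Original) / (Optimized) - 1.0
  • Use collation: Y (yes)

| Query | Original | Optimized | Notes | Improvement | Use collation |
|---|---|---|---|---|---|
| Q1 | 0.276 | 0.277 | | -0.36% | |
| Q2 | 0.029 | 0.0301 | | -3.65% | |
| Q3 | 0.0675 | 0.0653 | | 3.37% | |
| Q4 | 0.1813 | 0.1788 | | 1.40% | |
| Q5 | 2.285 | 2.255 | | 1.33% | |
| Q6 | 1.46 | 1.36 | | 7.35% | Y |
| Q7 | 0.1689 | 0.1693 | | -0.24% | |
| Q8 | 0.0391 | 0.0381 | | 2.62% | |
| Q9 | 1.205 | 1.175 | | 2.55% | |
| Q10 | 2.005 | 2 | | 0.25% | |
| Q11 | 0.2722 | 0.2521 | | 7.97% | Y |
| Q12 | 0.2936 | 0.2731 | | 7.51% | Y |
| Q13 | 1.03 | 0.9916 | | 3.87% | Y |
| Q14 | 1.96 | 1.87 | | 4.81% | Y |
| Q15 | 1.105 | 1.06 | | 4.25% | Y |
| Q16 | 1.08 | 1.025 | | 5.37% | |
| Q17 | 3.475 | 3.36 | | 3.42% | Y |
| Q18 | 2.865 | 2.77 | | 3.43% | Y |
| Q19 | 0 | 0 | TiFlash does not support extract | | |
| Q20 | 0.5935 | 0.5773 | | 2.81% | |
| Q21 | 3.45 | 0.8726 | | 295.37% | Y |
| Q22 | 3.57 | 1.0151 | | 251.69% | Y |
| Q23 | 6.645 | 1.815 | | 266.12% | Y |
| Q24 | 6.665 | 4.945 | | 34.78% | Y |
| Q25 | 0.4399 | 0.3675 | | 19.70% | Y |
| Q26 | 0.2029 | 0.1737 | | 16.81% | Y |
| Q27 | 0.4221 | 0.3731 | | 13.13% | Y |
| Q28 | 1.345 | 1.305 | | 3.07% | Y |
| Q29 | 0 | 0 | TiFlash does not support regexp_replace | | Y |
| Q30 | 9.655 | 9.54 | | 1.21% | |
| Q31 | 0.8385 | 0.7974 | | 5.15% | Y |
| Q32 | 1.195 | 1.185 | | 0.84% | Y |
| Q33 | 6.98 | 6.915 | | 0.94% | |
| Q34 | 6.16 | 5.945 | | 3.62% | Y |
| Q35 | 6.115 | 5.815 | | 5.16% | Y |
| Q36 | 1.385 | 1.37 | | 1.09% | |
| Q37 | 0.2158 | 0.2122 | | 1.70% | Y |
| Q38 | 0.1363 | 0.1328 | | 2.64% | Y |
| Q39 | 0.1134 | 0.1071 | | 5.88% | Y |
| Q40 | 0.4411 | 0.4261 | | 3.52% | Y |
| Q41 | 0.0754 | 0.0746 | | 1.07% | |
| Q42 | 0.0572 | 0.0565 | | 1.24% | |
| Q43 | 0.1397 | 0.1341 | | 4.18% | |
| SUM | 76.6384 | 63.3055 | | 21.06% | |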

Ossinsight

Time is in seconds.

  • Original: rollback all PRs in #5294 from commit a0f9865
  • Improvement: (Original) / (Optimized) - 1.0

| Case | Query | Original | Optimized | Improvement |
|---|---|---|---|---|
| tpch-100: str comparison | `select count(1) from lineitem where L_SHIPMODE = 'zzzz';` | 1.07 | 0.73 | 46.58% |
| tpch-100: str comparison | `select count(1) from lineitem where L_RETURNFLAG = 'R';` | 1.16 | 0.66 | 75.76% |
| tpch-100: str sort | `select min(L_SHIPMODE) from lineitem;` | 1.08 | 0.83 | 30.12% |
| tpch-100: str sort | `select max(L_SHIPMODE) from lineitem;` | 1.31 | 1.08 | 21.30% |
