planner: refactor the implementation of cost calculation

## Enhancement
The current implementation of cost calculation is **tightly coupled with physical optimization**. 
In physical optimization, the optimizer changes the plan structure frequently and updates the current cost correspondingly.
Here is an example:
```
func addPushedDownSelection(t *task) {
    tableReader := t.Plan.(*TableReader)
    pushedDownFilters := PushDownExprs(tableReader.filters)    // PHY-OPT
    selCost := t.Count() * CopCPUFactor                        // COST
    selection := NewSelection(pushedDownFilters)               // PHY-OPT
    selection.cost = selCost                                   // COST
    t.cost += selCost                                          // COST
    selection.Child = tableReader.Child                        // PHY-OPT
    tableReader.Child = selection                              // PHY-OPT
}
```
`addPushedDownSelection` pushes a `selection` down below the table reader. The `COST` and `PHY-OPT` comments mark whether each line performs cost calculation or a plan change in physical optimization. The two kinds of lines are interleaved, which shows how tightly they are coupled.
This implementation is hard to maintain and has already caused many issues: #32675 #32672 #32698 #27189 #30103 #32362.

To solve this problem thoroughly, we decided to refactor the implementation of cost calculation.
A new interface `Plan.CalCost() float64` will be introduced, and all code related to cost calculation will be removed from physical optimization. The example above then becomes:
```
func addPushedDownSelection(t *task) {
    tableReader := t.Plan.(*TableReader)
    pushedDownFilters := PushDownExprs(tableReader.filters)    // PHY-OPT
    selection := NewSelection(pushedDownFilters)               // PHY-OPT
    selection.Child = tableReader.Child                        // PHY-OPT
    tableReader.Child = selection                              // PHY-OPT
}
```
All code related to cost calculation will be moved into the new, standard interface `Plan.CalCost() float64`.
Below is an example for `IndexReader`:
```
func (p *PhysicalIndexReader) CalPlanCost(taskType property.TaskType) float64 {
	p.planCost = p.indexPlan.CalPlanCost(property.CopSingleReadTaskType)   // COST: child's cost
	p.planCost += p.indexPlan.StatsCount() * getRowSize(p) * netFactor     // COST: net I/O cost
	p.planCost += getSeekCost(p)                                           // COST: net seek cost
	p.planCost /= float64(p.ctx.GetSessionVars().DistSQLScanConcurrency()) // COST: consider concurrency
	return p.planCost
}
```
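
To make the recursive shape of such an interface concrete, here is a minimal, self-contained sketch. It is not TiDB code: the `plan` interface, the `indexScan` and `indexReader` types, and the `scanFactor` and `netFactor` constants are all hypothetical names and values chosen for illustration. It only shows the idea that each operator derives its cost from its child's cost plus its own contribution.

```go
package main

import "fmt"

// plan is a hypothetical, stripped-down stand-in for a physical plan node.
type plan interface {
	calCost() float64
	rowCount() float64
}

// Illustrative factors only; they are not TiDB's real cost factors.
const (
	scanFactor = 2.0
	netFactor  = 1.0
)

// indexScan is a leaf: its cost depends only on its own row count.
type indexScan struct{ rows float64 }

func (s *indexScan) rowCount() float64 { return s.rows }
func (s *indexScan) calCost() float64  { return s.rows * scanFactor }

// indexReader wraps a child plan and derives its cost from the child's cost.
type indexReader struct {
	child       plan
	rowSize     float64
	concurrency int
}

func (r *indexReader) rowCount() float64 { return r.child.rowCount() }

func (r *indexReader) calCost() float64 {
	cost := r.child.calCost()                          // child's cost
	cost += r.child.rowCount() * r.rowSize * netFactor // network transfer cost
	return cost / float64(r.concurrency)               // spread over workers
}

func main() {
	reader := &indexReader{child: &indexScan{rows: 1000}, rowSize: 8, concurrency: 4}
	fmt.Println(reader.calCost()) // (1000*2 + 1000*8*1) / 4 = 2500
}
```

Because the cost is a pure function of the plan tree in this sketch, no operator needs to store or patch a running total while the tree is being built.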

After this refactoring, the cost model will be easier to maintain and calibrate.
The cost of a plan also becomes re-computable: `Plan.CalCost()` can be invoked multiple times from different places, so other modules such as `PlanCache`, `SPM`, `PlanRewriter`, and `JoinReorder` may benefit from it as well.
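
As a rough illustration of what "re-computable" could look like, the sketch below changes the plan structure without any cost bookkeeping and then simply asks for the cost again. Everything here is an assumption made for the example: the `plan` interface, the `tableScan`, `selection`, and `tableReader` types, the `pushDownSelection` helper, and the factor values are hypothetical and do not mirror TiDB's actual types or cost factors.

```go
package main

import "fmt"

// plan is a hypothetical, simplified plan node interface.
type plan interface {
	calCost() float64
	rowCount() float64
}

// Illustrative factors only; not TiDB's real values.
const (
	scanFactor = 1.0
	cpuFactor  = 0.5
	netFactor  = 2.0
)

type tableScan struct{ rows float64 }

func (s *tableScan) rowCount() float64 { return s.rows }
func (s *tableScan) calCost() float64  { return s.rows * scanFactor }

// selection filters its child's rows and pays CPU cost per input row.
type selection struct {
	child       plan
	selectivity float64
}

func (s *selection) rowCount() float64 { return s.child.rowCount() * s.selectivity }
func (s *selection) calCost() float64 {
	return s.child.calCost() + s.child.rowCount()*cpuFactor
}

// tableReader pays network cost for every row its child produces.
type tableReader struct{ child plan }

func (r *tableReader) rowCount() float64 { return r.child.rowCount() }
func (r *tableReader) calCost() float64 {
	return r.child.calCost() + r.child.rowCount()*netFactor
}

// pushDownSelection only changes the plan structure; it does no cost accounting.
func pushDownSelection(r *tableReader, selectivity float64) {
	r.child = &selection{child: r.child, selectivity: selectivity}
}

func main() {
	reader := &tableReader{child: &tableScan{rows: 1000}}
	before := reader.calCost() // 1000*1 + 1000*2 = 3000
	pushDownSelection(reader, 0.25)
	after := reader.calCost() // 1000 + 1000*0.5 + 250*2 = 2000
	fmt.Println(before, after)
}
```

Calling `calCost` again after the rewrite yields a value consistent with the new tree, which is the property that callers outside physical optimization would rely on.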

Here is [a detailed design doc](https://pingcap.feishu.cn/docs/doccnq4pL5HauSrwlJ0ExPb0trb) (in Chinese) about this.
Here is [the demo](https://github.com/pingcap/tidb/pull/33563).

planner: refactor the implementation of cost calculation #33945

Enhancement

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

planner: refactor the implementation of cost calculation #33945

Description

Enhancement

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions