Support internal function to_binary() for new charset #29863

@tangenta

The code changes required for built-in functions in TiKV can be minimized.

After investigating several built-in functions, I found some common patterns:

  1. Create a new encoding.
  2. Convert the arguments from utf8 into the new charset.
  3. Apply these changes to the vectorized-version evaluation.
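
To illustrate why step 2 matters for byte-oriented functions such as hex() and length(), here is a minimal sketch. It uses ISO-8859-1 (Latin-1) only because the conversion can be written with the Go standard library alone; the helper name toLatin1 is hypothetical and is not part of TiDB or TiKV.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// toLatin1 is a hypothetical stand-in for the conversion step: it re-encodes
// a UTF-8 string as ISO-8859-1 bytes and returns an error for any rune that
// the target charset cannot represent.
func toLatin1(s string) ([]byte, error) {
	out := make([]byte, 0, len(s))
	for _, r := range s {
		if r > 0xFF {
			return nil, fmt.Errorf("rune %q cannot be encoded in latin1", r)
		}
		out = append(out, byte(r))
	}
	return out, nil
}

func main() {
	s := "é" // 0xC3 0xA9 in UTF-8, 0xE9 in Latin-1
	b, err := toLatin1(s)
	if err != nil {
		panic(err)
	}
	// Byte length and hex dump of the UTF-8 bytes, then of the converted bytes.
	fmt.Println(len(s), hex.EncodeToString([]byte(s)))
	fmt.Println(len(b), hex.EncodeToString(b))
}
```

The same string yields different byte lengths and hex dumps depending on the charset, so a function that inspects raw bytes has to see the converted argument rather than the internal utf8 one.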

We can extract these patterns into a separate Sig named builtinToBinarySig (to_binary()) and wrap the cast function as follows:

The rewrite process before this proposal:

select some_builtin(arg1, arg2...)

is rewritten to

select some_builtin(cast_as_string(arg1), cast_as_string(arg2)...)

The rewrite process after this proposal:

select some_builtin(arg1, arg2...)

is rewritten to

select some_builtin(to_binary(cast_as_string(arg1)), to_binary(cast_as_string(arg2))...)

Then we can maintain a list of built-in functions to determine which ones need to be wrapped, instead of repeating the boilerplate everywhere.

var toBinaryMap = map[string]struct{}{
	ast.Hex: {}, ast.Length: {}, ast.OctetLength: {}, ast.ASCII: {},
	ast.ToBase64: {},
}
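
A rough sketch of how such a list could drive the rewrite, on a toy expression tree. Expr, wrap, rewriteCall, and toBinarySet are all hypothetical names made up for this illustration, and the lowercase function-name strings are an assumption about the values of the ast constants. This is not the actual planner code.

```go
package main

import (
	"fmt"
	"strings"
)

// Expr is a hypothetical, simplified expression node: a leaf has only a Name,
// a function call has a Name plus Args.
type Expr struct {
	Name string
	Args []*Expr
}

// String renders the expression in call syntax, e.g. f(g(x), y).
func (e *Expr) String() string {
	if len(e.Args) == 0 {
		return e.Name
	}
	parts := make([]string, len(e.Args))
	for i, a := range e.Args {
		parts[i] = a.String()
	}
	return e.Name + "(" + strings.Join(parts, ", ") + ")"
}

// toBinarySet mirrors toBinaryMap; plain strings stand in for the ast
// constants (assumed values, for illustration only).
var toBinarySet = map[string]struct{}{
	"hex": {}, "length": {}, "octet_length": {}, "ascii": {}, "to_base64": {},
}

// wrap builds a single-argument call name(arg).
func wrap(name string, arg *Expr) *Expr {
	return &Expr{Name: name, Args: []*Expr{arg}}
}

// rewriteCall wraps every argument in cast_as_string and, only for functions
// in toBinarySet, additionally in to_binary. The input node is not modified.
func rewriteCall(call *Expr) *Expr {
	_, needsBinary := toBinarySet[strings.ToLower(call.Name)]
	out := &Expr{Name: call.Name, Args: make([]*Expr, len(call.Args))}
	for i, arg := range call.Args {
		e := wrap("cast_as_string", arg)
		if needsBinary {
			e = wrap("to_binary", e)
		}
		out.Args[i] = e
	}
	return out
}

func main() {
	col := &Expr{Name: "arg1"}
	fmt.Println(rewriteCall(&Expr{Name: "hex", Args: []*Expr{col}}))
	fmt.Println(rewriteCall(&Expr{Name: "upper", Args: []*Expr{col}}))
}
```

In this sketch the decision lives in one lookup, so supporting another byte-oriented function only requires adding its name to the set.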
