#4937 introduced logic to infer the Unicode-ness of literals when they are compared to a column. It works well for simple cases where one side of the comparison is a ColumnExpression. However, there are more complex cases in which we do not currently infer this information correctly. For example:
- If one side of a binary expression is a subquery that selects a single element of a column, the other side should have the same facets as the column projected in the subquery.
- Operators like concat, which return string-typed values, should pass Unicode-ness information up to their parent, because the siblings of a concat should have the same Unicode-ness as the concat itself. (See test Non_unicode_string_literals_is_used_for_non_unicode_column_with_concat.)
For cases like the above, we can improve the logic to propagate this information. If that turns out to be too hard, we should add functions like AsUnicode and AsNonUnicode.
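
To make the intended propagation concrete, here is a rough, language-neutral sketch in Python. It is a toy model, not EF Core code: the node classes (`Column`, `Literal`, `Concat`, `ScalarSubquery`, `Equal`) and the `infer`, `apply_facet`, and `propagate` helpers are all hypothetical names invented for illustration. A literal with `is_unicode` set up front stands in for an explicit AsUnicode/AsNonUnicode call.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical expression nodes; none of these are EF Core types.
@dataclass
class Column:
    name: str
    is_unicode: bool


@dataclass
class Literal:
    value: str
    # None means "not yet inferred"; a preset value models an explicit override.
    is_unicode: Optional[bool] = None


@dataclass
class Concat:
    left: object
    right: object


@dataclass
class ScalarSubquery:
    projection: object  # the single projected expression


@dataclass
class Equal:
    left: object
    right: object


def infer(node) -> Optional[bool]:
    """Return the Unicode-ness known beneath `node`, or None if unknown."""
    if isinstance(node, (Column, Literal)):
        return node.is_unicode
    if isinstance(node, ScalarSubquery):
        # A scalar subquery takes the facets of the column it projects.
        return infer(node.projection)
    if isinstance(node, (Concat, Equal)):
        # String-returning operators pass the information up to the parent.
        left = infer(node.left)
        return left if left is not None else infer(node.right)
    return None


def apply_facet(node, is_unicode: bool) -> None:
    """Push the inferred Unicode-ness down onto literals that lack one."""
    if isinstance(node, Literal):
        if node.is_unicode is None:
            node.is_unicode = is_unicode
    elif isinstance(node, (Concat, Equal)):
        apply_facet(node.left, is_unicode)
        apply_facet(node.right, is_unicode)


def propagate(comparison):
    """Infer a facet for the whole comparison, then apply it to its literals."""
    facet = infer(comparison)
    if facet is not None:
        apply_facet(comparison, facet)
    return comparison


# Concat case: both literals should end up non-Unicode, like the column.
prefix, other_side = Literal("Mr. "), Literal("Mr. Smith")
propagate(Equal(Concat(prefix, Column("Name", is_unicode=False)), other_side))

# Subquery case: the literal picks up the facet of the projected column.
code = Literal("abc")
propagate(Equal(code, ScalarSubquery(Column("Code", is_unicode=False))))
```

The sketch simply takes the first known facet it finds, left to right. How a real implementation should resolve a conflict between two sides with different Unicode-ness is left open here.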