Previously, the NULL data type occurred only in artificial circumstances, e.g. when a user wrote ROW x = null | EVAL y = sqrt(x) | ....
I don't think we gave much thought to the type resolution here. It seems to mostly make sense, but deserves double-checking now that #140463 made it way more likely that NULL type columns occur in queries.
Example to consider:
ROW x = null | STATS y = sum(x) and SET unmapped_fields="nullify"; FROM idx | STATS y = sum(missing) produce a DOUBLE column; should they? sum can output either long or double depending on the input types; given that null columns happen in case of missing fields now, maybe it's more correct to mark sum(x) as NULL type as well?
ROW x = null | EVAL y = 2*x produces an INTEGER column. That looks sane, but in case x is a missing column, it would be more truthful to say "we don't know the type of y because multiplication * auto-casts".
In any case, we should go and double check that our type resolution is properly tested with the NULL data type.
Previously, the
NULLdata type occurred only in artificial circumstances, e.g. when a user wroteROW x = null | EVAL y = sqrt(x) | ....I don't think we gave much thought to the type resolution here. It seems to mostly make sense, but deserves double-checking now that #140463 made it way more likely that
NULLtype columns occur in queries.Example to consider:
ROW x = null | STATS y = sum(x)andSET unmapped_fields="nullify"; FROM idx | STATS y = sum(missing)produce aDOUBLEcolumn; should they?sumcan output either long or double depending on the input types; given that null columns happen in case of missing fields now, maybe it's more correct to marksum(x)asNULLtype as well?ROW x = null | EVAL y = 2*xproduces anINTEGERcolumn. That looks sane, but in casexis a missing column, it would be more truthful to say "we don't know the type ofybecause multiplication*auto-casts".In any case, we should go and double check that our type resolution is properly tested with the
NULLdata type.