Relates #138888
Relates #141912
When a field is partially mapped, but not as KEYWORD, and also not used in a cast, we have inconsistent behavior. As seen below, we sometimes fail the query with an error message, but sometimes we just fill with nulls for the unmapped fields. We should have consistent behavior here.
When field is unmapped in idx1 but mapped as, say, long in idx2, the field should be usable when cast, that is, in expressions like field::long; that case is tracked in #141912.
This issue is about the case where field occurs without a cast, like here:
SET unmapped_fields="load";
FROM idx1, idx2 | KEEP field
SET unmapped_fields="load";
FROM idx1, idx2 | WHERE field > 10
We have the following options for behavior:
- Consistently with union types, fail the query if field is used anywhere except for KEEP/DROP. If it's used only in KEEP/DROP, we only show the values from idx2, consistently with unmapped_fields="nullify" and fail.
- Fail the query if field is used anywhere except for KEEP/DROP. Even more consistently with union types, if it's used only in KEEP/DROP, we fill the complete field with nulls. This means less data is loaded than even for nullify and fail because we don't see the values from idx2.
- Allow using field everywhere and fill the value for docs from idx1 with nulls but load the long values from idx2. That'd mean silently not loading data from idx1, which is certainly a footgun. But unmapped_fields="nullify" and fail also don't fail here, so we'd have fewer queries that execute with fail but fail with load.
- @GalLalouche and @alex-spies think that the best possible behavior would be to auto-cast field to long. But auto-casting is out of scope here.
  |--> @alex-spies : Since we're postponing this issue, auto-casting is back on the table IMHO.
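To make the difference between the first three options concrete, here is a rough Python model of what each one would return for field. It is only a sketch of the semantics described above, not ES|QL code: the two sample docs, the resolve_field helper and the QueryFailure exception are made up for illustration, and the auto-cast option is deliberately not modelled.

```python
from typing import Optional

# Hypothetical sample data: idx1 has `field` only in _source (unmapped),
# idx2 maps `field` as long.
DOCS = [
    {"_index": "idx1", "field": 5},
    {"_index": "idx2", "field": 10},
]
MAPPED_AS_LONG = {"idx2"}


class QueryFailure(Exception):
    """Stands in for the query being rejected with an error."""


def resolve_field(option: int, used_outside_keep_drop: bool) -> list[Optional[int]]:
    """Return the per-doc values of `field` under option 1, 2 or 3, or raise."""
    if option not in (1, 2, 3):
        raise ValueError("only options 1-3 are modelled")
    if option in (1, 2) and used_outside_keep_drop:
        # Options 1 and 2: any use beyond KEEP/DROP fails the query.
        raise QueryFailure("field is partially mapped and used without a cast")
    if option == 2:
        # KEEP/DROP only: the whole column is null, as with union types.
        return [None for _ in DOCS]
    # Option 1 (KEEP/DROP only) and option 3 (any usage):
    # null for docs from idx1, the mapped long value for docs from idx2.
    return [
        doc["field"] if doc["_index"] in MAPPED_AS_LONG else None
        for doc in DOCS
    ]
```

Under this model, options 1 and 3 both lose the value from idx1, and option 2 additionally loses the value from idx2; they differ mainly in whether a query like WHERE field > 10 fails or runs.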
Note, this is currently buggy, anyway:
curl -u elastic:password -H "Content-Type: application/json" "127.0.0.1:9200/test" -XPUT -d '{
  "mappings": {
    "dynamic": false,
    "properties": {
      "foo": {"type": "long"},
      "bar": {"type": "integer"}
    }
  }
}'
curl -u elastic:password -H "Content-Type: application/json" "127.0.0.1:9200/test2" -XPUT -d '{
  "mappings": {
    "dynamic": false,
    "properties": {
      "bar": {"type": "integer"}
    }
  }
}'
curl -u elastic:password -HContent-Type:application/json 'localhost:9200/test/_doc?refresh' -d'{"foo": 5, "bar": 10}'
curl -u elastic:password -HContent-Type:application/json 'localhost:9200/test2/_doc?refresh' -d'{"foo": 10, "bar": 10}'
curl -u elastic:password -H "Content-Type: application/json" "127.0.0.1:9200/_query?format=txt" -d '{
  "query": "SET unmapped_fields=\"load\"; FROM test, test2 | keep *"
}'
{"error":{"root_cause":[{"type":"illegal_state_exception","reason":"Found 1 problem\nline 1:48: Plan [Project[[bar{f}#601, !foo]]] optimized incorrectly due to missing references [!foo]"}],"type":"illegal_state_exception","reason":"Found 1 problem\nline 1:48: Plan [Project[[bar{f}#601, !foo]]] optimized incorrectly due to missing references [!foo]"},"status":500}