Today, we handle exists queries by indexing the names of all indexed or doc-valued fields under the _field_names field, and later use that field to execute the queries.
However, now that doc values have good iterators, we could use them to execute exists queries instead. So I'm considering adding a new MappedFieldType#existsQuery API and changing the mapping parsing logic so that it would look roughly like this for keyword fields:
    @Override
    public Query existsQuery(QueryShardContext context) {
        if (hasDocValues()) {
            // doc values can tell us directly which documents have a value
            return new DocValuesFieldExistsQuery(name());
        } else {
            // this fails if _field_names is disabled
            return context.fieldMapper(FieldNamesFieldMapper.NAME).termQuery(name(), context);
        }
    }

    [...]

    @Override
    protected void parseCreateField(ParseContext context, List<IndexableField> fields) throws IOException {
        [...] // handling of parsing, indexed and stored values
        if (fieldType().hasDocValues()) {
            fields.add(new SortedSetDocValuesField(fieldType().name(), binaryValue));
        } else if (fieldNamesIsEnabled) {
            // only record the field name under _field_names when there are no doc values
            fields.add(new StringField(FieldNamesFieldMapper.NAME, fieldType().name(), Field.Store.NO));
        }
    }
This way, we would index far fewer values into the _field_names field, probably none at all with the default mappings: text fields that have a value can be found through their norms (which are on by default), and other fields have doc values by default.
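
To make the fallback order explicit, here is a minimal, self-contained sketch of the decision described above. It is a hypothetical model and not Elasticsearch code: the class ExistsSourceSketch, the Source enum and the three boolean parameters are names made up for illustration, and it assumes that norms would be consulted only when doc values are absent.

```java
/** Hypothetical model (not Elasticsearch code) of which structure could answer an exists query. */
final class ExistsSourceSketch {

    /** The structures a field could rely on to report "this document has a value". */
    enum Source { DOC_VALUES, NORMS, FIELD_NAMES, UNSUPPORTED }

    private ExistsSourceSketch() {}

    /**
     * Picks a per-field structure when one is available and falls back to
     * _field_names only when neither doc values nor norms exist.
     */
    static Source choose(boolean hasDocValues, boolean hasNorms, boolean fieldNamesEnabled) {
        if (hasDocValues) {
            return Source.DOC_VALUES;
        }
        if (hasNorms) {
            return Source.NORMS;
        }
        if (fieldNamesEnabled) {
            return Source.FIELD_NAMES;
        }
        // Nothing records which documents have a value for this field.
        return Source.UNSUPPORTED;
    }

    /** True only when the field name would still have to be indexed under _field_names. */
    static boolean needsFieldNamesEntry(boolean hasDocValues, boolean hasNorms, boolean fieldNamesEnabled) {
        return choose(hasDocValues, hasNorms, fieldNamesEnabled) == Source.FIELD_NAMES;
    }
}
```

Under these assumptions, a keyword field with default settings (doc values on) maps to DOC_VALUES and a text field with default settings (norms on, no doc values) maps to NORMS, so neither would add an entry to _field_names. Only a field with both disabled would still need one.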