Make the ability to create exists queries work on a per-field basis #26770

@jpountz

Today, our approach to handling exists queries is to index the names of all indexed/doc-valued fields under the _field_names field and later use that field to execute exists queries.

However, now that doc values have good iterators, we could use them to execute exists queries. So I'm considering adding a new MappedFieldType#existsQuery API and changing the mapping parsing logic so that it would look roughly like this for keyword fields:

    @Override
    public Query existsQuery(QueryShardContext context) {
        if (hasDocValues()) {
            return new DocValuesFieldExistsQuery(name());
        } else {
            // this fails if _field_names is disabled
            return context.fieldMapper(FieldNamesFieldMapper.NAME).termQuery(name(), context);
        }
    }

    [...]

    @Override
    protected void parseCreateField(ParseContext context, List<IndexableField> fields) throws IOException {
        [...] // handling of parsing, indexed and stored values
        if (fieldType().hasDocValues()) {
            fields.add(new SortedSetDocValuesField(fieldType().name(), binaryValue));
        } else if (fieldNamesIsEnabled) {
            fields.add(new StringField(FieldNamesFieldMapper.NAME, fieldType().name(), Field.Store.NO));
        }
    }

This way, we would index far fewer values into the _field_names field, probably even none with the default mappings: text fields that have a value could be found through their norms (which are on by default), and other fields have doc values by default.
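To make the expected effect on _field_names concrete, here is a minimal, library-free sketch of the per-field decision described above. It is only an illustration: `ExistsStrategySketch`, `FieldConfig`, `Strategy`, `strategyFor` and `fieldNamesEntries` are hypothetical names, not Elasticsearch or Lucene APIs, and the order of preference (doc values, then norms, then _field_names) is an assumption based on this proposal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ExistsStrategySketch {

    /** How an exists query on a field would be answered (hypothetical model). */
    enum Strategy { DOC_VALUES, NORMS, FIELD_NAMES }

    /** Hypothetical stand-in for the relevant bits of a field's mapping. */
    static final class FieldConfig {
        final String name;
        final boolean hasDocValues;
        final boolean hasNorms;

        FieldConfig(String name, boolean hasDocValues, boolean hasNorms) {
            this.name = name;
            this.hasDocValues = hasDocValues;
            this.hasNorms = hasNorms;
        }
    }

    /** Assumed preference: doc values, then norms, and only then _field_names. */
    static Strategy strategyFor(FieldConfig field) {
        if (field.hasDocValues) {
            return Strategy.DOC_VALUES;
        }
        if (field.hasNorms) {
            return Strategy.NORMS;
        }
        return Strategy.FIELD_NAMES;
    }

    /** Names that would still have to be indexed under _field_names. */
    static List<String> fieldNamesEntries(List<FieldConfig> fields) {
        List<String> names = new ArrayList<>();
        for (FieldConfig field : fields) {
            if (strategyFor(field) == Strategy.FIELD_NAMES) {
                names.add(field.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<FieldConfig> mapping = Arrays.asList(
            new FieldConfig("tag", true, false),   // keyword-like: doc values on
            new FieldConfig("body", false, true),  // text-like: norms on
            new FieldConfig("raw", false, false)); // both disabled explicitly
        System.out.println(fieldNamesEntries(mapping)); // prints [raw]
    }
}
```

In this toy mapping only the field with both doc values and norms disabled still needs an entry in _field_names, which matches the expectation that default mappings would add nothing to it.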
