bug(clp-s): Handle pruning when dynamically expanding wildcards during evaluation. #907

@gibber9809

Bug

Evaluation code generally expects that each column in the query matches at least one column in a schema. For pure wildcard columns, though, this isn't necessarily true: upon dynamically expanding a wildcard, we may find that no column in the schema matches it.

During schema matching we currently only check that at least one column with a type matching the wildcard column exists somewhere in the archive (irrespective of namespace and subtree type), and we don't check for matching types on a per-schema basis. That is, we don't consider pure wildcard columns in intersect_and_sub_expr.

The current behaviour during evaluation is to return false when a wildcard filter matches no columns, but this leads to incorrect results when the wildcard filter is inverted.

For example, the query NOT *: 0 evaluated against the dataset

{"a": "b"}
{"c": 0}

will yield the result {"a": "b"}. This result is unexpected: our query semantics evaluate a filter against columns with a matching type, but the returned record has no such column.
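The failure mode can be illustrated with a small Python sketch. This is not the clp-s implementation: the helper names (`wildcard_eq_int`, `not_wildcard_eq_int`) are hypothetical, records are flat dicts, and "matching type" is approximated as "the value is an integer". It only mimics the two-valued behaviour described above, where an empty wildcard expansion evaluates to false.

```python
import json

def wildcard_eq_int(record: dict, value: int) -> bool:
    """Two-valued evaluation of `*: value` (hypothetical helper)."""
    # Dynamically expand the wildcard to the columns whose type matches the
    # integer literal. bool is excluded because it subclasses int in Python.
    candidates = [
        v for v in record.values()
        if isinstance(v, int) and not isinstance(v, bool)
    ]
    # With no candidates, any() returns False: the behaviour described above.
    return any(v == value for v in candidates)

def not_wildcard_eq_int(record: dict, value: int) -> bool:
    """Two-valued evaluation of `NOT *: value`."""
    return not wildcard_eq_int(record, value)

# The dataset from the example above.
records = [json.loads(line) for line in ('{"a": "b"}', '{"c": 0}')]
matches = [r for r in records if not_wildcard_eq_int(r, 0)]
print(matches)  # [{'a': 'b'}]
```

The record {"a": "b"} is returned only because "no matching column" was collapsed into false and then inverted.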

The simplest solution is to update our evaluation code in QueryRunner to use True, False, and Pruned, as we do in kv-ir evaluation, so that dynamic wildcard expansion can result in dynamic pruning of the AST.
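A minimal Python sketch of the three-valued idea follows. It is an illustration under stated assumptions, not the QueryRunner or kv-ir code: `EvalResult`, `eval_wildcard_eq_int`, `eval_not`, and `matches` are hypothetical names, records are flat dicts, NOT is assumed to propagate Pruned unchanged, and a Pruned result at the root is assumed to mean the record is not returned. The third record, {"c": 1}, is added here only to show a case that should match.

```python
from enum import Enum

class EvalResult(Enum):
    TRUE = "True"
    FALSE = "False"
    PRUNED = "Pruned"

def eval_wildcard_eq_int(record: dict, value: int) -> EvalResult:
    """Three-valued evaluation of `*: value` (hypothetical helper)."""
    # Expand the wildcard to columns whose type matches the integer literal.
    candidates = [
        v for v in record.values()
        if isinstance(v, int) and not isinstance(v, bool)
    ]
    if not candidates:
        # Nothing to evaluate against: prune instead of returning False.
        return EvalResult.PRUNED
    if any(v == value for v in candidates):
        return EvalResult.TRUE
    return EvalResult.FALSE

def eval_not(result: EvalResult) -> EvalResult:
    """Invert True/False; assumed here to leave Pruned unchanged."""
    if result is EvalResult.PRUNED:
        return EvalResult.PRUNED
    return EvalResult.FALSE if result is EvalResult.TRUE else EvalResult.TRUE

def matches(record: dict, value: int) -> bool:
    """`NOT *: value`; only a TRUE root result returns the record."""
    return eval_not(eval_wildcard_eq_int(record, value)) is EvalResult.TRUE

records = [{"a": "b"}, {"c": 0}, {"c": 1}]
result = [r for r in records if matches(r, 0)]
print(result)  # [{'c': 1}]
```

Under these assumptions {"a": "b"} is no longer returned, because inverting a pruned filter leaves it pruned rather than turning it into a match.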

CLP version

47ff53a

Environment

ubuntu 22.04 container

Reproduction steps

See the example query and dataset in the Bug section above.

Labels

bug