[Bug] zentity fails to obtain attribute values from object arrays during resolution #85

@davemoore-

Description

Environment

  • zentity version: 1.8.0
  • Elasticsearch version: 7.11.1

Describe the bug

During a resolution job, zentity fails to access attributes whose values appear in an array of objects in the "_source" field of the matching documents. This is likely because zentity uses JsonPointer to read attribute values from documents: JSON Pointer syntax addresses an array element only by its explicit index, so a single pointer cannot select the same field from every object in an array. A potential solution is to replace JsonPointer with JsonPath, which supports a syntax that can return all values within an array.
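The index requirement can be illustrated with a minimal sketch. It uses a small hand-written JSON Pointer (RFC 6901) resolver in Python, not the library zentity itself uses; the names `resolve_pointer` and `MISSING` are hypothetical, and the document is the "_source" of document 1 from the steps to reproduce.

```python
# Hypothetical sentinel returned when a pointer does not resolve.
MISSING = object()


def resolve_pointer(doc, pointer):
    """Resolve an RFC 6901 JSON Pointer against parsed JSON (dicts and lists)."""
    if pointer == "":
        return doc
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    node = doc
    for raw in pointer.split("/")[1:]:
        # Unescape in the order required by RFC 6901: "~1" first, then "~0".
        token = raw.replace("~1", "/").replace("~0", "~")
        if isinstance(node, dict):
            if token not in node:
                return MISSING
            node = node[token]
        elif isinstance(node, list):
            # Array elements can only be addressed by a numeric index.
            if not token.isdigit() or int(token) >= len(node):
                return MISSING
            node = node[int(token)]
        else:
            return MISSING
    return node


source = {
    "first_name": "alice",
    "last_name": "jones",
    "phone": [
        {"number": "555-123-4567", "type": "home"},
        {"number": "555-987-6543", "type": "mobile"},
    ],
}

print(resolve_pointer(source, "/last_name"))                # jones
print(resolve_pointer(source, "/phone/0/number"))           # 555-123-4567
print(resolve_pointer(source, "/phone/number") is MISSING)  # True
```

A pointer without an index stops at the array, because "number" is not a valid array index. Reaching every phone number would need one pointer per element.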

Related issues: #46, #49

Expected behavior

zentity should assume (like Elasticsearch) that each object in an array of objects has the same schema, and then during a resolution job, zentity should obtain attribute values from arrays of objects just like it obtains attribute values from object values or arrays of values.
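One way to picture the expected behavior is a traversal that fans out over arrays at any depth, so that a dotted field name such as "phone.number" yields the same kind of result for an object value, an array of values, and an array of objects. The sketch below is an illustration only, under the assumption that fields are named with dotted paths; `collect_values` is a hypothetical helper, not part of zentity.

```python
def collect_values(node, path):
    """Return every value reachable via a dotted path, fanning out over arrays."""
    keys = path.split(".") if isinstance(path, str) else path
    if isinstance(node, list):
        # Apply the remaining path to each element and merge the results.
        values = []
        for item in node:
            values.extend(collect_values(item, keys))
        return values
    if not keys:
        # End of the path: keep the value unless it is null.
        return [] if node is None else [node]
    if isinstance(node, dict) and keys[0] in node:
        return collect_values(node[keys[0]], keys[1:])
    return []


object_value = {"phone": {"number": "555-123-4567"}}
array_of_values = {"phone": {"number": ["555-123-4567", "555-987-6543"]}}
array_of_objects = {
    "phone": [
        {"number": "555-123-4567", "type": "home"},
        {"number": "555-987-6543", "type": "mobile"},
    ]
}

print(collect_values(object_value, "phone.number"))
# ['555-123-4567']
print(collect_values(array_of_values, "phone.number"))
# ['555-123-4567', '555-987-6543']
print(collect_values(array_of_objects, "phone.number"))
# ['555-123-4567', '555-987-6543']
```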

Steps to reproduce

Step 1. Create an index with a nested object.

PUT my_index
{
  "mappings": {
    "properties": {
      "first_name": {
        "type": "text"
      },
      "last_name": {
        "type": "text"
      },
      "phone": {
        "type": "nested",
        "properties": {
          "number": {
            "type": "keyword"
          },
          "type": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

Step 2. Index two documents.

POST my_index/_bulk?refresh
{"index":{"_id":1}}
{"first_name":"alice","last_name":"jones","phone":[{"number":"555-123-4567","type":"home"},{"number":"555-987-6543","type":"mobile"}]}
{"index":{"_id":2}}
{"first_name":"allison","last_name":"jones","phone":[{"number":"555-987-6543","type":"mobile"}]}

Step 3. Create an entity model.

PUT _zentity/models/my_entity_model
{
  "attributes": {
    "first_name": {},
    "last_name": {},
    "phone": {}
  },
  "resolvers": {
    "name_phone": {
      "attributes": [
        "last_name",
        "phone"
      ]
    }
  },
  "matchers": {
    "exact": {
      "clause": {
        "term": {
          "{{ field }}": "{{ value }}"
        }
      }
    },
    "exact_phone": {
      "clause": {
        "nested": {
          "path": "phone",
          "query": {
            "term": {
              "{{ field }}": "{{ value }}"
            }
          }
        }
      }
    }
  },
  "indices": {
    "my_index": {
      "fields": {
        "first_name": {
          "attribute": "first_name",
          "matcher": "exact"
        },
        "last_name": {
          "attribute": "last_name",
          "matcher": "exact"
        },
        "phone.number": {
          "attribute": "phone",
          "matcher": "exact_phone"
        }
      }
    }
  }
}

Step 4. Run a resolution job. Expect the first hop to match the supplied last name and phone number (555-123-4567), and expect the second hop to match the additional phone number (555-987-6543) found in the document from the first hop.

POST _zentity/resolution/my_entity_model?queries
{
  "attributes": {
    "first_name": [ "alice" ],
    "last_name": [ "jones" ],
    "phone": [ "555-123-4567" ]
  }
}
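The two hops expected from this request can be simulated with a simplified sketch. It is not zentity's implementation: it assumes only the "name_phone" resolver applies (a document matches when both its last name and at least one phone number are already known), and the names `docs`, `phone_numbers`, and `resolve` are hypothetical.

```python
# The two documents indexed in Step 2, keyed by _id.
docs = {
    "1": {
        "first_name": "alice",
        "last_name": "jones",
        "phone": [
            {"number": "555-123-4567", "type": "home"},
            {"number": "555-987-6543", "type": "mobile"},
        ],
    },
    "2": {
        "first_name": "allison",
        "last_name": "jones",
        "phone": [{"number": "555-987-6543", "type": "mobile"}],
    },
}


def phone_numbers(doc):
    """Collect every phone number from the array of phone objects."""
    return {entry["number"] for entry in doc.get("phone", [])}


def resolve(last_names, phones):
    """Return the document ids matched in each hop, in hop order."""
    known_last, known_phone = set(last_names), set(phones)
    matched, hops = set(), []
    while True:
        hits = sorted(
            doc_id
            for doc_id, doc in docs.items()
            if doc_id not in matched
            and doc["last_name"] in known_last
            and phone_numbers(doc) & known_phone
        )
        if not hits:
            break
        hops.append(hits)
        for doc_id in hits:
            matched.add(doc_id)
            # Attribute values from this hop feed the next hop's query.
            known_last.add(docs[doc_id]["last_name"])
            known_phone |= phone_numbers(docs[doc_id])
    return hops


print(resolve(["jones"], ["555-123-4567"]))  # [['1'], ['2']]
```

The second hop is only reachable if the phone numbers can be read out of the object array in document 1, which is the step that fails in this report.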

Step 5. The resolution job fails with the following error message:

io.zentity.model.ValidationException: Expected 'string' attribute data type.
	at io.zentity.resolution.input.value.StringValue.validate(StringValue.java:52)
	at io.zentity.resolution.input.value.Value.<init>(Value.java:35)
	at io.zentity.resolution.input.value.StringValue.<init>(StringValue.java:28)
	at io.zentity.resolution.input.value.Value.create(Value.java:57)
	at io.zentity.resolution.Job.onSearchComplete(Job.java:755)
	at io.zentity.resolution.Job.access$000(Job.java:50)
	at io.zentity.resolution.Job$1.onResponse(Job.java:1052)
	at io.zentity.resolution.Job$1.onResponse(Job.java:1045)
	at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:83)
	at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:77)
	at org.elasticsearch.action.ActionListener$4.onResponse(ActionListener.java:253)
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.sendSearchResponse(AbstractSearchAsyncAction.java:595)
	at org.elasticsearch.action.search.ExpandSearchPhase.run(ExpandSearchPhase.java:109)
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.executePhase(AbstractSearchAsyncAction.java:372)
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:366)
	at org.elasticsearch.action.search.FetchSearchPhase.moveToNextPhase(FetchSearchPhase.java:219)
	at org.elasticsearch.action.search.FetchSearchPhase.lambda$innerRun$1(FetchSearchPhase.java:101)
	at org.elasticsearch.action.search.FetchSearchPhase.innerRun(FetchSearchPhase.java:107)
	at org.elasticsearch.action.search.FetchSearchPhase.access$000(FetchSearchPhase.java:36)
	at org.elasticsearch.action.search.FetchSearchPhase$1.doRun(FetchSearchPhase.java:84)
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
	at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33)
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:732)
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:830)

Additional context

The following request is the query that zentity submits to Elasticsearch in the first hop, followed by the response that zentity receives and must process. The error occurs when zentity tries to parse the phone number values, which sit inside an array of objects.

Request:

GET my_index/_search
{
  "_source": true,
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "last_name": "jones"
          }
        },
        {
          "nested": {
            "path": "phone",
            "query": {
              "term": {
                "phone.number": "555-123-4567"
              }
            }
          }
        }
      ]
    }
  },
  "size": 1000
}

Response:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.0,
    "hits" : [
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.0,
        "_source" : {
          "first_name" : "alice",
          "last_name" : "jones",
          "phone" : [
            {
              "number" : "555-123-4567",
              "type" : "home"
            },
            {
              "number" : "555-987-6543",
              "type" : "mobile"
            }
          ]
        }
      }
    ]
  }
}

Labels

bug: Something isn't working