[FEATURE] PPL Support index mapping with dynamic=false #3995
Description
Is your feature request related to a problem?
When querying OpenSearch indices that have object fields with dynamic: false mapping, PPL queries fail to access nested fields within those objects even though the data exists in the document's _source. This creates a significant usability gap between PPL and the native Query DSL.
For example, with the following index mapping:
{
"mappings": {
"properties": {
"event": {
"type": "object",
"dynamic": false
}
}
}
}
And documents containing nested fields under "event":
{
"event": {
"user": {
"id": "u123",
"name": "Alice",
"location": {"city": "Seattle"}
},
"status": "ERROR"
}
}
A PPL query attempting to access these fields fails:
source=testindex | fields event.user
With the error:
{
"error": {
"reason": "Invalid Query",
"details": "{alias=event,fieldName=user} field not found; fields are: {aliases=[testindex],fieldName=event}{aliases=[testindex],fieldName=_id}{aliases=[testindex],fieldName=_index}{aliases=[testindex],fieldName=_score}{aliases=[testindex],fieldName=_maxscore}{aliases=[testindex],fieldName=_sort}{aliases=[testindex],fieldName=_routing}",
"type": "IllegalArgumentException"
},
"status": 400
}
Meanwhile, the equivalent OpenSearch query succeeds by accessing _source:
GET testindex/_search
{
"_source": ["event.user"]
}
This inconsistency forces users to switch between PPL and DSL queries based on their index mapping configuration, creating a fragmented user experience.
What solution would you like?
Enhance PPL to support accessing fields from _source even when they're not explicitly mapped, particularly for objects with dynamic: false mapping.
The solution should:
- Work in a schemaless manner when a field is not found in the mapping
- Automatically attempt to retrieve fields from _source when they're not found in the mapping
Example of desired behavior:
source=testindex | fields event.user.name, event.status
Should produce results like:
{"event.user.name": "Alice", "event.status": "ERROR"}
What alternatives have you considered?
- Explicitly mapping all fields - While this would solve the immediate issue, it's impractical for many log analytics use cases where:
  - Schema may evolve over time
  - Different log sources may have varying field structures
  - Index mapping size would grow exponentially for complex event structures
- Document transformation before indexing - Flattening nested objects during indexing could make all fields accessible, but this would significantly impact indexing performance and storage requirements.
Do you have any additional context?
This feature is critical for log analytics use cases where:
- Complex event structures are common
- dynamic: false is used to prevent mapping explosion
- Users need to query across a mix of explicitly mapped and unmapped fields