This is a meta issue collecting a few different ideas from different sources.
The core problem is that misbehaving queries are typically hard to debug and require a lot of knowledge of PPL/SQL engine behavior, so there's room for improvement.
Discussing the concept with @anasalkouz, there are three main problem classes we want to address:
- Something went wrong (execution error)
- 0 results are returned, why?
- The query is slow, why?
So, compiling existing issues for the backend:
Once those have reasonably complete implementations, errors will be in structured reports like this:
{
  "status": 400,
  "error": {
    "type": "SemanticCheckException",
    "code": "FIELD_NOT_FOUND",
    "reason": "Invalid Query",
    "details": "Failed to resolve field 'foo'",
    "location": [
      "while planning the query",
      "while resolving fields in the index mapping"
    ],
    "context": {
      "index_pattern": "logs-*",
      "position": {"line": 1, "column": 25},
      "query": "source=logs-* | fields foo",
      "query_id": "b6627794-3939-4ac4-8c5b-821ccc400f4f"
    },
    "suggestion": "Did you mean: 'foobar'"
  }
}
Frontends can choose to do whatever they want with this: render specific fixed pieces of context (e.g. `position` points to a spot in the query, so highlight it?), render details/locations/suggestions, throw it in an LLM, etc. It also helps with oncall debugging when given these responses (either directly or from HAR files). #4919 (comment) shows me doing this quickly for the SQL CLI based on a proof-of-concept implementation.
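As a rough illustration of the "render fixed pieces of context" idea, here is a minimal Python sketch of what a CLI-style consumer could do with the report above. It is not the proof-of-concept from #4919: `render_error` is a hypothetical helper, and it assumes `position.line` and `position.column` are both 1-based, which the report format does not confirm.

```python
import json

# Example payload, copied from the structured report above.
RESPONSE = """
{
  "status": 400,
  "error": {
    "type": "SemanticCheckException",
    "code": "FIELD_NOT_FOUND",
    "reason": "Invalid Query",
    "details": "Failed to resolve field 'foo'",
    "location": [
      "while planning the query",
      "while resolving fields in the index mapping"
    ],
    "context": {
      "index_pattern": "logs-*",
      "position": {"line": 1, "column": 25},
      "query": "source=logs-* | fields foo",
      "query_id": "b6627794-3939-4ac4-8c5b-821ccc400f4f"
    },
    "suggestion": "Did you mean: 'foobar'"
  }
}
"""


def render_error(response: dict) -> str:
    """Render a structured error report as plain text (hypothetical helper)."""
    error = response["error"]
    context = error.get("context", {})
    lines = [f"{error['code']} ({error['type']}): {error['details']}"]

    query = context.get("query")
    position = context.get("position")
    if query is not None and position is not None:
        # Assumption: line and column are 1-based offsets into the query text.
        query_lines = query.splitlines() or [""]
        line_no, column = position["line"], position["column"]
        if 1 <= line_no <= len(query_lines):
            lines.append("  " + query_lines[line_no - 1])
            lines.append("  " + " " * (column - 1) + "^")

    # The location stack reads outermost-first, so print it in order.
    for location in error.get("location", []):
        lines.append(f"  {location}")

    if error.get("suggestion"):
        lines.append(f"  hint: {error['suggestion']}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_error(json.loads(RESPONSE)))
```

Every field other than `code`, `type`, and `details` is treated as optional here, so the same renderer degrades gracefully for errors that carry no position or suggestion.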
From the frontend, there's a separate meta issue:
On the backend side, `analyze` alongside `explain` (#4343) can give the foundation for the latter two problem classes (0 results and slow queries), by providing analyze metrics for results returned and timings at all stages.
Once the core flows here are done, what's left is to start vetting specific error cases:
- `ArrayIndexOutOfBoundsException` when querying index with disabled objects containing dot-only field names #4896
- … (`IS NOT NULL` condition support #5262 is a special case but it'd be nice to do it in the general case)