In the nyc_taxis dataset used in benchmarking, I’m getting an arithmetic exception when calculating the `avg` of many longs.
The query:
```
from nyc_taxis
| eval epoch = to_long(dropoff_datetime)
| stats ae=avg(epoch), count(epoch)
| eval at=to_datetime(ae)
```
And the result:
```json
{
  "error": {
    "root_cause": [
      {
        "type": "arithmetic_exception",
        "reason": "long overflow"
      }
    ],
    "type": "arithmetic_exception",
    "reason": "long overflow"
  },
  "status": 500
}
```
In ESQL, arithmetic operations can lead to numerical overflows, and the principle is to return `null` and add a warning. We could do that in this case. However, since the `avg` function returns a double anyway, we could cast the longs to doubles up front. Right now `avg()` is implemented as `sum()/count()`, and the division returns a double; we could instead change this to `sum(to_double())/count()` and do the conversion to double one step earlier.
As a test, the following workaround works:
```
from nyc_taxis
| eval epoch = to_double(dropoff_datetime)
| stats ae=avg(epoch), count(epoch)
| eval at=to_datetime(ae)
```
```json
{
  "columns": [
    {
      "name": "ae",
      "type": "double"
    },
    {
      "name": "count(epoch)",
      "type": "long"
    },
    {
      "name": "at",
      "type": "date"
    }
  ],
  "values": [
    [
      1431759064681.463,
      8320000,
      "2015-05-16T06:51:04.681Z"
    ]
  ]
}
```
So, three options here:
- Implement the conversion to `null` and warnings for aggregations that have numerical overflow
- Add an implicit cast to double: `avg()` -> `sum(to_double())/count()`
- Support big integers on `sum()`, requiring support for an additional type (big integer)
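As a rough sketch of the first option, a checked accumulator could detect 64-bit overflow and yield `null` plus a warning rather than throw. This is hypothetical Python, not the actual aggregation code; `checked_sum` and its warning text are illustrative names only.

```python
# Hypothetical sketch of option 1: a checked long sum that returns
# None (null) with a warning instead of throwing on 64-bit overflow.
INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63)

def checked_sum(values):
    total = 0
    for v in values:
        total += v
        # Python ints are arbitrary precision, so we detect the
        # overflow a fixed-width long accumulator would have hit:
        if total > INT64_MAX or total < INT64_MIN:
            print("warning: long overflow in sum(); returning null")
            return None
    return total

print(checked_sum([INT64_MAX, 1]))  # None, after printing the warning
print(checked_sum([1, 2, 3]))       # 6
```

The second option keeps `avg()` total and is the cheapest change; the third preserves exact sums but introduces a new big-integer type into the type system.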