Description
Hey, I took a closer look at the performance of SLEEF and ran a simple benchmark with Sleef_sinf_u10 and Sleef_sinf_u35, also comparing against the stock sin from the MSVC compiler.
The benchmark is along the lines of:
float BenchFloat_Sin(unsigned long count)
{
    float value = 0;
    float f = 5;
    for (unsigned long i = 0; i < count; i++)
    {
        value += sin(f);
        value += sin(f + 1);
        value += sin(f + 2);
        value += sin(f + 3);
        value += sin(f + 4);
        f += i;
    }
    return value;
}
where the count was dynamically calibrated from another bench; in this case, on my machine, it was set to 6574622.
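For anyone who wants to reproduce the comparison, here is a minimal sketch of a timing harness around the same loop shape. It is not the exact code used for the numbers in this post: the names `sin_fn`, `zero_fn`, `bench_sin` and `time_ms` are made up for illustration, and `clock()` is only a coarse timer whose meaning (CPU time vs. wall time) differs between platforms.

```c
#include <assert.h>
#include <math.h>   /* for sinf, when it is passed as the function under test */
#include <stdio.h>
#include <time.h>

/* Any float -> float implementation: sinf, Sleef_sinf_u10, Sleef_sinf_u35, ... */
typedef float (*sin_fn)(float);

/* Hypothetical no-op stand-in, useful for measuring the loop overhead alone. */
static float zero_fn(float x)
{
    (void)x;
    return 0.0f;
}

/* Same loop shape as the benchmark above, with the function passed in. */
static float bench_sin(sin_fn fn, unsigned long count)
{
    assert(fn != NULL);
    float value = 0.0f;
    float f = 5.0f;
    for (unsigned long i = 0; i < count; i++) {
        value += fn(f);
        value += fn(f + 1.0f);
        value += fn(f + 2.0f);
        value += fn(f + 3.0f);
        value += fn(f + 4.0f);
        f += (float)i;
    }
    return value;
}

/* Returns the elapsed time reported by clock(), in milliseconds.
   The accumulated sum is stored in *out so the loop is not optimized away. */
static double time_ms(sin_fn fn, unsigned long count, float *out)
{
    assert(out != NULL);
    clock_t start = clock();
    *out = bench_sin(fn, count);
    clock_t end = clock();
    return 1000.0 * (double)(end - start) / (double)CLOCKS_PER_SEC;
}
```

Calling `time_ms(sinf, count, &result)` and then the same with `Sleef_sinf_u10` or `Sleef_sinf_u35` (assuming the SLEEF header is included and the library is linked) gives one timing per implementation; `zero_fn` gives a baseline for the loop itself.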
The timing results are:
MSVC sinf: 418ms
Sleef_sinf_u10: 2058ms
Sleef_sinf_u35: 179ms
Between u10 and u35 there is more than a 10x difference (!). I haven't checked the accuracy of MSVC sinf, so it is likely not 1 ULP. The cost of reaching 1 ULP looks exponential.
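A rough way to check the accuracy of a given sinf could look like the sketch below. It is only an assumption about how one might do it, not something that was run for this post: `ordered_bits`, `ulp_distance` and `max_ulp_distance` are hypothetical helper names, the reference is a double-precision function rounded to float (so the result is a whole-ULP distance and can be off by up to half an ULP), and NaN results are not handled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Map a float's bit pattern to an integer that increases monotonically
   with the float's value, so that adjacent floats differ by exactly 1. */
static int32_t ordered_bits(float x)
{
    int32_t i;
    memcpy(&i, &x, sizeof i);
    return i < 0 ? INT32_MIN - i : i;
}

/* Number of representable floats between a and b (0 if they are equal). */
static int64_t ulp_distance(float a, float b)
{
    int64_t d = (int64_t)ordered_bits(a) - (int64_t)ordered_bits(b);
    return d < 0 ? -d : d;
}

/* Scan [lo, hi] in `steps` equal increments and return the worst distance
   between fn(x) and the double-precision reference rounded to float. */
static int64_t max_ulp_distance(float (*fn)(float), double (*ref)(double),
                                float lo, float hi, int steps)
{
    assert(fn != NULL && ref != NULL && steps > 0);
    int64_t worst = 0;
    for (int k = 0; k <= steps; k++) {
        float x = lo + (hi - lo) * (float)k / (float)steps;
        float expected = (float)ref((double)x);
        int64_t d = ulp_distance(fn(x), expected);
        if (d > worst) {
            worst = d;
        }
    }
    return worst;
}
```

With `<math.h>` included, something like `max_ulp_distance(sinf, sin, 0.0f, 100.0f, 1000000)` would give a first impression of how far the stock sinf is from the double-precision result over that range.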
Do you think these results are expected?
Also, if it is not possible to achieve 1 ULP without this cost, would it be possible to introduce an intermediate precision (2 ULP?) that could narrow the gap and make it more viable?
More generally, I was wondering whether offering more ULP precision choices (u10, u20, u35, u60) could be helpful in some cases...