
Large performance diff between sin u10 and u35 #107

@xoofx

Hey, I took a closer look at the performance of SLEEF and ran a simple benchmark with Sleef_sinf_u10 and Sleef_sinf_u35, comparing them against the stock sin from the MSVC compiler.

The benchmark is along the lines of:

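#include <math.h>

/* sin stands for the function being measured
   (stock sin, Sleef_sinf_u10 or Sleef_sinf_u35). */
float BenchFloat_Sin(unsigned long count)
{
    float value = 0;   /* accumulated so the calls are not optimized away */
    float f = 5;       /* input, grows on every iteration */
    for (unsigned long i = 0; i < count; i++)
    {
        value += sin(f);
        value += sin(f + 1);
        value += sin(f + 2);
        value += sin(f + 3);
        value += sin(f + 4);
        f += i;
    }
    return value;
}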

The count is dynamically calibrated from another benchmark; in this case, on my machine, it was set to 6574622.

The timing results are:

MSVC sinf: 418ms
Sleef_sinf_u10: 2058ms
Sleef_sinf_u35: 179ms

Between u10 and u35 there is more than a 10x difference (!). I haven't checked the accuracy of MSVC sinf, so it is likely not 1 ULP. It looks like the cost of reaching 1 ULP grows exponentially.

Do you think these results are expected?

Also, if it is not possible to achieve 1 ULP without this cost, would it be possible to introduce an intermediate precision (2 ULP?) that could narrow the gap and make it more viable?

More generally, I was wondering whether offering more ULP precision choices (u10, u20, u35, u60) could be helpful in some cases.
