Skip to content

Document rounding modes & maybe provide MPFR-compatible aliases #890

@danielkhankin

Description

@danielkhankin

When using mpmath's fsub function with round-up mode (rounding='u'), the result is incorrectly rounded up by 1 bit compared to the IEEE754 expected result computed with MPFR. This issue was discovered when performing a subtraction rounded to single precision.

from mpmath import mp, mpf
import struct

def float_to_hex(f):
    return hex(struct.unpack('<I', struct.pack('<f', float(f)))[0])

mp.dps = 64

a = mpf('0.0004370212554931640625')
b = mpf('0.69314718055994530943')

result = mp.fsub(a, b, rounding='u', prec=24)

print(f"mpmath result in hex: {float_to_hex(float(result))}")  # Outputs: 0xbf315574

Expected result (verified with MPFR):

0xbf315573

The MPFR implementation showing the correct result:

#include <stdio.h>
#include <mpfr.h>
#include <stdint.h>

int main(void) {
    mpfr_t a, b, result;
    float float_result;
    union {
        float f;
        uint32_t i;
    } u;

    mpfr_init2(a, 64);
    mpfr_init2(b, 64);
    mpfr_init2(result, 64);

    mpfr_set_str(a, "0.0004370212554931640625", 10, MPFR_RNDN);
    mpfr_set_str(b, "0.69314718055994530943", 10, MPFR_RNDN);

    mpfr_sub(result, a, b, MPFR_RNDU);
    float_result = mpfr_get_flt(result, MPFR_RNDU);
    u.f = float_result;

    printf("Result in hex: 0x%08x\n", u.i);

    mpfr_clear(a);
    mpfr_clear(b);
    mpfr_clear(result);

    return 0;
}

The difference shows that mpmath's round-up implementation in fsub produces a result that is 1 bit higher than the correct value.

Metadata

Metadata

Assignees

No one assigned

    Labels

    docsenhancementnew feature requests (or implementation)

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions