Clang/LLVM optimizes division and modulo worse than MSVC #37331

@StephanTLavavej

Bugzilla Link: 37983
Version: trunk
OS: All
Attachments: Test case; Clang codegen for workaround; Clang codegen for modulo (this is the bug); MSVC codegen for workaround; MSVC codegen for modulo (this is fine)
CC: @topperc, @efriedma-quic, @LebedevRI, @RKSimon, @nico, @rotateright, @Trass3r

Extended Description

This affects the Ryu algorithm for printing floating-point numbers (https://github.com/ulfjack/ryu) and therefore affects C++17 floating-point std::to_chars(). This is possibly the same bug as #23480, "Division followed by modulo generates longer machine code than vice versa".

I observe that MSVC's codegen is unaffected by USE_MODULO, while Clang/LLVM generates more assembly code when USE_MODULO is defined. That longer code is also slower when profiled in the real algorithm.

C:\Temp\TESTING_X64>cl
Microsoft (R) C/C++ Optimizing Compiler Version 19.15.26504 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

usage: cl [ option... ] filename... [ /link linkoption... ]

C:\Temp\TESTING_X64>clang-cl -m64 -v
clang version 6.0.0 (tags/RELEASE_600/final)
Target: x86_64-pc-windows-msvc
Thread model: posix
InstalledDir: S:\msvc\src\vctools\NonShip\ClangLLVM\bin

C:\Temp\TESTING_X64>type meow.cpp
unsigned long long ryu(unsigned long long vp, unsigned long long vm) {
    bool vmIsTrailingZeros = true;

    while (vp / 10 > vm / 10) {
#ifdef USE_MODULO
        vmIsTrailingZeros &= vm % 10 == 0;
#else
        // The compiler does not realize that vm % 10 can be computed from vm / 10
        // as vm - (vm / 10) * 10.
        vmIsTrailingZeros &= vm - (vm / 10) * 10 == 0; // vm % 10 == 0;
#endif
        vp /= 10;
        vm /= 10;
    }

    return vmIsTrailingZeros ? vp : vm;
}

C:\Temp\TESTING_X64>cl /EHsc /nologo /W4 /MT /O2 /c /FAsc /Famsvc_workaround.cod meow.cpp
meow.cpp

C:\Temp\TESTING_X64>cl /EHsc /nologo /W4 /MT /O2 /c /FAsc /Famsvc_modulo.cod /DUSE_MODULO meow.cpp
meow.cpp

C:\Temp\TESTING_X64>clang-cl -m64 /EHsc /nologo /W4 /MT /O2 /c /FA /Faclang_workaround.asm meow.cpp

C:\Temp\TESTING_X64>clang-cl -m64 /EHsc /nologo /W4 /MT /O2 /c /FA /Faclang_modulo.asm /DUSE_MODULO meow.cpp

C:\Temp\TESTING_X64>
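
The workaround in the test case relies on the identity vm % 10 == vm - (vm / 10) * 10 for unsigned integers. The following is a minimal sketch that checks the identity in isolation. The helper names (rem10_modulo, rem10_from_quotient, check_range) are hypothetical and do not come from Ryu or from the attached test case. It only shows that the two forms compute the same value and says nothing about the code either compiler generates.

```cpp
#include <cassert>
#include <cstdint>

// Remainder computed directly with the % operator (the USE_MODULO form).
constexpr std::uint64_t rem10_modulo(std::uint64_t v) {
    return v % 10;
}

// Remainder derived from the quotient (the workaround form).
// (v / 10) * 10 never exceeds v, so the subtraction cannot wrap around.
constexpr std::uint64_t rem10_from_quotient(std::uint64_t v) {
    return v - (v / 10) * 10;
}

// Compile-time spot checks at the extremes of the 64-bit range.
static_assert(rem10_from_quotient(0) == rem10_modulo(0), "forms differ at 0");
static_assert(rem10_from_quotient(UINT64_MAX) == rem10_modulo(UINT64_MAX),
              "forms differ at UINT64_MAX");

// Runtime check over the inclusive range [first, last].
// The explicit break avoids wrapping past UINT64_MAX when last is the maximum.
void check_range(std::uint64_t first, std::uint64_t last) {
    for (std::uint64_t v = first;; ++v) {
        assert(rem10_modulo(v) == rem10_from_quotient(v));
        if (v == last) {
            break;
        }
    }
}
```

Calling check_range(0, 100000) and check_range(UINT64_MAX - 100000, UINT64_MAX) from a test driver exercises both ends of the range. Because the two forms are equivalent, an optimizer that already has vm / 10 available could in principle reuse it for vm % 10, which is the transformation this report asks for.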
