GVN and SROA miscompile min precision vector element access #8268

@alsepkow

Description

Multiple optimization passes mishandle min precision vector types due to DXC's padded data layout (i16:32, f16:32), where getTypeSizeInBits returns padded sizes for vectors (HLSL change) but primitive sizes for scalars. This causes three related bugs affecting min16float, min16int, and min16uint vector element access ([] operator).

Bug 1: GVN ICE (Internal Compiler Error)

CanCoerceMustAliasedValueToLoad computes an integer type using the padded size (e.g., 96 bits for <3 x half> instead of 48), then CoerceAvailableValueToLoadType attempts a bitcast from the 48-bit LLVM type to i96 — triggering an LLVM assert.

Bug 2: GVN Incorrect Store-to-Load Forwarding (Silent Miscompile)

GVN's processLoad forwards a store <3 x i16> zeroinitializer directly to a later load <3 x i16>, ignoring the intervening i16 stores that write individual vector elements. This happens because MemoryDependenceAnalysis uses padded type sizes to determine aliasing.

Bug 3: SROA Element Misindexing (Silent Miscompile)

This is the root cause of the test failures. SROA's getNaturalGEPRecursively uses getTypeSizeInBits (primitive size: 2 bytes for i16) to compute vector element offsets, while GEP offset computation uses getTypeAllocSize (padded size: 4 bytes with i16:32). Because of this mismatch, byte offset 4 (element 1) is mapped to vector index 4/2 = 2 instead of 4/4 = 1, so SROA misplaces or eliminates stores to vector elements.

Result: Only element [0] is correct; elements [1] and [2] are zeroed.
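
The index arithmetic behind this bug can be checked with a small standalone C++ sketch. It does not use the LLVM API: `elementIndexForOffset`, `gepByteOffset`, and the two constants are hypothetical names that only model the division described above, with the sizes taken from this report (2 bytes primitive, 4 bytes alloc for i16 under i16:32).

```cpp
#include <cstdint>

// Sizes quoted in this report for i16 under the i16:32 data layout.
constexpr std::uint64_t kPrimitiveI16Bytes = 2; // getTypeSizeInBits(i16) / 8
constexpr std::uint64_t kAllocI16Bytes = 4;     // getTypeAllocSize(i16)

// Byte offset that GEP-style offset computation assigns to element `index`
// (hypothetical helper; it multiplies by the padded alloc size).
constexpr std::uint64_t gepByteOffset(std::uint64_t index) {
  return index * kAllocI16Bytes;
}

// Map a byte offset back to a vector element index by dividing by the
// element size the caller chooses (hypothetical helper).
constexpr std::uint64_t elementIndexForOffset(std::uint64_t byteOffset,
                                              std::uint64_t elementSizeBytes) {
  return byteOffset / elementSizeBytes;
}
```

Dividing by the primitive size sends element 1 (byte offset 4) to index 2 and element 2 (byte offset 8) to index 4, which is past the end of a 3-element vector. Dividing by the alloc size recovers indices 1 and 2.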

Repro

RWByteAddressBuffer g_In : register(u0);
RWByteAddressBuffer g_Out : register(u1);

[numthreads(1,1,1)]
void main() {
  vector<int, 3> raw = g_In.Load< vector<int, 3> >(0);
  vector<min16int, 3> v = (vector<min16int, 3>)raw;
  vector<min16int, 3> out_v = (min16int)0;
  out_v[0] = v[0];
  out_v[2] = v[2];
  out_v[1] = v[1];
  g_Out.Store< vector<int, 3> >(0, (vector<int, 3>)out_v);
}

Compile with: dxc -T cs_6_9 repro.hlsl

  • -O0 / -Od: correct results
  • -O1 (default): Bug 1 (ICE) or Bug 3 (wrong results)

Also reproduces with min16float and min16uint.

Root Cause

DXC's data layout pads min precision types: i16:32 and f16:32. The HLSL change in DataLayout::getTypeSizeInBits (lines 540-543) makes vector sizes use getTypeAllocSizeInBits per element, so getTypeSizeInBits(<3 x i16>) = 96 (3 x 32). The scalar query is unchanged, however: getTypeSizeInBits(i16) = 16, the primitive width.

This inconsistency propagates through:

  • GVN: Uses padded vector sizes for bitcast width calculations and alias reasoning
  • SROA: Uses primitive scalar sizes for vector element offsets but padded alloc sizes for GEP offsets — causing index mismatches
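
The size inconsistency described in this section can be modelled in a few lines of standalone C++. This is a simplified sketch, not the LLVM DataLayout implementation: `ScalarLayout` and the four functions are hypothetical names, and the alloc size is approximated as the primitive width rounded up to the ABI alignment, which matches the i16:32 numbers quoted above.

```cpp
#include <cstdint>

// Simplified stand-in for one scalar entry of a data layout string.
struct ScalarLayout {
  std::uint64_t primitiveBits; // e.g. 16 for i16 or half
  std::uint64_t abiAlignBits;  // e.g. 32 under i16:32 / f16:32
};

// Alloc size: primitive width rounded up to the ABI alignment.
constexpr std::uint64_t allocSizeInBits(ScalarLayout s) {
  return (s.primitiveBits + s.abiAlignBits - 1) / s.abiAlignBits * s.abiAlignBits;
}

// Scalar type size: the primitive width, with no padding.
constexpr std::uint64_t scalarSizeInBits(ScalarLayout s) {
  return s.primitiveBits;
}

// Vector type size with the HLSL change: element count times alloc size.
constexpr std::uint64_t paddedVectorSizeInBits(ScalarLayout s, std::uint64_t n) {
  return n * allocSizeInBits(s);
}

// Vector width of the IR type itself, which is what a bitcast must match.
constexpr std::uint64_t primitiveVectorSizeInBits(ScalarLayout s, std::uint64_t n) {
  return n * s.primitiveBits;
}

// Min precision i16 as described in this report.
constexpr ScalarLayout kMinPrecI16{16, 32};
```

With `kMinPrecI16`, the scalar size is 16 bits while the 3-element vector reports 96 bits against a primitive vector width of 48, so any pass that mixes the two queries sees sizes that disagree by a factor of two.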

Fix

Three guards in lib/Transforms/Scalar/GVN.cpp and lib/Transforms/Scalar/SROA.cpp:

  1. GVN CanCoerceMustAliasedValueToLoad: Reject coercion when type sizes include padding
  2. GVN processLoad: Skip store-to-load forwarding for padded types
  3. SROA: Use getTypeAllocSizeInBits for vector element sizes in getNaturalGEPRecursively, isVectorPromotionViable, and AllocaSliceRewriter, matching GEP offset calculations
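
As a rough illustration of what "type sizes include padding" could mean for guards 1 and 2, the sketch below compares a type's primitive width with the size the data layout reports. It is not taken from the fix branch: `TypeSizes`, `sizeIncludesPadding`, and `mayCoerceOrForward` are hypothetical names, and the actual patch may detect padding differently.

```cpp
#include <cstdint>

// Hypothetical view of a type through the two size queries in this report.
struct TypeSizes {
  std::uint64_t primitiveBits; // width of the IR type itself
  std::uint64_t layoutBits;    // size reported by the data layout
};

// True when the data layout size contains padding the IR type does not have.
constexpr bool sizeIncludesPadding(TypeSizes t) {
  return t.layoutBits != t.primitiveBits;
}

// Assumed shape of the GVN guards: decline to coerce or forward as soon as
// either the stored type or the loaded type is padded.
constexpr bool mayCoerceOrForward(TypeSizes stored, TypeSizes loaded) {
  return !sizeIncludesPadding(stored) && !sizeIncludesPadding(loaded);
}

constexpr TypeSizes kMinPrecHalf3{48, 96}; // <3 x half> under f16:32
constexpr TypeSizes kI32x3{96, 96};        // <3 x i32>, unpadded example
```

Under this model the `<3 x half>` case from Bug 1 is rejected before any i96 bitcast is attempted, while an unpadded type such as `<3 x i32>` is still eligible for coercion and forwarding.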

Fix branch: https://github.com/alsepkow/DirectXShaderCompiler/tree/user/alsepkow/fix-min-precision-opt-bugs
Squashed commit: alsepkow@b34136b9a

Environment

  • DXC version: 1.9.0 (main branch, SM 6.9)
  • Affects: all min precision types (min16float, min16int, min16uint) with vector element access
  • Does NOT affect native 16-bit types (half with -enable-16bit-types)
