rawBufferVectorLoad/Store emits i16/f16 for min precision types instead of i32/f32 #8273

## Summary

`rawBufferVectorLoad` and `rawBufferVectorStore` (SM 6.9 vector buffer ops) emit 16-bit element types for min precision types (`min16int`, `min16uint`, `min16float`). This causes drivers to load/store 2 bytes per element instead of 4, mismatching the expected buffer layout.

## Repro

```hlsl
RWByteAddressBuffer g : register(u0);

[numthreads(1,1,1)]
void main() {
  min16int3 v = g.Load<min16int3>(0);
  g.Store<min16int3>(12, v);
}
```

Compile with `dxc -T cs_6_9`:

**Actual (buggy):**
- `rawBufferVectorLoad.v3i16`: loads 3 x 2 bytes = 6 bytes
- `rawBufferVectorStore.v3i16`: stores 3 x 2 bytes = 6 bytes

**Expected:**
- `rawBufferVectorLoad.v3i32` + `trunc <3 x i32> to <3 x i16>`: loads 3 x 4 bytes = 12 bytes
- `sext <3 x i16> to <3 x i32>` + `rawBufferVectorStore.v3i32`: stores 3 x 4 bytes = 12 bytes

## Root Cause

`TranslateBufLoad` in `HLOperationLower.cpp` (line ~4353) creates the vector type directly from the min precision element type without widening to 32-bit first. Pre-SM6.9 `RawBufferLoad` handles this correctly by loading as i32 and truncating; the SM6.9 vector variant should do the same.

## Analysis

WARP treats i16 `rawBufferVectorLoad` as 2-byte-per-element loads (confirmed in source). Pre-SM6.9, DXC emits `rawBufferLoad.i32` + `trunc i32 to i16` for min precision, which correctly loads 4 bytes. The SM6.9 vector path skips this widening, producing a buffer layout mismatch when the CPU writes 32-bit values.

The same issue affects `min16uint` and `min16float` (half).
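To make the layout mismatch concrete, here is a small host-side C++ simulation. It is not DXC or driver code: the function names are hypothetical, the byte vector only stands in for a raw buffer that the CPU filled with 32-bit values, and a little-endian host is assumed. One function models the 2-byte-stride read that a `v3i16` load implies, the other models the 4-byte-stride read followed by a per-lane truncation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a raw buffer the CPU filled with three 32-bit values.
std::vector<std::uint8_t> makeCpuBuffer(const std::array<std::int32_t, 3> &values) {
  std::vector<std::uint8_t> bytes(3 * sizeof(std::int32_t));
  std::memcpy(bytes.data(), values.data(), bytes.size());
  return bytes;
}

// Models the buggy path: three lanes read at a 2-byte stride (6 bytes total).
std::array<std::int16_t, 3> loadAsV3I16(const std::vector<std::uint8_t> &buf,
                                        std::size_t offset) {
  assert(offset + 3 * sizeof(std::int16_t) <= buf.size());
  std::array<std::int16_t, 3> out{};
  std::memcpy(out.data(), buf.data() + offset, 3 * sizeof(std::int16_t));
  return out;
}

// Models the expected path: three lanes read at a 4-byte stride (12 bytes total),
// then each lane is truncated to 16 bits.
std::array<std::int16_t, 3> loadAsV3I32ThenTrunc(const std::vector<std::uint8_t> &buf,
                                                 std::size_t offset) {
  assert(offset + 3 * sizeof(std::int32_t) <= buf.size());
  std::array<std::int32_t, 3> wide{};
  std::memcpy(wide.data(), buf.data() + offset, 3 * sizeof(std::int32_t));
  std::array<std::int16_t, 3> out{};
  for (std::size_t i = 0; i < 3; ++i)
    out[i] = static_cast<std::int16_t>(wide[i]);
  return out;
}
```

With a buffer holding the 32-bit values 1, 2, 3 on a little-endian host, the narrow read returns the lanes 1, 0, 2 (it picks up the upper half of the first element as its second lane), while the wide read with truncation returns 1, 2, 3.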

## Fix

Widen min precision types to i32/f32 in both `TranslateBufLoad` and `TranslateBufStore` for `RawBufferVectorLoad/Store`, matching the existing bool widening pattern. Truncate/extend back after load / before store.
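The intended widen-on-store, narrow-on-load round trip can be sketched on the host as follows. This is only an illustration of the memory behavior the fix aims for, not the actual change to `HLOperationLower.cpp`: `Min16Int3`, `storeWidened`, and `loadNarrowed` are hypothetical names, and only the signed `min16int` case is modeled (the `min16uint` and `min16float` cases would need their own extension and truncation steps).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for an HLSL min16int3 value.
using Min16Int3 = std::array<std::int16_t, 3>;

// Store path: sign-extend each lane to 32 bits (the sext step), then write
// 3 x 4 = 12 bytes into the buffer.
void storeWidened(std::vector<std::uint8_t> &buf, std::size_t offset,
                  const Min16Int3 &v) {
  std::int32_t wide[3];
  for (std::size_t i = 0; i < 3; ++i)
    wide[i] = static_cast<std::int32_t>(v[i]);
  assert(offset + sizeof(wide) <= buf.size());
  std::memcpy(buf.data() + offset, wide, sizeof(wide));
}

// Load path: read 3 x 4 = 12 bytes as 32-bit lanes, then truncate each lane
// back to 16 bits (the trunc step).
Min16Int3 loadNarrowed(const std::vector<std::uint8_t> &buf, std::size_t offset) {
  std::int32_t wide[3];
  assert(offset + sizeof(wide) <= buf.size());
  std::memcpy(wide, buf.data() + offset, sizeof(wide));
  Min16Int3 out{};
  for (std::size_t i = 0; i < 3; ++i)
    out[i] = static_cast<std::int16_t>(wide[i]);
  return out;
}
```

Storing `{-1, 2, -300}` at byte offset 12 and loading it back from the same offset returns the same three lanes, and each lane occupies a full 4 bytes in the buffer, which matches the 12-byte footprint listed under Expected in the repro.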

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
