
Big endian issue: Graph Transformation Attention Fusion tests are failing #12921

@tvkai

Describe the issue

I am observing that on AIX, which is a big-endian platform, the following tests fail when onnxruntime is built from source:

1: [ FAILED ] GraphTransformationTests.AttentionFusionInt32Test
1: [ FAILED ] GraphTransformationTests.AttentionFusionInt64Test
1: [ FAILED ] GraphTransformationTests.AttentionFusionFloat32Test

The underlying issue seems to be that onnxruntime adds initializers to the graph during graph transformation, and these initializers carry their values in raw_data. According to the current implementation, the graph should hold little-endian data, and whenever the data is processed, UnpackTensor takes care of the endianness of the machine. On a big-endian platform, however, the merged data is already in BE, so the graph ends up containing BE data instead of LE.
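For context, whether the host stores multi-byte values LE-first or BE-first can be checked with a small sketch like the following (illustrative only, not onnxruntime code):

```cpp
#include <cstdint>
#include <cstring>

// Illustrative check: returns true on little-endian hosts (x86, most ARM)
// and false on big-endian hosts such as AIX on IBM Power.
bool IsHostLittleEndian() {
  const std::uint32_t probe = 1u;
  std::uint8_t first_byte;
  std::memcpy(&first_byte, &probe, 1);  // first byte in memory order
  return first_byte == 1;  // 1 on LE, 0 on BE
}
```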

Below is the place where the actual issue seems to be.

In the file : onnxruntime/core/optimizer/attention_fusion.cc
Function: MergeQkvWeights

```c++
if (data_type == ONNX_NAMESPACE::TensorProto_DataType_FLOAT) {
  const float* q_weight = q_initializer.data<float>();
  const float* k_weight = k_initializer.data<float>();
  const float* v_weight = v_initializer.data<float>();
  std::vector<float> result;
  result.reserve(gsl::narrow<size_t>(element_count));
  if (is_matmul) {
    MergeMatMulWeights(q_weight, k_weight, v_weight, result, hidden_size);
  } else {
    MergeWeights(q_weight, k_weight, v_weight, result, hidden_size);
  }
  initializer.set_raw_data(result.data(), gsl::narrow<size_t>(element_count) * sizeof(float));  // <=== Here we set raw_data using BE data when running on a big-endian platform.
```

The else block has the same issue: the initializer is formed with BE data and then added to the graph.

One way to fix this issue is to byte-swap the data before setting raw_data on big-endian platforms.
I tested byte-swapping immediately before the raw data is set, and the tests pass.
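A minimal sketch of such a fix (the helper name is hypothetical; onnxruntime may have its own utilities for this) could reverse each 4-byte float in place before set_raw_data is called:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: reverse the byte order of every 4-byte element in
// place, converting big-endian float data to little-endian (or back)
// before it is stored as raw_data.
static void SwapBytesInPlace4(void* data, std::size_t element_count) {
  auto* bytes = static_cast<std::uint8_t*>(data);
  for (std::size_t i = 0; i < element_count; ++i) {
    std::swap(bytes[4 * i + 0], bytes[4 * i + 3]);
    std::swap(bytes[4 * i + 1], bytes[4 * i + 2]);
  }
}
```

On a BE platform this would be applied to result.data() over element_count elements just before initializer.set_raw_data(...), guarded by an endianness check so little-endian builds are unaffected.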

To reproduce

Run the graph-transformation tests on a big-endian platform such as AIX on IBM Power.

Urgency

No response

Platform

Other / Unknown

OS Version

AIX

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

latest

ONNX Runtime API

C

Architecture

IBM Power

Execution Provider

Default CPU

Execution Provider Library Version

No response


    Labels

    model:transformer (issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc.)
