Skip to content

Feature: support bnxt ibgda#64

Merged
jhchouuu merged 31 commits intomainfrom
jiahzhou/bnxt_develop
Sep 9, 2025
Merged

Feature: support bnxt ibgda#64
jhchouuu merged 31 commits intomainfrom
jiahzhou/bnxt_develop

Conversation

@jhchouuu
Copy link
Copy Markdown
Collaborator

@jhchouuu jhchouuu commented Sep 5, 2025

Broadcom BNXT IBGDA Support
BNXT IBGDA functions are now supported.
Use USE_BNXT=ON to enable BNXT features.

Shmem API implementation modification
Shmem API now returns one CQE per batch of WQE.

IBGDA API Unification
BNXT and MLX5 share the same basic operation API.
Added cqeSignal option for CQE signaling control.

jhchouuu and others added 27 commits September 1, 2025 19:32
*suppot post wqe and poll cq
*use new abi
*successful build, pending debug
NOTE: bnxt dont use __be*, no need for conversion.
TODO: modify lkey & rkey from uint64_t to uint32_t
successfully implemented the following ops for bnxt IBGDA:
write, write inline, atomic CAS and FA, poll CQ and ringDB
NOTE: mlx5 recv ops need write dbr
      bnxt recv ops need ring doorbell(write header into dbr)
* TODO: confirm the latency value for multiple ring doorbell
NOTE: When multiple rapid writes to the doorbell will
      trigger cqe coaleasec and only return the subsequent cqe.
      use tools/bnxt_disable_cq_coal.sh to disable this func.
* still not sure how bnxt cqe behaves, so use the serial quiet for now
*  poll cqe hang WA

* fix internode segFault

* reduce memory overuse

* add ShmemQuietThread in dispatch

---------

Co-authored-by: jhchouuu <jiahzhou@amd.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request adds support for Broadcom BNXT RDMA adapters and modifies the SHMEM API to improve completion queue efficiency. The implementation unifies IBGDA operations between MLX5 and BNXT devices while introducing CQE signaling control options.

  • Adds comprehensive BNXT RDMA support with device primitives, QP containers, and endpoint management
  • Modifies SHMEM API to return one CQE per batch of WQE operations for improved efficiency
  • Unifies IBGDA API design between BNXT and MLX5 with shared operation interfaces

Reviewed Changes

Copilot reviewed 38 out of 39 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
CMakeLists.txt Adds USE_BNXT build option and library detection logic
src/application/transport/rdma/providers/bnxt/ Complete BNXT provider implementation with device primitives and host/device APIs
include/mori/core/transport/rdma/providers/bnxt/ BNXT headers including HSI definitions and device primitive interfaces
src/application/transport/rdma/rdma.cpp Integrates BNXT device factory and vendor ID support
include/mori/shmem/shmem_ibgda_kernels.hpp Extensive SHMEM kernel modifications for provider dispatch and BNXT support
tests/cpp/io/test_engine.cpp Fixes printf format specifier for 64-bit values
Comments suppressed due to low confidence (1)

src/io/rdma/backend_impl.cpp:1

  • The assert statement uses a string literal instead of a boolean expression. This will always evaluate to true and never trigger an assertion failure. Should be assert(false && \"message\") or use a proper conditional.
// Copyright © Advanced Micro Devices, Inc. All rights reserved.

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines 199 to 202
RdmaContext::RdmaContext(RdmaBackendType backendType) : backendType(backendType) {
deviceList = ibv_get_device_list(nullptr);
deviceList = ibv_get_device_list(&nums_device);
Initialize();
}
Copy link

Copilot AI Sep 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable nums_device is used but not declared in this scope. It appears to be a member variable that should be declared in the class definition.

Copilot uses AI. Check for mistakes.
Comment thread src/application/transport/rdma/providers/bnxt/bnxt.cpp Outdated

// TODO: write a better version
static __device__ __host__ void DumpWqe(void* wqeBaseAddr, uint32_t idx) {
// TODO: write a better verison
Copy link

Copilot AI Sep 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in comment: 'verison' should be 'version'.

Suggested change
// TODO: write a better verison
// TODO: write a better version

Copilot uses AI. Check for mistakes.
Comment thread include/mori/core/transport/rdma/providers/bnxt/bnxt_device_primitives.hpp Outdated
Comment on lines +112 to +116
constexpr int sendWqeSize =
sizeof(struct bnxt_re_bsqe) + sizeof(struct bnxt_re_send) + sizeof(struct bnxt_re_sge);
constexpr int slotsNum = CeilDiv(sendWqeSize, BNXT_RE_SLOT_SIZE);

int psnCnt = (bytes == 0) ? 1 : (bytes + mtuSize - 1) / mtuSize;
Copy link

Copilot AI Sep 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sendWqeSize calculation is duplicated across multiple functions. Consider extracting this to a common constant or helper function to avoid code duplication.

Copilot uses AI. Check for mistakes.
jhchouuu and others added 3 commits September 5, 2025 20:35
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Comment thread src/application/transport/rdma/providers/bnxt/bnxt.cpp Outdated
@TianDi101
Copy link
Copy Markdown
Collaborator

Amazing PR! I've left several comments but overall, the code LGTM. Thanks for the hard work @jhchouuu

@jhchouuu jhchouuu merged commit 9c281d5 into main Sep 9, 2025
@jhchouuu jhchouuu deleted the jiahzhou/bnxt_develop branch September 16, 2025 03:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants