Unofficial notes for setting up and configuring NVIDIA BlueField DPUs on custom server systems (non-DGX platforms), including support for Proxmox VE.
Linux Systems (Host/DPU/BMC):
- Host: The system running on the server.
  - In the optional Proxmox VE section, it will be further divided into PVE and VM.
  - In the remaining sections, the host refers to the server's operating system, regardless of whether it's running directly on hardware or within a VM on Proxmox VE.
- DPU: The system running on the DPU.
- BMC: The system running on the DPU's board management controller. This is an independent system that provides out-of-band management capabilities, separate from the DPU's main operating system.
- Server: G493-ZB3-AAP1-rev-1x [ref]
- BlueField-3 DPU: B3210E 900-9D3B6-00SC-EA0 [ref]
- V100 GPU: Tesla V100-PCIE-16GB
References:
- Supported Servers and Power Cords
- BlueField-3 Validated and Supported Cables and Modules
- BlueField-3 DPUs require a supplementary 8-pin ATX power supply connection, available through the external power supply connector.
  Do not connect the CPU power cable to the BlueField-3 DPU PCIe ATX power connector, as their pin configurations differ. Using the CPU power cable in this manner is strictly prohibited and can damage the BlueField-3 DPU. Please refer to External PCIe Power Supply Connector Pins for the external PCIe power supply pins.
- DPU BMC 1GbE interface connected to the management network via ToR
- Remote Management Controller (RMC) connected to DPU BMC 1GbE via ToR
  Info: RMC is the platform for data center infrastructure managers to manage DPUs.
- A DHCP server in the management network
- An NVQual-certified server
References:
- NVIDIA BlueField-3 Networking Platform User Guide
- BlueField-3 Administrator Quick Start Guide
- Hardware Installation and PCIe Bifurcation
- (Optional) Proxmox VE 8.2.2
- Host OS
- Operating System: Ubuntu 24.04.2 LTS
- Kernel: Linux 6.8.0-54-generic
- Architecture: x86-64
- DOCA-Host: 2.9.2 LTS [ref]
- BMC
- Operating System: NVIDIA Moonraker/RoyB BMC (OpenBMC Project Reference Distro) BF-24.01-5
- Kernel: Linux 5.15.50-e62bf17
- Architecture: arm
- DPU
- Operating System: Ubuntu 22.04.5 LTS
- Kernel: Linux 5.15.0-1035-bluefield
- Architecture: arm64
- DOCA-BlueField: 2.9.2 [ref]
- Mode: DPU Mode [ref]
- DPU image and firmware: [ref]
$ sudo bfvcheck
Beginning version check...
-RECOMMENDED VERSIONS-
ATF: v2.2(release):4.9.2-14-geeb9a6f94
UEFI: 4.9.2-25-ge0f86cebd6
FW: 32.43.2566
-INSTALLED VERSIONS-
ATF: v2.2(release):4.9.2-14-geeb9a6f94
UEFI: 4.9.2-25-ge0f86cebd6
FW: 32.43.2566
Version check complete.
No issues found.
- BlueField OS image version: [ref]
$ cat /etc/mlnx-release
bf-bundle-2.9.2-31_25.02_ubuntu-22.04_prod
Please skip to the next section if Proxmox VE is not used.
- IOMMU Setup
  - Ensure that IOMMU (VT-d or AMD-Vi) is enabled in the BIOS/UEFI.
    lscpu | grep Virtualization
  - AMD enables it by default; check it using the following command:
    for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU group %s ' "$n"; lspci -nns "${d##*/}"; done
    - It works when you see multiple groups, and you can check which devices are properly isolated (no other devices in the same group except PCI bridges) for PCI passthrough.
  - If it cannot be enabled, modify the GRUB configuration. Locate GRUB_CMDLINE_LINUX_DEFAULT and set it as follows (then apply the change as sketched below):
    # for AMD
    GRUB_CMDLINE_LINUX_DEFAULT="quiet iommu=pt"
    # for Intel
    GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
  - Verify whether IOMMU is enabled (though it's uncertain if this method works) by using:
    dmesg | grep -e DMAR -e IOMMU
    # works if it prints: DMAR: IOMMU enabled
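  - Apply the GRUB change and reboot before running the dmesg check above (a minimal sketch, assuming the standard Debian/Ubuntu GRUB tooling):
    sudo update-grub
    sudo reboot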
  - Check NIC info
    lspci -nn | grep -i bluefield
    lspci -nn | grep -i nvidia
- Proxmox VE Setup
  - Find PCI ID
    lspci -nn | grep -i mellanox
    lspci -nn | grep -i nvidia
    # Take note of the device's PCI address (e.g., 0000:03:00.0) and Vendor:Device ID (e.g., 15b3:xxxx).
  - Check vfio module
    lsmod | grep vfio
    - Enable it if there is no vfio module:
      echo "vfio" >> /etc/modules
      echo "vfio_iommu_type1" >> /etc/modules
      echo "vfio_pci" >> /etc/modules
      update-initramfs -u
      reboot
      dmesg | grep -i vfio
  - Configure VFIO: The BlueField card must be managed by vfio-pci to prevent the default driver from automatically loading.
    nano /etc/modprobe.d/vfio.conf
    # vendor:device ID
    options vfio-pci ids=15b3:a2d2
    # update
    update-initramfs -u
    # Not sure if this is needed
    softdep mlx5_core pre: vfio-pci
    # Blacklist default driver (edit /etc/modprobe.d/pve-blacklist.conf)
    blacklist mlx5_core
  - Reboot the system and verify that the PCI device is bound to vfio-pci (see also the sysfs check below):
    lspci -nnk -d 15b3:xxxx
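  - As an extra check, the bound driver can also be read from sysfs (a sketch; 0000:03:00.0 is the example PCI address from above, adjust to yours):
    # prints the driver currently bound to the device, e.g. vfio-pci
    basename "$(readlink /sys/bus/pci/devices/0000:03:00.0/driver)"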
- VM Setup
  - Create or stop the target VM, then add the following line in the Proxmox Web UI or directly edit the VM configuration file (e.g. /etc/pve/qemu-server/<VMID>.conf), replacing 0000:03:00.0 with the PCI address of your BlueField card (or use qm as sketched at the end of this list).
    hostpci0: 0000:03:00.0,pcie=1
    - If the card has multiple functions (multi-function device), you can add hostpci1, hostpci2, etc., or add multifunction=on (adjust as needed).
  - Check the VM
    lspci -nn | grep -i nvidia
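  - Alternatively, the same passthrough entry can be added from the PVE shell with qm (a sketch; <VMID> is a placeholder, adjust the PCI address as needed):
    qm set <VMID> -hostpci0 0000:03:00.0,pcie=1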
- Appendix
  - V100 Passthrough in Proxmox VE GUI: Datacenter > Resource Mappings > Add
  - DPU Passthrough in Proxmox VE GUI: VM > Hardware > Add > PCI Device
- References:
Execute the following commands on the host.
Check PCI devices:
lspci -nn | grep -i mellanox
lspci -nn | grep -i nvidia
Install common packages:
sudo apt-get update
# Install pv for viewing progress of the commands below
sudo apt-get install -y pv
(Optional) Uninstall old DOCA-Host: [ref]
for f in $( dpkg --list | grep -E 'doca|flexio|dpa-gdbserver|dpa-stats|dpaeumgmt' | awk '{print $2}' ); do echo $f ; sudo apt remove --purge $f -y ; done
sudo /usr/sbin/ofed_uninstall.sh --force
sudo apt-get autoremove
Install DOCA-Host (DPU Driver) 2.9.2 LTS [download]:
# DPU Driver (DOCA-Host)
wget https://www.mellanox.com/downloads/DOCA/DOCA_v2.9.2/host/doca-host_2.9.2-012000-24.10-ubuntu2404_amd64.deb
sudo dpkg -i doca-host_2.9.2-012000-24.10-ubuntu2404_amd64.deb
sudo apt-get update
sudo apt-get -y install doca-all
# Check DOCA-Host
dpkg -l | grep doca
# GPU Driver & CUDA
wget https://developer.download.nvidia.com/compute/cuda/12.8.0/local_installers/cuda_12.8.0_570.86.10_linux.run
sudo sh cuda_12.8.0_570.86.10_linux.run
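# (Assumption) For a runfile install, nvcc and the CUDA libraries typically land under /usr/local/cuda-12.8;
# add them to the shell environment if needed:
echo 'export PATH=/usr/local/cuda-12.8/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc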
# Check Driver
nvidia-smi
Fix macsec driver issue:
Strangely, Ubuntu 24.04's kernel binary package doesn't seem to include the macsec driver, which prevents mlx5_ib from loading. This may be observed by running sudo mst status -v, sudo dmesg | grep mlx5, and ibstatus.
2025/06/29 Update: An easier solution seems to be:
sudo apt-get install linux-modules-extra-$(uname -r)
Then we don't need to build the macsec driver ourselves.
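If you take this route, the verification and driver-reload steps from the build-it-yourself path below still apply; a condensed sketch:
# macsec module should now be available from the distro package
modinfo macsec
sudo modprobe macsec
# Reload the Mellanox driver so mlx5_ib can load (assumes nothing else is holding mlx5_core)
sudo rmmod mlx5_core
sudo modprobe mlx5_core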
To fix this issue, we build the macsec driver ourselves:
# Download macsec from kernel source
wget https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/drivers/net/macsec.c?h=v6.8 -O macsec.c
# Create Makefile
cat << 'EOF' > Makefile
obj-m += macsec.o
all:
make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
EOF
make
sudo cp macsec.ko /lib/modules/$(uname -r)/kernel/drivers/net
sudo depmod -a
# macsec module should be available
modinfo macsec
sudo modprobe macsec
lsmod | grep macsec
# Reload mlx5_core module
sudo rmmod mlx5_core
sudo modprobe mlx5_core
Make sure to re-compile the macsec module if you encounter the following error when running sudo modprobe macsec:
modprobe: ERROR: could not insert 'macsec': Exec format error
Connect to the DPU via RShim: [ref]
sudo systemctl enable --now rshim
sudo ip addr add 192.168.100.1/24 dev tmfifo_net0
ping 192.168.100.2
# connect to the DPU
ssh ubuntu@192.168.100.2
Change DPU to IB mode: [ref]
# Note that this can also be done on DPU
sudo mlxconfig -d /dev/mst/mt41692_pciconf0 set LINK_TYPE_P1=1
sudo mlxconfig -d /dev/mst/mt41692_pciconf0 set LINK_TYPE_P2=1
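# (Optional, assumption) confirm the new link type took effect before rebooting
sudo mlxconfig -d /dev/mst/mt41692_pciconf0 query | grep LINK_TYPE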
# Cold reboot the machine
Deploying DPU OS Using BFB from Host: [download] [ref]
# update DOCA-BlueField to 2.9.2
wget https://content.mellanox.com/BlueField/BFBs/Ubuntu22.04/bf-bundle-2.9.2-31_25.02_ubuntu-22.04_prod.bfb
sudo bfb-install --bfb bf-bundle-2.9.2-31_25.02_ubuntu-22.04_prod.bfb --rshim /dev/rshim0
(Optional, Unconfirmed) Update DPU Firmware: [download]
# update firmware
wget https://content.mellanox.com/BlueField/FW-Bundle/bf-fwbundle-2.9.2-31_25.02-prod.bfb
sudo bfb-install --bfb bf-fwbundle-2.9.2-31_25.02-prod.bfb --rshim rshim0
Other DOCA tools and commands for debugging:
cd /opt/mellanox/doca/tools
doca_caps --list-devs
doca_bench --device 01:00.0 --query device-capabilities
sudo ibdev2netdev -v
sudo mlxlink -d /dev/mst/mt41692_pciconf0
Execute the following commands on the DPU.
# Check BlueField OS image version
cat /etc/mlnx-release
# Check DOCA-BlueField
dpkg -l | grep doca
Update DPU firmware: [ref] [firmware-tools] [flint] [mlxfwmanager]
# check firmware
sudo mlxfwmanager --query
sudo flint -d /dev/mst/mt41692_pciconf0 q
# update firmware
sudo /opt/mellanox/mlnx-fw-updater/mlnx_fw_updater.pl
# force update
# sudo /opt/mellanox/mlnx-fw-updater/mlnx_fw_updater.pl --force-fw-update
# Need to cold reboot the machine
Launch OpenSM on the DPU for using InfiniBand on the host side. Before this step, running ibstat on the host will show State: Down and Physical state: LinkUp. After this step, running ibstat on the host will show State: Up.
# Get the `Node GUID` from the corresponding CA
ibstat
# Run OpenSM with the Node GUID to recognize virtual ports on the host.
sudo opensm -g <DPU_IB_NODE_GUID> -p 10
# If there's another OpenSM running on other hosts, make sure to set the priority higher than those.
# In our case, we have another OpenSM with priority 0 in the subnet, so we set our priority to 10.
InfiniBand in DPU Mode
In DPU Mode, when operating with an InfiniBand network, OpenSM must be executed from the BlueField Arm side rather than the host side. Similarly, InfiniBand management tools such as
sminfo, ibdev2netdev, and ibnetdiscover can only be used from the BlueField Arm side and are not accessible from the host side.
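As a quick sanity check after launching OpenSM (a sketch; sminfo ships with infiniband-diags and, per the above, must be run from the DPU Arm side):
# On the DPU: show the active subnet manager's LID, GUID, priority, and state
sudo sminfo
# On the host: the port state reported by ibstat should no longer be Down
ibstat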
Resetting DPU: [ref]
# Query for reset level required to load new firmware
sudo mlxfwreset -d /dev/mst/mt*pciconf0 q
Output of the query command:
Reset-levels:
0: Driver, PCI link, network link will remain up ("live-Patch") -Supported (default)
1: Only ARM side will not remain up ("Immediate reset"). -Not Supported
3: Driver restart and PCI reset -Supported
4: Warm Reboot -Supported
Reset-types (relevant only for reset-levels 1,3,4):
0: Full chip reset -Supported (default)
1: Phy-less reset (keep network port active during reset) -Not Supported
2: NIC only reset (for SoC devices) -Not Supported
3: ARM only reset -Not Supported
4: ARM OS shut down -Not Supported
Reset-sync (relevant only for reset-level 3):
0: Tool is the owner -Not supported
1: Driver is the owner -Supported (default)
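Based on the query output above, a supported level can then be applied; for example, a level-3 reset (driver restart and PCI reset) would look roughly like this (a sketch, not verified on this setup):
sudo mlxfwreset -d /dev/mst/mt41692_pciconf0 --level 3 -y reset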
Debugging:
# collect all debug message in host
sudo /usr/sbin/sysinfo-snapshot.py
References:
IPMI:
# Check sensors
ipmitool sdr
# Power control
ipmitool chassis power
# chassis power Commands: status, on, off, cycle, reset, diag, soft
# Check power status
ipmitool chassis status
# Control the BMC itself
ipmitool mc
Redfish:
# Check BMC version
curl -k -u 'root:<password>' -H 'Content-Type: application/json' -X GET https://<bmc_ip>/redfish/v1/UpdateService/FirmwareInventory/BMC_Firmware
References:
- Connecting to BMC Interfaces
- Reset Control
- Table of Common Redfish Commands
- NVIDIA BlueField Reset and Reboot Procedures
Given another host connected with InfiniBand, you can ping it from the DPU:
On the other host host2:
ibstat # check `Base lid`
sudo ibping -S
On the DPU:
sudo ibnetdiscover # You should see the same lid
ibstat # check `CA` and `Port`
sudo ibping -C <CA> -P <PORT> -L <LID>
# For example:
# sudo ibping -C mlx5_0 -P 1 -L 13
You can also switch the server and client roles by running ibping -S on the DPU and ibping -C <CA> -P <PORT> -L <LID> on the other host.
Please refer to the examples for more details.
Contributors: @tsw303005, @Aiden128, @YiPrograms, and @j3soon.
This note has been made possible through the support of LSA Lab, and NVIDIA AI Technology Center (NVAITC).