NVIDIA BlueField DPU: what are the four phases of the boot flow, and what does each one do?

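BlueField hardware overview

This article uses BlueField-2 as the example. The diagrams show that RSHIM is effectively the collection of boot-related functions, and that the eMMC holds two storage areas.

BlueField hardware block diagram (note in particular that RSHIM is a separate hardware unit):

BlueField interface diagram:

Boot flow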

The default BlueField bootstream (BFB) shown above is a standard boot BFB that is stored on the embedded Multi-Media Card (eMMC) as can be seen by the boot path that points to a GUID partition (GPT) on the eMMC device

Boot sequence:

After a reset (echo "SW_RESET 1" > /dev/rshim0/misc), execution first enters BL1 in the boot ROM.

Reference: https://docs.nvidia.com/networking/display/bluefielddpuosv385/upgrading+boot+software
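The reset step above can be wrapped in a small helper. This is only a sketch: the function name rshim_reset and its optional path argument are made up for illustration, and it assumes the rshim driver exposes the misc file at /dev/rshim0/misc, as in the command quoted above.

```shell
# Hypothetical helper (not part of any NVIDIA tool): trigger a software
# reset of the DPU by writing "SW_RESET 1" to the rshim misc file.
# $1: path to the misc file; defaults to the path used in this article.
rshim_reset() {
    misc="${1:-/dev/rshim0/misc}"
    # Stop early if the rshim device is absent or not writable, for
    # example when the rshim driver is not loaded or we are not root.
    if [ ! -w "$misc" ]; then
        echo "rshim misc file not writable: $misc" >&2
        return 1
    fi
    echo "SW_RESET 1" > "$misc"
}
```

Calling rshim_reset with no argument targets the first rshim device; passing another path (for example a second rshim instance, if one exists on the host) is left to the caller.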

The two storage areas in the eMMC:

When booting from eMMC, these stages make use of two different types of storage within the eMMC part:

• ATF and UEFI are loaded from a special area known as an eMMC boot partition. Data from a boot partition is automatically streamed from the eMMC device to the eMMC controller under hardware control during the initial boot-up. Each eMMC device has two boot partitions, and the partition which is used to stream the boot data is chosen by a nonvolatile configuration register in the eMMC.

• The operating system, applications, and user data come from the remainder of the chip, known as the user area. This area is accessed via block-size reads and writes, done by a device driver or similar software routine.

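Booting from eMMC uses two types of storage within the eMMC:

  • The boot partition. During the initial boot-up, data is streamed automatically from the eMMC device to the eMMC controller under hardware control (no software is involved).
  • The user area, which holds the operating system and data. It is accessed through block-size reads and writes, which requires a device driver or a similar software routine.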
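To make the two storage areas concrete, the sketch below classifies eMMC device-node names the way the Linux MMC block driver conventionally names them: mmcblkNboot0 and mmcblkNboot1 for the two boot partitions, mmcblkNpM for partitions in the user area. The function name emmc_area is invented for illustration, and the assumption that the OS on the DPU's Arm cores uses these names is not confirmed by the source.

```shell
# Hypothetical helper: classify a Linux eMMC device-node name by the
# storage area it belongs to. Pure string matching; it touches no device.
emmc_area() {
    case "$1" in
        mmcblk[0-9]*boot[01]) echo "boot partition" ;;
        mmcblk[0-9]*rpmb)     echo "RPMB (neither boot nor user area)" ;;
        mmcblk[0-9]*p[0-9]*)  echo "user area partition" ;;
        mmcblk[0-9]*)         echo "user area (whole device)" ;;
        *)                    echo "not an eMMC node" ;;
    esac
}
```

For example, emmc_area mmcblk0boot0 prints "boot partition", while emmc_area mmcblk0p2 prints "user area partition".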

About ATF

ATF is used in Armv8 systems for booting the chip and then providing secure interfaces. It implements various Arm interface standards like PSCI (Power State Coordination Interface), SMC (Secure Monitor Call) and TBBR (Trusted Board Boot Requirements). ATF is used as the primary bootloader to load UEFI (Unified Extensible Firmware Interface) on the BlueField platform.

ATF is the primary bootloader: it loads UEFI, and it does so by implementing standard Arm interfaces.

The four phases of the boot flow:

The four main steps:

The BlueField™ boot flow is comprised of 4 main phases:

• Hardware loads Arm Trusted Firmware (ATF)

• ATF loads UEFI---together ATF and UEFI make up the booter software

• UEFI loads the operating system, such as the Linux kernel

• The operating system loads applications and user data

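  • Phase 1: hardware loads the ATF firmware. The entry point is BL1, commonly called the boot ROM, which the hardware fetches and executes directly; it cannot be modified after tape-out.
  • Phase 2: ATF loads UEFI. The early ATF stages generally run from on-chip SRAM, which, unlike DDR, is usable without being initialized first, so after reset ATF is loaded into SRAM and runs there directly.
  • Phase 3: UEFI loads the operating system.
  • Phase 4: the operating system loads user applications.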

ATF has various bootloader stages when loading:

• BL1 -- BL1 is stored in the on-chip boot ROM; it is executed when the primary core is reset. Its main functionality is to do some initial architectural and platform initialization to the point where it can load the BL2 image, then it loads BL2 and switches execution to it.

• BL2 -- BL2 is loaded and then executed on the on-chip boot SRAM. Its main functionality is to perform the rest of the low-level architectural and platform initialization (e.g. initializing DRAM, setting up the System Address Mapping and calculating the Physical Memory Regions). It then loads the rest of the boot images (BL31, BL33). After loading the images, it traps itself back to BL1 via an SMC, which in turn switches execution to BL31.

• BL31 -- BL31 is known as the EL3 Runtime Software. It is loaded to the boot RAM. Its main functionality is to provide low-level runtime service support. After it finishes all its runtime software initialization, it passes control to BL33.

• BL33 -- BL33 is known as the Non-trusted Firmware. For this case we are using EDK2 (Tianocore) UEFI. It is in charge of loading and passing control to the OS. For more detail on this, please see the EDK2 source.
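The hand-off order described in the list above can be written down as a tiny lookup. This is purely illustrative: next_stage and boot_chain are made-up names, and the code models only the order in which control is passed, not what the firmware does.

```shell
# Illustrative only: print the stage that receives control after the
# given stage finishes, following the ATF description quoted above.
next_stage() {
    case "$1" in
        BL1)  echo "BL2" ;;
        BL2)  echo "BL31" ;;  # via an SMC back into BL1, which switches to BL31
        BL31) echo "BL33" ;;
        BL33) echo "OS" ;;
        *)    return 1 ;;     # no known successor
    esac
}

# Walk the chain from reset until there is no further hand-off.
boot_chain() {
    stage="BL1"
    chain="$stage"
    while stage="$(next_stage "$stage")"; do
        chain="$chain -> $stage"
    done
    echo "$chain"
}
```

Running boot_chain prints the stages in order, from BL1 through to the OS.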

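  • BL1: stored in the on-chip boot ROM. It performs the initial architectural and platform initialization up to the point where BL2 can be loaded, then loads BL2 and hands execution over to it. In the log from a real board it prints a single line: Mellanox BlueField-2 A1 BL1 V1.1
  • BL2: runs from on-chip SRAM. It completes the low-level architectural and platform initialization, for example initializing DRAM (the failure example later in this article is a DRAM initialization failure), setting up the system address mapping, and calculating the physical memory regions. It then loads BL31 and BL33. When it finishes, it traps back into BL1 via an SMC, and BL1 switches execution to BL31. In the log this stage shows up as: NOTICE: Finished initializing DDR
  • BL31: the EL3 runtime software. It is loaded into the boot RAM and provides low-level runtime services, then passes control to BL33.
  • BL33: the non-trusted firmware, here EDK2 (Tianocore) UEFI. It loads the OS and hands control to it. The GNU GRUB version 2.04 banner in the log appears at this point, because GRUB is started by UEFI rather than by BL31. For more detail, see the EDK2 documentation: https://github.com/tianocore/tianocore.github.io/wiki/EDK-II-User-Documentation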
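As a rough aid when reading a console log, the example lines quoted in this article can be mapped back to a stage. The sketch below is an assumption-heavy illustration: stage_of_line is a made-up name, the patterns come only from the three example lines above, and real logs contain many lines it will not recognise. GRUB is started by UEFI, so its banner is treated as appearing after BL33.

```shell
# Hypothetical helper: guess the boot stage from one console log line,
# using only the example lines quoted in this article.
stage_of_line() {
    case "$1" in
        *"BL1 V"*)                     echo "BL1" ;;
        *"Finished initializing DDR"*) echo "BL2" ;;
        *"GNU GRUB"*)                  echo "after BL33 (UEFI)" ;;
        *)                             echo "unknown" ;;
    esac
}
```

For instance, stage_of_line "NOTICE: Finished initializing DDR" prints BL2.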

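ARMv7 and ARMv8 take completely different approaches to the boot flow. ARMv8 has to support secure boot, do the corresponding handling at each exception level, and leave SoC vendors some configurable flexibility, so its boot design introduces additional concepts and is accordingly much more complex at the design level than ARMv7 and earlier.

For a very detailed walkthrough of the Arm boot flow, see: https://github.com/carloscn/blog/issues/65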

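Boot files each phase depends on

Reading the boot flow from the output of a failed firmware flash

The boot got stuck in the BL2 stage with: ERROR: DDR Values not val

Enable the rshim log to see a summary: BL2 starts and then fails. (To enable the log: echo "DISPLAY_LEVEL 2" > /dev/rshim0/misc, then view it with: cat /dev/rshim0/misc)

The log shows BL2 reporting the boot mode as emmc, followed by the eMMC boot failing.

In particular, it is clear that UEFI is corrupted.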
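The two commands above can be combined into a helper that switches the misc file to its log view and prints the first error. This is a sketch under stated assumptions: first_error and rshim_first_error are hypothetical names, the default path is the one used in this article, and error lines are assumed to contain the literal text ERROR, as in the message quoted above.

```shell
# Print the first line containing "ERROR" from a log read on stdin.
first_error() {
    grep 'ERROR' | head -n 1
}

# Hypothetical wrapper: write "DISPLAY_LEVEL 2" to the rshim misc file,
# then read it back and print the first error line, if any.
# $1: path to the misc file; defaults to the path used in this article.
rshim_first_error() {
    misc="${1:-/dev/rshim0/misc}"
    if [ ! -w "$misc" ]; then
        echo "rshim misc file not writable: $misc" >&2
        return 1
    fi
    echo "DISPLAY_LEVEL 2" > "$misc"
    first_error < "$misc"
}
```

first_error also works on a saved copy of the log, for example first_error < rshim.log, which is handy when the DPU is no longer reachable.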

Log analysis:

Reference: https://docs.nvidia.com/networking/display/bluefielddpubspv422/logging

Similar errors reported for this kind of problem: "Memory Device: 0 BIST Failed" and "DDR BIST POST failed!"

https://forums.developer.nvidia.com/t/install-doca-on-bluefield-2-failed/231797

Summary

Understanding the boot flow of the BlueField DPU helps a great deal in understanding BlueField's functional components and tools, and gives a clearer picture of how the overall DPU architecture is implemented.

References: https://docs.nvidia.com/networking/display/bfswtroubleshooting/software+installation+and+upgrade

https://docs.nvidia.com/networking/display/bluefieldbsp480/upgrading+boot+software#src-3094733907_UpgradingBootSoftware-UEFISystemConfiguration
