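This article analyzes the structural design of the ROCm HIP library. It covers a lot of material, so it also works as a reference.
1. Project organization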
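| Layer | Directory | Description |
|---|---|---|
| hip (public repository) | include/hip/ | Public API headers (hip_runtime.h, hip_runtime_api.h, hiprtc.h) |
| hip (public repository) | bin/ | hipcc and hipconfig tools |
| hip (public repository) | cmake/ | CMake configuration |
| CLR (implementation repository) | hipamd/src/ | HIP AMD backend implementation (hip_*.cpp, hip_internal.hpp) |
| CLR (implementation repository) | hipamd/hiprtc/ | Runtime compilation implementation |
| CLR (implementation repository) | rocclr/device/ | Device abstraction (amd::Device base class) |
| CLR (implementation repository) | rocclr/device/rocm/ | ROCm backend (roc::Device → HSA API) |
| CLR (implementation repository) | rocclr/device/pal/ | PAL backend (pal::Device) |
| CLR (implementation repository) | rocclr/platform/ | Platform abstraction (runtime, context, memory, command) |
| CLR (implementation repository) | rocclr/thread/ | Threading and synchronization primitives |
| CLR (implementation repository) | opencl/ | OpenCL runtime (shares rocclr) |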
A useful way to understand HIP is to view it as four cooperating facets:
| Facet | Main contents | Role |
|---|---|---|
| API compatibility | hip_runtime_api.h, hip_runtime.h | Gives applications a CUDA-like C/C++ API |
| Compilation tooling | hipcc, CMake package, device headers | Compiles HIP source into host objects plus GPU code objects |
| Runtime objects | hip::Device, hip::Stream, hip::Event, memory/module/graph | Maintains CUDA runtime semantics inside the process |
| Backend execution | ROCclr, ROCr, libhsakmt, KFD | Turns runtime objects into real GPU queues, memory and commands |
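So projects/hip/ is more of a "facade for the API and tools"; projects/clr/hipamd/ is the main implementation of the HIP Runtime on AMD platforms; and rocclr/ is the device-runtime infrastructure shared by HIP and OpenCL.
2. Layered architecture
The software stack splits, top to bottom, into four segments: API, abstraction, backend and kernel. A single API call from the user descends through this chain layer by layer: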
Layered call path, top to bottom:
- User application (CUDA/HIP code), which calls hipMalloc / hipMemcpy / hipMemAdvise.
- HIP API layer (hipamd/src/hip_*.cpp): parameter validation, logging, TLS error state and the ihip* internal implementations. It calls down through the amd::Device virtual interface.
- ROCclr abstraction layer (rocclr/): amd::Device, amd::Memory, amd::Context, amd::CommandQueue and the asynchronous amd::Command model. Each backend implements this layer.
- Backends: rocm/ (roc::Device → HSA API) and pal/ (pal::Device → PAL API).
- ROCr Runtime (HSA): hsa_amd_svm_attributes_set and similar calls.
- libhsakmt (thunk) → KFD ioctl.
- KFD kernel driver.
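Each layer depends only on the abstract interface of the layer below it, so the backend (ROCm / PAL) can be swapped wholesale without affecting the upper-level API. By design, the HIP layer does not translate each API directly into HSA calls; hipamd first maintains a CUDA-compatible runtime object model and then maps those objects onto ROCclr's generic abstractions. The three levels of objects map as follows: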
| HIP Runtime object | ROCclr object | Backend object |
|---|---|---|
| hip::Device | amd::Context + amd::Device | roc::Device / pal::Device |
| hip::Stream | amd::HostQueue | roc::VirtualGPU / queue |
| hip::Event | amd::Event / amd::Command | HSA signal / HW event |
| hipMalloc pointer | amd::Memory | VRAM/GTT/SVM allocation |
| hipModule_t / kernel | amd::Program / device::Kernel | code object / HSA executable |
| hipGraph_t | graph node DAG | capture/replay command sequence |
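HIP's responsibilities therefore come down to three things:
- Emulating CUDA Runtime semantics outward: API names, error codes, the default stream and device/context behavior are aligned with CUDA as closely as possible.
- Managing HIP object lifetimes inward: device, stream, event, memory pool, module, graph and similar objects are organized at the HIP layer first.
- Reusing ROCclr capabilities downward: the actual devices, memory, queues, commands and events are handled by ROCclr, and the ROCm backend then converts these abstractions into HSA/ROCr operations.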
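The layering idea (the API layer talks only to an abstract device interface, so backends can be swapped) can be illustrated with a small standalone sketch. Everything in the `sketch` namespace is hypothetical and only plays the role of the real classes; it is not the hipamd or ROCclr code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace sketch {

// Hypothetical stand-in for the abstract device interface
// (the role amd::Device plays in the text above).
class Device {
 public:
  virtual ~Device() = default;
  virtual std::string allocate(std::size_t bytes) = 0;
};

// Hypothetical ROCm-style backend.
class RocDevice : public Device {
 public:
  std::string allocate(std::size_t bytes) override {
    return "HSA allocation of " + std::to_string(bytes) + " bytes";
  }
};

// Hypothetical PAL-style backend.
class PalDevice : public Device {
 public:
  std::string allocate(std::size_t bytes) override {
    return "PAL allocation of " + std::to_string(bytes) + " bytes";
  }
};

// API-layer entry point: validates its arguments, then calls only the
// abstract interface. It never names a concrete backend.
std::string api_malloc(Device* dev, std::size_t bytes) {
  if (dev == nullptr || bytes == 0) return "invalid value";
  return dev->allocate(bytes);
}

}  // namespace sketch
```

Passing a `RocDevice` or a `PalDevice` to `api_malloc` changes the backend behavior without touching the API-layer function, which is the property the layering is meant to provide.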
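3. Core concepts of the HIP layer and their relationships
3.1 hip::Device: the bridge between the HIP current device and the ROCclr Context
hip::Device is the most central runtime object in the HIP layer. It is not the underlying GPU agent itself but HIP's wrapper around a device context: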
```cpp
class Device : public amd::ReferenceCountedObject {
  amd::Context* context_;         // ROCclr context
  int deviceId_;                  // HIP device ordinal
  Stream* null_stream_;           // legacy default stream
  MemoryPool* default_mem_pool_;
  ExecutionCtx* primaryExecCtx_;
};
```
Key relationships:
- hip::Device → amd::Context (via asContext)
- amd::Context → amd::Device (via devices()[0]); amd::Device carries hardware information, memory attributes and the HSA agent
- hip::Device → hip::Stream (via NullStream)
- hip::Device → hip::MemoryPool (memory pools)
- hip::Device → hip::ExecutionCtx (via primaryExecCtx)
This also explains why HIP APIs so often start by fetching:
```cpp
hip::getCurrentDevice()->devices()[0]
```
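The first half, hip::getCurrentDevice(), returns the HIP current-device object; only the second half, devices()[0], reaches ROCclr's amd::Device, from which the code goes on to access hardware information, memory attributes, the HSA agent and other backend capabilities.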
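The two-step access can be mimicked with a toy model: a wrapper device that owns a context pointer and forwards `devices()` to it. The class names below are hypothetical stand-ins, and the assumption that the wrapper simply forwards to its context is made for illustration only.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Plays the role of amd::Device: the backend-facing device object.
struct BackendDevice {
  std::string name;
};

// Plays the role of amd::Context: it knows its backend devices.
class Context {
 public:
  explicit Context(std::vector<BackendDevice*> devs) : devices_(std::move(devs)) {}
  const std::vector<BackendDevice*>& devices() const { return devices_; }

 private:
  std::vector<BackendDevice*> devices_;
};

// Plays the role of hip::Device: a wrapper holding a context and an ordinal.
class HipDevice {
 public:
  HipDevice(Context* ctx, int id) : context_(ctx), device_id_(id) {}
  // Assumption for this sketch: forwards to the wrapped context.
  const std::vector<BackendDevice*>& devices() const { return context_->devices(); }
  int deviceId() const { return device_id_; }

 private:
  Context* context_;
  int device_id_;
};

// Each thread has its own "current device" slot.
inline HipDevice*& current_device() {
  thread_local HipDevice* dev = nullptr;
  return dev;
}

}  // namespace sketch
```

With a device installed in the slot, `sketch::current_device()->devices()[0]` first resolves the per-thread wrapper and only then steps into the backend-facing object, mirroring the shape of the expression above.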
3.2 hip::Stream: the HIP stream as a wrapper over the ROCclr HostQueue
hip::Stream inherits from amd::HostQueue, so a HIP stream is essentially a ROCclr command queue wrapped in HIP semantics:
```cpp
class Stream : public amd::HostQueue {
  Device* device_;
  Priority priority_;
  unsigned int flags_;
  bool null_;
  uint64_t stream_id_;
};
```
It carries several kinds of semantics:
| HIP semantics | How it shows up in hip::Stream |
|---|---|
| default stream | null_, NullStream(), PER_THREAD_DEFAULT_STREAM |
| stream priority | Priority::{High, Normal, Low} |
| non-blocking stream | flags_ |
| stream capture | captureStatus_, pCaptureGraph_, captureID_ |
| command ordering | Inherits amd::HostQueue and submits commands to the ROCclr queue |
In other words, a HIP stream is the adapter layer between CUDA-compatible semantics and the ROCclr command queue.
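A minimal sketch of the "stream is a queue plus HIP semantics" idea follows. The base class is a hypothetical in-order queue, not amd::HostQueue, and commands are reduced to strings so that ordering is easy to observe.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Plays the role of the generic in-order command queue.
class HostQueue {
 public:
  virtual ~HostQueue() = default;
  void submit(std::string cmd) { pending_.push_back(std::move(cmd)); }
  // Drains all pending commands in submission order and returns them.
  std::vector<std::string> finish() {
    std::vector<std::string> done(pending_.begin(), pending_.end());
    pending_.clear();
    return done;
  }

 private:
  std::deque<std::string> pending_;
};

enum class Priority { High, Normal, Low };

// Plays the role of the HIP stream: the same queue, plus CUDA-style attributes.
class Stream : public HostQueue {
 public:
  Stream(Priority priority, unsigned int flags, bool is_null)
      : priority_(priority), flags_(flags), null_(is_null) {}
  Priority priority() const { return priority_; }
  unsigned int flags() const { return flags_; }
  bool isNull() const { return null_; }

 private:
  Priority priority_;
  unsigned int flags_;
  bool null_;
};

}  // namespace sketch
```

Because `Stream` is a `HostQueue`, anything that accepts a queue reference can take a stream unchanged; the priority, flags and null-stream marker are extra state that only the HIP-facing side consults.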
3.3 hip::Event: a synchronization point bound to a ROCclr Event/Command
hip::Event holds the underlying amd::Event* and implements record, wait, query and synchronize through the command/event mechanism:
```cpp
class Event {
  uint32_t flags_;
  amd::Event* event_;
  int device_id_;
};
```
The typical relationship is:
hipEventRecord(event, stream) → insert a marker/command on the hip::Stream → the command produces an amd::Event → the hip::Event binds this amd::Event
So an event is not an entity that performs work on its own; it is a synchronization marker in the stream's command sequence.
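The "event as a marker in the command sequence" idea can be sketched with sequence numbers. This is a simplified, hypothetical model: the stream only counts submitted and completed commands, and treating a never-recorded event as ready is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

namespace sketch {

// A stream reduced to two counters: how many commands were submitted
// and how many the (simulated) device has completed so far.
class Stream {
 public:
  // Enqueues one command and returns its 1-based sequence number.
  std::uint64_t enqueue() { return ++submitted_; }
  // Simulates the device finishing up to n more commands, in order.
  void execute(std::uint64_t n) { completed_ = std::min(submitted_, completed_ + n); }
  std::uint64_t completed() const { return completed_; }

 private:
  std::uint64_t submitted_ = 0;
  std::uint64_t completed_ = 0;
};

// An event performs no work itself; it remembers a position in a stream.
class Event {
 public:
  // record: insert a marker command and remember where it sits.
  void record(Stream& stream) {
    stream_ = &stream;
    marker_ = stream.enqueue();
  }
  // query: ready once the stream has completed everything up to the marker.
  bool ready() const { return stream_ == nullptr || stream_->completed() >= marker_; }

 private:
  const Stream* stream_ = nullptr;
  std::uint64_t marker_ = 0;
};

}  // namespace sketch
```

Work enqueued after `record` does not affect the event: it becomes ready as soon as the commands ahead of the marker, and the marker itself, have completed.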
3.4 Memory: behind the pointer API is amd::Memory
What HIP exposes externally is raw pointers:
```cpp
hipMalloc(&ptr, size);
hipMemcpy(dst, src, size, kind);
hipMemAdvise(ptr, size, advice, device);
```
Internally, however, the runtime has to map the user pointer back to a ROCclr memory object:
```cpp
amd::Memory* getMemoryObject(hip::Device* device, const void* ptr, size_t& offset);
// Lookup: amd::MemObjMap::FindMemObj(ptr) -> amd::Memory*
```
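This design matters: the HIP API looks like a "pointer-style API", but what the runtime actually operates on is amd::Memory, which carries metadata. That metadata includes the allocation type, the owning context, the device placement, SVM/HMM attributes, the offset and so on.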
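One common way to implement such a pointer-to-object lookup is an ordered map keyed by allocation base address. The sketch below uses that technique with hypothetical types; it is not claimed to match how amd::MemObjMap is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

namespace sketch {

// Plays the role of the memory object: an allocation plus metadata.
struct Memory {
  std::size_t size;  // allocation size in bytes
  int device_id;     // example metadata: where the allocation lives
};

class MemObjMap {
 public:
  // Registers an allocation under its base address.
  void add(const void* base, Memory* mem) {
    map_[reinterpret_cast<std::uintptr_t>(base)] = mem;
  }

  // Finds the allocation containing ptr. On success, offset is set to the
  // distance from the allocation base. Returns nullptr if ptr is unknown.
  Memory* find(const void* ptr, std::size_t& offset) const {
    const auto addr = reinterpret_cast<std::uintptr_t>(ptr);
    auto it = map_.upper_bound(addr);  // first allocation starting after addr
    if (it == map_.begin()) return nullptr;
    --it;  // last allocation starting at or before addr
    if (addr - it->first >= it->second->size) return nullptr;  // past its end
    offset = addr - it->first;
    return it->second;
  }

 private:
  std::map<std::uintptr_t, Memory*> map_;
};

}  // namespace sketch
```

An interior pointer such as `base + 40` resolves to the same `Memory` object as `base`, with `offset` set to 40, which is why a signature like `getMemoryObject` needs the extra `offset` output.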
3.5 Module / Kernel: loading and executing code objects
HIP kernels come from two sources:
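- Offline compilation: hipcc compiles the source into an executable or shared object that contains the code object.
- Runtime compilation: libhiprtc.so compiles HIP source at run time and produces a loadable code object.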
Whichever the source, execution at run time goes through the module/kernel path:
- Load: hipModuleLoad / hipModuleLoadData → load the code object → create or look up the amd::Program → obtain the kernel symbol (device::Kernel).
- Execute: hipModuleLaunchKernel / hipLaunchKernel → parse grid/block/sharedMem/stream/args → create an NDRangeKernelCommand → submit to the hip::Stream / amd::HostQueue.
- The load path feeds the execute path: the kernel symbols it resolves are what execution looks up.
The HIP layer therefore converts CUDA-style kernel launch parameters into the program/kernel/command model that ROCclr understands.
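As a rough illustration of that conversion, the sketch below turns a CUDA-style grid/block pair into an NDRange-style command and submits it to a queue. It assumes the command wants an OpenCL-style global size (grid × block per dimension); the types and the `launch_kernel` helper are hypothetical and much simpler than the real launch path.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

namespace sketch {

struct Dim3 {
  std::uint32_t x, y, z;
};

// Plays the role of an NDRange kernel command.
struct NDRangeCommand {
  std::array<std::size_t, 3> global_size;  // total work-items per dimension
  std::array<std::size_t, 3> local_size;   // work-items per block
  std::size_t shared_mem_bytes;
  std::vector<void*> args;
};

class Stream {
 public:
  void submit(NDRangeCommand cmd) { commands_.push_back(std::move(cmd)); }
  const std::vector<NDRangeCommand>& commands() const { return commands_; }

 private:
  std::vector<NDRangeCommand> commands_;
};

// Validates the launch configuration, builds the command, submits it.
// Returns false (and submits nothing) if any dimension is zero.
bool launch_kernel(Stream& stream, Dim3 grid, Dim3 block,
                   std::size_t shared_mem_bytes, std::vector<void*> args) {
  if (grid.x == 0 || grid.y == 0 || grid.z == 0) return false;
  if (block.x == 0 || block.y == 0 || block.z == 0) return false;

  NDRangeCommand cmd;
  // Assumption: global size = grid * block in each dimension.
  cmd.global_size = {static_cast<std::size_t>(grid.x) * block.x,
                     static_cast<std::size_t>(grid.y) * block.y,
                     static_cast<std::size_t>(grid.z) * block.z};
  cmd.local_size = {block.x, block.y, block.z};
  cmd.shared_mem_bytes = shared_mem_bytes;
  cmd.args = std::move(args);
  stream.submit(std::move(cmd));
  return true;
}

}  // namespace sketch
```

A launch with grid (4, 2, 1) and block (64, 1, 1) becomes a single command whose global size is (256, 2, 1) and whose local size is (64, 1, 1).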
3.6 TLS: thread-local state for the current device, error state and default stream
The HIP runtime uses thread-local state to hold each thread's own CUDA-compatible context:
```cpp
class TlsAggregator {
  Device* device_;                      // current device
  std::stack<Device*> ctxt_stack_;      // CUDA-style context stack
  hipError_t last_error_;               // sticky last error
  std::stack<ihipExec_t> exec_stack_;   // kernel launch configuration stack
  StreamPerThread stream_per_thread_obj_;
};
```
This explains several familiar behaviors:
| 行为 | HIP 内部依据 |
|---|---|
hipGetLastError() 每线程独立 |
thread_local hip::tls.last_error_ |
hipSetDevice() 只影响当前线程 |
thread_local hip::tls.device_ |
| per-thread default stream | StreamPerThread 每线程懒创建 |
hipConfigureCall/hipSetupArgument 老式 launch API |
exec_stack_ 保存临时 launch 配置 |
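The per-thread behavior in the table comes from C++ `thread_local` storage. The following is a minimal sketch of that mechanism only; `TlsSketch` and the `...Sketch` functions are hypothetical names, integers stand in for `hipError_t` and device handles, and the reset-on-read in `getLastErrorSketch` mirrors the documented behavior of `hipGetLastError`.

```cpp
#include <cassert>
#include <thread>

// Hypothetical, much-reduced analogue of the TlsAggregator shown above.
struct TlsSketch {
  int device = 0;     // current device index for this thread
  int lastError = 0;  // 0 plays the role of hipSuccess
};

thread_local TlsSketch tlsSketch;  // one independent copy per thread

void setDeviceSketch(int device) {
  assert(device >= 0);
  tlsSketch.device = device;
}

int getDeviceSketch() { return tlsSketch.device; }

void recordErrorSketch(int err) { tlsSketch.lastError = err; }

// Returns the calling thread's last error and resets it to "success".
int getLastErrorSketch() {
  int err = tlsSketch.lastError;
  tlsSketch.lastError = 0;
  return err;
}

// Sets a device and records an error on a second thread, then reports the
// device that thread observed. The caller's own state must stay untouched.
int deviceSeenByOtherThread(int device) {
  int seen = -1;
  std::thread worker([&seen, device] {
    setDeviceSketch(device);
    recordErrorSketch(1);
    seen = getDeviceSketch();
  });
  worker.join();
  return seen;
}
```

After `deviceSeenByOtherThread(5)` returns, the calling thread still sees its own device and an empty error slot, because the worker wrote to its own copy of `tlsSketch`.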
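4. Typical API call chains
A few common APIs show how the HIP layer maps CUDA-compatible interfaces onto ROCclr. The four chains have different entry points, but they all follow the same pattern of "validate arguments → find the object → build a command or set an attribute → hand off to the backend":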
Flowchart (common path of the four call chains):
hip* API entry → HIP_INIT_API (argument validation + logging) → ihip* internal implementation → hip::getCurrentDevice → getMemoryObject (find the amd::Memory for ptr) → branch on action type
allocate → create amd::Memory → register ptr→Memory
copy → Read/Write/CopyMemoryCommand → submit to hip::Stream (= amd::HostQueue)
kernel → NDRangeKernelCommand → submit to hip::Stream (= amd::HostQueue) → backend converts to HSA AQL packet
SVM attributes → amd::Device::SetSvmAttributes → ROCm backend → HSA SVM interface
The following subsections walk through each one.
4.1 hipMalloc
Flow: hipMalloc(ptr, size) → HIP_INIT_API (argument checks) → ihipMalloc(ptr, size, flags) → hip::getCurrentDevice (amd::Context/Device) → create amd::Memory → return the raw pointer and register ptr→amd::Memory internally
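The "raw pointer outside, metadata inside" idea can be sketched on the host as follows. This is an illustration under stated assumptions, not the `ihipMalloc` implementation: `MemorySketch`, `allocTableSketch`, `mallocSketch` and `freeSketch` are hypothetical names, `std::malloc` stands in for a device allocation, and integer return codes stand in for `hipError_t`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>

// Hypothetical stand-in for amd::Memory: metadata kept per allocation.
struct MemorySketch {
  std::size_t size;
  unsigned flags;
  int device;
};

// Hypothetical pointer table: raw pointer -> allocation metadata.
std::map<void*, MemorySketch> allocTableSketch;

// Validate, allocate, register, then hand back only a raw pointer.
// Returns 0 on success and 1 on invalid arguments or allocation failure.
int mallocSketch(void** ptr, std::size_t size, unsigned flags, int currentDevice) {
  if (ptr == nullptr) return 1;
  if (size == 0) {  // sketch choice: a zero-size request yields a null pointer
    *ptr = nullptr;
    return 0;
  }
  void* p = std::malloc(size);  // host malloc stands in for a device allocation
  if (p == nullptr) return 1;
  allocTableSketch[p] = MemorySketch{size, flags, currentDevice};
  *ptr = p;
  return 0;
}

// Releases an allocation only if the pointer is a registered base address.
int freeSketch(void* ptr) {
  auto it = allocTableSketch.find(ptr);
  if (it == allocTableSketch.end()) return 1;
  allocTableSketch.erase(it);
  std::free(ptr);
  return 0;
}
```

The caller never sees `MemorySketch`; later calls that receive the pointer recover the metadata through the table, which is the role the ptr→amd::Memory registration plays in the flow above.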
4.2 hipMemcpyAsync
Flow: hipMemcpyAsync(dst, src, size, kind, stream) → resolve hipStream_t (a null stream maps to the default stream) → getMemoryObject (find the amd::Memory for src/dst) → create Read/Write/CopyMemoryCommand → submit to hip::Stream (= amd::HostQueue)
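Two steps of this flow lend themselves to a small sketch: resolving a null stream to the default stream, and choosing a command type from the copy kind. All names below are hypothetical, and the kind-to-command mapping is an assumption that follows OpenCL-style read/write naming rather than something taken from the HIP sources.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

enum class CopyKindSketch { HostToDevice, DeviceToHost, DeviceToDevice };
enum class CommandTypeSketch { WriteMemory, ReadMemory, CopyMemory };

// Hypothetical record of one queued transfer.
struct CopyRequestSketch {
  CommandTypeSketch type;
  void* dst;
  const void* src;
  std::size_t size;
};

struct CopyStreamSketch {
  std::deque<CopyRequestSketch> queue;
};

CopyStreamSketch defaultCopyStreamSketch;  // stands in for the default stream

// A null stream handle resolves to the default stream.
CopyStreamSketch* resolveStreamSketch(CopyStreamSketch* stream) {
  return stream != nullptr ? stream : &defaultCopyStreamSketch;
}

// Assumed mapping: "write" moves host data into device memory, "read" moves
// device data back to the host, "copy" stays on the device side.
CommandTypeSketch selectCommandSketch(CopyKindSketch kind) {
  switch (kind) {
    case CopyKindSketch::HostToDevice: return CommandTypeSketch::WriteMemory;
    case CopyKindSketch::DeviceToHost: return CommandTypeSketch::ReadMemory;
    case CopyKindSketch::DeviceToDevice: return CommandTypeSketch::CopyMemory;
  }
  return CommandTypeSketch::CopyMemory;  // not reached for valid enum values
}

// Builds the request and appends it to the resolved stream; nothing is copied yet.
void enqueueCopySketch(void* dst, const void* src, std::size_t size,
                       CopyKindSketch kind, CopyStreamSketch* stream) {
  assert(dst != nullptr && src != nullptr);
  resolveStreamSketch(stream)->queue.push_back(
      {selectCommandSketch(kind), dst, src, size});
}
```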
4.3 hipLaunchKernel
Flow: hipLaunchKernel(kernel, grid, block, args, sharedMem, stream) → resolve the current device and stream → look up the device::Kernel for the kernel → pack kernel arguments → create NDRangeKernelCommand → submit to the stream → backend converts to HSA AQL packet
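The "pack kernel arguments" step can be illustrated with a small host-side sketch. It assumes the CUDA-style convention that `args[i]` points at the i-th argument value, and it assumes per-argument size and alignment are known from kernel metadata; `ArgInfoSketch` and `packArgsSketch` are hypothetical names, and the real kernarg layout rules may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical per-argument metadata; a real runtime would read this from
// the code object rather than take it from the caller.
struct ArgInfoSketch {
  std::size_t size;
  std::size_t align;
};

// Copies each argument (args[i] points at the i-th value) into one contiguous
// buffer, padding so that every argument starts at a multiple of its alignment.
std::vector<unsigned char> packArgsSketch(const std::vector<ArgInfoSketch>& layout,
                                          void* const* args) {
  std::vector<unsigned char> buffer;
  for (std::size_t i = 0; i < layout.size(); ++i) {
    const std::size_t align = layout[i].align;
    assert(align != 0);
    // Round the current end of the buffer up to the next aligned offset.
    const std::size_t offset = (buffer.size() + align - 1) / align * align;
    buffer.resize(offset + layout[i].size, 0);
    std::memcpy(buffer.data() + offset, args[i], layout[i].size);
  }
  return buffer;
}
```

For a 4-byte integer followed by an 8-byte double with matching alignments, the integer lands at offset 0, four padding bytes follow, and the double lands at offset 8.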
4.4 hipMemAdvise
Flow: hipMemAdvise(ptr, count, advice, device) → HIP_INIT_API → ihipMemAdvise → getMemoryObject(ptr) (find the amd::Memory) → select the target amd::Device → amd::Device::SetSvmAttributes → ROCm backend calls the HSA SVM attribute interface
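This path illustrates a key point of HIP's SVM/HMM design: the API is called with a pointer range, but the actual state ultimately lives in process-level SVM range management in ROCr and KFD.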
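To make "pointer range in, process-level state out" concrete, here is a toy model. It is only an assumption-laden illustration: the real state is kept by ROCr and KFD, not by a user-space map, and the page size, the bitmask encoding, and all `...Sketch` names are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

constexpr std::uintptr_t kPageSizeSketch = 4096;  // assumed page granularity

// Hypothetical process-wide table: page number -> advice bitmask.
std::map<std::uintptr_t, unsigned> svmPageAdviceSketch;

// Applies an advice bit to every page overlapped by [addr, addr + count).
void adviseRangeSketch(std::uintptr_t addr, std::size_t count, unsigned adviceBit) {
  assert(adviceBit != 0);
  if (count == 0) return;
  const std::uintptr_t first = addr / kPageSizeSketch;
  const std::uintptr_t last = (addr + count - 1) / kPageSizeSketch;
  for (std::uintptr_t page = first; page <= last; ++page) {
    svmPageAdviceSketch[page] |= adviceBit;
  }
}

// Returns the advice bits recorded for the page containing addr (0 if none).
unsigned adviceAtSketch(std::uintptr_t addr) {
  auto it = svmPageAdviceSketch.find(addr / kPageSizeSketch);
  return it == svmPageAdviceSketch.end() ? 0u : it->second;
}
```

The caller passes an arbitrary byte range, yet the recorded state is keyed by pages and shared by the whole process, so a later query through any pointer into the same page sees the same advice.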
5. Core design patterns
5.1 Device abstraction (amd::Device)
cpp
// rocclr/device/device.hpp
namespace amd {
class Device : public RuntimeObject {
 public:
  // Pure virtual interface implemented by each backend
  virtual bool SetSvmAttributes(...) const = 0;
  virtual Memory* createBuffer(...) = 0;
  virtual CommandQueue* createQueue(...) = 0;
  // ...
};
}  // namespace amd

// rocclr/device/rocm/rocdevice.hpp
namespace roc {
class Device : public amd::Device {
  // ROCm/HSA backend implementation
  hsa_agent_t agent_;  // HSA agent handle
};
}  // namespace roc
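The pattern above is ordinary runtime polymorphism: callers hold a base-class pointer and the backend is chosen once at creation time. A self-contained sketch follows; the class names, the `backendName` method, and the return values are hypothetical and say nothing about what the real ROCm or PAL backends support.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical reduced device interface; the real amd::Device is far larger.
class DeviceSketch {
 public:
  virtual ~DeviceSketch() = default;
  virtual std::string backendName() const = 0;
  // Returns true if the backend accepted the attribute change.
  virtual bool setSvmAttributes(const void* ptr, std::size_t count) const = 0;
};

class RocDeviceSketch final : public DeviceSketch {
 public:
  std::string backendName() const override { return "rocm"; }
  bool setSvmAttributes(const void* ptr, std::size_t count) const override {
    return ptr != nullptr && count > 0;  // placeholder for a call into HSA
  }
};

class PalDeviceSketch final : public DeviceSketch {
 public:
  std::string backendName() const override { return "pal"; }
  bool setSvmAttributes(const void*, std::size_t) const override {
    return false;  // arbitrary choice for the sketch, not a statement about PAL
  }
};

// Upper layers only ever see DeviceSketch; the backend is picked here.
std::unique_ptr<DeviceSketch> makeDeviceSketch(bool usePal) {
  if (usePal) return std::make_unique<PalDeviceSketch>();
  return std::make_unique<RocDeviceSketch>();
}
```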
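5.2 Command model (amd::Command)
ROCclr uses an asynchronous command-queue model: every device operation is modeled as a class derived from amd::Command and is executed later, after it has been enqueued: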
[Figure: command class hierarchy] amd::Command is the base class for asynchronous commands. Its subclasses include:
- ReadMemoryCommand
- WriteMemoryCommand
- CopyMemoryCommand
- NDRangeKernelCommand
- SvmPrefetchAsyncCommand (SVM-related)
- ... other commands
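To make the hierarchy concrete, here is a minimal sketch in plain C++ of a polymorphic command base class and a queue that defers execution until it is flushed. Every name in it (`Command`, `CopyCommand`, `FillCommand`, `HostQueueSketch`) is hypothetical and only mimics the shape of `amd::Command`; it is not ROCclr code, and it runs commands synchronously on the host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <queue>
#include <utility>

// Hypothetical stand-in for an asynchronous command base class.
class Command {
 public:
  virtual ~Command() = default;
  virtual void submit() = 0;  // each subclass knows how to execute itself
};

// Copies `size` bytes from src to dst when submitted.
class CopyCommand : public Command {
 public:
  CopyCommand(void* dst, const void* src, std::size_t size)
      : dst_(dst), src_(src), size_(size) {}
  void submit() override { std::memcpy(dst_, src_, size_); }

 private:
  void* dst_;
  const void* src_;
  std::size_t size_;
};

// Fills `size` bytes at dst with `value` when submitted.
class FillCommand : public Command {
 public:
  FillCommand(void* dst, char value, std::size_t size)
      : dst_(dst), value_(value), size_(size) {}
  void submit() override { std::memset(dst_, value_, size_); }

 private:
  void* dst_;
  char value_;
  std::size_t size_;
};

// Hypothetical queue: enqueue() only records work, flush() executes it in order.
class HostQueueSketch {
 public:
  void enqueue(std::unique_ptr<Command> cmd) { pending_.push(std::move(cmd)); }

  // Runs every pending command in FIFO order and returns how many ran.
  std::size_t flush() {
    std::size_t executed = 0;
    while (!pending_.empty()) {
      pending_.front()->submit();
      pending_.pop();
      ++executed;
    }
    return executed;
  }

  std::size_t pending() const { return pending_.size(); }

 private:
  std::queue<std::unique_ptr<Command>> pending_;
};
```

The point of the design is that the queue only ever sees the base class, so new command types can be added without touching the queue itself.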
5.3 Memory Object Tracking
cpp
// Global virtual address → amd::Memory mapping
amd::Memory* mem = amd::MemObjMap::FindMemObj(ptr);
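One common way to implement this kind of lookup is an ordered map keyed by base address, searched with `upper_bound` so that a pointer into the middle of an allocation still resolves. The sketch below illustrates that technique only; `Allocation` and `MemObjMapSketch` are hypothetical names, and the real `amd::MemObjMap` may be organized differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical allocation record; stands in for the metadata an amd::Memory carries.
struct Allocation {
  std::uintptr_t base;
  std::size_t size;
};

class MemObjMapSketch {
 public:
  // Registers the range [ptr, ptr + size).
  void add(const void* ptr, std::size_t size) {
    auto base = reinterpret_cast<std::uintptr_t>(ptr);
    table_[base] = Allocation{base, size};
  }

  // Forgets the allocation that starts exactly at ptr.
  void remove(const void* ptr) {
    table_.erase(reinterpret_cast<std::uintptr_t>(ptr));
  }

  // Returns the allocation containing ptr (ptr may point into the middle),
  // or nullptr if ptr is not inside any tracked range.
  const Allocation* find(const void* ptr, std::size_t* offset = nullptr) const {
    auto addr = reinterpret_cast<std::uintptr_t>(ptr);
    auto it = table_.upper_bound(addr);  // first entry with base > addr
    if (it == table_.begin()) return nullptr;
    --it;                                // last entry with base <= addr
    if (addr - it->second.base >= it->second.size) return nullptr;
    if (offset != nullptr) *offset = addr - it->second.base;
    return &it->second;
  }

 private:
  std::map<std::uintptr_t, Allocation> table_;
};
```

This is what lets a raw-pointer API accept `base + offset` arguments: the runtime recovers both the owning allocation and the offset from the pointer alone.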
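6. Key Design Decisions
| Design decision | Purpose |
|---|---|
| Header separation | hip/ holds only the public API; the implementation lives in clr/ |
| ROCclr abstraction | One code base serves both HIP and OpenCL |
| Pluggable backends | rocm/ and pal/ are interchangeable |
| Dynamic symbol loading | ROCclr does not hard-link ROCr; it loads it at run time via dlopen |
| TLS error state | hip::tls.last_error_ is per-thread, which keeps error reporting thread-safe |
| Command queue model | Asynchronous execution with deferred submission |
| HIP objects wrap ROCclr objects | Preserves CUDA semantics while reusing ROCclr's device/context/memory/queue |
| Raw-pointer API + internal memory table | CUDA style on the outside; amd::Memory holds allocation metadata on the inside |
| Current device is thread-local state | hipSetDevice, the default stream, and the last error all follow CUDA's per-thread behavior |
| RTC as a separate library | Ordinary runtime programs need not load runtime-compilation logic, matching CUDA's nvrtc model |
| Graph/Capture built on top of streams | Capture is not a separate backend path; it records the command dependencies on a stream |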
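The "TLS error state" row can be illustrated with a small `thread_local` sketch. The names (`Status`, `g_last_error`, `record`, `alloc_sketch`) are hypothetical, and the policy shown here (keep the most recent non-success status until it is read and cleared) is an assumption made for illustration, not a statement about what `hip::tls.last_error_` does in any particular ROCm release.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical error codes; real HIP uses hipError_t.
enum class Status { Success = 0, InvalidValue = 1 };

// One slot per thread: threads never share it, so no lock is needed.
thread_local Status g_last_error = Status::Success;

// Remembers a failure for the calling thread and passes the status through.
// Assumption: a later successful call does not overwrite a stored failure.
inline Status record(Status s) {
  if (s != Status::Success) g_last_error = s;
  return s;
}

// Reads the stored status without clearing it.
inline Status peek_last_error() { return g_last_error; }

// Reads the stored status and resets it to Success.
inline Status get_last_error() {
  Status s = g_last_error;
  g_last_error = Status::Success;
  return s;
}

// Hypothetical API that fails on a zero-sized request.
inline Status alloc_sketch(std::size_t bytes) {
  return record(bytes == 0 ? Status::InvalidValue : Status::Success);
}
```

Because the slot is `thread_local`, an error raised on one thread is never observed by `get_last_error()` on another.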
7. Build Artifacts
libamdhip64.so ← HIP runtime library
libhiprtc.so ← runtime compilation library
libhiprtc-builtins.so ← built-in functions
In the current CLR Debug build, these libraries are generated under:
text
projects/clr/build/hipamd/lib/
The actual output is typically a set of versioned real files plus unversioned symlinks:
text
libamdhip64.so
-> libamdhip64.so.7
-> libamdhip64.so.7.14.60850-<git-hash>
libhiprtc.so
-> libhiprtc.so.7
-> libhiprtc.so.7.14.60850-<git-hash>
libhiprtc-builtins.so
-> libhiprtc-builtins.so.7
-> libhiprtc-builtins.so.7.14.60850-<git-hash>
7.1 libamdhip64.so
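libamdhip64.so is the main HIP Runtime library, the counterpart of CUDA's libcudart.so. Runtime APIs called by ordinary HIP programs, such as hipMalloc, hipMemcpy, hipLaunchKernel, and hipMemAdvise, ultimately land mostly in this library.
Its source lives mainly in: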
text
projects/clr/hipamd/src/
For example:
text
hip_memory.cpp # memory APIs
hip_device.cpp # device APIs
hip_stream.cpp # stream/queue APIs
hip_module.cpp # module/kernel loading APIs
7.2 libhiprtc.so
libhiprtc.so is the HIP Runtime Compilation library, the counterpart of CUDA's libnvrtc.so. It compiles HIP kernel source into loadable code objects while the program is running.
Typical APIs include:
cpp
hiprtcCreateProgram(...);
hiprtcCompileProgram(...);
hiprtcGetCode(...);
hiprtcDestroyProgram(...);
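In other words, an ordinary precompiled HIP program does not necessarily depend on libhiprtc.so directly; only programs that use runtime compilation need to link or load it explicitly.
Its source lives mainly in: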
text
projects/clr/hipamd/src/hiprtc/
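HIPRTC symbols used to be mixed into the HIP Runtime and were later split out into a separate library. The benefits of this design are:
- Ordinary HIP Runtime programs do not have to load runtime-compilation logic
- It matches CUDA's cudart/nvrtc split
- hiprtc can expose its runtime-compilation API and CMake package independently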
7.3 libhiprtc-builtins.so
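libhiprtc-builtins.so is the built-ins library that HIPRTC uses during compilation; it serves the runtime-compilation flow. It is not a main entry point for ordinary HIP Runtime APIs, but a component that libhiprtc.so relies on when compiling kernel source.
8. Correspondence with CUDA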
| CUDA | HIP | Notes |
|---|---|---|
| libcudart.so | libamdhip64.so | Runtime library |
| nvcc | hipcc | Compiler driver |
| cudaMemAdvise | hipMemAdvise | 1:1 API mapping |
| CUcontext | hip::Device | Context/device |
A more complete concept mapping follows:
| CUDA concept | HIP expression | HIP internal object | ROCclr/ROCr landing point |
|---|---|---|---|
| Runtime API | hip* API | hipamd/src/hip_*.cpp | ROCclr command/device/memory |
| Device ordinal | int device | hip::Device::deviceId_ | amd::Device / HSA agent |
| Context | primary context / current device | hip::Device + TLS context stack | amd::Context |
| Stream | hipStream_t | hip::Stream | amd::HostQueue |
| Event | hipEvent_t | hip::Event | amd::Event / HSA signal |
| Device memory | hipMalloc pointer | amd::Memory | VRAM/GTT allocation |
| Managed memory | hipMallocManaged pointer | amd::Memory + SVM/HMM metadata | ROCr SVM / KFD SVM range |
| Module | hipModule_t | loaded code object/program | amd::Program |
| Kernel function | hipFunction_t / host stub | kernel metadata | device::Kernel |
| Graph | hipGraph_t | HIP graph DAG | captured command sequence |
| RTC | hiprtc* API | HIPRTC program object | clang/comgr/code object |
9. HIP API Implementation Patterns
9.1 API Entry Macro
cpp
// hipamd/src/hip_internal.hpp
#define HIP_INIT_API(cid, ...) \
HIP_INIT_CB(cid, args); \
amd::Thread* thread = amd::Thread::current(); \
if (!VDI_CHECK_THREAD(thread)) { \
HIP_RETURN(hipErrorOutOfMemory); \
} \
HIP_CB_SPAWNER_OBJECT(cid);
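9.2 Separating the Internal Implementation
HIP generally separates a "public API shell" from an ihip* internal implementation: the shell handles initialization, validation, and error-code wrapping, while the internal function holds the actual business logic: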
[Figure: call flow] user call to hipMemAdvise → public API shell (HIP_INIT_API + HIP_RETURN) → ihipMemAdvise (internal implementation / business logic) → ROCclr / backend
cpp
// Internal implementation (defined first so the public shell can call it)
hipError_t ihipMemAdvise(const void* ptr, size_t count,
                         hipMemoryAdvise advice, int device) {
  // Actual business logic goes here
  return hipSuccess;
}
// Public API
hipError_t hipMemAdvise(const void* ptr, size_t count,
                        hipMemoryAdvise advice, int device) {
  HIP_INIT_API(hipMemAdvise, ptr, count, advice, device);
  HIP_RETURN(ihipMemAdvise(ptr, count, advice, device));  // delegate to the internal implementation
}
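The same shell-plus-implementation split can be reproduced without any HIP headers. The sketch below is a simplified imitation, not the real macros: `SKETCH_INIT_API`, `SKETCH_RETURN`, `RuntimeState`, `adviseSketch`, and `iAdviseSketch` are all hypothetical names, and the prologue here only marks the runtime as initialized and counts the call.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical error codes; real HIP uses hipError_t.
enum Err { kOk = 0, kInvalidValue = 1 };

// Hypothetical process-wide bookkeeping touched only by the shell.
struct RuntimeState {
  bool initialized = false;
  int api_calls = 0;
  Err last = kOk;
};

inline RuntimeState& state() {
  static RuntimeState s;
  return s;
}

// Stores the status so it can be queried later, then passes it through.
inline Err recordStatus(Err e) {
  state().last = e;
  return e;
}

// Shell prologue: lazily initialize the runtime and count the public call.
#define SKETCH_INIT_API()       \
  do {                          \
    state().initialized = true; \
    ++state().api_calls;        \
  } while (0)

// Shell epilogue: record the status and return it to the caller.
#define SKETCH_RETURN(expr) return recordStatus(expr)

// Internal implementation: business logic only, no bookkeeping.
static Err iAdviseSketch(const void* ptr, std::size_t count) {
  if (ptr == nullptr || count == 0) return kInvalidValue;
  return kOk;
}

// Public shell: prologue, delegate, epilogue.
Err adviseSketch(const void* ptr, std::size_t count) {
  SKETCH_INIT_API();
  SKETCH_RETURN(iAdviseSketch(ptr, count));
}
```

One practical benefit of the split is that other runtime code can call the internal function directly without re-running the prologue or overwriting the recorded status.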
9.3 Version Suffix (_v2)
cpp
// hip_runtime_api.h
#define hipMemAdvise hipMemAdvise_v2
// Implementation
hipError_t hipMemAdvise_v2(...); // new version
hipError_t hipMemAdvise_v1(...); // old version (kept for compatibility)
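The effect of such a header-level redirect can be seen in a self-contained sketch. The functions `pickDeviceSketch_v1` and `pickDeviceSketch_v2` and their behaviors are invented for illustration; only the `#define name name_v2` technique is taken from the snippet above.

```cpp
#include <cassert>

// Legacy entry point: treats a negative id as "use device 0".
// Kept so that callers bound to the old symbol keep their old behavior.
int pickDeviceSketch_v1(int id) { return id < 0 ? 0 : id; }

// Revised entry point: treats a negative id as an error and returns -1.
int pickDeviceSketch_v2(int id) { return id < 0 ? -1 : id; }

// Header-level redirect: source code that spells the unsuffixed name
// is compiled against the _v2 function from now on.
#define pickDeviceSketch pickDeviceSketch_v2

// Caller written against the unsuffixed name; expands to pickDeviceSketch_v2(id).
int callerSketch(int id) { return pickDeviceSketch(id); }
```

Newly compiled code picks up the new semantics with no source change, while the old symbol stays available under its suffixed name.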
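10. Source Directory Walkthrough
10.1 projects/hip/ (Public Headers)
include/hip/
├── hip_runtime.h # main entry point; pulls in all HIP functionality
├── hip_runtime_api.h # runtime API declarations
├── hip_vector_types.h # vector types such as float4 and int2
├── hip_fp16.h # half-precision floating-point support
├── hip_cooperative_groups.h # cooperative groups
├── hiprtc.h # runtime compilation API
└── driver_types.h # CUDA driver-compatible types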
10.2 projects/clr/hipamd/ (HIP AMD Implementation)
src/
├── hip_memory.cpp # memory management API
├── hip_device.cpp # device management API
├── hip_stream.cpp # stream/queue API
├── hip_event.cpp # event API
├── hip_module.cpp # module/kernel loading
├── hip_texture.cpp # texture API
├── hip_graph*.cpp # CUDA Graph compatibility
└── hip_internal.hpp # internal macros and utilities
10.3 projects/clr/rocclr/ (ROCclr Abstraction Layer)
device/
├── device.hpp # amd::Device base class
├── rocm/ # ROCm/HSA backend
│ ├── rocdevice.hpp # roc::Device
│ ├── rocmemory.hpp # roc::Memory
│ └── rocvirtual.hpp # roc::VirtualGPU
└── pal/ # PAL backend (Windows / professional cards)
platform/
├── runtime.hpp # runtime initialization
├── context.hpp # amd::Context
├── memory.hpp # amd::Memory
├── command.hpp # amd::Command
└── commandqueue.hpp # amd::CommandQueue
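11. Summary
Collapsing all of the call chains above into one end-to-end backbone, the full journey of a single user API call, from entry down to the kernel driver, looks like this: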
[Figure: end-to-end call path] user hip* API → TLS (current device / default stream / error state) → HIP object layer (hip::Device/Stream/Event/Memory) → ROCclr abstraction (amd::Context/HostQueue/Command/Memory) → backend (rocm/: roc::Device → HSA; pal/: pal::Device → PAL) → ROCr Runtime → libhsakmt → KFD ioctl → GPU hardware queues/memory
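HIP's design follows these core principles:
- CUDA source compatibility: API names and semantics stay as close to CUDA as possible
- Layered decoupling: HIP API → ROCclr → backend → HSA → KFD
- Pluggable backends: the same ROCclr supports multiple backends (ROCm, PAL)
- Shared runtime: HIP and OpenCL share the ROCclr layer
- Lazy loading: HSA symbols are resolved via dlopen at run time, reducing startup overhead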
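The "pluggable backends" principle boils down to upper layers programming against an abstract device while a factory picks the concrete backend. The sketch below shows that shape under stated assumptions: `DeviceSketch`, `RocDeviceSketch`, `PalDeviceSketch`, `BackendKind`, and `makeDevice` are hypothetical names, and the returned strings are placeholders rather than anything the real `roc::Device` or `pal::Device` exposes.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical abstract device; plays the role of a common base class.
class DeviceSketch {
 public:
  virtual ~DeviceSketch() = default;
  virtual std::string name() const = 0;        // backend label
  virtual std::string submitPath() const = 0;  // lower layer it talks to
};

// Hypothetical ROCm-style backend.
class RocDeviceSketch : public DeviceSketch {
 public:
  std::string name() const override { return "rocm"; }
  std::string submitPath() const override { return "HSA"; }
};

// Hypothetical PAL-style backend.
class PalDeviceSketch : public DeviceSketch {
 public:
  std::string name() const override { return "pal"; }
  std::string submitPath() const override { return "PAL"; }
};

enum class BackendKind { Rocm, Pal };

// Factory: the only place that knows which concrete classes exist.
std::unique_ptr<DeviceSketch> makeDevice(BackendKind kind) {
  if (kind == BackendKind::Rocm) return std::make_unique<RocDeviceSketch>();
  return std::make_unique<PalDeviceSketch>();
}

// Upper-layer code depends only on the abstract interface.
std::string describe(const DeviceSketch& device) {
  return device.name() + " -> " + device.submitPath();
}
```

Swapping the backend changes only the argument to the factory; `describe` and any other upper-layer code compile unchanged.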