pytorch自定义backend

使用PrivateUse1定义out-of-tree backend

pytorch tutotial Facilitating New Backend Integration by PrivateUse1

相关api:
torch.utils.rename_privateuse1_backend

distributed backend

pytorch tutorial Customize Process Group Backends Using Cpp Extensions

相关api:
torch.distributed.Backend.register_backend
torch.distributed.init_process_group
ProcessGroup::allreduce
fallback example1
fallback example2

分布式算子fallback处理

Flaggems
FlagCX

报错:

NotImplementedError: Could not run 'c10d::allreduce_' with arguments from the 'Autogradtxda' backend.

解决

参考CPU,为DispatchKey AutogradCPU(_表示整个module?)设置fallback:

以上修改会让此接口返回True:

python 复制代码
print("AutogradCPU backend fallback registered:", torch._C._dispatch_has_backend_fallback(
            torch._C.DispatchKey.AutogradCPU
        ))
调试代码段
python 复制代码
print(f"! rank {MY_RANK} privateuse1_backend_name: {torch._C._get_privateuse1_backend_name()}")

# 查看 c10d::allreduce_ 是否在 多个 dispatch key 上有 kernel
print("AutogradPrivateUse1 kernel registered:", torch._C._dispatch_has_kernel_for_dispatch_key(
    "c10d::allreduce_",
    torch._C.DispatchKey.AutogradPrivateUse1
))  # 应为 False
print("AutogradCPU kernel registered:", torch._C._dispatch_has_kernel_for_dispatch_key(
    "c10d::allreduce_",
    torch._C.DispatchKey.AutogradCPU
))  # 应为 False
print("CPU kernel registered:", torch._C._dispatch_has_kernel_for_dispatch_key(
    "c10d::allreduce_",
    torch._C.DispatchKey.CPU
))  # 应为 True
print("PrivateUse1 kernel registered:", torch._C._dispatch_has_kernel_for_dispatch_key(
    "c10d::allreduce_",
    torch._C.DispatchKey.PrivateUse1
))  # 应为 True

# 查看AutogradPrivateUse1, AutogradCPU, 是否有 backend fallback
print("AutogradPrivateUse1 backend fallback registered:", torch._C._dispatch_has_backend_fallback(
    torch._C.DispatchKey.AutogradPrivateUse1
))  # 应为 True

print("AutogradCPU backend fallback registered:", torch._C._dispatch_has_backend_fallback(
    torch._C.DispatchKey.AutogradCPU
))  # 应为 True

# export TORCH_LOGS=all IS NEEDED
print(torch._C._dispatch_dump("c10d::allreduce_"))
相关推荐
海上飞猪8 小时前
【Python基础】python判空
python
梦幻精灵_cq9 小时前
Linux.date格式化标识“制作”极简台历 vs Python.datetime.strftime格式化“精美”日历牌(时间工具依情境选择也是一种“智慧)
linux·python
新元代码9 小时前
Function Calling的现状和未来的发展
人工智能
jinxinyuuuus9 小时前
订阅指挥中心:数据可移植性、Schema设计与用户数据主权
数据仓库·人工智能
ASS-ASH9 小时前
视觉语言大模型Qwen3-VL-8B-Instruct概述
人工智能·python·llm·多模态·qwen·视觉语言模型·vlm
Xy-unu9 小时前
[LLM]AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning
论文阅读·人工智能·算法·机器学习·transformer·论文笔记·剪枝
kangk129 小时前
统计学基础之概率(生物信息方向)
人工智能·算法·机器学习
再__努力1点9 小时前
【77】积分图像:快速计算矩形区域和核心逻辑
开发语言·图像处理·人工智能·python·算法·计算机视觉
matlabgoodboy9 小时前
程序代做python代编程matlab代码设计plc深度学习java编写C++代写
python·深度学习·matlab
福客AI智能客服9 小时前
露营装备行业智能 AI 客服:从 “售后救火” 到 “售前场景赋能” 的转型路径
人工智能