1. Motivation and Principle
The autoload device extension mechanism makes the following possible:
```python
# On Ascend devices (HUAWEI)
import torch
# import torch_npu  # <--- No longer needed with torch 2.5 and later
x = torch.randn(2, 2).npu()
y = torch.randn(2, 2).npu()
z = x.mm(y)

# On Moore Threads devices
import torch
# import torch_musa  # <--- No longer needed with torch 2.5 and later
a = torch.tensor([1.2, 2.3], dtype=torch.float32, device='musa')
b = torch.tensor([1.2, 2.3], dtype=torch.float32, device='cpu').to('musa')
c = torch.tensor([1.2, 2.3], dtype=torch.float32).musa()
```
That is, each vendor's plugin library is loaded automatically; there is no need to explicitly `import` the torch plugin libraries of multiple vendors (this feature is supported since torch 2.5).

The implementation relies on Python's entry point mechanism:

- In torch's `__init__.py`, the code iterates over all `backend_extensions` registered under the entry-point group `"torch.backends"`. For each `backend_extension`, it first calls `load()` to import the plugin module and obtain the entry function it defines, then calls `entrypoint()` to run that entry function.
- When adapting to torch, a vendor only needs to declare an entry point in the `"torch.backends"` group in its own `setup.py`; its torch plugin is then imported automatically at `import torch` time, with no explicit import required (see the sketch after this list).
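For instance, a hypothetical out-of-tree backend package `torch_foo` could register its entry function in its `setup.py` as below (a minimal sketch; `torch_foo` and `_autoload` are illustrative names, only the group name `"torch.backends"` and the `"<module>:<function>"` value are significant):

```python
# setup.py of a hypothetical out-of-tree backend package "torch_foo"
from setuptools import setup

setup(
    name="torch_foo",
    version="0.1.0",
    packages=["torch_foo"],
    entry_points={
        # PyTorch scans this entry-point group in torch/__init__.py (see Section 2)
        "torch.backends": [
            # "<arbitrary name> = <module>:<entry function>"
            "torch_foo = torch_foo:_autoload",
        ],
    },
)
```

After `pip install`-ing such a package, a plain `import torch` (version 2.5 or later) imports `torch_foo` and calls its `_autoload()` automatically.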
2. Source Code
Excerpts of the relevant functions from `torch/__init__.py`:

- Autoloading can be turned off by setting the environment variable `TORCH_DEVICE_BACKEND_AUTOLOAD=0` (a small usage sketch follows the excerpt below).
```python
# This code sits at the very end of __init__.py, i.e. out-of-tree devices are
# imported last when running `import torch`.
def _import_device_backends():
    """
    Leverage the Python plugin mechanism to load out-of-the-tree device extensions.
    See this RFC: https://github.com/pytorch/pytorch/issues/122468
    """
    from importlib.metadata import entry_points

    group_name = "torch.backends"
    if sys.version_info < (3, 10):
        backend_extensions = entry_points().get(group_name, ())
    else:
        backend_extensions = entry_points(group=group_name)

    for backend_extension in backend_extensions:
        try:
            # Load the extension
            entrypoint = backend_extension.load()
            # Call the entrypoint
            entrypoint()
        except Exception as err:
            raise RuntimeError(
                f"Failed to load the backend extension: {backend_extension.name}. "
                f"You can disable extension auto-loading with TORCH_DEVICE_BACKEND_AUTOLOAD=0."
            ) from err


def _is_device_backend_autoload_enabled() -> builtins.bool:
    """
    Whether autoloading out-of-the-tree device extensions is enabled.
    The switch depends on the value of the environment variable
    `TORCH_DEVICE_BACKEND_AUTOLOAD`.

    Returns:
        bool: Whether to enable autoloading the extensions. Enabled by default.

    Examples:
        >>> torch._is_device_backend_autoload_enabled()
        True
    """
    # enabled by default
    return os.getenv("TORCH_DEVICE_BACKEND_AUTOLOAD", "1") == "1"


if _is_device_backend_autoload_enabled():
    _import_device_backends()
```
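As a quick sanity check, the same `importlib.metadata` API used by `_import_device_backends` can be called directly to see which `"torch.backends"` entry points are installed, and autoloading can be opted out of by setting the environment variable before torch is imported (a minimal sketch, assuming Python 3.10+ for the `group=` keyword):

```python
import os
from importlib.metadata import entry_points

# List the "torch.backends" entry points that torch would autoload
for ep in entry_points(group="torch.backends"):   # Python 3.10+
    print(f"{ep.name}: {ep.value}")               # e.g. "torch_foo: torch_foo:_autoload"

# Opting out: the variable must be set before `import torch`,
# since torch reads it at import time in _is_device_backend_autoload_enabled()
os.environ["TORCH_DEVICE_BACKEND_AUTOLOAD"] = "0"
import torch  # no out-of-tree backend is autoloaded now
```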
The relevant content of `torch_musa/__init__.py`:
```python
is_loaded = False


def _autoload():
    print("call torch_musa/_autoload")
    global is_loaded
    if is_loaded:
        print("torch_musa already loaded.")
        return
    print("loading torch_musa into torch.musa...")
    import torch
```
The relevant content of `torch_npu/__init__.py`:
```python
...
# Disable autoloading before running 'import torch' to avoid circular dependencies
ORG_AUTOLOAD = os.getenv("TORCH_DEVICE_BACKEND_AUTOLOAD", "1")
os.environ["TORCH_DEVICE_BACKEND_AUTOLOAD"] = "0"
...


# This function is an entrypoint called by PyTorch
# when running 'import torch'. There is no need to do anything.
def _autoload():
    # We should restore this switch as sub processes need to inherit its value
    os.environ["TORCH_DEVICE_BACKEND_AUTOLOAD"] = ORG_AUTOLOAD
```
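Putting the two excerpts together, a hypothetical plugin `torch_foo/__init__.py` following the same pattern might look like the sketch below: an idempotence guard as in torch_musa, plus the save/restore of `TORCH_DEVICE_BACKEND_AUTOLOAD` around its own `import torch` as in torch_npu. This is illustrative only, not the actual code of either project:

```python
# torch_foo/__init__.py (hypothetical plugin combining the two patterns above)
import os

# Record the user's setting, then disable autoloading so that the
# `import torch` below cannot re-enter torch_foo through the entry point
_ORG_AUTOLOAD = os.getenv("TORCH_DEVICE_BACKEND_AUTOLOAD", "1")
os.environ["TORCH_DEVICE_BACKEND_AUTOLOAD"] = "0"

import torch  # safe: autoloading is temporarily switched off

_is_loaded = False


def _autoload():
    """Entry function referenced by the "torch.backends" entry point."""
    global _is_loaded
    if _is_loaded:  # idempotence guard, as in torch_musa
        return
    _is_loaded = True
    # ... register the custom device, tensor methods, etc. here ...


# Restore the switch so that subprocesses inherit the user's original value
os.environ["TORCH_DEVICE_BACKEND_AUTOLOAD"] = _ORG_AUTOLOAD
```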
Questions:

- What if the `entrypoint()` call were omitted? Autoload would still work. The core of autoloading is that a function of `torch_xx` is called automatically at `import torch` time, which requires `load()`-ing that module in the first place; since `load()` already imports the plugin module, its import-time side effects still run, and the `_autoload()` function itself does essentially nothing anyway (see the sketch after this list).
- Is there a circular-import problem? In `torch/__init__.py`, `_import_device_backends` is placed at the very end, therefore (the following two points are my own analysis, discussion welcome):
  - When importing torch triggers the import of `torch_musa`, the import of torch itself has already completed: all symbols have been exported and `sys.modules['torch']` already exists, so no circular import occurs.
  - `torch_npu` takes a different approach: when `import torch` runs inside the import of torch_npu, the current value of `TORCH_DEVICE_BACKEND_AUTOLOAD` is recorded and then set to 0; once the import of torch_npu finishes, `TORCH_DEVICE_BACKEND_AUTOLOAD` is restored to its original value at the end. This does not depend on where `_import_device_backends` sits in `torch/__init__.py`.
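The load/call distinction behind the first question can be reproduced with plain `importlib.metadata`, using `json:dumps` as a stand-in entry point (a toy sketch, unrelated to any real backend): the module import, and therefore any module-level side effects, already happen in `load()`; calling the returned entry function is a separate, optional step.

```python
from importlib.metadata import EntryPoint

# A stand-in entry point whose value has the same "<module>:<attr>" form
# as a real "torch.backends" entry point
ep = EntryPoint(name="demo", value="json:dumps", group="torch.backends")

fn = ep.load()              # this alone imports the "json" module
print(fn({"answer": 42}))   # calling the entry function is a separate step
```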
3. References
- torch issue: https://github.com/pytorch/pytorch/issues/122468
- torch PR: https://github.com/pytorch/pytorch/pull/127074
- torch docs: https://docs.pytorch.ac.cn/tutorials/unstable/python_extension_autoload.html#how-it-works
- torch_npu: https://github.com/Ascend/pytorch
- torch_musa: https://github.com/MooreThreads/torch_musa/