Attempting to Upgrade vLLM on an SCNet DCU Heterogeneous System (Failed)

System used: SCNet DCU, version dcu25.04

Conclusion first: the cupy package could not be installed, so the upgrade failed.

First, confirm the operating system version:

lsb_release -a

Distributor ID: Ubuntu
Description:    Ubuntu 22.04.5 LTS
Release:        22.04
Codename:       jammy

Next, find the DTK-25.04.2 packages for Ubuntu 22.04.

The DTK system packages:

https://download.sourcefind.cn:65024/1/main/DTK-25.04.2/Ubuntu22.04
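
Since the directory name has to match the OS release exactly, it can be derived from the system rather than typed by hand. The following is a minimal sketch; the function name `dtk_os_dir` is hypothetical, and it assumes the download site names its directories `Ubuntu<VERSION_ID>`, which is only inferred from the single URL above.

```shell
# Build the DTK directory name (e.g. "Ubuntu22.04") from an os-release file.
# Assumption: the directory is "Ubuntu" followed by VERSION_ID.
dtk_os_dir() {
    os_release_file="${1:-/etc/os-release}"
    # Source the file in a subshell so ID and VERSION_ID do not leak out.
    (
        . "$os_release_file" || exit 1
        case "$ID" in
            ubuntu) printf 'Ubuntu%s\n' "$VERSION_ID" ;;
            *) echo "unhandled distribution: $ID" >&2; exit 1 ;;
        esac
    )
}

# Usage (not run here):
#   echo "https://download.sourcefind.cn:65024/1/main/DTK-25.04.2/$(dtk_os_dir)"
```

Other distributions are rejected on purpose, because their directory names on the site are not known from this post.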

The ecosystem packages:

https://download.sourcefind.cn:65024/4/main/

The wheels are published under DAS1.7 (I am not sure what that name means):

# torch 2.5.1
https://download.sourcefind.cn:65024/directlink/4/pytorch/DAS1.7/torch-2.5.1+das.opt1.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl
# torch 2.7.1
https://download.sourcefind.cn:65024/directlink/4/pytorch/DAS1.7/torch-2.7.1+das.opt1.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl

lmslim

https://download.sourcefind.cn:65024/directlink/4/lmslim/DAS1.7/lmslim-0.3.1+das.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl

vllm

https://download.sourcefind.cn:65024/directlink/4/vllm/DAS1.7/vllm-0.9.2+das.opt1.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl

lightop

https://download.sourcefind.cn:65024/directlink/4/lightop/DAS1.7/lightop-0.6.0+das.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl
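
All of these wheels carry a `cp310` tag, so they only install into Python 3.10. A quick check before downloading can save a failed install. This is a sketch with hypothetical helper names (`wheel_python_tag`, `local_python_tag`); it relies only on the standard wheel naming layout, where the last three dash-separated fields are the Python, ABI, and platform tags.

```shell
# Extract the Python tag (e.g. "cp310") from a wheel file name or URL.
wheel_python_tag() {
    name="${1##*/}"      # drop any URL or directory prefix
    name="${name%.whl}"  # drop the extension
    # Wheel names end in <python tag>-<abi tag>-<platform tag>.
    echo "$name" | awk -F- '{ print $(NF-2) }'
}

# Python tag of the interpreter that python3 resolves to.
local_python_tag() {
    python3 -c 'import sys; print("cp%d%d" % sys.version_info[:2])'
}

# Usage (not run here):
#   [ "$(wheel_python_tag "$url")" = "$(local_python_tag)" ] || echo "Python version mismatch"
```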

transformer_engine

This wheel did not work:

https://download.sourcefind.cn:65024/directlink/4/transformer_engine/DAS1.7/transformer_engine-2.5.0+das.opt1.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl

Use this instead:

pip install -U transformers

cupy, the hard part

export CUPY_INSTALL_USE_HIP=1
export ROCM_HOME=/opt/rocm
export HCC_AMDGPU_TARGET=gfx906
pip install cupy

Install hipCUB

git clone https://github.com/ROCmSoftwarePlatform/hipCUB.git
cd hipCUB
mkdir build && cd build
cmake ..
make -j$(nproc)
sudo make install

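If cmake rejects the default compiler, rerun it with hipcc specified explicitly:

cmake .. -DCMAKE_CXX_COMPILER=/opt/dtk/bin/hipcc  # explicitly specify the compiler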
make -j

I am not sure whether this output means the installation succeeded:

-- Up-to-date: /opt/rocm/include/
-- Up-to-date: /opt/rocm/include//hipcub
-- Installing: /opt/rocm/include//hipcub/hipcub_version.hpp
-- Installing: /opt/rocm/lib/cmake/hipcub/hipcub-targets.cmake
-- Installing: /opt/rocm/lib/cmake/hipcub/hipcub-config.cmake
-- Installing: /opt/rocm/lib/cmake/hipcub/hipcub-config-version.cmake
-- Installing: /opt/rocm/share/doc/hipcub/LICENSE.txt
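
One way to answer that question is to check for the installed files directly. The sketch below uses a hypothetical helper, `hipcub_installed`, and looks only for two paths that appear in the install log above. Note that the log shows files going to /opt/rocm, while the later cupy attempt sets ROCM_HOME=/opt/dtk; whether the cupy build looks in either prefix is an assumption, so it is worth checking both.

```shell
# Return success if the hipCUB header and CMake config exist under a prefix.
# The two relative paths are taken from the install log above.
hipcub_installed() {
    prefix="$1"
    [ -f "$prefix/include/hipcub/hipcub_version.hpp" ] &&
    [ -f "$prefix/lib/cmake/hipcub/hipcub-config.cmake" ]
}

# Usage (not run here):
#   for p in /opt/rocm /opt/dtk; do
#       if hipcub_installed "$p"; then echo "$p: found"; else echo "$p: missing"; fi
#   done
```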

dcu24.04

First install hipCUB:

git clone https://github.com/ROCmSoftwarePlatform/hipCUB.git
cd hipCUB
mkdir build && cd build
cmake .. -DCMAKE_CXX_COMPILER=/opt/dtk/bin/hipcc  # explicitly specify the compiler
make -j$(nproc)
make install

Install cupy:

export CUPY_INSTALL_USE_HIP=1
export ROCM_HOME=/opt/dtk
# export HCC_AMDGPU_TARGET=gfx906
pip install cupy

If that does not work, install cupy version 12.3 instead.

Also set: export HCC_AMDGPU_TARGET=gfx942
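
The "try the latest, then fall back to 12.3" step can be written as a small loop. This is only a sketch: `install_first_working` and the `PIP` override are hypothetical names, and the `cupy==12.3.*` specifier is my reading of "version 12.3", not a release the post names exactly.

```shell
PIP="${PIP:-pip}"   # overridable, so the logic can be tried without a network

# Try each requirement spec in order and print the first one that installs.
install_first_working() {
    for spec in "$@"; do
        # Send installer output to stderr so stdout carries only the result.
        if $PIP install "$spec" >&2; then
            echo "$spec"
            return 0
        fi
    done
    echo "none of the candidates installed: $*" >&2
    return 1
}

# Usage (not run here):
#   export CUPY_INSTALL_USE_HIP=1 ROCM_HOME=/opt/dtk HCC_AMDGPU_TARGET=gfx942
#   install_first_working cupy 'cupy==12.3.*'
```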

Install the related libraries, then install vllm:

wget https://download.sourcefind.cn:65024/directlink/4/pytorch/DAS1.7/torch-2.5.1+das.opt1.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl
pip install torch-2.5.1+das.opt1.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl

wget https://download.sourcefind.cn:65024/directlink/4/lightop/DAS1.7/lightop-0.6.0+das.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl
pip install lightop-0.6.0+das.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl


wget https://download.sourcefind.cn:65024/directlink/4/vllm/DAS1.7/vllm-0.9.2+das.opt1.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl
pip install vllm-0.9.2+das.opt1.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl
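
The same download-and-install sequence can be driven from a list, which avoids retyping the long file names. This is a sketch under the assumption that the three URLs above stay valid; `wheel_urls` and `install_wheels` are hypothetical helper names, and the order (torch, lightop, vllm) simply mirrors the commands above.

```shell
BASE="https://download.sourcefind.cn:65024/directlink/4"
TAG="cp310-cp310-manylinux_2_28_x86_64"

# Print the wheel URLs in the order they should be installed.
wheel_urls() {
    cat <<EOF
$BASE/pytorch/DAS1.7/torch-2.5.1+das.opt1.dtk25042-$TAG.whl
$BASE/lightop/DAS1.7/lightop-0.6.0+das.dtk25042-$TAG.whl
$BASE/vllm/DAS1.7/vllm-0.9.2+das.opt1.dtk25042-$TAG.whl
EOF
}

# Download each wheel and install it, stopping at the first failure.
install_wheels() {
    for url in $(wheel_urls); do
        file="${url##*/}"          # file name is the last URL component
        wget -c "$url" || return 1
        pip install "$file" || return 1
    done
}

# Usage (not run here):
#   install_wheels
```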

In the end, the upgrade still did not succeed.

Debugging

The cupy build stops with this error:

raise Exception('Please install hipCUB and retry')
Exception: Please install hipCUB and retry

I then tried building and installing hipCUB from source.

The build reported an error:

-- System architecture is x86_64
CMake Error at cmake/VerifyCompiler.cmake:39 (message):
  On ROCm platform 'hipcc' or HIP-aware Clang must be used as C++ compiler.
Call Stack (most recent call first):
  CMakeLists.txt:124 (include)

-- Configuring incomplete, errors occurred!
make: *** No targets specified and no makefile found. Stop.
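
The CMake error asks for hipcc as the C++ compiler, and the final make error is only a consequence of the failed configure step (no Makefile was generated). A small helper can locate hipcc before running cmake. This is a sketch: `find_hipcc` is a hypothetical name, and /opt/dtk and /opt/rocm are simply the two prefixes that appear elsewhere in this post.

```shell
# Print the first executable hipcc found under the given prefixes.
find_hipcc() {
    for prefix in "$@"; do
        if [ -x "$prefix/bin/hipcc" ]; then
            echo "$prefix/bin/hipcc"
            return 0
        fi
    done
    echo "hipcc not found under: $*" >&2
    return 1
}

# Usage (not run here), from the hipCUB build directory:
#   HIPCC=$(find_hipcc /opt/dtk /opt/rocm) && cmake .. -DCMAKE_CXX_COMPILER="$HIPCC"
```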
