Attempting to upgrade vLLM on the SCNet DCU heterogeneous system (failed)

System used: SCNet DCU, version dcu25.04

Conclusion first: the cupy package could not be installed, so the upgrade failed.

First, confirm the operating system version:

lsb_release -a

Distributor ID: Ubuntu
Description:    Ubuntu 22.04.5 LTS
Release:        22.04
Codename:       jammy

Next, find the DTK-25.04.2 build for Ubuntu 22.04.

The DTK system packages:

https://download.sourcefind.cn:65024/1/main/DTK-25.04.2/Ubuntu22.04
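
The last path component of that URL appears to be the distribution name followed by the release number. As a minimal sketch, assuming that naming pattern holds for this download site, the directory name can be derived from /etc/os-release rather than typed by hand. The helper name dtk_os_dir is hypothetical.

```shell
# Hypothetical helper: print "<NAME><VERSION_ID>" from an os-release file.
# Assumption: the DTK download directory is named after this pattern.
dtk_os_dir() {
    # Source the file in a subshell so its variables do not leak out.
    ( . "${1:-/etc/os-release}" && printf '%s%s\n' "$NAME" "$VERSION_ID" )
}
```

Running dtk_os_dir with no argument reads /etc/os-release on the current machine; the result can then be compared with the directory listed on the download site.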

Ecosystem packages:

https://download.sourcefind.cn:65024/4/main/

The packages are under DAS1.7 (not sure what that name means):

# torch 2.5.1
https://download.sourcefind.cn:65024/directlink/4/pytorch/DAS1.7/torch-2.5.1+das.opt1.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl
# torch 2.7.1
https://download.sourcefind.cn:65024/directlink/4/pytorch/DAS1.7/torch-2.7.1+das.opt1.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl
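
All of these wheels carry the cp310 tag, so pip will only accept them under CPython 3.10. A minimal sketch for checking that before downloading anything follows. The helper names wheel_py_tag and current_py_tag are hypothetical, and the check assumes python3 is the interpreter that pip installs into.

```shell
# Hypothetical helper: extract the Python tag from a wheel filename or URL.
# Wheel names end in -<python tag>-<abi tag>-<platform tag>.whl.
wheel_py_tag() {
    printf '%s\n' "${1##*/}" | awk -F- '{ print $(NF-2) }'
}

# Hypothetical helper: print the matching tag of the current interpreter.
# Assumption: python3 is the interpreter that pip installs into.
current_py_tag() {
    python3 -c 'import sys; print("cp%d%d" % sys.version_info[:2])'
}
```

A mismatch can then be flagged with: [ "$(wheel_py_tag "$url")" = "$(current_py_tag)" ] || echo "Python version does not match the wheel"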

lmslim

https://download.sourcefind.cn:65024/directlink/4/lmslim/DAS1.7/lmslim-0.3.1+das.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl

vllm

https://download.sourcefind.cn:65024/directlink/4/vllm/DAS1.7/vllm-0.9.2+das.opt1.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl

lightop

https://download.sourcefind.cn:65024/directlink/4/lightop/DAS1.7/lightop-0.6.0+das.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl

transformer_engine

This wheel did not work:

https://download.sourcefind.cn:65024/directlink/4/transformer_engine/DAS1.7/transformer_engine-2.5.0+das.opt1.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl

Use this instead:

pip install transformer -U

cupy, the hard part

export CUPY_INSTALL_USE_HIP=1
export ROCM_HOME=/opt/rocm
export HCC_AMDGPU_TARGET=gfx906
pip install cupy

Install hipCUB

git clone https://github.com/ROCmSoftwarePlatform/hipCUB.git
cd hipCUB
mkdir build && cd build
cmake ..
make -j$(nproc)
sudo make install

cmake .. -DCMAKE_CXX_COMPILER=/opt/dtk/bin/hipcc  # explicitly specify the compiler
make -j

Not sure whether this output means it installed correctly:

-- Up-to-date: /opt/rocm/include/
-- Up-to-date: /opt/rocm/include//hipcub
-- Installing: /opt/rocm/include//hipcub/hipcub_version.hpp
-- Installing: /opt/rocm/lib/cmake/hipcub/hipcub-targets.cmake
-- Installing: /opt/rocm/lib/cmake/hipcub/hipcub-config.cmake
-- Installing: /opt/rocm/lib/cmake/hipcub/hipcub-config-version.cmake
-- Installing: /opt/rocm/share/doc/hipcub/LICENSE.txt
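
The log shows the headers going to /opt/rocm, while one of the cupy attempts in this post sets ROCM_HOME=/opt/dtk. Whether that mismatch matters on this system is not confirmed. The sketch below only reports which prefix actually contains the hipCUB header named in the log. The helper names has_hipcub and report_hipcub are hypothetical.

```shell
# Hypothetical helper: succeed if the hipCUB version header exists under
# the given prefix (the header name is taken from the install log).
has_hipcub() {
    [ -f "$1/include/hipcub/hipcub_version.hpp" ]
}

# Hypothetical helper: print one "found:" or "missing:" line per prefix.
report_hipcub() {
    for prefix in "$@"; do
        if has_hipcub "$prefix"; then
            echo "found: $prefix"
        else
            echo "missing: $prefix"
        fi
    done
}
```

For the two prefixes used in this post, the call would be: report_hipcub /opt/rocm /opt/dtk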

dcu24.04

First install hipCUB

git clone https://github.com/ROCmSoftwarePlatform/hipCUB.git
cd hipCUB
mkdir build && cd build
cmake .. -DCMAKE_CXX_COMPILER=/opt/dtk/bin/hipcc  # explicitly specify the compiler
make -j$(nproc)
make install

Install cupy

export CUPY_INSTALL_USE_HIP=1
export ROCM_HOME=/opt/dtk
# export HCC_AMDGPU_TARGET=gfx906
pip install cupy

If that does not work, install cupy version 12.3.

Set: export HCC_AMDGPU_TARGET=gfx942
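
This post tries both gfx906 and gfx942 for HCC_AMDGPU_TARGET, so the value was partly guesswork. A minimal sketch for reading the ISA name from the device follows, assuming a rocminfo-style tool is available in the DTK install and prints gfx names as it does on ROCm. The helper name gfx_targets is hypothetical.

```shell
# Hypothetical helper: read rocminfo-style text on stdin and print the
# unique gfx ISA names as one comma-separated list.
gfx_targets() {
    grep -o 'gfx[0-9a-f]*' | sort -u | paste -sd, -
}
```

If rocminfo is present, the variable could be set with: export HCC_AMDGPU_TARGET=$(rocminfo | gfx_targets)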

Install the related libraries, then install vllm

wget https://download.sourcefind.cn:65024/directlink/4/pytorch/DAS1.7/torch-2.5.1+das.opt1.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl
pip install torch-2.5.1+das.opt1.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl

wget https://download.sourcefind.cn:65024/directlink/4/lightop/DAS1.7/lightop-0.6.0+das.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl
pip install lightop-0.6.0+das.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl


wget https://download.sourcefind.cn:65024/directlink/4/vllm/DAS1.7/vllm-0.9.2+das.opt1.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl
pip install vllm-0.9.2+das.opt1.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl
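
The same download-then-install sequence can be written as a loop, so that the order (torch, then lightop, then vllm) is kept in one place. This is a sketch rather than a tested installer: the URLs are copied from the commands above, the helper name plan_install is hypothetical, and it only prints the commands.

```shell
# Base URL and wheel paths copied from the wget commands in this post.
BASE=https://download.sourcefind.cn:65024/directlink/4
WHEELS="
pytorch/DAS1.7/torch-2.5.1+das.opt1.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl
lightop/DAS1.7/lightop-0.6.0+das.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl
vllm/DAS1.7/vllm-0.9.2+das.opt1.dtk25042-cp310-cp310-manylinux_2_28_x86_64.whl
"

# Hypothetical helper: print the download and install command for each
# wheel, in order. Nothing is executed here.
plan_install() {
    for rel in $WHEELS; do
        echo "wget -c $BASE/$rel"
        echo "pip install ${rel##*/}"
    done
}
```

To review the plan, run plan_install. To execute it and stop at the first failing step, pipe it into a shell: plan_install | sh -e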

In the end, the upgrade still did not succeed.

Debugging

The error was: Exception: Please install hipCUB and retry

raise Exception('Please install hipCUB and retry')

Exception: Please install hipCUB and retry

Tried building and installing hipCUB from source

The build failed with this error:

-- System architecture is x86_64
CMake Error at cmake/VerifyCompiler.cmake:39 (message):
  On ROCm platform 'hipcc' or HIP-aware Clang must be used as C++ compiler.
Call Stack (most recent call first):
  CMakeLists.txt:124 (include)
-- Configuring incomplete, errors occurred!
make: *** No targets specified and no makefile found. Stop.
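
This CMake error is what a plain cmake .. produces when the default C++ compiler is not hipcc. Passing -DCMAKE_CXX_COMPILER, as in the commands earlier in this post, addresses it. A minimal sketch for locating hipcc before configuring follows. The helper name find_hipcc is hypothetical, and the prefixes are the two that appear in this post.

```shell
# Hypothetical helper: print the first executable hipcc found under the
# given prefixes, falling back to whatever hipcc is on PATH.
find_hipcc() {
    for prefix in "$@"; do
        if [ -x "$prefix/bin/hipcc" ]; then
            echo "$prefix/bin/hipcc"
            return 0
        fi
    done
    command -v hipcc
}
```

It could be used as: cmake .. -DCMAKE_CXX_COMPILER="$(find_hipcc /opt/dtk /opt/rocm)"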
