LLM Study Notes - Setting Up an Intranet Environment
Environment:
- Intranet (no internet access); everything below is installed offline
- System: Linux cdh12 3.10.0-1160.el7.x86_64
- Memory: 377G; GPU: 8 × P40 (25G each)
Installing Anaconda
When pressing Enter to page through the license agreement, go slowly; if you scroll past the prompt, you have to start over.
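One way to avoid paging through the license by hand is the installer's batch mode. The following is a minimal sketch, assuming the installer file has already been copied to the server. The helper name `install_anaconda` and the file name in the example are placeholders, not taken from these notes. Note that `-b` means the license terms are accepted without being shown.

```shell
# Install Anaconda non-interactively.
#   -b  batch mode: no prompts, license terms are assumed to be accepted
#   -p  installation prefix (the directory should not exist yet)
install_anaconda() {
    installer="$1"                     # path to the installer on the intranet host
    prefix="${2:-$HOME/anaconda3}"     # default prefix is an assumption
    bash "$installer" -b -p "$prefix"
}

# Example (hypothetical file name):
#   install_anaconda ./Anaconda3-xxxx-Linux-x86_64.sh "$HOME/anaconda3"
#   "$HOME/anaconda3/bin/conda" init bash   # batch mode does not edit ~/.bashrc
```

In batch mode the installer does not modify the shell startup files, which is why the example ends with `conda init`.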
Installing the CUDA Toolkit
Reference: https://blog.csdn.net/weixin_44864260/article/details/127770525
It was already installed when I received the machine:
[Screenshot: the existing CUDA installation]
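To see whether the driver and the toolkit are already present on a machine, something like the following can be run. This is only a sketch; `show_cuda_tools` is a hypothetical helper name, and it only reports whether the two standard commands are on the PATH.

```shell
# Report whether the NVIDIA driver tool and the CUDA compiler are on PATH.
show_cuda_tools() {
    for tool in nvidia-smi nvcc; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: $(command -v "$tool")"
        else
            echo "$tool: not found on PATH"
        fi
    done
}

# When both are found, the versions can be read with:
#   nvidia-smi        # driver version and the GPUs it can see
#   nvcc --version    # CUDA Toolkit (compiler) version
```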
Installing PyTorch Offline
Reference: https://blog.csdn.net/weixin_44864260/article/details/127770525
The build I selected:
[Screenshot: the selected PyTorch build]
I did not create a virtual environment and ran the install directly in base, but it failed because the dependency packages were missing:
[Screenshot: the dependency error]
I then installed the dependencies one at a time, taking the packages from the Aliyun PyPI mirror (https://mirrors.aliyun.com/pypi/simple/):
- nvidia-cudnn-cu11
- nvidia-cublas-cu11
- nvidia-cuda-nvrtc-cu11
- nvidia-cuda-runtime-cu11
[Screenshot: installing the dependencies]
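The download-then-install round trip for these dependencies can be sketched as two small helpers. This assumes the first entry in the list refers to the cuDNN wheel, `nvidia-cudnn-cu11`; the helper names and the wheel folder are hypothetical.

```shell
# Dependency wheels from the list above (cuDNN package name is an assumption).
DEPS="nvidia-cudnn-cu11 nvidia-cublas-cu11 nvidia-cuda-nvrtc-cu11 nvidia-cuda-runtime-cu11"

# Step 1, on a machine WITH internet access: fetch the wheels into a folder.
download_deps() {
    dest="$1"
    # $DEPS is left unquoted on purpose so each name becomes its own argument
    pip download -d "$dest" -i https://mirrors.aliyun.com/pypi/simple/ $DEPS
}

# Step 2, on the intranet host, after copying the folder over:
# install using only the local files, never contacting an index.
install_deps_offline() {
    src="$1"
    pip install --no-index --find-links "$src" $DEPS
}
```

`pip download` fetches wheels for the platform and Python version of the machine it runs on, so the download machine should match the server, or the download should be pinned with `--platform`, `--python-version` and `--only-binary=:all:`.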
It is best to install torch 2.0 or later, but I could never get it to install this way.
- Installing 2.0.0 or later
My environment:
[Screenshot: the current environment]
So the packages have to be downloaded first, from the Tsinghua mirror:
- pytorch
- torchvision
- torchaudio
- pytorch-cuda=11.7
[Screenshot: the downloaded packages]
Then install them one by one:
[Screenshot: installing the packages]
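Installing the downloaded package files one by one can be sketched as a loop. This assumes the files were copied into a single directory on the server; `install_local_conda_pkgs` is a hypothetical helper name.

```shell
# Install every conda package file found in a directory, one at a time.
install_local_conda_pkgs() {
    pkg_dir="$1"
    for pkg in "$pkg_dir"/*.tar.bz2 "$pkg_dir"/*.conda; do
        [ -e "$pkg" ] || continue          # skip a pattern that matched nothing
        echo "installing $pkg"
        # --offline: do not try to reach any channel; -y: do not ask for confirmation
        conda install --offline -y "$pkg" || return 1
    done
}

# Example (hypothetical directory):
#   install_local_conda_pkgs /tmp/pytorch-pkgs
```

Installing directly from package files does not resolve dependencies, so any further package that these four depend on has to be downloaded and placed in the same directory as well.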
Verify the installation:
[Screenshot: the verification output]
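A check along these lines confirms that torch imports and can see the GPUs. It is a sketch rather than the exact command used in these notes, and `check_torch` is a hypothetical helper name; it should be run in the environment where torch was installed.

```shell
# Print the torch version and what CUDA devices it can see.
check_torch() {
    python - <<'EOF'
import torch
print("torch version :", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("GPU count     :", torch.cuda.device_count())
EOF
}

# Example:
#   check_torch
```

If `CUDA available` prints `False`, the usual suspects are a CPU-only torch build or a driver that is too old for the CUDA version the build was compiled against.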
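Passwordless SSH Login to Linux
Local machine: Windows; target host: Linux
The key step is generating a key pair:
ssh-keygen
Then upload the public key (the .pub file) to the server and add it to ~/.ssh/authorized_keys.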
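The server-side half of the key upload can be sketched as follows. It assumes the public key file has already been copied to the Linux host (for example with scp); `authorize_key` is a hypothetical helper name, and its second argument exists only so the function can be tried against a scratch directory.

```shell
# Run on the Linux server: register an uploaded public key for the current user.
authorize_key() {
    pub_key_file="$1"                  # the uploaded *.pub file
    ssh_dir="${2:-$HOME/.ssh}"
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
    cat "$pub_key_file" >> "$ssh_dir/authorized_keys"
    # sshd ignores the file when its permissions are too open
    chmod 600 "$ssh_dir/authorized_keys"
}

# Example (hypothetical path):
#   authorize_key /tmp/id_rsa.pub
```

After this, `ssh user@host` from the Windows machine should log in without asking for the account password.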
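Remote Access with VS Code
Note: when downloading the ms-vscode-remote.remote-ssh extension, the VS Code version on the internet-connected machine must match the version on the intranet machine exactly; otherwise the installation fails.
After completing step 2, the remote connection icon did not appear in my sidebar:
[Screenshot: the sidebar without the remote connection icon]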
To continue, press F1 to open the Command Palette, type ssh, and select:
[Screenshot: the command selected in the Command Palette]
After setting up the connection details, select the corresponding alias and enter the password when prompted; you can then work on the server:
[Screenshot: the established remote session]
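The connection details end up as a `Host` entry in the SSH client configuration, and the alias is the name after `Host`. A sketch of writing such an entry from a shell (Git Bash on Windows, for instance) follows; `add_ssh_host` is a hypothetical helper, and the alias, address, and user in the example are made up. Editing the file by hand gives the same result.

```shell
# Append a Host entry to an SSH client config file.
add_ssh_host() {
    alias_name="$1"                            # name shown in VS Code's host list
    host_addr="$2"                             # server address
    user_name="$3"                             # login user on the server
    config_file="${4:-$HOME/.ssh/config}"
    mkdir -p "$(dirname "$config_file")"
    cat >> "$config_file" <<EOF

Host $alias_name
    HostName $host_addr
    User $user_name
EOF
}

# Example (made-up values):
#   add_ssh_host gpu-box 10.0.0.12 myuser
```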
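When installing vscode-server-linux-x64.tar.gz on the remote server, replace commit_id in the commands below with the actual commit ID, written literally without ${}:
# create the per-commit server directory
mkdir -p ~/.vscode-server/bin/commit_id
# unpack the server, dropping the archive's top-level folder
tar -zxvf /tmp/vscode-server-linux-x64.tar.gz -C ~/.vscode-server/bin/commit_id --strip-components 1
touch ~/.vscode-server/bin/commit_id/0
That is all it takes.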
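The same three steps can be wrapped in a function so that the commit ID is passed in once instead of being edited into each line. This is a sketch with a hypothetical helper name, `install_vscode_server`; the commit ID itself can be read from the second line of `code --version` on the client machine.

```shell
# Unpack an uploaded VS Code server archive into the directory for one commit.
install_vscode_server() {
    commit_id="$1"                                        # from `code --version` on the client
    tarball="${2:-/tmp/vscode-server-linux-x64.tar.gz}"   # default path as used above
    target="$HOME/.vscode-server/bin/$commit_id"
    mkdir -p "$target" || return 1
    # drop the archive's top-level folder so the files land directly in $target
    tar -zxf "$tarball" -C "$target" --strip-components 1 || return 1
    touch "$target/0"
}

# Example (placeholder commit ID):
#   install_vscode_server <commit_id>
```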