(Pre-release) [Awei's Notes (阿维笔记)] Analyzing and Optimizing GPU Training Speed and Results in CloudStudio High-Performance Workspaces

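1. Summary

This tutorial uses Tencent Cloud CloudStudio. CloudStudio provides a free monthly quota of 50,000 minutes of CPU environment and 10,000 minutes of GPU environment (16 cores, 32 GB RAM, and a Tesla T4 with 16 GB VRAM).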
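To put the free quota in perspective, the minutes can be converted to hours. The snippet below is only a small arithmetic sketch; the function and constant names are illustrative and not part of CloudStudio.

```python
def minutes_to_hours(minutes: int) -> float:
    """Convert a quota given in minutes to hours."""
    return minutes / 60


# Monthly free quotas quoted above (50,000 CPU minutes, 10,000 GPU minutes).
CPU_MINUTES = 50_000
GPU_MINUTES = 10_000

cpu_hours = minutes_to_hours(CPU_MINUTES)
gpu_hours = minutes_to_hours(GPU_MINUTES)

print(f"CPU quota: {cpu_hours:.1f} h per month")
print(f"GPU quota: {gpu_hours:.1f} h per month")
```

The GPU quota works out to roughly 167 hours per month, a little under seven days of continuous use.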

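  • Checks the nvidia-driver and the conda virtual environment.
  • Analyzes which PyTorch version best suits this environment in 2025 (computer vision tasks only).
  • Analyzes and explains compatibility issues among the nvidia-driver, the CUDA toolkit, cuDNN, and related components.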

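2. Deploying a CloudStudio High-Performance Workspace

2.1 Hardware environment

  • Open the CloudStudio link cloud.tencent.com/product/clo... and register a Tencent Cloud account.
  • Go to the CloudStudio home page ide.cloud.tencent.com/, click High-Performance Workspace (高性能工作空间), click New (新建), select the Free Basic tier (免费基础型) in the lower-left corner, click New (新建) again, and wait a few minutes.
  • Refresh the page and wait for the new workspace entry to appear, then click that entry to start using the high-performance workspace.
  • Once it opens successfully, the high-performance workspace looks as follows: [screenshot of the opened workspace]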

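2.2 Software environment

The Free Basic environment ships with dependencies such as deepseek (not needed here) and a conda installation, so we simply use conda's (base) environment.

  • The default (base) conda environment is already activated. Use pip list to check which packages are installed. PyTorch, which we need, is not among them, so it has to be installed manually: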

    Software environment check log:
    (base) root@VM-0-80-ubuntu:/workspace# which conda
    /root/miniforge3/bin/conda
    (base) root@VM-0-80-ubuntu:/workspace# pip list
    Package                 Version
    ----------------------- -----------
    archspec                0.2.3
    asttokens               3.0.0
    boltons                 24.0.0
    Brotli                  1.1.0
    certifi                 2024.12.14
    cffi                    1.17.1
    charset-normalizer      3.4.1
    colorama                0.4.6
    comm                    0.2.2
    conda                   24.11.2
    conda-libmamba-solver   24.11.1
    conda-package-handling  2.4.0
    conda_package_streaming 0.11.0
    debugpy                 1.8.11
    decorator               5.1.1
    distro                  1.9.0
    exceptiongroup          1.2.2
    executing               2.1.0
    frozendict              2.4.6
    h2                      4.1.0
    hpack                   4.0.0
    hyperframe              6.0.1
    idna                    3.10
    importlib_metadata      8.5.0
    ipykernel               6.29.5
    ipython                 8.31.0
    jedi                    0.19.2
    jsonpatch               1.33
    jsonpointer             3.0.0
    jupyter_client          8.6.3
    jupyter_core            5.7.2
    libmambapy              2.0.5
    matplotlib-inline       0.1.7
    menuinst                2.2.0
    nest_asyncio            1.6.0
    packaging               24.2
    parso                   0.8.4
    pexpect                 4.9.0
    pickleshare             0.7.5
    pip                     24.3.1
    platformdirs            4.3.6
    pluggy                  1.5.0
    prompt_toolkit          3.0.48
    psutil                  6.1.1
    ptyprocess              0.7.0
    pure_eval               0.2.3
    pycosat                 0.6.6
    pycparser               2.22
    Pygments                2.18.0
    PySocks                 1.7.1
    python-dateutil         2.9.0.post0
    pyzmq                   26.2.0
    requests                2.32.3
    ruamel.yaml             0.18.8
    ruamel.yaml.clib        0.2.8
    setuptools              75.6.0
    six                     1.17.0
    stack_data              0.6.3
    tornado                 6.4.2
    tqdm                    4.67.1
    traitlets               5.14.3
    truststore              0.10.0
    typing_extensions       4.12.2
    urllib3                 2.3.0
    wcwidth                 0.2.13
    wheel                   0.45.1
    zipp                    3.21.0
    zstandard               0.23.0
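Instead of scanning the pip list output by eye, the same check can be scripted. This is a minimal sketch using the standard-library importlib.metadata; the helper name installed_versions is hypothetical, and the package names are the three that the installation step later in this section installs.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    result = {}
    for name in packages:
        try:
            result[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            result[name] = None
    return result


report = installed_versions(["torch", "torchvision", "torchaudio"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

In the fresh (base) environment listed above, all three would be reported as not installed.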
  • Use nvidia-smi and similar commands to check the GPU and nvidia-driver versions, in order to decide which PyTorch version to install.

    Environment check log:
    (base) root@VM-0-80-ubuntu:/workspace# nvidia-smi
    Thu Jun  5 13:40:40 2025       
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  Tesla T4            On   | 00000000:00:09.0 Off |                    0 |
    | N/A   33C    P8    11W /  70W |      5MiB / 15360MiB |      0%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+
    
    +-----------------------------------------------------------------------------+
    | Processes:                                                                  |
    |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
    |        ID   ID                                                   Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+
    (base) root@VM-0-80-ubuntu:/workspace# python --version
    Python 3.10.11
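The two values that matter in this log are the driver version and the CUDA version in the nvidia-smi header. As a sketch, they can be extracted with a regular expression; the header string is copied from the log above, while the function name parse_smi_header is hypothetical and the pattern assumes the header layout shown here.

```python
import re

# Header line copied from the nvidia-smi output above.
HEADER = "| NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0     |"


def parse_smi_header(line: str):
    """Return (driver_version, (cuda_major, cuda_minor)) from an nvidia-smi header line."""
    match = re.search(r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", line)
    if match is None:
        raise ValueError("no driver/CUDA version found in line")
    driver, cuda = match.groups()
    return driver, tuple(int(part) for part in cuda.split("."))


driver_version, max_cuda = parse_smi_header(HEADER)
print(driver_version, max_cuda)
```

Returning the CUDA version as a tuple of integers makes it easy to compare against other versions without string-comparison pitfalls.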

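    Tip

    PyTorch 2.4.x is chosen because, starting with 2.4.x, operators for image classification, segmentation, detection, and similar tasks have been sped up, and 2.4.1 is the updated release of 2.4.0.

    cu118 is chosen because nvidia-smi reports NVIDIA-SMI 525.105.17 Driver Version: 525.105.17 CUDA Version: 12.0, which means the current 525-series nvidia-driver supports CUDA toolkit versions up to 12.0 (with the matching cuDNN). The PyTorch releases, however, are built only for 11.8, 12.1, and 12.4. Since nothing above CUDA toolkit 12.0 (exclusive) can be used, only cu118 remains. cu118 is also the newest and (I am not entirely sure) the most stable build in the cu11x line. In addition, in my tests cu12.x is no faster than cu117 or cu118 for training on image tasks.

    Strictly speaking, dependencies newer than cu12.0 can be used here, but this is not recommended for beginners. If you are interested, see docs.nvidia.com/deploy/cuda.... In other words, even though the nvidia-driver nominally supports only up to 12.0, installing and using CUDA toolkit 12.1 or 12.4 directly also works.

    If you need a different PyTorch version, see pytorch.org/get-started...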
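The selection rule in the tip (take the newest build whose CUDA version does not exceed what the driver reports) can be written down as a short sketch. The names pick_cuda_tag and AVAILABLE are hypothetical; the version list is the one given in the tip for PyTorch 2.4.x. The sketch deliberately implements only the conservative rule and ignores the option of running newer 12.x builds on this driver.

```python
def pick_cuda_tag(max_cuda, available):
    """Pick the newest wheel tag whose CUDA version is <= max_cuda.

    max_cuda:  (major, minor) reported by nvidia-smi, e.g. (12, 0)
    available: list of (major, minor) CUDA versions with published wheels
    """
    usable = [version for version in available if version <= max_cuda]
    if not usable:
        raise ValueError("no wheel fits the CUDA version reported by the driver")
    major, minor = max(usable)
    return f"cu{major}{minor}"


# CUDA builds listed in the tip: 11.8, 12.1, and 12.4.
AVAILABLE = [(11, 8), (12, 1), (12, 4)]

tag = pick_cuda_tag((12, 0), AVAILABLE)
print(tag)  # cu118
```

With a driver that reports CUDA 12.4 or newer, the same function would return cu124.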

  • Install the GPU build of PyTorch: pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu118

    PyTorch GPU installation log:
    (base) root@VM-0-80-ubuntu:/workspace# pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu118
    Looking in indexes: https://download.pytorch.org/whl/cu118
    Collecting torch==2.4.1
      Downloading https://download.pytorch.org/whl/cu118/torch-2.4.1%2Bcu118-cp310-cp310-linux_x86_64.whl (857.6 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 857.6/857.6 MB 8.6 MB/s eta 0:00:00
    Collecting torchvision==0.19.1
      Downloading https://download.pytorch.org/whl/cu118/torchvision-0.19.1%2Bcu118-cp310-cp310-linux_x86_64.whl (6.3 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.3/6.3 MB 9.8 MB/s eta 0:00:00
    Collecting torchaudio==2.4.1
      Downloading https://download.pytorch.org/whl/cu118/torchaudio-2.4.1%2Bcu118-cp310-cp310-linux_x86_64.whl (3.3 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.3/3.3 MB 6.7 MB/s eta 0:00:00
    Collecting filelock (from torch==2.4.1)
      Downloading https://download.pytorch.org/whl/filelock-3.13.1-py3-none-any.whl.metadata (2.8 kB)
    Requirement already satisfied: typing-extensions>=4.8.0 in /root/miniforge3/lib/python3.10/site-packages (from torch==2.4.1) (4.12.2)
    Collecting sympy (from torch==2.4.1)
      Downloading https://download.pytorch.org/whl/sympy-1.13.3-py3-none-any.whl.metadata (12 kB)
    Collecting networkx (from torch==2.4.1)
      Downloading https://download.pytorch.org/whl/networkx-3.3-py3-none-any.whl.metadata (5.1 kB)
    Collecting jinja2 (from torch==2.4.1)
      Downloading https://download.pytorch.org/whl/Jinja2-3.1.4-py3-none-any.whl.metadata (2.6 kB)
    Collecting fsspec (from torch==2.4.1)
      Downloading https://download.pytorch.org/whl/fsspec-2024.6.1-py3-none-any.whl.metadata (11 kB)
    Collecting nvidia-cuda-nvrtc-cu11==11.8.89 (from torch==2.4.1)
      Downloading https://download.pytorch.org/whl/cu118/nvidia_cuda_nvrtc_cu11-11.8.89-py3-none-manylinux1_x86_64.whl (23.2 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.2/23.2 MB 14.5 MB/s eta 0:00:00
    Collecting nvidia-cuda-runtime-cu11==11.8.89 (from torch==2.4.1)
      Downloading https://download.pytorch.org/whl/cu118/nvidia_cuda_runtime_cu11-11.8.89-py3-none-manylinux1_x86_64.whl (875 kB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 875.6/875.6 kB 16.0 MB/s eta 0:00:00
    Collecting nvidia-cuda-cupti-cu11==11.8.87 (from torch==2.4.1)
      Downloading https://download.pytorch.org/whl/cu118/nvidia_cuda_cupti_cu11-11.8.87-py3-none-manylinux1_x86_64.whl (13.1 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.1/13.1 MB 21.3 MB/s eta 0:00:00
    Collecting nvidia-cudnn-cu11==9.1.0.70 (from torch==2.4.1)
      Downloading https://download.pytorch.org/whl/cu118/nvidia_cudnn_cu11-9.1.0.70-py3-none-manylinux2014_x86_64.whl (663.9 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 663.9/663.9 MB 10.0 MB/s eta 0:00:00
    Collecting nvidia-cublas-cu11==11.11.3.6 (from torch==2.4.1)
      Downloading https://download.pytorch.org/whl/cu118/nvidia_cublas_cu11-11.11.3.6-py3-none-manylinux1_x86_64.whl (417.9 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 417.9/417.9 MB 9.7 MB/s eta 0:00:00
    Collecting nvidia-cufft-cu11==10.9.0.58 (from torch==2.4.1)
      Downloading https://download.pytorch.org/whl/cu118/nvidia_cufft_cu11-10.9.0.58-py3-none-manylinux1_x86_64.whl (168.4 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 168.4/168.4 MB 12.5 MB/s eta 0:00:00
    Collecting nvidia-curand-cu11==10.3.0.86 (from torch==2.4.1)
      Downloading https://download.pytorch.org/whl/cu118/nvidia_curand_cu11-10.3.0.86-py3-none-manylinux1_x86_64.whl (58.1 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 58.1/58.1 MB 13.0 MB/s eta 0:00:00
    Collecting nvidia-cusolver-cu11==11.4.1.48 (from torch==2.4.1)
      Downloading https://download.pytorch.org/whl/cu118/nvidia_cusolver_cu11-11.4.1.48-py3-none-manylinux1_x86_64.whl (128.2 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 128.2/128.2 MB 12.5 MB/s eta 0:00:00
    Collecting nvidia-cusparse-cu11==11.7.5.86 (from torch==2.4.1)
      Downloading https://download.pytorch.org/whl/cu118/nvidia_cusparse_cu11-11.7.5.86-py3-none-manylinux1_x86_64.whl (204.1 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 204.1/204.1 MB 11.9 MB/s eta 0:00:00
    Collecting nvidia-nccl-cu11==2.20.5 (from torch==2.4.1)
      Downloading https://download.pytorch.org/whl/cu118/nvidia_nccl_cu11-2.20.5-py3-none-manylinux2014_x86_64.whl (142.9 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 142.9/142.9 MB 12.5 MB/s eta 0:00:00
    Collecting nvidia-nvtx-cu11==11.8.86 (from torch==2.4.1)
      Downloading https://download.pytorch.org/whl/cu118/nvidia_nvtx_cu11-11.8.86-py3-none-manylinux1_x86_64.whl (99 kB)
    Collecting triton==3.0.0 (from torch==2.4.1)
      Downloading https://download.pytorch.org/whl/triton-3.0.0-1-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (209.4 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 209.4/209.4 MB 12.0 MB/s eta 0:00:00
    Collecting numpy (from torchvision==0.19.1)
      Downloading https://download.pytorch.org/whl/numpy-2.1.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (60 kB)
    Collecting pillow!=8.3.*,>=5.3.0 (from torchvision==0.19.1)
      Downloading https://download.pytorch.org/whl/pillow-11.0.0-cp310-cp310-manylinux_2_28_x86_64.whl.metadata (9.1 kB)
    Collecting MarkupSafe>=2.0 (from jinja2->torch==2.4.1)
      Downloading https://download.pytorch.org/whl/MarkupSafe-2.1.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25 kB)
    Collecting mpmath<1.4,>=1.1.0 (from sympy->torch==2.4.1)
      Downloading https://download.pytorch.org/whl/mpmath-1.3.0-py3-none-any.whl (536 kB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 536.2/536.2 kB 7.5 MB/s eta 0:00:00
    Downloading https://download.pytorch.org/whl/pillow-11.0.0-cp310-cp310-manylinux_2_28_x86_64.whl (4.4 MB)
       ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.4/4.4 MB 9.4 MB/s eta 0:00:00
    Downloading https://download.pytorch.org/whl/filelock-3.13.1-py3-none-any.whl (11 kB)
    Downloading https://download.pytorch.org/whl/fsspec-2024.6.1-py3-none-any.whl (177 kB)
    Downloading https://download.pytorch.org/whl/Jinja2-3.1.4-py3-none-any.whl (133 kB)
    Downloading https://download.pytorch.org/whl/networkx-3.3-py3-none-any.whl (1.7 MB)
       ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 6.6 MB/s eta 0:00:00
    Downloading https://download.pytorch.org/whl/numpy-2.1.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.3 MB)
       ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 16.3/16.3 MB 13.0 MB/s eta 0:00:00
    Downloading https://download.pytorch.org/whl/sympy-1.13.3-py3-none-any.whl (6.2 MB)
       ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.2/6.2 MB 32.9 MB/s eta 0:00:00
    Installing collected packages: mpmath, sympy, pillow, nvidia-nvtx-cu11, nvidia-nccl-cu11, nvidia-cusparse-cu11, nvidia-curand-cu11, nvidia-cufft-cu11, nvidia-cuda-runtime-cu11, nvidia-cuda-nvrtc-cu11, nvidia-cuda-cupti-cu11, nvidia-cublas-cu11, numpy, networkx, MarkupSafe, fsspec, filelock, triton, nvidia-cusolver-cu11, nvidia-cudnn-cu11, jinja2, torch, torchvision, torchaudio
    Successfully installed MarkupSafe-2.1.5 filelock-3.13.1 fsspec-2024.6.1 jinja2-3.1.4 mpmath-1.3.0 networkx-3.3 numpy-2.1.2 nvidia-cublas-cu11-11.11.3.6 nvidia-cuda-cupti-cu11-11.8.87 nvidia-cuda-nvrtc-cu11-11.8.89 nvidia-cuda-runtime-cu11-11.8.89 nvidia-cudnn-cu11-9.1.0.70 nvidia-cufft-cu11-10.9.0.58 nvidia-curand-cu11-10.3.0.86 nvidia-cusolver-cu11-11.4.1.48 nvidia-cusparse-cu11-11.7.5.86 nvidia-nccl-cu11-2.20.5 nvidia-nvtx-cu11-11.8.86 pillow-11.0.0 sympy-1.13.3 torch-2.4.1+cu118 torchaudio-2.4.1+cu118 torchvision-0.19.1+cu118 triton-3.0.0
    WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable.It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning.
    (base) root@VM-0-80-ubuntu:/workspace# 
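If the same installation has to be repeated for another CUDA tag, the command can be assembled programmatically. This is only an illustrative sketch: build_install_command is a hypothetical helper, and the version numbers and index URL pattern are taken from the command used above.

```python
import shlex


def build_install_command(torch, torchvision, torchaudio, cuda_tag):
    """Assemble the pip command for a given set of versions and a CUDA wheel tag."""
    index_url = f"https://download.pytorch.org/whl/{cuda_tag}"
    args = [
        "pip", "install",
        f"torch=={torch}",
        f"torchvision=={torchvision}",
        f"torchaudio=={torchaudio}",
        "--index-url", index_url,
    ]
    # shlex.join quotes arguments where needed, so the result is safe to paste into a shell.
    return shlex.join(args)


command = build_install_command("2.4.1", "0.19.1", "2.4.1", "cu118")
print(command)
```

Note that the torch, torchvision, and torchaudio versions must belong together; the helper does not validate that pairing.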
  • Check that the GPU build of PyTorch was installed successfully: python -c "import torch;print(torch.cuda.get_device_name(torch.cuda.current_device()))"

    (base) root@VM-0-80-ubuntu:/workspace# python -c "import  torch;print(torch.cuda.get_device_name(torch.cuda.current_device()))"
    Tesla T4
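The one-liner only prints the device name. A slightly fuller check, sketched below, also reports the PyTorch version, the CUDA version the wheel was built for, and the cuDNN version. The function name describe_torch_env is hypothetical; the torch attributes it reads (torch.__version__, torch.version.cuda, torch.backends.cudnn.version(), torch.cuda.is_available()) are standard PyTorch APIs. The sketch returns early if torch is not installed, so it can also be run before the installation step.

```python
import importlib.util


def describe_torch_env():
    """Collect basic facts about the installed PyTorch build and the visible GPU."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}

    import torch

    info = {
        "torch_installed": True,
        "torch_version": torch.__version__,   # includes the wheel's local tag, if any
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        info["cudnn_version"] = torch.backends.cudnn.version()
        info["device_name"] = torch.cuda.get_device_name(torch.cuda.current_device())
    return info


env_info = describe_torch_env()
for key, value in env_info.items():
    print(f"{key}: {value}")
```

On the workspace configured above, the device name should match the Tesla T4 reported by the one-liner, and the CUDA build version should correspond to the cu118 wheel.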