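Complete Guide to Building vLLM 0.16 from Source on Windows 11 (CUDA 12.6 / PyTorch 2.7.1+cu126) [Field-Tested Again]
This article is a follow-up to the previous vLLM Windows cu128 build guide. The previous article compiled with CUDA 12.8; this one recompiles with CUDA 12.6, which exactly matches PyTorch 2.7.1+cu126. It also corrects the previous article's description of what the subst mapping is for and provides a clearer one-shot recovery script.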
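Complete Guide to Configuring Multiple CUDA + cuDNN Versions on Windows
Complete Guide to Building CUDA Extension Wheels Locally on Windows
Complete Guide to Building vLLM 0.16 from Source on Windows 11 (RTX 3090 / CUDA 12.8 / PyTorch 2.7.1)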

Environment
| Item | Version |
|---|---|
| Operating system | Windows 11 |
| GPU | NVIDIA GeForce RTX 3090 (sm_86) |
| Driver | 595.02 |
| CUDA Toolkit | 12.6 (used for the build) |
| Python | 3.12.11 |
| PyTorch | 2.7.1+cu126 |
| Visual Studio | 2022 Professional v17.12.17 |
| vLLM branch | SystemPanic/vllm-windows (vllm-for-windows branch) |
| Build artifact version | 0.16.0rc2.dev243+gc8e1f5abe.d20260309.cu126 |

1. Why rebuild for cu126?
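The previous article compiled against the system-wide CUDA 12.8, but the PyTorch actually used in the virtual environment is 2.7.1+cu126, meaning torch is bound to the CUDA 12.6 runtime. The cu128 wheel does run in most scenarios, but rebuilding with exactly the CUDA version torch uses is the safer choice and avoids potential version-mismatch problems.
# Quickly switch the CUDA + cuDNN toolchain
# Load the switch script
. "D:\Program\switch-cuda.ps1"
# Switch to CUDA 12.6
Switch-CUDA 12.6
Verify the CUDA version of the current torch:
python -c "import torch; print(torch.__version__, '| CUDA:', torch.version.cuda)"
# Output: 2.7.1+cu126 | CUDA: 12.6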
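The same check can be written as a small reusable helper. This is a minimal sketch that assumes only the two string formats shown above (`2.7.1+cu126` and `12.6`); the names `cuda_tag` and `torch_matches_toolkit` are hypothetical and not part of torch.

```python
def cuda_tag(cuda_version):
    """Turn a CUDA version such as '12.6' into a wheel-style tag such as 'cu126'."""
    major, minor = cuda_version.split(".")[:2]
    return f"cu{major}{minor}"


def torch_matches_toolkit(torch_version, toolkit_version):
    """True when the '+cuXXX' suffix of torch.__version__ matches the toolkit version."""
    _, _, local = torch_version.partition("+")
    return local == cuda_tag(toolkit_version)


# Strings as printed by the verification command above.
print(torch_matches_toolkit("2.7.1+cu126", "12.6"))  # True
print(torch_matches_toolkit("2.7.1+cu126", "12.8"))  # False
```

In a real environment the two arguments would come from `torch.__version__` and from the toolkit selected for the build.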
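2. Key concept: what the subst mapping is actually for
This is the step most likely to go wrong in this build, so it must be understood first.
CUDA installs by default to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6, a path that contains spaces. When that path reaches MSVC's cl.exe as an unquoted -I argument, the spaces split it into several arguments and compilation fails.
Solution: use subst to map the CUDA directory to a drive letter with no spaces.
Z: → C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6
⚠️ Z: maps the CUDA directory, not the vLLM source directory. Do not confuse the two. The vLLM source always stays at J:\PythonProjects4\vllm-windows, and the build command specifies the source location with the full J: path.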
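To make the failure mode concrete, here is a minimal sketch of what happens to an unquoted include flag. It only models whitespace splitting; the file name `kernel.cpp` and the helper `naive_tokens` are hypothetical, and the real build passes many more flags.

```python
cuda_include = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\include"
mapped_include = r"Z:\include"


def naive_tokens(command_line):
    """Split on whitespace, which is what an unquoted path is subjected to."""
    return command_line.split()


broken = naive_tokens(f"cl /c kernel.cpp -I{cuda_include}")
fixed = naive_tokens(f"cl /c kernel.cpp -I{mapped_include}")

# The include flag is the fourth token onward.
print(len(broken[3:]))  # 5 fragments instead of one path
print(fixed[3:])        # a single, intact -I argument
```

With the Z: mapping there is nothing to split, so no quoting is needed anywhere in the toolchain.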
Mapping commands:
# Map Z: → the CUDA 12.6 directory (so torch finds the right nvcc version)
subst Z: "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6"
# Verify the mapping
subst | findstr Z
To remove the mapping:
# Remove the current Z: mapping
subst Z: /D
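The verification step can also be scripted. The sketch below parses the text printed by `subst`, assuming each line has the form `Z:\: => <target>`; that format is an assumption about the tool's output, and `parse_subst` is a hypothetical helper.

```python
def parse_subst(output):
    """Map drive letters to targets from `subst` output (format assumed, see above)."""
    mappings = {}
    for line in output.splitlines():
        drive, sep, target = line.partition(": => ")
        if sep:
            mappings[drive.strip().rstrip("\\")] = target.strip()
    return mappings


# Sample text in the assumed format; on a real machine it could come from
# subprocess.run(["subst"], capture_output=True, text=True).stdout
sample = r"Z:\: => C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6"
mappings = parse_subst(sample)
print(mappings["Z:"].endswith(r"CUDA\v12.6"))  # True
```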
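3. Pre-build preparation
3.1 Confirm the environment
Work in the VS 2022 Developer Command Prompt (x64) (the commands in this guide use PowerShell syntax) and confirm that the following tools are available:
cl # Output: Microsoft (R) C/C++ Optimizing Compiler Version 19.42.xxxxx for x64
python --version # Python 3.12.11
Activate the dedicated vLLM venv:
cd J:\PythonProjects4\vllm-windows
.\.venv\Scripts\Activate.ps1
python -c "import torch; print(torch.__version__, torch.version.cuda)"
# Expected output: 2.7.1+cu126 12.6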
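A quick way to confirm that the tools are on PATH from Python is `shutil.which`. This is a minimal sketch; `missing_tools` is a hypothetical helper, and the lookup function is injectable only so the logic can be exercised without a real toolchain.

```python
import shutil


def missing_tools(names, which=shutil.which):
    """Return the tool names that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


# An empty list means every tool resolved; the result depends on the machine.
print(missing_tools(["cl", "python", "nvcc"]))
```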
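3.2 Clean the CMake cache (important)
If a previous build attempt failed, clean the cache first; otherwise stale CMake variables will interfere with the new build:
Remove-Item -Recurse -Force "J:\PythonProjects4\vllm-windows\build" -ErrorAction SilentlyContinue
Remove-Item -Recurse -Force "J:\PythonProjects4\vllm-windows\.deps" -ErrorAction SilentlyContinue
💡
The .deps directory holds the external dependencies that CMake downloads automatically (CUTLASS, triton-windows, FlashMLA, and so on). After cleaning, they are downloaded again during the build, which requires a network connection.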
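For anyone driving the build from Python, the cleanup can be sketched with `shutil.rmtree`. This is an illustrative equivalent of the two Remove-Item commands, not something the guide's build uses; `clean_build_dirs` is a hypothetical name, and the demo runs against a temporary directory rather than the real source tree.

```python
import shutil
import tempfile
from pathlib import Path


def clean_build_dirs(source_root, names=("build", ".deps")):
    """Delete the given subdirectories if present and return the names removed."""
    removed = []
    for name in names:
        target = Path(source_root) / name
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(name)
    return removed


# Demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "build").mkdir()
    print(clean_build_dirs(tmp))  # ['build']
    print(clean_build_dirs(tmp))  # []
```

On Windows, read-only files inside .deps may make `shutil.rmtree` fail where `Remove-Item -Force` succeeds, so the PowerShell commands remain the safer default.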
4. Set up the build environment
Run all of the following commands in order, in the same terminal session:
# 1. Map CUDA 12.6 to the Z: drive (works around the spaces in the path)
subst Z: "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6"
# Verify the mapping is correct
dir Z:\bin\nvcc.exe # must be found
# 2. Set the CUDA path variables (all pointing at Z:)
$env:CUDA_HOME = "Z:"
$env:CUDA_PATH = "Z:"
$env:CUDA_ROOT = "Z:"
$env:CudaToolkitDir = "Z:\"
$env:PATH = "Z:\bin;" + $env:PATH
# Verify: the first line must be Z:\bin\nvcc.exe
where.exe nvcc
# 3. Set the build parameters
$env:DISTUTILS_USE_SDK = "1"
$env:VLLM_TARGET_DEVICE = "cuda"
$env:MAX_JOBS = "10" # adjust to the number of CPU cores
$env:TORCH_CUDA_ARCH_LIST = "8.6" # RTX 3090 corresponds to sm_86
$env:USE_LIBUV = "0"
⚠️
The first line of where.exe nvcc must be Z:\bin\nvcc.exe, not C:\Program Files\.... If it is not, PATH is set incorrectly; check that $env:PATH starts with Z:\bin.
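The variables above can be collected into one function, which makes it easy to see the complete environment in a single place. This is a minimal sketch: `build_env` is a hypothetical helper, and falling back to `os.cpu_count()` when no job count is given is an assumption of this example, not something the guide prescribes.

```python
import os
from typing import Dict, Optional


def build_env(base: Dict[str, str], cuda_root: str = "Z:",
              max_jobs: Optional[int] = None) -> Dict[str, str]:
    """Return a copy of `base` with the build variables from this section applied."""
    env = dict(base)
    for name in ("CUDA_HOME", "CUDA_PATH", "CUDA_ROOT"):
        env[name] = cuda_root
    env["CudaToolkitDir"] = cuda_root + "\\"
    # Prepend so that the mapped nvcc is found before any other CUDA install.
    env["PATH"] = cuda_root + "\\bin;" + env.get("PATH", "")
    env["DISTUTILS_USE_SDK"] = "1"
    env["VLLM_TARGET_DEVICE"] = "cuda"
    env["MAX_JOBS"] = str(max_jobs or os.cpu_count() or 1)  # fallback is an assumption
    env["TORCH_CUDA_ARCH_LIST"] = "8.6"  # RTX 3090, sm_86
    env["USE_LIBUV"] = "0"
    return env


env = build_env({"PATH": r"C:\Windows\System32"}, max_jobs=10)
print(env["PATH"])      # Z:\bin;C:\Windows\System32
print(env["MAX_JOBS"])  # 10
```

Passing the result as `env=` to `subprocess.run` would apply it to a child process only, leaving the current shell untouched.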

5. Run the build
pip wheel J:\PythonProjects4\vllm-windows `
--no-build-isolation `
--no-deps `
-w J:\PythonProjects4\vllm-windows\wheels\ `
2>&1 | Tee-Object -FilePath "J:\PythonProjects4\vllm-windows\wheels\build-cu126.log"
Notes on the build process:
- CMake configure stage (about 5-10 minutes): automatically downloads external dependencies such as CUTLASS, FlashMLA, and triton-windows
- ninja build stage (about 60-90 minutes): 145 build targets in total
The following skip messages are normal and are not errors:
-- FlashMLA will not compile: unsupported CUDA architecture 8.6 (requires sm_90)
-- [QUTLASS] Skipping build: CUDA 12.8 or newer is required (not supported on cu126)
-- Not building scaled_mm_c3x_sm90 (requires sm_90)
-- Not building NVFP4 (requires sm_100)
The following warning can also be ignored:
CMake Warning: Pytorch version 2.10.0 expected for CUDA build, saw 2.7.1 instead.
Signs of a successful build:
[145/145] Linking CXX shared module vllm-flash-attn\_vllm_fa2_C.pyd
Successfully built vllm
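Since the build output is saved to a log file, the markers above can be checked mechanically. This is a minimal sketch that searches only for the strings quoted in this section; `summarize_log` is a hypothetical helper and the sample text is abbreviated.

```python
import re

# Skip messages that this section describes as normal.
EXPECTED_SKIPS = (
    "FlashMLA will not compile",
    "[QUTLASS] Skipping build",
    "Not building scaled_mm_c3x_sm90",
    "Not building NVFP4",
)


def summarize_log(text):
    """Report the last ninja progress counter, the success marker, and known skips."""
    steps = re.findall(r"\[(\d+)/(\d+)\]", text)
    done, total = (int(steps[-1][0]), int(steps[-1][1])) if steps else (0, 0)
    return {
        "ninja_done": done,
        "ninja_total": total,
        "built": "Successfully built vllm" in text,
        "expected_skips": [s for s in EXPECTED_SKIPS if s in text],
    }


sample = r"""-- Not building NVFP4
[145/145] Linking CXX shared module vllm-flash-attn\_vllm_fa2_C.pyd
Successfully built vllm
"""
summary = summarize_log(sample)
print(summary["ninja_done"], summary["ninja_total"], summary["built"])  # 145 145 True
```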

6. Verify the wheel
cd J:\PythonProjects4\vllm-windows
Get-ChildItem "J:\PythonProjects4\vllm-windows\wheels\" | Select-Object Name, LastWriteTime, Length
You should see something like:
vllm-0.16.0rc2.dev243+gc8e1f5abe.d20260309.cu126-cp312-cp312-win_amd64.whl 269 MB
Filename structure:
- cu126: matches PyTorch 2.7.1+cu126 ✅
- cp312: Python 3.12
- d20260309: build date
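The filename can be taken apart programmatically. The sketch below is a hand-rolled split, assuming the common five-field wheel name without an optional build tag; `parse_wheel_name` is a hypothetical helper, not a packaging API.

```python
def parse_wheel_name(filename):
    """Split '<name>-<version>-<python>-<abi>-<platform>.whl' into its fields."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    name, version, python_tag, abi_tag, platform_tag = stem.split("-")
    public, _, local = version.partition("+")
    return {
        "name": name,
        "public_version": public,
        "local_parts": local.split(".") if local else [],
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


info = parse_wheel_name(
    "vllm-0.16.0rc2.dev243+gc8e1f5abe.d20260309.cu126-cp312-cp312-win_amd64.whl"
)
print(info["local_parts"])  # ['gc8e1f5abe', 'd20260309', 'cu126']
print(info["python_tag"])   # cp312
```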

7. Install and verify
7.1 Verify inside the vLLM venv
python -c "
import os
os.environ['USE_LIBUV'] = '0'
import vllm._C as _C
print('✅ _C extension loaded successfully')
from vllm import LLM, SamplingParams
import vllm
print('✅ vllm imported successfully, version:', vllm.__version__)
"
7.2 Install into another environment (e.g. ComfyUI)
pip install vllm-0.16.0rc2.dev243+gc8e1f5abe.d20260309.cu126-cp312-cp312-win_amd64.whl --no-deps
pip install llguidance xgrammar
⚠️ USE_LIBUV=0 must be set before import vllm, otherwise PyTorch 2.7.1 stable raises an error:
import os
os.environ['USE_LIBUV'] = '0'
import vllm
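Because the ordering is easy to get wrong in a larger application, a small guard can make a late call fail loudly. This is a minimal sketch; `disable_libuv` is a hypothetical helper, and it only inspects `sys.modules`, which is an assumption about how one might detect an import that has already happened.

```python
import os
import sys


def disable_libuv(before_import="vllm"):
    """Set USE_LIBUV=0, refusing to do so once the target module is already loaded."""
    if before_import in sys.modules:
        raise RuntimeError(
            f"{before_import} is already imported; set USE_LIBUV before importing it"
        )
    os.environ["USE_LIBUV"] = "0"


disable_libuv()
# import vllm  # safe from this point on, assuming the wheel is installed
```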
8. One-shot recovery script
The next time you need to rebuild, run the following in the VS 2022 Developer Shell:
cd J:\PythonProjects4\vllm-windows
.\.venv\Scripts\Activate.ps1
# Clean the old cache (if needed)
# Remove-Item -Recurse -Force build, .deps
# Map the CUDA path (Z: → CUDA 12.6; must be re-run after every reboot)
subst Z: "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6"
# Set the environment variables
$env:CUDA_HOME = "Z:"
$env:CUDA_PATH = "Z:"
$env:CUDA_ROOT = "Z:"
$env:CudaToolkitDir = "Z:\"
$env:PATH = "Z:\bin;" + $env:PATH
$env:DISTUTILS_USE_SDK = "1"
$env:VLLM_TARGET_DEVICE = "cuda"
$env:MAX_JOBS = "10"
$env:TORCH_CUDA_ARCH_LIST = "8.6"
$env:USE_LIBUV = "0"
# Confirm the first nvcc line is Z:\bin\nvcc.exe
where.exe nvcc
# Build
pip wheel J:\PythonProjects4\vllm-windows `
--no-build-isolation --no-deps `
-w J:\PythonProjects4\vllm-windows\wheels\
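To also save the wheels of all dependencies into one directory, run pip wheel without --no-deps.
Assuming the working directory is:
J:\PythonProjects4\vllm-windows
the command that saves the wheels is:
# Package into wheel files
pip wheel . --no-build-isolation -w J:\PythonProjects4\vllm-windows\vllmwhl_cu126\

# List the generated wheels
Get-ChildItem "J:\PythonProjects4\vllm-windows\vllmwhl_cu126\"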

Directory: J:\PythonProjects4\vllm-windows\vllmwhl_cu126

| Mode | LastWriteTime | Length | Name |
|---|---|---|---|
| -a--- | 2026/3/9 17:47 | 15265 | aiohappyeyeballs-2.6.1-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 455407 | aiohttp-3.13.3-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 7490 | aiosignal-1.4.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 5303 | annotated_doc-0.0.4-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 13643 | annotated_types-0.7.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 455156 | anthropic-0.84.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 113592 | anyio-4.12.1-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 1973200 | apache_tvm_ffi-0.1.9-cp312-abi3-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 27488 | astor-0.8.1-py2.py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 67615 | attrs-25.4.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 215704 | blake3-1.0.8-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 13900 | cachetools-7.0.4-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 69817 | cbor2-5.8.0-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 153684 | certifi-2026.2.25-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 183557 | cffi-2.0.0-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 142856 | charset_normalizer-3.4.5-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 108274 | click-8.3.1-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 22228 | cloudpickle-3.1.2-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 25335 | colorama-0.4.6-py2.py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 192620 | compressed_tensors-0.13.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 3480909 | cryptography-46.0.5-cp311-abi3-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 43903 | cuda_pathfinder-1.4.1-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 96267167 | cupy_cuda12x-14.0.1-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 39381 | depyf-0.20.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 120019 | dill-0.4.1-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 45550 | diskcache-5.6.3-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 20277 | distro-1.9.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 331094 | dnspython-2.8.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 36896 | docstring_parser-0.17.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 65638 | einops-0.8.2-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 35604 | email_validator-2.3.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 12304 | fastapi_cli-0.0.24-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 28359 | fastapi_cloud_cli-0.14.1-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 116999 | fastapi-0.135.1-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 490429 | fastar-0.8.0-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 26427 | filelock-3.25.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 209703185 | flashinfer_jit_cache-0.6.3-cp39-abi3-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 7651605 | flashinfer_python-0.6.3-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 44591 | frozenlist-1.8.0-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 202505 | fsspec-2026.2.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 114244 | gguf-0.18.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 22800 | grpcio_reflection-1.78.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 4797657 | grpcio-1.78.0-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 37515 | h11-0.16.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 78784 | httpcore-1.0.9-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 86694 | httptools-0.7.1-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 8960 | httpx_sse-0.4.3-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 73517 | httpx-0.28.1-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 566395 | huggingface_hub-0.36.2-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 71008 | idna-3.11-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 55500 | ijson-3.5.0-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 23635 | interegular-0.3.3-py37-none-any.whl |
| -a--- | 2026/3/9 17:47 | 134899 | jinja2-3.1.6-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 205424 | jiter-0.13.0-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 20419 | jmespath-1.1.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 18437 | jsonschema_specifications-2025.9.1-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 90630 | jsonschema-4.26.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 111036 | lark-1.2.2-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 30332380 | llvmlite-0.44.0-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 45418 | lm_format_enforcer-0.11.3-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 61595 | loguru-0.7.3-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 87321 | markdown_it_py-4.0.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 15105 | markupsafe-3.0.3-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 233615 | mcp-1.26.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 9979 | mdurl-0.1.2-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 6518623 | mistral_common-1.9.1-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 105738 | model_hosting_container_standards-0.1.13-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 536198 | mpmath-1.3.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 72708 | msgpack-1.1.2-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 190024 | msgspec-0.20.0-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 46053 | multidict-6.7.1-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 2068504 | networkx-3.6.1-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 309975 | ninja-1.13.0-py3-none-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 2831929 | numba-0.61.2-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 12614190 | numpy-2.2.6-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 1591041 | nvidia_cudnn_frontend-1.18.0-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 50680 | nvidia_ml_py-13.590.48-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 2438369 | openai_harmony-0.0.8-cp38-abi3-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 1136409 | openai-2.26.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 40070414 | opencv_python_headless-4.13.0.92-cp37-abi3-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 2060945 | outlines_core-0.2.11-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 74366 | packaging-26.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 10877 | partial_json_parser-0.2.1.1.post7-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 7033367 | pillow-12.1.1-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 22424 | portalocker-3.2.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 64057 | prometheus_client-0.24.1-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 19296 | prometheus_fastapi_instrumentator-7.1.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 41655 | propcache-0.4.1-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 437118 | protobuf-6.33.5-cp310-abi3-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 137737 | psutil-7.2.2-cp37-abi3-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 22335 | py_cpuinfo-9.0.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 35833 | pybase64-1.4.3-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 8044600 | pycountry-26.2.16-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 48172 | pycparser-3.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 2020145 | pydantic_core-2.41.5-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 74296 | pydantic_extra_types-2.11.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 58929 | pydantic_settings-2.13.1-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 463580 | pydantic-2.12.5-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 1225217 | pygments-2.19.2-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 28224 | pyjwt-2.11.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 22101 | python_dotenv-1.2.2-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 15548 | python_json_logger-4.0.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 24579 | python_multipart-0.0.22-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 9495040 | pywin32-311-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 154003 | pyyaml-6.0.3-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 619480 | pyzmq-27.1.0-cp312-abi3-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 27427353 | ray-2.54.0-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 26766 | referencing-0.37.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 277297 | regex-2026.2.28-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 64738 | requests-2.32.5-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 32963 | rich_toolkit-0.19.7-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 310458 | rich-14.3.3-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 726090 | rignore-0.7.6-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 240463 | rpds_py-0.30.0-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 341380 | safetensors-0.7.0-cp38-abi3-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 1054671 | sentencepiece-0.2.1-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 439198 | sentry_sdk-2.54.0-py2.py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 13247 | setproctitle-1.3.7-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 1064234 | setuptools-80.10.2-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 9755 | shellingham-1.5.4-py2.py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 11050 | six-1.17.0-py2.py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 10235 | sniffio-1.3.1-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 14270 | sse_starlette-3.3.2-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 74272 | starlette-0.52.1-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 320736 | supervisor-4.3.0-py2.py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 6299353 | sympy-1.14.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 39814 | tabulate-0.10.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 878694 | tiktoken-0.12.0-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 2747786 | tokenizers-0.22.2-cp39-abi3-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 113757972 | torch-2.10.0-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 78374 | tqdm-4.67.3-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 11993498 | transformers-4.57.6-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 47382693 | triton_windows-3.6.0.post25-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 56085 | typer-0.24.1-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 44614 | typing_extensions-4.15.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 14611 | typing_inspection-0.4.2-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 131584 | urllib3-2.6.3-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 68783 | uvicorn-0.41.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:49 | 269063485 | vllm-0.16.0rc2.dev243+gc8e1f5abe.d20260309.cu126-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 288410 | watchfiles-1.1.1-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 178693 | websockets-16.0-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 4083 | win32_setctime-1.2.0-py3-none-any.whl |
| -a--- | 2026/3/9 17:47 | 671496 | winloop-0.5.0-cp312-cp312-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 2639032 | xformers-0.0.35.dev1121-cp39-abi3-win_amd64.whl |
| -a--- | 2026/3/9 17:47 | 87674 | yarl-1.23.0-cp312-cp312-win_amd64.whl |
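The listing above includes torch-2.10.0-cp312-cp312-win_amd64.whl, while the build environment uses PyTorch 2.7.1+cu126, so it may be worth checking which torch wheel a dependency directory contains before installing from it. The sketch below is one way to do that under the assumption that file names follow the usual `<name>-<version>-...` pattern; `versions_of` is a hypothetical helper and the list is a three-item excerpt of the listing.

```python
def versions_of(package, wheel_names):
    """Return the version field of every wheel whose name field equals `package`."""
    prefix = package + "-"
    return [name.split("-")[1] for name in wheel_names if name.startswith(prefix)]


# Excerpt of the directory listing above.
names = [
    "torch-2.10.0-cp312-cp312-win_amd64.whl",
    "tqdm-4.67.3-py3-none-any.whl",
    "vllm-0.16.0rc2.dev243+gc8e1f5abe.d20260309.cu126-cp312-cp312-win_amd64.whl",
]
found = versions_of("torch", names)
print(found)                                     # ['2.10.0']
print([v for v in found if v != "2.7.1+cu126"])  # ['2.10.0'], differs from the build env
```

On a real directory the names could come from `os.listdir` on the wheel folder.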
9. Comparison with the previous article
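| Item | Previous article (cu128) | This article (cu126) |
|---|---|---|
| CUDA version used for the build | 12.8 | 12.6 |
| Match with torch | Partial match | Exact match ✅ |
| subst Z: points to | CUDA 12.8 directory | CUDA 12.6 directory |
| Wheel size | 269 MB | 269 MB |
| Build date | 20260308 | 20260309 |
Both wheels can be kept side by side; install whichever one matches the torch + CUDA version of the target environment.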
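Choosing between the two wheels can be automated from the target environment's torch version string. This is a minimal sketch: `pick_wheel` is a hypothetical helper, and the cu128 file name is an illustrative placeholder, not the actual name of the wheel produced in the previous article.

```python
def pick_wheel(wheel_names, torch_version):
    """Return the first wheel whose version ends with torch's '+cuXXX' tag, else None."""
    _, _, tag = torch_version.partition("+")
    if not tag:
        return None
    for name in wheel_names:
        version = name.split("-")[1]
        if version.endswith("." + tag):
            return name
    return None


wheels = [
    "vllm-0.16.0+example.cu128-cp312-cp312-win_amd64.whl",  # hypothetical name
    "vllm-0.16.0rc2.dev243+gc8e1f5abe.d20260309.cu126-cp312-cp312-win_amd64.whl",
]
print(pick_wheel(wheels, "2.7.1+cu126"))
```

In practice `torch_version` would be the value of `torch.__version__` in the environment that will receive the wheel.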