GROMACS 2026 Beta: Complete Deployment Guide for a Heterogeneous Cluster
Environment: Ubuntu 22.04 LTS
Hardware:
- Node A: RTX 4090 (Driver 555 / CUDA 12.5)
- Node B: RTX 5090 (Driver 590 / CUDA 13.1)
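Core strategy:
Problem: the two nodes run different driver versions, which constrains the choice of CUDA toolkit.
Solution:
- Compiler: build with CUDA 12.5, the newest toolkit the 4090 node's driver (555) supports.
- GPU architecture strategy:
  - Compile sm_89 (native machine code for the 4090).
  - Also embed compute_89 PTX (for the 5090 driver to JIT-compile).
- Runtime behavior:
  - 4090 node: executes the sm_89 machine code directly, with no JIT overhead.
  - 5090 node: the driver JIT-compiles the PTX into sm_120 (Blackwell) instructions. The first launch is slower; later launches reuse the driver's compute cache and run at close to native speed.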
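The "slow first launch, fast afterwards" behavior on the 5090 node depends on the driver's JIT compute cache surviving between runs. A minimal sketch of pinning that cache to a known location is shown below. CUDA_CACHE_PATH, CUDA_CACHE_MAXSIZE and CUDA_CACHE_DISABLE are documented CUDA environment variables; the function name, the default directory and the 2 GiB limit are illustrative assumptions, not values taken from this guide.

```shell
#!/usr/bin/env bash
# Sketch: keep the CUDA JIT cache in a persistent, per-user directory so that
# PTX compiled on the first GROMACS launch is reused by later launches.
# setup_cuda_jit_cache is a hypothetical helper name, not part of CUDA or GROMACS.
setup_cuda_jit_cache() {
    # $1: cache directory (assumed default), $2: size limit in bytes (assumed 2 GiB)
    local cache_dir="${1:-$HOME/.nv/ComputeCache}"
    local max_bytes="${2:-2147483648}"

    mkdir -p "$cache_dir" || return 1
    export CUDA_CACHE_PATH="$cache_dir"
    export CUDA_CACHE_MAXSIZE="$max_bytes"
    export CUDA_CACHE_DISABLE=0   # make sure caching has not been switched off
}
```

In a job script this would be called once before gmx_mpi starts, for example `setup_cuda_jit_cache "/scratch/$USER/cuda-cache"` (the scratch path is hypothetical). A directory that persists across jobs is what makes the second launch fast.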
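1. Directory preparation
Run the following as root. All sources are kept in /public/sourcecode, and everything is installed under /public/software.
bash
# Define environment variables to simplify the later steps
export SOFT_ROOT=/public/software
export SRC_DIR=/public/sourcecode
# Create both directories if they do not exist yet
mkdir -p "$SOFT_ROOT" "$SRC_DIR"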
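2. Build GCC 11.4.0 (the foundation)
Time estimate: 30-60 minutes
GROMACS 2026 needs solid C++17/20 support. Building GCC 11 by hand under the shared prefix gives every node the same compiler and runtime libraries.
2.1 Download and extract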
bash
cd $SRC_DIR
wget http://ftp.gnu.org/gnu/gcc/gcc-11.4.0/gcc-11.4.0.tar.gz
tar -xvf gcc-11.4.0.tar.gz
cd gcc-11.4.0
# Run the bundled script to download the prerequisites (gmp, mpfr, mpc)
./contrib/download_prerequisites
2.2 Build and install
bash
mkdir build && cd build
# Configure (multilib disabled to avoid 32-bit build errors; install prefix set explicitly)
../configure --prefix=/public/software/gcc-11.4.0 --enable-languages=c,c++,fortran --disable-multilib --disable-bootstrap --disable-libsanitizer
# Build (using 8 cores)
make -j 8 2>&1 | tee full_build.log
make install
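Before building anything on top of the new compiler, it can help to confirm that the installed binary reports the expected version. The following is a minimal sketch; `check_gcc_install` is a hypothetical helper name, and its defaults simply mirror the prefix and version used above.

```shell
#!/usr/bin/env bash
# Sketch: confirm that the freshly installed compiler reports the expected version.
# check_gcc_install is a hypothetical helper; the defaults mirror the paths used above.
check_gcc_install() {
    local prefix="${1:-/public/software/gcc-11.4.0}"
    local expected="${2:-11.4.0}"
    local gcc_bin="$prefix/bin/gcc"

    if [ ! -x "$gcc_bin" ]; then
        echo "missing: $gcc_bin" >&2
        return 1
    fi

    # -dumpfullversion prints the full major.minor.patch version string
    local actual
    actual="$("$gcc_bin" -dumpfullversion)"
    if [ "$actual" != "$expected" ]; then
        echo "version mismatch: expected $expected, got $actual" >&2
        return 1
    fi
    echo "ok: $gcc_bin is GCC $actual"
}
```

Running `check_gcc_install` with no arguments checks the install location from step 2.2; a non-zero exit status means the build or install did not finish as expected.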
2.3 Create the GCC modulefile
Create the file /public/software/modules/modulefiles/gcc/11.4.0:
tcl
#%Module1.0
proc ModulesHelp {} {
global version modroot
puts stdout "\t loads GCC 11.4.0\n"
}
module-whatis "GCC 11.4.0"
set VERSION 11.4.0
set GCC_DIR /public/software/gcc-11.4.0
prepend-path PATH $GCC_DIR/bin
prepend-path LD_LIBRARY_PATH $GCC_DIR/lib64
prepend-path LD_LIBRARY_PATH $GCC_DIR/lib
prepend-path MANPATH $GCC_DIR/share/man
prepend-path C_INCLUDE_PATH $GCC_DIR/include
prepend-path CPLUS_INCLUDE_PATH $GCC_DIR/include
if [ module-info mode load ] {
system echo "GCC 11.4.0 is loaded"
}
if [ module-info mode switch2 ] {
system echo "GCC 11.4.0 is loaded"
}
if [ module-info mode remove ] {
system echo "GCC 11.4.0 is unloaded"
}
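The same modulefile skeleton recurs for every package in this guide, so it could also be generated rather than typed by hand. The sketch below is one way to do that under stated assumptions: `write_modulefile` and the `MODULEFILES_ROOT` variable are hypothetical names, and the generated file covers only PATH and one library directory, not the load/unload messages of the hand-written version.

```shell
#!/usr/bin/env bash
# Sketch: generate a minimal modulefile following the pattern used in this guide.
# write_modulefile is a hypothetical helper; it covers only PATH and one library directory.
write_modulefile() {
    local name="$1" version="$2" prefix="$3" libdir="${4:-lib}"
    local root="${MODULEFILES_ROOT:-/public/software/modules/modulefiles}"
    local target="$root/$name/$version"

    mkdir -p "$root/$name" || return 1
    # Unquoted heredoc: shell variables expand now; \$ keeps the Tcl variable literal.
    cat > "$target" <<EOF
#%Module1.0
module-whatis "$name $version"
conflict $name
set PKG_DIR $prefix
prepend-path PATH \$PKG_DIR/bin
prepend-path LD_LIBRARY_PATH \$PKG_DIR/$libdir
EOF
    echo "$target"
}
```

For example, `write_modulefile gcc 11.4.0 /public/software/gcc-11.4.0 lib64` would write a reduced version of the file above. The directory only becomes visible to `module avail` once it has been registered, for instance with `module use /public/software/modules/modulefiles`.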
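3. Deploy CMake
CMake can either be taken from the official prebuilt binary package or compiled from source with GCC. For convenience, this guide uses the official prebuilt package.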
bash
# 1. Download the official binary package (note the version: cmake-4.2.2)
cd $SRC_DIR
wget https://github.com/Kitware/CMake/releases/download/v4.2.2/cmake-4.2.2-linux-x86_64.tar.gz
# 2. Extract
tar -xvf cmake-4.2.2-linux-x86_64.tar.gz
# 3. Deploy
# The archive unpacks into a single top-level directory containing bin/ and share/; move it to the install location
mv cmake-4.2.2-linux-x86_64 /public/software/cmake4.2.2
Create the file /public/software/modules/modulefiles/cmake/4.2.2:
tcl
#%Module1.0
proc ModulesHelp {} {
global version modroot
puts stdout "\t loads CMAKE 4.2.2\n"
}
module-whatis "CMAKE 4.2.2"
set VERSION 4.2.2
set CMAKE_DIR /public/software/cmake4.2.2
prepend-path PATH $CMAKE_DIR/bin
setenv CMAKE_ROOT $CMAKE_DIR
if [ module-info mode load ] {
system echo "Cmake 4.2.2 is loaded"
}
if [ module-info mode switch2 ] {
system echo "Cmake 4.2.2 is loaded"
}
if [ module-info mode remove ] {
system echo "Cmake 4.2.2 is unloaded"
}
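4. Build OpenMPI 5.0.5
Dependencies: GCC 11 + CUDA 12.5
MPI must be built with CUDA-aware support so that GPU buffers can be passed to MPI calls directly.
4.1 Load the environment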
bash
# Load the GCC built above and the system CUDA
module load gcc/11.4.0
module load CUDA/12.5.0
4.2 Download and build
bash
cd $SRC_DIR
wget https://download.open-mpi.org/release/open-mpi/v5.0/openmpi-5.0.5.tar.gz
tar -zxvf openmpi-5.0.5.tar.gz
cd openmpi-5.0.5
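# --with-cuda enables CUDA-aware support; the path is the CUDA 12.5 install also used in step 7
./configure --prefix=/public/software/openmpi5.0.5_gcc11.4 --with-slurm --with-cuda=/public/software/cuda-12.5.0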
make -j 8
make install
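Since CUDA-aware support is the stated reason for building OpenMPI here, it is worth checking that the finished build actually has it. Open MPI exposes this through the `mpi_built_with_cuda_support` parameter in `ompi_info`. The sketch below wraps that check; `ompi_reports_cuda` is a hypothetical helper name that reads the `ompi_info` output on standard input.

```shell
#!/usr/bin/env bash
# Sketch: decide from ompi_info output whether Open MPI was built with CUDA support.
# ompi_reports_cuda is a hypothetical helper that reads `ompi_info --parsable --all` on stdin.
ompi_reports_cuda() {
    grep -q 'mpi_built_with_cuda_support:value:true'
}

# Typical use once the new Open MPI is first on PATH (left commented out here):
#   if ompi_info --parsable --all | ompi_reports_cuda; then
#       echo "CUDA-aware: yes"
#   else
#       echo "CUDA-aware: no (check the configure options)"
#   fi
```

A "no" result usually means configure did not find the CUDA toolkit, in which case config.log in the build directory is the place to look.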
4.3 Create the OpenMPI modulefile
Create the file /public/software/modules/modulefiles/openmpi/5.0.5_gcc11.4:
tcl
#%Module1.0
proc ModulesHelp {} {
global version modroot
puts stdout "\t loads openmpi 5.0.5\n"
}
module-whatis "openmpi 5.0.5"
conflict openmpi
set VERSION 5.0.5
set OPENMPI_DIR /public/software/openmpi5.0.5_gcc11.4
prepend-path PATH $OPENMPI_DIR/bin
prepend-path LD_LIBRARY_PATH $OPENMPI_DIR/lib
if [ module-info mode load ] {
system echo "OPENMPI 5.0.5 is loaded"
}
if [ module-info mode switch2 ] {
system echo "OPENMPI 5.0.5 is loaded"
}
if [ module-info mode remove ] {
system echo "OPENMPI 5.0.5 is unloaded"
}
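5. Build FFTW 3.3.10 (single precision, AVX-512)
Dependency: GCC 11
GROMACS computes in single precision (float) by default, so FFTW must be built with --enable-float. FFTW picks among the SIMD kernels compiled in (SSE2, AVX, AVX2, AVX-512) at run time, so enabling all of them is safe even if the node CPUs differ.
5.1 Download and build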
bash
module load gcc/11.4.0
cd $SRC_DIR
wget http://www.fftw.org/fftw-3.3.10.tar.gz
tar -zxvf fftw-3.3.10.tar.gz
cd fftw-3.3.10
# Key options: --enable-float (single precision) and the SIMD kernels up to AVX-512
./configure --prefix=/public/software/fftw3.3.10_gcc11.4 --enable-float --enable-threads --enable-sse2 --enable-avx --enable-avx2 --enable-avx512 --enable-shared CC=gcc F77=gfortran
make -j 8
make install
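Whether the AVX-512 kernels are ever used depends on the CPUs in each node, and the same question comes up again when choosing GMX_SIMD for GROMACS. The sketch below reads the CPU flags from /proc/cpuinfo and suggests a value. `cpu_has_flag` and `suggest_gmx_simd` are hypothetical helper names, and the optional file argument exists only so the logic can be exercised without real hardware.

```shell
#!/usr/bin/env bash
# Sketch: look up SIMD flags in /proc/cpuinfo and suggest a GMX_SIMD value.
# cpu_has_flag and suggest_gmx_simd are hypothetical helpers.
cpu_has_flag() {
    local flag="$1" cpuinfo="${2:-/proc/cpuinfo}"
    # Match the flag as a whole word on the first "flags" line
    grep -m 1 '^flags' "$cpuinfo" | grep -qw -- "$flag"
}

suggest_gmx_simd() {
    local cpuinfo="${1:-/proc/cpuinfo}"
    if cpu_has_flag avx512f "$cpuinfo"; then
        echo "AVX_512"
    elif cpu_has_flag avx2 "$cpuinfo"; then
        echo "AVX2_256"
    else
        echo "AUTO"   # leave the choice to GROMACS' own detection
    fi
}
```

Running `suggest_gmx_simd` on each node shows whether the nodes agree. If they do not, the lowest common level is the safe choice for a single shared build, and on GPU-offloaded runs AVX2_256 is often close to AVX_512 in practice.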
5.2 Create the FFTW modulefile
Create the file /public/software/modules/modulefiles/fftw/3.3.10_gcc11.4:
tcl
#%Module1.0
proc ModulesHelp {} {
global version modroot
puts stdout "\t loads fftw 3.3.10 \n"
}
module-whatis "FFTW 3.3.10"
set VERSION 3.3.10
set FFTW_DIR /public/software/fftw3.3.10_gcc11.4
prepend-path PATH ${FFTW_DIR}/bin
prepend-path LD_LIBRARY_PATH ${FFTW_DIR}/lib
if [ module-info mode load ] {
system echo "fftw 3.3.10 is loaded"
}
if [ module-info mode switch2 ] {
system echo "fftw 3.3.10 is loaded"
}
if [ module-info mode remove ] {
system echo "fftw 3.3.10 is unloaded"
}
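6. Deploy LibTorch (the core of the AI potentials)
No compilation is needed, but the correct build must be downloaded.
6.1 Download
This guide uses LibTorch 2.4.0 built for CUDA 12.4. It must be the cxx11 ABI variant so that it links cleanly against code compiled with GCC 11.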
bash
cd $SRC_DIR
# Download
wget https://download.pytorch.org/libtorch/cu124/libtorch-cxx11-abi-shared-with-deps-2.4.0%2Bcu124.zip
# Extract
unzip libtorch-cxx11-abi-shared-with-deps-*.zip
# Move into place; the result is /public/software/lib/libtorch
mkdir -p /public/software/lib
mv libtorch /public/software/lib/
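Because LibTorch is only unpacked and moved, a wrong path tends to surface much later, at GROMACS configure time. A quick layout check right after this step can catch it early. The sketch below assumes the usual LibTorch archive layout; `check_libtorch` is a hypothetical helper name, and its default path is the location produced by the commands above.

```shell
#!/usr/bin/env bash
# Sketch: check that an unpacked LibTorch tree has the pieces CMake will look for.
# check_libtorch is a hypothetical helper; the default path matches the layout above.
check_libtorch() {
    local root="${1:-/public/software/lib/libtorch}"
    local missing=0
    local f
    for f in share/cmake/Torch/TorchConfig.cmake lib/libtorch.so include/torch; do
        if [ ! -e "$root/$f" ]; then
            echo "missing: $root/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

The first entry is the one that matters most for step 7: the directory containing TorchConfig.cmake is what gets passed to CMake as Torch_DIR.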
7. Build GROMACS 2026 Beta
This final step brings all the dependencies together.
7.1 Prepare the environment
bash
# Start from a clean environment
module purge
# Load the modules created above
module load gcc/11.4.0
module load CUDA/12.5.0
module load openmpi/5.0.5_gcc11.4
module load fftw/3.3.10_gcc11.4
module load cmake/4.2.2
# Check the CMake version (GROMACS 2025 already requires 3.28 or newer)
cmake --version
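The version check above can be made mechanical instead of visual. The sketch below compares dotted version strings with GNU `sort -V`. `version_ge` and `cmake_version` are hypothetical helper names, and the minimum to compare against should come from the GROMACS install guide for the version being built.

```shell
#!/usr/bin/env bash
# Sketch: compare dotted version strings with GNU sort -V.
# version_ge is a hypothetical helper; it succeeds when $1 >= $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Extract the version number from the first line of `cmake --version`,
# which has the form "cmake version X.Y.Z".
cmake_version() {
    cmake --version | head -n 1 | awk '{print $3}'
}
```

With `required` set to the documented minimum, `version_ge "$(cmake_version)" "$required" || echo "CMake too old"` stops a build script before a long configure run fails.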
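7.2 Download and configure the source
bash
cd $SRC_DIR
# Clone the source (this checks out the default branch; switch to the beta tag or release branch you intend to build)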
git clone https://gitlab.com/gromacs/gromacs.git gromacs-2026-beta
cd gromacs-2026-beta
mkdir build
cd build
# CMake command (can be copied as-is)
# The libtorch switch is spelled GMX_NNPOT=TORCH in the GROMACS 2025 install guide;
# run `cmake -LH ..` in your checkout to confirm the option name for 2026.
cmake .. \
-DCMAKE_INSTALL_PREFIX=/public/software/gromacs-2026-beta \
-DCMAKE_C_COMPILER=mpicc \
-DCMAKE_CXX_COMPILER=mpicxx \
-DCMAKE_BUILD_TYPE=Release \
\
-DGMX_MPI=ON \
-DGMX_OPENMP=ON \
-DGMX_GPU=CUDA \
-DCUDA_TOOLKIT_ROOT_DIR=/public/software/cuda-12.5.0 \
\
-DGMX_CUDA_TARGET_SM="89" \
-DCMAKE_CUDA_FLAGS="-gencode arch=compute_89,code=compute_89" \
\
-DGMX_FFT_LIBRARY=fftw3 \
-DFFTWF_LIBRARY=/public/software/fftw3.3.10_gcc11.4/lib/libfftw3f.so \
-DFFTWF_INCLUDE_DIR=/public/software/fftw3.3.10_gcc11.4/include \
\
-DGMX_NNPOT=TORCH \
-DTorch_DIR=/public/software/lib/libtorch/share/cmake/Torch \
\
-DGMX_EXTERNAL_LAPACK=ON \
-DGMX_SIMD=AVX2_256 \
-DGMX_BUILD_OWN_FFTW=OFF \
-DGMX_THREAD_MPI=OFF \
-DBUILD_SHARED_LIBS=ON \
-DREGRESSIONTEST_DOWNLOAD=OFF
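Key parameters, once more:
- GMX_CUDA_TARGET_SM="89": generates native machine code (SASS) for the 4090 only.
- code=compute_89 in CMAKE_CUDA_FLAGS: additionally embeds PTX intermediate code. This is what allows the 5090 (Blackwell architecture) to run a binary built for the 4090, because its driver can JIT-compile the PTX.
- GMX_EXTERNAL_LAPACK=ON: expects an external LAPACK to be discoverable at configure time. The modulefile in 7.4 points at OpenBLAS under /public/software/openblas-0.3.28, whose installation this guide does not cover.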
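The compatibility rules behind this choice can be written down as a small decision function: machine code built for sm_89 runs on GPUs with the same major compute capability and an equal or higher minor version, while embedded PTX can be JIT-compiled for any newer architecture but not for an older one. The sketch below only illustrates those rules; `gpu_exec_path` is a hypothetical helper name and does not query any hardware.

```shell
#!/usr/bin/env bash
# Sketch: predict how a binary built with sm_89 SASS plus compute_89 PTX
# will run on a GPU of a given compute capability (e.g. "8.9" or "12.0").
# gpu_exec_path is a hypothetical helper that only illustrates the compatibility rules.
gpu_exec_path() {
    local cc="$1"
    local built_major=8 built_minor=9
    local major="${cc%%.*}" minor="${cc#*.}"

    if [ "$major" -eq "$built_major" ] && [ "$minor" -ge "$built_minor" ]; then
        echo "native"        # SASS runs on the same major version with an equal or higher minor
    elif [ "$major" -gt "$built_major" ]; then
        echo "jit"           # no matching SASS; the driver compiles the embedded PTX
    else
        echo "unsupported"   # PTX cannot target an older architecture
    fi
}
```

On recent drivers the compute capability of an installed GPU can be read with `nvidia-smi --query-gpu=compute_cap --format=csv,noheader` and passed to the function, so each node can report which path it will take.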
7.3 Build and install
bash
make -j 8
make install
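After installation, the sm_89 plus PTX strategy can be verified on the binary itself. The CUDA toolkit's `cuobjdump` lists embedded machine code with `--list-elf` and embedded PTX with `--list-ptx`. The sketch below checks two saved listings; `has_sass_and_ptx` is a hypothetical helper name, and the library file name in the commented usage is an assumption about how the MPI build names its library.

```shell
#!/usr/bin/env bash
# Sketch: check cuobjdump listings for both sm_89 machine code and embedded PTX.
# has_sass_and_ptx is a hypothetical helper; it takes the arch suffix and two
# files holding the output of `cuobjdump --list-elf` and `cuobjdump --list-ptx`.
has_sass_and_ptx() {
    local arch="$1" elf_list="$2" ptx_list="$3"
    grep -q "${arch}.*\.cubin" "$elf_list" || { echo "no $arch SASS found" >&2; return 1; }
    grep -q "\.ptx" "$ptx_list" || { echo "no PTX found" >&2; return 1; }
    echo "ok: $arch SASS and PTX present"
}

# Typical use after `make install` (library name assumed; left commented out):
#   lib=/public/software/gromacs-2026-beta/lib/libgromacs_mpi.so
#   cuobjdump --list-elf "$lib" > elf.txt
#   cuobjdump --list-ptx "$lib" > ptx.txt
#   has_sass_and_ptx sm_89 elf.txt ptx.txt
```

If the PTX listing is empty, the build will still run on the 4090 node but fail to launch kernels on the 5090 node, so this is the check to run before moving on.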
7.4 Final GROMACS modulefile
Create the file /public/software/modules/modulefiles/gromacs/2026-beta:
tcl
#%Module1.0
proc ModulesHelp {} {
global version modroot
puts stdout "\t loads CUDA/12.5.0\n loads openmpi/5.0.5_gcc11.4\n load GNU/11.4.0\n"
}
module-whatis "GROMACS 2026-beta"
conflict CUDA
conflict GNU
conflict gcc
conflict openmpi
conflict cmake
conflict fftw
set VERSION 2026-beta
set CUDA_DIR /public/software/cuda-12.5.0
set OPENMPI_DIR /public/software/openmpi5.0.5_gcc11.4
set GCC_DIR /public/software/gcc-11.4.0
set FFTW_DIR /public/software/fftw3.3.10_gcc11.4
set GROMACS_DIR /public/software/gromacs-2026-beta
set OPENBLAS_DIR /public/software/openblas-0.3.28
prepend-path PATH $GCC_DIR/bin
prepend-path LD_LIBRARY_PATH $GCC_DIR/lib64
prepend-path LD_LIBRARY_PATH $GCC_DIR/lib
prepend-path MANPATH $GCC_DIR/share/man
prepend-path C_INCLUDE_PATH $GCC_DIR/include
prepend-path CPLUS_INCLUDE_PATH $GCC_DIR/include
prepend-path C_INCLUDE_PATH ${CUDA_DIR}/include
prepend-path CPLUS_INCLUDE_PATH ${CUDA_DIR}/include
prepend-path CXX_INCLUDE_PATH ${CUDA_DIR}/include
prepend-path PATH ${CUDA_DIR}/bin
prepend-path LD_LIBRARY_PATH ${CUDA_DIR}/lib64
prepend-path PATH $OPENMPI_DIR/bin
prepend-path LD_LIBRARY_PATH $OPENMPI_DIR/lib
prepend-path PATH ${FFTW_DIR}/bin
prepend-path LD_LIBRARY_PATH ${FFTW_DIR}/lib
prepend-path PATH $GROMACS_DIR/bin
prepend-path LD_LIBRARY_PATH $GROMACS_DIR/lib
prepend-path MANPATH $GROMACS_DIR/share/man
prepend-path LD_LIBRARY_PATH $OPENBLAS_DIR/lib
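# GMXRC.bash only takes effect when sourced by the user's shell, so the variables it would export are set here directly
setenv GMXBIN $GROMACS_DIR/bin
setenv GMXLDLIB $GROMACS_DIR/lib
setenv GMXMAN $GROMACS_DIR/share/man
setenv GMXDATA $GROMACS_DIR/share/gromacs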
if [ module-info mode load ] {
system echo "GROMACS 2026-beta is loaded"
}
if [ module-info mode switch2 ] {
system echo "GROMACS 2026-beta is loaded"
}
if [ module-info mode remove ] {
system echo "GROMACS 2026-beta is unloaded"
}
8. Verification
Verify on the 5090 node:
bash
module load gromacs/2026-beta
gmx_mpi --version
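The output should report CUDA driver: 13.1 and CUDA runtime: 12.5. This mismatch is the expected state in the heterogeneous setup: the driver on the 5090 node is newer than the toolkit GROMACS was built with.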
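The driver/runtime comparison can also be scripted, which is convenient when checking several nodes. The sketch below parses a saved copy of the `gmx_mpi --version` output and confirms that the driver is at least as new as the runtime. `cuda_field` and `driver_covers_runtime` are hypothetical helper names, and the parsing assumes the "CUDA driver:" and "CUDA runtime:" labels quoted above.

```shell
#!/usr/bin/env bash
# Sketch: pull the CUDA driver and runtime versions out of `gmx_mpi --version`
# output and confirm the driver is at least as new as the runtime.
# cuda_field and driver_covers_runtime are hypothetical helpers reading a saved copy of the output.
cuda_field() {
    # $1: label such as "CUDA driver", $2: file with the version output
    awk -F: -v label="$1" '$1 == label { gsub(/[ \t]/, "", $2); print $2; exit }' "$2"
}

driver_covers_runtime() {
    local drv rt
    drv="$(cuda_field "CUDA driver" "$1")"
    rt="$(cuda_field "CUDA runtime" "$1")"
    [ -n "$drv" ] && [ -n "$rt" ] || return 2
    # sort -V orders dotted versions; the runtime must sort first or be equal
    [ "$(printf '%s\n%s\n' "$drv" "$rt" | sort -V | head -n 1)" = "$rt" ]
}
```

A typical use is `gmx_mpi --version > version.txt` followed by `driver_covers_runtime version.txt`. A failing check on a node means its driver is older than the CUDA runtime the build expects.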