MADbench2

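MADbench2 is a tool for testing the overall integrated performance of the I/O, communication, and calculation subsystems of massively parallel architectures under the stress of a real scientific application.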

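MADbench2 is based on the MADspec code, which calculates the maximum-likelihood angular power spectrum of the cosmic microwave background (CMB) radiation from a noisy pixelized map of the sky and its pixel-pixel noise correlation matrix. MADbench2 retains the full computational complexity of its parent science application code, but uses self-generated pseudo-data, which bypasses the many computationally irrelevant details of handling real CMB data sets.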

MADbench2 can be run in two modes:

regular mode, in which the full code is run.
IO mode, in which all calculation/communication is replaced with busy-work.

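In addition, MADbench2 can be run as a single gang or as multiple gangs. In the former, all matrix operations are carried out distributed over all processors. In the latter, the matrices are built, summed, and inverted over all processors (S & D), then redistributed over subsets of processors (gangs) for their subsequent manipulations (W & C). This gang-parallelism keeps the data dense on the processors during the dominant matrix-matrix multiplication (W) phase, even at very large processor counts.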

Official website

https://crd.lbl.gov/divisions/scidata/c3/c3-research/madbench2/

Download

https://crd.lbl.gov/assets/Uploads/MADbench2.tar

Compiling

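To run in regular mode, MADbench2 must be linked against the ScaLAPACK and LAPACK libraries and their dependencies (BLAS, PBLAS, BLACS). The MADbench2.h file contains system-specific definitions and declarations; extend it as needed and compile the code with -D SYSTEM.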

To run in IO mode, compile MADbench2 with -D IO (in addition to -D SYSTEM); all library calls are then redefined as busy-work, so no libraries are needed.


mpicc -D SYSTEM -D COLUMBIA -D IO -o MADbench2.x MADbench2.c -lm
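To make the effect of -D IO more concrete, here is a minimal sketch of how a compile-time flag can swap a library call for busy-work. It is not taken from MADbench2.c: the names `matrix_op`, `busy_work`, and `run_phase` are hypothetical, and the `#define IO 1` line only stands in for passing -D IO on the command line so that the snippet compiles on its own.

```c
#include <stddef.h>

/* Stands in for -D IO on the compiler command line (assumption for this sketch). */
#define IO 1

#ifndef IO
/* Regular mode: the routine would come from a linked library (hypothetical name). */
double matrix_op(const double *a, size_t n);
#else
/* IO mode: the same name is redefined as busy-work that only burns a few
   cycles, so no external library has to be linked. */
static double busy_work(const double *a, size_t n)
{
    volatile double sink = 0.0;
    for (size_t i = 0; i < n; i++)
        sink += a[i] * 0.5;
    return sink;
}
#define matrix_op(a, n) busy_work((a), (n))
#endif

/* Caller code is identical in both modes; only the macro decides what runs. */
double run_phase(const double *a, size_t n)
{
    return matrix_op(a, n);
}
```

Removing the `#define IO 1` line turns `matrix_op` back into an external symbol, which is why a regular-mode build needs the libraries at link time.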

Changing the file system path

The path is hard-coded in the source. Before compiling, edit MADbench2.c (lines 271, 275, and 276):


for (n=0; n<no_pe; n++) {
    /* stat() must check the same directory that mkdir() creates */
    if (my_pe==n && stat("/mnt/gkfs/files", &buf)!=0) mkdir("/mnt/gkfs/files", S_IRWXU);
    PMPI_Barrier(MPI_COMM_WORLD);
}
    
if (strcmp(FILETYPE, "UNIQUE")==0) sprintf(filename, "/mnt/gkfs/files/data_%d", my_pe);
else sprintf(filename, "/mnt/gkfs/files/data");
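If you would rather not edit the source for every new mount point, one option is to build the path from a directory chosen at run time. The sketch below is only an illustration of that idea, not part of MADbench2: the helper names and the MADBENCH_DIR environment variable are hypothetical, and the fallback to "files" is an assumption for this sketch.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build the data-file path for one process.
   base_dir: directory holding the files, e.g. "/mnt/gkfs/files".
   filetype: "UNIQUE" gives one file per process, anything else one shared file.
   Returns 0 on success, -1 if the path does not fit in out. */
int build_data_path(char *out, size_t out_len, const char *base_dir,
                    const char *filetype, int my_pe)
{
    int n;
    if (strcmp(filetype, "UNIQUE") == 0)
        n = snprintf(out, out_len, "%s/data_%d", base_dir, my_pe);
    else
        n = snprintf(out, out_len, "%s/data", base_dir);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

/* Hypothetical: take the directory from MADBENCH_DIR, defaulting to "files". */
const char *data_dir(void)
{
    const char *dir = getenv("MADBENCH_DIR");
    return (dir != NULL && dir[0] != '\0') ? dir : "files";
}
```

Using snprintf instead of sprintf also guards against overflowing the filename buffer when the mount point is long.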

Running

Command-line arguments:


MADbench2.x   $NO_PIX   $NO_BIN   $NO_GANG   $SBLOCKSIZE   $FBLOCKSIZE   $RMOD   $WMOD


NO_PIX	Sets the size of the pseudo-data - all the component matrices have NO_PIX x NO_PIX elements
NO_BIN	Sets the size of the pseudo-dataset - there are NO_BIN component matrices
NO_GANG	Sets the level of gang-parallelism - there are NO_GANG gangs
SBLOCKSIZE	Sets the ScaLAPACK blocksize - all matrices will be block-cyclically distributed with side SBLOCKSIZE.
FBLOCKSIZE	Sets the file blocksize - all IO will start at a file-offset that is an integer multiple of FBLOCKSIZE.
RMOD	Sets the degree of simultaneous reading - 1:RMOD processors will read at once.
WMOD	Sets the degree of simultaneous writing - 1:WMOD processors will write at once.
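NO_PIX and NO_BIN together fix the size of the pseudo-dataset, so it can help to estimate the data volume before picking values. The sketch below does that arithmetic under two assumptions that are not stated in the table: matrix elements are 8-byte doubles, and each matrix is split evenly across processes with no padding to FBLOCKSIZE. The function names are hypothetical.

```c
#include <stdint.h>

/* Bytes in one NO_PIX x NO_PIX matrix, assuming double-precision elements. */
uint64_t matrix_bytes(uint64_t no_pix)
{
    return no_pix * no_pix * (uint64_t)sizeof(double);
}

/* Bytes in the whole pseudo-dataset of NO_BIN component matrices. */
uint64_t dataset_bytes(uint64_t no_pix, uint64_t no_bin)
{
    return no_bin * matrix_bytes(no_pix);
}

/* Even share of one matrix per process (ignores any file-block padding). */
uint64_t matrix_bytes_per_pe(uint64_t no_pix, uint64_t no_pe)
{
    return matrix_bytes(no_pix) / no_pe;
}
```

For NO_PIX = 640 and NO_BIN = 80, this gives 3,276,800 bytes per matrix and 262,144,000 bytes for the full set; spread over 4 processes, each holds 819,200 bytes of every matrix.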

Running MADbench2 requires:

a square number of processors
a uniform square number of processors per gang
a uniform number of bins per gang
a scalapack blocksize that distributes some data to every processor
a file blocksize that is a whole number of doubles
a number of gangs that is exactly divisible by the read-modulus and the write-modulus
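Several of the requirements above can be checked before submitting a job. The following is a rough sketch of such a pre-flight check, not code from MADbench2: the function names are hypothetical, FBLOCKSIZE is assumed to be given in bytes, and the ScaLAPACK-blocksize and read/write-modulus conditions are left out because the list does not give enough detail to code them without guessing.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if n is a perfect square (n >= 1). */
static bool is_square(int n)
{
    if (n < 1) return false;
    long r = 0;
    while (r * r < n) r++;
    return r * r == n;
}

/* Check the requirements that depend only on simple arithmetic. */
bool check_madbench_args(int no_pe, int no_bin, int no_gang, int fblocksize)
{
    if (no_gang < 1 || no_bin < 1 || fblocksize < 1) return false;
    if (!is_square(no_pe)) return false;             /* square number of processors */
    if (no_pe % no_gang != 0) return false;          /* uniform processors per gang */
    if (!is_square(no_pe / no_gang)) return false;   /* ...and that number is square */
    if (no_bin % no_gang != 0) return false;         /* uniform bins per gang */
    if ((size_t)fblocksize % sizeof(double) != 0)    /* whole number of doubles */
        return false;
    return true;
}
```

For instance, 16 processes in 4 gangs with 80 bins pass these checks (4 processes and 20 bins per gang), while 16 processes in 2 gangs fail because 8 processes per gang is not a square number.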


fakerth@fakerth-IdeaCentre-GeekPro-17IRB:~$ mpirun -np 4 MADbench2.x 640 80 1 8 8 4 4

MADbench 2.0 IO-mode
no_pe = 4  no_pix = 640  no_bin = 80  no_gang = 1  sblocksize = 8  fblocksize = 8  r_mod = 4  w_mod = 4
IOMETHOD = POSIX  IOMODE = SYNC  FILETYPE = UNIQUE  REMAP = CUSTOM

S_cc         0.00   [      0.00:      0.00]
S_bw         0.01   [      0.01:      0.01]
S_w          0.09   [      0.09:      0.09]
          -------
S_total      0.09   [      0.09:      0.09]

W_cc         0.01   [      0.01:      0.01]
W_bw         3.96   [      3.96:      3.96]
W_r          0.03   [      0.03:      0.03]
W_w          0.03   [      0.03:      0.03]
          -------
W_total      4.03   [      4.03:      4.03]

C_cc         0.00   [      0.00:      0.00]
C_bw         0.01   [      0.01:      0.01]
C_r          0.05   [      0.05:      0.05]
          -------
C_total      0.06   [      0.06:      0.06]


dC[0] = 0.00000e+00

Environment variables

Variable	Allowed Values	Default
IOMETHOD	POSIX, MPI	POSIX
IOMODE	SYNC, ASYNC	SYNC
FILETYPE	UNIQUE, SHARED	UNIQUE
REMAP	CUSTOM, SCALAPACK	CUSTOM
BWEXP	Any number	None

For example, to use MPI-IO, set IOMETHOD to MPI, not MPIIO; export IOMETHOD=MPIIO produces an error.


fakerth@fakerth-IdeaCentre-GeekPro-17IRB:~$ export IOMETHOD=MPI
fakerth@fakerth-IdeaCentre-GeekPro-17IRB:~$ mpirun -np 4 MADbench2.x 640 80 1 8 8 4 4

MADbench 2.0 IO-mode
no_pe = 4  no_pix = 640  no_bin = 80  no_gang = 1  sblocksize = 8  fblocksize = 8  r_mod = 4  w_mod = 4
IOMETHOD = MPI  IOMODE = SYNC  FILETYPE = UNIQUE  REMAP = CUSTOM

S_cc         0.00   [      0.00:      0.00]
S_bw         0.01   [      0.01:      0.01]
S_w          0.09   [      0.09:      0.09]
          -------
S_total      0.10   [      0.10:      0.10]

W_cc         0.01   [      0.01:      0.01]
W_bw         3.96   [      3.96:      3.96]
W_r          0.03   [      0.03:      0.03]
W_w          0.03   [      0.03:      0.03]
          -------
W_total      4.03   [      4.03:      4.03]

C_cc         0.00   [      0.00:      0.00]
C_bw         0.01   [      0.01:      0.01]
C_r          0.03   [      0.03:      0.03]
          -------
C_total      0.04   [      0.04:      0.04]


dC[0] = 0.00000e+00
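The MPI versus MPIIO pitfall comes down to matching an environment variable against a fixed list of allowed values. The sketch below shows one common way to do that in C. It is an illustration only and does not reproduce how MADbench2 parses its settings: the function names are hypothetical, while the allowed values and the default are the ones listed for IOMETHOD in the table above.

```c
#include <stdlib.h>
#include <string.h>

/* Return the matching entry of allowed (a NULL-terminated list) for value,
   fallback if value is NULL (variable unset), or NULL if value is set but
   is not one of the allowed strings. */
const char *pick_setting(const char *value, const char *const allowed[],
                         const char *fallback)
{
    if (value == NULL) return fallback;
    for (size_t i = 0; allowed[i] != NULL; i++)
        if (strcmp(value, allowed[i]) == 0) return allowed[i];
    return NULL;
}

/* IOMETHOD accepts POSIX or MPI and defaults to POSIX. */
const char *iomethod_from_env(void)
{
    static const char *const allowed[] = { "POSIX", "MPI", NULL };
    return pick_setting(getenv("IOMETHOD"), allowed, "POSIX");
}
```

With this kind of exact string match, "MPIIO" is rejected rather than silently treated as "MPI", which matches the behaviour described above.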

Github

https://github.com/fakerst/application