Intel oneAPI provides its own test program for mpirun (IMB-MPI1, the Intel MPI Benchmarks).
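Before launching, the oneAPI environment must be loaded so that mpirun and IMB-MPI1 resolve on every node. A minimal sketch, assuming the default install prefix /opt/intel/oneapi:
source /opt/intel/oneapi/setvars.sh
which mpirun IMB-MPI1    # both should resolve inside the oneAPI installation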
mpirun -np 3 -hosts dell1,dell2,dell3 -ppn 1 IMB-MPI1 PingPong
libibverbs: Warning: couldn't load driver 'libvmw_pvrdma-rdmav34.so': libvmw_pvrdma-rdmav34.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libvmw_pvrdma-rdmav34.so': libvmw_pvrdma-rdmav34.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libvmw_pvrdma-rdmav34.so': libvmw_pvrdma-rdmav34.so: cannot open shared object file: No such file or directory
#----------------------------------------------------------------
# Intel(R) MPI Benchmarks 2021.10, MPI-1 part
#----------------------------------------------------------------
# Date : Thu Jan 1 23:52:32 2026
# Machine : x86_64
# System : Linux
# Release : 6.8.0-90-generic
# Version : #91-Ubuntu SMP PREEMPT_DYNAMIC Tue Nov 18 14:14:30 UTC 2025
# MPI Version : 4.1
# MPI Thread Environment:
# Calling sequence was:
# IMB-MPI1 PingPong
# Minimum message length in bytes: 0
# Maximum message length in bytes: 4194304
#
# MPI_Datatype : MPI_BYTE
# MPI_Datatype for reductions : MPI_FLOAT
# MPI_Op : MPI_SUM
#
#
# List of Benchmarks to run:
# PingPong
#---------------------------------------------------
# Benchmarking PingPong
# #processes = 2
# ( 1 additional process waiting in MPI_Barrier)
#---------------------------------------------------
#bytes #repetitions t[usec] Mbytes/sec
0 1000 1.52 0.00
1 1000 1.52 0.66
2 1000 1.53 1.31
4 1000 1.53 2.62
8 1000 1.55 5.15
16 1000 1.55 10.33
32 1000 1.62 19.75
64 1000 1.77 36.18
128 1000 1.97 65.13
256 1000 2.36 108.40
512 1000 2.75 186.49
1024 1000 3.41 300.04
2048 1000 5.69 360.00
4096 1000 8.95 457.46
8192 1000 13.61 601.72
16384 1000 22.13 740.22
32768 1000 23.39 1401.23
65536 640 40.23 1629.19
131072 320 76.55 1712.19
262144 160 144.37 1815.84
524288 80 274.70 1908.59
1048576 40 538.78 1946.20
2097152 20 1067.31 1964.89
4194304 10 2123.86 1974.85
# All processes entering MPI_Finalize
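The libibverbs warnings at the top are generally harmless here: they only mean that the optional VMware paravirtual RDMA userspace provider (vmw_pvrdma) is referenced but not installed, which is irrelevant on bare-metal Mellanox nodes. If the noise is bothersome, installing the distribution's full provider package may silence it (a sketch; the Ubuntu package name is an assumption of the distro defaults):
sudo apt install ibverbs-providers    # ships the missing libvmw_pvrdma userspace provider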
Check which inter-node transport (Ethernet or IB) is actually being used:
export I_MPI_DEBUG=5
export FI_LOG_LEVEL=info
mpirun -np 2 -hosts dell1,dell2 -ppn 1 \
IMB-MPI1 PingPong 2>&1 | egrep -i "ofi|verbs|tcp|mlx|provider|fabric|fi_"
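The available libfabric providers can also be listed directly, independent of MPI (a sketch, assuming the libfabric utilities bundled with oneAPI are on PATH after setvars.sh):
fi_info -l        # short list of provider names (mlx, verbs, tcp, shm, ...)
fi_info -p mlx    # details for one provider; returns an error if it is not usable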
1. Pinning the inter-node fabric (InfiniBand)
export I_MPI_FABRICS=shm:ofi
export I_MPI_OFI_PROVIDER=mlx # or: export FI_PROVIDER=mlx
export I_MPI_DEBUG=5
mpirun -np 3 -hosts dell1,dell2,dell3 -ppn 1 IMB-MPI1 PingPong
#bytes #repetitions t[usec] Mbytes/sec
0 1000 1.52 0.00
1 1000 1.52 0.66
2 1000 1.53 1.31
4 1000 1.52 2.63
8 1000 1.54 5.18
16 1000 1.56 10.26
32 1000 1.62 19.81
64 1000 1.76 36.43
128 1000 1.97 64.92
256 1000 2.37 108.11
512 1000 2.75 185.98
1024 1000 3.40 301.11
2048 1000 5.70 359.08
4096 1000 8.93 458.53
8192 1000 13.58 603.23
16384 1000 22.10 741.48
32768 1000 23.34 1404.16
65536 640 40.15 1632.20
131072 320 76.61 1710.84
262144 160 142.78 1836.01
524288 80 274.52 1909.81
1048576 40 538.85 1945.93
2097152 20 1067.09 1965.31
4194304 10 2123.95 1974.76
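To confirm that the pinned run really moves data over the HCA, the InfiniBand port counters can be read before and after the benchmark (a sketch; the device name mlx5_0 and port 1 are assumptions, check with ibstat):
cat /sys/class/infiniband/mlx5_0/ports/1/counters/port_xmit_data   # read once before the run
# ... run the PingPong again ...
cat /sys/class/infiniband/mlx5_0/ports/1/counters/port_xmit_data   # the delta (in units of 4 bytes) should roughly match the transferred volume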
2. Pinning the inter-node fabric (Ethernet/TCP)
export I_MPI_FABRICS=shm:ofi
export I_MPI_OFI_PROVIDER=tcp # or: export FI_PROVIDER=tcp
export I_MPI_DEBUG=5
mpirun -np 3 -hosts dell1,dell2,dell3 -ppn 1 IMB-MPI1 PingPong
#bytes #repetitions t[usec] Mbytes/sec
0 1000 109.12 0.00
1 1000 108.85 0.01
2 1000 105.76 0.02
4 1000 108.21 0.04
8 1000 108.49 0.07
16 1000 108.53 0.15
32 1000 108.26 0.30
64 1000 108.68 0.59
128 1000 111.69 1.15
256 1000 110.15 2.32
512 1000 105.73 4.84
1024 1000 88.96 11.51
2048 1000 116.92 17.52
4096 1000 126.39 32.41
8192 1000 148.45 55.19
16384 1000 222.61 73.60
32768 1000 368.20 89.00
65536 640 642.99 101.92
131072 320 1198.01 109.41
262144 160 2312.40 113.36
524288 80 4545.94 115.33
1048576 40 8974.60 116.84
2097152 20 17897.31 117.18
4194304 10 35730.34 117.39
The difference in speed is obvious.
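Note that the TCP numbers depend on which NIC libfabric picks; on nodes with several Ethernet interfaces the tcp provider can be pinned to a specific one so the comparison is not skewed by a slower management network (a sketch; FI_TCP_IFACE is a libfabric tcp-provider variable, the interface name eno1 is an assumption):
export I_MPI_FABRICS=shm:ofi
export FI_PROVIDER=tcp
export FI_TCP_IFACE=eno1    # restrict the tcp provider to this NIC
mpirun -np 2 -hosts dell1,dell2 -ppn 1 IMB-MPI1 PingPong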
PingPong only exercises communication between two nodes, so to test multiple parallel connections, run the following collective benchmarks as well.
export I_MPI_FABRICS=shm:ofi
export I_MPI_OFI_PROVIDER=mlx # or: export FI_PROVIDER=mlx
export I_MPI_DEBUG=5
mpirun -np 3 -hosts dell1,dell2,dell3 -ppn 1 IMB-MPI1 Allreduce
mpirun -np 3 -hosts dell1,dell2,dell3 -ppn 1 IMB-MPI1 Bcast
mpirun -np 3 -hosts dell1,dell2,dell3 -ppn 1 IMB-MPI1 Barrier
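For larger runs, listing every node on the command line gets unwieldy; the same benchmarks can be driven from a host file instead (a sketch; the file name hosts.txt is an assumption):
cat > hosts.txt << 'EOF'
dell1
dell2
dell3
EOF
mpirun -np 3 -ppn 1 -f hosts.txt IMB-MPI1 Allreduce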