Test No.1

Intel oneAPI ships with its own MPI test program, the Intel(R) MPI Benchmarks (IMB), which can be launched with mpirun.
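
If mpirun or IMB-MPI1 is not found on a node, the oneAPI environment usually has to be sourced there first. A minimal sketch, assuming the default installation prefix /opt/intel/oneapi (adjust the path to match the actual installation):

source /opt/intel/oneapi/setvars.sh    # sets PATH and LD_LIBRARY_PATH for Intel MPI and IMB
which mpirun IMB-MPI1                  # both binaries should now resolve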

mpirun -np 3 -hosts dell1,dell2,dell3 -ppn 1 IMB-MPI1 PingPong

 

libibverbs: Warning: couldn't load driver 'libvmw_pvrdma-rdmav34.so': libvmw_pvrdma-rdmav34.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libvmw_pvrdma-rdmav34.so': libvmw_pvrdma-rdmav34.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libvmw_pvrdma-rdmav34.so': libvmw_pvrdma-rdmav34.so: cannot open shared object file: No such file or directory
#----------------------------------------------------------------
#    Intel(R) MPI Benchmarks 2021.10, MPI-1 part
#----------------------------------------------------------------
# Date                  : Thu Jan  1 23:52:32 2026
# Machine               : x86_64
# System                : Linux
# Release               : 6.8.0-90-generic
# Version               : #91-Ubuntu SMP PREEMPT_DYNAMIC Tue Nov 18 14:14:30 UTC 2025
# MPI Version           : 4.1
# MPI Thread Environment:


# Calling sequence was:

# IMB-MPI1 PingPong

# Minimum message length in bytes:   0
# Maximum message length in bytes:   4194304
#
# MPI_Datatype                   :   MPI_BYTE
# MPI_Datatype for reductions    :   MPI_FLOAT
# MPI_Op                         :   MPI_SUM
#
#

# List of Benchmarks to run:

# PingPong

#---------------------------------------------------
# Benchmarking PingPong
# #processes = 2
# ( 1 additional process waiting in MPI_Barrier)
#---------------------------------------------------
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000         1.52         0.00
            1         1000         1.52         0.66
            2         1000         1.53         1.31
            4         1000         1.53         2.62
            8         1000         1.55         5.15
           16         1000         1.55        10.33
           32         1000         1.62        19.75
           64         1000         1.77        36.18
          128         1000         1.97        65.13
          256         1000         2.36       108.40
          512         1000         2.75       186.49
         1024         1000         3.41       300.04
         2048         1000         5.69       360.00
         4096         1000         8.95       457.46
         8192         1000        13.61       601.72
        16384         1000        22.13       740.22
        32768         1000        23.39      1401.23
        65536          640        40.23      1629.19
       131072          320        76.55      1712.19
       262144          160       144.37      1815.84
       524288           80       274.70      1908.59
      1048576           40       538.78      1946.20
      2097152           20      1067.31      1964.89
      4194304           10      2123.86      1974.85


# All processes entering MPI_Finalize
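
For reference, the Mbytes/sec column is simply the message size divided by t[usec] (1 Mbyte = 10^6 bytes); for PingPong, t[usec] is half of the measured round-trip time. The last row can be reproduced with a quick check:

awk 'BEGIN { printf "%.2f\n", 4194304 / 2123.86 }'    # ~1974.85 MB/s, matching the table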

 

Check which inter-node transport (Ethernet or InfiniBand) is actually being used:

export I_MPI_DEBUG=5
export FI_LOG_LEVEL=info

mpirun -np 2 -hosts dell1,dell2 -ppn 1 \
  IMB-MPI1 PingPong 2>&1 | egrep -i "ofi|verbs|tcp|mlx|provider|fabric|fi_"
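
As a complementary check, the libfabric fi_info utility (shipped alongside Intel MPI's libfabric, if it is on the PATH) lists which providers are actually available on a node:

fi_info -l        # list available libfabric providers (mlx, verbs, tcp, shm, ...)
fi_info -p mlx    # details for the mlx provider, if it is present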

 

1. Pinning the inter-node transport (InfiniBand)
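
Before forcing the mlx provider, it is worth confirming that the RDMA link itself is up on every node. A quick sanity check, assuming the usual diagnostic tools (rdma-core / infiniband-diags) are installed:

ibv_devinfo | grep -E "hca_id|state|link_layer"    # port state should be PORT_ACTIVE
ibstat | grep -E "State|Rate"                      # link state and signalling rate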

export I_MPI_FABRICS=shm:ofi
export I_MPI_OFI_PROVIDER=mlx   # or: export FI_PROVIDER=mlx
export I_MPI_DEBUG=5

mpirun -np 3 -hosts dell1,dell2,dell3 -ppn 1 IMB-MPI1 PingPong

 

       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000         1.52         0.00
            1         1000         1.52         0.66
            2         1000         1.53         1.31
            4         1000         1.52         2.63
            8         1000         1.54         5.18
           16         1000         1.56        10.26
           32         1000         1.62        19.81
           64         1000         1.76        36.43
          128         1000         1.97        64.92
          256         1000         2.37       108.11
          512         1000         2.75       185.98
         1024         1000         3.40       301.11
         2048         1000         5.70       359.08
         4096         1000         8.93       458.53
         8192         1000        13.58       603.23
        16384         1000        22.10       741.48
        32768         1000        23.34      1404.16
        65536          640        40.15      1632.20
       131072          320        76.61      1710.84
       262144          160       142.78      1836.01
       524288           80       274.52      1909.81
      1048576           40       538.85      1945.93
      2097152           20      1067.09      1965.31
      4194304           10      2123.95      1974.76

 

2. Pinning the inter-node transport (Ethernet-TCP)

export I_MPI_FABRICS=shm:ofi
export I_MPI_OFI_PROVIDER=tcp   # or: export FI_PROVIDER=tcp
export I_MPI_DEBUG=5

mpirun -np 3 -hosts dell1,dell2,dell3 -ppn 1 IMB-MPI1 PingPong

       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000       109.12         0.00
            1         1000       108.85         0.01
            2         1000       105.76         0.02
            4         1000       108.21         0.04
            8         1000       108.49         0.07
           16         1000       108.53         0.15
           32         1000       108.26         0.30
           64         1000       108.68         0.59
          128         1000       111.69         1.15
          256         1000       110.15         2.32
          512         1000       105.73         4.84
         1024         1000        88.96        11.51
         2048         1000       116.92        17.52
         4096         1000       126.39        32.41
         8192         1000       148.45        55.19
        16384         1000       222.61        73.60
        32768         1000       368.20        89.00
        65536          640       642.99       101.92
       131072          320      1198.01       109.41
       262144          160      2312.40       113.36
       524288           80      4545.94       115.33
      1048576           40      8974.60       116.84
      2097152           20     17897.31       117.18
      4194304           10     35730.34       117.39

 

The difference in speed is clear: the InfiniBand (mlx) run peaks at about 1975 MB/s for 4 MB messages, while the TCP run levels off at roughly 117 MB/s.
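
One caveat on the TCP run: on hosts with several interfaces, the tcp provider does not necessarily pick the intended Ethernet NIC (it could even end up on IPoIB). It can be pinned explicitly; a sketch assuming the Ethernet device is named eno1 (check the real name with ip addr):

export FI_TCP_IFACE=eno1         # libfabric tcp provider: use this interface
export I_MPI_HYDRA_IFACE=eno1    # Hydra process launcher: use the same interface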

 

PingPong only exercises point-to-point communication between two processes (the third rank just waits in MPI_Barrier), so run the following collective benchmarks to exercise all nodes at once.

export I_MPI_FABRICS=shm:ofi
export I_MPI_OFI_PROVIDER=mlx   # or: export FI_PROVIDER=mlx
export I_MPI_DEBUG=5

mpirun -np 3 -hosts dell1,dell2,dell3 -ppn 1 IMB-MPI1 Allreduce
mpirun -np 3 -hosts dell1,dell2,dell3 -ppn 1 IMB-MPI1 Bcast
mpirun -np 3 -hosts dell1,dell2,dell3 -ppn 1 IMB-MPI1 Barrier
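
The three collectives can also be requested in a single run; IMB-MPI1 accepts a list of benchmark names, and the optional -npmin flag keeps it from also timing the smaller 2-process subsets:

mpirun -np 3 -hosts dell1,dell2,dell3 -ppn 1 \
  IMB-MPI1 -npmin 3 Allreduce Bcast Barrier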
