Downloading Open MPI
https://www.open-mpi.org/software/ompi/v5.0/
wget https://download.open-mpi.org/release/open-mpi/v5.0/openmpi-5.0.6.tar.gz
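After the download, the archive still has to be unpacked and a build directory created before the later steps can run. A minimal sketch, assuming the tarball unpacks into a directory named after the archive (openmpi-5.0.6 here) and that a build subdirectory is used as in the rest of this post. prepare_build_dir is a hypothetical helper name, not something shipped with Open MPI.

```shell
# Hypothetical helper: unpack a .tar.gz source archive in the current
# directory and create a build/ subdirectory inside the unpacked tree.
prepare_build_dir() {
    tarball="$1"
    # Assumption: the archive unpacks into a directory named after the tarball
    src_dir="$(basename "$tarball" .tar.gz)"
    tar xzf "$tarball" || return 1
    mkdir -p "$src_dir/build" || return 1
    # Print the build directory so the caller can cd into it
    printf '%s\n' "$src_dir/build"
}
```

Usage would look like cd "$(prepare_build_dir openmpi-5.0.6.tar.gz)", after which ../configure refers to the script in the unpacked source tree.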
For CUDA support, see section 11.2.6.1 of the Open MPI documentation:
https://docs.open-mpi.org/en/v5.0.x/tuning-apps/networking/cuda.html
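This is not a cluster setup, so follow option 2, "Via internal Open MPI CUDA support", on that page.
When building with make from a separate build directory, run the configure script that sits in the parent directory: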
../configure --with-cuda=<path-to-cuda> --with-cuda-libdir=<path-to-cuda library>
<path-to-cuda> is the CUDA installation directory.
/usr/local$ ls -l
total 36
drwxr-xr-x 2 root root 4096 Dec 30 23:37 bin
lrwxrwxrwx 1 root root 21 Dec 31 02:04 cuda -> /usr/local/cuda-12.6/
drwxr-xr-x 17 root root 4096 Dec 31 02:05 cuda-12.6
drwxr-xr-x 2 root root 4096 Dec 30 22:51 etc
drwxr-xr-x 2 root root 4096 Aug 8 2023 games
drwxr-xr-x 7 root root 4096 Dec 30 22:51 include
drwxr-xr-x 10 root root 4096 Dec 30 22:51 lib
lrwxrwxrwx 1 root root 9 Aug 22 2023 man -> share/man
drwxr-xr-x 2 root root 4096 Dec 30 22:51 sbin
drwxr-xr-x 14 root root 4096 Dec 26 23:51 share
drwxr-xr-x 2 root root 4096 Aug 8 2023 src
CUDA is usually installed under /usr/local/cuda, the conventional location on Linux. Here, cuda is a symbolic link that points to /usr/local/cuda-12.6.
<path-to-cuda library> is the directory that contains libcuda.so. To find libcuda.so, search the subdirectories of the current directory:
/usr/local$ find . -name "libcuda.so"
./cuda-12.6/targets/x86_64-linux/lib/stubs/libcuda.so
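The find search above can be wrapped so that it returns just the directory, which is the value --with-cuda-libdir expects. A minimal sketch: find_libcuda_dir is a hypothetical helper name, and the default root /usr/local/cuda is taken from this post. The -L flag is used because /usr/local/cuda is a symbolic link, and find does not descend into a symlinked starting point by default.

```shell
# Hypothetical helper: print the directory that contains libcuda.so
# under a given CUDA root (default: /usr/local/cuda).
find_libcuda_dir() {
    root="${1:-/usr/local/cuda}"
    # -L follows symlinks such as /usr/local/cuda -> /usr/local/cuda-12.6
    lib="$(find -L "$root" -name 'libcuda.so' 2>/dev/null | head -n 1)"
    if [ -z "$lib" ]; then
        return 1
    fi
    dirname "$lib"
}
```

On the machine shown in this post, find_libcuda_dir /usr/local/cuda would be expected to print the stubs directory seen in the find output, reached through the cuda symlink.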
With both paths confirmed, put together the configure command:
../configure --with-cuda=/usr/local/cuda --with-cuda-libdir=/usr/local/cuda-12.6/targets/x86_64-linux/lib/stubs/
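Because a mistyped path only shows up later in the configure output, it can help to check both directories before assembling the command. A minimal sketch: configure_cmd is a hypothetical helper name, and it only prints the command line rather than running it. The two flags are the ones used in this post.

```shell
# Hypothetical helper: validate the two CUDA paths, then print the
# configure command line used in this post.
configure_cmd() {
    cuda_home="$1"
    cuda_libdir="$2"
    if [ ! -d "$cuda_home" ]; then
        echo "no such CUDA directory: $cuda_home" >&2
        return 1
    fi
    if [ ! -e "$cuda_libdir/libcuda.so" ]; then
        echo "libcuda.so not found in: $cuda_libdir" >&2
        return 1
    fi
    printf '../configure --with-cuda=%s --with-cuda-libdir=%s\n' \
        "$cuda_home" "$cuda_libdir"
}
```

Calling it with /usr/local/cuda and the stubs directory found earlier prints the same command as above, or an error message naming the path that failed the check.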
Go to the build directory and, before running configure, undo any previous build and installation, then empty the build directory:
/openmpi-5.0.6/build$ sudo make clean
/openmpi-5.0.6/build$ sudo make uninstall
/openmpi-5.0.6/build$ rm -rf *
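Note that rm -rf * removes whatever is in the current directory, so running it in the wrong place is costly, and it leaves dotfiles behind. A minimal sketch of a guarded variant, assuming the build directory is literally named build as in this post. clean_build_dir is a hypothetical helper name. Files created under sudo may be owned by root, in which case the deletion needs sudo as well.

```shell
# Hypothetical helper: empty a directory only if it is named "build".
clean_build_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    if [ "$(basename "$dir")" != "build" ]; then
        echo "refusing to clean '$dir': not named build" >&2
        return 1
    fi
    # Delete everything inside (including dotfiles) but keep the directory itself
    find "${dir:?}" -mindepth 1 -delete
}
```

The make clean and make uninstall steps still have to run first, since they rely on the Makefile that the previous configure run generated in this directory.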
Run configure:
/openmpi-5.0.6/build$ ../configure --with-cuda=/usr/local/cuda --with-cuda-libdir=/usr/local/cuda-12.6/targets/x86_64-linux/lib/stubs/
Open MPI does not provide a CMake build; it builds with configure and make, so run make:
make -j 4
sudo make install
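The -j 4 above is a fixed job count. A minimal sketch that derives the count from the machine instead, assuming GNU coreutils' nproc is available and falling back to the 4 used in this post when it is not. build_jobs is a hypothetical helper name.

```shell
# Hypothetical helper: print a job count for "make -j".
build_jobs() {
    if command -v nproc >/dev/null 2>&1; then
        # nproc prints the number of processing units available
        nproc
    else
        # Fallback: the fixed value used in this post
        echo 4
    fi
}
```

It would be used as make -j "$(build_jobs)" in place of make -j 4.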
Additional note: use tee to keep a log of the configure and make output
shell$ tar xf openmpi-<version>.tar.bz2
shell$ cd openmpi-<version>
shell$ ./configure --prefix=<path> [...options...] 2>&1 | tee config.out
<... lots of output ...>
# Use an integer value of N for parallel builds
shell$ make [-j N] all 2>&1 | tee make.out
# ...lots of output...
# Depending on the <prefix> chosen above, you may need root access
# for the following:
shell$ make install 2>&1 | tee install.out
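One caveat with piping into tee: by default the shell reports the exit status of the last command in a pipeline, which is tee, so a failed configure or make can look successful to a script. A minimal sketch of a wrapper that keeps the log and still returns the real exit status. run_logged is a hypothetical helper name, and the status is passed through a temporary file so that the sketch does not depend on bash-only features.

```shell
# Hypothetical helper: run a command, copy its output to a log file
# with tee, and return the command's own exit status (not tee's).
run_logged() {
    log="$1"
    shift
    status_file="$(mktemp)"
    # The left side of the pipe records the command's exit status in a file
    { rc=0; "$@" 2>&1 || rc=$?; echo "$rc" > "$status_file"; } | tee "$log"
    status="$(cat "$status_file")"
    rm -f "$status_file"
    return "$status"
}
```

For example, run_logged make.out make -j 4 all && run_logged install.out sudo make install stops before the install step if the build fails.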
How to check that the installation worked
# Use ompi_info to verify cuda support in Open MPI
shell$ ompi_info | grep "MPI extensions"
MPI extensions: affinity, cuda, pcollreq
shell$ ompi_info --parsable --all | grep mpi_built_with_cuda_support:value
mca:mpi:base:param:mpi_built_with_cuda_support:value:true
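The second check above lends itself to scripting, because the parsable output is a single line with a fixed prefix. A minimal sketch that reads ompi_info output on standard input and succeeds only when the line shown above is present. cuda_support_enabled is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the parsable ompi_info output on
# stdin reports that Open MPI was built with CUDA support.
cuda_support_enabled() {
    grep -q '^mca:mpi:base:param:mpi_built_with_cuda_support:value:true$'
}
```

It would be used as ompi_info --parsable --all | cuda_support_enabled && echo "CUDA support: yes".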
Fixing a few errors
:~$ ompi_info | grep "MPI extensions"
ompi_info: error while loading shared libraries: libmpi.so.40: cannot open shared object file: No such file or directory
:~$ ompi_info
ompi_info: error while loading shared libraries: libmpi.so.40: cannot open shared object file: No such file or directory
:~$ mc
:/usr/local$ find . -name "libmpi.so.40" 2>/dev/null
./lib/libmpi.so.40
:/usr/local$ ls
bin cuda cuda-12.6 etc games include lib man sbin share src
:/usr/local$ echo 'export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH' >> ~/.bashrc
:/usr/local$ source ~/.bashrc
:/usr/local$ ompi_info | grep "MPI extensions"
MPI extensions: affinity, cuda, ftmpi, rocm, shortfloat
:/usr/local$ ompi_info --parsable --all | grep mpi_built_with_cuda_support:value
mca:mpi:base:param:mpi_built_with_cuda_support:value:true
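Two details of the ~/.bashrc fix above are worth a guard. Running the echo command more than once appends the same line repeatedly, and when LD_LIBRARY_PATH starts out empty the result ends in a colon, which the dynamic loader treats as an empty entry meaning the current directory. A minimal sketch that addresses both; append_once is a hypothetical helper name, and the ${VAR:+...} expansion only adds the colon when the variable is already set and non-empty.

```shell
# Hypothetical helper: append a line to a file unless an identical
# line is already there.
append_once() {
    line="$1"
    file="$2"
    touch "$file"
    # -x matches whole lines, -F treats the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# The line to add; single quotes keep the expansion literal until
# the file is sourced.
LIB_LINE='export LD_LIBRARY_PATH=/usr/local/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}'
```

It would be applied with append_once "$LIB_LINE" ~/.bashrc followed by source ~/.bashrc, as in the session above.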