I am getting a strange error when using Intel MPI.
After successfully installing the Intel MPI runtime and compiler, I compiled the following MPI code:
PROGRAM hello_world_mpi
include 'mpif.h'
integer process_Rank, size_Of_Cluster, ierror, tag
call MPI_INIT(ierror)
call MPI_COMM_SIZE(MPI_COMM_WORLD, size_Of_Cluster, ierror)
call MPI_COMM_RANK(MPI_COMM_WORLD, process_Rank, ierror)
print *, 'Hello World from process: ', process_Rank, 'of ', size_Of_Cluster
call MPI_FINALIZE(ierror)
END PROGRAM
I compiled it with the Intel ifort compiler using:
mpiifort hello.f90 -o hello
Then, when I run the program as the root user:
> mpirun -n 2 ./hello
Hello World from process: 1 of 2
Hello World from process: 0 of 2
Everything works correctly as root.
When I run it as any other user, I get:
> mpirun -n 2 ./hello
[1733578910.431461221] RFRLServer7:rank89.hello: Unable to create send CQ of size 5080 on mlx5_bond_0: Cannot allocate memory
[1733578910.433054057] RFRLServer7:rank89.hello: Unable to initialize verbs NIC /sys/class/infiniband/mlx5_bond_0 (unit 0:0)
RFRLServer7:rank89: PSM3 can't open nic unit: 0 (err=23)
Abort(1615247) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Unknown error class, error stack:
MPIR_Init_thread(193)........:
MPID_Init(1715)..............:
MPIDI_OFI_mpi_init_hook(1673):
create_vni_context(2242).....: OFI endpoint open failed (ofi_init.c:2242:create_vni_context:Invalid argument)
So MPI initialization fails for non-root users: the PSM3 provider cannot open the NIC, and PMPI_Init aborts.
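One thing that may be worth checking, offered as an assumption rather than a confirmed cause: the failure is "Cannot allocate memory" while creating a verbs completion queue, and it only happens for non-root users. RDMA verbs resources need pinned (locked) memory, so a low per-user locked-memory limit is a common reason for this pattern. The sketch below prints the soft and hard memlock limits of the current user so they can be compared between root and a normal user. The function name memlock_report is hypothetical and used only for illustration.

```shell
#!/bin/bash
# Sketch: report the locked-memory (memlock) limits of the current user.
# memlock_report is a hypothetical helper name, not part of Intel MPI.
memlock_report() {
    local soft hard
    soft=$(ulimit -S -l)   # soft limit, in KiB or "unlimited"
    hard=$(ulimit -H -l)   # hard limit, in KiB or "unlimited"
    echo "uid: $(id -u)"
    echo "memlock soft limit: ${soft}"
    echo "memlock hard limit: ${hard}"
    if [ "${soft}" != "unlimited" ]; then
        # Assumption: a finite limit may be too small for RDMA queue allocation.
        echo "note: soft memlock limit is finite; this may be too small for RDMA"
    fi
}

memlock_report
```

If the normal user reports a small value while root reports unlimited, raising the memlock limit for that user (typically through /etc/security/limits.conf or a file under /etc/security/limits.d/) and logging in again may be worth trying.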
The ifort version information:
ifort (IFORT) 2021.13.1 20240703
The compiler release is 2024.2.
The Intel MPI version is Intel(R) MPI Library for Linux* OS, Version 2021.14 Build 20241121 (id: e7829d6)
My CPU is an Intel(R) Xeon(R) Platinum 8360H CPU @ 3.00GHz.
The system is Rocky Linux 8.10.
Could anyone help me with this problem?