Hello,
we have access to a small cluster running CentOS 6.6, with Gfortran 4.4.7 and OpenMPI 1.8.1.
According to lscpu, the machine has 4 sockets with 4 cores per socket, 2 threads per core, and 8 NUMA nodes.
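To make the topology concrete, here is a small sketch of the arithmetic implied by those lscpu figures. The variable names are only illustrative, and the cores-per-NUMA-node value assumes the cores are spread evenly over the NUMA nodes, which I have not verified.

```python
# Topology figures as reported by lscpu on our machine.
sockets = 4
cores_per_socket = 4
threads_per_core = 2
numa_nodes = 8

# Physical cores across all sockets.
physical_cores = sockets * cores_per_socket

# Logical CPUs as seen by the OS (hyper-threading included).
logical_cpus = physical_cores * threads_per_core

# Assumption: cores are distributed evenly over the NUMA nodes.
cores_per_numa_node = physical_cores // numa_nodes

print("physical cores:", physical_cores)
print("logical CPUs:", logical_cpus)
print("cores per NUMA node (assumed even split):", cores_per_numa_node)
```

So a 12-rank run should still fit on physical cores, but it necessarily spans more than one socket and several NUMA nodes.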
I installed Telemac v7p0r0, and when running e.g. the malpasset large case on up to 8 cores, everything is fine. However, with more than 8 cores, e.g. 12, the simulation time increases considerably, so something must be wrong. Maybe it still computes on 8 cores, but on a mesh partitioned into 12 subdomains?!
Apparently it has something to do with my configuration when addressing more than one socket / node. I have to admit that the socket and NUMA concept is not very clear to me; I tried mpirun options such as --npersocket, but they didn't help.
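One thing I could still try is to let mpirun print where each rank is actually placed. The sketch below only assembles such a command line. It assumes the --map-by, --bind-to and --report-bindings options of the OpenMPI 1.8 series; the function name and ./my_mpi_program are hypothetical placeholders, not the command Telemac really issues from the systel file.

```python
import shlex


def build_mpirun_command(n_ranks, program, map_by="socket", bind_to="core"):
    """Assemble an mpirun argument list that reports rank bindings.

    Hypothetical helper for illustration only; nothing is executed here.
    """
    return [
        "mpirun",
        "-np", str(n_ranks),
        "--map-by", map_by,       # distribute ranks round-robin over sockets
        "--bind-to", bind_to,     # pin each rank to one core
        "--report-bindings",      # print the resulting placement to stderr
        program,
    ]


# Placeholder executable; the real one would come from the Telemac configuration.
cmd = build_mpirun_command(12, "./my_mpi_program")
print(" ".join(shlex.quote(arg) for arg in cmd))
```

If the reported bindings showed all 12 ranks packed onto the cores of a few sockets, or several ranks sharing one core, that would at least confirm that the placement is the problem.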
Attached is my systel file.
As far as I know, no HPC queuing system is installed on the machine.
So my basic question is: what do I need to do to address more than one node / socket?
I would be glad for any hints!
Clemens