I hope somebody will be able to help me.
I have a very simple TELEMAC-2D setup (no additional Fortran file) that works fine in both serial and parallel mode on my laptop (a Mac) and on a first cluster.
As this first cluster is quite busy, I decided to try on a second one.
Compilation works fine, but when I run the same setup, I get the following segmentation fault:
|runCode: Fail to run
|mpiexec -wdir /svub2/ogourgue/projects/fhr_14_149/test_003/test_003.cas_2014-12-10-17h10min22s -n 4 out_telemac2d
|~~~~~~~~~~~~~~~~~~
|Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
|
|Backtrace for this error:
|--------------------------------------------------------------------------
|An MPI process has executed an operation involving a call to the
|"fork()" system call to create a child process. Open MPI is currently
|operating in a condition that could result in memory corruption or
|other system errors; your MPI job may hang, crash, or produce silent
|data corruption. The use of fork() (or system() or other calls that
|create child processes) is strongly discouraged.
|
|The process that invoked fork was:
|
| Local host: nic112 (PID 28111)
| MPI_COMM_WORLD rank: 3
|
|If you are *absolutely sure* that your application will successfully
|and correctly survive a call to fork(), you may disable this warning
|by setting the mpi_warn_on_fork MCA parameter to 0.
|--------------------------------------------------------------------------
|#0 0x2AAAAACE62F7
|#1 0x2AAAAACE68FE
|#2 0x2AAAAC92F99F
|#3 0x5BC64D in ov_
|#4 0x5BD026 in bief_allvec_
|#5 0x5BDAEC in bief_allvec_in_block_
|#6 0x428A75 in point_telemac2d_
|#7 0x504E30 in MAIN__ at homere_telemac2d.f:0
|
|Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
|
|Backtrace for this error:
|
|Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
|
|Backtrace for this error:
|#0 0x2AAAAACE62F7
|#1 0x2AAAAACE68FE
|#2 0x2AAAAC92F99F
|#3 0x5BC64D in ov_
|#4 0x5BD026 in bief_allvec_
|#5 0x5BDAEC in bief_allvec_in_block_
|#6 0x428A75 in point_telemac2d_
|#7 0x504E30 in MAIN__ at homere_telemac2d.f:0
|
|Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
|
|Backtrace for this error:
|#0 0x2AAAAACE62F7
|#1 0x2AAAAACE68FE
|#2 0x2AAAAC92F99F
|#3 0x5BC64D in ov_
|#4 0x5BD026 in bief_allvec_
|#5 0x5BDAEC in bief_allvec_in_block_
|#6 0x428A75 in point_telemac2d_
|#7 0x504E30 in MAIN__ at homere_telemac2d.f:0
|#0 0x2AAAAACE62F7
|#1 0x2AAAAACE68FE
|#2 0x2AAAAC92F99F
|#3 0x5BC64D in ov_
|#4 0x5BD026 in bief_allvec_
|#5 0x5BDAEC in bief_allvec_in_block_
|#6 0x428A75 in point_telemac2d_
|#7 0x504E30 in MAIN__ at homere_telemac2d.f:0
|--------------------------------------------------------------------------
|mpiexec noticed that process rank 3 with PID 28111 on node nic112 exited on signal 11 (Segmentation fault).
|--------------------------------------------------------------------------
|[nic112:28106] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork
|[nic112:28106] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
|~~~~~~~~~~~~~~~~~~
As suggested in the error message, I tried running with -mca mpi_warn_on_fork 0 on the command line. The fork warning disappeared, but I still got a segmentation fault, this time with no additional information at all.
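For clarity, here is a sketch of what that modified launch looks like. It assumes the flag goes directly after mpiexec and that everything else is unchanged from the failing command above; WORKDIR is a placeholder standing in for the real run directory, not an actual path. The snippet only builds and prints the command so it can be checked before running.

```shell
# Sketch only: WORKDIR is a hypothetical placeholder for the -wdir value
# shown in the failing command above.
WORKDIR="/path/to/run/directory"

# Same launch as before, with the Open MPI fork warning switched off.
# Placing -mca right after mpiexec is an assumption about the ordering.
CMD="mpiexec -mca mpi_warn_on_fork 0 -wdir $WORKDIR -n 4 out_telemac2d"

# Print the command for inspection instead of executing it.
echo "$CMD"
```

Note that this parameter only silences the warning about fork(); it is not expected to change the behaviour of the program itself.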
If anyone has an idea of how to solve this problem, I would really appreciate it.