TOPIC: segmentation fault due to "fork()" system call

segmentation fault due to "fork()" system call 9 years 11 months ago #15168

  • o.gourgue
I hope somebody will be able to help me.

I have a very simple telemac2d setup (no additional Fortran file) that works fine in both serial and parallel mode on my laptop (Mac) and on a first cluster.

As this first cluster is quite busy, I decided to try on a second one.

Compilation works fine, but when I run the same setup, I get the following segmentation fault:

   |runCode: Fail to run
   |mpiexec -wdir /svub2/ogourgue/projects/fhr_14_149/test_003/test_003.cas_2014-12-10-17h10min22s -n 4 out_telemac2d
   |~~~~~~~~~~~~~~~~~~
   |Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
   |
   |Backtrace for this error:
   |--------------------------------------------------------------------------
   |An MPI process has executed an operation involving a call to the
   |"fork()" system call to create a child process.  Open MPI is currently
   |operating in a condition that could result in memory corruption or
   |other system errors; your MPI job may hang, crash, or produce silent
   |data corruption.  The use of fork() (or system() or other calls that
   |create child processes) is strongly discouraged.  
   |
   |The process that invoked fork was:
   |
   |  Local host:          nic112 (PID 28111)
   |  MPI_COMM_WORLD rank: 3
   |
   |If you are *absolutely sure* that your application will successfully
   |and correctly survive a call to fork(), you may disable this warning
   |by setting the mpi_warn_on_fork MCA parameter to 0.
   |--------------------------------------------------------------------------
   |#0  0x2AAAAACE62F7
   |#1  0x2AAAAACE68FE
   |#2  0x2AAAAC92F99F
   |#3  0x5BC64D in ov_
   |#4  0x5BD026 in bief_allvec_
   |#5  0x5BDAEC in bief_allvec_in_block_
   |#6  0x428A75 in point_telemac2d_
   |#7  0x504E30 in MAIN__ at homere_telemac2d.f:0
   |
   |Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
   |
   |Backtrace for this error:
   |
   |Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
   |
   |Backtrace for this error:
   |#0  0x2AAAAACE62F7
   |#1  0x2AAAAACE68FE
   |#2  0x2AAAAC92F99F
   |#3  0x5BC64D in ov_
   |#4  0x5BD026 in bief_allvec_
   |#5  0x5BDAEC in bief_allvec_in_block_
   |#6  0x428A75 in point_telemac2d_
   |#7  0x504E30 in MAIN__ at homere_telemac2d.f:0
   |
   |Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
   |
   |Backtrace for this error:
   |#0  0x2AAAAACE62F7
   |#1  0x2AAAAACE68FE
   |#2  0x2AAAAC92F99F
   |#3  0x5BC64D in ov_
   |#4  0x5BD026 in bief_allvec_
   |#5  0x5BDAEC in bief_allvec_in_block_
   |#6  0x428A75 in point_telemac2d_
   |#7  0x504E30 in MAIN__ at homere_telemac2d.f:0
   |#0  0x2AAAAACE62F7
   |#1  0x2AAAAACE68FE
   |#2  0x2AAAAC92F99F
   |#3  0x5BC64D in ov_
   |#4  0x5BD026 in bief_allvec_
   |#5  0x5BDAEC in bief_allvec_in_block_
   |#6  0x428A75 in point_telemac2d_
   |#7  0x504E30 in MAIN__ at homere_telemac2d.f:0
   |--------------------------------------------------------------------------
   |mpiexec noticed that process rank 3 with PID 28111 on node nic112 exited on signal 11 (Segmentation fault).
   |--------------------------------------------------------------------------
   |[nic112:28106] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork
   |[nic112:28106] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
   |~~~~~~~~~~~~~~~~~~


As suggested in the error message, I tried running with -mca mpi_warn_on_fork 0 on the command line. I then got another segmentation fault, this time with no additional information.

If someone has an idea of how to solve this problem, I would really appreciate it.

segmentation fault due to "fork()" system call 9 years 11 months ago #15170

  • jmhervouet
Hello,

It crashes at an early stage, during the allocation and initialisation of a vector. If it works on another computer, an error in the program is unlikely. Could it be a lack of memory?

Regards,

JMH

segmentation fault due to "fork()" system call 9 years 11 months ago #15253

  • o.gourgue
It was indeed a memory problem: the default maximum allowed memory is very low on that cluster. Thank you, Jean-Michel.
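
For anyone hitting the same backtrace, here is a minimal sketch for inspecting the per-process memory limits before launching the run. It is not the fix that was applied on this cluster: the thread does not say which limit was too low, so the three limits listed here are an assumption, and the helper names `format_limit` and `report_limits` are made up for illustration. It only uses Python's standard `resource` module (Unix only).

```python
import resource

# Limits that often matter for a large Fortran/MPI executable.
# Which one was too low in this thread is not stated; this selection is an assumption.
LIMITS = {
    "address space (ulimit -v)": resource.RLIMIT_AS,
    "data segment (ulimit -d)": resource.RLIMIT_DATA,
    "stack (ulimit -s)": resource.RLIMIT_STACK,
}


def format_limit(value):
    """Return a limit given in bytes as a readable string."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 1024**2:.0f} MiB"


def report_limits():
    """Map each label to its (soft, hard) limit as seen by this process."""
    report = {}
    for label, which in LIMITS.items():
        soft, hard = resource.getrlimit(which)
        report[label] = (format_limit(soft), format_limit(hard))
    return report


if __name__ == "__main__":
    for label, (soft, hard) in report_limits().items():
        print(f"{label}: soft={soft}, hard={hard}")
```

The limits inside a batch job may differ from those of a login shell, so it is probably more informative to run this from the job script itself. How to raise a limit depends on the cluster's scheduler and policy, which is a question for the local administrators.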