TOPIC: Bus error

Bus error 2 years 11 months ago #39429

Dear all,

I am facing the following error. It is not always reproducible, and it is hard to tell exactly what is going wrong. Maybe you can help me out:

mpirun noticed that process rank 55 with PID 17594 on node r27i27n05 exited on signal 7 (Bus error).
_____________
runcode::main:
:
|runCode: Fail to run
|mpirun --hostfile $PBS_NODEFILE -np 252 /xx/xx/xx/xx/xx/T2d_testing02.cas_2021-11-29-22h52min25s/out_ateeq
|~~~~~~~~~~~~~~~~~~
|Program received signal SIGBUS: Access to an undefined portion of a memory object.
|
|Backtrace for this error:
|#0 0x2b3dec4243ff in ???
|#1 0x402270 in ???
|#2 0x5bba83 in ???
|#3 0x692f60 in ???
|#4 0x621404 in ???
|#5 0x57c054 in ???
|#6 0x56359b in ???
|#7 0x40c58b in ???
|#8 0x42a8a6 in ???
|#9 0x432c21 in ???
|#10 0x402732 in ???
|#11 0x2b3dec410554 in ???
|#12 0x40275b in ???
|#13 0xffffffffffffffff in ???


SIGBUS errors are usually quite rare on 'new' (x86) architectures, which is why I currently suspect an issue in the software itself. Does anyone have an idea of the possible causes and solutions?

I am running the simulations with v7p2v0 on 7 (Intel Skylake) nodes.


Thanking you in advance for the support!

Ateeq

Bus error 2 years 11 months ago #39435

Surprisingly, reducing the run from 7x36 to 5x36 processes (mesh: 588551 nodes, 1174697 elements) solves the problem for the moment. Maybe someone can give me more insight on this topic.

Thanking you!

Bus error 2 years 11 months ago #39438

  • pham
  • Administrator
Hello,

You seem to have a memory issue.

As often said, you should use the latest release, as there is no assistance for old releases. Some bugs have been fixed and new features have appeared since v7p2!

Anyway, for such issues, you should use a debug configuration with debug options to investigate.

See e.g. the S9.gfortran.debug configuration in the $HOMETEL/configs/systel.edf.cfg configuration file, in particular the fflags_debug_gfo flag for the gfortran compiler:
fflags_debug_gfo: -g -Wall -fcheck=all -fbacktrace -fbounds-check -finit-integer=-1 -finit-real=nan -ffpe-trap=invalid,zero,overflow
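
Once the executable has been rebuilt with -g, the "???" frames in a backtrace like the one above can usually be resolved to function names and source lines. Below is a minimal sketch, assuming binutils' addr2line is installed; the helper names and the "./out_ateeq" path are only placeholders for illustration. It extracts the frame addresses from the backtrace text and builds the corresponding addr2line command:

```python
import re
import shlex

# Matches gfortran backtrace frames such as "|#1 0x402270 in ???".
FRAME_RE = re.compile(r"#(\d+)\s+(0x[0-9a-fA-F]+)\s+in\s+\?\?\?")


def frame_addresses(backtrace: str) -> list[str]:
    """Return the frame addresses in order, skipping the all-f end marker."""
    addrs = []
    for _, addr in FRAME_RE.findall(backtrace):
        if set(addr[2:].lower()) == {"f"}:
            continue  # 0xffffffffffffffff is not a real code address
        addrs.append(addr)
    return addrs


def addr2line_command(executable: str, addrs: list[str]) -> str:
    """Build an addr2line call: -f function names, -C demangle, -p one line per frame."""
    return shlex.join(["addr2line", "-f", "-C", "-p", "-e", executable, *addrs])


# Shortened copy of the backtrace from the first post.
SAMPLE = """\
|#0 0x2b3dec4243ff in ???
|#1 0x402270 in ???
|#2 0x5bba83 in ???
|#13 0xffffffffffffffff in ???
"""

if __name__ == "__main__":
    # "./out_ateeq" is a placeholder path to the debug build of the executable.
    print(addr2line_command("./out_ateeq", frame_addresses(SAMPLE)))
```

Addresses that belong to shared libraries (such as the 0x2b3d... frames) will most likely still be reported as unknown when looked up against the executable alone; the low 0x4.../0x5... addresses are the interesting ones here.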

Hope this helps,

Chi-Tuan