TOPIC: Problem with PARTEL (Linux cluster python v7p0)

Problem with PARTEL (Linux cluster python v7p0) 9 years 9 months ago #15713

  • yugi
The error seems to come from your MPI installation.
You should try reinstalling it or using a different one.
Did you recompile with the option --clean?
There are 10 types of people in the world: those who understand binary, and those who don't.

Problem with PARTEL (Linux cluster python v7p0) 9 years 9 months ago #15807

  • julesleguern
Hello Yugi,

I changed the MPI environment, so now the simulation starts, but I get another error.

File attachment: run_telemac.txt (64 KB)

It seems to be a memory problem, but I have never seen this error before. Telemac produced a result file, but when I try to read it with Blue Kenue, I can't animate it: there is only the first time step.

Note that I prepare all the files (.slf, .cas, .cli) on a Windows 7 computer, then transfer them to the Linux cluster with SSH Secure File Transfer.

Here are my new script.sh to launch the job and my latest .cfg file:


File attachment: systel.cfg (1 KB)

File attachment: runmod_telemac_parallele.txt (2 KB)

Thanks for your help.

Jules

Problem with PARTEL (Linux cluster python v7p0) 9 years 9 months ago #15808

  • sebourban
Hello,

Everything seems OK as far as I can see.

Could this be linked to the problem reported recently on this forum about GRETEL and a missing end line in the input file?

Also, can you use mpiexec instead of mpirun?

Best regards,
Sébastien.
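
Two text-file issues could plausibly apply here: the missing end line mentioned above, and Windows (CRLF) line endings in files prepared on Windows and copied to Linux. The thread confirms neither. A minimal sketch of a check for both, assuming the files are plain text; the function names `inspect_text_file` and `fix_text_file` are hypothetical and not part of TELEMAC:

```python
from pathlib import Path


def inspect_text_file(path):
    """Report line-ending issues in a text input file.

    Returns a dict with:
      has_crlf          -- True if any Windows (CRLF) line ending is present
      ends_with_newline -- True if the last byte is a line feed
    """
    data = Path(path).read_bytes()
    return {
        "has_crlf": b"\r\n" in data,
        "ends_with_newline": data.endswith(b"\n"),
    }


def fix_text_file(path):
    """Rewrite the file with Unix line endings and a final newline."""
    data = Path(path).read_bytes().replace(b"\r\n", b"\n")
    if data and not data.endswith(b"\n"):
        data += b"\n"
    Path(path).write_bytes(data)
```

This only makes sense for text files such as the .cas and .cli files. The .slf file is binary and must not be rewritten this way.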

Problem with PARTEL (Linux cluster python v7p0) 9 years 9 months ago #15814

  • julesleguern
Hello,

I tried to merge the result files manually with gretel_autop, but the result file is always the same. It is very strange to have only the initial conditions in this file.
I'm not sure I found the right post about the GRETEL problem:

www.opentelemac.org/index.php/kunena/12-...etel-and-stdin#15791

If not, can you send me the link?

The error seems to be here but I don't know what it means.


USING STREAMLINE VERSION 7.0 FOR CHARACTERISTICS
PARALLEL::ORG_CHARAC_TYPE1:
MEMORY PROBLEM WITH THIS COMPILER:
ILB= 0 NOT EQUAL TO CH_DELTA(1)= 0
OR
IUB= 1275070495 NOT EQUAL TO CH_DELTA(18)= 176
rank 8 in job 1 comcluster13.cluster_47350 caused collective abort of all ranks
exit status of rank 8: killed by signal 9
rank 7 in job 1 comcluster13.cluster_47350 caused collective abort of all ranks
exit status of rank 7: killed by signal 9
rank 2 in job 1 comcluster13.cluster_47350 caused collective abort of all ranks
exit status of rank 2: killed by signal 9
rank 0 in job 1 comcluster13.cluster_47350 caused collective abort of all ranks
exit status of rank 0: return code 0


Best regards.

Jules
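
In the log excerpt above, signal 9 is SIGKILL. The killed ranks were probably terminated by the launcher after one rank stopped, so they are likely a consequence of the "MEMORY PROBLEM" message rather than a separate fault. A minimal sketch for summarising such exit lines from a run log, assuming the line format shown above; the function name `summarize_exit_statuses` is hypothetical:

```python
import re
from collections import defaultdict

# Matches lines such as "exit status of rank 8: killed by signal 9"
EXIT_RE = re.compile(r"exit status of rank (\d+): (.+)")


def summarize_exit_statuses(log_text):
    """Group MPI ranks by the exit status reported in the log text."""
    by_status = defaultdict(list)
    for line in log_text.splitlines():
        match = EXIT_RE.search(line)
        if match:
            rank, status = int(match.group(1)), match.group(2).strip()
            by_status[status].append(rank)
    return {status: sorted(ranks) for status, ranks in by_status.items()}


# Exit lines copied from the log excerpt in this post
sample = """\
exit status of rank 8: killed by signal 9
exit status of rank 7: killed by signal 9
exit status of rank 2: killed by signal 9
exit status of rank 0: return code 0
"""
print(summarize_exit_statuses(sample))
# {'killed by signal 9': [2, 7, 8], 'return code 0': [0]}
```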

Problem with PARTEL (Linux cluster python v7p0) 9 years 9 months ago #15815

  • julesleguern
Hello,

I found where this error comes from: the subroutine org_charac_type1.F, called by streamline. Apparently it comes from MPI:

      CALL P_MPI_TYPE_CREATE_STRUCT(18,CH_BLENGTH,CH_DELTA,CH_TYPES,
     &                              CHARACTERISTIC,IER)
      CALL P_MPI_TYPE_COMMIT(CHARACTERISTIC,IER)
      CALL P_MPI_TYPE_GET_EXTENT(CHARACTERISTIC,ILB,EXTENT,IER)
      IUB=ILB+EXTENT
!
      IF(ILB.NE.CH_DELTA(1).OR.IUB.NE.CH_DELTA(18)) THEN
        WRITE(LU,*) ' PARALLEL::ORG_CHARAC_TYPE1:'
        WRITE(LU,*) ' MEMORY PROBLEM WITH THIS COMPILER: '
        WRITE(LU,*) ' ILB=',ILB,' NOT EQUAL TO CH_DELTA(1)=',
     &              CH_DELTA(1)
        WRITE(LU,*) ' OR'
        WRITE(LU,*) ' IUB=',IUB,' NOT EQUAL TO CH_DELTA(18)=',
     &              CH_DELTA(18)
!       CALL PLANTE(1)
        STOP
      ENDIF


Any idea?

Thanks

Jules
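
The Fortran test above compares the bounds that MPI reports for the derived type with the displacements stored in CH_DELTA. In the earlier log, IUB is 1275070495 while CH_DELTA(18) is 176, so the reported extent is nowhere near a plausible byte count. A minimal Python sketch (not TELEMAC code) that mirrors the comparison and prints the value in hexadecimal; the function name `check_extent` is hypothetical. The hexadecimal form resembles the 0x4C00xxxx handle pattern that MPICH-family libraries use for predefined datatypes. That could point to mismatched argument types, or to MPI headers at compile time that do not match the library at run time, but this is an assumption the thread does not confirm.

```python
def check_extent(ilb, extent, ch_delta):
    """Mirror the Fortran test: return (ok, iub) for the given values.

    ch_delta is the list of displacements; ch_delta[0] and ch_delta[-1]
    play the role of CH_DELTA(1) and CH_DELTA(18).
    """
    iub = ilb + extent
    ok = (ilb == ch_delta[0]) and (iub == ch_delta[-1])
    return ok, iub


# Values from the log: ILB=0, IUB=1275070495, CH_DELTA(1)=0, CH_DELTA(18)=176.
# CH_DELTA(2..17) are not shown in the thread, so None placeholders stand in.
ch_delta = [0] + [None] * 16 + [176]
ok, iub = check_extent(0, 1275070495, ch_delta)
print(ok, iub, hex(iub))  # False 1275070495 0x4c00081f
```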

Problem with PARTEL (Linux cluster python v7p0) 9 years 8 months ago #15917

  • julesleguern
Hello,

I tried the mpiexec command, but it failed.

Does someone have an idea? I am really stuck.

Thanks.

Jules

Problem with PARTEL (Linux cluster python v7p0) 7 years 7 months ago #25828

Hi,
Could you help me, please? The attached picture shows a problem I get when I run telemac2d. I don't understand PARTEL and MPI very well, or how to install them.

Thanks.

Problem with PARTEL (Linux cluster python v7p0) 7 years 7 months ago #25830

  • yugi
Hi,

You should have a look at the file partel_T2DGEO.log in the working folder; it should contain the error message.

Hope it helps
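
A minimal sketch for printing the end of such log files, where the error message usually appears. The glob pattern `partel_*.log` is an assumption generalised from the file name partel_T2DGEO.log, and the function name `tail_partel_logs` and the folder path in the usage lines are hypothetical:

```python
from pathlib import Path


def tail_partel_logs(work_dir, n_lines=20):
    """Return {log file name: last n_lines lines} for partel_*.log files."""
    tails = {}
    for log_path in sorted(Path(work_dir).glob("partel_*.log")):
        lines = log_path.read_text(errors="replace").splitlines()
        tails[log_path.name] = lines[-n_lines:]
    return tails


if __name__ == "__main__":
    # Replace with the actual working folder of the run (hypothetical path)
    for name, lines in tail_partel_logs("path/to/working_folder").items():
        print(f"--- {name} ---")
        print("\n".join(lines))
```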