TOPIC: cannot split the cas file in parallel with proc>1

cannot split the cas file in parallel with proc>1 10 years 10 months ago #11624

  • 716469
Dear Users,

I hope somebody can advise me on my problem. I run telemac3d in parallel in four steps: first splitting the case, then compiling, then running it on the queue, and finally merging. I can split the case only with PARALLEL PROCESSORS=1. If I set it to more than 1, I get an error: sh: line 1: segmentation fault ....
If I split it with 1 processor and then specify more than one node and processor in my PBS script (say 8 and 12), it runs, and the queue shows that it uses 8 nodes (96 cores), but in reality it is still using only one processor. I thought it was working fine, but when I realized that it did not get any faster, I started to suspect that it does not take the number of nodes from the PBS script and is still working on one processor. I have tried the HPC script with the same result. So my problem is splitting the case with PARALLEL PROCESSORS set to more than 1. I really need to move to many nodes, as I need to run for long time periods. I hope somebody who has had a similar problem and knows how to fix it can advise me. Thanks a mil.

Kind Regards!

Violeta
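
The mismatch described in the post above (a case split for 1 processor but a PBS job reserving 8 nodes with 12 cores each) can be spotted before submitting the job. The following is only a minimal sketch, not part of TELEMAC: the function names are hypothetical, it assumes the steering file contains a line of the form `PARALLEL PROCESSORS = N`, and it assumes the PBS script requests resources with a Torque-style `#PBS -l nodes=X:ppn=Y` directive. Other schedulers or directive styles would need a different pattern.

```python
import re

def parallel_processors(cas_text):
    """Return the PARALLEL PROCESSORS value from steering-file text, or None if absent."""
    match = re.search(r"^\s*PARALLEL PROCESSORS\s*[:=]\s*(\d+)",
                      cas_text, re.MULTILINE | re.IGNORECASE)
    return int(match.group(1)) if match else None

def pbs_cores(pbs_text):
    """Return nodes * ppn from a '#PBS -l nodes=X:ppn=Y' line, or None if absent."""
    match = re.search(r"^#PBS\s+-l\s+nodes=(\d+):ppn=(\d+)", pbs_text, re.MULTILINE)
    return int(match.group(1)) * int(match.group(2)) if match else None

def is_consistent(cas_text, pbs_text):
    """True only when both values are found and the split count equals the reserved cores."""
    wanted = parallel_processors(cas_text)
    reserved = pbs_cores(pbs_text)
    return wanted is not None and wanted == reserved

# Illustrative inputs matching the numbers quoted in the post (assumed file contents).
cas_example = "TITLE = 'example case'\nPARALLEL PROCESSORS = 1\n"
pbs_example = "#!/bin/bash\n#PBS -l nodes=8:ppn=12\n"

if not is_consistent(cas_example, pbs_example):
    print("Mismatch: case split for", parallel_processors(cas_example),
          "processor(s), PBS reserves", pbs_cores(pbs_example), "core(s)")
```

With these inputs the check reports a mismatch: the queue would show 96 reserved cores while the case itself was partitioned for a single processor, which is consistent with the run not getting any faster.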

cannot split the cas file in parallel with proc>1 10 years 10 months ago #11625

  • jmhervouet
Hello,

Assuming that everything is properly installed, you run in parallel simply by changing the number of processors in your parameter file (plus an extra file with Perl). So I do not understand why your description has several steps: splitting the case as a first step, then running it (I know this is possible, but I have never tried it and do not know how to do it). Is this really what you want to do?

Regards,

Jean-Michel Hervouet

cannot split the cas file in parallel with proc>1 10 years 10 months ago #11626

  • 716469
Hello Jean-Michel,

Unfortunately, I cannot run in parallel on our cluster without splitting the case. I compiled the whole program, but I could not compile the specific case I run (it gave me a "gfortran not found" error every time), so only the splitting option worked for me. The Telemac IT team very kindly helped me sort out my cluster problem, as it took me nearly one month after many failures and I created many posts about this. We got it working by splitting the case and running only the output file on the cluster. So basically I had to compile the case locally (as I would do for scalar mode) and then run the rest. It is probably not the usual procedure for running in parallel, but otherwise it does not work on the cluster for me. So everything is fine if I run on one processor. I could probably run on multiple processors if I could split the case with, say, PARALLEL PROCESSORS=2 or more, but at the moment it does not work with more than 1. I would be happy to try any suggestions on how to get around this. Thanks a mil.

Kind Regards!

Violeta