
TOPIC: Problems while running telemac in parallel

Problems while running telemac in parallel 13 years 4 weeks ago #2887

  • qilong
Hello,

I am trying to run telemac2d in parallel on the validation case "019_dambreak". The whole telemac program compiled successfully with the mpich2 library. I first ran "019_dambreak" in non-parallel mode; it worked well and generated results. When I then ran it in parallel mode, it produced no results, only errors.

The configuration files and the steering file are attached.
What is wrong with my configuration, and how do I run the case in parallel correctly? Thank you very much!
The administrator has disabled public write access.

Re: Problems while running telemac in parallel 13 years 4 weeks ago #2893

  • jmhervouet
Hello,

The problem is that you did not compile the default parallel executable. Now that you have compiled everything else, you can do so by running the command:

maktel parallel

in the telemac-2d sources directory.

Another solution is to use a FORTRAN FILE containing any subroutine, so that the run generates its own executable instead of using the default one.

With best regards,

Jean-Michel Hervouet

Re: Problems while running telemac in parallel 13 years 4 weeks ago #2899

  • qilong
Thanks for your help. However, I had already used "makepar90" to compile the whole telemac program. I checked the folders, and there is an executable file "telemac2dv6p1_MP.exe" in telemac2d's folder. The error message, though, shows the launcher looking for a different executable file:

"MPI launcher : /usr/local/mpich2/bin/mpirun -machinefile mpirun.txt -np 2 out2174_intel_64_12.exe"

That is not the file I have.

Re: Problems while running telemac in parallel 13 years 4 weeks ago #2900

  • c.coulet
Hi
I suppose that you have a t2d_dambreak_intel_64_12_MP_v6p1.exe in your directory, as written in your txt file.
During the launch, this file should be copied into the TMP directory, and the copy is renamed according to a PID number.
Please try running the same simulation with the -t option:
telemac2d -t t2d_dambreak_v2p0.cas
After the failure, you can then check in the tmp directory that the exe file was copied correctly and under the right name.
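The check described here can be sketched in shell. This is only an illustration: the function name list_tmp_exes is hypothetical, and the patterns "*_tmp" and "out*.exe" are assumptions based on the directory and file names in the launcher messages quoted in this thread.

```shell
#!/bin/sh
# Sketch: list the executables left in TELEMAC temporary directories
# after a run with the -t option.
# Assumption: tmp directories end in "_tmp" and the copied executable
# matches "out*.exe", as in the messages quoted in this thread.
list_tmp_exes() {
    case_dir=${1:-.}
    found=0
    for exe in "$case_dir"/*_tmp/out*.exe; do
        # An unmatched glob stays literal, so test that the file exists.
        [ -e "$exe" ] || continue
        found=1
        ls -l "$exe"    # the mode column shows whether the x bit is set
    done
    [ "$found" -eq 1 ] || echo "no out*.exe found under $case_dir/*_tmp"
}

list_tmp_exes "${1:-.}"
```

Run it from the case directory (or pass the case directory as the first argument) once the failed run has finished.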

Good luck
Christophe

Re: Problems while running telemac in parallel 13 years 4 weeks ago #2902

  • qilong
Thanks. I have "t2d_dambreak_intel_64_12_MP_v6p1.exe" in the directory. I ran the same case with "-t" option but it gave the same error:

_____________________________________________________________________
*** RUNNING ***

MPI launcher : /usr/local/mpich2/bin/mpirun -machinefile mpirun.txt -np 2 out12539_intel_64_12.exe
[proxy:0:0@ubuntu] HYDU_create_process (./utils/launch/launch.c:69): [proxy:0:0@ubuntu] HYDU_create_process (./utils/launch/launch.c:69): execvp error on file out12539_intel_64_12.exe (No such file or directory)
execvp error on file out12539_intel_64_12.exe (No such file or directory)
## Erreur : Fin anormale : cd /home/bql/Desktop/019_dambreak/t2d_dambreak_v2p0.cas12539_tmp; /usr/local/mpich2/bin/mpirun -machinefile mpirun.txt -np 2 out12539_intel_64_12.exe :65280
## Error : System command failed for /home/bql/Desktop/019_dambreak/t2d_dambreak_v2p0.cas12539_tmp/telemac2d.bat :65280
________________________________________________________
Execution finished: telemac2d.bat
________________________________________________________
No compilation/linking/file errors detected.
Execution errors detected.
Please see messages in stdout above or study stderr output.

I checked the tmp directory. There is indeed an exe file called "out12539_intel_64_12.exe", which is the same name as the one in the error message.
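A "No such file or directory" error for a file that is present can have more than one cause. One general fact is that execvp searches only the PATH when the name it is given contains no slash, so a bare file name in the current directory is found only if that directory is on the PATH. Whether that is the cause here is not established in this thread; the sketch below simply separates the possibilities. The helper name diagnose_exe is hypothetical and is not part of TELEMAC or MPICH2.

```shell
#!/bin/sh
# Sketch: report why launching a file by its bare name might fail.
# diagnose_exe is a hypothetical helper, not part of TELEMAC or MPICH2.
diagnose_exe() {
    exe=$1
    if [ ! -e "$exe" ]; then
        echo "missing: $exe does not exist in $(pwd)"
    elif [ ! -x "$exe" ]; then
        echo "not executable: try chmod a+x $exe"
    elif ! command -v "$exe" >/dev/null 2>&1; then
        # execvp looks up a name without a slash on PATH only.
        echo "not on PATH: launch as ./$exe or add the directory to PATH"
    else
        echo "ok: $exe exists, is executable, and resolves on PATH"
    fi
}

if [ $# -gt 0 ]; then
    diagnose_exe "$1"
fi
```

To use it, change into the tmp directory kept by the -t option and pass the executable name exactly as it appears on the "MPI launcher" line.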

Re: Problems while running telemac in parallel 13 years 4 weeks ago #2907

  • c.coulet
Hi
I suppose this is an MPI configuration problem.
Maybe you could try other options (-localhost) to see whether the result is the same.
Could you also attach the mpirun.txt file so that we can see its contents?

Hope this helps
Christophe

Re: Problems while running telemac in parallel 13 years 4 weeks ago #2914

  • qilong
Good Morning,

I tried the options "-localhost" and "-localonly". Neither was recognized by MPI. I have MPICH2 1.4.1p1 installed on my computer, and my host name is ubuntu. The mpirun.txt is attached. Do you have any examples of how to set up and run telemac in parallel?
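One way to separate an MPI installation problem from a TELEMAC problem is to ask the launcher to start a trivial system command. The sketch below does that with hostname; the helper name mpi_smoke_test is hypothetical, and the launcher path has to be supplied by the user (this thread uses /usr/local/mpich2/bin/mpirun). It does not test the machinefile or the TELEMAC executable.

```shell
#!/bin/sh
# Sketch: check that the MPI launcher can start a trivial command
# before involving TELEMAC. mpi_smoke_test is a hypothetical helper.
mpi_smoke_test() {
    launcher=${1:-mpirun}
    nproc=${2:-2}
    if ! command -v "$launcher" >/dev/null 2>&1; then
        echo "launcher not found: $launcher"
        return 2
    fi
    # hostname is a plain system command, so no MPI program is needed.
    if "$launcher" -np "$nproc" hostname; then
        echo "launcher ok"
    else
        echo "launcher failed"
        return 1
    fi
}
```

For example, mpi_smoke_test /usr/local/mpich2/bin/mpirun 2 should print the host name once per process, followed by "launcher ok", if the launcher itself is working.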

Thanks!

Re: Problems while running telemac in parallel 13 years 4 weeks ago #2916

  • jmhervouet
Hello,

You can look at the test cases provided, for example those named gouttedo and malpasset in French, but there are many others.

JMH

Re: Problems while running telemac in parallel 13 years 4 weeks ago #2919

  • qilong
Hello,

I tried the two parallel cases you mentioned above. However, both gave the same error. This is my mpi_telemac.conf:

# Configuration for MPI
#
#
# Number of processors :
2
#
# For each host :
# hostname number_of_processors_on_the_host
#
ubuntu 2

Is it correct?
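A quick consistency check on a file with this layout can be sketched in shell with awk. It assumes, from the comments in the file itself, that the first non-comment line is the total number of processors and that each later line is "hostname count". Whether TELEMAC requires the total to equal the sum of the per-host counts is an assumption, and the function name check_mpi_conf is hypothetical.

```shell
#!/bin/sh
# Sketch: compare the declared processor count in an mpi_telemac.conf-style
# file with the sum of the per-host counts.
# Assumption: the first non-comment line is the total and each later line
# is "hostname count", as the comments in the file describe.
check_mpi_conf() {
    awk '
        /^[ \t]*#/ || NF == 0 { next }    # skip comments and blanks
        !seen_total { total = $1 + 0; seen_total = 1; next }
        NF >= 2 { sum += $2; hosts++ }
        END {
            printf "declared=%d hosts=%d sum=%d\n", total, hosts, sum
            exit (total == sum ? 0 : 1)
        }
    ' "$1"
}
```

Called as check_mpi_conf mpi_telemac.conf on the file quoted in this post, it prints "declared=2 hosts=1 sum=2" and returns success, so under these assumptions the file is at least self-consistent.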
I also noticed that one of the errors is

"## Error : System command failed for /home/bql/Desktop/025_gouttedo/t2d_gouttedo_v2p0.cas2321_tmp/telemac2d.bat :65280"

The failing file is named telemac2d.bat, but I am running under Linux. Does this mean there is something wrong in the perl script?

Thanks!

Re: Problems while running telemac in parallel 13 years 3 weeks ago #2925

  • qilong
Hello,

Today I tested the validation case "gouttedo" on the cluster. It ran in parallel and worked well. The cluster uses mpich2_1.2.1p1, while my linux PC uses mpich2_1.4.1p1, so I think the version difference might be the reason why I cannot run parallel telemac on the PC.

I have now reinstalled mpich2 on the PC in the same version as the one used on the cluster. There are now fewer error messages:

*** RUNNING ***
MPI launcher : /usr/local/mpich2/bin/mpirun -machinefile mpirun.txt -np 2 out2115_intel_64_12.exe
problem with execution of out2115_intel_64_12.exe on ubuntu: [Errno 2] No such file or directory
problem with execution of out2115_intel_64_12.exe on ubuntu: [Errno 2] No such file or directory
Duration of job : 0 seconds ( 0:0:0 ) (system=0 sec)

I checked the tmp directory and found that the exe file is not executable. I know the command "chmod a+x *.exe" can fix this, but how can I add it to "runtel.pl" so that it runs before the script invokes the exe file?
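The thread does not show where runtel.pl launches the executable, so the following is not a patch to that script. It is a minimal shell sketch of the same idea, for re-running the launcher by hand in a tmp directory kept with the -t option: set the execute bit on every *.exe first, then run the command. The wrapper name ensure_exec_then_run is hypothetical. Inside a Perl script, the built-in chmod function could play the same role as the chmod command used here.

```shell
#!/bin/sh
# Sketch: make every *.exe in a directory executable, then run a command.
# ensure_exec_then_run is a hypothetical wrapper; it does not modify
# runtel.pl and is not part of the TELEMAC scripts.
ensure_exec_then_run() {
    dir=$1
    shift
    for exe in "$dir"/*.exe; do
        [ -e "$exe" ] || continue     # unmatched glob stays literal
        chmod a+x "$exe"
    done
    # Run the remaining arguments (for example the mpirun line) in $dir.
    ( cd "$dir" && "$@" )
}
```

For example, with tmp_dir set to the kept tmp directory: ensure_exec_then_run "$tmp_dir" /usr/local/mpich2/bin/mpirun -machinefile mpirun.txt -np 2 out2115_intel_64_12.exe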

Thanks!