TOPIC: Parallel mode not working

Parallel mode not working 4 years 3 months ago #36492

  • Mustermann
  • Expert Boarder
  • Posts: 268
  • Thank you received: 4
Hello,

I recently installed Telemac on Pop!_OS 20.04 without any problems, and the scalar version works fine. However, I cannot get the parallel version to run. I am fairly sure something is wrong with my .cfg file, but I do not know what, since it worked fine on 18.04.

Thanks for your help.

Parallel mode not working 4 years 3 months ago #36502

  • yugi
  • openTELEMAC Guru
  • Posts: 851
  • Thank you received: 244
Hi,

Could you post the error you are getting?
There are 10 types of people in the world: those who understand binary, and those who don't.
The following user(s) said Thank You: Mustermann

Parallel mode not working 4 years 3 months ago #36510

  • Mustermann
Hi,

The error I get is the following:
Parallel inconsistency: 
     +> you may be using an inappropriate configuration: ubugfortrans
     +> or may be wishing for scalar mode while setting to 6 processors

Compilation works without issues. The error only appears when I try to run models using the keyword "PARALLEL PROCESSORS = 6".
I get the same error when I start the model from the terminal with
--ncsize=6

Parallel mode not working 4 years 3 months ago #36511

  • yugi
When you have multiple configurations in your systel.cfg,
you need to either
- add the option -c <config> to your scripts (runcode.py, compile_telemac.py...), or
- set the environment variable USETELCFG:
export USETELCFG=<config>

Here is the cause of your error: you are using the config ubugfortrans, but to run in parallel you want the config ubugfmpich.
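
To see at a glance which configurations in a systel.cfg could run in parallel, a small Python check can help. The sketch below is a minimal illustration and not part of TELEMAC: the sample file content is hypothetical, the function name parallel_capable is made up, and treating the presence of an mpi_cmdexec entry as the sign of a parallel configuration is an assumption of this sketch rather than the rule TELEMAC's own scripts apply.

```python
import configparser

# Hypothetical systel.cfg content: the section names follow this thread,
# the option values are placeholders and not a working setup.
SAMPLE_CFG = """
[Configurations]
configs: ubugfortrans ubugfmpich

[ubugfortrans]
cmd_obj: gfortran -c <mods> <incs> <f95name>

[ubugfmpich]
mpi_cmdexec: mpiexec -wdir <wdir> -n <ncsize> <exename>
cmd_obj: mpif90 -c -DHAVE_MPI <mods> <incs> <f95name>
"""


def parallel_capable(cfg_text):
    """Map each listed configuration to True if it defines mpi_cmdexec.

    Assumption of this sketch: an mpi_cmdexec entry marks an MPI build.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    names = parser["Configurations"]["configs"].split()
    return {name: parser.has_option(name, "mpi_cmdexec") for name in names}


result = parallel_capable(SAMPLE_CFG)
for name, is_parallel in result.items():
    print(name, "parallel" if is_parallel else "scalar")
```

With a real installation, the text could be read from the actual systel.cfg instead of the sample string. The name that turns out to be parallel is the one to pass via -c <config> or USETELCFG.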

Parallel mode not working 4 years 3 months ago #36513

  • Mustermann
OK, I did not know about that, since I did not have to use it like that before.

However, I still get an issue when running:
Invalid MIT-MAGIC-COOKIE-1 key[precision-5530:21305] *** An error occurred in MPI_Comm_rank
[precision-5530:21305] *** reported by process [2060517377,0]
[precision-5530:21305] *** on communicator MPI_COMM_WORLD
[precision-5530:21305] *** MPI_ERR_COMM: invalid communicator
[precision-5530:21305] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[precision-5530:21305] ***    and potentially your MPI job)
Traceback (most recent call last):
  File "~/Telemac/v8p1r1/scripts/python3/telemac2d.py", line 7, in <module>

It seems that MPI is not working with Python 3. Is there a way to fix this?

Parallel mode not working 4 years 3 months ago #36514

  • yugi
The issue is in your configuration file.

Could you try something like this:
mpi_cmdexec:   mpiexec -wdir <wdir> -n <ncsize> <exename>
#
cmd_obj:    mpif90 -c -g -cpp -DHAVE_MPI -fconvert=big-endian -frecord-marker=4 <mods> <incs> <f95name>
cmd_exe:    mpif90 -fconvert=big-endian -frecord-marker=4  -lm -o <exename> <objs>  <libs>
#
incs_all:  
libs_all:  ~/Telemac/v8p1r1/optionals/metis-5.1.0/build/lib/libmetis.a

Then run:
compile_telemac.py --clean
The following user(s) said Thank You: Mustermann

Parallel mode not working 4 years 3 months ago #36538

  • Mustermann
Hi,

Sorry for my late response.

I tried to compile it as you suggested and the compilation works. However, it seems I need both to add the keyword "PARALLEL PROCESSORS = #of Cores" and to use the option -c when running in parallel.
Is there a way to compile it so that I only have to use one of them? I would actually prefer the keyword option.

Best,
Clemens

Parallel mode not working 4 years 3 months ago #36540

  • pham
  • Administrator
  • Posts: 1559
  • Thank you received: 602
Hello Clemens,

If you do not want to add a -c option, you should follow the second way my colleague yugi suggested and set the environment variable USETELCFG:
export USETELCFG=<config>

For example, use pysource.template.sh to write the environment variables you need, then source it in your terminal (source pysource.template.sh).

Then run e.g. telemac2d.py t2d.cas --ncsize=4 (to run in parallel on 4 cores).

For several releases now, you no longer need to set the number of cores in your steering file.
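
As a rough illustration of how the two pieces fit together (USETELCFG picks the configuration, --ncsize sets the number of cores), the following Python sketch assembles the same command line and environment without executing anything. The helper name build_run is hypothetical, and the config name ubugfmpich is taken from earlier in this thread as an example value.

```python
import os


def build_run(cas_file, ncsize, config, module="telemac2d.py"):
    """Return (command, environment) for a parallel run; nothing is executed."""
    env = dict(os.environ)
    # Same effect as `export USETELCFG=<config>` in the shell.
    env["USETELCFG"] = config
    cmd = [module, cas_file, f"--ncsize={ncsize}"]
    return cmd, env


cmd, env = build_run("t2d.cas", 4, "ubugfmpich")
print(" ".join(cmd))  # prints: telemac2d.py t2d.cas --ncsize=4
```

Once the TELEMAC scripts are on the PATH, the pair could be handed to subprocess.run(cmd, env=env, check=True) to actually start the run.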

Hope this helps,

Chi-Tuan
The following user(s) said Thank You: Mustermann

Parallel mode not working 4 years 2 months ago #36654

  • Mustermann
Hi Chi-Tuan,

Thank you for your help.

I have not used Telemac for some years (my last version was v7r3) and did not know that the keyword is no longer needed in the steering file.

I adjusted the source file and it works. However, it seems that the model is not actually running in parallel but in scalar mode. I am not 100% sure whether this is an issue with the compilation, as most examples finish in a very short time. Is there a more demanding example that I can use to check whether all cores are actually used?

Best,
Clemens

Parallel mode not working 4 years 2 months ago #36682

  • pham
Hello,

If you have not already found one, you can try running one of the two malpasset examples with the fine mesh:
- t2d_malpasset-fine.cas
- t3d_malpasset-fine_p2.cas

With 50,000 or 100,000 elements, you should see clear CPU time differences between sequential and parallel runs.
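
To turn the two timings into a number, speedup and parallel efficiency can be computed as below. This is a minimal sketch: the function name speedup is made up, and the timings in the usage lines are hypothetical values, not measurements from the malpasset cases.

```python
def speedup(t_sequential, t_parallel, ncsize):
    """Return (speedup, parallel efficiency) from two wall-clock times in seconds."""
    if t_sequential <= 0 or t_parallel <= 0:
        raise ValueError("timings must be positive")
    if ncsize < 1:
        raise ValueError("ncsize must be at least 1")
    s = t_sequential / t_parallel
    return s, s / ncsize


# Hypothetical timings: 600 s sequential, 150 s with --ncsize=6.
s, eff = speedup(600.0, 150.0, 6)
print(f"speedup {s:.2f}, efficiency {eff:.2f}")  # prints: speedup 4.00, efficiency 0.67
```

A speedup close to 1 with --ncsize greater than 1 would suggest the run is effectively still sequential.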

Hope this helps,

Chi-Tuan