TOPIC: Failed in Parallel mode of Telemac2d

Failed in Parallel mode of Telemac2d 3 years 3 months ago #38951

  • Jujube
  • Fresh Boarder
  • Posts: 12
Hi TELEMAC experts,

I'm using version v8p2. The parallel executables compile successfully, but running them produces an error message. I'm not sure what the problem is and would appreciate your comments. Thanks.


The main message reads:
BEGIN PARTITIONING WITH METIS
ERROR: TRY TO RUN PARTEL WITH A SERIAL CONFIGURATION

PLANTE: PROGRAM STOPPED AFTER AN ERROR
RETURNING EXIT CODE: 2

The complete log file and cas file are also attached.

Best regards,

Jujube
Attachments:
The administrator has disabled public write access.

Failed in Parallel mode of Telemac2d 3 years 3 months ago #38952

  • PMV
  • Senior Boarder
  • Posts: 149
  • Thank you received: 42
Did you try running it in serial?

What command are you using?

Are you able to run one of the examples in parallel?

Hope that helps,
Patrick

Failed in Parallel mode of Telemac2d 3 years 3 months ago #38954

  • Jujube
  • Fresh Boarder
  • Posts: 12
Hi Patrick,

Thank you.
Yes, I've tried setting PARALLEL PROCESSORS to 0 (serial) and to 1; in both cases the module runs.


For compiling the modules, I tried 'compile_telemac.py' and 'compile_telemac.py -j 16'. Compilation completes in both cases, but the run fails in parallel mode with the same error message.

For running the program, I tried 'telemac2d.py casfile' and 'telemac2d.py --ncsize=16 casfile' separately; both failed.


As for running one of the examples in parallel, I will try one soon.


Best regards,

Jujube

Failed in Parallel mode of Telemac2d 3 years 3 months ago #38965

  • pham
  • Administrator
  • Posts: 1559
  • Thank you received: 602
Hello Jujube,

As suggested by the 1st rule of this forum "Check that your question has not been answered anywhere else on the site. Use the search feature.", I tried the search feature with "ERROR: TRY TO RUN PARTEL WITH A SERIAL CONFIGURATION" from any date and I found some answers to try:
opentelemac.org/index.php/kunena/search?...te=all&childforums=1

I would start by adding the -DHAVE_MPI option to your configuration file; if that does not work, read the other posts.
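A quick way to check whether a configuration file already carries the flag is to scan it for -DHAVE_MPI. The sketch below is only illustrative: the helper name find_flag, the section name, and the sample compile lines are made up for the example and are not taken from any real configuration shipped with TELEMAC.

```python
def find_flag(config_text, flag="-DHAVE_MPI"):
    """Return (line_number, line) pairs for non-comment lines containing flag."""
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # ignore commented-out lines
        if flag in stripped:
            hits.append((number, stripped))
    return hits


# Hypothetical excerpt of a configuration file (names and flags are assumptions).
SAMPLE = """\
[my_parallel_config]
# -DHAVE_MPI switches on the MPI code paths at compile time
cmd_obj: mpif90 -c -cpp -O2 -DHAVE_MPI <mods> <incs> <f95name>
cmd_exe: mpif90 -o <exename> <objs> <libs>
"""

hits = find_flag(SAMPLE)
for number, line in hits:
    print(f"line {number}: {line}")
```

If the scan returns nothing for the configuration you actually compile with, the flag is probably missing from the compile command, and the sources need to be recompiled after adding it.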

Hope this helps,

Chi-Tuan

Failed in Parallel mode of Telemac2d 3 years 3 months ago #38967

  • Jujube
  • Fresh Boarder
  • Posts: 12
Thank you, Chi-Tuan!

The -DHAVE_MPI option does solve the main part of my problem. Actually, I also searched the forum for "User credentials needed to launch processes" and found that I had an account issue as well, since I wasn't using an administrator account on Windows. I'm so glad that the example runs now!

Thanks again!

Cheers,

Jujube

Failed in Parallel mode of Telemac2d 3 years 3 months ago #38969

  • Jujube
  • Fresh Boarder
  • Posts: 12
Hi, I'm sorry to post again, but I found that parallel mode can still fail when I try another example case.

The case that succeeded earlier is "river_art": it runs in both serial and parallel mode. When I tried the "vegetation" case, it ran in serial mode but failed in parallel mode with the following output:
ITERATION 0 TIME: 0.0000 S
TELEMAC2D INITIALIZED
CVTRVF: 200 SUB-ITERATIONS REQUIRED FOR THE
application called MPI_Abort(MPI_COMM_WORLD, 2) - process 0
DISTRIBUTIVE SCHEME. DECREASE THE TIME-STEP
application called MPI_Abort(MPI_COMM_WORLD, 2) - process 5

application called MPI_Abort(MPI_COMM_WORLD, 2) - process 1
PLANTE: PROGRAM STOPPED AFTER AN ERROR
application called MPI_Abort(MPI_COMM_WORLD, 2) - process 10
application called MPI_Abort(MPI_COMM_WORLD, 2) - process 14
RETURNING EXIT CODE: 2
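Because several MPI processes write to the same console, the solver message above is interleaved with MPI_Abort notices. A minimal sketch for untangling such output is given here; the function name split_log is hypothetical, and the regular expression only assumes the exact abort line format seen in this log.

```python
import re

# Matches lines such as:
# application called MPI_Abort(MPI_COMM_WORLD, 2) - process 5
ABORT = re.compile(
    r"^application called MPI_Abort\(MPI_COMM_WORLD, (\d+)\) - process (\d+)$"
)


def split_log(lines):
    """Separate solver messages from MPI_Abort notices.

    Returns (messages, ranks): the remaining non-empty lines in their
    original order, and the sorted ranks that reported an abort.
    """
    messages, ranks = [], []
    for line in lines:
        text = line.strip()
        if not text:
            continue
        match = ABORT.match(text)
        if match:
            ranks.append(int(match.group(2)))
        else:
            messages.append(text)
    return messages, sorted(ranks)


LOG = """\
ITERATION 0 TIME: 0.0000 S
TELEMAC2D INITIALIZED
CVTRVF: 200 SUB-ITERATIONS REQUIRED FOR THE
application called MPI_Abort(MPI_COMM_WORLD, 2) - process 0
DISTRIBUTIVE SCHEME. DECREASE THE TIME-STEP
application called MPI_Abort(MPI_COMM_WORLD, 2) - process 5

application called MPI_Abort(MPI_COMM_WORLD, 2) - process 1
PLANTE: PROGRAM STOPPED AFTER AN ERROR
application called MPI_Abort(MPI_COMM_WORLD, 2) - process 10
application called MPI_Abort(MPI_COMM_WORLD, 2) - process 14
RETURNING EXIT CODE: 2
"""

messages, ranks = split_log(LOG.splitlines())
print("\n".join(messages))
print("ranks that aborted:", ranks)
```

With the abort notices filtered out, the two halves of the solver message sit next to each other: CVTRVF needed 200 sub-iterations for the distributive scheme and asks for a smaller time step.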

After using a dt of 0.5 seconds instead of the original value of 1 second, the computation completes!

So I'm wondering whether serial and parallel modes have different parameter requirements (dt in this case). What are the possible reasons? I'm really curious about that.


Best regards,

Jujube

Failed in Parallel mode of Telemac2d 3 years 3 months ago #38976

  • pham
  • Administrator
  • Posts: 1559
  • Thank you received: 602
Hello Jujube,

As the error message says, there was an error in your run, and you were right to decrease the time step. The steering file is only a template used for validation. If you look at the vnv_vegetation.py Python script used to run this example (run validate_telemac.py vnv_vegetation.py), you will find the lines:
cas.set('TIME STEP', 0.5)
cas.set('NUMBER OF TIME STEPS', 7200)
to change the time step and the number of time steps only for vegetation law #1.

It is normal.
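The two settings go together: when the time step is reduced, the number of time steps has to grow if the simulated duration is to stay the same. The small sketch below only illustrates that arithmetic with the values quoted above; the helper steps_for_duration is hypothetical and not part of the TELEMAC scripts.

```python
def steps_for_duration(duration_s, time_step_s):
    """Number of time steps needed to cover duration_s with a fixed time step."""
    steps = duration_s / time_step_s
    if steps != int(steps):
        raise ValueError("duration is not a whole multiple of the time step")
    return int(steps)


# Values from the validation script quoted above.
time_step = 0.5      # TIME STEP, in seconds
number_of_steps = 7200  # NUMBER OF TIME STEPS

duration = time_step * number_of_steps
print(f"simulated duration: {duration} s")

# Covering the same duration with a 1 s time step would take half as many steps.
print(steps_for_duration(duration, 1.0))
print(steps_for_duration(duration, 0.5))
```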

Hope this helps,

Chi-Tuan

Failed in Parallel mode of Telemac2d 3 years 3 months ago #38978

  • Jujube
  • Fresh Boarder
  • Posts: 12
Hi Chi-Tuan,

Thank you very much for the explanation! That's a good hint!

Cheers,

Jujube
Moderators: pham
