TOPIC: Parallelism problems caused by the keywords "CHECKING THE MESH"

Parallelism problems caused by the keywords "CHECKING THE MESH" 4 years 6 days ago #37192

  • Gareth
  • Junior Boarder
  • Posts: 46
Hello, everyone!

I previously used Telemac in a Windows environment. After moving to Ubuntu 18.04, I ran into some problems.

When I run a case with Telemac2D (coupled with Sisyphe) in parallel mode, the simulation stops with the error message below. The same case worked well in the Windows version.

error_message.jpg


After I deactivated the keyword 'CHECKING THE MESH', the case ran without any errors.

Besides, I also tested with the example case 'gouttedo'. It worked well whether or not I activated the keyword 'CHECKING THE MESH', which confused me.

Is there a way to solve this problem?

Here are the log file and the steering files.



File Attachment:

File Name: t2d.cas
File Size: 6 KB


File Attachment:

File Name: sis.cas
File Size: 6 KB


All regards

Gareth
The administrator has disabled public write access.

Parallelism problems caused by the keywords "CHECKING THE MESH" 4 years 6 days ago #37193

  • Gareth
Hello

I found another problem with parallelism.

When I tested a Telemac3D case in parallel mode (--ncsize 4) that had worked well in the Windows version, it stopped at the beginning of the simulation (with the keyword 'CHECKING THE MESH' deactivated). Here is the error message.

error_message_t3d_20.jpg


It also reported an error and stopped when run on a single core.

error_message_t3d_1.jpg


Here is the steering file.

File Attachment:

File Name: T3D.cas
File Size: 6 KB


Best wishes.

Gareth

Parallelism problems caused by the keywords "CHECKING THE MESH" 3 years 11 months ago #37372

  • pham
  • Administrator
  • Posts: 1559
  • Thank you received: 602
Hello Gareth,

Sorry for the delay in answering: there was a lot of work leading up to the new release v8p2r0, and a lot of other tasks afterwards.

My first question is: did you use the same tag for both Windows and Ubuntu?
Which release do you use?

As I did not have your geometry and boundary conditions files, I tested your steering files with the files from the tide example.

To make the CHECKING THE MESH feature really work when coupling TELEMAC-2D and SISYPHE, I had to put this keyword in the SISYPHE steering file, not the TELEMAC-2D one. The keyword can be activated in both steering files, but both are connected to the same variable, CHECK_MESH. As the SISYPHE steering file is read after the TELEMAC-2D one, the SISYPHE value is the one that is kept.
Can you try this and tell me whether it works for you?
Anyway, you could also use CHECKING THE MESH in the TELEMAC-2D steering file and run it without coupling with SISYPHE for one time step to get the information.
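
The "last file read wins" behaviour described above can be pictured with a toy model. This is only an illustration in Python, not TELEMAC code: the function name `effective_check_mesh` is hypothetical, the steering files are reduced to plain dictionaries, and the default value NO for CHECKING THE MESH is an assumption.

```python
# Toy model only: this is NOT TELEMAC code. It mimics the behaviour described
# above, where both steering files feed the same CHECK_MESH variable and the
# file read last (SISYPHE) decides the final value.

def effective_check_mesh(t2d_keywords, sis_keywords, default=False):
    """Return the value CHECK_MESH ends up with after both files are read.

    Each argument is a dict of steering-file keywords. `default` stands in
    for the dictionary default of CHECKING THE MESH (assumed to be NO).
    """
    # First read: the TELEMAC-2D steering file.
    check_mesh = t2d_keywords.get("CHECKING THE MESH", default)
    # Second read: the SISYPHE steering file. Its value (or its default)
    # replaces whatever the TELEMAC-2D file had set.
    check_mesh = sis_keywords.get("CHECKING THE MESH", default)
    return check_mesh


# Keyword only in the TELEMAC-2D file: the check ends up switched off.
print(effective_check_mesh({"CHECKING THE MESH": True}, {}))
# Keyword in the SISYPHE file: the check is active.
print(effective_check_mesh({}, {"CHECKING THE MESH": True}))
```

Under these assumptions, setting the keyword only in the TELEMAC-2D file has no effect in a coupled run, which matches the observation above.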

For your 2nd issue, I suspect that you did not use the same release of TELEMAC-3D for your 2 computations. The error is written just before the line with PLANTE: PROGRAM STOPPED AFTER AN ERROR.

It says that preconditioning 17 (direct solver on the vertical) is not available with solver 7 (GMRES). As written in NEWS.txt, since release v8p1 the default value of SOLVER FOR PPE is 7 (GMRES). Your steering file has PRECONDITIONING FOR PPE = 17, which leads to your error message.
There are two solutions: use a solver other than 7, or keep GMRES but set PRECONDITIONING FOR PPE = 2.
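
To spot this combination before launching a run, a small script can scan the steering file. The sketch below is a minimal example, assuming simple `KEYWORD = value` (or `KEYWORD : value`) lines and comment lines starting with `/`; it is not the real steering-file parser. The helper names `read_keywords` and `ppe_conflict` are hypothetical, and the only rule it knows is the one from this thread (solver 7 with preconditioning 17).

```python
import re

# Default of SOLVER FOR PPE since v8p1, according to NEWS.txt as quoted above.
DEFAULT_SOLVER_FOR_PPE = 7


def read_keywords(steering_text):
    """Very small reader for 'KEYWORD = value' or 'KEYWORD : value' lines.

    Assumption: one keyword per line and comment lines starting with '/'.
    The real steering-file parser accepts more than this.
    """
    keywords = {}
    for line in steering_text.splitlines():
        line = line.strip()
        if not line or line.startswith("/"):
            continue
        match = re.match(r"^(.*?)\s*[=:]\s*(.*)$", line)
        if match:
            keywords[match.group(1).upper()] = match.group(2).strip()
    return keywords


def ppe_conflict(steering_text):
    """Return True if the file combines PPE solver 7 with preconditioning 17."""
    keywords = read_keywords(steering_text)
    # If the solver is not given, fall back to the default quoted above.
    solver = int(keywords.get("SOLVER FOR PPE", DEFAULT_SOLVER_FOR_PPE))
    precond = keywords.get("PRECONDITIONING FOR PPE")
    return solver == 7 and precond is not None and int(precond) == 17


example = """
/ hypothetical TELEMAC-3D fragment
PRECONDITIONING FOR PPE = 17
"""
print(ppe_conflict(example))  # True: the solver falls back to the default 7
```

A steering file written for a release where the default solver was different can trip over this without a single line in it changing, which is why the default is part of the check.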

Hope this helps,

Chi-Tuan
The following user(s) said Thank You: Gareth

Parallelism problems caused by the keywords "CHECKING THE MESH" 3 years 11 months ago #37387

  • Gareth
Hello pham!
Thank you for your kind help!

As you suspected, I used different releases of Telemac on the two computers. I think what you said will solve my problem.

I will report back after testing.

All regards

Gareth

Parallelism problems caused by the keywords "CHECKING THE MESH" 3 years 11 months ago #37392

  • Gareth
Hello pham!

For issue 2, after I set SOLVER FOR PPE = 7 (GMRES) and PRECONDITIONING FOR PPE = 2, it worked. I use v8p0 on Windows and v8p1 on Linux now, so I will read NEWS.txt to learn about the new features and changes.

For issue 1, I tried your advice but it did not work. I ran some tests and found that the problems were all related to 'CHECKING THE MESH'.

I tested Telemac-2D, Telemac-3D, and Telemac-2D+Sisyphe, and they all showed the same behaviour:

1. With 'CHECKING THE MESH' activated on a single core, all cases work well.

2. With 'CHECKING THE MESH' deactivated, in parallel mode or on a single core, all cases work well.

3. With 'CHECKING THE MESH' activated in parallel mode, all cases crash at the mesh-checking step, with similar error messages. Here are the error messages for Telemac-2D, Telemac-3D, and Telemac-2D+Sisyphe:

error_message_2020-12-18.jpg


error_t3d.jpg


error_t2d_sis.jpg


I wonder if there is a problem with multi-core allocation during mesh checking. Maybe I could use single-core mode only to check the mesh, and then run the case in multi-core mode with 'CHECKING THE MESH' deactivated?

Besides, when I used v8p0 on Windows before, I found that the mesh-checking results differed between single-core and multi-core modes.

I hope this information is useful.

Best wishes!

Parallelism problems caused by the keywords "CHECKING THE MESH" 3 years 11 months ago #37393

  • Gareth
I forgot one thing.

I also tested the example case t2d_gouttedo_med.cas (examples/telemac2d/gouttedo). With 'CHECKING THE MESH' activated and '--ncsize 4', the simulation stops with the message below:

error_gou.jpg


But it worked well with 'CHECKING THE MESH' activated and '--ncsize 10'.

Parallelism problems caused by the keywords "CHECKING THE MESH" 3 years 11 months ago #37399

  • pham
Hello Gareth,

Thanks for your feedback. It is strange, because I am not able to reproduce your bug with the same gouttedo example with MED and the same splitting.

Anyway, the CHECKING THE MESH feature is only useful once, to check whether your mesh is OK or has issues. When I use it, I run only 1 time step on 1 core and look at the listing. Then I deactivate it, as it is not needed for later runs. It is only a check step and has no influence on the computation that follows. Deactivating it can also save some time (more or less) in future runs.
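
The two-stage workflow above (one mesh check on 1 core, then the real runs in parallel) can be scripted. The sketch below only builds the command lines and does not run anything. The launcher name `telemac2d.py`, the file names, and the helper names `launcher_command` and `two_stage_plan` are assumptions for illustration; the `--ncsize` option is written the same way as earlier in this thread.

```python
def launcher_command(steering_file, ncsize, launcher="telemac2d.py"):
    """Build the argument list for one run (suitable for subprocess.run)."""
    return [launcher, "--ncsize", str(ncsize), steering_file]


def two_stage_plan(check_cas, production_cas, ncsize):
    """Mesh check on 1 core first, then the real run on `ncsize` processes.

    Assumptions: `check_cas` is a copy of the steering file with
    CHECKING THE MESH activated and a single time step; `production_cas`
    has CHECKING THE MESH deactivated. Both file names are hypothetical.
    """
    return [
        launcher_command(check_cas, 1),
        launcher_command(production_cas, ncsize),
    ]


# Print the two commands instead of executing them.
for command in two_stage_plan("t2d_check.cas", "t2d.cas", 4):
    print(" ".join(command))
```

Keeping the check in a separate steering file means the production file never needs to be edited back and forth.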

Best wishes too Gareth!

Chi-Tuan
The following user(s) said Thank You: Gareth

Parallelism problems caused by the keywords "CHECKING THE MESH" 3 years 11 months ago #37400

  • Gareth
Hello pham.

Thanks again for your help.

Following your suggestions, I can now run my Telemac simulations successfully.

All regards.

Gareth

Parallelism problems caused by the keywords "CHECKING THE MESH" 3 years 11 months ago #37402

  • Gareth
Hello pham.

I have one more question about the parallel mode of Telemac.

Does the option '--ncsize' refer to the number of physical CPU cores or to the number of threads (usually twice the number of physical cores)?

Gareth

Parallelism problems caused by the keywords "CHECKING THE MESH" 3 years 10 months ago #37464

  • pham
Hello Gareth,

Colleagues who know computer science better say it is the number of MPI processes. Without OpenMP, 1 MPI process = 1 thread is fine. But you can try multiple threads if you want.
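
Since --ncsize is a number of MPI processes, a rough sanity check is to compare it with the number of logical CPUs the operating system reports. The sketch below uses Python's `os.cpu_count()`, which counts logical CPUs (so hardware threads are included when the machine exposes them). The helper name `ncsize_fits` is hypothetical, and the check says nothing about whether using more processes than physical cores is efficient.

```python
import os


def ncsize_fits(ncsize, logical_cpus=None):
    """Return True if `ncsize` MPI processes do not exceed the logical CPUs.

    os.cpu_count() counts logical CPUs and can return None when the
    count cannot be determined; in that case 1 is assumed here.
    """
    if logical_cpus is None:
        logical_cpus = os.cpu_count() or 1
    return ncsize <= logical_cpus


print("logical CPUs reported:", os.cpu_count())
print("--ncsize 4 fits on this machine:", ncsize_fits(4))
```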

Hope this helps,

Chi-Tuan
The following user(s) said Thank You: Gareth
Moderators: pham
