TOPIC: limitation of parallel processors with a weir?

limitation of parallel processors with a weir? 10 years 7 months ago #12486

  • Proust_Nicolas
Hi everybody,

I'm running a case with two weirs on version V6P2.
First I ran the model with two islands instead of the weirs on 30 parallel processors, and it worked.

Next, when I ran the model with the two weirs on 30 parallel processors, the computation stalled at iteration 1. There is no error message and the processes are still running (see the attachment), but nothing happens.

[Attachment: hf4f592a.PNG]

I have tried reducing the number of parallel processors. With the two weirs, it works for 15 processors or fewer. Above 15, the computation stalls as described above.

So here is my question: is there a limit on the number of parallel processors when using a weir?
Nicolas

limitation of parallel processors with a weir? 10 years 7 months ago #12491

  • riadh
Hello Nicolas

Actually, there is no limitation on the parallel use of weirs. Here are some hints to help understand the problem:
- use debug mode (DEBUGGER = 1) to see exactly where the code stops
- use 32 procs instead of 30 (it is recommended to use 2^p procs)
- I noticed that you have points with more than 10 neighbours; this is not allowed in Telemac and could be one source of your troubles
- there were some new adjustments to the weir option in release V6P3, so you need to upgrade to this release in order to benefit from them.
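
As a quick way to check the neighbour hint above outside Telemac, here is a minimal sketch that counts, for each mesh node, how many distinct nodes share an edge with it. It assumes the triangle connectivity is already available as a list of node-number triples; the function names and the toy mesh are made up for illustration, and nothing here reads a Telemac geometry file.

```python
from collections import defaultdict


def neighbour_counts(triangles):
    """Return {node: number of distinct nodes sharing an edge with it}."""
    neighbours = defaultdict(set)
    for a, b, c in triangles:
        neighbours[a].update((b, c))
        neighbours[b].update((a, c))
        neighbours[c].update((a, b))
    return {node: len(adjacent) for node, adjacent in neighbours.items()}


def nodes_over_limit(triangles, limit=10):
    """Return the sorted nodes whose neighbour count exceeds `limit`."""
    counts = neighbour_counts(triangles)
    return sorted(node for node, k in counts.items() if k > limit)


# Toy mesh (hypothetical): a fan of 12 triangles around central node 0,
# so node 0 ends up with 12 neighbours and each ring node with 3.
fan = [(0, i, i % 12 + 1) for i in range(1, 13)]
print(nodes_over_limit(fan))
```

The default limit of 10 is taken from the hint above; adjust it if your release documents a different bound.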

I hope that this helps
with my best regards

Riadh ATA

limitation of parallel processors with a weir? 10 years 7 months ago #12498

  • Proust_Nicolas
Hi Riadh

Thank you for these leads. Here is what happened for each one:

I used the debug mode: the program never returns from PROPAG if the number of parallel processors is greater than or equal to 16. With 15 processors or fewer, everything seems OK. To my mind, PROPAG doesn't depend on the number of parallel processors, so this is strange...

I also tried with 32 procs, and the same problem occurs.

About the neighbours: I checked the simulation with islands instead of weirs. Two log files report points with 11 neighbours, and the calculation runs fine.
For the simulations with real weirs, changing the number of procs makes no difference: two log files report points with 11 neighbours. It seems right that the maximum number of neighbours stays the same, since it depends on the geometry file.

I also tried version V6P3, and it's no better...

Do you have another idea?

Regards
Nicolas

limitation of parallel processors with a weir? 10 years 7 months ago #12523

  • jmhervouet
Hello,

Please find enclosed two corrected subroutines (clsing and clhuvt) that may solve your problem. We recently discovered that in parallel there could be a problem with negative elevations or bottom elevations: the functions P_DMIN and P_DMAX, which share information between processors, were combined in a way that worked only with positive numbers.
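
The corrected code is not shown in this thread, so the sketch below is only a guess at the kind of pitfall described, not the actual Telemac routines. It assumes one processor owns a value while the others hold 0.0, and uses plain Python stand-ins (`p_dmax`, `p_dmin`, named after the Telemac functions but otherwise hypothetical) for the global max and min reductions.

```python
def p_dmax(values):
    """Stand-in for a global maximum over all processors (hypothetical)."""
    return max(values)


def p_dmin(values):
    """Stand-in for a global minimum over all processors (hypothetical)."""
    return min(values)


def share_with_max_only(local_values):
    """Recover the owner's value with a max reduction alone.

    Assumes non-owners contribute 0.0, so this only works when the
    owner's value is positive.
    """
    return p_dmax(local_values)


def share_with_max_plus_min(local_values):
    """Sign-safe variant under the same assumption: one of max/min is 0.0."""
    return p_dmax(local_values) + p_dmin(local_values)


# One value per processor; processor 1 owns the elevation.
positive = [0.0, 2.5, 0.0, 0.0]
negative = [0.0, -2.5, 0.0, 0.0]

print(share_with_max_only(positive), share_with_max_only(negative))
print(share_with_max_plus_min(positive), share_with_max_plus_min(negative))
```

With the max alone, the negative elevation is silently replaced by 0.0 on every processor, which is the sort of error that only shows up once the bottom or the free surface goes below zero.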

With best regards,

Jean-Michel Hervouet

limitation of parallel processors with a weir? 10 years 7 months ago #12525

  • Proust_Nicolas
Hello Jean-Michel,

I already have these corrected subroutines, and they don't solve the problem...

Another idea?

Regards
Nicolas

limitation of parallel processors with a weir? 10 years 7 months ago #12527

  • jmhervouet
Hello,

OK, so we would need the case to take a close look at it.

Regards,

Jean-Michel Hervouet

limitation of parallel processors with a weir? 10 years 7 months ago #12532

  • Proust_Nicolas
You can download the whole model (too big to attach) from the following link:
we.tl/It4Vmah8Fv
Regards
Nicolas

limitation of parallel processors with a weir? 10 years 7 months ago #12526

  • riadh
Hello Nicolas

This is really strange. I have a colleague who used more than 200 procs and it worked well. Maybe there is something else to fix.
Here is another suggestion:
I see that no listing appears, which very probably means that one processor does not finish its work and all the remaining processors are waiting for it. So please add these three lines at different levels (somewhere in telemac2d.f for instance, after IF(DEBUG)...):
WRITE(LU,*)'hello111xxx'
CALL FLUSH(LU)
CALL P_SYNC

and then have a look at the listing files with names like PEXXXX-XXXX.
This will tell you which proc is blocking the others, and we will see together why it causes this trouble.

I hope that this helps
Kind regards

Riadh

limitation of parallel processors with a weir? 10 years 7 months ago #12528

  • Proust_Nicolas
Hello Riadh,

I have added the 3 lines after: IF(DEBUG.GT.0) WRITE(LU,*) 'CALLING PROPAG'
I get the word 'hello' in all the PEXXXX-XXXX files and in the console, but nothing after that...
What are CALL FLUSH(LU) and CALL P_SYNC expected to do?

Regards
Nicolas

limitation of parallel processors with a weir? 10 years 7 months ago #12529

  • riadh
Hi Nicolas

You need to write these lines after every IF(DEBUG)...
In this way you impose a synchronization of all the processors after each task (through mpi_barrier).
We need to see which task is blocking.
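
To illustrate why a marker followed by a flush and a barrier helps, here is a small simulation, not Telemac code. Threads stand in for processors, a list per thread stands in for each PEXXXX-XXXX listing, and `threading.Barrier` plays the role of P_SYNC. The task names, the stuck rank and the timeout are all assumptions chosen for the demo; a real MPI barrier has no timeout and would simply wait forever, which is the hang observed here.

```python
import threading

TASKS = ["PREPARE", "PROPAG", "OUTPUT"]  # hypothetical task sequence


def run(n_ranks=4, stuck_rank=2, stuck_task="PROPAG", timeout=1.0):
    """Run the fake ranks and return {rank: list of markers written}."""
    barrier = threading.Barrier(n_ranks)
    release = threading.Event()
    logs = {rank: [] for rank in range(n_ranks)}

    def rank_main(rank):
        for task in TASKS:
            if rank == stuck_rank and task == stuck_task:
                release.wait()  # simulated hang inside the task
                return
            logs[rank].append("after " + task)  # WRITE + FLUSH equivalent
            try:
                barrier.wait(timeout=timeout)  # P_SYNC equivalent
            except threading.BrokenBarrierError:
                return  # demo only: give up instead of waiting forever

    threads = [threading.Thread(target=rank_main, args=(r,))
               for r in range(n_ranks)]
    for t in threads:
        t.start()
    for rank, t in enumerate(threads):
        if rank != stuck_rank:
            t.join()
    release.set()  # let the stuck rank exit so the demo terminates
    threads[stuck_rank].join()
    return logs


logs = run()
shortest = min(len(lines) for lines in logs.values())
blocking = [rank for rank, lines in logs.items() if len(lines) == shortest]
print(blocking)
```

The healthy ranks all write the marker for the task and then wait at the barrier, so the rank whose listing is missing that marker is the one that never came back from the task. Without the barrier, the healthy ranks could run ahead until their next communication, which makes the listings harder to compare.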

kind regards
Riadh