TOPIC: Load balancing on parallel computing

Load balancing on parallel computing 3 years 11 months ago #37217

  • echterpat
Hello,

I use Telemac to compute several siting projections in parallel. I have a Docker image with Telemac installed on it, and I launch my jobs with --ncsize=2. The parallelization works well: I can see the files being split, and the CPU load is spread across 2 of the machine's 8 CPUs.

But when I launch a second job, the same two CPUs are used, so I have 6 lazy ones that never work and 2 brave ones that handle both jobs in parallel. How can I configure the distribution so that the full capacity of my machine is used?

Thanks in advance,

Patrick
The administrator has disabled public write access.

Load balancing on parallel computing 3 years 11 months ago #37388

  • yugi
Hi,

This might be an issue with either your image configuration or your MPI configuration.

What does the command nproc return?

Could you post your systel.cfg file?
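
For context on why nproc is worth checking: it reports the number of processing units available to the current process, which can be lower than the machine total if the container is restricted to a subset of CPUs. A minimal sketch of the same check from Python, to be run inside the container; the helper name `cpu_report` is only illustrative and `os.sched_getaffinity` is assumed to be available (Linux only):

```python
import os


def cpu_report():
    """Return (logical CPUs on the machine, CPUs this process may run on)."""
    total = os.cpu_count()
    # sched_getaffinity exists only on Linux; elsewhere assume no restriction.
    if hasattr(os, "sched_getaffinity"):
        allowed = sorted(os.sched_getaffinity(0))
    else:
        allowed = list(range(total or 1))
    return total, allowed


total, allowed = cpu_report()
print(f"logical CPUs on the machine: {total}")
print(f"CPUs usable by this process: {allowed}")
```

If the second list is shorter than the first, the restriction probably comes from the container settings rather than from Telemac.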
There are 10 types of people in the world: those who understand binary, and those who don't.

Load balancing on parallel computing 3 years 11 months ago #37389

  • echterpat
Hi,

Here is the systel.cfg we used. nproc returns 8.

We have not modified the Docker image; we use this one:
github.com/flussplan/docker-telemac

Thanks for your answer!
Attachments:

Load balancing on parallel computing 3 years 11 months ago #37390

  • yugi
Hi,

I am puzzled by the --oversubscribe option in the MPI command.
This option allows you to launch more processes than there are cores available on the machine. It could be the cause of the issue, and it is usually not recommended.
Try removing it.
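
As a rough illustration of that change, here is a small sketch that drops the flag from an MPI launch line. The sample line is hypothetical (it is not copied from the attached systel.cfg), and `drop_flag` is just an illustrative helper:

```python
def drop_flag(command, flag="--oversubscribe"):
    """Return the command with every occurrence of `flag` removed."""
    tokens = command.split()
    return " ".join(token for token in tokens if token != flag)


# Hypothetical launch line; the real one lives in your systel.cfg.
sample = "mpirun --oversubscribe -np <ncsize> <exename>"
print(drop_flag(sample))
```

In practice, editing the line by hand in systel.cfg is enough; the sketch only shows which token to take out.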


If that does not work, I would suggest contacting the owner of the Docker image, as this would then be a Docker issue.
The following user(s) said Thank You: echterpat

Load balancing on parallel computing 3 years 11 months ago #37391

  • echterpat
I will try that, thank you for your help.

The workaround we came up with is to create several Docker instances with an orchestrator that can handle the distribution across CPUs, but we haven't tried it yet...
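
A minimal sketch of what that distribution could look like, assuming each container is pinned with Docker's `--cpuset-cpus` option and runs one job with --ncsize=2 on the 8-CPU machine. The helper `cpuset_for_job` and the image name `telemac-image` are hypothetical, and the script only prints the commands, it does not run them:

```python
def cpuset_for_job(job_index, ncsize, total_cpus):
    """Return a comma-separated CPU list for one job, disjoint per slot."""
    if ncsize < 1 or ncsize > total_cpus:
        raise ValueError("ncsize must be between 1 and total_cpus")
    slots = total_cpus // ncsize  # number of jobs that fit without overlap
    first = (job_index % slots) * ncsize  # wrap around once all slots are used
    return ",".join(str(cpu) for cpu in range(first, first + ncsize))


# Print one pinned container command per job (image name is a placeholder).
for job in range(4):
    cpus = cpuset_for_job(job, ncsize=2, total_cpus=8)
    print(f"job {job}: docker run --cpuset-cpus={cpus} telemac-image")
```

With 8 CPUs and 2 cores per job, four jobs get separate CPU pairs; a fifth job would wrap around and share the first pair again.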