
TOPIC: Parallel execution of GRETEL?

Parallel execution of GRETEL? 6 years 2 weeks ago #31889

  • Nico
  • Fresh Boarder
  • Posts: 13
Hello everyone,

At our institute we are wondering whether there is an option to execute GRETEL in parallel.

We run large simulations on up to N=224 HPC cluster cores (28 cores per compute node), managed by slurm, using the "RunScript_queue" python script from the ./scripts/cluster directory (slightly adapted to our cluster's needs).

As far as we understand, GRETEL works serially. Because the simulation and the re-collection of results are executed within one slurm call, the merging process leaves N-1 cores idle during the (long-lasting) results collection.

For crashed simulations we were able to modify the "RunScript_queue" script so that it triggers only the merging process via the slurm queue. The problem is that the -n <number of cores> option is mandatory and is hardwired to pass the required information (how many pieces to merge) on to GRETEL. Consequently, it seems to be impossible to launch a serial slurm job that merges N>1 partitions.

Is there any solution or idea for solving this issue - either way?

... we are using v7p3r1.

Cheers
Nico and colleagues
The administrator has disabled public write access.

Parallel execution of GRETEL? 6 years 2 weeks ago #31890

  • c.coulet
  • Moderator
  • Posts: 3722
  • Thank you received: 1031
Hi
I think it will be easier to just run gretel on the master node, if possible, with the gretel.par input file.
Or you could also just run gretel and enter the requested values manually.

Gretel cannot run in parallel.
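
The first suggestion can be sketched in Python as follows. This is only a minimal illustration, assuming that gretel reads the values it asks for from standard input and that a gretel.par file from the run is still available; the helper name run_with_answers and the executable path in the usage comment are hypothetical and not part of the TELEMAC scripts.

```python
import subprocess


def run_with_answers(command, par_file, workdir="."):
    """Run a serial, prompt-driven program, feeding its prompts from a file.

    command  : list with the executable and its arguments
    par_file : text file holding one answer per prompt (e.g. gretel.par);
               the path is resolved by this process, not relative to workdir
    workdir  : directory the program is started in
    """
    with open(par_file, "r") as answers:
        return subprocess.run(
            command,
            stdin=answers,        # the file replaces the manual keyboard input
            cwd=workdir,
            capture_output=True,  # keep stdout/stderr for inspection
            text=True,
            check=True,           # raise CalledProcessError on a non-zero exit
        )


# Hypothetical usage; the executable path depends on the local installation:
# result = run_with_answers(["/path/to/gretel"], "gretel.par")
# print(result.stdout)
```

Because only one process is started, this occupies a single core regardless of how many partitions are merged.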

Regards

PS: please remember to update your profile
Christophe
The following user(s) said Thank You: Nico

Parallel execution of GRETEL? 6 years 2 weeks ago #31891

  • yugi
  • openTELEMAC Guru
  • Posts: 851
  • Thank you received: 244
You can have a look at gretel.py to launch gretel manually.

I am aware of the issue you are describing and I intend to solve it in the python3 scripts.

Also, I would suggest using the hpc configuration instead of run_script_queue, which is not used any more.

You can have a look at the readme.cfg that Sebastien wrote in config.

And you can find examples in systel.edf.cfg and systel-cis-hydra.cfg

You have two configurations:
- One where partitioning/merging is done on the login node and execution is done on the computing node.
- One where all is done on the computing node.

You can then launch your execution the same way as on a local computer (with a couple more options to give).

Hope it helps.
There are 10 types of people in the world: those who understand binary, and those who don't.
The following user(s) said Thank You: Nico

Parallel execution of GRETEL? 6 years 2 weeks ago #31893

  • Nico
  • Fresh Boarder
  • Posts: 13
Hello,

sorry, I missed the information regarding HPC in the readme.cfg file. I will soon reorganize the way we launch on the cluster, as that seems to be more convenient.

Both of you suggested running GRETEL separately. However, we are very limited in executing interactive jobs on the gateway/login nodes, so I think we will try to wrap a slurm script around GRETEL to launch it in the queue with the desired options. Hummmpf :S, I didn't think of this simpler way before - should be possible.
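
A rough sketch of such a wrapper, written in Python so it can sit next to the existing launch scripts. It is an assumption-laden illustration rather than a tested recipe: the function names, the default job name, the time limit and the script file name are all hypothetical, and it assumes gretel takes its answers from a gretel.par file on standard input. Only standard slurm directives (--job-name, --nodes, --ntasks, --time) and a plain sbatch call are used.

```python
import shlex
import subprocess
from pathlib import Path


def build_merge_job(workdir, gretel_cmd, par_file="gretel.par",
                    job_name="gretel_merge", time_limit="02:00:00"):
    """Return the text of a serial slurm batch script that runs the merge step."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        "#SBATCH --nodes=1",
        "#SBATCH --ntasks=1",  # one core only: the merge is serial
        f"#SBATCH --time={time_limit}",
        "",
        # Move to the directory holding the partitioned result files.
        f"cd {shlex.quote(str(workdir))} || exit 1",
        # Assumption: gretel reads the requested values from stdin.
        f"{shlex.quote(str(gretel_cmd))} < {shlex.quote(str(par_file))}",
        "",
    ]
    return "\n".join(lines)


def submit(script_text, script_path="merge_gretel.sh"):
    """Write the batch script to disk and hand it to sbatch."""
    Path(script_path).write_text(script_text)
    return subprocess.run(["sbatch", str(script_path)], check=True)


# Hypothetical usage; both paths depend on the local installation:
# submit(build_merge_job("/scratch/my_run", "/path/to/gretel"))
```

The number of partitions to merge then comes from gretel.par rather than from the core count requested for the job, so the job itself can stay at a single task.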

Thanks a lot for your hints!

Best regards
Nico