
TOPIC: Opentelemac and HPC Pack (MS-MPI)

Opentelemac and HPC Pack (MS-MPI) 10 years 7 months ago #12646

  • cyamin
  • openTELEMAC Guru
Hello,

Always happy to be of assistance. Following this discovery, I am exploiting the full potential of the job submission routine within TELEMAC and will post my findings when complete.

At the moment, I am looking into the "noderelease" function of HPC Pack, which is very useful for automatically recollecting the results when the computation has completed. For this task to be submitted to the job scheduler, the command needs to be specified within the batch file (the '<hpc_stdin>' line):
runcode.py -c cfgName -w <wdir> --merge casFile

The only difficulty is getting the name and location (outside of <wdir>, unfortunately) of the casFile that the '--merge' option needs. Which variables within runcode.py hold this information, so that I could use a
stdin = stdin.replace()
line to retrieve it?
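
Schematically, what I would like to end up with is something like this (a sketch only; 'casVariable' stands for whichever runcode.py variable actually holds the steering file name):

# somewhere in the STDIN creation part of runcode.py (sketch)
# 'casVariable' is a placeholder for the variable I am looking for
stdin = stdin.replace('<casFileName>', str(casVariable))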

Regards,
Costas

Opentelemac and HPC Pack (MS-MPI) 10 years 7 months ago #12647

  • sebourban
  • Administrator
  • Principal Scientist
Hello,

I'm afraid <wdir> is the only directory available (not that we would not be able to make others available).

Also, within your STDIN script, you should already be placed where the root CAS file is, unless you did cd <wdir>, in which case you can simply do cd .. first.

Therefore runcode.py <codename> --merge casFile should work.
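
For example, the tail of your STDIN could look something like this (just a sketch; the mpiexec line stands in for whatever your actual run command is):

cd <wdir>
mpiexec -n <ncsize> <exename>
cd ..
runcode.py <codename> --merge casFile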

Let me know if you have problems.

Another option is to call on runSELAFIN.py to merge all files into one within <wdir> (but then you need to know the file names, T2DRES etc., which does not make the code generic).

Sébastien.

Opentelemac and HPC Pack (MS-MPI) 10 years 7 months ago #12653

  • cyamin
  • openTELEMAC Guru
Hello Sebastien,

I have managed to submit jobs and recollect results automatically in one go :cheer:. However, two issues remain if the solution is to be generic:

Firstly, in this particular investigation I am using a parallel Artemis case (BTWI). When I use the runcode.py --merge command, I get the following message:
The name of the module to run and one cas file are at least required
This is overcome by using artemis.py --merge.

Secondly, I still have to define the .cas file name explicitly within the <hpc_stdin> line in systel.cfg.
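
That is, the recollection line has to spell out the steering file by hand, something like this (the configuration and file names here are made up):

artemis.py -c winHPC -w <wdir> --merge myCase.cas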

To my understanding, both issues would require some modification of the runcode.py script.

Costas

Opentelemac and HPC Pack (MS-MPI) 10 years 7 months ago #12722

  • cyamin
  • openTELEMAC Guru
Hello Sebastien,

In the runcode.py script, is there a variable that holds the name of the .cas file currently being processed? Maybe I could modify the STDIN file creation part of runcode.py so as to specify it in my submission script.

Regards,
Costas

Opentelemac and HPC Pack (MS-MPI) 10 years 7 months ago #12724

  • cyamin
  • openTELEMAC Guru
Hello,

After some more research, I found that it is not "casFile" (as initially thought) that holds the filename, but "casName" instead.

I inserted:
stdin = stdin.replace('<casFileName>',str(casName))
after line 1070 in runcode.py and now I can get the .cas filename in my STDIN script.
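
In context, the tag substitution block then reads roughly like this (the neighbouring replace line is from memory and may differ in your version):

# in the HPC_STDIN creation part of runcode.py, around line 1070
stdin = stdin.replace('<wdir>', wdir)                    # existing substitution (from memory)
stdin = stdin.replace('<casFileName>', str(casName))     # the inserted line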

Now, all that is left to make everything work is the runcode.py vs. artemis.py --merge issue that I mentioned before.

Costas

Opentelemac and HPC Pack (MS-MPI) 10 years 7 months ago #12654

  • sebourban
  • Administrator
  • Principal Scientist
OK --

try the following as your merging command, within your STDIN, after your run command:
<py_runcode> --merge
and check what is written as a result in your local HPC_STDIN file.
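
If the tag gets substituted, the local HPC_STDIN should contain the full merging command in place of <py_runcode>, something like this (the exact expansion depends on your configuration):

artemis.py -c cfgName casFile --merge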


Sébastien.

Opentelemac and HPC Pack (MS-MPI) 10 years 7 months ago #12656

  • cyamin
  • openTELEMAC Guru
I tried it, but the tag is left as it is, i.e. <py_runcode> is not substituted.

Costas

Opentelemac and HPC Pack (MS-MPI) 10 years 7 months ago #12726

  • cyamin
  • openTELEMAC Guru
OK, problem solved. I can submit jobs to MS HPC Pack and recollect the result files, all with one generic job submission command. Job submission does not require changes to runcode.py; automatic recollection does. I don't see why this recollection command could not be added in a similar fashion to the scripts for other job schedulers (if they have a function similar to HPC Pack's node release task), so you may want to consider adding this functionality.

The command for recollection is:
<codename>.py -c <cfgname> -w <wdir> --merge <casFileName>
The changes needed in the runcode.py script are two lines after line 1070:
stdin = stdin.replace('<casFileName>',str(casName))
stdin = stdin.replace('<cfgname>',options.configName)
I haven't yet tried to submit multiple .cas files in a single batch job, which might cause a problem with 'casName'; a possible workaround is sketched below.
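
If runcode.py loops over several steering files, one option might be to substitute them all at once (a sketch only; 'casFiles' is my guess at the name of that list):

# sketch: cover several CAS files in one substitution
# ('casFiles' is a guess; '--merge' may then have to be issued once per file)
stdin = stdin.replace('<casFileName>', ' '.join(str(c) for c in casFiles))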

HPC Pack provides numerous possibilities to meet the needs of the user. I will keep exploring the HPC Pack commands to tailor the job submission script to the runcode.py structure, and I will be able to provide a more "mature" HPC configuration after refining it in actual 'production' projects.

Regards,
Costas

Opentelemac and HPC Pack (MS-MPI) 9 years 11 months ago #15249

  • cyamin
  • openTELEMAC Guru
Hello,

I am trying to submit multiple sequential jobs to the HPC Pack queue, i.e. each job continues the computation of the previous one. If I submit all the jobs at once, the working directories are created and populated before the previous computation has completed and its results files have become available, so the submission of the next computation fails.

Is there a way to overcome this problem? How do you address this issue with other job schedulers?

Best Regards,
Costas

Opentelemac and HPC Pack (MS-MPI) 9 years 11 months ago #15269

  • yugi
  • openTELEMAC Guru
Hi,

You should be able to build a dependency between jobs with your job scheduler. Here is an example with slurm:
sbatch --dependency=afterok:<jobID_A> jobB.sh
This command will launch jobB when jobA has finished properly.
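
To chain a whole sequence of jobs, you can capture each job id as you submit (a sketch; --parsable makes sbatch print only the job id):

# submit three dependent jobs in a row
jobA=$(sbatch --parsable jobA.sh)
jobB=$(sbatch --parsable --dependency=afterok:${jobA} jobB.sh)
sbatch --dependency=afterok:${jobB} jobC.sh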

Hope it helps.
There are 10 types of people in the world: those who understand binary, and those who don't.