TOPIC: installation and run on cluster

installation and run on cluster 9 years 2 months ago #18326

  • Gaeta
I tried to run my test case directly on the front-end, without a batch file, by launching the command:
runcode.py --mpi tomawac -s /galileo/home/userexternal/mgaeta00/TAR3D_sim/Golfo/ww00_dt10/Golfo_1-10ottobre_dt10/code/cas_GolfoTaranto_tom --ncsize=8 --ncnode=1 --jobname=j

It runs, but always in serial mode.
Sorry, I really don't understand the reason.
Maybe my METIS is not installed correctly?
G
The administrator has disabled public write access.

installation and run on cluster 9 years 2 months ago #18328

  • c.coulet
  • Moderator
  • Posts: 3722
  • Thank you received: 1031
You could check the METIS installation by launching just the partitioning step.
Try to run:
runcode.py tomawac --mpi --split -w TESTDIR --ncsize=8 name_of_cas_file

This runs only the partitioning step, in the directory TESTDIR.

Regards
Christophe
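
For anyone who wants to script this check, here is a minimal Python sketch that only assembles the command quoted above as an argument list. The function name `build_split_command` and its default values are hypothetical. The flags are copied from the post, not from the runcode.py documentation.

```python
import shlex


def build_split_command(cas_file, workdir="TESTDIR", ncsize=8):
    """Assemble the partition-only command from the post as an argv list."""
    return [
        "runcode.py", "tomawac", "--mpi", "--split",
        "-w", workdir,
        f"--ncsize={ncsize}",
        cas_file,
    ]


cmd = build_split_command("name_of_cas_file")
# Print the command as a single shell-safe line for copy and paste.
print(shlex.join(cmd))
# To actually launch it, something like subprocess.run(cmd, check=True)
# could be used on a machine where runcode.py is on the PATH.
```

Keeping the command as a list avoids quoting problems when the cas file path contains spaces.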

installation and run on cluster 9 years 2 months ago #18329

  • Gaeta
I obtained this on the terminal

... modifying run command to MPI instruction

... modifying run command to PARTEL instruction
partitioning: WACGEO
+> /galileo/home/userexternal/mgaeta00/Telemac/svn.opentelemac.org/svn/opentelemac/tags/v6p3r2/builds/cinecagalileoopenmpi_hpc/bin/partel < PARTEL.PAR >> partel_WACGEO.log
Current memory used: 0 bytes
Maximum memory used: 0 bytes
***Memory allocation failed for CreateGraphDual: nptr. Requested size: 3152505995998008 bytes


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

... Your simulation is almost ready for launch. You need to compile your executable with the option -x (--compileonly)



My work is done

and there are the 8 files for geo, wac, etc. in the TESTDIR folder.
So METIS seems OK, or not?
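
The terminal output above also contains a "Memory allocation failed" line, so the presence of the 8 files alone may not settle the question. A minimal Python sketch, assuming the message always has the shape shown in this post (the regular expression and the function name `find_allocation_failures` are illustrative, not part of TELEMAC or METIS), that pulls the requested size out of such output:

```python
import re

# Pattern modelled on the single failure line quoted in the post.
ALLOC_FAILURE = re.compile(
    r"\*\*\*Memory allocation failed for (?P<routine>\w+): (?P<array>\w+)\. "
    r"Requested size: (?P<size>\d+) bytes"
)


def find_allocation_failures(text):
    """Return (routine, array, requested_bytes) for each failure line."""
    return [
        (m["routine"], m["array"], int(m["size"]))
        for m in ALLOC_FAILURE.finditer(text)
    ]


sample = (
    "***Memory allocation failed for CreateGraphDual: nptr. "
    "Requested size: 3152505995998008 bytes"
)
failures = find_allocation_failures(sample)
for routine, array, size in failures:
    # Express the request in TiB to make its magnitude easier to judge.
    print(f"{routine}/{array}: {size / 2**40:.0f} TiB requested")
```

A request of this magnitude is far larger than any mesh array should need, which is worth mentioning when asking for help.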

installation and run on cluster 9 years 2 months ago #18327

  • c.coulet
No.
The telemac script manages all the preparation of the computation:
copying into the temp folder, splitting into subdomains in the case of a parallel computation and, after the computation, merging the results and copying them back to the initial directory.
Christophe
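
The sequence described in this reply can be sketched as an ordered list of stages. This is only an illustration in Python: the stage labels are paraphrases of the post, `plan_stages` is a hypothetical name, and treating the merge as a parallel-only stage is an assumption, since the post states the condition only for the split.

```python
def plan_stages(ncsize):
    """List the stages the launcher script goes through, in order."""
    parallel = ncsize > 1
    stages = ["copy inputs to the temp folder"]
    if parallel:
        # Stated in the post: splitting happens for parallel computations.
        stages.append("split the mesh into subdomains")
    stages.append("run the computation")
    if parallel:
        # Assumption: merging is only needed when the mesh was split.
        stages.append("merge the results")
    stages.append("copy results to the initial directory")
    return stages


for number, stage in enumerate(plan_stages(8), start=1):
    print(number, stage)
```

With `--split`, only the first two stages are expected to run, which is why the partition files appear in the work directory without any results.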

installation and run on cluster 9 years 2 months ago #18332

  • Gaeta
Almost done (maybe, hopefully,..)

I got this message

mpirun -wdir /galileo/home/userexternal/mgaeta00/TAR3D_sim/Golfo/ww00_dt10/Golfo_1-10ottobre_dt10/code/cas_GolfoTaranto_tom_2015-09-17-18h07min47s -n 8 out_WaveWind_VarS-T_G3


There are not enough slots available in the system to satisfy the 8 slots
that were requested by the application:
out_WaveWind_VarS-T_G3

Either request fewer slots for your application, or make more slots available
for use.

ncsize=8, the same as the number of processors in the cas file.
Is it a problem with the machine?
Thanks
G

installation and run on cluster 9 years 2 months ago #18333

  • c.coulet
How did you obtain this message?
Did you try to run the simulation manually, without qsub?
Maybe the master node doesn't have 8 processors...

Regards
Christophe
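
One way to compare what mpirun was asked for with what the node or job offers is sketched below in Python. The helper names `requested_ranks` and `fits` are hypothetical, the working directory in the sample command is a placeholder, and the slot count has to come from the scheduler allocation, which this sketch does not query.

```python
import shlex


def requested_ranks(command):
    """Return the process count given with -n or -np, or None if absent."""
    tokens = shlex.split(command)
    for flag in ("-n", "-np"):
        if flag in tokens:
            idx = tokens.index(flag)
            if idx + 1 < len(tokens):
                return int(tokens[idx + 1])
    return None


def fits(command, available_slots):
    """True when the requested process count does not exceed the slots."""
    ranks = requested_ranks(command)
    return ranks is None or ranks <= available_slots


# Placeholder work directory; the executable name is the one in the thread.
cmd = "mpirun -wdir /path/to/run_dir -n 8 out_WaveWind_VarS-T_G3"
print(requested_ranks(cmd), fits(cmd, available_slots=4))
```

If the job script reserves fewer cores than the value of ncsize, mpirun reports the "not enough slots" message quoted earlier in the thread.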

installation and run on cluster 9 years 2 months ago #18334

  • Gaeta
No, I used the qsub command.
I also tried with 2 processors...

installation and run on cluster 9 years 2 months ago #18343

  • Gaeta
Good morning!
I was able to run the code with 1 processor, using the script file and the qsub command.
If I increase the number of processors, I get this error:

Driving: /cineca/prod/compilers/gnu/4.9.2/none/bin/gfortran -fopenmp -fconvert=big-endian -frecord-marker=4 -lpthread -v -l gfortran -lm -o WaveWind_VarS-T_G3 wacfort.o /galileo/home/userexternal/mgaeta00/Telemac/svn.opentelemac.org/svn/opentelemac/tags/v6p3r2/builds/cinecagalileoopenmpi_hpc/lib/tomawac/homere_tomawac.a /galileo/home/userexternal/mgaeta00/Telemac/svn.opentelemac.org/svn/opentelemac/tags/v6p3r2/builds/cinecagalileoopenmpi_hpc/lib/utils/bief/homere_tomawac.a /galileo/home/userexternal/mgaeta00/Telemac/svn.opentelemac.org/svn/opentelemac/tags/v6p3r2/builds/cinecagalileoopenmpi_hpc/lib/utils/damocles/homere_tomawac.a /galileo/home/userexternal/mgaeta00/Telemac/svn.opentelemac.org/svn/opentelemac/tags/v6p3r2/builds/cinecagalileoopenmpi_hpc/lib/utils/parallel/homere_tomawac.a /galileo/home/userexternal/mgaeta00/Telemac/svn.opentelemac.org/svn/opentelemac/tags/v6p3r2/builds/cinecagalileoopenmpi_hpc/lib/utils/special/homere_tomawac.a -I/cineca/prod/compilers/openmpi/1.8.4/gnu--4.9.2/include -pthread -I/cineca/prod/compilers/openmpi/1.8.4/gnu--4.9.2/lib -L/cineca/sysprod/pbs/12.3.0.143517/lib -Wl,-rpath -Wl,/cineca/sysprod/pbs/12.3.0.143517/lib -Wl,-rpath -Wl,/cineca/prod/compilers/openmpi/1.8.4/gnu--4.9.2/lib -Wl,--enable-new-dtags -L/cineca/prod/compilers/openmpi/1.8.4/gnu--4.9.2/lib -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -l gfortran -l m -shared-libgcc
Using built-in specs.
COLLECT_GCC=/cineca/prod/compilers/gnu/4.9.2/none/bin/gfortran
COLLECT_LTO_WRAPPER=/galileo/prod/compilers/gnu/4.9.2/none/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.9.2/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ./configure --prefix=/cineca/prod/compilers/gnu/4.9.2/none --enable-languages=c,c++,fortran --disable-multilib
Thread model: posix
gcc version 4.9.2 (GCC)
Reading specs from /galileo/prod/compilers/gnu/4.9.2/none/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.9.2/../../../../lib64/libgfortran.spec
rename spec lib to liborig
COLLECT_GCC_OPTIONS='-fopenmp' '-fconvert=big-endian' '-frecord-marker=4' '-v' '-o' 'WaveWind_VarS-T_G3' '-I' '/cineca/prod/compilers/openmpi/1.8.4/gnu--4.9.2/include' '-pthread' '-I' '/cineca/prod/compilers/openmpi/1.8.4/gnu--4.9.2/lib' '-L/cineca/sysprod/pbs/12.3.0.143517/lib' '-L/cineca/prod/compilers/openmpi/1.8.4/gnu--4.9.2/lib' '-shared-libgcc' '-mtune=generic' '-march=x86-64' '-pthread'
COMPILER_PATH=/galileo/prod/compilers/gnu/4.9.2/none/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.9.2/:/galileo/prod/compilers/gnu/4.9.2/none/bin/../libexec/gcc/
LIBRARY_PATH=/galileo/prod/compilers/gnu/4.9.2/none/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.9.2/:/galileo/prod/compilers/gnu/4.9.2/none/bin/../lib/gcc/:/galileo/prod/compilers/gnu/4.9.2/none/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.9.2/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/galileo/prod/compilers/gnu/4.9.2/none/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.9.2/../../../:/lib/:/usr/lib/
Reading specs from /galileo/prod/compilers/gnu/4.9.2/none/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.9.2/../../../../lib64/libgomp.spec
COLLECT_GCC_OPTIONS='-fopenmp' '-fconvert=big-endian' '-frecord-marker=4' '-v' '-o' 'WaveWind_VarS-T_G3' '-I' '/cineca/prod/compilers/openmpi/1.8.4/gnu--4.9.2/include' '-pthread' '-I' '/cineca/prod/compilers/openmpi/1.8.4/gnu--4.9.2/lib' '-L/cineca/sysprod/pbs/12.3.0.143517/lib' '-L/cineca/prod/compilers/openmpi/1.8.4/gnu--4.9.2/lib' '-shared-libgcc' '-mtune=generic' '-march=x86-64' '-pthread'
/galileo/prod/compilers/gnu/4.9.2/none/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.9.2/collect2 --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o WaveWind_VarS-T_G3 /lib/../lib64/crt1.o /lib/../lib64/crti.o /galileo/prod/compilers/gnu/4.9.2/none/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.9.2/crtbegin.o -L/cineca/sysprod/pbs/12.3.0.143517/lib -L/cineca/prod/compilers/openmpi/1.8.4/gnu--4.9.2/lib -L/galileo/prod/compilers/gnu/4.9.2/none/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -L/galileo/prod/compilers/gnu/4.9.2/none/bin/../lib/gcc -L/galileo/prod/compilers/gnu/4.9.2/none/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.9.2/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/galileo/prod/compilers/gnu/4.9.2/none/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.9.2/../../.. -lpthread -lgfortran -lm wacfort.o /galileo/home/userexternal/mgaeta00/Telemac/svn.opentelemac.org/svn/opentelemac/tags/v6p3r2/builds/cinecagalileoopenmpi_hpc/lib/tomawac/homere_tomawac.a /galileo/home/userexternal/mgaeta00/Telemac/svn.opentelemac.org/svn/opentelemac/tags/v6p3r2/builds/cinecagalileoopenmpi_hpc/lib/utils/bief/homere_tomawac.a /galileo/home/userexternal/mgaeta00/Telemac/svn.opentelemac.org/svn/opentelemac/tags/v6p3r2/builds/cinecagalileoopenmpi_hpc/lib/utils/damocles/homere_tomawac.a /galileo/home/userexternal/mgaeta00/Telemac/svn.opentelemac.org/svn/opentelemac/tags/v6p3r2/builds/cinecagalileoopenmpi_hpc/lib/utils/parallel/homere_tomawac.a /galileo/home/userexternal/mgaeta00/Telemac/svn.opentelemac.org/svn/opentelemac/tags/v6p3r2/builds/cinecagalileoopenmpi_hpc/lib/utils/special/homere_tomawac.a -rpath /cineca/sysprod/pbs/12.3.0.143517/lib -rpath /cineca/prod/compilers/openmpi/1.8.4/gnu--4.9.2/lib --enable-new-dtags -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgomp -lgcc_s -lgcc -lquadmath -lm -lgcc_s -lgcc -lpthread -lc -lgcc_s -lgcc /galileo/prod/compilers/gnu/4.9.2/none/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.9.2/crtend.o 
/lib/../lib64/crtn.o
Current memory used: 0 bytes
Maximum memory used: 0 bytes
***Memory allocation failed for CreateGraphDual: nptr. Requested size: 3152505995998008 bytes



This is similar to the error obtained with the partition-only command runcode.py tomawac --mpi --split -w TESTDIR --ncsize=8 name_of_cas_file

(see the previous post)

I also contacted the machine's administration staff; they said it might be a problem of allocatable memory.
Have you experienced a similar problem? How can I solve it?

Thanks
G

installation and run on cluster 9 years 2 months ago #18344

  • yugi
  • openTELEMAC Guru
  • Posts: 851
  • Thank you received: 244
Your error is in partel (the software that partitions the mesh).
In your temporary folder, check the partel....log files.
Could you post the output of those files here?

Hope it helps.
There are 10 types of people in the world: those who understand binary, and those who don't.

installation and run on cluster 9 years 2 months ago #18345

  • Gaeta
Here is the log file.

THE MESH PARTITIONING STEP STARTS
BEGIN PARTITIONING WITH METIS
RUNTIME OF METIS 0.00000000 SECONDS
THE MESH PARTITIONING STEP HAS FINISHED
TREATING SUB-DOMAIN 1
TREATING SUB-DOMAIN 2
OVERALL TIMING: 0.243000001 SECONDS

Is it a METIS problem?

G
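
A quick way to read such a log is to count the sub-domains it reports and compare that with the ncsize of the run. The Python sketch below assumes the log always uses the wording shown in this post. `summarize_partel_log` and `matches_ncsize` are hypothetical helper names, and the thread does not say whether this log came from the 8-processor or the 2-processor attempt.

```python
import re


def summarize_partel_log(text):
    """Extract the sub-domain numbers and the METIS runtime from a log."""
    subdomains = [int(n) for n in re.findall(r"TREATING SUB-DOMAIN\s+(\d+)", text)]
    runtime = re.search(r"RUNTIME OF METIS\s+([0-9.]+)\s+SECONDS", text)
    return {
        "subdomains": subdomains,
        "metis_seconds": float(runtime.group(1)) if runtime else None,
        "finished": "THE MESH PARTITIONING STEP HAS FINISHED" in text,
    }


def matches_ncsize(summary, ncsize):
    """True when exactly sub-domains 1..ncsize were treated."""
    return summary["subdomains"] == list(range(1, ncsize + 1))


log = """THE MESH PARTITIONING STEP STARTS
BEGIN PARTITIONING WITH METIS
RUNTIME OF METIS 0.00000000 SECONDS
THE MESH PARTITIONING STEP HAS FINISHED
TREATING SUB-DOMAIN 1
TREATING SUB-DOMAIN 2
OVERALL TIMING: 0.243000001 SECONDS
"""
summary = summarize_partel_log(log)
print(summary, matches_ncsize(summary, 2), matches_ncsize(summary, 8))
```

A mismatch between the number of treated sub-domains and ncsize would point at the partitioning step, not at the solver.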
Moderators: borisb
