Hi,
Previously, I was able to run TELEMAC successfully on up to 8 processors, but only on a single node. Now I want to run TELEMAC across multiple nodes (just 2 nodes for now), so I added the following lines to the configuration file:
hpc_stdin: #!/bin/bash
#SBATCH -p physical
#SBATCH --nodes=2
#SBATCH --ntasks=8
#SBATCH --cpus-per-task=1
#SBATCH --mem 16G
#SBATCH --time=0:60:00
I got the errors below. Could someone help? Thanks a lot.
Best regards,
Huy
The MPI_Comm_f2c() function was called before MPI_INIT was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[spartan-rc001.hpc.unimelb.edu.au:14593] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!
There are not enough slots available in the system to satisfy the 16 slots
that were requested by the application:
/home/huyquangtran/telemac/v7p2/LINUX-RUN/T3D.cas_2017-01-31-17h43min34s/out_tide_wind
Either request fewer slots for your application, or make more slots available
for use.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
... merging separated result files
+> T3D.cas
collecting: T3DRES
runRecollection:
|runGRETEL: Could not split your file T3DRES (runcode=1) with the error as follows:
|
|... The following command failed for the reason above (or below)
|/home/huyquangtran/telemac/v7p2/builds/parallel/bin/gretel < gretel_T3DRES.par >> gretel_T3DRES.log
|
|
|Here is the log:
|
|
| +
+
|
| GRETEL: TELEMAC MERGER
|
|
|
| VERSION V7P2R0
|
| HOLGER WEIL BEER (BAW)
|
| JEAN-MICHEL HERVOUET (LNHE)
|
| YOANN AUDOUIN (LNHE)
|
| GRETEL (C) COPYRIGHT 2003-2012
|
| BUNDESANSTALT FUER WASSERBAU, KARLSRUHE
|
|
|
| +
+
|
|
|
|
|
| MAXIMUM NUMBER OF PARTITIONS: 100000
|
|
|
| +
+
|
|
|
| --GLOBAL GEOMETRY FILE:
|
| INPUT: T3DGEO
|
| --GEOMETRY FILE FORMAT <FFORMAT> [MED,SERAFIN,SERAFIND]:
|
| INPUT: SERAFIN
|
| --RESULT FILE:
|
| INPUT: T3DRES
|
| --RESULT FILE FORMAT <FFORMAT> [MED,SERAFIN,SERAFIND]:
|
| INPUT: SERAFIN
|
|--NUMBER OF PARTITIONS <NPARTS> [2 -100000]:
|
| INPUT: 16
|
| --NUMBER OF PLANES:
|
| INPUT: 10
|
| ERREUR 2 LORS DE L APPEL A OPEN_MESH_SRF:OPEN
|
| TEXTE DE L'ERROR : UNKNOWN_ELT_TYPE_ERR
|
|
|
|
|
|
|
| PLANTE: PROGRAM STOPPED AFTER AN ERROR
|
| RETURNING EXIT CODE: 2
|
slurmstepd: error: Exceeded step memory limit at some point.
srun: error: spartan-rc001: task 0: Exited with exit code 1
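
One detail stands out in the log above: mpirun reports 16 requested slots and GRETEL reads 16 partitions, while the #SBATCH header asks for only 8 tasks. The sketch below is a minimal check for that kind of mismatch, which could run in the job script before TELEMAC is launched. The function name check_slots is hypothetical and not part of TELEMAC or Slurm. It assumes that the first argument is the number of processes TELEMAC is told to use (for example the value given as --ncsize) and that the second is the number of tasks Slurm granted.

```shell
# Hypothetical helper (not part of TELEMAC or Slurm): compare the number of
# MPI processes about to be launched with the number of allocated slots.
check_slots() {
    requested="$1"   # assumed: processes TELEMAC will start (e.g. the --ncsize value)
    allocated="$2"   # assumed: tasks granted by Slurm (e.g. $SLURM_NTASKS)
    if [ "$requested" -gt "$allocated" ]; then
        echo "mismatch: $requested processes requested, $allocated slots allocated"
        return 1
    fi
    echo "ok: $requested processes fit in $allocated slots"
}
```

With the numbers from this post, `check_slots 16 8` prints the mismatch message and returns a non-zero status, whereas `check_slots 8 "$SLURM_NTASKS"` would pass under the header shown above if Slurm grants all 8 tasks.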