TOPIC: Parallel computing error

Parallel computing error 7 years 7 months ago #26025

  • acsantosr
Hi
I corrected the keyword, kept the lines shorter than 70 characters and changed the directory (Dropbox), but it still does not run. The .slf file is 5.7 MB. Is that too big?





.. checking parallelisation

... first pass at copying all input files
copying: geo.slf /home/geofisico/Documents/100_WorkSpace/TelePar1/corrida_paralelo.cas_2017-04-07-11h56min15s/T2DGEO
copying: bc_sistema.cli /home/geofisico/Documents/100_WorkSpace/TelePar1/corrida_paralelo.cas_2017-04-07-11h56min15s/T2DCLI
re-copying: /home/geofisico/Documents/100_WorkSpace/TelePar1/corrida_paralelo.cas_2017-04-07-11h56min15s/T2DCAS
copying: telemac2d.dico /home/geofisico/Documents/100_WorkSpace/TelePar1/corrida_paralelo.cas_2017-04-07-11h56min15s/T2DDICO

... checking the executable
re-copying: telemac2d /home/geofisico/Documents/100_WorkSpace/TelePar1/corrida_paralelo.cas_2017-04-07-11h56min15s/out_telemac2d

... modifying run command to MPI instruction

... modifying run command to PARTEL instruction

... partitioning base files (geo, conlim, sections and zones)
+> /home/geofisico/opentelemac/v7p1r1/builds/ubugfopenmpi/bin/partel < PARTEL.PAR >> partel_T2DGEO.log
*** The MPI_Comm_f2c() function was called before MPI_INIT was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[geofisico-Precision-Tower-5810:23713] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!
runPartition:
|runPARTEL: Could not split your file T2DGEO (runcode=1) with the error as follows:
|
|... The following command failed for the reason above (or below)
|/home/geofisico/opentelemac/v7p1r1/builds/ubugfopenmpi/bin/partel < PARTEL.PAR >> partel_T2DGEO.log
|
| You may have forgotten to compile PARTEL with the appropriate compiler directive
| (add -DHAVE_MPI to your cmd_obj in your configuration file).
|
|Here is the log:
|
|
| +
+
|
| PARTEL/PARRES: TELEMAC METISOLOGIC PARTITIONER
|
|
|
| REBEKKA KOPMANN & JACEK A. JANKOWSKI (BAW)
|
| JEAN-MICHEL HERVOUET (LNHE)
|
| CHRISTOPHE DENIS (SINETICS)
|
| YOANN AUDOUIN (LNHE)
|
| PARTEL (C) COPYRIGHT 2000-2002
|
| BUNDESANSTALT FUER WASSERBAU, KARLSRUHE
|
|
|
| METIS 5.0.2 (C) COPYRIGHT 2012
|
| REGENTS OF THE UNIVERSITY OF MINNESOTA
|
|
|
| BIEF 7.1 (C) COPYRIGHT 2012 EDF
|
| +
+
|
|
|
|
|
| MAXIMUM NUMBER OF PARTITIONS: 200000
|
|
|
| +
+
|
|
|
| --INPUT FILE NAME <INPUT_NAME>:
|
| INPUT: T2DGEO
|
| --INPUT FILE FORMAT <INPFORMAT> [MED,SERAFIN,SERAFIND]:
|
| INPUT: SERAFIN
|
| --BOUNDARY CONDITIONS FILE NAME:
|
| INPUT: T2DCLI
|
|--NUMBER OF PARTITIONS <NPARTS> [2 -200000]:
|
| INPUT: 2
|
| PARTITIONING METHOD <PMETHOD> [1 (METIS) OR 2 (SCOTCH)]:
|
| --INPUT: 1
|
| --CONTROL SECTIONS FILE NAME (OR RETURN) :
|
| NO SECTIONS
|
| --CONTROL ZONES FILE NAME (OR RETURN) :
|
| NO ZONES
|
| --GEOMETRY FILE NAME <INPUT_NAME>:
|
| INPUT: T2DGEO
|
| --GEOMETRY FILE FORMAT <GEOFORMAT> [MED,SERAFIN,SERAFIND]:
|
| INPUT: SERAFIN
|
| +---- PARTEL: BEGINNING
+
|
|
|
|
|
| READ_MESH_INFO: TITLE= newSelafin
|
| NUMBER OF ELEMENTS: 283042
|
| NUMBER OF POINTS: 142158
|
|
|
| FORMAT NOT INDICATED IN TITLE
|
|
|
|
|
| ONE-LEVEL MESH.
|
| NDP NODES PER ELEMENT: 3
|
| ELEMENT TYPE : 10
|
| NPOIN NUMBER OF MESH NODES: 142158
|
| NELEM NUMBER OF MESH ELEMENTS: 283042
|
|
|
| THE INPUT FILE ASSUMED TO BE 2D
|
| THERE ARE 1 TIME-DEPENDENT RECORDINGS
|
|
|
| THERE IS 7 LIQUID BOUNDARIES:
|
|
|
| BOUNDARY 1 :
|
| BEGINS AT BOUNDARY POINT: 252 , WITH GLOBAL NUMBER: 141281
|
| AND COORDINATES: 894627.8 1402157.
|
| ENDS AT BOUNDARY POINT: 281 , WITH GLOBAL NUMBER: 141562
|
| AND COORDINATES: 894710.1 1402297.
|
|
|
| BOUNDARY 2 :
|
| BEGINS AT BOUNDARY POINT: 543 , WITH GLOBAL NUMBER: 109242
|
| AND COORDINATES: 898369.1 1412088.
|
| ENDS AT BOUNDARY POINT: 557 , WITH GLOBAL NUMBER: 109340
|
| AND COORDINATES: 898378.8 1412166.
|
|
|
| BOUNDARY 3 :
|
| BEGINS AT BOUNDARY POINT: 768 , WITH GLOBAL NUMBER: 103095
|
| AND COORDINATES: 894480.9 1427473.
|
| ENDS AT BOUNDARY POINT: 779 , WITH GLOBAL NUMBER: 102993
|
| AND COORDINATES: 894332.8 1427515.
|
|
|
| BOUNDARY 4 :
|
| BEGINS AT BOUNDARY POINT: 836 , WITH GLOBAL NUMBER: 102041
|
| AND COORDINATES: 893365.9 1427819.
|
| ENDS AT BOUNDARY POINT: 853 , WITH GLOBAL NUMBER: 101846
|
| AND COORDINATES: 893240.7 1427872.
|
|
|
| BOUNDARY 5 :
|
| BEGINS AT BOUNDARY POINT: 908 , WITH GLOBAL NUMBER: 122073
|
| AND COORDINATES: 891395.2 1428052.
|
| ENDS AT BOUNDARY POINT: 921 , WITH GLOBAL NUMBER: 121880
|
| AND COORDINATES: 891227.2 1427982.
|
|
|
| BOUNDARY 6 :
|
| BEGINS AT BOUNDARY POINT: 1023 , WITH GLOBAL NUMBER: 113224
|
| AND COORDINATES: 880841.8 1420896.
|
| ENDS AT BOUNDARY POINT: 1042 , WITH GLOBAL NUMBER: 112955
|
| AND COORDINATES: 880646.0 1420812.
|
|
|
| BOUNDARY 7 :
|
| BEGINS AT BOUNDARY POINT: 1214 , WITH GLOBAL NUMBER: 111003
|
| AND COORDINATES: 881869.2 1405401.
|
| ENDS AT BOUNDARY POINT: 1243 , WITH GLOBAL NUMBER: 111498
|
| AND COORDINATES: 881933.2 1405304.
|
|
|
| THERE IS 7 SOLID BOUNDARIES:
|
|
|
| BOUNDARY 1 :
|
| BEGINS AT BOUNDARY POINT: 1243 , WITH GLOBAL NUMBER: 111498
|
| AND COORDINATES: 881933.2 1405304.
|
| ENDS AT BOUNDARY POINT: 252 , WITH GLOBAL NUMBER: 141281
|
| AND COORDINATES: 894627.8 1402157.
|
|
|
| BOUNDARY 2 :
|
| BEGINS AT BOUNDARY POINT: 281 , WITH GLOBAL NUMBER: 141562
The administrator has disabled public write access.

Parallel computing error 7 years 7 months ago #26026

  • acsantosr
Please take a look at the boundary conditions (BC).

Parallel computing error 7 years 7 months ago #26029

Have you successfully tested your parallel configuration on other example cases and obtained the correct results, as Jose recommended?

If you are properly configured, try running a coarser model and progressively increase the refinement until you trigger the reported error.

Without the .slf file I cannot tell whether it is a meshing problem, so I would build the model up carefully and test it, assuming that you are properly configured for parallel processing.
Regards
TonyC

Parallel computing error 7 years 7 months ago #26031

  • acsantosr
Hi!
My PC runs the other examples with the parallel option. I will try to attach the .slf file.

Parallel computing error 7 years 7 months ago #26032

  • acsantosr
Could you give me an email address?

Parallel computing error 7 years 7 months ago #26035

Hi Ana,
The good news is that I got your file to run.
It turns out that your boundary file (*.cli) was not compatible with the geometry file (*.slf).
So I loaded your geometry file in Blue Kenue and redid the boundary conditions file for your 7 boundaries, and at least it runs.
I have e-mailed the working files to you.
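
For anyone hitting the same problem, a rough first check of a .cli file against the mesh could look like the Python sketch below. It is only an illustration, not part of TELEMAC: `check_cli` is a hypothetical helper, and it assumes the conventional 13-column .cli layout in which column 12 holds the global node number and column 13 the boundary node number. It does not read the .slf file itself, so it cannot confirm that the listed nodes really lie on the mesh boundary.

```python
def check_cli(cli_text, npoin):
    """Return a list of problems found in the text of a .cli file.

    Assumption: each non-empty line has at least 13 whitespace-separated
    columns, with the global node number in column 12 and the boundary
    node number in column 13. npoin is the number of mesh nodes.
    """
    problems = []
    seen = set()
    rows = [line.split() for line in cli_text.splitlines() if line.strip()]
    for k, fields in enumerate(rows, start=1):
        if len(fields) < 13:
            problems.append(f"line {k}: expected 13 columns, found {len(fields)}")
            continue
        try:
            glob, bnd = int(fields[11]), int(fields[12])
        except ValueError:
            problems.append(f"line {k}: node numbers are not integers")
            continue
        # The global number must refer to an existing mesh node.
        if not 1 <= glob <= npoin:
            problems.append(f"line {k}: global node {glob} is outside 1..{npoin}")
        # A mesh node should appear only once in the boundary file.
        if glob in seen:
            problems.append(f"line {k}: global node {glob} is listed twice")
        seen.add(glob)
        # Boundary nodes are expected to be numbered consecutively from 1.
        if bnd != k:
            problems.append(f"line {k}: boundary number {bnd}, expected {k}")
    return problems
```

With the mesh from the log above (NPOIN = 142158), the call would be something like `check_cli(open("bc_sistema.cli").read(), 142158)`. An empty list only means these basic checks passed.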

Regards
Tony C
The following user(s) said Thank You: acsantosr

Parallel computing error 7 years 7 months ago #26036

Just to note: your original files run on a single processor but not in parallel. I am not sure why the case runs in scalar mode if the .cli file is not fully compatible with the geo file, whereas in parallel it hangs.

Note also that the results are the same, so perhaps it is a very minor difference between the geo file and the .cli file that triggers the parallel processing problem.

Regards
Tony C


The following user(s) said Thank You: acsantosr

Parallel computing error 7 years 7 months ago #26057

  • acsantosr
:laugh:

Thank you!!!!!!

Parallel computing error 7 years 7 months ago #26128

Hello,

Your problem probably comes from:
|... The following command failed for the reason above (or below)
 |/home/leno/opentelemac/v7p1r1/builds/ubugfopenmpi/bin/partel < PARTEL.PAR >> partel_T2DGEO.log
 |
 | You may have forgotten to compile PARTEL with the appropriate compiler directive
 | (add -DHAVE_MPI to your cmd_obj in your configuration file).

Did you compile TELEMAC with an HPC (parallel) configuration?
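
One quick way to check this is to look for `-DHAVE_MPI` in the `cmd_obj` entry of the build you are using. The Python sketch below is only an illustration: `has_mpi_flag` is a hypothetical helper, and the sample configuration text is made up for the example (it assumes an INI-style file with one section per build and `key: value` entries). Your own configuration file will contain more keys and different commands.

```python
import configparser

# Hypothetical, shortened configuration text used only for illustration.
SAMPLE = """\
[Configurations]
configs: ubugfopenmpi

[ubugfopenmpi]
options: parallel mpi
cmd_obj: mpif90 -c -O3 -DHAVE_MPI <mods> <incs> <f95name>
"""


def has_mpi_flag(cfg_text, build):
    """Return True if cmd_obj of the given build section has -DHAVE_MPI."""
    # Interpolation is disabled so that '%' in commands is left untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    if not parser.has_section(build):
        raise KeyError(f"no [{build}] section in the configuration")
    cmd_obj = parser.get(build, "cmd_obj", fallback="")
    return "-DHAVE_MPI" in cmd_obj.split()
```

Calling `has_mpi_flag(SAMPLE, "ubugfopenmpi")` on the sample text reports that the flag is present. If it is missing for your build, add it to `cmd_obj` and recompile before running in parallel again.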
Moderators: pham
