TOPIC: Crash in reading of forcings when computing on several procs

Crash in reading of forcings when computing on several procs 9 years 3 months ago #17821

  • VicMart
Hi everybody,

I am trying to simulate a large coastal area with strong currents (Raz Blanchard) with T2D. My input files contain currents, winds, pressure, etc. When I launch the computation on a single processor it works, but as the computation is long I wanted to run it in parallel. When I try to use more than one processor, the following error appears:


METEO (INIT): READING FILE HEADER
2014 6 1 0 0 0
LIT : ERREUR DE LECTURE
ON VOULAIT LIRE UN
ENREGISTREMENT DE 1 VALEURS
DE TYPE : I
SUR LE CANAL : 24
PLANTE : ARRET DU PROGRAMME APRES ERREUR
(In English: LIT: read error; a record of 1 value of type I was expected on channel 24. PLANTE: program stopped after an error.)

So if I understand correctly, when you run in parallel, the program divides the mesh, as well as the forcing files, into smaller parts, one for each processor. The error happens when T2D tries to read the smaller split forcing files, at the moment it reads the IKLE parameter in the file header.

Has anyone already had this problem? Any idea what causes it?

Many thanks in advance for the information you could give me.

Victor
The administrator has disabled public write access.

Crash in reading of forcings when computing on several procs 9 years 3 months ago #17824

  • c.coulet
  • OFFLINE
  • Moderator
  • Posts: 3722
  • Thank you received: 1031
Hi
Did you use the default version, or did you make some adaptations in a local Fortran subroutine?

If you made adaptations, we need your file to see where the error is.
If not, this means there is a bug in Telemac...

Regards
Christophe

Crash in reading of forcings when computing on several procs 9 years 3 months ago #17825

  • VicMart
Hi,
Thank you for this quick answer!
Yes, I did make a few modifications to one Fortran subroutine: meteo.f
But the strange thing is that I made these changes a long time ago, and until recently T2D worked with these modifications, whatever the number of processors...

You will find attached the modified meteo.f file.

Regards,

Victor

File Attachment:

File Name: meteo.f
File Size: 15 KB

Crash in reading of forcings when computing on several procs 9 years 3 months ago #17828

  • c.coulet
Did you change the version of Telemac?
Christophe

Crash in reading of forcings when computing on several procs 9 years 3 months ago #17829

  • VicMart
No, I did not.

Crash in reading of forcings when computing on several procs 9 years 3 months ago #17830

  • c.coulet
So to summarize,
  • The code works in sequential mode but crashes in parallel
  • You have always run with Telemac version 6.3
  • Nothing else has changed on the computer
  • The wind is given in a selafin file
In my opinion, the code crashes at the first read of the split wind file, so I suspect a problem in the partitioning step.

You could check in the temporary directory whether the split wind files exist and are well formed. As they are selafin files, you could open each one with BK (BlueKenue) or Fudaa.
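
If no post-processor is at hand, a rough well-formedness check can also be scripted. The following is only a minimal sketch in Python, assuming the split files are Fortran unformatted sequential files in which each record is framed by 4-byte big-endian length markers; the function name `scan_fortran_records` is made up for this illustration and is not part of Telemac.

```python
import struct


def scan_fortran_records(path, endian=">"):
    """Return the payload size (in bytes) of every record in the file.

    Assumes Fortran unformatted sequential framing: a 4-byte length marker,
    the payload, then the same 4-byte length marker again.
    Raises ValueError as soon as the framing is inconsistent.
    """
    sizes = []
    with open(path, "rb") as f:
        while True:
            head = f.read(4)
            if not head:
                break  # clean end of file
            if len(head) < 4:
                raise ValueError(f"truncated length marker after record {len(sizes)}")
            (size,) = struct.unpack(endian + "i", head)
            if size < 0:
                raise ValueError(f"record {len(sizes) + 1}: negative length (wrong endianness?)")
            payload = f.read(size)
            tail = f.read(4)
            if len(payload) < size or len(tail) < 4:
                raise ValueError(f"record {len(sizes) + 1} is truncated")
            if struct.unpack(endian + "i", tail)[0] != size:
                raise ValueError(f"record {len(sizes) + 1}: length markers disagree")
            sizes.append(size)
    return sizes
```

Calling `scan_fortran_records` on each split file in the temporary directory and printing the first dozen sizes gives a quick picture of the header: a size of 0 means a record was written empty, and a `ValueError` means the file is truncated or not framed as assumed.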

regards
Christophe

Crash in reading of forcings when computing on several procs 9 years 3 months ago #17892

  • VicMart
Hi,

Sorry for the delay. I had a look at the serafin files contained in the temporary folder created by Telemac, using a home-made serafin class in Python.

The error when reading the file comes from the fact that the IKLE parameter of the serafin file, which is supposed to contain a connectivity table with dimensions NDP,NELEM, is just an empty item...

So it is when Telemac divides the mesh and the wind and pressure files for parallelisation that something goes wrong in writing the header of the new, smaller files.

So I was wondering if you knew which part of the Fortran code is responsible for dividing the files for parallelisation, so that I could have a look at the code myself.

Many thanks in advance for any kind of info you could give me.

Victor

Crash in reading of forcings when computing on several procs 9 years 3 months ago #17894

  • jmhervouet
Hello,

The file header printed in your listing appears to be the date and time record (6 integers), whereas you seem to expect the array IKLE. Could this be the problem? The selafin format may or may not include a date record, depending on an integer parameter in the record holding a list of 10 integers.
Another useful piece of information would come from reading the partitioned files with a post-processor, as suggested by Christophe, so that you know whether they are correct.
We already had a similar problem when the date was added to files in Telemac-3D: it caused Postel3D to crash because it did not check for the existence of the date record.
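
To make the optional date record concrete, here is a minimal Python sketch of a header reader that only reads the 6-integer date when the tenth integer of the parameter record equals 1. It is an illustration under stated assumptions, not the Telemac reader: the record order (title, two variable counts, variable names, 10 integers, optional date, then NELEM/NPOIN/NDP and IKLE), the big-endian 4-byte record markers and the names `read_selafin_header` and `_record` are assumptions of this sketch.

```python
import io
import struct


def _record(stream, endian):
    """Read one Fortran unformatted record and return its payload."""
    head = stream.read(4)
    if len(head) != 4:
        raise EOFError("unexpected end of file")
    (size,) = struct.unpack(endian + "i", head)
    payload = stream.read(size)
    tail = stream.read(4)
    if len(payload) != size or tail != head:
        raise ValueError("inconsistent record markers")
    return payload


def read_selafin_header(stream, endian=">"):
    """Read the header up to and including IKLE (assumed layout, see lead-in)."""
    title = _record(stream, endian).decode("latin-1").rstrip()
    nbv1, nbv2 = struct.unpack(endian + "2i", _record(stream, endian))
    variables = [_record(stream, endian).decode("latin-1").rstrip()
                 for _ in range(nbv1 + nbv2)]
    iparam = struct.unpack(endian + "10i", _record(stream, endian))
    date = None
    if iparam[9] == 1:  # tenth integer (IPARAM(10) in Fortran numbering)
        date = struct.unpack(endian + "6i", _record(stream, endian))
    nelem, npoin, ndp, _ = struct.unpack(endian + "4i", _record(stream, endian))
    ikle_bytes = _record(stream, endian)
    expected = 4 * nelem * ndp
    if len(ikle_bytes) != expected:
        raise ValueError(
            f"IKLE record holds {len(ikle_bytes)} bytes, expected {expected}")
    ikle = struct.unpack(f"{endian}{nelem * ndp}i", ikle_bytes)
    return {"title": title, "variables": variables, "iparam": iparam,
            "date": date, "nelem": nelem, "npoin": npoin, "ndp": ndp,
            "ikle": ikle}
```

A reader that skips the test on the tenth integer would try to interpret the 6-integer date record as the next record in the header, which is the kind of mismatch described above. The explicit size check on IKLE also reports an empty connectivity record with a clear message instead of a generic read error.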

With best regards,

Jean-Michel Hervouet

Crash in reading of forcings when computing on several procs 9 years 3 months ago #17900

  • VicMart
Good morning,

Thanks a lot for this quick answer, JM Hervouet.
You must be right. I had almost forgotten exactly what was written in the error message, and I was focusing on my IKLE parameter being empty.

So if I understand correctly, the problem is that THERE IS a begin_date in the temporary serafin files, with the last integer of the list of 10 set to IPARAM10 = '1'.

But the program reading the serafin files, which is Postel3D, doesn't check the value of IPARAM10 and doesn't read the date_begin, so it ends up reading the date in place of another parameter. Am I right?

So to try to correct the error, I should have a look at the Postel3D routine which reads the serafin file, but I would also like to look at the routine that divides the serafin file into smaller files for parallelisation. Could you give me the name of this routine, please?
I'll share with you the results I get from my modifications.

Looking forward to hearing from you.

Regards,

Victor MARTIN

Crash in reading of forcings when computing on several procs 9 years 3 months ago #17903

  • jmhervouet
Hello,

No, Postel-3D is a post-processor, so it is not called in a run of Telemac-2D; I was just giving an example that triggered the same kind of error. However, you seem to check for the existence of the date in the file, so it is unlikely to be a problem with the date.
A key piece of information is whether a post-processor like Fudaa or BlueKenue is able to read the split wind files. If so, then the error is in your Fortran, even if it worked before. Another thing to check is whether the file on channel 24 is declared as selafin format in the Telemac2d dictionary, telemac2d.dico, in the sources folder (look for T2DBI1 or FICHIER DE DONNEES BINAIRE 1, at the end of the line SUBMIT=...). It used to be PARAL and we recently changed it to SELAFIN; SELAFIN is correct if you want the file to be split. As this is an all-purpose data file, we cannot really know what it will be used for.
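
The dictionary check can also be scripted. The sketch below is only an illustration in Python: it assumes the SUBMIT entry is a single line holding a quoted, semicolon-separated string whose last field is the mode (PARAL or SELAFIN), which matches the description above but should be verified against the actual telemac2d.dico. The function name `submit_modes` and the sample text in the usage note are made up for this example.

```python
import re


def submit_modes(dico_text, file_code):
    """Return the last field of every SUBMIT line that mentions file_code.

    Assumed line shape (to be checked against the real dictionary):
        SUBMIT = 'field1;field2;...;MODE'
    """
    modes = []
    for line in dico_text.splitlines():
        match = re.match(r"\s*SUBMIT\s*=\s*'([^']*)'", line)
        if match and file_code in match.group(1):
            modes.append(match.group(1).split(";")[-1].strip())
    return modes
```

For instance, with the invented sample text `"SUBMIT = 'T2DBI1-READ;T2DBI1;FACUL;BIN;LIT;PARAL'"`, the call `submit_modes(sample, "T2DBI1")` returns `['PARAL']`, which would indicate that the file is not declared as SELAFIN.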

With best regards,

Jean-Michel Hervouet
Moderators: pham
