TOPIC: Recompiling TELEMAC after change on Linux

Recompiling TELEMAC after change on Linux 7 years 9 months ago #25146

  • mourad
Hello Everyone,
I made some modifications in declarations_telemac2d (I only changed the value of MAXKEY) and modified debsce.f to handle more than 99 source points. Then I recompiled TELEMAC with the following command (this is what Mr Coulet advised me to do one month ago):
compileTELEMAC.py -m "clean system -dredgesim"

I then ran a test with 300 source points in scalar mode on my computer (Windows) and it works.

I made the same modifications to TELEMAC installed on our cluster (Linux, many processors) and recompiled TELEMAC with the following: compileTELEMAC. The compilation works fine.

I ran the same test in parallel mode (50 processors) on Linux, but it did not work.
Please, could you tell me what I'm doing wrong? Is it a compilation error?
Many thanks in advance,
Any suggestion is welcome

Recompiling TELEMAC after change on Linux 7 years 9 months ago #25148

  • c.coulet
Hi

Please, when you post a message, try to give all the information that is relevant to solving your problem.

"It did not work" is not sufficient!

If the compilation works fine, this is not a compilation error!
You have an execution error, but we have no idea where or when the crash occurs...
During the preparation of the computation? During the partitioning? During the run of the computation?

It's impossible to help you with no information

Regards
Christophe

Recompiling TELEMAC after change on Linux 7 years 9 months ago #25152

  • mourad
Sorry, please find the attachment here.

Recompiling TELEMAC after change on Linux 7 years 9 months ago #25155

  • mourad
Sorry again, please find attached the log file of the simulation.

Recompiling TELEMAC after change on Linux 7 years 9 months ago #25151

  • mourad
Sorry Mr Coulet, please find attached the simulation log (with DEBUGGER switched on).

What confuses me is that, with the same modifications on the cluster and on my laptop (declarations_telemac2d.f and debsce.f), I successfully ran this simulation on my laptop (scalar mode, Windows) but telemac2d crashes on the cluster.

Many thanks in advance; any remark, advice or suggestion is welcome.

Recompiling TELEMAC after change on Linux 7 years 9 months ago #25158

  • josekdiaz
Dear Mourad,

There is something to point out in your log:
bash: module: line 1: syntax error: unexpected end of file

If you manually edited a TELEMAC file on Windows and then tried to use it on Linux, you should know that line endings differ between the two operating systems.

To convert files from DOS (Windows) endings to UNIX (Linux) endings, you could install dos2unix in your Linux distro.

If you are using e.g. Ubuntu, you could install it using:
sudo apt install dos2unix

and use it from command line:
dos2unix the_file_created_on_windows

These endings could be the problem if your source files, control sections, liquid boundary files, etc. were created on Windows and you copied them to Linux and tried to use them for a TELEMAC run. (Happens to me a lot.)
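If dos2unix is not available on the cluster, the same check can be done by hand. The snippet below is only a minimal Python sketch of the idea: the function names are made up for illustration and are not part of TELEMAC. It looks for DOS (CRLF) endings in a file and rewrites them as UNIX (LF) endings in place.

```python
from pathlib import Path


def has_crlf(data: bytes) -> bool:
    """Return True if the bytes contain at least one DOS (CRLF) line ending."""
    return b"\r\n" in data


def to_unix(data: bytes) -> bytes:
    """Replace every CRLF pair with a single LF; existing LF endings are untouched."""
    return data.replace(b"\r\n", b"\n")


def convert_file(path: str) -> bool:
    """Rewrite `path` in place with UNIX endings.

    Returns True if the file was changed, False if it had no CRLF endings.
    (Hypothetical helper, shown for illustration only.)
    """
    p = Path(path)
    data = p.read_bytes()
    if not has_crlf(data):
        return False
    p.write_bytes(to_unix(data))
    return True
```

Calling `convert_file("my_case.cas")` (a placeholder file name) would then tell you whether that file carried Windows endings at all, which is worth knowing before blaming them.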

Hope it helps,

José Díaz.

Recompiling TELEMAC after change on Linux 7 years 9 months ago #25160

  • mourad
Dear José,
This error may not be related to dos2unix, because the same model works fine with fewer source points (50 source points rather than more than 200) without any conversion from DOS to Linux.
Moreover, I already used dos2unix for one of my test cases and, surprisingly, TELEMAC2D crashes even with the 50-source-point model, which works well without converting the various text files (cas file, sources file, etc.).
Actually, once I convert my cas file with dos2unix, some rows that contain source information go beyond column 71 and telemac2d crashes.
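One way to see which rows are too long is to scan the cas file before launching the run. The following is a minimal Python sketch, assuming a limit of 72 characters per line; that value is an assumption here (the post above mentions column 71), so adjust `max_len` to whatever your TELEMAC version actually enforces. The function name is hypothetical.

```python
def find_long_lines(text: str, max_len: int = 72):
    """Return (line_number, length) for every line longer than max_len.

    A trailing carriage return left over from DOS endings is not counted,
    so the result is the same before and after a dos2unix conversion.
    max_len=72 is an assumed limit, not a value taken from this thread.
    """
    hits = []
    for number, line in enumerate(text.split("\n"), start=1):
        length = len(line.rstrip("\r"))
        if length > max_len:
            hits.append((number, length))
    return hits
```

Reading the steering file as text and passing it to `find_long_lines` lists the offending rows, which can then be split across several lines in the cas file.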
I think that I have a problem with segmentation or with the number of sources.
Thank you for the attention you give to my problem. Please don't hesitate to share your point of view on this error; any feedback or thoughts will be appreciated.

Recompiling TELEMAC after change on Linux 7 years 9 months ago #25162

  • c.coulet
Hi
A simulation that runs in scalar mode can crash in parallel if the modification you made is not compatible with parallel computation.
In your case, there is also the difference between running on Windows and running on Linux...

2 ways to explore:
Try to run in parallel on Windows (even with 2 or 4 processors) to check whether it runs well in parallel.
Try to run in scalar mode on Linux to check whether it runs.

If it doesn't run on Linux, this means you probably have a problem with the files.
If it doesn't run on Windows in parallel, this means you have a problem with the modification you made.
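The two tests above form a small decision table, which can be written down as follows. This is only an illustrative Python sketch of the reasoning in this post; the function name and the labels "files" and "modification" are invented for the example.

```python
def diagnose(runs_scalar_on_linux: bool, runs_parallel_on_windows: bool) -> list:
    """Map the outcome of the two suggested tests to the likely culprits.

    Returns a list of labels (hypothetical names used only in this sketch):
      "files"        -> probably a problem with the input files
      "modification" -> the code change is a problem in parallel
    An empty list means neither test reproduced the crash.
    """
    causes = []
    if not runs_scalar_on_linux:
        causes.append("files")
    if not runs_parallel_on_windows:
        causes.append("modification")
    return causes
```

For example, `diagnose(True, False)` points at the modification only, while `diagnose(False, False)` means both the files and the modification need checking.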

Hope this helps
Christophe

Recompiling TELEMAC after change on Linux 7 years 9 months ago #25182

  • mourad
Hello Mr Coulet,
The good news first:
Yesterday, I launched 3 new simulations with only 50 source points, in both scalar and parallel mode. Below are the results (number of processors, then CPU time):
1P: 3 mn
2P: 1 mn 41 sec
100P: 14 sec
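To put these timings in perspective, speedup and parallel efficiency can be worked out from them. The snippet below is a small Python sketch, assuming the 1-processor run took 3 minutes (180 s), since the figure above is only given to the minute; the function name is made up for the example.

```python
def speedup_and_efficiency(t_serial: float, t_parallel: float, n_procs: int):
    """Return (speedup, efficiency) for a run on n_procs processors.

    speedup    = serial time / parallel time
    efficiency = speedup / number of processors
    """
    speedup = t_serial / t_parallel
    return speedup, speedup / n_procs


# CPU times in seconds; 180.0 for the 1P run is an assumption (reported as about 3 mn).
timings = {1: 180.0, 2: 101.0, 100: 14.0}
for n, t in timings.items():
    s, e = speedup_and_efficiency(timings[1], t, n)
    print(f"{n:>3} procs: speedup {s:5.2f}, efficiency {e:6.1%}")
```

Under that assumption, the speedup is about 1.8 on 2 processors (roughly 89% efficiency) and about 12.9 on 100 processors (roughly 13% efficiency), so the 50-source case does run in parallel but gains little beyond a moderate number of processors.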

Today, I did the same thing but with the full file of source points (235). Unfortunately, TELEMAC crashed in parallel mode, but it works fine in scalar mode. Here are the results:
Scalar mode: CPU time 3 mn 13 sec

Parallel: error
[Node13: 24218] *** An error occurred in MPI_WAITALL
[Node13: 24218] *** reported by process [47612054339585,140733193388040]
[Node13: 24218] *** on communicator MPI_COMM_WORLD
[Node13: 24218] *** MPI_ERR_NO_MEM: out of memory
[Node13: 24218] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[Node13: 24218] *** and potentially your MPI job)

Please, could you explain to me why TELEMAC cannot handle my 235 sources in parallel? Maybe the organisation of the source data in the steering file causes this error. If that is the real problem, why does TELEMAC work fine in scalar mode?
Please also find attached the simulation log (with DEBUGGER).
Any kind of remark, suggestion or advice will be appreciated.
Thank you in advance

Recompiling TELEMAC after change on Linux 7 years 9 months ago #25183

  • mourad
Here is the attachment.