TOPIC: Parallel issue

Parallel issue 11 years 9 months ago #7451

  • abernard
  • Expert Boarder
  • Posts: 210
  • Thank you received: 45
Yugi,

Typing gfortran -v and mpif90 -v returns the same message.

Screenshotfrom2013-02-12155212.png


Thanks for your help
The administrator has disabled public write access.

Parallel issue 11 years 9 months ago #7636

  • abernard
  • Expert Boarder
  • Posts: 210
  • Thank you received: 45
Hi Yugi,

Did you have a look at what gfortran -v and mpif90 -v return for me?
I just don't know whether TELEMAC has been compiled in 32 or 64 bits.
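
A hedged sketch of one way to answer the 32- or 64-bit question directly from a compiled binary: on Linux, executables are ELF files, and the fifth byte of the header (EI_CLASS) is 1 for 32-bit objects and 2 for 64-bit objects. The function names below and the path in the usage note are illustrative assumptions, not part of TELEMAC.

```python
ELF_MAGIC = b"\x7fELF"


def elf_bits_from_header(header):
    """Return 32 or 64 from the first bytes of an ELF file, else None."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        return None  # not an ELF file, or too short to tell
    # Byte 4 is EI_CLASS: 1 = ELFCLASS32, 2 = ELFCLASS64.
    return {1: 32, 2: 64}.get(header[4])


def elf_bits(path):
    """Read the header of the file at `path` and report its word size."""
    with open(path, "rb") as fh:
        return elf_bits_from_header(fh.read(5))
```

For example, elf_bits("path/to/partel") would report the word size of that executable (the path is a placeholder). The standard `file` command prints the same information.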

Thanks a lot for your help,
Regards

Parallel issue 11 years 8 months ago #7676

  • yugi
  • openTELEMAC Guru
  • Posts: 851
  • Thank you received: 244
Hi,

Sorry for the late reply, but your code seems to be compiled in 64 bits.
So I don't know why it crashed.
But it is part of the joys of Computer Science.
As long as it works.

Cheers,
Yoann
There are 10 types of people in the world: those who understand binary, and those who don't.

Parallel issue 11 years 8 months ago #8013

  • titseng
I have the same problem.
I compiled TELEMAC; when I set the number of processors to 0 or 1, it worked. But when I set it larger than 1, it crashes. The error message is as follows:

+> /home/p00tti01/telemac/parallel/parallel_v6p2/ubugfopenmpi/partel < PARTEL.PAR >> partel_T2DGEO.log
sh: line 1: 17701 Segmentation fault /home/p00tti01/telemac/parallel/parallel_v6p2/ubugfopenmpi/partel < PARTEL.PAR >> partel_T2DGEO.log
... The following command failed for the reason above

Then I tried installing METIS versions 5.0.2 and 5.0.3; it doesn't work in either case.

Can anybody help me?

Parallel issue 11 years 8 months ago #8014

  • abernard
  • Expert Boarder
  • Posts: 210
  • Thank you received: 45
Hi,

You should try recompiling metis-5.0.2 or 5.0.3, keeping IDXTYPEWIDTH set to 32 in include/metis.h.

After that, recompile the parallel module or the whole system. In my case it works, but I haven't yet identified why.
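
A minimal sketch for confirming the setting before rebuilding, assuming the header contains a line of the form `#define IDXTYPEWIDTH 32`, as the METIS 5 header does. The function name is hypothetical; it only parses text and does not touch the METIS build.

```python
import re


def idx_type_width(header_text):
    """Return the integer from '#define IDXTYPEWIDTH <n>', or None if absent."""
    match = re.search(
        r"^\s*#\s*define\s+IDXTYPEWIDTH\s+(\d+)", header_text, re.MULTILINE
    )
    return int(match.group(1)) if match else None
```

Typical use would be idx_type_width(open("include/metis.h").read()), run from the METIS source directory; a result other than 32 means the header was changed from the value recommended above.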

Hope this helps,
The following user(s) said Thank You: titseng

Parallel issue 11 years 7 months ago #8087

  • OBA
  • Fresh Boarder
  • Posts: 19
  • Thank you received: 4
Hi all,

I would like to submit one particular problem with my mesh.
I've succeeded in installing TELEMAC v6p2 in parallel on ubuntu 12.10 (with openMPI 1.6.4 and metis 5.0.3).
I've tested several models from validation cases (mersey, ondem2 for instance) and one of my own ones with success both in scalar and parallel modes.
I've tried another model with many islands (>50) which represent buildings in an urban area, and the result is surprising.
In scalar mode, no problem: it runs in 25 minutes.
In parallel mode with 2 processors, it runs well, but in 53 minutes?!?!
With 4 processors it takes even longer, and I stopped the run.
It looks as if it runs the whole simulation 2 or 4 times rather than once in 2 or 4 parts.
In the temporary directory, all the files are duplicated except T2DRES: I have only this one result file, and neither T2DRES00001-00000 nor T2DRES00001-00001.
So gretel crashes at the end of the run.
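
As a hedged illustration of the check described above: the two file names quoted for a 2-processor run suggest a suffix of the form (number of processors minus 1, zero-padded to 5 digits), a dash, then the processor rank. That pattern is inferred from those two names only, and the function names are hypothetical.

```python
def expected_partition_files(base, nproc):
    """List the per-processor file names expected for `base` on `nproc` processors."""
    # Pattern inferred from T2DRES00001-00000 / T2DRES00001-00001 (2 processors).
    return [f"{base}{nproc - 1:05d}-{rank:05d}" for rank in range(nproc)]


def missing_partition_files(base, nproc, present):
    """Return the expected names that do not appear in `present`."""
    present = set(present)
    return [name for name in expected_partition_files(base, nproc) if name not in present]
```

With only "T2DRES" present in the temporary directory, missing_partition_files("T2DRES", 2, ["T2DRES"]) returns both partition names, which matches the situation described here.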
I attach my log files in case you have an idea about this problem.
Thanks in advance

Olivier

Parallel issue 11 years 7 months ago #8093

  • pham
  • Administrator
  • Posts: 1559
  • Thank you received: 602
Hi Olivier,

Would it be possible to have your two steering files, to know the number of elements of your mesh and to have a screenshot of an ls -ltr command in the temporary directory created by TELEMAC please? Have you changed anything in the standard sources? Do all the PE***.LOG have the same size or not?
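
The log-size question can also be checked with a few lines of Python, sketched here under the assumption that the per-processor logs match the glob PE*.LOG in the temporary directory. The function names are illustrative only.

```python
from pathlib import Path


def pe_log_sizes(tmp_dir):
    """Map each PE*.LOG file name in `tmp_dir` to its size in bytes."""
    return {p.name: p.stat().st_size for p in sorted(Path(tmp_dir).glob("PE*.LOG"))}


def sizes_differ(sizes):
    """True when at least two log files have different sizes."""
    return len(set(sizes.values())) > 1
```

An empty dictionary from pe_log_sizes means no per-processor log was written at all, which is itself a hint that the run was not really parallel.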

Regards,

Chi-Tuan

Parallel issue 11 years 7 months ago #8096

  • yugi
  • openTELEMAC Guru
  • Posts: 851
  • Thank you received: 244
Hi Olivier,

The problem is that your test case is run multiple times in serial.
You can see that in your log file:
NOMBRE DE PROCESSEURS PARALLELES DIFFERENT :
DEJA DECLARE (CAS DE COUPLAGE ?) : 0
TELEMAC-2D : 2
LA VALEUR 0 EST GARDEE
NOMBRE DE PROCESSEURS PARALLELES DIFFERENT :
DEJA DECLARE (CAS DE COUPLAGE ?) : 0
TELEMAC-2D : 2
LA VALEUR 0 EST GARDEE
(In English: the number of parallel processors differs; already declared (coupling case?): 0; TELEMAC-2D: 2; the value 0 is kept.)

Check that PARALLEL PROCESSORS (PROCESSEURS PARALLELES in French) is set to the same value in both of your steering files.
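
A rough sketch of that comparison, assuming steering files use `KEYWORD = value` or `KEYWORD : value` lines and `/` for comment lines. The French keyword spelling is the one quoted later in this thread; the English spelling and all function names are assumptions for illustration.

```python
import re

# Keyword spellings assumed for illustration (English and French).
KEYWORDS = ("PARALLEL PROCESSORS", "PROCESSEURS PARALLELES")


def parallel_processors(steering_text):
    """Return the processor count set in a steering file's text, or None."""
    for raw in steering_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("/"):
            continue  # skip blank lines and comment lines
        for key in KEYWORDS:
            match = re.match(rf"{re.escape(key)}\s*[:=]\s*(\d+)", line, re.IGNORECASE)
            if match:
                return int(match.group(1))
    return None


def same_setting(*steering_texts):
    """True when every steering text gives the same value (or all omit it)."""
    return len({parallel_processors(text) for text in steering_texts}) == 1
```

If same_setting returns False for the two steering files, that mismatch would be the first thing to fix.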

Hope it helps,
Yoann

Parallel issue 11 years 7 months ago #8105

  • OBA
  • Fresh Boarder
  • Posts: 19
  • Thank you received: 4
Hi,

Thanks a lot for your answers.
My steering files and the screenshot are attached, for the 2-processor case.
I don't have any PE***.LOG or T2DRES*** files in the temporary directory.
What I've changed in the sources is:
-lecsip.f (cf. #7128)
-MAXFRO=3000 in partel.f (cf. #7347) but it didn't change anything
That's all.
My Fortran file just sets different formulations of discharge in the buse.f subroutine.
................
While writing this, I'm testing without the Fortran file... and it seems to run well without it!!
So the problem comes from my subroutine buse.f; can you have a look at it please?
I will check it too.
Best regards

Olivier

Parallel issue 11 years 7 months ago #8112

  • OBA
  • Fresh Boarder
  • Posts: 19
  • Thank you received: 4
Hi all,

Well, I've solved the problem.
In fact, it doesn't come from the Fortran file but from the Fortran executable.
First, I compiled ubugfortrans in scalar mode and ran my simulation. That left a new Fortran executable in my directory; the run worked well, no problem.
After that, I compiled ubugfopenmpi in parallel mode and changed "configs" in my systel file from ubugfortrans to ubugfopenmpi. I launched my run in parallel mode (2 processors) without deleting the Fortran executable (which had been compiled in scalar mode...) and, as the Fortran file hadn't been modified, the executable wasn't recompiled. So the run used the scalar-mode executable even though I had set "PROCESSEURS PARALLELES = 2".
Now I've deleted the previous (scalar-mode) executable from my directory, the executable has been recompiled in parallel mode, and it runs well!!!
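
The pitfall can be summarised in a short sketch: a rebuild rule based only on timestamps reuses the old executable after a configuration switch, whereas also comparing the configuration it was built with would catch it. This is an illustration under assumed names; it is not how the TELEMAC scripts actually decide, and recording the build configuration next to the executable is a hypothetical addition.

```python
def needs_rebuild(exe_mtime, source_mtime, built_config, wanted_config):
    """Decide whether the user executable should be recompiled.

    exe_mtime is None when no executable exists yet; built_config is the
    configuration name the existing executable was compiled with.
    """
    if exe_mtime is None:
        return True  # nothing to reuse
    if source_mtime > exe_mtime:
        return True  # Fortran file edited after the last build
    # A timestamp-only rule would stop here and reuse the executable.
    return built_config != wanted_config
```

In the situation above, needs_rebuild(20.0, 10.0, "ubugfortrans", "ubugfopenmpi") is True even though the Fortran file is older than the executable. Deleting the old executable by hand, as done here, has the same effect.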
"the joys of Computer Science" as said Yugi in a previous post... ;)
Thanks for your help and best regards.

Olivier