TOPIC: Parallel Installation: attempting to use an MPI routine before initial

Parallel Problem: 7 years 10 months ago #24834

  • huyquangtran
  • Expert Boarder
  • Posts: 271
  • Thank you received: 23
Hi Yoann,

Could something on the HPC admin side be killing my job when I submit it from a local machine to the server?

Thanks
Huy

Parallel Problem: 7 years 10 months ago #24835

  • yugi
  • openTELEMAC Guru
  • Posts: 851
  • Thank you received: 244
Maybe a wrong parameter in the submission script?
There are 10 types of people in the world: those who understand binary, and those who don't.

Parallel Problem: 7 years 10 months ago #24852

  • huyquangtran
Do I need to include the submission script in my configuration?

Thanks
Huy

Parallel Problem: 7 years 10 months ago #24854

  • yugi
Yes, you can see an example in systel.edf.cfg (configuration athos.intel14) and in systel.cis-hydra.cfg.

Parallel Problem: 7 years 10 months ago #24878

  • huyquangtran
I have checked again, and I don't think the problem comes from the HPC admin.

Before running the test, I requested resources:

[huyquangtran@spartan seiche]$ sinteractive --time=00:10:00 --nodes=1 --ntasks=2
srun: job 478155 queued and waiting for resources
srun: job 478155 has been allocated resources

As you can see, the resources were allocated, so I could run the test. However, the error still occurred.

Can anyone help?

Thanks
Huy

modifying run command to MPI instruction

... modifying run command to PARTEL instruction

... partitioning base files (geo, conlim, sections, zones and weirs)
+> /home/huyquangtran/telemac/v7p2/builds/parallel/bin/partel < PARTEL.PAR >> partel_T2DGEO.log
Current memory used: 0 bytes
Maximum memory used: 0 bytes
***Memory allocation failed for CreateGraphDual: nptr. Requested size: 9208409919624 bytes

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0 0x2AEB52696577
#1 0x2AEB52696B8E
#2 0x2AEB530B666F
#3 0x407D0B in partel_
#4 0x41A841 in MAIN__ at homere_partel.f:?
/bin/sh: line 1: 20253 Segmentation fault (core dumped) /home/huyquangtran/telemac/v7p2/builds/parallel/bin/partel < PARTEL.PAR >> partel_T2DGEO.log
runPartition:
|runPARTEL: Could not split your file T2DGEO (runcode=139) with the error as follows:
|
|... The following command failed for the reason above (or below)
|/home/huyquangtran/telemac/v7p2/builds/parallel/bin/partel < PARTEL.PAR >> partel_T2DGEO.log
|
| You may have forgotten to compile PARTEL with the appropriate compiler directive
| (add -DHAVE_MPI to your cmd_obj in your configuration file).
|
|Here is the log:
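
A note on reading the log above: two of its numbers can be decoded mechanically. The sketch below is only an illustration in Python. The helper names `decode_shell_status` and `split_request` are made up for this note, and the 8-byte entry width is an assumption, not something the log states. It shows that runcode=139 is the shell's way of reporting signal 11 (SIGSEGV), and that the size requested for `CreateGraphDual: nptr` (a METIS routine), read as a count of 8-byte entries, splits into a small low 32-bit word and a non-zero high 32-bit word. One possible reading is an integer-width mismatch between partel and the METIS library it was linked against, but nothing in this thread confirms that.

```python
import signal


def decode_shell_status(runcode: int) -> str:
    """Name the signal behind a shell exit status above 128.

    POSIX shells report "killed by signal N" as status 128 + N.
    """
    if runcode > 128:
        return signal.Signals(runcode - 128).name
    return f"exit {runcode}"


def split_request(requested_bytes: int, entry_bytes: int = 8):
    """Read an allocation size as a count of entries, then split the count.

    entry_bytes=8 is an assumption (a 64-bit index type); the log does not
    say which width the library was built with.
    Returns (count, leftover_bytes, low_32_bits, high_32_bits).
    """
    count, leftover = divmod(requested_bytes, entry_bytes)
    low = count & 0xFFFFFFFF   # what a 32-bit caller would have passed
    high = count >> 32         # extra bits a 64-bit reader would pick up
    return count, leftover, low, high


if __name__ == "__main__":
    # runcode=139 from the runPARTEL message
    print(decode_shell_status(139))

    # "Requested size: 9208409919624 bytes" from the METIS message
    requested = 9208409919624
    print(f"{requested / 2**40:.3f} TiB")
    print(split_request(requested))
```

Under that 8-byte assumption the low word is 4625 entries, which would be a request of only 37000 bytes, while the high word is 268. A request of roughly 8.4 TiB for a small test case is in any case not a genuine memory need, so the size passed to the partitioner is worth checking before looking at the cluster's memory limits.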

Parallel Problem: 7 years 10 months ago #24879

  • huyquangtran
Hi,

I have solved my own problem!

Best Regards
Huy

Parallel Problem: 7 years 10 months ago #24880

  • yugi
What was the error?

Parallel Problem: 7 years 10 months ago #24881

  • huyquangtran
Here: [attached image: analysingerror.jpg]

Best Regards

Huy

Parallel Problem: 7 years 5 months ago #26632

Hi huyquangtran
I'm facing exactly the same problem. Could you please share the configuration files you used to compile TELEMAC in parallel? I'm still stuck, even after changing the METIS library.

Best regards
Jean-Rémy