TOPIC: *Memory allocation failed for CreateGraphDual

*Memory allocation failed for CreateGraphDual 8 years 8 months ago #20436

  • joysanyal21
Hi

I compiled V7p1 on a Linux RHEL system. It works fine in scalar mode.
The openmpi configuration compiled without any major warnings. I used the supplied fedora config file and adapted it to my installation directories. Please find the CFG file attached.

I am using a virtual machine on a remote host, not really a cluster. It is a server on which I have been given a RHEL virtual machine with 16 cores and 128 GB RAM.

When I run telemac with "python runcode.py telemac2d -c fedgfopenmpi cas.txt"

It says:
partitioning base files (geo, conlim, sections and zones)
+> /root/telemac/v7p1r0/builds/fedgfopenmpi/bin/partel < PARTEL.PAR >> partel_T2DGEO.log
Current memory used: 0 bytes
Maximum memory used: 0 bytes
***Memory allocation failed for CreateGraphDual: nptr. Requested size: 137450904736 bytes
STOP 0
and then it stops, saying:

|runCode: Fail to run
|/usr/lib64/openmpi/bin/mpiexec /root/t2dcases/cas1_sis_t2d.txt_2016-03-21-15h50min41s 2 <hosts> /root/t2dcases/cas1_sis_t2d.txt_2016-03-21-15h50min41s/out_telemac2d
|~~~~~~~~~~~~~~~~~~
|/bin/sh: hosts: No such file or directory
I am not quite sure what is wrong.

Any help would be appreciated.

Cheers,

Joy


File Attachment:

File Name: systel.cfg
File Size: 3 KB

*Memory allocation failed for CreateGraphDual 8 years 8 months ago #20438

  • c.coulet
  • Moderator
  • Posts: 3722
  • Thank you received: 1031
Hi

For V7P1, you should have a line incs_special (with the same value as incs_parallel) in your config file.
But as the compilation seems to have run well, I'm not sure this will resolve your problem.
Are you trying to split a huge geometry file?

Regards
PS: Remember to update your profile
Christophe

*Memory allocation failed for CreateGraphDual 8 years 8 months ago #20458

  • joysanyal21
Hi,

Thanks for taking interest in my issue.

I recompiled with 'incs_special' in the config file. Apparently, nothing unusual happened. When I ran the case, I got the same result!

I have 1.4 million nodes and 2.9 million elements. Is that a lot? But then the VM has 128 GB RAM. Could that be the issue? I am running it coupled with Sisyphe, but I also tried with telemac2d only. No difference in the result.

The serial version runs great, but it doesn't make sense to run it in serial on such powerful hardware.

I have one doubt:

I installed openmpi with "yum install openmpi devl", but when I ran 'which mpiexec' I didn't get any result. I am not sure it was installed correctly! However, I checked that the openmpi folders at the paths given in the cfg file do exist.

Please let me know if anybody has any clue about the problem.
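
An empty result from 'which mpiexec' only means that the launcher's directory is not on PATH; an absolute path in the config file can still be valid. A minimal Python sketch of that check follows. The helper name find_launcher is hypothetical, and the path is the one that appears in the error message earlier in this thread.

```python
import os
import shutil


def find_launcher(name="mpiexec", explicit_path=None):
    """Return a usable path to an MPI launcher, or None if none is found."""
    # An absolute path from the config file works even when its
    # directory is not on PATH.
    if explicit_path and os.path.isfile(explicit_path) and os.access(explicit_path, os.X_OK):
        return explicit_path
    # Otherwise fall back to a PATH lookup, which is what `which` does.
    return shutil.which(name)


if __name__ == "__main__":
    # Path copied from the runCode error message in this thread.
    print(find_launcher("mpiexec", "/usr/lib64/openmpi/bin/mpiexec"))
```

If this prints the absolute path, the launcher itself is present and executable, and the missing 'which' result is not the cause of the failure.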

*Memory allocation failed for CreateGraphDual 8 years 8 months ago #20465

  • c.coulet
Hi
Mpiexec is not used by Partel, so this can't be the problem.
Partel uses metis, but in your case it looks like a genuine memory problem.
I have never tried to split a model with 1.4M nodes, but it should work, maybe with more memory?
The requested memory is 137 GB, as indicated in your post...
Could you share the geometry file so that we can test on a different platform?

Regards
Christophe
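
For reference, the 137 GB figure can be checked with a few lines of Python. This is only a unit conversion of the "Requested size" value from the partel log against the 128 GB stated for the VM; the constant and function names are illustrative.

```python
REQUESTED_BYTES = 137_450_904_736  # "Requested size" from the partel log
VM_RAM_GB = 128                    # RAM of the VM, as stated in the first post


def to_gb(n_bytes):
    """Decimal gigabytes (10**9 bytes)."""
    return n_bytes / 10**9


def to_gib(n_bytes):
    """Binary gibibytes (2**30 bytes)."""
    return n_bytes / 2**30


if __name__ == "__main__":
    print(f"requested: {to_gb(REQUESTED_BYTES):.2f} GB = {to_gib(REQUESTED_BYTES):.2f} GiB")
    # The request is larger than the VM's RAM in either unit.
    print("exceeds VM RAM:", min(to_gb(REQUESTED_BYTES), to_gib(REQUESTED_BYTES)) > VM_RAM_GB)
```

The single request comes to about 137.45 GB (128.01 GiB), so it is larger than the whole 128 GB of the VM whichever unit is meant.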

*Memory allocation failed for CreateGraphDual 8 years 8 months ago #20468

  • joysanyal21
Hi,

Many thanks for your prompt reply. Let me try with a geometry file with a smaller number of nodes. I will let you know how it goes. If it fails, I will upload the geometry (it will take a long time to upload!) for your inspection.

*Memory allocation failed for CreateGraphDual 8 years 8 months ago #20469

  • c.coulet
Hi
You should probably use an external tool to share your mesh if it's a huge file, as there is a size limit on the forum...
regards
Christophe

*Memory allocation failed for CreateGraphDual 8 years 8 months ago #20472

  • joysanyal21
Hi,

I tried with a smaller mesh, which runs fine in parallel on my desktop WIN7 installation with 16 GB RAM. The memory shortage message did not appear this time. Nevertheless, it failed. Please find the message below:
+> running in English

... handling temporary directories

... checking coupling between codes

... checking parallelisation

... first pass at copying all input files
copying: cas.liq /root/small/cas_2016-03-22-17h43min04s/T2DIMP
re-copying: /root/small/cas_2016-03-22-17h43min04s/T2DCAS
copying: cas.conlim /root/small/cas_2016-03-22-17h43min04s/T2DCLI
copying: condInit.slf /root/small/cas_2016-03-22-17h43min04s/T2DPRE
copying: Geo_C19.slf /root/small/cas_2016-03-22-17h43min04s/T2DGEO
copying: telemac2d.dico /root/small/cas_2016-03-22-17h43min04s/T2DDICO

... checking the executable
re-copying: telemac2d /root/small/cas_2016-03-22-17h43min04s/out_telemac2d

... modifying run command to MPI instruction

... modifying run command to PARTEL instruction

... handling sortie file(s)


Running your simulation(s) :
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~



/usr/lib64/openmpi/bin/mpiexec /root/small/cas_2016-03-22-17h43min04s 1 <hosts> /root/small/cas_2016-03-22-17h43min04s/out_telemac2d


[ASCII-art banner: telemac2d - v7p1]
_____________
runcode::main:
:
|runCode: Fail to run
|/usr/lib64/openmpi/bin/mpiexec /root/small/cas_2016-03-22-17h43min04s 1 <hosts> /root/small/cas_2016-03-22-17h43min04s/out_telemac2d
|~~~~~~~~~~~~~~~~~~
|/bin/sh: hosts: No such file or directory
|~~~~~~~~~~~~~~~~~~

*Memory allocation failed for CreateGraphDual 8 years 8 months ago #20473

  • c.coulet
Hi
Here it's a configuration problem, as the <hosts> placeholder is not replaced by the script.
Hope this helps
Christophe
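
To illustrate why an unreplaced placeholder produces this particular message: when the command string reaches /bin/sh, the shell reads "<hosts" as "take standard input from a file named hosts", hence "hosts: No such file or directory". The Python sketch below mimics placeholder substitution and flags anything left over. It is not the actual runcode.py logic; the function names and the example values are hypothetical, and the template is the mpi_cmdexec line from the poster's config.

```python
import re

# Template as it appears in the poster's config file.
MPI_CMDEXEC = "/usr/lib64/openmpi/bin/mpiexec <wdir> <ncsize> <hosts> <exename>"


def fill_template(template, values):
    """Replace each <name> placeholder that has a value; leave the others untouched."""
    return re.sub(
        r"<(\w+)>",
        lambda m: str(values.get(m.group(1), m.group(0))),
        template,
    )


def leftover_placeholders(command):
    """List the placeholders that survived substitution."""
    return re.findall(r"<\w+>", command)


if __name__ == "__main__":
    # Illustrative values only; no value is supplied for "hosts".
    cmd = fill_template(
        MPI_CMDEXEC,
        {"wdir": "/root/small/run", "ncsize": 1, "exename": "/root/small/run/out_telemac2d"},
    )
    print(cmd)
    print(leftover_placeholders(cmd))
```

Any name reported by leftover_placeholders would be passed to the shell verbatim, which is what happened with <hosts> in the failing command.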

*Memory allocation failed for CreateGraphDual 8 years 8 months ago #20477

  • joysanyal21
Thanks for your reply, but I am not sure how to solve it. Is the script unable to fetch the host name because it is a virtual machine?

It would be helpful to know what needs to be changed in systel.cfg.

Do I need to change something in

"mpi_cmdexec: /usr/lib64/openmpi/bin/mpiexec <wdir> <ncsize> <hosts> <exename>"

I also ran it by specifying the host name

#mpi_hosts: myhostname

Anyway, as far as I understand, that option is for distributing the computation over multiple systems, which is not my case. The VM should behave like one system.

Regards,

*Memory allocation failed for CreateGraphDual 8 years 8 months ago #20478

  • c.coulet
If you leave the #, the line is not taken into account. # is a comment sign.
To test, you could delete <hosts> or replace it by -localonly or something similar.

Regards

PS:
The config file exists for tuning the configuration to your own system. It's difficult to define strict rules as it depends on the system you are running on.

The question we should ask you is: why are you running on a VM?
Christophe
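
The effect of the leading # can be shown with a small Python sketch. It assumes a simple "key: value" reading of the config lines and is not the parser TELEMAC actually uses; read_settings and the two sample strings are hypothetical, built from the lines quoted in this thread.

```python
def read_settings(text):
    """Parse 'key: value' lines, skipping blanks and lines that start with '#'."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # commented-out lines contribute nothing
        key, sep, value = line.partition(":")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


# The two lines quoted in the thread, with mpi_hosts commented out.
COMMENTED = """
mpi_cmdexec: /usr/lib64/openmpi/bin/mpiexec <wdir> <ncsize> <hosts> <exename>
#mpi_hosts: myhostname
"""
# The same lines with the comment sign removed.
ACTIVE = COMMENTED.replace("#mpi_hosts", "mpi_hosts")

if __name__ == "__main__":
    print("mpi_hosts" in read_settings(COMMENTED))
    print(read_settings(ACTIVE)["mpi_hosts"])
```

With the # in place the mpi_hosts key is simply absent, so nothing is available to substitute for <hosts>; removing the # makes the value visible.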