
TOPIC: Parallel implementation

Parallel implementation 8 years 2 weeks ago #24146

  • chelobarros
Hi, sorry to write so late, but I was wondering if you could explain what "using the MPI wrapper for gfortran" means. I tried to compile for parallel mode because I finally have some proper models to run, but this error occurs.


+> configuration: debgfopenmpi
+> root: /home/chelobarros/v7p1r0
+> modules: clean system -dredgesim

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~



Compiling the program SPLITSEL and dependents
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

---s
No underlying compiler was specified in the wrapper compiler data file
(e.g., mpicc-wrapper-data.txt)
[\ ] 2% | 1s

Hummm ... I could not complete my work.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

compileTELEMAC::createObjFiles:
+> failed: /usr/bin/mpif90 -c -O3 -DHAVE_MPI -fconvert=big-endian -frecord-marker=4 -I /home/chelobarros/v7p1r0/builds/debgfopenmpi/lib/utils/special -I /home/chelobarros/v7p1r0/builds/debgfopenmpi/lib/utils/hermes -I /home/chelobarros/v7p1r0/builds/debgfopenmpi/lib/utils/bief -I /home/chelobarros/v7p1r0/builds/debgfopenmpi/lib/utils/splitsel /home/chelobarros/v7p1r0/sources/utils/special/declarations_special.f
... The following command failed for the reason above (or below)
/usr/bin/mpif90 -c -O3 -DHAVE_MPI -fconvert=big-endian -frecord-marker=4 -I /home/chelobarros/v7p1r0/builds/debgfopenmpi/lib/utils/special -I /home/chelobarros/v7p1r0/builds/debgfopenmpi/lib/utils/hermes -I /home/chelobarros/v7p1r0/builds/debgfopenmpi/lib/utils/bief -I /home/chelobarros/v7p1r0/builds/debgfopenmpi/lib/utils/splitsel /home/chelobarros/v7p1r0/sources/utils/special/declarations_special.f:


Thanks for your time.

greetings,
chelo.


By the way, this is my systel file:


""""""""""""""""""""""""""

# _____ _______________________________
# ____/ TELEMAC Project Definitions /______________________________/
#
[Configurations]
configs: debgfopenmpi debgfortrans
#configs: debgfortrans
#
[general]
modules: clean system -dredgesim
#
mods_all: -I <config>
#
sfx_zip: .gztar
sfx_lib: .a
sfx_obj: .o
sfx_mod: .mod
sfx_exe:
#
val_root: /home/chelobarros/v7p1r0
val_rank: all
# also possible val_rank: <3 >7 6
#

# _____ ____________________________________
# ____/ Debian gfortran scalar /___________________________________/
[debgfortrans]
#
cmd_obj: gfortran -c -O3 -fconvert=big-endian -frecord-marker=4 <mods> <incs> <f95name>
cmd_lib: ar cru <libname> <objs>
cmd_exe: gfortran -fconvert=big-endian -frecord-marker=4 -v -o <exename> <objs> <libs>
#
# _____ ____________________________________
# ____/ Debian gfortran scalar debug /___________________________________/
[debgfortransdbg]
#
cmd_obj: gfortran -c -g -fbounds-check -Wall -fbacktrace -finit-real=nan -fconvert=big-endian -frecord-marker=4 <mods> <incs> <f95name>
cmd_lib: ar cru <libname> <objs>
cmd_exe: gfortran -fconvert=big-endian -frecord-marker=4 -v -o <exename> <objs> <libs>
#
# _____ ___________________________________
# ____/ Debian gfortran openMPI /__________________________________/
[debgfopenmpi]
#
par_cmdexec: <config>/bin/partel < PARTEL.PAR >> <partel.log>
#
mpi_cmdexec: /usr/bin/mpiexec -wdir <wdir> -n <ncsize> <exename>
mpi_hosts:
#
cmd_obj: /usr/bin/mpif90 -c -O3 -DHAVE_MPI -fconvert=big-endian -frecord-marker=4 <mods> <incs> <f95name>
cmd_lib: ar cru <libname> <objs>
cmd_exe: /usr/bin/mpif90 -fconvert=big-endian -frecord-marker=4 -lpthread -v -lm -o <exename> <objs> <libs>
#
mods_all: -I <config>
#
libs_partel: /usr/lib/libmetis.a
libs_all : /usr/lib/libmetis.a
#
# _____ ___________________________________
# ____/ Debian gfortran openMPI debug /__________________________________/
[debgfopenmpidbg]
#
par_cmdexec: <config>/partel < PARTEL.PAR >> <partel.log>
#
mpi_cmdexec: /usr/bin/mpiexec -wdir <wdir> -n <ncsize> <exename>
mpi_hosts:
#
cmd_obj: /usr/bin/mpif90 -c -g -fbounds-check -Wall -fbacktrace -finit-real=nan -DHAVE_MPI -fconvert=big-endian -frecord-marker=4 <mods> <incs> <f95name>
cmd_lib: ar cru <libname> <objs>
cmd_exe: /usr/bin/mpif90 -fconvert=big-endian -frecord-marker=4 -lpthread -v -lm -o <exename> <objs> <libs>
#
mods_all: -I <config>
#
incs_parallel: -I /usr/include/mpi/
libs_partel: /usr/lib/libmetis.a
libs_all : /usr/lib/libmetis.a
#



"""""""""""""""""""""""""""""""""
The administrator has disabled public write access.

Parallel implementation 8 years 2 weeks ago #24147

  • chelobarros
****UPDATE

I found the files mpifort-wrapper-data.txt and mpicc-wrapper-data.txt and changed the compiler= line to compiler=gfortran.
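For anyone hitting the same thing, the edit amounts to something like this (the exact path is an assumption; Open MPI installs the wrapper data files under its share directory, and the location varies by distro and version):

```
# e.g. /usr/lib/openmpi/share/openmpi/mpifort-wrapper-data.txt  (path varies by install)
# The line was blank; pointing it at gfortran tells the wrapper which compiler to drive:
compiler=gfortran
```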

Then the following error occurs.


Scanning the source code for:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+> configuration: debgfopenmpi
+> root: /home/chelobarros/v7p1r0
+> modules: clean system -dredgesim

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~



Compiling the program SPLITSEL and dependents
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- completed: .../v7p1r0/sources/utils/special/declarations_special.f
mpif.h:54: Error: Can't open included file 'mpif-config.h' ] 2% | 1s
[\\\ ] 4% | 1s

Hummm ... I could not complete my work.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

compileTELEMAC::createObjFiles:
+> failed: /usr/bin/mpif90 -c -O3 -DHAVE_MPI -fconvert=big-endian -frecord-marker=4 -I /home/chelobarros/v7p1r0/builds/debgfopenmpi/lib/utils/special -I /home/chelobarros/v7p1r0/builds/debgfopenmpi/lib/utils/hermes -I /home/chelobarros/v7p1r0/builds/debgfopenmpi/lib/utils/bief -I /home/chelobarros/v7p1r0/builds/debgfopenmpi/lib/utils/splitsel /home/chelobarros/v7p1r0/sources/utils/special/plante.F
... The following command failed for the reason above (or below)
/usr/bin/mpif90 -c -O3 -DHAVE_MPI -fconvert=big-endian -frecord-marker=4 -I /home/chelobarros/v7p1r0/builds/debgfopenmpi/lib/utils/special -I /home/chelobarros/v7p1r0/builds/debgfopenmpi/lib/utils/hermes -I /home/chelobarros/v7p1r0/builds/debgfopenmpi/lib/utils/bief -I /home/chelobarros/v7p1r0/builds/debgfopenmpi/lib/utils/splitsel /home/chelobarros/v7p1r0/sources/utils/special/plante.F:


Thanks again, and sorry for the trouble.

greetings,
chelo.

Parallel implementation 8 years 2 weeks ago #24148

  • josekdiaz
Chelo and Nemo,

Let's start from the top, assuming you have an Ubuntu/Debian-based Linux distro, and take the "root" path as:
/home/Your_username/opentelemac/v7p1

Make sure you have the correct dependencies for the compiling process.

This will install gfortran and OpenMPI:
sudo apt install gfortran openmpi-bin openmpi-doc
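If the compile step later complains about missing MPI headers (e.g. mpif-config.h), the development package is probably missing too; on Debian/Ubuntu the package name should be (my assumption for recent releases):

```shell
# Dev headers and libs for Open MPI (mpif.h, mpif-config.h, libmpi_*.so):
sudo apt install libopenmpi-dev
```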

Then test it in a console: get the version of the OpenMPI wrapper for gfortran from the terminal; something similar to this should appear:
jose@blitz:~$ mpifort --version
GNU Fortran (Ubuntu 5.4.0-6ubuntu1~16.04.2) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.

GNU Fortran comes with NO WARRANTY[.....]

Configuring the cfg file
Copy the attached file (a basic system cfg using OpenMPI) to the folder root_path/configs/ and edit that text file only in the following lines:
  • root: /home/jose/Documents/opentelemac/v7p1r1
  • libs_partel: /home/jose/Documents/opentelemac/v7p1r1/metis/libmetis.a

Change them according to your root_path. Make sure the "libmetis.a" file is available. If not, extract the attached metis folder to a similar path and point libs_partel to the extracted "[folder_extracted]/libmetis.a".

Edit your .bashrc to include opentelemac's scripts
gedit ~/.bashrc

Add the following lines at the end, changing the [YOUR_ROOT_PATH] placeholder:
export PATH="[YOUR_ROOT_PATH]/scripts/python27":$PATH

export SYSTELCFG="[YOUR_ROOT_PATH]/configs/systel.cis-ubuntu-parallel.cfg"
Save bashrc and in that terminal:
source ~/.bashrc

Compilation process

Just type:
compileTELEMAC.py

Now test your installation with a test case, and make sure the "PARALLEL PROCESSORS" keyword is in the steering file to see some CPU action.
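For example, in a copy of a bundled steering file, the parallel run is requested with a single keyword (the value 4 here is just an illustration; set it to the number of cores you want to use):

```
/ Steering (cas) file excerpt -- request a 4-way parallel run
PARALLEL PROCESSORS : 4
```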

Regards,

José Díaz.

EDIT:

Couldn't attach the zipped metis folder to the forum; here is a Google Drive link instead: METIS_FOLDER

Parallel implementation 8 years 2 weeks ago #24149

  • chelobarros
Hi Jose, thanks for helping us.

I did what you told me, and it finally compiled the MPI configuration of TELEMAC. But then I tried to run a working case: I have an 8-core computer, so I tried PARALLEL PROCESSORS = 3, and it gives me the following error.


=====================================================================
parsing configuration file: /home/chelobarros/v7p1r0/configs/systel.cis-ubuntu_paralelo.cfg


Running your CAS file for:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+> configuration: ubugfopenmpi
+> root: /home/chelobarros/v7p1r0


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


... reading the main module dictionary

... processing the main CAS file(s)
+> running in English

... handling temporary directories

... checking coupling between codes

... checking parallelisation

... first pass at copying all input files
copying: 01_malla_20.slf /home/chelobarros/Dropbox/TESIS/modelo_hidrodinamico/BLUEKENUE/CAS_20M/00_malla_20m.cas_2016-11-02-23h21min18s/T2DGEO
copying: 02_BOTTOM_BC_20m.cli /home/chelobarros/Dropbox/TESIS/modelo_hidrodinamico/BLUEKENUE/CAS_20M/00_malla_20m.cas_2016-11-02-23h21min18s/T2DCLI
copying: 03_embaulado.txt /home/chelobarros/Dropbox/TESIS/modelo_hidrodinamico/BLUEKENUE/CAS_20M/00_malla_20m.cas_2016-11-02-23h21min18s/T2DBUS
re-copying: /home/chelobarros/Dropbox/TESIS/modelo_hidrodinamico/BLUEKENUE/CAS_20M/00_malla_20m.cas_2016-11-02-23h21min18s/T2DCAS
copying: telemac2d.dico /home/chelobarros/Dropbox/TESIS/modelo_hidrodinamico/BLUEKENUE/CAS_20M/00_malla_20m.cas_2016-11-02-23h21min18s/T2DDICO

... checking the executable
re-copying: telemac2d /home/chelobarros/Dropbox/TESIS/modelo_hidrodinamico/BLUEKENUE/CAS_20M/00_malla_20m.cas_2016-11-02-23h21min18s/out_telemac2d

... modifying run command to MPI instruction

... modifying run command to PARTEL instruction

... partitioning base files (geo, conlim, sections and zones)
+> /home/chelobarros/v7p1r0/builds/ubugfopenmpi/bin/partel < PARTEL.PAR >> partel_T2DGEO.log
/home/chelobarros/v7p1r0/builds/ubugfopenmpi/bin/partel: symbol lookup error: /usr/lib/libmpi_mpifh.so.12: undefined symbol: mpi_conversion_fn_null_
runPartition:
|runPARTEL: Could not split your file T2DGEO (runcode=127) with the error as follows:
|
|... The following command failed for the reason above (or below)
|/home/chelobarros/v7p1r0/builds/ubugfopenmpi/bin/partel < PARTEL.PAR >> partel_T2DGEO.log
|
| You may have forgotten to compile PARTEL with the appropriate compiler directive
| (add -DHAVE_MPI to your cmd_obj in your configuration file).
|
|Here is the log:
==========================================================================

thanks again.

greetings,
chelo.

Parallel implementation 8 years 2 weeks ago #24150

  • yugi
Hi,

I would suggest replacing this in your systel.cfg:
incs_parallel:      -I /usr/lib/openmpi/include/
incs_special:       -I /usr/lib/openmpi/include/
libs_partel:      /home/jose/Documents/opentelemac/v7p1r1/metis/libmetis.a
libs_all       :    /usr/lib/openmpi/lib/libmpi.so

With this:
libs_all:      /home/jose/Documents/opentelemac/v7p1r1/metis/libmetis.a

And running compileTELEMAC.py --clean
There are 10 types of people in the world: those who understand binary, and those who don't.

Parallel implementation 8 years 2 weeks ago #24156

  • chelobarros
Hi, thanks everyone for helping.
When I compile with the systel file Jose gave us, it compiles fine, but I did have to change the mpifort wrapper file and specify "compiler=gfortran".

When I try PARALLEL PROCESSORS = 1 it works fine, but when I put n = 2 it gives me the error I mentioned.

I also did what yugi suggested, and then it won't compile from the start, with the following error.


=======================================================================
Scanning the source code for:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+> configuration: ubugfopenmpi
+> root: /home/chelobarros/v7p1r0
+> modules: system clean

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


+> build deleted!


Compiling the program SPLITSEL and dependents
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- completed: .../v7p1r0/sources/utils/special/declarations_special.f
mpif.h:54: Error: Can't open included file 'mpif-config.h' ] 2% | 1s
[\\\ ] 4% | 1s

Hummm ... I could not complete my work.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

compileTELEMAC::createObjFiles:
+> failed: mpifort -c -O3 -DHAVE_MPI -fconvert=big-endian -frecord-marker=4 -I /home/chelobarros/v7p1r0/builds/ubugfopenmpi/lib/utils/special -I /home/chelobarros/v7p1r0/builds/ubugfopenmpi/lib/utils/hermes -I /home/chelobarros/v7p1r0/builds/ubugfopenmpi/lib/utils/bief -I /home/chelobarros/v7p1r0/builds/ubugfopenmpi/lib/utils/splitsel /home/chelobarros/v7p1r0/sources/utils/special/plante.F
... The following command failed for the reason above (or below)
mpifort -c -O3 -DHAVE_MPI -fconvert=big-endian -frecord-marker=4 -I /home/chelobarros/v7p1r0/builds/ubugfopenmpi/lib/utils/special -I /home/chelobarros/v7p1r0/builds/ubugfopenmpi/lib/utils/hermes -I /home/chelobarros/v7p1r0/builds/ubugfopenmpi/lib/utils/bief -I /home/chelobarros/v7p1r0/builds/ubugfopenmpi/lib/utils/splitsel /home/chelobarros/v7p1r0/sources/utils/special/plante.F:


==================================================================

thanks again
chelo.

Parallel implementation 8 years 2 weeks ago #24159

  • yugi
I think the error is that you have multiple versions of MPI installed, and that the mpiexec you are using is not the one linked with mpifort.

Can you try running mpifort -show?
It should give more information on what is being used.
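A quick way to check for such a mismatch is something like the following (the printed paths will differ per machine; this is just a sketch):

```shell
# Show which underlying compiler, include paths and libs the Fortran wrapper drives:
mpifort -show
# Compare the install locations of the wrapper and the launcher;
# they should point into the same MPI installation:
which mpifort mpiexec
```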

Parallel implementation 8 years 2 weeks ago #24160

  • josekdiaz
Dear Chelo,

There are a couple of things I've noticed from your previous post:

What's that "compiler=gfortran" instruction that you had to pass to the cfg file or the terminal?

If the goal of that argument is to override the compilation instruction, switching from mpifort (or its now-deprecated mpif90 equivalent) to plain gfortran, you are pretty much bypassing the cfg file I posted before. Please read about the differences between plain gfortran and mpifort via the link in my previous post.

I strongly suggest not overriding cfg options from the terminal if you don't feel confident about the procedure; edit the cfg instead to avoid typos or mixed options.

Editing the cfg file as Yoann suggested

I assume what Yoann suggested is a cleaner way of handling the include calls: since he assumes you are using "mpifort" or "mpif90", the include calls to the OpenMPI libs are redundant (I will definitely edit my cfg that way, thanks!) and only the necessary paths need to be listed in the libs_all instruction. Either way should work, though!

You are not testing with any bundled case!

Please pick a bundled case, e.g. bumpflu or malpasset-large in the telemac2d folder, add the PARALLEL PROCESSORS instruction to the cas file (if it doesn't have it) and run it.

If you insist on using your thesis model as input to test a general-purpose compilation, we can't reproduce any issue you might have, and you will be on your own in that regard.


Delete any temporary folders/TELEMAC executables between runs

If you are testing compilers, options, builds, etc., and prior files or folders aren't automatically deleted/replaced with new ones, I strongly suggest you manually delete them before running anything new.

You are not using the latest version of openTELEMAC

I'm sure the latest version among the SVN tag releases is not "v7p0"; I advise you to download the sources of the latest tagged build from the SVN.
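Getting the sources from a tag looks something like this (the repository URL and tag name here are assumptions; check the official download page for the current ones):

```shell
# Check out a tagged release of the TELEMAC-MASCARET sources
# (URL and tag name hypothetical -- adjust to the latest tag):
svn co http://svn.opentelemac.org/svn/opentelemac/tags/v7p2r0 ~/opentelemac/v7p2r0
```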

Regards,

José Díaz.

Parallel implementation 8 years 2 weeks ago #24162

  • chelobarros
Hi, thanks Jose and yugi, I will do these things and report back.
The compiler=gfortran was something I had to write in the mpifort wrapper text file, because it was originally blank and it gave me the error that no compiler was set in that file.

I will try all of this and tell you how it goes. Thanks, everybody.

greetings,
chelo.

Parallel implementation 8 years 2 weeks ago #24192

  • sous
Hello everybody !

First of all, a huge thank-you to all the contributors of this forum; you're helping a lot!

I'm trying to install and run T3D in its parallel version (with OpenMPI) on Ubuntu (14.04). I've followed the recommended procedure and the forum's advice, and the T3D compilation now seems OK.

As a first test, I tried the t3d_malpasset cases. They work well in scalar mode, or in parallel mode using ncsize=0 or 1, but crash when I use ncsize=2 or greater, with the following message:

PLANTE : ARRET DU PROGRAMME APRES ERREUR
RETURNING EXIT CODE: 2
Primary job terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.

mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

Process name: [[18026,1],0]
Exit code: 1
_____________
runcode::main:
:
|runCode: Fail to run
|/usr/local/bin/mpiexec -wdir /media/DATA/RECHERCHE/TELEMAC/examples/telemac3d/malpasset/t3d_malpasset_p2.cas_2016-11-08-12h10min13s -n 2 /media/DATA/RECHERCHE/TELEMAC/examples/telemac3d/malpasset/t3d_malpasset_p2.cas_2016-11-08-12h10min13s/out_t3d_malpasset
|~~~~~~~~~~~~~~~~~~

Has anyone had the same problem?

Any help or suggestion would be very welcome !

Damien

PS: My config file is attached.

File Attachment:

File Name: systel.dam_OpenMPI.cfg
File Size: 2 KB