TOPIC: Problem with MPI/Partel

Problem with MPI/Partel 12 years 7 months ago #4285

  • YannV
Hi all,
I just installed version 6p1 of Telemac using the Python scripts, unlike what I used to do.
I tested this installation on one of my cases in scalar mode and it seems to work fine, but when I tried it in parallel mode things went wrong...

First, I tried to run my case with the old installation of MPICH2 (v1.2), and the console returned the following message:
"smpd version mismatch"

So I installed a newer version of MPICH2 (v1.4) for IA32 processors (my computer has two Intel Xeon quad-core CPUs). The first message has disappeared, but now the command window stops printing messages after launching mpiexec, there is no CPU activity, and the partel_T2GEO.log file shows the following message:
"Fatal Error: This program was not built to run on the processor in your system.
The allowed processors are: Intel(R) processors with SSE4.2 and POPCNT instructions support."

I think the MPICH version I chose is the right one, as the other choice is for EM64T/AMD64 processors and I don't think that corresponds to my CPU.

Does anyone have an idea that could help me?
Thanks.
YV.

Re: Problem with MPI/Partel 12 years 7 months ago #4297

  • c.coulet
  • Moderator
Hi

As I understand it, the partel version you have was compiled with tuning for a specific architecture and/or processor. Did you compile Telemac yourself on your computer? As far as I know, the default compilation parameters are not tuned for a specific processor or architecture.

Another possibility is linked to the Intel Fortran compiler (if you have it). If your version was installed with IMPI (Intel MPI), I have already seen conflicts there, because smpd and mpiexec can exist twice (once in the MPICH installation and once in the Intel installation), and the two are not compatible...
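One way to check for the duplicate smpd/mpiexec situation described here is to list every directory on PATH that contains those executables. The following is only a minimal sketch: the function name `find_on_path` and the choice of looking for a `.exe` suffix are assumptions made for illustration, not part of the Telemac scripts.

```python
import os
from collections import defaultdict


def find_on_path(names, path=None, exts=(".exe", "")):
    """Return {name: [PATH directories that contain that executable]}."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = defaultdict(list)
    seen = set()
    for directory in path.split(os.pathsep):
        directory = directory.strip('"')
        if not directory:
            continue
        # Skip directories that appear more than once on PATH.
        key = os.path.normcase(os.path.abspath(directory))
        if key in seen:
            continue
        seen.add(key)
        for name in names:
            for ext in exts:
                if os.path.isfile(os.path.join(directory, name + ext)):
                    hits[name].append(directory)
                    break
    return dict(hits)


if __name__ == "__main__":
    for name, dirs in find_on_path(["mpiexec", "smpd"]).items():
        status = "found in several places" if len(dirs) > 1 else "ok"
        print(f"{name}: {status}")
        for d in dirs:
            print(f"  {d}")
```

If either name is reported in more than one directory, the first directory listed is the one a bare `mpiexec` or `smpd` command would pick up.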

Hope this helps
Christophe

Re: Problem with MPI/Partel 12 years 7 months ago #4301

  • YannV
Hello,

Thanks for your answer. I do use the Intel compiler (version 10), but not with a bundled version of MPI. I have searched all my hard drives for files named mpiexec or smpd, and the only ones present are those I installed myself.

I have seen that the PATH variable contains a reference to a directory of EM64T-specific libraries. I tried recompiling Telemac after removing this reference, but it didn't solve my problem...

Is there a specific message at the end of the compilation? I don't get any message indicating that the compilation finished correctly, and I wonder whether it really succeeded.

Y.

Re: Problem with MPI/Partel 12 years 7 months ago #4302

  • c.coulet
  • Moderator
Hi
If you use the Python scripts for compilation, you should get a message like "works is done" at the end of the compilation.

Maybe you could attach the configuration file you modified in order to allow us to check it.

Regards
Christophe

Re: Problem with MPI/Partel 12 years 7 months ago #4304

  • YannV
I do use Python for the installation. Here is my cfg file:
# _____                              _______________________________
# ____/ TELEMAC Project Definitions /______________________________/
#
[Configurations]
configs:    wintels wintelmpi
#
# _____                       ______________________________________
# ____/ windows intel scalar /_____________________________________/
[wintels]
#
root:       C:\opentelemac\v6p1
version:    v6p1
language:   2
modules:    update system
options:
#
cmd_obj:    ifort.exe /c /Ot /iface:cref /iface:nomixed_str_len_arg /nologo /names:uppercase /convert:big_endian /extend_source:132 <mods> <incs> <f95name>
cmd_lib:    xilib.exe /nologo /out:<libname> <objs>
cmd_exe:    xilink.exe /nologo /subsystem:console /stack:536870912 /out:<exename> <objs> <libs>
#
mods_all:   /include:<config>
#
val_dir:    validation
val_exe:
#
sfx_zip:    .zip
sfx_lib:    .lib
sfx_obj:    .obj
sfx_mod:    .mod
sfx_exe:    .exe
#
# _____                         ____________________________________
# ____/ windows intel parallel /___________________________________/
[wintelmpi]
#
root:       C:\opentelemac\v6p1
version:    v6p1
language:   2
modules:    update system
#
options:    parallel mpi
#mpi_hosts:  -mapall
mpi_hosts:  -localonly
mpi_cmdexec:   C:\opentelemac\mpich2\bin\mpiexec.exe <wdir> <ncsize> <hosts> <exename>
#
cmd_obj:    ifort.exe /c /Og /QxHost /iface:cref /iface:nomixed_str_len_arg /nologo /names:uppercase /convert:big_endian /extend_source:132 <mods> <incs> <f95name>
cmd_lib:    xilib.exe /nologo /out:<libname> <objs>
cmd_exe:    xilink.exe /nologo /subsystem:console /stack:536870912 /nodefaultlib:libc.lib /out:<exename> <objs> <libs>
#
mods_all:   /include:<config>
#
incs_parallel:      /include:C:\opentelemac\mpich2\include
libs_parallel:      C:\opentelemac\lib\metis32.lib
libs_all     :      C:\opentelemac\mpich2\lib\fmpich2.lib
#
sfx_zip:    .zip
sfx_lib:    .lib
sfx_obj:    .obj
sfx_mod:    .mod
sfx_exe:    .exe
#

Re: Problem with MPI/Partel 12 years 7 months ago #4305

  • sebourban
  • Administrator
  • Principal Scientist
Hello,

You need to make sure that the following keys point to the correct installation.

incs_parallel: /include:C:\opentelemac\mpich2\include
libs_parallel: C:\opentelemac\lib\metis32.lib
libs_all : C:\opentelemac\mpich2\lib\fmpich2.lib

Also make sure that you have an appropriate metis library (otherwise PARTEL will fail to compile).

Once the configuration is set, you need to recompile the system. To make sure everything is rebuilt, please change "update" to "clean" in the "modules" key of your wintelmpi configuration.
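The checks suggested in this reply can be scripted roughly as follows. This is a sketch under assumptions: it reads the cfg text with Python's standard `configparser` (not with Telemac's own parser), the helper name `check_parallel_config` is made up for illustration, and it assumes each of the three keys holds a single path, as in the file posted above.

```python
import configparser
import os


def check_parallel_config(cfg_text, section="wintelmpi"):
    """Return a list of human-readable problems found in one configuration."""
    parser = configparser.ConfigParser(interpolation=None, comment_prefixes=("#",))
    parser.read_string(cfg_text)
    cfg = parser[section]
    problems = []

    # A full rebuild needs "clean" rather than "update" in the modules key.
    modules = cfg.get("modules", "")
    if "clean" not in modules.split():
        problems.append("modules does not contain 'clean': " + modules)

    # Each of these keys should point at a file or directory that exists.
    for key in ("incs_parallel", "libs_parallel", "libs_all"):
        value = cfg.get(key, "").strip()
        if value.lower().startswith("/include:"):
            value = value[len("/include:"):]
        if not value:
            problems.append(key + " is not set")
        elif not os.path.exists(value):
            problems.append(key + " points to a missing path: " + value)
    return problems
```

Calling `check_parallel_config(open("your.cfg").read())` (the file name is a placeholder) returns an empty list when the modules key contains "clean" and all three paths exist.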

Hope this helps,

Sébastien.

Re: Problem with MPI/Partel 12 years 7 months ago #4316

  • YannV
It's not going better...
I have three answers/questions/remarks:

1. I took the metis library (metis32.lib) from the metislib.zip file in the download section of the opentelemac website. I think this is the right one, isn't it?

2. I tried to compile Telemac with "modules: clean system" and it didn't work. I got the following error:
fortcom: Fatal: There has been an internal compiler error (C0000005).
compilation aborted for nodalf_pugh.f (code 1)
Nevertheless, the very first compilation worked yesterday...

3. I often get the following warning while compiling:
ifort: command line warning #10130: unknown extension 'o' ignored in option '/Qx'
ifort: command line warning #10130: unknown extension 's' ignored in option '/Qx'
ifort: command line warning #10130: unknown extension 't' ignored in option '/Qx'
It comes from the following command:
ifort.exe /c /Og /QxHost...
Looking at the help page of the ifort command, one can find the following:
/Qx<codes>  generate specialized code to run exclusively on processors indicated by <codes> as described below
    K  Intel Pentium III and compatible Intel processors
    W  Intel Pentium 4 and compatible Intel processors
    N  Intel Pentium 4 and compatible Intel processors.  Enables new optimizations in addition to Intel processor-specific optimizations
    P  Intel(R) Core(TM) processor family with Streaming SIMD Extensions 3 (SSE3) instruction support
    T  Intel(R) Core(TM)2 processor family with SSSE3
    O  Intel(R) Core(TM) processor family.  Code is expected to run properly on any processor that supports SSE3, SSE2 and SSE instruction sets
    S  Future Intel processors supporting SSE4 Vectorizing Compiler and Media Accelerator instructions
That means this option is not being interpreted properly, so the CPU type is certainly not being determined correctly during compilation.
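Warnings like the ones quoted in this post are easy to miss in a long build log. As a rough sketch, they could be collected per option with a regular expression; the pattern below is written only against the three warning lines quoted above, and the function name `ignored_extensions` is an assumption for illustration.

```python
import re
from collections import defaultdict

# Pattern modelled on the warning lines quoted in the post.
WARNING = re.compile(
    r"ifort: command line warning #(?P<num>\d+): "
    r"unknown extension '(?P<ext>[^']*)' ignored in option '(?P<opt>[^']*)'"
)


def ignored_extensions(log_text):
    """Map each option to the list of extensions the compiler ignored for it."""
    ignored = defaultdict(list)
    for line in log_text.splitlines():
        match = WARNING.search(line)
        if match and match.group("ext") not in ignored[match.group("opt")]:
            ignored[match.group("opt")].append(match.group("ext"))
    return dict(ignored)


log = """\
ifort: command line warning #10130: unknown extension 'o' ignored in option '/Qx'
ifort: command line warning #10130: unknown extension 's' ignored in option '/Qx'
ifort: command line warning #10130: unknown extension 't' ignored in option '/Qx'
"""
print(ignored_extensions(log))  # {'/Qx': ['o', 's', 't']}
```

Any option that shows up in the result was only partly understood by the compiler, which is a hint to compare it against the help page of the installed compiler version.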

Re: Problem with MPI/Partel 12 years 7 months ago #4319

  • c.coulet
  • Moderator
Hi
The metis libraries available on the website are the right ones for Windows.
Both the 32-bit and 64-bit versions were compiled with a C compiler and work well with Intel Fortran Compiler 11 when I compile partel.

The compilation error is strange. I have no idea where it comes from.

The warning does indeed come from the /QxHost option. It is a new option in the latest Intel compiler that tells it to optimise the compilation automatically for the processor of the build machine. This option probably doesn't exist in Intel 10. I think you should remove it from the config file.

Hope this helps
Christophe

Re: Problem with MPI/Partel 12 years 7 months ago #4320

  • YannV
Hi,

I have changed /QxHost to /QxN, which seems to be the right option for both my compiler and my CPU, and:

1. The warnings have disappeared, and I understand why.
2. The compiler error has disappeared, and I don't understand why, as it worked yesterday with the old option.
3. And... parallel computations work!!! And that's great!
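For anyone applying the same change to their own cfg file, the edit amounts to swapping the /Qx switch on the cmd_obj lines. A minimal sketch, assuming the cfg is handled as plain text; the helper name `replace_qx_option` is hypothetical, and /QxN is only the value that worked in this thread, so the right code depends on your compiler version and CPU.

```python
import re


def replace_qx_option(cfg_text, new_option="/QxN"):
    """Replace any /Qx... switch on cmd_obj lines with new_option."""
    out = []
    for line in cfg_text.splitlines():
        if line.lstrip().startswith("cmd_obj"):
            # The lambda keeps new_option literal (no backslash escapes).
            line = re.sub(r"/Qx\w*", lambda m: new_option, line)
        out.append(line)
    # Preserve the trailing newline if the input had one.
    return "\n".join(out) + ("\n" if cfg_text.endswith("\n") else "")


before = "cmd_obj:    ifort.exe /c /Og /QxHost /iface:cref <mods> <incs> <f95name>\n"
print(replace_qx_option(before), end="")
```

Lines other than cmd_obj are left untouched, and a cmd_obj line without a /Qx switch (as in the scalar configuration above) passes through unchanged. After editing, a full rebuild is still needed for the new switch to take effect.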

I now just have to solve my SVN problem to have a fine installation...

Thanks for your help.