TOPIC: mpich error

Re:mpich/libmetis error 13 years 8 months ago #1256

  • c.coulet
  • Moderator
  • Posts: 3722
  • Thank you received: 1031
Hi

No, parallel processors = 0 is equivalent to running in scalar mode.
You should set parallel processors = 1. That means you run in parallel but with only one processor. The run command is then the same as for a parallel execution.

Regards
Christophe
The administrator has disabled public write access.

Re:mpich/libmetis error 13 years 8 months ago #1257

  • ails
  • Senior Boarder
  • Posts: 140
  • Thank you received: 17
John,

- Can you try manually with ./out28473... (unless you are on a cluster):
mpirun -machinefile mpirun.txt -np 16 ./out28473_intel64.exe

- Or add the "." directory to your PATH: export PATH=.:$PATH

This kind of PATH issue has been reported before.

Fabien
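
A minimal sketch of why the bare name fails: a program name without a slash is resolved through PATH only, so an executable in the working directory is not found unless "." is on PATH or the name is written as ./name. The helper found_via_path and the file out_demo.exe are hypothetical names used only for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a bare program name can be resolved
# through the given search path (the same lookup rule execvp applies).
found_via_path() {
    # $1 = colon-separated search path, $2 = bare executable name
    ( PATH="$1"; command -v "$2" >/dev/null 2>&1 )
}

# Create a throwaway executable in a scratch directory and cd into it.
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho ran\n' > "$demo_dir/out_demo.exe"
chmod +x "$demo_dir/out_demo.exe"
cd "$demo_dir" || exit 1

# Without "." on the search path the bare name is not found ...
if found_via_path "/usr/bin:/bin" out_demo.exe; then without_dot=found; else without_dot=missing; fi
# ... with "." prepended it is.
if found_via_path ".:/usr/bin:/bin" out_demo.exe; then with_dot=found; else with_dot=missing; fi

echo "without '.': $without_dot"
echo "with '.':    $with_dot"
```

Writing ./out_demo.exe explicitly bypasses the PATH lookup altogether, which is why the first suggestion above works without touching PATH.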

Re:mpich/libmetis error 13 years 8 months ago #1258

  • olslewfoot
  • Senior Boarder
  • Posts: 132
  • Thank you received: 3
Hi

Setting Parallel Processors=1 results in the same error:
*** MPI MACHINE ***
MPI machine ok (with 1 processors).
______________________________________________________
*** RUNNING ***

MPI launcher : /export/apps/mpich2/bin/mpirun -machinefile mpirun.txt -n 1 out9580_intel64.exe
[proxy:0:0@deepblue.corp.cefas.co.uk] HYDU_create_process (./utils/launch/launch.c:69): execvp error on file out9580_intel64.exe (No such file or directory)
I reverted to the original setup and checked the mpirun.txt file and the other paths.
I think the problem must be with MPICH2?


Regards
John

Re:mpich/libmetis error 13 years 8 months ago #1261

  • c.coulet
  • Moderator
  • Posts: 3722
  • Thank you received: 1031
OK
You could try the solution given by Fabien: add ./ before the executable name.

Regards
Christophe

Re:mpich/libmetis error 13 years 8 months ago #1262

  • olslewfoot
  • Senior Boarder
  • Posts: 132
  • Thank you received: 3
Hi all

By adding the ./ I do get a different error.

mpirun -machinefile mpirun.txt -np 16 ./out28473_intel64.exe
Fatal error in PMPI_Comm_rank: Invalid communicator, error stack:
PMPI_Comm_rank(109): MPI_Comm_rank(comm=0x0, rank=0xd1ea20) failed
PMPI_Comm_rank(66).: Invalid communicator

This must indicate that the mpd ring is not being established, as I believe MPI_Comm_rank is part of the comm handle.

A question: in mpi_telemac.conf I have used hostname.domain as "nom_du_host". Is this correct, or should it be just hostname?

I think a rebuild of mpich2 is necessary, or a further attempt with Open MPI (with correct linking options this time!)
:unsure:
John

Re:mpich/libmetis error 13 years 7 months ago #1267

  • ails
  • Senior Boarder
  • Posts: 140
  • Thank you received: 17
Hi,

Try "uname -n". It will give you the correct hostname (as long as you're not working on a cluster or through a workload scheduler).

Regards,

Fabien
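
To compare the two candidate spellings from the question above, the sketch below prints the node name reported by uname -n and its short form with the domain stripped. Which of the two mpi_telemac.conf expects is not stated in this thread; short_hostname is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: strip everything from the first dot onwards,
# turning hostname.domain into hostname.
short_hostname() {
    printf '%s\n' "${1%%.*}"
}

node=$(uname -n)            # name this machine reports for itself
echo "uname -n : $node"
echo "short    : $(short_hostname "$node")"

# Example with the host name that appears in the error log earlier in the thread.
short_hostname deepblue.corp.cefas.co.uk
```

If the two printed values differ, the entry in mpi_telemac.conf can be checked against each of them in turn.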

Re:mpich error 12 years 11 months ago #3156

  • nhuybrec
  • Fresh Boarder
  • Posts: 11
Hi,

We have the same error message, for example:

apl/soft/LIB/mpich2-1.4.1p1/lib/libmpich.a(param_vals.o): In function `MPIR_Param_init_params':
param_vals.c:(.text+0x5f7): undefined reference to `MPL_env2int'
param_vals.c:(.text+0x60f): undefined reference to `MPL_env2int'
param_vals.c:(.text+0x627): undefined reference to `MPL_env2bool'
param_vals.c:(.text+0x63f): undefined reference to `MPL_env2bool'
param_vals.c:(.text+0x657): undefined reference to `MPL_env2bool'
param_vals.c:(.text+0x66f): undefined reference to `MPL_env2bool'
param_vals.c:(.text+0x687): undefined reference to `MPL_env2bool'

We apparently have to change our mpi version. Could you tell us which version we should use?

As asked by Fabien, we have added the settings (FC_MPI, LK_MPI, LIBS_MPI, RUN_MPI) to the systel.ini file:


FC_MPI="/apl/soft/LIB/mpich2-1.4.1p1/bin/mpif90 "
LK_MPI="/apl/soft/LIB/mpich2-1.4.1p1/bin/mpif90 -o <EXE> <OBJS> <LIBS> "
LIBS_MPI="-L /apl/soft/LIB/mpich2-1.4.1p1/lib -lmpich -lmpi -lpthread -lstdc++ -lz "
RUN_MPI="/apl/soft/LIB/mpich2-1.4.1p1/bin/mpirun -machinefile mpirun.txt -np <N> <EXE>"

Moreover, when we type "mpif90 -show", we get the following message:
[root@matrics sources]# /apl/soft/LIB/mpich2-1.4.1p1/bin/mpif90 -show
/apl/soft/intel/fce/10.1.021/lib/for_main.o: In function `main':
/export/users/nbtester/efi2linuxx86_nightly/branch-10_1/20081028_000000/libdev/frtl/src/libfor/for_main.c:(.text+0x26): undefined reference to `MAIN__'

Would you have an idea how to solve this?
Regards,
Nicolas
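
The MPL_env2int and MPL_env2bool symbols appear to come from MPICH2's separate libmpl library, and the LIBS_MPI line above does not list -lmpl. Treat that as an assumption to verify against your own install, for example by listing the files in the mpich2 lib directory. The sketch below only checks a LIBS_MPI string for given flags; missing_libs is a hypothetical helper, and -lmpl and -lopa are the flags assumed to be needed.

```shell
#!/bin/sh
# Hypothetical helper: print the required -l flags that are absent
# from a LIBS_MPI string.
missing_libs() {
    # $1 = LIBS_MPI string; remaining arguments = flags to look for
    libs=" $1 "          # pad with spaces so whole flags can be matched
    shift
    out=""
    for flag in "$@"; do
        case "$libs" in
            *" $flag "*) ;;                 # flag present, nothing to report
            *) out="$out $flag" ;;          # flag absent, remember it
        esac
    done
    echo "${out# }"      # drop the leading separator
}

# LIBS_MPI as posted above, checked for the two assumed flags.
missing_libs "-L /apl/soft/LIB/mpich2-1.4.1p1/lib -lmpich -lmpi -lpthread -lstdc++ -lz " -lmpl -lopa
# prints: -lmpl -lopa
```

An empty line of output means every flag that was asked for is already on the link line.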

Re:mpich error 12 years 11 months ago #3157

  • olslewfoot
  • Senior Boarder
  • Posts: 132
  • Thank you received: 3
Hi Nicolas

I solved the problem by using mpd as the process manager rather than hydra, which is the default process manager for mpich2.
I did this by calling mpirun rather than mpiexec in the RUN_MPI command.
The following extract from systel.ini shows the settings:

FC_MPI="/export/apps/s/mpich2/bin/mpif90"
LK_MPI="/export/apps/s/mpich2/bin/mpif90 -o <EXE> <OBJS> <LIBS> "
#LIBS_MPI="-L /export/apps/openmpi/lib -lmpi -lm -lz -lstdc++ "
LIBS_MPI="-L /export/apps/s/mpich2/lib -lmpichf90 -lmpichf90 -lmpich -lopa -lmpl -lrt -lpthread"
RUN_MPI="/export/apps/s/mpich2/bin/mpirun -n <N> ./<EXE>"
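
The <N> and <EXE> placeholders in RUN_MPI are filled in at launch time, as the "MPI launcher" line in the log earlier in this thread shows. The following is only an illustration of that substitution using sed, not TELEMAC's own code; expand_run_mpi is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: fill the <N> and <EXE> placeholders of a RUN_MPI
# template. Values containing "|" or "&" would need escaping for sed.
expand_run_mpi() {
    # $1 = RUN_MPI template, $2 = number of processes, $3 = executable name
    printf '%s\n' "$1" | sed -e "s|<N>|$2|g" -e "s|<EXE>|$3|g"
}

expand_run_mpi "/export/apps/s/mpich2/bin/mpirun -n <N> ./<EXE>" 16 out28473_intel64.exe
# prints: /export/apps/s/mpich2/bin/mpirun -n 16 ./out28473_intel64.exe
```

Because the template already contains ./<EXE>, the expanded command does not depend on "." being in PATH.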

This means that I do have to set up an mpd ring using mpdboot before launching a parallel simulation. An example of my mpdboot command, using 8 slave nodes, is below.

mpdboot -n 9 --maxbranch=8 -v &


I believe my problems were caused by conflicts on the mpich2 install from other systems running on our cluster. So I found this to work very well.
It also will run independently of other users who may be running mpiexec.

So having got this to work successfully - I have stayed with this method.

Hope this is of use.
Cheers
John
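
In the mpdboot example above, -n 9 with 8 slave nodes suggests that the count includes the launching host. Under that assumption, the command line can be derived from the number of remote nodes. mpdboot_cmd is a hypothetical helper that only prints the command; it does not start anything.

```shell
#!/bin/sh
# Hypothetical helper: build an mpdboot command line like the one above
# from the number of remote (slave) nodes.
mpdboot_cmd() {
    # $1 = number of remote nodes
    workers=$1
    total=$((workers + 1))   # remote nodes plus the local host (assumption)
    echo "mpdboot -n $total --maxbranch=$workers -v"
}

mpdboot_cmd 8
# prints: mpdboot -n 9 --maxbranch=8 -v
```

Once the ring is up, mpdtrace should list the hosts that joined it, and mpdallexit shuts the ring down again after the run.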
Moderators: borisb