TOPIC: MUMPS solver issue with current trunk (rev9892)

MUMPS solver issue with current trunk (rev9892) 7 years 5 months ago #26861

  • cyamin
  • openTELEMAC Guru
  • Posts: 997
  • Thank you received: 234
Hello all,

I tried to run ARTEMIS with the MUMPS solver after upgrading to the latest trunk, and the computation failed to start. I checked with example chwac1 (solver=9) and it failed. The same check with v7p2r1 works.

Can someone verify this error on their installation?

Best Regards,
Costas
The administrator has disabled public write access.

MUMPS solver issue with current trunk (rev9892) 7 years 4 months ago #26992

  • cyamin
Hello all,

Has anyone had a look at the MUMPS issue? I can provide more details if this is verified and not a problem with my particular MUMPS installation (Windows, MinGW, MS-MPI).

I would appreciate any help.

Best Regards,
Costas

MUMPS solver issue with current trunk (rev9892) 7 years 4 months ago #26993

  • sebourban
  • Administrator
  • Principal Scientist
  • Posts: 814
  • Thank you received: 219
Hello Costas,

The nightly validation of the trunk does not find any issues (on 3 Linux flavours) for ARTEMIS, including MUMPS.

So it may be an issue with your Windows install?

By the way, I would be very interested in your settings and configuration for Windows (with gfortran) to help with the public distribution of the system, as well as with adding a Windows slave to the mix of nightly validation. Do you still use Intel Fortran or are you using gfortran?

Thanks,
Sébastien.
The following user(s) said Thank You: cyamin

MUMPS solver issue with current trunk (rev9892) 7 years 4 months ago #26994

  • cyamin
Hello Sebastien,

Thank you for looking into this. Which of the test cases you validate against use the MUMPS solver? I am puzzled because the same installation works with v7p2r1 and used to work with the trunk up until at least rev9733, which I used before updating to the latest one.

I am still using gfortran/mingw64, but with MS-MPI instead of the MPICH2 that most Windows users would use. However, MS-MPI adds complexity compared to MPICH2. I remain at your disposal if you would like to go into more detail.

Regards,
Costas

MUMPS solver issue with current trunk (rev9892) 7 years 4 months ago #26995

  • sebourban
You can have a look at the validation console on Jenkins:

http://cis.opentelemac.org/job/Trunk_2-run-test_linux-matrix/opensuse

Take test case bj78, for example. If you look into the XML file (bj78_animated.xml), you will see that the serial mode runs with SOLVER=8 and the parallel mode with SOLVER=9:

  <action xref="1"
      do="translate;run;cas;princi"
      set='SOLVEUR=8'
      code="artemis" target="art_bj78_animated.cas"
      title="animated bj78 scalar mode"
  />
  <action xref="2"
      do="translate;run;cas;princi" ncsize="4"
      set='SOLVEUR=9|PARALLEL PROCESSORS=4'
      code="artemis" target="art_bj78_animated.cas"
      title="animated bj78 parallel mode"
  />

You did not change compilers in between? (If you did, you may have to recompile MUMPS with exactly the same compiler as you are using with TELEMAC.)

Hope this helps,

Sébastien.
The following user(s) said Thank You: cyamin

MUMPS solver issue with current trunk (rev9892) 7 years 4 months ago #26996

  • cyamin
In fact, I did try to compile the latest MUMPS version and updated the MSYS platform in the process, which updated mingw64. The latest MUMPS failed, but I never got around to recompiling the old one, so you may be right. I will get back to you once MUMPS is recompiled.
Costas

MUMPS solver issue with current trunk (rev9892) 7 years 4 months ago #26997

  • cyamin
Update:
Recompiling did not make any difference, but here is what I discovered.
Issuing the following command:
artemis.py -f O:\trunk\configs\systel_gfmsmpi.cfg -c gfmsmpi_i7 --ncsize=4 art_bj78.cas > output.txt
gave this error:
================================================================================
       PERIOD   1/ 5 :       1.1915 SECONDS

 PHBOR: BOUNDARY   2 IS INCIDENT - FIRST IMPACT NODE      1[     -7.0000,      0.0000]

 LINEAR SYSTEM SOLVING (SOLVE)

 MUMPS ARE NOT AVAILABLE FOR SEQUENTIAL RUNS,
 USE SEQUENITAL DIRECT SOLVER (SOLVER = 8)

job aborted:
[ranks] message

[0-1] terminated

[2] application aborted
aborting MPI_COMM_WORLD (comm=0x44000000), error 2, comm rank 2

[3] application aborted
aborting MPI_COMM_WORLD (comm=0x44000000), error 2, comm rank 3

---- error analysis -----
This appeared strange because I had asked for 4 processors, not a sequential run. Uncommenting and setting "PARALLEL PROCESSORS : 4" within the cas file solved the issue, which means that for some reason the ncsize argument on the command line is not taken into account.
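For anyone who wants to check their own steering files for the same condition, here is a minimal Python sketch that reports whether a cas file actively sets PARALLEL PROCESSORS. It is not taken from the TELEMAC scripts: the function name parallel_processors is hypothetical, and the sketch assumes that a keyword is written as "KEYWORD : value" or "KEYWORD = value" on one line and that a leading "/" comments the line out.

```python
import re

# Assumed syntax: "PARALLEL PROCESSORS : 4" or "PARALLEL PROCESSORS = 4".
_KEYWORD = re.compile(r"^\s*PARALLEL PROCESSORS\s*[:=]\s*(\d+)", re.IGNORECASE)


def parallel_processors(cas_text):
    """Return the active PARALLEL PROCESSORS value in cas_text, or None."""
    value = None
    for line in cas_text.splitlines():
        if line.lstrip().startswith("/"):
            continue  # assumed: a leading "/" comments the line out
        match = _KEYWORD.match(line)
        if match:
            value = int(match.group(1))  # the last active definition wins
    return value


# Illustrative steering-file fragments, not the real art_bj78.cas content.
commented = "/PARALLEL PROCESSORS : 4\nSOLVER : 9\n"
active = "PARALLEL PROCESSORS : 4\nSOLVER : 9\n"

print(parallel_processors(commented))  # None
print(parallel_processors(active))     # 4
```

A result of None together with --ncsize=4 on the command line corresponds to the situation described above, where the run is treated as sequential.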

I have made some changes to the runcode.py script, but they are limited to the HPCSTDIN-related section and to reading the dictionary file from a UNC path. I cannot imagine how that could interfere.

Any ideas?
Costas

MUMPS solver issue with current trunk (rev9892) 7 years 4 months ago #27000

  • sebourban
Do you have -DHAVE_MPI -DHAVE_MUMPS in your configuration?
You may have to do a --clean if you just added them.

Sébastien.

MUMPS solver issue with current trunk (rev9892) 7 years 4 months ago #27001

  • cyamin
There is no problem with the compilation. The problem is that the recent trunk needs PARALLEL PROCESSORS to be defined explicitly in the cas file. This was not the case in the past, when all it needed was the ncsize argument.
Your validation XML file both sets the ncsize parameter and enables PARALLEL PROCESSORS in the cas file, so it works. I believe that if you remove PARALLEL PROCESSORS from the cas file, the MUMPS validation will fail.
Could it be that a recent change in runcode.py prevents ncsize from being written to the RUNCAS file in the working directory?
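To make the expected behaviour concrete, here is a rough Python sketch of what I would expect the launcher to do with ncsize when it prepares the steering file for the working directory. It is not the actual runcode.py code: apply_ncsize is a hypothetical helper, and the "KEYWORD : value" syntax and the leading "/" comment marker are assumptions.

```python
import re

# Assumed syntax for an active (uncommented) keyword line.
_ACTIVE = re.compile(r"^(\s*PARALLEL PROCESSORS\s*[:=]\s*)\d+", re.IGNORECASE)


def apply_ncsize(cas_text, ncsize):
    """Return cas_text with PARALLEL PROCESSORS forced to ncsize."""
    lines = cas_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        if _ACTIVE.match(line):
            # Keep the keyword and separator, swap only the number.
            lines[i] = _ACTIVE.sub(lambda m: m.group(1) + str(ncsize), line, count=1)
            replaced = True
    if not replaced:
        # No active definition (missing or commented out): append one.
        lines.append("PARALLEL PROCESSORS : %d" % ncsize)
    return "\n".join(lines) + "\n"


# Illustrative fragment with the keyword commented out.
original = "/PARALLEL PROCESSORS : 4\nSOLVER : 9\n"
print(apply_ncsize(original, 4))
```

With the keyword commented out, the sketch appends an active "PARALLEL PROCESSORS : 4" line, which is the effect that --ncsize=4 used to have for me before the update.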
Costas

MUMPS solver issue with current trunk (rev9892) 7 years 3 months ago #27422

  • cyamin
Hello all,

I can confirm that the issue is solved with revision 10161.

Regards,
Costas