TOPIC: uncontroled error from python:: OSError(17, 'File exists')

uncontroled error from python:: OSError(17, 'File exists') 11 years 2 weeks ago #10921

  • c.coulet
  • Moderator
  • Posts: 3722
  • Thank you received: 1031
Violeta,

The main difference between the two configurations is the use of parallelism.
There is no harm in keeping both, and you do not have to recompile the system if you only change the list of configurations in your config file.

In my opinion the cmd_obj line is OK, but I don't know gfortran very well. On our cluster we use the Intel Fortran compiler...

About your error message: it is not possible to make a diagnosis from this information alone, BUT it is a useful starting point for another test.
Could you go into the temporary directory (if it was not deleted) and run the executable directly?
If the directory is no longer available after the crash, use the -t option in the launching step.

regards
Christophe
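
For reference, error number 17 in the thread title is the code EEXIST ("File exists"). The sketch below only illustrates how Python raises it; it is not the code path inside the TELEMAC scripts, which this thread does not show, and the directory name `demo_tmp` is hypothetical.

```python
import errno
import os
import tempfile


def make_dir_twice(parent):
    """Create the same directory twice and return the errno of the second attempt."""
    target = os.path.join(parent, "demo_tmp")  # hypothetical directory name
    os.mkdir(target)  # first call succeeds
    try:
        os.mkdir(target)  # second call fails: the directory already exists
    except OSError as exc:
        return exc.errno
    return None


with tempfile.TemporaryDirectory() as parent:
    code = make_dir_twice(parent)

# Compare the captured error number with the symbolic constant.
print(code == errno.EEXIST, errno.errorcode[code])
```

An error of this kind usually means that a script tried to create a file or directory that was left over from an earlier run.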

uncontroled error from python:: OSError(17, 'File exists') 11 years 2 weeks ago #10922

  • 716469
  • Expert Boarder
  • Posts: 303
  • Thank you received: 6
Hi Christophe,

I reran it from the temporary folder and got some information (see attached). I think there is some confusion with the dictionary.

Anyway, the Intel compiler and openmpi_intel/1.6.4 are already installed on the cluster. I can download them and use them. Would you recommend switching? If you are all using Intel, you would know the answer straight away if I get any errors there. I do not mind doing it again, as I have tried so many variations by now and nothing has worked.

Thanks again.

Kind Regards!

Violeta

uncontroled error from python:: OSError(17, 'File exists') 11 years 2 weeks ago #10923

  • 716469
I forgot the attachment again :)

This is the error message after rerunning postel3d from the temporary directory.

Violeta

uncontroled error from python:: OSError(17, 'File exists') 11 years 2 weeks ago #10924

  • c.coulet
Violeta,
I'm not sure I explained clearly what I wanted you to do on your cluster.
In the temp directory, you should just execute out_postel3dv6p2!
The attachment looks as if you tried to run runcode.py, since some of its information appears in the first line of the file.

For the Intel compiler, it's mainly up to you.
One option would be to maintain both configurations on your cluster and compare the computation times of gfortran and Intel. You could then decide on the basis of tangible information and keep the chosen configuration for future use.
I'm not sure switching to Intel will solve your problem here since, as you mentioned, your telemac2d and 3D simulations run.
Christophe

uncontroled error from python:: OSError(17, 'File exists') 11 years 2 weeks ago #10926

  • 716469
Thank you Christophe,

No, you explained well how to run out_postel3dv6p2; it is just that my lack of IT language and skills confuses everything, which is probably why I have all these problems :).

I just executed out_postel3dv6p2 from the command line and got:

-sh: out_postel3dv6p2: command not found

As for the Intel compiler: there are no examples of systel.cfg for Linux with Intel on the website, only some for Windows, so I am afraid that I will mess it up again (this is my lack of IT knowledge again; I presume that Intel can be used on Linux :) ). But I can give it a try, as with gfortran I have been getting nowhere for a whole week.

Thanks.

Violeta

uncontroled error from python:: OSError(17, 'File exists') 11 years 2 weeks ago #10929

  • c.coulet
Violeta
In the temp directory, try to run the command:
. out_postel3dv6p2.exe (check the exact name of the executable)
Don't forget the DOT before the name
Christophe

uncontroled error from python:: OSError(17, 'File exists') 11 years 2 weeks ago #10930

  • 716469
Thank you for your patience, Christophe.

After executing .out_postel3dv6p2.exe I got the same message: -sh: .out_postel3dv6p2.exe: command not found

(And for .out_postel3dv6p2, which is the exact name, I got the same result.)

Kind Regards!

Violeta

uncontroled error from python:: OSError(17, 'File exists') 11 years 2 weeks ago #10932

  • c.coulet
And what about ./out_postel3dv6p2 ?
Christophe
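
The three attempts in this thread differ in how the shell resolves the command. A bare name is searched only in the directories listed in PATH, `.out_postel3dv6p2` typed without a space is read as a command name that starts with a dot, and `./out_postel3dv6p2` names the file in the current directory explicitly. The sketch below illustrates the first and last cases with Python's `shutil.which`, assuming a POSIX system; the file name `out_demo_executable` and the search path are hypothetical stand-ins.

```python
import os
import shutil
import stat
import tempfile

NAME = "out_demo_executable"  # hypothetical name, stands in for out_postel3dv6p2


def lookup(workdir):
    """Return (result of a bare-name search on a PATH, result of an explicit-path lookup)."""
    exe = os.path.join(workdir, NAME)
    with open(exe, "w") as fh:
        fh.write("#!/bin/sh\necho hello\n")
    os.chmod(exe, os.stat(exe).st_mode | stat.S_IXUSR)  # make it executable

    # Bare name: only the listed directories are searched.
    # The search path is an assumed, typical one that does not contain workdir.
    search_path = os.pathsep.join(["/usr/bin", "/bin"])
    by_name = shutil.which(NAME, path=search_path)

    # A name with a directory part (like ./name) is checked directly; PATH is ignored.
    by_path = shutil.which(exe)
    return by_name, by_path


with tempfile.TemporaryDirectory() as workdir:
    by_name, by_path = lookup(workdir)
    found_directly = by_path == os.path.join(workdir, NAME)

print(by_name, found_directly)
```

The bare-name search finds nothing although the file exists and is executable, which matches the "command not found" messages reported above.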

uncontroled error from python:: OSError(17, 'File exists') 11 years 2 weeks ago #10933

  • 716469
Actually, when I entered ./out_postel3dv6p2 I got the message below:

[violeta.moloney@mg01 p3d.cas_2013-11-08-16h09min42s]$ ./out_postel3dv6p2
Error obtaining unique transport key from ORTE (orte_precondition_transports not present in
the environment).

Local host: mg01.cluster.local

It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

PML add procs failed
--> Returned "Error" (-1) instead of "Success" (0)
[mg01.cluster.local:2248] *** An error occurred in MPI_Init
[mg01.cluster.local:2248] *** on a NULL communicator
[mg01.cluster.local:2248] *** Unknown error
[mg01.cluster.local:2248] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
An MPI process is aborting at a time when it cannot guarantee that all
of its peer processes in the job will be killed properly. You should
double check that everything has shut down cleanly.

Reason: Before MPI_INIT completed
Local host: mg01.cluster.local
PID: 2248
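
The log above comes from an Open MPI executable started directly from the shell prompt. One thing that can be checked in that situation is whether a process was started by an MPI launcher at all. This is a minimal sketch, assuming that the Open MPI launcher exports `OMPI_COMM_WORLD_SIZE` and `OMPI_COMM_WORLD_RANK` to the processes it starts; the helper name is hypothetical and the code is not part of TELEMAC.

```python
import os


def open_mpi_launch_info(env=None):
    """Return (size, rank) if the Open MPI launcher variables are present, else None."""
    env = os.environ if env is None else env
    size = env.get("OMPI_COMM_WORLD_SIZE")
    rank = env.get("OMPI_COMM_WORLD_RANK")
    if size is None or rank is None:
        return None  # the variables are missing, so this looks like a direct start
    return int(size), int(rank)


# Direct start from a shell prompt: the variables are absent.
direct = open_mpi_launch_info({})
# Environment as a launcher would set it for rank 0 of 4 processes.
launched = open_mpi_launch_info(
    {"OMPI_COMM_WORLD_SIZE": "4", "OMPI_COMM_WORLD_RANK": "0"}
)
print(direct, launched)
```

A result of `None` does not explain the MPI_Init failure by itself; it only tells you that the executable was not started through the launcher.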

uncontroled error from python:: OSError(17, 'File exists') 11 years 2 weeks ago #10934

  • c.coulet
OK.
So we definitely have a problem with the parallel version.

You should now try the scalar version of postel.
Edit your configuration file and add the previous configuration, ubugfortrans.
I hope you won't have to compile the whole system once again.

Then try to run postel with this configuration (in interactive mode).
You just have to add -c ubugfortrans to the command line.

Something like: runcode.py postel3d -c ubugfortrans -s ...
If an error gives too little information, you can run ./out_postel3dv6p2 in the temporary directory.

Good luck
Christophe
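
Adding a configuration name to the list, as suggested above, can be sketched as follows. This assumes the configuration file is INI-style, with a `[Configurations]` section holding a space-separated `configs` entry; that layout, the sample text and the name `some_parallel_config` are assumptions for illustration, so check them against your own systel.cfg.

```python
import configparser

# Hypothetical fragment; the section name, key name and existing entry are assumptions.
SAMPLE = """
[Configurations]
configs: some_parallel_config
"""


def add_config(text, name, section="Configurations", key="configs"):
    """Return the list of configuration names with `name` appended if it is missing."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    names = parser.get(section, key).split()
    if name not in names:
        names.append(name)
    return names


names = add_config(SAMPLE, "ubugfortrans")
print(" ".join(names))
```

The function only computes the new list; writing it back to the file is left out on purpose, so that an existing configuration file is not modified by accident.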