integration problem between Torque 4 and Intel(R) MPI Library for Linux* OS, Version 2019 Update 1 #457

Open
mstener0 opened this issue Jan 20, 2019 · 0 comments

Hi!

I have successfully compiled and linked a program with Intel MPI. When I run it interactively or in the background, it runs very fast and without any problems on our new server (ProLiant DL580 Gen10, 1 node with 4 processors of 18 cores each, 72 cores in total, hyperthreading disabled). When I submit it through Torque (version 4), strange things happen. For example:

  1. If I submit 2 jobs, each asking for 8 cores, both run fine.

  2. If I submit a third job (8 cores), it is 4 times slower because its 8 processes run on only two cores!

  3. If I submit a fourth job, it runs properly. But if I qdel all four jobs, all of them disappear from qstat -a, yet the fourth keeps running!
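
For anyone trying to reproduce the core placement described in item 2, here is a minimal sketch of how the allowed CPU list of each process could be inspected. It assumes a Linux system where /proc/<pid>/status carries a Cpus_allowed_list field. The function names allowed_cpus and report_affinity are hypothetical helpers written for this illustration, not part of Torque or Intel MPI.

```shell
#!/bin/sh
# Print the CPU list a process may run on, read from a /proc-style
# status file (Linux only). Usage: allowed_cpus <status-file>
allowed_cpus() {
    awk '/^Cpus_allowed_list:/ { print $2 }' "$1"
}

# For every PID given as an argument, print "<pid><TAB><cpu list>".
# PIDs whose status file cannot be read are skipped silently.
report_affinity() {
    for pid in "$@"; do
        status="/proc/$pid/status"
        if [ -r "$status" ]; then
            printf '%s\t%s\n' "$pid" "$(allowed_cpus "$status")"
        fi
    done
}
```

A call such as report_affinity $(pgrep -u "$USER" my_mpi_program), where my_mpi_program stands in for the real executable name, lists one line per rank. If ranks of different jobs report the same narrow CPU list, the jobs are being confined to the same cores.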

I have the feeling it is an integration problem between Intel MPI and Torque, so I set the following:

export I_MPI_PIN=off
export I_MPI_PIN_DOMAIN=socket

To run the program, I call mpirun as follows:

/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpirun -d -rmk pbs -bootstrap pbsdsh .................

I have checked that PBS_ENVIRONMENT is properly set to PBS_BATCH.
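
As a rough sketch of how this check can be repeated from inside a job script, the helper below prints the batch variables and counts the slots granted to the job. It relies on PBS_NODEFILE, which Torque fills with one host name per allocated core. The function names count_slots and show_pbs_env are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Count the slots in a Torque node file (one host name per core).
# Usage: count_slots <node-file>; prints the number of non-empty lines.
count_slots() {
    grep -c . "$1"
}

# Print the batch-related variables as seen from inside the job.
show_pbs_env() {
    echo "PBS_ENVIRONMENT=${PBS_ENVIRONMENT:-unset}"
    echo "PBS_JOBID=${PBS_JOBID:-unset}"
    if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
        echo "slots=$(count_slots "$PBS_NODEFILE")"
    fi
}
```

Calling show_pbs_env at the top of the job script, before mpirun, records what the job was actually given. For an 8-core request on this single node, the slot count should be 8.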

The Torque configuration also appears to be correct. The file /var/lib/torque/server_priv/nodes contains the following line:

dscfbeta1.units.it np=72 num_node_boards=1

This is a severe problem for me: the machine is shared, so we need a scheduler like Torque (PBS) to run jobs compiled and linked against Intel MPI. Any help or suggestion is welcome!

Thank you in advance,

Mauro
