Discussion:
[Beowulf] Puzzling Intel mpi behavior with slurm
Faraz Hussain
2018-04-05 15:10:57 UTC
Permalink
Here's something quite baffling. I have a cluster running slurm but
have not set up passwordless ssh for a user yet. So when the user runs
"mpirun -n 2 -hostfile hosts hostname", it will hang because of the ssh
issue. That is expected.

Now the baffling thing is the mpirun command works inside a slurm
script! How can it work if passwordless ssh has not been configured?
Does slurm use some different authentication (munge?) to login to the
hosts and execute the hostname command?

Or does slurm have some fancy behind-the-scenes integration with Intel MPI?
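
To confirm that the hang outside Slurm really comes from ssh, one option is to try a non-interactive login to each host in the hostfile. The sketch below is only an illustration: the helper names are made up for this example, and it assumes OpenSSH, whose `BatchMode=yes` option makes ssh fail instead of prompting for a password.

```python
import subprocess


def ssh_check_command(host):
    """Build an ssh command that fails fast instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",     # never prompt; fail if no key-based login is possible
        "-o", "ConnectTimeout=5",  # give up quickly on unreachable hosts
        host,
        "true",                    # trivial remote command
    ]


def passwordless_ssh_works(host, timeout=15):
    """Return True if `host` accepts a non-interactive ssh login for the current user."""
    try:
        result = subprocess.run(
            ssh_check_command(host),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (subprocess.TimeoutExpired, FileNotFoundError):
        # Treat a hang or a missing ssh binary as "does not work".
        return False
    return result.returncode == 0
```

Calling `passwordless_ssh_works("node01")` for each entry of the hostfile (the host name here is a placeholder) would show which nodes an ssh-based mpirun could not reach.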

_______________________________________________
Beowulf mailing list, ***@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowul
Michael Di Domenico
2018-04-05 17:59:44 UTC
Permalink
I'm pretty sure, but don't quote me, that slurm forks processes from
slurmd to launch code and does not use ssh.
Skylar Thompson
2018-04-06 00:11:27 UTC
Permalink
At least for Grid Engine/OpenMPI the preferred mechanism ("tight
integration") involves the shepherds running on each exec host to start
MPI, without any SSH/RSH required at all. I'm not sure if you've run across
this documentation, but it might help to figure out what's going on:

https://slurm.schedmd.com/mpi_guide.html#intel_mpi

I'm guessing you're using the "srun" method right now.

Skylar
Prentice Bisbal
2018-04-06 19:37:28 UTC
Permalink
See the URL below for a good overview of how Slurm works:

https://slurm.schedmd.com/quickstart.html

The way I understand it, tasks are started by Slurmd. Ssh is not
involved at all.

SGE does the same thing with 'tight integration'. The tasks are started
on the compute nodes by sge_execd, which spawns an SGE shepherd process,
which then spawns the actual task.

To really complicate things, you should look at the Process Management
Interface (PMI). This is a middle layer between Slurm (or another
scheduler) and the MPI tasks. It's a standardized abstraction layer to
make programming MPI implementations and schedulers easier. It also
increases startup time of the MPI jobs, which is not insignificant for
large jobs.

www.mcs.anl.gov/papers/P1760.pdf
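
As a rough picture of what such a middle layer provides, PMI-style interfaces are built around a key-value space: each rank puts its connection information, all ranks pass a fence, and then each rank gets the entries of its peers. The toy model below runs in a single process and is not the PMI API; the class, the function, and the addresses are invented for illustration.

```python
class ToyKVS:
    """In-memory stand-in for the key-value space a process manager offers to ranks."""

    def __init__(self):
        self._pending = {}    # puts that are not yet visible to other ranks
        self._committed = {}  # values visible after a fence

    def put(self, rank, key, value):
        self._pending[(rank, key)] = value

    def fence(self):
        """Publish all pending puts; a real fence also synchronises the ranks."""
        self._committed.update(self._pending)
        self._pending.clear()

    def get(self, rank, key):
        return self._committed[(rank, key)]


def wire_up(nranks):
    """Each rank publishes an address, then looks up the address of every peer."""
    kvs = ToyKVS()
    for rank in range(nranks):
        kvs.put(rank, "addr", f"node{rank}:5000{rank}")  # made-up addresses
    kvs.fence()
    return {
        rank: [kvs.get(peer, "addr") for peer in range(nranks)]
        for rank in range(nranks)
    }
```

Because the scheduler's daemons carry this exchange, the MPI library does not need its own way to log in to remote nodes.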

Prentice
Chris Samuel
2018-04-09 08:58:48 UTC
Permalink
Post by Prentice Bisbal
To really complicate things, you should look at process management interface
(PMI). This is a middle layer between Slurm (or another scheduler) and the
MPI tasks. It's a standardized abstraction layer to make programming MPI
implementations and schedulers easier. It also increases startup time of
the MPI jobs, which is not insignificant for large jobs.
Hopefully PMI/PMI2/PMIx decreases the startup time! :-)

There's a presentation on PMIx (the latest version) here:

https://slurm.schedmd.com/SC17/PMIx-SC17.pdf

You need to be careful about matching PMIx and Slurm versions;
there is information on getting it working on both sites:

https://slurm.schedmd.com/mpi_guide.html#pmix

https://pmix.org/support/how-to/slurm-support/
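
One way to see whether a given Slurm installation was built with PMIx support is to ask srun for its MPI plugin list with `srun --mpi=list`. The sketch below wraps that command; the exact layout of the output differs between Slurm versions, so the parser only searches for plugin names beginning with "pmix", and both function names are invented for this example.

```python
import re
import subprocess


def pmix_plugins(output):
    """Return the distinct pmix plugin names mentioned in `srun --mpi=list` output."""
    return sorted(set(re.findall(r"\bpmix(?:_v\d+)?\b", output)))


def list_pmix_plugins():
    """Run `srun --mpi=list` and extract any pmix plugins it reports."""
    result = subprocess.run(
        ["srun", "--mpi=list"],
        capture_output=True,  # the list may appear on stdout or stderr
        text=True,
        check=False,
    )
    return pmix_plugins(result.stdout + result.stderr)
```

An empty list from `list_pmix_plugins()` would suggest that this Slurm build has no PMIx plugin available.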

Hope this helps!
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC

Prentice Bisbal
2018-04-09 15:04:42 UTC
Permalink
Post by Chris Samuel
To really complicate things, you should look at process management interface
(PMI). This is a middle layer between Slurm (or another scheduler) and the
MPI tasks. It's a standardized abstraction layer to make programming MPI
implementations and schedulers easier. It also increases startup time of
the MPI jobs, which is not insignificant for large jobs.
Hopefully PMI/PMI2/PMIx decreases the startup time! :-)
I think I meant to say "increases startup performance", or something
like that. Thanks for catching this error which was the exact opposite
of what I was trying to say.
Peter Kjellström
2018-04-11 14:38:31 UTC
Permalink
On Thu, 05 Apr 2018 09:10:57 -0600
Post by Faraz Hussain
Here's something quite baffling. I have a cluster running slurm but
have not setup passwordless ssh for a user yet. So when the user
runs "mpirun -n 2 -hostfile hosts hostname", it will hang because of
ssh issue. That is expected.
Now the baffling thing is the mpirun command works inside a slurm
script! How can it work if passwordless ssh has not been configured?
Does slurm use some different authentication (munge?) to login to
the hosts and execute the hostname command?
What happens is that mpirun sees the Slurm environment variables and
switches to a Slurm-aware mode.

In this mode it uses srun to launch pmi_proxy processes on each node
of the job. Then it proceeds to start all ranks using these pmi_proxy
processes.
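
The switch described above can be pictured as a small decision on the environment. The sketch below is not Intel MPI's actual logic, only an illustration of the idea: `SLURM_JOB_ID` is set by Slurm inside a job, `I_MPI_HYDRA_BOOTSTRAP` is, to my knowledge, the Intel MPI variable for overriding the launch mechanism, and the function name and the precedence between the two are assumptions of this example.

```python
import os


def choose_bootstrap(env=None):
    """Pick a launch mechanism the way a scheduler-aware mpirun might (illustrative only)."""
    env = os.environ if env is None else env
    # Assumption of this sketch: an explicit user setting takes precedence.
    explicit = env.get("I_MPI_HYDRA_BOOTSTRAP")
    if explicit:
        return explicit
    # Inside a Slurm job the scheduler exports SLURM_JOB_ID.
    if "SLURM_JOB_ID" in env:
        return "slurm"
    # Outside any job, fall back to ssh, which needs passwordless login.
    return "ssh"
```

This would explain the behaviour in the original question: the same mpirun line uses ssh in a login shell and srun inside a job script.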

The process tree ends up being something like this on the first node:

slurmd->slurmstepd->bash(jobscript)->mpirun->srun -w nodes[..] pmi_proxy

And on the other nodes:

slurmd->slurmstepd->pmi_proxy->rank[0...n]
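
A tree like this can be checked on a Linux node by walking parent PIDs through /proc. The following is a minimal sketch, assuming the usual `Name:` and `PPid:` fields in /proc/<pid>/status; the function names are invented here, and the names actually printed depend on the system.

```python
def parse_status(text):
    """Extract the process name and parent PID from /proc/<pid>/status content."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields["Name"], int(fields["PPid"])


def read_proc_status(pid):
    """Read the status file of a process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return handle.read()


def ancestry(pid, read_status=read_proc_status):
    """Return process names from the oldest reachable ancestor down to `pid`."""
    names = []
    while pid > 0:
        try:
            name, ppid = parse_status(read_status(pid))
        except (OSError, KeyError):
            break  # process vanished or status is not readable
        names.append(name)
        pid = ppid
    return list(reversed(names))
```

Running `"->".join(ancestry(os.getpid()))` (after `import os`) from a small script started as an MPI rank should show whether sshd or slurmstepd appears among its ancestors.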

Authentication/authorization is handled by Slurm and depends on how you
set it up (often munge).

Cheers,
Peter K