work. And that's what will inevitably be the case as we scale up in the
HPC world - there will always be dead or malfunctioning nodes.
Jim, this is true. And 'we' should be looking to the webscale generation
for the answers. They thought about computing at scale from the beginning.
Microsoft/Amazon/Google order servers ready racked in shipping containers;
failed nodes are not nursed back to health one by one, they just take the
loss and move on.
Contrast that with the multi-year lifetime of HPC. When pitching HPC
clusters you often put in an option for a mid-life upgrade. I think upping
the RAM is quite common, but processors and interconnect much less so.
At some point the cost of keeping old hardware in power and cooling is
outweighed by the performance of a new generation. But where...
*Date:* Thursday, May 3, 2018 at 6:54 AM
*Subject:* Re: [Beowulf] Bright Cluster Manager
I agree with Doug. The way forward is a lightweight OS with containers for
the applications.
I think we need to learn from the new kids on the block - the webscale
generation.
They did not go out and look at how massive supercomputer clusters are put
together.
No, they went out and built scale-out applications running on public clouds.
We see 'applications designed to fail' and 'serverless'.
Yes, I KNOW that scale-out applications like these are web-type
applications, and all the application examples you
see are based on the load balancer/web server/database (or whatever style)
paradigm.
The art of this will be deploying the more tightly coupled applications
which HPC has: the ones which depend upon MPI communications over a
reliable fabric, and which depend upon GPUs, etc.
The other hat I will toss into the ring is separating parallel tasks which
require computation on several
servers and MPI communication between them versus 'embarrassingly
parallel' operations which may run on many, many cores
but do not particularly need communication between them.
The best successes I have seen on clusters are where the heavy parallel
applications get exclusive compute nodes.
It is cleaner: you get all the memory and storage bandwidth, and it is easy
to clean up. Hell, reboot the things after each job - you have got an
exclusive node.
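To make that concrete, here is a minimal sketch of what "exclusive nodes"
looks like in practice, assuming Slurm as the scheduler (the script name
and node count below are made-up examples); the --exclusive flag is what
hands the whole node to the job:

    # Minimal sketch (assumes Slurm).  "--exclusive" asks the scheduler to
    # dedicate whole nodes to the job, so the MPI application gets all the
    # memory and storage bandwidth on each node.  Script name and node
    # count are made-up examples.
    import subprocess

    def submit_exclusive(job_script: str, nodes: int) -> str:
        result = subprocess.run(
            ["sbatch", "--exclusive", f"--nodes={nodes}", job_script],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()  # e.g. "Submitted batch job 12345"

    if __name__ == "__main__":
        print(submit_exclusive("ocean_model.sbatch", nodes=8))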
I think many designs of HPC clusters still try to cater for all workloads:
oh yes, we can run an MPI weather forecasting/ocean simulation, but at the
same time we have this really fast IO system and we can run your Hadoop
jobs too.
I wonder if we are going to see a fork in HPC. With the massively parallel
applications being deployed, as Doug says, on specialised
lightweight OSes which have dedicated high speed, reliable fabrics and
with containers.
You won't really be able to manage those systems like individual Linux
servers. Will you be able to ssh in for instance?
ssh assumes there is an ssh daemon running. Does a lightweight OS have
ssh? Authentication Services? The kitchen sink?
The less parallel applications would be run more and more on cloud-type
installations, either on-premise clouds or public clouds.
I confound myself here, as I can't say what the actual difference between
those two types of machines is, as you always need
an interconnect fabric and storage, so why not have the same for both
types of tasks?
Maybe one further quip to stimulate some conversation. Silicon is cheap.
No, really it is.
Your friendly Intel salesman may wince when you say that. After all, those
lovely Xeon CPUs cost north of 1000 dollars each.
But power and cooling cost the same as, if not more than, your purchase
cost over several years.
And are we really exploiting all the capabilities of those Xeon CPUs?
And another aspect of this - I've been doing stuff with 'loose clusters'
of low capability processors (Arduino, Rpi, Beagle) doing distributed
sensing kinds of tasks. Leaving aside the Arduino (no OS), the other two
wind up with some flavor of Debian but often with lots of stuff you don't
need (i.e. Apache). Once you've fiddled with one node to get the
configuration right, you want to replicate it across a bunch of nodes -
right now that means sneakernet of SD cards - although in theory, one
should be able to push an image out to the local file system (typically 4GB
eMMC in the case of beagles), and tell it to write that to the 'boot area' -
but I've not tried it.
While I'd never claim my pack of beagles is HPC, it does share some
aspects - there's parallel work going on, the nodes need to be aware of
each other and synchronize their behavior (that is, it's not an
embarrassingly parallel task that's farmed out from a queue), and most
importantly, the management has to be scalable. While I might have 4
beagles on the bench right now, the idea is to scale the approach to
hundreds. Typing 'sudo apt-get install tbd-package' on 4 nodes
sequentially might be OK (although pdsh and csshx help a lot), but it's
not viable for 100 nodes.
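Just to sketch what that kind of fan-out might look like at 100 nodes (a
rough illustration only - the hostnames, worker count, and timeouts are
invented, and it assumes passwordless ssh and sudo), you end up wanting to
run the command on every node in parallel and collect the ones that did
not answer, rather than stopping at the first failure:

    # Rough sketch: run one command on many nodes in parallel over ssh and
    # report which nodes failed or timed out, instead of typing it N times.
    # Hostnames, worker count, and timeouts are made-up examples; assumes
    # passwordless ssh/sudo is already set up.
    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    NODES = [f"beagle{i:03d}" for i in range(100)]
    COMMAND = "sudo apt-get install -y tbd-package"

    def run_on(node: str) -> tuple[str, bool]:
        try:
            proc = subprocess.run(
                ["ssh", "-o", "ConnectTimeout=5", node, COMMAND],
                capture_output=True, timeout=300,
            )
            return node, proc.returncode == 0
        except subprocess.TimeoutExpired:
            return node, False

    with ThreadPoolExecutor(max_workers=32) as pool:
        results = list(pool.map(run_on, NODES))

    dead = [node for node, ok in results if not ok]
    print(f"{len(NODES) - len(dead)} nodes OK, {len(dead)} need attention: {dead}")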
The other aspect of my application that's interesting, and applicable to
exascale kinds of problems, is tolerance to failures - if I have a low data
rate link among nodes (with not necessarily all-to-all connectivity), one
can certainly distribute a new OS image (or container) given time. There
are some ways to deal with errors in the transfers (other than just
retransmitting everything - which doesn't work if the error rate is high
enough that you can guarantee at least one error will occur in a long
transfer). But how do you **manage** a cluster with hundreds or thousands
of nodes where some fail randomly, reset randomly, etc.?
All of a sudden simple 'send the same command to all nodes' just doesn't
work. And that's what will inevitably be the case as we scale up in the
HPC world - there will always be dead or malfunctioning nodes.
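On the transfer-error point, one way to avoid retransmitting the whole
image is to chunk and checksum it, so only the damaged pieces go over the
slow link again. A toy sketch (the chunk size and hash are arbitrary
choices, and it assumes each node can send back the hashes of the chunks
it stored):

    # Toy sketch of error-tolerant image distribution: split the image into
    # fixed-size chunks, hash each chunk, and resend only the chunks whose
    # hashes the receiving node reports as missing or wrong.
    import hashlib

    CHUNK_SIZE = 64 * 1024  # arbitrary choice

    def make_chunks(image: bytes) -> list[bytes]:
        return [image[i:i + CHUNK_SIZE] for i in range(0, len(image), CHUNK_SIZE)]

    def manifest(chunks: list[bytes]) -> list[str]:
        return [hashlib.sha256(c).hexdigest() for c in chunks]

    def chunks_to_resend(chunks: list[bytes], reported: list[str]) -> list[int]:
        """Indices of chunks the node got wrong, or never received at all."""
        expected = manifest(chunks)
        reported = reported + [""] * (len(expected) - len(reported))
        return [i for i, (want, got) in enumerate(zip(expected, reported)) if want != got]

    # Usage idea: push all chunks once, ask each node for the hashes of what
    # it stored, then loop, resending only chunks_to_resend(...) per node
    # until every node's list comes back empty.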
Here is where I see it going:
1. Compute nodes with a base minimal generic Linux OS
(with PR_SET_NO_NEW_PRIVS in the kernel, added in 3.5 - see the sketch
below)
2. A scheduler (that supports containers)
3. Containers (Singularity mostly)
All "provisioning" is moved to the container. There will be edge cases of
course, but applications will be pulled down from a container repo and
"just run".
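For anyone wondering what the PR_SET_NO_NEW_PRIVS flag in point 1 above
buys you, here is a minimal sketch (calling libc through ctypes is just
one way to do it, and the image name is a made-up example): once the flag
is set, the launcher and everything it execs can never gain privileges via
setuid binaries, which is part of what makes "pull the container and just
run it" palatable on a minimal shared node.

    # Minimal sketch: set PR_SET_NO_NEW_PRIVS (prctl option 38, kernel >= 3.5)
    # so this process and everything it execs can never gain privileges
    # (e.g. through setuid binaries), then hand off to a container runtime.
    # The image name is a made-up example.
    import ctypes
    import os

    PR_SET_NO_NEW_PRIVS = 38

    libc = ctypes.CDLL("libc.so.6", use_errno=True)
    if libc.prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_SET_NO_NEW_PRIVS) failed")

    # Replace this process with the containerised application.
    os.execvp("singularity", ["singularity", "run", "my-app.sif"])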
--
Doug
I never used Bright. Touched it and talked to a salesperson at a
conference but I wasn't impressed.
Unpopular opinion: I don't see a point in using "cluster managers"
unless you have a very tiny cluster and zero Linux experience. These
are just Linux boxes with a couple of applications (e.g. Slurm) running on
them. Nothing special. xcat/Warewulf/Scyld/Rocks just get in the way
more than they help IMO. They are mostly crappy wrappers around free
software (e.g. ISC's dhcpd) anyway. When they aren't, it's proprietary
trash.
I install CentOS nodes and use
Salt/Chef/Puppet/Ansible/WhoCares/Whatever to plop down my configs and
software. This also means I'm not stuck with "node images" and can
instead build everything as plain old text files (read: write SaltStack
states), update them at will, and push changes any time. My "base
image" is CentOS and I need no "baby's first cluster" HPC software to
install/PXEboot it. YMMV
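In that spirit, the whole "plain old text files instead of node images"
idea boils down to something like the sketch below (a toy illustration,
not SaltStack itself; the config path, contents, and service name are
invented): compare what you want on the node with what is already there,
and only rewrite and restart when they differ.

    # Toy sketch of idempotent config management: only rewrite the file and
    # restart the service when the desired content differs from what is
    # already on disk.  The path, contents, and service name are made up.
    import pathlib
    import subprocess

    def apply_config(path: str, desired: str, service: str) -> bool:
        target = pathlib.Path(path)
        current = target.read_text() if target.exists() else ""
        if current == desired:
            return False                    # already in the desired state
        target.write_text(desired)          # plain old text file, trackable in git
        subprocess.run(["systemctl", "restart", service], check=True)
        return True

    if __name__ == "__main__":
        changed = apply_config("/etc/slurm/slurm.conf", "ClusterName=lab\n", "slurmd")
        print("changed" if changed else "no change")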
Jeff White
Post by Robert Taylor:
Hi Beowulfers.
Does anyone have any experience with Bright Cluster Manager?
My boss has been looking into it, so I wanted to tap into the
collective HPC consciousness and see
what people think about it.
It appears to do node management, monitoring, and provisioning, so we
would still need a job scheduler like LSF, Slurm, etc. as well. Is that
correct?
If you have experience with Bright, let me know. Feel free to contact
me off list or on.