Discussion:
[Beowulf] shared compute/storage WAS: Re: Lustre Upgrades
Michael Di Domenico
2018-07-26 13:20:29 UTC
On Thu, Jul 26, 2018 at 3:14 AM, Jörg Saßmannshausen wrote:
I once had this idea as well: using the spinning discs which I have in the
compute nodes as part of a distributed scratch space. I was using GlusterFS
for that as I thought it might be a good idea. It was not.
I split the thread so as not to pollute the other discussion.

I'm curious whether anyone has hard data on the above, but with the compute
encapsulated from the storage using VMs instead of just separate processes.

In theory you could cap the performance interference using VMs and cgroup
controls, but I'm not sure how effective that actually is in HPC (I have no
data).
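To make that concrete, below is the kind of thing I have in mind. Just a
sketch: it assumes a cgroup v2 hierarchy mounted at /sys/fs/cgroup with the
cpu and io controllers enabled (newer than what I'm actually running), and
the group name, block device numbers, limits, and PID are all made up for
illustration.

#!/usr/bin/env python3
"""Cap CPU and block I/O for a storage daemon using cgroup v2.

Sketch only: assumes a cgroup v2 hierarchy at /sys/fs/cgroup with the
cpu and io controllers enabled; the group name, device 8:0, limits,
and PID below are placeholders.
"""
from pathlib import Path

CGROUP = Path("/sys/fs/cgroup/storage-daemon")  # hypothetical group

def cap_storage_daemon(pid: int) -> None:
    CGROUP.mkdir(exist_ok=True)
    # Allow at most two CPUs' worth of time: 200ms quota per 100ms period.
    (CGROUP / "cpu.max").write_text("200000 100000\n")
    # Throttle /dev/sda (major:minor 8:0) to roughly 200 MB/s each way.
    (CGROUP / "io.max").write_text("8:0 rbps=209715200 wbps=209715200\n")
    # Move the daemon into the group so the limits apply to it.
    (CGROUP / "cgroup.procs").write_text(f"{pid}\n")

if __name__ == "__main__":
    cap_storage_daemon(12345)  # replace with the real daemon PID

The same file interface is what libvirt and Docker drive under the hood, so
in principle the cap applies whether the storage side is a VM, a container,
or just a daemon.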

I've been thinking about this recently as a way to rebalance some of the rack
loading throughout my data center. Yes, I can move things around within the
racks, but then it turns into a cabling nightmare.

Discuss?
John Hearns via Beowulf
2018-07-26 13:27:29 UTC
For VM substitute 'container', since containerisation is intimately linked
with cgroups anyway. Google 'CEPH Docker' and there is plenty of information.

Someone I work with tried out Ceph on Docker the other day and got into some
knots regarding access to the actual hardware devices. He then downloaded
Minio and got it working very rapidly. Sorry, I am only repeating this story
second hand.
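For what it's worth, part of why Minio wins these stories is that one binary
(or container) plus a few lines of client code gets you an S3-style endpoint
to test against. A rough sketch with the 'minio' Python SDK, assuming a
server is already listening on localhost:9000; the credentials, bucket, and
file names are placeholders, so treat it as illustrative only.

"""Smoke-test a Minio endpoint with the 'minio' Python SDK.

Assumes a server is already running, e.g. started with something like
  docker run -p 9000:9000 -v /mnt/data:/data minio/minio server /data
The credentials, bucket, and object names below are placeholders.
"""
from minio import Minio

client = Minio(
    "localhost:9000",
    access_key="minioadmin",  # placeholder credentials
    secret_key="minioadmin",
    secure=False,             # plain HTTP for a local test
)

bucket = "scratch-test"
if not client.bucket_exists(bucket):
    client.make_bucket(bucket)

# Upload a small file and list it back to confirm the data path works.
client.fput_object(bucket, "hello.txt", "/etc/hostname")
for obj in client.list_objects(bucket):
    print(obj.object_name, obj.size)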
John Hearns via Beowulf
2018-07-26 13:30:57 UTC
Post by Michael Di Domenico
In theory you could cap the performance interference using VMs and cgroup
controls, but I'm not sure how effective that actually is in HPC (I have no
data).
I looked quite heavily at performance capping for RDMA applications in
cgroups about a year ago. It is very doable; however, you need a recent
4-series kernel. Sadly we were using 3-series kernels on RHEL.
Parav Pandit is the go-to guy for this:
https://www.openfabrics.org/images/eventpresos/2016presentations/115rdmacont.pdf
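For reference, that work ended up as the 'rdma' cgroup controller in the 4.x
kernels, which caps HCA handles and objects per group through the usual
cgroup file interface. A sketch of driving it directly, assuming a cgroup v2
mount with the rdma controller enabled; the group name, device, limits, and
PID are invented for illustration.

"""Limit RDMA resources for a process via the cgroup v2 'rdma' controller.

Sketch only (4.x kernels with the rdma controller enabled): the group
name, HCA device name, limits, and PID below are placeholders.
"""
from pathlib import Path

CGROUP = Path("/sys/fs/cgroup/storage-rdma")  # hypothetical group

def cap_rdma(pid: int) -> None:
    CGROUP.mkdir(exist_ok=True)
    # At most 10 HCA handles and 1000 HCA objects on mlx5_0 for this group.
    (CGROUP / "rdma.max").write_text("mlx5_0 hca_handle=10 hca_object=1000\n")
    # Confine the target process to the group.
    (CGROUP / "cgroup.procs").write_text(f"{pid}\n")

if __name__ == "__main__":
    cap_rdma(12345)  # PID of the process to confine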
Michael Di Domenico
2018-07-26 13:45:15 UTC
On Thu, Jul 26, 2018 at 9:30 AM, John Hearns via Beowulf wrote:
Post by John Hearns via Beowulf
I looked quite heavily at performance capping for RDMA applications in
cgroups about a year ago. It is very doable; however, you need a recent
4-series kernel. Sadly we were using 3-series kernels on RHEL.
Interesting, though I'm not sure I'd dive that deep. For one, I'm generally
restricted to RHEL, so that means a 3.x kernel right now.

But I also feel like this might be an area where VMs provide a layer of
management that containers don't. I could conceive that the storage and
compute VMs might not necessarily run the same kernel version and/or OS.

I'd also be more amenable to having two high-speed NICs, both IB or one IB
and one 40GigE, one for each VM, rather than fair-sharing the work queues of
a single IB card.
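For the NIC split, the obvious mechanism on the KVM side would be plain PCI
passthrough, so the storage VM owns the IB card outright and the compute VM
owns the 40GigE. Something like the sketch below, assuming libvirt/virsh;
the guest name and PCI address are placeholders you'd look up with lspci.

"""Hand a whole PCI NIC to one libvirt guest via VFIO passthrough.

Sketch assuming libvirt/virsh is in use; the guest name 'storage-vm'
and the PCI address 0000:81:00.0 are placeholders (find the real one
with lspci).
"""
import subprocess
import tempfile

# libvirt hostdev description for passing through one PCI device.
HOSTDEV_XML = """\
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x81' slot='0x00' function='0x0'/>
  </source>
</hostdev>
"""

def attach_nic(guest: str) -> None:
    # Write the device description to a file and attach it to the guest,
    # persisting it in the guest's configuration (--config).
    with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as f:
        f.write(HOSTDEV_XML)
        xml_path = f.name
    subprocess.run(["virsh", "attach-device", guest, xml_path, "--config"],
                   check=True)

if __name__ == "__main__":
    attach_nic("storage-vm")  # hypothetical guest name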

Dunno, just spitballing here. Maybe something sticks enough for me to stand
up something with my older cast-off hardware.
John Hearns via Beowulf
2018-07-26 13:48:56 UTC
As we are discussing storage performance, may I slightly blow the trumpet
for someone else:
https://www.ellexus.com/ellexus-contributes-to-global-paper-on-how-to-analyse-i-o/
https://arxiv.org/abs/1807.04985
Jonathan Engwall
2018-07-26 18:13:45 UTC
This made me think about distributed routing:
https://wiki.openstack.org/wiki/Distributed_Router_for_OVS
It might be my next horrible idea. It looks interesting.
It seems to me that moving the load off heavily hit machines could be
accomplished with elastic deployment and distributed routing. Presently I
only have enough power to test these ideas and cross my fingers.
_______________________________________________
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf