Discussion:
[Beowulf] Working for DUG, new thead
Jonathan Engwall
2018-06-13 17:07:49 UTC
Permalink
Stuart Midgley works for DUG? They are currently
recruiting for an HPC manager in London... Interesting...
Recruitment at DUG wants to call me about Low Level HPC. I have at least
until 6pm.
I am excited but also terrified. My background is C and now JavaScript,
mostly online course work and telnet MUDs.
Any suggestions are very much needed.
What must a "low level HPC" know on day 1???
Jonathan Engwall
***@gmail.com
Bill Abbott
2018-06-13 17:53:24 UTC
Permalink
linux, mostly
Beowulf mailing list, ***@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beo
Prentice Bisbal
2018-06-13 18:04:14 UTC
Permalink
I would think that "low-level" means "close to the hardware"
(troubleshooting drivers, low-level hardware debugging, etc.), and
that junior-level would mean "basic Linux system admin skills", but
they're in Australia, so they might use these terms differently.

Prentice
Bill Abbott
2018-06-13 18:07:53 UTC
Permalink
s/linux/cabling/g
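For readers unfamiliar with the notation, `s/linux/cabling/g` is a sed-style substitution: replace every occurrence of `linux` with `cabling`. Applied to the earlier one-line reply, it turns the answer into "cabling, mostly". A minimal Python sketch of the same substitution follows; the variable names are illustrative only.

```python
import re

# Bill's earlier one-line reply in this thread.
earlier_reply = "linux, mostly"

# Equivalent of sed's s/linux/cabling/g.
# re.sub replaces every match by default, which mirrors the trailing "g" flag.
corrected = re.sub(r"linux", "cabling", earlier_reply)

print(corrected)
```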
John Hearns via Beowulf
2018-06-13 18:26:50 UTC
Permalink
Jonathan, if you have taken the interest to join this list then there is no
need to be terrified.
I have learned that people who are enthusiastic are quite rare.
Also, interviews are a two-way street - this is your opportunity to find out
about the role.
Hopefully you will be enthused about it and wish to join the company.
Andrew Latham
2018-06-13 18:27:23 UTC
Permalink
Such a broad topic. I would assume things like DHCP, TFTP, networking, PXE
and IPMI, which are what come to mind. Troubleshooting tools, configuration
management, version control, monitoring, issue tracking and many other
processes are at play. I would love to hear any interview questions which
could cover this array of topics sanely.

Suggested question: Tell me the best hardware horror story?


--
- Andrew "lathama" Latham -
Bill Abbott
2018-06-13 18:37:13 UTC
Permalink
One of my standard interview questions is: "OK, you start on Monday
and you're placed in charge of a web/db server. Tell me what you do in
your first week."

What I want to hear is security, backups, log checking, monitoring,
performance, functionality, etc., but most of all I want to know how
they think and if they can come up with a coherent plan.

Another version is "A user says the cluster is slow. What do you do?"

Bill
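One way to make "the cluster is slow" concrete is to take a few cheap measurements on the affected node before forming a theory. The sketch below is not from the thread. It assumes a Unix-like node, and the function names and the thresholds (load of 1.0 per core, disk 90% full) are arbitrary illustrations, not recommended values.

```python
import os
import shutil


def classify_load(load_1min: float, n_cpus: int, per_core_limit: float = 1.0) -> str:
    """Label a 1-minute load average relative to the core count.

    The per-core limit is an assumed threshold chosen for illustration.
    """
    if n_cpus <= 0:
        raise ValueError("n_cpus must be positive")
    return "overloaded" if load_1min / n_cpus > per_core_limit else "ok"


def classify_disk(used: int, total: int, full_fraction: float = 0.9) -> str:
    """Label a filesystem as nearly full once usage reaches the assumed fraction."""
    if total <= 0:
        raise ValueError("total must be positive")
    return "nearly full" if used / total >= full_fraction else "ok"


def first_look(path: str = "/") -> dict:
    """Collect a first impression of the local node (Unix only)."""
    load_1min, _, _ = os.getloadavg()      # 1-, 5- and 15-minute load averages
    n_cpus = os.cpu_count() or 1           # cpu_count() may return None
    usage = shutil.disk_usage(path)        # named tuple: total, used, free
    return {
        "load": classify_load(load_1min, n_cpus),
        "disk": classify_disk(usage.used, usage.total),
    }
```

Calling `first_look()` on the node the user complains about returns a small dict such as `{"load": ..., "disk": ...}`. In an interview the numbers matter less than the order of attack: ask the user what "slow" means, measure, then narrow down.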
Andrew Latham
2018-06-13 18:46:44 UTC
Permalink
Bill's question is good and I have heard it many times. Practice answering
that.

Related examples:

Tell me how you would set up and install 100 new servers?
How do you describe to onsite support staff (smart hands) how to move a network
cable to another port on the same system?
At 2am you are troubleshooting an issue from home and you lose connection
to the node. What do you check first?
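For the 2am question, one reasonable habit (an assumption here, not something the thread prescribes) is to test the nearest things first and work outward: your own link, then the way in, then the node, then its management controller. The sketch below runs an ordered list of checks and reports the first one that fails. The host names in the commented usage are hypothetical, and probing a BMC over TCP port 443 assumes it exposes a web interface.

```python
import socket
from typing import Callable, List, Optional, Tuple


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def first_failure(checks: List[Tuple[str, Callable[[], bool]]]) -> Optional[str]:
    """Run the checks in order and return the name of the first one that fails.

    Returns None when every check passes (or the list is empty).
    """
    for name, check in checks:
        if not check():
            return name
    return None


# Hypothetical usage; none of these hosts are real:
# checks = [
#     ("site gateway ssh", lambda: tcp_port_open("gateway.example.org", 22)),
#     ("node ssh",         lambda: tcp_port_open("node042.example.org", 22)),
#     ("node BMC web UI",  lambda: tcp_port_open("node042-bmc.example.org", 443)),
# ]
# print(first_failure(checks))
```

Ordering the checks from closest to farthest means the first failure points at the layer to look at, rather than at the node you happened to be logged in to.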
--
- Andrew "lathama" Latham -
John Hearns via Beowulf
2018-06-13 18:49:49 UTC
Permalink
Bill's question re. "the cluster is slow" is fantastic.
That covers people skills in addition to technical skills.


ps. best of luck. You never know who you might end up working with ;-)
Douglas Eadline
2018-06-13 19:46:07 UTC
Permalink
Add to that "Please identify the situations that justify using a LART?"

https://en.wiktionary.org/wiki/LART

--
Doug
Jonathan Engwall
2018-06-13 21:59:29 UTC
Permalink
Thanks for the great feedback.

LART? What is that?

I am doing a quick start, and just by chance I have been watching an MIT
lecture on SSE.

This is from a YouTuber who calls himself Biscuit. Quick and beautiful with
pragmas and the Mandelbrot set.
Crossing fingers now...
Prentice Bisbal
2018-06-19 16:01:58 UTC
Permalink
Despite the source (just kidding, Bill!) I'm going to have to support
this line of questioning.

HPC (and general IT) consists of systems with many different layers,
and it takes the right personality type, with good analytical skills,
to troubleshoot things effectively. I really don't care about
questioning an interviewee about technical minutiae in an interview.
Most of that can be easily googled these days. The real value/skill is
in coming up with a plan of attack: knowing where to start
troubleshooting, what to google, how to correctly interpret those
google results, and then how to apply what you find online to fix your
problem.

If a sys admin position involves shell programming/scripting, knowing
the details of a specific programming language or processor is
secondary, but thinking like a programmer is a skill not everyone has or
can develop. Just last week I wrote a Lua script without knowing a thing
about Lua. I looked at the example scripts, and then googled to fill in
the blanks. I think most SysAdmins do stuff like that on a regular basis.

If a position involves writing optimized code, I wouldn't apply this
logic - getting really good performance can require knowing the minutiae
of a specific language, but most HPC sys admins don't do that level of
programming.

I also look for people who have hobbies that involve understanding how
things go together and work: working on bicycles or cars, electronics,
etc. If you can understand how other things go together or work,
computers are not that big of a jump.

Prentice
John Hearns via Beowulf
2018-06-19 18:38:29 UTC
Permalink
Post by Prentice Bisbal
If a sys admin position involves shell programming/scripting, knowing the
details of a specific programming language or processor is secondary, but
thinking like a programmer is a skill not everyone has or can develop. Just
last week I wrote a Lua script without knowing a thing about Lua. I looked
at the example scripts, and then googled to fill in the blanks. I think
most SysAdmins do stuff like that on a regular basis.

Nooo.. I would never do that.
At one stage I had a whole shelf of O'Reilly books at work. Still have them
at home, but a bunch went to Textbooks for Africa recently.
My local library gladly accepted some rather heavy tomes on parallel
programming. Goodness knows what the residents of SE London make of them. I
rather hope that some youngster is inspired by one of the books.
Lux, Jim (337K)
2018-06-19 20:57:12 UTC
Permalink
From: Beowulf <beowulf-***@beowulf.org> on behalf of "***@beowulf.org" <***@beowulf.org>
Reply-To: John Hearns <***@googlemail.com>
Date: Tuesday, June 19, 2018 at 11:40 AM
To: "***@beowulf.org" <***@beowulf.org>
Subject: Re: [Beowulf] Working for DUG, new thead
If a sys admin position involves shell programming/scripting, knowing the details of a specific programming language or processor is secondary, but thinking like a programmer is a skill not everyone has or can develop. Just last week I wrote a Lua script without knowing a thing about Lua. I looked at the example scripts, and then googled to fill in the blanks. I think most SysAdmins do stuff like that on a regular basis.
Nooo.. I would never do that.
At one stage I had a whole shelf of O'Reilly books at work. Still have them at home, but a bunch went to Textbooks for Africa recently.
My local library gladly accepted some rather heavy tomes on parallel programming. Goodness knows what the residents of SE London make of them. I rather hope that some youngster is inspired by one of the books.


== some poor kid is going to read Becker’s book, find archives of this list, find RGB’s notes, and go hunting for DEC ethernet cards.. May the computing gods have mercy on his soul..

That said, google is my friend when writing in a new language that’s similar to an old language (Ruby, I’m looking at you)
Jonathan Engwall
2018-06-13 19:28:01 UTC
Permalink
Last month I knocked the CMOS off of the motherboard of my 2950. I ordered
a new one, waited, wrestled the old one free of the hidden riser (riser A).
And then reversed the process with the new...
Then I realized I had knocked the CMOS off of that board too! But then I
noticed the faded green caused by a GPU blowing down on the board,
loosening the glue in the CMOS's general vicinity: it was the board I had
just removed.
The new one was sitting beside me the whole time!
And it now posts and boots just fine.
Post by Andrew Latham
Such a broad topic. I would assume things like DHCP, TFTP, Networking, PXE
and IPMI which come to mind. Troubleshooting tools, configuration
management, version control, monitoring, issue tracking and many other
processes are at play. I would love to hear any interview questions which
could cover this array of topics sanely.
Suggested question: Tell me the best hardware horror story?
On Wed, Jun 13, 2018 at 12:08 PM Jonathan Engwall <
Post by Jonathan Engwall
Stuart Midgley works for DUG? They are currently
recruiting for an HPC manager in London... Interesting...
Recruitment at DUG wants to call me about Low Level HPC. I have at least
until 6pm.
I am excited but also terrified. My background is C and now JavaScript,
mostly online course work and telnet MUDs.
Any suggestions are very much needed.
What must a "low level HPC" know on day 1???
Jonathan Engwall
--
- Andrew "lathama" Latham -
Lux, Jim (337K)
2018-06-14 03:46:37 UTC
Permalink
Naahh. We should work to scare the whatevers out of him..

They’re going to ask about how you designed the adaptive equalizer in your last 10GEthernet ASIC. What specific problems did you have modifying the NetGear FA310TX device driver to accommodate this change? Oh, and by the way, we need a polynomial time, scalable algorithm to solve the generalized knapsack problem. And, what *are* your opinions about temperature profiles in the reflow soldering process, vis a vis the formation of tin whiskers?

A look at the list archives from 2000 should do nicely.


From: Beowulf <beowulf-***@beowulf.org> on behalf of Andrew Latham <***@gmail.com>
Date: Wednesday, June 13, 2018 at 11:28 AM
Cc: "***@beowulf.org" <***@beowulf.org>
Subject: Re: [Beowulf] Working for DUG, new thead

Such a broad topic. I would assume things like DHCP, TFTP, Networking, PXE and IPMI which come to mind. Troubleshooting tools, configuration management, version control, monitoring, issue tracking and many other processes are at play. I would love to hear any interview questions which could cover this array of topics sanely.

Suggested question: Tell me the best hardware horror story?
Stuart Midgley works for DUG? They are currently
recruiting for an HPC manager in London... Interesting...
Recruitment at DUG wants to call me about Low Level HPC. I have at least until 6pm.
I am excited but also terrified. My background is C and now JavaScript, mostly online course work and telnet MUDs.
Any suggestions are very much needed.
What must a "low level HPC" know on day 1???
Jonathan Engwall
***@gmail.com<mailto:***@gmail.com>
--
- Andrew "lathama" Latham -
Fred Youhanaie
2018-06-13 18:49:43 UTC
Permalink
Stuart Midgley works for DUG?  They are currently
recruiting for an HPC manager in London... Interesting...
Recruitment at DUG wants to call me about Low Level HPC. I have at least until 6pm.
I am excited but also terrified. My background is C and now JavaScript, mostly online course work and telnet MUDs.
Any suggestions are very much needed.
What must a "low level HPC" know on day 1???
This is one of the links in Stu's "HPC positions" email from June 11:

https://www.dug.com/careers/positions_vacant/low_level_hpc_developer

Could this be what you're looking for?

Cheers,
Fred
Bill Abbott
2018-06-13 19:01:36 UTC
Permalink
I thought it was for sysadmin, not developer. Disregard most of what I
said.

Bill
 > Stuart Midgley works for DUG?  They are currently
 > recruiting for an HPC manager in London... Interesting...
Recruitment at DUG wants to call me about Low Level HPC. I have at least until 6pm.
I am excited but also terrified. My background is C and now
JavaScript, mostly online course work and telnet MUDs.
Any suggestions are very much needed.
What must a "low level HPC" know on day 1???
https://www.dug.com/careers/positions_vacant/low_level_hpc_developer
Could this be what you're looking for?
Cheers,
Fred
Ryan Novosielski
2018-06-13 19:23:21 UTC
Permalink
Stuart Midgley works for DUG? They are currently
recruiting for an HPC manager in London... Interesting...
Recruitment at DUG wants to call me about Low Level HPC. I have at least until 6pm.
I am excited but also terrified. My background is C and now JavaScript, mostly online course work and telnet MUDs.
Any suggestions are very much needed.
What must a "low level HPC" know on day 1???
As others have said, it does depend on what “low level” means: e.g. entry level, or low level in the hardware sense, as Prentice said. In general, though, I’d say that you should have some working knowledge of what a job scheduler is, be it SLURM or SGE or PBS or whatever, the major components, the concepts behind it. It’s almost a given that they’re using a job scheduler. You should also probably at least have a concept of how MPI works, and what sort of interconnects and storage one tends to encounter in HPC.
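
For anyone new to the idea, the core of what a batch scheduler does can be sketched in a few lines. This is only a toy first-come-first-served model with made-up job names and node counts; it is not how SLURM, SGE or PBS are implemented, and real schedulers add priorities, backfill, time limits and much more.

```python
from collections import deque


def fifo_schedule(total_nodes, jobs):
    """Toy first-come-first-served scheduler (illustrative only).

    jobs: list of (name, nodes_needed, runtime) tuples in submission order.
    Returns {name: start_time}. The job at the head of the queue blocks
    everything behind it until enough nodes are free (no backfill).
    """
    queue = deque(jobs)
    running = []            # (end_time, nodes) for jobs currently on the cluster
    free = total_nodes
    now = 0
    starts = {}
    while queue:
        name, need, runtime = queue[0]
        if need > total_nodes:
            raise ValueError(f"{name} can never run: needs {need} nodes")
        if need <= free:
            queue.popleft()
            starts[name] = now
            free -= need
            running.append((now + runtime, need))
        else:
            # Advance the clock to the next job completion and reclaim its nodes.
            running.sort()
            end_time, nodes = running.pop(0)
            now = end_time
            free += nodes
    return starts


# Hypothetical workload on a 4-node cluster.
starts = fifo_schedule(4, [("a", 2, 10), ("b", 2, 5), ("c", 4, 3), ("d", 1, 1)])
print(starts)
```

In this workload the one-node job "d" waits behind the four-node job "c" even though nodes are idle, which is the kind of gap that backfill in a real scheduler tries to close.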

They also may have some information on their website about their systems, depending who they are, which might help you target your review.

--
____
|| \\UTGERS, |---------------------------*O*---------------------------
||_// the State | Ryan Novosielski - ***@rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
|| \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark
`'
Jonathan Engwall
2018-06-13 19:32:03 UTC
Permalink
Yes. Very good idea, it is a mining company in Australia.
Post by Jonathan Engwall
On Jun 13, 2018, at 1:07 PM, Jonathan Engwall <
Stuart Midgley works for DUG? They are currently
recruiting for an HPC manager in London... Interesting...
Recruitment at DUG wants to call me about Low Level HPC. I have at least
until 6pm.
I am excited but also terrified. My background is C and now JavaScript,
mostly online course work and telnet MUDs.
Any suggestions are very much needed.
What must a "low level HPC" know on day 1???
As others have said, it does depend on what “low level” means — eg. entry
level or low level in the hardware sense, as Prentice said. In general,
though, I’d say that you should have some working knowledge of what a job
scheduler is, be it SLURM or SGE or PBS or whatever, the major components,
the concepts behind it. It’s almost a given that they’re using a job
scheduler. You should also probably at least have a concept of how MPI
works, and what sort of interconnects and storage one tends to encounter in
HPC.
They also may have some information on their website about their systems,
depending who they are, which might help you target your review.
--
____
|| \\UTGERS, |---------------------------*O*---------------------------
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
|| \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark
`'
Stu Midgley
2018-06-14 01:13:13 UTC
Permalink
OR... we could answer your questions directly...

Stu (who has worked at DUG for 12 years, built and designed the systems...
was CTO and now Systems Architect)

On Thu, Jun 14, 2018 at 3:32 AM Jonathan Engwall <
Post by Jonathan Engwall
Yes. Very good idea, it is a mining company in Australia.
On Jun 13, 2018, at 1:07 PM, Jonathan Engwall <
Stuart Midgley works for DUG? They are currently
recruiting for an HPC manager in London... Interesting...
Recruitment at DUG wants to call me about Low Level HPC. I have at
least until 6pm.
I am excited but also terrified. My background is C and now JavaScript,
mostly online course work and telnet MUDs.
Any suggestions are very much needed.
What must a "low level HPC" know on day 1???
As others have said, it does depend on what “low level” means — eg. entry
level or low level in the hardware sense, as Prentice said. In general,
though, I’d say that you should have some working knowledge of what a job
scheduler is, be it SLURM or SGE or PBS or whatever, the major components,
the concepts behind it. It’s almost a given that they’re using a job
scheduler. You should also probably at least have a concept of how MPI
works, and what sort of interconnects and storage one tends to encounter in
HPC.
They also may have some information on their website about their systems,
depending who they are, which might help you target your review.
--
____
|| \\UTGERS, |---------------------------*O*---------------------------
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
|| \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark
`'
--
Dr Stuart Midgley
***@gmail.com
Stu Midgley
2018-06-14 01:39:27 UTC
Permalink
seismic processing - oil'n'gas not hard rock.


On Thu, Jun 14, 2018 at 3:32 AM Jonathan Engwall <
Post by Jonathan Engwall
Yes. Very good idea, it is a mining company in Australia.
--
Dr Stuart Midgley
***@gmail.com
Jonathan Engwall
2018-06-14 02:02:31 UTC
Permalink
64 bit words have me wondering. I have been interested in monolithic lately
too.
Well, I hope to get a call...and did. I think it went well. She did not
find my pip3 story exciting.
And thanks again everyone.
Post by Stu Midgley
seismic processing - oil'n'gas not hard rock.
On Thu, Jun 14, 2018 at 3:32 AM Jonathan Engwall <
Post by Jonathan Engwall
Yes. Very good idea, it is a mining company in Australia.
--
Dr Stuart Midgley
Joe Landman
2018-06-14 02:24:30 UTC
Permalink
Post by Stu Midgley
seismic processing - oil'n'gas not hard rock.
But ... you work with heavy metal (supercomputers that is) :D

Good to see you active here, BTW!
Post by Stu Midgley
On Thu, Jun 14, 2018 at 3:32 AM Jonathan Engwall
Yes. Very good idea, it is a mining company in Australia.
--
Dr Stuart Midgley
--
Joe Landman
e: ***@gmail.com
t: @hpcjoe
c: +1 734 612 4615
w: https://scalability.org
g: https://github.com/joelandman
l: https://www.linkedin.com/in/joelandman
Stu Midgley
2018-06-14 02:34:04 UTC
Permalink
seismic processing is a MASSIVE user of compute resources. We run single
processing steps that can take months on a 10PFlop machine...
Post by Stu Midgley
seismic processing - oil'n'gas not hard rock.
But ... you work with heavy metal (supercomputers that is) :D
Good to see you active here, BTW!
On Thu, Jun 14, 2018 at 3:32 AM Jonathan Engwall <
Post by Jonathan Engwall
Yes. Very good idea, it is a mining company in Australia.
--
Dr Stuart Midgley
--
Joe Landman
c: +1 734 612 4615
w: https://scalability.org
g: https://github.com/joelandman
l: https://www.linkedin.com/in/joelandman
--
Dr Stuart Midgley
***@gmail.com
Lux, Jim (337K)
2018-06-14 03:53:25 UTC
Permalink
And you drill through hard rock to get to the hydrocarbons.

But Stu is right – it’s one of the more interesting massively parallel (but not necessarily embarrassingly parallel) problems, because the propagation medium is anisotropic. By comparison SAR (Synthetic Aperture Radar), tomography, and finite element electromagnetics processing are a doddle. Some mechanical FEM codes are tricky, and non-linear flow (shock waves, etc.) is complex.

Everyone who deals with these sorts of problems is all about efficiency and “good approximations” – you typically start with a linearized approximation, and then iterate with something to get the nonlinear solution – all of this takes lots o’ cycles.
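
As a toy illustration of the linearize-then-iterate pattern described above (nothing to do with real seismic codes), Newton's method solves a nonlinear equation by repeatedly solving its local linear approximation. The function names and the example equation are assumptions picked for the sketch.

```python
def newton_solve(f, dfdx, x0, tol=1e-12, max_iter=50):
    """Solve f(x) = 0 by repeatedly solving the linearized problem.

    At each step f is replaced by its tangent line at the current guess,
    f(x) ~ f(x_k) + dfdx(x_k) * (x - x_k), and the root of that line
    becomes the next guess.
    """
    x = x0
    for iteration in range(1, max_iter + 1):
        step = f(x) / dfdx(x)   # solve the linear model for the correction
        x -= step
        if abs(step) < tol:     # corrections have become negligible
            return x, iteration
    raise RuntimeError("did not converge")


# Hypothetical example: solve x**3 - 2*x - 5 = 0 starting from x = 2.
root, iterations = newton_solve(
    f=lambda x: x**3 - 2 * x - 5,
    dfdx=lambda x: 3 * x**2 - 2,
    x0=2.0,
)
print(root, iterations)
```

In a large simulation each "solve the linear model" step is itself a big parallel linear solve, which is where the cycles go.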


From: Beowulf <beowulf-***@beowulf.org> on behalf of Stu Midgley <***@gmail.com>
Reply-To: "***@gmail.com" <***@gmail.com>
Date: Wednesday, June 13, 2018 at 7:35 PM
To: "***@gmail.com" <***@gmail.com>
Cc: "***@beowulf.org" <***@beowulf.org>
Subject: Re: [Beowulf] Working for DUG, new thead

seismic processing is a MASSIVE user of compute resources. We run single processing steps that can take months on a 10PFlop machine...


On Thu, Jun 14, 2018 at 10:25 AM Joe Landman <***@gmail.com<mailto:***@gmail.com>> wrote:



On 6/13/18 9:39 PM, Stu Midgley wrote:
seismic processing - oil'n'gas not hard rock.

But ... you work with heavy metal (supercomputers that is) :D

Good to see you active here, BTW!



On Thu, Jun 14, 2018 at 3:32 AM Jonathan Engwall <***@gmail.com<mailto:***@gmail.com>> wrote:

Yes. Very good idea, it is a mining company in Australia.
--
Dr Stuart Midgley
***@gmail.com<mailto:***@gmail.com>



--
Joe Landman

e: ***@gmail.com<mailto:***@gmail.com>

t: @hpcjoe

c: +1 734 612 4615

w: https://scalability.org

g: https://github.com/joelandman

l: https://www.linkedin.com/in/joelandman
--
Dr Stuart Midgley
***@gmail.com<mailto:***@gmail.com>
Stu Midgley
2018-06-14 01:14:07 UTC
Permalink
We aren't that scary...

On Thu, Jun 14, 2018 at 1:08 AM Jonathan Engwall <
Post by Jonathan Engwall
Stuart Midgley works for DUG? They are currently
recruiting for an HPC manager in London... Interesting...
Recruitment at DUG wants to call me about Low Level HPC. I have at least
until 6pm.
I am excited but also terrified. My background is C and now JavaScript,
mostly online course work and telnet MUDs.
Any suggestions are very much needed.
What must a "low level HPC" know on day 1???
Jonathan Engwall
--
Dr Stuart Midgley
***@gmail.com
Stu Midgley
2018-06-14 01:17:05 UTC
Permalink
low level HPC means... lots of things. BUT we are a huge Xeon Phi shop and
need low-level programmers, i.e. avx512, careful cache/memory management (NOT
openmp/compiler vectorisation etc).
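
For readers wondering what "careful cache/memory management" can look like in practice, one common technique is loop tiling (blocking). The sketch below is an assumption about the general idea, not DUG's code. It is written in Python only to show the traversal order, so it will not show the speedup that a C or intrinsics version would; the tile size of 4 is arbitrary.

```python
def transpose_naive(src, n):
    """Transpose an n x n matrix stored row-major in a flat list."""
    dst = [0] * (n * n)
    for i in range(n):
        for j in range(n):
            # Reads walk src sequentially, but writes jump n elements each time.
            dst[j * n + i] = src[i * n + j]
    return dst


def transpose_tiled(src, n, tile=4):
    """Same result, visiting the matrix one tile x tile block at a time.

    Within a block both the reads and the writes stay inside a small
    region of memory, which is the property that helps a real cache.
    """
    dst = [0] * (n * n)
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            for i in range(ii, min(ii + tile, n)):
                for j in range(jj, min(jj + tile, n)):
                    dst[j * n + i] = src[i * n + j]
    return dst


n = 10                      # deliberately not a multiple of the tile size
matrix = list(range(n * n))
same = transpose_naive(matrix, n) == transpose_tiled(matrix, n)
print("tiled and naive agree:", same)
```

On real hardware the tile size would be chosen so that a block of the source and a block of the destination fit in cache together.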



On Thu, Jun 14, 2018 at 1:08 AM Jonathan Engwall <
Post by Jonathan Engwall
Stuart Midgley works for DUG? They are currently
recruiting for an HPC manager in London... Interesting...
Recruitment at DUG wants to call me about Low Level HPC. I have at least
until 6pm.
I am excited but also terrified. My background is C and now JavaScript,
mostly online course work and telnet MUDs.
Any suggestions are very much needed.
What must a "low level HPC" know on day 1???
Jonathan Engwall
--
Dr Stuart Midgley
***@gmail.com
Joe Landman
2018-06-14 02:32:17 UTC
Permalink
I'm curious about your next gen plans, given Phi's roadmap.
low level HPC means... lots of things.  BUT we are a huge Xeon Phi
shop and need low-level programmers ie. avx512, careful cache/memory
management (NOT openmp/compiler vectorisation etc).
I played around with avx512 in my rzf code.
https://github.com/joelandman/rzf/blob/master/avx2/rzf_avx512.c  . Never
really spent a great deal of time on it, other than noting that using
avx512 seemed to downclock the core a bit on Skylake.

Which dev/toolchain are you using for Phi?  I set up the MPSS bit for a
customer, and it was pretty bad (2.6.32 kernel, etc.).  Flaky control
plane, and a painful host->coprocessor interface.  Did you develop your
own?  Definitely curious.
On Thu, Jun 14, 2018 at 1:08 AM Jonathan Engwall
Stuart Midgley works for DUG?  They are currently
recruiting for an HPC manager in London... Interesting...
Recruitment at DUG wants to call me about Low Level HPC. I have at
least until 6pm.
I am excited but also terrified. My background is C and now
JavaScript, mostly online course work and telnet MUDs.
Any suggestions are very much needed.
What must a "low level HPC" know on day 1???
Jonathan Engwall
--
Dr Stuart Midgley
--
Joe Landman
e: ***@gmail.com
t: @hpcjoe
c: +1 734 612 4615
w: https://scalability.org
g: https://github.com/joelandman
l: https://www.linkedin.com/in/joelandman
Stu Midgley
2018-06-14 04:02:47 UTC
Permalink
Phi is dead... Long live phi...

By which I mean, while the Phi as a chip is going away, its concepts live
on. Massive number of cores, large vectorisation and high speed memory
(and fucking high heat load - we do ~350W/socket). So, while the product
code will disappear, phi lives on.

For KNC I did a lot of customisation to MPSS to get it to work... and we
haven't been able to shift from one of the very early versions. We love the
KNC... we get 8 in 2RU which is awesome density (1.1kW/RU)

For KNL it's just x86 with a big vectorisation unit (700W/RU).

In both cases you have to be very very careful how you manage memory.
Post by Joe Landman
I'm curious about your next gen plans, given Phi's roadmap.
low level HPC means... lots of things. BUT we are a huge Xeon Phi shop
and need low-level programmers ie. avx512, careful cache/memory management
(NOT openmp/compiler vectorisation etc).
I played around with avx512 in my rzf code.
https://github.com/joelandman/rzf/blob/master/avx2/rzf_avx512.c . Never
really spent a great deal of time on it, other than noting that using
avx512 seemed to downclock the core a bit on Skylake.
Which dev/toolchain are you using for Phi? I set up the MPSS bit for a
customer, and it was pretty bad (2.6.32 kernel, etc.). Flaky control
plane, and a painful host->coprocessor interface. Did you develop your
own? Definitely curious.
On Thu, Jun 14, 2018 at 1:08 AM Jonathan Engwall <
Post by Jonathan Engwall
Stuart Midgley works for DUG? They are currently
recruiting for an HPC manager in London... Interesting...
Recruitment at DUG wants to call me about Low Level HPC. I have at least
until 6pm.
I am excited but also terrified. My background is C and now JavaScript,
mostly online course work and telnet MUDs.
Any suggestions are very much needed.
What must a "low level HPC" know on day 1???
Jonathan Engwall
--
Dr Stuart Midgley
--
Joe Landman
c: +1 734 612 4615
w: https://scalability.org
g: https://github.com/joelandman
l: https://www.linkedin.com/in/joelandman
--
Dr Stuart Midgley
***@gmail.com
Ryan Novosielski
2018-06-19 21:48:19 UTC
Permalink
We bought KNC a long time ago and keep meaning to get them to a place where they can be used and just haven’t. Do you mount filesystems from them? We have GPFS storage, primarily, and would have to re-export it via NFS I suppose if we want the cards to use that storage. I’ve seen complaints about the stability of that setup. I didn’t try to build the GPFS portability layer for Phi — not sure whether to think it would or wouldn’t work (I guess I’d be inclined to doubt it).
Post by Stu Midgley
Phi is dead... Long live phi...
By which I mean, while the Phi as a chip is going away, its concepts live on. Massive number of cores, large vectorisation and high speed memory (and fucking high heat load - we do ~350W/socket). So, while the product code will disappear, phi lives on.
For KNC I did a lot of customisation to MPSS to get it to work... and we haven't been able to shift from one of the very early versions. We love the KNC... we get 8 in 2RU which is awesome density (1.1kW/RU)
For KNL it's just x86 with a big vectorisation unit (700W/RU).
In both cases you have to be very very careful how you manage memory.
I'm curious about your next gen plans, given Phi's roadmap.
low level HPC means... lots of things. BUT we are a huge Xeon Phi shop and need low-level programmers ie. avx512, careful cache/memory management (NOT openmp/compiler vectorisation etc).
I played around with avx512 in my rzf code. https://github.com/joelandman/rzf/blob/master/avx2/rzf_avx512.c . Never really spent a great deal of time on it, other than noting that using avx512 seemed to downclock the core a bit on Skylake.
Which dev/toolchain are you using for Phi? I set up the MPSS bit for a customer, and it was pretty bad (2.6.32 kernel, etc.). Flaky control plane, and a painful host->coprocessor interface. Did you develop your own? Definitely curious.
Stuart Midgley works for DUG? They are currently
recruiting for an HPC manager in London... Interesting...
Recruitment at DUG wants to call me about Low Level HPC. I have at least until 6pm.
I am excited but also terrified. My background is C and now JavaScript, mostly online course work and telnet MUDs.
Any suggestions are very much needed.
What must a "low level HPC" know on day 1???
Jonathan Engwall
--
Dr Stuart Midgley
--
Joe Landman
c: +1 734 612 4615
https://scalability.org
https://github.com/joelandman
https://www.linkedin.com/in/joelandman
--
Dr Stuart Midgley
--
____
|| \\UTGERS, |---------------------------*O*---------------------------
||_// the State | Ryan Novosielski - ***@rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
|| \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark
`'
Stu Midgley
2018-06-20 02:49:36 UTC
Permalink
we initially used them as standalone systems (i.e. rsh a code onto them and
run it)

today we use them in offload mode (i.e. the host pushes memory+commands
onto them and pulls the results off - all via pragmas).

our last KNC systems were 2RU with 8x7120 phi's... which is a 2.1kW
system. They absolutely fly...
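
To put the density figure in context, here is the arithmetic as a small sketch. The 2.1 kW and 2RU numbers are from the post above; the 42U rack height is an assumption for illustration and is not stated anywhere in the thread.

```python
SYSTEM_POWER_KW = 2.1      # from the post: one 2RU chassis with 8 cards
SYSTEM_HEIGHT_RU = 2
RACK_HEIGHT_RU = 42        # assumption: a full-height rack, not stated in the thread


def power_density_kw_per_ru(power_kw, height_ru):
    """Power drawn per rack unit of height."""
    return power_kw / height_ru


def rack_power_kw(power_kw, height_ru, rack_ru):
    """Total draw if a rack is filled with identical systems."""
    systems_per_rack = rack_ru // height_ru
    return systems_per_rack * power_kw


density = power_density_kw_per_ru(SYSTEM_POWER_KW, SYSTEM_HEIGHT_RU)
full_rack = rack_power_kw(SYSTEM_POWER_KW, SYSTEM_HEIGHT_RU, RACK_HEIGHT_RU)
print(f"{density:.2f} kW/RU, {full_rack:.1f} kW for a full rack")
```

That works out to about 1.05 kW per rack unit, and a rack filled with such systems (under the 42U assumption) would draw roughly 44 kW before counting switches or anything else.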
Post by Ryan Novosielski
We bought KNC a long time ago and keep meaning to get them to a place
where they can be used and just haven’t. Do you mount filesystems from
them? We have GPFS storage, primarily, and would have to re-export it via
NFS I suppose if we want the cards to use that storage. I’ve seen
complaints about the stability of that setup. I didn’t try to build the
GPFS portability layer for Phi — not sure whether to think it would or
wouldn’t work (I guess I’d be inclined to doubt it).
Post by Stu Midgley
Phi is dead... Long live phi...
By which I mean, while the Phi as a chip is going away, its concepts
live on. Massive number of cores, large vectorisation and high speed
memory (and fucking high heat load - we do ~350W/socket). So, while the
product code will disappear, phi lives on.
Post by Stu Midgley
For KNC I did a lot of customisation to MPSS to get it to work... and we
haven't been able to shift from one of the very early version. We love the
KNC... we get 8 in 2RU which is awesome density (1.1kW/RU)
Post by Stu Midgley
For KNL its just x86 with a big vectorisation unit (700W/RU).
In both cases you have to be very very careful how you manage memory.
I'm curious about your next gen plans, given Phi's roadmap.
Post by Stu Midgley
low level HPC means... lots of things. BUT we are a huge Xeon Phi shop
and need low-level programmers ie. avx512, careful cache/memory management
(NOT openmp/compiler vectorisation etc).
Post by Stu Midgley
I played around with avx512 in my rzf code.
https://github.com/joelandman/rzf/blob/master/avx2/rzf_avx512.c . Never
really spent a great deal of time on it, other than noting that using
avx512 seemed to downclock the core a bit on Skylake.
Post by Stu Midgley
Which dev/toolchain are you using for Phi? I set up the MPSS bit for a
customer, and it was pretty bad (2.6.32 kernel, etc.). Flaky control
plane, and a painful host->coprocessor interface. Did you develop your
own? Definitely curious.
Post by Stu Midgley
Post by Stu Midgley
On Thu, Jun 14, 2018 at 1:08 AM Jonathan Engwall <
Stuart Midgley works for DUG? They are currently
recruiting for an HPC manager in London... Interesting...
Recruitment at DUG wants to call me about Low Level HPC. I have at
least until 6pm.
Post by Stu Midgley
Post by Stu Midgley
I am excited but also terrified. My background is C and now JavaScript,
mostly online course work and telnet MUDs.
Post by Stu Midgley
Post by Stu Midgley
Any suggestions are very much needed.
What must a "low level HPC" know on day 1???
Jonathan Engwall
--
Dr Stuart Midgley
--
Joe Landman
c: +1 734 612 4615
https://scalability.org
https://github.com/joelandman
https://www.linkedin.com/in/joelandman
Post by Stu Midgley
--
Dr Stuart Midgley
--
____
|| \\UTGERS, |---------------------------*O*---------------------------
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
|| \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark
`'
--
Dr Stuart Midgley
***@gmail.com
John Hearns via Beowulf
2018-06-20 03:57:34 UTC
Permalink
This thread is going fast!
I often wonder if that misleading marketing is one of the reasons why the
Xeon Phi has already been canned. I know a lot of people who were excited
for the Xeon Phi, but I don't know any who ever bought the Xeon Phis once
they came out.

In the UK at my last company we had a customer in the defence sector who
bought lots of Xeon Phi. Great guy, full of enthusiasm and good to work
with (Hello Kirk!)
They were installed with IBM Platform before I joined the company. I
re-installed the cluster with Bright which brought it up to date.
That is the cluster which used Teradici PCOIP to connect via secure fibre
optic links.
we initially used them as standalone systems (ie. rsh a code onto them and
run it)
today we use them in offload mode (ie. the host would push memory+commands
onto them and pull the results off - all via pragmas ).
our last KNC systems were 2RU with 8x7120 phi's... which is a 2.1kW
system. They absolutely fly...
Post by Ryan Novosielski
We bought KNC a long time ago and keep meaning to get them to a place
where they can be used and just haven’t. Do you mount filesystems from
them? We have GPFS storage, primarily, and would have to re-export it via
NFS I suppose if we want the cards to use that storage. I’ve seen
complaints about the stability of that setup. I didn’t try to build the
GPFS portability layer for Phi — not sure whether to think it would or
wouldn’t work (I guess I’d be inclined to doubt it).
Post by Stu Midgley
Phi is dead... Long live phi...
By which I mean, while the Phi as a chip is going away, its concepts
live on. Massive number of cores, large vectorisation and high speed
memory (and fucking high heat load - we do ~350W/socket). So, while the
product code will disappear, phi lives on.
For KNC I did a lot of customisation to MPSS to get it to work... and
we haven't been able to shift from one of the very early versions. We love
the KNC... we get 8 in 2RU which is awesome density (1.1kW/RU)
For KNL it's just x86 with a big vectorisation unit (700W/RU).
In both cases you have to be very very careful how you manage memory.
Post by Joe Landman
I'm curious about your next gen plans, given Phi's roadmap.
Post by Stu Midgley
low level HPC means... lots of things. BUT we are a huge Xeon Phi
shop and need low-level programmers ie. avx512, careful cache/memory
management (NOT openmp/compiler vectorisation etc).
Post by Joe Landman
I played around with avx512 in my rzf code.
https://github.com/joelandman/rzf/blob/master/avx2/rzf_avx512.c .
Never really spent a great deal of time on it, other than noting that using
avx512 seemed to downclock the core a bit on Skylake.
Which dev/toolchain are you using for Phi? I set up the MPSS bit for a
customer, and it was pretty bad (2.6.32 kernel, etc.). Flaky control
plane, and a painful host->coprocessor interface. Did you develop your
own? Definitely curious.
John Hearns via Beowulf
2018-06-20 04:07:47 UTC
Permalink
Post by Lux, Jim (337K)
I've been intrigued recently about using GPUs for signal processing kinds
of things.. There's not much difference between calculating vertices of
triangles and doing FIR filters.

Rather than look at hardware per se, how about learning about the Julia
language for this task?
I was discussing signal processing with someone who works with hearing
aids, they code in Julia. I sadly missed his talk at the Meetup in
Eindhoven.

https://discourse.julialang.org/c/domain/dsp


More on topic, I am not sure how well Julia is suited to Xeon Phi at the
moment. Thread support in Julia is still developing:
https://docs.julialang.org/en/latest/base/multi-threading/
It would be interesting to see if Julia will run on Xeon Phi. Maybe a
certain geophysics company could have codes written in one language which
would do both the heavy duty processing and the visualization.
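
To make the quoted FIR comparison concrete: each output sample of an FIR filter is a dot product of the filter taps with a window of the input, much as each transformed vertex coordinate is a dot product of a matrix row with the vertex. A minimal C sketch follows; the function name `fir_filter` and its argument layout are illustrative assumptions, not taken from anyone's code in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Direct-form FIR: y[j] = sum_k h[k] * x[j + ntaps - 1 - k].
 * Only the "valid" region is computed, so y receives
 * n - ntaps + 1 samples. */
void fir_filter(const float *x, size_t n,
                const float *h, size_t ntaps, float *y)
{
    assert(ntaps > 0 && n >= ntaps);
    for (size_t i = ntaps - 1; i < n; i++) {
        float acc = 0.0f;
        for (size_t k = 0; k < ntaps; k++) {
            acc += h[k] * x[i - k];   /* one multiply-add per tap */
        }
        y[i - (ntaps - 1)] = acc;
    }
}
```

The inner loop is the same multiply-accumulate pattern that GPUs and wide vector units are built for, which is presumably the point of the comparison.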
John Hearns via Beowulf
2018-06-20 04:17:00 UTC
Permalink
I should do my research...
The Celeste project is the poster child for Julia
https://www.nextplatform.com/2017/11/28/julia-language-delivers-petascale-hpc-performance/

They use up to 8092 Xeon Phi nodes at NERSC with threads...
The per-thread runtime graph there is interesting: only a small fraction of
time is spent in the Intel MKL libraries.
That stands out because, when performance is discussed on the Julia Discourse
forum, it is often said that the BLAS libraries etc. dominate,
i.e. the runtime is determined by the choice of system libraries and not by Julia.
But of course this depends on the code - I suppose the Celeste code is not
using those functions.
Prentice Bisbal
2018-06-20 14:04:40 UTC
Permalink
But what was his experience with the Phis? Was he happy with them? Do
you know how much he had to rework his code to get good performance out
of them?
Post by John Hearns via Beowulf
In the UK at my last company we had a customer in the defence sector
who bought lots of Xeon Phi. Great guy, full of enthusiasm and good to
work with (Hello Kirk!)
Prentice Bisbal
2018-06-19 18:47:00 UTC
Permalink
Post by Joe Landman
I'm curious about your next gen plans, given Phi's roadmap.
Post by Stu Midgley
low level HPC means... lots of things.  BUT we are a huge Xeon Phi
shop and need low-level programmers ie. avx512, careful cache/memory
management (NOT openmp/compiler vectorisation etc).
Post by Joe Landman
I played around with avx512 in my rzf code.
https://github.com/joelandman/rzf/blob/master/avx2/rzf_avx512.c .
Never really spent a great deal of time on it, other than noting that
using avx512 seemed to downclock the core a bit on Skylake.
If you organize your code correctly, and call the compiler with the
right optimization flags, shouldn't the compiler automatically handle a
good portion of this 'low-level' stuff? I understand that hand-coding
this stuff usually still gives you the best performance (see
GotoBLAS/OpenBLAS, for example), but does your average HPC programmer
trying to get decent performance need to hand-code that stuff, too?
Benson Muite
2018-06-19 18:57:12 UTC
Permalink
Post by Prentice Bisbal
If you organize your code correctly, and call the compiler with the
right optimization flags, shouldn't the compiler automatically handle a
good portion of this 'low-level' stuff? I understand that hand-coding
this stuff usually still give you the best performance (See
GotoBLAS/OpenBLAS, for example), but does your average HPC programmer
trying to get decent performance need to hand-code that stuff, too?
Unfortunately, for most codes and programming languages, yes.
Joe Landman
2018-06-19 19:10:28 UTC
Permalink
Post by Prentice Bisbal
If you organize your code correctly, and call the compiler with the
right optimization flags, shouldn't the compiler automatically handle
a good portion of this 'low-level' stuff?
I wish it would do it well, but it turns out it doesn't do a good job.  
You have to pay very careful attention to almost all aspects of making
it simple for the compiler, and then constraining the directions it
takes with code gen.

I explored this with my RZF stuff.  It turns out that with -O3, gcc (5.x
and 6.x) would convert a library call for the power function into an FP
instruction.  But it would use 1/8 - 1/4 of the XMM/YMM register width,
not automatically unroll loops, or leverage the vector nature of the
problem.
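
To illustrate the kind of hand restructuring being described, here is a sketch (not the actual rzf code) of a partial sum of the Riemann zeta function written two ways: a plain loop, and a version unrolled by four with independent accumulators. The function names, and the use of repeated multiplication for an integer exponent instead of a `pow()` call, are assumptions made for the example.

```c
#include <assert.h>

/* Integer power by repeated multiplication (avoids a pow() call). */
static double ipow(double x, int n)
{
    double r = 1.0;
    for (int i = 0; i < n; i++) r *= x;
    return r;
}

/* Straightforward partial sum: sum_{k=1..terms} 1/k^s. */
double zeta_naive(int s, long terms)
{
    double sum = 0.0;
    for (long k = 1; k <= terms; k++) sum += 1.0 / ipow((double)k, s);
    return sum;
}

/* Same sum, unrolled by 4. The four accumulators do not depend on
 * each other, so the chains can be mapped onto separate SIMD lanes. */
double zeta_unrolled(int s, long terms)
{
    assert(terms >= 0);
    double a0 = 0.0, a1 = 0.0, a2 = 0.0, a3 = 0.0;
    long k = 1;
    for (; k + 3 <= terms; k += 4) {
        a0 += 1.0 / ipow((double)k,       s);
        a1 += 1.0 / ipow((double)(k + 1), s);
        a2 += 1.0 / ipow((double)(k + 2), s);
        a3 += 1.0 / ipow((double)(k + 3), s);
    }
    double sum = (a0 + a1) + (a2 + a3);
    for (; k <= terms; k++) sum += 1.0 / ipow((double)k, s); /* remainder */
    return sum;
}
```

The two versions add the terms in a different order, so the results can differ in the last few bits; that reordering is also one reason a compiler will not make this transformation on floating-point code unless told it may.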

Basically, not much has changed in 20+ years ... you annotate your code
with pragmas and similar, or use instruction primitives and give up on
the optimizer/code generator.
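
As a small example of the "annotate with pragmas" route, the sketch below marks a reduction loop with the standard OpenMP `simd` directive. The function name `dot_simd` is hypothetical, and the pragma only takes effect when the compiler is asked to honour it (for example `-fopenmp-simd` with GCC or Clang); otherwise it is ignored and the loop still computes the same result.

```c
#include <assert.h>
#include <stddef.h>

/* Dot product with an explicit vectorization hint.
 * restrict promises the compiler that a and b do not alias;
 * reduction(+:sum) permits reordering of the floating-point adds. */
double dot_simd(const double *restrict a, const double *restrict b, size_t n)
{
    assert(a != NULL && b != NULL);
    double sum = 0.0;
    #pragma omp simd reduction(+:sum)
    for (size_t i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}
```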

When it comes down to it, compilers aren't really as smart as many of us
would like.  Converting idiomatic code into efficient assembly isn't
what they are designed for; they are designed to produce correct assembly.
Correct doesn't mean efficient in many cases, and some of the less obvious
optimizations that we might think to be beneficial are not taken. We can
hand-modify the code and see whether these optimizations are beneficial, but
the compilers often are not looking at the problem holistically.
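
For comparison, the "instruction primitives" route looks roughly like the sketch below: an `axpy`-style update written with AVX-512F intrinsics from `<immintrin.h>`. This is an illustrative example, not code from rzf. The intrinsic path is compiled only when the compiler targets AVX-512F (for example with `-mavx512f`); otherwise the plain scalar loop does all the work, so the function behaves the same everywhere.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

/* y[i] += alpha * x[i] */
void axpy(double alpha, const double *x, double *y, size_t n)
{
    assert(x != NULL && y != NULL);
    size_t i = 0;
#if defined(__AVX512F__)
    const __m512d va = _mm512_set1_pd(alpha);   /* broadcast alpha to 8 lanes */
    for (; i + 8 <= n; i += 8) {
        __m512d vx = _mm512_loadu_pd(x + i);    /* unaligned 8-double loads */
        __m512d vy = _mm512_loadu_pd(y + i);
        vy = _mm512_fmadd_pd(va, vx, vy);       /* vy = va * vx + vy */
        _mm512_storeu_pd(y + i, vy);
    }
#endif
    for (; i < n; i++) {                        /* scalar tail or fallback */
        y[i] += alpha * x[i];
    }
}
```

Here the register width and the fused multiply-add are chosen by the programmer rather than left to the optimizer, at the cost of portability and readability.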
Post by Prentice Bisbal
I understand that hand-coding this stuff usually still give you the
best performance (See GotoBLAS/OpenBLAS, for example), but does your
average HPC programmer trying to get decent performance need to
hand-code that stuff, too?
Generally, yes.  Optimizing serial code for GPUs doesn't work well.
Rewriting for GPUs (e.g. taking into account the GPU data/compute flow
architecture) does work well.
--
Joe Landman
e: ***@gmail.com
t: @hpcjoe
w: https://scalability.org
g: https://github.com/joelandman
l: https://www.linkedin.com/in/joelandman

Jonathan Engwall
2018-06-19 20:52:58 UTC
Permalink
I think the boundary between a final product and the start of a project
separates these two viewpoints.
Lately, I have short stacks of O'Reillys scattered about, from libraries, and a
second stack of notebooks filled with every command that really did work.
And I think it is fun.
Jonathan
Post by Joe Landman
I understand that hand-coding this stuff usually still give you the best
performance (See GotoBLAS/OpenBLAS, for example), but does your average HPC
programmer trying to get decent performance need to hand-code that stuff,
too?
Generally, yes. Optimizing serial code for GPUs doesn't work well.
Rewriting for GPUs (e.g. taking into account the GPU data/compute flow
architecture) does work well.
--
Joe Landman
e: ***@gmail.com
t: @hpcjoe
w: https://scalability.org
g: https://github.com/joelandman
l: https://www.linkedin.com/in/joelandman

Lux, Jim (337K)
2018-06-19 20:59:47 UTC
Permalink
On 6/19/18, 12:11 PM, "Beowulf on behalf of Joe Landman" <beowulf-***@beowulf.org on behalf of ***@gmail.com> wrote:


Generally, yes. Optimizing serial code for GPUs doesn't work well.
Rewriting for GPUs (e.g. taking into account the GPU data/compute flow
architecture) does work well.



I've been intrigued recently by using GPUs for signal-processing kinds of things. There's not much difference between calculating vertices of triangles and doing FIR filters.
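A rough sketch of that similarity, in plain C rather than GPU code: both a vertex transform and an FIR filter boil down to many independent multiply-accumulate sums. The function names and the row-major matrix layout are assumptions made for this illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Transform one vertex by a 4x4 row-major matrix: each output
   component is a 4-term multiply-accumulate. */
void transform_vertex(const float m[16], const float v[4], float out[4])
{
    for (size_t r = 0; r < 4; r++) {
        float acc = 0.0f;
        for (size_t c = 0; c < 4; c++)
            acc += m[4 * r + c] * v[c];
        out[r] = acc;
    }
}

/* FIR filter, "valid" outputs only: y has nx - ntaps + 1 samples.
   Each output is an ntaps-term multiply-accumulate, and no output
   depends on another, so each could map to its own GPU thread. */
void fir_valid(const float *x, size_t nx,
               const float *h, size_t ntaps, float *y)
{
    assert(ntaps > 0 && nx >= ntaps);
    size_t ny = nx - ntaps + 1;
    for (size_t i = 0; i < ny; i++) {
        float acc = 0.0f;
        for (size_t k = 0; k < ntaps; k++)
            acc += h[k] * x[i + ntaps - 1 - k];
        y[i] = acc;
    }
}
```

In both routines the outer loop iterations are independent, which is the property a GPU exploits.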


Prentice Bisbal
2018-06-19 20:59:57 UTC
Permalink
Post by Joe Landman
Post by Prentice Bisbal
Post by Joe Landman
I'm curious about your next gen plans, given Phi's roadmap.
low level HPC means... lots of things.  BUT we are a huge Xeon Phi
shop and need low-level programmers ie. avx512, careful
cache/memory management (NOT openmp/compiler vectorisation etc).
I played around with avx512 in my rzf code.
https://github.com/joelandman/rzf/blob/master/avx2/rzf_avx512.c . 
Never really spent a great deal of time on it, other than noting
that using avx512 seemed to downclock the core a bit on Skylake.
If you organize your code correctly, and call the compiler with the
right optimization flags, shouldn't the compiler automatically handle
a good portion of this 'low-level' stuff?
I wish it would do it well, but it turns out it doesn't do a good
job.   You have to pay very careful attention to almost all aspects of
making it simple for the compiler, and then constraining the
directions it takes with code gen.
I explored this with my RZF stuff.  It turns out that with -O3, gcc
(5.x and 6.x) would convert a library call for the power function into
an FP instruction.  But it would use 1/8 - 1/4 of the XMM/YMM register
width, not automatically unroll loops, or leverage the vector nature
of the problem.
Basically, not much has changed in 20+ years ... you annotate your
code with pragmas and similar, or use instruction primitives and give
up on the optimizer/code generator.
When it comes down to it, compilers aren't really as smart as many of
us would like.  Converting idiomatic code into efficient assembly
isn't what they are designed for.  Rather correct assembly.  Correct
doesn't mean efficient in many cases, and some of the less obvious
optimizations that we might think to be beneficial are not taken. We
can hand modify the code for this, and see if these optimizations are
beneficial, but the compilers often are not looking at a holistic
problem.
Post by Prentice Bisbal
I understand that hand-coding this stuff usually still give you the
best performance (See GotoBLAS/OpenBLAS, for example), but does your
average HPC programmer trying to get decent performance need to
hand-code that stuff, too?
Generally, yes.  Optimizing serial code for GPUs doesn't work well.
Rewriting for GPUs (e.g. taking into account the GPU data/compute flow
architecture) does work well.
Thanks for the reply. This sounds like the perfect opportunity for me to
rant about Intel's marketing for Xeon Phi vs. GPUs. When GPUs took off
and Intel was formulating their answer to GPUs, they kept saying you
wouldn't need to rewrite your code like you need to for GPUs. You could
just recompile and everything would work on the new MIC processors.

Then when Intel's MIC processors finally did come out, guess what? You
*did* have to rewrite your code to get any meaningful increase in
performance. For example, you'd have to make sure your loops were
data-parallel and use OpenMP or TBB, or Cilk Plus or whatever, to really
take advantage of the MIC.  This meant you had to rewrite your code, but
Intel did everything they could to avoid admitting you would need to
rewrite your code. They used the euphemism 'code modernization'
instead.
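As an illustration of the kind of rewrite meant by making loops data-parallel, here is a small sketch in C. The function names are invented for this example and nothing here comes from Intel's material. The serial version carries a value from one iteration to the next. The second version recomputes it from the loop index, so iterations become independent and an OpenMP pragma can split them across cores (the pragma is ignored if OpenMP is not enabled).

```c
#include <assert.h>
#include <stddef.h>

/* Serial form: t is carried from one iteration to the next, so the
   iterations cannot simply be split across threads. */
void sample_serial(double t0, double dt, size_t n, double *out)
{
    double t = t0;
    for (size_t i = 0; i < n; i++) {
        out[i] = t * t;
        t += dt;
    }
}

/* Data-parallel form: t is derived from the index, so every iteration
   is independent of the others. */
void sample_parallel(double t0, double dt, size_t n, double *out)
{
    assert(out != NULL || n == 0);
    #pragma omp parallel for
    for (size_t i = 0; i < n; i++) {
        double t = t0 + (double)i * dt;
        out[i] = t * t;
    }
}
```

Note that the two forms can differ in the last bits for step sizes that are not exactly representable, since repeated addition and multiplication round differently.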

I often wonder if that misleading marketing is one of the reasons why
the Xeon Phi has already been canned. I know a lot of people who were
excited for the Xeon Phi, but I don't know any who ever bought the Xeon
Phis once they came out.

Prentice
Stu Midgley
2018-06-20 02:46:28 UTC
Permalink
I think your comments are wrong.

I think Intel was right on the mark with their "just throw a few flags and
it'll work". For a very large number of codes this will work. If you have
a simple FD, stencil or use MKL then you're away. For a large number of
people that works.

Our company got up and running in a few days and were getting reasonable
speedup... enough that we ended up purchasing many thousands of KNC's. We
then spent 4 years fine tuning and optimising, getting every last drop of
performance. Which broadened the workload we could run on the KNC's and
sped up all codes that were on it.

This is no different to standard Xeon/AMD. We went through exactly the
same process when we got 4-socket, 16-core AMD systems. They are very NUMA,
so require you to change how your code works. Every time a new
vectorisation comes out, we spend a lot of effort recoding to use it...

The KNL's are even easier. They are just x86 systems with a large number of
cores and massive vector units. If the compiler can localise your working
set to the HBM then they absolutely fly. If you get into the intrinsics,
they can really scream... which is why we now have many thousands of
KNL's.

You can not change architecture (especially memory) and expect your code to
just shift across. It might compile, run and do OK... but you won't get
great performance. You have to work at it. What KNC and KNL give you is
the ability to shift quickly... and that buys you time to do the
optimisation.

Where Intel made the mistake was to assume they could shift people from
GPUs. People who have spent years writing and optimising won't shift
easily... cause they have to go through that whole process again. Getting
people back to x86 once they have shifted is a long long term goal.

And, as I said, Phi isn't dead. Large vectors, large core count with high
speed memory - that's Phi. Intel is just shifting that back under the
standard Xeon name.
Post by Prentice Bisbal
Post by Joe Landman
Post by Prentice Bisbal
Post by Joe Landman
I'm curious about your next gen plans, given Phi's roadmap.
Post by Stu Midgley
low level HPC means... lots of things. BUT we are a huge Xeon Phi
shop and need low-level programmers ie. avx512, careful
cache/memory management (NOT openmp/compiler vectorisation etc).
I played around with avx512 in my rzf code.
https://github.com/joelandman/rzf/blob/master/avx2/rzf_avx512.c .
Never really spent a great deal of time on it, other than noting
that using avx512 seemed to downclock the core a bit on Skylake.
If you organize your code correctly, and call the compiler with the
right optimization flags, shouldn't the compiler automatically handle
a good portion of this 'low-level' stuff?
I wish it would do it well, but it turns out it doesn't do a good
job. You have to pay very careful attention to almost all aspects of
making it simple for the compiler, and then constraining the
directions it takes with code gen.
I explored this with my RZF stuff. It turns out that with -O3, gcc
(5.x and 6.x) would convert a library call for the power function into
an FP instruction. But it would use 1/8 - 1/4 of the XMM/YMM register
width, not automatically unroll loops, or leverage the vector nature
of the problem.
Basically, not much has changed in 20+ years ... you annotate your
code with pragmas and similar, or use instruction primitives and give
up on the optimizer/code generator.
When it comes down to it, compilers aren't really as smart as many of
us would like. Converting idiomatic code into efficient assembly
isn't what they are designed for. Rather correct assembly. Correct
doesn't mean efficient in many cases, and some of the less obvious
optimizations that we might think to be beneficial are not taken. We
can hand modify the code for this, and see if these optimizations are
beneficial, but the compilers often are not looking at a holistic
problem.
Post by Prentice Bisbal
I understand that hand-coding this stuff usually still give you the
best performance (See GotoBLAS/OpenBLAS, for example), but does your
average HPC programmer trying to get decent performance need to
hand-code that stuff, too?
Generally, yes. Optimizing serial code for GPUs doesn't work well.
Rewriting for GPUs (e.g. taking into account the GPU data/compute flow
architecture) does work well.
Thanks for the reply. This sounds like the perfect opportunity for me to
rant about Intel's marketing for Xeon Phi vs. GPUs. When GPUs took off
and Intel was formulating their answer to GPUs, they kept saying you
wouldn't need to rewrite your code like you need to for GPUs. You could
just recompile and everything would work on the new MIC processors.
Then when Intel's MIC processors finally did come out, guess what? You
*did* have to rewrite your code to get any meaningful increase in
performance. For example, you'd have to make sure your loops were
data-parallel and use OpenMP or TBB, or Cilk Plus or whatever, to really
take advantage of the MIC. This meant you had to rewrite your code, but
Intel did everything they could to avoid admitting you would need to
rewrite your code. Instead, they used the euphemism 'code modernization'
instead.
I often wonder if that misleading marketing is one of the reasons why
the Xeon Phi has already been canned. I know a lot of people who were
excited for the Xeon Phi, but I don't know any who ever bought the Xeon
Phis once they came out.
Prentice
--
Dr Stuart Midgley
***@gmail.com
Michael Di Domenico
2018-06-20 12:36:47 UTC
Permalink
Post by Stu Midgley
Where Intel made the mistake was to assume they could shift people from
GPUs. People who have spent years writing and optimising won't shift
easily... cause they have to go through that whole process again. Getting
people back to x86 once they have shifted is a long long term goal.
this is where my environment fell over with respect to KNC and KNL.
Once the dev's got their hands on the cards they realized they had to
refactor the code away from gpu's. and the few that did, didn't get
much performance benefit over just using a gpu which they already
knew.

intel could have cornered the market from gpu's if they had released
KNL first and not been so late to the market. once CUDA took hold,
game over.
Eugen Leitl
2018-06-20 13:28:35 UTC
Permalink
Post by Michael Di Domenico
intel could have cornered the market from gpu's if they had released
KNL first and not been so late to the market. once CUDA took hold,
game over.
It seems that migrating to ROCm (HIP) from CUDA isn't all that difficult

https://github.com/ROCm-Developer-Tools/HIP

https://github.com/ROCm-Developer-Tools/HIP/blob/master/docs/markdown/hip_porting_guide.md

so perhaps nVidia will see a bit of competition in the GPU computing space.


Prentice Bisbal
2018-06-20 21:35:57 UTC
Permalink
Post by Michael Di Domenico
Post by Stu Midgley
Where Intel made the mistake was to assume they could shift people from
GPUs. People who have spent years writing and optimising won't shift
easily... cause they have to go through that whole process again. Getting
people back to x86 once they have shifted is a long long term goal.
this is where my environment fell over with respect to KNC and KNL.
Once the dev's got their hands on the cards they realized they had to
refactor the code away from gpu's. and the few that did, didn't get
much performance benefit over just using a gpu which they already
knew.
intel could have cornered the market from gpu's if they had released
KNL first and not been so late to the market. once CUDA took hold,
game over.
Well, there was the Intel Larrabee project, which was killed before it
ever went into production, which set Intel back some.

https://en.wikipedia.org/wiki/Larrabee_(microarchitecture)

And I'm sure there was some arrogance in this mix, also. I bet Intel
figured they could be late to the game, and still win, because they're
Intel, or would just use the same tactics that they used to crush AMD
once the Nehalem processor came out. They were late to that party, too, but
still won.

Prentice
Prentice Bisbal
2018-06-20 14:02:55 UTC
Permalink
Stu,

I welcome hearing other people's perspectives, especially when they
have more experience in an area than I do, but I feel like your
experiences below counter my statements as much as they support them.
Post by Stu Midgley
You can not change architecture (especially memory) and expect your
code to just shift across. It might compile, run and do OK... but you
won't get great performance.  You have to work at it.  What KNC and
KNL give you is the ability to shift quickly... and that buys you time
to do the optimisation.
Then when Intel's MIC processors finally did come out, guess what? You
*did* have to rewrite your code to get any meaningful increase in
performance.
And remember, my argument wasn't really with the technical issues. I
agree 100% with everything you said. My argument was with Intel's
marketing, which criticized GPUs for requiring you to rewrite your code,
which they claimed would be unnecessary with their accelerators (I don't
know if they came up with the terms MIC or Xeon Phi at that time), but
then you still needed to "modernize" your code to get the most out of
their processors.

I also don't disagree with the "stickiness" of GPUs in the market once
they became the established approach. However, I did state that I had
some colleagues who were very supportive of the Xeon Phi because you
"didn't need to rewrite your code". Those same colleagues never ended up
investing significantly in the Xeon Phis when they did come out, which
signifies to me that they may have been disappointed with Intel's
promises compared to reality.

Prentice Bisbal
Lead Software Engineer
Princeton Plasma Physics Laboratory
http://www.pppl.gov
Post by Stu Midgley
I think your comments are wrong.
I think Intel was right on the mark with their "just throw a few flags
and it'll work".  For a very large number of codes this will work.  If
you have a simple FD, stencil or use MKL then your away.  For a large
number of people that works.
Our company got up and running in a few days and were getting
reasonable speedup... enough that we ended up purchasing many
thousands of KNC's.  We then spent 4 years fine tuning, optimising
getting every last drop of performance.  Which broadened the workload
we could run on the KNC's and sped up all codes that were on it.
This is no different to standard Xeon/AMD.  We went through exactly
the same process when we got 4socket 16core AMD systems.  They are
very numa, so require you to change how your code works.  Every time a
new vectorisation comes out, we spend a lot of effort recoding to use
it...
The KNL's are even easier.  They are just x86 systems with large
number of cores and massive vector units.  If the compiler can
localise your working set to the HBM then they absolutely fly.  If you
get into the intrinsics, they can really scream... which is why we now
have many thousands of thousands of KNL's.
You can not change architecture (especially memory) and expect your
code to just shift across.  It might compile, run and do OK... but you
won't get great performance.  You have to work at it.  What KNC and
KNL give you is the ability to shift quickly... and that buys you time
to do the optimisation.
Where Intel made the mistake was to assume they could shift people
from GPUs.  People who have spent years writing and optimising won't
shift easily... cause they have to go through that whole process
again.  Getting people back to x86 once they have shifted is a long
long term goal.
And, as I said, Phi isn't dead.  Large vectors, large core count with
high speed memory - that's Phi.  Intel is just shifting that back
under the standard Xeon name.
Post by Joe Landman
Post by Prentice Bisbal
Post by Joe Landman
I'm curious about your next gen plans, given Phi's roadmap.
low level HPC means... lots of things. BUT we are a huge Xeon Phi
shop and need low-level programmers ie. avx512, careful
cache/memory management (NOT openmp/compiler vectorisation etc).
I played around with avx512 in my rzf code.
https://github.com/joelandman/rzf/blob/master/avx2/rzf_avx512.c .
Never really spent a great deal of time on it, other than noting
that using avx512 seemed to downclock the core a bit on Skylake.
If you organize your code correctly, and call the compiler with the
right optimization flags, shouldn't the compiler automatically handle
a good portion of this 'low-level' stuff?
I wish it would do it well, but it turns out it doesn't do a good
job. You have to pay very careful attention to almost all aspects of
making it simple for the compiler, and then constraining the
directions it takes with code gen.
I explored this with my RZF stuff. It turns out that with -O3, gcc
(5.x and 6.x) would convert a library call for the power function into
an FP instruction. But it would use 1/8 - 1/4 of the XMM/YMM register
width, not automatically unroll loops, or leverage the vector nature
of the problem.
Basically, not much has changed in 20+ years ... you annotate your
code with pragmas and similar, or use instruction primitives and give
up on the optimizer/code generator.
When it comes down to it, compilers aren't really as smart as many of
us would like. Converting idiomatic code into efficient assembly
isn't what they are designed for. Rather correct assembly. Correct
doesn't mean efficient in many cases, and some of the less obvious
optimizations that we might think to be beneficial are not taken. We
can hand modify the code for this, and see if these optimizations are
beneficial, but the compilers often are not looking at a holistic
problem.
Post by Prentice Bisbal
I understand that hand-coding this stuff usually still give you the
best performance (See GotoBLAS/OpenBLAS, for example), but does your
average HPC programmer trying to get decent performance need to
hand-code that stuff, too?
Generally, yes. Optimizing serial code for GPUs doesn't work well.
Rewriting for GPUs (e.g. taking into account the GPU data/compute flow
architecture) does work well.
Thanks for the reply. This sounds like the perfect opportunity for me to
rant about Intel's marketing for Xeon Phi vs. GPUs. When GPUs took off
and Intel was formulating their answer to GPUs, they kept saying you
wouldn't need to rewrite your code like you need to for GPUs. You could
just recompile and everything would work on the new MIC processors.
Then when Intel's MIC processors finally did come out, guess what? You
*did* have to rewrite your code to get any meaningful increase in
performance. For example, you'd have to make sure your loops were
data-parallel and use OpenMP or TBB, or Cilk Plus or whatever, to really
take advantage of the MIC. This meant you had to rewrite your code, but
Intel did everything they could to avoid admitting you would need to
rewrite your code. Instead, they used the euphemism 'code modernization'
instead.
I often wonder if that misleading marketing is one of the reasons why
the Xeon Phi has already been canned. I know a lot of people who were
excited for the Xeon Phi, but I don't know any who ever bought the Xeon
Phis once they came out.
Prentice
--
Dr Stuart Midgley
Stu Midgley
2018-06-21 01:03:43 UTC
Permalink
But the point of what I was saying is that we DID get real meaningful
speedup with just a recompile... we got enough performance to invest
heavily in the technology (it made sense from a $$ perspective).

Then, every day for years, we got more and more performance, which made the
position even more compelling.
Post by Prentice Bisbal
Then when Intel's MIC processors finally did come out, guess what? You
*did* have to rewrite your code to get any meaningful increase in
performance.
--
Dr Stuart Midgley
***@gmail.com
Prentice Bisbal
2018-06-21 14:18:47 UTC
Permalink
Stu,

I gotcha. Thanks for the clarification.

Prentice
Post by Stu Midgley
But the point of what I was saying is that we DID get real meaningful
speedup with just a recompile...  we got enough performance to invest
heavily in the technology (it made sense from a $$ perspective).
Then, every day for years, we got more and more performance, which
made the position even more compelling.
Post by Prentice Bisbal
Then when Intel's MIC processors finally did come out, guess what? You
*did* have to rewrite your code to get any meaningful increase in
performance.
--
Dr Stuart Midgley
Stu Midgley
2018-06-20 02:31:19 UTC
Permalink
We aren't after average HPC programmers...

Even good compilers (Intel) are very very limited in their optimisations.
We got factors of 2x and 3x by hand writing SSSE3 instructions on standard
Xeon's rather than let the compiler do its thing... Compiler limitations
aren't particular to Phi.
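For readers who have not seen hand-written vector code, a minimal sketch of the idea in C follows. It is an illustration only, not DUG's code: it uses baseline SSE intrinsics from `<xmmintrin.h>` rather than SSSE3 so that it builds on any x86-64 compiler, it falls back to scalar code elsewhere, and the function names are made up.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Reference scalar sum, left to the compiler's optimizer. */
float sum_scalar(const float *x, size_t n)
{
    float total = 0.0f;
    for (size_t i = 0; i < n; i++)
        total += x[i];
    return total;
}

/* Hand-vectorized sum: four partial sums are kept in one 128-bit
   register, then combined; leftover elements are added one by one. */
float sum_simd(const float *x, size_t n)
{
    assert(x != NULL || n == 0);
    size_t i = 0;
    float total = 0.0f;
#if defined(__SSE__)
    __m128 acc = _mm_setzero_ps();
    for (; i + 4 <= n; i += 4)
        acc = _mm_add_ps(acc, _mm_loadu_ps(x + i)); /* unaligned load */
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif
    for (; i < n; i++) /* scalar tail, or whole array without SSE */
        total += x[i];
    return total;
}
```

Because the vector version adds elements in a different order, its result can differ from the scalar sum in the last bits for general floating-point data.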
If you organize your code correctly, and call the compiler with the right
optimization flags, shouldn't the compiler automatically handle a good
portion of this 'low-level' stuff? I understand that hand-coding this stuff
usually still give you the best performance (See GotoBLAS/OpenBLAS, for
example), but does your average HPC programmer trying to get decent
performance need to hand-code that stuff, too?
--
Dr Stuart Midgley
***@gmail.com