Discussion:
[Beowulf] "NNSA’s first really big heterogeneous supercomputer"
Prentice Bisbal via Beowulf
2018-10-31 17:58:30 UTC
From
It also makes Sierra the NNSA’s first really big heterogeneous
supercomputer
Does Roadrunner, the first computer to exceed 1 PFLOPS, not count any
more? That was an NNSA system, and it had 3 (yes, 3!) different
processors in it: the AMD Opteron CPU, of course, and then the
PowerXCell 8i processor, which had a stripped-down POWER processor
core and Cell SPE cores inside it. I read a publication from LANL on
its architecture years ago, and I believe you had to program for all 3
different processors to take advantage of its architecture. (If
someone knows for sure, please correct me if I'm wrong.) I'd say
that's even more heterogeneous than what we are seeing today (CPU + GPU).
--
Prentice

_______________________________________________
Beowulf mailing list, ***@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.
Chris Samuel
2018-11-01 09:10:54 UTC
Post by Prentice Bisbal via Beowulf
I read a publication from LANL on its architecture years ago, and I believe
you had to program for all 3 different processors to take advantage of its
architecture. (If someone knows for sure, please correct me if I'm wrong.)
I'd say that's even more heterogeneous than what we are seeing today
(CPU + GPU).
I think you're bang on the money. Here's a presentation from LANL
about Roadrunner, which covers programming on slide 25.

https://www.lanl.gov/conferences/salishan/salishan2007/Roadrunner-Salishan-Ken%20Koch.pdf

All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC



Prentice Bisbal via Beowulf
2018-11-01 16:45:17 UTC
Thanks for this. Slides 25-31 are all good, but I think slide 31,
titled "Programming a Hybrid Computer", really hammers the point
home. Fortunately, it's all text, so I can cut and paste its contents
here:

Decomposition of an application for Cell-acceleration:
1. Opteron code
    - Runs non-accelerated parts of application
    - Participates in usual cluster parallel computations
    - Controls and communicates with Cell PPC code for the accelerated
      portions
2. Cell PPC code
    - Works with Opteron code on accelerated portions of application
    - Allocates Cell common memory
    - Communicates with Opteron code
    - Controls and works with its 8 SPEs
3. Cell SPE code (8-way parallel)
    - Runs on each SPE (SPMD) (MPMD also possible)
    - Shares Cell common memory with PPC code
    - Manages its Local Store (LS), transferring data blocks in/out as
      necessary
    - Performs vector computations from its LS data
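
To make the three layers a bit more concrete, here is a rough sketch in plain Python. It is only a simulation of the control flow the slide describes, not Roadrunner code: every name in it (opteron_main, ppc_control, spe_kernel, LOCAL_STORE_ITEMS) is made up for illustration, nothing comes from the Cell SDK, and the "SPEs" run one after another here rather than in parallel. The doubling step is just a stand-in for the real vector computation.

```python
# Illustrative simulation only: plain Python standing in for the three code
# layers on slide 31. None of these names come from the Cell SDK or from LANL.

NUM_SPES = 8            # SPEs per Cell, as stated on the slide
LOCAL_STORE_ITEMS = 4   # hypothetical block size standing in for the small Local Store


def spe_kernel(shared, start, stop):
    """Layer 3: one SPE streams its slice through a small local buffer."""
    for lo in range(start, stop, LOCAL_STORE_ITEMS):
        hi = min(lo + LOCAL_STORE_ITEMS, stop)
        local_store = shared[lo:hi]                    # transfer a block in
        local_store = [2.0 * x for x in local_store]   # compute on LS data
        shared[lo:hi] = local_store                    # transfer the block out


def ppc_control(data):
    """Layer 2: Cell PPC code allocates common memory and drives its SPEs."""
    shared = list(data)                    # stands in for Cell common memory
    chunk = -(-len(shared) // NUM_SPES)    # ceiling division: items per SPE
    for spe in range(NUM_SPES):            # SPMD: same kernel, different slice
        start = min(spe * chunk, len(shared))
        stop = min(start + chunk, len(shared))
        spe_kernel(shared, start, stop)    # sequential here; parallel on real hardware
    return shared


def opteron_main(values):
    """Layer 1: Opteron code does the non-accelerated work and offloads the rest."""
    prepared = [float(v) for v in values]  # non-accelerated part of the application
    accelerated = ppc_control(prepared)    # hand the hot loop to the Cell side
    return sum(accelerated)                # result comes back to the host


if __name__ == "__main__":
    print(opteron_main(range(20)))
```

Even in this toy form there are three separate pieces of code to write and keep in sync, plus explicit block transfers in the innermost one, and on the real machine each piece targeted a different processor.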

It seems like this would be a nightmare to program for. A few years ago
I worked with a faculty member who did some programming for the
Roadrunner architecture, and he confirmed this.

--
Prentice

