Discussion:
[Beowulf] Jupyter and EP HPC
Lux, Jim (337K)
2018-07-27 18:47:17 UTC
I've just started using Jupyter to organize my Pythonic ramblings..

What would be kind of cool is to have a high-level way to do some embarrassingly parallel Python stuff, and I'm sure it's been done, but my Google skills appear to be lacking (for all I know there's someone at JPL who is doing this, among the 6000 people doing stuff here).

What I'm thinking is this:
I have a high-level Python script that iterates through a set of data values for some model parameter, farms out running the model to nodes on a cluster, and then gathers the results back.

So I'd have N copies of the Python model script on the nodes.
Almost like a Pythonic version of pdsh.

Yeah, I'm sure I could use lots of subprocess() and execute() stuff (heck, I could shell out to pdsh), but as with all things Python, someone has probably already done it before and has all the nice hooks into the IPython kernel.
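
To make the idea concrete, here is the rough shape of the DIY version (just a sketch, assuming passwordless ssh to each node and a hypothetical run_model.py already deployed there):

    # Rough sketch only: the node names, run_model.py, and the parameter
    # list are all placeholders; assumes passwordless ssh to every node.
    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    nodes = ["node01", "node02", "node03", "node04"]
    parameter_values = [0.1, 0.2, 0.5, 1.0]

    def run_on_node(node, value):
        # Run the model remotely and treat its stdout as the result.
        proc = subprocess.run(
            ["ssh", node, "python3", "run_model.py", str(value)],
            capture_output=True, text=True, check=True)
        return proc.stdout.strip()

    # Farm one parameter out to each node, then gather the results back.
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        results = list(pool.map(run_on_node, nodes, parameter_values))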



James Lux
Task Manager, DARPA High Frequency Research (DHFR) Space Testbed
Jet Propulsion Laboratory (Mail Stop 161-213)
4800 Oak Grove Drive
Pasadena CA 91109
(818)354-2075 (office)
(818)395-2714 (cell)
Joe Landman
2018-07-27 18:53:43 UTC
I didn't do this with IPython or Python ... but this was effectively the way I parallelized NCBI BLAST in 1998-1999 or so. Wrote a Perl script to parse args, construct jobs, move data, submit/manage jobs, recover results, and reassemble output. SGI turned that into a product.
--
Joe Landman
e: ***@gmail.com
t: @hpcjoe
w: https://scalability.org
g: https://github.com/joelandman
l: https://www.linkedin.com/in/joelandman

Lux, Jim (337K)
2018-07-27 20:56:49 UTC


Yes.. but I was hoping someone had done that for Jupyter, so the loop could just be:

    ....
    result = simulation(parametervalue)
    results.append(result)



Fred Youhanaie
2018-07-27 21:16:17 UTC
Jim

I'm not a Jupyter user yet; however, out of curiosity I just googled for what I think you're looking for. Is this any good?

https://ipyparallel.readthedocs.io/en/stable/
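
From a quick skim of the docs, the usage pattern looks roughly like this (a minimal sketch, assuming engines were started with "ipcluster start -n 4" and simulation() stands in for the real model):

    # Minimal sketch based on the ipyparallel docs; assumes engines are
    # already running via "ipcluster start -n 4".
    import ipyparallel as ipp

    def simulation(parametervalue):
        # stand-in for the real model function
        return parametervalue ** 2

    rc = ipp.Client()                  # connect to the running engines
    view = rc.load_balanced_view()     # spread work across all of them

    parameter_values = [0.1, 0.2, 0.5, 1.0]
    results = view.map_sync(simulation, parameter_values)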

I have now bookmarked it for my own future use!

Cheers,
Fred
Lux, Jim (337K)
2018-07-28 14:21:24 UTC
That might be exactly it..
thanks.

Gavin W. Burris
2018-07-30 17:16:34 UTC
Since this is Beowulf, I assume you have a job queue. Check out batchspawner, too.
https://github.com/jupyterhub/batchspawner
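
For a Slurm queue, the hub side boils down to a few lines of config (a minimal sketch; the partition, core count, and runtime values are placeholders):

    # jupyterhub_config.py: minimal batchspawner sketch for Slurm.
    # Partition, core count, and runtime values are placeholders.
    c = get_config()
    c.JupyterHub.spawner_class = 'batchspawner.SlurmSpawner'
    c.SlurmSpawner.req_partition = 'normal'
    c.SlurmSpawner.req_nprocs = '4'
    c.SlurmSpawner.req_runtime = '8:00:00'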

Cheers.
--
Gavin W. Burris
Senior Project Leader for Research Computing
The Wharton School
University of Pennsylvania
Search our documentation: http://research-it.wharton.upenn.edu/about/
Subscribe to the Newsletter: http://whr.tn/ResearchNewsletterSubscribe
Lux, Jim (337K)
2018-07-30 22:38:58 UTC
Job queue? At home, my experimental cluster is a pack of 4 BeagleBones running a pretty vanilla Debian - not exactly a mindbender in performance, but easy to fool around with for experiments.

At work, yeah, all the usual stuff.

Jim Lux
(818)354-2075 (office)
(818)395-2714 (cell)


Gavin W. Burris
2018-07-31 13:08:20 UTC
Neat setup, Jim. I'm planning on rolling out JupyterHub as a supported service sometime soon, with kernels for as many of our software packages / shells / languages as possible. Please let me know how your experimenting goes. Cheers.
--
Gavin W. Burris
Senior Project Leader for Research Computing
The Wharton School
University of Pennsylvania
Search our documentation: http://research-it.wharton.upenn.edu/about/
Subscribe to the Newsletter: http://whr.tn/ResearchNewsletterSubscribe
Andrew Holway
2018-07-28 17:01:50 UTC
Hi Jim,

There is a group at JPL doing Kubernetes. It might be interesting to ask
them if you can execute Jobs on their clusters.
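
Each parameter value could go out as its own Kubernetes Job via the Python client, something like this rough sketch (the image name, namespace, and parameter list are all made up):

    # Rough sketch with the official kubernetes Python client; the image,
    # namespace, and parameter values below are hypothetical.
    from kubernetes import client, config

    config.load_kube_config()
    batch = client.BatchV1Api()

    for i, value in enumerate([0.1, 0.2, 0.5, 1.0]):
        job = client.V1Job(
            metadata=client.V1ObjectMeta(name="model-sweep-%d" % i),
            spec=client.V1JobSpec(
                template=client.V1PodTemplateSpec(
                    spec=client.V1PodSpec(
                        restart_policy="Never",
                        containers=[client.V1Container(
                            name="model",
                            image="example/model:latest",
                            args=[str(value)])]))))
        batch.create_namespaced_job(namespace="default", body=job)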

Cheers,

Andrew
John Pellman
2018-07-28 18:05:26 UTC
You might find some enlightening tidbits at around the 26:55 mark here, re ipyparallel and Kubernetes:

http://www.rce-cast.com/Podcast/rce-116-jupyter.html
John Hearns via Beowulf
2018-07-29 17:57:59 UTC
Cough. Julia. Cough.
John Hearns via Beowulf
2018-07-29 18:04:42 UTC
https://github.com/JuliaParallel/ClusterManagers.jl

Sorry for the terse reply. Warm evening sitting beside the Maschsee in
Hannover. Modelling beer evaporation.
Lux, Jim (337K)
2018-07-30 01:19:37 UTC
Modeling? I would think you're doing empirical observation and data taking.
Nathan Moore
2018-07-30 01:57:18 UTC
I would guess the Condor people in Madison have a way to do this.