Discussion:
[Beowulf] slow jobs when run through queue
Nick Evans
2017-12-06 00:44:24 UTC
Permalink
Hi all,

Just wondering if anyone else has encountered a similar problem, or has
thoughts on how to track it down.

Queue = PBS / Moab combination

We have found that if we submit a job to the queue then it takes a long
time to process, i.e. >4 hours.
If we run the exact same processing directly on the compute node
then it is significantly faster, <1 hour.

We have tried a number of different variations on the environment
variables and local/remote scratch disk, and can't find any reason
for the difference between submitting the job and just running it
directly on the compute resource.

Any hints / recommendations appreciated

Thanks
Nick
Chris Samuel
2017-12-06 01:58:13 UTC
Permalink
Post by Nick Evans
We have found that if we submit a job to the queue then it takes a long
time to process. ie. >4 hours
If we are to run the exact same processing directly on the compute node
then it is significantly faster < 1 hour.
Some quick ideas

Are you comparing a job that has asked for all cores and all RAM with
it running directly on the node?

Try using "perf top" to get an idea of what's going on with the node
when doing the comparison runs, perhaps "perf record" too, but I can
never remember if an unprivileged user can do that. That might shed
some light.

To me it sounds like it might be something that naively checks how
many cores a node has and then starts that many threads/
processes; if the batch job only asks for a single core, or
fewer than all of them, you might end up with a lot of contention.
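A quick way to check for this class of problem (a minimal sketch, not something from the thread): compare the core count an application would naively detect against the count the batch system actually granted. On Linux, Python's `os.cpu_count()` reports every core in the host, while `os.sched_getaffinity(0)` reflects the CPU set the job has actually been pinned to:

```python
import os

# What a naive application sees: every core in the host.
detected = os.cpu_count()

# What the batch system actually granted: the CPUs this process is
# allowed to run on (respects taskset / cgroup cpuset pinning).
granted = len(os.sched_getaffinity(0))

print(f"naive core count: {detected}, usable cores: {granted}")
if granted < detected:
    print("warning: sizing a thread pool by os.cpu_count() "
          "would oversubscribe this allocation")
```

A job that requests one core but spawns `detected` worker threads will exhibit exactly the contention Chris describes.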

Good luck!
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
_______________________________________________
Beowulf mailing list, ***@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
Nick Evans
2017-12-06 05:47:42 UTC
Permalink
Thanks Brian / Carl / Chris for places to look. It turned out to be what
Chris had mentioned: the job was only requesting 1 CPU but trying to use
all 48 in the machine.

Resubmitted the request asking for all CPUs and the job ran in the
expected amount of time.

Thanks again
Nick
Chris Samuel
2017-12-06 06:27:54 UTC
Permalink
Post by Nick Evans
Thanks Brian / Carl / Chris for places to look. It turned out to be what
Chris had mentioned: the job was only requesting 1 CPU but trying to use
all 48 in the machine.
There's the handy "nproc" command which will tell you how many cores you can
actually use - it's cgroup-aware, so it won't just blindly report all the
cores in the host. It's also part of coreutils, so you should be able to
rely on it being there.
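A job script can sanity-check its allocation this way before spawning workers. A sketch comparing the coreutils `nproc` Chris mentions with Python's equivalent query (the two usually agree, though `nproc` also honours environment variables such as OMP_NUM_THREADS):

```python
import os
import subprocess

# nproc (coreutils) is affinity/cgroup-aware: inside a batch job it
# reports the cores you may use, not all cores in the host.
nproc = int(subprocess.run(["nproc"], capture_output=True,
                           text=True, check=True).stdout)

# Python's equivalent of the same query.
usable = len(os.sched_getaffinity(0))

print(f"nproc: {nproc}, sched_getaffinity: {usable}")
```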
Post by Nick Evans
Resubmitted the request asking for all CPUs and the job ran in the expected
amount of time.
Great news!

All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC

Peter Kjellström
2017-12-06 09:09:20 UTC
Permalink
On Wed, 6 Dec 2017 16:47:42 +1100
Post by Nick Evans
Thanks Brian / Carl / Chris for places to look. It turned out to
be what Chris had mentioned: the job was only requesting 1 CPU but
trying to use all 48 in the machine.
If you got only a 4x performance reduction when running 48x
oversubscribed on 1/48th the number of cores, then something is wrong
with your baseline. That is, the "run without queue system" case seems
suspicious.

/Peter K
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Chris Samuel
2017-12-06 10:39:20 UTC
Permalink
Post by Peter Kjellström
If you got only a 4x performance reduction when running 48x
oversubscribed on 1/48th the number of cores, then something is wrong
with your baseline. That is, the "run without queue system" case seems
suspicious.
If this is, as I suspect is likely, bioinformatics code, it could well be that
it is a pipeline-type application and only part of the application may be able
to make use of parallelism (and then might not be very good at it).
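This hypothesis can be put to a rough number with Amdahl's law (a back-of-envelope sketch, not from the thread; it assumes the >4 h queue run is a fair serial baseline and the <1 h direct run used all 48 cores):

```python
# Amdahl's law: speedup S on N cores with parallel fraction f is
#   S = 1 / ((1 - f) + f / N)
# Solving for f gives:
#   f = (1 - 1/S) / (1 - 1/N)
def parallel_fraction(speedup, cores):
    return (1 - 1 / speedup) / (1 - 1 / cores)

# ~4x speedup going from 1 core to 48 cores implies only about
# three quarters of the runtime is parallelisable.
f = parallel_fraction(speedup=4, cores=48)
print(f"implied parallel fraction: {f:.0%}")  # roughly 77%
```

A ~77% parallel fraction would be consistent with a pipeline where one serial stage dominates, which also explains Peter's suspicion about the baseline.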

All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC

Peter Kjellström
2017-12-06 12:53:55 UTC
Permalink
On Wed, 06 Dec 2017 21:39:20 +1100
Post by Chris Samuel
Post by Peter Kjellström
If you got only a 4x performance reduction when running 48x
oversubscribed on 1/48th the number of cores, then something is wrong
with your baseline. That is, the "run without queue system" case seems
suspicious.
If this is, as I suspect is likely, bioinformatics code it could well
be that it is a pipeline type application and only part of the
application may be able to make use of parallelism (and then might
not be very good at it).
Even more reason not to give it 48x more resources to waste, then :-)

Probably wise to go back to a smaller set of resources with no
oversubscription and evaluate behavior.

/Peter K
Faraz Hussain
2017-12-06 14:07:10 UTC
Permalink
48 cores sounds like a lot. Perhaps hyper-threading is turned on? If
so, try running with 24 CPUs to see whether you get the same or better
performance than with 48.
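Whether the 48 "CPUs" include SMT siblings can be checked from the standard Linux sysfs topology files (a Linux-only sketch; `lscpu` gives the same answer interactively):

```python
import os

def threads_per_core(cpu=0):
    # thread_siblings_list names the hardware threads sharing one
    # physical core, e.g. "0,24" or "0-1".
    path = f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list"
    try:
        raw = open(path).read().strip()
    except OSError:
        return 1  # topology not exposed; assume no SMT
    count = 0
    for part in raw.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            count += int(hi) - int(lo) + 1
        else:
            count += 1
    return count

tpc = threads_per_core()
logical = os.cpu_count()
print(f"{logical} logical CPUs, {tpc} thread(s) per core "
      f"=> ~{logical // tpc} physical cores")
```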

-FEACluster.com

David Mathog
2017-12-06 18:20:25 UTC
Permalink
Post by Chris Samuel
If this is, as I suspect is likely, bioinformatics code, it could well
be that it is a pipeline-type application and only part of the
application may be able to make use of parallelism (and then might not
be very good at it).
Exactly. Super frustrating to set something like '--cpus=40' and then
watch the resulting heap of programs sit for long periods of time
(hours, not seconds) running only on a single CPU.

Regards,

David Mathog
***@caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
Tim Cutts
2017-12-06 18:35:11 UTC
Permalink
Of course, if you charge for your cluster time, that hurts them in the wallet, since they pay for all the allocated unused time. If you don’t charge (which is the case for us) it’s hard to incentivise them not to do this. Shame works, a bit. We publish cluster analytics showing CPU efficiency and memory efficiency league tables for the users, and that has had some good effects in the past...
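The CPU efficiency in such league tables is typically computed from the scheduler's accounting records; a minimal sketch (the field names here are hypothetical, not Sanger's actual schema):

```python
def cpu_efficiency(cpu_seconds, wall_seconds, cores_allocated):
    """Fraction of the allocated CPU time the job actually used."""
    return cpu_seconds / (wall_seconds * cores_allocated)

# A job that ran for an hour on a 48-core allocation but kept only
# one core busy scores ~2% -- exactly the case in this thread.
eff = cpu_efficiency(cpu_seconds=3600, wall_seconds=3600,
                     cores_allocated=48)
print(f"{eff:.1%}")  # 2.1%
```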

Tim
Post by Chris Samuel
If this is, as I suspect is likely, bioinformatics code it could well be that
it is a pipeline type application and only part of the application may be able
to make use of parallelism (and then might not be very good at it).
Exactly. Super frustrating to set something like '--cpus=40' and then watch the resulting heap of programs sit for long periods of time (hours, not seconds) running only on a single CPU.
Regards,
David Mathog
Manager, Sequence Analysis Facility, Biology Division, Caltech
--
The Wellcome Trust Sanger Institute is operated by Genome Research
Limited, a charity registered in England with number 1021457 and a
company registered in England with number 2742969, whose registered
office is 215 Euston Road, London, NW1 2BE.
Nick Evans
2017-12-06 20:47:28 UTC
Permalink
Hi all,

Under normal circumstances I would agree that requesting all 48 cores
in the machine is overkill, but this particular machine has a highly
specialised FPGA card in it that does most of the heavy lifting for a
specific set of analyses tuned to run with the card. It can
only run one job at a time, and by design the node doesn't mount the
normal central software tree, so there isn't the temptation to run
normal jobs on it and block the use of the FPGA.



Nick

On 7 Dec 2017 5:35 AM, "Tim Cutts" <***@sanger.ac.uk> wrote:

Of course, if you charge for your cluster time, that hurts them in the
wallet, since they pay for all the allocated unused time. If you don’t
charge (which is the case for us) it’s hard to incentivise them not to do
this. Shame works, a bit. We publish cluster analytics showing CPU
efficiency and memory efficiency league tables for the users, and that has
had some good effects in the past...

Tim
Peter Clapham
2017-12-19 16:41:44 UTC
Permalink
Showing back utilization and use patterns openly also removes admins from being "the police".

Instead, each user of the system can see who is requesting excessive memory, using inappropriate queues, or just running inefficient workloads at scale. This creates a self-policing environment, and certainly both reinforces a community feel and improves communication between the groups of users.
Pete

On 12/6/17, 6:36 PM, "Beowulf on behalf of Tim Cutts" <beowulf-***@beowulf.org on behalf of ***@sanger.ac.uk> wrote:

Of course, if you charge for your cluster time, that hurts them in the wallet, since they pay for all the allocated unused time. If you don’t charge (which is the case for us) it’s hard to incentivise them not to do this. Shame works, a bit. We publish cluster analytics showing CPU efficiency and memory efficiency league tables for the users, and that has had some good effects in the past...

Tim
Nick Evans
2017-12-20 05:37:36 UTC
Permalink
I completely agree. We have a web page where people can see

- where their jobs are running
- what sort of resources were requested
- the peak resources actually used
- wall time remaining (orange highlighted at 20% remaining and red at
10% remaining)
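The highlighting rule in that last item is simple threshold logic; a sketch (the function name is hypothetical, not from Nick's actual dashboard):

```python
def walltime_colour(remaining_fraction):
    """Highlight colour for a job's remaining wall time."""
    if remaining_fraction <= 0.10:
        return "red"
    if remaining_fraction <= 0.20:
        return "orange"
    return "none"

print(walltime_colour(0.15))  # orange
```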
Post by Peter Clapham
Show back of utilization and use patterns openly also removes admins from
being “the Police”.
Instead, each user of the system can see who is requesting excessive
memory, using inappropriate queues, or just running inefficient workloads
at scale. This creates a self-policing environment, and certainly both
reinforces a community feel and improves communication between the groups
of users.
Pete