We also use a "workspace" mechanism to control our scratch filesystem.
0. /scratch filesystem configured with root as its owner and no other
write permission allowed.
1. User calls mkworkspace which does a setuid, creates the directory,
workspace somewhere outside of it.
2. From cron, rmworkspace runs which reads each record file, determines
record file.
SSD and mkworkspace controls access to that too. Almost nobody uses this
"local scratch" though.
features to automatically create or remove workspaces.
<https://hpc.wsu.edu/support/service-requests/>.
Post by Dmitri Chubarov
Hello, John,
At HLRS they have what they call a Workspace mechanism
(https://wickie.hlrs.de/platforms/index.php/Workspace_mechanism)
where each user
creates a scratch directory for their project under $SCRATCH_ROOT that
has end-of-life time encoded in the name and a symlink to this directory
in their persistent storage directory tree. A cronjob enforces the end
of life policy.
One advantage is that it is very easy for the admin to extend the
lifespan when it is absolutely needed. It requires only renaming one
directory to extend, for example, the lifetime
of millions of files from genomic applications.
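A minimal sketch of the naming idea described above, for illustration only. It is not the HLRS implementation: the `<prefix>-<YYYYMMDD>` naming convention and the function names `end_of_life`, `expired` and `extend` are assumptions made up for this example. It shows how a cron job could find expired directories from the name alone, and how an extension reduces to a single rename.

```python
import re
import tempfile
from datetime import datetime, timedelta
from pathlib import Path
from typing import List, Optional

# Hypothetical convention: directory name ends in -YYYYMMDD (end of life).
NAME_RE = re.compile(r"^(?P<prefix>.+)-(?P<eol>\d{8})$")


def end_of_life(directory: Path) -> Optional[datetime]:
    """Return the expiry date encoded in the directory name, if any."""
    match = NAME_RE.match(directory.name)
    if match is None:
        return None
    return datetime.strptime(match.group("eol"), "%Y%m%d")


def expired(root: Path, now: datetime) -> List[Path]:
    """List directories under root whose encoded end of life has passed."""
    result = []
    for entry in sorted(root.iterdir()):
        eol = end_of_life(entry)
        if entry.is_dir() and eol is not None and eol < now:
            result.append(entry)
    return result


def extend(directory: Path, days: int) -> Path:
    """Extend the lifetime by renaming the directory; no file is touched."""
    match = NAME_RE.match(directory.name)
    if match is None:
        raise ValueError(f"no end-of-life date in {directory.name!r}")
    new_eol = end_of_life(directory) + timedelta(days=days)
    target = directory.with_name(f"{match.group('prefix')}-{new_eol:%Y%m%d}")
    directory.rename(target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "alice-genome-20180601").mkdir()
        (root / "bob-cfd-20180901").mkdir()
        now = datetime(2018, 6, 12)
        print("expired:", [p.name for p in expired(root, now)])
        print("extended to:", extend(root / "alice-genome-20180601", 90).name)
```

In this sketch a symlink that points at the old directory name would have to be re-pointed after the rename; that step is left out.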
Here at Novosibirsk University where users are getting their resources
for free this mechanism has been reimplemented to ensure that shared
storage does not turn into a file archive.
The main shared storage is an expensive PanFS system that is split
into two partitions: a larger scratch partition with a directory
lifetime limit of 90 days and a smaller $HOME partition.
Some users are in fact abusing the system by creating a new scratch
directory every 90 days and copying the data along, effectively
creating persistent storage. However, most users do evacuate
their valuable data on time.
Greetings from sunny Siberia,
 Dima
sys and it works by setting draconian limits
On Tue, 12 Jun 2018 at 15:06, John Hearns via Beowulf
Our trick in Slurm is to use the slurmdprolog script to set an XFS
project quota for that job ID on the per-job directory (created by a
plugin which also makes subdirectories there that it maps to /tmp and
/var/tmp for the job) on the XFS partition used for local scratch on
the node.
I had never thought of that, and it is a very neat thing to do.
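For readers who have not used XFS project quotas, here is a rough sketch of what such a prolog step might run. It is not the poster's script. The mount point, the per-job path, the 10g limit and the use of the job ID as the XFS project ID are all assumptions for illustration. The helper only builds the `xfs_quota` command lines and does not execute them.

```python
import shlex
from typing import List


def xfs_project_quota_commands(
    mount_point: str, job_dir: str, job_id: int, hard_limit: str
) -> List[List[str]]:
    """Build the xfs_quota calls that put job_dir under a project quota.

    Assumption: the numeric job ID doubles as the XFS project ID.
    """
    # Step 1: tag the per-job directory tree with the project ID.
    setup = [
        "xfs_quota", "-x",
        "-c", f"project -s -p {job_dir} {job_id}",
        mount_point,
    ]
    # Step 2: set a hard block limit for that project.
    limit = [
        "xfs_quota", "-x",
        "-c", f"limit -p bhard={hard_limit} {job_id}",
        mount_point,
    ]
    return [setup, limit]


if __name__ == "__main__":
    # Hypothetical values; a real prolog would take these from the job
    # environment and run the commands as root.
    for cmd in xfs_project_quota_commands(
        "/local/scratch", "/local/scratch/jobs/12345", 12345, "10g"
    ):
        print(shlex.join(cmd))
```

Both commands need root, and the filesystem has to be mounted with project quotas enabled for them to have any effect.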
What I would like to discuss is the more general topic of clearing
files from 'fast' storage.
Many sites I have seen have dedicated fast/parallel storage which
is referred to as scratch space.
The intention is to use this scratch space for the duration of a
project, as it is expensive.
However, I have often seen that the scratch space is used as
permanent storage, contrary to the intentions of whoever sized it,
paid for it and installed it.
I feel that the simplistic 'run a cron job and delete files older
than N days' is outdated.
My personal take is that hierarchical storage is the answer,
automatically pushing files to slower and cheaper tiers.
But the thought struck me - in the Slurm prolog script create a file called
THESE-FILES-WILL-SELF-DESTRUCT-IN-14-DAYS
Then run a cron job to decrement the figure 14
I guess that doesn't cope with running multiple jobs on the same
data set - but then again running a job marks that data as 'hot'
and you reset the timer to 14 days.
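The countdown-file idea can be sketched roughly as follows. This is only an illustration of the scheme as described: the function names `reset_marker` and `tick` are made up, and the sketch never deletes anything; it only reports when the counter reaches zero. The prolog would call `reset_marker`, a daily cron job would call `tick`.

```python
import re
import tempfile
from pathlib import Path
from typing import Optional, Tuple

MARKER_RE = re.compile(r"^THESE-FILES-WILL-SELF-DESTRUCT-IN-(\d+)-DAYS$")
DEFAULT_DAYS = 14


def marker_name(days: int) -> str:
    return f"THESE-FILES-WILL-SELF-DESTRUCT-IN-{days}-DAYS"


def find_marker(directory: Path) -> Optional[Tuple[Path, int]]:
    """Return (marker path, days left) or None if there is no marker."""
    for entry in directory.iterdir():
        match = MARKER_RE.match(entry.name)
        if match:
            return entry, int(match.group(1))
    return None


def reset_marker(directory: Path, days: int = DEFAULT_DAYS) -> Path:
    """Prolog step: create the marker, or reset it because data is 'hot'."""
    target = directory / marker_name(days)
    found = find_marker(directory)
    if found is None:
        target.touch()
    elif found[0] != target:
        found[0].rename(target)
    return target


def tick(directory: Path) -> Optional[int]:
    """Daily cron step: decrement the counter and return the days left.

    Returns None when the directory has no marker. A result of 0 means
    the directory is due for cleanup; deletion is left to the caller.
    """
    found = find_marker(directory)
    if found is None:
        return None
    path, days = found
    remaining = max(days - 1, 0)
    if remaining != days:
        path.rename(directory / marker_name(remaining))
    return remaining


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data = Path(tmp)
        reset_marker(data)          # job starts: 14 days
        tick(data)                  # next day: 13 days
        reset_marker(data)          # another job runs: back to 14
        print(find_marker(data)[0].name)
```

Because every job on the same directory resets the marker, several jobs sharing a data set keep it alive, and the clock only runs down once the last job has finished.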
What do most sites do for scratch space?
_______________________________________________
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf