Cluster Etiquette
A cluster is a shared commons: its performance and fairness depend on every user acting as a responsible steward. The principles below follow directly from treating your colleagues' time as seriously as your own.
🚫 The Login Node: No Exceptions
The login node enforces a hard per-user limit of 1 CPU core
and 1 GB of RAM via kernel cgroups. Processes that exceed
these limits are killed automatically, without warning and without saving
state. More importantly, the login node is shared simultaneously by every
connected user; a runaway process degrades the experience for everyone.
When in doubt, use salloc.
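If you catch yourself about to run something heavy on the login node, either request an interactive allocation with salloc (examples in the next section) or wrap the one-off command in srun so it runs on a compute node. A sketch, with illustrative resource figures and a hypothetical ./preprocess command:
# Runs under scheduler control on a compute node, not on the login node
srun --ntasks=1 --cpus-per-task=4 --mem=8G --time=00:30:00 ./preprocess input.dat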
🚫 Do Not Attempt to SSH Directly into Compute Nodes
Direct SSH access to compute nodes is not a policy request; it is technically enforced. The cluster uses FreeIPA Host-Based Access Control (HBAC) rules, which whitelist exactly which users may authenticate to which hosts. Regular user accounts are not permitted to SSH to compute nodes under any circumstances; only the Slurm daemon and system administrators are.
If you find yourself wanting to SSH into a node (to check on a running job, inspect memory usage, or debug interactively), the correct tools are:
# Attach to a running job's context and open a shell inside it
srun --jobid=JOB_ID --pty bash
# (on newer Slurm versions, add --overlap if the job's cores are all busy)
# then run btop, htop, or whatever you prefer
# Or, for a fresh interactive session on a compute node
# (salloc drops you into a shell inside the new allocation; exit to release it)
salloc --ntasks=1 --cpus-per-task=4 --mem=8G --time=01:00:00
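To find the JOB_ID of a running job, list your own jobs first:
# Show your queued and running jobs with their IDs
squeue -u $USER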
📁 Use $SLURM_TMPDIR for Temporary Job I/O
The cluster does not provide a shared /scratch filesystem. Instead, each
job is given a private temporary directory on the local disk of the compute
node, exposed via the environment variable $SLURM_TMPDIR. This directory
offers fast local I/O with no network overhead, ideal for intermediate
files, temporary datasets, and anything that does not need to outlive the job.
#!/bin/bash
#SBATCH --job-name=my_analysis
#SBATCH ...
# $SLURM_TMPDIR is set automatically by Slurm for every job
# It is fast local storage, private to this job
echo "Temporary workspace: $SLURM_TMPDIR"
# Copy input data from /home or /project into local tmp
cp /home/your_username/data/input.hdf5 $SLURM_TMPDIR/
# Run your code, writing outputs to local tmp
cd $SLURM_TMPDIR
./my_calculator --input input.hdf5 --output result.dat
# Copy results you want to keep back to permanent storage before the job ends
cp $SLURM_TMPDIR/result.dat /home/your_username/results/
$SLURM_TMPDIR is destroyed when your job ends.
Slurm wipes this directory automatically at job completion โ whether the
job succeeded, failed, or was cancelled. Any file you want to keep
must be copied to /home/user or
/project before your script exits. Make this the last
step in every batch script that writes intermediate files.
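One way to make the copy-back survive failures and time-outs (a sketch, not site policy: --signal and trap are standard sbatch/bash features, but the paths and the 120-second warning window are illustrative):
#!/bin/bash
#SBATCH --job-name=my_analysis
#SBATCH --signal=B:TERM@120   # ask Slurm to SIGTERM the batch shell 120 s before the time limit

# Copy results back whenever the script exits, including after SIGTERM
save_results() {
    cp "$SLURM_TMPDIR"/result.dat /home/your_username/results/ 2>/dev/null
}
trap save_results EXIT TERM

cp /home/your_username/data/input.hdf5 "$SLURM_TMPDIR"/
cd "$SLURM_TMPDIR"
./my_calculator --input input.hdf5 --output result.dat &
wait   # wait returns early if SIGTERM arrives, letting the trap run before the kill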
The size of $SLURM_TMPDIR depends on the local disk capacity of the
assigned node. For jobs generating very large temporary files, check with the
support team on per-partition limits before designing your I/O strategy around
it.
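Inside a job, a quick way to see how much local space the assigned node actually offers:
# Show size and free space of the filesystem backing $SLURM_TMPDIR
df -h "$SLURM_TMPDIR"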
⏱️ Request Only the Resources You Will Actually Use
Over-requesting CPUs, memory, or time is one of the most common and costly forms of cluster misuse. Resources held by your job are unavailable to everyone else, including your own future jobs, since many schedulers factor in historical efficiency.
- CPUs: If your code is single-threaded, request --cpus-per-task=1. Requesting 32 cores for a serial job removes 31 cores from the shared pool for no gain.
- Memory: Use sacct to inspect the actual MaxRSS of past jobs (see the example below) and calibrate accordingly. A 20–30% margin above your typical peak is reasonable; 10× is not.
- Time: Jobs that run over their time limit are killed by Slurm. But requesting 48 hours for a 3-hour job prevents the backfill scheduler from slotting your job into short gaps, lengthening your own queue wait. Profile first, then budget with a modest margin.
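For example, to review what a finished job actually consumed (JOBID is a placeholder; the fields are standard sacct columns):
# Compare requested versus actually used resources for a past job
sacct -j JOBID --format=JobID,JobName,Elapsed,TotalCPU,MaxRSS,ReqMem,AllocCPUS,State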
🧪 Always Test at Small Scale First
Before submitting a large job array or a multi-day run, submit a single short test job with a small time limit and reduced problem size. Verify that it:
- Completes without errors
- Produces output of the expected format and magnitude
- Uses approximately the resources you requested
Discovering a path typo, a missing module, or a segfault after one task costs you one short queue wait. Discovering it after 200 tasks have run (or failed) costs the cluster far more, and you the embarrassment of filing a bug report with no useful data.
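A convenient way to do this for a job array (a sketch; my_array.sh and the numbers are placeholders) is to submit a single task with a short time limit. Options given on the sbatch command line override the #SBATCH directives inside the script:
# Run only task 0 of the array, capped at 15 minutes
sbatch --array=0 --time=00:15:00 my_array.sh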
🐢 Throttle Job Arrays
A job array with hundreds of tasks submitted without a concurrency cap can
flood the queue and starve other users. Always use the %N modifier to limit
simultaneous running tasks:
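For example (the range and cap are illustrative):
# 200 tasks total, but at most 10 running at any one time
sbatch --array=0-199%10 my_array.sh
# or, equivalently, as a directive inside the script:
#SBATCH --array=0-199%10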
There is no hard rule on the right cap; it depends on the array size, the partition, and the current load. As a rough guide: if your array would occupy more than roughly 20–25% of a partition's total cores on its own, reduce the concurrency.
🗑️ Keep /home and /project Tidy
/home and /project are backed-up, shared network filesystems. They are
not unlimited, and heavy parallel I/O from many jobs writing simultaneously
to these paths can saturate the storage network and slow down the entire
cluster.
- Use $SLURM_TMPDIR for all intermediate, temporary, and throwaway files during a job run.
- Write final outputs (the results you actually care about) back to /home or /project at the end of your job.
- Audit your usage periodically and remove data you no longer need (see the sketch below). Storage quotas exist and will be enforced.
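A quick audit (du and sort are standard tools; the depth and count are arbitrary choices):
# List your top-level directories by size, largest first
du -h --max-depth=1 ~ | sort -rh | head -n 15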
💤 Release Interactive Sessions When Done
An salloc session holds a real compute allocation for its entire duration,
even if you are idle, have gone for lunch, or forgotten the terminal is open.
Other users' jobs are queued behind the resources your session is holding.
- Exit your session with exit as soon as your interactive work is complete.
- Request realistic time limits for salloc: --time=01:00:00 for a debugging session, not --time=24:00:00.
- If you realise you no longer need an allocation, cancel it explicitly, as shown below:
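# Find the allocation's job ID, then cancel it
squeue -u $USER
scancel JOB_ID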
💣 Do Not Submit the Same Job Repeatedly Without Reading the Error
If a job fails, read the .err log before resubmitting. Submitting the same
broken job five times in a row wastes queue slots and contributes to
unnecessary scheduler churn. The error log almost always contains enough
information to diagnose the problem โ check Troubleshooting
if you are unsure how to interpret it.
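A minimal pre-resubmission check (a sketch; the log filename depends on your #SBATCH --output/--error settings, and slurm-JOBID.out is Slurm's default):
# Confirm how the job ended, then read its log
sacct -j JOBID -o JobID,State,ExitCode
less slurm-JOBID.out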
For account issues, software requests, quota increases, or anything not covered here, email admins@lcm.mi.infn.it. Please include your username, relevant Job ID(s), and the full text of any error messages.