Using HPC Resources#

On an HPC cluster, you don’t run heavy jobs directly on the login node. Instead, you request resources on compute nodes from a job scheduler (e.g. SLURM).

Submitting jobs (sbatch)#

You submit a script (e.g. job.sh) to the scheduler that describes the resources you need and the commands to run:

#!/bin/bash
#SBATCH --job-name=test_job     # name shown in the queue
#SBATCH --cpus-per-task=4       # 4 CPU cores
#SBATCH --mem=8G                # 8 GB of RAM
#SBATCH --time=08:00:00         # maximum walltime of 8 hours
#SBATCH --output=job.out        # file capturing the job's output

module load python/3.10
python myscript.py

Submit with:

sbatch job.sh

The scheduler puts the job in a queue.
When resources are available, the job runs on a compute node.
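sbatch replies with the ID assigned to the job, for example "Submitted batch job 12345" (the number here is only a placeholder). Once the job starts, you can follow its output file as it is written:

tail -f job.out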

Interactive sessions#

For testing, debugging, or other short work, you can request a temporary interactive shell with reserved resources:

srun --partition=short --cpus-per-task=4 --mem=8G --time=08:00:00 --pty bash -i

This gives you a shell with the requested resources, where you can run commands interactively.

Option              Explanation
--partition=short   queue/partition to use
--cpus-per-task=4   request 4 CPU cores
--mem=8G            request 8 GB of RAM
--time=08:00:00     maximum walltime of 8 hours
--pty bash -i       allocate a pseudo-terminal and start an interactive bash shell on the compute node, instead of a background job
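Inside the interactive shell, you work on the compute node just as in a normal terminal. A minimal sketch (the module name and script are only examples):

module load python/3.10    # load the software you need
python myscript.py         # run and debug interactively
exit                       # leave the shell and release the reserved resources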

Monitoring jobs#

Here are the essential SLURM commands for monitoring jobs:

Command                  Explanation
sinfo                    show available partitions, node states, and how busy the cluster is
squeue -u $USER          list your queued and running jobs
scontrol show job JOBID  detailed info about one job: resources requested, current state, which node it runs on
scancel JOBID            cancel a job immediately
sacct -j JOBID           show resource usage after the job has completed
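For example, to follow up on a specific job (12345 is a placeholder job ID, and the sacct format fields are one possible selection):

squeue -u $USER                                      # PD = pending, R = running
scontrol show job 12345                              # full details of the allocation
sacct -j 12345 --format=JobID,Elapsed,MaxRSS,State   # resource usage after completion
scancel 12345                                        # cancel the job if something went wrong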