Slurm

From SCECpedia
Revision as of 02:06, 9 March 2018 by Maechlin (talk | contribs)

USC HPC is moving to the Slurm job manager, so here are some notes on using that system at HPC.

Converting from Torque (PBS) to Slurm

Here are some differences between SLURM and Torque.

  1. SLURM seems much snappier, at least at Stampede. I can submit a job, then immediately check the job status, and if there are available resources it will have already started. Torque (at HPC at least) has a scheduling delay.
  2. SLURM can be annoying with the placement and naming of STDOUT/STDERR files: by default it combines both streams into a single slurm-<job-id>.out file. I ended up creating a script called "qsub" (because I'm used to that command) which does the following to give Torque-style .o<ID> and .e<ID> files in the submission directory:
    #!/bin/bash
    sbatch -o ${1}.o%j -e ${1}.e%j "$1"
  3. There are different commands for managing jobs. The basic ones are:
    1. qstat -u kmilner => squeue -u kmilner
    2. qdel <job-id> => scancel <job-id>
    3. qsub <job-file> => sbatch <job-file> (or use my qsub script above)
  4. The headers are different. Here are equivalent headers for the same job, with Stampede2-style syntax (it might be slightly different for HPCC):
SLURM:

#SBATCH -t 00:60:00
#SBATCH -N 2
#SBATCH -n 40
#SBATCH -p scec

PBS (Torque):

#PBS -q scec
#PBS -l walltime=00:60:00,nodes=2:ppn=20
#PBS -V
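Putting the SLURM headers above into a complete job file, a minimal script might look like the sketch below. The partition name scec comes from the example above; the program name ./my_program is a placeholder for your own executable:

```shell
#!/bin/bash
#SBATCH -t 00:60:00   # wall-clock time limit (60 minutes)
#SBATCH -N 2          # number of nodes
#SBATCH -n 40         # total number of tasks (20 per node)
#SBATCH -p scec       # partition (queue) to submit to
#SBATCH -o job.o%j    # Torque-style STDOUT file: job.o<job-id>
#SBATCH -e job.e%j    # Torque-style STDERR file: job.e<job-id>

# srun launches the tasks across the allocated nodes
srun ./my_program
```

Submit it with sbatch (e.g. sbatch my_job.slurm), then check it with squeue -u $USER and cancel it with scancel <job-id>.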
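The mapping between the two header blocks can also be expressed as a small script. The following is a sketch: pbs_to_slurm is a hypothetical helper that handles only the directives shown in this article, not general PBS syntax.

```shell
#!/bin/bash
# Sketch: translate the Torque (PBS) headers above into their SLURM
# equivalents. Handles only the directives shown in this article.
pbs_to_slurm() {
  while IFS= read -r line; do
    case "$line" in
      "#PBS -q "*)
        # queue -> partition
        q=${line#"#PBS -q "}
        echo "#SBATCH -p $q"
        ;;
      "#PBS -l walltime="*)
        # walltime=HH:MM:SS,nodes=N:ppn=P -> -t, -N, and -n (N * P tasks)
        spec=${line#"#PBS -l "}
        walltime=${spec#walltime=}
        walltime=${walltime%%,*}
        nodes=${spec#*nodes=}
        nodes=${nodes%%:*}
        ppn=${spec#*ppn=}
        echo "#SBATCH -t $walltime"
        echo "#SBATCH -N $nodes"
        echo "#SBATCH -n $((nodes * ppn))"
        ;;
      "#PBS -V")
        # SLURM exports the submission environment by default,
        # so no equivalent directive is emitted
        ;;
      *)
        echo "$line" ;;
    esac
  done
}

pbs_to_slurm <<'EOF'
#PBS -q scec
#PBS -l walltime=00:60:00,nodes=2:ppn=20
#PBS -V
EOF
```

Note that #PBS -V produces no output line: SLURM passes your environment to the job by default, so no flag is needed.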

Related Entries