====== Colossus resources (TSD queuing system) =====

We have four queues:

  * p33
  * p33_norment
  * p33_norment_dev
  * p33_tsd

The p697 project has the same queues under different names (p697_xxx).

Run the "qsumm" command to check the status of the queues (how busy they are) and the total number of cores in each queue.
For more info, see the [[https://www.uio.no/english/services/it/research/sensitive-data/help/hpc/queue-system.html|TSD documentation]].

====== Colossus usage policy =====

1. Each user should try to limit themselves to one queue (p33 or p33_norment) at a time. It's OK to submit a few more jobs to the other queue for developing new scripts or for small-scale analyses, but one should not occupy both queues in full.

2. Try to avoid submitting more than 2500 jobs at a time. According to the [[https://www.uio.no/english/services/it/research/sensitive-data/help/hpc/queue-system.html|TSD documentation]], there is a limit of 4500 jobs per project. So far it seems that this limit is actually per user, not per project, but it would be good to clarify this. Please comment [[https://github.com/norment/tsd_issues/issues/14|here]] if you have any input on this.
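One way to follow the 2500-job guideline is to count your queued jobs before submitting more. The sketch below is only an illustration: ''remaining_slots'' is a hypothetical helper name, the default cap of 2500 simply mirrors the guideline above, and the count relies on ''squeue -h -r -u $USER'' printing one line per job with array tasks expanded.

```shell
#!/bin/bash
# Hypothetical helper: how many more jobs fit under a self-imposed cap.
# Usage: remaining_slots <current_jobs> [cap]   (cap defaults to 2500)
remaining_slots() {
    rs_current=$1
    rs_cap=${2:-2500}
    rs_left=$(( rs_cap - rs_current ))
    if [ "$rs_left" -lt 0 ]; then
        rs_left=0
    fi
    echo "$rs_left"
}

# Count your own pending and running jobs.
# -h drops the header line, -r lists each array task on its own line.
# This part only runs where squeue is available (e.g. on a submit host).
if command -v squeue >/dev/null 2>&1; then
    current=$(squeue -h -r -u "$USER" | wc -l)
    echo "You have $current jobs queued; room for $(remaining_slots "$current") more."
fi
```

For example, ''remaining_slots 2400'' prints 100, and any count above the cap prints 0.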

3. Use "sbatch --nice 1000" for big jobs. It's hard to give specific guidelines on this, because it is a new option and we still need to learn how it works. The TSD team has commented that it may be especially handy for our internal queue (p33_norment), where we compete only among ourselves and not with other projects. For now, as a very rough suggestion, I think we can try adding "sbatch --nice 1000" if you're running computations that require 100,000 CPU hours or more in total. Please comment [[https://github.com/norment/tsd_issues/issues/42|here]] if you've tried this solution and it caused you issues.
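To judge whether a batch of jobs reaches the rough 100,000 CPU-hour mark, the total can be estimated as jobs × CPUs per job × hours per job. The sketch below does this arithmetic; ''total_cpu_hours'' and ''nice_flag'' are hypothetical helper names, and the example numbers are made up. The flag is written as ''--nice=1000'' because the sbatch man page documents the option as ''--nice[=adjustment]''.

```shell
#!/bin/bash
# Hypothetical helper: total CPU hours = jobs * CPUs per job * hours per job.
# Usage: total_cpu_hours <jobs> <cpus_per_job> <hours_per_job>
total_cpu_hours() {
    echo $(( $1 * $2 * $3 ))
}

# Hypothetical helper: print the suggested sbatch flag, or nothing.
# The 100000 threshold is the rough suggestion from the usage policy.
nice_flag() {
    if [ "$1" -ge 100000 ]; then
        echo "--nice=1000"
    fi
}

# Made-up example: 3000 jobs, 4 CPUs each, 10 hours each.
hours=$(total_cpu_hours 3000 4 10)
echo "Estimated CPU hours: $hours"
echo "Suggested flag: $(nice_flag "$hours")"
```

With these example numbers the estimate is 120000 CPU hours, so the flag is suggested; below the threshold ''nice_flag'' prints nothing.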

4. Run the **pending, qsumm, squeue, cost --detail** commands often. Learn how to interpret their output, and monitor the status of the queues before and after you submit your jobs.

====== GPU Access on p33 cluster =====

The reserved "ai_hub" queue dedicated to GPU usage has been merged with the regular p33 reservation queue. Colossus has 4 GPU nodes with 2 Nvidia Tesla V100 GPUs each. Two nodes are available to all users and two are reserved for dedicated projects. You can run jobs on the GPU nodes like this:

**sbatch --account=YourProject --partition=accel --gres=gpu:1** \\
or \\
**sbatch --account=YourProject --partition=accel --gres=gpu:2**

depending on how many GPUs the job needs.

An example Slurm script header for GPU jobs:

<code>
#!/bin/bash
#SBATCH --job-name=gpujob
#SBATCH --account=p33_tsd
#SBATCH --time=48:00:00
#SBATCH --cpus-per-task=2
#SBATCH --mem-per-cpu=8G
#SBATCH --partition=accel
#SBATCH --gres=gpu:1
</code>

====== TSD HPC knowledge transfer sessions ======


Notes from Sep 24, 2020:

<code>
- Kerberos authentication for /cluster has several tricky features
  and depends on how it's used from "ssh", "kinit" and "kinit -R".
$ sbatch --nice <value>   # a positive value decreases priority
$ scontrol show job <jobid>   # check priority of a job
- TSD will share a command to list the expected start time (pending jobs)
- Alex to follow up on issues with singularity containers
$ df -h /cluster/projects/p33/   # shows available disk space
$ df -hi /cluster/projects/p33/   # shows available inodes (#files)
#SBATCH --gres=localtmp:20  # allocate 20 GB on $LOCALTMP
#cleanup cp $LOCALTMP/outputfile $SLURM_SUBMIT_DIR  (early in the script)
- "cleanup cp" is added to the epilog script of the job and is executed even if the job crashes or times out.

</code>

The video recording is shared here:
https://filesender2.uio.no/?s=download&token=2451027b-9083-423a-9ba7-0e94189ba120
until Oct 31, 2020. After this date, the video recording is available in the p33 project:
<code>
/tsd/p33/data/durable/characters/ofrei/tsd_hpc_recordings
</code>


====== Limit on how many SLURM jobs a user / a project can have on Colossus ======

> Q1. Please clarify the limit on how many SLURM jobs a user can have on
> Colossus. The documentation says that the limit is currently 400,
> which is incorrect - I've submitted several job arrays which in total
> have about 3000 jobs. I would like to know the limit to avoid
> submitting too many jobs at a time. It's important to know whether a
> user and a project have different limits w.r.t. how many jobs they can
> submit to a queue.
Several limits exist. Slurm can handle 15k jobs in the queue (MaxJobLimit). Users can submit 4500 jobs (MaxSubmit). A job is limited to 40k steps (MaxJobSteps), and a job array is limited to 4k elements (MaxArray). A project can have 1500 CPUs allocated at any one time (GrpTRES limit on billing, applied separately to p33, p33_norment and p33_tsd).

> Q2. Same question as above, but for a project. I.e., can one user
> exhaust project limit and prevent other users from submitting jobs?
The project limit is 4500 jobs, which can be submitted by a single user.

> Q3. We now have several queues - p697, p697_norment, p697_tsd. Does
> the limit apply to each queue separately, or is it a total limit?
Separately. Those are separate accounting projects in Slurm.



====== Names of the nodes in each queue ======

  * sigma: c1-[16-34,36-40]
  * norment: c1-[6-7],c2-[1-10]

This can be used to constrain jobs to a subset of nodes with SLURM's --nodelist and/or --exclude flags.
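As a sketch of how the node names might be used: ''--exclude'' keeps a job off the listed nodes, while ''--nodelist'' asks for an allocation that contains every listed node, so ''--exclude'' is often the simpler choice for restricting a single-node job to a subset. The helper name ''build_exclude_cmd'', the script name ''myjob.sh'', and the pairing of the norment node list with the p33_norment account are assumptions for illustration only.

```shell
#!/bin/bash
# Hypothetical helper: build (but do not run) an sbatch command that keeps
# a job off the given nodes.
# Usage: build_exclude_cmd <account> <nodes-to-exclude> <job-script>
# The node list is printed in single quotes so that the brackets are not
# treated as a glob pattern when the command is pasted into a shell.
build_exclude_cmd() {
    echo "sbatch --account=$1 --exclude='$2' $3"
}

# Assumed example: stay on c2-[1-10] by excluding the two c1 nodes
# from the norment node list.
build_exclude_cmd p33_norment 'c1-[6-7]' myjob.sh
```

The helper only prints the command, so it can be inspected before anything is submitted.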

SLURM's "sinfo" and "scontrol show node" are restricted for security reasons.

