
Cluster usage rules

The first rule to know and follow about the use of the LBT computational clusters is that it is strictly forbidden to run any kind of script on the master nodes. Anyone caught infringing this rule will be banned from the list of authorized users.

Depending on its financial investment, each group has its own quotas, both for computing and storage capacity, and these may differ from one cluster to another.

Below are the respective semi-annual quotas for Lucifer, Hades and Baal:

  • on Lucifer:
Group name       | Project account    | CPU time  | Archive quotas | Home quotas   | Workdir quotas
amyloid_team     | amyloid_project    | 0h        | 7TB - 140kf    | 2.3TB - 92kf  | 9.4TB - 752kf
baaden_team      | baaden_project     | 0h        | 18TB - 360kf   | 6.6TB - 264kf | 1.2TB - 96kf
biou_team*       | biou_project       | 0h        | 7TB - 140kf    | /             | 7TB - 560kf
derreumaux_team  | derreumaux_project | 0h        | 5TB - 100kf    | 1.6TB - 64kf  | 3.4TB - 272kf
lafontaine_team* | lafontaine_project | 421632h** | 2TB - 40kf     | /             | 2TB - 160kf
sacquin_team     | sacquin_project    | 0h        | 13TB - 260kf   | 3.3TB - 132kf | 13.4TB - 1072kf
simlab_team      | simlab_project     | 3935232h  | 3TB - 60kf     | 1TB - 40kf    | 1TB - 80kf
sterpone_team    | sterpone_project   | 0h        | 35TB - 700kf   | 4.6TB - 194kf | 9.4TB - 752kf
stirnemann_team  | stirnemann_project | 0h        | 25TB - 500kf   | 1.75TB - 70kf | 200GB - 16kf
  • on Hades:
Group name       | Project account    | CPU time  | Archive quotas | Home quotas   | Workdir quotas
amyloid_team     | amyloid_project    | 0h        | 7TB - 140kf    | 2.3TB - 92kf  | 200GB - 16kf
baaden_team      | baaden_project     | 0h        | 18TB - 360kf   | 6.6TB - 264kf | 26.8TB - 2144kf
biou_team*       | biou_project       | 0h        | 7TB - 140kf    | /             | 7TB - 560kf
derreumaux_team  | derreumaux_project | 0h        | 5TB - 100kf    | 1.6TB - 64kf  | 3.4TB - 272kf
lafontaine_team* | lafontaine_project | 153792h** | 2TB - 40kf     | /             | 2TB - 160kf
sacquin_team     | sacquin_project    | 0h        | 13TB - 260kf   | 3.3TB - 132kf | 200GB - 16kf
simlab_team      | simlab_project     | 1528416h  | 3TB - 60kf     | 1TB - 40kf    | 1TB - 80kf
sterpone_team    | sterpone_project   | 0h        | 35TB - 700kf   | 4.6TB - 194kf | 9.4TB - 752kf
stirnemann_team  | stirnemann_project | 0h        | 25TB - 500kf   | 1.75TB - 70kf | 200GB - 16kf
  • on Baal:
Group name       | Project account    | CPU time  | Archive quotas | Home quotas   | Workdir quotas
amyloid_team     | amyloid_project    | 0h        | 7TB - 140kf    | 2.3TB - 92kf  | 200GB - 16kf
baaden_team      | baaden_project     | 0h        | 18TB - 360kf   | 6.6TB - 264kf | 1.2TB - 96kf
biou_team*       | biou_project       | 0h        | 7TB - 140kf    | /             | 7TB - 560kf
derreumaux_team  | derreumaux_project | 0h        | 5TB - 100kf    | 1.6TB - 64kf  | 200GB - 16kf
lafontaine_team* | lafontaine_project | 41040h**  | 2TB - 40kf     | /             | 2TB - 160kf
sacquin_team     | sacquin_project    | 0h        | 13TB - 260kf   | 3.3TB - 132kf | 200GB - 16kf
simlab_team      | simlab_project     | 926712h   | 3TB - 60kf     | 1.6TB - 64kf  | 1TB - 80kf
sterpone_team    | sterpone_project   | 0h        | 35TB - 700kf   | 4.6TB - 184kf | 800GB - 64kf
stirnemann_team  | stirnemann_project | 1633824h  | 25TB - 500kf   | 1.75TB - 70kf | 25TB - 2Mf

*: the archive and workdir volumes dedicated to the biou_team and lafontaine_team UNIX groups are located on the same server (but on two distinct block devices) and are shared by all clusters. They are therefore neither distributed nor replicated.
**: temporary allocations granted by over-booking.

For all clusters, depending on your laboratory of assignment (LBT or elsewhere) or your membership agreement, you may use the simlab_project CPU accounting (and/or others) to run your jobs.

As you may notice, there is no quota on the /scratch directory of the computing nodes, because you are required to clean up all temporary directories at the end of your job. If you do not, an automatic script will do it for you. Keep this in mind and synchronize your produced data back to your workdir before your job runtime ends.
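
For illustration only, here is a minimal Python sketch of that end-of-job housekeeping. The /scratch and workdir layouts and the JOB_ID variable name are assumptions made for this sketch (this page does not define them); adapt them to the variables actually exported by your submission script.

<code python>
#!/usr/bin/env python3
"""End-of-job housekeeping sketch: copy results back, then clean /scratch.

All paths and variable names below are assumptions for illustration;
adjust them to your group's workdir layout and submission script.
"""
import os
import shutil
import subprocess

user = os.environ["USER"]
job_id = os.environ.get("JOB_ID", "manual_run")       # assumed variable name
scratch_dir = f"/scratch/{user}/{job_id}"             # assumed scratch layout
workdir_dest = f"/workdir/{user}/results/{job_id}"    # assumed workdir layout

# 1) Back-synchronize produced data to the workdir before the walltime expires.
os.makedirs(workdir_dest, exist_ok=True)
subprocess.run(["rsync", "-a", scratch_dir + "/", workdir_dest + "/"], check=True)

# 2) Clean your temporary directory yourself; otherwise the automatic
#    cleanup script mentioned above will remove it for you.
shutil.rmtree(scratch_dir, ignore_errors=True)
</code>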

A script runs every hour to check the quota status. If you exceed any of your disk quotas, your PI will be given a one-day reprieve to fix the problem (plus one more day in quarantine); failing that, your group will be excluded from the access policy and your session(s) will be killed until a solution is found.
You can monitor your storage consumption by displaying the above-mentioned script's output files (/shared/cluster_help/*.quota).
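
If you prefer to script that monitoring, a small sketch is given below. The internal format of the *.quota report files is not documented on this page, so the snippet simply filters their lines for your UNIX group name (the group shown is just an example).

<code python>
#!/usr/bin/env python3
"""Print the quota-report lines that concern a given UNIX group.

The reports live under /shared/cluster_help/*.quota (see above); their
exact format is not documented here, so this sketch only greps them.
"""
import glob

GROUP = "simlab_team"   # replace with your own UNIX group name

for report in sorted(glob.glob("/shared/cluster_help/*.quota")):
    with open(report) as fh:
        matches = [line.rstrip() for line in fh if GROUP in line]
    if matches:
        print(f"== {report} ==")
        print("\n".join(matches))
</code>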

Because the home volume is not mounted on the computing nodes, remember that the only way to submit jobs is to do so directly from your workdir. For the same reason, if you start an interactive session on a computing node, keep in mind that the homedir you land in is in fact a bind mount onto your own workdir.
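
If you wrap your submissions in a script, a defensive check like the following can prevent submitting from the wrong place. Both the workdir prefix and the PBS-style qsub call are assumptions made for this sketch, not something defined by this page; substitute the paths and scheduler command actually used on your cluster.

<code python>
#!/usr/bin/env python3
"""Refuse to submit a job unless the current directory is under the workdir.

The workdir prefix and the submission command below are assumptions;
adjust them to your cluster's real layout and batch system.
"""
import os
import subprocess
import sys

WORKDIR_PREFIX = f"/workdir/{os.environ['USER']}"   # assumed mount point

cwd = os.path.realpath(os.getcwd())
if not cwd.startswith(os.path.realpath(WORKDIR_PREFIX)):
    sys.exit(f"Refusing to submit: {cwd} is not inside {WORKDIR_PREFIX}")

# Assumed PBS-style submission of a hypothetical job.pbs script.
subprocess.run(["qsub", "job.pbs"], check=True)
</code>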

Note that if you have completely exhausted your CPU time allocation [or if you do not have any CPU time allocation dedicated to your project] and you are a member of the LBT laboratory, you are allowed to use the simlab_project one.

In addition to these disk space and CPU time allocations, you should also know that the clusters [which do not share the same underlying technology] have differently designed queues (a sketch showing how to match a request against these limits follows the list):

  • Lucifer:
Queue name | Number of nodes | Number of cores | Maximum Walltime
complete   | 14 - 28         | 112 - 896       | 04:00:00
large      | 10 - 14         | 80 - 448        | 12:00:00
medium     | 4 - 9           | 32 - 288        | 24:00:00
small*     | 1 - 3           | 8 - 96          | 48:00:00
small_24h* | 1 - 2           | 8 - 64          | 24:00:00

*: the main difference between the “small” and “small_24h” queues is that the small queue lets you use at most half of the cluster nodes; there is no such limitation with small_24h.

  • Hades:
Queue name | Number of nodes | Number of cores | Maximum Walltime
complete   | 21 - 29         | 126 - 348       | 04:00:00
large      | 8 - 20          | 48 - 240        | 12:00:00
medium     | 4 - 7           | 24 - 84         | 24:00:00
small*     | 1 - 3           | 6 - 36          | 48:00:00
small_24h* | 1 - 2           | 6 - 24          | 24:00:00

*: the same difference as the one described above for Lucifer.

  • Baal*:
Queue name        | Number of nodes | Number of cores | Maximum Walltime
test              | 1               | 1 - 8           | 00:20:00
monop             | 1               | 1 - 4           | 144:00:00
cryo_em           | 1 / 2 / 3       | 16 / 32 / 48    | 168:00:00 / 96:00:00 / 48:00:00
gpu_16c (default) | 1 / 2 / 3       | 16 / 32 / 48    | 24:00:00 / 16:00:00 / 08:00:00
gpu_40c (default) | 1 / 2 / 3       | 40 / 80 / 120   | 24:00:00 / 16:00:00 / 08:00:00

*: this cluster is mainly dedicated to GPU computing. That is why the “gpu*” queues are the default ones: any job submitted without a queue specification will run on GPU nodes (not on the monop or test nodes).
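
As announced before the list, here is a minimal Python sketch (not an official tool) showing how the node-count and walltime limits of such a table can be matched against a job request. The numbers are transcribed from the Lucifer table above; the helper and its names are purely illustrative.

<code python>
#!/usr/bin/env python3
"""Pick the Lucifer queues compatible with a node count and a walltime request.

Illustrative helper only; the limits are copied from the Lucifer table above.
"""

# (queue name, min nodes, max nodes, max walltime in hours)
LUCIFER_QUEUES = [
    ("complete", 14, 28, 4),
    ("large",    10, 14, 12),
    ("medium",    4,  9, 24),
    ("small",     1,  3, 48),
    ("small_24h", 1,  2, 24),
]

def candidate_queues(nodes, walltime_hours):
    """Return the names of the queues that can accept the request."""
    return [
        name
        for name, lo, hi, max_wt in LUCIFER_QUEUES
        if lo <= nodes <= hi and walltime_hours <= max_wt
    ]

if __name__ == "__main__":
    # Example: a 2-node, 30-hour job only fits the "small" queue.
    print(candidate_queues(2, 30))   # -> ['small']
</code>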

nohup, disown and similar commands that would keep your job alive on the clusters are prohibited.

Last, but not least, rule: if you run a job on a single node, you MUST use the /scratch volume for all your temporary files (i.e. do not read/write directly on /workdir during the computation). The easiest way to do so is to use the submission script provided here.
