Cluster usage rules

The first rule to know and follow about the use of the LBT computational clusters is that it is formally forbidden to run any kind of scripts/jobs/daemons on the masters. Anyone caught infringing this rule will be banned from the list of authorized users.

Depending on its financial investment, each group has its own quotas, both for computing and storage capacity, and these may differ from one cluster to another.

Knowing that the default quota is 750 GB / 60 kf (thousands of files) per user on /workdir, the respective half-yearly quotas for Baal are listed below:

| Group name       | Project account     | Dedicated CPU time | Archive quotas | Workdir quotas**  |
| amyloid_team     | amyloid_project     | 0h                 | 7TB - 140kf    | default only      |
| baaden_team      | baaden_project      | 0h                 | 18TB - 360kf   | default only      |
| barraud_team*    | barraud_project     | 0h                 | no quota       | default only      |
| biou_team*       | biou_project        | 0h                 | 7TB - 140kf    | 15TB - 1200kf     |
| boel_team*       | boel_project        | 0h                 | no quota       | default only      |
| bioinfo_team*    | bioinfo_project     | 0h                 | 1TB - 20kf     | default only      |
| derreumaux_team  | derreumaux_project  | 0h                 | 5TB - 100kf    | default only      |
| lafontaine_team* | bioinfo_project     | 0h                 | 1TB - 20kf     | default only      |
| meyer_team*      | meyer_project       | 0h                 | no quota       | default only      |
| sacquin_team     | sacquin_project     | 175680h            | 13TB - 260kf   | default only      |
| simlab_team      | simlab_project      | 1497672h           | 5TB - 100kf    | default only      |
| sterpone_team    | sterpone_project    | 0h                 | 35TB - 700kf   | default only      |
| stirnemann_team  | stirnemann_project  | 1633824h           | 25TB - 500kf   | 25TB - 2000kf**   |
| robert_team      | robert_project      | 338184h            | 3TB - 60kf     | 3TB - 240kf**     |
| vallon_team*     | vallon_project      | 0h                 | no quota       | default only      |
| ej_team          | ej_project          | 676368h            | no quota       | 8.5TB - 680kf**   |

*: the archive volume for these IBPC UNIX groups is located on a dedicated server, so this data is neither distributed nor replicated.
**: some group leaders bought their own storage space. So, in all fairness, their quota is their purchased capacity plus the default quota per group member. For LBT members, the Baal workdir quotas above are therefore approximate values, since they depend on the current number of group members (permanent staff, PhD students, postdocs, trainees, etc.).
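
To make footnote ** concrete, here is a minimal sketch of how such an approximate group workdir quota is derived (purchased capacity plus the default per-member quota). The member count and purchased capacity in the example are made-up figures, not real values for any group:

```python
# Illustrative sketch of footnote ** above: an approximate group /workdir quota on Baal
# is the purchased capacity plus the default per-member quota. The example figures
# (6 members, 20 TB / 1600 kf purchased) are hypothetical.

DEFAULT_PER_USER_GB = 750   # default /workdir quota per user (GB)
DEFAULT_PER_USER_KF = 60    # default /workdir quota per user (thousands of files)

def group_workdir_quota(n_members: int, purchased_gb: int = 0, purchased_kf: int = 0):
    """Approximate group quota = purchased capacity + default quota per member."""
    return (purchased_gb + n_members * DEFAULT_PER_USER_GB,
            purchased_kf + n_members * DEFAULT_PER_USER_KF)

# Hypothetical group of 6 members that bought 20 TB / 1600 kf of storage:
print(group_workdir_quota(6, purchased_gb=20_000, purchased_kf=1_600))
# (24500, 1960) -> roughly 24.5 TB and 1960 kf
```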

For all clusters, depending on the laboratory you are assigned to (LBT or elsewhere) or your membership agreement, you may use the simlab_project CPU accounting (and/or others) to run your jobs.

As you may notice, there is no quota on the /scratch[-dfs] directories of any compute node, because you must clean up all temporary directories at the end of your job. If you do not, an automatic script will do it for you. Please keep this in mind and synchronize your produced data back before the end of your job's runtime.
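
The paragraph above can be turned into a simple job-side workflow. The sketch below is only an illustration written as a Python wrapper: the paths, the PBS_JOBID variable and the directory layout are assumptions, and the admin-provided submission script mentioned further down remains the recommended way to do this.

```python
#!/usr/bin/env python3
"""Minimal sketch of the scratch workflow described above. All paths and the
PBS_JOBID variable are assumptions (adapt them to your job and scheduler)."""
import os
import shutil
import subprocess

workdir = "/workdir/mygroup/myuser/myproject"                      # hypothetical /workdir location
scratch = f"/scratch/myuser_{os.environ.get('PBS_JOBID', 'job')}"  # hypothetical per-job scratch dir

# 1. Stage input data onto the node-local scratch volume.
os.makedirs(scratch, exist_ok=True)
subprocess.run(["rsync", "-a", f"{workdir}/inputs/", f"{scratch}/inputs/"], check=True)

# 2. ... run the computation here, reading/writing only under `scratch` ...

# 3. Synchronize the produced data back to /workdir before the job ends.
subprocess.run(["rsync", "-a", f"{scratch}/outputs/", f"{workdir}/outputs/"], check=True)

# 4. Clean up scratch yourself (otherwise the automatic cleanup script will do it).
shutil.rmtree(scratch, ignore_errors=True)
```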

A script runs every hour to check quota status. If you exceed your disk quota, whichever quota it is, your PI gets a one-day reprieve to resolve the problem (plus one more day in quarantine); failing that, your group will be excluded from the access policy and your session(s) will be killed until a solution is found.
You can monitor your storage consumption by displaying the outputs of the above-mentioned script (/shared/cluster_help/*.quota).
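
If you prefer to filter those reports rather than read them in full, a small script can do it. This is only a sketch: the format of the *.quota files is not described here, so it simply prints the lines mentioning your user name (you may need to search for your group name instead).

```python
#!/usr/bin/env python3
"""Sketch: print the lines of the hourly quota reports that mention your user name.
The exact format of /shared/cluster_help/*.quota is an assumption here."""
import getpass
import glob

user = getpass.getuser()

for path in sorted(glob.glob("/shared/cluster_help/*.quota")):
    with open(path) as fh:
        matching = [line.rstrip() for line in fh if user in line]
    if matching:
        print(f"== {path} ==")
        print("\n".join(matching))
```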

Keep in mind that the homedir where you land is in fact a bind mount into your own workdir space.

Note that if you have completely exhausted your CPU time allocation (or if you do not have any CPU time allocation dedicated to your project) and you are an LBT laboratory member, you are allowed to use the simlab_project allocation.

In addition to these disk space and CPU time allocations, you should also know that the clusters (which do not all use the same hardware technology) have differently designed queues:

| Queue name         | Available nodes | Number of nodes | Number of cores | Maximum Walltime               |
| monop              | 2               | 1               | 1 - 4           | 144:00:00                      |
| gpu_40c_(h/1/2)n*  | 21              | ½ / 1 / 2       | 20 / 40 / 80    | 16:00:00 / 12:00:00 / 08:00:00 |
| nogpu_40c_(h/1)n   | 21              | ½ / 1           | 20 / 40         | 16:00:00 / 12:00:00            |
| alphafold_(h/1)n   | 3               | ½ / 1           | 8 / 16          | 168:00:00                      |
| alphafold2_(h/1)n  | 1               | ½ / 1           | 24 / 48         | 168:00:00                      |

*: this cluster is mainly dedicated to GPU computing. That is why the “gpu_40c_1n” queue is the default one: any job submitted without a queue specification will run on the GPU 40c nodes (not on the other ones).
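
As a convenience, the queue limits above can be encoded as a small lookup table to sanity-check a request before submission. This is a sketch only: the expanded queue names (e.g. gpu_40c_hn / gpu_40c_1n / gpu_40c_2n) and the mapping of the slash-separated walltimes to the half-/one-/two-node variants are our reading of the table.

```python
# Sketch: queue limits transcribed from the table above (walltimes in hours).
# The expanded queue names and the walltime-to-variant mapping are assumptions.
QUEUES = {
    "monop":         {"cores": range(1, 5), "max_hours": 144},
    "gpu_40c_hn":    {"cores": [20],        "max_hours": 16},
    "gpu_40c_1n":    {"cores": [40],        "max_hours": 12},   # default queue
    "gpu_40c_2n":    {"cores": [80],        "max_hours": 8},
    "nogpu_40c_hn":  {"cores": [20],        "max_hours": 16},
    "nogpu_40c_1n":  {"cores": [40],        "max_hours": 12},
    "alphafold_hn":  {"cores": [8],         "max_hours": 168},
    "alphafold_1n":  {"cores": [16],        "max_hours": 168},
    "alphafold2_hn": {"cores": [24],        "max_hours": 168},
    "alphafold2_1n": {"cores": [48],        "max_hours": 168},
}

def fits(queue: str, cores: int, hours: float) -> bool:
    """Return True if the requested cores/walltime fit the queue's limits."""
    limits = QUEUES[queue]
    return cores in limits["cores"] and hours <= limits["max_hours"]

print(fits("gpu_40c_1n", 40, 10))   # True
print(fits("monop", 2, 200))        # False: 200 h exceeds the 144 h walltime
```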

You should also pay attention to the amount of memory actually consumed by your jobs: it should not exceed the share of the node's memory corresponding to the fraction of the node's processors you requested.
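
As a worked example of this rule (the node memory figure below is a hypothetical value, not the actual RAM of the cluster's nodes):

```python
# Worked example of the memory rule above: use no more memory than the fraction
# of the node you occupy. NODE_RAM_GB is a hypothetical figure for illustration.
NODE_CORES = 40     # cores per standard 40c node (from the queue table)
NODE_RAM_GB = 192   # hypothetical node memory

def max_memory_gb(requested_cores: int) -> float:
    """Memory share proportional to the requested fraction of the node's cores."""
    return NODE_RAM_GB * requested_cores / NODE_CORES

print(max_memory_gb(10))   # 48.0 GB for a quarter of the node
print(max_memory_gb(40))   # 192.0 GB for a full node
```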

Last, but not least: if you are running a job on a single node, you MUST use the /scratch volume for all your temporary files (i.e. do not read/write directly on /workdir during the computation). If you are running a job on multiple nodes, favor /scratch-dfs.
The easiest way to do this is to use the submission script provided here.
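
For completeness, the single-node versus multi-node rule boils down to the following choice (the helper name is ours, not part of the provided submission script):

```python
# Sketch of the rule above: node-local /scratch for single-node jobs,
# the distributed /scratch-dfs for multi-node jobs.
def scratch_volume(n_nodes: int) -> str:
    return "/scratch" if n_nodes == 1 else "/scratch-dfs"

print(scratch_volume(1))   # /scratch
print(scratch_volume(4))   # /scratch-dfs
```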
