===== Cluster usage rules =====

The first rule to know and follow when using the LBT computational clusters is that it is strictly forbidden to run any kind of script, job or daemon on the master nodes. Anyone caught breaking this rule will be removed from the list of authorized users.

Because quotas depend on each group's financial investment, each group has its own quotas for both computing and storage capacity, and these may differ depending on the cluster you use. The default quota is 750GB/60kf per user on /workdir. The respective half-yearly quotas for Baal are listed below:

^ Group name ^ Project account ^ Dedicated CPU time ^ Archive quotas ^ Workdir quotas** ^
| amyloid_team | amyloid_project | 0h | 7TB - 140kf | default only |
| baaden_team | baaden_project | 0h | 18TB - 360kf | default only |
| barraud_team* | barraud_project | 0h | no quota | default only |
| biou_team* | biou_project | 0h | 7TB - 140kf | 15TB - 1200kf |
| boel_team* | boel_project | 0h | no quota | default only |
| bioinfo_team* | bioinfo_project | 0h | 1TB - 20kf | default only |
| derreumaux_team | derreumaux_project | 0h | 5TB - 100kf | default only |
| lafontaine_team* | bioinfo_project | 0h | 1TB - 20kf | default only |
| meyer_team* | meyer_project | 0h | no quota | default only |
| sacquin_team | sacquin_project | 175680h | 13TB - 260kf | default only |
| simlab_team | simlab_project | 1497672h | 5TB - 100kf | default only |
| sterpone_team | sterpone_project | 0h | 35TB - 700kf | default only |
| stirnemann_team | stirnemann_project | 1633824h | 25TB - 500kf | 25TB/2000kf** |
| robert_team | robert_project | 338184h | 3TB - 60kf | 3TB/240kf** |
| vallon_team* | vallon_project | 0h | no quota | default only |
| ej_team | ej_project | 676368h | no quota | 8.5TB/680kf** |

//*: the archive volume for these IBPC UNIX groups is located on a dedicated server, so it is neither distributed nor replicated.\\
**: some group leaders bought their own storage space. In all fairness, they therefore get the quota they purchased plus the default quota for each group member. For LBT members only, the Baal Workdir quota above is thus an approximate value, because it depends on the current number of group members (permanent staff, PhD students, postdocs, trainees, etc.).//

For all clusters, depending on your home laboratory (LBT or elsewhere) or your membership agreement, you may use simlab_project's CPU accounting (and/or others) to run your jobs.

As you may have noticed, there is no quota on the /scratch[-dfs] directories of the computing nodes, because you must clean up all temporary directories at the end of each job. If you do not, an automatic script will do it for you. With this in mind, remember to synchronize the data you produced back from scratch before your job ends.

A script runs every hour to check the quota status. If you exceed your disk quota, whichever quota it is, your PI gets a one-day reprieve to solve the problem (and one more day in quarantine); failing that, your group will be denied access and your session(s) killed until a solution is found.\\
You can monitor your storage consumption by displaying the output of the above-mentioned script (**/shared/cluster_help/*.quota**).

Keep in mind that the home directory you work in is in fact a bind into your own workdir space.

Note that if you have completely used up your CPU time allocation (or if you do not have any CPU time allocation dedicated to your project) and you are a member of the LBT laboratory, you are allowed to use the simlab_project one.
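
As a rough illustration of the storage quotas described above, the following Python sketch totals the size and number of files under a directory and compares them with the default 750GB/60kf limit. It assumes that "kf" means thousands of files and that 1 GB is 10^9 bytes; the function names are hypothetical and the script is not part of the cluster tooling. The authoritative figures remain those reported in /shared/cluster_help/*.quota.

```python
import os
import sys

# Default per-user quota on /workdir, as stated above.
# Assumption: "GB" means 10**9 bytes and "kf" means thousands of files.
DEFAULT_MAX_BYTES = 750 * 10**9
DEFAULT_MAX_FILES = 60 * 1000


def measure_usage(root):
    """Return (total_bytes, file_count) for the non-directory entries under root."""
    total_bytes = 0
    file_count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat does not follow symlinks, so link targets are not counted twice.
                total_bytes += os.lstat(path).st_size
            except OSError:
                continue  # the file vanished or is unreadable: skip it
            file_count += 1
    return total_bytes, file_count


def over_quota(total_bytes, file_count,
               max_bytes=DEFAULT_MAX_BYTES, max_files=DEFAULT_MAX_FILES):
    """Return True if either the size limit or the file-count limit is exceeded."""
    return total_bytes > max_bytes or file_count > max_files


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    used_bytes, used_files = measure_usage(root)
    print(f"{root}: {used_bytes / 10**9:.2f} GB, {used_files} files")
    print("over default quota" if over_quota(used_bytes, used_files)
          else "within default quota")
```

Both limits are checked because a quota such as 750GB/60kf can be exceeded through the number of files alone, even when the total size is small.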
In addition to these disk space and CPU time allocations, you should also know that the clusters, which are not built on the same technology, have different queues:

^ Queue name ^ Available nodes ^ Number of nodes ^ Number of cores ^ Maximum walltime ^
| monop | 2 | 1 | 1 - 4 | 144:00:00 |
| gpu_40c_(h/1/2)n* | 25 | ½ / 1 / 2 | 20 / 40 / 80 | 16:00:00 / 12:00:00 / 8:00:00 |
| nogpu_40c_(h/1)n | 25 | ½ / 1 | 20 / 40 | 16:00:00 / 12:00:00 |
| alphafold_(h/1)n | 3 | ½ / 1 | 8 / 16 | 168:00:00 |
| alphafold2_(h/1)n | 5 | ½ / 1 | 24 / 48 | 168:00:00 |

//*: this cluster is mainly dedicated to GPU computing. That is why "gpu_40c_1n" is the default queue: any job submitted without a queue specification will run on the GPU 40c CPU nodes (and not on any others).//

You should also pay attention to the amount of memory actually consumed by your jobs, which should not exceed the share of the node's memory that corresponds to the proportion of the node's processors you requested.

Last but not least: if you are running a job on a single node, you __MUST__ use the /scratch volume for all your temporary files (i.e. do not read or write directly on /workdir during the computation). If you are running a job on multiple nodes, you should use /scratch-dfs instead.\\
The easiest way to do so is to use the submission script provided [[cluster-lbt:quick_start_guide|here]].
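
To make the scratch rule more concrete, here is a minimal Python sketch of the pattern it describes: compute in a private directory under scratch, copy the results back once at the end, and always clean up. The helper name run_in_scratch and its parameters are hypothetical, and the sketch assumes Python 3.8 or later; it only illustrates the idea and does not replace the provided submission script.

```python
import shutil
import tempfile
from pathlib import Path


def run_in_scratch(work, results_dir, scratch_root="/scratch"):
    """Run work(tmp_dir) in a private scratch directory, then copy results back.

    work:         callable taking a Path; it writes its files into that directory.
    results_dir:  destination for the results (for example under /workdir).
    scratch_root: /scratch for single-node jobs, /scratch-dfs for multi-node jobs.
    """
    tmp_dir = Path(tempfile.mkdtemp(prefix="job_", dir=scratch_root))
    try:
        work(tmp_dir)
        # Copy everything back in one go, only after the computation is over.
        shutil.copytree(tmp_dir, results_dir, dirs_exist_ok=True)
    finally:
        # Always remove the scratch directory, even if the job failed.
        shutil.rmtree(tmp_dir, ignore_errors=True)
    return Path(results_dir)


if __name__ == "__main__":
    # Demo with throwaway directories standing in for /scratch and /workdir.
    with tempfile.TemporaryDirectory() as fake_scratch, \
            tempfile.TemporaryDirectory() as fake_workdir:

        def work(tmp):
            (tmp / "output.txt").write_text("done\n")

        out = run_in_scratch(work, Path(fake_workdir) / "results",
                             scratch_root=fake_scratch)
        print(sorted(p.name for p in out.iterdir()))
```

Putting the cleanup in a finally clause means the temporary directory is removed whether or not the computation succeeds, which matches the requirement to leave scratch clean at the end of a job.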