cluster-lbt:jobs_submission
- | cluster-lbt:jobs_submission [2021/05/31 10:30] (Current version) - created - external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
+ | ===== Job submissions ===== | ||
+ | ==== Basic commands ==== | ||
+ | Below are some quick and useful commands for job management: | ||
+ | <note warning> | ||
+ | * submit a job script: | ||
+ | <cli prompt=" | ||
+ | $ qsub < | ||
+ | </ | ||
+ | * run an interactive session: | ||
+ | <cli> | ||
+ | $ qsub -I -A <credit account> | ||
+ | </ | ||
+ | * submit a new job (scripted or interactive) regardless of your current location in the filesystem: | ||
+ | <cli> $ qsub -d `/ | ||
+ | The // | ||
+ | * stop/kill a job: | ||
+ | <cli> | ||
+ | $ qdel < | ||
+ | </ | ||
+ | * get information about your scheduled job: | ||
+ | <cli> | ||
+ | $ checkjob -vv < | ||
+ | </ | ||
+ | * graphically view the allocated resources: | ||
+ | <cli> | ||
+ | $ pbstop | ||
+ | </ | ||
+ | * get your group's usage statistics for a given date: | ||
+ | <cli> | ||
+ | $ gusage -s <date (yyyymmdd)> | ||
+ | </ | ||
+ | * list the usage records for a given date: | ||
+ | <cli> | ||
+ | $ gstatement -s <date (yyyymmdd)> | ||
+ | </ | ||
+ | **NB:** the --summarize option provides a summary | ||
+ | * get your remaining allocated time: | ||
+ | <cli> | ||
+ | $ mybalance -h | ||
+ | </ | ||
+ | * display free resources in text format: | ||
+ | <cli> | ||
+ | $ showbf | ||
+ | </ | ||
+ | * get the estimated start date for job execution: | ||
+ | <cli> | ||
+ | $ showstart < | ||
+ | </ | ||
+ | * list the queued jobs: | ||
+ | <cli> | ||
+ | $ showq | ||
+ | </ | ||
+ | * get queue info and stats: | ||
+ | <cli> | ||
+ | $ qstat -q | ||
+ | </ | ||
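The `gusage` and `gstatement` commands listed above take their date as `yyyymmdd`. As a minimal sketch, that string can be produced with the standard `date` command rather than typed by hand. The helper names `first_of_month` and `today` are hypothetical and exist only for this illustration.

```shell
#!/bin/bash
# Hypothetical helpers (not part of the cluster tools) that build
# yyyymmdd strings with the standard `date` command.

# First day of the current month: %Y = 4-digit year, %m = 2-digit month,
# followed by a literal "01".
first_of_month() {
    date +%Y%m01
}

# Today's date in the same yyyymmdd layout.
today() {
    date +%Y%m%d
}

echo "start of month: $(first_of_month)"
echo "today:          $(today)"
```

The result can then be given as the value of `-s`, for example `-s "$(first_of_month)"`.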
+ | ==== Resources request ==== | ||
+ | Here are some ways to request computational resources: | ||
+ | * a simple sequential job for 2 hours: | ||
+ | <cli> | ||
+ | $ qsub -l walltime=2: | ||
+ | </ | ||
+ | * an OpenMP job (on a single node) with 8 cores for 1 day: | ||
+ | <cli> | ||
+ | $ qsub -l walltime=24: | ||
+ | </ | ||
+ | * a parallel job with 10 nodes and 12 cores each: | ||
+ | <cli> | ||
+ | $ qsub -l nodes=10: | ||
+ | </ | ||
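The walltime values in the requests above are durations. A small sketch follows, assuming the usual `HH:MM:SS` layout accepted by Torque for `walltime`, for turning a whole number of hours into such a string. The function name `walltime_from_hours` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: format a whole number of hours as HH:MM:SS.
# Assumes the argument is a plain decimal integer without leading zeros.
walltime_from_hours() {
    printf '%02d:00:00\n' "$1"
}

walltime_from_hours 2    # the 2-hour sequential job
walltime_from_hours 24   # the 1-day OpenMP job
```

The output can be substituted into a request, for instance `-l walltime=$(walltime_from_hours 2)`.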
+ | <note warning> | ||
+ | ==== Advanced usage ==== | ||
+ | On the compute cluster you can do much more than submit a single script, such as submitting job arrays or chaining jobs. | ||
+ | |||
+ | === Job Arrays === | ||
+ | If you have to submit several identical jobs without driving every submission by hand, you can use a Torque' | ||
+ | < | ||
+ | == Submitting job array == | ||
+ | You can submit a job array by simply typing: | ||
+ | <cli> | ||
+ | $ qsub -t x-y | ||
+ | </ | ||
+ | where x and y are the array bounds; but you can also provide a comma-separated list: | ||
+ | <cli> | ||
+ | $ qsub -t x,y,z | ||
+ | </ | ||
+ | |||
+ | You can also limit the number of tasks that run at once by suffixing the array list/bounds with " | ||
+ | <cli> | ||
+ | $ qsub -t 1-100%10 | ||
+ | </ | ||
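To check which task indices a `-t` specification covers before submitting, the expansion can be reproduced locally. This is only an illustrative sketch of the range, list and `%` forms shown above, not how Torque parses the option. The function names `expand_array_spec` and `array_slot_limit` are hypothetical.

```shell
#!/bin/bash
# Hypothetical helpers that mimic the -t forms shown above:
# "x-y" ranges, "x,y,z" lists and an optional "%limit" suffix.

# Print every task index covered by the specification, one per line.
expand_array_spec() {
    local spec part old_ifs
    spec=${1%%\%*}              # drop the optional "%limit" suffix
    old_ifs=$IFS
    IFS=,                       # split the remaining text on commas
    for part in $spec; do
        case $part in
            *-*) seq "${part%-*}" "${part#*-}" ;;   # range: lower-upper
            *)   echo "$part" ;;                    # single index
        esac
    done
    IFS=$old_ifs
}

# Print the concurrency limit given after "%", or "unlimited" if absent.
array_slot_limit() {
    case $1 in
        *%*) echo "${1##*%}" ;;
        *)   echo "unlimited" ;;
    esac
}

expand_array_spec "1,5,7"
echo "limit for 1-100%10: $(array_slot_limit '1-100%10')"
```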
+ | |||
+ | You can use // | ||
+ | |||
+ | Script example that uses the array feature to run 3 identical jobs: | ||
+ | <code bash arrayjob.pbs> | ||
+ | #!/bin/sh | ||
+ | #PBS -o testArray.out | ||
+ | #PBS -e testArray.err | ||
+ | #PBS -l nodes=1: | ||
+ | #PBS -t 1,5,7 | ||
+ | |||
+ | # Job information | ||
+ | echo "Job Id:" $PBS_JOBID | ||
+ | echo "List of nodes allocated for job (PBS_NODEFILE): | ||
+ | |||
+ | # job part | ||
+ | cd $PBS_O_WORKDIR | ||
+ | RUN=./ | ||
+ | |||
+ | $RUN data${PBS_ARRAYID}.in | ||
+ | </ | ||
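In the script above each task picks its input file from `PBS_ARRAYID`. The sketch below shows that mapping in isolation so it can be tried outside the batch system. The fallback index of 1 and the function name `input_for_task` are assumptions made for this illustration.

```shell
#!/bin/bash
# Hypothetical helper: map an array index to the input file name used
# by the example script (data<index>.in).
input_for_task() {
    echo "data${1}.in"
}

# PBS_ARRAYID is set by the scheduler inside an array task; the default
# of 1 is an assumption so the snippet also runs by hand.
task_id=${PBS_ARRAYID:-1}
input=$(input_for_task "$task_id")

# Report a missing input early instead of letting the program fail later.
if [ -f "$input" ]; then
    echo "task $task_id: would process $input"
else
    echo "task $task_id: input file $input not found" >&2
fi
```

With `#PBS -t 1,5,7`, the three tasks would look for `data1.in`, `data5.in` and `data7.in` respectively.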
+ | |||
+ | == Viewing job array information == | ||
+ | To view information about the tasks in a running job array, pass the -t switch to the qstat command, as in: | ||
+ | <cli> | ||
+ | $ qstat -t 1,5,7 | ||
+ | </ | ||
+ | |||
+ | == Deleting job arrays and tasks == | ||
+ | To delete some tasks, use the following command format: | ||
+ | <cli> | ||
+ | $ qdel -t 1,5,7 12345[] | ||
+ | </ | ||
+ | **Note:** Make sure to use the [] brackets after the job-id. | ||
+ | |||
+ | === Job chains and Dependencies === | ||
+ | Quite often, a single simulation requires multiple long runs which must be processed in sequence. One method for creating a sequence of batch jobs is to execute the " | ||
+ | |||
+ | In PBS, you can use the "qsub -W depend=..." | ||
+ | <cli> | ||
+ | $ qsub -W depend=afterok:< | ||
+ | </ | ||
+ | <WRAP center> | ||
+ | ^ option | ||
+ | | afterok:< | ||
+ | | afternotok:< | ||
+ | | afterany:< | ||
+ | </ | ||
+ | Here is an example script showing how to chain 3 jobs: | ||
+ | <code bash job_chaining.sh> | ||
+ | #!/bin/bash | ||
+ | |||
+ | FIRST=$(qsub job1.pbs) | ||
+ | echo $FIRST | ||
+ | SECOND=$(qsub -W depend=afterany: | ||
+ | echo $SECOND | ||
+ | THIRD=$(qsub -W depend=afterany: | ||
+ | echo $THIRD | ||
+ | </ | ||
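The same chaining pattern can be written as a loop for any number of scripts. The sketch below is one possible generalisation, not the site's recommended tool. It assumes `qsub` prints the job id on standard output, as the example above relies on. The names `submit`, `chain_jobs`, `DRY_RUN` and `LAST_JOBID`, and the script names `job2.pbs` and `job3.pbs`, are hypothetical. With `DRY_RUN=1` (the default here) nothing is submitted and placeholder job ids are invented so the logic can be checked without a scheduler.

```shell
#!/bin/bash
# Sketch: submit scripts so that each one waits for the previous one.
DRY_RUN=${DRY_RUN:-1}   # 1 = print commands only, with made-up job ids
counter=0               # used only to invent dry-run job ids
LAST_JOBID=""           # job id of the most recent submission

# Submit one job; the resulting job id is stored in LAST_JOBID.
submit() {
    if [ "$DRY_RUN" = 1 ]; then
        counter=$((counter + 1))
        echo "would run: qsub $*" >&2
        LAST_JOBID="dryrun.$counter"
    else
        LAST_JOBID=$(qsub "$@") || return 1
    fi
}

# Chain all scripts given as arguments with depend=afterany.
chain_jobs() {
    local previous="" script
    for script in "$@"; do
        if [ -z "$previous" ]; then
            submit "$script" || return 1
        else
            submit -W "depend=afterany:$previous" "$script" || return 1
        fi
        previous=$LAST_JOBID
        echo "$script -> $previous"
    done
}

chain_jobs job1.pbs job2.pbs job3.pbs
```

Running it with `DRY_RUN=0` on a login node would call the real `qsub`. The loop stops at the first failed submission so that later jobs are not queued against a missing dependency.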
+ | === Calculation quota statistics === | ||
+ | At any time, you can check the evolution of your calculation quota usage and its history, half-year by half-year: | ||
+ | <cli> | ||
+ | $ usagestats-< | ||
+ | admin_project (2016-01-01 -> 2016-06-30): | ||
+ | admin_project (2016-07-01 -> 2017-01-01): | ||
+ | admin_project (2017-01-01 -> 2017-07-01): | ||
+ | admin_project (2017-07-01 -> 2018-01-01): | ||
+ | </ | ||
+ | |||
+ | === Node targeting === | ||
+ | In some very specific cases, you may need to target nodes on which you want to submit your jobs. | ||
+ | |||
+ | To do so: | ||
+ | <cli> | ||
+ | $ qsub -l nodes=< | ||
+ | </ | ||
+ | < |
cluster-lbt/jobs_submission.txt · Last modified: 2021/05/31 10:30 by 127.0.0.1