Pin_T2C (threads to cores)
Introduction
pin_t2c is an in-house tool that lets you set up and tune CPU affinity for your jobs by pinning each application thread to a dedicated cpu. This can improve the overall performance of a process by limiting, or even avoiding, thread migrations between cpus and the cache/memory copies they cause during your job.
Some software comes with its own features for handling CPU affinity; for all other software, I recommend using pin_t2c.
According to NAMD2 tests by Guillaume S., pin_t2c improves overall performance by around 40% on Lucifer nodes and 10% on Baal nodes. Note for NAMD: pin_t2c provides basically the same functionality as the optional NAMD2 parameters +setcpuaffinity and +pemap, so these should not be specified when using pin_t2c.
Following standard usage, two main terms (keywords) are used here:
- socket, which here denotes the physical processor ID/locator
- cpu, which here denotes the physical/logical computing unit (a.k.a. core)
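To see how sockets and cpus relate on a given node, one option is to parse the output of `lscpu -p=CPU,SOCKET`. The sketch below is only an illustration of the two terms, not part of pin_t2c; the function name `parse_lscpu` and the sample node (2 sockets, 8 cpus, interleaved numbering) are hypothetical.

```python
from collections import defaultdict


def parse_lscpu(text):
    """Map each socket ID to the sorted list of cpu IDs it hosts.

    `text` is expected to be the output of `lscpu -p=CPU,SOCKET`:
    comment lines start with '#', data lines look like "3,0"
    (cpu 3 sits on socket 0).
    """
    sockets = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the commented header
        cpu, socket = line.split(",")[:2]
        sockets[int(socket)].append(int(cpu))
    return {s: sorted(cpus) for s, cpus in sorted(sockets.items())}


# Hypothetical sample: 2 sockets, 8 cpus, interleaved numbering.
SAMPLE = """# CPU,Socket
0,0
1,1
2,0
3,1
4,0
5,1
6,0
7,1
"""

if __name__ == "__main__":
    print(parse_lscpu(SAMPLE))  # {0: [0, 2, 4, 6], 1: [1, 3, 5, 7]}
```

On a node numbered like this sample, consecutive cpu IDs alternate between sockets, which is the kind of hardware detail the socket-based selection described below lets you ignore.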
To use this command in your job script or interactive session, first load the module of the same name:
$ module load pin_t2c
A few use cases
First, you can simply use all resources available on the system (or provided by the Torque/CPUSETS mechanism), while still pinning each thread to its own cpu:
$ myapp & pin_t2c --pid $!
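As a rough idea of what "pinning each thread on its cpu" involves on Linux, here is a minimal sketch. It is not pin_t2c's actual implementation: the function names are hypothetical, and wrapping around when threads outnumber cpus is an assumption made for the example.

```python
import os


def list_threads(pid):
    """Return the thread IDs of process `pid`, read from /proc (Linux only)."""
    return sorted(int(t) for t in os.listdir(f"/proc/{pid}/task"))


def plan_pinning(tids, cpus):
    """Assign one cpu to each thread.

    Assumption for this sketch: if there are more threads than cpus,
    the assignment wraps around (round-robin).
    """
    cpus = sorted(cpus)
    if not cpus:
        raise ValueError("no cpu available for pinning")
    return {tid: cpus[i % len(cpus)] for i, tid in enumerate(tids)}


def pin_threads(pid, cpus):
    """Pin every thread of `pid` to a single cpu and return the plan."""
    plan = plan_pinning(list_threads(pid), cpus)
    for tid, cpu in plan.items():
        # On Linux, sched_setaffinity accepts a thread ID as well as a PID.
        os.sched_setaffinity(tid, {cpu})
    return plan
```

For instance, `pin_threads(os.getpid(), os.sched_getaffinity(0))` would pin the calling process's threads across the cpus it is currently allowed to run on.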
You can select the processor socket(s) to which you would like to restrict your process. The --sockets parameter of pin_t2c lets you ignore hardware specifics (e.g., how the cores are distributed between the processors).
Here we're going to restrict our process to the first processor socket (ID: 0):
$ myapp & pin_t2c --sockets 0 --pid $!
*: the --sockets parameter is a processor socket list. This optional parameter must only contain a comma-separated list of processor IDs (e.g. 0,1).
If you know the hardware details (especially how cores are distributed across sockets), you can manually specify which cpus (cores) you wish to restrict your process to.
Here we're going to restrict our process to half the processor resources (all system cpus with a step size of 2):
$ myapp & pin_t2c --cpus all:2 --pid $!
*: the --cpus parameter is a cpu/core list. This optional parameter must contain a comma-separated list of cpu/core IDs (e.g. 0,1,2,3), ranges (e.g. 0-19) or ranges with a step size (e.g. 0-39:2).
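To make the list syntax concrete, the following sketch expands such a specification into explicit cpu IDs. It is an illustration written for this page, not pin_t2c's parser: `parse_cpu_list` is a hypothetical name, and treating `all` as "every cpu available on the node" with the step applied from the first one is an assumption.

```python
def parse_cpu_list(spec, available):
    """Expand a cpu list such as "0,1,2,3", "0-19", "0-39:2" or "all:2".

    `available` is the ordered collection of cpu IDs on the node;
    the keyword "all" expands to it (assumption for this sketch).
    """
    cpus = []
    for item in spec.split(","):
        body, _, step = item.strip().partition(":")
        step = int(step) if step else 1
        if step < 1:
            raise ValueError(f"invalid step in {item!r}")
        if body == "all":
            pool = list(available)
        elif "-" in body:
            first, last = (int(x) for x in body.split("-"))
            pool = list(range(first, last + 1))  # ranges are inclusive
        else:
            pool = [int(body)]
        cpus.extend(pool[::step])
    # Drop duplicates while keeping the order of first appearance.
    return list(dict.fromkeys(cpus))


if __name__ == "__main__":
    print(parse_cpu_list("0-39:2", range(40)))  # even cpu IDs from 0 to 38
    print(parse_cpu_list("all:2", range(8)))    # [0, 2, 4, 6]
```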
You can manually select a part of each processor socket. In this case pin_t2c will automatically assign the correct cpus to each socket.
Here, we're going to restrict our process to the first 10 cpus/cores of each socket:
$ myapp & pin_t2c --sockets 0,1 --cpus 0-9 --pid $!
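One way to picture this combined selection: when both parameters are given, the cpu indices can be read as positions within each socket rather than as system-wide IDs. The sketch below illustrates that reading under stated assumptions; it is not taken from pin_t2c's source, and both `select_per_socket` and the sample topology (2 sockets of 20 interleaved cpus) are hypothetical.

```python
def select_per_socket(topology, sockets, local_indices):
    """Pick, within each requested socket, the cpus at `local_indices`.

    `topology` maps a socket ID to the cpu IDs it hosts. Indices beyond
    the size of a socket are silently ignored (assumption for this sketch).
    """
    selected = {}
    for socket in sockets:
        cpus = sorted(topology[socket])
        selected[socket] = [cpus[i] for i in local_indices if i < len(cpus)]
    return selected


# Hypothetical node: 2 sockets x 20 cpus, numbered alternately.
TOPOLOGY = {0: list(range(0, 40, 2)), 1: list(range(1, 40, 2))}

if __name__ == "__main__":
    # First 10 cpus of each socket.
    print(select_per_socket(TOPOLOGY, [0, 1], range(10)))
```

On this sample node, socket 0 would contribute cpus 0, 2, ..., 18 and socket 1 cpus 1, 3, ..., 19, without the user having to know the numbering scheme.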
pin_t2c offers other parameters and use cases. Don't hesitate to consult the tool's CLI help and try it out:
$ pin_t2c --help
usage: pin_t2c [-h] [-s SOCKETS] [-c CPUS] [-f CPUSETFILE] [-r RUN] [-p PID]
               [-d DELAY] [-v] [-n] [-V]

pin_t2c provides an easy way to pin threads of a running process to hardware
resources, specified or not.

optional arguments:
  -h, --help            show this help message and exit
  -s SOCKETS, --sockets SOCKETS
                        CPU socket list. This optional parameter must only
                        contain a list of commas separated socket id (e.g
                        0,1).
  -c CPUS, --cpus CPUS  cpu core list. This optional parameter must contain a
                        list of commas separated core ids (e.g 0,1,2,3) or
                        ranges (e.g 0-19) or ranges with step (e.g 0-39:2).
  -f CPUSETFILE, --cpusetfile CPUSETFILE
                        cpuset file describing your kernel cpuset mechanism
                        (/dev/cpuset/...)
  -r RUN, --run RUN     your shell command you need to execute
  -p PID, --pid PID     your main process id you want to pin threads
  -d DELAY, --delay DELAY
                        delay between each pinning
  -v, --verbose         increase output verbosity
  -n, --dry-run         show what would have been pinned and how
  -V, --version         Prints the pin_t2c version number of the executable
                        and exits.