====== ESMFold ======

ESMFold is another class of structure prediction tool, based on a Protein Language Model (PLM). It does not require a multiple sequence alignment and uses only the sequence of the protein of interest. It was developed by Meta (formerly Facebook).

<note>ESMFold's main limitation is GPU memory: predictions need a lot of it (see below).</note>
<note>ESMFold is **really fast**: seconds for small sequences (up to ~100 residues) and minutes for bigger ones (5-10 minutes for an 800-residue protein).</note>

===== Version =====

The installed version is v1.0.3, available from the GitHub repository https://github.com/facebookresearch/esm

===== Resources =====

To learn more about ESMFold, I highly recommend reading:
  * the preprint: https://www.biorxiv.org/content/10.1101/2022.07.20.500902v1
  * the GitHub repo: https://github.com/facebookresearch/esm
  * the available notebook: https://colab.research.google.com/github/sokrypton/ColabFold/blob/main/ESMFold.ipynb


===== Installation =====

The installation follows the same process as AlphaFold.\\
**It's available on nodes node061, node062, node063 and node081**

ESMFold depends on several Python packages, so a conda environment **esmfold** was created for this purpose.


===== Usage =====

**Use the same queues as AlphaFold: ''alphafold'' or ''alphafold2''**

See here: http://www-lbt.ibpc.fr/wiki/doku.php?id=cluster-lbt:extra-tools:alphafold_tool#queues

Since the main limitation of ESMFold is GPU memory, you should **always** use half a node for predictions.

==== Input file ====

ESMFold only accepts a FASTA file as input.

For monomer predictions, you can provide a multi-FASTA file; the sequences are processed as a batch (one after another).

For multimer predictions, you need to supply a FASTA file containing a single record, with the chains separated by a '':'' character.
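
As an illustration, the two input layouts could be written as follows. The file names, record names and sequences are made up for the example and carry no biological meaning.

<code bash>
# Monomer batch: one record per protein (hypothetical names and sequences)
cat > monomers.fasta << 'EOF'
>protein_A
MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ
>protein_B
MSDNGPQNQRNAPRITFGGPSDSTGSNQNGERS
EOF

# Multimer: a single record, chains joined by ":"
cat > dimer.fasta << 'EOF'
>complex_AB
MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ:MSDNGPQNQRNAPRITFGGPSDSTGSNQNGERS
EOF
</code>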


==== GPU Memory ====


ESMFold uses a lot of GPU memory for predictions (like OmegaFold):
  * ~500 MB for a 70-residue protein
  * ~27 GB for an 800-residue protein

The GPUs installed in the node06X nodes have ~10 GB of memory and those in node081 have ~48 GB.
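
Since memory usage depends on sequence length, it can help to check the length of each record before choosing a node. The helper below is a minimal sketch, not part of ESMFold: the function name ''fasta_lengths'' and the file name ''query.fasta'' are assumptions for the example. For a multimer record, the '':'' separators are counted as characters.

<code bash>
# Print "name<TAB>length" for each record of a FASTA file,
# summing over wrapped sequence lines
fasta_lengths() {
    awk '/^>/ { if (name != "") print name "\t" len
                name = substr($0, 2); len = 0; next }
         { len += length($0) }
         END { if (name != "") print name "\t" len }' "$1"
}

# Small demo input (hypothetical); replace with your own file
printf '>seq1\nMKTAYIAK\nQRQISF\n>seq2\nMSDNG\n' > query.fasta
fasta_lengths query.fasta
</code>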

For sequences that do not fit in GPU memory, ESMFold has an option called ''--chunk-size'' that decreases the memory used, at the cost of a longer prediction time.
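
For example, a long sequence could be run with chunking enabled as shown here. The value 64 is only one of the values recommended in the help text of the script; which value works for a given protein and GPU is something to find by trial. The input and output names are placeholders.

<code bash>
# Lower chunk sizes use less GPU memory but run slower
esmfold_inference.py -i query.fasta -o outputdir/ --chunk-size 64
</code>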


==== Running ====

<note important>The first time you use ESMFold, it downloads 3 weight files (//esmfold_3B_v1.pt//, //esm2_t36_3B_UR50D.pt// and //esm2_t36_3B_UR50D-contact-regression.pt//) and stores them in the ''~/.cache/torch/hub/checkpoints'' directory.</note>
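
To check whether the three files from the note above are already in place, something like the following can be used. This is a sketch: the function name ''check_esmfold_weights'' is made up for the example, and it only tests that each file exists and is not empty, not that the download is complete.

<code bash>
check_esmfold_weights() {
    # Default location taken from the note above; pass another directory to override
    ckpt_dir="${1:-$HOME/.cache/torch/hub/checkpoints}"
    for f in esmfold_3B_v1.pt esm2_t36_3B_UR50D.pt esm2_t36_3B_UR50D-contact-regression.pt; do
        if [ -s "$ckpt_dir/$f" ]; then
            echo "found: $f"
        else
            echo "missing: $f"
        fi
    done
}

check_esmfold_weights
</code>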

A script called ''esmfold_inference.py'' is available:
<code bash>
(esmfold) [santuz@node081 ~]$ esmfold_inference.py -h
usage: esmfold_inference.py [-h] -i FASTA -o PDB [--num-recycles NUM_RECYCLES] [--max-tokens-per-batch MAX_TOKENS_PER_BATCH] [--chunk-size CHUNK_SIZE] [--cpu-only] [--cpu-offload]

optional arguments:
  -h, --help            show this help message and exit
  -i FASTA, --fasta FASTA
                        Path to input FASTA file
  -o PDB, --pdb PDB     Path to output PDB directory
  --num-recycles NUM_RECYCLES
                        Number of recycles to run. Defaults to number used in training (4).
  --max-tokens-per-batch MAX_TOKENS_PER_BATCH
                        Maximum number of tokens per gpu forward-pass. This will group shorter sequences together for batched prediction. Lowering this can help with out of memory issues, if these occur on short sequences.
  --chunk-size CHUNK_SIZE
                        Chunks axial attention computation to reduce memory usage from O(L^2) to O(L). Equivalent to running a for loop over chunks of of each dimension. Lower values will result in lower memory usage at the cost of speed. Recommended values: 128, 64, 32. Default: None.
  --cpu-only            CPU only
  --cpu-offload         Enable CPU offloading

</code>

==== Submission script ====

Below is an example of a submission script to run ESMFold predictions.

**Script version 21/11/2022**

<file bash job_ESMFold.sh>
#!/bin/bash
#PBS -S /bin/bash
#PBS -N ESMFold
#PBS -o $PBS_JOBID.out
#PBS -e $PBS_JOBID.err

#Half node always
#PBS -l nodes=1:ppn=8
#PBS -l walltime=24:00:00
#PBS -A simlab_project
#PBS -q alphafold_hn

#script version 21.11.2022

### FOR EVERYTHING BELOW, I ADVISE YOU TO MODIFY THE USER-part ONLY ###
WORKDIR="/"
NUM_NODES=$(cat $PBS_NODEFILE|uniq|wc -l)
if [ ! -n "$PBS_O_HOME" ] || [ ! -n "$PBS_JOBID" ]; then
        echo "At least one required variable is not defined. Please contact your manager about it."
        exit 1
else
        if [ $NUM_NODES -le 1 ]; then
                WORKDIR+="scratch/"
                export WORKDIR+=$(echo $PBS_O_HOME |sed 's#.*/\(home\|workdir\)/\(.*_team\)*.*#\2#g')"/$PBS_JOBID/"
                mkdir $WORKDIR
                rsync -ap $PBS_O_WORKDIR/ $WORKDIR/

                # if you need to check your job output during execution (example: each hour) you can uncomment the following line
                # /shared/scripts/ADMIN__auto-rsync.example 3600 &
        else
                export WORKDIR=$PBS_O_WORKDIR
        fi
fi

echo "your current dir is: $PBS_O_WORKDIR"
echo "your workdir is: $WORKDIR"
echo "number of nodes: $NUM_NODES"
echo "number of cores: "$(cat $PBS_NODEFILE|wc -l)
echo "your execution environment: "$(cat $PBS_NODEFILE|uniq|while read line; do printf "%s" "$line "; done)

cd $WORKDIR

# If you're using only one node, it's counterproductive to use IB network for your MPI process communications
if [ $NUM_NODES -eq 1 ]; then
        export PSM_DEVICES=self,shm
        export OMPI_MCA_mtl=^psm
        export OMPI_MCA_btl=shm,self
else
# Since we are using a single IB card per node which can initiate only up to a maximum of 16 PSM contexts
# we have to share PSM contexts between processes
# CIN is here the number of cores in node
        CIN=$(cat /proc/cpuinfo | grep -i processor | wc -l)
        if [ $(($CIN/16)) -ge 2 ]; then
                PPN=$(grep $HOSTNAME $PBS_NODEFILE|wc -l)
                if [ $CIN -eq 40 ]; then
                        export PSM_SHAREDCONTEXTS_MAX=$(($PPN/4))
                elif [ $CIN -eq 32 ]; then
                        export PSM_SHAREDCONTEXTS_MAX=$(($PPN/2))
                else
                        echo "This computing node is not supported by this script"
                fi
                echo "PSM_SHAREDCONTEXTS_MAX defined to $PSM_SHAREDCONTEXTS_MAX"
        else
                echo "no PSM_SHAREDCONTEXTS_MAX to define"
        fi
fi

function get_gpu-ids() {
        if [ $PBS_NUM_PPN -eq $(cat /proc/cpuinfo | grep -cE "^processor.*:") ]; then
                echo "0,1" && return
        fi

        if [ -e /dev/cpuset/torque/$PBS_JOBID/cpus ]; then
                FILE="/dev/cpuset/torque/$PBS_JOBID/cpus"
        elif [ -e /dev/cpuset/torque/$PBS_JOBID/cpuset.cpus ]; then
                FILE="/dev/cpuset/torque/$PBS_JOBID/cpuset.cpus"
        else
                FILE=""
        fi

        # Guard against an empty FILE: a bare "[ -e ]" evaluates to true
        if [ -n "$FILE" ] && [ -e "$FILE" ]; then
                if [ $(cat $FILE | sed -r 's/^([0-9]).*$/\1/') -eq 0 ]; then
                        echo "0" && return
                else
                        echo "1" && return
                fi
        else
                echo "0,1" && return
        fi
}

gpus=$(get_gpu-ids)


## USER Part
module load gcc/8.3.0
module load miniconda-py3/latest


conda activate esmfold

#Run
cd $WORKDIR/

d1=`date +%s`
echo $(date)

esmfold_inference.py -i query.fasta -o outputdir/


d2=$(date +%s)
echo $(date)

diff=$((($d2 - $d1)/60))
echo "Time spent (min) : ${diff}"

## DO NOT MODIFY THE PART OF SCRIPT: you will be accountable for any damage you cause
# At the term of your job, you need to get back all produced data synchronizing workdir folder with you starting job folder and delete the temporary one (workdir)
if [ $NUM_NODES -le 1 ]; then
        cd $PBS_O_WORKDIR
        rsync -ap $WORKDIR/ $PBS_O_WORKDIR/
        rm -rf $WORKDIR
fi
## END-DO
</file>

==== Benchmarks ====


===== Troubleshooting =====

If you run into trouble, you can contact me at: ''hubert.santuz[at]ibpc.fr''