AlphaFold
AlphaFold Reference Data
Reference data for AlphaFold are stored in a central location to avoid the overhead of every user keeping a separate copy.
The path is /lustre/project/alphafold_users.
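Because the central path may be a symbolic link to a versioned directory, it can be useful to record which version a job actually used. The following is a minimal sketch; the function name `resolve_data_dir` is hypothetical, and it assumes GNU `readlink` (with `-f`) is available on the system.

```shell
#!/bin/bash
# Hypothetical helper: print the real directory behind a (possibly symlinked) path.
resolve_data_dir() {
    local link="${1:-/lustre/project/alphafold_users}"
    if [ ! -e "$link" ]; then
        echo "not found: $link" >&2
        return 1
    fi
    # -f follows every symlink and prints the canonical absolute path.
    readlink -f "$link"
}

# Example: write the resolved path to the job log for reproducibility.
resolve_data_dir /lustre/project/alphafold_users || true
```

Calling the function at the top of a job script leaves a record of the database version in the log file.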
Note: This path is a link to the latest AlphaFold database (which will receive a version suffix), for example alphafold_users -> alphafold_users_v2.3.0. Older versions will be kept for a period of time that has not yet been set.
The AlphaFold Module
Generally, software is provided via module files.
If the module-based setup does not work smoothly, users may resort to the containerized version (see below).
#!/bin/bash
#SBATCH -J <name of your job>
#SBATCH -o <desired name for log file>.%j.log
#SBATCH -A <account>
#SBATCH -p <m2_gpu|deeplearning>
#SBATCH --gres=gpu:1 # NOTE: AlphaFold is multi gpu capable, but
# apparently not stable.
#SBATCH -c 8 # NOTE: Non-GPU components of AlphaFold are
# hardly able to use more than 8 CPUs.
#SBATCH --mem=20G # NOTE: For really large protein complexes more
# memory might be needed.
#SBATCH -t 300 # NOTE: This is plenty of time for small and medium
# sized problems. Increase the time value in case
# of bigger simulations.
################################################################################
# load environment
module purge
module load bio/AlphaFold
################################################################################
# variables
INFILE=<path to input FASTA file>
# NOTE: By default, AlphaFold names its output after the input file.
# To avoid overwriting old runs, you can specify your own output
# directory that contains the unique job ID.
OUTDIR=$PWD/alphafold_test_$SLURM_JOB_ID
mkdir -p $OUTDIR
# NOTE: As the environment variable $ALPHA_FOLD_DATA is set by the module,
# no further data flags are required when starting the program.
srun alphafold \
--output_dir=$OUTDIR \
--fasta_paths=$INFILE \
--max_template_date=<max_template_date, e.g. '2020-05-14'> \
--db_preset=<full_dbs|reduced_dbs> \
--model_preset=<monomer|multimer>
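The choice of `--model_preset` depends on the input. As a rough sketch, the number of records in the FASTA file can serve as a hint: one record suggests `monomer`, several suggest `multimer`. Both function names below are hypothetical, and the rule is only a heuristic, not something AlphaFold enforces.

```shell
#!/bin/bash
# Hypothetical helper: count FASTA records (header lines start with '>').
count_fasta_records() {
    grep -c '^>' "$1"
}

# Hypothetical helper: suggest a --model_preset value from the record count.
suggest_model_preset() {
    local n
    # grep -c exits non-zero when nothing matches or the file is unreadable.
    n=$(count_fasta_records "$1" 2>/dev/null) || true
    n=${n:-0}
    if [ "$n" -eq 0 ]; then
        echo "no FASTA records found in: $1" >&2
        return 1
    elif [ "$n" -gt 1 ]; then
        echo multimer
    else
        echo monomer
    fi
}

# Usage inside a job script (after INFILE is set):
#   MODEL_PRESET=$(suggest_model_preset "$INFILE") || exit 1
```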
AlphaFold via Container
The container location is /lustre/project/alphafold_users/container. Select your container version there and enter the appropriate name in the template script below (in place of <alphafold_container_version>.sif).
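To see which container images are available, the directory can simply be listed. A minimal sketch follows; the function name `list_containers` is hypothetical, and it assumes the images use the `.sif` extension as in the template.

```shell
#!/bin/bash
# Hypothetical helper: print the file names of all .sif images in a directory.
list_containers() {
    local dir="${1:-/lustre/project/alphafold_users/container}"
    local f
    for f in "$dir"/*.sif; do
        # Skip the unexpanded pattern when no image (or no directory) exists.
        [ -e "$f" ] || continue
        basename "$f"
    done
}

# Example: list the images in the central container directory.
list_containers
```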
#!/bin/bash
#SBATCH -J <name of your job>
#SBATCH -o <desired name for log file>.%j.log
#SBATCH -A <account>
#SBATCH -p <m2_gpu|deeplearning>
#SBATCH --gres=gpu:1 # NOTE: AlphaFold is multi gpu capable, but
# apparently not stable.
#SBATCH -c 8 # NOTE: Non-GPU components of AlphaFold are
# hardly able to use more than 8 CPUs.
#SBATCH --mem=20G # NOTE: For really large protein complexes more
# memory might be needed.
#SBATCH -t 300 # NOTE: This is plenty of time for small and medium
# sized problems. Increase the time value in case
# of bigger simulations.
################################################################################
module purge
module load tools/AppTainer # NOTE: The AppTainer module provides support for
# a Singularity container.
################################################################################
INFILE=<path to input FASTA file>
# NOTE: By default, AlphaFold names its output after the input file.
# To avoid overwriting old runs, you can specify your own output
# directory that contains the unique job ID.
OUTDIR=$PWD/alphafold_test_$SLURM_JOB_ID
mkdir -p $OUTDIR
ALPHAFOLD_DATA_DIR=/lustre/project/alphafold_users
CONTAINERPATH=${ALPHAFOLD_DATA_DIR}/container
# NOTE: The $ALPHAFOLD_DATA_DIR is not accepted by the current container version, therefore
# the individual flags need to be defined.
srun singularity run --env TF_FORCE_UNIFIED_MEMORY=1,XLA_PYTHON_CLIENT_MEM_FRACTION=4.0,OPENMM_CPU_THREADS=8 \
-B .:/etc --nv ${CONTAINERPATH}/<alphafold_container_version>.sif \
--fasta_paths=$INFILE \
--output_dir=$OUTDIR \
--max_template_date=<max_template_date, e.g. '2020-05-14'> \
--data_dir=$ALPHAFOLD_DATA_DIR \
--uniref90_database_path=$ALPHAFOLD_DATA_DIR/uniref90/uniref90.fasta \
--mgnify_database_path=$ALPHAFOLD_DATA_DIR/mgnify/mgy_clusters_2018_12.fa \
--small_bfd_database_path=$ALPHAFOLD_DATA_DIR/small_bfd/bfd-first_non_consensus_sequences.fasta \
--pdb70_database_path=$ALPHAFOLD_DATA_DIR/pdb70/pdb70 \
--template_mmcif_dir=$ALPHAFOLD_DATA_DIR/pdb_mmcif/mmcif_files \
--obsolete_pdbs_path=$ALPHAFOLD_DATA_DIR/pdb_mmcif/obsolete.dat \
--use_gpu_relax \
--db_preset=<full_dbs|reduced_dbs> \
--model_preset=<monomer|multimer>
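Because the container variant needs every database path spelled out, a wrong or outdated path only shows up once the job is already running. A small pre-flight check can catch this earlier. The sketch below uses a hypothetical helper, `check_paths`, and a subset of the paths from the template above; the exact file names may differ between database versions.

```shell
#!/bin/bash
# Hypothetical pre-flight helper: report every missing path on stderr
# and return non-zero if at least one path is missing.
check_paths() {
    local missing=0 p
    for p in "$@"; do
        if [ ! -e "$p" ]; then
            echo "missing: $p" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage sketch with paths taken from the template (adjust to your database version).
ALPHAFOLD_DATA_DIR=/lustre/project/alphafold_users
check_paths \
    "$ALPHAFOLD_DATA_DIR/uniref90/uniref90.fasta" \
    "$ALPHAFOLD_DATA_DIR/pdb_mmcif/mmcif_files" \
    "$ALPHAFOLD_DATA_DIR/pdb_mmcif/obsolete.dat" \
    || echo "Some reference data paths are missing; check the database version." >&2
```

In a job script, replacing the final `echo` with `exit 1` would stop the job before the GPU allocation is spent on a run that cannot succeed.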