RoseTTAFold-All-Atom
| Mode | Protein | RNA | Small-molecule | PTM | Constraints | pLM | MSA server | Split MSA |
|---|---|---|---|---|---|---|---|---|
| RoseTTAFold-All-Atom | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
RoseTTAFold-All-Atom can be run using the command below:

    nextflow run nf-core/proteinfold \
        --input samplesheet.csv \
        --outdir <OUTDIR> \
        --mode rosettafold_all_atom \
        --rosettafold_all_atom_db <null (default) | DB_PATH> \
        --use_gpu \
        -profile <docker/singularity/.../institute>

File Structure
The file structure of --rosettafold_all_atom_db must be as follows:
Directory structure

    <rosettafold_all_atom_db>/
    ├── bfd
    │   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_a3m.ffdata
    │   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_a3m.ffindex
    │   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_cs219.ffdata
    │   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_cs219.ffindex
    │   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_hhm.ffdata
    │   └── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_hhm.ffindex
    ├── params
    │   └── RFAA_paper_weights.pt
    ├── pdb100
    │   ├── LICENSE
    │   ├── pdb100_2021Mar03_a3m.ffdata
    │   ├── pdb100_2021Mar03_a3m.ffindex
    │   ├── pdb100_2021Mar03_cs219.ffdata
    │   ├── pdb100_2021Mar03_cs219.ffindex
    │   ├── pdb100_2021Mar03_hhm.ffdata
    │   ├── pdb100_2021Mar03_hhm.ffindex
    │   ├── pdb100_2021Mar03_pdb.ffdata
    │   └── pdb100_2021Mar03_pdb.ffindex
    └── uniref30
        ├── UniRef30_2023_02_a3m.ffdata
        ├── UniRef30_2023_02_a3m.ffindex
        ├── UniRef30_2023_02_cs219.ffdata
        ├── UniRef30_2023_02_cs219.ffindex
        ├── UniRef30_2023_02_hhm.ffdata
        ├── UniRef30_2023_02_hhm.ffindex
        └── UniRef30_2023_02.md5sums

If individual components are available at different locations in the filesystem, they can be set using the following flags:

    --rosettafold_all_atom_bfd_path </PATH/TO/bfd/*>
    --rosettafold_all_atom_paper_weights_path </PATH/TO/params/RFAA_paper_weights.pt>
    --rosettafold_all_atom_uniref30_path </PATH/TO/uniref30/*>
    --rosettafold_all_atom_pdb100_path </PATH/TO/pdb100/*>

Without setting the --rosettafold_all_atom_db flag, all of the required data files will be downloaded during the workflow execution.
The RoseTTAFold-All-Atom reference databases require ~2TB of disk space.
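Before passing an existing directory to --rosettafold_all_atom_db, it can be useful to confirm that it matches the directory structure listed above. The following is a minimal sketch of such a check, not part of nf-core/proteinfold: the function names `rfaa_expected_files` and `check_rfaa_db` are hypothetical, and the sketch assumes that the LICENSE and .md5sums files are informational and does not check for them.

```shell
# Sketch only: rfaa_expected_files and check_rfaa_db are hypothetical helpers,
# not part of nf-core/proteinfold. File names are taken from the tree above.

# Print the expected data files, relative to the database root.
rfaa_expected_files() {
    echo "params/RFAA_paper_weights.pt"
    for kind in a3m cs219 hhm pdb; do
        for ext in ffdata ffindex; do
            echo "pdb100/pdb100_2021Mar03_${kind}.${ext}"
            # Only pdb100 has *_pdb files; bfd and uniref30 do not.
            if [ "${kind}" != "pdb" ]; then
                echo "bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_${kind}.${ext}"
                echo "uniref30/UniRef30_2023_02_${kind}.${ext}"
            fi
        done
    done
}

# Print one "missing: <file>" line per absent file.
# Returns 0 when every expected file exists, 1 otherwise.
check_rfaa_db() {
    db="$1"
    rc=0
    for f in $(rfaa_expected_files); do
        if [ ! -e "${db}/${f}" ]; then
            echo "missing: ${f}"
            rc=1
        fi
    done
    return "${rc}"
}
```

For example, `check_rfaa_db /path/to/db && echo "layout OK"` prints the confirmation only when nothing is missing. The check looks at file names only; it does not verify file sizes or checksums.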
YAML format
RoseTTAFold-All-Atom allows modelling nucleic acids and small molecule ligands as well as specifying post-translational modifications. However, this input information is not supported in the FASTA format and must be specified in an input YAML file according to the RoseTTAFold-All-Atom specification.
RoseTTAFold-All-Atom YAML files can be run with proteinfold in rosettafold_all_atom mode by listing the YAML file in place of the usual FASTA file in the input samplesheet.

    id,fasta
    T1024,T1024.yaml

Predictions from RoseTTAFold-All-Atom YAML input are not compatible with running multiple modes simultaneously.
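When several YAML inputs sit in one directory, a samplesheet in the format shown above could be generated with a small helper. This is a rough sketch under stated assumptions: `make_yaml_samplesheet` is a hypothetical function, not a proteinfold utility, and it assumes each sample id should be the YAML file name without its .yaml extension.

```shell
# Sketch only: make_yaml_samplesheet is a hypothetical helper.
# Writes a samplesheet to stdout with one row per *.yaml file in a directory.
make_yaml_samplesheet() {
    yaml_dir="$1"
    # Header row as used in the samplesheet example above.
    echo "id,fasta"
    for path in "${yaml_dir}"/*.yaml; do
        # Skip the unexpanded glob pattern when no YAML files are present.
        [ -e "${path}" ] || continue
        # Assumed convention: sample id = file name minus the .yaml suffix.
        echo "$(basename "${path}" .yaml),${path}"
    done
}
```

A typical call would be `make_yaml_samplesheet inputs > samplesheet.csv`, where `inputs` is a placeholder directory name. File names containing commas would need extra handling, since they would break the CSV row.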