nf-core/funcscan
(Meta-)genome screening for functional and natural product gene sequences
Version 1.1.6. The latest stable release is 2.1.0.
Define where the pipeline should find input data and save output data.
Path to a comma-separated file containing sample names and paths to the corresponding FASTA files.
string
^\S+\.csv$
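As a rough illustration of the pattern above, the following Python sketch checks whether a samplesheet path would satisfy `^\S+\.csv$`. The function name `is_valid_samplesheet_path` is hypothetical and not part of the pipeline, which performs its own schema validation.

```python
import re

# Pattern copied verbatim from the parameter description above.
SAMPLESHEET_PATTERN = re.compile(r"^\S+\.csv$")


def is_valid_samplesheet_path(path: str) -> bool:
    """Return True if the path contains no whitespace and ends in .csv.

    Hypothetical helper for illustration only.
    """
    return SAMPLESHEET_PATTERN.match(path) is not None


print(is_valid_samplesheet_path("samplesheet.csv"))  # True
print(is_valid_samplesheet_path("my samples.csv"))   # False (contains a space)
print(is_valid_samplesheet_path("samplesheet.tsv"))  # False (wrong extension)
```

Note that the pattern rejects any path containing whitespace, so directory names with spaces would need to be renamed or linked.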
The output directory where the results will be saved. Use absolute paths when writing to storage on cloud infrastructure.
string
Email address for completion summary.
string
^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
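The email pattern above is fairly strict. A minimal Python sketch, assuming the pattern is applied as an ordinary regular expression (the helper name `is_valid_email` is hypothetical), shows which addresses it would accept:

```python
import re

# Pattern copied verbatim from the parameter description above.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_valid_email(address: str) -> bool:
    """Return True if the address satisfies the documented pattern.

    Hypothetical helper for illustration only.
    """
    return EMAIL_PATTERN.match(address) is not None


print(is_valid_email("user@example.org"))     # True
print(is_valid_email("user@example"))         # False (no top-level domain)
print(is_valid_email("user@example.museum"))  # False (top-level domain longer than 5 letters)
```

The last case shows a limitation of the pattern: top-level domains longer than five letters are rejected, as are addresses containing characters such as `+`.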
MultiQC report title. Printed as page header, used for filename if not otherwise specified.
string
These parameters influence which workflow (ARG, AMP and/or BGC) to activate.
Activate antimicrobial peptide screening tools.
boolean
Activate antimicrobial resistance gene screening tools.
boolean
Activate biosynthetic gene cluster screening tools.
boolean
These options influence the generation of annotation files required for downstream steps in ARG, AMP, and BGC workflows.
Specify which annotation tool to use for some downstream tools.
string
Specify whether to save gene annotations in the results directory.
boolean
These parameters influence the annotation algorithm used by Bakta.
Specify the path to a Bakta database.
string
Download full or light version of the Bakta database if not supplying own database.
string
Specify the minimum contig size.
integer
1
Specify the genetic code translation table.
integer
11
Specify the type of bacteria to be annotated to detect signal peptides.
string
Specify that all contigs are complete replicons.
boolean
Changes the original contig headers.
boolean
Clean the result annotations to standardise them to GenBank/ENA conventions.
boolean
Activate tRNA detection & annotation.
boolean
Activate tmRNA detection & annotation.
boolean
Activate rRNA detection & annotation.
boolean
Activate ncRNA detection & annotation.
boolean
Activate ncRNA region detection & annotation.
boolean
Activate CRISPR array detection & annotation.
boolean
Skip CDS detection & annotation.
boolean
Activate pseudogene detection & annotation.
boolean
Skip sORF detection & annotation.
boolean
Activate gap detection & annotation.
boolean
Activate oriC/oriT detection & annotation.
boolean
Activate generation of circular genome plots.
boolean
These parameters influence the annotation algorithm used by Prokka.
Use the default genome-length optimised mode (rather than the metagenome mode).
boolean
Suppress the default clean-up of the gene annotations.
boolean
Specify the kingdom that the input represents.
string
Specify the translation table used to annotate the sequences.
integer
11
Minimum contig size required for annotation (bp).
integer
1
Minimum e-value cut-off.
number
0.000001
Set the assigned minimum coverage.
integer
80
Allow transfer RNA (tRNA) to overlap coding sequences (CDS).
boolean
Use RNAmmer for rRNA prediction.
boolean
Force contig names to follow GenBank/ENA/DDBJ naming rules.
boolean
Add the gene features for each CDS hit.
boolean
Retains contig names.
boolean
These parameters influence the annotation algorithm used by Prodigal.
Specify whether to use Prodigal’s single-genome mode for long sequences.
boolean
Does not allow partial genes on contig edges.
boolean
Specifies the translation table used for gene annotation.
integer
11
Forces Prodigal to scan for motifs.
boolean
These parameters influence the annotation algorithm used by Pyrodigal.
Specify whether to use Pyrodigal’s single-genome mode for long sequences.
boolean
Does not allow partial genes on contig edges.
boolean
Specifies the translation table used for gene annotation.
integer
11
Forces Pyrodigal to scan for motifs.
boolean
Generic options for database downloading
Specify whether to save pipeline-downloaded databases in your results directory.
boolean
Antimicrobial Peptide detection using a deep learning model.
Skip AMPlify during AMP-screening.
boolean
Antimicrobial Peptide detection using machine learning
Skip AMPir during AMP-screening.
boolean
Specify which machine learning classification model to use.
string
Specify minimum protein length for prediction calculation.
integer
10
Antimicrobial Peptide detection based on predefined HMM models
Skip HMMsearch during AMP-screening.
boolean
Specify the path to the AMP HMM model file(s) to search against. The path must be quoted if it contains a wildcard.
string
Saves a multiple alignment of all significant hits to a file.
boolean
Save a simple tabular file summarising the per-target output.
boolean
Save a simple tabular file summarising the per-domain output.
boolean
Antimicrobial Peptide detection mining from metagenomes
Skip Macrel during AMP-screening.
boolean
AntiMicrobial Peptides parsing and functional classification tool
Path to AMPcombi reference database directory (DRAMP).
string
Specify probability cutoff to filter AMPs
number
0.4
Antimicrobial resistance gene detection based on NCBI’s curated Reference Gene Database and curated collection of Hidden Markov Models
Skip AMRFinderPlus during the ARG-screening.
boolean
Specify the path to a local version of the AMRFinderPlus database.
string
Minimum percent identity to reference sequence.
number
-1
Minimum coverage of the reference protein.
number
0.5
Specify which NCBI genetic code to use for translated BLAST.
integer
11
Add the plus genes to the report.
boolean
Add identified column to AMRFinderPlus output.
boolean
Antimicrobial resistance gene detection using a deep learning model
Skip DeepARG during the ARG-screening.
boolean
Specify the path to the DeepARG database.
string
Specify the numeric version number of a user-supplied DeepARG database.
integer
2
Specify which model to use (short or long sequences).
string
Specify minimum probability cutoff under which hits are discarded.
number
0.8
Specify E-value cutoff under which hits are discarded.
number
1e-10
Specify percent identity cutoff for sequence alignment under which hits are discarded.
integer
50
Specify alignment read overlap.
number
0.8
Specify minimum number of alignments per entry for DIAMOND step of DeepARG.
integer
1000
Antimicrobial resistance gene detection using hidden Markov models
Skip fARGene during the ARG-screening.
boolean
Specify comma-separated list of which pre-defined HMM models to screen against
string
class_a,class_b_1_2,class_b_3,class_c,class_d_1,class_d_2,qnr,tet_efflux,tet_rpg,tet_enzyme
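To illustrate how such a comma-separated value might be checked before launching a run, here is a small Python sketch. It assumes, purely for illustration, that the default list above is the complete set of accepted model names; the function `parse_model_list` is hypothetical and not part of the pipeline.

```python
# Default value copied from the parameter description above.
DEFAULT_MODELS = (
    "class_a,class_b_1_2,class_b_3,class_c,class_d_1,class_d_2,"
    "qnr,tet_efflux,tet_rpg,tet_enzyme"
)

# Assumption: only names appearing in the default list are accepted.
KNOWN_MODELS = set(DEFAULT_MODELS.split(","))


def parse_model_list(value: str) -> list[str]:
    """Split a comma-separated model string and reject unknown names."""
    models = [name.strip() for name in value.split(",") if name.strip()]
    unknown = [name for name in models if name not in KNOWN_MODELS]
    if unknown:
        raise ValueError(f"Unrecognised model name(s): {', '.join(unknown)}")
    return models


print(parse_model_list("class_a, qnr"))  # ['class_a', 'qnr']
```

Stripping whitespace around each name is a convenience of this sketch; whether the pipeline itself tolerates spaces in the list is not stated in the description above.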
Specify to save intermediate temporary files to results directory.
boolean
The threshold score for a sequence to be classified as an (almost) complete gene.
number
The minimum length of a predicted ORF retrieved from annotating the nucleotide sequences.
integer
90
Defines which ORF finding algorithm to use.
boolean
The translation table/format to use for sequence annotation.
string
pearson
Antimicrobial resistance gene detection, based on alignment to the CARD database
Skip RGI during the ARG-screening.
boolean
Save RGI output .json file.
boolean
Specify to save intermediate temporary files to the results directory.
boolean
Specify the alignment tool to be used.
string
Include all of loose, strict and perfect hits (i.e. >=95% identity) found by RGI.
boolean
true
Suppresses the default behaviour of RGI with --arg_rgi_includeloose.
boolean
true
Include screening of low quality contigs for partial genes.
boolean
Specify a more specific data-type of input (e.g. plasmid, chromosome)
string
Antimicrobial resistance gene detection, based on alignment to NCBI, CARD, ARG-ANNOT, Resfinder, MEGARES, EcOH, PlasmidFinder, Ecoli_VF and VFDB.
Skip ABRicate during the ARG-screening.
boolean
Specify which of the provided public databases to use by ABRicate.
string
Minimum percent identity of alignment required for a hit to be considered.
integer
80
Minimum percent coverage of alignment required for a hit to be considered.
integer
80
Biosynthetic gene cluster detection
Skip antiSMASH during the BGC screening
boolean
Path to user-defined local antiSMASH database.
string
Path to user-defined local antiSMASH directory. Only required when running with docker/singularity.
string
Minimum longest-contig length a sample must have to be screened with antiSMASH.
integer
1000
Minimum length a contig must have to be screened with antiSMASH.
integer
1000
Turn on clusterblast comparison against database of antiSMASH-predicted clusters.
boolean
Turn on clusterblast comparison against known gene clusters from the MIBiG database.
boolean
Turn on clusterblast comparison against known subclusters responsible for synthesising precursors.
boolean
Turn on ClusterCompare comparison against known gene clusters from the MIBiG database.
boolean
Generate phylogenetic trees of secondary metabolite group orthologs.
boolean
Defines which level of strictness to use for HMM-based cluster detection
string
Specify which taxonomic classification of input sequence to use
string
A deep learning genome-mining strategy for biosynthetic gene cluster prediction
Skip DeepBGC during the BGC screening.
boolean
Path to local DeepBGC database folder.
string
Average protein-wise DeepBGC score threshold for extracting BGC regions from Pfam sequences.
number
0.5
Run DeepBGC’s internal Prodigal step in single mode to restrict detecting genes to long contigs.
boolean
Merge detected BGCs within given number of proteins.
integer
Merge detected BGCs within given number of nucleotides.
integer
Minimum BGC nucleotide length.
integer
1
Minimum number of proteins in a BGC.
integer
1
Minimum number of protein domains in a BGC.
integer
1
Minimum number of known biosynthetic (as defined by antiSMASH) protein domains in a BGC.
integer
DeepBGC classification score threshold for assigning classes to BGCs.
number
0.5
Biosynthetic gene cluster detection
Skip GECCO during the BGC screening.
boolean
Enable unknown region masking to prevent genes from stretching across unknown nucleotides.
boolean
The minimum number of coding sequences a valid cluster must contain.
integer
3
The p-value cutoff for protein domains to be included.
number
1e-9
The probability threshold for cluster detection.
number
0.8
The minimum number of annotated genes that must separate a cluster from the edge.
integer
Biosynthetic Gene Cluster detection based on predefined HMM models
Skip HMMsearch during BGC-screening.
boolean
Specify the path to the BGC HMM model file(s) to search against. The path must be quoted if it contains a wildcard.
string
Saves a multiple alignment of all significant hits to a file.
boolean
Save a simple tabular file summarising the per-target output.
boolean
Save a simple tabular file summarising the per-domain output.
boolean
Influences parameters required for the reporting workflow.
Specifies summary output format
string
Reference genome related files and options required for the workflow.
Name of iGenomes reference.
string
Path to FASTA genome file.
string
^\S+\.fn?a(sta)?(\.gz)?$
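The FASTA pattern above accepts several common extensions, optionally gzip-compressed. The following Python sketch (the helper name `is_valid_fasta_path` is hypothetical) shows which file names would pass:

```python
import re

# Pattern copied verbatim from the parameter description above.
FASTA_PATTERN = re.compile(r"^\S+\.fn?a(sta)?(\.gz)?$")


def is_valid_fasta_path(path: str) -> bool:
    """Return True if the path satisfies the documented FASTA pattern.

    Hypothetical helper for illustration only.
    """
    return FASTA_PATTERN.match(path) is not None


for name in ["genome.fa", "genome.fna", "genome.fasta.gz", "genome.faa", "genome.fas"]:
    print(name, is_valid_fasta_path(name))
```

In this sketch `.fa`, `.fna`, and `.fasta` (with or without `.gz`) are accepted, while `.faa` and `.fas` are not.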
Do not load the iGenomes reference config.
boolean
Parameters used to describe centralised config profiles. These should not be edited.
Git commit id for Institutional configs.
string
master
Base directory for Institutional configs.
string
https://raw.githubusercontent.com/nf-core/configs/master
Institutional config name.
string
Institutional config description.
string
Institutional config contact information.
string
Institutional config URL link.
string
Set the top limit for requested resources for any single job.
Maximum number of CPUs that can be requested for any single job.
integer
16
Maximum amount of memory that can be requested for any single job.
string
128.GB
^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
Maximum amount of time that can be requested for any single job.
string
240.h
^(\d+\.?\s*(s|m|h|d|day)\s*)+$
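The memory and time limits are given as strings that must match the patterns above. As a sketch of what those patterns accept, assuming they are applied as ordinary regular expressions (the helper names are hypothetical):

```python
import re

# Patterns copied verbatim from the parameter descriptions above.
MEMORY_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")
TIME_PATTERN = re.compile(r"^(\d+\.?\s*(s|m|h|d|day)\s*)+$")


def is_valid_memory(value: str) -> bool:
    """Return True if the value looks like '128.GB' or '512 MB'."""
    return MEMORY_PATTERN.match(value) is not None


def is_valid_time(value: str) -> bool:
    """Return True if the value looks like '240.h' or '1d 12h'."""
    return TIME_PATTERN.match(value) is not None


print(is_valid_memory("128.GB"))  # True (the documented default)
print(is_valid_memory("128G"))    # False (the trailing 'B' is required)
print(is_valid_time("240.h"))     # True (the documented default)
print(is_valid_time("1d 12h"))    # True (several units may be combined)
print(is_valid_time("2 days"))    # False ('days' is not an accepted unit)
```

The same memory pattern is used further down for the MultiQC attachment size limit, so a value such as `25.MB` passes the same check.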
Less common options for the pipeline, typically set in a config file.
Display help text.
boolean
Display version and exit.
boolean
Method used to save pipeline results to output directory.
string
Email address for completion summary, only when pipeline fails.
string
^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
Send plain-text email instead of HTML.
boolean
File size limit when attaching MultiQC reports to summary emails.
string
25.MB
^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
Do not use coloured log outputs.
boolean
Incoming hook URL for messaging service
string
Custom config file to supply to MultiQC.
string
Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file
string
Custom MultiQC yaml file containing HTML including a methods description.
string
Whether to validate parameters against the schema at runtime.
boolean
true
Show all params when using --help
boolean
Validation of parameters fails when an unrecognised parameter is found.
boolean
Validation of parameters in lenient mode.
boolean