airrflow: Parameters

Define where the pipeline should find input data and save output data.

Path to a tab-separated (TSV) file containing information about the samples in the experiment.

required

type: string

pattern: ^\S+\.tsv$
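As a quick illustration of what the pattern above accepts, the following Python sketch applies the same regular expression to a candidate path. Only the pattern is taken from the schema; the function name `is_valid_samplesheet_path` is hypothetical and not part of the pipeline.

```python
import re

# Pattern copied verbatim from the schema entry above.
SAMPLESHEET_PATTERN = re.compile(r"^\S+\.tsv$")


def is_valid_samplesheet_path(path: str) -> bool:
    """Return True if the path contains no whitespace and ends in '.tsv'."""
    return SAMPLESHEET_PATTERN.fullmatch(path) is not None


if __name__ == "__main__":
    for candidate in ["samplesheet.tsv", "samplesheet.csv", "my samples.tsv"]:
        print(candidate, is_valid_samplesheet_path(candidate))
```

Note that the pattern rejects any path containing whitespace, so a file in a directory with spaces in its name would need to be moved or renamed.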

Specify the processing mode for the pipeline. Available options are “fastq” and “assembled”.

type: string

The output directory where the results will be saved. Use absolute paths when writing to storage on cloud infrastructure.

required

type: string

Email address for completion summary.

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
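To see which addresses the pattern above lets through, it can be tried out in a few lines of Python. This is only a sketch: the pattern is copied from the schema, while the helper name `is_accepted_email` is made up for the example.

```python
import re

# Pattern copied verbatim from the schema entry above.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_accepted_email(address: str) -> bool:
    """Return True if the address satisfies the schema pattern."""
    return EMAIL_PATTERN.fullmatch(address) is not None


if __name__ == "__main__":
    for address in ["user@example.com", "user@example", "user@example.museum"]:
        print(address, is_accepted_email(address))
```

Because the final group allows only two to five letters, addresses with longer top-level domains (such as `.museum`) are rejected by this pattern.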

Path to MiAIRR-BioSample mapping

type: string

default: ${projectDir}/assets/reveal/mapping_MiAIRR_BioSample_v1.3.1.tsv

Experimental protocol used to generate the data

Protocol used for the V(D)J amplicon sequencing library generation.

type: string

Path to fasta file containing the linker sequence, if no V-region primers were used but a linker sequence is present (e.g. 5’ RACE SMARTer TAKARA protocol).

type: string

Define the primer region start and how to deal with the primer alignment.

Path to a fasta file containing the V-region primer sequences.

type: string

Path to a fasta file containing the C-region primer sequences.

type: string

Start position of V region primers (without counting the UMI barcode).

type: integer

Start position of C region primers (without counting the UMI barcode).

type: integer

Indicate if C region primers are in the R1 or R2 reads.

type: string

Specify to match the tail-end of the sequence against the reverse complement of the primers. This also reverses the behavior of the --start argument, such that the start position is relative to the tail-end of the sequence. (default: False) Maximum scoring error for the pRESTO MaskPrimers process for the C and/or V region primers identification.

type: boolean

Define how UMI barcodes should be treated.

Indicate if UMI indices are recorded in the R1 (default) or R2 fastq file.

type: string

UMI barcode length in nucleotides. Set to 0 if no UMIs present.

type: integer

default: -1

UMI barcode start position in the index read.

type: integer
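The length and start position described above together define a slice of the read. A minimal sketch of that idea follows; it assumes a zero-based start offset and treats a negative length (such as the default of -1) as "not configured". Both are assumptions made for illustration, not documented behaviour of the pipeline, and `extract_umi` is a hypothetical helper.

```python
def extract_umi(read_seq: str, umi_start: int, umi_length: int) -> str:
    """Slice the UMI barcode out of a read sequence.

    Assumptions (not stated in the parameter docs): umi_start is a
    zero-based offset, and a negative umi_length means "unset".
    """
    if umi_length < 0:
        raise ValueError("UMI length is unset; use 0 when no UMIs are present")
    if umi_length == 0:
        # No UMIs present: nothing to extract.
        return ""
    return read_seq[umi_start:umi_start + umi_length]


if __name__ == "__main__":
    print(extract_umi("AACCGGTTACGT", umi_start=2, umi_length=4))
```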

Indicate if UMI indices are recorded in a separate index file.

type: boolean

Options for adapter trimming and read clipping

Whether to trim adapters in fastq reads with fastp.

type: boolean

default: true

Fasta file with adapter sequences to be trimmed.

type: string

Number of bases to clip 5’ in R1 reads.

type: integer

Number of bases to clip 5’ in R2 reads.

type: integer

Number of bases to clip 3’ in R1 reads.

type: integer

Number of bases to clip 3’ in R2 reads.

type: integer
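The four clipping options above each remove a fixed number of bases from one end of a read. The following sketch shows the effect on a single sequence; `clip_read` is a hypothetical helper for illustration and does not reproduce how the trimming tool handles quality strings or edge cases.

```python
def clip_read(seq: str, five_prime: int = 0, three_prime: int = 0) -> str:
    """Remove `five_prime` bases from the start and `three_prime` from the end."""
    if five_prime < 0 or three_prime < 0:
        raise ValueError("clip lengths must be non-negative")
    end = len(seq) - three_prime
    # If the clips overlap, nothing is left of the read.
    return seq[five_prime:max(end, five_prime)]


if __name__ == "__main__":
    print(clip_read("ACGTACGT", five_prime=2, three_prime=3))
```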

Trim adapters specific to NextSeq sequencing.

type: boolean

Option to save trimmed reads.

type: boolean

Options for the pRESTO sequence assembly processes

Quality threshold for pRESTO FilterSeq sequence filtering.

type: integer

default: 20

Maximum error for building the primer consensus in the pRESTO BuildConsensus step.

type: number

default: 0.6

Maximum error for building the sequence consensus in the pRESTO BuildConsensus step.

type: number

default: 0.1

Maximum gap for building the sequence consensus in the pRESTO BuildConsensus step.

type: number

default: 0.5

Cluster sequences by similarity regardless of any annotation with pRESTO ClusterSets and annotate the cluster ID in addition to the UMI barcode.

type: boolean

default: true

Maximum allowed error for R1 primer alignment.

type: number

default: 0.2

Maximum allowed error for R2 primer alignment.

type: number

default: 0.2
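One way to read a maximum primer error such as 0.2 is as the largest tolerated fraction of mismatching positions between the primer and the read. The sketch below scores a primer at a fixed position under that assumption. It ignores IUPAC ambiguity codes and is not the pRESTO implementation; the function names are hypothetical.

```python
def primer_error_rate(read_seq: str, primer: str, start: int = 0) -> float:
    """Fraction of primer positions that mismatch the read at `start`."""
    window = read_seq[start:start + len(primer)]
    if not primer or len(window) < len(primer):
        # Primer does not fit in the read: treat as a complete mismatch.
        return 1.0
    mismatches = sum(1 for a, b in zip(window.upper(), primer.upper()) if a != b)
    return mismatches / len(primer)


def primer_passes(read_seq: str, primer: str, start: int = 0,
                  max_error: float = 0.2) -> bool:
    """Return True if the primer error rate does not exceed max_error."""
    return primer_error_rate(read_seq, primer, start) <= max_error


if __name__ == "__main__":
    print(primer_passes("ACGTACGTAC", "ACGAA"))  # one mismatch in five bases
    print(primer_passes("ACGTACGTAC", "ACCAA"))  # two mismatches in five bases
```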

Align primers instead of scoring them. Used for protocols without fixed primer positions.

type: boolean

Maximum allowed primer length when aligning the primers.

type: integer

default: 50

Masking mode for R1 primers.

type: string

Masking mode for R2 primers.

type: string

Use MaskPrimers align for a 5’ RACE protocol.

type: boolean

Use when the primer sequences are unknown but their approximate positions are known.

type: boolean

R1 primer extract length when using --maskprimers_extract.

type: integer

R2 primer extract length when using --maskprimers_extract.

type: integer

Use AssemblePairs sequential instead of AssemblePairs align when assembling read pairs.

type: boolean

Align internal C-region for a more precise isotype characterization.

type: boolean

Provide internal C-region sequences for a more precise C-region characterization. Then also set the align_cregion flag.

type: string

Maximum allowed length when aligning the internal C-region.

type: integer

default: 100

Maximum allowed error when aligning the internal C-region.

type: number

default: 0.3

Mask mode for C-region alignment.

type: string

default: tag

Skip the post-alignment filter step, which ensures that the locus matches the v_call chain, that the sequence alignment has at least 200 informative positions (excluding N or gaps), and that the alignment contains at most 10% N nucleotides.

type: boolean
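The three criteria of this filter can be written down as a small predicate. The sketch below is an interpretation rather than the pipeline's code: it assumes that the locus "matches" when the first gene call starts with the locus name, that gaps are written as `.` or `-`, and that the 10% limit is measured against the full alignment length. The function name is hypothetical.

```python
def passes_alignment_filter(locus: str, v_call: str, sequence_alignment: str,
                            min_informative: int = 200,
                            max_n_fraction: float = 0.10) -> bool:
    """Apply the three post-alignment checks described above."""
    # Check 1 (assumed rule): the first V gene call starts with the locus name.
    first_call = v_call.split(",")[0].strip().upper()
    if not first_call.startswith(locus.strip().upper()):
        return False

    seq = sequence_alignment.upper()
    n_count = seq.count("N")
    gap_count = seq.count(".") + seq.count("-")

    # Check 2: enough informative (non-N, non-gap) positions.
    if len(seq) - n_count - gap_count < min_informative:
        return False

    # Check 3: N content at most max_n_fraction of the alignment length.
    if len(seq) == 0:
        return False
    return n_count / len(seq) <= max_n_fraction


if __name__ == "__main__":
    print(passes_alignment_filter("IGH", "IGHV1-2*01", "ACGT" * 60))
```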

Options specific for raw single cell input.

Path to the reference directory required by cellranger. Can be either a directory or a tar.gz archive.

type: string

Options specific for raw untargeted RNA-seq input.

Specifies which read holds the barcodes

type: string

Indicate if UMI indices are recorded in the R1 (default) or R2 fastq file.

type: string

Specifies where in the read the barcodes and UMIs can be found.

type: string

Path to the barcode whitelist.

type: string

Options for the VDJ annotation processes.

Whether to reassign genes if the input file is an AIRR formatted tabulated file.

type: boolean

default: true

Subset to productive sequences.

type: boolean

default: true

Save germline database to reuse the cache in future runs.

type: boolean

default: true

Path to the germline reference fasta.

type: string

default: https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip

Path to the cached igblast database.

type: string

default: https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip

Fetch the selected germline reference bundle at runtime instead of using cached reference inputs.

type: string

Options for bulk sequence filtering after VDJ assignment.

Name of the field used to collapse duplicated sequences.

type: string

default: sample_id

Whether to run the process to detect contamination.

type: boolean

Whether to apply the chimera removal filter.

type: boolean

Define how the B-cell clonal trees should be calculated.

Set the clustering threshold Hamming distance value. Default: ‘auto’

type: string,number

default: auto
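Because this option accepts either the string "auto" or a number, code that consumes it has to handle both cases. The sketch below shows one way to parse such a value, together with a length-normalized Hamming distance of the kind a numeric threshold would typically be compared against. Whether the pipeline normalizes by sequence length is an assumption here, and both function names are hypothetical.

```python
from typing import Union


def parse_threshold(value: Union[str, float, int]) -> Union[str, float]:
    """Return 'auto' or the threshold as a float."""
    if isinstance(value, str):
        text = value.strip()
        if text.lower() == "auto":
            return "auto"
        return float(text)
    return float(value)


def normalized_hamming(a: str, b: str) -> float:
    """Hamming distance divided by sequence length (assumed normalization)."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length sequences")
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    threshold = parse_threshold("0.3")
    print(normalized_hamming("ACGT", "ACGA") <= threshold)
```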

Perform clonal lineage tree analysis.

type: boolean

Name of the field used to group data files to identify clones.

type: string

default: subject_id

Name of the field used to identify external groups used to identify a clonal threshold.

type: string

default: subject_id

Lineage tree software to use to build trees within Dowser. If you change the default, also set the lineage_tree_exec parameter.

type: string

Path to lineage tree building executable.

type: string

default: /usr/local/bin/raxml-ng

Name of the field used to determine if a sample is single cell sequencing or not.

type: string

default: single_cell

Skip report of EnchantR DefineClones for all samples together.

type: boolean

Skip report of EnchantR FindThreshold for all samples together.

type: boolean

Skip all clonal analysis processes.

type: boolean

Options for genotyping and novel allele inference.

Perform TIgGER novel allele and genotype inference.

type: boolean

Name of the field used to group data files to infer genotype.

type: string

default: subject_id

Threshold for determining if two sequences come from the same clone or not.

type: number

default: 0.2

Keep only one representative sequence per clone to perform genotype inference.

type: boolean

Perform TIgGER novel allele inference.

type: boolean

Options to generate BCR and TCR embeddings with Amulety

Generate a sequence amino acid translation with IgBlast.

type: boolean

Generate sequence embeddings with Amulety.

type: string

BCR or TCR chains to include for embedding.

type: string

default: H

Use GPU to generate embeddings.

type: boolean

Custom report Rmarkdown file.

type: string

default: ${projectDir}/assets/repertoire_comparison.Rmd

Custom report style file in css format.

type: string

default: ${projectDir}/assets/nf-core_style.css

Custom logo for the report.

type: string

default: ${projectDir}/assets/nf-core-airrflow_logo_light.png

Custom logo for the EnchantR reports.

type: string

default: ${projectDir}/assets/nf-core-airrflow_logo_reports.png

Skip repertoire analysis and report generation.

type: boolean

Skip multiqc report.

type: boolean

Options for the reference genome indices used to align reads.

Do not load the iGenomes reference config.

hidden

type: boolean

default: true

The base path to the igenomes reference files

hidden

type: string

default: s3://ngi-igenomes/igenomes/

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden

type: string

default: master

Base directory for Institutional configs.

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional config name.

hidden

type: string

Institutional config description.

hidden

type: string

Institutional config contact information.

hidden

type: string

Institutional config URL link.

hidden

type: string

Less common options for the pipeline, typically set in a config file.

Display version and exit.

hidden

type: boolean

Method used to save pipeline results to output directory.

hidden

type: string

Email address for completion summary, only when pipeline fails.

hidden

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.

hidden

type: boolean

MultiQC report title. Printed as page header, used for filename if not otherwise specified.

hidden

type: string

File size limit when attaching MultiQC reports to summary emails.

hidden

type: string

default: 25.MB

pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
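The size pattern above is easier to judge with a few concrete values. The following sketch checks candidate strings against it; the pattern is copied from the schema, and the helper name `is_valid_size` is hypothetical.

```python
import re

# Pattern copied verbatim from the schema entry above.
SIZE_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")


def is_valid_size(text: str) -> bool:
    """Return True if the string satisfies the schema's size pattern."""
    return SIZE_PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    for size in ["25.MB", "25 MB", "1.5GB", "10B", "25", "25MiB"]:
        print(size, is_valid_size(size))
```

The pattern accepts both the dotted form used in the default (`25.MB`) and a spaced form (`25 MB`), but it requires a trailing `B` and does not accept binary suffixes such as `MiB`.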

Do not use coloured log outputs.

hidden

type: boolean

Custom config file to supply to MultiQC.

hidden

type: string

Custom logo file to supply to MultiQC. The file name must also be set in the MultiQC config file.

hidden

type: string

Custom MultiQC yaml file containing HTML including a methods description.

hidden

type: string

Whether to validate parameters against the schema at runtime.

hidden

type: boolean

default: true

Base URL or local path to location of pipeline test dataset files

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/

Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.

hidden

type: string
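The default suffix format `yyyy-MM-dd_HH-mm-ss` uses Java-style date letters. For anyone matching these filenames from a script, the equivalent Python `strftime` format is shown below. This is an illustrative sketch, and `default_trace_suffix` is a hypothetical name.

```python
from datetime import datetime

# Python equivalent of the Java-style pattern yyyy-MM-dd_HH-mm-ss.
TRACE_SUFFIX_FORMAT = "%Y-%m-%d_%H-%M-%S"


def default_trace_suffix(moment: datetime) -> str:
    """Format a timestamp the way the default suffix is described."""
    return moment.strftime(TRACE_SUFFIX_FORMAT)


if __name__ == "__main__":
    print(default_trace_suffix(datetime.now()))
```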

Display the help message.

type: boolean,string

Display the full detailed help message.

type: boolean

Display hidden parameters in the help message (only works when --help or --help_full are provided).

type: boolean
