Define where the pipeline should find input data and save output data.

Path to tab-separated file containing information about the samples in the experiment.

required
type: string
pattern: ^\S+\.tsv$

Create a design file with information about the samples in your experiment before running the pipeline, and use this parameter to specify its location. It has to be a tab-separated file (.tsv) with 3 columns and a header row. See usage docs.
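The pattern above can be checked before launching a run. The following is a minimal sketch, assuming the schema pattern is applied as an ordinary regular-expression search; the helper name `is_valid_input_path` is hypothetical and not part of the pipeline.

```python
import re

# Pattern copied verbatim from the schema entry above.
INPUT_PATTERN = re.compile(r"^\S+\.tsv$")


def is_valid_input_path(path: str) -> bool:
    """Return True if the path has no whitespace and ends in '.tsv'.

    Hypothetical helper for illustration only.
    """
    return INPUT_PATTERN.search(path) is not None


for candidate in ["samplesheet.tsv", "samplesheet.csv", "my samples.tsv"]:
    print(candidate, is_valid_input_path(candidate))
```

Paths containing whitespace are rejected because `\S+` only matches non-whitespace characters.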

Specify the processing mode for the pipeline. Available options are "fastq" and "assembled".

type: string

The output directory where the results will be saved. Use absolute paths when writing to storage on cloud infrastructure.

required
type: string

Email address for completion summary.

type: string
pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Set this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the workflow exits. If set in your user config file (~/.nextflow/config) then you don't need to specify this on the command line for every run.
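To see which addresses the pattern above accepts, it can be tried out directly. This is a small sketch, assuming plain regular-expression matching; the example addresses and the helper name `is_valid_email` are made up for illustration.

```python
import re

# Pattern copied verbatim from the schema entry above.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_valid_email(address: str) -> bool:
    """Hypothetical helper: True if the address satisfies the schema pattern."""
    return EMAIL_PATTERN.search(address) is not None


for address in ["jane.doe@uni-example.org", "user@example.museum", "no-at-sign.org"]:
    print(address, is_valid_email(address))
```

Note that the final group allows only 2 to 5 letters, so addresses whose top-level domain is longer than that do not satisfy the pattern.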

Path to MiAIRR-BioSample mapping

type: string
default: ${projectDir}/assets/reveal/mapping_MiAIRR_BioSample_v1.3.1.tsv

Experimental protocol used to generate the data

Protocol used for the V(D)J amplicon sequencing library generation.

type: string

Available protocols are:

  • specific_pcr_umi: RT-PCR using transcript-specific primers containing UMIs.
  • specific_pcr: RT-PCR using transcript-specific primers.
  • dt_5p_race_umi: 5'-RACE PCR using oligo-dT primers and template switch primers containing UMIs.
  • dt_5p_race: 5'-RACE PCR (i.e. RT is followed by a template switch (TS) step) using oligo-dT primers.
  • sc_10x_genomics: 10x Genomics library preparation protocol for scVDJ sequencing.

Path to fasta file containing the linker sequence, if no V-region primers were used but a linker sequence is present (e.g. 5' RACE SMARTer TAKARA protocol).

type: string

Define the primer region start and how to deal with the primer alignment.

Path to a fasta file containing the V-region primer sequences.

type: string

Path to a fasta file containing the C-region primer sequences.

type: string

Start position of V region primers (without counting the UMI barcode).

type: integer

Start position of C region primers (without counting the UMI barcode).

type: integer

Indicate if C region primers are in the R1 or R2 reads.

type: string

Specify to match the tail-end of the sequence against the reverse complement of the primers. This also reverses the behavior of the --start argument, such that the start position is relative to the tail-end of the sequence. (default: False)

Maximum scoring error for the pRESTO MaskPrimers process for the C and/or V region primers identification.

type: boolean

Define how UMI barcodes should be treated.

Indicate if UMI indices are recorded in the R1 (default) or R2 fastq file.

type: string

The pipeline requires UMI barcodes for identifying unique transcripts. These barcodes are typically read from an index file, but are sometimes provided merged with the start of the R1 or R2 reads. If they are provided in an additional index file, set the --index_file parameter. If they are provided merged with the R1 or R2 reads, set the --umi_position parameter to R1 or R2, respectively.

UMI barcode length in nucleotides. Set to 0 if no UMIs present.

type: integer
default: -1

UMI barcode start position in the index read.

type: integer

Indicate if UMI indices are recorded in a separate index file.

type: boolean

Set to true if UMI barcodes are to be read from a separate Illumina index fastq file. If Illumina indices and UMI barcodes are already integrated into the R1 reads, leave the default --index_file false.

The pipeline requires UMI barcodes for identifying unique transcripts. These barcodes are typically read from an index file, but are sometimes provided merged with the start of the R1 or R2 reads. If they are provided in an additional index file, set the --index_file parameter. If they are provided merged with the R1 or R2 reads, set the --umi_position parameter.
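The choice between the two UMI parameters can be written down as a small decision helper. This is only a sketch: the function name `umi_source_params` and its argument are hypothetical, and the resulting dictionary uses just the two parameter names mentioned above. Nextflow can read such values from a JSON file passed with `-params-file`.

```python
import json


def umi_source_params(umi_location: str) -> dict:
    """Map where the UMIs live to the parameters described above.

    'index' means a separate index fastq file; 'R1' or 'R2' means the UMI
    is merged with the start of that read. Hypothetical helper.
    """
    if umi_location == "index":
        return {"index_file": True}
    if umi_location in ("R1", "R2"):
        # index_file is left at its default (false) in this case.
        return {"umi_position": umi_location}
    raise ValueError(f"Unknown UMI location: {umi_location!r}")


# Print a JSON fragment that could be merged into a params file.
print(json.dumps(umi_source_params("R2"), indent=2))
```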

Options for adapter trimming and read clipping

Whether to trim adapters in fastq reads with fastp.

type: boolean
default: true

By default adapters will be auto-detected, but adapter sequences can also be provided in a fasta file with the --adapter_fasta option.

Fasta file with adapter sequences to be trimmed.

type: string

Number of bases to clip 5' in R1 reads.

type: integer

Number of bases to clip 5' in R2 reads.

type: integer

Number of bases to clip 3' in R1 reads.

type: integer

Number of bases to clip 3' in R2 reads.

type: integer
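The effect of the four clipping parameters can be pictured on a single read. The sketch below is an illustration of 5' and 3' clipping in general, not the pipeline's or fastp's implementation; the function name `clip_read` is hypothetical.

```python
def clip_read(sequence: str, clip_5p: int = 0, clip_3p: int = 0) -> str:
    """Remove clip_5p bases from the start and clip_3p bases from the end.

    Illustrative only; returns an empty string if the read is shorter than
    the total number of clipped bases.
    """
    if clip_5p < 0 or clip_3p < 0:
        raise ValueError("clip lengths must be non-negative")
    end = len(sequence) - clip_3p
    return sequence[clip_5p:end] if end > clip_5p else ""


# Clip 2 bases at the 5' end and 3 bases at the 3' end of a toy read.
print(clip_read("AACCGGTT", clip_5p=2, clip_3p=3))
```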

Trim adapters specific for NextSeq sequencing.

type: boolean

Option to save trimmed reads.

type: boolean

Options for the pRESTO sequence assembly processes

Quality threshold for pRESTO FilterSeq sequence filtering.

type: integer
default: 20

Maximum error for building the primer consensus in the pRESTO BuildConsensus step.

type: number
default: 0.6

Maximum error for building the sequence consensus in the pRESTO BuildConsensus step.

type: number
default: 0.1

Maximum gap for building the sequence consensus in the pRESTO BuildConsensus step.

type: number
default: 0.5

Cluster sequences by similarity regardless of any annotation with pRESTO ClusterSets and annotate the cluster ID in addition to the UMI barcode.

type: boolean
default: true

Maximum allowed error for R1 primer alignment.

type: number
default: 0.2

Maximum allowed error for R2 primer alignment.

type: number
default: 0.2

Align primers instead of scoring them. Used for protocols without fixed primer positions.

type: boolean

Maximum allowed primer length when aligning the primers.

type: integer
default: 50

Masking mode for R1 primers.

type: string

See pRESTO MaskPrimers for more information.

Masking mode for R2 primers.

type: string

See pRESTO MaskPrimers for more information.

Use MaskPrimers align for a 5' RACE protocol.

type: boolean

It will align the C-primers with pRESTO MaskPrimers align, and extract the template-switch oligo and/or linker with the length defined in --primer_r2_extract_len.

Use when primer sequences are unknown but their approximate positions are known.

type: boolean

Use in combination with --primer_r1_extract_len and --primer_r2_extract_len to define the sequence regions to cut corresponding to the approximate unknown primer sequence length prior to alignment. It is also possible to define a UMI barcode for extraction if its position is unknown.

R1 primer extract length when using --maskprimers_extract.

type: integer

R2 primer extract length when using --maskprimers_extract.

type: integer

Use AssemblePairs sequential instead of AssemblePairs align when assembling read pairs.

type: boolean

Align internal C-region for a more precise isotype characterization.

type: boolean

Provide internal C-region sequences for a more precise C-region characterization. When using this option, also set the --align_cregion flag.

type: string

Maximum allowed length when aligning the internal C-region.

type: integer
default: 100

Maximum allowed error when aligning the internal C-region.

type: number
default: 0.3

Mask mode for C-region alignment.

type: string
default: tag

Skip the filter step after alignment. This filter ensures that the locus matches the v_call chain, that the sequence alignment has at least 200 informative positions (excluding N or gaps), and that the alignment contains at most 10% N nucleotides.

type: boolean
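The three criteria of this filter can be sketched as a single predicate. This is a rough illustration under stated assumptions, not the pipeline's implementation: it assumes the chain is the first three characters of each V call (e.g. `IGH` in `IGHV1-2*01`), that gaps are written as `.` or `-`, and that the N fraction is taken over the full alignment length. The function name `passes_alignment_filter` is hypothetical.

```python
def passes_alignment_filter(locus: str, v_call: str, sequence_alignment: str) -> bool:
    """Apply the three checks described above to one annotated sequence.

    Assumptions (not confirmed by the source): chain = first three letters
    of each comma-separated V call; gaps are '.' or '-'; the N fraction is
    computed over the whole alignment string.
    """
    alignment = sequence_alignment.upper()
    if not alignment:
        return False

    # 1. The locus must match the chain of every V call.
    calls = [call.strip() for call in v_call.split(",") if call.strip()]
    if not calls or any(call[:3] != locus for call in calls):
        return False

    # 2. At least 200 informative positions (excluding N and gaps).
    informative = sum(1 for base in alignment if base not in "N.-")
    if informative < 200:
        return False

    # 3. At most 10% N nucleotides in the alignment.
    return alignment.count("N") / len(alignment) <= 0.10


print(passes_alignment_filter("IGH", "IGHV1-2*01", "ACGT" * 60))
```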

Options for the VDJ annotation processes.

Whether to reassign genes if the input file is an AIRR formatted tabulated file.

type: boolean
default: true

Subset to productive sequences.

type: boolean
default: true

Save databases so you can use the cache in future runs.

type: boolean
default: true

Path to the germline reference fasta.

type: string
default: https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip

By default, we provide a pre-downloaded version of the IMGT database. It is also possible to provide a custom reference fasta database. To fetch a fresh version of IMGT, set the --fetch_imgt parameter instead.

Path to the cached igblast database.

type: string
default: https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip

By default, we provide a pre-downloaded version of the IMGT database. It is also possible to provide a custom reference fasta database. To fetch a fresh version of IMGT, set the --fetch_imgt parameter instead.

Set this flag to fetch the IMGT reference data at runtime.

type: boolean

Options for bulk sequence filtering after VDJ assignment.

Name of the field used to collapse duplicated sequences.

type: string
default: sample_id

Whether to run the process to detect contamination.

type: boolean

Whether to apply the chimera removal filter.

type: boolean

Define how the B-cell clonal trees should be calculated.

Set the clustering threshold Hamming distance value. Default: 'auto'

type: string,number
default: auto
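Because this parameter accepts either the string 'auto' or a number, a script that prepares run parameters may want to normalise the value first. The sketch below is one possible way to do that; the function name `parse_clonal_threshold` is hypothetical, and rejecting negative values is an assumption based on the threshold being a distance.

```python
def parse_clonal_threshold(value):
    """Return 'auto' or a non-negative float for the clustering threshold.

    Hypothetical helper; accepts the string 'auto', a number, or a numeric
    string such as '0.15'.
    """
    if isinstance(value, str) and value.strip().lower() == "auto":
        return "auto"
    threshold = float(value)  # raises ValueError for non-numeric strings
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    return threshold


for raw in ["auto", 0.1, "0.15"]:
    print(repr(raw), "->", parse_clonal_threshold(raw))
```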

Perform clonal lineage tree analysis.

type: boolean

Name of the field used to group data files to identify clones.

type: string
default: subject_id

Name of the field used to identify external groups used to identify a clonal threshold.

type: string
default: subject_id

Lineage tree software to use to build trees within Dowser. If you change the default, also set the --lineage_tree_exec parameter.

type: string

Path to lineage tree building executable.

type: string
default: /usr/local/bin/raxml-ng

Name of the field used to determine if a sample is single cell sequencing or not.

type: string
default: single_cell

Skip report of EnchantR DefineClones for all samples together.

type: boolean

Skip report of EnchantR FindThreshold for all samples together.

type: boolean

Options specific for raw single cell input.

Path to the reference directory required by cellranger. Can be either a directory or a tar.gz archive.

type: string

See for IMGT or default.

Options specific for raw unselected RNA-seq input.

Specifies which read holds the barcodes

type: string

File containing the barcodes.

Indicate if UMI indices are recorded in the R1 (default) or R2 fastq file.

type: string

File containing 10x Genomics-like UMIs.

Specifies where in the read the barcodes and UMIs can be found.

type: string

For further information see the TRUST4 docs.

Custom report Rmarkdown file.

type: string
default: ${projectDir}/assets/repertoire_comparison.Rmd

Custom report style file in css format.

type: string
default: ${projectDir}/assets/nf-core_style.css

Custom logo for the report.

type: string
default: ${projectDir}/assets/nf-core-airrflow_logo_light.png

Custom logo for the EnchantR reports.

type: string
default: ${projectDir}/assets/nf-core-airrflow_logo_reports.png

Skip repertoire analysis and report generation.

type: boolean

Skip multiqc report.

type: boolean

Options for the reference genome indices used to align reads.

Do not load the iGenomes reference config.

hidden
type: boolean
default: true

Do not load igenomes.config when running the pipeline. You may choose this option if you observe clashes between custom parameters and those supplied in igenomes.config.

The base path to the igenomes reference files

hidden
type: string
default: s3://ngi-igenomes/igenomes/

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden
type: string
default: master

Base directory for Institutional configs.

hidden
type: string
default: https://raw.githubusercontent.com/nf-core/configs/master

If you're running offline, Nextflow will not be able to fetch the institutional config files from the internet. If you don't need them, then this is not a problem. If you do need them, you should download the files from the repo and tell Nextflow where to find them with this parameter.

Institutional config name.

hidden
type: string

Institutional config description.

hidden
type: string

Institutional config contact information.

hidden
type: string

Directory to keep pipeline Nextflow logs and reports.

hidden
type: string
default: ${params.outdir}/pipeline_info

Less common options for the pipeline, typically set in a config file.

Display version and exit.

hidden
type: boolean

Method used to save pipeline results to output directory.

hidden
type: string

The Nextflow publishDir option specifies which intermediate files should be saved to the output directory. This option tells the pipeline what method should be used to move these files. See Nextflow docs for details.

Email address for completion summary, only when pipeline fails.

hidden
type: string
pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

An email address to send a summary email to when the pipeline is completed - ONLY sent if the pipeline does not exit successfully.

Send plain-text email instead of HTML.

hidden
type: boolean

MultiQC report title. Printed as page header, used for filename if not otherwise specified.

type: string

File size limit when attaching MultiQC reports to summary emails.

hidden
type: string
default: 25.MB
pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
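The size pattern above is easier to read with a few concrete values. This is a minimal sketch, assuming plain regular-expression matching; the helper name `is_valid_size` and the example values other than the default `25.MB` are made up for illustration.

```python
import re

# Pattern copied verbatim from the schema entry above.
SIZE_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")


def is_valid_size(value: str) -> bool:
    """Hypothetical helper: True if the value satisfies the schema pattern."""
    return SIZE_PATTERN.search(value) is not None


for value in ["25.MB", "25 MB", "1.5GB", "25M", "MB"]:
    print(value, is_valid_size(value))
```

The trailing `B` is mandatory and the unit prefix is optional, so a bare byte count such as `512B` is accepted while `25M` is not.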

Do not use coloured log outputs.

hidden
type: boolean

Incoming hook URL for messaging service

hidden
type: string

Incoming hook URL for messaging service. Currently, MS Teams and Slack are supported.

Custom config file to supply to MultiQC.

hidden
type: string

Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file

hidden
type: string

Custom MultiQC yaml file containing HTML including a methods description.

type: string

Boolean whether to validate parameters against the schema at runtime

hidden
type: boolean
default: true

Base URL or local path to location of pipeline test dataset files

hidden
type: string
default: https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/