airrflow: Parameters

Define where the pipeline should find input data and save output data.

Path to a tsv file providing paths to the fastq files for each sample and the necessary metadata for the analysis.

type: string

The input file includes important sample metadata and the paths to the R1 and R2 fastq files, and to the index read file (I), if available. Please check the usage docs for information on how to create the input samplesheet.

Specify the processing mode for the pipeline. Available options are "fastq" and "assembled".

type: string

The output directory where the results will be saved. Use absolute paths when writing to storage on cloud infrastructure.

type: string

Email address for completion summary.

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Set this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the workflow exits. If set in your user config file (~/.nextflow/config) then you don't need to specify this on the command line for every run.
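
The pattern listed for this parameter can be tried out before launching a run. The following is a minimal sketch in Python, assuming that Python's re module interprets the pattern the same way as the pipeline's schema validation; the function name is_valid_email is hypothetical and not part of the pipeline.

```python
import re

# Pattern copied verbatim from the parameter description above.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_valid_email(address: str) -> bool:
    """Return True if the address satisfies the schema pattern (hypothetical helper)."""
    return EMAIL_PATTERN.match(address) is not None


if __name__ == "__main__":
    for candidate in ("user@example.org", "user@example", "user@lab.university"):
        print(candidate, is_valid_email(candidate))
```

Note that the final group only accepts 2 to 5 letters, so addresses whose top-level domain is longer than five letters would be rejected by this pattern.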

Path to MiAIRR-BioSample mapping

type: string

default: bcellmagic/assets/reveal/mapping_MiAIRR_BioSample_v1.3.1.tsv

Experimental protocol used to generate the data

Protocol used for the V(D)J amplicon sequencing library generation.

type: string

Available protocols are:

specific_pcr_umi: RT-PCR using transcript-specific primers containing UMIs.
specific_pcr: RT-PCR using transcript-specific primers.
dt_5p_race_umi: 5’-RACE PCR using oligo-dT primers and template switch primers containing UMI.
dt_5p_race: 5’-RACE PCR (i.e. RT is followed by a template switch (TS) step) using oligo-dT primers.

Path to fasta file containing the linker sequence, if no V-region primers were used but a linker sequence is present (e.g. 5' RACE SMARTer TAKARA protocol).

type: string

Define the primer region start and how to deal with the primer alignment.

Path to a fasta file containing the V-region primer sequences.

type: string

Path to a fasta file containing the C-region primer sequences.

type: string

Start position of V region primers (without counting the UMI barcode).

type: integer

Start position of C region primers (without counting the UMI barcode).

type: integer

Indicate if C region primers are in the R1 or R2 reads.

type: string

Specify to match the tail-end of the sequence against the reverse complement of the primers. This also reverses the behavior of the --start argument, such that the start position is relative to the tail-end of the sequence. (default: False)

type: boolean
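
As a rough illustration of what matching the tail-end against the reverse complement means, the sketch below uses exact string matching. This is a simplification: the actual primer identification tolerates mismatches up to a maximum error. The function names and the toy sequences are hypothetical.

```python
# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def matches_tail(sequence: str, primer: str, start: int = 0) -> bool:
    """Check whether the reverse complement of the primer sits at the tail-end.

    `start` counts bases from the tail-end of the sequence, mirroring the
    reversed meaning of the start position described above.
    """
    end = len(sequence) - start
    return sequence[:end].endswith(reverse_complement(primer))


if __name__ == "__main__":
    print(reverse_complement("AACG"))              # CGTT
    print(matches_tail("GGGGCGTT", "AACG"))        # primer RC ends at the last base
    print(matches_tail("GGGGCGTTAA", "AACG", 2))   # primer RC ends 2 bases before the end
```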

Define how UMI barcodes should be treated.

Indicate if UMI indices are recorded in the R1 (default) or R2 fastq file.

type: string

The pipeline requires UMI barcodes for identifying unique transcripts. These barcodes are typically read from an index file but can sometimes be provided merged with the start of the R1 or R2 reads. If they are provided in an additional index file, set the --index_file parameter; if they are merged with the R1 or R2 reads, set the --umi_position parameter to R1 or R2, respectively.

UMI barcode length in nucleotides. Set to 0 if no UMIs present.

type: integer

default: -1

UMI barcode start position in the index read.

type: integer

Indicate if UMI indices are recorded in a separate index file.

type: boolean

Set to true if UMI barcodes are to be read from a separate Illumina index fastq file. If Illumina indices and UMI barcodes are already integrated into the R1 reads, leave the default --index_file false.

The pipeline requires UMI barcodes for identifying unique transcripts. These barcodes are typically read from an index file but can sometimes be provided merged with the start of the R1 or R2 reads. If they are provided in an additional index file, set the --index_file parameter; if they are merged with the R1 or R2 reads, set the --umi_position parameter.
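
The choice between --index_file and --umi_position described above can be summarised as a small helper. This is only a sketch: the function name umi_flags and the umi_location labels are made up for illustration, and only the two parameter names come from the text.

```python
from typing import List


def umi_flags(umi_location: str) -> List[str]:
    """Map where the UMI barcodes live to command-line flags.

    umi_location is a hypothetical label: "index" for a separate index
    fastq file, or "R1" / "R2" when the UMI is merged into those reads.
    """
    if umi_location == "index":
        # UMIs are read from a separate Illumina index file.
        return ["--index_file"]
    if umi_location in ("R1", "R2"):
        # UMIs are merged with the start of the R1 or R2 reads.
        return ["--umi_position", umi_location]
    raise ValueError(f"Unknown UMI location: {umi_location!r}")


if __name__ == "__main__":
    for location in ("index", "R1", "R2"):
        print(location, " ".join(umi_flags(location)))
```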

Options for adapter trimming and read clipping

Whether to trim adapters in fastq reads with fastp.

type: boolean

default: true

By default adapters will be auto-detected, but adapter sequences can also be provided in a fasta file with the --adapter_fasta option.

Fasta file with adapter sequences to be trimmed.

type: string

default: None

Number of bases to clip 5' in R1 reads.

type: integer

Number of bases to clip 5' in R2 reads.

type: integer

Number of bases to clip 3' in R1 reads.

type: integer

Number of bases to clip 3' in R2 reads.

type: integer

Trim adapters specific to NextSeq sequencing.

type: boolean

Option to save trimmed reads.

type: boolean

Options for the pRESTO sequence assembly processes

Quality threshold for pRESTO FilterSeq sequence filtering.

type: integer

default: 20
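
To give a feel for what a quality threshold of 20 means, the sketch below computes the mean Phred score of a read from its FASTQ quality string and compares it to the threshold. It assumes Phred+33 encoding, that the filter operates on the mean quality of the read, and that the threshold is inclusive; the function names are hypothetical and this is not the pRESTO implementation.

```python
def mean_phred(quality_string: str, offset: int = 33) -> float:
    """Mean Phred score of a FASTQ quality string (Phred+33 assumed)."""
    if not quality_string:
        raise ValueError("quality string is empty")
    scores = [ord(char) - offset for char in quality_string]
    return sum(scores) / len(scores)


def passes_quality(quality_string: str, threshold: float = 20) -> bool:
    """Keep a read if its mean quality reaches the threshold (inclusive, by assumption)."""
    return mean_phred(quality_string) >= threshold


if __name__ == "__main__":
    for quals in ("IIII", "####", "I#"):
        print(quals, mean_phred(quals), passes_quality(quals))
```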

Maximum primer scoring error in the pRESTO MaskPrimer step when identifying the C and/or V region primers.

type: number

default: 0.2

Maximum error for building the primer consensus in the pRESTO BuildConsensus step.

type: number

default: 0.6

Masking mode for the pRESTO MaskPrimer step. Available: cut, mask, trim, tag.

type: string

The primer masking modes will perform the following actions:

cut: remove both the primer region and the preceding sequence.
mask: replace the primer region with Ns and remove the preceding sequence.
trim: remove the region preceding the primer, but leave the primer region intact.
tag: leave the input sequence unmodified.
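
The four masking modes listed above can be illustrated on a toy sequence. The sketch below is a simplified model, assuming the primer position is already known; the function name apply_mask_mode, its arguments, and the example sequence are hypothetical and do not reflect how pRESTO is implemented.

```python
def apply_mask_mode(sequence: str, primer_start: int, primer_length: int, mode: str) -> str:
    """Apply one of the primer masking modes to a sequence (illustrative only)."""
    primer_end = primer_start + primer_length
    if mode == "cut":
        # Remove both the primer region and the preceding sequence.
        return sequence[primer_end:]
    if mode == "mask":
        # Replace the primer region with Ns and remove the preceding sequence.
        return "N" * primer_length + sequence[primer_end:]
    if mode == "trim":
        # Remove the region preceding the primer, keep the primer itself.
        return sequence[primer_start:]
    if mode == "tag":
        # Leave the input sequence unmodified.
        return sequence
    raise ValueError(f"Unknown masking mode: {mode!r}")


if __name__ == "__main__":
    # Toy read: 2 preceding bases, a 4-base primer (ACGT), then the insert.
    read = "TTACGTGGGCCC"
    for mode in ("cut", "mask", "trim", "tag"):
        print(mode, apply_mask_mode(read, primer_start=2, primer_length=4, mode=mode))
```

For the toy read, cut yields GGGCCC, mask yields NNNNGGGCCC, trim yields ACGTGGGCCC, and tag returns the read unchanged.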

Maximum error for building the sequence consensus in the pRESTO BuildConsensus step.

type: number

default: 0.1

Maximum gap for building the sequence consensus in the pRESTO BuildConsensus step.

type: number

default: 0.5

Cluster sequences by similarity regardless of any annotation with pRESTO ClusterSets and annotate the cluster ID in addition to the UMI barcode.

type: boolean

default: true

Options for the VDJ annotation processes.

Whether to reassign genes if the input file is an AIRR formatted tabulated file.

type: boolean

default: true

Subset to productive sequences.

type: boolean

default: true

Save databases so you can use the cache in future runs.

type: boolean

Path to the cached IMGT database.

type: string

If it is not provided, the database will be newly downloaded.

Path to the cached igblast database.

type: string

If it is not provided, the database will be newly downloaded.

Use this flag for directly calling IgBlast for reference alignment, instead of using the changeo assigngenes and changeo makedb options that call IgBlast in the background. Allows for additional configuration of the IgBlast call.

type: boolean

Options for bulk sequence filtering after VDJ assignment.

Name of the field used to collapse duplicated sequences.

type: string

default: sample_id

Whether to run the process to detect contamination.

type: boolean

Whether to apply the chimera removal filter.

type: boolean

Define how the B-cell clonal trees should be calculated.

Set the clustering threshold Hamming distance value. Default: 'auto'

type: string,number

default: auto
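
When a numeric threshold is given instead of 'auto', it is compared against a Hamming distance between sequences. The sketch below shows one common way to compute such a distance, normalised by sequence length; whether the pipeline normalises in exactly this way is an assumption, and the function names and the example threshold are hypothetical.

```python
def normalized_hamming(seq_a: str, seq_b: str) -> float:
    """Fraction of mismatching positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("Hamming distance requires sequences of equal length")
    if not seq_a:
        raise ValueError("sequences are empty")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return mismatches / len(seq_a)


def within_threshold(seq_a: str, seq_b: str, threshold: float) -> bool:
    """True if the two sequences are at most `threshold` apart."""
    return normalized_hamming(seq_a, seq_b) <= threshold


if __name__ == "__main__":
    a, b = "ACGTACGTAC", "ACGTACGTTT"
    print(normalized_hamming(a, b))           # 2 mismatches out of 10 positions
    print(within_threshold(a, b, 0.25))       # 0.25 is an arbitrary example value
```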

Skip clonal lineage analysis and lineage tree plotting.

type: boolean

Name of the field used to group data files to identify clones.

type: string

default: subject_id

Name of the field used to identify external groups used to identify a clonal threshold.

type: string

default: subject_id

Path to IgPhyml executable.

type: string

default: /usr/local/share/igphyml/src/igphyml

Name of the field used to determine if a sample is single cell sequencing or not.

type: string

default: single_cell

Skip report of EnchantR DefineClones for all samples together.

type: boolean

Custom report Rmarkdown file.

type: string

default: ${projectDir}/assets/repertoire_comparison.Rmd

Custom report style file in css format.

type: string

default: ${projectDir}/assets/nf-core_style.css

Custom logo for the report.

type: string

default: ${projectDir}/assets/nf-core-airrflow_logo_light.png

Custom logo for the EnchantR reports.

type: string

default: ${projectDir}/assets/nf-core-airrflow_logo_reports.png

Skip repertoire analysis and report generation.

type: boolean

Skip multiqc report.

type: boolean

Options for the reference genome indices used to align reads.

Directory / URL base for iGenomes references.

hidden

type: string

default: s3://ngi-igenomes/igenomes

Do not load the iGenomes reference config.

hidden

type: boolean

default: true

Do not load igenomes.config when running the pipeline. You may choose this option if you observe clashes between custom parameters and those supplied in igenomes.config.

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden

type: string

default: master

Base directory for Institutional configs.

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/configs/master

If you're running offline, Nextflow will not be able to fetch the institutional config files from the internet. If you don't need them, then this is not a problem. If you do need them, you should download the files from the repo and tell Nextflow where to find them with this parameter.

Institutional config name.

hidden

type: string

Institutional config description.

hidden

type: string

Institutional config contact information.

hidden

type: string

Directory to keep pipeline Nextflow logs and reports.

hidden

type: string

default: ${params.outdir}/pipeline_info

Set the top limit for requested resources for any single job.

Maximum number of CPUs that can be requested for any single job.

hidden

type: integer

default: 16

Use to set an upper-limit for the CPU requirement for each process. Should be an integer e.g. --max_cpus 1

Maximum amount of memory that can be requested for any single job.

hidden

type: string

default: 128.GB

pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$

Use to set an upper-limit for the memory requirement for each process. Should be a string in the format integer-unit e.g. --max_memory '8.GB'

Maximum amount of time that can be requested for any single job.

hidden

type: string

default: 240.h

pattern: ^(\d+\.?\s*(s|m|h|day)\s*)+$

Use to set an upper-limit for the time requirement for each process. Should be a string in the format integer-unit e.g. --max_time '2.h'
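
The memory and time patterns given for the resource limits can be checked in the same way before a run. The following is a minimal sketch, assuming Python's re module interprets the patterns the same way as the pipeline's schema validation; the helper names are hypothetical.

```python
import re

# Patterns copied verbatim from the parameter descriptions above.
MEMORY_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")
TIME_PATTERN = re.compile(r"^(\d+\.?\s*(s|m|h|day)\s*)+$")


def is_valid_memory(value: str) -> bool:
    """Return True if the value satisfies the memory pattern (hypothetical helper)."""
    return MEMORY_PATTERN.match(value) is not None


def is_valid_time(value: str) -> bool:
    """Return True if the value satisfies the time pattern (hypothetical helper)."""
    return TIME_PATTERN.match(value) is not None


if __name__ == "__main__":
    for value in ("128.GB", "8 GB", "8G", "8gb"):
        print("memory", repr(value), is_valid_memory(value))
    for value in ("240.h", "1day 6h", "2 hours"):
        print("time", repr(value), is_valid_time(value))
```

Both patterns are case-sensitive, so a value such as 8gb does not match, and the memory pattern always requires the trailing B.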

Less common options for the pipeline, typically set in a config file.

Display help text.

hidden

type: boolean

Display version and exit.

hidden

type: boolean

Method used to save pipeline results to output directory.

hidden

type: string

The Nextflow publishDir option specifies which intermediate files should be saved to the output directory. This option tells the pipeline what method should be used to move these files. See Nextflow docs for details.

Email address for completion summary, only when pipeline fails.

hidden

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

An email address to send a summary email to when the pipeline is completed - ONLY sent if the pipeline does not exit successfully.

Send plain-text email instead of HTML.

hidden

type: boolean

MultiQC report title. Printed as page header, used for filename if not otherwise specified.

type: string

File size limit when attaching MultiQC reports to summary emails.

hidden

type: string

default: 25.MB

pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$

Do not use coloured log outputs.

hidden

type: boolean

Incoming hook URL for messaging service

hidden

type: string

Incoming hook URL for messaging service. Currently, MS Teams and Slack are supported.

Custom config file to supply to MultiQC.

hidden

type: string

Custom logo file to supply to MultiQC. The file name must also be set in the MultiQC config file.

hidden

type: string

Custom MultiQC yaml file containing HTML including a methods description.

type: string

Whether to validate parameters against the schema at runtime.

hidden

type: boolean

default: true

Show all params when using --help

hidden

type: boolean

By default, parameters set as hidden in the schema are not shown on the command line when a user runs with --help. Specifying this option will tell the pipeline to show all parameters.

nf-core/airrflow