Define where the pipeline should find input data and save output data.

Path to fastq input files

required
type: string

Path to the Kraken2 database directory

type: string

Path to sourcepredict labels file

type: string
default: /Users/maxime/Documents/github/coproid/data/sourcepredict/modern_gut_microbiomes_labels.csv

Path to sourcepredict sources file

type: string
default: /Users/maxime/Documents/github/coproid/data/sourcepredict/modern_gut_microbiomes_sources.csv

The output directory where the results will be saved.

type: string
default: ./results

Email address for completion summary.

type: string
pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Set this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the workflow exits. If set in your user config file (~/.nextflow/config) then you don't need to specify this on the command line for every run.
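The documented pattern can be checked before launching a run. The following is a minimal Python sketch, not part of the pipeline; the helper name `is_accepted_email` is hypothetical. It applies the regular expression exactly as listed, which also shows one consequence of the pattern: addresses whose top-level domain is longer than five letters are rejected.

```python
import re

# Regular expression copied verbatim from the documented `pattern` for --email.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_accepted_email(address: str) -> bool:
    """Return True if `address` satisfies the documented pattern.

    Hypothetical helper for illustration; the pipeline performs its own validation.
    """
    return EMAIL_PATTERN.fullmatch(address) is not None


if __name__ == "__main__":
    for candidate in ["user@example.com", "user@example", "user@example.museum"]:
        print(candidate, is_accepted_email(candidate))
```

Only the first of the three sample addresses passes: the second has no top-level domain, and the third has one with six letters.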

Options for the reference genome indices used to align reads.

Directory / URL base for iGenomes references.

hidden
type: string
default: s3://ngi-igenomes/igenomes

Do not load the iGenomes reference config.

hidden
type: boolean

Do not load igenomes.config when running the pipeline. You may choose this option if you observe clashes between custom parameters and those supplied in igenomes.config.

Name of candidate species 1

type: string

Name of candidate species 2

type: string

Name of candidate species 3

type: string

iGenome name for target genome 1

type: string

Must be provided if fasta1 is not provided

iGenome name for target genome 2

type: string

Must be provided if fasta2 is not provided

iGenome name for target genome 3

type: string

Must be provided if fasta3 is not provided

Path to Bowtie2 pre-indexed genome candidate 1

type: string

Path to Bowtie2 pre-indexed genome candidate 2

type: string

Path to Bowtie2 pre-indexed genome candidate 3

type: string

Fasta reference of genome candidate 1

type: string

Must be provided if genome1 is not provided

Fasta reference of genome candidate 2

type: string

Must be provided if genome2 is not provided

Fasta reference of genome candidate 3

type: string

Must be provided if genome3 is not provided
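The rule stated above (each candidate needs either an iGenome name or a Fasta reference) can be sketched as a small pre-flight check. This is an illustrative Python sketch, not the pipeline's own validation: the function name `missing_references` is hypothetical, the example values are placeholders, and checking all three candidates by default is an assumption taken from the wording of the descriptions.

```python
from typing import Dict, Iterable, List, Optional


def missing_references(
    params: Dict[str, Optional[str]],
    candidates: Iterable[int] = (1, 2, 3),
) -> List[int]:
    """Return the candidate numbers that have neither genomeN nor fastaN set.

    Hypothetical helper; `params` maps parameter names (genome1, fasta1, ...)
    to their values, with missing or empty values treated as "not provided".
    """
    missing = []
    for n in candidates:
        genome = params.get(f"genome{n}")
        fasta = params.get(f"fasta{n}")
        if not genome and not fasta:
            missing.append(n)
    return missing


if __name__ == "__main__":
    # Placeholder values: an iGenome name for candidate 1, a Fasta path for candidate 2.
    example = {"genome1": "GRCh37", "fasta2": "/path/to/candidate2.fa"}
    print(missing_references(example))
```

Passing a narrower `candidates` tuple, for example `(1, 2)`, restricts the check to the candidates actually used in a run.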

Less common options for the pipeline, typically set in a config file.

Display help text.

hidden
type: boolean

Email address for completion summary, only when pipeline fails.

hidden
type: string
pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

This works exactly like --email, except that emails are sent only if the workflow fails.

Send plain-text email instead of HTML.

hidden
type: boolean

Set this to receive plain-text e-mails instead of HTML-formatted ones.

File size limit when attaching MultiQC reports to summary emails.

hidden
type: string
default: 25.MB

If a file generated by the pipeline exceeds this threshold, it will not be attached.

Do not use coloured log outputs.

hidden
type: boolean

Set to disable colourful command line output and live life in monochrome.

Custom config file to supply to MultiQC.

hidden
type: string

Directory to keep pipeline Nextflow logs and reports.

hidden
type: string
default: ${params.outdir}/pipeline_info

Name of pipeline run

type: string

Set the top limit for requested resources for any single job.

Maximum number of CPUs that can be requested for any single job.

hidden
type: integer
default: 16

Use this to set an upper limit on the CPUs requested by any single process. It should be an integer, e.g. --max_cpus 1

Maximum amount of memory that can be requested for any single job.

hidden
type: string
default: 128.GB
pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$

Use this to set an upper limit on the memory requested by any single process. It should be a string in number-unit format, e.g. --max_memory '8.GB'
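To see which spellings the documented memory pattern accepts, the regular expression can be tried directly. This is a minimal Python sketch under the assumption that the listed `pattern` is the only constraint; the helper name `is_valid_max_memory` is hypothetical.

```python
import re

# Regular expression copied verbatim from the documented `pattern` for --max_memory.
MAX_MEMORY_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")


def is_valid_max_memory(value: str) -> bool:
    """Return True if `value` satisfies the documented pattern (hypothetical helper)."""
    return MAX_MEMORY_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for value in ["128.GB", "8.GB", "8 GB", "8G", "8.gb"]:
        print(repr(value), is_valid_max_memory(value))
```

The pattern is case-sensitive and requires the trailing `B`, so `8G` and `8.gb` are rejected while `8.GB` and `8 GB` are accepted.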

Maximum amount of time that can be requested for any single job.

hidden
type: string
default: 240.h
pattern: ^(\d+\.?\s*(s|m|h|day)\s*)+$

Use this to set an upper limit on the time requested by any single process. It should be a string in number-unit format, e.g. --max_time '2.h'
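The time pattern allows several number-unit groups in one value and recognises only the units `s`, `m`, `h` and `day`. The Python sketch below tries a few values against the documented expression; the helper name `is_valid_max_time` is hypothetical, and the check covers the pattern only, not how Nextflow later interprets the duration.

```python
import re

# Regular expression copied verbatim from the documented `pattern` for --max_time.
MAX_TIME_PATTERN = re.compile(r"^(\d+\.?\s*(s|m|h|day)\s*)+$")


def is_valid_max_time(value: str) -> bool:
    """Return True if `value` satisfies the documented pattern (hypothetical helper)."""
    return MAX_TIME_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for value in ["240.h", "2.h", "1day 12h", "2 hours", "2.5h"]:
        print(repr(value), is_valid_max_time(value))
```

Values such as `2 hours` or `2.5h` fail because the pattern accepts neither spelled-out units nor fractional numbers.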

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden
type: string
default: master

Provide the git commit id of the custom Institutional configs hosted at nf-core/configs. Pinning a commit id makes runs reproducible. Default: master.

## Download and use config file with following git commit id  
--custom_config_version d52db660777c4bf36546ddb188ec530c3ada1b96  

Base directory for Institutional configs.

hidden
type: string
default: https://raw.githubusercontent.com/nf-core/configs/master

If you're running offline, Nextflow will not be able to fetch the institutional config files from the internet. If you don't need them, this is not a problem. If you do need them, download the files from the repo and tell Nextflow where to find them with the custom_config_base option. For example:

## Download and unzip the config files  
cd /path/to/my/configs  
wget https://github.com/nf-core/configs/archive/master.zip  
unzip master.zip  
  
## Run the pipeline  
cd /path/to/my/data  
nextflow run /path/to/pipeline/ --custom_config_base /path/to/my/configs/configs-master/  

Note that the nf-core/tools helper package has a download command that fetches all required pipeline files, Singularity containers and institutional configs in one go, which makes this process easier.

Institutional configs hostname.

hidden
type: string

Institutional config description.

hidden
type: string

Institutional config contact information.

hidden
type: string

Institutional config URL link.

hidden
type: string

Specifies that the input is single-end reads.

type: boolean

By default, the pipeline expects paired-end data. If you have single-end data, you need to specify --single_end on the command line when you launch the pipeline. A normal glob pattern, enclosed in quotation marks, can then be used for --input. For example:

--single_end --input '*.fastq'  

It is not possible to run a mixture of single-end and paired-end files in one run.
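To preview which files a quoted glob such as `'*.fastq'` would select, the pattern can be tested against a list of file names. This is a rough Python sketch using the standard `fnmatch` module; Nextflow resolves globs with its own rules, so treat the result as an approximation. The function name `select_inputs` and the file names are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def select_inputs(file_names: Iterable[str], pattern: str) -> List[str]:
    """Return the file names matching a shell-style glob, sorted alphabetically.

    Hypothetical helper; matching is case-sensitive on every platform.
    """
    return sorted(name for name in file_names if fnmatchcase(name, pattern))


if __name__ == "__main__":
    names = ["sampleB.fastq", "sampleA.fastq", "sampleC.fastq.gz", "notes.txt"]
    print(select_inputs(names, "*.fastq"))
```

In this sketch the compressed file `sampleC.fastq.gz` is not selected by `*.fastq`; a pattern such as `*.fastq.gz` would be needed to match it.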

Specifies whether AdapterRemoval should merge the paired-end sequences.

type: string
default: true

Phred quality encoding

type: integer

Identity threshold to retain read alignment.

type: number
default: 0.95

Specifies whether the data is modern DNA (false) or ancient DNA (true).

type: string
default: true

Minimum PMD score to retain read alignment.

type: integer
default: 3

DNA preparation library type

type: string

Bowtie2 sensitivity settings

type: string

Minimum number of Kraken hits per Taxonomy ID to report

type: integer
default: 50

Proportion of endogenous DNA in organism 1 target microbiome

type: number
default: 0.01

Proportion of endogenous DNA in organism 2 target microbiome

type: number
default: 0.01

Proportion of endogenous DNA in organism 3 target microbiome

type: number
default: 0.01

Sourcepredict normalization method

type: string
default: gmpr

Sourcepredict number of embedding dimensions

type: integer
default: 2

Sourcepredict embedding method

type: string
default: mds

Sourcepredict number of neighbours

type: string
default: all