nf-core/kmermaid
k-mer similarity analysis pipeline
22.10.6
Define where the pipeline should find input data and save output data.
Path to fastq.gz files in quotes
string
Test paths for input reads
string
Test paths for fastas
string
Test paths for protein fastas
string
Path to local or S3 directories containing R1,R2.fastq.gz files, separated by commas.
string
Path to local or S3 directories of single-end read files, separated by commas.
string
CSV file with columns id, read1, read2 for each sample
string
CSV file with columns id, read1 for each sample
string
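The two CSV layouts described above can be produced with a short script. This is a minimal sketch: the sample name and file paths are hypothetical, and whether the pipeline expects a header row is an assumption based only on the column names listed here.

```python
import csv
import io


def write_samples_csv(rows, paired=True):
    """Return CSV text with columns id,read1[,read2], one row per sample."""
    fields = ["id", "read1", "read2"] if paired else ["id", "read1"]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
    writer.writeheader()  # assumption: a header row is expected
    for row in rows:
        # Keep only the columns that belong to the chosen layout.
        writer.writerow({key: row[key] for key in fields})
    return buf.getvalue()


# Hypothetical sample entry, for illustration only.
paired_rows = [
    {
        "id": "sample1",
        "read1": "data/sample1_R1.fastq.gz",
        "read2": "data/sample1_R2.fastq.gz",
    },
]

print(write_samples_csv(paired_rows))
print(write_samples_csv(paired_rows, paired=False))
```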
Path to FASTA sequence files. Multiple paths can be separated by semicolons.
string
Path to protein fasta inputs.
string
Path to bam input.
string
Path to input tgz folder containing bam and bai files.
string
SRR, ERR, SRP IDs representing a project. Only compatible with Nextflow 19.03-edge or greater
string
Email address for completion summary.
string
^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
Set this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the workflow exits. If set in your user config file (~/.nextflow/config) then you don't need to specify this on the command line for every run.
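An address can be checked against the validation pattern listed above before launching a run. The sketch below copies that pattern verbatim; the helper name `is_valid_email` and the example addresses are illustrative. Note that the pattern only accepts a final domain label of 2 to 5 letters, so addresses under longer top-level domains would be rejected.

```python
import re

# Pattern copied from the parameter definition above.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_valid_email(address):
    """Return True if the address matches the parameter's pattern."""
    return EMAIL_PATTERN.match(address) is not None


for candidate in ("user@example.com", "user@example", "user@example.technology"):
    print(candidate, is_valid_email(candidate))
```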
Path to the output directory where the results will be saved.
string
Sketch size options for sourmash compute
Number of hashes to use for making the sketches. Mutually exclusive with --sketch_num_hashes_log2
integer
Which log2 sketch sizes to use. Multiple are separated by commas. Mutually exclusive with --sketch_num_hashes
integer
Keep 1 in every N hashes per sample, rather than a fixed number of hashes per sample. This way, the number of hashes scales with the sequencing depth. Mutually exclusive with --sketch_scaled_log2
integer
Same as --sketch_scaled, but instead of giving the value directly, give the exponent to which 2 is raised. Mutually exclusive with --sketch_scaled
integer
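The relationship between the plain and log2 forms of these options can be sketched as follows. The helper names are hypothetical and this is not the pipeline's own code; it only illustrates the mutual-exclusivity rule and the power-of-two conversion described above. The `expected_hashes` estimate is a rough approximation of what a scaled sketch retains.

```python
def resolve_pair(value, value_log2, name):
    """Resolve an option given either directly or as a power of 2."""
    if value is not None and value_log2 is not None:
        raise ValueError(f"--{name} and --{name}_log2 are mutually exclusive")
    if value_log2 is not None:
        return 2 ** value_log2
    return value


def expected_hashes(n_distinct_kmers, scaled):
    """Approximate number of hashes kept when 1 in every `scaled` is observed."""
    return n_distinct_kmers // scaled


print(resolve_pair(None, 10, "sketch_num_hashes"))  # exponent 10 -> 2**10 hashes
print(resolve_pair(500, None, "sketch_num_hashes"))
print(expected_hashes(1_000_000, resolve_pair(None, 10, "sketch_scaled")))
```

With a fixed number of hashes every sample yields a sketch of the same size, whereas with a scaled value the sketch grows with the number of distinct k-mers in the sample.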
Options for kmer computation
Track the abundance of each hashed k-mer. This could be useful for cancer RNA-seq or ATAC-seq analyses
boolean
If provided, use SKA to compute split k-mer sketches instead of sourmash to compute k-mer sketches
boolean
Which nucleotide k-mer sizes to use. Multiple are separated by commas
string
'21,27,33,51'
dna,protein,dayhoff
string
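Both of the options above take comma-separated strings. A minimal sketch of how such values split into lists is shown below; `parse_csv_option` is a hypothetical helper, and the pipeline's own parsing may differ.

```python
def parse_csv_option(value, cast=str):
    """Split a comma-separated option string, dropping empty items."""
    return [cast(item.strip()) for item in value.split(",") if item.strip()]


ksizes = parse_csv_option("21,27,33,51", int)
molecules = parse_csv_option("dna,protein,dayhoff")

print(ksizes)
print(molecules)
```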
Integer value to subsample reads from input fastq files
integer
Options to translate RNA-seq reads into protein-coding sequences.
Path to a well-curated fasta file of protein sequences. Used to filter for coding reads
string
K-mer size to use for translating RNA into protein. The default of 9 is good for 'protein'. If using dayhoff, 15 is suggested
integer
9
Which molecular encoding to use for translating. If your reference proteome is quite different from your species of interest, using dayhoff is suggested
string
protein
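To illustrate why a reduced alphabet tolerates divergent proteomes, the sketch below maps amino acids onto the six Dayhoff classes as they are commonly defined. The letter codes a to f and the handling of unknown residues are assumptions for illustration, not taken from the pipeline.

```python
# Six Dayhoff classes as commonly defined; codes a-f are a convention.
DAYHOFF_GROUPS = {
    "C": "a",
    "AGPST": "b",
    "DENQ": "c",
    "HKR": "d",
    "ILMV": "e",
    "FWY": "f",
}
DAYHOFF = {aa: code for group, code in DAYHOFF_GROUPS.items() for aa in group}


def to_dayhoff(peptide, unknown="X"):
    """Re-encode a peptide in the six-letter Dayhoff alphabet."""
    return "".join(DAYHOFF.get(aa, unknown) for aa in peptide.upper())


# Two peptides that differ only by substitutions within the same class
# become identical after encoding.
print(to_dayhoff("LKD"), to_dayhoff("IRE"))
```

Because similar residues collapse to one symbol, k-mers still match across species where the exact amino acids have drifted, which is also why a longer k-mer size is suggested for this encoding.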
Minimum fraction of overlapping translated k-mers from the read to match to the reference.
string
0.95
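The threshold above can be read as a fraction of a read's translated k-mers found in the reference. The following is a simplified sketch using exact Python sets in place of a bloom filter; the function names and example sequences are hypothetical, and treating the threshold as inclusive is an assumption based on the word "minimum".

```python
def kmers(seq, k):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def fraction_in_reference(read_peptide, reference_kmers, k):
    """Fraction of the read's k-mers that also occur in the reference."""
    read_kmers = kmers(read_peptide, k)
    if not read_kmers:
        return 0.0
    return len(read_kmers & reference_kmers) / len(read_kmers)


def is_coding(read_peptide, reference_kmers, k=9, threshold=0.95):
    """Classify a translated read as coding if enough k-mers match."""
    return fraction_in_reference(read_peptide, reference_kmers, k) >= threshold


# Toy example with k=3 so the sequences stay short.
reference = kmers("MKVLAAGIVGLLLAQ", 3)
print(fraction_in_reference("KVLAAG", reference, 3))
print(fraction_in_reference("KVLWWW", reference, 3))
```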
Maximum table size for bloom filter creation
integer
Remove ribosomal RNA with SortMeRNA
If on, removes ribosomal RNA
boolean
If true, save non-ribosomal RNA reads
boolean
Path to rRNA database manifest txt file
string
Options to adjust parameters and filtering criteria for read alignments.
A barcode is only considered valid, and its signature is only written, if its number of UMIs is greater than tenx_min_umi_per_cell
integer
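The filtering rule above can be sketched as counting distinct UMIs per barcode and keeping barcodes whose count exceeds the threshold. This is an illustration only: the `valid_barcodes` helper and the (barcode, UMI) pairs are hypothetical, and the pipeline reads these values from BAM tags rather than from a Python list.

```python
from collections import defaultdict


def valid_barcodes(records, min_umi_per_cell):
    """Return barcodes whose distinct UMI count is greater than the threshold."""
    umis_per_barcode = defaultdict(set)
    for barcode, umi in records:
        umis_per_barcode[barcode].add(umi)
    return {
        barcode
        for barcode, umis in umis_per_barcode.items()
        if len(umis) > min_umi_per_cell
    }


# Hypothetical (barcode, UMI) pairs; the repeated UMI is counted once.
records = [("AAAC", "u1"), ("AAAC", "u2"), ("AAAC", "u2"), ("TTTG", "u1")]
print(valid_barcodes(records, min_umi_per_cell=1))
```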
For BAM files: optional absolute path to a .tsv barcodes file, used if the input is an unfiltered 10x BAM file
string
For BAM files: optional absolute path to a tab-separated .tsv file mapping each 10x barcode name to a new name, e.g. with a channel or cell annotation label
string
For BAM files: CSV file name, relative to outdir/barcode_metadata, to which the number of reads and number of UMIs per barcode are written. This file contains only a header when tenx_min_umi_per_cell is zero, because reads and UMIs per barcode are calculated only when barcodes are filtered based on tenx_min_umi_per_cell
string
Path, inside the output directory where the results will be saved, in which to save the FASTA files for each single barcode.
string
10x sam tags
string
10x Cell pattern
string
10x UMI pattern
string
Options to skip various steps within the workflow.
Skip fastp trimming of reads
boolean
Skip sourmash compute.
boolean
Skip sourmash compare.
boolean
Skip merging aligned+unaligned reads per cell (keep aligned/unaligned separate)
boolean
Skip MultiQC.
boolean
Parameters used to describe centralised config profiles. These should not be edited.
Git commit id for Institutional configs.
string
master
Base directory for Institutional configs.
string
https://raw.githubusercontent.com/nf-core/configs/master
If you're running offline, Nextflow will not be able to fetch the institutional config files from the internet. If you don't need them, then this is not a problem. If you do need them, you should download the files from the repo and tell Nextflow where to find them with this parameter.
Institutional configs hostname.
string
Institutional config description.
string
Institutional config contact information.
string
Institutional config URL link.
string
Set the top limit for requested resources for any single job.
Maximum number of CPUs that can be requested for any single job.
integer
16
Use to set an upper limit for the CPU requirement for each process. Should be an integer, e.g. --max_cpus 1
Maximum amount of memory that can be requested for any single job.
string
128.GB
Use to set an upper limit for the memory requirement for each process. Should be a string in the format integer-unit, e.g. --max_memory '8.GB'
Maximum amount of time that can be requested for any single job.
string
240.h
Use to set an upper limit for the time requirement for each process. Should be a string in the format integer-unit, e.g. --max_time '2.h'
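The effect of these three limits can be sketched as parsing the integer-unit strings and capping each request at the configured maximum. This is a simplified illustration rather than the pipeline's own logic; the helper names are hypothetical, and the unit tables assume 1024-based memory units and a small set of time suffixes.

```python
import re

# Assumed unit tables for illustration.
MEMORY_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}
TIME_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_quantity(text, units):
    """Parse an integer-unit string such as '128.GB' or '240.h'."""
    match = re.fullmatch(r"(\d+)\.?\s*([A-Za-z]+)", text.strip())
    if match is None or match.group(2) not in units:
        raise ValueError(f"Cannot parse quantity: {text!r}")
    return int(match.group(1)) * units[match.group(2)]


def cap(requested, maximum):
    """Never hand a job more than the configured maximum."""
    return min(requested, maximum)


max_memory = parse_quantity("128.GB", MEMORY_UNITS)
max_time = parse_quantity("240.h", TIME_UNITS)

print(cap(parse_quantity("200.GB", MEMORY_UNITS), max_memory) == max_memory)
print(cap(parse_quantity("2.h", TIME_UNITS), max_time))
print(cap(32, 16))  # CPUs are plain integers
```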
Less common options for the pipeline, typically set in a config file.
Display help text.
boolean
Method used to save pipeline results to output directory.
string
The Nextflow publishDir option specifies which intermediate files should be saved to the output directory. This option tells the pipeline what method should be used to move these files. See Nextflow docs for details.
Workflow name.
string
A custom name for the pipeline run. Unlike the core nextflow -name option with one hyphen, this parameter can be reused multiple times, for example if using -resume. Passed through to steps such as MultiQC and used for things like report filenames and titles.
Email address for completion summary, only when pipeline fails.
string
^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
An email address to send a summary email to when the pipeline is completed - ONLY sent if the pipeline does not exit successfully.
Send plain-text email instead of HTML.
boolean
File size limit when attaching MultiQC reports to summary emails.
string
25.MB
Do not use coloured log outputs.
boolean
Custom config file to supply to MultiQC.
string
Directory to keep pipeline Nextflow logs and reports.
string
${params.outdir}/pipeline_info