nf-core/kmermaid      
 k-mer similarity analysis pipeline
22.10.6.
Define where the pipeline should find input data and save output data.

Parameter added only to avoid an nf-core warning.
Type: string

Path to local or S3 directories containing R1,R2.fastq.gz files, separated by commas.
Type: string

Path to local or S3 directories of single-end read files, separated by commas.
Type: string

CSV file with columns id, read1, read2 for each sample.
Type: string

CSV file with columns id, read1 for each sample.
Type: string

Path to FASTA sequence files. Can be semicolon-separated.
Type: string

Path to protein FASTA inputs.
Type: string

Path to bam input.
Type: string

Path to input tgz folder containing bam and bai files.
Type: string

SRR, ERR, SRP IDs representing a project. Only compatible with Nextflow 19.03-edge or greater.
Type: string

Email address for completion summary.
Type: string
Pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Path to the output directory where the results will be saved.
Type: string

Sketch size options for sourmash compute
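
The sketch-size options in this section come in a flat form and a 1/N form, each with a log2 variant. The following Python sketch is an illustration only and is not taken from the pipeline. It assumes that a log2 value k stands for 2**k and that a 1/N sketch keeps roughly one hash in every N; the function names and the example sample sizes are hypothetical.

```python
def from_log2(exponent: int) -> int:
    """Convert a log2 option value to the value it stands for (2 ** exponent)."""
    return 2 ** exponent


def expected_hashes_flat(num_hashes: int, distinct_kmers: int) -> int:
    """A flat sketch keeps at most num_hashes hashes, however large the sample is."""
    return min(num_hashes, distinct_kmers)


def expected_hashes_scaled(scaled: int, distinct_kmers: int) -> int:
    """A 1/N sketch keeps about one hash in every `scaled`, so it grows with the sample."""
    return distinct_kmers // scaled


# Hypothetical samples with different numbers of distinct k-mers.
for distinct_kmers in (100_000, 10_000_000):
    flat = expected_hashes_flat(from_log2(10), distinct_kmers)
    scaled = expected_hashes_scaled(from_log2(10), distinct_kmers)
    print(distinct_kmers, flat, scaled)
```

Under these assumptions the flat sketch stays the same size for both samples, while the 1/N sketch grows with the number of distinct k-mers, which is the depth-scaling behaviour the descriptions refer to.
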
Number of hashes to use for making the sketches. Mutually exclusive with --sketch_num_hashes_log2.
Type: integer

Which log2 sketch sizes to use. Multiple are separated by commas. Mutually exclusive with --sketch_num_hashes.
Type: integer

Observe a 1/N fraction of the hashes per sample, rather than a flat N hashes per sample. This way, the number of hashes scales with the sequencing depth. Mutually exclusive with --sketch_scaled_log2.
Type: integer

Same as --sketch_scaled, but instead of specifying the true number, specify the power to which 2 is raised. Mutually exclusive with --sketch_scaled.
Type: integer

Options for kmer computation
Track the abundance of each hashed k-mer. This could be useful for cancer RNA-seq or ATAC-seq analyses.
Type: boolean

If provided, use SKA to compute split k-mer sketches instead of sourmash to compute k-mer sketches.
Type: boolean

Which nucleotide k-mer sizes to use. Multiple are separated by commas.
Type: string
Default: '21,27,33,51'

dna,protein,dayhoff
Type: string

Integer value to subsample reads from input fastq files.
Type: integer

Options to translate RNA-seq reads into protein-coding sequences.
Path to a well-curated FASTA file of protein sequences. Used to filter for coding reads.
Type: string

K-mer size to use for translating RNA into protein. The default is good for 'protein'. If using dayhoff, 15 is suggested.
Type: integer
Default: 9

Which molecular encoding to use for translating. If your reference proteome is quite different from your species of interest, dayhoff is suggested.
Type: string
Default: protein

Minimum fraction of overlapping translated k-mers from the read to match to the reference.
Type: string
Default: 0.95

Maximum table size for bloom filter creation.
Type: integer

Remove ribosomal RNA with SortMeRNA
If on, removes ribosomal RNA.
Type: boolean

If true, save the non-ribosomal RNA reads.
Type: boolean

Path to the rRNA database manifest txt file.
Type: string

Options to adjust parameters and filtering criteria for read alignments.
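
The barcode filter controlled by tenx_min_umi_per_cell in this section can be pictured with a short Python sketch. It is an illustration under stated assumptions, not the pipeline's implementation: the (barcode, UMI) records and the function name are hypothetical, and counting distinct UMIs is an assumption. Only the strictly-greater-than comparison against tenx_min_umi_per_cell comes from the parameter description.

```python
from collections import defaultdict


def valid_barcodes(records, tenx_min_umi_per_cell):
    """Return barcodes whose number of distinct UMIs exceeds the threshold."""
    umis_per_barcode = defaultdict(set)
    for barcode, umi in records:
        umis_per_barcode[barcode].add(umi)
    return {
        barcode
        for barcode, umis in umis_per_barcode.items()
        if len(umis) > tenx_min_umi_per_cell  # strictly greater, as described
    }


# Hypothetical (barcode, UMI) pairs, as might be read from 10x alignments.
records = [
    ("AAAC", "u1"), ("AAAC", "u2"), ("AAAC", "u2"), ("AAAC", "u3"),
    ("TTTG", "u1"), ("TTTG", "u1"),
]
print(sorted(valid_barcodes(records, tenx_min_umi_per_cell=2)))
```

In this sketch a threshold of zero keeps every barcode that has at least one UMI, so no barcode is filtered out.
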
A barcode is considered a valid barcode read, and its signature is written, only if its number of UMIs is greater than tenx_min_umi_per_cell.
Type: integer

Number of alignments to contain in each sharded bam file.
Type: integer

For bam files, optional absolute path to a .tsv barcodes file if the input is an unfiltered 10x bam file.
Type: string

For bam files, optional absolute path to a tab-separated .tsv file mapping each 10x barcode name to a new name, e.g. with a channel or cell annotation label.
Type: string

For bam files, name of the CSV file, relative to outdir/barcode_metadata, to which the number of reads and number of UMIs per barcode are written. This CSV file contains only a header when tenx_min_umi_per_cell is zero, i.e. reads and UMIs per barcode are calculated only when the barcodes are filtered based on tenx_min_umi_per_cell.
Type: string

Path inside the output directory where the FASTA files for single barcodes are saved.
Type: string

10x SAM tags.
Type: string

10x cell pattern.
Type: string

10x UMI pattern.
Type: string

Options to skip various steps within the workflow.
Skip fastp trimming of reads.
Type: boolean

Skip sourmash compute.
Type: boolean

Skip sourmash compare.
Type: boolean

Skip MultiQC.
Type: boolean

Parameters used to describe centralised config profiles. These should not be edited.
Git commit id for Institutional configs.
Type: string
Default: master

Base directory for Institutional configs.
Type: string
Default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional configs hostname.
Type: string

Institutional config description.
Type: string

Institutional config contact information.
Type: string

Institutional config URL link.
Type: string

Set the top limit for requested resources for any single job.
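
How the pipeline enforces these limits is not described here. As a rough illustration of what a per-job ceiling means, the Python sketch below caps a requested value at a maximum written in the same number.unit style as the defaults in this section (for example 128.GB or 240.h). The function names are hypothetical, and the sketch only compares values that use the same unit.

```python
def parse_quantity(text):
    """Split a value such as '128.GB' or '240.h' into (number, unit)."""
    number, unit = text.rsplit(".", 1)
    return float(number), unit


def cap_request(requested, limit):
    """Return the request unchanged if it fits under the limit, else the limit."""
    requested_value, requested_unit = parse_quantity(requested)
    limit_value, limit_unit = parse_quantity(limit)
    if requested_unit != limit_unit:
        raise ValueError("this sketch only compares values given in the same unit")
    return requested if requested_value <= limit_value else limit


print(cap_request("200.GB", "128.GB"))  # request above the ceiling
print(cap_request("64.GB", "128.GB"))   # request below the ceiling
print(cap_request("300.h", "240.h"))    # time request above the ceiling
```
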
Maximum number of CPUs that can be requested for any single job.
Type: integer
Default: 16

Maximum amount of memory that can be requested for any single job.
Type: string
Default: 128.GB

Maximum amount of time that can be requested for any single job.
Type: string
Default: 240.h

Less common options for the pipeline, typically set in a config file.
Display help text.
Type: boolean

Method used to save pipeline results to output directory.
Type: string

Workflow name.
Type: string

Email address for completion summary, only when pipeline fails.
Type: string
Pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.
Type: boolean

File size limit when attaching MultiQC reports to summary emails.
Type: string
Default: 25.MB

Do not use coloured log outputs.
Type: boolean

Custom config file to supply to MultiQC.
Type: string

Directory to keep pipeline Nextflow logs and reports.
Type: string
Default: ${params.outdir}/pipeline_info

Arguments passed to Nextflow clusterOptions.
Type: string

Run this workflow with Conda.
Type: boolean

Test paths for input reads.
Type: string

Test paths for fastas.
Type: string

Test paths for protein fastas.
Type: string
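
Both email parameters above list the same validation pattern. A quick way to see what it accepts is to try it in Python, as in the sketch below. The pattern is copied from this page; the helper name and the sample addresses are hypothetical, and Python's `re` module stands in for whatever validator the pipeline tooling actually uses.

```python
import re

# Pattern copied verbatim from the email parameter descriptions.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_valid_email(address):
    """Return True if the address matches the documented pattern."""
    return EMAIL_PATTERN.match(address) is not None


for address in (
    "user@example.com",
    "first.last@sub.example.org",
    "user@example",         # no top-level domain
    "user@lab.technology",  # final label longer than five letters
):
    print(address, is_valid_email(address))
```

The pattern limits the final label to between two and five letters, so addresses on longer top-level domains would not match it.
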