nf-core/cageseq
CAGE-sequencing analysis pipeline with trimming, alignment and counting of CAGE tags.
22.10.6.
Define where the pipeline should find input data and save output data.
The output directory where the results will be saved. Use absolute paths when the storage is on cloud infrastructure.
Type: string
Email address for completion summary.
Type: string; pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
MultiQC report title. Printed as page header, used for filename if not otherwise specified.
Type: string
Parameters used to describe centralised config profiles. These should not be edited.
Git commit id for Institutional configs.
Type: string; default: master
Base directory for Institutional configs.
Type: string; default: https://raw.githubusercontent.com/nf-core/configs/master
Institutional config name.
Type: string
Institutional config description.
Type: string
Institutional config contact information.
Type: string
Institutional config URL link.
Type: string
Less common options for the pipeline, typically set in a config file.
Display version and exit.
Type: boolean
Method used to save pipeline results to output directory.
Type: string
Email address for completion summary, only when pipeline fails.
Type: string; pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
Send plain-text email instead of HTML.
Type: boolean
File size limit when attaching MultiQC reports to summary emails.
Type: string; default: 25.MB; pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
Do not use coloured log outputs.
Type: boolean
Incoming hook URL for messaging service.
Type: string
Custom config file to supply to MultiQC.
Type: string
Custom logo file to supply to MultiQC. The file name must also be set in the MultiQC config file.
Type: string
Custom MultiQC YAML file containing HTML, including a methods description.
Type: string
Whether to validate parameters against the schema at runtime.
Type: boolean; default: true
Base URL or local path to the location of pipeline test dataset files.
Type: string; default: https://raw.githubusercontent.com/nf-core/test-datasets/
Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.
Type: string
Run the whole pipeline.
Type: boolean; default: true
Run only the mapping part, up to bigwig or BAM files.
Type: boolean
Run only the CAGEr processing part, starting from bigwig or BAM files.
Type: boolean
Genome annotation file in GTF format.
Type: string
Path to the input file. Mutually exclusive with infolder.
Type: string
Path to the folder with fastq files. Mutually exclusive with input.
Type: string
Whether to save merged fasta files.
Type: boolean; default: true
Number of underscore-separated fields denoting the sample name when infolder is used.
Type: integer
Name of the reference genome. It is used as meta information.
Type: string
FASTA file containing a reference genome.
Type: string
Specifies a directory with a genome index.
Type: string
Sequencing platform used. Required for mapping with STAR.
Type: string
Name of the sequencing center. Required for mapping with STAR.
Type: boolean
Whether only uniquely mapped reads should be considered for downstream analysis.
Type: boolean; default: true
Whether to keep only those reads that start with a G base.
Type: boolean
Additional parameters that can be passed to TrimGalore!
Type: string
Makes the pipeline skip the G-trimming step in preprocessing.
Type: boolean
Switches the aligner from STAR to bowtie2.
Type: boolean
Switches on PCR duplicate removal.
Type: boolean
Sets an optical duplicate distance, used together with dedup.
Type: integer
The input CSV samplesheet including the name of the samples, their pairedness status, and the location of bigwig or bam files. Required when cageronly is true.
Type: string
Format of the mapping data file passed to the TSS analysis part when STAR is used (either ‘bam’ or ‘bigwig’).
Type: string; default: bigwig
Seed file for BSgenome forging.
Type: string
Directory containing either a set of FASTA files, one per reference chromosome, or a 2bit file for the whole reference genome. Used for BSgenome forging.
Type: string
BSgenome R package to use (if not forged).
Type: string
Threshold above which raw and normalized CTSS are considered for the correlation plot.
Type: integer; default: 1
Defines the lower threshold for fitting the power-law distribution.
Type: integer; default: 5
Defines the upper threshold for fitting the power-law distribution.
Type: integer; default: 10000
Method used for normalizing the samples: powerLaw, simpleTpm, and none are supported.
Type: string; default: powerLaw
User-specified alpha, the -1 * fitted slope in the log-log representation of the power-law distribution. If none, the average across samples is calculated and used.
Type: string
Total number of CAGE tags in the reference power-law distribution.
Type: integer; default: 1000000
Parameters for filtering lowly expressed CTSS before clustering. ctss_thr specifies the lower threshold above which CTSS are considered, and sample_num_thr specifies the number of samples in which this threshold should be passed.
Type: integer; default: 1
Parameters for filtering lowly expressed CTSS before clustering. ctss_thr specifies the lower threshold above which CTSS are considered, and sample_num_thr specifies the number of samples in which this threshold should be passed.
Type: integer; default: 1
Maximum distance for distance-based clustering (distclu).
Type: integer; default: 20
The TPM threshold above which even a single CTSS is kept during clustering.
Type: integer; default: 5
Defines the lower quantile boundary of the interquartile range.
Type: number; default: 0.1
Defines the upper quantile boundary of the interquartile range.
Type: number; default: 0.9
Threshold above which tag clusters are considered for the interquartile width distribution plot.
Type: integer; default: 3
Upstream distance to include in the TSS region for ChIPseeker annotation.
Type: integer; default: -3000
Downstream distance to include in the TSS region for ChIPseeker annotation.
Type: integer; default: 3000
The number of bases to include upstream of the TSS for TSS logos.
Type: integer; default: 35
Used for defining the consensus clusters. consensus_thr specifies the TPM threshold above which tag clusters are considered for consensus clusters, and consensus_dist defines the maximum distance between the interquartile ranges of tag clusters to be joined together into consensus clusters.
Type: integer; default: 2
Used for defining the consensus clusters. consensus_thr specifies the TPM threshold above which tag clusters are considered for consensus clusters, and consensus_dist defines the maximum distance between the interquartile ranges of tag clusters to be joined together into consensus clusters.
Type: integer; default: 100
Defines the balance threshold above which bidirectionality is considered balanced and enhancers are called.
Type: number; default: 0.95
Used for selecting only supported enhancers. unexpressed is a non-inclusive lower TPM boundary for expression when calculating support of enhancers. minSamples is a non-inclusive lower boundary for the number of samples where the clusters should show bidirectionality.
Type: integer
Used for selecting only supported enhancers. unexpressed is a non-inclusive lower TPM boundary for expression when calculating support of enhancers. minSamples is a non-inclusive lower boundary for the number of samples where the clusters should show bidirectionality.
Type: integer
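The two enhancer-support bounds described above are both non-inclusive, which is easy to get wrong by one. The following Python sketch illustrates how such a filter could behave; it is not the pipeline's implementation. The data layout, the function names, and the rule that a sample counts as bidirectional when both strands exceed unexpressed are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: cluster id -> one (plus_strand_tpm, minus_strand_tpm) pair per sample.
StrandTpm = Tuple[float, float]


def count_bidirectional_samples(samples: List[StrandTpm], unexpressed: float) -> int:
    """Count samples whose TPM is strictly above `unexpressed` on both strands.

    Treating "shows bidirectionality" as "both strands expressed" is an assumption.
    """
    return sum(
        1 for plus, minus in samples
        if plus > unexpressed and minus > unexpressed
    )


def select_supported(
    clusters: Dict[str, List[StrandTpm]],
    unexpressed: float,
    min_samples: int,
) -> List[str]:
    """Keep clusters that are bidirectional in strictly more than `min_samples` samples."""
    return [
        cluster_id
        for cluster_id, samples in clusters.items()
        if count_bidirectional_samples(samples, unexpressed) > min_samples
    ]


# Made-up example values, three samples per candidate cluster.
clusters = {
    "enh_a": [(1.2, 0.8), (0.5, 0.9), (0.0, 2.0)],
    "enh_b": [(0.0, 3.1), (0.0, 1.4), (0.2, 0.0)],
}
supported = select_supported(clusters, unexpressed=0.1, min_samples=1)
print(supported)
```

Because both comparisons are strict, a cluster that is bidirectional in exactly minSamples samples is dropped; in this toy data, raising min_samples from 1 to 2 removes enh_a as well.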