Description

Post-process transcript-level quantification results using tximport to produce count matrices and SummarizedExperiment objects

Input

name
description
pattern

meta

Groovy Map containing study-level sample sheet information, e.g.
[ id:'SRP1234' ].

samplesheet

Sample sheet, stored in the colData of the SummarizedExperiment
objects.

*.{csv,tsv}

quant_results

Per-sample quantification results. For Salmon these are result
directories containing quant.sf, for Kallisto directories containing
abundance.tsv, and for RSEM the .isoforms.results files. All samples
must have been quantified against the same transcriptome, as only a
single sample is used for transcript-to-gene mapping discovery. If
support for mixed transcriptomes is needed in future, tx2gene would
need to run independently per sample.

gtf

Channel with features in GTF format, used to generate transcript/gene
mappings via tx2gene.

gtf_id_attribute

Attribute in GTF file corresponding to the gene identifier.

gtf_extra_attribute

Alternative gene attribute in the GTF file (e.g. gene_name).

quant_type

Quantification tool type. One of 'salmon', 'kallisto', or 'rsem'.

skip_merge

Skip cross-sample merging. When true, tximport runs once per sample
instead of once over all collected samples, and SummarizedExperiment
creation is skipped. Useful for very large cohorts.
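The file layout described for quant_results and the role of the GTF attributes can be illustrated with a small Python sketch. It is not the subworkflow's implementation: the names quant_file and tx2gene_rows are hypothetical, the default attribute names gene_id and gene_name are assumptions, and so is the fallback to the gene identifier when the extra attribute is missing.

```python
import re
from pathlib import Path

# File expected for each quant_type, per the quant_results description.
# None means the quant_results entry is itself the results file (RSEM).
QUANT_FILES = {"salmon": "quant.sf", "kallisto": "abundance.tsv", "rsem": None}


def quant_file(result, quant_type):
    """Return the file to read for one quant_results entry (hypothetical helper)."""
    if quant_type not in QUANT_FILES:
        raise ValueError(f"quant_type must be one of {sorted(QUANT_FILES)}")
    path = Path(result)
    name = QUANT_FILES[quant_type]
    if name is None:
        if not path.name.endswith(".isoforms.results"):
            raise ValueError(f"expected an .isoforms.results file, got {path.name}")
        return path
    return path / name


# Matches GTF attribute pairs such as: gene_id "G1";
ATTRIBUTE = re.compile(r'(\S+) "([^"]*)"')


def tx2gene_rows(gtf_lines, gtf_id_attribute="gene_id", gtf_extra_attribute="gene_name"):
    """Collect one (transcript, gene id, extra attribute) row per transcript."""
    rows = []
    seen = set()
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # header or comment line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attrs = dict(ATTRIBUTE.findall(fields[8]))
        tx = attrs.get("transcript_id")
        gene = attrs.get(gtf_id_attribute)
        if tx is None or gene is None or tx in seen:
            continue  # gene-only records and repeated transcripts are skipped
        seen.add(tx)
        # Assumption: fall back to the gene identifier if the extra attribute is absent.
        rows.append((tx, gene, attrs.get(gtf_extra_attribute, gene)))
    return rows
```

With quant_type set to 'salmon', quant_file("results/sampleA", "salmon") points at quant.sf inside the sample directory, whereas for 'rsem' the entry is returned unchanged because it already names the .isoforms.results file.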

Output

name
description
pattern

tx2gene

Transcript-to-gene mapping file generated from the GTF.

*.tx2gene.tsv

tpm_gene

Gene-level matrix of abundance values in TPM.

*.gene_tpm.tsv

counts_gene

Gene-level matrix of unadjusted estimated counts from tximport
(countsFromAbundance = 'no').

*.gene_counts.tsv

lengths_gene

Gene-level matrix of effective length values.

*.gene_lengths.tsv

counts_gene_length_scaled

Gene-level matrix of estimated counts, generated from abundance (TPM)
values scaled by the average transcript length and then to library
size, using tximport with
countsFromAbundance = 'lengthScaledTPM'.

*.gene_counts_length_scaled.tsv

counts_gene_scaled

Gene-level matrix of estimated counts, generated from abundance (TPM)
values by scaling to library size, using tximport with
countsFromAbundance = 'scaledTPM'.

*.gene_counts_scaled.tsv

tpm_transcript

Transcript-level matrix of abundance values in TPM.

*.transcript_tpm.tsv

counts_transcript

Transcript-level matrix of unadjusted estimated counts from tximport
(countsFromAbundance = 'no').

*.transcript_counts.tsv

lengths_transcript

Transcript-level matrix of effective length values.

*.transcript_lengths.tsv

merged_gene_rds

Serialised SummarizedExperiment object containing gene-level assays
(counts, length-scaled counts, scaled counts, lengths, TPM).

*.rds

merged_transcript_rds

Serialised SummarizedExperiment object containing transcript-level
assays (counts, lengths, TPM).

*.rds
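The difference between the three count outputs (countsFromAbundance = 'no', 'scaledTPM' and 'lengthScaledTPM') is easier to see in a small worked example. The Python sketch below follows the descriptions above and is not tximport's code: the function name counts_from_abundance is hypothetical, matrices are plain lists of rows (features) by columns (samples), and the "average transcript length" is assumed to be each feature's length averaged over samples.

```python
def counts_from_abundance(counts, tpm, lengths, method):
    """Rescale TPM into count-like values whose column sums match the original counts."""
    if method == "no":
        # Unadjusted estimated counts are passed through unchanged.
        return [row[:] for row in counts]
    if method == "scaledTPM":
        new = [row[:] for row in tpm]
    elif method == "lengthScaledTPM":
        # Multiply each feature's TPM by its length averaged over samples (assumption).
        new = [
            [value * (sum(len_row) / len(len_row)) for value in tpm_row]
            for tpm_row, len_row in zip(tpm, lengths)
        ]
    else:
        raise ValueError(f"unsupported method: {method}")

    n_samples = len(counts[0])
    # Library size per sample, taken from the original estimated counts.
    counts_sum = [sum(row[j] for row in counts) for j in range(n_samples)]
    new_sum = [sum(row[j] for row in new) for j in range(n_samples)]
    # Guard against empty samples to avoid dividing by zero.
    scale = [counts_sum[j] / new_sum[j] if new_sum[j] else 0.0 for j in range(n_samples)]
    return [[row[j] * scale[j] for j in range(n_samples)] for row in new]


# Two features, two samples; all values are made up for illustration.
counts = [[10, 20], [30, 20]]
tpm = [[400000, 600000], [600000, 400000]]
lengths = [[1000, 1000], [2000, 2000]]

scaled = counts_from_abundance(counts, tpm, lengths, "scaledTPM")
length_scaled = counts_from_abundance(counts, tpm, lengths, "lengthScaledTPM")
```

In both scaled variants each sample's column still sums to its original library size; only the distribution of counts across features changes, and 'lengthScaledTPM' additionally gives longer features proportionally more of it.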