Description

Filter a matrix based on a minimum value and the number of samples that must pass it.

Input

Name (Type)
Description
Pattern

meta (map)

Groovy Map containing information on the matrix to be filtered, at a
minimum an id, e.g. [ id:'test' ]

abundance (file)

Raw TSV or CSV format abundance matrix with features (e.g.
genes) by row and observations (e.g. samples) by column. All row names
of the sample sheet should be present among the matrix columns.

samplesheet_meta (map)

Where samplesheet is provided, a Groovy Map containing information on
the sample sheet, at a minimum an id, e.g. [ id:'test' ]

samplesheet (file)

Optional CSV or TSV format sample sheet with sample metadata. If
provided this is used to infer minimum passing samples from group sizes
present (see grouping_variable), but also to validate matrix columns.
If not provided, all numeric columns are selected.

minimum_abundance (float)

Minimum abundance value, supplied via task.ext.args as --minimum_abundance

minimum_samples (integer)

Minimum observations that must pass the threshold to retain
the row/feature (e.g. gene). Supplied via task.ext.args as
--minimum_samples

minimum_proportion (float)

A minimum proportion of observations that must pass the threshold.
Supplied via task.ext.args as --minimum_proportion. Overrides
minimum_samples

grouping_variable (string)

Optionally supply a variable from the sample sheet that can be used to
define groups and derive a minimum group size upon which to base
minimum observation numbers. The rationale is to allow retention of
features that might be present in only one group. Supplied via
task.ext.args as --grouping_variable

minimum_proportion_not_na (float)

A minimum proportion of observations that must have a numeric value (not be NA).
Supplied via task.ext.args as --minimum_proportion_not_na

minimum_samples_not_na (integer)

Minimum observations that must have a numeric value (not be NA) to retain
the row/feature (e.g. gene). Supplied via task.ext.args as
--minimum_samples_not_na. Overrides minimum_proportion_not_na

most_variant_features (integer)

Variance filter: the number of most variable rows/features (e.g. genes) to return.
Supplied via task.ext.args as --most_variant_features
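
As a rough illustration of how the abundance parameters above could interact, here is a minimal Python sketch. It is not the module's own code. The helper names (required_count, smallest_group_size, passes_abundance) are hypothetical, and several details are assumptions: "passing" is taken to mean a value greater than or equal to minimum_abundance, a proportion is rounded up to a whole number of observations, and the smallest group size is used directly as the required count.

```python
import math
from collections import Counter


def required_count(n_observations, minimum_samples=1, minimum_proportion=None):
    """Number of observations that must pass the abundance threshold.

    Assumption: when a proportion is given it overrides minimum_samples
    and is rounded up to a whole number of observations.
    """
    if minimum_proportion is not None:
        return math.ceil(minimum_proportion * n_observations)
    return minimum_samples


def smallest_group_size(groups):
    """Size of the smallest group defined by a grouping variable."""
    return min(Counter(groups).values())


def passes_abundance(row, minimum_abundance, required):
    """True if enough values in the row reach the threshold.

    Assumption: missing values (None) never count as passing, and
    'pass' means value >= minimum_abundance.
    """
    passing = sum(1 for v in row if v is not None and v >= minimum_abundance)
    return passing >= required


# Toy matrix: features by row, four observations by column.
matrix = {
    "gene_a": [10.0, 12.0, 0.0, 0.0],  # present in one group only
    "gene_b": [0.0, 0.5, 0.0, 0.2],    # never reaches the threshold
    "gene_c": [5.0, 6.0, 7.0, 8.0],    # present everywhere
}
# Hypothetical grouping variable taken from a sample sheet.
groups = ["control", "control", "treated", "treated"]

required = smallest_group_size(groups)
kept = [f for f, row in matrix.items() if passes_abundance(row, 1.0, required)]
print(required, kept)
```

With a smallest group size of 2, gene_a is retained even though it is absent from the treated group, which is the rationale given for grouping_variable.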

Output

Name (Type)
Description
Pattern

versions (file)

File containing software versions

versions.yml

meta (map)

Groovy Map containing information on the experiment,
e.g. [ id:'test' ]

filtered (file)

Filtered version of input matrix

*.filtered.tsv

tests (file)

Boolean matrix with pass/fail status for each test on each feature

*.tests.tsv
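
To make the relationship between the two outputs concrete, the sketch below builds a small per-feature pass/fail table and derives a filtered matrix from it. This is a hedged Python illustration, not the module's implementation: the test names "abundance" and "na", the run_tests helper, the use of None for NA, and the rule that a feature is kept only when every test passes are all assumptions made for the example.

```python
def run_tests(matrix, minimum_abundance, minimum_samples, minimum_samples_not_na):
    """Return {feature: {test_name: bool}} for two illustrative tests."""
    results = {}
    for feature, row in matrix.items():
        # Count observations that reach the abundance threshold (NA never passes).
        n_passing = sum(1 for v in row if v is not None and v >= minimum_abundance)
        # Count observations that hold a numeric value at all.
        n_numeric = sum(1 for v in row if v is not None)
        results[feature] = {
            "abundance": n_passing >= minimum_samples,
            "na": n_numeric >= minimum_samples_not_na,
        }
    return results


# Toy matrix with missing values represented as None.
matrix = {
    "gene_a": [10.0, 12.0, None, None],
    "gene_b": [0.0, 0.5, 0.1, 0.2],
    "gene_c": [5.0, 6.0, 7.0, 8.0],
}

tests = run_tests(
    matrix,
    minimum_abundance=1.0,
    minimum_samples=2,
    minimum_samples_not_na=3,
)

# Assumption: a feature is retained only if it passes every test.
filtered = {f: row for f, row in matrix.items() if all(tests[f].values())}

for feature, outcome in tests.items():
    print(feature, outcome)
print(sorted(filtered))
```

Here gene_a passes the abundance test but fails the NA test, gene_b does the opposite, and only gene_c would appear in the filtered matrix.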

Tools

matrixfilter

Filter a matrix based on a minimum value and the number of samples that must pass it.