nf-core/variantprioritization

This is the development version of the pipeline.

Launch development version https://github.com/nf-core/variantprioritization

Introduction

nf-core/variantprioritization is a bioinformatics analysis pipeline for the functional annotation and translation of somatic SNVs/InDels and copy number aberrations for precision cancer medicine using Personal Cancer Genome Reporter (PCGR). It also offers interpretation and annotation of germline SNVs/InDels using Cancer Predisposition Sequencing Reporter (CPSR).

The workflow has been designed to accept outputs generated by nf-core/sarek:

Tool	Germline	Somatic tumor-normal	Somatic tumor-only
ASCAT		✔️	✔️
DeepVariant	✔️
HaplotypeCaller	✔️
Mutect2		✔️	✔️
Strelka somatic indels		✔️
Strelka somatic snvs		✔️

Usage

The workflow accepts as input a samplesheet.csv file containing the paths to SNV/InDel VCF files and ASCAT copy number aberration files. We have tried to mimic the samplesheet specifications of nf-core/sarek for ease of use:

Column	Description
patient	Designates the patient/subject; must be unique for each patient, but one patient can have multiple samples
status	Normal/tumor (0/1) status of sample
sample	Designates the sample ID; must be unique. A patient may have multiple samples, e.g. a paired tumor-normal or a tumor-only sample.
vcf	Full path to VCF file(s)
cna	Full path to segment file

An example of a valid samplesheet is given below:

patient,status,sample,vcf,cna
HCC1395,1,HCC1395T,HCC1395T_vs_HCC1395N.mutect2.vcf.gz,HCC1395T.segments.txt
HCC1395,1,HCC1395T,HCC1395T_vs_HCC1395N.freebayes.vcf.gz,HCC1395T.segments.txt
HCC1395,1,HCC1395T,HCC1395T_vs_HCC1395N.strelka.somatic_snvs.vcf.gz,HCC1395T.segments.txt
HCC1395,1,HCC1395T,HCC1395T_vs_HCC1395N.strelka.somatic_indels.vcf.gz,HCC1395T.segments.txt
HCC1395,0,HCC1395N,HCC1395N.deepvariant.vcf.gz,
HCC1395,0,HCC1395N,HCC1395N.haplotypecaller.vcf.gz,
HCC1396,1,HCC1396T,HCC1396T_vs_HCC1396N.mutect2.vcf.gz,
HCC1396,1,HCC1396T,HCC1396T_vs_HCC1396N.strelka.somatic_snvs.vcf.gz,
HCC1396,1,HCC1396T,HCC1396T_vs_HCC1396N.strelka.somatic_indels.vcf.gz,

Copy number aberration files must be present for every sample entry when --cna_analysis true is set.
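The column rules above can be checked before launching a run. The following is a minimal sketch, not the pipeline's own validation: the function name `validate_samplesheet` is hypothetical, and the cna rule is applied literally to every row, which may be stricter than what the pipeline actually enforces.

```python
import csv
import io

# Column names as listed in the samplesheet table above.
REQUIRED_COLUMNS = ["patient", "status", "sample", "vcf", "cna"]


def validate_samplesheet(text, cna_analysis=False):
    """Return a list of human-readable problems found in a samplesheet.

    Hypothetical helper for illustration only.
    """
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        return ["missing column(s): " + ", ".join(missing)]

    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        if row["status"] not in ("0", "1"):
            problems.append(f"line {line_no}: status must be 0 (normal) or 1 (tumor)")
        if not row["vcf"]:
            problems.append(f"line {line_no}: vcf path is empty")
        # Assumption: the cna requirement is applied to every row as stated.
        if cna_analysis and not row["cna"]:
            problems.append(f"line {line_no}: cna path is required when --cna_analysis true")
    return problems


sheet = """patient,status,sample,vcf,cna
HCC1395,1,HCC1395T,HCC1395T_vs_HCC1395N.mutect2.vcf.gz,HCC1395T.segments.txt
HCC1396,1,HCC1396T,HCC1396T_vs_HCC1396N.mutect2.vcf.gz,
"""

print(validate_samplesheet(sheet))
print(validate_samplesheet(sheet, cna_analysis=True))
```

With `cna_analysis=True` the second data row is reported because its cna field is empty; without it both rows pass.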

Now, you can run the pipeline using:

nextflow run nf-core/variantprioritization \
   -profile <docker/singularity/.../institute> \
   --input samplesheet.csv \
   --outdir <OUTDIR>

Warning

Please provide pipeline parameters via the CLI or Nextflow -params-file option. Custom config files including those provided by the -c Nextflow option can be used to provide any configuration except for parameters; see docs.

For more details and further functionality, please refer to the usage documentation and the parameter documentation.
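As the warning above notes, parameters belong on the CLI or in a file passed with -params-file, which Nextflow accepts as JSON or YAML. The sketch below writes such a file and assembles the matching command. The helper names, the `results` output directory and the `docker` profile are placeholders chosen for illustration; only `input`, `outdir` and `cna_analysis` come from the text above.

```python
import json
import shlex
from pathlib import Path


def write_params_file(path, params):
    """Write pipeline parameters as JSON for Nextflow's -params-file option."""
    path = Path(path)
    path.write_text(json.dumps(params, indent=2) + "\n")
    return path


def build_command(params_file, profile):
    """Assemble the nextflow command as an argument list (nothing is executed)."""
    return [
        "nextflow", "run", "nf-core/variantprioritization",
        "-profile", profile,
        "-params-file", str(params_file),
    ]


# Placeholder values; adjust to your own paths.
params = {
    "input": "samplesheet.csv",
    "outdir": "results",
    "cna_analysis": True,
}

print(json.dumps(params, indent=2))
print(shlex.join(build_command("params.json", profile="docker")))
```

Calling `write_params_file("params.json", params)` saves the file; the printed command can then be run from the same directory.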

Variant consolidation

Somatic variants called by multiple tools are reformatted to match PCGR specifications, making them easily searchable in the HTML output.

Tumor sample depth (TDP), allele frequency (TAF) and allelic depths for the ref and alt alleles (ADT) are calculated from each caller's output. When a normal sample is present, the same values are calculated for it (NDP, NAF, ADN):

HCC1395T_vs_HCC1395N.mutect2.vcf.gz
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  HCC1395_HCC1395N        HCC1395_HCC1395T
chr1    1212740 .       A       C       .       PASS    AS_SB_TABLE=63,80|49,76;DP=282;ECNT=1;MBQ=20,20;MFRL=151,154;MMQ=60,60;MPOS=30;NALOD=1.94;NLOD=25.89;POPAF=6.00;TLOD=341.76     GT:AD:AF:DP:F1R2:F2R1:FAD:SB 0/0:143,0:0.011:143:36,0:36,0:86,0:63,80,0,0    0/1:0,125:0.988:125:0,28:0,37:0,78:0,0,49,76

TDP=125;NDP=143;TAF=0.988;NAF=0.011;ADT=0,125;ADN=143,0;TAL=mutect2

HCC1395T_vs_HCC1395N.strelka.somatic_snvs.vcf.gz
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  NORMAL  TUMOR
chr1    1212740 .       A       C       .       PASS    DP=271;MQ=60.00;MQ0=0;NT=ref;QSS=790;QSS_NT=3070;ReadPosRankSum=0.00;SGT=AA->AC;SNVSB=0.00;SOMATIC;SomaticEVS=19.73;TQSS=1;TQSS_NT=1    DP:FDP:SDP:SUBDP:AU:CU:GU:TU 145:0:0:0:145,145:0,0:0,0:0,0   126:0:0:0:0,0:126,126:0,0:0,0

TDP=126;NDP=145;TAF=1;NAF=0;ADT=0,126;ADN=145,0;TAL=strelka
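The two examples above can be reproduced with a short sketch. It is an illustration under stated assumptions rather than the pipeline's code: the function names are hypothetical, Mutect2 values are copied straight from its DP, AF and AD fields, and Strelka SNV counts are taken from the tier 1 (first) value of the per-base AU/CU/GU/TU fields, with the allele frequency computed as alt / (ref + alt).

```python
def sample_fields(format_col, sample_col):
    """Pair the FORMAT keys with one sample column's values."""
    return dict(zip(format_col.split(":"), sample_col.split(":")))


def label(depth, freq, allelic_depths, which):
    """Name values TDP/TAF/ADT for which='T', or NDP/NAF/ADN for which='N'."""
    return {f"{which}DP": depth, f"{which}AF": freq, f"AD{which}": allelic_depths}


def mutect2_tags(format_col, sample_col, which):
    """Mutect2 already reports DP, AF and AD per sample; copy them across."""
    f = sample_fields(format_col, sample_col)
    return label(f["DP"], f["AF"], f["AD"], which)


def strelka_snv_tags(ref, alt, format_col, sample_col, which):
    """Derive counts from Strelka's per-base fields, e.g. CU='126,126'."""
    f = sample_fields(format_col, sample_col)
    ref_count = int(f[f"{ref}U"].split(",")[0])  # tier 1 count of the REF base
    alt_count = int(f[f"{alt}U"].split(",")[0])  # tier 1 count of the ALT base
    total = ref_count + alt_count
    freq = alt_count / total if total else 0.0
    return label(f["DP"], f"{freq:g}", f"{ref_count},{alt_count}", which)


# FORMAT and sample columns copied from the chr1:1212740 A>C records above.
mutect2_format = "GT:AD:AF:DP:F1R2:F2R1:FAD:SB"
mutect2_tumor = "0/1:0,125:0.988:125:0,28:0,37:0,78:0,0,49,76"
strelka_format = "DP:FDP:SDP:SUBDP:AU:CU:GU:TU"
strelka_normal = "145:0:0:0:145,145:0,0:0,0:0,0"
strelka_tumor = "126:0:0:0:0,0:126,126:0,0:0,0"

print(mutect2_tags(mutect2_format, mutect2_tumor, "T"))
print(strelka_snv_tags("A", "C", strelka_format, strelka_tumor, "T"))
print(strelka_snv_tags("A", "C", strelka_format, strelka_normal, "N"))
```

The printed dictionaries carry the same TDP, TAF, ADT, NDP, NAF and ADN values as the annotation lines shown above for each caller.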

Finally, the maximum values of TAF, TDP, NAF, NDP, ADT and ADN across callers are reported for the consolidated variant call. In addition, values in the ID and QUAL columns (i.e. not '.') are reported if present in any of the original calls:

1       1212740 .       A       C       3793.8  PASS    NDP=145;NAF=0.011;TDP=126;TAF=1;TAL=mutect2,strelka
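The maximum-taking step can be sketched as follows. This is a simplified illustration, not the pipeline's implementation: the `consolidate` function is hypothetical, it handles only the scalar tags that appear in the consolidated record above, and it leaves out ADT/ADN because the text does not say how the pairs are compared.

```python
# Scalar tags in the order they appear in the consolidated record above.
SCALAR_TAGS = ("NDP", "NAF", "TDP", "TAF")


def parse_info(info):
    """Split 'KEY=VALUE;KEY=VALUE' into a dict (flags without '=' map to '')."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def consolidate(infos):
    """Merge per-caller annotations, keeping the maximum of each scalar tag."""
    calls = [parse_info(info) for info in infos]
    merged = {}
    for tag in SCALAR_TAGS:
        values = [call[tag] for call in calls if tag in call]
        if values:
            # Compare numerically but keep the original text of the winner.
            merged[tag] = max(values, key=float)
    callers = []
    for call in calls:
        caller = call.get("TAL")
        if caller and caller not in callers:
            callers.append(caller)
    merged["TAL"] = ",".join(callers)
    return ";".join(f"{key}={value}" for key, value in merged.items())


per_caller = [
    "TDP=125;NDP=143;TAF=0.988;NAF=0.011;ADT=0,125;ADN=143,0;TAL=mutect2",
    "TDP=126;NDP=145;TAF=1;NAF=0;ADT=0,126;ADN=145,0;TAL=strelka",
]
print(consolidate(per_caller))
```

For the two annotation strings shown earlier, this prints the same INFO content as the consolidated record above.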

Pipeline output

PCGR

CPSR

Credits

nf-core/variantprioritization was originally written by @barrydigby, @yussab and @matbonfanti.

We thank the following people for their extensive assistance in the development of this pipeline:

Contributions and Support

Please open an issue or reach out to me (Youssef Abili) on the nf-core Slack channel.

I am interested in adding compatibility for additional variant calling tools and optimising the intake of large VCF files.

Citations

Cancer Predisposition Sequencing Reporter (CPSR): A flexible variant report engine for high-throughput germline screening in cancer. Nakken S, Saveliev V, Hofmann O, Møller P, Myklebost O, Hovig E.

Int J Cancer. 2021 Dec 1;149(11):1955-1960. doi: 10.1002/ijc.33749

Personal Cancer Genome Reporter: variant interpretation report for precision oncology. Nakken S, Fournous G, Vodák D, Aasheim LB, Myklebost O, Hovig E.

Bioinformatics. 2018 May 15;34(10):1778-1780. doi: 10.1093/bioinformatics/btx817

Sarek: A portable workflow for whole-genome sequencing analysis of germline and somatic variants. Garcia M, Juhos S, Larsson M, Olason PI, Martin M, Eisfeldt J, DiLorenzo S, Sandgren J, Díaz De Ståhl T, Ewels P, Wirta V, Nistér M, Käller M, Nystedt B.

F1000Res. 2020 Jan 29;9:63. doi: 10.12688/f1000research.16665.2

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Aln
