Description

QUILT is an R and C++ program for rapid genotype imputation from low-coverage sequence using a large reference panel.

Input

name:type
description
pattern

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end ]

bams:file

(Mandatory) BAM/CRAM files

*.{bam,cram,sam}

bais:file

(Mandatory) BAM/CRAM index files

*.{bai}

bamlist:file

(Optional) File with list of BAM/CRAM files to impute. One file per line.

*.{txt}
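
As a minimal sketch of the "one file per line" layout described for bamlist, the helper below builds the text of such a file. The function name and the example file names are hypothetical and are not part of QUILT or this module.

```python
def format_bamlist(paths):
    """Return bamlist text with one BAM/CRAM path per line."""
    cleaned = [str(p).strip() for p in paths]
    if any(not p for p in cleaned):
        raise ValueError("empty path in bamlist")
    return "".join(p + "\n" for p in cleaned)


# Hypothetical file names, for illustration only.
print(format_bamlist(["sample1.bam", "sample2.cram"]), end="")
```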

reference_haplotype_file:file

(Mandatory) Reference haplotype file in IMPUTE format (file with no header and no rownames, one row per SNP, one column per reference haplotype, space separated, values must be 0 or 1)

*.{hap.gz}
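
The layout rules for the haplotype file (no header, one row per SNP, space-separated, values 0 or 1) can be checked row by row. The following is an illustrative sketch only; `parse_hap_line` is a hypothetical helper, not a QUILT function.

```python
def parse_hap_line(line):
    """Parse one SNP row of an IMPUTE-format haplotype file.

    Each field is one reference haplotype and must be "0" or "1".
    """
    fields = line.strip().split(" ")
    for field in fields:
        if field not in ("0", "1"):
            raise ValueError(f"invalid haplotype value: {field!r}")
    return [int(field) for field in fields]


# One SNP observed across four reference haplotypes.
print(parse_hap_line("0 1 1 0\n"))
```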

reference_legend_file:file

(Mandatory) Reference haplotype legend file in IMPUTE format (file with one row per SNP, and a header including position for the physical position in 1 based coordinates, a0 for the reference allele, and a1 for the alternate allele).

*.{legend.gz}
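
A sketch of reading the legend columns named above (position, a0, a1). It assumes whitespace-delimited fields, which the description does not state explicitly; `parse_legend` and the example row are hypothetical.

```python
def parse_legend(lines):
    """Return (position, a0, a1) tuples from IMPUTE-format legend lines.

    Assumes a whitespace-delimited header that names at least
    'position', 'a0' and 'a1'; other columns are ignored.
    """
    header = lines[0].split()
    missing = {"position", "a0", "a1"} - set(header)
    if missing:
        raise ValueError(f"legend header lacks columns: {sorted(missing)}")
    records = []
    for line in lines[1:]:
        if not line.strip():
            continue  # skip blank lines
        row = dict(zip(header, line.split()))
        records.append((int(row["position"]), row["a0"], row["a1"]))
    return records


# Hypothetical legend content, for illustration only.
print(parse_legend(["id position a0 a1", "snp1 100 A G"]))
```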

chr:string

(Mandatory) What chromosome to run. Should match BAM headers.

regions_start:integer

(Mandatory) When running imputation, where to start from. The 1-based position x is kept if regionStart <= x <= regionEnd.

regions_end:integer

(Mandatory) When running imputation, where to stop.
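
The inclusive rule given for regions_start (a 1-based position x is kept if regionStart <= x <= regionEnd) can be illustrated as follows. The helper name and the positions are hypothetical.

```python
def keep_positions(positions, region_start, region_end):
    """Keep 1-based positions x with region_start <= x <= region_end."""
    if region_start > region_end:
        raise ValueError("region_start must not exceed region_end")
    return [x for x in positions if region_start <= x <= region_end]


# Both boundary positions (100 and 200) are kept.
print(keep_positions([99, 100, 150, 200, 201], 100, 200))
```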

ngen:integer

Number of generations since founding or mixing. Note that the algorithm is relatively robust to this. Use nGen = 4 * Ne / K if unsure.
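
The fallback 4 * Ne / K can be computed as in the sketch below. Reading Ne as an effective population size and K as a number of haplotypes is an interpretation not stated in this description, and rounding to the nearest integer is an assumption made because ngen is typed as an integer. The input values are hypothetical.

```python
def suggested_ngen(ne, k):
    """Fallback value for ngen: 4 * Ne / K, rounded to an integer."""
    if ne <= 0 or k <= 0:
        raise ValueError("ne and k must be positive")
    return round(4 * ne / k)


# Hypothetical values: Ne = 20000, K = 400.
print(suggested_ngen(20000, 400))
```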

buffer:integer

Buffer around the region to perform imputation over. Imputation is run from regionStart-buffer to regionEnd+buffer, and reported for regionStart to regionEnd, including the bases at regionStart and regionEnd.
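
To make the two intervals concrete, this sketch derives the interval imputation is run over and the interval that is reported. The function name and the coordinates are hypothetical, and no clamping at chromosome ends is attempted.

```python
def imputation_windows(region_start, region_end, buffer):
    """Return (run_window, report_window) as inclusive 1-based intervals."""
    run_window = (region_start - buffer, region_end + buffer)
    report_window = (region_start, region_end)
    return run_window, report_window


# Hypothetical region of 1 Mbp with a 250 kbp buffer on each side.
run_window, report_window = imputation_windows(1_000_000, 2_000_000, 250_000)
print(run_window, report_window)
```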

genetic_map_file:file

(Optional) File with genetic map information: 3 whitespace-delimited entries per row giving position (1-based), genetic rate map in cM/Mbp, and genetic map in cM. If no file is included, the rate is based on physical distance and expected rate (expRate).

*.{txt.gz}
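
A sketch of reading the three genetic map columns (position, rate in cM/Mbp, map in cM). Whether the file carries a header row is not stated here, so this hypothetical parser tolerates a non-numeric first line as an assumption.

```python
def parse_genetic_map(lines):
    """Return (position, rate_cm_per_mbp, map_cm) tuples."""
    rows = []
    for index, line in enumerate(lines):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError(f"expected 3 entries, got {len(fields)}")
        try:
            row = (int(fields[0]), float(fields[1]), float(fields[2]))
        except ValueError:
            if index == 0:
                continue  # assumption: first line may be a header
            raise
        rows.append(row)
    return rows


# Hypothetical map content, for illustration only.
print(parse_genetic_map(["position rate map", "100 1.5 0.0", "200 2.0 0.00015"]))
```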

meta2:map

Groovy Map containing sample information e.g. [ id:'test', single_end ]

posfile:file

(Optional) File with positions of where to impute, lining up one-to-one with genfile. File is tab-separated with no header, one row per SNP, with col 1 = chromosome, col 2 = physical position (sorted from smallest to largest), col 3 = reference base, col 4 = alternate base. Bases are capitalized.

*.{txt}
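
The posfile constraints (four tab-separated columns, no header, positions sorted ascending, capitalized bases) can be checked with a sketch like the following. `parse_posfile` and the example rows are hypothetical.

```python
def parse_posfile(lines):
    """Return (chromosome, position, ref, alt) tuples from posfile lines."""
    records = []
    previous = None
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) != 4:
            raise ValueError(f"expected 4 columns, got {len(fields)}")
        chrom, pos_text, ref, alt = fields
        position = int(pos_text)
        if previous is not None and position < previous:
            raise ValueError("positions are not sorted smallest to largest")
        if ref != ref.upper() or alt != alt.upper():
            raise ValueError("bases must be capitalized")
        records.append((chrom, position, ref, alt))
        previous = position
    return records


# Hypothetical rows, for illustration only.
print(parse_posfile(["chr20\t100\tA\tG\n", "chr20\t250\tC\tT\n"]))
```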

phasefile:file

(Optional) File with truth phasing results. Supersedes genfile if both options are given. File has a header row with a name for each sample, matching what is found in the BAM file. Each subject is then a tab-separated column, with 0 = ref and 1 = alt, separated by a vertical bar |, e.g. 0|0 or 0|1. Note that this file therefore has one more row than posfile, which has no header.

*.{txt}
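
A sketch of reading the phasefile layout described above: a header row of sample names followed by one tab-separated row per SNP, each entry of the form 0|1. The helper and sample names are hypothetical.

```python
def parse_phasefile(lines):
    """Return {sample_name: [(allele1, allele2), ...]} from phasefile lines."""
    samples = lines[0].rstrip("\n").split("\t")
    phased = {name: [] for name in samples}
    for line in lines[1:]:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) != len(samples):
            raise ValueError("row width does not match the header")
        for name, field in zip(samples, fields):
            left, right = field.split("|")  # raises ValueError if malformed
            if left not in ("0", "1") or right not in ("0", "1"):
                raise ValueError(f"invalid phased genotype: {field!r}")
            phased[name].append((int(left), int(right)))
    return phased


# Two hypothetical samples across two SNPs.
print(parse_phasefile(["s1\ts2", "0|0\t0|1", "1|1\t1|0"]))
```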

meta3:map

Groovy Map containing sample information e.g. [ id:'test', single_end ]

fasta:file

(Optional) File with reference genome.

*.{txt.gz}

Output

name:type
description
pattern

vcf

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end ]

*.vcf.gz:file

VCF file with both SNP annotation information and per-sample genotype information.

*.{vcf.gz}

tbi

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end ]

*.vcf.gz.tbi:file

TBI file of the VCF.

*.{vcf.gz.tbi}

rdata

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end ]

RData:directory

Optional directory path to prepared RData file with reference objects (useful with --save_prepared_reference=TRUE).

plots

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end ]

plots:directory

Optional directory path to save plots.

versions

versions.yml:file

File containing software versions

versions.yml

Tools

quilt
GPL v3

Read aware low coverage whole genome sequence imputation from a reference panel