
Tool for imputation and phasing from a VCF file or directly from BAM files.


Name (Type)

meta (map)

Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]

input (file)

Either one or more BAM/CRAM files (passed as an array) containing low-coverage sequencing reads, or a single VCF/BCF file containing genotype likelihoods.
When BAM/CRAM files are used, the file name is used as the sample name.


input_index (file)

Index file of the input BAM/CRAM/VCF/BCF file.


samples_file (file)

File with sample names and ploidy information.
One sample per line with a mandatory second column indicating ploidy (1 or 2).
Samples not listed in the file are assumed to be diploid (ploidy 2).
GLIMPSE does NOT accept sex (M/F) in place of ploidy.
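
The format described above can be checked before running the tool. The following is a minimal sketch, not part of GLIMPSE: the helper names `parse_samples_file` and `ploidy_of` are hypothetical, and it assumes the two columns are separated by whitespace, which the description above does not specify.

```python
def parse_samples_file(text):
    """Parse 'sample ploidy' lines into a dict, rejecting any ploidy other than 1 or 2."""
    ploidy = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split()  # assumption: whitespace-separated columns
        if len(fields) != 2:
            raise ValueError(f"line {lineno}: expected 2 columns, got {len(fields)}")
        name, value = fields
        if value not in ("1", "2"):
            # Sex codes such as M/F are not accepted in place of ploidy.
            raise ValueError(f"line {lineno}: ploidy must be 1 or 2, got {value!r}")
        ploidy[name] = int(value)
    return ploidy


def ploidy_of(sample, ploidy):
    """Samples missing from the file default to ploidy 2 (diploid)."""
    return ploidy.get(sample, 2)


table = parse_samples_file("NA12878 2\nNA19240 1\n")
print(ploidy_of("NA19240", table))  # listed as haploid
print(ploidy_of("HG00096", table))  # not listed, so treated as diploid
```

A line such as `NA12878 F` raises `ValueError` in this sketch, mirroring the restriction that sex cannot be used instead of ploidy.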


input_region (string)

Target region used for imputation, including left and right buffers (e.g. chr20:1000000-2000000).
Optional if reference panel is in bin format.


output_region (string)

Target imputed region, excluding left and right buffers (e.g. chr20:1000000-2000000).
Optional if reference panel is in bin format.
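
The relationship between the two regions can be illustrated as follows. This is only a sketch under stated assumptions: the helper names and the 200 kb buffer are hypothetical, and in practice the region pairs usually come from a chunking step rather than being computed by hand.

```python
def parse_region(region):
    """Split 'chrom:start-end' into (chrom, start, end)."""
    chrom, _, span = region.rpartition(":")
    start, end = span.split("-")
    return chrom, int(start), int(end)


def add_buffer(output_region, buffer_bp):
    """Extend output_region by buffer_bp on each side to obtain an input region.

    The start is clamped to 1; the end is NOT clamped to the chromosome
    length, which this sketch does not know.
    """
    chrom, start, end = parse_region(output_region)
    return f"{chrom}:{max(1, start - buffer_bp)}-{end + buffer_bp}"


output_region = "chr20:1000000-2000000"
input_region = add_buffer(output_region, 200_000)  # hypothetical buffer size
print(input_region)  # chr20:800000-2200000
```

The input region always contains the output region; the buffers give the imputation extra context on both sides and are not part of the imputed output.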


meta2 (map)

Groovy Map containing genomic map information
e.g. [ map:'GRCh38' ]

reference (file)

Reference panel of haplotypes in VCF/BCF format.


reference_index (file)

Index file of the reference panel file.


map (file)

File containing the genetic map.
Optional if reference panel is in bin format.


fasta_reference (file)

Faidx-indexed reference sequence file in the appropriate genome build.
Necessary for CRAM files.


fasta_reference_index (file)

Faidx index of the reference sequence file in the appropriate genome build.
Necessary for CRAM files.



Name (Type)

versions (file)

File containing software versions


phased_variants (file)

Output VCF/BCF file containing genotype probabilities (GP field), imputed dosages (DS field), best-guess genotypes (GT field), sampled haplotypes from the last (max 16) main iterations (HS field), and the info score.
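
As an illustration of how these fields relate for a diploid sample at a biallelic site, the sketch below derives an expected alternate-allele dosage and a most-probable genotype from three genotype probabilities. It uses the conventional definitions of these quantities; it is an assumption that the values written to the output file follow them exactly, and the function names are hypothetical.

```python
def dosage_from_gp(gp):
    """Expected number of alternate alleles from (P(0/0), P(0/1), P(1/1))."""
    p_ref, p_het, p_alt = gp
    return p_het + 2.0 * p_alt


def best_guess(gp):
    """Index (0, 1 or 2 alternate alleles) of the most probable genotype."""
    return max(range(3), key=lambda g: gp[g])


gp = (0.1, 0.3, 0.6)  # made-up probabilities for illustration
print(dosage_from_gp(gp))
print(best_guess(gp))
```

A dosage close to an integer indicates a confident call, while intermediate values reflect uncertainty spread across genotypes.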


stats_coverage (file)

Optional coverage statistics file created when BAM/CRAM files are used as input.




GLIMPSE2 is a phasing and imputation method for large-scale low-coverage sequencing studies.