Description

A tool to quickly download assemblies from NCBI’s Assembly database.

Input

Name (Type)
Description
Pattern

meta (map)

Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]

accessions (file)

List of accessions (one per line) to download

*.txt
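The accessions input is a plain text file with one accession per line. A minimal sketch of producing such a file is shown here. The helper name `write_accessions` and the accession pattern (a `GCA_` or `GCF_` prefix, nine digits, and an optional version suffix) are assumptions made for illustration and are not part of the module.

```python
import re
from pathlib import Path

# Assumed accession shape: GCA_/GCF_ prefix, nine digits, optional ".version".
ACCESSION_RE = re.compile(r"^GC[AF]_\d{9}(\.\d+)?$")


def write_accessions(accessions, path):
    """Write unique, validated accessions to `path`, one per line.

    Hypothetical helper: blank entries are skipped, duplicates are dropped
    (first occurrence wins), and malformed entries raise ValueError.
    """
    seen = []
    for raw in accessions:
        acc = raw.strip()
        if not acc:
            continue
        if not ACCESSION_RE.match(acc):
            raise ValueError(f"not an assembly accession: {acc!r}")
        if acc not in seen:
            seen.append(acc)
    Path(path).write_text("".join(f"{acc}\n" for acc in seen))
    return seen
```

Naming the file with a `.txt` extension (for example `accessions.txt`) keeps it consistent with the `*.txt` pattern listed above.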

Output

Name (Type)
Description
Pattern

meta (map)

Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]

versions (file)

File containing software versions

versions.yml

gbk (file)

GenBank format of the genomic sequence(s) in the assembly

*_genomic.gbff.gz

fna (file)

FASTA format of the genomic sequence(s) in the assembly.

*_genomic.fna.gz

rm (file)

RepeatMasker output for eukaryotes.

*_rm.out.gz

features (file)

Tab-delimited text file reporting locations and attributes for a subset of annotated features

*_feature_table.txt.gz

gff (file)

Annotation of the genomic sequence(s) in GFF3 format

*_genomic.gff.gz

faa (file)

FASTA format of the accessioned protein products annotated on the genome assembly.

*_protein.faa.gz

gpff (file)

GenPept format of the accessioned protein products annotated on the genome assembly.

*_protein.gpff.gz

wgs_gbk (file)

GenBank flat file format of the WGS master for the assembly

*_wgsmaster.gbff.gz

cds (file)

FASTA format of the nucleotide sequences corresponding to all CDS features annotated on the assembly

*_cds_from_genomic.fna.gz

rna (file)

FASTA format of accessioned RNA products annotated on the genome assembly

*_rna.fna.gz

rna_fna (file)

FASTA format of the nucleotide sequences corresponding to all RNA features annotated on the assembly

*_rna_from_genomic.fna.gz

report (file)

Tab-delimited text file reporting the name, role and sequence accession.version for objects in the assembly

*_assembly_report.txt

stats (file)

Tab-delimited text file reporting statistics for the assembly

*_assembly_stats.txt
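Some of the patterns above overlap when read as plain globs: a file ending in `_cds_from_genomic.fna.gz` also matches `*_genomic.fna.gz`. The sketch below copies the output names and patterns from the table and shows one possible way to sort downloaded file names into them. The helper names and the "longest pattern wins" tie-break are assumptions for illustration, not a description of how the module itself assigns files to outputs.

```python
from fnmatch import fnmatchcase

# Output names and glob patterns copied from the table above.
OUTPUT_PATTERNS = {
    "gbk": "*_genomic.gbff.gz",
    "fna": "*_genomic.fna.gz",
    "rm": "*_rm.out.gz",
    "features": "*_feature_table.txt.gz",
    "gff": "*_genomic.gff.gz",
    "faa": "*_protein.faa.gz",
    "gpff": "*_protein.gpff.gz",
    "wgs_gbk": "*_wgsmaster.gbff.gz",
    "cds": "*_cds_from_genomic.fna.gz",
    "rna": "*_rna.fna.gz",
    "rna_fna": "*_rna_from_genomic.fna.gz",
    "report": "*_assembly_report.txt",
    "stats": "*_assembly_stats.txt",
}


def matching_outputs(filename):
    """Return every output name whose glob matches `filename` (table order)."""
    return [
        name
        for name, pattern in OUTPUT_PATTERNS.items()
        if fnmatchcase(filename, pattern)
    ]


def classify(filename):
    """Pick a single output name, or None if nothing matches.

    Assumed tie-break: when several globs match, prefer the longest
    (most specific) pattern.
    """
    matches = matching_outputs(filename)
    if not matches:
        return None
    return max(matches, key=lambda name: len(OUTPUT_PATTERNS[name]))
```

With this tie-break, a CDS FASTA file is reported as `cds` rather than `fna`, while a file that matches only one pattern is unaffected.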

Tools

ncbigenomedownload
Apache Software License

Download genome files from the NCBI FTP server.