A tool to quickly download assemblies from NCBI’s Assembly database
Input
| Name | Type | Description | Pattern |
| --- | --- | --- | --- |
| `meta` | map | Groovy Map containing sample information, e.g. `[ id:'test', single_end:false ]` | |
| `accessions` | file | List of accessions (one per line) to download | `*.txt` |
| `taxids` | file | List of taxids (one per line) to download | `*.txt` |
| `groups` | string | NCBI taxonomic groups to download. Can be a comma-separated list. Options are `['all', 'archaea', 'bacteria', 'fungi', 'invertebrate', 'metagenomes', 'plant', 'protozoa', 'vertebrate_mammalian', 'vertebrate_other', 'viral']` | |
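The `groups` input is a plain string, so a typo in a group name would only surface once the download runs. A minimal sketch of an up-front check is shown here, assuming the option list given above is complete; the helper name `parse_groups` is hypothetical and not part of the tool or the module.

```python
# Allowed values, copied from the `groups` description above.
VALID_GROUPS = (
    "all", "archaea", "bacteria", "fungi", "invertebrate", "metagenomes",
    "plant", "protozoa", "vertebrate_mammalian", "vertebrate_other", "viral",
)


def parse_groups(value):
    """Split a comma-separated groups string and reject unknown names.

    Hypothetical helper for illustration only.
    """
    # Tolerate spaces around commas and ignore empty items.
    groups = [item.strip() for item in value.split(",") if item.strip()]
    if not groups:
        raise ValueError("no taxonomic group given")
    unknown = [g for g in groups if g not in VALID_GROUPS]
    if unknown:
        raise ValueError("unknown taxonomic group(s): " + ", ".join(unknown))
    return groups
```

For example, `parse_groups("bacteria, viral")` returns the two names as a list, while `parse_groups("bacteria,plants")` raises a `ValueError` because `plants` is not among the listed options.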
Output
| Channel | Name | Type | Description | Pattern |
| --- | --- | --- | --- | --- |
| `gbk` | `meta` | map | Groovy Map containing sample information, e.g. `[ id:'test', single_end:false ]` | |
| `gbk` | `*_genomic.gbff.gz` | file | GenBank format of the genomic sequence(s) in the assembly | `*_genomic.gbff.gz` |
| `fna` | `meta` | map | Groovy Map containing sample information, e.g. `[ id:'test', single_end:false ]` | |
| `fna` | `*_genomic.fna.gz` | file | FASTA format of the genomic sequence(s) in the assembly. | `*_genomic.fna.gz` |
| `rm` | `meta` | map | Groovy Map containing sample information, e.g. `[ id:'test', single_end:false ]` | |
| `rm` | `*_rm.out.gz` | file | RepeatMasker output for eukaryotes. | `*_rm.out.gz` |
| `features` | `meta` | map | Groovy Map containing sample information, e.g. `[ id:'test', single_end:false ]` | |
| `features` | `*_feature_table.txt.gz` | file | Tab-delimited text file reporting locations and attributes for a subset of annotated features | `*_feature_table.txt.gz` |
| `gff` | `meta` | map | Groovy Map containing sample information, e.g. `[ id:'test', single_end:false ]` | |
| `gff` | `*_genomic.gff.gz` | file | Annotation of the genomic sequence(s) in GFF3 format | `*_genomic.gff.gz` |
| `faa` | `meta` | map | Groovy Map containing sample information, e.g. `[ id:'test', single_end:false ]` | |
| `faa` | `*_protein.faa.gz` | file | FASTA format of the accessioned protein products annotated on the genome assembly. | `*_protein.faa.gz` |
| `gpff` | `meta` | map | Groovy Map containing sample information, e.g. `[ id:'test', single_end:false ]` | |
| `gpff` | `*_protein.gpff.gz` | file | GenPept format of the accessioned protein products annotated on the genome assembly. | `*_protein.gpff.gz` |
| `wgs_gbk` | `meta` | map | Groovy Map containing sample information, e.g. `[ id:'test', single_end:false ]` | |
| `wgs_gbk` | `*_wgsmaster.gbff.gz` | file | GenBank flat file format of the WGS master for the assembly | `*_wgsmaster.gbff.gz` |
| `cds` | `meta` | map | Groovy Map containing sample information, e.g. `[ id:'test', single_end:false ]` | |
| `cds` | `*_cds_from_genomic.fna.gz` | file | FASTA format of the nucleotide sequences corresponding to all CDS features annotated on the assembly | `*_cds_from_genomic.fna.gz` |
| `rna` | `meta` | map | Groovy Map containing sample information, e.g. `[ id:'test', single_end:false ]` | |
| `rna` | `*_rna.fna.gz` | file | FASTA format of accessioned RNA products annotated on the genome assembly | `*_rna.fna.gz` |
| `rna_fna` | `meta` | map | Groovy Map containing sample information, e.g. `[ id:'test', single_end:false ]` | |
| `rna_fna` | `*_rna_from_genomic.fna.gz` | file | FASTA format of the nucleotide sequences corresponding to all RNA features annotated on the assembly | `*_rna_from_genomic.fna.gz` |
| `report` | `meta` | map | Groovy Map containing sample information, e.g. `[ id:'test', single_end:false ]` | |
| `report` | `*_assembly_report.txt` | file | Tab-delimited text file reporting the name, role and sequence accession.version for objects in the assembly | `*_assembly_report.txt` |
| `stats` | `meta` | map | Groovy Map containing sample information, e.g. `[ id:'test', single_end:false ]` | |
| `stats` | `*_assembly_stats.txt` | file | Tab-delimited text file reporting statistics for the assembly | |
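Each output channel is identified by a glob pattern, and some of those patterns overlap: a name ending in `_cds_from_genomic.fna.gz` also ends in `_genomic.fna.gz`. The sketch below illustrates this with Python's standard `fnmatch` module, assuming the patterns behave as ordinary shell-style globs; the helper `matching_channels` and the example file names are hypothetical, and whether the module itself disambiguates such cases is not stated here.

```python
from fnmatch import fnmatchcase

# Channel names and glob patterns as listed in the Output section.
OUTPUT_PATTERNS = {
    "gbk": "*_genomic.gbff.gz",
    "fna": "*_genomic.fna.gz",
    "rm": "*_rm.out.gz",
    "features": "*_feature_table.txt.gz",
    "gff": "*_genomic.gff.gz",
    "faa": "*_protein.faa.gz",
    "gpff": "*_protein.gpff.gz",
    "wgs_gbk": "*_wgsmaster.gbff.gz",
    "cds": "*_cds_from_genomic.fna.gz",
    "rna": "*_rna.fna.gz",
    "rna_fna": "*_rna_from_genomic.fna.gz",
    "report": "*_assembly_report.txt",
    "stats": "*_assembly_stats.txt",
}


def matching_channels(filename):
    """Return every channel whose pattern matches the file name.

    Hypothetical helper; a list is returned because patterns can overlap.
    """
    return [
        channel
        for channel, pattern in OUTPUT_PATTERNS.items()
        if fnmatchcase(filename, pattern)  # case-sensitive glob match
    ]
```

With a made-up file name such as `GCF_000001_ASM1v1_cds_from_genomic.fna.gz`, the helper reports both `fna` and `cds`, so downstream code that expects only whole-genome FASTA on `fna` may want to filter out the `*_from_genomic` files explicitly.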