Access to this GitHub repository is currently restricted.
To request access please contact: Robert J. Gifford (robert.gifford@glasgow.ac.uk)
Description
This is Filovirus-GLUE, a GLUE project designed to support comparative genomic and evolutionary analysis of filoviruses (family Filoviridae).
Filovirus outbreaks have ravaged West and Central Africa in recent decades.
The Filovirus-GLUE project has been designed to facilitate any form of comparative genomic analysis involving filoviruses. It contains a richly annotated sequence dataset for these viruses, comprising both viral sequences and endogenous viral elements (EVEs).
This resource can be used in a wide variety of ways:
- if you're interested in a particular virus within the family Filoviridae, you can use Filovirus-GLUE as an efficient means of investigating that virus in depth.
- for those interested in paleovirology and endogenous viral elements (EVEs), Filovirus-GLUE provides a source of systematically organised information about EVEs derived from filoviruses.
- if you're interested in exploring filovirus diversity in metagenomic datasets, the data included in this project may assist you in your analysis.
What is a GLUE project?
GLUE is an open, integrated software toolkit that provides functionality for storage and interpretation of sequence data.
GLUE supports the development of “projects” containing the data items required for comparative genomic analysis (e.g. sequences, multiple sequence alignments, genome feature annotations, and other sequence-associated data).
Projects are loaded into the GLUE "engine", creating a relational database that represents the semantic relationships between data items. This provides a robust foundation for the implementation of systematic comparative analyses and the development of sequence-based resources.
The core schema of this database can be extended to accommodate the idiosyncrasies of different projects, and GLUE provides a scripting layer (based on JavaScript) for developing custom analysis tools.
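As a rough illustration of the idea of a relational store with an extensible schema, the sketch below uses Python's built-in sqlite3 module. It is not GLUE's actual schema or API: the table names, column names, the `host_species` extension field, and identifiers such as `REF_A` and `ALN_A` are all hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database; every table and column name here is hypothetical,
# not GLUE's real core schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sequence (
    sequence_id TEXT PRIMARY KEY,
    source      TEXT NOT NULL
);
CREATE TABLE alignment (
    alignment_name  TEXT PRIMARY KEY,
    ref_sequence_id TEXT NOT NULL REFERENCES sequence(sequence_id)
);
CREATE TABLE alignment_member (
    alignment_name TEXT NOT NULL REFERENCES alignment(alignment_name),
    sequence_id    TEXT NOT NULL REFERENCES sequence(sequence_id),
    PRIMARY KEY (alignment_name, sequence_id)
);
-- A project-specific extension of the core tables (assumed field).
ALTER TABLE sequence ADD COLUMN host_species TEXT;
""")

# Placeholder records: one reference, one viral sequence, one EVE.
conn.executemany(
    "INSERT INTO sequence (sequence_id, source, host_species) VALUES (?, ?, ?)",
    [
        ("REF_A", "reference", None),
        ("SEQ_1", "virus", "host_x"),
        ("EVE_1", "eve", "host_y"),
    ],
)
conn.execute("INSERT INTO alignment VALUES (?, ?)", ("ALN_A", "REF_A"))
conn.executemany(
    "INSERT INTO alignment_member VALUES (?, ?)",
    [("ALN_A", "SEQ_1"), ("ALN_A", "EVE_1")],
)

# Follow the relationships: which sequences belong to an alignment,
# and what does the extension field say about each of them?
rows = conn.execute(
    """
    SELECT m.sequence_id, s.host_species
    FROM alignment_member AS m
    JOIN sequence AS s ON s.sequence_id = m.sequence_id
    WHERE m.alignment_name = ?
    ORDER BY m.sequence_id
    """,
    ("ALN_A",),
).fetchall()
print(rows)
```

The point of the sketch is only that relationships between data items (sequence, alignment, alignment membership) become queryable once they are stored relationally, and that a project can add its own fields without changing the core tables.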
Hosting of GLUE projects in an online version control system (e.g. GitHub) provides a mechanism for their stable, collaborative development.
Some examples of 'sequence-based resources' built for viruses using GLUE include:
- CoV-GLUE: A GLUE resource for tracking genetic variation in SARS-CoV-2. CoV-GLUE contains a database of amino acid replacements, insertions and deletions that have been observed in GISAID hCoV-19 sequences sampled during the pandemic.
- RABV-GLUE: Tailored toward epidemiological tracking of rabies virus (RABV). Includes a database of RABV sequences and metadata from NCBI, updated daily and arranged into major and minor clades, and an analysis tool providing genotyping, analysis and visualisation of submitted FASTA sequences.
- HCV-GLUE: This GLUE resource aims to support analysis of drug resistance and vaccine escape in hepatitis C virus (HCV). It includes a database of HCV sequences and metadata from NCBI, updated daily and arranged into clades (genotypes, subtypes). In addition to pre-built multiple sequence alignments of NCBI sequences, it includes an analysis tool providing genotyping, drug resistance analysis and visualisation of submitted FASTA sequences.
What does building the Filovirus-GLUE project offer?
Filovirus-GLUE offers a number of advantages for performing comparative sequence analysis of filoviruses:
- Standardisation of the genomic coordinate space: GLUE projects allow all sequences to utilise the coordinate space of a chosen reference sequence. Contingencies associated with insertions and deletions (indels) are handled in a systematic way.
- Predefined, fully annotated reference sequences: This project includes fully annotated reference sequences for major lineages within the family Filoviridae.
- Alignment trees: GLUE allows linking of alignments constructed at distinct taxonomic levels via an "alignment tree" data structure. In the alignment tree, each alignment is constrained to a standard reference sequence, so that all multiple sequence alignments are linked to one another via a standardised coordinate system.
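The first and third points above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not GLUE's implementation: the function names `query_to_reference_coords` and `compose` are hypothetical, the sequences are toy strings, and alignments are assumed to be gapped strings of equal length using `-` for gaps.

```python
from typing import Dict, Optional

CoordMap = Dict[int, Optional[int]]


def query_to_reference_coords(aligned_ref: str, aligned_query: str) -> CoordMap:
    """Map 1-based query positions to 1-based reference positions.

    Query bases that fall in an insertion relative to the reference map
    to None. Reference positions deleted in the query simply have no
    query position pointing at them.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    mapping: CoordMap = {}
    ref_pos = 0
    query_pos = 0
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        ref_has_base = ref_char != "-"
        if ref_has_base:
            ref_pos += 1
        if query_char != "-":
            query_pos += 1
            mapping[query_pos] = ref_pos if ref_has_base else None
    return mapping


def compose(inner: CoordMap, outer: CoordMap) -> CoordMap:
    """Chain two maps: query -> child reference -> parent reference."""
    return {
        q: (outer.get(r) if r is not None else None)
        for q, r in inner.items()
    }


# Toy example. The query has one inserted base and one deleted base
# relative to the child reference.
query_to_child = query_to_reference_coords(
    aligned_ref="ATG-CCA", aligned_query="ATGTC-A"
)

# The child reference is itself aligned to a parent reference one level
# up the (hypothetical) alignment tree; it lacks one parent base.
child_to_parent = query_to_reference_coords(
    aligned_ref="ATGGCCA", aligned_query="ATG-CCA"
)

# Query positions expressed in the parent reference's coordinate space.
query_to_parent = compose(query_to_child, child_to_parent)
print(query_to_child)
print(query_to_parent)
```

Here the inserted query base (position 4) has no reference coordinate, while the remaining bases are reported in the coordinates of whichever reference is chosen. Chaining maps through each constraining reference is one simple way to picture how alignments at different levels can share a single coordinate system.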
GLUE project
On computers with GLUE installed, the Filovirus-GLUE project can be instantiated by navigating to the project folder, initiating GLUE, and issuing the following command in the GLUE shell:
Mode path: /
GLUE> run file filoviridaeProject.glue
Contributors
Robert J. Gifford (robert.gifford@glasgow.ac.uk)
Related Publications
Singer JB, Thomson EC, McLauchlan J, Hughes J, and Gifford RJ (2018). GLUE: A flexible software system for virus sequence data. BMC Bioinformatics.
Zhu H, Dennis T, Hughes J, and Gifford RJ (2018). Database-integrated genome screening (DIGS): exploring genomes heuristically using sequence similarity search tools and a relational database. Preprint.
Gifford RJ, Blomberg B, Coffin JM, Fan H, Heidmann T, Mayer J, Stoye J, Tristem M, and Johnson WE (2018). Nomenclature for endogenous retrovirus (ERV) loci. Retrovirology.
Katzourakis A and Gifford RJ (2010). Endogenous viral elements in animal genomes. PLoS Genetics.