Access to this GitHub repository is currently restricted.

To request access, please contact Robert J. Gifford (robert.gifford@glasgow.ac.uk).

Description

This is Filovirus-GLUE, a GLUE project designed to support comparative genomic and evolutionary analysis of filoviruses (family Filoviridae).

Filovirus outbreaks have ravaged West and Central Africa in recent decades.

The Filovirus-GLUE project is designed to facilitate comparative genomic analysis of filoviruses. It contains a richly annotated sequence dataset for these viruses, comprising both viral sequences and endogenous viral elements (EVEs).

This resource can be used in a wide variety of ways.

What is a GLUE project?


GLUE is an open, integrated software toolkit that provides functionality for storage and interpretation of sequence data.

GLUE supports the development of “projects” containing the data items required for comparative genomic analysis (e.g. sequences, multiple sequence alignments, genome feature annotations, and other sequence-associated data).

Projects are loaded into the GLUE "engine", creating a relational database that represents the semantic relationships between data items. This provides a robust foundation for the implementation of systematic comparative analyses and the development of sequence-based resources.
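As a rough illustration of what "a relational database that represents the semantic relationships between data items" can look like, the sketch below builds a toy in-memory database with Python's standard sqlite3 module. The table names, column names, and identifiers (sequence, alignment, alignment_member, AL_EXAMPLE, seq_A and so on) are hypothetical and chosen only for this example; they are not GLUE's actual schema, and GLUE itself does not use SQLite in this way.

```python
import sqlite3

# Hypothetical, heavily simplified tables for illustration only.
# GLUE's real schema is richer and is managed by the GLUE engine.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sequence (
        sequence_id TEXT PRIMARY KEY,
        source      TEXT NOT NULL
    );
    CREATE TABLE alignment (
        name            TEXT PRIMARY KEY,
        ref_sequence_id TEXT NOT NULL REFERENCES sequence(sequence_id)
    );
    CREATE TABLE alignment_member (
        alignment_name TEXT NOT NULL REFERENCES alignment(name),
        sequence_id    TEXT NOT NULL REFERENCES sequence(sequence_id),
        PRIMARY KEY (alignment_name, sequence_id)
    );
    """
)

# Placeholder identifiers, not real accession numbers.
conn.executemany(
    "INSERT INTO sequence (sequence_id, source) VALUES (?, ?)",
    [("ref_1", "references"), ("seq_A", "curated"), ("seq_B", "curated")],
)
conn.execute("INSERT INTO alignment VALUES (?, ?)", ("AL_EXAMPLE", "ref_1"))
conn.executemany(
    "INSERT INTO alignment_member VALUES (?, ?)",
    [("AL_EXAMPLE", "seq_A"), ("AL_EXAMPLE", "seq_B")],
)


def members_of(connection, alignment_name):
    """Return the sequence IDs that belong to the named alignment, sorted."""
    rows = connection.execute(
        """
        SELECT s.sequence_id
        FROM alignment_member AS m
        JOIN sequence AS s ON s.sequence_id = m.sequence_id
        WHERE m.alignment_name = ?
        ORDER BY s.sequence_id
        """,
        (alignment_name,),
    ).fetchall()
    return [row[0] for row in rows]


print(members_of(conn, "AL_EXAMPLE"))
```

Because the links between sequences and alignments are stored as rows rather than implied by file names, questions such as "which sequences belong to this alignment?" become simple queries.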

The core schema of this database can be extended to accommodate the idiosyncrasies of different projects, and GLUE provides a scripting layer (based on JavaScript) for developing custom analysis tools.

Hosting of GLUE projects in an online version control system (e.g. GitHub) provides a mechanism for their stable, collaborative development.

Some examples of 'sequence-based resources' built for viruses using GLUE include:

What does building the Filovirus-GLUE project offer?

Filovirus-GLUE offers a number of advantages for performing comparative sequence analysis of filoviruses:

  1. Standardisation of the genomic coordinate space: GLUE projects allow all sequences to use the coordinate space of a chosen reference sequence. Insertions and deletions (indels) relative to that reference are handled in a systematic way.
  2. Predefined, fully annotated reference sequences: This project includes fully annotated reference sequences for major lineages within the family Filoviridae.
  3. Alignment trees: GLUE allows alignments constructed at distinct taxonomic levels to be linked via an "alignment tree" data structure. In the alignment tree, each alignment is constrained to a standard reference sequence, so that all multiple sequence alignments are linked to one another via a standardised coordinate system.
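The ideas of a shared coordinate space and an alignment tree can be sketched in a few lines of Python. This is a conceptual illustration under stated assumptions, not GLUE's implementation: the class names (AlignedSegment, AlignmentNode), the function names (to_reference, to_root), and all coordinates are hypothetical. Each alignment is modelled as a list of ungapped segments that pair a block of member positions with a block of reference positions, and each node in the tree records how its own reference maps onto its parent's reference.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class AlignedSegment:
    """Ungapped block pairing member positions with reference positions
    (1-based, inclusive). Hypothetical structure for illustration."""
    ref_start: int
    ref_end: int
    member_start: int
    member_end: int


def to_reference(segments: Sequence[AlignedSegment], member_pos: int) -> Optional[int]:
    """Map a member position to reference coordinates.

    Returns None when the position lies in an insertion relative to the
    reference, i.e. it has no counterpart in the reference sequence.
    """
    for seg in segments:
        if seg.member_start <= member_pos <= seg.member_end:
            return seg.ref_start + (member_pos - seg.member_start)
    return None


@dataclass
class AlignmentNode:
    """One alignment in the tree, constrained to its own reference."""
    name: str
    parent: Optional["AlignmentNode"] = None
    # Segments mapping this node's reference onto the parent's reference.
    ref_to_parent: tuple = ()


def to_root(node: AlignmentNode, pos: int) -> Optional[int]:
    """Re-express a position in each parent's coordinates up to the root."""
    current: Optional[int] = pos
    while node.parent is not None and current is not None:
        current = to_reference(node.ref_to_parent, current)
        node = node.parent
    return current


# A member with a 3-nt insertion after position 10 of the lineage reference.
member_segments = [
    AlignedSegment(ref_start=1, ref_end=10, member_start=1, member_end=10),
    AlignedSegment(ref_start=11, ref_end=20, member_start=14, member_end=23),
]

# A lineage reference that lacks 6 nt present in the root reference.
root = AlignmentNode(name="AL_ROOT")
lineage = AlignmentNode(
    name="AL_LINEAGE",
    parent=root,
    ref_to_parent=(
        AlignedSegment(ref_start=1, ref_end=50, member_start=1, member_end=50),
        AlignedSegment(ref_start=57, ref_end=106, member_start=51, member_end=100),
    ),
)

print(to_reference(member_segments, 15))  # position in lineage coordinates
print(to_reference(member_segments, 12))  # None: inside the insertion
print(to_root(lineage, 60))               # position in root coordinates
```

Positions inside an insertion have no reference coordinate and map to None, which is one simple way of handling indels systematically. Chaining the per-node mappings is what lets a position in any alignment be expressed in the coordinates of the root reference.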

GLUE project

On computers with GLUE installed, the Filovirus-GLUE project can be instantiated by navigating to the project folder, starting GLUE, and issuing the following command in the GLUE shell:

  Mode path: /
  GLUE> run file filoviridaeProject.glue

Contributors

Robert J. Gifford (robert.gifford@glasgow.ac.uk)

Related Publications

Singer JB, Thomson EC, McLauchlan J, Hughes J, and RJ Gifford (2018)
GLUE: A flexible software system for virus sequence data.
BMC Bioinformatics

Zhu H, Dennis T, Hughes J, and RJ Gifford (2018)
Database-integrated genome screening (DIGS): exploring genomes heuristically using sequence similarity search tools and a relational database.
Preprint

Gifford RJ, Blomberg B, Coffin JM, Fan H, Heidmann T, Mayer J, Stoye J, Tristem M, and WE Johnson (2018)
Nomenclature for endogenous retrovirus (ERV) loci.
Retrovirology

Katzourakis A and RJ Gifford (2010)
Endogenous viral elements in animal genomes.
PLoS Genetics