Background/Description


Deltaretroviruses (genus Deltaretrovirus) are a highly unusual genus of retroviruses (family Retroviridae) that have only been identified in a restricted subset of mammalian species. They include the primate T-cell lymphotropic viruses (PTLVs), which infect apes (including humans) and Old World monkeys, and bovine leukemia virus (BLV), which infects cattle.

Like all retroviruses, deltaretroviruses cause common, persistent infections. Infection is frequently asymptomatic, but can lead to inflammatory and malignant disease over the longer term.

Deltaretrovirus host species

Deltaretroviruses and their hosts. Left to right: (i) mandrills, (ii) colobus monkeys, and (iii) chimpanzees are among the many species of African primate infected with deltaretroviruses. Human populations are also infected, and have been for millennia: deltaretroviral proviruses have been recovered from mummified remains in the Andes (iv), showing that deltaretroviruses were present in human populations that reached South America.

This is Deltaretrovirus-GLUE, a GLUE project supporting comparative genomic and evolutionary analysis of deltaretroviruses. It contains a richly annotated sequence dataset for these viruses, comprising both viral sequences and endogenous retroviruses (ERVs).

The Deltaretrovirus-GLUE resource can be used in a wide variety of ways.


What is a GLUE project?


GLUE is an open, integrated software toolkit that provides functionality for storage and interpretation of sequence data.

GLUE supports the development of “projects” containing the data items required for comparative genomic analysis (e.g. sequences, multiple sequence alignments, genome feature annotations, and other sequence-associated data).

Projects are loaded into the GLUE "engine", creating a relational database that represents the semantic relationships between data items. This provides a robust foundation for the implementation of systematic comparative analyses and the development of sequence-based resources.

The core schema of this database can be extended to accommodate the idiosyncrasies of different projects, and GLUE provides a scripting layer (based on JavaScript) for developing custom analysis tools.
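As a rough illustration of the idea, and not of GLUE's actual schema, the sketch below uses Python's built-in sqlite3 module to model a few linked data items and one project-specific extension. All table and column names here (sequence, alignment, alignment_member, host_species) are hypothetical and chosen only for the example.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a tiny, hypothetical core schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE sequence (
            source      TEXT NOT NULL,
            sequence_id TEXT NOT NULL,
            PRIMARY KEY (source, sequence_id)
        );
        CREATE TABLE alignment (
            name TEXT PRIMARY KEY
        );
        -- Each membership row must point at an existing alignment and sequence.
        CREATE TABLE alignment_member (
            alignment_name TEXT NOT NULL REFERENCES alignment(name),
            source         TEXT NOT NULL,
            sequence_id    TEXT NOT NULL,
            PRIMARY KEY (alignment_name, source, sequence_id),
            FOREIGN KEY (source, sequence_id)
                REFERENCES sequence(source, sequence_id)
        );
        -- Project-specific extension of the core schema (hypothetical field).
        ALTER TABLE sequence ADD COLUMN host_species TEXT;
        """
    )
    return conn


def add_sequence(conn, source, sequence_id, host_species=None):
    """Store one sequence record, optionally with the extension field."""
    conn.execute(
        "INSERT INTO sequence (source, sequence_id, host_species) VALUES (?, ?, ?)",
        (source, sequence_id, host_species),
    )


def add_member(conn, alignment_name, source, sequence_id):
    """Link a sequence to an alignment; fails if either record is missing."""
    conn.execute(
        "INSERT INTO alignment_member (alignment_name, source, sequence_id) "
        "VALUES (?, ?, ?)",
        (alignment_name, source, sequence_id),
    )


if __name__ == "__main__":
    db = build_demo_db()
    add_sequence(db, "demo-source", "SEQ_A", host_species="Bos taurus")
    db.execute("INSERT INTO alignment (name) VALUES ('AL_DEMO')")
    add_member(db, "AL_DEMO", "demo-source", "SEQ_A")
    try:
        add_member(db, "AL_DEMO", "demo-source", "SEQ_MISSING")
    except sqlite3.IntegrityError as err:
        print("rejected dangling link:", err)
```

The point of the sketch is that links between data items are declared in the database itself, so an alignment cannot refer to a sequence that has not been loaded.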

Hosting of GLUE projects in an online version control system (e.g. GitHub) provides a mechanism for their stable, collaborative development.

Several 'sequence-based resources' have been built for viruses using GLUE.


What does building the Deltaretrovirus-GLUE project offer?


Deltaretrovirus-GLUE offers a number of advantages for performing comparative sequence analysis of deltaretroviruses:

  1. Reproducibility. For many reasons, bioinformatics analyses are notoriously difficult to reproduce. The GLUE framework supports the implementation of fully reproducible comparative genomics through the introduction of data standards and the use of a relational database to capture the semantic links between data items.

  2. Reusable data objects and analysis logic. For many, if not most, comparative genomic analyses, data preparation is nine tenths of the battle. The GLUE framework has been designed to ensure that work spent preparing high-value data items such as multiple sequence alignments need only be performed once. Hosting of GLUE projects in an online version control system such as GitHub allows for collaborative management of important data items and community testing of hypotheses.

  3. Validation. Building GLUE projects entails mapping the semantic links between data items (e.g. sequences, tabular data, multiple sequence alignments). This process provides an opportunity for cross-validation, and thereby enforces a high level of data integrity.

  4. Standardisation of the genomic co-ordinate space. GLUE projects allow all sequences to utilise the coordinate space of a chosen reference sequence. Contingencies associated with insertions and deletions (indels) are handled in a systematic way.

  5. Predefined, fully annotated reference sequences. This project includes fully annotated reference sequences for major lineages within the genus Deltaretrovirus.

  6. Alignment trees. GLUE allows linking of alignments constructed at distinct taxonomic levels via an "alignment tree" data structure. In the alignment tree, each alignment is constrained to a standard reference sequence, so all multiple sequence alignments are linked to one another via a standardised coordinate system.
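The coordinate standardisation and alignment tree ideas in points 4 and 6 can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not GLUE's implementation: the class and function names, the sequence and alignment names (REF_MASTER, REF_CHILD, seqX, AL_ROOT, AL_CHILD), and the toy sequences are all hypothetical. Each alignment member is stored as ungapped segments relative to the alignment's reference, and the reference of a child alignment is assumed to be a member of the parent alignment.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Segment:
    """Ungapped block pairing member positions with reference positions
    (1-based, inclusive)."""
    ref_start: int
    ref_end: int
    mem_start: int
    mem_end: int


def segments_from_pairwise(ref_row: str, mem_row: str) -> List[Segment]:
    """Derive ungapped segments from two gapped rows of a pairwise alignment."""
    assert len(ref_row) == len(mem_row)
    segments: List[Segment] = []
    ref_pos = mem_pos = 0
    current = None  # [ref_start, mem_start, length] of the open block
    for r, m in zip(ref_row, mem_row):
        ref_pos += r != "-"
        mem_pos += m != "-"
        if r != "-" and m != "-":
            if current is None:
                current = [ref_pos, mem_pos, 0]
            current[2] += 1
        elif current is not None:
            rs, ms, n = current
            segments.append(Segment(rs, rs + n - 1, ms, ms + n - 1))
            current = None
    if current is not None:
        rs, ms, n = current
        segments.append(Segment(rs, rs + n - 1, ms, ms + n - 1))
    return segments


def to_reference(segments: List[Segment], mem_pos: int) -> Optional[int]:
    """Map a member position to reference coordinates.
    Returns None when the position is an insertion relative to the reference."""
    for s in segments:
        if s.mem_start <= mem_pos <= s.mem_end:
            return s.ref_start + (mem_pos - s.mem_start)
    return None


@dataclass
class AlignmentNode:
    """One alignment in the tree, constrained to a single reference sequence."""
    name: str
    reference: str
    parent: Optional["AlignmentNode"] = None
    members: Dict[str, List[Segment]] = field(default_factory=dict)


def to_root(node: AlignmentNode, seq_id: str, pos: int) -> Optional[Tuple[str, int]]:
    """Translate a position up the tree into the root reference's coordinates."""
    current: Optional[AlignmentNode] = node
    while current is not None:
        if seq_id != current.reference:
            mapped = to_reference(current.members[seq_id], pos)
            if mapped is None:
                return None
            pos = mapped
        # The child's reference is expected to be a member of the parent.
        seq_id = current.reference
        current = current.parent
    return seq_id, pos


if __name__ == "__main__":
    root = AlignmentNode("AL_ROOT", reference="REF_MASTER")
    child = AlignmentNode("AL_CHILD", reference="REF_CHILD", parent=root)
    root.members["REF_CHILD"] = segments_from_pairwise("ACGTACGT", "AC--ACGT")
    child.members["seqX"] = segments_from_pairwise("ACAC-GT", "A-ACTGT")
    print(to_root(child, "seqX", 3))  # position expressed in REF_MASTER coordinates
    print(to_root(child, "seqX", 4))  # insertion relative to REF_CHILD, so None
```

Storing ungapped segments rather than gapped strings means that an insertion simply has no reference coordinate, while a deletion appears as a jump between consecutive segments.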


Installing Deltaretrovirus-GLUE


On computers with GLUE installed, the Deltaretrovirus-GLUE project can be instantiated by navigating to the project folder, initiating GLUE, and issuing the following command in the GLUE shell:

  Mode path: /
  GLUE> run file buildDeltaretrovirusCoreProject.glue

This will build the Deltaretrovirus-GLUE core project by executing the commands in buildDeltaretrovirusCoreProject.glue.

The Deltaretrovirus project can be further extended to incorporate ERV sequences by running a second build file, as follows:

  Mode path: /
  GLUE> run file buildDeltaretrovirusPaleoProject.glue

The Deltaretrovirus paleovirus extension incorporates a set of endogenous retroviruses (ERVs) recovered from the genomes of metazoan species. Building the paleovirus extension allows automated alignment and phylogeny reconstruction for individual ERV lineages in the project, based on the classifications in these files. Individual ERV sequences have been classified into sets considered likely to have arisen from the same germline colonisation event. Loci have been named using a systematic approach (see here for details).
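To make the grouping step concrete, the following Python sketch reads a tab-delimited classification table and collects locus identifiers by lineage, so that each set could then be aligned and analysed separately. The column names (locus_id, lineage) and the placeholder rows are assumptions made for illustration; they are not taken from the project's own files.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List

# Placeholder table; the real classification files may use other columns.
EXAMPLE_TABLE = (
    "locus_id\tlineage\n"
    "locus-1\tlineage-A\n"
    "locus-2\tlineage-A\n"
    "locus-3\tlineage-B\n"
)


def group_loci_by_lineage(table_text: str) -> Dict[str, List[str]]:
    """Group locus identifiers by their assigned lineage, preserving file order."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    groups: Dict[str, List[str]] = defaultdict(list)
    for row in reader:
        groups[row["lineage"]].append(row["locus_id"])
    return dict(groups)


if __name__ == "__main__":
    for lineage, loci in group_loci_by_lineage(EXAMPLE_TABLE).items():
        # Each lineage would get its own alignment and phylogeny.
        print(lineage, loci)
```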



Related Publications


Singer JB, Thomson EC, McLauchlan J, Hughes J, and RJ Gifford (2018)
GLUE: A flexible software system for virus sequence data.
BMC Bioinformatics

Zhu H, Dennis T, Hughes J, and RJ Gifford (2018)
Database-integrated genome screening (DIGS): exploring genomes heuristically using sequence similarity search tools and a relational database. [preprint]

Hron T, Elleder D, and RJ Gifford (2019)
Deltaretroviruses have circulated since at least the Paleogene and infected a broad range of mammalian species.
Retrovirology

Hron T, Farkašová H, Gifford RJ, Benda P, Hulva P, Görföl T, Pačes J, and D Elleder (2018)
Remnants of an ancient deltaretrovirus in the genomes of horseshoe bats (Rhinolophidae).
Virus Research

Gifford RJ, Blomberg J, Coffin JM, Fan H, Heidmann T, Mayer J, Stoye J, Tristem M, and WE Johnson (2018)
Nomenclature for endogenous retrovirus (ERV) loci.
Retrovirology

Farkašová H, Hron T, Pačes J, Hulva P, Benda P, Gifford RJ, and D Elleder (2017)
Discovery of an endogenous Deltaretrovirus in the genome of long-fingered bats (Chiroptera: Miniopteridae).
Proc Natl Acad Sci U S A


Contributors


Robert J. Gifford (robert.gifford@glasgow.ac.uk)

Daniel Elleder (daniel.elleder@img.cas.cz)


License


This project is licensed under the GNU Affero General Public License v. 3.0.