Retrovirus
Retroviruses are a distinctive group of enveloped RNA viruses best known for their ability to convert their RNA genome into DNA and integrate it into the genome of a host cell. This reverse flow of genetic information, from RNA to DNA rather than the usual DNA to RNA, gives retroviruses their name and underpins their biological behaviour, pathogenic potential, and value as tools in molecular genetics. They infect a wide range of vertebrates, including humans, and have influenced genome evolution for hundreds of millions of years.
Retroviruses (family Retroviridae) are classified into two subfamilies, Orthoretrovirinae and Spumaretrovirinae. The former comprises the genera Alpharetrovirus, Betaretrovirus, Deltaretrovirus, Epsilonretrovirus, Gammaretrovirus and Lentivirus; the latter comprises several spumavirus (foamy virus) genera. These viruses vary in pathogenicity, from apparently benign foamy viruses to highly pathogenic lentiviruses, including HIV-1 and HIV-2.
Fundamental Characteristics and Replication Strategy
The defining property of retroviruses is their replication cycle. After entering a host cell, the virus releases its RNA genome into the cytoplasm. Retroviral reverse transcriptase then copies this RNA into double-stranded DNA. This DNA is integrated into the host genomic DNA by the viral integrase enzyme, producing a provirus. Once integrated, the provirus behaves much like a host gene: the host machinery transcribes it, and the resulting RNA serves both as mRNA for viral proteins and as new viral genomes.
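The reverse transcription step can be illustrated with a toy model. The sketch below assumes only Watson-Crick base pairing and an antiparallel product strand; the function name `reverse_transcribe` and the example sequence are hypothetical, and the model deliberately ignores primer binding, RNase H activity and second-strand synthesis.

```python
# Watson-Crick pairing from an RNA template base to the DNA base that is
# synthesised opposite it.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna: str) -> str:
    """Return the first-strand cDNA (written 5'->3') for an RNA given 5'->3'.

    Toy model only: no primer, no RNase H step, no second strand.
    """
    rna = rna.upper()
    unknown = set(rna) - set(RNA_TO_DNA_COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected RNA bases: {sorted(unknown)}")
    # The new strand is antiparallel to the template, so read the template
    # backwards while complementing each base.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna))


if __name__ == "__main__":
    # Hypothetical 9-nucleotide template, chosen only for illustration.
    print(reverse_transcribe("AUGGCCUAA"))  # prints TTAGGCCAT
```

The output is the reverse complement of the template with thymine in place of uracil, which is the sense in which the product is "complementary" DNA.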
This strategy ensures lifelong persistence of the viral genome within infected cells. The integrated provirus can remain latent or actively produce viral particles depending on cellular conditions. Because many retroviruses target immune cells or dividing cells, they can profoundly influence host physiology and immunity.
Several retroviruses cause serious disease. Examples include human T-lymphotropic virus type 1 (HTLV-1), which can cause an aggressive adult T-cell leukaemia/lymphoma, and murine leukaemia viruses, known for their oncogenic properties in rodents. Lentiviruses such as HIV are associated with slow, progressive diseases, while foamy viruses are typically non-pathogenic.
Virion Structure and Components
Retrovirus particles, or virions, are spherical enveloped structures approximately 100 nanometres in diameter. Despite morphological diversity across genera, core structural features are conserved.
The outer lipid envelope, derived from the host cell membrane during viral budding, contains viral glycoproteins encoded by the env gene. These proteins mediate host-cell recognition and membrane fusion. The envelope confers environmental protection, enables endosomal trafficking, and facilitates entry into new host cells.
Inside the envelope is the capsid, composed of Gag-derived proteins. Gag is the most abundant structural protein, with about 2,000–4,000 copies per virion. It includes the matrix (MA), capsid (CA) and nucleocapsid (NC) domains; the NC domain binds the viral RNA genome. Gag alone can drive the assembly and budding of immature particles.
Each virion packages two copies of the single-stranded positive-sense RNA genome, held together as a dimer by base-pairing interactions known as “kissing stem-loops”. Each RNA carries a 5′ cap and a 3′ poly(A) tail, because it is transcribed by host RNA polymerase II and processed by the host mRNA machinery. Non-coding terminal regions (R, U5 and PBS at the 5′ end; PPT, U3 and R at the 3′ end) play critical roles in reverse transcription, primer binding, and genome packaging.
Key retroviral enzymes packaged in the virion include:
- Reverse transcriptase (RT) – synthesises DNA from RNA and includes RNase H activity.
- Integrase (IN) – integrates viral DNA into host chromosomes.
- Protease (PR) – cleaves viral polyproteins during maturation, producing functional structural and enzymatic proteins.
RNase H, part of reverse transcriptase, is essential for removing the RNA strand of RNA–DNA hybrids and generating primers for DNA strand synthesis. Retroviruses lacking RNase H activity are unable to complete reverse transcription and are non-infectious.
Genomic Organisation
Retroviral genomes follow a broadly conserved layout: 5′–gag–pro–pol–env–3′.
- The gag gene encodes capsid, matrix, and nucleocapsid proteins.
- The pro region encodes the viral protease.
- The pol gene encodes reverse transcriptase (including its RNase H domain) and integrase.
- The env gene encodes the surface (SU) and transmembrane (TM) envelope glycoproteins.
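The layout above can be written down as a small ordered data structure. This is a minimal sketch, not a real annotation format: the `Region` class, the `GENOMIC_RNA` tuple and both helper functions are hypothetical names, and the region order simply combines the terminal non-coding regions and the gene order given in the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Region:
    name: str                        # label as used in the text
    products: Tuple[str, ...] = ()   # encoded proteins; empty if non-coding


# 5' non-coding regions, then gag-pro-pol-env, then 3' non-coding regions.
GENOMIC_RNA = (
    Region("R"),
    Region("U5"),
    Region("PBS"),
    Region("gag", ("matrix", "capsid", "nucleocapsid")),
    Region("pro", ("protease",)),
    Region("pol", ("reverse transcriptase", "RNase H", "integrase")),
    Region("env", ("surface glycoprotein", "transmembrane glycoprotein")),
    Region("PPT"),
    Region("U3"),
    Region("R"),
)


def coding_layout(regions=GENOMIC_RNA) -> str:
    """Return the gene order as a 5'-...-3' string, skipping non-coding regions."""
    genes = [region.name for region in regions if region.products]
    return "5'-" + "-".join(genes) + "-3'"


def gene_for(product: str, regions=GENOMIC_RNA) -> str:
    """Return the name of the gene that encodes the given protein."""
    for region in regions:
        if product in region.products:
            return region.name
    raise KeyError(product)


if __name__ == "__main__":
    print(coding_layout())          # prints 5'-gag-pro-pol-env-3'
    print(gene_for("integrase"))    # prints pol
```

Keeping the regions in an ordered tuple makes the 5′ to 3′ order explicit, and the repeated R entry at both ends mirrors the terminal regions listed for the genomic RNA.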
Different retroviruses may express overlapping genes or polyproteins that are cleaved into functional components. Some groups contain accessory genes that regulate viral replication and modulate host responses. Lentiviruses, spumaviruses, and deltaretroviruses such as HTLV and BLV are typical examples of “complex” retroviruses owing to these additional regulatory proteins.
Certain retroviruses carry oncogenes, enabling rapid cell transformation and tumour formation. These transforming retroviruses have played a crucial role in understanding cancer biology and the discovery of proto-oncogenes.
Endogenous Retroviruses and Genome Evolution
When a retrovirus integrates into the germline of a host organism, its genome becomes heritable. These endogenous retroviruses (ERVs) accumulate over evolutionary time, resulting in vast numbers of retroviral-derived sequences within vertebrate genomes. Around 8% of the human genome consists of retroviral sequences.
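To give the 8% figure a sense of scale, the short calculation below converts it into base pairs. The genome size of about 3.1 billion base pairs is an assumed round figure for the haploid human genome and is not taken from the text; the names `HUMAN_GENOME_BP`, `ERV_PERCENT` and `erv_base_pairs` are hypothetical.

```python
# Assumed round figure for the haploid human genome; not stated in the text.
HUMAN_GENOME_BP = 3_100_000_000
# "Around 8%" as quoted in the text.
ERV_PERCENT = 8


def erv_base_pairs(genome_bp: int = HUMAN_GENOME_BP,
                   percent: int = ERV_PERCENT) -> int:
    """Return the number of base pairs covered by the given percentage."""
    # Integer arithmetic avoids floating-point rounding on large counts.
    return genome_bp * percent // 100


if __name__ == "__main__":
    print(f"{erv_base_pairs():,} bp")  # prints 248,000,000 bp
```

Under these assumptions, retrovirus-derived sequence amounts to roughly 250 million base pairs.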
Although many ERVs are non-functional and considered genomic relics, several have acquired biological roles. They participate in:
- the regulation of host gene expression
- the development of the placenta, where retroviral envelope proteins mediate cell fusion
- innate immunity, including resistance to related exogenous retroviruses
- embryonic development and gene activation patterns
ERVs are also investigated in relation to autoimmune diseases and neurological conditions, where abnormal ERV expression may influence pathology.
Retroviruses in Research and Biotechnology
Because retroviruses integrate their genomes efficiently into host cell DNA, they are central tools in modern molecular biology. Retroviral vectors are used extensively in gene therapy, functional genomics, and cell engineering. Retroviral enzymes are laboratory staples in their own right: reverse transcriptase is the basis of cDNA synthesis and of cloning from RNA, and integrase is what allows vectors to insert genes stably into a host genome.
Lentiviruses such as HIV, historically called “slow viruses”, have contributed to a detailed understanding of host–pathogen interactions, while foamy viruses offer benign systems for gene delivery. The long evolutionary history of retroviral–host interactions has enabled researchers to study ancient viral infections, genome evolution, and mechanisms of horizontal gene transfer.