DNA sequencing

DNA sequencing is the process of determining the exact order of the four nucleotides, adenine (A), thymine (T), cytosine (C), and guanine (G), within a DNA molecule. It is one of the most fundamental techniques in molecular biology and genetics, and it can provide the complete genetic blueprint of an organism. DNA sequencing enables scientists to study genes, diagnose genetic disorders, identify pathogens, and understand evolutionary relationships.

Concept and Importance

Every living organism has DNA as its hereditary material, which carries instructions for its growth, development, and function. The sequence of nucleotides in DNA determines the genetic code, which directs the synthesis of proteins through transcription and translation.
DNA sequencing reveals this code, allowing researchers to:

  • Identify genes and their mutations.
  • Study genetic variation and inheritance.
  • Understand molecular mechanisms of diseases.
  • Develop targeted medicines and personalised treatments.
  • Trace evolutionary and phylogenetic relationships among species.
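The path from a DNA sequence to a protein described above can be illustrated with a minimal Python sketch. It assumes the input is the coding strand read 5′ to 3′, starts translating at the first base, and uses the standard genetic code; the function names `transcribe` and `translate` are chosen here for illustration only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order
# at each of the three positions (third position varies fastest).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """Return the mRNA for a DNA coding strand (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translate mRNA codon by codon, stopping at the first stop codon (*)."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTGGTAA")
protein = translate(mrna)
```

Here the twelve-base input yields the mRNA `AUGGCCUGGUAA` and the one-letter peptide `MAW` (methionine, alanine, tryptophan), with `UAA` acting as the stop codon.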

The advent of rapid sequencing technologies has revolutionised biology, leading to projects such as the Human Genome Project (HGP) and the rise of genomics and precision medicine.

Structure of DNA

To understand sequencing, it is essential to recall that DNA is a double-helical molecule composed of two complementary strands. Each strand consists of nucleotides made up of:

  • A phosphate group
  • A deoxyribose sugar
  • One of four nitrogenous bases: adenine (A), thymine (T), cytosine (C), or guanine (G)

The bases pair specifically (A with T, and C with G), forming the rungs of the helical ladder. DNA sequencing determines the exact arrangement of these bases along one strand, from which the complementary strand can be inferred.
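Inferring the complementary strand from a sequenced strand is a simple lookup. The sketch below is one way to do it in Python; the helper names are illustrative, and it assumes the input contains only the letters A, C, G, and T. Because the two strands run in opposite directions, the partner strand is conventionally written as the reverse complement.

```python
# Pairing rules: A with T, and C with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand):
    """Return the base-by-base complement, in the same left-to-right order."""
    return strand.upper().translate(COMPLEMENT)


def reverse_complement(strand):
    """Return the partner strand written in its own 5' to 3' direction."""
    return complement(strand)[::-1]


partner = reverse_complement("ATGC")
```

For the input `ATGC`, the complement is `TACG` and the reverse complement is `GCAT`.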

Early Methods of DNA Sequencing

The history of DNA sequencing began in the 1970s, when scientists first developed techniques to determine short nucleotide sequences.

1. Maxam–Gilbert Method (Chemical Degradation Method, 1977)

Developed by Allan Maxam and Walter Gilbert, this method relied on selective chemical modification and cleavage of DNA at specific bases.

  • DNA was labelled with a radioactive marker.
  • Chemical reagents selectively broke the DNA at particular nucleotides.
  • Fragments were separated by gel electrophoresis to read the sequence.

Although accurate, this method was complex, hazardous, and labour-intensive, as it used toxic chemicals like hydrazine and required radioactive materials.

2. Sanger’s Chain Termination Method (1977)

Developed by Frederick Sanger, this became the most widely used DNA sequencing method for decades. It relies on selective incorporation of chain-terminating dideoxynucleotides (ddNTPs) during DNA synthesis.

Principle:

  • A DNA template is copied using DNA polymerase.
  • A mixture of normal nucleotides (dNTPs) and modified nucleotides (ddNTPs) is used.
  • When a ddNTP is incorporated, it terminates chain elongation because it lacks the 3′-hydroxyl group required for forming the next phosphodiester bond.
  • The resulting DNA fragments of various lengths are separated by electrophoresis and detected to reveal the sequence.
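The chain-termination principle can be mimicked with a small simulation. This is a toy sketch, not a model of real chemistry: the termination probability `dd_fraction`, the number of molecules, and the function names are all assumptions made for illustration. Each simulated molecule is extended base by base and stops wherever a ddNTP happens to be incorporated; sorting the labelled fragments by length then spells out the newly synthesised strand, which is complementary to the template.

```python
import random


def extend_once(new_strand, dd_fraction, rng):
    """Copy one molecule; return the terminated fragment, or None."""
    for i in range(len(new_strand)):
        if rng.random() < dd_fraction:  # a ddNTP was incorporated here
            return new_strand[: i + 1]
    return None  # no ddNTP incorporated, so no terminator label to detect


def sanger_read(new_strand, molecules=5000, dd_fraction=0.2, seed=0):
    """Recover the sequence from the labels of length-sorted fragments."""
    rng = random.Random(seed)
    label_by_length = {}
    for _ in range(molecules):
        fragment = extend_once(new_strand, dd_fraction, rng)
        if fragment is not None:
            # The label identifies the terminating ddNTP (the last base).
            label_by_length[len(fragment)] = fragment[-1]
    # Electrophoresis orders fragments by length, shortest first.
    return "".join(label_by_length[n] for n in sorted(label_by_length))


read = sanger_read("GATTACA")
```

With thousands of molecules, every fragment length is represented, so the read reproduces the full synthesised strand. Longer strands would need more molecules or a lower termination probability, which loosely mirrors why read length is limited in practice.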

Advantages:

  • Highly accurate for small to medium-sized DNA fragments.
  • Automated versions using fluorescently labelled ddNTPs allowed rapid sequencing.

This method was the foundation of the Human Genome Project, declared complete in 2003, which sequenced most of the approximately 3 billion base pairs of human DNA.

Next-Generation Sequencing (NGS) Technologies

Traditional Sanger sequencing, while precise, was time-consuming and expensive for large-scale projects. The development of Next-Generation Sequencing (NGS) in the early 21st century transformed the field by allowing massively parallel sequencing of millions of DNA fragments simultaneously.

Key Features of NGS

  • High throughput: Can sequence entire genomes in days.
  • Cost-effective: Dramatically lower per-base cost compared to Sanger sequencing.
  • Automation and digitisation: Data collection and analysis are computerised.

Main NGS Platforms

  1. Illumina (Sequencing by Synthesis):
    • Uses fluorescently labelled reversible terminator nucleotides.
    • Billions of short DNA fragments are sequenced in parallel on a flow cell.
    • A camera records the fluorescent signal after each nucleotide addition.
  2. Ion Torrent Sequencing:
    • Detects hydrogen ions released during DNA synthesis instead of fluorescence.
    • Converts chemical signals into digital data.
  3. SOLiD Sequencing (Sequencing by Ligation):
    • Uses fluorescently labelled oligonucleotide probes that are ligated to the growing strand where they match the template.
    • Offers high accuracy but more complex analysis.
  4. Pyrosequencing (Roche 454):
    • Detects light produced during DNA synthesis using the enzyme luciferase.
    • Now largely replaced by newer platforms.
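To illustrate how a fluorescent signal recorded after each nucleotide addition becomes a sequence, the sketch below calls one base per cycle by picking the brightest channel. It is a deliberately simplified assumption: it supposes four separate channels in the order A, C, G, T and invented intensity values, whereas real base callers also correct for signal cross-talk and assign a quality score to each call.

```python
CHANNELS = "ACGT"  # assumed channel order for this illustration


def call_bases(intensities):
    """Call one base per cycle from (A, C, G, T) intensity tuples."""
    read = []
    for cycle in intensities:
        # The incorporated nucleotide is taken to be the brightest channel.
        brightest = max(range(len(CHANNELS)), key=lambda i: cycle[i])
        read.append(CHANNELS[brightest])
    return "".join(read)


# Hypothetical intensities for three sequencing cycles of one cluster.
cycles = [
    (0.9, 0.1, 0.0, 0.1),
    (0.1, 0.2, 0.8, 0.0),
    (0.0, 0.1, 0.1, 0.7),
]
read = call_bases(cycles)
```

For these made-up values the called read is `AGT`.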

Third-Generation Sequencing (Long-Read Technologies)

Third-generation sequencing allows the direct reading of single DNA molecules without amplification, providing longer and more continuous sequence reads.

  1. Pacific Biosciences (PacBio SMRT Sequencing):
    • Uses fluorescently labelled nucleotides to detect base incorporation in real-time.
    • Provides long reads (up to 30,000 base pairs) with high accuracy after error correction.
  2. Oxford Nanopore Sequencing:
    • Passes single DNA strands through a nanopore and measures changes in ionic current to identify bases.
    • Offers ultra-long reads and portability, making it ideal for field-based applications.
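The idea of reaching high accuracy through error correction can be sketched as a majority vote over repeated reads of the same molecule. The example below is a simplification under stated assumptions: the reads are taken to be already aligned and of equal length, and they contain only substitution errors. Real long-read data often include insertions and deletions, so actual pipelines align the reads before building a consensus.

```python
from collections import Counter


def consensus(reads):
    """Return the per-position majority base of equal-length, aligned reads."""
    if len({len(read) for read in reads}) != 1:
        raise ValueError("reads must be non-empty and of equal length")
    return "".join(
        Counter(column).most_common(1)[0][0]  # most frequent base in column
        for column in zip(*reads)
    )


# Three hypothetical passes over the same molecule, each with at most one error.
passes = ["ACGTAC", "ACGTTC", "ACCTAC"]
corrected = consensus(passes)
```

Each individual pass may contain an error, but because the errors fall at different positions, the vote recovers `ACGTAC`.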

Applications of DNA Sequencing

DNA sequencing has transformed almost every branch of biological and medical science.

1. Medicine and Health:

  • Identification of genetic mutations responsible for inherited diseases.
  • Cancer genomics for detecting tumour-specific mutations.
  • Pharmacogenomics to design personalised medicine based on a patient's genetic profile.
  • Detection of pathogens and drug resistance genes.

2. Forensic Science:

  • DNA profiling for criminal identification and paternity testing.

3. Agriculture:

  • Sequencing of plant genomes for crop improvement, disease resistance, and genetic modification.

4. Evolutionary Biology:

  • Comparison of genomes to understand evolutionary relationships among species.

5. Microbiology:

  • Identification of new microorganisms and study of microbial communities (metagenomics).

6. Biotechnology and Synthetic Biology:

  • Designing synthetic genes and understanding gene regulation for industrial and pharmaceutical applications.

Limitations and Challenges

Despite remarkable advances, DNA sequencing presents some challenges:

  • Data complexity: Large datasets require sophisticated computational tools for analysis and storage.
  • Cost and accessibility: Although prices have dropped, advanced sequencing technologies remain costly for developing nations.
  • Interpretation of variants: Not all genetic variations have clear clinical implications.
  • Ethical concerns: Genetic data privacy, consent, and potential misuse remain global issues.
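The scale behind the data-complexity point can be estimated with simple arithmetic. The sketch below assumes a genome of 3 billion bases, a hypothetical 30-fold sequencing depth, and an idealised encoding of 2 bits per base (enough to distinguish A, C, G, and T); real files are larger because they also store quality scores and read identifiers.

```python
def raw_bases(genome_size, coverage):
    """Total bases sequenced when each position is read `coverage` times."""
    return genome_size * coverage


def packed_size_bytes(n_bases, bits_per_base=2):
    """Bytes needed to store n_bases at a fixed number of bits per base."""
    return (n_bases * bits_per_base + 7) // 8  # round up to whole bytes


GENOME_SIZE = 3_000_000_000  # approximate human genome length in bases
COVERAGE = 30                # assumed sequencing depth, for illustration

genome_bytes = packed_size_bytes(GENOME_SIZE)
run_bytes = packed_size_bytes(raw_bases(GENOME_SIZE, COVERAGE))
```

Under these assumptions a single genome packs into 750,000,000 bytes (about 750 MB), while the raw bases from one 30-fold run occupy 22,500,000,000 bytes (about 22.5 GB) before any quality data are added.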

Recent Developments and Future Prospects

Modern research is focused on making sequencing faster, cheaper, and more accurate. Emerging technologies like nanopore sequencing, CRISPR-based target enrichment, and AI-driven data analysis are revolutionising genomics. The cost of sequencing a human genome has fallen from millions of dollars to a few hundred dollars.

Future applications include:

  • Precision medicine, tailoring treatment to individual genomes.
  • Real-time pathogen surveillance, crucial for epidemic control.
  • Genomic editing and diagnostics integrated into clinical practice.

Originally written on January 17, 2015 and last modified on November 4, 2025.
