Introduction

Genetic polymorphism refers to the occurrence of two or more distinct alleles at a particular gene locus within a population, where the least common allele has a frequency of at least 1%. These variations arise from differences in the DNA sequence among individuals at that locus and can take the form of single nucleotide differences, insertions, deletions, or larger structural changes. They are inherited and passed down through generations. Variants whose less common allele falls below the 1% threshold are usually considered rare mutations rather than polymorphisms. Polymorphisms are stable over many generations and contribute to the genetic diversity within a population.
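The 1% criterion can be made concrete with a small calculation. The sketch below is illustrative only: it assumes a biallelic locus and diploid genotypes written as two-letter strings, and the function names and sample data are hypothetical.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Frequency of the less common allele at a biallelic locus.

    `genotypes` is a list of diploid genotypes such as "CC" or "CT".
    """
    counts = Counter("".join(genotypes))
    if len(counts) > 2:
        raise ValueError("this sketch only handles biallelic loci")
    if len(counts) < 2:
        return 0.0  # only one allele observed in the sample
    return min(counts.values()) / sum(counts.values())


def is_polymorphism(genotypes, threshold=0.01):
    """Apply the 1% rule described in the text."""
    return minor_allele_frequency(genotypes) >= threshold


# Hypothetical sample: 97 CC individuals and 3 CT individuals (200 alleles).
sample = ["CC"] * 97 + ["CT"] * 3
print(minor_allele_frequency(sample), is_polymorphism(sample))
```

In this made-up sample the T allele accounts for 3 of 200 alleles (1.5%), so the locus would count as polymorphic. In practice the estimate depends on how large and how representative the sample is.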

Differences between polymorphisms and mutations

Polymorphisms and mutations both refer to variations in the DNA sequence, but they differ in frequency, impact, and context within a population. Polymorphisms are common: the less common allele has a frequency of at least 1%, and the variation is stable and persists across generations, contributing to the genetic diversity of a population. Mutations are changes in the DNA sequence that occur less frequently, with the new allele often having a frequency below 1% in the population. Mutations can be unique to an individual or family, especially if they are newly arisen.

Most polymorphisms are neutral or have a minimal effect on an individual’s health and development. However, some can influence traits, disease susceptibility, or drug response. Since they are common, polymorphisms generally represent variations that have been tolerated by natural selection. Mutations, on the other hand, can have a wide range of effects, from benign to harmful. Some mutations can lead to genetic disorders or increase the risk of developing certain diseases. Other mutations might be neutral or even beneficial, potentially becoming more common over time through natural selection.

Polymorphisms are often seen as established variations within a population’s gene pool. They play a role in evolutionary processes, such as adaptation and speciation, and are maintained through balancing selection, genetic drift, or migration. Mutations are the source of new genetic variation and are the raw material for evolution. While most mutations are neutral or deleterious, a few can be advantageous and, over time, may increase in frequency within a population, potentially leading to new polymorphisms.

Historical background of research on polymorphism

Gregor Mendel (1822–1884) is considered the father of genetics. Mendel’s work laid the foundation for understanding how traits are inherited. His experiments with pea plants in the 1860s demonstrated that traits are passed down through discrete units (now known as genes), and that these units can exist in different forms, or alleles. While Mendel’s work focused on simple traits with clear dominant and recessive alleles, it set the stage for later discoveries related to genetic variation.

Theodosius Dobzhansky (1900–1975) is considered a key figure in the modern synthesis of evolutionary biology. Dobzhansky’s work in the 1930s and 1940s was crucial in integrating Mendelian genetics with Darwinian natural selection. His 1937 book, Genetics and the Origin of Species, emphasized the importance of genetic variation within populations for evolution. Dobzhansky was one of the first to use the term “genetic polymorphism,” and his work with fruit flies (Drosophila) demonstrated that natural populations have considerable genetic diversity.

During the same period, J.B.S. Haldane, R.A. Fisher, and Sewall Wright were instrumental in developing the mathematical framework of population genetics. They demonstrated how genetic variation is maintained in populations through mechanisms like mutation, selection, genetic drift, and migration. Their work provided the theoretical foundation for understanding polymorphisms in populations.

Linus Pauling and his colleagues (1949) showed that sickle cell anemia is caused by an abnormal form of hemoglobin, one of the first demonstrations of how a genetic change can lead to a disease. The sickle cell allele later became the classic example of heterozygote advantage, a type of balancing selection: the sickle cell trait provides resistance to malaria, which maintains the sickle cell allele at a high frequency in certain populations. This was an early example of what would later be understood as a polymorphism with a clear evolutionary advantage.

In the 1960s, Harry Harris used electrophoresis to study proteins and revealed (1966) that humans and other organisms have extensive biochemical polymorphisms, such as differences in blood proteins. His work showed that genetic variation at the molecular level was much more common than previously thought.

In the late 1970s, scientists discovered that variations in DNA sequences could be detected using restriction enzymes that cut DNA at specific sequences. These restriction fragment length polymorphisms (RFLPs) were among the first DNA polymorphisms identified and were used extensively in genetic mapping, including the early stages of the Human Genome Project. In the 1980s and 1990s, advances in DNA sequencing technologies led to the discovery of single nucleotide polymorphisms (SNPs) and microsatellites (short tandem repeats, or STRs). These became crucial tools in genetic research, enabling detailed studies of genetic diversity, evolutionary biology, and disease susceptibility.
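The way heterozygote advantage keeps an allele such as the sickle cell allele in a population can be illustrated with the standard one-locus selection model. The sketch below is only an illustration. It assumes random mating and relative fitnesses of 1 - s for normal homozygotes, 1 for heterozygotes, and 1 - t for sickle cell homozygotes; the values of s and t are hypothetical, not measured.

```python
def next_frequency(q, s, t):
    """Frequency of allele S after one generation of selection.

    Relative fitnesses (assumed): AA = 1 - s, AS = 1, SS = 1 - t.
    """
    p = 1.0 - q
    mean_fitness = p * p * (1 - s) + 2 * p * q + q * q * (1 - t)
    return (p * q + q * q * (1 - t)) / mean_fitness


def simulate(q0, s, t, generations):
    """Iterate the recursion starting from frequency q0."""
    q = q0
    for _ in range(generations):
        q = next_frequency(q, s, t)
    return q


def equilibrium_frequency(s, t):
    """Stable equilibrium under heterozygote advantage: s / (s + t)."""
    return s / (s + t)


# Hypothetical selection coefficients, chosen only for illustration.
s, t = 0.1, 0.8
print(simulate(0.01, s, t, 500), equilibrium_frequency(s, t))
```

Whether the allele starts rare or common, the simulated frequency settles near s / (s + t). This is the sense in which balancing selection maintains a polymorphism even when one homozygote is strongly disadvantaged.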

The completion of the Human Genome Project in 2003 marked a significant milestone in understanding human genetic variation. The project revealed that while the human genome is largely conserved, millions of SNPs and other polymorphisms contribute to the diversity observed within and between populations. Since the early 2000s, genome-wide association studies (GWAS) have become a standard method for identifying genetic variants, including SNPs, associated with complex traits and diseases. These studies have underscored the importance of genetic polymorphisms in understanding the genetic basis of complex diseases like diabetes, heart disease, and mental disorders.

Types of Genetic Polymorphisms

  • Single Nucleotide Polymorphisms (SNPs)
  • Insertion/Deletion Polymorphisms (Indels)
  • Copy Number Variants (CNVs)
  • Microsatellites (Short Tandem Repeats – STRs)
  • Other Forms

Single Nucleotide Polymorphisms (SNPs)

Single Nucleotide Polymorphisms (SNPs) are the most common type of genetic variation among humans and other organisms. A SNP is a single base-pair change at a specific position in the genome, for example the replacement of a cytosine (C) with a thymine (T) at a particular site. SNPs are found throughout the genome, both within genes and in the non-coding regions between genes. For a variation to be classified as a SNP, the less common allele must have a frequency of at least 1% in the population; variations with lower frequencies are typically considered rare mutations.

Coding SNPs

These occur within the coding regions (exons) of genes and can affect the protein encoded by that gene. Coding SNPs are either synonymous or non-synonymous. Synonymous SNPs (also known as silent mutations) do not change the amino acid sequence of the protein because of the redundancy of the genetic code. Non-synonymous SNPs change the amino acid sequence and are further divided into missense SNPs, which result in a different amino acid in the protein and can alter its function or stability, and nonsense SNPs, which introduce a premature stop codon, leading to a truncated and often nonfunctional protein.
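The classification above follows mechanically from the genetic code, as the following sketch illustrates. It builds the standard codon table and compares the amino acid before and after a single-base change. The function name is hypothetical, and the stop-lost case is an extra category outside those described in the text.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_coding_snp(ref_codon, alt_codon):
    """Classify a single-base change within one codon (DNA, coding strand)."""
    if len(ref_codon) != 3 or len(alt_codon) != 3:
        raise ValueError("codons must be three bases long")
    if sum(r != a for r, a in zip(ref_codon, alt_codon)) != 1:
        raise ValueError("a SNP changes exactly one base")
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    if ref_aa == "*":
        return "stop-lost"  # not covered in the text; kept separate for clarity
    return "missense"


print(classify_coding_snp("GAA", "GAG"))  # Glu -> Glu
print(classify_coding_snp("GAG", "GTG"))  # Glu -> Val
print(classify_coding_snp("CAG", "TAG"))  # Gln -> stop
```

The three calls give a synonymous, a missense, and a nonsense change respectively. The second one is the same glutamate-to-valine substitution that underlies sickle cell hemoglobin.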

Non-Coding SNPs

These occur in regions of the genome that do not code for proteins, such as introns, promoters, or intergenic regions. Despite not altering protein sequences, non-coding SNPs can influence gene expression, splicing, or the regulation of gene activity.

SNPs often arise during DNA replication when an incorrect nucleotide is incorporated into the new DNA strand. Environmental factors, such as UV radiation or chemical exposure, can cause DNA damage that, if not properly repaired, may result in a SNP. Natural selection, genetic drift, and other evolutionary forces can maintain or increase the frequency of SNPs in a population over time. SNPs are distributed throughout the genome. In humans, they occur approximately once every 100-300 base pairs on average. The frequency of specific SNPs can vary significantly between populations. This variation is often used to study population genetics and human evolution.

Some SNPs are associated with an increased risk of certain diseases. For example, the ε4 allele of the APOE gene, which is defined by SNPs within the gene, is linked to a higher risk of developing Alzheimer’s disease. SNPs can also influence how individuals metabolize drugs. For instance, SNPs in the CYP2C9 gene affect the metabolism of warfarin, a commonly used anticoagulant, requiring dosage adjustments based on the patient’s genotype. Many SNPs contribute to normal variation in traits such as height, skin color, and eye color. These SNPs can have small additive effects that, in combination, influence complex traits.

SNP arrays detect thousands of SNPs simultaneously and are used for genotyping individuals. They are commonly used in genome-wide association studies (GWAS), which scan the genomes of many individuals to find SNPs associated with particular traits or diseases. These studies have identified numerous SNPs linked to conditions like diabetes, heart disease, and mental disorders. Next-generation sequencing (NGS) technologies allow for the high-throughput detection of SNPs across the entire genome, providing comprehensive data on genetic variation. Understanding an individual’s SNP profile can guide personalized medical treatments. For example, pharmacogenetic testing can determine the most effective drug and dosage for a patient based on their SNPs.
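At the level of a single SNP, an association study compares allele counts between affected individuals (cases) and unaffected individuals (controls). The sketch below shows one simple way to do this, a Pearson chi-square test on a 2×2 table of allele counts. It is an assumption made for illustration rather than the method of any particular study, and real analyses also correct for testing many SNPs at once. The counts are invented.

```python
import math


def allelic_association(case_alt, case_ref, control_alt, control_ref):
    """Chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Returns (chi_square, p_value, odds_ratio).
    """
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column needs a non-zero total")
    chi_square = n * (a * d - b * c) ** 2 / denominator
    # For one degree of freedom, the upper tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    odds_ratio = (a * d) / (b * c) if b * c else math.inf
    return chi_square, p_value, odds_ratio


# Invented counts: the alternative allele is seen 60 times among 100 case
# alleles and 40 times among 100 control alleles.
chi_square, p_value, odds_ratio = allelic_association(60, 40, 40, 60)
print(chi_square, p_value, odds_ratio)
```

An odds ratio above 1 indicates that the allele is more frequent in cases, and the p-value indicates how surprising that difference would be if the SNP had no association with the trait.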

 

Periobasics: A Textbook of Periodontics and Implantology

The book is usually delivered within one week anywhere in India and within three weeks anywhere throughout the world.

India Users:

Buy with Instamojo

International Users:

Buy with PayPal

Insertion/Deletion Polymorphisms (Indels)

Insertion/Deletion Polymorphisms (Indels) are a type of genetic variation where small segments of DNA are inserted into or deleted from the genome. These variations can range from a single nucleotide to several hundred base pairs in length and can occur anywhere in the genome, including within genes or in non-coding regions. Indels are an important source of genetic diversity and can have significant effects on gene function, phenotype, and disease susceptibility. Coding region indels occur within the coding sequences (exons) of genes and can directly affect the amino acid sequence of the resulting protein.

  • Frameshift Indels: If the number of nucleotides inserted or deleted is not a multiple of three, it can cause a frameshift mutation, altering the reading frame of the gene. This often leads to a completely different and usually nonfunctional protein.
  • In-Frame Indels: If the indel involves a multiple of three nucleotides, the reading frame is preserved, but one or more amino acids may be added or deleted in the protein, potentially affecting its function.
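The distinction between the two categories comes down to whether the change in length is divisible by three. A minimal sketch, with a hypothetical function name and placeholder sequences:

```python
def classify_coding_indel(ref_allele, alt_allele):
    """Return (kind, frame_effect) for an indel within a coding sequence."""
    length_change = len(alt_allele) - len(ref_allele)
    if length_change == 0:
        raise ValueError("alleles have the same length, so this is not an indel")
    kind = "insertion" if length_change > 0 else "deletion"
    frame_effect = "in-frame" if length_change % 3 == 0 else "frameshift"
    return kind, frame_effect


# Placeholder sequences: only their lengths matter for this classification.
print(classify_coding_indel("ACTT", "A"))           # 3 bases removed
print(classify_coding_indel("A" + "C" * 32, "A"))   # 32 bases removed
print(classify_coding_indel("A", "AGG"))            # 2 bases added
```

A 3-base deletion leaves the reading frame intact, whereas a 32-base deletion or a 2-base insertion shifts it.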

Indels in coding regions can have significant effects on protein function, particularly if they cause frameshift mutations or alter critical regions of the protein. For example, the most common cause of cystic fibrosis is an in-frame deletion of three nucleotides in the CFTR gene (ΔF508), which removes a single phenylalanine from the protein and results in a severe genetic disorder. Other well-known coding region indels include the CCR5-Δ32 deletion and indels in the BRCA1 and BRCA2 genes. CCR5-Δ32 is a 32-base pair deletion in the CCR5 gene that provides resistance to HIV infection. Individuals homozygous for this deletion are resistant to the most common strain of HIV, as the virus cannot effectively enter their cells. Indels in the BRCA1 or BRCA2 genes are associated with an increased risk of breast and ovarian cancer. Some of these indels lead to frameshift mutations, resulting in truncated and nonfunctional proteins that fail to repair DNA damage properly.

Indels outside the coding sequence can also affect genes. Indels in regulatory regions can impact gene expression by altering transcription factor binding sites, enhancers, or silencers. This can lead to changes in when, where, and how much of a gene is expressed. Indels within introns or near splice sites can affect the splicing process, potentially leading to the inclusion of intronic sequences in the mRNA or the exclusion of exonic sequences, resulting in aberrant proteins.

Non-coding region indels occur outside of coding regions, such as in introns, promoters, or intergenic regions. While they do not directly alter protein sequences, non-coding indels can influence gene expression, splicing, regulatory elements, or chromatin structure.

Indels can originate by multiple mechanisms. During DNA replication, the DNA polymerase may slip, causing it to insert or skip a small number of nucleotides, leading to an indel. Indels can also result from errors during the repair of DNA damage, such as during the non-homologous end joining (NHEJ) process, which may lead to small insertions or deletions at the repair site. The movement of transposable elements (mobile genetic elements) within the genome can also create insertions or deletions when these elements are inserted or excised from the DNA.

Various methods can be used to detect indels. Smaller indels can be detected using polymerase chain reaction (PCR) followed by gel electrophoresis, where the presence of an insertion or deletion alters the size of the PCR product. NGS technologies are highly effective in detecting indels across the genome. Sequencing allows for precise identification of the location and size of indels, including those that may be difficult to detect with other methods. Computational tools are also used to analyze sequencing data and identify indels, predict their functional impact, and assess their frequency in different populations.
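The PCR-based approach can be reasoned about with simple arithmetic: an indel between the primers shifts the product size by the length of the insertion or deletion. The sketch below lists the band sizes one would expect on a gel for a diploid sample. The 200 bp reference amplicon is a hypothetical value, and the function name is illustrative.

```python
def expected_bands(reference_amplicon_bp, length_change_bp, variant_copies):
    """Band sizes expected for a diploid sample.

    length_change_bp is negative for a deletion and positive for an insertion;
    variant_copies is 0, 1, or 2 copies of the indel allele.
    """
    if variant_copies not in (0, 1, 2):
        raise ValueError("a diploid sample carries 0, 1, or 2 copies")
    variant_amplicon_bp = reference_amplicon_bp + length_change_bp
    if variant_amplicon_bp <= 0:
        raise ValueError("the deletion is larger than the amplicon")
    bands = set()
    if variant_copies < 2:
        bands.add(reference_amplicon_bp)
    if variant_copies > 0:
        bands.add(variant_amplicon_bp)
    return sorted(bands)


# Hypothetical 200 bp amplicon spanning a 32 bp deletion.
for copies in (0, 1, 2):
    print(copies, expected_bands(200, -32, copies))
```

A heterozygous sample shows both bands, while each homozygote shows only one. Resolving the two sizes depends on the gel having enough resolution for the difference in length.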

Identifying indels that are associated with diseases helps in understanding the genetic basis of these conditions and can aid in developing diagnostic tests. For example, screening for specific indels in the BRCA1 and BRCA2 genes is part of cancer risk assessment. Indels are used in evolutionary biology to study genetic divergence between species, track evolutionary events, and understand population dynamics. Indels, particularly short indels, are used in forensic science for DNA profiling and identifying individuals in criminal investigations.

 

Copy Number Variants (CNVs)

Copy Number Variants (CNVs) are a type of genetic variation where segments of DNA are duplicated or deleted, leading to differences in the number of copies of a particular gene or genomic region. These structural variations can range in size from a few kilobases to several megabases and are a significant source of genetic diversity among individuals. CNVs are found in both healthy individuals and those with various diseases, making them a crucial area of study in genetics and genomics. The concept of CNVs was first brought to light with the advent of high-resolution microarray technologies in the early 2000s. Prior to this, genetic variation was mostly understood in terms of single nucleotide polymorphisms (SNPs). However, CNVs were soon recognized as a more substantial contributor to genetic diversity, accounting for a larger portion of the genomic differences between individuals. Research has shown that CNVs cover a considerable portion of the human genome, with each individual having thousands of CNVs. These variations can be inherited or arise de novo, meaning they occur spontaneously in an individual without being present in either parent’s genome. CNVs have been observed to influence gene expression and can have both benign and deleterious effects on health.

CNVs are broadly classified into two categories: duplications and deletions. A duplication occurs when a segment of DNA is copied one or more times, resulting in multiple copies of that segment. Conversely, a deletion involves the loss of a segment of DNA, leading to fewer copies than typically present. The formation of CNVs is often attributed to errors during DNA replication or repair. Several mechanisms can lead to CNV formation, including non-allelic homologous recombination (NAHR), non-homologous end joining (NHEJ), and microhomology-mediated replication errors. NAHR, for instance, occurs when recombination happens between sequences that are similar but not identical, leading to misalignment and duplication or deletion of genomic segments.

CNVs can have a significant impact on gene function and expression. By altering the number of copies of a gene, CNVs can affect the dosage of the gene’s product, potentially leading to overexpression or underexpression. This change in gene dosage can have various consequences, depending on the gene’s role in the body. For instance, a CNV that leads to the duplication of a gene involved in cell growth could contribute to cancer development by promoting uncontrolled cell proliferation. Moreover, CNVs can disrupt gene regulation and the structure of the genome. When a CNV encompasses regulatory elements or coding regions, it can interfere with normal gene function, potentially leading to disease. In some cases, CNVs can also cause genomic instability, increasing the risk of further genetic alterations.
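The dosage argument can be expressed as a ratio between the observed copy number and the usual two copies of an autosomal region. The sketch below makes the deliberately simple assumption that the amount of gene product scales linearly with copy number. Real genes often deviate from this, and the function name is hypothetical.

```python
def describe_copy_number(copies, expected_copies=2):
    """Return (category, relative_dosage) for a region in a diploid genome.

    relative_dosage assumes expression is proportional to copy number,
    which is a simplification.
    """
    if copies < 0 or expected_copies <= 0:
        raise ValueError("copy numbers must be non-negative")
    if copies < expected_copies:
        category = "deletion"
    elif copies > expected_copies:
        category = "duplication"
    else:
        category = "normal"
    return category, copies / expected_copies


for copies in (1, 2, 3, 4):
    print(copies, describe_copy_number(copies))
```

Under this assumption, losing one copy halves the expected dosage and gaining one copy raises it by half.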

The association between CNVs and disease has been extensively studied, revealing that CNVs play a role in various disorders. For example, certain CNVs are linked to neurodevelopmental disorders like autism and schizophrenia. Deletions or duplications of specific genomic regions have been associated with increased susceptibility to these conditions, highlighting the importance of CNVs in understanding their genetic basis. In addition to neurodevelopmental disorders, CNVs have been implicated in a range of other diseases, including cancer, cardiovascular disease, and immune disorders. For instance, CNVs in genes related to the immune system have been linked to autoimmune diseases like lupus and rheumatoid arthritis. The presence or absence of specific CNVs can influence an individual’s risk of developing these conditions.

Microsatellites (Short Tandem Repeats – STRs)

Microsatellites, also known as Short Tandem Repeats (STRs), are short sequences of DNA that consist of repeating units of 2-6 base pairs. These repetitive sequences are distributed throughout the genome and are highly polymorphic, meaning that the number of repeat units can vary significantly between individuals. This high degree of variability makes microsatellites valuable markers in genetics and genomics, with applications ranging from forensic analysis to population genetics and disease research. The basic structure of a microsatellite is a tandem repeat, where the same sequence of nucleotides is repeated several times in a row. For example, a common microsatellite might consist of the sequence “AC” repeated 10 times in a row, denoted as (AC)₁₀. The number of repeats can vary widely between different microsatellites and between different individuals at the same microsatellite locus. Microsatellites are found in both coding and non-coding regions of the genome, though they are more prevalent in non-coding regions. They are unevenly distributed across the genome, with some regions being particularly rich in microsatellites. This uneven distribution can be influenced by various factors, including the local sequence context and the mechanisms of DNA replication and repair.
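A tandem repeat such as (AC)₁₀ can be located in a sequence with a regular expression that captures a short unit and then requires further copies of it. The sketch below is a minimal illustration rather than a replacement for dedicated repeat-finding tools. The minimum of three repeats and the example sequence are arbitrary assumptions.

```python
import re


def find_microsatellites(sequence, min_unit=2, max_unit=6, min_repeats=3):
    """Return (start, unit, repeat_count) for each perfect tandem repeat found."""
    # Capture a unit of min_unit..max_unit bases (shortest first), then require
    # at least min_repeats - 1 further copies of it immediately afterwards.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    results = []
    for match in pattern.finditer(sequence):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # a run of one base (e.g. AAAAAA) is not a 2-6 bp repeat
        repeat_count = len(match.group(0)) // len(unit)
        results.append((match.start(), unit, repeat_count))
    return results


# Made-up sequence containing (AC) repeated 10 times.
example = "GGT" + "AC" * 10 + "TTG"
print(find_microsatellites(example))
```

Only perfect, uninterrupted repeats are reported. Interrupted or imperfect repeats, which are common in real genomes, would need a more tolerant method.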

The high variability in microsatellite length is primarily due to the process of DNA replication. During replication, the DNA polymerase enzyme can slip on the template strand, leading to the addition or deletion of repeat units. This process is known as “slipped-strand mispairing” and is the main mechanism behind the length polymorphism observed in microsatellites. Mutations in microsatellites occur at a relatively high rate compared to other types of genetic mutations. The mutation rate of microsatellites can be influenced by factors such as the length of the repeat sequence, the composition of the repeat unit, and the surrounding genomic context. For example, longer microsatellites tend to have higher mutation rates, as do those composed of certain nucleotide pairs, such as AT-rich repeats.
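The effect of slippage over many generations can be illustrated with a stepwise model in which each slippage event adds or removes one repeat unit. The sketch below is a toy simulation under assumed parameters. The slippage probability is hypothetical, gains and losses are treated as equally likely, and the probability does not depend on repeat length, although the text notes that in reality it does.

```python
import random


def simulate_slippage(repeats, generations, slip_probability, rng):
    """Follow one allele through the generations under a stepwise model."""
    for _ in range(generations):
        if rng.random() < slip_probability:
            step = rng.choice((-1, 1))       # gain or lose one repeat unit
            repeats = max(1, repeats + step)  # keep at least one unit
    return repeats


# Five independent lineages starting from 10 repeats (hypothetical rate).
rng = random.Random(42)
lineages = [simulate_slippage(10, 1000, 0.01, rng) for _ in range(5)]
print(lineages)
```

Lineages that start from the same allele drift to different repeat counts, which is the kind of length polymorphism observed at microsatellite loci.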

Due to their high polymorphism and ease of genotyping, microsatellites are widely used as genetic markers in various fields of research. In forensic science, microsatellites are the foundation of DNA fingerprinting, a technique used to identify individuals based on their unique genetic profiles. By analyzing a set of microsatellite loci, forensic scientists can generate a DNA profile that is highly specific to an individual, making it a powerful tool for criminal investigations, paternity testing, and identifying remains. In population genetics, microsatellites are used to study genetic diversity, population structure, and evolutionary relationships. By comparing microsatellite variation across populations, researchers can infer patterns of migration, gene flow, and population history. This information is valuable for conservation biology, where understanding the genetic diversity of endangered species can inform conservation strategies. Microsatellites are also employed in the study of genetic diseases. Certain diseases, known as trinucleotide repeat disorders, are caused by the expansion of specific microsatellite sequences. For example, Huntington’s disease is associated with the expansion of a CAG repeat in the HTT gene. As the number of repeats increases beyond a certain threshold, the likelihood of developing the disease also increases. Similarly, Fragile X syndrome is caused by the expansion of a CGG repeat in the FMR1 gene. These examples illustrate how microsatellites can play a direct role in disease pathology.
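The specificity of a microsatellite-based DNA profile can be illustrated by multiplying genotype frequencies across loci. The sketch below assumes Hardy-Weinberg proportions within each locus and independence between loci. Neither assumption is stated in the text above. The locus names and allele frequencies are invented.

```python
def random_match_probability(profile, allele_frequencies):
    """Probability that a random, unrelated person has the same genotypes.

    profile maps locus -> (allele_1, allele_2), alleles given as repeat counts;
    allele_frequencies maps locus -> {allele: population frequency}.
    """
    probability = 1.0
    for locus, (allele_1, allele_2) in profile.items():
        f1 = allele_frequencies[locus][allele_1]
        f2 = allele_frequencies[locus][allele_2]
        # Hardy-Weinberg genotype frequency: p^2 for homozygotes, 2pq otherwise.
        probability *= f1 * f1 if allele_1 == allele_2 else 2 * f1 * f2
    return probability


# Invented loci and frequencies, for illustration only.
frequencies = {
    "LOCUS_A": {12: 0.10, 14: 0.20},
    "LOCUS_B": {8: 0.50},
}
profile = {"LOCUS_A": (12, 14), "LOCUS_B": (8, 8)}
print(random_match_probability(profile, frequencies))
```

Each additional locus multiplies in another factor smaller than one, which is why a profile built from many loci becomes highly specific to an individual.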

One of the key advantages of microsatellites is their high level of polymorphism, which makes them informative markers for a wide range of genetic studies. Additionally, microsatellite analysis is relatively straightforward and cost-effective, with established methods for genotyping and analysis. However, there are also limitations to the use of microsatellites. One challenge is the potential for “stutter” during PCR amplification, where the DNA polymerase slips on the repeat tract and produces artifact products, typically one repeat unit shorter than the true allele, that complicate the interpretation of results. Furthermore, the high mutation rate of microsatellites, while useful for certain studies, can also introduce noise into genetic analyses, because alleles of the same length may have arisen independently rather than by shared descent.

Other forms of polymorphisms

Beyond the well-known types of genetic polymorphisms such as SNPs, Indels, CNVs, and Microsatellites, there are other, less common forms of genetic variation that also contribute to genetic diversity. These include:

Variable Number Tandem Repeats (VNTRs)

Variable Number Tandem Repeats (VNTRs) are similar to microsatellites but involve longer repeat units, typically ranging from 10 to 100 base pairs. The number of repeat units can vary significantly among individuals, leading to different alleles at a VNTR locus. VNTRs are highly polymorphic and are often used in forensic analysis, genetic mapping, and studies of population genetics. They can also play a role in gene regulation and expression.

Large-Scale Structural Variants (SVs)

Structural Variants (SVs) include larger alterations in the genome, such as large deletions, duplications, inversions, and translocations. These variants typically involve segments of DNA that are larger than 1,000 base pairs. SVs can have significant effects on gene function and expression. They are often associated with various diseases, including cancer, and can lead to chromosomal rearrangements that disrupt normal genetic function.

Copy Number Polymorphisms (CNPs)

Copy Number Polymorphisms (CNPs) are CNVs that are common in a population, conventionally those present at a frequency of at least 1%, and they tend to involve smaller segments of DNA. Like other CNVs, CNPs are variations in the number of copies of a particular gene or genomic region, but they are often less impactful than rare, larger CNVs. CNPs can still influence gene expression and contribute to phenotypic variation. They are studied in the context of complex traits and diseases, such as susceptibility to infections or response to drugs.

Mobile Element Insertions (MEIs)

Mobile Element Insertions (MEIs) are a type of polymorphism involving the insertion of mobile genetic elements, such as transposons or retrotransposons, into different locations in the genome. These elements can copy and paste themselves into new genomic locations, leading to insertional polymorphisms. MEIs can disrupt genes, alter gene expression, and contribute to genetic diversity. They have played a significant role in shaping the human genome and are associated with certain genetic disorders.

Inversions

Inversions are a type of structural variation where a segment of a chromosome is reversed end to end. This rearrangement does not involve a change in the overall amount of genetic material but can alter the function and regulation of genes within the inverted region. Inversions can suppress recombination within the inverted segment, leading to reduced genetic diversity in that region. They have been linked to certain diseases and can affect traits such as fertility and adaptation.
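When an inverted segment is read along the same strand as the rest of the chromosome, it appears as the reverse complement of the original segment, while the total length stays the same. A minimal sketch, with a hypothetical function name and a made-up sequence:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def invert_segment(sequence, start, end):
    """Return the sequence with the half-open interval [start, end) inverted."""
    if not 0 <= start <= end <= len(sequence):
        raise ValueError("segment coordinates fall outside the sequence")
    segment = sequence[start:end]
    inverted = segment.translate(COMPLEMENT)[::-1]  # reverse complement
    return sequence[:start] + inverted + sequence[end:]


original = "AAACCGTTT"
rearranged = invert_segment(original, 3, 6)
print(original, rearranged, len(original) == len(rearranged))
```

Applying the same inversion a second time restores the original sequence, which reflects the fact that no genetic material is gained or lost.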

Balanced Translocations

Balanced translocations involve the exchange of segments between two non-homologous chromosomes without any loss or gain of genetic material. While the total amount of DNA remains unchanged, the rearrangement can disrupt genes at the breakpoints. Balanced translocations are often associated with reproductive issues, such as infertility or recurrent miscarriages, and can lead to the development of cancers if they disrupt oncogenes or tumor suppressor genes.

Epigenetic Variations

Epigenetic variations involve changes in gene expression that do not involve alterations in the DNA sequence itself. These changes can occur through mechanisms such as DNA methylation, histone modification, or non-coding RNA interactions. Epigenetic changes are crucial for regulating gene expression in response to environmental factors and during development. They play a role in various diseases, including cancer, neurological disorders, and metabolic conditions.

Conclusion

Genetic polymorphisms are fundamental to the diversity observed within and across populations, playing a crucial role in evolution, adaptation, and the susceptibility to various diseases. From the single nucleotide changes seen in SNPs to the more complex structural variations like CNVs, microsatellites, and other forms, these variations shape our genetic identity and influence our health and traits. The study of genetic polymorphisms has profound implications for fields such as personalized medicine, where understanding these differences can lead to more targeted and effective treatments. Additionally, they provide insights into population history, evolutionary processes, and the mechanisms underlying many genetic disorders.

References

References are available in the hard copy of “Periobasics: A Textbook of Periodontics and Implantology”.
