The genomes of eukaryotes can be divided into several functional categories. A strand of DNA is composed of genes and intergenic regions. Genes themselves consist of protein-coding exons and non-coding introns. Introns are excised after the sequence is transcribed into pre-mRNA, leaving only exons in the mature mRNA to code for proteins.
In eukaryotic genomes, genes are separated by large stretches of DNA that do not code for proteins. However, these intergenic regions carry important elements that regulate gene activity, for instance, the promoter where transcription starts, and enhancers and silencers that fine-tune gene expression. Some of these regulatory elements lie far away from the gene they control.
As researchers investigated the process of gene transcription in eukaryotes, they realized that the final mRNA that codes for a protein is shorter than the DNA it is derived from. This difference in length is due to a process called splicing. Once pre-mRNA has been transcribed from DNA in the nucleus, splicing immediately removes introns and joins exons together. The result is protein-coding mRNA that moves to the cytoplasm and is translated into protein.
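The length difference described above can be illustrated with a minimal Python sketch. It is a toy model, not a description of the spliceosome: the sequence, the exon coordinates, and the function name `splice` are all made up for illustration. Uppercase letters stand for exons and lowercase letters for introns.

```python
def splice(pre_mrna, exons):
    """Join the exon segments in order, dropping everything between them.

    exons: list of (start, end) half-open index pairs along the transcript.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical toy transcript: three exons separated by two introns.
# The introns begin with "gu" and end with "ag", as most real introns do.
pre_mrna = "AUGGCC" + "guaaguuuucag" + "UUCGAA" + "guaugacuucag" + "GGAUAA"
exons = [(0, 6), (18, 24), (36, 42)]  # illustrative coordinates only

mature_mrna = splice(pre_mrna, exons)
print(mature_mrna)                      # exons only, joined end to end
print(len(pre_mrna), len(mature_mrna))  # the spliced transcript is shorter
```

In this toy case the pre-mRNA is 42 bases long and the spliced mRNA only 18, which mirrors the observation that the final mRNA is shorter than the DNA it was copied from.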
One of the largest human genes, DMD, is over two million base pairs long. This gene encodes the muscle protein dystrophin. Mutations in DMD cause Duchenne muscular dystrophy, a disorder characterized by progressive muscle deterioration. The gene contains 79 exons and, because each intron lies between two neighboring exons, 78 introns. On the other end of the spectrum lies the histone H1A gene: at only 781 base pairs long, with one exon and no introns, it is one of the smallest genes in the human genome.
Are introns garbage DNA that needs to be removed? Interestingly, introns can carry elements that are important for gene regulation. Furthermore, cutting the initial transcript and re-joining its exons allows the exons to be combined in different ways. This process of mixing and matching exons is known as alternative splicing. It makes it possible to produce several protein variants from a single gene.
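How one gene can yield several transcripts can be sketched in a few lines of Python. The example below models only the simplest case, in which some exons are always kept and others may be included or skipped. The exon sequences, the function name `isoforms`, and the choice of which exons are always kept are hypothetical. Real alternative splicing is regulated by the cell and does not necessarily produce every possible combination.

```python
from itertools import combinations


def isoforms(exons, constitutive):
    """Yield every transcript that keeps the constitutive exons and
    includes or skips each remaining exon, preserving exon order."""
    optional = [i for i in range(len(exons)) if i not in constitutive]
    for k in range(len(optional) + 1):
        for kept in combinations(optional, k):
            chosen = sorted(set(constitutive) | set(kept))
            yield "".join(exons[i] for i in chosen)


# Hypothetical gene with four exons; the first and last are always included.
exons = ["AUGGCC", "UUCGAA", "CCGUGC", "GGAUAA"]
variants = list(isoforms(exons, constitutive={0, 3}))

for v in variants:
    print(v)
print(len(variants), "variants from one gene")
```

With two optional exons this toy gene gives four different transcripts, and each additional optional exon would double that number.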
Did you know that about 99% of your genome does not code for proteins? In the early days of genome research, biologists coined the catchy term ‘junk DNA’ for these seemingly non-functional sequences. Since then, we have learned that a large portion of non-coding DNA does serve important functions. At least 9% of the human genome is involved in gene regulation, which is nine times the share taken up by protein-coding sequences.
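To put these percentages into base pairs, here is a short back-of-the-envelope calculation in Python. The genome size of roughly 3.1 billion base pairs is an assumption that does not come from the text above. The 1% and 9% shares are taken from the paragraph, so the results are rough estimates only.

```python
# Assumed approximate size of one copy of the human genome (not from the text).
GENOME_BP = 3_100_000_000


def share_bp(percent):
    """Number of base pairs that make up the given whole-number percentage."""
    return GENOME_BP * percent // 100


coding_bp = share_bp(1)      # about 1% codes for proteins
regulatory_bp = share_bp(9)  # at least 9% is involved in gene regulation

print(coding_bp)                   # base pairs of protein-coding sequence
print(regulatory_bp)               # base pairs of regulatory sequence
print(regulatory_bp // coding_bp)  # how many times larger the regulatory share is
```

Under these assumptions, about 31 million base pairs code for proteins, while at least 279 million take part in gene regulation.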
Copyright © 2024 MyJoVE Corporation. All rights reserved