Most of you probably know the basic story you were taught in high school biology: DNA has information-containing chunks called genes, which sit in specific places in packages called chromosomes. Each of those genes is the instruction set for building a specific protein, and the instructions are read by creating an intermediary copy (called mRNA) of the specific gene in question, which is then transported to a structure in the cell called the ribosome, where the sequence is read and used to assemble the protein from smaller bits called amino acids. The protein thus created -- it could be an enzyme, a structural protein, an energy carrier, a pigment, or any of dozens of other types -- goes on to do its specific job in the organism.
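If it helps to see that gene-to-mRNA-to-protein pipeline laid out step by step, here is a minimal Python sketch of it. The twelve-base "gene" is made up for illustration, the function names are my own, and real cells do a great deal more (splicing, regulation, and so on) than this toy does. The codon table is the standard genetic code, written in compact form.

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order.
# Each letter is a one-letter amino acid code; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """Make the mRNA copy of a gene's coding strand (T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read codons from the first AUG until a stop codon, ribosome-style."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing gets built
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


gene = "ATGGCCAAGTAA"      # hypothetical gene, not from any real organism
mrna = transcribe(gene)    # "AUGGCCAAGUAA"
protein = translate(mrna)  # "MAK" (methionine, alanine, lysine)
print(mrna, protein)
```

Three bases at a time, each codon is looked up and turned into one amino acid, which is all the "instruction set" amounts to at this level.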
This pattern -- gene (DNA) to mRNA to protein -- was thought to be more or less the whole story by the people who unraveled the pattern, James Watson and Francis Crick, so in their typical self-congratulatory fashion they called it the "Central Dogma of Molecular Biology." (And if you know anything about their history, you'll understand why I'm calling it "typical.") So it was a considerable shock when researchers found out that there was DNA in the genome of every species studied that didn't work this way.
It was a considerably bigger shock when it was found that the amount of DNA that did work this way was around one percent.
You read that right; between ninety-eight and ninety-nine percent of your genome is non-coding DNA. It does not encode the instructions for building proteins. Watson and Crick's "Central Dogma" only applies directly to less than two percent of an organism's DNA. When this was discovered in the 1960s and 1970s, researchers (speaking of arrogance...) called it "junk DNA," following the apparent line of reasoning, "If we don't know what it does right now, it must be useless."
The whole "junk DNA" moniker never made sense to me, especially with the discovery of short tandem repeats, chunks of DNA between two and ten base pairs long that repeat over and over again (the average number of repeats is twenty-five). Short tandem repeats are common in eukaryotic DNA, and their function is unknown. That they do have a function -- and are not, in fact, "junk" -- is strongly supported by the fact that they're evolutionarily conserved. Mutations happen, and if there really were no function at all for STRs, over time the pattern would get lost as mutations altered one base pair after another. The fact that the pattern has been maintained argues that they have an important function of some kind, and that mutations which knock that function out are heavily selected against.
We just don't know what it is yet.
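For the curious, spotting short tandem repeats in a sequence is simple enough to sketch in a few lines of Python. This is only an illustration under stated assumptions: the two-to-ten base pair unit size comes from the description above, the three-copy minimum is an arbitrary threshold I picked, the function name is hypothetical, and the test sequence is invented.

```python
import re


def find_strs(sequence, min_repeats=3):
    """Return (start, unit, copies) for each tandem repeat found.

    A repeat unit is 2 to 10 bases long; min_repeats is an assumed
    cutoff, not a biological standard.
    """
    # The lazy {2,10}? tries the shortest unit first, and \1 requires
    # that same unit to follow immediately, again and again.
    pattern = re.compile(r"([ACGT]{2,10}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# Made-up sequence: five copies of GATA with a few flanking bases.
sequence = "CCGT" + "GATA" * 5 + "TTC"
print(find_strs(sequence))  # [(4, 'GATA', 5)]
```

A single mutated base in the middle of the run would split or shorten the match, which is exactly the kind of erosion you would expect to see over time if nothing were preserving the pattern.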
The reason all this comes up is the discovery by a team right here in my neck of the woods, at Cornell University, that at least some of the patterns in butterfly wings are controlled by "genetic switches" in their non-coding DNA. The team, led by Anyi Mazo-Vargas, looked at forty-six regions of non-coding DNA, and found that a significant number of them, when disabled (a technique called knockout), cause huge changes in the butterfly's wing pattern. Take, for example, the brightly colored Heliconius butterflies:
Knock out a DNA sequence called WntA, and the stripes disappear; disable Optix, and the wings come out jet black.
Other than the obvious deduction -- that these non-coding sequences are acting as switches determining deposition of pigments and arrangements of the cells that generate iridescence -- not much is known about how these sequences work. "We see that there's a very conserved group of switches that are working in different positions and are activated and driving the gene," Mazo-Vargas said.
"We have progressively come to understand that most evolution occurs because of mutations in these non-coding regions," added Robert Reed, who co-authored the paper. "What I hope is that this paper will be a case study that shows how people can use this combination of ATAC-seq and CRISPR [respectively, a technique for mapping which stretches of DNA are accessible and active, and a gene-editing tool used for knockouts] to begin to interrogate these interesting regions in their own study systems, whether they work on birds or flies or worms."
So once again, as we look more closely at things, we find intricacies we never dreamed of. What I love about research like this is that a seemingly small discovery -- that small stretches of non-coding DNA control macroscopic traits like coloration -- could have an impact on our understanding of how genetics works in general.
Truly -- a "butterfly effect."