All organisms on Earth contain a master recipe book -- DNA -- that holds all of the instructions necessary to create them. Each of those recipes is deciphered through a pair of processes called transcription and translation: the first produces a temporary copy of a single recipe (called RNA), and the second takes that RNA and uses it to build a protein of some sort. So, to extend the analogy of DNA-as-cookbook, transcription would be photocopying a single recipe, and translation would be reading that recipe and using it to make lasagna. (The lasagna, if you don't mind my stretching the analogy to the snapping point, would be the protein.)
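For readers who like to see the photocopy-then-cook idea spelled out, here is a minimal sketch in Python. It is a toy model under stated assumptions, not how a cell (or any bioinformatics library) actually does it: the `recipe` sequence is made up, the codon table lists only six of the 64 real codons, and transcription is simplified to rewriting the coding strand with U in place of T.

```python
# Toy model of transcription and translation.
# Assumption: only six codons are listed; the real genetic code has 64.
CODON_TABLE = {
    "AUG": "Met",  # start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "GCU": "Ala",
    "AAA": "Lys",
    "UAA": None,   # stop codon
}


def transcribe(coding_strand: str) -> str:
    """Photocopy the recipe: write the coding strand as RNA (T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def translate(rna: str) -> list[str]:
    """Cook the recipe: read three bases at a time until a stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid is None:  # stop codon reached, the protein is done
            break
        protein.append(amino_acid)
    return protein


# Hypothetical 18-base "recipe", invented for illustration.
recipe = "ATGTTTGGCGCTAAATAA"
rna = transcribe(recipe)
lasagna = translate(rna)

print(rna)                # AUGUUUGGCGCUAAAUAA
print("-".join(lasagna))  # Met-Phe-Gly-Ala-Lys
```

The stop codon is what tells the cook to put down the spoon: everything after it in the RNA is ignored.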
The problem is, as with most things in life, it's not quite that simple.
Your DNA contains a lot more than recipes that are used in a straightforward fashion to build proteins. Between twenty and seventy percent of your DNA -- depending on whom you believe -- is junk DNA, which is essentially a collection of evolutionary leftovers: genes that got damaged, lost their promoters (promoters are, more or less, universal "on" switches), or were moved somewhere in the genome where they couldn't be activated. Some researchers think that junk DNA provides a sort of backhanded benefit: it gives us a larger target for mutations. Mutations in the junk DNA have no effect, so the more junk there is, the less likely it is that any given mutation will kill us.
But there are other complications, too. Some DNA (called "non-coding DNA") doesn't actually produce proteins directly, but acts to control the activity of other genes -- so it's pretty critical even though it's not specifically making your lasagna for you. Some of these regions code for regulatory RNAs such as microRNAs -- bits of DNA that are transcribed into RNA, but the RNA then binds to other pieces of RNA and alters the rate at which they're translated. Another example is the telomeres, which form the ends of the chromosomes and protect them from degradation -- the shortening of telomeres is thought to play a role in aging. A third, more mysterious example is the VNTR (variable number tandem repeat) regions, which are made of the same pattern of bases repeated over and over -- they have proved useful in the technique of DNA fingerprinting in forensics, but their function in the living organism is unknown.
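To make the "same pattern repeated over and over" idea concrete, here is a rough sketch of counting tandem repeats in Python. Everything specific in it is an assumption made for illustration: the `GATA` motif, the two sample sequences, and the helper name `longest_tandem_run` are all hypothetical, and real forensic work compares many such regions with far more careful methods.

```python
def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of motif in sequence."""
    if not motif:
        raise ValueError("motif must not be empty")
    best = 0
    for start in range(len(sequence)):
        run = 0
        pos = start
        # Keep stepping forward one motif at a time while the pattern repeats.
        while sequence.startswith(motif, pos):
            run += 1
            pos += len(motif)
        best = max(best, run)
    return best


# Hypothetical samples: same flanking bases, different repeat counts.
sample_a = "CCT" + "GATA" * 5 + "GGC"
sample_b = "CCT" + "GATA" * 8 + "GGC"

print(longest_tandem_run(sample_a, "GATA"))  # 5
print(longest_tandem_run(sample_b, "GATA"))  # 8
```

The two samples differ only in how many times the motif repeats, which is the kind of person-to-person difference that makes these regions handy for telling DNA samples apart.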
With all of this complexity, exactly how many genes we have has been an ongoing source of contention. As you can see from the admittedly brief description I've given, it's not always clear whether something is a functional gene in the first place, so how could you hope for an accurate count? Estimates have run up to 6.7 million genes in the human genome -- and it certainly seems like something as sophisticated as we are must surely be the product of a huge number of individual instructions.
But the more people have looked into it -- starting with the Human Genome Project in the 1990s -- the smaller the estimate has become. Just last week, the most recent revision was released, and it's pretty startling: a team led by Steven Salzberg at Johns Hopkins University has come up with a tally of 21,306 coding genes (ones that directly produce proteins) and 21,856 non-coding genes (bits of DNA that act to control the expression of other genes).
Which, considering that we're made up of trillions of cells interacting in countless different ways, is really a pretty small number when you come to think about it.
Salzberg is up front that these estimates could still be revised. He and study co-author Mihaela Pertea write:
We aligned all human genes from NCBI's RefSeq database to the Ensembl gene set in an attempt to explain the differences, but although the total counts differ by less than 300, there are several thousand genes in each set that do not map cleanly onto the other, many of them representing genes of unknown function. Our personal best guess for the total number of human genes is 22,333, which corresponds to the current gene total at NCBI. We prefer this to the slightly higher Ensembl gene count both because the NCBI annotation is slightly more conservative, and because recent compelling arguments support an even lower gene total. This number could easily shrink or grow by 1,000 genes in the near future. However, recent analyses make it clear that even if we agree on a complete list of human genes, any particular individual might be missing some of the genes in that list. The genome sequence is complete enough now (although it is not yet finished) that few new genes are likely to be discovered in the gaps, but it seems likely that more genes remain to be discovered by sequencing more individuals. Additional discoveries are likely to make our best estimates for this basic fact about the human genome continue to move up and down for many years to come.
So the exact count of recipes in our DNA cookbook is still a matter of contention, but the whole thing is fascinating -- to think that such a (relatively) small number of sets of instructions could produce something as complex as we are. As for me, this whole discussion has left me hungry, for some reason.
I think I'm going to make some lasagna.
This week's recommended read is Wait, What? And Life's Other Essential Questions by James E. Ryan. Ryan frames the whole of critical thinking in a fascinating way. He says we can avoid most of the pitfalls in logic by asking five questions: "What?" "I wonder..." "Couldn't we at least...?" "How can I help?" and "What truly matters?" Along the way, he considers examples from history, politics, and science, and encourages you to think about the deep issues -- and not to take anything for granted.