[An old post from 2005 I'm fond of]
There was a time not that long ago when sequencing a single gene would be hailed as a scientific milestone. But then came a series of breakthroughs that sped up the process: clever ideas for how to cut up genes and rapidly identify the fragments, the design of robots that could do this work twenty-four hours a day, and powerful computers programmed to make sense of the results. Instead of single genes, entire genomes began to be sequenced. This year marks the tenth anniversary of the publication of the first complete draft of the entire genome of a free-living species (a nasty little microbe called Haemophilus influenzae). Since then, hundreds of genomes have emerged, from flies, mice, humans, and many more, each made up of thousands of genes. More individual genes have been sequenced from the DNA of thousands of other species. In August, an international consortium of databases announced that they now had 100 billion “letters” from the genes of 165,000 different species.
But this data glut has created a new problem. Scientists don’t know what many of the genes are for.
The classic method for figuring out what a gene is for is good old benchwork. Scientists use the gene’s code to generate a protein and then figure out what sort of chemical tricks the protein can perform. Perhaps it’s good at slicing some other particular protein in half, or sticking two other proteins together. It’s not easy to tackle this question with brute force, since a mystery protein may interact with any one of the thousands of other proteins in an organism. One way scientists can narrow down their search is by seeing what happens to organisms if they take out the particular gene. The organisms may suddenly become unable to digest their favorite food or withstand heat, or show some other change that can serve as a clue.
Even today, though, these experiments still demand a lot of time, in large part because they’re still too complex for robots and computers. Even when it comes to E. coli, a bacterium that thousands of scientists have studied for decades, the functions of a thousand of its genes remain unknown.
This dilemma has helped give rise to a new kind of science called bioinformatics. It’s an exciting field, despite its woefully dull name. Its mission is to use computers to help make sense of molecular biology–in this case, by traveling through vast oceans of online information in search of clues to how genes work.
One of the most reliable ways to find out what a gene is for is to find another gene with a very similar sequence. The human genes for hemoglobin and the chimpanzee genes for hemoglobin are a case in point. Since our ancestors diverged about six million years ago, the genes in each lineage have mutated a little, but not much. The proteins they produce still have a similar structure, which allows them to do the same thing: ferry oxygen through the bloodstream. So if you happen to be trolling through the genome of a gorilla–another close ape relative–and discover a gene that’s very similar to chimpanzee and human hemoglobins, you’ve got good reason to think that you’ve found a gorilla hemoglobin gene.
Scientists sometimes use this same method to match different genes in the same genome. There isn’t just one hemoglobin gene in humans but seven. They carry out slightly different functions, some carrying oxygen in the fetus, for example, and others in the adult. This gene family, as it’s known, is the result of ancient mistakes. From time to time, the cellular machinery for copying genes accidentally creates a second copy of a gene. Scientists have several lines of evidence for this. Some people carry around extra copies of genes not found in other people. Scientists have also tracked gene duplication in laboratory experiments with bacteria and other organisms.
In many cases, these extra genes offer no benefit and disappear over the generations. But in some cases, extra genes appear to provide an evolutionary advantage. They may mutate until they take on new functions, and gradually spread through an entire species. Round after round of gene duplication can turn a single gene into an entire family of genes. Knowing that genes come in families means that if you find a human gene that looks like hemoglobin genes, it’s a fair guess that it does much the same thing as they do.
This method works pretty well, and bioinformaticists (please! find a better name!) have written a number of programs to search databases for good matches between genes. But these programs tend to pick the low-hanging fruit: they are good at recognizing relatively easy matches and not so good at identifying more distant cousins. Over time, related genes can undergo different mutation rates, which can make it difficult to recognize their relationship simply by eyeballing them side by side. Another hazard is the way a gene can be “borrowed” for a new function. For example, snake venom genes turn out to have evolved from families of genes that carry out very different functions in the heart, liver, and other organs. These sorts of evolutionary events can make it hard for simple gene-matching to yield clues to what a new gene is for.
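For the curious, here is roughly what simple gene-matching looks like in code. This is only a minimal sketch, not any real database-search program: the sequences, gene names, function labels, and the 80% cutoff are all invented for illustration, and it assumes the sequences have already been lined up to the same length.

```python
def percent_identity(a: str, b: str) -> float:
    """Fraction of matching positions in two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def guess_function(query, annotated, threshold=0.8):
    """Borrow the function of the most similar annotated gene.

    Returns (function, score), or (None, score) when even the best match
    falls below the threshold, i.e. a "distant cousin" we cannot call.
    """
    best_name, best_score = None, 0.0
    for name, (seq, _function) in annotated.items():
        score = percent_identity(query, seq)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is None or best_score < threshold:
        return None, best_score
    return annotated[best_name][1], best_score


# Toy data: made-up sequences and labels, not real genes.
annotated = {
    "gene_A": ("ATGGTGCACCTGACT", "oxygen transport"),
    "gene_B": ("ATGCCCTTTAAAGGG", "protein cleavage"),
}

close_query = "ATGGTGCATCTGACT"    # differs from gene_A at one position
distant_query = "ATGAAACCCGGGTTT"  # resembles neither gene closely

close_guess = guess_function(close_query, annotated)
distant_guess = guess_function(distant_query, annotated)
print(close_guess)
print(distant_guess)
```

The second query shows the weakness described above: once the similarity drops below the cutoff, this kind of matching has nothing to say.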
To improve their hunt for the function of new genes, bioinformaticists are building new programs. One of the newest, called SIFTER, was designed by a team of computer scientists and biologists at UC Berkeley. They outline some of their early results in the October issue of PLOS Computational Biology (the paper is open access). SIFTER is different from previous programs in that it relies on a detailed understanding of the evolutionary history of a gene. As a result, its predictions are significantly more accurate.
To demonstrate SIFTER’s powers of prediction, the researchers tested it on well-studied families of genes–ones that contained a number of genes for which there was very good experimental evidence for their functions. They used SIFTER to come up with hypotheses about the function of the genes, and then turned to the results of experiments on those genes to see if the hypotheses were right.
Here’s how a typical trial of SIFTER went. The researchers examined the family of (big breath) Adenosine-5′-Monophosphate/Adenosine Deaminase genes. Scientists have identified 128 genes in this family, in mammals, insects, fungi, protozoans, and bacteria. With careful experiments, scientists have figured out what 33 of these genes do. The genes produce proteins that generally hack off a particular part of various molecules. In some cases, they help produce nitrogen compounds we need for metabolism, while in other cases they help change the information encoded in genes as it is translated into proteins. In still other cases they have acquired an extra segment of DNA that allows them to help stimulate growth.
The SIFTER team first reconstructed the evolutionary tree of this gene family, calculating how all 128 genes are related to one another. The tree shows how an ancestral gene that existed in microbes billions of years ago was passed down to different lineages, duplicating and mutating along the way. The researchers then gave SIFTER the experimental results from just five of the 128 genes in the family. The program used this information to infer how the function of the genes evolved over time. That insight then allowed it to come up with hypotheses about what the other 123 genes in the family do.
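To get a feel for the idea of letting a tree guide the guesses, here is a deliberately crude sketch. It is not SIFTER's method (SIFTER uses Bayesian statistics over a model of how function evolves); this stand-in simply gives each unstudied gene the function of its closest studied relative, measured along the branches of the tree. The tree, branch lengths, gene names, and function labels are all invented.

```python
# Toy tree, stored as child -> (parent, branch length). Everything here is made up.
PARENT = {
    "g1": ("n1", 0.1),
    "g2": ("n1", 0.2),
    "g3": ("n2", 0.3),
    "g4": ("n2", 0.1),
    "n1": ("root", 0.5),
    "n2": ("root", 0.4),
}


def distances_to_ancestors(node):
    """Map each ancestor of `node` (and the node itself) to its distance from `node`."""
    dist, out = 0.0, {node: 0.0}
    while node in PARENT:
        parent, length = PARENT[node]
        dist += length
        out[parent] = dist
        node = parent
    return out


def tree_distance(a, b):
    """Branch-length distance between two nodes via their closest shared ancestor."""
    da, db = distances_to_ancestors(a), distances_to_ancestors(b)
    return min(da[n] + db[n] for n in da if n in db)


def predict(known, leaves):
    """Give every unannotated leaf the function of its nearest annotated leaf."""
    predictions = {}
    for leaf in leaves:
        if leaf in known:
            continue
        nearest = min(known, key=lambda k: tree_distance(leaf, k))
        predictions[leaf] = known[nearest]
    return predictions


# Two genes with "experimental" labels (hypothetical); two left to predict.
known = {"g1": "deaminase", "g4": "growth factor"}
predictions = predict(known, ["g1", "g2", "g3", "g4"])
print(predictions)
```

A nearest-relative rule like this ignores duplications and uneven mutation rates, which is exactly the kind of evolutionary detail a program like SIFTER is built to take into account.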
Aside from the 5 genes whose function the researchers had given SIFTER, there are 28 with good experimental evidence. The scientists compared the real functions of these genes to SIFTER’s guesses. It got 27 out of 28 right.
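The scoring itself is simple arithmetic: 33 genes with experimental evidence, minus the 5 handed to the program, leaves 28 to check, and 27 of those 28 were right. A small sketch of that bookkeeping follows; the gene names and function labels are placeholders, and only the counts come from the study as described here.

```python
def accuracy(predicted, observed):
    """Fraction of genes with experimental evidence whose prediction matches it."""
    shared = predicted.keys() & observed.keys()
    if not shared:
        raise ValueError("no genes in common to score")
    correct = sum(predicted[g] == observed[g] for g in shared)
    return correct / len(shared)


# Placeholder data: 28 held-out genes, one wrong prediction.
observed = {f"gene_{i}": "function_A" for i in range(28)}
predicted = dict(observed)
predicted["gene_0"] = "function_B"

score = accuracy(predicted, observed)
print(f"{score:.1%}")  # prints 96.4%
```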
SIFTER’s 96% accuracy rate is significantly better than other programs that don’t take evolution so carefully into consideration. Still, the Berkeley team cautions that they have more work to do. The statistics that the program uses (Bayesian probability) get harder to use as the range of possible functions gets bigger. What’s more, the model of evolution that it relies on is fairly simple compared to what biologists now understand about how evolution works. But these aren’t insurmountable problems. They’re the stuff to expect in SIFTER 2.0 or some other future upgrade.
Those who claim to have a legitimate alternative to evolution might want to try to match SIFTER. They could take the basic principles of whatever they advocate and use them to come up with a mathematical method for comparing genes. No stealing any SIFTER code allowed–this has to be original work that doesn’t borrow from evolutionary theory.
They could then use their method to compare the 128 genes of the Adenosine-5′-Monophosphate/Adenosine Deaminase family. Next, they could take the functions of five of the genes, and use that information to predict how the other 123 genes work. And then they could see how good their predictions were by looking at the other 28 genes for which there’s good experimental evidence about their function.
All the data to run this test is available for free online, so there’s no excuse for these antievolutionists not to take the test. Would they match SIFTER’s score of 96%? Would they do better than random? I doubt we’ll ever find out. Those who attack evolution these days aren’t much for specific predictions of the sort SIFTER makes, despite the mathematical jargon they like to use. Until they can meet the SIFTER challenge, don’t expect most scientists to take them very seriously.
Identifying the functions of genes is important work. Scientists need to know how genes work to figure out the causes of diseases and figure out how to engineer microbes to produce insulin and other important molecules. The future of medicine and biotech, it appears, lies in life’s distant past.
Update Monday 10:30 am: John Wilkins says that bioinformatician is the proper term, although no improvement. I then googled both terms and found tens of thousands of hits for both (although bioinformatician has twice as many as bioinformaticist). Is there an authority we can turn to? And can it try to come up with a better name? Gene voyagers? Matrix masters?